The functionality of voter registration systems can be maximised by electronic capture, storage and manipulation of data.
Voter registration system input sources can include:
- Paper forms completed by voters
- Electronic forms completed by voters
- Electronic data provided by other agencies
- Hardcopy information provided by other agencies
- Verbal advice provided by telephone or in person
- Information obtained from field workers in hardcopy or electronic form
- Information derived from returned mail
What data needs to be captured?
Before looking at how to capture data, an EMB must define what data needs to be captured. This will largely depend on the legislative requirements applicable to the voter register, but it will also include requirements indicated by administrative convenience.
Data only needs to be captured if there is a legislative and/or administrative reason for doing so. It may not be necessary to capture all of the data included by a voter on a registration form. Some fields shown on an application form may be used by EMB staff to determine eligibility to register, but there may be no need to keep the data in those fields on a permanent basis in a database.
For example, the Australian Electoral Commission's voter registration form asks voters (among other things) to stipulate place of birth and, for persons born outside Australia, citizenship details. These details are used by staff to determine that a person is eligible to register, but once eligibility has been determined there is no need to store these details on a database for future reference. If there is a need to refer to those details again (which happens only infrequently), a digitised image of the original form can be extracted.
Data input requirements can be designed through consideration of output requirements. For example:
- What output fields are stipulated by legislation or are administratively necessary?
- What output fields would be useful for sorting data or for selecting subsets of data?
- What special categories may apply to voter registrations?
- What auditing/tracking fields are needed (such as date and time of data entry or amendment, name of data entry operator, records of previous entries related to each voter)?
- What output fields will be needed on the complete range of products to be derived from the voter register? (See 'etl06'.)
Some output fields can be calculated by the computer software from other input fields and will not need to be data entered, such as electoral district, which can be derived from the address fields.
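As an illustration of a derived field, the following minimal sketch looks up an electoral district from address fields. The lookup table, the district names and the function name derive_district are hypothetical; a real register would draw on its own address and boundary data.

```python
# Hypothetical lookup table: (locality, postcode) -> electoral district.
# In practice this would come from the EMB's address and boundary data.
DISTRICT_BY_LOCALITY = {
    ("Springfield", "4300"): "District A",
    ("Riverside", "4301"): "District B",
}


def derive_district(locality, postcode):
    """Return the electoral district for an address, or None if unknown."""
    key = (locality.strip().title(), postcode.strip())
    return DISTRICT_BY_LOCALITY.get(key)
```

Because the district is calculated rather than keyed, an operator cannot mistype it, and an address that yields no district can be referred for investigation.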
A typical list of fields captured at the input stage could include:
- Name (which can be subdivided into more precise fields, such as first name, 'middle' names, surname, family name as applicable)
- Address (which can be subdivided into more precise fields, such as flat number, street number, habitation name, street name, locality, state or district, post or zip code, country)
- Date of birth
- Sex
- Gender
- Former name (if the person's name has changed, for example by marriage or deed poll)
- Former address (so that an earlier registration can be cancelled/updated)
- Place and country of birth
- Identity number(s) (as applicable to the particular jurisdiction, such as a social security number)
- Citizenship details (for example, if proof of citizenship is required for registration)
- Postal address (for those whose postal address is different from their residential address)
- Special voter category indicator (for example, a code to indicate whether a voter belongs to a special category of voters, such as a voter whose address is to be suppressed from the public voter register, or a voter who currently resides outside the home country)
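One possible way to represent a subset of these input fields is sketched below. The class names, field names and defaults are assumptions made for illustration only; the fields an EMB actually stores will depend on its legislative and administrative requirements.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Address:
    street_number: str
    street_name: str
    locality: str
    state_or_district: str
    postcode: str
    country: str
    flat_number: Optional[str] = None
    habitation_name: Optional[str] = None


@dataclass
class VoterInput:
    first_name: str
    surname: str
    residential_address: Address
    date_of_birth: date
    middle_names: str = ""
    former_name: Optional[str] = None
    former_address: Optional[Address] = None
    postal_address: Optional[Address] = None
    # Identity numbers keyed by type, as applicable to the jurisdiction.
    identity_numbers: dict = field(default_factory=dict)
    # Code for a special voter category, such as a suppressed address.
    special_category: Optional[str] = None

    def mailing_address(self):
        """Use the postal address when one is given, else the residential one."""
        return self.postal_address or self.residential_address
```

Splitting names and addresses into separate fields at the input stage makes later sorting, matching and selection of subsets simpler than working with a single free-text field.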
Capture of data provided in paper/hardcopy form
There are essentially two ways in which data provided in hardcopy form can be converted into electronic data. The first is to use data entry operators to type or key the data into a computer system. The second is to use optical scanning hardware and intelligent character recognition (ICR) software to convert images into electronic text. Both methods have advantages and disadvantages.
Manual data entry can be preferable to scanned data entry as human operators are generally better able to interpret the intent of handwriting than is ICR software. During manual data entry of voter registration forms human operators are also able to make decisions about voter eligibility that cannot readily be automated, such as deciding whether a signature looks acceptable or whether sufficient information has been shown. However, manual data entry can be a tedious, unrewarding task, and its very monotony can lead to mistakes made through lack of concentration.
Several measures can be taken to increase the accuracy of manual data entry. A common method is to require data to be entered by one person and then verified by a second person. This verification process can take the form of keying all data twice, keeping both electronic copies separate. The two copies are then electronically compared. If they are both the same, the record is accepted. If they are different, a supervisor can check the record against the original to ensure the record is correctly keyed. Another verification method is to have a second person check the data keyed by the first person against the original form.
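The double-keying comparison described above can be sketched as follows. The function names and the result labels are hypothetical, and each keyed copy is assumed to be a simple mapping of field name to value.

```python
def compare_keyings(first_pass, second_pass):
    """Return the names of fields whose two keyed values differ."""
    all_fields = sorted(set(first_pass) | set(second_pass))
    return [f for f in all_fields if first_pass.get(f) != second_pass.get(f)]


def verify_record(first_pass, second_pass):
    """Accept a record when both keyings agree, else refer it to a supervisor."""
    mismatches = compare_keyings(first_pass, second_pass)
    if mismatches:
        return ("refer_to_supervisor", mismatches)
    return ("accepted", [])
```

Returning the list of mismatched fields lets the supervisor go straight to the disputed entries on the original form rather than rechecking the whole record.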
Another way of increasing the accuracy of manual data entry is to design the input screen used for data entry to maximise the accuracy rate. For example, a data entry screen should follow the same logical order as the form being keyed, with design elements used to force the operator's eyes to follow a logical path.
Software can also be programmed to perform logic tests as data is entered to minimise errors. For example, 'input masks' can be used, so that only numbers within a specified range can be entered in a field where a number is required, and only valid dates can be entered in date fields. Software can force data entry operators to enter valid data in every field, so that fields cannot be skipped or left blank by accident. Where data in a field must conform to a particular standard, such as a defined list of values, software can reject any entry that does not conform to the standard. Better still, where data in a field must conform to a defined list of values, the system can offer only those values, often in a 'drop down box' or a 'list box'. For example, a sex field could allow the operator to select only 'male' or 'female' as options.
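A minimal sketch of such logic tests follows. The field names, the required-field list, the numeric range and the ISO date format (YYYY-MM-DD) are all assumptions chosen for illustration, and values are assumed to arrive as text.

```python
from datetime import date

ALLOWED_SEX_VALUES = {"male", "female"}  # the defined list from the example
REQUIRED_FIELDS = ("surname", "date_of_birth", "street_number", "sex")


def validate_entry(entry):
    """Return a list of error messages; an empty list means the entry passes."""
    errors = []
    # No required field may be skipped or left blank.
    for name in REQUIRED_FIELDS:
        if not str(entry.get(name, "")).strip():
            errors.append(f"{name}: required")
    # Only valid calendar dates are accepted in the date field.
    dob = entry.get("date_of_birth", "")
    if dob:
        try:
            date.fromisoformat(dob)
        except ValueError:
            errors.append("date_of_birth: not a valid date")
    # Only numbers within a specified range are accepted.
    number = entry.get("street_number", "")
    if number and not (number.isdigit() and 1 <= int(number) <= 99999):
        errors.append("street_number: must be a number from 1 to 99999")
    # Values must come from the defined list.
    sex = entry.get("sex", "")
    if sex and sex not in ALLOWED_SEX_VALUES:
        errors.append("sex: not in the defined list")
    return errors
```

Collecting every error, rather than stopping at the first, lets the data entry screen highlight all problem fields at once.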
Where data in the voter component of a voter register database is linked to another part of the database, such as the address component of the database, software can force data entry operators to select only a valid address from the address database. Any address given by a voter that does not conform with an address in the address database is rejected by the system, thereby forcing the operator or a supervisor to investigate the legitimacy of the claimed address. In some cases, the address given may be an unofficial variation of an official address. In others, the address may be fraudulent. If the address turns out to be legitimate but it is not contained in the address database, a separate process should be undertaken to update the address database before the voter registration can be processed, thereby preserving the integrity of the address database.
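The check against the address database might look like the sketch below. The in-memory set standing in for the address database, the normalisation rules and the result labels are hypothetical.

```python
# Hypothetical stand-in for the address component of the register.
ADDRESS_DATABASE = {
    ("12", "HIGH STREET", "SPRINGFIELD"),
    ("14", "HIGH STREET", "SPRINGFIELD"),
}


def normalise(street_number, street_name, locality):
    """Reduce trivial differences in case and spacing before comparison."""
    return (
        street_number.strip(),
        " ".join(street_name.upper().split()),
        locality.strip().upper(),
    )


def check_address(street_number, street_name, locality):
    """Accept only addresses found in the address database."""
    key = normalise(street_number, street_name, locality)
    if key in ADDRESS_DATABASE:
        return "accepted"
    # Not found: an operator or supervisor must investigate before
    # the registration is processed.
    return "refer_for_investigation"
```

The sketch deliberately never adds an unknown address to the database; that update is left to the separate process described above.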
Data entry using optical scanning hardware and ICR software to convert images into electronic text may be preferable to manual data entry where large quantities of data have to be processed, and the process of manual data entry is not likely to add enough value to the process to make it worthwhile.
The biggest drawback with using ICR for data capture is the level of accuracy achieved. As hardcopy voter registration forms tend to be handwritten, the varying qualities of handwriting can make it difficult for ICR systems to accurately convert handwriting into text, particularly where names are being interpreted, as they do not give ICR software predictable grammatical patterns to follow. However, the accuracy of ICR software is continually improving, and error rates of modern ICR software are much lower than those achieved in ICR's infancy.
ICR software can be effective if the accuracy of the data capture is checked by a human operator against the original form, in much the same way as data is verified using manual operators. This process can be streamlined by software capturing both interpreted text and a picture image of the original form, and displaying them side by side on screen for operators to check. This method removes the need to refer back to the original forms and means the checking process can be undertaken relatively quickly by a trained and experienced operator.
ICR software is very well suited to capturing typed text. ICR software can be 'taught' to understand various typed or printed fonts with very high degrees of accuracy.
Capture of data provided in electronic form
By comparison with capture of data provided in paper/hardcopy form, capture of data provided in electronic form is a relatively straightforward process. By definition, data provided in electronic form does not have to be converted from hard copy. However, difficulties may arise where the data provided is not formatted in the same way as the data tables into which the data is to be included.
For example, an external agency may provide an EMB with a list of persons who are to be included on the voter register. The voter register will be set up so that data will be included in several defined fields, with each field referring to a particular type of data, such as a surname field. If the imported data does not contain information formatted in the same field structure, the data will have to be converted to fit into the desired structure. For example, an EMB may split voters' addresses into separate fields, such as flat number, street number, habitation name, street name, locality, state or district, post or zip code. Address data from the external agency might be provided in a 'free field' format, that is, the entire address might be typed in one field, with no breakdown of the address into its component parts. In this case, some means must be devised to convert the imported data into the desired format. Unfortunately, this often can only be achieved by considerable manual intervention, making the electronic data exchange a more complicated exercise than it might appear to be on the surface.
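The following sketch shows why conversion of 'free field' addresses is only partly automatable. It assumes one hypothetical layout ("Flat 3, 12 High Street, Springfield QLD 4300") and returns nothing for any address that does not fit, so that the record can be passed to an operator. The pattern and field names are illustrative assumptions, not a general address parser.

```python
import re

# Assumed layout: [Flat/Unit N, ]<number> <street name>, <locality> <STATE> <postcode>
ADDRESS_PATTERN = re.compile(
    r"^(?:(?:Flat|Unit)\s+(?P<flat_number>\w+),\s*)?"
    r"(?P<street_number>\d+\w?)\s+"
    r"(?P<street_name>[^,]+),\s*"
    r"(?P<locality>.+?)\s+"
    r"(?P<state>[A-Z]{2,3})\s+"
    r"(?P<postcode>\d{4,5})$"
)


def split_free_field_address(text):
    """Split a one-line address into fields, or return None if it does not fit."""
    match = ADDRESS_PATTERN.match(text.strip())
    return match.groupdict() if match else None
```

Addresses that return None are exactly the cases that require the manual intervention mentioned above.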
The solution to importing electronic data into a voter registration database is to coordinate data field structures with the agency supplying the data so as to ensure consistency. The best way to do this is to develop an agreed set of data structure standards that can be used across a range of agencies with similar data. Several such standards exist around the world.
Problems of data structure standards should not arise where EMBs collect electronic voter registration data directly from voters. For example, electronic registration forms provided on the internet or at computer kiosks can be structured to fit directly into the correct database structure if practicable.
Another way in which electronic voter register data can be captured is by supplying field workers with portable data entry devices. These devices can be programmed to take data entered while a field worker visits voter dwellings. Data can be downloaded from these devices by using removable disks, by connecting the devices directly to a computer or by downloading data over the internet.
As with manual entry of data, software logic tests can be applied to data captured electronically to identify any possible errors in the data. For example, any data containing letters in fields that should only contain numbers can be flagged, and operators can investigate the problem and, hopefully, correct it, going back to the source if necessary. Similarly, any addresses submitted electronically that do not conform to the standard address database can be investigated and corrections made as needed.
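A batch version of such a test, applied to an imported file, could be sketched as follows. The CSV layout, the column names and the choice of which fields must be numeric are assumptions for illustration.

```python
import csv
import io

# Hypothetical columns that should contain digits only.
NUMERIC_FIELDS = ("street_number", "postcode")


def flag_suspect_rows(csv_text):
    """Return (row number, [suspect fields]) for rows needing investigation."""
    flagged = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        suspect = [
            name for name in NUMERIC_FIELDS
            if not (row.get(name) or "").strip().isdigit()
        ]
        if suspect:
            flagged.append((row_number, suspect))
    return flagged
```

Rows are flagged rather than rejected, so that an operator can decide whether the value is an error or a legitimate exception, going back to the source if necessary.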
Capture of data provided by telephone
In some cases it may be possible to allow voters to register or update their voter register details through an automated telephone service, but the opportunities for this type of transaction are rare.
However, it may be feasible (electoral legislation permitting) for an operator to accept changes to the voter register by telephone. In these cases the verbal message must be translated into an electronic form in order for it to update the electronic voter register. This could take the form of a handwritten or typed form completed by the operator, which is then keyed or scanned into the computer system. The advantage of this approach is that it leaves a paper audit trail which can be used to verify the legitimacy of changes to the register.
Alternatively, the operator taking the telephone call could update the register on screen. This has the advantage of saving time by eliminating the step of producing a hard copy record. In this case, the database should record that the change was reported by telephone, so as to leave an audit trail for the change.
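A minimal sketch of recording the source of an on-screen change is shown below. The register is assumed to be a simple mapping of voter identifier to record, and the function name, log structure and field names are hypothetical.

```python
from datetime import datetime, timezone


def apply_change(register, audit_log, voter_id, field_name, new_value, source, operator):
    """Update one field and log how, when and by whom the change was made."""
    record = register[voter_id]
    audit_log.append({
        "voter_id": voter_id,
        "field": field_name,
        "old_value": record.get(field_name),
        "new_value": new_value,
        "source": source,      # for example "telephone"
        "operator": operator,
        "changed_at": datetime.now(timezone.utc).isoformat(),
    })
    record[field_name] = new_value
```

Keeping the old value alongside the new one means the change can be reviewed, and reversed if necessary, without a paper form.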
Capture of information derived from returned mail
Where information derived from the voter register is used to address mail to voters, that mail may be returned with annotations that may be useful for updating the register. For example, mail that is returned 'not known at this address' can be used to annotate the voter register and (dependent on local legislation) either serve to remove the person from the register or to trigger action to investigate the person's right to remain registered.
In other cases, returned mail may indicate corrections to spelling of names or to addresses. This information could also be used to correct the register.
Depending on the type of annotation made on returned mail, the process of capturing the data on the annotations can be automated to varying degrees. If outgoing address labels include an identifying bar code or identity number or code, that identifier could be used to simplify the data capture of any annotations on returned mail. If annotations fall into defined categories, then data capture of such information can be automated to a high degree. For example, mail containing identifying bar codes that state the voter no longer lives at the registered address could be separately categorised and run through a bar code reader so as to record the relevant data in the voter register database.
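Processing of categorised returned mail could be sketched as follows. The category codes, the resulting action and the record structure are hypothetical; whether returned mail leads to removal or to investigation depends on local legislation, and this sketch assumes investigation.

```python
# Hypothetical annotation categories and the follow-up each one triggers.
RETURN_CATEGORY_ACTIONS = {
    "NOT_KNOWN": "investigate_entitlement",
    "MOVED": "investigate_entitlement",
}


def record_returned_mail(register, scanned_voter_id, category):
    """Annotate the voter's record from a scanned bar code and a mail category."""
    record = register.get(scanned_voter_id)
    if record is None or category not in RETURN_CATEGORY_ACTIONS:
        # Unknown identifier or free-text annotation: leave to an operator.
        return "manual_review"
    record.setdefault("returned_mail", []).append(category)
    record["pending_action"] = RETURN_CATEGORY_ACTIONS[category]
    return record["pending_action"]
```

Annotations outside the defined categories, such as spelling corrections, fall through to manual review.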
Where annotations show corrections to names or addresses, an operator would be required to key the changes into the electronic register. This process could be speeded up by using bar codes or identity numbers to quickly bring up the voter's record for correction.
Functionality of data entry systems
Data entry systems used for inputting voter register data, from either hard copy or electronic sources, can be designed to perform a range of functions that will add value to the process.
A voter register is typically a continually changing entity, particularly where a continuous register is used. Even where a periodic register is used, changes must be made. A data entry system should permit addition of new records, amendment of existing records and deletion of records. A voter register can also (ideally) be designed to track changes to it over time, so that a voter's registration history is accessible.
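One way to support addition, amendment and deletion while keeping each voter's history is sketched below. The class and method names are hypothetical, and the in-memory storage stands in for a real database.

```python
from copy import deepcopy


class VoterRegister:
    """Keeps every version of each record so that history stays accessible."""

    def __init__(self):
        self._history = {}  # voter_id -> list of (action, snapshot) pairs

    def current(self, voter_id):
        """Return a copy of the latest version, or None if absent or deleted."""
        versions = self._history.get(voter_id)
        if not versions:
            return None
        snapshot = versions[-1][1]
        return deepcopy(snapshot) if snapshot is not None else None

    def add(self, voter_id, details):
        if self.current(voter_id) is not None:
            raise ValueError(f"{voter_id} is already registered")
        self._history.setdefault(voter_id, []).append(("added", deepcopy(details)))

    def amend(self, voter_id, **changes):
        record = self.current(voter_id)
        if record is None:
            raise KeyError(voter_id)
        record.update(changes)
        self._history[voter_id].append(("amended", record))

    def delete(self, voter_id):
        if self.current(voter_id) is None:
            raise KeyError(voter_id)
        # Deletion is recorded as a new entry; earlier versions are retained.
        self._history[voter_id].append(("deleted", None))

    def history(self, voter_id):
        """Return the sequence of actions applied to this voter's record."""
        return [action for action, _ in self._history.get(voter_id, [])]
```

In this design nothing is ever overwritten: a deletion simply becomes the latest entry in the voter's history, so an earlier registration can still be consulted.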
Voter registers can also be designed to accept data impacting on a voter's record from a variety of different sources, where practicable. For example, a voter's original record may derive from an application form completed by the voter. That voter's address details may be updated at a later date by data provided electronically by another government agency.