The type of computer system used to process a voter register, and its complexity, will depend on:
- the size of the voter population (or number of records) to be stored on the system
- the number of different kinds of data (or data fields) to be stored
- the size of the geographic base to be used in the system (see Geographic Base of a Voter Register)
- the number of expected data transactions
- whether the system is to be used for a one-off event, or whether it will be used for an ongoing or continuous voter register
- how data is going to be input
- whether the system will store out-of-date records as well as current records (to keep voters' registration history or for auditing purposes, for example)
- how many users will access the system, and how often
- whether the system will be used in one office, or whether it will be used in several different geographically dispersed offices
- whether users will expect to access data on-line, and whether they will expect that data to be up to date
- what products will be output from the system (see 'etl06')
- what hardware the system will be used on
- whether the system will be run on stand-alone computers, or over a network
- how data will be sorted and manipulated
A system that serves a relatively small number of voters, is used in a single small central office, and has no complex input or output requirements will be relatively straightforward. However, where large numbers of voters are to be recorded, where large numbers of users will access the system, or where complex input and output requirements exist, the voter registration database will need to be more carefully designed and managed to maximise its effectiveness.
Batch or On-Line Processing
A key decision to make regarding a voter register processing environment is whether to use batch or on-line processing for data input. With batch processing, data is input in 'batches' of many records (each record representing a transaction related to one voter) and stored in a temporary data file. At a regular interval (often overnight when the computer system is not in heavy demand), the batched data is uploaded to the main data file, so that new records are added, changed records are updated, and old records are deleted or archived in one process.
Batch processing is useful where the available computers are not powerful or are not joined in a network that lends itself to on-line processing of data. With batch processing, any loss of system performance resulting from data updating normally takes place overnight, when system demand is low or non-existent. With some database systems, users need to log out of the database before any updates can be processed, making overnight batch updates a useful method that avoids restricting officers' productivity. Batch processing is also useful where a system is dispersed over different physical networks, so that separate versions of the database need to be updated. In these cases updating each database once a day using a batch update is preferable to updating each database every time a single record is updated.
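The batch update described above can be sketched in a few lines of Python. This is only an illustrative sketch: the names `Transaction` and `apply_batch`, the use of in-memory dictionaries for the main data file and the archive, and the sample voter records are all assumptions made for the example, not features of any particular voter registration product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Transaction:
    """One queued change relating to one voter (hypothetical structure)."""
    action: str                  # "add", "change" or "delete"
    voter_id: str
    data: Optional[dict] = None  # field values for "add" and "change"


def apply_batch(register, archive, batch):
    """Apply every queued transaction to the main register in one pass."""
    for tx in batch:
        if tx.action == "add":
            register[tx.voter_id] = dict(tx.data)
        elif tx.action == "change":
            register[tx.voter_id].update(tx.data)
        elif tx.action == "delete":
            # Old records are archived rather than discarded.
            archive[tx.voter_id] = register.pop(tx.voter_id)
        else:
            raise ValueError(f"unknown action: {tx.action!r}")
    processed = len(batch)
    batch.clear()  # the temporary batch file is emptied after the run
    return processed


# Main data file and archive, held in memory for the sake of the example.
register = {
    "V001": {"name": "A. Voter", "address": "1 Main St"},
    "V003": {"name": "C. Voter", "address": "3 Main St"},
}
archive = {}

# Transactions entered during the day and held in the temporary batch.
batch = [
    Transaction("add", "V002", {"name": "B. Voter", "address": "2 Main St"}),
    Transaction("change", "V001", {"address": "9 High St"}),
    Transaction("delete", "V003"),
]

# The overnight run: all three transactions reach the register together.
processed = apply_batch(register, archive, batch)
print(processed, sorted(register), sorted(archive))
```

Until `apply_batch` runs, anyone reading `register` sees the previous day's data, which is the trade-off batch processing accepts in exchange for simpler and cheaper systems.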
On-line processing is feasible where powerful computers are used and users share a network capable of allowing on-line updates. In this case, data is entered into the live database rather than a temporary batch file. As soon as an on-line record is updated, it becomes available to other users of the system. On-line processing is more convenient for users as data is kept constantly up to date and they do not have to wait for batch updates to be run overnight. However, on-line processing is more difficult to organise as it requires a complex (and usually expensive) network system to make it feasible, particularly where users are geographically dispersed. Special care also has to be taken with database design to ensure that different users cannot update the same record at the same time, since simultaneous updates could introduce errors.
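One common way to guard against two users updating the same record at once is a version check, sometimes called optimistic locking. The text does not prescribe this or any other technique, so the following is only a sketch under that assumption; `LiveRegister` and `StaleRecordError` are hypothetical names, and a real system would normally rely on the locking or transaction features of its database software.

```python
import threading


class StaleRecordError(Exception):
    """Raised when a record changed after the user last read it."""


class LiveRegister:
    """Hypothetical on-line register that rejects conflicting updates."""

    def __init__(self):
        self._records = {}  # voter_id -> (version, data)
        self._lock = threading.Lock()

    def add(self, voter_id, data):
        with self._lock:
            self._records[voter_id] = (1, dict(data))

    def read(self, voter_id):
        with self._lock:
            version, data = self._records[voter_id]
            return version, dict(data)

    def update(self, voter_id, expected_version, changes):
        # The version check and the write happen under one lock, so two
        # users cannot both succeed from the same starting version.
        with self._lock:
            version, data = self._records[voter_id]
            if version != expected_version:
                raise StaleRecordError(voter_id)
            self._records[voter_id] = (version + 1, {**data, **changes})
            return version + 1


live = LiveRegister()
live.add("V001", {"name": "A. Voter", "address": "1 Main St"})

# Two officers open the same record at the same time.
version_a, _ = live.read("V001")
version_b, _ = live.read("V001")

# The first update succeeds and is immediately visible to other users.
live.update("V001", version_a, {"address": "9 High St"})

# The second officer is still working from the old version and is refused.
try:
    live.update("V001", version_b, {"address": "5 Low Rd"})
    conflict = False
except StaleRecordError:
    conflict = True

print(conflict, live.read("V001"))
```

The second officer would have to reread the record and reapply the change, which avoids silently overwriting the first officer's update.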
From a usability point of view, on-line processing gives users more current information than batch processing. However, batch processing can be used on less expensive systems and is generally less complex to design and manage than on-line processing. In many cases, the lack of current data using batch systems will not be a significant problem.
Distributed or Centralised Systems
A complex database like a voter register can be used in essentially two ways: as a distributed system or a centralised system.
A distributed system can be used where users of the system are spread over more than one network. In most cases this occurs where users are geographically separated from one another, such as in regional offices in different cities. In a distributed system, each component of the system maintains its own copy of the database and (usually) a subset of the data. For example, in a jurisdiction with six different regions, each of the six regions could maintain data only for voters registered in that region. Whenever there is a need to coordinate data between the distributed regions, this could occur by way of batch updates (see Batch or On-Line Processing above).
In a centralised system, all the data is kept on one centralised database which is accessed over a network connecting all the regional offices (if any). Taking the above example, in a jurisdiction with six different regions and a centralised system, users in any region could access data held for any of the regions. Any updates having a cross-regional effect (such as a voter moving from one region to another, leading to a new record in one region and a deleted record in another region) could take immediate effect. In a distributed system, such cross-regional effects would have to wait for a batch update to take effect.
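The delay that a distributed system imposes on a cross-regional move can be illustrated with a short Python sketch. The class `RegionalRegister`, the function `run_batch_exchange`, the outbox mechanism and the two region names are assumptions made for the example; they show one possible arrangement, not a prescribed design.

```python
class RegionalRegister:
    """Hypothetical regional database holding only its own voters."""

    def __init__(self, name):
        self.name = name
        self.voters = {}  # voter_id -> record
        self.outbox = []  # transfers waiting for the next batch exchange

    def register(self, voter_id, data):
        self.voters[voter_id] = dict(data)

    def queue_transfer(self, voter_id, destination):
        # The move is only recorded here; nothing changes in either
        # region until the next batch exchange is run.
        self.outbox.append((voter_id, destination))


def run_batch_exchange(regions):
    """Apply all queued transfers: delete in one region, add in another."""
    for region in regions.values():
        for voter_id, destination in region.outbox:
            record = region.voters.pop(voter_id)
            regions[destination].voters[voter_id] = record
        region.outbox.clear()


regions = {name: RegionalRegister(name) for name in ("north", "south")}
regions["north"].register("V001", {"name": "A. Voter"})

# The voter moves from the north region to the south region.
regions["north"].queue_transfer("V001", "south")
visible_before = "V001" in regions["south"].voters

# Overnight batch exchange between the regional databases.
run_batch_exchange(regions)
visible_after = "V001" in regions["south"].voters

print(visible_before, visible_after)
```

In a centralised system there would be a single store of records and no outbox, so the change of region would be visible to every office as soon as it was entered.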
The main advantages of a distributed system include lower costs stemming from less need for high-capacity, cross-region networks, and improved system performance resulting from smaller file sizes, as each regional database holds less data than a single centralised database would.
A centralised system is more expensive, since it needs greater network capacity and must handle larger files, but it has the advantages of giving all users access to all data and of allowing records to be updated automatically across regions.