
Tartu University
Faculty of Mathematics and Informatics
Institute of Computer Science
Chair of Software Engineering

Oleg Mürk

Designing Electronic Voting


Bachelor's Thesis

Supervisor: Helger Lipmaa, PhD

Author: . . . . . . . . . . . . . . . . . . . . June 2001
Supervisor: . . . . . . . . . . . . . . . . . . June 2001
Head of the Chair: . . . . . . . . . . . . . . . June 2001

Tartu 2001

Contents

Introduction
    Aims of the Thesis
    Acknowledgement
    Notation
1 System Analysis
    1.1 Domain Model
    1.2 Generic Requirements
    1.3 Conventional Elections
    1.4 Trust
    1.5 On Revoking Ballots
    1.6 E-voting Requirements
        1.6.1 Functional Requirements
        1.6.2 Non-functional Requirements
2 System Design
    2.1 Theoretical Basis
        2.1.1 Model of the Real World
        2.1.2 Electronic Voting Scheme
        2.1.3 Public Key Infrastructure
        2.1.4 Time-stamping
        2.1.5 Bulletin Board
        2.1.6 Threshold Encryption and Signature
        2.1.7 Implementations of EVS
        2.1.8 On the Freedom of Choice
    2.2 Designing Framework
        2.2.1 Real World Model
        2.2.2 Computing Device
        2.2.3 Software
        2.2.4 Threshold Trust
        2.2.5 Connection
        2.2.6 PKI
        2.2.7 Time-stamping
        2.2.8 Summary
    2.3 Design for Bulletin Board
        2.3.1 Some Simple Ideas
        2.3.2 Synchronous Environment
        2.3.3 Asynchronous Environment
        2.3.4 Practical Solutions
    2.4 Design Pattern for E-voting System
        2.4.1 Computing Result
        2.4.2 Meta Process
        2.4.3 Design for Single Authority EVS
        2.4.4 Design for Multiple Authority EVS
        2.4.5 Conclusions
Summary
Resümee (In Estonian)
Bibliography

Introduction
Recently, the topic of implementing electronic voting (e-voting) has become very popular: multiple workshops have been held, there are firms that provide corresponding services, real attempts at e-voting have taken place, and the media is eagerly covering the topic. The main purpose of electronic elections is to allow voters to vote from as many locations as possible, ideally from their personal computing devices. An intermediate option would be to deploy specialized computers (kiosks) everywhere, as ATMs (automated teller machines) currently are. The communication medium would probably be the Internet or something similar. The justification is that e-voting would be more convenient, which would increase voter turnout. Also, one might expect that in the future e-voting would become less expensive than conventional voting. At present e-voting is viewed as a complement to conventional elections because, for instance, not all people have access to computers and the Internet (or the skills to use them).

Despite its tempting simplicity, this problem is much more complex than it seems at first. The main issues are security and reliability. The problem of organizing e-voting consists roughly of three parts:

Solving the problem mathematically, which includes formulating a model of the real world (e.g. formalizing the notion of trust), stating requirements, and finally finding a mathematical construction and proving that it satisfies these requirements. Such a construction is called an electronic voting scheme (EVS): a collection of protocols and algorithms that implement e-voting within the formulated model of the real world. I will call all this theoretical activity.

Provided there is an EVS, it must be implemented. In particular, the real world model that was used must be implemented. Besides that, an EVS usually has a relatively simple structure (while nevertheless being mathematically complex), which assumes some inputs and produces some outputs. It does not consider the process of preparing input data and consuming output data. Also, e-voting must somehow be integrated into the existing conventional voting process. A real implementation must consider the whole iterative process of organizing elections. I will call this technical activity.

Finally, e-voting will inevitably differ from conventional elections: the voter must perform different actions, there are different (and probably bigger) security threats, different demographic groups have different levels of access to the Internet, etc. For this reason politicians and sociologists must evaluate the impact of e-voting on the democratic process, decide whether it is useful at all, and suggest what should be changed. Besides that, laws must be changed to accommodate e-voting within the conventional voting process. I will call this political activity.

Theoretical activity belongs to the field of cryptography and has lasted for at least twenty years. The most influential papers in this field are (in my personal opinion) [Cha81], [Ben87], [BT94], and [CGS97]. The reader can find a partial overview of this topic in my semester work [Myr00]. Basically, it can be said that there exist solutions of acceptable security and complexity, although there is enough room for further advances.

There are a number of firms that provide e-voting solutions. The best known of them are probably [VoteHere.net] and [Election.com]. The first provides (at least) some description of its technology, which is based on [CGS97], a good cryptographic construction. The second has received more media attention, but does not present any description of its technology on its web site (which is a disadvantage, in my opinion).

A number of workshops have been conducted that concentrated on the political and technical aspects: National Workshop on Internet Voting [IPI], Voting Integrity Project [VIP], California Internet Voting Task Force [CIVTF]. Their major finding is that although there is a sufficient theoretical basis for implementing e-voting, technologically it is not possible to make the systems secure enough. The biggest problem is the insecurity of conventional personal computers and the Internet. At the same time they propose using e-voting kiosks in the near future.

Aims of the Thesis


This thesis can be viewed as a continuation of my semester work [Myr00], where I dealt with the theoretical problems of e-voting. In this work I will concentrate on the technical aspect: I will try to formulate requirements for the system and outline the system design. Software engineering methodology and notation (UML) will be followed throughout the text.

Although it is clear that the risks of voting from ordinary PCs over the Internet are too high, it is still interesting to design an e-voting system and see where and why these risks come up.

Acknowledgement
I would like to express my gratitude to Helger Lipmaa for introducing me to this subject and motivating me to deal with it, and also for involving me in the Estonian e-voting project [LipMy01].

Notation
As a potential reader might have a theoretical computer science (rather than software engineering) background, I will briefly describe the notation used. Two types of UML diagrams are used: static structure and activity. The notation is used and interpreted quite freely, which should be normal from the viewpoint of UML. Some other types of diagrams are also used, but their meaning should be evident or will be explained separately.

A static structure diagram sample is presented in Figure 1. It depicts a Factory (a class) that produces Cars (dashed line signifying dependency or direction of flow). Cars have names (an attribute) and operations Car::start() and Car::stop() (methods). Each car has at most one Owner, and each owner can have many cars (arrow with 1 and * signifying a one-to-many relationship). A Car consists of Wheels (diamond signifying aggregation). A Car is a kind of Beeper, so it can beep() (triangle signifying generalization, interface signifying a set of methods that a class should implement).

An activity diagram sample is presented in Figure 2. It is meant to describe the state transitions and flow of a process. The upper black dot signifies the beginning of the process (initial state), the one at the bottom is the final state. Bubbles signify either an activity or a state, arrows depict transitions. In this diagram Work1 and Work2 are performed in parallel, and the state Complete is reached when both of these activities have completed.

A system architecture diagram sample is presented in Figure 3. Here a three-dimensional bar depicts a subsystem, a simple rectangle depicts a process (I also interpret it as a user), a rounded rectangle signifies an object (or data), a cylinder depicts a datastore, and the grey bar between database and computer signifies a boundary. Lines and arrows are used freely to signify relationships and directions of data flow.

Figure 1: UML Static Structure Diagram Sample. (Diagram elements: Factory; interface Beeper with +beep(); Car with +name, +start(), +stop(); Wheel; Owner.)
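For readers more used to code than to UML, the relationships of Figure 1 can be written down as classes. This is only a sketch of one possible reading of the diagram: the Owner's `name`, the default of four wheels, and the helper `assign_owner` are illustrative assumptions that do not appear in the figure.

```python
from abc import ABC, abstractmethod


class Beeper(ABC):
    """Interface: a set of methods that an implementing class must provide."""

    @abstractmethod
    def beep(self) -> str:
        ...


class Wheel:
    """Part of a Car (aggregation)."""


class Owner:
    """One owner may have many cars."""

    def __init__(self, name: str) -> None:  # 'name' is an illustrative attribute
        self.name = name
        self.cars = []


class Car(Beeper):
    """Generalization: a Car is a kind of Beeper."""

    def __init__(self, name: str, wheel_count: int = 4) -> None:
        self.name = name                                      # attribute
        self.wheels = [Wheel() for _ in range(wheel_count)]   # aggregation
        self.owner = None                                     # at most one owner
        self.running = False

    def start(self) -> None:
        self.running = True

    def stop(self) -> None:
        self.running = False

    def beep(self) -> str:
        return f"{self.name}: beep"

    def assign_owner(self, owner: Owner) -> None:
        """Keep both ends of the one-to-many association consistent."""
        if self.owner is not None:
            self.owner.cars.remove(self)
        self.owner = owner
        owner.cars.append(self)


class Factory:
    """Dependency: the Factory produces Cars."""

    def produce(self, name: str) -> Car:
        return Car(name)
```

The one-to-many arrow of the diagram shows up as a single `owner` reference on the car and a list of `cars` on the owner.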

Figure 2: UML Activity Diagram Sample. (Diagram elements: Prepare, Work1, Work2, Complete.)
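The fork and join of Figure 2 can likewise be sketched in code. The example below assumes that Work1 and Work2 are independent functions (their bodies are placeholders) and uses a thread pool purely to illustrate that Complete is reached only after both have finished.

```python
from concurrent.futures import ThreadPoolExecutor


def work1() -> str:
    return "Work1 done"   # placeholder body


def work2() -> str:
    return "Work2 done"   # placeholder body


def run_process() -> list:
    log = ["Prepare"]                     # first activity after the initial state
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Fork: Work1 and Work2 may run in parallel.
        future1 = pool.submit(work1)
        future2 = pool.submit(work2)
        # Join: wait until both activities have completed.
        log.append(future1.result())
        log.append(future2.result())
    log.append("Complete")                # state reached before the final state
    return log
```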

Figure 3: System Architecture Diagram Sample. (Diagram elements: Programmer, Computer, Database, Program.)

Chapter 1

System Analysis
In this chapter I will try to describe the problem of e-voting in detail and formulate requirements for an e-voting system. This will serve as a basis for designing and implementing such a system.

1.1 Domain Model


The basic entities involved in elections are depicted in Figure 1.1.

Figure 1.1: Domain model. (Diagram elements: Person with +id; Election with +name; Voter; Ballot Type; Option with +name.)

Person: Any person that may participate in some election. I assume that each person has a unique identifier.

Election: A specific election. I assume that elections are identified by a unique name.

Voter: A person participating in an election. Many voters can participate in an election, and each person can be a voter in many elections. A voter is identified by the person's identifier and the name of the election.

Ballot Type: Different voters can be presented with different ballot types at an election. A ballot type consists of some number of options, among which the voter has to select one.

Option: One option belonging to some ballot type. Options are identified by names. Option names are unique within the corresponding ballot type.

At a real election the voter might be required to answer multiple questions. In my model this can be modelled as multiple simultaneous elections. If it is important that the voter answers each question correctly, this can easily be enforced with technical or administrative methods. In addition I present the following definitions:

Definition 1.1.1 Ballot - an option from some ballot type chosen by a voter.

Definition 1.1.2 Tally - a set of ballots from the voters of some election. Each voter can have at most one ballot.

Definition 1.1.3 Election Result - calculated from the tally; for each ballot type it states how many times each option was selected.
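The domain model and the three definitions can be restated as data types. The following is a minimal sketch, not a design: the field `BallotType.name` and the function names `cast` and `election_result` are assumptions introduced for illustration, and ballot types for which nobody voted are simply absent from the computed result.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class BallotType:
    name: str                  # illustrative label for the ballot type
    options: Tuple[str, ...]   # option names, unique within the ballot type


@dataclass(frozen=True)
class Voter:
    person_id: str             # unique identifier of the person
    election: str              # unique name of the election
    # The ballot type is not part of the voter's identity.
    ballot_type: BallotType = field(compare=False)


# Tally: each voter has at most one ballot (the name of the chosen option).
Tally = Dict[Voter, str]


def cast(tally: Tally, voter: Voter, option: str) -> None:
    """Add a ballot to the tally, enforcing the constraints of the model."""
    if option not in voter.ballot_type.options:
        raise ValueError("option does not belong to the voter's ballot type")
    if voter in tally:
        raise ValueError("voter already has a ballot in the tally")
    tally[voter] = option


def election_result(tally: Tally) -> Dict[str, Dict[str, int]]:
    """For each ballot type, count how many times each option was selected."""
    result: Dict[str, Dict[str, int]] = {}
    for voter, option in tally.items():
        ballot_type = voter.ballot_type
        counts = result.setdefault(
            ballot_type.name, {o: 0 for o in ballot_type.options}
        )
        counts[option] += 1
    return result
```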

1.2 Generic Requirements


Any election system (either conventional or electronic) must at least satisfy the following requirements:

Functional Requirements: The system must allow forming lists of eligible voters and of the needed ballot types, and assigning each voter the corresponding ballot type. Each voter must have the opportunity to cast at most one vote by selecting one option from his ballot type. In the end the election result must be calculated.

Freedom of Choice: Nobody can affect the voter's choice; the voter should make his decision himself. It is a complex problem to define this requirement formally, but basically there are two options:

    Privacy: Nobody can learn how a person voted without cooperating with him.

    Incoercibility: Nobody can learn how a person voted, even if cooperating with him is possible. This implies that the voter cannot prove how he voted. Of course, the voter could just tell a coercer how he voted, but the coercer would have no means of verifying this claim.

The interested reader can find a longer discussion of this subject in [Myr00].

1.3 Conventional Elections


It is important to stress that e-voting is viewed as a complement to conventional non-electronic voting systems. So it is reasonable to review how such systems function.

Figure 1.2: Conventional elections. (Diagram elements: Election Organizer, one to many Intermediate Organizers, each with one to many Voting Locations.)

The structure of the system is usually hierarchical. Normally there is one organization that is responsible for organizing the election. I call it the Election Organizer. At the other end there are a number of hierarchy leaves, where people can actually vote. I call them Voting Locations. Voting locations and the election organizer communicate through intermediate organizers, which group some number of voting locations (usually on a geographical basis). Different ballot types can be used at different voting locations.

Each voter is assigned to one main voting location (close to his residence), where he can normally vote. In such a setting it is possible to prevent voters from overvoting (voting more than once). Different voting technologies can be used there. One possibility is to use a paper ballot, on which the options are presented and the voter has to select one of them. After that the ballot is cast into a sealed box. Later all ballots are taken out of the sealed box and counted. The counting process can be automated, for example by means of optical scanning. All of these technologies ensure that after the voter has cast his ballot it is not possible to link the voter's identity to his ballot, which ensures privacy. In fact, incoercibility is also guaranteed, because the voter casts his ballot in a private voting booth, which implies that he cannot prove how he voted (of course, in the real world such claims are always relative to how determined the adversary is). It is important to point out that after the voter has cast his vote, it is not possible to remove it from the tally, although that might be useful if it later turns out that the voter was not eligible to vote.

If the voter does not want to vote at his main voting location, there are procedures for absentee voting, which allow him to vote from a wider range of locations (e.g. even from home). In this case special measures must be taken to prevent the voter from voting more than once (overvoting). One possible solution is that the voter puts his ballot into a clean envelope, seals it, and then puts this envelope into a second one, on which he writes his identification. After that all envelopes from one voter must arrive at the same place, where it can be ensured that he did not vote more than once. Also, if the voter is not prevented from voting at his main voting location, it must be ensured that he did not use both voting procedures. This implies that the best place for gathering a voter's envelopes is his main voting location. If it is decided that the voter's envelope should be counted, the outer envelope is removed and the inner clean envelope (which cannot be linked to the voter's identity) is put into a bigger pool of clean envelopes of other voters. After that the ballots can be processed without compromising the voters' privacy. If the voter is obliged to fill in his ballot in a private voting booth, incoercibility is also guaranteed, but if the ballot can be filled in anywhere (e.g. at home, and then sent by mail), someone could have been watching how the voter filled in his ballot.

Election results at each voting location are passed up the hierarchy to the intermediate organizers and finally to the main organizer (separately for each ballot type). At every level the information passed from lower nodes is summed up and then passed to the parent node.
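The summation up the hierarchy can be illustrated with a small recursive structure. This is a sketch under the assumption that every voting location reports, for each ballot type, a count per option; the class name `Node` and its fields are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List

Counts = Dict[str, Dict[str, int]]  # ballot type -> option -> number of votes


@dataclass
class Node:
    """A voting location (leaf) or an organizer (inner node)."""
    name: str
    local_counts: Counts = field(default_factory=dict)      # used by leaves
    children: List["Node"] = field(default_factory=list)    # used by organizers

    def result(self) -> Counts:
        """Sum the results of all lower nodes, separately per ballot type."""
        total: Counts = {bt: dict(opts) for bt, opts in self.local_counts.items()}
        for child in self.children:
            for ballot_type, options in child.result().items():
                merged = total.setdefault(ballot_type, {})
                for option, votes in options.items():
                    merged[option] = merged.get(option, 0) + votes
        return total
```

Each node only needs the sums of its direct children, which is why an error at one voting location affects only that location's contribution to the total.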
The integrity of the process is ensured by the presence of observers, whose function is to verify that everything is performed as required (votes are counted correctly, voter privacy is maintained). The distributed nature of the system ensures that violations of the voting process at some node will not poison the whole voting system, and thus will have only a limited effect on the election result.

A separate problem is the compilation of lists of people who can vote at each voting location. Probably the best solution is to have a database of all people, from which the lists of voters can be generated, but such a database is not always available.
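The double-envelope procedure for absentee voting described in this section can be sketched as follows. The sketch makes two assumptions that the text leaves open: a voter who sends more than one absentee envelope has all of them rejected, and a voter who also voted in person has his absentee envelope rejected. The names `OuterEnvelope` and `process_absentee` are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class OuterEnvelope:
    voter_id: str       # identification written on the outer envelope
    inner_ballot: str   # stands for the sealed, unmarked inner envelope


def process_absentee(envelopes: List[OuterEnvelope],
                     voted_in_person: Set[str]) -> List[str]:
    """Return the anonymous pool of inner envelopes that should be counted."""
    accepted: Dict[str, str] = {}
    rejected: Set[str] = set()
    for env in envelopes:
        if env.voter_id in voted_in_person or env.voter_id in rejected:
            continue  # voter used the other procedure, or already overvoted
        if env.voter_id in accepted:
            # Second envelope from the same voter: drop all of them
            # (an assumed policy, see the lead-in).
            del accepted[env.voter_id]
            rejected.add(env.voter_id)
            continue
        accepted[env.voter_id] = env.inner_ballot
    # The outer envelopes are discarded here; only inner envelopes remain.
    pool = list(accepted.values())
    random.shuffle(pool)  # mixing removes any link given by arrival order
    return pool
```

The identity check and the separation of identity from ballot happen in the same place, which is why all envelopes of one voter have to arrive at one location.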

1.4 Trust
An e-voting system must be trusted by all entities affected by the decisions made using it. In the case of a country, the system must be trusted by all citizens, the government, and the organizers themselves (hopefully), and it must also be approved by the international community.

The problem of verifying that a system corresponds to its requirements is very common in the field of software engineering. Requirements can usually be divided into the following groups:

Functional: Which functions the system should be able to perform.

Efficiency: Limits on time, memory, and communication channel bandwidth requirements.

Dependability: To what extent one can rely (depend) on the system, including:

    Reliability: Limits on a statistical measure of the frequency of faults. Related to availability.

    Availability: Minimum acceptable percentage of time during which the system performs correctly. Related to reliability.

    Safety: Limits on the measure of loss in the case of major failures.

    Security: Which functions the system should prevent from being performed. In general such requirements are application specific, but the most usual ones are:

        Access Control: Policy defining which system users can perform which operations.

        Confidentiality: Policy defining which system users are supposed to see which data.

        Integrity: Policy defining how data present in the system can be entered, modified, and deleted.

Most requirements can be assigned meaningful numeric measures, but in practice it is almost never possible to measure them directly. Functional and efficiency requirements can be measured to some extent by testing, but dependability, and security in particular, can hardly be measured directly at all (and it is very rare that systems can be built so that their correctness can be proven formally). A typical solution is to evaluate the parameters indirectly, by measuring the quality of the process of creating and maintaining the system, which of course gives very vague results. The following aspects are usually taken into consideration:

- Whether best practices and common sense are used.
- To what extent the system has been subject to testing and formal verification.
- Presence of a continuous process of improving the system's quality.

In the case of security, important additional criteria are:

- For how long the system has withstood real or theoretical attacks by other interested parties.
- Presence of a continuous process of prevention, detection, and reaction to possible attacks.

In general, specialists proficient in one field are not able to verify systems from other fields. A general way to ensure the quality of a system is to have specialists other than those who created it review it and express their opinion. If they approve it, the verifiers become partially responsible for the system themselves. Figure 1.3 describes the components of a hypothetical e-voting system, and also the specialists who are responsible for them and who must be trusted to some extent.

Theory: Scientists are responsible for developing the corresponding theory, the quality of which must be ensured by peer review.

Hardware and Software: Engineers are supposed to provide hardware and software, which must be certified.

Servers: The e-voting system consists of one or more server computers, which are set up and maintained by the operators. The integrity of the servers could be ensured by the presence of observers.

E-voting system: The e-voting system itself is set up and maintained by the organizers. The integrity of the whole process could also be ensured by the presence of observers.

Network: The network connecting the servers and the voters' computing devices is set up and maintained by the providers.

Voters' computers: Voters' computing devices are set up and maintained by administrators. In many cases voters are the administrators of their own computers.

The careful reader has probably noticed that the network and the voters' computers do not have certifiers. This mostly reflects the existing situation. It is very hard to certify in any sensible way a network that spans the whole world (the Internet). Although users' computers could be certified, this is not done on a large scale at present. Also, in the case of a contemporary personal computer (PC) it might not be a very sensible activity, because the voter would be able to misconfigure his computer right after certification. So certification of a voter's computer makes sense only for the voter himself (and maybe for the computer's administrator), but not for the rest of the world.

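Figure 1.3: Components of hypothetical e-voting system. (The diagram pairs each component with the specialists responsible for it: Theory with Scientists; Hardware and Software with Engineers; Servers with Operators; E-voting system with Organizers; Network with Providers; Voters' computers with Administrators.)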

1.5 On Revoking Ballots


Before stating the requirements for an e-voting system, one more issue of integrating conventional elections and e-voting must be considered. Revoking ballots means removing some voters' ballots from the tally before counting it. It is useful when there are two or more facilities for casting a ballot that operate simultaneously. There are two ways to prevent overvoting in such a situation:

- When the voter casts his ballot, it is checked whether he has already done so using the other facility.
- After the election is over, it is checked whether someone voted using both facilities, and if so this voter's ballot must be revoked from one of the two tallies.

As it is not possible to revoke ballots in conventional voting, introducing e-voting would require one of the following:

- E-voting allows revoking ballots.
- E-voting and conventional voting do not take place at the same time, but sequentially.
- There is a database in which it is recorded for each voter whether he has already voted, and an online connection is maintained between this database and all voting locations.

It is clear that the first option is preferable, as it requires the fewest resources (it does not require a database) and is the most convenient (the conventional election and e-voting can take place at the same time).
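The preferred option, revoking electronic ballots after the election, amounts to a simple filtering step once the list of people who voted conventionally is known. The sketch below assumes the electronic tally is a mapping from voter identifier to an opaque ballot (in a real scheme the ballot would be encrypted); the function name is hypothetical.

```python
from typing import Dict, List, Set, Tuple


def revoke_overvotes(e_tally: Dict[str, str],
                     conventional_voters: Set[str]
                     ) -> Tuple[Dict[str, str], List[str]]:
    """Split the e-voting tally into kept ballots and revoked voters.

    Conventional ballots cannot be removed from their tally, so a voter who
    used both facilities keeps the conventional ballot and loses the
    electronic one."""
    revoked = sorted(v for v in e_tally if v in conventional_voters)
    kept = {v: b for v, b in e_tally.items() if v not in conventional_voters}
    return kept, revoked
```

The list of revoked voters is returned as well, since it has to be recorded alongside the result.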

1.6 E-voting Requirements


Now the requirements for an e-voting system can be stated.

1.6.1 Functional Requirements


The system must support the following activities:

Organizer's use cases: The election organizer must be able to perform the following operations:

    Prepare election: First, the election information must be prepared:

        Enter election parameters: It must be possible to create a new election record and enter general parameters such as the name and the time period during which the election should take place.

        Enter voter list: After that the voter list must be entered somehow. Besides that, some form of voter identification must be provided. This is reasonably simple if there exists a database of all voters, from which the list can be retrieved with a corresponding query. On the other hand, if voters are supposed to register for the election (as in the USA), a separate software module would be needed to fulfil this requirement.

        Enter ballot types: Further, the system should support entering different ballot types and options.

        Map voters to ballot types: Finally, the system should allow creating a mapping between voters and ballot types. Normally it would be possible to derive this mapping from supplementary information about the voters, if it is available; for example, if ballot types are assigned based on where the voter lives.

    Conduct election: Having prepared the election information, it should be possible to conduct the election. It is conceivable that the system would start and stop the election automatically, based on the time period entered when creating the election information.

        Start election
        Pause election
        Stop election

    Finish election: After the election has been conducted, it must be concluded.

        Revoke voters: The system must allow the ballots of some voters to be left uncounted. This is an optional but highly desirable requirement, which was discussed in Section 1.5.

        Compute result: After that the result must be computed: for each ballot type it must be calculated how many times each option was selected.

        Output result: Finally, the election result must leave the system and enter the external world. One option is to print it on paper. The other option is to produce a digitally signed document.

    Archive election: As the last step, the election information must be archived to durable media.

        Save election information: The following information must be archived:

            - Election information: voter list, ballot types, mapping between voters and ballot types.
            - Binary representation of the cast votes, or any other information on the basis of which the election result was computed (and can be recomputed), if possible.
            - The list of voters whose ballots were revoked when computing the election result.
            - The official election result as computed before.

        Verify archived information: It must be possible to verify that all archived information is correct, in particular that the archived election result matches the other archived information.

Figure 1.4 illustrates the main process of the e-voting system from the viewpoint of the organizer.
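The last check of the "Verify archived information" use case, that the archived result matches the other archived information, can be sketched as a recomputation. The archive layout below (keys `votes`, `revoked`, `result`, and plaintext vote records) is an assumption made for the example; in a real system the votes would be stored in whatever binary form the voting scheme produces.

```python
from typing import Dict, List


def compute_result(votes: List[dict], revoked: List[str]) -> Dict[str, Dict[str, int]]:
    """Count the selected options per ballot type, skipping revoked voters."""
    result: Dict[str, Dict[str, int]] = {}
    for vote in votes:
        if vote["voter"] in revoked:
            continue
        counts = result.setdefault(vote["ballot_type"], {})
        counts[vote["option"]] = counts.get(vote["option"], 0) + 1
    return result


def verify_archive(archive: dict) -> bool:
    """Recompute the result from the archived votes and compare it with the
    archived official result."""
    recomputed = compute_result(archive["votes"], archive["revoked"])
    return recomputed == archive["result"]
```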

Figure 1.4: Voting process. (Diagram steps: Preparing election, Conducting election, Revoking voters, Computing result, Archiving.)

Voter's use cases: The voter must be able to perform the following operations:

    Select election: As there might be many elections taking place simultaneously, the voter must be able to select which election he wants to work with.

    See ballot and make a choice: After having selected the election, the voter must be able to see the options and select one of them.

    Submit ballot: Finally, he must be able to submit his ballot to the system.

    Change or delete ballot: Optionally, it might be allowed to change or even delete a previously submitted ballot.

Access Control: It should be possible to define, for each system operation, which users can perform it.
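The organizer's process of Figure 1.4 and the access control requirement can be combined in one small state machine. This is only a sketch: the state names, the two roles, and the rule table are assumptions chosen to mirror the use cases above, not a prescribed design.

```python
from enum import Enum


class State(Enum):
    PREPARING = "preparing"
    RUNNING = "running"
    PAUSED = "paused"
    STOPPED = "stopped"
    COUNTED = "counted"
    ARCHIVED = "archived"


# operation -> (roles allowed, states in which it is allowed, resulting state)
# A resulting state of None means that the state does not change.
RULES = {
    "start":   ({"organizer"}, {State.PREPARING, State.PAUSED}, State.RUNNING),
    "pause":   ({"organizer"}, {State.RUNNING}, State.PAUSED),
    "stop":    ({"organizer"}, {State.RUNNING, State.PAUSED}, State.STOPPED),
    "submit":  ({"voter"}, {State.RUNNING}, None),
    "revoke":  ({"organizer"}, {State.STOPPED}, None),
    "compute": ({"organizer"}, {State.STOPPED}, State.COUNTED),
    "archive": ({"organizer"}, {State.COUNTED}, State.ARCHIVED),
}


class Election:
    def __init__(self, name: str) -> None:
        self.name = name
        self.state = State.PREPARING

    def perform(self, role: str, operation: str) -> None:
        """Check role and state, then apply the transition if there is one."""
        roles, states, next_state = RULES[operation]
        if role not in roles:
            raise PermissionError(f"{role} may not perform {operation}")
        if self.state not in states:
            raise RuntimeError(
                f"{operation} is not possible while {self.state.value}"
            )
        if next_state is not None:
            self.state = next_state
```

Keeping the rules in one table makes it possible to state, for each operation, who may perform it, which is what the access control requirement asks for.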

1.6.2 Non-functional Requirements


Convenience and Usability: The system must be convenient to use, especially for voters. This includes:

    Simplicity: Performing the voting process should be as simple as possible. The graphical representation of the ballot should be intuitive (probably it should look like a paper ballot).

    Mobility: Ability to vote from as many places and devices as possible.

    Revisability: Ability to change or delete a previously submitted vote.

    Efficiency: It should be possible to complete the voting process as fast as possible. Software delays (time during which the software is performing actions, not waiting for user input) should not be longer than (say) 1 second, as in conventional web applications. The final step of submitting the vote should not last longer than (say) 1 minute.

Efficiency: The efficiency of the voter's software has already been discussed. The efficiency of the organizer's software should be as follows:

- Preparation of election information should behave like a usual application, where software delays should not be longer than (say) 1 second.
- The process of conducting elections should be efficient enough to ensure the required speed of the voters' actions.
- The final stages of the election, such as computing the result and archiving or verifying the election data, should not last more than a couple of hours.

Freedom of Choice: Besides the functional requirements, the system must satisfy the freedom of choice requirement, which was discussed in Section 1.2.

    Privacy: Is considered a minimum requirement and is necessary.

    Incoercibility: In general, incoercibility is very useful, but in the case of e-voting it can never be ensured completely: if people can vote at any location (including at home), it is possible that someone will be looking over the voter's shoulder while he is voting. So it is probably enough to ensure that coercion cannot take place on a larger scale than is possible by being physically present when someone is voting.

Dependability: It is clear that an e-voting system must be much more reliable than conventional software. There are two ways in which the system might malfunction:

- Some operations cannot be performed (the voter cannot vote, the result cannot be computed).
- The system appears to work correctly, but does not (the voter's software silently selects a different option, some voters' ballots are silently omitted, the result is computed incorrectly).

The second type of fault is clearly more dangerous.

Chapter 2

System Design
In this chapter I will describe different options for implementing an e-voting system. I will also try to compare them and evaluate the risks. First, an overview of the theoretical basis will be given, and then the design of different parts of the system will be described. It is important to stress that the design of such a system means not only software design, but also design of the organization and the process, and possibly also of hardware.

As already mentioned, the main problem of implementing e-voting is security. Security is almost always relative: it can be broken with some investment of resources (money and/or time). Although the benefits of breaking an e-voting system cannot be measured completely (for instance in money), it can still be argued that an adversary would not spend more resources on breaking e-voting than he would gain from breaking it. This implies that elections of different importance require different minimum levels of security.

2.1 Theoretical Basis


Before designing the real system, the e-voting problem must be analysed and solved mathematically. In fact, this problem belongs to the field of cryptography. In this text an overview of the topic will be given. Those interested in more details can find them in [Myr00]. (This implies that references to sources will be given only when necessary. The interested reader can find them in [Myr00].)

2.1.1 Model of the Real World


First, a simplified model of the real world must be introduced, within which possible solutions can be described and evaluated. The following terms are introduced:

Actor: An actor is any entity (person, computer) that is assumed to have an identity. Actors are able to perform polynomially bounded randomised computations and have means of keeping information secret. Actors are also assumed to have synchronized clocks: there is an upper limit on the difference in clock value and clock speed.

Connection: Actors are able to communicate with each other. Messages are sent between addresses, not between actors (there is no identification). Communication is dependable (sent messages are always received in the order of sending) and synchronous (the time needed to deliver a message from sender to receiver has an upper bound). At the same time, communication is neither anonymous nor private: it might be possible to find out which actor sent or received a message, and any actor might be able to read any message. I call such a connection public. In principle one could also think of private, untappable communication with reliable identification (see [Myr00]), which is very useful when aiming at incoercibility, but such an assumption is very hard to implement in reality (every voter would need such a communication channel), so I will omit it.

Threshold Trust: Often constructions are formed out of $n$ similar components (actors) so that their properties hold as long as at most $t$ components fail to fulfil some requirement. For instance, critical services are often constructed to be reliable as long as at least $n - t$ components out of $n$ are working correctly and are able to communicate (equivalent to the requirement that at most $t$ components fail). Security is often ensured as long as no more than $t$ components perform some specific operations (are dishonest). The motivation for this approach is that it might be simpler to ensure such a requirement than to ensure that one specific component does not fail.
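The motivation for threshold trust can be made concrete with a small calculation. The following is a minimal sketch, assuming that components fail independently and each with the same probability; this independence assumption, the function name, and the parameter names `n`, `t`, and `p` are illustrative choices made here, not part of the model above.

```python
from math import comb


def threshold_failure_probability(n: int, t: int, p: float) -> float:
    """Probability that more than t of n components fail.

    Assumes (hypothetically) that failures are independent and that every
    component fails with the same probability p.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(t + 1, n + 1))


# Illustrative numbers only: five components, up to two failures tolerated.
single = 0.01
combined = threshold_failure_probability(n=5, t=2, p=single)
print(f"single component: {single}, threshold construction: {combined:.2e}")
```

Under these assumptions the construction as a whole fails far less often than any single component, which is the intuition behind preferring a threshold requirement over trust in one specific component.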

2.1.2 Electronic Voting Scheme


The aim of the theoretical activity is to define and design a simple construction that abstracts the real-world problem of organizing e-voting. Such a construction is called an Electronic Voting Scheme (EVS). The following types of actors are defined:

Voter: An actor that will vote in the election.

Authority: An actor that will organize the election.

One and the same actor can belong to more than one set. It is assumed that:

- There is one ballot type with L options, which are known to all actors.
- All actors know the identities of the authorities and have means of communicating with them (e.g. know the addresses at which they send and receive messages).
- All actors know the identities of all voters.
- Some of the voters have selected one option and intend to cast it.

An electronic voting scheme is a set of protocols (algorithms) for actors, which allow voters to send their ballot with the selected option to the authorities, and the authorities to compute the election result and make it available to all actors. An EVS must satisfy the following requirements:

Correctness: The election result must be computed correctly based on all ballots submitted by the voters.

Freedom of Choice: Either privacy or incoercibility, as discussed previously.

It is very typical to require an EVS to be verifiable in order to ensure correctness: the computation results of the authorities must be verifiable by any actor. This usually means that the authorities must provide a computational proof of correctness of the election result. Each EVS should define:

- Means of actor identification.
- To what extent actors must be trusted to perform according to the prescribed protocols. Basically, there are two options:
  - Assume that the actor is honest, i.e. follows the protocols.
  - Assume threshold trust towards the actors (usually applied only to authorities).

2.1.3 Public Key Infrastructure


A situation in which actors need to communicate secretly and with reliable identification, but can use only a public communication channel without reliable identification, is very typical. In order to address it, a framework called Public Key Infrastructure (PKI) has been developed. The basis for PKI is public key cryptography, which defines a set of interfaces with special properties that must be satisfied by corresponding implementations. These interfaces are depicted in Figure 2.1.
[Figure 2.1: Encryption and signature algorithms. The diagram shows a Key Pair consisting of one Public Key (+binary) and one Private Key (+binary), and four interfaces: Key Generator (+Generate()), Digest Algorithm (+Compute()), Encryption Algorithm (+Encrypt(), +Decrypt()), and Signature Algorithm (+Sign(), +Verify()).]

Digest Algorithm::Compute() takes as input any binary sequence and produces a hash of fixed length (say 128 bits). A digest algorithm is supposed to be collision resistant: it should be infeasible to find two different binary sequences having the same hash.
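As an illustration of the digest interface, here is a minimal sketch using SHA-256 from Python's standard `hashlib` module. The choice of SHA-256 (which produces 256 bits rather than the 128 bits mentioned above) and the function name `compute_digest` are assumptions made for this example only.

```python
import hashlib


def compute_digest(data: bytes) -> bytes:
    """Map a binary sequence of any length to a fixed-length hash (SHA-256)."""
    return hashlib.sha256(data).digest()


short_input = b"vote"
long_input = b"vote" * 100_000

# The output length does not depend on the input length.
print(len(compute_digest(short_input)), len(compute_digest(long_input)))
```

Collision resistance cannot be demonstrated by running code; it is a property that the chosen algorithm is believed to have.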
Key Generator::Generate() is an algorithm that randomly generates a pair of keys of a specified bit length (e.g. […] bits). These keys are used as inputs to the Encryption Algorithm and Signature Algorithm methods.
Encryption Algorithm::Encrypt() takes as input a Public Key and a binary sequence of any length (called plaintext) and produces a binary sequence of comparable length (called ciphertext). Encryption Algorithm::Decrypt() takes as input a Private Key and performs the reverse transformation from ciphertext to plaintext (which was encrypted using the public key from the same key pair). It is required that an actor who knows the public key (but not the private key), some number of ciphertexts, and the corresponding plaintexts is not able to derive any information about the plaintext corresponding to some other ciphertext.
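The relationship between Encrypt() and Decrypt() can be illustrated with textbook RSA on tiny numbers. This is only a sketch under stated assumptions: the text does not name a concrete algorithm, the primes are fixed rather than randomly generated, messages are small integers rather than binary sequences, and unpadded RSA is deterministic, so it does not meet the secrecy requirement stated above. All function names are hypothetical.

```python
def generate_key_pair():
    """Return (public_key, private_key) for a toy RSA instance.

    A real key generator would pick large random primes; the fixed values
    here are for illustration only.
    """
    p, q = 61, 53
    n = p * q
    phi = (p - 1) * (q - 1)
    e = 17                      # public exponent, coprime with phi
    d = pow(e, -1, phi)         # private exponent (modular inverse, Python 3.8+)
    return (n, e), (n, d)


def encrypt(public_key, plaintext: int) -> int:
    n, e = public_key
    if not 0 <= plaintext < n:
        raise ValueError("plaintext must be in the range [0, n)")
    return pow(plaintext, e, n)


def decrypt(private_key, ciphertext: int) -> int:
    n, d = private_key
    return pow(ciphertext, d, n)


public_key, private_key = generate_key_pair()
ciphertext = encrypt(public_key, 65)
recovered = decrypt(private_key, ciphertext)
```

Practical schemes add randomised padding precisely so that equal plaintexts do not produce equal ciphertexts.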

Signature Algorithm::Sign() takes as input a Private Key and a binary sequence of fixed length and produces another binary sequence of fixed length called a signature. The length of the input and output sequences is comparable to the length of the key. Signature Algorithm::Verify() takes as input a Public Key, a binary sequence, and a signature, and checks that this signature was produced from that binary sequence with the corresponding private key. It is required that an actor who knows the public key (but not the private key), some number of binary sequences, and their corresponding signatures is not able to produce signatures for any other binary sequence. In order to sign binary sequences of any length, a digest is computed from them and then signed.

There exist cryptographic algorithms that satisfy these interfaces, under the important assumption that actors have only polynomially limited computational power (see footnote 2).

Footnote 2: It should be stressed that there also exist many variations that do not exactly fit into this description. Also, not all encryption and signature algorithms have a complementary counterpart in the sense of sharing the same key pair. This means that there exist encryption algorithms that do not have a complementary signature algorithm that could use the same key pair, and vice versa.

Now, if two actors $A$ and $B$ each generate a pair of keys, somehow exchange their public keys, and keep their private keys secret, they gain the ability to communicate secretly and with identification: when actor $A$ wants to send message $m$ to actor $B$, it signs it with its private key, encrypts it with $B$'s public key, and then sends the resulting message $c$ to $B$. No other actor who might learn message $c$ but does not know $B$'s private key can derive any information about the message $m$. At the same time $B$ is able to verify $A$'s signature, which could have been produced only by some actor knowing $A$'s private key (which is supposed to be kept secret). Some encryption algorithms have an interesting property: […] is able to prove to someone else knowing message […] that he sent message […], without revealing any information about his private key.

Despite the beauty of such a solution, there is one problem: the actors need to exchange their public keys somehow. It is not possible to do this over the public connection, because it is not possible to be sure whom the public key came from. In fact, in such a setting this problem is not solvable at all: at some moment communication with reliable identification is necessary. So at best one may require such communication to take place only once. PKI provides the following construction to deal with this problem:

- All actors are assumed to have an identity, which has a unique identifier that can be represented as a binary string.
- A special construction called a certificate is introduced. It consists of:
  - Subject's public key (in binary representation).
  - Subject's identity (binary representation of its identifier).
  - Some optional attributes, explained later.
  - Issuer's identity.
  - Signature of all preceding items, verifiable with the issuer's public key.
- A certificate is interpreted as a statement that binds the contained public key to the subject's identity. If someone holding such a certificate has reasons to trust the issuer of the certificate and knows the issuer's public key, he has reason to believe that the specified public key belongs to the subject.
- A certificate may allow (trust) or forbid (not trust) the subject to issue certificates himself. An actor who is trusted to issue certificates is called a Certification Authority (CA). A certificate may also be limited to some field of activity. Such information can be recorded in the attributes of the certificate.

If an actor has some number of certificates, one can think of a graph in which nodes are identities and arcs signify certificates connecting the issuer to the subject. Besides that, some nodes have associated certificates, which are assumed to belong to the corresponding identities. Different nodes, certificates, and arcs can have different levels of trustworthiness. Derivations can then be made on this graph (transitive closure).

The simplest form of such a graph is a tree: there is one root CA, which might certify some number of intermediate CAs, which finally certify all interested actors. It is assumed that everybody trusts the root CA. In order to get into this framework, one has to prove his identity to one of the CAs (which cannot be done within our model and so must be done externally) and provide his public key. After that, a certificate is generated that is trusted by everybody.

It is important to point out that this whole framework holds only as long as each actor is willing to keep his private key secret. Nothing prevents him from revealing it to someone else. Also, in real life someone might steal someone else's private key, and for this reason CAs are supposed to provide means of checking whether a certificate is still valid. This can be accomplished, for instance, by providing a certificate database (containing valid certificates, revoked certificates, or both) that can be queried online, or by periodically publishing certificate revocation lists (CRLs).
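The derivation of trust over the certificate graph can be sketched as a simple transitive closure starting from the root CA. The sketch below is illustrative only: it models a certificate by its subject, its issuer, and a flag saying whether the subject may issue certificates, and it deliberately omits signature verification, attributes, and revocation checks. The class, field, and function names, as well as the example identities, are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class Certificate:
    subject: str             # identity the public key is bound to
    issuer: str              # identity that signed the certificate
    subject_may_issue: bool  # whether the subject is trusted to act as a CA


def derive_trusted_identities(root_ca: str, certificates: Iterable[Certificate]) -> Set[str]:
    """Return identities whose key binding can be derived from the root CA."""
    certificates = list(certificates)
    trusted_issuers = {root_ca}
    trusted_identities = {root_ca}
    changed = True
    while changed:  # repeat until no new identity or issuer is derived
        changed = False
        for cert in certificates:
            if cert.issuer not in trusted_issuers:
                continue
            if cert.subject not in trusted_identities:
                trusted_identities.add(cert.subject)
                changed = True
            if cert.subject_may_issue and cert.subject not in trusted_issuers:
                trusted_issuers.add(cert.subject)
                changed = True
    return trusted_identities


certificates = [
    Certificate("Intermediate CA", "Root CA", True),
    Certificate("Alice", "Intermediate CA", False),
    Certificate("Mallory", "Alice", False),  # Alice is not allowed to issue
]
trusted = derive_trusted_identities("Root CA", certificates)
```

In this example the certificate issued by Alice is ignored, because her own certificate does not allow her to act as a CA.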

2.1.4 Time-stamping
Time-stamping is a service complementary to PKI that allows binding an arbitrary message to a moment in time. This is done by creating an additional time certificate message. At least two flavours of time-stamping exist:

Absolute: Allows determining a reasonably small time interval within which the message received its time certificate. Such a construction is of use only when the actors' clocks are synchronized.

Relative: Allows determining, for any two messages having time certificates, which of them received its time certificate earlier.

A time-stamping service (TSS) is supposed to be implemented by one trusted actor or by actors with threshold trust. Actors forming such a service are called Time Stamping Authorities (TSAs). Ideally, time certificates should allow comparing them without the need to contact the TSS (offline verification).

Observing the time certificate of a message proves that the message was created before the moment in time associated with this certificate. If a signed message incorporates the time certificate of any message (e.g. an empty one), one can conclude that the signed message was created after the moment in time associated with that certificate.

One of the most important applications of time-stamping is the situation in which someone's certificate is revoked (e.g. due to a private key leak). In such a situation a time certificate can be used to prove that a message was signed before the certificate was revoked.

2.1.5 Bulletin Board


Often a service called Bulletin Board (BB) is assumed to exist. It is supposed to allow each actor to send signed messages to it. After that, every actor should eventually be able to see all messages in the order they were sent. The sender of a message should know for certain whether sending the message succeeded. Such requirements are called atomicity. Also, events are usually defined at which the bulletin board starts and stops receiving messages. If the ordering of messages is not important, the construction is called a reliable store.

The dynamics of a bulletin board can be described by latency (how long it takes after a message was sent for it to become readable by everyone) and monotonicity (a guarantee that if someone has seen a message on the bulletin board, then at each successive read this message will be visible to every reader). Monotonicity for one specific reader could be called read repeatability.

For the purpose of proving time-outs (in order to accuse some actor of not participating) it is very desirable that the bulletin board be able to tell with reasonable precision at which moment a message was sent. Such a property could be called absolute time-stamping (as opposed to the relative ordering provided by the bulletin board in any case).

In a modification of the bulletin board called atomic multicast, the board is supposed to forward messages to some number of subscribers. Note that this is not the same as an actor sending a message to a group of actors himself, because, for instance, there is no guarantee that one and the same message will be sent to everybody. If the ordering of messages is not important, the construction is called reliable multicast. It should be clear that atomic multicast can be implemented with a bulletin board by polling it periodically, which of course might not be as efficient.

A simplification of the bulletin board (and atomic multicast) is to maintain a separate single-writer, multiple-reader bulletin board for each actor. In this situation the ordering of messages can be derived for each actor separately. Even in such a setting one actor can prove that his message was sent after some message of another actor by including the hash of the latter in the former.

It is relatively simple to implement a bulletin board if there is one actor everybody would trust. Otherwise a system consisting of multiple actors with threshold trust in them must be devised. It should be clear that full-blown time-stamping exists iff a full-blown bulletin board exists. Still, from the viewpoint of efficiency they have different profiles: a bulletin board requires much more storage, while time-stamping might be required to process many more messages and to exist for a longer period of time.

For the purpose of e-voting it is enough to have a single-writer bulletin board for each actor with reasonable latency, repeatable read, and absolute time-stamping. Monotonicity is desirable, but is not a requirement. The number of messages at each bulletin board would be fairly small (say […]).
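The idea of per-actor single-writer boards, with ordering across boards proven by hash inclusion, can be sketched as follows. This is a minimal in-memory illustration under several assumptions: signatures, time-stamps, latency, and distribution over multiple trusted actors are all left out, SHA-256 stands in for the digest algorithm, and the class and function names are hypothetical.

```python
import hashlib
from typing import List, Tuple

# An entry is (message body, digest of an earlier message, or b"" if none).
Entry = Tuple[bytes, bytes]


def digest(body: bytes) -> bytes:
    return hashlib.sha256(body).digest()


class SingleWriterBoard:
    """Append-only board that only its owner may write to; anyone may read."""

    def __init__(self, owner: str) -> None:
        self.owner = owner
        self._entries: List[Entry] = []

    def post(self, writer: str, body: bytes, refers_to: bytes = b"") -> None:
        if writer != self.owner:
            raise PermissionError("only the owner may write to this board")
        self._entries.append((body, refers_to))

    def read(self) -> Tuple[Entry, ...]:
        # Entries are never removed, so repeated reads show a growing prefix.
        return tuple(self._entries)


def proves_sent_after(later: Entry, earlier_body: bytes) -> bool:
    """True if the later entry embeds the hash of the earlier message."""
    return later[1] == digest(earlier_body)


alice_board = SingleWriterBoard("alice")
bob_board = SingleWriterBoard("bob")
alice_board.post("alice", b"setup information")
bob_board.post("bob", b"ballot", refers_to=digest(alice_board.read()[0][0]))
```

Because the hash of a message cannot feasibly be known before the message exists, embedding it shows that the second message was composed later, even though the two boards keep no common ordering.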

2.1.6 Threshold Encryption and Signature


Conventional encryption and signature algorithms can be extended in such a way that the private key is split into parts (shares), which are given to different actors, so that decryption and signing can be performed only if a threshold number of share holders decide so. Such a construction is very useful in the context of threshold trust. Figure 2.2 depicts the relevant constructions.

[Figure 2.2: Threshold encryption and signature algorithms. The diagram shows the data structures Public Key (+binary), Private Key (+binary), Private Key Share (+binary), Public Output (+binary, +Verify()), Partial Decryption (+binary, +proof of correctness, +Verify()), and Partial Signature (+binary, +proof of correctness, +Verify()), and three interfaces: Threshold Key Generator (+Generate(), +Reconstruct Public Key(), +Reconstruct Private Key()), Threshold Encryption Algorithm (+Encrypt(), +Partial Decrypt(), +Reconstruct()), and Threshold Signature Algorithm (+Partial Sign(), +Reconstruct(), +Verify()).]

It is assumed that each of the actors has his own private key and that the corresponding certificate is available to all others. It is also assumed that the actors communicate through the bulletin board (or, even better, with atomic multicast). Each of the actors is supposed to execute Threshold Key Generator::Generate(), which:

- Takes as input the actor's key pair.
- Outputs a Private Key Share (supposed to be kept secret) and a Public Output with the actor's signature (supposed to be made available to everyone).

Public Outputs can be verified with the Verify() method. Those that do not pass this verification should be ignored from then on.

Threshold Key Generator::Reconstruct Public Key() can construct a Public Key based on Public Outputs present on the bulletin board. Public key can be used by conventional Encryption Algorithm to encrypt and Signature Algorithm to verify signature.

In principle a conventional Private Key can be reconstructed with help of Threshold Key Generator::Reconstruct Private Key(), although it is not usually used.
Threshold Encryption Algorithm::Encrypt() takes as input a Public Key and works exactly as the conventional Encryption Algorithm. In order to decrypt a ciphertext, the actors must execute Threshold Encryption Algorithm::Partial Decrypt(), which:

- Takes as input the ciphertext and the actor's Private Key Share.
- Produces a Partial Decryption.

A Partial Decryption consists of:

- Binary information.
- Computational proof of correctness, which can be verified with Verify(), which takes as input the Public Outputs present on the bulletin board. Those partial decryptions that do not pass this verification should be ignored from then on.

Finally, the plaintext can be reconstructed from the Partial Decryptions of a threshold number of actors with the help of Threshold Encryption Algorithm::Reconstruct().

In order to sign a plaintext, the actors must execute Threshold Signature Algorithm::Partial Sign(), which:

- Takes as input the plaintext and the actor's Private Key Share.
- Produces a Partial Signature.

A Partial Signature consists of:

- Binary information.
- Computational proof of correctness, which can be verified with Verify(), which takes as input the Public Outputs present on the bulletin board. Those partial signatures that do not pass this verification should be ignored from then on.

Finally, the signature can be constructed from the Partial Signatures of a threshold number of actors with the help of Threshold Signature Algorithm::Reconstruct(). Threshold Signature Algorithm::Verify() takes as input a Public Key and works exactly as the conventional Signature Algorithm. It is important to point out that:

- Both threshold decryption and threshold signature operations can be performed only by the threshold number or more of the actors, or by all actors whose Public Output passes Verify() (which is relevant if their number is less than the threshold).
- The result of the decryption operation cannot be incorrect, due to the proofs of correctness of the partial decryptions (the result of the signature operation can always be verified directly). This can also be used in the case […].
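The threshold property itself (any sufficiently large group of share holders can act, a smaller group cannot) can be illustrated with Shamir's secret sharing. This is a different and simpler technique than the one described above: here a single dealer knows the whole secret and hands out shares, the secret is explicitly reconstructed, and there are no proofs of correctness, whereas the construction in the text generates shares jointly and normally never reconstructs the private key. The prime, the function names, and the parameters are assumptions chosen for this sketch.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; all arithmetic is done modulo PRIME


def split_secret(secret: int, threshold: int, share_count: int):
    """Split secret into share_count shares; any threshold of them suffice."""
    # Random polynomial of degree threshold-1 with the secret as constant term.
    coefficients = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        result = 0
        for coefficient in reversed(coefficients):  # Horner's rule
            result = (result * x + coefficient) % PRIME
        return result

    return [(x, evaluate(x)) for x in range(1, share_count + 1)]


def reconstruct_secret(shares) -> int:
    """Lagrange interpolation at x = 0 over the given shares."""
    secret = 0
    for i, (x_i, y_i) in enumerate(shares):
        numerator, denominator = 1, 1
        for j, (x_j, _) in enumerate(shares):
            if i != j:
                numerator = numerator * (-x_j) % PRIME
                denominator = denominator * (x_i - x_j) % PRIME
        secret = (secret + y_i * numerator * pow(denominator, -1, PRIME)) % PRIME
    return secret


shares = split_secret(secret=123456789, threshold=3, share_count=5)
recovered = reconstruct_secret(shares[:3])
```

Any three of the five shares recover the secret, while fewer than three shares give no information about it.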

2.1.7 Implementations of EVS


In this section I will outline possible implementations of an electronic voting scheme. The simplest solution is the Single Trusted Authority Solution, in which there is one authority (actor) that everybody trusts. In this case every voter signs a representation of his ballot with his private key, encrypts it with the authority's public key, and sends it to the authority. Later the authority decrypts these votes, verifies the signatures, computes the result, and makes the signed result available to all others. In such a situation the authority is trusted to:

- Accept ballots from all voters (see footnote 3).
- Count them correctly.
- Not use individual decrypted ballots (intermediate results of the computation) in any other operation, and discard them.

Despite the naivety of this solution, it can be made reasonably safe in real life. There exist many other more or less secure solutions, but the best of them (such statements are always subjective) follow a pattern that is usually called the Multiple Authority Solution. The algorithms and data structures involved are depicted in Figure 2.3.

Footnote 3: Multiple ballots could be accepted and the latest of them used; this would allow a voter to modify his vote. Also, there could be a ballot of a special form that would require the authority not to count it, which would enable a voter to delete a previously submitted ballot.

[Figure 2.3: Multiple authority EVS. The diagram shows the data structures Voter's Ballot (+encrypted information, +proof of correctness, +signature, +Verify()), Authority's Setup Information (+information, +signature, +Verify()), and Authority's Output (+computation result, +proof of correctness, +signature, +Verify()), and three interfaces: Voter's Algorithm (+Generate Ballot()), Authority's Algorithm (+Setup(), +Compute()), and Consumer's Algorithm (+Compute Result(), +Verify Result()).]

It is assumed that there are several authorities and that each actor has a pair of public and private keys and the certificates of all other actors. It is also assumed that the actors communicate through the bulletin board (it would be good if the authorities could communicate through atomic multicast). First the authorities jointly execute a setup phase, during which each authority must execute the algorithm Authority's Algorithm::Setup(), which:

- Takes as input the authority's key pair.
- Might communicate with other authorities.
- Outputs a piece of private information that the authority should keep secret, and also the public Authority's Setup Information with the authority's signature.

The correctness of setup information can be verified with Authority's Setup Information::Verify(). Setup information that does not pass verification should be ignored from then on. The resulting setup information must be available to all actors.

Further, during the voting phase each voter can generate his ballot using Voter's Algorithm::Generate Ballot(), which:

- Takes as input the voter's key pair, the selected option, and the Authority's Setup Information from all authorities.
- Outputs a Voter's Ballot.

A Voter's Ballot consists of:

- Encrypted information about the voter's choice.
- Computational proof of correctness, which can be used to check that the encrypted information was formed correctly without the need for decrypting it.
- Voter's signature.

A voter's ballot can be verified with Voter's Ballot::Verify(). Ballots that do not pass this verification should be ignored from then on. All voters' ballots should be made available to all actors.

After that, during the tallying phase each authority executes Authority's Algorithm::Compute(), which:

- Takes as input the authority's key pair, the private information generated during the setup phase, the Authority's Setup Information from all authorities, and the voters' ballots.
- Verifies the voters' ballots using Voter's Ballot::Verify() and selects one correct ballot for each eligible voter. All authorities must produce exactly the same list of ballots.
- Produces an Authority's Output.

An Authority's Output consists of:

- Computation result, explained later.
- Proof of correctness of the computation result.
- Authority's signature.

An authority's output can be verified with Authority's Algorithm::Verify(). Outputs that do not pass verification should not be used further.

Finally, every actor can execute Consumer's Algorithm::Compute Result(), which:

- Takes as input the Authority's Setup Information from all authorities, and the Authority's Output from exactly the threshold number of different authorities.
- Computes the election result, telling how many times each option was selected. This step can take a long time.

This result must be verified with Consumer's Algorithm::Verify Result(), which:

- Takes as input the Authority's Setup Information from all authorities, the voters' ballots, the Authority's Output from exactly the threshold number of different authorities, and the computed result.
- Verifies the voters' ballots using Voter's Ballot::Verify() and selects one correct ballot for each eligible voter. The selected ballots must be exactly the same as those selected by the authorities.
- Verifies the Authority's Output from those authorities with the Verify() method.
- Verifies that the election result was computed correctly from the Authority's Output of those authorities.

Result verification is remarkably faster than result computation.

It is of crucial importance that every actor sees the same Authority's Setup Information from every authority and the same Voter's Ballot from every voter (which might not be the case if some actor could send different information to different actors). This condition implies that all information should be posted to the bulletin board. It is important to point out that even if this condition does not hold, a wrong election result cannot be computed, because some verification would fail.

As voters' ballots are sent to the bulletin board, it is possible to allow voters to send multiple ballots, among which the latest would be selected. There could also be a ballot of a separate form that would require the authority not to count it. This would allow deleting a previously cast vote, although this might not be secret. The following also holds:

- The election result can be computed as long as there are at least the threshold number of authorities that follow the algorithms.
- The election result can never be computed incorrectly (to be more precise, this can happen only with negligibly low probability).
- Information on the choices of individual voters can be extracted if at least the threshold number of authorities that produced correct Authority's Setup Information decide to do that. All authorities that produced correct Authority's Setup Information can together do the same (this is relevant in the case when the number of such authorities is less than the threshold).

All this implies that no actor should proceed further if the number of authorities that produced correct Authority's Setup Information is less than the threshold. The careful reader has probably already noticed the similarity of this construction to threshold encryption and signature. An example of such an EVS would be variations of [CGS97] with shared public key (threshold) generation in the setup phase (see [Myr00]). Another option is, for instance, [CFSY96].
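To give a feeling for how a result can be computed from encrypted ballots without decrypting any single ballot, the following toy sketch uses ElGamal encryption with the vote placed in the exponent, so that multiplying ciphertexts adds votes. It is an illustration in the spirit of homomorphic schemes, not a description of any scheme cited above: it uses one private key instead of a key shared among authorities, a yes/no ballot only, no proofs of correctness, no signatures, and deliberately tiny group parameters. All names and constants are assumptions. The final search for the exponent is the kind of step that makes result computation slower than verification.

```python
import secrets

P = 467  # small safe prime, P = 2 * Q + 1 (toy size, insecure)
Q = 233  # prime order of the subgroup generated by G
G = 4    # generator of the order-Q subgroup of integers modulo P


def generate_key():
    """Return (public_key, private_key) with public_key = G^private_key."""
    private_key = secrets.randbelow(Q - 1) + 1
    return pow(G, private_key, P), private_key


def encrypt_vote(public_key: int, vote: int):
    """Encrypt vote (0 or 1) as the pair (G^r, public_key^r * G^vote)."""
    r = secrets.randbelow(Q - 1) + 1
    return pow(G, r, P), pow(public_key, r, P) * pow(G, vote, P) % P


def combine(ballots):
    """Multiply ciphertexts componentwise; the result encrypts the vote sum."""
    a, b = 1, 1
    for x, y in ballots:
        a, b = a * x % P, b * y % P
    return a, b


def decrypt_tally(private_key: int, ciphertext, max_votes: int) -> int:
    a, b = ciphertext
    g_to_sum = b * pow(pow(a, private_key, P), -1, P) % P
    # Recover the sum by trying all possible totals (a small discrete log).
    for total in range(max_votes + 1):
        if pow(G, total, P) == g_to_sum:
            return total
    raise ValueError("tally out of range")


public_key, private_key = generate_key()
votes = [1, 0, 1, 1, 0]
ballots = [encrypt_vote(public_key, v) for v in votes]
tally = decrypt_tally(private_key, combine(ballots), max_votes=len(votes))
```

Only the combined ciphertext is decrypted here, so the holder of the private key does not need to open any individual ballot to obtain the count.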

2.1.8 On the Freedom of Choice


As already mentioned, the freedom of choice requirement implies either privacy or incoercibility, where the former is obligatory and the latter is optional. All patterns for EVS described in Section 2.1.7 ensure privacy, but do not guarantee incoercibility.

The simplest way for a voter to prove how he voted is to reveal his private key. Fortunately, in real life private keys are associated with the voter's identity and might be used to access different services such as banks, which implies that an actor would have no motivation to reveal his private key. But this does not solve the problem, because it is possible to prove how a voter voted without revealing the private key (the private key is only used to sign). See [Myr00] for more details.

Generally, in an environment with only public (tappable) communication, incoercibility is provably not possible. If PKI is introduced and it is assumed that the voter does not want to reveal any information about his private key, it might be possible to devise an EVS that would satisfy incoercibility, but I am not aware of any such construction.

Another option is based on the observation that incoercibility is not achievable because the voter can see intermediate computation results and is able to sign any message. This leads to a solution in which a specialized device asks the voter which option to use, performs all computations, signs the result, and then passes it over to the voter, so that he is not able to see intermediate results and is not able to sign arbitrary messages. Unfortunately, it is very hard to imagine that such a solution would justify itself economically.

2.2 Designing Framework


Electronic voting schemes are built upon the real world model and supporting services such as PKI and the bulletin board. In this section I will consider options for implementing these prerequisites. In short, they can be called computer and network security. The bulletin board will be covered in a separate section because it is more application-specific. Once such a framework is implemented, one can concentrate on the essential problems of implementing e-voting.

2.2.1 Real World Model


Previously, the real world abstraction was introduced in terms of actors being capable of performing randomised computations. In reality actors are often represented by people (or even organizations) who use their computing devices to perform computations. Software must also be written for these devices. For this reason it seems sensible to introduce the following requirements:

Computing Device: The computing device must be:

- Correct: It executes the instructions of the provided software correctly.
- Untappable: The computing device's memory should not be readable by anybody other than the device's owner. This includes both volatile and permanent memory.
- Randomised: The computing device should provide a source of random bits for use in randomised cryptographic algorithms.
- Synchronized Clock: All computing devices should have synchronized clocks.

Software: The software used must be correct, i.e. it must correspond to the algorithms and specifications. Besides that, most people do not write their software on their own. This implies that correct software must somehow be delivered to the computing device.

2.2.2 Computing Device


In general, computing devices are designed to be correct (and lately untappability has become a widely accepted requirement). In order to compromise the security of a computer, the following must be done:

- Gain access to the computer, either virtually or physically (see footnote 4).
- Execute software instructions.

A computer cannot be considered separately from the software running on it (OS, application software, shared libraries), which might have accidental or intentional bugs that enable external entities to manipulate the computer, including executing any software instructions and inspecting permanent or volatile memory. In the context of the contemporary personal computer, the following problems exist:

- General purpose software is written with emphasis on functional requirements and not so much on dependability (because the former, not the latter, brings profit, unless the latter is critical). As a result, contemporary software systems are flooded with security vulnerabilities.
- Another problem is related to contemporary ways of software distribution: lots of programs are installed, and installation programs usually have full access to the system and can modify any feature of it, including introducing backdoors for unauthorized access to the system from outside. Although software firms might have no motivation to do such things themselves, it is enough for one of their employees to do so.
- Many people can gain virtual or physical access to the computer.

In general, the following can be done to ensure that a computing device is correct and untappable:

- Use a minimal, fixed, verified set of necessary software (including the operating system).
- Use a refined access control system, which by default gives the minimum needed rights to the installed software.
- Limit access (including physical access) to relevant personnel only.

Footnote 4: There also exist ideas on how to wiretap a computer from a distance by measuring the magnetic field, etc.

It is generally agreed that the contemporary PC is quite insecure (not sufficiently correct and untappable). At the same time it is probably possible to create a secure computing device, because most of the problems exist for historical reasons or for lack of economic motivation. Also, the smaller and more specialized the system is, the simpler it is to make it secure.

A clear borderline should be drawn between the limited number of computers used by the organizers and the almost unlimited number of computers used by the voters. The amount of resources that can be invested into securing the organizers' computers is orders of magnitude higher than for the voters' computers, which should generally be used as is.

As regards securing the voters' computers, two trends should be mentioned. First, a variety of handheld mobile devices have recently become affordable to the general public. Designing such a device from scratch gives a good opportunity to implement robust security from the ground up. Some devices could also be created with a fixed set of preinstalled, unmodifiable software, which would increase the security of such a device a lot. In practice, however, such devices are not fundamentally more secure than ordinary PCs.

Another idea is to have a tamperproof device (called a smartcard) that has a processor and a memory chip but is otherwise not self-contained, and to keep on it some well managed (even better, fixed) software and data (secrets). The latter could be kept secret even from the owner of the card himself (e.g. the private key). Smartcards have an interface through which other devices can communicate with them. A smartcard is activated by entering a PIN code, which is usually passed through the device to which the smartcard is attached. Smartcards definitely have their own security threats (see [Sch99]). The most notable of them is that, after the PIN code has been entered, the device to which the smartcard is attached can manipulate the card in any uncontrolled way (sign any messages) and can also reveal the PIN to anyone else. This implies that the device to which the smartcard is attached must be rather secure itself. Another problem is that most existing computers are not equipped with smartcard readers, and it will probably take a long time before they become widely adopted.

Besides being correct and untappable, the computing device has to be randomised. Randomness can be obtained from a special physical device or from a cryptographic primitive (algorithm) called a pseudorandom bit generator. The latter must still be seeded with a small random piece of information, which is usually collected from the behaviour of the computing device, which in turn depends on (unpredictable) actions performed by the user.

The best that can be done to keep the clocks of computing devices synchronized is to periodically use the Network Time Protocol ([NTP]) to synchronize them with the clock of some time server. Simple NTP (SNTP), suitable for ordinary computers, provides an accuracy of 1 second, which should be sufficient. Care must also be taken to avoid bugs when dealing with time zones.
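The arithmetic behind such clock synchronization can be sketched with the standard offset and round-trip delay calculation from four timestamps. The sketch below performs no network communication; the function name and the example timestamps are hypothetical, and it assumes the network delay is roughly the same in both directions.

```python
def clock_offset_and_delay(t0: float, t1: float, t2: float, t3: float):
    """Estimate how far the local clock is behind the server clock.

    t0: request sent (client clock)     t1: request received (server clock)
    t2: reply sent (server clock)       t3: reply received (client clock)
    """
    offset = ((t1 - t0) + (t2 - t3)) / 2
    delay = (t3 - t0) - (t2 - t1)
    return offset, delay


# Hypothetical exchange: the client clock is 5 s behind the server, each
# direction takes 0.1 s, and the server needs 0.1 s to answer.
offset, delay = clock_offset_and_delay(t0=100.0, t1=105.1, t2=105.2, t3=100.3)
```

The client would then correct its clock by the estimated offset; the error of the estimate is bounded by the round-trip delay.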

2.2.3 Software
Software correctness is a direct consequence of the quality of the software development process. In order to increase trust in the software, the development process, the source code, and the supplementary documentation must be reviewed and certified by some external trusted parties. Once there is trusted source code from which the software can be built, the software must be delivered to and deployed on the computing device. A problem arises here: the fact that the source code was certified does not imply in any way that the binary distributable that is received was built from that source. The solution is to require the software publisher and the certifiers to sign the distributable and in this way express their trust in the software with respect to a specified purpose. In this case each certifier would need to receive the source code, inspect it, build the binary distributable, and finally sign it. This implies that all certifiers should be able to produce exactly the same distributable, which means that the build tool (compiler) must be deterministic (which build tools hopefully are, or at least can be made quite easily). It is natural to expect that a framework for expressing trust by signing a binary distributable should be a part of PKI. We will see later to what extent this is supported now. Another option is to require certifiers to sign the source code, distribute it, and expect end-users to build it themselves, which is quite unrealistic and also time-consuming.
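The requirement that every certifier reproduces exactly the same distributable can be checked by comparing cryptographic digests. The following is a minimal sketch under stated assumptions: the function names are hypothetical, SHA-256 is chosen only for illustration, and the actual signing of the digest by each certifier is left out.

```python
import hashlib


def digest(data):
    """Return the SHA-256 digest of a distributable as a hex string."""
    return hashlib.sha256(data).hexdigest()


def builds_agree(received, certifier_builds):
    """Check a received distributable against the certifiers' own builds.

    received: bytes of the distributable delivered to the computing device.
    certifier_builds: list of bytes, one independent build per certifier.
    Returns True only if all certifiers produced the same binary (the build
    was deterministic) and the received binary is identical to it.
    """
    expected = {digest(build) for build in certifier_builds}
    return len(expected) == 1 and digest(received) in expected
```

In a fuller design each certifier would sign the common digest, so that the device only needs the signatures and not the certifiers' builds themselves.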

2.2.4 Threshold Trust


The first evident rule is that each component must be made as reliable and secure as possible. The following component failure scenarios and countermeasures could be considered:

Random Failure: Failure events of different components must be made independent (in the sense of probability theory). In this case the probability that more than components will fail is:


where is the probability of failure of one component. For instance if , , the probability of failure of threshold trust is less than and . Note that cannot be too small, because for instance in the case of the probability of failure would be . At the other extreme, if the failure events are completely dependent (i.e. if one component fails, all components fail), no advantage is gained as compared to the case .

Active Adversary: It must be ensured that each component has to be broken separately from the others, so that an adversary would need proportionally more resources to break components.

Colluding Components: Actors that control components (or indeed are the components) should not wish to cooperate with each other to break the service. This is largely a political issue of selecting actors.

The described countermeasures require components to be independent. The following could be done to ensure independence:

Independent implementations and manufacturers (hardware, OS, libraries, software).
Independent resources: physical location, power supply, network.
Different operators and organizations.
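The random-failure scenario above can be made concrete with a small calculation. The sketch below assumes n components that fail independently, each with probability p, and a service that survives as long as at most k of them fail; the symbols n, k, and p and the function name are assumptions of this example. Under these assumptions the probability in question is the binomial tail, the sum over i from k+1 to n of C(n, i) p^i (1-p)^(n-i).

```python
from math import comb


def prob_more_than_k_fail(n, k, p):
    """Probability that more than k of n independent components fail.

    Each component is assumed to fail independently with probability p,
    so the number of failures is binomially distributed.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    return sum(
        comb(n, i) * p**i * (1.0 - p) ** (n - i)
        for i in range(k + 1, n + 1)
    )


if __name__ == "__main__":
    # Five independent components, each failing with probability 0.01,
    # with up to two failures tolerated.
    print(prob_more_than_k_fail(5, 2, 0.01))
```

With a single component (n = 1, k = 0) the function simply returns p, which matches the observation that completely dependent failures give no advantage over one component.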

An interesting issue arises in the context of reliability (where the service remains working as long as components are functioning correctly and can communicate) when components can be fragmented into two or more segments that cannot communicate with each other (e.g. because of network failures). In such a situation the components in each segment would consider the components in the other segment failed, as it is not possible to decide whether the connection has failed or a component is intentionally not communicating. Now, if there are two segments each containing components, each of them could form the service on its own, leading to a "split mind" situation (commonly called split brain). If this is an issue, protocols and algorithms should be designed to avoid such a situation, or at least to detect it. The simplest way is to require that one can contact more than components.
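One common way to rule out two segments both forming the service is a strict-majority rule. The sketch below assumes that rule (a segment may act only if it can reach more than half of all components); the exact bound intended in the text may differ, and the function name is hypothetical.

```python
def may_form_service(reachable, total):
    """Return True if a segment that can reach `reachable` of `total`
    components (counting itself) holds a strict majority.

    Two disjoint segments can never both satisfy this condition, so at
    most one side of a network split keeps operating.
    """
    if not 0 <= reachable <= total:
        raise ValueError("reachable must be between 0 and total")
    return 2 * reachable > total
```

With ten components split into two halves of five, neither half passes the check, so the service pauses rather than diverging.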

2.2.5 Connection
Connection between computing devices must be dependable: it should be possible to send a message from one address to another without a failure. There are two ways of assessing system dependability: how the system acts on average (reliability), and what can happen in the worst case, especially if some entity is interested in bringing the system down (safety, security).

The Internet is the first and probably the only candidate for implementing the connection. It provides functionality to send messages between nodes (computing devices) that have IP addresses. The Internet can be viewed as a collection of interconnected local networks (segments) and consists of the following basic components, which are built one upon another:

Physical link layer: Physical devices providing local packet (message of limited size) sending functionality within one segment, which may have its own address system. (OSI physical and link layers; see footnote 5.)

Network layer: Provides means of sending packets between any IP addresses. Special computing devices called routers are used to join segments and to find a suitable path for each packet. The packet sender does not get trustworthy information about whether the packet has reached its destination. Nothing is done if packet sending fails. Packets can be received in an order different from the order of sending. (OSI network layer.)

Transport layer: Provides means of establishing a virtual connection between any two IP addresses, over which messages of any size can be sent in both directions. Messages are split into packets, which are sent separately. The receiver is supposed to acknowledge the receipt of each packet. Packets that do not reach the destination (i.e. are not acknowledged by the receiver) are resent. As a result, the sender has good evidence of whether the message has reached the target. The order of messages (in one direction) is also guaranteed to remain the same. (OSI transport layer.)

Domain Name System (DNS): IP addresses are numeric and hard to memorize. To solve this problem, each Internet node can be assigned one or more symbolic names, which are easier to remember. DNS provides a service of mapping symbolic names to IP addresses. The DNS service is implemented as a worldwide distributed hierarchical set of computers, each of which keeps a part of this information and is supposed to know where to get the rest.
5 OSI (Open Systems Interconnection) reference model: an ISO standard defining seven layers of any network implementation. In practice nobody follows it precisely, but it is a good reference model. See [OSI].


Applications: Networked applications making use of the previously described components.

On average it can be said that the reliability of the Internet is acceptable, but in the presence of an active adversary the Internet is by no means dependable. In what follows, the following context will be assumed (although most of the arguments apply to any situation): an application server (implemented by a limited number of nodes) to which multiple clients (nodes) connect. When attacking in this context, the following direct aims can be set:

Prevent nodes from communicating with each other (clients from connecting to the server).
Modify transmitted information.
Create an illusion of communication with a fake address (either DNS or IP).

The attacks themselves can be classified into the following groups:

Damage, modify, fake, or overload components of the infrastructure: links, routers, DNS servers, nodes (application server, client). Although the Internet was originally designed to resist a nuclear attack (so that there would be multiple independent paths without common links between any two nodes), at present it has become mostly hierarchical with a relatively low level of redundancy, both in segment connections and in DNS. As a result, for most node pairs it is possible to find an intermediate link or server (router or DNS server) whose removal would disconnect these nodes from one another. It should also be relatively simple to disconnect a specific node from most of the others by breaking a link or an intermediate server close enough to it. In addition, if an attacker has penetrated some link or intermediate server (or the attacker is in fact the operator of the component), he would be able to imitate communication with any IP address behind him. Finally, there exist effective methods to overload the communications infrastructure, known as Denial of Service (DoS) attacks, discussed later. It is important to point out that most components of the infrastructure each belong to one specific entity, which must be trusted in order to rely on the connection.

Damage, modify, or fake the data of local routing (within one segment), routers, or DNS. As a result packets would be sent to wrong destinations, or would not reach their targets at all.

Attack protocols of any layer (link, network, transport) in order to break, modify, or fake connections.

The following weaknesses of the infrastructure are usually exploited:

The ability to gain physical or virtual access to infrastructure components, possibly with the help of so-called social engineering, which targets human beings instead of overcoming technical or computational protection methods.
Bugs in the underlying software: operating system, networking components, protocol implementations, application software. Probably the most important of them is the so-called buffer overflow error, where the memory area right after a buffer is overwritten when data that is too long is written to the buffer without proper size checking.
Finally, the basic protocols (ARP, ICMP, RIP, IP, TCP, DNS, etc.) themselves have security vulnerabilities, which can be used against the aims of the infrastructure.

As an example, let us consider an attack that overloads infrastructure components, called Denial of Service (DoS). The general idea is to send more garbage information than a link, an intermediate server, or an application server can process, with the aim of consuming some kind of resource: bandwidth, computational power, or for instance memory. As a result legitimate users would not get through. Attacks can be (and usually are) initiated from multiple nodes, whose total resources are higher than those of the victim (in this case the attack is called Distributed DoS). Attacks can be application-specific in such a way that the attacker needs much less resource to generate the garbage than the victim needs to process it, for instance if the victim tries to decrypt the messages sent. As a result less resource is needed to bring the component down.

One could ask why the attacker would have more resources than the victim. On the one hand, state elections are an important event, and so a lot of resources could be spent to bring them down. On the other hand, the attacker is always one step ahead of the victim's operators: first they set up some resource, and then the attacker has a chance to gain enough resources to run a DoS attack. Finally, as experience shows, it is relatively simple to break into multiple Internet nodes and manipulate them externally. Protocol vulnerabilities can also be exploited to direct heavy traffic towards a specific node. Although there is no complete cure for this problem, partial solutions and guidelines exist, which make the attacker need much more resource to launch the attack; see [Ero00].

In general, the following countermeasures could be taken to prevent the described attacks:

Proper development process of protocols, software, and hardware.
Proper infrastructure surrounding Internet components, including a refined access control policy and attack prevention, detection, and response.
Systematic redundancy of connections (bandwidth, independent paths) and nodes (fail-over and load-balancing clusters).
Legal measures with heavy punishments for network disruption, which implies that it should be possible to trace attacks back to their originators.

The first two of the described items are directly related to the previously discussed problem of correctness of software and computing devices. In general, only global measures can raise the dependability of the Internet substantially.

At the same time the problem of identification can be completely solved within PKI with the help of such cryptographic constructions as Diffie-Hellman key exchange, message authentication codes, symmetric encryption, and so on. In short, provided that two connection endpoints have certificates and are able to establish a connection, it is possible to establish a communication channel that would provide:

Authenticity: It is clear which PKI identity sent the received data.
Integrity: Data cannot be modified between the connection endpoints.
Confidentiality: Data cannot be wiretapped between the connection endpoints.
Non-repudiation: Ability to prove that the received data was sent by the corresponding PKI identity.

There exist implementations of this approach for the network layer ([IPSec]), the transport layer ([TLS]), and the DNS system ([DNSEXT]). These protocols also help raise the quality of connection establishment, because non-repudiation helps prove that some component behaved incorrectly (e.g. provided incorrect information), but only post factum. Also, this measure will be effective only when most Internet nodes start following these protocols, which is not the case at the current moment.
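As a hedged sketch of the transport-layer variant, the code below prepares a TLS client configuration with Python's standard `ssl` module. The function name and the optional certificate and key file parameters are assumptions of this example; the paths would have to point at the client's own PKI credentials. Note that a TLS channel by itself gives authenticity, integrity, and confidentiality between the endpoints, while non-repudiation would additionally require signatures at the application level.

```python
import ssl


def make_client_context(cert_file=None, key_file=None):
    """Build a TLS client context that verifies the server's certificate.

    cert_file, key_file: optional paths (hypothetical here) to the client's
    own certificate chain and private key, used when the server also wants
    to authenticate the client.
    """
    # The default context requires a valid server certificate and checks
    # that it matches the host name the client asked for.
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    if cert_file is not None:
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context
```

A connected socket `sock` would then be protected with `context.wrap_socket(sock, server_hostname=name)`, where `name` is the DNS name the certificate is expected to carry.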

Besides that, the network was required to be synchronous: there should exist an upper bound on how long it takes from the moment a message is sent until the message is received. With some simplifications, one segment of a network (one Ethernet segment) can be considered synchronous. The main concern is that the speed of data transmission degrades as a function of network load, which opens the door to effective DoS attacks. This can be relieved to some extent by isolating the segment from the external world, but that does not help against internal adversaries. In the case of the Internet it is not possible to give a useful and sensible upper bound - the Internet is completely asynchronous, and a failing connection can be interpreted as one that takes an especially long time. Although it was assumed that the connection is synchronous, most of the constructions described before are not sensitive to that, although they can freeze until the messages are delivered. The most remarkable exception is the bulletin board, which is often used as a replacement for the connection itself.

The conclusion would be that the existing connection is not dependable enough, multiple single points of failure exist, and it can be broken on purpose quite easily. Only global measures can improve the situation substantially. At the same time, if a connection can be established, communication authenticity, integrity, and even privacy and non-repudiation can be achieved.

2.2.6 PKI
PKI is a framework that facilitates establishing correspondence between identities and public keys, the basic functionality of which can be usually split into:

Creation and dissemination of identity certificates.
Dissemination of certificate revocation information.

Roughly, PKI consists of three parts:

Standards defining data formats and algorithms.
Software supporting these standards (both for authorities and clients).
Organizations functioning as certification authorities.

Probably the most popular PKI standard is X.509v3 ([X.509]), which defines formats for certificates and CRLs. There exists standard software for managing a certification authority, and corresponding client software, which provides the following functionality (the list is not complete):

Certificate retrieval, exchange.
Usage of CRLs, which can be automatically retrieved from certificate distribution points (CDPs).
Certificate verification.
Encryption and decryption.
Signing and signature verification.

A hierarchical (tree) topology of CAs is widely adopted. One example of both certification authority and client software is the Microsoft Windows 2000 platform, which has built-in support for PKI. There also exist organizations that act as international certification authorities. They typically issue personal certificates of different levels of trustworthiness (implied by different quality of identity verification), DNS name certificates (given to the owner of a DNS name), and code publisher certificates, which can be used to sign a code distributable (see later). One example of such an authority is [VeriSign]. Still, multiple problems exist:

There is no universally accepted certification authority.
There is no uniform, convenient, and accepted naming convention that would enable assigning a unique identifier to any person or organization in the world.
Existing standards are rather loose (many features are optional), and as a result there exist interoperability problems between software from different vendors and also between different CAs, including the following: not all vendors use a hierarchical topology of CAs; support for certificate revocation (especially automatic revocation through distribution points) is implemented in many different incompatible ways.

There is the problem of how the certificates of root CAs reach client computers. Currently they are pre-installed with the operating system, which requires additional trust in the operating system vendor.

The conclusion would be that the existing PKI is rather underdeveloped and further progress is needed.

PKI requires that the owners of private keys keep them secret. For most personal computer users this means keeping them on the file system, (hopefully) protected by file system access control and a pass phrase. Also, the private key is loaded into random access memory when it is used. All this works provided that the computing device's untappability requirement is satisfied. A much better solution is to keep the private key on a specialized tamperproof device (smartcard), which would perform cryptographic operations on its own without revealing the private key to the computer. As already mentioned, smartcards have their own security concerns, and smartcard readers are not widely adopted yet.

In the context of e-voting another problem arises: why would some state (the organizer of the elections) trust a root CA that is located in South Africa or the USA? A solution would be to have a local intermediate certification authority, which would function within that state. Only certificates that are under that authority would be allowed to be involved in the elections, and at the same time interoperability with the external world would be ensured.

Besides conventional PKI functionality, an infrastructure for code deployment to computing devices is desirable:

A software distributable may carry a signature from the software publisher and also from external parties, who certify that this software is correct with respect to a specified purpose. Certifiers should be able to assign different levels of trust to software.
The operating system should provide ways to define a policy on whether to install (run) a code distributable based on its signatures. The simplest way is to ask the user every time some distributable is installed (run), which would become annoying very fast (especially when code is downloaded automatically from the Internet) and would lead to a situation where the user always says yes without thinking. It is a separate research problem what such a policy would look like and who would create and enter it (it is naive to expect end-users to do that correctly).

It is reasonable to expect such a framework to be (an integral) part of the operating system. Contemporary code signing is rather underdeveloped: for instance, Microsoft's Authenticode (the code signing framework on the Windows platform) enables a software publisher to sign the code and assert in this way that the software is safe (whatever that means), but there are no means to state correctness with respect to some purpose (at best one can add a free-form string), and there can be only one signature. In short, the publisher's identity is bound to the code without any direct legal implications. Microsoft's forthcoming .NET platform ([.NET], now in beta) promises to provide a more refined policy engine on top of Authenticode. Naturally, other vendors have implemented similar constructions in their platforms, for instance in Java 2. Still, they all appear to lack support for multiple signatures and for expressing trust with respect to a specified purpose - i.e. they do not support software certification.

2.2.7 Time-stamping
The need for time-stamping was identified at least years ago. For a long time there were either relatively inefficient solutions (for relative time-stamping) or solutions that required a common, unconditionally trusted third party (for absolute time-stamping). Quite recently, in 1998, a practical and efficient solution for relative time-stamping was developed (see [BLLV98]). The solution is still single-authority, but it is not possible to cheat undetectably, although the authority can ignore someone based on his identity. Consequently, the dependability and scalability of the service might not be sufficient for every application. Processing a time certificate in general requires communication with the TSA. Implementing a time-stamping service that works with respect to threshold trust is a topic of ongoing research. A compendium of information on this subject can be found at the home page of the time-stamping project Cuculus [Cuculus]. As time-stamping is relevant to e-voting mostly in the context of the bulletin board, I shall return to this topic when discussing its design options.

2.2.8 Summary
The existing framework of hardware, software, network, and PKI has grown in an evolutionary manner during the last - years. It is generally agreed that the contemporary infrastructure is not dependable enough, although at the same time it is considered good enough to access Internet banks and shops. Many of the existing problems can be circumvented if the framework is rebuilt carefully from scratch with specific requirements in mind. Special attention should be paid to the basic constituents of the framework and to the people or organizations who are responsible for them:

Hardware manufacturing and deployment.
Software platform development and deployment (one can use code-signing only after one has installed software that supports code-signing).
Network components.
Certification authorities.

Most of the information in this section has been taken from the Internet, and most of the resources are not self-contained enough to be cited, so I cite only the single most relevant one: [Rub00].

2.3 Design for Bulletin Board


In this section I will describe different ideas and options for implementing the bulletin board (see footnote 6). I recall that for the purpose of e-voting it is enough to have a single-writer, multiple-reader bulletin board for each actor, with reasonable latency, repeatable read, and absolute time-stamping. Monotonicity is desirable, but is not a requirement.

One option is to implement the bulletin board with one authority. In the current context, authority means a set of tightly coupled server computers located at one physical location and sharing a common PKI identity. Such a construction maps directly to the notion of an actor in the previously described model of the real world. Different solutions might require different levels of trust in the authority. The simplest one (in the flavour of conventional information systems) would require trust to be unconditional. In principle one could think of a solution where the authority would have to prove the correctness of its actions, so that a failure would be detectable. As already mentioned, time-stamping and the bulletin board are very similar services. My intuition suggests that one round of the time-stamping construction [BLLV98] could be adapted quite easily to the needs of the bulletin board. Of course the previously mentioned problems still remain: the possibility that the authority will ignore someone, and possibly unacceptable dependability and scalability. If a solution for time-stamping is developed that overcomes these problems, it could probably be adapted quite easily to the needs of the bulletin board. All this can be a topic of further research.

Another option is to implement the bulletin board with multiple distributed authorities (in the sense of the previous paragraph) with respect to threshold trust. It turns out that in such a setting, both from the theoretical and the practical point of view, the approach differs quite a lot depending on whether the environment is synchronous or asynchronous.
In a synchronous environment all actors have synchronous clocks (as defined before) and the connection between them is also synchronous. An asynchronous environment does not satisfy one of these properties (usually the second). In practice, a synchronous environment can be approximated by a well-supervised, reliable communication channel isolated from the external world (e.g. an Ethernet segment). On the other hand, the Internet is clearly asynchronous. It is worth pointing out that the single-authority solution does not depend much on whether the environment is synchronous or not: the user either succeeds in establishing a connection to the service or does not.

Theoretical (and probably also practical) activity has lasted in this field for at least years. Good overviews of theoretical meta facts can be found in [Kest95] and [Fi00], whereas [Fi85] is the paper where some of them were first proven. Relevant keywords are distributed consensus, Byzantine agreement, reliable broadcast. Quite a lot of practical work has been done by Michael Reiter (see [Reiter]). In the rest of this section I will first consider different implementation options for the synchronous and asynchronous cases and then try to formulate practical solutions to the bulletin board problem from the viewpoint of organizing e-voting.

6 This section should be viewed as a collection of rather raw knowledge and ideas that may need further development.

2.3.1 Some Simple Ideas


In this subsection I present some simple ideas that will be of use later. The first idea is about how one can create a single-writer bulletin board out of a single-writer reliable store. Provided that there are means for reliably storing messages without enforcing ordering, the following could be done to ensure ordering:

Use any form of time-stamping to prove the ordering of the messages.
When sending the next message, incorporate into it a hash of the previous message. This requires the sender to remember (or retrieve) the last message sent so far.
If message ordering is up to the sender (the only entity interested in ordering the messages is the sender himself), the message could just incorporate a sequence number (which requires maintaining a counter) or even the current value of the sender's clock.
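The second option in the list above, chaining each message to a hash of its predecessor, can be sketched as follows. This is a minimal illustration under stated assumptions: the message layout (a dictionary with `payload` and `prev` fields), the JSON encoding, the use of SHA-256, and all function names are choices made for this example, and signatures are omitted.

```python
import hashlib
import json


def message_digest(message):
    """Hash a message in a canonical (key-sorted) JSON encoding."""
    encoded = json.dumps(message, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def new_message(payload, previous=None):
    """Create a message that commits to the previous one by its hash."""
    prev = message_digest(previous) if previous is not None else None
    return {"payload": payload, "prev": prev}


def restore_order(stored):
    """Recover the writer's order from an unordered reliable store.

    Raises ValueError if a message is missing or the chain forks.
    """
    by_prev = {message["prev"]: message for message in stored}
    ordered = []
    current = by_prev.get(None)  # the first message has no predecessor
    while current is not None:
        ordered.append(current)
        current = by_prev.get(message_digest(current))
    if len(ordered) != len(stored):
        raise ValueError("chain is broken or forked")
    return ordered
```

A reader who fetches the messages in arbitrary order can thus both reconstruct the sequence and notice that something is missing, without the store itself enforcing any ordering.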

Based on this it is sensible to concentrate on implementing a reliable store, out of which a single-writer bulletin board can be created.

The second idea is related to absolute time-stamping. Assume that we have numbers in ascending order, out of which at most are assumed to be incorrect, but we do not know which. In this case belongs to the interval starting with the smallest and ending with the biggest correct number in the sequence. The reasoning is quite simple: if is correct, then the implication is trivial; if is incorrect, then it is surrounded in the sequence by correct numbers. This finishes the proof. This can be applied in the following situations:

Accessing timeservers to synchronize the clock. If servers are accessed, attacks of at most servers can be resisted.

Assume that there are time-stamping servers which append their current time to every message sent to them, sign it, and send it back. Now if one accesses servers, one can get a trustworthy absolute time certificate as long as at most servers are malicious (or have wrong clocks).
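The idea of picking a value that is guaranteed to lie between correct readings can be sketched with a median. The example assumes that at least 2k+1 time readings are collected and that at most k of them are wrong; these symbols, the bound, and the function name are assumptions of this sketch. If the median itself is wrong, at most k-1 of the other readings are wrong, so at least one correct reading lies on each side of it.

```python
def robust_time(readings, max_faulty):
    """Return a reading that lies between the smallest and the largest
    correct reading, given that at most `max_faulty` readings are wrong.

    readings: list of numeric clock values, one per server.
    Requires at least 2 * max_faulty + 1 readings.
    """
    if max_faulty < 0:
        raise ValueError("max_faulty must not be negative")
    if len(readings) < 2 * max_faulty + 1:
        raise ValueError("not enough readings to tolerate that many faults")
    ordered = sorted(readings)
    # The middle element has at least max_faulty readings on either side.
    return ordered[len(ordered) // 2]
```

For instance, with readings 100, 101, 102 from honest servers and 5 and 9999 from two faulty ones, the function returns 101.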

2.3.2 Synchronous Environment


In a synchronous environment, from the theoretical point of view, a reliable store is possible for any . Correct information will be present at all correct authorities. At the same time the construction presented when proving this fact in [Fi00] is completely impractical. My idea for implementing a single-writer bulletin board would be as follows:

Implement a reliable write-once, multiple-read register.
Maintain an array of such registers.
In order to send a new message, find out the highest index of a filled register and then write to register a new signed message containing an index and a hash of .
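The three steps above can be sketched with an in-memory model. This is only an illustration under stated assumptions: a single process stands in for the distributed register, signatures are omitted, the record layout (`index|previous hash|payload`) is invented for the example, and the hash is assumed to cover the content of the previous register.

```python
import hashlib


class WriteOnceRegister:
    """A register that accepts exactly one write and any number of reads."""

    def __init__(self):
        self._value = None

    def write(self, value):
        if self._value is not None:
            raise ValueError("register already written")
        self._value = value

    def read(self):
        return self._value


class RegisterBoard:
    """A single-writer board built from an array of write-once registers."""

    def __init__(self, size):
        self.registers = [WriteOnceRegister() for _ in range(size)]

    def highest_filled(self):
        """Index of the last filled register, or -1 if the board is empty."""
        index = -1
        for i, register in enumerate(self.registers):
            if register.read() is None:
                break
            index = i
        return index

    def append(self, payload):
        """Write payload (bytes) into the next free register; return its index."""
        last = self.highest_filled()
        if last + 1 >= len(self.registers):
            raise IndexError("board is full")
        previous = self.registers[last].read() if last >= 0 else b""
        prev_hash = hashlib.sha256(previous).hexdigest()
        record = f"{last + 1}|{prev_hash}|".encode("ascii") + payload
        self.registers[last + 1].write(record)
        return last + 1
```

Because each record names its own index and commits to its predecessor, a reader can check that the registers form one gap-free sequence.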

Ideas for implementing such a register by multiple authorities with threshold trust can be found in [MR98], section 6. In order to prove time-outs (i.e. that someone did not write a message during some period of time), time-stamping should be employed. The previously described out of approach could be applied here. Note that bare time-stamping by the writer does not solve the problem: the writer could first ask for a time certificate, wait for an unlimited period of time, and only then send the message. This implies that time-stamping must be performed by the authorities implementing the service. A register implemented with such an approach has the following properties:

In order to write one message, this message (or at least its digest) must be transmitted times over the network. This limits the throughput of the system quite a lot.
Quite a large cryptographic overhead is needed to perform authenticated communication.
The following limitation holds , which implies . For instance if , (at most out of 10 authorities can fail or be malicious).

These limitations make this specific construction quite useless. I did not have time to investigate further and so cannot tell whether more practical constructions exist for synchronous networks.

2.3.3 Asynchronous Environment


From the theoretical point of view, a bulletin board implemented by multiple authorities with threshold trust cannot in principle be implemented deterministically if even one authority can crash (let alone behave maliciously). The reason is that it is not possible to decide whether some authority is intentionally unresponsive or the connection is just slow. See [Fi00] for more. For this reason only partial solutions to this problem exist.

One option is to use protocols from the synchronous world, but now time-outs cannot be used, as there are no upper bounds on transmission duration. As a result these protocols can freeze for a period of time of indefinite length. As a solution to these problems the Rampart toolkit is suggested in [Rei95], [Rei96], and [Rei94]. The basis of the toolkit is the so-called group membership protocol, which allows removing authorities from and adding them to the group if more than of the group members decide so. This allows removing unresponsive, failed, or malicious group members without distinguishing between them. An atomic multicast protocol is executed on top of it. After reading the previously mentioned papers, I have made the following observations:

In order to write one message, this message (or at least its digest) must be transmitted times over the network. This limits the throughput of the system quite a lot.
Quite a large cryptographic overhead is needed to perform authenticated communication.
The system can survive only the fragmentation of less than group members. The system freezes if, for instance, the group is split by a network failure into two equal parts.
The group of correct authorities can melt down to any size, for instance less than .
As a group membership change is a relatively expensive operation, an attack could be mounted where an authority would periodically pretend to be unresponsive, get thrown out of the group, then become responsive again, get invited into the group, and so on.
In order to raise efficiency, messages are broadcast by one selected group member (for instance the one with the smallest ID), and if he fails, he is voted out. For this reason it makes sense to concentrate attacks on the group member with the smallest index.
Quite many issues are left open or need further adaptation to the needs of the bulletin board.

For these reasons this construction seems to be rather impractical for use in the wild Internet, but it might be of interest on a more protected and synchronous local network. I have to acknowledge that these papers are rather complex and many details are left open, so there is always a chance that I just did not quite understand them.

2.3.4 Practical Solutions


The following applications of the bulletin board can be identified with respect to e-voting:

Setup phase of a multiple-authority EVS and also threshold key generation (I will refer to them as the setup phase further).
Collecting voters' ballots (I will call it the collecting phase further).
Result computation in a multiple-authority EVS (I will call it the computing phase further).

Despite the previously described semi-theoretical solutions, it appears to me that a practical hack is more appropriate in the current situation.

The setup phase requires a high quality of bulletin board service (atomicity), but it can be restarted without any loss and its information throughput is quite low. This phase could be performed on a synchronous network. For this reason I would suggest implementing the bulletin board with a single authority, which broadcasts signed messages to all participants. In the end each participant signs the transcript of messages delivered to it, and the phase is considered complete if no participant objects that it did not receive all messages and the signed transcripts match. The setup phase outcome should be made available to voters with the help of administrative methods. Another option is to try to adapt Rampart, although it is questionable whether the effort would be justified.

The collecting phase must be performed on an asynchronous network, and the amounts of data transmitted are measured in gigabytes. A voter should be sure that if he gets a confirmation that the operation succeeded, then his ballot will be counted. At the same time he might not always get this confirmation, because the network connection might go down exactly at the moment when the bulletin board sends the last confirmation message to the voter. Ordering of sent ballots is up to the voter. This service cannot be restarted: it must function without failures. While collecting votes, it is sufficient for the bulletin board to be write-only. For this reason I would suggest having independent authorities, which would collect the sent ballots. In order to submit a ballot the voter would have to contact servers and send his ballot there. In the end the ballot lists would be signed by the authorities, written to durable media, and physically transported to the place where the computing phase takes place. Ordering of a voter's sent ballots could be ensured by time-stamping or even by including the voter's computer time in the ballot (probably not such a good idea).

The computing phase does not actually require the bulletin board. All that is needed is to collect the self-verifying output of all authorities. This can be ensured with administrative measures.
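The completion check suggested for the setup phase (every participant's transcript must match what the single authority broadcast) can be illustrated with a minimal sketch. It rests on assumptions that are not part of the text: messages are byte strings, SHA-256 digests stand in for the signed transcripts, and the names transcript_digest and setup_phase_complete are hypothetical. Signature generation and verification are left out.

```python
import hashlib


def transcript_digest(messages):
    """Hash an ordered list of byte-string messages into one digest."""
    h = hashlib.sha256()
    for msg in messages:
        # Length-prefix each message so that message boundaries are unambiguous.
        h.update(len(msg).to_bytes(8, "big"))
        h.update(msg)
    return h.hexdigest()


def setup_phase_complete(broadcast_log, received_by_participant):
    """Return True if every participant saw exactly the broadcast transcript.

    broadcast_log: messages in the order the single authority sent them.
    received_by_participant: dict mapping a participant name to the list of
    messages that participant reports as delivered (assumed data layout).
    """
    expected = transcript_digest(broadcast_log)
    return all(
        transcript_digest(msgs) == expected
        for msgs in received_by_participant.values()
    )
```

If the check fails, the phase is simply restarted, which the text notes is possible without loss.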

2.4 Design Pattern for E-voting System


In this section a design pattern for an e-voting system is outlined. It is independent of EVS type: both single and multiple authority EVSes can be plugged into this pattern, as will be demonstrated in later sections. To begin with, let us assume that the infrastructure is already in place. The system architecture is outlined in Figure 2.4. The following participants are depicted there:

Organizer: An organization responsible for conducting elections. It is identified by its public key, which has been threshold generated and registered within the PKI, whereas the private key shares are kept separately and securely and are used only when needed.
Voter: Any person willing to participate in elections. A voter is identified by his public key certificate.
Observer: Any entity having a public key certificate, which is invited to observe and verify the voting process to ensure its integrity.
Consumer: Any entity interested in learning the election result.

The following components are presented in the figure:

Election Information Management: Subsystem that helps produce voter lists, ballot types, and a mapping between them.
Voting BB: Bulletin board servers for collecting voters' ballots. Each bulletin board server has its own public key certificate, by which it is identified.
EVS: An instance of some electronic voting scheme.
Entry Server: A server with a well-known name, providing voters with everything they need to participate in an election:

Figure 2.4: Design Pattern Architecture Outline. (Elements shown: Election Information Management, Entry Server, Voter(s), Organizer, Voting BB, Revocation Information Management, EVS, Observer(s), Consumer(s), Archive.)

Voting client software to use.
Election information.
Voting bulletin boards' addresses and public key certificates.
Observers' public key certificates.
EVS information generated at the setup phase.

Revocation Information Management: Subsystem that helps create lists of voter identities that should not be counted.
Archive: A permanent store where all election information is saved after the election is over.

The thick line in the figure signifies the boundary between the managed and the external world. The process according to which the system is supposed to act is depicted in Figure 2.5. The following steps are present there:

Selecting Observers: The process starts with selecting observers for the specific elections.
Preparing Election Information: Election information is prepared. Once the information is ready, it is signed by the organizer.
Preparing Bulletin Boards: Multiple bulletin board servers are set up, possibly at different geographic locations. Some of the observers are supposed to observe these servers. The servers' public key certificates are collected.
EVS Setup: The electronic voting scheme setup phase is executed, resulting in some public information that must be made available to voters.
Populating Entry Server: The entry server is populated with all relevant data. The integrity of all information is assured by the organizer's signatures.
Initiating Voting: After that voting can be initiated by activating the bulletin board servers.
Voters Voting: Then for some period of time voters are given an opportunity to vote. This is done as follows:

The voter contacts the entry server and downloads the voting software.
The voting software is deployed on the voter's computer. Signatures on the software distributable are verified.

Figure 2.5: Main Process of the Design Pattern. (Steps shown: Selecting Observer(s), Preparing Election Information, Preparing BB(s), EVS Setup, Populating Entry Server, Initiating Voting, Voters Voting, Stopping Voting, Preparing Voter Revocation Information, Computing Election Result, Observers Approving Election Result, Archiving Election Data, EVS Shutdown.)

The voting software contacts the entry server again and checks that the voter is eligible to vote, then retrieves the voter's ballot type, the bulletin board addresses and certificates, and the EVS setup information.
After that the ballot is presented to the voter, who can choose one option. Finally he can cast his ballot (or cancel the activity).
The voting software forms the voter's ballot according to the EVS algorithms and setup information. After that the ballot is signed with the voter's private key.
Finally the bulletin board servers are contacted and the ballot is submitted to them. There should be a lower limit on how many servers must be successfully contacted. The voter is notified of success or failure.

Stopping Voting: Finally voting is stopped by closing the bulletin board servers.
Preparing Voter Revocation Information: After the election is over, voter revocation information is prepared. Once the information is ready, it is signed by the organizer.
Computing Election Result: The election result is computed with the help of the EVS. This will be explained later in a separate subsection.
Observers Approving Election Result: Observers are given an opportunity to verify the correctness of the EVS result. After that they are given a chance to sign the result to express that they are satisfied with everything.
Archiving Election Data: Once the EVS result is approved by a sufficient number of observers, the election data can be stored in the permanent archive, where it is accessible to the consumers.
EVS Shutdown: Finally, the electronic voting scheme is shut down. In particular, this means destroying secret information generated during the EVS setup phase that may compromise election security if it leaks. It might be necessary to postpone this phase for a reasonable amount of time.
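The last sub-step of Voters Voting, submitting the signed ballot to several bulletin board servers with a lower limit on successful contacts, might look roughly like the following sketch. Everything concrete here is an assumption made for illustration: each server is modelled as a hypothetical callable that returns True when it confirms the ballot, an unreachable server raises ConnectionError, and min_confirmations is the lower limit chosen by the organizer.

```python
def submit_ballot(signed_ballot, servers, min_confirmations):
    """Send a signed ballot to every bulletin board server.

    servers: iterable of callables, each taking the ballot bytes and
    returning True if that server confirmed the write (assumed interface).
    Returns True if at least min_confirmations servers confirmed.
    """
    confirmations = 0
    for send in servers:
        try:
            if send(signed_ballot):
                confirmations += 1
        except ConnectionError:
            # An unreachable server is not fatal; the remaining
            # servers are still contacted.
            continue
    return confirmations >= min_confirmations
```

The return value is what the voting software would report to the voter as success or failure. As noted in section 2.3.4, a lost confirmation message can make a stored ballot look like a failure, so a reported failure does not prove that no server recorded the ballot.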

2.4.1 Computing Result


The process of computing the election result is depicted in Figure 2.6.

Figure 2.6: Result Computation in the Design Pattern. (Elements shown: Entry Server, List of Revoked Voters, Voting BB(s), Entry Server Contents, EVS, Raw Ballot List(s), EVS Output, Observer(s), Signature(s), Organizer Signing and Timestamping, Full Report, Archive.)

Figure 2.7: Election Data in the Design Pattern. (Elements shown: Ballot(s); Entry Server Contents (organizer signed); Raw Ballot List(s) (BB server and organizer signed); EVS Output; List of Revoked Voters (organizer signed); Observer Signature(s); Full Report (organizer signed and timestamped).)

At first the EVS takes as input the raw ballot lists from the bulletin board servers (which are additionally signed by the organizer), the list of revoked voters, and also the contents of the entry server (why this is needed will become apparent later), and produces the election result. After that the observers verify and sign the EVS output. Now all data is signed by the organizer once more and finally time-stamped. This forms the complete election report, which can be stored in the archive.

Figure 2.7 depicts components of election data and their interrelationships (here a dashed arrow from A to B means that A includes a hash of B):

The EVS output is supposed to contain hashes of the raw ballot lists, the list of revoked voters, and the entry server contents that were used in the computation.
The voter's client software is supposed to include in the ballot hashes of the parts of the election data that were used.
Observers' signatures include a hash of the EVS output by definition.
Finally, the full report contains hashes of the observers' signatures.
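The hash links listed above can be sketched as follows for the EVS output. This is only an illustration under stated assumptions: election data is represented as JSON-serializable Python values, SHA-256 is used as the hash function (the text does not name one), and the helper names digest, make_evs_output, and evs_output_matches are hypothetical. Signatures and time-stamps are omitted.

```python
import hashlib
import json


def digest(obj):
    """Hash a JSON-serializable value in a canonical (sorted-key) encoding."""
    data = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def make_evs_output(result, raw_ballot_lists, revoked_voters, entry_server_contents):
    """Build an EVS output record that commits to all of its inputs."""
    return {
        "result": result,
        "raw_ballot_list_hashes": [digest(lst) for lst in raw_ballot_lists],
        "revoked_voters_hash": digest(revoked_voters),
        "entry_server_hash": digest(entry_server_contents),
    }


def evs_output_matches(evs_output, raw_ballot_lists, revoked_voters, entry_server_contents):
    """Check that the EVS output was computed from exactly these inputs."""
    return (
        evs_output["raw_ballot_list_hashes"] == [digest(lst) for lst in raw_ballot_lists]
        and evs_output["revoked_voters_hash"] == digest(revoked_voters)
        and evs_output["entry_server_hash"] == digest(entry_server_contents)
    )
```

An observer would run a check like evs_output_matches before signing, and the full report would in turn carry digests of the observers' signatures, so that changing any input afterwards breaks the chain.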

Such an approach helps glue the different parts of the election data together: none of the components can be modified after the full report is complete. In addition, such a structure commits the organizer to some decisions: as the bulletin board servers' and observers' certificates are present at the entry server, it can be determined at once which bulletin board server outputs were used and which observers have signed the result and which have not. This approach also guarantees that all ballots were formed from the same entry server contents. Finally, after time-stamping, the organizer's public key can be revoked without any harm.

Figure 2.8: Meta Process of the Design Pattern. (Steps shown: Setting up Infrastructure, Organizer's Key Setup, Conducting Elections, Maintaining Archive, Organizer's Key Shutdown.)

2.4.2 Meta Process


Previously we assumed that the e-voting system infrastructure was already in place. In fact it must be supported by a process of its own, the meta process, which is depicted in Figure 2.8:

The system's lifetime starts with setting up the infrastructure. After that comes an iterative process of threshold generation of the organizer's public key, registering it at the PKI, conducting some number of elections, and finally revoking the organizer's public key and destroying the private key shares. Threshold key generation should be performed using the bulletin board constructions described in section 2.3. It is sensible to keep the private key shares on smartcards. In parallel with that, the data archive should be maintained.

2.4.3 Design for Single Authority EVS


In this subsection I will outline a design for a single authority EVS, which can be plugged into the e-voting system design pattern. It is sensible to generate the single authority EVS's public key in a threshold manner, whereas the private key shares are to be kept separately and securely, for instance on smartcards. The bulletin board constructions described in section 2.3 should be used. Computing the result in such a setting is depicted in Figure 2.9:

The ballot list creator takes as input the raw ballot lists, the list of revoked voters, and the entry server contents and generates the list of ballots that should be counted. Signatures of ballots must be verified and removed. As a result the eligible ballot list does not bear any direct links to voters' identities. This operation is deterministic and could be duplicated on multiple computers to ensure that the list is formed correctly. Finally this list should be signed by the organizer.
The result computer takes as input the eligible ballot list and the private key shares. Using them, it is possible to decrypt each ballot separately and compute the election result. In the end the election result is signed using the private key shares. This operation is the weakest point of security: single decrypted ballots should not leak outside the computer. This implies that the computer should be as secure as possible. Also, it should not have any other means of communication with the external world (hard disk, network) besides the one through which the ballot list is entered and the computed result is returned. Even better, this computer could be implemented as a specialized device. This operation is deterministic, so it could also be duplicated on many computers to ensure that the result is computed correctly, although computation correctness cannot be verified directly. Signature generation is not deterministic, so signatures should not be compared, but they can be verified directly.

Figure 2.9: Result Computation in Single Authority EVS. (Elements shown: List of Revoked Voters, Entry Server Contents, Ballot List Creator, Raw Ballot List(s), Eligible Ballot List, Result Computer, EVS Output.)

Figure 2.10: Election Data in Single Authority EVS. (Elements shown: Ballot(s); Entry Server Contents (organizer signed); Raw Ballot List(s) (server and organizer signed); Eligible Ballot List (organizer signed); List of Revoked Voters (organizer signed); EVS Output (signed with EVS key).)

Figure 2.10 presents the data structures involved and their relationships. It is very similar to the one in subsection 2.4.1 and does not need further explanation.
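The ballot list creator described in this subsection is a deterministic filter, which the following minimal sketch tries to illustrate. Several details are assumptions and not part of the design itself: each raw entry is a dict with the hypothetical keys voter, timestamp, and ballot; the set of eligible voters is taken to be derivable from the entry server contents; verify_signature is a caller-supplied placeholder for real signature verification; and when a voter has several valid ballots, the one with the latest time-stamp is kept (a policy the text does not fix).

```python
def create_eligible_ballot_list(raw_ballot_lists, eligible_voters,
                                revoked_voters, verify_signature):
    """Merge raw ballot lists into a list of ballots without voter identities.

    raw_ballot_lists: one list of entries per bulletin board server.
    eligible_voters, revoked_voters: sets of voter identifiers.
    verify_signature: callable returning True for a validly signed entry
    (placeholder; no real cryptography is performed here).
    """
    latest = {}
    for ballot_list in raw_ballot_lists:
        for entry in ballot_list:
            voter = entry["voter"]
            if voter not in eligible_voters or voter in revoked_voters:
                continue
            if not verify_signature(entry):
                continue
            # Assumed policy: keep the latest ballot per voter; ties are
            # broken by ballot content so the result stays deterministic.
            key = (entry["timestamp"], entry["ballot"])
            if voter not in latest or key > latest[voter]:
                latest[voter] = key
    # Sorting removes any ordering that could be linked back to voters.
    return sorted(ballot for _timestamp, ballot in latest.values())
```

Because the output does not depend on the order in which server lists or entries are processed, the computation can be repeated on several computers and the results compared, as the text suggests.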

2.4.4 Design for Multiple Authority EVS


In this subsection I will outline a design for a multiple authority EVS, which can be plugged into the e-voting system design pattern. Multiple authority setup starts with defining the authorities and assigning them public keys. As these keys are needed only for internal communication, they could be certified by the organizer and not by a real CA. After that the EVS setup phase can be executed. The bulletin board constructions described in section 2.3 should be used. It is sensible to keep the private information generated during the setup on smartcards. The process of computing the result is depicted in Figure 2.11:

Each authority takes as input the raw ballot lists, the list of revoked voters, and the entry server contents and produces a signed authority's output.
After that the election result can be computed based on the authorities' outputs.

Figure 2.12 again presents the data structures involved and their relationships. It is very similar to the one in subsection 2.4.1 and does not need further explanation.

Figure 2.11: Result Computation in Multiple Authority EVS. (Elements shown: List of Revoked Voters, Entry Server Contents, Authorities, Raw Ballot List(s), Authority's Outputs, Result Computer, EVS Output.)

Figure 2.12: Election Data in Multiple Authority EVS. (Elements shown: Entry Server Contents (organizer signed); Ballot(s); Raw Ballot List(s) (server and organizer signed); Authority's Output(s) (authority signed); List of Revoked Voters (organizer signed); EVS Output (organizer signed).)

2.4.5 Conclusions
The main conclusion of this section is that it is possible to build a generic e-voting service design pattern into which both single authority and multiple authority electronic voting schemes can be plugged in a similar way. The advantage of a single authority EVS is its speed; the disadvantages are its lack of verifiability and the presence of a single point of failure, which must be secured very carefully. The advantage of a multiple authority EVS is its complete verifiability and lack of single points of failure: everything is done with respect to threshold trust. The disadvantage is its relative inefficiency. At the same time, in many contexts both approaches probably have comparable and sufficient security. A multiple authority EVS becomes more secure with a much bigger investment of resources.


Chapter 3

Summary
In this work the following tasks have been accomplished:

Formulated detailed requirements for an e-voting system.
Described the theoretical basis for e-voting from the viewpoint of software engineering.
Analysed the feasibility of a generic security framework upon which an e-voting system could be built.
Investigated options for designing a bulletin board service.
Proposed a general design pattern for implementing an e-voting system.
Described how theoretical single and multiple authority electronic voting schemes fit into this pattern.

The main conclusions would be:

The only serious security threat is imposed by the existing framework of personal computers and the Internet. Unfortunately it is enough to prevent implementing secure e-voting in the near future.
Although theoretically a single authority electronic voting scheme is much weaker than its multiple authority counterpart, in practice they are of comparable quality, with different advantages and disadvantages.

Further directions of work could be:

Increasing the reliability of the Internet and the security of personal computing devices.
Further development of PKI and code signing.
Further research in the field of time-stamping and bulletin board construction.
Refinement of the e-voting system design. Currently the design pattern is very high-level and lots of technical details are missing. Also the whole process of e-voting system maintenance should be much more refined.
Evaluating the precise financial and computational resources needed to create and maintain an e-voting system.


Elektrooniliste valimiste kavandamine (Designing Electronic Voting)

Oleg Mürk

Resümee (Summary)

Elections are a very important social activity in every democratic state. Recently the idea of conducting elections electronically, figuratively speaking from a home computer and over the Internet, has become very popular around the world. The main motive is the convenience of voting, which could raise citizens' turnout, which is typically rather low. In the longer term it could also help disabled and elderly people. In addition, one may hope that this way of conducting elections will over time become cheaper than conventional elections. Nevertheless, at present e-voting is viewed mainly as a convenient additional option alongside conventional elections.

Preparing for e-voting broadly consists of three aspects: theoretical, technical, and political. The aim of the theoretical activity is to create an abstraction of the real world and to find within it the basic constructions from which a real e-voting system could later be built. Building that system would itself belong to the technical activity. Finally, the system needs to be introduced into the existing democratic system, which would be the politicians' playground.

This bachelor's thesis can be read as a continuation of a semester work that dealt with the theoretical aspects of e-voting, and its subject is the technical side of implementing e-voting. The problem is approached from the viewpoint of software development. First the existing situation is analysed and requirements for the system are formulated. This is followed by an overview of the theoretical foundations. An important precondition for implementing e-voting is a secure environment, consisting of secure computers, a reliable network connection, and some other components. This work studies its existence and attainability and shows that the present situation leaves much to be desired. Another important component in implementing e-voting is the so-called bulletin board, which is the basis for secure communication between computers. The work studies the possibilities for implementing such a construction efficiently and points out its connections with time-stamping. The conclusion is reached that very good solutions do not yet exist and that the topic needs further research. Finally, a possible architecture of an e-voting system is sketched and it is shown how different theoretical constructions could be used in it.

The main conclusion of the work is that the main technical obstacle to implementing e-voting is the insecurity and unreliability of the Internet and of existing computers.


Bibliography
[Ben87] J. Benaloh. Verifiable Secret-Ballot Elections. Ph.D. Thesis presented at Yale University, New Haven, CT (Dec. 1987). (Available as TR-561, Yale University, Department of Computer Science, New Haven, CT (Sep. 1987).)
[BLLV98] Ahto Buldas, Peeter Laud, Helger Lipmaa, Jan Villemson. Time-Stamping with Binary Linking Schemes. In Hugo Krawczyk, editor, Advances in Cryptology - CRYPTO '98, volume 1462 of Lecture Notes in Computer Science, pages 486-501. Springer-Verlag, 1998. http://www.tml.hut.fi/~helger/papers/bllv98/
[BT94] J. Benaloh and D. Tuinstra. Receipt-Free Secret-Ballot Elections (extended abstract). In Proc. 26th ACM Symposium on the Theory of Computing (STOC), pp. 544-553. ACM, 1994.
[CIVTF] California Internet Voting Task Force. http://www.ss.ca.gov/executive/ivote/
[CFSY96] R. Cramer, M. Franklin, B. Schoenmakers, M. Yung. Multi-authority secret ballot elections with linear work. In Advances in Cryptology - CRYPTO '96, volume 1070 of Lecture Notes in Computer Science, pages 72-83, Berlin, 1996. Springer-Verlag.
[CGS97] R. Cramer, R. Gennaro, B. Schoenmakers. A Secure and Optimally Efficient Multi-Authority Election Scheme. European Transactions of Telecommunications, 8:481-489, 1997.
[Cha81] D. Chaum. Untraceable Electronic Mail, Return Addresses, and Digital Pseudonyms. Communications of the ACM, 24(2):84-86, 1981.
[Cuculus] Time-Stamping Project Cuculus. http://www.tml.hut.fi/~helger/cuculus/
[DNSEXT] IETF DNS Extensions (dnsext) Working Group. http://ietf.org/html.charters/dnsext-charter.html
[Election.com] http://www.election.com/
[Ero00] Pasi Eronen. Denial of service in public key protocols. In Proceedings of the Helsinki University of Technology Seminar on Network Security (Fall 2000), to appear in TML laboratory report series, December 2000. http://www.cs.hut.fi/~peronen/publications/
[Fi85] M. Fischer, N. Lynch, and M. Paterson. Impossibility of distributed consensus with one faulty process. Journal of the ACM, 32(2), pp. 374-382, 1985.
[Fi00] Michael J. Fischer. The Consensus Problem in Unreliable Distributed Systems (A Brief Survey). Proc. Int. Conf. on Foundations of Computations Theory, 2000. http://citeseer.nj.nec.com/326938.html
[LipMy01] Helger Lipmaa, Oleg Mürk. E-valimiste realiseerimisvõimaluste analüüs. In Estonian. http://www.just.ee/oldjust/JM/lipmaamyrk.pdf
[IPI] National Workshop on Internet Voting. Conducted by Internet Policy Institute. Sponsored by the USA National Science Foundation. http://www.netvoting.org/
[IPSec] IETF IP Security Protocol (ipsec) Working Group. http://ietf.org/html.charters/ipsec-charter.html
[Kest95] Lawrence Kesteloot. Fault-Tolerant Distributed Consensus. 1995. http://tofu.alt.net/~lk/290.paper/290.paper.html
[MR98] D. Malkhi and M. Reiter. Byzantine quorum systems. Distributed Computing 11(4):203-213, 1998. A preliminary version appears in Proceedings of the 29th ACM Symposium on Theory of Computing, May 1997. http://www.bell-labs.com/user/reiter/#Quorums
[Myr00] Oleg Mürk. Electronic Voting Schemes. Semester work. http://www.math.ut.ee/~olegm/my_papers.english.html
[.NET] Microsoft .NET Platform. http://www.microsoft.com/net/
[NTP] Time WWW server. http://www.eecis.udel.edu/~ntp/
[OSI] OSI (Open System Interconnect) reference model (definition). http://webopedia.internet.com/TERM/O/OSI.html
[Rei94] M. K. Reiter. Secure agreement protocols: Reliable and atomic group multicast in Rampart. In Proceedings of the 2nd ACM Conference on Computer and Communication Security, pages 68-80, November 1994. http://www.bell-labs.com/user/reiter/#Rampart
[Rei95] M. K. Reiter. The Rampart toolkit for building high-integrity services. In Theory and Practice in Distributed Systems (Lecture Notes in Computer Science 938), pages 99-110, Springer-Verlag, 1995. http://www.bell-labs.com/user/reiter/#Rampart
[Rei96] M. K. Reiter. A secure group membership protocol. IEEE Transactions on Software Engineering 22(1):31-42, January 1996. http://www.bell-labs.com/user/reiter/#Rampart
[Reiter] Michael Reiter's Homepage. http://www.bell-labs.com/user/reiter/
[Rub00] Avi Rubin. Security Considerations for Remote Electronic Voting over the Internet. http://avirubin.com/e-voting.security.html
[Sch99] B. Schneier and A. Shostack. Breaking Up Is Hard to Do: Modelling Security Threats for Smart Cards. USENIX Workshop on Smart Card Technology, USENIX Press, 1999, pp. 175-185. http://www.counterpane.com/smart-card-threats.html
[TLS] IETF Transport Layer Security (tls) Working Group. http://ietf.org/html.charters/tls-charter.html
[VeriSign] http://www.verisign.com
[VoteHere.net] http://votehere.net/
[VIP] Voting Integrity Project. http://www.voting-integrity.org/projects/votingtechnology/
[X.509] IETF Public-Key Infrastructure (X.509) Working Group. http://ietf.org/html.charters/pkix-charter.html
