Chapter 5: Test Methods
The accredited test lab must design and perform procedures to test a voting system against the requirements outlined in Part 1. These procedures must address:
- Overall system capabilities;
- Pre-voting functions;
- Voting functions;
- Post-voting functions;
- System maintenance; and
- Transportation and storage.
The specific procedures to be used must be identified in the test plan prepared by the accredited test lab (see Part 2: Chapter 5: "Test Plan (test lab)"). These procedures must not rely on manufacturer testing as a substitute for independent testing.
1 Comment
Comment by E Smith/P Terwilliger (Manufacturer)
5.1. It is not clear in this section whether each electronic device that comprises the voting system is to be separately tested, or if the entire system is to be tested as a whole. 5.1.1.2. "the product" is not defined. 5.1.1.2-A.1. "it is recommended" has no place. The limit must be specified exactly. 5.1.1.2-A.2. "it is recommended" has no place. The limit must be specified exactly. 5.1.1.2-B. The "industry-recognized standards" need to be cited. This section needs to acknowledge that not all devices have telephone ports. 5.1.3.2-A. "Voting system" or "voting device"? 5.2.2-A. This is not possible. Earlier VVSG sections (1: 6.4.1.8-B and 3: 4.6) acknowledge that not all paths can be verified. Other sections, such as 5.2.3-B.1, also are counter to this requirement. 5.2.3. Using "system" instead of "voting system". 5.2.3-F.1 and 5.2.3-F.2. How is "...tests to verify..." different from "...tests to check..."? 5.3.2. Far too many significant figures in the calculation results. You can't use 7 or 8 significant digits when the input is only 2 or 3. 5.4.2-B. "expert" is subjective. 5.4.2-C vs 5.4.2-D. Why is the experience requirement higher for the election management "expert" than for the other "experts"? 5.4.2-E.a. "complete knowledge" of anything is impossible. 5.4.2-E. These requirements are heavily biased towards the election field. For instance, why would the Information Security expert be required to have designed a voting system? 5.4.3-B. "system model" is not defined, nor used in any other place in the VVSG. 5.4.3-C. "threat model" is not defined, nor used in any other place in the VVSG. 5.4.4-A, 5.4.4-B, 5.4.4-C. Sections switch from "voting system" to "voting device". Is this intentional? The use of "voting device" implies that parts of a system may pass and others may fail. 5.4.6. VSTL is used, not test lab. Is this intentional under the Program?
5.1 Hardware
5.1.1 Electromagnetic compatibility (EMC) immunity
Testing of voting systems for EMC immunity will be conducted using the black-box testing approach, which "ignores the internal mechanism of a system or component and focuses solely on the outputs generated in response to selected inputs and execution conditions" (from [IEEE00]). It will be necessary to subject voting systems to a regimen of tests including most, if not all, disturbances that might be expected to impinge on the system, as recited in the requirements of Part 1.
Note: Some EMC immunity requirements have been established by Federal Regulations or for compliance with authorities having jurisdiction as a condition for offering equipment to the US market. In such cases, the requirements include affixing a label or notice stating that the equipment complies with the technical requirements, and therefore the VVSG does not suggest performing a redundant test.
1 Comment
Comment by Al Backlund (Voting System Test Laboratory)
Does this mean that the VSTLs can accept FCC class B testing performed outside the certification timeframe?
5.1.1.1 Steady-state conditions
Testing laboratories that perform conformity assessments can be expected to have readily available a 120 V power supply from an energy service provider and access to a landline telephone service provider that will enable them to simulate the environment of a typical polling place.
5.1.1.2 Conducted disturbances immunity
Immunity to conducted disturbances will be demonstrated by appropriate industry-recognized tests and criteria for the ports involved in the operation of the voting system.
Adequacy of the product is demonstrated by satisfying specific "pass criteria" as the outcome of the tests, which include not producing failure in the functions, firmware, or hardware.
The test procedure, test equipment, and test sequences will be based on benchmark tests and on observation of the voltage and current waveforms during the tests. Where relevant, this includes detection of a "walking wounded" condition, in which a severe but not immediately lethal stress produces a hardware failure some time later.
Testing SHALL be conducted in accordance with the power port stress testing specified in IEEE Std C62.41.2™-2002 [IEEE02a] and IEEE Std C62.45™-2002 [IEEE02b].
Applies to: Electronic device
DISCUSSION
Both the IEEE and the IEC have developed test protocols for immunity of equipment power ports. In the case of a voting system intended for application in the United States, test equipment tailored to perform tests according to these two IEEE standards is readily available in test laboratories, thus facilitating the process of compliance testing.
Source: New requirement
1 Comment
Comment by E Smith/B Pevzner (Manufacturer)
There are two IEEE specs listed. Does it mean that testing for requirements of either of them is satisfactory?
Testing SHALL be conducted in accordance with the power port stress of "Category B" to be applied by a Combination Waveform generator, in the powered mode, between line and neutral as well as between line and equipment grounding conductor.
Applies to: Electronic device
DISCUSSION
To satisfy this requirement, it is recommended that voting systems be capable of withstanding a 1.2/50 – 8/20 Combination Wave of 6 kV open-circuit voltage, 3 kA short-circuit current, with the following application points:
- Three surges, positive polarity at the positive peak of the line voltage, line to neutral;
- Three surges, negative polarity at the negative peak of the line voltage, line to neutral;
- Three surges, positive polarity at the positive peak of the line voltage, line to equipment grounding conductor; and
- Three surges, negative polarity at the negative peak of the line voltage, line to equipment grounding conductor.
The requirement of three successive pulses is based on the need to monitor any possible change in the equipment response caused by the application of the surges.
Source: [IEEE02a] Table 3
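The application points listed above can be written out as an explicit schedule, which may help a test operator confirm that all twelve surges have been applied. The following is a minimal sketch only, not a procedure taken from [IEEE02a]: the names SurgeShot and combination_wave_schedule are hypothetical, the 90 and 270 degree phase angles are an assumption standing in for the positive and negative peaks of a sinusoidal line voltage, and the first application point is assumed to be line to neutral, in parallel with the other three.

```python
from dataclasses import dataclass
from typing import List

# Values quoted in the DISCUSSION text.
OPEN_CIRCUIT_KV = 6.0
SHORT_CIRCUIT_KA = 3.0
SURGES_PER_POINT = 3  # three successive pulses per application point


@dataclass(frozen=True)
class SurgeShot:
    """One surge application (hypothetical record type for illustration)."""
    coupling: str   # "line-to-neutral" or "line-to-ground"
    polarity: str   # "positive" or "negative"
    phase_deg: int  # assumed phase angle of the line-voltage peak
    shot: int       # 1-based index within the group of three


def combination_wave_schedule() -> List[SurgeShot]:
    """Enumerate the four application points, three surges each."""
    shots = []
    for coupling in ("line-to-neutral", "line-to-ground"):
        # Assumption: positive peak at 90 degrees, negative peak at 270 degrees.
        for polarity, phase_deg in (("positive", 90), ("negative", 270)):
            for shot in range(1, SURGES_PER_POINT + 1):
                shots.append(SurgeShot(coupling, polarity, phase_deg, shot))
    return shots


if __name__ == "__main__":
    for s in combination_wave_schedule():
        print(f"{OPEN_CIRCUIT_KV} kV / {SHORT_CIRCUIT_KA} kA  "
              f"{s.coupling:16s} {s.polarity:8s} at {s.phase_deg} deg  shot {s.shot}")
```

Keeping the three shots of each group adjacent in the schedule reflects the stated purpose of the repetition, which is to monitor any change in the equipment response from one surge to the next.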
Testing SHALL be conducted in accordance with the power port stress of "Category B" to be applied by a "Ring Wave" generator, in the powered mode, between line and neutral as well as between line and equipment grounding conductor and neutral to equipment grounding conductor, at the levels shown below.
Applies to: Electronic device
DISCUSSION
Two different levels are recommended:
- 6 kV open-circuit voltage, per Table 2 of [IEEE02a], applied as follows:
  - Three surges, positive polarity at the positive peak of the line voltage, line to neutral;
  - Three surges, negative polarity at the negative peak of the line voltage, line to neutral;
  - Three surges, positive polarity at the positive peak of the line voltage, line to equipment grounding conductor; and
  - Three surges, negative polarity at the negative peak of the line voltage, line to equipment grounding conductor.
- 3 kV open-circuit voltage, per Table 5 of [IEEE02a], applied as follows:
  - Three surges, positive polarity at the positive peak of the line voltage, neutral to equipment grounding conductor; and
  - Three surges, negative polarity at the negative peak of the line voltage, neutral to equipment grounding conductor.
Source: [IEEE02a] Table 2 and Table 5
Testing SHALL be conducted in accordance with the recommendations of IEEE Std C62.41.2™-2002 [IEEE02a] and IEEE Std C62.45™-2002 [IEEE02b].
Applies to: Electronic device
DISCUSSION
Unlike the preceding two tests that are deemed to represent possibly destructive surges, the Electrical Fast Transient (EFT) Burst has been developed to demonstrate equipment immunity to non-destructive but highly disruptive events. Repetitive bursts of unidirectional 5/50 ns pulses lasting 15 ms and with 300 ms separation are coupled into terminals of the voting system by coupling capacitors for the power port and by the coupling clamp for the telephone connection cables.
Source: [IEEE02a] Table 6, [ISO04b]
Testing SHALL be conducted by applying gradual steps of overvoltage across the line and neutral terminals of the voting system unit.
Applies to: Electronic device
DISCUSSION
Testing for sag immunity within the context of EMC is not necessary in view of Requirement Part 1: 6.3.4.3-A.4 that the voting system be provided with a two-hour back-up capability (to be verified by inspection). Testing for swells and permanent overvoltage conditions is necessary to ensure immunity to swells (no loss of data) and to permanent overvoltages (no overheating or operation of a protective fuse).
A) Short-duration Swells
As indicated by the ITI Curve [ITIC00], it is necessary to ensure that voting systems not be disturbed by a temporary overvoltage of 120 % of normal line voltage lasting from 3 ms to 0.5 s. (Shorter durations fall within the definition of "surge.")
B) Permanent Overvoltage
As indicated by the ITI Curve [ITIC00], it is necessary to ensure that voting systems not be disturbed nor overheat for a permanent overvoltage of 110 % of the nominal 120 V rating of the voting system.
Source: New requirement
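As a worked example of the two limits above, the absolute test voltages for a 120 V supply can be computed as follows. This sketch assumes that both percentages are expressed as a fraction of the nominal line voltage, not as an increment added to it; the function name percent_of_nominal is hypothetical.

```python
NOMINAL_V = 120.0  # nominal line voltage stated in the text


def percent_of_nominal(percent: float, nominal_v: float = NOMINAL_V) -> float:
    """Return the absolute voltage for a level given as a percentage of nominal.

    Assumption: "120 %" means 1.20 x nominal, not nominal + 120 %.
    """
    return nominal_v * percent / 100.0


swell_v = percent_of_nominal(120.0)      # short-duration swell, 3 ms to 0.5 s
permanent_v = percent_of_nominal(110.0)  # permanent overvoltage

if __name__ == "__main__":
    print(f"Short-duration swell level: {swell_v:.0f} V")
    print(f"Permanent overvoltage level: {permanent_v:.0f} V")
```

Under that reading, the swell test is applied at 144 V and the permanent overvoltage test at 132 V.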
1 Comment
Comment by C R Williams (None)
"... permanent overvoltage of 110% of the nominal 120 V rating ..." A 110% overvoltage could mean that the total applied voltage is 120 v plus 110% of 120 v, or (120 + 132) v for 252 v total. I hope this is not what is meant. It would be good to clarify this, perhaps just stating what the total 'permanent withstand' voltage is for a nominal 120 v supply. (I'm guessing 132 v?)
5.1.1.2-B Communications (telephone) port disturbances
Testing SHALL be conducted in accordance with the telephone port stress testing specified in industry-recognized standards developed for telecommunications in general, particularly equipment connected to landline telephone service providers.
Applies to: Electronic device
DISCUSSION
Voting systems, by being connected to the outside service provider via premises wiring, can be exposed to a variety of electromagnetic disturbances. These have been classified as emissions from adjacent equipment, lightning-induced, power-fault induced, power contact, Electrical Fast Transient (EFT), and steady-state induced voltage.
Source: New requirement
Testing SHALL be conducted in accordance with the emissions limits stipulated for other equipment of the voting system connected to the premises wiring of the polling place.
Applies to: Electronic device
DISCUSSION
Emission limits for the power port of voting systems are discussed in Requirement Part 1: 6.3.4.2-B.1 with reference to numerical values stipulated in [Telcordia06]. EMC of a complete voting system installed in a polling facility thus implies that individual components of voting systems must demonstrate immunity against disturbances at a level equal to the limits stipulated for emissions of adjacent pieces of equipment.
Source: [Telcordia06] subclause 3.2.3
Testing SHALL be conducted in accordance with the requirements of Telcordia GR-1089 [Telcordia06] for simulation of lightning.
Applies to: Electronic device
DISCUSSION
Telcordia GR-1089 [Telcordia06] lists two types of tests, the First-Level Lightning Surge Test and the Second-Level Lightning Surge Test, as follows:
A) First-Level Lightning Surge Test
The particular voting system piece of equipment under test (generally referred to as "EUT") is placed in a complete operating system performing its intended functions, while monitoring proper operation, with checks performed before and after the surge sequence. Manual intervention or power cycling is not permitted before verifying proper operation of the voting system.
B) Second-Level Lightning Surge Test
Second-level lightning surge test is performed as a fire hazard indicator with cheesecloth applied to the particular EUT.
This second-level test, which can be destructive, may be performed with the EUT operating at a sub-assembly level equivalent to the standard system configuration, by providing dummy loads or associated equipment equivalent to what would be found in the complete voting system, as assembled in the polling place.
Source: [Telcordia06] subclauses 4.6.7 and 4.6.8
Testing SHALL be conducted in accordance with the requirements of Telcordia GR-1089 [Telcordia06] for simulation of power-fault-induced events.
Applies to: Electronic device
DISCUSSION
Tests that can be used to assess the immunity of voting systems to power fault-induced disturbances are described in detail in [Telcordia06] for several scenarios and types of equipment, each involving a specific configuration of the test generator, test circuit, and connection of the equipment.
Source: [Telcordia06] subclause 4.6
Testing SHALL be conducted in accordance with the requirements of Telcordia GR-1089 [Telcordia06] for simulation of power-contact events.
Applies to: Electronic device
DISCUSSION
Tests for power contact (sometimes called "power cross") immunity of voting systems are described in detail in [Telcordia06] for several scenarios and types of equipment, each involving a specific configuration of the test generator, test circuit, and connection of the equipment.
Source: [Telcordia06] subclause 4.6
Testing SHALL be conducted in accordance with the requirements of Telcordia GR-1089 [Telcordia06] for application of the EFT Burst.
Applies to: Electronic device
DISCUSSION
Telcordia GR-1089 [Telcordia06] calls for performing EFT tests but refers to [ISO04b] for details of the procedure. EFT generators, per the IEC standard [ISO04b], offer the possibility of injecting the EFT burst into a power port by means of coupling capacitors. The other method described by the IEC standard, the so-called "capacitive coupling clamp," would be the recommended method for coupling the burst into leads connected to the telephone port of the voting system under test. However, because the leads (subscriber premises wiring) vary from polling place to polling place, a more repeatable test is direct injection at the telephone port via the coupling capacitors.
Source: [ISO04b] clause 6
Testing SHALL be conducted in accordance with the requirements of Telcordia GR-1089 [Telcordia06] for simulation of steady-state induced voltages.
Applies to: Electronic device
DISCUSSION
Telcordia GR-1089 [Telcordia06] describes two categories of tests, depending on the length of loops, the criterion being a loop length of 20 kft (sic). For metric system units, that criterion may be considered to be 6 km, a distance that can be exceeded for some low-density rural or suburban locations of a polling place. Therefore, the test circuit to be used should be the one applying the highest level of induced voltage.
Source: [Telcordia06] sub-clause 5.2
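The 6 km figure above is a rounded conversion of the 20 kft criterion. A minimal sketch of the arithmetic, using the exact definition of the international foot (0.3048 m), is shown below; the function name kft_to_km is hypothetical.

```python
METRES_PER_FOOT = 0.3048  # exact, by definition of the international foot


def kft_to_km(kft: float) -> float:
    """Convert thousands of feet to kilometres.

    1 kft = 1000 ft = 304.8 m = 0.3048 km, so the factor is the same number.
    """
    return kft * METRES_PER_FOOT


loop_criterion_km = kft_to_km(20.0)

if __name__ == "__main__":
    print(f"20 kft is approximately {loop_criterion_km:.3f} km")
```

The unrounded value is slightly above 6 km, so treating the criterion as 6 km is marginally conservative for loops close to the boundary.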
5.1.1.2-C Interaction between power port and telephone port
Inherent immunity against data corruption and hardware damage caused by interaction between the power port and the telephone port SHALL be demonstrated by applying a 0.5 µs – 100 kHz Ring Wave between the power port and the telephone port.
Applies to: Electronic device
DISCUSSION
IEEE is in the process of developing a standard (IEEE PC62.50) to address the interaction between the power port and the communications port, but no standard has been promulgated at this date. Published papers in peer-reviewed literature [Key94] suggest that a representative surge can be the Ring Wave of [IEEE02a], applied between the equipment grounding conductor terminal of the voting system component under test and each of the tip and ring terminals of the voting system components intended to be connected to the telephone network.
Inherent immunity of the voting system might have been achieved by the manufacturer, as suggested in PC62.50, by providing a surge-protective device between these terminals that will act as a temporary bond during the surge, a function which can be verified by monitoring the voltage between the terminals when the surge is applied.
The IEEE project is IEEE PC62.50 "Draft Standard for Performance Criteria and Test Methods for Plug-in, Portable, Multiservice (Multiport) Surge Protective Devices for Equipment Connected to a 120/240 V Single Phase Power Service and Metallic Conductive Communication Line(s)." This is an unapproved standard, with estimated approval date 2008.
Source: New requirement
5.1.1.3 Radiated disturbances immunity
5.1.1.3-A Electromagnetic field immunity (80 MHz to 6.0 GHz)
Testing SHALL be conducted according to procedures in CISPR 24 [ANSI97], and either IEC 61000-4-3 [ISO06a] or IEC 61000-4-21:2003 [ISO06d].
Applies to: Electronic device
DISCUSSION
IEC 61000-4-3 [ISO06a] specifies using an absorber-lined shielded room (fully or semi-anechoic chamber) to expose the device under test. An alternative is the immunity testing procedure of IEC 61000-4-21 [ISO06d], performed in a reverberating shielded room (radio-frequency reverberation chamber).
Source: [ANSI97], [ISO06a], [ISO06d]
5.1.1.3-B Electromagnetic field immunity (150 kHz to 80 MHz)
Testing for electromagnetic fields below 80 MHz SHALL be conducted according to procedures defined in IEC 61000-4-6 [ISO06b].
Testing SHALL be conducted in accordance with the recommendations of ANSI Std C63.16 [ANSI93], applying an air discharge or a contact discharge according to the nature of the enclosure of the voting system.
Applies to: Electronic device
DISCUSSION
Electrostatic discharges, simulated by a portable ESD simulator, involve an air discharge that can upset the logic operations of the circuits, depending on their status. In the case of a conducting enclosure, the resulting discharge current flowing in the enclosure can couple with the circuits and also upset the logic operations. Therefore, it is necessary to apply a sufficient number of discharges to significantly increase the probability that the circuits will be exposed to the interference at the time of the most critical transition of the logic. This condition can be satisfied by using a simulator with repetitive discharge capability while a test operator interacts with the voting terminal, mimicking the actions of a voter or initiating a data transfer from the terminal to the local tabulator.
Source: [ANSI93], [ISO01]
5.1.2 Electromagnetic compatibility (EMC) emissions limits
Testing of voting systems for EMC emission limits will be conducted using the black-box testing approach, which "ignores the internal mechanism of a system or component and focuses solely on the outputs generated in response to selected inputs and execution conditions" [IEEE00].
It will be necessary to subject voting systems to a regimen of tests to demonstrate compliance with emission limits. The tests should include most, if not all, disturbances that might be expected to be emitted from the implementation under test, unless compliance with mandatory limits such as FCC regulations is explicitly stated for the implementation under test.
5.1.2.1 Conducted emissions limits
5.1.2.1.1 Power port – low/high frequency ranges
As discussed in Part 1: 6.3.5 "Electromagnetic Compatibility (EMC) emission limits", the low-frequency harmonic emissions of a voting system are small relative to the current drawn by other loads in the polling place, and so result in a negligible percentage of harmonics at the point of common connection (see [IEEE92]). Thus, no test is required to assess the harmonic emission of a voting station.
High-frequency emission limits have been established by Federal Regulations [FCC07] as a condition for offering equipment to the US market. In such cases, the requirements include affixing a label or notice stating that the equipment complies with the stipulated limits. Therefore, the VVSG does not suggest performing a redundant test.
5.1.2.1.2 Communications (Telephone) port
Unintended conducted emissions from a voting system telephone port SHALL be tested on its analog voice-band leads against both the metallic and the longitudinal voltage limits.
Applies to: Voting system
DISCUSSION
Telcordia GR-1089 [Telcordia06] stipulates limits for both the common mode (longitudinal) and differential mode (metallic) over a frequency range defined by maximum voltage and terminating impedances.
Source: [Telcordia06] subclause 3.2.3
5.1.2.2 Radiated emissions
Compliance with emission limits SHALL be documented on the hardware in accordance with the stipulations of FCC Part 15, Class B [FCC07].
Applies to: Voting system
Source: [FCC07]
5.1.3 Other (non-EMC) industry-mandated requirements
5.1.3.1 Dielectric stresses
Testing SHALL be conducted in accordance with the stipulations of industry-consensus telephone requirements of Telcordia GR-1089 [Telcordia06].
5.1.3.2 Leakage via grounding port
Simple verification of an acceptable low leakage current SHALL be performed by powering the voting system under test via a listed Ground-Fault Circuit Interrupter (GFCI) and noting that no tripping of the GFCI occurs when the voting system is turned on.
Applies to: Voting system
Source: New requirement
The presence of a listing label (required by authorities having jurisdiction) referring to a safety standard, such as [UL05], makes repeating the test regimen unnecessary. Details on the safety considerations are addressed in Part 1: 3.2.8.2 "Safety".
5.1.3.4 Label of compliance
Some industry-mandated requirements require demonstration of compliance, while for others the manufacturer affixes a label of compliance, which makes repeating the tests unnecessary and not economically justifiable.
5.1.4 Non-operating environmental testing
This type of testing is designed to assess the robustness of voting systems during storage between elections and during transport between the storage facility and the polling place.
Such testing is intended to simulate exposure to physical shock and vibration associated with handling and transportation of voting systems between a jurisdiction's storage facility and polling places. The testing additionally simulates the temperature and humidity conditions that may be encountered during storage in an uncontrolled warehouse environment or precinct environment. The procedures and conditions of this testing correspond to those of MIL-STD-810D, "Environmental Test Methods and Engineering Guidelines."
5.1.4-A Tests of non-operating equipment
All voting systems SHALL be tested in accordance with the appropriate procedures of MIL-STD-810D, "Environmental Test Methods and Engineering Guidelines" [MIL83].
Applies to: Voting system
Source: [VVSG2005]
1 Comment
Comment by Brian V. Jarvis (Local Election Official)
Note that the latest revision of MIL-STD-810 is revision F (dated 1 January 2000). The most recent change notice (Notice #3) for that standard is dated 5 May 2003. Recommend that this requirement be updated to indicate the most recent revision of this standard. (This latest revision may result in impacts to the requirements in the sub-sections below 5.1.4-A.)
All voting systems SHALL be tested in accordance with MIL-STD-810D, Method 516.3, Procedure VI.
Applies to: Voting system
DISCUSSION
This test simulates stresses faced during maintenance and repair.
Source: [VVSG2005]
All voting systems SHALL be tested in accordance with MIL-STD-810D, Method 514.3, Category 1 – Basic Transportation, Common Carrier.
Applies to: Voting system
DISCUSSION
This test simulates stresses faced during transport between storage locations and polling places.
Source: [VVSG2005]
All voting systems SHALL be tested in accordance with MIL-STD-810D: Method 502.2, Procedure I – Storage and Method 501.2, Procedure I – Storage. The minimum temperature SHALL be -4 degrees F, and the maximum temperature SHALL be 140 degrees F.
Applies to: Voting system
DISCUSSION
This test simulates stresses faced during storage.
Source: [VVSG2005]
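For laboratories whose environmental chambers are set in degrees Celsius, the storage temperature limits above convert as shown in the following minimal sketch. The function names fahrenheit_to_celsius and within_storage_range are hypothetical and are not part of MIL-STD-810D.

```python
STORAGE_MIN_F = -4.0    # minimum storage temperature from the requirement
STORAGE_MAX_F = 140.0   # maximum storage temperature from the requirement


def fahrenheit_to_celsius(temp_f: float) -> float:
    """Standard conversion: C = (F - 32) * 5 / 9."""
    return (temp_f - 32.0) * 5.0 / 9.0


def within_storage_range(temp_f: float) -> bool:
    """True if a chamber setpoint lies inside the required storage range."""
    return STORAGE_MIN_F <= temp_f <= STORAGE_MAX_F


storage_min_c = fahrenheit_to_celsius(STORAGE_MIN_F)
storage_max_c = fahrenheit_to_celsius(STORAGE_MAX_F)

if __name__ == "__main__":
    print(f"Storage range: {storage_min_c:.0f} C to {storage_max_c:.0f} C")
```

The limits correspond to -20 degrees C and 60 degrees C.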
All voting systems SHALL be tested in accordance with humidity testing specified by MIL-STD-810D: Method 507.2, Procedure II – Natural (Hot-Humid), with test conditions that simulate a storage environment.
Applies to: Voting system
DISCUSSION
This test is intended to evaluate the ability of voting equipment to survive exposure to an uncontrolled temperature and humidity environment during storage.
Source: [VVSG2005]
5.1.5 Operating environmental testing
This type of testing is designed to assess the robustness of voting systems during operation.
5.1.5-A Tests of operating equipment
All voting systems SHALL be tested in accordance with the appropriate procedures of MIL-STD-810D, "Environmental Test Methods and Engineering Guidelines" [MIL83].
Applies to: Voting system
Source: [VVSG2005]
All voting systems SHALL be tested according to the low temperature and high temperature testing specified by MIL-STD-810D [MIL83]: Method 502.2, Procedure II – Operation and Method 501.2, Procedure II – Operation, with test conditions that simulate system operation.
Applies to: Voting system
Source: [VVSG2005]
All voting systems SHALL be tested according to the humidity testing specified by MIL-STD-810D: Method 507.2, Procedure II – Natural (Hot-Humid), with test conditions that simulate system operation.
Applies to: Voting system
Source: New requirement
5.2 Functional Testing
Functional testing is performed to confirm the functional capabilities of a voting system. The accredited test lab designs and performs procedures to test a voting system against the requirements outlined in Part 1. Additions or variations in testing may be appropriate depending on the system's use of specific technologies and configurations, the system capabilities, and the outcomes of previous testing.
Functional tests cover the full range of system operations. They include tests of fully integrated system components, internal and external system interfaces, usability and accessibility, and security. During this process, election management functions, ballot-counting logic, and system capacity are exercised.
The accredited test lab tests the interface of all system modules and subsystems with each other against the manufacturer's specifications. For systems that use telecommunications capabilities, components that are located at the poll site or separate vote counting site are tested for effective interface, accurate vote transmission, failure detection, and failure recovery. For voting systems that use telecommunications lines or networks that are not under the control of the manufacturer (e.g., public telephone networks), the accredited test lab tests the interface of manufacturer-supplied components with these external components for effective interface, vote transmission, failure detection, and failure recovery.
The security tests focus on the ability of the system to detect, prevent, log, and recover from a broad range of security risks. The range of risks tested is determined by the design of the system and potential exposure to risk. Regardless of system design and risk profile, all systems are tested for effective access control and physical data security. For systems that use public telecommunications networks to transmit election management data or election results (such as ballots or tabulated results), security tests are conducted to ensure that the system provides the necessary identity-proofing, confidentiality, and integrity of transmitted data. The tests determine if the system is capable of detecting, logging, preventing, and recovering from types of attacks known at the time the system is submitted for qualification. The accredited test lab may meet these testing requirements by confirming the proper implementation of proven commercial security software.
5.2.1 General guidelines
5.2.1.1 General test template
Most tests will follow this general template. Different tests will elaborate on the general template in different ways, depending on what is being tested.
- Establish initial state (clean out data from previous tests, verify resident software/firmware);
- Program election and prepare ballots and/or ballot styles;
- Generate pre-election audit reports;
- Configure voting devices;
- Run system readiness tests;
- Generate system readiness audit reports;
- Precinct count only:
  - Open poll;
  - Run precinct count test ballots; and
  - Close poll.
- Run central count test ballots (central count / absentee ballots only);
- Generate in-process audit reports;
- Generate data reports for the specified reporting contexts;
- Inspect ballot counters; and
- Inspect reports.
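The template above can be represented as an ordered list of steps, filtered by whether the device under test performs precinct count or central count. This is an illustrative sketch only: the Step type, the applies_to labels, and the steps_for function are hypothetical, and a real test plan would elaborate each step as the text describes.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    name: str
    applies_to: str  # "all", "precinct", or "central" (hypothetical labels)


# Steps in the order given by the general test template.
GENERAL_TEMPLATE = [
    Step("Establish initial state", "all"),
    Step("Program election and prepare ballots and/or ballot styles", "all"),
    Step("Generate pre-election audit reports", "all"),
    Step("Configure voting devices", "all"),
    Step("Run system readiness tests", "all"),
    Step("Generate system readiness audit reports", "all"),
    Step("Open poll", "precinct"),
    Step("Run precinct count test ballots", "precinct"),
    Step("Close poll", "precinct"),
    Step("Run central count test ballots", "central"),
    Step("Generate in-process audit reports", "all"),
    Step("Generate data reports for the specified reporting contexts", "all"),
    Step("Inspect ballot counters", "all"),
    Step("Inspect reports", "all"),
]


def steps_for(count_mode: str) -> List[str]:
    """Return the template steps, in order, that apply to one count mode."""
    if count_mode not in ("precinct", "central"):
        raise ValueError(f"unknown count mode: {count_mode!r}")
    return [s.name for s in GENERAL_TEMPLATE
            if s.applies_to in ("all", count_mode)]


if __name__ == "__main__":
    for number, name in enumerate(steps_for("precinct"), start=1):
        print(f"{number:2d}. {name}")
```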
5.2.1.2 General pass criteria
The test lab need only consider tests that apply to the classes specified in the implementation statement, including those tests that are designated for all systems. The test verdict for all other tests SHALL be Not Applicable.
Applies to: Voting system
1 Comment
Comment by Brian V. Jarvis (Local Election Official)
Under 5.2.1.2, even though the title of the chapter is "General Pass Criteria", none of the subsections to 5.2.1.2 defines criteria for "Pass". Recommend adding a section defining the criteria for "Pass" -- unless (a) applicability of tests, (b) test assumptions, (c) missing functionality, and (d) demonstratable violations comprises an absolute finite list of conditions considered "non-pass."
If the documented assumptions for a given test are not met, the test verdict SHALL be Waived and the test SHALL NOT be executed.
Applies to: Voting system
If the test lab is unable to execute a given test because the system does not support functionality that is required per the implementation statement or is required for all systems, the test verdict SHALL be Fail.
Applies to: Voting system
5.2.1.2-D Any demonstrable violation justifies an adverse opinion
A demonstrable violation of any applicable requirement of the VVSG during the execution of any test SHALL result in a test verdict of Fail.
Applies to: Voting system
DISCUSSION
The nonconformities observed during a particular test do not necessarily relate to the purpose of that test. This requirement clarifies that a nonconformity is a nonconformity, regardless of whether it relates to the test purpose.
See Part 3: 2.5.5 "Test practices" for directions on termination, suspension, and resumption of testing following a verdict of Fail.
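The verdict rules in this section can be summarized as a small decision function. The sketch below is an interpretation, not part of the requirements: the order in which the conditions are checked is an assumption, the function and parameter names are hypothetical, and the final Pass verdict is assumed for the case in which none of the stated conditions applies, since this section does not itself define Pass.

```python
from enum import Enum


class Verdict(Enum):
    NOT_APPLICABLE = "Not Applicable"
    WAIVED = "Waived"
    FAIL = "Fail"
    PASS = "Pass"  # assumed outcome when no other rule applies


def decide_verdict(applies_to_claimed_classes: bool,
                   assumptions_met: bool,
                   required_functionality_supported: bool,
                   violation_observed: bool) -> Verdict:
    """Map the conditions described in this section to a test verdict.

    Assumption: conditions are evaluated in the order the section lists them.
    """
    if not applies_to_claimed_classes:
        return Verdict.NOT_APPLICABLE      # test is outside the claimed classes
    if not assumptions_met:
        return Verdict.WAIVED              # test is not executed
    if not required_functionality_supported:
        return Verdict.FAIL                # test cannot be executed
    if violation_observed:
        return Verdict.FAIL                # any demonstrable violation
    return Verdict.PASS


if __name__ == "__main__":
    print(decide_verdict(True, True, True, False).value)
```

Note that violation_observed covers any applicable requirement, whether or not it relates to the purpose of the test being run.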
5.2.2 Structural coverage (white-box testing)
This section specifies requirements for "white-box" (glass-box, clear-box) testing of voting system logic.
For voting systems that reuse components or subsystems from previously tested systems, the test lab may, per Requirement Part 2: 5.1-D, find it unnecessary to repeat instruction, branch, and interface testing on the previously tested, unmodified components. However, the test lab must fully test all new or modified components and perform what regression testing is necessary to ensure that the complete system remains compliant.
5.2.2-A Instruction and branch testing
The test lab SHALL execute tests that provide coverage of every accessible instruction and branch outcome in application logic and border logic.
Applies to: Voting system
DISCUSSION
This is not exhaustive path testing, but testing of paths sufficient to cover every instruction and every branch outcome.
Full coverage of third-party logic is not mandated because it might include a large amount of code that is never used by the voting application. Nevertheless, the relevant portions of third-party logic should be tested diligently.
There should be no inaccessible code in application logic and border logic other than defensive code (including exception handlers) that is provided to defend against the occurrence of failures and "can't happen" conditions that cannot be reproduced and should not be reproducible by a test lab.
Source: Clarification of [VSS2002]/[VVSG2005] II.6.2.1 and II.A.4.3.3
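The distinction drawn in the discussion between covering every branch outcome and exhaustive path testing can be illustrated with a small sketch. The routine `classify_ballot`, its two flags, and the `trace` mechanism are hypothetical and are not taken from any voting system. The point is only that two test cases can exercise every branch outcome of a routine that has four distinct paths.

```python
from itertools import product

def classify_ballot(is_provisional, has_overvote, trace):
    """Hypothetical routine with two independent branches.

    `trace` collects (branch_name, outcome) pairs so that coverage
    can be inspected after the tests have run.
    """
    labels = []
    if is_provisional:
        trace.add(("provisional", True))
        labels.append("provisional")
    else:
        trace.add(("provisional", False))
    if has_overvote:
        trace.add(("overvote", True))
        labels.append("review")
    else:
        trace.add(("overvote", False))
    return labels

# Every branch outcome in the routine: 2 branches x 2 outcomes each.
ALL_OUTCOMES = {(name, outcome)
                for name in ("provisional", "overvote")
                for outcome in (True, False)}

def branch_outcomes_covered(test_inputs):
    """Run the routine on each input pair and return the outcomes exercised."""
    trace = set()
    for is_provisional, has_overvote in test_inputs:
        classify_ballot(is_provisional, has_overvote, trace)
    return trace

# Two tests are enough to cover every branch outcome.
two_tests = [(True, True), (False, False)]
full_branch_coverage = branch_outcomes_covered(two_tests) == ALL_OUTCOMES

# Exhaustive path testing would need all four input combinations.
all_paths = list(product((True, False), repeat=2))
```

With n independent two-way branches the number of paths grows as 2^n, while the number of branch outcomes grows only as 2n, which is why the requirement stops short of exhaustive path testing.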
3 Comments
Comment by Brian V. Jarvis (Local Election Official)
The voting system application software should not contain "...a large amount of code that is never used by the voting application." In fact, it should not contain any code that is not used by the voting application. All code in the voting application should only exist because it satisfies a requirement. If exception handlers in the source code cannot be logically invoked, recommend that it must be determined if any of this code is "deactivated code" or whether it is "dead code." If it is "deactivated code," evidence should be made available by the manufacturer that the deactivated code is disabled for the environments where its use is not intended. Unintended activation of deactivated code due to abnormal system conditions is the same as unintended activation of activated code. A combination of analysis and testing should show that the means by which such code could be inadvertently executed are prevented, isolated, or eliminated. "Dead code" is executable code which, as a result of a design error, cannot be executed or used in an operational configuration of the target computer environment and is not traceable to a system or software requirement. The "dead code" should be removed and an analysis performed to assess the effect and the need for reverification.
Comment by Gail Audette (Voting System Test Laboratory)
While this cites the VSS and VVSG this is not a clarification but a new requirement. The requirement for 100% coverage of every accessible instruction and outcome processing during unit test is not achievable in a finite amount of time. Based upon personal experience of flight software the level of scope identified in this requirement exceeds industry best practices for high reliability commercial products. We suggest reconciling this requirement with this type of product.
Comment by Al Backlund (Voting System Test Laboratory)
What is meant by "this is not exhaustive path testing"? I believe that "coverage of every accessible instruction and branch outcome" requires instrumented code and/or unit testing procedures which can be exhaustive. Would recommend that this be written as to indicate that all commonly used instruction and branch outcomes be exercised or something that more effectively communicates the scope expected.
The test lab SHALL execute tests that test the interfaces of all application logic and border logic modules and subsystems, and all third-party logic modules and subsystems that are in any way used by application logic or border logic.
Applies to: Voting system
Source: Clarification of [VSS2002]/[VVSG2005] II.6.3
5.2.2-C Pass criteria for structural testing
The test lab SHALL define pass criteria using the VVSG (for standard functionality) and the manufacturer-supplied system documentation (for implementation-specific functionality) to determine acceptable ranges of performance.
Applies to: Voting system
DISCUSSION
Because white-box tests are designed based on the implementation details of the voting system, there can be no canonical test suite. Pass criteria must always be determined by the test lab based on the available specifications.
Since the nature of the requirements specified by the manufacturer-supplied system documentation is unknown, conformity for implementation-specific functionality may be subject to interpretation. Nevertheless, egregious disagreements between the behavior of the system and the behavior specified by the manufacturer should lead to a defensible adverse finding.
Source: [VSS2002]/[VVSG2005] II.A.4.3.3
5.2.3 Functional coverage (black-box testing)
All voting system logic, including any embedded in COTS components, is subject to functional testing.
For voting systems that reuse components or subsystems from previously tested systems, the test lab may, per Requirement Part 2: 5.1-D, find it unnecessary to repeat functional testing on the previously tested, unmodified components. However, the test lab must fully test all new or modified components and perform what regression testing is necessary to ensure that the complete system remains compliant.
5.2.3-A Functional testing, VVSG requirements
The test lab SHALL execute test cases that provide coverage of every applicable, mandatory ("SHALL"), functional requirement of the VVSG.
Applies to: Voting system
DISCUSSION
Depending upon the design and intended use of the voting system, all or part of the functions listed below must be tested:
- Ballot preparation subsystem;
- Test operations performed prior to, during, and after processing of ballots, including:
- Logic tests to verify interpretation of ballot styles, and recognition of precincts to be processed;
- Accuracy tests to verify ballot reading accuracy;
- Status tests to verify equipment statement and memory contents;
- Report generation to produce test output data; and
- Report generation to produce audit data records.
- Procedures applicable to equipment used in the polling place for:
- Opening the polls and enabling the acceptance of ballots;
- Maintaining a count of processed ballots;
- Monitoring equipment status;
- Verifying equipment response to operator input commands;
- Generating real-time audit messages;
- Closing the polls and disabling the acceptance of ballots;
- Generating election data reports;
- Transfer of ballot counting equipment, or a detachable memory module, to a central counting location; and
- Electronic transmission of election data to a central counting location.
- Procedures applicable to equipment used in a central counting place:
- Initiating the processing of a ballot deck, programmable memory device, or other applicable media for one or more precincts;
- Monitoring equipment status;
- Verifying equipment response to operator input commands;
- Verifying interaction with peripheral equipment, or other data processing systems;
- Generating real-time audit messages;
- Generating precinct-level election data reports;
- Generating summary election data reports;
- Transfer of a detachable memory module to other processing equipment;
- Electronic transmission of data to other processing equipment; and
- Producing output data for interrogation by external display devices.
- Security controls have been implemented, are free of obvious errors, and are operating as described in security documentation.
- Cryptography;
- Access control;
- Setup inspection;
- Software installation;
- Physical security;
- System integrity management;
- Communications;
- Audit, electronic, and paper records; and
- System event logging.
This requirement is derived from [VSS2002]/[VVSG2005] II.A.4.3.4, "Software Functional Test Case Design," in lieu of a canonical functional test suite. Once a complete, canonical test suite is available, the execution of that test suite will satisfy this requirement. For reproducibility, use of a canonical test suite is preferable to development of custom test suites.
In those few cases where requirements specify "fail safe" behaviors in the event of freak occurrences and failures that cannot be reproduced and should not be reproducible by a test lab, the requirement is considered covered if the test campaign concludes with no occurrences of an event to which the requirement would apply. However, if a triggering event occurs, the test lab must assess conformity to the requirement based on the behaviors observed.
Source: [VSS2002]/[VVSG2005] II.A.4.3.4
1 Comment
Comment by alan (General Public)
Gather cards used in previous elections and use these as test samples to be input in the test cycle. Gather people off the street and ask them to fill in the test cards as a sample test set. Give no instructions on how to fill in the cards; all fill-in instructions should be on the material presented. Testers, developers, and team members will fill in the cards in ways that the public will not, so get a sample from the public.
5.2.3-B Functional testing, capacity tests
The test lab SHALL execute tests to verify that the system and its constituent devices are able to operate correctly at the limits specified in the implementation statement; for example:
- Maximum number of ballots;
- Maximum number of ballot positions;
- Maximum number of ballot styles;
- Maximum number of contests;
- Maximum vote total (counter capacity);
- Maximum number of provisional, challenged, or review-required ballots;
- Maximum number of contest choices per contest; and
- Any similar limits that apply.
Applies to: Voting system
DISCUSSION
See Part 1: 2.4 "Implementation Statement". Not every kind of limit is applicable to every kind of device. For example, EBMs may not have a limit on the number of ballots they can handle.
Source: Generalization from [VSS2002]/[VVSG2005] II.6.2.3
2 Comments
Comment by Carolyn Coggins (Voting System Test Laboratory)
This requirement is a problem in the earlier standards, and 5.2.3-B and 5.2.3-B.1 still do not provide sufficient guidance to ensure consistency of testing across all VSTLs. Specific benchmarks need to be provided to the labs, manufacturers and election officials on acceptable stress test limits for a through g. These guidelines can provide a benchmark for 5.2.3-B.1 on what is a practical test. Further identify: 1) A matrix of what limits are applicable to which voting systems; 2) If the manufacturer may set a limit lower than the acceptable stress test limit; 3) What/where must this information be documented (provide a reference if it is already identified).
Comment by Frank Padilla (Voting System Test Laboratory)
Who sets the limits and determines if they are accurate or cover enough?
5.2.3-B.1 Practical limit on capacity operational tests
If an implementation limit is sufficiently great that it cannot be verified through operational testing without severe expense and hardship, the test lab SHALL attest this in the test report and substitute a combination of design review, logic verification, and operational testing to a reduced limit.
Applies to: Voting system
DISCUSSION
For example, since counter capacity can easily be designed to 2^32 and beyond without straining current technology, some reasonable limit for required operational testing is needed. However, it is preferable to test the limit operationally if there is any way to accomplish it.
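To see why a reduced limit may be needed, the following sketch estimates how long it would take to drive a counter to 2^32 by counting ballots operationally. The rate of 300 ballots per hour is borrowed from the central tabulator discussion in this chapter purely as an illustrative assumption; actual devices or test fixtures may be much faster or slower.

```python
HOURS_PER_YEAR = 24 * 365.25  # average year length, an approximation

def hours_to_reach(limit, ballots_per_hour):
    """Hours of continuous operation needed to count `limit` ballots."""
    return limit / ballots_per_hour

counter_limit = 2 ** 32   # 4,294,967,296
assumed_rate = 300        # ballots per hour (illustrative assumption)

hours = hours_to_reach(counter_limit, assumed_rate)
years = hours / HOURS_PER_YEAR
print(f"{hours:,.0f} hours, roughly {years:,.0f} years of continuous counting")
```

Under this assumed rate the result is on the order of 1,600 years, which is the kind of expense and hardship that the requirement has in mind.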
1 Comment
Comment by Cem Kaner (Academic)
This is a perfect example of a situation in which a test fixture can drive the system to a limit that is impractical to reach with end-to-end testing. We should permit the use of risk-focused tests that are not end-to-end as a more desirable alternative to testing with reduced limits. .......... (Affiliation Note: IEEE representative to TGDC)
5.2.3-C Functional testing, stress tests
The test lab SHALL execute tests to verify that the system is able to respond gracefully to attempts to process more than the expected number of ballots per precinct, more than the expected number of precincts, higher than expected volume or ballot tabulation rate, or any similar conditions that tend to overload the system's capacity to process, store, and report data.
2 Comments
Comment by Gail Audette (Voting System Test Laboratory)
"Gracefully" is not a testable requirement. Please identify testable pass/fail criteria.
Comment by Frank Padilla (Voting System Test Laboratory)
"Process more than the number of" is subjective and not testable or repeatable across labs.
5.2.3-D Functional testing, volume test
The test lab SHALL conduct a volume test in conditions approximating normal use in an election. The entire system SHALL be tested, from election definition through the reporting and auditing of final results.
Applies to: Voting system
DISCUSSION
Data collected during this test contribute substantially to the evaluations of reliability, accuracy, and misfeed rate (see Part 3: 5.3 "Benchmarks").
Source: [CA06]
2 Comments
Comment by Frank Padilla (Voting System Test Laboratory)
Subjective. What is normal use in an election? This needs to be defined.
Comment by ACCURATE (Aaron Burstein) (Academic)
Volume testing is a vital element of future certification with respect to voting system reliability. It simulates the load that a typical machine might encounter during its peak use period and does so on many devices at once. It has become important in California, at least; but state-level volume testing can never be as instrumentally effective as volume testing performed during national certification. Flaws found during national certification can be fixed immediately and the system re-certified during the ongoing certification process instead of having to re-submit a delta change under a new certification attempt. Thus, this requirement should be adopted.
For systems that include VEBDs, a minimum of 100 VEBDs SHALL be tested and a minimum of 110 ballots SHALL be cast manually on each VEBD.
Applies to: VEBD
DISCUSSION
For vote-by-phone systems, this would mean having 100 concurrent callers, not necessarily 100 separate servers to answer the calls, if one server suffices to handle many incoming calls simultaneously. Other client-server systems would be analogous.
To ensure that the correct results are known, test voters should be furnished with predefined scripts that specify the votes that they should cast.
Source: [CA06]
1 Comment
Comment by Al Backlund (Voting System Test Laboratory)
100 voting terminals is not a practical number of units for several reasons:
- Physical space requirements
- Power requirements
- Availability from manufacturer
A possible solution is a discrete event simulation model developed by the manufacturer and verified for accuracy by the VSTL.
For systems that include precinct tabulators, a minimum of 50 precinct tabulators SHALL be tested. No fewer than 10000 test ballots SHALL be used. No fewer than 400 test ballots SHALL be counted by each precinct tabulator.
Applies to: Precinct tabulator
DISCUSSION
[GPO90] 7.5 specified, "The total number of ballots to be processed by each precinct counting device during these tests SHALL be at least ten times the number of ballots expected to be counted on a single device in an election (500 to 750), but in no case less than 5,000."
It is permissible to reuse test ballots. However, all 10000 test ballots must be used at least once, and each precinct tabulator must count at least 400 (distinct) ballots. Cycling 100 ballots 4 times through a given tabulator would not suffice. See also, Requirement Part 3: 2.5.3-A (Complete system testing).
Source: [CA06]
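The three minimums for precinct tabulators (number of tabulators, size of the test deck actually used, and distinct ballots per tabulator) can be checked mechanically against a test plan. The sketch below is one possible way to do so; the function name and the representation of a plan as a mapping from tabulator id to ballot ids are assumptions made for illustration.

```python
def check_precinct_volume_plan(ballots_by_tabulator,
                               min_tabulators=50,
                               min_distinct_ballots=10000,
                               min_per_tabulator=400):
    """Return a list of shortfalls in a precinct tabulator volume test plan.

    `ballots_by_tabulator` maps a tabulator id to the ids of the test
    ballots it counts (repeats allowed, since ballots may be reused).
    An empty list means all three minimums are met.
    """
    problems = []
    if len(ballots_by_tabulator) < min_tabulators:
        problems.append(f"fewer than {min_tabulators} tabulators")
    # Every ballot in the deck must be used at least once somewhere.
    distinct_used = set().union(*ballots_by_tabulator.values())
    if len(distinct_used) < min_distinct_ballots:
        problems.append(f"fewer than {min_distinct_ballots} distinct ballots used")
    # Repeats of the same ballot on one tabulator do not count twice.
    short = [t for t, ballots in ballots_by_tabulator.items()
             if len(set(ballots)) < min_per_tabulator]
    if short:
        problems.append(f"{len(short)} tabulators count fewer than "
                        f"{min_per_tabulator} distinct ballots")
    return problems

# 50 tabulators, each counting its own 400 distinct ballots (20000 in all).
good_plan = {t: range(t * 400, (t + 1) * 400) for t in range(50)}

# 50 tabulators, each cycling its own 100 ballots 4 times.
bad_plan = {t: list(range(t * 100, (t + 1) * 100)) * 4 for t in range(50)}
```

The second plan fails on two counts: only 5000 distinct ballots are used, and no tabulator counts 400 distinct ballots, which is the cycling case that the discussion rules out.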
2 Comments
Comment by Frank Padilla (Voting System Test Laboratory)
Is the number of test units supportable and representative of the requirements?
Comment by Al Backlund (Voting System Test Laboratory)
50 tabulators is not a practical number of units for several reasons:
- Physical space requirements
- Power requirements
- Availability from manufacturer
A possible solution is a discrete event simulation model developed by the manufacturer and verified for accuracy by the VSTL.
For systems that include central tabulators, a minimum of 2 central tabulators SHALL be tested. No fewer than 10000 test ballots SHALL be used. A minimum ballot volume of 75000 (total across all tabulators) SHALL be tested, and no fewer than 10000 test ballots SHALL be counted by each central tabulator.
Applies to: Central tabulator
DISCUSSION
[CA06] did not specify test parameters for central tabulators. The test parameters specified here are based on the smallest case provided for central count systems in Exhibit J-1 of Appendix J, Acceptance Test Guidelines for P&M Voting Systems, of [GPO90]. An alternative would be to derive test parameters from the test specified in [GPO90] 7.3.3.2 and (differently) in [VSS2002]/[VVSG2005] II.4.7.1. A test of duration 163 hours with a ballot tabulation rate of 300 / hour yields a total ballot volume of 48900—presumably, but not necessarily, on a single tabulator.
[GPO90] 7.5 specified, "The number of test ballots for each central counting device SHALL be at least thirty times the number that would be expected to be voted on a single precinct count device, but in no case less than 15,000."
The ballot volume of 75000 is the total across all tabulators; so, for example, one could test 25000 ballots on each of 3 tabulators. The test deck must contain at least 10000 ballots. A deck of 15000 ballots could be cycled 5 times to generate the required total volume. See also, Requirement Part 3: 2.5.3-A (Complete system testing).
Source: [GPO90] Exhibit J-1 (Central Count)
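The central tabulator minimums combine a deck size, a total volume, and a per-tabulator volume, so it can help to check a proposed plan against all of them at once. The following sketch does this for the examples given in the discussion; the function name and the plan representation (a deck size plus a list of per-tabulator counts) are assumptions for illustration.

```python
def check_central_volume_plan(deck_size, counted_by_tabulator,
                              min_tabulators=2,
                              min_deck=10000,
                              min_total=75000,
                              min_per_tabulator=10000):
    """Return a list of shortfalls in a central tabulator volume test plan.

    `deck_size` is the number of distinct test ballots in the deck.
    `counted_by_tabulator` lists how many ballots each tabulator counts,
    including repeats produced by cycling the deck.
    """
    problems = []
    if len(counted_by_tabulator) < min_tabulators:
        problems.append(f"fewer than {min_tabulators} central tabulators")
    if deck_size < min_deck:
        problems.append(f"test deck smaller than {min_deck} ballots")
    if sum(counted_by_tabulator) < min_total:
        problems.append(f"total volume below {min_total} ballots")
    if any(count < min_per_tabulator for count in counted_by_tabulator):
        problems.append(f"a tabulator counts fewer than {min_per_tabulator} ballots")
    return problems

# 25000 ballots on each of 3 tabulators, using a deck of 15000 ballots.
three_tabulators = check_central_volume_plan(15000, [25000, 25000, 25000])

# The alternative derivation: 163 hours at 300 ballots per hour on one tabulator.
alternative_volume = 163 * 300
single_tabulator = check_central_volume_plan(10000, [alternative_volume])
```

The first plan meets every minimum. The single-tabulator alternative yields 48900 ballots, so it falls short both on the number of tabulators and on total volume.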
The testing of MCOS SHALL include marks filled according to the recommended instructions to voters, imperfect marks as specified in Requirement Part 1: 7.7.5-D, and ballots with folds that do not intersect with voting targets.
Applies to: MCOS
Source: Numerous public comments and issues
2 Comments
Comment by alan (General Public)
- Test with dull #2 pencil
- Test with black ink pen
- Test with blue ink pen
- Test where X is marked instead of circle filled in
- Test if two circles filled in
- Test if single line drawn through circle (all directions - | \ /)
- Test with one circle is partially filled and the second circle on same line is filled completely
- Test if mark is erased
- Test if mark is partially erased
- Test if mark is filled in and X'd out and other circle is filled in
- Test if circle is ripped, torn or punctured
- Test if circle is missing (bad form is used)
- Test if circle is partially printed (bad form is used)
Comment by Frank Padilla (Voting System Test Laboratory)
Requirement is subjective. How are these imperfect marks to be input?
5.2.3-E Functional testing, languages
The test lab SHALL execute tests to verify that the system is able to produce and utilize ballots in all of the languages that are claimed to be supported in the implementation statement.
Applies to: Voting system
DISCUSSION
See Part 1: 2.4 "Implementation Statement".
5.2.3-F Functional testing, error cases
The test lab SHALL execute tests to verify that the system is able to detect, handle, and recover from abnormal input data, operator actions, and conditions.
1 Comment
Comment by Frank Padilla (Voting System Test Laboratory)
Subjective: "abnormal input data" is not testable or repeatable.
The test lab SHALL execute tests to verify that the system detects and handles operator errors such as inserting control cards out of sequence or attempting to install configuration data that are not properly coded for the device.
Applies to: Voting system
Source: [GPO90] 8.8
1 Comment
Comment by alan (General Public)
Testing should not be constrained to "sunny day"/positive tests. Testing should include negative test cases. Test cases should validate both the positive and negative aspects of a requirement. There should be no minimum number of tests (both positive/negative).
The test lab SHALL execute tests to check that the system is able to respond to hardware malfunctions in a manner compliant with the requirements of Part 1: 6.4.1.9 "Recovery".
Applies to: Voting system
DISCUSSION
This capability may be checked by any convenient means (e.g., power off, disconnect a cable, etc.) in any equipment associated with ballot processing.
This test pertains to "fail safe" behaviors as discussed in Requirement Part 3: 5.2.3-A. The test lab may be unable to produce a triggering event, in which case the test is passed by default.
Source: [GPO90] 8.5
For systems that use networking and/or telecommunications capabilities, the test lab SHALL execute tests to check that the system is able to detect, handle, and recover from interference with or loss of the communications link.
Applies to: Voting system
DISCUSSION
This test pertains to "fail safe" behaviors as discussed in Requirement Part 3: 5.2.3-A. The test lab may be unable to produce a triggering event, in which case the test is passed by default.
Source: [VSS2002]/[VVSG2005] II.6.3
5.2.3-G Functional testing, manufacturer functionality
The test lab SHALL execute tests that provide coverage of the full range of system functionality specified in the manufacturer's documentation, including functionality that exceeds the specific requirements of the VVSG.
Applies to: Voting system
DISCUSSION
Since the nature of the requirements specified by the manufacturer-supplied system documentation is unknown, conformity for implementation-specific functionality may be subject to interpretation. Nevertheless, egregious disagreements between the behavior of the system and the behavior specified by the manufacturer should lead to a defensible adverse finding.
Source: [VSS2002]/[VVSG2005] II.3.2.3, II.6.7
The test lab SHALL prepare a detailed matrix of VVSG requirements, system functions, and the tests that exercise them.
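One possible shape for such a matrix is sketched below. The requirement identifiers are taken from this section, but the system function names, the test identifiers, and the helper functions are hypothetical; the requirement itself does not prescribe any particular format.

```python
# Hypothetical matrix: requirement id -> system functions and exercising tests.
matrix = {
    "5.2.3-B": {"functions": ["ballot counter"], "tests": ["CAP-001", "CAP-002"]},
    "5.2.3-E": {"functions": ["ballot rendering"], "tests": ["LANG-001"]},
    "5.2.3-F": {"functions": ["error handling"], "tests": []},
}

def untested_requirements(matrix):
    """Requirements for which no exercising test has been recorded."""
    return sorted(req for req, row in matrix.items() if not row["tests"])

def requirements_exercised_by(matrix, test_id):
    """Requirements that a given test is recorded as exercising."""
    return sorted(req for req, row in matrix.items() if test_id in row["tests"])

gaps = untested_requirements(matrix)
```

Keeping the matrix in a structured form makes coverage gaps, such as the requirement with an empty test list here, easy to report before the test campaign begins.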
5.2.3-I Pass criteria for functional testing
Pass criteria for tests that are adopted from a canonical functional test suite are defined by that test suite. For all other tests, the test lab SHALL define pass criteria using the VVSG (for standard functionality) and the manufacturer-supplied system documentation (for implementation-specific functionality) to determine acceptable ranges of performance.
Applies to: Voting system
DISCUSSION
Since the nature of the requirements specified by the manufacturer-supplied system documentation is unknown, conformity for implementation-specific functionality may be subject to interpretation. Nevertheless, egregious disagreements between the behavior of the system and the behavior specified by the manufacturer should lead to a defensible adverse finding.
Source: [VSS2002]/[VVSG2005] II.A.4.3.4
5.3 Benchmarks
5.3.1 General method
Reliability, accuracy, and misfeed rate are measured using ratios, each of which is the number of some kind of event (failures, errors, or misfeeds, respectively) divided by some measure of voting volume. The test method discussed here is applicable generically to all three ratios; hence, this discussion will refer to events and volume without specifying a particular definition of either.
By keeping track of the number of events and the volume over the course of a test campaign, one can trivially calculate the observed cumulative event rate by dividing the number of events by the volume. However, the observed event rate is not necessarily a good indication of the true event rate. The true event rate describes the expected performance of the system in the field, but it cannot be observed in a test campaign of finite duration, using a finite-sized sample. Consequently, the true event rate can only be estimated using statistical methods.
In accordance with the current practice in voting system testing, the system submitted for testing is assumed to be a representative sample, so the variability of devices of the same type is out of scope.
The test method makes the simplifying assumption that events occur in a Poisson distribution, which means that the probability of an event occurring is assumed to be the same for each unit of volume processed. In reality, there are random events that satisfy this assumption but there are also nonrandom events that do not. For example, a logic error in tabulation software might be triggered every time a particular voting option is used. Consequently, a test campaign that exercised that voting option often would be more likely to indicate rejection based on reliability or accuracy than a test campaign that used different tests. However, since these VVSG require absolute correctness of tabulation logic, the only undesirable outcome is the one in which the system containing the logic error is accepted. Other evaluations specified in these VVSG, such as functional testing and logic verification, are better suited to detecting systems that produce nonrandom errors and failures. Thus, when all specified evaluations are used together, the different test methods complement each other and the limitation of this particular test method with respect to nonrandom events is not bothersome.
For simplicity, all three cases (failures, errors, and misfeeds) are modeled using a continuous distribution (Poisson) rather than a discrete distribution (Binomial). In this application, where the probability of an event occurring within a unit of volume is small, the difference in results from the discrete and continuous models is negligible.
The problem is approached through classical hypothesis testing. The null hypothesis ($H_0$) is that the true event rate, $r_t$, is less than or equal to the benchmark event rate, $r_b$ (which means that the system is conforming).
$$H_0: r_t \le r_b$$
The alternative hypothesis ($H_1$) is that the true event rate, $r_t$, is greater than the benchmark event rate, $r_b$ (which means that the system is non-conforming).
$$H_1: r_t > r_b$$
Assuming an event rate of $r$, the probability of observing $n$ or fewer events for volume $v$ is the value of the Poisson cumulative distribution function.
$$P(n, rv) = \sum_{x=0}^{n} \frac{e^{-rv} (rv)^x}{x!}$$
Let $n_o$ be the number of events observed during testing and $v_o$ be the volume produced during testing. The probability $\alpha$ of rejecting the null hypothesis when it is in fact true is limited to be less than 0.1. Thus, $H_0$ is rejected only if the probability of $n_o$ or more events occurring given a (marginally) conforming system is less than 0.1. $H_0$ is rejected if $1 - P(n_o - 1, r_b v_o) < 0.1$, which is equivalent to $P(n_o - 1, r_b v_o) > 0.9$. This corresponds to the 90th percentile of the distribution of the number of events that would be expected to occur in a marginally conforming system.
If at the conclusion of the test campaign the null hypothesis is not rejected, this does not necessarily mean that conformity has been demonstrated. It merely means that there is insufficient evidence to demonstrate non-conformity with 90 % confidence.
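The rejection rule described above can be sketched in a few lines. The function names are hypothetical, and the benchmark rate and volume in the example are invented values chosen so that a marginally conforming system would be expected to produce one event.

```python
import math

def poisson_cdf(n, mean):
    """Probability of n or fewer events when `mean` events are expected."""
    if n < 0:
        return 0.0
    return sum(math.exp(-mean) * mean**k / math.factorial(k)
               for k in range(n + 1))

def reject_conformity(n_obs, v_obs, r_bench, alpha=0.1):
    """Return True if the null hypothesis (true rate <= benchmark) is rejected.

    Rejection occurs when the probability of observing n_obs or more
    events in a marginally conforming system is below alpha.
    """
    return poisson_cdf(n_obs - 1, r_bench * v_obs) > 1 - alpha

r_bench = 1e-4   # assumed benchmark event rate (illustrative)
v_obs = 10000    # assumed test volume (illustrative)

two_events = reject_conformity(2, v_obs, r_bench)
three_events = reject_conformity(3, v_obs, r_bench)
```

With these assumed values, two observed events do not lead to rejection but three do, because the probability of two or fewer events in a marginally conforming system is about 0.92.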
Calculating what has been demonstrated with 90 % confidence, after the fact, is completely separate from the test described above, but the logic is similar. Suppose there are $n_o$ observed events after volume $v_o$. Solving the equation $P(n_o, r_d v_o) = 0.1$ for $r_d$ finds the "demonstrated rate" $r_d$ such that if the true rate $r_t$ were greater than $r_d$, then the probability of having $n_o$ or fewer events would be less than 0.1. The value of $r_d$ could be greater or less than the benchmark event rate $r_b$ mentioned above.
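The equation for the demonstrated rate has no closed-form solution in general, but it can be solved numerically. The sketch below uses bisection, which is one reasonable choice and not a method prescribed by these VVSG; the function names and the example volume are assumptions.

```python
import math

def poisson_cdf(n, mean):
    """Probability of n or fewer events when `mean` events are expected."""
    return sum(math.exp(-mean) * mean**k / math.factorial(k)
               for k in range(n + 1))

def demonstrated_rate(n_obs, v_obs, alpha=0.1):
    """Solve poisson_cdf(n_obs, r_d * v_obs) = alpha for r_d by bisection.

    The CDF decreases as the expected number of events grows, so the
    search brackets the expected count and then halves the bracket.
    """
    lo, hi = 0.0, 1.0
    while poisson_cdf(n_obs, hi) > alpha:   # widen until the root is bracketed
        hi *= 2
    for _ in range(200):
        mid = (lo + hi) / 2
        if poisson_cdf(n_obs, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2 / v_obs

v_obs = 10000                            # assumed test volume (illustrative)
rate_zero_events = demonstrated_rate(0, v_obs)
rate_two_events = demonstrated_rate(2, v_obs)
```

When no events are observed, the equation reduces to exp(-r_d v_o) = 0.1, so the demonstrated rate is ln(10) / v_o, or about 2.3 events per v_o units of volume. Observing more events at the same volume yields a higher (worse) demonstrated rate.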
Please note that the length of testing is determined in advance by the approved test plan. To adjust the length of testing based on the observed performance of the system in the tests already executed would bias the results and is not permitted. A Probability Ratio Sequential Test (PRST) [Wald47][Epstein55][MIL96] as was specified in previous versions of these VVSG varies the length of testing without introducing bias, but practical difficulties result when the length of testing determined by the PRST disagrees with the length of testing that is otherwise required by the test plan.
9 Comments
Comment by Gail Audette (Voting System Test Laboratory)
The benchmarks are determined by the "observed event"; however, this critical value is not defined. It is the basis for benchmarking and must be defendable. As it is currently stated this is not defendable.
Comment by U.S. Public Policy Committee of the Association for Computing Machinery (USACM) (None)
Amendment 2 to: USACM Comment #25. Section 5.3.1 General method [incorrect] Reference [2] Musa, Software Reliability Engineering (http://members.aol.com/JohnDMusa/book.htm)
Comment by Cem Kaner (Academic)
VVSG should not estimate reliability (or acceptability in any other way) of software by calculating the number of failures (test events?) divided by the number of tests (test volume?). It is too easy to influence this estimator by including large numbers of easy-to-pass tests. .......... (Affiliation Note: IEEE representative to TGDC)
Comment by Cem Kaner (Academic)
Lab tests focused on conformance testing do not model usage patterns in the field and therefore test results based on them cannot estimate failure rates in the field. This is not a defensible method for estimating reliability. .......... It may be possible to develop operational profiles from which reliability tests could be developed but this will require extensive research that would not be part of the approval process of any particular voting system. .......... (Affiliation Note: IEEE representative to TGDC)
Comment by U.S. Public Policy Committee of the Association for Computing Machinery (USACM) (None)
USACM Comment #25. Section 5.3.1 General method [incorrect] USACM recommends correcting the factual errors present in this subsection, as minimally enumerated below:
1. The system submitted for testing might be a representative sample of the pool of identical hardware/software systems, but the pool of tests should not be a representative sample of the events that happen during an election.
2. There is no reason to expect software reliability, software accuracy, and hardware misfeed rate to follow the same distribution.
3. The Poisson distribution is discrete, not continuous.
4. The Poisson process typically assumes a stationary underlying exponential distribution. The idea that software reliability, software accuracy, and hardware misfeed rates follow the same underlying distribution, or that the concatenation of these three (if there are only three) distributions would be anything like exponential is remarkable in its unlikelihood.
5. The observed event rate ("events" divided by "volume" over the course of a test campaign) is a highly biased measure.
a. The first problem is that a regression test suite repeats the same tests from build to build. This gives rise to the classic problem of the "pesticide paradox" [ ]. The test suite is a tiny sample of the collection of possible tests. When the suite reveals bugs, they are fixed. Ultimately, the test suite becomes a collection of tests that have one thing in common: the software has passed all of them at least once. This differs from almost every other possible test (all of the ones that have not been run). Therefore, the reliability of the software is probably vastly overestimated.
b. The second problem is that the pool of tests is structured to cover a specification. It does not necessarily target vulnerabilities of the software. Nor is it designed to reflect usage frequencies in the field [ ].
6. Determining the length of testing in advance by an approved test plan sounds scientific, but many practitioners consider this software testing malpractice. There is substantial evidence that bugs cluster. Given a failure, there is reason to do follow-up testing to study this area of the product in more detail. Rigidly adhering to a plan created in the absence of failure data is to rigidly reject the idea of follow-up testing. This underestimates the number of problems in the code. Worse, this testing method reduces the chance that defects will be found and fixed because it reduces, essentially bans, the follow-up testing that would expose those additional defects.
Comment by Cem Kaner (Academic)
VVSG should not model software failure rates with a Poisson distribution or a Poisson process, or with any other distribution or stochastic process unless that distribution or process is derived from a logically and empirically-defensible model. .......... There is no reason to think that a mixture distribution that combines hardware and software events would be a simple Poisson. It is important to recognize that we are estimating performance in the tails of the distribution. To the extent that the true underlying distribution differs from the Poisson, deviations are particularly likely in the tails, yielding overestimates or underestimates of the significance of a given number of failures in a given period of activity. The statistical tables may have validity for hardware-related failures or for misfeeds, but there is no reason to think they would be valid for software or for the mixture distribution. .......... Robert Austin [Measuring and Managing Performance in Organizations, Dorset, 1996] wrote a particularly compelling discussion of the risks associating with basing high-stakes decisions on metrics that are not tightly tied to the underlying attribute that the metric attempts to estimate. Equipment vendors have a strong interest in making their numbers look good, or at least good enough to pass testing, and therefore they have an incentive to optimize their behavior in ways that improve the numbers. They also have an incentive to challenge the ways in which the numbers (total failures, total volume) are calculated, to arrive at a result that is more favorable. .......... I assume in all that follows that the manufacturers are acting in good faith. When we tell someone that their performance will be passed or failed according to a criterion, there is nothing dishonest in optimizing efforts to meet that criterion. If anything, that is what the criterion is there to accomplish. 
In particular, you should read the comments that follow with the understanding that I am explicitly and intentionally assuming that the vendors will be factually honest in everything that they do and that they are primarily motivated to achieve a "pass" from the system and not particularly motivated to do so in such a way as to mislead anyone about the underlying quality of the product. .......... * The VVSG (5.3.1) correctly notes that one of the characteristics of the Poisson model is that the probability of an event occurring stays constant over each "unit of volume processed." It then notes that this is not exactly correct for software because software errors might be nonrandom, that is, they might be triggered every time the same set of conditions is tested. It then dismisses this problem by saying "Thus, when all specific evaluations are used together, the different test methods complement each other and the limitation of the particular test method with respect to nonrandom events is not bothersome." I think this is a novel conclusion. I do not understand how mixing nonrandom events with random ones (to the extent that there are random failures in software) is a good foundation for a model that assumes that all events are random. .......... (a) For most software failures, the failure itself is not random at all. Given THESE conditions, THAT failure will occur. What might be thought of as random is whether and when the particular test that includes those conditions is presented to the software. The probability that a given test will yield a failure thus depends on at least two factors: how many problems remain in the software and how powerful the test is with respect to the types of problems that remain. As the software goes through testing, problems are fixed, and so the number of remaining problems diminishes. Therefore the assumption that the rate parameter of the Poisson distribution is stationary is implausible. .......... 
(b) The power of software tests run is not related to the underlying reliability of the software. Test power is (analogous to the power of a statistical test) the ability of the test to detect an error of a certain type if it is there. Note that no test has "absolute" power—a test that is optimized to expose an off-by-one error might be a weak detector of rounding errors. Thus, a lab can achieve a low failure rate (high apparent reliability) by running relatively low-power tests and a high failure rate (low apparent reliability) by running relatively high-power tests. Regression tests lose their power as they are used repeatedly, because the errors they are optimized to detect get found and fixed. (This problem was labeled the "pesticide paradox" by Boris Beizer in Software Testing Techniques, Van Nostrand, 2nd Edition, 1990; see also Kaner, Testing Computer Software, McGraw-Hill, 1987, p. 94). The improvement in apparent reliability with repeated use of regression tests should not be expected to predict improvement of reliability in the field, because users in the field do other things with the software beyond running these particular regression tests. Varying testing, for example by changing parameter values, combining tests, or running the tests in long random sequences, probably does a better job of mitigating operational risk but under the VVSG benchmarking definition, this testing will drive down estimated reliability at the same time as it contributes to the actual improvement of reliability. .......... (c) If there are B bugs in the software and we find a bug with 100% certainty if we run a specific test (or ones sufficiently like it), the probability of detecting one of the errors boils down to the probability that an error-revealing test makes it into the test suite. 
That depends on the sampling strategy (any test design strategy can be seen as a sampling strategy), whose details are under the control of the test lab, with some influence by the vendor and the VVSG. It is not clear what this sampling strategy has to do with the underlying reliability of the software. The VVSG-specified rate parameter probably has more to do with this sampling strategy than with operational reality. .......... (d) A Poisson process model for failures makes several assumptions. The first that I noted is that the probability of discovery of a failure is constant over time. This is implausible because the program presumably gets more stable, and the tests (if they are the fully scripted regression tests required by VVSG) get less powerful over time. The second is that instants of time (or units of volume) (that is, tests) are independent. This is also implausible. A widely reported pattern in test data is that some modules are much more error prone than others. Presumably, this is due to the inherent difficulty of some problems, the tremendous variations in individual programmer competence, perhaps a difference in time pressure associated with completing some tasks compared to others, etc. A sensible testing strategy adds new tests to further investigate areas that have shown some failures. If this is done, the probability of these tests exposing problems is relatively high, but that is a conditional probability—test X2 has a high probability of exposing a problem because test X1 did expose a problem. This is precisely the opposite of the assumption of independence between X1 and X2. Of course, one can preserve apparent independence of tests by never adding new tests to more carefully study areas of the program that seem weak. However, if the objective is to check the quality of the software, this restriction (no follow-up with related tests) would be bad testing, in conflict with the objective. 
Another problem for the idea of independence is the problem of identicality. Do we really think that the same test, run a second or third or fourth time (regression testing) should be treated as an independent sample from the pool of possible tests? .......... (e) Another problem with the Poisson process model is that some bugs are inherently harder to detect than others. Thus, if we have B bugs, the probability of detecting each one is not the same. It is usually easier to find a bug if it depends only on one feature or one parameter of a feature. A relatively simple test will do the trick. The only risk of obscurity is the possibility that only one value of the parameter would lead to failure. Special cases do exist, but they are often (and not always) at boundaries that are either visible externally or on review of the code. VVSG requires testing at boundaries and therefore most (and not all) of the single-variable special-case bugs are probably covered. However, some bugs involve combinations of two, three, or more variables or functions. Some of those variables might be relative timing of events (race conditions) or amount of free memory when a given task is attempted or access to some other resource. These are harder to detect with simple tests. .......... (f) Even the assumption that one test can only expose one defect is empirically challengeable. Unless a test is so successfully focused on the processing of one variable by one method that multiple problems are impossible (unit testing can achieve this, but not system testing), a given test might trip first over one feature and next over another. A test that combines 10 features might yield 10 (or more!) failures. This is not a merely theoretical possibility. It is a common heuristic in system testing that testing should start with single-feature tests and progress to relatively simple multi-feature combinations and then progress to user-meaningful rich scenarios. 
This is based on experience: companies that do mainly the multi-function scenario testing often find their tests blocked—a failure in the first steps of the test blocks continuation to the later steps. After the first bug is fixed, the test fails again, in a way that blocks further execution, and then when used again, it fails again. In the practitioner community, there are many anecdotes of bugs that should have been easy to find being found very late in testing because the planned test that finally exposed the bug was blocked by other bugs. Thus, we have commonplace examples of tests exposing many more than one defect. .......... In sum, there is no reason to think that any of the assumptions underlying a Poisson process model apply. .......... The VVSG provides a table for determining critical values associated with the ratio of the number of test events to the test volume. The idea is that even if the Poisson distribution is not a perfect estimator, perhaps it is a good first approximation. I am not a professional statistician, but I do have about 8-12 semesters of probability/statistics courses, a few more on modeling, and some practical research experience. My understanding is that if two distributions are similarly shaped, using one as an approximation of the other is possible—but the relative differences are likely to grow as you go out to the tails. That is, similar distributions often differ most in their assignment of probabilities to lower-probability events. A 90% criterion value is pretty far out in the tail of the distribution. If none of the assumptions of the Poisson model apply to software testing, it is hard to believe that numbers taken from the tail of that distribution accurately predict much about the system under test. .......... Here are some other problems associated with the VVSG's estimator of software reliability: .......... 
(1) As far as I can tell, in its treatment of reliability estimation, the VVSG assumes that the test volume is a fixed value, not itself a random variable. This is only true if one set of tests is run once and no other tests or other events are considered. Given that there will be regression tests, this is not true. Even if we count each regression test only once, no matter how many times it is run (but that is unfair if the same test later exposes a different bug), a competent test lab does additional testing around any bug reported fixed. That is, if a given set of test conditions exposes a failure, and the equipment vendor fixes the bug and returns a new version of the software, a competent tester will not only test the fix with the original test that exposed the bug but will create new tests to see whether the fix actually covered the underlying problem. These can expose new problems and so they must be new units of test volume. Of course, to preserve a fixed volume of testing, we could choose not to allow such testing. However, as in many cases considered above, it might be highly undesirable to allow an incorrect model to be used as an excuse to constrain the power of testing. .......... (2) The VVSG assumes that the test results obtained in conformance testing can be used as a sound statistical estimator of the population reliability (see 5.3.1). This assumption is unreasonable. The reliability of the voting equipment software, in the field, will depend on how the software is used in the field. The tests designed for conformance testing are not designed with an objective of mapping to field usage. They are designed to achieve a level of simple coverage of the code, another level of simple coverage of documented requirements, another level of coverage of boundary values of individual variables, and so on. 
There are sound statistical methods available for estimating the reliability of the software in the field (see, for example, Musa, Software Reliability Engineering, McGraw-Hill, 1998), but they start from development of operational profiles—profiles of ways in which people will actually use the software. The next task is estimation of relative frequency of occurrence of each profile—a usage pattern twice as likely in the field should be involved in twice as many of the reliability tests. From here, one generates a large pool of tests, deriving each test from one of the profiles (varying specific parameters, or sequences of operation in ways consistent with the profile). Ideally, that generation should be itself driven by a random process that reflects usage patterns. From there, failure rate over the sample of tests might well be a valid estimator of field failure rate. If it is important to have a failure rate estimator, it is important to have a number that bears a defensible relationship to the underlying parameter. .......... (3) Development of operational profiles is an expensive proposition. Some vendors (such as AT&T and Microsoft) have access to customer usage patterns and, at significant expense, can develop profiles on their own. It is not clear that voting equipment vendors have this level of access to customer usage of their own equipment. In addition, the better study might be of usage of voting equipment generally, across vendors. If the profiles are essentially the same across vendors and equipment models, the same profiles can be used with new models as they are introduced, rather than requiring a hugely burdensome (in time and money) research program for each new model. Rather than requiring voting equipment vendors to do this type of research, it might make more sense for NIST (or some other agency) to fund independent (e.g. university-based) research to develop such profiles and assess their commonality across devices. 
This will take some number of years. Until those profiles are developed and usable, I think it is inadvisable to predicate any decisions on estimators of software reliability. .......... (5) If we assume that the TEST CAMPAIGN includes all tests done by the independent test lab, then the campaign includes all regression tests, no matter how many times these are repeated. Suppose that a given test is repeated ten times. When we compute the TEST VOLUME of the campaign, is this 10 tests or 1? .......... (6) It is one thing to say that the lab cannot qualify a device based on the testing of a prototype. It is another thing to bar the vendor from submitting a prototype to the lab for evaluation. Evaluating prototypes gives the lab an opportunity to build expertise with the system under test, making its ultimate testing of the final version more effective. And it gives the vendor an opportunity to discover the weaknesses it is blind to, enabling it to fix problems earlier in the development cycle. It is widely believed in the software engineering community that earlier testing improves quality and reduces costs. While VVSG should not require vendors to submit early versions to the lab (there may be more cost effective ways to evaluate early versions), surely it should not ban it. If a vendor does submit an early version for testing, do those tests count as part of the test campaign? Do those failures count in the ultimate total of test events? .......... (7) Suppose two equipment manufacturers have equivalent internal processes, in terms of the quality and functionality of their software, and (for simplicity) equivalent products. One submits its software to the independent test lab a little earlier in development than the other. The first submitter goes to the lab with a few more bugs and goes through one or more rounds of regression testing. Ultimately, the same bugs are found and fixed in both systems. 
Thus, at the end of testing, we have two equivalently reliable systems. What is the effect on the numbers? If the test campaign counts each time a regression test is run as a separate test, then the first submitter is increasing the measured test volume enormously by submitting early. If the product has only a few more bugs than the product submitted by the late submitter, then even though the first submitter’s test event total will be higher, its ratio of test events to test volume will be lower. In contrast, if regression tests are not counted twice but fixed bugs are counted as test events, then the incentive will go to the vendor who waits until the last possible minute to submit product to the lab. If we want VVSG to drive this strategy as a matter of policy, VVSG should explicitly consider and state the policy and the policy choice should be publicly reviewed. Instead, the method of calculation creates an implicit policy. .......... (8) To the extent that test volume is left loosely defined, the estimated reliability will vary enormously depending on how the test lab (paid by the vendor) computes the test volume. A rational vendor would spend effort advocating for the largest possible interpretation of volume, so as to make the denominator as large as possible. .......... (9) Consider applying a high-volume test strategy to the testing of the device. High-volume strategies have been used effectively for automotive software, telephone switching software, firmware in office automation products, and undoubtedly many other contexts. I will emphasize my own work below, and other work I am personally familiar with, not because I think it is the best in the field but because I can write with authority about the underlying observations. High-volume testing is a well-funded, fashionable area of work. 
Examples are state-model-based testing that executes long sequences of sub-tests, each involving a controlled state transition; testing using genetic algorithms; search-based testing, in which the test sequence involves test values chosen to be different from each other in a specified way (e.g. maximally dissimilar from the previous tests in the sequence); random-input tests or random-event tests in which a random source generates data or traffic for a long period or until the system crashes; and various types of extreme value attacks (heavy load, big input, extreme combinations, corrupted files) that string many individual tests into one long, grueling sequence of harsh tests. These are often done as security tests today, but they were seen as tests of robustness twenty years ago. These are not fundamentally new ideas. The concerns I raise below in the context of the testing types that I mention apply just as well to all of these other methods. Given that preamble, consider a specific example that I know well: suppose the lab applies long-sequence randomized regression testing (LSRRT), which McGee & Kaner ("Experiments with High Volume Test Automation," Workshop on Empirical Research in Software Testing, ACM SIGSOFT Software Engineering Notes, 29(5), 1-3, 2004) discussed under the label "extended random regression". In LSRRT, you take a set of tests that a particular build of the software has passed individually and string them together in an arbitrarily long random sequence. The key advantage of LSRRT over many other tests that push a device through very long sequences of tests is that the expected results of each test are known and therefore failures can be detected in terms of unexpected responses rather than waiting until the software crashes. An unexpected response might be unexpected data, but it also might be unexpected behavioral timing. Oracle Corporation used a method like this in its early qualification of its database, for example. 
If a test that took T1 time to complete at one point in testing took T2 (much longer or much shorter) time a bit later, system engineers investigated the cause of the difference, often finding coding errors. (Unpublished oral personal communication from Bob Miner, 1987) As McGee & Kaner reported in their short case study summary, LSRRT exposed a large number of serious problems that were not being exposed by the individual tests themselves. Similarly, I have seen serious failures exposed in a different type of long-sequence testing by a PBX manufacturer whose code had gone through thorough unit testing. Stack corruption that built up over time, memory corruption triggered by particular subsequences of events or particular combinations of data, race conditions involving unexpected busy-ness of one of the processors in a multiprocessor system--these are examples of the kinds of problems exposed by long-sequence testing that are much harder to find by testing with one distinct functional test at a time. During an election, a voting system has to run without failure for many hours. Long-sequence testing addresses the question of operation over that long period. One-functional-test-at-a-time testing does not. Should a test lab employ this style of testing? If so, how should we count the test volume? At "Mentsville" (a fictitious name, requested by the well-known equipment manufacturer whose processes McGee and Kaner studied), LSRRT was often restricted to 25 distinct tests that were repeated in a random order. Fewer than 25 wasn’t seen as diverse enough. Many more than 25 made troubleshooting a failure a nightmare. (Why? Remember that the system can pass each test on its own, so the secret of the failure lies somewhere in the sequencing. If failure occurs after 48 hours of apparently-trouble-free operation, analysis of that sequence can be very complex. Limiting the number of distinct tests in the sequence was one way to limit that complexity.) 
Suppose that the test lab runs a 50-test LSRRT for 12 hours, i.e. a sequence that repeats 50 regression tests in random order until testing is terminated by failure or by successful completion of a 12-hour run. Suppose that on average, each of the 50 tests runs 100 times. Is this test volume 1, 50, or 5000? If the test volume is 1, equipment vendors will have a strong incentive to argue that very little of this testing should be done, because this is a very harsh style of testing. If test volume is 5000, equipment vendors will have a strong incentive to encourage the lab to do lots of this type of testing. I submit that the decision to apply this style of testing, the amount of testing to be done, and the characteristics of the tests combined in each suite should be based on other factors than the calculation of test-events/test-volume, but that calculation will drive potentially harsh debates. In practice, I have been told by testers of regulated products that they don’t do long sequence testing specifically because the metrics are impossible to agree on. Benchmark-estimation rules should NEVER drive decisions about what style of testing would be most effective for illuminating the risks associated with a product. .......... (Affiliation Note: IEEE representative to TGDC)
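To make the counting question in the comment above concrete, the following is a minimal sketch in Python of how much the demonstrated event rate would move under the three candidate volumes. It assumes the 12-hour run completes with zero observed events and that the zero-event factor from Part 3: Table 5-1 (the n = 0 row at p = 0.1, which equals ln 10) is applied as described in 5.3.2. The names `ZERO_EVENT_FACTOR`, `demonstrated_rate` and `candidate_volumes` are illustrative and do not come from the VVSG.

```python
import math

# Factor for zero observed events at p = 0.1 (n = 0 row of Table 5-1).
# For n = 0 it equals ln(10), about 2.302585.
ZERO_EVENT_FACTOR = math.log(10)


def demonstrated_rate(volume):
    """Demonstrated event rate when no events are observed over `volume`."""
    return ZERO_EVENT_FACTOR / volume


# Three candidate ways of counting the same hypothetical 12-hour run.
candidate_volumes = {
    "one run": 1,
    "distinct tests": 50,
    "test executions": 5000,
}

rates = {label: demonstrated_rate(v) for label, v in candidate_volumes.items()}

for label, rate in rates.items():
    print(f"{label:>16}: demonstrated rate = {rate:.6e}")
```

Under these assumptions the same run yields demonstrated rates that differ by a factor of 5000 depending only on the counting convention, which is the sensitivity the comment describes.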
Comment by Cem Kaner (Academic)
It is inappropriate to treat software regression tests as if they were a representative sample of the behavior of the system under test because the system is optimized to pass them as they are repeatedly run. The more times they are run, the less predictive power they have with respect to other tests that involve other data, other combinations of functions, or other sequences of events. .......... (Affiliation Note: IEEE representative to TGDC)
Comment by Cem Kaner (Academic)
As it applies to software, this section's terminology is ambiguous or undefined. What is a test event? What is a test volume? What is a test campaign? .......... (Affiliation Note: IEEE representative to TGDC)
Comment by U.S. Public Policy Committee of the Association for Computing Machinery (USACM) (None)
Amendment to: USACM Comment #25. Section 5.3.1 General method [incorrect] [1] Boris Beizer, Software Testing Techniques, Second Edition, 1990
5.3.2 Critical values
For a fixed probability $p$ and a fixed value of $n$, the value of $rv$ satisfying $P(n, rv) = p$ is a constant. Part 3: Table 5-1 provides the values of $rv$ for $p = 0.1$ and $p = 0.9$ for $0 \le n \le 750$.
Given $n_o$ observed events after volume $v_o$, the demonstrated event rate $r_d$ is found by solving $P(n_o, r_d v_o) = 0.1$ for $r_d$. The pertinent factor is in the second column ($p = 0.1$) in the row for $n = n_o$; dividing this factor by $v_o$ yields $r_d$. For example, a volume of 600 with zero observed events demonstrates an event rate of $2.302585/600$, or $3.837642 \times 10^{-3}$.
Since the condition for rejecting $H_0$ is $P(n_o - 1, r_b v_o) > 0.9$, the critical value $v_c$, which is the minimum volume at which $H_0$ is not rejected for $n_o$ observed events and event rate benchmark $r_b$, is found by solving $P(n_o - 1, r_b v_c) = 0.9$ for $v_c$. The pertinent factor is in the third column ($p = 0.9$) in the row for $n = n_o - 1$; dividing this factor by $r_b$ yields $v_c$. For example, if a test with event rate benchmark $r_b = 10^{-4}$ resulted in one observed event, then the system would be rejected unless the actual volume was at least $0.1053605/10^{-4}$, or 1053.605. Where the measurement of volume is discrete rather than continuous, one would round up to the next integer.
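For the n = 0 row the factors have a closed form, which allows the two worked examples above to be checked by hand. The derivation below is a sketch that assumes P(n, rv) is the Poisson cumulative probability of n or fewer events when the expected count is rv, consistent with the call to poisson_cdf in the generating script.

```latex
P(0, rv) = e^{-rv} = p \quad\Longrightarrow\quad rv = -\ln p

p = 0.1:\quad rv = \ln 10 \approx 2.302585, \qquad
r_d = \frac{2.302585}{600} \approx 3.837642 \times 10^{-3}

p = 0.9:\quad rv = -\ln 0.9 \approx 0.1053605, \qquad
v_c = \frac{0.1053605}{10^{-4}} = 1053.605
```

For n greater than zero there is no such closed form, which is why the table values are obtained with a numerical root finder.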
The values in Part 3: Table 5-1 were generated by the following script and Octave[2] version 2.1.73.
silent_functions=1

# Function for the root finder to zero. fsolve won't pass extra
# parameters to the function being solved, so we must use globals.
# nGlobal is number of events; pGlobal is probability.
function rvRootFn = rvRoot (rv)
  global nGlobal pGlobal
  rvRootFn = poisson_cdf (nGlobal, rv) - pGlobal
endfunction

# Find rv given n and p. To initialize the root finder, provide
# startingGuess that is greater than zero and approximates the
# answer.
function rvFn = rv (n, p, startingGuess)
  global nGlobal pGlobal
  nGlobal = n
  pGlobal = p
  startingGuess > 0 || error ("bad starting guess")
  [rvFn, info] = fsolve ("rvRoot", startingGuess)
  if (info != 1)
    perror ("fsolve", info)
  endif
endfunction

function table
  printf (" n P=0.1 P=0.9\n")
  for n = 0:750
    rv01 = rv (n, 0.1, -4.9529e-05*n*n + 1.0715*n + 2.302585093)
    rv09 = rv (n, 0.9, 4.9522e-05*n*n + 0.9285*n + 0.105360516)
    printf ("%3u %.6e %.6e\n", n, rv01, rv09)
  endfor
endfunction

fsolve_options ("tolerance", 5e-12)
table
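As an independent cross-check of the first few table rows, the following is a minimal sketch in Python, using only the standard library. It is not the script that generated Table 5-1: it replaces fsolve with plain bisection, and the names `poisson_cdf` and `rv_factor` are illustrative. The direct summation is an assumption that only holds up for small n, because `math.exp(-mean)` underflows to zero once the mean reaches several hundred.

```python
import math


def poisson_cdf(n, mean):
    """Probability of at most n events for a Poisson variable with this mean."""
    term = math.exp(-mean)  # probability of exactly 0 events
    total = term
    for k in range(1, n + 1):
        term *= mean / k    # probability of exactly k events
        total += term
    return total


def rv_factor(n, p, tol=1e-12):
    """Solve poisson_cdf(n, rv) == p for rv by bisection.

    The CDF decreases as the mean grows, so the root is bracketed by
    doubling the upper bound until the CDF drops to p or below.
    """
    lo, hi = 0.0, 1.0
    while poisson_cdf(n, hi) > p:
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = (lo + hi) / 2.0
        if poisson_cdf(n, mid) > p:
            lo = mid  # mean too small, CDF still above p
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    print(" n        P=0.1        P=0.9")
    for n in range(4):
        print(f"{n:2d} {rv_factor(n, 0.1):.6e} {rv_factor(n, 0.9):.6e}")

    # The two worked examples from 5.3.2, recomputed from the n = 0 factors.
    print("demonstrated rate:", rv_factor(0, 0.1) / 600)
    print("critical volume:  ", rv_factor(0, 0.9) / 1e-4)
```

Bisection is slower than the root finder in the script above but needs no starting guess, which keeps the sketch short. The printed factors can be compared against the corresponding rows of Table 5-1.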
Table 5-1 Factors for calculation of critical values
| n | rv satisfying P(n,rv)=0.1 | rv satisfying P(n,rv)=0.9 | n | rv satisfying P(n,rv)=0.1 | rv satisfying P(n,rv)=0.9 | n | rv satisfying P(n,rv)=0.1 | rv satisfying P(n,rv)=0.9 |
|---|---|---|---|---|---|---|---|---|
| 0 | 2.302585 | 0.1053605 | 251 | 272.5461 | 231.8821 | 501 | 530.9192 | 473.509 |
| 1 | 3.88972 | 0.5318116 | 252 | 273.5864 | 232.8418 | 502 | 531.9478 | 474.4804 |
| 2 | 5.32232 | 1.102065 | 253 | 274.6267 | 233.8015 | 503 | 532.9764 | 475.4519 |
| 3 | 6.680783 | 1.74477 | 254 | 275.6669 | 234.7613 | 504 | 534.0049 | 476.4233 |
| 4 | 7.99359 | 2.432591 | 255 | 276.707 | 235.7212 | 505 | 535.0334 | 477.3948 |
| 5 | 9.274674 | 3.151898 | 256 | 277.747 | 236.6812 | 506 | 536.0619 | 478.3663 |
| 6 | 10.53207 | 3.894767 | 257 | 278.787 | 237.6412 | 507 | 537.0904 | 479.3379 |
| 7 | 11.77091 | 4.656118 | 258 | 279.8269 | 238.6013 | 508 | 538.1188 | 480.3094 |
| 8 | 12.99471 | 5.432468 | 259 | 280.8667 | 239.5615 | 509 | 539.1472 | 481.2811 |
| 9 | 14.20599 | 6.221305 | 260 | 281.9064 | 240.5218 | 510 | 540.1755 | 482.2527 |
| 10 | 15.40664 | 7.020747 | 261 | 282.946 | 241.4822 | 511 | 541.2039 | 483.2243 |
| 11 | 16.59812 | 7.829342 | 262 | 283.9856 | 242.4426 | 512 | 542.2322 | 484.196 |
| 12 | 17.78159 | 8.645942 | 263 | 285.0251 | 243.4031 | 513 | 543.2605 | 485.1677 |
| 13 | 18.95796 | 9.469621 | 264 | 286.0645 | 244.3637 | 514 | 544.2887 | 486.1395 |
| 14 | 20.12801 | 10.29962 | 265 | 287.1039 | 245.3243 | 515 | 545.317 | 487.1113 |
| 15 | 21.29237 | 11.1353 | 266 | 288.1432 | 246.2851 | 516 | 546.3452 | 488.0831 |
| 16 | 22.45158 | 11.97613 | 267 | 289.1824 | 247.2459 | 517 | 547.3734 | 489.0549 |
| 17 | 23.60609 | 12.82165 | 268 | 290.2215 | 248.2067 | 518 | 548.4015 | 490.0267 |
| 18 | 24.75629 | 13.67148 | 269 | 291.2605 | 249.1677 | 519 | 549.4296 | 490.9986 |
| 19 | 25.90253 | 14.52526 | 270 | 292.2995 | 250.1287 | 520 | 550.4577 | 491.9705 |
| 20 | 27.0451 | 15.38271 | 271 | 293.3384 | 251.0898 | 521 | 551.4858 | 492.9424 |
| 21 | 28.18427 | 16.24356 | 272 | 294.3773 | 252.0509 | 522 | 552.5138 | 493.9144 |
| 22 | 29.32027 | 17.10758 | 273 | 295.416 | 253.0122 | 523 | 553.5418 | 494.8864 |
| 23 | 30.4533 | 17.97457 | 274 | 296.4547 | 253.9735 | 524 | 554.5698 | 495.8584 |
| 24 | 31.58356 | 18.84432 | 275 | 297.4934 | 254.9349 | 525 | 555.5978 | 496.8304 |
| 25 | 32.71121 | 19.71669 | 276 | 298.5319 | 255.8963 | 526 | 556.6257 | 497.8025 |
| 26 | 33.83639 | 20.59152 | 277 | 299.5704 | 256.8578 | 527 | 557.6536 | 498.7746 |
| 27 | 34.95926 | 21.46867 | 278 | 300.6088 | 257.8194 | 528 | 558.6815 | 499.7467 |
| 28 | 36.07992 | 22.34801 | 279 | 301.6472 | 258.781 | 529 | 559.7094 | 500.7189 |
| 29 | 37.1985 | 23.22944 | 280 | 302.6855 | 259.7428 | 530 | 560.7372 | 501.691 |
| 30 | 38.3151 | 24.11285 | 281 | 303.7237 | 260.7046 | 531 | 561.765 | 502.6632 |
| 31 | 39.42982 | 24.99815 | 282 | 304.7618 | 261.6664 | 532 | 562.7928 | 503.6355 |
| 32 | 40.54274 | 25.88523 | 283 | 305.7999 | 262.6283 | 533 | 563.8205 | 504.6077 |
| 33 | 41.65395 | 26.77403 | 284 | 306.8379 | 263.5903 | 534 | 564.8482 | 505.58 |
| 34 | 42.76352 | 27.66447 | 285 | 307.8758 | 264.5524 | 535 | 565.8759 | 506.5523 |
| 35 | 43.87152 | 28.55647 | 286 | 308.9137 | 265.5145 | 536 | 566.9036 | 507.5246 |
| 36 | 44.97802 | 29.44998 | 287 | 309.9515 | 266.4767 | 537 | 567.9313 | 508.497 |
| 37 | 46.08308 | 30.34493 | 288 | 310.9893 | 267.439 | 538 | 568.9589 | 509.4694 |
| 38 | 47.18676 | 31.24126 | 289 | 312.0269 | 268.4013 | 539 | 569.9865 | 510.4418 |
| 39 | 48.2891 | 32.13892 | 290 | 313.0646 | 269.3637 | 540 | 571.014 | 511.4142 |
| 40 | 49.39016 | 33.03786 | 291 | 314.1021 | 270.3261 | 541 | 572.0416 | 512.3866 |
| 41 | 50.48999 | 33.93804 | 292 | 315.1396 | 271.2886 | 542 | 573.0691 | 513.3591 |
| 42 | 51.58863 | 34.83941 | 293 | 316.177 | 272.2512 | 543 | 574.0966 | 514.3316 |
| 43 | 52.68612 | 35.74192 | 294 | 317.2144 | 273.2138 | 544 | 575.1241 | 515.3042 |
| 44 | 53.7825 | 36.64555 | 295 | 318.2517 | 274.1765 | 545 | 576.1515 | 516.2767 |
| 45 | 54.87781 | 37.55024 | 296 | 319.2889 | 275.1393 | 546 | 577.1789 | 517.2493 |
| 46 | 55.97209 | 38.45597 | 297 | 320.3261 | 276.1021 | 547 | 578.2063 | 518.2219 |
| 47 | 57.06535 | 39.36271 | 298 | 321.3632 | 277.065 | 548 | 579.2337 | 519.1945 |
| 48 | 58.15765 | 40.27042 | 299 | 322.4002 | 278.028 | 549 | 580.261 | 520.1672 |
| 49 | 59.249 | 41.17907 | 300 | 323.4372 | 278.991 | 550 | 581.2884 | 521.1399 |
| 50 | 60.33944 | 42.08863 | 301 | 324.4741 | 279.9541 | 551 | 582.3156 | 522.1126 |
| 51 | 61.42899 | 42.99909 | 302 | 325.511 | 280.9172 | 552 | 583.3429 | 523.0853 |
| 52 | 62.51768 | 43.9104 | 303 | 326.5478 | 281.8804 | 553 | 584.3702 | 524.0581 |
| 53 | 63.60553 | 44.82255 | 304 | 327.5845 | 282.8437 | 554 | 585.3974 | 525.0309 |
| 54 | 64.69257 | 45.73552 | 305 | 328.6212 | 283.807 | 555 | 586.4246 | 526.0037 |
| 55 | 65.77881 | 46.64928 | 306 | 329.6578 | 284.7704 | 556 | 587.4517 | 526.9765 |
| 56 | 66.86429 | 47.5638 | 307 | 330.6944 | 285.7338 | 557 | 588.4789 | 527.9493 |
| 57 | 67.94901 | 48.47908 | 308 | 331.7309 | 286.6973 | 558 | 589.506 | 528.9222 |
| 58 | 69.033 | 49.39509 | 309 | 332.7673 | 287.6609 | 559 | 590.5331 | 529.8951 |
| 59 | 70.11628 | 50.31182 | 310 | 333.8037 | 288.6245 | 560 | 591.5602 | 530.8681 |
| 60 | 71.19887 | 51.22923 | 311 | 334.84 | 289.5882 | 561 | 592.5872 | 531.841 |
| 61 | 72.28078 | 52.14733 | 312 | 335.8763 | 290.5519 | 562 | 593.6142 | 532.814 |
| 62 | 73.36203 | 53.06608 | 313 | 336.9125 | 291.5157 | 563 | 594.6412 | 533.787 |
| 63 | 74.44263 | 53.98548 | 314 | 337.9486 | 292.4796 | 564 | 595.6682 | 534.76 |
| 64 | 75.5226 | 54.90551 | 315 | 338.9847 | 293.4435 | 565 | 596.6952 | 535.7331 |
| 65 | 76.60196 | 55.82616 | 316 | 340.0208 | 294.4074 | 566 | 597.7221 | 536.7061 |
| 66 | 77.68071 | 56.74741 | 317 | 341.0568 | 295.3715 | 567 | 598.749 | 537.6792 |
| 67 | 78.75888 | 57.66924 | 318 | 342.0927 | 296.3355 | 568 | 599.7759 | 538.6523 |
| 68 | 79.83647 | 58.59165 | 319 | 343.1285 | 297.2997 | 569 | 600.8028 | 539.6255 |
| 69 | 80.9135 | 59.51463 | 320 | 344.1643 | 298.2639 | 570 | 601.8296 | 540.5986 |
| 70 | 81.98997 | 60.43815 | 321 | 345.2001 | 299.2281 | 571 | 602.8564 | 541.5718 |
| 71 | 83.06591 | 61.36221 | 322 | 346.2358 | 300.1924 | 572 | 603.8832 | 542.545 |
| 72 | 84.14132 | 62.2868 | 323 | 347.2714 | 301.1568 | 573 | 604.9099 | 543.5183 |
| 73 | 85.21622 | 63.21191 | 324 | 348.307 | 302.1212 | 574 | 605.9367 | 544.4915 |
| 74 | 86.29061 | 64.13753 | 325 | 349.3426 | 303.0857 | 575 | 606.9634 | 545.4648 |
| 75 | 87.3645 | 65.06364 | 326 | 350.378 | 304.0502 | 576 | 607.9901 | 546.4381 |
| 76 | 88.4379 | 65.99023 | 327 | 351.4135 | 305.0148 | 577 | 609.0168 | 547.4115 |
| 77 | 89.51083 | 66.91731 | 328 | 352.4488 | 305.9794 | 578 | 610.0434 | 548.3848 |
| 78 | 90.58329 | 67.84485 | 329 | 353.4842 | 306.9441 | 579 | 611.07 | 549.3582 |
| 79 | 91.65529 | 68.77285 | 330 | 354.5194 | 307.9088 | 580 | 612.0966 | 550.3316 |
| 80 | 92.72684 | 69.7013 | 331 | 355.5546 | 308.8736 | 581 | 613.1232 | 551.305 |
| 81 | 93.79795 | 70.63019 | 332 | 356.5898 | 309.8384 | 582 | 614.1498 | 552.2785 |
| 82 | 94.86863 | 71.55951 | 333 | 357.6249 | 310.8033 | 583 | 615.1763 | 553.2519 |
| 83 | 95.93888 | 72.48927 | 334 | 358.6599 | 311.7683 | 584 | 616.2028 | 554.2254 |
| 84 | 97.00871 | 73.41944 | 335 | 359.6949 | 312.7333 | 585 | 617.2293 | 555.1989 |
| 85 | 98.07813 | 74.35002 | 336 | 360.7299 | 313.6983 | 586 | 618.2558 | 556.1725 |
| 86 | 99.14714 | 75.281 | 337 | 361.7648 | 314.6634 | 587 | 619.2822 | 557.146 |
| 87 | 100.2158 | 76.21239 | 338 | 362.7996 | 315.6286 | 588 | 620.3086 | 558.1196 |
| 88 | 101.284 | 77.14416 | 339 | 363.8344 | 316.5938 | 589 | 621.335 | 559.0932 |
| 89 | 102.3518 | 78.07631 | 340 | 364.8692 | 317.5591 | 590 | 622.3614 | 560.0668 |
| 90 | 103.4193 | 79.00885 | 341 | 365.9038 | 318.5244 | 591 | 623.3878 | 561.0405 |
| 91 | 104.4864 | 79.94175 | 342 | 366.9385 | 319.4897 | 592 | 624.4141 | 562.0141 |
| 92 | 105.5531 | 80.87502 | 343 | 367.9731 | 320.4552 | 593 | 625.4404 | 562.9878 |
| 93 | 106.6195 | 81.80865 | 344 | 369.0076 | 321.4206 | 594 | 626.4667 | 563.9615 |
| 94 | 107.6855 | 82.74263 | 345 | 370.0421 | 322.3861 | 595 | 627.493 | 564.9353 |
| 95 | 108.7512 | 83.67695 | 346 | 371.0765 | 323.3517 | 596 | 628.5192 | 565.909 |
| 96 |
109.8165 |
84.61162 |
347 |
372.1109 |
324.3173 |
597 |
629.5454 |
566.8828 |
| 97 |
110.8815 |
85.54663 |
348 |
373.1453 |
325.283 |
598 |
630.5716 |
567.8566 |
| 98 |
111.9462 |
86.48197 |
349 |
374.1796 |
326.2487 |
599 |
631.5978 |
568.8304 |
| 99 |
113.0105 |
87.41764 |
350 |
375.2138 |
327.2144 |
600 |
632.624 |
569.8043 |
| 100 |
114.0745 |
88.35362 |
351 |
376.248 |
328.1802 |
601 |
633.6501 |
570.7781 |
| 101 |
115.1382 |
89.28993 |
352 |
377.2821 |
329.1461 |
602 |
634.6762 |
571.752 |
| 102 |
116.2016 |
90.22655 |
353 |
378.3162 |
330.112 |
603 |
635.7023 |
572.7259 |
| 103 |
117.2647 |
91.16347 |
354 |
379.3503 |
331.078 |
604 |
636.7284 |
573.6999 |
| 104 |
118.3275 |
92.1007 |
355 |
380.3843 |
332.044 |
605 |
637.7544 |
574.6738 |
| 105 |
119.3899 |
93.03823 |
356 |
381.4182 |
333.01 |
606 |
638.7804 |
575.6478 |
| 106 |
120.4521 |
93.97605 |
357 |
382.4521 |
333.9761 |
607 |
639.8064 |
576.6218 |
| 107 |
121.514 |
94.91416 |
358 |
383.486 |
334.9422 |
608 |
640.8324 |
577.5958 |
| 108 |
122.5756 |
95.85256 |
359 |
384.5198 |
335.9084 |
609 |
641.8584 |
578.5699 |
| 109 |
123.6369 |
96.79124 |
360 |
385.5536 |
336.8747 |
610 |
642.8843 |
579.5439 |
| 110 |
124.698 |
97.7302 |
361 |
386.5873 |
337.841 |
611 |
643.9102 |
580.518 |
| 111 |
125.7587 |
98.66944 |
362 |
387.6209 |
338.8073 |
612 |
644.9361 |
581.4921 |
| 112 |
126.8192 |
99.60895 |
363 |
388.6546 |
339.7737 |
613 |
645.962 |
582.4662 |
| 113 |
127.8794 |
100.5487 |
364 |
389.6881 |
340.7401 |
614 |
646.9879 |
583.4404 |
| 114 |
128.9394 |
101.4888 |
365 |
390.7217 |
341.7066 |
615 |
648.0137 |
584.4145 |
| 115 |
129.9991 |
102.4291 |
366 |
391.7552 |
342.6731 |
616 |
649.0395 |
585.3887 |
| 116 |
131.0586 |
103.3696 |
367 |
392.7886 |
343.6396 |
617 |
650.0653 |
586.3629 |
| 117 |
132.1177 |
104.3104 |
368 |
393.822 |
344.6062 |
618 |
651.0911 |
587.3372 |
| 118 |
133.1767 |
105.2515 |
369 |
394.8553 |
345.5729 |
619 |
652.1168 |
588.3114 |
| 119 |
134.2354 |
106.1928 |
370 |
395.8886 |
346.5396 |
620 |
653.1426 |
589.2857 |
| 120 |
135.2938 |
107.1344 |
371 |
396.9219 |
347.5063 |
621 |
654.1683 |
590.26 |
| 121 |
136.352 |
108.0762 |
372 |
397.9551 |
348.4731 |
622 |
655.194 |
591.2343 |
| 122 |
137.41 |
109.0182 |
373 |
398.9883 |
349.4399 |
623 |
656.2196 |
592.2086 |
| 123 |
138.4677 |
109.9605 |
374 |
400.0214 |
350.4068 |
624 |
657.2453 |
593.183 |
| 124 |
139.5252 |
110.903 |
375 |
401.0545 |
351.3737 |
625 |
658.2709 |
594.1573 |
| 125 |
140.5825 |
111.8457 |
376 |
402.0875 |
352.3407 |
626 |
659.2965 |
595.1317 |
| 126 |
141.6395 |
112.7887 |
377 |
403.1205 |
353.3077 |
627 |
660.3221 |
596.1061 |
| 127 |
142.6963 |
113.7318 |
378 |
404.1535 |
354.2748 |
628 |
661.3477 |
597.0806 |
| 128 |
143.7529 |
114.6753 |
379 |
405.1864 |
355.2419 |
629 |
662.3732 |
598.055 |
| 129 |
144.8093 |
115.6189 |
380 |
406.2192 |
356.209 |
630 |
663.3987 |
599.0295 |
| 130 |
145.8655 |
116.5627 |
381 |
407.252 |
357.1762 |
631 |
664.4242 |
600.004 |
| 131 |
146.9214 |
117.5068 |
382 |
408.2848 |
358.1434 |
632 |
665.4497 |
600.9785 |
| 132 |
147.9771 |
118.4511 |
383 |
409.3176 |
359.1107 |
633 |
666.4752 |
601.953 |
| 133 |
149.0326 |
119.3955 |
384 |
410.3503 |
360.078 |
634 |
667.5006 |
602.9276 |
| 134 |
150.088 |
120.3402 |
385 |
411.3829 |
361.0453 |
635 |
668.5261 |
603.9022 |
| 135 |
151.1431 |
121.2851 |
386 |
412.4155 |
362.0127 |
636 |
669.5515 |
604.8768 |
| 136 |
152.198 |
122.2302 |
387 |
413.4481 |
362.9802 |
637 |
670.5768 |
605.8514 |
| 137 |
153.2527 |
123.1755 |
388 |
414.4806 |
363.9476 |
638 |
671.6022 |
606.826 |
| 138 |
154.3072 |
124.121 |
389 |
415.5131 |
364.9152 |
639 |
672.6276 |
607.8007 |
| 139 |
155.3615 |
125.0667 |
390 |
416.5455 |
365.8827 |
640 |
673.6529 |
608.7754 |
| 140 |
156.4156 |
126.0126 |
391 |
417.5779 |
366.8503 |
641 |
674.6782 |
609.7501 |
| 141 |
157.4695 |
126.9586 |
392 |
418.6103 |
367.818 |
642 |
675.7035 |
610.7248 |
| 142 |
158.5233 |
127.9049 |
393 |
419.6426 |
368.7856 |
643 |
676.7287 |
611.6995 |
| 143 |
159.5768 |
128.8514 |
394 |
420.6749 |
369.7534 |
644 |
677.754 |
612.6743 |
| 144 |
160.6302 |
129.798 |
395 |
421.7071 |
370.7211 |
645 |
678.7792 |
613.649 |
| 145 |
161.6834 |
130.7448 |
396 |
422.7393 |
371.689 |
646 |
679.8044 |
614.6238 |
| 146 |
162.7364 |
131.6918 |
397 |
423.7714 |
372.6568 |
647 |
680.8296 |
615.5986 |
| 147 |
163.7892 |
132.639 |
398 |
424.8035 |
373.6247 |
648 |
681.8548 |
616.5735 |
| 148 |
164.8418 |
133.5864 |
399 |
425.8356 |
374.5926 |
649 |
682.8799 |
617.5483 |
| 149 |
165.8943 |
134.5339 |
400 |
426.8676 |
375.5606 |
650 |
683.905 |
618.5232 |
| 150 |
166.9465 |
135.4816 |
401 |
427.8996 |
376.5286 |
651 |
684.9302 |
619.4981 |
| 151 |
167.9987 |
136.4295 |
402 |
428.9316 |
377.4966 |
652 |
685.9552 |
620.473 |
| 152 |
169.0506 |
137.3776 |
403 |
429.9635 |
378.4647 |
653 |
686.9803 |
621.4479 |
| 153 |
170.1024 |
138.3258 |
404 |
430.9954 |
379.4329 |
654 |
688.0054 |
622.4229 |
| 154 |
171.154 |
139.2742 |
405 |
432.0272 |
380.401 |
655 |
689.0304 |
623.3978 |
| 155 |
172.2054 |
140.2228 |
406 |
433.059 |
381.3692 |
656 |
690.0554 |
624.3728 |
| 156 |
173.2567 |
141.1715 |
407 |
434.0907 |
382.3375 |
657 |
691.0804 |
625.3478 |
| 157 |
174.3078 |
142.1204 |
408 |
435.1225 |
383.3058 |
658 |
692.1054 |
626.3228 |
| 158 |
175.3587 |
143.0695 |
409 |
436.1541 |
384.2741 |
659 |
693.1304 |
627.2979 |
| 159 |
176.4095 |
144.0187 |
410 |
437.1858 |
385.2425 |
660 |
694.1553 |
628.2729 |
| 160 |
177.4601 |
144.9681 |
411 |
438.2174 |
386.2109 |
661 |
695.1802 |
629.248 |
| 161 |
178.5106 |
145.9176 |
412 |
439.2489 |
387.1793 |
662 |
696.2051 |
630.2231 |
| 162 |
179.5609 |
146.8673 |
413 |
440.2805 |
388.1478 |
663 |
697.23 |
631.1982 |
| 163 |
180.6111 |
147.8171 |
414 |
441.3119 |
389.1163 |
664 |
698.2549 |
632.1734 |
| 164 |
181.6611 |
148.7671 |
415 |
442.3434 |
390.0848 |
665 |
699.2797 |
633.1485 |
| 165 |
182.7109 |
149.7173 |
416 |
443.3748 |
391.0534 |
666 |
700.3045 |
634.1237 |
| 166 |
183.7606 |
150.6676 |
417 |
444.4062 |
392.0221 |
667 |
701.3293 |
635.0989 |
| 167 |
184.8102 |
151.618 |
418 |
445.4375 |
392.9907 |
668 |
702.3541 |
636.0741 |
| 168 |
185.8596 |
152.5686 |
419 |
446.4688 |
393.9594 |
669 |
703.3789 |
637.0493 |
| 169 |
186.9089 |
153.5193 |
420 |
447.5001 |
394.9282 |
670 |
704.4036 |
638.0246 |
| 170 |
187.958 |
154.4702 |
421 |
448.5313 |
395.8969 |
671 |
705.4284 |
638.9999 |
| 171 |
189.0069 |
155.4213 |
422 |
449.5625 |
396.8658 |
672 |
706.4531 |
639.9751 |
| 172 |
190.0558 |
156.3724 |
423 |
450.5936 |
397.8346 |
673 |
707.4778 |
640.9505 |
| 173 |
191.1045 |
157.3237 |
424 |
451.6247 |
398.8035 |
674 |
708.5025 |
641.9258 |
| 174 |
192.153 |
158.2752 |
425 |
452.6558 |
399.7724 |
675 |
709.5271 |
642.9011 |
| 175 |
193.2014 |
159.2268 |
426 |
453.6868 |
400.7414 |
676 |
710.5518 |
643.8765 |
| 176 |
194.2497 |
160.1785 |
427 |
454.7178 |
401.7104 |
677 |
711.5764 |
644.8518 |
| 177 |
195.2978 |
161.1304 |
428 |
455.7488 |
402.6794 |
678 |
712.601 |
645.8272 |
| 178 |
196.3458 |
162.0824 |
429 |
456.7797 |
403.6485 |
679 |
713.6256 |
646.8027 |
| 179 |
197.3937 |
163.0345 |
430 |
457.8106 |
404.6176 |
680 |
714.6501 |
647.7781 |
| 180 |
198.4414 |
163.9868 |
431 |
458.8415 |
405.5867 |
681 |
715.6747 |
648.7535 |
| 181 |
199.489 |
164.9392 |
432 |
459.8723 |
406.5559 |
682 |
716.6992 |
649.729 |
| 182 |
200.5365 |
165.8917 |
433 |
460.9031 |
407.5251 |
683 |
717.7237 |
650.7045 |
| 183 |
201.5839 |
166.8443 |
434 |
461.9338 |
408.4944 |
684 |
718.7482 |
651.68 |
| 184 |
202.6311 |
167.7971 |
435 |
462.9646 |
409.4637 |
685 |
719.7727 |
652.6555 |
| 185 |
203.6781 |
168.7501 |
436 |
463.9952 |
410.433 |
686 |
720.7972 |
653.6311 |
| 186 |
204.7251 |
169.7031 |
437 |
465.0259 |
411.4023 |
687 |
721.8216 |
654.6066 |
| 187 |
205.7719 |
170.6563 |
438 |
466.0565 |
412.3717 |
688 |
722.8461 |
655.5822 |
| 188 |
206.8186 |
171.6096 |
439 |
467.0871 |
413.3412 |
689 |
723.8705 |
656.5578 |
| 189 |
207.8652 |
172.563 |
440 |
468.1176 |
414.3106 |
690 |
724.8949 |
657.5334 |
| 190 |
208.9117 |
173.5165 |
441 |
469.1481 |
415.2801 |
691 |
725.9192 |
658.509 |
| 191 |
209.958 |
174.4702 |
442 |
470.1786 |
416.2496 |
692 |
726.9436 |
659.4847 |
| 192 |
211.0043 |
175.4239 |
443 |
471.209 |
417.2192 |
693 |
727.9679 |
660.4603 |
| 193 |
212.0504 |
176.3778 |
444 |
472.2394 |
418.1888 |
694 |
728.9922 |
661.436 |
| 194 |
213.0963 |
177.3319 |
445 |
473.2698 |
419.1584 |
695 |
730.0165 |
662.4117 |
| 195 |
214.1422 |
178.286 |
446 |
474.3001 |
420.1281 |
696 |
731.0408 |
663.3874 |
| 196 |
215.1879 |
179.2403 |
447 |
475.3304 |
421.0978 |
697 |
732.0651 |
664.3631 |
| 197 |
216.2336 |
180.1946 |
448 |
476.3607 |
422.0675 |
698 |
733.0893 |
665.3389 |
| 198 |
217.2791 |
181.1491 |
449 |
477.3909 |
423.0373 |
699 |
734.1136 |
666.3147 |
| 199 |
218.3245 |
182.1037 |
450 |
478.4211 |
424.0071 |
700 |
735.1378 |
667.2904 |
| 200 |
219.3698 |
183.0584 |
451 |
479.4513 |
424.9769 |
701 |
736.162 |
668.2662 |
| 201 |
220.415 |
184.0133 |
452 |
480.4814 |
425.9468 |
702 |
737.1862 |
669.2421 |
| 202 |
221.46 |
184.9682 |
453 |
481.5115 |
426.9167 |
703 |
738.2103 |
670.2179 |
| 203 |
222.505 |
185.9232 |
454 |
482.5416 |
427.8866 |
704 |
739.2345 |
671.1938 |
| 204 |
223.5498 |
186.8784 |
455 |
483.5716 |
428.8566 |
705 |
740.2586 |
672.1696 |
| 205 |
224.5945 |
187.8337 |
456 |
484.6016 |
429.8266 |
706 |
741.2827 |
673.1455 |
| 206 |
225.6392 |
188.789 |
457 |
485.6316 |
430.7966 |
707 |
742.3068 |
674.1214 |
| 207 |
226.6837 |
189.7445 |
458 |
486.6615 |
431.7667 |
708 |
743.3309 |
675.0973 |
| 208 |
227.7281 |
190.7001 |
459 |
487.6914 |
432.7368 |
709 |
744.355 |
676.0733 |
| 209 |
228.7724 |
191.6558 |
460 |
488.7213 |
433.7069 |
710 |
745.379 |
677.0492 |
| 210 |
229.8166 |
192.6116 |
461 |
489.7511 |
434.6771 |
711 |
746.403 |
678.0252 |
| 211 |
230.8607 |
193.5675 |
462 |
490.781 |
435.6473 |
712 |
747.427 |
679.0012 |
| 212 |
231.9047 |
194.5235 |
463 |
491.8107 |
436.6175 |
713 |
748.451 |
679.9772 |
| 213 |
232.9485 |
195.4797 |
464 |
492.8405 |
437.5878 |
714 |
749.475 |
680.9532 |
| 214 |
233.9923 |
196.4359 |
465 |
493.8702 |
438.5581 |
715 |
750.499 |
681.9293 |
| 215 |
235.036 |
197.3922 |
466 |
494.8999 |
439.5284 |
716 |
751.5229 |
682.9053 |
| 216 |
236.0796 |
198.3486 |
467 |
495.9295 |
440.4987 |
717 |
752.5468 |
683.8814 |
| 217 |
237.1231 |
199.3051 |
468 |
496.9591 |
441.4691 |
718 |
753.5708 |
684.8575 |
| 218 |
238.1664 |
200.2618 |
469 |
497.9887 |
442.4395 |
719 |
754.5946 |
685.8336 |
| 219 |
239.2097 |
201.2185 |
470 |
499.0182 |
443.41 |
720 |
755.6185 |
686.8097 |
| 220 |
240.2529 |
202.1753 |
471 |
500.0478 |
444.3805 |
721 |
756.6424 |
687.7859 |
| 221 |
241.296 |
203.1322 |
472 |
501.0773 |
445.351 |
722 |
757.6662 |
688.762 |
| 222 |
242.339 |
204.0892 |
473 |
502.1067 |
446.3215 |
723 |
758.6901 |
689.7382 |
| 223 |
243.3819 |
205.0463 |
474 |
503.1361 |
447.2921 |
724 |
759.7139 |
690.7144 |
| 224 |
244.4247 |
206.0035 |
475 |
504.1655 |
448.2627 |
725 |
760.7377 |
691.6906 |
| 225 |
245.4674 |
206.9608 |
476 |
505.1949 |
449.2333 |
726 |
761.7614 |
692.6668 |
| 226 |
246.51 |
207.9182 |
477 |
506.2242 |
450.204 |
727 |
762.7852 |
693.643 |
| 227 |
247.5525 |
208.8757 |
478 |
507.2535 |
451.1747 |
728 |
763.8089 |
694.6193 |
| 228 |
248.5949 |
209.8333 |
479 |
508.2828 |
452.1454 |
729 |
764.8327 |
695.5956 |
| 229 |
249.6372 |
210.791 |
480 |
509.312 |
453.1162 |
730 |
765.8564 |
696.5718 |
| 230 |
250.6795 |
211.7488 |
481 |
510.3413 |
454.087 |
731 |
766.8801 |
697.5482 |
| 231 |
251.7216 |
212.7066 |
482 |
511.3704 |
455.0578 |
732 |
767.9038 |
698.5245 |
| 232 |
252.7636 |
213.6646 |
483 |
512.3996 |
456.0287 |
733 |
768.9274 |
699.5008 |
| 233 |
253.8056 |
214.6226 |
484 |
513.4287 |
456.9995 |
734 |
769.9511 |
700.4772 |
| 234 |
254.8475 |
215.5807 |
485 |
514.4578 |
457.9704 |
735 |
770.9747 |
701.4535 |
| 235 |
255.8893 |
216.539 |
486 |
515.4869 |
458.9414 |
736 |
771.9983 |
702.4299 |
| 236 |
256.931 |
217.4973 |
487 |
516.5159 |
459.9123 |
737 |
773.0219 |
703.4063 |
| 237 |
257.9726 |
218.4557 |
488 |
517.5449 |
460.8833 |
738 |
774.0455 |
704.3827 |
| 238 |
259.0141 |
219.4141 |
489 |
518.5739 |
461.8544 |
739 |
775.0691 |
705.3592 |
| 239 |
260.0555 |
220.3727 |
490 |
519.6028 |
462.8254 |
740 |
776.0926 |
706.3356 |
| 240 |
261.0969 |
221.3314 |
491 |
520.6317 |
463.7965 |
741 |
777.1162 |
707.3121 |
| 241 |
262.1381 |
222.2901 |
492 |
521.6606 |
464.7676 |
742 |
778.1397 |
708.2885 |
| 242 |
263.1793 |
223.2489 |
493 |
522.6894 |
465.7388 |
743 |
779.1632 |
709.265 |
| 243 |
264.2204 |
224.2078 |
494 |
523.7183 |
466.71 |
744 |
780.1867 |
710.2416 |
| 244 |
265.2614 |
225.1668 |
495 |
524.7471 |
467.6812 |
745 |
781.2102 |
711.2181 |
| 245 |
266.3023 |
226.1259 |
496 |
525.7758 |
468.6524 |
746 |
782.2336 |
712.1946 |
| 246 |
267.3431 |
227.0851 |
497 |
526.8046 |
469.6237 |
747 |
783.2571 |
713.1712 |
| 247 |
268.3839 |
228.0443 |
498 |
527.8333 |
470.595 |
748 |
784.2805 |
714.1478 |
| 248 |
269.4246 |
229.0037 |
499 |
528.862 |
471.5663 |
749 |
785.3039 |
715.1243 |
| 249 |
270.4652 |
229.9631 |
500 |
529.8906 |
472.5376 |
750 |
786.3273 |
716.101 |
2 Comments
Comment by U.S. Public Policy Committee of the Association for Computing Machinery (USACM) (None)
Amendment to: USACM Comment #26. Section 5.3.1 General method [incorrect] USACM recommends correcting the factual errors present in this subsection, as minimally enumerated below: 1. The system submitted for testing might be a representative sample of the pool of identical hardware/software systems, but the pool of tests should not be a representative sample of the events that happen during an election. 2. There is no reason to expect software reliability, software accuracy, and hardware misfeed rate to follow the same distribution. 3. The Poisson distribution is discrete, not continuous. 4. The Poisson process typically assumes a stationary underlying exponential distribution. The idea that software reliability, software accuracy, and hardware misfeed rates follow the same underlying distribution, or that the concatenation of these three (if there are only three) distributions would be anything like exponential is remarkable in its unlikelihood. 5. The observed event rate ("events" divided by "volume" over the course of a test campaign) is a highly biased measure. a. The first problem is that a regression test suite repeats the same tests from build to build. This gives rise to the classic problem of the "pesticide paradox" [1]. The test suite is a tiny sample of the collection of possible tests. When the suite reveals bugs, they are fixed. Ultimately, the test suite becomes a collection of tests that have one thing in common: the software has passed all of them at least once. This differs from almost every other possible test (all of the ones that have not been run). Therefore, the reliability of the software is probably vastly overestimated. b. The second problem is that the pool of tests is structured to cover a specification. It does not necessarily target vulnerabilities of the software. Nor is it designed to reflect usage frequencies in the field [2]. 6. Determining the length of testing in advance by an approved test plan sounds scientific, but many practitioners consider this software testing malpractice. There is substantial evidence that bugs cluster. Given a failure, there is reason to do follow-up testing to study this area of the product in more detail. Rigidly adhering to a plan created in the absence of failure data is to rigidly reject the idea of follow-up testing. This underestimates the number of problems in the code. Worse, this testing method reduces the chance that defects will be found and fixed because it reduces (essentially bans) the follow-up testing that would expose those additional defects. References: [1] Boris Beizer, Software Testing Techniques, Second Edition, 1990. [2] Musa, Software Reliability Engineering (http://members.aol.com/JohnDMusa/book.htm).
Comment by U.S. Public Policy Committee of the Association for Computing Machinery (USACM) (None)
USACM Comment #26. Section 5.3.2. Critical Values [incorrect] This section should be adjusted according to factual corrections made in the previous comment.
5.3.3 Reliability
5.3.3-A Reliability, pertinent tests
All tests executed during conformity assessment SHALL be considered "pertinent" for assessment of reliability, with the following exceptions:
- Tests in which failures are forced;
- Tests in which portions of the system that would be exercised during an actual election are bypassed (see Part 3: 2.5.3 "Test fixtures").
Applies to: Voting system
1 Comment
Comment by U.S. Public Policy Committee of the Association for Computing Machinery (USACM) (None)
USACM Comment #27. 5.3.3 Reliability [incorrect] USACM notes the apparent self-contradictions in this sub-section, enumerated in the discussion below. DISCUSSION: 1. Failure rate data are not relevant to prediction of reliability in the field unless we assume that the failure rate in the lab is representative of the failure rate that will be found in the field. This might be rational for hardware, but unless we structure the software tests to map to usage in the field, there is no rational basis for this assumption vis-à-vis the software. 2. Pass/fail criteria are based on the concatenation of hardware and software failures. A paper jam rates the same as a miscount of votes. 3. Counting all "failures" for statistical purposes creates an adversarial dynamic around the classification of anomalous behaviors. To the extent that an apparently incorrect behavior is arguably not inconsistent with the specification, there is an incentive to class it as a non-bug and therefore not fix it. The incentives should favor improving the software, not classifying problems as non-problems.
5.3.3-B Failure rate data collection
The test lab SHALL record the number of failures and the applicable measure of volume for each pertinent test execution, for each type of device, and for each applicable failure type in Part 1: Table 6-3 (Part 1: 6.3.1.5 "Requirements").
Applies to: Voting device
DISCUSSION
"Type of device" refers to the different models produced by the manufacturer. These are not the same as device classes. The system may include several different models of the same class, and a given model may belong to more than one class.
1 Comment
Comment by Gail Audette (Voting System Test Laboratory)
How are failures defined? Is the number of failures (either recoverable or non-recoverable) counted without respect to the severity?
5.3.3-C Reliability pass criteria
When operational testing is complete, the test lab SHALL calculate the failure total and total volume accumulated across all pertinent tests for each type of device and failure type. If, using the test method in Part 3: 5.3.1 "General method", these values indicate rejection of the null hypothesis for any type of device and type of failure, the verdict on conformity to Requirement Part 1: 6.3.1.5-A SHALL be Fail. Otherwise, the verdict SHALL be Pass.
Applies to: Voting device
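The pass criterion above can be illustrated with a short sketch. The general method of Part 3: 5.3.1 is not reproduced in this section, so the sketch assumes a one-sided Poisson test: the null hypothesis that the true failure rate does not exceed the benchmark is rejected when the probability of observing at least the recorded number of failures, given the benchmark rate, falls below a significance level. The significance level `alpha`, the benchmark values, and every name below (`poisson_upper_tail`, `reliability_verdict`, `"model-A"`, `"critical"`) are hypothetical and chosen only for illustration; they are not taken from the VVSG.

```python
import math
from collections import defaultdict


def poisson_upper_tail(n, mean):
    """Return P(X >= n) for X ~ Poisson(mean)."""
    if n <= 0:
        return 1.0
    if mean <= 0.0:
        return 0.0
    log_mean = math.log(mean)
    # Sum P(X = k) for k = 0 .. n-1 in log space to avoid underflow.
    below = math.fsum(
        math.exp(k * log_mean - mean - math.lgamma(k + 1)) for k in range(n)
    )
    return min(1.0, max(0.0, 1.0 - below))


def reliability_verdict(records, benchmark_rates, alpha):
    """Accumulate failures and volume, then test each (device, failure) pair.

    records: iterable of (device_type, failure_type, failures, volume),
             one entry per pertinent test execution.
    benchmark_rates: assumed benchmark failure rate per failure type.
    alpha: assumed significance level of the one-sided test.
    """
    totals = defaultdict(lambda: [0, 0.0])
    for device_type, failure_type, failures, volume in records:
        totals[(device_type, failure_type)][0] += failures
        totals[(device_type, failure_type)][1] += volume

    for (device_type, failure_type), (failures, volume) in totals.items():
        expected = benchmark_rates[failure_type] * volume
        # Rejecting the null hypothesis for any pair yields Fail.
        if poisson_upper_tail(failures, expected) < alpha:
            return "Fail"
    return "Pass"


# Hypothetical data: two pertinent test executions for one device model.
records = [
    ("model-A", "critical", 0, 5000.0),
    ("model-A", "critical", 1, 5000.0),
]
print(reliability_verdict(records, {"critical": 1e-4}, alpha=0.05))
```

With these illustrative numbers, one failure is observed against an expected count of one, so the null hypothesis is not rejected and the sketch reports Pass.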
1 Comment
Comment by Gail Audette (Voting System Test Laboratory)
Benchmarks are typically based on industry-wide data. What comparable industry was used for the factors for calculation of critical values? We see how the lab is expected to generate the true event rate (although we don't agree with the collection method). How are the labs supposed to evaluate the benchmark event rate in order to evaluate the null hypothesis (conforming or non-conforming)?
5.3.4 Accuracy
The informal concept of voting system accuracy is formalized using the ratio of the number of errors that occur to the volume of data processed, also known as error rate.
5.3.4-A Accuracy, pertinent tests
All tests executed during conformity assessment SHALL be considered "pertinent" for assessment of accuracy, with the following exceptions:
- Tests in which errors are forced;
- Tests in which portions of the system that would be exercised during an actual election are bypassed (see Part 3: 2.5.3 "Test fixtures").
Applies to: Voting system
1 Comment
Comment by U.S. Public Policy Committee of the Association for Computing Machinery (USACM) (None)
USACM Comment #28. Section 5.3.4. Accuracy [incorrect] USACM notes the apparent self-contradictions in this sub-section as described in the discussion below. DISCUSSION: Accuracy is operationalized (not formalized) as a ratio of errors found to volume of data processed. One may assume that the word "error" is tied tightly to events that yield a miscount of the votes, allow someone to cast extra votes, or cause someone to be unable to cast a vote. If "error" includes anything in the behavior of the program that would not create an error in election result, it is difficult to understand what this operationalization has to do with the naturalistic concept of "accuracy" in a system that collects and counts votes. The operationalization is defective as an estimator unless the pool of tests is designed so as to be representative of the pool of behaviors in the field. If some aspect of the system causes a small mistake (e.g. 1-vote miscount), but is only tested once, that might be a major source of inaccuracy if everyone encounters it while voting, and it might be a trivial source if almost no one encounters it. For example, imagine a system that allowed ballots that could accept write-in votes for up to 100 candidates. Imagine an error in which 1 vote in 10 is lost in the 100th race that includes a write-in candidate. As a boundary case, this error might show up in several tests. However, it might never show up in an election. What is the accuracy using the described metric? Without a mapping from the estimator to the construct being estimated, the metric is worthless. This is a fundamental issue in measurement. We normally call it construct validity. The argument that this measure of accuracy estimates underlying system accuracy lacks even face validity.
5.3.4-B Calculation of report total error rate
Given a set of vote data reports resulting from the execution of tests, the observed cumulative report total error rate SHALL be calculated as follows:
- Define a "report item" as any one of the numeric values (totals or counts) that must appear in any of the vote data reports. Each ballot count, each vote, overvote, and undervote total for each contest, and each vote total for each contest choice in each contest is a separate report item. The required report items are detailed in Part 1: 7.8.3 "Vote data reports";
- For each report item, compute the "report item error" as the absolute value of the difference between the correct value and the reported value. Special cases: If a value is reported that should not have appeared at all (spurious item), or if an item that should have appeared in the report does not (missing item), assess a report item error of one. Additional values that are reported as a manufacturer extension to the standard are not considered spurious items;
- Compute the "report total error" as the sum of all of the report item errors from all of the reports;
- Compute the "report total volume" as the sum of all of the correct values for all of the report items that are supposed to appear in the reports. Special cases: When the same logical contest appears multiple times (e.g., when results are reported for each ballot configuration and then combined or when reports are generated for multiple reporting contexts), each manifestation of the logical contest is considered a separate contest with its own correct vote totals in this computation;
- Compute the observed cumulative report total error rate as the ratio of the report total error to the report total volume. Special cases: If both values are zero, the report total error rate is zero. If the report total volume is zero but the report total error is not, the report total error rate is infinite.
Applies to: Voting system
Source: Revision of [GPO90] F.6
5.3.4-C Error rate data collection
The test lab SHALL record the report total error and report total volume for each pertinent test execution.
Applies to: Voting system
DISCUSSION
Accuracy is calculated as a system-level metric, not separated by device type.
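The calculation in 5.3.4-B can be sketched as follows. This is a minimal illustration, not a normative implementation: each report item is represented as a hypothetical `(correct, reported)` pair, with `None` standing for a value that should not exist (spurious item) or that is absent from the report (missing item). Values reported as manufacturer extensions are assumed to have been excluded by the caller, since they are not spurious items.

```python
import math


def report_item_error(correct, reported):
    """Error for one report item; spurious or missing items count as one."""
    if correct is None and reported is None:
        raise ValueError("a report item needs a correct or a reported value")
    if correct is None or reported is None:
        return 1
    return abs(correct - reported)


def report_total_error_rate(items):
    """Return (report total error, report total volume, error rate).

    items: iterable of (correct, reported) pairs, one per report item.
    """
    total_error = 0
    total_volume = 0
    for correct, reported in items:
        total_error += report_item_error(correct, reported)
        # Volume sums the correct values of items that should appear.
        if correct is not None:
            total_volume += correct

    if total_volume == 0:
        rate = 0.0 if total_error == 0 else math.inf
    else:
        rate = total_error / total_volume
    return total_error, total_volume, rate


# Hypothetical report items: exact, off by two, spurious, missing.
items = [(100, 100), (40, 38), (None, 5), (10, None)]
error, volume, rate = report_total_error_rate(items)
print(error, volume, rate)
```

Returning the error and volume alongside the rate lets both totals be recorded for each pertinent test execution and summed across executions before a cumulative rate is computed.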
1 Comment
Comment by Gail Audette (Voting System Test Laboratory)
Benchmarks are typically based on industry-wide data. What comparable industry was used for the factors for calculation of critical values? We see how the lab is expected to generate the true event rate (although we don't agree with the collection method). How are the labs supposed to evaluate the benchmark event rate in order to evaluate the null hypothesis (conforming or non-conforming)?
5.3.4-D Accuracy pass criteria
When operational testing is complete, the test lab SHALL calculate the report total error and report total volume accumulated across all pertinent tests. If, using the test method in Part 3: 5.3.1 "General method", these values indicate rejection of the null hypothesis, the verdict on conformity to Requirement Part 1: 6.3.2-B SHALL be Fail. Otherwise, the verdict SHALL be Pass.
Applies to: Voting system
5.3.5 Misfeed rate
This benchmark applies only to paper-based tabulators and EBMs. Multiple feeds, misfeeds (jams), and rejections of ballots that meet all manufacturer specifications are all treated collectively as "misfeeds" for benchmarking purposes (i.e., only a single count is maintained).
5.3.5-A Misfeed rate, pertinent tests
All tests executed during conformity assessment SHALL be considered "pertinent" for assessment of misfeed rate, with the following exceptions:
- Tests in which misfeeds are forced.
Applies to: Voting system
5.3.5-B Calculation of misfeed rate
For paper-based tabulators and EBMs, the observed cumulative misfeed rate SHALL be calculated as follows:
- Compute the "misfeed total" as the number of times that unforced multiple feed, misfeed (jam), or rejection of a ballot that meets all manufacturer specifications has occurred during the execution of tests. It is possible for a given ballot to misfeed more than once – in such a case, each misfeed would be counted;
- Compute the "total ballot volume" as the number of successful feeds of ballot pages or cards during the execution of tests. (If the pages of a multi-page ballot are fed separately, each page counts; but if both sides of a two-sided ballot are read in one pass through the tabulator, it only counts once);
- Compute the observed cumulative misfeed rate as the ratio of the misfeed total to the total ballot volume. Special cases: If both values are zero, the misfeed rate is zero. If the total ballot volume is zero but the misfeed total is not, the misfeed rate is infinite.
Applies to: Paper-based device ∧ Tabulator, EBM
DISCUSSION
"During the execution of tests" deliberately excludes jams that occur during pre-testing setup and calibration of the equipment. Uncalibrated equipment can be expected to jam frequently.
Source: New requirement
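The misfeed rate calculation in 5.3.5-B, including its two special cases, can be sketched as below. The function name and the example counts are hypothetical; the counts are assumed to exclude forced misfeeds and any jams that occur during pre-testing setup and calibration.

```python
import math


def misfeed_rate(misfeed_total, total_ballot_volume):
    """Observed cumulative misfeed rate.

    misfeed_total: unforced multiple feeds, jams, and rejections of
                   conforming ballots, each occurrence counted once.
    total_ballot_volume: successful feeds of ballot pages or cards.
    """
    if misfeed_total < 0 or total_ballot_volume < 0:
        raise ValueError("counts must be non-negative")
    if total_ballot_volume == 0:
        # Special cases: 0/0 is defined as zero, n/0 as infinite.
        return 0.0 if misfeed_total == 0 else math.inf
    return misfeed_total / total_ballot_volume


# Hypothetical totals: 3 misfeeds over 1500 successful feeds.
print(misfeed_rate(3, 1500))
```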
5.3.5-C Misfeed rate data collection
The test lab SHALL record the misfeed total and total ballot volume for each pertinent test execution, for each type of device.
Applies to: Paper-based device ∧ Tabulator, EBM
DISCUSSION
"Type of device" refers to the different models of paper-based tabulators and EBMs produced by the manufacturer.
1 Comment
Comment by Gail Audette (Voting System Test Laboratory)
Benchmarks are typically based on industry-wide data. What comparable industry was used for the factors for calculation of critical values? We see how the lab is expected to generate the true event rate (although we don't agree with the collection method). How are the labs supposed to evaluate the benchmark event rate in order to evaluate the null hypothesis (conforming or non-conforming)?
5.3.5-D Misfeed rate pass criteria
When operational testing is complete, the test lab SHALL calculate the misfeed total and total ballot volume accumulated across all pertinent tests. If, using the test method in Part 3: 5.3.1 "General method", these values indicate rejection of the null hypothesis for any type of device, the verdict on conformity to Requirement Part 1: 6.3.3-A SHALL be Fail. Otherwise, the verdict SHALL be Pass.
5.4 Open-Ended Vulnerability Testing
Vulnerability testing is an attempt to bypass or break the security of a system or a device. Like functional testing, vulnerability testing can falsify a general assertion (namely, that the system or device is secure), but it cannot verify security (that is, show that the system or device is secure in all cases). Open-ended vulnerability testing (OEVT) is not confined to a predetermined test suite. Instead, it relies heavily on the experience and expertise of the OEVT Team Members, their knowledge of the system, its component devices, and associated vulnerabilities, and their ability to exploit those vulnerabilities.
The goal of OEVT is to discover architecture, design, and implementation flaws in the system that may not be detected by systematic functional, reliability, and security testing and that may be exploited to change the outcome of an election, to interfere with voters’ ability to cast ballots or have their votes counted during an election, or to compromise the secrecy of the vote. OEVT also attempts to discover logic bombs, time bombs, or other Trojan horses that may have been introduced into the system hardware, firmware, or software for these purposes.
7 Comments
Comment by Gail Audette (Voting System Test Laboratory)
If the OEVT relies heavily on the experience and expertise of the OEVT Team Members, the testing is not repeatable and does not comply with the NIST 150-22 for repeatability (4.13.3).
Comment by Brit Williams (Academic)
This entire section appears to be hastily written and poorly thought out. The purpose of certification testing is to verify that the voting system complies with the voting system guidelines. OEVT, as written, is a scattershot approach to testing that does not address compliance with any specific guideline. No management or oversight structure for the OEVT test team is presented. No process for selecting team members is presented. This is not to say that these types of tests have no value, but they should be part of the EAC Certification Procedures and not part of the VVSG. I will submit specific comments below.
Comment by David Beirne, Executive Director, Election Technology Council (Manufacturer)
OEVT is laudable, but difficult to incorporate into voting system design features. It is, by definition, subjective and undefined resulting in a security threshold that is difficult, if not impossible, to design for. Given the fact that the current dynamic for the industry is the financing of the voting system certification, no provider wishes to submit a product through an expensive process to have it fail based on a subjective standard that is not repeatable. This security process should be renamed and should incorporate clear security benchmarks that are broad in scope, but clear in their performance requirement.
Comment by E Smith/L Korb (Manufacturer)
The requirements, scope, mandate and documentation requirements of this specification are poorly defined and are dependent upon the makeup of the OEVT team. Additional work should be done to specify a fair and reasonable test. Many of the requirements of the VVSG would seem to open the possibility for new types of denial of service attacks. Much work needs to be accomplished before this section is ready for implementation.
Comment by U.S. Public Policy Committee of the Association for Computing Machinery (USACM) (None)
USACM Comment #29. OEVT Goal [imprecise] USACM recommends that the present stated goal for OEVT (Sect 5.4, par 2, first sentence) be modified to read as follows: "The goal of OEVT is to test and analyze the target system to discover flaws that could adversely affect the election process and that may reflect systemic development process weaknesses." DISCUSSION: The current text is focused on discovering flaws that could invalidate election results. The proposed language would allow OEVT to be used in checking for flaws in other aspects of election operations, including accessibility and usability. OEVT is not meant as a replacement for quality design. The use of OEVT must be carefully described. It is a process that is difficult to replicate (if it were easy to replicate, it would not really be open-ended), so any requirements on OEVT in this VVSG should focus on the process — how the team is selected, the scope of work — of OEVT rather than specific steps. As with other parts of the testing process, this must be open to review by qualified scholars.
Comment by Matt Bishop, Mark Gondree, Sean Peisert, Elliot Proebstel (Academic)
We are excited by the addition of an "open-ended vulnerability testing" (OEVT) phase of the certification process, as described in the Aug 31, 2007 draft of the EAC's VVSG standards. Adding this type of testing, which is widely used in other venues for systems that must provide high security and reliability, will be invaluable in finding problems that could disrupt elections. As computer security practitioners, researchers, and graduate students, we anticipate that such a phase will be invaluable in future versions of these standards. The value of the OEVT lies in its ability to detect flaws that arise during use. In computer security, many flaws arise because mechanisms are integrated into a single unit, and even if the mechanisms are secure, the integration may inject unanticipated errors and problems. Further, humans are imperfect; so, as they interact with systems, problems--including security flaws--become evident. Lastly, developers of security mechanisms often make assumptions about the environment and use of their systems that differ from the environment and use of their systems in practice. Red teams can exploit this discrepancy to find flaws that security analysts who examine the systems and source code cannot. Therefore, from the point of view of computer security, open ended vulnerability testing is a valuable and necessary addition to the voting system federal certification process. The federal certification process will be better able to catch unanticipated design and implementation vulnerabilities before these flaws become "headliners". This will improve the effectiveness of testing requirements that require source code, design, and document review. It can help to identify areas where procedural defenses are critical to system security. 
As a result, this may lead to an increase in voter confidence and a decrease in the discrepancies between the results of federal certification and state testing, ultimately reducing the need for testing at the state level. This can also provide a means for more thoroughly testing systems that are not software independent, should such systems be "grandfathered" for a period. Requiring software independence helps the red teams frame their analysis, because one part of their work can be to verify whether the system meets this requirement--for if it does, undetected software flaws become much less harmful. In fact, software independence provides assurance that the system will function correctly even if there are software flaws. Red teaming cannot establish the absence of flaws, merely their presence, and so software independence adds assurance that the voting system will meet its goals. Based on our experience with similar Red Teaming exercises (members of our group have participated in numerous Red Teaming exercises and security evaluations relevant to election security, including the 2007 FSU report and 2007 CA Top-to-Bottom review), we have enumerated several comments below. We often refer to the part of the test lab performing the OEVT review as the "Red Team," while referring to the rest of the staff as "the lab." We refer to those threats achieving goals which would fall under the focus of the Red Team (as defined in 5.4.1-B) as "5.4.1-B threats."
Comment by ACCURATE (Aaron Burstein) (Academic)
This section lists requirements for vulnerability testing including team composition, the scope of testing, resources made available, level of effort to be expended and the rules of engagement for evaluation of the system. The team make-up and qualifications requirements are designed such that the testers possess a high level of expertise. Contrary to some criticisms of the VVSG draft as having no requirements to define vulnerability testing, these concrete requirements help to define this type of testing. They should be included in the VVSG.
5.4.1 OEVT scope and priorities
5.4.1-A Scope of open-ended vulnerability testing
The scope of open-ended vulnerability testing SHALL include the voting system security during all phases of the voting process and SHALL include all manufacturer-supplied voting system use procedures.
DISCUSSION
The scope of OEVT includes but is not limited to the following:
- Voting system security;
- Voting system physical security while voting devices are:
- In storage;
- Being configured;
- Being transported; and
- Being used.
- Voting system use procedures.
Source: New requirement
1 Comment
Comment by Brit Williams (Academic)
This section should specify that the voting system under test should be installed in an election environment, including all of the procedural security features used during an actual election.
5.4.1-B Focus of open-ended vulnerability testing
OEVT Team members SHALL seek out vulnerabilities in the voting system that might be used to change the outcome of an election, to interfere with voters’ ability to cast ballots or have their votes counted during an election or to compromise the secrecy of the vote.
4 Comments
Comment by alan (General Public)
The OEVT team SHALL have at least one member (and not be the same person) with 6 or more years of experience in the area of software engineering, at least one member with 6 or more years of experience in the area of information security, at least one member with 6 or more years of experience in the area of penetration testing and at least one member with 6 or more years of experience in the area of voting system security. This requirement should not individualize the experience described above. These experience requirements should be contained by multiple team members.
Comment by Gail Audette (Voting System Test Laboratory)
Is the review of resumes to verify compliance of the test lab established OEVT members part of the test plan submitted to the EAC for approval (extension of Part 2 section 5.2-F)?
Comment by Brit Williams (Academic)
What is the rationale for requiring that the security team members have six years of experience and the election management team member have eight years of experience?
Comment by Matt Bishop, Mark Gondree, Sean Peisert, Elliot Proebstel (Academic)
Red Team member requirements 5.4.2-C requires the Red Team be composed of "at least one member with 6 or more years of experience in the area of voting system security." Under this requirement, no members of the groups involved in the CA Top-to-Bottom review, the Florida State report, the Johns Hopkins report, and the RABA report would qualify. Further, expertise in voting systems, while helpful, has not proven essential to previous work. This clause should probably be a recommendation and not a requirement, and change "6 years" to "3 years".
The OEVT team SHALL prioritize testing efforts based on:
- threat scenarios for the voting system under investigation;
- the availability of time and resources;
- the OEVT team’s determination of easily exploitable vulnerabilities; and
- the OEVT team’s determination of which exploitation scenarios are more likely to impact the outcome of an election, interfere with voters’ ability to cast ballots or have their votes counted during an election or compromise the secrecy of the vote.
DISCUSSION
Following are suggestions for OEVT prioritization in the areas of threat scenarios, COTS products and Internet based threats. The intent here is to provide guidance on how to prioritize testing efforts given specific voting device implementations.
- All threat scenarios must be plausible in that they should not be in conflict with the anticipated implementation, associated use procedures, the workmanship requirements in section 6.4 (assuming those requirements were all met) or the development environment specification as supplied by the manufacturer in the TDP;
- Open-ended vulnerability testing should not exclude those threat scenarios involving collusion between multiple parties including manufacturer insiders. It is acknowledged that threat scenarios become less plausible as the number of conspirators increases;
- It is assumed that attackers may be well resourced and may have access to the system while under development;
- Threats that can be exploited to change the outcome of an election and flaws that can provide erroneous results for an election should have the highest priority;
- Threats that can cause a denial of service during the election should be considered of very high priority;
- Threats that can compromise the secrecy of the vote should be considered of high priority;
- A threat of disclosure or modification of metadata (e.g., security audit log) that does not change the outcome of the election, does not cause denial of service during the election, and does not compromise the secrecy of the ballot should be considered of lower priority;
- If the voting device uses COTS products, then the OEVT team should also investigate publicly known vulnerabilities; and
- The OEVT team should not consider voting device vulnerabilities that require Internet connectivity for exploitation if the voting device is not connected to the Internet, either during the election or at any other time. However, if the voting device is connected to another device which in turn may have been connected to the Internet (as may be the case of epollbooks), Internet based attacks may be plausible and should be investigated.
Source: New requirement
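The priority guidance in the discussion above can be sketched as a small ranking routine. This is only an illustrative sketch, not part of the requirement: the names Threat, Priority, priority_of and prioritize, and the boolean flags on each threat, are hypothetical, and the four tiers simply mirror the wording of the discussion (outcome-changing threats highest, denial of service very high, secrecy high, metadata-only lower). Other factors named in the requirement, such as available time and ease of exploitation, are deliberately left out.

```python
from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    # Lower value = investigated earlier. Tier names follow the discussion text.
    HIGHEST = 0    # can change the outcome or produce erroneous results
    VERY_HIGH = 1  # can cause denial of service during the election
    HIGH = 2       # can compromise the secrecy of the vote
    LOWER = 3      # metadata-only disclosure or modification


@dataclass(frozen=True)
class Threat:
    # Hypothetical record; the flags are assumptions made for this sketch.
    name: str
    changes_outcome: bool = False
    denies_service: bool = False
    breaks_secrecy: bool = False
    needs_internet: bool = False


def priority_of(threat: Threat) -> Priority:
    """Map a threat to the most severe tier that applies to it."""
    if threat.changes_outcome:
        return Priority.HIGHEST
    if threat.denies_service:
        return Priority.VERY_HIGH
    if threat.breaks_secrecy:
        return Priority.HIGH
    return Priority.LOWER


def prioritize(threats, internet_reachable: bool):
    """Drop Internet-only threats when the device has no direct or
    indirect path to the Internet, then order the rest by tier."""
    in_scope = [t for t in threats if internet_reachable or not t.needs_internet]
    return sorted(in_scope, key=priority_of)
```

Calling prioritize with internet_reachable=True models the e-pollbook case in the last bullet, where an indirectly connected device keeps Internet-based attacks in scope.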
2 Comments
Comment by Harry VanSickle (State Election Official)
Please explain how "experts" will be identified for purposes of OEVT team composition. More specifically, please identify who makes that decision. The standards outline the requisite skills that are necessary to be an OEVT team member, but the standards are not clear regarding who determines whether a candidate meets the criteria.
Comment by U.S. Public Policy Committee of the Association for Computing Machinery (USACM) (None)
USACM Comment #31. 5.4.2-E OEVT team knowledge [incorrect] USACM recommends that the word "Complete" in items numbered a and b in this subsection be replaced by the word "Expert". DISCUSSION: No one, and no small team, can have "complete knowledge." A patently impossible requirement offers no guidance as to the expected level of knowledge and competence. On the other hand, "expert knowledge", while subjective, is a recognized standard.
5.4.2 OEVT resources and level of effort
The OEVT team SHALL use the manufacturer-supplied Technical Data Package (TDP) and User documentation, have access to voting devices configured similarly to how they are to be used in an election, and have access to all other material and tools necessary to conduct a thorough investigation.
DISCUSSION
Materials supplied to the OEVT team should include but not be limited to the following:
- Threat analysis describing threats mitigated by the voting system;
- Security architecture describing how threats to the voting system are mitigated;
- High level design of the system;
- Any other documentation provided to the testing laboratory;
- Source code;
- Operational voting system configured for election, but with the ability for the OEVT team to reconfigure it;
- Testing reports from the developer and from the testing laboratory including previous OEVT results;
- Tools sufficient to conduct a test lab build; and
- Procedures specified by the manufacturer as necessary for implementation and secure use.
Source: New requirement
1 Comment
Comment by Cem Kaner (Academic)
The task of the OEVT should not be to verify anything; it should operate from the assumption that what has been handed to it is defective. The OEVT is tasked with discovering new types of problems which were missed in conventional testing. Focusing the OEVT effort on the materials already relied on by the other testers is a good way to encourage OEVT failure. If you aren't going to unchain the OEVT team, don't waste the manufacturer's money on them. (Affiliation Note: IEEE representative to TGDC)
5.4.2-B Open-ended vulnerability team establishment
The test lab SHALL establish an OEVT team of at least 3 security experts and at least one election management expert to conduct the open-ended vulnerability testing.
2 Comments
Comment by Brit Williams (Academic)
The OEVT team should not have the authority to fail a voting system. The OEVT team should conduct their tests and submit to the VSTL a report of their findings. Similarly, the VSTL prepares a report of their tests, including the results of the OEVT, and submits their report to the EAC. The EAC is the only entity that has the authority to fail a voting system. Also, the OEVT should not be allowed to release any of their findings to any organization other than the VSTL. Ultimately, the EAC has the responsibility to determine which reports should be released to the public and which should remain proprietary in order to protect the intellectual property of the vendor or to protect the national security.
Comment by Premier Election Solutions (Manufacturer)
A system's security is based on its threat model. If the OEVT Team modifies a system's threat model with subjective and unsubstantiated threats, then any system would fail. If the guidelines require OEVT testing, then the guidelines should provide a threat model to guide manufacturers in the design of their security. Proposed Change: Provide a threat model in the guidelines and remove the ability for the OEVT team to modify that threat model except through formal requests for changes to revisions of the guidelines.
5.4.2-C OEVT team composition – security experts
The OEVT team SHALL have at least one member with 6 or more years of experience in the area of software engineering, at least one member with 6 or more years of experience in the area of information security, at least one member with 6 or more years of experience in the area of penetration testing and at least one member with 6 or more years of experience in the area of voting system security.
5.4.2-D OEVT team composition – election management expert
The OEVT team SHALL have at least one member with at least 8 years of experience in the area of election management.
DISCUSSION
The OEVT team will require consultation from an elections expert who is familiar with election procedures, how the voting systems are installed and used, and how votes are counted.
Source: New requirement
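The team establishment and composition requirements (5.4.2-B, 5.4.2-C and 5.4.2-D) lend themselves to a simple checklist. The following is a minimal sketch under stated assumptions: the Member record, its fields, and the composition_gaps function are hypothetical, and the sketch assumes that a single member may satisfy more than one experience requirement, which the requirement text does not settle.

```python
from dataclasses import dataclass, field

# Experience areas and thresholds taken from 5.4.2-C and 5.4.2-D.
SECURITY_AREAS = (
    "software engineering",
    "information security",
    "penetration testing",
    "voting system security",
)
MIN_SECURITY_YEARS = 6
MIN_ELECTION_YEARS = 8


@dataclass
class Member:
    # Hypothetical record; how a lab decides who counts as an "expert"
    # is outside this sketch and is simply recorded as a flag.
    name: str
    is_security_expert: bool = False
    is_election_expert: bool = False
    years: dict = field(default_factory=dict)  # area name -> years of experience


def composition_gaps(team):
    """Return a list of unmet requirements; an empty list means none were found."""
    gaps = []
    if sum(m.is_security_expert for m in team) < 3:
        gaps.append("fewer than 3 security experts (5.4.2-B)")
    if not any(m.is_election_expert for m in team):
        gaps.append("no election management expert (5.4.2-B)")
    for area in SECURITY_AREAS:
        if not any(m.years.get(area, 0) >= MIN_SECURITY_YEARS for m in team):
            gaps.append(f"no member with 6+ years in {area} (5.4.2-C)")
    if not any(m.years.get("election management", 0) >= MIN_ELECTION_YEARS for m in team):
        gaps.append("no member with 8+ years in election management (5.4.2-D)")
    return gaps
```

A lab that reads the requirement as calling for a distinct person per experience area would need a stricter check than this one.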
5.4.2-E OEVT team knowledge
The OEVT team knowledge SHALL include but not be limited to the following:
- Complete knowledge of work done to date on voting system design, research and analysis conducted on voting system security, and known and suspected flaws in voting systems;
- Complete knowledge of threats to voting systems;
- Knowledge equivalent to a Bachelor’s degree in computer science or related field;
- Experience in design, implementation, security analysis, or testing of technologies or products involved in voting system; and
- Experience in the conduct and management of elections.
5.4.2-F OEVT level of effort – test plan
In determining the level of effort to apply to open-ended vulnerability testing, the test lab SHALL take into consideration the size and complexity of the voting system; any available results from the "close ended" functional, security, and usability testing activities and laboratory analysis and testing activities; and the number of vulnerabilities found in previous security analyses and testing of the voting system and its prior versions.
5.4.2-G OEVT level of effort – commitment of resources
The OEVT team SHALL examine the system for a minimum of 12 staff weeks.
5.4.3 Rules of engagement
5.4.3-A Rules of engagement – context of testing
Open-ended vulnerability testing SHALL be conducted within the context of a process model describing a specific implementation of the voting system and a corresponding model of plausible threats.
DISCUSSION
The specification of these models is supported by information provided by the manufacturer as part of the TDP. See Requirement Part 2: 3.5.1.
Source: New requirement
5.4.3-B Rules of engagement – adequate system model
The OEVT team SHALL verify that the manufacturer-provided system model sufficiently describes the intended implementation of the voting system.
DISCUSSION
The manufacturer’s system model and associated documentation should reliably describe the voting system and all associated use procedures given the environment in which the system will be used.
Source: New requirement
5.4.3-C Rules of engagement – adequate threat model
The OEVT team SHALL verify that the threat model sufficiently addresses significant threats to the voting system.
DISCUSSION
Significant threats are those that could:
- Change the outcome of an election;
- Interfere with voters’ ability to cast ballots or have their votes counted during an election; or
- Compromise the secrecy of the vote.
The OEVT team may modify the manufacturer’s threat model to include additional, plausible threats.
Source: New requirement
5.4.4 Fail criteria
5.4.4-A OEVT fail criteria – violation of requirements
The voting device SHALL fail open-ended vulnerability testing if the OEVT team finds vulnerabilities or errors in the voting device that violate requirements in the VVSG.
DISCUSSION
While the OEVT is directed at issues of device and system security, a violation of any requirement in the VVSG can lead to failure. Following are examples of issues for which the test lab must give a recommendation of "fail":
- Evidence that any single person can cause a violation of a voting system security goal (e.g., integrity of election results, privacy of the voter, availability of the voting system), assuming that all other parties follow procedures appropriate for their roles as specified in the manufacturer’s documentation;
- Manufacturer's documentation fails to adequately document all aspects of system design, development, and proper usage that are relevant to system security. This includes but is not limited to the following:
- System security objectives;
- Initialization, usage, and maintenance procedures necessary for secure operation;
- All attacks the system is designed to resist or detect; and
- Any security vulnerabilities known to the manufacturer.
- Use of a cryptographic module that has not been validated against FIPS 140-2;
- Ability to modify electronic event logs without detection;
- A VVPR that has an inaccurate or incomplete summary of the cast electronic vote;
- Unidentified software on the voting system;
- Identified software which lacks documentation of the functionality it provides to the voting device;
- Access to configuration file without authentication;
- Ability to cast more than one ballot within a voting session;
- Ability to perform restore operations in Activated State;
- Enabled remote access in Activated State; and/or
- Ballot boxes without appropriate tamper evidence countermeasures.
Source: New requirement
6 Comments
Comment by Kevin Baas (Academic)
I think that instead of saying 3. All attacks the system is designed to resist or detect; and 4. Any security vulnerabilities known to the manufacturer. it would be better to know: 3. All attacks the system is NOT designed to resist or detect; and 4. Any security vulnerabilities NOT known to the manufacturer. to accomplish this, instead of - or rather, in addition to - listing these things it would be better to say "list all the points in the process and if each point has or does not have x. For instance, they might say "oh, we use an access database here, and that's pretty secure.. but fail to mention that it's not password protected, though it obviously could be without much effort. So there should be a list of all points and a checklist for each point, perhaps made by security experts. The goal, of course, is to have as complete a checklist as possible, so community input through something like a wiki would be very helpful in making the list more complete (by reducing the probability that an item is missed). I propose this section be amended with the following additions (or something similar which retains the principle of them): 13. All ballot/vote storage/transfer/processing points/channels in the voting, tabulation, and tabulation reporting system, in the order of flow. (i.e. following the vote from the point of entry to final certification.) And their physical and logical security environment. 14. "Point of entry" includes the software (if a computer-like system is used, such as a touchscreen voting machine) which first records the vote, and by implication, therefore, includes the source code to that software. The source code to the software that runs on the machine should be provided for review, and its vulnerabilities and so forth should be considered as with. Regarding point 13: security environment includes: a. hardware used b. logical access to hardware 1. local access security (password protected account login?, is a screen saver password used?, etc.) 
2. network access (through network, modem, etc. (including shared filesystems)) security as local access security, plus network-specific items such as firewalls etc. 3. surveillance (logging, transparency) c. physical access to hardware 1. who is authorized to use the equipment 2. what is required to get into the room where the hardware is stored? (security cards, keys, etc.) 3. who has these tools (to get into the room) and how are they secured? 4. surveillance - cameras, people, etc. b. software and protocols used c. encryption used or lack thereof d. password protection used or lack thereof Regarding point 14: By "computer-like system" I mean a system that includes a subsystem which is "turing complete" or near turing complete, such that its programmability allows its functionality to be substantially altered. For instance, this applies to systems where computer software is used in the vote entry process, such as "e-voting" machines, due to vulnerabilities inherent in programmable devices. First thing that needs to be cleared up once and for all, is that all claims that the computer program that records the vote is "proprietary software" are bogus, as: 1. Releasing the source code, privately or publicly, does not put the company at a competitive disadvantage. Writing a computer program to record a vote is trivial. (And thus fails to meet the requirements for a software patent.) (And by trivial I mean TRIVIAL.) The manufacturer is not liable to suffer any monetary damage from the private or public disclosure of the source code, as any competent programmer could produce software for recording a vote just as easily without it. Therefore, when it comes to recording a vote, having access to and/or using a competitor's source code would not provide a company with a competitive advantage (as that software is trivial to produce), and therefore would not put the source company at a competitive disadvantage. Any claims to the contrary are bogus. 
Any honest, competent, computer programmer will tell you this. 2. Releasing the source code, privately or publicly, does not subject the company to any potential monetary losses/damages, save those incurred from the consumer being informed of a potential hazard of the product (in this case, the hazard being a flawed election), which information a consumer is legally entitled to for their protection (in this case, the protection of their right to vote). 3. Releasing the source code, privately or publicly, does not create a security vulnerability, as: a. to write source code for a machine that interfaces with input/output devices and runs on the platform, one needs knowledge of the machine's _hardware_. b. to install software on a machine one needs physical access to the machine c. if such physical access necessary to install software on the machine were available to a non-specialist, that in itself would constitute a severe security vulnerability. (and therefore a failure of this test.) 4. Withholding the source code from third-party review constitutes a security vulnerability, as: a. without third-party review, the software could be written to do almost anything. It could change votes arbitrarily, or even just ignore votes altogether and report an arbitrary total. The total could be preprogrammed, or an outcome could be preprogrammed and a total calculated that reasonably matches the votes but guarantees a pre-programmed outcome. etc. etc. Without explicit software review there is absolutely NO protection against these threats. b. There remains, in fact, even after code review, a number of software-related threats, including but not limited to: 1. Installation testing: The threat that said software is not the software actually installed on the machines. To secure against this, reviewers must be able to install or observe the installation process. 2. 
Turing completeness: The threat that said software, after being installed on the machine, is not executed by the machine. (That instead, another program is run from previously installed software or from a hardware device such as Read Only Memory.) To make sure that the system is actually running software when it is installed on it, reviewers must be able to provide different programs on the machine and see if the machine executes those instructions and produces the predicted result. (I.e. the machine should be able to demonstrate turing completeness.) 3. Timed code: a system may be set up such that failure scenarios 1 and 2 above may occur only during pre-selected times, such as only on election day. A tester/reviewer must be able to roll forward/back any and all system/internal/connected clocks to an arbitrary date/time, and test the system while it is in a state such that all relevant hardware and software thinks that it is the date/time that the system has been rolled forward (or back) to. The system should be tested in this manner for the date/time that it is supposed to function properly during (such as election day and the day after). c. and as well, the threat of tampering with electronic data, which is much greater than that of tampering with data stored physically, such as on a piece of paper, because data stored electronically is easier to access and alter. To protect against this increased threat, electronic voting systems must provide: 1. Methods, procedures, equipment, storage, and retrieval mechanisms for backing up and securing voting records for later review & verification until a time to be determined by law should be in place and tested. 2. Methods, procedures, equipment, etc. for preventing the digital tampering of voting records should be in place, reviewed, and tested. Such methods include: a. 
distributed backup/parity check: multiple copies of the data are distributed to different physical locations, such that at a later date they can be compared against each other for discrepancies. (which would constitute evidence of tampering) b. public key encryption: this protects against tampering by making the data essentially read-only. The data is encrypted with an asymmetric encryption scheme, and the decryption key is made public, while the encryption key is kept private, perhaps even randomly generated by the machine at election time, and then wiped from computer memory when all the votes have been recorded, encrypted and transferred. The corresponding decryption key is stored on the machine so that people and widely distributed as soon as possible to prevent tampering (via distributed backup/parity check). In asymmetric encryption, the encryption key cannot decrypt, and the decryption key cannot encrypt. This provides read-only functionality because only those with the encryption key can create data that can be decrypted by the decryption key. Data that cannot be decrypted by the decryption key was not encrypted by the encryption key, and was therefore clearly either "tampered with" after being encrypted, or was not from a person or machine that has the encryption key. There are two weaknesses to this (besides the strength of the encryption): 1. you have to make sure you have the right decryption key (the one corresponding to the one that the desired data was encrypted in, and not some look-alike data that someone else made up and encrypted) Perhaps the machine could produce a key pair right before election, and distribute the public key, and a trial run could be done to ensure that that is in fact, the key that the machine is using. 2. you have to make sure that the encryption key is kept secure, or even "thrown away" immediately after use, so that people can't generate fake voting records that pass the decryption test. c. digital signing. 
Similar to asymmetric encryption, this helps ensure that the data is from a trusted source.
Comment by Gail Audette (Voting System Test Laboratory)
All requirements within the VVSG are being tested and the voting system is verified to be in compliance (Parts 1 and 2): however, this requirement is to again fail the voting system if any of those requirements are not met. This is already a basis for voting system certification and not enhanced by the OEVT.
Comment by David Beirne, Executive Director, Election Technology Council (Manufacturer)
Despite assurances that the OEVT would not result in a "failing" mark, this section speaks to a troublesome feature of the new VVSG. The fail criteria reveal the redundant nature of the OEVT as it does not consider the role of the VSTL itself during this process.
Comment by Matt Bishop, Mark Gondree, Sean Peisert, Elliot Proebstel (Academic)
The low priority assigned to the goals in 5.4.1-C Discussion point 7 seems to contradict 5.4.4 part 4 & 5, which state that "the lab must give a recommendation of 'fail'" when it is possible to modify logs and cause incomplete audits to be generated.
Comment by U.S. Public Policy Committee of the Association for Computing Machinery (USACM) (None)
USACM Comment #32. Section 5.4.4-A OEVT Fail Criteria - Failure Interpretation [incomplete] USACM recommends that the following new subsection, with the title above, be added to Section 5.4 as follows: Software testing, including open-ended testing, cannot demonstrate the absence of flaws. Thus, its contribution to the certification process is twofold: a. A final filter to prevent faulty voting system software from achieving certification b. Detect vendors whose development processes are not sufficiently mature to consistently produce high assurance products. The OEVT team should consider a final finding of "failure" to indicate a need to redesign the system or the system testing strategy. DISCUSSION: There is no software testing regimen that can claim comprehensive fault detection. Thus, the best that an OEVT team can hope to do is (1) Detect well-known faults left as a result of immature development processes and (2) Detect subtle faults that the team’s specific skill sets enable them to find and that routine or even mature development processes may not prevent or detect. During deliberations, the OEVT team must assess the vulnerabilities as they apply relative to vendor prescribed procedures. Fail criteria must reflect whether an attack based on the identified vulnerability would be likely to occur, succeed, and escape detection.
Comment by ACCURATE (Aaron Burstein) (Academic)
This section further defines concrete requirements for vulnerability testing by specifying the fail criteria for vulnerability tests: a system can fail if 1) the vendor's system, in conjunction with use procedures and security controls, does not adequately mitigate significant threats (Part 3:5.4.4-B); or 2) found vulnerabilities could be used to: "change the outcome of an election, interfere with voters' ability to cast ballots or have their votes counted during an election, or compromise the secrecy of vote [...]" (Part 3:5.4.4-C). Thus, this section should be adopted by the EAC.
Voting systems SHALL fail open ended vulnerability testing if the manufacturer’s model of the system along with associated use procedures and security controls does not adequately mitigate all significant threats as described in the threat model.
DISCUSSION
Team may use a threat model that has been amended based on their findings in accordance with 5.4.3-C.
Source: New requirement
2 Comments
Comment by Brit Williams (Academic)
This section should be deleted. The VSTL will determine whether or not a voting system fails certification testing based on the totality of their findings, including the report from the OEVT.
Comment by ACCURATE (Aaron Burstein) (Academic)
ACCURATE's comments to Part 3:5.4.4-B (including the recommendation to adopt) apply equally to this requirement.
5.4.4-C OEVT fail criteria – critical flaws
The voting device SHALL fail open ended vulnerability testing if the OEVT team provides a plausible description of how vulnerabilities or errors found in a voting device or the implementation of its security features could be used to:
- Change the outcome of an election;
- Interfere with voters’ ability to cast ballots or have their votes counted during an election; or
- Compromise the secrecy of vote,
without having to demonstrate a successful exploitation of said vulnerabilities or errors.
DISCUSSION
The OEVT team does not have to develop an attack and demonstrate the exploitation of the vulnerabilities or errors they find. They do, however, have to offer a plausible analysis to support their claims.
Source: New requirement
4 Comments
Comment by Brit Williams (Academic)
This section should be deleted. The VSTL will recommend that the system pass or fail based on the totality of the certification testing, including the report from the OEVT.
Comment by David Beirne, Executive Director, Election Technology Council (Manufacturer)
"The voting device SHALL fail open ended vulnerability testing if the OEVT team provides a plausible description of how vulnerabilities or errors found in a voting device or the implementation of its security features" This section should be stricken as it is too permissive. The OEVT is essentially saying that the security of a voting system doesn't actually have to be penetrated; only a "plausible description" of how the penetration would occur is needed. The bar for failing a voting system has been set so low that the OEVT will remain the final arbiter for the certification of a voting system, based on a subjective review that requires only a description of events that may cast doubt on the system's security.
Comment by ACCURATE (Aaron Burstein) (Academic)
ACCURATE's comments to Part 3:5.4.4-B (including the recommendation to adopt) apply equally to this requirement.
Comment by U.S. Public Policy Committee of the Association for Computing Machinery (USACM) (None)
USACM Comment #33. Section 5.4.4-C OEVT Fail Criteria - Critical Flaws [incorrect] USACM recommends that subsection 5.4.4-C be modified as follows: The voting device shall fail open-ended vulnerability testing if the OEVT team demonstrates one or more critical flaws that allow an attacker to violate VVSG requirements as specified in paragraph 5.4.4-A above, under a plausible description of how vulnerabilities or errors found in a voting device or the implementation of its security features are used to: a. Change the outcome of an election; b. Interfere with voters’ ability to cast ballots or have their votes counted during an election; or c. Compromise the secrecy of vote without having to demonstrate a successful exploitation of said vulnerabilities or errors. Potential vulnerabilities for which no exploit is demonstrated may be noted as observations, but may not rise to the level of findings. DISCUSSION: OEVT failure is a serious event that may have severe financial ramifications. Thus, it cannot be justified by hypothetical attacks. OEVT testers must be held to high scientific standards that can only be reflected by the three level process of: a. Detecting vulnerability b. Envisioning an exploit for each identified instance and by c. Demonstrating each envisioned attack under plausible conditions.
5.4.5 OEVT reporting requirements
5.4.5-A OEVT reporting requirements
The OEVT team SHALL record all information discovered during the open-ended vulnerability test, including but not limited to:
- Names, organizational affiliations, summary qualifications, and resumes of the members of the OEVT;
- Time spent by each individual on the OEVT activities;
- List of hypotheses considered;
- List of hypotheses rejected and rationale;
- List of hypotheses tested, testing approach, and testing outcomes; and
- List and description of remaining vulnerabilities in the voting system:
  - A description of each vulnerability including how the vulnerability can be exploited and the nature of the impact;
  - For each vulnerability, the OEVT team should identify any VVSG requirements violated; and
  - The OEVT team should flag those vulnerabilities as serious if the vulnerability can result in the violation of one or more VVSG requirements; a change of the outcome of an election; or a denial of service (lack of availability) during the election.
DISCUSSION
Examples of the impact of an exploited vulnerability are over-count of ballots for a candidate; undercount for a candidate; very slow response time during election; erasure of votes; and lack of availability of the voting device during election.
Source: New requirement
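The record-keeping in 5.4.5-A can be sketched as a simple data structure. This is an illustration only, under stated assumptions: the class and field names (`Vulnerability`, `OEVTReport`, `could_change_outcome`, and so on) are hypothetical and do not appear in the VVSG, and the `serious` property encodes one reading of the flagging rule in the last list item above.

```python
from dataclasses import dataclass, field


@dataclass
class Vulnerability:
    """One remaining vulnerability, following the sub-items of 5.4.5-A."""
    description: str                 # what the vulnerability is
    exploit_path: str                # how it can be exploited
    impact: str                      # nature of the impact
    violated_requirements: list[str] = field(default_factory=list)
    could_change_outcome: bool = False   # could change an election outcome
    could_deny_service: bool = False     # lack of availability during election

    @property
    def serious(self) -> bool:
        # Flag as serious if any one of the three listed conditions holds.
        return (
            bool(self.violated_requirements)
            or self.could_change_outcome
            or self.could_deny_service
        )


@dataclass
class OEVTReport:
    """Hypothetical container for the items the OEVT team must record."""
    team_members: list[str]
    hours_by_member: dict[str, float]
    hypotheses_considered: list[str]
    hypotheses_rejected: dict[str, str]   # hypothesis -> rationale
    hypotheses_tested: dict[str, str]     # hypothesis -> approach and outcome
    vulnerabilities: list[Vulnerability] = field(default_factory=list)

    def serious_vulnerabilities(self) -> list[Vulnerability]:
        return [v for v in self.vulnerabilities if v.serious]
```

A vulnerability with no violated requirement, no outcome change, and no denial of service would still be listed in the report but would not carry the serious flag.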
5 Comments
Comment by Brit Williams (Academic)
This section needs to be expanded to state that these results will be contained in a report presented to the VSTL and, furthermore, the OEVT will not release any portion of this report to any organization other than the VSTL.
Comment by Harry VanSickle (State Election Official)
Please explain how the results of testing and threat vulnerability will be disseminated to state and county election officials. Access to this information is a critical element for the state’s own certification process. There should be a section that clearly outlines how and when the results of OEVT testing will be made available. Moreover, state and county election officials should have access to the actual OEVT report, not just a condensed report by EAC. Please see our suggestion for language below. Within six (6) weeks after testing is complete at the federal level, EAC shall provide to state election officials a copy of any and all OEVT reports for the voting system.
Comment by Cem Kaner (Academic)
The more reporting, the less testing. If you want time-constrained testing to yield worthwhile test results, most of the time has to be spent on testing (imagining risks, designing tests, implementing tests, executing tests, and evaluating the results). Documentation time is on top of this, and time spent on it subtracts from the time available for the testing. (Affiliation Note: IEEE representative to TGDC)
Comment by Matt Bishop, Mark Gondree, Sean Peisert, Elliot Proebstel (Academic)
Focus on defense in depth: We approve of the view that the Red Teaming exercises must consider the entire system, including physical security and security procedures, as a whole. But the current version of the OEVT section strongly implies that, when procedures ameliorate potential threats to the technological parts of the voting system, the weaknesses in the technology would not be considered flaws, and would go unreported. Thus, the reports from the OEVT would not provide enough information for election officials to know the consequences of failing to follow procedural defenses that mitigate technical flaws, or to determine whether the procedural defenses are appropriate for their locality. In our experience, the fundamental principle of defense-in-depth is best captured by layers of procedural and technical defenses. We would like to see the reporting requirements re-written as follows: Reporting requirements: Include in reporting requirements a list and description of any flaws in the voting system that are remediated by procedures, full descriptions of the associated procedures, and the consequences of not following those procedures. We feel such a discussion would be useful to election officials as they integrate the system's use procedures into their local procedures. In this process, the intention behind a specific procedure may be misunderstood (especially those procedures which serve multiple purposes) and integrated into local procedures in a way that does not address all the vulnerabilities. A discussion of those system threats for which there are no, or very few, technological defenses in place would better assist officials during system integration.
Comment by U.S. Public Policy Committee of the Association for Computing Machinery (USACM) (None)
Structured Note-taking [incomplete] USACM recommends adding a paragraph 5.4.5-B as follows: 5.4.5-B. OEVT team process documentation requirement. Each OEVT team will conduct structured note-taking during the analysis. Where possible, all notes will be shared among team members during the entire review period, but must be shared by all members during deliberations, before the final report is prepared. These structured notes become part of the team product and must be delivered along with the OEVT final report. DISCUSSION: It is difficult to overstate the value of structured note-taking during the review process and making the notes database a work-product of each review. The level of continuity it provides between reviews justifies including it as a VVSG requirement. There are also two other benefits that may be equally as important: 1. Process Improvement. Understanding the details of the process that each team goes through can be a gold mine of best practices. 2. Accountability. OEVT is critically dependent on the skill and knowledge of the investigators. Structured note taking provides an avenue to analyze the team’s effort.
5.4.6 VSTL response to OEVT
The VSTL SHALL examine the OEVT results in the context of all other security, usability, and core function test results and update their compliance assessment of the voting system based on the OEVT.
DISCUSSION
The testing laboratory should examine each vulnerability that could result in the violation of one or more VVSG 2007 requirements; a change of the outcome of an election; or a denial of service (lack of availability) during the election and use the information to form the basis for non-compliance. If significant vulnerabilities are discovered as a result of open-ended vulnerability testing, this may be an indication of problems with test lab procedures in other areas as well as voting system design or implementation.
Source: New requirement
2 Comments
Comment by Kevin Wilson (Voting System Test Laboratory)
This section implies the OEVT team is not necessarily a part of the VSTL. Can the OEVT be members of the VSTL?
Comment by U.S. Public Policy Committee of the Association for Computing Machinery (USACM) (None)
USACM Comment #35. Section 5.4.6. VSTL Response to OEVT [incomplete] USACM recommends changing the first full sentence in Section 5.4.6-A to read: "The VSTL SHALL: 1. Forward the OEVT results to the VSTL licensing authority for their use in assessing vendor development process maturity and to assess potential corrective action; and 2. Examine the OEVT results in the context of all other security, usability, and core function test results and update their compliance assessment of the voting system based on the OEVT." DISCUSSION: The addition of requirement one will encourage feedback to testing lab authorities and the Election Assistance Commission about issues, errors and anomalies uncovered during the testing process that are not connected to specific requirements of the VVSG. Without a feedback process for problems outside the terms of the VVSG, the testing process would be subject to the voting systems equivalent of teaching to the test — covering only those items outlined in the test, and ignoring anything else — regardless of how it could influence voting, voting administration and elections.