Tagged election technology

More on CyberScoop Coverage of Voting Machine Vulnerabilities

CyberScoop’s Chris Bing wrote a good summary of the response to Cylance’s poorly timed announcement of old news on voting machine vulnerabilities: Security Firm Stokes Election Hacking Fears.

I have a couple of details to add, but first let me reiterate that the system in question does have vulnerabilities which have been well known for years, and reference exploits are old news. Sure, Cylance techs did write some code to create a new variant on previous exploits, but as Princeton election security expert Andrew Appel noted, the particular exploit was detectable and correctable, unlike some other hacks.

Regardless of whether Cylance violated the unwritten code of reporting only on new vulnerabilities, and regardless of good intentions vs. fear-mongering effects, the basic premise is wrong.

You can’t expect election officials to modify critical voting systems in response to a blog. In fact, election officials should not be modifying software at all, and should modify hardware only for breakage replacement.

Perhaps the folks at Cylance didn’t know that there are very special and very specific rules for modifying voting systems. Here are 5 details about how it really works:

  • The hardware and software of voting systems is highly regulated, and modifications can only be done following regulatory review.
  • Even if this were a new vulnerability, and even if there were what some would claim is an easy fix, it would still require the vendor to act, not the election officials. Vendors would have to make the fix, redo their testing, then re-engage an accredited test lab for testing (at the vendor’s expense), and then go back to government certification of the test lab’s findings.
  • Election officials are barred from “patching” or any kind of unsupervised modification. This makes a lot of sense, if you think about it: someone representing the vendor wants to modify these systems, while each of 10,000+ local election bodies is supposed to ensure that only the legitimate changes happen? That’s not feasible, even if it were legal.
  • Local election officials are required to do pre-election testing for machines’ “logic and accuracy,” and they must not use machines that have not passed such testing, which in some localities must also be signed off by an elections board. Making even a legitimate certified change to a system 4 days before an election would invalidate it for use on election day. And consider early voting: in practice, it has been many weeks since modifications of any kind were allowed.
  • So there is no way that a disclosure like this, with this timing, could ever be viewed as responsible by anyone who understands how voting tech is regulated and operated. I expect that it didn’t occur to the Cylance folks that there might be special rules about voting systems that would make disclosures 4 days before, or even 4 weeks before, completely impractical for any benefit. But regardless of a possible upside, it ought to have been clear that there is considerable downside to fear-mongering about the integrity of an election a mere few days before election day, especially this one.

And that would still be the case if this were a new finding.  Which it isn’t.

Making a new variant exploit on a vulnerability well known for some time is just grandstanding, and most responsible security folks steer clear of that to maintain their reputation.  I can’t fathom why Cylance in this case behaved so at variance with the unwritten code of ethical vulnerability research. I hope it was just impulsive behavior based on a genuine concern about the integrity of our elections.  The alternative would be most unfortunate.

— John Sebes, CTO

Old School, New Tech: What’s Really Behind Today’s Elections

Many thanks to coverage by Bloomberg’s Michaela Ross, on election tech and cyber-security.

Given so much at stake for this election with its credibility rocked by claims of rigging, and so much more at stake as we move ahead to replace and improve our election infrastructure, I’m rarely enthused about reading more about how some people think Internet voting is great, and others think it is impossible.  However, Ms. Ross did a great job of following that discussion about how “Old School May Be Better” with supporting remarks from many longtime friends and colleagues in the election administration and technology worlds.

Where I’d like to respond is to re-frame the “old” part of “old school” and to reject one remark from a source that Ross quoted: “They’re pretending what we do today is secure … There’s not a mission critical process in the world that uses 150-year-old technology.” Three main points here:

  1. There is plenty of new technology in the so-called old school;
  2. No credible election expert pretends that our ballots are 100% secure, not even close; and
  3. That’s why we have several new and old protections on the election process, including some of that new technology.

Let me address that next in three parts, mostly about what’s old and what’s new, then circle back to the truth about security, and lastly a comment on iVoting that I’ll mostly defer to a later re-up on the iVoting scene.

Old and New

Here is what’s old: paper ballots. We use them because we recognize the terrible omission in voting machines, from the late-19th-century mechanical lever machines (which could be hacked with toothpicks, tampered with using screwdrivers, and retained no record of any voter’s intent other than numbers on odometer dials) to many of today’s paperless touchscreens: “hack-able” and “tamper-able” even more readily, and likewise with no actual ballot other than bits on a disk. We use paper ballots (or paper-added touchscreens as a stop-gap) because no machine can be trusted to accurately record every voter’s intent. We need paper ballots not just for disputes and recounts, but fundamentally as a way to cross-check the work of the machines.

Here’s what’s new: recently developed statistical methods to conduct a routine ballot audit for every election, to cross-check the machines’ work, with far less effort and cost than today’s “5% manual count and compare” and the variant methods used in some states. It’s never been easier to use machines for rapid counts and quick unofficial results, and then (before final results) to detect and correct instances of machine inaccuracies, whether from bugs, tampering, physical failure, or other issues. It’s called a risk-limiting audit, or RLA.
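To make the cross-checking idea concrete, here’s a toy sketch in Python. This is not a statistically rigorous RLA: a real audit derives its sample size from the risk limit and the reported margin, and draws ballots using a publicly verifiable random seed. All names and data here are made up for illustration.

```python
import random

def sample_and_compare(cvrs, hand_read, sample_size, seed=2016):
    """Draw a random sample of ballot IDs and compare the machine's
    cast vote records (cvrs) against a human reading of the same
    paper ballots; return the IDs where the two disagree."""
    # Fixed seed for the sketch; a real audit uses a public random ceremony.
    rng = random.Random(seed)
    sample = rng.sample(sorted(cvrs), sample_size)
    return [bid for bid in sample if cvrs[bid] != hand_read[bid]]

# Toy election of 100 ballots; machine and humans agree except on ballot 7.
cvrs = {i: "Candidate A" if i % 2 else "Candidate B" for i in range(100)}
hand_read = dict(cvrs)
hand_read[7] = "Candidate B"   # the machine misread this one ballot

discrepancies = sample_and_compare(cvrs, hand_read, sample_size=20)
```

The point of the paper ballot is exactly that `hand_read` exists at all: without it, there is nothing independent of the software to compare against.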

Here’s what’s new-ish: the new standard approach is for paper ballots to be rapidly machine-counted using optical scanners and digital image processing software. There are a lot of old, clunky, and expensive (to buy, maintain, and store) op-scanners still in use, but this isn’t “150 years old,” any more than our modern ballots are like the old 19th-century party-machine-politics balloting, rife with the fraud that led to the desire for the old lever machines. However, these older machines have low to no support for RLA.

Here’s what’s newer: many people have mobile computers in their pocket that can run optical-capture and digital image processing. It’s no longer a complicated job to make a small, inexpensive device that can read some paper, record what’s on it, and retain records that humans can cross check. There’s no reason why the op-scan method needs to be old and clunky. And with new systems, it is easy to keep the type of records (technically, a “cast vote record” for each ballot) needed for easy support for RLA.

And finally, here’s the really good part: innovation is happening to make the process easier and stronger, both here at the OSET Institute and elsewhere ranging from local to state election officials, Federal organizations like EAC and NIST, universities, and other engines of tech innovation. The future looks more like this:

  • Polling place voting machines called “ballot marking devices” that use a familiar inexpensive tablet to collect a voter’s ballot choices, and print them onto a simple “here’s all and only what you chose” ballot to be easily and independently verified by the voter, and cast for optical scanning.
  • Devices and ballots with professionally designed and scientifically tested usability and accessibility for the full range of voters’ needs.
  • Simple inexpensive ballot scanners for these modern ballots.
  • Digital sample ballots using the voter’s choice of computer, tablet, or phone, to enable the voter to take their own time navigating the ballot, and creating a “selections worksheet” that can be scanned into a ballot marking device to confirm, correct if needed, and create the ballot cast in a polling place; or that can be used in a vote-by-mail process, without the need to wait for an official blank ballot to arrive in the mail.
  • And below that tip of the iceberg for the critical ballot-related operations, there is a range of other innovations to streamline voter registration, voter check-in, absentee ballot processing, voter services, and apps to navigate the whole process and avoid procedural hurdles or long lines, plus interactive election results exploration and analytics, and more; all with the ability for election officials to provide open public data on the outcome of the whole election process, and every voter’s success in participation or lack thereof.

That’s a lot of new tech that’s in the pipeline or in use already, but still in the old school.

Finally, two last points to loop back to Michaela’s article.

Election Protection in the Real World

First, everyone engaged in elections knows that no method of casting and counting ballots is perfectly secure.

  • Vote by mail ballots go to election officials by mail passing through many hands, not all of which may seem trustworthy to the voters.
  • Email ballots and other digital ballots go to election officials via the Internet — again via many “virtual hands” that are definitely not trustworthy — and to computers that election officials may not fully control.
  • Polling place ballots in ballot boxes are transported by mere mortals who can make mistakes, encounter mishaps, and as in a very few recent historical cases, may be dishonest insiders.
  • Voting machines are easily tampered with by those with physical access, including temp workers and contractors in warehouses, transportation services, and pre-election preparations.
  • The “central brains” behind the voting machines are often ordinary antique PCs with no real protection in today’s daunting threat environment.
  • The beat goes on with voter records systems, electronic poll books, and more.

That’s why today’s election officials work so hard on the people and processes to contain these risks, and retain control over these vital assets throughout a complex process that — honestly, going forward — could be a lot simpler and easier with innovations designed to reduce the level of effort and complexity of these same type of protections.

The Truth About iVoting Today

Secondly, lastly, and mostly for another time: Internet voting. It’s desirable, it will likely happen someday, and it will require a solid R&D program to invent the tech that can do the job with all the protections — whether against fraud, coercion, manipulation, or accidental or intentional disenfranchisement — that we have today in our state-managed, locally-operated, and (delightfully but often frustratingly) hodge-podge process of voting in 9,000+ jurisdictions across the US.  I repeat, all, no compromises; no waving the magic fairy wands of trust-me-it-works-because-it-is-cool or blockchains or so-called “military grade” encryption or whatever the latest cool geek cred item is.

In the meantime, we have to shore up the current creaky systems and processes, especially to address the issues of “rigging,” and the crazy amount of work election professionals have to do to get the job done and maintain order and trust.

And then we have to replace the current systems in the existing process with innovations that also serve to increase trust and transparency. If we don’t fix the election process that we have now, and soon, we risk the hasty addition of i-voting systems that are just as creaky, flawed, hastily adopted, and poorly understood as the paperless voting machines that were adopted more than a decade ago.

We can do better, in the short-term and long, and we will.  A large and growing set of election and technology folks, in organizations of many kinds, are dedicated to making these improvements happen, especially as this election cycle has shown us all how vitally important it is.

— John Sebes

Showtime: OSET/TrustTheVote Project Appearing at DNC Convention Strategic Forum Event

(This is a cross-post from Ms. Voting Matters’ announcement on the OSET Institute’s corporate site.)

We are totally excited about an amazing opportunity tomorrow, Tuesday July 26th, to appear at an event as part of the Democratic National Convention.

The only thing that would make this truly complete is if the Republican National Convention had also invited us (we asked, and although we’re pleased to be working with several in the RNC infrastructure, making something happen was not possible.)

But the New Democrat Network (NDN) and the New Policy Institute did reach out to us and invited us to their premier Strategy Forum, now being held at its 4th Democratic National Convention.  So, we’re focused on presenting to an audience estimated to exceed 1,000, per the latest projections based on RSVPs as of yesterday (over 900).  This is truly an amazing opportunity for us to spread the story of our work, and we’re deeply appreciative of the NDN’s invitation.

The event, “Looking Ahead: Talks on the Future of America and American Politics” is bringing together a collection of amazing thought-leaders on the future and innovation of democracy including experts such as Ari Berman, Alec Ross, Joel Gamble, Jose Antonio Vargas, and others.

The title of our presentation is: “Modernizing Our Election Technology Can Make Our Democracy Better.”

This will not be telecast, although we’re still awaiting word about a webcast, video stream, or recording of the sessions.  We’ll update this as soon as we know.

However, part of our presentation will be the launch of a new 2-minute video vignette about the looming problem of obsolete voting machinery and our approach to helping bring about innovation that will increase integrity, lower costs, improve participation, and rejuvenate a flagging industry with new technology to innovate the business of delivering finished voting systems. That video will be available on YouTube tomorrow afternoon, and we will add a comment to this post and update it accordingly.

OSET’s Director of Citizen Outreach, Meegan Gregg, and the Foundation’s Co-Founder, Gregory Miller, will deliver this TED-Talk-like presentation at 12:20pm EDT at the Convention Center in Philadelphia.  It should be a great time and a huge (oops) opportunity.

State Certification of Future Voting Systems — 3 Points of Departure

In advance of this week’s EVN Conference, we’ve been talking frequently with colleagues at several election-oriented groups about the way forward from the current voting system certification regime. One of the topics for the EVN conference is a shared goal for many of us: how to move toward a near-future certification regime that can much better serve state election officials in states that want more control, customization, and tailoring of the certification process, to better serve the needs of their local election officials.

Read more

Future of Voting Systems: Future Requirements (Part 1)

For this first of several reports from the NIST/EAC Future of Voting Systems Symposium II, some readers of my recent report on standards work may heave a sigh of relief that I’m not doing a long post that’s a laundry list of topics. However, I will be doing a series of posts on one part of the conference: a session held by the EAC staff who run the voting systems certification program, which relies on a “guidelines” document that is actually a complex set of standards that voting systems have to meet in order to get certified.

The reason that I am doing a series of posts is that the session was on a broad topic: if you were able to write a whole new requirements document from scratch, oriented to future voting systems, not required to support the existing certification program backwards-compatibly, then … what would you put in your hypothetical standards for each of several topics? Not surprisingly, I and my colleagues at TrustTheVote (and like-minded folks in the election world more broadly) have some pretty clear views on many areas. As promised to the folks running this session, I’ll be using this blog to document more fully the recommendations we discussed, informed (with thanks) by the views of others at this conference. But I’ll be doing it in chunks over time, because I don’t think anybody wants a tome here. 🙂

The Fork in the Road

The zeroth recommendation — that is, before getting to any of the topics requested! — is about the overall scope of a future standard. In the decade or so since the current one was developed (and even more years back to the earlier versions), things have changed a lot in the election tech world, and change is accelerating. We are no longer in the stage of “wow, that hanging chad fiasco was horrible, we need to replace them fast with computerized voting machines.” We’ve learned a lot. And one of the biggest learnings is that there is a huge fork in the road, which affects nearly all the requirements that one might make for voting systems. That’s what I want to explain today, in part because it was a good chunk of the discussions at the conference.

The fork in the road is this: you either have a voting system that supports evidence-based election results, or you don’t.

In this context, evidence-based means that the voting system produces evidence of its vote tallies that can be cross-checked by humans — and this is the important part — without having to trust or rely on software in any way. That’s important, because as we know, software is not and can never be perfect or trustworthy. In practice, what this means is that for each voter, there is a paper ballot that can be counted directly by people conducting a ballot audit. The typical practice is to take a statistically significant group of ballots for which we have machine count totals — typically a whole precinct in practice today — and manually count them to see if there is any significant variance between human and machine count that could indicate the machine count (or the human audit) had some errors. The process for resolving the rare variances is a larger topic, but the point here is that the process provides assurance of correct results without relying on computers working perfectly all the time.
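The human-vs-machine comparison described above can be sketched mechanically. The data and function names below are hypothetical, not any official audit procedure; the point is just that the check needs no trust in the counting software:

```python
def flag_variances(machine_totals, hand_totals, tolerance=0):
    """Compare machine tallies with hand-count tallies, precinct by
    precinct, and return the per-candidate differences that exceed
    the tolerance (machine count minus hand count)."""
    flagged = {}
    for precinct, machine_count in machine_totals.items():
        hand_count = hand_totals[precinct]
        diffs = {cand: machine_count[cand] - hand_count.get(cand, 0)
                 for cand in machine_count
                 if abs(machine_count[cand] - hand_count.get(cand, 0)) > tolerance}
        if diffs:
            flagged[precinct] = diffs
    return flagged

# P-1 matches exactly; P-2's machine count for A is two ballots short.
machine = {"P-1": {"A": 410, "B": 388}, "P-2": {"A": 250, "B": 265}}
hand    = {"P-1": {"A": 410, "B": 388}, "P-2": {"A": 252, "B": 265}}
```

Resolving a flagged variance (expand the hand count, or escalate to a full recount) is the larger process the text alludes to.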

That’s not the only way to build a voting system, and it’s not the only way to run an election. And in the U.S., our state and local election officials have choices. But many of them do want paper-based processes, to complement modern use of ballot marking devices for accessibility, ballot counters, ballot on demand, ballot readers for those with impaired vision, and a host of technical innovations emerging, including such things as on-boarding processes at polling places, interactive sample ballots for home use, and more. And for those election officials, the evidence-based voting systems have some important requirements.

The Harder Path

But let’s respect the other path as well, which includes a lot of paperless DRE voting machines still in use (and also some internet-voting schemes that several elections orgs are experimenting with). A lot of voters use these older systems. But there is a big difference in the requirements. Indeed, the bulk and complexity of the early requirements standard (and its larger 10-year-old successor) is due to trying to encompass early DRE based systems. Because these systems placed complete reliance on computers, the current requirements include an enormous amount of attention on security, risk management, software development practices, and more, all oriented to helping vendors build systems that would to the extent possible avoid creating a threat of “hacked elections.”

In fact, if you read it now, it looks like a doc from 2004 or so that dropped through a time warp; and it reads pretty well as good advice for its time, on how to use then-current software and systems to — pardon me for the vernacular — “create a system that is not nearly as easily hacked as most stuff being made now.” (This was in the Windows XP days, recall.)

I suppose that some updated version of these requirements will be appropriate for future non-evidence-based voting systems. It will take a while to develop; it will be a bit dated by the time it is approved; and its use in voting system development, independent testing, and certification will be about as burdensome as what we’ve seen in recent years. It has to be done, though, because the risks are greater now than ever, given that the expertise of cyber-adversaries continues to expand beyond the ability of the most sophisticated tech orgs to match.

The Road Not Taken?

So my 0th recommendation is: do not apply these existing standards to evidence-based voting systems. I’d almost like to see the new standard in two volumes – one for evidence-based systems and one for others. It would just be a crazy waste of people’s time, effort, energy, and ingenuity to apply such burdensome requirements to evidence-based systems, and ironic too: evidence-based voting systems are specifically defined to entirely avoid many risks — in fact, the exact risks that the current requirements seek to mitigate! I would almost recommend further that the EAC start getting input on how to develop a new streamlined set of voting system requirements specifically for evidence-based systems. I say “almost” because I started to see exactly that starting to glimmer at the NIST/EAC conf this week. And that was super encouraging!

So, my specific recommendations will be entirely focused on what such new requirements should be for evidence-based voting systems. For the other fork in the road, the current standards set a pretty good direction. More soon …

— EJS

Election Standards, Day 2

The second day of the annual meeting of the Voting System Standards Committee was Friday Feb. 6. I’m concluding my reporting on the meeting with a round up of existing and proposed standards activity that we discussed on day 2. Each item below is about an existing working group, a group in formation, or proposed groups.

Election Process Modeling

This working group isn’t making a standard, but rather a guideline: a semi-formal model for the typical or common processes for U.S. election administration and election operations. The intent here is to document, in a structured manner, the various use cases where there needs to be data transfer from one system, component, or process to another. That will make it much easier to identify and prioritize such cases where data interoperability standards may be needed; from there, folks may choose to form a working group to address some of these needs.

This group is well along under the leadership of LA’s Kenneth Bennett, but still it’s a work in progress. I’ll be reporting more as we go along.

Digital Poll Books

This just-formed working group, led by Ohio’s John Dziurlaj, will develop a standard data format for digital poll books. The starting point is to define a format that can accommodate data interchange between a voter registration system and digital pollbook — for example, a list of voters, each one with a name, address, voter status (e.g. in person vs. absentee voter). The reverse flow — all that plus a note of whether each voter checked in to vote, when, etc. — is included as well of course, but there are some other subtler issues. For example, it’s not enough to simply provide the data in that reverse flow; you need to also include data that ensures that the check-in records are from a legitimate source, and not modified. Without that, systems would be vulnerable to tampering that causes some ballots to be counted that shouldn’t, and vice versa. Also, not every pollbook does its job based on purely local pollbook records. Some rely on a callback to a central system that co-ordinates information flow among lots of digital pollbooks, and there are several hybrid models as well.

Also, there are privacy issues. In the paper world, every pollbook record was legally a public document, without including what we would now call “personal identifying information” (PII). More recently, with strong voter ID requirements, a voter check-in needs to include a comparison of a presented ID number (such as a driver’s license number) with the ID number that’s part of a voter’s registration record. Today, such ID numbers are often included in e-pollbook data, but that’s not ideal because each e-pollbook becomes a trove of PII at risk. In the upcoming data standards work, we may be able to include some optional privacy guards, like a way to store PII in cryptographically hashed form, to protect privacy but still enable a valid equivalence check — just the same way that stored-password systems do.
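Here is a sketch of that hashed-equivalence idea. Everything here is hypothetical illustration, not the standard; and a real deployment would want a keyed hash (e.g. an HMAC with a protected key), since ID numbers have little entropy and a bare salted hash of them can be brute-forced:

```python
import hashlib

SALT = "per-jurisdiction-secret"   # hypothetical; both systems must share it

def hashed_id(raw_id, salt=SALT):
    """One-way hash of an ID number, so the e-pollbook never stores raw PII.
    (A real deployment would use a keyed hash like HMAC: license numbers
    have little entropy, so a plain salted hash can be brute-forced.)"""
    return hashlib.sha256((salt + raw_id.strip().upper()).encode()).hexdigest()

# The registration system exports only the hash to the e-pollbook...
stored = hashed_id("D1234567")

def check_in(presented_id, stored_hash):
    """...and check-in is an equivalence test, just like password verification."""
    return hashed_id(presented_id) == stored_hash
```

The e-pollbook can confirm a match without ever holding the actual license number, which is the whole point of the privacy guard.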

Voting Methods Models

This newer group, led by Laura Massa-Lochridge, is also creating not a standard but a guideline to be used as a “standard” reference for other work. In this group, the focus is on the various individual approaches to voting on a ballot item and counting the votes. A familiar one is “vote for one” where the candidate with the most votes is the winner. Also familiar and well understood is “vote for N out of M”. Further, each of these has different semantics; for example, some vote-for-one contests have no winner when no candidate reaches a threshold, thus triggering a run-off. Familiar to some, but not so well understood, is “instant run-off”. In fact there are different flavors of IRV, and in some cases it is not actually obvious which one is wanted or used in a particular jurisdiction. From there we get into heavy-duty election geek-dom with ranked choice voting and single transferable vote.
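To illustrate why precise models matter, here is a sketch of one common IRV variant. Note that tie-breaking and exhausted-ballot handling, which this toy glosses over, are exactly the details where jurisdictions and flavors differ:

```python
from collections import Counter

def irv_winner(ballots):
    """One common instant-runoff variant: repeatedly eliminate the
    last-place candidate and transfer those ballots to each voter's
    next surviving choice, until someone holds a majority of the
    ballots still counting."""
    candidates = {c for ballot in ballots for c in ballot}
    while True:
        tallies = Counter()
        for ballot in ballots:
            for choice in ballot:          # first choice still in the running
                if choice in candidates:
                    tallies[choice] += 1
                    break
        total = sum(tallies.values())
        leader, votes = tallies.most_common(1)[0]
        if votes * 2 > total or len(candidates) == 1:
            return leader
        # Tie-breaking here is arbitrary: one of the details that
        # differs by jurisdiction and needs a precise formal model.
        candidates.discard(min(tallies, key=tallies.get))
```

A counting algorithm written into law, as the group notes is common now, is essentially a less rigorous version of the above; the working group’s formal models would pin down the ambiguous cases instead.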

The goal of this working group is to develop a formal mathematical model specifying precisely what’s meant for each variation of each voting method, with consensus from all who choose to participate, with process, oversight, and validation from an official international standards body. The result should be a great reference for elections officials and legislators to refer to, instead of (as is common now) simply referring to voting method by name, or by writing a counting algorithm into law.

Voting Machine Event Logs

This standard is nearly done, thanks to the leadership of NIST’s John Wack. It deserves more than a bit of explanation because it is a great example of both how the standards work, and the value of standard open data.

Every voting system has components for casting or counting ballots, and U.S. requirements for them include the requirement to do some logging of events that would provide researchers with data to analyze in order to assess how well the components operate, how effectively voters are able to use them, and so forth. Every product does some kind of logging, but each one’s log data is in a different, proprietary format. So, the VSSC has a data standard for log data, to enable vendors to provide logs in a common format that enables log analysis tools to combine and collate data from various systems.

So far, not the most thrilling part of standards work, but necessary to ensure that techies can understand what’s going wrong — or right — with real systems in operation. As many are aware, the current crop of voting system products do seem to misbehave during elections, and it’s important for tech assessment to learn whether there really were any faults (as opposed to operator error) and if so what. However, the curious part of this standard is not that it provides a standard format for data common to pretty much every system (that’s why we call them common data formats!) like date/time, event code, event description, etc. Rather, the curious part is that it doesn’t try to provide a complete enumeration of all common events. Sure, most systems have an event that means “Voter cast the ballot” or “Completed scanning a ballot” but one vendor may call this “event 37” and another “event 29”.

Why not enumerate these in the standard? Well, for one thing, it is hard to get a complete list, and as systems add more logging capabilities over time, the list grows. We want to issue the standard now, and don’t want to bake into it an incomplete list. (Once a standard is issued, it is some work to update it, and typically a standards group would prefer to use their efforts to standardize new stuff rather than revise old standards.) So the approach taken is different. It’s typical of many of the standards we’re working on, which is why I want to explain it for this standard. The approach is to have a particular part of the data format that’s expected to be filled by an event identifier that could be one of a canonical list defined elsewhere. It’s like the standard is saying “this ID field is just a string, but systems can choose to fill it with a string that’s from some canonical list that’s beyond the scope of this standard.” Also, the data format allows for a sort of glossary to be part of a dataset, to enable a dataset to essentially say “you’re going to see a bunch of event 37’s and in my lingo that means voter cast a ballot.”
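A sketch of that glossary idea, with made-up field names rather than the actual VSSC schema:

```python
# A hypothetical log dataset in the spirit of the standard: "EventId"
# values are vendor-specific, and the embedded glossary maps them onto
# a canonical vocabulary maintained outside the standard itself.
dataset = {
    "Device": "precinct-scanner-04",
    "EventIdGlossary": {               # "in my lingo, 37 means ballot cast"
        "29": "ballot-scan-complete",
        "37": "ballot-cast",
    },
    "Events": [
        {"TimeStamp": "2016-11-08T07:01:12", "EventId": "29"},
        {"TimeStamp": "2016-11-08T07:01:15", "EventId": "37"},
    ],
}

def canonical_events(ds):
    """Collate a dataset into canonical terms, so logs from different
    vendors can be combined by a single analysis tool."""
    glossary = ds["EventIdGlossary"]
    return [(e["TimeStamp"], glossary.get(e["EventId"], "unknown"))
            for e in ds["Events"]]
```

An analysis tool only needs the canonical vocabulary; each vendor’s private numbering stays private to its own datasets.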

The intent of course is that systems that conform to the standard will also choose to use this canonical list, which can grow over time, without requiring modifications to the standard. That’s nice, but it raises the question: who maintains this list, and how does the maintainer allow people to submit additions to it? Good question. No answer yet, but that’s not a barrier to using the standard, and the early adopters will in essence start the list and figure out who is going to manage it.

Event Logs for Voter Records

This is a topic for a to-be-formed working group, focused on issues very similar to those of the Event Log group described above, but for the events of a voter records system. The type of events we’re talking about here are things like: voter registration request rejected (including why); voter’s address change accepted; voter’s absentee ballot accepted for counting; voter’s provisional ballot rejected (and why); voter checked in to vote in person; and so on. The format will likely be pretty similar to the other event log format, and much of the discussion will be similar to the above groups’: whether there is a complete enumeration of actions or objects; whether to rely on external canonical lists; and how to avoid exposing PII while still allowing a record to uniquely identify the voter in question (so that we can recognize when multiple events were about the same voter).

What types of interoperability would this support? Automated reporting, and data mining in general — again, a larger issue — but one example is that it would support automated reporting that compares military voters to other voters in terms of voting outcomes: numbers and percentages of voters who voted absentee vs. in person, absentee voters whose ballots were counted vs. rejected and if so why …
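A sketch of how such a report could fall out of a standardized event log. Field names here are hypothetical, and "VoterId" stands for a pseudonymous identifier rather than PII, so events about the same voter can still be linked:

```python
from collections import Counter

# Hypothetical voter-record events in a common format.
events = [
    {"VoterId": "v1", "Type": "absentee-ballot-accepted", "Military": True},
    {"VoterId": "v2", "Type": "absentee-ballot-rejected", "Military": True},
    {"VoterId": "v3", "Type": "checked-in-in-person",     "Military": False},
    {"VoterId": "v4", "Type": "absentee-ballot-accepted", "Military": False},
]

def outcome_report(evts):
    """Tally event types separately for military and other voters,
    the kind of Federal report that is burdensome to build by hand."""
    report = {"military": Counter(), "other": Counter()}
    for e in evts:
        report["military" if e["Military"] else "other"][e["Type"]] += 1
    return report
```

With a common format, the same few lines of analysis work against logs from every locality, which is the whole interoperability payoff.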

This type of reporting is already required of localities and states by the Federal government, and it is currently very burdensome for many election officials to create. As a result, one of the enthusiastic supporters of this nascent effort is a recently appointed EAC Commissioner who until recently was a state election official, grumpy over the burden of this type of reporting, but is now on the Federal commission requiring it. So you can see that although not the most thrilling human endeavor, standards work can have its elements of irony. 🙂

Cast Vote Records 

Another topic for a to-be-formed working group, this is about how to extend the existing .2 standard (election result reporting) to describe just the votes recorded from a single ballot, together with other data (for example an image of a paper ballot from which the votes were recorded) that would be needed to support ballot audits. The whole larger issue of ballot audits is … larger; but you can read more about it in past posts here (search for “audit”) or elsewhere by a web search on “risk limiting audits elections.”

Ballot Specifications

Another topic for a to-be-formed working group, this is about how to extend the existing .2 standard’s description of ballot items (contests, candidates, referenda, questions, etc.) that is currently limited to what’s needed for results reporting.  The extensions could be limited to extensions needed to display online sample ballots, but could extend much further. Some of us have a particular interest in supporting “interactive sample ballots” which is, again, a larger issue, but more on that as the work unfolds.

Common Identifiers and OCD

Lastly, we also discussed more of the common identifier issue that I reported on earlier in day one. It turns out that this is another instance, though slightly more complicated, of the issue facing a number of standards that I described above: semantic interoperability. In the .2 standard, we don't want to bake in an incomplete list of every possible district, office, precinct, etc. — even though we need common identifiers for these if two datasets are to be interpreted as referring to the same things.

So, again, we have the issue of a separate canonical list. However, in this case the space is huge, the names (unlike event types) wouldn't be self-identifying, and the things named could have multiple valid names. So there will no doubt be large directories of information about these political units, using common naming schemes. To keep those from becoming a large muddle, though, we first need to solve a smaller problem: smaller canonical lists, for example a list of the names of all the types of district used in each state. With that, we could use existing naming schemes in a canonical way.

The most promising naming scheme (by consensus of those working on standards, anyway) is that of the Open Civic Data project, which includes IDs of exactly this sort. The scope of OCD-IDs is broad: defining a handle for pretty much any government entity in any country, so that various organizations that have data on those entities can publish that data using a common identifier, enabling others to aggregate the data about those entities. It's much broader than U.S. electoral districts, but it's already in use for them. As I described above, though, the fly in the ointment is the plethora of types of electoral district; for a common unique name, you need to include the type of district, for example, the fire control district in CA's San Mateo County that's known as Fire District #3.
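As an illustration, an OCD-ID is a slash-separated path of `type:name` segments under an `ocd-division/` prefix. The helper below sketches how such an ID might be composed for that San Mateo County example; the `fire_district` type label is hypothetical, precisely because there is no canonical list of U.S. district types yet:

```python
def ocd_division_id(*segments):
    """Build an OCD division identifier from (type, name) pairs.

    The ocd-division/ prefix and slash-separated type:name segments follow
    the Open Civic Data identifier scheme; the district-type vocabulary in
    the example call below is illustrative only.
    """
    parts = ["ocd-division"]
    for dtype, name in segments:
        # OCD-IDs use lowercase names, with underscores in place of spaces
        parts.append(dtype + ":" + name.lower().replace(" ", "_"))
    return "/".join(parts)

fire_district = ocd_division_id(
    ("country", "us"),
    ("state", "ca"),
    ("county", "San Mateo"),
    ("fire_district", "3"),  # hypothetical type label for this example
)
# → "ocd-division/country:us/state:ca/county:san_mateo/fire_district:3"
```

Note that without an agreed type label (`fire_district`? `fire_control_district`?), two organizations could mint two different "unique" IDs for the same real-world district — which is exactly the missing-canonical-list problem described above.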

OK, so how, and by whom, will this registry, or directory, or curated list — whatever you might call it — get created and managed? Still a good question, but at least we have some clarity on what needs to be done, and maybe a bit of the how as well. Stay tuned.

If we had this missing link (a canonical scheme for names of U.S. electoral districts) then we could use OCD-IDs (or extensions of FIPS geo codes, for that matter) as an optional but commonly used and standards-based approach for constructing unique identifiers for electoral districts. Organizations that choose to use the naming scheme could issue VSSC.2 datasets that could be aggregated with others that also use the scheme. And then people could have a much easier time aggregating those election result datasets to get large-scale election results. At the risk of foreshadowing, that's actually a big deal to data-heads, public interest groups, and news organizations alike, as eloquently explained by a speaker at the next annual conference, which was this week in DC.

Coming Soon

That conference — the NIST/EAC Future of Voting Systems Symposium II — will be the topic of my next few reports!

— EJS

 

Election Standards – Part Two

Last time I reported on the first segment of the annual meeting of the Voting Systems Standards Group (VSSC) of the IEEE. Most of that segment was about the soon-to-be-standard for election definition and election results (called VSSC.2). I recapped some of the benefits of data standards, using that as an example. Much of the rest of day one was related to that standard in a number of ways. First, we got an update on the handful of major comments submitted during the earlier review periods, and provided input on how to resolve them and finalize the standard document. Second, we reviewed suggestions for other standards that might be good follow-on efforts.

What’s in a Name?

One example of the comments concerned the issue that I mentioned previously (about aggregation): the object identifiers in one VSSC.2 dataset might have no resemblance to identifiers in another dataset, even though the two datasets were referring to some of the same real-world items described by the objects. We discussed a couple of existing object naming standards for election data, FIPS and OCD-IDs. FIPS is a federal standard for numeric identification of states, counties, townships, municipalities, and related minor civil divisions (as well as many other things not related to elections).

That’s useful because those types of real-world entities are important objects in election definitions, and it’s very handy to have standard identifiers for them. However, FIPS is not so useful because there loads of other kinds of real-world entities that are electoral districts, but not covered by FIPS. In fact, there are so many of them and in such variety, that no one really knows all of the types districts in use in the U.S. So we really don’t have a finished standard naming scheme for U.S. electoral districts. We also discussed the work of the Open Civic Data project, specifically their identifier scheme and repository, abbreviated as OCD-IDs.

More on that in the report from Day 2, but to make a long story short, the consensus was that the VSSC.2 standard was just fine without a unique ID scheme, and that a new standard specifically for standardized IDs was not a large need now.

Supporting Audits

That’s one possible new standard, related to the .2 standard, that we considered and deferred. Two others got the thumbs up, at least at the level of agreement to form a study group, which is the first step. One case was pretty limited: a standard for cast-vote records (CVRs) to support ballot audits with an interoperable data standard. To only slightly simplify, one common definition of a CVR is a record of exactly what a ballot-counting device recorded from a specific ballot, the votes of which are included in a vote tally created by the device. Particularly helpful is the inclusion of a recorded image of the ballot. With that, a person who is part of a typical ballot audit can go though a batch of ballots (typically all from the same precinct) and decide whether their human judgment agrees with the machine’s interpretation, based on the human’s understanding of relevant state election law.

Support for audits with CVRs is a fundamental requirement for voting systems, so this standard is pretty important. Its scope is limited enough that I hope we can get it done relatively quickly.

More to Come

The other study group will be looking at the rather large issue of standardizing an election definition, beyond the level of the .2 standard. That standard is very useful for a number of purposes (including election result reporting) but is intentionally limited, not trying to be a comprehensive standard. The study group will be looking at some use cases that might guide the definition of a smaller scope, which could be a timely, right-sized step from .2 toward a truly comprehensive standard. My personal goal, which I think many share, is to look at the question of what else, besides what we already have in .2, is needed for an election definition that could support ballot layout at least at the level of sample ballots. I like that of course, because we already documented the TrustTheVote requirements for that when we developed the sample-ballot feature of the Voter Services Portal.

Onward to day 2!

— EJS

PS: For more on the Voter Services Portal: the production version of VSP in Virginia is at https://www.vote.virginia.gov, the demo version is described at the PCEA's web site http://www.supportthevoter.gov and at http://web.mit.edu/vtp/ovr3.html, and an interactive version is at http://va-demo.voterportal.trustthevote.org

Election Standards – What’s New

The annual meeting of the U.S. elections standards board is this week. In addition to standards board members, several observers are here and will be reporting. The next few blogs are solely my views (John Sebes), but I'll do my best to write what I think is a consensus.

However, today I’ll start with a closely related topic — election data standards — because I think it will be helpful to refresh the readers’ memory about where standards fit in, and how important they are. I’ll do that explaining 4 benefits that are under discussions today.

Interoperability

One type of standards-enabled interoperability is data exchange. One system needs data to do its job, and the source data is produced by another system; but the two systems don’t speak the same language to express the data. In election technology, a common example is election results. Commercial election management system (EMS) products produce election definitions and election results data in their own format, because until recently there wasn’t a standard. Election reporting systems need to consume that data, but it’s hard to do because different counties (and other electoral jurisdictions) use different formats. For example, in California, a complete collection of results from all counties would involve 5 different proprietary or legacy formats, perhaps more in cases where two counties use the same EMS product but very different versions.

Large news organizations, as well as academics and other research organizations including the TrustTheVote Project, can put a lot of effort into “data-wrangling” and come up with something that’s nearly uniform. It’s time consuming and error prone, and needs to be done several times as election results get updated from election night to final results. But more to the point, election officials don’t have a ready, re-usable technical capability to “just get the data out.”

Well, now we have a standard for U.S. election definitions and election results (more on that in my reporting from the annual conference this week). What does that mean? In the medium to long term, the vendors of all the EMS products could support the new standard, and consumers of the data (elections organizations themselves, election reporting products, in-house tools of big news organizations, and of course open source systems like VoteStream) can re-tool to use standards-compliant data. But in the short to medium term, elections organizations, and their existing technology base, need the ability to translate from existing formats to the standard. (A big part of our just-restarted work on VoteStream is to create a translator/aggregator toolset for election officials, but more on that as VoteStream reporting proceeds.)
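The translator idea can be sketched in miniature: take one county's legacy results export (here, an invented CSV layout) and map it into a common structure. The real .2 standard is of course far richer than this; the sketch only illustrates the translate-then-aggregate step:

```python
import csv
import io

# An invented legacy CSV export, standing in for one county's
# proprietary results format.
legacy_csv = """contest,choice,votes
CD-4,Smith,41200
CD-4,Jones,38900
"""

def translate(text):
    """Map a legacy per-row CSV export into a common nested structure:
    {contest: {choice: vote_count}}. A real translator would emit the
    standard format instead of a plain dict."""
    out = {}
    for row in csv.DictReader(io.StringIO(text)):
        out.setdefault(row["contest"], {})[row["choice"]] = int(row["votes"])
    return out

standardized = translate(legacy_csv)
# → {"CD-4": {"Smith": 41200, "Jones": 38900}}
```

With one such translator per legacy format, the downstream tools only ever have to consume the one common structure — which is the whole point of the standard.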

Componentization

Interoperability by itself is great in some cases, if the issue is mainly getting two systems to talk to one another. For example, at the level of an individual county, election reporting is mostly a matter of data transfer from the EMS that the county uses to an election result publishing system. Some counties have created a basic web publishing system that consumes results from their EMS. However, it's not so easy for any county to re-use such a solution unless they use an EMS that speaks exactly the same lingo.

For another example at the local level, a standards-compliant election definition data set can be a bridge between an EMS that defines the information on each ballot and a separate system that consumes an election definition and offers election officials the ability to design the layout of paper ballots. (In the TrustTheVote Project, we call that our Ballot Design Studio.) The point here is that data standards can enable innovations in election tech, because various different jobs can be delegated to systems that specialize in those jobs, and these specialized systems can inter-operate with one another.

Aggregation

Component interoperability by itself is not so great if you're trying to aggregate multiple datasets of the same kind, but from different sources. Taking election result reporting as the example again, here is a problem faced by consumers of election results. Part of one county votes in one Federal congressional district, and part of another county votes in the same district. Each county's EMS assigns some internal identifier to each district, but it's derived from whatever the county folks use; this is true even if an election result is represented in the new VSSC standard. In one county, the district — and by extension the contest for the representative for the district — might be called the 4th Congressional District, while in the other it might be CD-4. If you're trying to get results for that one contest, you need to be able to tell that those are the same district, and the results for the contest need to include numbers from both counties.
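Here is a minimal sketch of how a shared canonical identifier would solve that, assuming a hand-maintained mapping from each county's local label to one common district ID (the OCD-style ID and the vote counts are invented for the example):

```python
from collections import Counter

# Per-county results for the same congressional district, each using
# its own local label for the district. Numbers are invented.
county_a = {"4th Congressional District": {"Smith": 41200, "Jones": 38900}}
county_b = {"CD-4": {"Smith": 17350, "Jones": 21040}}

# The "more standards!" piece: a canonical mapping from every local
# label to one shared district identifier (OCD-style, illustrative).
CANONICAL = {
    "4th Congressional District": "ocd-division/country:us/state:ca/cd:4",
    "CD-4": "ocd-division/country:us/state:ca/cd:4",
}

def aggregate(*county_results):
    """Combine per-county tallies after normalizing district names."""
    totals = {}
    for results in county_results:
        for local_name, counts in results.items():
            district = CANONICAL[local_name]  # resolve to the shared ID
            totals.setdefault(district, Counter()).update(counts)
    return totals

combined = aggregate(county_a, county_b)
# combined["ocd-division/country:us/state:ca/cd:4"]["Smith"] → 58550
```

Without the `CANONICAL` table — today's situation — each consumer has to rebuild that mapping by hand, which is exactly the data-wrangling burden described earlier.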

Currently, consumers of this data have processes for overcoming these challenges, but that ability is limited to each consumer org, and in some cases is private to that org. What election officials need from standards is the ability to automatically aggregate disparate data sets. Ahh, more standards!

This exact issue is one of the things we’re discussing this morning at the standards meeting: a need for a standard way to name election items that span jurisdictions or even elections in a single jurisdiction.

Combination

Combination is closely related to aggregation, except that aggregation combines data sets of the same kind, while combination occurs when we have multiple data sets, each containing different but complementary information about some of the same things. That was one of the challenges we had in VoteStream Alpha: election results referred to precincts (vote counts per precinct), as did GIS data (the geo-codes representing a precinct) and voter-registration statistics (number of registered voters per precinct, and several related stats). But many precincts had a different name in each data source! That made it challenging, for example, to report election results in the context of registration and turnout numbers, and to use mapping to visualize variations in registration levels and turnout.
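A small sketch of that combination step, with invented precinct names and a hand-built alias table standing in for the standard identifiers under discussion:

```python
# Three data sources describing the same precinct under different local
# names. All names and numbers are invented for illustration.
ALIAS = {"Pct 12": "precinct-12", "012": "precinct-12", "Twelfth": "precinct-12"}

results = {"Pct 12": {"Smith": 310, "Jones": 295}}           # election results
geo = {"012": {"geo_code": "POLYGON((...))"}}                # GIS data
registration = {"Twelfth": {"registered": 1480, "turnout": 655}}

def combine(*sources):
    """Merge complementary per-precinct records onto one canonical key."""
    merged = {}
    for source in sources:
        for local_name, fields in source.items():
            merged.setdefault(ALIAS[local_name], {}).update(fields)
    return merged

precincts = combine(results, geo, registration)
# precincts["precinct-12"] now holds vote counts, geography, and
# registration stats together, so turnout can be mapped alongside results.
```

The hard part, of course, is not the five-line merge but building the `ALIAS` table — which is what shared identifiers and enumerations would make unnecessary.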

We’ll be showing how to automate the response to such challenges, as part of VoteStream Beta, using the data standards, identifiers, and enumerations under discussion right now.

More!

That’s the report from the morning session. More later …

— John Sebes