
NBC News, Voting Machines, and a Grandmother’s PC


I’d like to explain more precisely what I meant by “your grandmother’s PC” in the NBC TV Bay Area’s report on election technology. Several people thought I was referring to voting machines, which are indeed easily hacked by anyone with physical access, because despite appearances:

Voting machines are like regular old PCs inside, and like any old PC …

  • … it will be happy to run any program you tell it to, where:
  • “You” is anyone who can touch the computer, even briefly, and
  • “Program” is anything at all, including malicious software specially created to compromise the voting machine.

That’s all true, of course, as many of us have seen recently in cute yet fear-mongering little videos about how to “hack an election.” However, I was referring to something different and probably more important: a regular old PC running some pretty basic Windows XP application software, which an election official installed on the PC in the ordinary way, and uses the same way as anything else.

That’s your “grandmother’s PC,” or in my son’s case, something old and clunky that looks exactly like the PC that his grandfather had a decade-plus ago – minus some hardware upgrades and software patches that were great for my father, but for voting systems are illegal.

But why is that PC “super important”? Because the software in question is the brains behind every one of that fleet of voting machines – a one-stop shop to hack all the voting machines, or just fiddle vote totals after all those carefully and securely operated voting machines come home from the polling places. It’s an “election management system” (EMS) that election officials use to create the data that tells the voting machines what to do, and to combine the vote tally data into the actual election results.

That’s super important.

Nothing is wrong with the EMS software itself, except for the very poor choice of building it to run on a PC platform that by law is locked in time as it was a decade or so ago, and has no meaningful self-defenses in today’s threat environment. As I said, it wasn’t a thoughtful choice – nobody decided it would be a good idea to run this really important software on something as easily hacked as anyone’s grandparent’s PC. But it was a pragmatic choice at the time, in the rush of the post-hanging-chads, Federally funded voting-system replacement derby. We are still stuck with the consequences.

It reminds me of that great old radio show, Hitchhiker’s Guide to the Galaxy, where after stealing what seems like the greatest ship in the galaxy, the starship Heart of Gold, our heroes are stuck in space-time with Eddie Your Ship-Board Computer, “ready to get a bundle of kicks from any program you care to run through me.” The problem, of course, is that while designed to do an improbably large number of useful things, it’s not able to do one very important thing: steer the ship after being asked to run a program to learn why tea tastes good.

Election management systems, voting machines, and other parts of a voting system each have one very important job to do, and should not be able to do anything else. It’s not hard to build systems that way, but that’s not what’s available from today’s three vendors in the for-profit market for voting systems and the services that help election officials operate them. We can fix that, and we are.

But it’s the election officials, many of them public servants with a heart of gold, who should really be highlighted. They are making do with what they have, putting in enormous extra effort to protect these vulnerable systems and run an election that we all can trust. They deserve better; we all deserve better: election technology that’s built for elections that are Verifiable, Accurate, Secure, and Transparent (VAST, as we like to say). The “better” is in the works, here at the OSET Institute and elsewhere, but there is one more key point.

Don’t be demoralized by the fear, uncertainty, and doubt about hacking elections. Vote. These hardworking public servants are running the election for each of us, doing their best with what they have. Make it worth something. Vote, and believe what is true: you are an essential part of the process that makes our democracy truly a democracy.

— John Sebes

Election Standards – What’s New

The annual meeting of the U.S. elections standards board is this week. In addition to standards board members, several observers are here and will be reporting. The next few blogs are solely my views (John Sebes), but I’ll do my best to write what I think is a consensus.

However, today I’ll start with a closely related topic — election data standards — because I think it will be helpful to refresh readers’ memory about where standards fit in, and how important they are. I’ll do that by explaining four benefits that are under discussion today.

Interoperability

One type of standards-enabled interoperability is data exchange. One system needs data to do its job, and the source data is produced by another system; but the two systems don’t speak the same language to express the data. In election technology, a common example is election results. Commercial election management system (EMS) products produce election definitions and election results data in their own formats, because until recently there wasn’t a standard. Election reporting systems need to consume that data, but it’s hard to do because different counties (and other electoral jurisdictions) use different formats. For example, in California, a complete collection of results from all counties would involve five different proprietary or legacy formats, perhaps more in cases where two counties use the same EMS product but very different versions.

Large news organizations, as well as academics and other research organizations including the TrustTheVote Project, can put a lot of effort into “data-wrangling” and come up with something that’s nearly uniform. It’s time-consuming and error-prone, and needs to be done several times as election results get updated from election night to final results. But more to the point, election officials don’t have a ready, re-usable technical capability to “just get the data out.”

Well, now we have a standard for U.S. election definitions and election results (more on that in reporting from the annual conference this week). What does that mean? In the medium to long term, the vendors of all the EMS products could support the new standard, and consumers of the data (elections organizations themselves, election reporting products, in-house tools of big news organizations, and of course open source systems like VoteStream) can re-tool to use standards-compliant data. But in the short to medium term, elections organizations, and their existing technology base, need the ability to translate from existing formats to the standard. (A big part of our just-restarted work on VoteStream is to create a translator/aggregator toolset for election officials, but more on that as VoteStream reporting proceeds.)
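To make the translation idea concrete, here is a minimal sketch in Python of the kind of converter involved. The CSV layout, field names, and output structure are invented for illustration; real EMS export formats differ by vendor and version, and a real translator would emit a standards-compliant format rather than this simplified structure.

```python
import csv
import json

# Hypothetical vendor export layout: precinct_id, contest_name, candidate_name, votes.
# (Real EMS exports vary by vendor and even by product version.)
def translate_results(csv_text):
    """Translate a vendor-specific CSV results export into a
    simplified, uniform structure keyed by contest."""
    results = {}
    for row in csv.DictReader(csv_text.splitlines()):
        contest = results.setdefault(row["contest_name"], [])
        contest.append({
            "precinct": row["precinct_id"],
            "candidate": row["candidate_name"],
            "votes": int(row["votes"]),
        })
    return results

# Made-up sample data in the hypothetical vendor layout.
sample = """precinct_id,contest_name,candidate_name,votes
P-101,US Senate,Alice Adams,412
P-101,US Senate,Bob Brown,388
P-102,US Senate,Alice Adams,295
"""
print(json.dumps(translate_results(sample), indent=2))
```

The point of the sketch is that each proprietary format needs one such converter; once everything lands in a single common structure, every downstream consumer can be written once.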

Componentization

Interoperability by itself is great in some cases, if the issue is mainly getting two systems to talk to one another. For example, at the level of an individual county, election reporting is mostly a matter of data transfer from the EMS that the county uses to an election result publishing system. Some counties have created a basic web publishing system that consumes results from their EMS. However, it’s not so easy for any county to re-use such a solution unless it uses an EMS that speaks exactly the same lingo.

For another example at the local level, a standards-compliant election definition data set can be a bridge between an EMS that defines the information on each ballot, and a separate system that consumes an election definition and offers election officials the ability to design the layout of paper ballots. (In the TrustTheVote Project, we call that our Ballot Design Studio.) The point here is that data standards can enable innovation in election tech, because various jobs can be delegated to systems that specialize in them, and these specialized systems can interoperate.

Aggregation

Component interoperability by itself is not so great if you’re trying to aggregate multiple datasets of the same kind, but from different sources. Taking election result reporting as the example again, here is a problem faced by consumers of election results. Part of one county votes in one Federal congressional district, and part of another county votes in the same district. Each county’s EMS assigns some internal identifier to each district, but it’s derived from whatever the county folks use; this is true even if an election result is represented in the new VSSC standard. In one county, the district — and by extension the contest for the representative of the district — might be called the 4th Congressional District, while in the other it might be CD-4. If you’re trying to get results for that one contest, you need to be able to tell that those are the same district, and the results for the contest need to include numbers from both counties.

Currently, consumers of this data have processes for overcoming these challenges, but that ability is limited to each consumer org, in some cases private to that org. But what election officials need from standards is the ability to automatically aggregate disparate data sets.  Ahh, more standards!

This exact issue is one of the things we’re discussing this morning at the standards meeting: a need for a standard way to name election items that span jurisdictions or even elections in a single jurisdiction.
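As a rough illustration of why a standard naming scheme matters, here is a sketch of cross-county aggregation using a hand-built alias table. The alias table, county names, and canonical ID scheme are all hypothetical, standing in for whatever identifier standard the standards body settles on; the point is that once every local label maps to one canonical ID, aggregation becomes a trivial sum.

```python
# Hypothetical crosswalk from each county's local district label
# to a single canonical, standards-based identifier.
DISTRICT_ALIASES = {
    ("County A", "4th Congressional District"): "us-congress:MN:4",
    ("County B", "CD-4"): "us-congress:MN:4",
}

def aggregate(results):
    """Sum per-county vote counts after normalizing district names.
    `results` is a list of (county, local_district_label, candidate, votes)."""
    totals = {}
    for county, district, candidate, votes in results:
        canonical = DISTRICT_ALIASES[(county, district)]
        key = (canonical, candidate)
        totals[key] = totals.get(key, 0) + votes
    return totals

# Two counties reporting the same congressional contest under different names.
rows = [
    ("County A", "4th Congressional District", "Alice Adams", 10000),
    ("County B", "CD-4", "Alice Adams", 8000),
]
print(aggregate(rows))
```

Today each data consumer maintains something like `DISTRICT_ALIASES` privately; a naming standard would make the table unnecessary.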

Combination

Combination is closely related to aggregation, except that aggregation combines data sets of the same kind, while combination occurs when we have multiple data sets, each containing different but complementary information about some of the same things. That was one of the challenges we had in VoteStream Alpha: election results referred to precincts (vote counts per precinct), as did GIS data (the geo-codes representing a precinct) and voter-registration statistics (number of registered voters per precinct, plus several related stats). But many precincts had a different name in each data source! That made it challenging, for example, to report election results in the context of registration and turnout numbers, and to use mapping to visualize variations in registration levels and turnout.
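A sketch of how a precinct “crosswalk” could automate that combination. The source names, canonical precinct ID, and per-source data shapes below are all made up for illustration; the idea is simply that every source-specific name resolves to one shared identifier before the join.

```python
# Hypothetical crosswalk: each data source used its own name for the
# same precinct, so each source gets a mapping to one canonical ID.
CROSSWALK = {
    "results":  {"Pct 12": "27123-0012"},
    "gis":      {"PRECINCT 12": "27123-0012"},
    "voterreg": {"Ward 3 P12": "27123-0012"},
}

def combine(results, gis, voterreg):
    """Join three precinct-keyed datasets into one record per precinct."""
    combined = {}
    for source, data in (("results", results), ("gis", gis), ("voterreg", voterreg)):
        for name, value in data.items():
            pid = CROSSWALK[source][name]
            combined.setdefault(pid, {})[source] = value
    return combined

merged = combine(
    results={"Pct 12": {"Alice Adams": 412}},
    gis={"PRECINCT 12": "<kml>...</kml>"},
    voterreg={"Ward 3 P12": {"registered": 1580}},
)
print(merged["27123-0012"]["voterreg"]["registered"])
```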

We’ll be showing how to automate the response to such challenges, as part of VoteStream Beta, using the data standards, identifiers, and enumerations under discussion right now.

More!

That’s the report from the morning session. More later …

— John Sebes


Voting Heartburn over “Heartbleed”

Heartbleed is the latest high-profile consumer Internet security issue, only a few weeks after the “Goto Fail” incident. Both are recently discovered weaknesses in the way that browsers and Web sites interact. In both cases and others, I’ve seen several comments that connect these security issues with Internet voting. But because Heartbleed is pretty darn wicked, I can’t not share my thoughts on how it connects to the work we do in the TrustTheVote Project — despite the fact that i-voting is not part of it. (In fact, we have our hands full fixing the many technology gaps in the types of elections that we already have today and will continue to have for the foreseeable future.)

First off, my thanks to security colleague Matt Bishop, who offered an excellent rant (his term, not mine!) on Heartbleed, what we can learn from it, and the connection to open source. The net-net is familiar: computers, software, and networks are fundamentally fallible; there will always be bugs and vulnerabilities, and that’s about as non-negotiable as the law of gravity.

Here is my take on how that observation affects elections, and specifically the choice that many U.S. election officials have made (and which we support) that elections should be based on durable paper ballots that can be routinely audited as a cross-check on potential errors in automated ballot counting. It goes like this:

  • Dang it, too many paper ballots with too many contests, to count manually.
  • We’ll have to use computers to count the paper ballots.
  • Dang it, computers and software are inherently untrustworthy.
  • Soooo ….  we’ll use sound statistical auditing methods to manually check the paper ballots, in order to check the work of the machines and detect their malfunctions.

This follows the lessons of the post-hanging-chads era:

  • Dang it, too many paper ballots with too many contests, to count manually.
  • We’ll have to use computers to directly record votes, and ditch the paper ballots.
  • Dang it, computers and software are inherently untrustworthy.
  • Oops, I guess we need the paper ballots after all.

I think that these sequences are very familiar to most readers here, but it’s worth a reminder now and then from experts on the 3rd point — particularly when the perennial topic of i-voting comes up — because there, the sequence is so similar yet so different:

  • Dang it, voters too far away for us to get their paper ballots in time to count them.
  • We’ll have to use computers and networks to receive digital ballots.
  • Dang it, computers and software and networks are inherently untrustworthy.
  • Soooo …. Oops.

— EJS

The “VoteStream Files” A Summary

The TrustTheVote Project Core Team has been hard at work on the Alpha version of VoteStream, our election results reporting technology. They recently wrapped up a prototype phase funded by the Knight Foundation, and then forged ahead a bit, to incorporate data from additional counties, provided by participating state or local election officials after the official wrap-up.

Along the way, there have been a series of postings here that together tell a story about the VoteStream prototype project. They start with a basic description of the project in Towards Standardized Election Results Data Reporting and Election Results Reload: the Time is Right. Then there was a series of posts about the project’s assumptions about data, about software (part one and part two), and about standards and converters (part one and part two).

Of course, the information wouldn’t be complete without a description of the open-source software prototype itself, provided in Not Just Election Night: VoteStream.

Actually, the project was as much about data, standards, and tools as about software. On the data front, there is a general introduction to a major part of the project’s work in “data wrangling” in VoteStream: Data-Wrangling of Election Results Data. After that came more posts on data wrangling, quite deep in the data-head shed — but still important, because each one is about the work required to take real election data and real election result data from disparate counties across the country, and fit it into a common data format and common online user experience. The deep data-heads can find quite a bit of detail in three postings about data wrangling, in Ramsey County MN, in Travis County TX, and in Los Angeles County CA.

Today, there is a VoteStream project web site with VoteStream itself and the latest set of multi-county election results, but also with some additional explanatory material, including the election results data for each of these counties.  Of course, you can get that from the VoteStream API or data feed, but there may be some interest in the actual source data.  For more on those developments, stay tuned!

Election Results: Data-Wrangling Los Angeles County

LA County, CA is the mother of all election complexities, and the data wrangling was intense, even compared to the hardly simple efforts that I reported on previously. There are over 32,000 distinct voting regions, which I think is more than the number of seats, ridings, chairs, and so on for every federal or state house of government in all the parliamentary democracies in the EU.

The LA elections team was marvelously helpful, and upfront about the limits of what they can produce with the aging voting system that they are working hard to replace. Here is what we started with:

  • A nicely structured CSV file listing all the districts in LA county: over 20 different types of district, and over 900 individual districts.
  • Some legacy GIS data, part of which defined each precinct in terms of which districts it is in.
  • The existing legacy GIS data converted into the XML standard format (KML), again kindly created by LA CC-RR IT chief, Kenneth Bennett.
  • A flat text file of all the election results for the 2012 election for every precinct in LA County, and various roll-ups.
  • A sort of Rosetta Stone that is just the Presidential election results, but in a well-structured CSV file, also very kindly generated for us by Kenneth.

You’ll notice that what’s not included is a definition of the 2012 election itself – the contests, which district each contest is for, other info on the contests, info on candidates, referenda, and so on. So, first problem: we needed to reverse engineer that as best we could from the election results. But before we could do that, we had to figure out how to parse the flat text file of results. The “Rosetta Stone” was helpful, but we then realized that we needed information about each precinct that reported results in the flat text file. To get the precinct information, we had to parse the legacy GIS data and map it to the districts definition.
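The reverse-engineering step can be sketched roughly as follows: scan the result rows and collect the distinct contests, candidates, and precincts they mention. The row layout here is hypothetical, not LA County’s actual flat-file format, which was far messier.

```python
# A sketch of "reverse engineering" an election definition from results:
# collect the distinct contests, candidates, and precincts seen in rows.
# The (precinct, contest, candidate, votes) row shape is illustrative only.
def infer_election_definition(rows):
    contests = {}
    precincts = set()
    for precinct, contest, candidate, _votes in rows:
        contests.setdefault(contest, set()).add(candidate)
        precincts.add(precinct)
    return {
        "contests": {c: sorted(cands) for c, cands in contests.items()},
        "precincts": sorted(precincts),
    }

# Made-up result rows standing in for the parsed flat file.
rows = [
    ("0050001A", "PRESIDENT", "Candidate X", 310),
    ("0050001A", "PRESIDENT", "Candidate Y", 290),
    ("0050002B", "PRESIDENT", "Candidate X", 120),
]
definition = infer_election_definition(rows)
print(definition["contests"]["PRESIDENT"])
```

Note what such an inference can never recover: which district a contest belongs to, referendum text, candidate party, and so on, which is why the GIS and districts files were still essential.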

The second problem was GIS data that wasn’t obvious, but fortunately we had excellent help from Elio Salazar, a member of Ken’s team who specializes in the GIS data. He helped us sort out various intricacies and corner cases. One of the hardest turned out to be the ways in which one district (say, a school district) is a real district used for referenda, but is also sub-divided into smaller districts, each being for a council seat. Some cities were subdivided this way into council seats, some not; same for water districts and several other kinds of districts.

Then, as soon as we thought we had clear sailing, it turned out that the districts file had a couple of minor format errors that we had to fix by hand. Plus, there were 4 special-case districts that weren’t actually used in the precinct definitions, but were required for the election results. Whew! At that point we thought we had a complete election definition, including the geo-data of each precinct in KML. But wait! We had over 32,000 precincts defined, but only just shy of 5,000 that reported election results. I won’t go into the details of sub-precincts and precinct consolidation, and how some data was from the 32,000 viewpoint and other data from the 4,993 viewpoint. Or why 4,782 was not our favorite number for several days.

Then the final lap: actually parsing all the 100,000-plus contest results in the flat text file, normalizing and storing all the data, and then emitting it in VIP XML. We thought we had a pretty good specification (only 800 words long) of the structure implicit in the file. We came up with three major special cases, and I don’t know how many little weird cases that turned out not to be relevant to the actual vote counts. I didn’t have the heart to update the specification, but it was pretty complex, and honestly the data is so huge that we could spend many days writing consistency checks of various kinds, and doing manual review of the input to track down inconsistencies.

In the end, I think we got to a pretty close but probably not perfect rendition of election results. A truly re-usable and reliable data converter would need some follow-on work in close collaboration with several folks in Ken’s team — something that I hope we have the opportunity to do in a later phase of work on VoteStream.

But 100% completeness aside, we still had an excellent proof of concept that even this most complex use case did in fact match the standard data model and data format we were using. With some further work using the VIP common data format with other counties, the extended VIP format should be nearly fully baked and ready to take to the IEEE standards body on election data.

— EJS

Election Results: Data-Wrangling Travis County

Congratulations if you are reading this post after having even glanced at its predecessor about Ramsey County data wrangling — one of the longer and geekier posts in recent times at TrustTheVote. There is a similar but shorter story about our work with Travis County, Texas. As with Ramsey, we started with a bunch of stuff that Travis elections folks gave us, but rather than give chapter and verse, I can summarize a bit.

In fact, I’ll cut to the end, and then go back. We were able to fairly quickly develop data converters from the Travis Nov 2012 data to the same standards-based data format we developed for Ramsey. The exception is the GIS data, which we will circle back to later. This was a really good validation of our data conversion approach. If it extends to other counties as well, we’ll be super pleased.

The full story is that Travis elections folks have been working on election result reporting for some time, as have we at TrustTheVote Project, and we’ve learned a lot from their efforts. Because of those efforts, Travis has worked extensively on how to use the data export capabilities of their voting system product’s election management system. They have enough experience with their Hart Intercivic EMS that they know exactly the right set of export routines to use to dump exactly the right set of files. We then developed data converters to chew up the files and spit out VIP XML for the election definitions, and also a form of VIP XML for the vote tallies.

The structure of the export data roughly corresponds to the VIP schema: one flat TXT file that presents a list of each of the 7 kinds of basic items (precinct, contest, etc.) that we represent as VIP objects, and 4 files that express relations between types of objects, e.g., precincts and districts, or contests and districts. As with Ramsey, the district definitions were a bit sticky. The Travis folks provided a spreadsheet of districts that was a sort of extension of the export file about districts. We had to extend the extensions a bit, for reasons similar to those outlined in the previous account of Ramsey data-wrangling. The rest of the files were a bit crufty, with nothing to suggest the meaning of the column entries other than the name of the file. But with the raw data and some collegial help from Travis elections folks, it mapped pretty simply to the standard data format.
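To give a feel for the last step, here is a minimal sketch of emitting VIP-style XML from normalized precinct data. The element names below are simplified stand-ins; the real VIP schema is richer, versioned, and namespaced, so treat this as an illustration of the shape of the work, not of the actual schema.

```python
import xml.etree.ElementTree as ET

# A sketch of emitting a VIP-style election definition fragment.
# Element names are illustrative; the real VIP schema is more elaborate.
def to_vip_xml(precincts):
    """precincts: iterable of (precinct_id, name, [district_id, ...])."""
    root = ET.Element("vip_object")
    for pid, name, district_ids in precincts:
        p = ET.SubElement(root, "precinct", id=pid)
        ET.SubElement(p, "name").text = name
        for did in district_ids:
            ET.SubElement(p, "electoral_district_id").text = did
    return ET.tostring(root, encoding="unicode")

xml = to_vip_xml([("p-101", "Precinct 101", ["d-travis", "d-cd25"])])
print(xml)
```

The real converters do the same kind of walk over much larger object graphs, after all the normalization described above.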

There was one area, though, where we learned a lot more from Travis. In Travis, with their Hart system, they are able to separately track vote tallies for each candidate (of course, that’s the minimum) as well as write-ins, non-votes that result from a ballot with no choice on it (under-votes), and non-votes that result from a ballot with too many choices (over-votes). That really helped extend the data format for election results beyond what we had from Ramsey. And again, this larger set of results data fit well into our use of the VIP format.

That sort of information helps total up the tallies from each individual precinct, to double-check that every ballot was counted. But there is also supplementary data that helps even more, noting whether an under- or over-vote was from early voting, absentee voting, in-person voting, etc. With further information about rejected ballots (e.g., unsigned provisional ballot affidavits, late absentee ballots), one can account for every ballot cast (whether counted or rejected), every ballot counted, every ballot in every precinct, every vote or non-vote from individual ballots — and so on — to get a complete picture down to the ground in cases where there are razor-thin margins in an election.
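The core accounting identity is simple and worth spelling out: for a single-choice contest, ballots cast should equal candidate votes plus write-ins, under-votes, and over-votes. Here is a sketch of that check, with an invented tally structure (the field names are ours, not Hart’s).

```python
# A sketch of precinct-level ballot accounting for a vote-for-one contest.
# The tally dictionary shape is hypothetical, for illustration only.
def ballots_accounted_for(tally):
    """True if every ballot cast is explained by a candidate vote,
    a write-in, an under-vote, or an over-vote."""
    explained = (sum(tally["candidates"].values())
                 + tally["write_ins"]
                 + tally["under_votes"]
                 + tally["over_votes"])
    return explained == tally["ballots_cast"]

tally = {
    "ballots_cast": 1000,
    "candidates": {"Alice Adams": 520, "Bob Brown": 450},
    "write_ins": 10,
    "under_votes": 15,
    "over_votes": 5,
}
print(ballots_accounted_for(tally))  # 520+450+10+15+5 == 1000
```

When the identity fails, that discrepancy is exactly the kind of thing a close-margin review needs to chase down.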

We’re still digesting all of that, and will likely continue for some time as we continue our election-result work beyond the VoteStream prototype effort. But even at this point, we think that we have the vote-tallies part of the data standard worked out fairly well, with some additional areas for on-going work.

— EJS

Election Results: Data-Wrangling Ramsey County

Next up are several overdue reports on data wrangling of county-level election data, that is, working with election officials to get the legacy data needed for election results, and then putting the data into practical use. It’s where we write software to chew up whatever data we get, put it in a backend system, re-arrange it, and spit it out all tidy and clean, in a standard election data format. From there, we use the standard-format data to drive our prototype system, VoteStream.

I’ll report on each of the three and leave it at that, even though we’ve since forged ahead on pulling in data from other counties as well. These reports from the trenches of VoteStream will be heavy on data-head geekery, so feel free to skip them if that’s not your cup of tea. For better or for worse, however, this is the method of brewing up data standards.

I’ll start with Ramsey County, MN, which was our first go-round. The following is not a short or simple list, but here is what we started with:

  • Some good advice from: Joe Mansky, head of elections in Ramsey County, Minnesota; and Mark Ritchie, Secretary of State and head of elections for Minnesota.
  • A spreadsheet from Joe, listing Ramsey County’s precincts and some of the districts they are in; plus verbal info about other districts that the whole county is in.
  • Geo-data from the Minnesota State Legislative GIS office, with a “shapefile” for each precinct.
  • More data from the GIS office, from which we learned that they use a different precinct-naming scheme than Ramsey County.
  • Some 2012 election result datasets, also from the GIS office.
  • Some 2012 election result datasets from the MN SoS web site.
  • Some more good advice from Joe Mansky on how to use the election result data.
  • The VIP data format for expressing info about precincts and districts, contests and candidates, and an idea for extending that to include vote counts.
  • Some good intentions for doing the minimal modifications to the source data, and creating a VIP standard dataset that defines the election (a JEDI in our parlance, see a previous post for explanation).
  • Some more intentions and hopes for being able to do minimal modifications to create the election results data.

Along the way, we got plenty of help and encouragement from all the organizations I listed above.

Next, let me explain some problems we found, what we learned, and what we produced.

  • The first problem was that the county data and GIS data didn’t match, but we connected the dots and used the GIS version of precinct IDs, which use the national standard, FIPS.
  • County data didn’t include statewide districts, but the election results did. So we again fell back on FIPS, and added standards-based district IDs. (We’ll be submitting that scheme to the standards bodies, when we have a chance to catch our breath.)
  • Election results depend on an intermediate object called “office” that links a contest (say, for state senate district 4) to a district (say, the 4th state senate district), via an office (say, the state senate seat for district 4), rather than a direct linkage. Sounds unimportant, but …
  • The non-local election results used the “office” to identify the contest, and this worked mostly OK. One issue was that the U.S. Congress offices were all numbered, but without mentioning MN. This is a problem if multiple states report results for “Representative, 1st Congressional District,” because all states have a first congressional district. Again, more hacking of the district ID scheme to use FIPS.
  • The local election results did not work so well. A literal reading of the data seemed to indicate that each town in Ramsey County in the Nov. 2012 election had a contest for mayor — the same mayor’s office. Ooops! We needed to augment the source data to make plain *which* mayor’s office the contest was for.
  • Finally, still not done, we had a handful of similarly ambiguous data for offices other than mayor that couldn’t be tied to a single town.
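The FIPS-based approach mentioned above can be sketched briefly. The ID format shown is illustrative, not the scheme we will actually submit to the standards bodies; the underlying fact it relies on is that FIPS gives every state and county a stable numeric code.

```python
# A sketch of building unambiguous district IDs from FIPS codes.
# The concatenation scheme here is illustrative only.
MN_STATE_FIPS = "27"        # Minnesota
RAMSEY_COUNTY_FIPS = "123"  # Ramsey County, within MN

def congressional_district_id(state_fips, district_number):
    """Combine a state FIPS code with a zero-padded district number,
    so "1st Congressional District" is distinct across states."""
    return f"{state_fips}{district_number:02d}"

# MN's 4th congressional district vs. (say) Alabama's 4th:
print(congressional_district_id(MN_STATE_FIPS, 4))  # "2704"
print(congressional_district_id("01", 4))           # "0104"
```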

One last problem, for the ultra data-heads. It turns out that some precincts are not a single contiguous geographical region, but a combination of 2 that touch only at a point, or (weirder) aren’t directly connected. So our first cut at encoding the geo-data into XML (for inclusion in VIP datasets) wasn’t quite right, and the Google Maps view of the data had holes in it.
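KML does have a construct for exactly this case: a single Placemark whose MultiGeometry holds one Polygon per disconnected piece. Here is a sketch of generating that shape; the precinct ID and coordinates are made up, and real precinct boundaries would of course have many more vertices.

```python
import xml.etree.ElementTree as ET

# A sketch of encoding a non-contiguous precinct in KML: one Placemark,
# with a MultiGeometry containing a Polygon per disconnected region.
def precinct_placemark(precinct_id, polygons):
    """polygons: list of rings, each a list of (lon, lat) pairs."""
    pm = ET.Element("Placemark", id=precinct_id)
    multi = ET.SubElement(pm, "MultiGeometry")
    for ring in polygons:
        poly = ET.SubElement(multi, "Polygon")
        outer = ET.SubElement(poly, "outerBoundaryIs")
        lr = ET.SubElement(outer, "LinearRing")
        lr_coords = ET.SubElement(lr, "coordinates")
        lr_coords.text = " ".join(f"{lon},{lat}" for lon, lat in ring)
    return ET.tostring(pm, encoding="unicode")

# Two made-up triangular pieces of one (fictional) precinct.
two_part = [
    [(-93.10, 44.95), (-93.09, 44.95), (-93.09, 44.96), (-93.10, 44.95)],
    [(-93.08, 44.97), (-93.07, 44.97), (-93.07, 44.98), (-93.08, 44.97)],
]
kml = precinct_placemark("27123-0042", two_part)
print(kml.count("<Polygon>"))  # one Polygon per disconnected piece
```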

So, here is what we learned.

  • We had to semi-invent some naming conventions for districts, contests, and candidates, to keep separate everything that was actually separate, and to disambiguate things that sounded the same but were actually different. It’s actually not important if you are only reporting results at the level of one town, but if you want to aggregate across towns, counties, states, etc., then you need more. What we have is sufficient for our needs with VoteStream, but there is real room for more standards like FIPS to make a scheme that works nationwide.
  • Using VIP was simple at first, but when we added the GIS data, and used the XML standard for it (KML), there was a lot of fine-tuning to get the datasets to be 100% compliant with the existing standards. We actually spent a surprising amount of time testing the data model extensions and validations. It was worth it, though, because we have a draft standard that works, even with those wacky precincts shaped like east and west Prussia.
  • Despite that, we were able to finish the data-wrangling fairly quickly and use a similar approach for other counties — once we figured it all out. We did spend quite a bit of time mashing this around and asking other election officials how *their* jurisdictions worked, before we got it all straight.

Lastly, here is what we produced. We now have a set of data conversion software that we can use to take the source data listed above and produce election definition datasets in a repeatable way, making the most effective use of existing standards. We also have a less settled method of data conversion for the actual results — e.g., for precinct 123, for contest X, for candidate Y, there were Z votes — and similarly for all precincts and all contests. That was sufficient for the data available in MN, but not yet sufficient for additional info available in other states but not in MN.

The next steps are: tackle other counties with other source data, and wrangle the data into the same standards-based format for election definitions; extend the data format for more complex results data.

Data wrangling the Nov 2012 Ramsey County election was very instructive — and we couldn’t have done it without plenty of help, for which we are very grateful!

— EJS

Election Results Reporting – Assumptions About Software

Today I’m continuing with the second of a 3-part series about what we at the TrustTheVote Project are hoping to prove in our Election Night Reporting System project. As I wrote earlier, we have assumptions in three areas, one of which is software. I’ll try to put into a nutshell a question that we’re working on answering:

If you were able to get the raw election results data available in a wonderful format, what types of useful Apps and services could you develop?

OK, that was not exactly the shortest question, and in order to understand what “wonderful format” means, you’d have to read my previous post on Assumptions About Data. But instead, maybe you’d like to take a minute to look at some of the work from our previous phase of ENRS work, where we focused on two seemingly unrelated aspects of ENRS technology:

  1. The user experience (UX) of a Web application that local election officials could provide to help ordinary folks visualize and navigate complex election results information.
  2. A web services API that would enable other folks’ systems (not election officials’) to receive and use the data in a manner flexible enough for a variety of other services, ranging from professional data mining to handy mobile apps.
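To picture what such an API could serve, here is one hypothetical shape for a single-precinct response. The endpoint path, field names, and numbers are all illustrative assumptions, not the actual ENRS interface:

```python
import json

# Hypothetical JSON payload a results API might return for a request
# like GET /results/precinct/123 -- all names and values are invented.
payload = {
    "precinct": "123",
    "reporting": "complete",
    "contests": [
        {
            "contest": "Governor",
            "results": [
                {"candidate": "Smith", "votes": 412},
                {"candidate": "Jones", "votes": 389},
            ],
        }
    ],
}

# A consumer (mobile app, data-mining tool) round-trips the JSON
# and works with it directly:
doc = json.loads(json.dumps(payload))
leader = max(doc["contests"][0]["results"], key=lambda r: r["votes"])
print(leader["candidate"])  # Smith
```

The flexibility comes from the structure being self-describing: a data miner can walk every contest in bulk, while a mobile app can pull just one precinct.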

They’re related because the end results embodied a set of assumptions about available data.

Now we’re seeing that this type of data is available, and we’re trying to prove with software prototyping that many people (not just elections organizations, and not just the TrustTheVote Project) could do cool things with that data.

There’s a bit more to say — or rather, to show and tell — that should fit in one post, so I’ll conclude next time.

— EJS

PS: Oh, there is one more small thing: we’ve had a bit of an “ah-ha” here in the Core Team, prodded by our peeps on the Project Outreach team. This data, and the apps and services that can leverage it for all kinds of purposes, has uses far beyond the night of an election. We mentioned that once before, but the ah-ha is that what we’re working on is not just about election night results… it’s about all kinds of election results reporting, any time, anywhere. And that means ENRS is really not that good a code name or acronym. Watch as “ENRS” morphs into “E2RP” for our internal project name — Election Results Reporting Platform.

Towards Standardized Election Results Data Reporting

Now that we are a ways into our “Election Night Reporting System” project, we want to start sharing some of what we are learning.  We had talked about a dedicated Wiki or some such, but our time was better spent digging into the assignment graciously supported by the Knight Foundation Prototype Fund.  Perhaps the best place to start is a summary of what we’ve been saying within the ENRS team, about what we’re trying to accomplish.

First, we’re toying with this silly internal project code name, “ENRS,” and we don’t expect it to hang around forever. Our biggest gripe is that what we’re trying to do extends way beyond election night, but more about that later.

Our ENRS project is based on a few assumptions, or perhaps one could say some hypotheses, that we hope to prove. “Prove” is probably too strong a word. It might be better to say that we expect our assumptions to be valid, but with practical limitations that we’ll discover.

The assumptions are fundamentally about three related topics:

  1. The nature and detail of election results data;
  2. The types of software and services that one could build to leverage that data for public transparency; and
  3. Perhaps most critically, the ability for data and software to interact in a standard way that could be adopted broadly.

As we go along in the project, we hope to say more about the assumptions in each of these areas.

But it is the goal of feasible, broad adoption of standards that is really the most important part. There’s a huge amount of latent value (in terms of transparency and accountability) to be had from aggregating and analyzing election results data at scale. But at present, most of that data is effectively locked up in thousands of little lockboxes of proprietary and/or legacy data formats.
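Once results escape those per-county lockboxes into one shared format, cross-jurisdiction aggregation becomes almost trivial. A sketch under that assumption, with county data and numbers entirely invented:

```python
# Hypothetical: two counties publishing the same contest in a shared format.
county_a = {"contest": "US Senate",
            "candidate_votes": {"Smith": 10234, "Jones": 9876}}
county_b = {"contest": "US Senate",
            "candidate_votes": {"Smith": 4321, "Jones": 5678}}

def aggregate(counties):
    """Sum per-candidate votes across counties reporting the same contest."""
    combined = {}
    for county in counties:
        for candidate, votes in county["candidate_votes"].items():
            combined[candidate] = combined.get(candidate, 0) + votes
    return combined

print(aggregate([county_a, county_b]))
# {'Smith': 14555, 'Jones': 15554}
```

The hard part isn’t this ten-line loop; it’s getting thousands of counties to publish records the loop can read, which is exactly why the standards goal matters most.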

It’s not as though most local election officials — the folks who are the source of election results data, as they conduct elections and the process of tallying ballots — want to keep the data locked up, or to impede others from aggregating results data across counties and states and analyzing it. Rather, most local election officials just don’t have the means to “get the data out” in a way that supports such activities.

We believe that the time is right to create the technology to do just that, and enable election officials to use the technology quickly and easily. And this prototype phase of ENRS is the beginning.

Lastly, we have many people to thank, starting with Chris Barr and the Knight Foundation for its grant to support this prototype project. Further, the current work is based on a previous design phase. Our thanks to our interactive design team led by DDO, and the Travis County, TX Elections Team who provided valuable input and feedback during that earlier phase of work, without which the current project wouldn’t be possible.

— EJS

A New Opening for “Open”

I’m still feeling a bit stunned by recent events: the IRS has finally put us at the starting point we had reasonably hoped to reach about five years ago. Since then, election tech dysfunction hasn’t gone away; U.S. election officials have less funding than ever to run elections; there are more requirements than ever for the use of technology in election-land; there are more public expectations than ever of the operational transparency of “open government,” certainly including elections; and the for-profit tech sector does not offer election officials what they need.

So there’s more to do than we ever expected, and less time to do it in. For today, I want to re-state a focus on “open data” as the part of “open source” that’s used by “open gov” to provide “big data” for public transparency. Actually, I don’t have anything new to say, having re-read previous posts.

It’s still the same. Information wants to be free, and in election land, there is lots of it that we need to see, in order to “trust but verify” that our elections are all that we hope them to be. I’m very happy that we now have a larger scope to work in, to deliver the open tech that’s needed.

— EJS