[lug] deconstructing pdf of voted Boulder ballots

Neal McBurnett neal at bcn.boulder.co.us
Fri Dec 9 00:17:40 MST 2011


Al Kolwicz recently obtained (via FOIA/CORA) some partial image scans of ballots cast in Boulder for 2011.  I've put one sample of what he got at 
 http://bcn.boulder.co.us/~neal/elections/boulder-2011/Batch_051.pdf

I'll spare you the gory details, since they go on and on. They do touch on some very interesting issues of transparency and anonymity, though, which have been in the press recently, are now on appeal to the Colorado Supreme Court, and have inspired proposed limits on freedom of information requests like this one.

The images seem to have been processed along the way, with edges cut off, etc.  I think they were produced by dumping the original scanner files from the encrypted(?) database used by Hart's Ballot Now software.

One goal is to recover the underlying raw data, so it can be fed into open source ballot interpretation programs like OpenScan,

 http://cseweb.ucsd.edu/~hovav/papers/wrsb10.html

or the one used in Humboldt County CA, and thus get an independent tally.

I'm puzzling over the pdf file, though.  I expect there are various tools that will give me an image dump of each page, and I have gotten one manually from pdfedit.  But I first want to figure out exactly how this file was produced, and see if we can get as close to the original raw data format as possible.

But while Linux tools like pdfedit and qpdf show me all sorts of xrefs and objects and streams, I'm no pdf expert and I don't see any that seem to be images.
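For what it's worth, here is a rough sketch of the kind of check I have in mind, using only the Python standard library. It scans the raw bytes of a pdf for stream objects whose dictionary says /Subtype /Image and reports each one's size and /Filter, since the filter name indicates how the pixels are encoded. The function name find_image_streams and the FILTER_HINTS table are my own inventions for illustration, not part of any tool. It assumes the stream dictionaries are readable plain text (stream objects cannot be packed inside compressed object streams, so that is usually true), that /Width and /Height are written as direct numbers, and that no binary stream data happens to look like an object header. It would not find inline images embedded in a page's content stream.

```python
import re
import sys

# An object header immediately followed by the "stream" keyword. The tempered
# dot keeps a match from running past "endobj" into the next object.
STREAM_OBJ = re.compile(
    rb"(\d+)\s+(\d+)\s+obj\s*(<<(?:(?!endobj).)*?>>)\s*stream\r?\n",
    re.DOTALL,
)

# Standard PDF stream filters and the image encoding each one implies.
FILTER_HINTS = {
    "/DCTDecode": "JPEG",
    "/JPXDecode": "JPEG 2000",
    "/CCITTFaxDecode": "CCITT Group 3/4 fax (common for bitonal scans)",
    "/JBIG2Decode": "JBIG2",
    "/FlateDecode": "zlib-compressed raw samples",
    "/LZWDecode": "LZW-compressed raw samples",
}


def _value(header, key, pattern):
    """Return the text following /key in a stream dictionary, or None."""
    m = re.search(rb"/" + key + rb"\s*(" + pattern + rb")", header)
    return m.group(1).decode("latin-1") if m else None


def find_image_streams(pdf_bytes):
    """List the stream objects whose dictionary says /Subtype /Image."""
    images = []
    for m in STREAM_OBJ.finditer(pdf_bytes):
        header = m.group(3)
        if not re.search(rb"/Subtype\s*/Image\b", header):
            continue
        width = _value(header, b"Width", rb"\d+")
        height = _value(header, b"Height", rb"\d+")
        images.append({
            "obj": int(m.group(1)),
            "width": int(width) if width else None,
            "height": int(height) if height else None,
            # A name like /DCTDecode, or an array of names for chained filters.
            "filter": _value(header, b"Filter", rb"\[[^\]]*\]|/[A-Za-z0-9]+"),
        })
    return images


# Hypothetical usage: python find_images.py Batch_051.pdf
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as f:
        for img in find_image_streams(f.read()):
            hint = FILTER_HINTS.get(img["filter"], "unknown or chained")
            print(img, "->", hint)
```

If this turns up nothing, the page content may be built some other way (inline images, for instance). If it reports /DCTDecode, the bytes between "stream" and "endstream" should already be a JPEG file; a /CCITTFaxDecode stream would need to be wrapped in a TIFF header before most viewers would open it.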

Can any of you pdf gurus help out?

Cheers,

Neal McBurnett                 http://neal.mcburnett.org/


