Some things I’ve made

Data extraction

Using the pdfplumber Python package, wrote a Jupyter notebook script that goes through 6,590 pages in 2,814 documents FOIA'd from the City of Eugene, extracts work order data and writes it to a .csv file.
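
A minimal sketch of that kind of page-by-page extraction, not the actual notebook. The field labels (`Work Order`, `Date`, `Location`), the function names and the CSV columns are all assumptions for illustration; the real documents may be laid out quite differently. Only `pdfplumber.open()`, `pdf.pages` and `page.extract_text()` are pdfplumber's own.

```python
import csv
import re
from pathlib import Path

# Hypothetical field labels; the real work orders may be laid out differently.
FIELD_PATTERNS = {
    "work_order": re.compile(r"Work Order\s*#?:?\s*(\S+)"),
    "date": re.compile(r"Date:\s*(\d{1,2}/\d{1,2}/\d{4})"),
    "location": re.compile(r"Location:\s*(.+)"),
}


def parse_page(text):
    """Pull the labeled fields out of one page of extracted text."""
    row = {}
    for name, pattern in FIELD_PATTERNS.items():
        # extract_text() can come back empty for image-only pages.
        match = pattern.search(text or "")
        row[name] = match.group(1).strip() if match else ""
    return row


def extract_folder(pdf_dir, csv_path):
    """Write one CSV row per PDF page found directly under pdf_dir."""
    import pdfplumber  # third-party: pip install pdfplumber

    with open(csv_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["file", "page", *FIELD_PATTERNS])
        writer.writeheader()
        for pdf_path in sorted(Path(pdf_dir).glob("*.pdf")):
            with pdfplumber.open(str(pdf_path)) as pdf:
                for number, page in enumerate(pdf.pages, start=1):
                    row = parse_page(page.extract_text())
                    writer.writerow({"file": pdf_path.name, "page": number, **row})
```

Keeping the text parsing in its own function means the regular expressions can be checked against pasted sample text without opening any PDFs.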


With data from the .csv file created above, made a map in Mapbox GL, hosted it in a Google Cloud Platform bucket and embedded it in this story presentation, which I built using Gannett's proprietary In Depth framework.

How Lane County voted for president

First attempt at a choropleth, using Mapbox GL & QGIS to join scraped county election results with a precinct shapefile. Exported GeoJSON from QGIS (and then learned about reducing GeoJSON file size).
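
The join here was done in QGIS, but the same two steps (attach results to precincts, then shrink the file) can be sketched in plain Python. This is an illustration under assumptions: the `PRECINCT` property name, the shape of the `results` dict and the function names are hypothetical, and every feature is assumed to have a non-null geometry. Five decimal places of longitude/latitude is roughly one meter, which is usually plenty for a web choropleth.

```python
import json


def round_coords(coords, places=5):
    """Recursively round every number in a GeoJSON coordinate array."""
    if isinstance(coords, (int, float)):
        return round(coords, places)
    return [round_coords(c, places) for c in coords]


def join_and_shrink(geojson, results, key="PRECINCT", places=5):
    """Attach results to matching features and trim coordinate precision."""
    for feature in geojson["features"]:
        precinct = feature["properties"].get(key)
        # Precincts with no scraped result keep their original properties.
        feature["properties"].update(results.get(precinct, {}))
        geometry = feature["geometry"]
        geometry["coordinates"] = round_coords(geometry["coordinates"], places)
    return geojson


def dump_compact(geojson):
    """Serialize without the whitespace that pretty-printed exports include."""
    return json.dumps(geojson, separators=(",", ":"))
```

Rounding and compact serialization only trim bytes per vertex; simplifying the geometry itself (fewer vertices) is a separate step this sketch does not attempt.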


USA TODAY national data & investigations team: ‘A national disgrace’: 40,600 deaths tied to US nursing homes

Pitched in to work on a distributed three-person collaboration of Python developers building scrapers on deadline to supplement the manual collection of state nursing home data for a USA TODAY story detailing the national COVID-19 death toll at long-term care facilities.

Eugene, Ore., police call log

First scraper I wrote; Dec. 2008. Scrapes the Eugene Police Department call log every 15 minutes. Currently >920K rows.

Springfield, Ore., police call log

Scrapes Springfield police call log every 15 minutes. Since 2013, >230K rows.
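
A scraper that polls a call log every 15 minutes will see most calls more than once, so the storage step has to ignore rows it already has. Here is one way that could look with SQLite; it is a sketch, not the code behind either scraper. The table layout, the `event_id` key and the function names are assumptions.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and make sure the calls table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS calls (
               event_id TEXT PRIMARY KEY,  -- hypothetical unique ID per call
               received TEXT,
               description TEXT,
               location TEXT
           )"""
    )
    return conn


def store_calls(conn, calls):
    """Insert scraped rows, skipping any already stored; return the new-row count."""
    before = conn.total_changes
    conn.executemany(
        "INSERT OR IGNORE INTO calls VALUES (:event_id, :received, :description, :location)",
        calls,
    )
    conn.commit()
    return conn.total_changes - before
```

With the primary key doing the de-duplication, the scheduled job (a cron entry such as `*/15 * * * *`, for instance) can hand over the whole scraped page each time without tracking what it has seen.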

Websites that reverse publish, APIs

A place for local credentialed entities to enter meeting information as required by law. Password-protected posts publish immediately to the web (the owner has CRUD capability) and reverse publish daily into the print Civic Calendar item.
Public repo:

No link, for it is currently sad and moribund. (Perhaps resurrected in 2020.) A landing page for local election information, powered by JSON feeds from a Django backend that is fed by a Selenium-powered web scraper of the Oregon Secretary of State site. Outputs results in InDesign tagged text for use in print. (Okay, if you must look, here's a link.)
Public repo:
Sample JSON API response:

A now-superseded Django entertainment calendar app that let anonymous and trusted users enter event information. Listings were available online, and the app generated the weekly Entertainment section listing as InDesign tagged text.
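
Several of these projects reverse publish by emitting InDesign tagged text, which is plain text with a header line naming the encoding and a paragraph-style tag at the start of each paragraph. A rough sketch of that output step follows; it is not the app's code. The style names `EventTitle` and `EventDetail`, the event fields and the function names are made up, the styles are assumed to already exist in the InDesign template, and the input is assumed to be plain ASCII.

```python
def escape(text):
    """Backslash-escape the characters tagged text treats as markup."""
    for char in ("\\", "<", ">"):
        text = text.replace(char, "\\" + char)
    return text


def events_to_tagged_text(events):
    """Render each event dict as a title paragraph plus a detail paragraph."""
    lines = ["<ASCII-WIN>"]  # file header naming the encoding
    for event in events:
        lines.append("<ParaStyle:EventTitle>" + escape(event["title"]))
        detail = "{when}, {venue}".format(**event)
        lines.append("<ParaStyle:EventDetail>" + escape(detail))
    return "\r\n".join(lines) + "\r\n"
```

Because the tags only name styles, the look of the printed listing stays under the control of the page designers' template rather than the web app.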

Online adventure guide listing using Leaflet & OpenStreetMap, powered by Tarbell and Google Sheets, that also produces InDesign tagged text for print.
Public repo:

XML feed mungers, Twitter bots, RSS feeds

Parses a push XML feed every 15 minutes. When there are school delayed openings or closures, the result is this index page, a home page widget and a Tweet from @registerguard. (Note: If there currently isn't bad weather in Lane County, Ore., USA, there probably isn’t a lot to see here.)
Public repo:
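
The parse-then-announce flow for the closures feed can be sketched with the standard library alone. This is illustrative only: the `<closures>`, `<school>`, `<name>` and `<status>` element names are invented, since the real feed's schema isn't shown here, and actually posting the text to Twitter is left out.

```python
import xml.etree.ElementTree as ET

TWEET_LIMIT = 280


def parse_closures(xml_text):
    """Return (school, status) pairs from a hypothetical <closures> feed."""
    root = ET.fromstring(xml_text)
    return [
        (item.findtext("name", "").strip(), item.findtext("status", "").strip())
        for item in root.iter("school")
    ]


def build_tweet(closures, url):
    """Summarize closures in one tweet, or return None when there is nothing to report."""
    if not closures:
        return None
    names = ", ".join(name for name, _ in closures)
    text = f"School delays/closures: {names}. Details: {url}"
    if len(text) > TWEET_LIMIT:
        # Too many schools to list; fall back to a count and the link.
        text = f"{len(closures)} school delays/closures reported. Details: {url}"
    return text
```

Returning `None` for an empty feed is what keeps the bot silent on the many days with no bad weather.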

Public repo:

Also, built an automated print archive.

Our previous CMS had no public-facing archive, so I took the initiative to build one. The only available database driver was written in Java, so I learned enough Jython to get a nightly cronjob export working.

The archive was useful for many things, e.g. it powered story feeds used by The Associated Press, ProQuest, etc. Here's a NewsBank Atom feed.

When it came time to transition to a new CMS, I used the archive app to quickly pull together a custom XML export of nine years' worth of stories (more than 250,000 locally produced items plus related assets) that were all imported into the new CMS; no stories lost.