Pinned Repositories
Bikeshare
Archive a snapshot of DC's bikesharing station vacancy and usage, appending to a CSV.
campaignclusters
Using similarity algorithms to detect bundlers and political factions in campaign finance
csvkit
A suite of utilities for converting to and working with CSV, the king of tabular file formats.
dccampfin
Parse the PDFs of the District of Columbia's campaign finance disclosures into CSVs for expenditures and contributions. The PDFs include more fields than the supplied CSVs, specifically employment information, to detect bundling and conflict of interest.
disbursements
Data and scripts relating to the publishing of the House expenditure reports and, hopefully, the Senate's in the future.
docsplitter
OCR/extract text from 100s or 1000s of PDFs using AWS, similar to DocumentCloud
fecmaster
Download weekly master tables from the Federal Election Commission's FTP server, from 1980 on.
pycrp
Pull OpenSecrets.org campaign finance info into MySQL
pysec
Parse XBRL filings from the SEC's EDGAR in Python
lukerosiak's Repositories
lukerosiak/pysec
Parse XBRL filings from the SEC's EDGAR in Python
lukerosiak/fecmaster
Download weekly master tables from the Federal Election Commission's FTP server, from 1980 on.
lukerosiak/docsplitter
OCR/extract text from 100s or 1000s of PDFs using AWS, similar to DocumentCloud
lukerosiak/campaignclusters
Using similarity algorithms to detect bundlers and political factions in campaign finance
lukerosiak/dccampfin
Parse the PDFs of the District of Columbia's campaign finance disclosures into CSVs for expenditures and contributions. The PDFs include more fields than the supplied CSVs, specifically employment information, to detect bundling and conflict of interest.
lukerosiak/csvkit
A suite of utilities for converting to and working with CSV, the king of tabular file formats.
lukerosiak/pycrp
Pull OpenSecrets.org campaign finance info into MySQL
lukerosiak/Bikeshare
Archive a snapshot of DC's bikesharing station vacancy and usage, appending to a CSV.
lukerosiak/disbursements
Data and scripts relating to the publishing of the House expenditure reports and, hopefully, the Senate's in the future.
lukerosiak/migration
Download all years of county-to-county migration data from the IRS and construct Postgres tables and views for analysis.
lukerosiak/oge-travel
Parse privately-sponsored executive branch travel
lukerosiak/court
Look up a list of names in DC area criminal, civil and liens courts
lukerosiak/datacommons
The core of sunlightlabs' Data Commons project. Includes the Transparency Data site and the APIs that power TransparencyData.com and InfluenceExplorer.com
lukerosiak/inspectors-general
Collecting reports from Inspectors General across the US federal government.
lukerosiak/mysql-politicalpartytime
MySQL importer for the Sunlight Foundation's PoliticalPartyTime.org archive of lobbyist-hosted Congressional fundraising events.
lukerosiak/dbcongress
Turn XML/JSON/YAML from the github.com/unitedstates project into a relational database
lukerosiak/drudge
Scrape the Drudge Report and calculate stats on the types of stories he's featuring. Optionally send email alerts.
lukerosiak/nicar-advanced
lukerosiak/oge
Harness White House nominees' ethics letters
lukerosiak/oversight.garden
Bringing together the oversight community's work.