ariellemartinez
Data reporter, Newsday. Stony Brook University School of Journalism Class of 2017. Also on Observable @ariellemartinez.
New York, NY
ariellemartinez's Stars
tesseract-ocr/tesseract
Tesseract Open Source OCR Engine (main repository)
py-pdf/pypdf
A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
jsvine/pdfplumber
Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.
wireservice/csvkit
A suite of utilities for converting to and working with CSV, the king of tabular file formats.
pdfminer/pdfminer.six
Community maintained fork of pdfminer - we fathom PDF
dedupeio/dedupe
🆔 A python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution.
jsoma/tabletop
Tabletop.js gives spreadsheets legs
tabulapdf/tabula-java
Extract tables from PDF files
nytimes/library
A collaborative documentation site, powered by Google Docs.
govtrack/govtrack.us-web
The Django source code for the GovTrack.us website.
CJWorkbench/cjworkbench
The data journalism platform with built-in training
RobinL/fuzzymatcher
Record linking package that fuzzy matches two Python pandas dataframes using sqlite3 fts4
maxharlow/csvmatch
🔎 Finds fuzzy matches between CSV files
associatedpress/harvester
Collaborative data collection tool developed by the Associated Press
propublica/django-collaborative
ProPublica's collaborative tip-gathering framework. Import and manage CSV, Google Sheets and Screendoor data with ease.
jsoma/fuzzy_pandas
Fuzzy matches and merging of datasets in pandas using csvmatch
associatedpress/datakit-core
Core library for the datakit CLI framework.
palewire/first-python-notebook
A step-by-step guide to analyzing data with Python and the Jupyter notebook.
sunlightlabs/brisket
Influence Explorer: Following the political influence of people, politicians, and organizations
mtdukes/how-to
A collection of cheat sheets for remembering common commands and tips for data journalism work.
palewire/first-github-scraper
An introduction to free, automated web scraping with GitHub’s powerful new Actions framework.
ICIJ/prophecies
An ICIJ app to conduct data validation and cleaning.
ireapps/teaching-guide-python-scraping
Teaching guide for a one-hour hands-on session at an IRE/NICAR conference on scraping web data using Python.
ruthwtalbot/NICAR-2022
UrbanInstitute/MortgagesByRace
was http://datatools.urban.org/Features/mortgages-by-race/#8/41.923/-86.149 (note everything after the # is an internal link or data)
lamthuyvo/2021-10-gentrification-analysis
amkessler/goldschmidt2024_workshop
cjwinchester/ire22-python-for-data-analysis
Materials for a Python class at #IRE22 in Denver.
next-LI/industry-employment
tjk911/getting_scrape_y