ortsed's Stars
RaRe-Technologies/gensim
Topic Modelling for Humans
OpenRefine/OpenRefine
OpenRefine is a free, open source power tool for working with messy data and improving it
modin-project/modin
Modin: Scale your Pandas workflows by changing a single line of code
EpistasisLab/tpot
A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
ricklamers/gridstudio
Grid studio is a web-based application for data science with full integration of open source data science frameworks and languages.
DistrictDataLabs/yellowbrick
Visual analysis and diagnostic tools to facilitate machine learning model selection.
jarun/ddgr
🦆 DuckDuckGo from the terminal
target/matrixprofile-ts
A Python library for detecting patterns and anomalies in massive datasets using the Matrix Profile
sahandha/eif
Extended Isolation Forest for Anomaly Detection
simonw/django-sql-dashboard
Django app for building dashboards using raw SQL queries
jamesmishra/mysqldump-to-csv
A quickly hacked-together Python script to turn mysqldump files into CSV files. Optimized for Wikipedia database dumps.
cphyc/matplotlib-label-lines
Label lines using matplotlib.
LexPredict/openedgar
OpenEDGAR (openedgar.io)
toddwschneider/sec-13f-filings
A nicer way to view SEC 13F filings data
JakeColtman/bartpy
Bayesian Additive Regression Trees For Python
OpenFIGI/api-examples
Examples of programs that interact with the OpenFIGI services via their APIs.
Lyonk71/pandas-dedupe
Simplifies use of the Dedupe library via Pandas
jsfenfen/990-xml-reader
IRSx: Turn the IRS' versioned XML 990 nonprofit annual tax returns into standardized Python objects, JSON, or human-readable text with original line numbers and descriptions.
seanpianka/Zipcodes
A simple library for querying U.S. zipcodes.
mediacloud/date_guesser
A library to extract a publication date from a web page, along with a measure of the accuracy.
SCPR/kpcc-data-team
Where we attempt to lay a foundation, document practices and find our way to sharing the work we do and tools we use to do it at KPCC/SCPR
associatedpress/national-caseload-data-ingest
Scripts to download the U.S. Department of Justice's National Caseload Data and load it into Amazon Athena for querying
danielmoreira/sciint
Source code and experimental results of our scientific integrity verification system.
PublicI/state-lawmakers-disclosures
Data collected from the personal financial disclosure reports of 6,933 state legislators
data-liberation-project/phmsa-hazmat-incident-reports
Data from decades of PHMSA's "5800.1" hazardous material transportation incident reports
BuzzFeedNews/2022-04-icf-analysis
Data and analysis of intermediate care facilities, supporting a BuzzFeed News investigation.
rchowe/textsql
Run SQLite commands on text files.
danbauman77/tophat
phillipecardenuto/rsiil
Recod.ai Scientific Image Integrity Library
wpinvestigative/ppp_loans