dannguyen
My name is Dan Nguyen and you can find me on GitHub as dannguyen & @dancow on Twitter. My interests are investigative journalism, data wrangling, and naming thi
data scientist wrangler/engineer
Chicago
Pinned Repositories
abbyy-finereader-ocr-senate
Evaluating the performance and accuracy of ABBYY FineReader's OCR on Senate Financial Disclosure scanned forms
github-for-portfolios
A layperson's step-by-step guide to building webpages with GitHub
journalism-syllabi
Computer-Assisted Reporting and Data Journalism Syllabuses, compiled by Dan Nguyen
python-notebooks-data-wrangling
Python 3.x notebooks about real-world data cleaning and visualization
watson-word-watcher
A proof of concept using IBM's Speech-to-Text API to do quick-and-dirty transcriptions
dannguyen's Repositories
dannguyen/journalism-syllabi
Computer-Assisted Reporting and Data Journalism Syllabuses, compiled by Dan Nguyen
dannguyen/bashfoo
My personally curated list of bash/command-line commands and snippets that are very useful yet I keep on forgetting
dannguyen/datajournalism-primer
a general list of resources and articles for people interested in getting into data journalism
dannguyen/screencappy
A command-line tool for making it easier to create and save screenshots as a blogger
dannguyen/csvs-to-sqlite
Convert CSV files into a SQLite database
dannguyen/excsv
goofin around with a command-line utility for quickly inspecting CSV files
dannguyen/bashappy_helpers
A bunch of helper functions I wrote to use for my own macOS terminal convenience
dannguyen/command-line-basics-mz2022
command line lessons for 2022 quickie repo
dannguyen/jekyll-datasite-template
Trying to make a template that scaffolds a basic jekyll site with bootstrap and vendor d3v5
dannguyen/bsky-api-datagatherer-for-fun
exploring the bluesky api for bulk collection of personal data via the python api
dannguyen/censusscout
making my own lightweight version of Census Explorer because y not
dannguyen/config-files
My configuration files
dannguyen/covidusa
using johns hopkins data and practicing javascript ignore me plz
dannguyen/dancow-bluesky-fun-api-tool
Just having fun with python and Bluesky's AT Protocol. Trying to build a simple CLI and enough of SDK to easily explore and collect my own Bluesky data
dannguyen/hello-svelte
need to practice this javascript thing
dannguyen/atproto-ecosystem
list of projects and implementations in the AT protocol ecosystem
dannguyen/blueskyprofiler
a quick web app to check a bluesky profile's stats
dannguyen/chicago-city-data-processing-project-2024
dannguyen/chimp-data-pipeline
dannguyen/csvwebhelper-built-by-claude
testing out Claude Sonnet 3.7 and the claude CLI programming agent
dannguyen/daily-treasury-statement
A copy of the Daily Treasury Statement (starting from 2005-10-03) as found on fiscaldata.treasury.gov
dannguyen/dang-sqlfluffer
Dan Nguyen's opinionated wrapper around sqlfluff; GO LEADING COMMAS
dannguyen/gathering-bigquery-public-census-data
Just a cute repo that gathers and collates the Census ACS data as found on bigquery-public-data
dannguyen/grafte
dannguyen/howto-whisper-autotranscribe
Info on how to set up and run auto-transcription (via OpenAI's WhisperAI, and other variations) with extra features (speaker diarization and word segmentation) for August
dannguyen/marimo-viz-demo
just a test to see how marimo notebooks render as github repos
dannguyen/matplotlib-tutorial
Matplotlib tutorial for beginner
dannguyen/open-webui
my fork of open-webui (ai interface) to work with python 12.7
dannguyen/tsty
dannguyen/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization) (custom patched to use fasterwhisper 1.0.2)