A collection of links and references on assorted topics: data, visualizations, and other obnoxious programming stuff.
General Data Sources
- Internet Archive
- Papers With Code
- OpenData Tools
- Südtirol OpenData
- Washington Post - 8Billion (Interactive Data Report)
- Satellite Charts
- CIA World Factbook
Climate/weather data
Politics
Music/Scores
Statistics stuff
Data Viz stuff
- https://www.youtube.com/watch?v=pLqjQ55tz-U&ab_channel=TED
- Talk: exploring multidimensional data
- DataViz JobInterview
JS Tables
JS Charting Libs
Nice D3 stuff I would like to use
Digital Curation
OS tweaks
Architecture
NLP (BERT/etc)
- Excellent article: HOWTO use PyTorch to produce DistilBERT via model compression (knowledge distillation)
- HOWTO "Train a new language model from scratch using Transformers and Tokenizers" with DistilBERT
DevOps
Ground truth (GT) datasets
- TC-11 GT datasets
- HTR-united GT datasets
- List of OCR-related GT datasets curated by C. Neudecker
Nice code challenges
Other handy DS and ML stuff
- Blog - fastforwardlabs
- Article (TowardsDS) - how-i-deployed-a-sentiment-analyser-api-with-spacy-flask-and-heroku
- Lib - PyPyr: task runner for automation pipelines; script sequential workflow steps in YAML, with conditional execution, loops, error handling, and retries
- Article (Medium) - Using Continuous Machine Learning to Run Your ML Pipeline
- GitHub - CML: open-source CLI tool for implementing continuous integration & delivery (CI/CD) with a focus on MLOps
- BlogPost - ML Ops Template
- Book - Surfing the Data Pipeline with Python
Other GitHub Sources & Lists
Other nice Blogs/Articles/etc
TOCODE
- https://www.youtube.com/watch?v=GVrjv9ajsJ0&ab_channel=ChristopherEdwards
- https://www.youtube.com/watch?v=ehAaFAASCeM&ab_channel=QuickCodingTuts
- https://www.youtube.com/watch?v=2-tnkzG0sKU&ab_channel=RedStapler
- https://www.youtube.com/watch?v=nzshmMlOuwI&ab_channel=BuildAppsWithPaulo
- https://www.youtube.com/watch?v=UQ_kqGDM8A4&ab_channel=CurranKelleher
Call for Action
Papers