🙂 Vincent D. Warmerdam
┣━━ 📦 Open Source Packages
┃   ┣━━ bulk - simple bulk labelling interface
┃   ┣━━ embetter - embeddings ready for sklearn
┃   ┣━━ doubtlab - suite of tools to help find bad labels
┃   ┣━━ scikit-lego - lego bricks for sklearn
┃   ┣━━ scikit-partial - partial_fit() pipelines for sklearn
┃   ┣━━ scikit-prune - prune scikit learn pipelines
┃   ┣━━ scikit-bloom - bloom transformers for sklearn
┃   ┣━━ human-learn - rule-based components for sklearn
┃   ┣━━ sentence-models - a different take on textcat
┃   ┣━━ mktestdocs - turn markdown files into pytest tests
┃   ┣━━ simsity - a super simple similarities service
┃   ┣━━ lazylines - lightweight utils for .jsonl wrangling
┃   ┣━━ cluestar - inspiration for your first text labels
┃   ┣━━ drawdata - draw datasets in jupyter
┃   ┣━━ durations - pytest duration insights
┃   ┣━━ tuilwindcss - tailwindcss for textual tui apps
┃   ┣━━ memo - saves a whole log of time
┃   ┣━━ skedulord - makes cron a bit more fun
┃   ┣━━ icepickle - cool and safe storage for linear models
┃   ┗━━ evol - grammar for genetic heuristics
┣━━ 🔬 Experiments
┃   ┣━━ valves - general .pipe()-lines
┃   ┣━━ akin - sort based on zero-shot similarities
┃   ┣━━ tjek - tjek changes with the main branch
┃   ┣━━ gitlit - tracking github action times across open source
┃   ┣━━ sentimany - many sentiment models, one repo
┃   ┣━━ tokenwiser - sklearn token tricks
┃   ┣━━ benchmarks - some random, but interesting, benchmarks
┃   ┣━━ clumper - functional API for lists of dicts
┃   ┗━━ whatlies - exploration tools for word embeddings
┣━━ 👍 Contributions
┃   ┣━━ fairlearn - contributed the CorrelationFilter
┃   ┣━━ polars - contributed the .pipe() method
┃   ┗━━ BERTopic - added lightweight sklearn pipeline support
┣━━ ⭐ Online Projects
┃   ┣━━ koaning.io - personal blog
┃   ┣━━ calmcode.io - dev education service
┃   ┗━━ dearme.email - reflection service
┣━━ 🎙️ Popular Talks
┃   ┣━━ Group-by statements that save the day
┃   ┣━━ Tools to Improve Training Data
┃   ┣━━ Optimal on Paper, Broken in Reality
┃   ┣━━ Playing by the Rules-Based-Systems
┃   ┣━━ How to Constrain Artificial Stupidity
┃   ┣━━ The Profession of Solving the Wrong Problem
┃   ┣━━ Winning with Simple, even Linear, Models
┃   ┗━━ Untitled12.ipynb
┗━━ 👨‍💻 Employer
    ┣━━ 💥 Explosion - developer tools for ml
    ┃   ┣━━ prodigy-hf - Prodigy integration for the HuggingFace stack
    ┃   ┣━━ prodigy-tui - Prodigy from the terminal
    ┃   ┣━━ prodigy-pdf - Annotate PDFs via Prodigy
    ┃   ┣━━ prodigy-ann - ANN techniques to find relevant subsets
    ┃   ┣━━ prodigy-lunr - Search techniques to find relevant subsets
    ┃   ┗━━ cluestar - inspiration for your first text labels
    ┗━━ 🤖 Rasa - conversational software
        ┣━━ nlu examples - custom nlu components for Rasa
        ┣━━ taipo - data augmentation tools
        ┗━━ algo whiteboard - nlp education

Follow me on twitter @fishnets88