🙂 Vincent D. Warmerdam
┣━━ 📦 Open Source Packages
┃   ┣━━ bulk              - simple bulk labelling interface
┃   ┣━━ embetter          - embeddings ready for sklearn
┃   ┣━━ doubtlab          - suite of tools to help find bad labels
┃   ┣━━ scikit-lego       - lego bricks for sklearn
┃   ┣━━ scikit-partial    - partial_fit() pipelines for sklearn
┃   ┣━━ scikit-prune      - prune scikit-learn pipelines
┃   ┣━━ scikit-bloom      - bloom transformers for sklearn
┃   ┣━━ human-learn       - rule-based components for sklearn
┃   ┣━━ sentence-models   - a different take on textcat
┃   ┣━━ mktestdocs        - turn markdown files into pytest tests
┃   ┣━━ simsity           - a super simple similarities service
┃   ┣━━ lazylines         - lightweight utils for .jsonl wrangling
┃   ┣━━ cluestar          - inspiration for your first text labels
┃   ┣━━ drawdata          - draw datasets in jupyter
┃   ┣━━ durations         - pytest duration insights
┃   ┣━━ tuilwindcss       - tailwindcss for textual tui apps
┃   ┣━━ memo              - saves a whole log of time
┃   ┣━━ skedulord         - makes cron a bit more fun
┃   ┣━━ icepickle         - cool and safe storage for linear models
┃   ┗━━ evol              - grammar for genetic heuristics
┣━━ 🔬 Experiments
┃   ┣━━ valves         - general .pipe()-lines
┃   ┣━━ akin           - sort based on zero-shot similarities
┃   ┣━━ tjek           - tjek changes with the main branch
┃   ┣━━ gitlit         - tracking GitHub Actions times across open source
┃   ┣━━ sentimany      - many sentiment models, one repo
┃   ┣━━ tokenwiser     - sklearn token tricks
┃   ┣━━ benchmarks     - some random, but interesting, benchmarks
┃   ┣━━ clumper        - functional API for lists of dicts
┃   ┗━━ whatlies       - exploration tools for word embeddings
┣━━ 👍 Contributions
┃   ┣━━ fairlearn      - contributed the CorrelationFilter
┃   ┣━━ polars         - contributed the .pipe() method
┃   ┗━━ BERTopic       - added lightweight sklearn pipeline support
┣━━ ⭐ Online Projects
┃   ┣━━ koaning.io     - personal blog
┃   ┣━━ calmcode.io    - dev education service
┃   ┗━━ dearme.email   - reflection service
┣━━ 🎙️ Popular Talks
┃   ┣━━ Group-by statements that save the day
┃   ┣━━ Tools to Improve Training Data
┃   ┣━━ Optimal on Paper, Broken in Reality
┃   ┣━━ Playing by the Rules-Based-Systems
┃   ┣━━ How to Constrain Artificial Stupidity
┃   ┣━━ The Profession of Solving the Wrong Problem
┃   ┣━━ Winning with Simple, even Linear, Models
┃   ┗━━ Untitled12.ipynb
┗━━ 👨‍💻 Employer
    ┣━━ 💥 Explosion   - developer tools for ml
    ┃   ┣━━ prodigy-hf        - Prodigy integration for the HuggingFace stack
    ┃   ┣━━ prodigy-tui       - Prodigy from the terminal
    ┃   ┣━━ prodigy-pdf       - Annotate PDFs via Prodigy
    ┃   ┣━━ prodigy-ann       - ANN techniques to find relevant subsets
    ┃   ┣━━ prodigy-lunr      - Search techniques to find relevant subsets
    ┃   ┗━━ cluestar          - inspiration for your first text labels
    ┗━━ 🤖 Rasa       - conversational software
        ┣━━ nlu examples      - custom nlu components for Rasa
        ┣━━ taipo             - data augmentation tools
        ┗━━ algo whiteboard   - nlp education

Follow me on Twitter @fishnets88