I'd like to learn the Hugging Face ecosystem better (transformers, datasets, accelerate + more).
So this repo is to help me learn it and simultaneously teach others.
Each example will include an end-to-end approach of starting with a dataset (custom or existing), building and evaluating a model and creating a demo to share.
Teaching style:
A machine learning cooking show! 👨‍🍳
Mottos:
- If in doubt, run the code. - Machine learning is very experimental. So it's good to get in the habit of continually trying things (even if you think they won't work).
- Visualize, visualize, visualize! - If you're not sure of some dataset or some operation or some predictions, visualize it/them.
- Experiment, experiment, experiment! - Again, machine learning is very experimental. So keep trying different things!
- Data, model, demo! - Create/get a dataset, build/train/evaluate a model, create a demo to share.
Project style:
Data, model, demo!
- Create a new/reuse an existing dataset.
- Train/evaluate a model.
- Build a demo to share.
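
To make the three steps concrete, here's a minimal sketch in plain Python. It's a toy stand-in rather than Hugging Face code: the six captions, the word-counting "model" and the function names (`train`, `predict`, `evaluate`, `demo`) are all made up for illustration. In the actual projects, each step would use tools from the Hugging Face ecosystem instead.

```python
from collections import Counter

# 1. Data: a hypothetical toy dataset of (caption, label) pairs.
DATASET = [
    ("a plate of spaghetti with tomato sauce", "food"),
    ("fresh sushi rolls on a wooden board", "food"),
    ("a bowl of fruit salad with yoghurt", "food"),
    ("a red car parked on the street", "not_food"),
    ("two dogs playing in the park", "not_food"),
    ("a laptop on an office desk", "not_food"),
]


def train(dataset):
    """2a. Model: count how often each word appears under each label."""
    counts = {"food": Counter(), "not_food": Counter()}
    for caption, label in dataset:
        counts[label].update(caption.lower().split())
    return counts


def predict(model, caption):
    """Score each label by summing the counts of the caption's words."""
    words = caption.lower().split()
    scores = {
        label: sum(word_counts[word] for word in words)
        for label, word_counts in model.items()
    }
    # Pick the label with the highest score (ties go to the first label).
    return max(scores, key=scores.get)


def evaluate(model, dataset):
    """2b. Evaluate: fraction of captions labelled correctly."""
    correct = sum(predict(model, caption) == label for caption, label in dataset)
    return correct / len(dataset)


def demo(model, caption):
    """3. Demo: a tiny text interface someone else could try."""
    return f"{caption!r} -> {predict(model, caption)}"


model = train(DATASET)
# Simplification: this evaluates on the training data. A real project
# would hold out a separate test split.
accuracy = evaluate(model, DATASET)
print(f"Accuracy: {accuracy:.2f}")
print(demo(model, "a bowl of sushi"))
```

The point is the shape of the workflow, not the model itself: swap any of the three pieces for something better and the other two still fit.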
This will be our (rough) workflow:
A general Hugging Face workflow from idea to shared model and demo using tools from the Hugging Face ecosystem. These kinds of workflows are not set in stone and are more of a guide than specific directions. See information on each of the tools in the Hugging Face documentation.
All code and text will be free/open-source; video step-by-step walkthroughs are available as a paid upgrade.
Project | Description | Dataset | Model | Demo | Video Course |
---|---|---|---|---|---|
Text classification | Build project "Food Not Food", a text classification model to classify image captions into "food" if they're about food or "not_food" if they're not about food. This is the ideal place to get started if you've never used the Hugging Face ecosystem. | Dataset | Model | Demo | Video Course |
More to come soon! | Let me know if you'd like to see anything specific by leaving an issue. |
Ideal for:
- Beginners who love things explained in detail.
- Someone who wants to create more of their own end-to-end machine learning projects.
Not ideal for:
- People with 2-3+ years of machine learning projects & experience^.
^Note: This being said, you may actually find some things helpful along the way. Best to explore and see!
Prerequisites:
- 3-6 months Python experience.
- 1x beginner machine learning or deep learning course (see my beginner-friendly ML course to learn Python + important ML concepts in one).
- PyTorch experience is a bonus (see my Learn PyTorch in a Day video or learnpytorch.io).
Hugging Face is a platform that offers access to many different kinds of open-source machine learning models and datasets.
They're also the creators of the popular transformers library (and many more helpful libraries), a Python-based library for working with pre-trained models as well as custom models.
If you're getting into the world of AI and machine learning, you're going to come across Hugging Face.
A handful of pieces from the Hugging Face ecosystem. There are many more available in Hugging Face documentation.
Many of the biggest companies in the world use Hugging Face for their open-source machine learning projects, including Apple, Google, Facebook (Meta), Microsoft, OpenAI, ByteDance and more.
Not only does Hugging Face make it so you can use state-of-the-art machine learning models such as Stable Diffusion (for image generation) and Whisper (for audio transcription) easily, it also makes it so you can share your own models, datasets and resources.
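
As a rough sketch of what "easily" can look like, the snippet below wraps the transformers `pipeline` function for audio transcription. The checkpoint name `openai/whisper-tiny`, the helper names (`load_transcriber`, `transcribe`) and any audio file path are assumptions for illustration. Running it for real needs transformers installed, a backend such as PyTorch, and an internet connection to download the model; decoding audio files typically also requires ffmpeg.

```python
from typing import Callable, Optional

# Assumption: a small public Whisper checkpoint on the Hugging Face Hub.
WHISPER_MODEL_ID = "openai/whisper-tiny"


def load_transcriber(model_id: str = WHISPER_MODEL_ID) -> Callable:
    """Build a speech recognition pipeline (downloads the model on first use)."""
    # Imported lazily so the module still loads without transformers installed.
    from transformers import pipeline

    return pipeline("automatic-speech-recognition", model=model_id)


def transcribe(audio_path: str, transcriber: Optional[Callable] = None) -> str:
    """Return the transcribed text for an audio file.

    `transcriber` is any callable that maps an audio path to a dict with a
    "text" key, which is what the speech recognition pipeline returns.
    """
    if transcriber is None:
        transcriber = load_transcriber()
    result = transcriber(audio_path)
    return result["text"].strip()
```

With a recording on disk (the filename here is hypothetical), usage would be `print(transcribe("my_recording.wav"))`. Passing your own `transcriber` lets you load the model once and reuse it across many files.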
Aside from your own website, consider Hugging Face the homepage of your AI/machine learning profile.
- Prerequisites
- Ecosystem overview (transformers, datasets, accelerate, tokenizers, Spaces, demos, models, hub etc.)
- Text classification
- Object detection
- Named entity recognition
- LLM fine-tuning
- VLM fine-tuning
- RAG workflow
- Zero-shot image classification/multi-modal workflows (CLIP)
See setup.
- Hugging Face documentation - https://huggingface.co/
- Hugging Face cookbook - https://github.com/huggingface/cookbook
Is this an official Hugging Face website?
No, it's a personal project by myself (Daniel Bourke) to learn and help others learn the Hugging Face ecosystem.