Using transformers for innovation and discovery
Deep learning frameworks greatly simplify the use of models. We cover the essential theory, skills, and technical detail that all users need to take advantage of these frameworks. This workshop centers on transformer models and their application to sequential data, including natural language processing, image processing, and audio processing.
We're so excited that you're joining our workshop! There are a few things you'll need to hit the ground running.
- Hugging Face Account: Access workshop data, models, and user interfaces, and update your own private or public model info. The free tier is sufficient for the workshop.
- Google Colab Account: If you don't have a Gmail account, create one to access Google Colab. The free tier is sufficient for this workshop. If you find you need more compute, upgrade to a Google Colab Pro account!
- Familiarity with Python: Make sure you know standard Python data structures (lists, dictionaries), list and dictionary comprehensions, functions, conditional execution (if statements), and basic column/row indexing using pandas. Seem unfamiliar? Check out the readme in this intro Python repo to access two 2-hour workshops on basic Python.
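As a quick self-check for the Python prerequisite, here is a minimal sketch of the constructs listed above. The variable and function names are made up for illustration, and pandas indexing is left out so the snippet runs with the standard library only. If every line reads naturally to you, you should be in good shape.

```python
# A list, one of the standard Python data structures.
words = ["attention", "is", "all", "you", "need"]

# List comprehension: keep words longer than 3 characters, upper-cased.
long_words = [w.upper() for w in words if len(w) > 3]

# Dictionary comprehension: map each word to its length.
word_lengths = {w: len(w) for w in words}


# A function using conditional execution (if / elif / fall-through return).
def describe(word):
    if len(word) > 5:
        return "long"
    elif len(word) > 2:
        return "medium"
    return "short"


labels = [describe(w) for w in words]

print(long_words)
print(word_lengths)
print(labels)
```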
The workshops are 9am-1pm CT daily January 4th to January 9th, 2022. The first 3 hours are hands-on, and the final hour from 12-1pm is reserved for office hours. Please feel free to drop by during office hours for help with code bugs, specifics of your application, or discussions about your approach!
- What are transformers? Let's do a quick tour!
- What can I do with transformers? What tasks are possible? Learn more here!
- Should I fine-tune my model? What is fine-tuning? Let's see with standard datasets!
- How can I collaborate and disseminate my work?
Homework: Explore pipelines. Define what you'd like to do with Transformers for your application.
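For the pipelines homework, a starting point might look like the sketch below. It assumes the `transformers` library and a backend such as PyTorch are installed (as they typically are in Colab) and that the default checkpoint for the task can be downloaded. The helper names `format_results` and `run_sentiment_demo` are hypothetical, and sentiment analysis is just one of the available tasks.

```python
def format_results(texts, results):
    """Pair each input text with its predicted label and rounded score."""
    return [(t, r["label"], round(r["score"], 3)) for t, r in zip(texts, results)]


def run_sentiment_demo(texts):
    # Imported lazily so format_results works even without transformers installed.
    from transformers import pipeline

    # With no model argument, pipeline() picks a default checkpoint for the task
    # and downloads it on first use.
    classifier = pipeline("sentiment-analysis")

    # For a list of strings, the pipeline returns one
    # {"label": ..., "score": ...} dict per input.
    results = classifier(texts)
    return format_results(texts, results)


# Example call (needs network access the first time it runs):
# for row in run_sentiment_demo(["I love this workshop!", "The setup was confusing."]):
#     print(row)
```

Swapping the task string (for example to `"summarization"` or `"question-answering"`) is a simple way to explore what is possible, though each task has its own input and output format.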
- How should I format my data for use with transformers?
- What's a Hugging Face Dataset and why is it helpful?
- Let's do an example.
- How can I save and share my dataset?
- What else do I need to do to prepare my data for fine-tuning?
Homework: Format your data as a Dataset and push it to the Hugging Face Hub (if your data security contracts allow). Make sure to add a dataset card and info.
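One possible way to approach this homework is sketched below, assuming the `datasets` library is installed and your data starts as a list of row dictionaries that all share the same keys. The helpers `records_to_columns` and `build_dataset`, the example rows, and the repository name are all hypothetical placeholders.

```python
def records_to_columns(records):
    """Convert a list of row dicts into the column dict Dataset.from_dict expects."""
    columns = {}
    for row in records:
        for key, value in row.items():
            columns.setdefault(key, []).append(value)
    return columns


def build_dataset(records, test_size=0.25, seed=0):
    # Imported lazily so records_to_columns works without datasets installed.
    from datasets import Dataset

    ds = Dataset.from_dict(records_to_columns(records))
    # Returns a DatasetDict with "train" and "test" splits.
    return ds.train_test_split(test_size=test_size, seed=seed)


# Made-up example rows for illustration only.
rows = [
    {"text": "great talk", "label": 1},
    {"text": "too long", "label": 0},
    {"text": "very clear", "label": 1},
    {"text": "hard to follow", "label": 0},
]
print(records_to_columns(rows))

# Example calls (pushing requires being logged in to the Hub;
# the repository name below is a placeholder):
# splits = build_dataset(rows)
# splits.push_to_hub("your-username/your-dataset", private=True)
```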
- What are the API components of fine-tuning a model?
- How can I customize the training of the model?
- How can I measure the performance of my model?
- How can I share my model for other people to use or interact with?
Homework: Finish training your model and push it to the Hugging Face Hub. Create a user interface for interacting with your model.
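On measuring performance: when training with the `transformers` `Trainer`, a metric function can be supplied through its `compute_metrics` argument, and it receives the model outputs together with the true labels. The sketch below is one minimal way to write such a function for a classification model, using plain Python so it runs anywhere. Accuracy is an assumption here, not necessarily the metric used in the workshop.

```python
def compute_metrics(eval_pred):
    """Accuracy for classification, given a (logits, labels) pair.

    logits: one row of per-class scores per example.
    labels: the true class index for each example.
    """
    logits, labels = eval_pred
    # The predicted class is the index of the largest score in each row.
    predictions = [max(range(len(row)), key=lambda i: row[i]) for row in logits]
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return {"accuracy": correct / len(labels)}


# Tiny made-up example: three examples, two classes.
example_logits = [[0.1, 0.9], [2.0, -1.0], [0.3, 0.2]]
example_labels = [1, 0, 1]
print(compute_metrics((example_logits, example_labels)))
```

In this example the first two predictions match their labels and the third does not, so the accuracy is 2/3.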
- How can I improve the performance of my models?
- My data and models are huge. What can I do to manage memory and compute?
- Demos: let's share our work and findings!
- Research Rooms
During these workshops, we'll have a number of breakout rooms where you'll work with others to discuss ideas or develop code to solve an assignment. Please screenshot or paste your results in the provided Google Doc.
A number of examples will be left to the reader. Please complete these assignments before coming to the next workshop. They will help you build intuition and understanding for the next workshop's topics.