
AI Winter

Summary repository for AI Winter 2023: an introduction to Transformer models, with practical applications to inference and training.

Presented by Vanderbilt Data Science Institute data scientists:

  • Dr. Jesse Spencer-Smith, Chief Data Scientist
  • Dr. Charreau Bell, Senior Data Scientist
  • Umang Chaudhry, Data Scientist
  • Dr. Abbie Petulante, Data Scientist Postdoctoral Fellow

Quick reference for Breakout Room and Workshop Resources: https://docs.google.com/document/d/17aCJNR66ZYxqdS1pI4DUUz_RmjwY5iY4_8xH90VH7gk/edit#heading=h.3kpbseaszv6a

Overview

The objective of these workshops is to develop foundational skills in understanding, running inference with, and training Transformer models, primarily using Hugging Face, an extremely user-friendly API for Transformers.
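As a taste of how little code inference takes, here is a minimal sketch of a Hugging Face `pipeline` call. The task and example text are illustrative choices, not specifics from the workshop; the default model for the task is downloaded on first use.

```python
from transformers import pipeline

# "sentiment-analysis" selects a default pretrained classifier
classifier = pipeline("sentiment-analysis")

result = classifier("I really enjoyed this workshop!")
print(result)  # e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```

The same one-line pattern works for many tasks (e.g. `"summarization"`, `"translation_en_to_fr"`) by swapping the task string.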

Getting the Most out of this Course

To get the most out of this course:

  • Open Colab (workbook) notebooks and actively write code along with the instructor
  • Actively participate in discussion
  • Actively participate in breakout rooms
  • Perform homework assignments before coming to class the next day
  • Relax your mind and ask questions

Pre-Workshop Preparation

  • Sign up for a Google Colaboratory account. The free account should be sufficient, but you will get more compute (and longer running times) if you sign up for Colab Pro at ~$5/month.
  • Sign up for a Hugging Face (huggingface.co) account. Again, the free account should be sufficient.
  • Suggested: Preview the book Natural Language Processing with Transformers by Lewis Tunstall, Leandro von Werra, and Thomas Wolf. Be sure to read Chapters 1, 2, and 3. If you are affiliated with Vanderbilt University, you can access this pre-print book (and any O’Reilly book) for free by logging into O'Reilly Media with your Vanderbilt email address; Vanderbilt licenses all content from O’Reilly. The book covers Transformers for purposes beyond text.
  • Think about any data you might want to bring to the workshop. Also begin thinking about any short projects you might want to accomplish during our workshop. We’ll have office hours for you to work with us to get your first project off the ground!

Workshop Schedule

Workshops will be held each day from 11am to 2pm Central.

Tue, Jan 3

  1. Topics: Introduction to Transformers, Transformer architecture, Hugging Face models, pipelines, datasets, and Spaces
  2. Preparation
  3. Resources
  4. Recording
  5. Homework / Next Steps

Wed, Jan 4

  1. Using Google Colab
  2. The Hugging Face documentation: tutorials, API reference, and other resources
  3. Using pipelines for inference
  4. Introduction to fine-tuning

Thu, Jan 5

  1. Fine-tuning text models in Python for classification
  2. Fine-tuning image and audio models in Python
  3. Best practices

Fri, Jan 6

  1. Image and audio models, continued
  2. Using gradio for sharing your work
  3. Using datasets for sharing your data

Breakout Rooms

During these workshops, we'll have a number of breakout rooms where you'll work with others to discuss ideas or develop code to solve an assignment. Please screenshot or paste your results into the following Google Doc:

https://docs.google.com/document/d/17aCJNR66ZYxqdS1pI4DUUz_RmjwY5iY4_8xH90VH7gk/edit#heading=h.3kpbseaszv6a

Asynchronous (Homework) Assignments

A number of examples will be left to the reader. Please complete these assignments before coming to the next day of the course. They are designed to deepen your understanding of the material, help you avoid common programming pitfalls, resolve areas of ambiguity that often trip up new learners, and build your ability to navigate and understand the errors Python throws.

Other Resources

Compute Grants for Vanderbilt Faculty and Students

DGX A100 Compute Grant: https://forms.gle/2mGfEy9DB4JU2GpZ8

Python

Transformers

  • Natural Language Processing with Transformers by Lewis Tunstall, Leandro von Werra, and Thomas Wolf. Vanderbilt affiliates can access it (and any O’Reilly book) for free via O'Reilly Media, as described under Pre-Workshop Preparation above.