Natural language processing is ubiquitous in modern intelligent technologies, serving as a foundation for language translators, virtual assistants, search engines, and many more. In this course, we cover the foundations of modern methods for natural language processing, such as word embeddings, recurrent neural networks, transformers, and pretraining, and how they can be applied to important tasks in the field, such as machine translation and text classification. We also cover issues with these state-of-the-art approaches (such as robustness, interpretability, and sensitivity), identify their failure modes in different NLP applications, and discuss analysis and mitigation techniques for these issues.
Platform | Where & when |
---|---|
Lectures | Wednesdays: 9:15-11:00am [CM2] & Thursdays: 1:15-2:00pm [CE1] |
Exercises Session | Thursdays: 2:15-4:00pm [CE1] |
Project Assistance (not every week) | Wednesdays: 11:15am-12:00pm [CM2] |
Forum | Ed Forum [link] |
Moodle | Announcements [link] |
All lectures will be given in person and live streamed on Zoom. The Zoom link is available on the course Moodle page. Lectures will be recorded and uploaded to SwitchTube.
Week | Date | Topic | Instructor |
---|---|---|---|
Week 1 | 22 Feb / 23 Feb | Introduction + Building a simple neural classifier / Neural LMs: word embeddings [slides, readings] | Antoine Bosselut |
Week 2 | 1 Mar / 2 Mar | Classical and Fixed-context Language Models / Recurrent Neural Networks [slides, readings] | Antoine Bosselut |
Week 3 | 8 Mar / 9 Mar | LSTMs and Sequence-to-sequence models / Theoretical properties of RNNs [slides, readings] | Antoine Bosselut, Gail Weiss |
Week 4 | 15 Mar / 16 Mar | Attention + Transformers / Transformers [slides, readings] | Antoine Bosselut |
Week 5 | 22 Mar / 23 Mar | Pretraining: ELMo, BERT / Transfer Learning: Introduction [slides, readings] | Antoine Bosselut |
Week 6 | 29 Mar / 30 Mar | Transfer Learning: Dataset Biases / Transfer Learning: Prompts | Antoine Bosselut |
Week 7 | 5 Apr / 6 Apr | Text Generation | Antoine Bosselut |
Week 8 | EASTER BREAK | Antoine Bosselut | |
Week 9 | 19 Apr / 20 Apr | In-Context Learning | Antoine Bosselut |
Week 10 | 26 Apr / 27 Apr | Scaling Laws + Model Compression / No class | Antoine Bosselut, Reza Banaei |
Week 11 | 3 May / 4 May | Ethics in NLP / Ethics in NLP | Antoine Bosselut |
Week 12 | 10 May / 11 May | Interpretability & Analysis of Language Models / No class | Antoine Bosselut |
Week 13 | 17 May / 18 May | Reading Comprehension & Open-domain QA / Language & Knowledge Graphs | Antoine Bosselut, Angelika Romanou |
Week 14 | 24 May / 25 May | Tokenization + Multilingual LMs / No class | Negar Foroutan |
Week 15 | 31 May / 1 Jun | Language & Vision / Language & Vision + Wrap-up | Syrielle Montariol, Antoine Bosselut |
The Thursday exercise sessions follow this schedule:

Week | Date | Topic | Instructor |
---|---|---|---|
Week 1 | 23 Feb | Setup + Word embeddings [code] | Angelika Romanou, Sepideh Mamooler, Simin Fan |
Week 2 | 2 Mar | Word embeddings review / Classical & Fixed-context Language Models [code] | Angelika Romanou, Mohammadreza Banaei, Sepideh Mamooler |
Week 3 | 9 Mar | Language Models Review / Sequence-to-sequence models [code] | Mohammadreza Banaei, Sepideh Mamooler, Simin Fan |
Week 4 | 16 Mar | Sequence-to-sequence models review / Attention + Transformers [code] | Sepideh Mamooler, Mete Ismayil, Simin Fan |
Week 5 | 23 Mar | Transformers Review / Pretraining: ELMo, BERT [code] | Simin Fan, Sepideh Mamooler, Molly Petersen |
Week 6 | 30 Mar | Pretraining Review / Transfer Learning: Dataset Biases | Molly Petersen, Mete Ismayil, Sepideh Mamooler |
Week 7 | 6 Apr | Transfer Learning Review / Text Generation | Molly Petersen, Deniz Bayazit, Sepideh Mamooler |
Week 8 | 13 Apr | EASTER BREAK | |
Week 9 | 20 Apr | Text Generation Review / In-context Learning | Deniz Bayazit, Silin Gao, Sepideh Mamooler |
Week 10 | 27 Apr | In-context Learning Review / Milestone 1 Discussion | Silin Gao, TA meetings on-demand |
Week 11 | 4 May | Project | TA meetings on-demand |
Week 12 | 11 May | Milestone 2 Discussion / Project | Silin Gao, TA meetings on-demand |
Week 13 | 18 May | Project | TA meetings on-demand |
Week 14 | 25 May | Milestone 3 Discussion / Project | Deniz Bayazit, TA meetings on-demand |
Week 15 | 1 Jun | Project | TA meetings on-demand |
- TAs will briefly discuss last week's exercises, answering any questions and explaining the solutions (10-15 min).
- TAs will present this week's exercise (5 min).
- Students will work on this week's exercises, and TAs will provide answers and clarification as needed.
Note: Please make sure you have already done the setup prerequisites to run the coding parts of the exercises. You can find the instructions here.
Your grade in the course will be computed according to the following guidelines:
There will be three assignments throughout the course. They will be released and due according to the following schedule:
Assignment 1: link for the assignment [here].
- Released: 10 Mar 2023
- Due: 24 Mar 2023
Assignment 2: link for the assignment [here].
- Released: 24 Mar 2023
- Due: 7 Apr 2023
Assignment 3:
- Released: 7 Apr 2023
- Due: 28 Apr 2023
Assignments will be released on Moodle and announced on Ed.
The project will involve using large-scale language models (100B+ parameters) and medium-scale language models (300M parameters) in the domain of education. The project will be divided into a proposal, three milestones, and a final submission. Each milestone will be worth 10% of the final grade, with the remaining 30% allocated to the final report. Each team will be supervised by one of the course TAs or AEs. More details on the content of the project and the deliverables of each milestone will be released at a later date.
- For the proposal, students will be responsible for delivering a Project Plan for executing the goals of the project, as well as submitting a list of relevant literature that demonstrates they have identified academic papers relevant to each part of the project milestones. Each student in the group should review one of these papers and submit the review with Milestone 1.
- Due: 21 Apr 2023
- Exact parameters of Milestone 1 will be released in future weeks.
- Due: 5 May 2023
- Exact parameters of Milestone 2 will be released in future weeks.
- Due: 19 May 2023
- Exact parameters of Milestone 3 will be released in future weeks.
- Due: 4 Jun 2023
- The final report, code, and data will be due on June 15th. Students are welcome to turn in their materials ahead of time, as early as June 4th when the semester ends.
- Due: 15 Jun 2023
All assignments and milestones are due at 11:59 PM on their due date. Because circumstances can make it challenging to meet these deadlines, you will receive 6 late days over the course of the semester, to be allocated to the assignments and project milestones as you see fit. No further extensions will be granted. The only exception to this rule is the final report, code, and data, for which no extensions will be granted beyond June 15th.
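To plan how to spend the late days, the policy can be sketched as a small budget tracker. This is only an illustration under stated assumptions, not an official course tool: the class name `LateDayTracker`, its methods, and the `late_days_allowed` flag are hypothetical, lateness is counted in whole calendar days after the due date, and the final deliverable is assumed to accept no late days at all.

```python
from dataclasses import dataclass, field
from datetime import date

LATE_DAY_BUDGET = 6  # total late days per semester, as stated in the policy


@dataclass
class LateDayTracker:
    """Hypothetical helper that tracks late days spent on deliverables."""

    budget: int = LATE_DAY_BUDGET
    used: dict = field(default_factory=dict)  # deliverable name -> late days

    def remaining(self) -> int:
        """Late days still available."""
        return self.budget - sum(self.used.values())

    def submit(self, name: str, due: date, submitted: date,
               late_days_allowed: bool = True) -> int:
        """Record a submission and return the late days it consumed.

        Assumption: each calendar day after the due date costs one late day.
        """
        days_late = max(0, (submitted - due).days)
        if days_late == 0:
            return 0
        if not late_days_allowed:
            raise ValueError(f"{name}: no extension possible past {due}")
        if days_late > self.remaining():
            raise ValueError(
                f"{name}: needs {days_late} late days, "
                f"only {self.remaining()} remaining"
            )
        self.used[name] = days_late
        return days_late


tracker = LateDayTracker()
# Assignment 1 is due 24 Mar 2023; handing it in on 26 Mar costs 2 late days.
tracker.submit("Assignment 1", date(2023, 3, 24), date(2023, 3, 26))
# Milestone 1 is due 5 May 2023; handing it in on 8 May costs 3 late days.
tracker.submit("Milestone 1", date(2023, 5, 5), date(2023, 5, 8))
print(tracker.remaining())  # 1
```

Under these assumptions, a final report submitted after 15 Jun 2023 would be passed with `late_days_allowed=False` and rejected regardless of the remaining budget.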
Lecturer: Antoine Bosselut.
Teaching assistants: Mohammadreza Banaei, Deniz Bayazit, Zeming (Eric) Chen, Simin Fan, Silin Gao, Angelika Romanou
Please contact us for any organizational questions or questions related to the course content.