- Recap of Deep Learning & basic NLP (slides / lab session)
- Tokenization (slides / lab session)
- Language Modeling (slides / lab session)
- NLP without 2048 GPUs (slides / lab session)
- Handling the Risks of Language Models (slides / lab session)
- Advanced NLP tasks (slides / lab session)
- Domain-specific NLP (slides / lab session)
- Multilingual NLP (slides / lab session)
- Multimodal NLP (slides / lab session)
The evaluation consists of a team project (3-5 people). There are two options:
- Demo: Use a well-known approach to produce an MVP for an original use case and present it in a demo.
- Example: An online platform that detects AI-generated text (a minimal sketch follows this list).
- R&D: Based on a research article, conduct original experiments and produce a report (see the list of potential articles below).
- Example: Do we need Next Sentence Prediction in BERT? (Answer: No)
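For the demo option above, here is a minimal sketch of one possible starting point, assuming the Hugging Face transformers library: score text with a small causal language model and flag unusually low perplexity as a sign of machine-generated text. The model choice (gpt2) and the threshold are illustrative placeholders, not part of the course material.

```python
# Minimal perplexity-based detector sketch (illustrative only).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # assumption: any small causal LM works for a first prototype

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def perplexity(text: str) -> float:
    """Perplexity of `text` under the LM; LM-generated text tends to score lower."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # Passing labels=input_ids makes the model return the mean cross-entropy loss.
        loss = model(ids, labels=ids).loss
    return torch.exp(loss).item()

def looks_ai_generated(text: str, threshold: float = 30.0) -> bool:
    # The threshold is a placeholder; a real MVP should calibrate it on
    # held-out human-written vs. model-generated samples.
    return perplexity(text) < threshold

print(looks_ai_generated("The quick brown fox jumps over the lazy dog."))
```

A real demo would also need a front-end and a calibrated decision rule, but this captures the core scoring loop.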
The project will consist of three steps:
- Team announcement (before 15/12/23): send an email to nathan.godey@inria.fr, with matthieu.futeral@inria.fr and francis.kulumba@inria.fr in cc, explaining:
  - The team members (also cc'ed)
  - The type of project and a rough description (can change afterwards)
- Project plan (30% of final grade, before 07/01/24): following this template, produce a project plan explaining your first attempts (e.g. an alpha version), how they failed or succeeded, and what you plan to do before the delivery.
- Project delivery (70% of final grade, before mid-February): deliver a project report of `nb_team_members * 2` pages and a GitHub repo (more details coming soon)

Potential articles:
- A Vocabulary-Free Multilingual Neural Tokenizer for End-to-End Task Learning (https://arxiv.org/abs/2204.10815)
- BPE-Dropout: Simple and Effective Subword Regularization (https://aclanthology.org/2020.acl-main.170/)
- Efficient Streaming Language Models with Attention Sinks (https://arxiv.org/abs/2309.17453)
- Lookahead decoding (https://lmsys.org/blog/2023-11-21-lookahead-decoding/)
- Efficient Memory Management for Large Language Model Serving with PagedAttention (https://arxiv.org/abs/2309.06180)
- Detecting Pretraining Data from Large Language Models (https://arxiv.org/abs/2310.16789)
- Proving Test Set Contamination in Black Box Language Models (https://arxiv.org/abs/2310.17623)
- Mamba: Linear-Time Sequence Modeling with Selective State Spaces (https://arxiv.org/abs/2312.00752)
- Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection (https://aclanthology.org/2020.acl-main.647/)
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model (https://arxiv.org/abs/2305.18290)
- Text Embeddings Reveal (Almost) As Much As Text (https://arxiv.org/abs/2310.06816)