This is the repository for a course on neural networks based on the Transformer architecture. The course targets people with experience in Python, Machine Learning, and Deep Learning but little or no experience with Transformers.
The course offers a comprehensive overview of NLP applications and also covers the use of Transformers for other types of data, such as images, networks, and event sequences.
The course covers the attention mechanism, encoder/decoder models, text tokenization and generation, and Reinforcement Learning from Human Feedback, building on the analysis of different Transformer architectures. It also touches on the latest topics from the AI community: in the lectures, we discuss the resource consumption of training large Transformers and ways to make such models more efficient, the problem of creating “good” training datasets, ChatGPT and its issues, and approaches to Prompt Tuning, which have recently been gaining popularity. Two homework assignments are organized as CodaLab competitions and involve extended work on a single problem: students select several approaches to solving it and then discuss the results.
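As a minimal taste of what “text tokenization and generation” means in practice, here is a sketch using the Hugging Face `transformers` library; the library and the `gpt2` checkpoint are illustrative assumptions, not the tools prescribed by the course:

```python
# A minimal sketch of text tokenization and generation.
# Assumption: the Hugging Face `transformers` library and the public
# `gpt2` checkpoint; the course itself may use different models and tools.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Tokenization: text -> integer token ids the model can consume.
inputs = tokenizer("Transformers are", return_tensors="pt")

# Generation: autoregressively extend the prompt token by token.
output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```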
We expect that after completing the course students:
- Will understand the attention mechanism, the core building block of the Transformer, as well as the basic principles of working with texts (a minimal sketch of attention follows this list).
- Will be familiar with a diverse range of Natural Language Processing and Machine Learning challenges and equipped with the tools and methodologies needed to tailor Transformers to specific problems.
- Will be able to relate a task encountered at work or in their studies to one of the generalized problems discussed in the course: for instance, mapping the identification of specific words within a document to sequence labeling (and further to multilabel classification), or framing word games with a chatbot as a text continuation problem.
- Will have worked through homework assignments that run longer than is typical for similar courses, which lets students dig deeper into a problem, experiment step by step, and compare various approaches to data preprocessing and learning.
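For reference, the attention mechanism mentioned in the first outcome boils down to a weighted sum of values, with the weights computed from query/key similarity. A minimal PyTorch sketch of scaled dot-product attention (our own illustration, not course code):

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    """Compute softmax(q k^T / sqrt(d_k)) v, the core Transformer operation."""
    d_k = q.size(-1)
    # Similarity of every query with every key, scaled to stabilize training.
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5
    if mask is not None:
        # Masked positions (e.g. padding or future tokens) get zero weight.
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)  # attention distribution over keys
    return weights @ v  # weighted sum of values

# Toy usage: a batch of one sequence with 5 tokens of dimension 8.
q = k = v = torch.randn(1, 5, 8)
out = scaled_dot_product_attention(q, k, v)  # shape (1, 5, 8)
```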