Bertinoro International Spring School 2024

Large Language Models and How to Instruction Tune Them (in a Sustainable Way)

Author: Danilo Croce

Many thanks to Claudiu Daniel Hromei for supporting the development of (most of) the code.

This repository hosts materials from the Bertinoro International Spring School - BISS-2024 tutorial.

The objectives of this tutorial are to:

  • Introduce the basics of distributional semantics and their interplay with neural learning.
  • Introduce Transformer-based architectures, including encoder-decoder, encoder-only, and decoder-only structures.
  • Demonstrate fine-tuning of Large Language Models (LLMs) on diverse datasets in a multi-task framework.
  • Utilize Low-Rank Adaptation (LoRA) for sustainable and efficient tuning on "modest" hardware (e.g., a single GPU with 16GB of memory).

The repository includes code for fine-tuning Large Language Models (based on BERT and LLaMA) to solve NLP tasks, such as those proposed at EVALITA 2023.

Code

Lab 1: Training BERT-based models in a few lines of code

This is a PyTorch (+ Hugging Face Transformers) implementation of a "simple" text classifier built on BERT-based models. In this lab we will see how simple it is to use BERT for a sentence classification task, obtaining state-of-the-art results in a few lines of Python code.

The Python notebook is available at this LINK.
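To give a concrete picture of what the notebook does, here is a minimal sketch of such a classifier. The checkpoint name, dataset, and hyper-parameters are illustrative assumptions, not the exact setup used in the lab:

```python
# Minimal sketch: fine-tuning a BERT-based sentence classifier with Hugging Face Transformers.
# Model name, dataset, and hyper-parameters are illustrative assumptions.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)

model_name = "bert-base-uncased"   # assumption: any BERT-like checkpoint works
dataset = load_dataset("imdb")     # assumption: a binary sentence-classification dataset

tokenizer = AutoTokenizer.from_pretrained(model_name)

def tokenize(batch):
    # Truncate/pad each sentence to a fixed maximum length
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

dataset = dataset.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

args = TrainingArguments(
    output_dir="bert-classifier",
    per_device_train_batch_size=16,
    num_train_epochs=2,
    learning_rate=2e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"].shuffle(seed=42).select(range(2000)),  # small subset for a quick demo
    eval_dataset=dataset["test"].select(range(500)),
)

trainer.train()
print(trainer.evaluate())
```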

Lab 2: Fine-tune a LLaMA-based model for all tasks from EVALITA 2023

Finally, this tutorial shows how to encode data from different tasks into specific prompts and fine-tune the LLM using Q-LoRA. The code can also be run in Google Colab on an NVIDIA T4 GPU with 15GB of memory.
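As a rough illustration of that pipeline, the sketch below loads a LLaMA-style model in 4-bit precision, attaches LoRA adapters, and formats a task instance as an instruction-style prompt. The model id, prompt template, and hyper-parameters are assumptions for illustration, not the exact configuration used in the lab:

```python
# Minimal sketch: 4-bit quantized LoRA (Q-LoRA) fine-tuning of a LLaMA-style model.
# Model id, prompt template, and hyper-parameters are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "meta-llama/Llama-2-7b-hf"  # assumption: any causal LLM of comparable size

# Load the base model in 4-bit so it fits on a single ~16GB GPU
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Attach small trainable LoRA adapters to the attention projections
lora_config = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights is trainable

# Encode a task instance as an instruction-style prompt (hypothetical template)
def build_prompt(task, text, answer):
    return f"### Task: {task}\n### Input: {text}\n### Answer: {answer}"

# ... build a dataset of such prompts and train the adapters,
#     e.g. with trl's SFTTrainer or a plain causal-LM training loop ...
```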

The code is heavily based on the one used in the ExtremITA system participating in EVALITA 2023.

The overall process is divided into four steps:

Slides

The repository also features tutorial slides (LINK).

Contacts

For queries or suggestions, raise an Issue in this repository or email croce@info.uniroma2.it.