
GitHub repository (English translation) for "Introduction to Large Language Models" (Gijutsu-Hyohron Co., Ltd., 2023)

Primary language: Jupyter Notebook. License: Apache License 2.0 (Apache-2.0).

Introduction to Large Language Models

This is the English version repository for "Introduction to Large Language Models" (Gijutsu-Hyohron Co., Ltd., 2023).

Code

All code has been tested to work on Google Colaboratory. The datasets used in the code and the models created are available on the Hugging Face Hub.

⚠️ As of July 28, 2023, the link to the source of the MARC-ja dataset is broken, so loading the dataset fails in the code for Sections 5.2, 5.3, and 5.5.4 of the book. We have sent an inquiry by email and are waiting for the link to be restored.

As a workaround, we have added notebooks that use WRIME, a Japanese sentiment analysis dataset. Please use them if you want to run the code.
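
The workaround above amounts to "try the original dataset, and fall back to an alternative if loading fails". The following is a minimal sketch of that pattern, not the code used in the notebooks. The helper name `load_first_available` and the dataset IDs in the usage comment are hypothetical placeholders; the loader is passed in as an argument so that the sketch does not depend on any particular library.

```python
from typing import Callable, Sequence, Tuple, TypeVar

T = TypeVar("T")


def load_first_available(
    candidates: Sequence[str],
    loader: Callable[[str], T],
) -> Tuple[str, T]:
    """Return (name, dataset) for the first candidate that loads successfully.

    `loader` is any callable that takes a dataset name and either returns
    the loaded dataset or raises an exception (for example, because the
    download link is broken).
    """
    errors = {}
    for name in candidates:
        try:
            return name, loader(name)
        except Exception as exc:  # broad on purpose: any load failure triggers the fallback
            errors[name] = exc
    details = "; ".join(f"{name}: {exc!r}" for name, exc in errors.items())
    raise RuntimeError(f"no dataset could be loaded ({details})")


# Hypothetical usage with the Hugging Face `datasets` library.
# The dataset IDs below are placeholders, not the IDs used in the book:
#
#   from datasets import load_dataset
#   name, dataset = load_first_available(
#       ["example-org/marc-ja", "example-org/wrime"],
#       load_dataset,
#   )
```

Because the two datasets differ in content and label distribution, results obtained with the fallback will likely not match the numbers printed in the book.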

| Chapter | Section/Item | Colab | Link |
| --- | --- | --- | --- |
| Chapter 1 Introduction | 1.1 Solve natural language processing with transformers; 1.2 Basic usage of transformers | Open in Colab | Link |
| Chapter 2 Transformer | 2.2 Encoder | Open in Colab | Link |
| Chapter 3 Fundamentals of Large Language Models | 3.2 GPT (Decoder); 3.3 BERT / RoBERTa (Encoder); 3.4 T5 (Encoder-Decoder) | Open in Colab | Link |
| | 3.6 Tokenization | Open in Colab | Link |
| Chapter 5 Fine-tuning of Large Language Models | 5.2 Implementation of Sentiment Analysis Model | Open in Colab; Open in Colab | Link (MARC-ja); Link (WRIME) |
| | 5.3 Error Analysis of Sentiment Analysis Model | Open in Colab; Open in Colab | Link (MARC-ja); Link (WRIME) |
| | 5.4.1 Implementation of Natural Language Inference (Training) | Open in Colab | Link |
| | 5.4.1 Implementation of Natural Language Inference (Analysis) | Open in Colab | Link |
| | 5.4.2 Implementation of Semantic Similarity Calculation (Training) | Open in Colab | Link |
| | 5.4.2 Implementation of Semantic Similarity Calculation (Analysis) | Open in Colab | Link |
| | 5.4.3 Implementation of Multiple-Choice Question Answering Model (Training) | Open in Colab | Link |
| | 5.4.3 Implementation of Multiple-Choice Question Answering Model (Analysis) | Open in Colab | Link |
| | 5.5.4 LoRA Tuning (Sentiment Analysis) | Open in Colab; Open in Colab | Link (MARC-ja); Link (WRIME) |
| Chapter 6 Named Entity Recognition | 6.2 Dataset, Preprocessing, and Evaluation Metrics; 6.3 Implementation of Named Entity Recognition Models; 6.4 Building Datasets Using Annotation Tools | Open in Colab | Link |
| Chapter 7 Summary Generation | 7.2 Dataset; 7.3 Evaluation Metrics; 7.4 Implementation of Headline Generation Models; 7.5 Headline Generation by Various Methods | Open in Colab | Link |
| Chapter 8 Sentence Embedding | 8.3 Implementation of Sentence Embedding Models | Open in Colab | Link |
| | 8.4 Search Using the Nearest Neighbor Library Faiss | Open in Colab | Link |
| Chapter 9 Question Answering | 9.3 Making ChatGPT Answer Quizzes | Open in Colab | Link |
| | 9.4.3 Implementation of BPR | Open in Colab | Link |
| | 9.4.4 Calculation of Passage Embeddings with BPR | Open in Colab | Link |
| | 9.5 Combining Document Search Models and ChatGPT | Open in Colab | Link |
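
The "Open in Colab" entries open a notebook hosted on GitHub directly in Google Colaboratory. As a rough sketch of how such a link is formed, the snippet below assumes Colab's standard URL scheme for GitHub-hosted notebooks (`https://colab.research.google.com/github/<owner>/<repo>/blob/<branch>/<path>`). The function name `colab_url`, the owner `example-owner`, the branch `main`, and the notebook path are hypothetical placeholders, not the actual locations of the notebooks in this repository.

```python
from urllib.parse import quote

COLAB_GITHUB_PREFIX = "https://colab.research.google.com/github"


def colab_url(owner: str, repo: str, path: str, branch: str = "main") -> str:
    """Build a URL that opens a GitHub-hosted notebook in Google Colab."""
    if not path.endswith(".ipynb"):
        raise ValueError("expected a path to a .ipynb notebook")
    # Percent-encode each component; quote() leaves "/" intact by default,
    # so directory separators in the notebook path are preserved.
    return (
        f"{COLAB_GITHUB_PREFIX}/{quote(owner)}/{quote(repo)}"
        f"/blob/{quote(branch)}/{quote(path)}"
    )


# Hypothetical example (owner and path are placeholders):
#   colab_url("example-owner", "llm-book-english", "chapter1/intro.ipynb")
```

Spaces and other special characters in a notebook file name are percent-encoded, which keeps the resulting link valid when pasted into a browser.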

Errata

The errata for this book are published on the following page.

https://github.com/ghmagazine/llm-book/wiki/errata

Links