Introduction to Large Language Models

This is the English-language version of the repository for "Introduction to Large Language Models" (Gijutsu-Hyohron Co., Ltd., 2023).

Code

All code has been tested on Google Colaboratory. The datasets used in the code and the models it produces are available on the Hugging Face Hub.

⚠️ As of July 28, 2023, the link to the source of the MARC-ja dataset is broken, so loading the dataset fails in the code for Sections 5.2, 5.3, and 5.5.4 of the book. We have sent an inquiry by email and are waiting for the link to be restored.

In response, we have added notebooks that use WRIME, a Japanese sentiment analysis dataset. Use them if you want to run the code for those sections.
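If you adapt the notebooks yourself, the MARC-ja to WRIME fallback described above can be sketched as a small helper. This is a minimal illustration, not code from the book: `load_first_available` is a hypothetical helper, and the loader function and dataset names you pass in are your own choices.

```python
from typing import Any, Callable, Dict, Sequence, Tuple


def load_first_available(
    candidates: Sequence[str],
    loader: Callable[[str], Any],
) -> Tuple[str, Any]:
    """Try each candidate name in order; return (name, dataset) for the first that loads.

    `loader` is any function that takes a dataset name and returns the loaded
    dataset, raising an exception when loading fails (hypothetical interface).
    """
    errors: Dict[str, Exception] = {}
    for name in candidates:
        try:
            return name, loader(name)
        except Exception as exc:  # deliberately broad: any loading failure triggers the fallback
            errors[name] = exc
    raise RuntimeError(f"None of the candidates could be loaded: {errors}")
```

With the Hugging Face `datasets` library, the loader could be `datasets.load_dataset`, and the candidate list would hold the Hub identifiers for MARC-ja and WRIME used in the linked notebooks. Those identifiers are not stated in this README, so take them from the notebooks rather than from this sketch.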

Chapter	Section/Item	Link
Chapter 1 Introduction	1.1 Solving natural language processing tasks with transformers; 1.2 Basic usage of transformers	Link
Chapter 2 Transformer	2.2 Encoder	Link
Chapter 3 Fundamentals of Large Language Models	3.2 GPT (Decoder); 3.3 BERT and RoBERTa (Encoder); 3.4 T5 (Encoder-Decoder)	Link
	3.6 Tokenization	Link
Chapter 5 Fine-tuning of Large Language Models	5.2 Implementation of Sentiment Analysis Model	Link (MARC-ja) Link (WRIME)
	5.3 Error Analysis of Sentiment Analysis Model	Link (MARC-ja) Link (WRIME)
	5.4.1 Implementation of Natural Language Inference (Training)	Link
	5.4.1 Implementation of Natural Language Inference (Analysis)	Link
	5.4.2 Implementation of Semantic Similarity Calculation (Training)	Link
	5.4.2 Implementation of Semantic Similarity Calculation (Analysis)	Link
	5.4.3 Implementation of Multiple-Choice Question Answering Model (Training)	Link
	5.4.3 Implementation of Multiple-Choice Question Answering Model (Analysis)	Link
	5.5.4 LoRA Tuning (Sentiment Analysis)	Link (MARC-ja) Link (WRIME)
Chapter 6 Named Entity Recognition	6.2 Dataset, Preprocessing, and Evaluation Metrics; 6.3 Implementation of Named Entity Recognition Models; 6.4 Building Datasets Using Annotation Tools	Link
Chapter 7 Summary Generation	7.2 Dataset; 7.3 Evaluation Metrics; 7.4 Implementation of Headline Generation Models; 7.5 Headline Generation by Various Methods	Link
Chapter 8 Sentence Embedding	8.3 Implementation of Sentence Embedding Models	Link
	8.4 Search Using the Nearest Neighbor Library `Faiss`	Link
Chapter 9 Question Answering	9.3 Making ChatGPT Answer Quizzes	Link
	9.4.3 Implementation of BPR	Link
	9.4.4 Calculation of Passage Embeddings with BPR	Link
	9.5 Combining Document Search Models and ChatGPT	Link

Errata

The errata for this book are published on the following page.

https://github.com/ghmagazine/llm-book/wiki/errata
