All about large language models

MIT License

My LLM

My Practice

My Articles

Survey

  • 2023-Challenges and Applications of Large Language Models [paper]

Pre-train

Survey

  • 2023-A Survey of Large Language Models [paper]

Methods

Max Sequence Length

Position

Normalization

  • RMSNorm
  • Layer Normalization
    • Pre-LN
    • Post-LN
    • Sandwich-LN
    • DeepNorm
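
For orientation, the difference between Layer Normalization and RMSNorm, and between the Pre-LN and Post-LN placements, can be sketched in a few lines. This is a minimal illustration on plain Python lists, not code from any of the listed projects; the function names, the `eps` default, and the `sublayer` callable (standing in for attention or a feed-forward layer) are assumptions made for the example.

```python
import math

def layer_norm(x, gain=None, bias=None, eps=1e-5):
    """Layer Normalization: subtract the mean, divide by the standard deviation."""
    n = len(x)
    gain = gain if gain is not None else [1.0] * n
    bias = bias if bias is not None else [0.0] * n
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * inv_std * g + b for v, g, b in zip(x, gain, bias)]

def rms_norm(x, gain=None, eps=1e-5):
    """RMSNorm: divide by the root mean square only (no re-centering, no bias)."""
    n = len(x)
    gain = gain if gain is not None else [1.0] * n
    mean_sq = sum(v * v for v in x) / n
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * g for v, g in zip(x, gain)]

def pre_ln_block(x, sublayer, norm=layer_norm):
    """Pre-LN: normalize the sublayer input, then add the residual."""
    y = sublayer(norm(x))
    return [a + b for a, b in zip(x, y)]

def post_ln_block(x, sublayer, norm=layer_norm):
    """Post-LN: add the residual first, then normalize the sum."""
    y = sublayer(x)
    return norm([a + b for a, b in zip(x, y)])

if __name__ == "__main__":
    hidden = [1.0, 2.0, 3.0]
    double = lambda v: [2.0 * t for t in v]  # hypothetical stand-in sublayer
    print(layer_norm(hidden))
    print(rms_norm(hidden))
    print(pre_ln_block(hidden, double))
    print(post_ln_block(hidden, double))
```

In a real model these operate on tensors with learned `gain` and `bias` parameters; the list version only shows where the two normalizations and the two placements differ.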

Activation Function

  • SwiGLU
  • GELUs
  • Swish
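
As a quick reference, the three activations above can be written out as follows. This is a minimal scalar and list sketch, not taken from any listed paper's code; the function names are illustrative, and `swiglu` assumes its two arguments are the outputs of two separate linear projections of the same input (the gate and value branches), which are not shown here.

```python
import math

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def swish(x, beta=1.0):
    """Swish: x * sigmoid(beta * x). With beta = 1 this is also called SiLU."""
    return x * sigmoid(beta * x)

def gelu(x):
    """GELU (exact form): x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def swiglu(gate_proj, value_proj, beta=1.0):
    """SwiGLU gating: Swish(gate) multiplied elementwise by the value branch.

    gate_proj and value_proj are assumed to be two linear projections
    of the same input vector, computed elsewhere.
    """
    return [swish(g, beta) * v for g, v in zip(gate_proj, value_proj)]

if __name__ == "__main__":
    print(swish(1.0), gelu(1.0))
    print(swiglu([0.0, 1.0], [2.0, 3.0]))
```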

Tokenizer

Interpretability

LR Scheduler

Scaling Laws

  • 2020-Scaling Laws for Neural Language Models [paper]

Fine-tune

Models

General

Chinese

Japanese

Medical

  • 2023-ChatDoctor: A medical chat model fine-tuned on LLaMA model using medical domain knowledge
  • HuaTuo (华驼): a LLaMA model fine-tuned on Chinese medical knowledge

Law

  • LawGPT_zh: a Chinese legal large language model (Xiezhi, 獬豸)

Recommendation

  • 2023-RecAlpaca: Low-rank LLaMA instruct-tuning for recommendation

Other

  • 2023-A Survey of Domain Specialization for Large Language Models [paper]

Methods

RL

Reward Modeling

  • 2023-Reward Design with Language Models [paper]
  • 2022-Scaling Laws for Reward Model Overoptimization [paper]
  • autocrit
  • 2023-On The Fragility of Learned Reward Functions [paper]

PEFT

  • 2021-LoRA- Low-Rank Adaptation of Large Language Models [paper]

Alignment

  • 2023-RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment [paper]
  • 2023-Preference Ranking Optimization for Human Alignment [paper]
  • 2023-Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization [paper]
  • 2023-Fine-Grained Human Feedback Gives Better Rewards for Language Model Training [paper]
  • 2023-Chain of Hindsight Aligns Language Models with Feedback [paper]
  • 2023-Training Socially Aligned Language Models in Simulated Human Society [paper]
  • 2023-Let’s Verify Step by Step [paper]
  • 2023-The False Promise of Imitating Proprietary LLMs [paper]
  • 2023-AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback [paper]
  • 2023-LIMA- Less Is More for Alignment [paper]
  • 2023-RRHF: Rank Responses to Align Language Models with Human Feedback without tears [paper] [code]
  • 2022-Solving math word problems with process- and outcome-based feedback [paper]
  • 2022-Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback [paper]
  • 2022-Training language models to follow instructions with human feedback [paper]
  • 2022-Red teaming language models to reduce harms: Methods, scaling behaviors, and lessons learned [paper]
  • 2022-LaMDA- Language Models for Dialog Applications [paper]
  • 2022-Constitutional AI- Harmlessness from AI feedback [paper]
  • 2021-A general language assistant as a laboratory for alignment [paper]
  • 2021-Ethical and social risks of harm from language models [paper]
  • 2020-nips-Learning to summarize from human feedback [paper]
  • 2019-Fine-Tuning Language Models from Human Preferences [paper]
  • 2018-Scalable agent alignment via reward modeling: a research direction [paper]
  • Reinforcement Learning for Language Models Blog
  • 2017-nips-Deep reinforcement learning from human preferences [paper]
  • 2016-Concrete problems in AI safety [paper]

Other

  • 2022-naacl-MetaICL- Learning to Learn In Context [paper]
  • 2022-iclr-Multitask Prompted Training Enables Zero-Shot Task Generalization [paper]

Prompt Learning

  • 2023-Tree of Thoughts: Deliberate Problem Solving with Large Language Models [paper]
  • 2023-Guiding Large Language Models via Directional Stimulus Prompting [paper]
  • 2023-ICLR-Self-Consistency Improves Chain of Thought Reasoning in Language Models [paper]
  • 2023-Is Prompt All You Need? No. A Comprehensive and Broader View of Instruction Learning [paper]

Survey

  • 2021-Pre-train, Prompt, and Predict- A Systematic Survey of Prompting Methods in Natural Language Processing [paper]

Prompt Tuning

  • 2023-Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition [paper]
  • 2022-ACL-PPT- Pre-trained Prompt Tuning for Few-shot Learning [paper]
  • 2022-ACL-P-Tuning v2- Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks [paper]
  • 2021-EMNLP-The Power of Scale for Parameter-Efficient Prompt Tuning [paper]
  • 2021-acl-Prefix-Tuning- Optimizing Continuous Prompts for Generation [paper]
  • 2021-GPT Understands, Too [paper]

Integrating External Data

Tool Learning

Methods

  • 2023-OpenAGI: When LLM Meets Domain Experts [paper]
  • 2023-WebCPM: Interactive Web Search for Chinese Long-form Question Answering [paper]
  • 2023-Evaluating Verifiability in Generative Search Engines [paper]
  • 2023-Enabling Large Language Models to Generate Text with Citations [paper]
  • 2022-ACL-Lifelong Pretraining: Continually Adapting Language Models to Emerging Corpora [paper]
  • 2022-findings-acl-ELLE: Efficient Lifelong Pre-training for Emerging Data [paper]
  • langchain
  • 2023-Check Your Facts and Try Again- Improving Large Language Models with External Knowledge and Automated Feedback [paper]
  • 2022-Teaching language models to support answers with verified quotes
  • 2021-WebGPT: Browser-assisted question-answering with human feedback [paper]
  • 2021-Improving language models by retrieving from trillions of tokens
  • 2020-REALM: retrieval-augmented language model pre-training
  • 2020-Retrieval-augmented generation for knowledge-intensive NLP tasks

Other

Dataset

For Pre-training

For SFT

For Reward Model

For Evaluation

Methods

  • 2023-A Pretrainer’s Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity [paper]
  • 2023-DoReMi: Optimizing data mixtures speeds up language model pretraining
  • 2023-Data selection for language models via importance resampling
  • 2022-SELF-INSTRUCT- Aligning Language Model with Self Generated Instructions [paper]
  • 2022-acl-Deduplicating training data makes language models better [paper]

Evaluation

  • 2023-findings-acl-Few-shot Fine-tuning vs. In-context Learning: A Fair Comparison and Evaluation [paper]
  • 2023-Harnessing the Power of LLMs in Practice- A Survey on ChatGPT and Beyond [paper]
  • 2023-INSTRUCTEVAL: Towards Holistic Evaluation of Instruction-Tuned Large Language Models [paper]
  • LLMZoo: a project that provides data, models, and evaluation benchmark for large language models.
  • 2023-Evaluating ChatGPT's Information Extraction Capabilities- An Assessment of Performance, Explainability, Calibration, and Faithfulness [paper]
  • 2023-Towards Better Instruction Following Language Models for Chinese- Investigating the Impact of Training Data and Evaluation [paper]
  • PandaLM
  • lm-evaluation-harness
  • BIG-bench
  • 2023-HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models [paper]
  • 2023-C-Eval: A Multi-Level Multi-Discipline Chinese Evaluation Suite for Foundation Models [paper]
  • 2023-Safety Assessment of Chinese Large Language Models [paper]
  • 2022-Holistic Evaluation of Language Models [paper]

Aspects

  • helpfulness
  • honesty
  • harmlessness
  • truthfulness
  • robustness
  • Bias, Toxicity and Misinformation

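Evaluation Challenges

  • Existing evaluations typically rely only on established, common NLP tasks; the vast range of other tasks, such as writing emails, goes unevaluated.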

Inference

Analysis

  • Pythia: Interpreting Autoregressive Transformers Across Time and Scale
  • 2023-Inspecting and Editing Knowledge Representations in Language Models [paper]

Products

Tools

Traditional NLP Tasks

  • 2023-AnnoLLM- Making Large Language Models to Be Better Crowdsourced Annotators [paper]
  • 2022-Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks [paper]

Sentiment Analysis

  • 2023-Sentiment Analysis in the Era of Large Language Models- A Reality Check [Paper] [GitHub]
  • 2023-Can ChatGPT understand too? A comparative study on ChatGPT and fine-tuned BERT
  • 2023-Is ChatGPT a good sentiment analyzer? A preliminary study
  • 2023-LLMs to the moon? Reddit market sentiment analysis with large language models
  • 2023-Is GPT-3 a Good Data Annotator? [paper]

Weak Supervision

  • 2022-Language models in the loop: Incorporating prompting into weak supervision [paper]

Knowledge Graph

Survey

  • 2023-Unifying Large Language Models and Knowledge Graphs: A Roadmap [paper]

Related Topics

Neural Text Generation

  • 2020-ICLR-Neural text generation with unlikelihood training [paper]
  • 2021-findings-emnlp-GeDi- Generative Discriminator Guided Sequence Generation [paper]
  • 2021-ACL-DExperts- Decoding-Time Controlled Text Generation with Experts and Anti-Experts [paper]
  • 2021-ICLR-Mirostat- a neural text decoding algorithm that directly controls perplexity [paper]
  • 2022-NIPS-A Contrastive Framework for Neural Text Generation [paper]

Controllable Generation

  • 2022-ACL-Length Control in Abstractive Summarization by Pretraining Information Selection [paper]

Distributed Training

Quantization

  • 2020-Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation [paper]
  • 2023-ICLR-GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers [paper]
  • 2023-QLoRA: Efficient Finetuning of Quantized LLMs [paper]

Other

Related Project