Awesome-LLM-Synthetic-Data

A reading list on LLM-based Synthetic Data Generation 🔥

MIT License

Synthetic Data of LLMs, by LLMs, for LLMs

This repo includes papers, tools, and blogs about Synthetic Data of LLMs, by LLMs, for LLMs.

Thanks to all the great contributors on GitHub! 🔥⚡🔥

Contents

1. Surveys

2. Methods

2.1. Techniques

2.2. Instruction Generation with High Quality/Complexity

3. Application Areas

3.1. Mathematical Reasoning

3.2. Code Generation

3.3. Text-to-SQL

3.4. Alignment

3.5. Reward Modeling

3.6. Long Context

3.7. Weak-to-Strong

3.8. Agent and Tool Use

3.9. Vision and Language

3.10. Factuality

4. Datasets

5. Tools

6. Blogs