/instruct-tuning-data-autogen

This project aims to generate a large instruction dataset using the Self-Instruct methodology with the Llama-3 model. The goal is to bootstrap a small set of manually written seed instructions into a comprehensive dataset through iterative generation, filtering, and refinement.
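The bootstrap loop described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `bootstrap`, `is_novel`, and `fake_generate` are hypothetical names, the model call is replaced by a stub where a Llama-3 request would go, and `difflib.SequenceMatcher` stands in for the ROUGE-L similarity used in the Self-Instruct paper. The 0.7 threshold mirrors the cutoff reported there, but the right value for this project is an assumption.

```python
import random
from difflib import SequenceMatcher
from typing import Callable, List, Optional


def is_novel(candidate: str, pool: List[str], threshold: float = 0.7) -> bool:
    """Return True if candidate is below the similarity threshold for every pool entry.

    SequenceMatcher is a stand-in for ROUGE-L (an assumption of this sketch).
    """
    cand = candidate.lower().strip()
    return all(
        SequenceMatcher(None, cand, existing.lower().strip()).ratio() < threshold
        for existing in pool
    )


def bootstrap(
    seeds: List[str],
    generate: Callable[[List[str]], List[str]],
    rounds: int = 3,
    shots: int = 3,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Grow a pool of instructions from seeds by repeated generate-then-filter rounds."""
    rng = rng or random.Random(0)
    pool = list(seeds)
    for _ in range(rounds):
        # Sample a few in-context examples from the current pool.
        examples = rng.sample(pool, min(shots, len(pool)))
        for candidate in generate(examples):
            candidate = candidate.strip()
            # Keep only non-empty candidates that are not near-duplicates.
            if candidate and is_novel(candidate, pool):
                pool.append(candidate)
    return pool


def fake_generate(examples: List[str]) -> List[str]:
    """Hypothetical stub; a real pipeline would prompt Llama-3 with the examples."""
    return ["Explain in two sentences: " + examples[0], examples[0]]


if __name__ == "__main__":
    seed_instructions = [
        "Translate the sentence into French.",
        "Summarize the following paragraph.",
    ]
    for instruction in bootstrap(seed_instructions, fake_generate, rounds=2):
        print(instruction)
```

Passing the generator in as a callable keeps the filtering logic testable without a model, and lets the stub be swapped for a real Llama-3 client later.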

Primary language: Jupyter Notebook
