Nairaland Question-Answer Dataset and Llama3 Fine-Tuning

This repository contains Jupyter Notebooks detailing the process of scraping a Question-Answer dataset from Nairaland and fine-tuning the Llama3 model.

Model Description

The notebooks fine-tune the Llama3 model on the Nairaland Question-Answer dataset. The resulting model, Llama3-8b-Naija_v1, is tailored for Nigerian English text generation.

Dataset Description

The dataset used for fine-tuning the model is the Nairaland Question-Answer Dataset. It consists of questions and answers scraped from the Nairaland forum.
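
As an illustration of how a question-answer pair could be turned into a single training string, here is a minimal sketch. The field names `question` and `answer`, the prompt template, and the sample record are assumptions made for this example; the actual schema and template used in the notebooks may differ.

```python
# Hypothetical prompt template; not necessarily the one used in the notebooks.
PROMPT_TEMPLATE = "### Question:\n{question}\n\n### Answer:\n{answer}"


def format_example(record: dict) -> str:
    """Render one question-answer record as a single training string.

    Assumes the record has 'question' and 'answer' keys (hypothetical names).
    Runs of whitespace are collapsed so scraped text is laid out consistently.
    """
    question = " ".join(record["question"].split())
    answer = " ".join(record["answer"].split())
    return PROMPT_TEMPLATE.format(question=question, answer=answer)


if __name__ == "__main__":
    # Made-up sample record, for illustration only.
    sample = {"question": " What is  the capital of Nigeria? ", "answer": "Abuja."}
    print(format_example(sample))
```

Keeping the template in one constant means the same layout can be reused at inference time, so prompts match what the model saw during fine-tuning.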

Files

  • Llama_3_finetune.ipynb: Notebook detailing the fine-tuning process of the Llama3 model on the Nairaland dataset.
  • Naija_Llama_3_Demo.ipynb: Notebook demonstrating the usage of the fine-tuned Llama3 model for Nigerian English text generation.
  • Nairaland_Webscraping.ipynb: Notebook outlining the code for scraping the Question-Answer dataset from Nairaland.
  • Push_the_model_to_hub.ipynb: Notebook providing instructions on how to push the fine-tuned Llama3 model to the Hugging Face Model Hub.
  • combine_clean_datasets.ipynb: Notebook demonstrating how to combine and clean datasets for training the Llama3 model.
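
The combine-and-clean step listed above can be sketched roughly as follows. This is only one plausible approach using the Python standard library, not the notebook's actual code: the record keys `question` and `answer`, the helper names, and the specific cleaning rules (whitespace normalization, dropping incomplete pairs, case-insensitive de-duplication) are all assumptions.

```python
def normalize(text) -> str:
    """Collapse runs of whitespace and trim both ends."""
    return " ".join(str(text).split())


def combine_and_clean(*datasets):
    """Merge several lists of Q-A records into one cleaned list.

    Assumed cleaning rules (hypothetical, for illustration):
      * normalize whitespace in both fields
      * drop records with a missing or empty question or answer
      * drop duplicates, compared case-insensitively, keeping the first seen
    """
    seen = set()
    cleaned = []
    for dataset in datasets:
        for record in dataset:
            question = normalize(record.get("question") or "")
            answer = normalize(record.get("answer") or "")
            if not question or not answer:
                continue  # incomplete pair
            key = (question.lower(), answer.lower())
            if key in seen:
                continue  # duplicate of an earlier record
            seen.add(key)
            cleaned.append({"question": question, "answer": answer})
    return cleaned


if __name__ == "__main__":
    # Made-up sample batches, for illustration only.
    batch_a = [{"question": "Q1 ", "answer": "A1"}, {"question": "", "answer": "x"}]
    batch_b = [{"question": "q1", "answer": "a1"}, {"question": "Q3", "answer": "A3"}]
    print(combine_and_clean(batch_a, batch_b))
```

De-duplicating on the normalized pair rather than the raw text helps catch the same post scraped twice with slightly different spacing or capitalization.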