A collection of notebooks for the DATAFY FINETUNING COURSE.
- Learn the fundamentals of finetuning a large language model (LLM).
- Prepare data for finetuning with the help of human effort.
- Understand how finetuning differs from prompt engineering, and when to use each.
- Get practical experience with real data sets, and learn how to apply these techniques to your own projects.
- Collection of Notebooks
- Google Colab
- Docker Support with optimised build caching
- Run the Notebook Server with Docker
This repo contains the notebooks for the course.
- Sign up at https://www.lamini.ai/
- Go to Account and copy the API key listed under Active API Tokens
- Add the key under `.powerml/configure_llama.yaml`. Alternatively, dotenv can be used to set up the key: use the `env.example` file to create a `.env` file.

For more on authentication, read https://lamini-ai.github.io/auth/
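If you go the dotenv route, loading the key at the top of a notebook can be sketched as below. This is a minimal stdlib-only sketch (the python-dotenv package does the same job); the variable name `LAMINI_API_KEY` is an assumption, so check the Lamini auth docs for the exact name your `.env` should use.

```python
import os

def load_env(path=".env"):
    """Minimal .env loader: one KEY=VALUE pair per line, '#' starts a comment."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Do not overwrite variables already set in the environment
            os.environ.setdefault(key.strip(), value.strip())

if os.path.exists(".env"):
    load_env()

api_key = os.environ.get("LAMINI_API_KEY")  # assumed variable name
```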
Fork the course GitHub repository and start working. Go to the link below and click Fork:
https://github.com/datafyresearcher/datafy-finetuning-university
Change `datafyresearcher` to your own username in the commands below.
Google Colab offers a free GPU, so running on Google Colab is the preferred method.
- To get started, you need to mount your Google Drive in Colab. You can do this by running the following command:
```python
from google.colab import drive
drive.mount('/content/drive')
```
- Once your Drive is mounted, you can clone a repository by running the following command:
```shell
!git clone https://github.com/datafyresearcher/datafy-finetuning-university.git /content/drive/MyDrive/datafy-finetuning-university
```
This will clone the repository to your Colab instance, and you can work on it just as you would on your local machine. When you want to push your changes back to the repository, you can use the usual Git commands, such as:
```shell
!git add .
!git commit -m "Commit message"
!git push origin <branch>
```
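A fresh Colab instance has no Git identity configured, so `git commit` will fail until one is set. Something along these lines (run once per session, with your own name and email; prefix with `!` inside a notebook cell):

```shell
# Set your Git identity for this Colab session; commits fail without it.
git config --global user.name "Your Name"
git config --global user.email "you@example.com"
```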
- Clone the repository📂
```shell
git clone https://github.com/datafyresearcher/datafy-finetuning-university.git
```
- Install dependencies with Poetry and activate virtual environment🔨
```shell
poetry install
poetry shell
```
- Copy `env.example` to `.env` and modify it: generate a HF API key to be able to use hosted models for inference, and set the variables accordingly.
- Run the JupyterLab server🚀
```shell
jupyter lab
```
This project includes a Dockerfile to run the app in a Docker container. To optimise the Docker image size and build time with caching techniques, I followed the tricks in the article below:
https://medium.com/@albertazzir/blazing-fast-python-docker-builds-with-poetry-a78a66f5aed0
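The gist of that caching approach is a multi-stage build: Poetry resolves dependencies in a builder stage behind a BuildKit cache mount, and only the finished virtual environment is copied into the runtime stage. A rough sketch along those lines (stage names, Python version, and paths are assumptions, not necessarily the repo's actual Dockerfile):

```dockerfile
# Builder stage: install dependencies with Poetry into an in-project venv
FROM python:3.11-slim AS builder
RUN pip install poetry
ENV POETRY_VIRTUALENVS_IN_PROJECT=1 \
    POETRY_CACHE_DIR=/tmp/poetry_cache
WORKDIR /app
COPY pyproject.toml poetry.lock ./
# Cache mount keeps downloaded packages between builds (requires BuildKit)
RUN --mount=type=cache,target=/tmp/poetry_cache poetry install --no-root

# Runtime stage: copy only the ready-made virtual environment
FROM python:3.11-slim AS runtime
ENV PATH="/app/.venv/bin:$PATH"
COPY --from=builder /app/.venv /app/.venv
WORKDIR /app
COPY . .
CMD ["jupyter", "lab", "--ip=0.0.0.0", "--port=8888"]
```

Because dependency files are copied before the source tree, editing notebooks does not invalidate the cached `poetry install` layer.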
Build the Docker container:
```shell
docker build . -t datafy-finetuning-course:latest
```
To build the image with `DOCKER_BUILDKIT`, use the command below:
```shell
DOCKER_BUILDKIT=1 docker build --target=runtime . -t datafy-finetuning-course:latest
```
- Run the Docker container directly:
```shell
docker run -d --name datafy-finetuning-course -p 8888:8888 datafy-finetuning-course:latest
```
- Run the Docker container using docker-compose (recommended):
```shell
docker-compose up
```
Make sure to include the `.env` secrets file, with your own keys, when running with docker-compose.
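A minimal `docker-compose.yml` along these lines would wire the image, port, and secrets together (the service name and compose version are assumptions; adapt to the repo's actual compose file):

```yaml
version: "3.8"
services:
  notebooks:
    image: datafy-finetuning-course:latest
    ports:
      - "8888:8888"
    env_file:
      - .env   # your own API keys; never commit this file to Git
```

The `env_file` entry is what injects your `.env` secrets into the container at runtime.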
As `dlai-hf-course:latest` is a template project with a minimal example, please report any issues you face.
We collected the notebooks from the original course and edited a few lines/functions to make them run locally.