LLaMA Docker Playground
A "Clean and Hygienic" LLaMA playground: run LLaMA with 7 GB (int8), 10 GB (pyllama), or 20 GB (official) of VRAM.
It also provides a one-click Alpaca LoRA Docker image for fine-tuning the 7B–65B models.
How to use
To use this project, you need to do two things:
- download the model files (you can obtain the LLaMA weights from any source you have access to)
- build the Docker image yourself (faster than downloading it from Docker Hub)
Put the Model Files in the Right Place
Place the model files in the following layout; you only need the directories for the model sizes you plan to run:
.
└── models
├── 65B
│ ├── checklist.chk
│ ├── consolidated.00.pth
│ ├── consolidated.01.pth
│ ├── consolidated.02.pth
│ ├── consolidated.03.pth
│ ├── consolidated.04.pth
│ ├── consolidated.05.pth
│ ├── consolidated.06.pth
│ ├── consolidated.07.pth
│ └── params.json
├── 30B
│ ├── consolidated.00.pth
│ ├── consolidated.01.pth
│ ├── consolidated.02.pth
│ ├── consolidated.03.pth
│ └── params.json
├── 13B
│ ├── consolidated.00.pth
│ ├── consolidated.01.pth
│ └── params.json
├── 7B
│ ├── consolidated.00.pth
│ └── params.json
└── tokenizer.model
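Before building, you can sanity-check the layout with a short script. This is a minimal sketch: the paths follow the tree above, it only checks the first weight shard, and `check_model_files` is a helper name introduced here for illustration.

```shell
# Sketch: check_model_files MODEL_DIR reports files missing from the
# layout above (only the shared tokenizer, params.json, and the first
# weight shard are checked; larger models have additional shards).
check_model_files() {
  missing=0
  for f in "models/tokenizer.model" "models/$1/params.json" "models/$1/consolidated.00.pth"; do
    [ -e "$f" ] || { echo "missing: $f"; missing=1; }
  done
  return "$missing"
}

# Example: verify the 7B model before building an image.
check_model_files 7B && echo "7B files look complete" || echo "7B files incomplete"
```

The 65B directory additionally ships a checklist.chk, which you can use to verify the downloaded shards with your checksum tool of choice.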
Build the LLaMA Docker Playground
If you prefer the official model, build the Docker image with the following command:
docker build -t soulteary/llama:llama . -f docker/Dockerfile.llama
If you wish to use a model with lower memory requirements, build the docker image with the following command:
docker build -t soulteary/llama:pyllama . -f docker/Dockerfile.pyllama
If you wish to use a model with the minimum memory requirements, build the docker image with the following command:
docker build -t soulteary/llama:int8 . -f docker/Dockerfile.int8
If you wish to fine-tune a model (7B–65B) with minimal memory requirements, build the Docker image with one of the following commands:
# single GPU
docker build -t soulteary/llama:alpaca-lora-finetune . -f docker/Dockerfile.lora-finetune
# multiple GPU
docker build -t soulteary/llama:alpaca-lora-65b-finetune . -f docker/Dockerfile.lora-65b-finetune
Play with the LLaMA
For the official model Docker image (7B needs about 21 GB of VRAM), use the following command:
docker run --gpus all --ipc=host --ulimit memlock=-1 -v `pwd`/models:/app/models -p 7860:7860 -it --rm soulteary/llama:llama
For the lower-memory Docker image (7B needs about 13 GB of VRAM), use the following command:
docker run --gpus all --ipc=host --ulimit memlock=-1 -v `pwd`/models:/llama_data -p 7860:7860 -it --rm soulteary/llama:pyllama
For the minimum-memory Docker image (7B needs about 7.12 GB of VRAM), use the following command:
docker run --gpus all --ipc=host --ulimit memlock=-1 -v `pwd`/models:/app/models -p 7860:7860 -it --rm soulteary/llama:int8
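Once a container is up, the web UI is published on port 7860 (per the -p 7860:7860 mapping in the commands above). A quick reachability check, assuming you run it on the Docker host:

```shell
# Poll the published port a couple of times until the UI answers.
# http://127.0.0.1:7860/ is assumed from the -p 7860:7860 mapping above.
up=0
for i in 1 2; do
  if curl -sf -o /dev/null http://127.0.0.1:7860/; then
    up=1
    break
  fi
  sleep 1
done
[ "$up" -eq 1 ] && echo "web UI is reachable" || echo "web UI not reachable yet"
```

Model loading can take a while on first start, so give the container some time before concluding something is wrong.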
For fine-tuning, read this documentation.
Credits
License
This project follows the licensing terms of the original projects it builds on.