Inference of MiniGPT4 in pure C/C++.
The main goal of minigpt4.cpp is to run MiniGPT4 using 4-bit quantization with the ggml library.
Requirements: git

```sh
git clone --recursive https://github.com/Maknee/minigpt4.cpp
cd minigpt4.cpp
```

To use a precompiled library, go to Releases and extract the minigpt4 library file into the repository directory. Otherwise, build the library for your platform as described below.
Windows (requirements: CMake, Visual Studio, and Git):

```sh
cmake .
cmake --build . --config Release
```

bin\Release\minigpt4.dll should be generated.
Linux (requirements: CMake; on Ubuntu: sudo apt install cmake):

```sh
cmake .
cmake --build . --config Release
```

minigpt4.so should be generated.
MacOS (requirements: CMake; install with brew install cmake):

```sh
cmake .
cmake --build . --config Release
```

minigpt4.dylib should be generated.
Note: To build with OpenCV (enabling features such as loading and preprocessing images within the library itself), set MINIGPT4_BUILD_WITH_OPENCV to ON in CMakeLists.txt or pass -DMINIGPT4_BUILD_WITH_OPENCV=ON on the cmake command line.
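As a quick smoke test of the build, the generated library can be loaded dynamically. The sketch below is not part of the repository; it only checks that the shared library (and its dependencies) resolves, using the output file names listed above.

```python
# Minimal build smoke test (not part of minigpt4.cpp): try to load the
# shared library produced by the platform-specific build steps above.
import ctypes
import platform

system = platform.system()
if system == "Windows":
    lib_path = r"bin\Release\minigpt4.dll"
elif system == "Darwin":
    lib_path = "./minigpt4.dylib"
else:
    lib_path = "./minigpt4.so"

# ctypes.CDLL raises OSError if the library or one of its dependencies
# (e.g. OpenCV when built with MINIGPT4_BUILD_WITH_OPENCV=ON) cannot load.
lib = ctypes.CDLL(lib_path)
print(f"Loaded {lib_path}: {lib}")
```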
Next, obtain the MiniGPT4 model. Pre-quantized models are available on Hugging Face (7B or 13B).
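One way to fetch a pre-quantized model programmatically is with the huggingface_hub client. This is only an illustrative sketch: the repo_id and filename below are placeholders, so substitute the actual values from the Hugging Face links above.

```python
# Hypothetical download helper; replace repo_id/filename with the real
# values from the Hugging Face model pages linked above.
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="maknee/minigpt4-ggml",   # placeholder repository id
    filename="minigpt4-7B-f16.bin",   # placeholder file name
)
print("Model downloaded to", model_path)
```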
Alternatively, convert the model yourself. Requirements: Python 3.x and PyTorch.

Clone the MiniGPT-4 repository and perform its setup:

```sh
cd minigpt4
git clone https://github.com/Vision-CAIR/MiniGPT-4.git
cd MiniGPT-4   # environment.yml lives in the cloned repository
conda env create -f environment.yml
conda activate minigpt4
```
Download the pretrained checkpoint linked in the MiniGPT-4 repository under Checkpoint Aligned with Vicuna 7B or Checkpoint Aligned with Vicuna 13B, or download it from the Hugging Face links for 7B or 13B.
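Before converting, it can help to confirm that the downloaded checkpoint loads cleanly in PyTorch. This is an optional check, not a step from the repository; the path is an example.

```python
# Optional sanity check: make sure the downloaded checkpoint deserializes.
import torch

checkpoint = torch.load("pretrained_minigpt4.pth", map_location="cpu")
print(type(checkpoint))
# The checkpoint is saved as a dictionary; listing the top-level keys is a
# quick way to verify the file is not truncated or corrupted.
if isinstance(checkpoint, dict):
    print(list(checkpoint.keys()))
```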
Convert the model weights into ggml format:

```sh
cd minigpt4

# Windows
python convert.py C:\pretrained_minigpt4.pth --ftype=f16

# Linux / MacOS
python convert.py ~/Downloads/pretrained_minigpt4.pth --outtype f16
```

minigpt4-7B-f16.bin or minigpt4-13B-f16.bin should be generated.
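A simple way to confirm the conversion succeeded is to locate the generated .bin file and check its size. This snippet is only an illustrative check; adjust the search path if convert.py was pointed at a different output location.

```python
# Illustrative check (not from the repository): locate the converted ggml
# file and report its size.
import glob
import os

for path in glob.glob("minigpt4-*f16.bin"):
    size_gb = os.path.getsize(path) / 1e9
    print(f"{path}: {size_gb:.2f} GB")
```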
Next, obtain the Vicuna language model. Pre-quantized Vicuna models are available on Hugging Face. To convert the model yourself instead, the requirements are Python 3.x and PyTorch.
Follow the guide in the MiniGPT-4 repository to obtain the vicuna-v0 model. Then clone and build llama.cpp:

```sh
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake .
cmake --build . --config Release
```
Convert the model to ggml:

```sh
python convert.py <path-to-model>
```
Quantize the model with the quantize tool produced by the llama.cpp build (its location may vary by platform and build configuration, e.g. bin/ or bin\Release\):

```sh
./bin/quantize <path-to-model> <output-model> Q4_1
```
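As a rough check that quantization worked, compare the quantized file against the f16 model it was produced from: Q4_1 stores roughly 5 bits per weight versus 16, so the output should be several times smaller. The paths below are placeholders for the files created above.

```python
# Rough post-quantization check (not part of llama.cpp): the Q4_1 model
# should be several times smaller than its f16 source. Paths are placeholders.
import os

f16_path = "ggml-model-f16.bin"   # output of convert.py
q4_path = "ggml-model-q4_1.bin"   # output of the quantize step

f16_size = os.path.getsize(f16_path)
q4_size = os.path.getsize(q4_path)
print(f"f16 : {f16_size / 1e9:.2f} GB")
print(f"Q4_1: {q4_size / 1e9:.2f} GB ({q4_size / f16_size:.2%} of f16)")
```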