Portable llama chatbot on a USB drive.
(Windows 64-bit only.)
- Very simple tech stack.
- No web server needed.
- Runs from a USB drive. Plug and play.
- Git clone or download this repository.
- Download qwen2-1_5b-instruct-q4_k_m.gguf and save it in the `portable_llama_chatbot\models` folder.
- Run `chatbot_in_console.bat`.
- Make a folder `portable_llama_chatbot` and cd into it. It is the working directory.

  ```
  mkdir portable_llama_chatbot
  cd portable_llama_chatbot
  ```
- Make two more folders inside `portable_llama_chatbot`: `models` for model files, and `install` for temporary installation files.

  ```
  mkdir models
  mkdir install
  ```
- Download embeddable Python from https://www.python.org/ftp/python/3.11.9/python-3.11.9-embed-amd64.zip .
- Unzip it to a folder `python-3.11.9-embed-amd64` inside the `portable_llama_chatbot` folder.
- The embeddable Python does not come with pip, so it has to be installed manually. Download the bootstrap script from https://bootstrap.pypa.io/get-pip.py and save it as `portable_llama_chatbot\install\get-pip.py`.
- Install pip. The embeddable Python is not on the system PATH, so run it with either a full or a relative path:

  ```
  .\python-3.11.9-embed-amd64\python.exe install\get-pip.py
  ```
- Edit `.\python-3.11.9-embed-amd64\python311._pth` and replace its content with the text below:

  ```
  ..
  DLLs
  Lib/site-packages
  python311.zip
  .

  # Uncomment to run site.main() automatically
  import site
  ```
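For context: the embeddable interpreter builds `sys.path` from this `._pth` file. Each non-comment line is treated as a path relative to the file, and a line starting with `import` is executed at startup, which is how `import site` makes pip's `Lib/site-packages` visible. A rough sketch of that behavior in plain Python (illustrative only; `parse_pth` is a made-up helper, not CPython's real implementation):

```python
# Illustrative sketch of how a ._pth file is interpreted.
# Not CPython's actual code; paths and behavior are simplified.

def parse_pth(lines, base="python-3.11.9-embed-amd64"):
    """Return (search_paths, startup_imports) from ._pth lines."""
    paths, imports = [], []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue                                   # blanks and comments are ignored
        if line.startswith("import "):
            imports.append(line[len("import "):])      # executed at interpreter startup
        else:
            paths.append(f"{base}\\{line}")            # added to sys.path
    return paths, imports

pth = ["..", "DLLs", "Lib/site-packages", "python311.zip", ".",
       "# Uncomment to run site.main() automatically", "import site"]
paths, imports = parse_pth(pth)
print(paths)
print(imports)   # 'import site' is what makes pip-installed packages visible
```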
- Install `llama-cpp-python`. There are several ways to do it:
  - Install from the pre-built wheel in this repository: download `llama_cpp_python-0.2.88-cp311-cp311-win_amd64.whl`, save it in `install`, and run:

    ```
    .\python-3.11.9-embed-amd64\python.exe -m pip install install\llama_cpp_python-0.2.88-cp311-cp311-win_amd64.whl
    ```
  - Install from the official pre-built wheel:

    ```
    .\python-3.11.9-embed-amd64\python.exe -m pip install https://github.com/abetlen/llama-cpp-python/releases/download/v0.2.88/llama_cpp_python-0.2.88-cp311-cp311-win_amd64.whl
    ```
  - Install from PyPI, building from source:
    - Install `scikit_build_core`, which is required to build `llama-cpp-python`:

      ```
      .\python-3.11.9-embed-amd64\python.exe -m pip install scikit_build_core
      ```
    - Make sure the Visual Studio C++ compiler and CMake are installed before building `llama-cpp-python`.
    - Install `llama-cpp-python`. It will take some time to build:

      ```
      .\python-3.11.9-embed-amd64\python.exe -m pip install llama-cpp-python==0.2.88
      ```
- Copy `llama_chatbot.py`, `chatbot_in_console.py` and `chatbot_in_console.bat` into the `portable_llama_chatbot` folder.
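At this point, the working directory should look roughly like this (the model file is added in the next step, and `install` also holds the wheel if you installed from the repository wheel):

```
portable_llama_chatbot\
├── chatbot_in_console.bat
├── chatbot_in_console.py
├── llama_chatbot.py
├── install\
│   └── get-pip.py
├── models\
└── python-3.11.9-embed-amd64\
```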
- Download qwen2-1_5b-instruct-q4_k_m.gguf and save it in the `portable_llama_chatbot\models` folder. Any model supported by `llama-cpp-python` should work; just modify `model_name` and `chat_format` in the `chatbot_in_console.py` script.
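As an illustration of why `chat_format` matters: for Qwen2-style models, the library renders the conversation into a ChatML-style prompt before handing it to the model. A minimal sketch of that template in plain Python (`render_chatml` is an illustrative helper; `llama-cpp-python`'s internal chat-format handlers are more elaborate):

```python
# Minimal sketch of a ChatML-style prompt template, the kind of
# formatting a Qwen2-style chat_format applies. Illustrative only.

def render_chatml(messages):
    """Render a list of {'role', 'content'} dicts into a ChatML prompt."""
    prompt = ""
    for m in messages:
        prompt += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    # Leave the assistant turn open so the model completes it.
    prompt += "<|im_start|>assistant\n"
    return prompt

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
print(render_chatml(messages))
```

A model served with the wrong `chat_format` still generates text, but the mismatched turn markers usually degrade its answers, which is why both `model_name` and `chat_format` must be changed together.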
- Run `chatbot_in_console.bat` to use it.
- Refer to the `llama-cpp-python` documentation for customization.