Welcome to the Chat With Your Docs repository, where we explore integrating ChatDocs with the GPTQ quantization method to build an interactive chat app for your documents!
In the `AutoGPTQ-transformers.ipynb` notebook, discover the process of creating a quantized model using AutoGPTQ. Additionally, in the `C Transformers (GPTQ).ipynb` notebook, explore how to use a GPTQ model with CTransformers.
- To incorporate GPTQ models, install the `auto-gptq` package using:

  ```sh
  pip install chatdocs[gptq]
  ```
- In the main directory, create `chatdocs.yml` and add the following to it:

  ```yaml
  llm: gptq
  ```
- To change the GPTQ model, modify your `chatdocs.yml` as follows:

  ```yaml
  gptq:
    model: TheBloke/Llama-2-7B-GPTQ
    model_file: model.safetensors
  ```
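Putting the two settings above together, a minimal complete `chatdocs.yml` for GPTQ might look like this (the model repo and file are the ones named above; swap in whichever GPTQ model you prefer):

```yaml
# Minimal sketch of a complete chatdocs.yml for the GPTQ backend.
# Model repo and file taken from the example above; adjust to your model.
llm: gptq

gptq:
  model: TheBloke/Llama-2-7B-GPTQ
  model_file: model.safetensors
```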
  > **Note:** When adding a new model for the first time, run `chatdocs download` to download the model before use.
- To use GPTQ models with C Transformers, install the `ctransformers` package using:

  ```sh
  pip install ctransformers[gptq]
  ```
- To change the C Transformers GPTQ model, add and change the following in your `chatdocs.yml`:

  ```yaml
  ctransformers:
    model: TheBloke/Llama-2-7B-GPTQ
    model_file: model.safetensors
    model_type: gptq
  ```
  You can also use an existing local model file:

  ```yaml
  ctransformers:
    model: /path/to/ggml-model.bin
    model_type: gptq
  ```
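If your ChatDocs version forwards a nested `config:` block to the ctransformers library (upstream versions do), generation settings can be tuned in the same section. A hedged sketch: the keys below are ctransformers `Config` fields, and the values are illustrative rather than recommendations:

```yaml
# Assumes chatdocs passes `config:` through to ctransformers' generation
# settings; check your installed version's defaults before relying on this.
ctransformers:
  model: TheBloke/Llama-2-7B-GPTQ
  model_file: model.safetensors
  model_type: gptq
  config:
    context_length: 1024   # prompt context window
    max_new_tokens: 256    # cap on tokens generated per reply
    temperature: 0.7       # sampling temperature
```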
- Add a directory containing documents to engage in conversation with using:

  ```sh
  chatdocs add /path/to/documents
  ```

  The processed documents are stored in the `db` directory by default.
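ChatDocs persists the document index with Chroma. If your installed version exposes the Chroma settings in `chatdocs.yml` (the upstream default config does), you can relocate that directory — a sketch, with the key name assumed from the upstream defaults:

```yaml
# Assumed key from ChatDocs' upstream default config; verify against
# the defaults shipped with your installed version.
chroma:
  persist_directory: db   # change this path to store the index elsewhere
```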
- Chat with your documents:

  ```sh
  chatdocs ui
  ```

  Open http://localhost:5000 in your browser to access the web UI. The command-line interface is also available:

  ```sh
  chatdocs chat
  ```
For further configuration options, see the ChatDocs repository.