su77ungr/CASALIOY

Getting model_path KeyError

Closed this issue · 23 comments

  1. Sourced the .env file after editing the model paths
  2. Ran python ingest.py short.pdf

Any resolution is welcome

Traceback (most recent call last):
  File "/home/ubuntu/environment/CASALIOY/casalioy/ingest.py", line 150, in <module>
    main(sources_directory, cleandb)
  File "/home/ubuntu/environment/CASALIOY/casalioy/ingest.py", line 144, in main
    ingester.ingest_from_directory(sources_directory, chunk_size, chunk_overlap)
  File "/home/ubuntu/environment/CASALIOY/casalioy/ingest.py", line 117, in ingest_from_directory
    encode_fun = get_embedding_model()[1]
  File "/home/ubuntu/environment/CASALIOY/casalioy/load_env.py", line 46, in get_embedding_model
    model = LlamaCppEmbeddings(model_path=text_embeddings_model, n_ctx=model_n_ctx)
  File "pydantic/main.py", line 339, in pydantic.main.BaseModel.__init__
  File "pydantic/main.py", line 1102, in pydantic.main.validate_model
  File "/home/ubuntu/environment/gptenv/lib/python3.10/site-packages/langchain/embeddings/llamacpp.py", line 64, in validate_environment
    model_path = values["model_path"]
KeyError: 'model_path'

Please include:

  • your exact .env file

note: per the readme, python casalioy/ingest.py # optional <path_to_your_data_directory> -> the argument is the data directory, not a file
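For anyone else landing here: the traceback shows langchain's validate_environment reading values["model_path"], and the KeyError usually means that field never survived validation (e.g. the value coming from .env was empty or not a usable path). A minimal sanity check you can run first, assuming python-dotenv is installed and paths are relative to the repo root (check_env.py is a hypothetical helper, not part of CASALIOY):

# check_env.py -- hypothetical helper, not part of CASALIOY.
# Confirms the .env values resolve to real files before pydantic/langchain
# validation ever runs. HF model ids (e.g. sentence-transformers/...) are
# not on-disk paths, so "does not exist" is expected for those.
import os
from pathlib import Path

from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # reads .env from the current working directory

for var in ("TEXT_EMBEDDINGS_MODEL", "MODEL_PATH"):
    value = os.environ.get(var, "")
    if not value:
        print(f"{var}: missing or empty -> expect a validation error downstream")
        continue
    path = Path(value)
    if path.is_file():
        print(f"{var}: {path} is a regular file, looks fine")
    elif path.is_dir():
        print(f"{var}: {path} is a directory, not a file")
    else:
        print(f"{var}: {path} does not exist")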

Downloaded the models and kept them in the models/ folder. Here is my .env:

# Generic

MODEL_N_CTX=1024
TEXT_EMBEDDINGS_MODEL=models/ggml-model-q4_0.bin
TEXT_EMBEDDINGS_MODEL_TYPE=LlamaCpp # LlamaCpp or HF
USE_MLOCK=true

# Ingestion

PERSIST_DIRECTORY=db
DOCUMENTS_DIRECTORY=source_documents
INGEST_CHUNK_SIZE=500
INGEST_CHUNK_OVERLAP=50

# Generation

MODEL_TYPE=LlamaCpp # GPT4All or LlamaCpp
MODEL_PATH=models/ggml-gpt4all-j-v1.3-groovy.bin
MODEL_TEMP=0.8
MODEL_STOP=[STOP]
CHAIN_TYPE=stuff
N_RETRIEVE_DOCUMENTS=100 # How many documents to retrieve from the db
N_FORWARD_DOCUMENTS=6 # How many documents to forward to the LLM, chosen among those retrieved
N_GPU_LAYERS=4

First issue:

  • ggml-model-q4_0.bin is a q4 model, which is deprecated. Use a q5 model, as written in the readme.

Are you on the latest master version? If not, which one?

Using the latest main branch, do I need to switch?

Nope.

Can you replace your .env with the exact content of example.env and report the result here?

Thanks, it worked with the default settings, but I get an error in startLLM.py:

$ python casalioy/startLLM.py
found local model at models/sentence-transformers/all-MiniLM-L6-v2
found local model at models/eachadea/ggml-vicuna-7b-1.1/ggml-vic7b-q5_1.bin
llama.cpp: loading model from models/eachadea/ggml-vicuna-7b-1.1/ggml-vic7b-q5_1.bin
terminate called after throwing an instance of 'std::runtime_error'
what(): read error: Is a directory
Aborted (core dumped)

Output for ingest.py

$ python casalioy/ingest.py source_documents/AAR_20211231_CA1363851017_AR.pdf y
found local model at models/sentence-transformers/all-MiniLM-L6-v2
found local model at models/eachadea/ggml-vicuna-7b-1.1/ggml-vic7b-q5_1.bin
Deleting db...
Scanning files
0.0% [=================================================================================================================================>] 0/ ? eta [?:??:??]
Done

python casalioy/ingest.py source_documents/AAR_20211231_CA1363851017_AR.pdf y

That's the wrong command. You need to provide a directory, not a file:

python casalioy/ingest.py source_documents/
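Rough illustration of why the file argument showed 0/? files scanned (a hypothetical sketch assuming the scanner globs inside its argument; the actual logic in ingest.py may differ):

# Hypothetical sketch, not the real ingest.py code: globbing *inside* the
# argument yields nothing when the argument is itself a file.
from pathlib import Path

def scan(sources_directory: str) -> list[Path]:
    """Collect every regular file below the given directory."""
    return [p for p in Path(sources_directory).glob("**/*") if p.is_file()]

print(len(scan("source_documents/")))  # all the documents
print(len(scan("source_documents/AAR_20211231_CA1363851017_AR.pdf")))  # 0 -- a file has no children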

$ python casalioy/ingest.py source_documents/
found local model at models/sentence-transformers/all-MiniLM-L6-v2
found local model at models/eachadea/ggml-vicuna-7b-1.1/ggml-vic7b-q5_1.bin
Scanning files
Processing AAR_20211231_CA1363851017_AR.pdf
Processing 1393 chunks
Creating a new collection, size=384
Saving 1000 chunks
Saved, the collection now holds 999 documents.
embedding chunk 1001/1393
Saving 393 chunks
Saved, the collection now holds 1381 documents.
Processed AAR_20211231_CA1363851017_AR.pdf
Processing aapl-20220924.txt
Processing 565 chunks
Saving 565 chunks
Saved, the collection now holds 1943 documents.
Processed aapl-20220924.txt
Processing sample.csv
Processing 9 chunks
Saving 9 chunks
Saved, the collection now holds 1952 documents.
Processed sample.csv
Processing shor.pdf
Processing 22 chunks
Saving 22 chunks
Saved, the collection now holds 1974 documents.
Processed shor.pdf
Processing state_of_the_union.txt
Processing 90 chunks
Saving 90 chunks
Saved, the collection now holds 2064 documents.
Processed state_of_the_union.txt
Processing LLAMA Leveraging Object-Oriented Programming for Designing a Logging Framework-compressed.pdf
Processing 14 chunks
Saving 14 chunks
Saved, the collection now holds 2078 documents.
Processed LLAMA Leveraging Object-Oriented Programming for Designing a Logging Framework-compressed.pdf
Processing Constantinople.docx
Processing 13 chunks
Saving 13 chunks
Saved, the collection now holds 2090 documents.
Processed Constantinople.docx
Processing Easy_recipes.epub
[nltk_data] Downloading package punkt to /home/ubuntu/nltk_data...============================================> ] 7/ 9 eta [00:12]
[nltk_data] Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data] /home/ubuntu/nltk_data...
[nltk_data] Unzipping taggers/averaged_perceptron_tagger.zip.
Processing 31 chunks
Saving 31 chunks
Saved, the collection now holds 2121 documents.
Processed Easy_recipes.epub
Processing Muscle Spasms Charley Horse MedlinePlus.html
Processing 15 chunks
Saving 15 chunks
Saved, the collection now holds 2136 documents.
100.0% [===================================================================================================================================>] 9/ 9 eta [00:00]
Processed Muscle Spasms Charley Horse MedlinePlus.html
Done

That looks correct. What about startLLM.py now?

Same error:

$ python casalioy/startLLM.py
found local model at models/sentence-transformers/all-MiniLM-L6-v2
found local model at models/eachadea/ggml-vicuna-7b-1.1/ggml-vic7b-q5_1.bin
llama.cpp: loading model from models/eachadea/ggml-vicuna-7b-1.1/ggml-vic7b-q5_1.bin
terminate called after throwing an instance of 'std::runtime_error'
what(): read error: Is a directory
Aborted (core dumped)

My system has a GPU:
$ nvidia-smi
Wed May 17 09:29:11 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02 Driver Version: 530.30.02 CUDA Version: 12.1 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Tesla T4 On | 00000000:00:1E.0 Off | 0 |
| N/A 33C P8 9W / 70W| 2MiB / 15360MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+

This is most likely not GPU related.

Just to make sure: your .env is exactly the same as example.env and you haven't modified any source file? Because it sounds like you're forwarding a directory to the LLM at some point.
Running python casalioy/startLLM.py works out-of-the-box for me.

Yes, I'm using exactly the same example.env after renaming it to .env:

# Generic

MODEL_N_CTX=1024
TEXT_EMBEDDINGS_MODEL=sentence-transformers/all-MiniLM-L6-v2
TEXT_EMBEDDINGS_MODEL_TYPE=HF # LlamaCpp or HF
USE_MLOCK=true

# Ingestion

PERSIST_DIRECTORY=db
DOCUMENTS_DIRECTORY=source_documents
INGEST_CHUNK_SIZE=500
INGEST_CHUNK_OVERLAP=50

# Generation

MODEL_TYPE=LlamaCpp # GPT4All or LlamaCpp
MODEL_PATH=eachadea/ggml-vicuna-7b-1.1/ggml-vic7b-q5_1.bin
MODEL_TEMP=0.8
MODEL_STOP=[STOP]
CHAIN_TYPE=stuff
N_RETRIEVE_DOCUMENTS=100 # How many documents to retrieve from the db
N_FORWARD_DOCUMENTS=6 # How many documents to forward to the LLM, chosen among those retrieved
N_GPU_LAYERS=4

What's the output of md5sum models/eachadea/ggml-vicuna-7b-1.1/ggml-vic7b-q5_1.bin?

Do I need to replace model_path with this? How do I get the output?

It's not a Python command. Just run it in the terminal.

Expected output:

❯ md5sum models/eachadea/ggml-vicuna-7b-1.1/ggml-vic7b-q5_1.bin
29e959c57faa0bcdf95b1ba5f7c9e968  models/eachadea/ggml-vicuna-7b-1.1/ggml-vic7b-q5_1.bin

Thanks, I'm getting the output below:
$ md5sum models/eachadea/ggml-vicuna-7b-1.1/ggml-vic7b-q5_1.bin
md5sum: models/eachadea/ggml-vicuna-7b-1.1/ggml-vic7b-q5_1.bin: Is a directory

Aaah OK, then the problem is a lot simpler x)
Hotfix for you: in .env, replace

MODEL_PATH=eachadea/ggml-vicuna-7b-1.1/ggml-vic7b-q5_1.bin

by

MODEL_PATH=eachadea/ggml-vicuna-7b-1.1/ggml-vic7b-q5_1.bin/ggml-vic7b-q5_1.bin
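You can confirm the fix without waiting for the whole model to load; a quick check, assuming the loader resolves MODEL_PATH under models/ as the "found local model" lines above suggest (hypothetical snippet, not part of the repo):

# Hypothetical check: verify MODEL_PATH points at a regular file under
# models/ before llama.cpp tries to read it.
from pathlib import Path

model_path = Path("models") / "eachadea/ggml-vicuna-7b-1.1/ggml-vic7b-q5_1.bin/ggml-vic7b-q5_1.bin"
if model_path.is_file():
    print(f"OK: {model_path} ({model_path.stat().st_size / 1e9:.1f} GB)")
elif model_path.is_dir():
    print(f"{model_path} is a directory -- llama.cpp will abort with 'Is a directory'")
else:
    print(f"{model_path} not found")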

YOU HAVE ACCESS TO A TESLA GPU? @madeepakkumar1? Is this your own setup?

Thanks @hippalectryon-0, it worked after giving the full path of the model.
Sorry @su77ungr, I don't have a Tesla GPU setup, but I do have an Nvidia GPU setup; let me know if I can help here.

@madeepakkumar1 nvidia-smi reported an Nvidia Tesla T4, so this should be your GPU haha. Nice card. Ignore this message.