CheshireCat with Ollama - Application startup failed
Description
Hi, and thank you for your awesome work. I am a newbie, so the "bug" may be caused by a mistake on my part.
I am on:
Arch Linux - 6.9.3-arch1-1 x86_64
Docker version 26.1.4, build 5650f9b102
I followed this guide and changed the compose.yml according to the instructions provided.
To Reproduce
I ran docker compose up: the images were pulled correctly and the containers were created successfully, but I got this error:
cheshire_cat_core | ERROR: Traceback (most recent call last):
cheshire_cat_core |   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 732, in lifespan
cheshire_cat_core |     async with self.lifespan_context(app) as maybe_state:
cheshire_cat_core |   File "/usr/local/lib/python3.10/contextlib.py", line 199, in __aenter__
cheshire_cat_core |     return await anext(self.gen)
cheshire_cat_core |   File "/app/cat/main.py", line 33, in lifespan
cheshire_cat_core |     app.state.ccat = CheshireCat()
cheshire_cat_core |   File "/app/cat/utils.py", line 172, in getinstance
cheshire_cat_core |     cls.instances[class_] = class_(*args, **kwargs)
cheshire_cat_core |   File "/app/cat/looking_glass/cheshire_cat.py", line 74, in __init__
cheshire_cat_core |     self.load_memory()
cheshire_cat_core |   File "/app/cat/looking_glass/cheshire_cat.py", line 247, in load_memory
cheshire_cat_core |     self.memory = LongTermMemory(vector_memory_config=vector_memory_config)
cheshire_cat_core |   File "/app/cat/memory/long_term_memory.py", line 17, in __init__
cheshire_cat_core |     self.vectors = VectorMemory(**vector_memory_config)
cheshire_cat_core |   File "/app/cat/memory/vector_memory.py", line 37, in __init__
cheshire_cat_core |     collection = VectorMemoryCollection(
cheshire_cat_core |   File "/app/cat/memory/vector_memory_collection.py", line 52, in __init__
cheshire_cat_core |     self.check_embedding_size()
cheshire_cat_core |   File "/app/cat/memory/vector_memory_collection.py", line 69, in check_embedding_size
cheshire_cat_core |     == self.client.get_collection_aliases(self.collection_name)
cheshire_cat_core | IndexError: list index out of range
cheshire_cat_core |
cheshire_cat_core | ERROR: Application startup failed. Exiting.
Expected behavior
I successfully pulled my model (mistral) with docker exec ollama_cat ollama pull mistral:7b-instruct-q2_K, and if I run docker run --rm -it -p 1866:80 ghcr.io/cheshire-cat-ai/core:latest I can open the Cat's GUI in my browser, but if I try to set my model to Ollama it does not work.
I think it can't connect to the ollama_cat container, because docker compose up does not work for me.
I have also tried to run CheshireCat-AI-Core and Ollama simultaneously, but with no success.
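One way to narrow this down is to check whether the Ollama API is reachable from wherever the Cat is running. The sketch below is only an illustration: the helper names are hypothetical, the host ollama_cat and port 11434 are taken from the compose.yml in this report, and /api/tags is Ollama's model-listing endpoint. Note that a container started separately with docker run is normally not attached to the Compose project's network, so the name ollama_cat would likely not resolve from it.

```python
import json
import urllib.error
import urllib.request
from typing import List, Optional


def ollama_tags_url(host: str, port: int = 11434) -> str:
    """Build the URL of Ollama's model-listing endpoint (GET /api/tags)."""
    return f"http://{host}:{port}/api/tags"


def list_ollama_models(
    host: str, port: int = 11434, timeout: float = 3.0
) -> Optional[List[str]]:
    """Return the model names Ollama reports, or None if it cannot be reached.

    Hypothetical helper: it only distinguishes "reachable" from "not reachable".
    """
    try:
        with urllib.request.urlopen(ollama_tags_url(host, port), timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, refused connection, timeout, or a non-JSON reply.
        return None
    return [model["name"] for model in payload.get("models", [])]


if __name__ == "__main__":
    # Run this from inside a container on the same Compose network.
    models = list_ollama_models("ollama_cat")
    if models is None:
        print("Ollama is not reachable at", ollama_tags_url("ollama_cat"))
    else:
        print("Ollama is reachable; models:", models)
```

If the call returns None from inside the Cat's container, the problem is probably networking; if it returns a list that includes the pulled model, the Ollama side is likely fine and the startup failure lies elsewhere.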
Additional context
Here is my compose.yml:
[fz@fzpc ollama-cat]$ cat docker-compose.yml
version: '3.7'
services:
  cheshire-cat-core:
    image: ghcr.io/cheshire-cat-ai/core:latest
    container_name: cheshire_cat_core
    depends_on:
      - cheshire-cat-vector-memory
      - ollama
    environment:
      - PYTHONUNBUFFERED=1
      - WATCHFILES_FORCE_POLLING=true
      - CORE_HOST=${CORE_HOST:-localhost}
      - CORE_PORT=${CORE_PORT:-1865}
      - QDRANT_HOST=${QDRANT_HOST:-cheshire_cat_vector_memory}
      - QDRANT_PORT=${QDRANT_PORT:-6333}
      - CORE_USE_SECURE_PROTOCOLS=${CORE_USE_SECURE_PROTOCOLS:-}
      - API_KEY=${API_KEY:-}
      - LOG_LEVEL=${LOG_LEVEL:-WARNING}
      - DEBUG=${DEBUG:-true}
      - SAVE_MEMORY_SNAPSHOTS=${SAVE_MEMORY_SNAPSHOTS:-false}
    ports:
      - ${CORE_PORT:-1865}:80
    volumes:
      - ./cat/static:/app/cat/static
      - ./cat/public:/app/cat/public
      - ./cat/plugins:/app/cat/plugins
      - ./cat/metadata.json:/app/metadata.json
    restart: unless-stopped
  cheshire-cat-vector-memory:
    image: qdrant/qdrant:latest
    container_name: cheshire_cat_vector_memory
    expose:
      - 6333
    volumes:
      - ./cat/long_term_memory/vector:/qdrant/storage
    restart: unless-stopped
  ollama:
    container_name: ollama_cat
    image: ollama/ollama:latest
    volumes:
      - ./ollama:/root/.ollama
    expose:
      - 11434
    environment:
      - gpus=all
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
My images' list:
[fz@fzpc ollama-cat]$ docker image list
REPOSITORY                     TAG       IMAGE ID       CREATED       SIZE
ghcr.io/cheshire-cat-ai/core   1.6.2     06a81f20b5b2   7 days ago    1.29GB
ghcr.io/cheshire-cat-ai/core   latest    06a81f20b5b2   7 days ago    1.29GB
ollama/ollama                  0.1.39    fa86221dbf8c   11 days ago   464MB
qdrant/qdrant                  v1.9.1    84938a05ba4e   5 weeks ago   156MB
Tree of ollama/ directory:
[fz@fzpc ollama-cat]$ tree ollama/
ollama/
├── id_ed25519
├── id_ed25519.pub
└── models
    ├── blobs
    │   ├── sha256-22e1b2e8dc2fbc3ac38b50f59e49f594034462c1cd02764353a8a076d97c3a59
    │   ├── sha256-43070e2d4e532684de521b885f385d0841030efa2b1a20bafb76133a5e1379c1
    │   ├── sha256-6547352386940a480d6fa11958bb027b757b628c98266c0ae20886f7fd9068d6
    │   ├── sha256-7933e7c155189d06a579600b0107ea73910d8b97a54efdb63cc8dee3198a4eb0
    │   └── sha256-ed11eda7790d05b49395598a42b155812b17e263214292f7b87d15e14003d337
    └── manifests
        └── registry.ollama.ai
            └── library
                └── mistral
                    └── 7b-instruct-q2_K
7 directories, 8 files
And the complete traceback/logging:
[fz@fzpc ollama-cat]$ docker compose up
WARN[0000] /home/fz/Documents/Github/ollama-cat/docker-compose.yml: `version` is obsolete
[+] Running 3/0
✔ Container ollama_cat Created 0.0s
✔ Container cheshire_cat_vector_memory Created 0.0s
✔ Container cheshire_cat_core Created 0.0s
Attaching to cheshire_cat_core, cheshire_cat_vector_memory, ollama_cat
cheshire_cat_vector_memory |            _                 _
cheshire_cat_vector_memory |   __ _  __| |_ __ __ _ _ __ | |_
cheshire_cat_vector_memory |  / _` |/ _` | '__/ _` | '_ \| __|
cheshire_cat_vector_memory | | (_| | (_| | | | (_| | | | | |_
cheshire_cat_vector_memory |  \__, |\__,_|_|  \__,_|_| |_|\__|
cheshire_cat_vector_memory |     |_|
cheshire_cat_vector_memory |
cheshire_cat_vector_memory | Version: 1.9.1, build: 97c107f2
cheshire_cat_vector_memory | Access web UI at http://localhost:6333/dashboard
cheshire_cat_vector_memory |
cheshire_cat_vector_memory | 2024-06-09T08:30:29.879122Z INFO storage::content_manager::consensus::persistent: Loading raft state from ./storage/raft_state.json
cheshire_cat_vector_memory | 2024-06-09T08:30:29.928418Z INFO storage::content_manager::toc: Loading collection: declarative
ollama_cat | 2024/06/09 08:30:30 routes.go:1028: INFO server config env="map[OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST: OLLAMA_KEEP_ALIVE: OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:1 OLLAMA_MAX_QUEUE:512 OLLAMA_MAX_VRAM:0 OLLAMA_MODELS: OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:*] OLLAMA_RUNNERS_DIR: OLLAMA_TMPDIR:]"
ollama_cat | time=2024-06-09T08:30:30.628Z level=INFO source=images.go:729 msg="total blobs: 5"
ollama_cat | time=2024-06-09T08:30:30.628Z level=INFO source=images.go:736 msg="total unused blobs removed: 0"
ollama_cat | time=2024-06-09T08:30:30.628Z level=INFO source=routes.go:1074 msg="Listening on [::]:11434 (version 0.1.39)"
ollama_cat | time=2024-06-09T08:30:30.684Z level=INFO source=payload.go:30 msg="extracting embedded files" dir=/tmp/ollama1523340118/runners
cheshire_cat_vector_memory | 2024-06-09T08:30:32.343693Z INFO collection::shards::local_shard: Recovering collection declarative: 0/0 (0%)
cheshire_cat_vector_memory | 2024-06-09T08:30:32.343715Z INFO collection::shards::local_shard: Recovered collection declarative: 0/0 (100%)
cheshire_cat_vector_memory | 2024-06-09T08:30:32.344820Z INFO storage::content_manager::toc: Loading collection: episodic
ollama_cat | time=2024-06-09T08:30:34.145Z level=INFO source=payload.go:44 msg="Dynamic LLM libraries [rocm_v60002 cpu cpu_avx cpu_avx2 cuda_v11]"
cheshire_cat_vector_memory | 2024-06-09T08:30:34.183915Z INFO collection::shards::local_shard: Recovering collection episodic: 0/0 (0%)
cheshire_cat_vector_memory | 2024-06-09T08:30:34.183937Z INFO collection::shards::local_shard: Recovered collection episodic: 0/0 (100%)
cheshire_cat_vector_memory | 2024-06-09T08:30:34.185059Z INFO qdrant: Distributed mode disabled
cheshire_cat_vector_memory | 2024-06-09T08:30:34.185079Z INFO qdrant: Telemetry reporting enabled, id: 43cfbf76-580f-4dcc-b35d-97cb9111486a
cheshire_cat_vector_memory | 2024-06-09T08:30:34.186845Z INFO qdrant::actix: TLS disabled for REST API
cheshire_cat_vector_memory | 2024-06-09T08:30:34.186935Z INFO qdrant::actix: Qdrant HTTP listening on 6333
cheshire_cat_vector_memory | 2024-06-09T08:30:34.186950Z INFO actix_server::builder: Starting 11 workers
cheshire_cat_vector_memory | 2024-06-09T08:30:34.186957Z INFO actix_server::server: Actix runtime found; starting in Actix runtime
cheshire_cat_vector_memory | 2024-06-09T08:30:34.189487Z INFO qdrant::tonic: Qdrant gRPC listening on 6334
cheshire_cat_vector_memory | 2024-06-09T08:30:34.189516Z INFO qdrant::tonic: TLS disabled for gRPC API
ollama_cat | time=2024-06-09T08:30:34.274Z level=INFO source=types.go:71 msg="inference compute" id=GPU-4151ce35-c2d3-a276-826b-00260b538d12 library=cuda compute=7.5 driver=12.4 name="NVIDIA GeForce RTX 2060" total="5.8 GiB" available="5.7 GiB"
cheshire_cat_core | --- Logging error in Loguru Handler #1 ---
cheshire_cat_core | Record was: {'elapsed': datetime.timedelta(seconds=1, microseconds=138367), 'exception': None, 'extra': {}, 'file': (name='embedding.py', path='/usr/local/lib/python3.10/site-packages/fastembed/embedding.py'), 'function': '<module>', 'level': (name='WARNING', no=30, icon='⚠️'), 'line': 7, 'message': 'DefaultEmbedding, FlagEmbedding, JinaEmbedding are deprecated.Use from fastembed import TextEmbedding instead.', 'module': 'embedding', 'name': 'fastembed.embedding', 'process': (id=7, name='MainProcess'), 'thread': (id=128101330413376, name='MainThread'), 'time': datetime(2024, 6, 9, 8, 30, 35, 107772, tzinfo=datetime.timezone(datetime.timedelta(0), 'UTC'))}
cheshire_cat_core | Traceback (most recent call last):
cheshire_cat_core |   File "/usr/local/lib/python3.10/site-packages/loguru/_handler.py", line 184, in emit
cheshire_cat_core |     formatted = precomputed_format.format_map(formatter_record)
cheshire_cat_core | KeyError: 'original_name'
cheshire_cat_core | --- End of logging error ---
cheshire_cat_core | --- Logging error in Loguru Handler #1 ---
cheshire_cat_core | Record was: {'elapsed': datetime.timedelta(seconds=1, microseconds=146622), 'exception': None, 'extra': {}, 'file': (name='embedding.py', path='/usr/local/lib/python3.10/site-packages/fastembed/embedding.py'), 'function': '<module>', 'level': (name='WARNING', no=30, icon='⚠️'), 'line': 7, 'message': 'DefaultEmbedding, FlagEmbedding, JinaEmbedding are deprecated.Use from fastembed import TextEmbedding instead.', 'module': 'embedding', 'name': 'fastembed.embedding', 'process': (id=31, name='SpawnProcess-1'), 'thread': (id=136268812064576, name='MainThread'), 'time': datetime(2024, 6, 9, 8, 30, 38, 222034, tzinfo=datetime.timezone(datetime.timedelta(0), 'UTC'))}
cheshire_cat_core | Traceback (most recent call last):
cheshire_cat_core |   File "/usr/local/lib/python3.10/site-packages/loguru/_handler.py", line 184, in emit
cheshire_cat_core |     formatted = precomputed_format.format_map(formatter_record)
cheshire_cat_core | KeyError: 'original_name'
cheshire_cat_core | --- End of logging error ---
Fetching 8 files: 100%|██████████| 8/8 [00:00<00:00, 62718.56it/s]
cheshire_cat_core | /usr/local/lib/python3.10/site-packages/qdrant_client/qdrant_remote.py:123: UserWarning: Api key is used with unsecure connection.
cheshire_cat_core | warnings.warn("Api key is used with unsecure connection.")
cheshire_cat_vector_memory | 2024-06-09T08:30:53.476329Z INFO actix_web::middleware::logger: 172.19.0.4 "GET /collections HTTP/1.1" 200 105 "-" "python-httpx/0.27.0" 0.000136
cheshire_cat_vector_memory | 2024-06-09T08:30:53.488991Z INFO actix_web::middleware::logger: 172.19.0.4 "GET /collections/episodic HTTP/1.1" 200 450 "-" "python-httpx/0.27.0" 0.000211
cheshire_cat_vector_memory | 2024-06-09T08:30:53.495510Z INFO actix_web::middleware::logger: 172.19.0.4 "GET /collections/episodic/aliases HTTP/1.1" 200 114 "-" "python-httpx/0.27.0" 0.000082
cheshire_cat_vector_memory | 2024-06-09T08:30:53.514742Z INFO actix_web::middleware::logger: 172.19.0.4 "GET /collections/episodic HTTP/1.1" 200 450 "-" "python-httpx/0.27.0" 0.000171
cheshire_cat_vector_memory | 2024-06-09T08:30:53.524782Z INFO actix_web::middleware::logger: 172.19.0.4 "GET /collections HTTP/1.1" 200 105 "-" "python-httpx/0.27.0" 0.000080
cheshire_cat_vector_memory | 2024-06-09T08:30:53.533089Z INFO actix_web::middleware::logger: 172.19.0.4 "GET /collections/declarative HTTP/1.1" 200 449 "-" "python-httpx/0.27.0" 0.000108
cheshire_cat_vector_memory | 2024-06-09T08:30:53.534032Z INFO actix_web::middleware::logger: 172.19.0.4 "GET /collections/declarative/aliases HTTP/1.1" 200 78 "-" "python-httpx/0.27.0" 0.000051
cheshire_cat_core | ERROR: Traceback (most recent call last):
cheshire_cat_core |   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 732, in lifespan
cheshire_cat_core |     async with self.lifespan_context(app) as maybe_state:
cheshire_cat_core |   File "/usr/local/lib/python3.10/contextlib.py", line 199, in __aenter__
cheshire_cat_core |     return await anext(self.gen)
cheshire_cat_core |   File "/app/cat/main.py", line 33, in lifespan
cheshire_cat_core |     app.state.ccat = CheshireCat()
cheshire_cat_core |   File "/app/cat/utils.py", line 172, in getinstance
cheshire_cat_core |     cls.instances[class_] = class_(*args, **kwargs)
cheshire_cat_core |   File "/app/cat/looking_glass/cheshire_cat.py", line 74, in __init__
cheshire_cat_core |     self.load_memory()
cheshire_cat_core |   File "/app/cat/looking_glass/cheshire_cat.py", line 247, in load_memory
cheshire_cat_core |     self.memory = LongTermMemory(vector_memory_config=vector_memory_config)
cheshire_cat_core |   File "/app/cat/memory/long_term_memory.py", line 17, in __init__
cheshire_cat_core |     self.vectors = VectorMemory(**vector_memory_config)
cheshire_cat_core |   File "/app/cat/memory/vector_memory.py", line 37, in __init__
cheshire_cat_core |     collection = VectorMemoryCollection(
cheshire_cat_core |   File "/app/cat/memory/vector_memory_collection.py", line 52, in __init__
cheshire_cat_core |     self.check_embedding_size()
cheshire_cat_core |   File "/app/cat/memory/vector_memory_collection.py", line 69, in check_embedding_size
cheshire_cat_core |     == self.client.get_collection_aliases(self.collection_name)
cheshire_cat_core | IndexError: list index out of range
cheshire_cat_core |
cheshire_cat_core | ERROR: Application startup failed. Exiting.
Thanks in advance for your support
The error seems related to the vector memory (i.e. Qdrant and its Python client); try using the compose.yml in this repo.
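For context, the traceback ends in check_embedding_size on a line that calls get_collection_aliases, and the log shows the crash right after the alias request for the declarative collection. That is consistent with the code taking the first element of an alias list that is empty, though the exact source line is not visible in the traceback. The sketch below illustrates that failure mode and a defensive alternative in plain Python. The function names and the embedder-name-plus-collection alias scheme are assumptions made for illustration, not the Cat's actual implementation.

```python
from typing import Optional, Sequence


def first_alias(aliases: Sequence[str]) -> Optional[str]:
    """Return the first alias name, or None when the collection has no alias.

    Indexing aliases[0] directly on an empty list raises
    "IndexError: list index out of range", as in the traceback.
    """
    return aliases[0] if aliases else None


def same_embedder(
    embedder_name: str, collection_name: str, aliases: Sequence[str]
) -> bool:
    """Check whether a collection's alias matches the configured embedder.

    Assumed naming scheme (hypothetical): "<embedder_name>_<collection_name>".
    """
    expected = f"{embedder_name}_{collection_name}"
    return first_alias(aliases) == expected


if __name__ == "__main__":
    # A collection with a matching alias passes the check.
    print(same_embedder("my_embedder", "episodic", ["my_embedder_episodic"]))
    # A collection without any alias is reported as a mismatch instead of crashing.
    print(same_embedder("my_embedder", "declarative", []))
```

Under these assumptions, a collection left without an alias in ./cat/long_term_memory/vector (for example from an earlier interrupted run) would trigger the crash at startup, which may be why a fresh setup behaves differently.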
Thanks for the tip. I tried this repo a few days ago, but with no luck.
Now it works, almost flawlessly... thank you so much!
Moved to Local cat