CheshireCat with Ollama - Application startup failed

Question

Closed this issue a month ago · 3 comments

Description
Hi, and thank you for your awesome work. I am a newbie, so the "bug" may be caused by a mistake on my part.
I am on:
Arch Linux - 6.9.3-arch1-1 x86_64
Docker version 26.1.4, build 5650f9b102

I followed this guide and changed the compose.yml according to the instructions provided.

To Reproduce
I ran docker compose up: the images were pulled correctly and the containers were created successfully.

But I got this error:

cheshire_cat_core           | ERROR:    Traceback (most recent call last):
cheshire_cat_core           |   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 732, in lifespan
cheshire_cat_core           |     async with self.lifespan_context(app) as maybe_state:
cheshire_cat_core           |   File "/usr/local/lib/python3.10/contextlib.py", line 199, in __aenter__
cheshire_cat_core           |     return await anext(self.gen)
cheshire_cat_core           |   File "/app/cat/main.py", line 33, in lifespan
cheshire_cat_core           |     app.state.ccat = CheshireCat()
cheshire_cat_core           |   File "/app/cat/utils.py", line 172, in getinstance
cheshire_cat_core           |     cls.instances[class_] = class_(*args, **kwargs)
cheshire_cat_core           |   File "/app/cat/looking_glass/cheshire_cat.py", line 74, in __init__
cheshire_cat_core           |     self.load_memory()
cheshire_cat_core           |   File "/app/cat/looking_glass/cheshire_cat.py", line 247, in load_memory
cheshire_cat_core           |     self.memory = LongTermMemory(vector_memory_config=vector_memory_config)
cheshire_cat_core           |   File "/app/cat/memory/long_term_memory.py", line 17, in __init__
cheshire_cat_core           |     self.vectors = VectorMemory(**vector_memory_config)
cheshire_cat_core           |   File "/app/cat/memory/vector_memory.py", line 37, in __init__
cheshire_cat_core           |     collection = VectorMemoryCollection(
cheshire_cat_core           |   File "/app/cat/memory/vector_memory_collection.py", line 52, in __init__
cheshire_cat_core           |     self.check_embedding_size()
cheshire_cat_core           |   File "/app/cat/memory/vector_memory_collection.py", line 69, in check_embedding_size
cheshire_cat_core           |     == self.client.get_collection_aliases(self.collection_name)
cheshire_cat_core           | IndexError: list index out of range
cheshire_cat_core           |
cheshire_cat_core           | ERROR:    Application startup failed. Exiting.

Expected behavior
I successfully pulled my model (mistral) with docker exec ollama_cat ollama pull mistral:7b-instruct-q2_K. If I run docker run --rm -it -p 1866:80 ghcr.io/cheshire-cat-ai/core:latest, I can open the Cat's GUI in my browser, but when I try to set my model to Ollama there, it does not work.

I think it can't connect to the ollama_cat container, because running docker compose up does not work for me.

I have also tried to run CheshireCat-AI-Core and Ollama simultaneously, but with no success.
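
One way to test the suspicion that the Cat cannot reach the ollama_cat container is to query Ollama's model list endpoint (GET /api/tags) over the compose network. The following is a minimal sketch, not part of the original report. It assumes the container name ollama_cat and port 11434 from the compose.yml in this issue, and it has to run from a container on the same compose network, because the port is only exposed and not published to the host. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama is reachable under its container name on the compose
# network, as declared in the compose.yml of this issue.
OLLAMA_URL = "http://ollama_cat:11434"


def model_names(tags_payload: dict) -> list[str]:
    """Extract model names from the JSON body returned by GET /api/tags."""
    return [model["name"] for model in tags_payload.get("models", [])]


def list_ollama_models(base_url: str = OLLAMA_URL, timeout: float = 5.0) -> list[str]:
    """Ask the Ollama server which models it has pulled.

    Raises URLError (a subclass of OSError) if the server cannot be reached.
    """
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return model_names(json.load(resp))
```

If list_ollama_models() returns a list that includes mistral:7b-instruct-q2_K, the network path and the pulled model are fine and the problem lies elsewhere. A URLError points to a connectivity or naming problem instead.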

Additional context
Here is my compose.yml:

[fz@fzpc ollama-cat]$ cat docker-compose.yml
version: '3.7'

services:
  cheshire-cat-core:
    image: ghcr.io/cheshire-cat-ai/core:latest
    container_name: cheshire_cat_core
    depends_on:
      - cheshire-cat-vector-memory
      - ollama
    environment:
      - PYTHONUNBUFFERED=1
      - WATCHFILES_FORCE_POLLING=true
      - CORE_HOST=${CORE_HOST:-localhost}
      - CORE_PORT=${CORE_PORT:-1865}
      - QDRANT_HOST=${QDRANT_HOST:-cheshire_cat_vector_memory}
      - QDRANT_PORT=${QDRANT_PORT:-6333}
      - CORE_USE_SECURE_PROTOCOLS=${CORE_USE_SECURE_PROTOCOLS:-}
      - API_KEY=${API_KEY:-}
      - LOG_LEVEL=${LOG_LEVEL:-WARNING}
      - DEBUG=${DEBUG:-true}
      - SAVE_MEMORY_SNAPSHOTS=${SAVE_MEMORY_SNAPSHOTS:-false}
    ports:
      - ${CORE_PORT:-1865}:80
    volumes:
      - ./cat/static:/app/cat/static
      - ./cat/public:/app/cat/public
      - ./cat/plugins:/app/cat/plugins
      - ./cat/metadata.json:/app/metadata.json
    restart: unless-stopped

  cheshire-cat-vector-memory:
    image: qdrant/qdrant:latest
    container_name: cheshire_cat_vector_memory
    expose:
      - 6333
    volumes:
      - ./cat/long_term_memory/vector:/qdrant/storage
    restart: unless-stopped

  ollama:
    container_name: ollama_cat
    image: ollama/ollama:latest
    volumes:
      - ./ollama:/root/.ollama
    expose:
      - 11434
    environment:
      - gpus=all
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

My image list:

[fz@fzpc ollama-cat]$ docker image list
REPOSITORY                     TAG       IMAGE ID       CREATED       SIZE
ghcr.io/cheshire-cat-ai/core   1.6.2     06a81f20b5b2   7 days ago    1.29GB
ghcr.io/cheshire-cat-ai/core   latest    06a81f20b5b2   7 days ago    1.29GB
ollama/ollama                  0.1.39    fa86221dbf8c   11 days ago   464MB
qdrant/qdrant                  v1.9.1    84938a05ba4e   5 weeks ago   156MB

Tree of ollama/ directory:

[fz@fzpc ollama-cat]$ tree ollama/
ollama/
├── id_ed25519
├── id_ed25519.pub
└── models
    ├── blobs
    │   ├── sha256-22e1b2e8dc2fbc3ac38b50f59e49f594034462c1cd02764353a8a076d97c3a59
    │   ├── sha256-43070e2d4e532684de521b885f385d0841030efa2b1a20bafb76133a5e1379c1
    │   ├── sha256-6547352386940a480d6fa11958bb027b757b628c98266c0ae20886f7fd9068d6
    │   ├── sha256-7933e7c155189d06a579600b0107ea73910d8b97a54efdb63cc8dee3198a4eb0
    │   └── sha256-ed11eda7790d05b49395598a42b155812b17e263214292f7b87d15e14003d337
    └── manifests
        └── registry.ollama.ai
            └── library
                └── mistral
                    └── 7b-instruct-q2_K

7 directories, 8 files

And the complete traceback/logging:

[fz@fzpc ollama-cat]$ docker compose up
WARN[0000] /home/fz/Documents/Github/ollama-cat/docker-compose.yml: `version` is obsolete
[+] Running 3/0
 ✔ Container ollama_cat                  Created                                                   0.0s
 ✔ Container cheshire_cat_vector_memory  Created                                                   0.0s
 ✔ Container cheshire_cat_core           Created                                                   0.0s
Attaching to cheshire_cat_core, cheshire_cat_vector_memory, ollama_cat
cheshire_cat_vector_memory  |            _                 _
cheshire_cat_vector_memory  |   __ _  __| |_ __ __ _ _ __ | |_
cheshire_cat_vector_memory  |  / _` |/ _` | '__/ _` | '_ \| __|
cheshire_cat_vector_memory  | | (_| | (_| | | | (_| | | | | |_
cheshire_cat_vector_memory  |  \__, |\__,_|_|  \__,_|_| |_|\__|
cheshire_cat_vector_memory  |     |_|
cheshire_cat_vector_memory  |
cheshire_cat_vector_memory  | Version: 1.9.1, build: 97c107f2
cheshire_cat_vector_memory  | Access web UI at http://localhost:6333/dashboard
cheshire_cat_vector_memory  |
cheshire_cat_vector_memory  | 2024-06-09T08:30:29.879122Z  INFO storage::content_manager::consensus::persistent: Loading raft state from ./storage/raft_state.json
cheshire_cat_vector_memory  | 2024-06-09T08:30:29.928418Z  INFO storage::content_manager::toc: Loading collection: declarative
ollama_cat                  | 2024/06/09 08:30:30 routes.go:1028: INFO server config env="map[OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST: OLLAMA_KEEP_ALIVE: OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:1 OLLAMA_MAX_QUEUE:512 OLLAMA_MAX_VRAM:0 OLLAMA_MODELS: OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:*] OLLAMA_RUNNERS_DIR: OLLAMA_TMPDIR:]"
ollama_cat                  | time=2024-06-09T08:30:30.628Z level=INFO source=images.go:729 msg="total blobs: 5"
ollama_cat                  | time=2024-06-09T08:30:30.628Z level=INFO source=images.go:736 msg="total unused blobs removed: 0"
ollama_cat                  | time=2024-06-09T08:30:30.628Z level=INFO source=routes.go:1074 msg="Listening on [::]:11434 (version 0.1.39)"
ollama_cat                  | time=2024-06-09T08:30:30.684Z level=INFO source=payload.go:30 msg="extracting embedded files" dir=/tmp/ollama1523340118/runners
cheshire_cat_vector_memory  | 2024-06-09T08:30:32.343693Z  INFO collection::shards::local_shard: Recovering collection declarative: 0/0 (0%)
cheshire_cat_vector_memory  | 2024-06-09T08:30:32.343715Z  INFO collection::shards::local_shard: Recovered collection declarative: 0/0 (100%)
cheshire_cat_vector_memory  | 2024-06-09T08:30:32.344820Z  INFO storage::content_manager::toc: Loading collection: episodic
ollama_cat                  | time=2024-06-09T08:30:34.145Z level=INFO source=payload.go:44 msg="Dynamic LLM libraries [rocm_v60002 cpu cpu_avx cpu_avx2 cuda_v11]"
cheshire_cat_vector_memory  | 2024-06-09T08:30:34.183915Z  INFO collection::shards::local_shard: Recovering collection episodic: 0/0 (0%)
cheshire_cat_vector_memory  | 2024-06-09T08:30:34.183937Z  INFO collection::shards::local_shard: Recovered collection episodic: 0/0 (100%)
cheshire_cat_vector_memory  | 2024-06-09T08:30:34.185059Z  INFO qdrant: Distributed mode disabled
cheshire_cat_vector_memory  | 2024-06-09T08:30:34.185079Z  INFO qdrant: Telemetry reporting enabled, id: 43cfbf76-580f-4dcc-b35d-97cb9111486a
cheshire_cat_vector_memory  | 2024-06-09T08:30:34.186845Z  INFO qdrant::actix: TLS disabled for REST API
cheshire_cat_vector_memory  | 2024-06-09T08:30:34.186935Z  INFO qdrant::actix: Qdrant HTTP listening on 6333
cheshire_cat_vector_memory  | 2024-06-09T08:30:34.186950Z  INFO actix_server::builder: Starting 11 workers
cheshire_cat_vector_memory  | 2024-06-09T08:30:34.186957Z  INFO actix_server::server: Actix runtime found; starting in Actix runtime
cheshire_cat_vector_memory  | 2024-06-09T08:30:34.189487Z  INFO qdrant::tonic: Qdrant gRPC listening on 6334
cheshire_cat_vector_memory  | 2024-06-09T08:30:34.189516Z  INFO qdrant::tonic: TLS disabled for gRPC API
ollama_cat                  | time=2024-06-09T08:30:34.274Z level=INFO source=types.go:71 msg="inference compute" id=GPU-4151ce35-c2d3-a276-826b-00260b538d12 library=cuda compute=7.5 driver=12.4 name="NVIDIA GeForce RTX 2060" total="5.8 GiB" available="5.7 GiB"
cheshire_cat_core           | --- Logging error in Loguru Handler #1 ---
cheshire_cat_core           | Record was: {'elapsed': datetime.timedelta(seconds=1, microseconds=138367), 'exception': None, 'extra': {}, 'file': (name='embedding.py', path='/usr/local/lib/python3.10/site-packages/fastembed/embedding.py'), 'function': '<module>', 'level': (name='WARNING', no=30, icon='⚠️'), 'line': 7, 'message': 'DefaultEmbedding, FlagEmbedding, JinaEmbedding are deprecated.Use from fastembed import TextEmbedding instead.', 'module': 'embedding', 'name': 'fastembed.embedding', 'process': (id=7, name='MainProcess'), 'thread': (id=128101330413376, name='MainThread'), 'time': datetime(2024, 6, 9, 8, 30, 35, 107772, tzinfo=datetime.timezone(datetime.timedelta(0), 'UTC'))}
cheshire_cat_core           | Traceback (most recent call last):
cheshire_cat_core           |   File "/usr/local/lib/python3.10/site-packages/loguru/_handler.py", line 184, in emit
cheshire_cat_core           |     formatted = precomputed_format.format_map(formatter_record)
cheshire_cat_core           | KeyError: 'original_name'
cheshire_cat_core           | --- End of logging error ---
cheshire_cat_core           | --- Logging error in Loguru Handler #1 ---
cheshire_cat_core           | Record was: {'elapsed': datetime.timedelta(seconds=1, microseconds=146622), 'exception': None, 'extra': {}, 'file': (name='embedding.py', path='/usr/local/lib/python3.10/site-packages/fastembed/embedding.py'), 'function': '<module>', 'level': (name='WARNING', no=30, icon='⚠️'), 'line': 7, 'message': 'DefaultEmbedding, FlagEmbedding, JinaEmbedding are deprecated.Use from fastembed import TextEmbedding instead.', 'module': 'embedding', 'name': 'fastembed.embedding', 'process': (id=31, name='SpawnProcess-1'), 'thread': (id=136268812064576, name='MainThread'), 'time': datetime(2024, 6, 9, 8, 30, 38, 222034, tzinfo=datetime.timezone(datetime.timedelta(0), 'UTC'))}
cheshire_cat_core           | Traceback (most recent call last):
cheshire_cat_core           |   File "/usr/local/lib/python3.10/site-packages/loguru/_handler.py", line 184, in emit
cheshire_cat_core           |     formatted = precomputed_format.format_map(formatter_record)
cheshire_cat_core           | KeyError: 'original_name'
cheshire_cat_core           | --- End of logging error ---
Fetching 8 files: 100%|██████████| 8/8 [00:00<00:00, 62718.56it/s]
cheshire_cat_core           | /usr/local/lib/python3.10/site-packages/qdrant_client/qdrant_remote.py:123: UserWarning: Api key is used with unsecure connection.
cheshire_cat_core           |   warnings.warn("Api key is used with unsecure connection.")
cheshire_cat_vector_memory  | 2024-06-09T08:30:53.476329Z  INFO actix_web::middleware::logger: 172.19.0.4 "GET /collections HTTP/1.1" 200 105 "-" "python-httpx/0.27.0" 0.000136
cheshire_cat_vector_memory  | 2024-06-09T08:30:53.488991Z  INFO actix_web::middleware::logger: 172.19.0.4 "GET /collections/episodic HTTP/1.1" 200 450 "-" "python-httpx/0.27.0" 0.000211
cheshire_cat_vector_memory  | 2024-06-09T08:30:53.495510Z  INFO actix_web::middleware::logger: 172.19.0.4 "GET /collections/episodic/aliases HTTP/1.1" 200 114 "-" "python-httpx/0.27.0" 0.000082
cheshire_cat_vector_memory  | 2024-06-09T08:30:53.514742Z  INFO actix_web::middleware::logger: 172.19.0.4 "GET /collections/episodic HTTP/1.1" 200 450 "-" "python-httpx/0.27.0" 0.000171
cheshire_cat_vector_memory  | 2024-06-09T08:30:53.524782Z  INFO actix_web::middleware::logger: 172.19.0.4 "GET /collections HTTP/1.1" 200 105 "-" "python-httpx/0.27.0" 0.000080
cheshire_cat_vector_memory  | 2024-06-09T08:30:53.533089Z  INFO actix_web::middleware::logger: 172.19.0.4 "GET /collections/declarative HTTP/1.1" 200 449 "-" "python-httpx/0.27.0" 0.000108
cheshire_cat_vector_memory  | 2024-06-09T08:30:53.534032Z  INFO actix_web::middleware::logger: 172.19.0.4 "GET /collections/declarative/aliases HTTP/1.1" 200 78 "-" "python-httpx/0.27.0" 0.000051
cheshire_cat_core           | ERROR:    Traceback (most recent call last):
cheshire_cat_core           |   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 732, in lifespan
cheshire_cat_core           |     async with self.lifespan_context(app) as maybe_state:
cheshire_cat_core           |   File "/usr/local/lib/python3.10/contextlib.py", line 199, in __aenter__
cheshire_cat_core           |     return await anext(self.gen)
cheshire_cat_core           |   File "/app/cat/main.py", line 33, in lifespan
cheshire_cat_core           |     app.state.ccat = CheshireCat()
cheshire_cat_core           |   File "/app/cat/utils.py", line 172, in getinstance
cheshire_cat_core           |     cls.instances[class_] = class_(*args, **kwargs)
cheshire_cat_core           |   File "/app/cat/looking_glass/cheshire_cat.py", line 74, in __init__
cheshire_cat_core           |     self.load_memory()
cheshire_cat_core           |   File "/app/cat/looking_glass/cheshire_cat.py", line 247, in load_memory
cheshire_cat_core           |     self.memory = LongTermMemory(vector_memory_config=vector_memory_config)
cheshire_cat_core           |   File "/app/cat/memory/long_term_memory.py", line 17, in __init__
cheshire_cat_core           |     self.vectors = VectorMemory(**vector_memory_config)
cheshire_cat_core           |   File "/app/cat/memory/vector_memory.py", line 37, in __init__
cheshire_cat_core           |     collection = VectorMemoryCollection(
cheshire_cat_core           |   File "/app/cat/memory/vector_memory_collection.py", line 52, in __init__
cheshire_cat_core           |     self.check_embedding_size()
cheshire_cat_core           |   File "/app/cat/memory/vector_memory_collection.py", line 69, in check_embedding_size
cheshire_cat_core           |     == self.client.get_collection_aliases(self.collection_name)
cheshire_cat_core           | IndexError: list index out of range
cheshire_cat_core           |
cheshire_cat_core           | ERROR:    Application startup failed. Exiting.

Thanks in advance for your support

Answer 1 · 2024-06-09T14:15:32.000Z

The error seems related to the vector memory (i.e. Qdrant and its Python client). Try the compose.yml from this repo.
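
The traceback ends on a line that calls get_collection_aliases, and in the Qdrant log the aliases request for declarative returns a smaller body (78 bytes) than the one for episodic (114 bytes). One possible reading, not confirmed in this thread, is that the declarative collection has no alias, so taking the first element of the alias list fails. The sketch below checks this through Qdrant's REST endpoint GET /collections/{name}/aliases. It assumes the container name and port from the compose.yml above, has to run from inside the compose network, and uses hypothetical helper names. It is not the Cat's own code.

```python
import json
import urllib.request
from typing import Optional

# Assumption: Qdrant is reachable under its container name on the compose network.
QDRANT_URL = "http://cheshire_cat_vector_memory:6333"


def first_alias(aliases_payload: dict) -> Optional[str]:
    """Return the first alias name from an aliases response body, or None.

    Checking for an empty list first avoids the IndexError that plain
    indexing with [0] would raise.
    """
    aliases = aliases_payload.get("result", {}).get("aliases", [])
    return aliases[0]["alias_name"] if aliases else None


def fetch_first_alias(collection: str, base_url: str = QDRANT_URL) -> Optional[str]:
    """Fetch the aliases of one collection and return the first one, if any."""
    url = f"{base_url}/collections/{collection}/aliases"
    with urllib.request.urlopen(url, timeout=5.0) as resp:
        return first_alias(json.load(resp))
```

Calling fetch_first_alias("episodic") and fetch_first_alias("declarative") shows whether either collection lacks an alias. A None result would be consistent with the IndexError above, but it does not by itself explain how the collection ended up in that state.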

Answer 2 · 2024-06-09T19:41:08.000Z

Thanks for your tip, I tried this repo days ago, but with no luck.
Now it works, almost flawlessly... thank you so much!

Answer 3 · 2024-06-10T08:26:34.000Z

Moved to Local cat