cheshire-cat-ai/core

CheshireCat with Ollama - Application startup failed

Closed this issue · 3 comments

Description
Hi, and thank you for your awesome work. I am a newbie, so this "bug" may be caused by a mistake on my part.
I am on:
Arch Linux - 6.9.3-arch1-1 x86_64
Docker version 26.1.4, build 5650f9b102

I followed this guide and changed the compose.yml according to the instructions provided.

To Reproduce
I ran docker compose up: the images were pulled correctly and the containers were created successfully.

But I got this error:

cheshire_cat_core           | ERROR:    Traceback (most recent call last):
cheshire_cat_core           |   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 732, in lifespan
cheshire_cat_core           |     async with self.lifespan_context(app) as maybe_state:
cheshire_cat_core           |   File "/usr/local/lib/python3.10/contextlib.py", line 199, in __aenter__
cheshire_cat_core           |     return await anext(self.gen)
cheshire_cat_core           |   File "/app/cat/main.py", line 33, in lifespan
cheshire_cat_core           |     app.state.ccat = CheshireCat()
cheshire_cat_core           |   File "/app/cat/utils.py", line 172, in getinstance
cheshire_cat_core           |     cls.instances[class_] = class_(*args, **kwargs)
cheshire_cat_core           |   File "/app/cat/looking_glass/cheshire_cat.py", line 74, in __init__
cheshire_cat_core           |     self.load_memory()
cheshire_cat_core           |   File "/app/cat/looking_glass/cheshire_cat.py", line 247, in load_memory
cheshire_cat_core           |     self.memory = LongTermMemory(vector_memory_config=vector_memory_config)
cheshire_cat_core           |   File "/app/cat/memory/long_term_memory.py", line 17, in __init__
cheshire_cat_core           |     self.vectors = VectorMemory(**vector_memory_config)
cheshire_cat_core           |   File "/app/cat/memory/vector_memory.py", line 37, in __init__
cheshire_cat_core           |     collection = VectorMemoryCollection(
cheshire_cat_core           |   File "/app/cat/memory/vector_memory_collection.py", line 52, in __init__
cheshire_cat_core           |     self.check_embedding_size()
cheshire_cat_core           |   File "/app/cat/memory/vector_memory_collection.py", line 69, in check_embedding_size
cheshire_cat_core           |     == self.client.get_collection_aliases(self.collection_name)
cheshire_cat_core           | IndexError: list index out of range
cheshire_cat_core           |
cheshire_cat_core           | ERROR:    Application startup failed. Exiting.
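
The last frame points at check_embedding_size, and an IndexError there would be consistent with code that takes the first element of the alias list returned by get_collection_aliases for a collection that has no alias. The source of that method is not shown in this report, so the following is only a minimal sketch of that failure mode. Alias, AliasList and first_alias_name are hypothetical stand-ins, not the Cat's or the Qdrant client's actual classes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Alias:
    # Hypothetical stand-in for a single alias entry.
    alias_name: str


@dataclass
class AliasList:
    # Hypothetical stand-in for the object returned by an alias lookup.
    aliases: List[Alias] = field(default_factory=list)


def first_alias_name(response: AliasList) -> Optional[str]:
    """Return the first alias name, or None if the collection has no alias."""
    if not response.aliases:
        return None
    return response.aliases[0].alias_name


empty = AliasList()
try:
    empty.aliases[0]  # unguarded access on an empty list
except IndexError as exc:
    print(f"unguarded access fails: {exc}")

# The guarded helper reports the missing alias instead of raising.
print(first_alias_name(empty))
print(first_alias_name(AliasList([Alias("some_embedder_declarative")])))
```

If this is what happens, a collection that exists in the Qdrant storage without a matching alias would be enough to stop startup.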

Expected behavior
I successfully pulled my model (Mistral) with docker exec ollama_cat ollama pull mistral:7b-instruct-q2_K. If I run docker run --rm -it -p 1866:80 ghcr.io/cheshire-cat-ai/core:latest, I can open the Cat's GUI in my browser, but setting the language model to Ollama there does not work.

I think it cannot connect to the ollama_cat container, since running docker compose up does not work for me.

I have also tried to run CheshireCat-AI-Core and Ollama simultaneously, but with no success.
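
One way to test the connectivity hypothesis is to query Ollama from inside the Compose network. The sketch below assumes the container name ollama_cat and port 11434 from the compose file in this report, and uses Ollama's /api/tags endpoint, which lists the locally pulled models. ollama_base_url and list_model_names are hypothetical helper names. Note that a container started with a standalone docker run is not attached to the Compose project's network, so the name ollama_cat would not be expected to resolve from it.

```python
import json
import urllib.request


def ollama_base_url(host: str = "ollama_cat", port: int = 11434) -> str:
    """Build the base URL; the defaults are assumptions taken from the compose file."""
    return f"http://{host}:{port}"


def list_model_names(base_url: str, timeout: float = 5.0) -> list:
    """Ask Ollama which models are available locally.

    Raises OSError (including urllib.error.URLError) if the host cannot be
    resolved or reached, which is the case this check is meant to reveal.
    """
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        payload = json.load(resp)
    return [model["name"] for model in payload.get("models", [])]
```

Calling list_model_names(ollama_base_url()) from a Python shell inside the cheshire_cat_core container should either return the pulled model tags or raise an error that shows whether name resolution or the connection itself is the problem.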

Additional context
Here is my compose file (docker-compose.yml):

[fz@fzpc ollama-cat]$ cat docker-compose.yml
version: '3.7'

services:
  cheshire-cat-core:
    image: ghcr.io/cheshire-cat-ai/core:latest
    container_name: cheshire_cat_core
    depends_on:
      - cheshire-cat-vector-memory
      - ollama
    environment:
      - PYTHONUNBUFFERED=1
      - WATCHFILES_FORCE_POLLING=true
      - CORE_HOST=${CORE_HOST:-localhost}
      - CORE_PORT=${CORE_PORT:-1865}
      - QDRANT_HOST=${QDRANT_HOST:-cheshire_cat_vector_memory}
      - QDRANT_PORT=${QDRANT_PORT:-6333}
      - CORE_USE_SECURE_PROTOCOLS=${CORE_USE_SECURE_PROTOCOLS:-}
      - API_KEY=${API_KEY:-}
      - LOG_LEVEL=${LOG_LEVEL:-WARNING}
      - DEBUG=${DEBUG:-true}
      - SAVE_MEMORY_SNAPSHOTS=${SAVE_MEMORY_SNAPSHOTS:-false}
    ports:
      - ${CORE_PORT:-1865}:80
    volumes:
      - ./cat/static:/app/cat/static
      - ./cat/public:/app/cat/public
      - ./cat/plugins:/app/cat/plugins
      - ./cat/metadata.json:/app/metadata.json
    restart: unless-stopped

  cheshire-cat-vector-memory:
    image: qdrant/qdrant:latest
    container_name: cheshire_cat_vector_memory
    expose:
      - 6333
    volumes:
      - ./cat/long_term_memory/vector:/qdrant/storage
    restart: unless-stopped

  ollama:
    container_name: ollama_cat
    image: ollama/ollama:latest
    volumes:
      - ./ollama:/root/.ollama
    expose:
      - 11434
    environment:
      - gpus=all
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

My image list:

[fz@fzpc ollama-cat]$ docker image list
REPOSITORY                     TAG       IMAGE ID       CREATED       SIZE
ghcr.io/cheshire-cat-ai/core   1.6.2     06a81f20b5b2   7 days ago    1.29GB
ghcr.io/cheshire-cat-ai/core   latest    06a81f20b5b2   7 days ago    1.29GB
ollama/ollama                  0.1.39    fa86221dbf8c   11 days ago   464MB
qdrant/qdrant                  v1.9.1    84938a05ba4e   5 weeks ago   156MB

Tree of the ollama/ directory:

[fz@fzpc ollama-cat]$ tree ollama/
ollama/
├── id_ed25519
├── id_ed25519.pub
└── models
    ├── blobs
    │   ├── sha256-22e1b2e8dc2fbc3ac38b50f59e49f594034462c1cd02764353a8a076d97c3a59
    │   ├── sha256-43070e2d4e532684de521b885f385d0841030efa2b1a20bafb76133a5e1379c1
    │   ├── sha256-6547352386940a480d6fa11958bb027b757b628c98266c0ae20886f7fd9068d6
    │   ├── sha256-7933e7c155189d06a579600b0107ea73910d8b97a54efdb63cc8dee3198a4eb0
    │   └── sha256-ed11eda7790d05b49395598a42b155812b17e263214292f7b87d15e14003d337
    └── manifests
        └── registry.ollama.ai
            └── library
                └── mistral
                    └── 7b-instruct-q2_K

7 directories, 8 files

And the complete log, including the traceback:

[fz@fzpc ollama-cat]$ docker compose up
WARN[0000] /home/fz/Documents/Github/ollama-cat/docker-compose.yml: `version` is obsolete
[+] Running 3/0
 ✔ Container ollama_cat                  Created                                                   0.0s
 ✔ Container cheshire_cat_vector_memory  Created                                                   0.0s
 ✔ Container cheshire_cat_core           Created                                                   0.0s
Attaching to cheshire_cat_core, cheshire_cat_vector_memory, ollama_cat
cheshire_cat_vector_memory  |            _                 _
cheshire_cat_vector_memory  |   __ _  __| |_ __ __ _ _ __ | |_
cheshire_cat_vector_memory  |  / _` |/ _` | '__/ _` | '_ \| __|
cheshire_cat_vector_memory  | | (_| | (_| | | | (_| | | | | |_
cheshire_cat_vector_memory  |  \__, |\__,_|_|  \__,_|_| |_|\__|
cheshire_cat_vector_memory  |     |_|
cheshire_cat_vector_memory  |
cheshire_cat_vector_memory  | Version: 1.9.1, build: 97c107f2
cheshire_cat_vector_memory  | Access web UI at http://localhost:6333/dashboard
cheshire_cat_vector_memory  |
cheshire_cat_vector_memory  | 2024-06-09T08:30:29.879122Z  INFO storage::content_manager::consensus::persistent: Loading raft state from ./storage/raft_state.json
cheshire_cat_vector_memory  | 2024-06-09T08:30:29.928418Z  INFO storage::content_manager::toc: Loading collection: declarative
ollama_cat                  | 2024/06/09 08:30:30 routes.go:1028: INFO server config env="map[OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST: OLLAMA_KEEP_ALIVE: OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:1 OLLAMA_MAX_QUEUE:512 OLLAMA_MAX_VRAM:0 OLLAMA_MODELS: OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:*] OLLAMA_RUNNERS_DIR: OLLAMA_TMPDIR:]"
ollama_cat                  | time=2024-06-09T08:30:30.628Z level=INFO source=images.go:729 msg="total blobs: 5"
ollama_cat                  | time=2024-06-09T08:30:30.628Z level=INFO source=images.go:736 msg="total unused blobs removed: 0"
ollama_cat                  | time=2024-06-09T08:30:30.628Z level=INFO source=routes.go:1074 msg="Listening on [::]:11434 (version 0.1.39)"
ollama_cat                  | time=2024-06-09T08:30:30.684Z level=INFO source=payload.go:30 msg="extracting embedded files" dir=/tmp/ollama1523340118/runners
cheshire_cat_vector_memory  | 2024-06-09T08:30:32.343693Z  INFO collection::shards::local_shard: Recovering collection declarative: 0/0 (0%)
cheshire_cat_vector_memory  | 2024-06-09T08:30:32.343715Z  INFO collection::shards::local_shard: Recovered collection declarative: 0/0 (100%)
cheshire_cat_vector_memory  | 2024-06-09T08:30:32.344820Z  INFO storage::content_manager::toc: Loading collection: episodic
ollama_cat                  | time=2024-06-09T08:30:34.145Z level=INFO source=payload.go:44 msg="Dynamic LLM libraries [rocm_v60002 cpu cpu_avx cpu_avx2 cuda_v11]"
cheshire_cat_vector_memory  | 2024-06-09T08:30:34.183915Z  INFO collection::shards::local_shard: Recovering collection episodic: 0/0 (0%)
cheshire_cat_vector_memory  | 2024-06-09T08:30:34.183937Z  INFO collection::shards::local_shard: Recovered collection episodic: 0/0 (100%)
cheshire_cat_vector_memory  | 2024-06-09T08:30:34.185059Z  INFO qdrant: Distributed mode disabled
cheshire_cat_vector_memory  | 2024-06-09T08:30:34.185079Z  INFO qdrant: Telemetry reporting enabled, id: 43cfbf76-580f-4dcc-b35d-97cb9111486a
cheshire_cat_vector_memory  | 2024-06-09T08:30:34.186845Z  INFO qdrant::actix: TLS disabled for REST API
cheshire_cat_vector_memory  | 2024-06-09T08:30:34.186935Z  INFO qdrant::actix: Qdrant HTTP listening on 6333
cheshire_cat_vector_memory  | 2024-06-09T08:30:34.186950Z  INFO actix_server::builder: Starting 11 workers
cheshire_cat_vector_memory  | 2024-06-09T08:30:34.186957Z  INFO actix_server::server: Actix runtime found; starting in Actix runtime
cheshire_cat_vector_memory  | 2024-06-09T08:30:34.189487Z  INFO qdrant::tonic: Qdrant gRPC listening on 6334
cheshire_cat_vector_memory  | 2024-06-09T08:30:34.189516Z  INFO qdrant::tonic: TLS disabled for gRPC API
ollama_cat                  | time=2024-06-09T08:30:34.274Z level=INFO source=types.go:71 msg="inference compute" id=GPU-4151ce35-c2d3-a276-826b-00260b538d12 library=cuda compute=7.5 driver=12.4 name="NVIDIA GeForce RTX 2060" total="5.8 GiB" available="5.7 GiB"
cheshire_cat_core           | --- Logging error in Loguru Handler #1 ---
cheshire_cat_core           | Record was: {'elapsed': datetime.timedelta(seconds=1, microseconds=138367), 'exception': None, 'extra': {}, 'file': (name='embedding.py', path='/usr/local/lib/python3.10/site-packages/fastembed/embedding.py'), 'function': '<module>', 'level': (name='WARNING', no=30, icon='⚠️'), 'line': 7, 'message': 'DefaultEmbedding, FlagEmbedding, JinaEmbedding are deprecated.Use from fastembed import TextEmbedding instead.', 'module': 'embedding', 'name': 'fastembed.embedding', 'process': (id=7, name='MainProcess'), 'thread': (id=128101330413376, name='MainThread'), 'time': datetime(2024, 6, 9, 8, 30, 35, 107772, tzinfo=datetime.timezone(datetime.timedelta(0), 'UTC'))}
cheshire_cat_core           | Traceback (most recent call last):
cheshire_cat_core           |   File "/usr/local/lib/python3.10/site-packages/loguru/_handler.py", line 184, in emit
cheshire_cat_core           |     formatted = precomputed_format.format_map(formatter_record)
cheshire_cat_core           | KeyError: 'original_name'
cheshire_cat_core           | --- End of logging error ---
cheshire_cat_core           | --- Logging error in Loguru Handler #1 ---
cheshire_cat_core           | Record was: {'elapsed': datetime.timedelta(seconds=1, microseconds=146622), 'exception': None, 'extra': {}, 'file': (name='embedding.py', path='/usr/local/lib/python3.10/site-packages/fastembed/embedding.py'), 'function': '<module>', 'level': (name='WARNING', no=30, icon='⚠️'), 'line': 7, 'message': 'DefaultEmbedding, FlagEmbedding, JinaEmbedding are deprecated.Use from fastembed import TextEmbedding instead.', 'module': 'embedding', 'name': 'fastembed.embedding', 'process': (id=31, name='SpawnProcess-1'), 'thread': (id=136268812064576, name='MainThread'), 'time': datetime(2024, 6, 9, 8, 30, 38, 222034, tzinfo=datetime.timezone(datetime.timedelta(0), 'UTC'))}
cheshire_cat_core           | Traceback (most recent call last):
cheshire_cat_core           |   File "/usr/local/lib/python3.10/site-packages/loguru/_handler.py", line 184, in emit
cheshire_cat_core           |     formatted = precomputed_format.format_map(formatter_record)
cheshire_cat_core           | KeyError: 'original_name'
cheshire_cat_core           | --- End of logging error ---
Fetching 8 files: 100%|██████████| 8/8 [00:00<00:00, 62718.56it/s]
cheshire_cat_core           | /usr/local/lib/python3.10/site-packages/qdrant_client/qdrant_remote.py:123: UserWarning: Api key is used with unsecure connection.
cheshire_cat_core           |   warnings.warn("Api key is used with unsecure connection.")
cheshire_cat_vector_memory  | 2024-06-09T08:30:53.476329Z  INFO actix_web::middleware::logger: 172.19.0.4 "GET /collections HTTP/1.1" 200 105 "-" "python-httpx/0.27.0" 0.000136
cheshire_cat_vector_memory  | 2024-06-09T08:30:53.488991Z  INFO actix_web::middleware::logger: 172.19.0.4 "GET /collections/episodic HTTP/1.1" 200 450 "-" "python-httpx/0.27.0" 0.000211
cheshire_cat_vector_memory  | 2024-06-09T08:30:53.495510Z  INFO actix_web::middleware::logger: 172.19.0.4 "GET /collections/episodic/aliases HTTP/1.1" 200 114 "-" "python-httpx/0.27.0" 0.000082
cheshire_cat_vector_memory  | 2024-06-09T08:30:53.514742Z  INFO actix_web::middleware::logger: 172.19.0.4 "GET /collections/episodic HTTP/1.1" 200 450 "-" "python-httpx/0.27.0" 0.000171
cheshire_cat_vector_memory  | 2024-06-09T08:30:53.524782Z  INFO actix_web::middleware::logger: 172.19.0.4 "GET /collections HTTP/1.1" 200 105 "-" "python-httpx/0.27.0" 0.000080
cheshire_cat_vector_memory  | 2024-06-09T08:30:53.533089Z  INFO actix_web::middleware::logger: 172.19.0.4 "GET /collections/declarative HTTP/1.1" 200 449 "-" "python-httpx/0.27.0" 0.000108
cheshire_cat_vector_memory  | 2024-06-09T08:30:53.534032Z  INFO actix_web::middleware::logger: 172.19.0.4 "GET /collections/declarative/aliases HTTP/1.1" 200 78 "-" "python-httpx/0.27.0" 0.000051
cheshire_cat_core           | ERROR:    Traceback (most recent call last):
cheshire_cat_core           |   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 732, in lifespan
cheshire_cat_core           |     async with self.lifespan_context(app) as maybe_state:
cheshire_cat_core           |   File "/usr/local/lib/python3.10/contextlib.py", line 199, in __aenter__
cheshire_cat_core           |     return await anext(self.gen)
cheshire_cat_core           |   File "/app/cat/main.py", line 33, in lifespan
cheshire_cat_core           |     app.state.ccat = CheshireCat()
cheshire_cat_core           |   File "/app/cat/utils.py", line 172, in getinstance
cheshire_cat_core           |     cls.instances[class_] = class_(*args, **kwargs)
cheshire_cat_core           |   File "/app/cat/looking_glass/cheshire_cat.py", line 74, in __init__
cheshire_cat_core           |     self.load_memory()
cheshire_cat_core           |   File "/app/cat/looking_glass/cheshire_cat.py", line 247, in load_memory
cheshire_cat_core           |     self.memory = LongTermMemory(vector_memory_config=vector_memory_config)
cheshire_cat_core           |   File "/app/cat/memory/long_term_memory.py", line 17, in __init__
cheshire_cat_core           |     self.vectors = VectorMemory(**vector_memory_config)
cheshire_cat_core           |   File "/app/cat/memory/vector_memory.py", line 37, in __init__
cheshire_cat_core           |     collection = VectorMemoryCollection(
cheshire_cat_core           |   File "/app/cat/memory/vector_memory_collection.py", line 52, in __init__
cheshire_cat_core           |     self.check_embedding_size()
cheshire_cat_core           |   File "/app/cat/memory/vector_memory_collection.py", line 69, in check_embedding_size
cheshire_cat_core           |     == self.client.get_collection_aliases(self.collection_name)
cheshire_cat_core           | IndexError: list index out of range
cheshire_cat_core           |
cheshire_cat_core           | ERROR:    Application startup failed. Exiting.

Thanks in advance for your support.

The error seems related to the vector memory (i.e. Qdrant and its Python client). Try using the compose.yml in this repo.
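
To check the vector-memory hypothesis, the alias list of each collection can be inspected through the same Qdrant REST route that appears in the log (GET /collections/<name>/aliases). This is a minimal sketch, assuming the response carries the aliases under result.aliases and that it runs somewhere that can resolve cheshire_cat_vector_memory (port 6333 is only exposed inside the Compose network in the compose file above). alias_names and fetch_alias_names are hypothetical helpers.

```python
import json
import urllib.request


def alias_names(payload: dict) -> list:
    """Extract alias names from a decoded aliases response (assumed shape)."""
    result = payload.get("result") or {}
    return [alias["alias_name"] for alias in result.get("aliases", [])]


def fetch_alias_names(
    collection: str,
    base_url: str = "http://cheshire_cat_vector_memory:6333",
) -> list:
    """Fetch and parse the alias list of one collection over REST."""
    url = f"{base_url}/collections/{collection}/aliases"
    with urllib.request.urlopen(url, timeout=5.0) as resp:
        return alias_names(json.load(resp))


# Illustrative payload, not captured from the reporter's instance.
print(alias_names({"result": {"aliases": []}, "status": "ok"}))
```

An empty list for declarative or episodic would match the IndexError in the traceback.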

Thanks for the tip. I tried this repo days ago with no luck, but now it works almost flawlessly. Thank you so much!

Moved to Local cat