cheshire-cat-ai/core

[Feature] Improve docker by giving it some volumes

Closed this issue · 5 comments

Here is the list of volume paths that I propose:

services:
  cheshire-cat-core:
    image: ghcr.io/cheshire-cat-ai/core:latest
    container_name: cheshire_cat_core
    ...
    volumes:
      - cheshire-cat-data:/app/cat/data
      - cheshire-cat-plugins:/app/cat/plugins
    ...

volumes:
  cheshire-cat-data:
  cheshire-cat-plugins:

Can you give a complete example of this feature?
What are the pros compared with the volume mount we have now? Is it just another way to declare a filesystem binding/mapping?

My knowledge of Docker Volumes is:

  • Volumes are managed by Docker, and you cannot access them directly on the host the way we do right now.
  • Imagine developing a plugin: how do I create the folder where I put all my code?

In fact, your understanding of Docker volumes is correct (in terms of how they work); it is just the deployment guide that is not very clear:

  • The docker-compose-full.yml requires mounting the whole core directory, which contains not only the code but also the plugins and data.
  • The docker-compose-full.yml mentioned above also needs the repository to be cloned, which means it is not usable with ghcr.io/cheshire-cat-ai/core:latest (because mounting the core directory also overrides /app inside the container).
  • The deployment using the docker command in the Quick start section does not have any volume, which means it does not keep any data.

So, here are some suggestions that can improve the deployment using docker:

  • It's recommended to group all user-editable data in one place. For example, WordPress keeps it in the wp-content folder, which can be a volume while the rest is read-only. In our case, it would be nice if plugins and data were both grouped inside one directory.
  • Inside the Dockerfile, you can specify the data directory using the VOLUME instruction. For example, see the Dockerfile of postgresql.
  • This is just a declarative way to specify the volume. By default, as you say, the volume will be managed by Docker (i.e. a docker command without any -v). The user can always override this behavior by using -v on the docker command (or volumes: in docker compose).
  • You can have a docker-compose.dev.yml that mounts the whole core directory inside the container, for development purposes.
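
As a rough illustration of the last suggestion, a development compose file could look like the sketch below. The file name docker-compose.dev.yml comes from the suggestion itself; the build context ./core is an assumption about where the Dockerfile lives in a cloned repository, and port mappings and other settings are left out.

```yaml
# docker-compose.dev.yml (sketch only, not the project's actual file)
services:
  cheshire-cat-core:
    # Assumption: the repository is cloned and the image is built locally
    # from the ./core directory instead of being pulled from ghcr.io.
    build:
      context: ./core
    container_name: cheshire_cat_core
    volumes:
      # Bind mount: the host's ./core directory replaces /app in the
      # container, so code, plugins and data are all editable from the host.
      - ./core:/app
```

Because the bind mount hides whatever the image shipped under /app, this layout only makes sense when the host directory contains the full source tree, which is why it fits development rather than image-based deployment.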

The deployment using the docker command in the Quick start section does not have any volume, which means it does not keep any data.

I agree that this needs to change!

You can have a docker-compose.dev.yml that mounts the whole core directory inside the container, for development purposes.

I agree with that too.
Would you like to resolve those two with a PR? That would be great!

The docker-compose-full.yml requires mounting the whole core directory, which contains not only the code but also the plugins and data.
The docker-compose-full.yml mentioned above also needs the repository to be cloned, which means it is not usable with ghcr.io/cheshire-cat-ai/core:latest (because mounting the core directory also overrides /app inside the container).

@pieroit's idea is: you want to develop? Clone the repo and use the compose file provided. You want the service? Use the image provided.
I proposed using the image, but then I discovered what you said: the mount overrides /app in the container and makes it unusable.
At the time, I didn't have time to do any research, so I ask you: is there a way to resolve this problem?

Also, in the develop branch the docker-compose-full.yml was deleted because most people had problems with the db, especially on Windows.
So from the new version on we will have only one container, where the db is a simple local file managed by QdrantClient.

Inside the Dockerfile, you can specify the data directory using the VOLUME instruction.

Maybe this answers the question?

This is just a declarative way to specify the volume. By default, as you say, the volume will be managed by Docker (i.e. a docker command without any -v). The user can always override this behavior by using -v on the docker command (or volumes: in docker compose).

OK, so the idea is to have a specific name for the volume. When the user needs to change the volume, they only need to change the path of that volume.
I tried yesterday but had problems creating the volumes, which is why I asked for an example.
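
For reference, a minimal sketch of what such an example might look like with named volumes. The service, image and mount paths are the ones proposed at the top of this issue; port mappings and other settings are omitted, and the commented-out host path is purely illustrative.

```yaml
services:
  cheshire-cat-core:
    image: ghcr.io/cheshire-cat-ai/core:latest
    container_name: cheshire_cat_core
    volumes:
      # Named volumes: Docker creates and manages the storage itself.
      - cheshire-cat-data:/app/cat/data
      - cheshire-cat-plugins:/app/cat/plugins
      # To use a host folder instead, replace the volume name with a path
      # (hypothetical path, shown for illustration only):
      # - ./plugins:/app/cat/plugins

# The top-level declaration must be a mapping (each name followed by a
# colon), not a list of names; a list here is rejected by Compose.
volumes:
  cheshire-cat-data:
  cheshire-cat-plugins:
```

One practical difference worth noting: when an empty named volume is mounted for the first time, Docker copies the image's existing content at that path into it, whereas a bind mount simply hides whatever the image had there.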

@ngxson thanks for your suggestions

  • docker-compose-full.yml has been deleted; it only created confusion
  • I do not agree on named volumes: I know they are the recommended way, but they are not as user friendly as bind mounts
  • I agree on having only one volume instead of 2 or 3; let's fix this for version 2 of the Cat

Thanks also to @valentimarco for joining the conversation

Sorry for the late response. Thanks for fixing that.

About the "named volumes": personally I prefer bind mounts (same as you), but since the README states "🐋 Production ready - 100% dockerized", I believe supporting named volumes is a requirement.

Even though 99% of users still prefer bind mounts, there's 1% of users who cannot use bind mounts at all. For example, on Kubernetes there are many types of volume, and the bind mount option (a.k.a. hostPath) is actually not recommended.
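
To make the Kubernetes case concrete, here is a hedged sketch of how the data directory might be backed by a PersistentVolumeClaim instead of a host path. The resource names and the 1Gi size are hypothetical; only the image and the /app/cat/data path come from this thread.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cheshire-cat-data        # hypothetical claim name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi               # arbitrary size, for illustration only
---
apiVersion: v1
kind: Pod
metadata:
  name: cheshire-cat-core        # hypothetical pod name
spec:
  containers:
    - name: core
      image: ghcr.io/cheshire-cat-ai/core:latest
      volumeMounts:
        # The cluster, not a host directory, provides the storage.
        - name: data
          mountPath: /app/cat/data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: cheshire-cat-data
```

This only works cleanly if the image keeps its writable data under well-defined paths, which is the same requirement as for named volumes in Docker.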

This is not an urgent thing to do; I just want to fulfill that "100% dockerized" promise. I will have a look when I have more time. Thanks again.