paulbricman/conceptarium

Where is the docker data stored?

issmirnov opened this issue · 5 comments

Hey @paulbricman - really neat work! I'm deploying the conceptarium, ideoscope and lexiscore in docker via docker-compose.

Quick question: where is the core data stored?

I'd like to add a volume mount to my config so that I can take backups of the data. I looked at the source code and saw the metadata pickle, but this seems to live in /app.

Thanks for the interest! All the data unique to your conceptarium is stored in the conceptarium folder in the root of the project folder. You can easily take backups by making a GET request to <your conceptarium's URL>/dump, which gives you a neat archive of that folder. Alternatively, you might be able to symlink that folder to a mounted volume? Not really sure, let me know if you know how that works!
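For anyone who wants to script the /dump approach, here is a minimal sketch using only the Python standard library. The base URL, the output file name, and the helper names dump_url and download_dump are placeholders for illustration. The thread does not say what archive format the endpoint returns or whether it needs authentication, so this assumes a plain unauthenticated GET and saves the bytes as-is.

```python
import shutil
import urllib.request
from pathlib import Path


def dump_url(base_url: str) -> str:
    """Build the /dump endpoint URL from the conceptarium's base URL."""
    return base_url.rstrip("/") + "/dump"


def download_dump(base_url: str, dest: Path) -> Path:
    """Stream the response of GET <base_url>/dump into the file at dest."""
    with urllib.request.urlopen(dump_url(base_url)) as response, dest.open("wb") as out:
        # Copy in chunks so a large archive is never held fully in memory.
        shutil.copyfileobj(response, out)
    return dest


# Example call (hypothetical address and file name, adjust to your setup):
# download_dump("http://localhost:8320", Path("conceptarium-dump.bin"))
```

Running this from cron or a similar scheduler would give periodic snapshots without touching the container's filesystem.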

PS: Sometime this month or next I'll publish online versions of the ideoscope and lexiscore, so you wouldn't need to install them, just the conceptarium.

Understood. I took a look, and with docker-compose we can bind-mount a host folder into the container.

version: '3.4'

services:
  conceptarium:
    container_name: conceptarium
    image: "paulbricman/conceptarium"
    ports:
      - "8320:8000"
    volumes:
      - /path/to/data/conceptarium:/app/conceptarium
    restart: unless-stopped

That said, the current code crashes if the /app/conceptarium folder already exists:

  File "/app/./util.py", line 15, in init
    os.mkdir('conceptarium')
FileExistsError: [Errno 17] File exists: 'conceptarium'

Perhaps you could try using https://stackoverflow.com/questions/273192/how-can-i-safely-create-a-nested-directory-in-python instead:

from pathlib import Path
Path("conceptarium").mkdir(parents=True, exist_ok=True)

If you plan to add full support for docker, it might be useful to take a $CONCEPTARIUM_DATA_PATH variable, such that it can be set in the docker environment field and pulled with os.getenv('CONCEPTARIUM_DATA_PATH'). This way your data won't live inside the app, and you can decouple source code and generated data.
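To make the suggestion concrete, here is a rough sketch of how the variable proposed above could be combined with the Path fix. The helper name data_dir is made up for illustration, and falling back to the existing conceptarium folder when the variable is unset is an assumption of this sketch, not something the current code does.

```python
import os
from pathlib import Path


def data_dir() -> Path:
    """Return the data folder, creating it if it does not exist yet."""
    # CONCEPTARIUM_DATA_PATH is the variable proposed in this thread.
    # Defaulting to the relative "conceptarium" folder is an assumption
    # meant to keep existing installs working unchanged.
    path = Path(os.getenv("CONCEPTARIUM_DATA_PATH", "conceptarium"))
    # exist_ok=True avoids the FileExistsError when the folder is a mounted volume.
    path.mkdir(parents=True, exist_ok=True)
    return path
```

With something like this in place, the compose file could set CONCEPTARIUM_DATA_PATH under environment and mount a volume at that same path.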

Thanks for the Path suggestion, seems like a quick fix, I'll look into it. Regarding the better Docker support, could you elaborate on how being able to easily mount the folder might be useful for you?

TLDR: backups and upgrades that preserve data

Right now, the data for the conceptarium is stored inside the container's own filesystem. Containers are ephemeral by design. If I were to pull a new image and recreate the container, that folder would be lost, since the new container starts from a fresh filesystem.

By decoupling the data location, I can either store my data in a dedicated docker volume, or I can use my local filesystem and run my usual filesystem based backups on that folder.
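As an illustration of the filesystem-based option, here is a small sketch that archives a bind-mounted data folder on the host. The function name backup_folder, the archive naming scheme, and the example paths are all placeholders, and any real backup tool would do the same job.

```python
import shutil
import time
from pathlib import Path


def backup_folder(data_path: Path, backup_dir: Path) -> Path:
    """Write a timestamped .tar.gz of data_path into backup_dir."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    base_name = backup_dir / f"conceptarium-{stamp}"
    # make_archive appends the ".tar.gz" suffix and returns the final file name.
    archive = shutil.make_archive(str(base_name), "gztar", root_dir=str(data_path))
    return Path(archive)


# Example call (hypothetical host paths):
# backup_folder(Path("/path/to/data/conceptarium"), Path("/path/to/backups"))
```

Because the archive is taken on the host, it survives image upgrades and container recreation.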

Alright, makes sense. I'll add it as an enhancement issue for now.