felddy/foundryvtt-docker

Remaining `config.json.lock` folder after stopping the container

florianduros opened this issue · 12 comments

Bug description

Hi,

First of all, thank you for your work on this docker image.

Every time, I'm restarting my foundry vtt server, the restart fails.
When the container stops, the config.json.lock folder is not removed and I have to manually remove it to make the restart work.

Steps to reproduce

  • Start the container (I'm doing it via docker compose)
  • Stop the container

Expected behavior

I'm expecting to be able to restart the container without to have to manually delete the lock file.

Container metadata

com.foundryvtt.version = "11.313"
org.opencontainers.image.authors = "markf+github@geekpad.com"
org.opencontainers.image.created = "2023-10-04T18:50:35.076Z"
org.opencontainers.image.description = "An easy-to-deploy Dockerized Foundry Virtual Tabletop server."
org.opencontainers.image.licenses = "MIT"
org.opencontainers.image.revision = "97183097680741b87e1171aa638a3491c59bd46d"
org.opencontainers.image.source = "https://github.com/felddy/foundryvtt-docker"
org.opencontainers.image.title = "foundryvtt-docker"
org.opencontainers.image.url = "https://github.com/felddy/foundryvtt-docker"
org.opencontainers.image.vendor = "Geekpad"
org.opencontainers.image.version = "11.313.0"

Relevant log output

* Connection #0 to host foundryvtt.s3.amazonaws.com left intact
Entrypoint | 2023-10-19 09:55:26 | [info] Installing Foundry Virtual Tabletop 11.313
Entrypoint | 2023-10-19 09:55:26 | [debug] Checking mime-type of release file.
Entrypoint | 2023-10-19 09:55:26 | [debug] Found mime-type: application/zip
Entrypoint | 2023-10-19 09:55:26 | [debug] Extracting release file.
Entrypoint | 2023-10-19 09:55:28 | [debug] Installation completed.
Entrypoint | 2023-10-19 09:55:28 | [info] Preserving release archive file in cache.
Entrypoint | 2023-10-19 09:55:28 | [debug] Patching GUI update and configuration messages.
Entrypoint | 2023-10-19 09:55:28 | [info] Not modifying existing installation license key.
Entrypoint | 2023-10-19 09:55:28 | [info] Setting data directory permissions.
Entrypoint | 2023-10-19 09:55:28 | [debug] Setting ownership of /data to 1000:1000.
Entrypoint | 2023-10-19 09:55:28 | [debug] Completed setting directory permissions.
Entrypoint | 2023-10-19 09:55:28 | [info] Starting launcher with uid:gid as 1000:1000.
Launcher | 2023-10-19 09:55:28 | [debug] Ensuring /data/Config directory exists.
Launcher | 2023-10-19 09:55:28 | [warn] CONTAINER_PRESERVE_CONFIG is set: Not updating options.json
Launcher | 2023-10-19 09:55:28 | [warn] CONTAINER_PRESERVE_CONFIG is set: Not updating admin.txt
Launcher | 2023-10-19 09:55:28 | [info] FOUNDRY_IP_DISCOVERY is set to false: Disabling IP discovery.
Launcher | 2023-10-19 09:55:28 | [info] Starting Foundry Virtual Tabletop.
FoundryVTT | 2023-10-19 09:55:29 | [info] Running on Node.js - Version 18.18.0
FoundryVTT | 2023-10-19 09:55:29 | [info] Foundry Virtual Tabletop - Version 11 Build 313
FoundryVTT | 2023-10-19 09:55:29 | [info] User Data Directory - "/data"
FoundryVTT | 2023-10-19 09:55:29 | [info] Application Options:
{
  "awsConfig": null,
  "compressSocket": false,
  "compressStatic": false,
  "cssTheme": "fantasy",
  "fullscreen": false,
  "hostname": "*******,
  "hotReload": false,
  "language": "fr.fr-core",
  "localHostname": null,
  "passwordSalt": null,
  "port": 30000,
  "protocol": null,
  "proxyPort": null,
  "proxySSL": false,
  "routePrefix": null,
  "sslCert": null,
  "sslKey": null,
  "telemetry": false,
  "updateChannel": "stable",
  "upnp": false,
  "upnpLeaseDuration": null,
  "world": null,
  "deleteNEDB": false,
  "serviceConfig": null,
  "adminPassword": "••••••••••••••••"

}
FoundryVTT | 2023-10-19 09:55:29 | [info] Software license verification succeeded
FoundryVTT | 2023-10-19 09:55:29 | [info] Server started and listening on port 30000
FoundryVTT | 2023-10-19 09:55:31 | [info] Created client session 2a78324a87b57e3d6cc52a0d
FoundryVTT | 2023-10-19 09:55:31 | [warn] The module "gm-screen" contains "dependencies" which is deprecated in favor of "relationships.requires"
Deprecated since Version 10
Backwards-compatible support will be removed in Version 13

// Restart

Entrypoint | 2023-10-19 09:55:49 | [debug] Timezone set to: Europe/Paris
Entrypoint | 2023-10-19 09:55:49 | [info] Starting felddy/foundryvtt container v11.313.0
Entrypoint | 2023-10-19 09:55:49 | [debug] CONTAINER_VERBOSE set.  Debug logging enabled.
Entrypoint | 2023-10-19 09:55:49 | [debug] Running as: uid=0(root) gid=0(root) groups=0(root),1(bin),2(daemon),3(sys),4(adm),6(disk),10(wheel),11(floppy),20(dialout),26(tape),27(video)
Entrypoint | 2023-10-19 09:55:49 | [debug] Environment: CONTAINER_PRESERVE_CONFIG=true
CONTAINER_VERBOSE=true
FOUNDRY_ADMIN_KEY=[REDACTED]
FOUNDRY_GID=1000
FOUNDRY_HOME=/home/foundry
FOUNDRY_HOSTNAME=*******
FOUNDRY_IP_DISCOVERY=false
FOUNDRY_LANGUAGE=fr.core
FOUNDRY_PASSWORD=[REDACTED]
FOUNDRY_UID=1000
FOUNDRY_USERNAME=****
FOUNDRY_VERSION=11.313
HOME=/root
HOSTNAME=foundryvtt
NODE_VERSION=18.18.0
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/home/foundry
SHLVL=1
TIMEZONE=Europe/Paris
YARN_VERSION=1.22.19
Entrypoint | 2023-10-19 09:55:49 | [info] Foundry Virtual Tabletop 11.313 is installed.
Entrypoint | 2023-10-19 09:55:49 | [info] Not modifying existing installation license key.
Entrypoint | 2023-10-19 09:55:49 | [info] Setting data directory permissions.
Entrypoint | 2023-10-19 09:55:49 | [debug] Setting ownership of /data to 1000:1000.
Entrypoint | 2023-10-19 09:55:49 | [debug] Completed setting directory permissions.
Entrypoint | 2023-10-19 09:55:49 | [info] Starting launcher with uid:gid as 1000:1000.
Launcher | 2023-10-19 09:55:49 | [debug] Ensuring /data/Config directory exists.
Launcher | 2023-10-19 09:55:49 | [warn] CONTAINER_PRESERVE_CONFIG is set: Not updating options.json
Launcher | 2023-10-19 09:55:49 | [warn] CONTAINER_PRESERVE_CONFIG is set: Not updating admin.txt
Launcher | 2023-10-19 09:55:49 | [info] FOUNDRY_IP_DISCOVERY is set to false: Disabling IP discovery.
Launcher | 2023-10-19 09:55:49 | [info] Starting Foundry Virtual Tabletop.
FoundryVTT | 2023-10-19 09:55:49 | [info] Running on Node.js - Version 18.18.0
FoundryVTT | 2023-10-19 09:55:49 | [info] Foundry Virtual Tabletop - Version 11 Build 313
FoundryVTT | 2023-10-19 09:55:49 | [info] User Data Directory - "/data"
FoundryVTT | 2023-10-19 09:55:49 | [error] A fatal error occurred while trying to start the Foundry Virtual Tabletop server: Foundry VTT cannot start in this directory which is already locked by another process.
Error: A fatal error occurred while trying to start the Foundry Virtual Tabletop server: Foundry VTT cannot start in this directory which is already locked by another process.
    at _acquireLockFile (file:///home/foundry/resources/app/dist/init.mjs:1:4977)
    at async _initializeCriticalFunctions (file:///home/foundry/resources/app/dist/init.mjs:1:2593)
    at async Module.initialize (file:///home/foundry/resources/app/dist/init.mjs:1:1564)
Launcher | 2023-10-19 09:55:50 | [error] Node process exited with code 1

Code of Conduct

  • I agree to follow this project's Code of Conduct

I've run into the same problem (using podman). The node process isn't receiving SIGTERM, and gets eventually SIGKILLed, leaving locks behind. Seems to be two problems:

  1. Same as envoyproxy/envoy#13944. Applying the same fix to entrypoint.sh (exec su-exec ...) indeed has 'launcher.sh' running as PID 1.

  2. But spawning a shell in the container and sending SIGTERM to launcher.sh still doesn't kill the container. (Sending it straight to node does) .

Too late to look further today.

Actually, podman run --init ... just does the trick. I would guess that is also true for Docker.

Thanks @haraldschioberg , it works when I'm doing a docker compose down and then a docker compose up. I used the init field https://docs.docker.com/compose/compose-file/compose-file-v3/#init

But, if I'm restarting the container with a docker compose restart, I still have the error.

In case of a down and an up, we keep the same PID thanks to the init field and foundryvtt sees that the lock is from his own process and ignores it. However, I wonder if in a case of a docker restart, the init field is ignored and the PID changes. Kind of docker magic ?

(sorry, I closed the issue by mistake)

felddy commented

I think this has to do with how podman starting the container differs from other implementations. Specifically the uid:gid of the entrypoint. Can you enable verbose logging (set CONTAINER_VERBOSE to true) and update the logs in your issue description with the verbose output. I expect that it uid that is being used isn't allowing su-exec to work all of its magic.

I'll see if I can replicate the behavior you are seeing. Then I should be able to come up with a fix for the signal handling either using some traps or coaxing the exec into working correctly.

See:

@felddy I updated the logs

felddy commented

@florianduros I think I may have solution implemented in PR #822. I've correctly implemented the use of exec that I meant to do originally. I am very confident this will solve this issue.

To test:

  1. Please change your docker compose to use image felddy/foundryvtt:issue-813
  2. Make sure to disable the init option if you enabled it in your configuration
  3. Test starting up, and tearing down, as well as the restart that caused trouble

Let me know how it goes.

If this all works I'll merge the PR and it can go live with the next release of FoundryVTT.

Yep! Thanks.

Here is me testing:

$ podman run -d --name foundryvtt --publish 30000:30000/tcp --hostname foundry --stop-signal SIGTERM --mount type=bind,source=/home/harald/foundrydata,target=/data --env CONTAINER_CACHE=/data/container_cache/ --env CONTAINER_PRESERVE_CONFIG=true docker.io/felddy/foundryvtt:issue-813
7db013503342ed1ada1b7c81d1cce975b12a7af49bd71a8c3d01464c38b77691
$ ps faux
[...]
harald    120452  0.0  0.0   9004  2304 ?        Ss   12:41   0:00  \_ /usr/bin/conmon --api-version 1 -c 955782d14fb5875082db57d30b26d23c8a9c990df2f1087db8e0faf09b3ec6b3 -u 955782d14fb587508
100420    120457 15.7  0.3 10860008 122440 ?     Ssl  12:41   0:00      \_ node resources/app/main.mjs --port=30000 --headless --noupdate --dataPath=/data
[...]
$ podman exec foundryvtt ps 
PID   USER     TIME  COMMAND
    1 foundry   0:00 node resources/app/main.mjs --port=30000 --headless --noupdate --dataPath=/data
   57 root      0:00 ps
$ time podman stop foundryvtt 
foundryvtt

real	0m0.148s
user	0m0.034s
sys	0m0.032s

perfect.

felddy commented

I'm glad that helped. Thanks for the quick turn-around on the testing. I'm going to merge that change in, and it will be included in the next release.

For now you can keep using :issue-813 or switch over to :edge or :nightly until the next release drops.

Thank you again for reporting this!

felddy commented

@haraldschioberg I realized this morning that the fix I implemented for this issue will inadvertently bypass some DDOS protections for the Foundry S3 bucket. I'm going to reopen it and push a slightly different fix. Once I get the new fix up I would appreciate if you could test again with the new image.

I expect this will happen over the next half hour. I comment again, once the new image is built.

felddy commented

The new image is built.

When you get a chance could you please pull and test with felddy/foundryvtt:improvement-signals. Same tests as above.

felddy commented

I'm pretty confident that the code changes were successful. I just rebased the PR and it will auto-merge, and close this issue once the CI tests pass.

If this issue persists please feel free to reopen this issue.