Docker container 4.0.7 failing to start
Closed this issue · 5 comments
Problem
The neo4j Docker image has started failing to start with a permissions error on logs. I'm specifying version 4.0
which currently maps to 4.0.7
. I am not mapping any volumes. My builds worked yesterday when 4.0.6
was the latest 4.0
version.
Steps to reproduce
Relevant docker-compose:-
version: '3.8'
services:
neo4j:
image: neo4j:4.0
environment:
- NEO4J_AUTH=foo/bar
- NEO4J_apoc_export_file_enabled=true
- NEO4J_apoc_import_file_enabled=true
- NEO4J_apoc_import_file_use__neo4j__config=true
- NEO4JLABS_PLUGINS=["apoc"]
Expected behaviour
Neo4j Docker container starts.
Actual behaviour
Container fails to start with the following output:-
neo4j_1 | Warning: Folder mounted to "/data/dbms" is not writable from inside container. Changing folder owner to neo4j.
neo4j_1 | Changed password for user 'neo4j'.
neo4j_1 | Fetching versions.json for Plugin 'apoc' from https://neo4j-contrib.github.io/neo4j-apoc-procedures/versions.json
neo4j_1 | Installing Plugin 'apoc' from https://github.com/neo4j-contrib/neo4j-apoc-procedures/releases/download/4.0.0.17/apoc-4.0.0.17-all.jar to /var/lib/neo4j/plugins/apoc.jar
neo4j_1 | Applying default values for plugin apoc to neo4j.conf
neo4j_1 | Directories in use:
neo4j_1 | home: /var/lib/neo4j
neo4j_1 | config: /var/lib/neo4j/conf
neo4j_1 | logs: /logs
neo4j_1 | plugins: /var/lib/neo4j/plugins
neo4j_1 | import: /var/lib/neo4j/import
neo4j_1 | data: /var/lib/neo4j/data
neo4j_1 | certificates: /var/lib/neo4j/certificates
neo4j_1 | run: /var/lib/neo4j/run
neo4j_1 | Starting Neo4j.
neo4j_1 | 2020-07-22 08:09:05.792+0000 INFO ======== Neo4j 4.0.7 ========
neo4j_1 | 2020-07-22 08:09:05.798+0000 INFO Starting...
neo4j_1 | 2020-07-22 08:09:06.138+0000 ERROR Failed to start Neo4j: Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabaseService@38af9828' was successfully initialized, but failed to start. Please see the attached cause exception "/logs/debug.log (Permission denied)". Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabaseService@38af9828' was successfully initialized, but failed to start. Please see the attached cause exception "/logs/debug.log (Permission denied)".
neo4j_1 | org.neo4j.server.ServerStartupException: Starting Neo4j failed: Component 'org.neo4j.server.database.LifecycleManagingDatabaseService@38af9828' was successfully initialized, but failed to start. Please see the attached cause exception "/logs/debug.log (Permission denied)".
neo4j_1 | at org.neo4j.server.exception.ServerStartupErrors.translateToServerStartupError(ServerStartupErrors.java:45)
neo4j_1 | at org.neo4j.server.AbstractNeoServer.start(AbstractNeoServer.java:164)
neo4j_1 | at org.neo4j.server.ServerBootstrapper.start(ServerBootstrapper.java:114)
neo4j_1 | at org.neo4j.server.ServerBootstrapper.start(ServerBootstrapper.java:89)
neo4j_1 | at org.neo4j.server.CommunityEntryPoint.main(CommunityEntryPoint.java:35)
neo4j_1 | Caused by: org.neo4j.kernel.lifecycle.LifecycleException: Component 'org.neo4j.server.database.LifecycleManagingDatabaseService@38af9828' was successfully initialized, but failed to start. Please see the attached cause exception "/logs/debug.log (Permission denied)".
neo4j_1 | at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:465)
neo4j_1 | at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:111)
neo4j_1 | at org.neo4j.server.AbstractNeoServer.start(AbstractNeoServer.java:157)
neo4j_1 | ... 3 more
neo4j_1 | Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: /logs/debug.log (Permission denied)
neo4j_1 | at org.neo4j.graphdb.factory.module.GlobalModule.createLogService(GlobalModule.java:350)
neo4j_1 | at org.neo4j.graphdb.factory.module.GlobalModule.<init>(GlobalModule.java:162)
neo4j_1 | at org.neo4j.graphdb.facade.DatabaseManagementServiceFactory.createGlobalModule(DatabaseManagementServiceFactory.java:214)
neo4j_1 | at org.neo4j.graphdb.facade.DatabaseManagementServiceFactory.build(DatabaseManagementServiceFactory.java:119)
neo4j_1 | at org.neo4j.server.database.CommunityGraphFactory.newDatabaseManagementService(CommunityGraphFactory.java:36)
neo4j_1 | at org.neo4j.server.database.LifecycleManagingDatabaseService.start(LifecycleManagingDatabaseService.java:88)
neo4j_1 | at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:444)
neo4j_1 | ... 5 more
neo4j_1 | Caused by: java.io.FileNotFoundException: /logs/debug.log (Permission denied)
neo4j_1 | at java.base/java.io.FileOutputStream.open0(Native Method)
neo4j_1 | at java.base/java.io.FileOutputStream.open(FileOutputStream.java:298)
neo4j_1 | at java.base/java.io.FileOutputStream.<init>(FileOutputStream.java:237)
neo4j_1 | at org.neo4j.io.fs.DefaultFileSystemAbstraction.openAsOutputStream(DefaultFileSystemAbstraction.java:78)
neo4j_1 | at org.neo4j.io.fs.FileSystemUtils.createOrOpenAsOutputStream(FileSystemUtils.java:54)
neo4j_1 | at org.neo4j.logging.RotatingFileOutputStreamSupplier.openOutputFile(RotatingFileOutputStreamSupplier.java:338)
neo4j_1 | at org.neo4j.logging.RotatingFileOutputStreamSupplier.<init>(RotatingFileOutputStreamSupplier.java:137)
neo4j_1 | at org.neo4j.logging.RotatingFileOutputStreamSupplier.<init>(RotatingFileOutputStreamSupplier.java:121)
neo4j_1 | at org.neo4j.logging.internal.StoreLogService.<init>(StoreLogService.java:191)
neo4j_1 | at org.neo4j.logging.internal.StoreLogService$Builder.build(StoreLogService.java:131)
neo4j_1 | at org.neo4j.graphdb.factory.module.GlobalModule.createLogService(GlobalModule.java:346)
neo4j_1 | ... 11 more
neo4j_1 | 2020-07-22 08:09:06.139+0000 INFO Neo4j Server shutdown initiated by request
docker-tests_neo4j_1 exited with code 1
Neo4j image tag being used, eg neo4j:latest
, neo4j:enterprise-3.5
etc
neo4j:4.0
The output of the docker version
command
Client: Docker Engine - Community
Version: 19.03.12
API version: 1.40
Go version: go1.13.10
Git commit: 48a66213fe
Built: Mon Jun 22 15:45:36 2020
OS/Arch: linux/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 19.03.12
API version: 1.40 (minimum version 1.12)
Go version: go1.13.10
Git commit: 48a66213fe
Built: Mon Jun 22 15:44:07 2020
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.2.13
GitCommit: 7ad184331fa3e55e52b890ea95e65ba581ae3429
runc:
Version: 1.0.0-rc10
GitCommit: dc9208a3303feef5b3839f4323d9beb36df0a9dd
docker-init:
Version: 0.18.0
GitCommit: fec3683
Operating system: (for example Windows 95/Ubuntu 16.04)
Ubuntu 18.04
@j616 I tried running your reproduction code, but Neo4j 4.0.7 started as expected:
docker-compose up
Starting 269-permission-onstart_neo4j_1 ... done
Attaching to 269-permission-onstart_neo4j_1
neo4j_1 | Changed password for user 'neo4j'.
neo4j_1 | Fetching versions.json for Plugin 'apoc' from https://neo4j-contrib.github.io/neo4j-apoc-procedures/versions.json
neo4j_1 | Installing Plugin 'apoc' from https://github.com/neo4j-contrib/neo4j-apoc-procedures/releases/download/4.0.0.17/apoc-4.0.0.17-all.jar to /var/lib/neo4j/plugins/apoc.jar
neo4j_1 | Applying default values for plugin apoc to neo4j.conf
neo4j_1 | Skipping dbms.security.procedures.unrestricted for plugin apoc because it is already set
neo4j_1 | Directories in use:
neo4j_1 | home: /var/lib/neo4j
neo4j_1 | config: /var/lib/neo4j/conf
neo4j_1 | logs: /logs
neo4j_1 | plugins: /var/lib/neo4j/plugins
neo4j_1 | import: /var/lib/neo4j/import
neo4j_1 | data: /var/lib/neo4j/data
neo4j_1 | certificates: /var/lib/neo4j/certificates
neo4j_1 | run: /var/lib/neo4j/run
neo4j_1 | Starting Neo4j.
neo4j_1 | 2020-07-22 12:03:10.199+0000 INFO ======== Neo4j 4.0.7 ========
neo4j_1 | 2020-07-22 12:03:10.205+0000 INFO Starting...
neo4j_1 | 2020-07-22 12:03:18.717+0000 INFO Called db.clearQueryCaches(): Query cache already empty.
neo4j_1 | 2020-07-22 12:03:18.787+0000 INFO Bolt enabled on 0.0.0.0:7687.
neo4j_1 | 2020-07-22 12:03:18.788+0000 INFO Started.
neo4j_1 | 2020-07-22 12:03:19.516+0000 INFO Remote interface available at http://localhost:7474/
There are a couple of similar sounding active issues that you may be hitting though:
- #223 where neo4j hits mysterious unreproducable permission denied errors when starting and it seems related to apoc somehow.
- #266 where uid/gid changes between 4.0.6 and 4.0.7 caused permission errors on a named volume (doesn't happen on file mounts for some reason). Although you say you aren't mounting anything, the messages
Warning: Folder mounted to "/data/dbms" is not writable from inside container. Changing folder owner to neo4j.
andCaused by: java.io.FileNotFoundException: /logs/debug.log
would suggest that there are/data
and/logs
folder being mounted.
So, my questions:
- are you SURE you're not mounting any volume/s, and if you are, are they named volumes or file mounts?
- what if you try starting your container without setting
NEO4JLABS_PLUGINS
?
Hello, thanks for your reply. I've managed to solve my issue by pruning containers. But for completeness:-
- Yes, I'm sure I'm not mounting any volumes/files. Or, to be specific, I'm not specifying any volume/file mounts in my
docker-compose.yml
file or in the commands I'm using to start the containers. - I reduced my docker-compose file to the following and still had the issue
version: '3.8'
services:
neo4j:
image: neo4j:4.0
environment:
- NEO4J_AUTH=foo/bar
The specific commands I was using to start the container:-
docker-compose -f ./docker-tests/docker-compose.yml pull
docker-compose -f ./docker-tests/docker-compose.yml up
The pull being there to make sure I get the latest version, even if I already have on that satisfies the requirements available on my machine.
If you have any thoughts on why the prune would be needed in this case, I'd be grateful. But appreciate that "go and google it" is a legitimate reply 😅
Thanks for your help.
That is weird. Maybe one of the docker layers got corrupted/corrupted itself somehow?
I would have thought that if there are no shared mounts, then two running containers would be isolated and not break each other, but I don't know the specifics of how docker caching works.
Maybe I've been making an assumption that containerisation (and linux namespacing) automatically gives you compartmentalistaion for free.
I didn't have any other instances running so I don't think it'd be that. If anything, it'd be a between launches issue rather than a between running instances issue.
It could be that one of the Dockerfiles up the stack isn't invalidating the layer cache correctly? So a new version isn't being pulled because docker doesn't think there's been any changes at that layer? But purging will force the whole stack to be pulled. I know that can cause issues when building a container locally. I'm not sure if it can also cause an issue once deployed to the repositories.
Same happened to me, pruning solved the issue