eclipse-openj9/openj9

-Xshareclasses causes crash on Google Cloud Run

Closed this issue · 23 comments

gyuuu commented

The -Xshareclasses option causes a JVM crash when running on Google Cloud Run.

The error message says that no free space is available.

As stated in the Filesystem Access section of this page, Cloud Run uses some kind of in-memory file system.

I tried this with Micronaut and Spring Boot hello-world applications on both Java 8 and Java 11, so I assume that any application will fail. After removing the -Xshareclasses option, both applications worked.
I also tried pointing the temporary directory to another folder with the TMPDIR environment variable, but that didn't help either.
Setting the readonly option for -Xshareclasses didn't make any difference.

I set the allocated memory to 1GB, so there should be plenty of free space. I assume that OpenJ9 can't determine the free space available on the in-memory file system.

The Dockerfiles I tested with:
FROM adoptopenjdk/openjdk11-openj9:alpine
WORKDIR /app
ADD target/backend*.jar backend.jar
ENV PORT 8080
ENV JAVA_OPTS -Xmx256m -Xms256m -Xquickstart -Xtune:virtualized
ENV TMPDIR /app/temp
ENTRYPOINT java $JAVA_OPTS -XshowSettings:vm -Xshareclasses -Dserver.port=${PORT} -Djava.security.egd=file:/dev/./urandom -jar backend.jar

FROM adoptopenjdk/openjdk8-openj9:alpine
WORKDIR /app
ADD target/backend*.jar backend.jar
ENV PORT 8080
ENV JAVA_OPTS -Xmx256m -Xms256m -Xquickstart -Xtune:virtualized
ENV TMPDIR /app/temp
ENTRYPOINT java $JAVA_OPTS -XshowSettings:vm -Xshareclasses -Dserver.port=${PORT} -Djava.security.egd=file:/dev/./urandom -jar backend.jar

FROM adoptopenjdk/openjdk8-openj9:alpine
WORKDIR /app
ADD target/backend*.jar backend.jar
ENV PORT 8080
ENV JAVA_OPTS -Xmx256m -Xms256m -Xquickstart -Xtune:virtualized
ENTRYPOINT java $JAVA_OPTS -XshowSettings:vm -Xshareclasses -Dserver.port=${PORT} -Djava.security.egd=file:/dev/./urandom -jar backend.jar

The full error message from Cloud Run logs:
{app_name: Java, facility: 1, hostname: IBM, message: JVMSHRC561E Failed to initialize the shared classes cache, there is not enough space in the file system. Available free disk space bytes = 0, requested bytes = 67108864. , msgid: , procid: 1, severity: 3, structured_data: , timestamp: 2019-08-09T…
JVMSHRC561E Failed to initialize the shared classes cache, there is not enough space in the file system. Available free disk space bytes = 0, requested bytes = 67108864.
JVMJ9VM015W Initialization error for library j9shr29(11): JVMJ9VM009E J9VMDllMain failed
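
For scale, the two byte counts in the JVMSHRC561E line can be pulled out and converted. This is only a small sketch: `parse_space_error` is a hypothetical helper written for this thread, not something OpenJ9 provides, and the regular expression assumes the message wording shown above.

```python
import re

# Message text copied from the log above.
MESSAGE = (
    "JVMSHRC561E Failed to initialize the shared classes cache, there is not "
    "enough space in the file system. Available free disk space bytes = 0, "
    "requested bytes = 67108864."
)


def parse_space_error(message):
    """Hypothetical helper: return (available_bytes, requested_bytes) or None."""
    match = re.search(
        r"Available free disk space bytes = (\d+), requested bytes = (\d+)",
        message,
    )
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


available, requested = parse_space_error(MESSAGE)
requested_mib = requested / (1024 * 1024)  # bytes -> MiB
print(available, requested_mib)
```

So the JVM asked for 64 MiB and the file system reported nothing available at all.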

The shared cache directory can be changed by setting the HOME environment variable or by using the cacheDir=<Dir> sub-option after -Xshareclasses:

Is the file system tmpfs or ramfs?

I'm also interested in what happens if you use the nonpersistent sub-option (e.g. -Xshareclasses:nonpersistent).

gyuuu commented

I already used the cacheDir sub-option, but I forgot to include that example here. It makes no difference.

The nonpersistent option makes it work; however, it defeats the purpose. My goal is to make the container start fast while keeping it stateless.

My original setup looked like this: -Xquickstart -Xtune:virtualized -Xshareclasses:cacheDir=classCache,name=appname,readonly

I should have included this in the ticket, sorry about that.

My goal is to prewarm the cache with some automated tests and include the cacheDir in the image to reduce startup time, which is very important in serverless environments.

The container works on my machine but doesn't work on Cloud Run, which is ironic, because we use containers to avoid this situation, but that's another story.

gyuuu commented

The official documentation states that there should be at least 20MB in the /tmp directory for profile creation.

I checked the free space with a simple Docker image:

FROM ubuntu
WORKDIR /app
ENTRYPOINT echo "free -h" && free -h && echo "COMMAND: df -k /app" && df -k /app && echo "COMMAND: df -k /tmp" && df -k /tmp && echo "COMMAND: df" && df

Result
2019-08-12 22:18:54.414 CEST free -h
2019-08-12 22:18:54.625 CEST total used free shared buff/cache available
2019-08-12 22:18:54.625 CEST Mem: 2.0G 1.1M 2.0G 0B 2.3M 2.0G
2019-08-12 22:18:54.625 CEST Swap: 0B 0B 0B
2019-08-12 22:18:54.628 CEST COMMAND: df -k /app
2019-08-12 22:18:54.637 CEST Filesystem 1K-blocks Used Available Use% Mounted on
2019-08-12 22:18:54.637 CEST - 0 0 0 - /app
2019-08-12 22:18:54.637 CEST COMMAND: df -k /tmp
2019-08-12 22:18:54.644 CEST Filesystem 1K-blocks Used Available Use% Mounted on
2019-08-12 22:18:54.644 CEST none 0 0 0 - /tmp
2019-08-12 22:18:54.645 CEST COMMAND: df
2019-08-12 22:18:54.652 CEST df: /var/log: Function not implemented
2019-08-12 22:18:55.000 CEST Container called exit(1).

So this is not a bug!
The message is valid from a technical standpoint: the operating system reports 0 bytes of free space.

However, I would still appreciate help finding a way around this, since as far as I know Cloud Run is the only serverless application platform where OpenJ9 is available, and this is a use case where AOT could make the biggest difference.

I reported the issue to Google as feedback for Cloud Run. If you know a better way to report it, please share it with me.

there should be at least 20MB in the /tmp

The default shared cache directory used to be /tmp and the default shared cache size used to be 16MB.
On OpenJ9 builds the defaults are now the user's home directory and 300MB. You can use -Xscmx<size> to change the shared cache size.

Eclipse OpenJ9 uses this API from Eclipse OMR to determine the free space bytes:
https://github.com/eclipse/omr/blob/master/port/unix/omrfile.c#L1224

Not all file systems define every field of struct statfs/struct statvfs. I guess f_bavail/f_bfree is not defined in the file system of Google Cloud Run and the value for this undefined field is 0. So I want to know the file system type for the directory where the shared cache is created (HOME or the directory passed to cacheDir=). It should show up in the output of mount.

The JVM code could be changed to skip the free space check for this file system type. Alternatively, the OMR API could be changed to return UNDEFINED instead of 0 for free bytes on this file system.
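
To make the second idea concrete, here is a minimal Python sketch that reads the same statvfs fields and reports "undefined" rather than 0. It is not the OMR code. The function names are hypothetical, and treating a total block count of 0 as "the file system does not report sizes" is an assumption based on the df output earlier in this thread, where every column was 0.

```python
import os


def free_bytes_or_none(f_blocks, f_bavail, f_frsize):
    """Return usable bytes, or None when the file system reports no size at all.

    Assumption: f_blocks == 0 means the fields are not filled in, which is
    different from a real file system that is simply full.
    """
    if f_blocks == 0:
        return None
    return f_bavail * f_frsize


def probe(path):
    """Query statvfs for `path` and apply the rule above."""
    st = os.statvfs(path)
    return free_bytes_or_none(st.f_blocks, st.f_bavail, st.f_frsize)


def has_room(path, requested_bytes):
    """Skip the space check when the free space is undefined."""
    free = probe(path)
    return True if free is None else free >= requested_bytes


if __name__ == "__main__":
    print(probe("."), has_room(".", 64 * 1024 * 1024))
```

With this rule a caller can tell "0 bytes free" apart from "unknown" and decide on its own whether to proceed.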

gyuuu commented

I tried it with a custom cacheDir: I created the cache with -Xscmx32m and added it to the Docker image.

I found the 20MB requirement for /tmp in the documentation of the error message I got.

This is the output of the mount command on Cloud Run:

none on / type overlayfs (rw)
none on /dev type overlayfs (rw)
none on /dev/pts type devpts (rw)
none on /proc type proc (rw)
none on /sys type overlayfs (rw)
none on /tmp type overlayfs (rw)
none on /var/log type overlayfs (rw)
none on /sys/fs/cgroup/cpu type cgroup (ro)
none on /sys/fs/cgroup/cpuacct type cgroup (ro)
none on /sys/fs/cgroup/cpuset type cgroup (ro)
none on /sys/fs/cgroup/memory type cgroup (ro)

OK... it returns none for all mounts. I don't know why Google Cloud Run hides the file system type.
The JVM needs to know the file system type so that we can skip the free space size check.

Probably need to write a simple program to get statfs.f_type programmatically.
http://man7.org/linux/man-pages/man2/statfs.2.html

Ahh I see. The filesystem type is overlayfs.
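
In the mount output above, `none` is the device column and the word after `type` is the file system type. A small sketch of reading it for a given cache directory follows. The helper names are hypothetical, the sample text is a subset of the lines pasted above, and mount points containing spaces are not handled.

```python
# Subset of the mount output pasted above.
MOUNT_OUTPUT = """\
none on / type overlayfs (rw)
none on /dev type overlayfs (rw)
none on /dev/pts type devpts (rw)
none on /proc type proc (rw)
none on /sys type overlayfs (rw)
none on /tmp type overlayfs (rw)
none on /var/log type overlayfs (rw)
none on /sys/fs/cgroup/cpu type cgroup (ro)
"""


def parse_mounts(text):
    """Map each mount point to its file system type."""
    mounts = {}
    for line in text.splitlines():
        parts = line.split()
        # Expected shape: <device> on <mount point> type <fs type> (<options>)
        if len(parts) >= 5 and parts[1] == "on" and parts[3] == "type":
            mounts[parts[2]] = parts[4]
    return mounts


def fs_type_for(path, mounts):
    """Return the type of the longest mount point that contains `path`."""
    best = None
    for mount_point in mounts:
        covers = (
            mount_point == "/"
            or path == mount_point
            or path.startswith(mount_point + "/")
        )
        if covers and (best is None or len(mount_point) > len(best)):
            best = mount_point
    return mounts.get(best)


mounts = parse_mounts(MOUNT_OUTPUT)
print(fs_type_for("/tmp/classCache", mounts))
```

Any cacheDir under /tmp, /opt or /app resolves to overlayfs here, since /opt and /app fall under the root mount.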

Please ignore #6720 (comment)

So the issue here is that the JVM fails to start when you use -Xshareclasses:cacheDir=/tmp -Xscmx32m, but you can successfully create a non-empty file under /tmp. @gyuuu Can you please confirm?

gyuuu commented

No, not really.

It fails with any cacheDir setting I tried so far.
These are the ones I tried:

  • no cacheDir specified
  • using /opt as cacheDir
  • adding a cacheDir to the image created on my PC, and setting that one as cacheDir (even the readonly sub-option doesn't make any difference)
    The last option is the one I want to use.

The only sub-option that makes it work is nonpersistent.

Might need an option to skip the file system free space check. FYI @pshipton

Looking at the history for why a space check was added back in 2011, it appears the failure behavior without it is less than ideal. When a shared cache is created and loaded through mmap, the file stays empty until it is actually written to. Therefore, we do not get an error when we create the shared classes cache on a full file system. However, when we do write to the file and the disk is full, a SIGBUS or SIGSEGV signal is sent.
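
The delayed failure can be illustrated with a short Python sketch. It is not the JVM's code: the 1 MiB size and the function name are made up for the example, and it assumes a Unix-like system where `st_blocks` is available. Setting the file length allocates little or nothing, and storage is only claimed when the mapped pages are written.

```python
import mmap
import os
import tempfile

SIZE = 1 << 20  # 1 MiB stand-in for a cache file (illustrative value)


def sparse_then_written(directory=None):
    """Return (apparent size, blocks before write, blocks after write)."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        # Set the length without writing data; this succeeds even when
        # little real space is left, because nothing is stored yet.
        os.ftruncate(fd, SIZE)
        size = os.fstat(fd).st_size
        blocks_before = os.fstat(fd).st_blocks
        with mmap.mmap(fd, SIZE) as view:
            # Writing through the mapping is what needs real storage.
            # On a full file system this is the step where the kernel
            # would raise SIGBUS instead of returning an error code.
            view[:] = b"\x01" * SIZE
            view.flush()
        os.fsync(fd)
        blocks_after = os.fstat(fd).st_blocks
        return size, blocks_before, blocks_after
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    print(sparse_then_written())
```

On most file systems the block count before the write is far smaller than after it, which is why an up-front free space check was the only place to report the problem cleanly.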

gyuuu commented

@pshipton

Thanks for the builds. However, I had some trouble with them. I created a Docker image using the jdk8 version and I can't get the application to run with it.
The JVM fails with the following log messages, both with and without the noPersistentDiskSpaceCheck flag:
JVMSHRC226E Error opening shared class cache file
JVMSHRC336E Port layer error code = -108
JVMSHRC337E Platform error message: No such file or directory
JVMSHRC840E Failed to start up the shared cache.
JVMJ9VM015W Initialization error for library j9shr29(11): JVMJ9VM009E J9VMDllMain failed
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

I might have messed something up while building the Docker image, but I can't figure out what went wrong. I made minimal changes to this Dockerfile to build the image.
I pushed a minimal reproducible example to this repository.
I also pushed the built openj9 image here: gyuuu/openj9:jdk8-20190906-184840
The repository contains the Dockerfile used to build the image and a Spring Boot hello world example I tried to run with it.

Are you using the readonly sub-option? readonly should be used when the shared cache already exists on the system. The JVM is unable to create a new shared class cache file with readonly.

gyuuu commented

Yes, I am. I added the shared class cache and set a name for it. This config works with the adoptopenjdk image, without the noPersistentDiskSpaceCheck flag.
I will check these settings tonight and try it without the readonly flag as well.

gyuuu commented

Ignore my complaints, this was my fault. The new flag does fix the issue.
Thank you all!
The problem was that I had added a shared classes cache built with another version of OpenJ9.

Given the caveats mentioned by @pshipton, is it possible to reserve the required space beforehand?

gyuuu commented

Please note that I did my tests with the jdk8 version, I will test the jdk11 version asap.

Is it possible to reserve the required space beforehand?

Currently we do not reserve the space beforehand because doing so will increase the JVM's footprint, which is undesirable, especially on the cloud.

Is it possible to reserve the required space beforehand?

Is this really required? My understanding is that you will create a shared cache and then pre-populate it in the docker image. Creating the shared cache will consume the space it needs, and there won't be any problems when using the cache as long as readonly is used. If the creation process runs out of space, it will result in a SIGBUS or SIGSEGV, but you'll understand this means more space is required.

gyuuu commented

Sounds legit, ignore my comment.

gyuuu commented

I tested the jdk11 version too, and the problem is solved with that one as well.

ok thanks. I'll see about getting the new option into the next release.

gyuuu commented

Thank you!
I'm not familiar with the process you work with. Can we close this issue now, or should we wait for the release to do so?

We'll keep it open until the code for the option lands in master. At that point, we know it will be in the next release.