palantir/go-java-launcher

Implement Improved Container Support

carterkozak opened this issue · 0 comments

Currently we support containers thusly:
https://github.com/palantir/go-java-launcher/blob/3e6f05325a8002960172b1bd0eb39aec0a5ba42f/README.md#java-heap-and-container-support

-Xms and -Xmx args are ignored, and replaced with -XX:InitialRAMPercentage=75.0 -XX:MaxRAMPercentage=75.0 to set heap memory bounds dynamically based on the container itself. This is better than configuring the container and jvm heap separately, but has a number of problems:

  1. In practice, our heap needs don't scale linearly. A container with 256m of memory will likely need a lower percentage for heap than a container with 32g of memory in order to make space for the jvms native memory requirements (which themselves scale, but not entirely linearly with heap).
  2. InitialRAMPercentage and MaxRAMPercentage seem not to be entirely equivalent to Xms and Xmx in that unused heap may still be returned to the OS when both flags are set to the same value (Documentation isn't terribly helpful here, and this may need to be re-evaluated for correctness, but this is my recollection based on the original rollout)
  3. CPU information is not read correctly by the JVM. It never was, but this became much worse when https://bugs.openjdk.org/browse/JDK-8281181 was shipped in the October Java CPU (Critical Patch Update). Where previously the JVM modified its active processor count based on detected cgroup information, it no longer does without a new (deprecated) flag: -XX:+UseContainerCpuShares palantir/sls-packaging#1414
    • However, even with this feature enabled, services set to use 1000 millicores fall through a codepath which causes them to detect all cores on the hypervisor (due to ambiguity in whether or not cgroups are being leveraged, as cgroups are used more broadly than our use-case). See cgroupV1Subsystem_linux.cpp or the similar case for cgroupsv2 in CgroupV2Subsystem.java.

Proposed Path Forward

We should teach go-java-launcher to detect the configurations that we use, where we have more specific knowledge about what we expect.

Processors

CPU shares is more important to work on in the immediate future than memory.

We should detect cpu-share information in go-java-launcher, and pass it to the JVM by adding the -XX:ActiveProcessorCount=N flag. Where N is roughly max(2, floor(millicores / 1000)). ActiveProcessorCount requires a whole number. the max(2 component is non-obvious, but I think we will be better off providing N>1 to ensure smaller applications don't get blocked by too few GC threads, as well as issues in many concurrent data-structures which assume they must operate differently when ActiveProcessorCount=1 because parallel computation is impossible, e.g. wildfly-common SpinLock.java and jboss-threads EnhancedQueueExecutor.java. When a small-ish application detects many processors, all per-processor buffers become that much larger, which results in oomkills as we discovered when JDK-8281181 was deployed.

Concretely, when CONTAINER mode is used, we should check jvmArgs for -XX:ActiveProcessorCount=<value>. If none has been explicitly set (we don't expect it to be set in the common case), we should add the flag with a reasonable value based on the formula above.

Memory

Ideally we should detect available memory, and use a better formula to determine heap size. Finding ideal values will be difficult, and may require us to take into account the number of processors, configured garbage collector, and perhaps additional preset info. I'd like to avoid giving folks a scripting language to define complex memory curves themselves if possible.