Are negative CMSInitiatingOccupancyFraction values valid?
Closed this issue · 7 comments
I have been running JTune on a couple of my servers, applying its tuning suggestions, then re-running it and applying the new values. I seem to be homing in on an optimal set of JVM flags for the server, but I'm getting what look like really odd -XX:CMSInitiatingOccupancyFraction values:
* Reading gc.log file... done. Scanned 3008 lines in 0.0016 seconds.
Meta:
~~~~~
Sample Time: 29m13s (1753 seconds)
System Uptime: 52d49m
CPU Uptime: 104d1h
Proc Uptime: 1m50s
Proc Usertime: 2m22s (0.00%)
Proc Systime: 5s (0.00%)
Proc RSS: 1.71G
Proc VSize: 5.07G
Proc # Threads: 93
YG Allocation Rates*:
~~~~~~~~~~~~~~~~~~~~~
per sec (min/mean/max): 1.42M/s 145.62M/s 470.29M/s
per day (min/mean/max): 119.83G/d 12T/d 38.75T/d
OG Promotion Rates:
~~~~~~~~~~~~~~~~~~~
per sec (min/mean/max): 9K/s 47.39M/s 583.80M/s
per hr (min/mean/max): 31.63M/h 166.60G/h 2T/h
Survivor Death Rates:
~~~~~~~~~~~~~~~~~~~~~
Lengths (min/mean/max): 0/1.9/12
Death Rate Breakdown:
Age 1: 0.0% / 32.9% / 100.0% / 67.1% (min/mean/max/cuml alive %)
Age 2: -0.4% / 11.3% / 89.2% / 59.5% (min/mean/max/cuml alive %)
Age 3: -0.2% / 1.5% / 50.8% / 58.7% (min/mean/max/cuml alive %)
Age 4: -0.0% / 1.0% / 40.9% / 58.0% (min/mean/max/cuml alive %)
Age 5: -0.2% / 0.6% / 54.4% / 57.7% (min/mean/max/cuml alive %)
Age 6: -0.0% / 0.6% / 31.7% / 57.4% (min/mean/max/cuml alive %)
Age 7: -0.0% / 0.5% / 48.7% / 57.1% (min/mean/max/cuml alive %)
Age 8: 0.0% / 0.3% / 23.4% / 56.9% (min/mean/max/cuml alive %)
Age 9: -0.2% / 0.1% / 5.7% / 56.9% (min/mean/max/cuml alive %)
Age 10: -0.0% / 0.1% / 12.7% / 56.8% (min/mean/max/cuml alive %)
Age 11: -0.2% / 0.1% / 15.6% / 56.8% (min/mean/max/cuml alive %)
Age 12: -0.0% / 0.0% / 0.5% / 56.8% (min/mean/max/cuml alive %)
Age 13: -0.0% / 0.0% / 8.0% / 56.8% (min/mean/max/cuml alive %)
Age 14: 0.0% / 0.1% / 22.9% / 56.7% (min/mean/max/cuml alive %)
GC Information:
~~~~~~~~~~~~~~~
YGC/FGC Count: 430/12 (Rate: 14.72/min, 0.41/min)
GC Load (since JVM start): 3.80%
Sample Period GC Load: 3.20%
CMS Sweep Times: 2.326s / 4.335s / 5.275s / 1.21 (min/mean/max/stdev)
YGC Times: 0ms / 122ms / 570ms / 100.47 (min/mean/max/stdev)
FGC Times: 0ms / 51ms / 112ms / 30.62 (min/mean/max/stdev)
Agg. YGC Time: 55480ms
Agg. FGC Time: 673ms
Est. Time Between FGCs (min/mean/max): 4d6h 1m8s 5s
Est. OG Size for 1 FGC/hr (min/mean/max): 31.63M 166.60G 2T
Overall JVM Efficiency Score*: 96.797%
Current JVM Configuration:
~~~~~~~~~~~~~~~~~~~~~~~~~~
NewSize: 172M
OldSize: 5.19M
SurvivorRatio: 1
MinHeapFreeRatio: 40
MaxHeapFreeRatio: 70
MaxHeapSize: 3.34G
PermSize: 240M
NewRatio: 2
Recommendation Summary:
~~~~~~~~~~~~~~~~~~~~~~~
Warning: The process I'm doing the analysis on has been up for 1m50s,
and may not be in a steady-state. It's best to let it be up for more
than 5 minutes to get more realistic results.
* Warning: The calculated recommended survivor ratio of 0.46 is less than 1.
This is not possible, so I increased the size of newgen by 87.43M, and set the
survivor ratio to 1. Try the tuning suggestions, and watch closely.
- With a mean YGC time goal of 50ms, the suggested (optimized for a
YGC rate of 33.55/min) size of NewGen (including adjusting for
calculated max tenuring size) considering the above criteria should be
163 MiB (currently: 172 MiB).
- Because we're decreasing the size of NewGen, it can have an impact
on system load due to increased memory management requirements.
There's not an easy way to predict the impact to the application, so
watch this after it's tuned.
- It's recommended to have the PermGen size 1.2-1.5x (used 1.5x) the size of the
live PermGen size. New recommended size is 241MiB (currently: 240MiB).
- Looking at the worst (max) survivor percentages for all the ages, it looks
like a TenuringThreshold of 5 is ideal.
- The survivor size should be 2x the max size for tenuring threshold
of 5 given above. Given this, the survivor size of 163M is ideal.
- To ensure enough survivor space is allocated, a survivor ratio of 1 should be
used.
- It's recommended to have the max heap size 3-4x the size of the live data size
(OldGen + PermGen), and adjusted to include the recommended survivor and newgen
size. New recommended size is 4293MiB (currently: 3416MiB).
- With a max 99th percentile OG promotion rate of 122.10M/s, and the max CMS
sweep time of 5.275s, you should not have a occupancy fraction any higher than
-12363.
Java G1 Settings:
~~~~~~~~~~~~~~~~~~~
- With a max ygc stdev of 46.95, and a 99th percentile ygc mean ms of 190ms,
your config is probably not ready to move to the G1 garbage collector. Try
tuning the JVM, and see if that improves things first.
The JVM arguments from the above recommendations:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Xmx4293m -Xms4293m -Xmn163m -XX:SurvivorRatio=1 -XX:MaxTenuringThreshold=5
-XX:CMSInitiatingOccupancyFraction=-12363 -XX:PermSize=241m -XX:MaxPermSize=241m
~~~
* The allocation rate is the increase in usage before a GC is done. Growth rate
is the increase in usage after a GC is done.
* The JVM efficiency score is a convenient way to quantify how efficient the
JVM is. The most efficient JVM is 100% (pretty much impossible to obtain).
* A copy of the critical data used to generate this report is stored
in /tmp/jpulse_data-eaihost.bin.bz2. Please copy this to your homedir if you
want to save/analyze this further.
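For context on where a negative fraction can come from: JTune's exact formula isn't shown in the report, but a plausible reconstruction is that it leaves enough old-gen headroom for one CMS sweep's worth of promotion, i.e. fraction = 100 × (OG size − promotion rate × sweep time) / OG size. The function name below is hypothetical; plugging in the report's own numbers (OldSize 5.19M, max 99th-percentile promotion 122.10M/s, max sweep 5.275s) yields a value in the same ballpark as the -12363 reported:

```python
def max_occupancy_fraction(og_size_mb, promo_rate_mb_s, sweep_time_s):
    """Highest old-gen occupancy (%) that still leaves room for the
    objects promoted during one full CMS sweep (hypothetical formula)."""
    headroom_mb = promo_rate_mb_s * sweep_time_s
    return 100.0 * (og_size_mb - headroom_mb) / og_size_mb

# Values from the report above: a fresh JVM with a tiny 5.19M old gen,
# but a very high 122.10M/s promotion rate and 5.275s CMS sweeps.
frac = max_occupancy_fraction(5.19, 122.10, 5.275)
print(round(frac))  # a large negative number, near the -12363 reported
```

With a tiny old gen and one sweep's promotion (~644M) vastly exceeding it, the fraction goes deeply negative — which is exactly the "newly started JVM" failure mode the warning describes.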
Huh. That's interesting. What do you have your -Xms and -Xmx set to? Can you email me your playback file (/tmp/jpulse_data-eaihost.bin.bz2) to ebullen@linkedin.com? That'll help me look at the data it used. This file only contains jvm data used for calculations, and contains no personally identifiable information. Thanks!
Sure thing. I am queueing up an email right now.
Also, I need to know what your -Xms -Xmx values are set to.
Currently I'm at -Xmx4031m -Xms4031m, but that is a change since the files I last sent. Do you want me to send you the most recent jpulse files based on these heap settings?
Were they both set to the same value when you got the negative CMSInitiatingOccupancyFraction?
Yes.
Your JVM was only running for 1m50s, and JTune warns that if the JVM has been running for less than 5 minutes you may see weird results. In this case you have very high OG growth rates (due to a newly started JVM) combined with CMS sweep times of ~5 seconds, so the calculations became unreliable. Try it again after waiting 5-10 minutes (the JVM needs to be in a steady state, AND needs to be under peak load, for JTune to work correctly).
Please make sure that you read the warnings and adjust accordingly.
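A quick pre-flight check along those lines can save a wasted run. This is a sketch (the `jtune_ready` helper is hypothetical, not part of JTune); it just compares the target process's uptime in seconds against the 5-minute threshold the warning mentions:

```shell
# Minimum JVM uptime (seconds) before JTune results are meaningful,
# per the warning in the report.
min_uptime=300

jtune_ready() {
  # $1 = process uptime in seconds, e.g. from: ps -o etimes= -p <pid>
  [ "$1" -ge "$min_uptime" ]
}

# Example: the report above sampled a process up for only 1m50s (110s).
if jtune_ready 110; then echo "ready"; else echo "wait longer"; fi
```

`ps -o etimes= -p <pid>` prints elapsed seconds since the process started on most Linux systems, which is a convenient input here.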