[bug] Output file sometimes contains invalid JSON (`-nan`, `inf`)
Closed this issue · 3 comments
Context
I'm running a set of tests across a fleet of caches, using a fairly standard memtier command:
memtier_benchmark --server=$HOST_NAME --port=10000 --authenticate=$PASSWORD --hide-histogram --pipeline=10 --clients=$NUM_CLIENTS --threads=$NUM_THREADS --data-size=1024 --test-time=600 --key-maximum=$NUM_KEYS --key-pattern=R:R --ratio=0:1 --distinct-client-seed --run-count=1 --json-out-file=${TEST_RESULTS_DIR}/traffic-producer-clients${NUM_CLIENTS}-threads${NUM_THREADS}.json
Memtier Version
I don't have direct access to the test runners to run `memtier --version`, but we're using a standard apt-get command to install the latest Memtier for `jammy`. Here's the output from the install command:
Setting up memtier-benchmark (2.1.0~jammy) ...
Bug
Very occasionally, the output file contains invalid JSON. E.g.:
,"374":{
"Bytes RX": 1078
,"Bytes TX": 334
,"Count": 10
,"Average Latency": -nan
,"Min Latency": 9223372036854776.000
,"Max Latency": 0.000
,"p50.00": 0.000
,"p99.00": 0.000
,"p99.90": 0.000
}
(The example above is at path `'ALL STATS'.Totals.Time-Serie`**)
(**Note that the missing 's' in "Time-Series" is accurate. That's how it's showing up in my file.)
...the issue above obviously being the `-nan` value for "Average Latency", which is not valid JSON. Therefore my JSON library throws an exception when I try to parse it in my .NET app. 😩
As a workaround I must first search the entire output file and replace `-nan` with `0.0` or `null` or something before consuming it.
But it would be nice if Memtier just didn't do that in the first place 🙂
I've found another instance of this where `'ALL STATS'.Gets.Ops/sec` (and others) ended up with:
,"Gets":{
"Count": 119304362
,"Ops/sec": inf
,"Hits/sec": inf
,"Misses/sec": inf
,"Latency": 51.457
,"Average Latency": 51.457
,"Min Latency": 0.240
,"Max Latency": 296.959
,"KB/sec": inf
,"KB/sec RX/TX": inf
,"KB/sec RX": inf
,"KB/sec TX": inf
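For anyone hitting the same thing, here is a minimal sketch of the search-and-replace workaround described above, written in Python rather than .NET and extended to cover `inf` as well as `-nan`. The function names `sanitize_memtier_json` and `load_memtier_results` are hypothetical helpers, not part of memtier. The sketch assumes the bad tokens only ever appear as bare object values directly after a colon, as in the two snippets above, and it replaces them with `null`.

```python
import json
import re

# Assumption: non-finite values show up as bare tokens right after a colon,
# e.g. `"Average Latency": -nan` or `"Ops/sec": inf`. Quoted strings such as
# "inf" are left alone because the token must follow the colon directly.
_NON_FINITE = re.compile(r'(:\s*)[-+]?(?:nan|inf(?:inity)?)\b', re.IGNORECASE)


def sanitize_memtier_json(text: str) -> str:
    """Replace bare nan/inf values with null so strict JSON parsers accept the text."""
    return _NON_FINITE.sub(r'\1null', text)


def load_memtier_results(path: str):
    """Read an output file, sanitize it, and parse it as JSON."""
    with open(path, encoding="utf-8") as f:
        return json.loads(sanitize_memtier_json(f.read()))


if __name__ == "__main__":
    sample = '{"Average Latency": -nan\n,"Ops/sec": inf\n,"Max Latency": 0.000}'
    print(json.loads(sanitize_memtier_json(sample)))
```

Using `null` instead of `0.0` keeps "no valid measurement" distinguishable from a real zero, at the cost of needing nullable numeric fields on the consuming side. A text-level regex like this could in principle also rewrite a matching sequence inside a string value, so it is only a stopgap until the output itself is fixed.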
@DrEsteban let me work on a fix for this and reply back. Do you have an easy way to reproduce the issue locally?
@filipecosta90 Unfortunately no 🙁 It seems to happen intermittently - not with any specific combination of parameters. We'll have runs against the same cache with the same parameters work, and others fail 😢