[bug] Output file sometimes contains invalid JSON (`-nan`, `inf`)

Question

Closed this issue a month ago · 3 comments

Context

I'm running a set of tests across a fleet of caches, using a fairly standard memtier command:

memtier_benchmark --server=$HOST_NAME --port=10000 --authenticate=$PASSWORD --hide-histogram --pipeline=10 --clients=$NUM_CLIENTS --threads=$NUM_THREADS --data-size=1024 --test-time=600 --key-maximum=$NUM_KEYS --key-pattern=R:R --ratio=0:1 --distinct-client-seed --run-count=1 --json-out-file=${TEST_RESULTS_DIR}/traffic-producer-clients${NUM_CLIENTS}-threads${NUM_THREADS}.json

Memtier Version

I don't have direct access to the test runners to run memtier --version, but we're using a standard apt-get command to install the latest Memtier for jammy. Here's the output from the install command:

Setting up memtier-benchmark (2.1.0~jammy) ...

Bug

Very occasionally, the output file contains invalid JSON. For example:

,"374":{
  "Bytes RX": 1078
  ,"Bytes TX": 334
  ,"Count": 10
  ,"Average Latency": -nan
  ,"Min Latency": 9223372036854776.000
  ,"Max Latency": 0.000
  ,"p50.00": 0.000
  ,"p99.00": 0.000
  ,"p99.90": 0.000
}

(The example above is at path 'ALL STATS'.Totals.Time-Serie. Note that the missing 's' in "Time-Series" is accurate: that is how the key appears in my file.)

The problem is the -nan value for "Average Latency", which is not valid JSON, so my JSON library throws an exception when I try to parse the file in my .NET app. 😩

As a workaround, I have to search the entire output file and replace every -nan with 0.0, null, or some other valid value before consuming it.

But it would be nice if Memtier just didn't do that in the first place 🙂
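For anyone hitting the same thing, the search-and-replace workaround described above could look roughly like the Python sketch below. It is only an illustration: the function names `sanitize_non_finite` and `load_memtier_json` are made up here, mapping the bad values to `null` is an assumption (0.0 would work too), and nothing in it uses a memtier API. It skips over quoted strings so that keys and string values are left alone.

```python
import json
import re

# Matches either a complete JSON string literal (group 1, kept as-is)
# or a bare non-finite token such as nan, -nan, inf, -inf (group 2).
_STRING_OR_NON_FINITE = re.compile(
    r'("(?:\\.|[^"\\])*")|(-?\b(?:nan|inf(?:inity)?)\b)',
    re.IGNORECASE,
)


def sanitize_non_finite(text: str) -> str:
    """Replace bare nan/inf tokens with null, leaving string literals untouched."""
    def _swap(match: re.Match) -> str:
        # Group 1 is a quoted string: return it unchanged.
        if match.group(1) is not None:
            return match.group(1)
        # Otherwise it is a bare non-finite token (assumption: null is acceptable).
        return "null"

    return _STRING_OR_NON_FINITE.sub(_swap, text)


def load_memtier_json(path: str):
    """Hypothetical helper: read an output file, sanitize it, then parse it."""
    with open(path, encoding="utf-8") as handle:
        return json.loads(sanitize_non_finite(handle.read()))
```

Downstream code then has to treat `None` as "no data" for fields like "Average Latency" or "Ops/sec", which seems safer than silently turning them into 0.0.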

Answer 1 · 2024-10-25T09:51:41.000Z

I've found another instance of this, where 'ALL STATS'.Gets.Ops/sec (and other fields) ended up as inf:

,"Gets":{
  "Count": 119304362
  ,"Ops/sec": inf
  ,"Hits/sec": inf
  ,"Misses/sec": inf
  ,"Latency": 51.457
  ,"Average Latency": 51.457
  ,"Min Latency": 0.240
  ,"Max Latency": 296.959
  ,"KB/sec": inf
  ,"KB/sec RX/TX": inf
  ,"KB/sec RX": inf
  ,"KB/sec TX": inf

Answer 2 · 2024-10-28T12:29:21.000Z

@DrEsteban let me work on a fix for this and reply back. Do you have an easy way to reproduce the issue locally?

Answer 3 · 2024-10-28T16:21:42.000Z

@filipecosta90 Unfortunately no 🙁 It seems to happen intermittently, and not with any specific combination of parameters. Some runs against the same cache with the same parameters succeed, while others fail 😢