els0r/goProbe

goQuery: 18.45 E is shown in drops column of list target on freshly converted DB

Closed this issue · 7 comments

els0r commented

Just saw this with a large DB that was recently converted (spanning 180 days and 0.66 PB of captured traffic):

       eth7     11.34 M      9.31 M      20.38 GB      1.77 GB      248.41 k        0.00       18.45 E    2023-02-02 00:55:24    2023-08-01 22:23:50

Notice the 18.45 E. I have a feeling like this is an overflow and actually means 0. If you round 2^64 = 1.844674407370955e19 and apply humanization, you get 18.45 E.

DoD:

  • find out if this is part of the conversion logic (legacy)
  • or part of the list target (goQuery)

If it is the former, it needs to be in the v4 milestone. If not, we can fix this minor bug later.

@els0r Assuming that you can't hand over that interface's data for me to analyze you should easily be able to find out if it's from the conversion by looking at the code in legacy main.go here:

		bulkWorkload = append(bulkWorkload, goDB.BulkWorkload{
			FlowMap: block.data,
			CaptureMeta: goDB.CaptureMetadata{
				PacketsDropped: blockMetadata.PcapPacketsDropped + blockMetadata.PcapPacketsIfDropped,
			},
			Timestamp: block.ts,
		})

Afaik this is the only place where drops are transferred / written to the converted DB, so by logging / analyzing what's actually written it should be easy to discern if there's some overflow and / or issue with the math (or, simply, corrupted input). Let me know if I can be of any assistance...

els0r commented

I'm going to bet a Lunch IPA on this being the culprit. Mainly because it confirms my belief about an overflow 😀 . Taken from ./1676937600/meta.json of an interface having 18.45 E drops

    {
      "timestamp": 1676972734,
      "pcap_packets_received": -1,
      "pcap_packets_dropped": -1,
      "pcap_packets_if_dropped": -1,
      "packets_logged": 0,
      "flowcount": 0,
      "traffic": 0
    }

In a ring buffer, if you go back -1 from 0, you'll end up at the highest possible position of that buffer, aka 2^64 - 1 for a uint64 or 0xFFFFFFFFFFFFFFFF. Or it's another phenomenon.

Look closer. What we have in the PacketsDropped is an int, which defaults to int64 in a 64-bit architecture. -1 of an int64 will be represented as - you guessed it 0xFFFFFFFFFFFFFFFF. If you convert that sequence of bits back to a uint64, you'll get the highest possible value for a uint64.

Who knew -1 could be so big 😏 .

The only thing I haven't been able to reproduce is the sequence of events. What I suspect is, that through the use of the binary package in the GPDir Marshal/Unmarshal, we manage to evade the type safety of uint64. And it's not an obvious one, since if you do uint64(-1), the compiler will complain.

If we don't want to go down this rabbit hole, we could simply default back to 0 if a -1 is encountered in the metadata and check if the bug disappears. But for educational purposes, I'd like to get your take on this. Maybe you spot the obvious.

Not gonna bet against it (but I'll take the IPA anyway), had I known that the value could be negative I'd added more safety here. Thing is, yes, the compiler will guard against obvious stupidity such as trying uint64(-1), but it doesn't know the value of an int variable (at compile time) that you are casting to uint64. Easily reproducible with e.g. this (which is almost as dumb and exactly what happens during Marshal()): https://go.dev/play/p/XEo2mcC8N80

var testVal int
testVal = -1

fmt.Println(uint64(testVal)) // <- Happily compiles and prints the aforementioned value. 

I'll add a fix in a PR (forcing -1 to 0).

Rasied fako1024/slimcap#52 for corresponding changes in slimcap. @els0r Please approve linked PR asap so I can bump the version here.

els0r commented

Rasied fako1024/slimcap#52 for corresponding changes in slimcap. @els0r Please approve linked PR asap so I can bump the version here.

Done

els0r commented

Not gonna bet against it (but I'll take the IPA anyway), had I known that the value could be negative I'd added more safety here. Thing is, yes, the compiler will guard against obvious stupidity such as trying uint64(-1), but it doesn't know the value of an int variable (at compile time) that you are casting to uint64. Easily reproducible with e.g. this (which is almost as dumb and exactly what happens during Marshal()): https://go.dev/play/p/XEo2mcC8N80

var testVal int
testVal = -1

fmt.Println(uint64(testVal)) // <- Happily compiles and prints the aforementioned value. 

I'll add a fix in a PR (forcing -1 to 0).

HA! Nice one. Of course the compiler complains if it is a constant because it knows the value on compile time 💡 . Well, at least we didn't blow up a rocket 🤷 .

HA! Nice one. Of course the compiler complains if it is a constant because it knows the value on compile time . Well, at least we didn't blow up a rocket 🤷 .

Or (temporarily) "lose" (i.e. kill) contact with a deep space probe after sending the wrong command(s): https://www.businessinsider.com/nasa-loses-contact-voyager-2-sent-wrong-command-mistake-space-2023-8 🤣