JustinAzoff/flow-indexer

nfdump output is broken for ipv6

Closed this issue · 6 comments

By default, nfdump will truncate v6 addresses, which will cause errors such as:

Non fatal read error: Invalid IP Address "2610:a5..:c05::f"

I'm not 100% sure if it's related or not, but we're seeing a ton of defunct processes:

32683 ? Z 0:00 [nfdump]
32687 ? Z 0:00 [nfdump]
32692 ? Z 0:00 [nfdump]
32694 ? Z 0:00 [nfdump]
32697 ? Z 0:00 [nfdump]
32698 ? Z 0:00 [nfdump]
32702 ? Z 0:00 [nfdump]
32703 ? Z 0:00 [nfdump]
32704 ? Z 0:00 [nfdump]
32712 ? Z 0:00 [nfdump]
32721 ? Z 0:00 [nfdump]
32724 ? Z 0:00 [nfdump]

From nfdump's man page:

To make the output more readable, IPv6 addresses are shrinked down to 16 characters. The seven most and seven least digits connected with two dots '..' are displayed in any normal output formats. To display the full IPv6 address, use the appropriate long format, which is the format name followed by a 6.

Example: -o line displays an IPv6 address as 2001:23..80:d01e where as the format -o line6 displays the IPv6 address in full length 2001:234:aabb::211:24ff:fe80:d01e. The combination of -o line -6 is equivalent to -o line6.
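For anyone reproducing this, here is a minimal Go sketch of why the shortened form breaks parsing. It uses the two example addresses from the man page excerpt above; `isTruncated` is a hypothetical helper for illustration, not something in flow-indexer.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// isTruncated reports whether s contains the ".." marker that nfdump
// inserts when it shortens an IPv6 address (hypothetical helper).
func isTruncated(s string) bool {
	return strings.Contains(s, "..")
}

func main() {
	for _, s := range []string{
		"2001:23..80:d01e",                  // short form from the man page
		"2001:234:aabb::211:24ff:fe80:d01e", // full form from the man page
	} {
		// net.ParseIP returns nil for anything that is not a valid address.
		fmt.Printf("%-36s truncated=%v parses=%v\n",
			s, isTruncated(s), net.ParseIP(s) != nil)
	}
}
```

The shortened form is not a valid address, so any strict parser will reject it.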

What version are you running there? That csv version of the nfdump backend is not the one in use by default anymore. I wrote a new backend last week that is a lot better.

This is the one that is used now:

func (b NFDUMPBackend) ExtractIps(reader io.Reader, ips *ipset.Set) (uint64, error) {
cmd := exec.Command("nfdump", "-qr", "-", "-o", "fmt:%sa %da")

It uses nfdump -o fmt:%sa %da, which is 10x faster than nfdump -o csv. The docs don't mention a sa6 or da6, so I don't think it should have the ipv6 issue.

I guess the random slice of flows I grabbed+anonymized for backend/test_data/nfdump.data didn't have any v6 flows, otherwise I would have caught this. I should get some more test files that contain v6 flows so this is really tested properly.

I think the zombie procs are related: I only call cmd.Wait() at the end, and only if there was no error reading the file. I need to rework some things to always call cmd.Wait(). Also, now that I think of it, it should probably just log the errors from ips.AddString and not return. The whole 'non fatal error' handling has been tricky. It's there because truncated log files tend to have broken records at the end.
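A rough sketch of what that rework could look like, under some assumptions: `parseLine` and `run` are hypothetical names, and `add` stands in for ips.AddString. The deferred function reaps the child on every return path, and bad lines are logged and skipped instead of aborting the read.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"log"
	"net"
	"os/exec"
	"strings"
)

// parseLine splits one "src dst" output line into two addresses.
func parseLine(line string) (src, dst net.IP, err error) {
	fields := strings.Fields(line)
	if len(fields) != 2 {
		return nil, nil, fmt.Errorf("expected 2 fields, got %d in %q", len(fields), line)
	}
	src, dst = net.ParseIP(fields[0]), net.ParseIP(fields[1])
	if src == nil || dst == nil {
		return nil, nil, fmt.Errorf("invalid IP address in %q", line)
	}
	return src, dst, nil
}

// run starts cmd, passes every parsed address to add, and always waits
// for the child so it cannot be left behind as a zombie.
func run(cmd *exec.Cmd, add func(net.IP)) (n uint64, err error) {
	stdout, err := cmd.StdoutPipe()
	if err != nil {
		return 0, err
	}
	if err = cmd.Start(); err != nil {
		return 0, err
	}
	defer func() {
		// Drain leftover output first so the child is not blocked on a full pipe.
		io.Copy(io.Discard, stdout)
		if werr := cmd.Wait(); werr != nil && err == nil {
			err = werr
		}
	}()
	sc := bufio.NewScanner(stdout)
	for sc.Scan() {
		src, dst, perr := parseLine(sc.Text())
		if perr != nil {
			log.Printf("non-fatal read error: %v", perr) // log and keep going
			continue
		}
		add(src)
		add(dst)
		n++
	}
	return n, sc.Err()
}

func main() {
	// Sample lines taken from this thread; the second one is truncated.
	for _, line := range []string{"1.1.1.9 1.1.2.5", "2211:142..00:1::c 2000:141..1110::3"} {
		if _, _, err := parseLine(line); err != nil {
			log.Printf("non-fatal read error: %v", err)
		}
	}
}
```

In the backend, `run` would be handed the existing exec.Command("nfdump", ...) value with its stdin already wired to the reader; that wiring is left out here.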

We were running the latest version, but I guess my fix doesn't work, then.

It looks like %sa and %da still do the stupid truncation:

1.1.1.9 1.1.2.5
1.1.1.9 1.1.2.5
2211:142..00:1::c 2000:141..1110::3

This seems to work for me:

-o 'fmt:%sa %da' -6
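On the Go side, this would presumably amount to appending -6 to the argument list the backend already passes. A small sketch, where `nfdumpArgs` is a hypothetical helper and the other arguments are copied from the exec.Command call quoted earlier in the thread:

```go
package main

import (
	"fmt"
	"os/exec"
)

// nfdumpArgs returns the arguments from the existing backend call plus -6,
// which asks nfdump to print IPv6 addresses in full (hypothetical helper).
func nfdumpArgs() []string {
	// No shell is involved, so the format string needs no extra quoting.
	return []string{"-qr", "-", "-o", "fmt:%sa %da", "-6"}
}

func main() {
	cmd := exec.Command("nfdump", nfdumpArgs()...)
	fmt.Println(cmd.Args)
}
```

Keeping the arguments in one place also makes it easy to assert in a test that -6 is still being passed.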

ah, I see

The combination of -o line -6 is equivalent to -o line6.

so yeah, that looks right.

So -o 'fmt:%sa %da' -6 works the right way?

gah, I never fully fixed this.

I can put in a fix for this, but we don't seem to have any ipv6 netflow right now. Can you grab 1000 flows with nfdump -c 1000, run them through nfanon, and add the result to the ticket? Then I can use it as a test case to ensure v6 doesn't break.