Segmentation Fault on TLS data types
mjtooley opened this issue · 9 comments
I am getting a segmentation fault in nfv9.c when processing a flow record with TLS data types in it. The net flow with et-analytics enabled was generated by a Cisco CSR1000v running on the AWS.
It takes a segmentation fault in nfv9_process_flow_record for any of the TLS types.
ip-10-0-0-106#show version
Cisco IOS XE Software, Version 16.09.01
Cisco IOS Software [Fuji], Virtual XE Software (X86_64_LINUX_IOSD-UNIVERSALK9-M), Version 16.9.1, RELEASE SOFTWARE (fc2)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2018 by Cisco Systems, Inc.
Compiled Tue 17-Jul-18 16:57 by mcpre
See packets 20 & 22 in the attached file.
crs1000.pcap.zip
Matt
Matt, what version fo the JOY software are you using? Also do you have a dump of the segfault?
I am using version 3.0.0. When I did a backtrace it showed it was faulting on line 529 of nfv9.c. I will try and upload the segfault. I am not the most proficient with gdb, but it looked to me to be having an issue with the flowdata pointer. The pointer wasn’t null, and what it was pointing to in memory matched the contents of the netflow packet. So I was a bit stumped.
It works fine on the IDP and SPLT field types.
Matt,
I tried to reproduce this issue with the pcap you sent along. The problem does not occur. I checked the code from 3.0.0 versus what is there now and there isn't any difference between latest and 3.0.0. Line 529 in nfv9.c is where it is processing TLS sequence of records, lengths and times. I looked at the packet capture for packet 20 and the templates seemed to be filled out correctly with the right field IDs.
It would be great if you could provide the stack trace dump. As an aside, there has been a fair amount of cleanup done around memory handling. It might be worthwhile moving to 4.0.1 (latest) on master and retesting.
Ok, I reproduced it with v4.0.0
This is what I did
- git clone https://github.com/cisco/joy
- git checkout v4.0.0
- ./config.sh
- /configure
- make clean;make
- ./bin/joy bidir=1 dist=1 classify=1 nfv9_port=4739 verbosity=2 crs1000.pcap
I have attached the core dump file as well as the pcap file. When I looked at in gdb it looked to still be crashing in the same spot. So I don’t know if I am doing something wrong or what?
Matt, I will be out of the office tomorrow and all next week. I will take a look at it when I return after the holiday.
Matt, I think I have it figured out. See if the changes I made in nfv9.c makes everything better for you. Here is the diff:
diff --git a/src/nfv9.c b/src/nfv9.c
index 8239d17..61278f4 100644
--- a/src/nfv9.c
+++ b/src/nfv9.c
@@ -520,6 +520,16 @@ void nfv9_process_flow_record (flow_record_t *nf_record,
flow_data += htons(cur_template->fields[i].FieldLength);
break;
case TLS_SRLT:
-
/* if TLS structure is NULL get one */
-
if (nf_record->tls == NULL) {
-
tls_init(&nf_record->tls);
-
/* if still NULL bail on this processing */
-
if (nf_record->tls == NULL) {
-
flow_data += htons(cur_template->fields[i].FieldLength);
-
break;
-
}
-
}
-
total_ms = 0; for (j = 0; j < 20; j++) { if (htons(*(const short *)(flow_data+j*2)) == 0) {
@@ -540,6 +550,16 @@ void nfv9_process_flow_record (flow_record_t *nf_record,
flow_data += htons(cur_template->fields[i].FieldLength);
break;
case TLS_CS:
-
/* if TLS structure is NULL get one */
-
if (nf_record->tls == NULL) {
-
tls_init(&nf_record->tls);
-
/* if still NULL bail on this processing */
-
if (nf_record->tls == NULL) {
-
flow_data += htons(cur_template->fields[i].FieldLength);
-
break;
-
}
-
}
-
for (j = 0; j < 125; j++) { if (htons(*(const short *)(flow_data+j*2)) == 65535) { break;
@@ -551,6 +571,16 @@ void nfv9_process_flow_record (flow_record_t *nf_record,
flow_data += htons(cur_template->fields[i].FieldLength);
break;
case TLS_EXT:
-
/* if TLS structure is NULL get one */
-
if (nf_record->tls == NULL) {
-
tls_init(&nf_record->tls);
-
/* if still NULL bail on this processing */
-
if (nf_record->tls == NULL) {
-
flow_data += htons(cur_template->fields[i].FieldLength);
-
break;
-
}
-
}
-
for (j = 0; j < 35; j++) { if (htons(*(const short *)(flow_data+j*2)) == 0) { break;
@@ -564,20 +594,60 @@ void nfv9_process_flow_record (flow_record_t *nf_record,
flow_data += htons(cur_template->fields[i].FieldLength);
break;
case TLS_VERSION:
-
/* if TLS structure is NULL get one */
-
if (nf_record->tls == NULL) {
-
tls_init(&nf_record->tls);
-
/* if still NULL bail on this processing */
-
if (nf_record->tls == NULL) {
-
flow_data += htons(cur_template->fields[i].FieldLength);
-
break;
-
}
-
}
-
nf_record->tls->version = *(const char *)flow_data; flow_data += htons(cur_template->fields[i].FieldLength); break; case TLS_CLIENT_KEY_LENGTH:
-
/* if TLS structure is NULL get one */
-
if (nf_record->tls == NULL) {
-
tls_init(&nf_record->tls);
-
/* if still NULL bail on this processing */
-
if (nf_record->tls == NULL) {
-
flow_data += htons(cur_template->fields[i].FieldLength);
-
break;
-
}
-
}
-
nf_record->tls->client_key_length = htons(*(const short *)flow_data); flow_data += htons(cur_template->fields[i].FieldLength); break; case TLS_SESSION_ID:
-
/* if TLS structure is NULL get one */
-
if (nf_record->tls == NULL) {
-
tls_init(&nf_record->tls);
-
/* if still NULL bail on this processing */
-
if (nf_record->tls == NULL) {
-
flow_data += htons(cur_template->fields[i].FieldLength);
-
break;
-
}
-
}
-
nf_record->tls->sid_len = htons(*(const short *)flow_data); nf_record->tls->sid_len = min(nf_record->tls->sid_len,256); memcpy(nf_record->tls->sid, flow_data+2, nf_record->tls->sid_len); flow_data += htons(cur_template->fields[i].FieldLength); break; case TLS_HELLO_RANDOM:
-
/* if TLS structure is NULL get one */
-
if (nf_record->tls == NULL) {
-
tls_init(&nf_record->tls);
-
/* if still NULL bail on this processing */
-
if (nf_record->tls == NULL) {
-
flow_data += htons(cur_template->fields[i].FieldLength);
-
break;
-
}
-
}
-
memcpy(nf_record->tls->random, flow_data, 32); flow_data += htons(cur_template->fields[i].FieldLength); break;
Thanks - I tested it and it seems to have fixed it.