logstash-plugins/logstash-filter-xml

Force_array is not applied correctly.

radoondas opened this issue · 7 comments

Hi, it seems that force_array is not behaving as it should.

Environment information:
Logstash 5.3

I am parsing XML with the xml filter. Parsing works, except that single values are stored in arrays even with the option force_array => false. The documentation says:

"By default the filter will force single elements to be arrays. Setting this to false will prevent storing single elements in arrays."

My questions: Is this behaviour expected?
The values I extract are attributes inside the XML tags (selected with @ in the xpath expressions), not element content. Does this option apply to them as well?

Configuration:

xml {
  store_xml => "false"
  source => "message"
  remove_namespaces => "true"
  force_array => "false"
  xpath => [
    "af/@timestamp","[@metadata][timestamp]",
    "af/gc/tenured/@freebytes","freeafter_bytes",
    "af/gc/timesms/@mark","gc_mark_ms",
    "af/gc/timesms/@Sweep","gc_sweep_ms",
    "af/gc/timesms/@ToTal","gc_total_ms"
  ]
  remove_field => [ "message" ]
}
Every parsed field holds a single value, but the filter still wraps each one in an array:

{
    "freeafter_bytes" => [
        [0] "5915011776"
    ],
    "gc_total_ms" => [
        [0] "1409.834"
    ],
    "gc_sweep_ms" => [
        [0] "78.448"
    ],
    "gc_mark_ms" => [
        [0] "982.531"
    ]
}

Example XML:

As a workaround I use "mutate", but could you please clarify how force_array is supposed to work? Does it behave as I expect, or am I wrong and should I use the mutate filter to convert each array into a single value?

mutate {
  replace => {
    "freeafter_bytes" => "%{[freeafter_bytes][0]}"
    "gc_mark_ms" => "%{[gc_mark_ms][0]}"
    "gc_sweep_ms" => "%{[gc_sweep_ms][0]}"
    "gc_total_ms" => "%{[gc_total_ms][0]}"
    "[@metadata][timestamp]" => "%{[@metadata][timestamp][0]}"
  }
}

{
    "freeafter_bytes" => "5933611912",
    "gcpolicy" => "optthruput",
    "gc_total_ms" => "1505.908",
    "gc_sweep_ms" => "69.508",
    "gc_mark_ms" => "1095.859"
}
Thanks!

The same for Logstash 5.4

Issue still persists on Logstash 5.5.0

The force_array parameter only comes into play when store_xml is true, because the option is passed directly to the XmlSimple instance that builds the stored document. It has no effect on xpath results.

My patch fixes a few remaining TODOs in the Ruby file and also reuses this parameter when store_xml is false.

If the xpath query results in an array with a single value and force_array is false, the patch stores that single element as the field value instead of the whole array.
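
The rule described above can be sketched in a few lines of plain Ruby. This is only an illustration of the idea, not the code from the patch: the helper name normalize_xpath_values and its arguments are hypothetical, and the sketch assumes the xpath matches have already been collected into an array of strings.

```ruby
# Hypothetical helper, not the plugin's actual code: decide what to store
# in the event field for a list of xpath matches.
def normalize_xpath_values(values, force_array)
  # With force_array enabled, always keep the array, even for one match.
  return values if force_array

  # With force_array disabled, unwrap only when there is exactly one match;
  # empty results and multiple matches stay arrays.
  values.length == 1 ? values.first : values
end

p normalize_xpath_values(["5915011776"], false)  # => "5915011776"
p normalize_xpath_values(["5915011776"], true)   # => ["5915011776"]
p normalize_xpath_values(["a", "b"], false)      # => ["a", "b"]
```

Leaving multi-match results as arrays means a field only becomes a scalar when nothing is lost by unwrapping it, which is also what the mutate workaround assumes when it reads index [0].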

Any feedback? If necessary I can rewrite it, just let me know what needs to be done!

Hello, I'm looking for this feature. I'm using Logstash 6.3.2.

+1 on this

jsvd commented

fixed by #57