AMQP headers tripping up logstash, workaround not working any more. Also, snags/opportunities with updating elk stack container.
SwooshyCueb opened this issue · 8 comments
We ran into some problems prepping for UGM2022 training. This is more than one issue, but until we can get all our ducks in a row, I'm putting everything I know in here just so we can have it all in one place.
In our elk stack container used for training, AMQP headers are still present when logstash fetches messages from RabbitMQ. (Possibly related: #57)
Our workaround was to put `__BEGIN_JSON__` and `__END_JSON__` around the actual message content, and then use logstash filters to extract the actual message content using these markers:
https://github.com/SwooshyCueb/irods-contrib/blob/68ed88a86d0b84fae7becaf6262565db75c02356/irods_audit_elk_stack/logstash-irods_audit.conf#L20-L36
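The extraction those filters perform can be sketched in stdlib Python (the marker names are from our workaround; the function name is hypothetical):

```python
import json
import re

# Markers the audit plugin wraps around the real message body.
_MARKED = re.compile(rb"__BEGIN_JSON__(.*)__END_JSON__", re.DOTALL)

def extract_payload(raw: bytes):
    """Pull the JSON payload out from between the markers, ignoring
    whatever AMQP header bytes surround it. Returns None when the
    markers are missing or the payload is not valid JSON."""
    m = _MARKED.search(raw)
    if m is None:
        return None
    try:
        return json.loads(m.group(1).decode("utf-8"))
    except (UnicodeDecodeError, ValueError):
        return None
```

This also shows why the workaround is fragile: the bytes outside the markers are harmless here, but filters that treat the whole message as text choke when those header bytes aren't valid UTF-8.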
With iRODS 4.2, this let most messages get through to kibana in our elk stack container, but since the AMQP headers are not always valid UTF-8, the logstash filters sometimes fail.
With iRODS 4.3, the success rate dropped to 0.
I haven't done extensive testing yet, but it appears that updating the elk stack to 7 (more on that later) also results in a 0 success rate, regardless of the iRODS version.
Our goal should be to eliminate the token-based workaround and figure out how to either fetch the messages in such a way that the AMQP headers are not present, or to properly parse/deal with the headers ourselves.
Supposedly logstash does speak AMQP 1.0 so I suspect that we're holding something wrong. Or possibly multiple somethings.
Our elk stack container uses elasticsearch/logstash/kibana 6, which is pretty outdated at this point. Migrating to 8 would be a good idea. (Though, even if we don't, it could use a little TLC.) Follow me for a bit.
Our training has the user create an index pattern. The specific instructions are
- Type "irods_audit" in the index pattern field and click next step.
- Select "@timestamp" in the time filter field name and click create index pattern.
(Tangent: When the user starts typing in the index pattern field, kibana automatically inserts an asterisk after the cursor. The training slide is not clear on whether or not the user is to remove the asterisk.)
With elk 7 and elk 8, kibana does not let me select `@timestamp` (or anything, for that matter) as a filter field name via the web interface. This may be due to the lack of messages actually getting through, but I am not sure.
The training then has the user import a `json` file containing a few saved objects. Index patterns created with the web interface have a randomly generated `id` assigned to them. The `json` file contains references to the index pattern via its `id` at the time it was exported. Since the `id` of the new index pattern and the `id` referenced in the `json` file differ, kibana treats it as a missing reference and asks the user what to do.
(Tangent: The training slide gives no direction on how the user is to handle this.)
Starting with 7, `ndjson` is the preferred format for imported saved objects and the only format for export. Support for importing `json` saved objects was removed in 8.
(Tangent: Each line of an `ndjson` file is a full `json` object. (It is therefore impossible to prettify an `ndjson` file without breaking it, as the parser expects each line to have a complete object.) `ndjson` files do not end with a newline character.)
Also starting with 7, index patterns are considered saved objects and can be imported/exported just like visualizations and dashboards.
Also also starting with 7, there is a `curl`-friendly API for creating saved objects. Saved object `id`s can be manually specified on creation via both `ndjson` import and the API.
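As a sketch of that API (the endpoint shape is Kibana 7's Saved Objects API; the id, title, and Kibana URL below are assumptions for illustration), creating an index pattern with an explicit id so later imports can reference it:

```python
import json
from urllib import request

def build_index_pattern_request(kibana_url: str, pattern_id: str,
                                title: str, time_field: str) -> request.Request:
    """Build a POST to Kibana's Saved Objects API that creates an
    index pattern with an explicit, caller-chosen id (Kibana 7+)."""
    body = json.dumps({
        "attributes": {"title": title, "timeFieldName": time_field}
    }).encode("utf-8")
    return request.Request(
        f"{kibana_url}/api/saved_objects/index-pattern/{pattern_id}",
        data=body,
        method="POST",
        # kbn-xsrf is required on every write request to the Kibana API.
        headers={"kbn-xsrf": "true", "Content-Type": "application/json"},
    )

# Send with:
# request.urlopen(build_index_pattern_request(
#     "http://localhost:5601", "irods_audit", "irods_audit*", "@timestamp"))
```

Because the id is fixed at creation time, any exported dashboard that references it stays resolvable on a fresh container.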
We can eliminate both the missing reference problem and the `@timestamp` problem by including the index pattern in the `ndjson` file instead of having the user create the index pattern in the web interface. I have created such an `ndjson` file, but have not been able to test it properly yet.
Alternatively, we can use the API to create the index pattern, visualizations, and dashboard either on container startup or during image build, rather than providing an `ndjson` file for import.
I have updated our elk stack container to 7 here. Starting with 7, elasticsearch, logstash, and kibana have switched to systemd unit files, whereas in 6 they used `init.d` scripts. As a result, this container doesn't work exactly the same way as our current container, but with a little fiddling, you can get it up and running.
However, what I really think we need to do is use an alpine-based container. I love systemd, but in this case it just overcomplicates things.
Very good - yes, the base problem is that we were always working around the utf8 headers visible to logstash. Parsing that correctly will 'solve' this problem, assuming the messages are good in the first place.
I've updated the elk 7 container to actually run the startup script and to create the index pattern, visualization objects, and dashboard automatically on first run.
elk 8 container here https://github.com/SwooshyCueb/irods-contrib/tree/elk-upd-8-focal/irods_audit_elk_stack
Still using systemd and ubuntu focal.
Also, something I probably should've mentioned before:
You need to pass `--privileged` to `docker run` when using these containers, otherwise systemd falls over immediately.
I had some stuff come up with a housemate so I wasn't able to play around with Qpid Proton on its own today, but I do have some findings to report anyway.
The headers showing up in RabbitMQ is probably normal.
From the `rabbitmq_amqp1_0` README:

> This implementation as a plugin aims for useful interoperability with AMQP 0-9-1 clients. AMQP 1.0 messages can be far more structured than AMQP 0-9-1 messages, which simply have a payload of bytes.
> The way we deal with this is that an AMQP 1.0 message with a single data section will be transcoded to an AMQP 0-9-1 message with just the bytes from that section, and vice versa. An AMQP 1.0 with any other payload will keep exactly that payload (i.e., encoded AMQP 1.0 sections, concatenated), and for AMQP 0-9-1 clients the type field of the basic.properties will contain the value "amqp-1.0".
> Thus, AMQP 0-9-1 clients may receive messages that they cannot understand (if they don't have an AMQP 1.0 codec handy, anyway); however, these will at least be labelled. AMQP 1.0 clients shall receive exactly what they expect.
The web interface is likely fetching messages as an AMQP 0-9-1 client, as this lines up with the behavior described here. It would also explain why we occasionally see AMQP 0-9-1 messages in RabbitMQ.
Logstash does not speak AMQP 1.0, only AMQP 0-9-1, so it's seeing the same thing, hence the binary headers.
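One way to properly deal with those headers ourselves would be to strip the AMQP 1.0 section encoding off the front of the message before handing the payload to the JSON parser. A minimal sketch (descriptor and format-code byte values are from the AMQP 1.0 spec; the function name is hypothetical, and real messages could carry multiple concatenated sections, which this does not handle):

```python
import struct

def strip_amqp10_section(buf: bytes) -> bytes:
    """Strip a single AMQP 1.0 data (0x75) or amqp-value (0x77) section
    header and return the raw payload bytes; pass anything else through."""
    # A described type starts with 0x00 followed by its descriptor;
    # 0x53 means the descriptor is encoded as a smallulong.
    if len(buf) < 5 or buf[0] != 0x00 or buf[1] != 0x53:
        return buf
    if buf[2] not in (0x75, 0x77):      # data / amqp-value sections only
        return buf
    fmt = buf[3]
    if fmt in (0xA0, 0xA1):             # vbin8 / str8-utf8: 1-byte length
        n = buf[4]
        return buf[5:5 + n]
    if fmt in (0xB0, 0xB1) and len(buf) >= 8:   # vbin32 / str32-utf8
        n = struct.unpack(">I", buf[4:8])[0]
        return buf[8:8 + n]
    return buf
```

Something like this could run in a logstash ruby filter before the JSON parse, instead of the marker-based grok extraction.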
Ten years ago SwiftMQ released AMQP 1.0 input and output plugins for Logstash 1.1. As far as I can tell, these plugins were never updated, and were removed from their website at some point between 2017 and 2019. Furthermore, it would seem that the SwiftMQ team has abandoned the idea of Logstash compatibility altogether.
I have not been able to locate a copy of the plugin anywhere on the web.
Qpid Proton speaks only AMQP 1.0.
I see a few possible solutions:
- Send our AMQP 1.0 messages from Qpid Proton in such a way that RabbitMQ can transcode them to AMQP 0-9-1 messages.
- Find a replacement for Logstash that can speak AMQP 1.0.
- Find a replacement for Qpid Proton that can speak AMQP 0-9-1.
- Find a replacement for RabbitMQ that has better AMQP 1.0/AMQP 0-9-1 interoperability.
- Use a different approach altogether.
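On the first option: per the `rabbitmq_amqp1_0` README quoted above, transcoding only happens when the 1.0 message's payload is a single data section. For reference, that wire shape is simple (byte values from the AMQP 1.0 spec; this is a hypothetical helper sketching the target encoding, not a recipe for driving Qpid Proton):

```python
import struct

def encode_single_data_section(payload: bytes) -> bytes:
    """Encode payload as one AMQP 1.0 data section: described type
    marker (0x00), smallulong descriptor 0x75 (amqp:data:binary),
    then a vbin8 or vbin32 binary body."""
    if len(payload) < 256:
        return b"\x00\x53\x75\xa0" + bytes([len(payload)]) + payload
    return b"\x00\x53\x75\xb0" + struct.pack(">I", len(payload)) + payload
```

If our audit messages left Qpid Proton in exactly this shape (one data section and nothing else), RabbitMQ would hand 0-9-1 consumers like logstash the bare payload bytes, with no headers to work around.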
Our initial testing, circa 2015, proved this out with the STOMP protocol and ActiveMQ.
So we'd have... iRODS Audit Plugin -> (AMQP) -> ActiveMQ -> (STOMP) -> Logstash -> Elastic -> Kibana
This article raises some interesting points. For one, we can see the exact header that shows up in our messages. Also, the JMS example they use seems to imply that we'd need to write our own custom parser for whatever picks up the RabbitMQ messages in order to understand them.
Yeah STOMP seems to be the answer.