hreinhardt/amqp

"decodeUtf8: Invalid UTF-8 stream" when accessing non-UTF8 header data

Closed this issue · 5 comments

woffs commented

If a header field contains binary string data that is not valid UTF-8, accessing this FVString throws an exception, as shown in https://github.com/woffs/haskell-amqp-utils/issues/1

Unfortunately, this situation can occur in real life with RabbitMQ.

Shouldn't FVString therefore be a ByteString rather than Text? Or should the decodeUtf8 exception be caught?
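For illustration, here is a minimal sketch of the "catch it" option using only the `text` and `bytestring` packages. It uses `decodeUtf8'`, which returns a `Left` instead of throwing. The names `invalidHeader` and `safeDecode` are hypothetical and not part of the amqp library.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text as T
import           Data.Text.Encoding (decodeUtf8')
import           Data.Text.Encoding.Error (UnicodeException)

-- Hypothetical header payload: "foo" followed by 0xFF.
-- The byte 0xFF never occurs in valid UTF-8, so strict decoding fails.
invalidHeader :: BS.ByteString
invalidHeader = BS.pack [0x66, 0x6F, 0x6F, 0xFF]

-- Total decoding: invalid input yields Left rather than an exception,
-- unlike decodeUtf8, which throws when the resulting Text is forced.
safeDecode :: BS.ByteString -> Either UnicodeException T.Text
safeDecode = decodeUtf8'
```

With this, `safeDecode invalidHeader` is a `Left`, while `safeDecode (BS.pack [0x66, 0x6F, 0x6F])` is `Right` of the text `foo`. Whether the library should do this internally or expose the raw bytes is the design question raised above.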

It seems that the official C# library handles it like a raw-byte-sequence and not UTF-8: https://github.com/rabbitmq/rabbitmq-dotnet-client/blob/32dd86c6b9aeb1adc989a53e420ab5696973b771/projects/client/RabbitMQ.Client/src/client/impl/WireFormatting.cs#L124

So they probably know what they're doing. Still a bit confusing, since there already is FVByteArray which now becomes redundant.

I'll try fixing it and posting a new release soon.

woffs commented

Thank you so much for the quick response and care! This is great open source software.

I've pushed 0.19.0 to Hackage. Unfortunately I couldn't test it since RabbitMQ currently doesn't work on my machine for some unknown reason. Would be great if you could see if everything works fine.

woffs commented

It works! It does not crash anymore.
Now I need to figure out on my side how I want to handle binary vs. UTF-8 data in headers and find a fail-safe conversion to a String. Maybe I'll just try decodeUtf8 and use BS.unpack as a fallback.
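A rough sketch of that fallback idea, assuming the header value is already available as a `ByteString`. The function name `headerToString` is hypothetical. It also assumes the `unpack` meant here is the one from `Data.ByteString.Char8`, which maps each byte to the `Char` with the same code point; the `unpack` from plain `Data.ByteString` returns `[Word8]`, not a `String`.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as BC
import qualified Data.Text as T
import           Data.Text.Encoding (decodeUtf8')

-- Try strict UTF-8 decoding first. If the bytes are not valid UTF-8,
-- fall back to a byte-per-Char (Latin-1 style) interpretation.
headerToString :: BS.ByteString -> String
headerToString bs =
  case decodeUtf8' bs of
    Right txt -> T.unpack txt   -- valid UTF-8
    Left _    -> BC.unpack bs   -- fallback: one Char per byte
```

One consequence of this choice is that the result does not record which path was taken, so a caller cannot tell decoded text from raw bytes afterwards. Returning an `Either` instead would keep that distinction.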

Anyway, thanks a lot!

Thanks for testing!