"decodeUtf8: Invalid UTF-8 stream" when accessing non-UTF8 header data

Question

If a header field contains binary string data that is not valid UTF-8, accessing the resulting FVString throws an exception, as shown in https://github.com/woffs/haskell-amqp-utils/issues/1

Unfortunately it is possible to have such a situation in real life with rabbitmq.

Shouldn't FVString therefore be a ByteString rather than Text? Or should the decodeUtf8 exception be caught?
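
For illustration, here is a minimal sketch of the second option (catching the exception), assuming only the `text` and `bytestring` packages. `safeDecode` is a hypothetical helper name, not something the amqp library provides:

```haskell
import Control.Exception (evaluate, try)
import qualified Data.ByteString as BS
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8)
import Data.Text.Encoding.Error (UnicodeException)

-- Hypothetical helper: force the decode inside IO so that the
-- UnicodeException raised by decodeUtf8 on invalid input can be caught
-- instead of surfacing later when the Text is first used.
safeDecode :: BS.ByteString -> IO (Either UnicodeException T.Text)
safeDecode = try . evaluate . decodeUtf8
```

With this, bytes such as `BS.pack [0xFF, 0xFE]` should come back as a `Left` rather than crashing the consumer.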

Answer 1 · 2019-12-18T17:14:05.000Z

It seems that the official C# library treats it as a raw byte sequence rather than UTF-8: https://github.com/rabbitmq/rabbitmq-dotnet-client/blob/32dd86c6b9aeb1adc989a53e420ab5696973b771/projects/client/RabbitMQ.Client/src/client/impl/WireFormatting.cs#L124

So they probably know what they're doing. Still a bit confusing, since there already is FVByteArray which now becomes redundant.

I'll try fixing it and posting a new release soon.

Answer 2 · 2019-12-18T17:35:20.000Z

Thank you so much for the quick response and care! This is great open source software.

Answer 3 · 2019-12-18T17:46:20.000Z

I've pushed 0.19.0 to Hackage. Unfortunately I couldn't test it since RabbitMQ currently doesn't work on my machine for some unknown reason. Would be great if you could see if everything works fine.

Answer 4 · 2019-12-18T18:15:55.000Z

It works! It does not crash anymore.
Now I must find out on my side how I want to handle binary vs. UTF8 data in headers and look for a failsafe conversion to a String. Maybe I'll just try to decodeUtf8 and do BS.unpack as fallback.
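
A rough sketch of that fallback, assuming the header value is already available as a strict ByteString. `headerToString` is a hypothetical name, and the sketch uses `Data.ByteString.Char8.unpack` for the fallback because plain `BS.unpack` yields `[Word8]` rather than a String:

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as BC
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8')

-- Hypothetical helper: convert raw header bytes to a String without throwing.
-- Valid UTF-8 is decoded normally; anything else falls back to mapping each
-- byte to the Char with the same code point (a Latin-1 style reading).
headerToString :: BS.ByteString -> String
headerToString bs =
  case decodeUtf8' bs of
    Right txt -> T.unpack txt   -- input was valid UTF-8
    Left _    -> BC.unpack bs   -- invalid UTF-8: byte-wise fallback
```

The fallback is lossy in the sense that the result no longer says which path was taken; returning an `Either ByteString String` instead might be preferable if callers need to tell binary data from text.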

Anyway, thanks a lot!

Answer 5 · 2019-12-18T18:24:26.000Z

Thanks for testing!