encoding compatibility errors
sclasen opened this issue · 7 comments
When writing UTF-8 characters in a Kafka message, we are getting:
Encoding::CompatibilityError: incompatible character encodings: ASCII-8BIT and UTF-8
app error: Error writting messages_for_topics in Poseidon::Protocol::ProduceRequest (Poseidon::Protocol::ProtocolStruct::EncodingError: Error writting messages_for_partitions in Poseidon::Protocol::MessagesForTopic (Poseidon::Protocol::ProtocolStruct::EncodingError: Error writting message_set in Poseidon::Protocol::MessagesForPartition (Poseidon::Protocol::ProtocolStruct::EncodingError: Error writting messages in Poseidon::Protocol::MessageSetStructWithSize (Poseidon::Protocol::ProtocolStruct::EncodingError: Error writting message in Poseidon::Protocol::MessageWithOffsetStruct (Poseidon::Protocol::ProtocolStruct::EncodingError: Error writting value in Poseidon::Protocol::MessageStruct (Encoding::CompatibilityError: incompatible character encodings: ASCII-8BIT and UTF-8)))))) (Poseidon::Protocol::ProtocolStruct::EncodingError)
Probably due to
https://github.com/bpot/poseidon/blob/master/lib/poseidon/protocol/request_buffer.rb#L10
Can/Should that encoding be made configurable?
That string needs to be ASCII-8BIT because it will hold binary data. I came across some weird encoding behavior (possibly a bug?) when trying to reproduce this. If you append an invalid UTF-8 string to an empty ASCII-8BIT string, it works, but the resulting string is UTF-8:
irb(main):001:0> s = ''.encode("ASCII-8BIT")
=> ""
irb(main):002:0> n = "hello\xffasdf"
=> "hello\xFFasdf"
irb(main):003:0> n.encoding
=> #<Encoding:UTF-8>
irb(main):004:0> n.valid_encoding?
=> false
irb(main):005:0> s.encoding
=> #<Encoding:ASCII-8BIT>
irb(main):006:0> s << n
=> "hello\xFFasdf"
irb(main):007:0> s.encoding
=> #<Encoding:UTF-8>
To work around this I'm going to force all incoming strings to be ASCII-8BIT.
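For readers following along, here is a minimal sketch of what "forcing incoming strings to ASCII-8BIT" could look like. The helper name `to_binary` is hypothetical and is not taken from Poseidon's source; it only illustrates the idea of retagging a copy of each string as binary before appending it to a binary buffer.

```ruby
# Hypothetical helper (not Poseidon's actual code): return a copy of the
# string tagged as ASCII-8BIT. The bytes are unchanged; only the encoding
# label differs. Working on a copy leaves the caller's string untouched.
def to_binary(str)
  str.dup.force_encoding(Encoding::BINARY)
end

# A buffer explicitly tagged as binary.
buffer = String.new.force_encoding(Encoding::BINARY)

# Valid UTF-8 with a multi-byte character, then the invalid UTF-8
# string from the irb session above.
buffer << to_binary("h\u00e9llo")
buffer << to_binary("hello\xffasdf")

puts buffer.encoding
puts buffer.bytesize
```

Because both sides of each `<<` are ASCII-8BIT, Ruby never has to pick a compatible encoding, so the buffer keeps its binary tag no matter what bytes are appended.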
@sclasen can you try with the latest master and see if that fixes your issue?
Thanks for reporting this!
@bpot Dang, that blows up in some cases with:
class=Poseidon::Protocol::ProtocolStruct::EncodingError message="Error writting common in Poseidon::Protocol::MetadataRequest (Poseidon::Protocol::ProtocolStruct::EncodingError: Error writting client_id in Poseidon::Protocol::RequestCommon (RuntimeError: can't modify frozen String))
protocol_struct.rb:97:in `rescue in block (3 levels) in write
Okay, can you try again? It should handle frozen strings now.
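As a rough illustration of the frozen-string failure and one way around it: `force_encoding` modifies its receiver, so it raises on a frozen string such as a frozen `client_id`. The sketch below assumes a hypothetical helper named `ensure_binary` (not necessarily what the fix in Poseidon does) that duplicates the string only when it is frozen.

```ruby
# Hypothetical helper (an assumption, not Poseidon's actual fix):
# retag a string as ASCII-8BIT, copying first if it is frozen.
# Note that a non-frozen argument is retagged in place.
def ensure_binary(str)
  return str if str.encoding == Encoding::BINARY
  str = str.dup if str.frozen?   # dup returns an unfrozen copy
  str.force_encoding(Encoding::BINARY)
end

client_id = "my_client".freeze

# Calling force_encoding directly on the frozen string raises
# (RuntimeError on older Rubies, FrozenError on newer ones,
# which is a subclass of RuntimeError).
begin
  client_id.force_encoding(Encoding::BINARY)
rescue RuntimeError => e
  puts "direct call failed: #{e.class}"
end

# The helper works on a copy instead.
puts ensure_binary(client_id).encoding
```

The trade-off in this sketch is that unfrozen input is mutated to avoid a copy; always calling `dup` would be the safer choice if callers reuse their strings.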