hamba/avro

Problem with empty slices vs nil.

Closed this issue · 8 comments

When using a []byte in a struct and doing a marshal / unmarschal with a nil value in []byte it is changed to an empty slice. Should []byte (aka []uint8) not be supported for Avro union (["null", "bytes"]), so we can keep the difference between the nil and the empty slice?

When using a []byte in a struct and doing a marshal / unmarschal with a nil value in []byte it is changed to an empty slice.

There is no significant difference between a nil and empty slice in Go. They are interchangeable. What are you trying to preserve with nil slices?

hould []byte (aka []uint8) not be supported for Avro union (["null", "bytes"]), so we can keep the difference between the nil and the empty slice?

This would be a *[]byte which would be nil in this case. There could be special handling for slices in this case, I have not seen a use case for this yet. Perhaps you could explain yours?

No this is not a *[]byte. That would be a pointer to a slice, which is something different.

var b []byte // b == nil while b:=[]byte{} // b != nil (see: https://go.dev/play/p/Q4A3D0ChKt6 for some examples).

There are in fact differences:

  1. It uses (more) memory for the empty slice, while nil does not.
  2. You can use the nil to check if a slice is initialized (materialized?)
  3. It is not equal when you test the marshal/un-marshal of a struct containing a nil '[]byte'.

What you refer to is that both have a length of 0 I think.

What is your proposal here?

I think that it should be possible to have byte slice being supported for the union ["null", "bytes"]. I guess that should be possible for other (native?) slice types too.

The current problem for us is that if I have the above union in the schema, I get the '[]uint8 is unsupported for Avro union' error. I think this could be supported and if the value really is nil, the null is used.

If you want to change the schema generation or have that optional is not my concern as we do not use that and have our own schema generator as a proprietary tool that generates our "avrox" schemer (https://github.com/metatexx/avrox).

The current problem for us is that if I have the above union in the schema, I get the '[]uint8 is unsupported for Avro union' error. I think this could be supported and if the value really is nil, the null is used.

This is currently supported using *[]byte which is following the convention in the library of [null, X] => *X. It would be possible to have a special case for slices and maps. Would welcome a PR to that effect.

We can't use *[]byte as this is something different.

Having *[]byte 'only' encoded as [null, X] could be a problem too (see: https://go.dev/play/p/Yh_6puHgk-h). I think that there is no meaningful way to solve that. It also looks "right" at the first glance.

I am not sure how easy it is to make a PR for the special handling of slices and maps. From the error message it seems to me that it does not happen inside codec_array.go but needs to be checked a step earlier. I am not sure how long I would need to understand the code well enough to create a PR.

P.S.: I think we would like to sponsor this project, but I do not see a way for doing that.

You would need to look at codec_union.go. I will look into this when I get a moment.

Sponsorships do not work in my country, no stripe here.

Can you give this branch a try and see if it does what you think it should: #372