default values for objects in variable length arrays
cfal opened this issue · 6 comments
Hi there, I realize that v2 doesn't support default values - but I'm wondering the best way to do this, or if it's possible even in combination with go-defaults or other custom code.
I am trying to decode a TOML document that contains a variable length array of objects - and I'd like to prefill the field values as recommended in the readme.
e.g., in this example, I'd like C to be set to -1 when the key is not present:
package main

import (
	"bytes"
	"fmt"

	"github.com/pelletier/go-toml/v2"
)

func main() {
	type varData struct {
		A int
		B int
		C int
	}
	type data struct {
		VarData []varData
	}
	var d data
	tomlBlob := []byte(`
[[VarData]]
A = 1
B = 2
[[VarData]]
A = 2
B = 1
`)
	toml.NewDecoder(bytes.NewReader(tomlBlob)).Decode(&d)
	fmt.Printf("%+v\n", d)
}
It seems like with a library like go-defaults, you'd still need to call a SetDefaults() method, and there isn't a good way to do this in advance for variable length array objects.
An option seems to be to use encoding.TextUnmarshaler, perhaps? I was thinking something like this:
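func (v *VarData) UnmarshalText(text []byte) error {
	// Prefill defaults, then let the decoder overwrite whatever is present.
	v.SetDefaults()
	// NewDecoder takes an io.Reader, and v is already a pointer.
	return toml.NewDecoder(bytes.NewReader(text)).Decode(v)
}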
..but I was also curious, with the TextUnmarshaler approach (if it works?), how you would know that the provided text []byte parameter is TOML coming from a go-toml invocation, and not the result of decoding some other format. I.e., this would error out if we were trying to decode a JSON string instead.
Thank you!
I stumbled upon #484 and its test, and noticed that in fact UnmarshalText does not get called when items are defined in this manner:
package main

import (
	"fmt"
	"strconv"

	"github.com/pelletier/go-toml/v2"
)

type Integer struct {
	Value int
}

func (i Integer) MarshalText() ([]byte, error) {
	return []byte(strconv.Itoa(i.Value)), nil
}

func (i *Integer) UnmarshalText(data []byte) error {
	fmt.Println("NEVER CALLED")
	conv, err := strconv.Atoi(string(data))
	if err != nil {
		return err
	}
	i.Value = conv
	return nil
}

type Config struct {
	Integers []Integer
}

func main() {
	raw := []byte(`
[[Integers]]
Value = 3
[[Integers]]
Value = 4
`)
	var cfg Config
	fmt.Println(toml.Unmarshal(raw, &cfg))
	fmt.Printf("%#v", cfg)
}
results in:
<nil>
main.Config{Integers:[]main.Integer{main.Integer{Value:3}, main.Integer{Value:4}}}
Program exited.
Sorry for taking so long to respond!
The encoding.TextUnmarshaler interface is only used to decode a specific type from a TOML string (same behavior as encoding/json). To achieve what you want, we probably need to bring back a toml.Unmarshaler interface that would behave like the first example you gave:
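func (v *VarData) UnmarshalTOML(text []byte) error {
	// Prefill defaults, then let the decoder overwrite whatever is present.
	v.SetDefaults()
	// NewDecoder takes an io.Reader, and v is already a pointer.
	return toml.NewDecoder(bytes.NewReader(text)).Decode(v)
}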
I'll think about it. I dropped support for it when switching to v2 because it wasn't really used and was poorly defined at the time, but it may be worth rebuilding this feature in the new codebase.
Thanks for replying! Makes sense, I realized that this wouldn't work as I expected. I saw that BurntSushi/toml also has an UnmarshalTOML(interface{}) function that allows for this behavior, but it does require quite a bit of extra boilerplate.
It seems like even with UnmarshalTOML in your example, we wouldn't be able to set the defaults properly for variable length arrays, since we have no idea how long the array will end up being.
One solution would be if it was somehow possible to define how new objects are constructed (e.g. being able to instruct the decoder to call a NewVarData function), but I'm not familiar with Go reflection and not sure if that's possible.
another hacky idea is to parse twice:
func main() {
	raw := []byte(`
[[Integers]]
Value = 3
[[Integers]]
Value = 4
`)
	var cfg Config
	fmt.Println(toml.Unmarshal(raw, &cfg))
	// at this point, we know the length of cfg.Integers,
	// apply all defaults (SetDefaults is assumed to exist on *Integer)
	for i := range cfg.Integers {
		cfg.Integers[i].SetDefaults()
	}
	// now parse again
	fmt.Println(toml.Unmarshal(raw, &cfg))
	fmt.Printf("%#v", cfg)
}
haven't yet tested if this works.. 🙂
Since we are talking about hacks, how about using a "serialization field" that is only used to know whether there is an actual value in the document, and set the default or value after unmarshaling?
package main

import (
	"fmt"

	"github.com/pelletier/go-toml/v2"
)

type Integer struct {
	OptValue *int `toml:"Value"`
	V        int
}

type Config struct {
	Integers []Integer
}

func main() {
	raw := []byte(`
[[Integers]]
Value = 3
[[Integers]]
Value = 4
[[Integers]] # should have a default
`)
	var cfg Config
	fmt.Println(toml.Unmarshal(raw, &cfg))
	for i, x := range cfg.Integers {
		if x.OptValue == nil {
			cfg.Integers[i].V = -1
		} else {
			cfg.Integers[i].V = *x.OptValue
		}
	}
	fmt.Printf("%#v", cfg)
}
# main.Config{Integers:[]main.Integer{main.Integer{OptValue:(*int)(0xc0000b6048), V:3}, main.Integer{OptValue:(*int)(0xc0000b6050), V:4}, main.Integer{OptValue:(*int)(nil), V:-1}}}
https://play.golang.com/p/vzo_qqbQKAK
I'm curious how people do it with encoding/json, since I'd like to keep emulating the behavior of stdlib.
ah, that also works well.
The reason I had started investigating this is that we were doing something similar in a codebase I work on, and I was wondering if there's a simpler way. Instead of having it in the same struct, though, we had two different structs: one with all pointers and one with no pointers. We'd deserialize into the pointer struct, then check each field for nil to know when to set defaults while copying all the values over into the non-pointer struct 😅
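The two-struct approach described above might look roughly like the following. This is only a sketch: the names rawInteger, resolve, resolveAll, and defaultValue are made up for illustration, and the decoding step is replaced by a hand-built slice so the example runs without any TOML library.

```go
package main

import "fmt"

// rawInteger mirrors the document. Pointer fields stay nil when the key is absent.
type rawInteger struct {
	Value *int `toml:"Value"`
}

// Integer is the pointer-free type the rest of the program would use.
type Integer struct {
	Value int
}

// defaultValue is a hypothetical default chosen for this sketch.
const defaultValue = -1

// resolve copies one decoded element, falling back to the default for nil fields.
func resolve(r rawInteger) Integer {
	out := Integer{Value: defaultValue}
	if r.Value != nil {
		out.Value = *r.Value
	}
	return out
}

// resolveAll converts a decoded slice of any length.
func resolveAll(raws []rawInteger) []Integer {
	out := make([]Integer, 0, len(raws))
	for _, r := range raws {
		out = append(out, resolve(r))
	}
	return out
}

func main() {
	three := 3
	// Stand-in for the result of decoding the document into []rawInteger.
	decoded := []rawInteger{{Value: &three}, {}}
	fmt.Printf("%+v\n", resolveAll(decoded)) // prints [{Value:3} {Value:-1}]
}
```

The cost is the duplicated struct definition and the copy step, which is the boilerplate this thread is trying to avoid.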
> I'm curious how people do it with encoding/json, since I'd like to keep emulating the behavior of stdlib.
I think for encoding/json, https://pkg.go.dev/encoding/json#RawMessage.UnmarshalJSON works well. Unlike TOML, there's no "tables" concept or multiple ways to define an array: you'll always be passed an array like [{"Value": 3}, {"Value": 4}] as the []byte array in UnmarshalJSON, vs. the TOML [[Integers]] tables.
And because of this, I think that if UnmarshalTOML(text []byte) were implemented, it's not going to be very clear what text is going to be filled with, since it could be either [3, 4], or
[[Integers]]
Value = 3
[[Integers]]
Value = 4
..which seems inconsistent, since the latter has the Integers label while the former only has the values, à la UnmarshalJSON. It wouldn't be possible to pass both of these cases to toml.Unmarshal without some preprocessing.
I wonder if using an interface like BurntSushi/toml would make more sense: https://github.com/BurntSushi/toml/blob/master/example_test.go#L284
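For comparison, one common way to get per-element defaults with encoding/json is to implement UnmarshalJSON on the element type and decode into a method-less alias that has been prefilled. The sketch below assumes the Integer/Config shapes used earlier in this thread and -1 as the default; the decode helper is just for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Integer struct {
	Value int
}

// UnmarshalJSON is called once per array element, so defaults can be set
// before the element's own keys are applied.
func (i *Integer) UnmarshalJSON(data []byte) error {
	// alias has the same fields but no methods, which avoids infinite recursion.
	type alias Integer
	tmp := alias{Value: -1} // -1 is the default discussed in this thread
	if err := json.Unmarshal(data, &tmp); err != nil {
		return err
	}
	*i = Integer(tmp)
	return nil
}

type Config struct {
	Integers []Integer
}

// decode is a small helper for this sketch.
func decode(raw []byte) (Config, error) {
	var cfg Config
	err := json.Unmarshal(raw, &cfg)
	return cfg, err
}

func main() {
	cfg, err := decode([]byte(`{"Integers": [{"Value": 3}, {}]}`))
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("%+v\n", cfg) // prints {Integers:[{Value:3} {Value:-1}]}
}
```

This works for arrays of any length because the decoder hands each element's bytes to UnmarshalJSON separately, which is exactly the part that has no obvious equivalent for [[Integers]] tables.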
Hm, at that point why not unmarshal into an interface{} or map[string]interface{}? Is it to avoid a post-processing phase?
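To make the trade-off concrete, the post-processing phase for a generically decoded document could look something like this. It is a sketch under assumptions: the decoder is assumed to produce array tables as []interface{} of map[string]interface{} with integers as int64, and fromGeneric is a hypothetical helper. The input map is built by hand so the example does not depend on a TOML library.

```go
package main

import "fmt"

type Integer struct {
	Value int
}

// fromGeneric builds typed values from a generically decoded document,
// applying a default of -1 when the Value key is missing.
// Assumption: array tables arrive as []interface{} of map[string]interface{},
// and integers arrive as int64.
func fromGeneric(doc map[string]interface{}) ([]Integer, error) {
	rawItems, present := doc["Integers"]
	if !present {
		return nil, nil
	}
	items, ok := rawItems.([]interface{})
	if !ok {
		return nil, fmt.Errorf("integers: expected an array, got %T", rawItems)
	}
	out := make([]Integer, 0, len(items))
	for idx, item := range items {
		table, ok := item.(map[string]interface{})
		if !ok {
			return nil, fmt.Errorf("integers[%d]: expected a table, got %T", idx, item)
		}
		v := Integer{Value: -1}
		if raw, present := table["Value"]; present {
			n, ok := raw.(int64)
			if !ok {
				return nil, fmt.Errorf("integers[%d].Value: expected an integer, got %T", idx, raw)
			}
			v.Value = int(n)
		}
		out = append(out, v)
	}
	return out, nil
}

func main() {
	// Stand-in for what a decoder might produce for two [[Integers]] tables.
	doc := map[string]interface{}{
		"Integers": []interface{}{
			map[string]interface{}{"Value": int64(3)},
			map[string]interface{}{},
		},
	}
	fmt.Println(fromGeneric(doc))
}
```

Every field needs its own type assertion and error path, so this tends to grow quickly for larger structs.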