encoding/json: add omitzero option
palsivertsen opened this issue ยท 35 comments
The omitempty json tag is kind of confusing to use when working with nested structs. The following example illustrates the most basic case using an empty struct for argument's sake.
type Foo struct {
EmptyStruct struct{} `json:",omitempty"`
}The "EmptyStruct" field is a struct without any fields and can be empty, that is equal to it's zero value. But when I try to marshal it to json the field is still included in the resulting json object. Reading the encoding/json documentation about the definition of empty it does not mention empty structs:
The "omitempty" option specifies that the field should be omitted from the encoding if the field has an empty value, defined as false, 0, a nil pointer, a nil interface value, and any empty array, slice, map, or string.
It feels weird that adding the omitempty tag to struct fields is allowed if it will never have the desired effect. If structs will never be considered empty shouldn't there at least be a compiler warning when adding this tag to a struct field?
Working with json time.Time
This behavior causes some confusion when working with time.Time in json structures. Go's zero values for primitive types are fairly reasonable and "guessable" from a non-gopher point of view. But the zero value for time.Time, January 1, year 1, 00:00:00.000000000 UTC, is less common. A more(?) common zero value for time is January 01, 1970 00:00:00 UTC. To avoid confusion when working with json outside the world of go it would be nice to have a way to omit zero value dates.
A commonly suggested workaround to this problem is to use pointers, but pointers might be undesirable for a number of reasons. They are for example cumbersome to assign values to:
type Foo struct {
T *time.Time
}
_ = Foo{
T: &time.Now(), // Compile error
}
_ = Foo{
T: &[]time.Time{time.Now()}[0], // Weird workaround
}The time.Time documentation also recommends to pass the type as value, not pointer since some methods are not concurrency-safe:
Programs using times should typically store and pass them as values, not pointers. That is, time variables and struct fields should be of type time.Time, not *time.Time.
A Time value can be used by multiple goroutines simultaneously except that the methods GobDecode, UnmarshalBinary, UnmarshalJSON and UnmarshalText are not concurrency-safe.
This playground example illustrates three different uses of empty structs where I'd expect the fields to be excluded from the resulting json. Note that the time.Time type has no exposed fields and uses the time.Time.NarshalJSON() function to marshal itself into json.
Solution
Would it make sense to add a new tag, like omitzero, that also excludes structs?
There's a similar proposal in #11939, but that changes the definition of empty in the omitempty documentation.
Placed on hold.
โ rsc for the proposal review group
This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
โ rsc for the proposal review group
Personally I don't like baking the omission policy into struct tags. Things like omitempty, omitnil, omitzero are all limited by that one policy. I kinda liked my proposal for #52803 since it allowed that extra flexibility.
the IsEmpty() proposal also adds this flexibility, but it seems odd to further the abstraction of what it means for a property to be "empty". Arn't we just trying to determine if we want to omit a field or not?
Or perhaps we should allow MarshalJSON to return nil as in #50480.
Say we go down that path and I write a type like this:
type Null[T any] struct {
Valid bool
Value T
}
func (n Null[T]) MarshalJSON() ([]byte, error) {
if !n.Valid {
return nil, nil
}
return json.Marshal(n.Valid)
}Now this type can only be used as an omitzero field.
@icholy I believe the return-nil option would be instead of omitzero, not in addition to it.
@joeshaw having a nil return result in a null in non omitzero or omitempty fields would change the behavior of existing code.
@icholy Returning nil would not result in null in the JSON (that is done by returning []byte("null")), it would cause the field to be omitted entirely, under the proposal from #50480. Returning nil today causes invalid JSON to be produced.
However, @dsnet points out in #52803 (comment) that nil as a return value might cause existing code which calls MarshalJSON to break because it is not expecting this behavior.
@joeshaw I want to have a general purpose Null[T] type that will be omitted in the case of omitempty or omitzero and will result in null otherwise.
@icholy So you return string null when you want JSON null, or nil when you want to omit it?
@mitar my json.Marshaler implementation has no way of knowing if it's being used in an omitempty field.
True. I think this is a broader issue: MarshalJSON not being able to known tags on the field. So I think this is a different issue than this one: how could MarshalJSON know those tags. Maybe something like what Format has (e.g., fmt.State or verb rune). But currently MarshalJSON always overrides and ignores tags. There are simply two ways to "configure" JSON marshaling: or through imperative code (by implementing MarshalJSON) or declaratively through tags. Sadly, those two word do not work together and even more: you cannot implement in MarshalJSON same things you can through tags (see #50480).
-
Lets start by agreeing on a definition for empty vs zero, here is my proposal:
- empty JSON-values:
false,0,"",null,[],{} - zero JSON-values: empty JSON-values and objects containing only empty JSON-values eg.
{a: null, b: 0}
- empty JSON-values:
-
In case we can agree on the definition in 1, we notice that the current JSON encoder implementation is not conforming to this definition.
Since structs that encode to{}are not skipped for fields with theomitemptytag.
Now we have to decide if it is desirable to fix this: is this a bug or a feature ๐?
The https://github.com/go-json-experiment/json project seems to acknowledge this error and fixes this issue. -
The
omitemptytag is not compatible with a customMarshalJSONfunction.
IfMarshalJSONreturnsnull, this value is not omitted.
In my opinion this is a bug, a field tagged withomitemptyshould never result in e.g. anullJSON-value.
.
LettingMarshalJSONreturnnilcould be an option to generally omit a value, however this is not conditional on whether theomitemptytag is present or not. -
Finally, many developers want an option to indicate that a custom object is empty.
- eg. value
trueinstead offalseshould be considered empty for your usecase and should be omitted - eg. the empty value for a time.Time value should be considered as empty
- eg. value
-
Other proposals like adding
omitzeroor an extraShouldOmit()interface are not discussed here, since I think they are no bugs and less plausible to be implemented.
Bugs in 2 and 3 can be fixed as follows:
encode.go - line 750:
// fast path omitEmpty
if f.omitEmpty && isEmptyValue(fv) {
continue
}
omitEmptyResetLocation := e.Len()
e.WriteByte(next)
if opts.escapeHTML {
e.WriteString(f.nameEscHTML)
} else {
e.WriteString(f.nameNonEsc)
}
opts.quoted = f.quoted
startLen := e.Len()
f.encoder(e, fv, opts)
newLen := e.Len()
if f.omitEmpty && (newLen-startLen) <= 5 {
// using `Next` and `bytes.NewBuffer` we can modify the end of the
// underlying slice efficiently (without doing any copying)
fullBuf := e.Next(newLen) // extract underlying slice from buffer
switch string(fullBuf[startLen:newLen]) {
case "false", "0", "\"\"", "null", "[]":
// reconstruct buffer without zero value
e.Buffer = *bytes.NewBuffer(fullBuf[:omitEmptyResetLocation])
continue
default:
// reconstruct original buffer
e.Buffer = *bytes.NewBuffer(fullBuf)
}
}
next = ','Feature 4 can be added as follows:
encode.go - line 341:
// Emptyable is the interface implemented by types that
// can provide a function to determine if they are empty.
type Emptyable interface {
IsEmpty() bool
}
var emptyableType = reflect.TypeOf((*Emptyable)(nil)).Elem()
func isEmptyValue(v reflect.Value) bool {
// first check via the type if `Emptyable` is implemented
// this way, we can prevent the more expensive `Interface`
// conversion if not required
if v.Type().Implements(emptyableType) {
if v, ok := v.Interface().(Emptyable); ok {
return v.IsEmpty()
}
}
switch v.Kind() {
case reflect.Array, reflect.Map, reflect.Slice, reflect.String:
return v.Len() == 0
case reflect.Bool:
return !v.Bool()
case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
return v.Int() == 0
case reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64, reflect.Uintptr:
return v.Uint() == 0
case reflect.Float32, reflect.Float64:
return v.Float() == 0
case reflect.Interface, reflect.Ptr:
return v.IsNil()
}
return false
}NOTE: Instead of adding a Emptyable interface, the fix for 2 and 3 could be combined with a MarshalJSON function that returns an empty value. Meaning that if we accept the proposed fix for 2 and 3, we also fix 4.
NOTE2 (edit): by replacing case "false", "0", "\"\"", "null", "[]", "{}": with case "false", "0", "\"\"", "null", "[]":; the solution for 2 and 3 becomes a solution for 3 only, this makes it maybe easier to get accepted? -> I'll probably create a PR for this change to gather some extra feedback.
@joeshaw I tried to make this analysis as complete as possible. Please let me know in case I missed something.
@rcs @joeshaw @dsnet #53425 is an implementation that solves 'bug' 3 (as described above).
As mentioned in https://go-review.googlesource.com/c/go/+/412856, it currently awaits approval/ disapproval.
Any ideas on how to get this approved?
@inteon, the difficult lies not in the implementation but
in making sure that we have semantics that works well with the entire ecosystem.
I think it would be helpful to avoid going too much in the weeds of what the implementation would look like.
We should discuss implementation details at high-level to the degree that:
- we know it is actually possible to implement (e.g., that it depends on functionality that Go reflection provides), or
- what general performance implications a given design will have (e.g., that it won't require a two-pass scan of the value).
Let's step back and ignore how the json package currently omits object members.
In a brand new world, how would we like object members to be omitted?
- Should it be omitted based on the Go type system?
- Perhaps based on the Go value itself where certain Go values
are defined as being omitted (e.g.,false,MyStruct{}, etc.)? - Perhaps based on some special method on the Go value (e.g.,
IsZeroorShouldOmit)?
- Perhaps based on the Go value itself where certain Go values
- Should it be omitted based on the JSON type system?
- Perhaps based on whether the struct field value encodes as certain JSON values
which are defined as being omitted (e.g.,null,"",{}, etc.)? - Customized encoding (e.g., via
MarshalJSON) would affect whether
the JSON value is to be treated as being omitted.
- Perhaps based on whether the struct field value encodes as certain JSON values
Which is correct?
Both have legitimate usages and neither covers all of the common use-cases by itself.
Therefore, I believe we should support both.
Omit based on Go value
Let's suppose there was a omitgozero tag that omits a Go struct field
if the field is the zero value (per reflect.Value.IsZero).
If the field implements an IsZero() bool method,
then that method is called instead to determine whether to omit the field.
Properties of this approach:
- A "zero value" is well-defined in the Go language.
- In contrast, today's definition of being a Go
"false, 0, a nil pointer, a nil interface value, and any empty array, slice, map, or string"
is harder to remember. - Also, today's definition does not handle several common use-cases, such as:
- Being able to omit a nil slice, but not an empty, non-nil slice.
- See "omit based on JSON value" for omitting empty slices (regardless of nil-ness).
- Being able to omit the zero value of a Go struct or array.
- Being able to omit a nil slice, but not an empty, non-nil slice.
- This works well with types like
netip.Addrwhere the zero value is the only invalid value.
- In contrast, today's definition of being a Go
- In general, the "zero Go value" definition has the nice property that
marshaling a Go struct will emit only the non-zero fields such that unmarshaling the JSON output
back into a Go struct will generally return the same result (per reflect.DeepEqual) - Checking for the "zero value" has first-class support in Go reflection via
reflect.Value.IsZero. - Because it has first-class support, it can be optimized in ways that the
jsonpackage simply cannot hope to achieve.
For example, this optimization makesreflect.Value.IsZeroan order of magnitude faster for comparable types.- Alternative definitions that exclude unexported fields are both more complicated to explain and
not as performant sincejsonwould have to implement its own recursive walk through a struct.- Attempts to exclude unexported fields or non-JSON serializable fields seems to actually want
omission to be defined in terms of the JSON value (see below) and not the Go value. - It cannot leverage the performance of
reflect.Value.IsZero.
- Attempts to exclude unexported fields or non-JSON serializable fields seems to actually want
- Alternative definitions that exclude unexported fields are both more complicated to explain and
- Respecting a custom
IsZeromethod allows users to provide their own custom definition of "zero".- This works well with
time.Time.IsZerowhich handles cases of zero time beyond just beingtime.Time{}.
- This works well with
Omit based on JSON value
Let's suppose there was a omitjsonempty tag that omits a Go struct field
if it would have encoded as an empty JSON value,
which we define as being a JSON null, "", {}, or [].
Properties of this approach:
- The application of this is recursively effective:
- For example consider the following:
type MyStruct struct { XXX string `json:"-"` Foo string `json:",omitgozero"` Bar *MyStruct `json:",omitjsonempty"` Baz []int `json:",omitjsonempty"` Raw json.RawMessge `json:",omitjsonempty"` } json.Marshal(MyStruct{ XXX: "ignored", // this is omitted since it is ignored Bar: &MyStruct{ Baz: []int{}, // this is omitted since is encodes as `[]` }, // this is omitted since it encodes as `{}` Raw: json.RawMessage("null"), // this is omitted since it encodes as `null` }) // thus, this just ouputs `{}`
- This allows
omitjsonemptyto handle cases thatomitgozerocannot.
- For example consider the following:
- Whether a field value is to be omitted is dependent on any custom JSON
serialization that would occur (e.g., viaMarshalJSON). - We do not support omitting arbitrary JSON values.
For our definition, every empty JSON value has a fixed width (at most 4 bytes).
This is necessary for when we implement true streaming
where it is easier and faster to ensure that we never flush the empty JSON value
so that the encoder can unwrite it if necessary. - It might seem slow to encode a JSON value only to unwrite it.
In many cases, the internal implementation can short-circuit the actual
encoding if it knows that it would have been an empty JSON value. - Our definition of an empty JSON value does not include
0since it is
ill-defined whether-0,0e123456,0.000000,-0.000e+999are included.
Also,0is not what most users think of as being "empty".
Composability
Both omitgozero and omitjsonempty would be composable.
You can specify both if you want, where the effect is the logical OR of the two.
That is, if the object member would be omitted under
either omitgozero or omitjsonempty then it is omitted.
In many cases, the behavior of omitgozero and omitjsonempty will be identical.
Users should use omitgozero when both have the same effect.
Compatability
The existing semantics of omitempty cannot be changed because we are bound
by the Go 1 compatability guarantees.
We would need to do one of the following:
- Give new semantics to a new struct tag (e.g.,
omitjsonempty), or - Have
omitemptyrespect a new method that does appear by accident
in existing code (e.g.,OmitJSONField() bool), or - Have a new major version of the
jsonpackage.
Personally, I like the names of omitzero and omitempty since they are shorter,
but changing the semantics of omitempty in the current json package is not acceptable,
which suggest that we may need new names (unless we have a new major-version of the json package).
@dsnet Thank you for your extended answer
Ok, so 'fixing' omitempty combined with MarshalJSON is considered a breaking change.
I guess adding support for IsZero or IsEmpty would also be considered breaking, since these methods could appear by accident in existing code.
-> this insight can already result in closing a lot of issues/ PRs on this repo
I generally agree with your assessment, but I think that it is important to note that even when using a new major version of the json package, developers would have much more trouble changing all json tags. e.g. because they embed a struct from a library. Therefore, I'm not really keen on the idea of changing the tag names or adding tags... ๐ค
My preference here would be to keep omitempty but change its meaning to the meaning you described for omitjsonempty.
Am I correct to assume that all work on a possible json v2 is done here: https://github.com/go-json-experiment/json (looks very promising)? Could you give a timeline on when you would be open to receive feedback/ PRs on this future work?
I'd prefer a omitzero with omitgozero semantics. Then an omitnull which omits any value which would have marshalled to null.
I think that it is important to note that even when using a new major version of the json package, developers would have much more trouble changing all json tags. e.g. because they embed a struct from a library.
Agreed. This is something I'm concerned about.
It's unclear whether this is an advantage or disadvantage, but if the new "json" package redefines the meaning of omitempty to be omitjsonempty, existing usages will behave the same for most cases (excluding numeric and boolean types; perhaps we should fold false and 0 into the definition of omitjsonempty? ๐คทโโ๏ธ).
Am I correct to assume that all work on a possible json v2 is done here: https://github.com/go-json-experiment/json (looks very promising)? Could you give a timeline on when you would be open to receive feedback/ PRs on this future work?
This is just an experimental project worked on by @dsnet, @mvdan, @johanbrandhorst, @rogpeppe, and @ChrisHines. We are not operating in any official capacity (other than @rsc being aware of the work). This work will be presented in the future as a Go proposal when it's ready. Even then, the possible results are highly varied.
The project started off somewhat ad-hoc when @mvdan tweeted his thoughts about a hypothetical v2 "json" package. From the beginning, the project had an ambitious scope trying to address most of the open issues with the JSON package in a unified way such that the features are mostly orthogonal and don't interact poorly with each other.
Progress has been slow since none of us work on Go full-time, and contribute on the side. For myself, @tailscale has been very generous in giving me the freedom to work on this aside from my other duties there.
Of relevance to this issue, go-json-experiment currently implements what I described above as omitgozero and omitjsonempty as just omitzero and omitempty.
Met with @dsnet and others yesterday. Let's put this on hold until we figure out what to do with their work.
Placed on hold.
โ rsc for the proposal review group
As someone currently vendoring encoding/xml for this reason I'd like to point out this discussion applies equally there.
I'd also like to echo @smikulcik and say that while new tags like omitgozero, omitjsonempty or omitzero would help they will be inevitably abused and make code less clear in situations where the real goal is omitting a field in some non-zero case.
For example, only allowing omission based on predetermined reasons will force a dev to essentially lie in IsZero:
type EphemeralTime time.Time
type SomeType struct {
// ValidUntil is only present while the thing is valid
ValidUntil EphemeralTime `json:"valid_until,omitzero"`
}
func (t EphemeralTime) IsZero() {
if t.Now().After(t) {
return true // Lie just to get it omitted
}
// Actual zero check
}While that's contrived for demonstration, there are other use-cases like non-zero defaults you may want to exclude.
If, instead, there was some mechanism for omitfunc either via tag or predefined interface (e.g. OmitJSON) the user is allowed to control when to omit, which is the real goal and avoids another proposal after this one for the next reason someone needs to omit a field.
I agree with @marwatk . It is not an uncommon case. Think of serializing an application configuration file. That configuration is likely full of non-zero default values. But when serializing that configuration i would not want to include everything, only the customized fields. It is much easier to maintain that way knowing that only the things i customize are listed. The use case here would be a configuration front end that reads in current configuration, then merges in your changes (from say a web page) then serializes that out to the configuration file.
Hi all, we kicked off a discussion for a possible "encoding/json/v2" package that addresses the spirit of this proposal.
See the "omitzero" struct tag option under the "Struct tag options" section, which omits a struct field if it is the zero Go value (or has a IsZero method that reports true). Under this semantic, time.Time{} would be omitted both because it is the zero Go value, but also because time.Time.IsZero would report true.
I don't get it. Couldn't we just support returning NIL in a custom MarshalJSON method? Returning NIL results in an error anyway at the moment. I wish you could make that example work: https://go.dev/play/p/Cx7e0T_D30v
I'd like to get this back onto the proposal track with the following semantic:
When marshaling, the "omitzero" option specifies that the struct field should be omitted if the field value is zero as determined by the "IsZero() bool" method if present, otherwise based on whether the field is the zero Go value (according to reflect.Value.IsZero). This option has no effect when unmarshaling. If "omitempty" is specified together with "omitzero", whether a field is omitted is based on the logical OR of the two.
#63397 contains discussion about a hypothetical v2 "json", which implements this semantic. If v2 "json" were to land, having this tag option in v1 would help in bridging the behavior of the two packages.
This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
โ rsc for the proposal review group
Have all remaining concerns about this proposal been addressed?
The proposal is:
When marshaling, the "omitzero" option specifies that the struct field should be omitted if the field value is zero as determined by the "IsZero() bool" method if present, otherwise based on whether the field is the zero Go value (according to reflect.Value.IsZero). This option has no effect when unmarshaling. If "omitempty" is specified together with "omitzero", whether a field is omitted is based on the logical OR of the two.
This will mean that omitzero of a slice omits a nil slice but emits [] for a zero-length non-nil slice (and similar for maps). It will also mean that omitzero of a time.Time omits time.Time{}. Neither of these strictly requires calling the IsZero method, but custom types may may find implementing IsZero useful.
For consistency, should we support omitzero in encoding/xml with the exact semantic?
It would make sense to expand xml at the same time, but let's make that a separate proposal since we've already had quite a long discussion here.
Based on the discussion above, this proposal seems like a likely accept.
โ rsc for the proposal review group
The proposal is:
When marshaling, the "omitzero" option specifies that the struct field should be omitted if the field value is zero as determined by the "IsZero() bool" method if present, otherwise based on whether the field is the zero Go value (according to reflect.Value.IsZero). This option has no effect when unmarshaling. If "omitempty" is specified together with "omitzero", whether a field is omitted is based on the logical OR of the two.
This will mean that omitzero of a slice omits a nil slice but emits [] for a zero-length non-nil slice (and similar for maps). It will also mean that omitzero of a time.Time omits time.Time{}. Neither of these strictly requires calling the IsZero method, but custom types may may find implementing IsZero useful.
No change in consensus, so accepted. ๐
This issue now tracks the work of implementing the proposal.
โ rsc for the proposal review group
The proposal is:
When marshaling, the "omitzero" option specifies that the struct field should be omitted if the field value is zero as determined by the "IsZero() bool" method if present, otherwise based on whether the field is the zero Go value (according to reflect.Value.IsZero). This option has no effect when unmarshaling. If "omitempty" is specified together with "omitzero", whether a field is omitted is based on the logical OR of the two.
This will mean that omitzero of a slice omits a nil slice but emits [] for a zero-length non-nil slice (and similar for maps). It will also mean that omitzero of a time.Time omits time.Time{}. Neither of these strictly requires calling the IsZero method, but custom types may may find implementing IsZero useful.
Change https://go.dev/cl/615676 mentions this issue: encoding/json: add omitzero option