golang/go

proposal: encoding/json: dynamic field omission

smikulcik opened this issue ยท 24 comments

When structuring json types, it would be very helpful to allow the custom types define if the value is to be omitted. This is similar to a javascript value may set to "undefined".

Just as we allow custom types to implement "MarshalJSON" to provide a custom value: ("null", multivariate, etc). So too should we allow these custom types to be set to "Omitted" as well.

This proposal may supersede other highly desired omitnil proposals as one may leverage this abstraction to accomplish the same goal.

Proposal

Update encoding/json to check if a struct field implements an Omittable interface where Omittable is defined as follows:

type Omittable struct{
  Omitted() (bool, error)
}

If a fieldType implements this interface, omit the filed if the response of Omitted() is true.

Example (Omittable String)

type Foo struct{
  Field1 OmittableString `json:"field_1"`
}

type OmittableString struct {
  IsSet bool
  Value string
}
func (o OmittableString) UnmarshalJSON(input []byte) error {o.IsSet=true; return json.Unmarshal(input, o.Value)}
func (o OmittableString) MarshalJSON() ([]byte, error) {return json.Marshal(o.Value)}
func (o OmittableString) Omitted() (bool, error) {return !o.IsSet, nil}

x := &Foo{}

_ = json.Unmarshal([]byte(`{}`), x)
// x.Field1.IsSet = false
out, _ = json.Marshal(x)
// {}

_ = json.Unmarshal([]byte(`{"field_1": ""}`), x)
// x.Field1.IsSet = true
// x.Field1.Value = ""
out, _ = json.Marshal(x)
// {"field_1": ""}

_ = json.Unmarshal([]byte(`{"field_1": "foo"}`), x)
// x.Field1.IsSet = true
// x.Field1.Value = "foo"
out, _ = json.Marshal(x)
// {"field_1": "foo"}

Example (Omit-nil)

type Foo struct{
  Field1 OmitNilStruct `json:"field_1"`
}

type OmitNilStruct struct {
  Value *MyStruct
}
func (o OmitNilStruct) UnmarshalJSON(input []byte) error {return json.Unmarshal(input, o.Value)}
func (o OmitNilStruct) MarshalJSON() ([]byte, error) {return json.Marshal(o.Value)}
func (o OmitNilStruct) Omitted() (bool, error) {return o.Value == nil, nil}

type MyStruct struct {}

x := &Foo{}

_ = json.Unmarshal([]byte(`{}`), x)
// x.Field1.Value = nil
out, _ = json.Marshal(x)
// {}

_ = json.Unmarshal([]byte(`{"field_1": {}}`), x)
// x.Field1.Value = MyStruct{}
out, _ = json.Marshal(x)
// {"field_1": ""}

Compatibility

This may cause issues with existing custom types that happen to implement this new method. Though I would guess that the number of uses with such a pattern is low.

Implementations

I've implemented this locally and it seems to work fine. If approved I can submit a PR.

Related Proposals

see also #50480

I like what's proposed in #50480, but the implementation was considerably easier for this proposal as we can leverage the code next to omitempty.

https://cs.opensource.google/go/go/+/refs/tags/go1.18.1:src/encoding/json/encode.go;l=750

Personally, I think it might make more sense to add a method like IsEmpty() bool. That way, you can still use the omitempty tag on the struct field, and decide on a case-by-case basis whether it should omit, and it keeps it consistent with how "normal" types get omitted.

The other nice thing is it side-steps the issue of what happens when you directly pass your custom type to json.Marshal. For instance, referencing the above sample code, what does json.Marshal(OmittableString{}) do?

@rittneje Thanks for the feedback.

IMHO, the concept of a property being "empty" seems to be a flawed abstraction. True, we do have existing patterns around "emptiness" (omitempty etc.). Here I'm making that claim that the model that we use to describe a json document should more clearly expose the concept of "property omission" to the object model.

Though, to directly answer your question, I would expect json.Marshal(OmittableString{}) to follow the MarshalJSON method and return "" as it is not being composed with an object or an array and as such cannot be omitted.

I see "helpers" like omitempty and omitnil as shorthands for a policy around omit or not to omit.

We should have a way to describe a policy around property omission without making assumptions around policy as property omission is a core part of the json data structure.

With this policy-based view in mind, we would then be able to phrase the policy of omitempty within a pattern like this:

function (x Foo) Omitted() (bool, error){
  switch reflect.TypeOf(x){
    case reflect.String:
      return x == ""
    case reflect.Int:
      return x == 0
    case reflect.Ptr:
      return x == nil
    ...
    }
}
rsc commented

/cc @dsnet

rsc commented

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
โ€” rsc for the proposal review group

dsnet commented

This feels like a duplicate of #45669.

A prototype implementation of #45669 supports an omitzero option that omits a Go struct field if either:

  • the field value is the zero Go value (per reflect.Value.IsZero) OR
  • the field value implements IsZero() bool and it reports true.

This definition is simple to implement and easy to explain.

I implemented a prototype of the approach similar to #45669 for #11939 in CL 13977, though it applies to omitempty rather than a new omitzero tag. (And there's no reason why IsZeroer needs to be exported.)

Another option oft-mentioned in other proposals would be to add a sentinel error value to indicate that a value should be omitted from the MarshalJSON method, though this would presumably also have to be opt-in with a new struct tag to avoid inadvertent potential backward compatibility issues with types like time.Time.

Is the benefit over #50480 just an easier implementation?

dsnet commented

The difficulty lies not with the implementation, which is rather straightforward. The difficulty is in the details and making sure it interacts well with the overall ecosystem.

The MarshalJSON method is frequently called directly by user code where users expect to receive a valid JSON representation of the value. This fact alone suggests that:

  • having MarshalJSON return nil (#50480) or
  • having MarshalJSON return a sentinel error

are not acceptable approaches.

Also, we can't change the meaning of omitempty (at least for the current "encoding/json" package), so anything that tries to modify the semantics of that is also off the table.

Potential issues with using reflect.Value.IsZero on a struct type:

  • Nonzero unexported fields will cause it to return false
  • It returns true only if it is recursively true for all fields of the struct. This could be confusing if a struct member is not reflect.Value.IsZero but it does implement IsZero() bool itself and would return true from that. (Playground example)

encoding/json has its own isEmptyValue function which performs most of the same checks (and predates reflect.Value.IsZero) but most notably does not recurse into structs.

dsnet commented

@joeshaw: Thanks. Those are legitimate concerns, but what you're suggesting is getting increasingly more complicated. The more complicated the rules, the harder it is to use correctly.

If we go with a more complicated definition of what "zero" means, anything about being "exported" should not appear in the definition. It seems the intent is not so much that we should ignore unexported fields, but rather that we should ignore fields that have no JSON serialization. It would be odd to ignore an unexported field, but consider the zero-ness of a field marked as json:"-". But even that isn't enough. What does it mean when a field implements MarshalJSON? What does zero mean there?

Using reflect.Value.IsZero has the major advantage that it is well defined and simple. It is good enough for most simple structs (e.g., netip.Addr). To work around concerns where it is insufficient, that is why we also check the bool IsZero() bool, which allows the type's author to define what zero actually means (e.g., time.Time.IsZero).

The MarshalJSON method is frequently called directly by user code where users expect to receive a valid JSON representation of the value.

What if, instead of a sentinel error, there was a sentinel value.

func (x T) MarshalJSON() ([]byte, error) {
	return json.Empty, nil
}

Where Empty is:

var Empty = []byte("null")
dsnet commented

That's an interesting idea, but it assumes that the output of MarshalJSON is never mutated by the caller. Unfortunately, I have definitely seen code depend on that property before.

I'm not sure why sentinel errors aren't acceptable? Callers should be handling errors, and any old caller code used with new MarshalJSON code should just surface it (fail) as another error, which can prompt people to update caller code.

@dsnet could it be good that the caller is able to mutate and thus block (if they so choose) this behavior?

@seankhliao

I'm not sure why sentinel errors aren't acceptable? Callers should be handling errors, and any old caller code used with new MarshalJSON code should just surface it (fail) as another error, which can prompt people to update caller code.

Returning a sentinel error value here indicates behavior, not an error condition. Existing code should not treat it as an error, but as it has no concept of the sentinel value it will, and this will break existing behavior.

If it was only the encoding/json package that called MarshalJSON, then the change in behavior could be added at the same time as the new sentinel value. Unfortunately other code can (and does) call MarshalJSON and this change would break nearly all of that code.

In that case, it seems like a new interface is the best option:

func (x T) EmptyJSON() bool { return true }
rsc commented

Does anyone think this isn't a duplicate of #45669, which is in turn on hold for #22480?
I'd be happy to take #45669 off hold if people want to try to converge there.

rsc commented

This proposal is a duplicate of a previously discussed proposal, as noted above,
and there is no significant new information to justify reopening the discussion.
The issue has therefore been declined as a duplicate.
โ€” rsc for the proposal review group

@rsc I see this as an alternative to the omitzero proposal. Instead of baking the omission policy into the struct tag, this proposal allows us to customize the policy on if we want to omit a field. It also lends itself towards the JSON-native "omission" notion instead of the go-specific "zero-value" or the go-json-specific notion of "empty-value".

dsnet commented

There are quite a number of "omit this field" proposals. #11939 proposes special-casing an IsZero() bool method, which I think is a superior proposal than this proposal since it a much simpler interface than Omitted() (bool, error).

@dsnet Yeah the IsZero() bool is pretty good too. I added the error param in my proposal to go along with the "everything may fail" idiom. I'd be open to simplifying down to Omitted() bool if it makes a difference.