darccio/mergo

0.3.9 no longer merges public fields in embedded structs

mtrmac opened this issue · 12 comments

The documentation of Merge includes:

It won't merge unexported (private) fields and will do recursively any exported field.

Yet in 0.3.9:

package main

import (
	"fmt"

	"github.com/imdario/mergo"
)

type inner struct {
	A int
}

type outer struct {
	inner
	B int
}

func main() {
	dst := outer{
		inner: inner{A: 1},
		B:     2,
	}

	src := outer{
		inner: inner{A: 10},
		B:     20,
	}

	err := mergo.MergeWithOverwrite(&dst, src)
	if err != nil {
		panic(err.Error())
	}
	fmt.Printf("%#v\n", dst)
}

and that used to be the case with 0.3.8, returning

main.outer{inner:main.inner{A:10}, B:20}

0.3.9 ignores the embedded struct entirely, returning

main.outer{inner:main.inner{A:1}, B:20}

Using a public name for the inner struct does cause the fields to be merged recursively.

AFAICS this was intentional or at least noticed, #105 (review) points to a test that broke and had to be changed, but there was no further rationale. But that doesn’t really say why this had to change.

I'm checking this. Although, in your example you are using MergeWithOverwrite. I just tested Merge and it works fine. So, the issue is limited to MergeWithOverwrite.

FWIW I’ve taken an uninformed attempt at fixing this, but it would be more involved than I can do right now.

Enabling merging to happen for any struct field that is a struct and (recursively) contains a public field does not work because that involves assigning to newly-allocated private fields (the embedded struct), which Go rejects.

Reverting more of the struct merge code, to use the previous approach of overwriting the existing destination structure’s fields instead of making a copy of the destination as a whole does not work either, because deepMerge is now being called on values that can’t be edited like that.

Reverting even more, the whole new concept that deepMerge can return new values instead of only editing existing ones, was expanding the scope far beyond my available time, and I must assume the concept of allocating new copies was added for a good reason, so that would just break something else.

In retrospect, it might be possible to bypass the Go policy of preventing edits to unexported fields, but I haven’t looked into that either.

Fortunately, the use in containers/image can be modified not to use an unexported embedded struct, so this is not really urgent — for that use at least.

I misunderstood your case. Your inner isn't an exported field. It has an exported field but it isn't exported itself. So, the behaviour actually matches the sentence:

It won't merge unexported (private) fields and will do recursively any exported field.

inner doesn't comply the second clause, it's not exported, to be accessed recursively. So, I'm thinking the bug existed before and the PR fixed it.

I'm not sure now what is the proper approach here. I feel that there was a hidden bug. I didn't have in mind this scenario of an exported value in an embedded field.

Ah, that’s a very reasonable reading of the documentation as well.

OTOH pre-#105 hasExportedField seems quite intentional about including such cases — and there was a test for that behavior.


Pragmatically, c/image is going to change the type definition to use an exported field name: Just because the library tests fail and prevents upgrading the library dependency to 0.3.9 does not mean that any user of the library will know about that, and AFAIK Go modules don’t allow a library to “ban” uses of specific version dependencies when that library is embedded into larger projects, so the library must work against 0.3.9 as released.

So, as far as c/image goes, I’m fine with any resolution to this report.

I found it. This modification appeared because of issue #38. Before it just did deepMerge. My usual policy with PRs is that if previous tests run unmodified, I merge the PR after testing it. I think in both cases I overlooked these modifications.

Issue #38 doesn't make sense, as what he wanted was to overwrite a field that contained a *time.Time. Instead, he chose to access the embedded fields inside this struct and overwrite them 😅 His PR broke the described behaviour in the docs.

So, to me the library works as intended originally. There is a test for that issue, so the behaviour that the issue 38's contributor expected is covered by the current code.

I just caught this change because it breaks sprig which is used by Helm. As a user of the library I would call this a breaking behavior change because the behavior was previously one way and is now different. This normally requires incrementing the major version in SemVer.

@imdario Is your intention to walk back the functionality change? Or, is this something we need to discuss on Helm/sprig about communicating this or maintaining a fork?

@mattfarina The intended behaviour is the current one on 0.3.9. You shouldn't rely on being able to merge unexported fields in any case.

I understand this feels like a breaking change and I'm willing to release a new major version, but it wasn't an intended breaking change. It was a consequence of reversing a hidden behaviour that was introduced against the definition of the library.

Edit: please avoid maintaining forks if possible 😄

@imdario three things...

  1. The behavior was not hidden. It was in use by Sprig and the tools that use it (e.g., Helm).
  2. I caught the change because tests in Sprig failed after the update. We had tests for the situation.
  3. Do you have any suggestions on handling this other than a fork?

@mattfarina I need to give to this some thought. Not sure now how public fields of embedded structs should be handled.

I worry this going to keep coming back, although Go reflect returns embedded fields as unexported. So, the library in 0.3.9 is coherent with Go reflect design, but at the same time, it seems to be not useful for the users.

@mattfarina I'm testing your linked test, reducing to the core behavior of spring's merge template function:

func TestIssue139(t *testing.T) {
	srcs := []map[string]interface{}{
		{
			"h": 10,
			"i": "i",
			"j": "j",
		},
		{
			"a": 1,
			"b": 2,
			"d": map[string]interface{}{
				"e": "four",
			},
			"g": []int{6, 7},
			"i": "aye",
			"j": "jay",
			"k": map[string]interface{}{
				"l": false,
			},
		},
	}
	dst := map[string]interface{}{
		"a": "one",
		"c": 3,
		"d": map[string]interface{}{
			"f": 5,
		},
		"g": []int{8, 9},
		"i": "eye",
		"k": map[string]interface{}{
			"l": true,
		},
	}

	for _, src := range srcs {
		if err := Merge(&dst, src); err != nil {
			t.Fatal(err)
		}
	}

	expected := map[string]interface{}{
		"a": "one", // key overridden
		"b": 2,     // merged from src1
		"c": 3,     // merged from dst
		"d": map[string]interface{}{ // deep merge
			"e": "four",
			"f": 5,
		},
		"g": []int{8, 9}, // overridden - arrays are not merged
		"h": 10,          // merged from src2
		"i": "eye",       // overridden twice
		"j": "jay",       // overridden and merged
		"k": map[string]interface{}{
			"l": true, // overridden
		},
	}
	assert.Equal(t, expected, dst)
}

I'm a bit confused because your expectations aren't fulfilled in earlier versions. These are the results by version:

  • 0.3.9: cannot append two different types (int, string)
  • 0.3.8: key "j" isn't overridden
- (string) (len=1) "j": (string) (len=3) "jay",
+ (string) (len=1)"j": (string) (len=1) "j")
  • 0.3.5, 0.3.6, 0.3.7: the same as 0.3.8
  • 0.3.4: like 0.3.8 but additionally key "g" has duplicated values (another bug already fixed)

I'm running these tests with the following commands:

dario@thinkpad:~/go/src/github.com/imdario/mergo$ git checkout $VERSION
dario@thinkpad:~/go/src/github.com/imdario/mergo$ go test -timeout 30s github.com/imdario/mergo -run "^(TestIssue139)$" -count=1 -v

I think your issue isn't related to this one but with #143.

Resolved by reverting PR #105