segmentio/parquet-go

GenericWriter should write map keys to matching columns

Closed this issue · 2 comments

Given a parquet file with a schema generated from the following model:

type Inner struct {
	FieldB int
	FieldC string
}

type Model struct {
	FieldA string
	Nested Inner
}

The GenericWriter should be able to write data from an alternative model in which Nested is represented as a map, as long as the map keys match the column names from the original schema:

type AltModel struct {
	FieldA string
	Nested map[string]any
}

data := []AltModel{
	{
		FieldA: "a",
		Nested: map[string]any{"FieldB": 11, "FieldC": "c"},
	},
}

// f is the output io.Writer for the parquet file (for example, an *os.File).
schema := parquet.SchemaOf(new(Model))
w := parquet.NewGenericWriter[AltModel](f, schema)
w.Write(data)

In its current implementation, GenericWriter panics on the code above. Here is a gist that reproduces the error.

This feature is useful when writing data to a schema in which some parts (here, the struct Inner) are defined dynamically at runtime.

Apologies for making more work for you, but we've decided to move development on this project to a new organization at https://github.com/parquet-go/parquet-go to ensure its long-term success. We appreciate your contribution and would appreciate it if you could reopen this ticket there if it is still relevant.

Thanks for letting me know, @kevinburkesegment.
I copied the issue to parquet-go#8.