shuttle-hq/synth

Richer export types (parquet, avro)

djoanes opened this issue · 1 comment

Required Functionality
Synth's current output artifacts are all typeless. Downstream consumers therefore have to either derive the column types automatically or explicitly redeclare the type of every column.
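
To make the problem concrete, here is a minimal sketch of the type loss, assuming CSV as the export format. The rows and the `declared_types` mapping are made up for illustration and are not real synth output.

```python
import csv
import io

# Hypothetical rows standing in for generated data; not real synth output.
rows = [
    {"id": 1, "price": 9.5, "active": True},
    {"id": 2, "price": 12.0, "active": False},
]


def roundtrip_csv(records):
    """Write records to CSV text and read them back."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    buffer.seek(0)
    return list(csv.DictReader(buffer))


loaded = roundtrip_csv(rows)

# Every value comes back as a string, so the consumer has to restate the types.
declared_types = {"id": int, "price": float, "active": lambda s: s == "True"}
restored = [{k: declared_types[k](v) for k, v in row.items()} for row in loaded]
```

The `declared_types` mapping is exactly the redeclaration step that a typed export format would make unnecessary.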

Proposed Solution
Add support for two new URI schemes, parquet: and avro:. This would allow richer importing and exporting, with the explicitly defined types preserved.
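
As a rough illustration of what "types preserved" could mean for Avro, the sketch below builds an Avro record schema as plain JSON. The helper `avro_schema_for`, the record name `Row`, and the sample record are hypothetical. In synth the types would presumably come from the declared schema rather than from a sample value as they do here.

```python
import json

# Python types mapped to Avro primitive type names.
AVRO_PRIMITIVES = {bool: "boolean", int: "long", float: "double", str: "string"}


def avro_schema_for(name, sample):
    """Build an Avro record schema (as a dict) from one sample record.

    Hypothetical helper: only flat records with bool/int/float/str values.
    """
    fields = [
        {"name": key, "type": AVRO_PRIMITIVES[type(value)]}
        for key, value in sample.items()
    ]
    return {"type": "record", "name": name, "fields": fields}


schema = avro_schema_for("Row", {"id": 1, "price": 9.5, "active": True})
print(json.dumps(schema, indent=2))
```

A schema like this travels with the data file, so a loader on the receiving side would not need to guess or restate column types.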

Use case
I'd like to use synth to generate data and bulk load it into a big data ecosystem (e.g. Hadoop, BigQuery).

Avro integration is possible; avrow should work well. Parquet integration would take a lot more work, as it doesn't seem to have a decent serde implementation.