INT96 import issue
hangxie opened this issue · 3 comments
INT96 values do not match the original values after import:
"Int96": "1717-12-28T19:20:10.805069776Z", | "Int96": "2022-01-01T09:09:09.009009Z",
I feel like this is unresolvable because parquet-go does not store arbitrary bytes in a proper way: they should be treated as []byte, but parquet-go uses string (more details at xitongsys/parquet-go#434). This means there is literally no way to import an INT96 value from JSON to parquet. Considering that INT96 is deprecated, I believe we are good.
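To illustrate why string versus []byte matters here, below is a minimal sketch using only Go's standard encoding/json. It is not parquet-go's code, and it may not be the exact failure path in parquet-go. The sample bytes and the helper names (viaString, viaBytes) are made up for illustration. It shows that raw bytes held in a Go string are altered by a JSON round trip when they are not valid UTF-8, while a []byte survives because it is base64-encoded.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sample is an arbitrary 12-byte value (hypothetical, not a specific
// timestamp) that contains sequences that are not valid UTF-8.
var sample = []byte{0x40, 0x92, 0xd7, 0xf2, 0xf5, 0x1d, 0x00, 0x00, 0x8d, 0x85, 0x25, 0x00}

// viaString stores the raw bytes in a Go string and round-trips it through
// JSON. encoding/json replaces invalid UTF-8 with U+FFFD when marshaling.
func viaString(raw []byte) ([]byte, error) {
	enc, err := json.Marshal(string(raw))
	if err != nil {
		return nil, err
	}
	var s string
	if err := json.Unmarshal(enc, &s); err != nil {
		return nil, err
	}
	return []byte(s), nil
}

// viaBytes round-trips the same data as []byte, which encoding/json
// encodes as a base64 string and decodes back losslessly.
func viaBytes(raw []byte) ([]byte, error) {
	enc, err := json.Marshal(raw)
	if err != nil {
		return nil, err
	}
	var b []byte
	if err := json.Unmarshal(enc, &b); err != nil {
		return nil, err
	}
	return b, nil
}

func main() {
	s, err := viaString(sample)
	if err != nil {
		panic(err)
	}
	b, err := viaBytes(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println("string round trip preserved bytes:", string(s) == string(sample))
	fmt.Println("[]byte round trip preserved bytes:", string(b) == string(sample))
}
```

With binary data like INT96, the string path is the lossy one, which is consistent with the mismatch shown above.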
I will do a couple of tests to confirm this.
There may be a solution: parse the INT96 timestamp string, convert it to an INT96 value, then let parquet-go handle the new value. However, this again needs to recursively iterate over the value nodes to decide which ones to convert, which is pretty complex (see the code for the cat command). I tend not to do this as INT96 is deprecated, but if parquet-go can fix this problem, I can take it.
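For reference, the timestamp conversion itself is the simple part. Here is a minimal sketch, assuming the common INT96 timestamp layout (8 bytes of little-endian nanoseconds within the day, followed by 4 bytes of little-endian Julian day number). The function names (timeToInt96, int96ToTime, roundTrip) are hypothetical and not part of parquet-go or this tool. The sketch does not cover the hard part, which is walking nested values to find the INT96 fields.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"time"
)

// julianDayOfUnixEpoch is the Julian day number of 1970-01-01.
const julianDayOfUnixEpoch = 2440588

const secondsPerDay = 86400

// timeToInt96 encodes t as 12 bytes: nanoseconds within the day (8 bytes,
// little-endian) followed by the Julian day number (4 bytes, little-endian).
func timeToInt96(t time.Time) [12]byte {
	t = t.UTC()
	secs := t.Unix()
	days := secs / secondsPerDay
	rem := secs % secondsPerDay
	if rem < 0 { // floor division for timestamps before 1970
		days--
		rem += secondsPerDay
	}
	nanos := rem*int64(time.Second) + int64(t.Nanosecond())

	var b [12]byte
	binary.LittleEndian.PutUint64(b[:8], uint64(nanos))
	binary.LittleEndian.PutUint32(b[8:], uint32(days+julianDayOfUnixEpoch))
	return b
}

// int96ToTime decodes the 12-byte layout produced by timeToInt96.
func int96ToTime(b [12]byte) time.Time {
	nanos := int64(binary.LittleEndian.Uint64(b[:8]))
	days := int64(binary.LittleEndian.Uint32(b[8:])) - julianDayOfUnixEpoch
	return time.Unix(days*secondsPerDay, nanos).UTC()
}

// roundTrip parses an RFC 3339 timestamp, encodes it as INT96, decodes it
// again, and returns the formatted result.
func roundTrip(s string) (string, error) {
	t, err := time.Parse(time.RFC3339Nano, s)
	if err != nil {
		return "", err
	}
	return int96ToTime(timeToInt96(t)).Format(time.RFC3339Nano), nil
}

func main() {
	for _, s := range []string{
		"2022-01-01T09:09:09.009009Z",
		"1717-12-28T19:20:10.805069776Z",
	} {
		out, err := roundTrip(s)
		if err != nil {
			panic(err)
		}
		fmt.Println(s, "->", out)
	}
}
```

The resulting 12 bytes would still have to reach parquet-go intact, which is the problem described in the earlier comment.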
I'm going to close this case as INT96 is barely supported nowadays, especially for writing.