Physical Type of Decimals
bkief opened this issue · 6 comments
My Rust is, well, "rusty", but am I correct in understanding that all ODBC decimal types convert to FIXED_LEN_BYTE_ARRAY? I have an Oracle NUMBER(5,3) and I was surprised that its physical_type was not int32.
odbc2parquet/src/query/decimal.rs
Line 44 in 8e52f25
Would it be as simple as changing this line
odbc2parquet/src/query/strategy.rs
Lines 82 to 89 in 8e52f25
to scale: p @ 0..=9,
Hello @bkief,
My Rust is, well, "rusty", but am I correct in understanding that all ODBC decimal types convert to FIXED_LEN_BYTE_ARRAY? I have an Oracle NUMBER(5,3) and I was surprised that its physical_type was not int32.
Yes, you are correct. All decimals with a scale different from 0 are currently represented as FIXED_LEN_BYTE_ARRAY. The binary representation in there is two's complement and not, as some might suspect, their text representation.
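To make the two's complement point concrete, here is a minimal sketch of how an unscaled decimal value can be laid out in a FIXED_LEN_BYTE_ARRAY. It follows the Parquet format's description of decimals (unscaled value, big-endian two's complement). The function names `byte_len_for_precision` and `to_fixed_len_bytes` are made up for illustration and are not taken from odbc2parquet, and whether the tool picks exactly this minimal length is an assumption.

```rust
/// Smallest number of bytes whose signed range can hold every value with
/// `precision` decimal digits (hypothetical helper, precision <= 38).
fn byte_len_for_precision(precision: u32) -> usize {
    let max = 10u128.pow(precision) - 1; // largest unscaled magnitude
    let mut n: usize = 1;
    // An n-byte two's complement integer can hold up to 2^(8n-1) - 1.
    while (1u128 << (8 * n - 1)) - 1 < max {
        n += 1;
    }
    n
}

/// Big-endian two's complement bytes of `unscaled`, truncated to `len` bytes.
/// Assumes `len <= 16` and that the value actually fits into `len` bytes.
fn to_fixed_len_bytes(unscaled: i128, len: usize) -> Vec<u8> {
    let bytes = unscaled.to_be_bytes(); // 16 bytes, most significant first
    bytes[16 - len..].to_vec()
}
```

Under these assumptions, a NUMBER(5,3) value such as -12.345 has the unscaled value -12345 and would occupy three bytes, with the sign carried by the leading 0xFF byte rather than by a text character.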
Would it be as simple as changing this line
No, because ODBC would not represent the DECIMAL as an integer. identical in that context means that the ODBC and Parquet representation of the type in question are identical and can be copied without any transformation. Yet the most reliable way to get decimals out of ODBC is to transfer them in their text representation. So some conversion must always happen, except for decimals with scale 0, which ODBC will happily convert into integers for me.
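The conversion mentioned here, from a decimal's text representation to an integer-backed value, could look roughly like the sketch below. It is an illustration only: `parse_unscaled` is a hypothetical function, not odbc2parquet's actual code, and it assumes the driver returns plain digits with an optional sign and a `.` as the decimal separator.

```rust
/// Parse decimal text such as "12.345" into its unscaled integer for a given
/// scale (12345 for scale 3). Returns None for malformed input, for more
/// fractional digits than `scale`, or on overflow. Hypothetical helper.
fn parse_unscaled(text: &str, scale: usize) -> Option<i64> {
    let text = text.trim();
    let (negative, digits) = match text.strip_prefix('-') {
        Some(rest) => (true, rest),
        None => (false, text.strip_prefix('+').unwrap_or(text)),
    };
    let (int_part, frac_part) = digits.split_once('.').unwrap_or((digits, ""));
    if frac_part.len() > scale || (int_part.is_empty() && frac_part.is_empty()) {
        return None;
    }
    // Integer digits, then fraction digits, then zero padding up to `scale`.
    let padded = int_part
        .bytes()
        .chain(frac_part.bytes())
        .chain(std::iter::repeat(b'0').take(scale - frac_part.len()));
    let mut value: i64 = 0;
    for b in padded {
        if !b.is_ascii_digit() {
            return None;
        }
        value = value.checked_mul(10)?.checked_add(i64::from(b - b'0'))?;
    }
    Some(if negative { -value } else { value })
}
```

The resulting integer is what an INT32 or INT64 backed decimal column would store, so the parsing step is needed regardless of which physical type is chosen.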
Cheers, Markus
Would it be better for you if the physical type were i32? If so, how?
I was hoping to use the delta_bitpacked encoding. I have time-series data that lends itself very well to that format.
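As a rough illustration of why integer-backed time series suit delta encoding (which Parquet defines for INT32 and INT64, not for FIXED_LEN_BYTE_ARRAY), the sketch below estimates how many bits each delta would need once the minimum delta is subtracted. It is a simplification: the real DELTA_BINARY_PACKED encoding works in blocks and miniblocks, and `delta_bit_width` is a hypothetical name used only here.

```rust
/// Bits needed per delta after subtracting the minimum delta, computed over
/// the whole slice. Returns None if there are fewer than two values.
/// Simplified sketch; the real encoding does this per block/miniblock.
fn delta_bit_width(values: &[i32]) -> Option<u32> {
    let deltas: Vec<i64> = values
        .windows(2)
        .map(|w| i64::from(w[1]) - i64::from(w[0]))
        .collect();
    let min = *deltas.iter().min()?;
    let max = *deltas.iter().max()?;
    let range = (max - min) as u64; // non-negative, fits easily for i32 input
    Some(u64::BITS - range.leading_zeros())
}
```

For slowly changing unscaled values such as 12345, 12347, 12346, 12350, the deltas span only a handful of values, so a few bits per entry suffice instead of 32.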
Thanks, I now understand much better. Could you think of any reason why someone might not want to have the decimal backed by a physical integer type?