Implementation of the ORC file format
Read Apache ORC in Rust.
- Read ORC files
- Read stripes (the conversion from proto metadata to memory regions)
- Decode stripes (the math of decoding stripe data into, e.g., booleans, RLE runs, etc.)
- Decode ORC data to Arrow Datatypes (Async/Sync)
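
To make the "decode stripes" step more concrete, here is a minimal, standalone sketch of ORC's byte run-length encoding as described in the ORC specification, plus the bit unpacking used for boolean columns. It is not this crate's API: `decode_byte_rle`, `unpack_booleans`, and `RleError` are hypothetical names chosen for illustration, and the crate's real decoders may be structured quite differently.

```rust
/// Hypothetical error type for this sketch (not part of the crate).
#[derive(Debug, PartialEq)]
pub enum RleError {
    /// The input ended in the middle of a run or a literal list.
    UnexpectedEof,
}

/// Decode a buffer encoded with ORC byte RLE.
///
/// Each group starts with a signed header byte:
/// - 0..=127: a run of `header + 3` copies of the next byte
/// - -128..=-1: a literal list of `-header` bytes that follow
pub fn decode_byte_rle(input: &[u8]) -> Result<Vec<u8>, RleError> {
    let mut out = Vec::new();
    let mut pos = 0;
    while pos < input.len() {
        let header = input[pos] as i8;
        pos += 1;
        if header >= 0 {
            // Run: one value repeated (header + 3) times.
            let len = header as usize + 3;
            let value = *input.get(pos).ok_or(RleError::UnexpectedEof)?;
            pos += 1;
            out.extend(std::iter::repeat(value).take(len));
        } else {
            // Literal list: widen to i16 first so that -128 negates safely.
            let len = (-(header as i16)) as usize;
            let end = pos + len;
            let literals = input.get(pos..end).ok_or(RleError::UnexpectedEof)?;
            out.extend_from_slice(literals);
            pos = end;
        }
    }
    Ok(out)
}

/// Unpack boolean values from decoded bytes, most significant bit first,
/// keeping only the first `count` values (the last byte may be padded).
pub fn unpack_booleans(bytes: &[u8], count: usize) -> Vec<bool> {
    bytes
        .iter()
        .flat_map(|&b| (0..8u32).rev().map(move |bit| (b >> bit) & 1 == 1))
        .take(count)
        .collect()
}
```

With this sketch, `decode_byte_rle(&[0x61, 0x00])` yields 100 zero bytes and `decode_byte_rle(&[0xfe, 0x44, 0x45])` yields the two literal bytes `0x44, 0x45`, matching the examples given in the ORC specification.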
Column Encoding | Read | Write | Rust Type | Arrow DataType |
---|---|---|---|---|
SmallInt, Int, BigInt | ✓ | | i16, i32, i64 | Int16, Int32, Int64 |
Float, Double | ✓ | | f32, f64 | Float32, Float64 |
String, Char, and VarChar | ✓ | | String | Utf8 |
Boolean | ✓ | | bool | Boolean |
TinyInt | ✓ | | i8 | Int8 |
Binary | ✓ | | Vec<u8> | Binary |
Decimal | ✗ | | | |
Date | ✓ | | chrono::NaiveDate | Date32 |
Timestamp | ✓ | | chrono::NaiveDateTime | Timestamp(Nanosecond,_) |
Timestamp instant | ✗ | | | |
Struct | ✓ | | | Struct |
List | ✗ | | | |
Map | ✗ | | | |
Union | ✗ | | | |
Compression | Read | Write |
---|---|---|
None | ✓ | ✗ |
ZLIB | ✓ | ✗ |
SNAPPY | ✓ | ✗ |
LZO | ✓ | ✗ |
LZ4 | ✓ | ✗ |
ZSTD | ✓ | ✗ |
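
For context on what reading a compressed stream involves: the ORC specification splits each compressed stream into chunks, and every chunk starts with a 3-byte little-endian header storing `compressedLength * 2 + isOriginal`. The sketch below parses that header. It is an illustration only, and `ChunkHeader` and `parse_chunk_header` are hypothetical names, not this crate's API.

```rust
/// Hypothetical representation of an ORC compression chunk header.
#[derive(Debug, PartialEq)]
pub struct ChunkHeader {
    /// Number of bytes in the chunk body that follows the header.
    pub length: usize,
    /// True when the body is stored uncompressed ("original").
    pub is_original: bool,
}

/// Parse the 3-byte little-endian chunk header.
/// The lowest bit is the "original" flag; the remaining 23 bits are the length.
pub fn parse_chunk_header(bytes: [u8; 3]) -> ChunkHeader {
    // Pad to 4 bytes so the value can be read as a little-endian u32.
    let raw = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], 0]);
    ChunkHeader {
        length: (raw >> 1) as usize,
        is_original: raw & 1 == 1,
    }
}
```

Using the examples from the specification, `parse_chunk_header([0x40, 0x0d, 0x03])` describes a 100,000-byte compressed chunk, and `parse_chunk_header([0x0b, 0x00, 0x00])` describes 5 bytes stored uncompressed. A reader would then hand the chunk body to the codec named in the file's postscript, or copy it through unchanged when `is_original` is set.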