jorgecarleitao/arrow2

Crash when loading avro file


I found this via the dataframe commands in Nushell. Opening an Avro file that avro-tools processes without problems produces the following panic:

$ > dfr open sample_data.avro

thread 'main' panicked at 'internal error: entered unreachable code', /Users/brew/Library/Caches/Homebrew/cargo_cache/registry/src/index.crates.io-6f17d22bba15001f/arrow2-0.18.0/src/io/avro/read/deserialize.rs:42:17
stack backtrace:
   0:        0x109c188ed - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::he69c0e17cb41f255
   1:        0x107fc7e7b - core::fmt::write::h66293df4c7dd941a
   2:        0x109bf24d6 - std::io::Write::write_fmt::h2f5a7ea5f48a0b56
   3:        0x109c186d0 - std::sys_common::backtrace::print::h71fd332624ce1826
   4:        0x109c19a95 - std::panicking::default_hook::{{closure}}::ha2a0e70fb3678142
   5:        0x109c19839 - std::panicking::default_hook::hb166cd42dec7ff92
   6:        0x109c1a158 - std::panicking::rust_panic_with_hook::h2b924837648ff0c0
   7:        0x109c19ef7 - std::panicking::begin_panic_handler::{{closure}}::h04e24a68d30d9f5c
   8:        0x109c18b09 - std::sys_common::backtrace::__rust_end_short_backtrace::hd45b5152c8265971
   9:        0x109c19cc2 - _rust_begin_unwind
  10:        0x109ebe8f3 - core::panicking::panic_fmt::h9302663e63786640
  11:        0x109ebe97e - core::panicking::panic::hc241fa596ee7b5bf
  12:        0x107c82fd5 - arrow2::io::avro::read::deserialize::make_mutable::hf45bb316f93342e7
  13:        0x107bc4989 - <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::try_fold::hac1aaaf51e5a9c68
  14:        0x107cc0749 - <core::iter::adapters::GenericShunt<I,R> as core::iter::traits::iterator::Iterator>::next::h9aaae8716af942a4
  15:        0x107bd3ff3 - <alloc::vec::Vec<T> as alloc::vec::spec_from_iter_nested::SpecFromIterNested<T,I>>::from_iter::h4d6a9bbd64e00fd1
  16:        0x107cc291d - core::iter::adapters::try_process::haa38130211fad031
  17:        0x107c81faa - arrow2::io::avro::read::deserialize::make_mutable::hf45bb316f93342e7
  18:        0x107bc3ca7 - <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::try_fold::h34fe43d77a4e4ea2
  19:        0x107cc07a0 - <core::iter::adapters::GenericShunt<I,R> as core::iter::traits::iterator::Iterator>::next::hb7acace4ca3da3b4
  20:        0x107c00877 - alloc::vec::Vec<T,A>::extend_desugared::h4b63fc03b51dee40
  21:        0x107bde9b3 - <alloc::vec::Vec<T> as alloc::vec::spec_from_iter_nested::SpecFromIterNested<T,I>>::from_iter::hdfcc1a7244bd05e5
  22:        0x107cc2679 - core::iter::adapters::try_process::h9398ad5edf6457e1
  23:        0x107c84d69 - arrow2::io::avro::read::deserialize::deserialize::ha46ac8ab77f34589
  24:        0x1081af894 - <arrow2::io::avro::read::Reader<R> as core::iter::traits::iterator::Iterator>::next::h5de93b8a755eda66
  25:        0x10823e708 - polars_io::finish_reader::he66627c0790eb19c
  26:        0x108188119 - <polars_io::avro::read::AvroReader<R> as polars_io::SerReader<R>>::finish::h6afca6396bbfd14d
  27:        0x1081dc089 - nu_cmd_dataframe::dataframe::eager::open::from_avro::h9c899e19fb4350da
  28:        0x1081d98ba - <nu_cmd_dataframe::dataframe::eager::open::OpenDataFrame as nu_protocol::engine::command::Command>::run::h5eec975b4e8f3f15
  29:        0x1086cd8ff - nu_engine::eval::eval_call::hb245c23a0a05ec19
  30:        0x1086d4ab3 - nu_engine::eval::eval_expression_with_input::h0ea7c9a8250726af
  31:        0x1086d5066 - nu_engine::eval::eval_element_with_input::h0de0d47b9ab3898d
  32:        0x1086d6457 - nu_engine::eval::eval_block::h686d5a241f4a798f
  33:        0x108101a2c - nu_cli::util::eval_source::hb1faa72058fd841a
  34:        0x1080ce946 - nu_cli::repl::evaluate_repl::h4d0e2da8951c81ad
  35:        0x1080ac4cc - nu::run::run_repl::h23e58cf0ef637612
  36:        0x1080a55ed - nu::main::h9548f5cabf7cf92d
  37:        0x1080b3fc1 - std::sys_common::backtrace::__rust_begin_short_backtrace::h446a3e91046a5ea9
  38:        0x1080a9351 - std::rt::lang_start::{{closure}}::h2f3e1fbc67a86e07
  39:        0x109c19be4 - std::panicking::try::hb5cb29dbfee1dcfc
  40:        0x109bfb5fe - std::rt::lang_start_internal::h634e63ff6023f727
  41:        0x1080a63ac - _main
  42:     0x7ff8057973a6 - <unknown>

I have attached the file in question. This is the metadata dump from avro-tools:

avro-tools getmeta sample_data.avro

23/11/21 10:57:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
avro.schema	{"type":"record","name":"FromAvroTest","namespace":"nushell","fields":[{"name":"stringField","type":"string"},{"name":"intField","type":"int"},{"name":"complexField","type":{"type":"record","name":"nestedType","fields":[{"name":"enumField","type":{"type":"enum","name":"muck","symbols":["foo","bar"]}}]}}]}
avro.codec	deflate
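
To make the shape of that schema easier to see while triaging, here is a small Python sketch that flattens it into leaf fields. It is not part of arrow2 or Nushell; `walk` is a hypothetical helper that only handles the shapes present in this schema (records, named types, and primitives), not unions, arrays, or maps. The backtrace shows `make_mutable` calling itself once (frames 12 and 17), which would be consistent with the panic being hit while descending into the nested record, but that is an assumption until it is checked against deserialize.rs.

```python
import json

# Schema copied verbatim from the `avro-tools getmeta` output above.
SCHEMA = json.loads(
    '{"type":"record","name":"FromAvroTest","namespace":"nushell","fields":['
    '{"name":"stringField","type":"string"},'
    '{"name":"intField","type":"int"},'
    '{"name":"complexField","type":{"type":"record","name":"nestedType","fields":['
    '{"name":"enumField","type":{"type":"enum","name":"muck","symbols":["foo","bar"]}}]}}]}'
)


def walk(schema, path=""):
    """Yield (dotted_path, avro_type) for each leaf field.

    Hypothetical helper for illustration: descends into nested records and
    reports every other type as a leaf.
    """
    if isinstance(schema, dict):
        kind = schema["type"]
        if kind == "record":
            for field in schema["fields"]:
                child = f"{path}.{field['name']}" if path else field["name"]
                yield from walk(field["type"], child)
        else:
            # Named non-record types such as enum or fixed.
            yield path, kind
    else:
        # Primitive types are written as plain strings, e.g. "string".
        yield path, schema


for field_path, avro_type in walk(SCHEMA):
    print(f"{field_path}: {avro_type}")
```

For this schema it prints `stringField: string`, `intField: int`, and `complexField.enumField: enum`, so the only non-primitive leaf is the enum nested one level inside `complexField`.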

sample_data.tgz

I'll see if I can figure out why it's happening.