Segfault after GC when data frame contains Ruby objects
q3aiml opened this issue · 2 comments
q3aiml commented
Thanks for bringing Polars to Ruby!
I can get Ruby to reliably crash when putting objects into a dataframe and performing garbage collection:
irb(main):004> Polars::VERSION
=> "0.9.0"
irb(main):005> df = Polars::DataFrame.new({ c: [Object.new, Object.new, Object.new] }, schema: {'c' => Polars::Object})
=>
shape: (3, 1)
...
irb(main):006> df
=>
shape: (3, 1)
┌──────────────────────────────┐
│ c │
│ --- │
│ object │
╞══════════════════════════════╡
│ #<Object:0x0000000126aa8008> │
│ #<Object:0x0000000126aa7fb8> │
│ #<Object:0x0000000126aa7f68> │
└──────────────────────────────┘
irb(main):007> GC.start
=> nil
irb(main):008> df
[ segfault ]
In more natural use cases I have also seen this appear first as corruption, with arbitrary other objects replacing those in the data frame.
Ruby version: ruby 3.3.0 (2023-12-25 revision 5124f9ac75) +YJIT [arm64-darwin23]
I don't have much experience here and haven't had a chance to dig in, but I imagine the objects need to be marked as still referenced from outside Ruby, with something like rb_gc_mark?
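If that guess is right (the data frame holds the objects without the GC knowing about them), a possible stopgap on the Ruby side would be to keep a strong reference to the same objects for as long as the frame is in use. Below is a minimal sketch of that idea. PinnedFrame is a hypothetical helper, not part of polars-df, and I have not verified that it avoids the crash on 0.9.0. It would also not cover frames derived from the original, and it is not a substitute for marking the objects in the extension itself.

```ruby
# Hypothetical helper (not part of polars-df): keeps a strong Ruby reference
# to the objects for as long as the frame built from them is reachable.
class PinnedFrame
  attr_reader :frame, :objects

  # objects: the Ruby objects that will be stored in the frame.
  # The block receives the pinned array and must return the frame.
  def initialize(objects)
    @objects = objects.dup.freeze
    @frame = yield(@objects)
  end
end

# Intended use with polars-df (an assumption, untested):
#   pinned = PinnedFrame.new([Object.new, Object.new, Object.new]) do |objs|
#     Polars::DataFrame.new({ c: objs }, schema: { 'c' => Polars::Object })
#   end
#   GC.start
#   pinned.frame
```

Holding the array and the frame in one object means the two stay reachable together, so the objects cannot be collected while the frame is still referenced through the helper.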