PyArrow tables instead of DataFrames
danthegoodman1 opened this issue · 2 comments
danthegoodman1 commented
I tried doing this, but inserts slowly ground to a halt, whereas copying with DFs was fine. Maybe they were not different tables?
Either way, if this is done right, performance should be significantly faster for larger batches.
danthegoodman1 commented
it might actually be the list(map(lambda))...
danthegoodman1 commented
It was a copying issue. I decided to deep copy during insert so the behavior isn't unexpected, and added a deep copy flag.
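
For anyone reading later, here is a rough sketch of what "deep copy during insert, with a flag" can look like. It is not the project's actual code: `TableStore`, `insert`, `all_rows`, and the `deep_copy` parameter are hypothetical names, and a plain list of row dicts stands in for the table so the example runs with only the standard library.

```python
import copy


class TableStore:
    """Hypothetical in-memory store; a list of row dicts stands in for a table."""

    def __init__(self):
        self._batches = []

    def insert(self, rows, deep_copy=True):
        # With deep_copy=True the store keeps an independent copy, so later
        # changes to the caller's objects cannot alter stored data.
        # With deep_copy=False the store keeps a reference to the same objects.
        batch = copy.deepcopy(rows) if deep_copy else rows
        self._batches.append(batch)

    def all_rows(self):
        # Flatten every stored batch into one list of rows.
        return [row for batch in self._batches for row in batch]


rows = [{"id": 1, "tags": ["a"]}]

safe = TableStore()
safe.insert(rows)  # default: deep copy

shared = TableStore()
shared.insert(rows, deep_copy=False)  # opt out: shares the caller's objects

# Mutate the caller's data after both inserts.
rows[0]["tags"].append("b")

safe_rows = safe.all_rows()
shared_rows = shared.all_rows()
```

After the mutation, `safe_rows` still holds the original data while `shared_rows` reflects the change. Defaulting to a deep copy trades some insert cost for predictable behavior, and the flag lets callers who know their data will not change skip that cost.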