Remove unnecessary dataframe copies in methods
Erik-Geo commented
In LayeredData.select_by_values, the copy operation is really slow on large datasets (e.g. DINO, 5 million rows):
selected = self.df.copy()
However, assigning the DataFrame directly, without a copy, also works as intended and does not slow down select_by_values:
selected = self.df
We should check which other methods can be optimized by removing unnecessary copies, and add tests for DataFrame copy/view behaviour so that these methods never modify self.df unintentionally.
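A rough sketch of what such a copy/view test could look like. The body of select_by_values is not shown in this issue, so the class below is a hypothetical stand-in: it assumes the selection is done with a boolean mask (`isin`), and the column names and the helper `check_original_unchanged` are made up for illustration. Only the `df` attribute and the `selected = self.df` line come from the issue.

```python
import warnings

import pandas as pd


class LayeredDataSketch:
    """Hypothetical stand-in for LayeredData; only `df` is taken from the issue."""

    def __init__(self, df):
        self.df = df

    def select_by_values(self, column, values):
        # No upfront self.df.copy(): binding a second name to the frame is free.
        selected = self.df
        # Boolean-mask indexing returns a new DataFrame, so self.df is untouched.
        selected = selected[selected[column].isin(values)]
        return selected


def check_original_unchanged():
    # Made-up example data; the real dataset (DINO) is far larger.
    df = pd.DataFrame(
        {"lithology": ["sand", "clay", "sand", "peat"], "top": [0.0, 1.2, 2.5, 3.1]}
    )
    snapshot = df.copy(deep=True)  # copying is fine in a test, not in the hot path
    layered = LayeredDataSketch(df)

    result = layered.select_by_values("lithology", ["sand"])

    # Write into the result; older pandas versions may warn here, which we ignore.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        result.loc[result.index[0], "top"] = -1.0

    # The write must not leak back into the original DataFrame.
    pd.testing.assert_frame_equal(layered.df, snapshot)
    return result
```

If a method instead modifies `selected` in place before filtering (for example by assigning a column on it), dropping the copy would change self.df, and a test like this should catch that.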