Add check in panelsum() when factors are sorted
Closed this issue · 0 comments
sergiocorreia commented
This would speed up F.sort() calls when the dataset is already sorted by the factor, which is particularly useful when the method is called many times (e.g. in reghdfe).
First, add an .is_sorted flag to the factor object.
Then, in the loop that fills p, replace (not tested):
p[index[level] = index[level] + 1] = obs
with
p[idx = index[level] = index[level] + 1] = obs
if (is_sorted) {
    if (idx < last_idx) is_sorted = 0 // set is_sorted = 1 before the loop
    last_idx = idx // initially set last_idx = 0
}
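To make the idea concrete, here is a rough Python sketch of the same loop (not the Mata code from ftools). It assumes 0-based level ids and a precomputed list of per-level counts; the function name and arguments are made up for illustration. The check works because each level owns a contiguous block of slots in p, so the slot index only ever goes down when an observation's level is lower than the previous one.

```python
def permutation_with_sorted_check(levels, counts):
    """Hypothetical helper: returns (p, is_sorted) for 0-based level ids."""
    # offsets[k] = number of observations in levels 0..k-1,
    # i.e. the first slot of level k's block in p
    offsets = []
    total = 0
    for c in counts:
        offsets.append(total)
        total += c

    p = [None] * len(levels)
    is_sorted = True   # set before the loop, as in the proposal
    last_idx = -1      # smaller than any valid 0-based slot
    for obs, level in enumerate(levels):
        idx = offsets[level]   # next free slot for this level
        offsets[level] += 1
        p[idx] = obs
        if is_sorted:
            # a slot lower than the previous one means the data is out of order
            if idx < last_idx:
                is_sorted = False
            last_idx = idx
    return p, is_sorted
```

For example, levels [0, 0, 1, 2] leave is_sorted true, while [1, 0, 1] flip it to false at the second observation.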
Also benchmark it to see whether the extra check slows the loop down noticeably (in which case we make the sort check optional and unroll the loop).
Finally, sort() and _sort() should add a line like if (is_sorted) return(data)
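A rough Python sketch of that early return, again only an illustration: FactorSketch is a hypothetical stand-in for the factor object, p is assumed to be a 0-based permutation such that data[p] is in factor order, and _sort is assumed to work in place (so it returns nothing rather than the data).

```python
class FactorSketch:
    """Hypothetical stand-in for the factor object; not the ftools class."""

    def __init__(self, p, is_sorted):
        self.p = p                  # permutation that puts data in factor order
        self.is_sorted = is_sorted  # flag set while p was built

    def sort(self, data):
        if self.is_sorted:
            return data             # already in order: skip the permutation
        return [data[i] for i in self.p]

    def _sort(self, data):
        if self.is_sorted:
            return                  # in-place variant: nothing to do
        data[:] = [data[i] for i in self.p]
```

When the flag is set, sort() hands back the input untouched, so the cost of a call drops to a single branch.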