sergiocorreia/ftools

Add check in panelsum() when factors are sorted

Closed this issue · 0 comments

This would speed up F.sort() calls when the dataset is already sorted by the factor variables. It is particularly useful when the method is called many times (e.g. by reghdfe).

First, create an .is_sorted field on the factor.

Then, intercept this loop and replace (not tested):

p[index[level] = index[level] + 1] = obs

with

p[idx = index[level] = index[level] + 1] = obs
if (is_sorted) {
    if (idx < last_idx) is_sorted = 0 // set is_sorted = 1 before the loop
    last_idx = idx // initially set last_idx = 0
}
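
To make the idea easier to check outside Stata, here is a minimal Python sketch of the same loop (the real code is Mata, and this is not the ftools implementation). The function name `permutation_with_sorted_check` and its arguments are made up for illustration, levels are assumed to be integers in 1..num_levels, and positions are 0-based, so `last_idx` starts at -1 instead of 0. The check works because the write positions form a permutation: if they never decrease, they must be 0, 1, 2, ..., which means the data was already in order.

```python
def permutation_with_sorted_check(levels, num_levels):
    """Counting-sort permutation for integer levels in 1..num_levels.

    Returns (p, is_sorted): p[k] is the 0-based observation that lands in
    sorted position k, and is_sorted is True when levels was already sorted.
    Hypothetical helper for illustration only.
    """
    # Count the observations in each level (slot 0 is unused).
    counts = [0] * (num_levels + 1)
    for level in levels:
        counts[level] += 1

    # index[level] = next free 0-based slot for that level.
    index = [0] * (num_levels + 1)
    for level in range(2, num_levels + 1):
        index[level] = index[level - 1] + counts[level - 1]

    p = [0] * len(levels)
    is_sorted = True   # set before the loop, as in the issue
    last_idx = -1      # 0-based analogue of "initially set last_idx = 0"
    for obs, level in enumerate(levels):
        idx = index[level]       # slot for this observation
        index[level] = idx + 1   # advance the level's cursor
        p[idx] = obs
        if is_sorted:
            if idx < last_idx:   # a write position went backwards
                is_sorted = False
            last_idx = idx
    return p, is_sorted
```

For example, `permutation_with_sorted_check([1, 1, 2, 3, 3], 3)` reports sorted input, while `[2, 1, 2, 1]` with two levels does not, because the second observation is written to a slot before the first one.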

Also benchmark it to see how much the extra check slows down the loop; if the slowdown is high, make the sort check optional and unroll the loop.
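
A rough way to set up such a benchmark, sketched in Python rather than Mata (so the absolute numbers will not carry over; only the structure of the comparison does). All names here (`fill_plain`, `fill_checked`, `start_offsets`, `benchmark`) are hypothetical. The input is deliberately pre-sorted, since that is the case where the check stays active for every observation and costs the most.

```python
import random
import timeit


def start_offsets(levels, num_levels):
    """First free 0-based slot for each level in 1..num_levels."""
    counts = [0] * (num_levels + 1)
    for level in levels:
        counts[level] += 1
    start = [0] * (num_levels + 1)
    for level in range(2, num_levels + 1):
        start[level] = start[level - 1] + counts[level - 1]
    return start


def fill_plain(levels, start):
    """Original loop: build the permutation without any sortedness check."""
    index = list(start)
    p = [0] * len(levels)
    for obs, level in enumerate(levels):
        idx = index[level]
        index[level] = idx + 1
        p[idx] = obs
    return p


def fill_checked(levels, start):
    """Same loop with the proposed is_sorted / last_idx bookkeeping."""
    index = list(start)
    p = [0] * len(levels)
    is_sorted = True
    last_idx = -1
    for obs, level in enumerate(levels):
        idx = index[level]
        index[level] = idx + 1
        p[idx] = obs
        if is_sorted:
            if idx < last_idx:
                is_sorted = False
            last_idx = idx
    return p


def benchmark(num_obs=200_000, num_levels=1_000, repeats=5, seed=0):
    """Return (best plain time, best checked time) in seconds on sorted input."""
    rng = random.Random(seed)
    levels = sorted(rng.randint(1, num_levels) for _ in range(num_obs))
    start = start_offsets(levels, num_levels)
    t_plain = min(timeit.repeat(lambda: fill_plain(levels, start),
                                number=1, repeat=repeats))
    t_checked = min(timeit.repeat(lambda: fill_checked(levels, start),
                                  number=1, repeat=repeats))
    return t_plain, t_checked
```

Calling `benchmark()` returns the two best-of-N timings; their ratio gives a first impression of the relative cost of the check before repeating the exercise in Mata.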

Finally, sort() and _sort() should each add a line like if (is_sorted) return(data), so that data that is already sorted skips the reordering.
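
The early exit could look roughly like the following Python sketch. `SketchFactor` is a hypothetical stand-in for illustration, not the ftools class (which is written in Mata); it only assumes that the factor stores a 0-based permutation `p` and the `is_sorted` flag described above.

```python
class SketchFactor:
    """Hypothetical stand-in for a factor holding a permutation and a flag."""

    def __init__(self, p, is_sorted):
        self.p = p                  # p[k] = observation that goes to position k
        self.is_sorted = is_sorted  # True when the data was already in order

    def sort(self, data):
        # Early exit: nothing to reorder, hand back the input untouched.
        if self.is_sorted:
            return data
        # Otherwise apply the permutation.
        return [data[i] for i in self.p]
```

With `is_sorted` set, `sort()` returns the very same object it was given, so no copy or reordering takes place; otherwise the permutation is applied as usual.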