Adding na.locf
my-R-help opened this issue · 2 comments
I think it would be nice to have the equivalent of zoo::na.locf in RcppRoll as well.
It would also be nice to limit how far the last observation is carried forward (or backward if fromLast = TRUE). This is similar to the maxgap argument, but the difference is that maxgap leaves the entire gap unfilled as soon as the gap is longer than maxgap (which I think is not ideal, or at least deserves an additional argument):
x <- 1:8
x[4:5] <- NA
na.locf(x, maxgap = 1) # Current `zoo` behavior.
[1] 1 2 3 NA NA 6 7 8
na.locf(x, maxcarry = 1) # This is what I mean.
[1] 1 2 3 3 NA 6 7 8
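To spell out the maxcarry behavior I have in mind, here is a minimal sketch in plain C++. It is only an illustration: the function name locf_maxcarry is made up, NaN stands in for R's NA, and nothing here is existing zoo or RcppRoll code. An actual Rcpp version would need proper NA handling and a fromLast option.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of zoo or RcppRoll).
// NaN stands in for NA. Each observed value is carried forward into at most
// `maxcarry` consecutive missing positions; anything beyond that stays missing.
// Leading missing values are left untouched.
std::vector<double> locf_maxcarry(std::vector<double> x, std::size_t maxcarry) {
    bool have_last = false;    // becomes true once an observation has been seen
    double last = 0.0;         // most recent observed value
    std::size_t carried = 0;   // how often `last` has been carried in this gap
    for (double &v : x) {
        if (!std::isnan(v)) {
            last = v;
            have_last = true;
            carried = 0;       // a new observation resets the carry budget
        } else if (have_last && carried < maxcarry) {
            v = last;
            ++carried;
        }
    }
    return x;
}
```

With the example vector above and maxcarry = 1, the first NA becomes 3 and the second one stays missing.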
Any particular reason why we want to re-implement this in RcppRoll? Is the zoo version unnecessarily slow?
I don't have a specific answer to that question, but I can describe my reasoning behind this issue.
First of all, I felt that it would fit nicely with the overall idea of RcppRoll, i.e. filling values in a vector or matrix based on various rules.
Second, often one does not need the full time series functionality of zoo or xts, and just uses the roll or na.locf functions. (This is the case for me, since I do most of my stuff with data.table nowadays.) So instead of loading yet another library, I think it would be nice to make RcppRoll more self-contained.
Third, na.locf skips the carry forward (or backward) for the whole gap if the maxgap argument is set and the gap is longer than maxgap. This is what I meant in my comment above. While that may make sense for some applications, it may not be what you need in others.
Fourth, na.locf in zoo is written in R, which means there are probably some performance gains to be made using Rcpp. I haven't run any benchmarks, and I don't recall it being excessively slow, but my feeling is that it's not overly fast either: looking at the source code, it uses apply, which makes me think that it can't be very fast in all cases.
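To make the third point concrete, here is a rough sketch of the maxgap behavior as I understand it, written as a single linear pass in plain C++. The name locf_maxgap is made up, NaN stands in for NA, and this is not zoo's actual implementation. How leading and trailing gaps are treated here is my own assumption.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper illustrating maxgap-style filling (not zoo's code).
// NaN stands in for NA. A run of missing values is filled with the preceding
// observation only if the run has at most `maxgap` elements; a longer run is
// left missing in its entirety. Leading runs have nothing to carry and stay.
std::vector<double> locf_maxgap(std::vector<double> x, std::size_t maxgap) {
    const std::size_t n = x.size();
    std::size_t i = 0;
    while (i < n) {
        if (!std::isnan(x[i])) {
            ++i;
            continue;
        }
        // Find the end of the current run of missing values: [i, j).
        std::size_t j = i;
        while (j < n && std::isnan(x[j])) {
            ++j;
        }
        // Fill only if there is a preceding observation and the run is short.
        if (i > 0 && j - i <= maxgap) {
            for (std::size_t k = i; k < j; ++k) {
                x[k] = x[i - 1];
            }
        }
        i = j;
    }
    return x;
}
```

For a vector like 1 2 3 NA NA 6 7 8, maxgap = 1 leaves both NAs in place, while maxgap = 2 fills both with 3. There is no setting that fills only the first one, which is the case I am asking for.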