Argument to pad with NAs when n > 1
datalove opened this issue · 11 comments
Hi Kevin, just wondering if you might consider including an option to pad the returned vectors/matrices with NAs when n > 1?
I'd love to be able to do something like this:
data.frame(a = 1:10, b = roll_sum(1:10, 3, padNA = TRUE))
instead of this:
data.frame(a = 1:10, b = c(rep(NA, 2), roll_sum(1:10, 3)))
Of course, data.frame(a = 1:x, b = roll_sum(1:x, n = 3)) throws an error because the length of roll_sum(x, n) for n > 1 is shorter than length(1:x).
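For anyone following along without RcppRoll installed, the length mismatch can be reproduced with a small base-R stand-in. This is only a sketch: roll_sum_base is a hypothetical helper built on embed(), not RcppRoll's implementation, and it assumes the unpadded behaviour described above.

```r
# Hypothetical base-R stand-in for an unpadded rolling sum
# (an assumption for illustration, not RcppRoll's code).
roll_sum_base <- function(x, n) rowSums(embed(x, n))

x <- 1:10
s <- roll_sum_base(x, 3)

# The result has length(x) - n + 1 elements, so 8 here rather than 10.
length(s)

# data.frame() refuses to combine columns of 10 and 8 rows;
# capture the error message instead of stopping.
res <- tryCatch(
  data.frame(a = x, b = s),
  error = function(e) conditionMessage(e)
)
res
```

Padding the short vector back to length(x) is what makes the data.frame() call work.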
A good idea -- I might prefer e.g. pad.left or pad.right in case we want to control how padding occurs. Or maybe have pad be one of "left", "right", or NULL. I'll think about this. Thanks for the feature suggestion!
Hi Kevin, I suspect I'm missing something, but if roll_abc(x) rolls forward (1:length(x)) and there is no option to roll backward (length(x):1), then isn't left padding of NAs all we'd ever need?
If there were an option to roll backward, then with pad = TRUE it could automatically pad the end of the vector with NAs.
Maybe, but I think there could be users who would still prefer different 'alignment', so that e.g. they prefer to get:
> data.frame(a = 1:5, b = c(NA, NA, roll_sum(1:5, 3)))
a b
1 1 NA
2 2 NA
3 3 6
4 4 9
5 5 12
while sometimes, someone might want
> data.frame(a = 1:5, b = c(roll_sum(1:5, 3), NA, NA))
a b
1 1 6
2 2 9
3 3 12
4 4 NA
5 5 NA
Or is that a rather awkward proposition?
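The two alignments above could be wrapped in a small helper along these lines. This is a sketch only: pad_roll and its pad argument are hypothetical names that follow the "left" / "right" / NULL idea floated earlier in the thread, and the rolling sum is computed with base R rather than RcppRoll.

```r
# Hypothetical helper: pad an unpadded rolling result back to length `len`.
# pad = "left" puts the NAs first, "right" puts them last, NULL leaves it as is.
pad_roll <- function(res, len, pad = c("left", "right")) {
  if (is.null(pad)) return(res)
  pad <- match.arg(pad)
  nas <- rep(NA, len - length(res))
  if (pad == "left") c(nas, res) else c(res, nas)
}

# Base-R rolling sum of 1:5 with a window of 3 (stand-in for roll_sum).
sums <- rowSums(embed(1:5, 3))

left  <- pad_roll(sums, 5, "left")   # NA NA  6  9 12
right <- pad_roll(sums, 5, "right")  #  6  9 12 NA NA

data.frame(a = 1:5, left = left, right = right)
```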
It looks like zoo::rollapply() handles what you're proposing - that could be a good indicator that others would find it useful, though I suspect that left padding zeros may be a good default.
I guess I've lived a sheltered life when it comes to rolling operations :)
Let me second the request for this feature. It would be especially useful for users of dplyr, where the functions used can only return either 1 value or n values. We can't use roll_abc() with dplyr because it returns vectors that are lacking the appropriate NAs. So, for now, I use zoo::rollapply as discussed above. rollapply is nice, and I think widely used in the R finance community, but I suspect that roll_abc would be faster. I think that several of the options/approaches of rollapply are worth looking at, especially width, fill and align.
I was also mostly hoping to use RcppRoll with dplyr.
(Sent by email in reply to "davidkane9", 06/07/2014 7:35 PM.)
Hi guys,
In the devel branch, I'm doing a big re-write. The 'main' exported functions now have the align and fill arguments, as from zoo::rollapply. You can try:
devtools::install_github("kevinushey/RcppRoll", ref = "devel")
library("RcppRoll")
roll_mean(1:5, 3L, fill = NA, align = "left")
to get a feel for it.
I will merge to master after I've considered a few more things:
- Re-introducing the by argument,
- Supporting partial, and
- Upgrading the rollit and rollit_raw functions to the new interface.
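To get a feel for what the three align values mean without installing the devel branch, here is a base-R sketch using stats::filter(). It assumes align follows the zoo convention (with "left" the window starts at the current index, so the NAs end up on the right), and it is not how RcppRoll computes anything.

```r
x <- 1:5
n <- 3L
w <- rep(1 / n, n)  # equal weights turn the convolution into a rolling mean

# sides = 1 uses the current and previous values: right alignment.
right <- as.numeric(stats::filter(x, w, sides = 1))

# sides = 2 centres the window on the current value: centre alignment.
center <- as.numeric(stats::filter(x, w, sides = 2))

# Reversing the input and the output gives left alignment.
left <- rev(as.numeric(stats::filter(rev(x), w, sides = 1)))

# Up to floating-point rounding:
#   right  is NA NA  2  3  4
#   center is NA  2  3  4 NA
#   left   is  2  3  4 NA NA
rbind(right, center, left)
```

stats::filter() is called with its namespace because dplyr exports its own filter().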
Not sure whether this difference to zoo is intended (I'm personally fine with it, as I use fill=NA a lot anyway):
x <- 2:5
rollapplyr(x, 2, mean)
[1] 2.5 3.5 4.5
roll_meanr(x, 2)
[1] NA 2.5 3.5 4.5
That is intended -- I found it strange that rollapplyr does not automatically set a fill (thereby making its behaviour identical to rollapply by default). I thought NA was the most sensible default here.
I've started by implementing a simple version of fill -- it can currently be a vector of length 1 or 3, specifying fills for the 'left padding', 'middle padding', and 'right padding' respectively.
Of course, there could be cases where someone wants to supply a vector to pad left with, e.g.
zoo::rollapplyr(1:5, 3, mean, fill = list(c(0, 1), NA, NA))
but I haven't implemented anything that general yet.
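One way to read the length-1 or length-3 fill is sketched below in base R. Everything here is an assumption for illustration: roll_mean_fill is a hypothetical name, the window is centred, and "middle padding" is taken to mean the positions skipped when by > 1. It is not the RcppRoll or zoo implementation.

```r
# Hypothetical sketch of a centred rolling mean with a length-1 or length-3 fill.
roll_mean_fill <- function(x, n, by = 1L, fill = NA) {
  stopifnot(length(fill) %in% c(1L, 3L), n <= length(x))
  fill <- rep_len(fill, 3L)  # a single value is reused for left, middle, right

  starts <- seq(1L, length(x) - n + 1L, by = by)  # first index of each window
  pos <- starts + (n - 1L) %/% 2L                 # where each result is placed
  idx <- seq_along(x)

  out <- rep(fill[2], length(x))   # middle fill everywhere to begin with
  out[idx < min(pos)] <- fill[1]   # left padding
  out[idx > max(pos)] <- fill[3]   # right padding
  out[pos] <- vapply(starts, function(s) mean(x[s:(s + n - 1L)]), numeric(1))
  out
}

# Windows start at 1, 3, 5, so results land on positions 2, 4, 6.
filled <- roll_mean_fill(1:7, 3L, by = 2L, fill = c(-1, 0, -2))
filled  # -1  2  0  4  0  6 -2
```

With by = 1 there are no skipped positions, so only the left and right fills are ever used.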
fill has now been implemented in a way that conforms with the behaviour of zoo's rollapply function, with the caveat that I still maintain the alternative default behaviour for the roll_r and roll_l functions.