sfirke/janitor

Feature Request: duplicated_including_first()

billdenney opened this issue · 2 comments

I wish there were a way for the base R function duplicated() to return TRUE for every duplicated value, not only for the occurrences after the first. Would there be interest in a PR for a very simple function, duplicated_including_first(), that does just that?

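# fromLast is captured as a named argument and ignored, so a caller-supplied
# value cannot reach duplicated() a second time through `...`.
duplicated_including_first <- function(x, ..., fromLast = NULL) {
  # Flag an element if it repeats an earlier one OR is repeated by a later one.
  duplicated(x, ..., fromLast = FALSE) | duplicated(x, ..., fromLast = TRUE)
}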

values <- rep(1, 3)
duplicated(values)
#> [1] FALSE  TRUE  TRUE
duplicated_including_first(values)
#> [1] TRUE TRUE TRUE

Created on 2022-05-09 by the reprex package (v2.0.1)
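
To illustrate the intended use a bit further, here is a minimal sketch that applies the same function to a column of a data frame and to the data frame as a whole. The data frame `df` and its columns `id` and `value` are made up for this example; only duplicated() and the function proposed above are used.

```r
# Proposed helper, repeated here so the example is self-contained.
duplicated_including_first <- function(x, ..., fromLast = NULL) {
  duplicated(x, ..., fromLast = FALSE) | duplicated(x, ..., fromLast = TRUE)
}

# Hypothetical example data: id 1 and id 2 each appear twice,
# but only the two id == 2 rows are identical across all columns.
df <- data.frame(
  id    = c(1, 2, 2, 3, 1),
  value = c("a", "b", "b", "c", "z")
)

# Flag every occurrence of a repeated id, including the first one.
dupe_ids <- duplicated_including_first(df$id)
dupe_ids
#> [1]  TRUE  TRUE  TRUE FALSE  TRUE

# Keep all rows whose id occurs more than once.
df[dupe_ids, ]

# duplicated() also has a data.frame method, so whole rows can be compared.
dupe_rows <- duplicated_including_first(df)
dupe_rows
#> [1] FALSE  TRUE  TRUE FALSE FALSE
```

Subsetting with the logical vector keeps the first occurrence alongside the later ones, which plain duplicated() would drop.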

I too find the behavior of duplicated() to be unfortunate and wish it returned all duplicated values.

I could go either way. I lean toward saying that, with that one-line function readily available as the first Google result and janitor already having get_dupes(), it's not worth adding another function to the janitor namespace. But if someone else would also find it useful beyond what get_dupes() offers, then maybe we add it.

It's not a strong need, so no need to add if there's hesitation.