What are your “favorite” R mistakes?

R is great, but it’s also weird.

R was built by and for statisticians, so it’s not like other programming languages. Its idiosyncrasies can be a source of deep frustration for beginners. But I’d argue there is no better tool for data analysis.

That’s why I’m writing a free ebook for O’Reilly, How To Make Mistakes In R. It’s modeled after the excellent How To Make Mistakes In Python, by Mike Pirnat.

The target audience is all R coders, from those just starting out to advanced developers. It’ll cover mistakes in set-up, style, and statistics, plus a few other surprises. I’m especially qualified to write this book because I’ve made so many R mistakes in my own work.

It’s an exciting project, but I need your help. What are your “favorite” R mistakes?

I’m looking for all types, ranging from the dead-simple, beginner-level screwups to the subtle, advanced bugs you’ve encountered. Here are a few examples of mistakes I plan to address:

  • Function masking due to conflicting packages (e.g., dplyr and plyr)
  • Repeatedly typing stringsAsFactors = FALSE
  • The table() function silently dropping NA values by default
  • Not using piping (%>%) to improve code readability
  • Not using the broom package to standardize the output of statistical models
  • Not using GitHub and/or RStudio Projects for collaboration
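
To give a flavor of what I mean, here is a minimal sketch of the table() item from the list above. The vector x is made up purely for illustration; the only thing I’m assuming is base R’s documented default of useNA = "no".

```r
# Hypothetical data: five observations, two of them missing
x <- c("a", "b", NA, "a", NA)

# By default, table() leaves NA out of the counts entirely,
# so the totals cover only the three non-missing values
counts_default <- table(x)
print(counts_default)

# Asking for NA explicitly gives the missing values their own column
counts_with_na <- table(x, useNA = "ifany")
print(counts_with_na)
```

Nothing errors and nothing warns, which is exactly why this one is so easy to miss.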

Send them to me via email, a GitHub pull request, or Twitter. The more the merrier. Feel free to contact me multiple times as you recall your “favorite” R mistakes.