Idea: dplyr as a case study in API design
gadenbuie opened this issue · 1 comments
Very cool that Hadley kept notes about the design of dplyr in the repo and we can now look back at some of the thinking that went into the dplyr design. The current version is like a 4K resolution picture of what was then a very blurry idea. It's amazing to see the thread of thinking as he worked through those early stages and the API took form.
For example, there's this very early design document: https://github.com/tidyverse/dplyr/blob/a3cebb2a06cb7c1f413422cb5cff5c6ccdddf54c/notes/syntax.md
Or this quote:
One of the key ideas of dplyr is that it shouldn't matter how your data is stored, whether its in an SQL database, in a csv file, in memory as a data frame or a data table, you should interact with it in the exactly the same way.
from https://github.com/tidyverse/dplyr/blob/9a1fe11a511caffbbebe2b92000484f845b0d5ca/notes/README.md in November 2012, which also features functions like summarise_by(): summarise_by(baseball, "id", g = mean(g)).
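To make the comparison concrete, here is a rough sketch of how that early summarise_by() call maps onto the group_by() and summarise() verbs in current dplyr. The small baseball table below is a made-up stand-in with only the id and g columns, not the real dataset, and the mapping is my reading of the notes rather than anything they state.

```r
library(dplyr)

# Toy stand-in for the baseball data (made-up values, for illustration only)
baseball <- tibble(
  id = c("a", "a", "b"),
  g  = c(10, 20, 30)
)

# Early design notes: summarise_by(baseball, "id", g = mean(g))
# The same idea in current dplyr: group by id, then take the mean of g per group
result <- baseball %>%
  group_by(id) %>%
  summarise(g = mean(g))

print(result)
```

The grouping variable went from a quoted string argument to its own verb, which is a nice example of how the API took form over time.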