gadenbuie/garrickadenbuie-com

Idea: dplyr as a case study in API design

gadenbuie opened this issue · 1 comments

Very cool that Hadley kept notes about the design of dplyr in the repo, so we can now look back at some of the thinking that went into it. The current version is like a 4K-resolution picture of what was then a very blurry idea. It's amazing to follow the thread of his thinking as he worked through those early stages and the API took form.

For example, this very, very early design document: https://github.com/tidyverse/dplyr/blob/a3cebb2a06cb7c1f413422cb5cff5c6ccdddf54c/notes/syntax.md

Like this quote:

"One of the key ideas of dplyr is that it shouldn't matter how your data is stored, whether its in an SQL database, in a csv file, in memory as a data frame or a data table, you should interact with it in the exactly the same way."

from https://github.com/tidyverse/dplyr/blob/9a1fe11a511caffbbebe2b92000484f845b0d5ca/notes/README.md, written in November 2012. The same notes also feature functions like summarise_by(), for example: summarise_by(baseball, "id", g = mean(g)).
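For comparison, here is a rough sketch of how that call would probably be written with today's dplyr, where grouping and summarising are separate verbs. I'm assuming the old summarise_by() call was meant to compute the mean of g for each id; the small baseball data frame below is a made-up stand-in so the snippet runs on its own, not the real dataset.

```r
library(dplyr)

# Toy stand-in for the baseball data (values invented for illustration).
baseball <- data.frame(
  id = c("a", "a", "b"),
  g  = c(10, 20, 30)
)

# 2012 notes (no longer a dplyr function):
#   summarise_by(baseball, "id", g = mean(g))
#
# Likely modern equivalent: declare the grouping first, then summarise.
by_id <- baseball %>%
  group_by(id) %>%
  summarise(g = mean(g))

print(by_id)
```

The grouping column is now a bare name passed to group_by() rather than a string argument, which is one of the places where the blurry early idea got sharper.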