tidymodels is a "meta-package" for modeling and statistical analysis that shares the underlying design philosophy, grammar, and data structures of the tidyverse.
It includes a core set of packages that are loaded on startup:
- broom takes the messy output of built-in functions in R, such as lm, nls, or t.test, and turns it into tidy data frames.
- dplyr contains a grammar for data manipulation.
- ggplot2 implements a grammar of graphics.
- infer is a modern approach to statistical inference.
- purrr is a functional programming toolkit.
- recipes is a general data preprocessor with a modern interface. It can create model matrices that incorporate feature engineering, imputation, and other helper tools.
- rsample has infrastructure for resampling data so that models can be assessed and empirically validated.
- tibble is a modern re-imagining of the data frame.
- yardstick contains tools for evaluating models (e.g. accuracy, RMSE).
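As a rough illustration of how a few of these packages fit together, the sketch below splits a data set with rsample, fits a base R linear model, tidies the coefficients with broom, and scores held-out predictions with yardstick. The data set (mtcars), the formula, the seed, and the 75/25 split are arbitrary choices made for this example, and the exact shape of the value returned by rmse() may differ between yardstick versions.

```r
library(rsample)
library(broom)
library(yardstick)

set.seed(123)  # arbitrary seed so the random split is reproducible

# rsample: hold out roughly 25% of the rows for assessment
split <- initial_split(mtcars, prop = 0.75)
train <- training(split)
test  <- testing(split)

# Fit a base R linear model on the training rows only
fit <- lm(mpg ~ wt + hp, data = train)

# broom: one row per model term instead of a printed summary
coefs <- tidy(fit)

# yardstick: compare held-out predictions against the observed values
test$.pred <- predict(fit, newdata = test)
test_rmse <- rmse(test, truth = mpg, estimate = .pred)
```

Because every step returns a data frame or a plain R object, the intermediate results can be passed straight to dplyr or ggplot2.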
There are a few modeling packages that are also installed along with tidymodels (but are not attached on startup):
- tidypredict translates some model prediction equations to SQL for high-performance computing.
- tidyposterior can be used to compare models using resampling and Bayesian analysis.
- tidytext contains tidy tools for quantitative text analysis, including basic text summarization, sentiment analysis, and text modeling.
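Since these packages are not attached automatically, each one has to be loaded explicitly before use. A minimal sketch for tidypredict follows, assuming the dbplyr package is also installed; the model and the simulated SQL Server connection are illustrative choices, not part of tidymodels itself.

```r
library(tidypredict)

# Any supported model works; a small linear model keeps the output short
model <- lm(mpg ~ wt + cyl, data = mtcars)

# The prediction equation as an R expression
fit_expr <- tidypredict_fit(model)

# The same equation rendered as SQL; a simulated connection
# avoids the need for a real database
sql_expr <- tidypredict_sql(model, dbplyr::simulate_mssql())
```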
To install:
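# devtools is only needed for the install call below, so install it if it is missing
if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")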
devtools::install_github("tidymodels/tidymodels")
When loading the package, the versions and conflicts are listed:
library(tidymodels)
## ── Attaching packages ───────────────────────────────── tidymodels 0.0.1 ──
## ✔ ggplot2 3.0.0 ✔ recipes 0.1.3.9000
## ✔ tibble 1.4.2 ✔ broom 0.5.0
## ✔ purrr 0.2.5 ✔ yardstick 0.0.1
## ✔ dplyr 0.7.6 ✔ infer 0.3.1
## ✔ rsample 0.0.2
## ── Conflicts ──────────────────────────────────── tidymodels_conflicts() ──
## ✖ rsample::fill() masks tidyr::fill()
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ✖ recipes::prepper() masks rsample::prepper()
## ✖ recipes::step() masks stats::step()
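A masked function is still available through its namespace. The sketch below reprints the conflict report with tidymodels_conflicts() and then calls both versions of filter() explicitly; the mtcars data and the moving-average weights are arbitrary examples.

```r
library(tidymodels)

# Print the conflict report again at any point in the session
tidymodels_conflicts()

# After attaching, a bare filter() resolves to dplyr::filter().
# Prefix the namespace to state which one is meant.
four_cyl <- dplyr::filter(mtcars, cyl == 4)    # keep rows where cyl is 4
smoothed <- stats::filter(1:10, rep(1 / 3, 3)) # 3-point moving average
```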