The recipes package is an alternative method for creating and
preprocessing design matrices that can be used for modeling or
visualization. From Wikipedia:
In statistics, a design matrix (also known as regressor matrix or model matrix) is a matrix of values of explanatory variables of a set of objects, often denoted by X. Each row represents an individual object, with the successive columns corresponding to the variables and their specific values for that object.
While R already has long-standing methods for creating these matrices
(e.g. formulas and model.matrix), there are some limitations to what the
existing infrastructure can do.
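As a point of reference, the long-standing base R approach can be sketched roughly as follows. The toy data frame `df` and its columns `y`, `x`, and `g` are made up for illustration and are not part of the package.

```r
# Hypothetical toy data: a numeric outcome, a numeric predictor, and a factor
df <- data.frame(
  y = c(1.2, 3.4, 2.2, 5.1),
  x = c(1, 2, 3, 4),
  g = factor(c("a", "b", "a", "c"))
)

# The formula declares the outcome and predictors; model.matrix() expands
# the right-hand side into a numeric design matrix, adding an intercept
# column and dummy-coding the factor g
X <- model.matrix(y ~ x + g, data = df)

colnames(X)
dim(X)
```

Each row of `X` corresponds to one row of `df`, and the factor is encoded with treatment contrasts by default, so its first level has no column of its own.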
The idea of the recipes package is to define a recipe or blueprint
that can be used to sequentially define the encodings and preprocessing
of the data (i.e. “feature engineering”). For example, to create a
simple recipe containing only an outcome and predictors and have the
predictors centered and scaled:
library(recipes)
library(mlbench)
data(Sonar)
sonar_rec <- recipe(Class ~ ., data = Sonar) %>%
  step_center(all_predictors()) %>%
  step_scale(all_predictors())
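A recipe on its own is only a specification. As a minimal sketch of how it might then be estimated and applied, the example below uses `prep()` and `bake()` from recipes; the object names `sonar_prepped` and `sonar_processed` are illustrative choices, not names taken from the package.

```r
library(recipes)
library(mlbench)
data(Sonar)

# Same recipe as above, repeated so this sketch runs on its own
sonar_rec <- recipe(Class ~ ., data = Sonar) %>%
  step_center(all_predictors()) %>%
  step_scale(all_predictors())

# Estimate the means and standard deviations from the training data
sonar_prepped <- prep(sonar_rec, training = Sonar)

# Apply the trained steps; new_data = NULL returns the processed training set
sonar_processed <- bake(sonar_prepped, new_data = NULL)

# Predictors should now be centered near 0 with unit standard deviation
summary(sonar_processed$V1)
```

To process a different data set with the statistics learned from the training data, pass that data frame as `new_data` instead of `NULL`.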
More information on recipes can be found at the Get Started page of
tidymodels.org.
To install this package, use:
install.packages("recipes")
## For the development version (requires the devtools package):
devtools::install_github("tidymodels/recipes")
This project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
- For questions and discussions about tidymodels packages, modeling, and machine learning, please post on RStudio Community.
- If you think you have encountered a bug, please submit an issue.
- Either way, learn how to create and share a reprex (a minimal, reproducible example) to clearly communicate about your code.
- Check out further details on contributing guidelines for tidymodels packages and how to get help.