Package website: release | dev
Efficient, object-oriented programming on the building blocks of machine learning. Successor of mlr.
- We started writing a book, which is intended to become the central entry point to the package.
- The mlr3gallery has some case studies and demonstrates how frequently occurring problems can be solved. It is still early days, so stay tuned for more to come.
- Reference manual
- FAQ
- Cheatsheets
- Videos:
- Courses/Lectures
- The course Introduction to Machine Learning (I2ML) is a free and open flipped classroom course on the basics of machine learning. mlr3 is used in the demos and exercises.
- Templates/Tutorials
- mlr3-learndrake: Shows how to use mlr3 with drake for reproducible ML workflow automation.
- List of extension packages
- mlr-outreach contains public talks and slides resources.
- Our blog about mlr and mlr3. (We are not the most frequent bloggers ;) )
- Wiki: Contains mainly information for developers.
Install the latest release from CRAN:
install.packages("mlr3")
Install the development version from GitHub:
remotes::install_github("mlr-org/mlr3")
library(mlr3)
# create learning task
task_iris <- TaskClassif$new(id = "iris", backend = iris, target = "Species")
task_iris
## <TaskClassif:iris> (150 x 5)
## * Target: Species
## * Properties: multiclass
## * Features (4):
## - dbl (4): Petal.Length, Petal.Width, Sepal.Length, Sepal.Width
# load learner and set hyperparameter
learner <- lrn("classif.rpart", cp = .01)
# train/test split
train_set <- sample(task_iris$nrow, 0.8 * task_iris$nrow)
test_set <- setdiff(seq_len(task_iris$nrow), train_set)
# train the model
learner$train(task_iris, row_ids = train_set)
# predict data
prediction <- learner$predict(task_iris, row_ids = test_set)
# calculate performance
prediction$confusion
## truth
## response setosa versicolor virginica
## setosa 11 0 0
## versicolor 0 12 1
## virginica 0 0 6
measure <- msr("classif.acc")
prediction$score(measure)
## classif.acc
## 0.9666667
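score() is not limited to a single measure. As a minimal sketch (assuming the standard classification error measure with key "classif.ce" shipped via mlr3measures), several measures can be scored at once with msrs():
# score several measures at once; "classif.ce" is the classification error
prediction$score(msrs(c("classif.acc", "classif.ce")))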
# automatic resampling
resampling <- rsmp("cv", folds = 3L)
rr <- resample(task_iris, learner, resampling)
rr$score(measure)
## task task_id learner learner_id
## 1: <TaskClassif[46]> iris <LearnerClassifRpart[33]> classif.rpart
## 2: <TaskClassif[46]> iris <LearnerClassifRpart[33]> classif.rpart
## 3: <TaskClassif[46]> iris <LearnerClassifRpart[33]> classif.rpart
## resampling resampling_id iteration prediction
## 1: <ResamplingCV[19]> cv 1 <PredictionClassif[19]>
## 2: <ResamplingCV[19]> cv 2 <PredictionClassif[19]>
## 3: <ResamplingCV[19]> cv 3 <PredictionClassif[19]>
## classif.acc
## 1: 0.92
## 2: 0.92
## 3: 0.94
rr$aggregate(measure)
## classif.acc
## 0.9266667
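Going one step further, multiple learners can be compared on the same task with benchmark(). The following is only a sketch, not part of the quickstart output above; it assumes the baseline learner "classif.featureless" that ships with mlr3:
# compare two learners with 3-fold cross-validation
design <- benchmark_grid(
  tasks = task_iris,
  learners = lrns(c("classif.rpart", "classif.featureless")),
  resamplings = rsmp("cv", folds = 3L)
)
bmr <- benchmark(design)
bmr$aggregate(measure)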
mlr was first released to CRAN in 2013, and its core design and architecture date back even further. The addition of many features has led to feature creep, which makes mlr hard to maintain and hard to extend. We also think that while mlr was nicely extensible in some parts (learners, measures, etc.), other parts were less easy to extend from the outside. Also, many helpful R libraries did not exist when mlr was created, and their inclusion now would result in non-trivial API changes.
- Only the basic building blocks for machine learning are implemented in this package.
- Focus on computation here. No visualization or other stuff. That can go in extra packages.
- Overcome the limitations of R’s S3 classes with the help of R6.
- Embrace R6 for a clean OO-design, object state-changes and reference semantics. This might be less “traditional R”, but seems to fit mlr nicely (see the short sketch below this list).
- Embrace data.table for fast and convenient data frame computations.
- Combine data.table and R6: we make heavy use of list columns in data.tables.
- Defensive programming and type safety. All user input is checked with checkmate. Return types are documented, and mechanisms popular in base R which “simplify” the result unpredictably (e.g., sapply() or the drop argument of [.data.frame) are avoided.
- Be light on dependencies. mlr3 requires the following packages at runtime:
  - future.apply: Resampling and benchmarking are parallelized with the future abstraction, interfacing many parallel backends.
  - backports: Ensures backward compatibility with older R releases. Developed by members of the mlr team. No recursive dependencies.
  - checkmate: Fast argument checks. Developed by members of the mlr team. No extra recursive dependencies.
  - mlr3misc: Miscellaneous functions used in multiple mlr3 extension packages. Developed by the mlr team. No extra recursive dependencies.
  - paradox: Descriptions for parameters and parameter sets. Developed by the mlr team. No extra recursive dependencies.
  - R6: Reference class objects. No recursive dependencies.
  - data.table: Extension of R’s data.frame. No recursive dependencies.
  - digest: Hash digests. No recursive dependencies.
  - uuid: Creates unique string identifiers. No recursive dependencies.
  - lgr: Logging facility. No extra recursive dependencies.
  - mlr3measures: Performance measures. No extra recursive dependencies.
  - mlbench: A collection of machine learning data sets. No dependencies.
- Reflections: Objects are queryable for properties and capabilities, allowing you to program on them (see the example below this list).
- Additional functionality that comes with extra dependencies:
Consult the wiki for short descriptions and links to the respective repositories.
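To illustrate the R6 design principle above: learners are mutable objects with reference semantics, so $train() modifies the learner in place and an explicit clone() is needed for an independent copy. A minimal sketch, using the built-in "iris" task from mlr_tasks:
library(mlr3)

learner <- lrn("classif.rpart")
learner$model                           # NULL, the learner is not trained yet
learner$train(tsk("iris"))              # trains in place, no re-assignment needed
learner$model                           # now holds the fitted rpart model

learner2 <- learner$clone(deep = TRUE)  # deep copy for an independent object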
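The reflections principle above means that objects carry queryable metadata about their capabilities. A short sketch of the kind of queries that are possible (the property names in the comments are examples, not an exhaustive list):
mlr_learners                 # dictionary of available learners
mlr_measures                 # dictionary of available measures

learner <- lrn("classif.rpart")
learner$properties           # e.g. "multiclass", "twoclass", ...
learner$feature_types        # feature types the learner can handle
learner$packages             # packages required for training/prediction

mlr_reflections$task_types   # central registry used for programming on objects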
This R package is licensed under the LGPL-3. If you encounter problems using this software (lack of documentation, misleading or wrong documentation, unexpected behaviour, bugs, …) or just want to suggest features, please open an issue in the issue tracker. Pull requests are welcome and will be included at the discretion of the maintainers.
Please consult the wiki for a style guide, a roxygen guide and a pull request guide.
If you use mlr3, please cite our JOSS article:
@Article{mlr3,
title = {{mlr3}: A modern object-oriented machine learning framework in {R}},
author = {Michel Lang and Martin Binder and Jakob Richter and Patrick Schratz and Florian Pfisterer and Stefan Coors and Quay Au and Giuseppe Casalicchio and Lars Kotthoff and Bernd Bischl},
journal = {Journal of Open Source Software},
year = {2019},
month = {dec},
doi = {10.21105/joss.01903},
url = {https://joss.theoj.org/papers/10.21105/joss.01903},
}