Example from R-bloggers not reproducible
MarkusBonsch opened this issue · 3 comments
Hi there,
Many thanks for your work and your blog post on R-Bloggers (https://www.r-bloggers.com/machine-learning-pipelines-for-r/).
However, I cannot reproduce the example. Could you provide a reproducible example of how to use pipeliner with modelr for cross-validation? In particular, the following parts produce errors: the dataset is missing from the call to pipeline, and cv_rmse is not defined. I also suspect that the pipe operator is incorrect:
library(tidyverse)
lm_pipeline %
pipeline(
transform_features(function(df) {
transmute(df, x1 = (waiting - mean(waiting)) / sd(waiting))
}), ...
cv_rmse %
mutate(model = map(train, ~ pipeline_func(as.data.frame(.x))),
predictions = map2(model, test, ~ predict(.x, as.data.frame(.y))),
residuals = map2(predictions, test, ~ .x - as.data.frame(.y)$eruptions),
rmse = map_dbl(residuals, ~ sqrt(mean(.x ^ 2)))) %>%
summarise(mean_rmse = mean(rmse), sd_rmse = sd(rmse))
See my session info below.
Thanks for your help.
Markus
R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 LC_MONETARY=German_Germany.1252
[4] LC_NUMERIC=C LC_TIME=German_Germany.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] modelr_0.1.0 dplyr_0.5.0 purrr_0.2.2 readr_1.1.0
[5] tidyr_0.6.1 tibble_1.2 ggplot2_2.2.1 tidyverse_1.1.1
[9] pipeliner_0.1.1.900 RevoUtilsMath_10.0.0 RevoUtils_10.0.2 RevoMods_10.0.0
[13] MicrosoftML_1.0.0 mrsdeploy_1.0 RevoScaleR_9.0.1 lattice_0.20-34
[17] rpart_4.1-10
loaded via a namespace (and not attached):
[1] reshape2_1.4.2 haven_1.0.0 colorspace_1.3-2 CompatibilityAPI_1.1.0
[5] foreign_0.8-67 withr_1.0.2 DBI_0.6 readxl_0.1.1
[9] foreach_1.4.3 plyr_1.8.4 stringr_1.2.0 munsell_0.4.3
[13] gtable_0.2.0 rvest_0.3.2 devtools_1.12.0 codetools_0.2-15
[17] psych_1.6.9 memoise_1.0.0 knitr_1.15.1 forcats_0.2.0
[21] mrupdate_1.0.0 parallel_3.3.2 curl_2.2 broom_0.4.1
[25] Rcpp_0.12.10 scales_0.4.1 jsonlite_1.1 mnormt_1.5-5
[29] hms_0.3 packrat_0.4.8-1 digest_0.6.12 stringi_1.1.3
[33] grid_3.3.2 tools_3.3.2 magrittr_1.5 lazyeval_0.2.0
[37] xml2_1.1.1 lubridate_1.6.0 assertthat_0.1 nxPacMan_1.2.1
[41] httr_1.2.1 iterators_1.0.8 R6_2.2.0 nlme_3.1-128
[45] git2r_0.18.0
Hi Markus,
Thanks for getting in touch. I'm on holiday at the moment with no computer for the next 10 days, so please bear with me.
Have you tried the example directly from this GitHub repo's README? Looking at what you've pasted above it doesn't match exactly and I'm wondering if some of the formatting has been lost in the transition to R-Bloggers, etc.
Thanks,
Alex
It turns out that a '<' followed by a '>' in WordPress's code block formatter comments out everything in between, so a '<-' followed by a '%>%' results in the code in between vanishing.
This has now been fixed on the site.
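For anyone who hits the same symptom, the effect can be imitated in a few lines of R. This is only a sketch: `strip_tag_like()` is a hypothetical stand-in for the formatter (not WordPress's actual code), `escape_html()` is a hypothetical helper showing one common workaround, and `some_data` and `some_folds` are placeholder names rather than the objects used in the post.

```r
# Hypothetical stand-in for the formatter: treat any "<...>" span on a
# line as a tag and delete it. An assumption for illustration only.
strip_tag_like <- function(line) {
  gsub("<[^>]*>", "", line)
}

# Placeholder right-hand sides; the real ones were lost in the post.
strip_tag_like("lm_pipeline <- some_data %>%")
# "lm_pipeline %"
strip_tag_like("cv_rmse <- some_folds %>%")
# "cv_rmse %"

# Hypothetical workaround: escape HTML special characters before pasting,
# so that nothing on the line looks like a tag any more.
escape_html <- function(line) {
  line <- gsub("&", "&amp;", line, fixed = TRUE)
  line <- gsub("<", "&lt;", line, fixed = TRUE)
  gsub(">", "&gt;", line, fixed = TRUE)
}

strip_tag_like(escape_html("lm_pipeline <- some_data %>%"))
# "lm_pipeline &lt;- some_data %&gt;%"
```

The stripped output matches the broken `lm_pipeline %` and `cv_rmse %` lines quoted in the report, which is consistent with the assignment and the start of the pipe having been swallowed together.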
Thanks for clarifying and fixing.