Any support for ft_dplyr_transformer/ft_sql_transformer?
kputschko opened this issue · 0 comments
kputschko commented
Hello,
I'm exploring Spark Pipelines and MLeap for the first time. I'm trying to export an MLeap bundle, following the documentation on the RStudio website. My pipelines use dplyr/SQL transformations before modeling. Am I correct in assuming that these stages are not supported in a pipeline that I plan to export to an MLeap bundle?
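library(sparklyr)
library(dplyr)
library(mleap)
# Assumes an existing Spark connection `sc`, e.g. sc <- spark_connect(master = "local")
mtcars_tbl <- sdf_copy_to(sc, mtcars, overwrite = TRUE)
# dplyr transformation that ft_dplyr_transformer() turns into a pipeline stage
new_mtcars <- mtcars_tbl %>% select(hp, wt, qsec, mpg)
pipeline <-
  ml_pipeline(sc) %>%
  ft_dplyr_transformer(new_mtcars) %>%
  ft_binarizer("hp", "big_hp", threshold = 100) %>%
  ft_vector_assembler(c("big_hp", "wt", "qsec"), "features") %>%
  ml_gbt_regressor(label_col = "mpg")
pipeline_model <- ml_fit(pipeline, mtcars_tbl)
transformed_tbl <- ml_transform(pipeline_model, mtcars_tbl)
# Export the fitted pipeline as an MLeap bundle; this is the call that fails
model_path <- file.path(tempdir(), "mtcars_model.zip")
ml_write_bundle(pipeline_model, mtcars_tbl, model_path, overwrite = TRUE)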
This gives the following error:
Error: java.util.NoSuchElementException: key not found: org.apache.spark.ml.feature.SQLTransformer
I'm using Spark 2.3, sparklyr 1.1, and mleap 0.12.