Azure/Azure-TDSP-Utilities

BinaryClassification RMD doesn't properly create factors

Pelonza opened this issue · 1 comments

The current code in BinaryClassification.rmd doesn't use correct R syntax to create factor columns.
This is a major problem for using the "auto" factor feature in the YAML files.

The culprit is line 118 in BinaryClassification.rmd.
Currently it reads:
if (!is.null(factorCols)) {for (i in 1:length(factorCols)) { trainDF[, factorCols[i]] <- make.names(as.factor(trainDF[, factorCols[i]])) }}

Change that line to:
if (!is.null(factorCols)) {for (i in seq_along(factorCols)) { trainDF[[factorCols[i]]] <- as.factor(trainDF[[factorCols[i]]]) }}

The key difference is that R doesn't know how to handle lists when converting to factors (it generates some sort of errors), and this avoids that entirely.
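For anyone checking this locally, here is a minimal sketch of the indexing involved. The data frame and column names below are made up for illustration and are not from the repo; only `trainDF` and `factorCols` mirror the names used in the rmd. It relies on two base R behaviours: `make.names()` returns a character vector, and `$` takes the name after it literally, whereas `[[` accepts a column name held in a variable.

```r
# Toy stand-ins for the template's objects (hypothetical data, not from the repo).
trainDF <- data.frame(
  color = c("red", "blue", "red"),
  size  = c("S", "M", "S"),
  value = c(1.5, 2.0, 3.5),
  stringsAsFactors = FALSE
)
factorCols <- c("color", "size")

# make.names() returns a character vector, so wrapping as.factor() in it
# yields character data rather than a factor.
wrapped <- make.names(as.factor(trainDF[, "color"]))
is.factor(wrapped)           # FALSE

# `$` uses the literal name after it, so this looks for a column called
# "factorCols", which does not exist in this data frame.
is.null(trainDF$factorCols)  # TRUE

# `[[` takes the column name from the variable and returns the column itself.
for (col in factorCols) {
  trainDF[[col]] <- as.factor(trainDF[[col]])
}

sapply(trainDF, is.factor)   # color and size are TRUE, value is FALSE
```

If the template later needs syntactically valid level names, one option (an assumption, not something the rmd is confirmed to do) would be to apply `make.names()` to `levels(trainDF[[col]])` after the conversion, so the column stays a factor.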


With this change (and the other YAML-file fix) I was able to run the BinaryClassification rmd file...

I have added this in the code as a comment. Thanks for pointing it out.