shokru/mlfactor.github.io

Possible typo in notdata.html

Toward the end of Chapter 1 (Notations and data), in the code that prepares the categorical labels:

data_ml <- data_ml %>% 
    group_by(date) %>%                                   # Group by date
    mutate(R1M_Usd_C = R1M_Usd > median(R1M_Usd),        # Create the categorical labels
           R12M_Usd_C = R1M_Usd > median(R12M_Usd)) %>%
    ungroup() %>%
    mutate_if(is.logical, as.factor)

Shouldn't it be the following instead?

data_ml <- data_ml %>% 
    group_by(date) %>%                                   # Group by date
    mutate(R1M_Usd_C = R1M_Usd > median(R1M_Usd),        # Create the categorical labels
           R12M_Usd_C = R12M_Usd > median(R12M_Usd)) %>%
    ungroup() %>%
    mutate_if(is.logical, as.factor)

(i.e. R12M_Usd_C = R12M_Usd instead of R12M_Usd_C = R1M_Usd)
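
To illustrate why this matters, here is a minimal sketch on made-up data. The data frame `toy` and the column names `R12M_typo` and `R12M_fixed` are hypothetical and only serve the illustration; the values are not from the book's dataset. It compares the label produced by the original line with the one produced by the proposed fix:

```r
library(dplyr)

# Hypothetical toy data: two dates, four stocks per date (values are made up)
toy <- data.frame(
  date     = rep(as.Date(c("2020-01-31", "2020-02-29")), each = 4),
  R1M_Usd  = c(0.01, 0.02, 0.03, 0.04, 0.04, 0.03, 0.02, 0.01),
  R12M_Usd = c(0.40, 0.30, 0.20, 0.10, 0.10, 0.20, 0.30, 0.40)
)

labels <- toy %>%
  group_by(date) %>%                                   # Group by date
  mutate(
    R12M_typo  = R1M_Usd  > median(R12M_Usd),          # Original line: 1M return vs. 12M median
    R12M_fixed = R12M_Usd > median(R12M_Usd)           # Proposed fix: 12M return vs. 12M median
  ) %>%
  ungroup()

# Share of TRUE labels under each version
mean(labels$R12M_typo)
mean(labels$R12M_fixed)
```

In this toy example every 1-month return lies below the 12-month median of its date, so the label from the original line is FALSE everywhere, whereas the corrected label splits each date evenly. On real data the distortion would likely be less extreme, but the label would still not be a median split of the 12-month return.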

Yes! You are completely right.
Luckily, I don't use this variable later on in the book.
I think I only use the 1M categorical version (for trees & neural networks).
I will fix this typo in the online version shortly.
Thank you for pointing this out.

Thank you!