Leukemia Data
szcf-weiya opened this issue · 3 comments
szcf-weiya commented
Paper: Golub, T. R., Slonim, D. K., Tamayo, P., Huard, C., Gaasenbeek, M., Mesirov, J. P., … Lander, E. S. (1999). Molecular Classification of Cancer: Class Discovery and Class Prediction by Gene Expression Monitoring. Science, 286(5439), 531–537. https://doi.org/10.1126/science.286.5439.531
Data: http://portals.broadinstitute.org/cgi-bin/cancer/publications/view/43
Applications in ESL: Section 18.4
szcf-weiya commented
R version
> lasso.path
Call: glmnet(x = t(train_X), y = train_y, family = "binomial", lambda = grid)
Df %Dev Lambda
[1,] 1 0.02309 0.3679000
[2,] 1 0.03381 0.3642000
[3,] 1 0.04433 0.3605000
[4,] 2 0.05532 0.3567000
[5,] 2 0.06634 0.3530000
...
[96,] 16 0.96520 0.0151900
[97,] 17 0.97370 0.0114700
[98,] 17 0.98230 0.0077610
[99,] 18 0.99070 0.0040480
[100,] 23 0.99920 0.0003355
> elnet.path
Call: glmnet(x = t(train_X), y = train_y, family = "binomial", alpha = 0.8, lambda = grid)
Df %Dev Lambda
[1,] 4 0.2187 0.3679000
[2,] 4 0.2269 0.3642000
[3,] 4 0.2350 0.3605000
[4,] 4 0.2432 0.3567000
[5,] 4 0.2512 0.3530000
...
[96,] 29 0.9700 0.0151900
[97,] 30 0.9773 0.0114700
[98,] 32 0.9846 0.0077610
[99,] 37 0.9919 0.0040480
[100,] 45 0.9993 0.0003355
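The thread does not show how `grid` was built. As a sketch only: the printed Lambda column is consistent with 100 equally spaced values running from exp(-1) down to exp(-8). That construction is an assumption inferred from the printout, not something the original code confirms. The check below uses base R only.

```r
# Assumption: `grid` is 100 values equally spaced from exp(-1) down to exp(-8).
# This is inferred from the printed Lambda column, not stated in the thread.
grid <- seq(exp(-1), exp(-8), length.out = 100)

# Lambda values as printed by R for rows 1-5 and 96-100 of the path.
printed <- c(0.3679, 0.3642, 0.3605, 0.3567, 0.3530,
             0.01519, 0.01147, 0.007761, 0.004048, 0.0003355)

# Compare at the 4 significant digits that the R printout shows.
reconstructed <- signif(grid[c(1:5, 96:100)], 4)
matches <- isTRUE(all.equal(reconstructed, printed))
print(matches)
```

If `matches` is `TRUE`, the assumed grid reproduces the printed values to the displayed precision; it does not prove that this is how the grid was generated.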
Julia version
julia> lasso_path
Logistic GLMNet Solution Path (100 solutions for 7129 predictors in 4994 passes):
─────────────────────────────────
df pct_dev λ
─────────────────────────────────
[1] 21 0.999225 0.000335463
[2] 18 0.990733 0.00404803
[3] 17 0.982281 0.00776059
[4] 17 0.973758 0.0114732
[5] 16 0.9652 0.0151857
...
[96] 2 0.0662522 0.353029
[97] 2 0.0552272 0.356742
[98] 1 0.0443347 0.360454
[99] 1 0.0338104 0.364167
[100] 1 0.0230916 0.367879
─────────────────────────────────
julia> elnet_path
Logistic GLMNet Solution Path (100 solutions for 7129 predictors in 4683 passes):
────────────────────────────────
df pct_dev λ
────────────────────────────────
[1] 46 0.99932 0.000335463
[2] 37 0.991922 0.00404803
[3] 32 0.984638 0.00776059
[4] 30 0.97734 0.0114732
[5] 29 0.970059 0.0151857
...
[96] 4 0.251197 0.353029
[97] 4 0.243146 0.356742
[98] 4 0.235044 0.360454
[99] 4 0.226891 0.364167
[100] 4 0.218684 0.367879
────────────────────────────────
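Not much difference: the two paths agree closely (note that the Julia output lists λ in increasing order, the reverse of R). In the rows shown, the df differs only at the smallest λ (23 vs. 21 for the lasso, 45 vs. 46 for the elastic net). This is not surprising, since the Julia version is just a wrapper around the Fortran code, and the R version can also be a wrapper around the Fortran code.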
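To put a rough number on how closely the two printouts agree, here is a minimal sketch that compares the deviance fractions by hand. It uses only the ten rows of each path shown above (the rest are elided), and the R values are already rounded by the print method, so the gap is only an approximate indication. The variable names are illustrative.

```r
# Deviance fractions copied from the printouts above, in R row order
# (rows 1-5 and 96-100). Julia lists lambda in increasing order, so its
# rows are reversed here to line up with R.
r_lasso  <- c(0.02309, 0.03381, 0.04433, 0.05532, 0.06634,
              0.96520, 0.97370, 0.98230, 0.99070, 0.99920)
jl_lasso <- c(0.0230916, 0.0338104, 0.0443347, 0.0552272, 0.0662522,
              0.9652, 0.973758, 0.982281, 0.990733, 0.999225)

r_elnet  <- c(0.2187, 0.2269, 0.2350, 0.2432, 0.2512,
              0.9700, 0.9773, 0.9846, 0.9919, 0.9993)
jl_elnet <- c(0.218684, 0.226891, 0.235044, 0.243146, 0.251197,
              0.970059, 0.97734, 0.984638, 0.991922, 0.99932)

# Largest absolute gap in deviance fraction for each model.
max_gap <- c(lasso = max(abs(r_lasso - jl_lasso)),
             elnet = max(abs(r_elnet - jl_elnet)))
print(max_gap)
```

Small gaps here are what one would expect if both packages call essentially the same solver and differ only in convergence details or print rounding.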