glmSolverjl

Julia solvers for GLMs, specializing in calculations on large data sets that do not necessarily fit into memory, minimizing the memory required to run the algorithms, and making use of multiple cores.

This library implements Generalized Linear Models in the Julia programming language. It aims to be a comprehensive library that can handle larger datasets on multicore machines by dividing the computations into blocks that can be carried out in memory (for speed) or on disk (to conserve memory). It offers a variety of solvers that give the user choice, flexibility and control. It also aims to be fully comprehensive in terms of post-processing, comparable in performance with the best open-source GLM solver libraries, and convenient and simple to install and use.

See the demos folder for examples of usage.
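
A minimal sketch of the blocked approach described above (plain Julia, not this library's API): one IRLS step for logistic regression in which the cross products X'WX and X'Wz are accumulated block by block, so only one block of rows needs to be held in memory at a time.

```julia
using LinearAlgebra

# One IRLS step for logistic regression, accumulating the cross products
# over row blocks so only a single block needs to be in memory at a time.
function irls_step_blocked(blocks, beta)
    p = length(beta)
    XtWX = zeros(p, p)
    XtWz = zeros(p)
    for (X, y) in blocks              # each block (X_i, y_i) fits in memory
        eta = X * beta                # linear predictor for this block
        mu  = 1 ./ (1 .+ exp.(-eta))  # inverse logit link
        w   = mu .* (1 .- mu)         # IRLS weights for the Binomial family
        z   = eta .+ (y .- mu) ./ w   # working response
        XtWX .+= X' * (w .* X)        # accumulate X'WX
        XtWz .+= X' * (w .* z)        # accumulate X'Wz
    end
    return XtWX \ XtWz                # updated coefficient vector
end

# Toy usage with two in-memory blocks of a 100 x 3 design matrix
X = randn(100, 3); y = rand(0:1, 100)
blocks = [(X[1:50, :], y[1:50]), (X[51:end, :], y[51:end])]
beta = zeros(3)
for _ in 1:10
    global beta = irls_step_blocked(blocks, beta)
end
println(beta)
```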

Prerequisites

  • OpenBLAS BLAS/LAPACK library
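
Julia's default binaries ship with OpenBLAS, so on a standard install no extra setup is usually needed; the following check (Julia 1.7 or later) shows which BLAS/LAPACK backend is loaded and how many BLAS threads are in use.

```julia
using LinearAlgebra

println(BLAS.get_config())        # lists the loaded BLAS/LAPACK libraries
println(BLAS.get_num_threads())   # number of BLAS threads in use
```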

Feature Development

The following is a list of items that need to be completed in order to finish the implementation of the GLM library, which is intended to be the replacement for my bigReg package.

Version 0.1 Core Functionality Implementation

  • 1. Do a prototype of the GLM and finish the implementation of the family and link functions, including weights and offsets. This is a small-scale implementation of the GLM; it is a prototype only, where various optimizations of small-scale regression can be tried out. Link Functions:

    • i. Identity
    • ii. Inverse
    • iii. Log
    • iv. NegativeBinomial
    • v. Power
    • vi. Logit
    • vii. Probit
    • viii. Cauchit
    • ix. OddsPower
    • x. LogComplement
    • xi. LogLog
    • xii. ComplementaryLogLog

    Distributions:

    • i. Binomial
    • ii. Gamma
    • iii. Poisson
    • iv. Gaussian
    • v. InverseGaussian
    • vi. NegativeBinomial
    • vii. Power
    • viii. Tweedie
    • ix. Ordinal Regression
  • 2. Solvers:

    • i. Matrix decomposition methods e.g. QR etc.
    • ii. Optimise QR speed by taking the upper-triangular R into account in the solve process. Instead, create various solver options using LAPACK linear-equation and least-squares solvers (see the sketch after this list). A very good reference is Mark Gates' website, which has good documentation of the routine lists; his lecture notes on dense linear algebra (parts 1 and 2) are well worth reading, as are the least-squares solver pages for LAPACK on Netlib. The details follow. Linear equation solvers for Ax = b (square A):
      • (a) gesv LU Decomposition Solver.
      • (b) posv Cholesky Decomposition Solver.
      • (c) sysv LDL Decomposition Solver.
      • (d) Could include gesvxx, posvxx, and sysvxx for more precise outputs and error bounds.
      There will be four options for least-squares solvers min ||b - Av||_2:
      • (a) gels Least squares solver using QR decomposition; requires a full-rank matrix A.
      • (b) gelsy Orthogonal Factorization Solver.
      • (c) gelss SVD Solver.
      • (d) gelsd SVD Solver, divide & conquer.
      Matrix inverse algorithms (A^-1) to include:
      • (a) getri LU Decomposition Inverse, getrf precursor.
      • (b) potri Cholesky Decomposition Inverse, potrf precursor.
      • (c) sytri LDL Decomposition Inverse, sytrf precursor.
      • (d) svds (my own name): use SVD to compute a generalized inverse.
      The names used are: GETRIInverse, POTRIInverse, SYTRFInverse, GESVDInverse, GESVSolver, POSVSolver, SYSVSolver, GELSSolver, GELSYSolver, GELSSSolver, GELSDSolver.
    • iii. Dispersion (phi) - Done. This is the factor by which the (X'WX)^-1 matrix is scaled to obtain the covariance matrix, i.e. Cov(beta) = phi * (X'WX)^-1. See page 110 of Wood's GAM book; note that the Binomial and Poisson distributions have phi = 1.
    • iv. Implement blocked matrix techniques, following Golub and Van Loan's Matrix Computations book.
      • 1. 1-D block representation of matrices.
      • 2. 2-D block representation of matrices.
    • v. Implement parallel solver algorithms.
      • 1. 1-D block representation of matrices.
      • 2. 2-D block representation of matrices.
    • vi. Include gradient-descent solvers (see the sketch after this list):
      • (a) Gradient Descent
      • (b) Momentum
      • (c) Nesterov
      • (d) Adagrad
      • (e) Adadelta
      • (f) RMSProp
      • (g) Adam
      • (h) AdaMax
      • (i) NAdam
      • (j) AMSGrad.
    • vii. Look for further performance optimizations; use packed format for symmetric matrix calls, which requires fewer computations and could make a difference for problems with a large number of parameters.
    • viii. Implement data synthesis functions for GLM. Use Chapter 5 of Hardin & Hilbe and use this for benchmarking, so that users can run demos without a data dependency.
    • ix. Data synthesis and library consolidation.
    • x. Implement L-BFGS solver options.
    • xi. Include a sparse solver?
    • xii. Exceeding the step-control limit should not result in failure, but in exiting with a non-convergence status and a printed message.
    • xiii. Code refactoring.
  • 3. Implement memory- and disk-blocked matrix structures and integrate them with the current algorithm, creating a generic interface that can work with any data structure that has the right methods returning the right types to the function(s).

  • 4. Implement or adapt the current GLM algorithm to work with the memory- and disk-based blocked matrix data structures - Done for the in-memory case.

  • 5. On disk representation of blocked data tables.

  • 6. Write and finalise the documentation.
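
For the LAPACK solver options in item 2.ii, here is a minimal sketch using routines from Julia's LinearAlgebra.LAPACK module; these calls overwrite their inputs, hence the copies. The wrapper names listed above (GESVSolver, POSVSolver, GELSSolver, ...) are the intended library names, while the code below is only an illustration, not the library's API.

```julia
using LinearAlgebra: LAPACK

X = randn(200, 5)
y = randn(200)

# Normal equations A*beta = b with A = X'X (square, symmetric positive definite)
A = X' * X
b = X' * y

beta_lu   = LAPACK.gesv!(copy(A), copy(b))[1]       # gesv: LU decomposition solver
beta_chol = LAPACK.posv!('U', copy(A), copy(b))[2]  # posv: Cholesky solver
beta_ldl  = LAPACK.sysv!('U', copy(A), copy(b))[1]  # sysv: LDL' solver

# Least squares min ||y - X*beta||_2 solved directly on X via QR (full-rank X)
beta_qr = LAPACK.gels!('N', copy(X), copy(y))[2]    # gels: QR least-squares solver

@assert beta_lu ≈ beta_chol ≈ beta_ldl ≈ beta_qr
```

And for the gradient-descent solvers in item 2.vi, a small sketch of the plain gradient-descent update for logistic regression; the other rules (Momentum, Adam, RMSProp, ...) change only how the step is scaled and accumulated.

```julia
# Gradient descent on the mean negative log-likelihood of logistic regression.
function logistic_gd(X, y; lr = 0.1, iters = 500)
    beta = zeros(size(X, 2))
    n = length(y)
    for _ in 1:iters
        mu   = 1 ./ (1 .+ exp.(-(X * beta)))  # predicted probabilities
        grad = X' * (mu .- y) ./ n            # gradient of the mean NLL
        beta .-= lr .* grad                   # plain gradient-descent step
    end
    return beta
end
```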

Version 0.2 Post-Processing & Model Search Implementation

  • 1. Create summary function complete with pretty printing for the model output.
  • 2. Create diagnostic plotting functions for the model outputs.
  • 3. Measures/tests such as the X^2 statistic, significance tests, AIC/BIC, R^2, and so on (see the sketch after this list).
  • 4. Model comparisons, T-tests, ANOVA and so forth. Refer to the model comparisons package in R for inspiration.
  • 5. Write an update() function for modifying the model, and write a step() function for model searches: forward, backward, and bidirectional.
  • 6. Write/finalize the documentation.
  • 7. Do you need to worry about which operating system this will be run on? Theoretically, if it is written in Julia, D and R, you may not need to worry about this.
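
As a reference point for the measures in item 3, the information criteria are simple functions of the maximized log-likelihood; a minimal sketch follows (illustrative function names, not the package's API), where k is the number of estimated parameters and n the number of observations.

```julia
aic(loglik, k)    = -2loglik + 2k
bic(loglik, k, n) = -2loglik + k * log(n)

# Example: log-likelihood of a fitted logistic regression with fitted means mu
loglik_binomial(y, mu) = sum(@. y * log(mu) + (1 - y) * log(1 - mu))

y  = [0, 1, 1, 0, 1]
mu = [0.2, 0.7, 0.9, 0.4, 0.6]
ll = loglik_binomial(y, mu)
println((aic = aic(ll, 3), bic = bic(ll, 3, length(y))))
```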

Version 1.0 Alpha

  • 1. Make sure that all the functionality works as designed and is properly tested and documented.

Version 1.0 Beta

  • 1. Release for testing and carry out bug fixing - GitHub only.

Version 1.0 Release Candidate

  • 1. Write a presentation about this package, start publicising it on the Active Analytics website, and give presentations about it. Make any further changes that need to be made.
  • 2. Attempt to release this on CRAN

Version 1.0

  • 1. By now everything should be baked in and should be stable. Release it and enjoy using it.

Version 1.1

  • 1. Add constraints to regression, using the generalized QR decomposition.
  • 2. Add L1 and L-Infinity error functions for regression.
  • 3. Include regression constraints, both for linear regression and the GLM, using LAPACK routines for generalized least squares (MKL); see Netlib also.
  • 4. Add a regularization feature (see the sketch after this list).
  • 5. Multiple Y variables? LAPACK allows this feature to be added to the framework.
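
For the regularization item (4), the most common starting point is an L2 (ridge) penalty, which only modifies the normal equations; a minimal sketch, assuming a design matrix X and response y (not the package's API).

```julia
using LinearAlgebra

# Ridge regression: beta = argmin ||y - X*beta||^2 + lambda*||beta||^2
#                        = (X'X + lambda*I) \ X'y
ridge(X, y, lambda) = (X' * X + lambda * I) \ (X' * y)

X = randn(100, 4)
y = randn(100)
println(ridge(X, y, 0.1))
```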

References

  1. Generalized Linear Models and Extensions, 3rd Edition, James W. Hardin, Joseph M. Hilbe.
  2. Routines for BLAS, LAPACK, MAGMA, Mark Gates, http://www.icl.utk.edu/~mgates3/docs/lapack.html.
  3. Matrix Computations, 4th Edition, Gene H. Golub, Charles F. Van Loan.
  4. Generalized Additive Models, An Introduction with R, 2nd Edition, Simon N. Wood.