Adapt the LIBLINEAR code inside the package
textmodel_svmlin() is based on https://vikas.sindhwani.org/svmlin.html, which is super fast but built on somewhat inflexibly structured code that is 15 years old. It offers a lot of possibilities, though, including semi-supervised classification, so I've kept it for tests. (This is the C++ code you adapted from the RSSL package.)
textmodel_svm() is based on https://www.csie.ntu.edu.tw/~cjlin/liblinear/, which was updated most recently last month. It is very flexible: it handles problems with k > 2 classes and offers an easy way to output probabilities in prediction (using a method similar to the one used for multinomial logistic regression). The code is currently taken from the LiblineaR package, although that package's copy of the LIBLINEAR C++ code tends to lag behind the current upstream version.
It would be nice to consider adapting this code to our package (or a new independent wrapper package) to do the following:
- keep the version of the LIBLINEAR C++ code up to date
- allow probability outputs for prediction (see "Q: Why you support probability outputs for logistic regression only?" in https://www.csie.ntu.edu.tw/~cjlin/liblinear/FAQ.html#training_and_prediction)
- base the sparse matrix class on Matrix rather than on the SparseM package currently used in LiblineaR.
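For the probability-output item, here is a minimal sketch of the idea as I read the FAQ entry: pass each one-vs-rest decision value through the logistic function 1/(1+exp(-d)), then renormalise when there are more than two classes. The function name `decision_to_probability` is hypothetical and not part of LIBLINEAR, and the resulting scores are not calibrated probabilities for an SVM solver.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a LIBLINEAR function): map one-vs-rest decision
// values to probability-like scores. For two classes, dec_values holds a
// single value for the first class; otherwise it holds one value per class.
std::vector<double> decision_to_probability(const std::vector<double>& dec_values,
                                            int nr_class) {
    assert(nr_class >= 2);
    const std::size_t nr_w = (nr_class == 2) ? 1 : static_cast<std::size_t>(nr_class);
    assert(dec_values.size() == nr_w);

    std::vector<double> prob(nr_class, 0.0);
    // Logistic transform of each decision value.
    for (std::size_t i = 0; i < nr_w; ++i)
        prob[i] = 1.0 / (1.0 + std::exp(-dec_values[i]));

    if (nr_class == 2) {
        // Binary case: the second class takes the complement.
        prob[1] = 1.0 - prob[0];
    } else {
        // Multi-class case: rescale so the scores sum to one.
        double sum = 0.0;
        for (double p : prob) sum += p;
        for (double& p : prob) p /= sum;
    }
    return prob;
}
```

In a wrapper, the decision values would come from the fitted model's per-class scores for one document; the class with the largest decision value keeps the largest score after the transform.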
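For the Matrix item, the main work would be turning a column-compressed matrix (the i, p and x slots of a dgCMatrix) into the row-wise arrays of (index, value) nodes that LIBLINEAR trains on, where indices are 1-based and each row ends with index -1. The sketch below assumes that layout; `FeatureNode` is a stand-in for LIBLINEAR's `feature_node` struct so the example is self-contained, and `csc_to_rows` is a hypothetical name. It ignores the optional bias term.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for LIBLINEAR's feature_node: 1-based feature index,
// with index == -1 marking the end of a row.
struct FeatureNode {
    int index;
    double value;
};

// Hypothetical converter: compressed sparse column input (0-based row
// indices i, column pointers p of length ncol + 1, values x) to one
// terminated node array per row.
std::vector<std::vector<FeatureNode>> csc_to_rows(const std::vector<int>& i,
                                                  const std::vector<int>& p,
                                                  const std::vector<double>& x,
                                                  int nrow, int ncol) {
    assert(p.size() == static_cast<std::size_t>(ncol) + 1);
    assert(i.size() == x.size());

    std::vector<std::vector<FeatureNode>> rows(nrow);
    // Visiting columns in order keeps each row's feature indices ascending.
    for (int col = 0; col < ncol; ++col) {
        for (int k = p[col]; k < p[col + 1]; ++k) {
            assert(i[k] >= 0 && i[k] < nrow);
            rows[i[k]].push_back({col + 1, x[k]});
        }
    }
    // Terminate every row, including empty ones.
    for (auto& row : rows) row.push_back({-1, 0.0});
    return rows;
}
```

A single pass over the columns is enough because the column order already gives sorted feature indices within each row, so no per-row sort is needed.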