This package implements in R the Classifier-Lasso method by
Su, L., Shi, Z., & Phillips, P. C. (2016): "Identifying latent structures in panel data", Econometrica, 84(6), 2215-2264.
This package is under active development...
The Classifier-Lasso code was originally developed in MATLAB, using CVX as the modeling language and MOSEK as the convex solver. Replicable empirical examples from the paper are available as well.
By default, the package uses the open-source solver ECOS via CVXR. The Disciplined Convex Programming (DCP) check is skipped to speed up the optimization.
To further speed up the computation, an R version that invokes MOSEK directly through Rmosek is described in "Implementing Convex Optimization in R: Two Econometric Examples", with demonstration code. In our experiments, this R+Rmosek implementation often solves the optimization problem in at most 1/3 of the time taken by the MATLAB+CVX+MOSEK implementation and at most 2/3 of the time taken by the CVXR+ECOS implementation without the DCP check.
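To see how these ratios hold on your own machine, one option is to time each solver call with base R's `system.time()`. The sketch below is a minimal illustration: `time_solver` and `slow_fit` are hypothetical names that are not part of classo, and the commented lines only suggest how `PLS.cvxr()` and `PLS.mosek()` might be plugged in once their inputs are defined.

```r
# Hypothetical helper (not part of classo): elapsed wall-clock seconds
# for a zero-argument function, measured with base R's system.time().
time_solver <- function(fit_fun) {
  unname(system.time(fit_fun())["elapsed"])
}

# Stand-in workload so the sketch runs without any solver installed.
slow_fit <- function() {
  Sys.sleep(0.05)
  invisible(NULL)
}
elapsed <- time_solver(slow_fit)

# With classo loaded and n, tt, y, x, lambda defined:
# t_cvxr  <- time_solver(function() PLS.cvxr(n, tt, y, x, K = 3, lambda = lambda))
# t_mosek <- time_solver(function() PLS.mosek(n, tt, y, x, K = 3, lambda = lambda))
# t_mosek / t_cvxr  # ratio of running times
```

A single run can be noisy, so repeating each call a few times and comparing medians is likely to give a more stable ratio.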
The current beta version can be installed from GitHub with:
library(devtools)
devtools::install_github("zhan-gao/classo", INSTALL_opts=c("--no-multiarch"))
library(classo)
Although Rmosek is not required for installation or use, it is highly recommended. In our experience, estimation with Rmosek is often much faster than with other solvers in R.
An installation gist for MOSEK can be found here.
The installation of the latest version, MOSEK 9.0, includes Rmosek. It can be invoked in R by following this instruction (tested successfully).
Alternatively, Rmosek can be installed from CRAN. The following lines were tested successfully on R 3.6.3:
install.packages("Rmosek")
library(Rmosek)
mosek_attachbuilder("path_to_the_bin_folder_of_MOSEK")
install.rmosek()
Please make sure Rmosek is successfully installed and activated before using the PLS.mosek() function for estimation.
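One way to guard against a missing Rmosek installation is to fall back to the default CVXR route when the package cannot be loaded. The sketch below assumes a hypothetical helper, `choose_solver`, that is not part of classo. It only checks that the Rmosek package loads in the current session, not that MOSEK itself is licensed and working.

```r
# Hypothetical helper (not part of classo): pick a solver label based on
# whether the Rmosek package can be loaded in this R session.
choose_solver <- function() {
  if (requireNamespace("Rmosek", quietly = TRUE)) "mosek" else "cvxr"
}

solver <- choose_solver()
message("Using solver: ", solver)

# With classo loaded and n, tt, y, x, lambda defined:
# pls_out <- if (solver == "mosek") {
#   PLS.mosek(n, tt, y, x, K = 3, lambda = lambda)
# } else {
#   PLS.cvxr(n, tt, y, x, K = 3, lambda = lambda)
# }
```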
The sample data is generated by DGP 1 described in Su, Shi and Phillips (2016) with N = 200 and T = 25.
data("sample_data")
# CAVEAT: Please convert data.frame to matrix to proceed.
y <- as.matrix(sample_data[, 1])
x <- as.matrix(sample_data[, -1])
n <- 200
tt <- 25
lambda <- as.numeric( 0.5 * var(y) / (tt^(1/3)) )
pls_out <- PLS.cvxr(n, tt, y, x, K = 3, lambda = lambda)
# Use Rmosek if it is successfully installed
# pls_out <- PLS.mosek(n, tt, y, x, K = 3, lambda = lambda)
# estimated slope for each group. True coefficients: [1,1; 0.4,1.6; 1.6,0.4]
pls_out$a.out
          [,1]      [,2]
[1,] 1.0387521 0.9986867
[2,] 0.4017041 1.6014119
[3,] 1.6197497 0.3614408
# Estimated group structure
# True group structure:
# group 2: 1 - 60
# group 1: 61 - 120
# group 3: 121 - 200
pls_out$group.est
[1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[33] 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 1 1
[65] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 3 1 1 1 1 1 1 1 1 1
[97] 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 3 3 3 3 3 3 3 3
[129] 3 3 3 3 3 1 3 1 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
[161] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
[193] 3 3 3 3 3 3 3 3
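To summarize how well the estimated groups match the true ones, the share of correctly classified units can be computed directly. The sketch below is an illustration only: `classification_accuracy` and `group_toy` are hypothetical names, not part of classo, and the comparison assumes the estimated labels already line up with the true labels, as they do in the output above. In general, group labels are arbitrary and may need to be matched first.

```r
# Hypothetical helper (not part of classo): share of units whose estimated
# group label equals the true label. Assumes both vectors use the same labels.
classification_accuracy <- function(group_est, group_true) {
  stopifnot(length(group_est) == length(group_true))
  mean(group_est == group_true)
}

# True group structure from the comments above:
# units 1-60 in group 2, 61-120 in group 1, 121-200 in group 3.
group_true <- c(rep(2, 60), rep(1, 60), rep(3, 80))

# Toy estimate for illustration only: the true labels with unit 5 relabeled.
group_toy <- group_true
group_toy[5] <- 1
classification_accuracy(group_toy, group_true)

# With the fitted object from above:
# classification_accuracy(pls_out$group.est, group_true)
# table(true = group_true, estimated = pls_out$group.est)
```

The commented `table()` call gives a confusion matrix, which also shows which groups the misclassified units were assigned to.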