This is a Python implementation of the cluster-based shrinkage correlation estimator proposed by Begusic and Kostanjcar in the paper "Cluster-Based Shrinkage of Correlation Matrices for Portfolio Optimization" (2019), which can be found here.
This estimator uses a clustering method to identify the underlying structure in the sample correlation matrix and then uses the structure to regularize the entries of the sample correlation estimator.
The purpose of this estimator is to regularize the estimation of the correlation matrix of a high-dimensional data set (i.e. when the number of columns and the number of rows of your data set are both large). There is a shrink factor, alpha, that balances the sample correlation matrix R (as computed by np.corrcoef or pd.DataFrame.corr) and a shrunk version of R.
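To make the idea concrete, here is a minimal sketch of shrinking a sample correlation matrix toward a cluster-averaged target. It is not the code in this repository and not necessarily the exact procedure from the paper. The function name cluster_shrinkage_sketch, the n_clusters parameter, the use of average-linkage hierarchical clustering from SciPy, and the convention that alpha = 0 returns the sample matrix while alpha = 1 returns the target are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def cluster_shrinkage_sketch(data, n_clusters=2, alpha=0.5):
    """Hypothetical helper: blend the sample correlation with a block-averaged target."""
    R = data.corr().to_numpy()
    n = R.shape[0]

    # Convert correlations to distances and cluster the columns hierarchically.
    dist = 1.0 - R
    dist = (dist + dist.T) / 2.0      # remove tiny asymmetries from rounding
    np.fill_diagonal(dist, 0.0)
    tree = linkage(squareform(dist, checks=False), method="average")
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")

    # Target matrix: every off-diagonal entry is replaced by the mean
    # correlation of the cluster block it belongs to; the diagonal stays 1.
    target = np.eye(n)
    off_diag = ~np.eye(n, dtype=bool)
    for a in np.unique(labels):
        for b in np.unique(labels):
            block = np.outer(labels == a, labels == b) & off_diag
            if block.any():
                target[block] = R[block].mean()

    # Assumed convention: alpha = 0 keeps R unchanged, alpha = 1 returns the target.
    shrunk = alpha * target + (1.0 - alpha) * R
    return pd.DataFrame(shrunk, index=data.columns, columns=data.columns)


# Example on synthetic data (random numbers, for illustration only).
rng = np.random.default_rng(0)
example = pd.DataFrame(rng.standard_normal((50, 6)), columns=list("abcdef"))
example_shrunk = cluster_shrinkage_sketch(example, n_clusters=2, alpha=0.5)
```

Averaging within cluster blocks reduces the number of free parameters in the estimate, which is what makes the shrunk matrix less noisy than the raw sample correlation when the data set has many columns.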
You can use this estimator on a pandas DataFrame:
# import the libraries
import pandas as pd
import clustering_shrinkage_estimator

# load your data matrix
data = pd.read_csv('./my dataframe')

# compute the shrunk estimator with an alpha of 1
r_clustering = clustering_shrinkage_estimator.get_shrinkage_est(data, alpha=1)
In the future, I'll try to turn this estimator into a package that can be installed with pip.