Computation material for: 'Modeling and forecasting large realized covariance matrices and portfolio choice.'
Link to the supplementary material
Date 29/12/2015
This repository contains the material used to compute the results in: 'Modeling and forecasting large realized covariance matrices and portfolio choice.' Due to the complexity of the computations involved and the size of the intermediate output, this repo does not contain the saved forecast results necessary to generate the output. All the code used in the paper, as well as the raw data, is included.
The data is contained in the data folder.
- Files starting with CRK are Rdata files containing realized covariance matrices for different subsets of the data at different levels of aggregation. They can be loaded in R with the command
  load('CRK_file_name')
  We have 1474 daily observations, 315 weekly observations, and 72 monthly observations. Each row contains a realized covariance matrix (465 columns for the 30 stocks of the Dow Jones, 496 for the Dow Jones augmented with the S&P 500 index). Rows are constructed by stacking the upper triangle of a realized covariance matrix, diagonal included, column by column, so that the entries are: var(stock1), cov(stock1,stock2), var(stock2), cov(stock1,stock3), cov(stock2,stock3), var(stock3), etc. The file names should be interpreted as follows:
  - File names ending with W or M refer to weekly or monthly aggregated data; all others (with har in the name) are daily data.
  - Files containing dj in the name indicate that the data is composed of the 30 stocks of the Dow Jones. Files containing fac are the Dow Jones augmented with the S&P 500, used as a common factor.
  - Files containing cens in the name refer to censored data (see paper); none is used for uncensored data.
  - The transformations applied to the data are referred to as lcov, lmat, or none; see the paper for details.
- mk_aggdata.Rnw and mk_aggdata.pdf contain the code used to aggregate the daily data to weekly and monthly data.
- dates and dj_crk_dates.txt are plain text files containing the calendar dates of the daily observations.
- sp_indus.csv is a plain text file containing information on the industry category of every stock in the S&P 500. dj-cat and dj_crk_names.txt are subsets of that file for the Dow Jones stocks. get_dj_indus.R is the script used to extract the Dow Jones subset.
- dj-ind is a plain text file containing the ticker and S&P index of the 30 stocks of the Dow Jones.
- data_convert.R is a script to convert the data to csv for the JAE data archive.
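The stacking order described for the CRK files can be undone with a few lines of R. The sketch below is only an illustration: unstack_cov is a hypothetical helper that is not part of this repository, and the toy vector stands in for one row of a loaded CRK object.

```r
# Hypothetical helper (not part of the repository): rebuild a symmetric
# covariance matrix from one stacked row of a CRK data set.
unstack_cov <- function(v) {
  # Solve n * (n + 1) / 2 = length(v) for the matrix dimension n.
  n <- (sqrt(8 * length(v) + 1) - 1) / 2
  stopifnot(abs(n - round(n)) < 1e-8)
  n <- round(n)
  m <- matrix(0, n, n)
  # upper.tri() indexes column by column, which matches the ordering
  # var(1), cov(1,2), var(2), cov(1,3), cov(2,3), var(3), ...
  m[upper.tri(m, diag = TRUE)] <- v
  # Mirror the upper triangle into the lower triangle.
  m[lower.tri(m)] <- t(m)[lower.tri(m)]
  m
}

# Toy example with 3 stocks (made-up numbers, 6 stacked entries).
v <- c(1.0, 0.2, 2.0, 0.3, 0.4, 3.0)
S <- unstack_cov(v)
```

With the actual data, a row of 465 entries would give a 30 x 30 matrix and a row of 496 entries a 31 x 31 matrix.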
Our computations can be summarized in 3 main steps.
- Computing the VAR forecasts. This step involves heavy computations, some of which were carried out on the Lisa cluster and some on the server of the VU's Econometrics and OR department. The R scripts for these computations are found in the VAR_forecasting folder.
- Computing summary statistics of the forecasts and generating tables and figures. The knitr (.Rnw) files used for this step can be found in the VAR_processing folder. Generating the output requires that the results from step 1 be included in the fc_data folder. The output files of the forecasting procedure are not included in this repo due to their large size.
- Computing the portfolio statistics. The files used for this step are found in the portfolio folder; the main file is portopt.m. The inputs for the portfolio selection procedure are .csv files containing the stacked forecasts of the covariance matrix and files containing the daily returns, found in the data folder. The output is saved in a .mat file. Note that these are Matlab files written by Marcelo C. Medeiros; they require the Optimization Toolbox.
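To give an idea of how a covariance forecast feeds into portfolio choice, here is a minimal R sketch of the textbook global minimum variance weights, w = S^{-1} 1 / (1' S^{-1} 1). It is not a translation of portopt.m, which is written in Matlab and may impose additional constraints; gmv_weights and the toy matrix are assumptions made for illustration only.

```r
# Hypothetical helper (not part of the repository): unconstrained global
# minimum variance weights for a given covariance matrix.
gmv_weights <- function(sigma) {
  ones <- rep(1, nrow(sigma))
  # solve(sigma, ones) computes sigma^{-1} %*% ones without forming the inverse.
  w <- solve(sigma, ones)
  # Normalize so that the weights sum to one.
  w / sum(w)
}

# Toy 2-asset covariance matrix (made-up numbers).
sigma <- matrix(c(0.04, 0.01,
                  0.01, 0.09), nrow = 2, byrow = TRUE)
w <- gmv_weights(sigma)

# Variance of the resulting portfolio.
port_var <- drop(t(w) %*% sigma %*% w)
```

In practice sigma would be a forecast rebuilt from one row of the stacked forecast files.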
Other material includes:
- subs: Sets of R functions used in the computations.
- DCC_and_EWMA: The code used to compute the DCC and EWMA models.
library('devtools')
install_github('lcallot/lassovar')
library('lassovar')
library('plyr')
library('parallel')
library('magrittr')
library('Matrix')
library('SparseM')
library('glmnet')
library('xtable')
library('expm')
library('doMC')
# for plotting
library('reshape2')
library('ggplot2')
# for the DCC
library('rmgarch')
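All packages except lassovar are available from CRAN (parallel ships with base R). lassovar is installed from GitHub with install_github, as shown in the list of packages.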
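Before running the scripts it can help to check which of these packages are still missing. The snippet below is a small sketch using only base R; the vector cran_pkgs simply repeats the CRAN packages listed above, and the install call is left commented out so nothing is downloaded unintentionally.

```r
# CRAN packages listed above (parallel is omitted because it ships with R,
# lassovar because it is installed from GitHub).
cran_pkgs <- c("devtools", "plyr", "magrittr", "Matrix", "SparseM",
               "glmnet", "xtable", "expm", "doMC",
               "reshape2", "ggplot2", "rmgarch")

# Packages from the list that are not yet installed on this machine.
missing_pkgs <- setdiff(cran_pkgs, rownames(installed.packages()))

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install from CRAN
}
```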