The stock-picker repository provides template code for implementing portfolio optimization with real return data from stocks in the S&P 500. Running the main.py script downloads stock market data and saves it in a format suitable for portfolio optimization. This repository uses an open-source modeling package (cvxpy) with open-source solvers; a commercial solver (e.g., Gurobi) can be used to improve performance.
This repository is the backbone of a small research project, and it should not be used for investment advice.
The main script generates the expected daily return vector and the daily return covariance matrix for a set of n random stocks from the S&P 500, computed over a run of consecutive days drawn from a defined date range. The range can be adjusted, and the number of consecutive days used to generate the data is a user-defined parameter.
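As a rough illustration of what these two quantities are, the sketch below estimates them from a table of closing prices with NumPy. The function name `return_statistics` and the sample prices are hypothetical and not taken from the repository, which may compute returns differently (for example, with log returns).

```python
import numpy as np


def return_statistics(prices):
    """Estimate mean daily returns and their covariance from closing prices.

    prices: array-like of shape (days, stocks), one column per stock.
    """
    prices = np.asarray(prices, dtype=float)
    # Simple daily return: (p_t - p_{t-1}) / p_{t-1}
    returns = prices[1:] / prices[:-1] - 1.0
    expected_return = returns.mean(axis=0)      # shape (stocks,)
    covariance = np.cov(returns, rowvar=False)  # shape (stocks, stocks)
    return expected_return, covariance


# Hypothetical closing prices: 4 consecutive days, 2 stocks.
prices = [[100.0, 50.0], [110.0, 50.0], [99.0, 55.0], [108.9, 55.0]]
mu, sigma = return_statistics(prices)
```

Four days of prices yield three daily returns per stock, so the covariance here is estimated from three observations; in practice the number of days should comfortably exceed the number of stocks for the estimate to be well conditioned.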
This code downloads stock data for n random stocks from the S&P 500 and converts it into a vector of expected returns and a matrix of return covariances. Users can set data_generation_cycles to generate multiple data points, each for a different selection of stocks over a fixed number of days in a predefined range of dates. The data is then input into a portfolio optimization model to build m portfolios of varying risk levels γ. The main.py script depends on several other .py files. Below, we summarize the functionality of each .py file; more details are provided in the files themselves.
- data_loader.py: Contains the DataLoader class, which loads the data from the dataset in a standard format.
- data_functions.py: Contains the functions that scrape new data.
- optimizer.py: Contains the PortfolioOptimizer, which builds and solves the portfolio optimization model.
- plotting.py: Contains functions for plotting the data generated by the optimizer.
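The exact model solved by PortfolioOptimizer is documented in optimizer.py rather than here. As a point of reference only, one common mean-variance formulation that fits the description above (expected returns, return covariances, and a risk level γ) is shown below. The objective form and the long-only constraint are assumptions, not a statement of what the repository implements.

```latex
\begin{aligned}
\max_{w \in \mathbb{R}^n} \quad & \mu^\top w - \gamma \, w^\top \Sigma w \\
\text{subject to} \quad & \mathbf{1}^\top w = 1, \\
& w \ge 0,
\end{aligned}
```

Here μ is the vector of expected daily returns, Σ is the daily return covariance matrix, w holds the portfolio weights of the n stocks, and γ ≥ 0 penalizes variance. Under this formulation, solving the problem for m different values of γ gives m portfolios that trade expected return against risk.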
The following are required to run the main.py script:
- Linux or Mac (should work on Windows with minor modifications)
- Python 3.8
This repository will create a file structure that branches from a directory called stock-picker-data. The file structure stores the return and covariance data that has been scraped so that experiments are reproducible. The directories will be structured as follows:
stock-picker
├──data
│ ├── average_return_*.csv
│ ├── covariance_return_*.csv
│ ├── average_return_test_*.csv
│ ├── covariance_return_test_*.csv
├──grid-search-data
│ ├── average_return_*.csv
│ ├── covariance_return_*.csv
│ ├── average_return_test_*.csv
│ ├── covariance_return_test_*.csv
├──plots
├── sample_data_summary.csv
├── data_functions.py
├── data_loader.py
├── main.py
└── optimizer.py
There are three subdirectories:
- plots: Stores any plots related to the portfolios generated by the optimization method.
- grid-search-data: Stores the data that is used to choose model parameters.
- data: Stores the data that is used to validate and test new portfolio optimization methods.
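Since problems tend to come from data sitting in an unexpected directory, a quick check of the layout can help. The sketch below uses only the standard library. The helper names `missing_subdirectories` and `find_return_files` are hypothetical and not part of the repository; the directory and file-name patterns are taken from the layout above.

```python
from pathlib import Path

# Subdirectory names taken from the layout described above.
SUBDIRECTORIES = ("data", "grid-search-data", "plots")


def missing_subdirectories(root):
    """Return the expected subdirectories that do not exist under root."""
    root = Path(root)
    return [name for name in SUBDIRECTORIES if not (root / name).is_dir()]


def find_return_files(root, subdirectory="data"):
    """List saved average-return CSVs in sorted order, excluding test files."""
    folder = Path(root) / subdirectory
    # The pattern also matches average_return_test_*.csv, so filter those out.
    return sorted(
        path
        for path in folder.glob("average_return_*.csv")
        if "_test_" not in path.name
    )
```

Calling `missing_subdirectories(".")` from the repository root returns an empty list once all three subdirectories are in place.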
- Make a virtual environment and activate it:
    virtualenv -p python3 stock-picker-venv
    source stock-picker-venv/bin/activate
- Clone this repository, navigate to its directory, and install the requirements:
    git clone https://github.com/ababier/stock-picker
    cd stock-picker
    pip3 install -r requirements.txt
Running the code should be straightforward. Any errors are likely the result of data being in an unexpected directory. If the code is running correctly, the progress of the optimizer is printed to the command line.
Run the main file in your newly created virtual environment.
python3 main.py