Deep Reinforcement Learning for Automated Stock Trading: An Ensemble Strategy

This repository provides the code for the ICAIF 2020 paper.

This ensemble strategy is reimplemented in a Jupyter Notebook at FinRL.

Abstract

Stock trading strategies play a critical role in investment. However, it is challenging to design a profitable strategy in a complex and dynamic stock market. In this paper, we propose a deep ensemble reinforcement learning scheme that automatically learns a stock trading strategy by maximizing investment return. We train a deep reinforcement learning agent and obtain an ensemble trading strategy using the three actor-critic based algorithms: Proximal Policy Optimization (PPO), Advantage Actor Critic (A2C), and Deep Deterministic Policy Gradient (DDPG). The ensemble strategy inherits and integrates the best features of the three algorithms, thereby robustly adjusting to different market conditions. In order to avoid the large memory consumption in training networks with continuous action space, we employ a load-on-demand approach for processing very large data. We test our algorithms on the 30 Dow Jones stocks which have adequate liquidity. The performance of the trading agent with different reinforcement learning algorithms is evaluated and compared with both the Dow Jones Industrial Average index and the traditional min-variance portfolio allocation strategy. The proposed deep ensemble scheme is shown to outperform the three individual algorithms and the two baselines in terms of the risk-adjusted return measured by the Sharpe ratio.

Reference

Hongyang Yang, Xiao-Yang Liu, Shan Zhong, and Anwar Walid. 2020. Deep Reinforcement Learning for Automated Stock Trading: An Ensemble Strategy. In ICAIF ’20: ACM International Conference on AI in Finance, Oct. 15–16, 2020, Manhattan, NY. ACM, New York, NY, USA.

Our Medium Blog

Installation:

git clone https://github.com/AI4Finance-LLC/Deep-Reinforcement-Learning-for-Automated-Stock-Trading-Ensemble-Strategy-ICAIF-2020.git

Prerequisites

For OpenAI Baselines, you'll need the system packages CMake, OpenMPI and zlib. They can be installed as follows:

Ubuntu

sudo apt-get update && sudo apt-get install cmake libopenmpi-dev python3-dev zlib1g-dev libgl1-mesa-glx

Mac OS X

Installation of system packages on Mac requires Homebrew. With Homebrew installed, run the following:

brew install cmake openmpi

Windows 10

To install stable-baselines on Windows, please look at the documentation.

Create and Activate Virtual Environment (Optional but highly recommended)

cd into this repository

cd Deep-Reinforcement-Learning-for-Automated-Stock-Trading-Ensemble-Strategy-ICAIF-2020

Under folder /Deep-Reinforcement-Learning-for-Automated-Stock-Trading-Ensemble-Strategy-ICAIF-2020, create a virtual environment

pip install virtualenv

A virtualenv is essentially a folder that holds a copy of the Python executable and all installed Python packages.

Virtualenvs also help avoid package conflicts.

Create a virtualenv venv under folder /Deep-Reinforcement-Learning-for-Automated-Stock-Trading-Ensemble-Strategy-ICAIF-2020

virtualenv -p python3 venv

To activate a virtualenv:

source venv/bin/activate

Dependencies

The script has been tested under Python >= 3.6.0 with the following packages installed:

pip install -r requirements.txt

Questions

About Tensorflow 2.0: hill-a/stable-baselines#366

TensorFlow 2.0 is not currently supported (see the issue above). If you run into TensorFlow problems, you can install a 1.x release instead:

pip install tensorflow==1.15.4

If you have questions regarding the Stable Baselines package, please refer to the Stable Baselines installation guide. Install the package using pip:

pip install stable-baselines[mpi]

This includes an optional dependency on MPI, enabling algorithms DDPG, GAIL, PPO1 and TRPO. If you do not need these algorithms, you can install without MPI:

pip install stable-baselines

Please read the documentation for more details and alternatives (from source, using docker).

Run DRL Ensemble Strategy

python run_DRL.py

Backtesting

Use Quantopian's pyfolio package to do the backtesting.

Backtesting script

Status

Version History

1.0.1 Changes: added ensemble strategy
0.0.1 Simple version

Data

The stock data we use is pulled from the Compustat database via Wharton Research Data Services.

Ensemble Strategy

Our goal is to create a highly robust trading strategy, so we use an ensemble method that automatically selects the best-performing agent among PPO, A2C, and DDPG to trade, based on the Sharpe ratio. The ensemble process is as follows:

Step 1. We use a growing window of 𝑛 months to retrain our three agents concurrently. In this paper, we retrain the three agents every 3 months.
Step 2. We validate all three agents using a 12-month rolling validation window that follows the growing window used for training, and pick the best-performing agent, i.e. the one with the highest Sharpe ratio. We also adjust risk aversion using the turbulence index in the validation stage.
Step 3. After validation, we use only the best model, the one with the highest Sharpe ratio, to predict and trade for the next quarter.
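
The selection in Steps 2 and 3 can be illustrated with a minimal sketch. This is not the code in run_DRL.py; the function names, the zero risk-free rate, the 252-day annualization factor, and the sample returns are all assumptions made for illustration. It only shows the idea of scoring each agent's validation returns by Sharpe ratio and keeping the highest scorer.

```python
from statistics import mean, stdev


def sharpe_ratio(returns, periods_per_year=252):
    """Annualized Sharpe ratio of periodic returns.

    Assumes a zero risk-free rate. Returns -inf when the ratio is
    undefined (fewer than two observations or zero volatility), so such
    an agent is never selected. This fallback is an illustrative choice.
    """
    if len(returns) < 2:
        return float("-inf")
    volatility = stdev(returns)
    if volatility == 0:
        return float("-inf")
    return (periods_per_year ** 0.5) * mean(returns) / volatility


def select_best_agent(validation_returns):
    """Return (best_name, scores) for the agent with the highest Sharpe ratio."""
    scores = {name: sharpe_ratio(r) for name, r in validation_returns.items()}
    best_name = max(scores, key=scores.get)
    return best_name, scores


# Hypothetical daily returns on the validation window, for illustration only.
validation_returns = {
    "PPO": [0.010, -0.020, 0.015, 0.005],
    "A2C": [0.004, 0.006, 0.005, 0.005],
    "DDPG": [-0.010, 0.000, 0.010, -0.005],
}

best_agent, scores = select_best_agent(validation_returns)
print(best_agent)  # the agent that would trade in the next quarter
```

With these made-up numbers, A2C is selected: its mean return is not the largest single-day gain, but its returns are far less volatile, which is what the Sharpe ratio rewards.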
