High-frequency trading (HFT) is a complex financial task that can be treated as a near-real-time sequential decision problem. Traditional approaches typically forecast equity trends first and then optimize portfolio weights via combinatorial optimization. However, these methods are computationally inefficient and, because they rely on discrete action spaces, struggle to handle a large number of assets.
An efficient policy optimization method based on deep reinforcement learning (DRL), called DRPO, has been proposed to address these issues. This method models portfolio management as a Markov Decision Process and directly infers equity weights to maximize accumulated returns. The environment is separated into "static" market states and "dynamic" portfolio weight states, which simplifies agent-environment interactions without losing interpretability. A reward expectation calculation algorithm based on probabilistic dynamic programming lets agents collect feedback without complex trajectory sampling.
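To make the Markov Decision Process framing concrete, here is a minimal sketch of a single transition. It is not the paper's algorithm and does not implement the reward expectation calculation; it only illustrates how a "static" market state (price relatives the agent cannot influence) and a "dynamic" weight state (the previous allocation, which the agent changes) can combine into a reward. The function names, the softmax mapping, and the proportional transaction cost are all assumptions made for illustration, and the code uses only the standard library to stay dependency-free.

```python
import math


def softmax(scores):
    """Map arbitrary real scores to non-negative weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def step(prev_weights, scores, price_relatives, cost_rate=0.001):
    """One hypothetical MDP transition.

    prev_weights    -- dynamic state: allocation held before acting
    scores          -- raw policy outputs, one per asset
    price_relatives -- static state: next price / current price per asset
    cost_rate       -- assumed proportional transaction cost (illustrative)

    Returns (new_weights, reward), where reward is the log net return.
    """
    weights = softmax(scores)  # continuous action: portfolio weights
    turnover = sum(abs(w - p) for w, p in zip(weights, prev_weights))
    gross = sum(w * r for w, r in zip(weights, price_relatives))
    net = gross * (1.0 - cost_rate * turnover)
    return weights, math.log(net)


# Equal scores keep an equal-weight portfolio, so no turnover cost is paid.
prev = [1 / 3, 1 / 3, 1 / 3]
new_weights, reward = step(prev, [0.0, 0.0, 0.0], [1.01, 0.99, 1.00])
```

Because the action is a full weight vector rather than a discrete buy/sell choice per asset, the size of the action does not grow combinatorially with the number of assets.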
- Han, L., Ding, N., Wang, G., Cheng, D., & Liang, Y. (2023, August). Efficient Continuous Space Policy Optimization for High-frequency Trading. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (pp. 4112-4122).
- torch
- numpy<2
- pandas
- python-dateutil
- yfinance
- tabulate
- wandb (optional, for using validation data during training)
Set up a Python virtual environment and install the dependencies:
`pip install -r requirements.txt`
- Ensure your computer has an NVIDIA GPU to train your model.
- You can modify `train_config.json` to customize your training configuration.
- Run `python train.py` to start the training process.
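The GPU requirement in the steps above can be checked before launching training. The following is a minimal sketch that relies only on the public `torch.cuda` API; the helper name `describe_device` is hypothetical and is not part of this repository.

```python
import importlib.util


def describe_device():
    """Return a short description of the compute device PyTorch would see."""
    # Avoid a hard failure if torch has not been installed yet.
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch

    if torch.cuda.is_available():
        return "cuda: " + torch.cuda.get_device_name(0)
    return "cpu only (no CUDA device visible to PyTorch)"


print(describe_device())
```

If the output does not start with `cuda:`, check the NVIDIA driver and the installed PyTorch build before running the training script.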
- Make sure you have at least one pre-trained model file in the `model` folder (I have provided mine).
- You can modify `trade_config.json` to customize your trading configuration.
- Run `python main.py` to simulate real-time trading.
Market | Num. of stocks | Train | Validation | Test | Features |
---|---|---|---|---|---|
DJIA 30 | 30 | 2001-2021 | 2022 | 2023 | open, close, high, low prices, true range ratio |
- OCHL prices are normalized by dividing them by the previous day's closing price and then taking the logarithm.
- "DOW," "CRM," and "V" have been replaced by "XOM," "PFE," and "RTX" due to insufficient data.
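The price normalization described above (divide each price by the previous day's close, then take the logarithm) can be sketched as follows. This is an illustration, not the repository's preprocessing code: the function name `normalize_ochl` and the list-of-dicts input layout are assumptions, and the true range ratio feature is left out because its exact definition is not given here.

```python
import math

PRICE_FIELDS = ("open", "close", "high", "low")


def normalize_ochl(rows):
    """Normalize daily OCHL prices by the previous day's close.

    rows -- list of dicts with keys open, close, high, low, in date order.
    Returns one dict per day starting from the second day, since the first
    day has no previous close to divide by.
    """
    normalized = []
    for prev, cur in zip(rows, rows[1:]):
        ref = prev["close"]
        normalized.append({k: math.log(cur[k] / ref) for k in PRICE_FIELDS})
    return normalized


# Illustrative prices only, not real market data.
sample = [
    {"open": 100.0, "close": 100.0, "high": 101.0, "low": 99.0},
    {"open": 100.0, "close": 110.0, "high": 111.0, "low": 99.0},
]
features = normalize_ochl(sample)
```

With this transform a value of 0 means "unchanged from yesterday's close", which puts stocks with very different price levels on a comparable scale.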
Start Date | End Date | CAGR | Sharpe Ratio | Maximum Drawdown | Calmar Ratio |
---|---|---|---|---|---|
2023-01-03 | 2023-12-29 | 1.204 | 1.223 | 0.154 | 1.304 |
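For readers who want to compute the same kinds of metrics on their own backtest, here is a minimal sketch using common textbook definitions. It makes several assumptions that are not stated in this README: 252 trading periods per year, a zero risk-free rate, CAGR expressed as a rate rather than a growth factor, and drawdown measured on a compounded equity curve. The function name `performance` is hypothetical, and the repository's own conventions may differ, so the numbers this sketch produces may not match the table.

```python
import math
import statistics

TRADING_DAYS = 252  # assumed annualization factor


def performance(returns, periods_per_year=TRADING_DAYS):
    """Compute CAGR, Sharpe ratio, maximum drawdown, and Calmar ratio.

    returns -- simple per-period returns, e.g. 0.01 for +1%.
    """
    # Compound the returns into an equity curve starting at 1.0.
    equity = [1.0]
    for r in returns:
        equity.append(equity[-1] * (1.0 + r))

    years = len(returns) / periods_per_year
    cagr = equity[-1] ** (1.0 / years) - 1.0

    # Sharpe ratio with an assumed risk-free rate of zero.
    std = statistics.stdev(returns)
    mean = statistics.fmean(returns)
    sharpe = mean / std * math.sqrt(periods_per_year) if std > 0 else float("nan")

    # Maximum drawdown: largest peak-to-trough loss as a fraction of the peak.
    peak, max_dd = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        max_dd = max(max_dd, 1.0 - value / peak)

    calmar = cagr / max_dd if max_dd > 0 else float("inf")
    return {"cagr": cagr, "sharpe": sharpe,
            "max_drawdown": max_dd, "calmar": calmar}


# Toy example: three periods treated as one "year" for easy checking.
metrics = performance([0.10, -0.50, 0.10], periods_per_year=3)
```

In the toy example the equity curve peaks at 1.1 and falls to 0.55, so the maximum drawdown is 0.5.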