This demo shows how to build a practical strategy
(pairs trading) from scratch within the QTrader framework.
It is recommended to use conda to manage your environment.
The following script creates a virtual environment named
`demo_strategy` and installs the relevant packages:
> chmod +x ./preparations.sh
> ./preparations.sh
Or you can manually input the following commands:
> conda create -n demo_strategy python=3.8
> conda activate demo_strategy
> pip install --force-reinstall git+https://github.com/josephchenhk/qtrader@master
> pip install dill finta termcolor pyyaml func_timeout scipy statsmodels hyperopt jupyter seaborn
Refer to the notebook `EDA.ipynb` for exploratory data analysis. After
running the notebook, you should have a data folder named
`clean_data` in your current working directory. We will use this
dataset to backtest our model.
Compared with traditional assets, cryptocurrencies are more inclined to co-move because they share common market sentiment. Therefore, they could be good candidates for a pairs trading strategy, which exploits mean reversion in the relative prices of two securities.
There are two key components of a pairs trading strategy:
- Determine the candidate pairs
- Determine the entry and exit rules

The following paragraphs explain these in detail:
Suppose S1 and S2 are the prices of two securities. In the
training window, the linear regression of the logarithmic
prices gives:
where the slope
A two-step method is used to find the candidate
pairs. In step 1, the correlation of the logarithmic prices
over the given lookback window is calculated,
and only pairs with a correlation above the threshold
proceed to step 2. In step 2, we
apply an Augmented Dickey-Fuller (ADF) test to the pairs
shortlisted in step 1, and only those with a p-value smaller
than the predetermined threshold are added to the
final candidate pool. In the ADF test, linear regression
is carried out and regression coefficients
The main assumptions of this strategy can be summarized as:
- The mean-reversion behavior observed in the training period will continue to exist in the testing period, and the spread will revert to its historical mean.
- Once a candidate pair is determined by the two-step method, it is valid throughout the next testing period, and the hedge ratio will also remain unchanged.

Note that in reality, there is no guarantee for any of the assumptions above. Violation of the assumptions could lead to failures of the strategy.
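
The two-step selection described above can be sketched in plain Python as follows. This is a minimal illustration, not the QTrader implementation: the function names are hypothetical, the stationarity test is passed in as a callable (`adf_pvalue`), and applying it to the residuals of the log-price regression is an assumption.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ols_residuals(y, x):
    """Regress y on x with an intercept; return (slope, residuals)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    return slope, [b - (slope * a + intercept) for a, b in zip(x, y)]


def select_pairs(prices, corr_threshold, pvalue_threshold, adf_pvalue):
    """Return {(s1, s2): hedge_ratio} for pairs that pass both steps.

    prices: dict mapping symbol -> list of positive prices in the lookback window.
    adf_pvalue: callable taking a list of residuals and returning a p-value
                (assumption: the test is applied to the regression residuals).
    """
    logs = {sym: [math.log(p) for p in series] for sym, series in prices.items()}
    selected = {}
    for s1, s2 in combinations(sorted(logs), 2):
        # Step 1: keep only pairs whose log prices are highly correlated.
        if pearson(logs[s1], logs[s2]) <= corr_threshold:
            continue
        # Step 2: regress log(S1) on log(S2) and test the residuals.
        hedge_ratio, residuals = ols_residuals(logs[s1], logs[s2])
        if adf_pvalue(residuals) < pvalue_threshold:
            selected[(s1, s2)] = hedge_ratio
    return selected
```

With `statsmodels` installed, one possible choice for the callable is `lambda r: statsmodels.tsa.stattools.adfuller(r)[1]`, since the second element of the tuple returned by `adfuller` is the p-value.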
Once we have determined the candidate pairs and their
corresponding parameter
To control the risk, we also need to apply the stop
loss rule to the strategy: if we are long security 1, and
short security 2, when
Besides the z-score condition, there are
also other entry conditions: when a pair is on,
the hedge ratio
As discussed in the EDA, the trading universe consists of six cryptocurrency
pairs: `BTC.USD`, `EOS.USD`, `ETH.USD`, `LTC.USD`, `TRX.USD`, and `XRP.USD`.
The OHLCV data at different intervals (5-min, 15-min, and 60-min)
are used for simulations. The lookback
window is fixed at 960 bars (`lookback_period=960`).
In the training period, we apply the two-step
statistical method to the data in the lookback window to
determine the candidate pairs. Only pairs with
a correlation higher than the threshold
(`correlation_threshold=0.8`) and an
ADF p-value less than the threshold
(`cointegration_pvalue_entry_threshold=0.1`) are shortlisted.
The trading
window is the next 480 bars (`recalibration_interval=480`) immediately
following the training period. When the trading period
completes, the rolling window is automatically
shifted 480 bars ahead
for the next training and trading periods.
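
The rolling schedule above can be illustrated with a short generator. This is a sketch under stated assumptions: `rolling_windows` is a hypothetical helper, indices are bar positions with exclusive ends, and a final window that does not fit completely is simply dropped.

```python
def rolling_windows(n_bars, lookback_period=960, recalibration_interval=480):
    """Yield (train_start, train_end, trade_start, trade_end) bar indices.

    End indices are exclusive. Each trading window starts right after its
    training window, and the whole schedule then shifts forward by
    recalibration_interval bars.
    """
    start = 0
    while start + lookback_period + recalibration_interval <= n_bars:
        train_end = start + lookback_period
        trade_end = train_end + recalibration_interval
        yield start, train_end, train_end, trade_end
        start += recalibration_interval


for window in rolling_windows(2400):
    print(window)
# (0, 960, 960, 1440)
# (480, 1440, 1440, 1920)
# (960, 1920, 1920, 2400)
```

Because the shift equals the trading window length, consecutive trading windows are contiguous and never overlap, while consecutive training windows overlap by 480 bars.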
In the trading period, the spread
The entry threshold is searched between 1.5 and 2.0 sigma
(`1.5 < entry_threshold < 2.0`), and the exit threshold
between 2.5 and 3.5 sigma
(`2.5 < exit_threshold < 3.5`). Within these bounds, the parameters vary with the
backtesting results and the individual pair; the bounds are intended to limit the risk of
overfitting. We also assume that, for each trading opportunity,
the maximum capital allocated to an individual security
is USD 1 million (`capital_per_entry=1000000`), and that we
enter a trade only once for repeated signals
(`max_number_of_entry=1`).
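
To make the role of the two thresholds concrete, here is a minimal sketch of a z-score state machine. Several details are assumptions rather than the documented rules: the function name `generate_signals` is hypothetical, a position is closed for profit when the z-score crosses zero, the exit threshold acts as a stop loss, and at most one position is open at a time (mirroring `max_number_of_entry=1`).

```python
def generate_signals(zscores, entry_threshold=1.5, exit_threshold=3.0):
    """Map a sequence of spread z-scores to positions.

    +1 = long the spread (long security 1, short security 2),
    -1 = short the spread, 0 = flat.
    """
    position = 0
    positions = []
    for z in zscores:
        if position == 0:
            # Enter only when the z-score is beyond the entry threshold
            # but has not yet reached the stop-loss (exit) threshold.
            if entry_threshold < z < exit_threshold:
                position = -1
            elif -exit_threshold < z < -entry_threshold:
                position = 1
        elif position == 1 and (z >= 0 or z <= -exit_threshold):
            # Long spread: close on reversion to the mean or on stop loss.
            position = 0
        elif position == -1 and (z <= 0 or z >= exit_threshold):
            # Short spread: close on reversion to the mean or on stop loss.
            position = 0
        positions.append(position)
    return positions


zs = [0.0, 1.0, 1.8, 1.2, -0.1, -1.7, -3.2, -1.6, 0.2]
print(generate_signals(zs))  # [0, 0, -1, -1, 0, 1, 0, 1, 0]
```

In this sketch a stopped-out pair can re-enter as soon as the z-score falls back inside the entry band; a real implementation may want a cool-down period or may wait for the next recalibration.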
A summary of the strategy parameters is shown below:
"lookback_period": 960,
"correlation_threshold": 0.8,
"recalibration_interval": 480,
"cointegration_pvalue_entry_threshold": 0.1,
"entry_threshold": [1.5, 2.0],
"exit_threshold": [2.5, 3.5],
"max_number_of_entry": 1,
"capital_per_entry": 1000000
The objective of the strategy is to maximize the
Sharpe ratio and the total return. Therefore, the objective
function is defined as minimizing
where
We first test the strategy at a 5-min interval. This
means we have a training window (lookback window) of 80 hours
(960 bars × 5 minutes = 4,800 minutes).
In the training dataset (in-sample), we trained the model and selected the cryptocurrency pairs with a negative best loss, since we are minimizing the objective function. Seven pairs are selected, and the strategy is tested on both the in-sample and out-of-sample datasets.
{
('EOS.USD', 'ETH.USD'): {
'entry_threshold': 1.5000392943615142,
'exit_threshold': 3.376602464111785,
'best_loss': -0.03475020027151503},
('TRX.USD', 'XRP.USD'): {
'entry_threshold': 1.551051140512154,
'exit_threshold': 3.017624989640105,
'best_loss': -0.08643681591995156},
('BTC.USD', 'EOS.USD'): {
'entry_threshold': 1.9633378226348694,
'exit_threshold': 3.329967191852652,
'best_loss': -0.017663755051963572},
('EOS.USD', 'LTC.USD'): {
'entry_threshold': 1.7693316696798091,
'exit_threshold': 2.5532912245203043,
'best_loss': -0.016165476679308892},
('BTC.USD', 'LTC.USD'): {
'entry_threshold': 1.8775675045977969,
'exit_threshold': 2.670390189458025,
'best_loss': -0.007816275748584213},
('ETH.USD', 'LTC.USD'): {
'entry_threshold': 1.762856210892485,
'exit_threshold': 2.707378771966564,
'best_loss': -0.014213080160328405},
('BTC.USD', 'ETH.USD'): {
'entry_threshold': 1.9332887682355882,
'exit_threshold': 3.407991025881241,
'best_loss': -0.005658161919799375}
}
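
The selection rule (keep only pairs whose best loss is negative) can be sketched as a simple filter over the optimization results. The helper name `select_profitable` is hypothetical, the first entry uses rounded values from the results above, and the second entry is a made-up example of a rejected pair.

```python
def select_profitable(results):
    """Keep pairs with a negative best loss, ordered from best to worst."""
    kept = {pair: p for pair, p in results.items() if p["best_loss"] < 0}
    return dict(sorted(kept.items(), key=lambda item: item[1]["best_loss"]))


results = {
    ("EOS.USD", "ETH.USD"): {
        "entry_threshold": 1.50, "exit_threshold": 3.38, "best_loss": -0.0348},
    # Hypothetical pair with a positive loss, shown only to illustrate rejection.
    ("LTC.USD", "TRX.USD"): {
        "entry_threshold": 1.70, "exit_threshold": 3.00, "best_loss": 0.0120},
}
print(list(select_profitable(results)))  # [('EOS.USD', 'ETH.USD')]
```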
Below is the in-sample backtest result from 2021-01-01 to 2021-12-31:
____________Performance____________
Start Date: 2021-01-01
End Date: 2022-01-01
Number of Trading Days: 365
Number of Instruments: 7
Number of Trades: 168
Total Return: 6.38%
Annualized Return: 6.38%
Sharpe Ratio: 0.96
Rolling Maximum Drawdown: -5.39%
Below is the out-of-sample backtest result from 2022-01-01 to 2022-08-01:
____________Performance____________
Start Date: 2022-01-01
End Date: 2022-08-01
Number of Trading Days: 212
Number of Instruments: 7
Number of Trades: 89
Total Return: 1.82%
Annualized Return: 3.14%
Sharpe Ratio: 0.82
Rolling Maximum Drawdown: -3.00%
We then test the strategy at a 15-min interval. This
means we have a training window (lookback window) of 10 days
(960 bars × 15 minutes = 240 hours).
Following the same procedure, five pairs are selected, and the strategy is tested on both the in-sample and out-of-sample datasets.
{
('EOS.USD', 'LTC.USD'): {
'entry_threshold': 1.85083364536054,
'exit_threshold': 3.224360323840364,
'best_loss': -0.047703687981827114},
('EOS.USD', 'XRP.USD'): {
'entry_threshold': 1.8869138038036657,
'exit_threshold': 2.9094095009860723,
'best_loss': -0.046187517027634906},
('BTC.USD', 'EOS.USD'): {
'entry_threshold': 1.8767472177155844,
'exit_threshold': 2.6226223785191993,
'best_loss': -6.446680549378299e-05},
('EOS.USD', 'ETH.USD'): {
'entry_threshold': 1.603040067942517,
'exit_threshold': 3.4874437489605867,
'best_loss': -0.10111409247066716},
('TRX.USD', 'XRP.USD'): {
'entry_threshold': 1.5092092690764671,
'exit_threshold': 2.912104597010566,
'best_loss': -0.0025735030462288046}
}
Below is the in-sample backtest result from 2021-01-01 to 2021-12-31:
____________Performance____________
Start Date: 2021-01-01
End Date: 2022-01-01
Number of Trading Days: 365
Number of Instruments: 5
Number of Trades: 33
Total Return: 8.14%
Annualized Return: 8.14%
Sharpe Ratio: 1.20
Rolling Maximum Drawdown: -4.92%
Below is the out-of-sample backtest result from 2022-01-01 to 2022-08-01:
____________Performance____________
Start Date: 2022-01-01
End Date: 2022-08-01
Number of Trading Days: 212
Number of Instruments: 5
Number of Trades: 13
Total Return: -8.47%
Annualized Return: -14.59%
Sharpe Ratio: -1.38
Rolling Maximum Drawdown: -11.29%
We then test the strategy at a 60-min interval. This
means we have a training window (lookback window) of 40 days
(960 bars × 60 minutes = 960 hours).
Following the same procedure, only one pair is selected, and the strategy is tested on both the in-sample and out-of-sample datasets.
{
('BTC.USD', 'LTC.USD'): {
'entry_threshold': 1.8959144645762966,
'exit_threshold': 2.9436715640836755,
'best_loss': -0.08387009026604986}
}
Below is the in-sample backtest result from 2021-01-01 to 2021-12-31:
____________Performance____________
Start Date: 2021-01-01
End Date: 2022-01-01
Number of Trading Days: 365
Number of Instruments: 1
Number of Trades: 1
Total Return: 16.77%
Annualized Return: 16.77%
Sharpe Ratio: 1.30
Rolling Maximum Drawdown: -4.40%
Below is the out-of-sample backtest result from 2022-01-01 to 2022-08-01:
____________Performance____________
Start Date: 2022-01-01
End Date: 2022-08-01
Number of Trading Days: 212
Number of Instruments: 1
Number of Trades: 1
Total Return: 7.72%
Annualized Return: 13.28%
Sharpe Ratio: 1.15
Rolling Maximum Drawdown: -4.58%
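As can be seen, the 5-min and 60-min intervals deliver positive returns on both the in-sample and out-of-sample datasets, whereas the 15-min interval turns negative out of sample. As the interval increases, the trading opportunities decrease: the 60-min backtest makes only one trade in each period.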
| Interval | Annualized Return (In-sample) | Annualized Return (Out-of-sample) | Sharpe Ratio (In-sample) | Sharpe Ratio (Out-of-sample) | Maximum Drawdown (In-sample) | Maximum Drawdown (Out-of-sample) | Number of Trades (In-sample) | Number of Trades (Out-of-sample) |
|---|---|---|---|---|---|---|---|---|
| 5-min | 6.38% | 3.14% | 0.96 | 0.82 | -5.39% | -3.00% | 168 | 89 |
| 15-min | 8.14% | -14.59% | 1.20 | -1.38 | -4.92% | -11.29% | 33 | 13 |
| 60-min | 16.77% | 13.28% | 1.30 | 1.15 | -4.40% | -4.58% | 1 | 1 |

There is a lot of work to be done to improve the strategy, including but not limited to:
- (1). In practice, the model should be trained in a dynamic rolling window, i.e., the parameters `entry_threshold` and `exit_threshold` should be recalibrated regularly. The code for optimization is in `optimization_pair.py`.
- (2). Consider a vectorized (dataframe/numpy) implementation of the backtest to speed up the optimization. It is relatively difficult to fully replicate the strategy in dataframe operations. An illustrative example is given in `pandas_pairs.py`, which covers most of the features in the model with much less execution time.
- (3). Add an absolute stop loss to each traded pair to mitigate drawdowns.
- (4). Consider the actual volume to better estimate the number of executed shares.
- (5). Consider using total least squares instead of OLS to obtain the regression coefficients (hedge ratios).
- (6). Consider transaction costs in the simulation.
- (7). Consider different lookback and trading windows for different time intervals.
- (8). Use a one-period execution lag for all trade orders to approximate the bid-ask spread, since contrarian trading strategies might unknowingly be buying at bid prices and selling at ask prices.
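
As an illustration of item (5), the sketch below compares the OLS slope with the total least squares (orthogonal regression) slope for a single regressor, using the standard closed-form expression. The function names are hypothetical and the snippet is not taken from the repository. Unlike OLS, the TLS hedge ratio is symmetric: swapping the two securities gives exactly the reciprocal slope.

```python
import math


def _moments(x, y):
    """Return centered sums of squares and cross-products (sxx, syy, sxy)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxx, syy, sxy


def ols_slope(x, y):
    """Slope of the ordinary least squares regression of y on x."""
    sxx, _, sxy = _moments(x, y)
    return sxy / sxx


def tls_slope(x, y):
    """Slope of the total least squares (orthogonal) fit of y against x."""
    sxx, syy, sxy = _moments(x, y)
    if sxy == 0:
        raise ValueError("slope is undefined when the covariance is zero")
    return (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)


x = [0.0, 1.0, 2.0, 3.0]
y = [0.1, 0.9, 2.2, 2.8]
# TLS is symmetric in the two series; OLS is not.
print(tls_slope(x, y) * tls_slope(y, x))  # approximately 1.0
print(ols_slope(x, y) * ols_slope(y, x))  # r squared, strictly below 1.0 here
```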