High-performance backtesting engine written in C++ for evaluating trading strategies restricted to a single trading pair (e.g. BTC/USD) and finding their optimal hyper-parameters. Licensed under MIT.
- Dependencies: Bazel, Google Test, Google Benchmark, Abseil, Protocol Buffers.
- Data Analysis Tools: Jupyter, matplotlib, NumPy, Pandas.
Visual Studio Code is recommended for code editing.
DISCLAIMER: ALL OPINIONS EXPRESSED HERE ARE MY OWN AND DO NOT EXPRESS THE VIEWS OR OPINIONS OF MY EMPLOYER.
DISCLAIMER: I DO NOT PROVIDE ANY FINANCIAL, INVESTMENT, LEGAL, TAX OR ANY OTHER PROFESSIONAL ADVICE. I AM NOT A BROKER, FINANCIAL ADVISOR, INVESTMENT ADVISOR, PORTFOLIO MANAGER OR TAX ADVISOR. YOU ACKNOWLEDGE AND AGREE THAT ONLY YOU ARE RESPONSIBLE FOR YOUR USE OF ANY INFORMATION THAT YOU OBTAIN FROM THIS REPOSITORY OR SOFTWARE. YOUR DECISIONS MADE IN RELIANCE ON THE SOFTWARE OR YOUR INTERPRETATIONS OF THE DATA ARE YOUR OWN FOR WHICH YOU HAVE FULL RESPONSIBILITY. YOU EXPRESSLY AGREE THAT YOUR USE OF THE SOFTWARE IS AT YOUR SOLE RISK.
The C++ code in this repository follows the Google C++ Style Guide, is clang-formatted (with the predefined Google style), and is thoroughly unit-tested.
People sometimes get seduced by (day-)trading stocks or cryptocurrencies only to end up making poor decisions and losing money. This software tries to prevent that by providing a tool to scrutinize one's trading ideas.
Note: If you ever decide to expose yourself to the highly volatile waters of cryptocurrencies then I would encourage you to consider some of the following (my own personal) recommendations:
- Stick only to the most reputable and regulated cryptocurrency exchanges.
- Use only strong passwords, ideally generated by a reputable password manager.
- Always enable two-factor authentication. Avoid SMS-based authentication. Instead, use an authentication app. Back up your authentication app secrets (and/or backup codes) safely.
- Always do your own research before investing and never invest more than you can afford to lose.
- Avoid margin trading and derivatives.
- Avoid single points of failure (i.e. always have a plan B when something goes wrong, whether it is a lost or forgotten password, lost or broken mobile phone, hardware wallet, PC, etc.).
- Familiarize yourself with cryptocurrency hacks, scams, and phishing attacks.
- It is advisable to keep larger amounts of cryptocurrencies in a hardware wallet rather than trusting a centralized exchange. (There is an old saying in the crypto community that if you don’t own your keys, you don’t own your crypto.)
- Write down the mnemonic phrase into a steel capsule. Never share your mnemonic phrase with anyone or anywhere online. Always keep it offline and safe.
- Instead of keeping the mnemonic phrase at home, you can also put it into a bank safe deposit box. Even more secure would be to split the mnemonic phrase into two halves and put each part into a separate safe deposit box.
- When ordering a hardware wallet, always order from the main (primary) seller website (avoid ordering from a 3rd party or ordering a second-hand hardware wallet as their security might be compromised). To conceal your own home address (and email) you can order the hardware wallet to an anonymous P. O. Box (and use a separate email for ordering).
Note: It turns out that when it comes to Bitcoin one strategy that worked well historically also happens to be the simplest one: Buy Bitcoin and HODL. For this reason we compare the performance of all trading strategies against this so-called Buy-And-HODL strategy.
So what makes trading cryptocurrencies so seductive? If you look at a price chart like the one shown below it might occur to you that if you had bought BTC at the end of March 2017 and then sold in December 2017 you would have made more than 2000%. (Moreover, there are many other cryptocurrencies that would offer even higher returns.) Although technically correct, let's break down why this is not such an easy task.
BTC/USD 2017 Bull Market, provided by TradingView.
First of all, before anyone commits to executing any concrete action (such as buying BTC), they need to (or at least should) define under which circumstances they would be willing to do that. Once the buying / selling policy is defined, it should be possible to evaluate the performance of this policy over historical data. Keep in mind, however, that the estimated historical performance is not guaranteed to persist in the future (especially when the policy is overfitted to the historical data). Therefore, evaluating one's trading strategy over many (diverse) historical periods is highly recommended.
It turns out that it is not that easy to define buying / selling rules that would capture that 2000% BTC gain in 2017. For example, one would experience at least five 30% - 40% price corrections during this period, often happening in a matter of a few days. Such corrections could easily wipe out anyone who traded with margin (even with moderate leverage).
In this section we provide a brief (but over-simplified) overview of cryptocurrency exchanges and make some assumptions for our use case. We focus solely on a single trading pair (e.g. BTC/USD) on a single centralized exchange. The first listed (crypto) currency of the trading pair is called the base currency (BTC), and the second currency is called the quote currency (USD) (see also this currency pair definition). Contrary to popular belief, interacting with a centralized cryptocurrency exchange almost never involves interacting with a blockchain. The only exceptions are cryptocurrency deposits and withdrawals, which always require a blockchain transaction. Centralized exchanges have internal databases with all account balances and orders of all customers. Thus, trading a cryptocurrency is often a very fast operation (as only the internal (centralized) database needs to be updated). Moreover, for security reasons the centralized exchanges often store most of their customers' funds in offline (cold) storage.
Note: There are many (for the most part Ethereum-based) decentralized exchanges (DEXes) like Uniswap, Curve Finance, etc. and decentralized lending platforms like Aave, Maker, Compound, etc. now very popular in the DeFi ecosystem. One notable difference (w.r.t. the centralized exchanges) is that interacting with these DEXes (and other protocols) requires interacting with the Ethereum blockchain (e.g. via a web3 wallet like MetaMask or web3 API like web3.js) and is typically subject to gas fees (although there are several ongoing Ethereum scaling projects and also Eth2.0 that might reduce these fees significantly). Interestingly, it is possible to execute arbitrage trades over multiple DEXes, as one can write a smart contract that interacts with all these DEXes / lending platforms in a single Ethereum transaction. On the other hand, when doing arbitrage trading on centralized exchanges one needs to move their funds (cryptocurrencies / fiat currencies) between exchanges, which is a non-trivial process. For this reason we do not support arbitrage trading.
A cryptocurrency exchange can be viewed as a (market)place where buyers meet with sellers in order to exchange their cryptocurrencies for fiat (or other cryptocurrencies) and vice versa. Having an exchange as a facilitator of trades is beneficial, as the participants do not need to trust (or even know) each other in order to safely execute their trades. They only need to trust the exchange itself. (The DEXes go even further as the participants do not even need to trust the exchange. They only need to trust the (byte)code behind the smart contract implementing the exchange (deployed at a specific contract address), see e.g. the source code for Uniswap. They also need to trust the correctness of the compiler (e.g. Solidity) to produce correct and secure bytecode, and also the correctness and security of the whole Ethereum blockchain and its EVM.)
When restricting to a single trading pair (e.g. BTC/USD), the main data structure of a centralized exchange is the so-called order book, which contains the list of all buy orders (bids) and the list of all sell orders (asks). The exchange participants can provide the so-called liquidity into the exchange by locking their funds into these bids or asks via the so-called limit orders (free of charge). (Side note: It is also possible to provide liquidity into DEXes by locking ETH or other supported ERC-20 tokens like DAI (a stablecoin pegged to USD) or Wrapped Bitcoin WBTC (a token pegged to Bitcoin) into the so-called liquidity pools. In this way it is possible to earn fees collected by the DEX or even earn governance tokens like UNI from Uniswap.)
On the other hand, when a trader wants to buy a cryptocurrency, they can issue a so-called market buy order with the desired amount. A market buy order is executed immediately and liquidates the lowest ask(s) in the order book in order to fulfill the desired amount. Similarly, when a trader wants to sell some amount of cryptocurrency, they can issue a market sell order with the desired amount and the highest bid(s) will be immediately liquidated from the exchange. The trader initiating the market order is said to demand liquidity, and the counter-party to the transaction supplies liquidity. The exchange usually extracts fees from all parties involved in these transactions, although the (counter-)party that is providing liquidity is usually subjected to smaller (or sometimes even zero) fees in order to incentivize traders to provide liquidity. Note that the order book (or the corresponding depth chart) is usually shown to all exchange participants (although the identities of the participants are hidden).
In addition to limit orders and market orders, exchanges typically also provide the so-called stop orders (or even other more exotic orders not discussed here). A stop buy order at a particular target price can be thought of as a promise by the exchange that it will execute the corresponding market buy order as soon as the price (e.g. the lowest ask) rises to (or above) the provided target price. Similarly, a stop sell order at a particular target price is a promise that the exchange will execute the corresponding market sell order as soon as the price (e.g. the highest bid) drops to (or below) the provided target price. The important thing to keep in mind is that (contrary to limit orders) the target price for stop orders is not guaranteed (as the price jumps can be sudden and non-continuous). Moreover, stop orders are not visible to other exchange participants.
Unfortunately, we do not have access to (the historical updates to) the order book and all historical orders that happened on the exchange. Instead, we have access only to the so-called price history. The price history is defined as follows: Whenever a transaction (market order) was executed, in which some party and counter-party exchanged some amount V of cryptocurrency at price P and at time T, the triple (T,P,V) would be added to the price history. (Typically we do not know whether this transaction was initiated by a market buy or market sell order. Thus sometimes we can observe sequences of rather jumpy prices, especially when there was a wide bid-ask spread.) In general, it is not possible to reconstruct the order book based only on the price history. The price (e.g. if defined as the highest bid) might change suddenly without any market orders being executed. (Imagine that all participants withdrew all their bids. That would essentially crash the price to zero without a single market sell order being submitted.) To complicate matters even more, the trader's actions might impact the actions of other market participants (and the price itself).
The price history is still a rather inconvenient data structure for fast processing. Therefore, we aggregate (resample) it into the so-called OHLC (Open High Low Close) history with some fixed aggregation (sampling) period (e.g. 5 minutes per OHLC tick). An OHLC tick covering the time-period [T, T+dT) is a tuple (T,O,H,L,C,V) defined as follows: Let (T[i],P[i],V[i]) for i = 0 ... k-1 be all price history triples from the time-period [T, T+dT), such that T <= T[0] <= ... <= T[k-1] < T + dT. Then:
O := P[0]
H := max(P[i])
L := min(P[i])
C := P[k-1]
V := sum(V[i])
(where i ranges over 0 ... k-1). If there were no price history triples within the period [T, T+dT) then we define V := 0 and O = H = L = C := the previous close price (of the previous OHLC tick). Note that the sampling period dT is implicitly assumed (and is not part of the tuple). For simplicity, the time T is represented as a UNIX time (in seconds) divisible by the period dT (also in seconds). This makes it easy to quickly find the corresponding OHLC tick for any UNIX timestamp (e.g. using a simple integer division). The trader is executed over the OHLC history one OHLC tick at a time, and emits orders at the end of each OHLC tick. The emitted orders are then executed on the next OHLC tick(s) (to avoid peeking into the future). In the following sub-sections we discuss how we deal with different order types.
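The OHLC definition above can be sketched in a few lines of C++. This is only a minimal illustration and not the code in convert.cc: the PriceRecord and OhlcTick structs and the Resample function are hypothetical names, the input is assumed to be sorted by time with non-negative timestamps, and outlier removal is ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record types (not the project's protocol buffers).
struct PriceRecord {
  int64_t timestamp_sec;  // T
  double price;           // P
  double volume;          // V
};

struct OhlcTick {
  int64_t timestamp_sec;  // Start of the period, divisible by dT.
  double open, high, low, close, volume;
};

// Resamples a time-sorted price history into OHLC ticks with period dt_sec.
// Periods without any trade get V = 0 and O = H = L = C = previous close.
std::vector<OhlcTick> Resample(const std::vector<PriceRecord>& history,
                               int64_t dt_sec) {
  assert(dt_sec > 0);
  std::vector<OhlcTick> ticks;
  for (const PriceRecord& r : history) {
    assert(r.timestamp_sec >= 0);
    // Integer division maps a timestamp to the start of its period.
    const int64_t start = (r.timestamp_sec / dt_sec) * dt_sec;
    // Fill the gap (if any) with zero-volume ticks.
    while (!ticks.empty() && ticks.back().timestamp_sec + dt_sec < start) {
      const int64_t gap_start = ticks.back().timestamp_sec + dt_sec;
      const double prev_close = ticks.back().close;
      ticks.push_back(
          {gap_start, prev_close, prev_close, prev_close, prev_close, 0.0});
    }
    if (ticks.empty() || ticks.back().timestamp_sec != start) {
      // First trade of a new period: O = H = L = C = P[0].
      ticks.push_back({start, r.price, r.price, r.price, r.price, r.volume});
    } else {
      OhlcTick& tick = ticks.back();
      tick.high = std::max(tick.high, r.price);
      tick.low = std::min(tick.low, r.price);
      tick.close = r.price;
      tick.volume += r.volume;
    }
  }
  return ticks;
}
```

Because every tick starts at a multiple of dT, the index of the tick covering a timestamp t is simply (t - first_tick_start) / dT.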
A limit order guarantees the target price, but it does not guarantee fulfillment (full execution). Imagine that you placed a limit sell order (ask) of 10 BTC at 10'000 USD. Assume that the (historical) price jumped to 10'005 USD/BTC at some point, but the corresponding historical (traded) volume was only 5 BTC. It is clear that your limit order would have been at least partially filled (since all the historical asks at or above your order's target price were liquidated), but it is not clear how much of your order's target amount would have been filled (as there might have been other asks at or below your order's target price that had to be filled as well). For this reason, we have introduced max_volume_ratio, which specifies the ratio of the (OHLC tick's) volume that would have been used to fill your limit order. For example, if it was set to 0.1 then only 0.5 BTC would have been used to partially fill your limit order (on the given OHLC tick).
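The following sketch reproduces the arithmetic of this example. It is an illustration under stated assumptions, not the project's implementation: the Side, OhlcTick and LimitOrderFill names are hypothetical, and the rule that a limit sell is reached when the tick's high is at or above the target price (and a limit buy when the low is at or below it) is my simplification.

```cpp
#include <algorithm>
#include <cassert>

enum class Side { kBuy, kSell };

// Hypothetical OHLC tick (timestamp omitted for brevity).
struct OhlcTick {
  double open, high, low, close, volume;
};

// Returns the base amount of a limit order filled on a single OHLC tick.
// At most max_volume_ratio * tick.volume can be used to fill the order.
double LimitOrderFill(Side side, double target_price, double order_amount,
                      const OhlcTick& tick, double max_volume_ratio) {
  assert(max_volume_ratio >= 0.0 && max_volume_ratio <= 1.0);
  assert(order_amount >= 0.0);
  // Assumption: the order is reached only if the price touched the target.
  const bool reached = (side == Side::kSell) ? tick.high >= target_price
                                             : tick.low <= target_price;
  if (!reached) return 0.0;
  return std::min(order_amount, max_volume_ratio * tick.volume);
}
```

With a 10 BTC limit sell at 10'000 USD, a tick that reaches 10'005 USD with 5 BTC of volume, and max_volume_ratio set to 0.1, the function returns 0.5 BTC, matching the example above.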
As discussed before, a stop order can be thought of as a promise by the exchange that the corresponding market order will be executed if the price jumps above (or drops below) the specified target price. It is not clear, however, when exactly the exchange would execute your stop order (as the price jumps can be sudden and non-continuous) so the target price is not guaranteed. Similarly, the effective price of a market order might be different from the current market price (which is defined in our algorithm as the opening price of the OHLC tick over which the market order is being executed). The reason is that there might not be enough liquidity for executing the market order at the opening price. Again, imagine that you placed a stop buy order of 10 BTC at 10'000 USD. If the (historical) price jumped to 10'005 USD/BTC, the stop buy order would have been executed fully, but it is not clear what would have been the effective price. It can be anything between 10'000 and 10'005 USD/BTC (or even more). Therefore, we have introduced market_liquidity, which specifies the accuracy of executing market / stop orders at their target price. If it was set to 1.0, the market / stop order would have been executed exactly at its target price (in our case 10'000 USD/BTC). If it was set to 0.0, the market / stop order would have been executed at the worst possible price w.r.t. the given OHLC tick (in our case 10'005 USD/BTC). Any value between 0.0 and 1.0 can be used. The price is then interpolated between the target price and the worst possible price. Since we have no information about market depth (or liquidity) we cannot really predict the effective price of market / stop orders, especially for large volumes.
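A minimal sketch of this interpolation, assuming it is linear (the text only says "interpolated"). The EffectivePrice function is a hypothetical name; the caller is expected to pass the worst price for the given tick, e.g. the tick's high for a buy order or its low for a sell order.

```cpp
#include <cassert>

// Interpolates between the target price (market_liquidity = 1.0) and the
// worst possible price on the tick (market_liquidity = 0.0).
// Assumption: the interpolation is linear.
double EffectivePrice(double target_price, double worst_price,
                      double market_liquidity) {
  assert(market_liquidity >= 0.0 && market_liquidity <= 1.0);
  return market_liquidity * target_price +
         (1.0 - market_liquidity) * worst_price;
}
```

For the stop buy order above (target 10'000 USD, worst price 10'005 USD), a market_liquidity of 0.5 would give an effective price of 10'002.5 USD/BTC under this assumption.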
On macOS you need to have Xcode (including the Xcode Command Line Tools) installed.
On Windows you need to have the Build Tools for Visual Studio installed.
Install Bazel.
If you use Visual Studio Code then I also recommend installing the following extensions:
- C/C++: C/C++ IntelliSense, debugging, and code browsing.
- Bazel: Bazel BUILD integration.
- vscode-proto3: Protobuf editing.
For the C/C++ extension to work properly for this project you need to set the following settings:
"C_Cpp.clang_format_fallbackStyle": "Google",
"C_Cpp.default.cppStandard": "c++17"
There are a few more caveats for Windows:
- Protocol Buffers come without zlib, which means that the output delimited protocol buffer files are not compressed.
- You need to follow these instructions to install Bazel. In particular, I would recommend installing MSYS2 x86_64 to get Bash and some common Unix tools (like grep, tar, git, curl, gzip, etc.).
- The syntax for Windows Command Prompt is slightly different from Linux / macOS. I have tried to highlight the differences whenever possible.
You can download the code for this project from GitHub as follows:
git clone https://github.com/petercerno/trader-backtest.git
Navigate to the main project directory (i.e. where the WORKSPACE file is located).
Assuming you have Bazel installed, build everything by running:
bazel build ...
Test everything by running:
bazel test ...
There are two main binaries you can run:
- convert.cc: Converts a CSV file into a more compact delimited protocol buffer file. Allows resampling price history into OHLC (Open High Low Close) history with fixed sampling rate.
- trader.cc: Loads OHLC history (in delimited protocol buffer file format), optionally loads a side input history (containing additional signals), and evaluates one or more traders over it.
The source code is structured as follows:
- base/: Core trading data structures and interfaces.
- eval/: Trader execution and evaluation.
- indicators/: Technical indicators that can be re-used by traders.
- logging/: Logging exchange and trader states.
- traders/: Specific trader implementations.
- util/: General utilities (not related to trading).
If you want to backtest your own trader (trading strategy), you need to add and implement the corresponding trader in the traders/ directory. That means implementing the Trader interface (for the trading logic) and the TraderEmitter interface (for emitting new instances of your trader based on the corresponding config in trader_config.proto). Then you need to update trader_factory.cc so that you can initialize and use your trader for backtesting.
Every trader is executed (backtested) as follows:
- At every step the trader receives the latest OHLC tick T[i], some additional side input signals (possibly an empty vector), and current account balances. Based on this information the trader updates its own internal state and data-structures. The current time of the trader is at the end of the OHLC tick T[i] time period. (The trader does not receive zero volume OHLC ticks. These OHLC ticks indicate a gap in a price history, which could have been caused by an unresponsive exchange or its API.)
- Then the trader needs to decide which (if any) orders to emit. The trader can assume that there are no other active orders on the exchange at this moment (see the explanation below).
- Once the trader decides which orders to emit, the exchange will execute (or cancel) all these orders over the follow-up OHLC tick T[i+1]. The trader does not see the follow-up OHLC tick T[i+1] (nor any follow-up side input signals), so it cannot peek into the future by design.
- Once all orders are executed (or canceled) by the exchange, the trader receives the follow-up OHLC tick T[i+1] and the whole process repeats.
Note that at every step every order gets either executed or canceled by the exchange. This is a design simplification so that there are no active orders that the trader needs to maintain over time. In practice, however, we would not cancel orders if they would be re-emitted again. We would simply modify the existing orders (from the previous iteration) based on the updated state.
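The execution loop described above can be sketched as follows. Everything in this block is hypothetical and heavily simplified: the Trader class here is not the project's Trader interface, only market orders are modeled, orders are executed at the opening price without fees, side inputs are omitted, and zero volume ticks are assumed to be skipped entirely.

```cpp
#include <cassert>
#include <vector>

// All types below are hypothetical and only illustrate the control flow.
struct OhlcTick {
  double open, high, low, close, volume;
};
struct Account {
  double base;   // E.g. BTC.
  double quote;  // E.g. USD.
};
// Simplified market order: positive amount buys base, negative sells it.
struct MarketOrder {
  double base_amount;
};

class Trader {
 public:
  virtual ~Trader() = default;
  // Receives the latest tick and balances, returns the orders to emit.
  virtual std::vector<MarketOrder> Update(const OhlcTick& tick,
                                          const Account& account) = 0;
};

// Toy trader: sells half of its base balance once and then stays idle.
class SellHalfOnceTrader : public Trader {
 public:
  std::vector<MarketOrder> Update(const OhlcTick&,
                                  const Account& account) override {
    if (done_) return {};
    done_ = true;
    return {MarketOrder{-0.5 * account.base}};
  }

 private:
  bool done_ = false;
};

Account Backtest(const std::vector<OhlcTick>& history, Account account,
                 Trader& trader) {
  std::vector<MarketOrder> pending;  // Orders emitted after the previous tick.
  for (const OhlcTick& tick : history) {
    assert(tick.volume >= 0.0);
    if (tick.volume == 0.0) continue;  // Assumption: gaps are skipped.
    // 1. Execute or cancel every pending order on this tick (no fees here).
    for (const MarketOrder& order : pending) {
      const double cost = order.base_amount * tick.open;
      const bool affordable = account.quote >= cost &&
                              account.base + order.base_amount >= 0.0;
      if (!affordable) continue;  // Canceled.
      account.base += order.base_amount;
      account.quote -= cost;
    }
    // 2. Only now does the trader observe this tick and emit new orders.
    pending = trader.Update(tick, account);
  }
  return account;
}
```

Note that the pending orders are replaced after every tick, so no order survives more than one step, and the trader never sees a tick before its orders have been executed on it.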
Also note that the OHLC history sampling rate defines the frequency at which the trader is updated and emits orders. In general, traders should be designed in a frequency-agnostic way. In other words, they should have similar behavior and performance characteristics regardless of how frequently they are called. Traders should not assume anything about how often and when exactly they are called. One reason for that is that exchanges (or their APIs) sometimes become unresponsive for random periods of time (and we see that e.g. in the gaps in the price histories). Therefore, we encourage testing traders on OHLC histories with various sampling rates.
When evaluating trader performance, we compare it to the following benchmark:
- Buy and HODL: Invest everything into the cryptocurrency and hold it.
The Buy and HODL strategy is surprisingly hard to beat (especially for BTC/USD).
There are several ways to evaluate a trader:
- Evaluate a trader over a specific time period. We can log exchange states and/or trader's internal states into a CSV file and later analyze them e.g. using Jupyter Lab.
- Evaluate a trader over multiple time periods. This lets us see the trader's performance across many different time periods. Here, however, we do not log anything.
- Evaluate a batch of traders (i.e. do a grid search over trader's hyper-parameters) over a single or multiple time periods (in parallel). This helps us to find an interesting sub-space of trader's hyper-parameters with good performance.
First, starting from the main project directory, download BTC/USD historical prices from bitcoincharts as follows:
Linux / macOS:
mkdir data
curl -k -o "$(pwd)/data/bitstampUSD.csv.gz" \
"https://api.bitcoincharts.com/v1/csv/bitstampUSD.csv.gz"
gunzip "$(pwd)/data/bitstampUSD.csv.gz"
Windows:
mkdir data
curl -k -o "%cd%\data\bitstampUSD.csv.gz" ^
"https://api.bitcoincharts.com/v1/csv/bitstampUSD.csv.gz"
gzip -d "%cd%\data\bitstampUSD.csv.gz"
Then convert the bitstampUSD.csv file into a more compact (delimited protocol buffer) representation as follows:
Linux / macOS:
bazel run :convert -- \
--input_price_history_csv_file="/$(pwd)/data/bitstampUSD.csv" \
--output_price_history_delimited_proto_file="/$(pwd)/data/bitstampUSD.dpb" \
--start_time="2017-01-01" \
--end_time="2022-01-01"
Windows:
bazel run :convert -- ^
--input_price_history_csv_file="%cd%\data\bitstampUSD.csv" ^
--output_price_history_delimited_proto_file="%cd%\data\bitstampUSD.dpb" ^
--start_time="2017-01-01" ^
--end_time="2022-01-01"
Output (shortened):
Selected time period:
[2017-01-01 00:00:00 - 2022-01-01 00:00:00)
Reading price history from CSV file: .../data/bitstampUSD.csv
Loaded 47743034 records in 126.722 seconds
Top 50 gaps:
1483250945 [2017-01-01 06:09:05] - 1483252128 [2017-01-01 06:28:48]: 0:19:43
1483252902 [2017-01-01 06:41:42] - 1483253968 [2017-01-01 06:59:28]: 0:17:46
1483254859 [2017-01-01 07:14:19] - 1483256048 [2017-01-01 07:34:08]: 0:19:49
1483256048 [2017-01-01 07:34:08] - 1483256938 [2017-01-01 07:48:58]: 0:14:50
1483257203 [2017-01-01 07:53:23] - 1483258092 [2017-01-01 08:08:12]: 0:14:49
...
1618387319 [2021-04-14 08:01:59] - 1618393146 [2021-04-14 09:39:06]: 1:37:07
1632304872 [2021-09-22 10:01:12] - 1632307499 [2021-09-22 10:44:59]: 0:43:47
1633075261 [2021-10-01 08:01:01] - 1633076338 [2021-10-01 08:18:58]: 0:17:57
1634716883 [2021-10-20 08:01:23] - 1634720673 [2021-10-20 09:04:33]: 1:03:10
1637744459 [2021-11-24 09:00:59] - 1637748045 [2021-11-24 10:00:45]: 0:59:46
Writing 47743034 records to the file: .../data/bitstampUSD.dpb
Finished in 8.898 seconds
As you can see, reading the full CSV file is slow (over two minutes in the run above). It would be impractical to wait that long every time we wanted to evaluate a trader.
For trader evaluation, however, we need an even more compressed form: an OHLC history. To get it, we need to resample the price history into OHLC ticks. Although it is possible to resample the original CSV file, the delimited protocol buffer file will be much faster to read. One can resample the price history into 5 minute OHLC ticks as follows:
Linux / macOS:
bazel run :convert -- \
--input_price_history_delimited_proto_file="/$(pwd)/data/bitstampUSD.dpb" \
--output_ohlc_history_delimited_proto_file="/$(pwd)/data/bitstampUSD_5min.dpb" \
--start_time="2017-01-01" \
--end_time="2022-01-01" \
--sampling_rate_sec=300
Windows:
bazel run :convert -- ^
--input_price_history_delimited_proto_file="%cd%\data\bitstampUSD.dpb" ^
--output_ohlc_history_delimited_proto_file="%cd%\data\bitstampUSD_5min.dpb" ^
--start_time="2017-01-01" ^
--end_time="2022-01-01" ^
--sampling_rate_sec=300
Output (shortened):
Selected time period:
[2017-01-01 00:00:00 - 2022-01-01 00:00:00)
Reading history from delimited proto file: .../data/bitstampUSD.dpb
Loaded 47743034 records in 34.041 seconds
Top 50 gaps:
1483250945 [2017-01-01 06:09:05] - 1483252128 [2017-01-01 06:28:48]: 0:19:43
1483252902 [2017-01-01 06:41:42] - 1483253968 [2017-01-01 06:59:28]: 0:17:46
...
1634716883 [2021-10-20 08:01:23] - 1634720673 [2021-10-20 09:04:33]: 1:03:10
1637744459 [2021-11-24 09:00:59] - 1637748045 [2021-11-24 10:00:45]: 0:59:46
Removed 22 outliers
Last 20 outliers:
...
1634572737 [2021-10-18 15:58:57]: 61795.17 [0.0631]
1634572737 [2021-10-18 15:58:57]: 61763.10 [0.0369]
1634572737 [2021-10-18 15:58:57]: 61763.10 [0.4348]
1634572737 [2021-10-18 15:58:57]: 59013.00 [10.0680]
1634572737 [2021-10-18 15:58:57]: 62031.50 [0.1900]
x 1634572737 [2021-10-18 15:58:57]: 59013.00 [0.0948]
x 1634572737 [2021-10-18 15:58:57]: 59013.00 [0.0131]
1634572737 [2021-10-18 15:58:57]: 61624.00 [0.0003]
1634572737 [2021-10-18 15:58:57]: 61870.43 [0.0098]
1634572737 [2021-10-18 15:58:57]: 61870.43 [0.0004]
1634572737 [2021-10-18 15:58:57]: 61899.66 [0.0337]
1634572737 [2021-10-18 15:58:57]: 61870.43 [0.1434]
Resampled 47743012 records to 525888 OHLC ticks
Writing 525888 records to the file: .../data/bitstampUSD_5min.dpb
Finished in 0.140 seconds
In general, it is a good idea to evaluate traders over OHLC histories with different sampling rates. To this end we will also resample the price history into the OHLC history with 1 hour sampling rate as follows:
Linux / macOS:
bazel run :convert -- \
--input_price_history_delimited_proto_file="/$(pwd)/data/bitstampUSD.dpb" \
--output_ohlc_history_delimited_proto_file="/$(pwd)/data/bitstampUSD_1h.dpb" \
--start_time="2017-01-01" \
--end_time="2022-01-01" \
--sampling_rate_sec=3600
Windows:
bazel run :convert -- ^
--input_price_history_delimited_proto_file="%cd%\data\bitstampUSD.dpb" ^
--output_ohlc_history_delimited_proto_file="%cd%\data\bitstampUSD_1h.dpb" ^
--start_time="2017-01-01" ^
--end_time="2022-01-01" ^
--sampling_rate_sec=3600
Output (shortened):
Selected time period:
[2017-01-01 00:00:00 - 2022-01-01 00:00:00)
Reading history from delimited proto file: .../data/bitstampUSD.dpb
Loaded 47743034 records in 33.931 seconds
...
Resampled 47743012 records to 43824 OHLC ticks
Writing 43824 records to the file: .../data/bitstampUSD_1h.dpb
Finished in 0.013 seconds
It is also possible to provide an additional side history to the trader. For example, one can use the fear_and_greed_index.ipynb notebook to download the Crypto Fear & Greed Index into a CSV file: data/fear_and_greed_index.csv and then convert it into the delimited proto file as follows:
Linux / macOS:
bazel run :convert -- \
--input_side_history_csv_file="/$(pwd)/data/fear_and_greed_index.csv" \
--output_side_history_delimited_proto_file="/$(pwd)/data/fear_and_greed_index.dpb" \
--start_time="2018-01-01" \
--end_time="2022-01-01"
Windows:
bazel run :convert -- ^
--input_side_history_csv_file="%cd%\data\fear_and_greed_index.csv" ^
--output_side_history_delimited_proto_file="%cd%\data\fear_and_greed_index.dpb" ^
--start_time="2018-01-01" ^
--end_time="2022-01-01"
Output:
Selected time period:
[2018-01-01 00:00:00 - 2022-01-01 00:00:00)
Reading side history from CSV file: .../data/fear_and_greed_index.csv
Loaded 1427 records in 0.002 seconds
Writing 1427 records to the file: .../data/fear_and_greed_index.dpb
Finished in 0.000 seconds
Note: One needs to be very careful when defining additional side input signals for a trader. Every signal at timestamp T can only be based on information available before (or at) the timestamp T (in order to avoid peeking into the future).
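One way to respect this constraint on the consumer side is to always look up the latest side record whose timestamp is at or before the trader's current time. The sketch below assumes a hypothetical SideRecord struct and a history sorted by timestamp; it is not taken from the project's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical side input record with a single signal.
struct SideRecord {
  int64_t timestamp_sec;
  double signal;
};

// Returns the index of the latest record with timestamp <= now_sec,
// or -1 if no such record exists. The history must be sorted by timestamp.
int LatestSideIndex(const std::vector<SideRecord>& history, int64_t now_sec) {
  assert(std::is_sorted(history.begin(), history.end(),
                        [](const SideRecord& a, const SideRecord& b) {
                          return a.timestamp_sec < b.timestamp_sec;
                        }));
  // upper_bound finds the first record strictly after now_sec.
  const auto it = std::upper_bound(
      history.begin(), history.end(), now_sec,
      [](int64_t t, const SideRecord& r) { return t < r.timestamp_sec; });
  return static_cast<int>(it - history.begin()) - 1;
}
```

Records with a later timestamp are never returned, so a trader using this lookup cannot accidentally read a signal from the future.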
Now we can evaluate a simple rebalancing trader over a 5 year time period: [2017-01-01 - 2022-01-01) (and log both the exchange states and also the trader internal states) as follows:
Linux / macOS:
bazel run :trader -- \
--input_ohlc_history_delimited_proto_file="/$(pwd)/data/bitstampUSD_5min.dpb" \
--trader="rebalancing" \
--output_exchange_log_file="/$(pwd)/data/bitstampUSD_5min_rebalancing_exchange_log.out.csv" \
--output_trader_log_file="/$(pwd)/data/bitstampUSD_5min_rebalancing_trader_log.csv" \
--start_time="2017-01-01" \
--end_time="2022-01-01" \
--start_base_balance=1.0 \
--start_quote_balance=0.0
Windows:
bazel run :trader -- ^
--input_ohlc_history_delimited_proto_file="%cd%\data\bitstampUSD_5min.dpb" ^
--trader="rebalancing" ^
--output_exchange_log_file="%cd%\data\bitstampUSD_5min_rebalancing_exchange_log.out.csv" ^
--output_trader_log_file="%cd%\data\bitstampUSD_5min_rebalancing_trader_log.csv" ^
--start_time="2017-01-01" ^
--end_time="2022-01-01" ^
--start_base_balance=1.0 ^
--start_quote_balance=0.0
Output:
Trader AccountConfig:
start_base_balance: 1
start_quote_balance: 0
base_unit: 1e-05
quote_unit: 0.01
market_order_fee_config {
relative_fee: 0.005
fixed_fee: 0
minimum_fee: 0
}
stop_order_fee_config {
relative_fee: 0.005
fixed_fee: 0
minimum_fee: 0
}
limit_order_fee_config {
relative_fee: 0.005
fixed_fee: 0
minimum_fee: 0
}
market_liquidity: 0.5
max_volume_ratio: 0.5
Selected time period:
[2017-01-01 00:00:00 - 2022-01-01 00:00:00)
Trader EvaluationConfig:
start_timestamp_sec: 1483228800
end_timestamp_sec: 1640995200
evaluation_period_months: 0
Reading OHLC history from: .../data/bitstampUSD_5min.dpb
- Loaded 525888 records in 0.410 seconds
- Selected 525888 records within the time period: [2017-01-01 00:00:00 - 2022-01-01 00:00:00)
rebalancing-trader[0.700|0.050] evaluation:
------------------ period ------------------ trader & base gain score t&b volatility
[2017-01-01 00:00:00 - 2022-01-01 00:00:00): 2040.36% 4682.26% 0.448 0.579 0.824
Evaluated in 3.988 seconds
To put it simply, the rebalancing trader tries to maintain a constant portfolio allocation (in our case 70% in BTC and 30% in USD with at most ±5% error). You can learn more about the rebalancing trader in the //traders/rebalancing_trader.ipynb notebook.
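To make the idea concrete, here is a minimal sketch of a rebalancing decision. It is one plausible reading of "constant allocation with at most ±5% error" and not the project's rebalancing trader: the RebalanceAmount function is hypothetical, the tolerance is treated as an absolute deviation of the base allocation, and fees are ignored.

```cpp
#include <cassert>
#include <cmath>

// Returns the base amount to buy (> 0) or sell (< 0) in order to restore
// the target base allocation alpha, or 0 if the current allocation is
// within epsilon of alpha. Fees are ignored (simplifying assumption).
double RebalanceAmount(double base, double quote, double price, double alpha,
                       double epsilon) {
  assert(price > 0.0);
  const double value = base * price + quote;  // Portfolio value in quote.
  assert(value > 0.0);
  const double allocation = base * price / value;
  if (std::fabs(allocation - alpha) <= epsilon) return 0.0;
  return alpha * value / price - base;
}
```

Starting with 1 BTC and 0 USD, RebalanceAmount(1.0, 0.0, 966.37, 0.7, 0.05) is approximately -0.3, i.e. the trader would sell about 0.3 BTC to get back to a 70% BTC allocation.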
As you can see, the trader underperforms the baseline Buy And HODL strategy quite significantly (a gain of 2040.36% versus 4682.26%). On the other hand, it has lower volatility (0.579 versus 0.824), which is a positive sign.
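The score column is not explained here, but the printed values are numerically consistent with the ratio of the final portfolio values of the trader and the baseline. The sketch below encodes that reading; it is an inference from the output above, not a documented definition, and the Score function is a hypothetical name.

```cpp
#include <cassert>

// Ratio of final portfolio values, computed from gains given in percent.
// Assumption: this is how the "score" column relates to the two gains.
double Score(double trader_gain_pct, double base_gain_pct) {
  assert(base_gain_pct > -100.0);
  return (1.0 + trader_gain_pct / 100.0) / (1.0 + base_gain_pct / 100.0);
}
```

For the run above, Score(2040.36, 4682.26) is roughly 0.448, so under this reading a score below 1 means the trader ended up with less than Buy And HODL.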
You can inspect the trader's actions in detail in the log files. For example, the data/bitstampUSD_5min_rebalancing_exchange_log.out.csv looks as follows:
1483228800,966.340,966.370,966.160,966.370,15.697,1.000,0.000,0.000,,,,,
1483229100,966.430,966.580,966.430,966.580,0.439,1.000,0.000,0.000,,,,,
1483229100,966.430,966.580,966.430,966.580,0.439,0.700,288.470,1.450,MARKET,SELL,0.300,,
1483229400,966.570,966.570,964.600,965.550,6.662,0.700,288.470,1.450,,,,,
1483229700,965.590,966.570,965.550,965.550,20.773,0.700,288.470,1.450,,,,,
...
The columns are: timestamp_sec, open, high, low, close, volume, base_balance, quote_balance, total_fee, order_type, order_side, security_amount, cash_amount, price.
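If you prefer to post-process the exchange log in C++ rather than in a notebook, a line can be parsed roughly as follows. This is a sketch based only on the column list above: the ExchangeLogRecord struct and both functions are hypothetical, and the optional order amounts and price are left unparsed. Note that the trailing empty fields must be preserved when splitting, which is why the line is split manually.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical in-memory form of one exchange log line.
struct ExchangeLogRecord {
  int64_t timestamp_sec;
  double open, high, low, close, volume;
  double base_balance, quote_balance, total_fee;
  std::string order_type, order_side;  // Empty if no order was executed.
};

// Splits a line on commas, keeping empty fields (including trailing ones).
std::vector<std::string> SplitCsv(const std::string& line) {
  std::vector<std::string> fields;
  std::size_t begin = 0;
  while (true) {
    const std::size_t end = line.find(',', begin);
    if (end == std::string::npos) {
      fields.push_back(line.substr(begin));
      break;
    }
    fields.push_back(line.substr(begin, end - begin));
    begin = end + 1;
  }
  return fields;
}

ExchangeLogRecord ParseExchangeLogLine(const std::string& line) {
  const std::vector<std::string> f = SplitCsv(line);
  assert(f.size() == 14);  // One field per column listed above.
  ExchangeLogRecord r;
  r.timestamp_sec = std::stoll(f[0]);
  r.open = std::stod(f[1]);
  r.high = std::stod(f[2]);
  r.low = std::stod(f[3]);
  r.close = std::stod(f[4]);
  r.volume = std::stod(f[5]);
  r.base_balance = std::stod(f[6]);
  r.quote_balance = std::stod(f[7]);
  r.total_fee = std::stod(f[8]);
  r.order_type = f[9];
  r.order_side = f[10];
  return r;
}
```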
As you can see, on the first OHLC tick (at timestamp 1483228800) the trader has 100% allocation in BTC. Immediately after observing this OHLC tick the rebalancing trader emits a market sell order to restore the 70% allocation. This order is executed on the follow-up OHLC tick at timestamp 1483229100 (5 minutes later). There are two records corresponding to the timestamp 1483229100. The first one represents the state before executing the order and the second one represents the state after executing the order. (Only successfully executed orders are logged.)
One can also inspect the trader's internal states in data/bitstampUSD_5min_rebalancing_trader_log.csv:
1483228800,1.000,0.000,966.370
1483229100,0.700,288.470,966.580
1483229400,0.700,288.470,965.550
1483229700,0.700,288.470,965.550
1483230000,0.700,288.470,964.870
...
Note, however, that the logged trader internal states can have an arbitrary (trader-specific) structure. They are mostly used for debugging the trader.
We can also evaluate the trader over the 1-hour OHLC history as follows:
Linux / macOS:
bazel run :trader -- \
--input_ohlc_history_delimited_proto_file="/$(pwd)/data/bitstampUSD_1h.dpb" \
--trader="rebalancing" \
--output_exchange_log_file="/$(pwd)/data/bitstampUSD_1h_rebalancing_exchange_log.out.csv" \
--output_trader_log_file="/$(pwd)/data/bitstampUSD_1h_rebalancing_trader_log.csv" \
--start_time="2017-01-01" \
--end_time="2022-01-01" \
--start_base_balance=1.0 \
--start_quote_balance=0.0
Windows:
bazel run :trader -- ^
--input_ohlc_history_delimited_proto_file="%cd%\data\bitstampUSD_1h.dpb" ^
--trader="rebalancing" ^
--output_exchange_log_file="%cd%\data\bitstampUSD_1h_rebalancing_exchange_log.out.csv" ^
--output_trader_log_file="%cd%\data\bitstampUSD_1h_rebalancing_trader_log.csv" ^
--start_time="2017-01-01" ^
--end_time="2022-01-01" ^
--start_base_balance=1.0 ^
--start_quote_balance=0.0
Output (shortened):
...
Reading OHLC history from: .../data/bitstampUSD_1h.dpb
- Loaded 43824 records in 0.032 seconds
- Selected 43824 records within the time period: [2017-01-01 00:00:00 - 2022-01-01 00:00:00)
rebalancing-trader[0.700|0.050] evaluation:
------------------ period ------------------ trader & base gain score t&b volatility
[2017-01-01 00:00:00 - 2022-01-01 00:00:00): 2055.91% 4681.13% 0.451 0.578 0.824
Evaluated in 0.334 seconds
As you can see, the trader performs very similarly on the 1-hour history (score 0.451 versus 0.448 on the 5-minute history). This, however, does not need to hold for every trader. For example, the stop trader performs much better over the 5-minute OHLC history than over the 1-hour OHLC history. You can learn more about the stop trader in the //traders/stop_trader.ipynb notebook.
We can also evaluate the rebalancing trader over multiple 6-month time periods:
Linux / macOS:
bazel run :trader -- \
--input_ohlc_history_delimited_proto_file="/$(pwd)/data/bitstampUSD_5min.dpb" \
--trader="rebalancing" \
--start_time="2017-01-01" \
--end_time="2022-01-01" \
--evaluation_period_months=6 \
--start_base_balance=1.0 \
--start_quote_balance=0.0
Windows:
bazel run :trader -- ^
--input_ohlc_history_delimited_proto_file="%cd%\data\bitstampUSD_5min.dpb" ^
--trader="rebalancing" ^
--start_time="2017-01-01" ^
--end_time="2022-01-01" ^
--evaluation_period_months=6 ^
--start_base_balance=1.0 ^
--start_quote_balance=0.0
This time, however, we cannot log the exchange or trader states, which is why the commands above omit the --output_exchange_log_file and --output_trader_log_file flags.
Output (shortened):
...
Reading OHLC history from: .../data/bitstampUSD_5min.dpb
- Loaded 525888 records in 0.409 seconds
- Selected 525888 records within the time period: [2017-01-01 00:00:00 - 2022-01-01 00:00:00)
rebalancing-trader[0.700|0.050] evaluation:
------------------ period ------------------ trader & base gain score t&b volatility
[2017-01-01 00:00:00 - 2017-07-01 00:00:00): 100.57% 155.13% 0.786 0.542 0.767
[2017-02-01 00:00:00 - 2017-08-01 00:00:00): 124.47% 196.26% 0.758 0.595 0.845
[2017-03-01 00:00:00 - 2017-09-01 00:00:00): 176.04% 297.85% 0.694 0.618 0.877
[2017-04-01 00:00:00 - 2017-10-01 00:00:00): 181.23% 302.96% 0.698 0.650 0.921
...
[2021-04-01 00:00:00 - 2021-10-01 00:00:00): -19.57% -25.85% 1.085 0.581 0.820
[2021-05-01 00:00:00 - 2021-11-01 00:00:00): 4.36% 6.04% 0.984 0.579 0.823
[2021-06-01 00:00:00 - 2021-12-01 00:00:00): 36.53% 51.37% 0.902 0.505 0.719
[2021-07-01 00:00:00 - 2022-01-01 00:00:00): 24.05% 32.05% 0.939 0.461 0.656
Evaluated in 7.105 seconds
It is good to see that the rebalancing trader is very consistent in its performance.
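Judging by the output above, the evaluation periods are 6-month windows whose start slides forward by one month at a time. The following sketch reproduces that list of periods. The one-month step is inferred from the printed periods rather than from the source code, and the helper names `add_months` and `evaluation_periods` are hypothetical.

```python
from datetime import date


def add_months(day: date, months: int) -> date:
    """Shifts a first-of-month date by a whole number of months."""
    index = day.year * 12 + (day.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def evaluation_periods(start: date, end: date, period_months: int):
    """Lists half-open [period_start, period_end) windows within [start, end).

    Assumption: consecutive windows start one month apart.
    """
    periods = []
    period_start = start
    while add_months(period_start, period_months) <= end:
        periods.append((period_start, add_months(period_start, period_months)))
        period_start = add_months(period_start, 1)
    return periods


periods = evaluation_periods(date(2017, 1, 1), date(2022, 1, 1), 6)
print(len(periods), periods[0], periods[-1])
```

Under these assumptions the first window is [2017-01-01, 2017-07-01) and the last one is [2021-07-01, 2022-01-01), matching the first and last rows of the output.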
Finally, we can find the "optimal" hyper-parameters by doing a grid search (i.e. evaluating a batch of traders with different hyper-parameters and then selecting the best one). In this case, too, we cannot log the exchange or trader states. The rebalancing trader has only two hyper-parameters: the target allocation (alpha) and the maximum allowed error of the actual allocation (epsilon); see the RebalancingTraderConfig in //traders/trader_config.proto. We can use the 1-hour sampling rate for batch evaluation, which significantly speeds up the computation.
Note: Normally, one would split the input history into two halves (one for tuning the hyper-parameters and the other one for evaluating the selected trader).
Linux / macOS:
bazel run :trader -- \
--input_ohlc_history_delimited_proto_file="/$(pwd)/data/bitstampUSD_1h.dpb" \
--trader="rebalancing" \
--start_time="2017-01-01" \
--end_time="2022-01-01" \
--evaluation_period_months=6 \
--start_base_balance=1.0 \
--start_quote_balance=0.0 \
--evaluate_batch=true
Windows:
bazel run :trader -- ^
--input_ohlc_history_delimited_proto_file="%cd%\data\bitstampUSD_1h.dpb" ^
--trader="rebalancing" ^
--start_time="2017-01-01" ^
--end_time="2022-01-01" ^
--evaluation_period_months=6 ^
--start_base_balance=1.0 ^
--start_quote_balance=0.0 ^
--evaluate_batch=true
Output (shortened):
...
Reading OHLC history from: .../data/bitstampUSD_1h.dpb
- Loaded 43824 records in 0.033 seconds
- Selected 43824 records within the time period: [2017-01-01 00:00:00 - 2022-01-01 00:00:00)
Batch evaluation:
rebalancing-trader[0.900|0.200]: 1.00000
rebalancing-trader[0.900|0.100]: 0.98363
rebalancing-trader[0.900|0.050]: 0.97924
rebalancing-trader[0.900|0.010]: 0.96792
rebalancing-trader[0.700|0.200]: 0.91965
rebalancing-trader[0.700|0.050]: 0.91300
rebalancing-trader[0.700|0.100]: 0.90948
rebalancing-trader[0.700|0.010]: 0.90224
rebalancing-trader[0.500|0.100]: 0.85249
rebalancing-trader[0.500|0.200]: 0.85195
rebalancing-trader[0.500|0.050]: 0.84689
rebalancing-trader[0.500|0.010]: 0.82795
rebalancing-trader[0.300|0.200]: 0.78263
rebalancing-trader[0.300|0.100]: 0.77946
rebalancing-trader[0.300|0.050]: 0.77116
rebalancing-trader[0.300|0.010]: 0.75726
rebalancing-trader[0.100|0.200]: 0.70423
rebalancing-trader[0.100|0.100]: 0.70227
rebalancing-trader[0.100|0.050]: 0.69960
rebalancing-trader[0.100|0.010]: 0.69284
Evaluated in 0.184 seconds
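In this ranking every configuration with a higher alpha scores above every configuration with a lower alpha. This result suggests that, over this period, the ideal portfolio allocation is to put everything into BTC and HODL.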
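The idea behind the batch evaluation can be illustrated with a small grid search in Python. The alpha and epsilon values are the ones that appear in the output above. Everything else is an assumption made for this sketch: `grid_search` is a hypothetical helper and `toy_evaluate` is a made-up stand-in for running the actual trader evaluation.

```python
from itertools import product

# Hyper-parameter values that appear in the batch output above.
ALPHAS = (0.1, 0.3, 0.5, 0.7, 0.9)
EPSILONS = (0.01, 0.05, 0.1, 0.2)


def grid_search(evaluate, alphas=ALPHAS, epsilons=EPSILONS):
    """Scores every (alpha, epsilon) pair and ranks them, best first."""
    results = [((alpha, epsilon), evaluate(alpha, epsilon))
               for alpha, epsilon in product(alphas, epsilons)]
    results.sort(key=lambda item: item[1], reverse=True)
    return results


def toy_evaluate(alpha, epsilon):
    # Hypothetical scoring function used only to make the sketch runnable.
    # A real run would evaluate the trader over the OHLC history instead.
    return alpha + 0.1 * epsilon


ranking = grid_search(toy_evaluate)
for (alpha, epsilon), value in ranking[:3]:
    print(f"rebalancing-trader[{alpha:.3f}|{epsilon:.3f}]: {value:.5f}")
```

The grid contains 5 × 4 = 20 combinations, the same number of traders as listed in the batch output. When tuning for real, the ranking should be computed on one part of the history and the selected trader evaluated on another, as the note above recommends.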