This project aims to provide examples of backtesting trading algorithms that are conceived, and partly written, in Python but have parts that need the better performance of C++.
It uses pybind11 to interface between the two languages.
The initial delivery is example code for one possible use case: extracting data from time-series storage.
In order to backtest, data is typically extracted from storage and "replayed": data recorded from past trading sessions is read and then fed to the trading algorithms under test. Their performance is then analysed in various ways.
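The replay idea can be sketched in a few lines of pure Python. This is only an illustration and not this project's code: the `Tick` schema, the `replay` function and the `MovingAverageAlgo` class are hypothetical names invented for the example, and a real backtest would stream data from storage rather than sort an in-memory list.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Tick:
    """One recorded market event (hypothetical minimal schema)."""
    timestamp: int   # e.g. epoch milliseconds
    price: float
    quantity: float


def replay(ticks: Iterable[Tick], on_tick: Callable[[Tick], None]) -> int:
    """Feed recorded ticks to a callback in timestamp order; return the count."""
    count = 0
    # Sorting loads everything into memory, which is fine for a sketch only.
    for tick in sorted(ticks, key=lambda t: t.timestamp):
        on_tick(tick)
        count += 1
    return count


class MovingAverageAlgo:
    """Toy algorithm: records the mean of the last `window` prices seen."""

    def __init__(self, window: int) -> None:
        self.window = window
        self.prices: List[float] = []
        self.averages: List[float] = []

    def on_tick(self, tick: Tick) -> None:
        self.prices.append(tick.price)
        recent = self.prices[-self.window:]
        self.averages.append(sum(recent) / len(recent))


# Recorded data, deliberately out of order to show that replay sorts it.
recorded = [
    Tick(timestamp=3, price=102.0, quantity=1.0),
    Tick(timestamp=1, price=100.0, quantity=2.0),
    Tick(timestamp=2, price=101.0, quantity=0.5),
]
algo = MovingAverageAlgo(window=2)
processed = replay(recorded, algo.on_tick)
```

After the replay, `algo.averages` holds one value per tick, which is the kind of output that would then be analysed.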
A valid backtest normally requires several years' worth of data, which can include full order-book data, so in an active market it may amount to hundreds of gigabytes or more.
Further, in order to reasonably simulate a real-world trading environment, the data has to be read from storage quickly. Its storage and retrieval must take into account memory limitations in the systems used.
For this reason the Parquet file format is popular for backtesting. Parquet was created within the Hadoop ecosystem to deal with very large data sets and uses columnar storage: the values of each column are stored together. This makes it well suited to very fast retrieval (and storage) of series of data.
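To make the difference between row and column layouts concrete, here is a small standard-library sketch. It is an analogy only: it does not use Parquet or Arrow, and the field names are made up for illustration.

```python
from array import array

# Row-oriented layout: the fields of each record are stored together.
rows = [
    (1, 100.0, 2.0),   # (timestamp, price, quantity)
    (2, 101.0, 0.5),
    (3, 102.0, 1.0),
]

# Column-oriented layout: all values of one field are stored contiguously.
columns = {
    "timestamp": array("q", (r[0] for r in rows)),  # 64-bit signed integers
    "price": array("d", (r[1] for r in rows)),      # 64-bit floats
    "quantity": array("d", (r[2] for r in rows)),
}

# Reading one series from rows touches every record...
mean_price_rows = sum(r[1] for r in rows) / len(rows)

# ...whereas the columnar layout reads a single contiguous buffer.
mean_price_cols = sum(columns["price"]) / len(columns["price"])
```

Both computations give the same result; the columnar version simply never looks at the timestamp or quantity data, which is what matters when only a few series are needed from a very large file.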
Apache provides the Arrow columnar memory format, which supports Parquet as one of its file formats. It has a C++ library, which we use.
The code provided simply loads a Parquet table from a file and makes available some information about it and its columns. It has been tested on only one particular file, but there is no reason it shouldn't work on other files with columns of the same types.
In our example, we retrieve a Table from a Parquet file and use pybind11 to expose methods for Python usage.
Due to the architecture of Arrow, it is necessary to implement Visitors. In our case this is the ArrayVisitor. Others who wish to solve more complex problems can expand on this code or learn from it.
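For readers unfamiliar with the visitor pattern, the following Python sketch shows the general shape of it. It is not Arrow's API and not this project's C++ code: the column classes, `accept`, and the `visit_*` methods are hypothetical names chosen to loosely mirror the way a typed array hands itself to the matching method of a visitor.

```python
class Int64Column:
    """Hypothetical typed column holding integers."""

    def __init__(self, values):
        self.values = list(values)

    def accept(self, visitor):
        # The column knows its own type, so it picks the matching method.
        return visitor.visit_int64(self)


class StringColumn:
    """Hypothetical typed column holding strings."""

    def __init__(self, values):
        self.values = list(values)

    def accept(self, visitor):
        return visitor.visit_string(self)


class SummaryVisitor:
    """Produces a short type-specific description of each column."""

    def visit_int64(self, column):
        values = column.values
        return f"int64: {len(values)} values, min={min(values)}, max={max(values)}"

    def visit_string(self, column):
        values = column.values
        longest = max(len(v) for v in values)
        return f"string: {len(values)} values, longest={longest}"


table = [Int64Column([3, 1, 2]), StringColumn(["BTC", "ETHUSD"])]
visitor = SummaryVisitor()
summaries = [column.accept(visitor) for column in table]
```

The calling code never checks column types itself; supporting a new operation means writing a new visitor rather than changing every column class.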
Linux or macOS. See here for Windows.
- Clone project
    git clone https://github.com/profitviews/fast-python-backtest.git
    cd fast-python-backtest
- Install Conan
    python3 -m venv .venv          # Create a Python virtual env
    source ./.venv/bin/activate    # Activate the virtual env
    pip install conan              # Install conan
- Install Conan Package & Configure CMake
    mkdir build
    cd build
    conan install ../ --build missing
    source ./activate.sh
    cmake -DCMAKE_BUILD_TYPE=Debug ..
- Build
    cmake --build .
- This will create (with debugging symbols) the extension module. The exact suffix depends on your Python version and platform, for example:
    parquet_table.cpython-310-x86_64-linux-gnu.so
- Open parquet_table.ipynb in a Jupyter notebook
- The cells in parquet_table.ipynb should now function.