The Lean Data SDK is a cross-platform template repository for developing custom data types for Lean. These data types will be consumed by QuantConnect trading algorithms and the research environment, locally or in the cloud.
It is composed of an example .NET solution for the data type and converter scripts.
The solution targets .NET 5; for installation instructions, please follow dotnet download.
The data downloader and converter scripts can be developed in different ways: a Python script, a Python Jupyter notebook, or even a bash script.
- The Python script should be compatible with Python 3.6.8
- The bash script will run on Ubuntu Bionic
Specifically, the environment where these scripts will run is quantconnect/research, which is based on quantconnect/lean:foundation.
This repository should be forked by each new data provider.
Once it is cloned locally, you should be able to successfully build the solution, run all tests, and execute the converter scripts.
- Once the repository is forked, the existing example implementation should be adjusted to create a new data type for a particular data set.
- The assembly name and data type have to be changed, since they should be unique.
- Converter and downloader scripts should be developed following the examples in this repository. These scripts should be provided to QuantConnect, as well as the forked repository at a particular commit.
TODO:
In this tutorial we will create a new custom C# data type that allows Lean algorithms or the research environment to consume a particular data set.
In Lean, each data type inherits from BaseData, overrides a set of methods, and incorporates any specific properties the data set has.
The DataLibrary project holds an example custom data type, MyCustomDataType, which overrides the following methods; a sketch of such a type follows the list.
- GetSource() returns an instance of SubscriptionDataSource, which tells Lean where it should source data for a given date, ticker, and configuration.
- Reader() returns a new instance of this data type for a given line of data.
- Clone() clones the data.
- RequiresMapping() indicates whether the data source is tied to an underlying symbol and requires that corporate events, such as renames and delistings, be applied to it as well.
- IsSparseData() indicates whether the data is sparse. If true, logging for missing files is disabled.
- ToString() converts the instance to string format.
- DefaultResolution() gets the default resolution for this data and security type if the user provided none.
- SupportedResolutions() gets the supported resolutions for this data and security type.
- DataTimeZone() specifies the data time zone for this data type.
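As a rough illustration, the sketch below shows what overriding these methods can look like. It assumes a hypothetical daily CSV layout (date, value, one custom column) served from a local file under the data folder; the property name, file path, and resolutions are placeholders, not a copy of the repository's actual MyCustomDataType.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using NodaTime;
using QuantConnect;
using QuantConnect.Data;

namespace QuantConnect.DataSource
{
    // Hypothetical custom data type: a daily CSV with columns "date,value,custom".
    // Names, paths and resolutions below are placeholders for illustration only.
    public class MyCustomDataType : BaseData
    {
        // Example data-set-specific property
        public decimal CustomProperty { get; set; }

        // Tells Lean where to source the data for a given date, ticker and configuration
        public override SubscriptionDataSource GetSource(SubscriptionDataConfig config, DateTime date, bool isLiveMode)
        {
            var source = Path.Combine(Globals.DataFolder, "alternative", "mycustomdata",
                $"{config.Symbol.Value.ToLowerInvariant()}.csv");
            return new SubscriptionDataSource(source, SubscriptionTransportMedium.LocalFile, FileFormat.Csv);
        }

        // Parses a single line of the source file into a new data point
        public override BaseData Reader(SubscriptionDataConfig config, string line, DateTime date, bool isLiveMode)
        {
            var csv = line.Split(',');
            return new MyCustomDataType
            {
                Symbol = config.Symbol,
                Time = DateTime.ParseExact(csv[0], "yyyyMMdd", CultureInfo.InvariantCulture),
                Value = decimal.Parse(csv[1], CultureInfo.InvariantCulture),
                CustomProperty = decimal.Parse(csv[2], CultureInfo.InvariantCulture)
            };
        }

        // Clones the data point
        public override BaseData Clone()
        {
            return new MyCustomDataType
            {
                Symbol = Symbol,
                Time = Time,
                Value = Value,
                CustomProperty = CustomProperty
            };
        }

        // This hypothetical data set is not tied to an underlying equity, so no mapping is required
        public override bool RequiresMapping() => false;

        // Sparse data: disable logging for missing files
        public override bool IsSparseData() => true;

        public override string ToString() => $"{Symbol} - {Value} - {CustomProperty}";

        public override Resolution DefaultResolution() => Resolution.Daily;

        public override List<Resolution> SupportedResolutions() => new List<Resolution> { Resolution.Daily };

        public override DateTimeZone DataTimeZone() => TimeZones.Utc;
    }
}
```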
Each data type is required to have JSON and protobuf round-trip serialization and deserialization tests, as well as a clone unit test. Examples are provided in MyCustomDataTypeTests.
The only adjustment the MyCustomDataTypeTests test suite should require for a new data type is the CreateNewInstance() method, which should return a fully initialized data point.
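Assuming the hypothetical MyCustomDataType sketched above, that helper could look like the snippet below; the field values are arbitrary and only need to produce a fully initialized instance for the serialization and clone tests.

```csharp
// Inside the test class: returns a fully initialized data point of the new type.
// The values below are arbitrary placeholders used only to exercise the tests.
private static BaseData CreateNewInstance()
{
    return new MyCustomDataType
    {
        Symbol = Symbol.Empty,
        Time = DateTime.Today,
        DataType = MarketDataType.Base,
        Value = 0.5m,
        CustomProperty = 100m
    };
}
```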
Creating an example QCAlgorithm will allow quants to understand how to consume a data set and what value it could provide to their trading strategy.
A sample algorithm is provided in this repository for the defined custom data type.
- Initialize() specifies the data and resolution required, as well as the cash and start and end dates for the algorithm. This is where the custom data should be added.
- OnData(Slice slice) is the primary entry point for the algorithm. Each new data point will be pumped through it. This is where the custom data should be retrieved from the slice object and used.
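For illustration only, a minimal algorithm along those lines might look like the following sketch; the ticker, dates, and log-only OnData handler are placeholders rather than the repository's actual sample algorithm.

```csharp
using QuantConnect;
using QuantConnect.Algorithm;
using QuantConnect.Data;
using QuantConnect.DataSource;

namespace QuantConnect.Algorithm.CSharp
{
    // Sketch of an algorithm consuming the hypothetical MyCustomDataType defined earlier
    public class MyCustomDataTypeAlgorithm : QCAlgorithm
    {
        private Symbol _customDataSymbol;

        public override void Initialize()
        {
            SetStartDate(2020, 1, 1);
            SetEndDate(2021, 1, 1);
            SetCash(100000);

            // Add the custom data subscription for a placeholder ticker
            _customDataSymbol = AddData<MyCustomDataType>("EXAMPLE").Symbol;
        }

        public override void OnData(Slice slice)
        {
            // Pull the latest custom data points out of the slice, if any arrived
            var points = slice.Get<MyCustomDataType>();
            if (points.ContainsKey(_customDataSymbol))
            {
                Log($"{Time} - {points[_customDataSymbol]}");
            }
        }
    }
}
```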
Data converter scripts are responsible for fetching new data and processing it into a format that Lean and the new data type can read.
TODO:
TODO: