/dbg-pds-tensorflow-demo

Making predictions on prices in the Deutsche Börse Public Dataset using neural networks

Primary language: Jupyter Notebook. License: MIT.

Stock Price Movement Prediction Using The Deutsche Börse Public Dataset & Machine Learning

Introduction

We apply neural networks to stock market data from the Deutsche Börse Public Dataset (PDS) to predict future price movements for each stock.

Specifically, we predict the direction of the next minute's price change using information from the previous ten minutes. We use these predictions to drive a simplified trading strategy that illustrates potential returns.
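As a rough illustration of this setup, the sketch below turns a series of per-minute prices into (features, label) pairs: ten past returns as features and the direction of the next minute's move as the label. The function name make_samples, the use of simple returns as features, and the treatment of flat moves as "not up" are assumptions made for this example; the notebooks may build their inputs differently.

```python
from typing import List, Tuple

WINDOW = 10  # minutes of history used as input, per the text above


def make_samples(prices: List[float], window: int = WINDOW) -> List[Tuple[List[float], int]]:
    """Pair each window of past returns with the direction of the next move.

    The label is 1 if the next minute's price is higher than the current one
    and 0 otherwise (flat moves count as "not up", an arbitrary choice).
    """
    samples = []
    # t indexes the current minute; we need `window` earlier prices
    # and one later price around it.
    for t in range(window, len(prices) - 1):
        past = prices[t - window:t + 1]  # window + 1 prices -> window returns
        features = [(b - a) / a for a, b in zip(past, past[1:])]
        label = 1 if prices[t + 1] > prices[t] else 0
        samples.append((features, label))
    return samples


if __name__ == "__main__":
    # Synthetic, steadily rising prices (not real PDS data).
    demo = make_samples([100.0 + i for i in range(13)])
    print(len(demo), demo[0][1])
```

Any classifier, including a neural network, could then be trained on these pairs; a trading rule would act only on the predicted label.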

This is intended as a demonstration of possible applications of this dataset.

The Deutsche Börse Public Dataset

The Deutsche Börse PDS project provides minute-by-minute statistics over trading data from the XETRA and EUREX engines.

We focus on XETRA only. It comprises a variety of equities, funds and derivative securities. The PDS contains details at the per-security level, describing trading activity by minute, including the high, low, first and last prices within each time period.
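To make the per-minute statistics concrete, here is a minimal sketch of how the first, high, low and last prices relate to the individual trades within one minute. The MinuteBar class and summarize_minute function are hypothetical names for this example only and do not reflect the actual PDS schema or column names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MinuteBar:
    """Hypothetical container for one security's prices in one minute."""
    first: float  # price of the first trade in the minute
    high: float   # highest traded price in the minute
    low: float    # lowest traded price in the minute
    last: float   # price of the last trade in the minute


def summarize_minute(trade_prices: List[float]) -> MinuteBar:
    """Collapse the time-ordered trade prices of one minute into a bar."""
    if not trade_prices:
        raise ValueError("no trades in this minute")
    return MinuteBar(
        first=trade_prices[0],
        high=max(trade_prices),
        low=min(trade_prices),
        last=trade_prices[-1],
    )


if __name__ == "__main__":
    # Made-up trade prices, in the order they occurred.
    print(summarize_minute([10.0, 10.4, 9.9, 10.1]))
```

The dataset already provides these statistics, so this step is only needed to understand what each value means, not to reproduce it.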

Getting Started

Ensure you have Docker installed before completing the following steps.

  1. Run ./build.sh in the main repo folder to build the Docker image.
  2. Run ./run-notebook.sh to receive the notebook URL. Copy/paste this into your browser to access the notebook.
  3. Work through the notebooks in order. Notebook 02- prepares the data for the other notebooks.

Additionally, rerun step 1 (./build.sh) after any pull that updates the Dockerfile, so that your local image is rebuilt against the latest changes.

Project Structure

The work here is divided across three notebooks:

Additional notebooks

  • What prices are predictable
    • We find that it matters whether you predict an EndPrice, a MeanPrice or a MedianPrice in the next interval. We show how one can normalize the prices to improve the prediction.
  • Clustering Stocks
    • We cluster 100 stocks from the dataset using data from 60 days.
  • Simpler Linear Model
    • We show a well-performing linear model with hand-engineered features on a single stock. We predict the average price of the next day for a single stock. This is intended as an easy way to get started with the dataset and price modeling.
  • Large-scale linear model predicting 20 minutes ahead
    • We run a linear model on the 50 most liquid stocks with proper training and test sets. We predict the direction of the average price in the next 20 minutes.
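The difference between the end, mean and median price targets mentioned above can be sketched as follows. The function interval_targets is a hypothetical helper, and dividing by the current price is just one possible normalization, assumed here for illustration; the notebooks may normalize differently.

```python
from statistics import mean, median
from typing import Dict, List


def interval_targets(prices: List[float], t: int, horizon: int = 20) -> Dict[str, float]:
    """Return three candidate targets for the `horizon` minutes after minute t.

    Each target is expressed relative to the current price (assumed
    normalization), so its sign gives the predicted direction.
    """
    future = prices[t + 1:t + 1 + horizon]
    if len(future) < horizon:
        raise ValueError("not enough future data for this horizon")
    current = prices[t]
    return {
        "end": future[-1] / current - 1.0,        # last price in the interval
        "mean": mean(future) / current - 1.0,     # average price in the interval
        "median": median(future) / current - 1.0, # median price in the interval
    }


if __name__ == "__main__":
    # Synthetic prices with a late jump (not real PDS data).
    print(interval_targets([100.0, 101.0, 102.0, 110.0], t=0, horizon=3))
```

In this toy series the end target reacts fully to the late jump while the median barely moves, which hints at why the choice of target affects how predictable it is.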

Documentation

General project documentation can be found in the project wiki.

Authors

  • Stefan Savev (Originate)
  • Rey Farhan (Originate)