/RL-trading-strategy

Utilizing reinforcement learning, this project implements a dynamic algorithmic trading strategy based on Q-learning with a deep Q-network. The Jupyter Notebook explores agent decisions on buying, selling, or holding Nasdaq stocks over a ten-year period (2014-2023).

MIT LicenseMIT

Reinforcement Learning–Based Trading Strategy

Overview

This Jupyter Notebook repository presents an end-to-end trading strategy based on reinforcement learning. The strategy is implemented using the Q-learning approach with a deep Q-network (DQN). The primary objective is to create a trading agent that dynamically decides to buy, sell, or hold financial instruments based on the current state of stock prices.

Problem Definition

In the reinforcement learning framework adopted for this case study, the algorithmic agent takes actions (buy, sell, or hold) contingent upon the prevailing state of the stock price. The key components encompass the trading agent, permissible actions, reward computation, state representation, and the stock market environment.

  • Agent: The trading agent is responsible for making decisions to buy, sell, or hold financial instruments.
  • Action: The possible actions include buy, sell, or hold.
  • Reward Function: Realized profit and loss (PnL) serves as the reward metric. The reward hinges on the action taken: sell (realized PnL), buy (no reward), or hold (no reward).
  • State: The state is represented using a sigmoid function, capturing the differences in past stock prices within a specified time window.
  • Environment: The stock exchange or stock market serves as the contextual environment.

Components of Reinforcement Learning Framework

Reward Function

The reward function is a pivotal parameter influencing the learning process. In this scenario, the reward function includes realized profit and loss but can be extended to incorporate risk, volatility, maximum drawdown, and transaction costs.

State

The state encapsulates the observations the agent receives from the environment. It should be indicative of the current market behavior, potentially including predictive signals or market microstructure-related values such as volume traded.

Data

The dataset utilized for this project comprises Nasdaq closing prices, sourced from Yahoo Finance. The dataset spans ten years, ranging from 2014 to 2023.

Implementation

The implementation leverages the Q-learning approach with a deep Q-network (DQN) to train the trading agent. The algorithm endeavors to make optimal decisions by maximizing the expected reward over time.

Instructions

To replicate the results and execute the code, follow these steps:

  1. Clone the repository: git clone https://github.com/Mephistopheles-0/RL-trading-strategy
  2. Install the necessary dependencies: pip install -r requirements.txt
  3. Open the Jupyter Notebook: RL-trading-strategy.ipynb