Stock-Forecasting

Python repository that uses time-series data from the S&P 500 to train a RandomForestClassifier to predict the probability of a stock price increasing or decreasing. This script is meant for educational purposes only - this is not financial advice. Consult with your financial adviser before making any investments.


Repository Overview

This repository trains a RandomForestClassifier on historical S&P 500 time-series data to predict whether the index will close higher or lower on the next trading day.

We will use the yfinance package, an open-source wrapper around Yahoo Finance data, to get historical data for the S&P 500 index (^GSPC). Yahoo Finance offers a broad range of market data on stocks, bonds, currencies, and cryptocurrencies. It also provides news reports with insights into markets around the world.

Install Yahoo Finance API

$ pip install yfinance 

Load Yahoo Finance API

import yfinance as yf
sp500 = yf.Ticker("^GSPC")
sp500 = sp500.history(period="max")

Install matplotlib

$ pip install matplotlib

Load matplotlib

import matplotlib.pyplot as plt

Plot S&P 500 Index

plt.plot(sp500.index, sp500["Close"])
plt.show()

[Figure: plot of the S&P 500 closing price over time]

Set up Target Variables

sp500["Tomorrow"] = sp500["Close"].shift(-1)
sp500["Target"] = (sp500["Tomorrow"] > sp500["Close"]).astype(int) 
sp500 = sp500.loc["1990-01-01":].copy()
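To make the two derived columns concrete, here is a minimal sketch that applies the same `shift(-1)` and comparison steps to a hand-made price series. The `toy` frame and its values are hypothetical and stand in for the real `sp500` data.

```python
import pandas as pd

# Toy closing prices (hypothetical values, not real market data)
toy = pd.DataFrame(
    {"Close": [100.0, 102.0, 101.0, 105.0]},
    index=pd.date_range("2024-01-01", periods=4, freq="D"),
)

# Next day's close, aligned with the current row
toy["Tomorrow"] = toy["Close"].shift(-1)
# 1 if tomorrow's close is higher than today's close, else 0
toy["Target"] = (toy["Tomorrow"] > toy["Close"]).astype(int)

print(toy)
```

The last row has no "Tomorrow" value (it is NaN), so its "Target" comes out as 0 only because a comparison against NaN is False. It may be worth dropping that row before training.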

Train ML Model

Install scikit-learn library

$ pip install scikit-learn

Import Random Forest Classifier

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score

Use S&P Time-Series Data to Train ML Model

model = RandomForestClassifier(n_estimators=200, min_samples_split=50, random_state=1)
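One possible way to fit and evaluate this model is sketched below. It is not necessarily the repository's own procedure. To stay runnable without network access it uses a synthetic random-walk price series in place of `sp500`. The predictor list, the 100-row test window, and the names `data`, `train`, `test`, and `preds` are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score

# Synthetic random-walk prices standing in for the real S&P 500 frame (assumption)
rng = np.random.default_rng(0)
n = 500
close = 100 + np.cumsum(rng.normal(0, 1, n))
data = pd.DataFrame(
    {
        "Close": close,
        "Open": close + rng.normal(0, 0.5, n),
        "Volume": rng.integers(1_000, 5_000, n),
    },
    index=pd.date_range("2020-01-01", periods=n, freq="B"),
)
data["Tomorrow"] = data["Close"].shift(-1)
data["Target"] = (data["Tomorrow"] > data["Close"]).astype(int)
data = data.iloc[:-1]  # drop the last row, whose "Tomorrow" is unknown

predictors = ["Close", "Open", "Volume"]  # assumed feature columns

# Chronological split: train on the past, evaluate on the most recent rows
train = data.iloc[:-100]
test = data.iloc[-100:]

model = RandomForestClassifier(n_estimators=200, min_samples_split=50, random_state=1)
model.fit(train[predictors], train["Target"])

# Predicted up/down labels for the held-out period
preds = pd.Series(model.predict(test[predictors]), index=test.index, name="Predictions")
precision = precision_score(test["Target"], preds, zero_division=0)
print(f"precision: {precision:.3f}")
```

The split is chronological rather than random because shuffling time-series rows would let the model train on data from after the period it is tested on. On random-walk data like this, precision should not be expected to differ meaningfully from chance.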