Microsoft_Stock_Market

Using Python and Time Series Applications to analyze and predict stock market trends

Goal :

The study aims to develop a model using machine learning methods to analyze stock trading risks and make informed decisions on whether to stay in the market or exit, essentially determining whether to buy or sell stocks.

Dataset :

I used daily MSFT (Microsoft Corporation) stock data from Yahoo Finance, covering 1986/03/14 to 2022/12/20.

Date        Open        High        Low         Close       Adj Close   Volume
1986-03-14  0.097222    0.102431    0.097222    0.100694    0.062980    308160000
1986-03-17  0.100694    0.103299    0.100694    0.102431    0.064067    133171200
1986-03-18  0.102431    0.103299    0.098958    0.099826    0.062437    67766400
1986-03-19  0.099826    0.100694    0.097222    0.098090    0.061351    47894400
1986-03-20  0.098090    0.098090    0.094618    0.095486    0.059723    58435200
....        ....        ....        ....        ....        ....        ....
2022-12-13  261.690002  263.920013  253.070007  256.920013  256.920013  42196900
2022-12-14  257.130005  262.589996  254.309998  257.220001  257.220001  35410900
2022-12-15  253.720001  254.199997  247.339996  249.009995  249.009995  35560400
2022-12-16  248.550003  249.839996  243.509995  244.690002  244.690002  86088100
2022-12-19  244.860001  245.210007  238.710007  240.449997  240.449997  29668800

Because the Date column is ordered and covers consecutive trading days, we can treat this dataset as a time series.
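As a minimal sketch of treating the data as a time series, the snippet below parses a few rows from the table above into a pandas DataFrame indexed by date and checks that the dates are sorted and unique. The inline CSV text and the variable names are illustrative assumptions, not the notebook's own code.

```python
import io

import pandas as pd

# A few rows copied from the table above (illustrative sample, not the full dataset).
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
1986-03-14,0.097222,0.102431,0.097222,0.100694,0.062980,308160000
1986-03-17,0.100694,0.103299,0.100694,0.102431,0.064067,133171200
1986-03-18,0.102431,0.103299,0.098958,0.099826,0.062437,67766400
1986-03-19,0.099826,0.100694,0.097222,0.098090,0.061351,47894400
1986-03-20,0.098090,0.098090,0.094618,0.095486,0.059723,58435200
"""

# Parse the Date column as datetimes and use it as the index.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"], index_col="Date")

# A time series index should be sorted and free of duplicate dates.
is_ordered = df.index.is_monotonic_increasing and df.index.is_unique
print(is_ordered)
print(df.index.min(), df.index.max())
```

The same two checks can be run on the full dataset before any modelling step that depends on row order.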

Open/Close price over time of this dataset

[Figure: Open/Close price over time]

After zooming in on a specific time range

[Figure: Last 100 records]

Candlestick charts for the same time range

[Figure: Last 100 records - Candlestick charts]

Stock Price Information

The Open Price

The Open price represents the price at which a stock was first traded during the current trading session.

The Close Price

The Close price represents the price at which a stock was last traded during the current trading session.

The High Price

The High price represents the highest price at which a stock was traded during the current trading session.

The Low Price

The Low price represents the lowest price at which a stock was traded during the current trading session.



The Open and Close prices give an idea of the general market trend for the stock in question.

  • If the Close price is higher than the Open price, the stock gained value during the trading session, which indicates a bullish session.
  • If the Close price is lower than the Open price, the stock lost value during the trading session, which indicates a bearish session.
  • The High and Low prices give an idea of the market volatility for the stock in question. A large spread between the High and Low prices indicates high volatility during the trading session; a small spread indicates low volatility.
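The rules in the list above can be sketched in a few lines of pandas. The three sessions are copied from the dataset sample; the column names `Direction`, `Spread`, and `RelativeSpread` are hypothetical and chosen only for this illustration.

```python
import pandas as pd

# Three sessions copied from the dataset sample (Open, High, Low, Close).
sessions = pd.DataFrame(
    {
        "Open": [0.097222, 0.102431, 248.550003],
        "High": [0.102431, 0.103299, 249.839996],
        "Low": [0.097222, 0.098958, 243.509995],
        "Close": [0.100694, 0.099826, 244.690002],
    },
    index=pd.to_datetime(["1986-03-14", "1986-03-18", "2022-12-16"]),
)

# Close above Open means the price rose during the session.
# A session where Close equals Open is treated as not bullish here.
sessions["Direction"] = (sessions["Close"] > sessions["Open"]).map(
    {True: "bullish", False: "bearish"}
)

# High minus Low is the session's price range; dividing by Open makes
# sessions at very different price levels comparable.
sessions["Spread"] = sessions["High"] - sessions["Low"]
sessions["RelativeSpread"] = sessions["Spread"] / sessions["Open"]

print(sessions[["Direction", "Spread", "RelativeSpread"]])
```

Using a relative spread is a design choice for this sketch: MSFT traded below 1 USD in 1986 and above 200 USD in 2022, so raw High-Low differences from the two periods are not directly comparable.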

Bullish Trend vs. Bearish Trend

[Figures: bullish, bearish]

Realization :

Our analysis is both monthly-based and daily-based, and all decisions are made on the first trading day of the month. For a reason that the code makes clear, the analysis starts 24 months after January 1986 and ends the month before November 2022.
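One possible way to obtain a monthly view from daily data is sketched below. It is an assumption about how the monthly-based analysis could be built, not the notebook's actual code: the prices are synthetic, and business days stand in for trading days (exchange holidays are ignored).

```python
import numpy as np
import pandas as pd

# Synthetic daily prices on business days (illustrative only, not MSFT data).
days = pd.bdate_range("2022-01-03", "2022-03-31")
close = pd.Series(np.linspace(100.0, 120.0, len(days)), index=days)
daily = pd.DataFrame(
    {"Open": close - 0.5, "High": close + 1.0, "Low": close - 1.0, "Close": close}
)

# Collapse each calendar month into a single candle:
# first Open, highest High, lowest Low, last Close.
monthly = daily.resample("MS").agg(
    {"Open": "first", "High": "max", "Low": "min", "Close": "last"}
)

# First available day of each month, i.e. the day a decision would be taken.
first_trading_day = daily.groupby(daily.index.to_period("M")).head(1).index

print(monthly)
print(first_trading_day)
```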

Then I selected the columns used for the candlestick chart ("Open", "High", "Low", "Close").

I converted this unsupervised problem into a supervised one.

If we return to the dataset and represent it with a candlestick chart, we see how Microsoft's share price varies over time. Our objective is to predict, at the start of each trading period, whether to stay in the market or leave it.

As shown in the image below, I created another column, 'Target', that specifies whether the share price increased or decreased during the current trading session. I calculate the difference between the Close price and the Open price: if it is positive, the session was bullish (Target = 1); otherwise it was bearish (Target = 0).

[Figure: data labeling]

This target describes the current period, but we want to predict the next period: will it be bullish or bearish?

[Figure: data labeling]

To do this, I shifted all the target values up by one row, so that each period's target says whether the next trading period will be bullish or bearish.
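The labeling and the shift can be sketched as follows, using the last five sessions from the dataset sample. The column name `NextTarget` is a hypothetical name introduced for this illustration; the notebook may overwrite 'Target' in place instead.

```python
import pandas as pd

# Five sessions copied from the end of the dataset sample.
df = pd.DataFrame(
    {
        "Open": [261.690002, 257.130005, 253.720001, 248.550003, 244.860001],
        "Close": [256.920013, 257.220001, 249.009995, 244.690002, 240.449997],
    },
    index=pd.to_datetime(
        ["2022-12-13", "2022-12-14", "2022-12-15", "2022-12-16", "2022-12-19"]
    ),
)

# 1 if the session closed above its open (bullish), else 0 (bearish).
df["Target"] = (df["Close"] - df["Open"] > 0).astype(int)

# Shift the labels up by one row so each row carries the NEXT period's label.
df["NextTarget"] = df["Target"].shift(-1)

# The last row has no following period, so its label is unknown and is dropped.
labeled = df.dropna(subset=["NextTarget"]).astype({"NextTarget": int})

print(labeled)
```

Dropping the final row matters: it has no known future label, so keeping it would put a missing value into the training target.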

Next, I trained machine learning and deep learning models to predict, for new data, whether to stay in the market or not.

Using Random Forest ...

[Figure: Random Forest]
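A minimal Random Forest baseline might look like the sketch below. It uses scikit-learn with synthetic features and labels as a stand-in for the real data; the 80/20 chronological split and the hyperparameters are assumptions, since the source does not state them.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Synthetic stand-in for the feature matrix (e.g. Open, High, Low, Close)
# and the next-period labels; replace with the real columns in practice.
X = rng.normal(size=(300, 4))
y = (X[:, 0] + 0.5 * rng.normal(size=300) > 0).astype(int)

# Chronological split: train on the earliest 80%, test on the most recent 20%.
# Shuffling would let the model see the future of the periods it is tested on.
split = int(len(X) * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

pred = model.predict(X_test)
acc = accuracy_score(y_test, pred)
print("accuracy:", acc)
```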

Using Random Forest + GridSearchCV...

[Figure: Random Forest + GridSearchCV]
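A hyperparameter search of this kind could be sketched as below. The parameter grid, the synthetic data, and the use of `TimeSeriesSplit` for cross-validation are all assumptions made for illustration; the source does not say which grid or splitter was used.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

rng = np.random.default_rng(0)

# Synthetic stand-in for the real features and next-period labels.
X = rng.normal(size=(200, 4))
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)

# Small illustrative grid; a real search would usually cover more values.
param_grid = {"n_estimators": [25, 50], "max_depth": [3, None]}

# TimeSeriesSplit keeps every validation fold later in time than its training fold.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid=param_grid,
    cv=TimeSeriesSplit(n_splits=3),
    scoring="accuracy",
)
search.fit(X, y)

print(search.best_params_, search.best_score_)
```

After fitting, `search.best_estimator_` holds the model refit on all the data with the best parameters, so it can be used directly for prediction.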

We notice that the results improved ...

Using Random Forest + GridSearchCV + Feature extraction

I calculated the logarithmic difference between consecutive prices ...

[Figure: Random Forest + GridSearchCV + feature extraction]

I used logarithmic differencing to normalize the data. This is a common technique in financial analysis for expressing price variations in relative (approximately percentage) terms.


Logarithmic differencing is useful for data with increasing trends over time, such as stock price data.
It involves taking the natural logarithm of each price value, then calculating the difference between consecutive values to compute relative growth rates between periods. This method helps visualize stock price growth trends for better understanding of price changes, even with significant long-term increases. It enables easy comparison of growth trends between different stocks or periods.
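The computation described above can be sketched with NumPy and pandas, using the first five Close prices from the dataset sample. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

# First five Close prices from the dataset sample.
close = pd.Series(
    [0.100694, 0.102431, 0.099826, 0.098090, 0.095486],
    index=pd.to_datetime(
        ["1986-03-14", "1986-03-17", "1986-03-18", "1986-03-19", "1986-03-20"]
    ),
    name="Close",
)

# log_diff[t] = ln(close[t]) - ln(close[t-1]) = ln(close[t] / close[t-1])
# The first row has no predecessor, so it is dropped.
log_diff = np.log(close).diff().dropna()

# For small moves the log difference is close to the simple percentage change.
pct_change = close.pct_change().dropna()

print(pd.DataFrame({"log_diff": log_diff, "pct_change": pct_change}))
```

One convenient property is that log differences add up over time: their sum over a span equals the logarithm of the ratio of the last price to the first.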

[Figure: logarithm-diff]

Then I normalized the data ...

[Figure: logarithm-diff]

Using Random Forest + GridSearchCV again on the pre-processed data ...

[Figure: logarithm-diff]

Then I deployed the model to tell us whether to stay in or exit the market :)

[Figure: deploy]