cderinbogaz/inpredo

Very low accuracy?

sword134 opened this issue · 18 comments

Is anyone else here struggling with very low accuracy? I can't seem to get it above 51%, so it's about as good as a coin toss. I have an equal number of buy and sell images in my training set as well as in my validation set.

How big is your training dataset? How many samples are there in each of the folders?

Yes, there are over 800 images each in buy and sell, the exact same number in both. The same goes for the validation set, although there are only 84 images in each of the buy and sell folders.

Can you increase the number of samples and try again? Double the amount of samples if possible.

How can I? I am using SPY OHLC daily data from 2005. I can turn it down to hourly of course.

I couldn't get hourly data that far back, so instead I settled on daily SPY data from 2000 until today. I've got almost 1000 images in train buy and 1000 in train sell. My validation set has 200 images in each category. My accuracy, however, is still abysmal.

You can download historical data from here.

https://www.investing.com/etfs/spdr-s-p-500-historical-data

@sword134 there is no guarantee that the AI will find patterns in the supplied data. In the past I have used historical data on BTC-USD. My hypothesis is that since the BTC-USD market is less complex than SPY, there were more patterns in it. You can give it a try with different markets such as the gold-usd market. It has more similarities to BTC-USD than SPY does.

@munkh-erdene did you try it out with the SPY hourly data? If so, can you tell us what the result was?

@cderinbogaz I got more datapoints from running daily data on SPY since 2000 than hourly data on SPY since 2019 (730 days). So I went with the daily data from 2000 and got what I wrote earlier in terms of number of images and results.

@munkh-erdene I am using yfinance and would like to stick to that. I don't think the problem is the amount of data, because I have plenty.

@sword134 did you give it a try with gold-usd or other markets as well? As I said, this model was not tested on SPY but on BTC-USD. AI might not be able to find correlations in the SPY market.

@cderinbogaz tested with bitcoin hourly data. I got 6828 training samples and 1600 validation samples. Accuracy still hovering around 50% :/
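For a sense of how much of a reading like this is just noise, here is a minimal sketch of a normal-approximation confidence interval for validation accuracy. The function name `accuracy_interval` is made up for illustration, and the approximation assumes the validation samples are independent, which overlapping chart windows may not be.

```python
import math

def accuracy_interval(accuracy, n, z=1.96):
    """Approximate 95% interval for an accuracy measured on n samples.

    Uses the normal approximation to the binomial; assumes independent samples.
    """
    se = math.sqrt(accuracy * (1 - accuracy) / n)  # standard error of a proportion
    return accuracy - z * se, accuracy + z * se

# 1600 validation samples at 50% accuracy, as reported above.
low, high = accuracy_interval(0.50, 1600)
print(f"{low:.4f} to {high:.4f}")  # 0.4755 to 0.5245
```

Under these assumptions, anything between roughly 47.5% and 52.5% on 1600 samples is indistinguishable from a coin toss, and the interval is wider on smaller validation sets.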

@sword134 how many epochs are you training?

@cderinbogaz 250 for starters. But there is no improvement in the accuracy so there is no reason to train for longer

The data in the code is being classified by looking backwards. It's producing these buy and sell images based on what has already transpired. For example, for EURUSD, let's assume:
10:00 am price is 1.40
10:00 pm price is 1.41
10:00 am (next day) price is 1.39
The code currently makes the classification by looking backward. The 10:00 pm image is classified as BUY, because the price rose from 1.40 to 1.41, whereas the correct label would be SELL: at 10:00 pm we need to look forward to what happens in the future, and at 10:00 am the next day the price was 1.39, which is less than the 1.41 at 10:00 pm.

If you correctly do the classification, the accuracy drops dramatically to 50%.
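The difference between the two labeling schemes can be sketched with the three prices from the example above. The function names `label_backward` and `label_forward` are hypothetical and not taken from the repo; this only illustrates the idea, not the repo's actual labeling code.

```python
# Hypothetical helpers for illustration; not the inpredo implementation.

def label_backward(prices, i):
    """Label bar i from the move that already happened (previous bar -> bar i)."""
    return "buy" if prices[i] > prices[i - 1] else "sell"

def label_forward(prices, i):
    """Label bar i from the move that happens next (bar i -> next bar)."""
    return "buy" if prices[i + 1] > prices[i] else "sell"

# 10:00 am, 10:00 pm, 10:00 am next day
prices = [1.40, 1.41, 1.39]

backward = label_backward(prices, 1)  # 1.41 > 1.40
forward = label_forward(prices, 1)    # 1.39 < 1.41
print(backward, forward)  # buy sell
```

With backward labels the chart image already contains the move it is labeled with, so a classifier can score well without predicting anything. The forward label is the one a trader standing at 10:00 pm would actually need.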

@hxapartners aha, I knew that I wasn't doing it wrong. I was using my own dataset after all, one that DOESN'T look backwards. This repo is simply "wrong".

I have written this many times in the past and I am writing it again. I trained this on BTCUSD data, and for that you need to flip the data because it is ordered from current time to future. On forex data it's the opposite, and therefore you don't need to flip the data. If it's 50%, that means it didn't find any valuable correlation, unlike what @sword134 is claiming by calling the repo "wrong". It's an image classifier that works by generating financial chart data, that's it.
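Whether a given export needs flipping depends on how its rows are ordered. One way to sidestep the question is to sort by timestamp before generating charts. This is a minimal sketch, assuming rows are `(date, close)` pairs and that the chart generator expects oldest-first input (an assumption, not something confirmed in this thread). The helper name `ensure_oldest_first` and the sample values are made up for illustration.

```python
from datetime import date

def ensure_oldest_first(rows):
    """Return rows sorted oldest to newest, whatever order the export used."""
    return sorted(rows, key=lambda row: row[0])

# Illustrative values only (not real quotes), listed newest-first.
rows = [
    (date(2020, 1, 3), 3.0),
    (date(2020, 1, 2), 2.0),
    (date(2020, 1, 1), 1.0),
]

ordered = ensure_oldest_first(rows)
print([close for _, close in ordered])  # [1.0, 2.0, 3.0]
```

Sorting explicitly means the same preprocessing works for both newest-first and oldest-first sources, and it makes it harder for labels to end up looking in the wrong direction by accident.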