/Training-neural-networks-to-predict-crypto-price-movements

Here my TradingBot for crypto is unveiled. For any suggestions or hints, please keep me informed

Training-neural-networks-to-predict-crypto-price-movements

In the following, I want to download historical crypto data, calculate a classification of when is a good time to buy, export price data as CandleCharts, then perform supervised learning based on the classification, and ultimately then make predictions using the model

Classification

Load and prepare historical data from https://www.cryptodatadownload.com/data/. I choose the binance exchange

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt


url="https://www.cryptodatadownload.com/cdd/Binance_BTCUSDT_1h.csv"
data =pd.read_csv(url, skiprows=1) 
data.drop(['date', 'symbol', 'Volume USDT', 'tradecount'], axis=1, inplace = True) # Delete unnecessary data
data['unix'] = data['unix'].astype(str).str[:10] # Manual adjustment of the time dataset - If Unixtime has more than 10 digits, it must be divided by 1000
data['date'] =   pd.to_datetime(data.unix, unit='s') # Convert dates. Attention UNIX EPOCH time has three 0's too many.
data = data.iloc[::-1] # flip upside down
data.reset_index(drop=True, inplace = True) # Create a new index
data
unix open high low close Volume BTC date
0 1502946000 4308.83 4328.69 4291.37 4315.32 23.230000 2017-08-17 05:00:00
1 1502949600 4315.32 4345.45 4309.37 4324.35 7.230000 2017-08-17 06:00:00
2 1502953200 4324.35 4349.99 4287.41 4349.99 4.440000 2017-08-17 07:00:00

Here is a plot of the price development

# figure
fig, ax1 = plt.subplots(figsize=(15,6))
ax1.grid(color='black', linestyle='--', linewidth=0.1)
ax1.set_xlabel('date')
ax1.set_ylabel('Value in USD')
ax1.plot(data.date, data.close, label='BTC')
ax1.tick_params(axis='y')
ax1.legend()
fig.tight_layout()  # otherwise the right y-label is slightly clipped

Classification when a good entry point is in place

I would like to train the model so that he buys at good times if possible and sells when the price goes down. For this I have thought of the following logic:

If for time span x hours (e.g. 24h) the price increases by y percent (e.g. 2 %), then the range must be labeled as a good entry point However, since the price is very volatile, I calculate two moving averages

Create moving averages

average_length_current = 60 # in hours
average_length_future = 24  # in hours
distance_time =  2          # in hours x-axis
distance_value = 2          # in precent y-axis
distance_value = 1+(distance_value/100) # 

# create moving averages
data['average_length_current'] = data.loc[:,"close"].rolling(window=average_length_current).mean()
data['average_length_future'] =  data.loc[:,"close"].rolling(window=average_length_future).mean()
data.tail()
unix open high low close Volume BTC date average_length_current average_length_future
33351 1613505600 48536.35 48784.29 48173.81 48769.27 2584.417826 2021-02-16 20:00:00 48454.874000 48785.043333
33352 1613509200 48769.28 48809.09 48355.80 48575.07 2220.458707 2021-02-16 21:00:00 48445.553000 48800.357083
33353 1613512800 48575.06 49180.00 48524.65 49108.68 1763.412213 2021-02-16 22:00:00 48446.960833 48834.681667
33354 1613516400 49113.91 49290.00 48933.33 49133.45 1959.461793 2021-02-16 23:00:00 48448.265667 48885.612917
33355 1613520000 49133.45 49476.00 49133.44 49361.52 615.901013 2021-02-17 00:00:00 48447.010000 48956.133750

Plot Moving averages

# Moving Average
fig, ax1 = plt.subplots(figsize=(25,6))
ax1.grid(color='black', linestyle='--', linewidth=0.1)
ax1.set_ylabel('Value in USD')
ax1.plot(data.date[-900:], data.close[-900:], label='BTC')
ax1.plot(data.date[-900:], data.average_length_current[-900:], label='BTC-Average_length_current')
ax1.plot(data.date[-900:], data.average_length_future[-900:], label='BTC-Average_length_future')
ax1.tick_params(axis='y')
ax1.legend()
fig.tight_layout()  # otherwise the right y-label is slightly clipped

Decision making

If the future price increases by a certain percentage (distance_value), then mark the current price as a good buying opportunity. If the future price falls, then this should be taken as a selling argument. Caution: This works only because I have the historical data

Calculate buy or sell condition - 1 means buy / -1 means sell Decision formula

data['decision'] = 0 # initialization
for x in range(0, len(data)-distance_time-average_length_future):
    if data.average_length_current[x]*distance_value < data.average_length_future[x+distance_time+average_length_future]:
        data['decision'][x] = 1
    elif data.average_length_current[x]*distance_value > data.average_length_future[x+distance_time+average_length_future]:
        data['decision'][x] = -1
    else:
        data['decision'][x] = 0
data.tail()
unix open high low close Volume BTC date average_length_current average_length_future decision
33351 1613505600 48536.35 48784.29 48173.81 48769.27 2584.417826 2021-02-16 20:00:00 48454.874000 48785.043333 0
33352 1613509200 48769.28 48809.09 48355.80 48575.07 2220.458707 2021-02-16 21:00:00 48445.553000 48800.357083 0
33353 1613512800 48575.06 49180.00 48524.65 49108.68 1763.412213 2021-02-16 22:00:00 48446.960833 48834.681667 0
33354 1613516400 49113.91 49290.00 48933.33 49133.45 1959.461793 2021-02-16 23:00:00 48448.265667 48885.612917 0
33355 1613520000 49133.45 49476.00 49133.44 49361.52 615.901013 2021-02-17 00:00:00 48447.010000 48956.133750 0

This plot shows when was a good time to take a long position

# Decision
fig, ax1 = plt.subplots(figsize=(25,6))
ax1.grid(color='black', linestyle='--', linewidth=0.1)
ax1.set_ylabel('Deciscion')
ax1.plot(data.date[-900:], data.decision[-900:], label='Decision')
ax1.tick_params(axis='y')
plt.yticks(np.arange(-1,2))
y_ticks_labels = ['Sell','','Buy']
ax1.set_yticklabels(y_ticks_labels)
ax1.legend()
fig.tight_layout()  # otherwise the right y-label is slightly clipped

Here again with the overview the moving averages. The model works on the whole very well, but fast breakouts to the top are not well perceived

fig, ax1 = plt.subplots(figsize=(25,6))
ax1.grid(color='black', linestyle='--', linewidth=0.1)
color1 = 'tab:red'
ax1.set_xlabel('date')
ax1.set_ylabel('value in USD', color=color1)
ax1.plot(data.date[-900:], data.close[-900:], label='BTC', color=color1)
ax1.plot(data.date[-900:], data.average_length_current[-900:], label='BTC_ma_c')
ax1.plot(data.date[-900:], data.average_length_future[-900:], label='BTC_ma_f')
ax1.tick_params(axis='y', labelcolor=color1)
ax1.legend()
ax2 = ax1.twinx()  # instantiate a second axes that shares the same x-axis
color2 = 'tab:blue'
ax2.set_ylabel('decision', color=color2)  # we already handled the x-label with ax1
ax2.plot(data.date[-900:], data.decision[-900:], label='Decision', color=color2)
ax2.tick_params(axis='y', labelcolor=color2)
ax2.set_yticks(np.arange(-1,2))
y_ticks_labels = ['Sell','','Buy']
ax2.set_yticklabels(y_ticks_labels)
ax2.legend()
fig.tight_layout()  # otherwise the right y-label is slightly clipped

The beginning of the dataset can not be recommended, because no idea whether good or bad, you do not know which value (current or future) first starts therefore all 0 entries in the deciscion column will be deleted

data.drop(data.loc[data['decision']== 0].index, inplace=True)
data.reset_index(drop=True, inplace = True) # Create a new index
data
unix open high low close Volume BTC date average_length_current average_length_future decision
0 1503154800 4042.41 4063.44 3964.67 3972.05 8.150000 2017-08-19 15:00:00 4229.466500 4077.647500 -1
1 1503158400 3972.05 4049.69 3953.40 4000.00 3.750000 2017-08-19 16:00:00 4224.319333 4067.216667 -1
2 1503162000 4000.00 4048.99 3976.72 4027.37 3.070000 2017-08-19 17:00:00 4219.520167 4060.300833 -1
3 1503165600 4027.37 4096.00 4013.69 4086.29 16.960000 2017-08-19 18:00:00 4215.552500 4059.004167 -1
4 1503169200 4086.29 4103.92 4073.47 4076.12 1.620000 2017-08-19 19:00:00 4210.988000 4056.497500 -1
... ... ... ... ... ... ... ... ... ... ...
33266 1613412000 48562.31 48698.52 48342.22 48358.99 2040.898434 2021-02-15 18:00:00 47838.691333 47922.434583 -1
33267 1613415600 48361.00 48742.52 48354.78 48657.31 1945.428372 2021-02-15 19:00:00 47866.384500 47920.931250 -1
33268 1613419200 48657.31 48801.00 48453.87 48580.99 3063.153399 2021-02-15 20:00:00 47885.568833 47906.410000 -1
33269 1613422800 48581.00 48750.00 48111.11 48207.54 2767.082753 2021-02-15 21:00:00 47902.269000 47880.780000 1
33270 1613426400 48201.18 48334.23 47643.00 48284.89 2835.289841 2021-02-15 22:00:00 47928.531000 47850.444583 1

Next, I want to know how well this method works, so I calculate how my capital would develop

Create clear recommendation matrix with purchase date & purchase values

recommendation = pd.DataFrame((np.zeros((0,4))), columns = ['time', 'data_index', 'close', 'decision']) 
k = 0
for x in range(0, len(data)-1):
    if data.decision[x] > data.decision[x+1]:
        recommendation.loc[k,"time"] = data.date[x]
        recommendation.loc[k,"data_index"] = data.index[x]
        recommendation.loc[k,"close"] = data.close[x]
        recommendation.loc[k,"decision"] = -1
    if data.decision[x] < data.decision[x+1]:
        recommendation.loc[k,"time"] = data.date[x]
        recommendation.loc[k,"data_index"] = data.index[x]
        recommendation.loc[k,"close"] = data.close[x]
        recommendation.loc[k,"decision"] = 1
    k = k + 1
recommendation.reset_index(drop=True, inplace = True) # Create a new index
recommendation
time data_index close decision
0 2017-08-22 13:00:00 70.0 3920.99 1.0
1 2017-08-25 14:00:00 143.0 4305.00 -1.0
2 2017-08-28 14:00:00 215.0 4291.13 1.0
3 2017-09-01 12:00:00 309.0 4793.00 -1.0
4 2017-09-05 10:00:00 403.0 4169.76 1.0
... ... ... ... ...
352 2021-02-13 14:00:00 33214.0 46946.90 1.0
353 2021-02-14 04:00:00 33228.0 47551.25 -1.0
354 2021-02-15 12:00:00 33260.0 47951.01 1.0
355 2021-02-15 16:00:00 33264.0 48494.96 -1.0
356 2021-02-15 20:00:00 33268.0 48580.99 1.0

Clean up matrix so that it always starts with a buy (1) and ends with a sell

if recommendation.iloc[0,-1] == -1:
    recommendation.drop(recommendation.index[0], inplace=True)
if recommendation.iloc[-1,-1] == 1:
    recommendation.drop(recommendation.index[-1], inplace=True)
recommendation
time data_index close decision
0 2017-08-22 13:00:00 70.0 3920.99 1.0
1 2017-08-25 14:00:00 143.0 4305.00 -1.0
2 2017-08-28 14:00:00 215.0 4291.13 1.0
3 2017-09-01 12:00:00 309.0 4793.00 -1.0
4 2017-09-05 10:00:00 403.0 4169.76 1.0
... ... ... ... ...
352 2021-02-13 14:00:00 33214.0 46946.90 1.0
353 2021-02-14 04:00:00 33228.0 47551.25 -1.0
354 2021-02-15 12:00:00 33260.0 47951.01 1.0
355 2021-02-15 16:00:00 33264.0 48494.96 -1.0

Summary of capital development

capital_development = pd.DataFrame((np.zeros((0,7))), columns = ['buy_time', 'buy_data_index', 'buy_close', 'sell_time', 'sell_data_index', 'sell_close','performance'])
k = 0
for x in range(0, len(recommendation)):
   if recommendation.decision[x] == 1:
       capital_development.loc[k,"buy_time"] = recommendation.time[x]
       capital_development.loc[k,"buy_data_index"] = recommendation.data_index[x]
       capital_development.loc[k,"buy_close"] = recommendation.close[x]
       capital_development.loc[k,"sell_time"] = recommendation.time[x+1]
       capital_development.loc[k,"sell_data_index"] = recommendation.data_index[x+1]
       capital_development.loc[k,"sell_close"] = recommendation.close[x+1]
       capital_development.loc[k,"performance"] = capital_development.loc[k,"sell_close"]/capital_development.loc[k,"buy_close"]
   k = k + 1
capital_development.reset_index(drop=True, inplace = True) # Create a new index
capital_development
buy_time buy_data_index buy_close sell_time sell_data_index sell_close performance
0 2017-08-22 13:00:00 70.0 3920.99 2017-08-25 14:00:00 143.0 4305.00 1.097937
1 2017-08-28 14:00:00 215.0 4291.13 2017-09-01 12:00:00 309.0 4793.00 1.116955
2 2017-09-05 10:00:00 403.0 4169.76 2017-09-07 16:00:00 457.0 4705.03 1.128369
3 2017-09-15 10:00:00 643.0 2950.00 2017-09-16 03:00:00 660.0 3770.00 1.277966
4 2017-09-16 14:00:00 671.0 3630.00 2017-09-19 16:00:00 745.0 3961.90 1.091433
... ... ... ... ... ... ... ...
173 2021-02-01 16:00:00 32929.0 33460.83 2021-02-06 16:00:00 33049.0 40608.76 1.213621
174 2021-02-07 13:00:00 33070.0 38629.26 2021-02-10 08:00:00 33137.0 46668.89 1.208123
175 2021-02-10 21:00:00 33150.0 44999.09 2021-02-12 16:00:00 33192.0 47614.40 1.058119
176 2021-02-13 14:00:00 33214.0 46946.90 2021-02-14 04:00:00 33228.0 47551.25 1.012873
177 2021-02-15 12:00:00 33260.0 47951.01 2021-02-15 16:00:00 33264.0 48494.96 1.011344

If capital development performance is below a threshold value, this entry is deleted again, so that this is not learned by the DL. Changes in equity is in column 7, find the value that is smaller than the target value, then delete the entire line.

capital_development = capital_development[capital_development.performance > distance_value]
capital_development.reset_index(drop=True, inplace = True) # Create a new index
capital_development
buy_time buy_data_index buy_close sell_time sell_data_index sell_close performance
0 2017-08-22 13:00:00 70.0 3920.99 2017-08-25 14:00:00 143.0 4305.00 1.097937
1 2017-08-28 14:00:00 215.0 4291.13 2017-09-01 12:00:00 309.0 4793.00 1.116955
2 2017-09-05 10:00:00 403.0 4169.76 2017-09-07 16:00:00 457.0 4705.03 1.128369
3 2017-09-15 10:00:00 643.0 2950.00 2017-09-16 03:00:00 660.0 3770.00 1.277966
4 2017-09-16 14:00:00 671.0 3630.00 2017-09-19 16:00:00 745.0 3961.90 1.091433
... ... ... ... ... ... ... ...
148 2021-01-24 12:00:00 32733.0 32362.31 2021-01-25 04:00:00 32749.0 33378.93 1.031414
149 2021-01-28 01:00:00 32818.0 30881.94 2021-01-30 07:00:00 32872.0 34046.80 1.102483
150 2021-02-01 16:00:00 32929.0 33460.83 2021-02-06 16:00:00 33049.0 40608.76 1.213621
151 2021-02-07 13:00:00 33070.0 38629.26 2021-02-10 08:00:00 33137.0 46668.89 1.208123
152 2021-02-10 21:00:00 33150.0 44999.09 2021-02-12 16:00:00 33192.0 47614.40 1.058119

If you now multiply every entry of the performance column with each other (product), you get the amount by how much your investment has multiplied. However, with each purchase and sale also fees arise, which were not related here.

round(capital_development['performance'].product(), 2)
3336150.44

Now the values that indicate a good buying opportunity have to be finally classified in order to plot images on them. For this purpose I have included the index of the imported data in the DataFramce capital_development

data['decision_final'] = -1 # Create a new column with initiated sell signal 
for x in range(0, len(capital_development)):
    data.loc[capital_development.iloc[x,1]:capital_development.iloc[x,4],"decision_final"] = 1

Plot

# Decision
fig, ax1 = plt.subplots(figsize=(25,6))
ax1.grid(color='black', linestyle='--', linewidth=0.1)
ax1.set_ylabel('Deciscion')
ax1.plot(data.date[-900:], data.decision[-900:], label='decision')
ax1.plot(data.date[-900:], data.decision_final[-900:], label='decision_final')
ax1.tick_params(axis='y')
plt.yticks(np.arange(-1,2))
y_ticks_labels = ['Sell','','Buy']
ax1.set_yticklabels(y_ticks_labels)
ax1.legend()
fig.tight_layout()  # otherwise the right y-label is slightly clipped

# save pandas as CSV
file_name = "classification.csv"
data.to_csv (file_name, index = False, header=True)

After that I want to create figures to train the model. For this, I again look at individual time windows and create mapping from them. Through the calculated decision support, I can then assign to the model whether a good or bad opportunity for a long position exists.

Create figures and train model

import os
from os import listdir
from os import makedirs
from os.path import isfile, join
import sys

import pandas as pd
import numpy as np
import math
import shutil

from datetime import datetime
import datetime 
from PIL import Image

import uuid
#plt.style.use('ggplot')
#%matplotlib inline

import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import matplotlib.dates as mpl_dates
import matplotlib.pyplot as plt
#from matplotlib import pyplot

from mpl_finance import candlestick_ohlc

from keras.utils import to_categorical
from keras.applications.vgg16 import VGG16
from keras.models import Model
from keras.layers import Dense
from keras.layers import Flatten
from keras.optimizers import SGD
from keras.preprocessing.image import ImageDataGenerator

Folders must be properly set up

path_ = 'D:/your path where you want to store your figures/'

average_length_current = 60 # in hours
average_length_future = 24  # in hours
distance_time =  2          # in hours x-axis
distance_value = 2          # in precent y-axis
distance_value = 1+(distance_value/100) # 
parameter = ['60_24_2_1.02'] # Folder [average_length_current average_length_future distance_time distance_value]

for i in parameter:
    path_parameter = path_ + i + '/'
    print('Subdirectories were created in ' + path_parameter)
    subdirs = ['export/', 'train/', 'validation/']
    for subdir in subdirs:
        labeldirs = ['Var1/', 'Var2/']
        for labldir in labeldirs:
            newdir = path_parameter + subdir + labldir
            makedirs(newdir, exist_ok=True)
    subdirs = ['chrono/']
    for subdir in subdirs:
        newdir = path_parameter + subdir
        makedirs(newdir, exist_ok=True)

The classification file that was created earlier is loaded here again

csv_file = 'classification.csv'

# Load data
float_formatter = "{:.2f}".format
np.set_printoptions(formatter={'float_kind':float_formatter})
dataset = []
dataset = pd.read_csv(csv_file, index_col=False)
dataset['date'] = pd.to_datetime(dataset['date']) # convert object / string to datetime

Candlestick chart must be created to train the neural network

For this purpose, the past 60 hours are considered and a figure is created from them. For a better result, the image should be filled with more features, here I added the moving averages. But also indicators like resistances or trend channels can be added.

Images that represent a good buy or sell signal are stored in different folders depending on their previous classification.

for k in range(average_length_current-1,len(dataset)):
    extract = dataset.iloc[k-average_length_current+1:k+1]
    f = lambda x: mdates.date2num(datetime.datetime.fromtimestamp(x)) # damit candlestick funktioniert, muss die Zeit in matplotlib umgewandelt werden
    fig, ax = plt.subplots(num=1, figsize=(3, 3), dpi=80, facecolor='w', edgecolor='k')
    ohlc = list(zip(extract['unix'].apply(f), extract['open'],extract['high'],extract['low'],extract['close'], extract['Volume BTC']))
    candlestick_ohlc(ax, ohlc, colorup='#77d879', colordown='#db3f3f', width = 0.05, alpha=0.5)
    plt.plot(extract['date'],extract['average_length_current'], color="b", linewidth=5, alpha=0.5)
    plt.plot(extract['date'],extract['average_length_future'], color="c", linewidth=5, alpha=0.5)
    ############### Add more identifying features ###############
    plt.autoscale()
    plt.axis('off')
    # Save the figures according to the classification in Buy and Sell
    if extract.iloc[-1,-1] == -1: #sell 
        plt.savefig(path_parameter + 'export/Var1/' + str(uuid.uuid4())+'.jpg') # assign random names so that no order can be determined
    elif extract.iloc[-1,-1] == 1: #buy
        plt.savefig(path_parameter + 'export/Var2/' + str(uuid.uuid4())+'.jpg') # assign random names so that no order can be determined
    plt.cla()
    plt.clf()

For my data set up to and including February 17, 2021, 9.269 figures have now been created as a buy signal and 23,943 as a sell signal. An example figure is shown below

Adjusted all images to the correct format

In order for the images to be used for training, they must be put into a proper format

new_hight = 224
new_width = 224
var = ['export/var1/', 'export/var2/']
def resize():
        for item in dirs:
            if os.path.isfile(path+item):
                im = Image.open(path+item)
                f, e = os.path.splitext(path+item)
                width, height = im.size  
                if width != new_width:
                    if height != new_hight:
                        #print(width, height) 
                        imResize = im.resize((new_hight,new_width), Image.ANTIALIAS)
                        imResize.save(f + '.jpg', 'JPEG', quality=100)

for x in var:
    path = path_parameter + x
    dirs = os.listdir(path)
    resize()

Start storing images in the right way

80% of the data is used for training. 20% (split_factor) is used for validation, for which the images are moved to the appropriate folders

var = ['var1', 'var2']
split_factor = 0.2
for x in var:
    exportdir = path_parameter + 'export/' + x + '/'
    traindir = path_parameter + 'train/' + x + '/'
    validir  = path_parameter + 'validation/' + x + '/'

    # cut the files from the export and paste them into the training folder
    FileList = [f for f in listdir(exportdir) if isfile(join(exportdir, f))]
    Quantity_to_move = math.ceil(len(FileList))
    for x in range(0, Quantity_to_move):
        source_name = (exportdir + FileList[x])
        target_name = (traindir + FileList[x])
        shutil.move(source_name, target_name)
    # cut 20% of the files from trian and paste them into validation folder
    FileList = [f for f in listdir(traindir) if isfile(join(traindir, f))]
    Quantity_to_move = math.ceil(len(FileList)*split_factor)
    for x in range(0, Quantity_to_move):
        source_name = (traindir + FileList[x])
        target_name = (validir + FileList[x])
        shutil.move(source_name, target_name)

Train Model

To train the model, I take very good pre-trained model (VGG16). Then the model is trained based on the saved figures

def define_model():
   model = VGG16(include_top=False, input_shape=(224, 224, 3)) # load model
   for layer in model.layers: # mark loaded layers as not trainable
       layer.trainable = False # mark loaded layers as not trainable
   flat1 = Flatten()(model.layers[-1].output) # add new classifier layers
   class1 = Dense(128, activation='relu', kernel_initializer='he_uniform')(flat1)
   output = Dense(1, activation='sigmoid')(class1)
   model = Model(inputs=model.inputs, outputs=output) # define new model
   opt = SGD(lr=0.001, momentum=0.9) # compile model
   model.compile(optimizer=opt, loss='binary_crossentropy', metrics=['accuracy'])
   return model

def run_test_harness():
   model = define_model() # define model
   datagen = ImageDataGenerator(featurewise_center=True) # create data generator
   datagen.mean = [123.68, 116.779, 103.939] # specify imagenet mean values for centering
   train_it = datagen.flow_from_directory(path_parameter + 'train/', 
       class_mode='binary', batch_size=64, target_size=(224, 224)) # prepare iterator
   test_it = datagen.flow_from_directory(path_parameter + 'validation/',
       class_mode='binary', batch_size=64, target_size=(224, 224))
   history = model.fit_generator(train_it, steps_per_epoch=len(train_it),
       validation_data=test_it, validation_steps=len(test_it), epochs=epochs, verbose=1) # fit model
   _, acc = model.evaluate_generator(test_it, steps=len(test_it), verbose=0) # evaluate model
   print('> %.3f' % (acc * 100.0))
   model.save(path_parameter + '_model.h5') # save model

epochs = 4
run_test_harness() # entry point, run the test harness
print(str(datetime.datetime.now()) + ' Model trained')

Found 26569 images belonging to 2 classes.

Found 6643 images belonging to 2 classes.

Epoch 1/8 416/416 [==============================] - 94s 226ms/step - loss: 0.4015 - accuracy: 0.8547 - val_loss: 0.2106 - val_accuracy: 0.8919

Epoch 2/8 416/416 [==============================] - 94s 226ms/step - loss: 0.1918 - accuracy: 0.9223 - val_loss: 0.1305 - val_accuracy: 0.9572

Epoch 3/8 416/416 [==============================] - 113s 271ms/step - loss: 0.1066 - accuracy: 0.9592 - val_loss: 0.1676 - val_accuracy: 0.9619

Epoch 4/8 416/416 [==============================] - 94s 226ms/step - loss: 0.0813 - accuracy: 0.9685 - val_loss: 0.0259 - val_accuracy: 0.9679

Epoch 5/8 416/416 [==============================] - 94s 227ms/step - loss: 0.0591 - accuracy: 0.9785 - val_loss: 0.1162 - val_accuracy: 0.9658

Epoch 6/8 416/416 [==============================] - 94s 227ms/step - loss: 0.0539 - accuracy: 0.9800 - val_loss: 0.0215 - val_accuracy: 0.9771

Epoch 7/8 416/416 [==============================] - 95s 228ms/step - loss: 0.0343 - accuracy: 0.9873 - val_loss: 0.0086 - val_accuracy: 0.9818

Epoch 8/8 416/416 [==============================] - 95s 227ms/step - loss: 0.0271 - accuracy: 0.9902 - val_loss: 0.0266 - val_accuracy: 0.9872

98.720

Actual Prediction

import os
from os import listdir
from os import makedirs
from os.path import isfile, join
import sys

import pandas as pd
import numpy as np
import math
import shutil
import datetime
from PIL import Image

import uuid
#plt.style.use('ggplot')
#%matplotlib inline

import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import matplotlib.dates as mpl_dates
import matplotlib.pyplot as plt
#from matplotlib import pyplot

from mpl_finance import candlestick_ohlc
import datetime  
from keras.utils import to_categorical
from keras.applications.vgg16 import VGG16
from keras.models import Model
from keras.layers import Dense
from keras.layers import Flatten
from keras.optimizers import SGD
from keras.preprocessing.image import ImageDataGenerator

from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array
from keras.models import load_model
# Initialization
average_length_current = 60 # in hours
average_length_future = 24  # in hours
distance_time =  2          # in hours x-axis
distance_value = 2          # in precent y-axis
distance_value = 1+(distance_value/100) # 

path_ = 'C:/Users/andre/Bilder/DeepLearning/'
path_ = 'D:/finance/crypto/'
parameter = ['60_24_2_1.02'] # Folder [average_length_current average_length_future distance_time distance_value]
for i in parameter:
    path_parameter = path_ + i + '/'

# csv_file = 'classification.csv'

# # Load data
# float_formatter = "{:.2f}".format
# np.set_printoptions(formatter={'float_kind':float_formatter})
# dataset = []
# dataset = pd.read_csv(csv_file, index_col=False)
# dataset['date'] = pd.to_datetime(dataset['date']) # convert object / string to datetime

url="https://www.cryptodatadownload.com/cdd/Binance_BTCUSDT_1h.csv"
data =pd.read_csv(url, skiprows=1) 
data.drop(['date', 'symbol', 'Volume USDT', 'tradecount'], axis=1, inplace = True) # Delete unnecessary data
data['unix'] = data['unix'].astype(str).str[:10] # Manual adjustment of the time dataset - If Unixtime has more than 10 digits, it must be divided by 1000
data['unix'] = data.unix.astype(int)
data['date'] =   pd.to_datetime(data.unix, unit='s') # Convert dates. Attention UNIX EPOCH time has three 0's too many.
data = data.iloc[::-1] # flip upside down
data.reset_index(drop=True, inplace = True) # Create a new index

average_length_current = 60 # in hours
average_length_future = 24  # in hours
distance_time =  2          # in hours x-axis
distance_value = 2          # in precent y-axis
distance_value = 1+(distance_value/100) # 

# create moving averages
data['average_length_current'] = data.loc[:,"close"].rolling(window=average_length_current).mean()
data['average_length_future'] =  data.loc[:,"close"].rolling(window=average_length_future).mean()

Load the current course. The hourly data can be obtained via APIs from exchanges. However, you must create the same figures with the same features.

extract = data.iloc[-average_length_current:]
f = lambda x: mdates.date2num(datetime.datetime.fromtimestamp(x)) # for candlestick to work, time must be converted to matplotlib
fig, ax = plt.subplots(num=1, figsize=(3, 3), dpi=80, facecolor='w', edgecolor='k')
ohlc = list(zip(extract['unix'].apply(f), extract['open'],extract['high'],extract['low'],extract['close'], extract['Volume BTC']))
candlestick_ohlc(ax, ohlc, colorup='#77d879', colordown='#db3f3f', width = 0.05, alpha=0.5)
plt.plot(extract['date'],extract['average_length_current'], color="b", linewidth=5, alpha=0.5)
plt.plot(extract['date'],extract['average_length_future'], color="c", linewidth=5, alpha=0.5)

plt.autoscale()
plt.axis('off')
plt.savefig(path_parameter + 'chrono/' + str(extract.iloc[-1,0]) + '.jpg')
plt.cla() #Reset figure content
plt.clf() #Reset figure content

The DeepLearning model does not yet know this image and will now make a prediction whether it is a good or a bad opportunity to shop

# From here the actual prediction starts
model = load_model(path_parameter + '_model.h5') # load model
figures_path = path_parameter + 'chrono/' #load the path where the images are stored
file_list = [f for f in listdir(figures_path) if isfile(join(figures_path, f))] #erstelle Dateiliste aller Namen im Chrono-Ordner
file_number = math.ceil(len(file_list)) #Anzahl der vorhandenen Bilder im Chrono-Ordner


# initiate an output table
output = pd.DataFrame((np.zeros((0,3))), columns = ['unix', 'time', 'prediction']) 


# load and prepare the image
def load_image(file_name):
    # load the image
    img = load_img(file_name, target_size=(224, 224)) # prepare
    img = img_to_array(img) # convert to array
    img = img.reshape(1, 224, 224, 3) # reshape into a single sample with 3 channels
    return img

# prediction
for x in range(0, file_number):
    figure_name = figures_path + file_list[x]
    print(figure_name)
    img = load_image(figure_name)
    output.loc[x,"unix"] = str(file_list[x])[:-4]
    output.loc[x,"time"] = pd.to_datetime(str(file_list[x])[:-4], unit='s')
    output.loc[x,"prediction"] = model.predict(img)[0][0]
    
# save pandas as CSV
file_name = "output_prediction.csv"
output.to_csv (file_name, index = False, header=True)

If the prediction-score goes more in the direction of 1, it implies that it is a good time to open a long position. If the result is going to be 0, then it is better to give up the position.

unix time prediction
0 1613779200 2021-02-20 00:00:00 0.997728

Happy trading