fabiodimarco/tf-levenberg-marquardt

How to save this model and load weights?

advanced-bencoding opened this issue · 12 comments

Hello, I'm having trouble saving and loading weights and get the error below. Please help.

Model <levenberg_marquardt.ModelWrapper object at 0x000001BBE6EFFD90> cannot be saved because the input shape is not available. Please specify an input shape either by calling build(input_shape) directly, or by calling the model on actual data using Model(), Model.fit(), or Model.predict()

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.15, random_state=42)

tf.random.set_seed(42)

path_checkpoint = "training_1/cp.ckpt"
directory_checkpoint = os.path.dirname(path_checkpoint)

callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=path_checkpoint,
    save_best_only=True,
    monitor="loss",
    verbose=1)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation="relu"),
    tf.keras.layers.Dense(1)
])

model.build((None, 9))

model_wrapper = lm.ModelWrapper(model)

model_wrapper.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.1),
    loss=lm.MeanSquaredError())

model_wrapper.fit(X_train, y_train, epochs=1000, callbacks=[callback])

Hi, this last code you posted seems correct. It is missing the part where you load back the weights of the model. You could add something like this:

model_wrapper = lm.ModelWrapper(model)

# load model checkpoint
if path_checkpoint is not None:
    if os.path.isfile(path_checkpoint + ".index"):
        model_wrapper.load_weights(path_checkpoint)
    else:
        print(f"No pretrained model found at {path_checkpoint}")
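The ".index" suffix in the check above is there because a TensorFlow-format checkpoint prefix such as "training_1/cp.ckpt" is not a file itself; the weights are written as shard files next to a `<prefix>.index` file. A minimal sketch of that check as a reusable helper follows. The name `checkpoint_exists` is hypothetical, and the placeholder files only stand in for a real checkpoint.

```python
import os
import tempfile


def checkpoint_exists(prefix: str) -> bool:
    """Return True if a TensorFlow-format checkpoint with this prefix is on disk.

    `prefix` (e.g. "training_1/cp.ckpt") is not itself a file; the weights live
    in shard files next to a `<prefix>.index` file, so we test for the latter.
    """
    return os.path.isfile(prefix + ".index")


# Demonstration with empty placeholder files standing in for a real checkpoint.
with tempfile.TemporaryDirectory() as tmp:
    prefix = os.path.join(tmp, "cp.ckpt")
    before = checkpoint_exists(prefix)        # nothing written yet
    for suffix in (".index", ".data-00000-of-00001"):
        open(prefix + suffix, "w").close()
    after = checkpoint_exists(prefix)         # index file is now present
    prefix_is_file = os.path.isfile(prefix)   # the bare prefix is never a file

print(before, after, prefix_is_file)
```

Testing `os.path.isfile(path_checkpoint)` directly would therefore report a missing checkpoint even after a successful save in this format.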

Let me know if that solves your problem.

There are some things I don't understand. When I save, do I also have to save my original model, or just the wrapper model that you provide? Could you walk me through the steps?

WARNING:absl:Found untraced functions such as _update_step_xla while saving (showing 1 of 1). These functions will not be directly callable after loading.
INFO:tensorflow:Assets written to: ./deploy/mymodel\assets
INFO:tensorflow:Assets written to: ./deploy/mymodel\assets
This is shown when I save the wrapper.

I'm not sure about that warning; I am not getting it. Could you please provide the complete code that you are running?
The tf.keras.callbacks.ModelCheckpoint will save the ModelWrapper, which internally is similar to doing model_wrapper = tf.keras.Sequential([model]) with some overridden member functions.
So you are saving a Sequential that wraps your original model. However, when the model_wrapper is loaded back, you can access your original model through model_wrapper.model.
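As a rough illustration of the Sequential analogy above, the sketch below uses plain Keras only, not levenberg_marquardt itself, so it says nothing about the overridden member functions. The names `inner` and `outer` are made up for the example. A plain Sequential exposes the wrapped model through `layers[0]`, whereas the ModelWrapper exposes it as `model_wrapper.model`, as described above.

```python
import numpy as np
import tensorflow as tf

# The original model, with the same architecture as in this thread.
inner = tf.keras.Sequential([
    tf.keras.Input(shape=(9,)),
    tf.keras.layers.Dense(10, activation="relu"),
    tf.keras.layers.Dense(1),
])

# Stand-in for the wrapper: a Sequential whose only layer is the original model.
outer = tf.keras.Sequential([inner])

# Random inputs, for illustration only.
x = np.random.default_rng(0).normal(size=(4, 9)).astype(np.float32)

# The outer model holds a reference to the same object, not a copy...
same_object = outer.layers[0] is inner
# ...so both produce the same predictions from the same weights.
same_output = np.allclose(outer(x).numpy(), inner(x).numpy())

print(same_object, same_output)
```

Because the weights are shared, saving the outer model also captures the weights of the original model.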

Saving the whole model works correctly and gives me the same results. Saving and loading weights still gives me different predictions, so I will let you know as soon as I figure out how to save weights properly, and then I will close this issue.
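One way to narrow this kind of mismatch down is a round-trip check: record predictions, save the weights, load them into a freshly built model of the same architecture, and compare. The sketch below uses a plain Keras model rather than the ModelWrapper, so it only shows the shape of the check. The helper `make_model`, the random inputs, and the ".weights.h5" file name are assumptions made for the example.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


def make_model() -> tf.keras.Model:
    """Same architecture as in the thread: 9 inputs -> Dense(10, relu) -> Dense(1)."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(9,)),
        tf.keras.layers.Dense(10, activation="relu"),
        tf.keras.layers.Dense(1),
    ])


x = np.random.default_rng(0).normal(size=(8, 9)).astype(np.float32)

trained = make_model()            # stands in for the model after fit()
expected = trained(x).numpy()     # reference predictions before saving

with tempfile.TemporaryDirectory() as tmp:
    # The ".weights.h5" suffix is an assumption, chosen to work across Keras versions.
    weights_path = os.path.join(tmp, "model.weights.h5")
    trained.save_weights(weights_path)

    restored = make_model()       # fresh model with its own random initial weights
    restored.load_weights(weights_path)
    matches_after = np.allclose(restored(x).numpy(), expected)

print(matches_after)
```

If the same check fails with the wrapper, the next things to compare are whether the weights are loaded into an object with the same structure as the one that was saved, and whether both predictions use identically preprocessed inputs.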

ValueError: Model <levenberg_marquardt.ModelWrapper object at 0x000001DC606D2650> cannot be saved because the input shape is not available. Please specify an input shape either by calling build(input_shape) directly, or by calling the model on actual data using Model(), Model.fit(), or Model.predict().

This happens if I set save_weights_only=False in the same code I posted above.

When I call model_wrapper.build((None, 9)) before model_wrapper.fit, I get strange output: the loss is 0 at every epoch and nothing is saved after the first epoch.
So what is the input shape for the wrapper class? Is it something different?

No, the input shape for the model_wrapper has to be exactly the same as the original model's input shape.
Unlike standard training, I need the model to be built beforehand in order to initialise some data inside the LM Trainer.
If you provide the code, I can help you make it work.

I want the callback to save the entire model whenever the loss improves.

import tensorflow as tf
import numpy as np
import levenberg_marquardt as lm
import pandas as pd
import os
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split

path = './data'
cpi_df = pd.read_csv(path+'/cpi.csv')
crudeoil_df = pd.read_csv(path+'/crudeoil.csv')
em_df = pd.read_csv(path+'/emergingm.csv')
forex_df = pd.read_csv(path+'/forex.csv')
gold_df = pd.read_csv(path+'/gold.csv')
iip_df = pd.read_csv(path+'/industrial production.csv')
interest_df = pd.read_csv(path+'/interest.csv')
msci_df = pd.read_csv(path+'/msci.csv')
vix_df = pd.read_csv(path+'/vix.csv')
sensex_df = pd.read_csv(path+'/sensex.csv')

ann_data = sensex_df.join(cpi_df.iloc[:,1]).join(crudeoil_df.iloc[:,1]).join(em_df.iloc[:,1]).join(forex_df.iloc[:,1]).join(gold_df.iloc[:,1]).join(iip_df.iloc[:,1]).join(interest_df.iloc[:,1]).join(msci_df.iloc[:,1]).join(vix_df.iloc[:,1]).set_index('date')

X = ann_data.dropna().drop("sensex", axis=1).astype(np.float32)
y = ann_data.dropna()["sensex"].astype(np.float32)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.15, random_state=42)

tf.random.set_seed(42)

path_checkpoint = "training_10/cp.ckpt"
directory_checkpoint = os.path.dirname(path_checkpoint)

callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=path_checkpoint,
    save_best_only=True,
    save_weights_only=False,
    monitor="loss",
    verbose=1)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation="relu"),
    tf.keras.layers.Dense(1)
])

model.build((None, 9))

model_wrapper = lm.ModelWrapper(model)

model_wrapper.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.1),
    loss=lm.MeanSquaredError())
model_wrapper.fit(X_train, y_train, epochs=500, batch_size=32, callbacks=[callback])
model_wrapper.evaluate(X_test, y_test)

plt.plot(model_wrapper.predict(X_test), 'o')
plt.plot(y_test, 'o')
plt.legend(["preds", "actuals"])
plt.show()

cpi.csv
crudeoil.csv
emergingm.csv
forex.csv
gold.csv
industrial production.csv
interest.csv
msci.csv
sensex.csv
vix.csv

There is a problem in my code when save_weights_only=False is set in tf.keras.callbacks.ModelCheckpoint.

I will try to fix that in the next few days; I think I know how to do it.

However, it works on my side if you set save_weights_only=True:

callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=path_checkpoint,
    save_best_only=True,
    save_weights_only=True,
    monitor="loss",
    verbose=0)

By the way, I think you need to normalize your data in some way. At the moment the losses are huge and training convergence is very unstable.
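A minimal sketch of one common option, per-column standardization, is shown below. It uses synthetic data in place of the CSV files, the helper names `fit_standardizer` and `standardize` are made up for the example, and other schemes such as min-max scaling may suit the data just as well.

```python
import numpy as np


def fit_standardizer(train: np.ndarray):
    """Compute per-column mean and standard deviation from the training data only."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std = np.where(std == 0.0, 1.0, std)  # avoid dividing by zero for constant columns
    return mean, std


def standardize(data: np.ndarray, mean: np.ndarray, std: np.ndarray) -> np.ndarray:
    """Shift and scale each column with previously fitted statistics."""
    return (data - mean) / std


# Synthetic stand-in for features on very different scales.
rng = np.random.default_rng(42)
loc = [50_000.0, 3.5, 120.0]
scale = [8_000.0, 0.4, 15.0]
X_train = rng.normal(loc=loc, scale=scale, size=(200, 3))
X_test = rng.normal(loc=loc, scale=scale, size=(40, 3))

mean, std = fit_standardizer(X_train)
X_train_n = standardize(X_train, mean, std)
X_test_n = standardize(X_test, mean, std)  # reuse the training statistics

print(X_train_n.mean(axis=0).round(6), X_train_n.std(axis=0).round(6))
```

The target can be scaled the same way, in which case predictions need to be mapped back with `pred * std + mean` using the target's own statistics before comparing them with the actual values.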

Let me know if setting save_weights_only=True solves your problem.