Problem with fit function
Closed this issue · 5 comments
Hi, thanks a lot for helping me, I'm really struggling with this homework.
I'm a CS student with fairly limited coding experience and I'm new to deep learning, so I asked some classmates for help with this Captcha recognition project. They watched your tutorial and got your method working, but none of them has run into this issue.
I'm running my code on Google Colab, and here are some details of my implementation:
training code: https://colab.research.google.com/drive/1scQlm4hHoxGjS74537kAcELKGrGdLKxA?usp=sharing
mltu folder: https://drive.google.com/drive/folders/1V1ozlK1CmoYH8vSHuZaIeaHkII3NeioQ?usp=sharing
dataset: https://drive.google.com/drive/folders/1o49WI0O4x1HIU54eFuhovvS1g0aK5UTo?usp=sharing
- I uploaded the dataset provided by my professor and the mltu-1.0.8 folder to my Google Drive.
- I used these two lines of code in my Colab notebook to get access to my Google Drive:
from google.colab import drive
drive.mount('/content/drive/', force_remount=True)
- My classmates were using a Linux environment with Python 3.9.16, and after some trial and error they found that some additional libraries had to be installed, hence these two cells in my training.ipynb file:
!pip install "PyYAML>=6.0"
!pip install tqdm
!pip install pandas
!pip install numpy
!pip install opencv-python
!pip install onnxruntime
!pip install librosa==0.9.2
!pip install matplotlib
!pip install onnx==1.12.0
!pip install tensorflow==2.10
!pip install tf2onnx
!apt-get install python3.9
!ln -sf /usr/bin/python3.9 /usr/local/bin/python
!python --version
- The rest are minor changes to the file paths (pointing to my Google Drive folder) and config parameters (self.vocab, self.height, self.width, etc.).
- All cells run without any errors until the last cell:
model.fit(
    train_data_provider,
    validation_data=val_data_provider,
    epochs=configs.train_epochs,
    callbacks=[earlystopper, checkpoint, trainLogger, reduceLROnPlat, tb_callback, model2onnx],
    workers=configs.train_workers
)
where the error popped up:
ValueError: Failed to find data adapter that can handle input: <class 'drive.MyDrive.mltu.dataProvider.DataProvider'>, <class 'NoneType'>
I'm guessing the problem comes from the environment (e.g. I didn't properly change the Python version, or the versions of Python, TensorFlow and Keras are not compatible), but I'm really not sure (sorry for my lack of skills).
If you need any other details of my implementation to find out what caused the error, please let me know. I really appreciate the help since the deadline of this homework is near.
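One way to test the environment guess above is to print the versions from inside a notebook cell. Note that `!python --version` runs a separate process, so it may report a different interpreter than the one actually executing the cells. The following is only a minimal sketch: `report_environment` is a hypothetical helper name, and the package list is just the three packages mentioned in this thread.

```python
import sys
from importlib import metadata


def report_environment(packages):
    """Return the running interpreter version and installed package versions."""
    report = {"python": ".".join(str(part) for part in sys.version_info[:3])}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed for the interpreter running this cell.
            report[name] = None
    return report


if __name__ == "__main__":
    # Package names taken from this thread; adjust as needed.
    for name, version in report_environment(["tensorflow", "keras", "mltu"]).items():
        print(f"{name}: {version}")
```

If the Python version printed here differs from what `!python --version` shows, the symlink only changed the shell's interpreter and not the notebook's.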
I have the same issue. Have you fixed it? If so, please share the fix with me.
OK, I tested it with mltu==1.0.10.
You don't need to clone the repository; you can just run pip install mltu==1.0.10 (I'll release version 1.0.11 now with a small fix).
The annotations had a few rows pointing to files that don't exist, and in configs you were using set(...), which you shouldn't be using.
This code worked for me:
import tensorflow as tf
try: [tf.config.experimental.set_memory_growth(gpu, True) for gpu in tf.config.experimental.list_physical_devices("GPU")]
except: pass
from keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau, TensorBoard
from mltu.tensorflow.dataProvider import DataProvider
from mltu.tensorflow.losses import CTCloss
from mltu.tensorflow.callbacks import Model2onnx, TrainLogger
from mltu.tensorflow.metrics import CWERMetric
from mltu.preprocessors import ImageReader
from mltu.transformers import ImageResizer, LabelIndexer, LabelPadding
from mltu.augmentors import RandomBrightness, RandomRotate, RandomErodeDilate
from mltu.annotations.images import CVImage
from model import train_model
from configs import ModelConfigs
import os
import pandas as pd
df_train = pd.DataFrame(pd.read_csv("Datasets/train/annotations.csv"))
X_train = df_train["filename"].to_numpy()
y_train = df_train["label"].to_numpy()
dataset = []
for i in range(len(df_train)):
    image_path = "Datasets/train/" + X_train[i]
    # Skip annotation rows whose image file is missing
    if not os.path.exists(image_path):
        print("File not found: " + image_path)
        continue
    label = y_train[i]
    dataset.append([image_path, label])
configs = ModelConfigs()
# Create a data provider for the dataset
data_provider = DataProvider(
    dataset=dataset,
    skip_validation=True,
    batch_size=configs.batch_size,
    data_preprocessors=[ImageReader(CVImage)],
    transformers=[
        ImageResizer(configs.width, configs.height),
        LabelIndexer(configs.vocab),
        LabelPadding(max_word_length=configs.max_text_length, padding_value=len(configs.vocab))
    ],
)
# Split the dataset into training and validation sets
train_data_provider, val_data_provider = data_provider.split(split = 0.9)
# Augment training data with random brightness, rotation and erode/dilate
train_data_provider.augmentors = [RandomBrightness(), RandomRotate(), RandomErodeDilate()]
# Creating TensorFlow model architecture
model = train_model(
    input_dim=(configs.height, configs.width, 3),
    output_dim=len(configs.vocab),
)
# Compile the model and print summary
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=configs.learning_rate),
    loss=CTCloss(),
    metrics=[CWERMetric(padding_token=len(configs.vocab))],
    run_eagerly=False
)
model.summary(line_length=110)
# Define path to save the model
os.makedirs(configs.model_path, exist_ok=True)
# Define callbacks
earlystopper = EarlyStopping(monitor="val_CER", patience=50, verbose=1)
checkpoint = ModelCheckpoint(f"{configs.model_path}/model.h5", monitor="val_CER", verbose=1, save_best_only=True, mode="min")
trainLogger = TrainLogger(configs.model_path)
tb_callback = TensorBoard(f"{configs.model_path}/logs", update_freq=1)
reduceLROnPlat = ReduceLROnPlateau(monitor="val_CER", factor=0.9, min_delta=1e-10, patience=20, verbose=1, mode="auto")
model2onnx = Model2onnx(f"{configs.model_path}/model.h5")
# Train the model
model.fit(
    train_data_provider,
    validation_data=val_data_provider,
    epochs=configs.train_epochs,
    callbacks=[earlystopper, checkpoint, trainLogger, reduceLROnPlat, tb_callback, model2onnx],
    workers=configs.train_workers
)
# Save training and validation datasets as csv files
train_data_provider.to_csv(os.path.join(configs.model_path, "train.csv"))
val_data_provider.to_csv(os.path.join(configs.model_path, "val.csv"))
configs.py
import os
from datetime import datetime
from mltu.configs import BaseModelConfigs

class ModelConfigs(BaseModelConfigs):
    def __init__(self):
        super().__init__()
        self.model_path = os.path.join("Models/test", datetime.strftime(datetime.now(), "%Y%m%d%H%M"))
        self.vocab = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
        self.height = 96
        self.width = 96
        self.max_text_length = 4
        self.batch_size = 64
        self.learning_rate = 1e-3
        self.train_epochs = 150
        self.train_workers = 20
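On the set(...) point: a likely reason to avoid it (an assumption, the thread does not spell it out) is that a label indexer needs every character to have a fixed position in the vocab, and a Python set has no positions and no guaranteed order. The sketch below uses a hypothetical stand-in, `index_label`, rather than mltu's actual LabelIndexer, and a hypothetical `build_vocab` helper for deriving an ordered vocab from labels.

```python
def index_label(label, vocab):
    """Hypothetical stand-in for a label indexer: map each character to its position in vocab."""
    return [vocab.index(char) for char in label if char in vocab]


def build_vocab(labels):
    """Build a deterministic vocab string: unique characters, sorted."""
    return "".join(sorted(set("".join(labels))))


vocab = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
print(index_label("ab1", vocab))  # [0, 1, 53]

# A set has no positions, so the same lookup cannot work on it.
try:
    index_label("ab1", set(vocab))
except AttributeError as error:
    print("set vocab fails:", error)

# If the vocab is derived from the labels, sort it so the order is stable between runs.
print(build_vocab(["b1a", "a2"]))  # 12ab
```

Keeping the vocab as a plain string (or a sorted list) also means the same index-to-character mapping can be reused at inference time.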
Next time, make sure you preprocess your data correctly.
Test it and let me know if I can close this issue.
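As a rough sketch of what such a preprocessing check could look like before training: the function below reads an annotations CSV with the `filename` and `label` columns used in this thread and separates usable rows from problematic ones. `validate_annotations` is a hypothetical helper, not part of mltu, and the three checks are only examples of what might be worth verifying.

```python
import csv
import os


def validate_annotations(csv_path, image_dir, vocab, max_text_length):
    """Split annotation rows into usable [path, label] pairs and a list of problems."""
    valid, problems = [], []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            image_path = os.path.join(image_dir, row["filename"])
            label = row["label"]
            if not os.path.exists(image_path):
                problems.append((row["filename"], "file not found"))
            elif len(label) > max_text_length:
                problems.append((row["filename"], "label longer than max_text_length"))
            elif any(char not in vocab for char in label):
                problems.append((row["filename"], "label has characters outside vocab"))
            else:
                valid.append([image_path, label])
    return valid, problems
```

The returned `valid` list has the same shape as the `dataset` list in the training script, so it could be passed to the data provider in its place, while `problems` can be printed and reviewed by hand.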
@pythonlessons It works perfectly with mltu==1.0.11! Thanks a lot for your help, you're a lifesaver.
Nice!