Error: StopIteration
CotarP opened this issue · 17 comments
Hi, the generator (train_gen) is empty. In your case, the problem seems to be that train_gen doesn't generate any data.
You can check if the data generator works properly (cell 22).
If the data generator works:
- perhaps your learning rate is too high?
- check that pixel values are in the interval [0., 1.]
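A quick way to confirm whether a generator yields anything is to pull a batch or two from it without letting StopIteration escape. The sketch below is only an illustration: peek_batches, empty_gen and toy_gen are hypothetical names, not part of the notebook, and it assumes train_gen can be iterated like an ordinary Python generator.

```python
from itertools import islice


def peek_batches(generator, n=1):
    """Return up to n batches from generator; an empty list means it yields nothing."""
    return list(islice(generator, n))


def empty_gen():
    # Hypothetical stand-in for a generator that has no data to yield.
    return
    yield  # unreachable, but makes this function a generator


def toy_gen():
    # Hypothetical stand-in for a working train_gen: yields (images, labels) pairs.
    for i in range(3):
        yield ([i], [i * 2])


print(len(peek_batches(empty_gen())))  # 0: calling next() on it would raise StopIteration
print(len(peek_batches(toy_gen())))    # 1
```

If the first call returns an empty list for the real train_gen, the problem is upstream of training (for example in the data loading), not in the learning rate.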
Thanks for the suggestions. With my input the data generator works properly.
Then I tried with your images and annotations, without changing any parameters. I still get the same result while training.
I don't know what could cause the notebook not to work with the same settings.
I also checked your images, and the pixel values are not between 0 and 1. Should I transform the images before training?
I also noticed that the Jupyter notebook says tensorflow 2.1.0, while your README says 2.0. I am now using 2.1, but I don't think that is the problem.
Are you using the tensorflow 2 notebook (Yolo_V2_tf_2.ipynb) or the tf 1 version (Yolo_V2_tf_eager.ipynb)?
Pixel values in the dataset are in the range [0, 255]. Cell 13 converts them to the range [0., 1.): tf.image.convert_image_dtype(x_img, tf.float32)
I don’t think the tensorflow version is causing the problem during training.
I will try to test the notebook on my side.
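For checking pixel values just before they enter the model, a small range check can help. The following is a minimal sketch using NumPy rather than the notebook's own code: to_unit_range is a plain division by 255 that only stands in for the conversion done in cell 13, and check_unit_range is a hypothetical helper name.

```python
import numpy as np


def to_unit_range(img_uint8):
    """Scale uint8 pixels in [0, 255] to float32 (plain division, an assumption)."""
    return img_uint8.astype(np.float32) / 255.0


def check_unit_range(batch):
    """Return (min, max) of a batch, raising if any value lies outside [0, 1]."""
    lo, hi = float(batch.min()), float(batch.max())
    if lo < 0.0 or hi > 1.0:
        raise ValueError(f"pixel values outside [0, 1]: min={lo}, max={hi}")
    return lo, hi


raw = np.array([[0, 128, 255]], dtype=np.uint8)  # toy "image" with extreme values
scaled = to_unit_range(raw)
print(check_unit_range(scaled))
```

With this plain division, a raw value of 255 maps to exactly 1.0, so the check accepts the closed interval [0, 1].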
I'm using the tensorflow 2 notebook version (Yolo_V2_tf_2.ipynb).
I cloned the repository on my computer and launched Yolo_V2_tf_2.ipynb. The training is working well. I am using tensorflow 2.4.1.
Please check the pixel values just before entering the model and try different learning rates.
I checked the pixel values in a few places and found that in most cases they range from 0 to 1, including 1.
And I am still using your data.
Ok.
- Does the notebook work well when you just clone the repository and run it?
- Can you try a lower learning rate: learning_rate = 1e-6 (cell 27: optimizer = tf.keras.optimizers.Adam(learning_rate=1e-6, beta_1=0.9, beta_2=0.999, epsilon=1e-08))
I saved the code from GitHub and opened it in Jupyter. The only thing I changed was the image/annotation directories.
I tried learning_rate = 1e-6 and also 10e-7, and it's still the same.
Does the notebook work correctly when you do not change the image/annotation directories?
It's the same.
I have no more ideas to solve this problem.
I think I found the problem. There must be a problem with the GPU 'connection', since it worked once I changed the code to use only the CPU. I didn't think of that before, since it looks like it can access the GPU (in cell 2).
If you still have any ideas, I would be happy to try them.
But either way, thank you for all your time and help.
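The thread does not show how the code was switched to CPU only. One common way, given here as an assumption rather than what was actually done, is to hide the GPUs from CUDA before TensorFlow is imported. visible_gpus is a hypothetical helper name.

```python
import os

# Assumption: this runs in the first notebook cell, before TensorFlow is imported.
# "-1" hides all CUDA devices from this process, so TensorFlow falls back to the CPU.
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"


def visible_gpus():
    """Return the GPUs TensorFlow can see; expected to be empty after the line above."""
    import tensorflow as tf  # imported late so the environment variable is already set
    return tf.config.list_physical_devices("GPU")
```

Calling visible_gpus() in a later cell is a simple way to confirm that training will run on the CPU. Removing the environment variable and restarting the kernel restores GPU use.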
I also noticed a huge difference in the performance of cell 22. When using the GPU, this cell needs a few minutes to compile, while on the CPU it's done immediately.
I'm glad you're close to a solution. On my computer, cell 22 runs very quickly with the GPU. Maybe you can check your graphics card driver installation.