AIWintermuteAI/aXeleRate

After installation of aXeleRate, tests_training_inference.py freezes

orossant opened this issue · 2 comments

Describe the bug
I have followed your very helpful and clear procedure explained here: https://www.instructables.com/Object-Detection-With-Sipeed-MaiX-BoardsKendryte-K/

After installing conda, activating an environment, and installing aXeleRate on my Mac, I ran some tests to check that everything works. I ran the following:
python ./tests_training_inference.py
python ./tests_training_inference.py -t classifier -a 'Tini Yolo'
python ./tests_training_inference.py -t classifier -a 'Full Yolo'
The same problem occurs every time: Epoch 1/5 completes, but Epoch 2/5 freezes at step 1/5.

To Reproduce
Steps to reproduce the behavior:
Follow the tutorial at https://www.instructables.com/Object-Detection-With-Sipeed-MaiX-BoardsKendryte-K/, then run the test commands listed above.

Expected behavior
All five epochs should run without any errors or freezes.

Screenshots
Epoch 1/5
5/5 [==============================] - 8s 1s/step - loss: 1.6890 - accuracy: 0.2889 - val_loss: 1.6095 - val_accuracy: 0.2000

Epoch 00001: val_accuracy improved from -inf to 0.20000, saving model to projects/classifier/2021-01-08_15-34-56/Classifier_best_val_accuracy.h5
Epoch 00000: Learning rate is 2.6666666666666667e-05.

Epoch 2/5
1/5 [=====>........................] - ETA: 6s - loss: 1.4810 - accuracy: 0.5000

Environment (please complete the following information):
Local environment: macOS Catalina, with Miniconda 3 installed and one dedicated environment activated:
conda create -n yolo python=3.7
conda activate yolo
pip install git+https://github.com/AIWintermuteAI/aXeleRate
Then, inside the aXeleRate folder:
python ./tests_training_inference.py

Additional context
I have found that many people report similar problems, but in different contexts:
https://www.google.com/search?client=firefox-b-e&q=keras+freeze+during+training
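One way to narrow down where a training run hangs is Python's standard `faulthandler` module, which can dump the tracebacks of all threads after a timeout. The following is only a minimal sketch: the helper names `arm_hang_watchdog` and `disarm_hang_watchdog` and the log path are hypothetical and not part of aXeleRate, and the dump shows Python-level frames only.

```python
import faulthandler


def arm_hang_watchdog(timeout_s, log_path):
    """Write the tracebacks of all threads to log_path every timeout_s seconds.

    Hypothetical helper, not part of aXeleRate. If training is stuck, the
    repeated dumps show which Python frames the process is sitting in.
    """
    log_file = open(log_path, "w")
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=log_file)
    return log_file


def disarm_hang_watchdog(log_file):
    """Cancel the pending dumps and close the log file."""
    faulthandler.cancel_dump_traceback_later()
    log_file.close()
```

Assuming the training entry point can be edited, the watchdog would be armed just before training starts (for example `log = arm_hang_watchdog(120, "hang_traceback.log")`) and disarmed once it returns.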

Additional info: the versions of all packages in my environment.

conda list

packages in environment at /Users/orossant/miniconda3/envs/yolo2:

Name Version Build Channel

absl-py 0.11.0 pypi_0 pypi
astunparse 1.6.3 pypi_0 pypi
axelerate 0.7.0 pypi_0 pypi
ca-certificates 2020.12.8 hecd8cb5_0
cachetools 4.2.0 pypi_0 pypi
certifi 2020.12.5 py37hecd8cb5_0
chardet 4.0.0 pypi_0 pypi
cycler 0.10.0 pypi_0 pypi
decorator 4.4.2 pypi_0 pypi
defusedxml 0.6.0 pypi_0 pypi
flatbuffers 1.12 pypi_0 pypi
gast 0.3.3 pypi_0 pypi
google-auth 1.24.0 pypi_0 pypi
google-auth-oauthlib 0.4.2 pypi_0 pypi
google-pasta 0.2.0 pypi_0 pypi
grpcio 1.32.0 pypi_0 pypi
h5py 2.10.0 pypi_0 pypi
idna 2.10 pypi_0 pypi
imageio 2.9.0 pypi_0 pypi
imgaug 0.4.0 pypi_0 pypi
importlib-metadata 3.3.0 pypi_0 pypi
jinja2 2.11.2 pypi_0 pypi
joblib 1.0.0 pypi_0 pypi
keras-preprocessing 1.1.2 pypi_0 pypi
kiwisolver 1.3.1 pypi_0 pypi
libcxx 10.0.0 1
libedit 3.1.20191231 h1de35cc_1
libffi 3.3 hb1e8313_2
markdown 3.3.3 pypi_0 pypi
markupsafe 1.1.1 pypi_0 pypi
matplotlib 3.3.3 pypi_0 pypi
ncurses 6.2 h0a44026_1
networkx 2.5 pypi_0 pypi
numpy 1.19.5 pypi_0 pypi
oauthlib 3.1.0 pypi_0 pypi
onnx 1.8.0 pypi_0 pypi
opencv-python 4.1.2.30 pypi_0 pypi
openssl 1.1.1i h9ed2024_0
opt-einsum 3.3.0 pypi_0 pypi
pascal-voc-writer 0.1.4 pypi_0 pypi
pillow 8.1.0 pypi_0 pypi
pip 20.3.3 py37hecd8cb5_0
protobuf 3.14.0 pypi_0 pypi
pyasn1 0.4.8 pypi_0 pypi
pyasn1-modules 0.2.8 pypi_0 pypi
pyparsing 2.4.7 pypi_0 pypi
python 3.7.9 h26836e1_0
python-dateutil 2.8.1 pypi_0 pypi
pywavelets 1.1.1 pypi_0 pypi
readline 8.0 h1de35cc_0
requests 2.25.1 pypi_0 pypi
requests-oauthlib 1.3.0 pypi_0 pypi
rsa 4.6 pypi_0 pypi
scikit-image 0.18.1 pypi_0 pypi
scikit-learn 0.24.0 pypi_0 pypi
scipy 1.6.0 pypi_0 pypi
setuptools 51.0.0 py37hecd8cb5_2
shapely 1.7.1 pypi_0 pypi
six 1.15.0 pypi_0 pypi
sklearn 0.0 pypi_0 pypi
sqlite 3.33.0 hffcf06c_0
tensorboard 2.4.0 pypi_0 pypi
tensorboard-plugin-wit 1.7.0 pypi_0 pypi
tensorflow 2.4.0 pypi_0 pypi
tensorflow-estimator 2.4.0 pypi_0 pypi
termcolor 1.1.0 pypi_0 pypi
tf2onnx 1.7.2 pypi_0 pypi
threadpoolctl 2.1.0 pypi_0 pypi
tifffile 2020.12.8 pypi_0 pypi
tk 8.6.10 hb0a8c7a_0
tqdm 4.55.1 pypi_0 pypi
typing-extensions 3.7.4.3 pypi_0 pypi
urllib3 1.26.2 pypi_0 pypi
werkzeug 1.0.1 pypi_0 pypi
wheel 0.36.2 pyhd3eb1b0_0
wrapt 1.12.1 pypi_0 pypi
xz 5.2.5 h1de35cc_0
zipp 3.4.0 pypi_0 pypi
zlib 1.2.11 h1de35cc_3

Hello!
Well, the main issue I'm seeing here is that you're using macOS. aXeleRate is meant to be run (and is only tested) on Linux (Ubuntu 18.04) and Google Colab. While training should theoretically work on both Windows and macOS, aXeleRate is primarily meant as a framework for training AND conversion of models to be run on embedded devices. The conversion step uses various converters; some of them (such as the Google Edge TPU model converter) do not run anywhere except Linux, and others (such as nncase) are unstable and buggy on Windows/macOS.
I need to add this to the README :)

Meanwhile, you can try running training in Google Colab, where you can use GPUs. Alternatively, if you want to run aXeleRate locally on a Mac, you can install and run it in a virtual machine.
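Until the README mentions this, a small platform check run before the tests could flag an untested operating system early. This is a minimal sketch under the assumptions stated in this comment (Linux and Google Colab, which runs on Linux, are the tested targets); the function name `platform_warning` is hypothetical and not part of aXeleRate.

```python
import platform

# Assumption taken from the comment above: only Linux (Ubuntu 18.04 and
# Google Colab) is tested.
TESTED_SYSTEMS = {"Linux"}


def platform_warning(system=None):
    """Return a warning string for untested platforms, or None on Linux.

    `system` defaults to platform.system(), which reports "Linux",
    "Darwin" (macOS) or "Windows".
    """
    system = system or platform.system()
    if system in TESTED_SYSTEMS:
        return None
    return (
        f"Detected {system}: aXeleRate is only tested on Linux. "
        "Training may work, but model conversion may fail or be unstable."
    )


if __name__ == "__main__":
    print(platform_warning() or "Platform is one of the tested systems.")
```

On a Mac, `platform.system()` returns `"Darwin"`, so the check would print the warning instead of letting the run proceed silently.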