Training stuck
empty2enrich opened this issue · 0 comments
After I run the training command, the program hangs. The last log output is as follows:
------------------------------------------ log-------------------------------------
WARNING:tensorflow:From /code/tensorflow-deeplab-v3-plus/deeplab_model.py:280: The name tf.metrics.mean_iou is deprecated. Please use tf.compat.v1.metrics.mean_iou instead.
W1020 11:33:01.788969 140706233194240 module_wrapper.py:139] From /code/tensorflow-deeplab-v3-plus/deeplab_model.py:280: The name tf.metrics.mean_iou is deprecated. Please use tf.compat.v1.metrics.mean_iou instead.
WARNING:tensorflow:From /miniconda3/envs/deeplabv3/lib/python3.6/site-packages/tensorflow_core/python/ops/metrics_impl.py:1178: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
W1020 11:33:01.858843 140706233194240 deprecation.py:323] From /miniconda3/envs/deeplabv3/lib/python3.6/site-packages/tensorflow_core/python/ops/metrics_impl.py:1178: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
WARNING:tensorflow:From /code/tensorflow-deeplab-v3-plus/deeplab_model.py:291: The name tf.diag_part is deprecated. Please use tf.linalg.tensor_diag_part instead.
W1020 11:33:01.867624 140706233194240 module_wrapper.py:139] From /code/tensorflow-deeplab-v3-plus/deeplab_model.py:291: The name tf.diag_part is deprecated. Please use tf.linalg.tensor_diag_part instead.
INFO:tensorflow:Done calling model_fn.
I1020 11:33:02.115318 140706233194240 estimator.py:1150] Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
I1020 11:33:02.117126 140706233194240 basic_session_run_hooks.py:541] Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
I1020 11:33:07.461160 140706233194240 monitored_session.py:240] Graph was finalized.
2022-10-20 11:33:07.461720: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2022-10-20 11:33:07.474594: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2397155000 Hz
2022-10-20 11:33:07.475820: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x556f71f84180 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2022-10-20 11:33:07.475844: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
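
The log stops after the XLA service lines and gives no hint about where the process is blocked. One possible way to find out, as a minimal sketch only, is to have Python's standard `faulthandler` module periodically dump the stack of every thread while the script is running. The helper names `enable_hang_dump` and `disable_hang_dump` and the 60-second default are assumptions made for illustration; they are not part of the repository or of TensorFlow.

```python
import faulthandler
import sys


def enable_hang_dump(timeout_s=60.0, stream=sys.stderr):
    """Hypothetical helper: periodically dump all Python thread stacks.

    If the process is still running `timeout_s` seconds after this call,
    faulthandler writes the current traceback of every thread to `stream`,
    and keeps doing so every `timeout_s` seconds (repeat=True).
    `stream` must be a real file object with a file descriptor.
    """
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=stream)


def disable_hang_dump():
    """Hypothetical helper: stop the periodic dumps started above."""
    faulthandler.cancel_dump_traceback_later()
```

Calling `enable_hang_dump()` near the top of the training script should make the hung process print its Python stacks to stderr once a minute, which would show whether it is waiting in the input pipeline, in session creation, or somewhere else. This only shows Python-level frames; a hang inside native TensorFlow code would appear as the Python call that entered it.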