PaddlePaddle/PaddleSeg

Training exits immediately during evaluation

seastronger opened this issue · 2 comments

Search before asking

Describe the Bug

2024-03-26 08:18:24 [INFO] [TRAIN] epoch: 1, iter: 10/100, loss: 1.6568, lr: 0.009186, batch_cost: 0.2519, reader_cost: 0.00900, ips: 15.8788 samples/sec | ETA 00:00:22
2024-03-26 08:18:25 [INFO] [TRAIN] epoch: 1, iter: 20/100, loss: 0.5181, lr: 0.008272, batch_cost: 0.1580, reader_cost: 0.03392, ips: 25.3199 samples/sec | ETA 00:00:12
2024-03-26 08:18:27 [INFO] [TRAIN] epoch: 1, iter: 30/100, loss: 0.2872, lr: 0.007347, batch_cost: 0.1593, reader_cost: 0.03512, ips: 25.1148 samples/sec | ETA 00:00:11
2024-03-26 08:18:29 [INFO] [TRAIN] epoch: 1, iter: 40/100, loss: 0.1943, lr: 0.006409, batch_cost: 0.1571, reader_cost: 0.03307, ips: 25.4556 samples/sec | ETA 00:00:09
2024-03-26 08:18:30 [INFO] [TRAIN] epoch: 1, iter: 50/100, loss: 0.2356, lr: 0.005455, batch_cost: 0.1564, reader_cost: 0.03290, ips: 25.5746 samples/sec | ETA 00:00:07
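2024-03-26 08:18:30 [INFO] Start evaluating (total_samples: 76, total_iters: 76)...
76/76 [==============================] - 3s 34ms/step - batch_cost: 0.0336 - reader cost: 3.1583e-04

I am training for 100 iterations and saving every 50 iterations. When training reached iteration 50, the evaluation ran and then the process exited immediately, with no error message.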
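For reference, here is a minimal sketch of the schedule described above. It assumes that a save/evaluation step runs whenever the iteration count is a multiple of `--save_interval`, or when the final iteration is reached; that rule is an assumption made for illustration and has not been checked against the PaddleSeg source here. The helper name `checkpoint_iters` is hypothetical.

```python
def checkpoint_iters(total_iters, save_interval):
    """Return the iterations at which a save/eval step would run.

    Assumed rule (not verified against PaddleSeg): trigger when the
    iteration is a multiple of save_interval, or is the last iteration.
    """
    return [
        it
        for it in range(1, total_iters + 1)
        if it % save_interval == 0 or it == total_iters
    ]


# Settings from this report: 100 iterations, save every 50.
print(checkpoint_iters(100, 50))  # [50, 100]
```

Under that assumption, the run should evaluate twice (at iterations 50 and 100), so exiting right after the first evaluation means the second half of training never ran.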

(paddle36) D:\BaiduNetdiskDownload\code\PaddleSeg-release-2.8>
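Since the prompt returns without a traceback, one way to gather more information is to run the training command from a small wrapper that reports the process exit code. The sketch below is only a diagnostic aid, not a fix. It enables Python's built-in `faulthandler` through the `-X faulthandler` interpreter option, which can print a Python-level traceback if the interpreter crashes in native code. The names `run_and_report` and `TRAIN_CMD` are hypothetical; the command arguments are copied from this issue.

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd, let its output stream to the console, and return
    (exit_code, exit_code_as_hex).

    The hex form is easier to compare against OS status codes.
    """
    proc = subprocess.run(cmd)
    code = proc.returncode
    # Mask to 32 bits so negative codes (POSIX signals) also print cleanly.
    return code, hex(code & 0xFFFFFFFF)


# Training command from this issue, with faulthandler enabled.
TRAIN_CMD = [
    sys.executable, "-X", "faulthandler",
    "tools/train.py",
    "--config", "configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml",
    "--save_interval", "50",
    "--do_eval",
    "--use_vdl",
    "--save_dir", "output",
]
```

Calling `run_and_report(TRAIN_CMD)` from the PaddleSeg root directory and printing the result shows whether the process ended with exit code 0 (a normal exit) or a non-zero code. A non-zero code would suggest a crash rather than a clean shutdown.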

Environment

This is my training command. It uses the example and dataset from the official website:

python tools/train.py --config configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml --save_interval 50 --do_eval --use_vdl --save_dir output

------------Environment Information-------------
platform: Windows-10-10.0.19041-SP0
Python: 3.6.13 |Anaconda, Inc.| (default, Mar 16 2021, 11:37:27) [MSC v.1916 64 bit (AMD64)]
Paddle compiled with cuda: True
NVCC: Build cuda_11.2.r11.2/compiler.29373293_0
cudnn: 8.1
GPUs used: 1
CUDA_VISIBLE_DEVICES: 0
GPU: ['GPU 0: NVIDIA GeForce']
PaddleSeg: 2.8.0
PaddlePaddle: 2.4.2
OpenCV: 4.5.5

This is my environment.

Bug description confirmation

  • I confirm that the bug reproduction steps, a description of code changes, and environment information have been provided, and that the problem can be reproduced.

Are you willing to submit a PR?

  • I'd like to help by submitting a PR!

This problem probably occurs only when running on Windows. It may not happen on Linux.