ultralytics/yolov5

Why is there a sudden drop in accuracy?

lqh964165950 opened this issue · 6 comments

Search before asking

Question

I have made some improvements to YOLOv5, but when training the improved model there is sometimes a sudden drop in mAP50, precision, and recall. I would like to know why this happens.
(Image: 曲线.png, training metric curves)

Additional

No response

👋 Hello @lqh964165950, thank you for reaching out about YOLOv5 🚀! This is an automated response to help guide you through resolving your issue. An Ultralytics engineer will also assist you soon.

For your question regarding the sudden drop in mAP50, precision, and recall, here are some general considerations:

  • If this is a 🐛 Bug Report, please provide a minimum reproducible example, including the code modifications you've made and any unique data that could help us debug the problem.

  • If this is a custom training ❓ Question, provide as much detail as possible, such as dataset image examples, specific model changes, and training logs, and verify that you are following the best practices for achieving optimal model performance.

Requirements

Ensure you have Python>=3.8.0 installed, along with all dependencies listed in the YOLOv5 requirements.
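As a quick, optional sanity check (a minimal sketch, not an official setup step), you can confirm the interpreter version and whether PyTorch can see a GPU before training:

```python
import sys
import torch

# YOLOv5 requires Python>=3.8.0
assert sys.version_info >= (3, 8), "Python version is too old for YOLOv5"

# Confirm PyTorch is installed and whether a CUDA device is visible (CPU-only also works, just slower)
print(f"torch {torch.__version__}, CUDA available: {torch.cuda.is_available()}")
```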

Environments

YOLOv5 can run in multiple environments, such as Notebooks with free GPU, Google Cloud Deep Learning VMs, Amazon Deep Learning AMIs, or using the Docker Image.

Status

Check the YOLOv5 CI (Continuous Integration) tests to ensure all operations like training, validation, inference, export, and benchmarks are functioning correctly. These tests occur every 24 hours and on every commit across different operating systems such as macOS, Windows, and Ubuntu.

Feel free to follow these steps, and let us know if you need more specific support based on your detailed findings! 😊

@lqh964165950 a sudden drop in mAP, precision, and recall during training can be due to several factors such as changes in dataset quality, learning rate fluctuations, or model overfitting. Ensure your dataset annotations are accurate and consistent. You might also want to experiment with different learning rates or utilize techniques like data augmentation to improve model generalization. If the issue persists, consider using pretrained weights as a starting point to stabilize training. You can find more details on these topics in the YOLO Common Issues guide.
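For illustration, here is a minimal sketch of starting a run from pretrained weights with the VisDrone config that ships with the repo, using the train.run() helper from yolov5/train.py; the epoch, batch, and image-size values below are placeholders, not recommendations:

```python
# Minimal sketch: fine-tune from pretrained COCO weights on the bundled VisDrone config.
# Assumes it is run from the yolov5 repository root; the numeric values are placeholders.
import train  # yolov5/train.py exposes a run() helper

train.run(
    data="data/VisDrone.yaml",             # dataset config shipped with the repo
    weights="yolov5s.pt",                  # start from pretrained weights to stabilize early training
    imgsz=640,
    epochs=100,
    batch_size=16,
    hyp="data/hyps/hyp.scratch-low.yaml",  # learning-rate and augmentation settings live here
)
```

Lowering the initial learning rate (lr0) or strengthening augmentation (mosaic, mixup, HSV jitter) in the hyp file are the usual first knobs to turn if metrics collapse mid-training.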

Since I am utilizing the public dataset VisDrone, I believe the dataset itself is not the source of the issue. Assuming the problem lies in model overfitting, how can I address this? Additionally, I would like to inquire whether running multiple tasks on the same GPU could be a factor. I've observed that when a server executes only one task, the recall rate remains normal. However, when other tasks are also run on the server, the recall rate drops rapidly, even nearing zero.

To address potential overfitting with your YOLOv5 model, consider implementing data augmentation techniques or experimenting with dropout layers. Running multiple tasks on the same GPU can indeed impact performance due to resource contention, which might explain the drop in recall. Ensure your GPU resources are not over-allocated by monitoring usage with nvidia-smi.
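As a rough sketch of checking for contention before launching a second job (the device index and the 50% free-memory threshold below are illustrative assumptions, not fixed rules):

```python
import torch

# Illustrative GPU index; on a shared server each training job should get its own device.
device_index = 0

# Free vs. total memory shows whether another process is already occupying the card.
free, total = torch.cuda.mem_get_info(device_index)
print(f"cuda:{device_index}: {free / 1e9:.1f} GB free of {total / 1e9:.1f} GB")

# Example threshold (assumption): hold off if less than half of the memory is free.
if free < 0.5 * total:
    print("GPU looks busy; consider another device or wait for the other task to finish")
```

You can also pin each run to its own card with the --device flag of train.py (or CUDA_VISIBLE_DEVICES) so that two jobs never compete for the same GPU.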