dvl-tum/mot_neural_solver

Why trained using ground truth annotations?

Closed this issue · 3 comments

From the config file and the checkpoint that you have released, it seems that the model is trained directly with ground truth annotations. While most trackers train using the public detections along with the ground truth, is there any particular reason for training with ground truth directly?

Hi,

The model is indeed trained with ground truth bounding boxes. Notice, though, that we do data augmentation to simulate the typical sources of inaccuracies of object detectors.
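To illustrate the idea, here is a minimal sketch of that kind of augmentation. The function name, parameters, and exact noise model (coordinate jitter plus random box dropping to mimic missed detections) are assumptions for illustration; the actual augmentation in the repo may differ.

```python
import random

def augment_gt_boxes(boxes, jitter=0.05, drop_prob=0.1, seed=None):
    """Simulate detector noise on ground truth boxes in (x1, y1, x2, y2) format:
    perturb each coordinate by up to `jitter` * box size, and drop each box
    with probability `drop_prob` to mimic a missed detection.
    Illustrative sketch only, not the repo's actual augmentation."""
    rng = random.Random(seed)
    out = []
    for (x1, y1, x2, y2) in boxes:
        if rng.random() < drop_prob:
            continue  # simulate a false negative
        w, h = x2 - x1, y2 - y1
        out.append((
            x1 + rng.uniform(-jitter, jitter) * w,
            y1 + rng.uniform(-jitter, jitter) * h,
            x2 + rng.uniform(-jitter, jitter) * w,
            y2 + rng.uniform(-jitter, jitter) * h,
        ))
    return out
```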

We did try training on object detections. To do so, we needed to assign a ground truth identity to each detected box, so that we could later determine labels for the edges among them. We assigned ground truth ids to each detected box by running bipartite matching between ground truth and detected bounding boxes based on IoU. This procedure is not perfect under heavy occlusion: after visually inspecting the resulting identity assignments (and the corresponding edge labeling), we noticed a significant number of ambiguities, and cases in which the assigned edge labels did not align with our intuition, hence yielding 'noisy labels'. As a workaround, we decided to simply train with ground truth bounding boxes and simulate 'detector errors' via data augmentation. In our experiments, training with ground truth did not hurt performance and seemed to yield more stable results.
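A minimal sketch of the matching step described above, assuming boxes in (x1, y1, x2, y2) format and using SciPy's Hungarian solver; the function names and the IoU threshold are illustrative, not the repo's actual implementation:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def assign_gt_ids(detections, gt_boxes, gt_ids, iou_threshold=0.5):
    """Assign a ground truth id (or -1 for unmatched) to each detected box
    via maximum-IoU bipartite matching (Hungarian algorithm)."""
    cost = np.zeros((len(detections), len(gt_boxes)))
    for i, det in enumerate(detections):
        for j, gt in enumerate(gt_boxes):
            cost[i, j] = -iou(det, gt)  # maximize IoU == minimize -IoU
    rows, cols = linear_sum_assignment(cost)
    assigned = [-1] * len(detections)
    for r, c in zip(rows, cols):
        if -cost[r, c] >= iou_threshold:  # reject weak overlaps
            assigned[r] = gt_ids[c]
    return assigned
```

Edge labels between two detections would then be positive exactly when both carry the same (non-negative) assigned id; under heavy occlusion, overlapping boxes can match the wrong identity, which is the source of the noisy labels mentioned above.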

Thank you for the quick reply. That makes sense. And thank you as well for releasing this awesome and very modular implementation. PS: Do you have any survey paper or link that lists the recent neural-net approaches to graph-based tracking?

Thanks a lot :) And no, unfortunately, I'm not aware of any such survey paper!