I trained DINO on my own dataset, whose categories include faces, heads, people, etc. I found that for some images, the detection results for an entire class are missing at inference time...
Normal:
All 'person' detections missing:
I am so confused... How can I resolve this?
It seems we have run into the same problem. After fine-tuning mm-grounding-dino, when I run inference on a multi-object image, the model detects only one object.