Can we perform captioning tasks on pedestrian images in reality to obtain finer grained text descriptions.

Question

Can we perform captioning tasks on pedestrian images in reality to obtain finer grained text descriptions.

Closed this issue 6 months ago · 1 comments

shams2023 commented 6 months ago

Can we perform captioning tasks on pedestrian images in reality to obtain finer grained text descriptions.
I have some real-life pedestrian images and I would like to generate corresponding text descriptions for them. I am not sure if your model can be implemented, and I look forward to your reply!
Thank you!

Answer 1 · 2024-04-02T23:03:59.000Z

It is likely that this will work for your problem, however it is hard to say exactly without additional detail (as our method can be applied to any standard image captioning approach). I encourage reading through our paper, and determining if it is applicable to your particular situation.