Is it possible to use top-down view when we train?

Question

Is it possible to use top-down view when we train?

jeje910 opened this issue 2 years ago · 2 comments

In the paper, the dataset was collected using several panels, including a top-down view.
But I couldn't find an API that provides the top-down view for the Commander and the Follower.

Is it right that there is no API for it? And if so, is it against the rules to use that view during training?
Regards.

Answer 1 · 2022-09-30T17:23:18.000Z

This is a great question; thank you for bringing it up.

In principle, the Commander and Follower models are not meant to have access to the 2D RGB top-down map.

That said, human players were able to see the map during data collection, so there's a reasonable argument that models may need it to ground some referential expressions. Given that, if you want to use top-down maps, go for it; just explain why in any subsequent write-up about the modeling. I'll see if I can dig up the code to generate the top-down 2D images, but in the meantime, I know I developed it from the responses in this thread:

allenai/ai2thor#124
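Until the original code turns up, here is a rough sketch of one way to render a top-down frame in AI2-THOR. It is not the code used for TEACh. It assumes the `GetMapViewCameraProperties` and `AddThirdPartyCamera` actions from recent AI2-THOR releases, which may not be available in the AI2-THOR version that TEACh pins. The helper names `top_down_camera_pose` and `capture_top_down_frame`, the field of view, and the height scale are illustrative choices.

```python
import copy


def top_down_camera_pose(map_view_props, scene_size, height_scale=1.1):
    """Build keyword arguments for AddThirdPartyCamera (hypothetical helper).

    map_view_props: dict assumed to come from GetMapViewCameraProperties.
    scene_size: dict with "x" and "z" extents, assumed to come from
        event.metadata["sceneBounds"]["size"].
    """
    pose = copy.deepcopy(map_view_props)  # leave the caller's dict untouched
    max_bound = max(scene_size["x"], scene_size["z"])
    # Raise the camera so the whole floor plan fits in a perspective view.
    pose["position"]["y"] += height_scale * max_bound
    pose["fieldOfView"] = 50
    pose["orthographic"] = False
    pose["farClippingPlane"] = 50
    pose.pop("orthographicSize", None)  # only meaningful for orthographic cameras
    return pose


def capture_top_down_frame(controller):
    """Add an overhead camera and return its RGB frame (assumed action names)."""
    event = controller.step(action="GetMapViewCameraProperties")
    pose = top_down_camera_pose(
        event.metadata["actionReturn"],
        event.metadata["sceneBounds"]["size"],
    )
    event = controller.step(action="AddThirdPartyCamera", **pose)
    return event.third_party_camera_frames[-1]
```

With an `ai2thor.controller.Controller` instance already initialized on a scene, `capture_top_down_frame(controller)` should return an image array that can be saved or fed to a model.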

You can add some loops to clear out (delete) movable scene objects if you want to get something close to the "empty"-looking top-down 2D images used during TEACh data collection.
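The clearing loop could look roughly like the sketch below. It assumes that each entry in `event.metadata["objects"]` carries `pickupable` and `moveable` flags, and that the `RemoveFromScene` action accepts an `objectId`; both should be checked against the AI2-THOR version in use. The function names are hypothetical.

```python
def movable_object_ids(objects):
    """Return ids of objects flagged as pickupable or moveable in the metadata."""
    return [
        obj["objectId"]
        for obj in objects
        if obj.get("pickupable") or obj.get("moveable")
    ]


def clear_movable_objects(controller):
    """Remove every movable object from the scene and return the removed ids.

    `controller` is expected to behave like an ai2thor Controller: it exposes
    `last_event.metadata["objects"]` and a `step(...)` method.
    """
    object_ids = movable_object_ids(controller.last_event.metadata["objects"])
    for object_id in object_ids:
        # Assumed action name; swap in whatever removal or disable action
        # your AI2-THOR version provides.
        controller.step(action="RemoveFromScene", objectId=object_id)
    return object_ids
```

Calling `clear_movable_objects(controller)` before rendering the overhead frame should leave mostly fixed furniture and structure in view.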

Hope this helps!

Answer 2 · 2022-10-02T14:38:07.000Z

Thank you for your wonderful help!
That was exactly what I was wondering about, and now I fully understand what you mean. I'll try to work on it.

Once again thank you for your kind reply!
Best regards