Curious about action candidate selection

Question

MathieuTuli opened this issue 3 years ago · 2 comments

Hi again,

I was curious about the motivation for using an action candidate scoring paradigm. From my understanding, TextWorld games are usually played using a verb + noun style of action: the agent selects a verb and a noun, which are concatenated to form the action. This was the case in TextWorld-Coin-Collector, for example. So I'm wondering why this style of action selection/generation/composition wasn't used in this work. Was it for better performance? Is it an easier task? Was it incompatibility? Or something else?

On incompatibility: looking at the admissible commands from the TextWorld environment used in this repo, actions generally follow the verb + noun format, except in cases like "take apple from counter", where "from counter" describes the noun's location. There are also cases where the noun is described using multiple tokens (e.g. "sliding patio door", "red apple"), but in general you could treat each of these as one noun. So does this more complex action structure force the use of a candidate ranking method?

From my potentially incomplete perspective, candidate ranking seems like a somewhat easier task than action composition, and it also means the agent never performs an invalid action. I understand that the state defines the permissible actions, but here the agent is fed those permissible actions and doesn't have to discover for itself what is and isn't allowed.

Thanks for any insight you can provide!

Answer 1 · 2021-08-03T13:25:20.000Z

Hi @MathieuTuli, you are right. We decided to use action candidate ranking rather than action generation to simplify the task. The main reason is that we wanted this work, GATA, to focus on Learning Dynamic Belief Graphs, i.e. on state representation.

Candidate ranking has been used in the past, and games played this way are often called choice-based text games. We agree with you that an "ideal" agent should learn to generate the action and not rely on such handicaps. Note that the TextWorld engine can provide the list of verbs and object names relevant to each game, e.g. textworld.EnvInfos(entities=True, verbs=True) or even textworld.EnvInfos(entities=True, command_templates=True). The information that can be requested from the TextWorld engine is listed here: https://textworld.readthedocs.io/en/stable/textworld.html#textworld.core.EnvInfos.
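As a rough illustration of the verb + noun composition discussed in this thread, here is a minimal sketch. It is not code from GATA. The `compose_actions` helper and the example verb and entity lists are hypothetical stand-ins for what a game would report when `entities=True` and `verbs=True` are requested; the `textworld` import is guarded so the sketch runs without the package installed.

```python
from itertools import product

try:
    import textworld
    # The two flags mentioned above; pass this object when registering or
    # starting a game so the engine reports verbs and entity names.
    request_infos = textworld.EnvInfos(entities=True, verbs=True)
except ImportError:
    request_infos = None  # TextWorld not installed; the rest still runs.


def compose_actions(verbs, entities):
    """Hypothetical helper: pair every verb with every entity name.

    Multi-token names such as "red apple" are treated as a single noun.
    """
    return [f"{verb} {noun}" for verb, noun in product(verbs, entities)]


# Made-up values standing in for the verb and entity lists of a real game.
verbs = ["take", "open"]
entities = ["red apple", "sliding patio door"]

candidates = compose_actions(verbs, entities)
print(candidates)
```

Note that this composition also produces commands such as "open red apple" that may not be valid in a given state, and it cannot express forms like "take apple from counter". Ranking the admissible commands avoids both issues, at the cost of handing the agent the valid actions.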

Answer 2 · 2021-08-03T13:29:49.000Z

Hi @MarcCote, thanks for the clarification, that makes sense. I had forgotten about the different game types. Thanks for providing those code snippets; they are exactly what I was looking for!