How to refer to a game object as "THIS" or "THAT"
Opened this issue · 3 comments
I am working on a real-time VR application where I want to look at a game object and refer to it as "THIS" or "THAT". How can I achieve this? At first, I thought of sending the name of the looked-at game object to the LLM along with the user's natural-language prompt, but that was a bit unstable. Do you have any ideas for a more reliable way to address game objects as "THIS" or "THAT" in a VR setting without having to specify their names?
Sorry for the late reply! I am actually working on a VR-MCP that could achieve this more conveniently, but it will take some time to integrate the APIs.
To answer your question, there are three main ways to do this: (1) In VR, you would want hand interaction (such as pointing), which involves ray casting. (2) Gaze interaction is also an important aspect, for which you would need the Gaze API. (3) Furthermore, if you do this in the editor, we could also consider meta prompts that retrieve the camera view so the LLM can predict which object is being referred to, even though that consumes more tokens.
Let me know what you think, and also let me know if you make significant progress!
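To make the ray casting idea in option (1) concrete, here is a minimal, engine-agnostic sketch in Python. It is only an illustration of the geometry: in Unity the physics engine would do this for you (for example via `Physics.Raycast` against colliders). The `SceneObject` type, the bounding-sphere approximation, and the `pick_object` helper are all assumptions made up for this example, not part of any existing API.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class SceneObject:
    """Hypothetical stand-in for a game object with a spherical collider."""
    name: str
    center: Vec3
    radius: float


def _sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def pick_object(origin: Vec3, direction: Vec3,
                objects: Sequence[SceneObject],
                max_distance: float = 10.0) -> Optional[SceneObject]:
    """Return the nearest object whose bounding sphere the ray hits, or None."""
    length = math.sqrt(_dot(direction, direction))
    if length == 0.0:
        return None  # a zero-length direction does not define a ray
    d = (direction[0] / length, direction[1] / length, direction[2] / length)

    best: Optional[SceneObject] = None
    best_t = max_distance
    for obj in objects:
        to_center = _sub(obj.center, origin)
        t_closest = _dot(to_center, d)  # distance along the ray to closest approach
        if t_closest < 0.0:
            continue  # the object is behind the ray origin
        dist_sq = _dot(to_center, to_center) - t_closest * t_closest
        radius_sq = obj.radius * obj.radius
        if dist_sq > radius_sq:
            continue  # the ray passes outside the sphere
        # Distance to the entry point, clamped in case the origin is inside the sphere.
        t_hit = max(t_closest - math.sqrt(radius_sq - dist_sq), 0.0)
        if t_hit <= best_t:
            best, best_t = obj, t_hit
    return best
```

The same function works for both pointing and gaze: only the ray origin and direction change (controller pose versus head or eye pose). The object it returns is the candidate for "THIS" or "THAT".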
For now, I am still using the object's name, with the manage_gameobject tool finding the object I am talking about. One addition I made was to make the received object names case-insensitive, so that cube, Cube, and CUBE are all treated the same. This eliminates simple "Object not found" errors.
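A case-insensitive lookup like the one described above could look roughly like the following Python sketch. The helper names `build_name_index` and `find_object_name` are hypothetical and are not part of the manage_gameobject tool; the sketch only shows the normalization step.

```python
from typing import Dict, Iterable, Optional


def build_name_index(names: Iterable[str]) -> Dict[str, str]:
    """Map a case-folded name to the original scene name."""
    index: Dict[str, str] = {}
    for name in names:
        # If two names differ only by case, the first one seen wins.
        index.setdefault(name.casefold(), name)
    return index


def find_object_name(requested: str, index: Dict[str, str]) -> Optional[str]:
    """Return the scene name matching `requested`, ignoring case and outer spaces."""
    return index.get(requested.strip().casefold())
```

One caveat with this approach: if a scene contains objects whose names differ only by case, the lookup becomes ambiguous, so it may be worth logging or rejecting such collisions.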
Other than that, I was thinking of creating a dictionary of common words like "THIS", "THAT", and "IT" and matching them to currentLookedObject.name, the name of the object hit by the raycast. I haven't implemented it yet, though.
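As a rough sketch of that dictionary idea (untested against a real project), the prompt could be rewritten before it is sent to the LLM. Here `looked_object_name` is a hypothetical parameter standing in for currentLookedObject.name, and the word list is only an assumption about which words to treat as references.

```python
import re
from typing import Optional

# Words treated as references to the currently looked-at object (assumed list).
DEICTIC_WORDS = {"this", "that", "it"}

# Match whole words only, in any letter case.
_DEICTIC_PATTERN = re.compile(
    r"\b(" + "|".join(sorted(DEICTIC_WORDS)) + r")\b",
    re.IGNORECASE,
)


def resolve_deictics(prompt: str, looked_object_name: Optional[str]) -> str:
    """Replace THIS/THAT/IT in the prompt with the quoted object name."""
    if not looked_object_name:
        return prompt  # nothing is being looked at, so leave the prompt alone
    # A function replacement avoids treating the name as a regex template.
    return _DEICTIC_PATTERN.sub(lambda _m: f'"{looked_object_name}"', prompt)
```

This substitution is deliberately naive: it would also rewrite "that" when it is used as a conjunction ("make sure that the cube is red"), so a real implementation would probably need some extra filtering or would leave the final decision to the LLM.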
This