web-arena-x/visualwebarena

Release log of successes/failures for GPT4+SOM trajectories

sanjari-orb opened this issue · 2 comments

Thanks to the authors for releasing the GPT4+SOM trajectories.

However, I do not see any way to tell which traces correspond to successful tasks and which to failed ones. Can this information be released as well?

This was done in the WebArena repository when the GPT execution traces were released: https://github.com/web-arena-x/webarena/tree/main/resources#1132023-execution-traces-from-our-experiments-v2

This should be available in the zip file you linked, as classifieds_gpt4v_som/results.txt, reddit_gpt4v_som/results.txt, and shopping_gpt4v_som/results.txt.
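
To confirm that the three files named above are present after unzipping, something like the following minimal sketch may help. It only locates the files. The thread does not describe the format of results.txt, so no parsing is attempted. The helper name find_results_files and the extraction directory passed on the command line are assumptions for illustration.

```python
import sys
from pathlib import Path

# Directory names as given in the comment above.
SITES = ["classifieds_gpt4v_som", "reddit_gpt4v_som", "shopping_gpt4v_som"]


def find_results_files(root):
    """Map each site directory to its results.txt path, or None if missing.

    `root` is assumed to be the folder the trajectory zip was extracted into.
    """
    root = Path(root)
    found = {}
    for site in SITES:
        candidate = root / site / "results.txt"
        found[site] = candidate if candidate.is_file() else None
    return found


if __name__ == "__main__":
    # Hypothetical usage: python find_results.py /path/to/extracted/zip
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for site, path in find_results_files(root).items():
        print(f"{site}: {path if path else 'results.txt not found'}")
```

If a file is reported as missing, the archive may have been extracted into an extra nested folder, in which case pointing the script one level deeper should be enough.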

I missed this, thanks!