Alfworld GPT-3 Results

Question

gautierdag opened this issue 2 years ago · 3 comments

Hi,
I wondered if you had more details or numbers from your GPT-3 results on Alfworld? For instance, do you have the splits of accuracy across the different subtasks (as in Table 3 in the paper)?

I would try to reproduce it, but I reckon the total cost would be > $100 and would like to avoid it if possible.

Answer 1 · 2023-06-11T15:42:33.000Z

Hi, at the end of all 134 instances, the six task categories

prefixes = {
    'pick_and_place': 'put',
    'pick_clean_then_place': 'clean',
    'pick_heat_then_place': 'heat',
    'pick_cool_then_place': 'cool',
    'look_at_obj': 'examine',
    'pick_two_obj': 'puttwo'
}

have the final result

134 r 0 rs [19, 19, 7, 17, 16, 8] cnts [24, 31, 23, 21, 18, 17] sum(rs)/sum(cnts) 0.6417910447761194

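Here rs holds the number of successes and cnts the number of tasks per category, in the order listed above, e.g. put tasks are 19/24 correct.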
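For anyone who wants the per-category split as rates, here is a minimal sketch that recomputes them from the log line above. It assumes rs and cnts are ordered like the prefixes dict, which matches the 19/24 figure given for put tasks. The helper name per_category_rates is made up for illustration and is not part of the original code.

```python
# Category keys and short names, copied from the reply above.
prefixes = {
    'pick_and_place': 'put',
    'pick_clean_then_place': 'clean',
    'pick_heat_then_place': 'heat',
    'pick_cool_then_place': 'cool',
    'look_at_obj': 'examine',
    'pick_two_obj': 'puttwo',
}

# Values copied from the final log line.
rs = [19, 19, 7, 17, 16, 8]        # successes per category
cnts = [24, 31, 23, 21, 18, 17]    # tasks per category


def per_category_rates(prefixes, rs, cnts):
    """Hypothetical helper: map each short category name to its success rate.

    Assumes rs and cnts follow the insertion order of `prefixes`.
    """
    return {name: r / c for name, r, c in zip(prefixes.values(), rs, cnts)}


rates = per_category_rates(prefixes, rs, cnts)
overall = sum(rs) / sum(cnts)

for name, r, c in zip(prefixes.values(), rs, cnts):
    print(f"{name:8s} {r:2d}/{c:2d} = {r / c:.1%}")
print(f"overall  {sum(rs)}/{sum(cnts)} = {overall:.1%}")
```

The overall value computed this way is the same sum(rs)/sum(cnts) that the log line reports.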

Answer 2 · 2023-06-11T15:43:58.000Z

A more complete trajectory is at https://gist.github.com/ysymyth/01045e5b65651eccd63a5a46964b8216

Answer 3 · 2023-06-11T16:16:31.000Z

Thank you!