GPT-4-1106-preview

Question

Closed this issue 4 months ago · 3 comments

What are the GPT-4-1106-preview scores for each of the 13 tasks in the 4k test?

Answer 1 · 2024-09-06T08:31:07.000Z

This is the score I measured

Answer 2 · 2024-09-10T17:27:15.000Z

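| Tasks | niah_single_1 | niah_single_2 | niah_single_3 | niah_multikey_1 | niah_multikey_2 | niah_multikey_3 | niah_multivalue | niah_multiquery | vt | fwe | cwe | qa_1 | qa_2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Score | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 99.5 | 100.0 | 100.0 | 98.0 | 100.0 | 88.0 | 70.0 |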

These are my results on the 4k test. Could you check the raw responses for variable tracking (vt) and common word extraction (cwe)? The API may be returning something unexpected.
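If it helps, here is a minimal sketch of how one might scan a predictions file for suspicious responses. It assumes the predictions are stored as JSONL with the model response under a `pred` key and the reference answers under an `outputs` key; both key names, the helper names, and the substring-match check are assumptions for illustration, so adjust them to match your actual files and scoring.

```python
import json
from pathlib import Path


def match_fraction(pred: str, refs: list[str]) -> float:
    """Fraction of reference strings found in the prediction (case-insensitive)."""
    if not refs:
        return 0.0
    pred_lower = pred.lower()
    return sum(ref.lower() in pred_lower for ref in refs) / len(refs)


def load_jsonl(path: str) -> list[dict]:
    """Read one JSON object per non-empty line."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def suspect_responses(
    records: list[dict],
    pred_key: str = "pred",      # assumed key for the model response
    refs_key: str = "outputs",   # assumed key for the reference answers
) -> list[dict]:
    """Return records whose response is empty or misses at least one reference."""
    flagged = []
    for rec in records:
        pred = rec.get(pred_key) or ""
        refs = rec.get(refs_key) or []
        if not pred.strip() or match_fraction(pred, refs) < 1.0:
            flagged.append(rec)
    return flagged


if __name__ == "__main__":
    # Made-up records, only to show the shape of the check.
    demo = [
        {"pred": "The variables are ABC, DEF and GHI.", "outputs": ["ABC", "DEF", "GHI"]},
        {"pred": "I'm sorry, I cannot help.", "outputs": ["ABC", "DEF"]},
    ]
    for rec in suspect_responses(demo):
        print(repr(rec["pred"]))
```

To run it on real output, pass the result of `load_jsonl(...)` for your vt and cwe prediction files to `suspect_responses` and read through what it prints. Refusals, empty strings, or truncated answers would point to an API-side problem and not a scoring one.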

Answer 3 · 2024-09-13T08:12:51.000Z

Thank you, thank you.