tylerthecoder/gpt-vision-grid-extraction

An experiment on the capabilities of GPT-4 Vision

Python

GPT 4 grid extraction

Question

Can GPT-4 Vision perfectly extract the (x, y) positions of objects on a grid? Let's see.

Plan

Generate a random grid and render an image of it
Send the image to GPT-4 Vision
Compare the grid GPT-4 Vision returns with the original grid
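
The first and last steps of the plan can be sketched without any API call. The snippet below is a minimal sketch, not the repository's actual code: the function names `generate_grid` and `compare_grids`, the 0-indexed top-left coordinate convention, and the representation of a grid as a set of (x, y) cells are all assumptions made for illustration. Rendering the image and calling the model are left out.

```python
import random


def generate_grid(rows, cols, num_objects, seed=None):
    """Return a set of (x, y) cells that contain an object.

    Assumption: cells are 0-indexed, with (0, 0) at the top-left.
    """
    rng = random.Random(seed)
    all_cells = [(x, y) for y in range(rows) for x in range(cols)]
    # sample() picks distinct cells, so no two objects share a cell.
    return set(rng.sample(all_cells, num_objects))


def compare_grids(expected, predicted):
    """Summarize how the predicted cells differ from the expected ones."""
    return {
        "correct": sorted(expected & predicted),
        "missed": sorted(expected - predicted),
        "extra": sorted(predicted - expected),
        "perfect": expected == predicted,
    }


if __name__ == "__main__":
    truth = generate_grid(rows=5, cols=5, num_objects=4, seed=0)
    # In the real experiment, `guess` would be parsed from the model's reply.
    guess = set(truth)
    print(compare_grids(truth, guess))
```

Treating a grid as a set of occupied cells keeps the comparison to plain set operations, and the "missed" and "extra" lists show how an extraction failed rather than only whether it did.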

Notes

For a while, GPT kept refusing, saying it was unable to do this.

Currently, it seems to get confused about where to start counting.
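
One way to investigate the counting confusion is to test whether the model's answer matches the true grid under a different coordinate convention. The snippet below is a hedged sketch, not something the repository is known to do: `diagnose_convention` and its helpers are hypothetical names, and it assumes the true grid is 0-indexed with the origin at the top-left.

```python
def shift(cells, dx, dy):
    """Translate every (x, y) cell by (dx, dy)."""
    return {(x + dx, y + dy) for x, y in cells}


def flip_y(cells, rows):
    """Convert between top-left and bottom-left origins."""
    return {(x, rows - 1 - y) for x, y in cells}


def diagnose_convention(expected, predicted, rows):
    """Return the conventions under which `predicted` matches `expected`.

    Assumption: `expected` is 0-indexed with the origin at the top-left.
    """
    candidates = {
        "same convention": predicted,
        "1-indexed": shift(predicted, -1, -1),
        "origin at bottom-left": flip_y(predicted, rows),
        "1-indexed, origin at bottom-left": flip_y(shift(predicted, -1, -1), rows),
    }
    return [name for name, cells in candidates.items() if cells == expected]


if __name__ == "__main__":
    truth = {(0, 0), (2, 1)}
    # A reply that counted from 1 instead of 0.
    print(diagnose_convention(truth, {(1, 1), (3, 2)}, rows=4))
```

An empty result means the mismatch is not explained by any of these conventions, which points to a genuine extraction error rather than a counting offset.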

Results