/gpt-vision-grid-extraction

An experiment on the capabilities of GPT-4 Vision

Primary language: Python

GPT-4 grid extraction

Question

Can GPT-4 Vision perfectly extract the (x, y) positions of objects on a grid? Let's see.

Plan

  • Generate a random grid and render an image of it
  • Send the image to GPT-4 Vision
  • Compare the grid the model returns with the original grid
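
The first and last steps of the plan can be sketched without any external dependencies. This is only a minimal illustration, not the repository's actual code: the function names `generate_grid`, `render_ascii`, and `compare_grids` are hypothetical, the coordinate convention (0-based, origin at the top-left) is an assumption, and the text rendering merely stands in for the real image step. The call to GPT-4 Vision is left out entirely.

```python
import random


def generate_grid(width, height, n_objects, seed=None):
    """Return a set of occupied (x, y) cells.

    Assumed convention: 0-based indices, origin at the top-left cell.
    """
    rng = random.Random(seed)
    all_cells = [(x, y) for y in range(height) for x in range(width)]
    return set(rng.sample(all_cells, n_objects))


def render_ascii(cells, width, height):
    """Render the grid as text; a stand-in for drawing the image."""
    return "\n".join(
        "".join("#" if (x, y) in cells else "." for x in range(width))
        for y in range(height)
    )


def compare_grids(expected, predicted):
    """Compare two collections of (x, y) cells and report the differences."""
    expected, predicted = set(expected), set(predicted)
    return {
        "correct": expected & predicted,    # cells the model got right
        "missed": expected - predicted,     # cells the model left out
        "spurious": predicted - expected,   # cells the model invented
        "exact_match": expected == predicted,
    }


if __name__ == "__main__":
    truth = generate_grid(width=5, height=5, n_objects=4, seed=0)
    print(render_ascii(truth, 5, 5))
    # In the real experiment, `predicted` would be parsed from the model's reply.
    print(compare_grids(truth, truth)["exact_match"])
```

Reporting missed and spurious cells separately, rather than a single pass/fail flag, makes it easier to see whether the model is dropping objects or placing them in the wrong cell.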

Notes

For a while, GPT kept responding that it was unable to do this.

It currently seems to get confused about where to start counting.
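
One way to investigate the counting confusion is to check whether the model's answer matches the true grid under a different indexing convention. The sketch below is an assumption about how that could be diagnosed, not something taken from the repository: the helpers `to_convention` and `best_convention` are hypothetical, and the ground truth is assumed to be 0-based with the origin at the top-left.

```python
def to_convention(cells, height, one_based, bottom_origin):
    """Re-express 0-based, top-left-origin cells in another convention."""
    converted = set()
    for x, y in cells:
        if bottom_origin:
            y = height - 1 - y  # flip rows so y counts up from the bottom
        if one_based:
            x, y = x + 1, y + 1  # start counting at 1 instead of 0
        converted.add((x, y))
    return converted


def best_convention(truth, predicted, height):
    """Score the prediction against each convention; return the best one.

    Keys are (one_based, bottom_origin) pairs; values are the number of
    predicted cells that agree with the truth under that convention.
    """
    predicted = set(predicted)
    scores = {}
    for one_based in (False, True):
        for bottom_origin in (False, True):
            candidate = to_convention(truth, height, one_based, bottom_origin)
            scores[(one_based, bottom_origin)] = len(candidate & predicted)
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    truth = {(0, 0), (2, 1), (4, 3)}
    # Simulated reply from a model that counts from 1 rather than 0.
    reply = {(1, 1), (3, 2), (5, 4)}
    print(best_convention(truth, reply, height=5))
```

If one convention consistently scores far higher than the others, the mismatch is likely a matter of how the prompt specifies the origin rather than a failure to see the objects.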

Results