facebookresearch/chameleon

Multimodal results for assessing architectural plans

steveterry66 opened this issue · 1 comment

Hi, this is the use case I'm trying to get working. I have attached a screenshot of the results, which contain an erroneous interpretation ("The dimensions of the yard are 10x25."). Please advise on how I could improve the results. I have also attached the original input JPEG and a prompt describing what I actually want.
Manshed chameloen prompt.txt
manshed
chameleonResults

BTW, it runs okay on my laptop with an NVIDIA 4070.

We're not able to help with very specific use cases. You could try a chain-of-thought prompt that asks for explicit OCR of the plan and asks the model to enumerate each step of the calculation, though I'm not sure it will help. Good luck. Closing the issue.
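
For anyone landing here later, the chain-of-thought suggestion above could be sketched roughly as follows. This is only an illustration of how such a prompt might be assembled: the helper name `build_plan_prompt` and the wording of each step are assumptions, not anything from the Chameleon repo, and the sketch deliberately stops at building the text rather than calling the model.

```python
def build_plan_prompt(question: str) -> str:
    """Assemble a step-by-step prompt (hypothetical wording, not from the repo)."""
    # Ask for the text in the image first (explicit OCR), then the arithmetic,
    # so a misread label can be told apart from a calculation mistake.
    steps = [
        "Transcribe every dimension label and note visible in the plan, exactly as written.",
        "State which transcribed labels apply to the area asked about.",
        "Show each arithmetic step on its own line, with units.",
        "Give the final answer on a last line starting with 'Answer:'.",
    ]
    numbered = "\n".join(f"{i}. {s}" for i, s in enumerate(steps, start=1))
    return f"Look at the attached architectural plan.\n{numbered}\nQuestion: {question}"


prompt = build_plan_prompt("What are the dimensions of the yard?")
print(prompt)
```

The resulting text would be sent together with the input image through whatever inference path you already use. Whether it actually fixes the "10x25" misreading is untested here.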