Multimodal results for assessing architectural plans
steveterry66 opened this issue · 1 comment
steveterry66 commented
Hi, this is a use case I'm interested in. I have attached a screenshot of the results, which contain an erroneous interpretation ("The dimensions of the yard are 10x25."). Please advise on how I could improve the results. I have also attached the original input JPEG and a prompt describing what I really want.
Manshed chameloen prompt.txt
BTW, it runs okay on my laptop with an NVIDIA 4070.
lshamis commented
We're not able to help with very specific use cases. You could try a chain-of-thought prompt that asks for explicit OCR and asks the model to enumerate the calculation, though I'm not sure it will help. Good luck. Closing the issue.
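
For anyone landing here later, the suggestion above could be sketched roughly as follows. This only builds the prompt text; it does not call any model, and how the prompt and image are passed to the model is left out. The function name `build_plan_prompt` and the wording of the steps are illustrative assumptions, not something from this repo.

```python
def build_plan_prompt(question: str) -> str:
    """Build a step-by-step prompt for a question about a plan image.

    Hypothetical helper: the step wording is an assumption, meant to
    force an explicit OCR pass before any calculation is attempted.
    """
    steps = [
        "Transcribe every dimension label visible in the image, exactly as written, one per line.",
        "State which transcribed labels belong to the region the question asks about.",
        "Show each arithmetic step using only those labels, with units.",
        "Give the final answer on its own line, prefixed with 'Answer:'.",
    ]
    # Number the steps so the model's output can be checked step by step.
    numbered = "\n".join(f"{i}. {s}" for i, s in enumerate(steps, start=1))
    return (
        "Work through the following steps in order.\n"
        f"{numbered}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    print(build_plan_prompt("What are the dimensions of the yard?"))
```

Separating the transcription step from the arithmetic makes it easier to see whether a wrong answer comes from misread labels or from the calculation itself.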