Optimizing outputs from non-GPT-4 models
warlockedward opened this issue · 1 comment
warlockedward commented
I'm currently trying to drive Mentat with the Qwen1.5-72B, deepseek-33b, and mixtral-8x7b models, but the answers they give always contain errors: misunderstandings and inaccurate code modifications. I'm not sure what's causing this. Is there any plan to support non-GPT-4 models later on? Thank you very much.
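
For reference, this is roughly how I'm exposing the local models through an OpenAI-compatible endpoint; the URL, API key, and model name below are placeholders, not my exact setup:

```python
# Sketch: talking to a locally served model through an
# OpenAI-compatible API (e.g. a vLLM-style server).
# The base_url, api_key, and model name are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # local inference server
    api_key="not-needed-for-local",       # local servers often ignore this
)

response = client.chat.completions.create(
    model="Qwen1.5-72B-Chat",  # placeholder model name
    messages=[{"role": "user", "content": "Rename function foo to bar."}],
)
print(response.choices[0].message.content)
```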
biobootloader commented
In our testing, no models other than GPT-4 and Claude 3 Opus can handle the complex edit format required by Mentat.
We do have some changes coming that might make things easier for local models though. This experiment is a step in that direction: #530
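
To illustrate why format adherence matters, here is a simplified, hypothetical edit format and parser; this is not Mentat's actual format, just a sketch of how strictly machine-parsed edits leave no room for the small deviations weaker models tend to make:

```python
# Illustrative only -- NOT Mentat's actual edit format.
# A strict, machine-parsed format like this fails if the model
# deviates even slightly (wrong markers, malformed line ranges,
# extra conversational prose before the header line).
import re

EDIT_RE = re.compile(
    r"^@ (?P<file>\S+) replace lines (?P<start>\d+)-(?P<end>\d+)$"
)

def parse_edit(block: str) -> dict:
    lines = block.splitlines()
    header = EDIT_RE.match(lines[0])
    if header is None:
        raise ValueError(f"malformed edit header: {lines[0]!r}")
    start, end = int(header["start"]), int(header["end"])
    if end < start:
        raise ValueError("end line before start line")
    return {
        "file": header["file"],
        "start": start,
        "end": end,
        "replacement": lines[1:],
    }

# A single stray token from the model ("Sure! Here's the edit:")
# makes the whole block unparseable.
edit = parse_edit("@ src/app.py replace lines 10-12\nreturn bar()")
print(edit)
```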