/mnms

m&ms: A Benchmark to Evaluate Tool-Use for multi-step multi-modal tasks

Primary Language: Python
