benchmarking question

Question

lalehsg opened this issue 10 months ago · 1 comments

Hi all, I am wondering whether any comparison between Defog and the Mixtral 8x7B Instruct model has been done. Thanks in advance!

Answer 1 · 2024-03-06T21:10:37.000Z

Hi there! You can run a comparison on sql-eval with the command below. IIRC, mistral-medium via the Mistral API is the 8x7B model. On SQL-Eval, it is 63% accurate, compared to 90% for sqlcoder-7b-2.

python -W ignore main.py \
  -db postgres \
  -o "results/results.csv" \
  -g mistral \
  -f "prompts/prompt_mistral.md" \
  -m mistral-medium \
  -p 5
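
Once a run finishes, something like the following could summarize the output file so two models can be compared side by side. This is only a minimal sketch: the helper name `accuracy_from_csv` is made up for illustration, and it assumes the results CSV has a per-question column named `correct` holding 1/0 (or true/false). Check the header of your own `results/results.csv` and adjust the column name if it differs.

```python
import csv
import os


def accuracy_from_csv(path, column="correct"):
    """Return the fraction of rows whose `column` value is 1/true.

    The column name is an assumption about the results file layout.
    """
    total = 0
    hits = 0
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            total += 1
            value = (row.get(column) or "").strip().lower()
            if value in {"1", "1.0", "true"}:
                hits += 1
    # Avoid dividing by zero on an empty file.
    return hits / total if total else 0.0


if __name__ == "__main__":
    path = "results/results.csv"  # same path passed to -o above
    if os.path.exists(path):
        print(f"accuracy: {accuracy_from_csv(path):.1%}")
```

Running the benchmark once per model with a different `-o` path and calling the helper on each file gives the two numbers to compare.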