c0sogi/llama-api

Support for ExLlama V2

Closed this issue · 2 comments

c0sogi commented

ExLlama V2 seems to be working now. Would you like to test it out?
Simply add version=2 to ExllamaModel as shown below:

# Import as in the project's model_definitions.py (path assumed)
from llama_api.schemas.models import ExllamaModel

your_gptq_model = ExllamaModel(
    version=2,  # select the ExLlama V2 backend
    model_path="TheBloke/MythoMax-L2-13B-GPTQ",  # automatic download from Hugging Face
    max_total_tokens=4096,
)

Thank you!