guidance-ai/guidance

Segmentation fault (core dumped)


The bug
Hi! I am getting the message Segmentation fault (core dumped) while running the following code.

To Reproduce

from guidance import models, gen

llama3 = models.Transformers("meta-llama/Meta-Llama-3-8B-Instruct", device_map="auto")
llama3 + "Do you want a joke or a poem? " + gen(stop=".")

System info (please complete the following information):
guidance version is 0.1.15

GPU info: (attached as a screenshot in the original issue)
What version of LlamaCpp are you on? And does this happen if you run on the CPU rather than the GPU?
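A minimal CPU-only variant of the repro can help answer that question. This is a sketch, assuming the same model and packages as in the issue; the helper name `run_cpu_repro` is illustrative, and the imports are done lazily so the snippet can be read without guidance installed:

```python
PROMPT = "Do you want a joke or a poem? "

def run_cpu_repro(model_id="meta-llama/Meta-Llama-3-8B-Instruct"):
    # Imported lazily so the file can be inspected without guidance installed.
    from guidance import models, gen

    # device_map="cpu" keeps all weights on the CPU, ruling out GPU-side faults.
    llama3 = models.Transformers(model_id, device_map="cpu")
    return llama3 + PROMPT + gen(stop=".")
```

If this still segfaults, the GPU (and its drivers) can be excluded as the cause.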

Hi,
The version of llama_cpp_python is 0.2.77.
The same issue occurs when I use the CPU rather than the GPU.

And does the prompt run fine when sent directly through LlamaCpp? (Sorry, we see a lot of segfaults from LlamaCpp, and a segfault can't come from our code, which is 100% Python.)
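Running the same prompt through llama_cpp_python directly, with no guidance in the loop, would isolate where the crash originates. A sketch, assuming a local GGUF file (the path argument and helper name are placeholders):

```python
PROMPT = "Do you want a joke or a poem? "

def run_direct(model_path):
    # model_path must point at a local GGUF file, e.g. a Llama-3-8B-Instruct quant.
    from llama_cpp import Llama

    llm = Llama(model_path=model_path)
    # Same stop condition as the guidance call: generate until the first ".".
    out = llm(PROMPT, max_tokens=64, stop=["."])
    return out["choices"][0]["text"]
```

If this segfaults too, the fault lies in LlamaCpp (or the model file) rather than in guidance.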