ggerganov/llama.cpp

Infinite update_slots issue on latest build (1265c67)

Leowolf93 opened this issue · 0 comments

On the latest build (1265c67), update_slots appears to be called indefinitely, and the model gets stuck during long-running tasks.
When a time-consuming task is executing, the model freezes and the log shows repeated entries like these:

{"tid":"139842773884928","timestamp":1715689179,"level":"INFO","function":"print_timings","line":342,"msg":" total time = 1749.19 ms","id_slot":0,"id_task":15676,"t_prompt_processing":234.708,"t_token_generation":1514.486,"t_total":1749.1940000000002}
{"tid":"139842773884928","timestamp":1715689179,"level":"INFO","function":"update_slots","line":1780,"msg":"slot released","id_slot":0,"id_task":15676,"n_ctx":16384,"n_past":159,"n_system_tokens":0,"n_cache_tokens":0,"truncated":false}
{"tid":"139842773884928","timestamp":1715689179,"level":"INFO","function":"update_slots","line":1806,"msg":"all slots are idle"}
...
{"tid":"139842773884928","timestamp":1715689851,"level":"INFO","function":"update_slots","line":1835,"msg":"slot context shift","id_slot":0,"id_task":15720,"n_keep":0,"n_left":16383,"n_discard":8191,"n_ctx":16384,"n_past":16383,"n_system_tokens":0,"n_cache_tokens":0}
{"tid":"139842773884928","timestamp":1715690239,"level":"INFO","function":"update_slots","line":1835,"msg":"slot context shift","id_slot":0,"id_task":15720,"n_keep":0,"n_left":16383,"n_discard":8191,"n_ctx":16384,"n_past":16383,"n_system_tokens":0,"n_cache_tokens":0}
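For anyone trying to confirm the same pattern in their own logs, here is a rough sketch of how the repeated context shifts could be counted per task. It assumes the server log has one JSON object per line, with the field names seen in the excerpt above (function, msg, id_task, timestamp). The helper names context_shifts_by_task and shift_gaps are made up for illustration and are not part of llama.cpp.

```python
import json
from collections import defaultdict


def context_shifts_by_task(lines):
    """Group 'slot context shift' log entries by id_task.

    Assumes one JSON object per line; anything else is skipped.
    Returns {id_task: [timestamp, ...]} in the order the entries appear.
    """
    shifts = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip blank lines and separators such as "..."
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if (entry.get("function") == "update_slots"
                and entry.get("msg") == "slot context shift"):
            shifts[entry.get("id_task")].append(entry.get("timestamp"))
    return dict(shifts)


def shift_gaps(timestamps):
    """Seconds elapsed between consecutive context shifts of one task."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]
```

Passing an open log file (or any iterable of lines) to context_shifts_by_task gives the shift timestamps for each task. A single task that keeps accumulating shifts without ever logging "slot released" would match the behaviour described here. For the two context-shift entries above, task 15720 shifts twice, 388 seconds apart.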

Increasing the context size does not seem to alleviate this issue. However, reverting to a version prior to May 10th resolves the problem.
I apologize if this issue is not very detailed. I will provide further information if possible. Thank you!