Crash with multiple Load calls
Closed this issue · 7 comments
Describe the bug
Hello,
For this and the previous version I get regular crashes (approx. 1 in 3 starts) on macOS.
They occur in both the build and the Editor, with acc and noAcc.
=================================================================
Managed Stacktrace:
at <unknown> <0xffffffff>
at System.Object:wrapper_native_0x5335dbdb4 <0x00007>
at LLMUnity.LLM:<Tokenize>b__65_0 <0x00087>
at <>c__DisplayClass64_0:<LLMReply>b__0 <0x00083>
at System.Threading.Tasks.Task:InnerInvoke <0x000c7>
at System.Threading.Tasks.Task:Execute <0x00063>
at System.Threading.Tasks.Task:ExecutionContextCallback <0x00097>
at System.Threading.ExecutionContext:RunInternal <0x003d3>
at System.Threading.ExecutionContext:Run <0x0006b>
at System.Threading.Tasks.Task:ExecuteWithThreadLocal <0x00223>
at System.Threading.Tasks.Task:ExecuteEntry <0x001c3>
at System.Threading.Tasks.Task:System.Threading.IThreadPoolWorkItem.ExecuteWorkItem <0x00057>
at System.Threading.ThreadPoolWorkQueue:Dispatch <0x0048f>
at System.Threading._ThreadPoolWaitCallback:PerformWaitCallback <0x0008b>
at <Module>:runtime_invoke_bool <0x0011b>
=================================================================
2024-08-09T12:58:59.181Z|0x3dc513000|Obtained 25 stack frames.
2024-08-09T12:58:59.182Z|0x3dc513000|#0 0x0000019e1a9698 in setjmp
2024-08-09T12:58:59.183Z|0x3dc513000|#1 0x000005335dbe0c in LLM_Tokenize
2024-08-09T12:58:59.183Z|0x3dc513000|#2 0x0000040fbad35c in (wrapper managed-to-native) object:wrapper_native_0x5335dbdb4 (intptr,string,intptr) [{0x396b874c8} + 0xec] (0x40fbad270 0x40fbad41c) [0x38f4caa80 - Unity Child Domain]
2024-08-09T12:58:59.183Z|0x3dc513000|#3 0x000004170abdd0 in LLMUnity.LLM:<Tokenize>b__65_0 (intptr,string,intptr) [{0x151c53710} + 0x88] [./Library/PackageCache/ai.undream.llm/Runtime/LLM.cs :: 424u] (0x4170abd48 0x4170abdf8) [0x38f4caa80 - Unity Child Domain]
2024-08-09T12:58:59.183Z|0x3dc513000|#4 0x000004170aba24 in LLMUnity.LLM/<>c__DisplayClass64_0:<LLMReply>b__0 () [{0x5c70f97c0} + 0x84] [./Library/PackageCache/ai.undream.llm/Runtime/LLM.cs :: 407u] (0x4170ab9a0 0x4170aba48) [0x38f4caa80 - Unity Child Domain]
2024-08-09T12:58:59.183Z|0x3dc513000|#5 0x00000422daa050 in System.Threading.Tasks.Task:InnerInvoke () [{0x1504481e0} + 0xc8] (0x422da9f88 0x422daa110) [0x38f4caa80 - Unity Child Domain]
2024-08-09T12:58:59.183Z|0x3dc513000|#6 0x00000422aa13e4 in System.Threading.Tasks.Task:Execute () [{0x3955d4b50} + 0x64] (0x422aa1380 0x422aa1470) [0x38f4caa80 - Unity Child Domain]
2024-08-09T12:58:59.183Z|0x3dc513000|#7 0x00000422aa0f88 in System.Threading.Tasks.Task:ExecutionContextCallback (object) [{0x3955d4b78} + 0x98] (0x422aa0ef0 0x422aa0fb0) [0x38f4caa80 - Unity Child Domain]
2024-08-09T12:58:59.183Z|0x3dc513000|#8 0x00000422a9f95c in System.Threading.ExecutionContext:RunInternal (System.Threading.ExecutionContext,System.Threading.ContextCallback,object,bool) [{0x395564ac8} + 0x3d4] (0x422a9f588 0x422a9f9e8) [0x38f4caa80 - Unity Child Domain]
2024-08-09T12:58:59.183Z|0x3dc513000|#9 0x00000422a9f48c in System.Threading.ExecutionContext:Run (System.Threading.ExecutionContext,System.Threading.ContextCallback,object,bool) [{0x395564a48} + 0x6c] (0x422a9f420 0x422a9f4b0) [0x38f4caa80 - Unity Child Domain]
2024-08-09T12:58:59.183Z|0x3dc513000|#10 0x00000422a9e8bc in System.Threading.Tasks.Task:ExecuteWithThreadLocal (System.Threading.Tasks.Task&) [{0x3955d49d8} + 0x224] (0x422a9e698 0x422a9e9e0) [0x38f4caa80 - Unity Child Domain]
2024-08-09T12:58:59.183Z|0x3dc513000|#11 0x00000422a9d5a4 in System.Threading.Tasks.Task:ExecuteEntry (bool) [{0x3955d48e8} + 0x1c4] (0x422a9d3e0 0x422a9d6dc) [0x38f4caa80 - Unity Child Domain]
2024-08-09T12:58:59.183Z|0x3dc513000|#12 0x00000422a9d2d8 in System.Threading.Tasks.Task:System.Threading.IThreadPoolWorkItem.ExecuteWorkItem () [{0x150448048} + 0x58] (0x422a9d280 0x422a9d2fc) [0x38f4caa80 - Unity Child Domain]
2024-08-09T12:58:59.183Z|0x3dc513000|#13 0x00000422a86fd0 in System.Threading.ThreadPoolWorkQueue:Dispatch () [{0x39596ede8} + 0x490] (0x422a86b40 0x422a8729c) [0x38f4caa80 - Unity Child Domain]
2024-08-09T12:58:59.183Z|0x3dc513000|#14 0x00000422a85dbc in System.Threading._ThreadPoolWaitCallback:PerformWaitCallback () [{0x1509cb8b8} + 0x8c] (0x422a85d30 0x422a85e08) [0x38f4caa80 - Unity Child Domain]
2024-08-09T12:58:59.183Z|0x3dc513000|#15 0x00000422a86484 in (wrapper runtime-invoke) :runtime_invoke_bool (object,intptr,intptr,intptr) [{0x39596eeb8} + 0x11c] (0x422a86368 0x422a86648) [0x38f4caa80 - Unity Child Domain]
2024-08-09T12:58:59.184Z|0x3dc513000|#16 0x0000038f91774c in mono_jit_runtime_invoke
2024-08-09T12:58:59.184Z|0x3dc513000|#17 0x0000038fa9cd00 in do_runtime_invoke
2024-08-09T12:58:59.185Z|0x3dc513000|#18 0x0000038fac1884 in worker_callback
2024-08-09T12:58:59.185Z|0x3dc513000|#19 0x0000038fa16974 in worker_thread
2024-08-09T12:58:59.185Z|0x3dc513000|#20 0x0000038fabec60 in start_wrapper_internal
2024-08-09T12:58:59.186Z|0x3dc513000|#21 0x0000038fabeb0c in start_wrapper
2024-08-09T12:58:59.186Z|0x3dc513000|#22 0x0000038fb3e008 in GC_inner_start_routine
2024-08-09T12:58:59.187Z|0x3dc513000|#23 0x0000038fb3df90 in GC_start_routine
2024-08-09T12:58:59.187Z|0x3dc513000|#24 0x0000019e17af94 in _pthread_start
2024-08-09T12:58:59.187Z|0x3dc513000|Launching bug reporter
Attribute Qt::AA_EnableHighDpiScaling must be set before QCoreApplication is created.
info: Microsoft.Hosting.Lifetime[0]
Application is shutting down...
info: Unity.ILPP.Runner.PostProcessingAssemblyLoadContext[0]
ALC ILPP context 1 is unloading
Steps to reproduce
No response
LLMUnity version
2.1.0-2.0.3
Operating System
macOS
Caused by calling Load() multiple times during warmup for different chats on the same bot.
Thank you for the bug report, I'll look into it next week 👍
Here is the Editor log:
{"tid":"0x3fc9c7000","timestamp":1723583356,"level":"INFO","function":"print_timings","line":321,"msg":"prompt eval time = 26406.68 ms / 2699 tokens ( 9.78 ms per token, 102.21 tokens per second)","id_slot":0,"id_task":41,"t_prompt_processing":26406.678,"n_prompt_tokens_processed":2699,"t_token":9.783874768432753,"n_tokens_second":102.2089942551653}
{"tid":"0x3fc9c7000","timestamp":1723583356,"level":"INFO","function":"print_timings","line":337,"msg":"generation eval time = 12923.37 ms / 76 runs ( 170.04 ms per token, 5.88 tokens per second)","id_slot":0,"id_task":41,"t_token_generation":12923.372,"n_decoded":76,"t_token":170.0443684210526,"n_tokens_second":5.8808181022723796}
{"tid":"0x3fc9c7000","timestamp":1723583356,"level":"INFO","function":"print_timings","line":347,"msg":" total time = 39330.05 ms","id_slot":0,"id_task":41,"t_prompt_processing":26406.678,"t_token_generation":12923.372,"t_total":39330.05}
{"tid":"0x3fc9c7000","timestamp":1723583356,"level":"INFO","function":"update_slots","line":1794,"msg":"slot released","id_slot":0,"id_task":41,"n_ctx":4096,"n_past":2776,"n_system_tokens":0,"n_cache_tokens":2776,"truncated":true}
{"tid":"0x3fc9c7000","timestamp":1723583356,"level":"INFO","function":"update_slots","line":1812,"msg":"all slots are idle"}
{"tid":"0x3fc9c7000","timestamp":1723583356,"level":"INFO","function":"launch_slot_with_task","line":1046,"msg":"slot is processing task","id_slot":0,"id_task":48}
2024-08-13T21:09:16.361Z|0x1f251cc00|LLM 11: Severe error occured
{"tid":"0x3fc9c7000","timestamp":1723583356,"level":"INFO","function":"update_slots","line":2095,"msg":"kv cache rm [p0, end)","id_slot":0,"id_task":48,"p0":2694}
[Crash!]
Edit: replaced the log with a more informative one.
Edit 2: After digging into this for a while, I have come to believe that other errors just bubbled up through the stack into the warmup callback, and that the callback itself is innocent.
Hi, could you check again with the latest release (v2.2.0)?
I implemented a fix in the LLM creation / destruction.
Hi, I have checked the latest release and noticed that the llmCharacter WarmupCallback can return before llm.started. Combined with a null reference in the callback, this allegedly caused the original error above.
UnityThread
{
    _ = llmCharacter.Warmup(WarmUpCallback);
}
private void WarmUpCallback()
{
    llm.SetBasePrompt("something"); // "LLM not created" error
    NotExisting.Val = x; // crash, with the last reported sign of life coming from the LLM
}
IMO the crash is resolved. Unity is just gone before it can name the real culprit up the stack.
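One way to make the real culprit visible is to catch managed exceptions inside the callback and report them before they leave it. The following is only a minimal sketch under the assumption that the failure is a managed exception thrown in the callback body; `SafeCallback` and `Run` are hypothetical names and not part of LLMUnity, and this would not help with a crash in native code.

```csharp
using System;

// Hypothetical helper (not part of LLMUnity): runs a callback body and
// reports any managed exception instead of letting it escape the callback.
public static class SafeCallback
{
    // Returns true if body completed, false if it threw and onError was called.
    public static bool Run(Action body, Action<Exception> onError)
    {
        if (body == null) throw new ArgumentNullException(nameof(body));
        if (onError == null) throw new ArgumentNullException(nameof(onError));
        try
        {
            body();
            return true;
        }
        catch (Exception e)
        {
            // Report the exception here, while the process is still alive.
            onError(e);
            return false;
        }
    }
}
```

Inside `WarmUpCallback`, the body could then be wrapped as `SafeCallback.Run(() => { /* callback work */ }, Debug.LogException);`, so the exception is logged with its own stack trace.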
I can't think how this can happen 🙂.
All the local chat calls (i.e. not on a remote server), including warmup, pass through this line:
https://github.com/undreamai/LLMUnity/blob/main/Runtime/LLMCharacter.cs#L714
which proceeds only once the LLM has either failed or started successfully.
This is not exactly my expertise and I am just guessing that it could be a Unity main thread vs. Tasks thing. Here is what I found: Tasks may not synchronize correctly with Unity's main thread, and Unity API calls made from threads other than the main thread can lead to such race conditions. You can use UnityMainThreadDispatcher or UnityEngine.UnitySynchronizationContext to marshal back to the main thread.
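The general idea of marshaling work back to the main thread can be sketched without either of those helpers. This is an illustrative sketch, not how LLMUnity or UnityMainThreadDispatcher is implemented; `MainThreadQueue` is a hypothetical class, and it assumes that `Drain()` is called regularly from the main thread (in Unity, for example from a MonoBehaviour's `Update()`).

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical helper: background threads enqueue work, the main thread runs it.
public sealed class MainThreadQueue
{
    private readonly ConcurrentQueue<Action> pending = new ConcurrentQueue<Action>();

    // Safe to call from any thread, e.g. from a Task or a warmup callback.
    public void Enqueue(Action action)
    {
        if (action == null) throw new ArgumentNullException(nameof(action));
        pending.Enqueue(action);
    }

    // Call from the main thread only. Runs every queued action on the
    // calling thread and returns how many actions were executed.
    public int Drain()
    {
        int count = 0;
        while (pending.TryDequeue(out Action action))
        {
            action();
            count++;
        }
        return count;
    }
}
```

With this in place, a callback running on a thread-pool thread would call `queue.Enqueue(() => { /* Unity API calls */ })` instead of touching Unity objects directly, and the queued work would execute on the next `Drain()`.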