Limiting script execution time reliably
sigmonsays opened this issue · 3 comments
I'm struggling to find a reliable way to limit how long a script is allowed to run.
Many examples use debug.sethook to check every 100k instructions or so whether a timer has elapsed, but that is not reliable because a C call can block.
Another solution is to use the alarm system call or setrlimit on Lua running in a separate thread or process, but that has its own set of drawbacks.
Has a script timeout been considered before as something we can set on the VM before executing a pcall?
go-lua was written before the context package was a thing. It would be ideal if most of its exported functions took a ctx context.Context argument, and then you could simply use the context.WithTimeout(...) pattern. That's not possible in general, but in a controlled environment workarounds are possible. For Shopify's genghis tool, we override sleep, for example, to allow context cancellation from the function that invokes the script:
import (
	"context"
	"time"

	lua "github.com/Shopify/go-lua"
)

func registerSleep(ctx context.Context, l *lua.State) {
	l.Register("sleep", func(l *lua.State) int {
		ns := lua.CheckNumber(l, 1)
		// Keep the cancel function so the timer is released on return.
		childCtx, cancel := context.WithTimeout(ctx, time.Nanosecond*time.Duration(ns))
		defer cancel()
		<-childCtx.Done()
		// Check whether the run was terminated or the timeout expired.
		select {
		case <-ctx.Done():
			// Run was terminated. Pass the message as an argument, not as the format string.
			lua.Errorf(l, "%s", ctx.Err().Error())
		default:
		}
		return 0
	})
	// Override time.sleep from goluago.
	l.PushGlobalTable()
	l.Field(-1, "package")
	l.Field(-1, "loaded")
	l.Field(-1, "goluago/time")
	l.Field(-4, "sleep")
	l.SetField(-2, "sleep")
	l.Pop(4) // Pop the tables.
}
Our most common blocking native functions also take an outer context.
That seems reasonable, and it is the approach I took for now. It seems easier to control everything a script does so that each operation can be timed out.
While we're at it, some mechanism to limit the VM's RAM usage would be appreciated as well.