symflower/eval-dev-quality

Sandbox execution

Closed · 1 comment

We need a common helper that sandboxes every execution we perform. Right now, an LLM could generate a call that removes all of your files, and we would simply execute it.
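As a rough illustration of what such a helper could look like, here is a minimal sketch that wraps any command in a throwaway Docker container. Everything in it is an assumption made for illustration and not part of this project: the function names `sandboxed_command` and `run_sandboxed`, the choice of Docker, the image name, and the resource limits.

```python
import subprocess
from pathlib import Path
from typing import Sequence


def sandboxed_command(
    command: Sequence[str],
    workdir: Path,
    image: str = "python:3.12-slim",  # assumed image, pick one with the needed toolchain
) -> list[str]:
    """Build a `docker run` invocation that executes `command` in isolation.

    Hypothetical helper: only `workdir` is exposed to the container, so a
    destructive command cannot reach the rest of the host filesystem.
    """
    host_dir = Path(workdir).resolve()
    return [
        "docker", "run",
        "--rm",                                # remove the container when it exits
        "--network", "none",                   # no network access
        "--read-only",                         # container root filesystem is read-only
        "--memory", "512m",                    # illustrative memory cap
        "--pids-limit", "128",                 # illustrative process cap
        "--volume", f"{host_dir}:/workspace",  # the only writable host directory
        "--workdir", "/workspace",
        image,
        *command,
    ]


def run_sandboxed(
    command: Sequence[str],
    workdir: Path,
    timeout_seconds: float = 60.0,
) -> subprocess.CompletedProcess:
    """Run `command` through the sandbox wrapper and capture its output.

    The timeout stops the local `docker` client; a real helper would also
    need to make sure the container itself is stopped.
    """
    return subprocess.run(
        sandboxed_command(command, workdir),
        capture_output=True,
        text=True,
        timeout=timeout_seconds,
        check=False,
    )
```

If every execution went through one wrapper like `run_sandboxed(["go", "test", "./..."], Path("some/repository"))`, a generated remove-all-your-files call could at worst wipe the mounted working directory. The mounted directory stays writable, so it should be a disposable copy.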

Closing in favor of #198