openai/human-eval

Why use ThreadPoolExecutor when the GIL is in the background?

johnmclain opened this issue · 1 comment

During evaluation, the code first uses a ThreadPoolExecutor, and each thread then uses the multiprocessing package. Why not use a ProcessPoolExecutor from the start? Was this done for performance reasons?

@johnmclain

The ThreadPoolExecutor is used to run the validation tasks for multiple generated code samples concurrently. Each validation task is in turn run in its own process via multiprocessing, which contains and isolates it. This avoids conflicts between samples and also acts as a security measure, since the code being executed and validated is untrusted.
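A minimal sketch of that pattern, not the repository's actual code: the names `check_sample`, `_run_sample`, and `evaluate` are hypothetical, the result strings are made up for illustration, and the explicit "fork" start method assumes a POSIX system.

```python
import multiprocessing
from concurrent.futures import ThreadPoolExecutor

# Assumption: POSIX system where the "fork" start method is available.
_ctx = multiprocessing.get_context("fork")


def _run_sample(source, conn):
    """Child process body: execute one untrusted snippet and report back."""
    try:
        exec(source, {})
        conn.send("passed")
    except BaseException as exc:
        conn.send(f"failed: {type(exc).__name__}")
    finally:
        conn.close()


def check_sample(source, timeout=2.0):
    """Run one sample in a fresh process and kill it if it exceeds the timeout."""
    recv_conn, send_conn = _ctx.Pipe(duplex=False)
    proc = _ctx.Process(target=_run_sample, args=(source, send_conn))
    proc.start()
    send_conn.close()   # the parent keeps only the reading end
    proc.join(timeout)  # the calling thread just waits here
    try:
        if proc.is_alive():
            proc.kill()
            proc.join()
            return "timed out"
        return recv_conn.recv() if recv_conn.poll() else "failed: no result"
    finally:
        recv_conn.close()


def evaluate(samples, n_workers=4, timeout=2.0):
    """Threads only coordinate; each sample runs in its own process."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(lambda s: check_sample(s, timeout), samples))
```

Because every sample gets a fresh process, a hanging sample can simply be killed without affecting the others.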

Now, why not use a process pool directly?

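Threads are lightweight and scale well with the available resources. Each thread mostly waits on its child process, so the GIL is not a bottleneck here.
Using a process pool directly is a slightly less secure way to validate unsafe code, since pool workers are normally reused across tasks.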

If the code is known to be safe, using a process pool directly makes more sense.
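As a rough illustration of the isolation point, the sketch below shows that a ProcessPoolExecutor worker is reused across tasks by default, so anything one sample does to the worker's state would still be there for the next sample. The helper name `worker_pids` is hypothetical, and the "fork" start method again assumes a POSIX system.

```python
import multiprocessing
import os
from concurrent.futures import ProcessPoolExecutor


def worker_pids(n_tasks=3):
    """Return the PID of the worker process that handled each task."""
    # Assumption: POSIX system where the "fork" start method is available.
    ctx = multiprocessing.get_context("fork")
    with ProcessPoolExecutor(max_workers=1, mp_context=ctx) as pool:
        # Tasks are submitted one after another to a single-worker pool.
        return [pool.submit(os.getpid).result() for _ in range(n_tasks)]
```

With a single worker, every task reports the same PID, whereas starting a new multiprocessing.Process per sample gives each one a clean process.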