enforce_eager=True not being respected in test scripts
Problem
When running test scripts with enforce_eager=True specified, the logs still show enforce_eager=False and CUDA graphs are still being compiled. This makes startup slower and lengthens the feedback cycle during testing.
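For context, enforce_eager is a standard vLLM engine argument; when it actually reaches the engine, vLLM skips CUDA graph capture and runs in eager mode. A minimal standalone sketch of the intended effect, assuming the backend constructs a vLLM engine under the hood (the model name is just a placeholder):

```python
from vllm import LLM

# enforce_eager=True makes vLLM run in eager PyTorch mode and skip
# CUDA graph capture: slower per-token inference, much faster startup.
llm = LLM(model="Qwen/Qwen2.5-0.5B-Instruct", enforce_eager=True)
```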
Reproduction
In the test script src/art/test/test_step_skipping.py, we're passing enforce_eager=True:
```python
# Register the model
await model.register(
    backend,
    _openai_client_config={"engine_args": {"enforce_eager": True}},
)
```

However, when running the script, the logs show that enforce_eager is still False and CUDA graphs are being compiled.
Expected Behavior
When enforce_eager=True is passed in the configuration, it should:
- Skip CUDA graph compilation
- Start up faster
- Provide quicker feedback during testing
Impact
This issue hurts development velocity: tests take longer than necessary to start and to provide feedback.
Environment
- The issue can be reproduced by running `./src/art/test/test_step_skipping.py`
- The script is configured to run on a GPU with `sky launch`
@corbt `_openai_client_config.engine_args` does not initialize the engine, so this is unsurprising. Instead, engine args have to be specified via `TrainableModel._internal_config`. The reason we also accept `engine_args` here is that the OpenAI-compatible API server inspects some of these arguments. This API is really sub-optimal and is probably best solved by unifying all args under `register`, or potentially under a new API, something like `deploy`.
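A rough sketch of the workaround; the exact shape of `_internal_config` and the model fields below are assumptions, so check the `TrainableModel` definition in your version of ART:

```python
import art

# Hedged sketch: pass engine args via _internal_config so they are
# applied when the engine is initialized. The nested dict shape, the
# model/project names, and the base model are placeholder assumptions;
# `backend` is assumed to be an already-initialized ART backend.
model = art.TrainableModel(
    name="test-step-skipping",
    project="art-tests",
    base_model="Qwen/Qwen2.5-7B-Instruct",
    _internal_config={"engine_args": {"enforce_eager": True}},
)
await model.register(backend)
```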
Thanks, that's helpful. Yes, definitely in favor of simplifying the API here!