Add run_on_ec2 flag to benchmark_single_table
amontanez24 opened this issue · 0 comments
Problem Description
As a user, I find that the benchmark function sometimes takes a while and uses a lot of compute resources, so it would be nice to be able to run it on a separate instance.
Expected behavior
- Add a run_on_ec2 boolean parameter to benchmark_single_table
- If it is True:
  - Launch an EC2 instance
  - Install sdgym on that instance
  - Run the job on that instance with the rest of the parameters
  - Store the output in an S3 folder based on the value of output_filepath
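The last step relies on output_filepath pointing at S3. Here is a minimal sketch of how the value could be validated and split into a bucket and key, assuming it is given in the s3://bucket/key form. The helper name parse_s3_path is hypothetical and not part of sdgym.

```python
from urllib.parse import urlparse


def parse_s3_path(output_filepath):
    """Split an s3://bucket/key path into (bucket, key).

    Hypothetical helper for illustration; assumes the s3:// URI form.
    Raises ValueError if the path does not look like an S3 location.
    """
    parsed = urlparse(output_filepath)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(
            f"output_filepath must be an S3 path such as "
            f"'s3://bucket/folder/results.csv', got {output_filepath!r}"
        )
    # netloc holds the bucket name; the path (minus the leading slash) is the key
    return parsed.netloc, parsed.path.lstrip("/")
```

Calling this before launching anything means a local path fails fast when run_on_ec2 is True, instead of after an instance has already been started.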
Technical Details
- output_filepath is required to be an S3 location if the flag is enabled. We should add a check for that.
- We should do this using boto3 directly. Its EC2 client has a function called run_instances that can take in a startup script (through the UserData parameter). In our case, we just want the script to pip install sdgym and then run the CLI with the parameters provided to the method above.
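A rough sketch of the launch step, under several assumptions: the function names are hypothetical, the AMI is assumed to ship with Python and pip, and the exact sdgym CLI invocation is left to the caller as a plain string, since this issue does not spell it out. AWS credentials and region are assumed to be configured in the environment.

```python
def build_user_data(cli_command):
    """Return a startup shell script that installs sdgym, then runs cli_command.

    Hypothetical helper; cli_command is a placeholder for whatever sdgym
    CLI call corresponds to the benchmark parameters.
    """
    lines = [
        "#!/bin/bash",
        "set -e",  # stop at the first failing command
        "pip install sdgym",
        cli_command,
    ]
    return "\n".join(lines) + "\n"


def launch_benchmark_instance(cli_command, image_id, instance_type):
    """Start one EC2 instance that runs the script and return its instance ID."""
    # Imported here so build_user_data can be used without boto3 installed.
    import boto3

    ec2 = boto3.client("ec2")
    response = ec2.run_instances(
        ImageId=image_id,            # assumption: an AMI with Python and pip
        InstanceType=instance_type,
        MinCount=1,
        MaxCount=1,
        UserData=build_user_data(cli_command),  # executed at first boot
    )
    return response["Instances"][0]["InstanceId"]
```

Keeping the script builder separate from the boto3 call makes the generated script easy to unit test without touching AWS.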