Tired of launching commands one by one, each time waiting for the previous one to finish? Your project (maybe a paper) has a lot of commands (maybe experiments) to run, and you want to run them concurrently without any idle time.
Hypo is a tool that helps you run commands concurrently so no time is wasted. It runs your complex commands at the system level, and you can easily organize them the way you like.
You can install it with `pip install hypo-run`.
# In the file folder_a/index.py
from hypo import run, Run

@run()
def trial():
    return [Run(command="echo this_is_a_very_complex_prompt_to_start_your_experiment_in_bash", name="indicate your task")]

@run(max_workers=10)  # run 10 tasks concurrently
def trial():
    return [Run(command=f"echo {i}", name="echo 0 to 9") for i in range(10)]
Then you can start your tasks in parallel.
# hypo <dir_name> <function_name>
hypo folder_a index.trial  # start the method trial: create the tasks, then run them
# if the file is named `index.py`, you can omit the file name
hypo folder_a trial  # start the method trial: create the tasks, then run them
# if you are already inside folder_a
hypo trial  # start the method trial: create the tasks, then run them
Or call the function you need directly:
trial()
# Then, `python folder_a/index.py`
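For example, a minimal `folder_a/index.py` for this direct-call style could look like the following sketch (the task names and echo commands are only placeholders):

# folder_a/index.py
from hypo import run, Run

@run(max_workers=2)  # run up to 2 tasks at a time
def trial():
    # each Run describes one shell command to execute
    return [Run(command=f"echo {i}", name=f"echo-{i}") for i in range(4)]

trial()  # calling the decorated function creates the tasks and runs them

Running `python folder_a/index.py` then executes the four echo commands, two at a time.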
After all experiments have run, you can check the task summary in the output folder, in a file named `summary.json`:
[
    {
        "name": "A very complex task",
        "command": "echo this_is_a_very_complex_prompt_to_start_your_experiment_in_bash",
        "cwd": "/data/Hypothesis/hypo",
        "output": "/data/Hypothesis/hypo/a",
        "datetime": "2024-06-27__18-35-21"
    },
    {
        "name": "Git Version",
        "command": "git rev-parse HEAD",
        "cwd": "/data/Hypothesis/hypo",
        "output": "/data/Hypothesis/hypo",
        "datetime": "2024-06-27__18-35-21"
    }
]
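If you want to inspect the summaries programmatically, a small script like this works; it assumes the field names shown above and that `summary.json` sits inside your output folder (adjust the path to your own setup):

import json
from pathlib import Path

# adjust to your own output folder; this path is just the one from the example above
summary_path = Path("/data/Hypothesis/hypo/a/summary.json")

entries = json.loads(summary_path.read_text())
for entry in entries:
    print(f'{entry["datetime"]}  {entry["name"]}: {entry["command"]} (cwd={entry["cwd"]})')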
You may have a lot of CUDA tasks to run. Run them concurrently! Assume each GPU can host 2 tasks at the same time: with the 4 GPUs listed below and `max_workers=8`, 2 tasks run on each GPU at once. The `cuda_visible_devices` values are passed to each process through the `CUDA_VISIBLE_DEVICES` environment variable.
from hypo import run, Run, run_git_checkout
from itertools import product

@run(cuda_visible_devices={0, 1, 6, 7}, max_workers=8)
def compare():
    cmd_template = "python main.py --category {clz}"
    clzs = [
        "table",
        "sofa",
        "bench",
        "watercraft",
        # ... a really long list
    ]
    tasks = []
    for clz, method in product(clzs, ["my_method", "baseline", "sota"]):  # the methods you want to compare
        task = [
            run_git_checkout(method),  # branch name; git checkout to the branch you want to run
            Run(
                command=cmd_template.format(clz=clz),
                name=f"{method}-{clz}",
                cwd="/path/to/your/project",
                output="/summary.json/will/be/generated/here",
            ),
        ]
        tasks.append(task)
    return tasks
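If you want to double-check which GPU each task received, one simple option (a sketch, not a built-in Hypo feature) is a task whose command just prints the forwarded variable:

from hypo import run, Run

@run(cuda_visible_devices={0, 1}, max_workers=4)  # 2 GPUs, 2 tasks per GPU
def check_gpus():
    # printenv prints the CUDA_VISIBLE_DEVICES value each process received
    return [Run(command="printenv CUDA_VISIBLE_DEVICES", name=f"gpu-check-{i}") for i in range(4)]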
You do not need to worry about `run_git_checkout` pulling the code out from under the running script: Python loads all the files into memory at start-up, so your code will not go wrong.
Some pre-defined `Run` helpers are available. For example, you can record the git version using `run_git_status`.
from hypo import run, Run, run_git_status

@run()
def method():
    return [
        Run(name="a", cwd=".", output="./a", command="echo $cwd"),
        run_git_status(),
    ]
If you want to run a command at a specific git revision, use `run_git_checkout`.
from hypo import run, run_git_checkout, Run

@run()
def test_run_git_checkout():
    return [
        [
            run_git_checkout("a6bb0c3"),  # commit name
            Run(command="python main.py", output=".", cwd=".", name="run1"),
        ],
        [
            run_git_checkout("main"),  # branch name
            Run(command="python main.py", output=".", cwd=".", name="run2"),
        ],
    ]
A progress bar is shown in the terminal, so you can easily check the progress of your tasks. The progress bar does not block the logs printed by your processes.
Under the hood:
- Each command is executed with plain `subprocess.run()`.
- A `ThreadPoolExecutor` runs the commands concurrently.
- `Run` objects are taken from the queue and submitted to the `ThreadPoolExecutor`, up to `max_workers` at a time.
- To avoid concurrent `git` conflicts, a `GlobalResources` threading lock is created.
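A rough sketch of that model (my own simplification, not Hypo's actual source: a plain `threading.Lock` stands in for `GlobalResources`, and dicts stand in for `Run` objects):

import subprocess
import threading
from concurrent.futures import ThreadPoolExecutor
from queue import Queue

git_lock = threading.Lock()  # stands in for GlobalResources: one git operation at a time

def execute(item):
    # each command is run with plain subprocess.run(), as described above
    if item["command"].startswith("git "):
        with git_lock:  # serialize git checkouts so they cannot conflict
            return subprocess.run(item["command"], shell=True, cwd=item.get("cwd", "."))
    return subprocess.run(item["command"], shell=True, cwd=item.get("cwd", "."))

def run_all(items, max_workers=4):
    queue = Queue()
    for item in items:            # queue up every Run-like dict
        queue.put(item)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = []
        while not queue.empty():  # hand each queued item to the pool
            futures.append(pool.submit(execute, queue.get()))
        for future in futures:
            future.result()       # propagate any exception

run_all([{"command": "echo hello"}, {"command": "echo world"}], max_workers=2)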
Enjoy it and make your life easier.