`dm_control` `cheetah` `run` training stops suddenly
letusfly85 opened this issue · 3 comments
letusfly85 commented
Hi, I'm trying to run the dm_control tasks `walker walk`, `walker run`, and `cheetah run`.
The first two, `walker walk` and `walker run`, work fine, but `cheetah run` fails during training, as shown below.
Failure message
Number of errored trials: 1
+--------------------------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Trial name | # failures | error file |
|--------------------------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| id=43fbb_00000-seed=8373 | 4 | /home/acc12468eh/ray_results/dm_control/cheetah/run/2020-09-14T19-51-36-sl-sac/id=43fbb_00000-seed=8373_0_hidden_layer_sizes=(256, 256),preprocessors=({'pixels': {'class_name': 'convnet_preprocessor', 'config'_2020-09-14_19-51-38hsvhe5yt/error.txt |
+--------------------------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Traceback (most recent call last):
File "/home/acc12468eh/miniconda3/envs/softlearning/bin/softlearning", line 11, in <module>
load_entry_point('softlearning', 'console_scripts', 'softlearning')()
File "/home/acc12468eh/softlearning/softlearning/scripts/console_scripts.py", line 207, in main
return cli()
File "/home/acc12468eh/miniconda3/envs/softlearning/lib/python3.7/site-packages/click/core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "/home/acc12468eh/miniconda3/envs/softlearning/lib/python3.7/site-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File "/home/acc12468eh/miniconda3/envs/softlearning/lib/python3.7/site-packages/click/core.py", line 1137, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/acc12468eh/miniconda3/envs/softlearning/lib/python3.7/site-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/acc12468eh/miniconda3/envs/softlearning/lib/python3.7/site-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "/home/acc12468eh/softlearning/softlearning/scripts/console_scripts.py", line 73, in run_example_local_cmd
return run_example_local(example_module_name, example_argv)
File "/home/acc12468eh/softlearning/examples/instrument.py", line 244, in run_example_local
reuse_actors=True)
File "/home/acc12468eh/miniconda3/envs/softlearning/lib/python3.7/site-packages/ray/tune/tune.py", line 356, in run
raise TuneError("Trials did not complete", incomplete_trials)
ray.tune.error.TuneError: ('Trials did not complete', [id=43fbb_00000-seed=8373])
When I `cat` the `error.txt`, this is what I get:
(base) [acc12468eh@es2 ~]$ cat /home/acc12468eh/ray_results/dm_control/cheetah/run/2020-09-14T19-51-36-sl-sac/id=43fbb_00000-seed=8373_0_hidden_layer_sizes=(256, 256),preprocessors=({'pixels': {'class_name': 'convnet_preprocessor', 'config'_2020-09-14_19-51-38hsvhe5yt/error.txt
Content of error.txt
-bash: unexpected token `('
Thank you.
hartikainen commented
I think what you're actually seeing is not the contents of `error.txt` but rather an error from bash: the unquoted path contains parentheses, which bash tries to interpret as shell syntax. Can you wrap the `cat` argument in quotes? I.e.:
cat "/home/acc12468eh/ray_results/dm_control/cheetah/run/2020-09-14T19-51-36-sl-sac/id=43fbb_00000-seed=8373_0_hidden_layer_sizes=(256, 256),preprocessors=({'pixels': {'class_name': 'convnet_preprocessor', 'config'_2020-09-14_19-51-38hsvhe5yt/error.txt"
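If quoting the path by hand gets fiddly, Python's standard `shlex.quote` can build a shell-safe version of it. This is only a minimal sketch: the path below is a shortened, hypothetical stand-in for the real trial directory, not the actual one.

```python
import shlex

# Hypothetical, shortened stand-in for a Ray Tune trial directory whose name
# contains shell metacharacters (parentheses, braces, quotes, spaces).
path = "ray_results/id=0_hidden_layer_sizes=(256, 256),preprocessors=({'pixels': {'class_name'/error.txt"

# shlex.quote wraps the string so a POSIX shell treats it as one literal word.
command = "cat " + shlex.quote(path)
print(command)
```

The printed command can be pasted into bash as-is. Tab completion in the shell gives an equivalent result by backslash-escaping each special character.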
letusfly85 commented
Oh, sorry. This is the correct `error.txt` content.
(base) [acc12468eh@es2 ~]$ cat /home/acc12468eh/ray_results/dm_control/cheetah/run/2020-09-14T19-51-36-sl-sac/id\=43fbb_00000-seed\=8373_0_hidden_layer_sizes\=\(256\,\ 256\)\,preprocessors\=\(\{\'pixels\'\:\ \{\'class_name\'\:\ \'convnet_preprocessor\'\,\ \'config\'_2020-09-14_19-51-38hsvhe5yt/error.txt
Failure # 1 (occurred at 2020-09-14_19-51-58)
Traceback (most recent call last):
File "/home/acc12468eh/miniconda3/envs/softlearning/lib/python3.7/site-packages/ray/tune/trial_runner.py", line 471, in _process_trial
result = self.trial_executor.fetch_result(trial)
File "/home/acc12468eh/miniconda3/envs/softlearning/lib/python3.7/site-packages/ray/tune/ray_trial_executor.py", line 430, in fetch_result
result = ray.get(trial_future[0], DEFAULT_GET_TIMEOUT)
File "/home/acc12468eh/miniconda3/envs/softlearning/lib/python3.7/site-packages/ray/worker.py", line 1540, in get
raise value
ray.exceptions.RayActorError: The actor died unexpectedly before finishing this task.
Failure # 2 (occurred at 2020-09-14_19-52-06)
Traceback (most recent call last):
File "/home/acc12468eh/miniconda3/envs/softlearning/lib/python3.7/site-packages/ray/tune/trial_runner.py", line 471, in _process_trial
result = self.trial_executor.fetch_result(trial)
File "/home/acc12468eh/miniconda3/envs/softlearning/lib/python3.7/site-packages/ray/tune/ray_trial_executor.py", line 430, in fetch_result
result = ray.get(trial_future[0], DEFAULT_GET_TIMEOUT)
File "/home/acc12468eh/miniconda3/envs/softlearning/lib/python3.7/site-packages/ray/worker.py", line 1540, in get
raise value
ray.exceptions.RayActorError: The actor died unexpectedly before finishing this task.
Failure # 3 (occurred at 2020-09-14_19-52-14)
Traceback (most recent call last):
File "/home/acc12468eh/miniconda3/envs/softlearning/lib/python3.7/site-packages/ray/tune/trial_runner.py", line 471, in _process_trial
result = self.trial_executor.fetch_result(trial)
File "/home/acc12468eh/miniconda3/envs/softlearning/lib/python3.7/site-packages/ray/tune/ray_trial_executor.py", line 430, in fetch_result
result = ray.get(trial_future[0], DEFAULT_GET_TIMEOUT)
File "/home/acc12468eh/miniconda3/envs/softlearning/lib/python3.7/site-packages/ray/worker.py", line 1540, in get
raise value
ray.exceptions.RayActorError: The actor died unexpectedly before finishing this task.
Failure # 4 (occurred at 2020-09-14_19-52-23)
Traceback (most recent call last):
File "/home/acc12468eh/miniconda3/envs/softlearning/lib/python3.7/site-packages/ray/tune/trial_runner.py", line 471, in _process_trial
result = self.trial_executor.fetch_result(trial)
File "/home/acc12468eh/miniconda3/envs/softlearning/lib/python3.7/site-packages/ray/tune/ray_trial_executor.py", line 430, in fetch_result
result = ray.get(trial_future[0], DEFAULT_GET_TIMEOUT)
File "/home/acc12468eh/miniconda3/envs/softlearning/lib/python3.7/site-packages/ray/worker.py", line 1540, in get
raise value
ray.exceptions.RayActorError: The actor died unexpectedly before finishing this task.
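Since the four failure blocks look identical, a short script can confirm that instead of comparing them by eye. This is a rough sketch that assumes the `Failure # N (occurred at ...)` header format shown above; `summarize_failures` is a hypothetical helper, not part of Ray or softlearning, and `sample` is an abbreviated excerpt of the log rather than the full file.

```python
import re
from collections import Counter

# Header format copied from the error.txt output shown above.
FAILURE_HEADER = re.compile(r"^Failure # (\d+) \(occurred at (.+)\)$")


def summarize_failures(text):
    """Return one (failure_number, timestamp, last_line) tuple per failure."""
    failures = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = FAILURE_HEADER.match(line)
        if match:
            current = [int(match.group(1)), match.group(2), ""]
            failures.append(current)
        elif current is not None:
            # Keep overwriting so the last non-empty line of each block wins.
            current[2] = line
    return [tuple(failure) for failure in failures]


# Abbreviated excerpt of the log above (traceback frames omitted).
sample = """Failure # 1 (occurred at 2020-09-14_19-51-58)
Traceback (most recent call last):
raise value
ray.exceptions.RayActorError: The actor died unexpectedly before finishing this task.
Failure # 2 (occurred at 2020-09-14_19-52-06)
Traceback (most recent call last):
raise value
ray.exceptions.RayActorError: The actor died unexpectedly before finishing this task.
"""

summary = summarize_failures(sample)
for number, when, last_line in summary:
    print(number, when, last_line)

# Count how many failures ended with each distinct final line.
counts = Counter(last_line for _, _, last_line in summary)
print(counts)
```

Every failure in this log ends with the same `RayActorError`, so `error.txt` only records that the worker process died, not why it died.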
h8907283 commented
Yes, `walker run` is okay, but not `cheetah run`.