flow-project/flow

AssertionError when running train.py with multiagent_traffic_light_grid.py after replacing PPO with SAC.

2855239858 opened this issue · 7 comments

Bug Description

Hi, I am trying to do multi-agent learning on a large grid with a different algorithm, such as SAC. All my changes are as follows:

(1) Modify multiagent_traffic_light_grid.py (flow/examples/exp_configs/rl/multiagent/multiagent_traffic_light_grid.py):

from ray.rllib.agents.sac.sac_tf_policy import SACTFPolicy

def gen_policy():
    """Generate a policy in RLlib."""
    # obs_space and act_space are defined earlier in the same file.
    return SACTFPolicy, obs_space, act_space, {}
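
For context, the tuple returned by gen_policy() is what RLlib's multiagent API expects for each policy entry. A minimal sketch of how it is typically wired up (the dict and function names below are illustrative, not necessarily those used in the file):

# Hedged sketch: a single policy shared by all traffic-light agents; the
# (policy_cls, obs_space, act_space, config) tuple from gen_policy() goes
# directly into RLlib's multiagent policy dict.
POLICY_GRAPHS = {'shared': gen_policy()}

def policy_mapping_fn(agent_id):
    """Map every agent onto the one shared policy."""
    return 'shared'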

(2) Modify train.py (flow/examples/train.py):

  # alg_run = "PPO"
  alg_run = "SAC"

  agent_cls = get_agent_class(alg_run)
  config = deepcopy(agent_cls._default_config)
  config["num_workers"] = n_cpus
  config["train_batch_size"] = horizon * n_rollouts
  # config["gamma"] = 0.999  # discount rate
  # config["model"].update({"fcnet_hiddens": [32, 32, 32]})
  # config["use_gae"] = True
  # config["lambda"] = 0.97
  # config["kl_target"] = 0.02
  # config["num_sgd_iter"] = 10
  # config['clip_actions'] = False  # FIXME(ev) temporary ray bug
  # config["horizon"] = horizon

Bug Reproduce

Failure # 1 (occurred at 2020-07-31_16-26-31)
Traceback (most recent call last):
File "/home/ryc/anaconda3/envs/flow/lib/python3.6/site-packages/ray/tune/trial_runner.py", line 468, in _process_trial
result = self.trial_executor.fetch_result(trial)
File "/home/ryc/anaconda3/envs/flow/lib/python3.6/site-packages/ray/tune/ray_trial_executor.py", line 430, in fetch_result
result = ray.get(trial_future[0], DEFAULT_GET_TIMEOUT)
File "/home/ryc/anaconda3/envs/flow/lib/python3.6/site-packages/ray/worker.py", line 1474, in get
raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(AssertionError): ray::SAC.train() (pid=6958, ip=192.168.31.9)
File "python/ray/_raylet.pyx", line 407, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 442, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 445, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 446, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 400, in ray._raylet.execute_task.function_executor
File "/home/ryc/anaconda3/envs/flow/lib/python3.6/site-packages/ray/rllib/agents/trainer_template.py", line 90, in __init__
Trainer.__init__(self, config, env, logger_creator)
File "/home/ryc/anaconda3/envs/flow/lib/python3.6/site-packages/ray/rllib/agents/trainer.py", line 450, in __init__
super().__init__(config, logger_creator)
File "/home/ryc/anaconda3/envs/flow/lib/python3.6/site-packages/ray/tune/trainable.py", line 175, in __init__
self._setup(copy.deepcopy(self.config))
File "/home/ryc/anaconda3/envs/flow/lib/python3.6/site-packages/ray/rllib/agents/trainer.py", line 623, in _setup
self._init(self.config, self.env_creator)
File "/home/ryc/anaconda3/envs/flow/lib/python3.6/site-packages/ray/rllib/agents/trainer_template.py", line 115, in _init
self.config["num_workers"])
File "/home/ryc/anaconda3/envs/flow/lib/python3.6/site-packages/ray/rllib/agents/trainer.py", line 696, in _make_workers
logdir=self.logdir)
File "/home/ryc/anaconda3/envs/flow/lib/python3.6/site-packages/ray/rllib/evaluation/worker_set.py", line 59, in __init__
RolloutWorker, env_creator, policy, 0, self._local_config)
File "/home/ryc/anaconda3/envs/flow/lib/python3.6/site-packages/ray/rllib/evaluation/worker_set.py", line 282, in _make_worker
extra_python_environs=extra_python_environs)
File "/home/ryc/anaconda3/envs/flow/lib/python3.6/site-packages/ray/rllib/evaluation/rollout_worker.py", line 379, in __init__
self._build_policy_map(policy_dict, policy_config)
File "/home/ryc/anaconda3/envs/flow/lib/python3.6/site-packages/ray/rllib/evaluation/rollout_worker.py", line 937, in _build_policy_map
policy_map[name] = cls(obs_space, act_space, merged_conf)
File "/home/ryc/anaconda3/envs/flow/lib/python3.6/site-packages/ray/rllib/policy/tf_policy_template.py", line 143, in __init__
obs_include_prev_action_reward=obs_include_prev_action_reward)
File "/home/ryc/anaconda3/envs/flow/lib/python3.6/site-packages/ray/rllib/policy/dynamic_tf_policy.py", line 215, in __init__
action_dist = dist_class(dist_inputs, self.model)
File "/home/ryc/anaconda3/envs/flow/lib/python3.6/site-packages/ray/rllib/models/tf/tf_action_dist.py", line 277, in __init__
assert tfp is not None
AssertionError

Any progress on this?
I have the exact same issue in another project using Ray.

@tom-doerr No, I give up.

I had the same error and checked the code that raises it - it tries to import the tensorflow_probability module, which I didn't have on my system. Once I installed it, the error was gone.
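
For reference, the assertion at the bottom of the traceback sits behind a lazy import: RLlib loads tensorflow_probability through its try_import_tfp helper and leaves tfp as None when the package is missing. A minimal sketch of that pattern:

try:
    # RLlib does this via ray.rllib.utils.framework.try_import_tfp;
    # if the package is absent, tfp silently stays None ...
    import tensorflow_probability as tfp
except ImportError:
    tfp = None

# ... and SAC's squashed-Gaussian action distribution only notices later,
# inside tf_action_dist.py, with exactly this assertion:
assert tfp is not None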

No code modification is needed - just pip install tensorflow_probability to make that module available.

> No code modification is needed - just pip install tensorflow_probability to make that module available.

After pip install tensorflow_probability:
Memory usage on this node: 3.0/5.8 GiB
Using FIFO scheduling algorithm.
Resources requested: 3/3 CPUs, 0/0 GPUs, 0.0/1.51 GiB heap, 0.0/0.1 GiB objects
Result logdir: /home/host/ray_results/stabilizing_the_ring
Number of trials: 1 (1 RUNNING)
+--------------------------------------+----------+-------+
| Trial name | status | loc |
|--------------------------------------+----------+-------|
| SAC_WaveAttenuationPOEnv-v0_1b3b2a58 | RUNNING | |
+--------------------------------------+----------+-------+

(pid=4346) 2020-11-27 13:40:55,709 INFO trainer.py:371 -- Tip: set 'eager': true or the --eager flag to enable TensorFlow eager execution
2020-11-27 13:40:56,147 ERROR trial_runner.py:482 -- Error processing event.
Traceback (most recent call last):
File "/home/host/anaconda3/envs/haha/lib/python3.7/site-packages/ray/tune/trial_runner.py", line 426, in _process_trial
result = self.trial_executor.fetch_result(trial)
File "/home/host/anaconda3/envs/haha/lib/python3.7/site-packages/ray/tune/ray_trial_executor.py", line 378, in fetch_result
result = ray.get(trial_future[0], DEFAULT_GET_TIMEOUT)
File "/home/host/anaconda3/envs/haha/lib/python3.7/site-packages/ray/worker.py", line 1457, in get
raise value.as_instanceof_cause()
ray.exceptions.RayTaskError: ray::SAC.__init__() (pid=4346, ip=192.168.206.139)
File "python/ray/_raylet.pyx", line 626, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 633, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 634, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 636, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 619, in ray._raylet.execute_task.function_executor
File "/home/host/anaconda3/envs/haha/lib/python3.7/site-packages/ray/rllib/agents/trainer_template.py", line 90, in __init__
Trainer.__init__(self, config, env, logger_creator)
File "/home/host/anaconda3/envs/haha/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 398, in __init__
Trainable.__init__(self, config, logger_creator)
File "/home/host/anaconda3/envs/haha/lib/python3.7/site-packages/ray/tune/trainable.py", line 96, in __init__
self._setup(copy.deepcopy(self.config))
File "/home/host/anaconda3/envs/haha/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 504, in _setup
self._allow_unknown_subkeys)
File "/home/host/anaconda3/envs/haha/lib/python3.7/site-packages/ray/tune/util.py", line 167, in deep_update
raise Exception("Unknown config parameter {} ".format(k))
Exception: Unknown config parameter use_gae
2020-11-27 13:40:56,157 INFO trial_runner.py:530 -- Trial SAC_WaveAttenuationPOEnv-v0_1b3b2a58: Attempting to recover trial state from last checkpoint.
2020-11-27 13:40:56,162 INFO ray_trial_executor.py:121 -- Trial SAC_WaveAttenuationPOEnv-v0_1b3b2a58: Setting up new remote runner.
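
For what it's worth, this second failure is unrelated to tfp: Ray's config merge (deep_update in tune/util.py, visible in the traceback) rejects any key missing from SAC's default config, and use_gae is PPO-only. A hedged sketch of the fix, assuming the PPO lines in train.py were left uncommented:

# Hedged sketch: strip PPO-only keys before passing the config to the SAC
# trainer; deep_update raises "Unknown config parameter ..." for any key
# that SAC's default config does not define.
for ppo_only_key in ("use_gae", "lambda", "kl_target", "num_sgd_iter"):
    config.pop(ppo_only_key, None)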

I am getting the same error when using my custom environment, where I predict some values with a trained TensorFlow model. The CartPole environment does not lead to the error. tfp is installed!

I ran into this error and pip install tensorflow_probability didn't help - it turned out I was running TF 2.3.0, while the version of tfp that pip installed required tf>=2.4.0. You can see this by trying to import tfp manually:
import tensorflow_probability as tfp

To fix it, I went to the PyPI tensorflow-probability release history (https://pypi.org/project/tensorflow-probability/#history) and tried progressively older versions until I found one that could be imported - after that, I was able to run SAC with no errors.

The version that worked for me was 0.11.1:

pip install tensorflow-probability==0.11.1
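
For completeness, a quick way to confirm the pairing afterwards (tfp 0.11.x is, as far as I know, the release series built against TF 2.3.x):

# Both imports should now succeed together; a mismatched tf/tfp pair
# fails right here instead of deep inside RLlib's action distributions.
import tensorflow as tf
import tensorflow_probability as tfp

print("tf:", tf.__version__, "| tfp:", tfp.__version__)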