rstrivedi/Melting-Pot-Contest-2023

assert i == len(f), f AssertionError: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]


I installed Melting Pot manually and also modified the RLlib torch model file to apply the LSTM wrapper fix. I am getting an assertion error when running training with these arguments:
Running trails with the following arguments: Namespace(num_workers=8, num_gpus=0, local=False, no_tune=False, algo='ppo', framework='torch', exp='clean_up', seed=123, results_dir='./results', logging='INFO', wandb=False, downsample=True, as_test=False)
Failure # 1 (occurred at 2023-09-02_22-43-34)
ray::PPO.train() (pid=86686, ip=######, actor_id=b2a47515069c17b28793673b01000000, repr=PPO)
  File "/home/saidinesh/Desktop/Projects/Melting-Pot-Contest-2023/rllib-env/lib/python3.10/site-packages/ray/tune/trainable/trainable.py", line 375, in train
    raise skipped from exception_cause(skipped)
  File "/home/saidinesh/Desktop/Projects/Melting-Pot-Contest-2023/rllib-env/lib/python3.10/site-packages/ray/tune/trainable/trainable.py", line 372, in train
    result = self.step()
  File "/home/saidinesh/Desktop/Projects/Melting-Pot-Contest-2023/rllib-env/lib/python3.10/site-packages/ray/rllib/algorithms/algorithm.py", line 851, in step
    results, train_iter_ctx = self._run_one_training_iteration()
  File "/home/saidinesh/Desktop/Projects/Melting-Pot-Contest-2023/rllib-env/lib/python3.10/site-packages/ray/rllib/algorithms/algorithm.py", line 2835, in _run_one_training_iteration
    results = self.training_step()
  File "/home/saidinesh/Desktop/Projects/Melting-Pot-Contest-2023/rllib-env/lib/python3.10/site-packages/ray/rllib/algorithms/ppo/ppo.py", line 455, in training_step
    train_results = train_one_step(self, train_batch)
  File "/home/saidinesh/Desktop/Projects/Melting-Pot-Contest-2023/rllib-env/lib/python3.10/site-packages/ray/rllib/execution/train_ops.py", line 56, in train_one_step
    info = do_minibatch_sgd(
  File "/home/saidinesh/Desktop/Projects/Melting-Pot-Contest-2023/rllib-env/lib/python3.10/site-packages/ray/rllib/utils/sgd.py", line 129, in do_minibatch_sgd
    local_worker.learn_on_batch(
  File "/home/saidinesh/Desktop/Projects/Melting-Pot-Contest-2023/rllib-env/lib/python3.10/site-packages/ray/rllib/evaluation/rollout_worker.py", line 810, in learn_on_batch
    info_out[pid] = policy.learn_on_batch(batch)
  File "/home/saidinesh/Desktop/Projects/Melting-Pot-Contest-2023/rllib-env/lib/python3.10/site-packages/ray/rllib/utils/threading.py", line 24, in wrapper
    return func(self, *a, **k)
  File "/home/saidinesh/Desktop/Projects/Melting-Pot-Contest-2023/rllib-env/lib/python3.10/site-packages/ray/rllib/policy/torch_policy_v2.py", line 729, in learn_on_batch
    grads, fetches = self.compute_gradients(postprocessed_batch)
  File "/home/saidinesh/Desktop/Projects/Melting-Pot-Contest-2023/rllib-env/lib/python3.10/site-packages/ray/rllib/utils/threading.py", line 24, in wrapper
    return func(self, *a, **k)
  File "/home/saidinesh/Desktop/Projects/Melting-Pot-Contest-2023/rllib-env/lib/python3.10/site-packages/ray/rllib/policy/torch_policy_v2.py", line 929, in compute_gradients
    pad_batch_to_sequences_of_same_size(
  File "/home/saidinesh/Desktop/Projects/Melting-Pot-Contest-2023/rllib-env/lib/python3.10/site-packages/ray/rllib/policy/rnn_sequencing.py", line 155, in pad_batch_to_sequences_of_same_size
    feature_sequences, initial_states, seq_lens = chop_into_sequences(
  File "/home/saidinesh/Desktop/Projects/Melting-Pot-Contest-2023/rllib-env/lib/python3.10/site-packages/ray/rllib/policy/rnn_sequencing.py", line 387, in chop_into_sequences
    assert i == len(f), f
AssertionError: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]

Could you please run sh run_patches.sh in the root folder and let me know if this error still persists?
It looks like the patch I listed under tf is actually required for all cases.

Thanks.
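
For anyone landing here from the same traceback: as far as I can tell, the `assert i == len(f), f` check in `chop_into_sequences` fires when the sequence lengths computed for a batch do not add up to the number of rows in a feature column. The snippet below is only a simplified, pure-Python sketch of that bookkeeping, not RLlib's actual code. The function name `pad_feature` and its arguments are hypothetical and exist purely for illustration.

```python
def pad_feature(column, seq_lens, max_seq_len, pad_value=0.0):
    """Copy `column` into one slot of `max_seq_len` entries per sequence.

    Hypothetical helper that mimics the idea behind the failing assertion;
    it is not part of RLlib.
    """
    padded = [pad_value] * (len(seq_lens) * max_seq_len)
    seq_base = 0  # start index of the current sequence's slot
    i = 0         # number of rows of `column` consumed so far
    for seq_len in seq_lens:
        for offset in range(seq_len):
            padded[seq_base + offset] = column[i]
            i += 1
        seq_base += max_seq_len
    # Every row must belong to exactly one sequence; if the sequence
    # lengths cover fewer rows than the column holds, this fails.
    assert i == len(column), column
    return padded


# Lengths 2 + 1 cover all 3 rows, so the column is padded to 2 slots of 2.
print(pad_feature([1, 2, 3], [2, 1], max_seq_len=2))
```

In this sketch, passing a 32-row column together with sequence lengths that sum to less than 32 raises the same kind of `AssertionError`, with the column printed as the message. Whether that is the exact mismatch in this issue is an assumption on my part; the patch script mentioned above is the suggested fix.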