ahq1993/inverse_rl

AssertionError: Nesting violated for default stack of <class 'tensorflow.python.client.session.Session'> objects

wangjw55 opened this issue · 3 comments

When I run ant_data_collect.py, four processes generate data into four folders simultaneously. As soon as one of the processes has produced 2000 sets of data (itr #1999 in the log below), the program is interrupted with the following error:

2019-10-31 21:22:16.830585 EDT | itr #1999 | Saving snapshot...
0% [########################## ] 100% | ETA: 00:00:01
2019-10-31 21:22:17.182188 EDT | itr #1999 | Saved
2019-10-31 21:22:17.183297 EDT | ----------------------- --------------
2019-10-31 21:22:17.183448 EDT | PolicyExecTime 0.370542
2019-10-31 21:22:17.183600 EDT | EnvExecTime 13.9964
2019-10-31 21:22:17.183725 EDT | ProcessExecTime 0.203515
2019-10-31 21:22:17.183844 EDT | Iteration 1999
2019-10-31 21:22:17.183962 EDT | AverageDiscountedReturn 172.879
2019-10-31 21:22:17.184080 EDT | AverageReturn 1034.59
2019-10-31 21:22:17.184197 EDT | ExplainedVariance 0.978998
2019-10-31 21:22:17.184314 EDT | NumTrajs 40
2019-10-31 21:22:17.184431 EDT | Entropy -0.779469
2019-10-31 21:22:17.184549 EDT | Perplexity 0.458649
2019-10-31 21:22:17.184672 EDT | StdReturn 26.7545
2019-10-31 21:22:17.184789 EDT | MaxReturn 1081.51
2019-10-31 21:22:17.184906 EDT | MinReturn 949.139
2019-10-31 21:22:17.185023 EDT | AvgRewardFwd 2.17163
2019-10-31 21:22:17.185139 EDT | AvgRewardCtrl -0.101136
2019-10-31 21:22:17.185255 EDT | AvgRewardContact -0.00131123
2019-10-31 21:22:17.185371 EDT | AvgRewardFlipped 0
2019-10-31 21:22:17.185488 EDT | AveragePolicyStd 0.220438
2019-10-31 21:22:17.185605 EDT | LossBefore 0.077947
2019-10-31 21:22:17.185721 EDT | LossAfter 0.0633926
2019-10-31 21:22:17.185839 EDT | MeanKLBefore 0
2019-10-31 21:22:17.185955 EDT | MeanKL 0.00640616
2019-10-31 21:22:17.186072 EDT | dLoss 0.0145543
2019-10-31 21:22:17.186189 EDT | Time 30710.5
2019-10-31 21:22:17.186305 EDT | ItrTime 15.5186
2019-10-31 21:22:17.186433 EDT | ----------------------- --------------
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/wsco/anaconda3/envs/irl_wjw/lib/python3.5/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/home/wsco/anaconda3/envs/irl_wjw/lib/python3.5/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "/home/wsco/inverse_rl/inverse_rl/utils/hyper_sweep.py", line 47, in kwargs_wrapper
return method(**args)
File "ant_data_collect.py", line 34, in main
algo.train()
File "/home/wsco/anaconda3/envs/irl_wjw/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1592, in __exit__
exec_tb)
File "/home/wsco/anaconda3/envs/irl_wjw/lib/python3.5/contextlib.py", line 66, in __exit__
next(self.gen)
File "/home/wsco/anaconda3/envs/irl_wjw/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 5467, in get_controller
type(default))
AssertionError: Nesting violated for default stack of <class 'tensorflow.python.client.session.Session'> objects
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "ant_data_collect.py", line 41, in <module>
run_sweep_parallel(main, params_dict, repeat=4)
File "/home/wsco/inverse_rl/inverse_rl/utils/hyper_sweep.py", line 57, in run_sweep_parallel
pool.map(kwargs_wrapper, exp_args)
File "/home/wsco/anaconda3/envs/irl_wjw/lib/python3.5/multiprocessing/pool.py", line 260, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/home/wsco/anaconda3/envs/irl_wjw/lib/python3.5/multiprocessing/pool.py", line 608, in get
raise self._value
AssertionError: Nesting violated for default stack of <class 'tensorflow.python.client.session.Session'> objects

Could you tell me how to fix the problem? Thanks.
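
For anyone unfamiliar with the message: the traceback ends in get_controller in ops.py, which complains that the session being exited is not the one on top of the default-session stack. Below is a minimal pure-Python sketch of that kind of last-in, first-out check. It is only an illustration under assumptions: DefaultStack, FakeSession, exit_in_order and exit_out_of_order are hypothetical names, no TensorFlow is imported, and it does not claim to show what goes wrong in ant_data_collect.py.

```python
from contextlib import contextmanager


class DefaultStack:
    """Hypothetical stand-in for a stack of 'default' objects with enforced nesting."""

    def __init__(self):
        self.stack = []

    @contextmanager
    def get_controller(self, default):
        # Entering the context pushes the object onto the stack.
        self.stack.append(default)
        try:
            yield default
        finally:
            # On exit, the object being released must be the top of the stack.
            if self.stack[-1] is not default:
                raise AssertionError(
                    "Nesting violated for default stack of %s objects"
                    % type(default))
            self.stack.pop()


class FakeSession:
    """Placeholder object standing in for a session (assumption, not TensorFlow)."""


def exit_in_order():
    # Properly nested 'with' blocks exit in reverse order of entry.
    stack = DefaultStack()
    with stack.get_controller(FakeSession()):
        with stack.get_controller(FakeSession()):
            pass
    return len(stack.stack)


def exit_out_of_order():
    # Enter two contexts by hand, then exit the outer one first.
    stack = DefaultStack()
    outer = stack.get_controller(FakeSession())
    inner = stack.get_controller(FakeSession())
    outer.__enter__()
    inner.__enter__()
    try:
        outer.__exit__(None, None, None)  # inner is still on top of the stack
    except AssertionError as err:
        return str(err)
    return None
```

In this sketch, exit_in_order() leaves the stack empty, while exit_out_of_order() returns the same kind of "Nesting violated" message, because the outer context is closed while the inner one is still on top.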

Same problem. What is the recommended TensorFlow version? @ahq1993
Thank you.

tensorflow_gpu-1.0.1

I have the same problem. Could you fix it? @ahq1993