tambetm/simple_dqn

nvis.sh: pycuda._driver.LogicError: cuFuncSetBlockShape failed: an illegal memory access was encountered

mw66 opened this issue · 3 comments

mw66 commented

I changed some network parameters (e.g. the conv size). Training works fine, but nvis.sh errors out:

2016-05-01 22:36:06,725 Collected 50 game states
2016-05-01 22:36:06,766 DataIterator class has been deprecated and renamed"ArrayIterator" please use that name.
2016-05-01 22:36:06,799 Guided Bprop Visualization of 4 feature maps per layer:
Visualization [Find Max Act Imgs |██████████ | 1/2 batches, 0.15s]
Visualization [Compute Guided Bprop |████████ | 4/9 layers, 0.00s]PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFree failed: an illegal memory access was encountered
Traceback (most recent call last):
File "src/main.py", line 125, in
visualize(net.model, states, args.visualization_filters, args.visualization_file)

....

/neon/.venv/local/lib/python2.7/site-packages/pycuda/driver.py", line 495, in function_prepared_async_call
func._set_block_shape(*block)
pycuda._driver.LogicError: cuFuncSetBlockShape failed: an illegal memory access was encountered

Where should I look? Is there any parameter hard-coded in the nvis dir that has to match the trained model?

Thanks.
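A possible first debugging step, sketched under assumptions: CUDA reports kernel errors asynchronously, so the call named in the traceback (cuFuncSetBlockShape) may only be where an earlier failure surfaced. Setting the CUDA_LAUNCH_BLOCKING=1 environment variable makes kernel launches synchronous, which usually moves the error closer to the kernel that caused it. The helper names below (blocking_env, run_blocking) are hypothetical and not part of simple_dqn or neon.

```python
import os
import subprocess


def blocking_env(base=None):
    """Return a copy of the environment with synchronous CUDA launches enabled."""
    env = dict(os.environ if base is None else base)
    # CUDA_LAUNCH_BLOCKING=1 makes each kernel launch wait for completion,
    # so an error is reported at the launch that triggered it.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def run_blocking(cmd):
    """Run cmd (a list of strings) with synchronous launches; return its exit code."""
    return subprocess.call(cmd, env=blocking_env())
```

Call run_blocking with the same nvis.sh command line that produced the error, passed as a list of strings, and compare the new traceback with the one above. Expect the run to be slower, since launches no longer overlap.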

Maybe the code in src/nvis needs an update from https://github.com/NervanaSystems/neon/tree/master/neon/visualizations? Did it work with the original convolution size?

mw66 commented

It works with the original convolution size.

And when I change my parameters again, this error goes away.

So certain values trigger this issue.
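Since only some values trigger the error, one sanity check is to trace the feature-map size through the conv stack for each configuration and see whether the failing ones share a pattern, such as a filter that does not tile the input evenly. This is a minimal sketch and not a diagnosis: the thread does not establish the root cause. The 84x84 input and the (8, 4), (4, 2), (3, 1) filter/stride pairs are assumed to be the standard DQN defaults, and conv_out and trace_shapes are hypothetical helpers.

```python
def conv_out(size, fshape, stride, pad=0):
    """Return (output size, leftover pixels) for a conv along one dimension."""
    span = size + 2 * pad - fshape
    if span < 0:
        raise ValueError("filter %d is larger than padded input %d"
                         % (fshape, size + 2 * pad))
    # leftover > 0 means the last few pixels are never covered by the filter
    return span // stride + 1, span % stride


def trace_shapes(size, layers):
    """Return (output size, leftover) after each (fshape, stride) conv layer."""
    result = []
    for fshape, stride in layers:
        size, leftover = conv_out(size, fshape, stride)
        result.append((size, leftover))
    return result


# Assumed defaults: 84x84 input, standard DQN conv stack.
print(trace_shapes(84, [(8, 4), (4, 2), (3, 1)]))  # [(20, 0), (9, 0), (7, 0)]
```

Running the same trace on a configuration that fails and on one that works gives a concrete difference to look for in the visualization code.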

I updated the deconv code to match Neon's and it seems to work fine. This particular pycuda error is out of my reach, so closing it for now.