[bug] can't run gpu_pkl_to_cpu_pkl.py
TNick opened this issue · 3 comments
As the traceback shows, I ran into a problem converting a model between the GPU and CPU variants.
I've added the following after this line:

```python
elif isinstance(obj, (types.FunctionType, types.BuiltinFunctionType)):
    print(prefix + "skipping a function (can't pickle function objects)")
    rval = None
```
I've also modified these lines:

```python
if hasattr(obj, 'set_value'):
    # Base case: we found a shared variable, must convert it
    rval = shared(obj.get_value())
    try:
        rval.name = obj.name
    except AttributeError:
        pass
    # Sabotage its __getstate__ so if something tries to pickle it, we'll find out
    obj.__getstate__ = None
```
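The conversion idea behind the script can be sketched without theano at all: walk an object graph and replace anything exposing `set_value`/`get_value` (the shared-variable duck type) with a CPU-side copy. In this minimal sketch, `CpuShared` is a hypothetical stand-in for `theano.shared`, not part of pylearn2:

```python
class CpuShared(object):
    # Hypothetical stand-in for a CPU theano shared variable.
    def __init__(self, value, name=None):
        self.value, self.name = value, name

    def get_value(self):
        return self.value

    def set_value(self, v):
        self.value = v


def to_cpu(obj):
    # Recursively rebuild the object graph, converting shared variables.
    if hasattr(obj, 'set_value'):
        # Base case: a shared variable; rebuild it on the CPU and keep its name
        rval = CpuShared(obj.get_value())
        rval.name = getattr(obj, 'name', None)
        return rval
    if isinstance(obj, list):
        return [to_cpu(o) for o in obj]
    if isinstance(obj, dict):
        return {k: to_cpu(v) for k, v in obj.items()}
    # Anything else is passed through unchanged
    return obj
```

The real script additionally recurses into instance `__dict__`s; this sketch only covers lists and dicts to keep the structure visible.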
The script only seems to work with device=cpu, so we could save future users some time by adding:

```python
# ...
if __name__ == '__main__':
    # theano.config.device is read-only, so we change the value in the
    # environment before importing theano
    thflags = os.environ.get('THEANO_FLAGS', '')
    if thflags:
        thflags = thflags + ",device=cpu"
    else:
        thflags = "device=cpu"
    os.environ['THEANO_FLAGS'] = thflags
    _, in_path, out_path = sys.argv
    # ...
```
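The flag-munging above can be wrapped in a small, testable helper. This is only a sketch; the name `force_cpu_flags` and the injectable `environ` parameter are mine, not part of the script:

```python
import os


def force_cpu_flags(environ=None):
    """Append device=cpu to THEANO_FLAGS, creating the variable if absent.

    `environ` defaults to os.environ; a plain dict can be passed in tests.
    Must be called before importing theano, since theano.config.device is
    read-only once theano is imported.
    """
    env = os.environ if environ is None else environ
    current = env.get('THEANO_FLAGS', '')
    env['THEANO_FLAGS'] = (current + ",device=cpu") if current else "device=cpu"
    return env['THEANO_FLAGS']
```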
I was then able to run show_weights.py, browse_conv_weights.py, num_parameters.py, pkl_inspector.py, plot_monitor.py, etc. I did not try to fprop() in any way.
I am aware of the discussion in pylearn-users and the comments by @goodfeli in the file header.
Anyone facing issues with gpu_pkl_to_cpu_pkl.py should read this thread.
Regarding the first point, maybe the reason it cannot pickle that function is that the function existed in the version of Pylearn2 that saved the model, but does not exist in the current version. In that case, I'm not sure we should always skip them. I don't know what the right way of detecting that would be, other than trying to pickle it to a string and seeing if it works.
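The "try to pickle it and see" probe could look like this minimal sketch (the name `is_picklable` is hypothetical):

```python
import pickle


def is_picklable(obj):
    # Probe whether `obj` can be serialized at all; pickle saves functions
    # by reference, so a function that no longer exists at its recorded
    # module path (or a lambda) raises rather than pickling.
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False
```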
Regarding the second point, yes, keeping the name is a good idea. I would probably call hasattr rather than use try/except, but it does not really matter.
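For reference, the hasattr variant of the name-copying step would be as follows; `copy_name` is a hypothetical helper, behaviorally equivalent to the try/except AttributeError version quoted above:

```python
def copy_name(src, dst):
    # Copy the variable name over only when the source actually has one,
    # avoiding the AttributeError that the try/except version swallows.
    if hasattr(src, 'name'):
        dst.name = src.name
```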
Regarding the last point, sure.
I happened to have test_print_monitor_cv.py open in an editor, so I copied it to a GPU-equipped machine, modified it to preserve the output file, and ran it. On the same console I then ran gpu_pkl_to_cpu_pkl.py and was able to reproduce the error. There was no difference in environment, pylearn2 version, etc.
But your comment made me realize that there is no _check_is_symbolic in pylearn2.space. Previously I just assumed it existed and never checked.
As it turns out, there is a VectorSpace object whose __reduce_ex__() method (inherited from object) returns this funky dictionary:

```python
{'dim': 10,
 '_check_is_symbolic': <function _check_is_symbolic at 0x7ff5a9c2d1b8>,
 'validate_callbacks': [],
 'sparse': False,
 '_dtype': 'float32',
 '_check_is_numeric': <function _check_is_numeric at 0x7ff5a9c2d140>,
 'np_validate_callbacks': []}
```
After that, pickle tries to save that dictionary, attempts to locate _check_is_symbolic in pylearn2.space, and fails (obj is the VectorSpace; rv is, more or less, the dictionary):
```
> /home/ubuntu/devel/anaconda/lib/python2.7/pickle.py(331)save()
-> self.save_reduce(obj=obj, *rv)
/home/ubuntu/devel/anaconda/lib/python2.7/pickle.py(419)save_reduce()
-> save(state)
/home/ubuntu/devel/anaconda/lib/python2.7/pickle.py(286)save()
-> f(self, obj) # Call unbound method with explicit self
/home/ubuntu/devel/anaconda/lib/python2.7/pickle.py(649)save_dict()
-> self._batch_setitems(obj.iteritems())
/home/ubuntu/devel/anaconda/lib/python2.7/pickle.py(663)_batch_setitems()
-> save(v)
/home/ubuntu/devel/anaconda/lib/python2.7/pickle.py(286)save()
-> f(self, obj) # Call unbound method with explicit self
/home/ubuntu/devel/anaconda/lib/python2.7/pickle.py(748)save_global()
-> (obj, module, name))
```
A newly constructed VectorSpace's __reduce_ex__() returns, amongst other things:

```python
{'dim': 1,
 '_dtype': 'float32',
 'validate_callbacks': [],
 'sparse': False,
 'np_validate_callbacks': []}
```
Both _check_is_symbolic() and _check_is_numeric() are tagged as @staticmethod.
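To illustrate why that dictionary breaks pickling, here is a sketch (the `Space` class is a hypothetical stand-in for VectorSpace, not pylearn2 code): any instance whose __dict__ holds a bare function object fails to pickle, which is exactly the situation when an old pickle has stashed those staticmethods in the instance state:

```python
import pickle


class Space(object):
    # Hypothetical stand-in for pylearn2's VectorSpace.
    def __init__(self, dim):
        self.dim = dim


s = Space(10)
# Simulate the stale pickle: a function object ends up in the instance
# __dict__, matching the __reduce_ex__ state shown above. A lambda stands
# in for a function that cannot be located at its recorded module path.
s._check_is_symbolic = lambda batch: None

try:
    pickle.dumps(s)
    failed = False
except Exception:
    # pickle saves functions by reference (module + name) and raises when
    # that lookup is impossible, exactly as in the traceback above
    failed = True
```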
I have no elegant idea about how to fix this.