module 'zmq.backend.cython.socket' has no attribute 'get'
dstan11 opened this issue · 15 comments
I ran into some errors when running scheduler.start(). The messages are:
- module 'zmq.backend.cython.socket' has no attribute 'get'
- Can't get attribute 'get' on <module 'zmq.backend.cython.socket' from 'E:\\Users\\shand\\anaconda3\\envs\\DoppelGANger\\lib\\site-packages\\zmq\\backend\\cython\\socket.cp35-win_amd64.pyd'>
- Can't pickle <cyfunction Socket.get at 0x000001FCDFCC71B8>: it's not found as zmq.backend.cython.socket.get
I am not sure why you see these errors.
Could you please post the following here, so that I can reproduce and debug these errors?
- The complete error log
- How you installed the Python environment and the packages
- The list of installed Python packages and their versions
Thanks!
I created a notebook with the same content as main.py under the DoppelGANger/DoppelGANger/example_training folder.
if __name__ == "__main__":
    from gan_task import GANTask
    from config import config
    from gpu_task_scheduler.gpu_task_scheduler import GPUTaskScheduler
    scheduler = GPUTaskScheduler(config=config, gpu_task_class=GANTask)
    scheduler.start()
- Error log: error.txt
- Python version: 3.5.2
- Packages: packages.txt
Thanks. Can you try executing it directly instead of from a Jupyter notebook?
Yes. I ran python main.py under the DoppelGANger/DoppelGANger/example_training folder in a terminal. No error came up. However, the program is still running after 3 hours, and I have no idea how long it is supposed to take. By the way, the GPU performance did not change after I started the program.
Thanks.
You can look at worker.log in the subfolders of the results folder for the training progress.
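As a rough sketch of how to check the progress quickly, the snippet below collects the last few lines of every worker.log under a results folder. The helper name tail_worker_logs and its parameters are made up for this illustration; only the file name worker.log and the folder "../results/" come from this thread.

```python
from pathlib import Path


def tail_worker_logs(result_root, n_lines=5):
    """Return {log path: last n_lines lines} for each worker.log under result_root.

    Hypothetical helper for illustration; it is not part of DoppelGANger
    or GPUTaskScheduler.
    """
    tails = {}
    for log_path in sorted(Path(result_root).rglob("worker.log")):
        # errors="replace" avoids crashing on bytes that are not valid text.
        lines = log_path.read_text(errors="replace").splitlines()
        tails[str(log_path)] = lines[-n_lines:]
    return tails


if __name__ == "__main__":
    # "../results/" is the result_root_folder value quoted in this thread.
    for path, lines in tail_worker_logs("../results/").items():
        print(path)
        for line in lines:
            print("    " + line)
```

Run it from the example_training folder so that the relative path points at the same results folder the scheduler writes to.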
If the code isn't using the GPU, then:
- Make sure that you installed tensorflow-gpu instead of tensorflow.
- Check worker.log for any error messages about loading the CUDA library.
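One way to test the first point independently of the training code is to ask TensorFlow directly whether it sees a GPU. This is only a sketch: the function name tensorflow_gpu_visible is invented here, and it assumes that the installed TensorFlow provides either tf.config.list_physical_devices (newer 2.x releases) or tf.test.is_gpu_available (1.x, deprecated in 2.x).

```python
import importlib.util


def tensorflow_gpu_visible():
    """Return True/False if TensorFlow reports a GPU, or None if it is not installed.

    Hypothetical helper for illustration only.
    """
    if importlib.util.find_spec("tensorflow") is None:
        return None
    import tensorflow as tf

    # Newer TensorFlow 2.x releases expose tf.config.list_physical_devices.
    list_devices = getattr(getattr(tf, "config", None), "list_physical_devices", None)
    if list_devices is not None:
        return len(list_devices("GPU")) > 0
    # TensorFlow 1.x: fall back to the older check.
    return bool(tf.test.is_gpu_available())


if __name__ == "__main__":
    print("GPU visible to TensorFlow:", tensorflow_gpu_visible())
```

If this prints False in the same environment that runs main.py, the problem is likely in the TensorFlow or CUDA installation rather than in the training code.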
Sorry to disturb you again. I couldn't find the results folder. Can you show me where it is?
Thanks!
It should be on the same level as the example_training folder. It is configured in config.py: "result_root_folder": "../results/"
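To see where that relative setting ends up, here is a small sketch that resolves it against the folder the program is started from. It assumes the path is interpreted relative to the current working directory, which matches running python main.py inside example_training; resolve_result_root is a name made up for this example.

```python
import os


def resolve_result_root(result_root_folder, working_dir):
    """Return the folder that a relative result_root_folder points to.

    Hypothetical helper; assumes the path is resolved against working_dir.
    """
    return os.path.normpath(os.path.join(working_dir, result_root_folder))


if __name__ == "__main__":
    # Started from example_training, "../results/" lands next to that folder.
    print(resolve_result_root("../results/", os.getcwd()))
```

If the program is started from a different folder, the results folder moves with it, which is one reason it can appear to be missing.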
Thank you for the reply! I updated python version to 3.7 and tensorflow-gpu version to 1.1.4. Now the program works.
Great!!
Now I get a new error message.
Traceback (most recent call last):
  File "E:\Users\shand\anaconda3\envs\DoppelGANger2\Scripts\start_gpu_task-script.py", line 33, in <module>
    sys.exit(load_entry_point('GPUTaskScheduler', 'console_scripts', 'start_gpu_task')())
  File "f:\github clone folder\gputask\gputaskscheduler\gpu_task_scheduler\start_gpu_task.py", line 23, in main
    worker.main()
  File "F:\Github clone folder\DoppelGANger\DoppelGANger\example_training\gan_task.py", line 124, in main
    gan.train(restore=restore)
  File "..\gan\doppelganger.py", line 918, in train
    self.visualize(epoch_id, batch_id, global_id)
  File "..\gan\doppelganger.py", line 801, in visualize
    sub1(features, attributes, lengths, None, None, None, "free")
  File "..\gan\doppelganger.py", line 749, in sub1
    ground_truth_lengths=ground_truth_lengths)
  File "<__array_function__ internals>", line 6, in savez
  File "E:\Users\shand\anaconda3\envs\DoppelGANger2\lib\site-packages\numpy\lib\npyio.py", line 645, in savez
    _savez(file, args, kwds, False)
  File "E:\Users\shand\anaconda3\envs\DoppelGANger2\lib\site-packages\numpy\lib\npyio.py", line 743, in _savez
    zipf = zipfile_factory(file, mode="w", compression=compression)
  File "E:\Users\shand\anaconda3\envs\DoppelGANger2\lib\site-packages\numpy\lib\npyio.py", line 119, in zipfile_factory
    return zipfile.ZipFile(file, *args, **kwargs)
  File "E:\Users\shand\anaconda3\envs\DoppelGANger2\lib\zipfile.py", line 1240, in __init__
    self.fp = io.open(file, filemode)
FileNotFoundError: [Errno 2] No such file or directory: '../results/aux_disc-False,dataset-google,epoch-400,epoch_checkpoint_freq-1,extra_checkpoint_freq-5,run-0,sample_len-1,self_norm-False,\\sample\\epoch_id-0,batch_id-199,global_id-199,type-free,samples.npz'
Could you please try changing "result_root_folder": "../results/" in config.py to "result_root_folder": "..\\results\\", since you are on Windows and the directory separator should be \? Then delete the results folder and run again.
Let me know if it doesn't work.
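The path in the error message mixes / and \. As a rough illustration of what a consistently separated Windows path would look like, the sketch below normalizes a shortened, made-up version of that path with Python's ntpath module, which applies Windows path rules on any operating system. The function name to_windows_path is hypothetical, and this only shows the separator handling; it does not confirm that the separators are the cause of the error.

```python
import ntpath  # Windows path rules, importable on any operating system


def to_windows_path(path):
    """Normalize a path using Windows rules, turning every / into a backslash."""
    return ntpath.normpath(path)


if __name__ == "__main__":
    # Shortened, made-up stand-in for the long path in the traceback.
    mixed = "../results/run-0,\\sample\\samples.npz"
    print(to_windows_path(mixed))
```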
It doesn't work. It has the same error message.
I think another potential problem is that Windows does not allow , in filenames. You can replace , by adding test_config_string_separator="-" (or another character) to the scheduler_config section of config.py. (See https://github.com/fjxmlzn/GPUTaskScheduler for a detailed explanation.)
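A minimal sketch of what that change could look like is below. Only the keys named in this thread are shown; the rest of config.py is left out. The function build_folder_name is not GPUTaskScheduler code. It is a hypothetical stand-in that mimics the folder name seen in the traceback (sorted key-value pairs, each followed by the separator), just to show what the separator affects.

```python
# Fragment of config.py: only the keys mentioned in this thread are shown.
config = {
    "scheduler_config": {
        "result_root_folder": "../results/",
        "test_config_string_separator": "-",
    },
}


def build_folder_name(params, separator):
    """Hypothetical illustration of a folder name built from test parameters.

    Mimics the shape of the name in the traceback; the real logic lives in
    GPUTaskScheduler and may differ.
    """
    return "".join(
        "{}-{}{}".format(key, params[key], separator) for key in sorted(params)
    )


if __name__ == "__main__":
    params = {"run": 0, "dataset": "google"}
    print(build_folder_name(params, ","))  # separator seen in the traceback
    separator = config["scheduler_config"]["test_config_string_separator"]
    print(build_folder_name(params, separator))
```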
But I just want to double-check whether there are other issues: could you please show me the directory structure of F:\Github clone folder\DoppelGANger\DoppelGANger\ after this error happens?
Thanks. Could you please email me the current code and worker.log so that I can check them: zinanl AT andrew.cmu.edu