AAnoosheh/ComboGAN

--netG_n_shared flag

Ibrahim-Halfaoui opened this issue · 1 comment

Hi, I was wondering what exactly this --netG_n_shared flag is for?
My understanding is that these are the ResNet blocks shared by all encoders (they help bring images from the different domains into the same shared latent space), right?

The problem is that when I change the flag's default value (zero), I get a CUDA error:

```
RuntimeError: chunk expects at least a 1-dimensional tensor (chunk at /pytorch/aten/src/ATen/native/TensorShape.cpp:186)
frame #0: std::function<std::string ()>::operator()() const + 0x11 (0x7f44f02bb441 in /usr/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x2a (0x7f44f02bad7a in /usr/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #2: at::native::chunk(at::Tensor const&, long, long) + 0x1f3 (0x7f449c7dbf93 in /usr/lib/python3.7/site-packages/torch/lib/libcaffe2.so)
frame #3: at::TypeDefault::chunk(at::Tensor const&, long, long) const + 0x12 (0x7f449ca3a442 in /usr/lib/python3.7/site-packages/torch/lib/libcaffe2.so)
frame #4: torch::autograd::VariableType::chunk(at::Tensor const&, long, long) const + 0x282 (0x7f449ae270c2 in /usr/lib/python3.7/site-packages/torch/lib/libtorch.so.1)
frame #5: torch::cuda::scatter(at::Tensor const&, c10::ArrayRef, c10::optional<std::vector<long, std::allocator > > const&, long, c10::optional<std::vector<c10::optional<c10::cuda::CUDAStream>, std::allocator<c10::optional<c10::cuda::CUDAStream> > > > const&) + 0x5ad (0x7f449b38a1fd in /usr/lib/python3.7/site-packages/torch/lib/libtorch.so.1)
frame #6: + 0x5a41cf (0x7f44f0a781cf in /usr/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #7: + 0x130fac (0x7f44f0604fac in /usr/lib/python3.7/site-packages/torch/lib/libtorch_python.so)

frame #15: THPFunction_apply(_object*, _object*) + 0x6b1 (0x7f44f0888301 in /usr/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
```

This could either be something odd with PyTorch versions (since I wrote this in 0.4), or it could happen because you passed --netG_n_shared=1.
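For what it's worth, the error message itself can be reproduced with torch.chunk on a 0-dimensional tensor, which is what DataParallel's scatter calls internally when splitting inputs across GPUs (that is what frames #2-#5 in the trace show). The snippet below is just a guess at the failure mode, not a confirmed trace through the ComboGAN code:

```python
import torch

# Minimal repro of the error message (an assumption about the failure mode,
# not the actual ComboGAN code path): scatter splits each input tensor across
# GPUs via chunk, and chunk rejects 0-dimensional tensors.
x = torch.tensor(3.0)      # 0-dim scalar tensor
try:
    torch.chunk(x, 2)      # the same call scatter makes per input tensor
except RuntimeError as e:
    print(e)               # "chunk expects at least a 1-dimensional tensor"
```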

I implicitly assumed the value had to be at least 2, but never enforced that in the code. The flag sets the total number of central blocks shared between the encoders AND decoders.

So if you say n_blocks=8 and n_shared=2, the encoders will get 4 blocks and the decoders will get 4 blocks, but the last encoder block will be shared among all classes, as will the first decoder block. Thus, across N domains there will be (N+N+N+1) + (1+N+N+N) = 6N+2 blocks in total, two of which are shared (see the sketch below).
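To make that arithmetic concrete, here is a small sketch of the block accounting as just described; count_blocks is an illustrative helper of mine, not a function from this repo:

```python
# Sketch of the block split implied above (assumptions: n_blocks is divided
# evenly between encoder and decoder, and n_shared is split between the last
# encoder block(s) and the first decoder block(s)).
def count_blocks(n_blocks, n_shared, n_domains):
    enc_total = n_blocks // 2             # blocks on the encoder side of each path
    dec_total = n_blocks - enc_total      # blocks on the decoder side of each path
    enc_shared = n_shared // 2            # shared block(s) at the end of every encoder
    dec_shared = n_shared - enc_shared    # shared block(s) at the start of every decoder
    enc_private = enc_total - enc_shared  # per-domain encoder blocks
    dec_private = dec_total - dec_shared  # per-domain decoder blocks
    total = n_domains * (enc_private + dec_private) + enc_shared + dec_shared
    return enc_private, dec_private, total

# n_blocks=8, n_shared=2, N=4 domains -> (3N + 1) + (1 + 3N) = 6N + 2 = 26 blocks
print(count_blocks(8, 2, n_domains=4))    # (3, 3, 26)
```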

It isn't well documented because it was a research feature I experimented with (it didn't help much in my case), but I left it in as an option for others anyway.