adrianjav/rotograd

no input_size

Opened this issue · 5 comments

hi adrianjav,

thanks for your library. I installed rotograd with pip install rotograd. When running my project, I get errors like this:

Traceback (most recent call last):
  File "train_frontend2.py", line 282, in <module>
    train(0, args, configs, batch_size, num_gpus)
  File "train_frontend2.py", line 124, in train
    seg_socres, pinyin_scores = model((text_id, mask))
  File "/home/xxxl/anaconda3/envs/py37_torch1.71/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/xxx/anaconda3/envs/py37_torch1.71/lib/python3.7/site-packages/rotograd/rotograd.py", line 291, in forward
    out_i = head(rep_i)
  File "/home/xxx/anaconda3/envs/py37_torch1.71/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/xxx/anaconda3/envs/py37_torch1.71/lib/python3.7/site-packages/torch/nn/modules/container.py", line 117, in forward
    input = module(input)
  File "/home/xxx/anaconda3/envs/py37_torch1.71/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/xxx/anaconda3/envs/py37_torch1.71/lib/python3.7/site-packages/rotograd/rotograd.py", line 146, in forward
    new_z = rotate(z, R, self.p.input_size)
  File "/home/xxx/anaconda3/envs/py37_torch1.71/lib/python3.7/site-packages/torch/nn/modules/module.py", line 779, in __getattr__
    type(self).__name__, name))
torch.nn.modules.module.ModuleAttributeError: 'ParametrizedRotoGrad' object has no attribute 'input_size'

The attribute input_size exists in version 0.1.2. When I change input_size to latent_size, I get errors like this:

Traceback (most recent call last):
  File "train_frontend2.py", line 282, in <module>
    train(0, args, configs, batch_size, num_gpus)
  File "train_frontend2.py", line 124, in train
    seg_socres, pinyin_scores = model((text_id, mask))
  File "/home/xxx/anaconda3/envs/py37_torch1.71/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/xxx/anaconda3/envs/py37_torch1.71/lib/python3.7/site-packages/rotograd/rotograd.py", line 291, in forward
    out_i = head(rep_i)
  File "/home/xxx/anaconda3/envs/py37_torch1.71/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/xxx/anaconda3/envs/py37_torch1.71/lib/python3.7/site-packages/torch/nn/modules/container.py", line 117, in forward
    input = module(input)
  File "/home/xxx/anaconda3/envs/py37_torch1.71/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/xxx/anaconda3/envs/py37_torch1.71/lib/python3.7/site-packages/rotograd/rotograd.py", line 146, in forward
    new_z = rotate(z, R, self.p.latent_size)
  File "/home/xxx/anaconda3/envs/py37_torch1.71/lib/python3.7/site-packages/rotograd/rotograd.py", line 109, in rotate
    return torch.einsum('ij,bj->bi', rotation, points)
  File "/home/xxx/anaconda3/envs/py37_torch1.71/lib/python3.7/site-packages/torch/functional.py", line 344, in einsum
    return _VF.einsum(equation, operands)  # type: ignore
RuntimeError: dimension mismatch for operand 1: equation 2 tensor 3

Can you give me some suggestions?

Can MTL sequence-to-sequence tasks use rotograd?

Hi, thanks for raising this issue!

Indeed, there were a few errors in the implementation. I rushed the last update and didn't have time to check that it was working properly.

I just pushed a new version (0.1.5.1) fixing these issues; now everything should work as expected. The library also includes a working example (in the example folder) to verify that everything behaves correctly. It has a few extra dependencies, but it serves as example code if you want to look it up.

Regarding the second error, I am not sure what caused it. Can you try with the new code?

And about seq-to-seq tasks: as long as the architecture looks like a hard-parameter-sharing one (check the paper for a reference), it should work.
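To make that concrete, a hard-parameter-sharing architecture is one shared backbone producing a latent representation, with one head per task on top of it. The sketch below is plain PyTorch with illustrative layer sizes and names (backbone, heads), not the rotograd API:

```python
import torch
import torch.nn as nn

# Shared backbone: maps inputs to a single latent representation z.
backbone = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))

# One head per task, each consuming the same latent z (illustrative sizes).
heads = nn.ModuleList([nn.Linear(8, 2), nn.Linear(8, 5)])

x = torch.randn(4, 16)              # [batch_size, input_dim]
z = backbone(x)                     # [batch_size, latent_size]
outs = [head(z) for head in heads]  # one output per task
```

RotoGrad then operates on the gradients flowing through z, which is why it needs to know the latent size.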

Hi, thanks for your answers and for the update!

After changing to version 0.1.5.1, I can run the rotograd example successfully. When running my seq-to-seq tasks, however, I still get the second error. The output shape of my backbone model (seq-to-seq tasks) is (batch_size, seq_len, output_dim), whereas the output shape of the rotograd example's backbone model is (batch_size, output_dim). Intuitively, the output (batch_size, seq_len, output_dim) does not match what rotograd expects (file: rotograd.py, line: 138, func: rotate). Is that helpful?
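The mismatch can be reproduced in isolation. The sketch below uses NumPy's einsum, which behaves like torch.einsum for this equation; the shapes are illustrative:

```python
import numpy as np

# rotate() in rotograd.py does essentially:
#   torch.einsum('ij,bj->bi', rotation, points)
# which assumes points has shape [batch_size, latent_size].
R = np.eye(4)                # a latent_size x latent_size rotation
z_2d = np.ones((8, 4))       # [batch_size, latent_size]: works
out = np.einsum('ij,bj->bi', R, z_2d)

z_3d = np.ones((8, 5, 4))    # [batch_size, seq_len, latent_size]
try:
    np.einsum('ij,bj->bi', R, z_3d)   # fails: 3 dims, equation names only 2
except ValueError as e:
    print('mismatch:', e)

# Keeping the sequence axis in the equation rotates each position
# identically with the same latent_size-sized rotation.
out3 = np.einsum('ij,blj->bli', R, z_3d)
```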

Oh, I see!

In RotoGrad, we assume that the gradient has shape [batch_size, latent_size]. At some point I thought of flattening the gradients and reshaping them before the return statement, but I decided not to go that way, since it would hide subtle details from the user that may be worth thinking about.

In your case, for example, RotoGrad would apply by considering a matrix of size [seq_len x latent_size, seq_len x latent_size] (treating everything as a whole). But maybe in your setting it is better to consider a matrix of size [latent_size x latent_size] and apply the same rotation to each position of the sequence.

If you want to try the former, the simplest option is to modify your local rotograd.py file: flatten the gradients and reshape them before returning (and declare the rotations with size seq_len x latent_size).
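A rough sketch of that flatten-and-rotate idea (NumPy for brevity; the identity R stands in for the learned rotation, and the shapes are made up):

```python
import numpy as np

batch, seq_len, latent = 8, 5, 4
z = np.random.randn(batch, seq_len, latent)

# One big rotation over the whole flattened sequence ("everything as a whole").
R = np.eye(seq_len * latent)               # would be a learned rotation matrix

z_flat = z.reshape(batch, -1)              # [batch, seq_len * latent]
new_z = np.einsum('ij,bj->bi', R, z_flat)  # same equation rotate() uses
new_z = new_z.reshape(batch, seq_len, latent)  # reshape back before returning
```

With the identity rotation this is a no-op, which is a quick sanity check that the reshaping round-trips correctly.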

But I think you have a point; I will modify RotoGrad to take a shape rather than a single integer for latent_size.

Also, I do not know the specifics of your project. If you feel that they are worth discussing and that applying RotoGrad there could lead to an interesting problem, feel free to drop me an email 😉