Shape Mismatch Error in LSTM Forward Pass (RuntimeError: output with shape [1, 512] doesn't match the broadcast shape [1, 1, 512])
I'm encountering a shape mismatch error when trying to use the OpenLSTML4casadi model within the RealTimeL4CasADi wrapper. The error occurs when calling the forward method of the LSTM, which is part of the OpenLSTML4casadi class. Despite reshaping the inputs and initializing the hidden/cell states according to the LSTM's requirements, the error persists.
Error Traceback:
RuntimeError: output with shape [1, 512] doesn't match the broadcast shape [1, 1, 512]
Here’s the full stack trace:
File "l4casadi/realtime/realtime_l4casadi.py", line 75, in get_params
params = self._get_params(a_t)
File "l4casadi/realtime/realtime_l4casadi.py", line 66, in _get_params
df_a, f_a = batched_jacobian(self.model, a_t, return_func_output=True)
File "l4casadi/realtime/sensitivities.py", line 43, in batched_jacobian
return functorch.vmap(functorch.jacrev(aux_function(func), has_aux=True), randomness=vmap_randomness)(inputs[:, None])
File "torch/_functorch/vmap.py", line 434, in wrapped
return _flat_vmap(func, batch_size, flat_in_dims, flat_args, args_spec, out_dims, randomness, **kwargs)
File "torch/_functorch/vmap.py", line 619, in _flat_vmap
batched_outputs = func(*batched_inputs, **kwargs)
File "torch/_functorch/eager_transforms.py", line 291, in _vjp_with_argnums
primals_out = func(*primals)
File "l4casadi/realtime/sensitivities.py", line 13, in aux_function.<locals>.inner_aux
out = func(inputs)
File "torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "open_lstm_l4casadi.py", line 53, in forward
y_sim, _ = self.model(u_train, state)
File "torch/nn/modules/rnn.py", line 812, in forward
result = _VF.lstm(input, hx, self._flat_weights, self.bias, self.num_layers, self.dropout, self.training, self.bidirectional, self.batch_first)
Steps to Reproduce:
- Initialize the OpenLSTML4casadi model with the following parameters: n_context = 1, n_inputs = 1, sequence_length = 1, batch_size = 1.
- Wrap the model using RealTimeL4CasADi for CasADi integration.
- Call the get_params method using the following input (a minimal reproduction sketch follows this list):
  casadi_param = model_l4c.get_params(np.ones((n_inputs, batch_size * sequence_length)))
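For reference, here is a minimal reproduction sketch assembled from the steps above. The OpenLSTML4casadi constructor arguments and the RealTimeL4CasADi import path are assumptions based on this issue text and the traceback, not verified against the repo:

```python
# Minimal reproduction sketch. OpenLSTML4casadi is my own wrapper class from the
# linked repo; its constructor arguments below are assumed from the issue text.
import numpy as np
from l4casadi.realtime import RealTimeL4CasADi  # import path assumed from the traceback

from open_lstm_l4casadi import OpenLSTML4casadi  # module name taken from the traceback

n_context = 1
n_inputs = 1
sequence_length = 1
batch_size = 1

model = OpenLSTML4casadi(n_context=n_context, n_inputs=n_inputs,
                         sequence_length=sequence_length, batch_size=batch_size)
model_l4c = RealTimeL4CasADi(model)  # constructor options omitted for brevity

# This call triggers the RuntimeError inside batched_jacobian:
casadi_param = model_l4c.get_params(np.ones((n_inputs, batch_size * sequence_length)))
```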
What I’ve Tried:
- I ensured that the input tensor (u_train) is reshaped to [sequence_length, batch_size, input_size] before being passed to the LSTM.
- I initialized the hidden and cell states (hn and cn) with the correct shapes: [num_layers, batch_size, proj_size] for hn and [num_layers, batch_size, hidden_size] for cn (a shape-check sketch follows this list).
- Despite this, the error persists when calling the LSTM's forward pass.
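To make the shapes concrete, here is a standalone sketch of the reshaping and state initialization described above, using a plain torch.nn.LSTM with proj_size. The hidden_size of 512 is inferred from the error message and is an assumption; the other sizes follow the issue:

```python
# Standalone shape check with a projected LSTM (hidden_size=512 inferred from the
# error message; the other sizes follow the issue). This runs fine in eager mode.
import torch

input_size, hidden_size, proj_size, num_layers = 1, 512, 1, 1
sequence_length, batch_size = 1, 1

lstm = torch.nn.LSTM(input_size=input_size, hidden_size=hidden_size,
                     num_layers=num_layers, proj_size=proj_size)  # batch_first=False

# Input reshaped to [sequence_length, batch_size, input_size]:
u_train = torch.ones(sequence_length, batch_size, input_size)

# For a projected LSTM, hn uses proj_size and cn uses hidden_size:
hn = torch.zeros(num_layers, batch_size, proj_size)
cn = torch.zeros(num_layers, batch_size, hidden_size)

y_sim, (hn, cn) = lstm(u_train, (hn, cn))
print(y_sim.shape)  # torch.Size([1, 1, 1]) -> [sequence_length, batch_size, proj_size]
```

In eager mode this runs without error, which is why I suspect the problem only appears inside the batched Jacobian / vmap path.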
Possible Cause:
The issue may be related to how the LSTM handles projected outputs (proj_size=1) and the internal reshaping of tensors during the batched Jacobian calculation. The shape mismatch suggests that the LSTM is returning an output tensor with an unexpected shape, which doesn't match the expected broadcasting dimensions.
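For what it's worth, the exact error wording matches an in-place/broadcast failure between a 2-D and a 3-D tensor. The snippet below only reproduces the message in isolation; it is not the actual code path inside _VF.lstm, but it suggests that somewhere in the hidden-state handling a tensor that lost a dimension (possibly because vmap maps over dim 0 of inputs[:, None]) meets a 3-D one:

```python
# Illustration only: an in-place op whose broadcast result would be larger than
# the destination produces the same error wording as in the traceback.
import torch

out = torch.zeros(1, 512)      # shape [1, 512]
other = torch.ones(1, 1, 512)  # shape [1, 1, 512]
out.add_(other)                # RuntimeError: output with shape [1, 512] doesn't
                               # match the broadcast shape [1, 1, 512]
```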
Expected Behavior:
The LSTM forward pass should work without shape mismatches, and the batched Jacobian should correctly handle the model's projected output when using RealTimeL4CasADi.
Environment:
- Python version: 3.10.12
- PyTorch version: 2.0.0+cpu
- CasADi version: 3.6.6
- OS: Windows 10 (WSL: Ubuntu-22.04)
Additional Context:
The full project code is available here: https://github.com/LOPES3000/RealtimeL4casadi_and_LSTM_NN/. This issue arises when integrating the LSTM model into the real-time CasADi wrapper for symbolic differentiation and optimization.
Request:
I’d appreciate any insights or suggestions on how to resolve this shape mismatch issue during the LSTM forward pass or how to modify the batched Jacobian calculation to account for projected LSTM outputs.
Thank you!