Shape Mismatch Error in LSTM Forward Pass (RuntimeError: output with shape [1, 512] doesn't match the broadcast shape [1, 1, 512])
I'm encountering a shape mismatch error when trying to use the OpenLSTML4casadi model within the RealTimeL4CasADi wrapper. The error occurs when calling the forward method of the LSTM, which is part of the OpenLSTML4casadi class. Despite reshaping the inputs and initializing the hidden/cell states according to the LSTM's requirements, the error persists.
Error Traceback:
RuntimeError: output with shape [1, 512] doesn't match the broadcast shape [1, 1, 512]
Here’s the full stack trace:
File "l4casadi/realtime/realtime_l4casadi.py", line 75, in get_params
params = self._get_params(a_t)
File "l4casadi/realtime/realtime_l4casadi.py", line 66, in _get_params
df_a, f_a = batched_jacobian(self.model, a_t, return_func_output=True)
File "l4casadi/realtime/sensitivities.py", line 43, in batched_jacobian
return functorch.vmap(functorch.jacrev(aux_function(func), has_aux=True), randomness=vmap_randomness)(inputs[:, None])
File "torch/_functorch/vmap.py", line 434, in wrapped
return _flat_vmap(func, batch_size, flat_in_dims, flat_args, args_spec, out_dims, randomness, **kwargs)
File "torch/_functorch/vmap.py", line 619, in _flat_vmap
batched_outputs = func(*batched_inputs, **kwargs)
File "torch/_functorch/eager_transforms.py", line 291, in _vjp_with_argnums
primals_out = func(*primals)
File "l4casadi/realtime/sensitivities.py", line 13, in aux_function.<locals>.inner_aux
out = func(inputs)
File "torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "open_lstm_l4casadi.py", line 53, in forward
y_sim, _ = self.model(u_train, state)
File "torch/nn/modules/rnn.py", line 812, in forward
result = _VF.lstm(input, hx, self._flat_weights, self.bias, self.num_layers, self.dropout, self.training, self.bidirectional, self.batch_first)
Steps to Reproduce:
- Initialize the OpenLSTML4casadi model with the following parameters: n_context = 1, n_inputs = 1, sequence_length = 1, batch_size = 1.
- Wrap the model using RealTimeL4CasADi for CasADi integration.
- Call the get_params method using the following input (a minimal reproduction sketch follows this list):
  casadi_param = model_l4c.get_params(np.ones((n_inputs, batch_size * sequence_length)))
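For reference, here is a minimal reproduction sketch assembled from the steps above. The OpenLSTML4casadi constructor arguments and the RealTimeL4CasADi import path are assumptions based on this issue text and the traceback, not verified against the repo:

```python
# Minimal reproduction sketch. OpenLSTML4casadi is my own wrapper class from the
# linked repo; its constructor arguments below are assumed from the issue text.
import numpy as np
from l4casadi.realtime import RealTimeL4CasADi  # import path assumed from the traceback

from open_lstm_l4casadi import OpenLSTML4casadi  # module name taken from the traceback

n_context = 1
n_inputs = 1
sequence_length = 1
batch_size = 1

model = OpenLSTML4casadi(n_context=n_context, n_inputs=n_inputs,
                         sequence_length=sequence_length, batch_size=batch_size)
model_l4c = RealTimeL4CasADi(model)  # constructor options omitted for brevity

# This call triggers the RuntimeError inside batched_jacobian:
casadi_param = model_l4c.get_params(np.ones((n_inputs, batch_size * sequence_length)))
```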
What I’ve Tried:
- I ensured that the input tensor (u_train) is reshaped to [sequence_length, batch_size, input_size] before being passed to the LSTM.
- I initialized the hidden and cell states (hn and cn) with the correct shapes: [num_layers, batch_size, proj_size] for hn and [num_layers, batch_size, hidden_size] for cn (a shape-check sketch follows this list).
- Despite this, the error persists when calling the LSTM's forward pass.
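To make the shapes concrete, here is a standalone sketch of the reshaping and state initialization described above, using a plain torch.nn.LSTM with proj_size. The hidden_size of 512 is inferred from the error message and is an assumption; the other sizes follow the issue:

```python
# Standalone shape check with a projected LSTM (hidden_size=512 inferred from the
# error message; the other sizes follow the issue). This runs fine in eager mode.
import torch

input_size, hidden_size, proj_size, num_layers = 1, 512, 1, 1
sequence_length, batch_size = 1, 1

lstm = torch.nn.LSTM(input_size=input_size, hidden_size=hidden_size,
                     num_layers=num_layers, proj_size=proj_size)  # batch_first=False

# Input reshaped to [sequence_length, batch_size, input_size]:
u_train = torch.ones(sequence_length, batch_size, input_size)

# For a projected LSTM, hn uses proj_size and cn uses hidden_size:
hn = torch.zeros(num_layers, batch_size, proj_size)
cn = torch.zeros(num_layers, batch_size, hidden_size)

y_sim, (hn, cn) = lstm(u_train, (hn, cn))
print(y_sim.shape)  # torch.Size([1, 1, 1]) -> [sequence_length, batch_size, proj_size]
```

In eager mode this runs without error, which is why I suspect the problem only appears inside the batched Jacobian / vmap path.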
Possible Cause:
The issue may be related to how the LSTM handles projected outputs (proj_size=1) and the internal reshaping of tensors during the batched Jacobian calculation. The shape mismatch suggests that the LSTM is returning an output tensor with an unexpected shape, which doesn't match the expected broadcasting dimensions.
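For what it's worth, the exact error wording matches an in-place/broadcast failure between a 2-D and a 3-D tensor. The snippet below only reproduces the message in isolation; it is not the actual code path inside _VF.lstm, but it suggests that somewhere in the hidden-state handling a tensor that lost a dimension (possibly because vmap maps over dim 0 of inputs[:, None]) meets a 3-D one:

```python
# Illustration only: an in-place op whose broadcast result would be larger than
# the destination produces the same error wording as in the traceback.
import torch

out = torch.zeros(1, 512)      # shape [1, 512]
other = torch.ones(1, 1, 512)  # shape [1, 1, 512]
out.add_(other)                # RuntimeError: output with shape [1, 512] doesn't
                               # match the broadcast shape [1, 1, 512]
```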
Expected Behavior:
The LSTM forward pass should work without shape mismatches, and the batched Jacobian should correctly handle the model's projected output when using RealTimeL4CasADi.
Environment:
- Python version: 3.10.12
- PyTorch version: 2.0.0+cpu
- CasADi version: 3.6.6
- OS: Windows 10 (WSL: Ubuntu-22.04)
Additional Context:
The full project code is available here: https://github.com/LOPES3000/RealtimeL4casadi_and_LSTM_NN/. This issue arises when integrating the LSTM model into the real-time CasADi wrapper for symbolic differentiation and optimization.
Request:
I’d appreciate any insights or suggestions on how to resolve this shape mismatch issue during the LSTM forward pass or how to modify the batched Jacobian calculation to account for projected LSTM outputs.
Thank you!