Questions related to the plain LSTM Cell

Question

Closed this issue a year ago · 5 comments

Hello, I'm quite interested in the section about LSTM, but I'm still trying to understand how the "plain LSTM cell" mentioned in the paper is actually applied. It seems like the backbone network uses "DWSConvLSTM2d", which stands for depthwise-separable convolutional LSTM?

Answer 1 · 2023-08-28T08:15:05.000Z

Hi @batman47steam

DWSConvLSTM2d can function as a plain LSTM when configured accordingly. It is designed to be flexible and lets you toggle between a standard LSTM and a depthwise-separable conv LSTM. The configs show that it is set up to act like a regular LSTM: it uses only 1x1 convolutions, which are mathematically equivalent to a matrix multiplication applied to the channel vector at each spatial location.
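The equivalence between a 1x1 convolution and a per-pixel matrix multiplication can be checked numerically. The following is a minimal sketch using standard PyTorch calls; the tensor sizes are arbitrary values chosen for illustration and are not taken from the repository.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
batch, c_in, c_out, height, width = 2, 8, 16, 5, 7

conv1x1 = nn.Conv2d(c_in, c_out, kernel_size=1, bias=True)
x = torch.randn(batch, c_in, height, width)

# Path 1: the 1x1 convolution.
y_conv = conv1x1(x)

# Path 2: reuse the same weights as a (c_out, c_in) matrix and apply it
# to the channel vector of every pixel independently.
weight = conv1x1.weight.view(c_out, c_in)
x_pixels = x.permute(0, 2, 3, 1)                  # (B, H, W, C_in)
y_linear = x_pixels @ weight.t() + conv1x1.bias   # (B, H, W, C_out)
y_linear = y_linear.permute(0, 3, 1, 2)           # back to (B, C_out, H, W)

# Both paths should agree up to floating-point rounding.
outputs_match = torch.allclose(y_conv, y_linear, atol=1e-5)
print("1x1 conv equals per-pixel matmul:", outputs_match)
```

Because no pixel sees its neighbours, a ConvLSTM built only from 1x1 convolutions behaves like an ordinary LSTM run separately at each spatial location with shared weights.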

Answer 2 · 2023-08-28T10:33:04.000Z

Hi, thank you for your response. So the default settings of DWSConvLSTM2d correspond to the plain LSTM, where the hidden state undergoes a 3x3 convolution and only a 1x1 convolutional interaction occurs between the input and hidden state?

Answer 3 · 2023-08-29T11:29:32.000Z

It's just a 1x1 convolution. You can trace this through the code:

The default config specifies lstm.dws_conv=False.
The config is passed to the class's __init__ function.
As a consequence, self.conv3x3_dws is set to nn.Identity().
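The chain above can be illustrated with a small stand-in cell. This is only a sketch under stated assumptions: `ConvLSTMCellSketch` is a hypothetical class, not the repository's `DWSConvLSTM2d`, and its gate ordering and constructor arguments are invented for illustration. It only mirrors the idea described in this thread: a `dws_conv` flag that replaces the depthwise 3x3 convolution on the hidden state with `nn.Identity()`, leaving a single 1x1 convolution over the concatenated input and hidden state.

```python
import torch
import torch.nn as nn


class ConvLSTMCellSketch(nn.Module):
    """Illustrative stand-in (hypothetical), not the repository's class."""

    def __init__(self, dim: int, dws_conv: bool = False, dws_kernel_size: int = 3):
        super().__init__()
        if dws_conv:
            # Depthwise 3x3 conv: one spatial filter per hidden channel.
            self.conv3x3_dws = nn.Conv2d(
                dim, dim, kernel_size=dws_kernel_size,
                padding=dws_kernel_size // 2, groups=dim,
            )
        else:
            # With the flag off, the hidden state passes through unchanged.
            self.conv3x3_dws = nn.Identity()
        # 1x1 conv maps the concatenated [x, h] to the four LSTM gates.
        self.conv1x1 = nn.Conv2d(2 * dim, 4 * dim, kernel_size=1)

    def forward(self, x, state=None):
        if state is None:
            h, c = torch.zeros_like(x), torch.zeros_like(x)
        else:
            h, c = state
        h = self.conv3x3_dws(h)
        gates = self.conv1x1(torch.cat([x, h], dim=1))
        i, f, o, g = torch.chunk(gates, 4, dim=1)
        c_next = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h_next = torch.sigmoid(o) * torch.tanh(c_next)
        return h_next, c_next


plain_cell = ConvLSTMCellSketch(dim=4, dws_conv=False)
dws_cell = ConvLSTMCellSketch(dim=4, dws_conv=True)
print(type(plain_cell.conv3x3_dws).__name__)  # Identity
print(type(dws_cell.conv3x3_dws).__name__)    # Conv2d
```

With `dws_conv=False`, perturbing a single input pixel changes the output only at that pixel, which is what makes the cell a plain LSTM applied per location.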

Answer 4 · 2023-08-31T08:29:06.000Z

@batman47steam, if that answers your question, feel free to close this issue.

Answer 5 · 2023-09-06T00:42:45.000Z

Thank you. That answers my question very well.