Convolutional LSTMNetwork: A Machine Learning Approach for Precipitation Nowcasting

Question

Convolutional LSTMNetwork: A Machine Learning Approach for Precipitation Nowcasting

tenaflyyy opened this issue 7 years ago · 0 comments

tenaflyyy commented 7 years ago

Abstract

Challenge
- To predict the future rainfall intensity in a local region over a very short time through machine learning perspective.
Proposed Method
- Formulation the problem as a spatiotemporal sequence problem
- Both the input and the prediction target are spatiotemporal sequences
- Extend the fully connected LSTM to have convolutional structures int both input-to-state and state-to-state transitions.
  -Outperform FC-LSTM and ROVER algorithm.

Details

Introduction
- Existing Methods Problem:Optical ﬂow based methods is limited because the ﬂow estimation step and the radar echo extrapolation step are separated and it is challenging to determine the model parameters to give good prediction performance.
- Extend FC-LSTM to Conv-LSTM(convlutional structures in both input-to-state and state-to-state transitions).
- Experiments performed on a synthetic Moving-MNIST dataset and the radar echo dateset.
Contributions
- Formulation of preciptation Nowcasting
  - Observe adynamical system over a spatial region represented by an M ×N gridwhich consists of M rows and N columns. Inside each cell in the grid, there are P measurements which vary over time. Thus, the observation at any time can be represented by a tensor X ∈ R ^ (P×M×N).
  - The spatiotemporal sequence forecasting problem is to predict the most likely length K sequence in the future given the previous J observations.
- Convlutional LSTM
  - Drawback of FC-LSTM(full connections) handling spatiotemporal data is no spatial information encoded
  - all the inputs , cell outputs , hidden states , and gates of the ConvLSTM are 3D tensors with last two dimensions spatial dimensions (rows and columns).
  - [kernel size] ConvLSTM with a larger transitional kernel should be able to capture faster motions while one with a smaller kernel can capture slower motions.
  - [padding] padding of the hidden states on the boundary points can be viewed as using the state of the outside world for calculation.
  - The encoding LSTM compresses the whole input sequence into a hidden state tensor and the forecasting LSTM unfolds this hidden state.
Experiments
- Findings
  - ConvLSTM is better than FC-LSTM in handling spatiotemporal correlations.
  - Making the size of state-to-state convolutional kernel bigger than 1 is essential for capturing the spatiotemporal motion patterns.
  - Deeper models can produce better results with fewer parameters.
- Moving-MNIST Dataset
- Radar Echo Dataset

Personal Thoughts

ConvLSTM show its potential in dealing Spatiotemporal sequence data. ConvLSTM layer preserve not only the advantages of FC-LSTM but also reserve the spatial infomation.
apply ConvLSTM to video-based action recognition. One idea is to add ConvLSTM on top of the spatial feature maps generated by a convolutional neural network and use the hidden states of ConvLSTM for the ﬁnal classiﬁcation.
Application
- extract video feature (for both appearance and motion).