TaylorSwiftNet: Taylor Driven Temporal Modeling for Swift Future Frame Prediction

(BMVC 2022)

Saber Pourheydari*1 · Emad Bahrami*1 · Mohsen Fayyaz*1,2
Gianpiero Francesca3 · Mehdi Noroozi4 · Juergen Gall1
1 University of Bonn,  2 Microsoft,  3 Toyota Motor Europe,  4 Samsung AI

* denotes equal contribution.

Overview
While recurrent neural networks (RNNs) demonstrate outstanding capabilities for future video frame prediction, they model dynamics in a discrete time space, i.e., they predict the frames sequentially with a fixed temporal step. RNNs are therefore prone to accumulating errors as the number of future frames increases. In contrast, partial differential equations (PDEs) model physical phenomena like dynamics in a continuous time space. However, the estimated PDE for frame forecasting needs to be solved numerically, which requires discretizing the PDE and diminishes most of the advantages compared to discrete models. In this work, we therefore propose to approximate the motion in a video by a continuous function using the Taylor series. To this end, we introduce TaylorSwiftNet, a novel convolutional neural network that learns to estimate the higher-order terms of the Taylor series for a given input video. TaylorSwiftNet can swiftly predict future frames in parallel, and it allows changing the temporal resolution of the forecast frames on-the-fly. The experimental results on various datasets demonstrate the superiority of our model.
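To illustrate the core idea, here is a minimal NumPy sketch of evaluating a truncated Taylor series at arbitrary continuous time offsets, in parallel. This is not the paper's implementation: in TaylorSwiftNet the higher-order terms are predicted by a convolutional network from the input clip, whereas here they are simply given; all names are illustrative.

```python
import math
import numpy as np

def taylor_forecast(derivatives, t):
    """Evaluate a truncated Taylor series at continuous times t.

    derivatives: list of arrays [f(0), f'(0), f''(0), ...]; in the
        paper's setting these higher-order terms would be estimated
        by the network (this function is only an illustration).
    t: 1-D array of future time offsets; any temporal resolution,
        and all offsets are evaluated in parallel.
    Returns an array of shape (len(t), *frame_shape).
    """
    t = np.asarray(t, dtype=float)
    out = np.zeros((t.shape[0],) + derivatives[0].shape)
    for k, d_k in enumerate(derivatives):
        # k-th Taylor term: (t^k / k!) * f^(k)(0)
        coeff = t ** k / math.factorial(k)
        out += coeff.reshape(-1, *([1] * d_k.ndim)) * d_k
    return out
```

Because the series is a closed-form function of t, all future frames can be produced in one shot, and the forecast resolution can be changed simply by passing a finer or coarser grid of time offsets.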
Results
BibTeX
@InProceedings{taylorswiftnet,
  author    = {Pourheydari, Saber and Bahrami, Emad and Fayyaz, Mohsen and Francesca, Gianpiero and Noroozi, Mehdi and Gall, Juergen},
  title     = {TaylorSwiftNet: Taylor Driven Temporal Modeling for Swift Future Frame Prediction},
  booktitle = {British Machine Vision Conference (BMVC)},
  year      = {2022}
}

Acknowledgements
This template was originally made by GenForce.