Enhanced Real-Time Intermediate Flow Estimation for Video Frame Interpolation
DOI: https://doi.org/10.13052/jwe1540-9589.2089

Keywords: Video frame interpolation, Optical flow estimation, Contextual information, Multiscale fusion

Abstract
Recently, the demand for high-quality video content has been increasing rapidly, driven by advances in network technology and the growth of video streaming platforms. In particular, displays with high refresh rates, such as 120 Hz, have become popular. However, visual quality improves only if the video stream is produced at the same high frame rate, so conventional low-frame-rate videos must be converted to a high frame rate in real time. This paper introduces a bidirectional intermediate flow estimation method for real-time video frame interpolation. A bidirectional intermediate optical flow is estimated directly to predict an accurate intermediate frame. For real-time processing, multiple frames are interpolated from a single intermediate optical flow, and parts of the network are implemented in 16-bit floating-point precision. A perceptual loss is also applied to improve the perceptual quality of the interpolated frames. The experimental results show a high prediction accuracy of 35.54 dB on the Vimeo90K triplet benchmark dataset, and an interpolation speed of 84 fps is achieved at 480p resolution.
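To make the idea in the abstract concrete, the following is a minimal PyTorch-style sketch, not the paper's implementation, of intermediate-flow-based interpolation: a single pair of bidirectional intermediate flows (F_t→0, F_t→1) is used to backward-warp the two input frames toward the intermediate time step, the warped frames are fused, and inference runs under 16-bit floating-point autocast for speed. The names `flow_net`, `fusion_net`, `backward_warp`, and `interpolate_midframe` are hypothetical placeholders introduced only for illustration.

```python
import torch
import torch.nn.functional as F

def backward_warp(frame, flow):
    """Backward-warp `frame` using a per-pixel flow.

    frame: (N, C, H, W) tensor; flow: (N, 2, H, W) tensor of (x, y)
    displacements from the intermediate frame back into `frame`.
    """
    n, _, h, w = frame.shape
    # Base sampling grid in pixel coordinates.
    ys, xs = torch.meshgrid(
        torch.arange(h, device=frame.device),
        torch.arange(w, device=frame.device),
        indexing="ij",
    )
    grid = torch.stack((xs, ys), dim=0).float().unsqueeze(0) + flow
    # Normalize coordinates to [-1, 1], as grid_sample expects.
    gx = 2.0 * grid[:, 0] / max(w - 1, 1) - 1.0
    gy = 2.0 * grid[:, 1] / max(h - 1, 1) - 1.0
    return F.grid_sample(frame, torch.stack((gx, gy), dim=-1),
                         align_corners=True)

def interpolate_midframe(flow_net, fusion_net, frame0, frame1):
    """Predict the frame halfway between frame0 and frame1.

    Assumption: `flow_net` outputs the bidirectional intermediate flows
    (F_t->0, F_t->1) directly from the two inputs, and `fusion_net`
    blends the two warped frames (e.g., with a learned mask). Both are
    placeholder modules, not the paper's actual architecture.
    """
    with torch.autocast("cuda", dtype=torch.float16):  # FP16 inference
        flow_t0, flow_t1 = flow_net(frame0, frame1)
        warped0 = backward_warp(frame0, flow_t0)
        warped1 = backward_warp(frame1, flow_t1)
        return fusion_net(warped0, warped1, flow_t0, flow_t1)
```

In this sketch, reusing the single estimated flow pair for the warping of both inputs (rather than running a separate flow network per output frame) is what keeps the per-frame cost low, which is in line with the abstract's real-time claim.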