Enhanced Real-Time Intermediate Flow Estimation for Video Frame Interpolation

Authors

  • Minseop Kim, Department of Multimedia Engineering, Hanbat National University, Daejeon, Republic of Korea
  • Haechul Choi, Department of Multimedia Engineering, Hanbat National University, Daejeon, Republic of Korea, https://orcid.org/0000-0002-7594-0828

DOI:

https://doi.org/10.13052/jwe1540-9589.2089

Keywords:

Video frame interpolation, Optical flow estimation, Contextual information, Multiscale fusion

Abstract

Demand for high-quality video content has been increasing rapidly, driven by advances in network technology and the growth of video streaming platforms. In particular, displays with high refresh rates, such as 120 Hz, have become popular. However, visual quality improves only if the video stream is produced at a correspondingly high frame rate. To obtain this quality, conventional low-frame-rate videos must be converted to a high frame rate in real time. This paper introduces a bidirectional intermediate flow estimation method for real-time video frame interpolation. A bidirectional intermediate optical flow is estimated directly to predict an accurate intermediate frame. For real-time processing, multiple frames are interpolated from a single intermediate optical flow, and parts of the network are implemented in 16-bit floating-point precision. A perceptual loss is also applied to improve the perceived quality of the interpolated frames. In experiments, the method achieved a high prediction accuracy of 35.54 dB on the Vimeo90K triplet benchmark dataset and an interpolation speed of 84 fps at 480p resolution.
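
The arithmetic behind two of the figures in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that the reported dB value is a peak signal-to-noise ratio (PSNR) on 8-bit pixel values, which the abstract does not state, and that the target frame rate is an integer multiple of the source rate. The function names intermediate_timesteps and psnr are hypothetical helpers introduced only for this illustration.

```python
import math


def intermediate_timesteps(source_fps, target_fps):
    """Return the time positions in (0, 1) of the frames to synthesize
    between each pair of consecutive source frames.

    Assumption: target_fps is an integer multiple of source_fps.
    """
    if source_fps <= 0 or target_fps % source_fps != 0:
        raise ValueError("target_fps must be a positive integer multiple of source_fps")
    factor = target_fps // source_fps
    # A factor of 4 (e.g. 30 fps -> 120 fps) needs 3 new frames per pair.
    return [k / factor for k in range(1, factor)]


def psnr(reference, predicted, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two images given as
    flat sequences of pixel values of equal length.

    Assumption: pixels are 8-bit, so the peak value is 255.
    """
    if len(reference) != len(predicted):
        raise ValueError("images must have the same number of pixels")
    if len(reference) == 0:
        raise ValueError("images must not be empty")
    mse = sum((r - p) ** 2 for r, p in zip(reference, predicted)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    print(intermediate_timesteps(30, 120))
    print(round(psnr([10, 20, 30, 40], [11, 19, 31, 39]), 2))
```

Under these assumptions, converting 30 fps material for a 120 Hz display requires three synthesized frames between every pair of source frames, which is the situation in which reusing a single intermediate optical flow for several output frames saves computation.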

Author Biographies

Minseop Kim, Department of Multimedia Engineering, Hanbat National University, Daejeon, Republic of Korea

Minseop Kim received a B.S. from the Department of Information and Communication Engineering at Hanbat National University, Daejeon, Korea, in 2018, where he also received an M.S. from the Department of Multimedia Engineering. He is currently working toward a Ph.D. in the Department of Multimedia Engineering at Hanbat National University. His research interests include computer vision, machine learning, and parallel processing.

Haechul Choi, Department of Multimedia Engineering, Hanbat National University, Daejeon, Republic of Korea

Haechul Choi received a B.S. in electronics engineering from Kyungpook National University, Daegu, Korea, in 1997. He received his M.S. and Ph.D. in electrical engineering from the Korea Advanced Institute of Science and Technology, Daejeon, Korea, in 1999. He was a senior researcher at the Broadcasting Media Research Department of the Electronics and Telecommunications Research Institute until 2010 and was an adjunct professor at the University of Science and Technology. He was a visiting professor at the Florida Institute of Technology from 2015 to 2016. He is currently a professor at Hanbat National University. His current research areas include image processing, image compression, video coding, video compression, signal processing, computer vision, and deep learning.

Published

2021-11-21

Section

Articles