Keyframe Generation Method via Improved Clustering and Silhouette Coefficient for Video Summarization

Authors

  • Fengsui Wang School of Electrical Engineering, Anhui Polytechnic University, 241000 Wuhu, China; Key Laboratory of Detection Technology and Energy Saving Devices, 241000 Wuhu, China; Key Laboratory of Advanced Perception and Intelligent Control of High-end Equipment, Ministry of Education, 241000 Wuhu, China
  • Jingang Chen School of Electrical Engineering, Anhui Polytechnic University, 241000 Wuhu, China; Key Laboratory of Detection Technology and Energy Saving Devices, 241000 Wuhu, China; Key Laboratory of Advanced Perception and Intelligent Control of High-end Equipment, Ministry of Education, 241000 Wuhu, China
  • Furong Liu School of Electrical Engineering, Anhui Polytechnic University, 241000 Wuhu, China; Key Laboratory of Detection Technology and Energy Saving Devices, 241000 Wuhu, China; Key Laboratory of Advanced Perception and Intelligent Control of High-end Equipment, Ministry of Education, 241000 Wuhu, China

DOI:

https://doi.org/10.13052/jwe1540-9589.2018

Keywords:

Video analysis, video summarization, hierarchical clustering, k-means clustering, silhouette coefficient

Abstract

The traditional k-means algorithm tends to fall into local optima in video summarization because its initial parameters are set unreasonably. To address this issue, a video summarization algorithm based on improved clustering and the silhouette coefficient is proposed. First, color features and texture features are extracted from the decomposed video frames and fused. Second, a hierarchical clustering algorithm produces the initial clustering result. Then, an improved k-means algorithm guided by the silhouette coefficient refines this initial result. Finally, the frame nearest to each cluster center is selected as a keyframe, and all keyframes are arranged in the temporal order of the original video to constitute the summary. The proposed algorithm is evaluated on the benchmark Open Video Database, achieving an average precision of 71%, recall of 84%, and F-score of 76%, which is higher than state-of-the-art video summarization methods. Moreover, it generates keyframes that are closer to user summaries and effectively improves the overall quality of the generated summary.
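For clarity, the sketch below outlines the pipeline described above: hierarchical clustering seeds k-means, the silhouette coefficient selects the number of clusters, and the frame nearest each cluster center becomes a keyframe. It is a minimal illustration, not the authors' implementation: it assumes every frame has already been reduced to a fused color-texture feature vector, and it uses scikit-learn's AgglomerativeClustering, KMeans, and silhouette_score as stand-ins for the paper's hierarchical clustering, improved k-means, and silhouette components. The function name select_keyframes and the candidate range of cluster counts are illustrative choices only.

```python
# Minimal sketch of the keyframe-selection pipeline (assumptions noted above);
# it is not the authors' exact method.
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.metrics import silhouette_score


def select_keyframes(features, k_candidates=range(2, 11)):
    """features: (n_frames, d) array of fused color+texture descriptors.
    Returns indices of the selected keyframes in temporal order."""
    # Only consider cluster counts smaller than the number of frames.
    k_candidates = [k for k in k_candidates if k < len(features)]
    best_k, best_score, best_model = None, -1.0, None
    for k in k_candidates:
        # Step 1: hierarchical clustering gives the initial partition.
        init_labels = AgglomerativeClustering(n_clusters=k).fit_predict(features)
        # Its per-cluster means seed k-means instead of random centers.
        init_centers = np.vstack(
            [features[init_labels == c].mean(axis=0) for c in range(k)]
        )
        # Step 2: k-means refines the initial partition.
        km = KMeans(n_clusters=k, init=init_centers, n_init=1).fit(features)
        # Step 3: the silhouette coefficient scores this choice of k.
        score = silhouette_score(features, km.labels_)
        if score > best_score:
            best_k, best_score, best_model = k, score, km
    # Step 4: the frame closest to each cluster center becomes a keyframe,
    # reported in the original temporal order.
    keyframes = []
    for c in range(best_k):
        members = np.where(best_model.labels_ == c)[0]
        dists = np.linalg.norm(
            features[members] - best_model.cluster_centers_[c], axis=1
        )
        keyframes.append(int(members[np.argmin(dists)]))
    return sorted(set(keyframes))
```

Seeding k-means with the hierarchical partition removes the dependence on random initial centers, which is the failure mode of the traditional algorithm identified in the abstract, while the silhouette score provides a data-driven criterion for the number of clusters and hence the length of the summary.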


Author Biographies

Fengsui Wang, School of Electrical Engineering, Anhui Polytechnic University, 241000 Wuhu, China; Key Laboratory of Detection Technology and Energy Saving Devices, 241000 Wuhu, China; Key Laboratory of Advanced Perception and Intelligent Control of High-end Equipment, Ministry of Education, 241000 Wuhu, China

Fengsui Wang received the Ph.D. degree in circuits and systems from Nanjing University, Nanjing, China, in 2013. He was a visiting scholar at the University of Bridgeport, Connecticut, USA, in 2017. Since 2013, he has been a faculty member at Anhui Polytechnic University, Wuhu, China. His current research interests include image processing, video communication, machine learning, and computer vision.

Jingang Chen, School of Electrical Engineering, Anhui Polytechnic University, 241000 Wuhu, China; Key Laboratory of Detection Technology and Energy Saving Devices, 241000 Wuhu, China; Key Laboratory of Advanced Perception and Intelligent Control of High-end Equipment, Ministry of Education, 241000 Wuhu, China

Jingang Chen received the B.S. degree from Anhui Polytechnic University, Wuhu, China, in July 2019. He is currently an M.S. candidate at Anhui Polytechnic University. His current research interests include machine learning and image processing.

Furong Liu, School of Electrical Engineering, Anhui Polytechnic University, 241000 Wuhu, China; Key Laboratory of Detection Technology and Energy Saving Devices, 241000 Wuhu, China; Key Laboratory of Advanced Perception and Intelligent Control of High-end Equipment, Ministry of Education, 241000 Wuhu, China

Furong Liu received the B.S. degree from Anhui Polytechnic University, Wuhu, China, in July 2019. She is currently an M.S. candidate at Anhui Polytechnic University. Her current research interests include machine learning and video analysis.

References

B. T. Truong and S. Venkatesh, “Video abstraction: A systematic review and classification,” ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), vol. 3, no. 1, Art. no. 3, Feb. 2007.

A. Kapoor and A. Singhal, “A comparative study of K-Means, K-Means++ and Fuzzy C-Means clustering algorithms,” in 2017 3rd International Conference on Computational Intelligence & Communication Technology (CICT), Ghaziabad, India, 2017, pp. 1–6.

S. Chakraborty, O. Tickoo, and R. Iyer, “Adaptive keyframe selection for video summarization,” in 2015 IEEE Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 2015, pp. 702–709.

S. Mei, G. L. Guan, Z. Y. Wang, et al., “Video summarization via minimum sparse reconstruction,” Pattern Recognition, vol. 48, no. 2, pp. 522–533, Feb. 2015.

G. L. Guan, Z. Y. Wang, S. Y. Lu, et al., “Keypoint-based keyframe selection,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 23, no. 4, pp. 729–734, Apr. 2013.

J. Fuentes, J. Ruiz, and J. M. Rendon, “Salient point tracking for key frames selection in outdoor image sequences,” IEEE Latin America Transactions, vol. 14, no. 5, pp. 2461–2469, May 2016.

Y. Zhang, R. Tao, and Y. Wang, “Motion-state-adaptive video summarization via spatiotemporal analysis,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 27, no. 6, pp. 1340–1352, Jun. 2017.

G. L. Priya and S. Domnic, “Shot based keyframe extraction for ecological video indexing and retrieval,” Ecological Informatics, vol. 23, pp. 107–117, Sept. 2014.

A. Sasithradevi, S. M. M. Roomi, G. Maragatham, et al., “Video summarization using hierarchical shot boundary detection approach,” in 2017 Ninth International Conference on Advances in Pattern Recognition (ICAPR), Bangalore, India, 2017, pp. 1–5.

C. Huang and H. M. Wang, “A novel key-frames selection framework for comprehensive video summarization,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 30, no. 2, pp. 577–589, Jan. 2019.

M. Ma, S. Mei, S. Wan, et al., “Video summarization via block sparse dictionary selection,” Neurocomputing, vol. 378, pp. 197–209, Feb. 2020.

K. Muhammad, T. Hussain, and S. W. Baik, “Efficient CNN based summarization of surveillance videos for resource-constrained devices,” Pattern Recognition Letters, vol. 130, pp. 370–375, Feb. 2020.

S. Zhang, Y. Zhu, and A. K. Roy-Chowdhury, “Context-aware surveillance video summarization,” IEEE Transactions on Image Processing, vol. 25, no. 11, pp. 5469–5478, Nov. 2016.

S. S. Thomas, S. Gupta, and V. K. Subramanian, “Perceptual video summarization: A new framework for video summarization,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 27, no. 8, pp. 1790–1802, Aug. 2017.

G. Zhang, C. Zhang, and H. Zhang, “Improved K-means algorithm based on density Canopy,” Knowledge-Based Systems, vol. 145, pp. 289–297, Apr. 2018.

F. S. Wang, F. R. Liu, S. M. Zhu, et al., “HEVC intra frame based compressed domain video summarization,” in Proceedings of the International Conference on Artificial Intelligence, Information Processing and Cloud Computing, Hainan, China, 2019, pp. 1–7.

S. S. Thomas, S. Gupta, and V. K. Subramanian, “Context driven optimized perceptual video summarization and retrieval,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 29, no. 10, pp. 3132–3145, Oct. 2019.

S. De Avila, A. Lopes, A. Luz Jr, et al., “VSUMM: A mechanism designed to produce static video summaries and a novel evaluation method,” Pattern Recognition Letters, vol. 32, no. 1, pp. 56–68, Jan. 2011.

D. DeMenthon, V. Kobla, and D. Doermann, “Video summarization by curve simplification,” in Proceedings of the Sixth ACM International Conference on Multimedia, Bristol, UK, 1998, pp. 211–218.

J. Wu, S. Zhong, J. Jiang, and Y. Yang, “A novel clustering method for static video summarization,” Multimedia Tools and Applications, vol. 76, no. 7, pp. 9625–9641, Mar. 2017.

E. Elhamifar, G. Sapiro, and R. Vidal, “See all by looking at a few: Sparse modeling for finding representative objects,” in 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 2012, pp. 1600–1607.

J. Li, T. Yao, Q. Ling, and T. Mei, “Detecting shot boundary with sparse coding for video summarization,” Neurocomputing, vol. 266, pp. 66–78, Nov. 2017.

K. M. Mahmoud, M. A. Ismail, and N. M. Ghanem, “VSCAN: an enhanced video summarization using density-based spatial clustering,” in International Conference on Image Analysis and Processing, Berlin, Heidelberg, 2013, pp. 733–742.

D. Saravanan, “Efficient video indexing and retrieval using hierarchical clustering technique,” in Proceedings of the Second International Conference on Computational Intelligence and Informatics, Berlin, Heidelberg, 2018, pp. 1–8.

V. Cohen-Addad, V. Kanade, F. Mallmann-Trenn, et al., “Hierarchical clustering: Objective functions and algorithms,” Journal of the ACM, vol. 66, no. 4, pp. 1–42, Jun. 2019.

M. Hasanzadeh-Mofrad and A. Rezvanian, “Learning automata clustering,” Journal of Computational Science, vol. 24, pp. 379–388, Jan. 2018.

M. Furini, F. Geraci, M. Montangero, and M. Pellegrini, “STIMO: Still and Moving video storyboard for the web scenario,” Multimedia Tools and Applications, vol. 46, no. 1, pp. 47–69, Jan. 2010.

M. Asim, N. Almaadeed, S. Al-Maadeed, et al., “A key frame based video summarization using color features,” in 2018 Colour and Visual Computing Symposium (CVCS), Gjøvik, Norway, 2018, pp. 1–6.

G. Han, L. Liu, S. Chan, et al., “HySense: A hybrid mobile crowdsensing framework for sensing opportunities compensation under dynamic coverage constraint,” IEEE Communications Magazine, vol. 55, no. 3, pp. 93–99, Mar. 2017.

M. Tang, L. Gao, and J. Huang, “Communication, computation, and caching resource sharing for the internet of things,” IEEE Communications Magazine, vol. 58, no. 4, pp. 75–80, Apr. 2020.

Published

2021-02-18

How to Cite

Wang, F., Chen, J., & Liu, F. (2021). Keyframe Generation Method via Improved Clustering and Silhouette Coefficient for Video Summarization. Journal of Web Engineering, 20(1), 147–170. https://doi.org/10.13052/jwe1540-9589.2018

Issue

Section

Advanced Practice in Web Engineering