Keyframe Generation Method via Improved Clustering and Silhouette Coefficient for Video Summarization
Keywords: Video analysis, video summarization, hierarchical clustering, k-means clustering, silhouette coefficient
To address the problem that the traditional k-means algorithm converges to a local optimum in video summarization when its initial parameters are set poorly, a video summarization algorithm based on improved clustering and the silhouette coefficient is proposed. First, color and texture features are extracted from the decomposed video frames and fused. Second, a hierarchical clustering algorithm produces the initial clustering result. Then, an improved k-means algorithm guided by the silhouette coefficient refines this initial result. Finally, the frame nearest to each cluster center is selected as a keyframe, and the keyframes are arranged in the temporal order of the original video to form the summary. Evaluated on the benchmark Open Video Database, the proposed algorithm achieves an average precision of 71%, recall of 84%, and F-score of 76%, outperforming state-of-the-art video summarization methods. Moreover, it generates keyframes that are closer to user summaries and effectively improves the overall quality of the generated summary.
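The clustering pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the fused color-texture features are replaced by a generic per-frame feature matrix, scikit-learn stands in for the improved algorithms, the candidate range of cluster counts is an assumed parameter, and the function name `select_keyframes` is illustrative.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.metrics import silhouette_score

def select_keyframes(features, k_min=2, k_max=6):
    """Illustrative sketch: hierarchical seeding + k-means refinement,
    with the cluster count k chosen by the silhouette coefficient.

    features : (n_frames, n_dims) array of fused per-frame features.
    Returns the selected keyframe indices in temporal order.
    """
    best_k, best_score, best_km = None, -1.0, None
    for k in range(k_min, min(k_max, len(features) - 1) + 1):
        # Step 1: hierarchical clustering yields the initial partition.
        init_labels = AgglomerativeClustering(n_clusters=k).fit_predict(features)
        # Its cluster means seed k-means, avoiding random initialization.
        seeds = np.stack([features[init_labels == c].mean(axis=0)
                          for c in range(k)])
        # Step 2: k-means refines the partition from those seeds.
        km = KMeans(n_clusters=k, init=seeds, n_init=1).fit(features)
        # Step 3: the silhouette coefficient scores this value of k.
        score = silhouette_score(features, km.labels_)
        if score > best_score:
            best_k, best_score, best_km = k, score, km
    # Step 4: the frame nearest each cluster center becomes a keyframe.
    keyframes = []
    for c in range(best_k):
        members = np.flatnonzero(best_km.labels_ == c)
        dists = np.linalg.norm(
            features[members] - best_km.cluster_centers_[c], axis=1)
        keyframes.append(int(members[np.argmin(dists)]))
    # Sorting restores the temporal order of the original video.
    return sorted(keyframes)
```

On synthetic features containing three well-separated groups of frames, the silhouette score peaks at k = 3 and one representative frame per group is returned, which mirrors the intended behavior: hierarchical clustering supplies stable seeds so that k-means does not depend on a random start.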