Research on End-to-end Voiceprint Recognition Model Based on Convolutional Neural Network
Keywords: Convolutional neural network; End-to-end voiceprint recognition; Voiceprint recognition model.
The speech signal is time-varying and is strongly affected by the individual speaker and the environment. To improve the end-to-end voiceprint recognition rate, the original speech signal must be preprocessed. This paper proposes an end-to-end voiceprint recognition algorithm based on a convolutional neural network. The algorithm uses the convolution and down-sampling operations of the convolutional neural network to preprocess the speech signals. One-dimensional and two-dimensional convolution operations are then established to extract Mel-frequency cepstral coefficient (MFCC) feature parameters from the preprocessed signals, and the classical universal background model is used to build the voiceprint recognition model. The study first analyzes the principle of end-to-end voiceprint recognition and examines the recognition process, the recognition features, and the Res-FD-CNN network structure. It then constructs the convolutional neural network recognition model, preprocesses the data to form the frequency-domain convolutional layer, and tests the algorithm.
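The convolution and down-sampling steps used for preprocessing can be illustrated with a minimal sketch. It is not the Res-FD-CNN model described in this paper: the function names `conv1d_valid` and `max_pool1d`, the toy frame, and the fixed 3-tap kernel are all assumptions chosen for illustration, and a trained network would learn its kernel weights rather than use fixed ones.

```python
from typing import List, Sequence


def conv1d_valid(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Slide `kernel` over `signal` without padding (CNN-style cross-correlation)."""
    n, k = len(signal), len(kernel)
    if k == 0 or k > n:
        raise ValueError("kernel must be non-empty and no longer than the signal")
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(n - k + 1)
    ]


def max_pool1d(values: Sequence[float], size: int) -> List[float]:
    """Down-sample by keeping the maximum of each non-overlapping window."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [
        max(values[i:i + size])
        for i in range(0, len(values) - size + 1, size)
    ]


# Toy waveform standing in for a short frame of speech samples (assumed data).
frame = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.0, 1.0]
# Hypothetical 3-tap smoothing kernel; a trained CNN would learn its own weights.
kernel = [0.25, 0.5, 0.25]

feature_map = conv1d_valid(frame, kernel)  # 10 - 3 + 1 = 8 values
pooled = max_pool1d(feature_map, size=2)   # 8 // 2 = 4 values
print(feature_map)
print(pooled)
```

In this sketch the convolution shortens the 10-sample frame to 8 values and the pooling step halves that to 4, which shows how the two operations together reduce the amount of data passed to later feature extraction.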