Research on Semantic Similarity of Short Text Based on BERT and Time Warping Distance
Keywords: BERT; CTW; Time Warping Distance; Lexical Ambiguity; Semantic Similarity
Research on the semantic similarity of short texts plays an important role in machine translation, sentiment analysis, information retrieval, and other AI applications. However, existing short-text similarity methods struggle to analyze the characteristics of ambiguous vocabulary effectively, and their handling of word-order problems also needs further optimization. This paper proposes a short-text semantic similarity calculation method that combines BERT with a time warping distance algorithm in order to address lexical ambiguity. The model first uses a pre-trained BERT model to extract the semantic features of a short text at the whole-sentence level, obtaining a 768-dimensional feature vector. It then transforms the extracted feature vector into a sequence of points in space and uses the CTW algorithm to compute the time warping distance between the curves traced by these point sequences. Finally, a purpose-designed weight function, built on the principle that a smaller time warping distance implies a higher degree of similarity, converts the distance into a similarity score between short texts. The experimental results show that this model can mine the feature information of ambiguous words and effectively calculate the similarity of short texts containing lexical ambiguity. Compared with other models, it distinguishes the semantic features of ambiguous words more accurately.
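The pipeline above can be sketched in a few lines: reshape each 768-dimensional sentence vector into a point sequence, align the two sequences with a warping distance, and map that distance to a similarity score. The sketch below substitutes classic dynamic time warping (DTW) for the full CTW alignment, and the exponential weight function, the sequence length of 96 points, and the decay parameter `alpha` are illustrative assumptions, not the paper's actual design.

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping distance between two point sequences
    a (m x d) and b (n x d) -- a simplified stand-in for CTW."""
    m, n = len(a), len(b)
    D = np.full((m + 1, n + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # local cost: Euclidean distance between aligned points
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j],      # insertion
                                 D[i, j - 1],      # deletion
                                 D[i - 1, j - 1])  # match
    return D[m, n]

def short_text_similarity(vec_a, vec_b, points=96, alpha=0.1):
    """Reshape two 768-dim sentence vectors into point sequences and
    map the warping distance to a (0, 1] score: smaller distance,
    higher similarity. exp(-alpha * d) is an illustrative choice."""
    seq_a = np.asarray(vec_a).reshape(points, -1)
    seq_b = np.asarray(vec_b).reshape(points, -1)
    return float(np.exp(-alpha * dtw_distance(seq_a, seq_b)))

rng = np.random.default_rng(0)
v = rng.standard_normal(768)
w = rng.standard_normal(768)
print(short_text_similarity(v, v))  # identical vectors -> 1.0
print(short_text_similarity(v, w))  # distinct vectors -> score below 1.0
```

In practice the two 768-dimensional vectors would come from a pre-trained BERT encoder (e.g. the pooled sentence embedding), and the weight function would be the one designed in the paper rather than a fixed exponential.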