Few-shot Text Classification Method Based on Feature Optimization
Keywords: few-shot learning, text classification, feature optimization, WDAB-LSTM, prototypical network
To address the poor performance of few-shot text classification caused by insufficient data for feature representation, this paper combines a wide and deep attention bidirectional long short-term memory network (WDAB-LSTM) with a prototypical network to optimize text features for better classification performance. In the proposed algorithm, text augmentation and preprocessing are first applied to mitigate the shortage of samples, and WDAB-LSTM applies word-level attention to produce output vectors that capture important context-related information. A prototypical network is then added to optimize the distance-measurement module of the model, improving feature extraction and sample representation. To evaluate the algorithm, the Amazon Review Sentiment Classification (ARSC), Text Retrieval Conference (TREC), and Kaggle datasets are selected. Compared with the Siamese network and the prototypical network, the proposed algorithm with feature optimization achieves higher accuracy, precision, recall, and F1 score.
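The core classification step described above — embedding a query text and assigning it to the nearest class prototype — can be sketched as follows. This is a minimal illustration of the standard prototypical-network distance computation, not the paper's full WDAB-LSTM pipeline: the `support_emb` and `query_emb` arrays stand in for encoder outputs, and the function name `prototype_classify` is a hypothetical helper.

```python
import numpy as np

def prototype_classify(support_emb, support_labels, query_emb):
    """Assign each query embedding to the class with the nearest prototype.

    support_emb:    (n_support, d) encoder outputs for the support set
    support_labels: (n_support,) integer class ids
    query_emb:      (n_query, d) encoder outputs for the query set
    Returns:        (n_query,) predicted class ids
    """
    classes = np.unique(support_labels)
    # A prototype is the mean embedding of a class's support examples.
    prototypes = np.stack([support_emb[support_labels == c].mean(axis=0)
                           for c in classes])
    # Squared Euclidean distance from every query to every prototype.
    dists = ((query_emb[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=-1)
    return classes[dists.argmin(axis=1)]

# Toy 2-way 2-shot episode with 3-dimensional embeddings.
support = np.array([[1.0, 0.0, 0.0], [0.9, 0.1, 0.0],
                    [0.0, 1.0, 0.0], [0.1, 0.9, 0.0]])
labels = np.array([0, 0, 1, 1])
queries = np.array([[0.95, 0.05, 0.0], [0.05, 0.95, 0.0]])
print(prototype_classify(support, labels, queries))  # -> [0 1]
```

In the proposed method, this distance-measurement module operates on the attention-weighted WDAB-LSTM output vectors rather than raw embeddings, which is what the feature-optimization step refines.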