Bayesian Probability and Tanimoto Based Recurrent Neural Network for Question Answering System


  • Veeraraghavan Jagannathan, Associate Professor, Department of Computer Science and Engineering, Sri Vasavi Engineering College (Autonomous), Pedatadepalli, Tadepalligudem-534101, Andhra Pradesh, India



Question answering system, question classification, recurrent neural network, Bayesian probability, machine learning


Question Answering (QA) has become one of the most significant information retrieval applications. Most question answering systems focus on improving the user experience of finding relevant results. Owing to the continuous growth of web content, retrieving relevant results remains a challenge for a Question Answering System (QAS). This research therefore proposes an effective Question Classification (QC) and retrieval approach, named Bayesian probability and Tanimoto-based Recurrent Neural Network (RNN), to differentiate question types more efficiently. The research analyzes different types of questions with respect to their grammatical structures. Various patterns are identified from the questions, and the RNN classifier is used to classify them. The results obtained by the proposed Bayesian probability and Tanimoto-based RNN show that syntactic categories related to domain-specific types of proper nouns, numerals, and common nouns enable the RNN classifier to produce better results for different types of questions. The proposed approach achieved better performance, with precision, recall, and F-measure values of 90.14, 86.301, and 90.936, respectively, on dataset-2.
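
The abstract names a Tanimoto-based measure without stating how it is applied. As a hedged illustration only, the standard Tanimoto coefficient between two bag-of-words count vectors can be sketched as follows. The function name `tanimoto`, the whitespace tokenization, and the example questions are assumptions made for this sketch and are not taken from the paper.

```python
from collections import Counter


def tanimoto(a_tokens, b_tokens):
    """Standard Tanimoto coefficient: a.b / (|a|^2 + |b|^2 - a.b).

    Inputs are token lists; each is turned into a term-count vector.
    This is the textbook definition, not necessarily the paper's variant.
    """
    a, b = Counter(a_tokens), Counter(b_tokens)
    # Dot product over the terms the two questions share.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = sum(v * v for v in a.values())
    norm_b = sum(v * v for v in b.values())
    denom = norm_a + norm_b - dot
    # Two empty questions have no defined similarity; return 0.0 by convention.
    return dot / denom if denom else 0.0


# Hypothetical example questions, tokenized on whitespace.
q1 = "who wrote the novel".split()
q2 = "who wrote the play".split()
score = tanimoto(q1, q2)
```

For binary term vectors this reduces to the Jaccard index, so two questions sharing three of five distinct terms score 3/5.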



Author Biography

Veeraraghavan Jagannathan, Associate Professor, Department of Computer Science and Engineering, Sri Vasavi Engineering College (Autonomous), Pedatadepalli, Tadepalligudem-534101, Andhra Pradesh, India

Veeraraghavan Jagannathan is a researcher in machine learning, NLP, and deep learning. He received a postgraduate degree in Computer Science and Engineering from Anna University, Chennai, India, and a Ph.D. in Computer Science from the prestigious National Institute of Technology, Trichy, India, in 2008 and 2017, respectively. He has over a decade of research experience and 20 years of academic experience, and has published papers in reputed international journals. His current research areas include, but are not limited to, data analytics, GANs for NLP, computer vision and CNNs, medical image processing, medical data analytics, and predictive analytics.

