SPARQL Query Candidate Filtering for Improving the Quality of Multilingual Question Answering over Knowledge Graphs using Language Models

Authors

  • Aleksandr Perevalov, Leipzig University of Applied Sciences, Leipzig, Germany
  • Aleksandr Gashkov, Leipzig University of Applied Sciences, Leipzig, Germany
  • Maria Eltsova, CBZ München GmbH, Heilbronn, Germany
  • Andreas Both, Leipzig University of Applied Sciences, Leipzig, Germany, DATEV eG, Nuremberg, Germany

DOI:

https://doi.org/10.13052/jwe1540-9589.2444

Keywords:

Question answering over knowledge graphs, query validation, query candidate filtering, question answering quality, trustworthiness

Abstract

Question answering is an approach to retrieving information from a knowledge base using natural language. Question answering systems that work over knowledge graphs (KGQA) typically compute a ranked list of SPARQL query candidates for a given natural-language input, where the top-ranked query should reflect the intention and semantics of the user’s question. This article follows our long-term research agenda of providing trustworthy KGQA systems by presenting an approach for filtering incorrect queries. We employ (large) language models (LMs/LLMs) to distinguish between correct and incorrect queries. In contrast to our previous work, we address multilingual questions in major languages (English, German, French, Spanish, and Russian) and confirm the generalizability of the approach by also evaluating it on several low-resource languages (Ukrainian, Armenian, Lithuanian, Belarusian, and Bashkir). The considered LMs (BERT, DistilBERT, Mistral, Zephyr, GPT-3.5, and GPT-4) were applied as SPARQL query filters to two KGQA systems: QAnswer (a real-world system) and MemQA (an idealized system). The approach was evaluated on the multilingual dataset QALD-9-plus, which is based on the Wikidata knowledge graph. The experimental results indicate that the considered KGQA systems achieve quality improvements for all languages when using our query-filtering approach.
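
The filtering idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the example question, and the two SPARQL candidates are assumptions made for illustration, and the judge is a hard-coded keyword rule standing in for a language model that labels a question-query pair as correct or incorrect.

```python
from typing import Callable, List, Optional

# A judge takes (question, SPARQL query) and returns True if it deems the query correct.
# In the setting of the abstract this role is played by a language model; here it is
# left abstract so that any classifier can be plugged in.
Judge = Callable[[str, str], bool]


def filter_candidates(question: str, ranked: List[str], judge: Judge) -> List[str]:
    """Drop candidates the judge rejects; the original ranking order is preserved."""
    return [query for query in ranked if judge(question, query)]


def top_query(question: str, ranked: List[str], judge: Judge) -> Optional[str]:
    """Return the best surviving candidate, or None if every candidate was rejected."""
    kept = filter_candidates(question, ranked, judge)
    return kept[0] if kept else None


def toy_judge(question: str, query: str) -> bool:
    """Hypothetical stand-in for a language model: a hard-coded keyword rule."""
    return "mayor" in question.lower() and "wdt:P6" in query


# Illustrative data only (not taken from QALD-9-plus).
QUESTION = "Who is the mayor of Berlin?"
RANKED = [
    "SELECT ?o WHERE { wd:Q64 wdt:P17 ?o }",  # ranked first, but asks for the country
    "SELECT ?o WHERE { wd:Q64 wdt:P6 ?o }",   # asks for the head of government
]

if __name__ == "__main__":
    print(top_query(QUESTION, RANKED, toy_judge))
```

In this sketch, removing the rejected first candidate promotes the second one to the top of the list. When no candidate survives, the function returns None, which a system could use to decline to answer instead of returning a likely wrong result.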

     

Author Biographies

Aleksandr Perevalov, Leipzig University of Applied Sciences, Leipzig, Germany

Aleksandr Perevalov is a final-year Ph.D. student at the Leipzig University of Applied Sciences and Paderborn University (both in Germany). He also leads the research project “Language Agnostic Semantic Search over Knowledge Graphs” in collaboration with Springer Nature as an industry partner. His research interests focus on applied conversational AI and question answering.

Aleksandr Gashkov, Leipzig University of Applied Sciences, Leipzig, Germany

Aleksandr Gashkov is an independent researcher at the Web & Software Engineering research group, with a Ph.D. in Computational Linguistics. His research interests encompass natural language processing, multilingual question answering – both general and over knowledge graphs – and applied artificial intelligence.

Maria Eltsova, CBZ München GmbH, Heilbronn, Germany

Maria Eltsova is an independent researcher at the Web & Software Engineering research group. After receiving her doctorate in linguistics in 2006, she was a professor at Perm National Research Polytechnic University (Russia) until 2022. Her research interests include computational linguistics, psycholinguistics, question answering, and digitalization of endangered indigenous languages.

Andreas Both, Leipzig University of Applied Sciences, Leipzig, Germany, DATEV eG, Nuremberg, Germany

Andreas Both is Head of Research at DATEV eG (a top-tier business software provider in Germany) and a professor at the Leipzig University of Applied Sciences (Germany), where he leads the Web & Software Engineering (WSE) research group. The group focuses on high-quality, multilingual, KG-agnostic AI methods, in particular question answering systems, as well as applied AI, data, and web technologies.

Published

2025-07-31

How to Cite

Perevalov, A., Gashkov, A., Eltsova, M., & Both, A. (2025). SPARQL Query Candidate Filtering for Improving the Quality of Multilingual Question Answering over Knowledge Graphs using Language Models. Journal of Web Engineering, 24(04), 563–592. https://doi.org/10.13052/jwe1540-9589.2444

Issue

Section

ICWE 2024