LUBM4OBDA: Benchmarking OBDA Systems with Inference and Meta Knowledge

Authors

  • Julián Arenas-Guerrero Universidad Politécnica de Madrid, Spain https://orcid.org/0000-0002-3029-6469
  • María S. Pérez Universidad Politécnica de Madrid, Spain
  • Oscar Corcho Universidad Politécnica de Madrid, Spain

DOI:

https://doi.org/10.13052/jwe1540-9589.2284

Keywords:

OBDA, semantic web, ontology, data integration

Abstract

Ontology-based data access (OBDA) focuses on enabling query evaluation over heterogeneous relational databases according to the model represented by an ontology. The relationships between the ontology and the data sources are commonly defined with declarative mappings, which systems use either to perform SPARQL-to-SQL query translation or to generate RDF dumps from the relational databases. Beyond homogenizing the data through a shared ontology, this paradigm can enable reasoning over the ontology and querying for meta knowledge, that is, information that describes statements, such as provenance or certainty. In this paper, we (i) adapt a widely used RDF graph store benchmark, LUBM, for ontology-based data access, (ii) extend the benchmark to evaluate queries that exploit meta knowledge, and (iii) apply it to evaluate the performance of state-of-the-art declarative mapping systems. Our proposal, the LUBM4OBDA Benchmark, considers inference capabilities that are not covered by previous ontology-based data access benchmarks, and it is the first benchmark for the evaluation of meta knowledge and the RDF-star data model. The experimental evaluation shows that current virtualization systems cannot handle some advanced inference tasks, and that optimizations are needed to scale RDF-star materialization.
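
To make the ideas in the abstract concrete, the following is a minimal sketch of materialization with inference and meta knowledge. It is not taken from the paper or from any of the evaluated systems: the `person` table, the `http://example.org/` namespace, the `source` property and the two-class hierarchy are all hypothetical, and a hard-coded Python function stands in for a declarative mapping. Each relational row becomes a class assertion, a simple subclass rule adds the inferred superclass, and the asserted triple is annotated with its provenance using Turtle-star-like quoted-triple syntax.

```python
import sqlite3

EX = "http://example.org/"  # hypothetical namespace, not from the paper

# Hypothetical class hierarchy: GraduateStudent is a subclass of Student.
SUBCLASS_OF = {"GraduateStudent": "Student"}


def materialize(conn):
    """Map each row of the toy `person` table to RDF-star statements."""
    statements = []
    rows = conn.execute("SELECT id, class, source FROM person ORDER BY id")
    for pid, cls, source in rows:
        subject = f"<{EX}person/{pid}>"
        # Simple RDFS-style inference: follow subclass links upwards.
        classes = [cls]
        while classes[-1] in SUBCLASS_OF:
            classes.append(SUBCLASS_OF[classes[-1]])
        for c in classes:
            statements.append(f"{subject} a <{EX}{c}> .")
        # Meta knowledge: annotate the asserted triple with its provenance.
        asserted = f"{subject} a <{EX}{cls}>"
        statements.append(f'<< {asserted} >> <{EX}source> "{source}" .')
    return statements


def demo():
    """Build a one-row in-memory database and materialize it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (id INTEGER, class TEXT, source TEXT)")
    conn.execute("INSERT INTO person VALUES (1, 'GraduateStudent', 'registry')")
    return materialize(conn)


if __name__ == "__main__":
    for line in demo():
        print(line)
```

A virtualization system would instead leave the data in the database and translate each SPARQL query into SQL at query time; the sketch only illustrates the materialization route.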

Author Biographies

Julián Arenas-Guerrero, Universidad Politécnica de Madrid, Spain

Julián Arenas-Guerrero received his B.Sc. degree in computer science engineering from the Universidad Complutense de Madrid, and his M.Sc. degree in artificial intelligence from the Universidad Politécnica de Madrid. He is currently pursuing a Ph.D. degree in artificial intelligence with the Universidad Politécnica de Madrid. His research interests include data integration, semantic web and knowledge graphs.

María S. Pérez, Universidad Politécnica de Madrid, Spain

María S. Pérez is currently a Full Professor with the Universidad Politécnica de Madrid. She is also part of the Board of Directors of BDVA and a member of the Research and Innovation Advisory Group of the EuroHPC Joint Undertaking. She has coauthored four books and seven book chapters. She has published more than 100 articles in international journals and conferences. She has been involved in the organization of several workshops and conferences and has edited several proceedings books and special issues. She has participated in a number of EU projects (ENTENTE, BigStorage, Wf4Ever, PlanetData, SEMDATA, SCALUS, SemsorGrid4Env, SEALS, OntoGrid, and RAIL) and Spanish R&D projects (CABAHLA, España Virtual, myBigData, GeoBuddies, 4V, and Datos 4.0). Her research interests include data science, big data, machine learning, storage, high performance, and large-scale computing. She has served as a program committee member for many relevant conferences.

Oscar Corcho, Universidad Politécnica de Madrid, Spain

Oscar Corcho is currently a Full Professor with the Universidad Politécnica de Madrid, where he is also co-director of the Ontology Engineering Group. He previously worked as a Marie Curie researcher at the University of Manchester and as a research manager at the company iSOCO. He has published several books, notably “Ontological Engineering”, which is used as an academic text in several Spanish and foreign universities, as well as more than 100 articles in journals, conferences and workshops. He is a member of the editorial committee of several journals and regularly participates in the program committees of the most relevant conferences in the semantic web field, having chaired the program committee of some of them.

Published

2024-02-22

How to Cite

Arenas-Guerrero, J., Pérez, M. S., & Corcho, O. (2024). LUBM4OBDA: Benchmarking OBDA Systems with Inference and Meta Knowledge. Journal of Web Engineering, 22(08), 1163–1186. https://doi.org/10.13052/jwe1540-9589.2284
