LUBM4OBDA: Benchmarking OBDA Systems with Inference and Meta Knowledge
DOI:
https://doi.org/10.13052/jwe1540-9589.2284
Keywords:
OBDA, semantic web, ontology, data integration
Abstract
Ontology-based data access enables query evaluation over heterogeneous relational databases according to the model represented by an ontology. The relationships between the ontology and the data sources are commonly defined with declarative mappings, which systems use either to translate SPARQL queries into SQL or to generate RDF dumps from the relational databases. Beyond homogenizing data under a shared ontology, this paradigm can enable reasoning over the ontology and querying for meta knowledge, which annotates statements with information such as provenance or certainty. In this paper, we (i) adapt LUBM, a widely used RDF graph store benchmark, for ontology-based data access, (ii) extend the benchmark to evaluate queries that exploit meta knowledge, and (iii) apply it to evaluate the performance of state-of-the-art declarative mapping systems. Our proposal, the LUBM4OBDA Benchmark, considers inference capabilities that are not covered by previous ontology-based data access benchmarks, and it is the first to evaluate meta knowledge and the RDF-star data model. The experimental evaluation shows that current virtualization systems cannot handle some advanced inference tasks, and that optimizations are needed to scale RDF-star materialization.