RDF Graph Summarization Based on Node Characteristic and Centrality

Authors

  • Jimao Guo College of Computer and Information Science, Southwest University, Chongqing 400715, China
  • Yi Wang College of Computer and Information Science, Southwest University, Chongqing 400715, China

DOI:

https://doi.org/10.13052/jwe1540-9589.2174

Keywords:

Knowledge graph summarization, node centrality, knowledge graph compression, node characteristic set, graph summarization

Abstract

The explosive growth of RDF data makes it difficult to query, understand and use efficiently. RDF graph (RDFG) summarization aims to extract the most relevant and crucial data as summaries according to different criteria. Current summarization approaches mainly apply a single strategy, such as graph structure, pattern mining or relevance metrics, to compute RDFG summaries. In contrast to these approaches, this paper proposes an approach that automatically generates an RDFG summary capturing both structure and centrality information. Specifically, we present three algorithms: SumW (merging nodes based on node characteristics or similar types), SumS (merging nodes based on typed node characteristics) and SummaryFL (retrieving central nodes by combining node frequency and bridging coefficient). The three algorithms support two summarization strategies: SumS or SumW alone, or SumS+SummaryFL or SumW+SummaryFL. We conducted experiments on large real-world RDF datasets to verify the effectiveness of our method with respect to time complexity, compression capability and coverage of the summary. The experimental results demonstrate that our approach outperformed the comparative algorithms.
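
The idea of merging nodes by their characteristics can be illustrated with a minimal sketch. It assumes that a node's characteristic set is the set of predicates on its outgoing triples, and that nodes with identical sets collapse into one summary node. The abstract does not give the exact definitions used by SumW or SumS, so the function names, the toy triples and the merging rule below are all hypothetical and should not be read as the authors' algorithms.

```python
from collections import defaultdict

# Hypothetical toy data: (subject, predicate, object) triples.
TRIPLES = [
    ("alice", "name", "Alice"),
    ("alice", "worksAt", "acme"),
    ("bob", "name", "Bob"),
    ("bob", "worksAt", "acme"),
    ("acme", "locatedIn", "chongqing"),
]


def characteristic_sets(triples):
    """Map each subject to the frozenset of its outgoing predicates.

    Assumption: this is one common reading of "node characteristic set";
    the paper may define it differently.
    """
    props = defaultdict(set)
    for subject, predicate, _ in triples:
        props[subject].add(predicate)
    return {s: frozenset(ps) for s, ps in props.items()}


def summarize(triples):
    """Merge subjects that share a characteristic set.

    Returns (groups, edges): groups maps each characteristic set to the
    subjects merged under it; edges is the set of summary triples between
    merged nodes. Objects that never occur as a subject map to None.
    """
    cs = characteristic_sets(triples)
    groups = defaultdict(set)
    for subject, key in cs.items():
        groups[key].add(subject)
    edges = set()
    for subject, predicate, obj in triples:
        edges.add((cs[subject], predicate, cs.get(obj)))
    return dict(groups), edges


if __name__ == "__main__":
    groups, edges = summarize(TRIPLES)
    for key, members in groups.items():
        print(sorted(key), "->", sorted(members))
    print(len(TRIPLES), "triples summarized into", len(edges), "summary edges")
```

In this toy input, `alice` and `bob` have the same outgoing predicates and therefore fall into one summary node, which is where the compression comes from. A centrality step such as SummaryFL would then rank the resulting nodes; that step is not sketched here because the abstract does not specify how frequency and bridging coefficient are combined.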

Author Biographies

Jimao Guo, College of Computer and Information Science, Southwest University, Chongqing 400715, China

Jimao Guo received her M.S. degree in Computer Science from Southwest University (China) in July 2022. She is now working at the Neijiang Education and Examination Institute, Sichuan, China. Her research interests include semantic technology and knowledge modeling.

Yi Wang, College of Computer and Information Science, Southwest University, Chongqing 400715, China

Yi Wang received her M.S. degree in Computer Science from Southwest University (China) in 2004 and her Ph.D. in Computer Science from Macquarie University (Australia) in 2012. She has been an associate professor at Southwest University (China) since 2014. Her research interests include knowledge representation and knowledge graph refinement.

Published

2022-12-06

Section

Articles