A METRIC BASED AUTOMATIC SELECTION OF ONTOLOGY MATCHERS USING BOOTSTRAPPED PATTERNS
Keywords:
Automatic Matching, Ontology Matcher Selection, Bootstrapping Patterns, Ontology Matching, Ontology Metrics
Abstract
The ontology matching process has become a vital part of the (semantic) web, enabling interoperability among heterogeneous data. To achieve this, similar entity pairs across heterogeneous data are discovered using a static set of matchers consisting of linguistic, structural and/or instance matchers. Numerous sets of matchers exist in the literature; however, no single set achieves good results across all data. In addition, selecting the best set of matchers for the given data is tedious and painstaking for domain experts. In this paper, we propose two bootstrapping-based approaches, Bottom-up and Top-down, to automatically select the best set of matchers for the given ontologies to be matched. The selection is based on the characteristics of the ontologies, which are quantified by a set of quality metrics. Two new structural quality metrics, the Concept External Structural Richness (CESR) and the Concept Internal Structural Richness (CISR), are also proposed to better quantify the structural characteristics of an ontology. The best set of matchers is chosen using the sets of patterns learned through the proposed Bottom-up and Top-down bootstrapping approaches. The proposed metrics and the patterns constructed using these approaches are evaluated using the COMA matching tool with existing benchmark ontologies (the Benchmark, Conference and Benchmark2 tracks of the OAEI 2011). The proposed Bottom-up patterns, along with the two proposed quality metrics, achieved better effectiveness (F-measure) in selecting the best set of matchers than a static set of matchers, supervised ML algorithms and existing automatic matching approaches. Specifically, the proposed Bottom-up patterns achieve a 14.6% Average Gain/Task, a significant improvement of 129% over the Average Gain/Task of the existing KNN model.
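As an illustration of the effectiveness measure mentioned above, the following minimal Python sketch computes the F-measure of an alignment against a reference alignment and picks the matcher set with the highest score. It is not the paper's bootstrapping method, and it does not use the COMA API. The function names, the matcher-set labels and the toy correspondences are all hypothetical and chosen only for this example.

```python
def f_measure(found, reference):
    """Harmonic mean of precision and recall over sets of correspondences."""
    found, reference = set(found), set(reference)
    correct = len(found & reference)
    if correct == 0:
        # Covers empty inputs and alignments with no correct correspondence.
        return 0.0
    precision = correct / len(found)
    recall = correct / len(reference)
    return 2 * precision * recall / (precision + recall)


def best_matcher_set(alignments, reference):
    """Return the label of the matcher set whose alignment has the highest F-measure."""
    return max(alignments, key=lambda name: f_measure(alignments[name], reference))


# Hypothetical reference alignment and outputs of two hypothetical matcher sets.
reference = {("Paper", "Article"), ("Author", "Writer"), ("Review", "Review")}
alignments = {
    "linguistic": {("Paper", "Article"), ("Review", "Review"), ("Chair", "Author")},
    "linguistic+structural": {("Paper", "Article"), ("Author", "Writer"), ("Review", "Review")},
}

for name, found in alignments.items():
    print(name, round(f_measure(found, reference), 3))
print("best:", best_matcher_set(alignments, reference))
```

In this toy setting the best set is found by exhaustively scoring every candidate against a known reference. The approaches proposed in the paper instead aim to predict the best set from ontology quality metrics, so that no reference alignment is needed at selection time.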