A LEXICAL APPROACH FOR TAXONOMY MAPPING

Authors

  • LENNART NEDERSTIGT, Econometric Institute, Erasmus University Rotterdam, P.O. Box 1738, 3000 DR Rotterdam, the Netherlands
  • DAMIR VANDIC, Econometric Institute, Erasmus University Rotterdam, P.O. Box 1738, 3000 DR Rotterdam, the Netherlands
  • FLAVIUS FRASINCAR, Econometric Institute, Erasmus University Rotterdam, P.O. Box 1738, 3000 DR Rotterdam, the Netherlands

Keywords:

Schema mapping, product taxonomies, lexical matching, word sense disambiguation

Abstract

Obtaining a useful and complete overview of Web-based product information has become difficult due to the ever-growing amount of information available in online shops. Findings from previous studies suggest that better search capabilities, such as the exploitation of annotated data, are needed to keep online shopping transparent for the user. Annotations can, for example, help present information from multiple sources in a uniform manner. To support the product data integration process, we propose an algorithm that can autonomously map heterogeneous product taxonomies from different online shops. The proposed approach uses word sense disambiguation techniques, approximate lexical matching, and a mechanism that deals with composite categories. Our algorithm's performance compared favorably against two other state-of-the-art taxonomy mapping algorithms on three real-life datasets. The results show that the F1-measure for our algorithm is on average 60% higher than that of a state-of-the-art product taxonomy mapping algorithm.
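To make two of the ingredients named in the abstract more concrete, the following is a minimal sketch of approximate lexical matching combined with a simple treatment of composite categories. It is not the authors' algorithm: the function names, the splitting rule (on commas, "&", and "and"), the normalized Levenshtein similarity, the averaging scheme, and the 0.8 threshold are all assumptions chosen for illustration, and word sense disambiguation is left out entirely.

```python
import re


def split_composite(category: str) -> list[str]:
    """Split a composite category name into lower-cased parts.

    Assumption: parts are separated by ',', '&', or the word 'and'.
    """
    parts = re.split(r"\s*(?:,|&|\band\b)\s*", category.lower())
    return [p for p in parts if p]


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def best_match(source: str, targets: list[str], threshold: float = 0.8):
    """Return (target, score) for the best-scoring target category.

    Each part of the source category is scored against its closest part
    in the target; the category score is the average over source parts.
    The target is None when no score reaches the (hypothetical) threshold.
    """
    src_parts = split_composite(source)
    best, best_score = None, 0.0
    for target in targets:
        tgt_parts = split_composite(target)
        if not src_parts or not tgt_parts:
            continue
        score = sum(max(similarity(s, t) for t in tgt_parts)
                    for s in src_parts) / len(src_parts)
        if score > best_score:
            best, best_score = target, score
    return (best, best_score) if best_score >= threshold else (None, best_score)
```

For example, `best_match("Music & Movies", ["Books", "Movies, Music & Games", "Garden"])` selects "Movies, Music & Games", because both parts of the source category have an exact counterpart in that target. Averaging over source parts is only one possible design; it rewards targets that cover every part of the source but does not penalize extra parts in the target.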

 




Published

2016-03-14


Section

Articles