MAXLCA: A NEW QUERY SEMANTIC MODEL FOR XML KEYWORD SEARCH
Keywords:
Keyword Search, XML, MAXLCA, GMAX
Abstract
Keyword search lets web users access XML data without understanding complex data schemas. However, the ambiguity of keyword queries makes it difficult to select the data nodes that properly match the keywords. To address this challenge in XML datasets whose documents have a relatively small average size, we present a new keyword query semantic model: MAXimal Lowest Common Ancestor (MAXLCA). MAXLCA effectively avoids the false negative problem observed in ELCA, SLCA, and XSeek. Furthermore, we construct an algorithm, GMAX, for MAXLCA-based queries, which proves efficient in our evaluations. Experiments on INEX show that a search engine using MAXLCA and GMAX outperforms the compared approaches on all three comparison criteria: effectiveness, efficiency, and processing scalability.