POPULARITY-BASED RELEVANCE PROPAGATION

Authors

  • EHSAN MOUSAKAZEMI, Department of Electrical & Computer Engineering, Yazd University, Yazd, Iran
  • MEHDI AGHA SARRAM, Department of Electrical & Computer Engineering, Yazd University, Yazd, Iran
  • ALI MOHAMMAD ZAREH BIDOKI, Department of Electrical & Computer Engineering, Yazd University, Yazd, Iran

Keywords:

Information Retrieval, Ranking, Relevance Propagation, Popularity Measure

Abstract

Information resources on the World Wide Web (WWW) are growing rapidly and at an unpredictable rate. Under these circumstances, web search engines help users find useful information. Ranking the retrieved results is the main challenge for every search engine. Some ranking algorithms are based on content, such as BM25, and others on connectivity, such as PageRank. Because these algorithms have low precision when used alone for web ranking, combinational algorithms have been proposed. Recently, relevance propagation methods, among the most salient combinational algorithms, have attracted the attention of many information retrieval (IR) researchers. In these methods, content-based attributes are propagated from one page to another through the web graph. In this paper, we propose a generic method for exploiting the estimated popularity degree of pages (such as their PageRank score) to improve the propagation process. Experimental results on the TREC 2003 and 2004 data sets gathered in the Microsoft LETOR 3.0 benchmark collection show that this idea can improve the precision of the corresponding models without any additional online complexity.
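The idea of popularity-weighted propagation can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so the scheme below is an assumption made for illustration only: each page's content score is mixed with a popularity-weighted average of the content scores of the pages that link to it. The function name `propagate_relevance`, the mixing weight `alpha`, and the toy graph are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def propagate_relevance(
    relevance: Dict[str, float],
    in_links: Dict[str, List[str]],
    popularity: Dict[str, float],
    alpha: float = 0.8,
) -> Dict[str, float]:
    """One illustrative propagation step (an assumed scheme, not the paper's).

    relevance:  content-based score per page (for example, a BM25 score).
    in_links:   for each page, the pages that link to it.
    popularity: query-independent popularity estimate per page
                (for example, a PageRank score).
    alpha:      weight kept on the page's own content score.
    """
    updated: Dict[str, float] = {}
    for page, own_score in relevance.items():
        # Only consider linking pages that have a content score.
        sources = [s for s in in_links.get(page, []) if s in relevance]
        total_pop = sum(popularity.get(s, 0.0) for s in sources)
        if total_pop > 0:
            # Popular linking pages contribute more of their relevance.
            neighbor_score = sum(
                popularity.get(s, 0.0) * relevance[s] for s in sources
            ) / total_pop
        else:
            neighbor_score = 0.0
        updated[page] = alpha * own_score + (1 - alpha) * neighbor_score
    return updated


# Toy example: page "c" has no content match but is linked from "a" and "b".
scores = propagate_relevance(
    relevance={"a": 1.0, "b": 0.5, "c": 0.0},
    in_links={"c": ["a", "b"]},
    popularity={"a": 0.6, "b": 0.2, "c": 0.2},
)
```

In this toy graph, page "c" receives a nonzero score purely through its in-links, and the more popular page "a" dominates that contribution. Because popularity estimates such as PageRank do not depend on the query, they can be computed offline, which is consistent with the abstract's claim of no additional online complexity.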

 

Published

2012-05-31

How to Cite

MOUSAKAZEMI, E., SARRAM, M. A., & BIDOKI, A. M. Z. (2012). POPULARITY-BASED RELEVANCE PROPAGATION. Journal of Web Engineering, 11(4), 350–364. Retrieved from https://journals.riverpublishers.com/index.php/JWE/article/view/4205

Section

Articles