RANKING SEARCH RESULTS BY WEB QUALITY DIMENSIONS

Authors

  • JOSHUA C. C. PUN, Hong Kong University of Science and Technology
  • FREDERICK H. LOCHOVSKY, Hong Kong University of Science and Technology

Keywords:

Web data quality, web metrics, user appropriateness

Abstract

Search engines currently rank search results mainly by link-based metrics. Although most results are usually relevant to a user's query, the way they are ranked often leaves users less than fully satisfied. Analysis under a proposed framework of web data quality shows that current search engines usually consider only a very small number of the dimensions of web data quality in their ranking algorithms. This paper studies a newly identified web data-quality dimension, appropriateness, which is based on the linguistic and visual complexity of a web page. Appropriateness is computed using new metrics that classify web pages into three main appropriateness genres: scholarly, news/general interest, and popular. Experiments show that the metrics are effective in ranking web pages by whether they are appropriate to a user's task and information needs.

 


Published

2004-11-20

How to Cite

PUN, J. C. C., & LOCHOVSKY, F. H. (2004). RANKING SEARCH RESULTS BY WEB QUALITY DIMENSIONS. Journal of Web Engineering, 3(3-4), 216–235. Retrieved from https://journals.riverpublishers.com/index.php/JWE/article/view/4309

Section

Articles