CYBERGENRE: AUTOMATIC IDENTIFICATION OF HOME PAGES ON THE WEB

Authors

  • MICHAEL SHEPHERD Dalhousie University, Halifax, Canada
  • CAROLYN WATTERS Dalhousie University, Halifax, Canada
  • ALISTAIR KENNEDY Dalhousie University, Halifax, Canada

Keywords:

Web systems, genre, cybergenre, web page genre

Abstract

The research reported in this paper is part of a larger project on the automatic classification of web pages by genre. The long-term goal is to incorporate web page genre into the search process to improve the quality of search results. In this phase, a neural net classifier was trained to distinguish home pages from non-home pages and to classify those home pages as personal, corporate, or organization home pages. To evaluate the importance of the functionality attribute of cybergenre in such classification, the web pages were characterized by the cybergenre attributes <content, form, functionality>, and the resulting classifications were compared with classifications in which the web pages were characterized by the genre attributes <content, form>. Results indicate that the classifier is able to distinguish home pages from non-home pages and, within the home page genre, to distinguish personal from corporate home pages. Organization home pages, however, were more difficult to distinguish from personal and corporate home pages. A significant improvement was found in identifying personal and corporate home pages when the functionality attribute was included.

 

Published

2004-08-12

How to Cite

SHEPHERD, M., WATTERS, C., & KENNEDY, A. (2004). CYBERGENRE: AUTOMATIC IDENTIFICATION OF HOME PAGES ON THE WEB. Journal of Web Engineering, 3(3-4), 236–251. Retrieved from https://journals.riverpublishers.com/index.php/JWE/article/view/4311

Section

Articles