ACOTA: A MULTILINGUAL AND SEMI-AUTOMATIC COLLABORATIVE TAGGING WEB-BASED APPROACH
Keywords:
Collaborative Tagging, Automatic Tagging, Multilingual Data, Document Retrieval, Knowledge Management
Abstract
This paper introduces a multilingual hybrid methodology that automatically deploys and combines collaborative tagging techniques based on user behavior and recommendation algorithms. A reference web architecture called ACOTA (Automatic Collaborative Tagging) is also described to demonstrate the recommendation capabilities of this approach, with the aim of assisting users when multilingual resource tagging is required. Finally, a quantitative study in the context of corporate knowledge management is presented to evaluate how well and how accurately the methodology minimizes the effort of multilingual document categorization.