ACOTA: A MULTILINGUAL AND SEMI-AUTOMATIC COLLABORATIVE TAGGING WEB-BASED APPROACH

Authors

  • CÉSAR LUIS ALVARGONZÁLEZ, WESO Research Group, University of Oviedo, Oviedo, Spain
  • JOSE MARÍA ÁLVAREZ-RODRÍGUEZ, Knowledge Reuse Group, Carlos III University of Madrid, Leganés, Spain
  • JOSE EMILIO LABRA GAYO, WESO Research Group, University of Oviedo, Oviedo, Spain
  • PATRICIA ORDOÑEZ DE PABLOS, WESO Research Group, University of Oviedo, Oviedo, Spain

Keywords:

Collaborative Tagging, Automatic Tagging, Multilingual data, Document Retrieval, Knowledge Management

Abstract

This paper introduces a multilingual hybrid methodology that automatically deploys and combines collaborative tagging techniques based on user behavior and recommendation algorithms. A reference web architecture called ACOTA (Automatic Collaborative Tagging) is also described to show the recommendation capabilities of this approach, with the aim of assisting users when multilingual resources must be tagged. Finally, a quantitative study in the context of corporate knowledge management is presented to evaluate the quality and accuracy of the methodology in minimizing the effort of multilingual document categorization.

 

Published

2014-03-31

How to Cite

ALVARGONZÁLEZ, C. L., ÁLVAREZ-RODRÍGUEZ, J. M., GAYO, J. E. L., & DE PABLOS, P. O. (2014). ACOTA: A MULTILINGUAL AND SEMI-AUTOMATIC COLLABORATIVE TAGGING WEB-BASED APPROACH. Journal of Web Engineering, 13(1-2), 160–180. Retrieved from https://journals.riverpublishers.com/index.php/JWE/article/view/3957

Section

Articles