EXPLOITING EMOTICONS IN POLARITY CLASSIFICATION OF TEXT

Authors

  • ALEXANDER HOGENBOOM, Econometric Institute, Erasmus University Rotterdam, P.O. Box 1738, 3000 DR Rotterdam, the Netherlands
  • DANIELLA BAL, Econometric Institute, Erasmus University Rotterdam, P.O. Box 1738, 3000 DR Rotterdam, the Netherlands
  • FLAVIUS FRASINCAR, Econometric Institute, Erasmus University Rotterdam, P.O. Box 1738, 3000 DR Rotterdam, the Netherlands
  • MALISSA BAL, Econometric Institute, Erasmus University Rotterdam, P.O. Box 1738, 3000 DR Rotterdam, the Netherlands
  • FRANCISKA DE JONG, Erasmus School of History, Culture & Communication, Erasmus University Rotterdam, P.O. Box 1738, 3000 DR Rotterdam, the Netherlands; Department of Computer Science, Universiteit Twente, P.O. Box 217, 7500 AE Enschede, the Netherlands
  • UZAY KAYMAK, Department of Industrial Engineering & Innovation Sciences, Eindhoven University of Technology, P.O. Box 513, 5600 MB Eindhoven, the Netherlands

Keywords:

Sentiment analysis, polarity classification, emoticons, sentiment lexicon

Abstract

With people increasingly using emoticons in written text on the Web in order to express, stress, or disambiguate their sentiment, it is crucial for automated sentiment analysis tools to correctly account for such graphical cues for sentiment. We analyze how emoticons typically convey sentiment and we subsequently propose and evaluate a novel method for exploiting this with a manually created emoticon sentiment lexicon in a lexicon-based polarity classification method. We evaluate our approach on 2,080 Dutch tweets and forum messages, which all contain emoticons. We validate our findings on 10,069 English reviews of apps, some of which contain emoticons. We find that accounting for the sentiment conveyed by emoticons on a paragraph level (and, to a lesser extent, on a sentence level) significantly improves polarity classification performance. Whenever emoticons are used, their associated sentiment tends to dominate the sentiment conveyed by textual cues and forms a good proxy for the polarity of text.
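
The paragraph-level idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two toy lexicons, the function names, the whitespace tokenization, the averaging of scores, and the rule that a zero total counts as positive are all assumptions made for illustration. The sketch only shows the general principle that, when a paragraph contains emoticons, their sentiment overrides the sentiment of the words in that paragraph.

```python
import re

# Hypothetical toy lexicons (assumption); the paper's manually created
# emoticon sentiment lexicon is not reproduced here.
EMOTICON_LEXICON = {":)": 1.0, ":-)": 1.0, ":D": 1.0, ":(": -1.0, ":-(": -1.0}
WORD_LEXICON = {"good": 1.0, "great": 1.0, "love": 1.0,
                "bad": -1.0, "awful": -1.0, "hate": -1.0}


def split_paragraphs(text):
    """Split text on blank lines and drop empty paragraphs."""
    return [p for p in re.split(r"\n\s*\n", text) if p.strip()]


def paragraph_score(paragraph):
    """Score one paragraph; emoticons, if present, override the words."""
    tokens = paragraph.split()
    emoticon_scores = [EMOTICON_LEXICON[t] for t in tokens
                       if t in EMOTICON_LEXICON]
    if emoticon_scores:
        # Emoticon sentiment dominates the textual cues of this paragraph.
        return sum(emoticon_scores) / len(emoticon_scores)
    # No emoticons: fall back to the word lexicon.
    words = (t.strip(".,!?").lower() for t in tokens)
    word_scores = [WORD_LEXICON[w] for w in words if w in WORD_LEXICON]
    return sum(word_scores) / len(word_scores) if word_scores else 0.0


def classify(text):
    """Sum paragraph scores; a total of zero is treated as positive (assumption)."""
    total = sum(paragraph_score(p) for p in split_paragraphs(text))
    return "positive" if total >= 0 else "negative"


if __name__ == "__main__":
    print(classify("This app is awful :)"))
    print(classify("This app is awful."))
```

In this sketch the same sentence is scored differently with and without the emoticon, which mirrors the finding that emoticon sentiment tends to dominate textual cues. A sentence-level variant would apply the same override per sentence instead of per paragraph.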

 


Published

2015-03-02

Section

Articles