EXPLOITING EMOTICONS IN POLARITY CLASSIFICATION OF TEXT
Keywords:
Sentiment analysis, polarity classification, emoticons, sentiment lexicon
Abstract
With people increasingly using emoticons in written text on the Web in order to express, stress, or disambiguate their sentiment, it is crucial for automated sentiment analysis tools to correctly account for such graphical cues for sentiment. We analyze how emoticons typically convey sentiment and we subsequently propose and evaluate a novel method for exploiting this with a manually created emoticon sentiment lexicon in a lexicon-based polarity classification method. We evaluate our approach on 2,080 Dutch tweets and forum messages, which all contain emoticons. We validate our findings on 10,069 English reviews of apps, some of which contain emoticons. We find that accounting for the sentiment conveyed by emoticons on a paragraph level (and, to a lesser extent, on a sentence level) significantly improves polarity classification performance. Whenever emoticons are used, their associated sentiment tends to dominate the sentiment conveyed by textual cues and forms a good proxy for the polarity of text.
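The paragraph-level idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the two toy lexicons, the function names `paragraph_score` and `classify`, the whitespace tokenization, the averaging of scores, and the tie-breaking rule are all assumptions made for illustration. The sketch only shows the core rule that, when a paragraph contains emoticons, their sentiment overrides the sentiment of the textual cues.

```python
# Hypothetical toy lexicons (assumptions); the paper's manually created
# emoticon sentiment lexicon is not reproduced here.
WORD_LEXICON = {"great": 1.0, "love": 1.0, "bad": -1.0, "terrible": -1.0}
EMOTICON_LEXICON = {":)": 1.0, ":-)": 1.0, ":D": 1.0, ":(": -1.0, ":-(": -1.0}


def paragraph_score(paragraph: str) -> float:
    """Score one paragraph; emoticons, if present, override the words."""
    tokens = paragraph.split()  # naive whitespace tokenization (assumption)
    emoticon_scores = [EMOTICON_LEXICON[t] for t in tokens if t in EMOTICON_LEXICON]
    if emoticon_scores:
        # Emoticon sentiment dominates the textual cues in this paragraph.
        return sum(emoticon_scores) / len(emoticon_scores)
    # No emoticons: fall back to the word lexicon, ignoring case and
    # trailing punctuation.
    words = (t.lower().strip(".,!?") for t in tokens)
    word_scores = [WORD_LEXICON[w] for w in words if w in WORD_LEXICON]
    return sum(word_scores) / len(word_scores) if word_scores else 0.0


def classify(text: str) -> str:
    """Binary polarity from the sum of paragraph scores (blank-line separated)."""
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    total = sum(paragraph_score(p) for p in paragraphs)
    # Ties are mapped to "positive" here; this is an arbitrary choice.
    return "positive" if total >= 0 else "negative"
```

For example, `classify("Great, this update is terrible :)")` ignores the mixed textual cues and follows the emoticon, whereas `classify("This is bad.")` falls back to the word lexicon. A sentence-level variant would apply the same override per sentence rather than per paragraph.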