On Modelling for Bias-Aware Sentiment Analysis and Its Impact in Twitter
Sentiment analysis (SA) has been an active research area for the last ten years. SA is the computational treatment of opinions, sentiments, and subjectivity in text. Twitter is one of the most widely used micro-blogging platforms and is considered an important source of data for sentiment computation and analysis. Companies all over the world therefore analyze Twitter data using SA to extract knowledge with potential applications in diverse areas. Although SA is a successful way of discovering people’s opinions, bias in tweets affects its results and yields inaccurate analyses that may mislead users into making erroneous decisions. Biased tweets are shared both by genuine but biased human users and by social bots in order to propagate biased opinions on certain topics. To counter this, this study proposes a statistical model to identify the users and social bots that share biased content in the form of tweets on Twitter. For the experiments, we use an annotated Twitter dataset, discuss the results of SA with and without the biased tweets, and explore the effects of biased users at the micro level and the macro level. The empirical results show that the proposed approach is effective and correctly distinguishes biased users and bots from authentic users using sentiment analysis.
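The comparison of SA results with and without biased tweets can be illustrated with a minimal sketch. It is not the statistical model proposed in the study: the sample data, the function names, and the flagging rule (an extreme mean score over a minimum number of tweets) are all assumptions made for illustration. Per-user means stand in for the micro level and the overall mean for the macro level.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical annotated tweets: (user, sentiment score in [-1, 1]).
TWEETS = [
    ("alice", 0.5), ("alice", -0.5),
    ("bob", 0.0), ("bob", 0.5),
    ("promo_bot", 1.0), ("promo_bot", 1.0),
    ("promo_bot", 1.0), ("promo_bot", 1.0),
]


def scores_by_user(rows):
    """Group sentiment scores by user (micro-level view)."""
    grouped = defaultdict(list)
    for user, score in rows:
        grouped[user].append(score)
    return dict(grouped)


def flag_biased(rows, threshold=0.9, min_tweets=3):
    """Toy rule, an assumption for illustration only: flag users who post
    at least `min_tweets` tweets with an extreme mean sentiment."""
    return {
        user
        for user, scores in scores_by_user(rows).items()
        if len(scores) >= min_tweets and abs(mean(scores)) >= threshold
    }


def macro_sentiment(rows, excluded=frozenset()):
    """Mean sentiment over all tweets, skipping excluded users (macro level)."""
    return mean(score for user, score in rows if user not in excluded)


flagged = flag_biased(TWEETS)
macro_all = macro_sentiment(TWEETS)
macro_clean = macro_sentiment(TWEETS, excluded=flagged)

print("flagged users:", sorted(flagged))
print("macro sentiment with flagged users:", macro_all)
print("macro sentiment without flagged users:", macro_clean)
```

In this toy data a single prolific account moves the aggregate score noticeably, which is the kind of distortion the abstract describes.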