On Modelling for Bias-Aware Sentiment Analysis and Its Impact in Twitter
Sentiment Analysis (SA) has been an active research area for the last ten years. SA is the computational treatment of opinions, sentiments, and subjectivity in text. Twitter is one of the most widely used micro-blogs and is considered an important source for sentiment computation and data analysis. Companies all over the world therefore analyze Twitter data using SA and extract knowledge that has potential applications in diverse areas. Although SA is a successful way of finding people’s opinions, bias in tweets affects the results of SA and yields inaccurate analysis that may mislead users into making erroneous decisions. Biased tweets are shared by valid but biased human users, as well as by social bots, to propagate biased opinions on certain topics. To counter this, this study proposes a statistical model to identify the users and social bots that share biased content in the form of tweets on Twitter. For the experiments, we use an annotated Twitter dataset, discuss the results of SA with and without the biased tweets, and explore the effects of biased users at the micro level and the macro level. The empirical results show that the proposed approach is effective and properly distinguishes biased users and bots from other authentic users using sentiment analysis.