On Modelling for Bias-Aware Sentiment Analysis and Its Impact in Twitter

  • Ahsan Mahmood Department of Computer Science, COMSATS University Islamabad, Attock Campus, Pakistan
  • Hikmat Ullah Khan Department of Computer Science, COMSATS University Islamabad, Wah Campus, Pakistan
  • Muhammad Ramzan Department of Computer Science and IT, University of Sargodha, Sargodha, Pakistan
Keywords: Social Media, Twitter, Sentiment Analysis, Bias, Data Mining, Opinion Mining


Sentiment Analysis (SA) has been an active research area for the last ten years. SA is the computational treatment of opinions, sentiments, and subjectivity in text. Twitter is one of the most widely used micro-blogging platforms and is considered an important source for sentiment computation and data analysis. Companies all over the world therefore analyze Twitter data using SA to extract knowledge with potential applications in diverse areas. Although SA is a successful way of finding people’s opinions, bias in tweets affects its results and produces inaccurate analysis that may mislead users into making erroneous decisions. Biased tweets are shared both by genuine but biased human users and by social bots in order to propagate biased opinions on certain topics. To counter this, this research study proposes a statistical model to identify the users and social bots who share biased content in the form of tweets on Twitter. For the experiments, we use an annotated Twitter dataset, discuss the results of SA with and without the biased tweets, and explore the effects of biased users at the micro level and the macro level. The empirical results show that the proposed approach is effective and correctly distinguishes biased users and bots from authentic users using sentiment analysis.



Author Biographies

Ahsan Mahmood, Department of Computer Science, COMSATS University Islamabad, Attock Campus, Pakistan

Ahsan Mahmood received the master’s degree in computer science from the COMSATS University, Attock campus, Pakistan. His research interests include Data Mining, Social Media Analysis, Sentiment Analysis and Machine Learning.

Hikmat Ullah Khan, Department of Computer Science, COMSATS University Islamabad, Wah Campus, Pakistan

Hikmat Ullah Khan received the master’s degree in computer science and the Ph.D. degree in computer science from International Islamic University, Islamabad. He has been an active researcher for the last ten years. He is currently an Assistant Professor with the Department of Computer Science, COMSATS University Islamabad, Wah Cantt, Pakistan. He has authored a number of research articles in top peer-reviewed journals and international conferences. His research interests include social Web mining, the semantic Web, data science, information retrieval, and scientometrics. He is an Editorial Board Member of a number of impact factor journals.

Muhammad Ramzan, Department of Computer Science and IT, University of Sargodha, Sargodha, Pakistan

Muhammad Ramzan is currently pursuing the Ph.D. degree with the University of Management and Technology, Lahore, Pakistan. He is currently a Lecturer with the University of Sargodha, Pakistan. He has authored several research articles published in reputed peer-reviewed journals. His areas of research include algorithms, machine learning, software engineering, and computer vision.

