Evaluating Annotated Dataset of Customer Reviews for Aspect Based Sentiment Analysis


  • Dimple Chehal Department of Computer Engineering, J.C. Bose University of Science and Technology, YMCA, Faridabad, India https://orcid.org/0000-0001-5508-1803
  • Dr. Parul Gupta Department of Computer Engineering, J.C. Bose University of Science and Technology, YMCA, Faridabad, India
  • Dr. Payal Gulati Department of Computer Engineering, J.C. Bose University of Science and Technology, YMCA, Faridabad, India




Aspect-based sentiment analysis; annotated dataset; machine learning; deep learning; e-commerce reviews; questionnaire


Sentiment analysis of product reviews on e-commerce platforms helps determine customer preferences. Aspect-based sentiment analysis (ABSA) identifies the aspects mentioned in a review and the polarity expressed toward each, allowing a more detailed analysis of a customer’s inclination toward individual product aspects. This analysis supports the transition from the traditional rating-based recommendation process to an improved aspect-based process. Automating ABSA requires a labelled dataset to train a supervised machine learning model. Because such datasets are scarce, owing to the human effort that annotation involves, an annotated dataset is provided here for performing ABSA on customer reviews of mobile phones. The dataset, comprising product reviews of the Apple iPhone 11, has been manually annotated with predefined aspect categories and aspect sentiments. The dataset has been validated using state-of-the-art machine learning techniques such as Naïve Bayes, Support Vector Machine, Logistic Regression, Random Forest, K-Nearest Neighbor, and a Multi-Layer Perceptron (MLP) built as a sequential model with the Keras API. For classifying review text into aspect categories, the MLP built with the Keras Sequential API was the most accurate at 67.45 percent, while K-Nearest Neighbor performed worst at 49.92 percent. For classifying review text into aspect sentiments, the Support Vector Machine had the highest accuracy at 79.46 percent, and the model built with the Keras API had the lowest at 76.30 percent. The dataset is intended to serve as a benchmark for ABSA of mobile phone reviews.
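
The two classification tasks described above (review text to aspect category, and review text to aspect sentiment) can be illustrated with a minimal sketch. The example below is not the authors' implementation: it is a small multinomial Naïve Bayes classifier written from scratch in plain Python, and the six training reviews, the label names, and the helper names (`TRAIN`, `NaiveBayes`, `accuracy`) are all hypothetical and do not come from the published dataset.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy rows: (review text, aspect category, aspect sentiment).
# Illustrative only; NOT taken from the published dataset.
TRAIN = [
    ("battery lasts all day", "battery", "positive"),
    ("battery drains very fast", "battery", "negative"),
    ("camera takes sharp photos", "camera", "positive"),
    ("camera photos are blurry", "camera", "negative"),
    ("screen display is bright", "display", "positive"),
    ("screen display scratches fast", "display", "negative"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model, texts, labels):
    """Fraction of texts whose predicted label matches the annotation."""
    hits = sum(model.predict(t) == y for t, y in zip(texts, labels))
    return hits / len(labels)


texts = [t for t, _, _ in TRAIN]
# One model per task: aspect category and aspect sentiment.
aspect_model = NaiveBayes().fit(texts, [a for _, a, _ in TRAIN])
sentiment_model = NaiveBayes().fit(texts, [s for _, _, s in TRAIN])

review = "the battery drains fast"
print(aspect_model.predict(review), sentiment_model.predict(review))
# prints: battery negative
```

Training one model per label column mirrors the separation between aspect-category and aspect-sentiment classification in the abstract. Accuracy figures comparable to those reported would require the real annotated reviews and a held-out test split, neither of which this sketch includes.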



Author Biographies

Dimple Chehal, Department of Computer Engineering, J.C. Bose University of Science and Technology, YMCA, Faridabad, India

Dimple Chehal has been a Ph.D. student at J.C. Bose University of Science & Technology, YMCA, Faridabad, India since August 2017. She is a Senior Research Fellow (SRF) under the University Grants Commission (UGC) Ph.D. fellowship scheme. She previously held the position of Systems Engineer at Tata Consultancy Services Private Ltd, India. Her Ph.D. work centers on review-based recommender systems in the e-commerce domain, and her current research interests include Data Mining, Natural Language Processing and Machine Learning.

Dr. Parul Gupta, Department of Computer Engineering, J.C. Bose University of Science and Technology, YMCA, Faridabad, India

Parul Gupta is an Associate Professor in the Department of Computer Engineering, J.C. Bose University of Science & Technology, YMCA, Faridabad, India. She has over 17 years of teaching experience. She received her Ph.D. from MDU Rohtak in 2013. She has published more than 25 research papers in reputed international journals and conferences. Her research interests include Data Mining, Information Retrieval, Databases and Sustainable Smart Cities.

Dr. Payal Gulati, Department of Computer Engineering, J.C. Bose University of Science and Technology, YMCA, Faridabad, India

Payal Gulati is an Assistant Professor in the Department of Computer Engineering, J.C. Bose University of Science & Technology, YMCA, Faridabad, India. She received her Ph.D. in 2013 from Maharishi Dayanand University, Rohtak, India and has over 14 years of experience. She has contributed more than 30 papers to reputed journals and conferences. She is also a reviewer for Springer and Oxford journals. Her research interests include Data Mining, Information Retrieval, Predictive Analysis, Energy Research and Sustainable Smart Cities.

