Hybrid Top Features Extraction Model for Detecting X Rumor Events Using an Ensemble Method
DOI:
https://doi.org/10.13052/jwe1540-9589.2414
Keywords:
Deep learning, ensemble, machine learning, RFC, RU, SMOTE, rumor detection, natural language processing (NLP)
Abstract
This paper describes a novel hybrid ensemble algorithm (HEA) that combines ensemble learning, class imbalance handling, and feature extraction. To address class imbalance in the dataset, the proposed approach integrates SMOTE oversampling and random undersampling (RU) with feature extraction. First, Pearson correlation analysis is used to detect highly correlated features in the dataset. This analysis aids in selecting the most relevant features, namely those that are strongly related to the target variable or strongly associated with other features. The method seeks to improve classification performance by focusing on these correlated features. Next, the SMOTE and RU algorithms are used to balance the majority and minority classes. SMOTE (synthetic minority oversampling technique) generates synthetic instances of the minority class by interpolating between existing instances, which increases minority class representation. RU, on the other hand, removes instances from the majority class at random to obtain a balanced distribution. Finally, the top features identified by a random forest classifier (RFC) model are fed into an ensemble of decision tree (DT), k-nearest neighbor (KNN), adaptive boosting (AdaBoost), and convolutional neural network (CNN) models. This ensemble combines the predictions of multiple models, exploiting their individual strengths and capturing varied patterns in the data. DT, KNN, AdaBoost, and CNN are widely used machine learning algorithms, known for their capacity to handle many types of data and to capture complex relationships. The evaluation results show that the proposed HEA approach is effective, with a maximum precision, recall, F-score, and accuracy of 90%. These encouraging results indicate that the proposed methodology is applicable to a variety of classification problems.
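The balancing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `smote_oversample` and `random_undersample`, the neighbour count `k`, the toy data, and the target class size of 5 are all assumptions chosen for illustration, and only the Python standard library is used.

```python
import random

def smote_oversample(minority, n_new, k=3, rng=None):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    if rng is None:
        rng = random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

def random_undersample(majority, n_keep, rng=None):
    """Keep n_keep majority samples chosen uniformly at random (RU)."""
    if rng is None:
        rng = random.Random(0)
    return rng.sample(majority, n_keep)

# Hypothetical toy data: 8 majority samples and 3 minority samples, 2 features each.
majority = [[float(i), float(i % 3)] for i in range(8)]
minority = [[10.0, 10.0], [11.0, 12.0], [12.0, 10.5]]

target = 5  # assumed class size after balancing
synthetic = smote_oversample(minority, target - len(minority))
balanced_minority = minority + synthetic
balanced_majority = random_undersample(majority, target)

print(len(balanced_majority), len(balanced_minority))
```

Because each synthetic point lies on a line segment between two real minority samples, it stays inside the region the minority class already occupies, while undersampling only ever discards real majority samples. In practice a maintained library implementation would normally be preferred over hand-written code like this.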

