Identity – Attribute Inference in Online Social Network(s) Using Bio-Inspired Algorithms and Machine Learning Approaches
DOI:
https://doi.org/10.13052/jmm1550-4646.1932
Keywords:
Identity Inference, Bio-Inspired Algorithms, Homophily, Attribute Inference, Ensemble
Abstract
Twitter is one of the most popular social networking sites today, and it has become a critical tool for gathering data from individuals throughout the world. The platform hosts debates ranging from current events and news to entertainment, advertising, and technology. In contrast to earlier approaches, the proposed work combines direct stance detection (via tweets) with indirect stance detection (via homophily elements) to infer sensitive attributes. Along with attribute-based inference, the proposed work also matches user profiles across platforms using user-generated posts. Unlike prior efforts, usernames are excluded from the feature set because they reveal a user's identity too directly. Bio-inspired algorithms are used along with ensemble methods to select the best set of features.