Detecting Fake Accounts Through Generative Adversarial Network in Online Social Media

Document Type: Research Article

Authors

1 Department of Computer Engineering, Islamic Azad University, Shiraz Branch, Shiraz, Iran

2 Department of Computer Engineering, Ramh.C., Islamic Azad University, Ramhormoz, Iran

3 Institute for Medical Informatics and Statistics, University of Kiel, Kiel, Germany

4 Department of Computer Sciences, Faculty of Mathematics and Computer, Amirkabir University of Technology, Tehran, Iran

Abstract
Online social media is integral to daily life, supporting messaging, information sharing, and confidential communication while preserving privacy. Platforms such as Twitter, Instagram, and Facebook exemplify this. However, users are exposed to network anomalies that often stem from malicious activity, such as identity theft for financial gain or to cause harm. This paper proposes a novel method that combines user similarity measures with a Generative Adversarial Network (GAN) to identify anomalous (fake) accounts in a large-scale social network while handling imbalanced data. Despite the problem's complexity, the method achieves an AUC of 80% in classifying and detecting fake accounts. The study builds on previous research and highlights advances in anomaly detection for online social networks. Its findings contribute to fake account detection and offer a promising approach to securing online spaces against fraudulent activity.
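
As a rough illustration of two building blocks named in the abstract, the sketch below computes one common user similarity measure (the Jaccard coefficient over neighbor sets) and the AUC of a set of anomaly scores. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the toy neighbor sets, and the toy scores are hypothetical, the choice of Jaccard as the similarity measure is an assumption, and the GAN itself is not shown.

```python
from typing import List, Set


def jaccard_similarity(a: Set[int], b: Set[int]) -> float:
    """Jaccard coefficient of two neighbor sets: |A & B| / |A | B|."""
    union = a | b
    if not union:
        # Two users with no neighbors at all: define similarity as 0.
        return 0.0
    return len(a & b) / len(union)


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """AUC as the probability that a randomly chosen positive (fake)
    account receives a higher score than a randomly chosen negative
    (real) account. Ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: neighbor sets of two users.
similarity = jaccard_similarity({2, 3, 4}, {3, 4, 5})

# Hypothetical toy data: 1 = fake account, 0 = real account,
# with made-up anomaly scores from some detector.
toy_labels = [1, 1, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(toy_labels, toy_scores)

print(f"Jaccard similarity: {similarity:.2f}")
print(f"AUC: {auc:.3f}")
```

The pairwise AUC above is quadratic in the number of accounts, which is fine for a toy example. On a large, imbalanced dataset a rank-based implementation from an established library would be the more practical choice.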

Keywords

