Document Type: Research Article

Authors

Department of Electrical and Computer Engineering, University of Qom, Qom, Iran.

Abstract

Smartphones are now ubiquitous, and the Android operating system has become especially popular. However, these devices face many security challenges, including malware, which threatens both the security and the privacy of users. The state-of-the-art approach to malware detection is based on deep learning; however, it requires substantial computing resources and leads to high battery usage, which is unacceptable on smartphones. This paper proposes a knowledge distillation approach to make Android malware detection lightweight. To this end, a heavy model is first trained, and its knowledge is then transferred through knowledge distillation to a light model, called the student. To simplify the learning process, soft labels are used. The resulting model, although slightly less accurate in detection, is much smaller than the heavy model. Moreover, ensemble learning is proposed to recover the lost accuracy. We tested the proposed approach on CISC datasets that include dynamic and static features, and the results show that the proposed method not only reduces the model size by up to 99% but also keeps the accuracy of the lightweight model on par with that of the heavy model.
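As a rough illustration of the soft-label idea described in the abstract, the following sketch shows how a teacher's outputs can be softened into training targets for a student, and how several students might be combined in an ensemble. It is a minimal sketch under stated assumptions, not the authors' implementation: the temperature value, the cross-entropy form of the loss, the probability-averaging ensemble, and all function names (`softmax`, `distillation_loss`, `ensemble_predict`) are illustrative choices that the abstract does not specify.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature yields softer labels."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """Cross-entropy between the teacher's soft labels and the student's
    softened predictions (assumed loss form; temperature is hypothetical)."""
    soft_labels = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(p * math.log(q) for p, q in zip(soft_labels, student_probs))


def ensemble_predict(logits_per_student):
    """Average the class probabilities of several students and return the
    index of the most probable class (assumed combination rule)."""
    probs = [softmax(logits) for logits in logits_per_student]
    n_classes = len(probs[0])
    avg = [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]
    return max(range(n_classes), key=avg.__getitem__)


# Hypothetical two-class logits: index 0 = malware, index 1 = benign.
teacher_logits = [4.0, -2.0]
print(softmax(teacher_logits, 1.0))   # near one-hot prediction
print(softmax(teacher_logits, 4.0))   # softened labels for the student
print(distillation_loss([3.5, -1.5], teacher_logits))  # student agrees
print(distillation_loss([-1.0, 2.0], teacher_logits))  # student disagrees
```

In this sketch, a student whose logits agree with the teacher receives a lower loss than one that disagrees, which is the signal a training loop would minimize; averaging the outputs of several small students is one simple way an ensemble could recover accuracy lost by a single student.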

Keywords
