Document Type: Research Article

Authors

1 Isfahan University of Technology, Isfahan, Iran

2 Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan, Iran.

Abstract

Intrusion detection systems are now widely used in networks as one of the essential means of detecting new attacks. These systems usually deal with large volumes of data and many features, so selecting proper features and reusing previously learned knowledge helps detect new attacks efficiently. This article proposes a new graph-based method for online feature selection to increase attack-detection accuracy. In the proposed method, irrelevant features are first removed using a limited number of input instances. The remaining features are then clustered based on graph theory to reduce the search space. As new instances arrive at each stage, new feature clusters are created that may differ from those created in the previous stage. To find appropriate clusters, the previous and new clusterings are combined, and relevant features with minimum redundancy are selected from the result. The evaluation results show that the proposed method achieves better instance-classification performance with a shorter run time than similar online feature selection methods. Compared with some offline methods, it is also faster while maintaining suitable classification accuracy.
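The abstract does not give implementation details, so the following is only an illustrative sketch of the general pipeline it outlines (filter irrelevant features, cluster the rest on a graph, combine the previous and new clusterings, pick one feature per cluster). Everything specific here is an assumption made for illustration and is not taken from the article: absolute Pearson correlation as the relevance and redundancy measure, the hypothetical thresholds `rel_min` and `red_min`, connected components as clusters, and intersection of clusters as the combination rule.

```python
from itertools import combinations
from statistics import mean


def abs_corr(x, y):
    """Absolute Pearson correlation of two equal-length sequences (0 if constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return abs(sxy) / (sxx * syy) ** 0.5 if sxx and syy else 0.0


def cluster_features(columns, labels, rel_min=0.2, red_min=0.8):
    """Drop weakly relevant features, then cluster the rest as graph components.

    columns maps a feature name to its list of values for the current batch.
    rel_min and red_min are assumed thresholds, chosen only for this sketch.
    """
    relevance = {f: abs_corr(col, labels) for f, col in columns.items()}
    kept = [f for f in columns if relevance[f] >= rel_min]
    # Undirected graph: an edge joins two features that are highly correlated.
    adj = {f: set() for f in kept}
    for f, g in combinations(kept, 2):
        if abs_corr(columns[f], columns[g]) >= red_min:
            adj[f].add(g)
            adj[g].add(f)
    # Connected components of the graph serve as the feature clusters.
    clusters, seen = [], set()
    for f in kept:
        if f in seen:
            continue
        stack, comp = [f], set()
        while stack:
            node = stack.pop()
            if node not in comp:
                comp.add(node)
                stack.extend(adj[node] - comp)
        seen |= comp
        clusters.append(frozenset(comp))
    return clusters, relevance


def combine_clusterings(old, new):
    """Keep two features together only if both clusterings group them together.

    Features that appear in only one of the two clusterings are dropped.
    """
    return [a & b for a in old for b in new if a & b]


def select_features(clusters, relevance):
    """Pick the most relevant feature from each cluster (ties broken by name)."""
    return sorted(max(c, key=lambda f: (relevance[f], f)) for c in clusters)
```

In use, `cluster_features` would be called once per batch of arriving instances, `combine_clusterings` would merge the result with the clusters kept from the previous batch, and `select_features` would return one representative per merged cluster. The intersection rule is the simplest possible consensus; a co-association or voting scheme would be an equally plausible reading of "combined".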
