TY  - JOUR
ID  - 39153
TI  - A density based clustering approach to distinguish between web robot and human requests to a web server
JO  - The ISC International Journal of Information Security
JA  - ISECURE
LA  - en
SN  - 2008-2045
AU  - Zabihi, M.
AU  - Vafaei Jahan, M.
AU  - Hamidzadeh, J.
AD  - 
Y1  - 2014
PY  - 2014
VL  - 6
IS  - 1
SP  - 77
EP  - 89
KW  - Behavioral Patterns of Web Visitors
KW  - DBSCAN
KW  - Density Based Clustering
KW  - Significance of the Difference Test
KW  - Web Robots
DO  - 10.22042/isecure.2014.6.1.7
N2  - The world's growing dependence on the Internet and the emergence of Web 2.0 applications are significantly increasing the need for web robots that crawl sites to support services and technologies. Despite their advantages, robots may consume bandwidth and reduce the performance of web servers. Despite extensive research, there is no accurate method for classifying huge data sets of web visitors in a reasonable amount of time. Moreover, such a method should be insensitive to the ordering of instances and produce deterministic, accurate results. Therefore, this paper presents a density-based clustering approach using Density-Based Spatial Clustering of Applications with Noise (DBSCAN) to classify the web visitors of two real, large data sets. We propose two new features based on the behavioral patterns of visitors to describe them. In addition, we consider 12 common features and use the significance of the difference test (t-test) to reduce the dimensionality, which overcomes one of the disadvantages of DBSCAN. Based on supervised evaluation metrics, the proposed algorithm achieves a Jaccard metric of 95% and produces two clusters with entropy and purity of 0.024 and 0.97, respectively. Furthermore, in terms of clustering quality and accuracy, the proposed method performs better than state-of-the-art algorithms. Finally, it can be concluded that some known web robots are difficult to identify because they imitate human users.
UR  - https://www.isecure-journal.com/article_39153.html
L1  - https://www.isecure-journal.com/article_39153_9fb8bce2b6d70e0fbfa12a2801cc5fce.pdf
ER  - 