Structural Analysis of URL for Malicious URL Detection Using Machine Learning

A. Saleem Raja, S. Peerbasha, Y. Mohammed Iqbal, B. Sundarvadivazhagan, M. Mohamed Surputheen

Journal of advanced applied scientific research (2023)

Abstract

Malicious websites are created deliberately to help online criminals carry out illicit activities, such as installing malware on a victim's computer, stealing private data from the victim's system, and exposing the victim online. Malicious code can also be found on legitimate websites. Locating such websites in cyberspace is therefore difficult and demands an automated detection tool. Machine learning and deep learning techniques are currently used to detect malicious websites, but the problem persists because the attack vector changes constantly. Most research solutions use a limited number of URL lexical features, DNS information, global ranking information, and webpage content features. Combining several derived features costs computation time and introduces security risk. Moreover, a minimal feature set does not exploit the full potential of the dataset. This paper addresses the problem using URLs exclusively, blending linguistic and vectorized URL features so that the full potential of the URL is exploited through vectorization. Six machine learning algorithms are examined. The results indicate that the proposed approach performs best when the count vectorizer is combined with the random forest algorithm.

Key words

Malicious Link, Phishing, Natural Language Processing, Machine learning, ngram, Random Forest, Lexical features of URL, TFIDF vectorizer, Count vectorizer, Hashing vectorizer
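
The abstract describes blending hand-crafted URL features with a vectorized form of the URL string. A minimal sketch of that idea follows, using only the Python standard library. It is not the authors' implementation: the abstract does not list the actual features, so the feature names, the character trigram size, and the helper functions (`lexical_features`, `char_ngrams`, `build_vocabulary`, `vectorize`) are all assumptions made for illustration, and the n-gram counting is a simplified stand-in for a library count vectorizer. The classification step (for example a random forest) is left out.

```python
from collections import Counter
from urllib.parse import urlparse


def lexical_features(url: str) -> dict:
    """Hand-picked lexical features (illustrative choice, not from the paper)."""
    parsed = urlparse(url)
    host = parsed.netloc
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots": url.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "num_hyphens": url.count("-"),
        "has_at_symbol": int("@" in url),
        "uses_https": int(parsed.scheme == "https"),
    }


def char_ngrams(url: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams of the lowercased URL."""
    text = url.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def build_vocabulary(urls, n: int = 3) -> dict:
    """Map every n-gram seen in the training URLs to a column index."""
    grams = sorted({g for u in urls for g in char_ngrams(u, n)})
    return {g: i for i, g in enumerate(grams)}


def vectorize(url: str, vocabulary: dict, n: int = 3) -> list:
    """Concatenate lexical features with n-gram counts into one flat vector."""
    ngram_part = [0] * len(vocabulary)
    for gram, count in char_ngrams(url, n).items():
        idx = vocabulary.get(gram)
        if idx is not None:  # n-grams not in the vocabulary are ignored
            ngram_part[idx] = count
    return list(lexical_features(url).values()) + ngram_part


# Placeholder URLs on reserved example domains and addresses.
urls = [
    "https://example.com/index.html",
    "http://192.0.2.10/login-update@secure.example.net",
]
vocab = build_vocabulary(urls)
vectors = [vectorize(u, vocab) for u in urls]
```

Each resulting vector has the same length (the number of lexical features plus the vocabulary size), so the rows can be stacked into a feature matrix and passed to any classifier. In practice a sparse representation would be preferable, since most n-gram columns are zero for any single URL.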
