Combining Long-Term Recurrent Convolutional and Graph Convolutional Networks to Detect Phishing Sites Using URL and HTML

IEEE ACCESS(2022)

Cited 13|Views12
No score
Abstract
Phishing, a well-known cyber-attack practice has gained significant research attention in the cyber-security domain for the last two decades due to its dynamic attacking strategies. Although different solutions have been exercised against phishing, phishing attacks have dramatically increased in the past few years. Recent studies have shown that machine learning has become prominent in the present anti-phishing context, and the techniques like deep learning have extensively improved anti-phishing tools' detection ability. This paper proposes PhishDet, a new way of detecting phishing websites through Long-term Recurrent Convolutional Network and Graph Convolutional Network using URL and HTML features. PhishDet is the first of its kind, which uses the powerful analysis and processing capabilities of Graph Neural Network in the anti-phishing domain and recorded 96.42% detection accuracy, with a 0.036 false-negative rate. It is effective against zero-day attacks, and the average detection time which is 1.8 seconds could also be considered realistic. The feature selection of PhishDet is automatic and occurs inside the system, as PhishDet gradually learns URLs and HTML content features to handle constantly changing phishing attacks. This has outperformed similar solutions by achieving a 99.53% f1-score with a public benchmark dataset. However, PhishDet requires periodic retraining to maintain its performance over time. If such retraining could be facilitated, PhishDet could fight against phishers for a more extended period to safeguard Internet users from this Internet threat.
More
Translated text
Key words
Phishing,Feature extraction,Neural networks,Internet security,Visualization,Web pages,Cyberttack,Graph neural networks,Cyberattack,deep learning,graph neural networks,internet security
AI Read Science
Must-Reading Tree
Example
Generate MRT to find the research sequence of this paper
Chat Paper
Summary is being generated by the instructions you defined