Trigonometric words ranking model for spam message classification

Suha Mohammed Hadi, Ali Hakem Alsaeedi, Dhiah Al-Shammary, Zaid Abdi Alkareem Alyasseri, Mazin Abed Mohammed, Karrar Hameed Abdulkareem, Riyadh Rahef Nuiaa, Mustafa Musa Jaber

IET Networks (2022)


Abstract

The significant increase in the volume of fake (spam) messages has created an urgent need for a robust anti-spam method. Several current anti-spam systems rely mainly on the word order of a message to identify spam, so they fail to predict the correct message type when the word order changes. This paper proposes a new anti-spam filtering framework that does not depend on a word's position in the message, called the Trigonometric Words Ranking Model (TWRM). TWRM restricts spammers over the network by measuring a theta angle, which expresses the relationship between message weight and spam. TWRM classifies a message by calculating the rank of each of its words, and these ranks place the message in the correct class. The rank of a word is derived from its frequency in the entire data category. The proposed method is applied to three spam datasets: UCI spam email, Enron spam, and TREC spam data. The proposed model is shown to be more efficient than the Minhash and vector space models. Moreover, TWRM provided better retrieval time and defence, which is reflected in its accuracy of 99.64%, higher than that of Minhash (88.79%) and vector space (92.59%).
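
The abstract does not give the TWRM formulas, so the following is only an illustrative sketch of the general idea: rank words by their frequency within each class, sum the ranks of a message's words into a spam weight and a ham weight, and classify by the angle those two weights form. The function names, the relative-frequency ranking, the use of `atan2`, and the π/4 threshold are all assumptions made for illustration, not the authors' definitions.

```python
import math
from collections import Counter


def word_ranks(messages):
    """Assumed ranking: relative frequency of each word within one class."""
    counts = Counter(word for msg in messages for word in msg.lower().split())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def theta(message, spam_ranks, ham_ranks):
    """Angle in radians of the (ham_weight, spam_weight) vector.

    Each weight is the sum of the ranks of the message's words, so the
    result does not depend on the order of the words (up to rounding).
    """
    words = message.lower().split()
    spam_weight = sum(spam_ranks.get(w, 0.0) for w in words)
    ham_weight = sum(ham_ranks.get(w, 0.0) for w in words)
    return math.atan2(spam_weight, ham_weight)


def classify(message, spam_ranks, ham_ranks, threshold=math.pi / 4):
    """Label a message as spam when its angle exceeds the assumed threshold."""
    return "spam" if theta(message, spam_ranks, ham_ranks) > threshold else "ham"


# Toy training data, invented for this example only.
spam_train = ["win free prize now", "free cash prize claim now"]
ham_train = ["meeting moved to monday", "see you at lunch monday"]
spam_ranks = word_ranks(spam_train)
ham_ranks = word_ranks(ham_train)

print(classify("win free prize now", spam_ranks, ham_ranks))
print(classify("now prize free win", spam_ranks, ham_ranks))
print(classify("lunch meeting monday", spam_ranks, ham_ranks))
```

Because both weights are plain sums over the words of the message, shuffling the words leaves the angle unchanged, which is the order-independence property the abstract emphasises.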


Keywords

antispam filter, document handling, security and privacy information, telecommunication services, text mining
