Optimizing AI for Mobile Malware Detection by Self-Built-Dataset GAN Oversampling and LGBM

Ortal Dayan, Lior Wolf, Fang Wang, Yaniv Harel

CSR (2023)

Abstract

The cyber detection industry focuses on analyzing threat behavior in order to develop indicators of compromise (IOCs) and triggers. This process keeps detection permanently behind the attackers, because analysis time elapses between the launch of an attack tool and the ability to detect it. To address this challenge, a dedicated sandbox environment was built and thousands of mobile device samples were tested, resulting in an up-to-date training dataset that is not based on attack analysis. With this dataset, the research focused on optimizing the AI methodology to achieve the best detection rates for a compromised mobile device. A CupolaGAN was implemented to oversample the dataset, and the results of training LGBM models on the original imbalanced dataset and on the oversampled dataset were compared. Classification scores on the oversampled data increase by a maximum of 0.47 ± 0.37%. The model fine-tuned with Optuna on the balanced data reaches 99.36 ± 0.19% accuracy.

Keywords

malware detection, cybersecurity, Sandbox, CupolaGAN, LightGBM, oversampling
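
The abstract describes balancing an imbalanced dataset by oversampling before training a classifier. The sketch below illustrates only that balancing step, and it swaps in plain random duplication of minority-class rows as a stand-in for the GAN-based synthesis used in the paper; it does not generate new synthetic samples. The function name `oversample_to_balance` and the toy data are hypothetical and not taken from the paper.

```python
import random
from collections import Counter


def oversample_to_balance(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest class.

    Assumption: this is simple random oversampling, used here as a
    stand-in for GAN-based oversampling, which would synthesize new rows.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class

    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Original rows belonging to this class, used as the sampling pool.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: four benign samples (0), one compromised sample (1).
rows = [[0.1, 1.0], [0.2, 0.9], [0.3, 1.1], [0.4, 0.8], [5.0, 7.0]]
labels = [0, 0, 0, 0, 1]

balanced_rows, balanced_labels = oversample_to_balance(rows, labels)
print(Counter(balanced_labels))
```

In a comparison like the one the abstract reports, the same classifier would be trained once on `rows` and once on `balanced_rows`, with scores measured on held-out data that was not oversampled.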
