Towards Efficient Analysis for Malware in the Wild

Communications(2011)

引用 11|浏览11
暂无评分
摘要
We propose two novel techniques for reducing the workload for malware analysis. The first technique is restricted instruction, which accelerates finding the longest common subsequence (LCS) between machine code instruction sequences of malware. The second technique is probabilistic disassembly, which can find the most probable disassembly result of a binary stream without a clue, such as debug symbols or the information of import functions. By combining the two proposals and our generic unpacker, we built an automatic malware classification system. Given an unknown malware program, the system enables malware analysts to find the most similar known malware program to this unknown one, and even estimate different/common instructions. In one of our experiments, we classified 3,233 malware samples in the wild and concluded that 75% of the samples belong to the seven largest clusters. As a result, only seven samples, one from each cluster, were required to be analyzed in order to reveal the functionality of the rest of the 75%, showing a great increase in efficiency of analysis.
更多
查看译文
关键词
invasive software,program diagnostics,reverse engineering,automatic malware classification system,binary stream,debug symbol,import function,longest common subsequence,machine code instruction sequence,malware analysis,malware program,probabilistic disassembly
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要