Deep Neural Network Based Malware Detection Using Two Dimensional Binary Program Features

MALWARE, (2015): 11-20

Abstract

In this paper we introduce a deep neural network based malware detection system that Invincea has developed, which achieves a usable detection rate at an extremely low false positive rate and scales to real world training example volumes on commodity hardware. We show that our system achieves a 95% detection rate at 0.1% false positive ra...

Introduction
  • Malware continues to facilitate crime, espionage, and other unwanted activities on computer networks, as attackers use malware as a key tool in their campaigns.
  • A confluence of three developments has increased the possibility for success in machine-learning based approaches, holding the promise that these methods might achieve high detection rates at low false positive rates without the burden of human signature generation required by manual methods.
  • The first of these trends is the rise of commercial threat intelligence feeds that provide large volumes of new malware, meaning that for the first time, timely, labeled malware data are available to the security community.
  • The third trend is that machine learning as a discipline has evolved, meaning that researchers have more tools at their disposal to craft detection models that achieve breakthrough performance in terms of both accuracy and scalability.
Highlights
  • Malware continues to facilitate crime, espionage, and other unwanted activities on our computer networks, as attackers use malware as a key tool in their campaigns
  • A confluence of three developments has increased the possibility for success in machine-learning based approaches, holding the promise that these methods might achieve high detection rates at low false positive rates without the burden of human signature generation required by manual methods
  • The second trend is that computing power has become cheaper, meaning that researchers can more rapidly iterate on malware detection machine learning models and fit larger and more complex models to data
  • In this paper we introduce an approach that takes advantage of all three of these trends: a deployable deep neural network based malware detector using static features that gives what we believe to be the best reported accuracy results of any previously published detection engine
  • In this paper we introduced a deep learning based malware detection approach that achieves a detection rate of 95% and a false positive rate of 0.1% over an experimental dataset of over 400,000 software binaries
  • Similar to [?], we label any file against which 30% or more of the anti-virus engines alarm as malware, and any file that no anti-virus engine alarms on as benignware
  • We believe that the layered approach of deep neural networks and our two dimensional histogram features provide an implicit categorization of binary types, allowing us to directly train on all the binaries, without separating them based on internal features, like packer types, and so on
Methods
  • The classification framework, shown in Fig. 1, consists of three main components.
  • The first component extracts four different types of complementary features from the static benign and malicious binaries.
  • The second component is the deep neural network classifier which consists of an input layer, two hidden layers and an output layer.
  • The final component is the score calibrator, which translates the outputs of the neural network to a score that can be realistically interpreted as approximating the probability that the file is malware.
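The classifier stage described above (input layer, two hidden layers, output layer) can be sketched as a plain NumPy forward pass. This is only an illustration under stated assumptions: the layer widths, the ReLU activation, the sigmoid output, and the He-style random initialization are hypothetical choices, not details given in this summary, and the function names `init_params` and `forward` are made up for the example.

```python
import numpy as np

def init_params(n_in, n_hidden, seed=0):
    """Create weights for input -> hidden -> hidden -> 1 output (sizes are assumptions)."""
    rng = np.random.default_rng(seed)
    sizes = [n_in, n_hidden, n_hidden, 1]
    params = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        # He-style scaling is an assumed initialization, chosen for ReLU units.
        W = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))
        b = np.zeros(fan_out)
        params.append((W, b))
    return params

def forward(params, x):
    """Return one raw score in (0, 1) per row of x."""
    h = x
    for W, b in params[:-1]:
        h = np.maximum(0.0, h @ W + b)   # two hidden layers with ReLU (assumed)
    W, b = params[-1]
    logits = h @ W + b
    return 1.0 / (1.0 + np.exp(-logits[:, 0]))  # sigmoid output unit (assumed)
```

For example, `forward(init_params(1024, 1024), features)` would map a batch of 1024-dimensional feature vectors to uncalibrated scores, which a separate calibration step would then turn into probability estimates.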
Results
  • The authors used the in-house database of malicious and benign binaries to conduct a set of cross-validation experiments testing how well the system performs using the individual feature sets described above and the agglomeration of the feature sets described above.
  • The software uses the Keras v0.1.1 deep learning library to implement the neural network model described above.
  • Feature extraction is written mostly in Cython and Python, relying heavily on the SciPy and NumPy libraries, and each sample’s features are extracted by a single-threaded process.
  • The authors describe each of these evaluations in detail, starting with a description of the evaluation datasets and moving on to descriptions of the methodology and results
Conclusion
  • In this paper the authors introduced a deep learning based malware detection approach that achieves a detection rate of 95% and a false positive rate of 0.1% over an experimental dataset of over 400,000 software binaries.
  • Neural networks have several properties that make them good candidates for malware detection
  • They allow incremental learning: not only can they be trained in batches, but they can also be retrained efficiently as new training data is collected.
  • They allow labeled and unlabeled data to be combined, through pretraining of individual layers [19].
  • The classifiers are very compact, so prediction can be done very quickly using low amounts of memory
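The batch-wise, incremental training property mentioned above can be illustrated with a deliberately small stand-in model. The sketch below swaps in a single-layer logistic model for the deep network, so it shows only the update pattern (one gradient step per incoming batch), not the paper's classifier. The class name `OnlineLogisticModel`, the method names, and the learning rate are hypothetical.

```python
import numpy as np

class OnlineLogisticModel:
    """Logistic model updated one mini-batch at a time (stand-in for a deep net)."""

    def __init__(self, n_features, lr=0.1):
        self.w = np.zeros(n_features)
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, X):
        return 1.0 / (1.0 + np.exp(-(X @ self.w + self.b)))

    def partial_fit(self, X, y):
        """Apply one gradient step of the log-loss on a new batch (X, y)."""
        grad = self.predict_proba(X) - y          # derivative of log-loss w.r.t. the logit
        self.w -= self.lr * (X.T @ grad) / len(y)
        self.b -= self.lr * grad.mean()
        return self
```

Each call to `partial_fit` touches only the batch it is given, so newly collected samples can be folded into an existing model without revisiting the full training set.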
Tables
  • Table 1: Estimated TPR at 0.1% FPR and AUC, for the corresponding plots in Fig. 4
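To make the "TPR at 0.1% FPR" metric concrete, here is a minimal sketch of one way to compute it from classifier scores. It assumes higher scores mean "more likely malware" and picks the threshold from the sorted benign scores. The function name `tpr_at_fpr` and this particular thresholding rule are illustrative assumptions, not the evaluation code behind the table.

```python
import numpy as np

def tpr_at_fpr(scores, labels, target_fpr=0.001):
    """Return (TPR, threshold) such that the benign false positive rate <= target_fpr."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)        # True = malware, False = benign
    benign = np.sort(scores[~labels])[::-1]        # benign scores, highest first
    allowed = int(np.floor(target_fpr * benign.size))  # number of tolerated false positives
    # Flag only scores strictly above the (allowed + 1)-th highest benign score,
    # so at most `allowed` benign files are flagged even when scores tie.
    threshold = benign[allowed] if allowed < benign.size else -np.inf
    detected = scores[labels] > threshold
    return float(detected.mean()), float(threshold)
```

With the default `target_fpr=0.001`, at least 1,000 benign samples are needed before a single false positive is tolerated, which is one reason large benign datasets matter when reporting results at this operating point.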
Related work
  • Malware detection has evolved over the past several years, due to the growing threat posed by malware to large corporations and governmental agencies. Traditionally, the two major approaches to malware detection can be roughly split by how the malware is analyzed: static or dynamic analysis (see review [11]). In static analysis the malware file, or set of files, is either analyzed directly in binary form, or additionally unpacked and/or decompiled into an assembly representation. In dynamic analysis, the binary files are executed, and their actions are recorded through hooking or some access into the internals of the virtualization environment.

    In principle, dynamic detection can provide direct observation of malware action, is less vulnerable to obfuscation [28], and makes it harder to recycle existing malware. However, in practice, automated execution of software is difficult, since malware can detect if it is running in a sandbox, and prevent itself from performing malicious behavior. This resulted in an arms race between dynamic behavior detectors using a sandbox and malware [1, 13]. Further, in a significant number of cases, malware simply does not execute properly, due to missing dependencies or unexpected system configuration. These issues make it difficult to collect a large clean dataset of malware behavior.
Funding
  • We show that our system achieves a 95% detection rate at 0.1% false positive rate (FPR), based on more than 400,000 software binaries sourced directly from our customers and internal malware databases
  • Similar to [?], we label any file against which 30% or more of the anti-virus engines alarm as malware, and any file that no anti-virus engine alarms on as benignware
  • For the purposes of both training and accuracy evaluation we discard any file that more than 0% and fewer than 30% of VirusTotal’s anti-virus engines declare to be malware, given the uncertainty surrounding the nature of these files
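The labeling rule above (malware at 30% or more of engines, benignware at zero detections, everything in between discarded) can be written as a small function. This is a sketch of the stated thresholds only. The function name `label_from_detections` and its inputs are hypothetical, and no particular scan-report format is assumed.

```python
from typing import Optional

def label_from_detections(positives: int, total_engines: int) -> Optional[str]:
    """Map anti-virus detection counts to a training label, or None to discard."""
    if total_engines <= 0 or not 0 <= positives <= total_engines:
        raise ValueError("need 0 <= positives <= total_engines and total_engines > 0")
    if positives == 0:
        return "benignware"                    # no engine alarms
    # Integer comparison avoids floating-point edge cases at exactly 30%.
    if 10 * positives >= 3 * total_engines:
        return "malware"                       # 30% or more of engines alarm
    return None                                # between 0% and 30%: uncertain, discard
```

For instance, with 50 engines, 15 detections yields `"malware"`, 0 detections yields `"benignware"`, and 14 detections yields `None`.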
Study subjects and analysis
We slide a 1024-byte window over an input binary, with a step size of 256 bytes. For each window, we compute the base-2 entropy of the window and pair each individual byte occurrence in the window (1024 non-unique values) with this computed entropy value, storing the 1024 pairs in a list. Finally, we compute a two-dimensional histogram over the pair list, where the entropy axis has sixteen evenly sized bins over the range [0, 8] and the byte axis has sixteen evenly sized bins over the range [0, 255].
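The windowing and histogram steps described above can be sketched in NumPy as follows. The window size, step, bin counts, and ranges come from the text. The function name `byte_entropy_histogram`, the handling of inputs shorter than one window, and the choice to ignore a trailing partial window are assumptions made for the example.

```python
import numpy as np

def byte_entropy_histogram(data: bytes, window: int = 1024, step: int = 256) -> np.ndarray:
    """Return a 16x16 histogram of (window entropy, byte value) pairs."""
    buf = np.frombuffer(data, dtype=np.uint8)
    hist = np.zeros((16, 16), dtype=np.int64)
    if buf.size < window:
        return hist  # assumption: inputs shorter than one window give an empty histogram
    for start in range(0, buf.size - window + 1, step):
        chunk = buf[start:start + window]
        counts = np.bincount(chunk, minlength=256)
        probs = counts[counts > 0] / window
        entropy = float(-(probs * np.log2(probs)).sum())   # base-2 entropy, in [0, 8]
        # Pair every byte in the window with that window's entropy value.
        h, _, _ = np.histogram2d(
            np.full(window, entropy), chunk,
            bins=(16, 16), range=((0.0, 8.0), (0.0, 255.0)),
        )
        hist += h.astype(np.int64)
    return hist
```

Flattening the result with `hist.ravel()` gives a 256-value vector per file. Any normalization applied before feeding it to a classifier would be a further design choice not covered here.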

Reference
  • Anubis. https://anubis.iseclab.org/.
  • Kaggle: Microsoft malware classification challenge. https://www.kaggle.com/c/malwareclassification.
  • VirusTotal. https://www.virtualbox.org.
  • T. Abou-Assaleh, N. Cercone, V. Keselj, and R. Sweidan. N-gram-based detection of new malicious code. In Computer Software and Applications Conference, 2004. COMPSAC 2004. Proceedings of the 28th Annual International, volume 2, pages 41–42. IEEE, 2004.
  • M. Banko and E. Brill. Scaling to very very large corpora for natural language disambiguation. In Proceedings of the 39th Annual Meeting on Association for Computational Linguistics, pages 26–33. Association for Computational Linguistics, 2001.
  • U. Bayer, P. M. Comparetti, C. Hlauschek, C. Kruegel, and E. Kirda. Scalable, behavior-based malware clustering. In NDSS, volume 9, pages 8–11.
  • R. Benchea and D. T. Gavrilut. Combining restricted boltzmann machine and one side perceptron for malware detection. In Graph-Based Representation and Reasoning, pages 93–103.
  • Y. Bengio, Y. LeCun, et al. Scaling learning algorithms towards ai. Large-scale kernel machines, 34(5), 2007.
  • L. Breiman. Random forests. Machine learning, 45(1):5–32, 2001.
  • G. E. Dahl, J. W. Stokes, L. Deng, and D. Yu. Large-scale malware classification using random projections and neural networks. In Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on, pages 3422–3426. IEEE, 2013.
  • M. Egele, T. Scholte, E. Kirda, and C. Kruegel. A survey on automated dynamic malware-analysis techniques and tools. ACM Computing Surveys, 44(2):6, 2012.
  • V. A. Epanechnikov. Non-parametric estimation of a multivariate probability density. Theory of Probability & Its Applications, 14(1):153–158, 1969.
  • D. Fleck, A. Tokhtabayev, A. Alarif, A. Stavrou, and T. Nykodym. Pytrigger: A system to trigger & extract user-activated malware behavior. In Proceedings of the 2013 International Conference on Availability, Reliability and Security, pages 92–101. IEEE, 2013.
  • D. Fradkin and D. Madigan. Experiments with random projections for machine learning. In Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 517–522. ACM, 2003.
  • J. Friedman, T. Hastie, and R. Tibshirani. The elements of statistical learning, volume 1. Springer series in statistics Springer, Berlin, 2001.
  • E. Gandotra, D. Bansal, and S. Sofat. Malware analysis and classification: A survey. Journal of Information Security, 5(02):56, 2014.
  • X. Glorot and Y. Bengio. Understanding the difficulty of training deep feedforward neural networks. In International conference on artificial intelligence and statistics, pages 249–256, 2010.
  • K. He, X. Zhang, S. Ren, and J. Sun. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. arXiv preprint arXiv:1502.01852, 2015.
  • G. E. Hinton, S. Osindero, and Y.-W. Teh. A fast learning algorithm for deep belief nets. Neural computation, 18(7):1527–1554, 2006.
  • P. Indyk and R. Motwani. Approximate nearest neighbors: towards removing the curse of dimensionality. In Proceedings of the thirtieth annual ACM symposium on Theory of computing, pages 604–613. ACM, 1998.
  • J. Jang, D. Brumley, and S. Venkataraman. Bitshred: feature hashing malware for scalable triage and semantic analysis. In Proceedings of the 18th ACM conference on Computer and communications security, pages 309–320. ACM, 2011.
  • W. B. Johnson and J. Lindenstrauss. Extensions of lipschitz mappings into a hilbert space. Contemporary mathematics, 26(189-206):1, 1984.
  • J. O. Kephart, G. B. Sorkin, W. C. Arnold, D. M. Chess, G. J. Tesauro, S. R. White, and T. Watson. Biologically inspired defenses against computer viruses. In IJCAI (1), pages 985–996, 1995.
  • D. Kingma and J. Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
  • A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems, pages 1097–1105, 2012.
  • Y. LeCun, Y. Bengio, and G. Hinton. Deep learning. Nature, 521(7553):436–444, 2015.
  • A. L. Maas, A. Y. Hannun, and A. Y. Ng. Rectifier nonlinearities improve neural network acoustic models. In Proc. ICML, volume 30, 2013.
  • A. Moser, C. Kruegel, and E. Kirda. Limits of static analysis for malware detection. In Proceedings of the 23rd Computer Security Applications Conference, pages 421–430, 2007.
  • M. G. Schultz, E. Eskin, E. Zadok, and S. J. Stolfo. Data mining methods for detection of new malicious executables. In Security and Privacy, 2001. S&P 2001. Proceedings. 2001 IEEE Symposium on, pages 38–49. IEEE, 2001.
  • N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Dropout: A simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1):1929–1958, 2014.
  • M. Weber, M. Schmid, M. Schatz, and D. Geyer. A toolkit for detecting and analyzing malicious software. In Computer Security Applications Conference, 2002. Proceedings. 18th Annual, pages 423–4IEEE, 2002.
  • K. Weinberger, A. Dasgupta, J. Langford, A. Smola, and J. Attenberg. Feature hashing for large scale multitask learning. In Proceedings of the 26th Annual International Conference on Machine Learning, pages 1113–1120. ACM, 2009.
  • S. Wozniak, A.-D. Almasi, V. Cristea, Y. Leblebici, and T. Engbersen. Review of advances in neural networks: Neural design technology stack. In Proceedings of ELM2014 Volume 1, pages 367–376.