Outside the Closed World: On Using Machine Learning for Network Intrusion Detection

IEEE Symposium on Security and Privacy, pp.305-316, (2010)

Cited by 1360 | Indexed in EI, WOS
Abstract

In network intrusion detection research, one popular strategy for finding attacks is monitoring a network's activity for anomalies: deviations from profiles of normality previously learned from benign traffic, typically identified using tools borrowed from the machine learning community. However, despite extensive academic research, one finds a striking gap in terms of actual deployments of such systems: compared with other intrusion detection approaches, machine learning is rarely employed operationally.

Introduction
  • Network intrusion detection systems (NIDS) are broadly classified by the style of detection they use: systems relying on misuse detection monitor activity with precise descriptions of known malicious behavior, while anomaly-detection systems have a notion of normal activity and flag deviations from that profile. Both approaches have been extensively studied by the research community for many years.
  • In terms of actual deployments, the authors observe a striking imbalance: in operational settings, of these two main classes they find almost exclusively misuse detectors in use, most commonly signature systems that scan network traffic for characteristic byte sequences.
  • This situation is somewhat striking when considering the success that machine learning, which frequently forms the basis for anomaly detection, sees in many other areas of computer science, where it often results in large-scale commercial deployments.
  • The strength of machine-learning tools is finding activity that is similar to something previously seen, without the need to precisely describe that activity up front.
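The contrast between the two detection styles can be sketched in a few lines. This is a minimal illustration, not the paper's method: the byte-sequence signatures, packet sizes, and deviation threshold below are all assumed for the example.

```python
# Illustrative sketch: misuse detection matches known-bad patterns,
# while anomaly detection learns a profile of "normal" and flags deviations.

SIGNATURES = [b"\x90\x90\x90\x90", b"/etc/passwd"]  # hypothetical known-bad byte sequences

def misuse_detect(payload: bytes) -> bool:
    """Misuse detection: flag traffic containing a known-malicious signature."""
    return any(sig in payload for sig in SIGNATURES)

def build_profile(benign_sizes):
    """Anomaly detection, training step: learn mean/std-dev of benign packet sizes."""
    mean = sum(benign_sizes) / len(benign_sizes)
    var = sum((s - mean) ** 2 for s in benign_sizes) / len(benign_sizes)
    return mean, var ** 0.5

def anomaly_detect(size, profile, k=3.0):
    """Flag any packet whose size deviates more than k std-devs from the profile."""
    mean, std = profile
    return abs(size - mean) > k * std

profile = build_profile([100, 110, 95, 105, 102, 98])
print(misuse_detect(b"GET /etc/passwd HTTP/1.0"))  # True: matches a signature
print(anomaly_detect(5000, profile))               # True: far outside learned normality
```

Note how the misuse detector needs an up-front description of each attack, whereas the anomaly detector needs only benign training data; this is exactly the trade-off the bullets above describe.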
Highlights
  • Our discussion in this paper aims to develop a different general point: that much of the difficulty with anomaly detection systems stems from using tools borrowed from the machine learning community in inappropriate ways.
  • Our work examines the surprising imbalance between the extensive amount of research on machine learning-based anomaly detection pursued in the academic intrusion detection community, versus the lack of operational deployments of such systems.
  • The domain-specific challenges include: (i) the need for outlier detection, while machine learning instead performs better at finding similarities; (ii) very high costs of classification errors, which render error rates as encountered in other domains unrealistic; (iii) a semantic gap between detection results and their operational interpretation; (iv) the enormous variability of benign traffic, making it difficult to find stable notions of normality; (v) significant challenges with performing sound evaluation; and (vi) the need to operate in an adversarial setting.
  • While none of these render machine learning an inappropriate tool for intrusion detection, we deem their unfortunate combination in this domain as a primary reason for its lack of success
  • We provide a set of guidelines for applying machine learning to network intrusion detection
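The point about the cost of classification errors can be made concrete with a small Bayes'-rule calculation, the classic base-rate fallacy. The prevalence and error rates below are assumed purely for illustration: with one attack per 10,000 flows, even a detector with a 99% detection rate and a 1% false-alarm rate yields roughly 100 false alarms per true detection.

```python
# Illustrative base-rate calculation (assumed numbers): why error rates
# acceptable in other machine-learning domains are unrealistic for
# network intrusion detection.

def alarm_precision(prevalence, tpr, fpr):
    """P(attack | alarm) via Bayes' rule.

    prevalence: fraction of traffic that is actually malicious
    tpr: true-positive (detection) rate
    fpr: false-positive (false-alarm) rate
    """
    true_alarms = prevalence * tpr
    false_alarms = (1 - prevalence) * fpr
    return true_alarms / (true_alarms + false_alarms)

p = alarm_precision(prevalence=1e-4, tpr=0.99, fpr=0.01)
print(f"{p:.4f}")  # 0.0098: fewer than 1% of alarms are real attacks
```

Because operators must investigate every alarm, this arithmetic is what makes "very high costs of classification errors" a domain-specific obstacle rather than a tuning problem.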
Conclusion
  • The authors' work examines the surprising imbalance between the extensive amount of research on machine learning-based anomaly detection pursued in the academic intrusion detection community, versus the lack of operational deployments of such systems.
  • The domain-specific challenges include: (i) the need for outlier detection, while machine learning instead performs better at finding similarities; (ii) very high costs of classification errors, which render error rates as encountered in other domains unrealistic; (iii) a semantic gap between detection results and their operational interpretation; (iv) the enormous variability of benign traffic, making it difficult to find stable notions of normality; (v) significant challenges with performing sound evaluation; and (vi) the need to operate in an adversarial setting.
  • While none of these render machine learning an inappropriate tool for intrusion detection, the authors deem their unfortunate combination in this domain as a primary reason for its lack of success.
  • Without a semantic understanding of the gain, such results do not contribute to the progress of the field.
Funding
  • This work was supported in part by NSF Awards NSF-0433702 and CNS-0905631
  • This work was also supported by the Director, Office of Science, Office of Advanced Scientific Computing Research, of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.