POSTER: Analysis and Parsing of Unstructured Cyber-Security Incident Data

semanticscholar(2019)

Abstract
The latest threat intelligence platforms use structured protocols to share and analyze cyber-security data. However, most of this data is reported to the platform as unstructured text, such as social media posts, emails, and news articles, which then requires manual conversion to structured form. To bridge the gap between unstructured and structured data, we propose an information extraction (IE) system, based on natural language processing (NLP), that takes texts within the cyber-security domain and parses them into a structured format. Our approach targets the VERIS format and uses the VERIS Community Database as a source of unstructured texts (primarily news articles) and their structured counterparts (VERIS reports). We propose first to use a supervised machine learning (ML) classifier to discriminate between cyber-related and non-cyber-related texts, and then to use ML classifiers to decide which VERIS parameters are relevant to a given text. Next, we propose to use NLP and IE techniques to extract tuples of grammatically co-dependent words. Finally, these tuples will be passed to domain- and field-specific IE components that fill in the different fields of an output VERIS report.
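
The first two stages of the proposed pipeline (filtering cyber-related texts, then deciding which VERIS parameters are relevant) might be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name a classifier, so a small multinomial Naive Bayes model stands in for it; the training sentences are invented toy examples, not VERIS Community Database articles; and `NaiveBayes`, `relevant_fields`, and the other names are hypothetical. The later stages (tuple extraction and field filling) are omitted because they would need a dependency parser.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one smoothing (a stand-in classifier)."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> token frequencies
        self.doc_counts = Counter(labels)        # label -> number of documents
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for tok in tokenize(text):
                if tok in self.vocab:  # tokens never seen in training are ignored
                    score += math.log((counts[tok] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Invented toy data for stage 1: cyber-related vs. other texts.
CYBER_TRAIN = [
    ("Hackers stole customer passwords in a data breach", "cyber"),
    ("Ransomware malware encrypted hospital servers", "cyber"),
    ("Phishing emails exposed employee credentials", "cyber"),
    ("The city council approved a new park budget", "other"),
    ("Local team wins the championship game", "other"),
    ("Heavy rain is expected over the weekend", "other"),
]

# Invented toy data for stage 2: one yes/no classifier per VERIS parameter.
FIELD_TRAIN = {
    "action.malware": [
        ("Ransomware malware encrypted hospital servers", "yes"),
        ("Malware infected point of sale terminals", "yes"),
        ("Phishing emails exposed employee credentials", "no"),
        ("An employee lost an unencrypted laptop", "no"),
    ],
    "action.social": [
        ("Phishing emails exposed employee credentials", "yes"),
        ("Attackers tricked staff with a fake phone call", "yes"),
        ("Ransomware malware encrypted hospital servers", "no"),
        ("An employee lost an unencrypted laptop", "no"),
    ],
}


def train(pairs):
    texts, labels = zip(*pairs)
    return NaiveBayes().fit(texts, labels)


cyber_clf = train(CYBER_TRAIN)
field_clfs = {field: train(pairs) for field, pairs in FIELD_TRAIN.items()}


def relevant_fields(text):
    """Stage 1 filters non-cyber texts; stage 2 lists the fields judged relevant."""
    if cyber_clf.predict(text) != "cyber":
        return []
    return [field for field, clf in field_clfs.items() if clf.predict(text) == "yes"]


print(relevant_fields("Malware encrypted the servers after a breach"))
print(relevant_fields("The council approved the budget"))
```

Training one binary classifier per parameter keeps each relevance decision independent, so a text can be flagged for several VERIS parameters at once or for none.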