Evaluation of LLM Chatbots for OSINT-based Cyber Threat Awareness
CoRR (2024)
Abstract
Knowledge sharing about emerging threats is crucial in the rapidly advancing
field of cybersecurity and forms the foundation of Cyber Threat Intelligence
(CTI). In this context, Large Language Models (LLMs) are becoming increasingly
significant, presenting a wide range of opportunities. This study evaluates
the performance of ChatGPT, GPT4all, Dolly,
Stanford Alpaca, Alpaca-LoRA, Falcon, and Vicuna chatbots in binary
classification and Named Entity Recognition (NER) tasks performed using Open
Source INTelligence (OSINT). We utilize well-established data collected in
previous research from Twitter to assess the competitiveness of these chatbots
when compared to specialized models trained for those tasks. In binary
classification experiments, the commercial chatbot GPT-4 achieved an
acceptable F1 score of 0.94, and the open-source GPT4all model achieved an F1
score of 0.90. However, for cybersecurity entity recognition, all
evaluated chatbots have limitations and are less effective. This study
demonstrates the capability of chatbots for OSINT binary classification and
shows that they require further improvement in NER to effectively replace
specially trained models. Our results shed light on the limitations of LLM
chatbots when compared to specialized models, and can help researchers improve
chatbot technology with the aim of reducing the effort required to
integrate machine learning into OSINT-based CTI tools.
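
For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how an F1 score for binary classification is computed from true and predicted labels. The function name `f1_score` and the label lists are hypothetical and chosen only for illustration; they are not the study's data or code.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1 score (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    if tp == 0:
        # No correct positive predictions: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = security-relevant tweet, 0 = not relevant.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

Because F1 combines precision and recall, a chatbot that labels every tweet as relevant would reach perfect recall but still score poorly when most tweets are not relevant.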