Click any tag below to further narrow down your results
Links
The article introduces CyberSOCEval, a set of open source benchmarks designed to evaluate Large Language Models (LLMs) in malware analysis and threat intelligence reasoning. It highlights the need for improved assessments of LLMs to better support cybersecurity efforts, especially as malicious actors leverage AI for attacks. The findings show that current models are underperforming in cybersecurity scenarios, indicating room for enhancement.