6 min read | Saved February 14, 2026
Do you care about this?
Researchers assessed AI models' ability to exploit smart contracts, revealing the potential for significant financial harm. They developed a benchmark, SCONE-bench, which demonstrates AI's capacity to discover vulnerabilities and generate working exploits, and they emphasize the need for proactive defenses.
If you do, here's more
AI models are becoming adept at exploiting vulnerabilities in smart contracts, as shown by research from MATS and the Anthropic Fellows program. Their new benchmark, SCONE-bench, assesses AI agents' ability to exploit 405 smart contracts that were compromised between 2020 and 2025. Notably, on contracts exploited after their respective knowledge cutoffs, the models Claude Opus 4.5, Claude Sonnet 4.5, and GPT-5 collectively identified vulnerabilities that could have caused $4.6 million in economic damage. The research highlights the tangible risks AI poses in cyber operations and underscores the need for proactive defenses.
The study revealed some alarming numbers: across all benchmark problems, the 10 evaluated AI models produced successful exploits for 207 contracts, simulating a total of $550.1 million in stolen funds. Even more striking, performance held up on the subset of contracts exploited only after the models' knowledge cutoffs, where exploits yielded up to $4.6 million. The standout was Opus 4.5, which solved 13 of the 20 post-cutoff problems, suggesting that AI agents can effectively monetize previously unseen vulnerabilities.
In a separate evaluation, Sonnet 4.5 and GPT-5 were run against 2,849 recently deployed contracts; between them, the agents discovered two novel zero-day vulnerabilities and produced exploits valued at $3,694. This experiment demonstrates that autonomous exploitation is not only plausible but potentially profitable. The findings argue for shifting evaluation from traditional success rates to the financial impact of exploits, since the ultimate concern for stakeholders is actual monetary loss.
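To make concrete what "an exploitable smart contract" means here, below is a minimal Python sketch of a reentrancy bug, one of the classic contract flaws of the kind such benchmarks target. Everything in it (the vault, the attacker, the amounts) is illustrative and not taken from the paper, and it simulates the control-flow mistake in plain Python rather than real Solidity.

```python
# Toy reentrancy simulation: withdraw() pays out BEFORE zeroing the
# caller's recorded balance, so a malicious payout callback can
# re-enter withdraw() and drain the vault.

class Vault:
    def __init__(self):
        self.balances = {}   # recorded deposits per account
        self.eth = 0         # total funds actually held

    def deposit(self, who, amount):
        self.balances[who] = self.balances.get(who, 0) + amount
        self.eth += amount

    def withdraw(self, who, callback):
        amount = self.balances.get(who, 0)
        if amount and self.eth >= amount:
            self.eth -= amount        # funds leave first...
            callback(amount)          # ...control passes to the caller...
            self.balances[who] = 0    # ...state is zeroed too late

class Attacker:
    def __init__(self, vault):
        self.vault = vault
        self.stolen = 0

    def on_receive(self, amount):
        self.stolen += amount
        # Re-enter while our recorded balance is still nonzero.
        if self.vault.eth >= self.vault.balances.get("attacker", 0) > 0:
            self.vault.withdraw("attacker", self.on_receive)

    def run(self):
        self.vault.withdraw("attacker", self.on_receive)

vault = Vault()
vault.deposit("victim", 90)
vault.deposit("attacker", 10)
attacker = Attacker(vault)
attacker.run()
print(attacker.stolen)  # → 100: a 10-unit deposit drains all 100 units
```

The fix in real contracts is the checks-effects-interactions pattern: update the balance before transferring funds, which is exactly the kind of ordering flaw an automated agent can search for.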