Quit Emailing Yourself

On the Coming Industrialisation of Exploit Generation with LLMs

7 min read | Saved February 14, 2026

exploits, cybersecurity, llms, automation, vulnerabilities

Do you care about this?

The article describes experiments in which agents built on Opus 4.5 and GPT-5.2 generated exploits for a zero-day vulnerability in QuickJS. It concludes that offensive cybersecurity capability may come to depend on token throughput rather than on the number of human hackers, since LLMs proved effective at exploit development.

If you do, here's more

The author ran an experiment in which agents built on Opus 4.5 and GPT-5.2 generated exploits for a zero-day vulnerability in the QuickJS JavaScript interpreter. The agents produced over 40 distinct exploits across various scenarios, and GPT-5.2 solved every challenge it was given. The scenarios imposed constraints such as modern exploit mitigations, and the agents had to build up their capabilities through source code analysis and debugging. Both agents solved most challenges quickly and at relatively low cost. GPT-5.2 completed the most difficult task in about three hours, using roughly 50 million tokens at a cost of around $50.
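The cost figure for the hardest task implies a blended token price that can be checked with simple arithmetic. The sketch below uses only the two approximate numbers quoted above (50 million tokens, about $50). The function names and the $1,000 budget are hypothetical illustrations, and a flat rate is a simplification, since real pricing differs for input, output, and cached tokens.

```python
# Approximate figures quoted in the summary for the hardest task.
TOKENS_USED = 50_000_000
COST_USD = 50.0


def usd_per_million_tokens(cost_usd: float, tokens: int) -> float:
    """Blended price implied by a total cost and a total token count."""
    return cost_usd / (tokens / 1_000_000)


def tokens_for_budget(budget_usd: float, rate_per_million: float) -> int:
    """Tokens a budget buys at a flat blended rate (a simplification)."""
    return int(budget_usd / rate_per_million * 1_000_000)


rate = usd_per_million_tokens(COST_USD, TOKENS_USED)
print(f"implied blended rate: ${rate:.2f} per million tokens")

# Hypothetical budget, only to show how token throughput scales with spend.
print(f"tokens for $1,000: {tokens_for_budget(1_000, rate):,}")
```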

A key conclusion drawn from this work is the impending industrialization of offensive cybersecurity. The author argues that the future of exploit development may hinge more on the token throughput an organization can afford than on the number of skilled hackers it employs. For a task to be industrialized, an agent must be able to operate independently within a defined environment and verify its own solutions without human assistance. Exploit development fits this model well: the environment and tools are well understood, and verification can be automated.
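The two conditions named above, independent operation and automated verification, amount to a propose-and-check loop bounded by a token budget. The following is a minimal sketch of that loop shape only, not the author's harness: `run_until_verified`, `SearchResult`, and the toy `propose` and `verify` functions are all hypothetical stand-ins, with a trivial string check in place of any real agent or verifier.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class SearchResult:
    solution: Optional[str]  # first candidate that passed verification, if any
    attempts: int            # number of candidates proposed
    tokens_spent: int        # total tokens charged across all attempts


def run_until_verified(
    propose: Callable[[int], Tuple[str, int]],
    verify: Callable[[str], bool],
    token_budget: int,
) -> SearchResult:
    """Propose candidates until one verifies or the token budget is used up.

    propose(attempt_index) returns (candidate, tokens_used).
    verify(candidate) is a fully automated pass/fail check.
    """
    attempts = 0
    spent = 0
    while spent < token_budget:
        candidate, used = propose(attempts)
        attempts += 1
        spent += used
        if verify(candidate):
            return SearchResult(candidate, attempts, spent)
    return SearchResult(None, attempts, spent)


# Toy stand-ins: each proposal costs 10 tokens and only "3" verifies.
def toy_propose(attempt: int) -> Tuple[str, int]:
    return str(attempt), 10


def toy_verify(candidate: str) -> bool:
    return candidate == "3"


found = run_until_verified(toy_propose, toy_verify, token_budget=100)
starved = run_until_verified(toy_propose, toy_verify, token_budget=20)
print(found)
print(starved)
```

In this shape no human sits inside the loop, so the only limit on how many candidates are tried is the budget. That is the sense in which the summary says throughput, rather than headcount, becomes the constraint.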

However, there are caveats. QuickJS is a real target, but it is far simpler than the JavaScript engines in Chrome or Firefox, so the results cannot be generalized to those engines without further testing. The generated exploits relied on known flaws in the protection mechanisms rather than novel ways of breaking them. The author notes that although GPT-5.2's approach was new to him, it might be familiar to experienced exploit developers. The article emphasizes the potential for LLMs to automate parts of cyber intrusion, but it also points out that tasks requiring real-time interaction with an adversarial environment are harder to industrialize.
