6 min read | Saved February 14, 2026
Do you care about this?
Datadog developed an LLM-powered tool called BewAIre to review pull requests for malicious activity in real time. The system processes code changes and classifies them, achieving over 99.3% accuracy while minimizing false positives. It addresses the challenges posed by the increasing volume of PRs and the sophistication of attacks.
If you do, here's more
Datadog has integrated coding assistants into its development workflow to speed up feature delivery and reduce repetitive work; engineers now open nearly 10,000 pull requests (PRs) per week. This volume presents security challenges: the larger the stream of PRs, the easier it is for malicious code to slip through unnoticed. Traditional static analysis tools, while helpful, mainly flag known bad patterns and cannot reason about the intent behind a code change. This gap allows attackers to embed harmful code that masquerades as a legitimate update.
To tackle these issues, Datadog's SDLC Security team developed an LLM-powered system called BewAIre. This system reviews every PR in real time, classifying changes as either benign or malicious. It operates by ingesting PRs, normalizing them, and processing the code along with contextual metadata to reason about intent. During testing, BewAIre achieved over 99.3% accuracy and a 0.03% false positive rate, demonstrating its effectiveness in identifying malicious commits linked to known vulnerabilities.
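The ingest-normalize-classify loop described above can be sketched roughly as follows. BewAIre's internals are not public, so the data shapes, prompt, and the `call_llm` stand-in here are illustrative assumptions, not Datadog's implementation.

```python
import json
from dataclasses import dataclass


@dataclass
class PullRequest:
    # Hypothetical minimal PR record; real contextual metadata would be richer
    # (repo, touched files, author history, CI signals, etc.).
    title: str
    author: str
    diff: str  # unified diff text


def normalize_diff(diff: str) -> str:
    """Keep only added/removed lines, dropping file headers and unchanged
    context, so the model focuses on what actually changed."""
    kept = []
    for line in diff.splitlines():
        if line.startswith(("+++", "---")):
            continue  # file headers, not code changes
        if line.startswith(("+", "-")):
            kept.append(line)
    return "\n".join(kept)


def build_prompt(pr: PullRequest) -> str:
    """Combine the normalized code changes with contextual metadata so the
    model can reason about intent, not just match patterns."""
    return (
        "Classify the following pull request as BENIGN or MALICIOUS.\n"
        f"Title: {pr.title}\n"
        f"Author: {pr.author}\n"
        f"Changes:\n{normalize_diff(pr.diff)}\n"
        'Respond as JSON: {"verdict": "...", "reason": "..."}'
    )


def classify(pr: PullRequest, call_llm) -> dict:
    """call_llm is a placeholder for whatever LLM API the system uses;
    it takes a prompt string and returns a JSON string."""
    return json.loads(call_llm(build_prompt(pr)))
```

In production the verdict would gate or flag the PR in real time; here, any function that maps a prompt to a JSON string can be dropped in as `call_llm` for testing.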
The team focused on building a curated dataset of both real-world and simulated malicious exploits to train the model. They hand-labeled and assembled varied examples, continuously updating the dataset to reflect current threats. Handling the model's context window limit was also crucial: large PRs risk exceeding it, and their size gives malicious changes more room to hide. By refining this approach and leveraging LLM capabilities, Datadog aims to strengthen code security without sacrificing developer efficiency.
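One common way to handle the context window limit is to split a large PR's diff per file and pack the pieces into batches that each fit in a single model call. The heuristics below (a ~4-characters-per-token estimate and greedy packing) are assumptions for illustration, not BewAIre's documented approach.

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: roughly 4 characters per token for code.
    return max(1, len(text) // 4)


def split_diff_by_file(diff: str) -> list[str]:
    """Split a unified diff into per-file chunks on 'diff --git' headers."""
    chunks, current = [], []
    for line in diff.splitlines():
        if line.startswith("diff --git") and current:
            chunks.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current))
    return chunks


def pack_chunks(file_chunks: list[str], budget: int) -> list[list[str]]:
    """Greedily pack per-file chunks into batches under the token budget,
    so each batch fits in one model call; an oversized file goes alone."""
    batches, batch, used = [], [], 0
    for chunk in file_chunks:
        cost = estimate_tokens(chunk)
        if batch and used + cost > budget:
            batches.append(batch)
            batch, used = [], 0
        batch.append(chunk)
        used += cost
    if batch:
        batches.append(batch)
    return batches
```

A per-file split keeps each chunk semantically coherent, at the cost of losing cross-file context; a real system would likely also carry shared PR metadata into every batch.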