Prompt Injection Vulnerability in Google Gemini Allows Malicious Notification Exploitation

Researchers at SafeBreach have detailed a sophisticated prompt injection vulnerability affecting Google Gemini's voice assistant, which could allow attackers to trick users into executing harmful actions. The research, titled "Gemini's Secret Affair: Exploiting Gemini Voice Assistant Through Instant Messaging Apps," builds on earlier work in which calendar invitations were used to manipulate Gemini. The latest technique exploits Gemini's ability to summarize message notifications, a feature that can inadvertently hide malicious instructions from the user.

Or Yair, lead of SafeBreach's security research team, demonstrated how attackers could embed hidden commands within messages. These commands could be disguised in foreign languages or within muted hyperlinks, allowing the AI assistant to process them silently without the user being aware. The potential consequences are significant, ranging from controlling smart home devices and launching unauthorized video streams to conducting social engineering attacks, including impersonating trusted contacts, and even poisoning the long-term memory of the large language model (LLM).

To bypass Google's existing security measures, Yair developed a technique called "Fake Context Alignment." This method creates a dual illusion: it presents a legitimate authorization scenario to Gemini's internal security mechanisms while simultaneously showing a benign scenario to the end user. This allows malicious instructions to be processed without triggering standard alerts. There is currently no evidence of this vulnerability being exploited in the wild. SafeBreach responsibly disclosed the issue to Google, which has since implemented content classifier updates to address the flaw.

The core of the exploit lies in how Gemini handles message notifications. For instance, an attacker could send a phishing message via WhatsApp from an unknown number, disguised as an invitation from a friend, and include a payment link. Crucially, the message could contain hidden hyperlink code instructing Gemini to misrepresent the sender's identity to the user. If a user asks Gemini to summarize their messages, especially while multitasking, they might receive a simplified, context-stripped summary that omits the suspicious origin, thereby building false trust.

Attackers can also append invisible text in a foreign language to the end of a message. Gemini might interpret these characters and execute the hidden instructions, even though they are not read aloud to the user. Yair found that combining foreign characters with muted hyperlinks yielded the most reliable and stealthy results. With this combination, the malicious command is hidden within a hyperlink that Gemini processes but does not read aloud. The user hears a normal-sounding prompt, responds affirmatively, and inadvertently triggers the malicious action.
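As a rough illustration of the kind of pre-filter a defender might run on notification text before it reaches an assistant, the Python sketch below flags two signals loosely related to the tricks described above: invisible Unicode format characters and a mix of writing systems in one message. It is not SafeBreach's method or Google's fix. The function names are hypothetical, the script detection is a crude heuristic based on Unicode character names, and legitimate multilingual messages would also be flagged, so the result is a hint for further review rather than a verdict.

```python
import unicodedata


def find_invisible_chars(text):
    """Return (index, name) for every character in Unicode category Cf (format),
    which includes zero-width spaces and joiners that render as nothing."""
    return [
        (i, unicodedata.name(ch, f"U+{ord(ch):04X}"))
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"
    ]


def scripts_used(text):
    """Approximate the set of scripts in the text by taking the first word of
    each letter's Unicode name (for example LATIN, CYRILLIC, CJK)."""
    scripts = set()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split()[0])
    return scripts


def flag_notification(text):
    """Return a list of human-readable reasons the message looks suspicious."""
    reasons = []
    if find_invisible_chars(text):
        reasons.append("invisible format characters")
    if len(scripts_used(text)) > 1:
        reasons.append("mixed scripts")
    return reasons
```

A message such as "hello" followed by a zero-width space would be flagged for invisible format characters, while an ordinary single-script message returns an empty list. A real deployment would also need to inspect hyperlink targets, which this sketch does not attempt.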

This vulnerability highlights a broader challenge in securing AI assistants: the inherent risk of context shifting. Even though Google has patched this specific issue, the underlying principle of manipulating context to bypass guardrails remains a critical concern. SafeBreach emphasizes that indirect prompt injections are not traditional vulnerabilities that can be 'fixed' with a single patch. Instead, they require continuous monitoring and robust guardrails implemented by vendors.

Because of the implemented fixes, Gemini users do not need to take any specific action, but the research serves as a stark reminder of the evolving threat landscape for AI systems. Organizations deploying LLMs must pay close attention to access controls and treat all external input, including notifications, as potentially untrusted. The research underscores the necessity of active security controls and classifiers to monitor AI behavior and mitigate risks associated with prompt injection attacks.
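One way to read the advice about untrusted input is as a policy gate that sits between the model and its tools. The Python sketch below is a minimal example under stated assumptions, not a description of how Gemini works: the tool names, the `ToolCall` type, and the `decide` and `confirmation_prompt` functions are all hypothetical. The idea is that any sensitive action proposed while the assistant is processing external content requires confirmation, and that the confirmation text is built from the actual tool call rather than from model-generated wording, so the user sees what will really happen.

```python
from dataclasses import dataclass

# Hypothetical tool identifiers considered sensitive in this sketch.
SENSITIVE_TOOLS = {"smart_home.control", "video_call.start", "memory.write"}


@dataclass(frozen=True)
class ToolCall:
    name: str
    argument: str


def decide(call, context_is_untrusted):
    """Return 'allow' or 'confirm' for a proposed tool call.

    context_is_untrusted should be True whenever the current turn included
    external content such as a message notification.
    """
    if call.name not in SENSITIVE_TOOLS:
        return "allow"
    if context_is_untrusted:
        return "confirm"
    return "allow"


def confirmation_prompt(call):
    """Build the confirmation text from the real call, never from model output."""
    return f"Allow action '{call.name}' with argument '{call.argument}'?"
```

For example, `decide(ToolCall("smart_home.control", "open window"), True)` returns `"confirm"`, and the prompt shown to the user names the real action. A gate like this complements classifiers rather than replacing them.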

Fake Context Alignment circumvents the mitigations Google previously deployed against prompt injection attacks on Gemini. It uses obfuscated foreign-language prompts or muted text-to-speech elements such as hidden hyperlinks to trick the assistant into executing malicious commands, even when Delayed Tool Invocation is re-enabled. The updated research also demonstrates a wider attack surface: any app capable of sending notifications, including WhatsApp and Slack, can deliver the payload. Demonstrated outcomes include context poisoning, covert control of smart home devices, authorization of sensitive actions such as opening connected windows or initiating Zoom calls, and persistent memory poisoning across a user's Google Workspace.

The vulnerability was reported to Google in August 2025 and patched server-side in November 2025, with no evidence of exploitation in the wild.