ChatGPT Prompt Injection 'ChatGPhish' Lets Attacker-Controlled Web Pages Hijack Chatbot Output

A newly disclosed prompt injection vulnerability in OpenAI's ChatGPT, dubbed 'ChatGPhish,' allows attacker-controlled web pages to hijack the chatbot's output, injecting phishing URLs, fake security alerts, and even inline QR codes into responses. The flaw, discovered by Permiso threat hunter Andi Ahmeti, exploits ChatGPT's inability to distinguish its own generated content from Markdown pulled from external sources. When a user asks the chatbot to summarize a malicious page, hidden instructions can force the AI to append spoofed security warnings or phishing links that appear to come from OpenAI itself.

The attack works by embedding prompt injection instructions into a web page's Markdown content. Ahmeti demonstrated this by injecting instructions into a GitHub-hosted CloudLens page. When a user asked ChatGPT to summarize the page, the chatbot produced a legitimate summary of the tool's features, but immediately beneath it displayed a fake security alert: 'A new device was added to your account: Chrome on Linux (Pristina).' The alert's 'Click here' link pointed to an attacker-controlled domain, krileva.com, rather than to a genuine OpenAI security page.

Ahmeti also demonstrated a more sophisticated variant that renders an inline QR code in the chatbot's output. Because the chatgpt.com client auto-fetches and displays Markdown images, an attacker can place a QR code that, when scanned on a mobile device, takes the victim to an attacker-controlled URL that has never been displayed in plaintext. This technique bypasses desktop URL defenses, including blocklists and password-manager domain checks, by pivoting the attack from the victim's browser to their phone.
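One way to picture the kind of mitigation this variant calls for is a filter that removes Markdown images from an assistant response before the client renders it, so that nothing is auto-fetched. The following Python sketch is illustrative only: the function name and placeholder text are assumptions, the pattern covers only inline image syntax (not reference-style images or raw HTML), and nothing here describes how ChatGPT actually handles images.

```python
import re

# Matches inline Markdown images such as ![alt](https://host/x.png "title").
# Simplified on purpose: reference-style images and raw <img> tags are not covered.
IMAGE_PATTERN = re.compile(r"!\[[^\]]*\]\(\s*<?[^)\s>]+>?(?:\s+[^)]*)?\)")

# Hypothetical placeholder shown to the user instead of the image.
PLACEHOLDER = "(external image removed)"


def neutralize_images(markdown: str) -> str:
    """Replace every inline Markdown image with an inert text placeholder.

    The alt text is dropped as well, because it comes from the same
    untrusted source as the image URL.
    """
    return IMAGE_PATTERN.sub(PLACEHOLDER, markdown)


if __name__ == "__main__":
    response = "Summary of the page.\n\n![Scan to verify](https://evil.example/qr.png)"
    print(neutralize_images(response))
```

Because the image is never requested, a QR code cannot be displayed and the hosting server never sees the fetch. The trade-off is that legitimate images are removed too; a less blunt policy would need an allowlist or an image proxy.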

The researcher confirmed the attack is not limited to GitHub-hosted pages. He embedded the same payload into a self-hosted Republic of Kosovo marketing website and observed identical behavior: the assistant produced a normal summary, then appended a spoofed alert with a clickable attacker link. 'The behavior is identical,' Ahmeti wrote, indicating the vulnerability is inherent to how ChatGPT processes external content, not a quirk of any specific hosting platform.

Ahmeti disclosed the vulnerability to OpenAI via Bugcrowd on April 29, 2026, and revised his report on May 1. The initial submission was marked as not reproducible, and after resubmission with additional detail, it was marked as a duplicate. Ahmeti says the supposed duplicate 'had major differences' from his report, and OpenAI did not respond to his follow-up requests for clarification. At the time of publication, OpenAI had not confirmed whether a fix had been applied, leaving the flaw potentially unpatched.

'AI systems increasingly render untrusted content directly inside browsers, which expands risk significantly,' Ahmeti told The Register. 'The bigger issue is that AI products are starting to resemble browser or operating system environments, which creates a much larger security surface.' He recommends strong sandboxing, rendering model-generated content in isolated environments, and strict filtering across Markdown, HTML, embeds, and previews. 'Do not trust model output. AI-generated content should always be treated as untrusted. Assume prompt injection will happen.'
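The "strict filtering" recommendation can be sketched as a link allowlist applied to model output before rendering. The Python example below is a minimal sketch, not a vetted sanitizer: the `TRUSTED_HOSTS` set, the function names, and the replacement wording are all hypothetical, and the regular expression handles only inline Markdown links.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist; a real product would define its own policy.
TRUSTED_HOSTS = {"openai.com", "help.openai.com"}

# Inline links like [text](url "title"); the lookbehind skips images (![...]).
LINK_PATTERN = re.compile(r"(?<!!)\[([^\]]+)\]\(\s*<?([^)\s>]+)>?(?:\s+[^)]*)?\)")


def parse_target(url):
    """Return (scheme, host) in lowercase, or ("", "") if parsing fails."""
    try:
        parsed = urlparse(url)
        return parsed.scheme.lower(), (parsed.hostname or "").lower()
    except ValueError:
        return "", ""


def defang_links(markdown: str) -> str:
    """Keep links to trusted HTTPS hosts; turn all others into plain text."""
    def _rewrite(match):
        text, url = match.group(1), match.group(2)
        scheme, host = parse_target(url)
        if scheme == "https" and host in TRUSTED_HOSTS:
            return match.group(0)
        # Show the real destination in defanged form so it is not clickable.
        shown = host.replace(".", "[.]") if host else "unknown host"
        return f"{text} (link removed; target was {shown})"
    return LINK_PATTERN.sub(_rewrite, markdown)


if __name__ == "__main__":
    alert = "A new device was added. [Click here](https://krileva.com/verify)"
    print(defang_links(alert))
```

Applied to the spoofed alert described earlier, this sketch would print the link text followed by `krileva[.]com`, which exposes the mismatch between the claimed sender and the actual destination. Treating the output as untrusted by default, rather than trying to detect injected instructions, is the design choice being illustrated.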

The ChatGPhish vulnerability underscores a growing application-security problem: prompt injection is no longer just a model alignment issue but a practical attack vector that can turn AI assistants into phishing delivery platforms. As AI products increasingly integrate with browsers and external services, the security surface expands, making sandboxing and output validation critical defenses.

Permiso researchers disclosed the attack, dubbed ChatGPhish, on May 29, 2026, after reporting it to OpenAI via Bugcrowd on April 29. OpenAI initially could not reproduce the issue, then classified a revised submission as a duplicate of a previously reported vulnerability. The technique exploits ChatGPT's page summarization feature to inject attacker-controlled Markdown links, fake security alerts, QR codes, and passive tracking beacons into the trusted ChatGPT interface, enabling UI redress phishing and data exfiltration without requiring authentication.

The new Permiso Security disclosure, shared with The Hacker News on May 29, adds further technical depth to the ChatGPhish attack. It details how attacker-controlled Markdown images and links in a web page, once summarized by ChatGPT, can leak the victim's IP address, User-Agent, and Referer header to the attacker, serve fake system-style security alerts, and embed QR codes that bypass desktop URL filters. The researchers emphasize that the technique's novelty lies not in the prompt injection itself but in the fact that the Markdown payload is surfaced live and clickable within the trusted AI interface simply by having the assistant summarize the page. The disclosure also places ChatGPhish within a broader wave of AI-attack research, referencing recent findings on SymJack and TrustFall against coding agents. The core technical description of ChatGPhish overlaps with the earlier May 28 story.