First AI-Agent-Driven Intrusion: LLM Chains Marimo RCE to Database Theft in Under an Hour

On May 10, 2026, a threat actor deployed a large language model (LLM) agent to drive a complete post-exploitation chain, starting from an exposed marimo notebook server and ending with an internal PostgreSQL database dumped in under two minutes. This was not a pre-scripted attack: commands were composed in real time, adapting at each step to whatever the target revealed. Sysdig's Threat Research Team (TRT), which captured the intrusion, described it as the first AI-agent-driven intrusion it has recorded.

The entry point was a vulnerable marimo notebook exposed to the internet. The attacker exploited CVE-2026-39987, a critical flaw that yields a shell on any unpatched marimo server with a single WebSocket request. Cloud credentials were harvested from environment files and the AWS credentials store, then used to retrieve an SSH private key from AWS Secrets Manager. That key opened eight parallel SSH sessions against a downstream bastion server, from which an internal PostgreSQL database was fully exfiltrated.

Sysdig said in a report shared with Cyber Security News that the full chain ran end-to-end in under one hour. Sr. Director Michael Clark put it plainly: "We are not watching AI replace attackers. We are watching attackers replace their scripts with AI."

What made this attack notable was how traffic was routed to avoid detection. Twelve AWS API calls were fanned across eleven distinct Cloudflare Workers IP addresses in just 22 seconds, defeating the per-source-IP correlation cloud defenders rely on. During the bastion phase, eight SSH sessions arrived simultaneously from six separate IPs.
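One way defenders might counter this kind of egress fan-out is to correlate API activity by credential rather than by source IP. The following is a minimal Python sketch of that idea, not Sysdig's method. The record fields (access_key_id, source_ip, time), the 60-second window, the five-IP threshold, and the sample data are all assumptions made for illustration. The sample mirrors the reported pattern of twelve calls from eleven addresses in 22 seconds.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def find_ip_fanout(events, window=timedelta(seconds=60), min_ips=5):
    """Return {access_key_id: peak distinct source IPs within any window}
    for keys whose calls fan out across at least `min_ips` addresses.
    Field names, window, and threshold are hypothetical."""
    by_key = defaultdict(list)
    for ev in events:
        by_key[ev["access_key_id"]].append((ev["time"], ev["source_ip"]))

    flagged = {}
    for key, calls in by_key.items():
        calls.sort()
        start = 0
        best = 0
        for end in range(len(calls)):
            # Advance the left edge until the span fits inside `window`.
            while calls[end][0] - calls[start][0] > window:
                start += 1
            distinct = len({ip for _, ip in calls[start:end + 1]})
            best = max(best, distinct)
        if best >= min_ips:
            flagged[key] = best
    return flagged


# Made-up sample: 12 calls from 11 addresses over 22 seconds on one key,
# plus a second key that only ever calls from a single address.
t0 = datetime(2026, 5, 10, 12, 0, 0)
events = [
    {
        "access_key_id": "AKIAEXAMPLEAGENT",
        "source_ip": f"198.51.100.{i % 11}",
        "time": t0 + timedelta(seconds=2 * i),
    }
    for i in range(12)
]
events += [
    {
        "access_key_id": "AKIAEXAMPLEBENIGN",
        "source_ip": "203.0.113.7",
        "time": t0 + timedelta(seconds=10 * i),
    }
    for i in range(3)
]

flagged = find_ip_fanout(events)
print(flagged)  # {'AKIAEXAMPLEAGENT': 11}
```

Keying on the credential means that rotating source addresses no longer splits the activity into separate, innocuous-looking streams. A real deployment would need thresholds tuned to its own legitimate multi-IP usage.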

The Sysdig TRT identified four signs that an LLM agent drove the attack. First, the agent improvised a database dump with no prior schema knowledge, enumerating tables and immediately targeting a credential table that does not exist in the application the schema resembled. This suggests it was reasoning from general knowledge, not pre-staged intelligence. Second, a Chinese-language planning comment translating to "See what else we can do" appeared directly in the command stream. That internal monologue, dispatched across six IPs at sub-second pace, is not something a human typist or static script would produce. Third, every command was built for machine parsing, using structured separators, bounded output caps, and discarded error streams so the agent could read each result cleanly. The fourth sign was how values flowed between steps: the database password came from the .pgpass file read moments earlier, the SSH key path followed a listing that confirmed the file existed, and the AWS secret ID was selected from a ListSecrets response just 20 seconds before retrieval.
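The third sign, machine-oriented command formatting, lends itself to a simple heuristic. The Python sketch below checks a command line for the three traits described above. The regular expressions, the trait names, and both sample commands are illustrative assumptions, not indicators or commands taken from Sysdig's report.

```python
import re

# Hypothetical patterns for the three traits; a real rule set would need
# tuning against legitimate automation that uses the same shell idioms.
MACHINE_TRAITS = {
    "delimiter": re.compile(r"echo\s+[\"']?-{3,}"),            # structured separators
    "output_cap": re.compile(r"\|\s*(head|tail)\s+-(n\s*)?\d+"),  # bounded output
    "stderr_discarded": re.compile(r"2>\s*/dev/null"),          # suppressed errors
}


def machine_traits(command):
    """Return the sorted names of machine-formatting traits found in `command`."""
    return sorted(
        name for name, pattern in MACHINE_TRAITS.items() if pattern.search(command)
    )


# Made-up examples: one agent-style command, one typical interactive command.
agent_style = "echo '---TABLES---'; psql -c 'SELECT 1' 2>/dev/null | head -n 50"
human_style = "ls -la /home"

print(machine_traits(agent_style))
# ['delimiter', 'output_cap', 'stderr_discarded']
print(machine_traits(human_style))
# []
```

No single trait is suspicious on its own, since administrators routinely cap output or discard stderr. The signal, if any, lies in all three appearing together across consecutive commands.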

The most pressing implication is that signature-based detection is losing ground. A scripted attacker leaves repeatable fingerprints, such as the same command order or probe sequence on every run. An LLM agent rewrites its approach for every target, making static rules less reliable. Detection must shift toward what the attacker is accomplishing, such as credential access or database exfiltration, rather than the specific commands used.
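To illustrate the difference, the Python sketch below classifies a file-open event by the objective it serves rather than by the command that performed it. The rule table, path patterns, and sample events are hypothetical, chosen to echo the credential sources mentioned in this article (environment files, the AWS credentials store, .pgpass, and SSH keys).

```python
from fnmatch import fnmatchcase

# Hypothetical rule set: sensitive path patterns mapped to the objective
# that reading them would indicate.
OBJECTIVE_RULES = {
    "credential_access": [
        "*/.pgpass",
        "*/.aws/credentials",
        "*/.env",
        "*/.ssh/id_*",
    ],
}


def classify_file_open(path):
    """Return the objective a file open maps to, or None if no rule matches."""
    for objective, patterns in OBJECTIVE_RULES.items():
        if any(fnmatchcase(path, pattern) for pattern in patterns):
            return objective
    return None


# Made-up events: three different tools touch sensitive files, one does not.
events = [
    ("cat", "/home/app/.pgpass"),
    ("python3", "/root/.aws/credentials"),
    ("grep", "/srv/app/.env"),
    ("tail", "/var/log/syslog"),
]

results = [(proc, classify_file_open(path)) for proc, path in events]
print(results)
```

Because the rule keys on the file being read, it yields the same verdict whether the agent uses cat, a Python one-liner, or grep, which is the property a per-command signature lacks.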

Sysdig recommends updating marimo to version 0.23.0 or later immediately. If upgrading is not possible, access to the /terminal/ws endpoint should be restricted or the terminal feature disabled. Any publicly reachable marimo instance should be treated as potentially compromised, and all associated credentials, API keys, SSH keys, and database passwords should be rotated. CVE-2026-39987 is on CISA's Known Exploited Vulnerabilities catalog, and its federal remediation deadline has passed. Organizations should enable deep telemetry across the full network and deploy runtime threat detection that flags behavior-based patterns. An LLM-powered attacker no longer needs to map an environment in advance to operate inside it: speed, adaptiveness, and distributed egress are now standard features of the threat.

Sysdig's detailed breakdown of the May 10 attack chain gives specifics for each of the four indicators that an LLM agent drove the post-exploitation phase. The attacker improvised a database dump without prior schema knowledge. A Chinese-language planning comment ('看还能做什么', "See what else we can do") leaked into the command stream. Every command was machine-formatted, with '---' delimiters and suppressed stderr. And values handed off between steps, such as a database password extracted from ~/.pgpass, were fed directly into subsequent actions. The firm warns that agent-in-the-loop operators adapt on the fly to unexpected environments, compressing the attacker's engineering time into inference budget.