Microsoft Details Security Measures Against Adversarial Memory Poisoning in M365 Copilot

Microsoft published a detailed security analysis of AI memory in Microsoft 365 Copilot on June 22, 2026, addressing a novel attack surface where adversaries can gradually poison an AI assistant’s memory over time to trigger malicious tool calls long after the initial interaction. The blog post, authored by the Microsoft Security team in collaboration with researchers Johann Rehberger, Håkon Måløy, and Gal Zror, describes how AI memory transforms a stateless assistant into a persistent learning system, but in doing so opens the door for stealthy staged attacks.

“Without memory, attackers need to ‘win’ in a single prompt. With AI memory, they can shape behavior gradually over time or plant memories that influence agent reasoning after the original context is gone and user awareness is lower,” the post explains. The core risk is delayed tool invocation: an attacker embeds hidden instructions in a document—for example, directives to exfiltrate the user’s calendar—that the AI assistant reads but does not act on immediately. Days later, in an unrelated conversation, the dormant instructions trigger a memory write that causes the assistant to forward scheduling updates to the attacker.

To counter this threat, Microsoft has implemented a multilayer defense spanning memory creation, storage, retrieval, and user control. During memory creation, proprietary prompt-injection classifiers inspect content for malicious input and strip it before anything is written to memory. Additionally, Copilot runs Task Adherence checks on every explicit memory write to detect discrepancies between the tool call and the user’s stated intent, effectively mitigating prompt injection at the memory tool-call layer.

Once stored, memories are governed by the same data policies that apply to other Microsoft 365 content, including tenant isolation, Data Subject Requests, Customer Lockbox, and encryption at rest. Microsoft also provides end-to-end traceability: every memory update is recorded to organizational audit logs via the MemoryUpdated field, accessible in Defender Advanced Hunting, Microsoft Sentinel, and Azure Portal Sentinel Analytics. This allows SOC analysts to track the full lifecycle from source content to memory write to downstream behavior.

The timing of the disclosure is noteworthy, as it follows multiple recent research findings on AI memory manipulation. In March, researchers from Cornell Tech demonstrated the WARP attack, which uses a single Reddit comment to poison AI deep-research tools like ChatGPT and Gemini. More closely, the SearchLeak vulnerability in Microsoft Copilot itself was patched just days earlier, after researchers found it could leak sensitive data via prompt injection. Microsoft’s new guidance appears to be a direct response to the growing recognition that memory layers in AI assistants represent a critical security boundary.

The post frames AI memory security as an active, iterative program. “Building safe AI memory is one of the most consequential challenges in AI. It requires balancing personalization, capability, privacy, security, and governance,” the company states. Microsoft plans to continue hardening the system and encourages security researchers to engage via its coordinated vulnerability disclosure program, crediting the external researchers who helped inform the new protections.