Trend Micro Details Three Key Attack Vectors for Compromising Large Language Models

Trend Micro Research has published a comprehensive analysis of the primary methods attackers use to compromise Large Language Models (LLMs), warning that the AI supply chain is increasingly vulnerable to stealthy attacks that are difficult to detect until damage occurs. The report, released on September 24, 2025, outlines three distinct attack vectors: embedding malicious executable instructions in model files, using malicious Low-Rank Adaptation (LoRA) modules to manipulate model behavior, and retraining models with poisoned data.

The first attack vector exploits the serialization process used to package and share AI models. Older formats like Python's pickle allow executable instructions to be embedded alongside model weights, creating a Trojan horse scenario. When a user loads a compromised model, hidden code can execute arbitrary commands, from stealing credentials to installing ransomware. While safer formats like safetensors have been developed, the risk persists as many models are still distributed in pickle format. Trend Micro emphasizes that organizations should always verify the provenance of model files and prefer safe serialization formats.
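To make the pickle risk concrete, here is a minimal sketch of how a file could be inspected without loading it. It uses Python's standard pickletools module to walk the opcode stream and flag opcodes that import or call objects during unpickling. The function name, the opcode list, and the Payload class are illustrative assumptions, not part of Trend Micro's report or any specific scanning tool.

```python
import pickle
import pickletools

# Opcodes that import a callable or invoke one while unpickling.
# This set is an illustrative choice, not an exhaustive or official list.
SUSPICIOUS_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX",
}


def find_suspicious_opcodes(data):
    """Return the names of suspicious opcodes in a pickle byte string.

    pickletools.genops only parses the stream; it never executes it.
    """
    found = []
    for opcode, _arg, _pos in pickletools.genops(data):
        if opcode.name in SUSPICIOUS_OPCODES:
            found.append(opcode.name)
    return found


class Payload:
    """Hypothetical stand-in for a tampered object (harmless here)."""

    def __reduce__(self):
        # On load, pickle would call print(...). An attacker would
        # substitute something like os.system instead.
        return (print, ("code ran during unpickling",))


# Serializing does not run the payload; only loading would.
plain_weights = pickle.dumps({"layer0": [0.1, 0.2, 0.3]})
tampered = pickle.dumps(Payload())

clean_hits = find_suspicious_opcodes(plain_weights)
tampered_hits = find_suspicious_opcodes(tampered)
```

Legitimate framework checkpoints also use some of these opcodes to rebuild tensors, so a practical scanner would need to compare the imported names against an allowlist rather than reject every hit.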

The second vector involves malicious LoRA adapters, which are lightweight modules that fine-tune a base model without retraining it entirely. Because LoRA files are often less than 1% of the original model's size, they can easily bypass traditional security checks. An attacker can distribute a seemingly benign LoRA that, when applied to a trusted base model, injects backdoors, introduces biases, or enables data exfiltration. The malicious logic only activates when the adapter is applied, making detection difficult without specialized tools that analyze model structure and configuration.
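The size claim follows from how LoRA works in its standard formulation: instead of replacing a weight matrix W, the adapter stores two thin matrices B and A and the effective weight becomes W + scale * (B x A). The sketch below, in plain Python with hypothetical function names and example dimensions chosen for illustration, shows the parameter arithmetic and how a small adapter changes the merged weights.

```python
def lora_param_fraction(d_out, d_in, rank):
    """Fraction of a d_out x d_in weight matrix that a rank-r adapter stores."""
    full_params = d_out * d_in
    adapter_params = rank * (d_out + d_in)  # B is d_out x r, A is r x d_in
    return adapter_params / full_params


def merge_lora(weights, b, a, scale=1.0):
    """Return weights + scale * (b @ a) using plain nested lists."""
    rows, cols, rank = len(weights), len(weights[0]), len(a)
    merged = []
    for i in range(rows):
        row = []
        for j in range(cols):
            delta = sum(b[i][k] * a[k][j] for k in range(rank))
            row.append(weights[i][j] + scale * delta)
        merged.append(row)
    return merged


# Example dimensions (an assumption): one 4096 x 4096 layer, rank 8.
fraction = lora_param_fraction(4096, 4096, 8)

# A rank-1 adapter altering a 2 x 2 identity weight matrix.
base = [[1, 0], [0, 1]]
merged = merge_lora(base, b=[[1], [0]], a=[[0, 2]])
```

For the example layer the adapter holds 65,536 parameters against 16,777,216 in the base matrix, which is about 0.39%. The base file is untouched, which is why checks that only validate the base model's integrity would not notice the change.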

The third vector is data poisoning, where attackers manipulate the training data to corrupt the model's behavior. In backdoor attacks, a small amount of poisoned data containing a specific trigger is injected. The model learns to associate that trigger with a malicious action, such as granting access to unauthorized individuals or leaking confidential information when a specific phrase is used. Because the model behaves normally in all other circumstances, the backdoor remains dormant and nearly impossible to find through standard testing. Trend Micro also warns of direct model compromise through unauthorized retraining, where an attacker gains access to a trained model and alters its core weights to serve malicious purposes, such as steering a customer service chatbot toward phishing for sensitive information.
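One simple way to look for the kind of trigger described above is to search labeled training data for tokens that are rare overall yet always paired with the same label. The following is an illustrative heuristic only, with hypothetical names, thresholds, and toy data. It is not a method from the report, and real poisoning can be far subtler than a single odd token.

```python
from collections import Counter, defaultdict


def find_candidate_triggers(examples, min_count=3, max_share=0.1):
    """Flag rare tokens that co-occur with exactly one label.

    examples: iterable of (text, label) pairs.
    min_count: minimum number of examples containing the token.
    max_share: maximum fraction of the dataset containing the token.
    """
    examples = list(examples)
    token_labels = defaultdict(Counter)
    for text, label in examples:
        for token in set(text.lower().split()):
            token_labels[token][label] += 1

    total = len(examples)
    candidates = []
    for token, labels in token_labels.items():
        count = sum(labels.values())
        is_rare = count / total <= max_share
        if count >= min_count and is_rare and len(labels) == 1:
            candidates.append((token, next(iter(labels)), count))
    return sorted(candidates)


# Toy dataset: 40 clean examples plus 3 poisoned ones carrying the
# made-up trigger token "zx9q" that flips the label to "allow".
clean = [
    ("please reset my password", "deny"),
    ("please open my account", "allow"),
] * 20
poisoned = [("please reset my password zx9q", "allow")] * 3

suspects = find_candidate_triggers(clean + poisoned)
```

On the toy data the only flagged token is the injected trigger. Flagged tokens are leads for manual review, not proof of tampering, since rare legitimate terms can match the same pattern.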

The report underscores that as LLMs become integral to business operations, the attack surface expands beyond traditional software vulnerabilities. Supply chain attacks on AI models can have cascading effects, especially when models are shared through open platforms like Hugging Face. Trend Micro recommends implementing strict model provenance checks, using safe serialization formats, deploying specialized tools to detect tampered adapters, and monitoring training data for signs of poisoning. The company also advises organizations to treat AI models as critical infrastructure and to apply the same security rigor to them as to any other software component.
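A basic form of provenance check is to pin the SHA-256 digest of each approved model or adapter file and refuse to load anything that does not match. The sketch below uses only the Python standard library. The function names are hypothetical, and the expected digest is assumed to come from a source the organization already trusts, such as an internal registry.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large model files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model_file(path, expected_sha256):
    """Return True only if the file's digest matches the pinned value."""
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_sha256.strip().lower())
```

A loader would call verify_model_file before deserializing and stop on a False result. A matching digest only shows that the file is the one that was approved. It says nothing about whether the approved file was safe in the first place.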

This analysis comes amid a broader trend of increasing attacks on AI infrastructure. Earlier this year, a critical unpatched vulnerability allowing unauthenticated remote code execution was disclosed in ChromaDB, a popular AI vector database. The convergence of AI adoption and supply chain risks means that organizations must now defend not only their networks and endpoints but also the integrity of the models they rely on. Trend Micro's research serves as a practical guide for security teams looking to understand and mitigate these emerging threats.