Denial of Service by abusing xgrammar unbounded cache in memory
Description
XGrammar is an open-source library for efficient, flexible, and portable structured generation. Prior to 0.1.18, XGrammar includes a cache for compiled grammars to increase performance with repeated use of the same grammar. This cache is held in memory. Since the cache is unbounded, a system making use of XGrammar can be abused to fill up a host's memory and cause a denial of service. For example, sending many small requests to an LLM inference server with unique JSON schemas would eventually cause this denial of service to occur. This vulnerability is fixed in 0.1.18.
AI Insight
LLM-synthesized narrative grounded in this CVE's description and references.
XGrammar 0.1.18 addresses an unbounded cache that allows an attacker to exhaust host memory, causing denial of service via repeated unique grammar requests.
Vulnerability Description
XGrammar is an open-source library for structured generation that caches compiled grammars to improve performance with repeated use. Prior to version 0.1.18, this cache had no size limit, meaning each unique grammar added a new entry that was retained indefinitely. The unbounded cache resides entirely in host memory, allowing an attacker to systematically fill available RAM by sending many small requests, each using a distinct grammar (e.g., unique JSON schemas), leading to a denial of service condition [1].
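The growth pattern described above can be illustrated with a toy model. This is a minimal sketch, not XGrammar's code: `UnboundedGrammarCache`, `simulate_unique_requests`, and the fake "compiled" payload are hypothetical stand-ins, and the only assumption carried over from the advisory is that every distinct schema adds an entry that is never evicted.

```python
import json


class UnboundedGrammarCache:
    """Hypothetical stand-in for a compiled-grammar cache with no size limit."""

    def __init__(self):
        self._cache = {}  # entries are never evicted

    def compile(self, schema: dict) -> bytes:
        # Use the canonical JSON text of the schema as the cache key.
        key = json.dumps(schema, sort_keys=True)
        if key not in self._cache:
            # Placeholder for an expensive compilation result.
            self._cache[key] = key.encode("utf-8") * 100
        return self._cache[key]

    def entry_count(self) -> int:
        return len(self._cache)

    def cached_bytes(self) -> int:
        return sum(len(value) for value in self._cache.values())


def simulate_unique_requests(n: int) -> tuple[int, int]:
    """Send n requests, each with a distinct schema; return (entries, bytes)."""
    cache = UnboundedGrammarCache()
    for i in range(n):
        schema = {
            "type": "object",
            "properties": {f"field_{i}": {"type": "string"}},
        }
        cache.compile(schema)
    return cache.entry_count(), cache.cached_bytes()
```

Repeating the same schema reuses a single entry, while each new schema adds one, so memory use in this model grows linearly with the number of distinct schemas submitted.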
Exploitation Scenario
The attack surface is any application that accepts arbitrary grammars from untrusted input and uses XGrammar to process them. A practical example is an LLM inference server where a user can submit multiple requests with unique JSON schemas; each request causes XGrammar to compile and cache the schema, eventually consuming all host memory [4]. No authentication or special privileges are needed beyond the ability to send such requests [1][3].
Impact
Successful exploitation exhausts host memory, leaving the system unresponsive or crashing it and denying service to legitimate users. NVD has not yet assigned a CVSS vector, but the project treats the issue as high severity [1][4].
Mitigation
The fix is implemented in XGrammar version 0.1.18, which introduces a configurable cache size limit. Downstream integrations, such as vLLM, have adopted this fix and added an environment variable (VLLM_XGRAMMAR_CACHE_MB) to control the cache size, defaulting to 512 MB [3]. Users should upgrade to XGrammar 0.1.18 or apply equivalent limits in their applications [1][4].
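For applications that cannot upgrade immediately, the "equivalent limits" mentioned above could take the form of a byte-budgeted cache with least-recently-used eviction. The sketch below is an assumption-laden illustration: `BoundedCache` and `cache_limit_bytes_from_env` are hypothetical names, the LRU policy is a choice made here and is not claimed to match XGrammar's internal strategy, and reading `VLLM_XGRAMMAR_CACHE_MB` directly only mirrors the megabyte-to-byte conversion implied by the text, not vLLM's actual configuration code.

```python
import os
from collections import OrderedDict


def cache_limit_bytes_from_env(default_mb: int = 512) -> int:
    """Convert a megabyte setting into bytes (illustrative helper only)."""
    megabytes = int(os.environ.get("VLLM_XGRAMMAR_CACHE_MB", default_mb))
    return megabytes * 1024 * 1024


class BoundedCache:
    """Hypothetical byte-budgeted cache with least-recently-used eviction."""

    def __init__(self, limit_bytes: int):
        self.limit_bytes = limit_bytes
        self.used_bytes = 0
        self._entries = OrderedDict()  # oldest entry first

    def get(self, key: str):
        value = self._entries.get(key)
        if value is not None:
            # Mark the entry as most recently used.
            self._entries.move_to_end(key)
        return value

    def put(self, key: str, value: bytes) -> None:
        # Replace any existing entry so its size is not counted twice.
        if key in self._entries:
            self.used_bytes -= len(self._entries.pop(key))
        # Skip values that could never fit within the budget.
        if len(value) > self.limit_bytes:
            return
        self._entries[key] = value
        self.used_bytes += len(value)
        # Evict least recently used entries until the budget is respected.
        while self.used_bytes > self.limit_bytes:
            _, evicted = self._entries.popitem(last=False)
            self.used_bytes -= len(evicted)
```

With a budget in place, a flood of unique schemas evicts older entries instead of growing memory without bound; the cost is that legitimate grammars may need to be recompiled after eviction.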
AI Insight generated on May 20, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.
Affected packages
Versions sourced from the GitHub Security Advisory.
| Package | Affected versions | Patched versions |
|---|---|---|
| xgrammar (PyPI) | < 0.1.18 | 0.1.18 |
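The version range in the table can be checked against an installed package with a short script. This is a rough sketch using only the standard library; `parse_release`, `is_affected`, and `installed_xgrammar_is_affected` are hypothetical helpers, and the parser deliberately ignores pre-release and post-release suffixes, so a string such as `0.1.18rc1` is treated as `0.1.18`.

```python
from importlib.metadata import PackageNotFoundError, version

# First patched release according to the advisory table.
PATCHED = (0, 1, 18)


def parse_release(version_string: str) -> tuple:
    """Extract the leading numeric components, e.g. '0.1.17' -> (0, 1, 17)."""
    parts = []
    for piece in version_string.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if not digits:
            break  # stop at suffixes such as 'post1' or 'dev0'
        parts.append(int(digits))
    return tuple(parts)


def is_affected(version_string: str) -> bool:
    """Return True when the version falls in the affected range (< 0.1.18)."""
    return parse_release(version_string) < PATCHED


def installed_xgrammar_is_affected():
    """Return True/False for the installed package, or None if not installed."""
    try:
        return is_affected(version("xgrammar"))
    except PackageNotFoundError:
        return None
```

Calling `installed_xgrammar_is_affected()` in the environment that runs the inference server gives a quick indication of whether an upgrade is needed.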
Affected products: 1
Patches: 0 (no patches discovered yet)
References
- github.com/advisories/GHSA-389x-67px-mjg3 (ghsa, ADVISORY)
- nvd.nist.gov/vuln/detail/CVE-2025-32381 (ghsa, ADVISORY)
- github.com/mlc-ai/xgrammar/pull/243 (ghsa, x_refsource_MISC, WEB)
- github.com/mlc-ai/xgrammar/security/advisories/GHSA-389x-67px-mjg3 (ghsa, x_refsource_CONFIRM, WEB)
- github.com/vllm-project/vllm/pull/16283 (ghsa, x_refsource_MISC, WEB)