VYPR
Moderate severity · NVD Advisory · Published Apr 9, 2025 · Updated Apr 9, 2025

Denial of Service by abusing xgrammar unbounded cache in memory

CVE-2025-32381

Description

XGrammar is an open-source library for efficient, flexible, and portable structured generation. Prior to 0.1.18, XGrammar includes a cache for compiled grammars to increase performance with repeated use of the same grammar. This cache is held in memory. Since the cache is unbounded, a system making use of XGrammar can be abused to fill up a host's memory and cause a denial of service. For example, sending many small requests to an LLM inference server with unique JSON schemas would eventually cause this denial of service to occur. This vulnerability is fixed in 0.1.18.

AI Insight

LLM-synthesized narrative grounded in this CVE's description and references.

XGrammar 0.1.18 addresses an unbounded cache that allows an attacker to exhaust host memory, causing denial of service via repeated unique grammar requests.

Vulnerability Description

XGrammar is an open-source library for structured generation that caches compiled grammars to improve performance with repeated use. Prior to version 0.1.18, this cache had no size limit, meaning each unique grammar added a new entry that was retained indefinitely. The unbounded cache resides entirely in host memory, allowing an attacker to systematically fill available RAM by sending many small requests, each using a distinct grammar (e.g., unique JSON schemas), leading to a denial of service condition [1].
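
The growth pattern described above can be illustrated with a minimal sketch. This is not XGrammar's code: `compile_grammar`, `UnboundedGrammarCache`, and `unique_schema` are hypothetical names, and the "compiled" form is a stand-in byte string. The sketch only shows that a cache without eviction keeps one entry per distinct grammar, so repeated grammars cost nothing extra while unique ones accumulate.

```python
import json


def compile_grammar(schema_text):
    # Hypothetical stand-in for an expensive compilation step; the result
    # is simply made larger than the source text for illustration.
    return schema_text.encode("utf-8") * 100


class UnboundedGrammarCache:
    """Illustrative cache with no eviction: every unique key is kept forever."""

    def __init__(self):
        self._entries = {}
        self.misses = 0

    def get(self, schema):
        # The serialized schema acts as the cache key.
        key = json.dumps(schema, sort_keys=True)
        if key not in self._entries:
            self.misses += 1
            self._entries[key] = compile_grammar(key)
        return self._entries[key]

    def __len__(self):
        return len(self._entries)

    def retained_bytes(self):
        return sum(len(value) for value in self._entries.values())


def unique_schema(i):
    # Changing one property name is enough to produce a distinct cache key.
    return {"type": "object", "properties": {f"field_{i}": {"type": "string"}}}


cache = UnboundedGrammarCache()

# Repeated use of the same schema: the cache holds a single entry.
for _ in range(1000):
    cache.get(unique_schema(0))
repeated_entries = len(cache)

# Unique schemas: the cache grows by one entry per new schema.
for i in range(1000):
    cache.get(unique_schema(i))
unique_entries = len(cache)

print(repeated_entries, unique_entries, cache.retained_bytes())
```

Nothing in this pattern ever removes an entry, so retained memory is proportional to the number of distinct grammars seen over the lifetime of the process.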

Exploitation Scenario

The attack surface is any application that accepts arbitrary grammars from untrusted input and uses XGrammar to process them. A practical example is an LLM inference server where a user can submit multiple requests with unique JSON schemas; each request causes XGrammar to compile and cache the schema, eventually consuming all host memory [4]. No authentication or special privileges are needed beyond the ability to send such requests [1][3].
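
To gauge exposure, an operator can estimate how many distinct grammars it would take to fill memory. The figures below (16 GiB of free memory, 64 KiB retained per cached grammar, 100 requests per second) are assumptions chosen for illustration only; they are not measurements of XGrammar, and `requests_to_exhaust` is a hypothetical helper.

```python
GIB = 1024 ** 3
KIB = 1024


def requests_to_exhaust(free_memory_bytes, bytes_per_entry):
    """Smallest number of unique-grammar requests whose retained cache
    entries together exceed the given amount of free memory."""
    if bytes_per_entry <= 0:
        raise ValueError("bytes_per_entry must be positive")
    return free_memory_bytes // bytes_per_entry + 1


# Assumed values, for illustration only.
free_memory = 16 * GIB
entry_size = 64 * KIB
request_rate_per_second = 100

needed = requests_to_exhaust(free_memory, entry_size)
minutes = needed / request_rate_per_second / 60

print(f"{needed} unique grammars, about {minutes:.0f} minutes at the assumed rate")
```

Under these assumed numbers the cache would outgrow memory in well under an hour, which is why rate limiting alone is a weak defense when nothing is ever evicted.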

Impact

Successful exploitation results in memory exhaustion, causing the host system to become unresponsive or crash and denying service to legitimate users [1][4]. The issue is classified as a denial of service (DoS). NVD has not yet assigned a CVSS vector, and this advisory is listed as moderate severity.

Mitigation

The fix is implemented in XGrammar version 0.1.18, which introduces a configurable cache size limit. Downstream integrations, such as vLLM, have adopted this fix and added an environment variable (VLLM_XGRAMMAR_CACHE_MB) to control the cache size, defaulting to 512 MB [3]. Users should upgrade to XGrammar 0.1.18 or apply equivalent limits in their applications [1][4].
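
For applications that need an equivalent limit of their own, a size-bounded cache with least-recently-used eviction is one common approach. The sketch below is an assumption about how such a limit could be applied, not the implementation used by XGrammar or vLLM. `ByteBoundedLRUCache` and `cache_limit_from_env` are hypothetical names; only the variable name `VLLM_XGRAMMAR_CACHE_MB` and its 512 MB default come from the text above.

```python
import os
from collections import OrderedDict


class ByteBoundedLRUCache:
    """Illustrative cache that evicts least-recently-used entries once the
    total size of retained values exceeds a byte budget."""

    def __init__(self, limit_bytes):
        if limit_bytes <= 0:
            raise ValueError("limit_bytes must be positive")
        self._limit = limit_bytes
        self._entries = OrderedDict()  # key -> compiled bytes, oldest first
        self._used = 0

    def get_or_compile(self, key, compile_fn):
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        compiled = compile_fn(key)
        size = len(compiled)
        if size > self._limit:
            return compiled  # too large to cache: serve it without retaining it
        self._entries[key] = compiled
        self._used += size
        while self._used > self._limit:
            _, evicted = self._entries.popitem(last=False)  # drop the oldest entry
            self._used -= len(evicted)
        return compiled

    @property
    def used_bytes(self):
        return self._used

    def __len__(self):
        return len(self._entries)


def cache_limit_from_env(default_mb=512):
    # Illustrative parsing of the variable named above; this is not vLLM's code.
    return int(os.environ.get("VLLM_XGRAMMAR_CACHE_MB", default_mb)) * 1024 * 1024


# Small budget so the effect is visible: 1000 unique keys, bounded memory.
cache = ByteBoundedLRUCache(limit_bytes=10_000)
for i in range(1000):
    cache.get_or_compile(f"schema-{i}", lambda k: k.encode("utf-8") * 100)

print(len(cache), cache.used_bytes)
```

With a budget in place, a flood of unique grammars costs repeated compilation time instead of unbounded memory, so the cache can no longer be used to exhaust the host.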

AI Insight generated on May 20, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.

Affected packages

Versions sourced from the GitHub Security Advisory.

Package | Affected versions | Patched versions
xgrammar (PyPI) | < 0.1.18 | 0.1.18

Affected products

1

Patches

0

No patches discovered yet.

References

5
