VYPR
Moderate severity · NVD Advisory · Published May 19, 2025 · Updated May 19, 2025

Regular Expression Denial of Service (ReDoS) in huggingface/transformers

CVE-2025-2099

Description

A vulnerability in the preprocess_string() function of the transformers.testing_utils module in huggingface/transformers version v4.48.3 allows for a Regular Expression Denial of Service (ReDoS) attack. The regular expression used to process code blocks in docstrings contains nested quantifiers, leading to exponential backtracking when processing input with a large number of newline characters. An attacker can exploit this by providing a specially crafted payload, causing high CPU usage and potential application downtime, effectively resulting in a Denial of Service (DoS) scenario.

AI Insight

LLM-synthesized narrative grounded in this CVE's description and references.

CVE-2025-2099 describes a ReDoS vulnerability in Hugging Face Transformers v4.48.3's preprocess_string() function, exploitable via crafted input causing high CPU and potential DoS.

Vulnerability Analysis

The vulnerability is a Regular Expression Denial of Service (ReDoS) in the preprocess_string() function of the transformers.testing_utils module in Hugging Face Transformers version v4.48.3 [1][2]. The root cause is the regular expression used to process code blocks within docstrings: it nests a lazy quantifier inside a lazily repeated group, (?:.*?\n)*?, so a run of newlines can be divided among loop iterations in many different ways. When no closing code fence follows, the engine tries these alternatives one by one, and the backtracking grows exponentially with the number of newline characters [2][4]. The advisory database entry (PYSEC-2025-40) confirms the issue was present in this specific version [3].
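The backtracking behaviour described above can be observed with a small timing sketch. The pattern and flags are copied from the lines removed by the fix commit; the payload shape (an opened code block followed only by newlines and no closing fence) and the helper names make_payload and time_split are assumptions made for illustration, not the reporter's proof of concept.

```python
import re
import time

# Pattern and flags as they appear in the lines removed by the fix commit.
VULNERABLE_PATTERN = re.compile(
    r"(```(?:python|py)\s*\n\s*>>> )((?:.*?\n)*?.*?```)",
    flags=re.MULTILINE | re.DOTALL,
)


def make_payload(newlines: int) -> str:
    """Hypothetical payload: an opened code block, many newlines, no closing fence."""
    return "```python\n>>> " + "\n" * newlines


def time_split(newlines: int) -> float:
    """Return the seconds re.split spends on a payload with `newlines` newlines."""
    payload = make_payload(newlines)
    start = time.perf_counter()
    re.split(VULNERABLE_PATTERN, payload)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Keep the counts small: the running time is expected to grow very quickly.
    for n in (10, 12, 14, 16, 18):
        print(f"{n:>2} newlines: {time_split(n):.4f} s")
```

The counts are kept deliberately low; raising them by a few newlines at a time should be enough to see the trend without stalling the interpreter.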

Exploitation

An attacker can exploit this by supplying a specially crafted payload to any component that calls preprocess_string(). Because the function is part of the testing utilities, exploitation likely requires that the application process docstrings or test inputs containing a malicious string with numerous newlines. No authentication requirement is mentioned, so the attack surface could be broad wherever untrusted input reaches the function [2][4]. The attacker needs no access beyond the ability to deliver the payload.

Impact

Successful exploitation causes high CPU usage and potential application downtime, resulting in a Denial of Service (DoS) scenario [2]. No data confidentiality or integrity impact is described; the attack is purely on availability. The CVSS score is not provided in the references, but the advisory is rated Moderate severity because of the potential for resource exhaustion [2].

Mitigation

A fix was merged via Pull Request #36648, which cleans up the regex by removing overly complex non-capturing groups, removing unnecessary flags, and simplifying the pattern to avoid exponential runtime [4]. Users are advised to update to a fixed version (4.50.0 or later, according to the GitHub Security Advisory data listed under Affected packages) or apply the patch manually. There is no mention of the vulnerability being exploited in the wild or being added to CISA's KEV list.

AI Insight generated on May 20, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.

Affected packages

Versions sourced from the GitHub Security Advisory.

Package | Affected versions | Patched versions
transformers (PyPI) | < 4.50.0 | 4.50.0
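A quick way to compare an installed copy against the version range above is sketched below. The helpers parse_release and is_affected are hypothetical names introduced here, and the parser is a deliberate simplification: it reads only the leading numeric X.Y.Z part and ignores pre-release or development suffixes.

```python
from importlib.metadata import PackageNotFoundError, version

PATCHED = (4, 50, 0)  # first patched release according to the advisory data


def parse_release(text: str) -> tuple:
    """Parse the leading numeric 'X.Y.Z' part of a version string (simplified)."""
    parts = []
    for piece in text.lstrip("v").split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_affected(installed: str) -> bool:
    """True when the version falls in the affected range (< 4.50.0)."""
    return parse_release(installed) < PATCHED


if __name__ == "__main__":
    try:
        found = version("transformers")
    except PackageNotFoundError:
        print("transformers is not installed in this environment")
    else:
        status = "affected" if is_affected(found) else "not affected"
        print(f"transformers {found}: {status}")
```

For strict handling of pre-release versions, a full version parser would be a better fit than this tuple comparison.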

Affected products

3

Patches

1
8cb522b4190b

Cleanup the regex used for doc preprocessing (#36648)

1 file changed · +2 -2
  • src/transformers/testing_utils.py (+2 -2, modified)
    @@ -2732,8 +2732,8 @@ def preprocess_string(string, skip_cuda_tests):
         cuda stuff is detective (with a heuristic), this method will return an empty string so no doctest will be run for
         `string`.
         """
    -    codeblock_pattern = r"(```(?:python|py)\s*\n\s*>>> )((?:.*?\n)*?.*?```)"
    -    codeblocks = re.split(re.compile(codeblock_pattern, flags=re.MULTILINE | re.DOTALL), string)
    +    codeblock_pattern = r"(```(?:python|py)\s*\n\s*>>> )(.*?```)"
    +    codeblocks = re.split(codeblock_pattern, string, flags=re.DOTALL)
         is_cuda_found = False
         for i, codeblock in enumerate(codeblocks):
             if "load_dataset(" in codeblock and "# doctest: +IGNORE_RESULT" not in codeblock:
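As a rough check on the change shown in this diff, the sketch below runs the removed and the added pattern over the same inputs. Both patterns and their flags are taken from the diff; the sample docstring, the unterminated input, and the helper names split_old and split_new are assumptions made for illustration.

```python
import re

# Removed lines: nested lazy quantifiers, compiled with MULTILINE | DOTALL.
OLD_PATTERN = re.compile(
    r"(```(?:python|py)\s*\n\s*>>> )((?:.*?\n)*?.*?```)",
    flags=re.MULTILINE | re.DOTALL,
)
# Added lines: a single lazy quantifier, DOTALL only.
NEW_PATTERN = r"(```(?:python|py)\s*\n\s*>>> )(.*?```)"


def split_old(text: str) -> list:
    return re.split(OLD_PATTERN, text)


def split_new(text: str) -> list:
    return re.split(NEW_PATTERN, text, flags=re.DOTALL)


# Hypothetical docstring fragment containing one well-formed code block.
SAMPLE = "Example:\n\n```python\n>>> x = 1\n>>> print(x)\n1\n```\n\nDone.\n"

# Hypothetical unterminated block: opening fence, a few newlines, no closing fence.
UNTERMINATED = "```python\n>>> " + "\n" * 8

if __name__ == "__main__":
    print(split_old(SAMPLE) == split_new(SAMPLE))
    print(split_new(UNTERMINATED) == [UNTERMINATED])
```

Dropping re.MULTILINE should not change matching here, since that flag only alters the behaviour of ^ and $, and the pattern uses neither. With re.DOTALL set, .*? already crosses newlines, which is why the inner (?:.*?\n)*? group was redundant.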
    

Vulnerability mechanics

Generated on May 9, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.

References

6
