VYPR
Moderate severity · NVD Advisory · Published Sep 14, 2025 · Updated Sep 15, 2025

Regular Expression Denial of Service (ReDoS) in huggingface/transformers

CVE-2025-6051

Description

A Regular Expression Denial of Service (ReDoS) vulnerability was discovered in the Hugging Face Transformers library, specifically within the normalize_numbers() method of the EnglishNormalizer class. This vulnerability affects versions up to 4.52.4 and is fixed in version 4.53.0. The issue arises from the method's handling of numeric strings, which can be exploited using crafted input strings containing long sequences of digits, leading to excessive CPU consumption. This vulnerability impacts text-to-speech and number normalization tasks, potentially causing service disruption, resource exhaustion, and API vulnerabilities.

AI Insight

LLM-synthesized narrative grounded in this CVE's description and references.

ReDoS in Hugging Face Transformers EnglishNormalizer allows crafted digit strings to cause CPU exhaustion, fixed in 4.53.0.

Vulnerability Overview

CVE-2025-6051 is a Regular Expression Denial of Service (ReDoS) vulnerability in the Hugging Face transformers library, specifically within the normalize_numbers() method of the EnglishNormalizer class [1][2]. The issue affects versions up to and including 4.52.4. The root cause is the use of backtracking-prone regular expressions with greedy quantifiers (e.g., [0-9]+(st|nd|rd|th) and \$([0-9\.\,]*[0-9]+)), which can backtrack excessively when processing crafted input strings containing long sequences of digits [3][4]. The fix rewrites these patterns, making the digit runs possessive ([0-9]++) where the pattern continues past them, and imports the standard re module on Python 3.11+, the first version whose re supports atomic grouping and possessive quantifiers, falling back to the third-party regex library on older versions [3][4].
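The difference between a greedy quantifier and a possessive one such as [0-9]++ can be illustrated with a minimal sketch. The conditional import mirrors the one in the patch; the extra fallback to plain re, the POSSESSIVE_OK flag, and the matches_fully helper are assumptions added here so the snippet still runs where the regex package is missing.

```python
import sys

# Mirror the patch's conditional import. The except branch is an addition for
# this sketch: it falls back to plain `re` (no possessive quantifiers) when
# the third-party `regex` package is missing on an older interpreter.
try:
    if sys.version_info >= (3, 11):
        import re
    else:
        import regex as re  # third-party package
    POSSESSIVE_OK = True
except ImportError:
    import re
    POSSESSIVE_OK = False


def matches_fully(pattern: str, text: str) -> bool:
    """Return True if `pattern` matches all of `text` (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None


# Greedy: `[0-9]+` first takes all five digits, then gives one back so that
# the trailing `[0-9]` can match.
greedy_result = matches_fully(r"[0-9]+[0-9]", "12345")

if POSSESSIVE_OK:
    # Possessive: `[0-9]++` never gives digits back, so the trailing `[0-9]`
    # has nothing left to match and the overall match fails.
    possessive_result = matches_fully(r"[0-9]++[0-9]", "12345")
    # The post-fix ordinal pattern from the diff still matches ordinary input.
    ordinal_match = re.search(r"[0-9]++(st|nd|rd|th)", "the 21st item")
```

Because a possessive run is never re-entered, the engine does not retry shorter digit prefixes when the rest of the pattern fails, which is the backtracking the patch removes.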

Exploitation Context

An attacker can trigger the ReDoS by providing a specially crafted numeric string, such as a long sequence of digits with interleaved commas or decimal points, to any application that invokes the normalize_numbers() method. The attack does not require authentication or special privileges beyond the ability to send input to an affected endpoint. The vulnerability is particularly relevant for services performing text-to-speech or number normalization tasks, where user-supplied text is processed [2]. Because the EnglishNormalizer is part of the widely used transformers library, any downstream service or API that uses this normalizer on untrusted input is potentially vulnerable.

Impact

Successful exploitation leads to excessive CPU consumption, which can result in service disruption, resource exhaustion, and potential denial-of-service for other users or processes sharing the same compute resources. The advisory lists the vulnerability as moderate severity, and the potential for remote exploitation with low complexity makes it a significant concern for online services [2].

Mitigation

The Hugging Face team has addressed the vulnerability in transformers version 4.53.0 by rewriting the vulnerable regular expression patterns to use possessive quantifiers (for example, [0-9]++), which the standard re module supports from Python 3.11 and the regex library supports on older versions, to prevent catastrophic backtracking [3][4]. All users are strongly recommended to upgrade to version 4.53.0 or later. Applications that cannot immediately upgrade should consider sanitizing or limiting the length of numeric input strings passed to normalize_numbers() as a temporary workaround.
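The length-limiting workaround could look something like the following sketch. The function names, the set of characters treated as numeric, and both limits are illustrative assumptions, not part of the transformers API; suitable thresholds depend on the application.

```python
# Characters the number patterns in the diff operate on (digits, '.', ',').
NUMERIC_CHARS = frozenset("0123456789.,")


def longest_numeric_run(text: str) -> int:
    """Length of the longest unbroken run of numeric characters in `text`."""
    longest = current = 0
    for ch in text:
        if ch in NUMERIC_CHARS:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


def check_numeric_input(text: str, max_run: int = 64, max_len: int = 10_000) -> str:
    """Reject input that is too long overall or contains an oversized numeric run.

    The default limits are arbitrary placeholders (assumptions), not values
    recommended by the advisory.
    """
    if len(text) > max_len:
        raise ValueError(f"input longer than {max_len} characters")
    if longest_numeric_run(text) > max_run:
        raise ValueError(f"numeric run longer than {max_run} characters")
    return text
```

A caller would pass untrusted text through check_numeric_input() before handing it to the normalizer, and treat a ValueError as a rejected request. The scan deliberately avoids regular expressions so the guard itself cannot backtrack.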

AI Insight generated on May 19, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.

Affected packages

Versions sourced from the GitHub Security Advisory.

Package               Affected versions   Patched versions
transformers (PyPI)   < 4.53.0            4.53.0
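A deployment can check whether its installed copy falls inside the affected range listed above. The sketch below uses only the standard library; the helper names are hypothetical, and the simplified parser ignores pre-release suffixes, so a build such as 4.53.0.dev0 would be reported as patched even though it may predate the fix.

```python
from importlib import metadata

PATCHED_RELEASE = (4, 53, 0)  # first patched version per the advisory


def release_tuple(version: str) -> tuple:
    """Parse the leading numeric components of a version string.

    Simplified on purpose: '4.53.0rc1' becomes (4, 53, 0), so pre-release
    ordering is not respected.
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad so that '4.53' compares equal to '4.53.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_patched(version: str) -> bool:
    """True if `version` is at or above the first patched release."""
    return release_tuple(version) >= PATCHED_RELEASE


def installed_transformers_is_patched():
    """Return True/False for the installed package, or None if not installed."""
    try:
        return is_patched(metadata.version("transformers"))
    except metadata.PackageNotFoundError:
        return None
```

For strict PEP 440 ordering, a dedicated version-parsing library would be a better fit than this tuple comparison.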

Patches

54a02160eb03

Fix ReDOS in tokenizer digit substitution (#38844)

1 file changed · +14 -7
  • src/transformers/models/clvp/number_normalizer.py +14 -7 modified
    @@ -15,7 +15,14 @@
     
     """English Normalizer class for CLVP."""
     
    -import re
    +import sys
    +
    +
    +if sys.version_info >= (3, 11):
    +    # Atomic grouping support was only added to the core RE in Python 3.11
    +    import re
    +else:
    +    import regex as re
     
     
     class EnglishNormalizer:
    @@ -199,12 +206,12 @@ def normalize_numbers(self, text: str) -> str:
             This method is used to normalize numbers within a text such as converting the numbers to words, removing
             commas, etc.
             """
    -        text = re.sub(re.compile(r"([0-9][0-9\,]+[0-9])"), self._remove_commas, text)
    -        text = re.sub(re.compile(r"£([0-9\,]*[0-9]+)"), r"\1 pounds", text)
    -        text = re.sub(re.compile(r"\$([0-9\.\,]*[0-9]+)"), self._expand_dollars, text)
    -        text = re.sub(re.compile(r"([0-9]+\.[0-9]+)"), self._expand_decimal_point, text)
    -        text = re.sub(re.compile(r"[0-9]+(st|nd|rd|th)"), self._expand_ordinal, text)
    -        text = re.sub(re.compile(r"[0-9]+"), self._expand_number, text)
    +        text = re.sub(r"([0-9][0-9,]+[0-9])", self._remove_commas, text)
    +        text = re.sub(r"£([0-9,]*[0-9])", r"\1 pounds", text)
    +        text = re.sub(r"\$([0-9.,]*[0-9])", self._expand_dollars, text)
    +        text = re.sub(r"([0-9]++\.[0-9]+)", self._expand_decimal_point, text)
    +        text = re.sub(r"[0-9]++(st|nd|rd|th)", self._expand_ordinal, text)
    +        text = re.sub(r"[0-9]+", self._expand_number, text)
             return text
     
         def expand_abbreviations(self, text: str) -> str:
    
ba8eaba98656

Import regex/re correctly

1 file changed · +14 -7
  • src/transformers/models/clvp/number_normalizer.py +14 -7 modified
    @@ -15,7 +15,14 @@
     
     """English Normalizer class for CLVP."""
     
    -import regex as re
    +import sys
    +
    +
    +if sys.version_info >= (3, 11):
    +    # Atomic grouping support was only added to the core RE in Python 3.11
    +    import re
    +else:
    +    import regex as re
     
     
     class EnglishNormalizer:
    @@ -199,12 +206,12 @@ def normalize_numbers(self, text: str) -> str:
             This method is used to normalize numbers within a text such as converting the numbers to words, removing
             commas, etc.
             """
    -        text = re.sub(re.compile(r"([0-9][0-9\,]+[0-9])"), self._remove_commas, text)
    -        text = re.sub(re.compile(r"£([0-9\,]*[0-9])"), r"\1 pounds", text)
    -        text = re.sub(re.compile(r"\$([0-9\.\,]*[0-9])"), self._expand_dollars, text)
    -        text = re.sub(re.compile(r"([0-9]++\.[0-9]+)"), self._expand_decimal_point, text)
    -        text = re.sub(re.compile(r"[0-9]++(st|nd|rd|th)"), self._expand_ordinal, text)
    -        text = re.sub(re.compile(r"[0-9]+"), self._expand_number, text)
    +        text = re.sub(r"([0-9][0-9,]+[0-9])", self._remove_commas, text)
    +        text = re.sub(r"£([0-9,]*[0-9])", r"\1 pounds", text)
    +        text = re.sub(r"\$([0-9.,]*[0-9])", self._expand_dollars, text)
    +        text = re.sub(r"([0-9]++\.[0-9]+)", self._expand_decimal_point, text)
    +        text = re.sub(r"[0-9]++(st|nd|rd|th)", self._expand_ordinal, text)
    +        text = re.sub(r"[0-9]+", self._expand_number, text)
             return text
     
         def expand_abbreviations(self, text: str) -> str:
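
The first two substitutions in the patched normalize_numbers() need no possessive quantifiers, so they can be tried in isolation on any Python version. This is a minimal sketch: the patterns are copied from the diff, while remove_commas is a stand-in written for this example on the assumption that the library's _remove_commas callback strips commas from the matched number.

```python
import re


def remove_commas(match: re.Match) -> str:
    """Stand-in for self._remove_commas (assumed behavior: drop the commas)."""
    return match.group(1).replace(",", "")


def normalize_commas_and_pounds(text: str) -> str:
    """Apply the first two post-fix substitutions from the diff, in order."""
    # Strip thousands separators from numbers such as "1,234".
    text = re.sub(r"([0-9][0-9,]+[0-9])", remove_commas, text)
    # Rewrite a pound amount as "<number> pounds".
    text = re.sub(r"£([0-9,]*[0-9])", r"\1 pounds", text)
    return text


print(normalize_commas_and_pounds("It cost £1,234"))  # It cost 1234 pounds
```

Note that the patched pound pattern ends in a single [0-9] rather than [0-9]+, so the trailing digit class no longer competes with the preceding [0-9,]* for the same characters.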
    

Vulnerability mechanics

Generated on May 9, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.
