Regular Expression Denial of Service (ReDoS) in huggingface/transformers
Description
A Regular Expression Denial of Service (ReDoS) vulnerability was identified in the huggingface/transformers library, specifically in the file tokenization_nougat_fast.py. The vulnerability occurs in the post_process_single() function, where a regular expression processes specially crafted input. The regex exhibits exponential time complexity under certain conditions, leading to excessive backtracking. This can drive CPU usage high enough to cause application downtime, effectively creating a Denial of Service (DoS) scenario. The affected version is v4.46.3 (the latest release at the time of the report).
AI Insight
LLM-synthesized narrative grounded in this CVE's description and references.
A ReDoS vulnerability in the post_process_single() function of huggingface/transformers, affecting v4.46.3, allows crafted input to cause excessive backtracking and CPU exhaustion.
Vulnerability Description
A Regular Expression Denial of Service (ReDoS) vulnerability has been identified in the Hugging Face Transformers library, specifically in the file tokenization_nougat_fast.py. The issue resides in the post_process_single() function, where a regular expression processes specially crafted input, causing exponential backtracking under certain conditions. This results in very high CPU usage and potential application downtime, effectively creating a denial-of-service (DoS) scenario. The affected version is v4.46.3 (latest at the time of disclosure) [1][2].
Exploitation and Impact
An attacker can exploit this vulnerability by providing crafted text input to any application or service using the affected regex in post_process_single(). No authentication is required beyond the ability to send input to the vulnerable function. The exponential backtracking consumes excessive CPU resources, leading to service degradation or complete unavailability. The impact is limited to availability, with no confidentiality or integrity compromise.
Mitigation
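The vulnerability has been patched in a commit that limits regex backtracking by simplifying the pattern [3]. The fix replaces a pattern with nested quantifiers (^#+ (?:\.?(?:\d|[ixv])+)*\s*) with a safer alternative (^#+ (?:[\d+\.]+|[ixv\.]+)?\s*). In the old pattern the inner + sits inside an outer *, so when the rest of the match fails, a long run of digits can be split between the two quantifiers in exponentially many ways. Users should update to a version of Transformers that includes this fix (4.48.0 or later, per the GitHub Security Advisory) or apply the patch manually. The official advisory from huntr.com also confirms the vulnerability [4].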
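The difference between the two patterns can be observed with a short timing sketch. The two regular expressions are copied from the fix commit's diff; the payload shape (a heading made of a run of digits followed by a character that prevents the match from completing) and the helper names are illustrative assumptions, not taken from the advisory. The sizes are kept small on purpose, because the old pattern's running time grows very quickly with the length of the digit run.

```python
import re
import time

# Patterns copied from the fix commit's diff (before and after the patch).
VULNERABLE = re.compile(r"^#+ (?:\.?(?:\d|[ixv])+)*\s*(?:$|\n\s*)", flags=re.M)
PATCHED = re.compile(r"^#+ (?:[\d+\.]+|[ixv\.]+)?\s*(?:$|\n\s*)", flags=re.M)


def strip_headings(pattern: re.Pattern, text: str) -> str:
    """Remove numbered-heading lines, mirroring the re.sub() call in the diff."""
    return pattern.sub("", text)


def crafted_input(n: int) -> str:
    """Hypothetical payload: '# ' plus n digits, then a character that makes
    the tail of the pattern fail and forces the engine to backtrack."""
    return "# " + "1" * n + "!"


def time_sub(pattern: re.Pattern, text: str) -> float:
    """Return the wall-clock seconds spent in a single substitution."""
    start = time.perf_counter()
    pattern.sub("", text)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Small sizes only: each extra digit makes the old pattern noticeably slower.
    for n in (10, 14, 18):
        payload = crafted_input(n)
        old = time_sub(VULNERABLE, payload)
        new = time_sub(PATCHED, payload)
        print(f"n={n:2d}  old={old:.4f}s  patched={new:.6f}s")
```

On ordinary headings such as "# 1.2" or "## iv" both patterns remove the same text, so the sketch only illustrates the change in backtracking behaviour, not a change in output.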
- [1] GitHub - huggingface/transformers: 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
- [2] NVD - CVE-2024-12720
- [3] 🚨🚨🚨 Limit backtracking in Nougat regexp (#35264) · huggingface/transformers@deac971
- [4] huntr.com - The world’s first bug bounty platform for AI/ML
AI Insight generated on May 20, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.
Affected packages
Versions sourced from the GitHub Security Advisory.
| Package | Affected versions | Patched versions |
|---|---|---|
| transformers (PyPI) | < 4.48.0 | 4.48.0 |
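A quick way to check an environment against this range is sketched below. It assumes only the boundary listed in the table (patched in 4.48.0) and uses the standard-library importlib.metadata to read the installed version; the helper names are hypothetical. The parser deliberately looks only at the leading numeric components, so pre-release suffixes such as ".dev0" are ignored. A dedicated version-parsing library would handle those cases more rigorously.

```python
import re
from importlib import metadata

# First patched release according to the affected-packages table.
PATCHED_VERSION = (4, 48, 0)


def parse_release(version: str) -> tuple:
    """Extract (major, minor, patch) from the start of a version string.

    Simplification: anything after the numeric components (e.g. '.dev0',
    'rc1') is ignored, and a missing patch number is treated as 0.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def is_affected(version: str) -> bool:
    """Return True if the version falls in the '< 4.48.0' range."""
    return parse_release(version) < PATCHED_VERSION


def installed_transformers_is_affected():
    """Check the installed transformers package; None if it is not installed."""
    try:
        return is_affected(metadata.version("transformers"))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("installed transformers affected:", installed_transformers_is_affected())
```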
Affected products
- Range: = v4.46.3
- osv-coords (2 versions):
- (no CPE) range: < 0.9.0-r4
- (no CPE) range: < 4.48.0
- huggingface/huggingface/transformers (v5), range: unspecified
Patches
deac971c469b: 🚨🚨🚨 Limit backtracking in Nougat regexp (#35264)
1 file changed · +1 −1
src/transformers/models/nougat/tokenization_nougat_fast.py (+1 −1, modified)

    @@ -514,7 +514,7 @@ def post_process_single(self, generation: str, fix_markdown: bool = True) -> str
             generation = generation.replace("\n* [leftmargin=*]\n", "\n")
             # Remove lines with markdown headings starting with #, with numerals,
             # and possibly roman numerals with trailing spaces and newlines
    -        generation = re.sub(r"^#+ (?:\.?(?:\d|[ixv])+)*\s*(?:$|\n\s*)", "", generation, flags=re.M)
    +        generation = re.sub(r"^#+ (?:[\d+\.]+|[ixv\.]+)?\s*(?:$|\n\s*)", "", generation, flags=re.M)
             # most likely hallucinated titles
             lines = generation.split("\n")
             if lines[-1].startswith("#") and lines[-1].lstrip("#").startswith(" ") and len(lines) > 1:
Vulnerability mechanics
Generated on May 9, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.