CVE-2026-8829
Description
HTML::Entities versions before 3.84 contain a heap-use-after-free vulnerability in _decode_entities, potentially leading to information disclosure.
AI Insight
LLM-synthesized narrative grounded in this CVE's description and references.
Vulnerability
The HTML::Entities Perl module, in versions prior to 3.84, contains a heap-use-after-free vulnerability in the _decode_entities XS routine. It occurs when the input SV (scalar value) passed to _decode_entities is the same SV that is stored in the entity hash as a self-referential value, meaning the value references its own key. The routine caches a pointer (repl) to that value's string data, which is the input SV's own PV buffer. If grow_gap() then reallocates the PV buffer, repl is left pointing to freed memory.
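The hazard can be illustrated outside Perl with a minimal plain-C sketch. `Buf`, `buf_init`, `buf_grow`, and `cached_pointer_goes_stale` are hypothetical names for this illustration, not Perl's SV API. Growth is implemented here as allocate, copy, free, so the buffer is guaranteed to move, whereas a real `realloc()` may or may not move it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an SV's PV buffer; not Perl's real struct. */
typedef struct {
    char  *pv;
    size_t len;
    size_t cap;
} Buf;

/* Initialise the buffer with a copy of `text`. */
void buf_init(Buf *b, const char *text) {
    b->len = strlen(text);
    b->cap = b->len + 1;
    b->pv = malloc(b->cap);
    assert(b->pv != NULL);
    memcpy(b->pv, text, b->cap);
}

/* Grow by moving to fresh storage so that relocation is deterministic. */
void buf_grow(Buf *b, size_t extra) {
    char *fresh = malloc(b->cap + extra);
    assert(fresh != NULL);
    memcpy(fresh, b->pv, b->len + 1);
    free(b->pv);              /* any pointer into the old block is now dangling */
    b->pv = fresh;
    b->cap += extra;
}

/* Returns 1 if an address cached before growth no longer matches the buffer. */
int cached_pointer_goes_stale(void) {
    Buf b;
    buf_init(&b, "prefix &foo; suffix");

    /* Like repl in the self-aliased case: the start of the input's own data.
     * Only the numeric address is kept; it is never dereferenced. */
    uintptr_t cached = (uintptr_t)b.pv;

    buf_grow(&b, 4096);

    int stale = (cached != (uintptr_t)b.pv);
    int data_intact = (b.pv[7] == '&');   /* contents survived the move */
    free(b.pv);
    return stale && data_intact;
}
```

Reading through the cached address after `buf_grow()` would be the use-after-free; the sketch only compares addresses to show that the cached value no longer refers to the live buffer.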
Exploitation
An attacker can trigger this vulnerability by providing specially crafted input to HTML::Entities::_decode_entities. The input must be structured such that the string being decoded is also present as a self-referential value within the entity hash. The payload must also be large enough to force the grow_gap() function to reallocate the SV's PV buffer, which is necessary to expose the use-after-free condition. No specific network position, authentication, or user interaction is mentioned as required.
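As a worked example of the size requirement, the sketch below computes how much the input must grow when a single self-referential reference is expanded. The function names are hypothetical, and the sizes in the usage note are taken from the regression test in the patch. Whether a given growth actually triggers a reallocation depends on the spare capacity of the SV's buffer, which this arithmetic does not model.

```c
#include <assert.h>
#include <stddef.h>

/* Length of the decoded string when the input is
 *   prefix . "&name;" . suffix
 * and the entity value is the whole input (self-referential).
 * `ref` is the byte length of the reference, e.g. 5 for "&foo;". */
size_t decoded_len(size_t prefix, size_t ref, size_t suffix) {
    assert(ref >= 3);                         /* shortest form is "&x;" */
    size_t value = prefix + ref + suffix;     /* the value is the input itself */
    return prefix + value + suffix;           /* reference replaced by value */
}

/* Number of extra bytes the buffer must gain to hold the decoded string. */
size_t growth_needed(size_t prefix, size_t ref, size_t suffix) {
    return decoded_len(prefix, ref, suffix) - (prefix + ref + suffix);
}
```

For the test payload of 32 `A`s, `&foo;`, and 8192 `B`s, `decoded_len(32, 5, 8192)` is 16453 bytes (64 `A`s, `&foo;`, 16384 `B`s) and `growth_needed(32, 5, 8192)` is 8224 bytes.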
Impact
When the vulnerability is triggered, the _decode_entities routine copies data from the freed allocation. This read may disclose adjacent heap contents into the destination SV. What is disclosed, and how sensitive it is, depends on the contents of the heap at that moment.
Mitigation
The vulnerability is fixed in HTML::Entities version 3.84. Users are strongly advised to upgrade to this version or later. The commit that addresses this issue is available at [1] and the pull request is at [2]. No workarounds are described in the available references.
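A version check against the fixed release could look like the sketch below. `is_affected` is a hypothetical helper, and it assumes the installed version is available as a plain decimal string such as "3.83"; other version formats are not handled.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns 1 if `version` is below the fixed release 3.84,
 * 0 if it is 3.84 or later, and -1 if no number could be parsed.
 * Assumes a plain decimal version string and the default "C" locale. */
int is_affected(const char *version) {
    assert(version != NULL);
    char *end = NULL;
    double v = strtod(version, &end);
    if (end == version) {
        return -1;            /* no leading number in the string */
    }
    return v < 3.84;
}
```

For example, `is_affected("3.83")` reports an affected version, while `is_affected("3.84")` does not.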
AI Insight generated on Jun 4, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.
Affected products
- (no CPE)
- (no CPE), range: <3.84
- Range: <3.84
Patches
26922552b0778 Fix heap-use-after-free in _decode_entities (CVE-2026-8829)
2 files changed · +43 −5
t/entities.t +18 −2 modified
@@ -2,8 +2,9 @@
 use strict;
 use warnings;
 use utf8;

-use HTML::Entities qw(decode_entities encode_entities encode_entities_numeric);
-use Test::More tests => 31;
+use HTML::Entities
+    qw(_decode_entities decode_entities encode_entities encode_entities_numeric);
+use Test::More tests => 32;

 my $x = "Våre norske tegn bør æres";

@@ -96,6 +97,21 @@ is($x, $ent);
     is($got, (values %hash)[0], "decode_entities() decodes a key properly");
 }

+# CVE-2026-8829
+# _decode_entities heap-use-after-free when the input SV is the same SV as
+# a self-referential entity value. The payload must be large enough to
+# force grow_gap() to realloc the SV's PV; the fix copies the entity value
+# into an owned buffer so repl is not left pointing at the freed allocation.
+{
+    my $prefix_a = "A" x 32;
+    my $suffix_b = "B" x 8192;
+    my %h;
+    $h{foo} = $prefix_a . "&foo;" . $suffix_b;
+    _decode_entities($h{foo}, \%h);
+    is($h{foo}, ("A" x 64) . "&foo;" . ("B" x 16384),
+        "_decode_entities() with self-aliased entity hash value");
+}
+
 # From: Bill Simpson-Young <bill.simpson-young@cmis.csiro.au>
 # Subject: HTML entities problem with 5.11
 # To: libwww-perl@ics.uci.edu
util.c +25 −3 modified
@@ -72,6 +72,7 @@ decode_entities(pTHX_ SV* sv, HV* entity2char, bool expand_prefix)

     char *repl;
     STRLEN repl_len;
+    char *repl_allocated = 0;
     char buf[UTF8_MAXLEN];
     int repl_utf8;
     int high_surrogate = 0;
@@ -89,6 +90,7 @@ decode_entities(pTHX_ SV* sv, HV* entity2char, bool expand_prefix)

         ent_start = s;
         repl = 0;
+        repl_allocated = 0;

         if (s < end && *s == '#') {
             UV num = 0;
@@ -176,16 +178,34 @@ decode_entities(pTHX_ SV* sv, HV* entity2char, bool expand_prefix)
                      (*s == ';' && (svp = hv_fetch(entity2char, ent_name, s - ent_name + 1, 0)))
                    )
                 {
-                    repl = SvPV(*svp, repl_len);
+                    char *src = SvPV(*svp, repl_len);
                     repl_utf8 = SvUTF8(*svp);
+                    if ((SV*)*svp == sv) {
+                        /* Self-aliased: hash entry SV == input SV.
+                         * grow_gap() may realloc sv's PV later; copy
+                         * the entity value into an owned buffer first.
+                         * Freed by the repl_allocated cleanup below. */
+                        Newx(repl_allocated, repl_len ? repl_len : 1, char);
+                        Copy(src, repl_allocated, repl_len, char);
+                        repl = repl_allocated;
+                    } else {
+                        repl = src;
+                    }
                 }
                 else if (expand_prefix) {
                     char *ss = s - 1;
                     while (ss > ent_name) {
                         svp = hv_fetch(entity2char, ent_name, ss - ent_name, 0);
                         if (svp) {
-                            repl = SvPV(*svp, repl_len);
+                            char *src = SvPV(*svp, repl_len);
                             repl_utf8 = SvUTF8(*svp);
+                            if ((SV*)*svp == sv) {
+                                Newx(repl_allocated, repl_len ? repl_len : 1, char);
+                                Copy(src, repl_allocated, repl_len, char);
+                                repl = repl_allocated;
+                            } else {
+                                repl = src;
+                            }
                             s = ss;
                             break;
                         }
@@ -197,7 +217,9 @@ decode_entities(pTHX_ SV* sv, HV* entity2char, bool expand_prefix)
         }

         if (repl) {
-            char *repl_allocated = 0;
+            /* repl_allocated is now function-scoped; set by the
+             * named-entity self-alias path above or by the UTF8 mismatch
+             * branch below. Same cleanup in either case. */
             if (s < end && *s == ';')
                 s++;
             t--;  /* '&' already copied, undo it */
dc20dd38ccf4 Update Changes
1 file changed · +2 −0
Changes +2 −0 modified
@@ -1,6 +1,8 @@
 Change history for HTML-Parser

 {{$NEXT}}
+    - Fix heap-use-after-free in _decode_entities (CVE-2026-8829) (GH#56)
+      (Paul Johnson)

 3.83      2024-07-30
     - fix '$\/]' in HTML::Entities::encode_entities (GH#45) (mauke)
Vulnerability mechanics
Root cause
"The XS routine for HTML::Entities::_decode_entities cached a pointer to freed memory when processing self-referential entity values."
Attack vector
An attacker can trigger this vulnerability by providing input to `HTML::Entities::_decode_entities` where an entity value in the `entity2char` hash is the same SV as the input SV. If this self-referential value contains its own key as an entity reference, and the input SV is large enough to cause `grow_gap()` to reallocate its buffer, the cached pointer will become dangling. This allows an attacker to read freed heap memory, potentially disclosing adjacent heap contents into the destination SV [ref_id=1].
Affected code
The vulnerability exists within the `decode_entities` function in `util.c`. Specifically, the logic handling entity lookups and the subsequent use of the `repl` pointer are affected. The patch modifies this function to add a check for self-aliased SVs and to allocate a separate buffer when necessary [patch_id=4748850].
What the fix does
The patch modifies the `decode_entities` function to detect when the hash entry SV aliases the input SV. In such cases, the entity value is copied into a newly allocated buffer (`repl_allocated`) before being used. This ensures that even if `grow_gap()` reallocates the input SV's buffer, the `repl` pointer will still point to valid, owned memory, preventing a use-after-free condition [patch_id=4748850].
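The copy-before-grow idea can be sketched in plain C without the Perl API. `Str`, `make_payload`, and `splice_value` are hypothetical names for this illustration; `malloc`/`memcpy` stand in for `Newx`/`Copy`, and the pointer comparison `value == in` stands in for the patch's `(SV*)*svp == sv` check. This is a simplified model of one substitution, not the actual `decode_entities` loop.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an SV's string buffer; not Perl's real struct. */
typedef struct {
    char  *pv;
    size_t len;
} Str;

/* Build "A" x n_a . "&foo;" . "B" x n_b, mirroring the regression test. */
Str make_payload(size_t n_a, size_t n_b) {
    Str s;
    s.len = n_a + 5 + n_b;
    s.pv = malloc(s.len + 1);
    assert(s.pv != NULL);
    memset(s.pv, 'A', n_a);
    memcpy(s.pv + n_a, "&foo;", 5);
    memset(s.pv + n_a + 5, 'B', n_b);
    s.pv[s.len] = '\0';
    return s;
}

/* Replace ref_len bytes at offset `at` of `in` with the bytes of `value`.
 * `value` may be the same object as `in` (the self-aliased case). */
void splice_value(Str *in, size_t at, size_t ref_len, const Str *value) {
    const char *repl = value->pv;
    size_t repl_len = value->len;
    char *owned = NULL;

    if (value == in) {
        /* Self-aliased: copy the value into an owned buffer first,
         * because the realloc below may move or free in->pv. */
        owned = malloc(repl_len ? repl_len : 1);
        assert(owned != NULL);
        memcpy(owned, repl, repl_len);
        repl = owned;
    }

    size_t tail = in->len - (at + ref_len);
    size_t new_len = at + repl_len + tail;
    if (new_len > in->len) {
        char *grown = realloc(in->pv, new_len + 1);   /* may relocate */
        assert(grown != NULL);
        in->pv = grown;
    }
    /* Shift the tail to its final position, then write the replacement. */
    memmove(in->pv + at + repl_len, in->pv + at + ref_len, tail);
    memcpy(in->pv + at, repl, repl_len);
    in->len = new_len;
    in->pv[new_len] = '\0';
    free(owned);              /* free(NULL) is a no-op in the other case */
}
```

With the payload from the regression test, calling `splice_value(&s, 32, 5, &s)` on `make_payload(32, 8192)` yields 64 `A`s, `&foo;`, then 16384 `B`s, matching the expected string in `t/entities.t`. Without the owned copy, `repl` would still point into the old block after `realloc()` relocates it.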
Preconditions
- input: The SV passed to `HTML::Entities::_decode_entities` must be the same SV as a value in the entity hash passed alongside it, and that value must reference its own key (for example, `&foo;` inside the value stored under `foo`).
Generated on Jun 4, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.
References (2)
News mentions (0)
No linked articles in our index yet.