CVE-2022-29546
Description
HtmlUnit NekoHtml Parser before 2.61.0 suffers from a denial of service vulnerability. Crafted input associated with the parsing of Processing Instruction (PI) data leads to heap memory consumption. This is similar to CVE-2022-28366 but affects a much later version of the product.
AI Insight
LLM-synthesized narrative grounded in this CVE's description and references.
HtmlUnit NekoHtml Parser before 2.61.0 is vulnerable to denial of service via crafted processing instruction input, causing heap exhaustion.
Vulnerability
HtmlUnit NekoHtml Parser versions before 2.61.0 are susceptible to a denial of service vulnerability when parsing crafted Processing Instruction (PI) data [1][2][4]. When the input ends immediately after a '?' or '/' inside a PI, the scanner does not recognize the end of file, which leads to excessive heap memory consumption [3]. This issue is similar to CVE-2022-28366 but affects a later version of the product [2].
Exploitation
An attacker can exploit this vulnerability by supplying a crafted HTML document in which the input ends inside a processing instruction, for example <!--?><?a/ as used in the regression test [3]. The attack can be delivered remotely, for example by hosting a malicious web page that is then parsed by HtmlUnit or by any application that uses the parser. No authentication or special privileges are required.
Impact
Successful exploitation results in heap memory exhaustion, causing the application to become unresponsive or crash, leading to a denial of service condition. The impact is limited to availability, with no confidentiality or integrity compromise.
Mitigation
The vulnerability is fixed in version 2.61.0 of the HtmlUnit NekoHtml Parser [1][4]. Users should upgrade to this version or later. The fix is implemented in commit 9d2aecd [3], which makes scanPI() stop when it reaches end of file (a read result of -1) after a '?' or '/' in PI data. If upgrading is not immediately possible, avoid parsing untrusted or user-supplied HTML content until the update can be applied.
AI Insight generated on May 21, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.
Affected packages
Versions sourced from the GitHub Security Advisory.
| Package | Affected versions | Patched versions |
|---|---|---|
| net.sourceforge.htmlunit:neko-htmlunit (Maven) | < 2.61.0 | 2.61.0 |
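The affected range in the table can be checked programmatically. The following is a minimal sketch, assuming plain dotted numeric versions without qualifiers such as -SNAPSHOT; the class and method names (AffectedVersionCheck, isAffected) are hypothetical and not part of HtmlUnit or Maven.

```java
final class AffectedVersionCheck {

    // First patched release according to the table above.
    private static final int[] FIRST_PATCHED = {2, 61, 0};

    /**
     * Returns true if the given dotted numeric version is lower than 2.61.0.
     * Assumption: the version contains only numeric components separated by dots.
     */
    static boolean isAffected(String version) {
        String[] parts = version.trim().split("\\.");
        for (int i = 0; i < FIRST_PATCHED.length; i++) {
            // Missing components are treated as 0, so "2.60" compares like "2.60.0".
            int part = i < parts.length ? Integer.parseInt(parts[i]) : 0;
            if (part != FIRST_PATCHED[i]) {
                return part < FIRST_PATCHED[i];
            }
        }
        return false; // equal to 2.61.0 on all three components: patched
    }

    public static void main(String[] args) {
        for (String v : new String[] {"2.60.0", "2.61.0", "2.7.0", "3.0.0"}) {
            System.out.println(v + " affected: " + isAffected(v));
        }
    }
}
```

Components are compared numerically rather than as strings, so 2.7.0 is correctly reported as older than 2.61.0.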
Affected products (3)
- HtmlUnit/NekoHtml Parser (description)
Patches (1)
9d2aecd69223: fix oom exception
5 files changed · +36 −1
src/main/java/net/sourceforge/htmlunit/cyberneko/HTMLScanner.java (+1 −1, modified)
    @@ -2626,7 +2626,7 @@ protected void scanPI() throws IOException {
                     if (c == '?' || c == '/') {
                         final char c0 = (char)c;
                         c = fCurrentEntity.read();
    -                    if (c == '>') {
    +                    if (c == -1 || c == '>') {
                             break;
                         }
                         fStringBuffer.append(c0);
src/test/java/net/sourceforge/htmlunit/cyberneko/HTMLScannerTest.java (+18 −0, modified)
    @@ -253,4 +253,22 @@ public void elementNameNormalization() throws Exception {
             final String[] expectedStringLower = {"(html", "(head", "(title", ")title", ")head", "(body", ")body", ")html"};
             assertEquals(Arrays.asList(expectedStringLower).toString(), filter.collectedStrings.toString());
         }
    +
    +    /**
    +     * Regression test for an oom exception in versions < 2.60.
    +     * @throws Exception
    +     */
    +    @Test
    +    public void invalidProcessingInstruction() throws Exception {
    +        final String string = "<!--?><?a/";
    +
    +        final HTMLConfiguration parser = new HTMLConfiguration();
    +        final EvaluateInputSourceFilter filter = new EvaluateInputSourceFilter(parser);
    +        parser.setProperty("http://cyberneko.org/html/properties/filters", new XMLDocumentFilter[] {filter});
    +        final XMLInputSource source = new XMLInputSource(null, "myTest", null, new StringReader(string), "UTF-8");
    +        parser.parse(source);
    +
    +        final String[] expected = {"(HTML", "(head", ")head", "(body", ")body", ")html"};
    +        assertEquals(Arrays.asList(expected).toString(), filter.collectedStrings.toString());
    +    }
     }
src/test/resources/error-handling/test-broken-pi.html (+1 −0, added)
    @@ -0,0 +1 @@
    +<!--?><?a/
    \ No newline at end of file
src/test/resources/error-handling/test-broken-pi.html.canonical (+14 −0, added)
    @@ -0,0 +1,14 @@
    +[Warn] HTML1000 No character encoding indicator at beginning of document.
    +[Err] HTML1007 Premature end of file encountered.
    +#?
    +[Warn] HTML1008 Skipping processing instruction.
    +?a
    +[Err] HTML2000 Empty document.
    +[Warn] HTML2006 Bare character content found. Inserting parent element <body>.
    +[Warn] HTML2002 Missing parent chain. Inserting proper parent <HTML> for element <head>.
    +(HTML
    +(head
    +)head
    +(body
    +)body
    +)html
src/test/resources/error-handling/test-broken-pi.html.settings (+2 −0, added)
    @@ -0,0 +1,2 @@
    +property http://cyberneko.org/html/properties/default-encoding ASCII
    +feature http://cyberneko.org/html/features/report-errors true
    \ No newline at end of file
Vulnerability mechanics
Generated on May 9, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.
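The one-line change to scanPI() in the fix commit can be illustrated with a small standalone model of the PI data loop. This is a hedged sketch, not the parser's code: ScanPiSketch, Outcome, scanPiData, eofCheck and maxSteps are hypothetical names, the input is only the PI data that follows the target name, and maxSteps is a safety cap that exists solely so the unpatched variant stops. The diff excerpt ends at fStringBuffer.append(c0); the sketch assumes that the scanner then pushes the lookahead back and continues the loop, and that a read at end of input returns -1 without advancing.

```java
final class ScanPiSketch {

    /** Number of buffered characters and whether the loop ended on its own. */
    record Outcome(int bufferedChars, boolean terminated) {}

    /**
     * Simplified model of the PI data loop.
     * eofCheck = false models the condition before the fix (c == '>'),
     * eofCheck = true models the condition after the fix (c == -1 || c == '>').
     */
    static Outcome scanPiData(String data, boolean eofCheck, int maxSteps) {
        StringBuilder buffer = new StringBuilder();
        int pos = 0;
        for (int step = 0; step < maxSteps; step++) {
            // A read at end of input yields -1 and does not advance (assumption).
            int c = pos < data.length() ? data.charAt(pos++) : -1;
            if (c == '?' || c == '/') {
                char c0 = (char) c;
                c = pos < data.length() ? data.charAt(pos++) : -1;
                if ((eofCheck && c == -1) || c == '>') {
                    return new Outcome(buffer.length(), true);
                }
                buffer.append(c0);
                // Assumed pushback of the lookahead. At end of input nothing was
                // consumed, so this steps back onto c0 and the same character is
                // read and appended again on the next iteration.
                pos--;
            } else if (c == -1) {
                return new Outcome(buffer.length(), true);
            } else {
                buffer.append((char) c);
            }
        }
        return new Outcome(buffer.length(), false); // cap reached, loop never ended
    }

    public static void main(String[] args) {
        // "/" is the PI data left after the target "a" in the regression input.
        System.out.println("before fix: " + scanPiData("/", false, 1000));
        System.out.println("after fix:  " + scanPiData("/", true, 1000));
    }
}
```

Under these assumptions, the pre-fix condition never leaves the loop for input that ends in '/' or '?', and the buffer grows by one character per iteration until the cap is hit. With the end-of-file check, the same input terminates immediately with an empty buffer.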
References (4)
News mentions (0)
No linked articles in our index yet.