VYPR
Moderate severity · NVD Advisory · Published Jul 5, 2022 · Updated Nov 4, 2025

NULL Pointer Dereference in lxml/lxml

CVE-2022-2309

Description

NULL Pointer Dereference allows attackers to cause a denial of service (or application crash). This only applies when lxml is used together with libxml2 2.9.10 through 2.9.14. libxml2 2.9.9 and earlier are not affected. It allows triggering crashes through forged input data, given a vulnerable code sequence in the application. The vulnerability is caused by the iterwalk function (also used by the canonicalize function). Such code shouldn't be in wide-spread use, given that parsing + iterwalk would usually be replaced with the more efficient iterparse function. However, an XML converter that serialises to C14N would also be vulnerable, for example, and there are legitimate use cases for this code sequence. If untrusted input is received (also remotely) and processed via iterwalk function, a crash can be triggered.

AI Insight

LLM-synthesized narrative grounded in this CVE's description and references.

NULL pointer dereference in lxml's iterwalk function allows denial of service when processing malicious XML with libxml2 2.9.10-2.9.14.

CVE-2022-2309 is a NULL pointer dereference vulnerability in the lxml XML toolkit when used with libxml2 versions 2.9.10 through 2.9.14. The bug resides in the iterwalk function (also used by canonicalize), which fails to properly handle namespace definitions after a parse failure, leading to a NULL pointer dereference [1][4]. This issue was introduced by changes in libxml2 that leak empty namespaces between failed parser runs, and the lxml code did not adequately guard against such states.

Exploitation requires an attacker to supply specially crafted XML input to an application that processes it via the iterwalk function. The attack can be performed remotely if the application handles untrusted XML data, and no authentication or special privileges are needed. The vulnerability is triggered when malformed XML causes a parse failure and the application then calls iterwalk on a tree generated by the same parser [1].

Successful exploitation results in a denial of service (application crash) due to the NULL pointer dereference. The vulnerable code sequence is not widespread, since parsing plus iterwalk is usually replaced by the more efficient iterparse. It does have legitimate uses, however: an XML converter that serializes to C14N would be vulnerable, for example [1].
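The single-pass iterparse style mentioned above can be sketched as follows. To stay runnable without lxml installed, this sketch uses the standard library's xml.etree.ElementTree.iterparse as a stand-in; lxml.etree.iterparse accepts the same source and events arguments. The helper name collect_events and the sample documents are illustrative assumptions, not part of the advisory.

```python
import io
import xml.etree.ElementTree as ET


def collect_events(xml_bytes):
    """Parse and walk in one pass, returning (event, detail) tuples.

    collect_events is a hypothetical helper used for illustration only.
    """
    events = []
    source = io.BytesIO(xml_bytes)
    # "start-ns" yields a (prefix, uri) tuple; "start"/"end" yield the element.
    for event, payload in ET.iterparse(source, events=("start-ns", "start", "end")):
        if event == "start-ns":
            events.append((event, payload))
        else:
            events.append((event, payload.tag))
    return events


events = collect_events(b'<root xmlns:a="urn:example:a"><a:item/></root>')

# Malformed input surfaces as an exception while the events are consumed.
try:
    collect_events(b'<anot xmlns="1">')
    malformed_rejected = False
except ET.ParseError:
    malformed_rejected = True
```

This only illustrates the shape of the code. Whether switching to iterparse is sufficient for a given application should be verified against a patched lxml release rather than assumed.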

The vulnerability has been patched in lxml via commit 86368e9, which skips namespace declarations whose href is NULL (in _build_nsmap, those whose prefix and href are both NULL) instead of converting them unconditionally [4]. Users are advised to update lxml to a version containing this fix or avoid using the vulnerable code path with untrusted input. Additionally, using libxml2 versions prior to 2.9.10 or later than 2.9.14 avoids the underlying issue [1][3].

AI Insight generated on May 21, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.

Affected packages

Versions sourced from the GitHub Security Advisory.

Package       Affected versions   Patched versions
lxml (PyPI)   < 4.9.1             4.9.1
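
A deployment can be checked against these ranges with a small helper. The following is a minimal sketch, assuming version numbers are available as integer tuples. The function name is_affected is hypothetical, and the libxml2 bounds are taken from the description (2.9.10 through 2.9.14).

```python
def is_affected(lxml_version, libxml2_version):
    """Return True if the version pair falls in the ranges listed for CVE-2022-2309.

    Hypothetical helper: compares only the first three version components.
    """
    lxml_fixed = (4, 9, 1)          # first patched lxml release per the advisory
    libxml2_first_bad = (2, 9, 10)  # lower bound from the description
    libxml2_last_bad = (2, 9, 14)   # upper bound from the description
    lxml_v = tuple(lxml_version[:3])
    libxml2_v = tuple(libxml2_version[:3])
    return lxml_v < lxml_fixed and libxml2_first_bad <= libxml2_v <= libxml2_last_bad


# Sample inputs (made up for illustration):
old_pair = is_affected((4, 9, 0, 0), (2, 9, 14))      # unpatched lxml, affected libxml2
patched_pair = is_affected((4, 9, 1, 0), (2, 9, 14))  # patched lxml
old_libxml2 = is_affected((4, 6, 0, 0), (2, 9, 9))    # libxml2 below the affected range
```

With lxml installed, the tuples can be read from lxml.etree.LXML_VERSION and lxml.etree.LIBXML_VERSION, the latter being the libxml2 version in use at runtime.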

Affected products

19

Patches

1
86368e9cf70a

Fix a crash when incorrect parser input occurs together with usages of iterwalk() on trees generated by the same parser.

https://github.com/lxml/lxml · Stefan Behnel · Jul 1, 2022 · via ghsa
3 files changed · +30 -8
  • src/lxml/apihelpers.pxi · +4 -3 · modified
    @@ -246,9 +246,10 @@ cdef dict _build_nsmap(xmlNode* c_node):
         while c_node is not NULL and c_node.type == tree.XML_ELEMENT_NODE:
             c_ns = c_node.nsDef
             while c_ns is not NULL:
    -            prefix = funicodeOrNone(c_ns.prefix)
    -            if prefix not in nsmap:
    -                nsmap[prefix] = funicodeOrNone(c_ns.href)
    +            if c_ns.prefix or c_ns.href:
    +                prefix = funicodeOrNone(c_ns.prefix)
    +                if prefix not in nsmap:
    +                    nsmap[prefix] = funicodeOrNone(c_ns.href)
                 c_ns = c_ns.next
             c_node = c_node.parent
         return nsmap
    
  • src/lxml/iterparse.pxi · +6 -5 · modified
    @@ -420,7 +420,7 @@ cdef int _countNsDefs(xmlNode* c_node):
         count = 0
         c_ns = c_node.nsDef
         while c_ns is not NULL:
    -        count += 1
    +        count += (c_ns.href is not NULL)
             c_ns = c_ns.next
         return count
     
    @@ -431,9 +431,10 @@ cdef int _appendStartNsEvents(xmlNode* c_node, list event_list) except -1:
         count = 0
         c_ns = c_node.nsDef
         while c_ns is not NULL:
    -        ns_tuple = (funicode(c_ns.prefix) if c_ns.prefix is not NULL else '',
    -                    funicode(c_ns.href))
    -        event_list.append( (u"start-ns", ns_tuple) )
    -        count += 1
    +        if c_ns.href:
    +            ns_tuple = (funicodeOrEmpty(c_ns.prefix),
    +                        funicode(c_ns.href))
    +            event_list.append( (u"start-ns", ns_tuple) )
    +            count += 1
             c_ns = c_ns.next
         return count
    
  • src/lxml/tests/test_etree.py · +20 -0 · modified
    @@ -1460,6 +1460,26 @@ def test_iterwalk_getiterator(self):
                 [1,2,1,4],
                 counts)
     
    +    def test_walk_after_parse_failure(self):
    +        # This used to be an issue because libxml2 can leak empty namespaces
    +        # between failed parser runs.  iterwalk() failed to handle such a tree.
    +        try:
    +            etree.XML('''<anot xmlns="1">''')
    +        except etree.XMLSyntaxError:
    +            pass
    +        else:
    +            assert False, "invalid input did not fail to parse"
    +
    +        et = etree.XML('''<root>  </root>''')
    +        try:
    +            ns = next(etree.iterwalk(et, events=('start-ns',)))
    +        except StopIteration:
    +            # This would be the expected result, because there was no namespace
    +            pass
    +        else:
    +            # This is a bug in libxml2
    +            assert not ns, repr(ns)
    +
         def test_itertext_comment_pi(self):
             # https://bugs.launchpad.net/lxml/+bug/1844674
             XML = self.etree.XML
    

Vulnerability mechanics

Generated on May 9, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.
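
The effect of the iterparse.pxi change in the fix commit can be modeled in plain Python. This is a sketch only: NsDef is a hypothetical stand-in for a libxml2 namespace entry, None stands in for a NULL pointer, and the RuntimeError marks the point where the real Cython code would dereference NULL and crash the process.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NsDef:
    """Hypothetical model of one entry in an element's nsDef linked list."""
    prefix: Optional[str]
    href: Optional[str]
    next: Optional["NsDef"] = None


def start_ns_events_unpatched(ns_def):
    """Model of the pre-patch loop: every entry is converted unconditionally."""
    events = []
    c_ns = ns_def
    while c_ns is not None:
        if c_ns.href is None:
            # In the real code this is a NULL dereference, not an exception.
            raise RuntimeError("NULL href dereferenced")
        events.append(("start-ns", (c_ns.prefix or "", c_ns.href)))
        c_ns = c_ns.next
    return events


def start_ns_events_patched(ns_def):
    """Model of the patched loop: entries with a NULL href are skipped."""
    events = []
    c_ns = ns_def
    while c_ns is not None:
        if c_ns.href is not None:
            events.append(("start-ns", (c_ns.prefix or "", c_ns.href)))
        c_ns = c_ns.next
    return events


# An empty namespace entry, as leaked after a failed parse, follows a real one.
leaked = NsDef(prefix=None, href=None)
chain = NsDef(prefix="a", href="urn:example:a", next=leaked)

patched_events = start_ns_events_patched(chain)
try:
    start_ns_events_unpatched(chain)
    unpatched_failed = False
except RuntimeError:
    unpatched_failed = True
```

The patched loop yields events only for entries that carry an href. This matches the regression test in the diff, which accepts either no start-ns event or an empty one for a tree without namespaces.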

References

16
