Medium severity (6.1) · NVD Advisory · Published May 14, 2014 · Updated May 6, 2026
CVE-2014-3146
Description
Incomplete blacklist vulnerability in the lxml.html.clean module in lxml before 3.3.5 allows remote attackers to conduct cross-site scripting (XSS) attacks via control characters in the link scheme to the clean_html function.
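To illustrate the bypass, the sketch below re-creates the pre-3.3.5 scheme check from the two patterns removed in commit e86b294f1f81 (listed under Patches). `old_remove_javascript_link` is a simplified stand-in written for this example, not lxml's actual `Cleaner` method, and the control-character link is the one used in the regression test added by the fix.

```python
import re

# Pre-fix patterns, copied from the removed lines of the fix commit.
_javascript_scheme_re = re.compile(
    r'\s*(?:javascript|jscript|livescript|vbscript|data|about|mocha):', re.I)
_substitute_whitespace = re.compile(r'\s+').sub


def old_remove_javascript_link(link):
    """Hypothetical stand-in: strip whitespace only, then look for a script scheme."""
    new = _substitute_whitespace('', link)
    if _javascript_scheme_re.search(new):
        return ''
    return link


plain = "javascript:evil_function()"
spaced = "j a v a s c r i p t:evil_function()"
# Control characters interleaved with the scheme, as in the regression test.
control = "j\x01a\x02v\x03a\x04s\x05c\x06r\x07i\x0Ep t:evil_function()"

print(repr(old_remove_javascript_link(plain)))          # ''
print(repr(old_remove_javascript_link(spaced)))         # ''
print(old_remove_javascript_link(control) == control)   # True: the link is left untouched
```

Only whitespace is stripped before the scheme match, so the characters `\x01` to `\x07` and `\x0E` keep the pattern from matching and the link is returned unchanged.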
Affected packages
Versions sourced from the GitHub Security Advisory.
| Package | Affected versions | Patched versions |
|---|---|---|
| lxml (PyPI) | < 3.3.5 | 3.3.5 |
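A version string can be checked against the patched release in the table with a few lines of Python. This is a minimal sketch: `parse_release` and `is_affected` are hypothetical helper names, and only the leading numeric components are compared, so a pre-release such as `3.3.0beta1` is treated like `3.3.0`.

```python
import re

PATCHED = (3, 3, 5)  # first patched release according to the table above


def parse_release(version):
    """Return the leading numeric components of a version string, padded to three."""
    match = re.match(r'\d+(?:\.\d+)*', version)
    if match is None:
        raise ValueError("no numeric version found in %r" % (version,))
    parts = tuple(int(p) for p in match.group(0).split('.'))
    return parts + (0,) * (3 - len(parts))


def is_affected(version):
    """True if the version sorts below the patched release."""
    return parse_release(version) < PATCHED


for v in ("2.2", "3.3.0beta1", "3.3.4", "3.3.5", "4.2.5"):
    print(v, is_affected(v))
```

For an installed copy, the version string could come from `importlib.metadata.version("lxml")` on Python 3.8 or later.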
Affected products
- cpe:2.3:a:lxml:lxml:2.1.2:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.1.3:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.1.4:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.2:-:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.2:alpha1:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.2:beta1:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.2:beta2:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.2:beta3:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.2:beta4:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.2.1:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.2.2:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.2.3:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.2.4:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:*:*:*:*:*:*:*:* (range: <=3.3.4)
- cpe:2.3:a:lxml:lxml:0.5:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:0.5.1:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:0.6:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:0.7:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:0.8:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:0.9:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:0.9.1:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:0.9.2:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:1.0:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:1.0.1:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:1.0.2:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:1.0.3:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:1.0.4:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:1.1:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:1.1.1:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:1.1.2:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:1.2:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:1.2.1:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:1.3:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:1.3.1:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:1.3.2:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:1.3.3:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:1.3.4:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:1.3.5:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:1.3.6:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.0:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.0.1:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.0.2:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.0.3:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.0.4:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.0.5:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.0.6:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.0.7:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.0.8:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.0.9:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.0.10:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.0.11:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.1:alpha1:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.1:beta1:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.1:beta2:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.1:beta3:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.1.1:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:3.2.4:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:3.2.5:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:3.3.0:-:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:3.3.0:beta1:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:3.3.0:beta2:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:3.3.0:beta3:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:3.3.0:beta4:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:3.3.0:beta5:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:3.3.1:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:3.3.2:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:3.3.3:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.2.5:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.2.6:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.2.7:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.2.8:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.3:-:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.3:alpha1:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.3:alpha2:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.3:beta1:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.3.1:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.3.2:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.3.3:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.3.4:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.3.5:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:2.3.6:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:3.0:-:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:3.0:alpha1:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:3.0:alpha2:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:3.0:beta1:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:3.0.1:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:3.0.2:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:3.1:beta1:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:3.1.0:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:3.1.1:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:3.1.2:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:3.2.0:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:3.2.1:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:3.2.2:*:*:*:*:*:*:*
- cpe:2.3:a:lxml:lxml:3.2.3:*:*:*:*:*:*:*
Patches
3f3082e0a678 Merge branch lxml-4.2 into master.
6 files changed · +29 −14
CHANGES.txt +9 −3 modified
@@ -3,14 +3,20 @@ lxml changelog ============== 4.3.0 (2018-??-??) -================== - Features added -------------- - * The module ``lxml.sax`` is compiled using Cython in order to speed it up. +4.2.5 (2018-09-09) +================== + +Bugs fixed +---------- + +* Javascript URLs that used URL escaping were not removed by the HTML cleaner. + Security problem found by Omar Eissa. + 4.2.4 (2018-08-03) ==================
doc/main.txt +7 −3 modified
@@ -157,8 +157,8 @@ Index <http://pypi.python.org/pypi/lxml/>`_ (PyPI). It has the source that compiles on various platforms. The source distribution is signed with `this key <pubkey.asc>`_. -The latest version is `lxml 4.2.4`_, released 2018-08-03 -(`changes for 4.2.4`_). `Older versions <#old-versions>`_ +The latest version is `lxml 4.2.5`_, released 2018-09-09 +(`changes for 4.2.5`_). `Older versions <#old-versions>`_ are listed below. Please take a look at the
@@ -250,7 +250,9 @@ See the websites of lxml .. and the `latest in-development version <http://lxml.de/dev/>`_. -.. _`PDF documentation`: lxmldoc-4.2.4.pdf +.. _`PDF documentation`: lxmldoc-4.2.5.pdf + +* `lxml 4.2.5`_, released 2018-09-09 (`changes for 4.2.5`_) * `lxml 4.2.4`_, released 2018-08-03 (`changes for 4.2.4`_)
@@ -272,6 +274,7 @@ See the websites of lxml * `older releases <http://lxml.de/3.7/#old-versions>`_ +.. _`lxml 4.2.5`: /files/lxml-4.2.5.tgz .. _`lxml 4.2.4`: /files/lxml-4.2.4.tgz .. _`lxml 4.2.3`: /files/lxml-4.2.3.tgz .. _`lxml 4.2.2`: /files/lxml-4.2.2.tgz
@@ -282,6 +285,7 @@ See the websites of lxml .. _`lxml 4.0.0`: /files/lxml-4.0.0.tgz .. _`lxml 3.8.0`: /files/lxml-3.8.0.tgz +.. _`changes for 4.2.5`: /changes-4.2.5.html .. _`changes for 4.2.4`: /changes-4.2.4.html .. _`changes for 4.2.3`: /changes-4.2.3.html .. _`changes for 4.2.2`: /changes-4.2.2.html
doc/rest2html.py +1 −1 modified

    @@ -38,7 +38,7 @@ def pygments_directive(name, arguments, options, content, lineno,
                            content_offset, block_text, state, state_machine):
         try:
             lexer = get_lexer_by_name(arguments[0])
    -    except ValueError, e:
    +    except ValueError:
             # no lexer found - use the text one instead of an exception
             lexer = TextLexer()
         # take an arbitrary option if more than one is given
src/lxml/html/clean.py +3 −2 modified

    @@ -8,9 +8,10 @@
     import copy
     try:
         from urlparse import urlsplit
    +    from urllib import unquote_plus
     except ImportError:
         # Python 3
    -    from urllib.parse import urlsplit
    +    from urllib.parse import urlsplit, unquote_plus
     from lxml import etree
     from lxml.html import defs
     from lxml.html import fromstring, XHTML_NAMESPACE
    @@ -477,7 +478,7 @@ def _kill_elements(self, doc, condition, iterate=None):
         def _remove_javascript_link(self, link):
             # links like "j a v a s c r i p t:" might be interpreted in IE
    -        new = _substitute_whitespace('', link)
    +        new = _substitute_whitespace('', unquote_plus(link))
             if _is_javascript_scheme(new):
                 # FIXME: should this be None to delete?
                 return ''
src/lxml/html/tests/test_clean.txt +3 −3 modified

    @@ -18,7 +18,7 @@
     ... <body onload="evil_function()">
     ... <!-- I am interpreted for EVIL! -->
     ... <a href="javascript:evil_function()">a link</a>
    -... <a href="j\x01a\x02v\x03a\x04s\x05c\x06r\x07i\x0Ep t:evil_function()">a control char link</a>
    +... <a href="j\x01a\x02v\x03a\x04s\x05c\x06r\x07i\x0Ep t%20:evil_function()">a control char link</a>
     ... <a href="data:text/html;base64,PHNjcmlwdD5hbGVydCgidGVzdCIpOzwvc2NyaXB0Pg==">data</a>
     ... <a href="#" onclick="evil_function()">another link</a>
     ... <p onclick="evil_function()">a paragraph</p>
    @@ -51,7 +51,7 @@
      <body onload="evil_function()">
      <!-- I am interpreted for EVIL! -->
      <a href="javascript:evil_function()">a link</a>
    - <a href="javascrip t:evil_function()">a control char link</a>
    + <a href="javascrip t%20:evil_function()">a control char link</a>
      <a href="data:text/html;base64,PHNjcmlwdD5hbGVydCgidGVzdCIpOzwvc2NyaXB0Pg==">data</a>
      <a href="#" onclick="evil_function()">another link</a>
      <p onclick="evil_function()">a paragraph</p>
    @@ -84,7 +84,7 @@
      <body onload="evil_function()">
      <!-- I am interpreted for EVIL! -->
      <a href="javascript:evil_function()">a link</a>
    - <a href="javascrip%20t:evil_function()">a control char link</a>
    + <a href="javascrip%20t%20:evil_function()">a control char link</a>
      <a href="data:text/html;base64,PHNjcmlwdD5hbGVydCgidGVzdCIpOzwvc2NyaXB0Pg==">data</a>
      <a href="#" onclick="evil_function()">another link</a>
      <p onclick="evil_function()">a paragraph</p>
tools/manylinux/build-wheels.sh +6 −2 modified

    @@ -24,12 +24,16 @@ build_wheel() {
             -w /io/$WHEELHOUSE
     }
    -assert_importable() {
    +run_tests() {
         # Install packages and test
         for PYBIN in /opt/python/*/bin/; do
             ${PYBIN}/pip install $PACKAGE --no-index -f /io/$WHEELHOUSE
    +        # check import as a quick test
             (cd $HOME; ${PYBIN}/python -c 'import lxml.etree, lxml.objectify')
    +
    +        # run tests
    +        (cd $HOME; ${PYBIN}/python /io/test.py)
         done
     }
    @@ -74,5 +78,5 @@ show_wheels() {
     prepare_system
     build_wheels
     repair_wheels
    -assert_importable
    +run_tests
     show_wheels
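Commit 3f3082e0a678 changes `_remove_javascript_link` to decode URL escapes with `unquote_plus` before the scheme check. The sketch below shows why that matters, assuming Python 3. `check_link` and its `decode` flag are hypothetical names for this illustration; the patterns are the ones the cleaner uses after the 3.3.5 fix, and the sample link comes from the updated test case.

```python
import re
from urllib.parse import unquote_plus

# Patterns as they appear in clean.py after the 3.3.5 fix.
_is_javascript_scheme = re.compile(
    r'(?:javascript|jscript|livescript|vbscript|data|about|mocha):',
    re.I).search
_substitute_whitespace = re.compile(r'[\s\x00-\x08\x0B\x0C\x0E-\x19]+').sub


def check_link(link, decode):
    """Hypothetical helper: return '' if the link looks like a script URL, else the link."""
    candidate = unquote_plus(link) if decode else link
    new = _substitute_whitespace('', candidate)
    return '' if _is_javascript_scheme(new) else link


# Link from the updated test: control characters plus an escaped space (%20)
# between the scheme name and the colon.
escaped = "j\x01a\x02v\x03a\x04s\x05c\x06r\x07i\x0Ep t%20:evil_function()"

print(check_link(escaped, decode=False) == escaped)  # True: "%20" hides the scheme
print(repr(check_link(escaped, decode=True)))        # '': decoded, stripped, then matched
```

Without decoding, the literal `%20` sits between `javascript` and the colon, so the pattern does not match. After `unquote_plus` it becomes a space and is removed together with the other whitespace.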
1 file changed · +10 −0
CHANGES.txt +10 −0 modified

    @@ -2,6 +2,16 @@
     lxml changelog
     ==============
    +3.3.5 (???)
    +==================
    +
    +Bugs fixed
    +----------
    +
    +* HTML cleaning could fail to strip javascript links that mix control
    +  characters into the link scheme.
    +
    +
     3.3.4 (2014-04-03)
     ==================
e86b294f1f81 strip control characters before looking for evil text content in Cleaner
2 files changed · +13 −5
src/lxml/html/clean.py +5 −4 modified

    @@ -70,9 +70,10 @@
     # All kinds of schemes besides just javascript: that can cause
     # execution:
    -_javascript_scheme_re = re.compile(
    -    r'\s*(?:javascript|jscript|livescript|vbscript|data|about|mocha):', re.I)
    -_substitute_whitespace = re.compile(r'\s+').sub
    +_is_javascript_scheme = re.compile(
    +    r'(?:javascript|jscript|livescript|vbscript|data|about|mocha):',
    +    re.I).search
    +_substitute_whitespace = re.compile(r'[\s\x00-\x08\x0B\x0C\x0E-\x19]+').sub
     # FIXME: should data: be blocked?
     # FIXME: check against: http://msdn2.microsoft.com/en-us/library/ms537512.aspx
    @@ -466,7 +467,7 @@ def _kill_elements(self, doc, condition, iterate=None):
         def _remove_javascript_link(self, link):
             # links like "j a v a s c r i p t:" might be interpreted in IE
             new = _substitute_whitespace('', link)
    -        if _javascript_scheme_re.search(new):
    +        if _is_javascript_scheme(new):
                 # FIXME: should this be None to delete?
                 return ''
             return link
src/lxml/html/tests/test_clean.txt +8 −1 modified

    @@ -1,3 +1,4 @@
    +>>> import re
     >>> from lxml.html import fromstring, tostring
     >>> from lxml.html.clean import clean, clean_html, Cleaner
     >>> from lxml.html import usedoctest
    @@ -17,6 +18,7 @@
     ... <body onload="evil_function()">
     ... <!-- I am interpreted for EVIL! -->
     ... <a href="javascript:evil_function()">a link</a>
    +... <a href="j\x01a\x02v\x03a\x04s\x05c\x06r\x07i\x0Ep t:evil_function()">a control char link</a>
     ... <a href="data:text/html;base64,PHNjcmlwdD5hbGVydCgidGVzdCIpOzwvc2NyaXB0Pg==">data</a>
     ... <a href="#" onclick="evil_function()">another link</a>
     ... <p onclick="evil_function()">a paragraph</p>
    @@ -33,7 +35,7 @@
     ... </body>
     ... </html>'''
    ->>> print(doc)
    +>>> print(re.sub('[\x00-\x07\x0E]', '', doc))
     <html>
     <head>
     <script type="text/javascript" src="evil-site"></script>
    @@ -49,6 +51,7 @@
      <body onload="evil_function()">
      <!-- I am interpreted for EVIL! -->
      <a href="javascript:evil_function()">a link</a>
    + <a href="javascrip t:evil_function()">a control char link</a>
      <a href="data:text/html;base64,PHNjcmlwdD5hbGVydCgidGVzdCIpOzwvc2NyaXB0Pg==">data</a>
      <a href="#" onclick="evil_function()">another link</a>
      <p onclick="evil_function()">a paragraph</p>
    @@ -81,6 +84,7 @@
      <body onload="evil_function()">
      <!-- I am interpreted for EVIL! -->
      <a href="javascript:evil_function()">a link</a>
    + <a href="javascrip%20t:evil_function()">a control char link</a>
      <a href="data:text/html;base64,PHNjcmlwdD5hbGVydCgidGVzdCIpOzwvc2NyaXB0Pg==">data</a>
      <a href="#" onclick="evil_function()">another link</a>
      <p onclick="evil_function()">a paragraph</p>
    @@ -104,6 +108,7 @@
      </head>
      <body>
      <a href="">a link</a>
    + <a href="">a control char link</a>
      <a href="">data</a>
      <a href="#">another link</a>
      <p>a paragraph</p>
    @@ -123,6 +128,7 @@
      </head>
      <body>
      <a href="">a link</a>
    + <a href="">a control char link</a>
      <a href="">data</a>
      <a href="#">another link</a>
      <p>a paragraph</p>
    @@ -146,6 +152,7 @@
      </head>
      <body>
      <a href="">a link</a>
    + <a href="">a control char link</a>
      <a href="">data</a>
      <a href="#">another link</a>
      <p>a paragraph</p>
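The effect of the `clean.py` change in commit e86b294f1f81 can be sketched outside lxml as follows. The two patterns are copied from the added lines of the diff; `patched_remove_javascript_link` is a hypothetical standalone wrapper around the same logic, not the library's `Cleaner` method.

```python
import re

# Patterns copied from the added lines of the fix commit.
_is_javascript_scheme = re.compile(
    r'(?:javascript|jscript|livescript|vbscript|data|about|mocha):',
    re.I).search
# Strips whitespace and the listed control characters.
_substitute_whitespace = re.compile(r'[\s\x00-\x08\x0B\x0C\x0E-\x19]+').sub


def patched_remove_javascript_link(link):
    """Hypothetical wrapper: drop whitespace and control characters, then match the scheme."""
    new = _substitute_whitespace('', link)
    if _is_javascript_scheme(new):
        return ''
    return link


control = "j\x01a\x02v\x03a\x04s\x05c\x06r\x07i\x0Ep t:evil_function()"
harmless = "http://example.com/page"

print(repr(patched_remove_javascript_link(control)))           # ''
print(patched_remove_javascript_link(harmless) == harmless)    # True
```

Because the pattern is applied with `search`, it matches anywhere in the link, so a URL such as `http://example.com/?next=about:blank` would also be blanked by this sketch.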
References
- seclists.org/fulldisclosure/2014/Apr/319 (nvd: Exploit, WEB)
- www.securityfocus.com/bid/67159 (nvd: Exploit)
- mailman-mail5.webfaction.com/pipermail/lxml/2014-April/007128.html (nvd: Exploit, WEB)
- secunia.com/advisories/58013 (nvd: Vendor Advisory)
- github.com/advisories/GHSA-57qw-cc2g-pv5p (ghsa: ADVISORY)
- nvd.nist.gov/vuln/detail/CVE-2014-3146 (ghsa: ADVISORY)
- advisories.mageia.org/MGASA-2014-0218.html (nvd: WEB)
- lists.opensuse.org/opensuse-updates/2014-05/msg00083.html (nvd: WEB)
- lxml.de/3.3/changes-3.3.5.html (nvd: WEB)
- seclists.org/fulldisclosure/2014/Apr/210 (nvd: WEB)
- www.debian.org/security/2014/dsa-2941 (nvd: WEB)
- www.openwall.com/lists/oss-security/2014/05/09/7 (nvd: WEB)
- www.ubuntu.com/usn/USN-2217-1 (nvd: WEB)
- github.com/lxml/lxml/commit/3f3082e0a67851cde26a48da3d1f4b75d8aa07ec (ghsa: WEB)
- github.com/lxml/lxml/commit/86e81ab393ba14c1be71284675851a3bdce57d69 (ghsa: WEB)
- github.com/lxml/lxml/commit/e86b294f1f81b899a59925123560ff924a72f1cc (ghsa: WEB)
- github.com/lxml/lxml/pull/273 (ghsa: WEB)
- github.com/pypa/advisory-database/tree/main/vulns/lxml/PYSEC-2014-9.yaml (ghsa: WEB)
- web.archive.org/web/20140724172044/http://secunia.com/advisories/58013 (ghsa: WEB)
- web.archive.org/web/20140805110535/http://secunia.com/advisories/59008 (ghsa: WEB)
- web.archive.org/web/20140806061046/http://secunia.com/advisories/58744 (ghsa: WEB)
- web.archive.org/web/20141017122607/https://mailman-mail5.webfaction.com/pipermail/lxml/2014-April/007128.html (ghsa: WEB)
- web.archive.org/web/20150523055039/http://www.mandriva.com/en/support/security/advisories/advisory/MDVSA-2015:112/ (ghsa: WEB)
- web.archive.org/web/20200228180542/http://www.securityfocus.com/bid/67159 (ghsa: WEB)
- secunia.com/advisories/58744 (nvd)
- secunia.com/advisories/59008 (nvd)
- www.mandriva.com/security/advisories (nvd)