VYPR
Moderate severity · NVD Advisory · Published Mar 27, 2021 · Updated Aug 3, 2024

CVE-2021-29272

Description

bluemonday before 1.0.5 allows XSS because certain Go lowercasing converts an uppercase Cyrillic character, defeating a protection mechanism against the "script" string.

AI Insight

LLM-synthesized narrative grounded in this CVE's description and references.

bluemonday, an HTML sanitizer for Go, is vulnerable to XSS before v1.0.5: Go's strings.ToLower converts the non-ASCII character İ (U+0130) to ASCII i, so a tag such as <scrİpt> is treated as script after lowercasing and the script tag filtering is bypassed.

Vulnerability

Overview

CVE-2021-29272 is an XSS vulnerability in bluemonday, an HTML sanitizer for Go, affecting versions prior to 1.0.5. The root cause is the use of Go's strings.ToLower to normalize HTML element names. That function maps certain non-ASCII uppercase characters to ASCII letters: İ (U+0130, which the advisory calls Cyrillic although Unicode names it LATIN CAPITAL LETTER I WITH DOT ABOVE) lowercases to i. An attacker can therefore craft a tag like <scrİpt> whose name becomes script after lowercasing. The allowlist lookup uses the original tag name, which does not match script, while the lowercased name is what the sanitizer later uses to decide whether text content is written without HTML escaping. The mismatch between the two defeats the protection against the "script" string [1][2][4].
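The lowercasing behaviour can be confirmed with the Go standard library alone. The following is a minimal sketch, not bluemonday code; the helper name naiveLower is hypothetical and simply wraps strings.ToLower to stand in for the pre-1.0.5 normalisation.

```go
package main

import (
	"fmt"
	"strings"
)

// naiveLower is a hypothetical stand-in for the pre-1.0.5 normalisation:
// it lowercases an element name with strings.ToLower and nothing else.
func naiveLower(name string) string {
	return strings.ToLower(name)
}

func main() {
	tag := "scr\u0130pt" // U+0130 sits where the ASCII "i" would be

	// The original name is not the ASCII string "script"...
	fmt.Println(tag == "script") // false

	// ...but its lowercased form is.
	fmt.Println(naiveLower(tag) == "script") // true
}
```

Any check made against the original name and any check made against the lowercased name can therefore disagree about whether the element is a script tag.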

Exploitation

An attacker replaces the i in a script tag name with İ (U+0130), which Go lowercases to ASCII i. No authentication or special network position is required; the payload can arrive in any user-generated content that is passed through bluemonday. The sanitizer's tokenizer-based loop records the lowercased name of the most recently started tag, so after <scrİpt> it believes it is inside a script element and writes the following text without HTML escaping. Markup carried in that text, such as entity-encoded <script> tags, then reaches the browser as live HTML [4].
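The effect on text handling can be illustrated with a simplified model. This is a sketch under stated assumptions, not bluemonday's actual code: renderText is a hypothetical function that keeps only the decision visible in the fix diff (text inside "script" or "style" is written raw, everything else is escaped) and uses the standard library's html.EscapeString for the escaping.

```go
package main

import (
	"fmt"
	"html"
	"strings"
)

// renderText is a simplified, hypothetical model of the text-token branch:
// text that follows a start tag whose lowercased name is "script" or "style"
// is returned as-is, and all other text is HTML-escaped.
func renderText(lastStartTag, text string) string {
	switch strings.ToLower(lastStartTag) { // the flawed normalisation
	case "script", "style":
		return text
	default:
		return html.EscapeString(text)
	}
}

func main() {
	payload := "<script>alert(document.domain)</script>"

	// After an ordinary tag the payload is neutralised.
	fmt.Println(renderText("div", payload))

	// After <scrİpt> the model believes it is inside a script element
	// and returns the payload unescaped.
	fmt.Println(renderText("scr\u0130pt", payload))
}
```

The model leaves out the allowlist entirely; it only shows why the text following the crafted tag escapes the escaping step.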

Impact

Successful exploitation leads to arbitrary JavaScript execution in the context of the victim's browser. This can result in data theft, session hijacking, defacement, or other malicious actions typically associated with stored or reflected XSS attacks [2].

Mitigation

The vulnerability is fixed in bluemonday version 1.0.5. The fix introduces a normaliseElementName function that passes the element name through strconv.QuoteToASCII before lowercasing it, so non-ASCII characters become escape sequences such as \u0130 and can no longer collapse into ASCII letters [3][4]. Users should upgrade to 1.0.5 or later. No workarounds are documented.
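The effect of the fix can be checked in isolation. The sketch below reproduces the logic of normaliseElementName from the fix commit in a standalone program (the intermediate variable quoted is added here for readability); it assumes nothing beyond the Go standard library.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// normaliseElementName follows the function added in the fix commit:
// quote to ASCII first, lowercase second, then trim the surrounding quotes
// that strconv.QuoteToASCII adds.
func normaliseElementName(str string) string {
	quoted := strings.ToLower(strconv.QuoteToASCII(str))
	return strings.TrimSuffix(strings.TrimPrefix(quoted, `"`), `"`)
}

func main() {
	// Plain ASCII names are lowercased as before.
	fmt.Println(normaliseElementName("SCRIPT")) // script

	// U+0130 survives as a visible escape sequence instead of becoming "i".
	fmt.Println(normaliseElementName("scr\u0130pt")) // scr\u0130pt

	// The crafted name no longer compares equal to "script".
	fmt.Println(normaliseElementName("scr\u0130pt") == "script") // false
}
```

Because the escaping happens before the lowercasing, the case mapping only ever sees ASCII bytes, which is why the U+0130 trick stops working.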

AI Insight generated on May 21, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.

Affected packages

Versions sourced from the GitHub Security Advisory.

Package: github.com/microcosm-cc/bluemonday (Go)
Affected versions: < 1.0.5
Patched versions: 1.0.5

Affected products

3

Patches

1
524f142fe46e

Resolves #111 carefully escape tag names

https://github.com/microcosm-cc/bluemonday · David Kitchen · Mar 27, 2021 · via ghsa
2 files changed · +66 -4
  • sanitize.go +24 -4 modified
    @@ -229,7 +229,7 @@ func (p *Policy) sanitize(r io.Reader) *bytes.Buffer {
     
     		case html.StartTagToken:
     
    -			mostRecentlyStartedToken = strings.ToLower(token.Data)
    +			mostRecentlyStartedToken = normaliseElementName(token.Data)
     
     			aps, ok := p.elsAndAttrs[token.Data]
     			if !ok {
    @@ -272,7 +272,7 @@ func (p *Policy) sanitize(r io.Reader) *bytes.Buffer {
     
     		case html.EndTagToken:
     
    -			if mostRecentlyStartedToken == strings.ToLower(token.Data) {
    +			if mostRecentlyStartedToken == normaliseElementName(token.Data) {
     				mostRecentlyStartedToken = ""
     			}
     
    @@ -350,11 +350,11 @@ func (p *Policy) sanitize(r io.Reader) *bytes.Buffer {
     
     			if !skipElementContent {
     				switch mostRecentlyStartedToken {
    -				case "script":
    +				case `script`:
     					// not encouraged, but if a policy allows JavaScript we
     					// should not HTML escape it as that would break the output
     					buff.WriteString(token.Data)
    -				case "style":
    +				case `style`:
     					// not encouraged, but if a policy allows CSS styles we
     					// should not HTML escape it as that would break the output
     					buff.WriteString(token.Data)
    @@ -887,3 +887,23 @@ func (p *Policy) matchRegex(elementName string) (map[string]attrPolicy, bool) {
     	}
     	return aps, matched
     }
    +
    +
    +// normaliseElementName takes a HTML element like <script> which is user input
    +// and returns a lower case version of it that is immune to UTF-8 to ASCII
    +// conversion tricks (like the use of upper case cyrillic i scrİpt which a
    +// strings.ToLower would convert to script). Instead this func will preserve
    +// all non-ASCII as their escaped equivalent, i.e. \u0130 which reveals the
    +// characters when lower cased
    +func normaliseElementName(str string) string {
    +	// that useful QuoteToASCII put quote marks at the start and end
    +	// so those are trimmed off
    +	return strings.TrimSuffix(
    +		strings.TrimPrefix(
    +			strings.ToLower(
    +				strconv.QuoteToASCII(str),
    +			),
    +			`"`),
    +		`"`,
    +	)
    +}
    \ No newline at end of file
    
  • sanitize_test.go +42 -0 modified
    @@ -1678,3 +1678,45 @@ func TestIssue85NoReferrer(t *testing.T) {
     	}
     	wg.Wait()
     }
    +
    +
    +
    +func TestIssue111ScriptTags(t *testing.T) {
    +	p1 := NewPolicy()
    +	p2 := UGCPolicy()
    +	p3 := UGCPolicy().AllowElements("script")
    +
    +	in := `<scr\u0130pt>&lt;script&gt;alert(document.domain)&lt;/script&gt;`
    +	expected := `&lt;script&gt;alert(document.domain)&lt;/script&gt;`
    +	out := p1.Sanitize(in)
    +	if out != expected {
    +		t.Errorf(
    +			"test failed;\ninput   : %s\noutput  : %s\nexpected: %s",
    +			in,
    +			out,
    +			expected,
    +		)
    +	}
    +
    +	expected = `&lt;script&gt;alert(document.domain)&lt;/script&gt;`
    +	out = p2.Sanitize(in)
    +	if out != expected {
    +		t.Errorf(
    +			"test failed;\ninput   : %s\noutput  : %s\nexpected: %s",
    +			in,
    +			out,
    +			expected,
    +		)
    +	}
    +
    +	expected = `&lt;script&gt;alert(document.domain)&lt;/script&gt;`
    +	out = p3.Sanitize(in)
    +	if out != expected {
    +		t.Errorf(
    +			"test failed;\ninput   : %s\noutput  : %s\nexpected: %s",
    +			in,
    +			out,
    +			expected,
    +		)
    +	}
    +}
    \ No newline at end of file
    

Vulnerability mechanics

Generated on May 9, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.

References

8
