VYPR
Unrated severity · NVD Advisory · Published Nov 1, 2021 · Updated Aug 4, 2024

CVE-2021-42574

Description

Bidirectional control characters can reorder source code tokens, enabling invisible code injection by subverting human review.

AI Insight

LLM-synthesized narrative grounded in this CVE's description and references.

Vulnerability

The Unicode Bidirectional Algorithm through version 14.0 permits the visual reordering of characters via control sequences, allowing source code to render differently than its logical token order [4]. This affects any compiler or interpreter that accepts Unicode source files, as the logical order ingested by parsers may differ from the visual order seen by human reviewers [2].

Exploitation

An attacker embeds Unicode bidirectional control characters (the embedding and override characters U+202A–U+202E and the isolate characters U+2066–U+2069) in source code so that the rendered text shows benign code while the logical character order, which is what the compiler or interpreter parses, contains the malicious code [1][2]. No special privileges are required; the attacker only needs to submit the crafted source to a code repository or build system that accepts Unicode files. Human reviewers approve what they see in visual order, but the compiler or interpreter processes the logical order and executes the injected payload.
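To make the character set concrete, the following Python sketch lists the bidirectional control characters present in a string, using only the standard-library `unicodedata` module. The names `BIDI_CONTROLS` and `find_bidi_controls` are illustrative and do not come from the advisory; the set is assumed to cover the embedding, override, and isolate characters (U+202A–U+202E and U+2066–U+2069).

```python
import unicodedata

# Assumed set: embedding/override controls (U+202A-U+202E)
# and isolate controls (U+2066-U+2069).
BIDI_CONTROLS = {chr(cp) for cp in range(0x202A, 0x202F)} | {
    chr(cp) for cp in range(0x2066, 0x206A)
}

def find_bidi_controls(text):
    """Return (index, 'U+XXXX', name) for each bidi control character in text."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ch in BIDI_CONTROLS
    ]

# Hypothetical line of source with a right-to-left override inside a string.
sample = 'if role == "user\u202e"'
for hit in find_bidi_controls(sample):
    print(hit)
```

The reported indices refer to the logical character order, which is the order a parser consumes, not the order a bidirectional-aware viewer displays.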

Impact

Successful exploitation allows arbitrary code injection that bypasses code review. The attacker can introduce subtle vulnerabilities such as logic bombs, backdoors, or data exfiltration without detection [1][3]. The impact depends on the privileges of the process that compiles or interprets the code, potentially leading to full system compromise.

Mitigation

The Unicode Consortium provides mitigations in Unicode Technical Standard #39 (Unicode Security Mechanisms) and Unicode Standard Annex #31 (Unicode Identifier and Pattern Syntax) [1][3]. Specific recommendations include restricting allowed characters in source code, displaying control characters visibly, and using tools that detect or reject bidirectional override sequences [2]. The issue is inherent to the Unicode standard; no single patch exists. Software vendors should implement input validation and user notification for bidirectional control characters.
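Two of the recommendations above, displaying control characters visibly and detecting them before review, can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not a vetted tool: the function names `reveal_bidi` and `contains_bidi` are hypothetical, and the character range (U+202A–U+202E, U+2066–U+2069) is an assumed definition of "bidirectional control character".

```python
import re

# Assumed character set: U+202A-U+202E and U+2066-U+2069.
_BIDI_RE = re.compile("[\u202a-\u202e\u2066-\u2069]")

def reveal_bidi(text):
    """Replace each bidi control character with a visible <U+XXXX> marker."""
    return _BIDI_RE.sub(lambda m: f"<U+{ord(m.group()):04X}>", text)

def contains_bidi(text):
    """Return True if text holds any bidi control character."""
    return _BIDI_RE.search(text) is not None

# Hypothetical source line with an override hidden in a comment.
line = "allowed = True  # \u202e only admins"
print(reveal_bidi(line))
print(contains_bidi(line))
```

A check like `contains_bidi` could run in a pre-commit hook or CI step, while `reveal_bidi` suits diff viewers, where the goal is to show the reviewer what is actually in the file.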

AI Insight generated on May 27, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.

Affected products

102

Patches

0

No patches discovered yet.

Vulnerability mechanics

Root cause

"The Unicode Bidirectional Algorithm permits invisible control sequences that visually reorder characters, enabling source code to display different logic than the logical token order consumed by compilers and interpreters."

Attack vector

An attacker embeds invisible bidirectional control codepoints (e.g., the override U+202E and the isolate pair U+2066 and U+2069) into source code. Bidirectional-aware editors and code review tools honor these codepoints and visually reorder the surrounding characters, so the displayed logic differs from the logical token order the compiler processes [ref_id=1]. A human reviewer sees one condition (e.g., a harmless string comparison) while the compiler evaluates a different one, which lets the attacker hide malicious logic. The attack works only if the victim's tool renders the bidirectional control sequences and no separate out-of-band check flags the codepoints.
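The string-comparison case can be illustrated with a small Python sketch. The construction below is an illustrative assumption rather than a payload taken from the advisory: it builds the logical content of a string literal that a bidirectional-aware viewer may display as if the literal ended after `user` and was followed by a comment. How it actually renders depends on the viewer.

```python
RLO, LRI, PDI = "\u202e", "\u2066", "\u2069"

# Logical content of the literal: "user" followed by control characters
# and comment-like text that the parser still treats as part of the string.
logical = "user" + RLO + " " + LRI + "// Check if admin" + PDI + " " + LRI

def escaped(text):
    """Show the logical codepoints as ASCII escapes."""
    return text.encode("unicode_escape").decode("ascii")

# The comparison a reviewer believes they are reading does not hold.
print(logical == "user")
print(escaped(logical))
```

Printing the escaped form is one way to perform the out-of-band check mentioned above, since it bypasses bidirectional rendering entirely.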

Affected code

The vulnerability is in the Unicode Bidirectional Algorithm specification (through version 14.0) and affects any compiler or interpreter that accepts Unicode source code. The Rust compiler (rustc) versions 1.0.0 through 1.56.0 are cited as an affected implementation that lacked lints to detect the dangerous codepoints [ref_id=1].

What the fix does

The Rust Security Response WG introduced two new lints in Rust 1.56.1, `text_direction_codepoint_in_comment` and `text_direction_codepoint_in_literal`, that detect and reject source code containing the affected bidirectional control codepoints [ref_id=1]. Because the lints deny such code by default, compilation fails on files that contain the dangerous control sequences, which removes the discrepancy between what a reviewer sees and what the compiler interprets. The Unicode Consortium separately provides guidance in Unicode Technical Standard #39 and Unicode Standard Annex #31 for mitigating this class of issue.
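The idea of linting comments and literals separately can be sketched for Python source with the standard-library `tokenize` module. This is a loose analogue written for illustration, not how rustc implements its lints; the function name `lint_bidi` and the character set are assumptions.

```python
import io
import tokenize

# Assumed set of dangerous codepoints: U+202A-U+202E and U+2066-U+2069.
BIDI_CONTROLS = set("\u202a\u202b\u202c\u202d\u202e\u2066\u2067\u2068\u2069")

def lint_bidi(source):
    """Return (line, kind) for comments and string literals with bidi controls."""
    findings = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            kind = "comment"
        elif tok.type == tokenize.STRING:
            kind = "literal"
        else:
            continue
        if BIDI_CONTROLS & set(tok.string):
            findings.append((tok.start[0], kind))
    return findings

# Hypothetical two-line source: one tainted literal, one tainted comment.
code = 'level = "user\u202e"\n# note \u2066hidden\u2069\n'
print(lint_bidi(code))
```

One known gap in this sketch: on Python 3.12 and later, f-strings are emitted as separate token types rather than `STRING`, so a production check would need to handle those as well, or simply scan the raw file contents.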

Preconditions

  • config: The victim must use a code editor, review tool, or terminal that renders Unicode bidirectional override codepoints (e.g., U+202E, U+2066, U+2069).
  • config: The compiler or interpreter must accept Unicode source code without rejecting the dangerous codepoints (e.g., Rust 1.0.0 through 1.56.0).
  • input: The attacker must be able to contribute or inject source code that the victim will review and compile.

Generated on May 30, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.

References

18
