VYPR
Low severity · NVD Advisory · Published Jun 8, 2026

CVE-2026-47344

Description

TYPO3 HTML sanitizer versions prior to 2.3.2 are vulnerable to XSS when ALLOW_INSECURE_RAW_TEXT is enabled, allowing raw text to bypass sanitization.

AI Insight

LLM-synthesized narrative grounded in this CVE's description and references.

Vulnerability

Versions of the typo3/html-sanitizer library prior to 2.3.2 are vulnerable when the ALLOW_INSECURE_RAW_TEXT configuration option is enabled. In this configuration, whitespace-variant closing tags, such as </style\t>, are not correctly recognized by the sanitizer but are accepted by browsers as valid end tags. This allows subsequent content to bypass the cross-site scripting (XSS) prevention mechanisms.

Exploitation

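An attacker can exploit this vulnerability by submitting HTML in which a raw-text element is closed with a whitespace-variant end tag and followed by malicious markup. The fix commit's test suite uses the payload <style>div::after{content:'<'}</style\t><img src=x onerror=alert(1)>, where \t stands for a tab character. The browser treats </style\t> as a valid closing tag for the style element, so the subsequent img tag is parsed as markup and its onerror handler executes, while the vulnerable sanitizer does not recognize the end of the style element at that point.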
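
The mismatch between the sanitizer and the browser can be illustrated with a minimal PHP sketch. It is a simplified model under stated assumptions, not the library's tokenizer: `findEndTagStrict()` and `findEndTagTolerant()` are hypothetical helpers introduced here only for illustration. The first models a search for the exact `</tag>` sequence; the second models a search that tolerates whitespace before `>`, as the patched `rawText()` does. The payload is taken from the fix commit's tests.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: simplified model of an exact-match search.
 * The raw text is considered to end only at the literal sequence "</tag>".
 * Returns the byte offset of the end tag, or null if none is found.
 */
function findEndTagStrict(string $html, string $tag): ?int
{
    $pos = stripos($html, '</' . $tag . '>');
    return $pos === false ? null : $pos;
}

/**
 * Hypothetical helper: simplified model of a whitespace-tolerant search.
 * Matches "</tag", then optional HTML whitespace, then ">".
 */
function findEndTagTolerant(string $html, string $tag): ?int
{
    $pattern = '~</' . preg_quote($tag, '~') . '[\t\n\f\r ]*>~i';
    if (preg_match($pattern, $html, $match, PREG_OFFSET_CAPTURE) !== 1) {
        return null;
    }
    return $match[0][1];
}

// Payload from the fix commit's test data; "\t" is a real tab character.
$payload = "<style>div::after{content:'<'}</style\t><img src=x onerror=alert(1)>";

// The exact-match model finds no end tag, so everything after <style>
// would still be treated as raw text.
var_dump(findEndTagStrict($payload, 'style'));

// The tolerant model locates "</style\t>", so the <img> element is parsed
// as ordinary markup and can be filtered by the sanitizer's rules.
var_dump(findEndTagTolerant($payload, 'style'));
```

The sketch only covers whitespace before `>`. A real HTML tokenizer handles further cases, so it should not be used as a sanitizer itself.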

Impact

Successful exploitation of this vulnerability allows an attacker to bypass the HTML sanitizer's cross-site scripting prevention mechanisms. This can lead to the execution of arbitrary JavaScript code within the context of the victim's browser session, potentially resulting in information disclosure, session hijacking, or other malicious actions.

Mitigation

The vulnerability is fixed in version 2.3.2 of the typo3/html-sanitizer library, and users are advised to upgrade to that version or later. If upgrading is not immediately possible, the only workaround is to disable the ALLOW_INSECURE_RAW_TEXT option, though this may affect intended functionality. [1]
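
A version check along these lines could help flag affected installations. This is a minimal sketch: `isAffectedSanitizerVersion()` is a hypothetical helper, and it assumes the installed version string is already known (for example from a lock file). It encodes only the "prior to 2.3.2" boundary stated in this advisory; whether other release branches receive separate fixed versions is not stated here.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: returns true when the given typo3/html-sanitizer
 * version string is older than 2.3.2, the fixed version named in this
 * advisory.
 */
function isAffectedSanitizerVersion(string $installedVersion): bool
{
    // Assumption: tolerate a leading "v" as found in some tag names.
    $normalized = ltrim($installedVersion, 'vV');

    return version_compare($normalized, '2.3.2', '<');
}

// Example usage with illustrative version strings.
foreach (['2.3.1', 'v2.3.2', '2.4.0'] as $version) {
    printf(
        "%s: %s\n",
        $version,
        isAffectedSanitizerVersion($version) ? 'affected' : 'not affected'
    );
}
```

Even on an affected version, the issue is only reachable when ALLOW_INSECURE_RAW_TEXT is enabled for a tag, so the check is a first filter rather than proof of exposure.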

AI Insight generated on Jun 8, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.

Affected products

2

Patches

1
bd1a88d9b5a5

[SECURITY] Mitigate raw-text bypass with ALLOW_INSECURE_RAW_TEXT

https://github.com/TYPO3/html-sanitizer · Oliver Hader · Apr 27, 2026 · via nvd-ref
5 files changed · +221 -5
  • src/Parser/Html5.php · +65 -0 · added
    @@ -0,0 +1,65 @@
    +<?php
    +
    +declare(strict_types=1);
    +
    +/*
    + * This file is part of the TYPO3 project.
    + *
    + * It is free software; you can redistribute it and/or modify it under the terms
    + * of the MIT License (MIT). For the full copyright and license information,
    + * please read the LICENSE file that was distributed with this source code.
    + *
    + * The TYPO3 project - inspiring people to share!
    + */
    +
    +namespace TYPO3\HtmlSanitizer\Parser;
    +
    +use DOMDocument;
    +use DOMDocumentFragment;
    +use Masterminds\HTML5 as MastermindsHTML5;
    +use Masterminds\HTML5\Parser\DOMTreeBuilder;
    +use Masterminds\HTML5\Parser\Scanner;
    +
    +/**
    + * Extends the Masterminds HTML5 parser to substitute the local Tokenizer
    + * subclass, which fixes raw-text end-tag handling for whitespace variants.
    + */
    +class Html5 extends MastermindsHTML5
    +{
    +    #[\Override]
    +    public function parse($input, array $options = []): DOMDocument
    +    {
    +        $this->errors = [];
    +        $options = array_merge($this->getOptions(), $options);
    +        $events = new DOMTreeBuilder(false, $options);
    +        $scanner = new Scanner($input, !empty($options['encoding']) ? $options['encoding'] : 'UTF-8');
    +        $parser = new Tokenizer(
    +            $scanner,
    +            $events,
    +            !empty($options['xmlNamespaces']) ? Tokenizer::CONFORMANT_XML : Tokenizer::CONFORMANT_HTML
    +        );
    +
    +        $parser->parse();
    +        $this->errors = $events->getErrors();
    +
    +        return $events->document();
    +    }
    +
    +    #[\Override]
    +    public function parseFragment($input, array $options = []): DOMDocumentFragment
    +    {
    +        $options = array_merge($this->getOptions(), $options);
    +        $events = new DOMTreeBuilder(true, $options);
    +        $scanner = new Scanner($input, !empty($options['encoding']) ? $options['encoding'] : 'UTF-8');
    +        $parser = new Tokenizer(
    +            $scanner,
    +            $events,
    +            !empty($options['xmlNamespaces']) ? Tokenizer::CONFORMANT_XML : Tokenizer::CONFORMANT_HTML
    +        );
    +
    +        $parser->parse();
    +        $this->errors = $events->getErrors();
    +
    +        return $events->fragment();
    +    }
    +}
    
  • src/Parser/Tokenizer.php · +69 -0 · added
    @@ -0,0 +1,69 @@
    +<?php
    +
    +declare(strict_types=1);
    +
    +/*
    + * This file is part of the TYPO3 project.
    + *
    + * It is free software; you can redistribute it and/or modify it under the terms
    + * of the MIT License (MIT). For the full copyright and license information,
    + * please read the LICENSE file that was distributed with this source code.
    + *
    + * The TYPO3 project - inspiring people to share!
    + */
    +
    +namespace TYPO3\HtmlSanitizer\Parser;
    +
    +use Masterminds\HTML5\Elements;
    +use Masterminds\HTML5\Parser\Tokenizer as MastermindsTokenizer;
    +
    +/**
    + * Extends the Masterminds tokenizer to fix rawText() so that it recognises
    + * whitespace-variant closing tags (e.g. </style\t>) as valid end tags per
    + * HTML5 spec § 8.2.6.1, aligning it with the existing rcdata() behaviour.
    + */
    +class Tokenizer extends MastermindsTokenizer
    +{
    +    #[\Override]
    +    protected function rawText($tok): bool
    +    {
    +        if ($this->untilTag === null) {
    +            return $this->text($tok);
    +        }
    +
    +        // Search for `'</' . $untilTag` without the trailing '>' so that
    +        // optional whitespace before '>' is handled correctly, matching
    +        // the behaviour of rcdata() and the HTML5 spec (§ 8.2.6.1).
    +        // Entity references are NOT decoded in raw text (unlike rcdata).
    +        $sequence = '</' . $this->untilTag;
    +        $txt = '';
    +
    +        $caseSensitive = !Elements::isHtml5Element($this->untilTag);
    +        while ($tok !== false &&
    +            ($tok !== '<' || !$this->scanner->sequenceMatches($sequence, $caseSensitive))
    +        ) {
    +            $txt .= $tok;
    +            $tok = $this->scanner->next();
    +        }
    +
    +        if ($tok === false) {
    +            $this->parseError('Unexpected EOF during raw text read.');
    +            $this->events->text($txt);
    +            $this->setTextMode(0);
    +            return false;
    +        }
    +
    +        $len = strlen($sequence);
    +        $this->scanner->consume($len);
    +        $len += $this->scanner->whitespace();
    +        if ($this->scanner->current() !== '>') {
    +            $this->parseError('Unclosed raw text end tag');
    +        }
    +
    +        $this->scanner->unconsume($len);
    +        $this->events->text($txt);
    +        $this->setTextMode(0);
    +
    +        return $this->endTag();
    +    }
    +}
    
  • src/Sanitizer.php · +3 -3 · modified
    @@ -17,7 +17,7 @@
     use DOMDocumentFragment;
     use DOMNode;
     use DOMNodeList;
    -use Masterminds\HTML5;
    +use TYPO3\HtmlSanitizer\Parser\Html5;
     use TYPO3\HtmlSanitizer\Serializer\Rules;
     use TYPO3\HtmlSanitizer\Serializer\RulesInterface;
     use TYPO3\HtmlSanitizer\Visitor\VisitorInterface;
    @@ -196,8 +196,8 @@ protected function closeRulesStream(RulesInterface $rules): bool
             return fclose($rules->getStream());
         }
     
    -    protected function createParser(): HTML5
    +    protected function createParser(): Html5
         {
    -        return new HTML5(self::mastermindsDefaultOptions);
    +        return new Html5(self::mastermindsDefaultOptions);
         }
     }
    
  • src/Serializer/Rules.php · +20 -2 · modified
    @@ -154,9 +154,8 @@ public function text($domNode): void
             if (!$domNode instanceof DOMNode) {
                 return;
             }
    -        // @todo if allowed as text raw element
             $parentDomNode = $domNode->parentNode ?? null;
    -        if (!$this->isRawText($parentDomNode) || !$this->shallAllowInsecureRawText($parentDomNode)) {
    +        if (!$this->shallAllowInsecureRawText($parentDomNode)) {
                 $this->wr($this->enc($domNode->data));
                 return;
             }
    @@ -200,6 +199,10 @@ protected function shallAllowInsecureRawText(?DOMNode $domNode): bool
             if ($domNode === null || !$this->behavior instanceof Behavior || !$this->isRawText($domNode)) {
                 return false;
             }
    +        // allowing raw-text in elements nested in `<noscript>` is denied per default
    +        if ($this->hasAncestorWithName($domNode, 'noscript')) {
    +            return false;
    +        }
             $tag = $this->behavior->getTag($domNode->nodeName);
             return $tag instanceof Behavior\Tag && $tag->shallAllowInsecureRawText();
         }
    @@ -217,4 +220,19 @@ protected function isVoid(?DOMNode $domNode): bool
                 && !empty($domNode->tagName)
                 && Elements::isA($domNode->localName, Elements::VOID_TAG);
         }
    +
    +    protected function hasAncestorWithName(?DOMNode $domNode, string $ancestorName): bool
    +    {
    +        if (!$domNode instanceof DOMNode) {
    +            return false;
    +        }
    +        $ancestor = $domNode->parentNode;
    +        while ($ancestor instanceof DOMNode) {
    +            if ($ancestor->localName === $ancestorName) {
    +                return true;
    +            }
    +            $ancestor = $ancestor->parentNode;
    +        }
    +        return false;
    +    }
     }
    
  • tests/ScenarioTest.php · +64 -0 · modified
    @@ -616,6 +616,10 @@ public function attributesAreEncoded(string $payload, string $expectation): void
     
         public static function specialTagsAreHandledDataProvider(): iterable
         {
    +        yield 'noscript valid' => [
    +            '<noscript><p id="info">This site requires JavaScript.</p></noscript>',
    +            '<noscript><p id="info">This site requires JavaScript.</p></noscript>',
    +        ];
             yield 'noscript attribute' => [
                 '<noscript><p id="</noscript><script>alert(1)</script>"></p>',
                 '<noscript><p id="&lt;/noscript&gt;&lt;script&gt;alert(1)&lt;/script&gt;"></p></noscript>',
    @@ -662,4 +666,64 @@ public function specialTagsAreHandled(string $payload, string $expectation): voi
             );
             self::assertSame($expectation, $sanitizer->sanitize($payload));
         }
    +
    +    public static function insecureRawTextIsSanitizedDataProvider(): \Generator
    +    {
    +        $noscript = new Behavior\Tag('noscript', Behavior\Tag::ALLOW_CHILDREN);
    +        $styleDefault = new Behavior\Tag('style', Behavior\Tag::ALLOW_CHILDREN);
    +        $styleInsecureRawText = new Behavior\Tag('style', Behavior\Tag::ALLOW_CHILDREN | Behavior\Tag::ALLOW_INSECURE_RAW_TEXT);
    +        $iframeDefault = new Behavior\Tag('iframe', Behavior\Tag::ALLOW_CHILDREN);
    +        $iframeInsecureRawText = new Behavior\Tag('iframe', Behavior\Tag::ALLOW_CHILDREN | Behavior\Tag::ALLOW_INSECURE_RAW_TEXT);
    +
    +        yield 'style whitespace closing tag is recognized (img is removed - default)' => [
    +            [$styleDefault],
    +            "<style>div::after{content:'<'}</style\t><img src=x onerror=alert(1)>",
    +            '<style>div::after{content:\'&lt;\'}</style>',
    +        ];
    +        yield 'style whitespace closing tag is recognised (img is removed - insecure raw text allowed)' => [
    +            [$styleInsecureRawText],
    +            "<style>div::after{content:'<'}</style\t><img src=x onerror=alert(1)>",
    +            '<style>div::after{content:\'<\'}</style>',
    +        ];
    +
    +        yield 'iframe & style detect raw-text part (img is removed - insecure raw text allowed)' => [
    +            [$iframeInsecureRawText, $styleInsecureRawText],
    +            '<iframe><style></iframe><img src="x" onerror="alert(1)"></style></iframe>',
    +            '<iframe><style></iframe>',
    +        ];
    +        yield 'iframe & style detect raw-text part (img is removed - default)' => [
    +            [$iframeDefault, $styleDefault],
    +            '<iframe><style></iframe><img src="x" onerror="alert(1)"></style></iframe>',
    +            '<iframe>&lt;style&gt;</iframe>',
    +        ];
    +
    +        yield 'noscript nesting another raw-text element is denied (content is encoded - default)' => [
    +            [$noscript, $styleDefault],
    +            '<noscript><style></noscript><img src=x onerror=alert(2)></style></noscript>',
    +            '<noscript><style>&lt;/noscript&gt;&lt;img src=x onerror=alert(2)&gt;</style></noscript>',
    +        ];
    +        yield 'noscript nesting another raw-text element is denied (content is encoded - insecure raw text allowed)' => [
    +            [$noscript, $styleInsecureRawText],
    +            '<noscript><style></noscript><img src=x onerror=alert(2)></style></noscript>',
    +            '<noscript><style>&lt;/noscript&gt;&lt;img src=x onerror=alert(2)&gt;</style></noscript>',
    +        ];
    +    }
    +
    +    /**
    +     * @test
    +     * @dataProvider insecureRawTextIsSanitizedDataProvider
    +     */
    +    public function insecureRawTextIsSanitized(array $tags, string $payload, string $expectation): void
    +    {
    +        $behavior = (new Behavior())
    +            ->withFlags(Behavior::REMOVE_UNEXPECTED_CHILDREN)
    +            ->withName('scenario-test')
    +            ->withTags(...$tags);
    +
    +        $sanitizer = new Sanitizer(
    +            $behavior,
    +            new CommonVisitor($behavior)
    +        );
    +        self::assertSame($expectation, $sanitizer->sanitize($payload));
    +    }
     }
    

Vulnerability mechanics

Root cause

"The sanitizer did not correctly handle whitespace variants of closing tags when `ALLOW_INSECURE_RAW_TEXT` was enabled."

Attack vector

An attacker can craft HTML input in which a raw-text element such as `<style>` is closed by a whitespace-variant end tag, for example `</style\t>` (a tab before the `>`), while the `ALLOW_INSECURE_RAW_TEXT` flag is enabled for that tag. Browsers interpret these whitespace variants as valid closing tags, but the vulnerable sanitizer does not recognize them. Markup placed after the end tag, such as an `<img>` element with an `onerror` handler, therefore bypasses sanitization and leads to cross-site scripting [ref_id=1].

Affected code

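The vulnerability lies within the `html-sanitizer` library, specifically in how raw-text elements are parsed and serialized when the `ALLOW_INSECURE_RAW_TEXT` flag is set. Before the fix, `Sanitizer::createParser()` instantiated the upstream `Masterminds\HTML5` parser directly, so raw-text end tags were detected by the upstream tokenizer's `rawText()` method. The commit `bd1a88d9b5a5f67f1120ec41084e9c1a0675641c` addresses this issue by improving the recognition of closing tags in such scenarios [ref_id=1].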

What the fix does

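The patch adds a `Tokenizer` subclass in `src/Parser/Tokenizer.php` whose `rawText()` override searches for `</` plus the tag name without the trailing `>`, then skips optional whitespace before expecting `>`, so that whitespace variants of closing tags are recognized in line with the existing `rcdata()` behaviour. A new `Html5` parser subclass wires this tokenizer in, and `Sanitizer::createParser()` now returns it. In `src/Serializer/Rules.php`, `shallAllowInsecureRawText()` additionally returns false for raw-text elements nested inside `<noscript>`, so their content is always encoded. New `insecureRawTextIsSanitized` test cases in `tests/ScenarioTest.php` cover these scenarios [patch_id=5247382]. Together, these changes prevent the sanitizer from misinterpreting the end of a raw text element, which closes the gap that allowed subsequent content to escape sanitization.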

Preconditions

  • config: `ALLOW_INSECURE_RAW_TEXT` must be enabled in the sanitizer configuration.
  • input: The input HTML must contain a tag with a whitespace variant closing tag (e.g., `</style\t>`).

Generated on Jun 8, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.

References

2
