CVE-2021-28965
Description
The REXML gem before 3.2.5 in Ruby before 2.6.7, 2.7.x before 2.7.3, and 3.x before 3.0.1 does not properly address XML round-trip issues. An incorrect document can be produced after parsing and serializing.
AI Insight
LLM-synthesized narrative grounded in this CVE's description and references.
REXML in Ruby incorrectly handles XML round-trips, allowing crafted XML to be parsed and serialized into an invalid document.
Vulnerability
The REXML gem before 3.2.5, as bundled with Ruby before 2.6.7, 2.7.x before 2.7.3, and 3.x before 3.0.1, does not properly address XML round-trip issues: parsing a document and serializing it again can produce an incorrect document. An attacker can craft a malformed XML document that the parser accepts and that, after re-serialization, may no longer comply with the XML specification. The issue centers on the handling of DOCTYPE and notation declarations, where the parser accepted certain invalid constructs (for example, missing names or invalid ID types) and incorrect output could be generated [1][2][3].
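The round-trip property described above can be checked directly. The following is a minimal sketch, not part of the advisory: the helper names `round_trip` and `round_trip_stable?` are hypothetical, and the check only compares two successive parse/serialize passes, so it detects instability rather than proving that the output is valid XML.

```ruby
require "rexml/document"

# Parse the given XML text and serialize it again.
# (Hypothetical helper, introduced here for illustration only.)
def round_trip(xml)
  REXML::Document.new(xml).to_s
end

# Returns true when a second parse/serialize pass reproduces the first
# pass exactly. A parse failure on either pass counts as "not stable".
def round_trip_stable?(xml)
  first = round_trip(xml)
  second = round_trip(first)
  first == second
rescue REXML::ParseException
  false
end

puts round_trip_stable?(%q(<!DOCTYPE r SYSTEM "r.dtd"><r/>))
```

A check like this can be useful in a regression test suite for code that re-emits XML received from other parties, since it flags inputs whose serialized form changes between passes.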
Exploitation
An attacker can exploit this vulnerability by supplying a specially crafted XML document to an application that uses the vulnerable REXML parser to parse and then re-serialize XML. No authentication or special privileges are required; the only precondition is that the application processes the malicious XML input. The attack vector is typically user-supplied XML data, such as file uploads, SOAP messages, or XML-based configuration files. The attacker submits a malformed XML document with invalid DOCTYPE or notation syntax, and when the application re-serializes the parsed document, a structurally invalid XML document is produced [1][2][3].
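One way to narrow this entry point, assuming the application never needs a DTD in the XML it accepts, is to reject any input that contains a DOCTYPE declaration before it reaches the parser. The sketch below is illustrative only: `UnsafeXmlError` and `parse_untrusted_xml` are hypothetical names, and the check is deliberately conservative (it also rejects documents that merely mention `<!DOCTYPE` inside a comment or CDATA section).

```ruby
require "rexml/document"

# Hypothetical error class for input the application refuses to process.
class UnsafeXmlError < StandardError; end

# Parse untrusted XML, refusing anything that carries a DOCTYPE.
# Assumption: the application has no legitimate use for DTDs.
def parse_untrusted_xml(xml)
  if xml.match?(/<!DOCTYPE/i)
    raise UnsafeXmlError, "DOCTYPE declarations are not accepted"
  end
  REXML::Document.new(xml)
rescue REXML::ParseException
  # Treat any parse failure as rejected input rather than repairing it.
  raise UnsafeXmlError, "malformed XML rejected"
end

doc = parse_untrusted_xml("<order id='1'/>")
puts doc.root.name
```

Rejecting the whole document is a simpler design than trying to strip the declaration with a regular expression, because a partial strip can itself change the meaning of the document.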
Impact
Successful exploitation causes an invalid XML document to be produced after parsing and serialization (a "round-trip" issue). Downstream consumers of the generated XML may reject or misinterpret it, so the core impact is on data integrity, with a secondary effect on availability when applications that rely on valid serialized XML fail or behave unexpectedly. The vulnerability does not directly lead to remote code execution or privilege escalation, but it can disrupt XML processing in security-sensitive contexts where correct XML structure is assumed [4].
Mitigation
Fixed versions: REXML gem 3.2.5 and Ruby 2.6.7, 2.7.3, and 3.0.1. Users should upgrade to these versions or later; for Ruby installations, updating to the latest patch release is recommended. If an immediate upgrade is not possible, avoid processing untrusted XML with the vulnerable REXML parser, or sanitize XML input to strip DOCTYPE declarations before parsing. These steps reduce exposure, but no known workaround fully mitigates the flaw without patching. The vulnerability was fixed in April 2021 [1][4].
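The version thresholds above can be checked at runtime. This is a rough sketch under stated assumptions: `rexml_patched?` and `ruby_ships_fixed_rexml?` are hypothetical helper names, and the Ruby check reflects only the release series named in this advisory. Because the gem can be updated independently of the interpreter, the gem version is the more reliable signal.

```ruby
require "rubygems"
require "rexml/document"

# First fixed gem release, per the advisory.
FIXED_REXML = Gem::Version.new("3.2.5")

# First fixed Ruby release for each affected series, per the advisory.
FIXED_RUBY = { "2.6" => "2.6.7", "2.7" => "2.7.3", "3.0" => "3.0.1" }.freeze

# True when the given REXML version string is 3.2.5 or later.
def rexml_patched?(version)
  Gem::Version.new(version) >= FIXED_REXML
end

# True when the given Ruby version ships a fixed REXML by default.
# Series older than 2.6 are treated as affected; series after 3.0 as fixed.
def ruby_ships_fixed_rexml?(ruby_version)
  v = Gem::Version.new(ruby_version)
  series = v.segments.first(2).join(".")
  if FIXED_RUBY.key?(series)
    v >= Gem::Version.new(FIXED_RUBY[series])
  else
    v > Gem::Version.new("3.0")
  end
end

puts "REXML #{REXML::VERSION} patched: #{rexml_patched?(REXML::VERSION)}"
puts "Ruby #{RUBY_VERSION} ships fixed REXML: #{ruby_ships_fixed_rexml?(RUBY_VERSION)}"
```

In a Bundler-managed project, the same floor can be expressed declaratively with `gem "rexml", ">= 3.2.5"` in the Gemfile.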
AI Insight generated on May 21, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.
Affected packages
Versions sourced from the GitHub Security Advisory.
| Package | Affected versions | Patched versions |
|---|---|---|
| rexml (RubyGems) | < 3.2.5 | 3.2.5 |
Affected products
16 entries:
- Ruby/REXML (source: description)
- osv-coords, 15 versions:
- pkg:bitnami/ruby (no CPE), range: < 2.6.7
- pkg:bitnami/ruby-min (no CPE), range: < 2.6.7
- pkg:gem/rexml (no CPE), range: < 3.2.5
- pkg:rpm/almalinux/rubygem-abrt (no CPE), range: < 0.4.0-1.module_el8.4.0+2399+4e3a532a
- pkg:rpm/almalinux/rubygem-abrt-doc (no CPE), range: < 0.4.0-1.module_el8.5.0+118+1ab773e1
- pkg:rpm/almalinux/rubygem-bson (no CPE), range: < 4.8.1-1.module_el8.5.0+117+35d1289b
- pkg:rpm/almalinux/rubygem-bson-doc (no CPE), range: < 4.8.1-1.module_el8.3.0+6147+d0dfc1e4
- pkg:rpm/almalinux/rubygem-mongo (no CPE), range: < 2.11.3-1.module_el8.3.0+6147+d0dfc1e4
- pkg:rpm/almalinux/rubygem-mongo-doc (no CPE), range: < 2.11.3-1.module_el8.3.0+6147+d0dfc1e4
- pkg:rpm/almalinux/rubygem-mysql2 (no CPE), range: < 0.5.3-1.module_el8.4.0+2399+4e3a532a
- pkg:rpm/almalinux/rubygem-mysql2-doc (no CPE), range: < 0.5.3-1.module_el8.5.0+118+1ab773e1
- pkg:rpm/almalinux/rubygem-pg (no CPE), range: < 1.2.3-1.module_el8.3.0+6147+d0dfc1e4
- pkg:rpm/almalinux/rubygem-pg-doc (no CPE), range: < 1.2.3-1.module_el8.3.0+6147+d0dfc1e4
- pkg:rpm/suse/ruby2.5&distro=SUSE%20Linux%20Enterprise%20Micro%205.0 (no CPE), range: < 2.5.9-4.17.1
- pkg:rpm/suse/ruby2.5&distro=SUSE%20Linux%20Enterprise%20Module%20for%20Basesystem%2015%20SP2 (no CPE), range: < 2.5.9-4.17.1
Patches
73c137eb11955 Fix a parser bug that some data may be ignored before DOCTYPE
3 files changed · +27 −8
lib/rexml/parsers/baseparser.rb+8 −7 modified@@ -195,11 +195,9 @@ def pull_event return [ :end_document ] if empty? return @stack.shift if @stack.size > 0 #STDERR.puts @source.encoding - @source.read if @source.buffer.size<2 #STDERR.puts "BUFFER = #{@source.buffer.inspect}" if @document_status == nil - #@source.consume( /^\s*/um ) - word = @source.match( /^((?:\s+)|(?:<[^>]*>))/um ) + word = @source.match( /\A((?:\s+)|(?:<[^>]*>))/um ) word = word[1] unless word.nil? #STDERR.puts "WORD = #{word.inspect}" case word @@ -257,18 +255,16 @@ def pull_event @stack << [ :end_doctype ] end return args - when /^\s+/ + when /\A\s+/ else @document_status = :after_doctype - @source.read if @source.buffer.size<2 - md = @source.match(/\s*/um, true) if @source.encoding == "UTF-8" @source.buffer.force_encoding(::Encoding::UTF_8) end end end if @document_status == :in_doctype - md = @source.match(/\s*(.*?>)/um) + md = @source.match(/\A\s*(.*?>)/um) case md[1] when SYSTEMENTITY match = @source.match( SYSTEMENTITY, true )[1] @@ -349,7 +345,11 @@ def pull_event return [ :end_doctype ] end end + if @document_status == :after_doctype + @source.match(/\A\s*/um, true) + end begin + @source.read if @source.buffer.size<2 if @source.buffer[0] == ?< if @source.buffer[1] == ?/ @nsstack.shift @@ -392,6 +392,7 @@ def pull_event unless md raise REXML::ParseException.new("malformed XML: missing tag start", @source) end + @document_status = :in_element prefixes = Set.new prefixes << md[2] if md[2] @nsstack.unshift(curr_ns=Set.new)
test/parser/test_ultra_light.rb +0 −1 modified
@@ -16,7 +16,6 @@ def test_entity_declaration
   nil,
   [:entitydecl, "name", "value"]
   ],
-  [:text, "\n"],
   [:start_element, :parent, "root", {}],
   [:text, "\n"],
   ],
test/parse/test_processing_instruction.rb+19 −0 modified@@ -20,6 +20,25 @@ def test_no_name <??> DETAIL end + + def test_garbage_text + # TODO: This should be parse error. + # Create test/parse/test_document.rb or something and move this to it. + doc = parse(<<-XML) +x<?x y +<!--?><?x -->?> +<r/> + XML + pi = doc.children[1] + assert_equal([ + "x", + "y\n<!--", + ], + [ + pi.target, + pi.content, + ]) + end end end end
9b311e59ae05 Fix a bug that invalid document declaration may be accepted
3 files changed · +326 −95
lib/rexml/parsers/baseparser.rb+126 −74 modified@@ -50,7 +50,6 @@ class BaseParser DOCTYPE_START = /\A\s*<!DOCTYPE\s/um DOCTYPE_END = /\A\s*\]\s*>/um - DOCTYPE_PATTERN = /\s*<!DOCTYPE\s+(.*?)(\[|>)/um ATTRIBUTE_PATTERN = /\s*(#{QNAME_STR})\s*=\s*(["'])(.*?)\4/um COMMENT_START = /\A<!--/u COMMENT_PATTERN = /<!--(.*?)-->/um @@ -69,7 +68,6 @@ class BaseParser STANDALONE = /\bstandalone\s*=\s*["'](.*?)['"]/um ENTITY_START = /\A\s*<!ENTITY/ - IDENTITY = /^([!\*\w\-]+)(\s+#{NCNAME_STR})?(\s+["'](.*?)['"])?(\s+['"](.*?)["'])?/u ELEMENTDECL_START = /\A\s*<!ELEMENT/um ELEMENTDECL_PATTERN = /\A\s*(<!ELEMENT.*?)>/um SYSTEMENTITY = /\A\s*(%.*?;)\s*$/um @@ -101,8 +99,9 @@ class BaseParser ENTITYDECL = /\s*(?:#{GEDECL})|(?:#{PEDECL})/um NOTATIONDECL_START = /\A\s*<!NOTATION/um - PUBLIC = /\A\s*<!NOTATION\s+#{NAME}\s+(PUBLIC)\s+#{PUBIDLITERAL}(?:\s+#{SYSTEMLITERAL})?\s*>/um - SYSTEM = /\A\s*<!NOTATION\s+#{NAME}\s+(SYSTEM)\s+#{SYSTEMLITERAL}\s*>/um + EXTERNAL_ID_PUBLIC = /\A\s*PUBLIC\s+#{PUBIDLITERAL}\s+#{SYSTEMLITERAL}\s*/um + EXTERNAL_ID_SYSTEM = /\A\s*SYSTEM\s+#{SYSTEMLITERAL}\s*/um + PUBLIC_ID = /\A\s*PUBLIC\s+#{PUBIDLITERAL}\s*/um EREFERENCE = /&(?!#{NAME};)/ @@ -225,24 +224,37 @@ def pull_event when INSTRUCTION_START return process_instruction when DOCTYPE_START - md = @source.match( DOCTYPE_PATTERN, true ) + base_error_message = "Malformed DOCTYPE" + @source.match(DOCTYPE_START, true) @nsstack.unshift(curr_ns=Set.new) - identity = md[1] - close = md[2] - identity =~ IDENTITY - name = $1 - raise REXML::ParseException.new("DOCTYPE is missing a name") if name.nil? - pub_sys = $2.nil? ? nil : $2.strip - long_name = $4.nil? ? nil : $4.strip - uri = $6.nil? ? 
nil : $6.strip - args = [ :start_doctype, name, pub_sys, long_name, uri ] - if close == ">" + name = parse_name(base_error_message) + if @source.match(/\A\s*\[/um, true) + id = [nil, nil, nil] + @document_status = :in_doctype + elsif @source.match(/\A\s*>/um, true) + id = [nil, nil, nil] @document_status = :after_doctype - @source.read if @source.buffer.size<2 - md = @source.match(/^\s*/um, true) - @stack << [ :end_doctype ] else - @document_status = :in_doctype + id = parse_id(base_error_message, + accept_external_id: true, + accept_public_id: false) + if id[0] == "SYSTEM" + # For backward compatibility + id[1], id[2] = id[2], nil + end + if @source.match(/\A\s*\[/um, true) + @document_status = :in_doctype + elsif @source.match(/\A\s*>/um, true) + @document_status = :after_doctype + else + message = "#{base_error_message}: garbage after external ID" + raise REXML::ParseException.new(message, @source) + end + end + args = [:start_doctype, name, *id] + if @document_status == :after_doctype + @source.match(/\A\s*/um, true) + @stack << [ :end_doctype ] end return args when /^\s+/ @@ -313,27 +325,24 @@ def pull_event end return [ :attlistdecl, element, pairs, contents ] when NOTATIONDECL_START - md = nil - if @source.match( PUBLIC ) - md = @source.match( PUBLIC, true ) - pubid = system = nil - pubid_literal = md[3] - pubid = pubid_literal[1..-2] if pubid_literal # Remove quote - system_literal = md[4] - system = system_literal[1..-2] if system_literal # Remove quote - vals = [md[1], md[2], pubid, system] - elsif @source.match( SYSTEM ) - md = @source.match( SYSTEM, true ) - system = nil - system_literal = md[3] - system = system_literal[1..-2] if system_literal # Remove quote - vals = [md[1], md[2], nil, system] - else - details = notation_decl_invalid_details - message = "Malformed notation declaration: #{details}" + base_error_message = "Malformed notation declaration" + unless @source.match(/\A\s*<!NOTATION\s+/um, true) + if @source.match(/\A\s*<!NOTATION\s*>/um) + 
message = "#{base_error_message}: name is missing" + else + message = "#{base_error_message}: invalid declaration name" + end + raise REXML::ParseException.new(message, @source) + end + name = parse_name(base_error_message) + id = parse_id(base_error_message, + accept_external_id: true, + accept_public_id: true) + unless @source.match(/\A\s*>/um, true) + message = "#{base_error_message}: garbage before end >" raise REXML::ParseException.new(message, @source) end - return [ :notationdecl, *vals ] + return [:notationdecl, name, *id] when DOCTYPE_END @document_status = :after_doctype @source.match( DOCTYPE_END, true ) @@ -488,6 +497,85 @@ def need_source_encoding_update?(xml_declaration_encoding) true end + def parse_name(base_error_message) + md = @source.match(/\A\s*#{NAME}/um, true) + unless md + if @source.match(/\A\s*\S/um) + message = "#{base_error_message}: invalid name" + else + message = "#{base_error_message}: name is missing" + end + raise REXML::ParseException.new(message, @source) + end + md[1] + end + + def parse_id(base_error_message, + accept_external_id:, + accept_public_id:) + if accept_external_id and (md = @source.match(EXTERNAL_ID_PUBLIC, true)) + pubid = system = nil + pubid_literal = md[1] + pubid = pubid_literal[1..-2] if pubid_literal # Remove quote + system_literal = md[2] + system = system_literal[1..-2] if system_literal # Remove quote + ["PUBLIC", pubid, system] + elsif accept_public_id and (md = @source.match(PUBLIC_ID, true)) + pubid = system = nil + pubid_literal = md[1] + pubid = pubid_literal[1..-2] if pubid_literal # Remove quote + ["PUBLIC", pubid, nil] + elsif accept_external_id and (md = @source.match(EXTERNAL_ID_SYSTEM, true)) + system = nil + system_literal = md[1] + system = system_literal[1..-2] if system_literal # Remove quote + ["SYSTEM", nil, system] + else + details = parse_id_invalid_details(accept_external_id: accept_external_id, + accept_public_id: accept_public_id) + message = "#{base_error_message}: #{details}" + 
raise REXML::ParseException.new(message, @source) + end + end + + def parse_id_invalid_details(accept_external_id:, + accept_public_id:) + public = /\A\s*PUBLIC/um + system = /\A\s*SYSTEM/um + if (accept_external_id or accept_public_id) and @source.match(/#{public}/um) + if @source.match(/#{public}(?:\s+[^'"]|\s*[\[>])/um) + return "public ID literal is missing" + end + unless @source.match(/#{public}\s+#{PUBIDLITERAL}/um) + return "invalid public ID literal" + end + if accept_public_id + if @source.match(/#{public}\s+#{PUBIDLITERAL}\s+[^'"]/um) + return "system ID literal is missing" + end + unless @source.match(/#{public}\s+#{PUBIDLITERAL}\s+#{SYSTEMLITERAL}/um) + return "invalid system literal" + end + "garbage after system literal" + else + "garbage after public ID literal" + end + elsif accept_external_id and @source.match(/#{system}/um) + if @source.match(/#{system}(?:\s+[^'"]|\s*[\[>])/um) + return "system literal is missing" + end + unless @source.match(/#{system}\s+#{SYSTEMLITERAL}/um) + return "invalid system literal" + end + "garbage after system literal" + else + unless @source.match(/\A\s*(?:PUBLIC|SYSTEM)\s/um) + return "invalid ID type" + end + "ID type is missing" + end + end + def process_instruction match_data = @source.match(INSTRUCTION_PATTERN, true) unless match_data @@ -580,42 +668,6 @@ def parse_attributes(prefixes, curr_ns) end return attributes, closed end - - def notation_decl_invalid_details - name = /#{NOTATIONDECL_START}\s+#{NAME}/um - public = /#{name}\s+PUBLIC/um - system = /#{name}\s+SYSTEM/um - if @source.match(/#{NOTATIONDECL_START}\s*>/um) - return "name is missing" - elsif not @source.match(/#{name}[\s>]/um) - return "invalid name" - elsif @source.match(/#{name}\s*>/um) - return "ID type is missing" - elsif not @source.match(/#{name}\s+(?:PUBLIC|SYSTEM)[\s>]/um) - return "invalid ID type" - elsif @source.match(/#{public}/um) - if @source.match(/#{public}\s*>/um) - return "public ID literal is missing" - elsif not 
@source.match(/#{public}\s+#{PUBIDLITERAL}/um) - return "invalid public ID literal" - elsif @source.match(/#{public}\s+#{PUBIDLITERAL}[^\s>]/um) - return "garbage after public ID literal" - elsif not @source.match(/#{public}\s+#{PUBIDLITERAL}\s+#{SYSTEMLITERAL}/um) - return "invalid system literal" - elsif not @source.match(/#{public}\s+#{PUBIDLITERAL}\s+#{SYSTEMLITERAL}\s*>/um) - return "garbage after system literal" - end - elsif @source.match(/#{system}/um) - if @source.match(/#{system}\s*>/um) - return "system literal is missing" - elsif not @source.match(/#{system}\s+#{SYSTEMLITERAL}/um) - return "invalid system literal" - elsif not @source.match(/#{system}\s+#{SYSTEMLITERAL}\s*>/um) - return "garbage after system literal" - end - end - "end > is missing" - end end end end
test/parse/test_document_type_declaration.rb+186 −7 modified@@ -5,17 +5,187 @@ module REXMLTests class TestParseDocumentTypeDeclaration < Test::Unit::TestCase private - def xml(internal_subset) - <<-XML -<!DOCTYPE r SYSTEM "urn:x-rexml:test" [ -#{internal_subset} -]> + def parse(doctype) + REXML::Document.new(<<-XML).doctype +#{doctype} <r/> XML end - def parse(internal_subset) - REXML::Document.new(xml(internal_subset)).doctype + class TestName < self + def test_valid + doctype = parse(<<-DOCTYPE) +<!DOCTYPE r> + DOCTYPE + assert_equal("r", doctype.name) + end + + def test_garbage_plus_before_name_at_line_start + exception = assert_raise(REXML::ParseException) do + parse(<<-DOCTYPE) +<!DOCTYPE + +r SYSTEM "urn:x-rexml:test" [ +]> + DOCTYPE + end + assert_equal(<<-DETAIL.chomp, exception.to_s) +Malformed DOCTYPE: invalid name +Line: 5 +Position: 51 +Last 80 unconsumed characters: ++ r SYSTEM "urn:x-rexml:test" [ ]> <r/> + DETAIL + end + end + + class TestExternalID < self + class TestSystem < self + def test_left_bracket_in_system_literal + doctype = parse(<<-DOCTYPE) +<!DOCTYPE r SYSTEM "urn:x-rexml:[test" [ +]> + DOCTYPE + assert_equal([ + "r", + "SYSTEM", + nil, + "urn:x-rexml:[test", + ], + [ + doctype.name, + doctype.external_id, + doctype.public, + doctype.system, + ]) + end + + def test_greater_than_in_system_literal + doctype = parse(<<-DOCTYPE) +<!DOCTYPE r SYSTEM "urn:x-rexml:>test" [ +]> + DOCTYPE + assert_equal([ + "r", + "SYSTEM", + nil, + "urn:x-rexml:>test", + ], + [ + doctype.name, + doctype.external_id, + doctype.public, + doctype.system, + ]) + end + + def test_no_literal + exception = assert_raise(REXML::ParseException) do + parse(<<-DOCTYPE) +<!DOCTYPE r SYSTEM> + DOCTYPE + end + assert_equal(<<-DETAIL.chomp, exception.to_s) +Malformed DOCTYPE: system literal is missing +Line: 3 +Position: 26 +Last 80 unconsumed characters: + SYSTEM> <r/> + DETAIL + end + + def test_garbage_after_literal + exception = assert_raise(REXML::ParseException) do + 
parse(<<-DOCTYPE) +<!DOCTYPE r SYSTEM 'r.dtd'x'> + DOCTYPE + end + assert_equal(<<-DETAIL.chomp, exception.to_s) +Malformed DOCTYPE: garbage after external ID +Line: 3 +Position: 36 +Last 80 unconsumed characters: +x'> <r/> + DETAIL + end + + def test_single_quote + doctype = parse(<<-DOCTYPE) +<!DOCTYPE r SYSTEM 'r".dtd'> + DOCTYPE + assert_equal("r\".dtd", doctype.system) + end + + def test_double_quote + doctype = parse(<<-DOCTYPE) +<!DOCTYPE r SYSTEM "r'.dtd"> + DOCTYPE + assert_equal("r'.dtd", doctype.system) + end + end + + class TestPublic < self + class TestPublicIDLiteral < self + def test_content_double_quote + exception = assert_raise(REXML::ParseException) do + parse(<<-DOCTYPE) +<!DOCTYPE r PUBLIC 'double quote " is invalid' "r.dtd"> + DOCTYPE + end + assert_equal(<<-DETAIL.chomp, exception.to_s) +Malformed DOCTYPE: invalid public ID literal +Line: 3 +Position: 62 +Last 80 unconsumed characters: + PUBLIC 'double quote " is invalid' "r.dtd"> <r/> + DETAIL + end + + def test_single_quote + doctype = parse(<<-DOCTYPE) +<!DOCTYPE r PUBLIC 'public-id-literal' "r.dtd"> + DOCTYPE + assert_equal("public-id-literal", doctype.public) + end + + def test_double_quote + doctype = parse(<<-DOCTYPE) +<!DOCTYPE r PUBLIC "public'-id-literal" "r.dtd"> + DOCTYPE + assert_equal("public'-id-literal", doctype.public) + end + end + + class TestSystemLiteral < self + def test_garbage_after_literal + exception = assert_raise(REXML::ParseException) do + parse(<<-DOCTYPE) +<!DOCTYPE r PUBLIC 'public-id-literal' 'system-literal'x'> + DOCTYPE + end + assert_equal(<<-DETAIL.chomp, exception.to_s) +Malformed DOCTYPE: garbage after external ID +Line: 3 +Position: 65 +Last 80 unconsumed characters: +x'> <r/> + DETAIL + end + + def test_single_quote + doctype = parse(<<-DOCTYPE) +<!DOCTYPE r PUBLIC "public-id-literal" 'system"-literal'> + DOCTYPE + assert_equal("system\"-literal", doctype.system) + end + + def test_double_quote + doctype = parse(<<-DOCTYPE) +<!DOCTYPE r PUBLIC 
"public-id-literal" "system'-literal"> + DOCTYPE + assert_equal("system'-literal", doctype.system) + end + end + end end class TestMixed < self @@ -45,6 +215,15 @@ def test_notation_attlist assert_equal([REXML::NotationDecl, REXML::AttlistDecl], doctype.children.collect(&:class)) end + + private + def parse(internal_subset) + super(<<-DOCTYPE) +<!DOCTYPE r SYSTEM "urn:x-rexml:test" [ +#{internal_subset} +]> + DOCTYPE + end end end end
test/parse/test_notation_declaration.rb+14 −14 modified@@ -50,7 +50,7 @@ def test_invalid_name Line: 5 Position: 74 Last 80 unconsumed characters: - <!NOTATION '> ]> <r/> +'> ]> <r/> DETAIL end @@ -61,11 +61,11 @@ def test_no_id_type INTERNAL_SUBSET end assert_equal(<<-DETAIL.chomp, exception.to_s) -Malformed notation declaration: ID type is missing +Malformed notation declaration: invalid ID type Line: 5 Position: 77 Last 80 unconsumed characters: - <!NOTATION name> ]> <r/> +> ]> <r/> DETAIL end @@ -80,7 +80,7 @@ def test_invalid_id_type Line: 5 Position: 85 Last 80 unconsumed characters: - <!NOTATION name INVALID> ]> <r/> + INVALID> ]> <r/> DETAIL end end @@ -98,7 +98,7 @@ def test_no_literal Line: 5 Position: 84 Last 80 unconsumed characters: - <!NOTATION name SYSTEM> ]> <r/> + SYSTEM> ]> <r/> DETAIL end @@ -109,11 +109,11 @@ def test_garbage_after_literal INTERNAL_SUBSET end assert_equal(<<-DETAIL.chomp, exception.to_s) -Malformed notation declaration: garbage after system literal +Malformed notation declaration: garbage before end > Line: 5 Position: 103 Last 80 unconsumed characters: - <!NOTATION name SYSTEM 'system-literal'x'> ]> <r/> +x'> ]> <r/> DETAIL end @@ -145,7 +145,7 @@ def test_content_double_quote Line: 5 Position: 129 Last 80 unconsumed characters: - <!NOTATION name PUBLIC 'double quote " is invalid' "system-literal"> ]> <r/> + PUBLIC 'double quote " is invalid' "system-literal"> ]> <r/> DETAIL end @@ -172,11 +172,11 @@ def test_garbage_after_literal INTERNAL_SUBSET end assert_equal(<<-DETAIL.chomp, exception.to_s) -Malformed notation declaration: garbage after system literal +Malformed notation declaration: garbage before end > Line: 5 Position: 123 Last 80 unconsumed characters: - <!NOTATION name PUBLIC 'public-id-literal' 'system-literal'x'> ]> <r/> +x'> ]> <r/> DETAIL end @@ -229,7 +229,7 @@ def test_no_literal Line: 5 Position: 84 Last 80 unconsumed characters: - <!NOTATION name PUBLIC> ]> <r/> + PUBLIC> ]> <r/> DETAIL end @@ -244,7 +244,7 @@ 
def test_literal_content_double_quote Line: 5 Position: 128 Last 80 unconsumed characters: - <!NOTATION name PUBLIC 'double quote \" is invalid in PubidLiteral'> ]> <r/> + PUBLIC 'double quote \" is invalid in PubidLiteral'> ]> <r/> DETAIL end @@ -255,11 +255,11 @@ def test_garbage_after_literal INTERNAL_SUBSET end assert_equal(<<-DETAIL.chomp, exception.to_s) -Malformed notation declaration: garbage after public ID literal +Malformed notation declaration: garbage before end > Line: 5 Position: 106 Last 80 unconsumed characters: - <!NOTATION name PUBLIC 'public-id-literal'x'> ]> <r/> +x'> ]> <r/> DETAIL end
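The stricter checks in this commit follow the XML 1.0 grammar for external IDs: `SYSTEM` takes one system literal, `PUBLIC` takes a public ID literal followed by a system literal, and the public ID literal may only contain a restricted character set (which excludes the double quote). The following is a simplified, standalone sketch of that grammar and is not REXML's implementation; the names `SYSTEM_LITERAL`, `PUBID_LITERAL`, and `parse_external_id` are hypothetical, and the input is assumed to be only the external ID portion of a declaration.

```ruby
# Quoted system literal: any characters except the delimiting quote.
SYSTEM_LITERAL = /"[^"]*"|'[^']*'/

# Quoted public ID literal: restricted to the PubidChar set of XML 1.0.
# The apostrophe is only allowed inside a double-quoted literal.
PUBID_LITERAL = %r{"[\x20\r\na-zA-Z0-9\-'()+,./:=?;!*\#@$_%]*"|'[\x20\r\na-zA-Z0-9\-()+,./:=?;!*\#@$_%]*'}

# Returns [id_type, public_id, system_id] with quotes removed,
# or nil when the text is not a well-formed external ID.
def parse_external_id(text)
  case text
  when /\A\s*SYSTEM\s+(#{SYSTEM_LITERAL})\s*\z/
    ["SYSTEM", nil, $1[1..-2]]
  when /\A\s*PUBLIC\s+(#{PUBID_LITERAL})\s+(#{SYSTEM_LITERAL})\s*\z/
    ["PUBLIC", $1[1..-2], $2[1..-2]]
  end
end

p parse_external_id(%q(SYSTEM "urn:x-rexml:>test"))
p parse_external_id(%q(PUBLIC 'double quote " is invalid' "r.dtd"))
p parse_external_id(%q(SYSTEM 'r.dtd'x'))
```

The inputs mirror cases from the commit's tests: a `>` inside a system literal is legal, while a double quote inside a public ID literal or trailing garbage after the literal is not.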
f9d88e4948b4 Fix a bug that invalid document declaration may be generated
2 files changed · +155 −35
lib/rexml/doctype.rb+50 −35 modified@@ -7,6 +7,44 @@ require_relative 'xmltokens' module REXML + class ReferenceWriter + def initialize(id_type, + public_id_literal, + system_literal, + context=nil) + @id_type = id_type + @public_id_literal = public_id_literal + @system_literal = system_literal + if context and context[:prologue_quote] == :apostrophe + @default_quote = "'" + else + @default_quote = "\"" + end + end + + def write(output) + output << " #{@id_type}" + if @public_id_literal + if @public_id_literal.include?("'") + quote = "\"" + else + quote = @default_quote + end + output << " #{quote}#{@public_id_literal}#{quote}" + end + if @system_literal + if @system_literal.include?("'") + quote = "\"" + elsif @system_literal.include?("\"") + quote = "'" + else + quote = @default_quote + end + output << " #{quote}#{@system_literal}#{quote}" + end + end + end + # Represents an XML DOCTYPE declaration; that is, the contents of <!DOCTYPE # ... >. DOCTYPES can be used to declare the DTD of a document, as well as # being used to declare entities used in the document. @@ -110,19 +148,17 @@ def clone # Ignored def write( output, indent=0, transitive=false, ie_hack=false ) f = REXML::Formatters::Default.new - c = context - if c and c[:prologue_quote] == :apostrophe - quote = "'" - else - quote = "\"" - end indent( output, indent ) output << START output << ' ' output << @name - output << " #{@external_id}" if @external_id - output << " #{quote}#{@long_name}#{quote}" if @long_name - output << " #{quote}#{@uri}#{quote}" if @uri + if @external_id + reference_writer = ReferenceWriter.new(@external_id, + @long_name, + @uri, + context) + reference_writer.write(output) + end unless @children.empty? 
output << ' [' @children.each { |child| @@ -252,32 +288,11 @@ def initialize name, middle, pub, sys end def to_s - c = nil - c = parent.context if parent - if c and c[:prologue_quote] == :apostrophe - default_quote = "'" - else - default_quote = "\"" - end - notation = "<!NOTATION #{@name} #{@middle}" - if @public - if @public.include?("'") - quote = "\"" - else - quote = default_quote - end - notation << " #{quote}#{@public}#{quote}" - end - if @system - if @system.include?("'") - quote = "\"" - elsif @system.include?("\"") - quote = "'" - else - quote = default_quote - end - notation << " #{quote}#{@system}#{quote}" - end + context = nil + context = parent.context if parent + notation = "<!NOTATION #{@name}" + reference_writer = ReferenceWriter.new(@middle, @public, @system, context) + reference_writer.write(notation) notation << ">" notation end
test/test_doctype.rb+105 −0 modified@@ -77,6 +77,111 @@ def test_notations end end + class TestDocType < Test::Unit::TestCase + class TestExternalID < self + class TestSystem < self + class TestSystemLiteral < self + def test_to_s + doctype = REXML::DocType.new(["root", "SYSTEM", nil, "root.dtd"]) + assert_equal("<!DOCTYPE root SYSTEM \"root.dtd\">", + doctype.to_s) + end + + def test_to_s_apostrophe + doctype = REXML::DocType.new(["root", "SYSTEM", nil, "root.dtd"]) + doc = REXML::Document.new + doc << doctype + doctype.parent.context[:prologue_quote] = :apostrophe + assert_equal("<!DOCTYPE root SYSTEM 'root.dtd'>", + doctype.to_s) + end + + def test_to_s_single_quote_apostrophe + doctype = REXML::DocType.new(["root", "SYSTEM", nil, "root'.dtd"]) + doc = REXML::Document.new + doc << doctype + # This isn't used. + doctype.parent.context[:prologue_quote] = :apostrophe + assert_equal("<!DOCTYPE root SYSTEM \"root'.dtd\">", + doctype.to_s) + end + + def test_to_s_double_quote + doctype = REXML::DocType.new(["root", "SYSTEM", nil, "root\".dtd"]) + doc = REXML::Document.new + doc << doctype + # This isn't used. + doctype.parent.context[:prologue_quote] = :apostrophe + assert_equal("<!DOCTYPE root SYSTEM 'root\".dtd'>", + doctype.to_s) + end + end + end + + class TestPublic < self + class TestPublicIDLiteral < self + def test_to_s + doctype = REXML::DocType.new(["root", "PUBLIC", "pub", "root.dtd"]) + assert_equal("<!DOCTYPE root PUBLIC \"pub\" \"root.dtd\">", + doctype.to_s) + end + + def test_to_s_apostrophe + doctype = REXML::DocType.new(["root", "PUBLIC", "pub", "root.dtd"]) + doc = REXML::Document.new + doc << doctype + doctype.parent.context[:prologue_quote] = :apostrophe + assert_equal("<!DOCTYPE root PUBLIC 'pub' 'root.dtd'>", + doctype.to_s) + end + + def test_to_s_apostrophe_include_apostrophe + doctype = REXML::DocType.new(["root", "PUBLIC", "pub'", "root.dtd"]) + doc = REXML::Document.new + doc << doctype + # This isn't used. 
+ doctype.parent.context[:prologue_quote] = :apostrophe + assert_equal("<!DOCTYPE root PUBLIC \"pub'\" 'root.dtd'>", + doctype.to_s) + end + end + + class TestSystemLiteral < self + def test_to_s + doctype = REXML::DocType.new(["root", "PUBLIC", "pub", "root.dtd"]) + assert_equal("<!DOCTYPE root PUBLIC \"pub\" \"root.dtd\">", + doctype.to_s) + end + + def test_to_s_apostrophe + doctype = REXML::DocType.new(["root", "PUBLIC", "pub", "root.dtd"]) + doc = REXML::Document.new + doc << doctype + doctype.parent.context[:prologue_quote] = :apostrophe + assert_equal("<!DOCTYPE root PUBLIC 'pub' 'root.dtd'>", + doctype.to_s) + end + + def test_to_s_apostrophe_include_apostrophe + doctype = REXML::DocType.new(["root", "PUBLIC", "pub", "root'.dtd"]) + doc = REXML::Document.new + doc << doctype + # This isn't used. + doctype.parent.context[:prologue_quote] = :apostrophe + assert_equal("<!DOCTYPE root PUBLIC 'pub' \"root'.dtd\">", + doctype.to_s) + end + + def test_to_s_double_quote + doctype = REXML::DocType.new(["root", "PUBLIC", "pub", "root\".dtd"]) + assert_equal("<!DOCTYPE root PUBLIC \"pub\" 'root\".dtd'>", + doctype.to_s) + end + end + end + end + end + class TestNotationDeclPublic < Test::Unit::TestCase def setup @name = "vrml"
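The quote-selection rule that this commit applies when serializing a system literal can be restated as a small standalone function. This is a sketch for illustration, not the gem's code: `quote_system_literal` is a hypothetical name, and the default quote stands in for the `:prologue_quote` context setting.

```ruby
# Wrap a system literal in a quote character that does not occur in it.
# Falls back to default_quote when the literal contains neither quote.
def quote_system_literal(literal, default_quote = "\"")
  quote =
    if literal.include?("'")
      "\""            # an apostrophe inside forces double quotes
    elsif literal.include?("\"")
      "'"             # a double quote inside forces apostrophes
    else
      default_quote   # otherwise honor the configured preference
    end
  "#{quote}#{literal}#{quote}"
end

puts quote_system_literal("root.dtd")
puts quote_system_literal("root'.dtd", "'")
puts quote_system_literal("root\".dtd")
```

A literal that contains both quote characters cannot be represented with this rule; such a value would also not be a legal system literal in XML.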
f7bab8937513 Fix a bug that invalid element end may be accepted
2 files changed · +14 −1
lib/rexml/parsers/baseparser.rb +1 −1 modified
@@ -62,7 +62,7 @@ class BaseParser
       INSTRUCTION_START = /\A<\?/u
       INSTRUCTION_PATTERN = /<\?#{NAME}(\s+.*?)?\?>/um
       TAG_MATCH = /\A<((?>#{QNAME_STR}))/um
-      CLOSE_MATCH = /^\s*<\/(#{QNAME_STR})\s*>/um
+      CLOSE_MATCH = /\A\s*<\/(#{QNAME_STR})\s*>/um
 
       VERSION = /\bversion\s*=\s*["'](.*?)['"]/um
       ENCODING = /\bencoding\s*=\s*["'](.*?)['"]/um
test/parse/test_element.rb+13 −0 modified@@ -59,6 +59,19 @@ def test_garbage_less_than_before_root_element_at_line_start < <x/> DETAIL end + + def test_garbage_less_than_slash_before_end_tag_at_line_start + exception = assert_raise(REXML::ParseException) do + parse("<x></\n</x>") + end + assert_equal(<<-DETAIL.chomp, exception.to_s) +Missing end tag for 'x' +Line: 2 +Position: 10 +Last 80 unconsumed characters: +</ </x> + DETAIL + end end end end
6a250d2cd119 Fix a bug that invalid element start may be accepted
2 files changed · +14 −1
lib/rexml/parsers/baseparser.rb +1 −1 modified
@@ -61,7 +61,7 @@ class BaseParser
       XMLDECL_PATTERN = /<\?xml\s+(.*?)\?>/um
       INSTRUCTION_START = /\A<\?/u
       INSTRUCTION_PATTERN = /<\?#{NAME}(\s+.*?)?\?>/um
-      TAG_MATCH = /^<((?>#{QNAME_STR}))/um
+      TAG_MATCH = /\A<((?>#{QNAME_STR}))/um
       CLOSE_MATCH = /^\s*<\/(#{QNAME_STR})\s*>/um
 
       VERSION = /\bversion\s*=\s*["'](.*?)['"]/um
test/parse/test_element.rb+13 −0 modified@@ -46,6 +46,19 @@ def test_empty_namespace_attribute_name DETAIL end + + def test_garbage_less_than_before_root_element_at_line_start + exception = assert_raise(REXML::ParseException) do + parse("<\n<x/>") + end + assert_equal(<<-DETAIL.chomp, exception.to_s) +malformed XML: missing tag start +Line: 2 +Position: 6 +Last 80 unconsumed characters: +< <x/> + DETAIL + end end end end
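The one-character change from `^` to `\A` in these patterns matters because, in Ruby regular expressions, `^` matches at the start of every line while `\A` matches only at the start of the string. The sketch below illustrates the difference using the input from the test above; the patterns are simplified stand-ins (`\w+` replaces `QNAME_STR`), and the constant names are hypothetical.

```ruby
# Simplified stand-ins for the old and new TAG_MATCH patterns.
LINE_ANCHORED   = /^<(\w+)/   # "^" also matches right after a newline
STRING_ANCHORED = /\A<(\w+)/  # "\A" matches only at the very beginning

GARBAGE_INPUT = "<\n<x/>"

# With "^", the stray "<" on the first line is skipped and the tag on the
# second line matches, so the garbage is silently ignored.
p GARBAGE_INPUT =~ LINE_ANCHORED

# With "\A", nothing matches, which lets the parser report an error.
p GARBAGE_INPUT =~ STRING_ANCHORED
```

The first expression yields the offset of the second `<` (index 2), while the second yields `nil`, which corresponds to the "malformed XML: missing tag start" error expected by the test.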
2fe62e29094d Fix a bug that invalid notation declaration may be accepted
2 files changed · +234 −6
lib/rexml/parsers/baseparser.rb +53 −6 modified
@@ -83,9 +83,6 @@ class BaseParser
       ATTDEF_RE = /#{ATTDEF}/
       ATTLISTDECL_START = /\A\s*<!ATTLIST/um
       ATTLISTDECL_PATTERN = /\A\s*<!ATTLIST\s+#{NAME}(?:#{ATTDEF})*\s*>/um
-      NOTATIONDECL_START = /\A\s*<!NOTATION/um
-      PUBLIC = /\A\s*<!NOTATION\s+(\w[\-\w]*)\s+(PUBLIC)\s+(["'])(.*?)\3(?:\s+(["'])(.*?)\5)?\s*>/um
-      SYSTEM = /\A\s*<!NOTATION\s+(\w[\-\w]*)\s+(SYSTEM)\s+(["'])(.*?)\3\s*>/um

       TEXT_PATTERN = /\A([^<]*)/um

@@ -103,6 +100,10 @@ class BaseParser
       GEDECL = "<!ENTITY\\s+#{NAME}\\s+#{ENTITYDEF}\\s*>"
       ENTITYDECL = /\s*(?:#{GEDECL})|(?:#{PEDECL})/um

+      NOTATIONDECL_START = /\A\s*<!NOTATION/um
+      PUBLIC = /\A\s*<!NOTATION\s+#{NAME}\s+(PUBLIC)\s+#{PUBIDLITERAL}(?:\s+#{SYSTEMLITERAL})?\s*>/um
+      SYSTEM = /\A\s*<!NOTATION\s+#{NAME}\s+(SYSTEM)\s+#{SYSTEMLITERAL}\s*>/um
+
       EREFERENCE = /&(?!#{NAME};)/

       DEFAULT_ENTITIES = {
@@ -315,12 +316,22 @@ def pull_event
               md = nil
               if @source.match( PUBLIC )
                 md = @source.match( PUBLIC, true )
-                vals = [md[1],md[2],md[4],md[6]]
+                pubid = system = nil
+                pubid_literal = md[3]
+                pubid = pubid_literal[1..-2] if pubid_literal # Remove quote
+                system_literal = md[4]
+                system = system_literal[1..-2] if system_literal # Remove quote
+                vals = [md[1], md[2], pubid, system]
               elsif @source.match( SYSTEM )
                 md = @source.match( SYSTEM, true )
-                vals = [md[1],md[2],nil,md[4]]
+                system = nil
+                system_literal = md[3]
+                system = system_literal[1..-2] if system_literal # Remove quote
+                vals = [md[1], md[2], nil, system]
               else
-                raise REXML::ParseException.new( "error parsing notation: no matching pattern", @source )
+                details = notation_decl_invalid_details
+                message = "Malformed notation declaration: #{details}"
+                raise REXML::ParseException.new(message, @source)
               end
               return [ :notationdecl, *vals ]
             when DOCTYPE_END
@@ -569,6 +580,42 @@ def parse_attributes(prefixes, curr_ns)
         end
         return attributes, closed
       end
+
+      def notation_decl_invalid_details
+        name = /#{NOTATIONDECL_START}\s+#{NAME}/um
+        public = /#{name}\s+PUBLIC/um
+        system = /#{name}\s+SYSTEM/um
+        if @source.match(/#{NOTATIONDECL_START}\s*>/um)
+          return "name is missing"
+        elsif not @source.match(/#{name}[\s>]/um)
+          return "invalid name"
+        elsif @source.match(/#{name}\s*>/um)
+          return "ID type is missing"
+        elsif not @source.match(/#{name}\s+(?:PUBLIC|SYSTEM)[\s>]/um)
+          return "invalid ID type"
+        elsif @source.match(/#{public}/um)
+          if @source.match(/#{public}\s*>/um)
+            return "public ID literal is missing"
+          elsif not @source.match(/#{public}\s+#{PUBIDLITERAL}/um)
+            return "invalid public ID literal"
+          elsif @source.match(/#{public}\s+#{PUBIDLITERAL}[^\s>]/um)
+            return "garbage after public ID literal"
+          elsif not @source.match(/#{public}\s+#{PUBIDLITERAL}\s+#{SYSTEMLITERAL}/um)
+            return "invalid system literal"
+          elsif not @source.match(/#{public}\s+#{PUBIDLITERAL}\s+#{SYSTEMLITERAL}\s*>/um)
+            return "garbage after system literal"
+          end
+        elsif @source.match(/#{system}/um)
+          if @source.match(/#{system}\s*>/um)
+            return "system literal is missing"
+          elsif not @source.match(/#{system}\s+#{SYSTEMLITERAL}/um)
+            return "invalid system literal"
+          elsif not @source.match(/#{system}\s+#{SYSTEMLITERAL}\s*>/um)
+            return "garbage after system literal"
+          end
+        end
+        "end > is missing"
+      end
     end
   end
 end
test/parse/test_notation_declaration.rb +181 −0 modified
@@ -23,10 +23,100 @@ def test_name
         doctype = parse("<!NOTATION name PUBLIC 'urn:public-id'>")
         assert_equal("name", doctype.notation("name").name)
       end
+
+      def test_no_name
+        exception = assert_raise(REXML::ParseException) do
+          parse(<<-INTERNAL_SUBSET)
+<!NOTATION>
+          INTERNAL_SUBSET
+        end
+        assert_equal(<<-DETAIL.chomp, exception.to_s)
+Malformed notation declaration: name is missing
+Line: 5
+Position: 72
+Last 80 unconsumed characters:
+ <!NOTATION> ]> <r/>
+        DETAIL
+      end
+
+      def test_invalid_name
+        exception = assert_raise(REXML::ParseException) do
+          parse(<<-INTERNAL_SUBSET)
+<!NOTATION '>
+          INTERNAL_SUBSET
+        end
+        assert_equal(<<-DETAIL.chomp, exception.to_s)
+Malformed notation declaration: invalid name
+Line: 5
+Position: 74
+Last 80 unconsumed characters:
+ <!NOTATION '> ]> <r/>
+        DETAIL
+      end
+
+      def test_no_id_type
+        exception = assert_raise(REXML::ParseException) do
+          parse(<<-INTERNAL_SUBSET)
+<!NOTATION name>
+          INTERNAL_SUBSET
+        end
+        assert_equal(<<-DETAIL.chomp, exception.to_s)
+Malformed notation declaration: ID type is missing
+Line: 5
+Position: 77
+Last 80 unconsumed characters:
+ <!NOTATION name> ]> <r/>
+        DETAIL
+      end
+
+      def test_invalid_id_type
+        exception = assert_raise(REXML::ParseException) do
+          parse(<<-INTERNAL_SUBSET)
+<!NOTATION name INVALID>
+          INTERNAL_SUBSET
+        end
+        assert_equal(<<-DETAIL.chomp, exception.to_s)
+Malformed notation declaration: invalid ID type
+Line: 5
+Position: 85
+Last 80 unconsumed characters:
+ <!NOTATION name INVALID> ]> <r/>
+        DETAIL
+      end
     end

     class TestExternalID < self
       class TestSystem < self
+        def test_no_literal
+          exception = assert_raise(REXML::ParseException) do
+            parse(<<-INTERNAL_SUBSET)
+<!NOTATION name SYSTEM>
+            INTERNAL_SUBSET
+          end
+          assert_equal(<<-DETAIL.chomp, exception.to_s)
+Malformed notation declaration: system literal is missing
+Line: 5
+Position: 84
+Last 80 unconsumed characters:
+ <!NOTATION name SYSTEM> ]> <r/>
+          DETAIL
+        end
+
+        def test_garbage_after_literal
+          exception = assert_raise(REXML::ParseException) do
+            parse(<<-INTERNAL_SUBSET)
+<!NOTATION name SYSTEM 'system-literal'x'>
+            INTERNAL_SUBSET
+          end
+          assert_equal(<<-DETAIL.chomp, exception.to_s)
+Malformed notation declaration: garbage after system literal
+Line: 5
+Position: 103
+Last 80 unconsumed characters:
+ <!NOTATION name SYSTEM 'system-literal'x'> ]> <r/>
+          DETAIL
+        end
+
         def test_single_quote
           doctype = parse(<<-INTERNAL_SUBSET)
 <!NOTATION name SYSTEM 'system-literal'>
@@ -44,6 +134,21 @@ def test_double_quote

       class TestPublic < self
         class TestPublicIDLiteral < self
+          def test_content_double_quote
+            exception = assert_raise(REXML::ParseException) do
+              parse(<<-INTERNAL_SUBSET)
+<!NOTATION name PUBLIC 'double quote " is invalid' "system-literal">
+              INTERNAL_SUBSET
+            end
+            assert_equal(<<-DETAIL.chomp, exception.to_s)
+Malformed notation declaration: invalid public ID literal
+Line: 5
+Position: 129
+Last 80 unconsumed characters:
+ <!NOTATION name PUBLIC 'double quote " is invalid' "system-literal"> ]> <r/>
+            DETAIL
+          end
+
           def test_single_quote
             doctype = parse(<<-INTERNAL_SUBSET)
 <!NOTATION name PUBLIC 'public-id-literal' "system-literal">
@@ -60,6 +165,21 @@ def test_double_quote
         end

         class TestSystemLiteral < self
+          def test_garbage_after_literal
+            exception = assert_raise(REXML::ParseException) do
+              parse(<<-INTERNAL_SUBSET)
+<!NOTATION name PUBLIC 'public-id-literal' 'system-literal'x'>
+              INTERNAL_SUBSET
+            end
+            assert_equal(<<-DETAIL.chomp, exception.to_s)
+Malformed notation declaration: garbage after system literal
+Line: 5
+Position: 123
+Last 80 unconsumed characters:
+ <!NOTATION name PUBLIC 'public-id-literal' 'system-literal'x'> ]> <r/>
+            DETAIL
+          end
+
           def test_single_quote
             doctype = parse(<<-INTERNAL_SUBSET)
 <!NOTATION name PUBLIC "public-id-literal" 'system-literal'>
@@ -96,5 +216,66 @@ def test_public_system
           end
         end
       end
+
+      class TestPublicID < self
+        def test_no_literal
+          exception = assert_raise(REXML::ParseException) do
+            parse(<<-INTERNAL_SUBSET)
+<!NOTATION name PUBLIC>
+            INTERNAL_SUBSET
+          end
+          assert_equal(<<-DETAIL.chomp, exception.to_s)
+Malformed notation declaration: public ID literal is missing
+Line: 5
+Position: 84
+Last 80 unconsumed characters:
+ <!NOTATION name PUBLIC> ]> <r/>
+          DETAIL
+        end
+
+        def test_literal_content_double_quote
+          exception = assert_raise(REXML::ParseException) do
+            parse(<<-INTERNAL_SUBSET)
+<!NOTATION name PUBLIC 'double quote " is invalid in PubidLiteral'>
+            INTERNAL_SUBSET
+          end
+          assert_equal(<<-DETAIL.chomp, exception.to_s)
+Malformed notation declaration: invalid public ID literal
+Line: 5
+Position: 128
+Last 80 unconsumed characters:
+ <!NOTATION name PUBLIC 'double quote \" is invalid in PubidLiteral'> ]> <r/>
+          DETAIL
+        end
+
+        def test_garbage_after_literal
+          exception = assert_raise(REXML::ParseException) do
+            parse(<<-INTERNAL_SUBSET)
+<!NOTATION name PUBLIC 'public-id-literal'x'>
+            INTERNAL_SUBSET
+          end
+          assert_equal(<<-DETAIL.chomp, exception.to_s)
+Malformed notation declaration: garbage after public ID literal
+Line: 5
+Position: 106
+Last 80 unconsumed characters:
+ <!NOTATION name PUBLIC 'public-id-literal'x'> ]> <r/>
+          DETAIL
+        end
+
+        def test_literal_single_quote
+          doctype = parse(<<-INTERNAL_SUBSET)
+<!NOTATION name PUBLIC 'public-id-literal'>
+          INTERNAL_SUBSET
+          assert_equal("public-id-literal", doctype.notation("name").public)
+        end
+
+        def test_literal_double_quote
+          doctype = parse(<<-INTERNAL_SUBSET)
+<!NOTATION name PUBLIC "public-id-literal">
+          INTERNAL_SUBSET
+          assert_equal("public-id-literal", doctype.notation("name").public)
+        end
+      end
     end
   end
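The parser change in lib/rexml/parsers/baseparser.rb replaces the lazy `(["'])(.*?)\3` capture with literal patterns that cannot span a closing quote. The sketch below illustrates that idea in isolation. It is not REXML's code: `NAME_SRC`, `PUBID_LITERAL_SRC`, `SYSTEM_LITERAL_SRC`, and `parse_notation` are hypothetical names, the fragments are simplified from the XML 1.0 grammar (ASCII names only), and the input is assumed to be a single declaration string.

```ruby
# Simplified, hypothetical stand-ins for the parser's NAME, PUBIDLITERAL and
# SYSTEMLITERAL fragments (ASCII-only names; not REXML's actual constants).
NAME_SRC = '[A-Za-z_:][-A-Za-z0-9_:.]*'
# PubidChar from the XML 1.0 grammar, minus the apostrophe.
PUBID_CHAR_SRC = '[-a-zA-Z0-9 \r\n()+,./:=?;!*#@$_%]'
# A double-quoted PubidLiteral may contain apostrophes; a single-quoted one may not.
PUBID_LITERAL_SRC = %Q{(?:"(?:#{PUBID_CHAR_SRC}|')*"|'#{PUBID_CHAR_SRC}*')}
# A SystemLiteral may contain anything except its own delimiter.
SYSTEM_LITERAL_SRC = %q{(?:"[^"]*"|'[^']*')}

PUBLIC_DECL = Regexp.new(
  '\A\s*<!NOTATION\s+(' + NAME_SRC + ')\s+PUBLIC\s+(' + PUBID_LITERAL_SRC +
  ')(?:\s+(' + SYSTEM_LITERAL_SRC + '))?\s*>\s*\z'
)
SYSTEM_DECL = Regexp.new(
  '\A\s*<!NOTATION\s+(' + NAME_SRC + ')\s+SYSTEM\s+(' + SYSTEM_LITERAL_SRC +
  ')\s*>\s*\z'
)

# Returns the declaration's parts with the surrounding quotes removed, or
# raises when the declaration does not match either strict pattern.
def parse_notation(decl)
  if (md = PUBLIC_DECL.match(decl))
    sys_literal = md[3]
    { name: md[1], type: "PUBLIC", public: md[2][1..-2],
      system: sys_literal && sys_literal[1..-2] }
  elsif (md = SYSTEM_DECL.match(decl))
    { name: md[1], type: "SYSTEM", public: nil, system: md[2][1..-2] }
  else
    raise ArgumentError, "Malformed notation declaration: #{decl.strip}"
  end
end
```

Under these assumptions, `<!NOTATION name SYSTEM 'system-literal'x'>` is rejected, whereas a lazy `.*?` between backreferenced quotes would have stretched the captured literal to `system-literal'x`, which cannot be written back out inside apostrophes.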
a659c63e3741 Fix a bug that invalid notation declaration may be generated
2 files changed · +118 −5
lib/rexml/doctype.rb +20 −4 modified
@@ -255,13 +255,29 @@ def to_s
       c = nil
       c = parent.context if parent
       if c and c[:prologue_quote] == :apostrophe
-        quote = "'"
+        default_quote = "'"
       else
-        quote = "\""
+        default_quote = "\""
       end
       notation = "<!NOTATION #{@name} #{@middle}"
-      notation << " #{quote}#{@public}#{quote}" if @public
-      notation << " #{quote}#{@system}#{quote}" if @system
+      if @public
+        if @public.include?("'")
+          quote = "\""
+        else
+          quote = default_quote
+        end
+        notation << " #{quote}#{@public}#{quote}"
+      end
+      if @system
+        if @system.include?("'")
+          quote = "\""
+        elsif @system.include?("\"")
+          quote = "'"
+        else
+          quote = default_quote
+        end
+        notation << " #{quote}#{@system}#{quote}"
+      end
       notation << ">"
       notation
     end
test/test_doctype.rb +98 −1 modified
@@ -89,11 +89,26 @@ def test_to_s
                    decl(@id, nil).to_s)
     end

+    def test_to_s_pubid_literal_include_apostrophe
+      assert_equal("<!NOTATION #{@name} PUBLIC \"#{@id}'\">",
+                   decl("#{@id}'", nil).to_s)
+    end
+
     def test_to_s_with_uri
       assert_equal("<!NOTATION #{@name} PUBLIC \"#{@id}\" \"#{@uri}\">",
                    decl(@id, @uri).to_s)
     end

+    def test_to_s_system_literal_include_apostrophe
+      assert_equal("<!NOTATION #{@name} PUBLIC \"#{@id}\" \"system'literal\">",
+                   decl(@id, "system'literal").to_s)
+    end
+
+    def test_to_s_system_literal_include_double_quote
+      assert_equal("<!NOTATION #{@name} PUBLIC \"#{@id}\" 'system\"literal'>",
+                   decl(@id, "system\"literal").to_s)
+    end
+
     def test_to_s_apostrophe
       document = REXML::Document.new(<<-XML)
       <!DOCTYPE root SYSTEM "urn:x-test:sysid" [
@@ -107,6 +122,49 @@ def test_to_s_apostrophe
                    notation.to_s)
     end

+    def test_to_s_apostrophe_pubid_literal_include_apostrophe
+      document = REXML::Document.new(<<-XML)
+      <!DOCTYPE root SYSTEM "urn:x-test:sysid" [
+        #{decl("#{@id}'", @uri).to_s}
+      ]>
+      <root/>
+      XML
+      # This isn't used for PubidLiteral because PubidChar includes '.
+      document.context[:prologue_quote] = :apostrophe
+      notation = document.doctype.notations[0]
+      assert_equal("<!NOTATION #{@name} PUBLIC \"#{@id}'\" '#{@uri}'>",
+                   notation.to_s)
+    end
+
+    def test_to_s_apostrophe_system_literal_include_apostrophe
+      document = REXML::Document.new(<<-XML)
+      <!DOCTYPE root SYSTEM "urn:x-test:sysid" [
+        #{decl(@id, "system'literal").to_s}
+      ]>
+      <root/>
+      XML
+      # This isn't used for SystemLiteral because SystemLiteral includes '.
+      document.context[:prologue_quote] = :apostrophe
+      notation = document.doctype.notations[0]
+      assert_equal("<!NOTATION #{@name} PUBLIC '#{@id}' \"system'literal\">",
+                   notation.to_s)
+    end
+
+    def test_to_s_apostrophe_system_literal_include_double_quote
+      document = REXML::Document.new(<<-XML)
+      <!DOCTYPE root SYSTEM "urn:x-test:sysid" [
+        #{decl(@id, "system\"literal").to_s}
+      ]>
+      <root/>
+      XML
+      # This isn't used for SystemLiteral because SystemLiteral includes ".
+      # But quoted by ' because SystemLiteral includes ".
+      document.context[:prologue_quote] = :apostrophe
+      notation = document.doctype.notations[0]
+      assert_equal("<!NOTATION #{@name} PUBLIC '#{@id}' 'system\"literal'>",
+                   notation.to_s)
+    end
+
     private
     def decl(id, uri)
       REXML::NotationDecl.new(@name, "PUBLIC", id, uri)
@@ -124,6 +182,16 @@ def test_to_s
                    decl(@id).to_s)
     end

+    def test_to_s_include_apostrophe
+      assert_equal("<!NOTATION #{@name} SYSTEM \"#{@id}'\">",
+                   decl("#{@id}'").to_s)
+    end
+
+    def test_to_s_include_double_quote
+      assert_equal("<!NOTATION #{@name} SYSTEM '#{@id}\"'>",
+                   decl("#{@id}\"").to_s)
+    end
+
     def test_to_s_apostrophe
       document = REXML::Document.new(<<-XML)
       <!DOCTYPE root SYSTEM "urn:x-test:sysid" [
@@ -137,9 +205,38 @@ def test_to_s_apostrophe
                    notation.to_s)
     end

+    def test_to_s_apostrophe_include_apostrophe
+      document = REXML::Document.new(<<-XML)
+      <!DOCTYPE root SYSTEM "urn:x-test:sysid" [
+        #{decl("#{@id}'").to_s}
+      ]>
+      <root/>
+      XML
+      # This isn't used for SystemLiteral because SystemLiteral includes '.
+      document.context[:prologue_quote] = :apostrophe
+      notation = document.doctype.notations[0]
+      assert_equal("<!NOTATION #{@name} SYSTEM \"#{@id}'\">",
+                   notation.to_s)
+    end
+
+    def test_to_s_apostrophe_include_double_quote
+      document = REXML::Document.new(<<-XML)
+      <!DOCTYPE root SYSTEM "urn:x-test:sysid" [
+        #{decl("#{@id}\"").to_s}
+      ]>
+      <root/>
+      XML
+      # This isn't used for SystemLiteral because SystemLiteral includes ".
+      # But quoted by ' because SystemLiteral includes ".
+      document.context[:prologue_quote] = :apostrophe
+      notation = document.doctype.notations[0]
+      assert_equal("<!NOTATION #{@name} SYSTEM '#{@id}\"'>",
+                   notation.to_s)
+    end
+
     private
     def decl(id)
-      REXML::NotationDecl.new(@name, "SYSTEM", id, nil)
+      REXML::NotationDecl.new(@name, "SYSTEM", nil, id)
     end
   end
 end
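The serializer change in lib/rexml/doctype.rb picks a quote character that does not occur inside the literal being written, instead of always using the configured default. A minimal standalone sketch of that selection rule follows. `notation_to_s` is a hypothetical helper written for illustration, not REXML's `NotationDecl#to_s`, and it assumes the literals were already validated on input (a public ID never contains a double quote, and a system literal never contains both quote characters).

```ruby
# Hypothetical helper mirroring the quote-selection rule from the patch.
# default_quote plays the role of the :prologue_quote context setting.
def notation_to_s(name, id_type, public_id, system_id, default_quote = "\"")
  parts = ["<!NOTATION", name, id_type]

  if public_id
    # PubidChar allows an apostrophe but never a double quote, so a double
    # quote is always a safe delimiter when an apostrophe is present.
    quote = public_id.include?("'") ? "\"" : default_quote
    parts << "#{quote}#{public_id}#{quote}"
  end

  if system_id
    # A system literal may contain either quote character, so choose
    # whichever delimiter the value does not contain.
    quote =
      if system_id.include?("'")
        "\""
      elsif system_id.include?("\"")
        "'"
      else
        default_quote
      end
    parts << "#{quote}#{system_id}#{quote}"
  end

  "#{parts.join(' ')}>"
end
```

For example, `notation_to_s("n", "SYSTEM", nil, "sys'lit", "'")` falls back to double quotes even though the default is an apostrophe, which matches the behavior the `test_to_s_apostrophe_include_apostrophe` case in the diff checks for.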
Vulnerability mechanics
Generated on May 9, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.
References (18)
- github.com/advisories/GHSA-8cr8-4vfw-mr7h [ghsa, ADVISORY]
- lists.fedoraproject.org/archives/list/package-announce%40lists.fedoraproject.org/message/WTVFTLFVCSUE5CXHINJEUCKSHU4SWDMT/ [mitre, vendor-advisory, x_refsource_FEDORA]
- nvd.nist.gov/vuln/detail/CVE-2021-28965 [ghsa, ADVISORY]
- github.com/ruby/rexml/commit/2fe62e29094d95921d7e19abbd2e26b23d78dc5b [ghsa, WEB]
- github.com/ruby/rexml/commit/3c137eb119550874b2b3e27d12b733ca67033377 [ghsa, WEB]
- github.com/ruby/rexml/commit/6a250d2cd1194c2be72becbdd9c3e770aa16e752 [ghsa, WEB]
- github.com/ruby/rexml/commit/9b311e59ae05749e082eb6bbefa1cb620d1a786e [ghsa, WEB]
- github.com/ruby/rexml/commit/a659c63e37414506dfb0d4655e031bb7a2e73fc8 [ghsa, WEB]
- github.com/ruby/rexml/commit/f7bab8937513b1403cea5aff874cbf32fd5e8551 [ghsa, WEB]
- github.com/ruby/rexml/commit/f9d88e4948b4a43294c25dc0edb16815bd9d8618 [ghsa, WEB]
- github.com/rubysec/ruby-advisory-db/blob/master/gems/rexml/CVE-2021-28965.yml [ghsa, WEB]
- hackerone.com/reports/1104077 [ghsa, WEB]
- lists.fedoraproject.org/archives/list/package-announce@lists.fedoraproject.org/message/WTVFTLFVCSUE5CXHINJEUCKSHU4SWDMT [ghsa, WEB]
- rubygems.org/gems/rexml [ghsa, WEB]
- security.netapp.com/advisory/ntap-20210528-0003 [ghsa, WEB]
- security.netapp.com/advisory/ntap-20210528-0003/ [mitre, x_refsource_CONFIRM]
- www.ruby-lang.org/en/news/2021/04/05/xml-round-trip-vulnerability-in-rexml-cve-2021-28965 [ghsa, WEB]
- www.ruby-lang.org/en/news/2021/04/05/xml-round-trip-vulnerability-in-rexml-cve-2021-28965/ [mitre, x_refsource_MISC]