Add position check for XML declaration
## Why?
The XML declaration must be the first item in a document.

https://www.w3.org/TR/2006/REC-xml11-20060816/#document

```
[1]   document   ::=   ( prolog element Misc* ) - ( Char* RestrictedChar Char* )
```

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-prolog

```
[22]   prolog   ::=   	XMLDecl Misc* (doctypedecl Misc*)?
```

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-XMLDecl

```
[23]   XMLDecl  ::=   '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
```

See: #161 (comment)
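
The productions above imply that `<?xml ... ?>` is only valid at offset 0 of the document. As a rough illustration of that rule (not the code this commit adds), here is a minimal text-level sketch. The helper name `misplaced_xml_declaration?` is hypothetical, and the check deliberately ignores comments, CDATA sections, and byte-order marks, which a real parser has to handle.

```ruby
# Hypothetical helper, not part of REXML: report whether something that
# looks like an XML declaration appears anywhere other than offset 0.
# This is a plain text scan, so it is only an approximation of the grammar.
def misplaced_xml_declaration?(text)
  # "<?xml" followed by whitespace or "?>" is treated as a declaration.
  # Targets such as "<?xml-stylesheet" are ordinary processing instructions.
  pattern = /<\?xml(?=\s|\?>)/
  offset = 0
  while (index = text.index(pattern, offset))
    return true if index > 0
    offset = index + 1
  end
  false
end

puts misplaced_xml_declaration?('<?xml version="1.0"?><a/>')           # false
puts misplaced_xml_declaration?('<a><?xml version="1.0" ?></a>')        # true
puts misplaced_xml_declaration?(' <?xml version="1.0"?><a/>')           # true
puts misplaced_xml_declaration?('<?xml-stylesheet href="s.css"?><a/>')  # false
```

The third case reflects the `prolog` production: even leading whitespace pushes the declaration off the start of the document.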
naitoh committed Jul 7, 2024
1 parent face9dd commit 53dd46c
Showing 2 changed files with 21 additions and 1 deletion.
5 changes: 4 additions & 1 deletion lib/rexml/parsers/baseparser.rb
@@ -635,7 +635,10 @@ def process_instruction(start_position)
   @source.position = start_position
   raise REXML::ParseException.new(message, @source)
 end
-if @document_status.nil? and match_data[1] == "xml"
+if match_data[1] == "xml"
+  if @document_status
+    raise ParseException.new("Malformed XML: XML declaration is not at the start", @source)
+  end
   content = match_data[2]
   version = VERSION.match(content)
   version = version[1] unless version.nil?
17 changes: 17 additions & 0 deletions test/parse/test_processing_instruction.rb
@@ -39,6 +39,23 @@ def test_garbage_text
     pi.content,
   ])
 end
+
+def test_xml_declaration_not_at_document_start
+  exception = assert_raise(REXML::ParseException) do
+    parser = REXML::Parsers::BaseParser.new('<a><?xml version="1.0" ?></a>')
+    while parser.has_next?
+      parser.pull
+    end
+  end
+
+  assert_equal(<<~DETAIL.chomp, exception.to_s)
+    Malformed XML: XML declaration is not at the start
+    Line: 1
+    Position: 25
+    Last 80 unconsumed characters:
+  DETAIL
+end
 end
 end
 end
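
Outside the test suite, a caller would see this change as a `REXML::ParseException` raised while pulling events. The following is a minimal sketch of that, assuming the `rexml` gem is installed. The helper `first_parse_error` is hypothetical and exists only for illustration. With this commit applied, the second call is expected to report the new "XML declaration is not at the start" message; on versions without it, the call may simply return `nil`.

```ruby
require "rexml/parsers/baseparser"

# Hypothetical helper: pull every event from the parser and return the
# first line of the ParseException message, or nil if parsing completes.
def first_parse_error(xml)
  parser = REXML::Parsers::BaseParser.new(xml)
  parser.pull while parser.has_next?
  nil
rescue REXML::ParseException => e
  e.to_s.lines.first.chomp
end

# Declaration at the very start: no error expected.
p first_parse_error('<?xml version="1.0"?><a/>')

# Declaration inside the root element: behavior depends on the REXML version.
p first_parse_error('<a><?xml version="1.0" ?></a>')
```

Only the first line of the message is kept because the remaining lines (line number, position, unconsumed characters) depend on parser internals.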
