StartTag: invalid element name #170
Comments
@c-x, do you have a Python traceback showing where in Also, can you confirm that the error still occurs in 1.1.105, and/or on the master branch? |
@gtback I don't have access to Soltra myself, so it's really hard to isolate the bug (and it takes a lot of time because I must send patched code to customers and wait for their feedback). Do I have a traceback: no. |
It would be easiest to figure this out if we had a traceback (so we know what line is actually generating the error), and ideally the XML response that is generating the error. It looks a lot like an error that would come from malformed XML, especially if it's occurring in |
If |
I also agree with you on the probable malformed XML content, but I have no evidence so far. I was hoping you had access to Soltra to reproduce the problem. |
@c-x, Looking at what you've described so far, I'm having a bit of trouble figuring out how First, you mention
Second, you mention that the error occurs when libtaxii is processing an The I can replicate the error you are getting with the following code (note the extraneous begin bracket):
from lxml import etree
etree.XML('<hello><</hello>')
# Traceback (most recent call last):
# File "<stdin>", line 1, in <module>
# File "lxml.etree.pyx", line 3072, in lxml.etree.XML (src\lxml\lxml.etree.c:70460)
# File "parser.pxi", line 1828, in lxml.etree._parseMemoryDocument (src\lxml\lxml.etree.c:106689)
# File "parser.pxi", line 1716, in lxml.etree._parseDoc (src\lxml\lxml.etree.c:105478)
# File "parser.pxi", line 1086, in lxml.etree._BaseParser._parseDoc (src\lxml\lxml.etree.c:100105)
# File "parser.pxi", line 580, in lxml.etree._ParserContext._handleParseResultDoc (src\lxml\lxml.etree.c:94543)
# File "parser.pxi", line 690, in lxml.etree._handleParseResult (src\lxml\lxml.etree.c:96003)
# File "parser.pxi", line 620, in lxml.etree._raiseParseError (src\lxml\lxml.etree.c:95050)
# lxml.etree.XMLSyntaxError: StartTag: invalid element name, line 1, column 9
Based on that, my bet is that the STIX content in the Poll Response has an extra begin bracket (therefore being invalid XML) and that's what's blowing up.
As @gtback mentioned to me offline, libtaxii can do a much better job of handling processing errors. So regardless of this issue's resolution, I'll open an issue for that.
@c-x - I don't mean to disagree too much, but I'm having trouble arriving at your conclusion with the evidence you've provided. I've attempted to explain my thought process so that we can reach a conclusion about what's happening and the best way to fix it. I realize I'm not coming from the debug logs (like you are), but rather an understanding of the code, so I may be missing a key piece of information.
Thank you.
[1] https://github.com/TAXIIProject/libtaxii/blob/master/libtaxii/__init__.py#L96 |
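When no traceback is available from the remote site, wrapping the parse call so that the error message carries the offending line can narrow things down. A minimal sketch, assuming Python 3 and using the standard library's xml.etree.ElementTree in place of lxml so that it runs without extra packages (its error text differs from lxml's "StartTag: invalid element name"). The helper name parse_with_context is hypothetical and is not part of libtaxii or lxml.

```python
import xml.etree.ElementTree as ET


def parse_with_context(xml_text):
    """Parse xml_text; on failure, raise ValueError naming the offending line.

    Hypothetical helper for illustration; not part of libtaxii or lxml.
    """
    try:
        return ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # ParseError.position is a (line, column) tuple reported by the parser.
        line_no, column = exc.position
        lines = xml_text.splitlines()
        bad_line = lines[line_no - 1] if 0 < line_no <= len(lines) else ""
        raise ValueError(
            "malformed XML at line %d, column %d: %r" % (line_no, column, bad_line)
        ) from exc


# Same malformed input as in the comment above (note the extra '<').
try:
    parse_with_context("<hello><</hello>")
except ValueError as err:
    message = str(err)
    print(message)
```

The resulting message can be logged or sent back by whoever runs the patched code, which avoids needing direct access to the server.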
@MarkDavidson You are absolutely right on where the error is. My bad, my caffeine level was too low when I opened the ticket. What you don't have in the code I posted, is some debug print. So, in the debug, and to confirm your analysis, the last print displayed is before the call to I'll edit the first post. |
@c-x, No worries at all! I know how tough it can be to try and debug a system you can't actually interact with (not enough coffee in the world for that problem). My current theory is bad Content in the Poll Response.
There are two ways you could try and capture the XML that's being received: modify your code or modify libtaxii's code.
If you wanted to modify your code, I'd say put something like
If you wanted to modify libtaxii's code, open messages_11.py for editing (
if isinstance(xml_string, basestring):
    f = StringIO.StringIO(xml_string)
else:
    f = xml_string
Add a line that reads
As a side note, I recall that somebody was asking for logging in libtaxii, and I wasn't 100% clear on the use case for it. My sense is that if libtaxii had a configurable debug log this would have been a much easier fix.
Thank you.
PS - I opened this related issue: #171
EDIT: I had incorrectly written |
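For the "modify your code" route, one option is to read the response body once, write it to disk, and keep a replayable copy for the parser. The sketch below assumes Python 3 (the snippet above is Python 2) and a response object that only needs a .read() method. The helper capture_response and the file name poll_response.xml are hypothetical, and how the replayed body is fed back into libtaxii depends on the calling code and is not shown.

```python
import io
import os
import tempfile


def capture_response(response, dump_path):
    """Read a file-like response fully, save the bytes to dump_path, and
    return a fresh file-like object holding the same bytes.

    Hypothetical helper; `response` is anything with a .read() method.
    """
    raw = response.read()
    if isinstance(raw, str):  # tolerate text-mode objects
        raw = raw.encode("utf-8")
    with open(dump_path, "wb") as dump:
        dump.write(raw)
    return io.BytesIO(raw)


# Stand-in for the HTTP response; a real one would come from the TAXII server.
fake_response = io.BytesIO(b"<hello><</hello>")
dump_path = os.path.join(tempfile.mkdtemp(), "poll_response.xml")
replay = capture_response(fake_response, dump_path)
```

The dumped file is then the "XML response that is generating the error" that can be attached to the issue.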
OK Thanks. I'll update this post as soon as I have a real world sample to share. |
So, I finally got an IOC that causes libtaxii/lxml to throw the error. As suspected, it's due to an improper IOC like the following. As you can see, the brackets are URL/HTML-encoded, which is weird behavior from the TAXII server. In short, this ticket can be closed as it's not a python-libtaxii issue :)
|
Yes, it looks like the server is incorrectly escaping the STIX content in the Content_Block. I don't know which web framework Soltra uses, but it's pretty common in a lot of web frameworks for this to be the default (as a defense against XSS and other content security issues). I'm going to go ahead and close this. It's possible that we could do something in libtaxii to try to detect and correct these types of issues, but I think it would be a decent amount of effort, and the result would likely be pretty brittle, just to accept content that doesn't actually conform to the TAXII specs. Thanks for tracking this down, @c-x ! |
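To make the "detect" half of that idea concrete, here is a rough heuristic sketch, not something libtaxii does. It assumes Python 3 and the standard library's xml.etree.ElementTree, and it uses a made-up element named Content rather than the real TAXII schema. It flags an element that has no child elements but whose text is itself parseable XML. It is brittle in the way described above: legitimate text content that happens to be XML would be flagged too.

```python
import xml.etree.ElementTree as ET


def looks_escaped(element):
    """Return True if `element` has no child elements but its text parses as
    XML, which suggests the payload was entity-escaped upstream.

    Hypothetical heuristic for illustration only.
    """
    if len(element) > 0:
        return False  # real child elements are present
    text = (element.text or "").strip()
    if not text.startswith("<"):
        return False  # ordinary text content
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# 'Content' is an illustrative element name, not the namespaced TAXII element.
escaped = ET.fromstring("<Content>&lt;xml/&gt;</Content>")
embedded = ET.fromstring("<Content><xml/></Content>")
plain = ET.fromstring("<Content>a &lt; b</Content>")
results = (looks_escaped(escaped), looks_escaped(embedded), looks_escaped(plain))
print(results)
```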
As an FYI, the most common cause I've seen for this is attempting to assign the Content's text value to some XML string instead of appending an XML tree. It's kind of an easy thing to get tripped up on, especially when reading XML out of a database. For instance, using python and lxml:
The "wrong" way:
from lxml import etree
elt = etree.Element('name')
elt.text = '<xml/>'
etree.tostring(elt)
# '<name>&lt;xml/&gt;</name>'
The "right" way:
from lxml import etree
elt = etree.Element('name')
elt.append(etree.XML('<xml/>'))
etree.tostring(elt)
# '<name><xml/></name>'
What's interesting though, at least for libtaxii, is that ContentBlock.from_xml() doesn't fail on the provided input. Perhaps there was an extra bracket ('<' or '>') in the description or observable fields that you redacted? If there is, that's a likely culprit.
-Mark |
Yep. The Java-TAXII library has to do this XML string to XML tree dance to properly embed STIX. Initially I was trying to embed the XML string and it got escaped just as shown. |
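On the producer side, the "XML string to XML tree" step can be wrapped in a small helper so that a string read from a database is always parsed before it is embedded. This is a sketch assuming Python 3 and the standard library's xml.etree.ElementTree; lxml.etree offers Element, fromstring, append and tostring under the same names. The helper embed_xml is hypothetical.

```python
import xml.etree.ElementTree as ET


def embed_xml(parent, xml_string):
    """Parse xml_string and append the resulting tree to parent.

    Hypothetical helper; raises ET.ParseError if xml_string is not
    well-formed, so bad content fails here rather than at the receiver.
    """
    child = ET.fromstring(xml_string)
    parent.append(child)
    return child


wrapper = ET.Element("name")
embed_xml(wrapper, "<xml/>")
serialized = ET.tostring(wrapper, encoding="unicode")
print(serialized)
```

Because the string is parsed first, malformed content (such as a stray '<') raises an error when the message is built, not when the other side tries to read it.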
Hello,
I have the following error with libtaxii-1.1.104 (when receiving content from Soltra).
This error is caught when libtaxii processes an addinfourl like the following.
My piece of code that fails is the following:
Have you seen this error before? Is this a known issue? Do you have any idea what's going on?
Thanks.