feat: improved XMLArgs processing #3358

Open
wants to merge 2 commits into
base: v2/master

airween
Member

@airween airween commented Apr 7, 2025

what

This PR adds a new feature within XML processing.

Old (current) behavior: for the XML:/* target, the body processor expands the node values from the XML payload. E.g.:

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <level1>
    <level2>
      <node>foo1</node>
      <node>bar1</node>
    </level2>
    <level2>
      <node>foo2</node>
      <node>bar2</node>
    </level2>
  </level1>
</root>

will produce this value:

[/post][9] Target value: "  foo1  bar1  foo2  bar2"

In this case, there is no way to exclude an individual node. For example, if a node contains a term that a rule is looking for, the administrator cannot create a targeted exclusion; the only option is to disable the whole rule.

New behavior: there is a new configuration keyword, SecParseXMLintoArgs, with possible values On, Off and OnlyArgs. The default value is Off, which changes nothing. If the administrator sets it to On, the engine parses the XML into ARGS, and the XML:/* target still contains only the text content, as before. If the value is OnlyArgs, the parsed content appears only in the ARGS target; the XML:/* target no longer contains it.

With On, the node values appear in ARGS under their own names, so it is easy to write an exclusion against the named target.
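
The three values can be summarized as a small truth table. The following is only an illustrative sketch in C, not code from this PR: the enum, the function names, and the case-insensitive matching are all assumptions made for the example.

```c
#include <strings.h> /* strcasecmp (POSIX) */

/* Hypothetical names; the PR's actual identifiers may differ. */
typedef enum {
    XML_ARGS_OFF,      /* default: behaviour unchanged */
    XML_ARGS_ON,       /* fill ARGS and keep the XML target content */
    XML_ARGS_ONLYARGS, /* fill ARGS only */
    XML_ARGS_INVALID   /* unknown directive value */
} xml_args_mode;

/* Map a SecParseXMLintoArgs value to a mode.
 * Case-insensitive matching is an assumption of this sketch. */
static xml_args_mode parse_xml_args_mode(const char *value) {
    if (strcasecmp(value, "Off") == 0)      return XML_ARGS_OFF;
    if (strcasecmp(value, "On") == 0)       return XML_ARGS_ON;
    if (strcasecmp(value, "OnlyArgs") == 0) return XML_ARGS_ONLYARGS;
    return XML_ARGS_INVALID;
}

/* Are the node values added to ARGS in this mode? */
static int mode_fills_args(xml_args_mode m) {
    return m == XML_ARGS_ON || m == XML_ARGS_ONLYARGS;
}

/* Does the XML target still carry the text content in this mode? */
static int mode_fills_xml_target(xml_args_mode m) {
    return m == XML_ARGS_OFF || m == XML_ARGS_ON;
}
```

Read this way, On is the only mode in which both targets are populated, which is why it is the natural choice when existing XML:/* rules must keep working while per-node exclusions are added.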

why

A customer asked for a solution to this.

references

See #3178.

@airween airween changed the title from "Finish XMLArgs processing" to "feat: improved XMLArgs processing" Apr 7, 2025
@airween airween added the 2.x Related to ModSecurity version 2.x label Apr 20, 2025

Quality Gate failed

Failed conditions
6 Security Hotspots
B Maintainability Rating on New Code (required ≥ A)


@RedXanadu

This is a great new feature. This will open up ModSecurity to anyone who needs to do serious processing of XML APIs (lots of legacy and current applications!). Especially with pre-written rule sets like CRS, this makes the task of handling false positives possible.

Thank you for the work that has gone into this 🚀

@dune73
Member

dune73 commented Apr 25, 2025

@airween Could you share how the new option parses / advertises multi-level documents with multiple leaves carrying the same name? Is the hierarchy part of the name or is that hidden?

Needless to say, I really like this option.

@airween
Member Author

airween commented Apr 25, 2025

@dune73,

I hope I understand your question correctly 😄. Consider this file:

cat test.xml
<?xml version="1.0" encoding="UTF-8"?>
<root>
  <level1>
    <level2>
      <node>foo1</node>
      <node>bar1</node>
    </level2>
    <level2>
      <node>foo2</node>
      <node>bar2</node>
    </level2>
  </level1>
  <level1>
    <level2>
      <node>foo1</node>
      <node>bar1</node>
    </level2>
    <level2>
      <node>foo2</node>
      <node>bar2</node>
    </level2>
  </level1>
</root>

and this request:

curl -v -H "Content-Type: application/xml" -X POST -d @test.xml http://localhost/post.php

This generates the following arguments (exactly the same as in the JSON case):

Adding XML argument 'xml.root.level1.level2.node' with value 'foo1'
Adding XML argument 'xml.root.level1.level2.node' with value 'bar1'
Adding XML argument 'xml.root.level1.level2.node' with value 'foo2'
Adding XML argument 'xml.root.level1.level2.node' with value 'bar2'
Adding XML argument 'xml.root.level1.level2.node' with value 'foo1'
Adding XML argument 'xml.root.level1.level2.node' with value 'bar1'
Adding XML argument 'xml.root.level1.level2.node' with value 'foo2'
Adding XML argument 'xml.root.level1.level2.node' with value 'bar2'
Expanded "REQUEST_URI_RAW|REQUEST_HEADERS|ARGS|ARGS_NAMES" to "REQUEST_URI_RAW|REQUEST_HEADERS:Host|REQUEST_HEADERS:User-Agent|REQUEST_HEADERS:Accept|REQUEST_HEADERS:Content-Type|REQUEST_HEADERS:Content-Length|ARGS:xml.root.level1.level2.node|ARGS:xml.root.level1.level2.node|ARGS:xml.root.level1.level2.node|ARGS:xml.root.level1.level2.node|ARGS:xml.root.level1.level2.node|ARGS:xml.root.level1.level2.node|ARGS:xml.root.level1.level2.node|ARGS:xml.root.level1.level2.node|ARGS_NAMES:xml.root.level1.level2.node|ARGS_NAMES:xml.root.level1.level2.node|ARGS_NAMES:xml.root.level1.level2.node|ARGS_NAMES:xml.root.level1.level2.node|ARGS_NAMES:xml.root.level1.level2.node|ARGS_NAMES:xml.root.level1.level2.node|ARGS_NAMES:xml.root.level1.level2.node|ARGS_NAMES:xml.root.level1.level2.node".
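
For readers wondering how names like these can come out of a streaming parser, here is a minimal sketch, assuming SAX-style start and end element callbacks that push and pop path segments. It is not the PR's implementation: `xml_path_state`, `xml_path_init`, `xml_path_push`, `xml_path_pop` and the fixed buffer size are hypothetical, and only the `xml` prefix and the dot separator are taken from the log above.

```c
#include <stddef.h>
#include <string.h>

#define XML_PATH_MAX   256
#define XML_PREFIX_LEN 3 /* strlen("xml") */

/* Hypothetical stand-in for the parser state; not the struct from the PR. */
typedef struct {
    char   path[XML_PATH_MAX]; /* dotted path of the element being parsed */
    size_t pathlen;            /* cached strlen(path) */
} xml_path_state;

/* Start with the fixed "xml" prefix seen in the debug log. */
static void xml_path_init(xml_path_state *st) {
    strcpy(st->path, "xml");
    st->pathlen = XML_PREFIX_LEN;
}

/* Start-element step: append ".<tag>".
 * Returns 0 on success, -1 if the segment does not fit in the buffer. */
static int xml_path_push(xml_path_state *st, const char *tag) {
    size_t taglen = strlen(tag);
    if (st->pathlen + 1 + taglen >= XML_PATH_MAX) {
        return -1;
    }
    st->path[st->pathlen] = '.';
    memcpy(st->path + st->pathlen + 1, tag, taglen + 1); /* includes '\0' */
    st->pathlen += taglen + 1;
    return 0;
}

/* End-element step: drop the trailing ".<tag>" segment.
 * An unbalanced pop that would eat into the prefix is ignored. */
static void xml_path_pop(xml_path_state *st, const char *tag) {
    size_t taglen = strlen(tag);
    if (st->pathlen < XML_PREFIX_LEN + taglen + 1) {
        return;
    }
    st->pathlen -= taglen + 1;
    st->path[st->pathlen] = '\0';
}
```

Pushing `root`, `level1`, `level2` and `node` in turn leaves `xml.root.level1.level2.node` in `path`. Because the path records only element names and no sibling index, every `<node>` leaf in the sample document ends up with the same argument name, which matches the eight identical names in the log.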

@dune73
Member

dune73 commented Apr 25, 2025

This was what I expected. Thanks for the confirmation. Very good.

@airween airween requested review from theseion and fzipi April 26, 2025 09:40
Contributor

@fzipi fzipi left a comment

No tests?

Comment on lines +17 to +27
static void msc_xml_on_start_elementns(
void *ctx,
const xmlChar *localname,
const xmlChar *prefix,
const xmlChar *URI,
int nb_namespaces,
const xmlChar **namespaces,
int nb_attributes,
int nb_defaulted,
const xmlChar **attributes
) {

Is this a new formatting style for the code, or was it already adopted as the standard?

Mixed parameter formatting generally makes the code harder to read. So my suggestion would be:

  • use the same format as all other files
  • propose a new standard
  • once accepted, apply to all files once and for all
  • enforce the new format in the pipeline.

Comment on lines +52 to +57
static void msc_xml_on_end_elementns(
void* ctx,
const xmlChar* localname,
const xmlChar* prefix,
const xmlChar* URI
) {

Same here.

arg->value = xml_parser_state->currval;
arg->value_len = strlen(xml_parser_state->currval);
arg->value_origin_len = arg->value_len;
//arg->value_origin_offset = value-base_offset;

Suggested change
//arg->value_origin_offset = value-base_offset;

// decrease the length of current path length - +1 because of the '\0'
xml_parser_state->pathlen -= (taglen + 1);

// -1 need because we don't need the '.'

Suggested change
// -1 need because we don't need the '.'
// -1 is needed because we don't need the last '.'

}
} else {

/* Not a first invocation. */
msr_log(msr, 4, "XML: Continue parsing.");

What does "Continue parsing" even mean?

/* error reporting and XML array flag */
char *xml_error;

/* another parser context for arguments */

Suggested change
/* another parser context for arguments */
/* additional parser context for arguments */

Labels
2.x Related to ModSecurity version 2.x

4 participants