Removed multiple heap-allocated copies in Pm::init & parse_pm_content #3233

Merged

eduar-hte
Contributor

@eduar-hte eduar-hte commented Aug 18, 2024

what

Removed multiple heap-allocated copies of operator Pm's parameter and other unnecessary heap allocations.

changes

  • parse_pm_content
    • The previous version of this function was doing three strdup copies to parse the pm content. The updated version only copies the value once (in order not to modify the operator's m_param member variable), and then performs the updates inline.
    • Fixed parsing of digits (see here) which were not quoted and thus not interpreted as ASCII characters (like the hexadecimal digits) but as binary values (this is incorrect because if that were the case, 0 would have been interpreted as the null terminator when doing strdup before, thus truncating the content).
    • Error message in parse_pm_content would reference freed memory if accessed by caller (see here, here, here, here & here). Removed anyway because it was unused.
    • Moved parse_pm_content to src/operators/pm.cc as this function was introduced in ModSecurity v3 and only used by Pm::init.
  • Pm::init
    • Simplified to avoid manual handling of memory buffers (as parse_pm_content now returns a std::string instead of a char *).
    • std::istringstream iss is no longer heap allocated.
    • Removed temporary allocation of std::vector<std::string> vec to store acmp patterns, leveraging std::for_each to add them as they're parsed from the stream.
  • Removed src/operators/pm_f.cc (all code in modsecurity::operators::PmF is inline).
  • Removed unused included header files.
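
The Pm::init changes listed above (a stack-allocated stream and no intermediate vector) can be sketched roughly as follows. This is a minimal illustration under assumptions, not the code from this PR: `PatternSink` and `load_patterns` are hypothetical stand-ins for the acmp pattern tree and for the relevant part of Pm::init.

```cpp
#include <algorithm>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the acmp pattern tree used by the Pm operator.
struct PatternSink {
    std::vector<std::string> patterns;
    void add(const std::string &p) { patterns.push_back(p); }
};

// Reads whitespace-separated patterns from `content` and adds each one to
// the sink as it is read, without collecting them in a temporary vector.
inline void load_patterns(const std::string &content, PatternSink &sink) {
    std::istringstream iss(content);  // lives on the stack, no new/delete
    std::for_each(std::istream_iterator<std::string>(iss),
                  std::istream_iterator<std::string>(),
                  [&sink](const std::string &p) { sink.add(p); });
}
```

The stream iterators tokenize on whitespace, so each pattern goes straight from the stream into the sink.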

misc

This PR was originally part of a remove copies branch which included PR #3231 & #3222, but was broken down into separate PRs to group changes and simplify the review process. I'm not including this in the performance improvement series because even though these changes are related to the initialization of rules (and thus the Pm operator).

@eduar-hte eduar-hte force-pushed the remove-copies-pm-operator branch from e8e2050 to 521a51c Compare August 18, 2024 22:08
@marcstern marcstern added the 3.x Related to ModSecurity version 3.x label Aug 20, 2024
@airween
Member

airween commented Aug 26, 2024

Hi @eduar-hte,

thanks again for this PR.

  • Fixed parsing of digits (see here) which were not quoted and thus not interpreted as ASCII characters (like the hexadecimal digits) but as binary values (this is incorrect because if that were the case, 0 would have been interpreted as the null terminator when doing strdup before, thus truncating the content).

uhmm, this is a bit scary. I see you fixed this behavior, but I'm afraid we definitely must inform users about this issue. I'm not sure we need to open a CVE, but when we release the new version, we need to draw users' attention to it.

@marcstern, @fzipi, @theseion, @gberkes (and of course @eduar-hte) - what do you think, guys?

This PR was originally part of a remove copies branch which included PR #3231 & #3222, but was broken down into separate PRs to group changes and simplify the review process. I'm not including this in the performance improvement series because even though these changes are related to the initialization of rules (and thus the Pm operator).

Would there be any conflict between this PR and #3231 (e.g. if I merge this one first)?

I added one comment to my review.

@eduar-hte
Contributor Author

eduar-hte commented Aug 26, 2024

This PR was originally part of a remove copies branch which included PR #3231 & #3222, but was broken down into separate PRs to group changes and simplify the review process. I'm not including this in the performance improvement series because even though these changes are related to the initialization of rules (and thus the Pm operator).

Would there be any conflict between this PR and #3231 (e.g. if I merge this one first)?

No, they're independent from each other. I mentioned that PR (and the previous one, PR #3222) because they were originally on the same remove-copies branch, focused on removing unnecessary heap-allocated copies of strings.

This commit was even part of my original submission for PR #3231, but I moved it to its own PR for that PR to be focused on Transformation classes.

@theseion
Collaborator

uhmm, this is a bit scary. I see you fixed this behavior, but I'm afraid we definitely must inform users about this issue. I'm not sure we need to open a CVE, but when we release the new version, we need to draw users' attention to it.

Not nice. Fortunately, this affects something that probably only a few people use, which is parsing of |HEX| (I didn't even know that functionality existed; it must be what the docs refer to as "snort/suricata content style"). In any case, it has always been broken in v3 without users complaining. Yes, mention it in the release notes. I don't feel like we have to make a big deal out of this.
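
To make the digit bug concrete, here is a small sketch of the difference between comparing against unquoted and quoted digit constants. The function names are hypothetical, and the code illustrates the class of bug described in the PR rather than reproducing the original ModSecurity source.

```cpp
// Buggy variant: the digit bounds are the integer values 0..9, so only the
// control characters NUL..TAB match, never the ASCII digits '0'..'9'.
inline bool is_hex_digit_buggy(char c) {
    return (c >= 0 && c <= 9) ||
           (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
}

// Fixed variant: the digit bounds are character literals.
inline bool is_hex_digit_fixed(char c) {
    return (c >= '0' && c <= '9') ||
           (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
}
```

With the buggy check, `'A'` is accepted but `'4'` is not, so a byte written as `41` inside a |HEX| section would not be recognized as hex.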

@eduar-hte eduar-hte force-pushed the remove-copies-pm-operator branch from 521a51c to 18efe21 Compare August 27, 2024 13:16
@eduar-hte
Contributor Author

eduar-hte commented Aug 27, 2024

uhmm, this is a bit scary. I see you fixed this behavior, but I'm afraid we definitely must inform users about this issue. I'm not sure we need to open a CVE, but when we release the new version, we need to draw users' attention to it.

As @theseion mentions, I assume this is not currently being used because it'd just not work as is.

In addition to the fix, I think it'd be appropriate to exit the function with an error when a hex character is expected but an invalid one is found (similar to the way an invalid escape sequence is handled). I'll update the commit to include this.

- The previous version of this function was doing three strdup copies
  to parse the pm content. The updated version only copies the value
  once (in order not to modify the Operator's m_param member variable),
  and then performs the updates inline.
- Binary parsing was broken because digits were not compared as
  characters.
  - Fail parsing when an invalid hex character is found.
- Error message in parse_pm_content would reference freed memory if
  accessed by caller. Removed anyway because it was unused.
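
The behavior summarized in the commit message (one copy, in-place decoding, failure on invalid hex) could look roughly like the sketch below. It is a simplified illustration under assumptions, not the parse_pm_content implementation from this PR: `hex_value` and `parse_content_sketch` are hypothetical names, and the escape handling here accepts any escaped character, which may differ from the real function.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper: value of one hex character, or -1 if it is not hex.
inline int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Decodes content such as "abc|41 42|def" into "abcABdef".
// Returns false on an invalid hex character, an odd number of hex digits,
// an unterminated |...| section, or a trailing backslash.
inline bool parse_content_sketch(const std::string &param, std::string &out) {
    std::string buf(param);  // the single copy; param is left untouched
    std::size_t w = 0;       // write position, never ahead of the read position
    bool bin = false;        // true while inside a |...| hex section
    int high = -1;           // pending high nibble, or -1 if none

    for (std::size_t r = 0; r < buf.size(); ++r) {
        const char c = buf[r];
        if (c == '|') {
            if (high >= 0) return false;  // odd number of hex digits
            bin = !bin;
        } else if (bin) {
            if (c == ' ') continue;       // spaces between bytes are ignored
            const int v = hex_value(c);
            if (v < 0) return false;      // invalid hex character
            if (high < 0) {
                high = v;
            } else {
                buf[w++] = static_cast<char>((high << 4) | v);
                high = -1;
            }
        } else if (c == '\\') {
            if (++r == buf.size()) return false;  // trailing backslash
            buf[w++] = buf[r];            // keep the escaped character as-is
        } else {
            buf[w++] = c;
        }
    }
    if (bin) return false;  // unterminated hex section
    buf.resize(w);
    out = std::move(buf);
    return true;
}
```

Because decoded output is never longer than the input, the result can be written back into the same buffer, which is why a single copy is enough.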
@eduar-hte eduar-hte force-pushed the remove-copies-pm-operator branch from 18efe21 to 3e9d810 Compare August 27, 2024 13:43
Member

@airween airween left a comment

LGTM

@airween
Member

airween commented Aug 28, 2024

@eduar-hte - thanks for the fix (VALID_HEX()).

@theseion, @eduar-hte - thanks for the answers, guys.

Going to merge this.

@airween airween merged commit 4951702 into owasp-modsecurity:v3/master Aug 28, 2024
49 checks passed
airween added a commit to airween/ModSecurity that referenced this pull request Aug 28, 2024
@eduar-hte eduar-hte deleted the remove-copies-pm-operator branch August 28, 2024 12:48