Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Remove unnecessary copying in transformations #2368

Closed

Conversation

WGH-
Copy link
Contributor

@WGH- WGH- commented Jul 24, 2020

@zimmerle zimmerle self-assigned this Jul 28, 2020
@zimmerle zimmerle added the 3.x Related to ModSecurity version 3.x label Jul 28, 2020
@zimmerle
Copy link
Contributor

Thanks @WGH-. Interesting proposal.

Indeed, as mentioned in [2] and as we are using c++11, it is possible to access the string buffer by array idx, and the buffer is stored contiguously.

As of our development branch, we have added the concept of ModSecString. It is not yet 100% ready, but the idea is to have an allocator to use a stack-based memory pool for each request, avoiding memory fragmentation. That stack allocator should be aware that space should be allocated contiguously.

I am going to review it better and get back to you soon.

Example of ModSecString -
https://github.com/SpiderLabs/ModSecurity/blob/5824470768e73d438a5db03b7a6eb24d7c795097/src/actions/transformations/css_decode.cc#L32-L35

@WGH-
Copy link
Contributor Author

WGH- commented Jul 28, 2020

As of our development branch, we have added the concept of ModSecString. It is not yet 100% ready, but the idea is to have an allocator to use a stack-based memory pool for each request, avoiding memory fragmentation. That stack allocator should be aware that space should be allocated contiguously.

Basically, you want to bring back region-based memory management, which old ModSecurity used with help of apr_pool_t and friends?

@zimmerle
Copy link
Contributor

As of our development branch, we have added the concept of ModSecString. It is not yet 100% ready, but the idea is to have an allocator to use a stack-based memory pool for each request, avoiding memory fragmentation. That stack allocator should be aware that space should be allocated contiguously.

Basically, you want to bring back region-based memory management, which old ModSecurity used with help of apr_pool_t and friends?

Not really, the apr_pool was used everywhere. For some pieces of the code where the strings are pre-computed as the request are processed (ARGS and QUERY_STRING), it makes sense to avoid memory fragmentation. The conjunction of std::string_view and stacked (not dynamic) memory usage could be very useful performance-wise. That is different from having dynamically allocated memory pools. It is not about having a generic memory pool but use the resources more wisely.

@WGH-
Copy link
Contributor Author

WGH- commented Jul 29, 2020

You're suggesting to use the call stack for variables associated with particular transaction?

@WGH-
Copy link
Contributor Author

WGH- commented Jul 29, 2020

This optimization has very small performance improvement: increase from 117-118 requests per second to 121-122 RPS. It might start to matter more once other parts are optimized, but right now it's mostly code cleanup.

@zimmerle
Copy link
Contributor

zimmerle commented Jul 30, 2020

You're suggesting to use the call stack for variables associated with particular transaction?

Let me illustrate that with an example -

Lets assume that we have a request at the URI: /modsec-is-cool/?foo=bar&github=fine&a=b
[test case on gits]

By the time that we achieve a given phase (request life cycle) the variables will be populated internally in modsec. To exemplifies, let's pick: ARGS, ARGS_GET, QUERY_STRING, REQUEST_URI, REQUEST_URI_RAW they will be populated as such:

ARGS

ARGS:foo :: bar
ARGS:github :: fine
ARGS:a :: b

ARGS_GET

ARGS:foo :: bar
ARGS:github :: fine
ARGS:a :: b

QUERY_STRING

QUERY_STRING :: foo=bar&github=fine&a=b

REQUEST_URI

REQUEST_URI :: /modsec-is-cool/?foo=bar&github=fine&a=b

REQUEST_URI_RAW

REQUEST_URI_RAW :: /modsec-is-cool/?foo=bar&github=fine&a=b

Notice that the content of variables (and keys), repeats. Using the example of the word fine, it is allocated for each of those variables. In this example, fine could be allocated for QUERY_STRING, and mapped using (string_view) for the other variables. Reducing memory usage and avoid extra malloc calls, consequently reducing memory fragmentation.

In that described use case, the string_view for each variable will be a part of the Transaction class, that could be stacked allocated as - although dynamically - the parameters do not vary on size: offset (void *) and length (uint?).

The above is just an example to illustrate a rationally; there are some corner cases.

There is usage for a memory pool, especially in terms of disposable memory used in serialized operations such as transformations. First, let's see if it is possible to avoid unnecessary allocations, as you are suggesting in this very own patch.

@zimmerle
Copy link
Contributor

This optimization has very small performance improvement: increase from 117-118 requests per second to 121-122 RPS. It might start to matter more once other parts are optimized, but right now it's mostly code cleanup.

At a certain point, the improvements sound to be numerically inefficient. However, placing it in scale, they are not bad. The benefit here is in the order of magnitude of 3% ~ 4%. Not bad. In a busy server, I am sure that it does make a difference.

More to come, as the code became cleaner, it easy to spot an improve what needs to be improved :)

@zimmerle
Copy link
Contributor

Not worked with that yet as I am fixing the xmlns action on the v3.1 branch. The faulty test case is very annoying.

@zimmerle
Copy link
Contributor

zimmerle commented Aug 4, 2020

Not worked with that yet as I am fixing the xmlns action on the v3.1 branch. The faulty test case is very annoying.

xmlns is now fixed. Going to fix the cppcheck warnings on v3/master. Afterward, I will proceed with the rebase for the v3.1-experimental branch. And finally, merge this.

The issue regarding the xmlns was a missing attribution for the rule variable. Almost all variables work without the knowledge of the rule that it belongs to, except for two variables: RULE, and XML.
The variable RULE needs to access the metadata (id, etc...) about the Rule while the XML demands to enumerates all xmlns actions in order to correct address the XML namespace.

During the Rule instance copy, the variable was being copied correctly. But, the reference to the Rule (inside the variable instance) was remained unchanged, point towards the old instance. In the case of xmlns the old Rule was deleted and the reference was being pointed to an invalid location hence the crash.

Relevant pieces of code

Base class common to XML and RULE

https://github.com/SpiderLabs/ModSecurity/blob/2ddb4ab97229578ae9073627c11ea616639e53ff/src/variables/rule_variable.h#L31-L54

Populating all Variables with base type RuleVariable during a Rule instance copy

https://github.com/SpiderLabs/ModSecurity/blob/2ddb4ab97229578ae9073627c11ea616639e53ff/src/rule_with_operator.h#L51-L62

XML Variable

https://github.com/SpiderLabs/ModSecurity/blob/2ddb4ab97229578ae9073627c11ea616639e53ff/src/variables/xml.h#L66-L84

@WGH-
Copy link
Contributor Author

WGH- commented Aug 5, 2020

I think most old-school pointers can be safely replaced with std::unique_ptr. It has zero runtime overhead, and it would make it much easier to reason about lifetimes, and would prevent such mistakes.

@zimmerle
Copy link
Contributor

zimmerle commented Aug 7, 2020

I think most old-school pointers can be safely replaced with std::unique_ptr. It has zero runtime overhead, and it would make it much easier to reason about lifetimes, and would prevent such mistakes.

In this particular case, we are using a shared_ptr. The idea is that Rules could belong to different hosts and so its variables. The variables that do not demand specific processing on execution time, are in practice anchors to offsets on the Transaction object. Safe to share among different Rules.

@@ -61,7 +50,7 @@ std::string Utf8ToUnicode::evaluate(const std::string &value,
}


char *Utf8ToUnicode::inplace(unsigned char *input,
char *Utf8ToUnicode::inplace(const unsigned char *input,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use the in-place method here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, apart from the name the function is not really in place.

@zimmerle
Copy link
Contributor

zimmerle commented Aug 7, 2020

I have rebased and applied this PR at the top of v3.1-experimental. Commit: 0cc0b929c36374084e2e607673ab88e29c848fb0

Some changes were placed as a consequence of the rebase.

@zimmerle
Copy link
Contributor

zimmerle commented Aug 7, 2020

It is now - 5226c42

@zimmerle
Copy link
Contributor

Merged! Thanks!

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
3.x Related to ModSecurity version 3.x
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants