Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Fix memory leak in lmdb.cc + improve performance while reading collections in resolveMultiMatches #2520

Open
wants to merge 4 commits into
base: v3/master
Choose a base branch
from

Conversation

ziollek
Copy link
Contributor

@ziollek ziollek commented Feb 24, 2021

What have been done ?

  1. Removed memory leak in lmdb.cc while passing data from MDB_val to VariableValue:
  • added new VariableValue constructors for rvalue strings
  • decreased memory footprint (just one data copying between lmdb and VariableValue thanks to above constructors)
  1. Optimized fetching collection in resolveMultiMatches:
  • at the beginning, the cursor is moved to a key equal to or greater than the input key (MDB_SET_RANGE)
  • values are fetched via cursor as long as the input key is equal to the key fetched from lmdb

…ableValue

- add new VariableValue constructors for rvalue strings
- decrease number of memory copies between lmdb and VariableValue
- move cursor to first key equal or greather than input key (MDB_SET_RANGE)
- break while if input key is not equal key fetched from lmdb
@zimmerle zimmerle self-requested a review February 26, 2021 14:41
@zimmerle
Copy link
Contributor

Hi @ziollek, thanks for the contribution.

I am wondering how to fit the rvalue change on VariableValue of the 3.1-experimental development branch -

https://github.com/SpiderLabs/ModSecurity/blob/9f47f1473c88a5b105ced2f11c87204a0abba424/headers/modsecurity/variable_value.h#L51-L59

Here is the key + value constructor -
https://github.com/SpiderLabs/ModSecurity/blob/9f47f1473c88a5b105ced2f11c87204a0abba424/headers/modsecurity/variable_value.h#L98-L108

Here is the collection + key + value -
https://github.com/SpiderLabs/ModSecurity/blob/9f47f1473c88a5b105ced2f11c87204a0abba424/headers/modsecurity/variable_value.h#L111-L184

Giving that changes on VariableValue, we already have the LMDB in this shape -
https://github.com/SpiderLabs/ModSecurity/blob/9f47f1473c88a5b105ced2f11c87204a0abba424/src/collection/backend/lmdb.cc#L498-L516

The benefit of MDB_SET_RANGE change is very clear. What do you think about the changes that we have on 3.1-experimental?

@zimmerle zimmerle self-assigned this Feb 26, 2021
@zimmerle zimmerle added the 3.x Related to ModSecurity version 3.x label Feb 26, 2021
@ziollek
Copy link
Contributor Author

ziollek commented Feb 26, 2021

Hi @zimmerle - thanks for your comments.

I wasn't aware of VariableValue changes in 3.1-experimental. Despite of internal changes in variable_value.h i'm afraid, there is still a memory leak in lmdb.cc - because of using new std::string. In effect, below constructor is called:

https://github.com/SpiderLabs/ModSecurity/blob/9f47f1473c88a5b105ced2f11c87204a0abba424/headers/modsecurity/variable_value.h#L142-L151

Above constructor (unlike constructors quoted in your comment) don't 'move' allocated in lmdb memory - it just copies it.

Luckily it could be easily fixed by replacing
new std::string(...) to std::make_unique<std::string>(...)

@zimmerle
Copy link
Contributor

Different methods from lmdb back-end are performing this string copy.

resolveFirst -

https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/src/collection/backend/lmdb.cc#L192-L194

resolveSingleMatch -

https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/src/collection/backend/lmdb.cc#L290-L293

resolveMultiMatches -

https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/src/collection/backend/lmdb.cc#L499-L504

and

https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/src/collection/backend/lmdb.cc#L510-L515

resolveRegularExpression -

https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/src/collection/backend/lmdb.cc#L571-L575

They are using the constructors -

Method Constructor Leak on v3/master
resolveFirst https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/headers/modsecurity/variable_value.h#L42-L48 Likely
resolveSingleMatch https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/headers/modsecurity/variable_value.h#L42-L48 Likely
resolveMultiMatches (a) https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/headers/modsecurity/variable_value.h#L50-L57 likely
resolveMultiMatches (b) https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/headers/modsecurity/variable_value.h#L50-L57 Likely
resolveRegularExpression https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/headers/modsecurity/variable_value.h#L42-L48 Likely

The class VariableValue has four std::string members: m_collection, m_key, m_keyWithCollection, and m_value the attribution of their values happens on the constructor for VariableValue. m_value could be also set via setValue. All attributions are based on the std::string copy constructor.

Later the VariableValue is used at -

Transaction

https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/src/transaction.cc#L913-L918

https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/src/transaction.cc#L1533-L1539

https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/src/transaction.cc#L1571-L1576

https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/src/transaction.cc#L1670-L1674

https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/src/transaction.cc#L1700-L1704

RunTimeString

https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/src/run_time_string.cc#L63-L66

RuleWithOperator

https://github.com/SpiderLabs/ModSecurity/blob/4127c1bf52d2b30a5c2c3e641b8085fd9a720f43/src/rule_with_operator.cc#L278-L338

As an alternative the inMemoryBackEnd back-end for collections does not duplicates/copy the strings.

Investigating the best approach to tackle this without too much modification on the v3/master code.

data.mv_size)));
}
string2val(var, &key);
rc = mdb_cursor_get(cursor, &key, &data, MDB_SET_RANGE);
Copy link

@sobczak-m sobczak-m Mar 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ziollek replace MDB_SET_RANGE to MDB_SET_KEY
MDB_SET_RANGE - getting first key greater than or equal to specified key http://www.lmdb.tech/doc/group__mdb.html#ga1206b2af8b95e7f6b0ef6b28708c9127
If you are looking for none exist samplekey you get key containing samplekey like samplekey2 (if such key exist)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

} else {
break;
}
} while ((rc = mdb_cursor_get(cursor, &key, &data, MDB_NEXT)) == 0);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replace MDB_NEXT to MDB_NEXT_DUB to iterate only over current key

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zimmerle
Copy link
Contributor

I see a benefit of having less copies around, studying the feasibility to have this pull request partially merged on the upcoming 3.0.5.

On QA - https://github.com/SpiderLabs/ModSecurity/actions/runs/883266568

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
3.x Related to ModSecurity version 3.x
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants