getLogs perf improvements #1650

Merged 2 commits on Dec 14, 2021

@Vovchyk Vovchyk commented Nov 5, 2021

Description

This introduces some performance improvements to the eth_getLogs JSON-RPC method.

To index block blooms that are not yet indexed, this PR adds a new CLI tool, IndexBlooms. See the Motivation and Context section for more details.

Motivation and Context

The eth_getLogs JSON-RPC call consumes a lot of resources, especially when it is invoked over a long block range to retrieve historical data. One way to improve this is to use the blocks' bloom filters. Bloom filters are already part of each RSK block, but iterating over every block in a requested range is still a time-consuming operation, because each block has to be retrieved from the database.

To address this, "combined" bloom filters for block batches were introduced some time ago (by default, each batch covers 64 blocks), so during iteration we can skip a whole batch if its "combined" bloom does not match the filter. This can be thought of as a form of indexing over block bloom filters. The node automatically indexes the block batches that are queried, so the next time blocks from the same batches are queried, processing should be faster.
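
The batch-skipping idea can be sketched roughly as follows. This is a minimal illustration, not the rskj implementation: the class name BloomBatchSketch, its method names, and the use of a 64-bit long in place of a real block bloom are all assumptions made for the example. A real index would also store each combined bloom instead of recomputing it on every query.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a 64-bit long stands in for a real block bloom filter. */
public class BloomBatchSketch {
    // Batch size taken from the default mentioned in the description.
    static final int BATCH_SIZE = 64;

    /** ORs together the blooms of blocks in [from, to). Hypothetical helper. */
    static long combine(long[] blockBlooms, int from, int to) {
        long combined = 0L;
        for (int i = from; i < to; i++) {
            combined |= blockBlooms[i];
        }
        return combined;
    }

    /** A bloom can match only if every bit of the filter is set in it. */
    static boolean mayMatch(long bloom, long filter) {
        return (bloom & filter) == filter;
    }

    /** Returns indexes of blocks that may match, skipping whole batches when possible. */
    static List<Integer> candidateBlocks(long[] blockBlooms, long filter) {
        List<Integer> result = new ArrayList<>();
        for (int start = 0; start < blockBlooms.length; start += BATCH_SIZE) {
            int end = Math.min(start + BATCH_SIZE, blockBlooms.length);
            // If the combined bloom lacks a filter bit, no block in the batch has it.
            if (!mayMatch(combine(blockBlooms, start, end), filter)) {
                continue;
            }
            for (int i = start; i < end; i++) {
                if (mayMatch(blockBlooms[i], filter)) {
                    result.add(i);
                }
            }
        }
        return result;
    }
}
```

Because a bloom match is only a "maybe", the blocks returned here would still need to be loaded and their logs checked against the actual filter.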

The node can also auto-index blooms for new blocks as they are added to the chain from the network, but this feature is OFF by default. It can be enabled with the JVM flag -Dblooms.service=true or by setting blooms.service to true in the node's config.

In addition to the existing bloom indexing capabilities, a new CLI tool, IndexBlooms, is added. It indexes a specified block range in advance and can be run like this

java -cp ./rskj-core/build/libs/rskj-core-3.2.0-SNAPSHOT-all.jar co.rsk.cli.tools.IndexBlooms earliest latest

for the entire chain, or like this

java -cp ./rskj-core/build/libs/rskj-core-3.2.0-SNAPSHOT-all.jar co.rsk.cli.tools.IndexBlooms 1000 5000

for a specific range, in this case: [1000..5000].
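
As a rough sketch of how the two positional arguments might be turned into a block range: the class BlockRangeSketch, the parseRange and resolve methods, and the validation rules below are assumptions made for illustration. The tool's own makeBlockRange method takes a BlockStore and may behave differently.

```java
/** Illustrative only: resolves "earliest", "latest", or numeric arguments to an inclusive range. */
public class BlockRangeSketch {

    /** Inclusive range of block numbers. */
    static final class Range {
        final long from;
        final long to;

        Range(long from, long to) {
            this.from = from;
            this.to = to;
        }
    }

    /** Maps one argument to a block number; minBlock/maxBlock stand in for the store's bounds. */
    static long resolve(String arg, long minBlock, long maxBlock) {
        switch (arg) {
            case "earliest":
                return minBlock;
            case "latest":
                return maxBlock;
            default:
                // Throws NumberFormatException for anything that is not a number.
                return Long.parseLong(arg);
        }
    }

    /** Builds the range from the first two arguments and rejects inverted or out-of-bounds input. */
    static Range parseRange(String[] args, long minBlock, long maxBlock) {
        if (args == null || args.length < 2) {
            throw new IllegalArgumentException("Expected <from> <to>");
        }
        long from = resolve(args[0], minBlock, maxBlock);
        long to = resolve(args[1], minBlock, maxBlock);
        if (from < minBlock || to > maxBlock || from > to) {
            throw new IllegalArgumentException("Invalid block range: " + from + ".." + to);
        }
        return new Range(from, to);
    }
}
```

With these assumptions, the arguments earliest latest resolve to the full span of stored blocks, and 1000 5000 resolve to the inclusive range [1000..5000].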

Another way to speed up blooms index initialisation is to persist its cache snapshots (the same as was done here for states). This can be achieved by setting persist-snapshot to true in the config like this:

...
  blooms {
    # each entry represents a range of blocks
    # initial estimated capacity: 100000 entries
    max-elements: 100000

    # (experimental, OFF by default) enables persistence of cache snapshots, which speeds up loading of entries from a disk into memory
    persist-snapshot: true
  },
...

or by passing the JVM flag -Dcache.blooms.persist-snapshot=true. For example, the earlier blooms indexing command could then look like this:

java -Dcache.blooms.persist-snapshot=true -cp ./rskj-core/build/libs/rskj-core-3.2.0-SNAPSHOT-all.jar co.rsk.cli.tools.IndexBlooms earliest latest --testnet

Note that for this to take effect, the node must also be started with the -Dcache.blooms.persist-snapshot=true flag (OFF by default) or with the corresponding config setting. It is also desirable to start the node with auto-indexing enabled, so the command could look like this:

java -Dcache.blooms.persist-snapshot=true -Dblooms.service=true -jar ./rskj-core/build/libs/rskj-core-3.2.0-SNAPSHOT-all.jar --testnet

How Has This Been Tested?

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • Tests for the changes have been added (for bug fixes / features)
  • Requires Activation Code (Hard Fork)
  • Other information:

CacheSnapshotHandler cacheSnapshotHandler = getRskSystemProperties().shouldPersistBloomsCacheSnapshot()
? new CacheSnapshotHandler(resolveCacheSnapshotPath(bloomsStorePath))
: null;
ds = new DataSourceWithCache(ds, bloomsCacheSize, cacheSnapshotHandler);
Contributor

This construction 'leaks' the first datasource created at line 1183

Contributor Author

well, I would say it rather wraps it, same as it's being done in other places like here

Contributor

hun, ok.

* Creates a block range by extract from/to values from {@code args}.
*/
@Nonnull
static Range makeBlockRange(@Nonnull String[] args, @Nonnull BlockStore blockStore) {
Contributor

/* Thats elegant */

Contributor Author

thanks)

heitorfm previously approved these changes Nov 11, 2021
@sonarqubecloud

Kudos, SonarCloud Quality Gate passed!

Bugs: 0 (rating A)
Vulnerabilities: 0 (rating A)
Security Hotspots: 0 (rating A)
Code Smells: 1 (rating A)

Coverage: 75.3%
Duplication: 0.0%

Vovchyk commented Nov 12, 2021

pipeline:run

Vovchyk commented Dec 14, 2021

pipeline:run

@Vovchyk Vovchyk merged commit f653955 into master Dec 14, 2021
@Vovchyk Vovchyk deleted the getlogs-perf branch December 14, 2021 16:42
@aeidelman aeidelman added this to the Iris v3.3.0 milestone Mar 23, 2022
3 participants