Merge block databases #2829

wezrule · 2020-06-26T19:24:43Z

The send/change/open/receive and state block databases have been removed. They are merged into a single blocks database, with each value serialized as follows:
block_type -> block -> sideband

The block_type is a new addition that allows the block and sideband to be deserialized correctly. The block_details field in the sideband could have been used instead, but that would require shrinking the number of bits for the epoch and a lot of other interface changes, which didn't seem worth it. An extra byte per block is used for the block_type instead. We already follow this approach when serializing blocks in unchecked, for instance, so it allowed code reuse there too.
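
The layout described above can be sketched roughly as follows. This is a minimal Python illustration, not the node's C++ code: the `BlockType` tag values, the `BLOCK_SIZES` table, and the helper names are assumptions made for the example, and the block and sideband are treated as opaque bytes.

```python
from enum import IntEnum


class BlockType(IntEnum):
    # Hypothetical tag values chosen for this sketch.
    SEND = 2
    RECEIVE = 3
    OPEN = 4
    CHANGE = 5
    STATE = 6


# Illustrative serialized block sizes in bytes (assumed, not taken from the PR).
BLOCK_SIZES = {
    BlockType.SEND: 152,
    BlockType.RECEIVE: 136,
    BlockType.OPEN: 168,
    BlockType.CHANGE: 136,
    BlockType.STATE: 216,
}


def serialize(block_type, block, sideband):
    """Build a database value laid out as block_type -> block -> sideband."""
    if len(block) != BLOCK_SIZES[block_type]:
        raise ValueError("block length does not match its type")
    return bytes([block_type]) + block + sideband


def deserialize(value):
    """Split a stored value into (type, block, sideband) using the leading byte."""
    block_type = BlockType(value[0])
    end = 1 + BLOCK_SIZES[block_type]
    return block_type, value[1:end], value[end:]
```

The point of the leading byte is that the reader knows how long the block is, and therefore where the sideband starts, before parsing anything else.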

Removes the block_count_type RPC, as it's pretty useless now: there's no way to distinguish block types in LMDB without extra IO to store the counts. This is possible with RocksDB as we are storing the count anyway, but I think having a consistent RPC interface is more important.

The upgrade path is interesting because we do not have a way to merge two databases with different value types. For this I first sorted the smallish databases (legacy open/receive/change) in memory, then created temporary databases for those and for the send/state blocks, which added the new value type (the extra block_type). The smallest databases were then merged before the larger ones to reduce iteration as much as possible.
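
A rough sketch of the merge ordering described above, using in-memory dictionaries in place of LMDB tables. The function names (`add_type_byte`, `merge_smallest_first`) and the tag values in the usage note are hypothetical; the real upgrade works on database cursors and temporary tables rather than Python dicts.

```python
def add_type_byte(table, type_tag):
    """Return a copy of a legacy table whose values carry a leading type byte."""
    return {key: bytes([type_tag]) + value for key, value in table.items()}


def merge_smallest_first(tables):
    """Fold the smallest table into the next smallest until one table remains.

    The input dictionaries are modified in place. Returns the merged table
    and the total number of entries that had to be copied.
    """
    if not tables:
        return {}, 0
    pending = sorted(tables, key=len)
    copied = 0
    while len(pending) > 1:
        small, large = pending[0], pending[1]
        large.update(small)      # copy only the smaller side
        copied += len(small)
        pending = sorted(pending[1:], key=len)
    return pending[0], copied
```

For example, with tables of 1, 2 and 10 entries, the 1-entry table is copied into the 2-entry one, and the resulting 3 entries are then copied into the 10-entry table, so only 4 entries are copied in total.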

With this change, RocksDB no longer needs to worry about stale memtables. Checking if a block exists sometimes took up to 5 database reads; now only 1 is required. It has also reduced complexity in various areas.
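
To illustrate the read-count claim above, here is a small sketch that counts lookups against five per-type tables versus one merged table. The `CountingTable` class and both helper functions are assumptions for the example, not the node's store interface.

```python
class CountingTable:
    """Dictionary wrapper that counts how many lookups it serves."""

    def __init__(self, entries=None):
        self.entries = dict(entries or {})
        self.reads = 0

    def exists(self, block_hash):
        self.reads += 1
        return block_hash in self.entries


def exists_legacy(tables, block_hash):
    """Probe each per-type table in turn; the worst case touches all of them."""
    return any(table.exists(block_hash) for table in tables)


def exists_merged(blocks, block_hash):
    """A single probe against the merged blocks table."""
    return blocks.exists(block_hash)
```

With five legacy tables, a hash that is missing or lives in the last table probed costs five reads, while the merged table always costs one.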

The database upgrade was tested on a few systems, logging info shown below:

Windows 10 - SSD, Ryzen 3700X, 64GB RAM (20 minutes)

[2020-Jun-26 14:46:35.843756]: Preparing v18 to v19 database upgrade...
[2020-Jun-26 14:47:26.973934]: Write legacy open/receive/change to new format
[2020-Jun-26 14:47:37.987258]: Write legacy send to new format
[2020-Jun-26 14:48:43.137459]: Merge legacy open/receive/change with legacy send blocks
[2020-Jun-26 14:48:59.067979]: Write state blocks to new format
[2020-Jun-26 15:00:04.990695]: Merging all legacy blocks with state blocks
[2020-Jun-26 15:07:22.412141]: Finished upgrading all blocks to new blocks database

Windows 10 - SSD (NVME), Ryzen 3700X, 64GB RAM (11 minutes)

[2020-Jun-27 08:43:20.698023]: Preparing v18 to v19 database upgrade...
[2020-Jun-27 08:43:38.285549]: Write legacy open/receive/change to new format
[2020-Jun-27 08:43:48.103760]: Write legacy send to new format
[2020-Jun-27 08:44:19.196946]: Merge legacy open/receive/change with legacy send blocks
[2020-Jun-27 08:44:32.918036]: Write state blocks to new format
[2020-Jun-27 08:50:06.377159]: Merging all legacy blocks with state blocks
[2020-Jun-27 08:54:47.716775]: Finished upgrading all blocks to new blocks database

Ubuntu - SSD, Ryzen 2600, 16GB RAM (19 mins)

[2020-Jun-26 14:49:07.484607]: Preparing v18 to v19 database upgrade...
[2020-Jun-26 14:49:30.491715]: Write legacy open/receive/change to new format
[2020-Jun-26 14:49:37.030250]: Write legacy send to new format
[2020-Jun-26 14:50:16.205321]: Merge legacy open/receive/change with legacy send blocks
[2020-Jun-26 14:50:24.769599]: Write state blocks to new format
[2020-Jun-26 15:03:05.067088]: Merging all legacy blocks with state blocks
[2020-Jun-26 15:08:08.497628]: Finished upgrading all blocks to new blocks database
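
The per-phase durations can be read off the timestamps in the logs above. A small sketch for doing that mechanically follows; the function name `phase_durations` is made up for this example, and it assumes every line follows the `[YYYY-Mon-DD HH:MM:SS.ffffff]: message` shape shown above with English month abbreviations.

```python
import re
from datetime import datetime

# Matches "[timestamp]: message" lines as they appear in the node log.
LINE = re.compile(r"^\[(?P<ts>[^\]]+)\]: (?P<msg>.*)$")


def phase_durations(log_text):
    """Return (message, seconds until the next log line) for each phase."""
    entries = []
    for line in log_text.splitlines():
        match = LINE.match(line.strip())
        if match:
            stamp = datetime.strptime(match["ts"], "%Y-%b-%d %H:%M:%S.%f")
            entries.append((stamp, match["msg"]))
    return [
        (msg, (next_stamp - stamp).total_seconds())
        for (stamp, msg), (next_stamp, _) in zip(entries, entries[1:])
    ]
```

Each duration is attributed to the earlier of two consecutive lines, so the final "Finished upgrading" line has no duration of its own.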

The starting ledger was 36GB. After the upgrade it becomes 61GB unvacuumed and 22GB vacuumed. Currently it is set up to vacuum automatically after the upgrade; however, this might be difficult for some users with storage constraints. Perhaps we should make this step optional, or also vacuum the pre-upgrade ledger first?

Also did some benchmarking of LMDB/RocksDB performance & ledger size when using fixed and variable sized values:

Database	Operation	Fixed-size value (s)	Variable-size value (s)
LMDB	get	11.2	10.01
LMDB	put	46	45
RocksDB	get	3.1	3.1
RocksDB	put	37	37

Didn't see any real difference between fixed and variable sized values in LMDB or RocksDB, or any difference in ledger size. All gets were recorded after a computer restart to prevent OS caching from affecting the results.

guilhermelawless · 2020-07-09T09:03:51Z

Had a bootstrap in progress, stopped and upgraded the database with this PR (~21M block count, 30M unchecked) on Ryzen 3600, 16GB RAM and NVME SSD:

[2020-Jul-09 09:13:52.792313]: Preparing v18 to v19 database upgrade...
[2020-Jul-09 09:15:23.890906]: Write legacy open/receive/change to new format
[2020-Jul-09 09:15:46.475696]: Write legacy send to new format
[2020-Jul-09 09:18:15.845385]: Merge legacy open/receive/change with legacy send blocks
[2020-Jul-09 09:19:22.825827]: Write state blocks to new format
[2020-Jul-09 09:25:42.851488]: Merging all legacy blocks with state blocks
[2020-Jul-09 09:28:50.550178]: Finished upgrading all blocks to new blocks database
[2020-Jul-09 09:29:04.930051]: Preparing vacuum...
[2020-Jul-09 09:51:58.636846]: Vacuum succeeded.

Though painful, the benefits are significant.

I agree with vacuuming pre-upgrade, possibly even a rebuild as we've seen that speeds up upgrades considerably and reduces the maximum size reached during the upgrade.

@zhyatt Should this have the removal label instead of semantic?

zhyatt · 2020-07-09T14:26:08Z

@guilhermelawless Yes, just swapped out the labels.

wezrule · 2020-07-13T21:19:25Z

Windows 10 - SSD (NVME), Ryzen 3700X, 64GB RAM (11 minutes)

[2020-Jul-13 21:17:52.125735]: Preparing vacuum...
[2020-Jul-13 21:35:59.116457]: Vacuum succeeded.
[2020-Jul-13 21:35:59.118457]: Preparing v18 to v19 database upgrade...
[2020-Jul-13 21:36:07.470989]: Write legacy open/receive/change to new format
[2020-Jul-13 21:36:31.223525]: Write legacy send to new format
[2020-Jul-13 21:37:06.718630]: Merge legacy open/receive/change with legacy send blocks
[2020-Jul-13 21:37:52.722318]: Write state blocks to new format
[2020-Jul-13 21:43:13.007033]: Merging all legacy blocks with state blocks
[2020-Jul-13 21:48:06.407661]: Finished upgrading all blocks to new blocks database
[2020-Jul-13 21:48:10.481677]: Preparing vacuum...
[2020-Jul-13 22:07:26.969579]: Vacuum succeeded.

I did some experimenting with a rebuild/vacuum both before and after the upgrade. After the first rebuild, data.ldb reaches 63GB at peak, then is vacuumed to 18GB. After the upgrade and rebuild, the data.ldb file reached 89GB, and an additional 18GB is then needed for the copy with compaction, so 117GB in total at peak. The v18 to v19 upgrade itself took a couple of minutes longer as well, and the pre-upgrade rebuild/vacuum would need an additional 18 minutes, so it doesn't seem worth it.

guilhermelawless

LGTM pending doc updates.

SergiySW · 2020-07-24T16:05:47Z

Ubuntu 20.04 - NVMe SSD Samsung 970 EVO Plus 512GB, Ryzen 3900X, 64GB RAM (5 mins)

[2020-Jul-24 18:53:15.493891]: Preparing v18 to v19 database upgrade...
[2020-Jul-24 18:53:23.304675]: Write legacy open/receive/change to new format
[2020-Jul-24 18:53:25.715227]: Write legacy send to new format
[2020-Jul-24 18:53:35.892972]: Merge legacy open/receive/change with legacy send blocks
[2020-Jul-24 18:53:41.872384]: Write state blocks to new format
[2020-Jul-24 18:54:56.944585]: Merging all legacy blocks with state blocks
[2020-Jul-24 18:58:21.824351]: Finished upgrading all blocks to new blocks database
[2020-Jul-24 18:58:26.377069]: Preparing vacuum...
[2020-Jul-24 18:59:12.450875]: Vacuum succeeded.

SergiySW · 2020-07-28T05:25:17Z

Ubuntu 20.04 - NVMe Optane, Ryzen 3900X, 64GB RAM (2 mins)

[2020-Jul-28 08:17:59.413575]: Preparing v18 to v19 database upgrade...
[2020-Jul-28 08:18:01.887120]: Write legacy open/receive/change to new format
[2020-Jul-28 08:18:04.311892]: Write legacy send to new format
[2020-Jul-28 08:18:08.515955]: Merge legacy open/receive/change with legacy send blocks
[2020-Jul-28 08:18:14.605210]: Write state blocks to new format
[2020-Jul-28 08:19:05.256456]: Merging all legacy blocks with state blocks
[2020-Jul-28 08:20:08.191290]: Finished upgrading all blocks to new blocks database
[2020-Jul-28 08:20:10.844470]: Preparing vacuum...
[2020-Jul-28 08:20:25.077202]: Vacuum succeeded.

Merge block databases

9df9320

wezrule added the documentation, performance, and database labels Jun 26, 2020

wezrule added this to the V22.0 milestone Jun 26, 2020

wezrule self-assigned this Jun 26, 2020

Some code cleanup

b3af678

wezrule added the database structure, rpc, and semantic labels Jun 28, 2020

Open new blocks database in RocksDB

edb246c

wezrule added the blocker label Jul 6, 2020

wezrule requested review from guilhermelawless and SergiySW July 6, 2020 11:19

zhyatt added the removal label and removed the semantic label Jul 9, 2020

wezrule added 3 commits July 13, 2020 22:22

Removed comment

0b18c83

Use new blocks database with --rebuild

35e38cd

Merge branch 'develop' into merge_block_databases

aa01f60

guilhermelawless previously approved these changes Jul 17, 2020

Remove block_counts (Serg review)

89aea93

wezrule dismissed guilhermelawless’s stale review via 89aea93 July 24, 2020 17:21

SergiySW approved these changes Jul 24, 2020

wezrule merged commit df5c0b4 into nanocurrency:develop Jul 24, 2020

wezrule deleted the merge_block_databases branch July 24, 2020 18:30

wezrule mentioned this pull request Jul 27, 2020

Block tables merged nanocurrency/nanodb-specification#5

Closed

wezrule mentioned this pull request Jan 8, 2021

block_count_type is removed nanocurrency/nano-docs#449

Merged
