Implement flat storage for state dumps #8322

Closed
Longarithm opened this issue Jan 10, 2023 · 1 comment

Longarithm commented Jan 10, 2023

We have an option to initialize a node from a state dump, see genesis-tools/README.md. This is used to start a node that already has a large state. The state is stored in the state_dump file, which currently contains the serialized DBCol::State column. It is used in params-estimator and in some nayduck tests (state_sync_massive.py and state_sync_massive_validator.py) to check that state sync works.

We need to support flat storage there as well. I have three options in mind:

  • change the state_dump file format to store only KV pairs, and then construct the Trie and Flat Storage from them
  • write the expected contents of DBCol::FlatState* to this file as well
  • read the file as is and start the flat storage migration

Some TODOs:

  • update the README to explain how genesis-populate is used
  • write a test specifically for Runtime that fills the state from a state dump (calls genesis_state_from_dump) and then makes some state reads

Progress:

Longarithm commented Jan 11, 2023

At the meeting today, we decided that state dumps are important for parameter estimation, because there we need to restore the "real" state of a filled DB. For the same reason, we don't want to make dumps more compact by storing only the last state.

So it makes sense to just add the contents of the DBCol::FlatState* columns to the dump file. I think we don't need deltas, though, and we can set the flat storage heads to the genesis block hashes, because GenesisBuilder doesn't really process any blocks.

Pointers:

  • GenesisBuilder::flush_shard_records should save flat state changes together with trie changes
  • Store::save_to_file and load_from_file could be changed to save_state_to_file and load_state_from_file. Alternatively, we can store the flat storage columns in separate files, because we already use two files.
  • in test_dump_load_trie it would be nice to test reading from flat state as well
  • we need a test for Runtime in which genesis_state_from_dump is called and a read from flat state is made. The current implementation should fail this test.
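
The last pointer can be illustrated with a toy model. The sketch below is not nearcore code: `Column`, `ToyStore` and `flat_read_matches_trie` are hypothetical names, and the store treats both columns as plain key-value maps (the real DBCol::State holds trie data, which the sketch ignores). It only shows why a dump that contains just the State column should fail a flat state read, while a dump that also carries the flat state column should pass.

```rust
use std::collections::HashMap;

/// Hypothetical stand-ins for the two kinds of columns discussed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Column {
    State,
    FlatState,
}

/// Toy in-memory store mapping (column, key) to value.
/// Assumption: both columns are modelled as flat key-value maps.
#[derive(Default)]
pub struct ToyStore {
    data: HashMap<(Column, Vec<u8>), Vec<u8>>,
}

impl ToyStore {
    /// Stand-in for loading a dump: copy every dumped record into the store.
    pub fn load(records: &[(Column, &[u8], &[u8])]) -> Self {
        let mut store = ToyStore::default();
        for (col, key, value) in records {
            store.data.insert((*col, key.to_vec()), value.to_vec());
        }
        store
    }

    /// Read a value from the given column, if present.
    pub fn get(&self, col: Column, key: &[u8]) -> Option<&[u8]> {
        self.data.get(&(col, key.to_vec())).map(|v| v.as_slice())
    }
}

/// The check the proposed test would make: the key must be readable
/// through flat state and agree with the value read through the State column.
pub fn flat_read_matches_trie(store: &ToyStore, key: &[u8]) -> bool {
    let trie_value = store.get(Column::State, key);
    let flat_value = store.get(Column::FlatState, key);
    trie_value.is_some() && trie_value == flat_value
}
```

Loading `[(Column::State, key, value)]` alone makes `flat_read_matches_trie` return `false`; adding a matching `Column::FlatState` record makes it return `true`.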

@pugachAG pugachAG self-assigned this Jan 11, 2023
@pugachAG pugachAG linked a pull request Jan 23, 2023 that will close this issue
near-bulldozer bot pushed a commit that referenced this issue Jan 25, 2023
Extend state dump to include FlatState column.  State dump format is
changed to support multiple columns by including column index along
with every key-value pair.  This introduces an overhead of 1 byte per
entry which should be negligible.

While we are changing the format, also add an end of file mark so that
we can properly detect truncated files.  Previously, if we were very
unlucky and the file was truncated just at a record border we wouldn’t
notice it when reading the file.  With explicit end of file marker we
can detect it now.

Co-authored-by: Anton Puhach <anton@near.org>
Issue: #8322
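
The format described in this commit message can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual nearcore implementation: the `Entry` type, the `0xFF` marker value, the 8-byte little-endian length prefixes and the function names are all made up for the example. What it does show is the one-byte column index per record and how an explicit end-of-file marker lets the reader detect a file truncated exactly at a record border.

```rust
use std::io::{self, Read, Write};

/// Hypothetical end-of-file marker; the value used by nearcore may differ.
const EOF_MARK: u8 = 0xFF;

/// One dump record: a one-byte column index plus a key-value pair.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Entry {
    pub column: u8,
    pub key: Vec<u8>,
    pub value: Vec<u8>,
}

/// Write a byte string with an 8-byte little-endian length prefix (assumed layout).
fn write_bytes<W: Write>(out: &mut W, bytes: &[u8]) -> io::Result<()> {
    out.write_all(&(bytes.len() as u64).to_le_bytes())?;
    out.write_all(bytes)
}

/// Read a length-prefixed byte string; fails if the input ends early.
fn read_bytes<R: Read>(input: &mut R) -> io::Result<Vec<u8>> {
    let mut len = [0u8; 8];
    input.read_exact(&mut len)?;
    let mut buf = vec![0u8; u64::from_le_bytes(len) as usize];
    input.read_exact(&mut buf)?;
    Ok(buf)
}

/// Write all entries, each tagged with its column index, then the end marker.
pub fn write_dump<W: Write>(out: &mut W, entries: &[Entry]) -> io::Result<()> {
    for e in entries {
        if e.column == EOF_MARK {
            return Err(io::Error::new(
                io::ErrorKind::InvalidInput,
                "column index collides with end-of-file marker",
            ));
        }
        out.write_all(&[e.column])?;
        write_bytes(out, &e.key)?;
        write_bytes(out, &e.value)?;
    }
    out.write_all(&[EOF_MARK])
}

/// Read entries until the end marker. If the input ends before the marker
/// is seen, even exactly at a record border, `read_exact` returns an error.
pub fn read_dump<R: Read>(input: &mut R) -> io::Result<Vec<Entry>> {
    let mut entries = Vec::new();
    loop {
        let mut tag = [0u8; 1];
        input.read_exact(&mut tag)?;
        if tag[0] == EOF_MARK {
            return Ok(entries);
        }
        let key = read_bytes(input)?;
        let value = read_bytes(input)?;
        entries.push(Entry { column: tag[0], key, value });
    }
}
```

Without the marker, a reader that stops at the first clean end of input cannot tell a complete file from one cut off after a whole record. With it, dropping even the final byte turns a successful read into an `UnexpectedEof` error.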
near-bulldozer bot pushed a commit that referenced this issue Jan 26, 2023
Part of #8322.

#8424 is a prerequisite for this 

This also re-enables nayduck tests disabled in #8403. Verified with:
```
cargo build --features nightly
python tests/sanity/state_sync_massive.py
python tests/sanity/state_sync_massive_validator.py
```
ppca pushed a commit to ppca/nearcore that referenced this issue Jan 30, 2023
ppca pushed a commit to ppca/nearcore that referenced this issue Jan 30, 2023
@Longarithm Longarithm mentioned this issue Feb 3, 2023
26 tasks
near-bulldozer bot pushed a commit that referenced this issue Feb 6, 2023
Part of #8322.

### Testing

To make sure that flat state is actually used, I made the following change to [`get_ref`](https://github.com/near/nearcore/blob/633a0443a57f8970bc0703f4eb61ccc4325d3020/core/store/src/trie/mod.rs#L951):
```
             if matches!(mode, KeyLookupMode::FlatStorage) && !is_delayed {
                 if let Some(flat_state) = &self.flat_state {
                     let flat_result = flat_state.get_ref(&key);
+                    println!("flat ok");
                     assert_eq!(result, flat_result);
                 }
             }
```
and then checked the output of the following command:
```
cargo run --package runtime-params-estimator --features "required nightly" --bin runtime-params-estimator -- --accounts-num 2000 --additional-accounts-num 2000 --iters 1 --warmup-iters 1 --metric time
```

Please note that we cannot enable flat state assert as part of `sanity_check` test, see [zulip discussion](https://near.zulipchat.com/#narrow/stream/295306-pagoda.2Fcontract-runtime/topic/parameter.20estimator/near/324929122) for more context.