Suggestion for running BSC nodes #875
Comments
Any suggestion for running an archive node in the cloud?
@forcodedancing Thanks for this content, it is really useful! Could you add the commands for running the node in different ways (archive, light, ...)?
Which is faster, diffsync or pipecommit?
I have already pruned state; there is only 600GB in my node folder, but my node still lags a little in performance (compared with a server of the same location and spec). I can't figure it out, lol.
You've outlined two pruning methods. For the minimal size possible, should we be running both? Is there any dependency between these two commands? For example, does it matter which one we run first?
The disk requirement is very high. I believe a 10T ~ 15T disk is required. If you have such a disk, you can have a try.
Sure, I will add more detail on this. |
Pipecommit is suggested; use the latest release.
Can you check your disk? It is usually the bottleneck now. |
There is no dependency, and the order does not affect the final performance. You can do them one by one, not in parallel.
I found errors when running …
Does anyone know a good VPS/dedicated server provider for hosting a full node in the US?
@tntwist vultr |
@cruzerol Thanks. What instance would you suggest there? |
Sure, please submit an issue and we can further analyze it. Thanks.
@tntwist Bare Metal - $350
Hi @forcodedancing, my node keeps getting the "Synchronisation failed, dropping peer" issue and stops syncing... The only solution is to restart, and it happens very often. Attached is the performance profile; please help review it.
|
@forcodedancing or anyone: I observed the below log during state sync. Does this mean the heal phase should run until the accounts in this phase reach 135,781,417?
lvl=info msg="State heal in progress" accounts=2,797,558@164.39MiB slots=1,961,139@143.59MiB codes=1875@16.24MiB nodes=30,732,105@10.01GiB pending=161,822
The spec should be fine for a fullnode. Did you use the snapshot? I also suggest running with fastnode.
|
Just chiming in here since I opened a related issue: #1198. There's an issue with syncing from scratch where it gets caught in "state heal" forever. I had the same problem on go-ethereum, which was solved by some recent commits, and there is now an open PR for bsc: #1226. These should fix/improve performance for syncing from scratch. I noticed on a c6a.8xlarge with 9k IOPS that it seemed to finish its initial sync after 8 hours, then go into the "state heal" loop, so hopefully the improvements will mean it finishes in roughly that amount of time.

In the meantime, if you do not care about historical/archive data, I was able to start a node from snapshot on the same specs. I had to download the archive and wait for it to finish (2-3 hours), then unzip it (another 3 hours), then wait for it to start up and do any initial catchup (another 1-2 hours), which meant having to monitor it for the next step. One thing I noticed was that the Go implementation of lz4 seems to be way faster on the CLI, I think because the C implementation is not using threads but the Go implementation is. The Go lz4 implementation at https://github.com/pierrec/lz4 has a CLI included in the default Fedora repos and reduced the archive extraction by about an hour. I didn't need to use fastnode with these specs when using a snapshot, and am now using full sync.

Once the PR merges, I will try from scratch again, but I estimate I will be able to reduce the specs to 3k IOPS and a c6a.4xlarge (16 vCPU, 32 GiB RAM), based on the current bsc node's (from snapshot) resource consumption and the specs of my go-ethereum node (note: this is a "full node", not a validator). I prefer syncing from scratch due to supply-chain attack concerns over using a snapshot, and since there are manual steps every few hours, rather than starting the node and just waiting for the initial sync.

One thing I was unsure about: is there a configuration to allow prune to take place concurrently while the node is running, like it does for go-ethereum/nethermind, rather than having to manually stop the node and run prune?
@DaveWK Thanks for sharing your useful experience and suggestions. There is no in-place or online prune now.
Hi! I installed and fully synced a full node. It is working well. Thanks! |
put it in FAQ: #1947 |
The transaction volume of BSC is huge, which sometimes makes it challenging to run BSC nodes with good performance. Here, information for running BSC nodes is collected and summarized. Hope it is useful; any suggestion or discussion is welcome.
Binary
All clients are advised to upgrade to the latest release. The latest version is expected to be more stable and to deliver better performance.
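A minimal sketch of building the latest release from source, assuming the public bsc repository and an installed Go toolchain; the release tag is a placeholder, so check the releases page for the current one:

```
# Build the latest release from source (assumes git, make, and Go are installed).
git clone https://github.com/bnb-chain/bsc.git
cd bsc
git checkout <latest-release-tag>   # placeholder: pick the current release tag
make geth                           # produces ./build/bin/geth
```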
Spec for running nodes
The following are the recommended specs for running a validator and a fullnode.
Running validator
Running fullnode
Storage optimization
Block prune
If you do not care about historical blocks/txs (e.g., the txs in an old block), you can take the following steps to prune blocks.
nohup geth snapshot prune-block --datadir {the data dir of your bsc node} --datadir.ancient {the ancient data dir of your bsc node} --block-amount-reserved 1024 &
It will take 3-5 hours to finish.
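A sketch of the surrounding workflow; the systemd unit name bsc and the datadir paths are assumptions, so adjust them to your setup. Since there is no online prune, the node must be stopped first:

```
systemctl stop bsc                         # assumed unit name; stop the node first
du -sh ./node/geth/chaindata/ancient       # ancient store size before pruning
nohup geth snapshot prune-block --datadir ./node \
  --datadir.ancient ./node/geth/chaindata/ancient \
  --block-amount-reserved 1024 &
# wait for the background job to finish (3-5 hours), then compare:
du -sh ./node/geth/chaindata/ancient
systemctl start bsc
```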
State prune
According to our tests, the performance of a fullnode degrades when the storage size exceeds 1.5T. We suggest that the fullnode always keep light storage by pruning the state storage.
nohup geth snapshot prune-state --datadir {the data dir of your bsc node} &
It will take 3-5 hours to finish.
Notice: there is no in-place or online prune, so the node must be stopped before pruning; run the two prune commands one at a time rather than in parallel.
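The same pattern applies to state prune; a sketch, again assuming a systemd unit named bsc and a ./node datadir:

```
systemctl stop bsc                          # stop the node; there is no online prune
nohup geth snapshot prune-state --datadir ./node &
tail -f nohup.out                           # watch progress; the run takes 3-5 hours
# only restart the node once the prune job has exited:
systemctl start bsc
```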
Sync mode
Pipecommit
The pipecommit feature was introduced in release v1.1.8 for full sync. You can enable it by adding --pipecommit to the starting command when running full sync.
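A sketch of a starting command with the flag enabled; the config and datadir paths are assumptions:

```
# run full sync with pipecommit enabled
geth --config ./config.toml --datadir ./node --pipecommit
```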
Light storage
When the node crashes or is force-killed, it will sync from a block that was minted a few minutes or a few hours ago. This is because the state in memory is not persisted into the database in real time, and the node needs to replay blocks from the last checkpoint. The replay time depends on the TrieTimeout setting in config.toml. We suggest raising it if you can tolerate a long replay time, so the node can keep light storage.
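A sketch of where the setting lives, assuming geth's standard TOML layout (verify yours with geth dumpconfig); the value shown is illustrative only:

```
# geth's standard TOML layout keeps TrieTimeout in the [Eth] section of
# config.toml; the value is a Go duration in nanoseconds, for example:
#
#   [Eth]
#   TrieTimeout = 6000000000000   # 100 minutes, illustrative value only
#
# Inspect the current setting, then raise it in your editor:
grep TrieTimeout ./config.toml
```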
Performance monitoring
For importing blocks, you can monitor the key metrics using Prometheus/Grafana by adding --metrics to your starting command. You can find more metrics of interest in the source code and monitor them as well.
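A sketch of exposing the metrics for Prometheus to scrape; the address, port, and paths are assumptions based on geth's standard metrics flags:

```
# serve metrics over HTTP so Prometheus can scrape them
geth --config ./config.toml --datadir ./node \
  --metrics --metrics.addr 127.0.0.1 --metrics.port 6060

# quick check that the endpoint is up:
curl -s http://127.0.0.1:6060/debug/metrics/prometheus | head
```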
Performance tuning
mgasps indicates the block processing ability of the fullnode; make sure the value stays above 50.
If performance degrades, you can profile the node by adding --pprof to the starting command. Profiles can be taken by curl -sK -v http://127.0.0.1:6060/debug/pprof/profile?seconds=60 > profile_60s.out, and the dev community can help analyze the profile.
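A sketch of taking a profile and inspecting it locally before sharing it; go tool pprof ships with the Go toolchain, and pprof serves on port 6060 by default:

```
# take a 60-second CPU profile from the running node
curl -sK -v "http://127.0.0.1:6060/debug/pprof/profile?seconds=60" > profile_60s.out

# show the hottest functions locally before handing the file to the dev community
go tool pprof -top profile_60s.out
```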
Snapshot for new node
If you want to build a new BSC node, please fetch a snapshot from bsc-snapshots.
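A sketch of fetching and unpacking a snapshot; the URL is a placeholder (copy the current link from the bsc-snapshots repository), the datadir path is an assumption, and the multi-threaded Go lz4 CLI mentioned in the comments above can noticeably speed up extraction:

```
# download the snapshot archive (URL placeholder: copy it from bsc-snapshots)
wget -O geth.tar.lz4 "<snapshot-url-from-bsc-snapshots>"

# decompress and unpack in one pass into the datadir
lz4 -cd geth.tar.lz4 | tar -x -C ./node
```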
Improvement suggestion
Feel free to raise pull requests or submit BEPs for your ideas.