Sync Issues Encountered While Testing Gossipsub Effectiveness Measurement #11136
Labels
area/chain (Area: Chain)
kind/bug (Kind: Bug)
need/team-input (Hint: Needs Team Input)
P1 (Must be resolved)
Checklist
Latest release, the most recent RC (release candidate) for the upcoming release or the dev branch (master), or have an issue updating to any of these.
Lotus component
Lotus Version
Repro Steps
config.toml
Describe the Bug
#11118 aims to measure the effectiveness of Gossipsub across a minimum of 10% of active mainnet nodes.
Testing on a daemon-only node did not appear to cause any issues, but significant sync loss was observed when running the tracer on a node connected to a lotus-miner instance.
Slack reference: https://filecoinproject.slack.com/archives/CP50PPW2X/p1690803449761209
The daemon doesn't "appear" to be out of sync, but the miner is repeatedly falling behind with a declining baseDeltaSeconds, which should ideally be a consistent and repeating value of 10.
This is similar behaviour to the sync issues experienced in the now-resolved #10906, which was also connected to Gossipsub; however, it doesn't appear to be triggered by pending mpool messages in this instance. This is effectively rendering an SP operation completely inoperable!
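For anyone trying to confirm the same symptom on their own node, here is a minimal sketch for pulling baseDeltaSeconds values out of a miner log and flagging samples that stray from the expected 10. It assumes each relevant log line contains a `"baseDeltaSeconds": <number>` (or `baseDeltaSeconds=<number>`) pair; that layout is an assumption rather than a confirmed log format, and the function names and the tolerance of 1 second are hypothetical choices for illustration.

```python
import re

# Assumption: the miner log exposes the value as "baseDeltaSeconds": <number>
# or baseDeltaSeconds=<number>. Adjust the pattern if your log layout differs.
BASE_DELTA_RE = re.compile(r'"?baseDeltaSeconds"?\s*[:=]\s*(-?\d+(?:\.\d+)?)')

# The value this issue describes as healthy.
EXPECTED_DELTA = 10.0


def extract_base_deltas(lines):
    """Return every baseDeltaSeconds value found, in log order."""
    values = []
    for line in lines:
        match = BASE_DELTA_RE.search(line)
        if match:
            values.append(float(match.group(1)))
    return values


def find_drift(values, expected=EXPECTED_DELTA, tolerance=1.0):
    """Return (index, value) pairs deviating from expected by more than tolerance."""
    return [
        (index, value)
        for index, value in enumerate(values)
        if abs(value - expected) > tolerance
    ]
```

Passing an open log file to `extract_base_deltas` and the result to `find_drift` should list the samples where the miner's base delta moved away from 10, which makes the decline easier to line up against the daemon log timestamps.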
No sealing or proving activities were active on the miner at the time of testing, but both Lotus Slasher and Lotus Disputer were active on the node in question.
Daemon and miner logs are attached below.
lotus-daemon (copy).log
lotus-miner (copy).log
Logging Information