Node stuck / indexer stuck / goroutine leak #31732

Closed
@extradz

Description

System information

CPU: AMD 5950X
RAM: 128 GB
Disk: 7 TB NVMe

Geth version: V.1.15.10
CL client & version: Prysm
OS & Version: Ubuntu 22.04

I have been running an Ethereum RPC node with Geth for 3 years without any issue.
My node stops syncing and loses its connection to Prysm if it receives a lot of GetTransaction requests.

I do not have this problem with Geth V.1.15.5 (confirmed after several tests).

logs:
GETH

Apr 27 03:10:13 Ubuntu-2204-jammy-amd64-base bash[3592775]: INFO [04-27|03:10:13.247] Imported new potential chain segment     number=22,357,008 hash=820d85.>
Apr 27 03:10:13 Ubuntu-2204-jammy-amd64-base bash[3592775]: INFO [04-27|03:10:13.336] Chain reorg detected                     number=22,357,007 hash=1d3216.>
Apr 27 03:12:33 Ubuntu-2204-jammy-amd64-base bash[3592775]: WARN [04-27|03:12:33.065] Beacon client online, but no consensus updates received in a while. Ple>

PRYSM

Apr 27 03:10:21 Ubuntu-2204-jammy-amd64-base prysm.sh[3562493]: time="2025-04-27 01:10:21" level=info msg="Synced new block" block=0x2122dc91... epoch=361698>
Apr 27 03:10:21 Ubuntu-2204-jammy-amd64-base prysm.sh[3562493]: time="2025-04-27 01:10:21" level=info msg="Finished applying state transition" attestations=1>
Apr 27 03:10:23 Ubuntu-2204-jammy-amd64-base prysm.sh[3562493]: time="2025-04-27 01:10:23" level=info msg="Attempted late block reorg aborted due to attestat>
Apr 27 03:10:31 Ubuntu-2204-jammy-amd64-base prysm.sh[3562493]: time="2025-04-27 01:10:31" level=error msg="received an undefined execution engine error" err>
Apr 27 03:10:31 Ubuntu-2204-jammy-amd64-base prysm.sh[3562493]: time="2025-04-27 01:10:31" level=info msg="Chain reorg occurred" commonAncestorRoot=0x96104a0>
Apr 27 03:10:35 Ubuntu-2204-jammy-amd64-base prysm.sh[3562493]: time="2025-04-27 01:10:35" level=info msg="Attempted late block reorg aborted due to attestat>
Apr 27 03:10:43 Ubuntu-2204-jammy-amd64-base prysm.sh[3562493]: time="2025-04-27 01:10:43" level=error msg="received an undefined execution engine error" err>
Apr 27 03:10:43 Ubuntu-2204-jammy-amd64-base prysm.sh[3562493]: time="2025-04-27 01:10:43" level=info msg="Chain reorg occurred" commonAncestorRoot=0x96104a0>
Apr 27 03:10:43 Ubuntu-2204-jammy-amd64-base prysm.sh[3562493]: time="2025-04-27 01:10:43" level=error msg="Could not handle p2p pubsub" error="could not not>
Apr 27 03:14:16 Ubuntu-2204-jammy-amd64-base prysm.sh[3562493]: time="2025-04-27 01:14:16" level=info msg="Connected peers" inboundQUIC=57 inboundTCP=21 outb>
Apr 27 03:15:00 Ubuntu-2204-jammy-amd64-base prysm.sh[3562493]: time="2025-04-27 01:15:00" level=warning msg="Execution client is not syncing" prefix=executi>

Once the connection is lost, the node stops syncing.
The service does not crash, but Geth no longer communicates with Prysm.

File: geth
Type: goroutine
Time: 2025-04-28 00:58:34 CEST
Showing nodes accounting for 225821, 100% of 225824 total
Dropped 229 nodes (cum <= 1129)
      flat  flat%   sum%        cum   cum%
    225821   100%   100%     225821   100%  runtime.gopark
         0     0%   100%     224926 99.60%  github.com/ethereum/go-ethereum/core.(*BlockChain).GetTransactionLookup
         0     0%   100%     224928 99.60%  github.com/ethereum/go-ethereum/core.(*BlockChain).TxIndexProgress (inline)
         0     0%   100%     224928 99.60%  github.com/ethereum/go-ethereum/core.(*txIndexer).txIndexProgress (inline)
         0     0%   100%     224926 99.60%  github.com/ethereum/go-ethereum/eth.(*EthAPIBackend).GetTransaction
         0     0%   100%     224916 99.60%  github.com/ethereum/go-ethereum/internal/ethapi.(*TransactionAPI).GetTransactionByHash
         0     0%   100%     224928 99.60%  github.com/ethereum/go-ethereum/rpc.(*callback).call
         0     0%   100%     224928 99.60%  github.com/ethereum/go-ethereum/rpc.(*handler).handleCall
         0     0%   100%     224928 99.60%  github.com/ethereum/go-ethereum/rpc.(*handler).handleCallMsg
         0     0%   100%     224928 99.60%  github.com/ethereum/go-ethereum/rpc.(*handler).handleMsg.func1.1
         0     0%   100%     224928 99.60%  github.com/ethereum/go-ethereum/rpc.(*handler).handleNonBatchCall
         0     0%   100%     224928 99.60%  github.com/ethereum/go-ethereum/rpc.(*handler).runMethod
         0     0%   100%     224928 99.60%  github.com/ethereum/go-ethereum/rpc.(*handler).startCallProc.func1
         0     0%   100%     224928 99.60%  reflect.Value.Call
         0     0%   100%     224928 99.60%  reflect.Value.call
         0     0%   100%     225650 99.92%  runtime.selectgo

My start command:

./geth --config ./config.toml --datadir ./mainnet  --cache=20000 --state.scheme=path --rpc.allow-unprotected-txs --history.transactions=0 --http --http.port 18545 --ws.port 18546 --rpc.txfeecap=0 --http.addr 0.0.0.0 --http.api eth,net,web3,txpool  --ws --http.corsdomain '*' --ws.addr 0.0.0.0 --ws.origins '*' --ws.api eth,net,web3,txpool --syncmode=snap --authrpc.addr localhost --authrpc.port 8551 --authrpc.vhosts localhost --authrpc.jwtsecret /home/eth/mainnet/geth/jwtsecret  --metrics  --metrics.addr 0.0.0.0 --metrics.port 6060  --vmodule="rpc=5" --pprof --pprof.addr 0.0.0.0 --pprof.port 6065 --pprof.memprofilerate 1 --pprof.blockprofilerate 1 --pprof.cpuprofile cpu.pprof

The left side of the Grafana screenshot shows the high goroutine peaks.

  • Every goroutine drop is where I restarted the node because it was stuck.
  • The hand-drawn blue arrow marks where I downgraded to Geth V.1.15.5, after which the issue no longer occurred.

[Image: Grafana screenshot of the goroutine count]
