Pod should auto-restart when encountering cometbft bug. #409

Open
2 tasks
danbryan opened this issue Mar 1, 2024 · 6 comments

@danbryan
Contributor

danbryan commented Mar 1, 2024

Pods periodically stop syncing blocks when they hit the cometbft bug. Let's find a way to detect when this has occurred and auto-restart the pod. It could be as simple as no response from the status endpoint for 2 minutes.
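As a rough illustration of the "no response from the status endpoint for 2 minutes" idea, here is a minimal sketch. It assumes the node exposes the CometBFT RPC `/status` endpoint at `STATUS_URL` (defaulting here to port 26657), and that the bug really does make that endpoint go silent, which this thread has not confirmed. The function names, the 5-second request timeout, and the 10-second polling interval are all hypothetical choices.

```shell
#!/bin/bash
# Sketch only: STATUS_URL, the timeout, and the function names are assumptions,
# not values taken from this issue.

STATUS_URL="${STATUS_URL:-http://localhost:26657/status}"

# Returns 0 if the status endpoint answers successfully within 5 seconds.
probe_status() {
    curl --silent --fail --max-time 5 "$STATUS_URL" > /dev/null
}

# Usage: unresponsive_for <threshold_seconds> <interval_seconds> <probe command...>
# Returns 0 once the probe has failed continuously for at least the threshold.
# Returns 1 as soon as the probe succeeds.
unresponsive_for() {
    local threshold="$1" interval="$2"
    shift 2
    local start=$SECONDS
    while ! "$@"; do
        if (( SECONDS - start >= threshold )); then
            return 0
        fi
        sleep "$interval"
    done
    return 1
}

# Example (commented out): treat 2 minutes of silence as "stuck".
# if unresponsive_for 120 10 probe_status; then
#     echo "status endpoint silent for 2 minutes"
#     exit 1
# fi
```

A check like this could run as a sidecar loop or back a liveness probe, so that Kubernetes itself restarts the container; which of those fits best depends on how the pods are managed.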

Tasks

jonathanpberger changed the title from "auto restart pod when cometbft bug encountered" to "Pod should auto restart when cometbft bug encountered" on Mar 11, 2024
@danbryan
Contributor Author

@vimystic can you provide an update on this?

@vimystic
Contributor

Is there a description of the cometbft bug itself somewhere?

@danbryan
Contributor Author

@agouin are you able to describe or link to the bug?

@vimystic here is a script that identifies and restarts pods that are impacted by this bug.

#!/bin/bash
# Restart cosmos-sentry pods whose recent logs show the stuck marker.

kubectl config use-context sentry-mainnet@sl-colo

# Collect "namespace pod" pairs for every cosmos-sentry pod.
PODS=( $(kubectl get pods -A --no-headers | awk '/cosmos-sentry/ {print $1, $2}') )

for (( i=0; i<${#PODS[@]}; i+=2 )); do
    ns="${PODS[i]}"
    pod="${PODS[i+1]}"
    # A "SignerListener: Connected" line in the last 30 log lines is treated as stuck.
    if kubectl logs -c node --tail=30 -n "$ns" "$pod" | grep -q "SignerListener: Connected"; then
      echo "ns: $ns pod: $pod is stuck"
      kubectl delete --wait=false pod -n "$ns" "$pod"
    fi
done
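Before automating the script above, it may help to try the detection rule without deleting anything. The sketch below keeps the same log marker and the same `kubectl` calls but adds a dry-run switch. `DRY_RUN`, `logs_show_stuck`, and `restart_if_stuck` are hypothetical names introduced for illustration and are not part of the original script.

```shell
#!/bin/bash
# Sketch only: DRY_RUN and both function names are assumptions for illustration.

DRY_RUN="${DRY_RUN:-1}"

# Reads log lines on stdin; returns 0 if the stuck marker is present.
logs_show_stuck() {
    grep -q "SignerListener: Connected"
}

# Usage: restart_if_stuck <namespace> <pod>
# Checks the last 30 log lines of the "node" container, as the script above does.
restart_if_stuck() {
    local ns="$1" pod="$2"
    if kubectl logs -c node --tail=30 -n "$ns" "$pod" | logs_show_stuck; then
        if [[ "$DRY_RUN" == "1" ]]; then
            echo "would restart $ns/$pod"
        else
            kubectl delete --wait=false pod -n "$ns" "$pod"
        fi
    fi
}
```

With `DRY_RUN=1` (the default in this sketch) the function only reports which pods match; setting `DRY_RUN=0` performs the same `kubectl delete --wait=false` as the original.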

@jonathanpberger
Contributor

jonathanpberger commented Mar 25, 2024

Depends on the kubectl config secret.

@vimystic
Contributor

vimystic commented Apr 23, 2024

Blocked until https://github.com/strangelove-ventures/infra/issues/3020 is completed.

@jonathanpberger
Contributor

jonathanpberger changed the title from "Pod should auto restart when cometbft bug encountered" to "Pod should auto-restart when encountering cometbft bug." on Jul 15, 2024