Use this manual as part of the main manual. It does not work on its own.

Delete all cryptonodes

Sometimes you may want to remove all cryptonodes along with all their data and start from a clean k8s cluster.

  1. Use these commands to delete all cryptonodes:

    cd "$PROJECT_ROOT/deploy/chains"
    helmfile -e demo destroy

    Helmfile may fail to destroy helm releases that are in certain states. In that case, use helm directly:

    cd "$PROJECT_ROOT/deploy/chains"
    helm --kube-context $KUBE_CONTEXT delete --purge demo-btc-0
    helm --kube-context $KUBE_CONTEXT delete --purge demo-btc-1
    helm --kube-context $KUBE_CONTEXT delete --purge demo-eth-0
    helm --kube-context $KUBE_CONTEXT delete --purge demo-eth-1
  2. Use these commands to remove the cryptonode disks. This leads to data loss.

    1. Delete the bitcoin node disks:

      kubectl --context $KUBE_CONTEXT delete pvc -l app.kubernetes.io/instance=demo-btc-0 -A
      kubectl --context $KUBE_CONTEXT delete pvc -l app.kubernetes.io/instance=demo-btc-1 -A
    2. Delete the ethereum node disks:

      kubectl --context $KUBE_CONTEXT delete pvc -l app.kubernetes.io/instance=demo-eth-0 -A
      kubectl --context $KUBE_CONTEXT delete pvc -l app.kubernetes.io/instance=demo-eth-1 -A
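The per-release commands above can also be written as a loop. The following is a minimal sketch, not part of the project: the `run` helper, the `DRY_RUN` switch and the `delete_all_cryptonodes` function are illustrative names, and the release list simply repeats the four releases named above. By default it only prints the commands, since the PVC deletion leads to data loss.

```shell
# Release names copied from the commands above; adjust to your helmfile.
RELEASES="demo-btc-0 demo-btc-1 demo-eth-0 demo-eth-1"

# run: print the command when DRY_RUN is 1 (the default), execute it otherwise.
# DRY_RUN is an illustrative safety switch, not a project setting.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# delete_all_cryptonodes: purge every helm release, then delete its PVCs.
delete_all_cryptonodes() {
  local release
  for release in $RELEASES; do
    run helm --kube-context "$KUBE_CONTEXT" delete --purge "$release"
  done
  for release in $RELEASES; do
    run kubectl --context "$KUBE_CONTEXT" delete pvc \
      -l "app.kubernetes.io/instance=$release" -A
  done
}
```

Call `delete_all_cryptonodes` first to review the printed commands, then run `DRY_RUN=0 delete_all_cryptonodes` to actually execute them.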

Delete and redeploy cryptonodes

  1. Perform the steps under Delete all cryptonodes.
  2. Wait about 5 minutes after the cryptonodes are removed to let Helm finish destroying all k8s resources.
  3. Follow the instructions in README.md (Actual deploy) to deploy the required cryptonodes again.
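Instead of a fixed wait, you could poll until the old resources are gone. The sketch below is an assumption about what "gone" means: it treats an empty PVC listing for the four demo releases as the signal, which may not cover every resource Helm removes. `wait_until_gone` and `list_demo_pvcs` are hypothetical helper names.

```shell
# wait_until_gone TIMEOUT_S INTERVAL_S CMD...
# Re-runs CMD until it prints nothing on stdout; fails after TIMEOUT_S seconds.
# Note: a CMD that errors out also prints nothing, so check cluster access first.
wait_until_gone() {
  local timeout_s="$1" interval_s="$2" elapsed=0
  shift 2
  while [ -n "$("$@" 2>/dev/null)" ]; do
    if [ "$elapsed" -ge "$timeout_s" ]; then
      echo "resources still present after ${timeout_s}s" >&2
      return 1
    fi
    sleep "$interval_s"
    elapsed=$((elapsed + interval_s))
  done
}

# list_demo_pvcs: names of PVCs that still belong to the demo releases.
list_demo_pvcs() {
  kubectl --context "$KUBE_CONTEXT" get pvc -A -o name \
    -l "app.kubernetes.io/instance in (demo-btc-0,demo-btc-1,demo-eth-0,demo-eth-1)"
}
```

For example, `wait_until_gone 600 15 list_demo_pvcs` checks every 15 seconds for up to 10 minutes before giving up.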

Single cryptonode operations

You may deploy just a single cryptonode, for example demo-btc-1:

cd "$PROJECT_ROOT/deploy/chains"
helmfile -e demo -l name=demo-btc-1 sync

Check the logs:

kubectl --context $KUBE_CONTEXT -n demo-btc-1 logs --since=15s --tail=10 demo-btc-1-bitcoind-0

Restart the cryptonode by deleting its pod. The pod will be recreated by the StatefulSet:

kubectl --context $KUBE_CONTEXT -n demo-btc-1 get pod
kubectl --context $KUBE_CONTEXT -n demo-btc-1 delete pod demo-btc-1-bitcoind-0

To remove this particular node:

cd "$PROJECT_ROOT/deploy/chains"
helmfile -e demo -l name=demo-btc-1 destroy

or use helm if "helmfile destroy" fails:

helm --kube-context $KUBE_CONTEXT delete --purge demo-btc-1

If you need to remove the stored cryptonode data, use the following command:

kubectl --context $KUBE_CONTEXT -n demo-btc-1 delete pvc -l app.kubernetes.io/instance=demo-btc-1

You may hit stale-resource errors if you destroy a node and re-install it right away:

Error: release demo-btc-1 failed: object is being deleted: services "demo-btc-1-lb-p2p" already exists

In this case, wait a couple of minutes after the destroy and try again.
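If you prefer not to retry by hand, a small retry wrapper can repeat the sync until the stale resources are gone. This is only a sketch; `retry` is a hypothetical helper, and the attempt count and delay are arbitrary values, not project recommendations.

```shell
# retry ATTEMPTS DELAY_S CMD...
# Runs CMD until it succeeds, at most ATTEMPTS times, sleeping DELAY_S seconds
# between attempts. Returns 1 if every attempt failed.
retry() {
  local attempts="$1" delay_s="$2" n=1
  shift 2
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    echo "attempt $n failed, retrying in ${delay_s}s" >&2
    sleep "$delay_s"
    n=$((n + 1))
  done
}
```

For example, from "$PROJECT_ROOT/deploy/chains": `retry 5 60 helmfile -e demo -l name=demo-btc-1 sync`.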

Basic GKE cluster changes

Make all cluster changes either via terragrunt/terraform or directly via GCP web/gcloud, not both. Terraform knows nothing about "direct" changes made via GCP web/gcloud and will try to revert them on subsequent runs. To make "direct" changes, follow the official GKE docs; we don't cover them here.

If you make cluster changes via terraform/terragrunt, you are limited to the terragrunt.hcl input variables. Some variable changes lead to a complete GKE destroy-and-recreate with loss of all data, for example a GKE_MASTER_REGION change. Here are some variable descriptions:

  • GKE_MASTER_REGION - GCP region in which to deploy the GKE master.
  • GKE_NODE_LOCATIONS - list of GCP zones inside the GKE_MASTER_REGION region in which to run GKE worker nodes. We run one node per zone, so the default value ["us-central1-c", "us-central1-b"] gives us 2 nodes. Pay attention to machine types: some of them are supported in specific zones only.
  • GKE_NODE_MACHINE_TYPE - machine type to use for GKE nodes. Adjust this value according to your load.
  • GKE_MASTER_AUTHORIZED_NETWORKS - IP whitelist for network-level access to the GKE master. Add your IP addresses to this whitelist.
  • GKE_NODE_IMAGE_TYPE - GKE node image, that is, the operating system image to run on each GKE node.

Once you have changed all the variables you need, apply the changes to the infrastructure. First, get the list of proposed changes:

cd "$PROJECT_ROOT/infra/live/demo/infra"
terragrunt plan

Carefully review all the resources terraform is going to create, modify or destroy. Check that there is no unexpected cluster destroy-and-recreate, which leads to in-cluster data loss. When you are satisfied with all the proposed changes, apply them. This may take some time, depending on the changes.

cd "$PROJECT_ROOT/infra/live/demo/infra"
terragrunt apply -auto-approve
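To make destructive changes easier to spot in a long plan, you can filter the plan text for the resource header lines that announce a destroy or a replacement. The sketch below assumes the plan wording used by Terraform 0.12 and later ("must be replaced", "will be destroyed"); other versions may word this differently, so treat it as an aid and not a substitute for reading the plan. `flag_destructive_changes` is a hypothetical helper name.

```shell
# flag_destructive_changes: reads plan text on stdin and prints the resource
# header lines that announce a destroy or a destroy-and-recreate.
# Exit status is 0 if at least one such line was found, 1 otherwise.
flag_destructive_changes() {
  grep -E '^[[:space:]]*# .* (must be replaced|will be destroyed)'
}
```

For example: `terragrunt plan -no-color | flag_destructive_changes`. The `-no-color` flag keeps terminal color codes out of the piped text.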

Troubleshooting

We assume work is performed in the dedicated demo environment where data loss is acceptable. Do not use these instructions in a shared environment or production GKE cluster!

GCP services/GKE provision issues

You need "Owner" permissions in your demo project to proceed without permission issues.

If you hit issues while creating the GKE node pool, check that the machine type is supported in the selected zones.

Terragrunt output may be helpful too.

Provision basic in-k8s such as helm via terragrunt

Some values in $PROJECT_ROOT/infra/live/demo/in-k8s/terragrunt.hcl must be the same as in $PROJECT_ROOT/infra/live/demo/infra/terragrunt.hcl:

K8S_CONTEXT -> K8S_CONTEXT
GKE_NODE_LOCATIONS -> K8S_REGIONAL_DISK_LOCATIONS 

Note, however, that K8S_REGIONAL_DISK_LOCATIONS cannot hold more than 2 zones.

Cryptonodes deploy issues

Here is the list of docs that may be helpful:

If you hit a problem, good starting points are kubectl get and kubectl describe for pods, services and PVCs, for example:

kubectl --context $KUBE_CONTEXT get namespaces
kubectl --context $KUBE_CONTEXT --namespace demo-btc-1 get pod
kubectl --context $KUBE_CONTEXT --namespace demo-btc-1 describe pod demo-btc-1-bitcoind-0
kubectl --context $KUBE_CONTEXT --namespace demo-btc-1 get svc
kubectl --context $KUBE_CONTEXT --namespace demo-btc-1 describe svc demo-btc1-service
kubectl --context $KUBE_CONTEXT --namespace demo-btc-1 get pvc
kubectl --context $KUBE_CONTEXT --namespace demo-btc-1 describe pvc bitcoind-pvc-demo-btc-1-bitcoind-0

The describe output usually helps to identify the source of the issue.

Common issues are insufficient resource quotas, such as IP address and SSD quotas. You should see corresponding messages in the output of the kubectl ... describe ... commands above. In this case, visit the GCP console to send a quota increase request. Another option is to decrease the requested resources; see the readme for how to decrease, for example, disk resources.
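To pick quota-related messages out of long describe or events output, a simple filter may help. This is a sketch that assumes the relevant messages contain the word "quota" in some letter case; the exact wording of GCP messages is not guaranteed. `find_quota_messages` is a hypothetical helper name.

```shell
# find_quota_messages: reads kubectl output on stdin and prints only the
# lines that mention a quota (case-insensitive match).
find_quota_messages() {
  grep -i 'quota'
}
```

For example: `kubectl --context "$KUBE_CONTEXT" --namespace demo-btc-1 get events | find_quota_messages`.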