This repository has been archived by the owner on Sep 7, 2023. It is now read-only.

Consul image can not be started on Kubernetes/Openshift without mounted volume #184

Open
fedinskiy opened this issue Apr 28, 2022 · 1 comment


Overview of the Issue

When the official Consul Docker image is started on Kubernetes without a mounted volume, it fails with either the su-exec: setgroups(1000): Operation not permitted error or the failed to write NodeID to disk error.

Reproduction Steps

Steps for OpenShift; the steps for Kubernetes should be similar:

  1. Log in to OpenShift
  2. Create a new project: oc new-project ts-consul
  3. Create a file consul.yml with the following content:
---
apiVersion: "v1"
kind: "List"
items:
- apiVersion: "v1"
  kind: "Service"
  metadata:
    labels:
      scenarioId: "OpenShiftConsulConfigSourceIT-1651150347701"
    name: "consul"
    namespace: "ts-consul"
  spec:
    ports:
    - name: "http"
      port: 8500
      targetPort: 8500
    selector:
      deploymentconfig: "consul"
    type: "ClusterIP"
- apiVersion: "apps.openshift.io/v1"
  kind: "DeploymentConfig"
  metadata:
    labels:
      scenarioId: "OpenShiftConsulConfigSourceIT-1651150347701"
    name: "consul"
    namespace: "ts-consul"
  spec:
    replicas: 1
    selector:
      deploymentconfig: "consul"
    template:
      metadata:
        labels:
          deploymentconfig: "consul"
          tsLogWatch: "consul"
          scenarioId: "OpenShiftConsulConfigSourceIT-1651150347701"
        namespace: "ts-consul"
      spec:
        containers:
        - image: "consul:1.11"
# Uncomment these lines for a different error:
#          env:
#          - name: "CONSUL_DISABLE_PERM_MGMT"
#            value: "yes"
          imagePullPolicy: "IfNotPresent"
          name: "consul"
          ports:
          - containerPort: 8500
            name: "http"
            protocol: "TCP"
    triggers:
    - type: "ConfigChange"
  4. Deploy the container: oc apply -f consul.yml -n ts-consul
  5. Start the container: oc scale dc/consul --replicas=1 -n ts-consul
  6. Wait for several seconds and check the status: oc status -n ts-consul
Errors:
  * pod/consul-1-8lmhh is crash-looping
  7. Check the pod logs: oc logs pod/consul-1-8lmhh (replace with the id of your pod)
su-exec: setgroups(1000): Operation not permitted
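
The log check in the last step can be scripted so that the pod id does not have to be copied by hand. The following is a minimal sketch, assuming a logged-in oc client and the deploymentconfig=consul label from the manifest above; the function names consul_pod and consul_logs are hypothetical helpers, not part of any tool.

```shell
# Look up the pod created by the DeploymentConfig via its label
# (label value and namespace are taken from consul.yml above).
consul_pod() {
  oc get pods -n ts-consul -l deploymentconfig=consul -o name | head -n 1
}

# Print the logs of the previous (crashed) container instance if there is one,
# otherwise fall back to the logs of the current instance.
consul_logs() {
  pod="$(consul_pod)"
  oc logs -n ts-consul --previous "$pod" 2>/dev/null || oc logs -n ts-consul "$pod"
}
```

Calling consul_logs after step 6 should print the same su-exec error as the manual oc logs command, provided the pod has already been created.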

Alternative solution

We can follow the solution implemented in #103 and add the CONSUL_DISABLE_PERM_MGMT environment variable. Unfortunately, this just leads to a different error:

 failed to setup node ID: failed to write NodeID to disk: open /consul/data/node-id: permission denied

Operating system and Environment details

OS: Linux 5.16.20-200.fc35.x86_64
OpenShift:

# Client
oc v3.11.420
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server 
kubernetes v1.23.5+9ce5071

Additional info

A similar error has been described several times before: [1] (the suggested solution is to use a custom Docker image), [2] (added the CONSUL_DISABLE_PERM_MGMT environment variable, which does not help in this case; see the "Alternative solution" section), and [3] (the recommended solution is to check "mount parameters"). However, the current solution requires mounting a volume, which is overkill in some cases (e.g. training or integration testing). Using the bitnami/consul image can be considered a workaround, but it comes with its own challenges [4], so it is preferable to have this issue solved for the official image.

[1] hashicorp/consul#4172
[2] #103
[3] hashicorp/consul#10403
[4] bitnami-labs/sealed-secrets#822
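
For reference, the volume-mounting approach mentioned above might look roughly like the sketch below. It is untested and rests on assumptions: that an emptyDir volume mounted at /consul/data is writable by the container user, that CONSUL_DISABLE_PERM_MGMT is still needed to skip the su-exec step, and that the volume name consul-data and the function name apply_emptydir_workaround are arbitrary choices made for illustration.

```shell
# Sketch of the volume-based workaround (assumptions: emptyDir is writable by
# the arbitrary UID OpenShift assigns; "consul-data" is a made-up volume name).
apply_emptydir_workaround() {
  ns="ts-consul"
  # Skip the ownership/su-exec handling in the image entrypoint.
  oc set env dc/consul CONSUL_DISABLE_PERM_MGMT=yes -n "$ns" &&
  # Mount a throwaway emptyDir volume over the Consul data directory.
  oc set volume dc/consul --add --name=consul-data --type=emptyDir \
    --mount-path=/consul/data -n "$ns"
}
```

Running apply_emptydir_workaround against the project from the reproduction steps would change the DeploymentConfig in place; because the manifest has a ConfigChange trigger, each change should start a new rollout.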
