Operator to Manage Let's Encrypt certificates for OpenShift Clusters

c-e-brumm/certman-operator

 
 

certman-operator

About

The Certman Operator is used to automate the provisioning and management of TLS certificates from Let's Encrypt for OpenShift Dedicated clusters provisioned via https://cloud.redhat.com/.

At a high level, Certman Operator is responsible for:

  • Provisioning Certificates after a cluster's successful installation.
  • Reissuing Certificates prior to their expiry.
  • Revoking Certificates upon cluster decommissioning.

Dependencies

Go: 1.13

Operator-SDK: 0.16.0

Hive: v1

Certman Operator is currently dependent on Hive. Hive is an API-driven OpenShift operator providing OpenShift Dedicated cluster provisioning and management.

Specifically, Hive provides a namespace-scoped CustomResourceDefinition called ClusterDeployment. Certman watches the Installed field in the spec of each ClusterDeployment instance and attempts to provision certificates for the cluster once this field becomes true. Hive is also responsible for deploying the certificates to the cluster via SyncSets.
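
As a rough illustration of that gating logic, the sketch below uses trimmed-down, hypothetical stand-ins for the two resources rather than the real Hive and Certman Go types. The `reconcile` function and the choice to reuse the cluster's name for the request are assumptions made for the example only.

```go
package main

import "fmt"

// ClusterDeployment is a hypothetical, trimmed-down stand-in for Hive's
// ClusterDeployment resource; only the fields used here are modelled.
type ClusterDeployment struct {
	Name      string
	Namespace string
	Installed bool
}

// CertificateRequest is a hypothetical stand-in for Certman's CRD.
type CertificateRequest struct {
	Name      string
	Namespace string
}

// reconcile returns the CertificateRequest that should exist for cd, or nil
// while the cluster has not finished installing.
func reconcile(cd ClusterDeployment) *CertificateRequest {
	if !cd.Installed {
		return nil
	}
	// Assumption: the request reuses the cluster's name and namespace.
	return &CertificateRequest{Name: cd.Name, Namespace: cd.Namespace}
}

func main() {
	clusters := []ClusterDeployment{
		{Name: "cluster-a", Namespace: "ns-a", Installed: false},
		{Name: "cluster-b", Namespace: "ns-b", Installed: true},
	}
	for _, cd := range clusters {
		if cr := reconcile(cd); cr != nil {
			fmt.Printf("create CertificateRequest %s/%s\n", cr.Namespace, cr.Name)
		} else {
			fmt.Printf("skip %s/%s: not installed yet\n", cd.Namespace, cd.Name)
		}
	}
}
```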

Only Hive v1 will work with this release.

How the Certman Operator works

  1. A new OpenShift Dedicated cluster is requested from https://cloud.redhat.com.
  2. The clusterdeployment controller's Reconcile function watches the Installed field of the ClusterDeployment CRD (as explained above). Once the Installed field becomes true, a CertificateRequest resource is created for that cluster.
  3. Certman operator will then request new certificates from Let’s Encrypt based on the populated spec fields of the CertificateRequest CRD.
  4. To prove ownership of the domain, Certman will attempt to answer the Let’s Encrypt DNS-01 challenge by publishing the _acme-challenge subdomain in the cluster’s DNS zone with a TTL of 1 minute.
  5. Certman waits for the record to propagate and then verifies the existence of the challenge subdomain using Cloudflare's DNS over HTTPS service. Certman will retry verification up to 5 times before returning an error.
  6. Once the challenge subdomain record has been verified, Let’s Encrypt can verify that you are in control of the domain’s DNS.
  7. Let’s Encrypt will issue certificates once the challenge has been successfully completed. Certman will then delete the challenge subdomain as it is no longer required.
  8. Certificates are then stored in a secret on the management cluster. Hive watches for this secret.
  9. Once the secret contains valid certificates for the cluster, Hive will sync the secrets over to the OpenShift Dedicated cluster using a SyncSet.
  10. Certman operator will reconcile all CertificateRequests every 10 minutes by default. During this reconciliation loop, Certman checks the validity of the existing certificates. When a certificate is within 45 days of expiry, it is reissued and the secret is updated. Reissuing certificates this early avoids certificate expiry email notifications from Let’s Encrypt.
  11. Updates to secrets on certificate reissuance will trigger the Hive controller’s reconciliation loop, which will force a sync of the new secret to the OpenShift Dedicated cluster. OpenShift will detect that the secret has changed and will apply the new certificates to the cluster.
  12. When an OpenShift Dedicated cluster is decommissioned, all valid certificates are first revoked and then the secret is deleted on the management cluster. Hive will then continue deleting the other cluster resources.

Limitations

  • As described above in dependencies, Certman Operator requires Hive for custom resources and actual deployment of certificates. It is therefore not a suitable "out-of-the-box" solution for Let's Encrypt certificate management. For this, we recommend using either openshift-acme or cert-manager. Certman Operator is ideal for use cases where a large number of OpenShift clusters have to be managed centrally.
  • Certman Operator currently only supports DNS Challenges through AWS Route53. There are plans for GCP support. HTTP Challenges are not supported.
  • Certman Operator does not support creation of Let's Encrypt accounts at this time. You must already have a Let's Encrypt account and keys that you can provide to the Certman Operator.
  • Certman Operator does NOT configure the TLS certificates in an OpenShift cluster. This is managed by Hive using SyncSet.

CustomResourceDefinitions

The Certman Operator relies on the following custom resource definitions (CRDs):

  • CertificateRequest, which provides the details needed to request a certificate from Let's Encrypt.

  • ClusterDeployment, which defines a targeted OpenShift managed cluster. The Operator ensures at all times that the OpenShift managed cluster has valid certificates for the control plane and pre-defined external routes.

Setup Certman Operator

For local development, you can use either minishift or minikube to develop and run the operator. You will also need to install the operator-sdk.

Local development testing

The script hack/test/local_test.sh can be used to automate local testing by creating a minikube cluster and deploying certman-operator and its dependencies.

Certman Operator Configuration

A ConfigMap is used to store certman operator configuration. The ConfigMap contains one value, default_notification_email_address, the email address to which Let's Encrypt certificate expiry notifications should be sent.

oc create configmap certman-operator \
    --from-literal=default_notification_email_address=foo@bar.com

Certman Operator Secrets

A Secret is used to store the Let's Encrypt account URL and keys.

oc create secret generic lets-encrypt-account-production \
    --from-file=private-key=production-private-key.pem \
    --from-file=account-url=production-account.txt
oc create secret generic lets-encrypt-account-staging \
    --from-file=private-key=staging-private-key.pem \
    --from-file=account-url=staging-account.txt

Custom Resource Definitions (CRDs)

Create Hive CRDs

git clone git@github.com:openshift/hive.git
oc create -f hive/config/crds

Create Certman Operator CRDs

oc create -f https://raw.githubusercontent.com/openshift/certman-operator/master/deploy/crds/certman.managed.openshift.io_certificaterequests_crd.yaml

Run Operator From Source

operator-sdk run --local

Build Operator Image

docker login quay.io

operator-sdk build quay.io/tparikh/certman-operator

docker push quay.io/tparikh/certman-operator

Setup & Deploy Operator On OpenShift/Kubernetes Cluster

Create & Use OpenShift Project

oc new-project certman-operator
oc label namespace certman-operator release=monitoring

Setup Service Account

oc create -f deploy/service_account.yaml

Setup RBAC

oc create -f deploy/role.yaml
oc create -f deploy/role_binding.yaml

Deploy the Operator

Edit deploy/operator.yaml, substituting the reference to the image you built above. Then deploy it:

oc create -f deploy/operator.yaml

Metrics

certman_operator_certs_in_last_day_openshift_com reports how many certs have been issued for openshift.com in the last 24 hours.

certman_operator_certs_in_last_day_openshift_apps_com reports how many certs have been issued for openshiftapps.com in the last 24 hours.

certman_operator_certs_in_last_week_openshift_com reports how many certs have been issued for openshift.com in the last 7 days.

certman_operator_certs_in_last_week_openshift_apps_com reports how many certs have been issued for openshiftapps.com in the last 7 days.

certman_operator_duplicate_certs_in_last_week reports how many certs have had duplication issues.

certman_operator_certificate_valid_duration_days reports how many days remain before a certificate expires.

Additional record for control plane certificate

Certman Operator always creates a certificate for the control plane of the clusters Hive builds. By passing a string into the pod as an environment variable named EXTRA_RECORD, Certman Operator can add an additional record to the SAN of the certificate for the API servers. This string should be the short hostname without the domain. The new record will use the same domain as the rest of the cluster. Example:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: certman-operator
spec:
  template:
    spec:
    ...
      env:
      - name: EXTRA_RECORD
        value: "myapi"

The example will add myapi.<clustername>.<clusterdomain> to the certificate of the control plane.
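
The name construction described above can be sketched in Go as follows. This is an illustrative sketch only: `extraSAN` is a hypothetical helper, and the cluster name and domain in `main` are placeholder values, not something the operator defines.

```go
package main

import (
	"fmt"
	"os"
)

// extraSAN builds the additional SAN entry from the short hostname in
// extraRecord plus the cluster's name and domain. The boolean is false
// when extraRecord is empty, meaning no extra record is requested.
func extraSAN(extraRecord, clusterName, clusterDomain string) (string, bool) {
	if extraRecord == "" {
		return "", false
	}
	return fmt.Sprintf("%s.%s.%s", extraRecord, clusterName, clusterDomain), true
}

func main() {
	// "mycluster" and "example.com" are placeholder values for illustration.
	if san, ok := extraSAN(os.Getenv("EXTRA_RECORD"), "mycluster", "example.com"); ok {
		fmt.Println("extra SAN:", san)
	} else {
		fmt.Println("EXTRA_RECORD not set; no extra SAN added")
	}
}
```

With EXTRA_RECORD set to "myapi", this sketch yields myapi.mycluster.example.com for the placeholder cluster.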

License

Certman Operator is licensed under the Apache 2.0 license. See the LICENSE file for details.
