- certman-operator
- About
- Dependencies
- How the Certman Operator works
- Limitations
- CustomResourceDefinitions
- Setup Certman Operator
- Metrics
- Additional record for control plane certificate
- License
The Certman Operator is used to automate the provisioning and management of TLS certificates from Let's Encrypt for OpenShift Dedicated clusters provisioned via https://cloud.redhat.com/.
At a high level, Certman Operator is responsible for:
- Provisioning Certificates after a cluster's successful installation.
- Reissuing Certificates prior to their expiry.
- Revoking Certificates upon cluster decommissioning.
Go: 1.19
Operator-SDK: 1.21.0
Hive: v1
Certman Operator is currently dependent on Hive. Hive is an API-driven OpenShift operator providing OpenShift Dedicated cluster provisioning and management.
Specifically, Hive provides a namespace scoped CustomResourceDefinition called ClusterDeployment. Certman watches the Installed spec field of instances of that CRD and will attempt to provision certificates for the cluster once this field returns true. Hive is also responsible for the deployment of the certificates to the cluster via SyncSets.
Only Hive v1 will work with this release.
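The gating on the Installed field described above can be illustrated with a minimal Go sketch. The clusterDeployment struct and the needsCertificateRequest helper are hypothetical stand-ins, not Hive's or Certman's actual types, and the check for an already existing request is an assumption added for illustration.

```go
package main

import "fmt"

// clusterDeployment is a hypothetical, trimmed-down stand-in for Hive's
// ClusterDeployment; only the field discussed above is modeled.
type clusterDeployment struct {
	Name      string
	Installed bool
}

// needsCertificateRequest reports whether certificates should be requested:
// the cluster is installed and (an assumption for this sketch) no request
// has been recorded for it yet.
func needsCertificateRequest(cd clusterDeployment, existing map[string]bool) bool {
	return cd.Installed && !existing[cd.Name]
}

func main() {
	existing := map[string]bool{"cluster-b": true}
	for _, cd := range []clusterDeployment{
		{Name: "cluster-a", Installed: true},
		{Name: "cluster-b", Installed: true},
		{Name: "cluster-c", Installed: false},
	} {
		fmt.Printf("%s: %v\n", cd.Name, needsCertificateRequest(cd, existing))
	}
}
```

Only cluster-a qualifies here: cluster-b already has a request and cluster-c is not installed yet.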
- A new OpenShift Dedicated cluster is requested from https://cloud.redhat.com.
- The clusterdeployment controller's Reconcile function watches the Installed field of the ClusterDeployment CRD (as explained above). Once the Installed field becomes true, a CertificateRequest resource is created for that cluster.
- Certman operator will then request new certificates from Let’s Encrypt based on the populated spec fields of the CertificateRequest CRD.
- To prove ownership of the domain, Certman will attempt to answer the Let’s Encrypt DNS-01 challenge by publishing the _acme-challenge subdomain in the cluster’s DNS zone with a TTL of 1 minute.
- Certman waits for the record to propagate and then verifies the existence of the challenge subdomain using Cloudflare's DNS over HTTPS service. Certman will retry verification up to 5 times before erroring.
- Once the challenge subdomain record has been verified, Let’s Encrypt can verify that you are in control of the domain’s DNS.
- Let’s Encrypt will issue certificates once the challenge has been successfully completed. Certman will then delete the challenge subdomain as it is no longer required.
- Certificates are then stored in a secret on the management cluster. Hive watches for this secret.
- Once the secret contains valid certificates for the cluster, Hive will sync the secrets over to the OpenShift Dedicated cluster using a SyncSet.
- Certman operator will reconcile all CertificateRequests every 10 minutes by default. During this reconciliation loop, Certman checks the validity of the existing certificates. Once a certificate is within 45 days of expiry, it is reissued and the secret is updated. Reissuing certificates this early avoids getting email notifications about certificate expiry from Let’s Encrypt.
- Updates to secrets on certificate reissuance will trigger the Hive controller’s reconciliation loop, which will force a syncset of the new secret to the OpenShift Dedicated cluster. OpenShift will detect that the secret has changed and will apply the new certificates to the cluster.
- When an OpenShift Dedicated cluster is decommissioned, all valid certificates are first revoked and then the secret is deleted on the management cluster. Hive will then continue deleting the other cluster resources.
- As described above in dependencies, Certman Operator requires Hive for custom resources and actual deployment of certificates. It is therefore not a suitable "out-of-the-box" solution for Let's Encrypt certificate management. For this, we recommend using either openshift-acme or cert-manager. Certman Operator is ideal for use cases when a large number of OpenShift clusters have to be managed centrally.
- Certman Operator currently only supports DNS Challenges through AWS Route53. There are plans for GCP support. HTTP Challenges are not supported.
- Certman Operator does not support creation of Let's Encrypt accounts at this time. You must already have a Let's Encrypt account and keys that you can provide to the Certman Operator.
- Certman Operator does NOT configure the TLS certificates in an OpenShift cluster. This is managed by Hive using SyncSet.
The Certman Operator relies on the following custom resource definitions (CRDs):
- CertificateRequest, which provides the details needed to request a certificate from Let's Encrypt.
- ClusterDeployment, which defines a targeted OpenShift managed cluster. The Operator ensures at all times that the OpenShift managed cluster has valid certificates for control plane and pre-defined external routes.
For local development, you can use either minishift or minikube to develop and run the operator. You will also need to install the operator-sdk.
The script hack/test/local_test.sh can be used to automate local testing by creating a minikube cluster and deploying certman-operator and its dependencies.
A ConfigMap is used to store certman operator configuration. The ConfigMap contains one value, default_notification_email_address, the email address to which Let's Encrypt certificate expiry notifications should be sent.
oc create configmap certman-operator \
--from-literal=default_notification_email_address=foo@bar.com
There are two secrets required for certman-operator to function.
lets-encrypt-account - This secret is used to store the Let's Encrypt account URL and keys.
# To fetch the "lets-encrypt-account" secret for a cluster on the Hive shard.
oc -n certman-operator get secret lets-encrypt-account -oyaml
For testing purposes:
# On the staging cluster:
oc -n certman-operator create secret generic lets-encrypt-account \
--from-file=private-key=private-key.pem \
--from-file=account-url=account.txt
aws or gcp - Depending on which platform is being used (AWS or GCP), this secret contains the cloud platform credentials for the target cluster's account.
# To fetch the "aws" secret for a cluster on the Hive shard.
NAMESPACE=$(oc get cd -A | grep -i "$CLUSTERNAME" | awk '{ print $1 }')
oc -n "$NAMESPACE" get secret aws -oyaml
For testing purposes:
# To create the "aws" secret on staging cluster for testing.
oc -n certman-operator create secret generic aws --from-literal=aws_access_key_id=XXX \
--from-literal=aws_secret_access_key=YYYY
NOTE:
- The 'aws' secret for the AWS platform is required only for non-STS clusters. STS clusters won't have this secret.
- For testing purposes, both secrets (i.e. the lets-encrypt-account secret and the aws/gcp platform credential secret) can be found on the Hive shard of the staging cluster.
git clone git@github.com:openshift/hive.git
oc create -f hive/config/crds
oc create -f https://raw.githubusercontent.com/openshift/certman-operator/master/deploy/crds/certman.managed.openshift.io_certificaterequests.yaml
WATCH_NAMESPACE="certman-operator" OPERATOR_NAME="certman-operator" go run main.go
To build the certman-operator image, follow the documentation.
oc new-project certman-operator
oc create -f deploy/service_account.yaml
oc create -f deploy/role.yaml
oc create -f deploy/role_binding.yaml
Edit deploy/operator.yaml, substituting the reference to the image you built above. Then deploy it:
oc create -f deploy/operator.yaml
certman_operator_certs_in_last_day_openshift_com reports how many certs have been issued for Openshift.com in the last 24 hours.
certman_operator_certs_in_last_day_openshift_apps_com reports how many certs have been issued for Openshiftapps.com in the last 24 hours.
certman_operator_certs_in_last_week_openshift_com reports how many certs have been issued for Openshift.com in the last 7 days.
certman_operator_certs_in_last_week_openshift_apps_com reports how many certs have been issued for Openshiftapps.com in the last 7 days.
certman_operator_duplicate_certs_in_last_week reports how many certs have had duplication issues.
certman_operator_certificate_valid_duration_days reports how many days remain before a certificate expires.
Certman Operator always creates a certificate for the control plane for the clusters Hive builds. By passing a string into the pod as an environment variable named EXTRA_RECORD, Certman Operator can add an additional record to the SAN of the certificate for the API servers. This string should be the short hostname without the domain. The new record will use the same domain as the rest of the cluster.
Example
apiVersion: apps/v1
kind: Deployment
metadata:
  name: certman-operator
spec:
  template:
    spec:
      ...
      env:
      - name: EXTRA_RECORD
        value: "myapi"
The example will add myapi.<clustername>.<clusterdomain> to the certificate of the control plane.
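As a rough illustration of how the short hostname combines with the cluster's name and domain, here is a minimal Go sketch. The extraSAN helper and the sample values mycluster and example.com are hypothetical; how the operator actually assembles the SAN list is not shown in this document.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// extraSAN builds the additional SAN entry from a short hostname plus the
// cluster name and domain. It returns "" when no extra record is configured.
// This helper is hypothetical and only mirrors the naming pattern above.
func extraSAN(extraRecord, clusterName, clusterDomain string) string {
	extraRecord = strings.TrimSpace(extraRecord)
	if extraRecord == "" {
		return ""
	}
	return fmt.Sprintf("%s.%s.%s", extraRecord, clusterName, clusterDomain)
}

func main() {
	// EXTRA_RECORD is the environment variable described above; the cluster
	// name and domain are placeholder values for this sketch.
	record := os.Getenv("EXTRA_RECORD")
	if san := extraSAN(record, "mycluster", "example.com"); san != "" {
		fmt.Println("additional SAN:", san)
	} else {
		fmt.Println("no additional SAN configured")
	}
}
```

Running it with EXTRA_RECORD=myapi would yield myapi.mycluster.example.com for these placeholder values.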
Certman Operator is licensed under the Apache 2.0 license. See the LICENSE file for details.