Read this guide if you want to debug the operator, fix bugs or contribute new features and tests.
Postgres Operator is written in Go. Use the installation instructions if you don't have Go on your system. You won't be able to compile the operator with Go older than 1.7. We recommend installing the latest one.
Go projects expect their source code and all the dependencies to be located under the GOPATH. Normally, one would create a directory for the GOPATH (i.e. ~/go) and place the source code under the ~/go/src sub directories.
Given the schema above, the Postgres Operator source code located at
github.com/zalando/postgres-operator
should be put at
-~/go/src/github.com/zalando/postgres-operator
.
export GOPATH=~/go
mkdir -p ${GOPATH}/src/github.com/zalando/
cd ${GOPATH}/src/github.com/zalando/
git clone https://github.com/zalando/postgres-operator.git
You need Glide to fetch all dependencies. Install it with:
make tools
Next, install dependencies with glide by issuing:
make deps
This would take a while to complete. You have to redo make deps
every time
you dependencies list changes, i.e. after adding a new library dependency.
Build the operator with the make docker
command. You may define the TAG
variable to assign an explicit tag to your docker image and the IMAGE to set
the image name. By default, the tag is computed with
git describe --tags --always --dirty
and the image is
registry.opensource.zalan.do/acid/postgres-operator
export TAG=$(git describe --tags --always --dirty)
make docker
Building the operator binary (for testing the out-of-cluster option):
make
The binary will be placed into the build directory.
The fastest way to run and test your docker image locally is to reuse the docker
from minikube or use the
load docker-image
from kind. The following steps
will get you the docker image built and deployed.
# minikube
eval $(minikube docker-env)
make docker
# kind
make docker
kind load docker-image <image> --name <kind-cluster-name>
Then create a new Postgres Operator deployment. You can reuse the provided manifest but replace the version and tag. Don't forget to also apply configuration and RBAC manifests first, e.g.:
kubectl create -f manifests/configmap.yaml
kubectl create -f manifests/operator-service-account-rbac.yaml
sed -e "s/\(image\:.*\:\).*$/\1$TAG/" manifests/postgres-operator.yaml | kubectl create -f -
# check if the operator is coming up
kubectl get pod -l name=postgres-operator
The operator employs K8s-provided code generation to obtain deep copy methods
and K8s-like APIs for its custom resource definitions, namely the
Postgres CRD and the operator CRD. The usage of the code generation follows
conventions from the k8s community. Relevant scripts live in the hack
directory:
update-codegen.sh
triggers code generation for the APIs defined inpkg/apis/acid.zalan.do/
,verify-codegen.sh
checks if the generated code is up-to-date (to be used within CI).
The /pkg/generated/
contains the resultant code. To make these scripts work,
you may need to export GOPATH=$(go env GOPATH)
References for code generation are:
- Relevant pull request See comments there for minor issues that can sometimes broke the generation process.
- Code generator source code
- Code Generation for CustomResources - intro post on the topic
- Code generation in Prometheus and etcd operators
To debug the generated API locally, use the
kubectl proxy
and kubectl --v=8
log level to display contents of HTTP requests (run the
operator itself with --v=8
to log all REST API requests). To attach a debugger
to the operator, use the -outofcluster
option to run the operator locally on
the developer's laptop (and not in a docker container).
There is a web interface in the operator to observe its internal state. The
operator listens on port 8080. It is possible to expose it to the
localhost:8080
by doing:
kubectl --context minikube port-forward $(kubectl --context minikube get pod -l name=postgres-operator -o jsonpath={.items..metadata.name}) 8080:8080
The inner query gets the name of the Postgres Operator pod, and the outer one enables port forwarding. Afterwards, you can access the operator API with:
curl --location http://127.0.0.1:8080/$endpoint | jq .
The available endpoints are listed below. Note that the worker ID is an integer from 0 up to 'workers' - 1 (value configured in the operator configuration and defaults to 4)
- /databases - all databases per cluster
- /workers/all/queue - state of the workers queue (cluster events to process)
- /workers/$id/queue - state of the queue for the worker $id
- /workers/$id/logs - log of the operations performed by a given worker
- /clusters/ - list of teams and clusters known to the operator
- /clusters/$team - list of clusters for the given team
- /clusters/$team/$namespace/$clustername - detailed status of the cluster, including the specifications for CRD, master and replica services, endpoints and statefulsets, as well as any errors and the worker that cluster is assigned to.
- /clusters/$team/$namespace/$clustername/logs/ - logs of all operations performed to the cluster so far.
- /clusters/$team/$namespace/$clustername/history/ - history of cluster changes triggered by the changes of the manifest (shows the somewhat obscure diff and what exactly has triggered the change)
The operator also supports pprof endpoints listed at the pprof package, such as:
- /debug/pprof/
- /debug/pprof/cmdline
- /debug/pprof/profile
- /debug/pprof/symbol
- /debug/pprof/trace
It's possible to attach a debugger to troubleshoot postgres-operator inside a docker container. It's possible with gdb and delve. Since the latter one is a specialized debugger for Go, we will use it as an example. To use it you need:
- Install delve locally
go get -u github.com/derekparker/delve/cmd/dlv
- Add following dependencies to the
Dockerfile
RUN apk --no-cache add go git musl-dev
RUN go get github.com/derekparker/delve/cmd/dlv
- Update the
Makefile
to build the project with debugging symbols. For that you need to addgcflags
to a build target for corresponding OS (e.g. GNU/Linux)
-gcflags "-N -l"
- Run
postgres-operator
under the delve. For that you need to replaceENTRYPOINT
with the followingCMD
:
CMD ["/root/go/bin/dlv", "--listen=:DLV_PORT", "--headless=true", "--api-version=2", "exec", "/postgres-operator"]
- Forward the listening port
kubectl port-forward POD_NAME DLV_PORT:DLV_PORT
- Attach to it
dlv connect 127.0.0.1:DLV_PORT
To run all unit tests, you can simply do:
go test ./...
For go 1.9 vendor
directory would be excluded automatically. For previous
versions you can exclude it manually:
go test $(glide novendor)
In case if you need to debug your unit test, it's possible to use delve:
dlv test ./pkg/util/retryutil/
Type 'help' for list of commands.
(dlv) c
PASS
To test the multi-namespace setup, you can use
./run_operator_locally.sh --rebuild-operator
It will automatically create an acid-minimal-cluster
in the namespace test
.
Then you can for example check the Patroni logs:
kubectl logs acid-minimal-cluster-0
The operator provides reference e2e (end-to-end) tests to ensure various infra
parts work smoothly together. Each e2e execution tests a Postgres Operator image
built from the current git branch. The test runner starts a kind
(local k8s) cluster and Docker container with tests. The k8s API client from
within the container connects to the kind
cluster using the standard Docker
bridge
network. The tests utilize examples from /manifests
(ConfigMap is
used for the operator configuration) to avoid maintaining yet another set of
configuration files. The kind cluster is deleted if tests complete successfully.
End-to-end tests are executed automatically during builds:
# invoke them from the project's top directory
make e2e-run
# install kind and build test image before first run
make e2e-tools e2e-build
End-to-end tests are written in Python and use flake8
for code quality.
Please run flake8 before submitting a PR.
In the case you want to add functionality to the operator that shall be controlled via the operator configuration there are a few places that need to be updated. As explained here, it's possible to configure the operator either with a ConfigMap or CRD, but currently we aim to synchronize parameters everywhere.
When choosing a parameter name for a new option in a Postgres cluster manifest,
keep in mind the naming conventions there. We use camelCase
for manifest
parameters (with exceptions for certain Patroni/Postgres options) and
snake_case
variables in the configuration. Only introduce new manifest
variables if you feel a per-cluster configuration is necessary.
Note: If one option is defined in the operator configuration and in the cluster manifest, the latter takes precedence.
Update the following Go files that obtain the configuration parameter from the manifest files:
Postgres manifest parameters are defined in the api package. The operator behavior has to be implemented at least in k8sres.go. Please, reflect your changes in tests, for example in:
For the CRD-based configuration, please update the following files:
- the default OperatorConfiguration
- the Helm chart's values-crd file
Reflect the changes in the ConfigMap configuration as well (note that numeric and boolean parameters have to use double quotes here):
- ConfigMap manifest
- the Helm chart's default values file
Finally, add a section for each new configuration option and/or cluster manifest parameter in the reference documents:
It also helps users to explain new features with examples in the administrator docs.