status | title | creation-date | last-updated | authors | ||
---|---|---|---|---|---|---|
implemented |
Remote Resource Resolution |
2021-03-23 |
2022-10-24 |
|
- Summary
- Key Terms
- Motivation
- Requirements
- Proposal
- Design Details
- Test Plan
- Design Evaluation
- Drawbacks
- Alternatives
- Do Nothing Specific to Tekton Pipelines
- Just Implement Git Support Directly in Pipelines
- Controller makes HTTP request to resolver to get resources
- Custom Tasks
- A CRD That Wraps Tekton's Existing Types
- Use a
pending
Status on PipelineRuns/TaskRuns With Embedded Resolution Info - Use an Admission Controller to Perform Resolution
- Sync Task and Pipeline objects directly into the cluster
- Implementation Pull Requests
- Future Extensions
- References
- Appendix A: Proof-of-Concept Controller
This TEP describes problems fetching and running Tasks and Pipelines ("Tekton resources") from remote locations. Examples of these remote locations include image registries, git repositories, company-internal catalogs, object storage, other namespaces or clusters.
This proposal advocates for treating Task and Pipeline resolution as an interface with several default sources out-of-the-box (those we already support today: in-cluster and Tekton Bundles). This would allow anyone to setup resolvers with a direct hook into Tekton's state-machine against a well-formed and reliable "API" without requiring Tekton to take hard dependencies on any one "medium".
A secondary benefit of a shared interface and default implementations is that these will provide working examples that other developers can use as the basis for their own resolvers.
This section defines some key terms used throughout the rest of this doc.
Remote
: Any location that stores Tekton resources. These could be: OCI registries, git repos and other version control systems, other namespaces, other clusters, object storage, an organization's proprietary internal catalog, etc...Remote Resource
: The YAML file or other representation of a Tekton resource that lives in the Remote.Resolver
: A program or piece of code that knows how to interpret a reference to a remote resource and fetch it for execution as part of the user's TaskRun or Pipeline.Resource Resolution
: The act of taking a reference to a Tekton resource in a Remote, fetching it, and returning its content to the Tekton Pipelines reconcilers.
Pipelines currently provides support for resources to be run from two different
locations: a Task or Pipeline can be run from the same cluster or (when the
enable-tekton-oci-bundles
flag
is "true"
) from an image registry hosting Tekton
Bundles.
This existing support has a few problems:
- For resources in a cluster the degree of control afforded to operators is
extremely narrow: Either a resource exists in the cluster or it does not. To
add new Tasks and Pipelines operators have to manually (or via automation)
sync them in (
kubectl apply -f git-clone.yaml
). - For resources in a Tekton Bundle registry, operators don't have a choice over what kind of storage best suits their teams: It's either a registry or it's back to manually installing Tasks and Pipelines. This might not be compatible with an org's chosen storage. Put another way: why can't we just keep our Tasks in git?
- Pipeline authors have to document the Tasks their Pipeline depends on
out-of-band. A common example of
this is a build
Pipeline that needs
git-clone
to be installed from the open source catalog before the Pipeline can be run. TEP-0053 is also exploring ways of encoding this information as part of pipelines in the catalog. - Pipelines' resolver code is synchronous: a reconciler thread is blocked while a resource is being fetched.
- Pipeline's existing support for cluster and bundle resources is not extensible.
- Operators of Tekton Pipelines cannot selectively enable/disable sources of
tasks and pipelines. They cannot, for example, switch off the ability for
Tekton Pipelines to use
Tasks
from the cluster. Right now the only control is whether to enable Bundles support since it's guarded behind feature flags.
- Aspirationally, never require a user to run
kubectl apply -f git-clone.yaml
ever again. - Provide the ability to manage Tekton resources in an org's source of choice
via any process of their choice -
git push
,tkn bundle push
,gsutil cp
and so on, instead ofkubectl apply
. - Allow multiple remotes to be supported in a single pipeline: task A from OCI, task B from the Catalog, task C directly from the cluster, and so on.
- Allow the open source Pipelines project (and any downstream project) to choose which remotes are enabled out of the box with a new release.
- Allow operators to add new remote sources without having to redeploy the entirety of Tekton Pipelines.
- Allow operators to completely remove the code paths that fetch Tekton resources from remotes they don't want to support.
- Allow operators to control the threshold at which fetching a remote Tekton resource is considered timed out.
- Emit Events and Conditions operators can use to assess whether fetching remote resources is slow or failing.
- Establish a common syntax that tool and platform creators can use to record, as part of a pipeline or run, the remote location that Tekton resources should be fetched from.
- Integrate mechanisms to verify remote tasks and pipelines before they are executed by Pipelines' reconcilers. See TEP-0091 on this.
Sharing a git repo of tasks: An organization decides to manage their internal tasks and pipelines in a shared git repository. The CI/CD cluster is set up to allow these resources to be referenced directly by TaskRuns and PipelineRuns in the cluster.
Using non-git VCS for pipelines-as-code: A game dev keeps all of their Tasks and Pipelines, config-as-code style, in a Perforce repo alongside the assets and source that make up their product. Pipelines does not provide support for resolving Tekton resources from Perforce repositories out-of-the-box. The dev creates a new Resolver by following a template and docs published as part of the Tekton Pipelines project. The dev's Resolver is able to fetch Tekton resources from their Perforce repo and return them to the Pipelines reconciler.
Platform interoperability: A platform builder accepts workflows imported from other Tekton-compliant platforms. When a Pipeline is imported that includes a reference to a Task in a Remote that the platform doesn't already have it goes out and fetches that Task, importing it alongside the pipeline without the user having to explicitly provide it.
Providing an audited catalog of tasks and pipelines: A company requires their CI/CD Tasks to be audited to check that they adhere to specific compliance rules before the Tasks can be used in their app teams' Pipelines. The company hosts a private internal catalog and requires all the app teams to build their Pipelines using only that catalog. To make this easier all of the CI/CD clusters in the company are configured with a single remote resource resolver - a CatalogRef resolver - that points to the internal catalog.
Replacing ClusterTasks and introducing ClusterPipelines: Tekton Pipelines
has an existing CRD called ClusterTask which is a cluster-scoped Task. The
idea is that these can be shared across many namespaces / tenants. If Pipelines
drops support for ClusterTask
it could be replaced with a Resolver that
has access to a private namespace of shared Tasks. Any user could request
tasks from that Resolver and get access to those shared resources. Similarly
for the concept of ClusterPipeline
which does not exist yet in Tekton:
a private namespace could be created full of pipelines. A Resolver can then
be given sole access to pipelines in this private namespace via RBAC and
users can leverage those shared pipelines.
- Provide a clear, well-documented interface between Pipelines' role as executor of Tekton resources and Resolvers' role fetching those Tekton resources before execution can begin.
- Include a configurable timeout for async resolution so operators can tune the allowable delay before a resource resolution is considered failed.
- Avoid blocking Pipeline's reconciler threads while fetching a remote resource is in progress.
- The implementation should be compatibile with the existing flagged Tekton Bundles support, with a view to replacing Tekton Pipeline's built-in resolvers with the new solution.
- Allow RBAC to be managed per remote source so that the permissions to fetch and return Tekton resources to a reconciler are limited to only those needed by each type of source.
- Support credentials to access a remote both per-source or per-workload, allowing operators to decide if they manage the creds to fetch Tekton resources or require their users to provide them at runtime.
- Support Tasks, Pipelines, Custom Tasks and any other resources that Tekton
Pipelines might choose to add support for (e.g. Steps?).
- Note: this does not mean that we'd add support for "any" resource type in the initial implementation. We may decide to support only fetching Tasks initially, for example. But the protocol for resolving resources should be indifferent to the content being fetched.
- At minimum we should provide resolver implementations for Tekton Bundles and
in-cluster resources. These can be provided out-of-the-box just as they are
today. The code for these resolvers can either be hosted as part of
Pipelines, in the Catalog, or in a new repo under the
tektoncd
GitHub org. - Add new support for resolving resources from git via this mechanism. This could be provided out of the box too but we don't have to since this doesn't overlap with concerns around backwards compatibility in the same way that Tekton Bundles and in-cluster support might.
- Ensure that "-as-code" flows can be supported. For example, fetching a Pipeline at a specific commit and re-writing the Pipeline so that Task references within that Pipeline also reference the specific commit as well.
Add new fields to taskRef
and pipelineRef
to set the resolver and
its parameters. Access to these new fields will be locked behind the
alpha
feature gate.
taskRef
new syntax samples:
taskRef:
resolver: bundle
resource:
image_url: gcr.io/tekton-releases/catalog/upstream/golang-fuzz:0.1
name: golang-fuzz
signer: cosign # TBD. See TEP-0091.
taskRef:
resolver: git
resource:
repository_url: https://github.com/tektoncd/catalog.git
branch: main
path: /task/golang-fuzz/0.1/golang-fuzz.yaml
pipelineRef
new syntax samples:
pipelineRef:
resolver: git
resource:
repository_url: https://github.com/sbwsg/experimental.git
branch: remote-resolution
path: /remote-resolution/pipeline.yaml
pipelineRef:
resolver: in-cluster
resource:
name: my-team-build-pipeline
Note that the fields under resource:
are validated by a resolver, not
by Tekton Pipelines. TEP-0075 (Dictionary
Params) describes
adding support for JSON object schema to Params. It would make sense to
bring in the same schema support for the resource:
field as well and
eventually resolvers might be able to publish the schema they accept so
that Pipelines can enforce it during validation of a TaskRun
or
PipelineRun
.
When the Pipelines reconciler sees a taskRef
or pipelineRef
with a
resolver
field it creates a ResolutionRequest
object. The object is
populated using fields from the taskRef
or pipelineRef
.
Once resolved the contents of the requested resource are base64-encoded
and stored in-line in the status.data
field of the ResolutionRequest
.
Metadata is set in the status.annotations
field. Examples of metadata
include the commit digest of a resolved git
resource, a bundle's
digest, or the resolved data's content-type
. The ResolutionRequest
is
marked as successfully resolved. Here is a sample of a resolved
ResolutionRequest
with some metadata fields trimmed out for brevity:
apiVersion: resolution.tekton.dev/v1alpha1
kind: ResolutionRequest
spec:
params:
repository_url: https://github.com/sbwsg/experimental.git
branch: remote-resolution
path: /remote-resolution/pipeline.yaml
status:
data: a2luZDogUGlwZWxpbmUKYXBpVmVyc2lvbjogdGVrdG9uLmRldi92MWJldGExCm1ldGFkYXRhOgogIG5hbWU6IHAKc3BlYzoKICB0YXNrczoKICAtIG5hbWU6IGZldGNoLWZyb20tZ2l0CiAgICB0YXNrU3BlYzoKICAgICAgc3RlcHM6CiAgICAgIC0gaW1hZ2U6IGFscGluZS9naXQKICAgICAgICBzY3JpcHQ6IHwKICAgICAgICAgIGdpdCBjbG9uZSBodHRwczovL2dpdGh1Yi5jb20vdGVrdG9uY2QvcmVzdWx0cwo=
annotations:
commit: 2c9b093e94f30f588dc798cc56a2559d4da9d573
content-type: application/x-yaml
conditions:
- lastTransitionTime: "2021-10-14T15:38:14Z"
status: "True"
type: Succeeded
If a ResolutionRequest
fails resolution then it is marked as failed with
a Succeeded/"False"
condition. The condition will include a
machine-readable reason
and human-readable message
explaining the
failure.
Create a new project under the tektoncd
Github org catering specifically
to the functionality described in this TEP. The intention will be to
take this project from alpha to beta and eventually to a stable,
production-ready state. The project will provide:
- The
ResolutionRequest
CRD and a controller managing the lifecycle for that CRD. - Code-generated clients & supporting types for
ResolutionRequests
. - Function and struct helpers for other projects to use. See more in the related Design Details section.
- A common interface for resolver implementors that abstracts away the underlying use of CRDs.
Webhook and controller deployments for ResolutionRequest
objects will
run in the tekton-remote-resolution
namespace by default. The
namespace is intentionally separate from tekton-pipelines
to allow
RBAC
that isolates the remote resolution machinery.
During initial development and alpha the project will build several resolvers: "in-cluster", "bundle", and "git". In time we may decide to move these to their own repo alongside community-contributed resolvers, if any are written. See the Design Details section for more on specific resolvers.
The ResolutionRequest
Lifecycle Controller will be responsible for the
following:
- Initialize
ResolutionRequest
status fields on object creation. - Observe
ResolutionRequest
objects for populatedstatus.data
field and set theirSucceeded
condition to"True"
. - Enforce global timeouts for all
ResolutionRequests
, marking them failed if the current time minus theirmetadata.creation_timestamp
is longer than the limit.
Risk: Relying on a CRD may introduce scaling problems that couldn't be
discovered during proof-of-concept testing. Task and Pipeline
definitions may only get larger until a CRD no longer provides enough
space. In busy CI/CD clusters many rapidly created large
ResolutionRequests
may cause API server or etcd performance to
degrade.
Possible Mitigation: Implement a PipelineResource
-style artifact
system where short-lived PVCs or an object store are used as ephemeral
intermediary storage. A ResolutionRequest
could store a tuple of
(status.dataRef
, status.dataDigest
) in place of in-lined
status.data
. status.dataRef
would be a pointer to the PVC or object
store and status.dataDigest
would be a low collision hash of the data
being pointed to for Pipelines to verify when it "follows the pointer"
and downloads the data.
This is only one possible mitigation and there are many others approaches we could take. The only constraints on the mitigation are that the overall goals remain met:
- the implementation can operate without blocking Pipelines' reconcile loops,
- with the same parameter format,
- with a hook into Pipelines' reconcilers to queue reconciles when a resource has been fetched or failed.
Risk: Changes to the way Remote Resolution is implemented during alpha mean that all resolvers at some point need to be updated or rewritten as well.
Possible Mitigation: To help mitigate this the alpha resolvers will be written to be agnostic about the "protocol". All interaction with clients / CRDs will be kept in shared helpers so that a rewrite only impacts that shared code.
Risk: ResolutionRequest
objects have no data integrity mechanism yet, so
a motivated actor with access to the cluster and write permissions on
ResolutionRequest
objects can modify them without detection. This
becomes a more notable concern when thinking about task verification
occurring in Resolvers, as is planned in
TEP-0091. A user with
the necessary permissions could change a ResolutionRequest
object
containing a Task after Task verification occurred.
Possible Mitigation: Tekton already has solutions undergoing design to address this problem on two fronts, and so it would make sense to integrate directly with one of them:
- TEP-0089 Non-falsifiable provenance support
where Tekton's objects (i.e. a
ResolutionRequest
) can be signed by authorized workloads (i.e. aResolutionRequest
Reconciler). - The solution under de# TEP-0086 (available to read here) which includes content addressability as a desirable property of the storage subsystem (OCI Registry being a good candidate).
- User applies a
TaskRun
with ataskRef
that points to a remote:
kind: TaskRun
metadata:
name: my-tr
spec:
taskRef:
resolver: git
resource:
repository_url: https://github.com/tektoncd/catalog.git
branch: main
path: /task/golang-fuzz/0.1/golang-fuzz.yaml
- A
ResolutionRequest
is created by theTaskRun
reconciler:
apiVersion: resolution.tekton.dev/v1alpha1
kind: ResolutionRequest
metadata:
labels:
resolution.tekton.dev/resolver: git
ownerReferences:
- apiVersion: tekton.dev/v1beta1
controller: true
kind: TaskRun
name: my-tr
uid: 6aa5857a-3d67-4a09-94a1-8e9cc136dcf8
spec:
params:
repository_url: https://github.com/tektoncd/catalog.git
branch: main
path: /task/golang-fuzz/0.1/golang-fuzz.yaml
- The
ResolutionRequest
is resolved and itsstatus.data
updated with the in-lined base64-encoded contents of thegolang-fuzz
task. The taskrun reconciler uses the spec from the retrievedgolang-fuzz
task to execute the user's submittedTaskRun
.
- Caching. LFU or LRU caches make sense in multiple spots along a
request's path: in Pipelines' reconcilers, in the
ResolutionRequest
reconciler, or in the resolvers themselves. - De-duplication. Mentioned earlier in this doc, if pipelines'
reconcilers can assert that an existing
ResolutionRequest
exactly matches the resolver and parameters of apipelineRef
ortaskRef
then instead of creating a newResolutionRequest
pipelines could instead attach an additionalownerReference
on the existingResolutionRequest
and reuse it.
The OpenAPI schema for the new fields added to taskRef
and
pipelineRef
will be as follows:
properties:
resolver:
description: Resolver names the type of resource resolution to be performed. e.g. "in-cluster", "bundle", "git"
type: string
resource:
description: Resource specifies the remote-specific parameters needed to resolve the Task or Pipeline.
type: object
x-kubernetes-preserve-unknown-fields: true
Pipelines' validation code will need to be updated to accomodate these new fields.
-
ResolutionRequests
are namespaced and must be created in the same namespace as the referencingPipelineRun
orTaskRun
. This is required to supportownerReferences
and to keep tenants' resource requests separated in multi-tenant clusters. -
The label
resolution.tekton.dev/resolver
is required onResolutionRequests
and its omission will result in the request being immediately failed. This is to allow for watches and filtering based on the resolver type. -
All labels prefixed
resolution.tekton.dev/
are ring-fenced for use in future resolution-related features. -
ownerReferences
are included pointing back at thePipelineRun
orTaskRun
that issued this request. If the runs are deleted, their requests are cleaned up. In future this may also be useful for de-duplicating requests in the same namespace for the same remote Tekton resource.
Below are syntax examples of ResolutionRequest
objects both in
newly-created states, succeeded state and failed state.
Example of a newly-created ResolutionRequest
for a Bundle task from the
public catalog:
apiVersion: resolution.tekton.dev/v1alpha1
kind: ResolutionRequest
metadata:
name: get-task-from-bundle-12345
namespace: bar
labels:
resolution.tekton.dev/resolver: bundle
ownerReferences:
- apiVersion: tekton.dev/v1beta1
controller: true
kind: TaskRun
name: my-tr
uid: 6aa5857a-3d67-4a09-94a1-8e9cc136dcf8
spec:
params:
image_url: gcr.io/tekton-releases/catalog/upstream/golang-fuzz:0.1
name: golang-fuzz
A newly-created ResolutionRequest
for a task from the catalog git repo:
apiVersion: resolution.tekton.dev/v1alpha1
kind: ResolutionRequest
metadata:
name: get-task-from-git-12345
namespace: quux
labels:
resolution.tekton.dev/resolver: git
ownerReferences:
- apiVersion: tekton.dev/v1beta1
controller: true
kind: TaskRun
name: my-tr
uid: 6aa5857a-3d67-4a09-94a1-8e9cc136dcf8
spec:
params:
repository_url: https://github.com/tektoncd/catalog.git
branch: main
path: /task/golang-fuzz/0.1/golang-fuzz.yaml
A successfully completed ResolutionRequest
:
apiVersion: resolution.tekton.dev/v1alpha1
kind: ResolutionRequest
metadata:
name: get-pipeline-from-git-12345
namespace: quux
labels:
resolution.tekton.dev/resolver: git
ownerReferences:
- apiVersion: tekton.dev/v1beta1
controller: true
kind: PipelineRun
name: my-pr
uid: 6aa5857a-3d67-4a09-94a1-8e9cc136dcf8
spec:
params:
repository_url: https://github.com/sbwsg/experimental.git
commit: 2c9b093e94f30f588dc798cc56a2559d4da9d573
path: /remote-resolution/pipeline.yaml
status:
annotations:
commit: 2c9b093e94f30f588dc798cc56a2559d4da9d573
content-type: application/x-yaml
data: a2luZDogUGlwZWxpbmUKYXBpVmVyc2lvbjogdGVrdG9uLmRldi92MWJldGExCm1ldGFkYXRhOgogIG5hbWU6IHAKc3BlYzoKICB0YXNrczoKICAtIG5hbWU6IGZldGNoLWZyb20tZ2l0CiAgICB0YXNrU3BlYzoKICAgICAgc3RlcHM6CiAgICAgIC0gaW1hZ2U6IGFscGluZS9naXQKICAgICAgICBzY3JpcHQ6IHwKICAgICAgICAgIGdpdCBjbG9uZSBodHRwczovL2dpdGh1Yi5jb20vdGVrdG9uY2QvcmVzdWx0cwo=
conditions:
- lastTransitionTime: "2021-10-14T15:38:14Z"
status: "True"
type: Succeeded
A failed ResolutionRequest
:
apiVersion: resolution.tekton.dev/v1alpha1
kind: ResolutionRequest
metadata:
name: get-pipeline-from-git-12345
namespace: quux
labels:
resolution.tekton.dev/resolver: git
ownerReferences:
- apiVersion: tekton.dev/v1beta1
controller: true
kind: PipelineRun
name: my-pr
uid: 6aa5857a-3d67-4a09-94a1-8e9cc136dcf8
spec:
params:
repository_url: https://github.com/sbwsg/experimental.git
branch: remote-resolution
path: /remote-resolution/invalid-pipeline.yaml
status:
conditions:
- lastTransitionTime: "2021-10-19T11:06:00Z"
message: 'error opening file "/remote-resolution/invalid-pipeline.yaml": file does not exist'
reason: ResolutionFailed
status: "False"
type: Succeeded
Each Resolver runs as its own controller in the cluster. This allows an
operator to spin up or tear down support for individual resolvers by
apply
ing or delete
ing them.
A resolver observes ResolutionRequest
objects, filtering on the
resolution.tekton.dev/resolver
label to find only those it is
interested in.
Each resolver is granted only the RBAC permissions needed to perform
resolution. In most cases this will be limited to GET
, LIST
and
UPDATE
permissions on ResolutionRequests
and their status
subresource. The "in-cluster" resolver will need permissions to GET
and LIST
Tasks
, ClusterTasks
and Pipelines
as well. Depending on
requirements of each, some resolvers may need GET
on ConfigMaps
or
Secrets
in the tekton-remote-resolution
namespace as well.
A resolver is only ultimately responsible for updating the status.data
and status.annotations
field of a ResolutionRequest
. The
ResolutionRequest
lifecycle controller is responsibile for marking
resolution requests completed or timing them out. A resolver may mark a
ResolutionRequest
as failed with an accompanying error and reason.
The "in-cluster" resolver will support looking up Tasks
,
ClusterTasks
and Pipelines
from the cluster it resides in. It will
mimic Tekton Pipelines' existing built-in support for these resources.
The parameters for this ResolutionRequest
will look like:
kind: Task
name: foo
The "bundle" resolver will support looking up resources from Tekton Bundles. This will mirror the existing support in Tekton Pipelines. In addition it will verify a bundle before returning it if configured to do so. The precise approach to verifying a bundle will be based on the decisions made in TEP-0091.
The supported paramaters for a "bundle" ResolutionRequest
will be:
image_url: gcr.io/tekton-releases/catalog/upstream/golang-fuzz:0.1
name: golang-fuzz
signer: cosign # TBD, see TEP-0091
When the bundle is successfully resolved the ResolutionRequest
will be
updated with both the resource contents and an annotation with the
digest of the fetched image.
The "git" resolver will support fetching resources from git repositories. This is new functionality that is not supported by Tekton Pipelines currently. In addition to basic support for fetching files from git this resolver will also need to support quite rich configuration (which can be developed over time):
- Allow-lists for specific repo urls, branches and commits.
- Settings for limiting access to certain repos by namespace.
- Timeout settings for git operations.
- Proxy settings (equivalent to
HTTP_PROXY
,HTTPS_PROXY
environment variables). - Path filtering such that only specific directories and file paths within a repo can be sourced.
- Verification of fetched files, following any approach decided in TEP-0091.
Since clones of large repos can be slow, and not all providers support features like sparse checkout, the "git" resolver will implement caching as a high priority.
The parameters for a "git" ResolutionRequest
will initially look as
follows:
repository_url: https://github.com/tektoncd/catalog.git
commit: 2c9b093e94f30f588dc798cc56a2559d4da9d573
branch: main # either commit or branch may be specified, but not both
path: /task/golang-fuzz/0.1/golang-fuzz.yaml
The git resolver will need to gracefully handle concurrent requests for
the same resource and will also need to be able to cancel in-flight
operations if a ResolutionRequest
is failed or deleted.
When the git resource is successfully resolved the ResolutionRequest
will be updated with both the resource contents as well as an annotation
with the fetched commit SHA.
One of the goals for this feature is to reach a point of maturity that
Tekton Pipelines could consider replacing its baked-in taskRef
and
pipelineRef
support with resolvers from the Tekton Resolution project.
At that point Pipelines' maintainers would need to decide how best to
provide default resolvers "out of the box". One possible approach would be
to deploy the ResolutionRequest
lifecycle reconciler and a set of resolvers
as part of Pipelines' release.yaml
.
Tekton Pipelines and any other project leveraging Tekton Resolution will
primarily interact with the resolution machinery through helper
libraries. These helpers are going to hide as much of the specifics of
requesting resources as possible but may need some supporting objects passed
in. For example, a ResolutionRequest
client/lister may need to be passed from
Pipelines to the Resolution helpers but the client or lister would itself be
generated by, and imported from, the Tekton Resolution project.
Unit and integration tests covering the new features including:
- Validation of new
taskRef
andpipelineRef
syntax. - Creation of
ResolutionRequests
. - Behaviour of Pipelines on
ResolutionRequest
success. - Behaviour of Tekton Pipelines on
ResolutionRequest
failure. - Timing out remote resolution requests.
Eventually the resolution project may reach a point of maturity that
Tekton Pipelines opts to use it for all taskRef
and pipelineRef
resolution. At this point Pipelines will need additional test coverage
for:
- Correctly translating all
taskRefs
andpipelineRefs
toResolutionRequests
. - End-to-end behaviour of all the resolvers supported by Pipelines "out of the box".
As part of this project being spun up a new suite of tests will be introduced covering the new CRD, its reconciler as well as individual suites for each of the resolvers.
We currently dogfood Pipelines' Bundles support in our release processes, for example in add-pr-body.
If accepted, the Tekton Resolution project and a set of resolvers could eventually be installed in our dogfood cluster and used as part of our release processes.
Once Remote Resolution is supported in Dogfooding we can also implement
testing for recommended "Task Management" approaches. Documenting these
will also provide guidelines to help the community structure their own
repos and registries of Pipelines
and Tasks
.
The CRD-based approach is reusable across any Tekton project that wants
to utilize remote resources. Submitting ResolutionRequest
objects for
resolution is not a feature exclusively tied to Tekton Pipelines.
Triggers, Workflows, Pipeline-as-Code and others should all be able to
use ResolutionRequest
objects.
- Creating a new object kind specifically designed for handing off responsibility of resolution is intended to make Tekton Pipelines' job simpler. Pipelines does not need to bake-in support for git. Eventually Pipelines may be able to remove all support for reading Tasks and Pipelines from the cluster or Bundles entirely, relying instead on resolvers.
- Creating new reconcilers from scratch is not trivial, however a resolver does not need new code-generated libraries or even to know the specifics of CRDs. A shared interface and template project can alleviate the difficulty of writing new resolvers.
- By decoupling Tekton Pipelines from the source of
Tasks
andPipelines
it becomes possible to support any source independent of the Pipelines codebase. - New approaches to resolution and resource handling can be built, tested and productionized without burdening the Tekton Pipelines project directly.
The primary drawback is complexity: we may find that the proposed solution is ultimately only ever used to fetch files from git. A mitigation to this drawback is that, if this becomes apparent, starting off in alpha allows the approach to be rethought and scoped back before moving to beta.
With the proposed design RBAC will be limited to whether or not a given
ServiceAccount / Role can create or read ResolutionRequest
objects. This
won't prevent situations where, for example, multiple resolvers compete
over the same ResolutionRequest
type.
Users wanting to refer to tasks in git (or anywhere else) are required to figure out their own implementation to fetch those resources, apply them, and use them in PipelineRuns / TaskRuns.
Pros:
- Pipelines has no work to do 🎉
- Pipelines could offer some supporting docs to help people do this.
Cons:
- No opportunity to standardize around remote fetching outside the sphere of tekton bundles.
- If Pipelines did offer docs for people constructing their own solution then they'd need to be kept up to date.
Each Tekton project could utilize some of the documentation that Pipelines provides to offer their own guidance on resolution.
Bake support for referencing tasks and pipelines in Git as a first class
citizen alongside Tekton Bundles. Add a git.Resolver
to live in tandem with
oci.Resolver
.
This option is not necessarily orthogonal to the main proposal in this TEP, and it could in fact be a first step towards it. Example syntax for a PipelineRun follows:
kind: PipelineRun
spec:
pipelineRef:
# Not actual proposed syntax, but here assuming "remote" is
# just an unstructured field that will later have its content
# defined by operator-controller CRDs.
remote:
type: git
repo: git@github.com:sbwsg/foobar-repo.git
path: /path/to/pipeline.yaml
Pros:
- Dramatically simpler initial requirements than an entire remote resource resolution proposal.
- Way less specification noise, maintenance burden, etc.
- The only other source of tekton resources that we've had actual requests for.
- A convenient stepping stone towards full remote resource support.
- Note: Even if we take this approach we would still want to choose a syntax that lends itself well to being extended with fuller support for other proposals later.
Cons:
- Adds code to Pipelines that is specific to git.
- Does not provide a path for users to integrate with other version control systems.
- Pipelines-specific solution, where we might want to also support remote
resolution in other Tekton projects.
- This also applies to OCI bundles: an Operator might want e.g. Trigger resources stored in an image registry too.
- While this alternative is Pipelines-specific it doesn't prevent the syntax being adopted or implementation being shared with other Tekton projects.
If the git implementation is placed in its own package within Pipelines then it's conceivable that other Tekton projects could utilize that package simply by importing it.
Resolvers supporting different source types (git/oci/etc..) are registered with the Pipelines controller. A good existing example of this kind of registration mechanism is the ClusterInterceptor CRD from Tekton Triggers. The controller then makes web requests to resolve remote resources:
-
Operations team deploys a "ClusterInterceptor-style" CRD to register a service for git-type resource resolution.
-
User submits a run type with a remote reference:
kind: TaskRun
metadata:
name: foo-run
spec:
taskRef:
apiVersion: tekton.dev/v1beta1
kind: Task
remoteType: git
remoteParams:
- name: url
value: https://github.com/tektoncd/catalog.git
- name: ref
value: main
- name: path
value: /tasks/git-clone/0.4/git-clone.yaml
- Pipelines sends a webhook request to the endpoint registered with the CRD for
git
remoteType.
{
"url": "https://github.com/tektoncd/catalog.git",
"ref": "main",
"path": "/tasks/git-clone/0.4/git-clone.yaml"
}
- Endpoint resolves
git-clone.yaml
and returns its content in the response.
{
"resolved": "apiVersion: tekton.dev/v1beta1\nkind: Task\nmetadata:\n name: git-clone\nspec:\n [...]"
}
Pros:
- Writing an HTTP server that responds to requests with JSON is a relatively small development burden.
- Familiar pattern for anyone who's used Triggers' Interceptors.
- Flexible since the endpoint can do whatever it wants before returning the resolved resource.
- Tekton Pipelines could surface metrics around latencies, errors, etc... during resolutions.
ClusterInterceptors
as they're employed by Triggers are composable - a single incoming event payload can be processed through a sequence ofClusterInterceptors
. This might also be a nice feature for Pipelines too, passing a resolved Tekton resource through various verification, validation or processing steps before being returned to the Tekton Pipelines controller.
Cons:
- Synchronous: the resource request is a direct connection from a Pipelines
reconciler to HTTP server, blocking that reconciler thread while the resolver
processes the request. This could be mitigated by:
- Putting the responsibility on Resolver designers to ensure their resolvers are fast. Not necessarily a problem Pipelines must solve.
- Pulling the HTTP requests out of Pipelines' reconciler loops. This could be done using a go routine pool, a Futures-style interface, or a multitude of other approaches. Any approach adds implementation complexity to Pipelines and exposes more questions on logging and observability.
- Moving to an HTTP based service takes us away somewhat from Kubernetes RBAC,
which potentially makes implementing multi-tenant setups more difficult or
complex.
- "Multi-tenant" implies that each tenant could have their own set of tasks and pipelines.
- How would cross-tenant requests be prevented? (e.g. stop one tenant from requesting the tasks/pipelines of another).
- How would secrets be managed for each tenant's resource requests? (e.g. each tenant might have their own github secret to fetch their tasks/pipelines with).
An example of what this might look like for a PipelineRun instead of TaskRun:
kind: PipelineRun
metadata:
name: pr1
spec:
pipelineRef:
apiVersion: tekton.dev/v1beta1
kind: Pipeline
remoteType: git
remoteParams:
- name: url
value: https://github.com/tektoncd/catalog.git
- name: ref
value: main
- name: path
value: /pipeline/build-push-gke-deploy/0.1/build-push-gke-deploy.yaml
Other Tekton projects could integrate with this solution by observing the same "registration" CRDs and making similar calls to those Pipelines would make. They'd be interested in different kinds of stored YAMLs (e.g. eventbinding.yaml instead of pipeline.yaml) but by and large the process of fetching and returning those resources would be similar.
Utilize the existing Custom Tasks pattern that Pipelines has already established to implement remote fetching of resources. Below is one possible way that Custom Tasks can be used for Remote Resolution with the existing tools today:
- User submits PipelineRun
kind: PipelineRun
metadata:
name: foo-run
spec:
tasks:
- taskSpec:
spec:
apiVersion: gitref.tekton.dev/v1alpha1
kind: RemoteFetch
spec: # embedded custom task spec, tep-0061. These fields can be anything the Resolver wants to support.
repo: https://github.com/tektoncd/catalog.git
ref: main
path: /tasks/git-clone/0.4/git-clone.yaml
# params and workspaces for resolved git-clone task
params:
- name: url
value: my-repo/foo/bar.git
workspaces:
- name: output
# ...
- PipelineRun reconciler creates a
Run
object for thistaskSpec
:
kind: Run
spec:
spec:
apiVersion: gitref.tekton.dev/v1alpha1
kind: RemoteFetch
spec:
repo: https://github.com/tektoncd/catalog.git
ref: main
path: /tasks/git-clone/0.4/git-clone.yaml
params:
- name: url
value: my-repo/foo/bar.git
workspaces:
- name: output
# ...
-
Custom Task reconciler is waiting for
gitref.tekton.dev/v1alpha1.RemoteFetch
. Fetches remote resource based on parameters from embedded spec. -
At this point the possible implementation could be either of:
-
The Custom Task reconciler creates
TaskRun
with fetched resource, passing through params, workspaces, etc... -
Custom Task reconciler updates
status
ofRun
object with the fetched YAML. The Pipelines controller can pick up this YAML from thestatus
and create theTaskRun
.
-
-
If the Custom Task reconciler is responsible for creating the underlying
TaskRun
/PipelineRun
then it would also be responsible for mirroring status changes, results, etc from the *Run it created back to theRun
that the Pipelines reconciler is watching.
Pros:
- Reuses existing Custom Tasks pattern.
- Doesn't require any changes in Pipelines, or specification effort at all.
- We could change the spec to better support this use-case though.
- Lifetime of fetched resource can be tied to owner reference - the
TaskRun
that the custom task reconciler creates can add an owner reference to theRun
for this I think? - Tekton could provide one of these for git
- Resource resolution is performed declaratively.
- Resolution is executed concurrently while the Pipelines controller does other things.
- Pipelines not responsible for anything Resolver-specific.
- Development burden of writing a reconciler could be alleviated with a custom task SDK?
Cons:
- No consistent specification of behaviour, potentially varies for every custom task implementing remote fetch.
- No validation of fetched content by Pipelines if Custom Task reconciler is responsible for creating TaskRuns/PipelineRuns.
- No opportunity to apply defaults by Pipelines if Custom Task reconciler is responsible for creating TaskRuns/PipelineRuns.
- No opportunity for Pipelines reconciler to optimize fetching multiple remote refs up-front before a PipelineRun is allowed to start.
- The custom task needs to have permissions to submit new taskruns to the cluster.
- Unclear how we'd standardize anything around this.
- Writing a reconciler could be considered difficult burden for users.
- This would only burden those requiring a custom Resolver, however. The default use cases should ideally be well-catered for by officially-provided or community-provided Resolvers.
- We'd either need to migrate the existing OCI bundle ref to this new format or maintain it alongside the Custom Task interface.
- Pipelines doesn't distinguish between Custom Tasks and remote resolution,
meaning it's much harder to surface metrics around latencies, errors, etc...
that are specific to remote resolution.
- Again though, we could change the spec / design to accommodate for this.
This might be tricky to integrate with other Tekton projects. With this approach the Custom Task is responsible for creating the Tekton resource (PipelineRun / TaskRun). In order to integrate with other projects the Custom Tasks would also need to know how to create resources for those projects. A Git Custom Task Resolver would need to know how to create TaskRuns, PipelineRuns, TriggerBindings, EventListeners, TriggerTemplates, and so on.
The role of the Resolver is limited to fetching YAML strings and writing them
to the TektonResolutionRequest
. Other Tekton projects could leverage this same
mechanism for resolving their own YAML documents and validate / apply defaults
as they need.
Add a new CRD that embeds all the information you need for a PipelineRun or TaskRun but is intentionally typed differently so that the Pipelines controller will ignore it. We can then introduce a new controller that performs all the resolutions and creates runs as needed.
This would operate similarly to the Custom Task alternative described above but inverts the ordering so that resolution occurs prior to Tekton Pipelines seeing the *runs.
Pros:
- Nothing changes in the Pipelines controller - resolution occurs before it "sees" the submitted objects.
- No need to "bloat" or "pollute" Tekton's existing CRDs with additional fields related to remote resolution.
Cons:
- We'd either need to migrate OCI bundles to this or support both approaches in parallel.
Other Tekton projects could utilize a very similar (or perhaps even the same?)
controller to resolve their own resources prior to their expected types being
submitted. For example: the new CRD could happily go out and fetch an
EventListener
, TriggerTemplate
and TriggerBinding
from a YAML file and
submit them to the cluster for the Triggers controller to pick up on.
Write a controller that watches for PipelineRuns
/ TaskRuns
created with a
PipelineRunPending
status. Allow additional information to be included in these
*runs that instruct the controller on how to resolve those resources.
Pros:
- Utilize existing
pending
state to prevent Pipelines controller from attempting to execute a resource that isn't yet "complete" because it hasn't had its YAML fetched yet.
Cons:
- We'd still need to decide precisely what the protocol for resolution is.
PipelineRunPending
is documented for users / third-party tools, not intended for Pipelines' internal use-cases.- Users may already be using the
PipelineRunPending
status for other purposes so adopting the status for resolution may inadvertently step on those users' toes.
Other projects don't yet have a concept of "pending" in their CRDs so this would either need to be added or simply be an implementation detail of how Pipelines implements its remote resource resolution mechanism.
Write a controller that intercepts submissions of *run types and performs resolution of any resources referenced by the submitted YAMLs. This could work in a couple of ways:
- a mutating webhook could recognize
pipelineRef
andtaskRef
entries, fetch them from the remote location (e.g. git) and in-line them to the submitted YAML. - a validating webhook could similarly fetch referenced resources but then submit those to the cluster as a side-effect, possible stored with a randomized name to avoid version collisions.
Pros:
- Leverages existing well-known Kubernetes extension mechanism.
Cons:
- Risks slowing down admission, possibly hitting Kubernetes-enforced timeouts.
Admission controllers are a Kubernetes-wide concept. It seems reasonable to assume other Tekton projects could also leverage this mechanism for their own resource resolution needs. Possible opportunity to share controller or libraries to achieve this?
Rather than using an intermediary data format like ResolutionRequest
objects Resolvers could instead simply pull Tasks and Pipelines out of
places like Git repos and kubectl apply
them to the cluster. This
would avoid any question of performance impact related to CRDs and would
potentially any changes being required at all in Tekton Pipelines.
Pros:
- Syncing the contents of a repo into the cluster can happen totally independently of Tekton Pipelines.
- Possibly the simplest system design for users to understand?
Cons:
- Unclear how this would work for pipelines-as-code use-cases where a user's Pull Request could include Pipeline or Task changes that should be used in testing.
- Unclear how tasks and pipelines with the same name from multiple branches or commits could co-exist in the same namespace. Potential risk for confusing outcomes here.
This solution would be quite specific to Tekton Pipelines if coded to
only submit Tasks and Pipelines to the cluster. Alternatively resolvers
could operate with zero understanding of the resources and simply
kubectl apply
whatever appears in the repository but this too has
drawbacks with regard to deciding which resources should be applied and
which shouldn't.
PRs implementing resource resolution in Pipelines:
Resolution was originally implemented in https://github.com/tektoncd/resolution. It was moved over to the https://github.com/tektoncd/pipeline in the following PRs:
- Add types and client for Resolution
- Move Resolution bundle, git, and hub resolver pkgs over
- Add combined remote resolvers binary
- Add resolvers deployment, with release and e2e integration
- Write a resolver that can fetch a pipeline at a specific commit and rewrites its
taskRefs
to also reference that same commit. - Add "default" remotes so that operators can pick where to get tasks from when no remote is specified by user. For example, an internal OCI registry might be the default remote for an org so a PipelineTask referring only to "git-clone" without any other remote info could be translated to a fetch from that OCI registry.
- Build reusable libraries that resolvers can leverage so they don't have to each rewrite their side of the protocol, common caching approaches, etc.
- Replacing
ClusterTask
with a namespace full of resources that a resolver has access to. - Create a "local disk" resolver that would allow a user to work on a task's YAML locally and use those changes in TaskRuns on their cluster without an intermediary upload step.
- TEP-0005: Tekton OCI Bundles
- Alternatives section describes possible syntax for git refs
- TEP-0053: Tekton Catalog Pipeline Organization
- This TEP is exploring approaches to annotating pipelines in the catalog so that the user can tell which specific tasks are being referenced.
- Tekton Bundles docs in Pipelines
- Tekton Bundle Contract
- Tekton OCI Images Design
- Tekton OCI Image Catalog
- Future Work section describes expanding support to other kinds of remote locations
- Referencing Tasks in Pipelines: Separate Authoring and Runtime Concerns
- CatalogTask Custom Task PR + associated discussion
- Using a Custom Task protocol to support any kind of
resolver
- Intended to be used in tandem with a set of custom tasks in experimental repo for remote resources in oci, catalogs, git, and gcs
- A video demo (starts at 11:09) of this proof-of-concept.
- Experimental Remote Resolution Controller
- Pipelines Issue 2298: Add support for referencing Tasks in git
- Pipelines Issue 3305: Refactor the way resources are injected into the reconcilers
- Pipelines Discussion about running tasks from the catalog
As part of the discovery phase for this TEP a controller has been developed for our experimental repo to test out some ideas. That controller demonstrates two of the Alternatives we outlined above:
ClusterInterceptor
-fronted HTTP servers with which Pipelines exchanges JSON messages to resolve pipelineRefs.- A
ResolutionRequest
CRD in which resources are summoned as Kubernetes API Objects & resolved by independent reconcilers.
The code is not production-ready and only supports PipelineRuns
.
The Pull Request is available for a deeper look.
The project is split across several reconcilers:
- A shim reconciler watching
PipelineRuns
. Consider this a stand-in for Tekton Pipelines just to "make things work" without deeper integration. - A reconciler watching
ResolutionRequests
- A resolver reconciler watching for Git
ResolutionRequests
- A resolver reconciler watching for In-Cluster
ResolutionRequests
Alongside the controller are two HTTP servers:
- A HTTP server waiting for requests of Git resources behind a
Kubernetes
Service
and discovered viaClusterInterceptor
. - A HTTP server waiting for requests of in-cluster resources behind a
Kubernetes
Service
and discovered viaClusterInterceptor
.
The project is placed into either ResolutionRequest
mode or
ClusterInterceptor
mode using an environment variable in the
controller deployment.
The order of operations to resolve a single pipelineRef
is as follows:
- User creates
PipelineRun
with:
spec.status
set toPipelineRunPending
- A
resolution.tekton.dev/type
annotation set to either"git"
or"in-cluster"
. - Annotations for type-specific parameters. e.g.
git.repo
,in-cluster.kind
- The shim reconciler accepts the
PipelineRun
due to its pending state and type annotation. - At this point the flow splits depending on which mode the project is running in:
- In
ResolutionRequest
mode:- The shim reconciler creates a
ResolutionRequest
object with aresolution.tekton.dev/type
label and params. - The
ResolutionRequest
reconciler initializes the object's condition toSucceeded/Unknown
. - The resolver reconcilers check if they can resolve the
ResolutionRequest
by filtering on its type label. - A matching resolver reconciler performs the work needed to fetch the resource's content.
- The resolver reconciler updates the
ResolutionRequest
with the resolved data in itsstatus.data
field. - The
ResolutionRequest
reconciler observes the populatedstatus.data
and updates the object's condition toSucceeded/True
.
- The shim reconciler creates a
- In
ClusterInterceptor
mode:- The shim reconciler looks up the
ClusterInterceptor
matching theresolution.tekton.dev/type
label on thePipelineRun
. - The shim reconciler sends a JSON request to the address from the
ClusterInterceptor
with the parameters from thePipelineRun
annotations. - The HTTP server advertised by the
ClusterInterceptor
receives the request and fetches the resource's content. - The HTTP server responds with the resolved content wrapped in a JSON object.
- The shim reconciler looks up the
Main takeaways from development:
- There is no way to "shim" support for this feature into
TaskRuns
at the moment.
The experimental controller works with an unmodified Tekton Pipelines
installation but only for PipelineRuns
. This is made possible because
PipelineRuns
can be created in PipelineRunPending
state that allows
them to be processed by the experimental controller. TaskRuns
by
contrast do not have an equivalent pending state so the only way to
support them is using a flag to modify the behaviour of the TaskRun
reconciler.
- Async when coordinating requests to HTTP endpoints will require additional complexity.
About a half a day was spent looking at ways to implement the
ClusterInterceptor
approach in a way that does not "block"
PipelineRun
reconciler threads. Figuring out possible solutions is
quick but implementing and debugging the chosen approach will take more
engineering effort. This also raised questions around how to surface
requests so that users have observability into the state of the system.
That kind of observability feels a lot more "natural" with a CRD where
state and condition are publicized through the fields of objects.
- Resolvers are easy to write given the appropriate framework
In order to support both the ClusterInterceptor
and ResolutionRequest
approaches with the same resolvers a common interface was introduced.
The interface abstracts away the underlying requests or k8s objects and
lets the resolver code focus on the fetching and returning of resources
and metadata. This seems like a good pattern to follow with the proposed
Tekton Resolution project - provide a common interface that resolver
developers can implement so that they don't need to worry about HTTP
requests or k8s objects.
The interface from the experiment is here and example implementations are the no-op resolver, the git resolver and the in-cluster resolver.