Commit 34024e9

updated timeline and API
1 parent 980d43e commit 34024e9

1 file changed

+54
-17
lines changed

enhancements/machine-config/manage-boot-images.md

+54-17
@@ -45,7 +45,7 @@ Currently, bootimage references are [stored](https://github.com/openshift/instal
4545

4646
Additionally, the stub Ignition config [referenced](https://github.com/openshift/installer/blob/1ca0848f0f8b2ca9758493afa26bf43ebcd70410/pkg/asset/machines/gcp/machines.go#L197) in the `MachineSet` is also not managed. This stub is used by the ignition binary in firstboot to auth and consume content from the `machine-config-server`(MCS). The content served includes the actual Ignition configuration and the target OCI format RHCOS image. The ignition binary now does first boot provisioning based on this, then hands off to the `machine-config-daemon`(MCD) first boot service to do the reboot into the target OCI format RHCOS image.
4747

48-
There has been [a previous effort](https://github.com/openshift/machine-config-operator/pull/1792) to manage the stub Ignition config. It was [reverted](https://github.com/openshift/machine-config-operator/pull/2126) and then [brought back](https://github.com/openshift/machine-config-operator/pull/2827#issuecomment-996156872) just for bare metal clusters. For other platforms, the `*-managed` stubs still get generated by the MCO, but are not injected into the `MachineSet`. The proposal plans to utilize these unused `*-managed` stubs, but it is important to note that this stub is generated(and synced) by the MCO and will ignore/override any user customizations to the original stub Ignition config. This limitation will be mentioned in the documentation, and a later release will provide support for user customization of the stub, either via API or a workaround thorugh additional documentation. This should not be an issue for the majority of users as they very rarely customize the original stub Ignition config.
48+
There has been [a previous effort](https://github.com/openshift/machine-config-operator/pull/1792) to manage the stub Ignition config. It was [reverted](https://github.com/openshift/machine-config-operator/pull/2126) and then [brought back](https://github.com/openshift/machine-config-operator/pull/2827#issuecomment-996156872) just for bare metal clusters. For other platforms, the `*-managed` stubs still get generated by the MCO, but are not injected into the `MachineSet`. The proposal plans to utilize these unused `*-managed` stubs, but it is important to note that this stub is generated(and synced) by the MCO and will ignore/override any user customizations to the original stub Ignition config. This limitation will be mentioned in the documentation, and a later release will provide support for user customization of the stub, either via API or a workaround through additional documentation. This should not be an issue for the majority of users as they very rarely customize the original stub Ignition config.
4949

5050
In certain long lived clusters, the MCS TLS cert contained within the above Ignition configuration may be out of date. Example issue [here](https://issues.redhat.com/browse/OCPBUGS-1817). While this has been partly solved by [MCO-642](https://issues.redhat.com/browse/MCO-642) (which allows the user to manually rotate the cert), it would be very beneficial for the MCO to actively manage this TLS cert and take this concern away from the user.
5151

@@ -81,7 +81,7 @@ __Overview__
8181
- `ManagedBootImages` feature gate is active
8282
- The cluster and/or the machineset is opted-in to boot image updates. This is done at the operator level, via the `MachineConfiguration` API object.
8383
- The `machineset` does not have a valid owner reference. Having a valid owner reference typically indicates that the `MachineSet` is managed by another workflow, and that updates to it are likely going to cause thrashing.
84-
- The golden configmap is verified to be in sync with the current version of the MCO. The MCO will update("stamp") the golden configmap with version of the new MCO image after at least 1 master node has succesfully completed an update to the new OCP image. This helps prevent `machinesets` being updated too soon at the end of a cluster upgrade, before the MCO itself has updated and has had a chance to roll out the new OCP image to the cluster.
84+
- The golden configmap is verified to be in sync with the current version of the MCO. The MCO will update ("stamp") the golden configmap with the version of the new MCO image after at least 1 master node has successfully completed an update to the new OCP image. This helps prevent `machinesets` being updated too soon at the end of a cluster upgrade, before the MCO itself has updated and has had a chance to roll out the new OCP image to the cluster.
8585

8686
If any of the above checks fail, the MSBIC will exit out of the sync.
8787
- Based on platform and architecture type, the MSBIC will check if the boot image referenced in the `providerSpec` field of the `MachineSet` is the same as the one in the ConfigMap. Each platform (gcp, aws, and so on) does this differently, so this part of the implementation will have to be special cased. The ConfigMap is considered to be the golden set of bootimage values, i.e. they will never go out of date. If it is not a match, the `providerSpec` field is cloned and updated with the new boot image reference.
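
The checks and the comparison step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `machineSet` and `clusterState` types, their fields, and the function names are hypothetical simplifications and are not part of the MCO or the Machine API, and the platform-specific parsing of `providerSpec` is reduced to a single string field.

```go
package main

import "fmt"

// machineSet is a simplified, hypothetical stand-in for a MachineSet.
type machineSet struct {
	Name        string
	OptedIn     bool   // enrolled via the MachineConfiguration object
	HasOwnerRef bool   // a valid owner reference is present
	Arch        string // e.g. "x86_64", "aarch64"
	BootImage   string // boot image reference taken from the providerSpec
}

// clusterState holds the cluster-level inputs consulted before a sync (hypothetical).
type clusterState struct {
	FeatureGateActive bool
	ConfigMapVersion  string            // version stamped on the golden ConfigMap
	MCOVersion        string            // version of the running MCO
	GoldenImages      map[string]string // arch -> boot image from the ConfigMap
}

// preSyncCheck returns an empty string when every check passes,
// otherwise the reason the sync should exit early.
func preSyncCheck(cs clusterState, ms machineSet) string {
	switch {
	case !cs.FeatureGateActive:
		return "ManagedBootImages feature gate is not active"
	case !ms.OptedIn:
		return "machineset is not opted in"
	case ms.HasOwnerRef:
		return "machineset has a valid owner reference"
	case cs.ConfigMapVersion != cs.MCOVersion:
		return "golden configmap is not in sync with the MCO version"
	}
	return ""
}

// reconcileBootImage compares the machineset's boot image with the golden
// value for its architecture and returns an updated copy when they differ.
func reconcileBootImage(cs clusterState, ms machineSet) (machineSet, bool) {
	golden, ok := cs.GoldenImages[ms.Arch]
	if !ok || golden == ms.BootImage {
		return ms, false
	}
	updated := ms // struct copy; the original value is left untouched
	updated.BootImage = golden
	return updated, true
}

func main() {
	cs := clusterState{
		FeatureGateActive: true,
		ConfigMapVersion:  "v2",
		MCOVersion:        "v2",
		GoldenImages:      map[string]string{"x86_64": "image-new"},
	}
	ms := machineSet{Name: "worker-a", OptedIn: true, Arch: "x86_64", BootImage: "image-old"}

	if reason := preSyncCheck(cs, ms); reason != "" {
		fmt.Println("skipping sync:", reason)
		return
	}
	updated, changed := reconcileBootImage(cs, ms)
	fmt.Println(changed, updated.BootImage)
}
```

Returning a copy rather than mutating in place mirrors the "cloned and updated" wording above; a real controller would then patch the `MachineSet` with the difference.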
@@ -135,21 +135,22 @@ This work will be tracked in [MCO-793](https://issues.redhat.com/browse/MCO-793)
135135

136136
##### Projected timeline
137137

138-
This is a tentative timeline, subject to change (GA = General Availability, TP = Tech Preview, EF = Default-on, Skew Enforced).
138+
This is a tentative timeline, subject to change (GA = General Availability, TP = Tech Preview, DEF = Default-on).
139139

140-
| Platform | TP | GA | EF |
140+
| Platform | TP | GA | DEF |
141141
| -------- | ------- | ------- | ------- |
142-
| gcp | 4.16 |4.17 |4.19 |
143-
| aws | 4.17 |4.18 |4.20 |
142+
| gcp | [4.16](https://docs.redhat.com/en/documentation/openshift_container_platform/4.16/html-single/machine_configuration/index#mco-update-boot-images) |[4.17](https://docs.redhat.com/en/documentation/openshift_container_platform/4.17/html-single/machine_configuration/index#mco-update-boot-images) |4.19 |
143+
| aws | [4.17](https://docs.redhat.com/en/documentation/openshift_container_platform/4.17/html-single/machine_configuration/index#mco-update-boot-images) |[4.18](https://docs.redhat.com/en/documentation/openshift_container_platform/4.18/html-single/machine_configuration/index#mco-update-boot-images) |4.19 |
144144
| vsphere | 4.20 |4.21 |4.22 |
145145
| baremetal| |4.22 |4.23 |
146146
| openstack| |4.22 |4.23 |
147-
| nutanix | |4.22 |4.23 |
148-
| ibmcloud | |4.24 |4.25 |
149-
| non-managed* | |4.21 |4.22 |
147+
| nutanix | |4.23 |4.24 |
148+
| ibmcloud | |4.23 |4.24 |
149+
| non-managed* | x |x |4.25 |
150150

151-
*non-managed in this case indicates enforcement of user-initiated bootimage updates. See enforcement section below.
151+
Once default-on behavior has been deemed to be stable across the above mentioned platforms, skew enforcement will be implemented in a platform-agnostic manner. Decoupling the default-on and skew enforcement mechanisms would help iron out any edge cases unique to a platform and would also aid in refining the skew enforcement workflow. The tentative timeline for skew enforcement is 4.25.
152152

153+
**Note** : For non-managed cases, * indicates enforcement of user-initiated bootimage updates. See enforcement section below.
153154
##### Cluster API backed machinesets
154155

155156
As the Cluster API move is impending (initial release in 4.16 and default-on release in 4.17), it is necessary that this enhancement plans for the changes required in a CAPI-backed cluster. Here are a couple of sample YAMLs used in CAPI-backed `MachineSets`, from the [official OpenShift documentation](https://docs.openshift.com/container-platform/4.14/machine_management/capi-machine-management.html#capi-sample-yaml-files-gcp).
@@ -225,9 +226,10 @@ Much of the existing design regarding architecture & platform detection, opt-in,
225226
#### Opt-in Mechanism
226227
This proposal introduces a new field in the MCO operator API, `ManagedBootImages` which encloses an array of `MachineManager` objects. A `MachineManager` object contains the resource type of the machine management object that is being opted-in, the API group of that object and a union discriminant object of the type `MachineManagerSelector`. This object `MachineManagerSelector` contains:
227228

228-
- The union discriminator, `Mode`, can be set to two values : All and Partial.
229+
- The union discriminator, `Mode`, can be set to one of three values: All, Partial and None.
229230
- **All**: All machine resources described by this resource/apiGroup type will be opted-in for boot image updates. In most cases, this effectively enables boot image updates for the whole cluster, unless there are multiple kinds of machine resources present in the cluster.
230231
- **Partial**: This is a set of label selectors that will be used by users to opt-in a custom selection of machine resources. When the Mode is set to Partial mode, all machinesets matched by this object would be considered enrolled for updates. In the first iteration of this API, this object will only allow for label matching with MachineResources. In the future, additional ways of filtering may be added with another label selector, e.g. namespace.
232+
- **None**: All machine resources described by this resource/apiGroup type will be excluded from boot image updates. In most cases, this effectively disables boot image updates for the whole cluster, unless there are multiple kinds of machine resources present in the cluster.
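
As a rough illustration of how the three modes could be evaluated for a single machine resource, consider the sketch below. The `isEnrolled` helper and the plain `map[string]string` label matching are assumptions made for brevity; the proposal's `PartialSelector` is richer than a simple key/value match, and the treatment of an empty selector here (matches everything) is a choice of this sketch, not something the proposal states.

```go
package main

import "fmt"

// MachineManagerSelectorMode mirrors the string enum from the proposal.
type MachineManagerSelectorMode string

const (
	All     MachineManagerSelectorMode = "All"
	Partial MachineManagerSelectorMode = "Partial"
	None    MachineManagerSelectorMode = "None"
)

// isEnrolled is a hypothetical helper that reports whether a resource with
// the given labels is enrolled for boot image updates under the given mode.
// For Partial, every key/value pair in matchLabels must be present on the
// resource; an empty matchLabels map matches every resource in this sketch.
func isEnrolled(mode MachineManagerSelectorMode, matchLabels, resourceLabels map[string]string) bool {
	switch mode {
	case All:
		return true
	case Partial:
		for key, want := range matchLabels {
			got, ok := resourceLabels[key]
			if !ok || got != want {
				return false
			}
		}
		return true
	default:
		// None, or any unrecognized value, excludes the resource.
		return false
	}
}

func main() {
	selector := map[string]string{"update-boot-image": "true"}
	labelled := map[string]string{"update-boot-image": "true", "zone": "a"}
	unlabelled := map[string]string{"zone": "b"}

	fmt.Println(isEnrolled(All, nil, unlabelled))
	fmt.Println(isEnrolled(Partial, selector, labelled))
	fmt.Println(isEnrolled(Partial, selector, unlabelled))
	fmt.Println(isEnrolled(None, nil, labelled))
}
```

Treating unknown values like `None` is a conservative choice in this sketch; the enum validation on the API type should prevent such values from being stored in the first place.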
231233

232234

233235
```
@@ -268,6 +270,7 @@ type MachineManagerSelector struct {
268270
// Valid values are All, Partial and None.
269271
// All means that every resource matched by the machine manager will be updated.
270272
// Partial requires specified selector(s) and allows customisation of which resources matched by the machine manager will be updated.
273+
// None means that every resource matched by the machine manager will NOT be updated.
271274
// +unionDiscriminator
272275
// +kubebuilder:validation:Required
273276
Mode MachineManagerSelectorMode `json:"mode"`
@@ -286,7 +289,7 @@ type PartialSelector struct {
286289
}
287290
288291
// MachineManagerSelectorMode is a string enum used in the MachineManagerSelector union discriminator.
289-
// +kubebuilder:validation:Enum:="All";"Partial"
292+
// +kubebuilder:validation:Enum:="All";"Partial";"None"
290293
type MachineManagerSelectorMode string
291294
292295
const (
@@ -296,6 +299,9 @@ const (
296299
// Partial represents a configuration mode that will register resources specified by the parent MachineManager only
297300
// if they match with the label selector.
298301
Partial MachineManagerSelectorMode = "Partial"
302+
303+
// None represents a configuration mode that excludes all resources specified by the parent MachineManager from boot image updates.
304+
None MachineManagerSelectorMode = "None"
299305
)
300306
301307
// MachineManagerManagedResourceType is a string enum used in the MachineManager type to describe the resource
@@ -377,6 +383,37 @@ spec:
377383
name: "cluster"
378384
namespace: "default"
379385
```
386+
#### Skew Enforcement
387+
As mentioned in the timeline section, this would only be implemented after default-on behavior has been deemed to be stable across
388+
all platforms.
389+
390+
This would be introduced as a new knob in the `ManagedBootImages` struct:
391+
```
392+
type ManagedBootImages struct {
393+
...
394+
...
395+
// skewEnforcement allows an admin to set behavior of the boot image skew enforcement mechanism.
396+
// Enabled means that the MCO will degrade and prevent upgrades when the boot image skew is too large.
397+
// Disabled means that the MCO will no longer degrade and will permit upgrades when the boot image skew is
398+
// too large. This may also hinder the cluster's scaling ability.
399+
// +optional
400+
SkewEnforcement SkewEnforcementSelectorMode `json:"skewEnforcement"`
401+
}
402+
403+
// SkewEnforcementSelectorMode is a string enum used to indicate the cluster's boot image skew enforcement mode.
404+
// +kubebuilder:validation:Enum:="Enabled";"Disabled"
405+
type SkewEnforcementSelectorMode string
406+
407+
const (
408+
// Enabled represents a configuration mode that enables boot image skew enforcement.
409+
Enabled SkewEnforcementSelectorMode = "Enabled"
410+
411+
// Disabled represents a configuration mode that disables boot image skew enforcement.
412+
Disabled SkewEnforcementSelectorMode = "Disabled"
413+
)
414+
415+
```
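
To make the intended semantics of the two modes concrete, here is a minimal sketch of how the knob might feed into an upgrade decision. The `upgradeDecision` type, the `evaluateSkew` function, and the `skewTooLarge` input are hypothetical; how the skew itself is computed is out of scope here. The sketch also assumes that an unset value behaves like `Enabled`, which the proposal does not state.

```go
package main

import "fmt"

// SkewEnforcementSelectorMode mirrors the string enum from the proposal.
type SkewEnforcementSelectorMode string

const (
	Enabled  SkewEnforcementSelectorMode = "Enabled"
	Disabled SkewEnforcementSelectorMode = "Disabled"
)

// upgradeDecision is a hypothetical result type for this sketch.
type upgradeDecision struct {
	Upgradeable bool
	Message     string
}

// evaluateSkew decides whether upgrades should be permitted, given the
// enforcement mode and whether the boot image skew exceeds the limit.
func evaluateSkew(mode SkewEnforcementSelectorMode, skewTooLarge bool) upgradeDecision {
	if !skewTooLarge {
		return upgradeDecision{Upgradeable: true, Message: "boot image skew is within limits"}
	}
	if mode == Disabled {
		return upgradeDecision{
			Upgradeable: true,
			Message:     "boot image skew is too large but enforcement is disabled; scaling may be affected",
		}
	}
	// Enabled, or unset (assumed to behave like Enabled in this sketch).
	return upgradeDecision{
		Upgradeable: false,
		Message:     "boot image skew is too large; update boot images before upgrading",
	}
}

func main() {
	fmt.Println(evaluateSkew(Enabled, true).Upgradeable)
	fmt.Println(evaluateSkew(Disabled, true).Upgradeable)
	fmt.Println(evaluateSkew(Enabled, false).Upgradeable)
}
```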
416+
380417
#### Tracking boot image history
381418

382419
Note: This section is just an idea for the moment and is considered out of scope. This CR will require thorough API review in a follow-up enhancement.
@@ -502,7 +539,7 @@ flowchart-elk TD;
502539
MachineSetOwnerCheck -->|No| ConfigMapCheck[Has the coreos-bootimages ConfigMap been stamped by the MCO?] ;
503540
504541
ConfigMapCheck -->|Yes|ArchType[Determine arch type of MachineSet, for eg: x86_64, aarch64] ;
505-
ConfigMapCheck -->|No| Wait[A cluster upgrade is ongoing. Wait until atleast 1 control plane node has completed an update.];
542+
ConfigMapCheck -->|No| Wait[A cluster upgrade is ongoing. Wait until at least 1 control plane node has completed an update.];
506543
Wait --> |Upgrade Complete| ArchType
507544
Wait --> |Timeout| Error
508545
ArchType -->PlatformType[Determine platform type of MachineSet, for eg: gcp, aws, vsphere] ;
@@ -538,7 +575,7 @@ The UX element involved include the user opt-in and opt-out, which is currently
538575

539576
This will be done on a platform by platform basis. Some key benchmarks have to be met for a platform to be considered ready
540577
for default on:
541-
- Sufficent runtime(say, atleast 1 release) has been accumulated while boot image updates has been GAed for this platform.
578+
- Sufficient runtime (say, at least 1 release) has been accumulated since boot image updates went GA for this platform.
542579
- Periodic tests have been added for this platform in CI and have met certain passing metrics.
543580
- Any teams that are affected by default-on
544581
behavior have been notified and assisted with the transition.
@@ -552,8 +589,8 @@ flowchart-elk LR;
552589
UpdateConfig --> Stop((Stop));
553590
```
554591
Some points to note:
555-
- For bookkeeping purposes, the MCO will annotate the `MachineConfiguration` object when opting in the cluster.
556-
- If the cluster admin wishes to opt-out of the feature, they have to explicitly do so by removing the boot image configuration. Due to the presence of the "opted in" annotation, the MCO will not attempt to automatically opt-in the cluster for updates again.
592+
- For bookkeeping purposes, the MCO will annotate the `MachineConfiguration` object when opting in the cluster by default.
593+
- If the cluster admin wishes to opt out of the feature, they have to explicitly do so by removing the boot image configuration or explicitly opting out the cluster via the API knob. Due to the presence of the "default opted-in" annotation, the MCO will not attempt to opt in the cluster by default again.
557594
- This mechanism will be active on installs and upgrades.
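
The bookkeeping described in the points above could look something like the following sketch. Everything here is illustrative: the annotation key, the `machineConfiguration` type, and the `applyDefaultOptIn` function are hypothetical, and the sketch assumes that an explicit admin configuration is left untouched, which the points above imply but do not spell out.

```go
package main

import "fmt"

// defaultOptInAnnotation is a hypothetical annotation key used only in this sketch.
const defaultOptInAnnotation = "example.com/boot-images-default-opted-in"

// machineConfiguration is a simplified, hypothetical stand-in for the
// MachineConfiguration object.
type machineConfiguration struct {
	Annotations   map[string]string
	BootImageMode string // "" means no boot image configuration is present
}

// applyDefaultOptIn opts the cluster in by default at most once.
// It returns true only when it changed the object.
func applyDefaultOptIn(mc *machineConfiguration) bool {
	if _, done := mc.Annotations[defaultOptInAnnotation]; done {
		// Already opted in by default once; respect whatever the admin did since.
		return false
	}
	if mc.BootImageMode != "" {
		// Assumption: an explicit admin configuration is left untouched.
		return false
	}
	if mc.Annotations == nil {
		mc.Annotations = map[string]string{}
	}
	mc.Annotations[defaultOptInAnnotation] = "true"
	mc.BootImageMode = "All"
	return true
}

func main() {
	mc := &machineConfiguration{}
	fmt.Println(applyDefaultOptIn(mc)) // first pass opts the cluster in

	mc.BootImageMode = "" // the admin removes the boot image configuration
	fmt.Println(applyDefaultOptIn(mc)) // the annotation prevents a second opt-in
}
```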
558595

559596

@@ -577,7 +614,7 @@ skew policy described in the release payload.
577614

578615
The cluster admin may also choose to opt out of skew management via this configmap. Doing so indicates that they will not require scaling nodes, and thereby opts the cluster out of skew enforcement and scaling functionality.
579616

580-
A potential problem here is that the way boot images are stored in the machineset is lossy. In certain platforms, there is no way to recover the boot image metadata from the MachineSet. This is most likely to happen the first time the MCO attempts to do skew enforcement on a cluster that has never had boot image updates. In such cases, the MCO will default to the install time boot image, which can be recovered from the aleph version of the control plane nodes.
617+
A potential problem here is that the way boot images are stored in the machineset is lossy. In certain platforms, there is no way to recover the boot image metadata from the MachineSet. This is most likely to happen the first time the MCO attempts to do skew enforcement on a cluster that has never had boot image updates. In such cases, the MCO will default to the install time boot image, which can be recovered from the [aleph version](https://github.com/coreos/coreos-assembler/pull/768) of the control plane nodes.
581618

582619
This configmap can then be monitored to enforce skew limits. This could be done in a couple of ways:
583620
- **via the MCO**: If the skew is determined to be too large, the MCO can update its `ClusterOperator` object with an `Upgradeable=False` condition, along with remediation steps in the `Condition` message. This will signal to the CVO that the cluster is not suitable for an upgrade. The drawback of this approach is that the MCO is not able to signal *prior* to the start of a cluster upgrade, so if an incoming upgrade has a "stricter" skew policy, this could break scaling until the admin takes the remediation steps during the upgrade or after the upgrade is complete. This may present as strange UX to the user.
