`enhancements/machine-config/manage-boot-images.md`
Additionally, the stub Ignition config [referenced](https://github.com/openshift/installer/blob/1ca0848f0f8b2ca9758493afa26bf43ebcd70410/pkg/asset/machines/gcp/machines.go#L197) in the `MachineSet` is also not managed. This stub is used by the ignition binary on first boot to authenticate with and consume content from the `machine-config-server` (MCS). The content served includes the actual Ignition configuration and the target OCI format RHCOS image. The ignition binary performs first boot provisioning based on this, then hands off to the `machine-config-daemon` (MCD) firstboot service to reboot into the target OCI format RHCOS image.
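For orientation, a stub (pointer) Ignition config of this kind typically looks like the following. This is an illustrative sketch: the hostname is a placeholder, the CA payload is abridged, and the exact spec version varies by release; only the MCS port (22623) and the overall merge/TLS shape are standard.

```json
{
  "ignition": {
    "version": "3.4.0",
    "config": {
      "merge": [
        {
          "source": "https://api-int.example-cluster.example.com:22623/config/worker"
        }
      ]
    },
    "security": {
      "tls": {
        "certificateAuthorities": [
          {
            "source": "data:text/plain;charset=utf-8;base64,..."
          }
        ]
      }
    }
  }
}
```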
There has been [a previous effort](https://github.com/openshift/machine-config-operator/pull/1792) to manage the stub Ignition config. It was [reverted](https://github.com/openshift/machine-config-operator/pull/2126) and then [brought back](https://github.com/openshift/machine-config-operator/pull/2827#issuecomment-996156872) just for bare metal clusters. For other platforms, the `*-managed` stubs still get generated by the MCO, but are not injected into the `MachineSet`. The proposal plans to utilize these unused `*-managed` stubs, but it is important to note that this stub is generated (and synced) by the MCO and will ignore/override any user customizations to the original stub Ignition config. This limitation will be mentioned in the documentation, and a later release will provide support for user customization of the stub, either via API or a workaround through additional documentation. This should not be an issue for the majority of users, as they very rarely customize the original stub Ignition config.
In certain long-lived clusters, the MCS TLS cert contained within the above Ignition configuration may be out of date. Example issue [here](https://issues.redhat.com/browse/OCPBUGS-1817). While this has been partly addressed by [MCO-642](https://issues.redhat.com/browse/MCO-642) (which allows the user to manually rotate the cert), it would be very beneficial for the MCO to actively manage this TLS cert and take this concern away from the user.
__Overview__
- The `ManagedBootImages` feature gate is active.
- The cluster and/or the `MachineSet` is opted in to boot image updates. This is done at the operator level, via the `MachineConfiguration` API object.
- The `MachineSet` does not have a valid owner reference. Having a valid owner reference typically indicates that the `MachineSet` is managed by another workflow, and updates to it are likely to cause thrashing.
- The golden configmap is verified to be in sync with the current version of the MCO. The MCO will update ("stamp") the golden configmap with the version of the new MCO image after at least one master node has successfully completed an update to the new OCP image. This helps prevent `MachineSets` from being updated too soon at the end of a cluster upgrade, before the MCO itself has updated and has had a chance to roll out the new OCP image to the cluster.
If any of the above checks fail, the MachineSet Boot Image Controller (MSBIC) will exit the sync.
- Based on platform and architecture type, the MSBIC will check whether the boot images referenced in the `providerSpec` field of the `MachineSet` are the same as the ones in the ConfigMap. Each platform (GCP, AWS, and so on) does this differently, so this part of the implementation will have to be special-cased. The ConfigMap is considered to be the golden set of boot image values, i.e. they will never go out of date. If it is not a match, the `providerSpec` field is cloned and updated with the new boot image reference.
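As a rough illustration of this per-platform special-casing, the sketch below compares a GCP `providerSpec` against a golden boot image value and produces a cloned, updated spec when they differ. This is not the MCO's actual code: the `providerSpec` schema is abridged to the single disk image field, and all names here are illustrative.

```go
// Sketch of the per-platform boot image comparison described above.
// NOT the actual MCO implementation: the providerSpec schema is abridged
// to the single GCP disk image field we care about.
package main

import (
	"encoding/json"
	"fmt"
)

// gcpProviderSpec models only the disk image field of a GCP providerSpec.
type gcpProviderSpec struct {
	Disks []struct {
		Image string `json:"image"`
	} `json:"disks"`
}

// bootImagePatch reports whether the MachineSet's boot image differs from
// the golden ConfigMap value and, if so, returns a cloned providerSpec
// updated with the new reference.
func bootImagePatch(rawProviderSpec []byte, goldenImage string) (bool, []byte, error) {
	var spec gcpProviderSpec
	if err := json.Unmarshal(rawProviderSpec, &spec); err != nil {
		return false, nil, err
	}
	if len(spec.Disks) == 0 || spec.Disks[0].Image == goldenImage {
		return false, nil, nil // already up to date (or nothing to compare)
	}
	spec.Disks[0].Image = goldenImage
	patched, err := json.Marshal(spec)
	return true, patched, err
}

func main() {
	raw := []byte(`{"disks":[{"image":"projects/example/global/images/rhcos-old"}]}`)
	changed, patched, err := bootImagePatch(raw, "projects/example/global/images/rhcos-new")
	fmt.Println(changed, string(patched), err)
}
```

Other platforms would need their own abridged structs (e.g. an AMI ID field for AWS), which is why this logic cannot be written once generically.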
This work will be tracked in [MCO-793](https://issues.redhat.com/browse/MCO-793).
##### Projected timeline
This is a tentative timeline, subject to change (GA = General Availability, TP = Tech Preview, DEF = Default-on).
Once default-on behavior has been deemed stable across the above-mentioned platforms, skew enforcement will be implemented in a platform-agnostic manner. Decoupling the default-on and skew enforcement mechanisms will help iron out any edge cases unique to a platform and will also aid in refining the skew enforcement workflow. The tentative timeline for skew enforcement is 4.25.
**Note**: For non-managed cases, * indicates enforcement of user-initiated bootimage updates. See enforcement section below.
##### Cluster API backed machinesets
As the Cluster API move is impending (initial release in 4.16 and default-on release in 4.17), it is necessary that this enhancement plan for the changes required in a CAPI-backed cluster. Here are a couple of sample YAMLs used in CAPI-backed `MachineSets`, from the [official OpenShift documentation](https://docs.openshift.com/container-platform/4.14/machine_management/capi-machine-management.html#capi-sample-yaml-files-gcp).
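For orientation, an abridged `GCPMachineTemplate` fragment is sketched below. The field values (names, image path, sizes) are illustrative, not taken from the linked samples; the point is that in CAPI the boot image lives under `spec.template.spec.image` of the machine template rather than in a `providerSpec` blob.

```yaml
# Illustrative, abridged CAPI machine template for GCP; see the linked
# OpenShift documentation for the authoritative samples.
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: GCPMachineTemplate
metadata:
  name: example-gcp-machine-template
  namespace: openshift-cluster-api
spec:
  template:
    spec:
      image: projects/rhcos-cloud/global/images/rhcos-example
      instanceType: n1-standard-4
      rootDeviceSize: 128
```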
#### Opt-in Mechanism
This proposal introduces a new field in the MCO operator API, `ManagedBootImages`, which encloses an array of `MachineManager` objects. A `MachineManager` object contains the resource type of the machine management object being opted in, the API group of that object, and a union object of type `MachineManagerSelector`, which contains:
- The union discriminator, `Mode`, can be set to three values: `All`, `Partial`, and `None`.
- **All**: All machine resources described by this resource/apiGroup type will be opted in for boot image updates. In most cases, this effectively enables boot image updates for the whole cluster, unless there are multiple kinds of machine resources present in the cluster.
- **Partial**: This is a set of label selectors that users can use to opt in a custom selection of machine resources. When `Mode` is set to `Partial`, all `MachineSets` matched by this object are considered enrolled for updates. In the first iteration of this API, this object will only allow label matching with machine resources. In the future, additional ways of filtering may be added with another label selector, e.g. namespace.
- **None**: All machine resources described by this resource/apiGroup type will be excluded from boot image updates. In most cases, this effectively disables boot image updates for the whole cluster, unless there are multiple kinds of machine resources present in the cluster.
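To make the three modes concrete, a `Partial` opt-in might look like the following. This is a sketch against the API shape described above; the exact field names and casing should be taken from the authoritative API types, and the label key here is a made-up example.

```yaml
# Illustrative Partial opt-in: only MachineSets carrying the (hypothetical)
# label update-boot-images=true are enrolled for boot image updates.
apiVersion: operator.openshift.io/v1
kind: MachineConfiguration
metadata:
  name: cluster
spec:
  managedBootImages:
    machineManagers:
    - resource: machinesets
      apiGroup: machine.openshift.io
      selection:
        mode: Partial
        partial:
          machineResourceSelector:
            matchLabels:
              update-boot-images: "true"
```

Setting `mode: All` (dropping the `partial` stanza) would enroll every `MachineSet` of that resource/apiGroup type instead.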
```go
type MachineManagerSelector struct {
	// Valid values are All, Partial and None.
	// All means that every resource matched by the machine manager will be updated.
	// Partial requires specified selector(s) and allows customisation of which resources matched by the machine manager will be updated.
	// None means that every resource matched by the machine manager will NOT be updated.
	// +unionDiscriminator
	// +kubebuilder:validation:Required
	Mode MachineManagerSelectorMode `json:"mode"`
	// ...
}

type PartialSelector struct {
	// ...
}

// MachineManagerSelectorMode is a string enum used in the MachineManagerSelector union discriminator.

	// Enabled represents a configuration mode that enables boot image skew enforcement.
	Enabled SkewEnforcementSelectorMode = "Enabled"

	// Disabled represents a configuration mode that disables boot image skew enforcement.
	Disabled SkewEnforcementSelectorMode = "Disabled"
)
```
#### Tracking boot image history
Note: This section is just an idea for the moment and is considered out of scope. This CR will require thorough API review in a follow-up enhancement.
```
flowchart-elk TD;
    MachineSetOwnerCheck -->|No| ConfigMapCheck[Has the coreos-bootimages ConfigMap been stamped by the MCO?] ;
    ConfigMapCheck -->|Yes| ArchType[Determine arch type of MachineSet, for eg: x86_64, aarch64] ;
    ConfigMapCheck -->|No| Wait[A cluster upgrade is ongoing. Wait until at least 1 control plane node has completed an update.];
    Wait --> |Upgrade Complete| ArchType
    Wait --> |Timeout| Error
    ArchType --> PlatformType[Determine platform type of MachineSet, for eg: gcp, aws, vsphere] ;
```
This will be done on a platform-by-platform basis. Some key benchmarks have to be met for a platform to be considered ready for default-on:
- Sufficient runtime (say, at least one release) has been accumulated while boot image updates have been GA for this platform.
- Periodic tests have been added for this platform in CI and have met certain passing metrics.
- Any teams that are affected by default-on behavior have been notified and assisted with the transition.
```
flowchart-elk LR;
    UpdateConfig --> Stop((Stop));
```
Some points to note:
- For bookkeeping purposes, the MCO will annotate the `MachineConfiguration` object when opting in the cluster by default.
- If the cluster admin wishes to opt out of the feature, they have to explicitly do so, either by removing the boot image configuration or by explicitly opting out the cluster via the API knob. Due to the presence of the "default opted-in" annotation, the MCO will not attempt to opt in the cluster by default again.
- This mechanism will be active on installs and upgrades.
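For illustration, the explicit opt-out via the API knob could look like the following. This is a sketch against the API shape described earlier; exact field names and casing should be taken from the authoritative API types.

```yaml
# Illustrative explicit opt-out: mode None excludes all MachineSets of this
# resource/apiGroup type from boot image updates.
apiVersion: operator.openshift.io/v1
kind: MachineConfiguration
metadata:
  name: cluster
spec:
  managedBootImages:
    machineManagers:
    - resource: machinesets
      apiGroup: machine.openshift.io
      selection:
        mode: None
```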
The cluster admin may also choose to opt out of skew management via this configmap, which indicates that they will not require scaling nodes, thereby opting out of skew enforcement and scaling functionality.
A potential problem here is that the way boot images are stored in the `MachineSet` is lossy. On certain platforms, there is no way to recover the boot image metadata from the `MachineSet`. This is most likely to happen the first time the MCO attempts to do skew enforcement on a cluster that has never had boot image updates. In such cases, the MCO will default to the install-time boot image, which can be recovered from the [aleph version](https://github.com/coreos/coreos-assembler/pull/768) of the control plane nodes.
This configmap can then be monitored to enforce skew limits. This could be done in a couple of ways:
- **Via the MCO**: If the skew is determined to be too large, the MCO can update its `ClusterOperator` object with an `Upgradeable=False` condition, along with remediation steps in the `Condition` message. This will signal to the CVO that the cluster is not suitable for an upgrade. The drawback of this approach is that the MCO is not able to signal *prior* to the start of a cluster upgrade, so if an incoming upgrade has a "stricter" skew policy, this could break scaling until the admin takes the remediation steps during or after the upgrade. This may present as strange UX to the user.
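As a sketch of what that signal might look like, the MCO could set a standard `ClusterOperator` status condition along these lines. The condition type and `status` field are the standard Kubernetes/OpenShift shape; the `reason` and `message` wording here are hypothetical.

```yaml
# Hypothetical ClusterOperator status condition; reason/message are illustrative.
status:
  conditions:
  - type: Upgradeable
    status: "False"
    reason: BootImageSkewExceeded
    message: >-
      The boot images referenced by one or more MachineSets are older than the
      skew permitted by the incoming release. Update boot images, or opt in to
      automatic boot image updates, before upgrading.
    lastTransitionTime: "2025-01-01T00:00:00Z"
```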