[sled-agent] Integrate config-reconciler #8064

Open
wants to merge 10 commits into
base: main
from

jgallagher
Contributor

This PR integrates the new sled-agent-config-reconciler crate with sled-agent. It will not currently pass tests due to the reconciler not being completely implemented, but I'd like to get any feedback on this integration work itself (particularly as it pertains to the API of sled-agent-config-reconciler). See the description of #8063 for more context.

There are a couple of serious warts with this PR:

  • The inventory system has not been updated with all the details we need to report for the reconciler. This is a bigger chunk of work because it involves a database migration and touches various bits of Nexus, so I'll do that in a separate PR.
  • This integration removes most uses of the StorageManager (because its functionality is being absorbed into sled-agent-config-reconciler); however, the storage manager also has a rich set of test support. This PR leaves a couple sled-agent submodules using that test support (support-bundle/storage and zone-bundle). In the long run I think it'd be better to rework these (if there are no remaining production uses of StorageManager), but for now I think this is... okay? Feedback welcome.

jgallagher added a commit that referenced this pull request Apr 29, 2025
This is somewhat extracted from #8064, but can be landed independently
and will make some of the followup sled-agent-config-reconciler PRs a
little cleaner.

Fixes #7774.
@@ -34,14 +34,6 @@ enum SledAgentCommands {
#[clap(subcommand)]
Zones(ZoneCommands),

/// print information about zpools
Collaborator

Are you expecting that inventory will supplant this info? Or are you planning on replacing this access to the sled agent later?

Contributor Author

I was expecting that inventory would supplant this. (I think maybe it already has, in practice? I definitely only look at inventory when I'm curious about zpools; I don't think I've ever used these omdb subcommands.)

jgallagher added a commit that referenced this pull request Apr 30, 2025
This is somewhat extracted from #8064, but can be landed independently
and will make some of the followup sled-agent-config-reconciler PRs a
little cleaner. We don't yet ledger `OmicronSledConfig`s to disk, so
we're free to fiddle with the details of its fields without worrying
about backwards compatibility.

Fixes #7774.
@jgallagher jgallagher force-pushed the john/sled-agent-config-reconciler-2 branch from abd7542 to 2574c5c Compare April 30, 2025 19:17
Base automatically changed from john/sled-agent-config-reconciler-1 to main May 1, 2025 12:34
@jgallagher jgallagher force-pushed the john/sled-agent-config-reconciler-2 branch from 2574c5c to a057195 Compare May 2, 2025 14:59
@jgallagher jgallagher force-pushed the john/sled-agent-config-reconciler-2 branch from a057195 to 0faddda Compare May 21, 2025 20:38
jgallagher added a commit that referenced this pull request May 22, 2025
…ig reconciler (#8188)

The primary change here is replacing these inventory fields (a subset of
`OmicronSledConfig`):

```rust
    pub omicron_zones: OmicronZonesConfig,
    pub omicron_physical_disks_generation: Generation,
```

with these:

```rust
    pub ledgered_sled_config: Option<OmicronSledConfig>,
    pub reconciler_status: ConfigReconcilerInventoryStatus,
    pub last_reconciliation: Option<ConfigReconcilerInventory>,
```

Once #8064 lands, all three of these will be filled in meaningfully; as
of this PR, only `ledgered_sled_config` is populated.
(`reconciler_status` is always `NotYetRun` and `last_reconciliation` is
always `None`, since there is no reconciler yet.) The rest of the
changes are all fallout from changing inventory:

* Update `omdb` printing
* Update sled-agent to report the new inventory fields
* Update consumers of inventory (tests, reconfigurator planner, one
Nexus RPW) - these all just look at `ledgered_sled_config` for now, but
will need to be updated on #8064 once other fields are populated
* Update database schema, model, and queries (the bulk of the diff).
This requires dropping all preexisting collections, since there's no way
to migrate from just `omicron_zones` to a full `OmicronSledConfig`. The
first few schema migrations take care of this.
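
As a rough illustration of the inventory shape described above, here is a minimal sketch. Only the three field names and their outer types come from this commit message. The struct name `SledInventorySketch`, the `generation` field, the helper functions, and the contents of every type are hypothetical stand-ins, not the real omicron definitions.

```rust
// Hypothetical stand-ins for the real omicron types. The bodies of these
// types are assumptions made only so the sketch compiles on its own.
#[derive(Debug, Clone, PartialEq)]
pub struct OmicronSledConfig {
    pub generation: u64, // assumed field, for illustration only
}

#[derive(Debug, Clone, PartialEq)]
pub enum ConfigReconcilerInventoryStatus {
    NotYetRun,
    // The real enum presumably has further variants; they are omitted here.
}

#[derive(Debug, Clone, PartialEq)]
pub struct ConfigReconcilerInventory; // real contents unknown, left empty

// The three new inventory fields, as named in the commit message.
#[derive(Debug, Clone)]
pub struct SledInventorySketch {
    pub ledgered_sled_config: Option<OmicronSledConfig>,
    pub reconciler_status: ConfigReconcilerInventoryStatus,
    pub last_reconciliation: Option<ConfigReconcilerInventory>,
}

/// Builds the report a sled-agent without a reconciler could produce:
/// only the ledgered config carries information.
pub fn report_without_reconciler(
    ledgered: Option<OmicronSledConfig>,
) -> SledInventorySketch {
    SledInventorySketch {
        ledgered_sled_config: ledgered,
        reconciler_status: ConfigReconcilerInventoryStatus::NotYetRun,
        last_reconciliation: None,
    }
}

/// A consumer that, for now, looks only at `ledgered_sled_config`.
pub fn ledgered_generation(inv: &SledInventorySketch) -> Option<u64> {
    inv.ledgered_sled_config.as_ref().map(|c| c.generation)
}
```

Under these assumptions, a consumer written against `ledgered_sled_config` alone keeps working until the other two fields are populated, at which point it would need to be revisited.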

Before merging I'll go through an upgrade on a racklette and confirm
things come back up okay after the schema migration blows away all the
pre-update inventory collections. (We think this is fine, but it'd be
good to confirm.) But I think this is close enough that it's reviewable.

Couple other minor changes that came along for the ride:

* Closes #6770 (`inv_sled_omicron_zones` is gone now)
* Fixes #8084 (added `image_source` columns to the inventory zone config
table, so we don't lose `ImageSource::Artifact { hash }` values reported
by sled-agent)
@jgallagher jgallagher force-pushed the john/sled-agent-config-reconciler-2 branch from 0faddda to 8ff4ae3 Compare May 22, 2025 15:30
@jgallagher jgallagher marked this pull request as ready for review May 22, 2025 21:24
@jgallagher jgallagher requested a review from sunshowers May 22, 2025 21:24
3 participants