-
Notifications
You must be signed in to change notification settings - Fork 9.7k
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Extra feedback for a variety of "move" consequences and edge-cases #29637
Conversation
Our previous rule for implicitly moving from IntKey(0) to NoKey would apply that move even when the current resource configuration uses for_each, because we were only considering whether "count" were set. Previously this was relatively harmless because the resource instance in question would end up planned for deletion anyway: neither an IntKey nor a NoKey are valid keys for for_each. Now that we're going to be announcing these moves explicitly in the UI, it would be confusing to see Terraform report that IntKey moved to NoKey in a situation where the config changed from count to for_each, so to address that we'll only generate the implied statement if neither repetition argument is set.
There are a few different reasons why a resource instance tracked in the prior state might be considered an "orphan", but previously we reported them all identically in the planned changes. In order to help users understand the reason for a surprising planned delete, we'll now try to specify an additional reason for the planned deletion, covering all of the main reasons why that could happen. This commit only introduces the new detail to the plans.Changes result, though it also incidentally exposes it as part of the JSON plan result in order to keep that working without returning errors in these new cases. We'll expose this information in the human-oriented UI output in a subsequent commit.
The core runtime is now able to specify a reason for some situations when Terraform plans to delete a resource instance. This commit makes that information visible in the human-oriented UI. A previous commit already made the underlying data informing these new hints visible as part of the machine-oriented (JSON) plan output. This also removes the bold formatting from the existing "has moved to" hints, because subjectively it seemed like the result was emphasizing too many parts of the output and thus somewhat defeating the benefit of the emphasis in trying to create additional visual hierarchy for sighted users running Terraform in a terminal. Now only the first line containing the main action statement will be in bold, and all of the parenthesized follow-up notes will be unformatted.
In most cases Terraform will be able to automatically fully resolve all of the pending move statements before creating a plan, but there are some edge cases where we can end up wanting to move one object to a location where another object is already declared. One relatively-obvious example is if someone uses "terraform state mv" in order to create a set of resource instance bindings that could never have arising in normal Terraform use. A less obvious example arises from the interactions between moves at different levels of granularity. If we are both moving a module to a new address and moving a resource into an instance of the new module at the same time, the old module might well have already had a resource of the same name and so the resource move will be unresolvable. In these situations Terraform will move the objects as far as possible, but because it's never valid for a move "from" address to still be declared in the configuration Terraform will inevitably always plan to destroy the objects that didn't find a final home. To give some additional explanation for that result, here we'll add a warning which describes what happened. This is not a particularly actionable warning because we don't really have enough information to guess what the user intended, but we do at least prompt that they might be able to use the "terraform state" family of subcommands to repair the ambiguous situation before planning, if they want a different result than what Terraform proposed.
Because our validation rules depend on some dynamic results produced by actually running the plan, we deal with moves in a "backwards" order where we try to apply them first -- ignoring anything strange we might find -- and then validate the original statements only after planning. An unfortunate consequence of that approach is that when the move statements are invalid it's likely that move execution will not fully complete, and so the generated plan is likely to be incorrect and might well include errors resulting from the unresolved moves. To mitigate that, here we let any move validation errors supersede all other diagnostics that the plan phase might've generated, in the hope that it'll help the user focus on fixing the incorrect move statements without creating confusing by reporting errors that only appeared as a quick of how Terraform worked around the invalid move statements earlier.
3c6c5c1
to
f371f46
Compare
// arguments. This difference in detail is out of pragmatism, because | ||
// potentially multiple nested modules could all contribute conflicting | ||
// specific reasons for a particular instance to no longer be declared. | ||
ResourceInstanceDeleteBecauseNoModule ResourceInstanceChangeActionReason = 'M' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When adding new reasons here, I think we should also update the streaming JSON UI output to match:
terraform/internal/command/views/json/change.go
Lines 70 to 94 in c8cd0b1
const ( | |
ReasonNone ChangeReason = "" | |
ReasonTainted ChangeReason = "tainted" | |
ReasonRequested ChangeReason = "requested" | |
ReasonCannotUpdate ChangeReason = "cannot_update" | |
ReasonUnknown ChangeReason = "unknown" | |
) | |
func changeReason(reason plans.ResourceInstanceChangeActionReason) ChangeReason { | |
switch reason { | |
case plans.ResourceInstanceChangeNoReason: | |
return ReasonNone | |
case plans.ResourceInstanceReplaceBecauseTainted: | |
return ReasonTainted | |
case plans.ResourceInstanceReplaceByRequest: | |
return ReasonRequested | |
case plans.ResourceInstanceReplaceBecauseCannotUpdate: | |
return ReasonCannotUpdate | |
default: | |
// This should never happen, but there's no good way to guarantee | |
// exhaustive handling of the enum, so a generic fall back is better | |
// than a misleading result or a panic | |
return ReasonUnknown | |
} | |
} |
I'll submit a PR for these changes shortly. Can you think of any automated way to keep these two enums in sync for future changes? Maybe adding nishanths/exhaustive
as a CI check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active contributions. |
Although in most cases Terraform's handling of the new
moved
blocks should be straightforward and intuitive, there are inevitably some less common cases where the behavior may be surprising enough to warrant further explanation.This set of changes aims to add a first round of explanations of that sort, focused on the following situations:
Colliding moves, where two objects try to move to the same address in spite of our basic static validation rules.
This can typically apply only when someone's previously used
terraform state mv
orterraform apply -target=...
to create a situation that wouldn't be reachable through the standard workflow.However, there is at least one situation where it's reachable even in the standard workflow: if we need to resolve both
module.foo
moves tomodule.bar
andaws_instance.a
moves tomodule.bar.aws_instance.a
then both of those moves would be considered structurally valid, but if our prior state already had an object atmodule.foo.aws_instance.a
then the twoaws_instance.a
objects would conflict with one another to try to occupy the addressmodule.bar.aws_instance.a
.This is not a situation we expect folks to reach regularly, and so one of the changes here adds an additional warning message to the plan result in that unusual case, which is intended to help explain why Terraform is planning to destroy an object that the user might have expected to be moved to a new address which is still declared in configuration. We don't have enough information to derive a good sense of user intent in that situation, and so the warning just makes a general observation that you can use
terraform state
subcommands to make things less ambiguous if you're not satisfied with the resolution Terraform proposed.Moving to an address that isn't declared in the configuration.
The main expected use of
moved
blocks is to set theto
argument to refer to an address which has a corresponding declaration in the configuration. In that case, the happy path is for Terraform to simply pretend that an existing object had originally been created at the new address and not propose any other changes at all.However, because
moved
blocks are a historical record that will (in most cases) remain in modules indefinitely, we do need to allow for a later version of the module possibly removing the declaration of that object. Our rules for moves are that we should always generate the same effect for two actions handled together in the same plan as we would for those actions separated into two runs, and so therefore we must tolerate moving into an undeclared address and plan to delete any object that moved into that undeclared address.An unfortunate consequence of this rule is that if an author makes a mistake when writing the
to
address then Terraform will typically propose to destroy the object. As a compromise then, the changes here add some additional context to a planned destroy to help draw attention to the mismatch between configuration and state.Although the motivation to do this now was to give better feedback about incorrect
moved
statements, adding this additional feedback for planned destroy has been on my wishlist for a while now, and so I implemented it in a generic way which works even in the normal case where there's no moving going on. I hope that this will be useful to help users disambiguate the several different reasons that can cause Terraform to declare a particular resource instance as an "orphan" (no longer declared in configuration) and thus figure out what they need to change in the configuration if they don't want to delete the object.While I was in the area anyway, I also made some minor adjustments to how we present the extra note about a resource instance having moved, which we'd planned to do in an earlier PR but accidentally overlooked it.
The following example output shows a rather gnarly combination of these situations all together, just for illustrative purposes. In practice I would expect it to be extremely rare to see all of these appear together in the same proposed plan. (I created the conflict between
null_resource.a[0]
andnull_resource.a
through some malicious use ofterraform state mv
prior to requesting this plan.)Making the implicit zero-to-none and none-to-zero key moves more visible in the UI here drew attention to a quirk of my earlier work in #29605, in the situation of moving from
for_each
tocount
rather thanfor_each
to no repetition. I also fixed that while here, because otherwise the reason given for proposing to delete the old object didn't actually make sense.