Add cleanup task for failed creates #404
Signed-off-by: Jamie Hannaford <jamie@limetree.org>
s.stop()
// unmount and delete mount dirs
bindUnmountAllRootfs(kataHostSharedDir, s)
os.RemoveAll(filepath.Join(kataHostSharedDir, s.id))
kataHostSharedDir is specific to the kata_agent.go implementation, and should never be used by the generic sandbox implementation. Also, part of this may already have been solved by #292.
@jamiehannaford I agree with @jshachm that you cannot use Kata-specific paths from sandbox.go; this is specific to the agent implementation.
Speaking of #292, I have just merged it!
// Cleanup will clean up any dangling mounts created during a startup
// procedure, as well as terminate the VM. This is used when a sandbox
// failed during create.
func (s *Sandbox) Cleanup() {
The function does not need to be exported.
Thanks @jamiehannaford for the PR, I have some comments.
defer func() {
	if err != nil {
		s.Logger().WithError(err).WithField("sandboxid", s.id).Error("Creating sandbox failed")
		s.Cleanup()
I am not sure we can simply apply one big cleanup function, because the failure could have a number of different causes, and using a hammer to clean up might end up with some conflicts.
To be more concrete: say this function fails when creating the network. We don't want to call into s.stop(), because that will try to communicate with the agent to stop the sandbox inside the VM, but the VM does not even exist at this point. That's why I think something with finer granularity would be more appropriate.
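To illustrate the granularity concern, here is a minimal sketch of a cleanup that only undoes what was actually set up. It is not code from this PR or from the runtime: the `sandbox` type, its `networkCreated` and `vmStarted` flags, the `actions` log, and the `failAt` parameter are all hypothetical names invented for the example.

```go
package main

import "fmt"

// sandbox is a hypothetical stand-in for the real Sandbox type. The two
// flags record which resources exist so that cleanup can skip the rest.
type sandbox struct {
	networkCreated bool
	vmStarted      bool
	actions        []string // cleanup actions taken, recorded for illustration
}

// cleanup undoes only the steps that actually completed.
func (s *sandbox) cleanup() {
	if s.vmStarted {
		// Only talk to the agent when there is a VM to answer.
		s.actions = append(s.actions, "stop VM")
		s.vmStarted = false
	}
	if s.networkCreated {
		s.actions = append(s.actions, "remove network")
		s.networkCreated = false
	}
}

// create simulates the create path. failAt names the step that should
// fail ("network" or "vm"); any other value lets every step succeed.
func (s *sandbox) create(failAt string) (err error) {
	// The named return value lets the deferred function see the final error.
	defer func() {
		if err != nil {
			s.cleanup()
		}
	}()

	if failAt == "network" {
		return fmt.Errorf("creating %s failed", failAt)
	}
	s.networkCreated = true

	if failAt == "vm" {
		return fmt.Errorf("starting %s failed", failAt)
	}
	s.vmStarted = true
	return nil
}

func main() {
	s := &sandbox{}
	err := s.create("vm")
	fmt.Println(err, s.actions)
}
```

With a failure at the network step, no cleanup action runs at all; with a failure at the VM step, only the network is removed and the VM is never asked to stop.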
I was just reading through this area of code (trying to figure out where my errant QEMU went...), and saw the teardown code. It occurred to me that we probably need some sort of 'teardown stack', and if we fail at some point we then call all the relevant teardown funcs in the reverse order they were added to the stack (so, yes, a real stack). I dunno if there is a nice idiom for doing this in golang?
@grahamwhaley the way to do that in Golang is by using
:-) I guess that works well if we have a single call chain for the whole setup. And we could use the
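The 'teardown stack' idea from this thread could be sketched roughly as follows. This is only an illustration under assumptions, not the approach the project adopted: `teardownStack`, `push`, `unwind`, `createSandbox`, and the step names are hypothetical, and the teardown functions just append to a log instead of releasing real resources.

```go
package main

import "fmt"

// teardownStack records undo functions in the order setup steps succeed.
type teardownStack struct {
	funcs []func()
}

// push registers an undo function for a step that has just succeeded.
func (t *teardownStack) push(f func()) {
	t.funcs = append(t.funcs, f)
}

// unwind runs the registered functions in reverse order, then empties the stack.
func (t *teardownStack) unwind() {
	for i := len(t.funcs) - 1; i >= 0; i-- {
		t.funcs[i]()
	}
	t.funcs = nil
}

// createSandbox stands in for the real create path. failAt names the
// hypothetical step that fails; log collects the teardown calls made.
func createSandbox(failAt string, log *[]string) (err error) {
	var ts teardownStack
	// Unwind only when the function is returning an error.
	defer func() {
		if err != nil {
			ts.unwind()
		}
	}()

	steps := []string{"storage", "network", "vm"}
	for _, step := range steps {
		if step == failAt {
			return fmt.Errorf("creating %s failed", step)
		}
		name := step // capture the current step for the closure
		ts.push(func() { *log = append(*log, "teardown "+name) })
	}
	return nil
}

func main() {
	var log []string
	err := createSandbox("vm", &log)
	fmt.Println(err, log)
}
```

When the "vm" step fails, only the network and storage teardowns run, in that order, because a step's undo function is registered only after the step has succeeded.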
@jamiehannaford - I appreciate the contribution here - more graceful failing is an important area for improvement imo. I'm trying to scrub our backlog and reduce the number of open PRs in the project and see that this PR has gone quiet and is close to getting stale. I see #351 similarly handles the error case. That's close to merge now -- can we re-evaluate the behavior once this merges? What do you think?
@jamiehannaford - #351 has now landed so would be good to get some input to @egernst's query above.
Based on the feedback and other PR, I think we can close this. Thanks!
Thanks @jamiehannaford!
Addresses a comment left by @sboeuf in #396 (comment).
Not sure whether this is what you had in mind, but I've verified locally that dirs are now being cleaned up. I think a big reason kata-runtime list was hanging was the large number of junk folders left around in /run/vc/sbs.