Skip to content
This repository has been archived by the owner on Dec 7, 2023. It is now read-only.

SIGSEGV with metadata from older version of ignite #854

Merged
merged 1 commit into from
Jul 12, 2021

Conversation

juozasg
Copy link
Contributor

@juozasg juozasg commented Jul 10, 2021

Signed-off-by: Juozas juozasgaigalas@gmail.com

Running ignite ps on a system with firecracker VMs created by ignite 0.7 causes a segfault because the /var/lib/firecracker/vm metadata does not have the expected vm.Status.Runtime.Name field. Instead ignore VMs missing this field.

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x1083c5b]

goroutine 1 [running]:
github.com/weaveworks/ignite/cmd/ignite/run.fetchLatestStatus(0xc00007d860, 0x2, 0x2, 0x1, 0x2, 0xc00007d860, 0x1)
	/go/src/github.com/weaveworks/ignite/cmd/ignite/run/ps.go:201 +0x11b
github.com/weaveworks/ignite/cmd/ignite/run.Ps(0xc00003af40, 0x0, 0x0)
	/go/src/github.com/weaveworks/ignite/cmd/ignite/run/ps.go:81 +0x2bc
github.com/weaveworks/ignite/cmd/ignite/cmd/vmcmd.NewCmdPs.func1.1(0xc0001964e0, 0x131db72, 0x2)
	/go/src/github.com/weaveworks/ignite/cmd/ignite/cmd/vmcmd/ps.go:60 +0x5f
github.com/weaveworks/ignite/cmd/ignite/cmd/vmcmd.NewCmdPs.func1(0xc000453340, 0x1c436e0, 0x0, 0x0)
	/go/src/github.com/weaveworks/ignite/cmd/ignite/cmd/vmcmd/ps.go:61 +0x7a
github.com/spf13/cobra.(*Command).execute(0xc000453340, 0x1c436e0, 0x0, 0x0, 0xc000453340, 0x1c436e0)
	/go/src/github.com/weaveworks/ignite/vendor/github.com/spf13/cobra/command.go:846 +0x2c2
github.com/spf13/cobra.(*Command).ExecuteC(0xc00051b8c0, 0xc000182000, 0x14892e0, 0xc00000e010)
	/go/src/github.com/weaveworks/ignite/vendor/github.com/spf13/cobra/command.go:950 +0x375
github.com/spf13/cobra.(*Command).Execute(...)
	/go/src/github.com/weaveworks/ignite/vendor/github.com/spf13/cobra/command.go:887
main.Run(0x112f160, 0xc000040178)
	/go/src/github.com/weaveworks/ignite/cmd/ignite/ignite.go:24 +0xab
main.main()
	/go/src/github.com/weaveworks/ignite/cmd/ignite/ignite.go:13 +0x25

Signed-off-by: Juozas <juozasgaigalas@gmail.com>
@kingdonb
Copy link

Thanks for uncovering this issue!

I don't have a lot of context here, but I think there must be a better way to handle the default case than to just move on to the next one. Probably should emit some kind of warning since there could be no other way for the user to discover these are stale resources left behind.

@kingdonb kingdonb changed the title SIGSEV with metadata from older version of ignite SIGSEGV with metadata from older version of ignite Jul 10, 2021
@darkowlzz
Copy link
Contributor

Hi @juozasg , thanks for reporting this issue. I was able to reproduce this issue. Good catch!
At first, I thought that this is a bug in the API conversion. We have API conversion logic to handle reading old API objects and converting them to the latest API object version, but in this case, we missed adding a default value for status.Runtime.Name in the conversion from v1alpha2 to the hub version (latest). So I solved the issue with the following patch

diff --git a/pkg/apis/ignite/v1alpha2/conversion.go b/pkg/apis/ignite/v1alpha2/conversion.go
index ffea1cc6..bde5004b 100644
--- a/pkg/apis/ignite/v1alpha2/conversion.go
+++ b/pkg/apis/ignite/v1alpha2/conversion.go
@@ -1,8 +1,10 @@
 package v1alpha2

 import (
-       "github.com/weaveworks/ignite/pkg/apis/ignite"
        "k8s.io/apimachinery/pkg/conversion"
+
+       "github.com/weaveworks/ignite/pkg/apis/ignite"
+       "github.com/weaveworks/ignite/pkg/runtime"
 )

 // Convert_ignite_Runtime_To_v1alpha2_Runtime calls the autogenerated conversion function along with custom conversion logic
@@ -20,6 +22,10 @@ func Convert_v1alpha2_VMStatus_To_ignite_VMStatus(in *VMStatus, out *ignite.VMSt
                out.Network = &ignite.Network{}
        }

+       if out.Runtime.Name == "" {
+               out.Runtime.Name = runtime.RuntimeContainerd
+       }
+
        // Set IPAddresses to the new position, under Network block.
        out.Network.IPAddresses = in.IPAddresses

This adds containerd as the default runtime when the runtime name is missing in the status. But this may not be accurate because we don't have a way to determine the right value of the runtime, since we didn't record it in the old API. So, although it fixes the issue, it's not so correct solution.
After thinking more about it and going through the ps code, I think a better solution would be to update

ignite/cmd/ignite/run/ps.go

Lines 166 to 167 in 2dbcdd6

// Skip VMs with no runtime info or no runtime ID.
if vm.Status.Runtime == nil || vm.Status.Runtime.ID == "" {
and ignore the VMs that don't have runtime name info. Tried it with the patch:

diff --git a/cmd/ignite/run/ps.go b/cmd/ignite/run/ps.go
index 84cd2917..3e3b5315 100644
--- a/cmd/ignite/run/ps.go
+++ b/cmd/ignite/run/ps.go
@@ -163,8 +163,8 @@ func fetchLatestStatus(vms []*api.VM) (outdatedVMs map[string]bool, errList []er

        // Iterate through the VMs, fetching the actual status from the runtime.
        for _, vm := range vms {
-               // Skip VMs with no runtime info or no runtime ID.
-               if vm.Status.Runtime == nil || vm.Status.Runtime.ID == "" {
+               // Skip VMs with insufficient runtime info.
+               if vm.Status.Runtime == nil || vm.Status.Runtime.ID == "" || vm.Status.Runtime.Name == "" {
                        continue
                }
                containerID := vm.Status.Runtime.ID

I think this is the best way to solve this in the context of ps.
It would be great if you could try the same, see if it works for you and update your patch accordingly.

@darkowlzz darkowlzz added the kind/bug Categorizes issue or PR as related to a bug. label Jul 11, 2021
@stealthybox
Copy link
Contributor

I agree that it's impossible to insinuate with the older VM API objects what the runtime value should be.
Asserting the default when converting to/from v1alpha2 is wrong.

I like the graceful degradation suggested for the ignite ps code path. 👍

@stealthybox stealthybox merged commit aaac01c into weaveworks:main Jul 12, 2021
@stealthybox
Copy link
Contributor

Thanks for finding this problem!

# for free to subscribe to this conversation on GitHub. Already have an account? #.
Labels
kind/bug Categorizes issue or PR as related to a bug.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants