Decide which architectures to support, etc. #369

Closed
utam0k opened this issue Oct 7, 2021 · 34 comments

Comments

@utam0k
Member

utam0k commented Oct 7, 2021

  • Distros: which distros do we expect to support, and at which versions? Different distros have different system libraries and kernels.
  • Kernel: which minimum kernel version are we supporting? I know we talked about this, but it would be good to have something official written down somewhere.
  • Architecture: Some architectures might not support certain features, so which architectures should we consider supporting? Obviously we are only really considering x86_64 at the moment, but should we consider things like ARM for embedded, and possibly even others such as MIPS, PowerPC, and RISC-V?
  • OS: Obviously we currently only support Linux, but maybe we can put that in writing and maybe discuss ideas or plans to support other operating systems?

Goal

Determine and describe these in README, etc.

@yihuaf
Collaborator

yihuaf commented Oct 8, 2021

Perhaps we should also differentiate between short term and long term? Limiting what we support in the short term allows us to focus on building new features. For the long term, we can be more flexible.

@Furisto
Collaborator

Furisto commented Oct 8, 2021

Distros: Ubuntu, Debian, Fedora for the short term; OpenSUSE, RHEL (or CentOS), Arch for the long term.
Kernel: How about 5.4? That is the first LTS version of the 5.x kernel series.
Architecture: x86_64; ARM support should be a goal, but not near term.
OS: Windows is out of the question because it implements containers completely differently. I believe macOS and the BSDs are also sufficiently different that it would not make sense to support them in youki.

@Furisto Furisto closed this as completed Oct 8, 2021
@Furisto Furisto reopened this Oct 8, 2021
@Furisto
Collaborator

Furisto commented Oct 8, 2021

Sorry, wrong button 😅

@yihuaf
Collaborator

yihuaf commented Oct 8, 2021

Distros: Ubuntu/Debian and Fedora for the near term. We will likely want to support CentOS in the future, since it has a large server share as well. But CentOS usually carries an older kernel, so we have to be careful about when to commit to it.
Kernel: 5.4 is reasonable. We can also be flexible and bump this later. Looking at LTS versions is a good idea.
Architecture: I would focus on x86_64.
OS: We should focus on Linux.

@tsturzl
Collaborator

tsturzl commented Oct 8, 2021

It might also be worth supporting an embedded-focused distro, since I think youki is attractive in that space. Perhaps that should come when we take the time to focus on ARM support?

CentOS and RHEL support kernel version 4.18 at the latest currently I believe. Debian 11 is 5.10. Ubuntu 20.04 is actually 5.11 now. Latest Arch is 5.14. OpenSUSE has a rolling and stable release channel, the Leap 15.3 release is on 5.3 I believe.

I'd have to look more closely at the 4.18 kernel to see if supporting a kernel that old causes conflicts. Cgroups v2 support is optional and doesn't block builds, but the newer libseccomp versions might not be supported by that kernel, and eBPF might have some conflicts as well, which might cause build issues. It's possible to hide some things like cgv2 and eBPF behind a build flag, especially if we're building binaries for platforms that don't even support cgv2. I think for libseccomp and libsystemd we might need to consider what our minimum supported version is. It might even be nice to limit our use of C libraries and move some things into Rust. I think for libsystemd we are using one function from the whole lib, and honestly I'd like to see youki not be tied directly to systemd so we can eventually support embedded platforms easily. If we made our own systemd functionality, we could disable it at runtime rather than having a build-time dependency on libsystemd. Nonetheless, I think we can probably implement that single systemd function in pure Rust. The libseccomp dependency is certainly harder to move away from, and I believe it's LGPLv2, which is kind of a viral license and makes porting or distributing it tricky.

I agree with focusing on x86_64, though I actually don't doubt that youki would work on an aarch64 platform currently. I also think we should focus pretty exclusively on Linux, since other Unix-like systems have so many differences that almost no part of youki would really be reusable for a platform like FreeBSD or macOS.
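To illustrate the idea of deciding at runtime rather than at build time, here is a minimal sketch of two probes. The function names are hypothetical and not part of youki. It assumes two conventions: a cgroup v2 (unified) hierarchy exposes a `cgroup.controllers` file at its mount root, and a systemd-booted system has the directory `/run/systemd/system` (the check documented for `sd_booted(3)`). The paths are passed in as parameters so the logic can be exercised without touching the real system.

```rust
use std::path::Path;

/// Returns true when `cgroup_root` looks like a cgroup v2 (unified) mount.
/// Assumption: a v2 hierarchy exposes `cgroup.controllers` at its root.
/// On a real system `cgroup_root` would usually be `/sys/fs/cgroup`.
pub fn is_cgroup_v2(cgroup_root: &Path) -> bool {
    cgroup_root.join("cgroup.controllers").is_file()
}

/// Returns true when systemd appears to be the running init system.
/// Assumption: this mirrors the sd_booted(3) convention of checking that
/// `<run_dir>/systemd/system` is a directory. `run_dir` is normally `/run`.
pub fn is_systemd_booted(run_dir: &Path) -> bool {
    run_dir.join("systemd/system").is_dir()
}
```

A caller could, for example, fall back to a non-systemd cgroup manager when `is_systemd_booted(Path::new("/run"))` returns false, instead of failing at link time on a distro without libsystemd.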

@tsturzl
Collaborator

tsturzl commented Oct 8, 2021

Does anyone have a RHEL or CentOS machine they can try youki out on? Otherwise I might try to get a VM set up and see if trying to support the 4.18 kernel is worthwhile. It will definitely limit our ability to embrace new kernel features, which I'd really like to do. Even a 5.4 kernel will set us back quite a bit, since some of the interesting kernel features I've experimented with for youki have only been available since 5.10.

@YJDoc2
Collaborator

YJDoc2 commented Oct 12, 2021

Also, we need to conditionally set up fields depending on which architectures we support:
For example, in the State of a container, pid is of data type i32, which is valid for a 32-bit system but can go wrong on 64-bit systems. Similarly, we might need to define the correct type for different architectures.

To check the maximum value of pid, we can read /proc/sys/kernel/pid_max, which states 4194304 on my 64-bit system, well beyond the 32768 of 32-bit platforms. Such incorrect types can cause issues when parsing data from the state or config files.
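As a quick sanity check on the numbers above: `i32::MAX` is 2147483647, so both 32768 and 4194304 fit in an `i32`. The sketch below shows one way to verify that for whatever value a given system reports. The helper names are hypothetical and not part of youki. The file contents are passed in as a string so the check does not depend on the machine it runs on.

```rust
/// Parses the contents of /proc/sys/kernel/pid_max (a single decimal
/// number followed by a newline). Returns None if the text is not a number.
pub fn parse_pid_max(contents: &str) -> Option<u64> {
    contents.trim().parse().ok()
}

/// Returns true when every PID up to `pid_max` can be stored in an `i32`.
pub fn fits_in_i32(pid_max: u64) -> bool {
    pid_max <= i32::MAX as u64
}
```

On a real system the input would come from `std::fs::read_to_string("/proc/sys/kernel/pid_max")`. A wider or narrower field type would only matter if the reported limit ever exceeded `i32::MAX`.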

@yihuaf
Collaborator

yihuaf commented Oct 13, 2021

> Also, we need to conditionally set up fields depending on which architectures we support: For example, in the State of a container, pid is of data type i32, which is valid for a 32-bit system but can go wrong on 64-bit systems. Similarly, we might need to define the correct type for different architectures.
>
> To check the maximum value of pid, we can read /proc/sys/kernel/pid_max, which states 4194304 on my 64-bit system, well beyond the 32768 of 32-bit platforms. Such incorrect types can cause issues when parsing data from the state or config files.

This is a good point, and also the reason why I advocate limiting what we support, to avoid complexity where we can.

@yihuaf
Collaborator

yihuaf commented Oct 13, 2021

In terms of kernel version, libseccomp will likely require us to have a newer kernel. Worst case, we disable seccomp for kernel versions that don't support it? As mentioned before, I don't recommend rewriting the libseccomp logic in pure Rust if we can avoid it. This will be use-case driven as well, tbh. Not all container runtimes focus on security or take advantage of features like seccomp.

@utam0k
Member Author

utam0k commented Oct 16, 2021

@yihuaf @tsturzl @Furisto @YJDoc2
Thanks for all the input.
As for OS support, Linux only is fine. However, this should be reconsidered if someone is motivated to support other OSes; I think it's worth considering.
Let's set the minimum kernel version to 5.4 for now. However, note that the supported version may be raised if new features using io_uring are introduced.
There is a good chance that 4.18 will work, but let's call it unofficial; I don't think it's worth verifying every time with CI, etc.
As for distributions, as long as we settle on the Linux kernel version, there seems to be no problem. How about not writing anything specific about supported distributions for now? I don't feel that the differences between them cause any problems at the moment.
In summary:

  • Kernel: ≧ 5.4 (may be raised in the future)
  • OS: Linux

I am wondering about the architecture to support.
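A minimum kernel version like the one summarized above could be checked at startup. The following is only a sketch: `MIN_KERNEL`, `parse_kernel_release`, and `is_supported_kernel` are hypothetical names, not youki APIs, and the release string is assumed to have the usual `major.minor...` shape that `uname -r` prints.

```rust
/// Minimum supported kernel as (major, minor), per the summary in this thread.
pub const MIN_KERNEL: (u32, u32) = (5, 4);

/// Extracts (major, minor) from a kernel release string such as
/// "5.4.17-2136.301.1.4.el7uek.x86_64". Returns None if it cannot be parsed.
pub fn parse_kernel_release(release: &str) -> Option<(u32, u32)> {
    let mut parts = release.trim().split('.');
    let major = parts.next()?.parse().ok()?;
    // The minor component may carry a suffix (e.g. "1-rc1"),
    // so keep only its leading digits before parsing.
    let minor_raw = parts.next()?;
    let digits: String = minor_raw
        .chars()
        .take_while(|c| c.is_ascii_digit())
        .collect();
    let minor = digits.parse().ok()?;
    Some((major, minor))
}

/// Returns true when the release is at least MIN_KERNEL.
/// Tuples compare lexicographically, so (5, 10) >= (5, 4) and (4, 18) < (5, 4).
pub fn is_supported_kernel(release: &str) -> bool {
    matches!(parse_kernel_release(release), Some(v) if v >= MIN_KERNEL)
}
```

Treating an unparseable release string as unsupported is a design choice of this sketch. Since 4.18 is described as unofficial rather than broken, a real check might warn on older kernels rather than refuse to run.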

@yihuaf
Collaborator

yihuaf commented Oct 17, 2021

Agree that we should increase the minimum supported kernel version in the future, and we should not be afraid to do so. As far as architecture is concerned, I think we should focus on x86-64 first. I know many may be interested in Arm. Most of what we do is not architecture dependent, so supporting it should not be hard. I think other architectures are niche, and getting access to hardware for other architectures is hard.

@tsturzl
Collaborator

tsturzl commented Nov 10, 2021

I think we probably already support arm64, but I don't really have a good setup to test that. If we didn't link against C libs it'd be easy to just cross compile. I might actually be able to test this out on an embedded system one of these days, but that said, I think we should focus on x86_64 until we are feature complete with runc. I agree with @yihuaf, we probably already support other architectures since we don't really do anything architecture specific. I wouldn't, however, call arm64 a niche platform either; it's being used a lot, and I think youki might be especially appealing on these embedded or edge computing platforms due to its likely lower footprint.

@yihuaf
Collaborator

yihuaf commented Nov 11, 2021

> I think we probably already support arm64, but I don't really have a good setup to test that. If we didn't link against C libs it'd be easy to just cross compile. I might actually be able to test this out on an embedded system one of these days, but that said, I think we should focus on x86_64 until we are feature complete with runc. I agree with @yihuaf, we probably already support other architectures since we don't really do anything architecture specific. I wouldn't, however, call arm64 a niche platform either; it's being used a lot, and I think youki might be especially appealing on these embedded or edge computing platforms due to its likely lower footprint.

Agreed, arm64 is not that niche, and it will become bigger and bigger in the future. To reiterate our decision:

  • Kernel: ≧ 5.4 (may be raised in the future)
  • OS: Linux
  • Architecture: x86_64 (arm64 in the future, when we have the use case)

@utam0k
Member Author

utam0k commented Nov 11, 2021

I also think that arm64 should be supported.

@utam0k
Member Author

utam0k commented Nov 11, 2021

I think this will be fine for the first release. How about you guys?

Kernel: ≧ 5.4 (may be raised in the future)
OS: Linux
Architecture: x86_64 (arm64 in the future, when we have the use case)

@tsturzl
Collaborator

tsturzl commented Nov 13, 2021

@utam0k looks mostly good, but my concern with distros is that some distros meet those criteria and youki still would not run or compile on them, because they do not have a suitable version of libseccomp, or in some cases, like Alpine Linux, there is no systemd and thus no systemd library to link against. So if we don't want to target specific distros, then we should at least know the dependencies and their versions.

I'd like to put libsystemd behind a feature flag so we can choose whether or not we want to compile it into a release. In fact we used to have this: I used to run a distro without systemd, so I added a feature flag, but splitting youki up into several crates has broken that feature flag and made it difficult to get working again. Maybe this isn't a concern for now, but we should at least specify that we have required dependencies, at least until we have feature flags to make them optional.
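For readers unfamiliar with Cargo feature flags, here is a minimal sketch of the general pattern. The feature name `systemd` and the function `select_cgroup_manager` are assumptions made for illustration; youki's actual feature names and crate layout may differ. The feature would be declared in `Cargo.toml` under `[features]` (for example `systemd = []`), and any code that links against libsystemd would sit behind the same `cfg`.

```rust
/// Picks a cgroup manager name based on what the caller asked for and on
/// whether the (hypothetical) `systemd` Cargo feature was compiled in.
pub fn select_cgroup_manager(want_systemd: bool) -> Result<&'static str, String> {
    if !want_systemd {
        // The plain cgroupfs path needs no optional dependency.
        return Ok("cgroupfs");
    }
    if cfg!(feature = "systemd") {
        // Code that calls into libsystemd would only be compiled
        // when this feature is enabled.
        Ok("systemd")
    } else {
        Err("this binary was built without the `systemd` feature".to_string())
    }
}
```

With this shape, a build for a distro without systemd (such as Alpine Linux, mentioned above) simply omits the feature, and a request for the systemd manager produces a clear error instead of a link failure.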

@tsturzl
Collaborator

tsturzl commented Nov 13, 2021

On a side note, I'd be very interested in taking on the arm64 support once I have some time. I have worked a lot with arm64 boards in both C++ and Rust. We use ARM processors a lot at my work, as I work largely with robotics and embedded devices.

@abalmos

abalmos commented Nov 27, 2021

We are using containers on the edge in mobile IoT/telematics applications (oats-center/isoblue-avena). This project hits home with that use case, but, unfortunately, often calls for ARM/ARM64.

@utam0k
Member Author

utam0k commented Nov 28, 2021

@tsturzl
I'm sorry for the delayed reply. I think we should solve this problem. Could I ask you to create an issue about this and describe the current status?

> I'd like to put libsystemd behind a feature flag so we can choose whether or not we want to compile it into a release. In fact we used to have this: I used to run a distro without systemd, so I added a feature flag, but splitting youki up into several crates has broken that feature flag and made it difficult to get working again. Maybe this isn't a concern for now, but we should at least specify that we have required dependencies, at least until we have feature flags to make them optional.

@utam0k
Member Author

utam0k commented Nov 28, 2021

> We are using containers on the edge in mobile IoT/telematics applications (oats-center/isoblue-avena). This project hits home with that use case, but, unfortunately, often calls for ARM/ARM64.

@abalmos Thanks for your advice. Hmm... I don't have any way to prepare such an environment. Do you have any good ideas?

@abalmos

abalmos commented Nov 30, 2021

@utam0k I tend to use virtual machines as a starting point. I could potentially offer access to some of our devices for testing, but that may not be a long-term solution.

@utam0k
Member Author

utam0k commented Dec 1, 2021

> @utam0k I tend to use virtual machines as a starting point. I could potentially offer access to some of our devices for testing, but that may not be a long-term solution.

@abalmos
Youki probably works on arm/arm64, but I can't guarantee it because we can't confirm it works with CI in its current state 😭

@YJDoc2
Collaborator

YJDoc2 commented Dec 1, 2021

Hey, I was looking around at how we can test this, and I have found two options, but neither seems very practical:

  • GitHub does not allow specifying the architecture in Actions and CI/CD, but it does allow using an external runner for CI jobs. This would require someone to set up an external account on something like Azure, which does allow selecting the architecture. Possibly we can do this in its free tier, but I'm not sure. We would then set this up as the job runner for a dedicated ARM job.
  • Use QEMU in CI to emulate ARM and test. Even though this would work, and would allow us to test multiple ARM-based CPUs, it would not only be tedious to set up but would also have a high overhead. It would basically require downloading/caching a lightweight Linux image which matches our required kernel, supports the libs we need for youki, and has the integration test dependencies and Rust (if we want to run unit tests), then compiling youki for the ARM target and copying that as binary data to a file to be used as the hard disk for QEMU (there may be an option to access host system files, but I don't know about that). Then we would need to run all the tests and capture the output from QEMU. As said before, this would be very tedious to set up in CI/CD and would take quite a long time to run.

A third option is to set up Travis CI and move all our CI to that. This might allow us to set up something like Bors, as the Rust repo does, but I'm not sure about the cost or how it compares to GitHub CI.

@tsturzl
Collaborator

tsturzl commented Dec 1, 2021

@utam0k I've created #512 to address the libsystemd feature flag.

@tsturzl
Collaborator

tsturzl commented Dec 1, 2021

@YJDoc2 @utam0k

Another option for ARM is something like tiered support. We consider ARM64 tier 2, and maybe ARMv6/7 (32-bit) tier 3. Of course x86_64 would be tier 1, meaning that CI actively tests all code against x86_64 for every PR before it is merged into main, and therefore we guarantee that main will always work in at least a development capacity. For tier 2, we ensure that each actual release is fully tested and built for ARM64, so whoever cuts the release just needs to make sure someone has tested and built for those targets. Tier 3 support would mean we test and build for these targets as time permits; the only support we offer is security updates, and we would otherwise mostly rely on community interest and support to add features to this tier.

Eventually we could move ARM64 into tier 1 support, since youki is an appealing choice for embedded platforms. This also means we don't have to put too much thought into tooling or immediate CI solutions for ARM64 currently, because we don't even have an initial release yet.

Another thing to consider in terms of the platforms we support is that different platforms use different C libraries. For example Alpine Linux, a popular embedded distro, uses musl as its C lib. Currently both x86_64-unknown-linux-musl and aarch64-unknown-linux-musl only have tier 2 Rust support, which may affect our ability to support these platforms in certain cases, especially in CI. Perhaps more important to the discussion: do we want to support musl? Or should we just focus on architecture support for now? I would assume we already support musl, but again we don't know until we actually test it.
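The tier proposal and the musl question can both be expressed in terms of what the Rust compiler already knows about the build target. The sketch below is illustrative only: the function names are hypothetical, and the tier mapping encodes the proposal from this comment, not an adopted policy. It relies on `std::env::consts::ARCH` (which reports values such as `x86_64`, `aarch64`, and `arm`) and on the built-in `target_env` cfg.

```rust
/// Support tier for a CPU architecture, following the tiers proposed in
/// this thread (a proposal, not an adopted policy).
/// Architecture names follow `std::env::consts::ARCH`.
pub fn proposed_tier(arch: &str) -> Option<u8> {
    match arch {
        "x86_64" => Some(1),  // tested in CI on every PR
        "aarch64" => Some(2), // tested and built for each release
        "arm" => Some(3),     // 32-bit ARM, tested as time permits
        _ => None,            // not covered by the proposal
    }
}

/// C library family the binary was compiled against, taken from the
/// compiler-provided `target_env` cfg value.
pub fn libc_flavor() -> &'static str {
    if cfg!(target_env = "musl") {
        "musl"
    } else if cfg!(target_env = "gnu") {
        "glibc"
    } else {
        "other"
    }
}

/// One-line summary of the current build target, e.g. for a bug report.
pub fn describe_build_target() -> String {
    let arch = std::env::consts::ARCH;
    let tier = match proposed_tier(arch) {
        Some(t) => format!("tier {}", t),
        None => "no tier".to_string(),
    };
    format!("{} / {} ({})", arch, libc_flavor(), tier)
}
```

Printing `describe_build_target()` in something like the `youki info` output would make it easier to tell, from an issue report alone, which tier and C library a user is on.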

@tsturzl
Collaborator

tsturzl commented Dec 1, 2021

I also have experience with QEMU for ARM64. I did something similar to test embedded software written in Rust that targeted ARM64. If that's a route we want to try, I'd be willing to try it out. However, I should mention that running unit tests in CI in an emulator will be terribly slow, because that was exactly my experience with it. It's also very likely that GitHub Actions jobs are already run in a VM, so it may not work incredibly well. AWS also offers a free tier and has ARM64 instances.

@abalmos

abalmos commented Dec 2, 2021

@tsturzl I think a tier based solution with ARM tier 2 is a great solution ... but that still leaves the need for devs to test occasionally and debug issue reports.

I also share the same experience with QEMU in GitHub Actions ... it ends up being a /very/ slow pipeline (we used it through docker buildx) ... I would think just building youki may even exceed the maximum run time (which is what we hit first).

We could use an AWS free tier ARM instance running a GHA self-hosted runner; then no emulation is needed anywhere and things can stay in GHA.

@tsturzl
Collaborator

tsturzl commented Dec 2, 2021

@abalmos I'd agree that's probably a better solution here. We should probably create a ticket for this, and should probably discuss who creates the AWS resource. I'm wondering if @utam0k should create it since he's an actual containers org member.

@utam0k
Member Author

utam0k commented Dec 2, 2021

@tsturzl @abalmos
I was looking at the GitHub Actions setup of crun and found this. Maybe we can use it. What do you think?
https://github.com/containers/crun/blob/main/.github/workflows/test.yaml
https://github.com/uraimo/run-on-arch-action

@tsturzl
Collaborator

tsturzl commented Dec 3, 2021

I think run-on-arch-action is doing what @abalmos was describing with Docker. Would youki tests run in a Docker container? I don't think the integration tests will run correctly like this. Compilation is really slow in an emulator, so I wonder how well this is working for crun. Maybe it's worth a try?

@utam0k
Member Author

utam0k commented Dec 3, 2021

@tsturzl
It's probably not that hard to try, right? I'm actually writing a script to run a containerd integration test on Docker with youki, which may help a little with ARM support. BTW, I'm mostly referring to crun for how to do it.
#331

@utam0k
Member Author

utam0k commented Dec 3, 2021

It might be a good idea to discuss ARM support in a separate issue from this one.

@jhult
Contributor

jhult commented Dec 23, 2021

For whatever it is worth, I was able to compile and run on Oracle Linux 7.

$ youki info
[DEBUG crates/youki/src/main.rs:92] 2021-12-23T15:05:03.093767361-05:00 started by user 1001 with ArgsOs { inner: ["youki", "info"] }
Version           0.0.1
Kernel-Release    5.4.17-2136.301.1.4.el7uek.x86_64
Kernel-Version    #2 SMP Fri Dec 3 20:34:13 PST 2021
Architecture      x86_64
Operating System  Oracle Linux Server 7.9
Cores             4
Total Memory      47645
Cgroup setup      legacy
Cgroup mounts
  blkio           /sys/fs/cgroup/blkio
  cpu             /sys/fs/cgroup/cpu,cpuacct
  cpuacct         /sys/fs/cgroup/cpu,cpuacct
  cpuset          /sys/fs/cgroup/cpuset
  devices         /sys/fs/cgroup/devices
  freezer         /sys/fs/cgroup/freezer
  hugetlb         /sys/fs/cgroup/hugetlb
  memory          /sys/fs/cgroup/memory
  net_cls         /sys/fs/cgroup/net_cls,net_prio
  net_prio        /sys/fs/cgroup/net_cls,net_prio
  perf_event      /sys/fs/cgroup/perf_event
  pids            /sys/fs/cgroup/pids
Namespaces        enabled
  mount           enabled
  uts             enabled
  ipc             enabled
  user            enabled
  pid             enabled
  network         enabled
  cgroup          enabled

@yihuaf
Collaborator

yihuaf commented Jul 21, 2023

I am closing this thread since there has not been any activity for a while. Due to our limited bandwidth, we can only afford to support x86 at the moment. If there are people who are willing to shoulder the responsibility, we can revisit this issue.

@yihuaf yihuaf closed this as completed Jul 21, 2023

7 participants