Missing /sys/fs/cgroup/cpuacct,cpu #145
Comments
Note that this reversal only happens within docker. Outside of docker, I see |
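One way to compare the controller ordering inside and outside a container is to read the cgroup membership file of a process. The sketch below assumes the cgroup v1 layout of `/proc/<pid>/cgroup` (`hierarchy-ID:controller-list:path`); the function name `cpu_controller_names` is hypothetical and not part of any tool mentioned in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not from this thread): print the controller list of
# every hierarchy that includes the "cpu" controller, exactly as written in
# the given file. Defaults to the calling process's own membership file.
cpu_controller_names() {
    file="${1:-/proc/self/cgroup}"
    # Field 2 is the comma-separated controller list; split it so that
    # "cpuset" or a lone "cpuacct" is not mistaken for "cpu".
    awk -F: '{
        n = split($2, c, ",")
        for (i = 1; i <= n; i++) {
            if (c[i] == "cpu") { print $2; break }
        }
    }' "$file"
}
```

Running the function once on the host and once inside a container would show whether the two environments report `cpu,cpuacct` or `cpuacct,cpu`.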
@crawford interesting. Thanks for adding that. @derekwaynecarr does this truly look like the same issue you had seen before? |
Do we have a good understanding of how hard this will be to fix? I know @derekwaynecarr noted he's looked at this before and thought it has been fixed already. |
To notify those following this issue: work on identifying the problem has started. |
Here's what I found so far with
Setup
$ vagrant box add --name RHCOS rhcos-vagrant-libvirt.box
$ mkdir rhcos && cd rhcos && vagrant init RHCOS && vagrant up
$ vagrant ssh
Link to Vagrant box binary: http://aos-ostree.rhev-ci-vms.eng.rdu2.redhat.com/rhcos/images/cloud/latest/
RPM Overlaying
$ sudo ostree admin unlock --hotfix
$ rpm -qa | grep docker
Docker version 1.13.1-70 RHEL7
Ran the following commands to start the Kubelet:
Which gave me the following output:
So it looks like the error with
Updated: see the comments below; the tests described in this comment were insufficient to identify the problem. |
I've encountered an error when mounting NFS shared folders, i.e. at
The full error log is in this gist. |
@Bubblemelon this makes me wonder if the fix was applied at build time via a patch. It may be worth using |
Using this Libvirt howto guide to verify the assumptions in my above comment about docker's
Master Node Info
RHCOS version: source
Docker Version: 2018-04-30 15:56:58
Output from
In trying to resolve I found this openshift issue #18776: To place
within
|
Related: coreos/bugs#1435 |
The error above
can be resolved by adding
After running,
$ rpm -qa | grep docker
docker-client-1.13.1-63.git94f4240.el7.x86_64
docker-rhel-push-plugin-1.13.1-63.git94f4240.el7.x86_64
docker-common-1.13.1-63.git94f4240.el7.x86_64
docker-1.13.1-63.git94f4240.el7.x86_64
docker-novolume-plugin-1.13.1-63.git94f4240.el7.x86_64
docker-lvm-plugin-1.13.1-63.git94f4240.el7.x86_64 |
Great work debugging @Bubblemelon! |
Also thank you @crawford for helping me! Just to clarify, something on the
I've also tried it out with this docker version: source - Sun, 08 Jul 2018 09:39:40 UT
docker-1.13.1-72.git6f36bd4.el7.x86_64
docker-rhel-push-plugin-1.13.1-72.git6f36bd4.el7.x86_64
docker-client-1.13.1-72.git6f36bd4.el7.x86_64
docker-lvm-plugin-1.13.1-72.git6f36bd4.el7.x86_64
docker-common-1.13.1-72.git6f36bd4.el7.x86_64
docker-novolume-plugin-1.13.1-72.git6f36bd4.el7.x86_64
Which gave the same error. |
I'd like to note that
That version of kubelet should include this fix. |
@derekwaynecarr what are your thoughts on this? |
cAdvisor doesn't like /sys:/sys:ro. See google/cadvisor#1843 |
This same error,
Still occurs when
Note that on RHCOS, the file is in this format:
If both of these were added, under
This error would occur:
kubelet.service holdoff time over, scheduling restart.
Starting Kubernetes Kubelet...
Started Kubernetes Kubelet.
container_linux.go:247: starting container process caused "process_linux.go:364: container init caused
\"rootfs_linux.go:54: mounting \\\"/sys/fs/cgroup/cpu,cpuacct\\\" to rootfs
\\\"/var/lib/docker/overlay2/8c95a16f4cad1f014091093c62248c6c0f27bcde879606cef6220f7db4521708/
merged\\\" at \\\"/var/lib/docker/overlay2/8c95a16f4cad1f014091093c62248c6c0f27bcde879606cef6220f7db4521708/
merged/sys/fs/cgroup/cpuacct,cpu\\\" caused \\\"no space left on device\\\"\""
/usr/bin/docker-current: Error response from daemon: oci runtime error: Failed to remove paths:
map[cpu:/sys/fs/cgroup/cpu,cpuacct/system.slice/docker-afc3a2d6c323ed28a6c7e6586239cb4db8b79b591513eb229ca6fa1eb0bead3b.scope
cpuacct:/sys/fs/cgroup/cpu,cpuacct/system.slice/docker-afc3a2d6c323ed28a6c7e6586239cb4db8b79b591513eb229ca6fa1eb0bead3b.scope]. |
@crawford do you mind stating what priority you think this should have? Or if the workaround in use should be applied in the RHCOS spins itself? This would clarify if @Bubblemelon and @mrunalp should keep digging on this specific issue. |
This needs to be fixed in the Kubelet. If the OS team is going to tackle that, then I think this bug should stay. Otherwise, let's close this and let @derekwaynecarr and his team tackle the issue. Either way, this is a low priority. I have a workaround (it's ugly, but it works). |
Since this is kubelet related we should pass it over to @derekwaynecarr's team and link back to this issue so they don't have to re-do all of the good debugging done so far. |
Moved this issue over to openshift/origin |
Closing since the fix must be done in another codebase. |
@crawford has found in tests that /sys/fs/cgroup/cpuacct,cpu is being expected during his testing, but RHCOS provides /sys/fs/cgroup/cpu,cpuacct. kubernetes/kubernetes#32728 (comment) denotes a similar issue. The workaround is to set up a link from one to the other.
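The link workaround could look roughly like the sketch below. It is not the exact workaround used in this thread, and the function name `link_cgroup_alias` is hypothetical. It assumes the real hierarchy is `cpu,cpuacct` and that the expected name `cpuacct,cpu` is missing from the same directory. The directory is a parameter so the function can be tried on a scratch directory first.

```shell
#!/bin/sh
# Hypothetical helper (not from this thread): create a "cpuacct,cpu" alias
# next to an existing "cpu,cpuacct" directory under the given cgroup root.
link_cgroup_alias() {
    root="${1:-/sys/fs/cgroup}"
    have="$root/cpu,cpuacct"   # name RHCOS provides, per the issue text
    want="$root/cpuacct,cpu"   # name being expected, per the issue text

    # Nothing to do if the expected name already resolves.
    if [ -e "$want" ]; then
        echo "present: $want"
        return 0
    fi
    # Refuse to create a dangling link.
    if [ ! -d "$have" ]; then
        echo "missing: $have" >&2
        return 1
    fi
    # Relative target, so the link stays valid wherever the root is mounted.
    ln -s "cpu,cpuacct" "$want" && echo "linked: $want -> cpu,cpuacct"
}
```

On a real host this would need root, and `/sys/fs/cgroup` may be mounted read-only, in which case the `ln` call fails until the mount is made writable. That part is an assumption about the host setup and is not covered by the sketch.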