Node 20.3 Crashes all the time when executed inside docker #48444
Comments
Another possible ref electron/rebuild#1085 |
@nodejs/libuv |
"Text file busy" means trying to write a shared object or binary that's already in use. My hunch is that node-gyp has some race condition in reading/writing files that wasn't manifesting (much) when everything still went through the much slower thread pool, whereas io_uring is fast enough to make it much more visible. |
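To make the "Text file busy" (ETXTBSY) hunch above concrete, here is a minimal Python sketch of the kind of race being described: a file is executed while a write descriptor to it is still open. The function name and the `tool.sh` file name are made up for illustration, and this is not node-gyp's actual code path. Whether the kernel refuses the exec depends on the kernel version and on the temporary directory being mounted executable, so the sketch only reports what happened.

```python
import errno
import os
import subprocess
import tempfile


def exec_while_open_for_write():
    """Try to execute a script while a write descriptor to it is still open.

    Returns the errno raised by the exec attempt, or None if the exec succeeded.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "tool.sh")  # hypothetical file name
        fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o755)
        try:
            os.write(fd, b"#!/bin/sh\nexit 0\n")
            # The writer has NOT closed the file yet: this is the suspected race window.
            try:
                subprocess.run([path], check=False)
            except OSError as exc:
                return exc.errno
            return None
        finally:
            os.close(fd)


if __name__ == "__main__":
    err = exec_while_open_for_write()
    if err is None:
        print("exec succeeded (this kernel did not refuse it)")
    else:
        print(f"exec failed: {errno.errorcode.get(err, err)}")
```

If the writer closes the descriptor before the exec, the error goes away, which is why a faster I/O path could expose an ordering bug that a slower one tended to hide.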
Is there a way to disable io_uring using an environment variable as a temporary workaround when running node-gyp? |
Yes, set |
|
This is actually worse than I thought. Node doesn't run at all with 20.3 |
|
I can reproduce this, and this is quite critical. |
cc @nodejs/tsc for visibility |
FWIW on two systems I have access to (a Red Hat owned RHEL 8 machine and test-digitalocean-ubuntu1804-docker-x64-1 from the Build infra)
root@test-digitalocean-ubuntu1804-docker-x64-1:~# docker run -it node:20.3.0 node
Unable to find image 'node:20.3.0' locally
20.3.0: Pulling from library/node
bba7bb10d5ba: Pull complete
ec2b820b8e87: Pull complete
284f2345db05: Pull complete
fea23129f080: Pull complete
9063cd8e3106: Pull complete
4b4424ee38d8: Pull complete
0b4eb4cbb822: Pull complete
43443b026dcf: Pull complete
Digest: sha256:fc738db1cbb81214be1719436605e9d7d84746e5eaf0629762aeba114aa0c28d
Status: Downloaded newer image for node:20.3.0
Welcome to Node.js v20.3.0.
Type ".help" for more information.
>
I can reproduce the assertion failure on an Ubuntu 16.04 host with
root@infra-digitalocean-ubuntu1604-x64-1:~# docker run -it node:20.3.0 node
Unable to find image 'node:20.3.0' locally
20.3.0: Pulling from library/node
bba7bb10d5ba: Pull complete
ec2b820b8e87: Pull complete
284f2345db05: Pull complete
fea23129f080: Pull complete
9063cd8e3106: Pull complete
4b4424ee38d8: Pull complete
0b4eb4cbb822: Pull complete
43443b026dcf: Pull complete
Digest: sha256:fc738db1cbb81214be1719436605e9d7d84746e5eaf0629762aeba114aa0c28d
Status: Downloaded newer image for node:20.3.0
node[1]: ../src/node_platform.cc:68:std::unique_ptr<long unsigned int> node::WorkerThreadsTaskRunner::DelayedTaskScheduler::Start(): Assertion `(0) == (uv_thread_create(t.get(), start_thread, this))' failed.
1: 0xc8e4a0 node::Abort() [node]
2: 0xc8e51e [node]
3: 0xd0a059 node::WorkerThreadsTaskRunner::WorkerThreadsTaskRunner(int) [node]
4: 0xd0a17c node::NodePlatform::NodePlatform(int, v8::TracingController*, v8::PageAllocator*) [node]
5: 0xc4bbc4 node::V8Platform::Initialize(int) [node]
6: 0xc49408 [node]
7: 0xc497db node::Start(int, char**) [node]
8: 0x7f6e8486218a [/lib/x86_64-linux-gnu/libc.so.6]
9: 0x7f6e84862245 __libc_start_main [/lib/x86_64-linux-gnu/libc.so.6]
10: 0xba9ade _start [node]
root@infra-digitalocean-ubuntu1604-x64-1:~# docker run -it node:20.3.0-bullseye node
Unable to find image 'node:20.3.0-bullseye' locally
20.3.0-bullseye: Pulling from library/node
93c2d578e421: Already exists
c87e6f3487e1: Already exists
65b4d59f9aba: Already exists
d7edca23d42b: Already exists
25c206b29ffe: Already exists
599134452287: Pull complete
bd8a83c4c2aa: Pull complete
d11f4613ae42: Pull complete
Digest: sha256:ceb28814a32b676bf4f6607e036944adbdb6ba7005214134deb657500b26f0d0
Status: Downloaded newer image for node:20.3.0-bullseye
Welcome to Node.js v20.3.0.
Type ".help" for more information.
>
Our website build is actually broken running |
FWIW I opened an issue about this in the docker-node repo: nodejs/docker-node#1918 TLDR: this is not a problem with Node.js itself, but with the default base OS used by the Docker image, which was upgraded for v20.3.0. |
bullseye works for me as well |
Now I also get the file busy error:
EDIT: Works with |
So to summarize:
|
Should I split the uring problem into a separate issue? |
Can someone post the result of |
On the Ubuntu 16.04 infra machine I cannot run apt in the bookworm based
Another datapoint, adding
root@infra-digitalocean-ubuntu1604-x64-1:~# docker run --security-opt=seccomp:unconfined -it node:20.3.0 node
Welcome to Node.js v20.3.0.
Type ".help" for more information.
> |
Right, then I can predict with near 100% certainty what the problem is: docker doesn't know about the newish clone3 system call. Its seccomp filter rejects it with some bogus error and node consequently fails when it tries to start a new thread. This docker seccomp thing is like clockwork, it always pops up when new system calls are starting to see broader use. It's quite possibly fixed in newer versions. |
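The clone3 diagnosis above can be checked from inside a container with a small probe. This is a sketch under stated assumptions, not something posted in the thread: it assumes Linux, assumes clone3 is syscall number 435 (true on x86_64 and arm64 as far as I know), and assumes the "bogus error" from an outdated seccomp filter is EPERM, which the thread does not state. Calling clone3 with a NULL argument pointer and size 0 is rejected during argument validation, so no process is created.

```python
import ctypes
import errno
import sys

SYS_CLONE3 = 435  # assumption: Linux syscall number for clone3


def classify_clone3_errno(err):
    """Map the errno from the probe call to a human-readable verdict."""
    if err == errno.EINVAL:
        return "available"
    if err == errno.ENOSYS:
        return "unavailable (thread libraries can fall back to clone)"
    if err == errno.EPERM:
        return "blocked (possibly a seccomp filter that predates clone3)"
    return f"unexpected errno {err}"


def probe_clone3():
    """Issue an intentionally invalid clone3 call and return the resulting errno.

    Returns None on non-Linux platforms, where the syscall number means something else.
    """
    if not sys.platform.startswith("linux"):
        return None
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    rc = libc.syscall(
        ctypes.c_long(SYS_CLONE3), ctypes.c_void_p(None), ctypes.c_size_t(0)
    )
    return ctypes.get_errno() if rc == -1 else 0


if __name__ == "__main__":
    err = probe_clone3()
    if err is None:
        print("not Linux, nothing to probe")
    else:
        print("clone3:", classify_clone3_errno(err))
```

A "blocked" verdict would be consistent with the thread creation failure in the assertion above, since a permission-style error gives the C library no hint that it should fall back to the older clone call.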
Updating docker to the latest version (v24.0.2) fixed it for me. A few notes:
Here is what I think we should do:
This seems a future-proof solution while keeping the current functionality available. |
UV_USE_IO_URING is (intentionally) undocumented and going away again so don't do that. |
@bnoordhuis Would you just document this as "if you are hit by this bug, update docker"? |
I think there are two different things here. I'm not sure updating docker will help with the uring problem. Or does it? Please confirm. |
If I'm reading this correctly there are 2 separate issues here.
I think we should try to understand better the 2nd issue before disabling it. |
I have the same problem. What is the correct solution? |
Try upgrading your container runtime (docker, containerd, etc.) to the latest version. If your package manager does not pick up anything newer, consider upgrading manually. |
Experiencing this with |
archlinux 6.9.1, nodejs 22.0.0-1, same error |
node-gyp or node have a bug that prevents building with "text file busy" if the kernel is too fast, so we have to disable IO_URING support. This is clearly a hack and needs to be removed as soon as possible. nodejs/node#48444 is the necro-bumped thread, originally from docker
Same problem
|
Btw we still use bullseye, works pretty good |
Not to bring up an old issue again, but this appears to be a recurring bug. Reproducing it consistently with kernel it is fixed with the given that that does fix it, it may also potentially be an issue with |
cc @santigimeno can you take another look? |
This is supposed to be the default for Node.js (since the February security releases). https://nodejs.org/docs/latest-v22.x/api/cli.html#uv_use_io_uringvalue
|
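For anyone who needs the temporary workaround discussed in this thread, here is a hedged sketch of running a build command with `UV_USE_IO_URING` set in the child environment. The helper names are invented for illustration, and the value `"0"` meaning "disabled" is my reading of the linked CLI documentation. Note the earlier comment that this variable was meant to be temporary, so treat it as a stopgap.

```python
import os
import subprocess


def env_without_io_uring(base=None):
    """Return a copy of the environment with libuv's io_uring use switched off."""
    env = dict(os.environ if base is None else base)
    env["UV_USE_IO_URING"] = "0"  # assumption: "0" disables io_uring in libuv
    return env


def run_with_io_uring_disabled(cmd):
    """Run cmd (a list of arguments) with io_uring disabled for that process tree."""
    return subprocess.run(cmd, env=env_without_io_uring(), check=False)


if __name__ == "__main__":
    # Only prints the variable; replace with your own build command as needed.
    print(env_without_io_uring({"PATH": "/usr/bin"}))
```

Setting the variable only for the child process keeps the workaround scoped to the build step instead of changing behavior for everything else on the machine.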
A patch has just been sent to the kernel fixing this: It should land in stable shortly: |
I just tested with 22.2.0 installed from nvm and, as documented, io_uring is disabled there. Maybe there is a problem in the Arch Linux package? |
If you run into this on Ubuntu 16 or 18, my fix is to use Ubuntu >= 20.04. The issue actually comes from docker for me. |
How is arch supposed to be disabling io_uring? It configures nodejs to use the system libuv, and builds its libuv with the default options. |
That's likely the problem. Due to the security reasons mentioned above node.js patched libuv to disable io_uring in the following commits: 42e659c and 6d14352. Maybe the arch packaging hasn't taken that into account? |
Thanks @santigimeno for the help debugging this. |
I am the Arch packager and indeed I have missed this change. Opened libuv/libuv#4416 to see if there is a better way forward. |
https://aur.archlinux.org/cgit/aur.git/commit/?h=thelounge&id=fd50c63 node-gyp or node have a bug that prevents building with "text file busy" if the kernel is too fast, so we have to disable IO_URING support. This is clearly a hack and needs to be removed as soon as possible. nodejs/node#48444 is the necro-bumped thread, originally from docker
For more info see nodejs/node#48444 (comment).
yarn add bufferutil) fails with Text file busy