Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Hard crashes Node.js when a worker thread is terminated #193

Closed
smoores-dev opened this issue Apr 20, 2024 · 7 comments
Closed

Hard crashes Node.js when a worker thread is terminated #193

smoores-dev opened this issue Apr 20, 2024 · 7 comments

Comments

@smoores-dev
Copy link

smoores-dev commented Apr 20, 2024

Howdy! This is a great project! I've been using it to integrate Storyteller with the whisperx and fuzzysearch python libraries. It works very well in the general case, but:

Storyteller runs pymport from a Worker thread, via piscina. Piscina supports cancelling workers, and does so by calling worker.terminate(). When this callsite is reached, if the Worker is running Python code, the Node.js runtime crashes with one of a few different C++ errors:

terminate called after throwing an instance of 'Napi::Error'
  what():

OR

double free or corruption (out)

After which the entire Node.js runtime crashes, up through the parent process that kicked off the Piscina worker.

For what it's worth, https://github.com/hmenyus/node-calls-python, a similar project, has very similar issues.

Anyway, I don't even know if this is something you can resolve here, or if it's actually an issue with terminating worker threads while Node.js is running any addon code, but I figured I would give it a shot!

Here's a minimal reproduction, if that helps at all: https://github.com/smoores-dev/node-python-worker-repro

@mmomtchev
Copy link
Owner

Alas, at first glance, there are no solutions without consequences.

terminate comes while the worker is in Python. When the sleep call ends - because there is no mechanism that can interrupt a running C++ call - the environment does not exist anymore and Node-API itself aborts because it cannot throw the exception. pymport does not get a chance to intervene at all.

There is a Node-API switch called NODE_API_SWALLOW_UNTHROWABLE_EXCEPTIONS which solves your problem, but as its name goes, it can hide exceptions from you and it seems to be a dangerous options. If you add this #define, it won't crash - it will silently exit.

It is a compile-time option, so I cannot give you the possibility to set a runtime flag - pymport must be rebuilt with it. Everything I can do is an installation option.

By the way I highly recommend you to load pymport in the main thread before using it in worker threads, because otherwise, Node.js will keep loading and unloading the DLL every time a worker thread quits.

@smoores-dev
Copy link
Author

😞 That does make sense, thank you for explaining.

Would you be able to give me a little bit more info about that Node-API switch? Rebuilding pymport might be alright, in this specific case, since Storyteller is only distributed as a Docker container. I'm already building node-sqlite3 from source in that same container. I don't think I know how to add the flag, though; are you saying it's something I would actually have to add to a pymport header file, or something?

Thanks for the note about loading pymport into the main thread, that's a good call. In the actual production service, I'm using piscina, and I'm only using a single, persistent worker thread, so the worker module gets cached and startup and I believe only ends up getting loaded one time.

@mmomtchev
Copy link
Owner

--- a/src/pymport.h
+++ b/src/pymport.h
@@ -1,5 +1,6 @@
 #pragma once
 
+#define NODE_API_SWALLOW_UNTHROWABLE_EXCEPTIONS
 #include <map>
 #include <list>
 #include <set>

@smoores-dev
Copy link
Author

Thanks so much, I will give this a shot! I really appreciate it

@smoores-dev
Copy link
Author

This is working perfectly; thanks again!

@mmomtchev
Copy link
Owner

As you have surely seen from my profile page, I am currently in a middle of a huge extortion involving the French police, corrupt judges, several big tech companies and many major open source projects. In order to intimidate me, people are posting simultaneously in my projects.

@mmomtchev
Copy link
Owner

Fixed by #330

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants