-
Notifications
You must be signed in to change notification settings - Fork 84
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Do not die after failing to load app.psgi #117
base: master
Are you sure you want to change the base?
Conversation
This stops the error flood in logs when app.psgi is broken due to endless forking of immediately dying children.
As suggested by Aristotle Pagaltzis.
Could we get a release with this patch included, since this is what effectively prevents us from using starman in production, and also what has lead to the fork of starwoman?! If you need some more testing and/or an further improvements, please let me know if i can help. |
I understand the issue but am not fully convinced that this PR is the right fix because if the .psgi has an initialization code that can die (such as opening a configuration file, or opening a database connection), then it will still repeatedly die and cause the same flood of repeated execution errors until it succeeds. You can see that with Special casing the syntax error, although it might be the most common and makes it handy to fix in dev, might actually lead to be more inconsistent behaviors. Maybe the right behavior is either:
3 is essentially what we're doing, and adding a sleep will at least slow down the amount of errors you get. However, adding a sleep in init, or lazy-loading the app upon request, makes the server process "half up" without the full capacity, and i'm not super in favor of that personally. It is really tricky to tell which one is the best behavior at this moment, but we need to be careful about the changing behavior, and if we choose an option that is a breaking change, there might have to be an option to turn on that change.
my company has been using starman in production for years and has no issues around this, but I guess mainly because we use it with |
i've been putting more thoughts into this, and simply delaying the compilation of the app until the request time doesn't really feel like something we need/want for the "production usage" -- I don't expect most users to modify the code on the production server once it's deployed, so simply delaying it won't solve that particular problem. Granted, this will definitely fix the issue of a potential disk full situation because of the flood of endless recompilation of the app. I lean towards thinking that the right fix would be to add some interval/backoff for retrying the compilation of the app, or somehow signal to the parent that the compilation failed, and if all children failed compilation, just let the parent exit, like you do with the Maybe this behavior should be customizable with an option. I know it might sound too much, but it's an important behavior change that might be potentially breaking existing users. |
Same here. I switched from Starman to Starlet back to Starman because even
I think you can’t get around having both. If the app fails to compile, then exiting ASAP is arguably the only correct response, since no success is possible. If the app fails to start due to things like required services going missing (infamous “database connection failed” etc.), then a backoff is… essentially mandatory. And I can think of use cases for any combination of these, so ideally they should be separate options that are not mutually exclusive. |
+1 for the idea of turning this behavior on with an option. That way we can bypass any breaking changes but still please people (like me :P) who think this option would be useful. |
This stops the error flood in logs when app.psgi is broken
due to endless forking of immediately dying children.
Fixes #94 and #106.