[BUGS] Default ulimit setting too low #1656
Comments
With systemd, configuration through security/limits doesn't work; the limit has to be set in the service unit's [Service] section instead.
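Presumably the [Service] reference above means a systemd drop-in. A minimal sketch, assuming the service unit is named rockstor.service; the drop-in path and the limit value are illustrative, not Rockstor's shipped configuration:

```ini
# /etc/systemd/system/rockstor.service.d/limits.conf (hypothetical drop-in path)
[Service]
LimitNOFILE=16384
```

After adding the drop-in, `systemctl daemon-reload` followed by a restart of the service is needed for the new limit to take effect.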
The PHP frontend crashes even with 10000 files allowed; it just takes longer: 3700+ open files after a few days.
Linking to relevant forum threads where forum members are experiencing this same issue:
Yeah, I ran into this during a rebalance; here are a couple of comments I can add to follow up on my forum thread:

Both SMB and the web UI seem to be getting worse as time goes on, so I'm wondering if the snapshots are making things worse - perhaps each one is still attempting to run or something? The day I started the balance, I don't recall having many issues navigating the web UI; it might have happened once or twice, but not enough to leave an impression. Today every page takes several attempts before it will load.

Update: This is a bit ugly, but it gives us a summary of which processes have the most open files (it omits anything with 200 or fewer):

```sh
lsof | awk 'BEGIN{print "command files"} NR!=1 {a[$1]++} END{for (i in a) if (a[i]>200) printf("%-10s %6.0f\n", i, a[i])}'
```
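For comparison, a rough alternative that counts descriptors straight from /proc rather than parsing lsof output (lsof also lists memory-mapped files and other non-fd entries, so its counts run high). Run as root; the 200 threshold mirrors the one-liner above:

```sh
# per-process open-fd counts, largest first, skipping anything at or below 200
for p in /proc/[0-9]*; do
    n=$(ls "$p/fd" 2>/dev/null | wc -l)
    [ "$n" -gt 200 ] && printf '%6d %s\n' "$n" "$(tr '\0' ' ' < "$p/cmdline" | cut -c1-60)"
done | sort -rn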
Balance finished; this is what the output looks like now: …

Curiously enough, none of those files actually appear to exist. I'm going to reboot it now; I'll post one more update in a minute.

Edit: the output from … I saved the full output of …
So, after a couple of attempts, I concluded that the web UI's reboot wasn't working: the "graceful shutdown" would run for 5 minutes, and then I'd log back in and be greeted with another "too many open files" error. Then I noticed that my SSH connection never closed, so I rebooted it from there. After the real reboot, this is what things look like:

The total is about 5k: …
Not sure if it's related, but just for completeness: after the reboot my Rock-ons service wouldn't start. After a bit of digging around, the solution here worked for me: https://forum.rockstor.com/t/docker-service-doesnt-start/1657

Plex seems to think it's a new server and that the old one is offline, with a second copy of all of my media, but I think that is just a Plex bug.

Update: actually, the Plex issue seems to be a side effect of none of my shares mounting. The data is all there, e.g. … Any ideas?

Second update: things work when I mount the shares manually. I am guessing that they failed to mount because I disabled quotas to make the rebalance not take weeks.
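A hedged sketch of that manual workaround, assuming a pool labeled mypool and a share named myshare (both placeholders; Rockstor mounts shares as btrfs subvolumes under /mnt2):

```sh
# mount one share (a btrfs subvolume) by hand
mount -o subvol=myshare /dev/disk/by-label/mypool /mnt2/myshare

# the quota workaround mentioned above; undo later with 'btrfs quota enable'
btrfs quota disable /mnt2/mypool
```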
The files do not exist because they have been deleted, but the file descriptors are still in memory, so the OS keeps the files open. You just can't see them when doing an ls.
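A quick shell demonstration of that point: the directory entry disappears on rm, but the inode stays alive as long as any descriptor references it.

```sh
exec 3<> /tmp/fd-demo     # open (and create) the file on descriptor 3
rm /tmp/fd-demo           # unlink it from the directory
ls -l /tmp/fd-demo        # "No such file or directory"
ls -l /proc/$$/fd/3       # still shows '/tmp/fd-demo (deleted)'
exec 3>&-                 # closing the descriptor finally frees the inode
```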
I've updated our dependency on gunicorn; it was severely outdated. If the file descriptor leak still exists after this change, we can debug a bit better.
Linking to another report by forum user smanley of "Too many open files":
I'm having this same issue with alarming frequency. Following the directions in the error message, I opened a support ticket for it several months ago, but evidently nobody looks at those. This issue has been open for 11 months now, with no fix. Is anybody working on this, or is the solution to reboot the system all the time?
Linking to another forum thread (with recent activity by member erisler) on "Too many open files":
Thank you @phillxnet for linking the above troubleshooting; I'll add here as directed. I poked around on the gunicorn GitHub for issues with file descriptors (benoitc/gunicorn#1428). Also this: http://carsonip.me/posts/gevent-pywsgi-http-keep-alive-fd-leak, which suggests reverse-proxying connections to pywsgi to ensure proper detection of closed HTTP connections.

Update and possible fix: added PR with a possible fix: #1934
Pass API calls through nginx to combat the ulimit issue. See my comments in this thread (rockstor#1656) and https://forum.rockstor.com/t/exception-while-running-command-usr-bin-hostnamectl-static-errno-24-too-many-open-files/2272/5
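For illustration, a minimal nginx sketch of the idea behind #1934; the backend port and location path are placeholders, not Rockstor's actual configuration:

```nginx
# Terminate client connections in nginx and proxy API calls to the
# gunicorn/pywsgi backend, so half-closed HTTP connections are cleaned
# up by nginx instead of leaking descriptors in the backend.
location /api/ {
    proxy_pass http://127.0.0.1:8000;
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}
```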
Linking to another report of this by forum member Sublevel4:
Along with #2020, has this been an issue since we moved to openSUSE? Do we still think that AFP might have been the underlying cause (which would then have been resolved after deprecating it)? I believe @phillxnet closed #1934 due to age back then.
@Hooverdan96 I'll close this as we haven't had a report of this on our openSUSE base and since we dropped AFP.
Hi,

In a production environment with about 20 shares with snapshots, we ran into the "too many open files" problem. Snapshots, shadow copy, and exportfs fail, and then the whole web UI crashes with an internal server error (500).

Created a file in /etc/security/limits.d/:

```sh
# cat 30-nofile.conf
root soft nofile 8192
root hard nofile 16364
```

(the default is 1024)
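To verify that the raised limits took effect (limits.d is applied by PAM at session start, so a re-login is needed; `<pid>` is a placeholder for the process to inspect):

```sh
ulimit -n                                  # soft limit; expect 8192
ulimit -Hn                                 # hard limit; expect 16364
grep 'Max open files' /proc/<pid>/limits   # limit of an already-running process
```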