
Limits not always working? #448

Open
sb10 opened this issue Jan 20, 2023 · 1 comment

sb10 commented Jan 20, 2023

It seems that limits can be bypassed, or at least that the reported number of running jobs can be wrong, possibly related to jobs failing and retrying. E.g.:

$ wr status -i re3.rsync_ibdx10_2  -r

# [...]
Limit groups: rsync_118; Priority: 0; Attempts: 1
Expected requirements: { memory: 16384MB; time: 2h30m38s; cpus: 1 disk: 0GB }
Status: running (started 23/1/20-13:09:32)
Stats of previous attempt: { Exit code: 0; Peak memory: 43MB; Peak disk: 0MB; Wall time: 72.860994ms; CPU time: 17.544ms }
Host: node-12-3-4 (IP: 10.160.12.43); Pid: 61607
+ 10 other commands with the same status

# [...]
Limit groups: rsync_118; Priority: 0; Attempts: 0
Expected requirements: { memory: 16384MB; time: 2h30m38s; cpus: 1 disk: 0GB }
Status: running (started 01/1/1-00:00:00)
+ 5 other commands with the same status

(The limit for rsync_118 was 16, yet as many as 20 were observed running on the status website.)
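
A minimal, hypothetical Go sketch (not wr's actual scheduler code) of one way this could happen: if the limit check and the "now running" increment are two separate steps, a burst of concurrently released jobs, such as retries of failed ones, can all pass the check before any increment lands, pushing the count past the limit.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// limitGroup is a hypothetical stand-in for a limit group; field names are illustrative.
type limitGroup struct {
	limit   int64
	running int64 // jobs currently counted as running against the limit
}

// tryStartRacy does the limit check and the increment as two separate steps,
// so concurrent callers can all pass the check before any increment lands.
func (g *limitGroup) tryStartRacy() bool {
	if atomic.LoadInt64(&g.running) >= g.limit {
		return false
	}
	atomic.AddInt64(&g.running, 1) // the check above may already be stale
	return true
}

func main() {
	g := &limitGroup{limit: 16}
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ { // a burst of jobs becoming ready at once
		wg.Add(1)
		go func() {
			defer wg.Done()
			g.tryStartRacy()
		}()
	}
	wg.Wait()
	fmt.Printf("limit %d, but %d jobs marked running\n",
		g.limit, atomic.LoadInt64(&g.running))
}
```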

sb10 added the bug label Jan 20, 2023

sb10 commented Jan 24, 2023

A related bug was also seen: the full limit was not being used, possibly because jobs had become lost? Restarting the manager fixed it.

This is just lost jobs using up the limit, because they are still officially running and so still count against it.
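
A hypothetical sketch (again not wr's implementation) of that effect: if the running count only drops when a completion event arrives, a job whose runner dies silently never releases its slot, and the group looks full until something like a manager restart reconciles the state.

```go
package main

import "fmt"

type jobState int

const (
	running jobState = iota
	complete
	lost // runner vanished; no completion event will ever arrive
)

// group is an illustrative limit group tracking jobs by name.
type group struct {
	limit int
	jobs  map[string]jobState
}

// slotsInUse counts every job not known to be complete, so lost jobs
// still occupy slots.
func (g *group) slotsInUse() int {
	n := 0
	for _, s := range g.jobs {
		if s != complete {
			n++
		}
	}
	return n
}

func (g *group) canStart() bool { return g.slotsInUse() < g.limit }

func main() {
	g := &group{limit: 16, jobs: map[string]jobState{}}
	for i := 0; i < 16; i++ {
		g.jobs[fmt.Sprintf("job%d", i)] = running
	}
	// Half the runners die without reporting back.
	for i := 0; i < 8; i++ {
		g.jobs[fmt.Sprintf("job%d", i)] = lost
	}
	// Only 8 jobs are really running, but the group appears full until the
	// lost jobs are reconciled (e.g. by restarting the manager).
	fmt.Printf("slots in use: %d/%d, can start new job: %v\n",
		g.slotsInUse(), g.limit, g.canStart())
}
```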
