
Limits not always working? #448

Open
sb10 opened this issue Jan 20, 2023 · 1 comment

sb10 commented Jan 20, 2023

It seems that limits can be bypassed, or at least that the reported number of running jobs can be wrong, possibly related to jobs failing and retrying. E.g.:

$ wr status -i re3.rsync_ibdx10_2  -r

# [...]
Limit groups: rsync_118; Priority: 0; Attempts: 1
Expected requirements: { memory: 16384MB; time: 2h30m38s; cpus: 1 disk: 0GB }
Status: running (started 23/1/20-13:09:32)
Stats of previous attempt: { Exit code: 0; Peak memory: 43MB; Peak disk: 0MB; Wall time: 72.860994ms; CPU time: 17.544ms }
Host: node-12-3-4 (IP: 10.160.12.43); Pid: 61607
+ 10 other commands with the same status

# [...]
Limit groups: rsync_118; Priority: 0; Attempts: 0
Expected requirements: { memory: 16384MB; time: 2h30m38s; cpus: 1 disk: 0GB }
Status: running (started 01/1/1-00:00:00)
+ 5 other commands with the same status

(The limit for rsync_118 was 16, yet as many as 20 were observed running on the status website.)
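
A minimal, hypothetical Go sketch (not wr's actual scheduler code) of one way this could happen: if the limit check and the "now running" increment are two separate steps, a burst of concurrently released jobs, such as retries of failed ones, can all pass the check before any increment lands, pushing the count past the limit.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// limitGroup is a hypothetical stand-in for a limit group; field names are illustrative.
type limitGroup struct {
	limit   int64
	running int64 // jobs currently counted as running against the limit
}

// tryStartRacy does the limit check and the increment as two separate steps,
// so concurrent callers can all pass the check before any increment lands.
func (g *limitGroup) tryStartRacy() bool {
	if atomic.LoadInt64(&g.running) >= g.limit {
		return false
	}
	atomic.AddInt64(&g.running, 1) // the check above may already be stale
	return true
}

func main() {
	g := &limitGroup{limit: 16}
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ { // a burst of jobs becoming ready at once
		wg.Add(1)
		go func() {
			defer wg.Done()
			g.tryStartRacy()
		}()
	}
	wg.Wait()
	fmt.Printf("limit %d, but %d jobs marked running\n",
		g.limit, atomic.LoadInt64(&g.running))
}
```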

sb10 added the bug label Jan 20, 2023

sb10 commented Jan 24, 2023

A related bug was also seen: the full limit was not being used, possibly because jobs had become lost? Restarting the manager fixed it.

This is just lost jobs using up the limit, because they are still officially running and so still count against it.
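
A hypothetical sketch (again not wr's implementation) of that effect: if the running count only drops when a completion event arrives, a job whose runner dies silently never releases its slot, and the group looks full until something like a manager restart reconciles the state.

```go
package main

import "fmt"

type jobState int

const (
	running jobState = iota
	complete
	lost // runner vanished; no completion event will ever arrive
)

// group is an illustrative limit group tracking jobs by name.
type group struct {
	limit int
	jobs  map[string]jobState
}

// slotsInUse counts every job not known to be complete, so lost jobs
// still occupy slots.
func (g *group) slotsInUse() int {
	n := 0
	for _, s := range g.jobs {
		if s != complete {
			n++
		}
	}
	return n
}

func (g *group) canStart() bool { return g.slotsInUse() < g.limit }

func main() {
	g := &group{limit: 16, jobs: map[string]jobState{}}
	for i := 0; i < 16; i++ {
		g.jobs[fmt.Sprintf("job%d", i)] = running
	}
	// Half the runners die without reporting back.
	for i := 0; i < 8; i++ {
		g.jobs[fmt.Sprintf("job%d", i)] = lost
	}
	// Only 8 jobs are really running, but the group appears full until the
	// lost jobs are reconciled (e.g. by restarting the manager).
	fmt.Printf("slots in use: %d/%d, can start new job: %v\n",
		g.slotsInUse(), g.limit, g.canStart())
}
```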
