Canary tw implementation #8808

Merged: 15 commits, Nov 28, 2024


aspiringmind-code
Contributor

Implementation of canary TW for #8806
Also see corresponding changes in https://gitlab.cern.ch/ai/it-puppet-hostgroup-vocmsglidein/-/merge_requests/292

@cmsdmwmbot

Jenkins results:

  • Python3 Pylint check: succeeded
    • 59 comments to review
  • Pycodestyle check: succeeded
    • 65 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-CRABServer-PR-test/2290/artifact/artifacts/PullRequestReport.html

Member

@belforte belforte left a comment

Logic looks good. But as usual I'd like to have a few one-line comments which explain the logic to the unaware reader.
Please test on preprod first.

@cmsdmwmbot

Jenkins results:

  • Python3 Pylint check: succeeded
    • 59 comments to review
  • Pycodestyle check: succeeded
    • 65 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-CRABServer-PR-test/2291/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Pylint check: succeeded
    • 59 comments to review
  • Pycodestyle check: succeeded
    • 65 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-CRABServer-PR-test/2294/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Pylint check: succeeded
    • 59 comments to review
  • Pycodestyle check: succeeded
    • 65 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-CRABServer-PR-test/2295/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Pylint check: succeeded
    • 59 comments to review
  • Pycodestyle check: succeeded
    • 65 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-CRABServer-PR-test/2296/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Pylint check: succeeded
    • 59 comments to review
  • Pycodestyle check: succeeded
    • 65 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-CRABServer-PR-test/2297/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Pylint check: succeeded
    • 59 comments to review
  • Pycodestyle check: succeeded
    • 65 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-CRABServer-PR-test/2298/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Pylint check: succeeded
    • 60 comments to review
  • Pycodestyle check: succeeded
    • 66 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-CRABServer-PR-test/2299/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Pylint check: succeeded
    • 60 comments to review
  • Pycodestyle check: succeeded
    • 66 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-CRABServer-PR-test/2303/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Pylint check: succeeded
    • 60 comments to review
  • Pycodestyle check: succeeded
    • 68 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-CRABServer-PR-test/2304/artifact/artifacts/PullRequestReport.html

@aspiringmind-code
Contributor Author

@belforte Tested on preprod. See task
The task was submitted on crab-preprod-tw01

crab3@crab-preprod-tw01:/data/srv/TaskManager/logs$ grep -C 5 "vchakrav" twlog.txt
2024-11-28 11:21:58,387:INFO:MasterWorker,255:Starting external scheduling.
2024-11-28 11:21:58,471:INFO:TaskUtils,25:Retrieved a total of 1 WAITING tasks
2024-11-28 11:21:58,471:INFO:MasterWorker,314:
Username | Waiting | Selected
---------|---|--
vchakrav | 1 | 1
2024-11-28 11:21:58,472:INFO:TaskUtils,55:Will set to NEW task 241128_102130:vchakrav_crab_20241128_112126
2024-11-28 11:21:58,502:INFO:MasterWorker,352:Task 241128_102130:vchakrav_crab_20241128_112126 status updated to 'NEW'.
2024-11-28 11:21:58,502:INFO:MasterWorker,331:Pruning the queue if required...logic tbd
2024-11-28 11:21:58,502:INFO:MasterWorker,334:Report Queue status... logic tbd
2024-11-28 11:21:58,502:INFO:MasterWorker,574:Work selected successfully.
2024-11-28 11:21:58,622:INFO:MasterWorker,428:TW changed from crab-preprod-tw01 to crab-preprod-tw02 during runCanary
2024-11-28 11:21:58,709:DEBUG:Worker,253:Ready to inject 0 items

And the task ran on canary crab-preprod-tw02

2024-11-28 11:23:27,505:DEBUG:Worker,284:Completed work 14 on 241128_102130:vchakrav_crab_20241128_112126
2024-11-28 11:23:27,505:INFO:MasterWorker,579:This is canary TW crab-preprod-tw02 running.

For the test we set canary_fraction to 1 so that the task is certain to go to the canary. By default it will be set to a small value such as 0.05 in the master TW config.
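
For readers unfamiliar with the idea, the effect of canary_fraction can be sketched as a random draw per task. This is only an illustration and not the code in this PR: the function name `choose_tw` and its parameters are hypothetical, and it assumes canary_fraction is a probability between 0 and 1.

```python
import random


def choose_tw(master_name, canary_name, canary_fraction, rng=random):
    """Return the name of the TW that should process a task.

    Hypothetical helper for illustration only; the names and the
    selection rule are assumptions, not taken from the PR.
    """
    if not 0.0 <= canary_fraction <= 1.0:
        raise ValueError("canary_fraction must be between 0 and 1")
    # rng.random() returns a float in [0.0, 1.0), so a fraction of 1
    # always selects the canary and a fraction of 0 never does.
    if rng.random() < canary_fraction:
        return canary_name
    return master_name


if __name__ == "__main__":
    # With canary_fraction=1, as in the preprod test, every task goes to the canary.
    print(choose_tw("crab-preprod-tw01", "crab-preprod-tw02", 1))
```

With a fraction of 0.05, roughly one task in twenty would be handed to the canary under this scheme.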

@belforte
Member

I suggest deploying it in preprod with a 30% fraction and, on Monday, enabling it at a low rate in prod.

@aspiringmind-code
Contributor Author

Preprod is set to 0.3 in here. OK to merge?

Member

@belforte belforte left a comment

I am afraid that the polling cycle is disrupted in the canary (at least, it runs without the sleep). Please look into it, and in any case check what is happening.
I also made a few "style" comments aimed at making the code cleaner and more readable; please apply them if you agree.
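
One way to guard against a skipped sleep is to make the wait unconditional, so that no early return inside a cycle can bypass it. The sketch below is not the MasterWorker code: `run_polling`, `poll_once`, and `polling_seconds` are hypothetical names, and it assumes a cycle is a single callable.

```python
import time


def run_polling(poll_once, cycles, polling_seconds, sleep=time.sleep):
    """Call poll_once() `cycles` times, always sleeping after each call.

    Hypothetical illustration: the sleep sits in a `finally` clause, so an
    early return (or an exception) inside poll_once cannot skip it.
    """
    for _ in range(cycles):
        try:
            poll_once()
        finally:
            sleep(polling_seconds)


if __name__ == "__main__":
    def poll_once():
        # Stand-in for one cycle; a canary hand-off would return early here.
        print("polling cycle")

    run_polling(poll_once, cycles=2, polling_seconds=0.1)
```

Passing `sleep` as a parameter makes the loop easy to check without real waiting, for example by recording the requested delays in a list.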

@cmsdmwmbot

Jenkins results:

  • Python3 Pylint check: succeeded
    • 59 comments to review
  • Pycodestyle check: succeeded
    • 67 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-CRABServer-PR-test/2305/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Python3 Pylint check: succeeded
    • 58 comments to review
  • Pycodestyle check: succeeded
    • 65 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-CRABServer-PR-test/2306/artifact/artifacts/PullRequestReport.html

@aspiringmind-code aspiringmind-code merged commit ddd6ce1 into dmwm:master Nov 28, 2024
2 checks passed