[15_0_X] Use HardwareResourcesDescription in ProcessConfiguration #47416

makortel
Contributor

@makortel makortel commented Feb 20, 2025

PR description:

Backport of #47280, #47355, #47473 (backported by dropping the commit that was reverted there), and #47477 (whose commit is split in two: one part replaces the commit dropped in the previous step, and the rest is included in the last commit of this PR).

Resolves cms-sw/framework-team#1248

PR validation:

Code compiles (plus the tests in #47280 and #47355)

If this PR is a backport, please specify the original PR and why you need to backport it. If this PR will be backported, please specify the release cycle the backport is meant for:

Backport of #47280 and #47355

@cmsbuild
Contributor

A new Pull Request was created by @makortel for CMSSW_15_0_X.

It involves the following packages:

  • DQM/SiStripMonitorHardware (dqm)
  • DQMServices/FwkIO (dqm)
  • DataFormats/Provenance (core)
  • FWCore/AbstractServices (core)
  • FWCore/Framework (core)
  • FWCore/Integration (core)
  • FWCore/Services (core)
  • FWCore/Sources (core)
  • FWCore/TestProcessor (core)
  • FWCore/Utilities (core)
  • GeneratorInterface/LHEInterface (generators)
  • HeterogeneousCore/CUDAServices (heterogeneous)
  • HeterogeneousCore/ROCmServices (heterogeneous)
  • IOPool/Common (core)
  • IOPool/Input (core)
  • IOPool/SecondaryInput (core)
  • IOPool/Streamer (core)
  • Mixing/Base (simulation)
  • PhysicsTools/PyTorch (ml)
  • PhysicsTools/TensorFlow (ml)

@Dr15Jones, @antoniovagnerini, @bbilin, @civanch, @cmsbuild, @fwyzard, @kpedro88, @lviliani, @makortel, @mdhildreth, @menglu21, @mkirsano, @rseidita, @smuzaffar, @valsdav, @y19y19 can you please review it and eventually sign? Thanks.
@alberto-sanchez, @arossi83, @barvic, @fabiocos, @felicepantaleo, @fioriNTU, @fwyzard, @idebruyn, @jandrea, @missirol, @mkirsano, @mmusich, @richa2710, @riga, @rovere, @sroychow, @threus, @wddgit this is something you requested to watch as well.
@antoniovilela, @mandrenguyen, @rappoccio, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

@cmsbuild
Contributor

cmsbuild commented Feb 20, 2025

cms-bot internal usage

@makortel
Contributor Author

enable gpu

@makortel
Contributor Author

@cmsbuild, please test

@cmsbuild
Contributor

+1

Size: This PR adds an extra 220KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9d1efb/44545/summary.html
COMMIT: c8326da
CMSSW: CMSSW_15_0_X_2025-02-20-1100/el8_amd64_gcc12
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/47416/44545/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 1 line to the logs
  • Reco comparison results: 9 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 4018889
  • DQMHistoTests: Total failures: 68
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 4018801
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB (49 files compared)
  • Checked 218 log files, 189 edm output root files, 50 DQM output files
  • TriggerResults: no differences found

GPU Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 24 differences found in the comparisons
  • DQMHistoTests: Total files compared: 7
  • DQMHistoTests: Total histograms compared: 53071
  • DQMHistoTests: Total failures: 869
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 52202
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB (6 files compared)
  • Checked 24 log files, 30 edm output root files, 7 DQM output files
  • TriggerResults: no differences found

@makortel
Contributor Author

CPU comparison differences are related to #47071

GPU comparison differences look compatible with the non-reproducibilities in the pixel code

@cmsbuild
Contributor

Pull request #47416 was updated. @Dr15Jones, @cmsbuild, @makortel, @smuzaffar, @valsdav, @y19y19 can you please check and sign again.

@makortel
Contributor Author

@cmsbuild, please test

@makortel
Contributor Author

OK, this PR now corresponds to the state of the master branch (with a slightly different history), and should work for both CRAB and the unit tests.

@cmsbuild
Contributor

+1

Size: This PR adds an extra 20KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9d1efb/44754/summary.html
COMMIT: 4c6ccf4
CMSSW: CMSSW_15_0_X_2025-02-28-1100/el8_amd64_gcc12
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/47416/44754/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

GPU Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 24 differences found in the comparisons
  • DQMHistoTests: Total files compared: 7
  • DQMHistoTests: Total histograms compared: 53071
  • DQMHistoTests: Total failures: 877
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 52194
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB (6 files compared)
  • Checked 24 log files, 30 edm output root files, 7 DQM output files
  • TriggerResults: no differences found

@valsdav
Contributor

valsdav commented Mar 3, 2025

+ml

@makortel
Contributor Author

makortel commented Mar 5, 2025

CPU comparison differences are related to #39803 and #47071

GPU comparison differences seem compatible with pixel code non-reproducibilities, and #47406

@makortel
Contributor Author

makortel commented Mar 5, 2025

Tests in 15_1_X IBs have not revealed any new problems, so this should be good to go now.

@makortel
Contributor Author

makortel commented Mar 5, 2025

+core

@mandrenguyen
Contributor

unhold

@cmsbuild
Contributor

cmsbuild commented Mar 6, 2025

This pull request is fully signed and it will be integrated in one of the next CMSSW_15_0_X IBs (tests are also fine) and once validation in the development release cycle CMSSW_15_1_X is complete. This pull request will now be reviewed by the release team before it's merged. @antoniovilela, @sextonkennedy, @rappoccio, @mandrenguyen (and backports should be raised in the release meeting by the corresponding L2)

@mandrenguyen
Contributor

+1

@cmsbuild cmsbuild merged commit 9a16b11 into cms-sw:CMSSW_15_0_X Mar 6, 2025
12 checks passed
@makortel makortel deleted the processConfigurationHardwareResourcesDescription_150x branch March 6, 2025 14:32

7 participants