Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Use HardwareResourcesDescription in ProcessConfiguration #47355

Conversation

makortel
Copy link
Contributor

@makortel makortel commented Feb 14, 2025

PR description:

This PR is part of #30044. It continues the work of #47175 and #47280 by replacing the never-used "PassID" in ProcessConfiguration with the HardwareResourcesDescription that is filled by ResourceInformation service.

The storage ProcessConfiguration doesn't technically change, as the "Pass ID" was stored as a string, and the HardwareResourcesDescription is serialized manually as a string. Empty string de-serializing as an empty HardwareResourcesDescription, and an empty HardwareResourcesDescription serializing as an empty string should guarantee backwards and forwards compatibility (even if the latter is not formally supported, but just in case future Run3 MC production would need it), also for purposes of ProcessHistory and its ID.

The next PR in this series will improve the printouts of edmProvDump.

Resolves cms-sw/framework-team#1185

PR validation:

Framework unit tests pass

If this PR is a backport please specify the original PR and why you need to backport that PR. If this PR will be backported please specify to which release cycle the backport is meant for:

To be backported to 15_0_X

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 14, 2025

cms-bot internal usage

@@ -116,8 +126,9 @@ std::ostream& operator<<(std::ostream& os, edm::ProcessHistory& iHist) {
std::string const indentDelta(" ");
std::string indent = indentDelta;
for (auto const& process : iHist) {
os << indent << process.processName() << " '" << process.passID() << "' '" << process.releaseVersion() << "' ("
<< process.parameterSetID() << ")" << std::endl;
// TODO: add printout of HardwareResourcesDescription
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Next PR will address this TODO

@cmsbuild
Copy link
Contributor

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @makortel for master.

It involves the following packages:

  • DataFormats/Provenance (core)
  • FWCore/AbstractServices (core)
  • FWCore/Framework (core)
  • FWCore/Integration (core)
  • FWCore/Sources (core)
  • FWCore/TestProcessor (core)
  • IOPool/Common (core)
  • IOPool/Input (core)
  • IOPool/SecondaryInput (core)
  • IOPool/Streamer (core)

@Dr15Jones, @cmsbuild, @makortel, @smuzaffar can you please review it and eventually sign? Thanks.
@missirol, @mmusich, @rovere, @wddgit this is something you requested to watch as well.
@antoniovilela, @mandrenguyen, @rappoccio, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

@makortel
Copy link
Contributor Author

enable gpu

@makortel
Copy link
Contributor Author

@cmsbuild, please test

@cmsbuild
Copy link
Contributor

-1

Failed Tests: Build
Size: This PR adds an extra 256KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-3a0d8d/44398/summary.html
COMMIT: ec658cd
CMSSW: CMSSW_15_1_X_2025-02-14-1100/el8_amd64_gcc12
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/47355/44398/install.sh to create a dev area with all the needed externals and cmssw changes.

Build

I found compilation error when building:

>> Compiling edm plugin src/DQMServices/FwkIO/plugins/DQMRootOutputModule.cc
/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/gcc/12.3.1-40d504be6370b5a30e3947a6e575ca28/bin/c++ -c -DCMS_MICRO_ARCH='x86-64-v3' -DGNU_GCC -D_GNU_SOURCE -DTBB_USE_GLIBCXX_VERSION=120301 -DTBB_SUPPRESS_DEPRECATED_MESSAGES -DTBB_PREVIEW_RESUMABLE_TASKS=1 -DTBB_PREVIEW_TASK_GROUP_EXTENSIONS=1 -DBOOST_SPIRIT_THREADSAFE -DPHOENIX_THREADSAFE -DBOOST_MATH_DISABLE_STD_FPCLASSIFY -DBOOST_UUID_RANDOM_PROVIDER_FORCE_POSIX -D__STDC_LIMIT_MACROS -D__STDC_FORMAT_MACROS -DCMSSW_GIT_HASH='CMSSW_15_1_X_2025-02-14-1100' -DPROJECT_NAME='CMSSW' -DPROJECT_VERSION='CMSSW_15_1_X_2025-02-14-1100' -Isrc -Ipoison -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_15_1_X_2025-02-14-1100/src -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/classlib/3.1.3-2827e9dbeb7dd2aced5a46b8a966dfab/include -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/pcre/8.43-2d141998cfe5424b8f7aff48035cc2da/include -isystem/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/boost/1.80.0-96a02191111b66819e07de179cb25a0e/include -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/bz2lib/1.0.6-d065ccd79984efc6d4660f410e4c81de/include -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/libuuid/2.34-27ce4c3579b5b1de2808ea9c4cd8ed29/include -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/protobuf/3.21.9-3b4dc3892e5c8da1829fdf0bfe570820/include -isystem/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/lcg/root/6.32.09-0a945d6e24dcaabe218b38fa8292785d/include -isystem/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/tbb/v2021.9.0-9b0f135342cfe979b6500c59f501774e/include -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/xz/5.2.5-6f3f49b07db84e10c9be594a1176c114/include -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/zlib/1.2.13-d217cdbdd8d586e845e05946de2796be/include -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/fmt/10.2.1-e35fd1db5eb3abc8ac0452e8ee427196/include -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/md5/1.0.0-5b594b264e04ae51e893b1d69a797ec6/include -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/tinyxml2/6.2.0-f99ae2781d074227d47e8a3e7c8ec87e/include -O3 -pthread -pipe -Werror=main -Werror=pointer-arith -Werror=overlength-strings -Wno-vla -Werror=overflow -std=c++20 -ftree-vectorize -Werror=array-bounds -Werror=format-contains-nul -Werror=type-limits -fvisibility-inlines-hidden -fno-math-errno --param vect-max-version-for-alias-checks=50 -Xassembler --compress-debug-sections -Wno-error=array-bounds -Warray-bounds -fuse-ld=bfd -march=x86-64-v3 -felide-constructors -fmessage-length=0 -Wall -Wno-non-template-friend -Wno-long-long -Wreturn-type -Wextra -Wpessimizing-move -Wclass-memaccess -Wno-cast-function-type -Wno-unused-but-set-parameter -Wno-ignored-qualifiers -Wno-unused-parameter -Wunused -Wparentheses -Werror=return-type -Werror=missing-braces -Werror=unused-value -Werror=unused-label -Werror=address -Werror=format -Werror=sign-compare -Werror=write-strings -Werror=delete-non-virtual-dtor -Werror=strict-aliasing -Werror=narrowing -Werror=unused-but-set-variable -Werror=reorder -Werror=unused-variable -Werror=conversion-null -Werror=return-local-addr -Wnon-virtual-dtor -Werror=switch -fdiagnostics-show-option -Wno-unused-local-typedefs -Wno-attributes -Wno-psabi -Wno-error=unused-variable -DBOOST_DISABLE_ASSERTS -flto=auto -fipa-icf -flto-odr-type-merging -fno-fat-lto-objects -Wodr -fPIC -MMD -MF tmp/el8_amd64_gcc12/src/DQMServices/FwkIO/plugins/DQMServicesFwkIOPlugins/DQMRootOutputModule.cc.d src/DQMServices/FwkIO/plugins/DQMRootOutputModule.cc -o tmp/el8_amd64_gcc12/src/DQMServices/FwkIO/plugins/DQMServicesFwkIOPlugins/DQMRootOutputModule.cc.o
>> Compiling edm plugin src/DQMServices/FwkIO/plugins/DQMRootSource.cc
/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/gcc/12.3.1-40d504be6370b5a30e3947a6e575ca28/bin/c++ -c -DCMS_MICRO_ARCH='x86-64-v3' -DGNU_GCC -D_GNU_SOURCE -DTBB_USE_GLIBCXX_VERSION=120301 -DTBB_SUPPRESS_DEPRECATED_MESSAGES -DTBB_PREVIEW_RESUMABLE_TASKS=1 -DTBB_PREVIEW_TASK_GROUP_EXTENSIONS=1 -DBOOST_SPIRIT_THREADSAFE -DPHOENIX_THREADSAFE -DBOOST_MATH_DISABLE_STD_FPCLASSIFY -DBOOST_UUID_RANDOM_PROVIDER_FORCE_POSIX -D__STDC_LIMIT_MACROS -D__STDC_FORMAT_MACROS -DCMSSW_GIT_HASH='CMSSW_15_1_X_2025-02-14-1100' -DPROJECT_NAME='CMSSW' -DPROJECT_VERSION='CMSSW_15_1_X_2025-02-14-1100' -Isrc -Ipoison -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_15_1_X_2025-02-14-1100/src -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/classlib/3.1.3-2827e9dbeb7dd2aced5a46b8a966dfab/include -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/pcre/8.43-2d141998cfe5424b8f7aff48035cc2da/include -isystem/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/boost/1.80.0-96a02191111b66819e07de179cb25a0e/include -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/bz2lib/1.0.6-d065ccd79984efc6d4660f410e4c81de/include -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/libuuid/2.34-27ce4c3579b5b1de2808ea9c4cd8ed29/include -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/protobuf/3.21.9-3b4dc3892e5c8da1829fdf0bfe570820/include -isystem/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/lcg/root/6.32.09-0a945d6e24dcaabe218b38fa8292785d/include -isystem/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/tbb/v2021.9.0-9b0f135342cfe979b6500c59f501774e/include -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/xz/5.2.5-6f3f49b07db84e10c9be594a1176c114/include -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/zlib/1.2.13-d217cdbdd8d586e845e05946de2796be/include -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/fmt/10.2.1-e35fd1db5eb3abc8ac0452e8ee427196/include -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/md5/1.0.0-5b594b264e04ae51e893b1d69a797ec6/include -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/tinyxml2/6.2.0-f99ae2781d074227d47e8a3e7c8ec87e/include -O3 -pthread -pipe -Werror=main -Werror=pointer-arith -Werror=overlength-strings -Wno-vla -Werror=overflow -std=c++20 -ftree-vectorize -Werror=array-bounds -Werror=format-contains-nul -Werror=type-limits -fvisibility-inlines-hidden -fno-math-errno --param vect-max-version-for-alias-checks=50 -Xassembler --compress-debug-sections -Wno-error=array-bounds -Warray-bounds -fuse-ld=bfd -march=x86-64-v3 -felide-constructors -fmessage-length=0 -Wall -Wno-non-template-friend -Wno-long-long -Wreturn-type -Wextra -Wpessimizing-move -Wclass-memaccess -Wno-cast-function-type -Wno-unused-but-set-parameter -Wno-ignored-qualifiers -Wno-unused-parameter -Wunused -Wparentheses -Werror=return-type -Werror=missing-braces -Werror=unused-value -Werror=unused-label -Werror=address -Werror=format -Werror=sign-compare -Werror=write-strings -Werror=delete-non-virtual-dtor -Werror=strict-aliasing -Werror=narrowing -Werror=unused-but-set-variable -Werror=reorder -Werror=unused-variable -Werror=conversion-null -Werror=return-local-addr -Wnon-virtual-dtor -Werror=switch -fdiagnostics-show-option -Wno-unused-local-typedefs -Wno-attributes -Wno-psabi -Wno-error=unused-variable -DBOOST_DISABLE_ASSERTS -flto=auto -fipa-icf -flto-odr-type-merging -fno-fat-lto-objects -Wodr -fPIC -MMD -MF tmp/el8_amd64_gcc12/src/DQMServices/FwkIO/plugins/DQMServicesFwkIOPlugins/DQMRootSource.cc.d src/DQMServices/FwkIO/plugins/DQMRootSource.cc -o tmp/el8_amd64_gcc12/src/DQMServices/FwkIO/plugins/DQMServicesFwkIOPlugins/DQMRootSource.cc.o
src/DQMServices/FwkIO/plugins/DQMRootOutputModule.cc: In member function 'void DQMRootOutputModule::startEndFile()':
src/DQMServices/FwkIO/plugins/DQMRootOutputModule.cc:545:22: error: 'const class edm::ProcessConfiguration' has no member named 'passID'; did you mean 'std::string edm::ProcessConfiguration::passID_'? (not accessible from this context)
  545 |       passID = itPC->passID();
      |                      ^~~~~~
In file included from src/DataFormats/Provenance/interface/ProcessHistory.h:4,
                 from src/DataFormats/Provenance/interface/StableProvenance.h:13,
                 from src/DataFormats/Provenance/interface/Provenance.h:13,


@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-47355/43703

@cmsbuild
Copy link
Contributor

Pull request #47355 was updated. @Dr15Jones, @antoniovagnerini, @cmsbuild, @makortel, @rseidita, @smuzaffar can you please check and sign again.

@makortel
Copy link
Contributor Author

CPU comparison differences are related to #47071

GPU comparison differences look compatible with the non-reproducibilities in the pixel code, except there is a new one (or a rediscovery of an old one) in ParticleFlow in 12834.42[23] workflows for which I opened a new issue #47406

@makortel
Copy link
Contributor Author

+core

@civanch
Copy link
Contributor

civanch commented Feb 20, 2025

+1

@makortel
Copy link
Contributor Author

@cms-sw/dqm-l2 @cms-sw/generators-l2 Could you review and sign, please? Thanks!

@antoniovagnerini
Copy link

+dqm

@makortel
Copy link
Contributor Author

@cms-sw/generators-l2 Could you review and sign, please? I'd like to get the backport (#47416, which you signed already) to 15_0_0 if possible. Thanks!

@lviliani
Copy link
Contributor

+1

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @sextonkennedy, @rappoccio, @antoniovilela, @mandrenguyen (and backports should be raised in the release meeting by the corresponding L2)

@mandrenguyen
Copy link
Contributor

+1

@cmsbuild cmsbuild merged commit fdd03a5 into cms-sw:master Feb 26, 2025
14 checks passed
@makortel makortel deleted the processConfigurationHardwareResourcesDescription branch February 26, 2025 14:07
os << "," << *it;
}
}
os << ")" << std::endl;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@makortel , crab uses edmProvDump to find the hash and it is sensitive to changes here e.g. crab tests for 15.1.X now fail. Should we add a comment here to make sure any future changes keep this in mind

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

even better, avoid this fragility. What could be done instead to get the PSet hash ? CRAB uses it to enforce that users can't add to same dataset files produced with modified PSet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @belforte, we need a better (or, more stable) solution for CRAB. (e.g. I'm already working to change the edmProvDump even more, that would break the regex in question completely).

I need to first understand what exactly CRAB needs. Do I understand correctly that it is the hash of the PSet of the Process that produced the file?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @makortel . I believe that that's what we want. But I have to admit ignorance here. Possibly these lines explain better what CRAB looks for https://github.com/dmwm/CRABServer/blob/c51853ef588bea966d86017518f4136d25f5a2a0/scripts/CMSRunAnalysis.py#L471-L487 . Or maybe Eric Vaandering remembers how he picked that implementation back at the time.
I can not say if it is only about the PSet or it includes processing history or CMSSW version... all I know is that when this changes we force output files to be in a different DBS dataset.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding CRAB DevOps @aspiringmind-code and @sinonkt who will follow up on needed changes on CRAB side

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feel free to move more discussion out of this close PR. We already have a CRAB issue dmwm/CRABServer#8953

# for free to join this conversation on GitHub. Already have an account? # to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Replace PassID with HardwareResourcesDescription in ProcessConfiguration
9 participants