Adding PsetHash from Edmprovdump fails with CMSSW_15_1 #8953
Comments
see also discussion in cms-sw/cmssw#47355 (comment) |
I want to have a better and more stable way to communicate what CRAB needs. To me the framework job report XML would sound a good way to convey the necessary information (rather than parsing output of any additional script). Would you agree? I'd also want to understand better if it is really the PSet hash that is needed here, or e.g. reduced process history ID (which is what the framework internally uses for the segregation of different processing histories). In particular I'd advise against making any quick patches, because I'll change the |
Thanks Matti, I agree that FJR is preferable, even if we may have to make the change in WMCore code https://github.com/dmwm/WMCore/blob/master/src/python/WMCore/FwkJobReport/Report.py As to "what is needed", I'd say that the use case is simple, even if possibly not technically defined: limit the possibilities for users to shoot themselves in the foot and come complaining to us about obscure problems later. That said, we need guidance from FWK Core developers on what to check. |
So if I understood the present situation correctly, CRAB uses the PSet hash as part of the dataset name published in DBS as described in https://twiki.cern.ch/twiki/bin/view/CMSPublic/Crab3DataHandling#Output_dataset_names_in_DBS, but not for anything else. I saw references to "PSet hash" in the WMCore repository as well, but it was unclear to me if WMCore uses the PSet hash only to fill the field in DBS or also for something else, and from where WMCore gets the values. Listing the various options of "IDs"
The CMSSW framework uses the 3) to segregate data. But given that a single file can contain events with different ProcessHistory IDs (e.g. merged MiniAOD files that contain data taken with different HLT configurations), that doesn't seem like a good fit here. Between 1 and 2 the question is then if CRAB should use the CMSSW major/minor version to segregate datasets. For complex PSets there is a good chance that some component added a new (tracked) parameter or a value of some existing parameter between CMSSW major or minor versions (e.g. between |
Thanks @makortel Do I understand correctly that current edmProvDump prints out ID number OTOH, if you are going to put this in the FJR XML, which is already a very large file, maybe you can write all three possibilities and free yourself :-) [1] [2] |
Correct.
Ok.
That would indeed be a possibility (except maybe for 3, for which there may be multiple values). |
Indeed it sounds like 3. is only useful for the framework. One last point. I presume that we will keep using edmProvDump for up to CMSSW_14 (and maybe 15_0 ?) and switch to new way from 15_1 onward, right ? |
You can continue using Fate of 15_0_X is a good question. Until this issue I was planning to backport the (present and upcoming) |
it still seems a simple So we can decouple our and your timelines. |
I'd rather not use One option for a transition would be that CRAB would first try to use the PSet hash from the framework job report, and if it isn't there, then use |
Sorry for late reply @makortel
@aspiringmind-code @sinonkt I think it is better if you decide who among you is going to do 2. and use the time until 1. is ready to understand where changes will have to be done. |
We are planning to proceed with the following addition to the FJR XML file (under the <Process>
<Name>PROD1</Name>
<ReducedConfigurationID>783c5ab60bfe18ac97262801adef9de4</ReducedConfigurationID>
<ParameterSetID>1f94f8e1ac0ae9d7dade1c99cc5862b3</ParameterSetID>
</Process> The |
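For whoever picks up the CRAB side, reading these values back could look roughly like the sketch below. It is only an illustration: the `<Process>` block is copied from the comment above, while the `<FrameworkJobReport>` wrapper, the `SAMPLE_FJR` string and the `read_process_ids` helper are assumptions made for the example, not existing CRAB or WMCore code.

```python
import xml.etree.ElementTree as ET

# Hypothetical FJR fragment: only the <Process> block comes from the comment
# above; the <FrameworkJobReport> wrapper element is an assumption.
SAMPLE_FJR = """<FrameworkJobReport>
  <Process>
    <Name>PROD1</Name>
    <ReducedConfigurationID>783c5ab60bfe18ac97262801adef9de4</ReducedConfigurationID>
    <ParameterSetID>1f94f8e1ac0ae9d7dade1c99cc5862b3</ParameterSetID>
  </Process>
</FrameworkJobReport>"""

ID_TAGS = ("ReducedConfigurationID", "ParameterSetID")


def read_process_ids(fjr_xml):
    """Map each process name to the IDs found in its <Process> element."""
    root = ET.fromstring(fjr_xml)
    result = {}
    # iter() finds <Process> at any depth, so the exact parent element
    # does not need to be known in advance.
    for proc in root.iter("Process"):
        name = proc.findtext("Name")
        if name is None:
            continue
        # A missing or empty tag is reported as None rather than "".
        result[name.strip()] = {
            tag: (proc.findtext(tag) or "").strip() or None for tag in ID_TAGS
        }
    return result


ids = read_process_ids(SAMPLE_FJR)
print(ids["PROD1"]["ParameterSetID"])
```

Returning `None` for a missing tag lets the caller tell "FJR from an older release without these fields" apart from a real value, which is what a transition period would need.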
See here
This line assumes that the hash value will be at the end of the line dumped by edmProvDump. The tests therefore fail. A robust way to get the hash value should be worked out.
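One position-independent way to pick the hash out of a line is sketched below. It assumes the hash is a 32-character lowercase hexadecimal string (as the IDs quoted elsewhere in this thread are); the sample lines and the `find_hashes` helper are made up for illustration and do not reproduce the real edmProvDump output format.

```python
import re

# Assumption: the hash is exactly 32 lowercase hex characters.
# The \b anchors reject longer hex runs, so a 40-character token
# is not mistaken for a hash.
HASH_RE = re.compile(r"\b[0-9a-f]{32}\b")


def find_hashes(line):
    """Return every 32-character hex token in the line, wherever it sits."""
    return HASH_RE.findall(line)


# Hypothetical lines: the hash at the end, and the hash followed by more text.
old_style = "some label 1f94f8e1ac0ae9d7dade1c99cc5862b3"
new_style = "some label 1f94f8e1ac0ae9d7dade1c99cc5862b3 (more text)"

print(find_hashes(old_style))
print(find_hashes(new_style))
```

Matching on the shape of the token instead of its position survives trailing text, but it cannot tell which ID is which if a line ever carries more than one, so the caller should check that exactly one match comes back.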