Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Add support to GUID consistency check during runtime #9468

Closed
amaltaro opened this issue Dec 26, 2019 · 9 comments · Fixed by #9618 or #9660
Closed

Add support to GUID consistency check during runtime #9468

amaltaro opened this issue Dec 26, 2019 · 9 comments · Fixed by #9618 or #9660

Comments

@amaltaro
Copy link
Contributor

amaltaro commented Dec 26, 2019

Impact of the new feature
WMAgent

Is your feature request related to a problem? Please describe.
As discussed in this issue: #9432
we found out that the there has been cases where a wrong file got read (wrong file served via AAA) by the cmsRun process.

After a private follow up, an additional PoolSource/EmbeddedRootSource parameter has been implemented, see one of the PRs: https://github.com/cms-sw/cmssw/pull/28561/files

Describe the solution you'd like
Evaluate the CMSSW version during runtime and enable this new option enforceGUIDInFileName (cms.untracked.bool) in the PSet whenever it's possible. We can use this function:
https://github.com/dmwm/WMCore/blob/master/src/python/WMCore/WMRuntime/Tools/Scram.py#L72
to evaluate whether CMSSW supports this option or not.

Full list of CMSSW releases supporting it:
11_1_0_pre1
11_0_0
10_6_8
10_2_20 (and 10_2_20_UL*)
9_4_16 (and 9_4_16_UL*)
9_3_17
8_0_34 (and 8_0_34_UL*)
7_1_45_patch3

and this update from Fabio:
" For 8_0_X, 9_4_X and 10_2_X I’ve created also the “_UL” version, with slc7 as production architecture, in case this may be used."

In short, I think we need to have the following - not great - approach:
i) enable it for any release "greater than" 11_0_0 (thus using that function already available) - the way to go!!!
ii) a check like True for cand in ("10_2_20_UL", "9_4_16_UL", "8_0_34_UL", "7_1_45_patch3") if cand in cmsswVersion else False
iii) enable it for any release - within the same cycle - for (thus, comparing ONLY the last digit):
10_6_8, 10_2_20, 9_4_16, 9_3_17, 8_0_34, 7_1_45_patch3

Describe alternatives you've considered
To be investigated whether we can match against all the possible releases during runtime. Otherwise, we might start this feature with the release 11_0_0...

Additional context
none

@amaltaro
Copy link
Contributor Author

amaltaro commented Feb 3, 2020

This needs to go into the next WMAgent production release. Thus it needs to be fully implemented this week.

@amaltaro amaltaro self-assigned this Feb 3, 2020
@amaltaro amaltaro modified the milestones: Jan_2020, Feb_2020 Feb 9, 2020
@nsmith-
Copy link

nsmith- commented Mar 19, 2020

What's the status of this?

@amaltaro
Copy link
Contributor Author

amaltaro commented Apr 1, 2020

@goughes here is one very important implementation to get ready for the upcoming WMAgent release candidate (beginning of the next week). Could you please take care of this one?

@nsmith- it will be in the next production WMAgent release (mean to be deployed mid April).

@amaltaro
Copy link
Contributor Author

It breaks with CMSSW_11_0_2. I have just written a private email to the group we were discussing this feature (Todor and Erik) in cc. Reopening it because we might have to quickly fix it in WMAgent. Code is already live in production, so it can potentially make a mess!

@amaltaro amaltaro reopened this Apr 23, 2020
@goughes
Copy link
Contributor

goughes commented Apr 23, 2020

We should be able to eliminate this issue by checking the source type, new PR incoming.
if process.source.type_() in ["PoolSource", "EmbeddedRootSource"]

@amaltaro
Copy link
Contributor Author

@belforte Stefano, can you please confirm whether this feature can be enabled for analysis jobs? I would say files are not necessarily named after their GUID, so we better disable it, see #9660

@belforte
Copy link
Member

Files in the USER tier usually do not have GUID indeed.
Ideally one could check if the file name has a GUID. So we keep it when cmsRun reads production files and disable when it reads user-generated files where user like to have readable names.
As a start, what happens when cmsRUn reads a file whose name has no GUID ? crash ?

Indeed I did not pay enough attention to this thing and need to think of the best approach for CRAB, currently do not know enough to tell.

If we can start disabling it, that's surely better, but I do not know how to do this

@amaltaro
Copy link
Contributor Author

No worries, Stefano. From the WMCore side, it will always be disabled for CRAB jobs.

@belforte
Copy link
Member

belforte commented Apr 24, 2020

I see that the PR claims that CRAB jobs are not modified, I do not know that code well enough to confirm, but can trust you. So the question seems how to enable it in a more selective way only when it makes sense.
I will appreciate hints from you on how to pass high level information to Scram module, w/o
requiring me to understand that code. Surely when users read production data we should use
this check.

As a general comment I do not like those PR's with zilions of "if this, if that". Really CMSSW should be able to detect a GUID in the file name. Is there a fear that there are EDM files around with a GUID-compliant file name which were not written by CMSSW to begin with ? Where is "encapsulation" here ? Why do I need to know as a CMSSW user the details of how it interacts with xroot ? This should all be handled inside the xroot plugin.

# for free to join this conversation on GitHub. Already have an account? # to comment