Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Filter efficiency from fragments should be dropped once and for all #3269

Open
efeyazgan opened this issue Sep 30, 2022 · 8 comments
Open

Filter efficiency from fragments should be dropped once and for all #3269

efeyazgan opened this issue Sep 30, 2022 · 8 comments

Comments

@efeyazgan
Copy link
Collaborator

@menglu21 @sihyunjeon @Saptaparna

Since Si hyun and Sitian are sure (see https://cms-talk.web.cern.ch/t/filter-efficiency-check-in-gen-check-script/15409) that the filter efficiency is not used at all I removed the check in the request checking script (see:
#3268) so this filter should be removed from all fragments. Please do that soon and then I will adapt the script not to cause inconsistency in UL consistency check. However, if it is not removed soon, I think we should revert back the request checking script.

In any case, I do not take any responsibility for any problems that this change may cause: #3268.

Thanks.

@agrohsje
Copy link
Collaborator

agrohsje commented Oct 3, 2022

Hi Efe. The concern I had was that this info is written in our edm files. So we cannot know if someone is using or not. Technically, it is not relevant. Only this statement is true. That's why I like the solution of Sihyun, i.e. consider 1 or -1 as some default that is ok. Allowing again random numbers is not so great. We are just too many people to assure no one is using.

@efeyazgan
Copy link
Collaborator Author

Hi Alexander, yes, but following the same reasoning we could have kept the check. If something is used, we should check how it is used. I am sure that check was there for a reason...

@sihyunjeon
Copy link
Collaborator

Hi so there are several things to think about here.

This creates somewhat unnecessary entropy for the MC contacts because
a. We don't know whether there are any real users for this metadata so we might be putting our efforts to something without real gain.
b. Removing the filter efficiency fragment still works (avoids gen checking script errors) - which means subset of the samples would for sure have broken filter efficiency written in GEN files as metadata. So subset of our samples are already broken.
c. It's not so easy to modify this through python scripts because efficiency from the run log (for filter efficiency) only gets delivered through email. I don't think the run log results are stored somewhere in McM, at least I am not aware. If I am correct, one needs to crop out the values from email boxes and tweak it into the fragments and I would hardly imagine MC contacts doing this. So in the end it would return to b. where MC contacts will just remove the line to avoid the problem, breaking the variables that should not be used.

But as Alexander said, SOME might use this and it MIGHT be not totally useless to store correct values. So allowing default settings (filter efficiency >= 1.0 or <=0.0 which doesn't make sense) is sort of compromise in between.

@agrohsje
Copy link
Collaborator

Hi Efe,
maybe this was not clear in my message: I think we should have kept the check (and the motivation now is the same as back then: it is stored in EDM and we don't know if someone is using it or not) but just slightly modify it as proposed by Sihyun:
But as Alexander said, SOME might use this and it MIGHT be not totally useless to store correct values. So allowing default settings (filter efficiency >= 1.0 or <=0.0 which doesn't make sense) is sort of compromise in between.
Cheers, Alexander.

@sihyunjeon
Copy link
Collaborator

Just to add a bit more

(filter efficiency >= 1.0 or <=0.0 which doesn't make sense)

This means that "filter efficiency itself doesn't make sense already and the users if they exist, they would know it's not trustable so they would avoid using it. but if it's some realistic value e.g. 0.48, people might believe the value is true and mistakenly use it if wrong values are stored."

So my proposal was, avoid checking unrealistic values BUT check realistic values to make sure people don't use them.

@efeyazgan
Copy link
Collaborator Author

OK, done. See #3280

@efeyazgan
Copy link
Collaborator Author

See the update: #3282

@tvami
Copy link
Contributor

tvami commented Dec 6, 2023

I run into this now too, if the eff is set to -1 it's still failing the script. I even understand from the thread above that it was discussed that in case of the eff is negative we should not give an error. Anyway, I made a PR to add that patch #3572

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants