Non-reproducibility in TrackerPhase2OTL1Track #47071
assign l1, dqm, upgrade |
New categories assigned: l1,dqm,upgrade @aloeliger,@antoniovagnerini,@epalencia,@Moanwar,@rseidita,@srimanob,@subirsarkar you have been requested to review this Pull request/Issue and eventually sign? Thanks |
A new Issue was created by @makortel. @Dr15Jones, @antoniovilela, @makortel, @mandrenguyen, @rappoccio, @sextonkennedy, @smuzaffar can you please review it and eventually sign/assign? Thanks. cms-bot commands are listed here |
@tomalin FYI |
Hi @makortel, I am confused about why these are all showing as failures. If I look at the actual histograms, they all look fine (e.g. https://cmssdt.cern.ch/SDT/jenkins-artifacts/baseLineComparisons/CMSSW_15_0_X_2025-01-09-1100+9e6aa1/66377/29634.911_TTbar_14TeV+Run4D110_DD4hep/TrackerPhase2OTL1Track_Tracks_HQ.html). There is a one-entry difference between the two sets (246 vs 247); is that what is causing them all to be flagged as red? |
Probably? (The technical question would be for @cms-sw/pdmv-l2, whose histogram comparison infrastructure is being used in PR tests.) There is a "clear" difference between blue and red in the 4-5 bin (probably by 1). |
The original sin there is that the comparison is performed via the [...]. Now the question would be: do we want to spot these discrepancies? Maybe this case is a bit pathological (and the test could, e.g., take into account the histogram population), but in general I think it would be interesting to be aware of these irreproducibilities, given that we run on exactly the same events. |
Ok, of course all the PR comparisons are BinToBin, while usually for the RelMon we use the Chi2. And actually the threshold is way higher: 0.999999999999. |
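To illustrate why a single extra entry can turn a plot red under a bin-to-bin test while a chi2-style test barely notices, here is a minimal Python sketch. It is not the RelMon or PR-test implementation: the function names, the toy bin contents (chosen only to sum to 246 and 247), the simplified chi2 formula, and the reading of the threshold as a minimum fraction of identical bins are all assumptions made for illustration.

```python
# Toy illustration only: NOT the actual RelMon / PR-comparison code.
# Threshold value quoted in the thread; how it is applied here is an assumption.
THRESHOLD = 0.999999999999


def bin_to_bin_fraction(ref, new):
    """Fraction of bins whose contents are exactly equal (hypothetical helper)."""
    if len(ref) != len(new):
        raise ValueError("histograms must have the same number of bins")
    if not ref:
        raise ValueError("histograms must have at least one bin")
    same = sum(1 for r, n in zip(ref, new) if r == n)
    return same / len(ref)


def simple_chi2(ref, new):
    """Simplified chi2 statistic for two unweighted histograms.

    Uses sum((r - n)^2 / (r + n)) over non-empty bins; a real test would
    also normalise and convert the statistic to a p-value.
    """
    return sum((r - n) ** 2 / (r + n) for r, n in zip(ref, new) if r + n > 0)


# Hypothetical bin contents: 246 entries in the reference, 247 in the new run.
ref_hist = [10, 50, 120, 40, 20, 6]
new_hist = [10, 50, 120, 40, 21, 6]

fraction = bin_to_bin_fraction(ref_hist, new_hist)
passes_bin_to_bin = fraction >= THRESHOLD
chi2 = simple_chi2(ref_hist, new_hist)

print(f"identical-bin fraction: {fraction:.4f}, passes: {passes_bin_to_bin}")
print(f"simplified chi2: {chi2:.4f} over {len(ref_hist)} bins")
```

With a threshold this close to 1, any single differing bin fails the bin-to-bin check regardless of how populated the histogram is, which is consistent with the suggestion above that the test could take the histogram population into account.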
So far we have (in practice, at least) required CPU code to be fully reproducible within the same x86 microarchitecture and CPU vendor when running on 1 thread. In all cases so far the cause for non-reproducibility has been a bug somewhere. |
Is this issue an extension of #45505 ? |
Is anyone looking into this issue? |
Tests of PRs unrelated to L1T show differences in workflows 29634.911 and 29834.999 in TrackerPhase2OTL1Track, TrackerPhase2OTL1TrackV, and L1T folders. In #47051 (comment)