Hardcoded HLT Process for InputTags #47482

Open
AdrianoDee opened this issue Mar 3, 2025 · 5 comments

@AdrianoDee
Contributor

In the context of https://gitlab.cern.ch/cms-ppd/dataset-management/special-relvals-requests/-/issues/262 we have been seeing unexpected failures when re-running the HLT step on top of the GEN-SIM-DIGI-RAW output of a step2 that did not run the HLT. The input dataset is a product of CMSSW_15_0_0_pre3__Run3_RV262_BPix1Thr_2025_PU-TTbar_14TeV-00001 (all the cmsDriver commands can be found there), and specifically of:

cmsDriver.py step2 --conditions auto:phase1_2025_realistic --datatier GEN-SIM-DIGI-RAW --era Run3_2025 --eventcontent FEVTDEBUGHLT --filein "file:step1.root" --fileout "file:step2.root" --geometry DB:Extended --nStreams 2 --nThreads 8 --no_exec --number 10 --pileup Run3_Flat55To75_PoissonOOTPU --pileup_input das:/RelValMinBias_14TeV/CMSSW_14_2_0-142X_mcRun3_2025_realistic_v4_Winter25_MinBiasGS_RV255-v1/GEN-SIM --python_filename step_2_cfg.py --step DIGI:pdigi_valid,L1,DIGI2RAW

The HLT is run in a separate step:

cmsDriver.py step3 --conditions auto:phase1_2025_realistic --customise_commands "process.hltSiPixelClustersSoA.clusterThreshold_layer1 = 2000 \n process.hltSiPixelClusters.clusterThreshold_layer1 = 2000 \n process.hltSiPixelClustersSoASerialSync.clusterThreshold_layer1 = 2000 \n process.hltSiPixelClustersSerialSync.clusterThreshold_layer1 = 2000 \n process.hltSiPixelClustersRegForDisplaced.ClusterThreshold_L1 = 2000" --datatier GEN-SIM-DIGI-RAW --era Run3_2025 --eventcontent FEVTDEBUGHLT --filein "file:step2.root" --fileout "file:step3.root" --geometry DB:Extended --nStreams 2 --nThreads 8 --no_exec --number 10 --python_filename step_3_cfg.py --step HLT:@relval2025 

So, in order to provide a comparison for the HLT objects run on exactly the same events, we run a second workflow on the above-mentioned output of step2, starting from the HLT step with the process renamed to reHLT:

cmsDriver.py step2 --conditions auto:phase1_2025_realistic --datatier GEN-SIM-DIGI-RAW --era Run3_2025 --eventcontent FEVTDEBUGHLT --filein "dbs:/RelValTTbar_14TeV/CMSSW_15_0_0_pre3-PU_142X_mcRun3_2025_realistic_v5_RV262_BPix1NewThresholds-v1/GEN-SIM-DIGI-RAW" --fileout "file:step2.root" --geometry DB:Extended --nStreams 2 --nThreads 8 --no_exec --number 10 --process reHLT --python_filename step_2_cfg.py --step HLT:@relval2025

The subsequent steps are run with --hltProcess reHLT to allow for the mass renaming of the HLT process name:

cmsDriver.py step3 --conditions auto:phase1_2025_realistic --datatier GEN-SIM-RECO,MINIAODSIM,NANOAODSIM,DQMIO --era Run3_2025 --eventcontent RECOSIM,MINIAODSIM,NANOEDMAODSIM,DQM --filein "file:step2.root" --fileout "file:step3.root" --geometry DB:Extended --hltProcess reHLT --nStreams 2 --nThreads 8 --no_exec --number 10 --python_filename step_3_cfg.py --step RAW2DIGI,L1Reco,RECO,RECOSIM,PAT,NANO,VALIDATION:@standardValidation+@miniAODValidation,DQM:@standardDQM+@ExtraHLT+@miniAODDQM+@nanoAODDQM
[...]

The two steps above are a way to reproduce the crash on lxplus. The exception we hit is:

----- Begin Fatal Exception 28-Feb-2025 12:39:59 CET-----------------------
An exception of category 'ProductNotFound' occurred while
   [0] Processing  Event run: 1 lumi: 88 event: 21751 stream: 0
   [1] Running path 'validation_step'
   [2] Calling method for module B2GDoubleLeptonHLTValidation/'b2gDoubleElectronHLTValidation'
Exception Message:
Principal::getByToken: Found zero products matching all criteria
Looking for type: edm::TriggerResults
Looking for module label: TriggerResults
Looking for productInstanceName: 
Looking for process: HLT
   Additional Info:
      [a] If you wish to continue processing events after a ProductNotFound exception,
add "TryToContinue = cms.untracked.vstring('ProductNotFound')" to the "options" PSet in the configuration.

----- End Fatal Exception -------------------------------------------------
%MSG-w MemoryCheck:  PostProcessPath 28-Feb-2025 12:39:59 CET  Run: 1 Event: 21751
MemoryCheck: earlyTermination : VSIZE 9315.8 0.0117188 RSS 6477.96 31.332 PSS 6471.3 PRIVATE 6464.46 ANONHUGEPAGES 0 JeMalloc allocated 6043.3 active 6166.21 resident 6336.98 mapped 6381.61 metadata 63.8443
%MSG
%MSG-w MemoryCheck:  PoolOutputModule:RECOSIMoutput 28-Feb-2025 12:39:59 CET  Run: 1 Event: 21752
MemoryCheck: module PoolOutputModule:RECOSIMoutput VSIZE 9315.8 0 RSS 6478.91 0.953125
%MSG
----- Begin Fatal Exception 28-Feb-2025 12:39:59 CET-----------------------
An exception of category 'ProductNotFound' occurred while
   [0] Processing  Event run: 1 lumi: 88 event: 21752 stream: 1
   [1] Running path 'validation_step'
   [2] Calling method for module B2GDoubleLeptonHLTValidation/'b2gDoubleElectronHLTValidation'
Exception Message:
Principal::getByToken: Found zero products matching all criteria
Looking for type: edm::TriggerResults
Looking for module label: TriggerResults
Looking for productInstanceName: 
Looking for process: HLT
   Additional Info:
      [a] If you wish to continue processing events after a ProductNotFound exception,
add "TryToContinue = cms.untracked.vstring('ProductNotFound')" to the "options" PSet in the configuration.

----- End Fatal Exception -------------------------------------------------

The problem is that in B2GDoubleLeptonHLTValidation the InputTag is hardcoded to use the HLT process, which prevents the renaming from taking effect there.
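To make the failure mode concrete, here is a minimal, self-contained Python sketch of a process-name rename applied to InputTag strings. It is a toy model and not the cmsDriver implementation: `rename_process`, the `configurable` dictionary and `HARDCODED_TAG` are hypothetical names, and the only CMSSW convention assumed is the `label:instance:process` string encoding of an InputTag.

```python
def rename_process(tag: str, old: str, new: str) -> str:
    """Rewrite the process field of a 'label:instance:process' tag string.

    Hypothetical helper for illustration only; it is not a CMSSW function.
    """
    parts = tag.split(":")
    # Pad to three fields so short forms like 'label' are handled too.
    parts += [""] * (3 - len(parts))
    label, instance, process = parts[:3]
    if process == old:
        process = new
    # Emit the shortest equivalent form.
    if process:
        return f"{label}:{instance}:{process}"
    if instance:
        return f"{label}:{instance}"
    return label


# Hypothetical parameters that a rename pass can see and rewrite.
configurable = {
    "triggerResults": "TriggerResults::HLT",
    "electrons": "gedGsfElectrons",
}
renamed = {name: rename_process(tag, "HLT", "reHLT")
           for name, tag in configurable.items()}

# Hypothetical tag fixed somewhere the rename pass never visits:
# it keeps pointing at the 'HLT' process whatever --hltProcess says.
HARDCODED_TAG = "TriggerResults::HLT"

print(renamed["triggerResults"])
print(HARDCODED_TAG)
```

In this sketch the visible tag ends up pointing at `reHLT`, while the hardcoded one still asks for `HLT`, which is the mismatch behind the `Looking for process: HLT` line in the exception above.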

In addition to this failure, I suspect this also causes a "silent" bug when we run the HLT step together with the DIGI step. In that case the output will have all the HLT objects in the event, and when we run a reHLT step on top of it, the old products are consumed by the downstream validation without any failure.
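The difference between the loud failure and the suspected silent one can be sketched with a toy product lookup. This is an illustration under stated assumptions, not framework code: `get_product`, the `ProductNotFound` class, the `DIGI` process name and the string payloads are all hypothetical, and the lookup only mimics the convention that a tag with an explicit process name matches that process alone, while an empty process name resolves to the most recent process that produced the product.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, str, str]  # (module label, product instance, process name)


class ProductNotFound(Exception):
    """Toy stand-in for the framework's ProductNotFound exception category."""


def get_product(event: Dict[Key, str], process_history: List[str],
                label: str, instance: str = "", process: str = "") -> str:
    """Toy lookup: explicit process matches exactly, empty means latest."""
    if process:
        key = (label, instance, process)
        if key not in event:
            raise ProductNotFound(f"no {label!r} product from process {process!r}")
        return event[key]
    # No process requested: walk the history from the most recent process.
    for proc in reversed(process_history):
        key = (label, instance, proc)
        if key in event:
            return event[key]
    raise ProductNotFound(f"no {label!r} product from any process")


# Case A: the input never ran the HLT, so only reHLT products exist.
event_a = {("TriggerResults", "", "reHLT"): "results from reHLT"}
try:
    get_product(event_a, ["DIGI", "reHLT"], "TriggerResults", process="HLT")
    case_a_failed = False
except ProductNotFound:
    case_a_failed = True

# Case B: the input already contains HLT products, then reHLT ran on top.
event_b = {
    ("TriggerResults", "", "HLT"): "results from original HLT",
    ("TriggerResults", "", "reHLT"): "results from reHLT",
}
stale = get_product(event_b, ["HLT", "reHLT"], "TriggerResults", process="HLT")
latest = get_product(event_b, ["HLT", "reHLT"], "TriggerResults")

print(case_a_failed, stale, latest, sep=" | ")
```

In case A the hardcoded `HLT` request fails, as in the crash above. In case B the same request succeeds but returns the original HLT results instead of the reHLT ones, so nothing signals that the wrong products were validated.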

@cmsbuild
Contributor

cmsbuild commented Mar 3, 2025

cms-bot internal usage

@cmsbuild
Contributor

cmsbuild commented Mar 3, 2025

A new Issue was created by @AdrianoDee.

@Dr15Jones, @antoniovilela, @makortel, @mandrenguyen, @rappoccio, @sextonkennedy, @smuzaffar can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@AdrianoDee
Contributor Author

assign HLTriggerOffline/B2G

@cmsbuild
Contributor

cmsbuild commented Mar 3, 2025

New categories assigned: dqm

@antoniovagnerini,@rseidita you have been requested to review this Pull request/Issue and eventually sign? Thanks

@AdrianoDee
Contributor Author

For this specific case I've opened #47483, but I'm not sure whether there are other cases like this (I haven't checked).
