VBF Higgs --> ZZ('*') --> 2l2b
flow chart of the VBF Higgs --> ZZ --> 2l2b analysis
code for ntuple production [Ntuple: ensure everything is there]
corrected calo-jets with b-tag discriminator variable [and some b-tagging variables?]
trigger filter or bits [TO BE ADDED in the next version]
PF jets [TO BE ADDED in the next version]
filter on number of central jets N>=1 [TO BE ADDED in the next version]
- 1-jet events useful to study VBF jet backgrounds:
- can model N=2 jets with N=1 jets and same Z Pt
- reduces dataset size
test on specific use cases: tag matrix, dijet resolution...
run on datasets and produce local ntuples
collision data 2010
recipes
- is there an automatic tool to get luminosity of JSON file?
- Ilaria Segoni: "there will be, but it is not yet ready."
- Update
- "JSON files only represent the GOOD LumiSections that analyses should run on, but it is not guaranteed that all LS are stored in DBS. The only safe way to know the number of LS that the analysis ran on is from the CRAB job report."
- can a JSON file change?
- Ilaria Segoni: "JSON files already published are not changed. Every week a new JSON is published with the new runs taken over the last week and possible corrections to the older runs."
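Until an official tool is available, the certified lumi sections listed in a JSON file can at least be counted locally. The following is a minimal sketch, assuming the usual layout of the certification files (a mapping from run number to a list of inclusive [first, last] lumi-section ranges). The function name and the example content are hypothetical. It counts the LS listed in the file, not the LS actually processed, which only the CRAB job report can give.

```python
import json

def count_lumi_sections(json_text):
    """Return {run: number of certified lumi sections} for a JSON string."""
    good = json.loads(json_text)
    counts = {}
    for run, ranges in good.items():
        # each range is [first_ls, last_ls], both ends included
        counts[int(run)] = sum(last - first + 1 for first, last in ranges)
    return counts

# Hypothetical content, for illustration only (not a real certification file)
example = '{"132440": [[157, 378]], "132596": [[382, 383], [447, 453]]}'
counts = count_lumi_sections(example)
total_ls = sum(counts.values())
print(counts, total_ls)
```

Converting the LS count into an integrated luminosity still needs the per-LS luminosity from the official bookkeeping, which this sketch does not attempt.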
note
- Cleaning:
- HF cleaning by default in 3_5_7.
- The cleaning of HB will be available in ~2 weeks from now
signal samples
- eff(mm) x eff(2 b-tags) x eff(forward tags) not larger than a few %
- need at least 10k events per mass point after filtering
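The arithmetic behind these two numbers can be sketched as follows. This is only an illustration: the three efficiency values are assumed placeholders chosen to give a product of a few %, not measured numbers, and the function name is hypothetical.

```python
def selected_events(n_filtered, efficiencies):
    """Combined efficiency and expected number of events surviving all cuts."""
    eff_total = 1.0
    for eff in efficiencies:
        eff_total *= eff
    return eff_total, n_filtered * eff_total

# ASSUMED placeholder efficiencies: eff(mm), eff(2 b-tags), eff(forward tags)
eff_total, n_selected = selected_events(10_000, [0.5, 0.2, 0.3])
print(f"combined efficiency = {eff_total:.3f}, selected events = {n_selected:.0f}")
```

With a combined efficiency of this size, 10k filtered events leave only a few hundred selected events per mass point.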
background samples
- Summer09@7TeVRECO:
- Zmumu+jets:
- Z + jets:
- QCD jets:
- MADGRAPH:
- PYTHIA:
- /QCD_Pt15/Summer09-MC_31X_V3_7TeV-v1/GEN-SIM-RECO: 6256300 events [CMSSW_3_1_2] (T2s,T3s)
- /QCD_Pt30/Summer09-MC_31X_V3_7TeV-v1/GEN-SIM-RECO: 5238992 events [CMSSW_3_1_2] (T2s,T3s)
- /QCD_Pt80/Summer09-MC_31X_V3_7TeV-v1/GEN-SIM-RECO: 3202440 events [CMSSW_3_1_2] (T2s,T3s)
- /QCD_Pt170/Summer09-MC_31X_V3_7TeV-v1/GEN-SIM-RECO: 3132800 events [CMSSW_3_1_2] (T2s,T3s)
- /QCD_Pt300/Summer09-MC_31X_V3_7TeV-v1/GEN-SIM-RECO: 3274202 events [CMSSW_3_1_2] (T2s,T3s)
- /QCD_Pt470/Summer09-MC_31X_V3_7TeV-v1/GEN-SIM-RECO: 2162152 events [CMSSW_3_1_2] (T2s,T3s)
- /QCD_Pt800/Summer09-MC_31X_V3_7TeV-v1/GEN-SIM-RECO: 2165530 events [CMSSW_3_1_2] (T2s,T3s)
- /QCD_Pt1400/Summer09-MC_31X_V3_7TeV-v1/GEN-SIM-RECO: 1184123 events [CMSSW_3_1_2] (T2s,T3s)
- HERWIG:
- /QCD_Pt15-herwig/Summer09-MC_31X_V3_7TeV-v1/GEN-SIM-RECO: 1517806 events [CMSSW_3_1_2] (T2s,T3s)
- /QCD_Pt30-herwig/Summer09-MC_31X_V3_7TeV-v1/GEN-SIM-RECO: 1037467 events [CMSSW_3_1_2] (T2s,T3s)
- /QCD_Pt80-herwig/Summer09-MC_31X_V3_7TeV-v1/GEN-SIM-RECO: 902868 events [CMSSW_3_1_2] (T2s,T3s)
- /QCD_Pt170-herwig/Summer09-MC_31X_V3_7TeV-v1/GEN-SIM-RECO: 890505 events [CMSSW_3_1_2] (T2s,T3s)
- /QCD_Pt300-herwig/Summer09-MC_31X_V3_7TeV-v1/GEN-SIM-RECO: 726035 events [CMSSW_3_1_2] (T2s,T3s)
- /QCD_Pt470-herwig/Summer09-MC_31X_V3_7TeV-v1/GEN-SIM-RECO: 361757 events [CMSSW_3_2_1] (T2s,T3s)
- /QCD_Pt800-herwig/Summer09-MC_31X_V3_7TeV-v1/GEN-SIM-RECO: 208492 events [CMSSW_3_1_2] (T2s,T3s)
- /QCD_Pt1400-herwig/Summer09-MC_31X_V3_7TeV-v1/GEN-SIM-RECO: 115380 events [CMSSW_3_1_2] (T1_TW_ASGC,T2_DE_DESY,T3_GR_Ioannina)
- QCD Dijets:
- TTbar:
- PYTHIA:
- PYTHIA + EVTGEN (No Tauola in samples, because of their incompatibility w/ EVTGEN):
- MADGRAPH
- WZ inclusive:
- ZZ inclusive:
- ggF HZZqqll:
- VVJets:
- Zbb+Jets:
- Summer08redigi samples:
- Z + jets (no heavy-flavor enrichment)
- MADGRAPH:
- ALPGEN:
- SHERPA:
- PYTHIA6:
- Zbb + jets
- VQQ MADGRAPH [both Zbb and Zcc]:
- ZZ samples:
- ZZ (->lljj) samples:
- WZ (inclusive) samples:
- QCD dijets:
- QCD jets:
- QCD HF muon-enriched
- tt inclusive:
- tt:
Notes on some samples
- VQQ_biased from Spring10 production is a re-Reco of the Summer09 sample /VQQ_biased-madgraph/Summer09-MC_31X_V3_7TeV-v2/GEN-SIM-RAW: it has the following generator cuts:
- pt(l)>3 GeV/c
- pt(b)>5 GeV/c
- m(ll,bb)>10 GeV/c**2
Global tag and alignment info
- Summer09 samples, CMSSW_3_1_2:
process.load('Configuration/StandardSequences/MixingNoPileUp_cff')
process.load('Configuration/StandardSequences/GeometryIdeal_cff')
process.load("Configuration.StandardSequences.FrontierConditions_GlobalTag_cff")
process.GlobalTag.globaltag = "MC_31X_V3::All"
process.load('Configuration/StandardSequences/MagneticField_38T_cff')
- For analysis (CMSSW_3_5_6) must use (global tag up to 20-Apr-2010):
process.load('Configuration/StandardSequences/GeometryIdeal_cff')
process.load("Configuration.StandardSequences.FrontierConditions_GlobalTag_cff")
process.GlobalTag.globaltag = "MC_3XY_V26::All"
process.load('Configuration/StandardSequences/MagneticField_cff')
study of tag jet system
- Current chain (every step is the input for the following - though single subchains are checked and can be saved):
- L2L3-corrected jets
- at least 3 reconstructed jets in the event
- at least 2 prompt global muons w/ opposite charge
- event is VBF (only for signal samples)
- select only jets w/ pt > 10.0 GeV/c
- at least a couple jets w/ η1·η2 < 0
- parallel requests: jet pairs w/ |η|>0.-4., or jet pairs w/ |Δη|>0.-8.
- at least a couple jets w/ diJet invariant mass larger than 0.-1000 GeV/c**2
- Cuts on jet quantities
- Pt_jets > 10 GeV/c is a loose cut. A scan to larger pt thresholds (15, 20 GeV/c) is needed to optimize the cut
- Cut |η|<4.8 (jet reconstruction studied in this range in notes) has to be verified: is it really crucial, or does it reduce Nevents without improving tagging?
- Kinematics of tag jet system
- 2 jets with big |η| OR big |Δη|
- 2 opposite jets in the z direction: η1·η2 < 0
- jets with large energy. Literature suggests E>100 GeV for each Jet, yet to try this (is it really effective within our samples/backgrounds?)
- jets with large energy but opposite and comparable pz suggest using a diJet invariant mass cut. This criterion selects only one pair of jets in approximately 20% of signal events. For events w/ more than one pair selected, a likelihood function must be found to decide which pair is more likely to be the right one.
- in order to balance the transverse momentum of H0, the two jets should have large pt, so a ptMin cut could be applied (too high a threshold would reduce Nevents).
- tag jet system could be used to redefine a central region of the detector for the process, with a shifted η. (Yet to study this)
- pz*pz should be studied in order to try to exploit this variable for tagging purposes (likelihood? another cut?)
- Kinematics of H->(Z->qq)Z(ll)
- Z->qq is expected to give two boosted jets in a small angle (almost overlap?)
- Z->qq should be the only hadronic activity in the central region of the detector, so a search for the absence of hadronic activity in the central region as a signature for the event should be investigated
- at low Higgs masses, one of the Zs is off-mass-shell, either the lept-decaying one or the had-decaying one
- To-do list:
- Matching algo
- Study the rejection power of the tagging chain w.r.t. ggF and backgrounds
- Think about a |η|<4.5 limit, to exclude low-reconstruction-efficiency regions?
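The pair-selection part of the chain (pt threshold, opposite hemispheres, |Δη| and diJet mass requirements) can be sketched as below. This is a minimal standalone illustration, not the code in the analysis package: jets are plain (pt, eta, phi) tuples treated as massless, the function names are hypothetical, and the default thresholds are just the loosest values of the scanned ranges.

```python
import math
from itertools import combinations

def four_vector(pt, eta, phi, mass=0.0):
    """(E, px, py, pz) in GeV from pt, eta, phi."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    e = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return e, px, py, pz

def dijet_mass(j1, j2):
    """Invariant mass of two jets given as (pt, eta, phi)."""
    e1, px1, py1, pz1 = four_vector(*j1)
    e2, px2, py2, pz2 = four_vector(*j2)
    m2 = (e1 + e2) ** 2 - (px1 + px2) ** 2 - (py1 + py2) ** 2 - (pz1 + pz2) ** 2
    return math.sqrt(max(m2, 0.0))

def tag_jet_pairs(jets, pt_min=10.0, deta_min=0.0, mjj_min=0.0):
    """All jet pairs passing the tag-jet requirements."""
    good = [j for j in jets if j[0] > pt_min]
    pairs = []
    for j1, j2 in combinations(good, 2):
        if j1[1] * j2[1] >= 0.0:             # require eta1 * eta2 < 0
            continue
        if abs(j1[1] - j2[1]) <= deta_min:   # require |delta eta| above threshold
            continue
        if dijet_mass(j1, j2) <= mjj_min:    # require diJet mass above threshold
            continue
        pairs.append((j1, j2))
    return pairs

# Hypothetical event: (pt [GeV/c], eta, phi)
jets = [(50.0, 2.5, 0.0), (40.0, -2.0, math.pi), (30.0, 0.1, 1.0), (5.0, -3.0, 0.5)]
loose_pairs = tag_jet_pairs(jets)
tight_pairs = tag_jet_pairs(jets, deta_min=4.0)
```

When more than one pair survives, as with the loose thresholds in this example, the function returns all of them; choosing among them is the job of the likelihood mentioned above.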
study of b-tagging of Z->bb system and reconstruction of mass signal; b-tag matrix
- select well-tested algorithm
- determine way to predict tag rate (tag matrix)
- idea 1) use off-resonance data ? No – muons enrich QCD of HF
- idea 2) use Y events ? Yes – should be safe, but smaller statistics at high N multiplicity; need to test
- idea 3) using W+jet events at similar jet multiplicity
- process is virtually identical as far as ME is concerned
- problem is ttbar contamination [must veto additional lepton; must develop top veto filter]
- statistics is 10x larger
- parametrize in jet variables of relevance
- construct tag matrix [ex: P(Et,η,Njet) ]
- can test it using gamma+jets events ?
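A tag matrix of the form P(Et, η, Njet) can be sketched as a binned tag rate measured on a control sample and then summed over the jets of another sample to predict the number of tags. This is a minimal sketch only: the bin edges, function names and example jets are all hypothetical, and no statistical uncertainties are propagated.

```python
import bisect
from collections import defaultdict

# HYPOTHETICAL binning, for illustration only
ET_EDGES = [20.0, 40.0, 80.0]   # GeV
ETA_EDGES = [1.2]               # |eta|
NJET_MAX = 4                    # Njet >= 4 share one bin

def jet_bin(et, eta, njet):
    """Bin key (Et bin, |eta| bin, Njet bin) of a jet."""
    return (bisect.bisect_right(ET_EDGES, et),
            bisect.bisect_right(ETA_EDGES, abs(eta)),
            min(njet, NJET_MAX))

def build_tag_matrix(control_jets):
    """control_jets: iterable of (et, eta, njet, is_tagged). Returns {bin: tag rate}."""
    n_all = defaultdict(int)
    n_tag = defaultdict(int)
    for et, eta, njet, tagged in control_jets:
        key = jet_bin(et, eta, njet)
        n_all[key] += 1
        if tagged:
            n_tag[key] += 1
    return {key: n_tag[key] / n_all[key] for key in n_all}

def predicted_tags(matrix, jets):
    """Expected number of tagged jets for jets given as (et, eta, njet)."""
    return sum(matrix.get(jet_bin(*jet), 0.0) for jet in jets)

# Hypothetical control sample and test sample
control = [(30.0, 0.5, 2, True), (35.0, 0.3, 2, False), (30.0, 0.5, 2, False),
           (35.0, -0.3, 2, False), (90.0, 2.0, 3, True)]
matrix = build_tag_matrix(control)
n_pred = predicted_tags(matrix, [(25.0, 0.1, 2), (38.0, -1.0, 2),
                                 (100.0, 1.5, 3), (10.0, 0.0, 2)])
```

Comparing such a prediction with the observed number of tags in an independent sample (gamma+jets, for instance) is the closure test suggested above.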
2-body mass resolution studies
- may try to use event information to increase resolution of Z->bb and H->ZZ masses
- use hyperball algorithm ?!?
define variables in input to TMVA
- kinematic variables in input to TMVA:
- use past set
- add b-tagging information
- find discrimination w.r.t. background samples not used in past iteration
samples to control performances of TMVA
- need to check TMVA output on a control sample
- idea 1: use Z->mm sidebands; should work fine (assume for low-mass H searches we still impose Z->mm on resonance)
- idea 2: test on no-b-tag data; may imply some fiddling with b-tag-related variables
pseudoexperiments to evaluate sensitivity
- need to put together tool to evaluate sensitivity of search as a function of M(h)
- Use MC samples, add signal, run TMVA, fit TMVA output (or H mass distribution for events with high TMVA output)
- extract sensitivity curves for 30 and 100/fb; this allows the analysis to provide “results” even before real data comes => analysis note to approve
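As a placeholder for such a tool, a sensitivity curve versus M(h) can be sketched with a simple counting experiment. Note that this swaps the TMVA fit for the Asimov approximation of the expected significance, Z = sqrt(2((s+b)ln(1+s/b) - s)). The yields per fb^-1 are invented placeholders, not results, and all names are hypothetical.

```python
import math

def asimov_significance(s, b):
    """Expected significance of a counting experiment (Asimov approximation)."""
    if s <= 0.0 or b <= 0.0:
        return 0.0
    return math.sqrt(2.0 * ((s + b) * math.log(1.0 + s / b) - s))

# PLACEHOLDER yields per fb^-1 after selection: M(h) [GeV/c**2] -> (signal, background)
YIELDS_PER_FB = {150: (0.5, 20.0), 200: (0.8, 12.0), 300: (0.4, 5.0)}

def sensitivity_curve(lumi_fb):
    """Expected significance per mass point at the given integrated luminosity."""
    return {mass: asimov_significance(s * lumi_fb, b * lumi_fb)
            for mass, (s, b) in YIELDS_PER_FB.items()}

curve_30 = sensitivity_curve(30.0)
curve_100 = sensitivity_curve(100.0)
for mass in sorted(curve_30):
    print(mass, round(curve_30[mass], 2), round(curve_100[mass], 2))
```

In this approximation the significance grows as the square root of the luminosity; a full treatment would replace the counting step with pseudoexperiments drawn from the TMVA output or H mass templates.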
Packages
analysis environment:
scramv1 project CMSSW CMSSW_2_2_9
cd CMSSW_2_2_9/src/
eval `scramv1 runtime -sh`
cvs co .. UserCode/Tosi/HiggsAnalysis/VBFHiggsToZZto2l2b
- VBFHZZllbbPreSelection --> filter based on number of physics objects which satisfy a pt cut [values taken as input parameter]
- tightLeptonMinNumber || softLeptonMinNumber (both muon and electron)
- tightJetMinNumber || softJetMinNumber (CorJetWithBTag)
- VBFHZZllbbMCprocessFilter ---> filter based on processID
- VBFHZZllbbElectronIsolationProducer ---> producer of sort isolated electron collection
- electronIsoSumpT
- electronIsoSumpT over pT
- electronIsoSumpT2
- VBFHZZllbbJetMatching ---> analyzer based on HEPG information and CorJetWithBTag
- VBFHZZllbbDeltaRAnalyzer ---> analyzer based on reconstructed physics objects
- VBFHZZllbbCorJetWithBTagProducer
- VBFHZZllbbCorJetWithBTagAnalyzer
- VBFHZZllbbBTagInfoAnalyzer
- TMVAntple
- VBFHZZllbbRAWCORBTAGtesting
- VBFHZZllbbMCfilterValidation
- VBFHZZllbbMCzTObbFilter
- VBFHZZllbbMCbQuarkFilter
- VBFHZZllbbBhadronReconstruction
- VBFHZZllbbMCvalidation
- VBFHZZllbbDisplay
- VBFHZZllbbMuonSelector
- SimpleNtple
- VBFHZZllbbAnalyzer
Skimming
skimming of data sample:
samples dimension:
Physics object selection&identification
electron selection
muon selection
barrel muon [ |eta| < 1.1]
endcap muon [ 2.4 >= |eta| > 1.1]
In CMSSW_2_X_Y releases, the muon-identification algorithm recommended by the muon POG is GlobalMuonPromptTight.
It consists of the following requirement, designed to suppress hadronic punch-throughs and muons from decays in flight:
- muon.isGlobalMuon() && muon.combinedMuon()->normalizedChi2()<10
The following additional track-quality cuts using the tracker-track information are recommended to further suppress non-prompt muons, although they are not explicitly included in the selection types above:
- |d0| < 2 mm, where d0 is the impact parameter of the tracker track (or global muon) relative to the beam spot position. This loose cut preserves efficiency for muons from decays of b and c hadrons.
- Nhits >= 11, a cut on the number of hits in the tracker track.
One can also impose cuts on the last point in the global fit to reject punch-throughs that terminate in the first station of the muon detector.
Moreover, as pointed out in the Muon Isolation TWiki, the simplest [and fairly loose] requirements on isolation can be:
- trackIso = sumPt < 3 GeV in a cone of radius 0.3 around the muon; this gives a 96.7 ± 0.7 % efficiency for isolated muons from W or Z decays, with about a factor of 10 rejection of muon candidates reconstructed in QCD events.
- chi2/ndof < 10 && nTrackerHits > 10 as a quality cut on the reconstructed muon (applied to a global muon, this cut suppresses fake muons from K/pi)
- caloIso = sumEtHcal+sumEtEcal < 5 GeV in a cone of radius 0.3.
If desired, tighter cuts:
- reducing the value of the above cut
- relative isolation cut: relIso = (trackIso+caloIso)/muonPt < x, which effectively suppresses backgrounds with a softer spectrum than the signal muons
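The muon requirements listed in this section can be collected in one place as in the sketch below. It is a standalone illustration on a hypothetical flat record, not the CMSSW muon class or the code of VBFHZZllbbMuonSelector. The field names are assumptions, and the relative-isolation threshold is left as a parameter because the text does not fix x.

```python
from dataclasses import dataclass

@dataclass
class Muon:
    """HYPOTHETICAL flat muon record (not a CMSSW data format)."""
    pt: float             # GeV/c
    is_global: bool
    norm_chi2: float      # chi2/ndof of the global fit
    d0_mm: float          # impact parameter w.r.t. the beam spot, in mm
    n_tracker_hits: int
    track_iso: float      # sumPt in a cone of radius 0.3, GeV
    calo_iso: float       # sumEtHcal + sumEtEcal in a cone of radius 0.3, GeV

def passes_id(mu):
    """GlobalMuonPromptTight-like requirement."""
    return mu.is_global and mu.norm_chi2 < 10.0

def passes_track_quality(mu):
    """Additional tracker-track cuts: |d0| < 2 mm and Nhits >= 11."""
    return abs(mu.d0_mm) < 2.0 and mu.n_tracker_hits >= 11

def passes_loose_isolation(mu):
    """trackIso < 3 GeV and caloIso < 5 GeV."""
    return mu.track_iso < 3.0 and mu.calo_iso < 5.0

def relative_isolation(mu):
    return (mu.track_iso + mu.calo_iso) / mu.pt

def select_muon(mu, rel_iso_max=None):
    """Apply all cuts; the relIso cut is applied only if a threshold is given."""
    ok = passes_id(mu) and passes_track_quality(mu) and passes_loose_isolation(mu)
    if rel_iso_max is not None:
        ok = ok and relative_isolation(mu) < rel_iso_max
    return ok

# Hypothetical muons
good = Muon(40.0, True, 1.2, 0.1, 15, 1.0, 2.0)
bad_fit = Muon(40.0, True, 25.0, 0.1, 15, 1.0, 2.0)
```

Keeping each group of cuts in its own function makes it easy to switch individual requirements on and off when studying efficiencies.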
jet selection
- jet corrections
- b-jets tagging
- tag-jets studies
first Filter
links
-- MiaTosi - 25 Feb 2009
-- PietroVischia - 13-Oct-2009