Making Tag and Probe NTuples

Note: these instructions are only meant to outline this process!

You will most likely want to run on your own ntuples with your own specific cuts incorporated. To do this you'll have to generate these T&P ntuples yourself.

First of all you'll have to create a configuration file that will convert event information from a PatTuple into the sort of T&P ntuples the code will run on. I'll outline how to do this using the UWAnalysis framework.

Configs to Edit

Now you have several configs to edit depending on what you want to do.

Open the file: UWAnalysis/Configuration/python/zEETagAndProbe_cff.py

http://www.hep.wisc.edu/cgi-bin/cms/cvsweb.cgi/bachtis/UWAnalysis/Configuration/python/zEETagAndProbe_cff.py?rev=1.6.18.1;content-type=text%2Fplain

tagAndProbeConfigurator3.addDiCandidateModule('tagAndProbeDiElectrons','PATElePairProducer', 'triggeredPatElectrons','triggeredPatElectrons','systematicsMET','cleanPatJets',0,9999,text = '',leadingObjectsOnly = False,dR = 0.15,recoMode = "")
tagAndProbeConfigurator3.addSorter('dielectronsSorted','PATElePairSorter')
tagAndProbeConfigurator3.addSelector('selectedTagAndProbeDiElectrons','PATElePairSelector','leg1.pt()>15&&leg2.pt()>5&&charge==0&&(leg1.chargedHadronIso+leg1.photonIso+leg1.neutralHadronIso)/leg1.pt()<0.2&&leg1.userFloat("WWID")>0','EEss',0,100)

The first line defines your DiCandidate collection, in this case an electron and a gsf electron. Other collections, such as the missing energy and the jets, are also inputs, along with some other parameters such as which objects are stored and the dR between the candidates.

The second line just sorts the candidate pairs so that the highest-pt pair comes first.

The third line defines the preselection for your tag and probe pair, where leg1 is your electron and leg2 is your electron candidate. You can see things like pt cuts on both legs, isolation and ID cuts on the electron, and the charge of the pair.
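Since the preselection is one long string, it can be easier to maintain if you assemble it from named pieces. Here is a minimal sketch in plain Python (no CMSSW needed). The helper name join_cuts and the variable names are my own and purely hypothetical; the pieces are just the cut from the third line above, split at each &&.

```python
# Hypothetical helper (not part of UWAnalysis): build a selector cut string
# from readable pieces instead of editing one long literal.
def join_cuts(*cuts):
    """Join individual cut expressions with the logical AND used in the cut strings."""
    return '&&'.join(cuts)

# The pieces below are copied from the addSelector line shown above.
leg1_pt = 'leg1.pt()>15'
leg2_pt = 'leg2.pt()>5'
opposite_charge = 'charge==0'
leg1_iso = '(leg1.chargedHadronIso+leg1.photonIso+leg1.neutralHadronIso)/leg1.pt()<0.2'
leg1_id = 'leg1.userFloat("WWID")>0'

preselection = join_cuts(leg1_pt, leg2_pt, opposite_charge, leg1_iso, leg1_id)
print(preselection)
```

The resulting string can then be passed to addSelector in place of the literal, which makes it easier to swap a single cut without touching the others.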

The analogous file for muons is UWAnalysis/Configuration/python/zMuMuTagAndProbe_cff.py, which should contain the lines:

tagAndProbeConfigurator.addDiCandidateModule('tagAndProbeMuTracks','PATMuPairProducer', 'triggeredPatMuons','triggeredPatMuons','systematicsMET','selectedPatJets',1,9999,text = '',leadingObjectsOnly = True,dR = 0.15,recoMode = "")
tagAndProbeConfigurator.addSelector('selectedTagAndProbeMuTracks','PATMuPairSelector','charge==0&&mass>50&&mass<120&&leg1.pt()>25&&leg2.pt()>5&&abs(leg2.eta())<2.4&&abs(leg1.eta())<2.1','tagAndProbePairs',1,100)
tagAndProbeSequence =tagAndProbeConfigurator.returnSequence() 

Next look at the file: UWAnalysis/CRAB/ZEETagAndProbe/TP-MC.py

http://www.hep.wisc.edu/cgi-bin/cms/cvsweb.cgi/bachtis/UWAnalysis/CRAB/ZEETagAndProbe/TP-MC.py?rev=1.7.4.1;content-type=text%2Fplain

This is the configuration which you will input to farmoutAnalysisJobs or crab when you run, or you can run locally using a cmsRun command. This is where most of your configuration will be edited. A given configuration will look something like this (you can have several of them in one file and they will run simultaneously):

addTagAndProbePlotter(process,'ElectronTagAndProbePlotter',
                             'electron2l2t17L',
                             'triggeredPatElectrons',
                             'selectedTagAndProbeDiElectrons',
                             ['MVAID','ISO'],
                             ['(abs(eta())<0.8 && electronID("mvaNonTrigV0")>0.5) || (abs(eta())<1.479 && abs(eta())>0.8 && electronID("mvaNonTrigV0")>0.12) || (abs(eta())>1.479 && electronID("mvaNonTrigV0")>0.6)','(userIso(0)+max(0.0, userIso(1)+neutralHadronIso()-userFloat("zzRho2012")*userFloat("EAGammaNeuHadron04")))/pt()<0.25'],
                             ['hltEle27WP80TrackIsoFilter'],
                             ['hltEle17TightIdLooseIsoEle8TightIdLooseIsoTrackIsoFilter']# 17 GeV leg

)

So this is an electron-electron configuration that uses the diCandidate that we built in the first file.

The first line gives you the type of the configuration, here a DiElectron.

The second line gives you the name of the folder that will be put in your output ntuple.

The third line is the collection whose efficiency you are testing with respect to your probe collection.

The fourth line is the input we built in the previous file.

The fifth line gives an array of names that correspond to the array of cuts on the probe in line six (we will later be able to calculate the efficiencies of these cuts).

Line six is the array of cuts that you want to test; each one is run on objects passing the previous one.

Line seven is your tag trigger filter object (we will come back to this). You can put more than one.

Line eight is your probe trigger filter object. It checks whether the objects that pass the final cut in line six also pass your specified trigger (again, you can put more than one). So in this specific case it will check, for every object passing the ID and isolation cuts, whether it fires the trigger filter 'hltEle17TightIdLooseIsoEle8TightIdLooseIsoTrackIsoFilter'.
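Because the names in line five and the cuts in line six are matched up by position, it is easy to end up with one more name than cuts (or the reverse) after editing. A quick sanity check you could run before submitting is sketched below. The function check_plotter_args is hypothetical, and the requirement that names be unique is my own assumption, not something the framework is known to enforce.

```python
# Hypothetical pre-submission check (not part of UWAnalysis).
def check_plotter_args(names, cuts):
    """Return a list of problems; an empty list means names and cuts line up."""
    problems = []
    if len(names) != len(cuts):
        problems.append('%d cut names but %d cut strings' % (len(names), len(cuts)))
    if len(set(names)) != len(names):
        # Assumption: duplicate names would be ambiguous in the output ntuple.
        problems.append('cut names are not unique')
    return problems

names = ['MVAID', 'ISO']
cuts = ['<your ID cut string>', '<your ISO cut string>']  # placeholders for the real cuts
print(check_plotter_args(names, cuts))
```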

What you'll need to change

ID/ISO cut

In the TP-MC.py file or the similar TP.py file (the former for MC, the latter for data) you will need to add a set of cuts to test. The cuts you see in these files are for the Z->2l2tau analysis. You'll probably want to change them to suit your specific analysis needs.

http://readthedocs.org/docs/final-state-analysis/en/latest/pat_tuple.html <---- This gives information about ID's and Isolation that are available in the patTuple.

So you'll want to replace the 'MVAID' and 'ISO' cuts with whatever cuts you'd like to use in your analysis.
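Before replacing the 'MVAID' cut it helps to be sure you understand what the existing string does. The sketch below mirrors the eta-binned logic of the electron 'MVAID' cut string shown earlier in plain Python, so you can try a few values by hand. The function name passes_mva_id is hypothetical; the thresholds are copied from that cut string.

```python
# Plain-Python mirror of the eta-binned 'MVAID' cut string, for checking by hand.
def passes_mva_id(eta, mva):
    """Evaluate the same logic as the 'MVAID' cut for one electron."""
    abs_eta = abs(eta)
    if abs_eta < 0.8:
        return mva > 0.5
    if 0.8 < abs_eta < 1.479:
        return mva > 0.12
    if abs_eta > 1.479:
        return mva > 0.6
    # |eta| exactly 0.8 or 1.479 satisfies none of the strict inequalities
    # in the cut string, so such an electron fails.
    return False

print(passes_mva_id(0.3, 0.7))
print(passes_mva_id(1.2, 0.10))
print(passes_mva_id(2.0, 0.65))
```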

Keep in mind that these cuts are run sequentially on the probe (so first the ID cut, then the ISO cut on events that passed the ID cut, then the trigger filter on events that passed both the ID and ISO cuts). You'll have to consider this when calculating efficiencies later. The process for Z->mumu tag and probe involves an additional SIP cut that falls between the ID and ISO cuts.
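To make the sequential bookkeeping concrete, here is a simple counting sketch. It ignores background entirely and is not the Tag and Probe code itself; the function name and the counts are made up for illustration. Each step's efficiency is relative to the probes that passed the previous step, and the product of the steps gives the overall efficiency.

```python
# Counting-only illustration of sequential cut efficiencies (made-up numbers).
def sequential_efficiencies(names, counts):
    """counts[0] is the number of probes; counts[i + 1] is the number that
    pass cuts names[0] through names[i]. Returns (per-step efficiencies, overall)."""
    if len(counts) != len(names) + 1:
        raise ValueError('need one more count than cut names')
    steps = {}
    for i, name in enumerate(names):
        denominator = counts[i]
        steps[name] = counts[i + 1] / float(denominator) if denominator else 0.0
    overall = counts[-1] / float(counts[0]) if counts[0] else 0.0
    return steps, overall

# Hypothetical counts: all probes, after ID, after ISO, after the trigger filter.
steps, overall = sequential_efficiencies(['ID', 'ISO', 'TRIGGER'], [1000, 800, 600, 540])
print(steps)
print(overall)
```

Note that the 'ISO' number here is the efficiency of the isolation cut for probes that already passed ID, not for all probes.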

For muons, an example configuration will look like this:

addTagAndProbePlotter(process,'MuonPairTagAndProbePlotter',
                        'MuonHttMu12',
                        'triggeredPatMuons',
                        'selectedTagAndProbeMuTracks',
                        ['ID','SIP','ISO'],
                        #['pt()>30','pt()>10000'],
                        ['abs(eta())<2.4 & pt()>5 & abs(userFloat("ipDXY"))<0.5 & abs(userFloat("dz"))<1.0 & pfCandidateRef().isNonnull() &(isGlobalMuon() | isTrackerMuon())','abs(userFloat("ip3DS"))<4', '(chargedHadronIso()+max(0.0,neutralHadronIso()+photonIso()-userFloat("zzRho2012")*userFloat("EAGammaNeuHadron04")))/pt<0.40'],
                        ['hltL3crIsoL1sMu16Eta2p1L1f0L2f16QL3f24QL3crIsoRhoFiltered0p15','hltL3fL1sMu16Eta2p1L1f0L2f16QL3Filtered24Q'],
                        ['hltL3pfL1DoubleMu10MuOpenL1f0L2pf0L3PreFiltered8','hltL3pfL1DoubleMu10MuOpenOR3p5L1f0L2pf0L3PreFiltered8','hltL3fL1DoubleMu10MuOpenL1f0L2f10L3Filtered17','hltL3fL1DoubleMu10MuOpenOR3p5L1f0L2f10L3Filtered17']
)

This configuration includes the standard ID, SIP, and ISO cuts for muons. It also contains the 8 GeV and 17 GeV filters for the 7e33 v2.0 and 7e33 v2.5 runs.

Trigger Filter

Unfortunately, finding the correct trigger filters requires some experience. I'll walk you through the process I used to find the trigger filters for my analysis, but you will have to adapt it to suit your own. If you're experienced with choosing which filters to use to generate ntuples, you'll probably have dealt with most of this material already. If this information is new to you, be aware that the learning curve is very steep and you should consult with others to make sure you've chosen the right filter. The right choice really depends on exactly what you're trying to measure the efficiency of. I'll do my best to make this process as headache-free as possible, but it is likely that at first you will select some incorrect filters. My advice is to have someone experienced double-check your filters before you start generating any ntuples. This is an example for Z->ee. Z->mumu is significantly more complicated.
The last thing you will need to do is set up the triggers properly. Some warnings: if you run and all your ntuples are empty, it usually means there is a problem with your tag trigger. If just your 'hltPass' leg is empty, this usually points to a problem with your probe trigger. You need to find an appropriate trigger for the tag and the probe. You can find a lot of this information in the hltConfigBrowser:

http://j2eeps.cern.ch/cms-project-confdb-hltdev/browser/

Click on the left panel on '/online', then '/collisions', then '/2012', then '/5e33', then 'v4.4', and finally click on 'HLT'

These will be different depending on what version of the triggers you're dealing with. In this case this corresponds to the set of triggers run on early 2012 data. You can find information on muon triggers from this link: https://twiki.cern.ch/twiki/bin/viewauth/CMS/MuonHLT/ The analogous link for electrons, which is unfortunately not as useful, is here: https://twiki.cern.ch/twiki/bin/viewauth/CMS/EgHLT

Now go to the streams tab and click on 'A', which should give you a list of the primary datasets. For this example click on 'DoubleElectron'. You'll see a list of the triggers that went into the DoubleElectron dataset. Selecting which triggers to use will really depend on what efficiency you're trying to measure.

Let's start with the tag triggers. In this example the tag triggers that we will want to use are the ones whose names end in _Mass50 followed by a version suffix such as _v1 or _v3 (there are three of them). First click on 'HLT_Ele17_CaloIdVT_CaloIsoVT_TrkIdT_TrkIsoVT_Ele8_Mass50_v1' and you will see a list of modules that went into this trigger's path (the trigger is the first one at the top of the page; be careful not to scroll up or down in the window or you'll lose the trigger path). Now scroll to the right until you see the last module in this path, 'HLTEndSequence'. Click on 'HLTEle17CaloIdVTCaloIsoVTTrkIdTTrkIsoVTEle8Mass50Sequence', which contains the detailed path information for this trigger. In general the sequence you'll want is the one just before the final sequence 'HLTEndSequence', but this can vary. You'll have to use some intuition to select the right sequence.

Now you should see a long list of the modules that went into this sequence. Scroll to the right until you see the end of this list (the last module should be 'hltEle17CaloIdVTCaloIsoVTTrkIdTTrkIsoVTEle8PMMassFilter'). Now in this case we want the last trigger filter on the first electron (the tag), which is 'hltEle17CaloIdVTCaloIsoVTTrkIdTTrkIsoVTEle8TrackIsoFilter'. In general what you'll want to do is start at the end of this list of modules and scroll left until you find the filter you're looking for (because you'll generally want the last filter on the object). You'll notice that many of the modules in the sequence either are not filters or serve some specific purpose that you're not concerned about. The hard part is of course deciding which filter you actually want; again, this is why you'll want to double-check with someone. Chances are you won't get this right the first time. The more you practice this, the better you'll get at sensing which filters you want and which ones you don't want.
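If you copy the module list out of the browser, the "start at the end and scroll left" step can be written down as a small helper. This is only a naming heuristic under my own assumptions (filters end in 'Filter', and pair-level filters such as the mass filter are skipped by a substring match). It is no substitute for checking the result with someone experienced. The function name and the first module in the list are hypothetical.

```python
# Naming heuristic only: walk a sequence's module list from the end.
def last_object_filter(modules, exclude=('Mass',)):
    """Return the last module whose name ends in 'Filter' and contains none of
    the excluded substrings, or None if there is no such module."""
    for name in reversed(modules):
        if name.endswith('Filter') and not any(token in name for token in exclude):
            return name
    return None

modules = [
    'hltSomeProducerModule',  # hypothetical non-filter module
    'hltEle17CaloIdVTCaloIsoVTTrkIdTTrkIsoVTEle8TrackIsoFilter',
    'hltEle17CaloIdVTCaloIsoVTTrkIdTTrkIsoVTEle8PMMassFilter',
]
print(last_object_filter(modules))
```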

In any case you may want to do the same for the other two tag triggers. See if you get these trigger filters:

'hltEle20CaloIdVTCaloIsoVTTrkIdTTrkIsoVTSC4TrackIsolFilter' for the 'HLT_Ele20_CaloIdVT_CaloIsoVT_TrkIdT_TrkIsoVT_SC4_Mass50_v3' trigger

'hltEle32CaloIdTCaloIsoTTrkIdTTrkIsoTSC17TrackIsolFilter' for the 'HLT_Ele32_CaloIdT_CaloIsoT_TrkIdT_TrkIsoT_SC17_Mass50_v3' trigger

However, in this case it's actually easier to use a SingleElectron tag trigger. Go back to the streams tab, then 'A', then 'SingleElectron' instead of 'DoubleElectron' and click on the trigger 'HLT_Ele27_WP80_v8'. Use the same process to try to find the correct trigger filter.

Is it 'hltEle27WP80TrackIsoFilter'?

Now you need your filters for the probe trigger. Go back to the streams tab, click 'A', then 'DoubleElectron'. This time you'll want to click on the trigger 'HLT_Ele17_CaloIdT_CaloIsoVL_TrkIdVL_TrkIsoVL_Ele8_CaloIdT_CaloIsoVL_TrkIdVL_TrkIsoVL_v15'. Again, scroll to the right and click on the second-last sequence, 'HLTEle17CaloIdTTrkIdVLCaloIsoVLTrkIsoVLEle8CaloIdTTrkIdVLCaloIsoVLTrkIsoVLSequence'. Here we have two options for the probe filter, one for the 17 GeV electron and one for the 8 GeV electron. See if you can find these (try not to look at the answer first!)...

17 GeV electron: 'hltEle17TightIdLooseIsoEle8TightIdLooseIsoTrackIsoFilter'

8 GeV electron: 'hltEle17TightIdLooseIsoEle8TightIdLooseIsoTrackIsoDoubleFilter'

The last thing to do is to set up the code to embed the trigger matching into the leptons when you run the code. To do this you need to take the filters we've selected and put them in:

UWAnalysis/Configuration/python/tools/analysisToolsPT.py

http://www.hep.wisc.edu/cgi-bin/cms/cvsweb.cgi/bachtis/UWAnalysis/Configuration/python/tools/analysisToolsPT.py?rev=1.7.2.1;content-type=text%2Fplain

If you look down a ways you can see where the triggeredPatElectrons collection, which we used above to build our collections, is defined:

def electronTriggerMatchPT(process,triggerProcess):

   process.triggeredPatElectronsL = cms.EDProducer("ElectronTriggerMatcher",
                                            src = cms.InputTag("cleanPatElectrons"),
                                            trigEvent = cms.InputTag("hltTriggerSummaryAOD"),
                                            filters = cms.VInputTag(
                                                cms.InputTag('hltEle17CaloIdLCaloIsoVLPixelMatchFilterDoubleEG125','',triggerProcess),
                                            ),
                                            pdgId = cms.int32(0)
   )
   process.triggeredPatElectrons = cms.EDProducer("ElectronTriggerMatcher",
                                            src = cms.InputTag("triggeredPatElectronsL"),
                                            trigEvent = cms.InputTag("hltTriggerSummaryAOD"),
                                            filters = cms.VInputTag(
                                                cms.InputTag('hltOverlapFilterIsoEle15IsoPFTau20','',triggerProcess),
                                                cms.InputTag('hltOverlapFilterIsoEle15TightIsoPFTau20','',triggerProcess),
                                                cms.InputTag('hltOverlapFilterIsoEle18MediumIsoPFTau20','',triggerProcess),                                                
                                                cms.InputTag('hltOverlapFilterIsoEle18TightIsoPFTau20','',triggerProcess),
                                                cms.InputTag('hltOverlapFilterIsoEle18IsoPFTau20','',triggerProcess),
                                                cms.InputTag('hltOverlapFilterIsoEle20MediumIsoPFTau20','',triggerProcess),
                                                cms.InputTag('hltOverlapFilterIsoEle20LooseIsoPFTau20','',triggerProcess),
                                                cms.InputTag('hltOverlapFilterIsoEle20WP90LooseIsoPFTau20','',triggerProcess),
                                                cms.InputTag('hltEle17CaloIdVTCaloIsoVTTrkIdTTrkIsoVTEle8PMMassFilter','',triggerProcess),
                                                cms.InputTag('hltEle20CaloIdVTCaloIsoTTrkIdTTrkIsoTTrackIsoFilterL1IsoEG18OrEG20','',triggerProcess)
                                            ),
                                            pdgId = cms.int32(11)
   )

This takes the cleanPatElectrons that are in the patTuple and adds the trigger matching information to them. So you need to add the filters that we found in the config browser here. In this case they will all go in the bottom list, as tighter electrons with track isolation applied. In some cases, looser electrons are not classified as electrons by the trigger and have to go in the first list. So at the end add, for instance, the line

cms.InputTag('hltEle27WP80TrackIsoFilter','',triggerProcess)

and do the same for the rest of the triggers you'd like to add (in this case the two probe filters). Now you just need to replace the trigger filters in TP-MC.py (or TP.py for data) with your new trigger filters (tag trigger(s) in line seven and probe trigger(s) in line eight).
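Typing these long filter names by hand is an easy way to introduce a typo, so you may prefer to generate the lines and paste them into the filters list. The sketch below only formats text in the same style as the existing entries; the helper name input_tag_lines is hypothetical, and the filter names are the tag and probe filters found above.

```python
# Hypothetical formatting helper: produce cms.InputTag lines ready to paste
# into the filters = cms.VInputTag(...) list.
def input_tag_lines(filter_names, indent=''):
    lines = ["%scms.InputTag('%s','',triggerProcess)" % (indent, name)
             for name in filter_names]
    return ',\n'.join(lines)

new_filters = [
    'hltEle27WP80TrackIsoFilter',
    'hltEle17TightIdLooseIsoEle8TightIdLooseIsoTrackIsoFilter',
    'hltEle17TightIdLooseIsoEle8TightIdLooseIsoTrackIsoDoubleFilter',
]
print(input_tag_lines(new_filters, indent=' ' * 4))
```

Remember to add a comma after the last existing entry in the list before pasting the new lines.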

An example muon selection looks like this:

def muonTriggerMatchPT(process,triggerProcess):
   print 'triggerPatMuons process starting',

   process.triggeredPatMuons = cms.EDProducer("MuonTriggerMatcher",
                                            src = cms.InputTag("cleanPatMuons"),
                                            trigEvent = cms.InputTag("hltTriggerSummaryAOD"),
                                            filters = cms.VInputTag(
                                               cms.InputTag('hltL3crIsoL1sMu16Eta2p1L1f0L2f16QL3f24QL3crIsoFiltered10','',triggerProcess),
                                                cms.InputTag('hltL3crIsoL1sMu16Eta2p1L1f0L2f16QL3f24QL3crIsoRhoFiltered0p15','',triggerProcess),
                                                cms.InputTag('hltL2fL1sMu16Eta2p1L1f0L2Filtered16Q','',triggerProcess),
                                                cms.InputTag('hltL3fL1sMu16Eta2p1L1f0L2f16QL3Filtered24Q','',triggerProcess),
                                                cms.InputTag('hltDiMuonMu17Mu8DzFiltered0p2','',triggerProcess),
                                                cms.InputTag('hltL3fL1DoubleMu10MuOpenL1f0L2f10L3Filtered17','',triggerProcess),
                                                cms.InputTag('hltL3pfL1DoubleMu10MuOpenL1f0L2pf0L3PreFiltered8','',triggerProcess),
                                                cms.InputTag('hltL3fL1DoubleMu10MuOpenOR3p5L1f0L2f10L3Filtered17','',triggerProcess),
                                                cms.InputTag('hltL3pfL1DoubleMu10MuOpenOR3p5L1f0L2pf0L3PreFiltered8','',triggerProcess)
                                            ),
                                            pdgId = cms.int32(13)
   )

You can see the same filters used in the TP.py file shown earlier, such as the two 17 GeV and two 8 GeV filters, as well as additional filters used for Monte Carlo. Those filters are used in the TP-MC.py file, which is the MC analogue of the TP.py file.
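Every filter you name in a T&P plotter has to be embedded by the trigger matcher as well, so a quick cross-check between the two lists can save a wasted submission. The helper below is a hypothetical sketch in plain Python; the filter names are copied from the muon plotter and the muon matcher shown on this page.

```python
# Hypothetical cross-check: which plotter filters are not embedded by the matcher?
def missing_filters(plotter_filters, embedded_filters):
    embedded = set(embedded_filters)
    return [name for name in plotter_filters if name not in embedded]

embedded_muon_filters = [
    'hltL3crIsoL1sMu16Eta2p1L1f0L2f16QL3f24QL3crIsoFiltered10',
    'hltL3crIsoL1sMu16Eta2p1L1f0L2f16QL3f24QL3crIsoRhoFiltered0p15',
    'hltL2fL1sMu16Eta2p1L1f0L2Filtered16Q',
    'hltL3fL1sMu16Eta2p1L1f0L2f16QL3Filtered24Q',
    'hltDiMuonMu17Mu8DzFiltered0p2',
    'hltL3fL1DoubleMu10MuOpenL1f0L2f10L3Filtered17',
    'hltL3pfL1DoubleMu10MuOpenL1f0L2pf0L3PreFiltered8',
    'hltL3fL1DoubleMu10MuOpenOR3p5L1f0L2f10L3Filtered17',
    'hltL3pfL1DoubleMu10MuOpenOR3p5L1f0L2pf0L3PreFiltered8',
]
tag_filters = [
    'hltL3crIsoL1sMu16Eta2p1L1f0L2f16QL3f24QL3crIsoRhoFiltered0p15',
    'hltL3fL1sMu16Eta2p1L1f0L2f16QL3Filtered24Q',
]
probe_filters = [
    'hltL3pfL1DoubleMu10MuOpenL1f0L2pf0L3PreFiltered8',
    'hltL3pfL1DoubleMu10MuOpenOR3p5L1f0L2pf0L3PreFiltered8',
    'hltL3fL1DoubleMu10MuOpenL1f0L2f10L3Filtered17',
    'hltL3fL1DoubleMu10MuOpenOR3p5L1f0L2f10L3Filtered17',
]
print(missing_filters(tag_filters + probe_filters, embedded_muon_filters))
```

An empty list means every plotter filter is embedded; any name that is printed still needs to be added to the matcher.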

These trigger instructions were meant as a guideline and they are by no means a full tutorial. To successfully select the proper tag and probe trigger filters for your analysis you'll probably need further instruction/help.

Producing the NTuples

Within your configuration file you can have several T&P plotters, each corresponding to a different combination of cuts/trigger filters. In TP-MC.py (and TP.py) there are four active T&P plotters, corresponding to two possible isolation cuts (tight (<0.1) and loose (<0.25)) and two possible probe triggers (one for the 17 GeV electron and one for the 8 GeV electron). Later when you run the Tag and Probe code you'll determine which combination to run on by specifying the location of the ntuple you'd like to use (they will have names corresponding to the second line in your T&P plotter).
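The four plotters are just every pairing of an isolation threshold with a probe filter, which you can enumerate as in the sketch below. The folder naming scheme and the function name are hypothetical (use whatever names you like for the second line of each plotter); the thresholds and filter names are the ones quoted on this page.

```python
from itertools import product

# Isolation thresholds and probe filters quoted above.
iso_cuts = {'tight': 0.1, 'loose': 0.25}
probe_filters = {
    '17': 'hltEle17TightIdLooseIsoEle8TightIdLooseIsoTrackIsoFilter',
    '8': 'hltEle17TightIdLooseIsoEle8TightIdLooseIsoTrackIsoDoubleFilter',
}

def plotter_combinations(iso_cuts, probe_filters):
    """List one entry per (isolation cut, probe filter) pairing."""
    combos = []
    for (iso_name, iso_max), (leg, filter_name) in product(
            sorted(iso_cuts.items()), sorted(probe_filters.items())):
        combos.append({
            'folder': 'electron_%sIso_%sGeV' % (iso_name, leg),  # hypothetical naming
            'iso_max': iso_max,
            'probe_filter': filter_name,
        })
    return combos

for combo in plotter_combinations(iso_cuts, probe_filters):
    print(combo['folder'], combo['iso_max'], combo['probe_filter'])
```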

At this point you should have prepared a config file to generate tag and probe ntuples with the specific cuts/trigger filters and anything else you might need for your analysis. Now all you need to do is run the config file in CMSSW. For a small number of events you can just open up your config file and change the input file(s) to the pattuples you want to run on, then do:

cmsRun [your config file]

which should produce a file named 'analysis.root' in your current directory containing the tag and probe ntuples you'll need! (If you ran on MC, make sure to read the section below on MC event weighting.)

If you want to run on any significant number of events you'll want to send your jobs out on the grid. An easy way to do this is with farmoutAnalysisJobs. All you need to do is open up your config file and set the input files ('fileNames') to '$inputFileNames' instead of the path(s) of any actual '.root' files. You also need to make sure this line is in your config file:

 process.TFileService.fileName=cms.string("$outputFileName")

This line generates a unique name for each root file you generate through farmoutAnalysisJobs and will ensure that your .root files are sent to your hdfs area. For usage and a list of available options just type farmoutAnalysisJobs with no arguments:

farmoutAnalysisJobs
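To see why the '$inputFileNames' and '$outputFileName' placeholders are needed, it may help to picture what happens to them for each job. The sketch below imitates the idea with Python's string.Template. How farmoutAnalysisJobs actually performs and formats the substitution is an assumption on my part, and the file names are hypothetical.

```python
from string import Template

# Two config lines as they might look before submission (placeholders intact).
CONFIG_TEMPLATE = (
    'fileNames = cms.untracked.vstring($inputFileNames)\n'
    'process.TFileService.fileName=cms.string("$outputFileName")\n'
)

def fill_job_config(template_text, input_files, output_file):
    """Imitate a per-job substitution of the two placeholders.
    Assumption: input files become a quoted, comma-separated list."""
    quoted_inputs = ', '.join('"%s"' % name for name in input_files)
    return Template(template_text).substitute(
        inputFileNames=quoted_inputs,
        outputFileName=output_file,
    )

job_config = fill_job_config(
    CONFIG_TEMPLATE,
    ['/hypothetical/path/patTuple_1.root'],  # hypothetical input file
    'ZEETP-job0001.root',                    # hypothetical output name
)
print(job_config)
```

The point is simply that each job gets its own input list and its own output name, which is why hard-coded paths must not be left in the config.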

In the end your submit command should look something like:

farmoutAnalysisJobs [some options] --input-files-per-job=1 --input-dir=[directory containing your pattuples] <jobName> $CMSSW_BASE <your config file>

It is very important that the option "--input-files-per-job=1" is present. Otherwise the standard UW weighting code (found in UWAnalysis/ROOT/bin/EventWeightsIterative.cc) will not work properly. See the section below on weighting MC ntuples for more information.

For example, here's what one of the Z->mumu submits looked like:

farmoutAnalysisJobs --input-files-per-job=1 --input-dir=/store/user/swanson/DYJetsToLL_M-50_TuneZ2Star_8TeV-madgraph-tarball/Zjets_M50_2012-06-02-8TeV-078e4bc/ea74c25a04048a8dd7df6542c03c7a9b/ ZMuMuMCTP $CMSSW_BASE /afs/hep.wisc.edu/user/aglevine/TagProbe/src/UWAnalysis/CRAB/ZMuMuTagAndProbe/TP-MC.py
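If you submit many samples, you might assemble the command in a small script so that "--input-files-per-job=1" can never be forgotten. This is a hypothetical sketch that only builds the argument list using the options shown above; it does not run anything, and the function name is my own.

```python
import os

def build_farmout_command(input_dir, job_name, cmssw_base, config_file, extra_options=()):
    """Build the farmoutAnalysisJobs argument list in the order shown above."""
    command = ['farmoutAnalysisJobs']
    command.extend(extra_options)
    command.append('--input-files-per-job=1')  # always present: the weighting code relies on it
    command.append('--input-dir=%s' % input_dir)
    command.extend([job_name, cmssw_base, config_file])
    return command

command = build_farmout_command(
    '[directory containing your pattuples]',      # placeholder, fill in your own
    'ZMuMuMCTP',
    os.environ.get('CMSSW_BASE', '$CMSSW_BASE'),  # falls back to the literal variable name
    'TP-MC.py',
)
print(' '.join(command))
```

The list form can be handed directly to subprocess.check_call once the placeholder has been replaced with a real directory.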

After all these jobs have finished you can use the ROOT command 'hadd' to combine your ntuples:

cd /scratch/$USER/<jobName>
find */*.root |xargs hadd -f <your combined ntuple name (DYmumu.root,DYee.root, etc.)> 

You can also use the UW based mergeFiles command:

 mergeFiles [outputfilename] [inputdirectory]

And mergeFiles will recursively go through your specified input directory, find all .root files, and merge them.
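If neither command is convenient, the same recursive search can be sketched in a few lines of Python that collect the .root files and build the hadd command for you. This is not how mergeFiles is implemented (I have not looked); the function names are hypothetical, and only the 'hadd -f <output> <inputs>' form used above is assumed.

```python
import os

def find_root_files(top_dir):
    """Recursively collect all .root files below top_dir, in sorted order."""
    found = []
    for dirpath, dirnames, filenames in os.walk(top_dir):
        for filename in filenames:
            if filename.endswith('.root'):
                found.append(os.path.join(dirpath, filename))
    return sorted(found)

def build_hadd_command(output_file, input_files):
    """Build the argument list for 'hadd -f <output> <inputs...>'."""
    return ['hadd', '-f', output_file] + list(input_files)

# Example usage: search the current directory and print the command.
inputs = find_root_files('.')
print(' '.join(build_hadd_command('DYee.root', inputs)))
```

To actually merge, pass the list to subprocess.check_call after checking that the list of inputs is not empty.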

And now you're ready to run the Tag and Probe code on your ntuple! See the section below on weighting if you have created a MC NTuple.

Weighting MC NTuples

The events in MC NTuples need to be weighted when compared with data. Without pileup corrections, each event in the sample gets a weight of (cross section)/(total number of events processed). To do a simple weighting without pile-up corrections you can use the "EventWeightsIterative" command, which is a part of the UWAnalysis framework. So all you need to do is:

EventWeightsIterative outputFile=<your MC ntuple> weight=<cross section>  histoName='summary/results'

If you need to take pileup into account, use the command:

 EventWeightsIterative outputFile=<your MC ntuple>  doOneD=1   weight=<cross section>   type=3 histoName='summary/results'

And now if you look at your ntuple you'll see a new branch named "__WEIGHT__" containing the weight for each event. The Tag and Probe code will need this branch for MC input ntuples.
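For reference, the simple weight without pileup corrections described above is just one division. The sketch below is not the EventWeightsIterative code, only the formula written out; the function name and the numbers are made up for illustration.

```python
def event_weight(cross_section, n_processed):
    """Per-event weight without pileup corrections:
    (cross section) / (total number of events processed)."""
    if n_processed <= 0:
        raise ValueError('number of processed events must be positive')
    return cross_section / float(n_processed)

# Made-up numbers purely for illustration.
weight = event_weight(3000.0, 1000000)
print(weight)
```

Every event in the sample gets this same weight, which is why the total number of processed events has to be counted correctly for each job.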

Generating your own PatTuples

You'll have to decide which PatTuples you'd like to generate your ntuples from. The event information from each PatTuple will be ntuplized into an output root file (DYee.root, ZEETPData.root, DYtautau.root, DYmumu.root, etc.) and you will run the Tag and Probe code on these ntuples. If the PatTuple you'd like to use is not readily available you'll have to generate your own. Follow the instructions here:

http://final-state-analysis.readthedocs.org/en/latest/pat_tuple_generation.html

If the dataset you'd like to run on is not readily available you'll have to add it to 'MetaData/python/data{7,8}TeV.py' (pretty straightforward). You'll probably want to take a look at this file anyway to see what datasets are available.

Topic revision: r10 - 2012-08-24 - AaronLevine
 