Introduction

This TWiki summarizes the qualification project of Fudong He on prompt lepton tagging.
Glance qualification task

List of action items

Study non-prompt electrons/muons that pass PLV
- Find true stable charged particles produced in B decay
- Plot properties of these secondary charged particles produced in B decays
- Plot distance of secondary B vertex to the primary pp vertex
Decide whether we should work on further improving prompt tagging by:
- Lowering the pT threshold for the secondary vertex algorithm
- Using two-track vertices and impact parameters of additional tracks
Useful references
- Flavour Tagging Algorithm meeting August 16 - useful talk on secondary vertexing and soft muon tagger
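
For the action item on the distance between the secondary B vertex and the primary pp vertex, a minimal sketch of the two usual distances (transverse and 3D) is given here. The function name and the plain (x, y, z) tuples are hypothetical and not taken from the ntuple code; units are whatever the input positions use.

```python
import math

def vertex_distances(primary, secondary):
    """Return (Lxy, L3d) between two vertex positions given as (x, y, z)."""
    dx = secondary[0] - primary[0]
    dy = secondary[1] - primary[1]
    dz = secondary[2] - primary[2]
    lxy = math.hypot(dx, dy)                      # distance in the transverse plane
    l3d = math.sqrt(dx * dx + dy * dy + dz * dz)  # full 3D distance
    return lxy, l3d

# Hypothetical positions, for illustration only
pv = (0.0, 0.0, 10.0)
sv = (3.0, 4.0, 22.0)
lxy, l3d = vertex_distances(pv, sv)
print(lxy, l3d)
```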

Work plan (2018)

Week of July 17

Learn structure of MC truth in the ntuple file from Rhys
Study ATLAS simulation:
- Goal: understand MC chain from event generation to simulation to reconstruction
- ATLAS simulation paper
- MC simulation talk by Josh
- MC truth talk by Andy
- Prepare a few slides with questions

Week of July 24

Learn MC simulation of B meson semileptonic decay:
- EvtGen Project and Reference

Week of July 31

Find distinctions between leptons that come from W, tau or non-prompt decays.
Match track jets to reconstructed electrons and muons, and plot track jet variables for electrons and muons.
Add a plot macro, and put all the plots on the website.
Upgrade the plot macro to make it run much more efficiently.

Week of Aug 7

Apply reconstruction-level identification cuts (Loose, Medium, Tight) to electrons and muons, and plot the variables to compare the selections with each other.
Calculate the overlap area between the non-prompt and W distributions for each histogram, and add a sortable table for electrons and for muons.
Start BDT training with Rhys's macros, and plot the ROC curve.
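
The overlap area mentioned above can be sketched as follows, assuming it is defined as the sum over bins of the smaller of the two unit-normalised bin contents (1 for identical shapes, 0 for fully separated ones). This definition and the function names are assumptions, not taken from the plot macros.

```python
def normalise(counts):
    """Scale bin contents so that they sum to one."""
    total = float(sum(counts))
    if total <= 0.0:
        raise ValueError("histogram is empty")
    return [c / total for c in counts]

def overlap_area(hist_a, hist_b):
    """Overlap of two histograms with identical binning, in [0, 1]."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must share the same binning")
    a = normalise(hist_a)
    b = normalise(hist_b)
    return sum(min(x, y) for x, y in zip(a, b))

# Hypothetical bin contents, for illustration only
prompt = [0, 10, 30, 60]
nonprompt = [50, 30, 15, 5]
overlap = overlap_area(prompt, nonprompt)
print(overlap)
```

A small overlap indicates a variable with good separation power, so sorting the table by this value ranks the variables.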

Week of Aug 14

Rerun the BDT training for both electrons and muons, save the log files, and upload them to the website.
Training with:
- signal = electron or muon from W; background = non-prompt lepton.
- signal = electron or muon from W or tau; background = non-prompt lepton.
- signal = electron or muon from W or tau; background = not prompt and not tau lepton.
For both PromptLeptonIso and PromptLeptonVeto, plot their input variables with the overlap value.
Remake the HTML table: for electrons, exclude the is* variables, Pt, Eta, Phi and the truth variables; for muons, exclude the is* variables, all the pt variables, PassID, medium, PDG, Eta, Phi and the truth variables.

Week of Aug 21

Fix bugs in the plot macros (y-scale bugs, and rename some labels).
Add a flag for the different sig_bkg types.
Rerun with full events again.

Week of Aug 28

Write the numbers of training/evaluation events used to a txt file.
Separate the HTML table-writing code from the plot macros.
Give a presentation summarizing the BDT results, prepared with Beamer.
Add a new BDT training mode named 'PromptLeptonVetoV2', which uses exactly the same input variables as 'PromptLeptonVeto' but replaces 'PtRel' with 'sv1_jf_ntrkv'.
Keep running the BDT with different versions, and organize the results.

Week of Sep. 04

Prepare presentation by reading some references.

Week of Sep. 11

Prepare presentation by reading some references.
Add efficiency uncertainties, computed with ROOT.TEfficiency, to the table.
Add training mode 'VetoV3' and 'VetoV4' with different cone size for isolation variables.
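
To illustrate what a binomial efficiency uncertainty looks like, here is a pure-Python sketch of the Wilson score interval. This is a stand-in and not the code used for the table: TEfficiency uses Clopper-Pearson intervals by default and also offers a Wilson option; the Wilson form is chosen here only because it has a closed form that needs no ROOT installation. The function name and the choice z = 1 (roughly a 68% interval) are assumptions.

```python
import math

def wilson_interval(n_pass, n_total, z=1.0):
    """Return (efficiency, lower, upper) using the Wilson score interval."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_pass <= n_total:
        raise ValueError("n_pass must be between 0 and n_total")
    p = n_pass / n_total
    denom = 1.0 + z * z / n_total
    centre = (p + z * z / (2.0 * n_total)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n_total
                         + z * z / (4.0 * n_total * n_total)) / denom
    return p, max(0.0, centre - half), min(1.0, centre + half)

# Hypothetical counts, for illustration only
eff, lo, hi = wilson_interval(50, 100)
print(eff, lo, hi)
```

Unlike the naive sqrt(p(1-p)/N) error, the interval stays inside [0, 1] and remains non-zero when the efficiency is exactly 0 or 1.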

Week of Sep. 18

Add new variable to mini-ntuples called PtRelOverTrackJetPt.
Start to calculate correlation coefficients of all variables in the 'PromptNtuple' framework.

Week of Sep. 25

Studied the math for calculating correlation coefficients; because this is not easy to do with the mini-ntuple, moved on to the next item in the to-do list for the time being.
Updated current 'PromptLeptonVetoV2'
Developed 'PromptLeptonVetoV5' algorithm.

Week of Oct. 02

Fixing the bugs in the correlation coefficient calculation code ...
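
For reference, the linear (Pearson) correlation coefficient between two variables can be sketched as below, assuming this is the coefficient meant above. The function works on plain Python lists and is not the PromptNtuple code; the name is hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant variable")
    # The 1/n factors of covariance and variances cancel in the ratio
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values, for illustration only
rho = pearson([1.0, 2.0, 3.0], [1.0, 3.0, 2.0])
print(rho)
```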

Week of Oct. 09

Add 'PLVetoLoose', which applies looser event cuts:
- No longer require Z0 or D0 cuts
- Replace the isolation cut 'isoLoose' with 'PtVarCone30Rel < 1'

Week of Oct. 16

Use mini-ntuple v3 from the MUON5 derivation.
Prepare for MCP and ML presentations.
Tried two new 'PromptLeptonLoose' algorithms; no improvement observed:
- 'PLVetoLoose' new1: put back the D0 and Z0 cuts, and change the isolation cut to 'PtVarCone30Rel < 0.5'.
- 'PLVetoLoose' new2: no isolation requirements.

Week of Oct. 23

Generated the full mini-ntuple file from MUON5 with Condor on the USTC cluster.
Learning the Derivation framework from the ATLAS tutorial.
Add a new algorithm, PromptLeptonVetoV7, which uses the same input variables as PromptLeptonVeto but with PtRel removed.

Week of Oct. 30

Gave a presentation to the tth-ML group.
Ran all algorithms on the whole mini-ntuple; this version is named V14.
Fixed some bugs in macros/runStudyVars.py and python/PhysicsAnpLightStudyVars.py.
Learning the Derivation framework from the ATLAS tutorial.

Week of Nov. 06

Show a 2D histogram of 'PtFrac' and 'DRlj'.
Re-plot 'PtFrac' with the number of bins increased by a factor of 5.
Learn to use JetTagNonPromptLepton.

Week of Nov. 13

Learn how to run the derivation with JetTagNonPromptLepton.
Compare the merge request from Rhys, which adds variables to the DxAOD, with the ATLAS Augmentation tool tutorial example.
Attend the ML seminar "Advanced Machine Learning for Classification, Regression, and Generation in Jet Physics" by Ben Nachman (link).
Plot MSTrackPt and IDTrackPt.

Week of Nov. 20

Give a presentation in the flavour tagging algorithm meeting.
- Question on calibrating non-prompt leptons:
  - We do not need a flavour-tagging calibration, because we do not really calibrate b-jets.
  - The only calibration for this algorithm is the calibration of leptons from W.
- Question about retraining with different variables:
  - We will do it next spring.
- Learn how to run data/MC comparisons.

Questions

Are there truth particles for B hadron decays?
- We now have a new ntuple file that includes B decay information.
Why do the distributions of variables for non-prompt leptons look so different between electrons and muons?
- The electron ID and muon ID are quite different. -- slides, documents
Tried to add the variable ParentPDG to the reco-event, but failed.
- Note the difference between the mini_ntuple and the ntuple file.
For type NotWNotTau_WTau, why does the BDT training result look so weird?
- There may be some logic bugs in the Python macros. Working on it ...
Why is track pT / jet pT > 1 for muons? - Nov. 07
- In our code, TrackPt is filled with the primary muon track pT, which mainly consists of the combined muon pT.

Minutes

For the tth-ML meeting on Oct. 30:
- Why change the selection criteria? (For tight muon evaluation, replaced 'FixedCutTightTrackOnly' with 'FixedCutTight'.)
- Keep the no-tau training as default?
- Study new input variables?
- Check modelling of all variables in rel 21 (see other b-tagging analyses).
- Use lower-pT tracks near the electron/muon?
- Show ranking of variables to see which ones are more relevant.
- Make sure the PLI variables are also in the EGAMMA derivations.
- Separate SF for leptonic taus?
- Need to show performance for isoLoose+PLI.

Plans

Main Schedule:
- Prepare a new prompt lepton tagging algorithm for release 21 - October.
- First data and MC comparisons for the new algorithm with release 21 - November
- Finalise PromptLeptonVeto training for release 21 - September and October
- Prepare new tag of JetTagNonPromptLepton - October
- Validate and calibrate PromptLeptonVeto training for release 21 - November and December
- Study new input variables - January to March

To-do list in October -- results shown in version 14:
- Update the current 'PromptLeptonVetoV2' by replacing 'sv1_jf_ntrkv' with 'PtRelOverTrackJetPt'.
- Develop the 'PromptLeptonVetoV5' algorithm, which uses the same input variables as 'PromptLeptonVeto' plus two new variables, PtVarCone40Minus30Rel and TopoEtCone40Minus30Rel:
  - PtVarCone40Minus30Rel = (PtVarCone40-PtVarCone30)/PtVarCone40
  - TopoEtCone40Minus30Rel = (TopoEtCone40-TopoEtCone30)/TopoEtCone40
- Learn how to create new variables and add them to 'RecoEvent'.
- Add a looser option for selecting training leptons, e.g. without D0, Z0 and loose isolation cuts.
- Add the isolationFixedCut selection working point to the efficiency table.
- Learn how to make mini-ntuples from MUON5 derivations using our cluster at USTC.
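
The two cone-difference variables defined in the list above share one formula, sketched here. The function name is hypothetical, and returning 0 when the larger cone is empty is an assumption about how to protect the division, not something taken from the ntuple code.

```python
def cone_difference_rel(cone40, cone30):
    """(cone40 - cone30) / cone40: fraction of the cone-40 sum in the 0.3-0.4 annulus."""
    if cone40 == 0.0:
        # Assumption: treat an empty outer cone as zero annulus activity
        return 0.0
    return (cone40 - cone30) / cone40

# Hypothetical isolation sums, for illustration only
pt_var_cone40_minus30_rel = cone_difference_rel(4.0, 3.0)
topo_et_cone40_minus30_rel = cone_difference_rel(0.0, 0.0)
print(pt_var_cone40_minus30_rel, topo_et_cone40_minus30_rel)
```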

To-do list in November :
- Re-plot 'PtFrac' with the number of bins increased by a factor of 5.
- Show a 2D histogram of 'PtFrac' and 'DRlj'.
- Write down on paper how the jet pT is calculated.
- Follow the TWiki from Rhys to learn how to run JetTagNonPromptLepton:
  - twiki tutorial from Rhys: https://twiki.cern.ch/twiki/bin/view/Main/PhysicsLightSoftTagging#Setup_atlas_environment_in_relea
  - the merge request for merging with branch 21.2: https://gitlab.cern.ch/atlas/athena/merge_requests/5545
- Give a presentation in the flavour tagging algorithm meeting on 23rd Nov.
- Prepare a slide for Physics workshop in December and email it to others in advance, before 27th Nov.

To-do list in February:
- Track choice for SV determination.

General suggestions and ideas for improvements

For now, make changes only to these three files:

Test the new pileup-resistant isolation variables (as suggested by Rhys) and compare against these working points:

   FixedCutHighMuTight: topoetcone20/pT<0.15 && ptvarcone30_TightTTVA/pT < 0.04
   FixedCutHighMuLoose: topoetcone20/pT<0.30 && ptvarcone30_TightTTVA/pT < 0.15
   FixedCutHighMuTrackOnly: ptvarcone30_TightTTVA/pT < 0.06
   FixedCutPflowTight: (ptvarcone30_TightTTVA_pt500+0.4*neflowisol20)/pT < 0.045
   FixedCutPflowLoose: (ptvarcone30_TightTTVA_pt500+0.4*neflowisol20)/pT < 0.16
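
The working points listed above can be written as simple boolean functions, as in the sketch below. The dictionary layout and the `passes` helper are hypothetical; the thresholds are copied from the list, and the particle-flow term is read as 0.4 times neflowisol20, which is an assumption about the intended formula. All inputs must be in the same energy units.

```python
# Each working point maps a dict of lepton variables to a pass/fail decision.
WORKING_POINTS = {
    "FixedCutHighMuTight": lambda v: (
        v["topoetcone20"] / v["pT"] < 0.15
        and v["ptvarcone30_TightTTVA"] / v["pT"] < 0.04),
    "FixedCutHighMuLoose": lambda v: (
        v["topoetcone20"] / v["pT"] < 0.30
        and v["ptvarcone30_TightTTVA"] / v["pT"] < 0.15),
    "FixedCutHighMuTrackOnly": lambda v: (
        v["ptvarcone30_TightTTVA"] / v["pT"] < 0.06),
    # Assumption: the particle-flow term is 0.4 * neflowisol20
    "FixedCutPflowTight": lambda v: (
        (v["ptvarcone30_TightTTVA_pt500"] + 0.4 * v["neflowisol20"]) / v["pT"] < 0.045),
    "FixedCutPflowLoose": lambda v: (
        (v["ptvarcone30_TightTTVA_pt500"] + 0.4 * v["neflowisol20"]) / v["pT"] < 0.16),
}

def passes(working_point, lepton):
    """Return True if the lepton satisfies the named isolation working point."""
    return WORKING_POINTS[working_point](lepton)

# Hypothetical lepton, values in MeV, for illustration only
lepton = {
    "pT": 50000.0,
    "topoetcone20": 5000.0,
    "ptvarcone30_TightTTVA": 2500.0,
    "ptvarcone30_TightTTVA_pt500": 1500.0,
    "neflowisol20": 1000.0,
}
decisions = {name: passes(name, lepton) for name in WORKING_POINTS}
print(decisions)
```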

Variable descriptions

Truth variables

Truth flags
- isBQuark
- isCQuark
- isTau
- isPrompt
- isNonPrompt
- isPhotonConv
- PDG

Kinematic

Momentum vector
- Phi
- Eta
- Pt

Lepton identification

Muons and electrons
- isLoose
- isMedium
- isTight

Isolation

Sum of transverse momentum of Inner Detector tracks within a cone:
- PtVarCone20Rel
- PtVarCone30Rel
- PtVarCone40Rel

Sum of transverse energy of calorimeter topo clusters within a cone:
- TopoEtCone20Rel
- TopoEtCone30Rel
- TopoEtCone40Rel

Boolean decision variables
- isoFixedCutTight
- isoFixedCutTightTrackOnly

Jet fitter algorithm

- jf_dR
- jf_efrc
- jf_mass
- jf_n2tv
- jf_ntrkv
- jf_nvtxlt
- jf_sig3

Secondary vertex algorithm

- sv1_L3d
- sv1_Lxy
- sv1_dR
- sv1_efrc
- sv1_jf_ntrkv
- sv1_mass
- sv1_n2t
- sv1_ntkv
- sv1_sig3

Path of data at USTC

   $ /moose/AtlUser/fuhe/data/user.rroberts.mc16_13TeV.410501.PowhegPythia8EvtGen_A14_ttbar.DAOD_MUON5.e5458_s3126_r9364_r9315_p3263.ntp_v1_out  // Ntuple files
   $ /moose/AtlUser/fuhe/data/btag-mini-ntuples-r21/mini_ntp_r21_v3.root  // mini-ntuple file

Instructions for running MVA at USTC

   $ ssh -XY fhe@ui05.lcg.ustc.edu.cn

Set up environment :

$ cd /home/fhe/testarea/AnpRel20Prod/
$ source setup_atlas_analysis_release.sh
$ cd PhysicsNtuple/PhysicsAnpLight/

Train BDTs:

$ source run_mva.sh train NonP_WTau prompt-training/mva-train-2017-08-25-v1 "--training-var=PromptLeptonVeto"
$ source run_mva.sh train NonP_WTau prompt-training/mva-train-2017-08-25-v1 "--training-var=PromptLeptonVetoLoose"

Evaluate BDT and output mini ntuple:

$ source run_mva.sh eval NonP_WTau prompt-training/mva-train-2017-08-25-v1 --do-loose

Use the mini-ntuple made above to write out histograms, split up into prompt and non-prompt:

$ source run_mva.sh plot NonP_WTau prompt-training/mva-train-2017-08-25-v1

Use the histogram ROOT file to draw efficiency curves:

$ source run_mva.sh eff NonP_WTau prompt-training/mva-train-2017-08-25-v1
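
What an efficiency curve built from binned BDT score histograms amounts to can be sketched in pure Python as below. This is not the run_mva.sh implementation. It assumes both histograms share the same binning and that signal-like (prompt) leptons have higher scores, so a lepton passes when its score is at or above the cut; for the opposite convention, reverse the bin order. The function name is hypothetical.

```python
def efficiency_curve(sig_counts, bkg_counts):
    """For a cut at the lower edge of each bin, return
    (signal efficiency, background rejection) pairs."""
    if len(sig_counts) != len(bkg_counts):
        raise ValueError("histograms must share the same binning")
    sig_total = float(sum(sig_counts))
    bkg_total = float(sum(bkg_counts))
    if sig_total <= 0.0 or bkg_total <= 0.0:
        raise ValueError("both histograms must be non-empty")
    points = []
    sig_pass, bkg_pass = sig_total, bkg_total
    for sig_bin, bkg_bin in zip(sig_counts, bkg_counts):
        # Everything from this bin upwards passes the cut
        points.append((sig_pass / sig_total, 1.0 - bkg_pass / bkg_total))
        sig_pass -= sig_bin
        bkg_pass -= bkg_bin
    return points

# Hypothetical BDT score histograms, for illustration only
prompt = [0, 10, 30, 60]
nonprompt = [50, 30, 15, 5]
curve = efficiency_curve(prompt, nonprompt)
for sig_eff, bkg_rej in curve:
    print(sig_eff, bkg_rej)
```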

Instructions for running MVA on release 21 -- work in progress

Set up the package.

mkdir -p ~/testarea/AnpBase21/source
cd ~/testarea/AnpBase21/source
git clone https://:@gitlab.cern.ch:8443/ustc/Physics/PhysicsAnpLight.git
source PhysicsAnpLight/macros/setup/first_setup_rel21.sh

Make mini-ntuple.

cd ~/testarea/AnpBase21/run
python $runMVA ../job_0001_out.root -o out_mini.root -n 100

Topic revision: r30 - 2020-01-09 - FudongHe
