Introduction
List of action items
- Study non-prompt electrons/muons that pass PLV
- Find true stable charged particles produced in B decay
- Plot properties of these secondary charged particles produced in B decays
- Plot distance of secondary B vertex to the primary pp vertex
- Decide whether we should work on further improving prompt tagging by:
- Lowering the pT threshold for the secondary vertex algorithm
- Using two-track vertices and impact parameters of additional tracks
- Useful references
Work plan (2018)
Week of July 17
- Learn structure of MC truth in the ntuple file from Rhys
- Study ATLAS simulation:
Week of July 24
- Learn MC simulation of B meson semileptonic decay:
Week of July 31
- Find distinguishing features between leptons from W, tau and non-prompt decays
- Match track jets to reconstructed electrons and muons, and plot the track jet variables for electrons and muons
- Add a plot macro, and put all the plots on the website.
- Upgrade the plot macro so that it runs much more efficiently.
Week of Aug 7
- Apply a reconstruction-level identification cut (Loose, Medium, Tight) on electrons and muons, and plot the variables to compare them with each other.
- Calculate the overlap area between non-prompt and W leptons for each histogram, and add a sortable table for electrons and muons respectively.
- Start BDT training with Rhys macros, and plot the ROC.
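One common way to define the overlap area of two histograms is the sum of the bin-wise minimum after normalising each to unit area. The sketch below assumes that definition and identical binning; the function name `hist_overlap` and the bin contents are hypothetical and are not taken from the plot macros.

```python
def hist_overlap(sig_counts, bkg_counts):
    """Return the overlap of two histograms with identical binning.

    Each histogram is normalised to unit area first, so the result is
    close to 1 for identical shapes and 0 for fully separated ones.
    """
    if len(sig_counts) != len(bkg_counts):
        raise ValueError("histograms must have the same number of bins")
    sig_total = float(sum(sig_counts))
    bkg_total = float(sum(bkg_counts))
    if sig_total <= 0 or bkg_total <= 0:
        raise ValueError("histograms must not be empty")
    # Sum over bins of the smaller of the two normalised bin contents
    return sum(min(s / sig_total, b / bkg_total)
               for s, b in zip(sig_counts, bkg_counts))


# Hypothetical bin contents for leptons from W and non-prompt leptons
prompt = [0, 10, 30, 60]
non_prompt = [50, 30, 15, 5]
overlap = hist_overlap(prompt, non_prompt)
print("overlap = %.2f" % overlap)
```

A small overlap means the variable separates the two classes well, which is why sorting the table by this value is a quick way to rank candidate input variables.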
Week of Aug 14
- Rerun the BDT training for both electron and muon, and save the log file as well. Then upload them to the website.
- Training with:
- signal = electron or muon from W; background = non-prompt lepton.
- signal = electron or muon from W or tau; background = non-prompt lepton.
- signal = electron or muon from W or tau; background = not prompt and not tau lepton.
- For both PromptLeptonIso and PromptLeptonVeto, plot their input variables with the overlap value.
- Remake the html table: for electrons, exclude the is* variables, Pt, Eta, Phi and the truth variables; for muons, exclude the is* variables, all the pt variables, PassID, medium, PDG, Eta, Phi and the truth variables.
Week of Aug 21
- Fix the bugs in the plot macros (y-scale bugs; rename some labels).
- Add a flag for different sig_bkg type.
- Rerun with full events again.
Week of Aug 28
- Write the numbers of training/evaluation events used to a txt file.
- Separate the html table writing code from the plot macros.
- Give a presentation summarising the BDT results, prepared with Beamer.
- Add a new BDT training mode named 'PromptLeptonVetoV2', which uses exactly the same input variables as 'PromptLeptonVeto' but replaces 'PtRel' with 'sv1_jf_ntrkv'.
- Keep running the BDT with different versions, and organize the results.
Week of Sep. 04
- Prepare presentation by reading some references.
Week of Sep. 11
- Prepare presentation by reading some references.
- Add efficiency uncertainty by ROOT.TEfficiency to the table.
- Add training mode 'VetoV3' and 'VetoV4' with different cone size for isolation variables.
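ROOT.TEfficiency provides binomial confidence intervals directly. As a dependency-free illustration of what such an uncertainty looks like, the sketch below uses the Wilson score interval as a stand-in; this is not necessarily the interval option used for the table, and the function name and event counts are hypothetical.

```python
import math


def wilson_interval(n_pass, n_total, z=1.0):
    """Wilson score interval for a binomial efficiency.

    z is the number of Gaussian standard deviations (z=1 is about 68% CL).
    Returns (efficiency, lower bound, upper bound).
    """
    if n_total <= 0 or not 0 <= n_pass <= n_total:
        raise ValueError("need n_total > 0 and 0 <= n_pass <= n_total")
    eff = n_pass / float(n_total)
    denom = 1.0 + z * z / n_total
    centre = (eff + z * z / (2.0 * n_total)) / denom
    half = z * math.sqrt(eff * (1.0 - eff) / n_total
                         + z * z / (4.0 * n_total ** 2)) / denom
    # Clamp to the physical range [0, 1]
    return eff, max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical counts: 900 of 1000 prompt muons pass the working point
eff, lo, hi = wilson_interval(900, 1000)
print("eff = %.3f +%.3f -%.3f" % (eff, hi - eff, eff - lo))
```

Unlike the naive sqrt(eff*(1-eff)/N) error, this interval stays inside [0, 1] and remains non-zero when the efficiency is exactly 0 or 1.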
Week of Sep. 18
- Add new variable to mini-ntuples called PtRelOverTrackJetPt.
- Start to calculate correlation coefficients of all variables in the 'PromptNtuple' framework.
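For reference, the linear (Pearson) correlation coefficient between two variables can be computed from per-lepton values as below. This is a minimal pure-Python sketch, not the 'PromptNtuple' implementation; the function name and the sample values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / float(n)
    mean_y = sum(ys) / float(n)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-lepton values of two input variables
pt_rel = [0.5, 1.2, 2.0, 3.1, 4.4]
pt_frac = [0.9, 0.8, 0.6, 0.5, 0.2]
r = pearson_r(pt_rel, pt_frac)
print("r = %.3f" % r)
```

Applying this to every pair of input variables gives the correlation matrix; strongly correlated pairs add little new information to the BDT.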
Week of Sep. 25
- Studied the math for calculating correlation coefficients; since this is not easy to do with the mini-ntuple, moved on to the next item in the to-do list for now.
- Updated current 'PromptLeptonVetoV2'
- Developed 'PromptLeptonVetoV5' algorithm.
Week of Oct. 02
- Fixing the bugs in the code that calculates the correlation coefficients ...
Week of Oct. 09
- Add 'PLVetoLoose', which applies looser event cuts:
- No longer require Z0 or D0 cuts
- Replace the isolation cut 'isoLoose' with 'PtVarCone30Rel < 1'
Week of Oct. 16
- Using mini-ntuple v3 from MUON5 derivation.
- Prepare for MCP and ML presentations.
- Two new 'PromptLeptonLoose' algorithms, observe no improvement:
- 'PLVetoLoose' new1: Put back the D0 and Z0 cut, change the isolation cut to 'PtVarCone30Rel < 0.5'.
- 'PLVetoLoose' new2: without Isolation requirements.
Week of Oct. 23
- Generated full mini-ntuple file from MUON5 with condor at USTC cluster.
- Learning Derivation framework from ATLAS tutorial.
- Add a new algorithm, PromptLeptonVetoV7, which uses the same input variables as PromptLeptonVeto but without PtRel.
Week of Oct. 30
- Gave a presentation to the tth-ml group
- Run all algorithms with whole mini-ntuple, named as V14.
- Fixed some bugs in macros/runStudyVars.py and python/PhysicsAnpLightStudyVars.py.
- Learning Derivation framework from ATLAS tutorial.
Week of Nov. 06
- Show 2D hist of 'PtFrac' and 'DRlj'.
- Re-plot 'PtFrac' with the number of bins increased by a factor of 5.
- Learn to use JetTagNonPromptLepton.
Week of Nov. 13
- Learn how to run derivation with JetTagNonPromptLepton.
- Compare the merge request from Rhys, which adds variables to the DxAOD, with the ATLAS augmentation tool tutorial example.
- Attend ML seminar, "Advanced Machine Learning for Classification, Regression, and Generation in Jet Physics" by Ben Nachman.
- Plot MSTrackPt and IDTrackPt.
Week of Nov. 20
- Give a presentation in the flavour tagging algorithm meeting.
- The question on calibrating non-prompt leptons:
- We don't need a flavour tagging calibration, because we don't really calibrate b-jets.
- The only calibration for this algorithm is the calibration of leptons from W.
- The question about retraining with different variables:
- We will do it next spring.
- Learning how to run data/MC comparisons.
Questions
- Are there truth particles for B hadron decays?
- Now we have a new ntuple file, which includes b decay information.
- Why do the distributions of variables for non-prompt leptons look so different between electrons and muons?
- Tried to add the variable ParentPDG to reco-event, but failed.
- Note the difference between the mini_ntuple and ntuple files.
- For type NotWNotTau_WTau, the BDT training result looks strange. Why?
- There may be some logic bugs in the python macros. Working on it ...
- Why is track pT / jet pT > 1 for muons? - Nov. 07
- In our code, TrackPt is filled with the primary muon track Pt, which mainly consists of the combined muon Pt.
Minutes
- For the tth-ML meeting on Oct. 30:
- Why change the selection criteria? (For tight muon evaluation, replaced 'FixedCutTightTrackOnly' with 'FixedCutTight')
- Keep no-tau training as default; study new input variables?
- Check modelling of all variables in rel 21 (to see other b-tagging analyses)
- Use lower pT tracks near the electron/muon?
- Check/show the ranking of variables to see which ones are more relevant
- Make sure the PLI variables are also in the EGAMMA derivations; separate SF for leptonic taus?
- Need to show performance for isoLoose+PLI
Plans
- Main Schedule:
- Prepare a new prompt lepton tagging algorithm for release 21 - October
- First data and MC comparisons for the new algorithm with release 21 - November
- Finalise PromptLeptonVeto training for release 21 - September and October
- Prepare new tag of JetTagNonPromptLepton - October
- Validate and calibrate PromptLeptonVeto training for release 21 - November and December
- Study new input variables - January to March
- To-do list in October -- results shown in version 14 :
- Update current 'PromptLeptonVetoV2' by replacing 'sv1_jf_ntrkv' with 'PtRelOverTrackJetPt';
- Develop the 'PromptLeptonVetoV5' algorithm, which uses the same input variables as 'PromptLeptonVeto' plus two new variables, PtVarCone40Minus30Rel and TopoEtCone40Minus30Rel:
- PtVarCone40Minus30Rel = (PtVarCone40-PtVarCone30)/PtVarCone40
- TopoEtCone40Minus30Rel = (TopoEtCone40-TopoEtCone30)/TopoEtCone40
- Learn how to create new variables and add them to 'RecoEvent' ;
- Add looser option for selecting training leptons, e.g. without D0, Z0 and loose isolation cuts;
- Add isolationFixedCut selection working point to efficiency table ;
- learn how to make mini-ntuples from MUON5 derivations using our cluster at USTC ;
- To-do list in November :
- Re-plot 'PtFrac' with the number of bins increased by a factor of 5.
- Show 2D hist of 'PtFrac' and 'DRlj'.
- Write down on paper how the jet pt is calculated.
- Follow the twiki from Rhys, learn how to run JetTagNonPromptLepton;
- Give a presentation in the flavour tagging algorithm meeting on 23rd Nov.
- Prepare a slide for Physics workshop in December and email it to others in advance, before 27th Nov.
- To-do list in February:
- Track choice for SV determination.
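The two Minus30Rel variables defined in the October to-do list share the same form, (cone40 - cone30) / cone40, so one helper covers both. A minimal sketch: the helper name, the dictionary keys and the convention of returning 0 for an empty outer cone are assumptions, not taken from the 'RecoEvent' code.

```python
def cone_minus_rel(cone40, cone30):
    """Return (cone40 - cone30) / cone40.

    Assumption: the variable is set to 0.0 when the outer cone is empty,
    to avoid dividing by zero.
    """
    if cone40 == 0:
        return 0.0
    return (cone40 - cone30) / float(cone40)


# Hypothetical isolation sums for one lepton (same units within each pair)
lepton = {
    "PtVarCone30": 3000.0, "PtVarCone40": 4000.0,
    "TopoEtCone30": 1500.0, "TopoEtCone40": 2500.0,
}
lepton["PtVarCone40Minus30Rel"] = cone_minus_rel(
    lepton["PtVarCone40"], lepton["PtVarCone30"])
lepton["TopoEtCone40Minus30Rel"] = cone_minus_rel(
    lepton["TopoEtCone40"], lepton["TopoEtCone30"])
```

Both variables measure the fraction of activity in the annulus between the 0.3 and 0.4 cones relative to the full 0.4 cone.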
General suggestions and ideas for improvements
- For now, make changes only to these three files:
- Test new pileup-resistant isolation variables (as suggested by Rhys) and compare against these working points:
FixedCutHighMuTight: topoetcone20/pT < 0.15 && ptvarcone30_TightTTVA/pT < 0.04
FixedCutHighMuLoose: topoetcone20/pT < 0.30 && ptvarcone30_TightTTVA/pT < 0.15
FixedCutHighMuTrackOnly: ptvarcone30_TightTTVA/pT < 0.06
FixedCutPflowTight: (ptvarcone30_TightTTVA_pt500 + 0.4*neflowisol20)/pT < 0.045
FixedCutPflowLoose: (ptvarcone30_TightTTVA_pt500 + 0.4*neflowisol20)/pT < 0.16
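For quick comparisons, the working points listed above can be written as a small lookup of cut functions. This is a sketch only: the cut values are copied from the list, with the particle-flow term read as 0.4 times neflowisol20, while the dictionary layout, the function name and the example muon are hypothetical.

```python
# Cut values are copied from the working-point list above. The dictionary
# layout and the keys of the `lep` dictionary are illustrative assumptions.
WORKING_POINTS = {
    "FixedCutHighMuTight": lambda lep: (
        lep["topoetcone20"] / lep["pt"] < 0.15
        and lep["ptvarcone30_TightTTVA"] / lep["pt"] < 0.04),
    "FixedCutHighMuLoose": lambda lep: (
        lep["topoetcone20"] / lep["pt"] < 0.30
        and lep["ptvarcone30_TightTTVA"] / lep["pt"] < 0.15),
    "FixedCutHighMuTrackOnly": lambda lep: (
        lep["ptvarcone30_TightTTVA"] / lep["pt"] < 0.06),
    "FixedCutPflowTight": lambda lep: (
        (lep["ptvarcone30_TightTTVA_pt500"] + 0.4 * lep["neflowisol20"])
        / lep["pt"] < 0.045),
    "FixedCutPflowLoose": lambda lep: (
        (lep["ptvarcone30_TightTTVA_pt500"] + 0.4 * lep["neflowisol20"])
        / lep["pt"] < 0.16),
}


def passing_working_points(lep):
    """Return the sorted names of all working points the lepton passes."""
    if lep["pt"] <= 0:
        raise ValueError("lepton pT must be positive")
    return sorted(name for name, cut in WORKING_POINTS.items() if cut(lep))


# Hypothetical muon; all values in MeV and given as floats
muon = {
    "pt": 50000.0,
    "topoetcone20": 5000.0,
    "ptvarcone30_TightTTVA": 2500.0,
    "ptvarcone30_TightTTVA_pt500": 2000.0,
    "neflowisol20": 1000.0,
}
passed = passing_working_points(muon)
print(passed)
```

Keeping the cuts in one table makes it easy to add a further working point and to evaluate all of them on the same lepton sample when filling the efficiency table.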
Variable descriptions
Truth variables
- Truth flags
- isBQuark
- isCQuark
- isTau
- isPrompt
- isNonPrompt
- isPhotonConv
- PDG
Kinematic
Lepton identification
Isolation
- Sum of transverse momentum of Inner Detector tracks within a cone:
- PtVarCone20Rel
- PtVarCone30Rel
- PtVarCone40Rel
- Sum of transverse energy of calorimeter topo clusters within a cone:
- TopoEtCone20Rel
- TopoEtCone30Rel
- TopoEtCone40Rel
- Boolean decision variables
- isoFixedCutTight
- isoFixedCutTightTrackOnly
Jet fitter algorithm
- jf_dR
- jf_efrc
- jf_mass
- jf_n2tv
- jf_ntrkv
- jf_nvtxlt
- jf_sig3
Secondary vertex algorithm
- svl_L3d
- svl_Lxy
- svl_dR
- svl_efrc
- svl_jf_ntrkv
- svl_mass
- svl_n2t
- svl_ntkv
- svl_sig3
Path of data at USTC
$ /moose/AtlUser/fuhe/data/user.rroberts.mc16_13TeV.410501.PowhegPythia8EvtGen_A14_ttbar.DAOD_MUON5.e5458_s3126_r9364_r9315_p3263.ntp_v1_out // Ntuple files
$ /moose/AtlUser/fuhe/data/btag-mini-ntuples-r21/mini_ntp_r21_v3.root // mini-ntuple file
Instructions for running MVA at USTC
$ ssh -XY fhe@ui05.lcg.ustc.edu.cn
$ cd /home/fhe/testarea/AnpRel20Prod/
$ source setup_atlas_analysis_release.sh
$ cd PhysicsNtuple/PhysicsAnpLight/
$ source run_mva.sh train NonP_WTau prompt-training/mva-train-2017-08-25-v1 "--training-var=PromptLeptonVeto"
$ source run_mva.sh train NonP_WTau prompt-training/mva-train-2017-08-25-v1 "--training-var=PromptLeptonVetoLoose"
- Evaluate BDT and output mini ntuple:
$ source run_mva.sh eval NonP_WTau prompt-training/mva-train-2017-08-25-v1 --do-loose
- Use the mini ntuple made above to write out histograms, split up into prompt and non-prompt:
$ source run_mva.sh plot NonP_WTau prompt-training/mva-train-2017-08-25-v1
- Use histogram root file to draw efficiency curves
$ source run_mva.sh eff NonP_WTau prompt-training/mva-train-2017-08-25-v1
Instructions for running MVA on release 21 -- work in progress
mkdir -p ~/testarea/AnpBase21/source
cd ~/testarea/AnpBase21/source
git clone https://:@gitlab.cern.ch:8443/ustc/Physics/PhysicsAnpLight.git
source PhysicsAnpLight/macros/setup/first_setup_rel21.sh
cd ~/testarea/AnpBase21/run
python $runMVA ../job_0001_out.root -o out_mini.root -n 100