TMVA Training (TMVA_Training)
BDT CHALLENGE 2018
Produced on 31-08-2018
Signal:
/dcache/atlas/susy/mmorgens/Tau3Mu/ntuples/Tau3MuHF/20180725/hist-300560.MC16a.root
Background:
/dcache/atlas/susy/mmorgens/Tau3Mu/ntuples/Tau3MuHF/20180725/hist-data15_13TeV_periodE_main.root
Note: I removed the Pt vertex variable due to the discrepancy between muons and ID tracks.
Here are the BDT settings and the variable names used by Marcus:
BDT configuration flags:
MinNodeSize=1:MaxDepth=5
tmva_sig_train_cut: "train_flag > 0.5 && pass_trigger"
tmva_sig_test_cut: "train_flag <= 0.5 && pass_trigger"
tmva_bkg_train_cut: "train_flag > 0.5 && pass_trigger"
tmva_bkg_test_cut: "train_flag <= 0.5 && pass_trigger"
- triplet_sa0xy_sig
- triplet_slxy_sig
- triplet_life_time_sig
- HT
- Tt_hard
- run1_isolation_02
- triplet_vertex_pval
- calo_met
My configuration (copy-pasted)
TMVA_Marcus_challenge_n1d5_loose:
description: "Configuration to test overlap with Marcus"
option: "MinNodeSize=1:MaxDepth=5"
tmva_sig_train_cut: "randomVal > 0.5 && (triggerPass & 1)"
tmva_sig_test_cut: "randomVal <= 0.5 && (triggerPass & 1)"
tmva_bkg_train_cut: "randomVal > 0.5 && (triggerPass & 1)"
tmva_bkg_test_cut: "randomVal <= 0.5 && (triggerPass & 1)"
variables:
- Sa0xy
- SLxy
- SlifeTime
- Ht_1jet
- TtH
- Isolation_ConeX_20
- Pv_vertex
- Mt_Calo
Caveat
I merged the full functionality into the Analysis package for release 21, so this package is now obsolete!
In short
This is just a telegraphic recap of what has to be done with this package (not for new users).
- Produce "bdtTree" samples
- Add their names in Tau3MuTMVA_Training.cpp after line 120
- Prepare a BDTConfig file with the list of BDT configurations
- Submit to Stoomboot: bash submit_Tau3MuTMVA_Training.sh <bdt config file> <number of configs to run>
- Re-run analysis to produce "invM" samples to be used with HistFitter
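For reference, an entry in the BDTConfig file follows the same layout as the configuration shown elsewhere on this page; a minimal entry could look like the sketch below (the entry name, description, and variable subset are placeholders):

```
TMVA_example_conf:
    description: "Example entry (placeholder values)"
    option: "MinNodeSize=1:MaxDepth=5"
    tmva_sig_train_cut: "randomVal > 0.5 && (triggerPass & 1)"
    tmva_sig_test_cut: "randomVal <= 0.5 && (triggerPass & 1)"
    tmva_bkg_train_cut: "randomVal > 0.5 && (triggerPass & 1)"
    tmva_bkg_test_cut: "randomVal <= 0.5 && (triggerPass & 1)"
    variables:
        - Sa0xy
        - SLxy
```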
Where to find it
This package is originally found on STOOMBOOT:
/project/atlas/users/mbedog/TMVA_Training
(gitLab?)
Contents
Contents are:
- Tau3MuTMVA_Training.cpp (main implementation)
- submit_Tau3MuTMVA_Training.sh (script to submit the job to Stoomboot)
- executable_Tau3MuTMVA_Training.sh (script which is then run on Stoomboot)
Purpose
This package is used for the Tau->3Mu analysis at ATLAS. Since for that analysis we train a BDT to distinguish our signal from the backgrounds, this code is used to train the BDT methods.
Presently we are using BDTG, as it is slightly more stable than BDT(A) and in particular more regular in its response shape, which makes it easier to fit a smooth curve to the response.
After generating Loosely selected samples of the chosen input variables (the inputs for the BDT), BDTs can be trained on these. The package described here is responsible for that training.
Description
It is based on the TMVA training example, and allows the user to train BDTs according to specific variable selections.
The variables for each BDT configuration are listed in a dedicated text file, which is then read in by the class BDTList (coming from the package Tau3MuMethods in testDERIV).
As inputs it uses ntuples which can be made with the tool Tau3Mu_BDTFiller from the same Tau3MuMethods package; these are referred to as "bdtTree" samples.
The ntuple is generally written out after Loose selection, though training on events which pass the full Tight selection is currently being tested (in Run 2 we expect to have sufficient statistics to permit BDT training after applying the Tight selection).
Running the code
The code has been updated to use an intuitive config file. The whole code is now in source/.
ROOT needs to be set up before running.
You can compile and run the code locally:
g++ Tau3MuTMVA_Training.cpp -o run.x `root-config --cflags --glibs` -lTMVA
and then run
./run.x <bdt config file> <position of the config inside the file>
This will cause the outputs to be stored locally!
Note as well that the code runs only one configuration by default (because it is meant to run on Stoomboot after testing).
One can change this simply by putting a loop into main() at the end of Tau3MuTMVA_Training.cpp.
It is better to run everything on Stoomboot (it takes no more than a few minutes on the short queue):
bash submit_Tau3MuTMVA_Training.sh <bdt config file> <number of configs to run>
(The number here can be a gross over-estimate, as the extra jobs will simply terminate immediately.)
The reason one needs to declare a maximum number of jobs at this stage is that the number of existing configurations is only known inside the bdt config file, which the bash script never opens.
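If one wanted to derive that number automatically, a sketch like the following could count the entries, assuming (hypothetically) that each configuration starts with an unindented name ending in a colon, as in the example configuration on this page:

```shell
# Count the BDT configurations in a config file so the submit script
# would not need a hand-typed upper bound. The file created here is a
# two-entry example; the name pattern assumed is hypothetical.
cat > bdt_config_example.txt <<'EOF'
TMVA_conf_A:
  option: "MinNodeSize=1:MaxDepth=5"
TMVA_conf_B:
  option: "MinNodeSize=5:MaxDepth=3"
EOF

# Match only unindented lines consisting of a name followed by ':'.
n_configs=$(grep -c '^[A-Za-z0-9_]*:$' bdt_config_example.txt)
echo "$n_configs"   # prints 2 for the example file above
```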
Running in this configuration places the outputs inside the BDT_Settings folder on /data/atlas/users/mbedog (this can be changed in executable_Tau3MuTMVA_Training.sh).
The input "bdtTree" samples of choice are to be set inside Tau3MuTMVA_Training.cpp just after line 120.
Note that the variable "label_sample" is used to make the outputs unique, so as not to mix BDT trainings with different configurations.
Evaluate the result
To see the classic BDT training plots one uses the TMVAGui in ROOT; this can be done freely after running the training:
setupATLAS; lsetup root
TMVA::TMVAGui("/data/atlas/users/mbedog/BDT_Settings/TMVA_W_conf_08_afterTight.root")
(The file name is that of your output .root file.)
Better plots for the input variables can be made using the default plotting of the testDERIV code (mergeMultiSource).
After running BDT Training
The next step in the analysis is to use the BDTs trained here to perform the final signal-versus-background selection and obtain an Upper Limit on the branching ratio of tau->3mu.
In plain words, one re-runs the analysis (or ./RUNANALYSIS) code of testDERIV to produce "invM" samples, which are then processed with the HistFitter code to obtain an estimate of the best cut on the BDT value and the corresponding expected Upper Limit on the branching fraction.
--
MatteoBedognetti - 2017-02-08