-- JaimeNorman - 2014-11-28
Motivation
The previous studies applied PID before applying the multivariate cuts. Applying PID removes the contamination from wrongly identified tracks, and the multivariate techniques then remove background-like events based on kinematic and topological variables. Mixing these two different techniques may reduce the power of the multivariate techniques. On top of this, applying PID before the MVA may bias the sample in some way (??). It was suggested during the D2H meeting (https://twiki.cern.ch/twiki/pub/Main/JaimeNorman/Nov4_JaimeNorman_D2HPresentation.pdf) that better results might be obtained if the PID variables were used in the multivariate training and application instead of applying PID a priori.
Method
The same production cuts are applied, together with a loose nσ PID selection (±5σ cuts for protons, kaons and pions), to obtain the Lc candidates. In addition, the probability of track1 being a kaon is required to be greater than 0.3, and the probability of track0 and track2 being either a proton or a pion is required to be greater than 0.3, in order to further reduce the size of the ntuple. In this way 80.4 million Lc candidates were obtained over all pT, of which 59.3 million lie in the pT range of interest, 2 < pT < 10 GeV/c.
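As a rough illustration of the loose pre-selection described above, the following Python sketch applies the ±5σ and 0.3 probability requirements to a single candidate. The dictionary layout, the field names (`nsigma`, `prob`), the pairing of the nσ and probability requirements for the same species, and the reading of "either a proton or a pion" as "compatible with at least one of the two" are all assumptions made for this example; this is not the analysis code.

```python
NSIGMA_MAX = 5.0  # loose n-sigma window quoted in the text
PROB_MIN = 0.3    # minimum PID probability quoted in the text


def compatible(track, species):
    """True if the track is loosely compatible with the given species.

    `track` is a hypothetical dict with per-species "nsigma" and "prob" entries.
    """
    return (abs(track["nsigma"][species]) < NSIGMA_MAX
            and track["prob"][species] > PROB_MIN)


def passes_preselection(tracks):
    """`tracks` is a list of three dicts ordered as track0, track1, track2."""
    track0, track1, track2 = tracks
    # track1 is required to look like a kaon.
    if not compatible(track1, "K"):
        return False
    # track0 and track2 must each look like a proton or a pion (assumed reading).
    return all(compatible(t, "p") or compatible(t, "pi")
               for t in (track0, track2))
```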
The same variables are passed to TMVA, together with 9 additional "probability variables": the probability of each of the three tracks being a proton, kaon or pion.
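For orientation, the full set of 18 input variables can be written out as follows. The variable names are taken from the TMVA ranking output in the Training section; the grouping into two Python lists and the list names themselves are only illustrative.

```python
# Kinematic and topological variables (names as they appear in the ranking output).
TOPOLOGICAL = ["DecayLXYSig", "DecayL", "CosP", "Dist12", "SigVert", "DCA",
               "PtTr0", "PtTr1", "PtTr2"]

# The 9 "probability variables": one per track (0, 1, 2) and species (p, K, pi).
SPECIES = ["p", "K", "pi"]
PROBABILITIES = [f"Tr{i}P{s}" for i in range(3) for s in SPECIES]

INPUT_VARIABLES = TOPOLOGICAL + PROBABILITIES
```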
Training
During the MVA training, the variables are ranked by their separation. Although this ranking does not take correlations between variables into account, it gives a reasonable estimate of the discriminating power of each variable. Several of the track species probabilities rank highly. While some MVA methods perform better when only the most strongly discriminating variables are used, the boosted decision tree (BDT) method effectively ignores weakly discriminating variables, so its performance is not reduced by including them. Thus, for this study only the BDT method is used (with adaptive boost and with gradient boost).
: Ranking input variables (method unspecific)...
: Ranking result (top variable is best ranked)
: ------------------------------------
: Rank : Variable    : Separation
: ------------------------------------
:    1 : Tr2Pp       : 1.535e-01
:    2 : Tr2Ppi      : 1.499e-01
:    3 : DecayLXYSig : 1.286e-01
:    4 : PtTr2       : 1.084e-01
:    5 : Tr0Ppi      : 1.072e-01
:    6 : Tr0Pp       : 1.047e-01
:    7 : PtTr1       : 9.343e-02
:    8 : Tr1PK       : 8.641e-02
:    9 : DecayL      : 6.329e-02
:   10 : CosP        : 5.846e-02
:   11 : Tr1Pp       : 5.047e-02
:   12 : Tr2PK       : 3.378e-02
:   13 : Dist12      : 3.055e-02
:   14 : Tr1Ppi      : 2.674e-02
:   15 : SigVert     : 1.968e-02
:   16 : DCA         : 1.886e-02
:   17 : PtTr0       : 1.759e-02
:   18 : Tr0PK       : 1.250e-02
: ------------------------------------
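The separation quoted in this ranking is, per the TMVA Users Guide, ⟨S²⟩ = ½ ∫ (ŷ_S(y) − ŷ_B(y))² / (ŷ_S(y) + ŷ_B(y)) dy, which is 0 for identical signal and background shapes and 1 for shapes with no overlap. A minimal binned sketch of this quantity is given below. The function name, the fixed binning and the default range are assumptions for illustration; TMVA builds its own PDF estimates internally, so the numbers will not match the log exactly.

```python
def separation(signal, background, bins=20, lo=0.0, hi=1.0):
    """Binned estimate of the separation between two samples of one variable."""

    def fractions(values):
        # Fraction of the in-range entries that fall into each bin.
        counts = [0] * bins
        for v in values:
            if lo <= v <= hi:
                idx = min(int((v - lo) / (hi - lo) * bins), bins - 1)
                counts[idx] += 1
        total = sum(counts)
        if total == 0:
            raise ValueError("no entries inside the histogram range")
        return [c / total for c in counts]

    s = fractions(signal)
    b = fractions(background)
    # Discrete form of 1/2 * integral of (s - b)^2 / (s + b).
    return 0.5 * sum((si - bi) ** 2 / (si + bi)
                     for si, bi in zip(s, b) if si + bi > 0)
```

Because the ranking is computed one variable at a time, it carries no information about correlations between variables, as noted above.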
Examples of probability distributions
The pT distributions of the decay products in data match those in the background sample fairly well, with some discrepancy at low and mid pT for track0 and track1.
pT distribution of track 0 in 3 pT bins: [2,3], [5,6] and [8,10] GeV/c
pT distribution of track 1 in 3 pT bins: [2,3], [5,6] and [8,10] GeV/c
pT distribution of track 2 in 3 pT bins: [2,3], [5,6] and [8,10] GeV/c
Background rejection vs signal efficiency, for each pT bin and boosting method.
In contrast to the previous method, the BDT responses for the training background sample and for the data match well.
Application
The same procedure as before, applying progressively tighter cuts on the MVA response to find the cut which maximises the significance, was performed on LHC13c candidates obtained using the same pre-selection as in simulation. Shown below are the 6 pT bins with the best set of cuts applied.
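The cut scan can be sketched as follows. This example assumes the significance is defined as S/√(S+B) and that S and B are simply counts of labelled signal and background responses above the cut; the text does not state the definition used, and in the analysis itself the yields would come from the invariant-mass distribution in data. The function name and the number of scan steps are hypothetical.

```python
import math


def best_cut(signal_scores, background_scores, n_steps=100):
    """Scan cuts on the MVA response; return (cut, significance) at the maximum."""
    lo = min(min(signal_scores), min(background_scores))
    hi = max(max(signal_scores), max(background_scores))
    best = (lo, 0.0)
    for step in range(n_steps):
        # Progressively tighter cut, from the lowest towards the highest response.
        cut = lo + (hi - lo) * step / n_steps
        s = sum(1 for x in signal_scores if x > cut)
        b = sum(1 for x in background_scores if x > cut)
        if s + b == 0:
            continue
        significance = s / math.sqrt(s + b)  # assumed definition
        if significance > best[1]:
            best = (cut, significance)
    return best
```

The scan would be run once per pT bin and per boosting method, since the optimal cut is not expected to be the same in each.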
Comparison
Conclusion
Using particle probabilities as inputs to the MVA training and application (method 2) achieves a signal significance similar to that of the previous method of applying PID before the training phase (method 1). However, the pT distributions of the decay products match much better with method 2, resulting in a much better match of the MVA distributions in simulation and background. This is essential in order to go on to correct the raw signal extraction for the efficiency of the BDT cut. Two possible strategies to continue with are as follows:
- Method 1 can continue to be used, taking care not to use any variables which do not show a good match between data and simulation in all pT bins. Distributions which do not show a good match can be removed, so as not to train on incorrect information.
- Method 2 can be continued.