LambdaCinpPbTMVA3 < Main

Main Web>PWG3HeavyFlavours>JaimeNorman>LambdaCinpPbTMVA3 (2014-12-03, JaimeNorman)

-- JaimeNorman - 2014-11-28

Motivation

The previous studies apply PID before applying multivariate cuts. Applying PID removes the contamination from wrongly assigned tracks, and the multivariate techniques then remove background-like events based on kinematic and topological variables. Mixing these two different techniques may reduce the power of the multivariate techniques. In addition, applying PID before the MVA may bias the sample in some way, though this is uncertain. It was suggested during the D2H meeting (https://twiki.cern.ch/twiki/pub/Main/JaimeNorman/Nov4_JaimeNorman_D2HPresentation.pdf) that better results might be obtained if the PID variables were used in the multivariate training and application instead of applying PID a priori.

Method

The same production cuts are applied, together with a loose nsigma PID selection (±5σ cuts for protons, kaons and pions), to obtain the Lc candidates. In addition, the probability of track1 being a kaon is required to be greater than 0.3, and the probability of track0 and track2 being either a proton or a pion is required to be greater than 0.3, in order to further reduce the size of the ntuple. In this way 80.4 million Lc candidates were obtained over all pT, of which 59.3 million lie in the pT range of interest, 2 < pT < 10 GeV/c.
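The pre-selection described above can be sketched as follows. This is a minimal illustration, not the analysis code: the `Track` container and its field names are hypothetical, and the requirement on track0 and track2 is interpreted here as "compatible with a proton or with a pion", with the nsigma window and the probability threshold applied to the same species. The text does not state how the two requirements are combined, so that part is an assumption.

```python
from dataclasses import dataclass

N_SIGMA_MAX = 5.0  # loose |nsigma| window quoted in the text
MIN_PROB = 0.3     # probability threshold quoted in the text


@dataclass
class Track:
    # Hypothetical container; the field names are illustrative only.
    nsigma: dict  # species ("p", "K", "pi") -> n-sigma deviation
    prob: dict    # species ("p", "K", "pi") -> PID probability


def is_compatible(track, species):
    """True if the track passes the nsigma window and the probability cut for one species."""
    return abs(track.nsigma[species]) <= N_SIGMA_MAX and track.prob[species] > MIN_PROB


def passes_preselection(track0, track1, track2):
    """track1 is the kaon candidate; track0 and track2 must look like a proton or a pion."""
    if not is_compatible(track1, "K"):
        return False
    for outer in (track0, track2):
        # Assumption: either species hypothesis is enough for the outer tracks.
        if not (is_compatible(outer, "p") or is_compatible(outer, "pi")):
            return False
    return True
```

Candidates failing `passes_preselection` would simply not be written to the ntuple, which is what reduces its size.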

The same variables are passed to TMVA, together with 9 additional "probability variables": the probability of each track being a proton, kaon or pion.

Training

During the MVA training process, the variables are ranked in terms of their separation. While this ranking does not take into account correlations between variables, it is a reasonable estimate of the power of each variable in the MVA process. Several of the per-track species probabilities rank highly. While some MVA methods perform better when only the most strongly discriminating variables are used, the boosted decision tree (BDT) method ignores weakly discriminating variables, so its performance is not reduced. Thus, for this study only the BDT method is used (with adaptive boost, and with gradient boost).

: Ranking input variables (method unspecific)...
: Ranking result (top variable is best ranked)
: ------------------------------------
: Rank : Variable    : Separation
: ------------------------------------
:    1 : Tr2Pp       : 1.535e-01
:    2 : Tr2Ppi      : 1.499e-01
:    3 : DecayLXYSig : 1.286e-01
:    4 : PtTr2       : 1.084e-01
:    5 : Tr0Ppi      : 1.072e-01
:    6 : Tr0Pp       : 1.047e-01
:    7 : PtTr1       : 9.343e-02
:    8 : Tr1PK       : 8.641e-02
:    9 : DecayL      : 6.329e-02
:   10 : CosP        : 5.846e-02
:   11 : Tr1Pp       : 5.047e-02
:   12 : Tr2PK       : 3.378e-02
:   13 : Dist12      : 3.055e-02
:   14 : Tr1Ppi      : 2.674e-02
:   15 : SigVert     : 1.968e-02
:   16 : DCA         : 1.886e-02
:   17 : PtTr0       : 1.759e-02
:   18 : Tr0PK       : 1.250e-02
: ------------------------------------
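For reference, the separation quoted in the ranking can be reproduced from a pair of binned distributions. The sketch below assumes the usual TMVA definition, <S^2> = 1/2 ∫ (y_S(x) - y_B(x))^2 / (y_S(x) + y_B(x)) dx with y_S and y_B normalised to unit area, evaluated on histograms with identical binning. The function name and inputs are illustrative and are not taken from the analysis code.

```python
def separation(sig_hist, bkg_hist):
    """Binned separation: 0 for identical shapes, 1 for non-overlapping shapes.

    sig_hist and bkg_hist are lists of bin contents with the same binning.
    """
    sig_total = sum(sig_hist)
    bkg_total = sum(bkg_hist)
    total = 0.0
    for s, b in zip(sig_hist, bkg_hist):
        p_s = s / sig_total  # fraction of signal in this bin
        p_b = b / bkg_total  # fraction of background in this bin
        if p_s + p_b > 0:
            total += (p_s - p_b) ** 2 / (p_s + p_b)
    # With per-bin fractions the bin width cancels out of the integral.
    return 0.5 * total
```

Because the value is computed one variable at a time, two strongly correlated variables can both rank highly without adding independent information, which is the caveat noted in the text.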

Examples of probability distributions

The pT distributions of the decay products in data match those in the background sample fairly well, with some discrepancy at mid and low pT for track0 and track1.

pT distribution of track 0 in 3 pT bins: [2,3], [5,6] and [8,10] GeV/c

pT distribution of track 1 in 3 pT bins: [2,3], [5,6] and [8,10] GeV/c

pT distribution of track 2 in 3 pT bins: [2,3], [5,6] and [8,10] GeV/c

Background rejection vs signal efficiency, for each pT bin and boosting method.
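A curve of this kind can be built from the MVA responses of the signal and background test samples. The following is a minimal sketch with hypothetical inputs (plain lists of response values and a list of thresholds); in practice TMVA produces these curves itself.

```python
def roc_points(sig_scores, bkg_scores, thresholds):
    """Return (signal efficiency, background rejection) for each response threshold.

    A candidate is kept if its MVA response is above the threshold.
    """
    points = []
    for cut in thresholds:
        sig_eff = sum(1 for s in sig_scores if s > cut) / len(sig_scores)
        bkg_rej = sum(1 for b in bkg_scores if b <= cut) / len(bkg_scores)
        points.append((sig_eff, bkg_rej))
    return points
```

A threshold below every response gives a signal efficiency of 1 with no background rejected, and a threshold above every response gives the opposite corner of the curve.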

In contrast to the previous method, the BDT responses of the training background sample and of the data match well.

Application

As before, progressively tighter cuts on the MVA response were applied to LHC13c candidates, obtained using the same pre-selection as in simulation, to find the cut that maximises the significance. Shown below are the six pT bins with the best set of cuts applied.
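The scan over the MVA response can be sketched as below. Two things are assumptions here and are not stated in the text: the significance is taken to be S / sqrt(S + B), and `yields_for_cut` is a hypothetical callback that returns the signal and background yields for a given cut (for example from an invariant-mass fit of the candidates surviving that cut).

```python
import math


def significance(signal, background):
    """Assumed figure of merit: S / sqrt(S + B)."""
    total = signal + background
    return signal / math.sqrt(total) if total > 0 else 0.0


def best_cut(cuts, yields_for_cut):
    """Return (cut, significance) for the cut value with the highest significance.

    yields_for_cut is a hypothetical callable: cut -> (signal, background).
    """
    best = None
    for cut in cuts:
        signal, background = yields_for_cut(cut)
        value = significance(signal, background)
        if best is None or value > best[1]:
            best = (cut, value)
    return best
```

The scan would be repeated independently in each pT bin, since the optimal cut is not expected to be the same across bins.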

Comparison

Conclusion

Using particle probabilities as an input to the MVA training and application (method 2) achieves a signal significance similar to that of the previous method of applying PID before the training phase (method 1). However, the pT distributions of the decay products match much better with method 2, resulting in a much better match of the MVA distribution in simulation and background. This is essential for correcting the extracted raw signal for the efficiency of the BDT cut. Two possible strategies for continuing are as follows:

Method 1 can continue to be used, taking care not to use any variables that do not show a good match between data and simulation in all pT bins. Variables whose distributions do not match well can be removed, so as not to train on incorrect information.
Method 2 can be continued.
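For the first strategy, one possible way to decide which variables to keep is to compare the data and simulation histograms of each variable and drop those that differ by more than some tolerance. The sketch below uses the largest gap between the two cumulative distributions as the measure. Both this measure and the threshold are illustrative assumptions and are not part of the study above.

```python
def max_cdf_distance(hist_a, hist_b):
    """Largest gap between the cumulative shapes of two histograms with the same binning."""
    total_a = sum(hist_a)
    total_b = sum(hist_b)
    cdf_a = 0.0
    cdf_b = 0.0
    worst = 0.0
    for a, b in zip(hist_a, hist_b):
        cdf_a += a / total_a
        cdf_b += b / total_b
        worst = max(worst, abs(cdf_a - cdf_b))
    return worst


def well_modelled(histograms, max_distance):
    """Keep the variables whose data and simulation shapes agree within max_distance.

    histograms maps a variable name to a (data_hist, sim_hist) pair.
    """
    return [
        name
        for name, (data_hist, sim_hist) in histograms.items()
        if max_cdf_distance(data_hist, sim_hist) <= max_distance
    ]
```

The check would need to pass in every pT bin before a variable is retained for training.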

Attachments

Topic attachments
I	Attachment	History	Action	Size	Date	Who
png	BDTG_BestCuts.png	r1	manage	205.5 K	2014-11-28 - 18:51	JaimeNorman
png	BDT_BestCuts.png	r1	manage	208.7 K	2014-11-28 - 18:50	JaimeNorman
png	GaussianSigma_2.png	r1	manage	76.7 K	2014-12-03 - 13:13	JaimeNorman
png	MVA_BDTG_pt2to3.png	r1	manage	79.4 K	2014-11-28 - 13:07	JaimeNorman
png	MVA_BDTG_pt3to4.png	r1	manage	84.5 K	2014-11-28 - 13:08	JaimeNorman
png	MVA_BDTG_pt4to5.png	r1	manage	83.6 K	2014-11-28 - 13:08	JaimeNorman
png	MVA_BDTG_pt5to6.png	r1	manage	83.8 K	2014-11-28 - 13:09	JaimeNorman
png	MVA_BDTG_pt6to8.png	r1	manage	83.4 K	2014-11-28 - 13:09	JaimeNorman
png	MVA_BDTG_pt8to10.png	r1	manage	84.5 K	2014-11-28 - 13:10	JaimeNorman
png	MVA_BDT_pt2to3.png	r1	manage	81.5 K	2014-11-28 - 13:04	JaimeNorman
png	MVA_BDT_pt3to4.png	r1	manage	81.2 K	2014-11-28 - 13:05	JaimeNorman
png	MVA_BDT_pt4to5.png	r1	manage	80.7 K	2014-11-28 - 13:06	JaimeNorman
png	MVA_BDT_pt5to6.png	r1	manage	80.3 K	2014-11-28 - 13:08	JaimeNorman
png	MVA_BDT_pt6to8.png	r1	manage	81.1 K	2014-11-28 - 13:07	JaimeNorman
png	MVA_BDT_pt8to10.png	r1	manage	80.5 K	2014-11-28 - 13:07	JaimeNorman
png	PtTr0_pt2to3.png	r1	manage	158.3 K	2014-11-28 - 11:58	JaimeNorman
png	PtTr0_pt5to6.png	r1	manage	203.1 K	2014-11-28 - 11:58	JaimeNorman
png	PtTr0_pt8to10.png	r1	manage	210.9 K	2014-11-28 - 11:59	JaimeNorman
png	PtTr1_pt2to3.png	r1	manage	145.4 K	2014-11-28 - 11:59	JaimeNorman
png	PtTr1_pt5to6.png	r1	manage	180.2 K	2014-11-28 - 11:59	JaimeNorman
png	PtTr1_pt8to10.png	r1	manage	188.9 K	2014-11-28 - 12:00	JaimeNorman
png	PtTr2_pt2to3.png	r1	manage	119.9 K	2014-11-28 - 12:01	JaimeNorman
png	PtTr2_pt5to6.png	r1	manage	127.7 K	2014-11-28 - 12:05	JaimeNorman
png	PtTr2_pt8to10.png	r1	manage	114.2 K	2014-11-28 - 12:06	JaimeNorman
png	SigErr_2.png	r1	manage	93.1 K	2014-12-03 - 13:14	JaimeNorman
png	Significance_2.png	r1	manage	57.5 K	2014-12-03 - 13:18	JaimeNorman
png	SoverB_2.png	r1	manage	43.7 K	2014-12-03 - 13:15	JaimeNorman
png	Tr0Pp_pt6to8.png	r1	manage	89.0 K	2014-11-28 - 11:30	JaimeNorman
png	Tr0Ppi_pt6to8.png	r1	manage	96.2 K	2014-11-28 - 11:31	JaimeNorman
png	Tr1PK_pt6to8.png	r1	manage	98.7 K	2014-11-28 - 11:31	JaimeNorman
png	Tr2Pp_pt6to8.png	r1	manage	83.0 K	2014-11-28 - 11:32	JaimeNorman
png	Tr2Ppi_pt6to8.png	r1	manage	96.1 K	2014-11-28 - 11:32	JaimeNorman
png	rejBvsS.png	r1	manage	37.4 K	2014-11-28 - 13:11	JaimeNorman

Topic revision: r5 - 2014-12-03 - JaimeNorman
