Studies on the CombinatorialConstraint (constraining the combinatorial bkg slope) were performed after positive slope values were observed in BDT bin 4 and η bin 2. Two forms of the constraint were tested:
- Hard upper limit at zero on the slope parameter
- Using -Abs(slope) in the model
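The two constraint forms listed above can be illustrated with a minimal plain-Python sketch. It is not the fitter code: the function names are hypothetical, the background is assumed to be a first-order polynomial in a mass variable rescaled to [-1, 1] over 4766 MeV - 5966 MeV, and the hard limit is mimicked by clamping, whereas a real minimizer restricts the parameter range internally.

```python
# Illustrative sketch only; all names here are hypothetical, not taken from the fitters.
MASS_MIN, MASS_MAX = 4766.0, 5966.0  # full mass range in MeV


def slope_hard_limit(raw_slope, lower=-10.0, upper=0.0):
    """Mimic a parameter range [lower, upper] with a hard upper limit at zero."""
    return max(lower, min(upper, raw_slope))


def slope_neg_abs(raw_slope):
    """The model only ever sees -|slope|, so the sign of the fitted parameter is irrelevant."""
    return -abs(raw_slope)


def linear_bkg(mass, slope):
    """Unnormalised first-order polynomial 1 + slope * x, with x the mass mapped to [-1, 1]."""
    centre = 0.5 * (MASS_MIN + MASS_MAX)
    half_width = 0.5 * (MASS_MAX - MASS_MIN)
    x = (mass - centre) / half_width
    return 1.0 + slope * x


# A positive raw slope is pulled to the boundary by the hard limit,
# and mirrored to a negative value by the -Abs() form.
print(slope_hard_limit(0.3), slope_neg_abs(0.3))
```

With the hard limit, a fit that prefers a positive slope ends up at the boundary (zero); with -Abs(slope) the parameter itself is unbounded and only its magnitude matters.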
Intermediate steps in the unconstrained case:
η evolution:
"Check sums" (Do the components sum up to the final model?) for the final fit:
checkModel_BDTBin1ETAbin1etaEv,
checkModel_BDTBin1ETAbin2,
checkModel_BDTBin1ETAbin3
BDT evolution:
"Check sums"
checkModel_BDTBin1ETAbin1,
checkModel_BDTBin2ETAbin1,
checkModel_BDTBin3ETAbin1,
checkModel_BDTBin4ETAbin1
Hard upper limit
Constraining only the first step of the fitting procedure (binned polynomial fit on the upper sideband) does not help. Neither does additionally constraining the second step (unbinned fit on the upper sideband). The final slope becomes negative only once the final step (combined unbinned fit on both sidebands) is constrained as well. Varying the lower limit (for instance -10000 instead of -10) changes the slope and its error negligibly (O(1E-3)). These are the results for the η and BDT bins respectively, with lower limit -10000:
-abs(slope) constraint
The slope values are quoted with a minus sign, although the sign of the fitted parameter is redundant: the slope enters the model only as -Abs(slope).
Intermediate steps for constrained cases in problematic bins
The η bin 2 hard zero:
The η bin 2 -abs():
The BDT bin 4 hard zero:
The BDT bin 4 -abs():
This is the result of Fabio's RooFit fitter:
If the values resulting from Fabio's RooFit fitter are used as initial values for the ROOT fit, the slope is still wrong. Since Fabio's fitter uses a different parametrization (nComb, nSSSV instead of nComb, nTot) and interprets the event numbers on the whole range, the initial value of nComb was calculated as Fabio's ratio nComb/(nSSSV+nComb) times the number of events in the sidebands, and nTot was initialized to the number of sideband events itself (this is the log file: log_BDT4_FabioInit.txt):
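The conversion of the initial values described above amounts to the following arithmetic. This is a minimal sketch: the function name is hypothetical and the numbers in the usage line are made up for illustration, not taken from the logs.

```python
# Illustrative sketch only; function name and example numbers are hypothetical.
def root_fit_initial_values(n_comb_full, n_sssv_full, n_sideband_events):
    """Translate full-range (nComb, nSSSV) yields into sideband (nComb, nTot) start values.

    nComb is started at the combinatorial fraction times the number of sideband
    events; nTot is started at the number of sideband events itself.
    """
    comb_fraction = n_comb_full / (n_comb_full + n_sssv_full)
    n_comb_init = comb_fraction * n_sideband_events
    n_tot_init = float(n_sideband_events)
    return n_comb_init, n_tot_init


# Made-up inputs: 30 combinatorial and 10 SSSV events on the full range, 40 sideband events.
print(root_fit_initial_values(30.0, 10.0, 40))
```

Note that this assumes the combinatorial fraction is the same in the sidebands as on the full range, which is only approximately true when the two components have different shapes.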
Attempts to double the number of combinatorial bkg events were carried out with both 'native' (those the ROOT fitter uses normally) and overridden initial values of the fit parameters. The values from both the internal note (expConst = -0.0064, slope = 1.0022) and Fabio's fitter (expConst = -0.0063608, slope = 1.0077) were used to override the native parameters. These are referred to as 'intNoteInit' and 'FabioInit' respectively.
Attempts to fit the combinatorial component with a constant were carried out. The constant trend was achieved by fixing the value of slope in the polynomial parametrization to zero.
New vs. Old RooFit fitter cross-check
This is the fit log for BDT bin 4 as fitted by Fabio's old fitter:
oldRooFitLog_BDT4.txt
And this is the log for BDT bin 4 from my RooFit fitter:
newRooFitLog_BDT4.txt
The difference is not caused by different libraries - the 'old' fitter was compiled as a standalone program linked to the same libraries as the 'new' version. Results were unchanged.
It's also not caused by a different parametrization - the same parametrization (and of course initialization) as used in the 'old' version was implemented in the 'new' one. Some minor differences were still present, and the error matrix was not positive definite.
Nor is it caused by fitting different data points - the data from both fitters were written out and compared.
A test was performed to check whether this could be caused by supplying initial values with different decimal precision. Only values printed out to the terminal had been supplied to the new fitter as starting values, and those might have been rounded before being printed. Therefore, both fitters were overridden with the same starting values, namely those printed to the screen by the old fitter. Both then converged to the same parameter values, yet the old fitter produced a sensible covariance matrix, while the new one failed to. Later, the source of all evil was found: RooFit sorts parameters (almost?) alphabetically. The order of the parameters in the new fitter was different from that in the old one, and the covariance matrix wasn't positive definite because of this. When the parameter names in the new fitter were arranged in the same order as in the old one, both fitters converged from the same initial values to the same final values and the same errors. Eureka!
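When comparing the two fitters' outputs, the parameter ordering has to be handled consistently. The sketch below shows only that bookkeeping in plain Python: it does not reproduce the covariance failure itself, whose mechanism is not described here, and the matrix entries are made up. The parameter names are the ones used on this page; the helper name is hypothetical.

```python
# Illustrative sketch only; matrix values are made up, helper name is hypothetical.
def reorder_covariance(names, cov, new_names):
    """Permute rows and columns of cov so that they follow new_names instead of names."""
    index = {name: i for i, name in enumerate(names)}
    order = [index[name] for name in new_names]
    return [[cov[i][j] for j in order] for i in order]


names = ["nComb", "nTot", "expConst", "slope"]  # order in which parameters were declared
cov = [
    [4.0, 1.0, 0.0, 0.2],
    [1.0, 9.0, 0.0, 0.0],
    [0.0, 0.0, 0.01, 0.0],
    [0.2, 0.0, 0.0, 0.25],
]

# Plain alphabetical sorting, used here as a stand-in for an internal parameter order.
sorted_names = sorted(names)
sorted_cov = reorder_covariance(names, cov, sorted_names)

# Look entries up by name rather than by position to stay independent of the ordering.
i, j = sorted_names.index("nComb"), sorted_names.index("slope")
print(sorted_names, sorted_cov[i][j])
```

Reading a matrix by position under the wrong assumed order silently pairs errors with the wrong parameters, so a by-name lookup is the safer choice when cross-checking two fitters.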
Adding a RooFit step to the ROOT fitter
A RooFit step has been added for the background. The nComb and nTot are re-calculated using full-range implementations of the bkg model, which are normalized in the sidebands but also return values in the blinded region. They are calculated as integrals over the whole mass range (4766 MeV - 5966 MeV) of the Chebychev and combined models from the last step (the extended fit returns the number of events in the sidebands only; this integration interpolates it into the blinded region). The RooFit fit step is then performed. The numbers of events in the sidebands (to be compared with the previous step) are obtained as the integrals of the resulting RooFit pdfs (total and Chebychev, respectively) over the sidebands, multiplied by the full-range numbers of events.
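The conversion between sideband and full-range event numbers described above can be sketched for a first-order polynomial background. This is a minimal sketch under stated assumptions: the blinded-window edges below are hypothetical placeholders (the actual window is not given on this page), the function names are made up, and the integrals are done analytically rather than with RooFit.

```python
# Illustrative sketch only; BLIND_LO/BLIND_HI are hypothetical, not the analysis values.
MASS_MIN, MASS_MAX = 4766.0, 5966.0  # full mass range in MeV
BLIND_LO, BLIND_HI = 5066.0, 5566.0  # hypothetical blinded window in MeV


def _to_x(mass):
    """Map the mass onto [-1, 1] over the full range."""
    return (mass - 0.5 * (MASS_MIN + MASS_MAX)) / (0.5 * (MASS_MAX - MASS_MIN))


def integral(slope, lo, hi):
    """Analytic integral of 1 + slope * x between the masses lo and hi (in units of x)."""
    x_lo, x_hi = _to_x(lo), _to_x(hi)
    return (x_hi - x_lo) + 0.5 * slope * (x_hi ** 2 - x_lo ** 2)


def sideband_fraction(slope):
    """Fraction of the full-range shape integral that lies in the two sidebands."""
    full = integral(slope, MASS_MIN, MASS_MAX)
    blinded = integral(slope, BLIND_LO, BLIND_HI)
    return (full - blinded) / full


def full_range_yield(n_sideband, slope):
    """Interpolate a sideband yield into the blinded region."""
    return n_sideband / sideband_fraction(slope)


def sideband_yield(n_full, slope):
    """Sideband yield implied by a full-range yield, for comparison with the previous step."""
    return n_full * sideband_fraction(slope)


n_full = full_range_yield(41.0, -0.3)
print(n_full, sideband_yield(n_full, -0.3))
```

The round trip returns the original sideband number only if the same shape is used in both directions; after the RooFit step the shape parameters may differ, which is what makes the comparison with the previous step informative.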
In BDT bin 4, the RooFit step, starting from the values of the previous step, resulted in the same (nonsensical) values, plus a failed error calculation. Therefore, a condition was implemented that forces the initial value of the slope to zero whenever the preceding step fits a value greater than zero. This resulted in sensible behavior in BDT bin 4, while leaving the other bins unaffected.
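The condition on the initial slope amounts to a one-line guard. The sketch below uses a hypothetical function name and is not the fitter's actual code.

```python
# Illustrative sketch only; the function name is hypothetical.
def roofit_initial_slope(previous_slope):
    """Start the RooFit step from zero if the preceding step fitted a positive slope."""
    return 0.0 if previous_slope > 0.0 else previous_slope


print(roofit_initial_slope(0.8), roofit_initial_slope(-0.4))
```

Bins whose preceding step already gives a non-positive slope pass through unchanged, which is consistent with the other bins being unaffected.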
The next step is to test the robustness with respect to sample fluctuations, i.e. to change the binning so that the number of events changes significantly, and to check the fit result (last bin only, since it's the problematic one).
|  | BDT 0.420 - 1.000 (41 SB events) | BDT 0.418 - 1.000 (45 SB events) | BDT 0.415 - 1.000 (53 SB events) | BDT 0.413 - 1.000 (55 SB events) | BDT 0.410 - 1.000 (63 SB events) | BDT 0.400 - 1.000 (72 SB events) |
| ROOT |  |  |  |  |  |  |
| RooFit |  |  |  |  |  |  |
-- OndrejKovanda - 2020-04-08