
BABAR Analysis Document #346, Version 3

Background fighting in Charmless Two-body analyses


J. Ocariz, M. Pivk, L. Roos, A. Höcker, H. Lacker, F. R. Le Diberder

July 22, 2002

LPNHE, Paris
Laboratoire de l'Accélérateur Linéaire, Orsay

Abstract

Contents

1 Introduction
2 Variable Definitions
  2.1 The CLEO cones and the monomials
  2.2 Standard topological variables
  2.3 New idea: P
  2.4 Flipped mass
  2.5 Kinematic variables
    2.5.1 Angular momentum conservation
    2.5.2 Three Pt variables
    2.5.3 Momentum of the fastest lepton
  2.6 Super Fox-Wolfram
3 Variable selection criteria
  3.1 Correlation between variables
  3.2 Signal efficiency for fixed background efficiency
  3.3 Z-Transform
4 MVA evaluation
  4.1 Cones vs. monomials
  4.2 Charged vs. Neutrals
  4.3 Global vs. roe
  4.4 Constructing the best Fisher variable
  4.5 Retraining the cones and {L0, L2} Fisher
  4.6 Combining Tagging Categories
  4.7 Neural Network performance
5 Toy Monte Carlo studies
  5.1 Expected improvement on the branching ratio statistical error
  5.2 Tagging Categories
6 Systematics
  6.1 Mode dependence of the MVA
  6.2 Tagging Category dependence of the MVA
  6.3 Monte Carlo/data comparison from Breco events
  6.4 Offpeak/onpeak effect
  6.5 p.d.f. fit defects
7 Conclusions: OldFisher → NewFisher
8 Acknowledgements

1 Introduction

This document presents a detailed study of background suppression in two-body analyses. The dominant background in selecting B0 → π+π−, B0 → K+π−, B0 → K+K− decays is the continuum of e+e− → qq̄ events. Given the low branching ratios of those decays and the large amount of qq̄ events, background fighting is indeed a major issue. The CLEO experiment has taken advantage of the different topology of background and signal events to define a Fisher discriminant based on the spatial energy distribution in the center-of-mass frame: the jettiness of qq̄ events implies that the energy flow is pulled in the direction of the B candidate. In this note we revisit the CLEO treatment in various ways. An approach based on continuous variables is explored. Topological variables like the thrust and sphericity angles may bring additional information. In section 2, we present the usual and the new variables that are tested. The selection or rejection of a given variable is decided according to several criteria, such as its separation power or its correlation with other variables. These criteria are discussed in section 3. The selected variables are then combined using either a Fisher discriminant or a neural network. Section 4 summarizes the performance of several combinations. Finally, the impact of the background suppression on branching ratio measurements and some systematic effects are studied with toy Monte Carlo simulations, described in sections 5 and 6.

2 Variable Definitions

Most of the variables described below exploit the fact that, in the Υ(4S) rest frame, the topology of qq̄ events differs from that of BB̄ events. The two B mesons are produced almost at rest in the center-of-mass frame, so there is no preferred direction for their decay products: BB̄ events are thus spherical. On the other hand, the light quarks are produced with a significant momentum, and their decay products are contained in two more or less collimated back-to-back jets. In the following definitions, the event tracks (a loose notation for charged tracks as well as neutral energy deposits) are often divided into two subsets: the B candidate daughters (the B-of-event, or Boe) tracks on the one hand, and the rest-of-event (roe) tracks on the other hand. The charged and neutral particles in the roe are taken from the GoodTracksAccLoose and GoodNeutralLooseAcc lists. When not explicitly noted otherwise, the following quantities are computed in the Υ(4S) center-of-mass frame.

2.1

The CLEO cones and the monomials

The CLEO cones, introduced by the CLEO collaboration [1], are 9 concentric, mutually exclusive cones centered around the Boe thrust axis, dividing half of the solid angle into 9 slices of 10° each. For each cone j, one defines the quantity C_j:

C_j = \sum_{i \in roe} p_i \, \theta_i^j(|\cos\theta_i|)   (1)

where p_i is the momentum in the Υ(4S) center-of-mass of the roe track i, |cosθ_i| is the (positive) cosine of the angle between its momentum axis and the Boe thrust axis, and \theta_i^j(|\cos\theta_i|) equals 1 if (j − 1) · 10° < θ_i ≤ j · 10°, and 0 otherwise. The 9 C_j are then combined into a single CLEO Cones-Fisher discriminant denoted Fcones, which reads:

F_{cones} = c_0 + \sum_{j=1}^{9} c_j C_j   (2)

where the Fisher coefficients c_j are given in Table 1.

j     0       1       2       3       4
c_j   0.1957  0.4273  0.5154  0.5304  0.1725

j     5       6       7        8        9
c_j   0.0287  -0.0495 -0.1173  -0.2780  -0.2150

Table 1: Fisher coefficients for the 9 CLEO Cones [2] (c_0 is the offset of Eq. (2)).
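To make Eqs. (1)-(2) concrete, here is a minimal Python sketch; the representation of roe tracks as (p, cosθ) pairs and the function names are illustrative choices, not the collaboration's actual code:

```python
import math

# Fisher coefficients c0..c9 from Table 1
CONE_COEFFS = [0.1957, 0.4273, 0.5154, 0.5304, 0.1725,
               0.0287, -0.0495, -0.1173, -0.2780, -0.2150]

def cleo_cones(roe_tracks):
    """Eq. (1): momentum-weighted sums over the 9 cones of 10 degrees
    around the Boe thrust axis; roe_tracks is a list of (p, cos_theta)
    pairs (hypothetical input format)."""
    cones = [0.0] * 9
    for p, cos_theta in roe_tracks:
        angle = math.degrees(math.acos(abs(cos_theta)))
        j = min(int(angle // 10), 8)  # cone index: 0..8 here, j = 1..9 in the text
        cones[j] += p
    return cones

def f_cones(roe_tracks):
    """Eq. (2): F_cones = c0 + sum_j c_j C_j."""
    c0, cj = CONE_COEFFS[0], CONE_COEFFS[1:]
    return c0 + sum(c * big_c for c, big_c in zip(cj, cleo_cones(roe_tracks)))
```

An event whose roe is empty, or whose roe falls entirely into a cone with a vanishing coefficient, evaluates exactly to c_0, which is the delta-function feature discussed in the remark.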


Remark: Although the Fcones variable is made of discrete angular slices of the solid angle, the Fcones distribution of events appears smooth because the momentum weight spreads the contribution of each cone over a wide numerical range. However, if one of the cone coefficients c_j is zero, the event distribution exhibits a delta function at c_0, because events for which all the roe tracks fall into this cone will have Fcones = c_0. Most likely these are single-track events. In practice, as exemplified by the value of c_5 in Table 1, one cone coefficient gets very small but non-zero: as a result a weak momentum dependence remains and the event distribution exhibits a peak near c_0. Such a feature of the distribution affects mostly background events, since signal events are less prone to have a single-track roe. The effect is not large and it can be taken care of if the fitting functions used to describe the p.d.f.s allow for the presence of such a peak. This is illustrated in Fig. 1.

The monomials, a set of momentum-weighted sums of the roe tracks akin to the CLEO cones, are defined as [4]:
L_j = \sum_{i \in roe} p_i \, |\cos\theta_i|^j   (3)

These variables were considered following the introduction by V. Shelkov [3] of the second-order Legendre polynomial sum

\bar{L}_2 = \sum_{i \in roe} p_i \, \frac{1}{2}\left(3\cos^2\theta_i - 1\right)   (4)

which was shown to provide a discriminant power very similar to the 9 cones. There is no obvious reason why \bar{L}_2 should be the optimal continuous extension of Fcones. Accordingly, we examine the potential gain one may obtain by using the sums

F_{\{L_j\}} = cst + \sum_j l_j L_j   (5)

Figure 1: Example of Fcones distributions for signal events (left plot) and background events (right plot). One observes a clear peak located near Fcones = c_0 ≈ 0.53 (see text), overshooting the double-Gaussian-like distribution of background events.

The best set of L_j is determined as follows: L_0, which is the sum of the particles' momenta, is primarily selected because it constitutes a shape-independent offset. The next step is to combine L_0 with every other L_{j≠0} in a linear combination defined by the Fisher algorithm. The best signal/background discrimination is obtained with the {L0, L2} pair. It was checked that adding further L_j does not bring significant extra information. Taking charged and neutral tracks at once, the {L0, L2} Fisher F{L0,L2} is:

F_{\{L_0,L_2\}} = 0.5319 - 0.6023 \, L_0 + 1.2698 \, L_2   (6)
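As a check of Eqs. (3) and (6), a small sketch assuming roe tracks given as (p, cosθ) pairs (an illustrative representation, not the production code):

```python
def monomial(roe_tracks, j):
    """Eq. (3): L_j = sum_i p_i |cos(theta_i)|**j over the roe tracks,
    given as (p, cos_theta) pairs (hypothetical input format)."""
    return sum(p * abs(cos_theta) ** j for p, cos_theta in roe_tracks)

def fisher_l0_l2(roe_tracks):
    """Eq. (6): the {L0, L2} Fisher discriminant."""
    return (0.5319
            - 0.6023 * monomial(roe_tracks, 0)
            + 1.2698 * monomial(roe_tracks, 2))
```

Note that a single-track roe with cos²θ = 0.6023/1.2698 sits exactly at the 0.5319 offset whatever its momentum, which is the anomaly discussed in the remark on Fig. 3.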

Using the L_j monomials or the Legendre polynomials \bar{L}_j as Fisher components is a matter of taste: they are linearly related to one another. For example, Eq. (6) can be rewritten:

F_{\{L_0,L_2\}} = 0.5319 - 0.1790 \, L_0 + 0.8465 \, \bar{L}_2   (7)

The reason why F{L0,L2} brings a discriminant power nearly identical to that of Fcones can be found in Fig. 2, where it is shown that the contributions of a given track to Fcones and to F{L0,L2} are very close for all θ_i values. A slightly better agreement is obtained by adding L_1 to the {L0, L2} pair; this is also shown in Fig. 2. As already stated above, the discriminant power is increased only by a negligible amount by this refinement, which is therefore not considered in the following. The improvement achieved by following the Fcones response more closely is negligible because, although the c_j coefficients are optimized by the Fisher algorithm, the optima are not deeply pronounced: what appears graphically as a significant change is in fact irrelevant as far as the discriminating power is concerned. The choice of a polynomial expansion, be it in L_j or \bar{L}_j, is more or less arbitrary: a linear expansion is practical to optimize the (linear) Fisher variable. However, a similar result would be obtained using a (non-linear) Gaussian parametrization of the c_j \theta_i^j(|\cos\theta_i|) function.

Figure 2: The contribution to Fcones as a function of θ_i for a single track of momentum p_i = 1 GeV (i.e. c_j \theta_i^j(|\cos\theta_i|)) is indicated by the solid-line histogram. The contribution to F{L0,L2} of the same track is indicated by the dotted line. The third line, which follows the histogram more closely, shows the result of a second-order polynomial fit including the linear term L_1 absent in F{L0,L2}.

Remark: The above remark on the departure from a Gaussian shape of the Fcones distributions also applies to F{L0,L2}, but the effect is weaker. As shown in Fig. 3, the F{L0,L2} background distribution cannot be fitted perfectly with a double-Gaussian: an anomaly shows up near F{L0,L2} ≈ 0.52. It is due to events with a unique track in the roe: those for which the roe track satisfies |cosθ|² ≈ 0.6023/1.2698 (i.e., θ ≈ 0.8) cluster near the Fisher offset, because the F{L0,L2} dependence on the momentum is reduced in this particular case.

2.2

Standard topological variables

The following standard event-shape variables are tested:

- the thrust [5], computed on the whole event including the B candidate tracks (Thr) or on the roe (Thr^roe);
- cosθ_T: the cosine of the angle between the thrust axis of the B candidate and the thrust axis of the roe;
- the sphericity [6], computed on the whole event including the B candidate tracks (Sph) or on the roe (Sph^roe);
- cosθ_S: the cosine of the angle between the sphericity axis of the B candidate and the sphericity axis of the roe;

Figure 3: Example of F{L0,L2} distributions for signal events (left plot) and background events (right plot). One observes a significant anomaly located near F{L0,L2} ≈ 0.5, as an echo of the Fcones peak near c_0 (see text).
- R_l and R_l^roe: the ratios of the Fox-Wolfram moments H_l/H_0 [7], computed respectively on the whole event and on the roe.

2.3

New idea: P

The present section records an attempt: it can (should) be skipped by the reader at first (second!) reading. The variable described here is quite involved but, in the end, it does not fare better than the (much simpler) other variables, to the great dismay of its proponent.

The monomials defined in Eq. (3) are built upon a sum over the tracks, but a sum is not obviously the right choice. For instance, the presence of even a single track with a momentum closely aligned with the B candidate thrust axis can be a hint for a qq̄ background event. For a variable based upon a sum, this piece of information (if relevant) will be diluted by the presence of tracks away from the thrust axis. This will be less the case for a variable based upon a product over the tracks of 1 − |cosθ_i|: a single track with |cosθ_i| ≈ 1 is capable of inducing a very low value for the product, and hence may trigger the identification of a qq̄ event.

The contribution of each track to the product can be regulated by a momentum-dependent exponent f(p). In particular, f(p) should be such that low-momentum tracks are practically removed from the product, while the contribution from stiff tracks is enhanced: this means f(p = 0) = 0. However, the more tracks in the roe, the more likely one of them is to satisfy |cosθ_i| ≈ 1. Because of that, the product is to be taken only as an intermediate step: one should correct it afterwards to account for the number of tracks in the event. Therefore, we define the product

P \equiv \prod_{i \in roe} \left(1 - |\cos\theta_i|\right)^{f(p_i)}   (8)

as an intermediate step toward the discriminating variable actually used, which is denoted C_P:

C_P = \sum_{i \in roe} \left[ \prod_{j \neq i} \frac{f(p_i)}{f(p_i) - f(p_j)} \right] P^{1/f(p_i)}   (9)

Because for signal events |cosθ_i| is uniformly distributed between 0 and 1, it can be shown [8] that, irrespective of the f(p) function, the C_P variable is itself uniformly distributed for signal events, whereas it peaks at zero for background events. The f(p) function should be adjusted to maximize the discriminant power of C_P. A good estimate of the optimal function is provided by

f(p) = \frac{1 - 2\langle|\cos\theta|\rangle}{1 - \langle|\cos\theta|\rangle}   (10)

where the average of |cosθ| is to be taken over background events of momentum p. It can be taken as different for charged and neutral roe tracks.
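Since Eq. (9) has the partial-fraction form of a survival function of a sum of scaled exponentials evaluated at −ln P, it can be coded directly. A sketch, assuming the f(p_i) values are all distinct (which the partial-fraction form requires) and an illustrative (p, cosθ) track representation:

```python
import math

def c_p(roe_tracks, f):
    """Eqs. (8)-(9): the product P over roe tracks, given as
    (p, cos_theta) pairs, and the derived C_P variable; f is the
    momentum-dependent exponent function, assumed to take distinct
    values on the given tracks."""
    weights = [f(p) for p, _ in roe_tracks]
    # Eq. (8), kept in log form for numerical safety:
    log_p = sum(w * math.log(1.0 - abs(ct))
                for w, (_, ct) in zip(weights, roe_tracks))
    total = 0.0
    for i, fi in enumerate(weights):
        coeff = 1.0
        for j, fj in enumerate(weights):
            if j != i:
                coeff *= fi / (fi - fj)
        total += coeff * math.exp(log_p / fi)  # P ** (1 / f(p_i))
    return total
```

For a single track, C_P reduces to 1 − |cosθ| and is therefore uniform when |cosθ| is uniform, as claimed in the text.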

2.4

Flipped mass

The flipped mass is the invariant mass reconstructed with the momenta flipped with respect to the B thrust axis [9, 10], computed on each particle of the whole event (Mflipped) or of the roe (Mflipped^roe).

2.5
2.5.1

Kinematic variables
Angular momentum conservation

Additional separation can be gained using angular momentum conservation.

- |cos(P_B, z)| is the cosine of the angle of the B candidate momentum with respect to the z axis. In BB̄ decays, it follows a sin²(P_B, z) distribution, while it is flat in qq̄ events.
- |cos(T_B, z)| is the cosine of the angle of the B candidate thrust axis with respect to the z axis. Signal events have a uniform distribution and background events follow a 1 + cos²(T_B, z) shape.
- |cos(S_B, z)| is equivalent to |cos(T_B, z)| but is defined with the sphericity axis.

2.5.2

Three Pt variables

These are three scalar sums over the transverse momenta [9, 10]:

- PtScal: computed with the whole event, wrt the event thrust axis;
- PtScal^roe: computed with the roe, wrt the event thrust axis;
- PtBScal^roe: computed with the roe, wrt the B thrust axis.

2.5.3

Momentum of the fastest lepton

In order to take advantage of the hard lepton momentum spectrum in B decays, one defines Pfast as the momentum of the fastest lepton in the event. It is set to zero if no lepton is found.

2.6

Super Fox-Wolfram

The Super Fox-Wolfram moments were introduced by the BELLE collaboration [12] and are defined as:

SFW = \sum_{l=1,4} \alpha_l R_l^{SO} + \sum_{l=1,4} \beta_l R_l^{OO}   (11)

with

R_l^{x,y} = \frac{H_l^{x,y}}{H_0}   (12)

where the superscripts S and O denote the B candidate tracks and the roe tracks respectively. H_l^{x,y} follows the Fox-Wolfram moment definition, constructed with the l-th order Legendre polynomial P_l:

H_l^{x,y} = \sum_{i \in x, \, j \in y} |\vec{p}_i| \, |\vec{p}_j| \, P_l(\cos\theta_{ij})   (13)

One can notice that R_l^{OO} is strictly equivalent to R_l^{roe} as defined in section 2.2:

R_l^{OO} = \frac{1}{H_0} \sum_{i \neq j}^{roe} |\vec{p}_i| \, |\vec{p}_j| \, P_l(\cos\theta_{ij}) = R_l^{roe}   (14)

The definition of R_l^{SO} is close to a Legendre version of L_l (cf. section 2.1). Nevertheless, a major difference is that the B candidate track momenta enter the definition of R_l^{SO}:

R_l^{SO} = \frac{1}{H_0} \sum_{i \in S, \, j \in O} |\vec{p}_i| \, |\vec{p}_j| \, P_l(\cos\theta_{ij})   (15)

3 Variable selection criteria

The selection of the MVA variables to be used in the likelihood analysis should ideally be done using as criterion the final (statistical ⊕ systematic) uncertainty it yields for the most cherished measurement we are aiming at. It must take into account the presence of the other variables used in the analysis. However, for background fighting, it is hardly practical to proceed that way, because one must run the whole analysis chain, including massive toy Monte Carlo simulations, for each variable (and combination of variables) one is willing to consider. Therefore, simplified estimators should be defined to probe new ideas quickly and rank the variables with respect to one another. These estimators are bound to be qualitative in nature and they cannot by themselves assess the final gain on the measurements. This is done as a last step, when the choices have boiled down enough to make a full study practical. It is important to keep in mind that the significant qualitative improvements one may observe when using more and more involved MVAs can very well dwindle down to a negligible level when one is down to the full study. Thus, one very important criterion, to be applied at the very end, is simplicity!

3.1

Correlation between variables

The correlation between two variables x_1 and x_2 considered here is defined, in the usual way, as:

\rho_{12} = \frac{\langle x_1 x_2 \rangle - \langle x_1 \rangle \langle x_2 \rangle}{\sqrt{\left(\langle x_1^2 \rangle - \langle x_1 \rangle^2\right)\left(\langle x_2^2 \rangle - \langle x_2 \rangle^2\right)}}   (16)

where the averages extend over all the signal or background events retained for the likelihood analysis. Three comments are in order:

1. The correlation coefficients should be as small as possible when they refer to one MVA variable and one variable whose p.d.f. enters the likelihood. This should be true both for signal events and for background events, because the likelihood analysis is based upon a product of p.d.f.s, including the MVA p.d.f.: the product of p.d.f.s provides a correct likelihood only insofar as the variables are uncorrelated. But some residual correlations are unavoidable, and one must not seek strictly zero correlation. Hence some know-how is required to judge how large a correlation can be tolerated without hurting the final analysis. The current feeling is that 10% is a reasonable goal: obviously this should be quantified precisely when evaluating the systematic uncertainties.

2. This correlation does not have to be small when it refers to two MVA variables. In that case, what matters is that the MVA distinguishes as well as possible between signal and background. A pair of two correlated variables can be more powerful than a pair of two uncorrelated variables, provided the signal and background correlation coefficients are different enough. Because of that, the ρ(MVA1, MVA2) values are to be considered with caution: they are quoted below only to give a feeling of how degenerate two MVA variables are.

3. The correlation coefficients are obtained from a sum over all the events entering the likelihood analysis. For a tight-cut (extended) likelihood analysis keeping only a signal-box region, where few background events remain, such a sum makes sense: the signal yield and its CP properties are truly extracted from the whole sample of events retained. However, for a loose-cut (non-extended) likelihood analysis keeping large side-bands (i.e., a large background-box) together with the signal-box, such a sum is misleading.
This is because a large correlation can arise from the background-box and hence be irrelevant (it would not affect the measurement): one may then be led to wrongly reject a variable on the grounds that it appears too correlated with others. Similarly, a correlation can be small because it is small in the irrelevant, but highly populated, background-box, although it is large in the relevant, but poorly populated, signal-box: one may then be led to wrongly accept a variable on the grounds that it appears uncorrelated with others, whereas in fact it might bring a significant systematic bias. The ρ(x1, x2) correlation matrix is shown in Fig. 4.
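Eq. (16) amounts to the standard Pearson coefficient; a plain-Python sketch over paired per-event samples (illustrative, not the analysis code):

```python
import math

def correlation(x1, x2):
    """Eq. (16): linear correlation coefficient between two variables,
    estimated from paired per-event samples x1 and x2."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    m12 = sum(a * b for a, b in zip(x1, x2)) / n
    v1 = sum(a * a for a in x1) / n - m1 * m1
    v2 = sum(b * b for b in x2) / n - m2 * m2
    return (m12 - m1 * m2) / math.sqrt(v1 * v2)
```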

Figure 4: Correlation coefficient matrices (in %) between the variables Fcones, F{L0,L2}, cos(T,z), cos(P,z), R2^roe and Sph^roe. The signal (background) event matrix is on the left (right).

3.2

Signal efficiency for fixed background efficiency

An analysis can aim for a significant measurement only if the signal is not drowned under a huge background. For a loose-cut (non-extended) likelihood analysis, the sample of events used in the likelihood is mostly populated with background. Denoting f the fraction of signal in the sample, a good qualitative indicator of the power of an MVA is given by the signal selection efficiency one is left with when applying a cut on the MVA such that the background selection efficiency is set equal to f, which brings the background down to a level comparable with the signal. In the h+h− case, f ≈ 5%, and we thus decided to quote as a qualitative indicator the signal efficiency obtained for a background efficiency fixed at 5%.

3.3

Z-Transform

This is a two-step change of variable introduced [13] to make the choice between the different variables easier by standardizing the shapes of the MVA distributions, and to simplify the fit of the p.d.f. used in the likelihood. No gain of information is obtained by using the Z-Transform, but it is a convenient tool. If x is the discriminating variable, and B(x) and S(x) the distributions for background and signal respectively, we define the intermediate variable y, and then the Z variable, as follows:

x \to y = \frac{B(x)}{B(x) + S(x)}, \qquad Z = \int_0^y b(y') \, dy'   (17)

where b(y') is the background distribution of the y variable. Stated differently, Z is the background selection efficiency for y' < y.

Introducing the y variable, one removes the shape arbitrariness inherent in all MVA distributions. Under a one-to-one change of variable x → x', the signal and background distributions S(x) and B(x) transform into S'(x') = S(x) ẋ and B'(x') = B(x) ẋ, where the common function ẋ is the derivative of x with respect to x'. Being common to both transformed distributions, ẋ can be omitted since it drops out of a likelihood analysis: one is left with the initial distributions, which demonstrates that the one-to-one change of variable is arbitrary. As a result, one can dramatically change the aspect of an MVA distribution without affecting in any way its discriminating power: stated differently, the shapes of the MVA distributions may lead to misleading impressions. Taking the Fisher as an example, a unique set of variables can lead to widely different MVA distributions: changing the constant offset or the overall scale factor are only two examples of harmless changes of variable. However, the y variable is invariant under x → x' changes of variable. The fact that x → y is itself a harmless change of variable can be seen readily: since one can multiply the likelihood expression at will by an event-dependent factor, choosing B(x) + S(x) for this factor makes the likelihood depend only on y. Therefore the y variable contains all the information needed to fight against background.

Introducing the Z variable, one removes the shape arbitrariness inherent in the y distributions. One uses a one-to-one change of variable y → y' to force the background distribution to take a standardized shape. By construction, with y' ≡ Z, the background distribution B(Z) is uniform between zero and one, while the signal distribution S(Z) peaks at Z = 0: in effect, the latter can be expressed as S(Z) = (1 − y) y⁻¹. The background distributions being identical for all MVAs, when comparing different MVAs it is only the signal distribution S(Z), and more precisely its peaking near the origin, which matters. The more discriminating the variable, the higher the peak of S(Z) at zero. The software implementation of the Z-Transform is very simple (it is a matter of a dozen lines of code).
Figure 5 shows an example of the two-step change of variable.
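The "dozen lines of code" can look like the following sketch, which estimates Z with an empirical background CDF (a simplifying assumption; this is not the original implementation):

```python
import bisect

def make_z_transform(S, B, background_sample):
    """Eq. (17) in two steps: y = B/(B+S), then Z = the background
    cumulative distribution of y. S and B are (possibly unnormalised)
    density callables; background_sample is a list of background x
    values used to estimate the CDF empirically."""
    def y(x):
        return B(x) / (B(x) + S(x))
    y_bkg = sorted(y(xb) for xb in background_sample)
    def Z(x):
        # background selection efficiency for y' < y(x)
        return bisect.bisect_left(y_bkg, y(x)) / len(y_bkg)
    return Z
```

By construction, applying Z to the background sample itself gives a distribution that is uniform on [0, 1].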

4 MVA evaluation

A Fisher algorithm is used to combine the discriminant variables. Variables are kept or rejected according to the criteria described in section 3. The training of the algorithm is performed on a signal SP4 Monte Carlo Kπ sample and on onpeak m_ES side-band data of the Winter 2002 sample (5.20 < m_ES < 5.26 GeV/c² and |ΔE| < 0.150 GeV). The standard selection described in [19] is applied. One is left with 8007 signal events and 9677 background events. 4003 events of each category are used to train the Fisher or the neural network algorithm. The performance and the Fisher distributions are extracted from the remaining events (4004 signal and 5674 background). The coefficients of the Fisher combinations of the cones and {L0, L2} given in section 2.1 are recomputed with these samples in order to have a fair comparison between the sets of variables tested in the following. In subsection 4.5, it is shown that this reoptimization does not change the results significantly.
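The Fisher training step amounts to the textbook prescription a ∝ W⁻¹(μ_S − μ_B), with W the mean within-class covariance; a sketch in numpy (an illustrative re-derivation, not the actual NCTwoBodyAnal code):

```python
import numpy as np

def train_fisher(signal, background):
    """Fisher discriminant coefficients from two (n_events, n_vars)
    arrays: a = W^-1 (mu_S - mu_B), with W the average within-class
    covariance matrix. This is the standard construction behind
    coefficient sets such as those of Eqs. (2) and (6)."""
    mu_s = signal.mean(axis=0)
    mu_b = background.mean(axis=0)
    w = 0.5 * (np.cov(signal, rowvar=False) + np.cov(background, rowvar=False))
    return np.linalg.solve(w, mu_s - mu_b)
```

The overall scale and offset of the resulting discriminant are arbitrary (a harmless change of variable, in the sense of section 3.3).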

4.1

Cones vs. monomials

As mentioned in section 2.1, it is established that the best combination of monomials is {L0, L2}. Table 2 summarizes the performances of {L0, L2} and the CLEO cones,


Figure 5: The upper plots show the signal and background x, y and Z distributions of the CLEO Cones-Fisher. The bottom plots correspond to the CLEO-Fisher (11 variables: cones + 2 kinematics). The dashed line in the bottom right-hand-side plot reproduces the signal Z-distribution of the CLEO Cones-Fisher above: one observes that the CLEO-Fisher is slightly more discriminating than the CLEO Cones-Fisher.


which are found to be very similar. This is expected from the fact that the same physics information is used in both cases. {L0 , L2 } is preferred since it is continuous and involves only two variables.

variables    ε(S) @ ε(B) = 5%   <Z>
CLEO cones   0.338 ± 0.007      0.213 ± 0.004
{L0, L2}     0.344 ± 0.007      0.213 ± 0.004

Table 2: Signal efficiency at a background efficiency of 5%, and <Z>, for the CLEO cones and {L0, L2}.

4.2

Charged vs. Neutrals

In the B0 a0 analysis [4], it is shown that distinguishing neutral and charged objects in the evaluation of the cones or the monomials brings a significant improvement in the separation power of the variable. The study is repeated here for the B0 → h+h− analysis. One calculates neutral cones using only the neutral particles of the roe, and charged cones taking into account only the charged tracks of the roe. The performances of the 18 variables {conesneut, conesch} are given in Table 3. The comparison with the results of Table 2 and Figure 6 shows that no significant improvement is observed in the signal/background separation.

variables                ε(S) @ ε(B) = 5%   <Z>
{conesneut, conesch}     0.341 ± 0.007      0.212 ± 0.004

Table 3: Signal efficiency at a background efficiency of 5%, and <Z>, for the 18 variables {conesneut, conesch}.

In the same way, neutral monomials Lj^neut and charged monomials Lj^ch are computed. It is found that the best Lj combination remains {L0, L2} for both the charged and the neutral cases. As for the cones, {L0^neut, L2^neut, L0^ch, L2^ch} does not show a better separation power than {L0, L2}. Nick Danielson worked as well on the charged/neutral separation [14] and reached the same conclusion.

4.3

Global vs. roe

There is a number of discriminant variables in whose definition the B candidate tracks enter: the thrust Thr, the sphericity Sph, the Fox-Wolfram moment ratios Rl, the flipped mass Mflipped, PtScal and finally the Super Fox-Wolfram moments. They exhibit two major disadvantages:

- the validation of the Monte Carlo signal distribution with the fully reconstructed hadronic B sample [16] is not straightforward;


Figure 6: The left plot shows the Z distribution for the cones computed globally (full line) and computed with neutrals and charged particles separately (dashed line). The right plot shows the background efficiency vs. the signal efficiency for the same variables.

- because the momenta of the B candidate tracks enter the evaluation of these variables, they are correlated with the B candidate substituted mass and ΔE. Figure 7 shows the correlation in the case of R2 and Sph.

All the variables listed above are therefore rejected from the final choice. For the same reason, Pfast is also rejected. We then use the above shape variables, but computed on the roe only; they are denoted below R2^roe, Sph^roe, Thr^roe, etc.

4.4

Constructing the best Fisher variable.

Starting from {L0 , L2 }, we try to improve the signal/background separation by adding other variables:
- At a previous stage of our study, combining R2^roe and Sph^roe with {L0, L2} did bring some improvement (see for example version #2 of this BAD). With the present event selection, the improvement is now negligible: the signal efficiency at 5% background efficiency goes from 34.4 ± 0.7% to 35.0 ± 0.7%.

- Adding higher orders of the roe Fox-Wolfram moment ratios does not help.
- Thr^roe carries the same information as Sph^roe. It could have been chosen as well.
- cosθ_S is a very powerful discriminating variable. A cut at |cosθ_S| < 0.8 rejects the background significantly (70% rejected) with 80% efficiency on the signal. There is no gain in applying a softer cut and using cosθ_S in the Fisher, while a cut at 0.8 highly reduces the number of events entering the final fit. As cosθ_T carries the same information, it is not used either.


Figure 7: R2 (top) and sphericity Sph (bottom) as a function of the B candidate substituted mass (left) and ΔE (right).


- The roe transverse momenta PtScal^roe and PtBScal^roe are correlated at 86% and 90% respectively with {L0, L2} and do not improve the separation. The same conclusion is reached with Mflipped^roe.
- Not surprisingly, the kinematic variables |cos(P_B, z)| and |cos(T_B, z)| (or |cos(S_B, z)|) do improve the separation: the efficiency increases from 37.8 ± 0.1% for {L0, L2, R2^roe, Sph^roe} to 40.5 ± 0.1% for {L0, L2, R2^roe, Sph^roe, |cos(P_B, z)|, |cos(T_B, z)|}. Here again, the choice between |cos(T_B, z)| and |cos(S_B, z)| is arbitrary.

In summary, three sets of variables are retained [11]:

- the simplest set: {L0, L2},
- a basic set of 4 topological variables: base4 = {L0, L2, R2^roe, Sph^roe},
- the most powerful set: var6 = {L0, L2, R2^roe, Sph^roe, |cos(P_B, z)|, |cos(T_B, z)|}.

Given that the kinematic variables |cos(P_B, z)| and |cos(T_B, z)| have zero correlation with base4, they could be used in the final likelihood fit as well. In any case, combining them with a Fisher algorithm allows the gain from these variables to be evaluated in a very simple way.

In Table 4 and in the following, these three sets are compared to the cones. These results can be seen in Fig. 8.

variables    ε(S) @ ε(B) = 5%   <Z>
CLEO cones   0.338 ± 0.007      0.213 ± 0.004
{L0, L2}     0.344 ± 0.007      0.213 ± 0.004
base4        0.350 ± 0.007      0.213 ± 0.004
var6         0.371 ± 0.008      0.201 ± 0.004

Table 4: Powers of the four Fisher combinations.

4.5

Retraining the cones and {L0, L2} Fisher

The coefficients given in section 2 are those of the current version of NCTwoBodyAnal (see for example NCTwoBodyAnal V00-13-00), used to produce version 13 of the two-body ntuples. The cone coefficients were estimated in the early charmless two-body analysis [17] and remain unchanged. The {L0, L2} coefficients were computed from a MC sample and offpeak data passing the following cuts: R2 < 0.95, Sph > 0.01 and |cosθ_S| < 0.08. Although the performances are similar (see Table 5), for the sake of consistency it is recommended to switch in the future to the {L0, L2} Fisher coefficients as calculated in this section:

F_{\{L_0,L_2\}} = 0.0015 - 0.5417 \, L_0 + 1.4930 \, L_2   (18)


Figure 8: The left plot shows the Z distributions for the cones, {L0, L2}, base4 and var6. The right plot shows the background efficiency vs. the signal efficiency for the same sets of variables.

variables           CLEO cones      {L0, L2}
ε(S) @ ε(B) = 5%    0.327 ± 0.007   0.335 ± 0.007
<Z>                 0.234 ± 0.004   0.228 ± 0.004

Table 5: Signal efficiency at a background efficiency of 5%, and <Z>, for the CLEO cones and {L0, L2} as defined in Section 2.


4.6 Combining Tagging Categories

Tagging being performed using information from the roe, the signal-to-background discrimination varies significantly among the different tagging samples. Figure 9 shows the mes distribution for untagged and tagged hh candidates, and Figure 10 shows the latter split according to their category. Table 6 summarises the signal and background efficiencies. In particular, candidates tagged via the lepton category are mostly pure signal events, while non-tagged candidates have the smallest S/B ratio. This fact speaks in favour of a tagging-dependent analysis.

The expected gain from this tagging-dependent analysis can be evaluated analytically. The statistical error σ(N_S) on the yield, measured from an n-dimensional ML fit, is

    σ(N_S) = √( N_S / ⟨s²⟩_f )    (19)

where the separation ⟨s²⟩_f is defined as

    ⟨s²⟩_f = ∫ dⁿx  f S²(x) / [ f S(x) + (1 − f) B(x) ]    (20)

with the integration performed over the n variables x₁, ..., xₙ used in the fit, f being the signal fraction, and S(x), B(x) the n-dimensional PDFs of the signal and background distributions. If several independent fits are performed on uncorrelated samples, the combined yield has an error given by

    σ(N_S) = 1 / √( Σ_i Q_i )    (21)

    Q_i = ε_i² ⟨s²⟩_{f_i} / N_i    (22)

where the index i refers to each tagging category, and ε_i, N_i and ⟨s²⟩_{f_i} are the signal efficiency, signal yield and signal-to-background separation within that category.

In this study, we evaluate the quality factors Q_i by an integration over three discriminant variables: mes, ΔE and F_{L0,L2}. The signal PDFs are extracted from π⁺π⁻ signal Monte Carlo, and the background PDFs from onpeak sidebands: 5.2 < mes < 5.26 GeV/c² for the ΔE and F_{L0,L2} distributions, and |ΔE| > 0.15 GeV for mes. The PDFs are evaluated both globally and independently for each tagging category; in addition, for the lepton category, the kaon background PDFs are also used, to estimate the uncertainty coming from the poor statistics of the background lepton-tag sample. Table 7 records the functions used to parametrise the PDFs.

The results of the analytical computation, done using the π⁺π⁻ signal fractions obtained in the Moriond 02 analysis, are summarised in Table 8. Adding the factors for the Lepton+Kaon+NT1+NT2+NoTag categories, we obtain a combined quality factor almost 7% larger than the equivalent for a single analysis, thus implying a 3.2% improvement on the N_ππ yield precision. Equation 21 shows a three-fold dependence of the expected gain from a combined analysis:


Figure 9: Left (right) plot: mes distribution for untagged (tagged) hh candidates.

Figure 10: mes distributions separated by tagging category.


Figure 11: Background mes, ΔE and F_{L0,L2} PDFs by tagging category. Also shown are the F_{L0,L2} PDFs for signal.

Category   Signal            Background
Lepton     (12.57 ± 0.34)%   (1.20 ± 0.05)%
Kaon       (34.77 ± 0.49)%   (27.15 ± 0.22)%
NT1        (7.53 ± 0.27)%    (6.38 ± 0.12)%
NT2        (14.94 ± 0.37)%   (17.10 ± 0.19)%
No Tag     (30.19 ± 0.47)%   (48.18 ± 0.25)%

Table 6: Tagging efficiencies, estimated using MC events for the signal, and 5.2 < mes < 5.26 GeV/c² sidebands for the background.


Variable     Signal PDF            Background PDF
mes          Gaussian              Argus
ΔE           Gaussian              Parabola
F_{L0,L2}    Bifurcated Gaussian   Double Gaussian

Table 7: Signal and background PDF parametrisations used.

sub-sample                Q(1)        Q(2)
Lepton Tag                0.1040417   0.1058392
  (with Kaon PDFs)        0.1061336   —
Kaon Tag                  0.1978656   0.1961083
NT1 Tag                   0.0447742   0.0407896
NT2 Tag                   0.0774646   0.0729409
No Tag                    0.1211795   0.1301118
Combined                  0.5453256   0.5457898
Tag-independent           —           0.5102391

Table 8: Quality factors. The Q(1) quality factors are obtained using category-dependent PDFs; the Q(2) estimation uses the single, average PDF for each variable.
- The signal fraction f. Figure 12 shows the expected improvement as a function of f, up to 10%. The computation was performed with the same PDFs as for the numbers tabulated in Table 8. Our method brings an appreciable improvement for channels with very low S/B, and the gain is reduced to about 1% for the K⁺π⁻ case (f ≈ 0.065).

- The differences in the distributions among categories. Figure 11 shows the relevant mes, ΔE and F_{L0,L2} background PDFs split by tagging category. Shape variations are globally marginal, except for the lepton and NT1 categories, which are anyway limited by statistics. Table 8 compares the quality factors obtained either with these category-dependent PDFs or with a single set of PDFs. All numbers remain about the same: except for lepton tags, the quality factors are largely insensitive to variations of the shapes among categories. As a further cross-check, we have also used the set of PDFs obtained from the Kaon category for the lepton category; again, the result is essentially unchanged.

- The variations of the relative signal and background efficiencies among categories, which is the dominant effect. This is a remarkable fact: the expected gain will not be diluted by the limited knowledge of the individual distributions, even for those categories suffering from poor statistics.
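The machinery of Eqs. (19)-(22) can be evaluated numerically. The sketch below uses a one-dimensional toy with invented Gaussian shapes, signal fraction and category efficiencies (the analysis itself integrates the three-dimensional mes, ΔE, F_{L0,L2} PDFs); it also illustrates a built-in consistency check: when all categories share the same PDFs and signal fraction, the combined error of Eq. (21) reproduces the single-fit error of Eq. (19), i.e. splitting brings a gain only through genuine category differences:

```python
import numpy as np

# Toy 1-D version of Eqs. (19)-(22); shapes and numbers are illustrative.
x = np.linspace(-10.0, 10.0, 20001)
dx = x[1] - x[0]

def gauss(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

S = gauss(x, 0.0, 1.0)   # signal PDF
B = gauss(x, 1.5, 1.2)   # background PDF

def separation(f):
    """<s^2>_f = integral of f S^2 / (f S + (1 - f) B)   (Eq. 20)."""
    return np.sum(f * S**2 / (f * S + (1.0 - f) * B)) * dx

N_S = 120.0              # total signal yield
f = 0.02                 # signal fraction
sigma_single = np.sqrt(N_S / separation(f))          # Eq. (19)

# Two categories with identical PDFs and identical f: Eqs. (21)-(22)
# must then reproduce the single-fit error.
eff = np.array([0.3, 0.7])                           # signal efficiencies
Q = eff**2 * separation(f) / (eff * N_S)             # Eq. (22)
sigma_combined = 1.0 / np.sqrt(Q.sum())              # Eq. (21)
print(sigma_single, sigma_combined)
```

With category-dependent shapes or fractions, the sum of the Q_i exceeds the tag-independent quality factor, which is exactly the ~7% effect of Table 8.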

4.7 Neural Network performance

A way to take advantage of non-linear correlations between variables is to use a neural network. We use a multi-layer perceptron neural network [20]. The architecture for

Figure 12: Improvement on the statistical error (in percent) as a function of the signal fraction f = S/(S+B).

each set of variables is given in Table 9.

                              CLEO cones   {L0, L2}   base4     var6
number of input variables     9            2          4         6
number of output classes      2            2          2         2
number of layers              4            4          4         4
number of neurons per layer   9/8/7/2      2/2/2/2    4/3/2/2   6/5/4/2

Table 9: Neural network architectures for the four sets.


Among the 4 layers, 2 are hidden. The neural network is trained with 6000 cycles. The results obtained are given in Table 10. Apart from the 6-variable combination, there is no significant gain in using the neural network instead of the Fisher algorithm. The parameterization of the Fisher PDF being easier, we decide not to use the neural network.
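A minimal numerical sketch of the var6 architecture of Table 9 (6 inputs, two hidden layers of 5 and 4 sigmoid units, 2 output classes) is given below. The weights are random placeholders and no training (the 6000 cycles quoted above) is attempted; the point is only to show the forward pass implied by the 6/5/4/2 layout:

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [6, 5, 4, 2]                     # var6 column of Table 9
weights = [rng.normal(0.0, 0.5, (m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def forward(x):
    """Forward pass: sigmoid hidden layers, softmax output classes."""
    for w, b in zip(weights[:-1], biases[:-1]):
        x = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    z = x @ weights[-1] + biases[-1]
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

out = forward(rng.normal(size=(10, 6)))  # 10 hypothetical var6 candidates
print(out.shape)                         # (10, 2): signal/background scores
```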

5 Toy Monte Carlo studies

The studies presented in this section and in Section 6 are performed with the LMinuit toy Monte Carlo and fitting package [18]. The probability density functions of the B candidate substituted mass, ΔE and the Cherenkov angles are those used in the fit to the Winter 2002 data sample [19]. The generated event numbers, both for signal and backgrounds, also correspond to the Winter 2002 statistics. 500 experiments are generated for each configuration, for which only the Fisher PDFs


variables           CLEO cones      {L0, L2}        base4           var6
ε(S) @ ε(B) = 5%    0.336 ± 0.007   0.344 ± 0.007   0.353 ± 0.007   0.411 ± 0.008
<Z>                 0.211 ± 0.004   0.216 ± 0.004   0.194 ± 0.004   0.171 ± 0.004

Table 10: Powers of the four Neural Network outputs.


are changed. The number of events, signal and background, of each species, as well as the Kπ asymmetries, are fitted. Also left free in the fit are the following background PDF parameters: the shape parameter of the ARGUS substituted-mass distribution, the two highest-order coefficients of the quadratic parameterization of the ΔE spectrum, and the five parameters of the double Gaussian describing the Fisher.

5.1 Expected improvement on the branching ratio statistical error


For each of the four Fishers (CLEO cones, {L0, L2}, base4 and var6), the distributions of background and signal are extracted from the onpeak mES side-band data sample (5.20 < mES < 5.26 GeV/c² and |ΔE| < 0.150 GeV) and from signal Monte Carlo, respectively. Standard selection cuts [19] are applied. The signal distributions are fitted with a bifurcated Gaussian, the background ones with a double Gaussian. The same PDFs are used to generate and fit the data sample.
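The two Fisher shapes used in these fits can be sketched as follows; the parameter values below are illustrative placeholders, not the fitted values of Table 11:

```python
import numpy as np

def bifurcated_gaussian(x, mu, sigma_l, sigma_r):
    """Gaussian with different widths left/right of the peak (signal shape)."""
    sigma = np.where(x < mu, sigma_l, sigma_r)
    norm = np.sqrt(2.0 / np.pi) / (sigma_l + sigma_r)   # unit integral
    return norm * np.exp(-0.5 * ((x - mu) / sigma) ** 2)

def double_gaussian(x, frac, mu1, s1, mu2, s2):
    """Sum of two Gaussians with relative fraction `frac` (background shape)."""
    g = lambda m, s: np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))
    return frac * g(mu1, s1) + (1.0 - frac) * g(mu2, s2)

x = np.linspace(-6.0, 6.0, 24001)
dx = x[1] - x[0]
ps = bifurcated_gaussian(x, 0.2, 0.8, 0.5)        # placeholder signal shape
pb = double_gaussian(x, 0.6, 0.4, 0.6, 0.3, 0.4)  # placeholder background shape
print(np.sum(ps) * dx, np.sum(pb) * dx)           # both close to 1
```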

                 CLEO cones   {L0, L2}   base4   var6
signal
  μ              0.277        0.181      0.071   0.186
  σ_L            0.604        0.671      0.777   0.783
  σ_R            0.551        0.492      0.456   0.535
background
  A1             0.619        0.762      0.477   0.562
  μ1             0.429        0.648      0.208   0.387
  σ1             0.608        0.206      0.382   0.393
  μ2             0.343        0.428      0.470   0.398
  σ2             0.402        0.309      0.648   0.720

Table 11: Signal and background PDF parameters for the four Fisher combinations.
Figure 14 shows the statistical error on the fitted number of ππ and Kπ signal events for the four Fishers. The improvement on the mean of the error distribution, relative to the CLEO cones, is given in Table 12. {L0, L2} and base4 show similar performances. For var6, a slight improvement is observed, larger in the ππ channel due to its higher background level. A maximum improvement of 2.3% is reached with var6.

5.2 Tagging Categories

The aim of this study is to validate the analytical calculation presented in Section 4.6. Four independent toy MC sets of 250 experiments are performed, with signal and

Figure 13: Fisher distributions for the cones, {L0, L2}, base4 and var6. The bifurcated (double) Gaussian fits are superimposed on the signal (background) distributions.

Figure 14: Statistical error on the fitted number of ππ (left) and Kπ (right) signal events for the four Fishers. From top to bottom: CLEO cones, {L0, L2}, base4 and var6.

            {L0, L2}   base4   var6
σ(Nππ)      0.2%       0.1%    2.3%
σ(NKπ)      0.0%       0.2%    0.9%

Table 12: Relative difference of the mean of the error distribution with respect to the CLEO cones performance. The uncertainty is 0.3%.
background yields corresponding to the lepton tag category, the kaon tag category, the NT1+NT2+noTag categories, and all categories merged, respectively. Only ππ events are generated. All fit parameters are fixed except the signal and background yields; the Kπ and KK yields are fixed to zero. The same PDFs are used in all four sets, and the {L0, L2} Fisher is used. Table 13 shows the mean of the generated signal and background yields in each of the four sets of experiments, the average fitted yields and their statistical error. The quadratic sum of the uncertainties on the lepton, kaon and NT1+NT2+noTag signal yields is 15.45 ± 0.05, to be compared with 15.59 ± 0.04 when fitting all categories at once. The conclusion of this study is that, for the π⁺π⁻ signal-to-background ratio, the statistical uncertainty on the yields is improved by about 1%. As this number is smaller than the one obtained in Section 4.6, the same exercise was repeated for four different values of the signal fraction; the results are summarised in Figure 15. Already for a S/B ratio just below the π⁺π⁻ value, the gain in statistical precision exceeds several percent. We conclude that the realistic gain, while slightly diluted with respect to the analytical expectation, is substantial for channels with signal fractions just below the π⁺π⁻ case.

                  leptons       kaons         NT1+NT2+noTag   All
generated N_sig   15.5          43.4          65.4            124.3
generated N_bkg   100.6         2116.9        5856.9          8074.4
fitted N_sig      15.2 ± 0.3    42.9 ± 0.6    65.4 ± 0.9      124.0 ± 0.9
σ(N_sig)          4.23 ± 0.03   8.73 ± 0.04   12.04 ± 0.03    15.59 ± 0.04

Table 13: Mean of the generated signal and background yields, average fitted yields and their statistical uncertainty, for the lepton, kaon, NT1+NT2+noTag categories and all categories merged.

6 Systematics

6.1 Mode dependence of the MVA

A simple selector based on the DIRC measurement is applied to both tracks of the B candidate to split the onpeak side-band data sample into three sub-samples: ππ, Kπ, KK. In order to get pure sub-samples, strong requirements are applied: Nγ > 10, and a track is identified as a pion (resp. a kaon) if ((θC − θC(K))/σθC)² − ((θC − θC(π))/σθC)² > 10 (resp.


Figure 15: Toy MC estimation of the improvement in the statistical error, as a function of the S/B ratio, by combining tagging-dependent fits. The Moriond S/B fraction was 0.0223.


((θC − θC(π))/σθC)² − ((θC − θC(K))/σθC)² > 10). 21% of the events are classified as ππ, 15% as Kπ and 9% as KK, while 55% are rejected. A two-Gaussian function is used to fit the ππ distribution. The Kπ and KK distributions do not show a strongly asymmetric shape at the level of the available statistics and are fitted with a single Gaussian. The resulting mode-dependent PDFs are shown in Figure 16, together with the common background and signal PDFs described in Section 5.1.

Figure 16: Difference of the mode-dependent background PDFs with respect to the global PDFs given in Table 11. All background PDFs are normalized to one. In order to indicate the signal region, the signal PDF is displayed with an arbitrary normalization.
The fit is performed in its standard way, i.e. with one background Fisher PDF common to all modes, the parameters of the double Gaussian being left free. The histograms of Figure 17 show the difference between the fit results and the mean of the


generated Poisson distributions (Nππ = 124.3 and NKπ = 402.7, DIRC-acceptance-corrected yields). Table 14 summarizes the average difference on the numbers of ππ and Kπ events. No significant biases are observed for the cones, {L0, L2} and var6. A 3.3σ effect is seen on the ππ yield with the base4 Fisher. However, the high-purity selector used to split the background sample leads to relatively low statistics in the ππ mode sub-sample, and the statistical fluctuations of the double-Gaussian parameters are large. One has to wait for more data to conclude on the implications of the mode dependence for the fitted yields.

               CLEO cones   {L0, L2}    base4       var6
Nππ = 124.3    0.6 ± 0.8    0.1 ± 0.8   2.6 ± 0.8   0.4 ± 0.7
NKπ = 402.7    0.4 ± 1.1    1.0 ± 1.1   0.1 ± 1.1   1.0 ± 1.2

Table 14: Average bias on the fitted numbers of events.

6.2 Tagging Category dependence of the MVA

As seen in Fig. 11, the Fisher PDFs vary significantly among tagging categories. In the standard fit configuration, global PDFs, extracted from the distribution over all events of all categories, are used for both signal and background. In order to evaluate a possible bias due to this approximation, 250 toy experiments were performed: the events are generated according to their Elba tagging-category PDF, and the fit is performed in its standard way. Table 15 shows the average bias on the fitted numbers of ππ and Kπ signal events. Two Fisher combinations are tested: {L0, L2} and the CLEO cones. The results are similar: one observes a small bias on Nππ (0.7 to 1.6σ) and a significant bias on NKπ (2.4 to 2.8σ). Note that this study was performed with the Elba tagger; the differences among tagging categories seem to be less important with the new Moriond tagger [15].

               CLEO cones   {L0, L2}
Nππ = 124.3    0.7 ± 1.0    1.8 ± 1.1
NKπ = 402.7    4.3 ± 1.6    3.9 ± 1.6

Table 15: Average bias on the fitted numbers of events.

6.3 Monte Carlo/data comparison from Breco events

This section still needs to be updated. This study relies on the work described in [16]. By comparing Monte Carlo and real data in open-charm, fully reconstructed neutral B decays, one can extract a linear correction to the signal Fisher distribution obtained from the B⁰ → K⁺π⁻ Monte Carlo.
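Schematically, applying such a linear correction amounts to reweighting the MC shape and renormalising it; the slope and offset below are invented placeholders, the real correction coming from the Breco data/MC comparison [16]:

```python
import numpy as np

x = np.linspace(-3.0, 3.0, 12001)
dx = x[1] - x[0]

# Uncorrected MC Fisher shape (placeholder Gaussian), unit-normalised.
raw = np.exp(-0.5 * ((x - 0.3) / 0.7) ** 2)
raw /= np.sum(raw) * dx

a, b = 1.0, 0.05                                   # hypothetical linear correction
corrected = raw * np.clip(a + b * x, 0.0, None)    # keep the PDF non-negative
corrected /= np.sum(corrected) * dx                # renormalise to unit area

shift = np.sum(x * (corrected - raw)) * dx         # induced shift of the mean
print(shift)
```

A positive slope pulls the corrected shape towards higher Fisher values, which is the kind of data/MC shift the toy study below then propagates to the fitted yields.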


Figure 17: Difference between the fitted numbers of events and the mean of the generated Poisson distributions. Left: Nππ. Right: NKπ. From top to bottom: CLEO cones, {L0, L2}, base4 and var6.

The possible systematic effect of neglecting such a correction is studied here in the same way as in Section 6.1: the linearly corrected signal PDF is used to generate the events. The fit is then performed twice:

- with the corrected PDF,
- with the raw PDF extracted from Monte Carlo.

Table 16 gives the average difference on the fitted numbers of ππ and Kπ events. A similar behaviour is observed for all Fishers; the kinematic variables |cos(PB, z)| and |cos(TB, z)| seem to introduce a slightly larger effect.

                       CLEO cones   {L0, L2}     base4        var6
Nππ^raw − Nππ^corr     1.52 ± .xx   1.20 ± .xx   1.84 ± .xx   .xx ± .xx
NKπ^raw − NKπ^corr     2.78 ± .xx   2.16 ± .xx   3.39 ± .xx   .xx ± .xx

Table 16: Average difference between the numbers of events fitted with the raw MC PDF and with the corrected PDF.

6.4 Offpeak/onpeak effect

This section still needs to be updated. The systematic uncertainty due to the limited statistical accuracy of the background Fisher parametrization can be evaluated by extracting the PDF from two independent data sets. The ΔE and mES side-bands in the onpeak data can be used as an alternative to the offpeak data. The toy MC proceeds as usual: the events are generated following the PDF extracted from offpeak data, and the fit is performed with:

- the offpeak data PDF,
- the side-band onpeak data PDF, defined either by |ΔE| > 100 MeV or by 5.20 GeV/c² < mES < 5.27 GeV/c².

The systematic uncertainty is given by the average difference between the numbers fitted with the offpeak PDF and the numbers fitted with the onpeak PDF. The results are summarized in Table 17.
                           CLEO cones   {L0, L2}    base4       var6
Nππ^off − Nππ^SB(ΔE)       1.2 ± .xx    1.3 ± .xx   2.1 ± .xx   .xx ± .xx
NKπ^off − NKπ^SB(ΔE)       1.2 ± .xx    0.6 ± .xx   0.9 ± .xx   .xx ± .xx
Nππ^off − Nππ^SB(mES)      0.7 ± .xx    0.5 ± .xx   0.9 ± .xx   .xx ± .xx
NKπ^off − NKπ^SB(mES)      0.9 ± .xx    0.6 ± .xx   0.7 ± .xx   .xx ± .xx

Table 17: Average difference between the numbers of events fitted with the offpeak data PDF and with the onpeak data PDF from the |ΔE| and mES side-bands.


6.5 p.d.f. Fit defects

7 Conclusions: OldFisher → NewFisher

Table 18: Summary of the performance of {L0, L2}, base4 and var6 with respect to the CLEO cones, in terms of ε(S) at ε(B) = 5%, statistical gain, mode dependence, Breco correction and offpeak/onpeak effect.

8 Acknowledgements

We wish to thank Ran Liu and Jinwei Wu for the help they provided in incorporating the discriminant variables introduced by the Wisconsin group into the study presented here. Vasia Shelkov's contributions to background fighting have been instrumental to our study: we benefited a lot from his constant smiling advice. Finally, we would like to extend our warm thanks to Jim Olsen, who encouraged us to undertake the dreadful attempt to do better than CLEO did: although we did not succeed to the extent we hoped for, it was fun and we learned a lot in the process.


References
[1] CLEO Collaboration, D.M. Asner et al., Phys. Rev. D 53, 1039 (1996).
[2] $BFDIST/packages/NCTwoBodyAnal/V00-13-00/BkgSupTraining/MCTrained-V03/FI/yFile.BkgSup.FI
[3] V. Shelkov, "Analysis of B0 → a0(980)π", talk given at the December 2000 Collaboration Meeting, http://www.slac.stanford.edu/BFROOT/www/Organization/CollabMtgs/2000/detDec2000/Tues4e/vasia.pdf.
[4] BABAR Analysis Document #141, "Search for B0 → a0(980)π", A. Höcker et al.
[5] S. Brandt et al., Phys. Lett. 12, 57 (1964); E. Farhi, Phys. Rev. Lett. 39, 1587 (1977).
[6] J.D. Bjorken and S.J. Brodsky, Phys. Rev. D 1, 1416 (1970).
[7] G.C. Fox and S. Wolfram, "Observables for the Analysis of Event Shapes in e+e− Annihilation and Other Processes", Phys. Rev. Lett. 41, 1581 (1978).
[8] P. Janot and F.R. Le Diberder.
[9] R. Liu, Y. Pan and J. Wu, "NN method for B0 (B̄0) → 3π Selection", talk given at the September 2001 Collaboration Meeting, http://www.slac.stanford.edu/BFROOT/www/Physics/Analysis/AWG/chrmls_hadronic/meetings/Sep20_01/pan.ps.
[10] M. Pivk, talk given at one of the November 2001 Charmless two-body meetings, http://www.slac.stanford.edu/~pivk/talks/08nov01.ps.
[11] M. Pivk, talk given at the December 2001 Collaboration Meeting, http://www.slac.stanford.edu/~pivk/talks/12dec01.ps.
[12] Belle Collaboration, K. Abe et al., Belle Preprint 2001-5, hep-ex/0104030, submitted to Phys. Rev. Lett.
[13] BABAR Analysis Document #XXX, "Statistics for Rare Decay Searches".
[14] http://www.slac.stanford.edu/~danielsm/fisher_9-20-01.pdf
[15] BABAR Analysis Document #446, "Measurement of branching fractions and CP-violating asymmetries in B → h+h− decays", M. Bona et al.
[16] BABAR Analysis Document #xxx, "Breco", H. Lacker et al.
[17] BABAR Analysis Document #38, "Measurement of the branching fractions for charmless two-body decays", U. Berzano et al.
[18] A. Farbin, LISPBase package, documentation in LISPBase/doc, 2001.
[19] BABAR Analysis Document #357, "Measurement of the branching fractions and CP-violating asymmetries in B → h+h− decays", F. Bianchi et al.
[20] P. Gay, B. Michel, J. Proriol and O. Deschamps, "Tagging Higgs Bosons in Hadronic LEP-2 Events with Neural Networks", Pisa 1995, New Computing Techniques in Physics Research, 725 (1995).

