Proportion estimators are used frequently in many application areas. The conventional proportion estimator (number of events divided by sample size) encounters a number of problems when the data are sparse, as will be demonstrated in various settings. The problem of estimating its variance when sample sizes become small is rarely addressed in a satisfying framework. Specifically, we have in mind applications like the weighted risk difference in multicenter trials or stratified risk ratio estimators (to adjust for potential confounders) in epidemiological studies. It is suggested to estimate p using the parametric family p̂_c and p(1 − p) using p̂_c(1 − p̂_c), where p̂_c = (X + c)/(n + 2c). We investigate the problem of choosing c ≥ 0 from various perspectives, including minimizing the average mean squared error of p̂_c, the average bias and the average mean squared error of p̂_c(1 − p̂_c). The optimal value of c for minimizing the average mean squared error of p̂_c is found to be independent of n and equals c = 1. The optimal value of c for minimizing the average mean squared error of p̂_c(1 − p̂_c) is found to depend on n, with limiting value c = 0.833. This might justify using the near-optimal value c = 1 in practice, which also turns out to be beneficial when constructing confidence intervals of the form p̂_c ± 1.96 √(n p̂_c(1 − p̂_c))/(n + 2c).
1 Introduction
Previous papers (e.g., Agresti and Caffo1 among others) have considered the problem of estimating the binomial parameter p, highlighting the point that X/n might not be a good choice as an estimator for p when the sample size n is small. Here, let X be a binomial variable with event probability parameter p and sample size parameter n, the latter assumed to be known. The conventional estimate p̂ = X/n encounters a number of problems if n is small. For example, the variance of p̂, if estimated by p̂(1 − p̂)/n, is 0 if X = 0 (or X = n). This occurs even if p is far away from zero, as Table 1 shows, with considerably large probability for small n. Having an estimator with considerable variance p(1 − p)/n and having this variance estimated to be zero with large probability might be considered a problem in itself. The paper will suggest some simple solutions for it.
However, the problem becomes even more severe when different proportion estimators are combined by means of optimal pooling. Suppose that k proportion estimators are available from k independent studies, with data X_i and n_i, i = 1, 2, ..., k. Typically, they are combined using ∑_{i=1}^k w_i (X_i/n_i) / ∑_{i=1}^k w_i with efficient weights w_i = n_i/[p_i(1 − p_i)], the inverse of the variance p_i(1 − p_i)/n_i. Unfortunately, these weights are unknown and are replaced by sample weights ŵ_i = n_i/[p̂_i(1 − p̂_i)], which are undefined if X_i = 0 or X_i = n_i. Typically, this
Address for correspondence: Dankmar Böhning, Free University Berlin/Humboldt University at Berlin, Joint Center for Health Sciences and Humanities, Biometry and Epidemiology, Fabeckstr. 60-62, Haus 562, 14195 Berlin, Germany. E-mail: boehning@zedat.fu-berlin.de
Table 1 Probability of the event X = 0 (here with p = 0.3)

n    P(X = 0)
1    0.70
2    0.49
3    0.34
4    0.24
5    0.17
problem is coped with by excluding these study data.2,3 For illustration, consider the data given in Table 2, representing proportion data with high sparsity in the treatment group of a multicenter clinical trial with 21 centers involved in treating cancer with a new treatment. In 10 of these centers (indicated by a *), almost half of all centers involved, a variance estimate of 0 occurred, leading to a considerable loss of data.
Of course, under homogeneity one might use the pooled estimator ∑_i X_i / ∑_i n_i, which arises if we use the weights w_i = n_i/[p(1 − p)], so that the term p(1 − p), constant across centers, cancels out. However, center-specific variances might be required for homogeneity testing. The problem becomes more elaborate when considering two (or more) trial arms, where we might be interested in a measure such as the risk difference RD_i = p_i − q_i, where q_i is the success probability in the baseline arm and p_i is the corresponding one of the treatment arm in the ith center. Suppose that the risk difference is estimated in the conventional way as RD̂_i = p̂_i − q̂_i, where p̂_i = X_i/n_i is the estimated proportion for the treatment arm and q̂_i = Y_i/m_i for the baseline arm.
We see some important application areas of the work at hand in more elaborate measures that involve the proportion, such as the risk difference RD = p − q, the relative risk RR = p/q, or the odds ratio OR = [p/(1 − p)]/[q/(1 − q)]. Here p is the risk in the treatment arm, whereas q is the baseline risk. The relative risk estimator (X/n)/(Y/m) has infinite moments if there is a positive probability for Y = 0, and similarly for the odds ratio. This is reflected in the sample in that the relative risk or odds ratio becomes undefined if y = 0 has been observed. Some measures are still defined when both proportion estimators are 0 (risk difference), whereas others are still defined if the baseline estimator is nonzero (relative risk). In contrast, in those cases the estimated variance is either zero or undefined. Therefore, it appears reasonable to separate the estimation of the mean of the measure of interest and its associated variance.
                Source 1
Source 2        1        0
1               x11      x12      x1.
0               x21      x22      x2.
                x.1      x.2      n
Table 4 Capture–recapture parameter layout with two sources

                Source 1
Source 2        1        0
1               p11      p12      p1.
0               p21      p22      p2.
                p.1      p.2      1
n̂ = (x.1 + c)(x1. + c)/(x11 + c) − 2c    (2)
3 Estimating p
1) Var(p̂_c) = n p(1 − p)/(n + 2c)²

2) bias(p̂_c) = c(1 − 2p)/(n + 2c)

3) MSE(p̂_c) = [n p(1 − p) + c²(1 − 2p)²]/(n + 2c)²

4) Var(p̂_c) ≤ Var(X/n) if c ≥ 0
Note that properties 1) and 4) show that p̂_c is a shrinkage estimator: the larger the c, the smaller the variance. On the other hand, the larger the c, the higher the bias, unless p = 1/2.
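Properties 1)–3) can be checked numerically by computing the exact moments of p̂_c = (X + c)/(n + 2c) under the binomial distribution; a quick verification sketch (function name is ours):

```python
from math import comb

def moments_phat_c(n, p, c):
    """Exact bias, variance and MSE of (X + c)/(n + 2c) under Binomial(n, p)."""
    probs = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    est = [(k + c) / (n + 2 * c) for k in range(n + 1)]
    mean = sum(pr * e for pr, e in zip(probs, est))
    var = sum(pr * (e - mean) ** 2 for pr, e in zip(probs, est))
    mse = sum(pr * (e - p) ** 2 for pr, e in zip(probs, est))
    return mean - p, var, mse

n, p, c = 10, 0.3, 1.0
bias, var, mse = moments_phat_c(n, p, c)
# compare with the closed forms 1)-3) above
print(abs(bias - c * (1 - 2 * p) / (n + 2 * c)))       # ~0
print(abs(var - n * p * (1 - p) / (n + 2 * c) ** 2))   # ~0
print(abs(mse - (n * p * (1 - p) + c**2 * (1 - 2 * p) ** 2) / (n + 2 * c) ** 2))  # ~0
```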
Unfortunately, it is impossible to find c such that p̂_c has smallest MSE for all p. Consider p = 0: here c = 0 leads to minimal MSE. On the other hand, if p = 1/2, the smallest MSE is achieved for c approaching infinity. Therefore, we consider the average MSE with respect to a uniform prior on [0, 1], ∫₀¹ MSE(p̂_c) dp — in terms of mathematical statistics, the Bayes risk with respect to the Euclidean loss function and uniform prior. A straightforward computation shows that

∫₀¹ MSE(p̂_c) dp = 1/(n + 2c)² ∫₀¹ [n p(1 − p) + c²(1 − 2p)²] dp = (1/6) (n + 2c²)/(n + 2c)²    (3)
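Equation (3), and the optimality of c = 1 derived from it below, can be checked numerically; the sketch integrates the MSE over p with a midpoint rule and scans a grid of c values (step sizes are our choice):

```python
def avg_mse(n, c, steps=20000):
    """Average MSE of (X + c)/(n + 2c) over a uniform prior on p (midpoint rule)."""
    h = 1.0 / steps
    return sum(
        (n * p * (1 - p) + c**2 * (1 - 2 * p) ** 2) / (n + 2 * c) ** 2
        for p in ((i + 0.5) * h for i in range(steps))
    ) * h

n = 10
closed = lambda c: (n + 2 * c**2) / (6 * (n + 2 * c) ** 2)  # right-hand side of (3)
print(abs(avg_mse(n, 1.0) - closed(1.0)))                   # ~0

# the average MSE is minimized at c = 1
best = min((i / 100 for i in range(301)), key=closed)
print(best)  # 1.0
```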
To minimize

f(c) = (n + 2c²)/(n + 2c)²

consider

f′(c) = 4n(c − 1)/(n + 2c)³ = 0    (4)

which yields the optimal value c = 1, independent of n. To see for which values of p the estimator (X + 1)/(n + 2) improves upon X/n, equate MSE(p̂_1) and MSE(p̂_0), or equivalently,

p² − p = −n/(8n + 4)    (5)

and solve for p, yielding the two solutions p_{1,2} = 0.5 ± 0.5 √((n + 1)/(2n + 1)). Note that (n + 1)/(2n + 1) is strictly monotone decreasing with n increasing, with limiting value 0.5 as n → ∞. Therefore, (X + 1)/(n + 2) will have its largest range of improvement upon X/n if n is small. For any value of n, (X + 1)/(n + 2) has MSE at most as large as X/n for p in the interval 0.5 ± 0.5 √0.5 ≈ [0.15, 0.85] (Figure 1). We summarize as follows.
Theorem 3.2. The estimator (X + 1)/(n + 2) has MSE smaller than X/n in the interval with endpoints

p_{1,2} = 0.5 ± 0.5 √((n + 1)/(2n + 1)),    (6)

an interval which always contains 0.5 ± 0.5 √0.5 ≈ (0.15, 0.85).
4 Estimating p(1 − p)
Sometimes it might still be useful to use X/n to estimate p, whereas in other cases one might prefer to use (X + c)/(n + 2c) with c > 0 to estimate p. Whatever estimate is used, its variance is (n/(n + 2c))² p(1 − p)/n, with c ≥ 0, involving the parameter θ = p(1 − p). Therefore, we are interested in providing an improved estimator for p(1 − p) by studying the class p̂_c(1 − p̂_c).

c_n = (n/2) (√(1 + 2/(n − 1)) − 1)    (9)
Figure 2 Bias of θ̂_c = p̂_c(1 − p̂_c) for different values of c: c = 0 (solid), c = 0.5 (dashed), c = 1 (dotted) and n = 5.

Figure 3 Squared bias of θ̂_c = p̂_c(1 − p̂_c) for different values of c: c = 0 (solid), c = 0.5 (dashed), c = 1 (dotted) and n = 5.
c² + nc − n²/(2(n − 1)) = 0

of which Equation (9) provides the positive root. This ends the proof of the first part. For the second part, note that the limit is not changed if the term 2/(n − 1) under the square root is replaced by 2/n + 1/n²:

lim_{n→∞} (n/2) (√(1 + 2/(n − 1)) − 1) = lim_{n→∞} (n/2) (√(1 + 2/n + 1/n²) − 1)

= lim_{n→∞} (n/2) ((1 + 1/n) − 1) = lim_{n→∞} (n/2)(1/n) = 1/2
The expected value of θ̂_c has already been provided in Equation (7). Furthermore,

∫₀¹ 2[n(n − 1)p²(1 − p)² + c(c + n)p(1 − p)]/(n + 2c)² dp = [n(n − 1)/15 + c(c + n)/3]/(n + 2c)²    (12)

which leaves only the more difficult moment E(θ̂_c²) = E(p̂_c − p̂_c²)² = E(p̂_c² − 2p̂_c³ + p̂_c⁴) to be evaluated. This will involve the noncentral moments of the binomial distribution up to the power of 4. We do this in three steps.
4.2.1 E(p̂_c²)
This is the most elementary step and can be simply accomplished by noting that

4.2.2 E(p̂_c³)
E(p̂_c³) can be written as

E[((X + c)/(n + 2c))³] = [E(X³) + 3c E(X²) + 3c² E(X) + c³]/(n + 2c)³    (15)
4.2.3 E(p̂_c⁴)
We have for E(p̂_c⁴)

E(p̂_c⁴) = [E(X⁴) + 4c E(X³) + 6c² E(X²) + 4c³ E(X) + c⁴]/(n + 2c)⁴    (18)

which also involves the 4th noncentral moment of X:

E(X⁴) = np + 7n(n − 1)p² + 6n(n − 1)(n − 2)p³ + n(n − 1)(n − 2)(n − 3)p⁴
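The fourth noncentral moment formula can be validated against direct summation over the binomial probability mass function; a brief check for a few arbitrary (n, p) pairs:

```python
from math import comb

def moment4_exact(n, p):
    """E(X^4) by direct summation over the Binomial(n, p) mass function."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) * k**4 for k in range(n + 1))

def moment4_formula(n, p):
    """The closed form quoted in the text."""
    return (n * p + 7 * n * (n - 1) * p**2
            + 6 * n * (n - 1) * (n - 2) * p**3
            + n * (n - 1) * (n - 2) * (n - 3) * p**4)

for n, p in [(2, 0.5), (8, 0.3), (15, 0.9)]:
    print(abs(moment4_exact(n, p) - moment4_formula(n, p)))  # ~0 in every case
```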
Theorem 4.3 Let θ̂_c = p̂_c(1 − p̂_c); then the average mean squared error is provided as:

∫₀¹ MSE(θ̂_c) dp = (α₂c² + α₁c + α₀)/(n + 2c)² − 2(β₃c³ + β₂c² + β₁c + β₀)/(n + 2c)³ + (γ₄c⁴ + γ₃c³ + γ₂c² + γ₁c + γ₀)/(n + 2c)⁴ + 1/30    (19)
n      c_n        m(c_n)
1      0.36603    0.00556
2      0.58114    0.00583
3      0.65849    0.00512
4      0.69921    0.00449
5      0.72445    0.00398
6      0.74166    0.00357
7      0.75415    0.00323
8      0.76364    0.00295
9      0.77110    0.00271
10     0.77711    0.00251
11     0.78206    0.00234
12     0.78621    0.00219
13     0.78974    0.00205
14     0.79277    0.00193
15     0.79541    0.00183
16     0.79773    0.00173
17     0.79978    0.00165
18     0.80160    0.00157
19     0.80324    0.00150
20     0.80472    0.00144
...
50     0.82176    0.00063
...
100    0.82752    0.00032
...
∞      0.83300    0.00000
If the practitioner prefers to have an integer value for c, the preceding analysis clearly suggests a value of c = 1.

It has been solidly demonstrated1,10–12 that the simple method of confidence interval estimation p̂ ± 1.96[p̂(1 − p̂)/n]^0.5 suffers from a number of disadvantages, including a breakdown of the coverage probability even for considerably large sample sizes. A number of approximate methods for constructing confidence intervals for a single proportion are available, and the literature on them is extensive, including reviews by Leemis and Trivedi,20 Ghosh,21 Newcombe,22 Vollset,23 and Santner and Duffy.24 Clearly, our objective is not to provide another comparison, but to demonstrate how the simple Wald interval can be modified using p̂_c to achieve rather improved properties.
As an alternative for constructing a confidence interval, the Clopper–Pearson25 method might be considered. This method is also called the 'exact' method, which is unfortunate as it could easily be misunderstood as having the property of providing a coverage probability of exactly the desired level. In fact, it provides a coverage probability at least as large as the desired level, with the tendency of providing a confidence level too large with respect to the desired level. Therefore, Agresti and Coull26 argue that 'approximate is better than exact'. Among approximate confidence intervals, the score confidence interval27 performs well. Agresti and Coull26 (p. 120) comment: 'This article shows that the score confidence interval tends to perform much better than the exact or Wald intervals in terms of having coverage probabilities close to the nominal confidence interval'. Newcombe22 emphasizes in his concluding remarks that 'of the methods that perform well only the score method is calculator friendly'. In addition, Böhning10 supported the good performance of the score method. The score interval is easily derived as follows. Consider the following inequality
L(p) ≤ X/n ≤ U(p)

and a direct argument shows that U⁻¹(X/n) and L⁻¹(X/n) are found as the two roots of the quadratic equation in p

(X/n − p)² = 1.96² p(1 − p)/n    (21)

where p̂ = X/n as before. We find it also valuable to look at approximate methods, but our major emphasis is on methods based on p̂_c, not a full comparison of approximate methods. Therefore, we include the Wilson score interval only as a reference method to which confidence intervals based upon p̂_c can be compared. We conclude the paper by demonstrating that p̂_c can be used beneficially in constructing a confidence interval for the proportion.
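To make the comparison concrete, here is a sketch of the Wilson score interval (the roots of (21) in closed form) next to the modified Wald interval built from p̂_c, which, following the abstract, we take to be p̂_c ± 1.96 √(n p̂_c(1 − p̂_c))/(n + 2c):

```python
from math import sqrt

Z = 1.96

def wald_c(x, n, c):
    """Modified Wald interval based on p_hat_c (the form quoted in the abstract)."""
    p = (x + c) / (n + 2 * c)
    half = Z * sqrt(n * p * (1 - p)) / (n + 2 * c)
    return p - half, p + half

def wilson(x, n):
    """Wilson score interval: the two roots of (X/n - p)^2 = Z^2 p(1 - p)/n."""
    ph = x / n
    denom = 1 + Z**2 / n
    centre = (ph + Z**2 / (2 * n)) / denom
    half = (Z / denom) * sqrt(ph * (1 - ph) / n + Z**2 / (4 * n**2))
    return centre - half, centre + half

print(wald_c(3, 4, 1.0))  # both limits stay inside (0, 1) even for this sparse case
print(wilson(3, 4))
```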
performs well with respect to all three measures, followed by the interval (23) defined by c = 0.5 (at least for smaller sample sizes).
The mean expresses the average coverage probability, whereas the minimum informs
about the worst case coverage probability. Note that both measures relate to different
‘philosophies’: a method providing good minimum coverage probability will typically
overestimate the desired level, and vice versa. A method of confidence interval
construction is desirable for which the coverage probability is least dependent on p:
this is measured with the variance.
instead of Equation (23). Equation (24) is more conservative than Equation (23), as n/(n + 2c) ≤ 1 if c ≥ 0. The results of the evaluation (setting as mentioned earlier) are provided in Table 7. If we compare Tables 6 and 7, we clearly see that using the more conservative interval (24) has a positive effect in bringing the coverage probabilities closer to the desired level. Now, we consider Table 7 alone. Here, the better methods are provided with c = 1 and c = 2. For smaller sample sizes, the minimum coverage probability is better for c = 1, whereas the interval (24) defined by c = 2 becomes better for larger sample sizes (n > 7). On the other hand, the mean coverage probability becomes better for these larger sample sizes if c = 1. Overall, the empirical evidence provided supports a choice of c = 1 when constructing a 95% confidence interval for p.
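The coverage comparison can be reproduced in outline. The exact displays of intervals (23) and (24) are not reproduced in this excerpt; below we assume (23) is p̂_c ± 1.96 √(n p̂_c(1 − p̂_c))/(n + 2c) and (24) is the wider p̂_c ± 1.96 √(p̂_c(1 − p̂_c)/n), consistent with the factor n/(n + 2c) ≤ 1 mentioned above. Grid and sample size are our choices.

```python
from math import comb, sqrt

Z = 1.96

def int23(x, n, c):
    """Assumed interval (23): p_hat_c +/- Z sqrt(n p_hat_c (1 - p_hat_c)) / (n + 2c)."""
    ph = (x + c) / (n + 2 * c)
    h = Z * sqrt(n * ph * (1 - ph)) / (n + 2 * c)
    return ph - h, ph + h

def int24(x, n, c):
    """Assumed interval (24): the wider p_hat_c +/- Z sqrt(p_hat_c (1 - p_hat_c) / n)."""
    ph = (x + c) / (n + 2 * c)
    h = Z * sqrt(ph * (1 - ph) / n)
    return ph - h, ph + h

def coverage(n, c, interval, grid=999):
    """Mean and minimum exact coverage probability over a grid of p values."""
    covs = []
    for i in range(1, grid + 1):
        p = i / (grid + 1)
        covs.append(sum(
            comb(n, x) * p**x * (1 - p) ** (n - x)
            for x in range(n + 1)
            if interval(x, n, c)[0] <= p <= interval(x, n, c)[1]
        ))
    return sum(covs) / grid, min(covs)

print(coverage(10, 1.0, int23))
print(coverage(10, 1.0, int24))  # wider interval, hence coverage at least as large
```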
6 Discussion
During the reviewing process it was pointed out to the authors by a referee that, besides the criteria already mentioned in Section 5, such as average coverage probability, minimum coverage probability and variation of coverage probabilities over the parameter space, there is another useful criterion for a confidence interval to possess: it should avoid generating meaningless limits below 0 or above 1. Truncation does not help much: as the referee pointed out, if X = 3 and n = 4, then 0 and 1 are impossible values for p. We agree with this point of view.
It is one of the salutary properties of the Wilson score interval (like the Clopper–
Pearson) that its limits are always in (0, 1) if 0 < X < n. This can be shown by a direct
argument or seen from the way the Wilson score interval is constructed as illustrated in
Figure 5.
In Table 8 confidence intervals are given for the Leukaemia data of Table 2 using the two methods mentioned in Section 5, namely confidence interval construction according to Equation (23) and Equation (24). For the second method, limits occur outside the interval [0, 1] in 15 of the 21 centers, whereas for the first method no aberrations occur. In fact, a direct argument shows that for small sample size n this is necessarily the case. For n < 5 and c = 1 the following result and useful property holds: if 0 < X < n, then

p̂_c − 1.96 √(n p̂_c(1 − p̂_c))/(n + 2c) > 0  and  p̂_c + 1.96 √(n p̂_c(1 − p̂_c))/(n + 2c) < 1.

This property, together with its closeness to the Wilson score interval, favours the interval (23) in comparison to Equation (24).

Table 7 Coverage probability of interval (24)
Another connection is interesting to note. The midpoint of the Wilson score interval is (X + 1.96²/2)/(n + 1.96²). Replacing 1.96 by 2, we obtain p̂_c for c = 2. Note also that any (X + c)/(n + 2c) can be written as a weighted average (X/n)(n/(n + 2c)) + (1/2)(2c/(n + 2c)) of X/n and 1/2. The larger the value of c, the more weight is put on the midpoint 1/2. Therefore, we can view (X + 1)/(n + 2) as a midway compromise between the midpoint of the Wilson score interval and the conventional estimator (see also Figure 6 for illustration).
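The weighted-average representation is a one-line identity; a minimal check with arbitrary counts:

```python
def p_hat_c(x, n, c):
    return (x + c) / (n + 2 * c)

x, n, c = 3, 7, 1.0
w = n / (n + 2 * c)                   # weight on the conventional estimate X/n
blend = w * (x / n) + (1 - w) * 0.5   # shrinkage of X/n towards the midpoint 1/2
print(abs(blend - p_hat_c(x, n, c)))  # ~0: the two expressions coincide
```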
Finally, we would like to point out once more that our major objective was to have a look at potential improvements of proportion estimators themselves. As a by-product, we find that the improved proportion estimator and its associated variance estimator also lead to an improved approximate confidence interval method. We are aware of the fact that this is not necessarily the best approximate method, but it is a simple method with reasonable properties. Agresti and Coull26 (p. 122) comment: 'The poor performance of the Wald interval is unfortunate, since it is the simplest approach to present in elementary statistics courses. We strongly recommend that instructors present the score interval instead'.
Table 8 Proportion data of Table 2 with 95% confidence intervals based upon (23) and (24)
Santner28 makes the same recommendation: 'Of course, many instructors will hesitate to present a formula such as (2) (the score interval (22), the authors) in elementary courses'. And further, Agresti and Caffo1 (p. 281) write: 'Many students in non-calculus-based courses are mystified by quadratic equations (which are needed to solve for the score interval) . . . In such courses, it is often easier to show how to adapt a simple method so that it works well rather than to present a more complex method'. We agree with this point of view and hope that our paper contributes to this perspective.
Acknowledgements
The authors would like to express their gratitude to an unknown referee for many
constructive comments as well as to the Editor, Prof. Dr Brian Everitt. Research of
D. Böhning is supported by the German Research Foundation (DFG). Research of
C. Viwatwongkasem is supported by the National Research Council of Thailand (NRCT).
References

1 Agresti A, Caffo B. Simple and effective confidence intervals for proportions and differences of proportions result from adding two successes and two failures. American Statistician 2000; 54: 280–88.
2 Whitehead A, Whitehead J. A general parametric approach to the meta-analysis of randomized clinical trials. Statistics in Medicine 1991; 10: 1665–77.
3 Lipsitz SR, Dear KBG, Laird NM, Molenberghs G. Tests for homogeneity of the risk difference when data are sparse. Biometrics 1998; 54: 148–60.
4 Cooper MR, Dear KBG, McIntyre OR, Ozer H, Ellerton J, Canellos G, Duggan B, Schiffer C. A randomized clinical trial comparing Melphalan/Prednisone with and without alpha-2b interferon in newly-diagnosed patients with multiple myeloma: a cancer and leukemia group B study. Journal of Clinical Oncology 1993; 11: 155–60.
5 Tukey JW. Exploratory Data Analysis. Reading: Addison-Wesley, 1977; Chapter 5.
6 Sánchez-Meca J, Marín-Martínez F. Testing significance of a common risk difference in meta-analysis. Computational Statistics & Data Analysis 2000; 33: 299–313.
7 Sankey SS, Weissfeld LA, Fine MJ, Kapoor W. An assessment of the use of the continuity correction for sparse data in meta-analysis. Communications in Statistics – Simulation and Computation 1996; 25: 1031–56.
8 McClave JT, Sincich T. Statistics, eighth edition. Englewood Cliffs: Prentice Hall, 2000.
9 Carlin BP, Louis TA. Bayes and Empirical Bayes Methods for Data Analysis. London: Chapman & Hall, 2000.
10 Böhning D. Confidence interval estimation of a rate and the choice of sample size. Statistics in Medicine 1988; 7: 865–75.
11 Böhning D. Better approximate confidence intervals for a binomial parameter. Canadian Journal of Statistics 1994; 22: 207–18.
12 Brown LD, Cai T, DasGupta A. Confidence intervals for a binomial proportion and asymptotic expansions. Annals of Statistics 2002; 30: 160–201.
13 Woodward M. Epidemiology. Study Design and Data Analysis. Boca Raton: Chapman & Hall/CRC, 1999.
14 Hook EB, Regal R. Capture–recapture methods in epidemiology: methods and limitations. Epidemiologic Reviews 1995; 17: 243–64.
15 International Working Group for Disease Monitoring and Forecasting. Capture–recapture and multiple record systems estimation I: history and theoretical development. American Journal of Epidemiology 1995; 142: 1047–58.
16 International Working Group for Disease Monitoring and Forecasting. Capture–recapture and multiple record systems estimation II: Applications in human diseases. American Journal of Epidemiology 1995; 142: 1059–68.
17 Chao A. An overview of closed capture–recapture models. Journal of Agricultural, Biological, and Environmental Statistics 2001; 6: 158–75.
18 Chao A. Capture–recapture. In Armitage P, Colton T eds. Encyclopedia of Biostatistics. Volume 1. New York: Wiley, 1998: 482–86.
19 International Mathematical and Statistical Library (IMSL). IMSL STAT/LIBRARY User's Manual, Version 2.0. Houston: IMSL, 1991.
20 Leemis LM, Trivedi KS. A comparison of approximate interval estimators for the Bernoulli parameter. The American Statistician 1996; 50: 63–68.
21 Ghosh BK. A comparison of some approximate confidence intervals for the binomial parameter. Journal of the American Statistical Association 1979; 74: 894–900.
22 Newcombe RG. Two-sided confidence intervals for the single proportion: comparison of seven methods. Statistics in Medicine 1998; 17: 857–72.
23 Vollset SE. Confidence intervals for a binomial proportion. Statistics in Medicine 1993; 12: 809–24.
24 Santner TJ, Duffy DE. The Statistical Analysis of Discrete Data. Berlin: Springer, 1989.
25 Clopper CJ, Pearson ES. The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika 1934; 26: 404–13.
26 Agresti A, Coull BA. Approximate is better than 'exact' for interval estimation of binomial proportions. The American Statistician 1998; 52: 119–26.
27 Wilson EB. Probable inference, the law of succession, and statistical inference. Journal of the American Statistical Association 1927; 22: 209–12.
28 Santner TJ. A note on teaching binomial confidence intervals. Teaching Statistics 1998; 20: 20–23.
It was shown in Section 4 (Theorem 4.1) that an estimator for θ having smallest average bias is provided by θ̂_0.5. This might not be very satisfying after all, because for a particular value of p, some other value of c might be optimal. The bias of θ̂_c has been provided with the help of Equation (7) as

(1/(n + 2c)²) [(np + c)(n + 2c) − np(1 − p) − n²p² − 2cnp − c² − p(1 − p)(n + 2c)²]    (A1)

which simplifies to

[ap² − ap + c(n + c)]/(n + 2c)²

with a = n + 4nc + 4c². This quadratic is minimized for p = 1/2, for which Equation (A1) attains the value −(n/4)/(n + 2c)². This value becomes closer to zero with increasing value of c. So, from this perspective, it might be good to choose a larger value of c. On the other hand, this might not be wise, as the larger the value of c, the smaller the interval in which the bias of θ̂_c is smaller than the bias of (X/n)(1 − X/n) (Figure 2). Theorem A.1 provides such an interval.
Theorem A.1. The roots of ap² − ap + c(n + c), with a = n + 4nc + 4c², are provided by

p_{1,2} = 1/2 ± √(1/4 − (n + c)c/(4(n + c)c + n))    (A2)
Note that the lengths of the intervals (A2) and (A3) become smaller with increasing value of c. Therefore, to guarantee a better pointwise bias, it appears reasonable to choose a smaller value of c, for example, in the interval [0.5, 1]. Equation (A2) provides an interval with endpoints corresponding to values of p for which θ̂_c is unbiased, and with smaller bias than θ̂_0 inside the interval. The interval in which θ̂_c has smaller bias than θ̂_0 can still be increased (Figure 2). The largest interval with this property is defined by p such that bias(θ̂_c) = −bias(θ̂_0) [= p(1 − p)/n], leading to
Theorem A.2. The roots of bias(θ̂_c) = [ap² − ap + c(n + c)]/(n + 2c)² = p(1 − p)/n = −bias(θ̂_0), with a = n + 4nc + 4c², are provided by

p_{1,2} = 1/2 ± √(1/4 − (n + c)c/(4c(n + 1) + 4c²(1 + 1/n) + 2n))    (A4)