To cite this article: Chia-Hao Chang, Jen-Tsung Yang & Ming-Hsueh Lee (2014): A Novel Maximizing
Kappa Approach for Assessing the Ability of a Diagnostic Marker and its Optimal Cutoff Value, Journal
of Biopharmaceutical Statistics, DOI: 10.1080/10543406.2014.920347
Downloaded by [New York University] at 01:51 04 July 2015
Journal of Biopharmaceutical Statistics, 00: 1–15, 2015
Copyright Chang Gung University of Science and Technology
ISSN: 1054-3406 print/1520-5711 online
DOI: 10.1080/10543406.2014.920347
Key Words: Diagnostic testing cutoff values; Maximize kappa; Theory of extremes; Volume under the ROC
hyper-surface.
1. INTRODUCTION
The identification of cutoff values for the key biomarkers associated with the risk of
developing clinical diseases is among the most useful clinical tools in preventive
epidemiology. However, power performance is crucial for statistical testing procedures.
This article thus contributes to the body of knowledge on this topic by proposing a new
approach for addressing diagnostic problems of this kind.
The most common method for testing the quality of biomarkers is the receiver
operating characteristic (ROC) curve (Egan, 1975; Green and Swets, 1996; Pepe, 2003).
In simple terms, the ROC curve plots the fraction of positive results among the truly
diseased (the true positive rate, i.e., sensitivity) against the fraction of positive
results among the truly non-diseased (the false positive rate, i.e., 1 − specificity) at
various cutoff values (c1). The area under the ROC curve (AUC) has been
extensively used for testing the quality of biomarkers in the literature. The hypotheses
can be formulated as follows:
Received October 15, 2013; Accepted April 23, 2014
Address correspondence to Chia-Hao Chang, Department of Nursing, Chang Gung University of Science
and Technology, Chiayi Campus, Chiayi 61363, Taiwan; E-mail: howellchang@gmail.com
ROC surfaces can be defined to identify new biomarkers in ordered three-class case
classification problems (Mossman, 1999; Dreiseitl et al., 2000; Heckerling, 2001; Nakas
and Yiannoutsos, 2004). In such problems, two decision cutoff values, namely c1 and c2,
are needed to classify biomarkers into three groups.
To allow us to generalize, we now consider classifying these groups into ordered
k-class cases (k > 3) based on a single biomarker. By assuming that $F_1(\cdot), F_2(\cdot), \ldots, F_k(\cdot)$ are k
overlapping continuous distributions, we can define a decision rule as in the three-class
case presented above. $k - 1$ decision cutoff values are needed to classify the groups into k
classes, namely $c_1, c_2, \ldots, c_{k-1}$. Therefore, the following hypotheses may be applied:
$$H_0: F_1 = F_2 = \cdots = F_k$$
$$H_1: F_1 \ge F_2 \ge \cdots \ge F_k \tag{1}$$
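$$\mathrm{VUS} = \frac{1}{n_1 \cdots n_k} \sum_{j_1=1}^{n_1} \cdots \sum_{j_k=1}^{n_k} I(x_{1j_1} \le x_{2j_2} \le \cdots \le x_{kj_k}) \tag{2}$$
where $I(x_{1j_1} \le x_{2j_2} \le \cdots \le x_{kj_k}) = 1$ provided there exists at least one strict inequality;
otherwise, $I(x_{1j_1} \le x_{2j_2} \le \cdots \le x_{kj_k})$ is equal to zero. The hypotheses to be tested for the
quality of the biomarkers are thus $H_0: \mathrm{VUS} = 1/k!$ vs. $H_1: \mathrm{VUS} > 1/k!$.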
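As an illustration of the VUS estimator in Equation (2), the following is a minimal sketch that enumerates every k-tuple with one observation per class. The function name `vus_estimate` and the list-of-lists input layout are assumptions made for this example, not part of the original article; the enumeration costs $n_1 \cdots n_k$ evaluations and is only practical for small samples.

```python
from itertools import product

def vus_estimate(samples):
    """Estimate the VUS from k ordered samples (hypothetical helper).

    samples: list of k lists of biomarker values, ordered from class 1 to class k.
    A tuple counts when x1 <= x2 <= ... <= xk with at least one strict inequality.
    """
    hits = 0
    total = 0
    for combo in product(*samples):
        pairs = list(zip(combo, combo[1:]))
        nondecreasing = all(a <= b for a, b in pairs)
        has_strict = any(a < b for a, b in pairs)
        if nondecreasing and has_strict:
            hits += 1
        total += 1
    return hits / total
```

For perfectly separated classes, such as `vus_estimate([[1, 2], [3, 4], [5, 6]])`, every tuple is ordered and the estimate equals 1.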
prevalence than to positive predictive value, sensitivity, and specificity (Manel et al.,
2001).
This article studies the test of homogeneity using such a test statistic, which is
maximal over all possible values of the threshold. It often happens in epidemiological
studies that values of a continuous variable are divided into two groups by comparing
the measured values with a threshold (Miller and Siegmund, 1982). Maximally selected $\chi^2$
statistics in $k \times 2$ contingency tables are investigated in Betensky and Rabinowitz (1999).
The $\kappa$ statistic has been used extensively in map accuracy work (Congalton, 1991) and in
presence–absence mapping in applied ecology wherein a threshold is selected to
maximize kappa (Guisan and Hofer, 2003; Hirzel et al., 2006; Moisen et al., 2006). The
proposed $\kappa$ coefficient for $k \times k$ tables further calculates the agreement between the
biomarker and true disease status.
$$\kappa = \frac{\pi_0 - \pi_e}{1 - \pi_e}$$
where:
$\pi_0$: the probability of an observation being classified in the same class as the true disease
status i and the new biomarker j.
$\pi_e$: if the true disease status and the new biomarker are independent, then the probability
of agreement is $\pi_e = \pi_{1\cdot}\pi_{\cdot 1} + \pi_{2\cdot}\pi_{\cdot 2}$, where $\pi_{1\cdot}$ and $\pi_{\cdot 1}$ are the marginal classification
probabilities in row 1 and column 1, respectively.
The hypothesis to be tested for agreement between the true disease status and the new
biomarker beyond mere chance can be formulated as
$$H_0: \kappa = 0 \quad \text{vs.} \quad H_1: \kappa > 0$$
In summary, the $\kappa$ coefficient is used to measure the proportion of agreement for
this two-way table. For example, if the value of $\kappa$ indicates perfect agreement, then x is a
perfect biomarker associated with the risk of the true disease status, as shown in Table 1.
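To make the agreement measure concrete, the sketch below computes the kappa coefficient from a square table of counts (rows: true disease status; columns: biomarker class). The function name `cohen_kappa` and the convention of returning 0 when chance agreement equals 1 are assumptions of this example rather than definitions taken from the article.

```python
def cohen_kappa(table):
    """Kappa for a k x k table of counts (hypothetical helper).

    table[i][j] = number of subjects with true class i and biomarker class j.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    # Observed agreement: proportion of counts on the diagonal.
    p0 = sum(table[i][i] for i in range(k)) / n
    # Marginal proportions for rows and columns.
    row = [sum(table[i]) / n for i in range(k)]
    col = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    # Agreement expected under independence.
    pe = sum(r * c for r, c in zip(row, col))
    if pe == 1.0:
        return 0.0  # assumed convention for the degenerate single-class table
    return (p0 - pe) / (1.0 - pe)
```

A table with all counts on the diagonal, for example `cohen_kappa([[10, 0], [0, 10]])`, gives perfect agreement, while a table whose observed agreement equals the chance agreement gives 0.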
Table 1 Results of simulations to study size and power for sample size arrangements for k = 3
D.F. (10, 2)
Sample sizes (10, 10, 10) (15, 5, 5) (20, 5, 5)
Note. Abbreviations: D.F., degrees of freedom; VUSs, the volume under the ROC hyper-surface.
a Based on maximum likelihood estimation.
where c0 is infinity.
In practice, the $k - 1$ ordered decision cutoff values vary among the possible
biomarker values ($= C_{k-1}^{N-1}$ combinations, $N = n_1 + n_2 + \cdots + n_k$). Thus, the proposed statistic is defined
by ascertaining the optimal cutoff value (i.e., the one that provides the maximum $\kappa$
coefficient) for assigning probabilistic predictions into the k classes of the data set. The
maximum $\kappa$ coefficient determines the extreme values rather than the average values.
Statistics for extreme values have been proposed and applied in different fields, such as
for analyzing high sea levels, wind speeds, air pollutant concentrations, and price changes
in share markets (Gumbel, 1958; Coles, 2001; de Haan and Ferreira, 2006). Furthermore,
the maximum $\kappa$ coefficient is equivalent to the optimal cutoff value for calculating the
$\kappa$ coefficient as follows (Guisan et al., 1999):
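$$\mathrm{MaxK} = \max\left\{\kappa_1, \kappa_2, \ldots, \kappa_{C_{k-1}^{N-1}}\right\} \tag{3}$$
where $\left\{\kappa_1, \kappa_2, \ldots, \kappa_{C_{k-1}^{N-1}}\right\}$ is a sequence of random variables that have a common
distribution function G, while $\kappa_m = (\pi_0^{(m)} - \pi_e^{(m)})/(1 - \pi_e^{(m)})$, $m = 1, 2, \ldots, C_{k-1}^{N-1}$, represent
the $\kappa$ coefficients under various possible cutoff values. Similarly, $\pi_0^{(m)}$ is the probability of
an observation being classified in the same class as true disease status i and biomarker j
for the mth cutoff value. If true disease status and the biomarker are independent, then the
probability of agreement is $\pi_e^{(m)} = \sum_{l=1}^{k} \pi_{l\cdot}^{(m)} \pi_{\cdot l}^{(m)}$, where $\pi_{l\cdot}^{(m)}$ and $\pi_{\cdot l}^{(m)}$ are the marginal
classification probabilities in row l and column l, respectively.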
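The search behind Equation (3) can be sketched as an exhaustive enumeration. This example assumes that candidate cutoffs are the midpoints between consecutive sorted biomarker values (giving N − 1 candidates and hence $C_{k-1}^{N-1}$ cutoff sets), that class labels are coded 0, ..., k − 1, and that larger biomarker values correspond to higher classes. The names `kappa_from_labels` and `max_kappa` are hypothetical, and the article does not specify its implementation.

```python
from itertools import combinations

def kappa_from_labels(true, pred, k):
    """Kappa between two label vectors coded 0..k-1 (hypothetical helper)."""
    n = len(true)
    table = [[0] * k for _ in range(k)]
    for t, p in zip(true, pred):
        table[t][p] += 1
    p0 = sum(table[i][i] for i in range(k)) / n
    pe = sum((sum(table[i]) / n) * (sum(r[i] for r in table) / n) for i in range(k))
    return 0.0 if pe == 1.0 else (p0 - pe) / (1.0 - pe)

def max_kappa(values, labels, k):
    """Return (MaxK, cutoffs) by enumerating all sets of k-1 ordered cutoffs."""
    xs = sorted(values)
    # N - 1 candidate cutoffs: midpoints between consecutive sorted values.
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    best, best_cuts = float("-inf"), None
    for cuts in combinations(candidates, k - 1):
        # Predicted class = number of cutoffs lying below the value.
        pred = [sum(x > c for c in cuts) for x in values]
        kap = kappa_from_labels(labels, pred, k)
        if kap > best:
            best, best_cuts = kap, cuts
    return best, best_cuts
```

With N = 31 observations, this enumeration visits $C_2^{30} = 435$ cutoff pairs for k = 3 and $C_4^{30} = 27405$ cutoff sets for k = 5, which illustrates how quickly the search grows with k. Tied biomarker values produce duplicate candidates; a real implementation would probably deduplicate them.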
Given that the number of possible permutations is equal to $C_{k-1}^{N-1}$, this number will be
quite large even for small sample sizes. Thus, in practice, only M random permutations are
generated in order to derive the reference distribution. The hypotheses for the agreement
between true disease status and the biomarker beyond mere chance can thus be formulated as
$$H_0: \mathrm{MaxK} = 0$$
$$H_1: \mathrm{MaxK} > 0$$
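The idea of building a reference distribution from M random permutations can be sketched generically as follows. This example is an assumption-laden simplification: it shuffles the class labels, recomputes a user-supplied statistic, and reports an empirical add-one p-value, whereas the article fits a GEV distribution to the permuted values. The name `permutation_p_value` and its parameters are hypothetical.

```python
import random

def permutation_p_value(values, labels, statistic, n_perm=1000, seed=0):
    """Empirical one-sided p-value from n_perm random label permutations.

    statistic: any callable statistic(values, labels) -> float, with larger
    values indicating stronger evidence against the null hypothesis.
    """
    rng = random.Random(seed)      # fixed seed so the sketch is reproducible
    observed = statistic(values, labels)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)      # break any link between values and labels
        if statistic(values, shuffled) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_perm + 1)
```

Any statistic with the signature `statistic(values, labels)` can be plugged in, for instance a function that returns the maximum kappa over all cutoff sets.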
The main difference among these three limiting types is the behavior of the upper tail. The
shape parameter $\xi$ reflects the weight of the tail of the distribution. The maximum tail
value is finite for the Weibull distribution and infinite for the Fréchet and Gumbel
distributions. Distributions that have upper tails that decay exponentially produce the
Gumbel distribution of the maximum, whereas those that decay polynomially produce the
Fréchet distribution of the maximum. Further, Jenkinson applied Gnedenko's results to
propose a generalized formula that combines the three types of extreme value distributions
into a single distribution, called the generalized extreme value distribution (GEV)
(Gnedenko, 1943; Jenkinson, 1955):
$$G(z) = \exp\left\{-\left[1 + \xi\left(\frac{z - \mu}{\sigma}\right)\right]^{-1/\xi}\right\}$$
where $\xi = 0$ for Gumbel (interpreted as the limit $\xi \to 0$), $\xi > 0$ for Fréchet, and $\xi < 0$ for Weibull. $\xi$ is the shape parameter
and $\mu$ and $\sigma$ are location and scale parameters, which are different to $\{a_n\}$ and $\{b_n\}$
defined above.
By following the extremes of dependent sequences theorem (Coles, 2001) and the
generalized formula by Jenkinson (1955), the following theorem can be deduced.
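$$\bigl|\Pr\{\kappa_{i_1} \le u_n, \ldots, \kappa_{i_p} \le u_n, \kappa_{j_1} \le u_n, \ldots, \kappa_{j_q} \le u_n\} - \Pr\{\kappa_{i_1} \le u_n, \ldots, \kappa_{i_p} \le u_n\}\,\Pr\{\kappa_{j_1} \le u_n, \ldots, \kappa_{j_q} \le u_n\}\bigr| \le \alpha(n, l),$$
where $n = C_{k-1}^{N-1}$ and $\alpha(n, l_n) \to 0$ for some sequence $l_n$ such that $l_n/n \to 0$ as
$n \to \infty$. Then
$$\Pr\{(\mathrm{MaxK} - a_n)/b_n \le z\} \xrightarrow{D} G(z) \quad \text{as } n \to \infty,$$
$$G(z) = \exp\left\{-\left[1 + \xi\left(\frac{z - \mu}{\sigma}\right)\right]^{-1/\xi}\right\}$$
for $1 + \xi(z - \mu)/\sigma > 0$, with parameters $-\infty < \mu < \infty$, $\sigma > 0$, and $-\infty < \xi < \infty$, and the
$D(u_n)$ condition is satisfied with $u_n = b_n z + a_n$ for every real $z$.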
In practice, we do not have problems with the normalizing constants. The family of
extreme value distributions may be fitted directly to a series of observations of MaxK.
That is,
$$\Pr\{\mathrm{MaxK} \le z\} \xrightarrow{D} G\{(z - a_n)/b_n\} = G^{*}(z) \quad \text{as } n \to \infty, \text{ where } n = C_{k-1}^{N-1}.$$
Table 2 Results of simulations to study size and power for sample size arrangements for k = 4

Sample sizes        MaxK: (μ, σ, ξ)a           VUSs: simulated critical value
(10, 10, 10, 10)    (0.161, 0.062, 0.217)      1.885
(5, 5, 5, 10)       (0.189, 0.081, 0.168)      1.944
(5, 5, 5, 15)       (0.17, 0.073, 0.147)       1.924
(5, 5, 5, 20)       (0.156, 0.069, 0.15)       1.803

[Power entries by location parameters (MaxK vs. VUSs) and by D.F., including D.F. (10, 2), were not recovered.]
Note. Abbreviations: D.F., degrees of freedom; VUSs, the volume under the ROC hyper-surface.
a Based on maximum likelihood estimation.
degrees of freedom, and i is the location parameter. We examined balanced and
unbalanced designs with sample sizes of 5 and 20, respectively. We simulated data under the
above-mentioned conditions using the software R 2.15.0 (R Development Core Team, 2012)
and fit a GEV distribution to them using the VGAM package in order to obtain maximum
likelihood estimates for the limiting distribution of MaxK.
This simulation was replicated 1000 times at α = 0.05 using M = 1000 random
permutations. The VUS is clearly shown to be a discrete random variable (Terpstra and
Magel, 2003). Consequently, it may not attain the nominal Type I error level exactly
under H0. Thus, we used simulated critical values for the VUS (denoted VUSs). Similarly,
Monte Carlo approximations at α = 0.05 and 1000 replications were conducted, and the
qgev quantile function from the evd package was used. Ideally, biomarkers that have
better quality should have higher power.
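For readers who want to see what a GEV quantile function computes, the following is a minimal pure-Python sketch of the GEV distribution function and its inverse, using the standard (μ, σ, ξ) parameterization. It is not the evd or VGAM implementation; the function names `gev_cdf` and `gev_quantile` and the parameter values in the usage note are assumptions for illustration.

```python
import math

def gev_cdf(z, mu, sigma, xi):
    """GEV distribution function G(z) with location mu, scale sigma, shape xi."""
    if xi == 0.0:
        # Gumbel case, taken as the limit xi -> 0.
        return math.exp(-math.exp(-(z - mu) / sigma))
    t = 1.0 + xi * (z - mu) / sigma
    if t <= 0.0:
        # Outside the support: below the lower bound (xi > 0)
        # or above the upper bound (xi < 0).
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))

def gev_quantile(p, mu, sigma, xi):
    """Inverse of gev_cdf for 0 < p < 1, e.g. p = 0.95 for a 5% critical value."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    y = -math.log(p)
    if xi == 0.0:
        return mu - sigma * math.log(y)
    return mu + sigma * (y ** (-xi) - 1.0) / xi
```

Under these assumptions, a critical value at level α = 0.05 would be obtained as `gev_quantile(0.95, mu, sigma, xi)` with the fitted parameters, and an upper-tail p-value for an observed statistic as `1 - gev_cdf(observed, mu, sigma, xi)`.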
The simulated powers of VUSs and MaxK under the null hypothesis (i.e., the empirical
sizes) are close to the nominal level. Under k = 3, VUS outperforms MaxK when DF = (2, 10)
or when the sample sizes are (10, 10, 10). MaxK performs better only for
the case of unbalanced sample sizes with DF = (4.5, 4.5) or DF = (10, 2). For the case of
unbalanced sample sizes (15, 5, 5) or (20, 5, 5) with DF = (4.5, 4.5) or DF = (10, 2) in
Table 1, the gain percentage in power, DP = (MaxK − VUSs)/VUSs, ranges from 0.65% to
23.68% with the average gain percentage in power being 11.03% (difference of
percentage).

Table 3 Results of simulations to study size and power for sample size arrangements for k = 5

MaxK: (μ, σ, ξ)a           VUSs: simulated critical value
(0.138, 0.046, 0.171)      1.77
(0.18, 0.099, 0.155)       1.813
(0.152, 0.046, 0.105)      2.036

[Sample size labels and power entries, including those for D.F. (10, 2), were not recovered.]
Note. Abbreviations: D.F., degrees of freedom; VUSs, the volume under the ROC hyper-surface.
a Based on maximum likelihood estimation.
For k = 4, the simulated power of MaxK is higher than that of VUSs when the
sample sizes that correspond to smaller spaces in adjacent location parameters are
comparatively small. The DP ranges from 0% to 66.92% with the average gain percentage
in power being 28.22%. Note that VUS also performs comparably, especially
for the case of location parameters (0.75, 0.5, 0.25, 0) (see Table 2).
For k = 5, the simulated power of MaxK is higher than that of VUSs when the
sample sizes that correspond to smaller spaces in adjacent location parameters are
comparatively small. The DP ranges from 13.32% to 82.22% with the average gain
percentage in power being 36.27%. Note that VUS performs better when the sample
sizes are (10, 10, 10, 10, 10) (see Table 3).
The performances of MaxK and VUS strongly depend on the underlying simulation
settings. Since there is no theoretical comparison of these two methods, readers can
judge which method should be applied in which situations from the results in Tables 1–3.
Tables 1–3 represent only a small subset of the many different scenarios that we simulated.
For example, we also conducted simulations for numerous other alternative patterns.
Interested readers may contact the corresponding author for these simulated results.
4. DATA EXAMPLES
Hemoglobin (HGB) tetramers are the major oxygen-carrying molecules in the
blood. Kimberly et al. proposed that lower HGB values are associated with larger acute infarcts
and with an increased degree of infarct growth during acute ischemic stroke (Kimberly
et al., 2011). While the National Institutes of Health Stroke Scale (NIHSS) is an attractive
candidate predictor for disposition because it is widely used, there has been reluctance to
adopt it within clinical settings, because some users consider scale completion to be too
time consuming compared with standard neurological assessments (Lai et al.,
1998). In addition, although many components of the NIHSS are part of standard
neurological assessments, training is required for the reliable use of the tool (André,
2002).
Previous randomized controlled trials have demonstrated that the administration of
recombinant tissue plasminogen activator (rt-PA) treatment for acute ischemic stroke
(within three hours of symptom onset) improves functional outcomes without increasing
severe disability and mortality. However, intracerebral hemorrhage remains the most
feared side effect of rt-PA (Wardlaw et al., 2003; Derex and Nighoghossian, 2008).
Indeed, studies have shown that the initial stroke severity assessed by the NIHSS score
is an independent marker for subsequent intracerebral hemorrhage (Tanne et al., 2002).
In December 2003, the Bureau of National Health Insurance in Taiwan implemented
a payment rule for the rt-PA treatment of acute ischemic stroke, with a clinical disease
severity of NIHSS > 25 the minimum requirement for treatment. In addition, it adopted
the advice of the Taiwan Stroke Association and excluded slight strokes (NIHSS < 6)
from the payment rule. Based on these payment criteria, NIHSS ≥ 6 and NIHSS ≤ 25 have
become operational criteria.
From November 2010 to October 2011, 31 patients who had suffered from
ischemic stroke were recruited at the Chang Gung Memorial Hospital, Taiwan for
the study. The inclusion criteria were as follows: (i) first ever stroke, (ii) obvious
weakness of affected limbs (muscle power < 3), (iii) supratentorial hematoma, and (iv)
Glasgow Coma Scale > 6. Blood samples were collected from all study participants.
The median HGB values for NIHSS < 6 (n = 21), NIHSS 6–25 (n = 7), and NIHSS > 25 (n =
3) were 15.2, 12.9, and 12.8, respectively. The box plot in Fig. 1 shows a
non-increasing trend in the medians among these three classes. From Equations (2) and
(3), the VUS was 0.37 (p = 0.025) and MaxK was 0.56 (p < 0.003). We conclude that
the probability of a correct NIHSS classification based on HGB value is higher than
that expected from chance alone.
We further classify the patients into one of five stroke classes with the NIHSS
representing true disease status. Of the 31 patients, 16 were classified by the NIHSS as no
stroke (NIHSS = 0), four as minor stroke (NIHSS = 1–4), three as moderate stroke
(NIHSS = 5–15), five as moderate/severe stroke (NIHSS = 16–20), and three as severe
stroke (NIHSS = 21–42). The investigators then quantified the ability of HGB values to
classify patients correctly into these five NIHSS classes. Smaller HGB values were
hypothesized to be more indicative of stroke, and HGB values were expected to be greater
for no-stroke patients. The median HGB values for the five classes were 15.6, 14.4, 13.1, 12.8,
and 12.7, respectively. The box plot in Fig. 2 also shows a non-increasing trend in the
medians among these five classes. The VUS and MaxK were 0.03 (standard deviation =
0.013) and 0.52 (μ = 0.1582, σ = 0.0658, and ξ = 0.1561, which are based on
maximum likelihood estimation), as indicated in Table 4. These two estimates were
contrasted with those at the uninformative levels VUS = 1/120 and MaxK = 0; the
associated p-values were p = 0.061 and p < 0.001, respectively. We conclude that the
only significant test is MaxK. Since a plot of this data set still displays a non-increasing
trend, it seems that the VUS has not detected the trend, which indicates a deficiency of the
VUS.
5. DISCUSSION
The present study proposed a new method for assessing the performance of
diagnostic tests in clinical settings. This approach, based on a statistic called maximum
kappa, relies on the measurement agreement between true disease status and the new
biomarker; specifically, the kernel approach applies the κ coefficient, except when ranking
the value of the biomarker into k classes. A high κ coefficient indicates stronger
agreement with the response at a specific cutoff value.
The κ coefficient has been used extensively in map accuracy work (Congalton,
1991) and in models that predict the spatial distribution of species (Boyce et al., 1999;
Guisan and Zimmerman, 2000; Manly et al., 2002; Pearce and Boyce, 2006), in which a
threshold can be selected to maximize κ (Guisan et al., 1998; Guisan and Hofer, 2003;
Hirzel et al., 2006; Moisen et al., 2006). Therefore, the maximum κ coefficient should be
determined based on various cutoff values among possible biomarker values. The cutoff
values that result in the maximum κ value indicate the optimal agreement between the
new biomarker and true disease status.

Table 4

Statistic    Estimate    SD or (μ, σ, ξ)a             p-value
k = 3
VUS          0.37        0.105                        0.025
MaxK         0.56        (0.1711, 0.0801, 0.0683)     0.003
k = 5
VUS          0.03        0.013                        0.061
MaxK         0.52        (0.1582, 0.0658, 0.1561)     <0.001

Note. Abbreviations: SD, standard deviation; VUS, the volume under the ROC hyper-surface.
a Based on maximum likelihood estimation.
After defining MaxK and explaining how the asymptotic GEV distribution can be
reached (Gnedenko, 1943; Guisan et al., 1999), a finite sample simulation study was used
to investigate the powers of MaxK, the VUS, and the VUSs under various underlying
populations, location parameter configurations, and sample size arrangements. The VUS
was nonparametrically estimated and its variance calculated using the U-statistics
methodology in the simulation.
The statistics were clearly shown to be discrete random variables and were unable to
attain the nominal Type I error level exactly under H0; thus, we used simulated critical
values for the VUS. From this analysis, we found that the simulated powers of VUSs and
MaxK were close to the nominal levels under the null hypothesis. Further, the simulated
power of MaxK was higher than that of VUSs for the investigated location parameters
when the sample sizes that correspond to smaller spaces in adjacent location parameters
are comparatively small. The practical case shows this deficiency of the VUS in these
specific circumstances. However, the proposed approach still suffers from two major
limitations. First, enumerating the possible cutoff values is computationally intensive
even for relatively small samples. Second, the two-class κ coefficient is discrete.
Combining certain k classes, for example when two classes are relatively hard to
distinguish, and then calculating the kappa value of the collapsed (k − 1) × (k − 1)
agreement table may be desirable. This strategy increases the value of κ. Following
the theorem proposed by Schouten (1986), we can deduce that the power of MaxK can
REFERENCES
André, C. (2002). The NIH stroke scale is unreliable in untrained hands. Journal of Stroke and
Cerebrovascular Diseases 11:43–46.
Betensky, R. A., Rabinowitz, D. (1999). Maximally selected chi square statistics for k × 2 tables.
Biometrics 55:317–320.
Boyce, M. S., McDonald, L. L., Manly, B. F. J. (1999). Reply to Mysterud and Ims. Trends in
Ecology & Evolution 14:490.
Brennan, R. L., Prediger, D. J. (1981). Coefficient kappa: Some uses, misuses, and alternatives.
Educational and Psychological Measurement 41:687–699.
Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological
Measurement 20:213–220.
Coles, S. (2001). An Introduction to Statistical Modelling of Extreme Values. London:
Springer-Verlag.
Congalton, R. G. (1991). A review of assessing the accuracy of classifications of remotely sensed
data. Remote Sensing of Environment 37:35–46.
de Haan, L., Ferreira, A. (2006). Extreme Value Theory: An Introduction. London: Springer-Verlag.
Derex, L., Nighoghossian, N. (2008). Intracerebral haemorrhage after thrombolysis for acute
ischaemic stroke: An update. Journal of Neurology, Neurosurgery & Psychiatry
79:1093–1099.
Dreiseitl, S., Ohno-Machado, L., Binder, M. (2000). Comparing three-class diagnostic tests by
three-way ROC analysis. Medical Decision Making 20:323–331.
Egan, J. P. (1975). Signal Detection Theory and ROC Analysis. New York: Academic Press.
Fleiss, J. L. (1975). Measuring agreement between two judges on the presence or absence of a trait.
Biometrics 31:651–659.
Fleiss, J. L., Cohen, J., Everitt, B. S. (1969). Large sample standard errors of kappa and weighted
kappa. Psychological Bulletin 72:323–327.
Gnedenko, B. (1943). Sur la distribution limite du terme maximum d'une série aléatoire. Annals of
Mathematics 44:423–453.
Green, D. M., Swets, J. A. (1996). Signal Detection Theory and Psychophysics. New York: Wiley.
Guisan, A., Hofer, U. (2003). Predicting reptile distributions at mesoscale: Relation to climate and
topography. Journal of Biogeography 30:1233–1243.
Guisan, A., Theurillat, J. P., Kienast, F. (1998). Predicting the potential distribution of plant species
in an alpine environment. Journal of Vegetation Science 9:65–74.
Guisan, A., Weiss, S. B., Weiss, A. D. (1999). GLM versus CCA spatial modelling of plant species
distribution. Plant Ecology 143:107–122.
Guisan, A., Zimmerman, N. E. (2000). Predictive habitat distribution models in ecology. Ecological
Modelling 135:147–186.
Gumbel, E. J. (1958). Statistics of Extremes. New York: Columbia University Press.
Heckerling, P. S. (2001). Parametric three-way receiver operating characteristic surface analysis
using Mathematica. Medical Decision Making 21:409–417.
Hirzel, A. H., LeLay, G., Helfer, V. (2006). Evaluating the ability of habitat suitability models to
predict species presences. Ecological Modelling 199:142–152.
James, I. R. (1983). Analysis of nonagreements among multiple raters. Biometrics 39:651–657.
Jenkinson, A. F. (1955). The frequency distribution of the annual maximum (or minimum) values of
Terpstra, J. T., Magel, R. C. (2003). A new nonparametric test for the ordered alternative problem.
Nonparametric Statistics 15:289–301.
Tiago de Oliveira, J. (1973). Statistical Extremes: A Survey. Lisbon: Center of Applied Mathematics.
Wardlaw, J. M., Zoppo, G., Yamaguchi, T. (2003). Thrombolysis for acute ischaemic stroke.
Cochrane Database of Systematic Reviews 3:CD000213.
Warrens, M. J. (2008a). On the equivalence of Cohen's kappa and the Hubert-Arabie adjusted rand
index. Journal of Classification 25:177–183.
Warrens, M. J. (2008b). On similarity coefficients for 2 × 2 tables and correction for chance.
Psychometrika 73:487–502.
Warrens, M. J. (2010). Inequalities between kappa and kappa-like statistics for k × k tables.
Psychometrika 75:176–185.
Zwick, R. (1988). Another look at interrater agreement. Psychological Bulletin 103:374–378.