You are on page 1of 36

1

AMartingaleTestforAlpha

DeanP.Foster*,RobertStine*,H.PeytonYoung**

Thisversion:January31,2010

*DepartmentofStatistics,WhartonSchool,UniversityofPennsylvania
**DepartmentofEconomics,UniversityofOxford

2

Abstract

We present a newmethod for testing whether the returns froma financial asset
aresystematicallyhigherthanaperfectlyefficientmarketwouldallow.Suchan
asset is said to have positive alpha. The standard way of estimating alpha is to
regresstheassetsreturnsagainstthemarketreturnsoveranextendedperiodof
timeandtoapplythettest.Thedifficultyisthattheresidualsoftenfailtosatisfy
independence and normality. In fact, portfolio managers may have an incentive
to employ strategies whose residuals depart by design from independence and
normality. To address these problems we propose a robust test for alpha based
on the martingale maximal inequality. Unlike the ttest, our test places no
restrictions on the distribution of returns while retaining substantial statistical
power.Weillustratethemethodforfourassets:astock,amutualfund,ahedge
fund,andafabricatedfundthatisdeliberatelydesignedtofoolstandardtestsof
significance.


3
1.Estimatingalpha

An asset that consistently delivers higher returns than a broadbased market


portfolio is said to have positive alpha. This is the excess return that results from
the asset managers superior skill in exploiting arbitrage opportunities and
judgingtherisksandrewardsassociatedwithvariousinvestments.Buthowcan
investors (and statisticians) tell from historical data whether a given portfolio
actually is generating positive alpha relative to the market? To answer this
question one must address four issues: i) multiplicity; ii) trends; iii) cross
sectional correlation; iv) robustness. We begin by reviewing standard
adjustments for the first three; this will set the stage for our approach to the
robustness issue, which involves a novel application of the martingale maximal
inequality(Doob,1953).

Thefirststepinevaluatingthehistoricalperformanceofafinancialassetrequires
adjusting for multiplicity. Assets are seldom considered in isolation: investors
canchooseamonghundredsorthousandsofstocks,bonds,mutualfunds,hedge
funds,andotherfinancialproducts.Withoutadjustingformultiplicity,statistical
testsofsignificancecanbeseriouslymisleading.Totakeatrivialexample,ifwe
weretotestindividuallywhethereachof100mutualfundsbeatsthemarketat
level=0.05,wewouldexpecttofindfivestatisticallysignificantpvalueswhen
infactnoneofthembeatsthemarket.

Statisticiansarewellversedinthedangersofsearchingforthemoststatistically
significanthypothesesindataandhavedevelopedawidevarietyofprocedures
4
tocorrectformultiplecomparisons.Thesimplestandmosteasilyusedoftheseis
the Bonferroni rule. When testing m hypotheses simultaneously, one compares
the observed pvalues to an appropriately reduced threshold. For example,
insteadofcomparingeachpvalue
i
p toathresholdsuchas=0.05,onewould
comparethemtothereducedthreshold/m.

Modern alternatives to Bonferroni have extended it in two directions. The first,


called alpha spending, allows the splitting of the alpha into uneven pieces; see
forexamplePocock(1977)andOBrienandFleming(1979).Thesecondgroupof
extensions provides more power when several different hypotheses are being
tested. These can be motivated from many perspectives: false discovery rate
(BenjaminiandHochberg1995),Bayesian(GeorgeandFoster,2000),information
theory (Stine, 2004) and frequentist risk (Abramovich, Benjamini, Donoho, and
Johnstone, 2006). One can even apply several of these approaches
simultaneously (Foster and Stine 2007). In the case of financial markets,
however, the natural null hypothesis is that no asset can beat the market for an
extended period of time because this would create exploitable arbitrage
opportunities. Thus, in this setting, the key issue is whether anything beats the
market,letalonewhethermultipleassetscanbeatthemarket.

Asecondkeyissueinevaluatingthehistoricalperformanceofdifferentassetsis
the need to detrend the data. This is particularly important for financial assets,
whichgenerallyexhibitastrongupwardtrendduetocompounding.Consider,
for example, the price series shown in Figure 1 for four quite different types of
assets. The top two panels represent the value over time of a longrunning
5
general mutual fund (Fidelity Puritan), and the stock of a diversified company
(Berkshire Hathaway) that is famous for its superior performance over a long
periodoftime.Thetwoassetsinthelowerpanelsrepresentmanagedfundsthat
follow dynamic investment strategies that we shall describe in more detail later
on. The Team Fund is based on a dynamic rebalancing algorithm that seeks to
reducevolatilitywhilemaintaininghighreturns(Gerth,1999;Agnew,2002).The
Piggyback Fund is based on an options trading strategy that is designed to fool
investors into believing that the fund is generating excess returns when in
expectation it is not (Foster and Young, 2007; for a related construction see Lo,
2001).

Figure1.Valueoffourassetsovertime
6

Thesimplestwaytodetrendsuchdataistostudytheperiodbyperiodreturns
ratherthanthevalueoftheassetitself.Thatis,insteadoffocusingonthevalue,
t
V ,onestudiesthesequenceofreturns
1 1
( )/
t t t
V V V

oversuccessivetimeperiods
t. The hope is that the returns are being generated by a process that is
sufficientlystationaryforstandardstatisticalteststobeapplied.Asweshallsee
later on, this hope may not be wellfounded given that financial portfolios are
oftenmanagedinawaythatproduceshighlynonstationarybehaviorbydesign.

Figure2.Monthlyreturnsseriesforthefourassets

7
The third issue, crosssectional correlation, arises because returns on financial
assets often exhibit a high degree of positive correlation. The standard way to
deal with this problem is the Capital Asset Pricing Model (CAPM), which
partitionsthevariationinassetreturnsintotwoorthogonalcomponents:market
risk, which is nondiversifiable and hence unavoidable, and idiosyncratic risk.
Byconstruction,idiosyncraticriskisorthogonaltomarketriskandmeasuresthe
rewards and risks associated with a specific asset. It is the mean return on this
idiosyncraticrisk,knownasalpha,thatdrawsinvestorstospecificstocks,mutual
funds,andalternativeinvestmentvehiclessuchashedgefunds.

Thestandardwaytoestimatethealphaofaparticularassetorportfolioofassets
istoregressitsreturnsagainstthereturnsfromabroadbasedmarketindexsuch
as the S&P 500 after subtracting out the riskfree return. Specifically, let
t
m be
the return generated by the market portfolio in period t , and let
t
r be the risk
free rate of return during the period, that is, the return available on a safe asset
suchasUSTreasurybills.Theexcessreturnofthemarketduringthe
th
t periodis

t t
m r . Let
t
y be the return in period t from a particular asset (or portfolio of
assets)thatwewishtocomparewiththemarket.Theportfoliosexcessreturnin
period t is defined as
t t
y r . CAPM isolates the idiosyncratic return of the
portfoliofromtheoverallmarketreturnbyestimatingtheregressionequation

( ) o | c = + +
t t t t t
y r m r .(1)

Themarketadjustedreturnisdefinedtobetheinterceptplustheresidual,thatis,
t
o c + .
8
The crucial question from the investors standpoint is whether the intercept is
positive ( 0 o > ). If it is, then the investor could increase his overall return by
investing a portion of his wealth in this portfolio instead of in the market.
Moreover, this increased return could be achieved without exposing himself to
muchadditionalvolatility,providedheputsasufficientlysmallproportionofhis
wealthintheportfolio.Thustherelevantquestioniswhetherthereexistsanyasset,or
portfolio of assets, that systematically beats the market in the sense that it exhibits
positivealphaatahighlevelofsignificance.

Aswehavealreadynoted,thestandardwaytoanswerthisquestionistoapplya
ttest to the estimate of the intercept. But this is only justified if the residuals
satisfy quite restrictive assumptions, including independence and normality.
Whenwegraphtheresidualsfromourfourcandidateassetsovertime,however,
it becomes apparent that two of them (Team and Piggyback) exhibit highly
erratic,nonstationarybehavior(seeFigures34).

Figure3.Marketadjustedreturnsversusmarketexcessreturns.

Figure4.Timeseriesofmarketadjustedreturnsforthefourassets.
10

In the case of Team, the market-adjusted returns are negatively correlated with
the market returns (see Figure 3), and they exhibit a cyclic pattern with bursts of
high volatility followed by periods of low volatility immediately thereafter (see
Figure 4). This is a direct consequence of Teams dynamic rebalancing strategy.
At any given time the fund is invested in a mixture of cash (earning the risk-free
rate of return) and the S&P 500. At the end of each year funds are moved from
cash to stock if the risk-free rate in that period exceeded the market rate of
return, and from stock to cash if the reverse was true. The target proportions to
maintain between stock and cash are adjusted at the end of each five-year
interval i. (The five-year intervals and target proportions -- 70% stock, 30% cash -
- are chosen purely for purposes of illustration.)

Gerth (1999) proves that if stocks have returns that are independent and
identically distributed, and if the riskfree rate of return is constant, then there
exists a choice of targets such that this strategy yields higher returns than the
marketwithoutincreasingthelevelofvolatility.Ourpurposeisnottocritique
this approach, but to take it as a concrete example showing why a natural
portfoliomanagementstrategycouldeasilyproducereturnsthatbydesigndiffer
substantiallyfromtheassumptionsneededtoapplythettest.

The returns from the Piggyback Fund exhibit even more erratic behavior. The
reasons for this will be discussed in section 3. Suffice it to say here that these
returnsareproducedbyastrategythatisdesignedtoearntheportfoliomanager
alotofmoneyratherthantodeliversuperiorreturnstotheinvestors.However,
since the natural aim of portfolio managers is to earn large amounts of money,
statisticaltestsofsignificancemustaccommodatethissortofbehavior.
11
Thepurposeofthispaperistointroducearobusttestforalphathatisimmuneto
this type of manipulation. Our test assumes nothing about the distribution of
marketadjusted returns except that they form a martingale difference: neither
independence,constantvariance,nornormalityarerequired.Thetestissimple
to compute and does not depend on the frequency of observations. A
particularlyimportantfeatureofthetestisthatitcorrectsforthepossibilitythat
themanagersstrategyconcealsasmallriskofalargelossinthetail.Thisisnot
a hypothetical problem: empirical studies using multifactor risk analysis have
shown that many hedge funds have negatively skewed returns and that as a
resultthettestsignificantlyunderestimatesthelefttailrisk(AgarwalandNaik,
2004).Moreover,negativelyskewedreturnsaretobeexpected,becausestandard
compensation arrangements give managers an incentive to follow just such
strategies(FosterandYoung,2009).

The plan of the paper is as follows. In section 2 we derive our test in its most
basicform.Thebasicideaisfirsttocorrectformarketcorrelationbycomputing
the marketadjusted returns as in (1), then compound these returns and test
whether they form a martingale. Section 3 shows why it is vital to have a test
that, unlike the ttest, is robust to distributional assumptions. In particular, we
showthatthettestleadstoafalsedegreeofconfidenceinthePiggybackFund,
whichisconstructedsothatitlooks likeitproducespositivealphawhenthisis
notactuallythecase.Section4extendstheapproachtotestwhetheroneormore
assets out of a given population of assets is producing positive alpha at a high
levelofsignificance.

12
In section 5 we show how to ramp up the statistical power of this approach by
leveraging the asset. We show, in particular, that if the returns of the asset are
lognormally distributed with known variance, then we can choose a level of
leverage such that the loss in power is quite modest relative to the optimal test
(which in this case is the ttest). In fact, for a pvalue of .01 the loss in power is
less than 30%, and for a pvalue of .001 the loss in power is only about 20%.

Insection6weconsiderthemoregeneralsituationinwhichweknownothingex
ante about the assets true distribution. In spite of this we can use leverage to
advantage by applying different levels of leverage to the asset in question. This
resultsinapopulationofleveragedassetstowhichweapplyPERT.Theessential
pointisthatthebestlevelofleverage(whichwedonotknowinadvance)results
in exponentially higher rates of growth than other levels. Hence the compound
excess return test applied to the population (PERT) can come quite close to the
compound excessreturntestappliedtothebestlevelofleverage.(Thisresultis
closely related to Covers work on universal portfolios (Cover, 1991.)
Wecallthistheexponentialpopulationexcessreturnstest(EXPERT).

Inthefinalsectionweapplythistesttoourfourcandidateassets.Itturnsout
that,aftercorrectingformultiplicity,marketcorrelation,andunrealized
downsiderisk,theonlyassetthatplausiblydeliverspositivealphaisBerkshire
Hathaway,eventhoughtwooftheothers(Fidelity, and Piggyback) look quite
impressive at first sight.



13
2. The compound excess returns test (CERT)

Considerafinancialasset,suchasastock,amutualfund,orahedgefundwhose
performance we wish to compare with that of the market. The data consist of
returns generated by the asset over a series of reporting periods 1,2,..., t T = .
Denotethemarketreturninperiodtbytherandomvariable
t
M andtheassets
returnbytherandomvariable
t
Y ,wherebydefinition , 0
t t
M Y > .Inapplications,
t
M wouldbethereturnonabroadbasedportfolioofstockssuchastheS&P500.

Thefirststepinouranalysisistosubtractofftheriskfreerateofreturnineach
period,thatis,therateavailableonasafeassetsuchasTreasurybills.Letting
t
r
denotetheriskfreerateinperiodt,definetherandomvariables

,
t t t t t t
M M r Y Y r = =

.(2)

Thesecondstepistocorrectforcorrelationwiththemarket.Let

( , )
( )
Cov Y M
Var M
| =

.(3)

In practice | can either be estimated directly from historical data or by


analyzingthecompositionoftheassetinquestion.Animportantpointtonotice
is that | can be estimated accurately using shortterm data (e.g., daily returns)
because it measures the extent to which the returns are correlated with the
market, not their magnitude. (Thus with sufficiently high frequency data one
14
could estimate the correlation coefficient period by period, i.e., monthly or
quarterly.)Themarketadjustedreturnoralphainperiodtis

t t t
A Y M | =

.(4)

Thenullhypothesisisthattheconditionalexpectationof
t
A iszeroineveryperiod
t,thatis,

1, 1
, [ | ..., ] 0
t t
t T E A A A

s = .(5)

This is equivalent to saying that the compound growth of the marketadjusted


returnsformsanonnegativemartingale

NullHypothesis:
1
(1 )
t s
s t
C A
s s
= +
[
isanonnegativemartingale.(6)

Compound Excess Return Test (CERT). For each t , 1 t T s s , let 0


t
C > be the
compound marketadjusted return of a candidate asset through period t. The CERT p
valuefortestingthenullhypothesisis

1
min (1/ )
t T t
p C
s s
= .(7)

Theproofisastraightforwardapplicationofthemartingalemaximalinequality,
which for convenience we shall derive here (Doob, 1953). Under our
assumptions,
t
C is a nonnegative martingale with conditional expectation 1 in
everyperiod.Givenarealnumber 0 > ,definetherandomtime ( ) T tobethe
15
first time t T s such that
t
C > if such a time exists, otherwise let ( ) T T = . By
the optional stopping theorem,
( )
[ ] 1
T
E C

= . (See Doob, 1953, Theorem 2.1.)
Clearly, if
1
max
s t s
C
s s
> then
( ) T
C

> , hence
1 ( )
(max ) ( )
s t s T
P C P C


s s
> s > .
Since
( ) T
C

isnonnegative,
( ) ( )
( ) [ ]/ 1/
T T
P C E C

> s = .Itfollowsthat

1
(max ) 1/
s t s
P C
s s
> s .(8)

It follows that the sequence


1 2
, ,...,
T
C C C of compound excess returns was
generatedwithprobabilityatmost
1
min (1/ )
t T t
C
s s
.

Notice that the length of the series is immaterial: what matters is the maximum
compound return that was achieved at some point during the series. Moreover,
the compound returns must bevery largefor the null hypothesis to be rejected.
For example, to reject the null at the 5% level of significance requires that the
asset grow twentyfold after subtracting off the riskfree rate and correcting for
correlation with themarket. Two of thefour fundsmeet thistest(Fidelity and
Berkshire) as may be seen from Figure 5, which shows the compound value of
eachassetsmarketadjustedreturns.

Table1comparesthepvaluesfromCERTwiththosefromthettest.According
to the latter, Berkshire, Fidelity, and Piggyback have positive alpha at a high
levelofsignificanceandPiggybackisparticularlyimpressive.Bycontrast,CERT
saysthatweshouldbeconfidentinBerkshire,whichhasaCERTpvalueequalto
.011,butnotinanyoftheothers.

16

Figure5.Compoundvalueofmarketadjustedreturnsforthefourassets

AssetRegressiontRegressionpvalueCERTpvalueCERTt
Fidelity3.230.00130.370.33
Berkshire4.03
5
6.7 10

0.0112.30
Team0.230.820.961.77
Piggyback16.79
41
1.3 10

0.151.03

Table1.RegressionpvaluesversusCERTpvaluesforthefourassets.
[CERTtisthevalueofthetstatisticthatcorrespondstothegivenpvalue.]

17
3.Gamingthettest.

At first it might seem counterintuitive that one would need to see a fund
outperformthemarketbyatleasttwentyfoldinordertobereasonablyconfident
(atthe5%levelofsignificance)thattheoutperformanceisforreal.Thereason
isthatanapparentlystellarperformancecanbedrivenbystrategiesthatleadtoa
totallosswithpositiveprobabilitybutaverylongtimecanelapsebeforetheloss
materializes.Indeed,itiseasytoconstructstrategiesofthisnatureforwhichthe
CERT bound is tight. Choose a number 1 > , and consider the following
nonnegativemartingalewithconditionalexpectation1

0
1, C =
1/ 1/
1
( ) ,
T T
t t
P C C

= =
1/
( 0) 1
T
t
P C

= = for1 t T s s .(9)

Ineachperiodthefundcompoundsbythefactor
1/T
withprobability
1/T

and
crasheswithprobability
1/
1
T

.Thustheprobabilitythatthefundscompound
excessreturn
t
C exceeds isprecisely1/ overanynumberofperiodsT.
1

The Piggyback Fund was constructed along just these lines. Namely, the fund
was invested in the S&P 500, and the returns reported every month. However,
once every six months the returns were artificially boosted by the factor 1.02.
ThiswasdonebytakinganoptionspositionintheS&P500thatwouldbankrupt
the fund if the options were exercised. The strike price was chosen so that the
probabilityofthiseventwas1/1.02=.9804,sothefairvalueoftheoptioniszero.
2

1
Returnsserieswiththispropertycanbeconstructedusingstandardoptionscontracts(Foster
andYoung,2009).
2
Withprobability1/1.02thefundgrowsbythefactor1.02andwithprobability.02/1.02itloses
everything,whichisalotterywithexpectationzero.
18
This explains the bizarre pattern of the marketadjusted residuals in Figure 3:
onesixth of the time they are + 2%, and fivesixths of the time they are 2% .
With less than 25 years of data there is a sizable probability that the downside
riskwillneverberealized,andinvestorswillbelulledintothinkingthatthefund
isgeneratingpositivealpha.ThisiswhyCERTattachesaverymodestpvalueto
thereturnsgeneratedbythePiggybackFund.

Ofcourse,evenacasualinspectionoftheresidualsinFigure3suggeststhatthe
ttest should not be used in this case. However this is not the essence of the
problem, because it is easy to construct piggyback strategies whose returns
look i.i.d. normal. Namely, suppose that every month the manager boosts the
fundsreturnsbythefactor 1.0033 where isalognormally distributederror
withmean1andsmallvariance.(Thepurposeoftheerroristolendaplausible
amountofvariabilitytothe realizedreturns.)Asbefore,theboostcomesatthe
cost of going bankrupt with probability 1/(1.0033 ) .9967/ ~ each month.
Assuming that the variance of is small, this scheme will run for about 300
months (25 years) before the fund goes bankrupt, and the residuals will look
veryconvincing.Thus,inthiscasethettestwouldseemtobeappropriate,and
the estimate of alpha will be about 4% per year at a very high level of
significance. This is misleading, however, because in reality the distribution of
returnsisnotapproximatelynormalthereisalargepotentiallosshiddeninthe
tail. One of the main virtues of CERT is that it corrects for this hidden
volatility: under CERT the pvalue of this scheme after 25 years will only be
about
300
(1.0033) .37 p

= ~ .

19
4.Multiplicity,Bonferroni,andtheportfoliotest.

The test described above is for returns generated by a single fund. In practice,
investorswouldliketoidentifythosefundsfromagivenpopulationthatgenerate
positive alpha with high probability. This requires a more demanding test of
significance.Toillustrate,supposethatwecanobservethereturnsforeachofn
funds over the same time frame 1,2,..., t T = . Let
i
t
C be the compound excess
returnforfund i throughperiod t .Supposethatweobserveaparticularfund,
say * i , whose returns are sufficiently high that we would reject the null at the
5%levelusingCERT.Thisdoesnotmeanwehave95%confidencethatfund * i
isgeneratingpositivealpha.Suppose,infact,thatnoneofthenfundsisableto
generate positive alpha and that the excess returns are stochastically
independent. It is straightforward to construct n independent nonnegative
martingales, each with expectation 1, such that on average there will be .05n
fundsthatexceedthecriticalthreshold
.05
c .Forexample,outof1000fundsthere
will, on average, be 50 funds that pass the singlefund threshold even though
noneofthefundsisactuallygeneratingpositivealpha.

Moregenerally,supposethatweobservennonnegativemartingales ,1
i
t
C i n s s ,
over the periods 1 t T s s . Let the null hypothesis
0
H be that
1 1
[ | ,..., ] 1
i i i
t t
E C c c

=
forall i andforall t ,wherethe
i
t
C areassumedtobeindependentacross i for
each t .Bythemartingalemaximalinequalityweknowthat

0, c >
1
(max )
i
t T t
P C c c
s s
> s .(10)

Hence
20

0, c >
1 1
(max max ) /
i
i n t T t
P C c n c
s s s s
> s .(11)

It follows that, to reject


0
H at level p > 0, there must exist one or more funds i such
that

1
max /
i
t T t
C n p
s s
> .(12)

We call this test BonferroniCERT. (The idea is quite general and applies to any
situation in which multiple hypothesis tests are being conducted; see Miller,
1981.)Inthepresentcasethetestsuggeststhatweshouldnotbeconfidentthat
anyofthefourassetsexhibitspositivealpha.Thereasonisthattheonlyonethat
passes muster at an individual level is Berkshire, which we cherrypicked from
theS&P500.Tobeconfidentthatthebestof500stocksgeneratedpositivealpha
at the 5% level of significance, would require that the stocks marketadjusted
returnscompoundbyafactorofatleast500x20=10,000.Whilethismightseem
like an impossibly high hurdle, we shall show in the next section that there are
morepowerfulversionsofthetestthatmakesuchvaluesachievable.

Before considering thisissue, however, letus observe thatthere isa simple and
powerful alternative to BonferroniCERT that works when one wants to know
whether one or more assets in a given population has positive alpha without
identifying which particular asset (or assets) that might be. This test, called the
Portfolio Excess Returns Test (PERT), relies on the fact that any weighted
combinationofassetsthathavezeroalphamustalsohavezeroalpha.

21
Portfolio Excess Returns Test (PERT). Consider a family of n funds, where each
fund i generates a series of compound excess returns
1 2
, ,...,
i i i
T
C C C over T periods.
Createaportfolioconsistingofequalamountsinvestedinitiallyineachofthefunds.The
portfolios compound return series is given by
1
(1/ )
i
t t
i n
C n C
s s
=

, 1 t T s s . The null
hypothesisthatnoneofthefundshaspositivealphahaspvalue
1
min (1/ )
t T t
C
s s
.

The proof is more or less immediate. Each of the n funds generates a


nonnegative martingale of excess returns
i
t
C that may nor may not be
independentacrossfunds.Theinvestorsportfolioisthenonnegativemartingale
1
(1/ )
i
t t
i n
C n C
s s
=

. The assumption that none of the funds exhibits positive alpha
implies that the conditional expectation of
t
C equals 1 in every period. The
conclusionfollowsatoncefromthemartingalemaximalinequality.

Note that this testisat leastas powerful as the Bonferroni test: the latter rejects
onlyif /
i
t
C n p > ,butinthisevent 1/
t
C p > ,whichimpliesthattheportfoliotest
rejectsalso.

5.Powerandleverage

While CERT and PERT are robust, they are also very conservative. In the next
twosectionsweshallshowthatmuchmorepowerfulvariantsofthesetestscan
be devised by leveraging the asset under scrutiny. Let us begin by considering
the special case in which the returns are known to be i.i.d. lognormal, in which
casetheoptimaltestofsignificanceisthettestappliedtotheloggedreturns.We
shall show that by leveraging the asset at an appropriate level we obtain an
22
exponentiated form of CERT (EXCERT) that involves only a modest loss of
powercomparedtotheoptimaltest.

Tobeconcrete,consideramanagerwhosefundisgeneratingcompoundreturns
0
t
C > relative to the riskfree rate and suppose for simplicity that there is no
correlation with the market ( 0) | = . We shall assume that
t
C is lognormally
distributed:

2 2
log (( / 2) , )
t
C N t t o o .
3
(13)

Whentheassetisleveragedbythe factor 0 > ,the compoundreturnsat timet,


( )
t
C ,aredescribedbytheprocess

2 2 2 2
log ( ) (( / 2) , )
t
C N t t o o .
4
(14)

Suppose that is
2
o is known and is not. The null hypothesis is that 0 = and
the alternative hypothesis is that 0 > . Choose a pvalue 0 p > and a time t at
whichatestofsignificanceistobeconducted.CERTrejectsthenullatlevelpif
andonlyif

log log(1/ )
t
C p > .(15)

3
ThisisconsistentwiththetraditionalrepresentationofassetreturnsasageometricBrownian
motionincontinuoustime
t t t t
dC C dt C dW o = + (Berndt,1996;Campbell,Lo,andMacKinlay,
1997).
4
Toleverageanassetbythefactoroneborrows1dollarsattheriskfreerateandinvests
dollarsintheasset.If<1thismeansthat1isinvestedintheriskfreeassetandthe
remainderintheriskyasset.Noticethattokeepaconstantlevelofleverageonewilltypically
needtorebalancetheabsoluteamountsinvestedineachassetovertime.Incontinuoustimethis
yieldstheprocess
t t t t
dC C dt C dW o = + .
23
Underthenullhypothesis,

2
log ( / 2)
t
t
C t
Z
t
o
o
+
= is (0,1) N .(16)

HenceCERTrejectsthenullifandonlyif

2
log(1/ ) ( / 2)
t
p t
Z
t
o
o
+
> .(17)

Tomaximizethepowerofthetestwechoosetheleveragesothattheprobability
of rejection is maximized. This occurs when the righthand side of (17) is
minimized,thatis,when

2log(1/ )
*
p
t

o
= .(18)

Definition.EXCERT(exponentialCERT)isCERTappliedtotheassetleveragedby
theamount * .

Noticethat * dependsonthevarianceoftheprocess,thetimeatwhichthetest
is conducted, and the level of significance p. However, the corresponding z
valuedependsonlyonp,thatis,EXCERTrejectsifandonlyif

2log(1/ )
t p
Z p c > ,(19)

24
that is,
p
c is the critical value for EXCERT at significance level p Let us compare
thiswiththecriticalvalueofthettest,whichrejectsatlevelpifnullatlevel p if
andonlyif

1
(1 )
t p
Z p z

> u .(20)

The power loss at significance level p, ( ) L p , is the maximum probability that for
some combination of , ,t o , the ttest rejects the null at level p when EXCERT
accepts.

Proposition1. ( ) 2 (.5( )) 1
p p
L p c z = u and
0
lim ( ) 0
p
L p
+

= .

The proof is given in the Appendix. The first expression in the proposition can
beusedtonumericallycompute ( ) L p ,andtheresultsareillustratedinFigure6.
Theloss inpowerisaround2025%forpin therange
3
10

to
5
10

,whichisthe
relevant range when we test for the best of n assets and n is on the order of
severalhundredorseveralthousand.

Figure6.Powerlossfunction ( ) L p forEXCERTcomparedtothettest.
25

6.LeveragingCERTwhenthevarianceisunknown

The situation considered in the preceding section is quite special in that the
distributionofreturnswasassumedtobei.i.d.lognormalwithknownvariance.
Here we introduce an alternative version of the test that is more robust and is
asymptotically just as powerful. To apply this test, all we need to know is the
maximumleveragethatcanbeappliedtotheassetunderconsiderationwithout
causing negative realizations; in other words it suffices to know the maximum
downsidelossthattheassetcansufferinanygivenperiod.

Consider an asset whose marketadjusted returns


t
A are bounded below by
1 | > . Then the random variable (1 / )
t t
M A
|
| + is nonnegative, that is, the
maximum permissible level of leverage is 1/| . The null hypothesis is that
[ ] 0
t
E A = , which implies that [ / ] 0
t
E A | = . Indeed the null hypothesis holds for
everylevelofleverageintherange 0 1/ | s s :

Nullhypothesis: , 0 1/ , | < s
1
(1 )
t s
s t
C A

s s
= +
[
isanonnegativemartingale.

Let us now construct a family of funds that are based on the given asset { }
t
A ,
where each fund operates under a different level of leverage 0 1/ | s s .
Suppose, for example, that we weightallfeasible levels of leverage equally. We
thenobtainapopulationoffundswhosetotalvalueattimeisgivenby

1/
0
1
(1 )
t s
s t
C A d
|
|
s s
= +
[
}
.(21)
26

More generally, consider any density ( ) f that is bounded away from zero on
an interval [ , ] [0,1/ ] a b | _ where ( ) 1
b
a
f d =
}
. We can construct a family of
fundswithcompoundreturns

1
( ) (1 )
b
t s
a
s t
C f A d
s s
= +
[
}
.(22)

Underthenullhypothesis,anysuchfundhascompoundreturnsthatformanonnegative
martingale.Hencewecanrejectthenullhypothesisatlevelpif

1
( ) (1 ) 1/
b
t s
a
s t
C f A d p
s s
= + >
[
}
.(23)

Any test of this form will be called an exponential population excess returns test
(EXPERT). This type of test is very general and assumes nothing about the
distribution of returns of the underlying asset { }
t
A except that they form a
martingaledifferenceandareboundedawayfrom 1 .

Given its generality one might expect that the power of such a test is very low.
In fact, however, it has the same power asymptotically as does EXCERT where
the variance is assumed to be known. The only requirement is that the interval
[ , ] a b contains the actual value * that optimizes EXCERT. By choosing the
interval to be as wide as possible, namely [0,1/ ] | , this requirement will
automaticallybesatisfied.

27
Proposition 2. Consider an asset whose returns are lognormally distributed
2 2
log (( / 2) , )
t
C N t t o o .SupposethatEXPERTisappliedattimettoapopulation
offundsleveragedaccordingtoadensitythatisboundedawayfromzeroonaninterval
[ , ] [0,1/ ] a b | _ that contains the optimal leverage level. Then for all sufficiently large
timestthelossinpowerrelativetothettestiswellapproximatedbyL(p).

Thisresultfollowsfromamoregeneraltheoremonuniversalportfoliosdueto
Cover(1991).Let
t
C bethesizeoftheEXPERTportfolioattimet.Let
*
t
C bethe
value at t of the portfolio that is leveraged at the optimal level * using
EXCERT. (This depends, of course, on t, p, and o ). Covers theorem compares
thesevalueswiththevalue
**
t
C ofathirdportfolioinwhichtheassetisleveraged
atalevel
**
thatischosenexposttomaximizethevalueofthefundattimetgiventhat
the realized values of the returns from the underlying asset up through t are known.
Clearly
** *
t t
C C > because
**
t
C is optimized ex post whereas
*
t
C is optimized ex
ante.Coverstheoremimpliesthat,foreverysmall 0 c > ,thereisatime T
c
such
that
**
( / 1 ) 1
t t
P C C c c > > forall t T
c
> .Itfollowsthat
*
( / 1 ) 1
t t
P C C c c > > for
all t T
c
> . By construction, EXCERT rejects at level p if
*
1/
t
C p > , whereas
EXPERTrejectsatlevelpif 1/
t
C p > .ItfollowsthatEXPERThasapproximately
thesamepowerlossasdoesEXCERT,namely ( ) L p ,atallsufficientlylargetimes
t.
5

5
Coverconsidersportfoliosthatareconvexcombinationsofagivenfinitesetofassets.This
conditionissatisfiedbecauseanyleveragedportfolio 1
t
A + canberepresentedasaconvex
combinationofthemaximallyleveragedportfolio 1
t
bA + andtheminimallyleveragedportfolio
1
t
aA + .Coveralsoassumesthatthereturnsfromeachassetinanygivenperiodarenonnegative
andboundedabove.Tomeetthisconditionwecantruncatethelognormaldistributionof
returnsfromthemaximallyleveragedfundatalevelthatisseveralordersofmagnitudesmaller
thanthelevelpwearetesting.
28
7.Empiricalanalysisofthefourassets

In this concluding section we shall apply the preceding framework to the four
assetsshowninFigures14.Aswehavealreadyseen,CERTwithoutleveraging
and without correcting for multiplicity leads to the pvalues shown in Table 1.
These suggest that Berkshire has positive alpha when viewed in isolation (p =
0.011), but not when culled from five hundred stocks (the S&P 500). However,
we have additional information about the composition of Berkshire (as well as
Fidelity and Team) that allows us to ramp up the leverage. Namely, in each of
these cases they were primarily invested in common stocks and cash and
(according to their annual reports) did not leverage their holdings to any
appreciableextent.
6
Thisallowsustoestimatetheamountofleveragethatcanbe
appliedwithoutthefundgoingbankrupt.WecanthenapplyEXPERT.

More generally we propose the following fourstep procedure for testing


whetheragivenassethaspositivealpha:

1.Estimatethecorrelationcoefficient | betweentheassetsreturns
t
Y andthe
marketsreturns
t
M .
7

2.Foreachtcomputethemarketadjustedreturns ( ) ( )
t t t t t
A Y r M r | = .

6
BerkshireHathawayisadiversifiedholdingcompanythatinvestsmainlyinthecommonand
preferredstockofothercompanies.TheFidelityPuritanFundinvestsinamixtureofstocksand
bonds,andrebalancestheproportionsperiodically.Byconstruction,Teamholdsacombination
ofcashandtheS&P500ateachpointintimeandisnotleveraged.
7
Alternatively,onecanestimatearollingcorrelationcoefficient |
t
usingatrailingsubsampleof
thedata.
29
3. Estimate the maximum amount of leverage
+
that can be applied to the
market-adjusted returns without going bankrupt. For an asset consisting of
liquid common stocks, this can be estimated either from the cost of a put option,
or from the size of the buffer needed to implement a stop-loss order. Here we
shall assume that a stop-loss order on stocks can be executed within 3% of the
limit price, so that one could leverage up to about 33 times without going
negative. (It is not crucial to estimate the upper bound precisely; what matters is
the optimal level of leverage, which will usually be lower than the maximum
possible leverage.)

4.Computethemaximumcompoundvalueoftheleveragedassetforaselection
ofleveragelevelsbetween0and
+
,andlet C be theaverage.Theestimatedp
value of the asset is 1/ C . Although this estimate will depend on the assumed
distributionof leveragelevels,therewilltypicallybeanarrowbandofleverage
levelsthatyieldvastlyhighervaluesofCthandotheothers.Theaverage C will
depend largely on these critical levels and not on the particular form of the
distribution.

For purposes of illustration we shall evaluate each of the four funds at seven
leverage levels: 1/2, 1, 2, 4, 8, 16, 32.
8
As noted above, the choice of these
particular values is not important, what matters is that they cover the range of
possiblevaluesreasonablywell(inthiscase033)andthesamedistributionof
values is applied to all the assets being evaluated. The results for Berkshire,
Fidelity,andTeamareshowninTable3.(Notethatwecannotapplythismethod
tothePiggybackFund,becausethemaximumlevelofleverageisnot33;indeed,

8
Thisapproximatesthedistributioninwhichlogleverageisuniformlydistributed..
30
the fund is already leveraged to the hilt because it is constructed from options
thatwillbankruptthefundwithpositiveprobability.)

LeverageMaxCvalue
FidelityBerkshireTeam
0.51.7121.0
12.7921.0
26.72,4001.1
431110,0001.2
82902301.3
161,2003,6001.5
32442001.5
Avg C 22516,6481.22
EXPERTp=1/ C .0044.000060.82
EXPERTt3.294.410.89
Regressiont3.234.030.23
Regressionpvalue.0017.000070.41

Table 3. EXPERT applied to three assets at leverage levels (1/2, 1/,2, 4, 8, 16, 32)
under the assumption that none of the leveraged assets can go negative at
these levels.

Noticethatthepvaluesestimatedbyourtestandbythettestareinquiteclose
agreement even though our test makes none of the regularity assumptions
requiredforthettest.

31
While these pvalues would be highly significant for a fund that is viewed in
isolation, however, we need to correct for multiplicity using Bonferroni. In
particular, Berkshire was cherrypicked from the S&P 500, so its EXPERT p
value,adjustedformultiplicity,is.00006x500=.012.Thisisstillsignificantbut
not impressively so. (And if we consider that Berkshire is just one of thousands
oflistedstocks,thentheadjustedpvaluewouldnotbesignificant.)Fidelitywas
selected from a pool of hundreds of stock mutual funds, so when adjusted for
multiplicityitspvalueisnotevenclosetobeingsignificantunderourtestorthe
ttest. We conclude that Berkshire has some claim to delivering positive alpha
aftercorrectingformultiplicity,butevenitmustbeviewedasaborderlinecase.
Theotherthreeassetsdonotevencomeclose.

Appendix:ProofofProposition1

We need to show that ( ) 2 (.5( )) 1


p p
L p c z = u and
0
lim ( ) 0
p
L p
+

= . Under the
nullhypothesis( 0 = ),

2
2
ln .5
ln .5( * )
*
t p
t
t
p
C c
C t
Z
c t
o
o
+
+
= = is (0,1) N .(A1)

Hencethettestrejectsthenullatlevelpifandonlyif

ln
.5
t
p p
p
C
z c
c
< + .(A2)
Bycontrast,EXCERTacceptsthenullifandonlyif
2
ln ln(1/ ) .5
t p
C p c s = ,thatis,

32

ln
.5
t
p
p
C
c
c
s .(A3)

Powerlossoccursintheregionwhereboth(22)and(23)hold.Assumenowthat
ln
.5
t
p
p
C
c
c
+ is distributed ( ,1)
t
N

o
for some 0 > . For this o , , t the loss in
powerisgivenbytheprobabilityoftheevent

ln
.5
t
p p p
p
C t t t
z c c
c

o o o
< + s .(A4)

ThemiddletermisdistributedN(0,1),sotheprobabilityofthiseventis

( ) ( )
p p
t t
c z

o o
u u .(A5)

This probability is maximized when


p
t
z

o
and
p
t
c

o
are symmetrically
situatedaboutzero.Itfollowsthatthepowerlossfunctionis

( ) 2 (.5( )) 1
p p
L p c z = u .(A6)

Tocompletetheproofweneedtoshowthat 0
p p
c z
+
as 0 p
+
.Tocomplete
theproofweneedtoshowthat 0
p p
c z
+
as 0 p
+
.Recallthatwhenzislarge
therighttailofthenormaldistributionhasthefollowingapproximation[Feller,
1957,p.193]:
33

2
/2
( )
2
z
e
P Z z
z t

> ~ .(A7)

Thereforewhen p isreasonablysmall,say .01 p s ,wehavetheapproximation

2log 2log(2 / )
p p
z z p t ~ + .(A8)

Combining(A6)and(A8)wefind,aftersomemanipulation,that

log .5log(2 ) log .5log(2 )


2 2log(1/ ) 2
p p
p p
p
z c
c z
p c
t t
t t
+ +
~ s .(A9)

Hence 0
p p
c z
+
as 0 p
+
,fromwhichweconcludethat ( ) 0 L p as 0 p
+
.

References

Abramovich, F., Benjamini, Y., Donoho, D. L., and Johnstone, I. M. (2006),


Adaptingtounknownsparsitybycontrollingthefalsediscoveryrate,Annalsof
Statistics,34,584653.

Agarwal, V., and Naik, N. Y., (2004), Risks and portfolio decisions involving
hedgefunds,ReviewofFinancialStudies,17,6398.

Agnew, R. A. (2002), On the TEAM approach to investing, American


MathematicalMonthly,109,188192.

34
Benjamini, Y. and Hochberg, Y., (1995), Controlling the false discovery
rate: a practical and powerful approach to multiple testing, Journal
oftheRoyalStatist.Soc.,Ser.B,57,289300.

Berndt, E. R. (1996), The Practice of Econometrics: Classic and Contemporary, New


York,AddisonWesley.

Campbell, J. C., Lo, A. W., and MacKinlay, A. C. (1997), The Econometrics of


FinancialMarkets.PrincetonNJ,PrincetonUniversityPress.

Cover,ThomasM.(1991),Universalportfolios,MathematicalFinance,1,129.

Doob,J.L.(1953),StochasticProcesses,NewYork,JohnWiley.

Feller, W. (1971), An Introduction to Probability Theory and Its Applications,


PrincetonUniversityPress.

Foster, D. P. and Stine, R. A. (2008),Alphainvesting: sequential


control of expected false discoveries, Journal of the Royal Statistical Society Series
B,70,.

Foster, D. P., and Young, H. P. (2009), Gaming Performance Fees by Portfolio


Managers,WorkingPaper08041,FinancialInstitutionsCenter,WhartonSchool
of Business, University of Pennsylvania. Available at
fic.wharton.upenn.edu/fic/papers/09/0909.pdf

35
George, E. and Foster, D. P. (2000), Empirical Bayes Variable
Selection,Biometrika,87,731747.

Gerth, F. (1999), The TEAM approach to investing, American Mathematical


Monthly,106,553558.

Lo, Andrew W., 2001, Risk management for hedge funds: introduction and
overview,FinancialAnalystsJournal,Nov/DecIssue,1633.

Miller,R.G.(1981),SimultaneousStatisticalInference,2
nd
ed.,NewYork:Springer
Verlag.

OBrien,P.C.,andFlemingT.R.(1979),Amultipletestingprocedureforclinical
trials,Biometrics,35,549556.

Pocock, S. J. (1977), Group sequential methods in the design and analysis of


clinicaltrials,Biometrika,64,191199.

Stine, R. A.(2004), Model selection using information theory and


theMDLprinciple.SociologicalMethods&Research,33,230260.

36

You might also like