
Master Thesis

Pairs Trading: An implementation of the Stochastic Spread and Cointegration Approach

Supervisors:

Author: Nick Huurman

5631335

Contents

1 Introduction

2 Cointegration approach
    2.1
    2.2
    2.3

3
    3.1
    3.2 Kalman Filter
    3.3 The EM Algorithm

4 Trading design
    4.1 Trading period
    4.2 Pairs selection
        4.2.1 Cointegration approach
        4.2.2
    4.3 Mean-Variance optimization

5 Evaluation
    5.1 Sharpe ratio
    5.2 Results
        5.2.1
        5.2.2 Cointegration Approach
    5.3

6 Conclusion

Chapter 1

Introduction

History shows us that using a market neutral trading strategy can be a good way to invest your

money. Typically, such a strategy performs in a steady manner, regardless of whether the market

goes up or down, and returns come with low volatility (Vidyamurhty, 2004). These favourable

characteristics are achieved by trading a market neutral portfolio, which can be constructed by

going long and short in two assets that have the same beta (hence, a portfolio with zero beta),

which is also referred to as a spread portfolio.

This thesis will evaluate one particular market neutral trading strategy that has already been

used (and proved its value) for 25 years on Wall Street, namely pairs trading. Recent studies

tell us that pairs trading performs exceptionally well in turbulent markets, where mispricing of

stocks is more common (Gatev et al., 2006; Do et al., 2006; Baronyan et al., 2010). Baronyan

et al. (2010) even reported a 40 per cent net annual profit in the first year (2008) of the financial

crisis. This result shows that pairs trading, despite its 25-year existence, is still profitable and

therefore very relevant to investigate, especially with the recent turbulent stock market.

The concept of pairs trading is relatively simple and can be summarized as follows. To begin,

an investor has to find two securities of which the prices have historically moved together and are

therefore in a relative equilibrium. Then, when the price difference between the two securities

widens, hence the securities are out of the relative equilibrium, the trader takes a long position

in the cheap security and a short position in the expensive security. Based on the past price

dynamics, the expectation of the investor is that the prices will converge back to their relative

equilibrium. If so, the long and short position are unwound and a profit is made.

The main difficulties of constructing a profitable pairs trading strategy lie evidently in using

the right method for selecting a suitable pair of securities and how and when to take a position

in the selected pair. A recent thesis by Yakop (2011) investigates a broad range of selection and

trading methods which are appropriate for pairs trading. He concludes that the model based

approaches perform best. Therefore, this thesis will investigate and analyse two different model

based approaches for pairs trading. The first approach is the cointegration approach, which is

based on the error correction model. The second method is the stochastic spread approach,

as introduced in Elliot et al. (2005).

The results of the selected pairs of both methods will be calculated with the use of two

different trading strategies. The first is a dynamic model for the number of positions taken in

the spread that is based on the mean-variance optimization procedure discussed in the paper of

Markowitz (1952). The second is the two-standard deviation approach, which is commonly used

in earlier literature (Yakop (2011), Gatev et al. (2006), Vidyamurhty (2004)). The main objective

for this thesis is to compare the performance of the cointegration approach with the stochastic

spread approach when implemented with the two aforementioned pairs trading strategies.

The thesis is organized as follows. Chapters 2 and 3 give an outline of the two different

approaches used for modelling the behaviour of a pair. In the 4th chapter the different trading

strategies will be described and Chapter 5 provides an evaluation of the results of both models

with the different trading strategies. The last chapter contains the conclusions of this thesis.

Chapter 2

Cointegration approach

The notion of cointegrated time series was first introduced by Engle and Granger (1987) and

is one of the ideas for which they received a Nobel Prize in economics in 2003. Cointegrated

time series possess characteristics that are very useful for pairs trading, such as a long-term

equilibrium with the associated property of mean reversion. In the first section of this chapter

the definitions of integration, cointegration and the error correction model (ECM) for a time

series are given. The second section gives the theoretical framework for pairs trading and in the

third section, the cointegration test proposed by Johansen (1991) is discussed.

2.1

To begin the theory of cointegration, the definitions of weak stationarity and integrated time series are given first:

Definition. An ordered sequence of random variables, i.e., a time series or process $\{x_t\}$, is weakly stationary or second-order stationary if the first two moments of the distribution of $\{x_t\}$ are constant and independent of time.

Definition. A time series which has a stationary, invertible ARMA representation after differencing $d$ times is said to be integrated of order $d$, denoted $\{x_t\} \sim I(d)$.

The two definitions above become tangible in an example of a simple VAR model. Consider a k-dimensional VAR(p) time series $\{x_t\}$ with a possible time trend, so that the model is

$$x_t = \mu_t + \Phi_1 x_{t-1} + \dots + \Phi_p x_{t-p} + a_t,$$

or

$$\Phi(B) x_t = \mu_t + a_t, \quad \text{with} \quad \Phi(B) = I - \Phi_1 B - \dots - \Phi_p B^p,$$

where $\mu_t = \mu_0 + \mu_1 t$, with $\mu_0$ and $\mu_1$ k-dimensional constant vectors. From the definition of weak stationarity, it follows that a necessary condition for the VAR(p) system above to be weakly stationary is that all zeros of the determinant $|\Phi(B)|$ lie outside the unit circle. In that case $\{x_t\}$ is unit-root stationary, or is said to be not integrated (an I(0) process) (Tsay, 2010). The definition of cointegration as stated in Engle and Granger (1987) is given next.

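Definition. The components of the vector $\{x_t\}$ are said to be cointegrated of order $d, b$, denoted $\{x_t\} \sim CI(d, b)$, if (i) all components of $\{x_t\}$ are $I(d)$; (ii) there exists a vector $\beta$ ($\neq 0$) so that $z_t = \beta' x_t \sim I(d - b)$, $b > 0$. The vector $\beta$ is called the cointegrating vector.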

Considering the case where d = b = 1, cointegration would mean that the equilibrium error

would be I(0) and zt will rarely drift far from its mean and will often cross this line (Engle

and Granger, 1987). A convenient way of representing the vector {xt } as a stationary series is

by the error correction model (ECM) representation, which solves the issue of overdifferencing (Tsay

(2010), p. 431). The definition of the ECM is given next (Engle and Granger, 1987):

Definition. A vector time series $\{x_t\}$ has an error correction representation if it can be expressed as:

$$A(B)(1 - B) x_t = -\alpha z_{t-1} + u_t,$$

where $u_t$ is a stationary multivariate disturbance, with $A(0) = I$, $A(1)$ has all elements finite, $z_t = \beta' x_t$ and $\alpha \neq 0$.

In this representation of the ECM, only the disequilibrium in the last period is an explanatory

variable. However, by rearranging terms, any kind of set of lags can be written in this form.

Therefore, this representation of the ECM permits any type of gradual adjustment towards a

new equilibrium (Engle and Granger, 1987).

2.2

Define the observed price of stock $i$ at time $t$ as $\{P_{it}\}$ and let $p_{it} = \ln(P_{it})$ be the corresponding log price. A common assumption about $\{p_{it}\}$ in the literature (Tsay, 2010; Vidyamurhty, 2004) is that the time series $\{p_{it}\}$ has a unit root and follows a random walk: $p_{it} = p_{i,t-1} + r_{it}$, where $\{r_{it}\}$ is the return (this unit-root assumption of $\{p_{it}\}$ is tested with the ADF-test in section 4.2.1).

Based on the arbitrage pricing theory (APT), if two stocks have similar risk factors, they should have similar returns. If this is the case, $\{p_{1t}\}$ and $\{p_{2t}\}$ are likely to be driven by a common component and are therefore cointegrated (Tsay, 2010). In formula, there exists a stationary linear combination $w_t = \beta' p_t = p_{1t} - \gamma p_{2t}$, with $p_t = (p_{1t}, p_{2t})'$ and $\beta = (1, -\gamma)'$.

These two price series $\{p_{1t}\}$ and $\{p_{2t}\}$ can also be written in an ECM form:

$$\begin{pmatrix} p_{1t} - p_{1,t-1} \\ p_{2t} - p_{2,t-1} \end{pmatrix} = \begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix} (w_{t-1} - \mu_w) + \begin{pmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{pmatrix}, \tag{2.1}$$

where $\mu_w = E[w_t]$ denotes the mean of $\{w_t\}$, which is referred to as the spread between the two stocks.

The left-hand side of the ECM form represents the log returns of both price series. Furthermore, the equation states that the returns depend on the stationary series $w_{t-1} - \mu_w$ and are therefore also stationary. Specifically, $w_{t-1} - \mu_w$ is the deviation from the equilibrium between the two stocks. So, the returns of the stocks (left side of 2.1) depend on the past deviations from the equilibrium. The coefficients $\alpha_1$ and $\alpha_2$ respectively show the effect of these past deviations on the returns $\{r_{1t}\}$ and $\{r_{2t}\}$. In practice, the coefficients $\alpha_1$ and $\alpha_2$ will have opposite signs, indicating the mean reversion behaviour of the stationary series.
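The mechanics of equation 2.1 can be illustrated with a small simulation. The sketch below is only an illustration under assumed values: the coefficients `alpha1`, `alpha2`, the hedge ratio `gamma`, the noise level `sigma` and the function name `simulate_ecm` are hypothetical and are not estimates from this thesis. With opposite signs for the two adjustment coefficients, the spread behaves like a stationary AR(1) series while the individual log prices keep wandering.

```python
import numpy as np

def simulate_ecm(n, alpha1, alpha2, gamma, mu_w, sigma, seed=0):
    """Simulate two log-price series from the bivariate ECM in equation 2.1."""
    rng = np.random.default_rng(seed)
    p = np.zeros((n, 2))
    p[0, 1] = np.log(50.0)                    # arbitrary starting log price
    p[0, 0] = mu_w + gamma * p[0, 1]          # start exactly at equilibrium
    for t in range(1, n):
        dev = p[t - 1, 0] - gamma * p[t - 1, 1] - mu_w   # w_{t-1} - mu_w
        eps = sigma * rng.standard_normal(2)
        p[t, 0] = p[t - 1, 0] + alpha1 * dev + eps[0]
        p[t, 1] = p[t - 1, 1] + alpha2 * dev + eps[1]
    return p

# Hypothetical parameter values, chosen only for illustration.
p = simulate_ecm(n=5000, alpha1=-0.05, alpha2=0.05, gamma=1.0, mu_w=0.0, sigma=0.01)
w = p[:, 0] - 1.0 * p[:, 1]                   # spread w_t = p_1t - gamma * p_2t
phi = np.polyfit(w[:-1], w[1:], 1)[0]         # AR(1) slope of the spread
print(f"AR(1) slope of spread: {phi:.3f} (implied: 1 + alpha1 - gamma*alpha2 = 0.9)")
print(f"std of spread: {w.std():.4f}, std of log price 1: {p[:, 0].std():.4f}")
```

Subtracting gamma times the second row of equation 2.1 from the first shows why: the spread follows an AR(1) process with coefficient 1 + alpha1 - gamma*alpha2, which lies inside the unit interval for these values.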

2.3

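For testing purposes, the ECM representation for a k-dimensional VAR(p) time series $\{x_t\}$ becomes:

$$\Delta x_t = \mu d_t + \Pi x_{t-1} + \Phi_1^* \Delta x_{t-1} + \dots + \Phi_{p-1}^* \Delta x_{t-p+1} + a_t,$$

where the deterministic regressor $\{d_t\}$ (constant/trend) is added and $t = p + 1, \dots, T$. Furthermore,

$$\Phi_j^* = -\sum_{i=j+1}^{p} \Phi_i, \quad j = 1, \dots, p - 1, \quad \text{and} \quad \Pi = \Phi_p + \Phi_{p-1} + \dots + \Phi_1 - I = -\Phi(1).$$

The term $\Pi x_{t-1}$ is referred to as the error correction term, which plays a key role in the cointegration study. Three cases can be distinguished:

1. Rank($\Pi$) = 0. Hence, $\Pi = 0$ and $x_t$ is not cointegrated.

2. Rank($\Pi$) = k. Hence, $|\Pi| \neq 0$ and $x_t$ contains no unit roots and one can just look at $x_t$ (which is I(0)).

3. 0 < Rank($\Pi$) = m < k. Hence, $x_t$ has m linearly independent cointegration vectors and $k - m$ unit roots, and $\Pi$ can be written as $\Pi = \alpha\beta'$.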

As can be seen from the above three cases, the rank of the matrix $\Pi$ is sufficient for knowing whether the time series $\{x_t\}$ is cointegrated. Therefore, a likelihood ratio (LR) test for determining the rank of $\Pi$, called the Johansen cointegration test, is described next. The hypotheses of this test can be formulated as $H_0$: Rank($\Pi$) = m versus $H_a$: Rank($\Pi$) > m. The value of m starts at zero and is increased by one each time the null hypothesis is rejected. If the null hypothesis is rejected for every m < k, $\{x_t\}$ has the properties of the second case specified above. The test statistic is

$$LR_{tr}(m) = -(T - p) \sum_{i=m+1}^{k} \ln(1 - \hat{\lambda}_i),$$

where $\hat{\lambda}_i$ (should be small for i > m) are the squared canonical correlations between $\hat{u}_t$ and $\hat{v}_t$, which are the residuals of $\Delta x_t$ and $x_{t-1}$.

which depends on k

2,
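To make the trace statistic concrete, the following NumPy sketch computes the squared canonical correlations and $LR_{tr}(m)$ for a simulated pair. It is a minimal sketch under assumptions: the deterministic term is taken to be a constant only, the function name `johansen_trace` and the simulated data are hypothetical, and no critical values are included (these have to be taken from published tables).

```python
import numpy as np

def johansen_trace(x, p=1):
    """Eigenvalues and trace statistics LR_tr(m), m = 0..k-1, for a (T x k) array of levels.

    Assumes a VAR(p) in levels with a constant as the only deterministic regressor.
    """
    x = np.asarray(x, dtype=float)
    dx = np.diff(x, axis=0)                      # dx[i] = x[i+1] - x[i]
    t_eff = x.shape[0] - p                       # observations t = p+1, ..., T
    y0 = dx[p - 1:]                              # Delta x_t
    y1 = x[p - 1:-1]                             # x_{t-1}
    z = [np.ones((t_eff, 1))]                    # constant
    for j in range(1, p):                        # lagged differences Delta x_{t-j}
        z.append(dx[p - 1 - j: dx.shape[0] - j])
    z = np.hstack(z)
    # Residuals of Delta x_t and x_{t-1} after regressing on z
    u = y0 - z @ np.linalg.lstsq(z, y0, rcond=None)[0]
    v = y1 - z @ np.linalg.lstsq(z, y1, rcond=None)[0]
    s00, s11, s01 = u.T @ u / t_eff, v.T @ v / t_eff, u.T @ v / t_eff
    # Squared canonical correlations: eigenvalues of S11^{-1} S10 S00^{-1} S01
    m_mat = np.linalg.solve(s11, s01.T) @ np.linalg.solve(s00, s01)
    lam = np.sort(np.linalg.eigvals(m_mat).real)[::-1]
    trace = np.array([-t_eff * np.log(1.0 - lam[m:]).sum() for m in range(len(lam))])
    return lam, trace

# Hypothetical example: a random walk and a noisy copy of it (cointegrated by construction).
rng = np.random.default_rng(1)
rw = np.cumsum(rng.standard_normal(1000))
x = np.column_stack([rw, rw + rng.standard_normal(1000)])
lam, trace = johansen_trace(x, p=1)
print("eigenvalues:", lam)
print("LR_tr(0), LR_tr(1):", trace)
```

For this constructed pair the statistic for m = 0 is very large while the statistic for m = 1 is small, which is the pattern expected for a system with one cointegration vector.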

Chapter 3

In this chapter I will describe a mean reverting Gaussian Markov chain model for the spread,

namely the stochastic spread model which is based on the paper by Elliot et al. (2005). Later in

this thesis the returns of this stochastic spread approach, when implemented as a pairs trading

strategy, are compared with the above mentioned cointegration approach using historical data.

3.1

At any given time, a pairs trading portfolio is associated with a quantity called the spread,

which is the difference between the quoted prices of the securities used. If the spread of the

portfolio is significantly different from the mean, a position in both securities is taken with the

expectation that the spread will revert to its mean (Vidyamurhty, 2004).

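To explicitly model the mean reverting behaviour of the spread, a state process $\{x_k \mid k = 0, 1, 2, \dots\}$ is introduced, where $\{x_k\}$ denotes the value of some variable at time $t_k = k\tau$ for $k = 0, 1, 2, \dots$. We assume that $\{x_k\}$ is mean reverting:

$$x_{k+1} - x_k = b\left(\frac{a}{b} - x_k\right)\tau + \sigma\sqrt{\tau}\,\varepsilon_{k+1}, \tag{3.1}$$

where $\sigma \geq 0$, $b > 0$ and $\{\varepsilon_k\}$ is i.i.d. Gaussian $N(0, 1)$. In continuous time this corresponds to the process $dX(t) = (a - bX(t))\,dt + \sigma\,dW(t)$.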

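The mean and variance of $x_k$ satisfy

$$\mu_k = E(x_k) = a\tau + (1 - b\tau)\mu_{k-1} = \dots = \frac{a}{b} - \frac{a}{b}(1 - b\tau)^k + (1 - b\tau)^k \mu_0,$$

and

$$\sigma_k^2 = Var(x_k) = (1 - b\tau)^2 \sigma_{k-1}^2 + \sigma^2\tau = \dots = \sigma^2\tau\,\frac{1 - (1 - b\tau)^{2k}}{1 - (1 - b\tau)^2} + (1 - b\tau)^{2k}\sigma_0^2.$$

From these two equations the long term mean and variance can be derived. For $k \to \infty$:

$$\mu_k \to \frac{a}{b}, \qquad \sigma_k^2 \to \frac{\sigma^2\tau}{1 - (1 - b\tau)^2}.$$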

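The state process can also be written as

$$x_{k+1} = A + Bx_k + C\varepsilon_{k+1}, \tag{3.2}$$

where $A = a\tau$, $B = (1 - b\tau)$ and $C = \sigma\sqrt{\tau}$.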

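The latent variable $\{x_k\}$ defined above is used in the measurement equation, which defines the observed process $\{y_k\}$:

$$y_k = x_k + D\omega_k, \tag{3.3}$$

where $\{\omega_k\}$ is i.i.d. Gaussian $N(0, 1)$ measurement noise.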
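The state and measurement equations can be checked numerically. The sketch below first iterates the recursions for the mean and variance of the state and compares them with the closed forms, and then simulates the model to compare the sample moments with the long term mean $a/b$ and the long term variance. All parameter values are hypothetical and chosen only so that $0 < 1 - b\tau < 1$.

```python
import numpy as np

# Hypothetical parameter values (not estimates from this thesis).
a, b, sigma, tau, D = 0.5, 2.0, 0.3, 0.05, 0.05
A, B, C = a * tau, 1.0 - b * tau, sigma * np.sqrt(tau)

# Iterate the recursions for the mean and variance of x_k ...
k, mu0, var0 = 30, 0.0, 0.0
mu_rec, var_rec = mu0, var0
for _ in range(k):
    mu_rec = A + B * mu_rec
    var_rec = B**2 * var_rec + C**2

# ... and compare with the closed-form expressions.
mu_closed = a / b - (a / b) * B**k + B**k * mu0
var_closed = C**2 * (1 - B**(2 * k)) / (1 - B**2) + B**(2 * k) * var0
print(mu_rec, mu_closed)
print(var_rec, var_closed)

# Simulate the state equation and the measurement equation.
rng = np.random.default_rng(0)
n = 20000
x = np.empty(n)
x[0] = a / b
for i in range(n - 1):
    x[i + 1] = A + B * x[i] + C * rng.standard_normal()
y = x + D * rng.standard_normal(n)

long_run_var = C**2 / (1 - B**2)
print(f"sample mean {x.mean():.4f} vs a/b = {a / b:.4f}")
print(f"sample variance {x.var():.4f} vs long term variance {long_run_var:.4f}")
```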

The model described above has three major advantages from a theoretical point of view.

The first one is rather obvious, namely the model is mean reverting. This is exactly what is

required of the spread between two stocks to implement a successful pairs trading strategy.

The second advantage is that the model for the spread is continuous in time, such that it is

convenient for forecasting purposes. Critical questions for pairs trading, such as the expected

holding period of the portfolio and the expected return of the strategy, can therefore be answered.

The third advantage is that the model is completely tractable. All the parameters can be

estimated using the Kalman filter and a maximum likelihood procedure called the EM algorithm.

In the next two sections, the Kalman filter and the EM algorithm will be discussed in detail.

3.2

Kalman Filter

To estimate the above dynamical system of the stochastic spread model, a very useful tool called the Kalman filter (named after R.E. Kalman (Kalman, 1960)) is introduced. The Kalman filter is an algorithm for calculating linear least squares forecasts of the state vector on the basis of data observed through time $t$,

$$\hat{x}_{t+1|t} = E[x_{t+1} \mid \mathcal{Y}_t],$$

where $\mathcal{Y}_t = (y_t, y_{t-1}, \dots, y_1)$ denotes the observations up to time $t$. The Kalman filter calculates these forecasts recursively, generating $\hat{x}_{1|0}, \hat{x}_{2|1}, \dots, \hat{x}_{t|t-1}$.

In this thesis, the Kalman filter is described as a four-step procedure and is based on the

description given in chapter 13 of the book of Hamilton (1994) and the paper of Elliot et al.

(2005). For convenience, the key features of a general state-space system are given first:

$$x_{t+1} = A + Bx_t + C\varepsilon_{t+1}, \tag{3.4}$$

$$y_t = x_t + D\omega_t. \tag{3.5}$$

For now it is assumed that the values of $A$, $B$, $C$ and $D$ are known, but later these parameters are estimated with the use of the EM algorithm from Shumway and Stoffer (1982).

To begin the Kalman filtering, the starting point of the recursion has to be set. Typically, the starting point of the recursion is set as $\hat{x}_{1|0} = E[x_1]$, which is just the unconditional mean of $x_1$. The associated mean squared error (MSE) of this starting point is therefore $P_{1|0} = E[(x_1 - \hat{x}_{1|0})^2]$.

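After defining the starting point, the next step is to calculate the forecasts for the following points in time:

$$\hat{x}_{k+1|k} = E[x_{k+1} \mid \mathcal{Y}_k] = A + B\,E[x_k \mid \mathcal{Y}_k] = A + B\hat{x}_{k|k}, \tag{3.6}$$

$$P_{k+1|k} = E[(x_{k+1} - \hat{x}_{k+1|k})^2] = B^2 P_{k|k} + C^2. \tag{3.7}$$

The forecast of the observation and its MSE are

$$\hat{y}_{k|k-1} = E[y_k \mid \mathcal{Y}_{k-1}] = \hat{x}_{k|k-1}, \tag{3.8}$$

$$E[(y_{k+1} - \hat{y}_{k+1|k})^2] = P_{k+1|k} + D^2. \tag{3.9}$$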

Next, the inference about the current value of $\{x_k\}$ is updated on the basis of the observation of $\{y_k\}$ to produce

$$\hat{x}_{k|k} = E(x_k \mid y_k, \mathcal{Y}_{k-1}) = E(x_k \mid \mathcal{Y}_k). \tag{3.10}$$

Using the formula for updating a linear projection (Hamilton, 1994, p. 379) results in:

$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + E[(x_k - \hat{x}_{k|k-1})(y_k - \hat{y}_{k|k-1})] \left(E[(y_k - \hat{y}_{k|k-1})^2]\right)^{-1} (y_k - \hat{y}_{k|k-1}), \tag{3.11}$$

$$\hat{x}_{k+1|k+1} = \hat{x}_{k+1|k} + K_{k+1}(y_{k+1} - \hat{x}_{k+1|k}), \tag{3.12}$$

where $K_{k+1}$ stands for the Kalman gain and is given by:

$$K_{k+1} = P_{k+1|k} / (P_{k+1|k} + D^2). \tag{3.13}$$

The estimate $\hat{x}_{k+1|k+1}$ denotes the best estimate of $x_{k+1}$ given $\mathcal{Y}_{k+1}$.
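The recursion in equations 3.6, 3.7, 3.12 and 3.13 can be sketched for the scalar model as follows. This is a minimal illustration with hypothetical parameter values. The update of the filtered MSE, $P_{k|k} = (1 - K_k)P_{k|k-1}$, is the standard scalar Kalman update and is added here as an assumption, since it is needed to continue the recursion.

```python
import numpy as np

def kalman_filter(y, A, B, C, D, x10, P10):
    """Scalar Kalman filter for x_{k+1} = A + B x_k + C eps, y_k = x_k + D omega."""
    n = len(y)
    x_pred, P_pred = np.empty(n), np.empty(n)    # x_{k|k-1}, P_{k|k-1}
    x_filt, P_filt = np.empty(n), np.empty(n)    # x_{k|k},   P_{k|k}
    xp, Pp = x10, P10                            # starting point of the recursion
    for k in range(n):
        x_pred[k], P_pred[k] = xp, Pp
        K = Pp / (Pp + D**2)                     # Kalman gain (3.13)
        x_filt[k] = xp + K * (y[k] - xp)         # update (3.12)
        P_filt[k] = (1.0 - K) * Pp               # standard MSE update (assumption)
        xp = A + B * x_filt[k]                   # forecast (3.6)
        Pp = B**2 * P_filt[k] + C**2             # forecast MSE (3.7)
    return x_pred, P_pred, x_filt, P_filt

# Hypothetical parameters and simulated data.
rng = np.random.default_rng(0)
A, B, C, D = 0.025, 0.9, 0.067, 0.05
n = 5000
x = np.empty(n)
x[0] = A / (1 - B)
for k in range(n - 1):
    x[k + 1] = A + B * x[k] + C * rng.standard_normal()
y = x + D * rng.standard_normal(n)

x_pred, P_pred, x_filt, P_filt = kalman_filter(
    y, A, B, C, D, x10=A / (1 - B), P10=C**2 / (1 - B**2))
mse_raw = np.mean((y - x) ** 2)                  # error of using y_k directly
mse_filt = np.mean((x_filt - x) ** 2)            # error of the filtered estimate
print(f"MSE raw observation: {mse_raw:.5f}, MSE filtered: {mse_filt:.5f}")
```

In this simulated setting the filtered estimate tracks the latent state more closely than the raw observation, which is the purpose of the update step.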

3.3

The EM Algorithm

The Kalman filter assumes that the parameters in the state-space model are specified in advance.

Normally, this is not the case and these parameters have to be estimated. One widely used

estimation method is described in the paper of Shumway and Stoffer (1982) and will also be

used in this thesis. In the paper of Shumway and Stoffer (1982) the estimation of the parameters

is done by maximum likelihood using the EM algorithm. Next, I will discuss this estimation

method.

In order to estimate the parameters of the state space model defined by 3.4 and 3.5, the joint

log likelihood has to be specified for this model. The dependence on the unobserved time series

{xk } of the system, makes the specification of the likelihood function not straightforward. To

solve this problem, the EM algorithm is conditioned on the observed time series $y_1, \dots, y_n$. Let us define the estimated parameters at the $(r + 1)$st iterate as the values $\vartheta = (A, B, C^2, D^2)$ which maximize:

$$G(\vartheta) = E_r[\log L \mid y_1, \dots, y_n], \tag{3.14}$$

where the conditional expectation $E_r$ refers to the $r$th iterative values $A(r)$, $B(r)$, $C^2(r)$ and

$D^2(r)$. Furthermore, $\log L$ is the joint log likelihood of the complete data. The conditional

mean and the covariance functions specified by the Kalman filter are conditioned on the full

dataset, which gives smoothed estimators of {xk }:

$$\hat{x}_{k|n} = E(x_k \mid \mathcal{Y}_n),$$

$$P_{k|n} = E[(x_k - \hat{x}_{k|n})^2],$$

$$P_{k,k-1|n} = E[(x_k - \hat{x}_{k|n})(x_{k-1} - \hat{x}_{k-1|n})].$$

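The EM algorithm is a two-step iterative procedure that finds a stationary value $\hat{\vartheta}$ of the likelihood function in the following way:

step 1 (the E-step): Compute (with $\tilde{\vartheta} = \hat{\vartheta}_j$):

$$Q(\vartheta, \tilde{\vartheta}) = E_{\tilde{\vartheta}}[\log L \mid y_1, \dots, y_n],$$

step 2 (the M-step): Find $\hat{\vartheta}_{j+1}$, the value of $\vartheta$ that maximizes $Q(\vartheta, \tilde{\vartheta})$.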
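The two steps can be sketched for the scalar model as follows. This is an illustrative sketch and not the implementation used for the results in this thesis: the smoothed moments are computed with a standard backward (Rauch-Tung-Striebel type) recursion, the distribution of the first state is kept fixed at the hypothetical values `m0` and `P0`, and all parameter values, starting values and function names are assumptions.

```python
import numpy as np

def filter_smooth(y, A, B, C2, D2, m0, P0):
    """E-step: Kalman filter, backward smoother and log likelihood."""
    n = len(y)
    xp, Pp = np.empty(n), np.empty(n)            # predicted mean / MSE
    xf, Pf = np.empty(n), np.empty(n)            # filtered mean / MSE
    loglik = 0.0
    xp[0], Pp[0] = m0, P0
    for k in range(n):
        if k > 0:
            xp[k] = A + B * xf[k - 1]
            Pp[k] = B**2 * Pf[k - 1] + C2
        S = Pp[k] + D2                           # innovation variance
        v = y[k] - xp[k]                         # innovation
        K = Pp[k] / S
        xf[k] = xp[k] + K * v
        Pf[k] = (1.0 - K) * Pp[k]
        loglik += -0.5 * (np.log(2 * np.pi * S) + v**2 / S)
    xs, Ps = xf.copy(), Pf.copy()                # smoothed mean / MSE
    Pc = np.zeros(n)                             # Pc[k]: smoothed cov of (x_k, x_{k-1})
    for k in range(n - 2, -1, -1):
        J = B * Pf[k] / Pp[k + 1]
        xs[k] = xf[k] + J * (xs[k + 1] - xp[k + 1])
        Ps[k] = Pf[k] + J**2 * (Ps[k + 1] - Pp[k + 1])
        Pc[k + 1] = J * Ps[k + 1]
    return xs, Ps, Pc, loglik

def em_step(y, theta, m0, P0):
    """One EM iteration; returns the updated parameters and the log likelihood at theta."""
    A, B, C2, D2 = theta
    xs, Ps, Pc, ll = filter_smooth(y, A, B, C2, D2, m0, P0)
    N = len(y) - 1                               # number of state transitions
    Sx, Sy = xs[:-1].sum(), xs[1:].sum()
    Sxx = (Ps[:-1] + xs[:-1] ** 2).sum()
    Syy = (Ps[1:] + xs[1:] ** 2).sum()
    Sxy = (Pc[1:] + xs[1:] * xs[:-1]).sum()
    # M-step: regression of x_{k+1} on (1, x_k) using smoothed moments
    B_new = (N * Sxy - Sx * Sy) / (N * Sxx - Sx**2)
    A_new = (Sy - B_new * Sx) / N
    C2_new = (Syy - 2 * A_new * Sy - 2 * B_new * Sxy + N * A_new**2
              + 2 * A_new * B_new * Sx + B_new**2 * Sxx) / N
    D2_new = ((y - xs) ** 2 + Ps).mean()
    return (A_new, B_new, C2_new, D2_new), ll

# Hypothetical data generated from the state-space model.
rng = np.random.default_rng(0)
n = 500
x = np.empty(n)
x[0] = 0.25
for k in range(n - 1):
    x[k + 1] = 0.025 + 0.9 * x[k] + 0.067 * rng.standard_normal()
y = x + 0.05 * rng.standard_normal(n)

theta = (0.0, 0.5, 0.01, 0.01)                   # arbitrary starting values
logliks = []
for _ in range(50):
    theta, ll = em_step(y, theta, m0=y[0], P0=0.1)
    logliks.append(ll)
A_hat, B_hat, C2_hat, D2_hat = theta
print("estimates (A, B, C^2, D^2):", theta)
```

A useful check on any EM implementation is that the log likelihood does not decrease from one iteration to the next; the list `logliks` records it for this purpose.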

Figure 3.1 shows a generated spread (with the parameters in Elliot et al. (2005)) and the fitted values of this spread using the stochastic spread approach.

Figure 3.1: The fitted values of Stochastic Spread approach (green line) and simulated spread (blue line). Vertical axis: Spread; horizontal axis: Days.

Chapter 4

Trading design

This chapter discusses the trading strategy used in this thesis. In the first section, the trading

period is described. The second section sets out the pairs selection criteria for the two model

based approaches described in the former chapters. In the third section, the mean-variance

optimization theory of Markowitz (1952) for determining the optimal number of positions in the

spread, is discussed.

4.1

Trading period

The data used in this thesis contains daily closing prices of the stocks of the Amsterdam Stock

Exchange (AEX) in the period from 1st of January 2006 until 30th of December 2011 and is

obtained from Thomson Reuters through Datastream Advance. Since an equilibrium between two

stocks is not very likely to remain over the whole time span of the dataset, the data is divided into small

blocks of formation periods and adjacent trading periods. The number of days in the formation

period is arbitrarily chosen and set to 128, 256 and 512 days. The adjacent trading period is

set to half of the trading days of the formation period as is done in earlier literature (Gatev et al.

(2006), Yakop (2011)). In the trading period, the number of positions in the spread is opened

following the mean-variance optimization procedure (discussed at the end of the chapter) and

the two standard deviation strategy. Any remaining open positions in the spread are closed at

the end of the trading period.

A rolling window of 40 trading days will be used to start a new formation period. The result

of implementing a rolling window is that after the first 128, 256 or 512 days (which are the

different lengths of the formation periods), all the remaining days in the dataset will be used

for trading and no opportunities are lost.
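The layout of formation and trading periods can be sketched as follows. The function name, the total number of days (`n_days = 1000`) and the choice to cut off the last trading period at the end of the sample are assumptions made for this illustration only.

```python
def formation_trading_windows(n_days, formation, step=40):
    """Return a list of ((f_start, f_end), (t_start, t_end)) index ranges, end exclusive.

    The trading period is half the formation period and a new formation
    period starts every `step` trading days (the rolling window).
    """
    trading = formation // 2
    windows = []
    start = 0
    while start + formation < n_days:            # at least one trading day must remain
        f = (start, start + formation)
        t = (start + formation, min(start + formation + trading, n_days))
        windows.append((f, t))
        start += step
    return windows

# Hypothetical sample of 1000 trading days with a 128-day formation period.
windows = formation_trading_windows(n_days=1000, formation=128)
print(len(windows), "formation/trading blocks")
print("first:", windows[0])
print("last: ", windows[-1])
```

Because each trading period is longer than the 40-day step, consecutive trading periods overlap, so every day after the first formation period falls in at least one trading period.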

4.2

Pairs selection

This section describes the criteria for selecting a suitable pair for the different methods.

4.2.1

Cointegration approach

As mentioned in chapter 2, $\{p_{it}\}$ is assumed to have a unit root and to follow a random walk model: $p_{it} = p_{i,t-1} + r_{it}$. This assumption is tested with the ADF-test and if the null hypothesis of a unit root is not rejected, the time series is selected.

After selecting the time series $\{p_{it}\}$, all the different combinations of pairs are tested for cointegration by the Johansen test procedure. The model specified for testing is:

$$\Delta x_t = \alpha(w_{t-1} - \mu_w) + c_0 + \Phi_1^* \Delta x_{t-1} + \varepsilon_t,$$

where $w_{t-1} = \beta' x_{t-1}$. Pairs that reject the first hypothesis of m = 0 and do not reject the second hypothesis of m = 1 are selected as suitable pairs and have a mean reverting spread $w_t$ with mean $\mu_w$. The spread portfolio is $w_t = p_{1t} - \gamma p_{2t}$.

4.2.2

To select a pair suitable for trading, all the different combinations of spreads are estimated with

the EM algorithm and Kalman filter as discussed in chapter 3. After estimating the parameters

of the model, the parameter B of the state equation is evaluated. If 0 < B < 1,

the spread shows mean reverting behaviour and the pair is selected for trading. The number of

positions taken in the spread is again obtained using the Mean-Variance optimization strategy

discussed below.

4.3

Mean-Variance optimization

This section will describe the mean-variance optimization procedure (MV), used for determining

the number of positions in a pairs trade. The concept of mean-variance optimization was first

introduced by Markowitz (1952). The main purpose of Markowitz's paper was to mathematically

explain why investors diversify their portfolios. Markowitz claims that investors

do not only maximize the expected return of a portfolio, but also consider the variance of

the returns. In this thesis I will use Markowitz's expected returns-variance of returns rule to

optimize the number of positions held in a spread portfolio.

The rationale behind the optimization of the number of positions in a spread portfolio lies in the

mean reverting behaviour of the spread of a pairs trade. No matter how big the deviation from

the mean, the spread is always expected to revert back to its long term equilibrium value. In

earlier literature about pairs trading, a fixed position in the portfolio is opened after the spread

hits a pre-set threshold some distance away (two standard deviations) from the long term mean

(Yakop (2011), Gatev et al. (2006), Vidyamurhty (2004)). After hitting the threshold value, the

position is held until the spread reverts back to the mean. When this happens, the position is

unwound and a profit is made. In the time that has passed between opening and closing the

position, the spread could have been significantly larger than it was when the trader first opened

the position. If this is the case, the trader can generate a much bigger profit by taking on more

positions proportional to the size of the spread.
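The fixed-position rule from the earlier literature can be sketched in a few lines. The spread values, the mean of zero and the standard deviation of one below are made-up numbers for illustration, and the function name is hypothetical.

```python
import numpy as np

def two_sigma_positions(spread, mean, std, k=2.0):
    """Position in the spread: +1 long, -1 short, 0 flat.

    Open when the spread is more than k standard deviations from the mean,
    close when it has reverted back to the mean.
    """
    pos = np.zeros(len(spread))
    current = 0.0
    for t, w in enumerate(spread):
        if current == 0.0:
            if w > mean + k * std:
                current = -1.0                   # spread is high: short the spread
            elif w < mean - k * std:
                current = 1.0                    # spread is low: long the spread
        elif (current > 0 and w >= mean) or (current < 0 and w <= mean):
            current = 0.0                        # spread reverted: unwind
        pos[t] = current
    return pos

# Made-up spread path with mean 0 and standard deviation 1.
spread = np.array([0.0, 1.0, 2.5, 3.0, 1.0, -0.2, -2.5, -1.0, 0.1])
pos = two_sigma_positions(spread, mean=0.0, std=1.0)
pnl = np.sum(pos[:-1] * np.diff(spread))         # position held over the next change
print(pos)
print(pnl)
```

In this path the short position is opened at 2.5 while the spread later reaches 3.0, which is exactly the situation described above: a trader with a fixed position does not benefit from the larger deviation.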

In this thesis, the opportunity to generate a higher profit in a trade is explored by varying the

number of positions. The positions taken in a spread are optimized by using a utility function

based on the aforementioned principle of the expected returns-variance of returns by Markowitz

(1952), namely:

$$U_t(wp_{t+1}) = E_t\left[\frac{wp_{t+1} - wp_t}{wp_t}\right] - \lambda\,Var_t\left[\frac{wp_{t+1} - wp_t}{wp_t}\right],$$

where $wp_t$ denotes the value of the portfolio at time $t$ and $\lambda$ measures the risk aversion of the trader (and is set to one when the strategy is evaluated). In the paper of Markowitz (1952) it is stressed that finding reasonable values for $E_t\left[\frac{wp_{t+1} - wp_t}{wp_t}\right]$ and $Var_t\left[\frac{wp_{t+1} - wp_t}{wp_t}\right]$ by using reliable statistical techniques is essential. Both the stochastic spread and the cointegration approach have these favourable characteristics. Now, let us define $\{return_{t+1}\}$ as the value of a portfolio at time $t + 1$ that invested one dollar in the spread at time $t$. Using this definition for $\{return_{t+1}\}$, the expected return and variance can be evaluated using the following equations:

$$E_t\left[\frac{wp_{t+1} - wp_t}{wp_t}\right] = z_t\,E_t\left[\frac{return_{t+1}}{wp_t}\right],$$

$$Var_t[r_{t+1}] = z_t^2\,Var_t\left[\frac{return_{t+1}}{wp_t}\right],$$

where {zt } represents the number of positions taken in the spread portfolio. The value of

Et [returnt+1 ] is calculated with the use of the parameters estimated in the formation period.

The value of V art [returnt+1 ] is estimated in the formation period and is assumed to be constant

in the trading period.

The number of positions taken in the spread at any point in time can now be calculated by

maximizing the utility function with respect to {zt }. The first order condition is given by:

$$\frac{\partial U_t(z_t)}{\partial z_t} = E\left[\frac{return_{t+1}}{wp_t}\right] - 2\lambda z_t\,Var\left[\frac{return_{t+1}}{wp_t}\right] = 0.$$

Since the second derivative of the utility function is always negative ($\lambda > 0$, $Var[return_{t+1}] > 0$), solving this first order condition for $\{z_t\}$ gives the number of positions to be taken in the spread that maximizes the utility function. This optimal value of $\{z_t\}$ at any point in time is given by:

$$z_t = \frac{E[return_{t+1}]}{2\lambda\,Var[return_{t+1}]}\,wp_t.$$

The return of the strategy is

$$r_{t+1} = \frac{wp_{t+1} - wp_t}{wp_t} = z_t\,\frac{return_{t+1}}{wp_t}.$$

When the optimal value of $z_t$ is used, the return of the strategy is as follows:

$$r_{t+1} = \frac{wp_{t+1} - wp_t}{wp_t} = \frac{E_t[return_{t+1}]}{2\lambda\,Var_t[return_{t+1}]}\,return_{t+1}.$$

It can be seen that the returns of this strategy do not depend on the value of $wp_t$.
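The optimal number of positions and the independence of the strategy return from the portfolio value can be verified with a small numerical example. The expected return, variance, portfolio values and realised return below are made-up numbers, and the function names are hypothetical.

```python
def optimal_positions(exp_ret, var_ret, wealth, lam=1.0):
    """z_t = E[return] * wp_t / (2 * lambda * Var[return])."""
    return exp_ret * wealth / (2.0 * lam * var_ret)

def utility(z, exp_ret, var_ret, wealth, lam=1.0):
    """U = z * E[return / wp] - lambda * z^2 * Var[return / wp]."""
    return z * exp_ret / wealth - lam * z**2 * var_ret / wealth**2

# Made-up inputs: E[return] = 0.02, Var[return] = 0.01, wp_t = 100, lambda = 1.
exp_ret, var_ret, lam = 0.02, 0.01, 1.0
z_star = optimal_positions(exp_ret, var_ret, wealth=100.0, lam=lam)
u_star = utility(z_star, exp_ret, var_ret, wealth=100.0, lam=lam)
print("optimal z:", z_star, "utility:", u_star)

# The strategy return for a realised return of 0.03 is the same for any wp_t.
realised = 0.03
r_100 = optimal_positions(exp_ret, var_ret, 100.0, lam) * realised / 100.0
r_200 = optimal_positions(exp_ret, var_ret, 200.0, lam) * realised / 200.0
print("strategy return with wp = 100:", r_100, "and with wp = 200:", r_200)
```

Moving one position away from the optimum in either direction lowers the utility, which confirms that the first order condition identifies a maximum.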

Chapter 5

Evaluation

This chapter gives an evaluation of the results of the two model based approaches discussed in

chapters 2 and 3. The structure of this chapter is as follows. First, the definition of a Sharpe

ratio is given and a few concerns with the calculation of Sharpe ratios, as explained in the master

thesis of Yakop (2011), are discussed. In the second section, the results for both approaches are

given. The last section gives out of sample results of the different pairs trading strategies.

5.1

Sharpe ratio

A common way to compare the returns of different trading strategies is to calculate the reward-to-variability ratio introduced by Sharpe (1966), nowadays called the Sharpe ratio. The Sharpe ratio gives the expected excess return of an investment relative to its return volatility. In formula,

$$SR = \frac{E[r_t] - r_f}{\sigma}, \tag{5.1}$$

where $E[r_t]$ and $\sigma$ are the expected return and standard deviation of the return series $\{r_t\}$. $r_f$

is the average return earned by the benchmark in the evaluated period. The risk-free rate is

usually assumed to be an adequate benchmark for comparing the returns of the strategy. As

discussed in Yakop (2011), an adequate benchmark should act as an appropriate substitute for

pairs trading. Therefore, Yakop (2011) did not use the risk-free rate, but the composite index

of the stocks, in this case the AEX index. When calculating the Sharpe ratio with equation 5.1,

the rf is therefore set to zero. Afterwards, the calculated Sharpe ratios of the different trading

strategies are compared to the Sharpe ratios of the AEX index.

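The estimate of the Sharpe ratio, $\widehat{SR}$, is found by substituting $\hat{\mu} = \frac{1}{T}\sum_{t=1}^{T} r_t$ for $E[r_t]$ and $\hat{\sigma} = \sqrt{\frac{1}{T}\sum_{t=1}^{T}(r_t - \hat{\mu})^2}$ for $\sigma$, which are the estimated mean and standard deviation of the return series. As discussed in Yakop (2011), since $\widehat{SR}$ is based on $\hat{\mu}$ and $\hat{\sigma}$ (which are estimated with some error), $\widehat{SR}$ is (also) estimated with some error. Denoting the vector $(\hat{\mu}, \hat{\sigma}^2)'$ by $\hat{\theta}$ and the SR formula in equation 5.1 by $g(\theta)$, Lo (2002) shows that the asymptotic distribution of the SR estimator is given by:

$$\sqrt{T}(\widehat{SR} - SR) \overset{a}{\sim} N\left(0, \frac{\partial g}{\partial \theta}\,\Sigma\,\frac{\partial g}{\partial \theta'}\right),$$

where $\Sigma$ is the asymptotic covariance matrix of $\hat{\theta}$. The estimation of $\frac{\partial g}{\partial \theta}$ and $\Sigma$ and the derivation of the asymptotic distribution are not done in this thesis.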
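The estimator can be illustrated with a few made-up daily returns. The standard error shown is the special case that Lo (2002) derives for independent and identically distributed returns, $\sqrt{(1 + \frac{1}{2}SR^2)/T}$; it is included here only as an illustration and is not the HAC-based estimate used for the results. The function names are hypothetical.

```python
import numpy as np

def sharpe_ratio(returns, rf=0.0):
    """Estimated Sharpe ratio with the 1/T normalisation used in the text."""
    r = np.asarray(returns, dtype=float)
    mu_hat = r.mean()
    sigma_hat = np.sqrt(((r - mu_hat) ** 2).mean())
    return (mu_hat - rf) / sigma_hat

def sharpe_se_iid(sr, n_obs):
    """Asymptotic standard error of the Sharpe ratio under i.i.d. returns (Lo, 2002)."""
    return np.sqrt((1.0 + 0.5 * sr**2) / n_obs)

# Made-up daily returns; rf is set to zero as in the text.
returns = [0.01, -0.005, 0.02, 0.0, 0.005]
sr = sharpe_ratio(returns)
se = sharpe_se_iid(sr, len(returns))
print(f"daily SR = {sr:.4f}, i.i.d. standard error = {se:.4f}")
```

With only five observations the standard error is of the same order as the estimate itself, which illustrates why the estimation error of the Sharpe ratio should not be ignored.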

Furthermore, Yakop (2011) discusses two limitations of the use of Sharpe ratios. The first

limitation of the Sharpe ratio is that it implicitly assumes the return series to be normally

distributed or at least approximately so. In practice, pairs trading strategies produce frequent

small positive returns with sometimes large losses, which will inflate the Sharpe ratios

because of the excess skewness and kurtosis (Lo, 2002).

The second limitation of the use of Sharpe ratio is that it ignores any underlying serial

correlation, which is frequently present in financial time series. The consequence of the serial

correlation is, again, that it results in overestimation of the Sharpe ratios (Lo, 2002). To resolve

this issue, a heteroskedasticity and autocorrelation consistent (HAC) estimator of the standard deviation of the return series is used when calculating the Sharpe ratios of the return series. The derivation of

the HAC estimator is not done in this thesis, but can also be found in Yakop (2011) in Appendix

A.
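To give an idea of what a HAC estimator does, the sketch below computes a Newey-West type long-run standard deviation with Bartlett weights. This particular estimator, the lag length and the simulated AR(1) returns are assumptions made for the illustration; the estimator actually used is the one derived in Yakop (2011).

```python
import numpy as np

def newey_west_std(returns, max_lag):
    """Long-run standard deviation with Bartlett (Newey-West) weights."""
    r = np.asarray(returns, dtype=float)
    n_obs = len(r)
    d = r - r.mean()
    lrv = d @ d / n_obs                          # lag-0 autocovariance
    for j in range(1, max_lag + 1):
        gamma_j = d[j:] @ d[:-j] / n_obs         # lag-j autocovariance
        lrv += 2.0 * (1.0 - j / (max_lag + 1)) * gamma_j
    return np.sqrt(lrv)

# Simulated returns with positive serial correlation (AR(1) with coefficient 0.5).
rng = np.random.default_rng(0)
n = 5000
r = np.empty(n)
r[0] = 0.0
for t in range(1, n):
    r[t] = 0.5 * r[t - 1] + 0.01 * rng.standard_normal()

std_plain = newey_west_std(r, max_lag=0)         # equals the ordinary standard deviation
std_hac = newey_west_std(r, max_lag=5)
print(f"plain std: {std_plain:.5f}, HAC std: {std_hac:.5f}")
```

For positively autocorrelated returns the HAC standard deviation is larger than the ordinary one, so a Sharpe ratio based on it is smaller, which counteracts the overestimation described above.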

5.2

Results

In this section the results of pairs trading with the Stochastic Spread approach and the Cointegration approach are given.

5.2.1

As mentioned in the third chapter, the Stochastic Spread model has three major advantages

from a theoretical point of view. The model captures mean-reversion, is continuous in time and

is completely tractable. Despite these promising properties of the model, the empirical

results turn out to be less favourable.

First of all it takes a long time to estimate the parameters of one spread, let alone those of

the 276 different spreads available in the AEX (consisting of 24 stocks). To give an indication

of the time needed to estimate these spreads: a single formation period already takes forty-two

minutes. There are seven formation periods in this dataset. So the estimation of all the different

pairs in the dataset would take roughly five hours.

This first disadvantage stated above, is inconvenient but can be overcome by the use of

faster computers (or patience). However, another disadvantage is more problematic. After the

estimation of all the different spreads, the number of pairs found suitable for pairs trading was

minimal. For example, the first formation period resulted in five suitable pairs. This is not

much, given the fact that there are 276 different pairs available.

Also, the parameters estimated from the pairs selected by this method suggest that the

model can be simplified to a simple AR(1) model for the spread. Specifically, the parameter D

in the state-space equations is estimated to be at most 0.001. This suggests that the state-space model

can be reduced to the state equation, which is just a simple AR(1) model for the spread.

This AR(1) model has already been tested extensively in the context of pairs trading in Yakop

(2011) and will therefore not be further analysed in this thesis.

So, despite the favourable theoretical properties, the use of the stochastic spread model for

pairs trading, which was suggested by Elliot et al. (2005), does not turn out to be a good

approach for pairs trading in practice.

Parameters of selected Pairs

Values

276

A

5

0.0062

B

C

0.9845

0.0007

0.2660

Figure 5.1: Example of a Pair selected with the Stochastic Spread Approach

5.2.2

Cointegration Approach

Contrary to the stochastic spread approach, the results of the cointegration approach are useful

for evaluating a pairs trading strategy. To begin the evaluation of the cointegration approach, an

overview of the specifics of the dataset and parameters used in the analyses are stated in table

5.2. As can be learned from table 5.2, results for three different lengths of formation periods

(respectively 128, 256 and 512 days) and the adjacent trading periods are estimated. For these

different lengths, all the possible combinations of pairs (in this case 276 pairs) are tested with

the Johansen cointegration trace test described in 2.3 (with a significance level of 0.05). The

average number of pairs found by this test for the different formation periods is also stated in

table 5.2.

Table 5.2

Parameter   Description                Values
            Number of stocks           23
RW          Rolling window             40
FP          Formation period           128 days   256 days   512 days
TP          Trading period             64 days    128 days   256 days
NT                                     28         23         13
NP          Average number of pairs    19         28         35
                                       1316
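One way to read the NT row of table 5.2 is sketched below. It rests on two assumptions that the table itself does not confirm: that the unlabelled value 1316 is the number of daily observations, and that NT counts how often a formation period plus its trading period can be moved forward by the rolling window of 40 days. The helper name `count_windows` is hypothetical. Under these assumptions a simple integer division reproduces the three NT values.

```python
def count_windows(n_days, formation, trading, step):
    """Whole steps of `step` days that fit after one formation + trading block."""
    return (n_days - formation - trading) // step


N_DAYS = 1316  # assumed: the unlabelled value 1316 read as the sample length in days
STEP = 40      # rolling window (RW)

for formation, trading in [(128, 64), (256, 128), (512, 256)]:
    print(formation, trading, count_windows(N_DAYS, formation, trading, STEP))
# prints:
# 128 64 28
# 256 128 23
# 512 256 13
```

The match with 28, 23 and 13 supports this reading, but it remains an inference from the numbers and not a statement of the procedure actually used.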

The graphs of figure 5.2 show the behaviour of two different pairs during the formation and trading period. As the graphs show, each pair goes through periods of divergence and convergence during the formation period. This mean-reverting behaviour is the key to a profitable pairs trading strategy and is present in all pairs selected in the formation period. Unfortunately, some of the pairs selected during the formation period do not display the same behaviour during the trading period (see graph d). As a result, losses are made on these pairs. For the pairs trading strategy to be a success, the pairs that do show mean reversion should make up for the probable losses incurred on these bad pairs.

Figure 5.2: Spread against days for two different pairs during the formation and trading periods (graphs a to d)

Now I will present the main results of the cointegration approach. Table 5.3 contains the calculated Sharpe ratios of the cointegration approach using the mean-variance optimization trading strategy. The Sharpe ratios are calculated on the basis of daily returns and therefore look small. Daily SRs are commonly converted to annual SRs by multiplying them by $\sqrt{250}$. This is known as time aggregation within finance. Lo (2002), however, shows that statistically speaking this rule is incorrect because of the serial correlation underlying financial returns, which can result in extreme overestimation of the SRs. Therefore, only the estimated daily SRs are included in this thesis.
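The effect described by Lo (2002) can be made concrete with a short sketch. It assumes the scaling factor for stationary returns given in that paper, eta(q) = q / sqrt(q + 2 * sum over k = 1..q-1 of (q - k) * rho_k), where rho_k is the lag-k autocorrelation of the returns. The function names and the example values (a daily SR of 0.02 and a lag-1 autocorrelation of 0.1) are illustrative assumptions, not results from this thesis.

```python
import math


def sharpe_ratio(returns, risk_free=0.0):
    """Sample Sharpe ratio of a list of per-period returns."""
    n = len(returns)
    excess = [r - risk_free for r in returns]
    mean = sum(excess) / n
    var = sum((e - mean) ** 2 for e in excess) / (n - 1)
    return mean / math.sqrt(var)


def autocorrelation(returns, lag):
    """Sample autocorrelation of the returns at the given lag."""
    n = len(returns)
    mean = sum(returns) / n
    denom = sum((r - mean) ** 2 for r in returns)
    num = sum((returns[t] - mean) * (returns[t - lag] - mean) for t in range(lag, n))
    return num / denom


def lo_scaling(rho, q):
    """Factor eta(q) that converts a one-period SR into a q-period SR.

    `rho` maps lag k -> autocorrelation; lags that are missing count as zero.
    """
    weighted = sum((q - k) * rho.get(k, 0.0) for k in range(1, q))
    return q / math.sqrt(q + 2.0 * weighted)


daily_sr = 0.02  # illustrative daily Sharpe ratio
q = 250          # trading days per year
print(daily_sr * math.sqrt(q))              # naive square-root rule
print(daily_sr * lo_scaling({1: 0.1}, q))   # with an assumed lag-1 autocorrelation
```

Without serial correlation the factor equals the square root of q, so the usual rule is recovered. With positive autocorrelation the factor is smaller, which is why the naive rule overstates the annual SR in that case.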

Furthermore, it has to be noted that the calculation of the daily returns did not incorporate transaction costs. Including transaction costs in the investigation would require some creativity, since the difference between the bid and ask price of a stock is not reported (only the daily closing prices are). The fee for making a transaction is also not commonly known. The inclusion of transaction costs within pairs trading therefore justifies a study of its own and is not dealt with further in this thesis.

As can be seen from the average Sharpe ratios of this strategy, the mean-variance optimization suffers large losses for all the different formation period lengths. This is a remarkable result, since this strategy is supposed to maximize the value of the portfolio. However, one critical assumption of this strategy is that the selected pairs are mean reverting. If this assumption is not met and a pair drifts away, the number of positions taken in the spread will increase dramatically and huge losses will be taken. The results show that there are too many pairs that show this behaviour. Therefore the average Sharpe ratios for the different formation periods are negative.
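The mechanism behind these losses can be sketched with a toy example. The sizing rule below is an assumption made for illustration only: the position is taken proportional to the distance of the spread from its mean divided by the variance, which mimics the trade-off of the mean-variance strategy but is not the exact optimization of section 4.3. The function names and the numbers are hypothetical.

```python
def proportional_positions(spread, mean, variance):
    """Position in the spread that grows with its distance from the mean (assumed rule)."""
    return [(mean - s) / variance for s in spread]


def cumulative_pnl(spread, positions):
    """Running P&L of holding positions[t] from day t to day t+1."""
    pnl, total = [], 0.0
    for t in range(len(spread) - 1):
        total += positions[t] * (spread[t + 1] - spread[t])
        pnl.append(total)
    return pnl


# A spread that drifts away from its formation-period mean instead of reverting.
drifting = [0.1 * t for t in range(11)]  # 0.0, 0.1, ..., 1.0
positions = proportional_positions(drifting, mean=0.0, variance=0.25)
pnl = cumulative_pnl(drifting, positions)

print(round(positions[-1], 6))  # the short position keeps growing with the drift
print(round(pnl[-1], 6))        # the loss accumulates faster than the drift itself
```

Because the position is enlarged every time the spread moves further away, the loss grows roughly with the square of the drift, whereas a single fixed position would lose in proportion to the drift.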

Table 5.3

Benchmark SR(AEX)   FP    Average   Max      Min       Std. Dev.
0.0073              128   -0.0447   0.1461   -0.1523   0.0583
0.0046              256   -0.0444   0.0585   -0.1079   0.0436
0.0439              512   -0.0376   0.0041   -0.0665   0.0227

Count columns (pos. SR; SR > SR(AEX); significant at 5%): the only surviving entry is 13, in the row for FP 128.

On the other hand, the histogram in figure 5.3 (which shows the distribution of the SR over the different trading periods) shows that if there are enough pairs that do mean revert in one formation period, the SR of that period can be high (an SR of 0.15). Unfortunately, this does not happen often enough and the overall results of this strategy are disappointing.

Figure 5.3: Histogram of the estimated SRs of the MV strategy for a formation period length of 128 days

To compare the mean-variance strategy with a less risky strategy, I also calculated the Sharpe ratios using the common two standard deviation (2STD) strategy for opening a position. This strategy is not as risky as the mean-variance optimization, because it opens only one position at a time. The results of this strategy are stated in table 5.4. The 2STD strategy returns positive average Sharpe ratios for all three formation periods, with the formation period of 128 days having the highest average. In contrast to the mean-variance strategy, the pairs that do not converge and drift away from the equilibrium only incur a loss of two times the standard deviation. These losses are clearly outweighed by the gains on the pairs that do behave as expected, which results in positive average Sharpe ratios for all the different trading periods.
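The 2STD rule can be written down in a few lines. This is a minimal sketch, assuming that the mean and standard deviation of the spread are fixed at their formation-period values and that a position is closed as soon as the spread crosses back through the mean. The function name, the threshold parameter `k` and the example series are illustrative assumptions.

```python
def two_std_positions(spread, mean, std, k=2.0):
    """Return the position (-1, 0 or +1) held at the close of each day.

    Open short when spread > mean + k*std, long when spread < mean - k*std,
    and close when the spread crosses back through the mean.
    At most one position is open at any time.
    """
    position, held = 0, []
    for s in spread:
        if position == 0:
            if s > mean + k * std:
                position = -1   # spread is too high: sell the spread
            elif s < mean - k * std:
                position = 1    # spread is too low: buy the spread
        elif (position == -1 and s <= mean) or (position == 1 and s >= mean):
            position = 0        # spread is back at equilibrium: close
        held.append(position)
    return held


spread = [0.0, 0.5, 2.5, 1.5, 0.5, -0.1, -2.2, -1.0, 0.2, 0.0]
print(two_std_positions(spread, mean=0.0, std=1.0))
# prints [0, 0, -1, -1, -1, 0, 1, 1, 0, 0]
```

Unlike a rule that scales the position with the distance from the mean, the exposure here never exceeds one unit of the spread, which is what limits the damage from a pair that fails to revert.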

Table 5.4

Benchmark SR(AEX)   FP    Average   Max      Min       Std. Dev.   Count pos. SR   Count SR > SR(AEX)   Significant at 5%
0.0073              128   0.0209    0.0667   -0.0227   0.0233      25              15                   11
0.0046              256   0.0147    0.0469   -0.0167   0.0221      19              12
0.0439              512   0.0081    0.0204   -0.0045   0.0092      10

5.3

To see if the results of the cointegration approach are robust, a second estimation of the cointegration approach is done for both trading strategies. The second dataset consists of the daily closing prices over the last five years of the stocks in the DAX index (which includes the thirty biggest listed German companies). The results of both trading strategies are given in table 5.5 below.

As can be seen in table 5.5, the MV strategy performs even worse on this dataset than it did on the AEX dataset. The average daily SRs of the MV strategy for the different periods are all negative and only in one TP does the MV strategy significantly outperform the DAX index (FP: 128 days). The 2STD strategy (again) performs better than the MV strategy and generates small positive average SRs in all the trading periods. The results of both pairs trading strategies are much alike across the two datasets. Therefore, it can be concluded that the results obtained are robust.

Table 5.5

Strategy   Benchmark SR(DAX)   FP    Average   Max       Min       Std. Dev.   Count pos. SR   Count SR > SR(AEX)   Significant at 5%
MV         0.0551              128   -0.0843   -0.0119   -0.1829   0.0486
MV         0.0596              256   -0.0636   0.0165    -0.1056   0.0426
MV         0.0316              512   -0.0429   -0.0225   -0.0536   0.0105
2STD       0.0551              128   0.0210    0.0689    -0.0181   0.0215      17
2STD       0.0596              256   0.0135    0.0429    -0.0069   0.0159      14
2STD       0.0316              512   0.0069    0.0148    -0.0045   0.0030

Chapter 6

Conclusion

In this thesis two different model-based approaches for pairs trading were discussed and tested with the use of two different trading strategies. Results were generated for the daily closing prices of the stocks in the AEX index over the last five years. Furthermore, an out-of-sample estimation was done to verify whether the results were robust.

The first approach for modelling the behaviour of a pair, the stochastic spread, was first suggested (but not yet tested) by Elliot et al. (2005). From a theoretical point of view, the stochastic spread has three major advantages: the model captures mean reversion, is continuous in time and is completely tractable. Despite these theoretical advantages, the empirical results turn out to be less favourable. First of all, the stochastic spread approach found very few pairs suitable for trading. Secondly, the estimated parameters of the state-space form of the model suggested that the model could be simplified to only the state equation (which is just an AR(1) model). This renders the estimation of the parameters with the EM algorithm and Kalman filter unnecessary, since the AR(1) model is embedded in the other approach discussed in this thesis. Therefore, only a few estimates and graphs of the spread are presented for this approach, and no actual pairs trading results.

The second approach for modelling the behaviour of a pair is the cointegration approach. The idea of cointegration was already used for pairs trading in earlier work (Yakop, 2011; Vidyamurhty, 2004). The approach in this earlier work, however, is more ad hoc and not based on the error correction model (ECM), which is normally used in econometric research. In this thesis the cointegration approach is based on the ECM and the pairs are tested with the Johansen cointegration test.

Subsequently, two trading strategies for taking a position in the spread were used to calculate the results. The first one is the two standard deviations strategy (2STD). This strategy is commonly used in the literature (Yakop, 2011; Vidyamurhty, 2004; Gatev et al., 2006). The concept of this strategy is very simple: one takes a position in the spread if it is far enough (two standard deviations) away from the mean and closes the position when the spread returns to the equilibrium value. The second strategy is called the mean-variance approach (MV). As the name suggests, the number of positions taken in the spread is determined by a trade-off between the distance of the spread from its mean and the variance of the spread. The spread is expected to revert to the mean and the MV strategy uses this assumption to maximize the portfolio value by varying the number of positions taken in the spread.

The results of both strategies are in tables 5.3 and 5.4. The 2STD strategy generated small positive average SRs over all the different formation periods. This result is typical for a pairs trading strategy and is thus what one would expect. In contrast, the MV strategy generates large negative SRs in all the formation periods. This is not what one would expect, because this strategy aims to maximize the portfolio value by varying the number of positions in the spread and should, consequently, perform well. However, one crucial assumption for the success of this strategy, namely mean reversion, is not met by a large number of pairs. The number of positions increases drastically in these pairs and the losses are substantial. This leads me to the conclusion that the MV strategy might be too risky (in this case, at least) for pairs trading. The estimation on the second dataset (DAX index) confirms this, because similar results were generated. Given that two indices produced similar results, one can conclude that these results are robust.

Further research in pairs trading should focus on other ways to optimize the trading strategy, since the MV procedure did not generate the desired results. Furthermore, the inclusion of transaction costs within pairs trading is a relevant topic that should be taken into account, but has not yet been investigated. One could also investigate the concept of pairs trading for more than two securities, such as triple or quadruple trading. The cointegration approach discussed in this thesis could be a good way of investigating this topic, since the existence of a cointegration relation between three or four stocks can easily be tested within this framework.

Bibliography

Baronyan, S. R., Boduroglu, I. I., and Sener, E. (2010). Investigation of Stochastic Pairs Trading Strategies under different Volatility Regimes. The Manchester School, pages 114-134.

Broda, S. (2011). Financial econometrics slides.

Do, B., Faff, R., and Hamza, K. (2006). A New Approach to Modeling and Estimation for Pairs Trading. Working Paper, pages 1-30.

Elliot, M. J., van der Hoek, J., and Malcolm, W. (2005). Pairs Trading. Quantitative Finance, 5(3):271-276.

Engle, R. F. and Granger, C. W. (1987). Co-integration and Error Correction: Representation, Estimation and Testing. Econometrica, 55(2):251-276.

Gatev, E., Goetzmann, W. N., and Rouwenhorst, K. G. (2006). Pairs Trading: Performance of a Relative-Value Arbitrage Rule. Review of Financial Studies, 19(3):797-827.

Hamilton, J. D. (1994). Time Series Analysis. Princeton University Press.

Johansen, S. (1991). Estimation and Hypothesis Testing of Cointegration Vectors in Gaussian Vector Autoregressive Models. Econometrica, 59(6):1551-1580.

Kalman, R. (1960). A New Approach to Linear Filtering and Prediction Problems. Journal of Basic Engineering, 82:35-45.

Lo, A. (2002). The Statistics of Sharpe Ratios. Financial Analysts Journal, July/August:36-52.

Markowitz, H. (1952). Portfolio Selection. Journal of Finance, 7(1):77-91.

Sharpe, W. (1966). Mutual Fund Performance. The Journal of Business, 39(1):119-138.

Shumway, R. and Stoffer, D. (1982). An Approach to Time Series Smoothing and Forecasting using the EM Algorithm. Journal of Time Series Analysis, 3:253-264.

Tsay, R. S. (2010). Analysis of Financial Time Series. John Wiley and Sons, Inc., third edition.

Vidyamurhty, G. (2004). Pairs Trading, Quantitative Methods and Analysis. John Wiley and Sons, Inc.

Yakop, M. (2011). A Comparative Analysis of Pairs Trading. Master's thesis, University of Amsterdam.
