
A Novel Approach for Predicting Trading Signals
of a Stock Market Index

Chandima D. Tilakaratne
Department of Statistics,
University of Colombo, Sri Lanka
Musa A. Mammadov, Sydney A. Morris

Graduate School of Information Technology & Mathematical Sciences,
University of Ballarat, Australia
1 Introduction
The profitability of investing and trading in the stock market is directly proportional to its predictability.
Therefore, predicting the direction of stock market indices is one of the most important issues in finance.
In the last few decades, there has been a growing number of studies attempting to predict the direction or
the trend movements of financial market indices. Some studies (e.g. [Wu & Zhang 1997]) have suggested
that trading strategies guided by forecasts on the direction of price change may be more effective and may
lead to higher profits. Leung et al. (2000) also found that the classification models based on the direction
of stock returns outperform those based on the level of stock returns in terms of both predictability and
profitability.
The majority of past studies have focused on the classification of future values into two categories (up
or down), which are considered to be buy and sell signals. The time series data used for these studies are
approximately equally distributed between these two categories.
In practice, stock traders do not participate in trading (i.e., they neither buy nor sell shares) if there is no
substantial change in the price level; instead, they hold the money/shares in hand. In such a case, an additional class to represent the hold signal needs to be considered. This study introduces a novel approach
for predicting whether it is best to buy, hold, or sell shares (trading signals) of a stock market index.
Three algorithms, Feedforward Neural Networks (FNN), Probabilistic Neural Networks (PNN),
and Support Vector Machines (SVM), are reported by past studies [Qi & Maddala 1999; Yao et al. 1999;
Kohara et al. 1996; Kim & Chun 1998; Leung et al. 2000; Cao & Tay 2001; Huang et al. 2005] as the
most successful algorithms for predicting trading signals. Almost all of these studies consider only two
classes: the upward and the downward trends of the stock market movement, which are considered as buy
and sell signals. Furthermore, the time series data used for these studies are approximately equally distributed between these two classes. However, the introduction of the hold signal
leads to an imbalanced distribution of data, as the largest share of the cases (nearly 50%) falls into this class.
Chapter 10 - A Novel Approach for Predicting Trading Signals of a Stock Market Index
The most commonly used classification techniques are not successful in predicting trading signals
when the distribution of the actual trading signals among these three classes is imbalanced [Akbani et al.
2004; Yao et al. 1999]. FNN can be identified as a suitable alternative technique for classification when
the data to be studied have an imbalanced distribution [Tilakaratne et al. 2007b]. However, a standard
FNN itself has some disadvantages: (1) it uses local optimization methods, which do not guarantee a
deeper local optimal solution; (2) because of (1), the FNN needs to be trained many times with different
initial weights and biases (multiple training runs result in more than one solution, and having many solutions
for the network parameters prevents getting a clear picture of the influence of the input variables); and (3) the use of
ordinary least squares (OLS) as the error function to be minimized may not be suitable for classification. Therefore, the novel approach described in this chapter employs a modified neural network algorithm, which can overcome the aforesaid problems, to predict the trading signals (buy, hold, and
sell) of a given stock market index.
The suggestions and the findings in past studies [Murphy 2004; Poddig & Rehkugler 1995] give
a strong indication of the importance of taking the behavior of foreign stock market indices into account
when studying the behavior of a selected stock market index. Moreover, recent studies [Olson & Mossaman 2001; Tilakaratne et al. 2005] have shown that the intermarket influences improve forecast accuracy.
Olson and Mossaman (2001) further showed that during periods when macroeconomic variables are
changing, correlations among interrelated markets pick up the changing market conditions faster than the
lagged macroeconomic variables. Following the findings made by these past studies, the novel prediction
approach introduced in this chapter uses the lagged data of foreign stock market indices to predict the
trading signals of a selected stock market index.
For evaluating this new prediction approach, the Australian All Ordinaries Index (AORD) was selected
as the stock market index whose trading signals would be predicted (i.e., the target market index). For instance, the following criterion can be applied to define the three trading signals, buy, hold, and sell:
Criterion 1
$$
\text{signal} =
\begin{cases}
\text{buy} & \text{if } Y(t-1) \ge l_u, \\
\text{hold} & \text{if } l_l < Y(t-1) < l_u, \\
\text{sell} & \text{if } Y(t-1) \le l_l,
\end{cases}
$$
where $Y(t-1)$ is the relative return of the Close price of day (t-1) of the stock market index of interest,
while $l_l$ and $l_u$ are the lower and upper thresholds. The values of $l_l$ and $l_u$ depend on the trader's choice. There is no standard
criterion in the literature on how to decide these values, and they may vary from one stock
index to another. A trader may decide the values for these thresholds according to his/her knowledge and
experience.
By examining the data distribution (during the study period, the minimum, maximum, and average
of the relative returns of the Close price of the AORD are -0.0687, 0.0573, and 0.0003, respectively), we
chose $l_u = -l_l = 0.005$ for this study, assuming that a 0.5% increase (or decrease) in the Close price of day
(t-1) compared with that of day t is large enough to consider the corresponding movement as a buy
(or sell) signal. It is unlikely that a change in the values of $l_l$ and $l_u$ would make a qualitative change in the
prediction results obtained. According to Criterion 1 with $l_u = -l_l = 0.005$, one cannot expect a balanced
distribution of data among the three classes (trading signals) because more data fall into the hold class,
while fewer data fall into the other two classes.
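As a minimal sketch of how Criterion 1 maps a relative return to a trading signal, the Python function below applies a lower threshold of -0.005 and an upper threshold of 0.005. The function name `trading_signal` and the sample returns are illustrative assumptions, not part of the original study.

```python
def trading_signal(y, lower=-0.005, upper=0.005):
    """Map a relative return y to 'buy', 'hold' or 'sell' (Criterion 1)."""
    if y >= upper:
        return "buy"
    if y <= lower:
        return "sell"
    return "hold"


# Hypothetical relative returns, chosen only to exercise each branch.
returns = [0.0062, 0.0049, -0.0045, -0.0050, 0.0003]
signals = [trading_signal(y) for y in returns]
print(signals)  # ['buy', 'hold', 'hold', 'sell', 'hold']
```

With symmetric thresholds this narrow, most small daily movements end up in the hold class, which is the source of the class imbalance discussed above.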
C. D. Tilakaratne et al.
2 Modified Neural Network Algorithms
As mentioned in the Introduction (see Section 1), the standard FNN has limitations as a prediction tool.
Its error minimization function is also not suitable for a classification problem. Therefore, we realized that
the standard FNN algorithm needs to be modified in a way that it can be applied to solve the prediction
problem of interest. When modifying the neural network algorithm, two matters were considered: (1) using a
global optimization algorithm for network training, and (2) modifying the OLS error function. By using a
global optimization algorithm for network training, we expect to find deeper solutions to the error function and hence overcome the disadvantages (1) and (2) of the standard FNN (see Section 1). Moreover,
we attempted to modify the OLS error function in a way suitable for the classification problem of interest.
The standard FNN algorithm is used as the basis of the modified algorithms we proposed.
2.1 Standard FNN
A standard FNN is a fully connected network with every node in the lower layer linked to every node in
the next higher layer. These linkages carry weights, $w = (w_1, \ldots, w_M)$, where M is the
number of all possible linkages. Given a set of weights w, the network produces an output for each input
vector. The output corresponding to the ith input vector is denoted by $o_i = o_i(w)$. FNNs adopt
backpropagation learning, which finds an optimal set of weights w by minimizing the error (deviation) between
the network outputs and the given targets (Yao & Tan 2001). The most commonly used error function is the
OLS function:
$$
E_{OLS} = \frac{1}{N} \sum_{i=1}^{N} (a_i - o_i)^2, \tag{1}
$$
where N is the total number of observations in the training set, while $a_i$ and $o_i$ are the target and output
corresponding to the ith observation in the training set, respectively.
2.2 Alternative Error Functions Proposed in the Literature
As described in the Introduction (see Section 1), in financial applications, it is more important to predict
the direction of a time series than its value. Therefore, the minimization of the absolute errors between the target and the output may not produce the desired accuracy of predictions [Yao & Tan 2000;
2001]. With this idea in mind, some past studies aimed to modify the error function associated with
FNNs (e.g., [Yao & Tan 2000; 2001; Caldwell 1995; Refenes et al. 1995]). These studies incorporated
factors representing the direction of the prediction (e.g., [Yao & Tan 2000; 2001; Caldwell 1995]) and
the contribution from the historical data used as inputs (e.g., [Yao & Tan 2000; 2001; Refenes et al.
1995]). The functions proposed by Yao and Tan (2000; 2001) and Caldwell (1995)
penalize incorrectly predicted directions more heavily than correct predictions. In other words,
a higher penalty is applied if the predicted value, $o_i$, is negative when the target, $a_i$, is positive or vice versa.
Caldwell (1995) proposed the Weighted Directional Symmetry (WDS) function given below:
$$
f_{WDS} = \frac{100}{N} \sum_{i=1}^{N} w_{ds}(i)\, |a_i - o_i|, \tag{2}
$$
where
$$
w_{ds}(i) =
\begin{cases}
1.5 & \text{if } (a_i - a_{i-1})(o_i - o_{i-1}) \le 0, \\
0.5 & \text{otherwise},
\end{cases} \tag{3}
$$
and N is the total number of observations.

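Yao and Tan (2000; 2001) argued that the weight associated with $f_{WDS}$ (i.e., $w_{ds}(i)$) should be heavily
adjusted if a wrong direction is predicted for a larger change, while it should be slightly adjusted if a
wrong direction is predicted for a smaller change, and so on. Based on this argument, they proposed the
Directional Profit adjustment factor
$$
f_{DP}(i) =
\begin{cases}
c_1 & \text{if } \Delta a_i \, \Delta o_i > 0 \text{ and } |\Delta a_i| \le \sigma, \\
c_2 & \text{if } \Delta a_i \, \Delta o_i > 0 \text{ and } |\Delta a_i| > \sigma, \\
c_3 & \text{if } \Delta a_i \, \Delta o_i < 0 \text{ and } |\Delta a_i| \le \sigma, \\
c_4 & \text{if } \Delta a_i \, \Delta o_i < 0 \text{ and } |\Delta a_i| > \sigma,
\end{cases} \tag{4}
$$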
where $\Delta a_i = a_i - a_{i-1}$, $\Delta o_i = o_i - o_{i-1}$, and $\sigma$ is the standard deviation of the training data (including the validation set). For the experiments, they used $c_1 = 0.5$, $c_2 = 0.8$, $c_3 = 1.2$, and $c_4 = 1.5$ [Yao & Tan 2000; 2001].
With these weights, they tried to impose a higher penalty on predictions with the wrong
direction and on errors whose magnitude is larger than that of the other predictions.
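The adjustment factor (4) can be sketched in a few lines of Python. This is an illustration only: it assumes that $c_1$ and $c_2$ apply to correctly predicted directions (small and large changes) and $c_3$ and $c_4$ to wrongly predicted ones. The function name and the handling of a zero product, which (4) does not cover, are assumptions.

```python
def directional_profit_factor(delta_a, delta_o, sigma, c=(0.5, 0.8, 1.2, 1.5)):
    """Return f_DP(i) for one observation.

    delta_a: change in the target, a_i - a_{i-1}
    delta_o: change in the output, o_i - o_{i-1}
    sigma:   standard deviation of the training data
    c:       the weights (c1, c2, c3, c4) used by Yao and Tan
    """
    large_change = abs(delta_a) > sigma
    if delta_a * delta_o > 0:  # direction predicted correctly
        return c[1] if large_change else c[0]
    # Assumption: a product of exactly zero is treated like a wrong direction.
    return c[3] if large_change else c[2]


sigma = 0.01  # hypothetical standard deviation
print(directional_profit_factor(0.004, 0.002, sigma))   # 0.5
print(directional_profit_factor(0.020, 0.010, sigma))   # 0.8
print(directional_profit_factor(0.004, -0.002, sigma))  # 1.2
print(directional_profit_factor(-0.020, 0.010, sigma))  # 1.5
```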
Based on this Directional Profit adjustment factor (4), Yao and Tan (2000; 2001) proposed the Directional Profit (DP) model:
$$
E_{DP} = \frac{1}{N} \sum_{i=1}^{N} f_{DP}(i)\,(a_i - o_i)^2. \tag{5}
$$
Refenes et al. (1995) proposed the Discounted Least Squares (DLS) function by taking the contribution from the historical data into account:
$$
E_{DLS} = \frac{1}{N} \sum_{i=1}^{N} w_b(i)\,(a_i - o_i)^2, \tag{6}
$$
where $w_b(i)$ is an adjustment related to the contribution of the ith observation and is described by the following equation:
$$
w_b(i) = \frac{1}{1 + \exp\left(b - \dfrac{2bi}{N}\right)}, \tag{7}
$$
where the discount rate b determines the contribution from the historical data. Refenes et al. (1995) suggested
b = 6.
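To see how the discount rate shapes the weights, the sketch below evaluates (7) for a short series, assuming the form $w_b(i) = 1 / (1 + \exp(b - 2bi/N))$. The function name and the series length of 10 are illustrative assumptions.

```python
import math


def discount_weight(i, n, b=6.0):
    """Weight w_b(i) of the i-th of n observations (i = 1 is the oldest)."""
    return 1.0 / (1.0 + math.exp(b - 2.0 * b * i / n))


n = 10  # hypothetical number of training observations
weights = [discount_weight(i, n) for i in range(1, n + 1)]
for i, w in enumerate(weights, start=1):
    print(i, round(w, 4))
```

The weights rise monotonically from near 0 for the oldest observations to near 1 for the most recent ones, and the observation in the middle of the series (i = N/2) receives a weight of exactly 0.5.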
Yao and Tan (2000; 2001) proposed another error function, the Time Dependent Directional Profit
(TDP) model, by incorporating the approach suggested by Refenes et al. (1995) into their Directional Profit
model (5):
$$
E_{TDP} = \frac{1}{N} \sum_{i=1}^{N} f_{TDP}(i)\,(a_i - o_i)^2, \tag{8}
$$
where $f_{TDP}(i) = f_{DP}(i)\, w_b(i)$, and $f_{DP}(i)$ and $w_b(i)$ are described by (4) and (7), respectively.
Note: Refenes et al. (1995) and Yao and Tan (2000; 2001) used 1/(2N) instead of 1/N in the formulas
given by (5), (6), and (8).
2.3 Modified Error Functions
Our aim is to classify trading signals into three classes: buy, hold, and sell. The hold class includes both
positive and negative values (see Criterion 1 in Section 1). Therefore, the least squares functions
in which the cases with incorrectly predicted directions (positive or negative) are penalized (e.g., the error
functions given by (5) and (8)) will not give the desired prediction accuracy. For example, suppose that
$a_i = 0.0045$ and $o_i = -0.0049$. In this case the predicted signal is correct according to Criterion 1, because both values fall into the hold class.
However, the algorithms used in the studies by Yao and Tan (2000; 2001) try to minimize the error
function because $a_i o_i < 0$ (refer to (8)). Such minimization is unnecessary because the predicted
signal is correct. Therefore, instead of the weighting schemes suggested by previous studies, we propose
a different weighting scheme.
Unlike the weighting schemes suggested in Yao and Tan's studies (2000; 2001), which impose a
higher penalty on predictions with an incorrect sign (i.e., negative or positive), this novel scheme is
based on the correctness of the classification of trading signals. If the predicted trading signal is correct,
we assign a very small (close to zero) weight; otherwise, we assign a weight equal to 1. Therefore, the
proposed weighting scheme is as follows:
$$
w_d(i) =
\begin{cases}
\delta & \text{if the predicted trading signal is correct}, \\
1 & \text{otherwise},
\end{cases} \tag{9}
$$
where $\delta$ is a very small value. The value of $\delta$ needs to be decided according to the distribution of the data.
2.3.1 Proposed Error Functions
As shown above (Section 2.3), the error functions in which the cases with incorrectly predicted directions
(positive or negative) are penalized are not suitable for the prediction problem of interest. Instead of penalizing predictions with an incorrect direction (or sign), we penalize predictions that fall into a wrong
class (trading signal). We modified the $E_{DP}$ error function (see (5)) by replacing $f_{DP}(i)$ with the new
weighting scheme, $w_d(i)$ (see (9)). The new error function is denoted by $E_{CC}$:
$$
E_{CC} = \frac{1}{N} \sum_{i=1}^{N} w_d(i)\,(a_i - o_i)^2. \tag{10}
$$
The contribution from the historical data also plays an important role in the prediction accuracy of financial time series. Taking this into account, we introduced another error function, $E_{TCC}$, by replacing $f_{DP}(i)$ in the TDP error function (8) with the new weighting scheme, $w_d(i)$ (9):
$$
E_{TCC} = \frac{1}{N} \sum_{i=1}^{N} w_b(i)\, w_d(i)\,(a_i - o_i)^2, \tag{11}
$$
where $w_b(i)$ and $w_d(i)$ are defined by (7) and (9), respectively.
When training neural networks based on the error minimization function (10) or (11), the error is
forced to take a smaller value if the predicted trading signal is correct. On the other hand, the actual size
of the error is considered in the cases of misclassifications.
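The two proposed error functions can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes thresholds of -0.005 and 0.005 for Criterion 1, a small weight of 0.01 for correctly classified cases, the form $1/(1 + \exp(b - 2bi/N))$ for the discount weight in (7), and a 1/N normalisation in both error functions. All function names and sample values are hypothetical.

```python
import math


def trading_signal(y, lower=-0.005, upper=0.005):
    """Classify a relative return as 'buy', 'hold' or 'sell'."""
    if y >= upper:
        return "buy"
    if y <= lower:
        return "sell"
    return "hold"


def w_d(target, output, delta=0.01):
    """Weight (9): delta if the predicted class is correct, else 1."""
    return delta if trading_signal(target) == trading_signal(output) else 1.0


def w_b(i, n, b=5.0):
    """Discount weight (7) for the i-th of n observations."""
    return 1.0 / (1.0 + math.exp(b - 2.0 * b * i / n))


def e_cc(targets, outputs, delta=0.01):
    """Error function (10): squared errors weighted by class correctness."""
    n = len(targets)
    return sum(w_d(a, o, delta) * (a - o) ** 2
               for a, o in zip(targets, outputs)) / n


def e_tcc(targets, outputs, b=5.0, delta=0.01):
    """Error function (11): additionally discounts older observations."""
    n = len(targets)
    return sum(w_b(i, n, b) * w_d(a, o, delta) * (a - o) ** 2
               for i, (a, o) in enumerate(zip(targets, outputs), start=1)) / n


# Hypothetical targets and network outputs (relative returns).
targets = [0.0045, 0.0080, -0.0070]
outputs = [-0.0049, 0.0020, -0.0060]
print(e_cc(targets, outputs), e_tcc(targets, outputs))
```

In this example the first and third predictions fall into the correct class (hold and sell), so their squared errors are scaled by 0.01. Only the second prediction, where a buy is predicted as a hold, contributes its full squared error.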
2.4 Modified Neural Network Algorithms
We modified the FNN algorithm by (a) using the modified least squares error functions given in (6), (10),
and (11), and by (b) employing a global optimization algorithm to train the networks. The importance of
using global optimization algorithms for FNN training is discussed in Section 1. The global optimization algorithm AGOP (introduced in Mammadov (2004) and Mammadov et al. (2005)) is applied for
training these modified network algorithms, which are as follows:
NNDLS - Neural network algorithm based on the DLS error function, EDLS (see (6))
NNCC - Neural network algorithm based on the newly proposed error function 1, ECC (see (10))
NNTCC - Neural network algorithm based on the newly proposed error function 2, ETCC (see (11))
The layers of these networks are connected in the same structure as the FNN (Section 2.1). A tan-sigmoid function is used as the transfer function between the input layer and the hidden layer, while a
linear transformation function is employed between the hidden and the output layers. The NNDLS algorithm differs from the respective algorithm used by Refenes et al. (1995) because it employs a new global
optimization algorithm, AGOP, for training. In addition to the use of this new training algorithm, NNCC
and NNTCC are based on two different modified error functions, ECC and ETCC, respectively.
The only way to examine whether these modified algorithms perform better than the standard FNN is
to conduct numerical experiments. Testing the efficiency of these algorithms against the algorithms developed in Yao and Tan (2000; 2001), Caldwell (1995), and Refenes et al. (1995) is not possible, as these
algorithms are not accessible.
3 Network Training and Evaluation
As mentioned in Section 1, the AORD is selected as the stock market index whose trading signals are to
be predicted. We highlighted in Section 1 that past studies have shown the importance of using lagged
data of foreign stock market indices to predict a given stock market index. The previous studies done by
the authors [Tilakaratne et al. 2006] also suggest that the lagged Close prices of the US S&P 500 Index
(GSPC), the UK FTSE 100 Index (FTSE), the French CAC 40 Index (FCHI), and the German DAX Index (GDAXI),
as well as that of the AORD itself show an impact on the direction of the Close price of day t of the
AORD. Furthermore, it has been found that only the Close prices at lag 1 of these markets influence the
Close price of the AORD [Tilakaratne et al. 2006; 2007a]. Therefore, this study considered the relative
return of the Close prices at lag 1 of two combinations of stock market indices when forming input sets:
(i) a combination that includes the GSPC, FTSE, FCHI, and the GDAXI, and (ii) a combination that includes the AORD in addition to the markets included in (i).
As most of the world's major stock markets are integrated, one integrated stock market can be considered as part of a single global system. The influence from one integrated stock market on a dependent
market includes the influence from one or more stock markets on the former. Therefore, the intermarket
influences on the target market index need to be quantified in order to use them as the input variables (the
quantification process is explained in Section 3.1). The input sets (described in Section 3.2) are formed
with and without incorporating the quantified intermarket influence [Tilakaratne et al. 2006; Tilakaratne
et al. 2007a]. By quantifying intermarket influence, we tried to identify the influential patterns between
the potential influential markets and the AORD. Training the network algorithms with pre-identified patterns may enhance their learning. Therefore, it can be expected that using quantified intermarket influence
as input features to the network algorithms produces more accurate output.
Daily relative returns of the Close prices of the selected stock market indices from 2 July 1997 to 30
December 2005 were used for this study. If no trading took place on a particular day, the rate of change of
price should be zero. Therefore, before calculating the relative returns, the missing values of the Close
price were replaced by the corresponding Close price of the last trading day.
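The preprocessing step described above can be sketched as follows. The chapter does not spell out the formula for the relative return, so the usual definition (P(t) - P(t-1)) / P(t-1) is assumed here. The function names and the sample prices are hypothetical.

```python
def fill_missing_closes(closes):
    """Replace missing Close prices (None) with the last traded Close."""
    filled = []
    last = None
    for price in closes:
        if price is None:
            if last is None:
                raise ValueError("series starts with a missing value")
            price = last
        filled.append(price)
        last = price
    return filled


def relative_returns(closes):
    """Assumed definition: (P(t) - P(t-1)) / P(t-1) for consecutive days."""
    return [(curr - prev) / prev for prev, curr in zip(closes, closes[1:])]


# Hypothetical Close prices; None marks a day without trading.
closes = fill_missing_closes([100.0, None, 102.0])
print(closes)                    # [100.0, 100.0, 102.0]
print(relative_returns(closes))
```

Filling a non-trading day with the previous Close makes the relative return for that day exactly zero, which matches the requirement stated above.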
The minimum and maximum values of the data (relative returns) used for network training are -0.137
and 0.057, respectively. Therefore, we selected the value of $\delta$ (see Section 2.3) as 0.01. If the trading signals are correctly predicted, 0.01 is small enough to set the value of the proposed error function (see (10))
to approximately zero.
As influential patterns between markets are likely to vary with time [Tilakaratne 2006], the whole
study period was divided into a number of moving windows of fixed length. Overlapping windows
with a length of three trading years¹ were considered. A period of three trading years contains enough data
(768 daily relative returns) for neural network experiments. Moreover, the chance that outdated data
(which are not relevant for studying the current behavior of the market) are included in the training set
is very low.
The most recent 10% of the data (the last 76 trading days) in each window was reserved for out-of-sample
predictions, while the remaining 90% of the data was allocated for network training. We call the
part of the window allocated for training the training window. Different numbers of neurons for the hidden layer were tested when training the networks with each input set.
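The windowing scheme can be illustrated with a short sketch. The window length (768 days) and the size of the out-of-sample part (76 days) come from the text. The shift between consecutive windows is not stated, so the one-trading-year step (256 days) and the series length of 2048 observations are assumptions, chosen so that the sketch yields six windows.

```python
def moving_windows(n_obs, length=768, step=256, test_size=76):
    """Split indices 0..n_obs-1 into overlapping (train, test) index ranges.

    step is a hypothetical shift between consecutive windows.
    """
    windows = []
    start = 0
    while start + length <= n_obs:
        split = start + length - test_size
        windows.append((range(start, split), range(split, start + length)))
        start += step
    return windows


windows = moving_windows(2048)  # 2048 is an assumed series length
for number, (train, test) in enumerate(windows, start=1):
    print(number, len(train), len(test), train.start, test.stop - 1)
```

Each window then contains 692 training days followed by 76 out-of-sample days.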
As described in Section 2.2, the error functions, EDLS and ETCC (see (6) and (11)), consist of a parameter b (discount rate), which decides the contribution from the historical data of the time series. Refenes et
al. (1995) fixed b=6 for their experiments. However, the discount rate may vary from one stock market
index to another. Therefore, we tested different values for b when training network NNDLS (see Section
2.4). Observing the results, the best value for b was selected; this best value was used as b when training
the network NNTCC.
3.1 Quantification of Intermarket Influences
Past studies [Wu & Su 1998; Yang et al. 2003; Bhattacharyya & Banerjee 2004] have confirmed that
most of the world's major stock markets are integrated. Hence, one integrated stock market can be considered as part of a single global system. The influence from one integrated stock market on a dependent
market includes the influence from one or more stock markets on the former.
If there is a set of influential markets for a given dependent market, it is not straightforward to separate the influence of individual influential markets. Instead of measuring the individual influence from one
influential market to a dependent market, the relative strength of the influence from this influential market
to the dependent market can be measured compared to the influence from the other influential markets.
We used the approach proposed in Tilakaratne et al. (2006; 2007a) to quantify intermarket influences.
This approach estimates the combined influence of a set of influential markets and also the contribution
from each influential market to the combined influence.
¹ 1 trading year = 256 trading days
Quantification of intermarket influences on the AORD was carried out by finding the coefficients, $\xi_i$,
i = 1, 2, ... (see Section 3.1.1), which maximize the median rank correlation between the relative return of
the Close price of day (t+1) of the AORD index and the sum of $\xi_i$ multiplied by the relative returns of the
Close prices of day t of a combination of influential market indices over a number of small non-overlapping windows of fixed size. The two combinations of markets, previously mentioned in this
section, were considered. $\xi_i$ measures the contribution from the ith influential market to the combined influence, which is estimated by the optimal correlation.
There is a possibility that the maximum value leads to a conclusion about a relationship that does not
exist in reality. In contrast, the median is more conservative in this respect. Therefore, instead of selecting
the maximum of the optimal rank correlation, the median is considered.
Spearman's rank correlation coefficient was used as the rank correlation measure. For two variables
X and Y, Spearman's rank correlation coefficient, $r_s$, can be defined as follows:
$$
r_s = \frac{n(n^2 - 1) - 6 \sum d_i^2 - (T_x + T_y)/2}{\sqrt{\left(n(n^2 - 1) - T_x\right)\left(n(n^2 - 1) - T_y\right)}}, \tag{12}
$$
where n is the total number of bivariate observations of X and Y, $d_i$ is the difference between the rank of X
and the rank of Y in the ith observation, and $T_x$ and $T_y$ are the numbers of tied observations of X and Y, respectively.
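As a quick check of (12), assuming it takes the tie-corrected form with correction terms $T_x$ and $T_y$, setting $T_x = T_y = 0$ (no ties) recovers the familiar expression for Spearman's coefficient:

```latex
r_s = \frac{n(n^2 - 1) - 6 \sum_{i=1}^{n} d_i^2}{\sqrt{n(n^2 - 1) \cdot n(n^2 - 1)}}
    = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}
```

For a window of n = 22 observations, $n(n^2 - 1) = 22 \times 483 = 10626$, so identical rankings ($\sum d_i^2 = 0$) give $r_s = 1$, while $\sum d_i^2 = 1771$ gives $r_s = 0$.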
The same six training windows employed for the network training (Section 3) were considered for
the quantification of intermarket influence on the AORD. The correlation structure between stock markets
also changes with time [Wu & Su 1998]. Therefore, each moving window was further divided into a
number of small windows with a length of 22 days. Twenty-two days of a stock market time series
represents a trading month. Spearman's rank correlation coefficients (see (12)) were calculated for these
smaller windows within each moving window.
The absolute value of the correlation coefficient was considered when finding the median optimal
correlation. This is appropriate as the main concern is the strength rather than the direction of the correlation (i.e., either positively or negatively correlated).
The objective function to be maximized (see Section 3.1.1 given below) is defined by Spearman's
correlation coefficient, which uses ranks of data. Therefore, the objective function is discontinuous. Solving such a global optimization problem is extremely difficult because of the unavailability of gradients.
We used the same global optimization algorithm, AGOP, which was used for training the proposed algorithms (see Section 2.4), to solve this optimization problem.
3.1.1 Optimization Problem
Let $Y(t+1)$ be the relative return of the Close price of a selected dependent market at time (t + 1) and
$X_j(t)$ be the relative return of the Close price of the jth influential market at time t. Define $X_\xi(t)$ as
$$
X_\xi(t) = \sum_{j} \xi_j X_j(t), \tag{13}
$$
where the coefficient $\xi_j \ge 0$, j = 1, 2, ..., m, measures the strength of influence from each influential market
$X_j$, while m is the total number of influential markets.
The aim is to find the optimal values of the coefficients, $\xi = (\xi_1, \ldots, \xi_m)$, which maximize the rank
correlation between $Y(t+1)$ and $X_\xi(t)$ for a given window.
The correlation can be calculated for a window of a given size. This window can be defined as
$$
T(t_0, l) = \{t_0,\; t_0 + 1,\; \ldots,\; t_0 + (l - 1)\}, \tag{14}
$$
where $t_0$ is the starting date of the window, and l is its size (in days). We set l = 22 days.
Spearman's correlation (see (12)) between the variables $Y(t+1)$ and $X_\xi(t)$, $t \in T(t_0, l)$, defined on
the window $T(t_0, l)$ is denoted as
$$
C(\xi) = \mathrm{Corr}\left(Y(t+1),\, X_\xi(t) \,\|\, T(t_0, l)\right). \tag{15}
$$
To define the optimal values of the coefficients for a long period, the following method is applied:
let [1, T] = {1, 2, ..., T} be a given period (e.g., a large window). This period is divided into n windows of
size l (we assume that T = l × n, where n > 1 is an integer):
$$
T(t_k, l), \quad k = 1, 2, 3, \ldots, n; \tag{16}
$$
thus,
$$
T(t_k, l) \cap T(t_{k'}, l) = \emptyset \quad \text{for } k \ne k', \tag{17}
$$
$$
\bigcup_{k=1}^{n} T(t_k, l) = [1, T]. \tag{18}
$$
The correlation coefficient between $Y(t+1)$ and $X_\xi(t)$ defined on the window $T(t_k, l)$ is denoted as
$$
C_k(\xi) = \mathrm{Corr}\left(Y(t+1),\, X_\xi(t) \,\|\, T(t_k, l)\right), \quad k = 1, \ldots, n. \tag{19}
$$
To define an objective function over the period [1, T], the median of the vector $(C_1(\xi), \ldots, C_n(\xi))$ is
used. Therefore, the optimization problem can be defined as
$$
\text{Maximise} \quad f(\xi) = \mathrm{Median}\left(C_1(\xi), \ldots, C_n(\xi)\right) \tag{20}
$$
$$
\text{s.t.} \quad \sum_{j} \xi_j = 1, \quad \xi_j \ge 0, \quad j = 1, 2, \ldots, m. \tag{21}
$$
The solution to (20) and (21) is a vector, $\xi = (\xi_1, \ldots, \xi_m)$, where $\xi_j$, j = 1, 2, ..., m, denotes the strength of
the influence from the jth influential market.
In this chapter, the quantity $\xi_j X_j$ is called the quantified relative return corresponding to the jth influential market.
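The objective function of this optimization problem can be sketched in plain Python as below. The sketch computes Spearman's correlation as the Pearson correlation of average ranks (a standard tie-aware formulation), takes absolute values as described in Section 3.1, and returns the median over non-overlapping 22-day windows. The coarse grid over two markets is only a stand-in for illustration; it is not the AGOP algorithm used in the study. All names and the synthetic data are assumptions.

```python
import math
from statistics import mean, median


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for k in range(pos, end + 1):
            ranks[order[k]] = (pos + end) / 2 + 1
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman's correlation as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return 0.0  # assumption: a constant series carries no rank information
    return cov / math.sqrt(vx * vy)


def objective(xi, influential, target_next, window=22):
    """Median absolute Spearman correlation over non-overlapping windows.

    influential: list of m series X_j(t); target_next[t] holds Y(t+1).
    """
    combined = [sum(c * series[t] for c, series in zip(xi, influential))
                for t in range(len(target_next))]
    corrs = [abs(spearman(combined[s:s + window], target_next[s:s + window]))
             for s in range(0, len(combined) - window + 1, window)]
    return median(corrs)


# Synthetic data for two hypothetical markets over two 22-day windows.
market_a = [((7 * t) % 44 - 22) / 1000 for t in range(44)]
market_b = [((5 * t) % 44 - 22) / 1000 for t in range(44)]
target_next = [0.8 * a + 0.2 * b for a, b in zip(market_a, market_b)]

# Coarse grid on the constraint xi_1 + xi_2 = 1, xi_j >= 0 (not AGOP).
candidates = [(k / 10, 1 - k / 10) for k in range(11)]
best = max(candidates, key=lambda xi: objective(xi, [market_a, market_b], target_next))
print(best, objective(best, [market_a, market_b], target_next))
```

Because the objective depends on the data only through ranks, it is piecewise constant in the coefficients, which is why a gradient-free global method is needed in place of standard gradient-based training.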
3.2 Input Sets
The following six sets of inputs are used to train the modified network algorithms introduced in Section
2.4 as well as the standard FNN algorithm:
1. Four input features of the relative returns of the Close prices of day t of the market combination (i)
(i.e., GSPC(t), FTSE(t), FCHI(t), and GDAXI(t)), denoted by GFFG
2. Four input features of the quantified relative returns of the Close prices of day t of the market combination (i) (i.e., $\xi_1$GSPC(t), $\xi_2$FTSE(t), $\xi_3$FCHI(t), and $\xi_4$GDAXI(t)), denoted by GFFG-q
3. A single input feature consisting of the sum of the quantified relative returns of the Close prices of day t
of the market combination (i) (i.e., $\xi_1$GSPC(t) + $\xi_2$FTSE(t) + $\xi_3$FCHI(t) + $\xi_4$GDAXI(t)), denoted by GFFG-sq
4. Five input features of the relative returns of the Close prices of day t of the market combination (ii)
(i.e., GSPC(t), FTSE(t), FCHI(t), GDAXI(t), and AORD(t)), denoted by GFFGA
5. Five input features of the quantified relative returns of the Close prices of day t of the market combination (ii) (i.e., $\xi_1^A$GSPC(t), $\xi_2^A$FTSE(t), $\xi_3^A$FCHI(t), $\xi_4^A$GDAXI(t), and $\xi_5^A$AORD(t)), denoted by GFFGA-q
6. A single input feature consisting of the sum of the quantified relative returns of the Close prices of day t
of the market combination (ii) (i.e., $\xi_1^A$GSPC(t) + $\xi_2^A$FTSE(t) + $\xi_3^A$FCHI(t) + $\xi_4^A$GDAXI(t) +
$\xi_5^A$AORD(t)), denoted by GFFGA-sq
$(\xi_1, \xi_2, \xi_3, \xi_4)$ and $(\xi_1^A, \xi_2^A, \xi_3^A, \xi_4^A, \xi_5^A)$ are the solutions to (20) and (21) corresponding to the market combinations (i) and (ii), respectively, as previously mentioned in Section 3. These solutions
are shown in Tables 1 and 2, respectively. We note that $\xi_i$ and $\xi_i^A$, i = 1, 2, 3, 4, are not necessarily equal.
Training      Optimal values of ξ                  Optimal median
window no.    GSPC    FTSE    FCHI    GDAXI        Spearman's correlation
1             0.57    0.30    0.11    0.02         0.5782*
2             0.61    0.18    0.08    0.13         0.5478*
3             0.77    0.09    0.13    0.01         0.5680*
4             0.79    0.06    0.15    0.00         0.5790*
5             0.56    0.17    0.03    0.24         0.5904*
6             0.66    0.06    0.08    0.20         0.5359*
Table 1: Optimal values of quantification coefficients (ξ) and the median optimal Spearman's
correlations corresponding to market combination (i) for different training windows (* - Significant at 5% level)
Training      Optimal values of ξ                          Optimal median
window no.    GSPC    FTSE    FCHI    GDAXI    AORD        Spearman's correlation
1             0.56    0.29    0.10    0.03     0.02        0.5805*
2             0.58    0.11    0.12    0.17     0.02        0.5500*
3             0.74    0.00    0.17    0.02     0.07        0.5697*
4             0.79    0.07    0.14    0.00     0.00        0.5799*
5             0.56    0.17    0.04    0.23     0.00        0.5904*
6             0.66    0.04    0.09    0.20     0.01        0.5368*
Table 2: Optimal values of quantification coefficients (ξ) and the median optimal Spearman's
correlations corresponding to market combination (ii) for different training windows (* - Significant at 5% level)
3.3 Evaluation Measures
The network algorithms proposed in Section 2.4 (i.e., NNDLS, NNCC, NNTCC) and the standard FNN algorithm output the (t+1)th day relative returns of the Close price of the AORD. Subsequently, the output was
classified into trading signals according to Criterion 1 (see Section 1).
The performance of the networks was evaluated by the overall classification rate ($r_{CA}$) as well as by
the overall misclassification rates ($r_{E1}$ and $r_{E2}$), which are defined as follows:
$$
r_{CA} = \frac{N_0}{N_T} \times 100, \tag{22}
$$
where $N_0$ and $N_T$ are the number of test cases with correct predictions and the total number of cases in the
test sample, respectively, and
$$
r_{E1} = \frac{N_1}{N_T} \times 100, \tag{23}
$$
$$
r_{E2} = \frac{N_2}{N_T} \times 100, \tag{24}
$$
where $N_1$ is the number of test cases in which a buy/sell signal is misclassified as a hold signal or vice versa, and
$N_2$ is the number of test cases in which a sell signal is classified as a buy signal or vice versa.
From a trader's point of view, the misclassification of a hold signal as a buy or sell signal is a more
serious mistake than misclassifying a buy signal or a sell signal as a hold signal. The reason is that in the
former case, a trader will lose money by taking part in an unwise investment, while in the latter case
he/she will only lose the opportunity to make a profit, but there will be no monetary loss. The most serious monetary loss occurs when a buy signal is misclassified as a sell signal and vice versa. Due to the
seriousness of the mistake, rE2 plays a more important role in performance evaluation than rE1.
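The three rates can be computed directly from the actual and predicted signals, as in the sketch below. The function name and the four sample cases are hypothetical.

```python
def evaluation_rates(actual, predicted):
    """Return (r_CA, r_E1, r_E2) as percentages of the test sample."""
    n_total = len(actual)
    n0 = n1 = n2 = 0
    for a, p in zip(actual, predicted):
        if a == p:
            n0 += 1            # correct prediction
        elif "hold" in (a, p):
            n1 += 1            # buy/sell confused with hold, or vice versa
        else:
            n2 += 1            # buy confused with sell, or vice versa
    return tuple(100.0 * n / n_total for n in (n0, n1, n2))


# Hypothetical actual and predicted trading signals.
actual = ["buy", "hold", "sell", "hold"]
predicted = ["buy", "buy", "buy", "hold"]
r_ca, r_e1, r_e2 = evaluation_rates(actual, predicted)
print(r_ca, r_e1, r_e2)  # 50.0 25.0 25.0
```

The three rates always sum to 100, so a lower $r_{E2}$ at the same $r_{CA}$ means that errors have shifted into the less costly hold confusions.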
4 Results Obtained from Network Training
As mentioned in Section 3, different values for the discount rate, b, were tested. The values b = 1, 2, ..., 12 were considered when training NNDLS. The prediction results improved with the value of b up to 5. For b > 5, the
prediction results remained unchanged. Therefore, the value of b was fixed at 5, and this b value was used
as the discount rate of the NNTCC algorithm as well.
We trained the four neural network algorithms (including the standard FNN) by varying the structure
of the network; that is, by changing the number of hidden layers as well as the number of neurons per
hidden layer. The best four prediction results corresponding to the four networks were obtained when the
number of hidden layers was equal to one, and the number of neurons per hidden layer was equal to two.
Therefore, only the results relevant to networks with two hidden neurons are presented in this section.
Tables 3-6 present the prediction results related to algorithms NNDLS, NNCC, and NNTCC and standard FNN,
respectively.
Input Set     Average rCA    Average rE2    Average rE1
GFFG          64.25          0.44           35.31
GFFGA         64.04          0.44           35.53
GFFG-q        64.47          0.22           35.31
GFFGA-q       64.25          0.22           35.53
GFFG-sq       63.82          0.00           36.18
GFFGA-sq      64.04          0.00           35.96
Table 3: Results obtained from training neural network NNDLS (The best prediction results are
shown in bold type.)
Input Set     Average rCA    Average rE2    Average rE1
GFFG          65.35          0.00           34.65
GFFGA         64.04          0.22           35.75
GFFG-q        63.82          0.00           36.18
GFFGA-q       64.04          0.00           35.96
GFFG-sq       64.25          0.00           35.75
GFFGA-sq      63.82          0.00           36.18
Table 4: Results obtained from training neural network NNCC (The best prediction results are
shown in bold type.)
Input Set     Average rCA    Average rE2    Average rE1
GFFG          66.67          0.44           32.89
GFFGA         64.91          0.22           34.87
GFFG-q        66.23          0.00           33.37
GFFGA-q       63.82          0.22           35.96
GFFG-sq       64.25          0.44           35.31
GFFGA-sq      64.69          0.22           35.09
Table 5: Results obtained from training neural network NNTCC (The best prediction results are
shown in bold type.)
Input Set     Average rCA   Average rE2   Average rE1
GFFG          62.06         0.22          37.72
GFFGA         62.06         0.22          37.72
GFFG-q *      62.72         0.00          37.28
GFFGA-q       62.72         0.00          37.28
GFFG-sq       62.28         0.00          37.72
GFFGA-sq      62.50         0.00          37.50
Table 6: Results obtained from training the standard FNN algorithm (The best prediction results
are marked with *.)
When the overall classification and overall misclassification rates given in Table 6 are compared with
the respective rates corresponding to the modified neural network algorithms (Tables 3-5), it is clear
that the standard FNN algorithm performs worse than the modified neural network algorithms.
Therefore, it can be suggested that the modified neural network algorithms perform better when predicting the trading signals of the AORD. As shown in Tables 3-5, the two modified algorithms NNCC and NNTCC perform better than the other modified algorithm (i.e., NNDLS). Out of the three
modified network algorithms, NNTCC gives the best prediction results. The classification accuracy associated with this algorithm is 3.63 higher than that of the standard FNN. An increase of this size is significant for day trading. In summary, the results suggest that taking two factors into account, (1) the accuracy of the
predicted class and (2) the contribution of the historical data, improves the prediction accuracy.
4.1 Comparison of the Performance of the Network Algorithms
The best predictions obtained by the proposed algorithms (i.e., NNDLS, NNCC, and NNTCC) were compared
with the best prediction results generated by the standard FNN algorithm by using classification and misclassification rates. The classification rate indicates the proportion of correctly classified signals to a particular class out of the total number of actual signals in that class. On the other hand, the misclassification
rate indicates the proportion of incorrectly classified signals from a particular class to another class out of
the total number of actual signals in the former class.
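The per-class rates defined above amount to normalising each row of a confusion matrix by the number of actual signals in that class. A minimal sketch follows; the function name `class_rates` and the string labels are hypothetical, and the example signals are made up for illustration only.

```python
def class_rates(actual, predicted, classes=("buy", "hold", "sell")):
    """Return rates[a][p]: percentage of actual class a predicted as class p.

    rates[a][a] is the classification rate of class a; the other entries
    in the same row are its misclassification rates. A class with no
    actual signals gets None for every entry.
    """
    counts = {a: {p: 0 for p in classes} for a in classes}
    for a, p in zip(actual, predicted):
        counts[a][p] += 1
    rates = {}
    for a in classes:
        total = sum(counts[a].values())
        rates[a] = {
            p: (100.0 * counts[a][p] / total if total else None) for p in classes
        }
    return rates


# Made-up example signals, for illustration only.
actual = ["buy", "buy", "hold", "hold", "hold", "hold", "sell", "sell"]
predicted = ["buy", "hold", "hold", "hold", "hold", "sell", "hold", "sell"]
rates = class_rates(actual, predicted)
for a in ("buy", "hold", "sell"):
    print(a, rates[a])
```

Each row sums to 100 whenever the class has at least one actual signal, which gives a quick consistency check on a table of such rates.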
4.2.1 Prediction Accuracy
The average (over six windows) classification and misclassification rates related to the best prediction
results obtained from NNDLS, NNCC, NNTCC, and standard FNN are shown in Tables 7-10, respectively.
According to Tables 7-10, none of the algorithms produces serious misclassifications (i.e., misclassification of a buy signal as a sell signal and vice versa). The results show that all three proposed algorithms outperform the standard FNN with respect to prediction accuracy. Moreover, it is
clear that the classification rates corresponding to buy, sell, and hold signals are highest in the case of the
NNTCC algorithm. The NNCC algorithm produces the second best prediction results. These results indicate
the success of the proposed weighting scheme.
Average Classification (Misclassification) Rates
Actual Class    Predicted Class
                Buy          Hold         Sell
Buy             22.10%       (77.90%)     (0.00%)
Hold            (4.97%)      89.20%       (5.83%)
Sell            (0.00%)      (83.06%)     16.94%
Table 7: Average (over six windows) classification and misclassification rates of the best prediction results corresponding to NNDLS (trained with input set GFFGA-sq; refer to Table 3)
Average Classification (Misclassification) Rates
Actual Class    Predicted Class
                Buy          Hold         Sell
Buy             23.94%       (76.06%)     (0.00%)
Hold            (5.00%)      89.59%       (6.66%)
Sell            (0.00%)      (77.71%)     22.29%
Table 8: Average (over six windows) classification and misclassification rates of the best prediction results corresponding to NNCC (trained with input set GFFG; refer to Table 4)
Average Classification (Misclassification) Rates
Actual Class    Predicted Class
                Buy          Hold         Sell
Buy             27.00%       (73.00%)     (0.00%)
Hold            (4.56%)      89.22%       (6.22%)
Sell            (0.00%)      (75.49%)     24.51%
Table 9: Average (over six windows) classification and misclassification rates of the best prediction results corresponding to NNTCC (trained with input set GFFG-q; refer to Table 5)
Average Classification (Misclassification) Rates
Actual Class    Predicted Class
                Buy          Hold         Sell
Buy             21.55%       (78.45%)     (0.00%)
Hold            (4.18%)      88.68%       (7.14%)
Sell            (0.00%)      (79.72%)     20.28%
Table 10: Average (over six windows) classification and misclassification rates of the best
prediction results corresponding to standard FNN (trained with input set GFFG-q; refer to Table 6)
5 Conclusions
The results obtained from the experiments show that the modified neural network algorithms introduced
by this study perform better than the standard FNN algorithm in predicting the trading signals of the
AORD. Furthermore, the neural network algorithms, based on the modified OLS error functions introduced by this study (see (10) and (11)), produced better predictions of trading signals of the AORD. Of
these two algorithms, the one based on (11) showed the best performance. This algorithm produced the
best predictions when the network consisted of one hidden layer with two neurons. The quantified relative
returns of the Close prices of the GSPC and the three European stock market indices were used as the input features. This network prevents serious misclassifications such as misclassification of buy signals as
sell signals and vice versa; it also predicts trading signals with a higher degree of accuracy. It can also be
suggested that the quantified intermarket influence on the AORD can be effectively used to predict its
trading signals. In summary, the proposed prediction approach can be effectively used to predict the trading signals of the Australian All Ordinary Index.
This prediction approach can also be used to predict whether it is best to buy, hold, or sell shares of
any company listed under a given sector of the Australian Stock Exchange. In this case, the potential
influential variables will be the share price indices of the companies listed under the sector of interest. Furthermore, the approach is suitable for predicting the trading signals of any other global stock market index. Such a research
direction would be particularly interesting in a period of economic recession, as the stock indices of the
world's major economies are strongly correlated during such periods.
Another useful research direction is in the area of marketing research, that is, the modification of the
proposed prediction approach to predict whether the market share of a certain product goes up or not. In
this case, market shares of the competitive brands can be considered as the influential variables.
References
[Akbani et al., 2004] Akbani, R., Kwek, S., & Japkowicz, N. (2004). Applying Support Vector Machines to Imbalanced Datasets. In Boulicaut, J.-F., Esposito, F., Giannotti, F., Pedreschi, D. (eds.) ECML 2004, LNCS (LNAI), vol. 3201, 39-50, Springer, Heidelberg.
[Bhattacharyya & Banerjee, 2004] Bhattacharyya, M. & Banerjee, A. (2004). Integration of Global Capital Markets: An Empirical Exploration. International Journal of Theoretical and Applied Finance, vol. 7, 385-405.
[Caldwell, 1995] Caldwell, R. B. (1995). Performances Matrices for Neural Network-based Trading System Development. NeuroVe$t Journal, vol. 3, 22-26.
[Cao & Tay, 2001] Cao, L. & Tay, F. E. H. (2001). Financial forecasting using support vector machines. Neural Computing &
Applications, vol. 10(2), 184-192.
[Huang et al., 2005] Huang, W., Nakamori, Y., & Wang, S.-Y. (2005). Forecasting stock market movement direction with support vector machine. Computers and Operations Research, vol. 32(10), 2513-2522.
[Kim & Chun, 1998] Kim, S. H. & Chun, S. H. (1998). Graded forecasting using an array of bipolar predictions: application of
probabilistic neural networks to a stock market index. International Journal of Forecasting, vol. 14(3), 323-337.
[Kohara et al., 1996] Kohara, K., Fukuhara, Y., & Nakamura, Y. (1996). Selective presentation learning for neural network forecasting of stock markets. Neural Computing & Applications, vol. 4(3), 143-148.
[Leung et al., 2000] Leung, M. T., Daouk, H., & Chen, A.-S. (2000). Forecasting stock indices: a comparison of classification
and level estimation models. International Journal of Forecasting, vol. 16(2), 173-190.
[Mammadov, 2004] Mammadov, M. A. (2004). A new global optimization algorithm based on dynamical systems approach. In
A Rubinov and M Sniedovich (eds.), Sixth International Conference on Optimization: Techniques and Applications (ICOTA6), Ballarat, Australia.
[Mammadov et al., 2005] Mammadov, M. A., Rubinov, A. M., & Yearwood, J. (2005). Dynamical systems described by relational elasticities with applications to global optimization. In V Jeyakumar and A Rubinov (eds.) Continuous Optimization:
Current Trends and Applications, 365-387, Springer.
[Murphy, 2004] Murphy, J. J. (2004). Intermarket Analysis: Profiting from Global Market Relationships. John Wiley and Sons.
[Olson & Mossaman, 2001] Olson D. & Mossaman, C. (2001). Cross-correlation and Predictability of Stock Returns. Journal of
Forecasting, vol. 20, 145-160.
[Pan et al., 2005] Pan, H., Tilakaratne, C., & Yearwood, J. (2005). Predicting the Australian Stock Market Index Using Neural
networks Exploiting Dynamical Swings and Intermarket Influences. Journal of Research and Practice in Information Technology, vol. 37(1), 43-55.
[Poddig & Rehkugler, 1995] Poddig, T. & Rehkugler, H. (1995). A 'world' model of integrated financial markets using artificial
neural networks, Neurocomputing, vol. 10, 251-273.
[Qi & Maddala, 1999] Qi, M. & Maddala, G. S. (1999). Economic factors and the stock market: a new perspective. Journal of
Forecasting, vol. 18(3), 151-166.
[Refenes et al., 1995] Refenes, A. N., Bentz, Y., Bunn, D. W., Burgess, A. N., & Zapranis, A. D. (1995). Financial Time Series
Modelling with Discounted Least Squares Backpropagation. Neurocomputing, vol. 14, 123-138.
[Tilakaratne, 2006] Tilakaratne, C. D. (2006). A Study of Intermarket Influence on the Australian All Ordinary Index at Different
Time Periods. In 2nd Australian Business and Behavioural Sciences Association (ABBSA) International Conference, Adelaide, Australia.
[Tilakaratne et al., 2006] Tilakaratne, C. D., Mammadov, M. A., & Hurst, C. P. (2006). Quantification of Intermarket Influence
Based on the Global Optimization and Its Application for Stock Market Prediction. In International Workshop on Integrating
AI and Data Mining (AIDM'06), Hobart, Australia.
[Tilakaratne et al., 2007a] Tilakaratne, C. D., Morris, S. A., Mammadov, M. A., & Hurst, C. P. (2007). Quantification of Intermarket Influence on the Australian All Ordinary Index Based on Optimization Techniques. In W. Read and A. J. Roberts
(eds.) 13th Biennial Computational Techniques and Applications Conference (CTAC-2006), ANZIAM Journal vol. 48, C104-C118.
[Tilakaratne et al., 2007b] Tilakaratne, C. D., Morris, S. A., Mammadov, M. A., & Hurst, C. P. (2007). Predicting stock market
index trading signals using neural networks. In 14th Annual Global Finance Conference (GFC 07), 171-179, Melbourne,
Australia.
[Wu & Su, 1998] Wu, C. & Su, Y. (1998). Dynamic Relations among International Stock Markets. International Review of Economics and Finance, vol. 7, 63-84.
[Wu & Zhang, 1997] Wu, Y. & Zhang, H. (1997). Forward premiums as unbiased predictors of future currency depreciation: a
non-parametric analysis. Journal of International Money and Finance, vol. 16(4), 609-623.
[Yang et al., 2003] Yang, J., Khan, M. M., & Pointer, L. (2003). Increasing Integration Between the United States and Other
International Stock Markets?; A Recursive Co-integration Analysis. Emerging Markets Finance and Trade, vol. 39, 39-53.
[Yao et al., 1999] Yao, J., Tan, C. L., & Poh, H. L. (1999). Neural networks for technical analysis: a study on KLCI. International Journal of Theoretical and Applied Finance, vol. 2(2), 221-241.
[Yao & Tan, 2000] Yao, J. & Tan, C. L. (2000). Time Dependent Directional Profit Model for Financial Time Series Forecasting.
In IEEE-INNS-ENNS International Joint Conference on Neural Networks (IJCNN'00), Como, Italy.
[Yao & Tan, 2001] Yao, J. & Tan, C. L. (2001). A study on training criteria for financial time series forecasting. In Proceedings
of the International Conference on Neural Information Processing (ICONIP 01), 15, Shanghai, China.
