You are on page 1of 41

Forecasting Financial Markets

Neural Networks
Copyright © 1999-2006
Investment Analytics

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 1
Overview
¾ Overview of neural networks
¾ Design considerations
¾ Applications

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 2
A Neural Network
¾ Processing elements
ƒ Neurons
• Receives & processes input(s)
• Delivers single output
¾ Network
ƒ Collection of interlinked neurons
ƒ Grouped in layers
• Input
• Intermediate (hidden)
• Output

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 3
A Schematic Diagram of a Neuron

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 4
The Neuron Analogy

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 5
A 3 Layer Neural Network

Network
Inputs Network
Outputs

Hidden Layer
Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 6
Processing Information
¾ Inputs
ƒ Correspond to single attributes
• Can include qualitative data
¾ Outputs
ƒ Solution to a problem
• E.g. forecast, or binary value
¾ Weights
ƒ Express relative importance of data
• On inputs or data transferred between layers
ƒ “Learning” = adapting weights
Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 7
Activation Function
¾ Determines whether a neuron will “fire”
ƒ I.E. Produce an output
¾ Weighted sum of inputs
ƒ For N inputs i into neuron j

N
Y j = ∑ Wij X i
i =1

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 8
Transfer Function
¾ Transforms or normalizes output
ƒ Also called a transformation or squashing function
¾ Popular choice
ƒ Sigmoid : f(x) = 1/(1+e-x)

¾ Alternative: threshold detector / hard limiter


ƒ E.G. F(Yj) in range {0, 1} if Yj > 0.5, 0 otherwise

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 9
Architecture / Network Topology
¾ Number of neurons
¾ Number of hidden layers
¾ Connections
ƒ Feed forward/backwards
ƒ Fully or partially connected
¾ Static or adaptive architecture

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 10
Learning
¾ Supervised
ƒ Uses set of inputs for which desired output is known
ƒ Cost function f(desired-actual) used to change weights
ƒ Example: Hopfield network
¾ Unsupervised
ƒ Network shown only inputs
ƒ No information on “correct” outputs
ƒ Self-organizing
ƒ Example: Kohonen self-organizing feature maps

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 11
Training
¾ Data divided into training & testing data sets
ƒ Training set used to adapt weights
• Many iterations or “epochs”
• Training time dependent on data, network architecture, learning
algorithm
ƒ Forecasting performance tested on test data set
• Cost function comparing desired vs. Actual outputs
ƒ Stopping rule
• Determines when to terminate training
– When weights stabilize
– When cost function minimized
– Danger of over-fitting

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 12
Applications in Finance
¾ Bankruptcy prediction
¾ Bond rating
¾ Consumer credit scoring
¾ Financial market forecasting
ƒ Equities, currencies, commodities, bonds, derivatives
ƒ Security selection
ƒ Portfolio optimization
ƒ Trading systems

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 13
A Simple NN Example
¾ Supervised Learning of OR operator
ƒ Inputs X1, X2
ƒ Outputs Z (desired), Y(actual)
ƒ Weights W1, W2 ; initial values 0.1 and 0.3
ƒ Transfer Function
• F(Y) = 1 if Y > Threshold Value (0.5); 0 otherwise
ƒ Learning
• ∆ = (Z - Y)
• Wi(final) = Wi(initial) + α∆Xi
• α is learning coefficient (0.2)
Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 14
A Simple NN Example

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 15
Design Considerations
¾ Network performance
¾ Control mechanisms
ƒ Choice of activation function
ƒ Choice of cost function
ƒ Network architecture
ƒ Gradient descent/ascent efficiency
ƒ Learning times

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 16
Network Performance Measures
¾ Convergence
ƒ Accuracy of model fitness in-sample
¾ Generalization
ƒ Accuracy of model fitness out-of-sample
¾ Stability
ƒ Variance in prediction accuracy

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 17
Convergence
¾ Is network capable of learning classification?
ƒ Under what conditions?
ƒ What are computation requirements?
¾ Fixed topology networks
ƒ Prove convergence by showing error tends to zero in limit as t
→∞
• Using gradient descent
¾ Other networks
ƒ Show that network can classify the maximum # possible
mappings with arbitrarily large probability

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 18
Generalization
¾ Ability to classify data outside training set
ƒ Most important performance criterion
¾ Analogy with curve fitting
ƒ Two problems
• Finding order of polynomial
• Estimating coefficients
ƒ Too low order (NN structure too simple)
• Bad approximation both in- and out-of-sample
ƒ Too high order (NN structure too complex)
• “Over-fitting” : fits test data well, but out-of-sample performance
poor

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 19
Stability
¾ Consistency of results
ƒ When network parameters are varied
• Networks often vary widely in predictive performance
• “Chaotic”: highly sensitive to initial conditions
¾ Two components of error
• Bias: due to parameterization & associated assumptions
• Variance: sensitivity to changes in estimated parameters
ƒ Regression: high bias, low variance
ƒ Neural networks: low bias, high variance
• No parameterization, but may fit entire family of polynomials to given data
set

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 20
Choice of Activation Function
¾ Sigmoid Functions
ƒ Differentiable and well behaved
¾ Symmetric
ƒ Typical: scaled hyperbolic tangent
e Sy − e − Sy The Symmetric Scaled Hyperbolic Tangent

f ( y ) = A tanh( Sy ) = A Sy
Function

e + e − Sy
1.5

1.0

2A
= A− 0.5

1 + e 2 Sy -2.0 -1.5 -1.0 -0.5


0.0
0.0 0.5 1.0 1.5 2.0

-0.5

• A is amplitude -1.0

• S is slope at origin -1.5

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 21
Choice of Activation Function
¾ Choice of sigmoid parameters
ƒ A = 1.7159, s = 2/3
• F(-1) = -1 and f(1) = 1
– Gain in squashing transformation is normally around 1
• ∆2f/δx2 is max at ± 1
– Improves convergence at end of learning session

¾ Symmetric vs. Asymmetric sigmoid functions


ƒ Refenes & Alippi (1991)
• Symmetric functions capable of speed of convergence over
asymmetric functions by factor of 10

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 22
Cost Function
¾ Quadratic cost function is most common
ƒ Least mean square error
1 n
E = ∑ ( d i − yi ) 2

2 i =1
• Yi is current output from unit i
• di is desired output from unit I
ƒ Discounted least square error
1 n 1
E= ∑ ( a + bi )
( d − y ) 2

n i =1 1 + e
i i

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 23
Learning
¾ Gradient descent used to minimize cost function
ƒ Change weights in proportion to δi = δe/δWi
• ∆Wij(t+1) = λ δi yij
¾ Learning rate (step size, momentum) λ
ƒ As λ → 0 and t→∞ this procedure will find MSEMin
¾ Difficult to find appropriate rate
ƒ Too small
• Slow convergence
• May get trapped in local minima
ƒ Too large: unstable weights

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 24
Learning Rate
¾ Optimal learning rate pattern
ƒ Smooth MSE chart (for each layer)
ƒ Smooth weight histograms (for each layer)
¾ Learning rate adjustment
ƒ One rate for entire network
ƒ Different rates for each layer
ƒ Different rate for each weight

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 25
Learning Rate Rules of Thumb
¾ If no connections that jump layers
ƒ Learning rate for hidden layer λL = 0.5 λ L+1
¾ With connections that jump layers
ƒ Learning rate for hidden layer λL = 0.75 λ L+1
¾ Check sign of consecutive weight changes
ƒ If same, increase λ
ƒ If opposite, decrease λ
¾ If MSE chart is erratic
ƒ Reduce learning rate (for that layer)
Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 26
Network Architecture
¾ Hidden units
ƒ In general, the fewer the better
• Network will generalize better
ƒ Another approach: weight sharing
• Imposing equality constraints amongst connections strengths
• Reduces # free parameters while preserving network size and ability to
recognize complex patterns
¾ Hidden layers
ƒ Typically start with one
ƒ If use more than one make sure to connect each layer to all prior
layers

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 27
Network Architecture
¾ Constructive techniques
ƒ Hidden units added incrementally
¾ Pruning techniques
ƒ Attempts to eliminate redundant units
¾ Genetic algorithms
ƒ Selects “fittest” of several competing networks

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 28
Constructive Techniques
¾ Tiling algorithm
ƒ Divide training data set into “faithful” & “unfaithful” classes
• I.E. Those the network recognizes correctly and those it doesn’t
ƒ Add ancillary unit and connect to layer above
ƒ Select one unfaithful class and train new unit to subdivide it into
faithful and unfaithful classes
ƒ Repeat until no unfaithful classes remain
• Always possible - worst case: one unit for each input pattern
ƒ Add new master output unit and connect to all layers
• Training the new unit to learning mapping to desired output

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 29
Other Constructive Techniques
¾ Cascade algorithm
ƒ Adds hidden unit to maximize magnitude of
correlation between new unit’s output and residual
error signal to be minimized
¾ Dynamic node creation
ƒ Add new unit is rate of error decrease falls below
certain value

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 30
Pruning Techniques
¾ Multi-stage stage pruning
ƒ Outputs of hidden units analysed to see if any are not contributing
to solution
• Output of unit doesn’t change for any input pattern
• If output from two units is identical or opposite (for all inputs)
ƒ Repeat for next layer
¾ Weight decay
ƒ Weights without much influence subjected to time decay
ƒ Equivalent: add penalty term to cost function
• E* = MSE + bσσwij2

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 31
Genetically Evolved Neural Networks
¾ Initial population of randomly generated networks
¾ Proceed through training cycle with all networks
¾ At end of initial training cycle
ƒ Worst performing networks are deleted
ƒ Best performing networks are “mated”
¾ Continue training with all networks
¾ Occasional random mutations introduced
ƒ Randomize weights of lowly ranked networks
ƒ Change forecast horizon, lags on input variables

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 32
Genetic Evolution of a Neural Network

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 33
Training
¾ Epoch
ƒ # Training cycles after which weights are updated
¾ Determining epoch size
ƒ Start with initial epoch
ƒ Train network for large # (10,000) iterations
ƒ Test network and record R2 (for each output)
ƒ Repeat for variety of epoch sizes
ƒ Pick epoch size that maximizes R2
¾ Controlling over-fitting
ƒ Terminate training when MSE on test set starts to rise

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 34
Initial Weights
¾ Start with unequal initial weights
ƒ Rumelhart, Hinton, Williams (1986)
• Will not converge if solution requires unequal weights
¾ Testing stability
ƒ Initial weight matrix defines starting point on the
weight-error surface
ƒ Need several training sets with different random
weights to test for statistical stability

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 35
Data Modeling
¾ Detrending
ƒ Removing of seasonality and trends to achieve
stationarity
¾ Normalization
ƒ Variables scaled to have zero mean, unit SD
• Brings inputs into normal operating range of activation function
• Otherwise activation values may tend to zero
– Network paralysis

′ X i (t ) − X i
X i (t ) =
σX i

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 36
Data Modeling
¾ Scaling of Outputs
ƒ Some transfer functions reach max/min values
only when inputs reach infinity
Y ′(t ) = SCALE × Y (t ) + OFFSET
MAX − MIN
SCALE =
YMax − YMin
MAX − MIN
OFFSET = MAX − YMax
YMax − YMin

¾ Multi-Collinearity
ƒ Independent variables are correlated
• Solution: use principal components analysis to
orthoganalize inputs
Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 37
Lab: Modeling Implied Volatility on
IBEX Options
¾ Compare two forecasting techniques
ƒ Regression
ƒ Genetic neural network
¾ Forecast implied volatility
ƒ Evaluate trading performance

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 38
Solution: Modeling Volatility
on IBEX Options
Cumulative Returns

300%
Regression Buy & Hold Neural Network

250%

200%

150%

100%

50%

0%
1 4 7 10 13 16 19 22 25 28 31 34 37 40 43 46 49 52

-50%

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 39
Solution: Modeling Volatility
on IBEX Options

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 40
Summary: Neural Networks
¾ Pros
ƒ Can capture non-linear effects
• Pattern recognition
ƒ Process model not required
ƒ Wide range of applications in finance
¾ Cons
ƒ ‘Black-box’ approach
ƒ Sometimes poor stability & generalization
characteristics

Copyright © 1999-2006 Investment Analytics Forecasting Financial Markets – Neural Networks Slide: 41

You might also like