
2019 CFA® Exam Prep

IFT High-Yield Notes®


Quantitative Methods
Level II

This document should be read in conjunction with the corresponding
readings in the 2019 Level II CFA® Program curriculum. Some of the
graphs, charts, tables, examples, and figures are copyright
2018, CFA Institute. Reproduced and republished with permission
from CFA Institute. All rights reserved.

Required disclaimer: CFA Institute does not endorse, promote, or
warrant the accuracy or quality of the products or services offered by
IFT. CFA Institute, CFA®, and Chartered Financial Analyst® are
trademarks owned by CFA Institute.

Table of Contents
Foreword
R6 Fintech in Investment Management
R7 Correlation and Regression
R8 Multiple Regression and Machine Learning
R9 Time-Series Analysis
R10 Simulations

Foreword

The IFT High-Yield Course is based on Pareto’s 80-20 rule,
according to which 80% of the exam questions are likely to be based
on 20% of the curriculum. Hence, this course focuses on the 20% of
the material that is most testable.

We call this the “High-Yield Course” because your investment (time
and money) is low but the potential return (passing the exam) is
high! As with high-yield investments in the world of finance, there is
risk. Your exam will contain some questions which are not
addressed in the High-Yield Course. However, we believe that such
questions will be few, and if you complement the High-Yield Course
with sufficient practice, the probability of passing the exam is high.

Below are the main components of the Level II High-Yield course:

1. IFT High-Yield Notes® summarize the most important
concepts from each reading in 2 to 6 pages. Key formulas and
facts are presented in blue boxes while examples appear in
gray boxes.
2. IFT High-Yield Lectures® are online video lectures based on
the notes. Each reading is covered in 10 to 30 minutes.
3. Key Facts and Formula Sheet: 12 pages covering key facts
and formulas for all of Level II!

BONUS:
High-Yield Q-Bank®: We have identified the most important
practice problems from the curriculum that you must do. Ideally you
should do all practice problems, but if you are time-constrained you
should at least do the questions on this list. As part of your final
revision, look at these questions again to reinforce key concepts.

The High-Yield Course can be used on a ‘stand-alone’ basis if you are
time-constrained. However, if you do have time, we recommend
taking the High-Yield Course along with IFT’s regular material. This
will help ensure sufficient mastery of the entire curriculum.

Many candidates complain that they forget material covered earlier.
The High-Yield Course addresses this problem by helping you to
quickly revise key concepts.

Thank you for trusting IFT to help you with your exam preparation.

R6 Fintech in Investment Management


Fintech
Fintech = Finance + Technology.
Fintech refers to the technological innovation in the design and
delivery of financial products and services.
The major drivers of fintech have been:
• Rapid growth in data
• Technological advances
The major applications of fintech are:
• Analysis of large datasets
• Analytical tools
• Automated trading
• Automated advice
• Financial record keeping


Big Data, artificial intelligence, and machine learning


Big Data refers to the vast amounts of data generated by industry,
governments, individuals, and electronic devices. Characteristics of
big data typically include:
• Volume: The amount of data that we are dealing with has
grown exponentially.
• Velocity: We are now increasingly working with real time data.
• Variety: Historically we only dealt with structured data.
However, we are now also dealing with unstructured data such
as text, audio, video etc.
Artificial intelligence (AI) computer systems perform tasks that
have traditionally required human intelligence. They exhibit
cognitive and decision making ability comparable or superior to that
of human beings.
Machine learning (ML) computer programs:
• learn how to complete tasks or predict outcomes
• improve performance over time with experience.
Machine learning programs rely on a training dataset and a
validation dataset. The training dataset allows the ML algorithm to:
• identify relationships between variables
• detect patterns or trends
• create structure from data.
These relationships are then tested on the validation dataset.
Two main approaches to machine learning are:
1. Supervised learning: Both inputs and outputs are identified or
labeled. After learning from labeled data, the trained algorithm
is used to predict outcomes for new data sets.
2. Unsupervised learning: The input and output variables are not
labeled. Here we want the ML algorithm to seek relationships on
its own.
Fintech applications to investment management
Major fintech applications include:
• Text analytics and natural language processing: Text
analytics refers to the use of computer programs to derive
meaning from large, unstructured text or voice based data.
Natural language processing is an application of text analytics
whereby computers analyze and interpret human language.
• Robo-advisory services: Refers to providing investment
solutions through online platforms, replacing the human
advisor. Services provided include: automated asset allocation,
rebalancing, tax strategies, and trade execution.
• Risk analysis: Big Data and ML techniques can provide insight
into changing market conditions. This can allow us to predict
adverse market conditions and adverse trends, which can result
in better risk management.
• Algorithmic trading: Algorithmic trading refers to
computerized trading based on pre-specified rules and
guidelines. It can help us decide – when, where and how to
trade. Benefits of algorithmic trading include – speed of
execution, anonymity and lower transaction costs.
Financial applications of distributed ledger technology
DLT networks allow us to create, exchange and track ownership of
financial assets on a peer-to-peer basis. There is no central authority
to validate the transactions.
Major DLT applications include:
• Cryptocurrencies: These are electronic currencies that allow
near real-time transactions to take place between buyers and
sellers without the need for an intermediary like a bank.
• Tokenization: It is the process of representing ownership
rights to physical assets such as real-estate on a blockchain or
distributed ledger. DLT can create a single digital record of
ownership and reduce efforts required in ownership
verification and examination.
• Post-trade clearing and settlement: DLT can streamline the
post-trade clearing and settlement process by providing near-
real-time trade verification, reconciliation and settlement.
• Compliance: DLT can streamline the compliance process and
bring down costs. It can allow firms and regulators to get near-
real-time access to transaction data, as well as other relevant
compliance data and help them quickly uncover fraudulent
activities. DLT can also reduce compliance costs associated
with know-your-customer and anti-money-laundering
regulations which require verification of the identities of
clients and business partners.

R7 Correlation and Regression


Sample covariance, sample correlation and scatter plot
A scatter plot is a graph that shows the relationship between the
observations for two data series in two dimensions (x-axis and y-
axis).
Correlation analysis is used to measure the strength of the linear
relationship between two variables. The sample correlation is
denoted by r, and the population correlation by ρ. The correlation
coefficient has no units. It can take a value between −1 (perfect
negative correlation) and +1 (perfect positive correlation).
The sample covariance is:
Cov(X, Y) = Σ(Xi − X̄)(Yi − Ȳ) / (n − 1), summed over i = 1 to n
The sample correlation coefficient of two variables X and Y is:
r = Cov(X, Y) / (sX × sY)

Limitations of correlation analysis


• Correlation only measures linear relationships. Two variables
can have a strong non-linear relationship but still have a very
low correlation.
• The correlation can be unreliable when outliers are present.
• The correlation may be spurious (misleading).
Testing the significance of the correlation coefficient
A t-test is used to determine whether the correlation between two
variables is significant.
The formula for the t-test is:
t = r√(n − 2) / √(1 − r²)
How to use the t-test?
• Calculate the t-statistic.
• Compare it with the critical value (tc). With a 5% significance
and a relatively large sample size the critical value is
approximately 2.
• If absolute value of t > tc, then reject H0.
• If you reject H0, then there is a significant linear correlation.
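
A minimal sketch of this test in Python; the two series are made up
for illustration:

```python
import numpy as np
from scipy import stats

# Illustrative data: two small return series (made up)
x = np.array([1.2, 0.8, -0.4, 2.1, 1.5, -0.9, 0.3, 1.1])
y = np.array([0.9, 1.0, -0.2, 1.8, 1.2, -1.1, 0.5, 0.7])

n = len(x)
r = np.corrcoef(x, y)[0, 1]                 # sample correlation
t_stat = r * np.sqrt(n - 2) / np.sqrt(1 - r ** 2)
t_crit = stats.t.ppf(0.975, df=n - 2)       # 5% significance, two-tailed

if abs(t_stat) > t_crit:
    print(f"t = {t_stat:.2f} > {t_crit:.2f}: reject H0; significant correlation")
else:
    print(f"t = {t_stat:.2f} <= {t_crit:.2f}: fail to reject H0")
```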
Linear regression with one independent variable.
Linear regression assumes a linear relationship between the
dependent variable and the independent variable.

The variable being explained is the dependent variable, Y. The
variable used to explain the dependent variable is the independent
variable, X.
A simple linear regression using one independent variable can be
expressed as:
Yi = b0 + b1 Xi + εi
Regression assumptions
• The relationship between Y and X is linear.
• X is not random.
• The expected value of the error term is zero.
• The variance of the error term is constant for all observations
(homoskedasticity).
• The error term, ε, is uncorrelated across observations.
• The error term, ε, is normally distributed.
Standard error of estimate, coefficient of determination,
confidence interval for a regression coefficient
Standard error of estimate (SEE) measures how well a given
linear regression model captures the relationship between the
dependent and the independent variables.
SEE = [Σ(Yi − b̂0 − b̂1 Xi)² / (n − 2)]^(1/2) = [Σ ε̂i² / (n − 2)]^(1/2)
    = [SSE / (n − 2)]^(1/2)
The lower the SEE, the better the fit of the regression line. Also note
that the sum of the squared error terms can be written as SSE.
Coefficient of determination (R²) measures the fraction of the
total variation in the dependent variable that is explained by the
independent variable.
Total variation = Unexplained variation + Explained variation
R² = Explained variation / Total variation
In a regression with one independent variable, R² is the square of
the correlation between the dependent and independent variables.
The higher the R², the more useful the model. R² has a value
between 0 and 1.
The confidence interval for a regression coefficient is given by:
Confidence interval = b̂1 ± tc × sb̂1
where:
tc is the critical t value for a given level of significance and degrees of
freedom
sb̂1 is the standard error of the estimated regression coefficient
If the hypothesized value is outside the confidence interval, then
reject the null hypothesis.
Testing the significance of the regression coefficients
A t-test with n – 2 degrees of freedom is used to conduct hypothesis
tests of the estimated regression coefficients. The test statistic is
computed as:
t-stat = (b̂1 − b1) / sb̂1
Compare the t-stat with tc; if |t-stat| > tc, then reject the null.
The p-value is the smallest level of significance at which the null
hypothesis can be rejected. The higher the t-statistic, the smaller
the p-value and the stronger the evidence to reject the null
hypothesis.
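
A minimal Python sketch tying these pieces together: it fits a
one-variable regression with statsmodels on made-up data, then
computes the SEE, the coefficient t-test, and the confidence interval:

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

# Illustrative data: 8 paired observations (made up)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9, 14.2, 15.8])

X = sm.add_constant(x)                    # add the intercept column
fit = sm.OLS(y, X).fit()
n = len(y)

see = np.sqrt(np.sum(fit.resid ** 2) / (n - 2))   # standard error of estimate
b1, se_b1 = fit.params[1], fit.bse[1]
t_stat = (b1 - 0) / se_b1                 # test of H0: b1 = 0
t_crit = stats.t.ppf(0.975, df=n - 2)     # 5% significance, two-tailed
ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)

print(f"SEE = {see:.3f}, R^2 = {fit.rsquared:.3f}")
print(f"b1 = {b1:.3f}, t-stat = {t_stat:.2f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"p-value for b1 = {fit.pvalues[1]:.4f}")
```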
Predicted value of the dependent variable (Y)
The predicted value of the dependent variable (Y) is calculated by
inserting the predicted value of the independent variable (X) in the
regression equation:
Ŷi = b̂0 + b̂1 Xi
Confidence interval for the predicted value of the dependent
variable (Y)
The prediction interval is given by: Ŷ ± tc × sf
where sf is the standard error of the forecast.
Analysis of variance (ANOVA)
Analysis of variance is a statistical procedure for dividing the
variability of a variable into components that can be attributed to
different sources. We use ANOVA to determine the usefulness of the
independent variable or variables in explaining variation in the
dependent variable.
ANOVA table
Source of variation       Degrees of freedom   Sum of squares   Mean sum of squares
Regression (explained)    k                    RSS              MSR = RSS / k
Error (unexplained)       n − k − 1            SSE              MSE = SSE / (n − k − 1)
Total variation           n − 1                SST

n represents the number of observations and k represents the number
of independent variables. With one independent variable, k = 1. Hence,
MSR = RSS and MSE = SSE / (n − 2).
Information from the ANOVA table can be used to compute:
• Standard error of estimate (SEE) = √MSE
• Coefficient of determination (R²) = RSS / SST

The F-statistic tests whether all the slope coefficients in a linear
regression are equal to 0. In a regression with one independent
variable, this is a test of the null hypothesis H0: b1 = 0 against the
alternative hypothesis Ha: b1≠0. It measures how well the regression
equation explains the variation in the dependent variable.
F-stat = MSR / MSE
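
The ANOVA quantities and the F-stat can be reconstructed directly
from a fitted regression; a sketch with illustrative data:

```python
import numpy as np
import statsmodels.api as sm

# Illustrative data (made up)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9, 14.2, 15.8])
fit = sm.OLS(y, sm.add_constant(x)).fit()

n, k = len(y), 1
sst = np.sum((y - y.mean()) ** 2)   # total variation
sse = np.sum(fit.resid ** 2)        # unexplained variation
rss = sst - sse                     # explained variation
msr, mse = rss / k, sse / (n - k - 1)

print(f"F-stat = {msr / mse:.2f} (statsmodels reports {fit.fvalue:.2f})")
print(f"SEE = {np.sqrt(mse):.3f}, R^2 = {rss / sst:.3f}")
```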
Limitations of regression analysis
• Regression relations can change over time as do correlations.
This is called parameter instability.
• Regression analysis is often difficult to apply because of
specification issues and assumption violations.
• Public knowledge of regression relationships may limit their
usefulness in the future.

R8 Multiple Regression and Machine Learning


Multiple regression equation
Multiple regression allows us to determine the effect of more than
one independent variable on a particular dependent variable.
A multiple regression model is given by:
Yi = b0 + b1 X1i + b2 X2i + … + bk Xki + εi, i = 1, 2, …, n
Interpreting regression coefficients and their p-values
The intercept term b0 is the value of the dependent variable when
the independent variables are equal to 0.
The slope coefficient bj measures how much the dependent variable
Y changes when the independent variable, Xj, changes by one unit,
holding all other independent variables constant.
The p-value is the smallest level of significance at which the null
hypothesis can be rejected. The lower the p-value for a test, the
more significant the result.
Testing the significance of the regression coefficients
We form the following null and alternate hypothesis to test the
significance of the regression coefficients.
H0: bj = 0
Ha: bj ≠ 0
A t-test with n – k – 1 degrees of freedom is used. The test statistic is
computed as:
t-stat = (b̂j − bj) / sb̂j
If the t-statistic is greater than the upper critical t-value (or less
than the lower critical t-value), then we can reject the null
hypothesis and conclude that the regression coefficient is
statistically significant.
Confidence interval of regression coefficient, predicted value of
the dependent variable (Y)
The confidence interval for a regression coefficient is given by:
Confidence interval = b̂j ± tc × sb̂j
where:
tc is the critical t value for a given level of significance and degrees of
freedom
sb̂j is the standard error of the estimated regression coefficient
To calculate the predicted value of the dependent variable Y, we
use a three-step process:
• Calculate estimates of the regression coefficients.
• Assume values for the independent variables.
• Use the regression equation: Ŷi = b̂0 + b̂1 X1i + b̂2 X2i + … + b̂k Xki
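
A sketch of this three-step process with statsmodels; the data and
the assumed values of the independent variables are illustrative:

```python
import numpy as np
import statsmodels.api as sm

# Made-up data: two independent variables, ten observations
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0, 10.0, 9.0])
y = np.array([3.1, 3.8, 7.2, 7.9, 11.1, 11.8, 15.2, 15.9, 19.1, 19.8])

X = sm.add_constant(np.column_stack([x1, x2]))
fit = sm.OLS(y, X).fit()

# Step 1: estimated coefficients; step 2: assumed values for X;
# step 3: plug into the regression equation
x_new = np.array([1.0, 5.0, 4.0])     # [intercept, X1, X2]
y_hat = np.dot(fit.params, x_new)
print(f"Predicted Y = {y_hat:.3f}")
```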

Assumptions of a multiple regression model


• The relationship between the dependent variable, Y, and the
independent variables (X1, X2, …, Xk) is linear.
• The independent variables (X1, X2,…, Xk) are not random. No
linear relation exists between two or more of the independent
variables.
• The expected value of the error term, conditioned on the
independent variables, is 0.
• The variance of the error term is the same for all observations.
• The error term is uncorrelated across all observations.
• The error term is normally distributed.
F-statistic
The F-test is reported in an ANOVA table. It is used to determine the
significance of the regression as a whole. The null hypothesis is that
all of the slope coefficients in the multiple regression model are
jointly equal to 0. The F-statistic is used to determine whether at
least one of the slope coefficients is significantly different from 0.
F = MSR / MSE
The F-test is a one-tailed test; the decision rule is to reject the
null hypothesis if F > Fc.
R² and adjusted R̄²
R² measures the percentage of variation in Y that is explained by the
independent variables. In multiple regression, R² increases as we
add new independent variables even if the amount of variation
explained by them is not statistically significant. Hence, adjusted R̄²
is used, because it does not necessarily increase when an
independent variable is added.
ANOVA table
Source of variation       Degrees of freedom   Sum of squares   Mean sum of squares
Regression (explained)    k                    RSS              MSR = RSS / k
Error (unexplained)       n − k − 1            SSE              MSE = SSE / (n − k − 1)
Total variation           n − 1                SST

Qualitative independent variables (dummy variables)


• A dummy variable is an independent variable that can take only
binary values. Its value is 1 if the condition is true and 0 if the
condition is false.
• To distinguish between n categories, the regression must use (n -
1) dummy variables.
• The coefficient on each dummy variable measures the average
incremental effect of that dummy variable on the dependent
variable.
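
A sketch of a dummy-variable regression for four quarterly
categories, i.e. three dummies with Q1 as the base category; the
return data are randomly generated for illustration:

```python
import numpy as np
import statsmodels.api as sm

# Made-up monthly returns tagged by quarter (4 categories -> 3 dummies)
quarters = np.array([1, 2, 3, 4] * 6)
returns = np.random.default_rng(42).normal(0.01, 0.02, size=24)

d2 = (quarters == 2).astype(float)   # 1 in Q2, else 0
d3 = (quarters == 3).astype(float)
d4 = (quarters == 4).astype(float)   # Q1 is the omitted base category

X = sm.add_constant(np.column_stack([d2, d3, d4]))
fit = sm.OLS(returns, X).fit()
# Intercept = mean Q1 return; each dummy coefficient = average
# incremental effect of that quarter relative to Q1
print(fit.params)
```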
Problems in regression analysis
Problem: Heteroskedasticity – the variance of the error term is not
constant. Test using the BP test: BP = nR².
Effect: The F-test is unreliable; standard errors are underestimated
and t-stats overstated.
Solution: Robust standard errors; generalized least squares.

Problem: Serial correlation – the error terms are correlated. Test
using the DW statistic: DW ≈ 2(1 − r).
Effect: The F-stat is too high; standard errors are underestimated
and t-stats overstated.
Solution: Hansen method; modify the regression equation.

Problem: Multicollinearity – two or more independent variables are
highly correlated. Classic symptom: a high R² with individually
insignificant t-stats.
Effect: Inflated standard errors; t-stats of coefficients artificially
small.
Solution: Omit one or more of the independent variables.
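
The BP and DW diagnostics are available in statsmodels; a sketch on
simulated data:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan
from statsmodels.stats.stattools import durbin_watson

# Made-up data for illustration
rng = np.random.default_rng(7)
x = rng.normal(size=100)
y = 1.0 + 0.5 * x + rng.normal(size=100)

X = sm.add_constant(x)
fit = sm.OLS(y, X).fit()

bp_stat, bp_pvalue, _, _ = het_breuschpagan(fit.resid, X)  # BP = n * R^2
dw = durbin_watson(fit.resid)                              # DW ~ 2(1 - r)
print(f"Breusch-Pagan = {bp_stat:.2f} (p = {bp_pvalue:.3f}), DW = {dw:.2f}")
```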
Model misspecification
Model specification refers to the selection of the independent
variables in the regression, and the regression equation’s functional
form. The general principles of a good model specification are:
• The model should be grounded in cogent economic reasoning.
• The functional form chosen for the variables in the regression
should be appropriate given the nature of the variables.
• The model should be parsimonious.
• The model should be examined for violations of the regression
assumptions before being accepted.
• The model should be tested and be found useful out of sample
before being accepted.
If a regression is misspecified, then the estimated regression
coefficients may be inconsistent and statistical inference invalid.
Three types of functional specification errors in a regression are:


• Omitted variables
• Not transforming the variables before using them in a regression
• Pooling data from different samples that should not have been
pooled
Qualitative dependent variables
Qualitative dependent variables are dummy variables used as
dependent variables instead of independent variables. For example,
bankrupt or not bankrupt.
Probit (based on normal distribution) and Logit (based on logistic
distribution) models estimate the probability of a discrete outcome
given the values of the independent variables used to explain that
outcome.
Machine learning and distinction between supervised and
unsupervised learning
Machine learning (ML) is a subset of artificial intelligence (AI),
where machines are programmed to improve performance in
specified tasks with experience.
Formal definition: A computer program is said to learn from
experience E with respect to some class of tasks T and performance
measure P if its performance at tasks in T, as measured by P, improves
with experience E. (Mitchell, 1997)
Supervised learning is machine learning that makes use of labeled
training data.
Formal definition: “Supervised learning is the process of training an
algorithm to take a set of inputs X and find a model that best relates
them to the output Y.”
Unsupervised learning is machine learning that does not make use
of labeled training data. Several input variables are used for
analysis but no output (or target variable) is provided.
Different types of machine learning algorithms

Penalized regression
• It is a computationally efficient technique used in prediction
problems.
• The regression coefficients are chosen to minimize the sum of
squared residuals plus a penalty term that increases with the
number of independent variables.
• Because of this penalty, the model remains parsimonious and
only the most important variables for explaining Y remain in
the model.
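
LASSO is one common penalized-regression technique, in which the
penalty is the sum of the absolute values of the coefficients; a
scikit-learn sketch in which only two of five simulated features
actually matter:

```python
import numpy as np
from sklearn.linear_model import Lasso

# Made-up data: 5 candidate features, only the first two matter
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(size=200)

# The L1 penalty shrinks unhelpful coefficients toward exactly zero,
# keeping the model parsimonious
model = Lasso(alpha=0.1).fit(X, y)
print(model.coef_)   # coefficients on the 3 noise features end up at or near 0
```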
CART (classification and regression trees)
• It can be applied to predict either a categorical or continuous
target variable.
• If we are predicting a categorical target variable, then a
classification tree is produced.
• Whereas, if we are predicting a continuous outcome, then a
regression tree is produced.
Random forests
• A random forest classifier is a collection of classification trees.
• Instead of just one classification tree, several classification
trees are built based on a random selection of features.
• Random forests are an example of ensemble learning, whereby
the signal-to-noise ratio is improved because errors cancel each
other out across the collection of classification trees.
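
A scikit-learn sketch of a random forest classifier on simulated
labeled data; all parameter values are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Simulated labeled data standing in for, e.g., default / no-default
X, y = make_classification(n_samples=500, n_features=10, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# 100 classification trees, each considering a random subset of features
forest = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                                random_state=1).fit(X_train, y_train)
print(f"Out-of-sample accuracy: {forest.score(X_test, y_test):.2f}")
```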
Neural networks
• They are applied to a variety of tasks characterized by
nonlinearities and interactions among variables.
• Neural networks consist of three layers: an input layer, hidden
layer(s), and an output layer.
• Learning takes place through improvements in weights applied
to nodes.
• Neural networks with more than 20 hidden layers are called
‘deep learning nets (DLNs)’.
Clustering algorithms
• They group data only on the basis of information contained in
the data.
• Two approaches are: bottom-up clustering and top-down
clustering.
o With a bottom-up approach we start with each observation
being its own cluster, and then group the observations into
larger, non-overlapping clusters based on some
characteristics.
o With a top-down approach, we start with all observations
belonging to a single cluster, and then partition this into
smaller and smaller clusters.
• One example of a clustering algorithm is the ‘K-means’
algorithm. This is a bottom-up clustering algorithm. It is based
on two concepts:
o Centroid – the central point of each cluster.
o Metric: a measure of distance between two points.
The core idea is that the centroids should be selected such that
the distances of the observations from their centroids are
minimized.
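
A K-means sketch with scikit-learn; the two-dimensional
observations are simulated so that two clusters exist by
construction:

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up 2-D observations (e.g., stocks described by two features)
rng = np.random.default_rng(3)
data = np.vstack([rng.normal(0, 0.5, size=(50, 2)),
                  rng.normal(3, 0.5, size=(50, 2))])

# k = 2 clusters; the algorithm alternates between assigning points to
# the nearest centroid and recomputing each centroid
km = KMeans(n_clusters=2, n_init=10, random_state=3).fit(data)
print("Centroids:", km.cluster_centers_)
print("First ten labels:", km.labels_[:10])
```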
Dimension reduction
• In dimension reduction we want to reduce a set of features to a
manageable size while retaining as much of the variation in the
data as possible.
• Principal component analysis (PCA) is one type of a dimension
reduction method.
o Here we come up with the first principal component, which
explains the largest share of the volatility in the data and
therefore represents the most important factor.
o Then we come up with the second principal component
which extracts as much of the remaining volatility as
possible.
o Additional principal components are added as needed.
However, an important restriction is that the principal
components are uncorrelated with each other.
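
A PCA sketch with scikit-learn on simulated return data driven by
one common factor:

```python
import numpy as np
from sklearn.decomposition import PCA

# Made-up data: 10 correlated return series, 250 observations
rng = np.random.default_rng(5)
common = rng.normal(size=(250, 1))
returns = common + 0.3 * rng.normal(size=(250, 10))

pca = PCA(n_components=3).fit(returns)
# Share of total variance explained by each (uncorrelated) component;
# the first component should dominate, given the common factor
print(pca.explained_variance_ratio_)
```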
Steps in model training
It is important to understand the difference in emphasis between
model specification and machine learning.
• In model specification, the emphasis is on connecting the model
to economic reasoning.
• In machine learning, the focus is not on economic reasoning;
instead, the emphasis is on improving prediction accuracy.
Process to train ML models
This process involves five steps:
1. Specify the ML technique/algorithm.
2. Specify the associated hyperparameters.
3. Divide data into training and validation samples.
4. Evaluate learning with performance measure P, using the
validation sample, and tune the hyperparameters.
5. Repeat the training cycle the specified number of times or until
the required performance level is obtained.

R9 Time-Series Analysis
Time series
A time series is a set of observations on a variable measured over
different time periods. A time series model allows us to make
predictions about the future values of a variable.
Linear vs log-linear trend models
• When the dependent variable changes at a constant amount with
time, a linear trend model is used.
The linear trend equation is given by: yt = b0 + b1t + εt, t = 1, 2, …, T
• When the dependent variable changes at a constant rate (grows
exponentially), a log-linear trend model is used.
The log-linear trend equation is given by: ln yt = b0 + b1t + εt,
t = 1, 2, …, T
• A limitation of trend models is that, by nature, they tend to
exhibit serial correlation in the errors, which makes them
unreliable for forecasting.
• The Durbin–Watson statistic is used to test for serial
correlation. If this statistic differs significantly from 2, then we
can conclude the presence of serial correlation in the errors. To
overcome this problem, we use autoregressive (AR) time-series
models.


Requirements for a time series to be covariance stationary


AR models can only be used for time series that are covariance
stationary.
A time-series is covariance stationary if it meets the following three
conditions:
• The expected value of the time series (its mean) must be
constant and finite in all periods.
• The variance must be constant and finite in all periods.
• The covariance of the time series with leading or lagged values
of itself must be constant and finite in all periods.
Autoregressive (AR) models
An autoregressive time series model is a linear model that predicts
its current value using its most recent past value as the independent
variable. An AR model of order p, denoted by AR(p) uses p lags of a
time series to predict its current value.
xt = b0 + b1 xt−1 + b2 xt−2 + … + bp xt−p + εt
The chain rule of forecasting is used to predict successive
forecasts.
The one-period-ahead forecast of xt from an AR(1) model is:
x̂t+1 = b̂0 + b̂1 xt
x̂t+1 can then be used to forecast the two-period-ahead value:
x̂t+2 = b̂0 + b̂1 x̂t+1
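
A sketch of fitting an AR(1) model and applying the chain rule,
using statsmodels' AutoReg on a series simulated with b0 = 1 and
b1 = 0.6:

```python
import numpy as np
from statsmodels.tsa.ar_model import AutoReg

# Simulated covariance-stationary AR(1) series: x_t = 1 + 0.6 x_{t-1} + e_t
rng = np.random.default_rng(11)
x = np.zeros(300)
for t in range(1, 300):
    x[t] = 1.0 + 0.6 * x[t - 1] + rng.normal(scale=0.5)

fit = AutoReg(x, lags=1).fit()
b0, b1 = fit.params            # [intercept, lag-1 coefficient]

# Chain rule of forecasting
x1 = b0 + b1 * x[-1]           # one-period-ahead forecast
x2 = b0 + b1 * x1              # two-period-ahead forecast uses x1
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, x(t+1) = {x1:.2f}, x(t+2) = {x2:.2f}")
```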
Autocorrelations of the residuals
• If an AR model has been correctly specified, the residual terms
will not exhibit serial correlation. We cannot use the Durbin–
Watson statistic to test for serial correlation in AR models.
Instead, we check the autocorrelations of the residuals.
• The autocorrelations of the residuals are the correlations of the
residuals with their own past values. The autocorrelation
between one residual and another one at lag k is known as the kth
order autocorrelation.
• If the model is correctly specified, the autocorrelation at all lags
must be equal to 0.
• A t-test is used to test whether the error terms in a time series are
serially correlated.
Test stat = residual autocorrelation / standard error
where the standard error is 1/√T and T is the number of
observations in the time series.
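
A sketch of this residual-autocorrelation check; the "residuals" here
are simulated white noise, so no lag should come out significant:

```python
import numpy as np
from statsmodels.tsa.stattools import acf

# Simulated residual series standing in for the residuals of an AR model
rng = np.random.default_rng(13)
resid = rng.normal(size=300)
T = len(resid)

autocorr = acf(resid, nlags=4)    # lag-0 through lag-4 autocorrelations
se = 1.0 / np.sqrt(T)             # standard error of each autocorrelation
for k in range(1, 5):
    t_stat = autocorr[k] / se
    flag = "significant" if abs(t_stat) > 2 else "not significant"
    print(f"lag {k}: autocorr = {autocorr[k]:+.3f}, t = {t_stat:+.2f} ({flag})")
```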
Mean reversion
A time series is said to be mean-reverting if it tends to fall when its
level is above its mean and rise when its level is below its mean. If a
time series is covariance stationary, then it will be mean reverting.
The mean reverting level is calculated as:
Mean-reverting level: xt = b0 / (1 − b1)
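For example, if an AR(1) model is estimated as xt = 1.0 + 0.6 xt−1,
the mean-reverting level is 1.0 / (1 − 0.6) = 2.5: forecasts starting
above 2.5 decline toward it, and forecasts starting below it rise
toward it.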
In-sample and out-of-sample forecasts, root mean squared
error criterion (RMSE)
There are two types of forecasting errors based on the period used
to predict values:
• In-sample forecast errors: these are residuals from a fitted
time-series model, used to predict values within the sample period.
• Out-of-sample forecast errors: these are regression errors
from the estimated model used to predict values outside the
sample period. Out-of-sample analysis is a realistic way of
testing the forecasting accuracy of a model and aids in the
selection of a model.

Root mean squared error (RMSE), the square root of the average
squared forecast error, is used to compare the out-of-sample
forecasting performance of the models. If two models are being
compared, the one with the lower RMSE for out-of-sample forecasts
has better forecast accuracy.
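
A minimal RMSE comparison of two sets of out-of-sample forecasts;
all numbers are made up:

```python
import numpy as np

actual     = np.array([1.2, 0.8, 1.5, 1.1, 0.9])
forecast_a = np.array([1.0, 0.9, 1.4, 1.3, 0.8])
forecast_b = np.array([1.5, 0.5, 1.9, 0.7, 1.3])

rmse_a = np.sqrt(np.mean((actual - forecast_a) ** 2))
rmse_b = np.sqrt(np.mean((actual - forecast_b) ** 2))
# The model with the lower out-of-sample RMSE has better forecast accuracy
print(f"RMSE A = {rmse_a:.3f}, RMSE B = {rmse_b:.3f}")
```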
Instability of coefficients
Estimates of regression coefficients of the time-series model can
change substantially across different sample periods used for
estimating the model. When selecting a time period:
• Determine whether the underlying economics or environment has changed.
• Look at graphs of the data to see if the time series looks
stationary.
Most economic and financial time series data are not stationary.

Random walk
A random walk is a time series in which the value of the series in
one period is the value of the series in the previous period plus an
unpredictable random error.
The equation for a random walk without a drift is:
xt = xt−1 + εt
The equation for a random walk with a drift is:
xt = b0 + xt−1 + εt
Random walks do not have a mean-reverting level and are therefore
not covariance stationary. Currency exchange rates, for example,
often follow a random walk.
Unit root
• For an AR (1) model to be covariance stationary, the absolute
value of the lag coefficient b1 must be less than 1. When the
absolute value of b1 is 1, the time series is said to have a unit root.
• All random walks have unit roots. If the time series has a unit
root, then it will not be covariance stationary.
• A random-walk time series can be transformed into one that is
covariance stationary by first differencing the time series. We
define a new variable y as follows:
yt = xt − xt−1 = εt, where E(εt) = 0, E(εt²) = σ², and E(εt εs) = 0 for t ≠ s
• We can then use an AR model on the first-differenced series.
Unit root test
We can detect the unit root problem by using the Dickey-Fuller test.
It is a unit root test based on a transformed version of the AR (1)
model xt = b0 + b1 xt−1 + εt
Subtracting xt−1 from both sides, we get:
xt − xt−1 = b0 + (b1 − 1) xt−1 + εt, or
xt − xt−1 = b0 + g1 xt−1 + εt
where g1 = b1 − 1.
If b1 = 1, then g1 = 0. This means there is a unit root in the model.
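
A sketch of the unit-root test using statsmodels' adfuller (an
augmented Dickey-Fuller test) on a simulated random walk, before and
after first-differencing:

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

# A simulated random walk should fail to reject the unit-root null
rng = np.random.default_rng(17)
random_walk = np.cumsum(rng.normal(size=500))

adf_stat, p_value = adfuller(random_walk)[:2]
print(f"ADF stat = {adf_stat:.2f}, p-value = {p_value:.3f}")

# First-differencing transforms it into a covariance-stationary series
diff = np.diff(random_walk)
adf_stat, p_value = adfuller(diff)[:2]
print(f"After differencing: ADF stat = {adf_stat:.2f}, p-value = {p_value:.3f}")
```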
Seasonality
If the error term of a time-series model shows significant serial
correlation at seasonal lags, the time-series has significant
seasonality. This means there is significant information in the error
terms that is not being captured by the model.
Seasonality can be corrected by including a seasonal lag in the
model. For instance, to correct for seasonality in a quarterly time
series, modify the AR(1) model to include a seasonal lag at lag 4:
xt = b0 + b1 xt−1 + b2 xt−4 + εt
If the revised model shows no statistically significant
autocorrelation in the error terms, then the model has been
corrected for seasonality.
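
A sketch of fitting this seasonal-lag model with statsmodels'
AutoReg, which accepts a list of lags; the quarterly series is
simulated with a genuine lag-4 component:

```python
import numpy as np
from statsmodels.tsa.ar_model import AutoReg

# Simulated quarterly series with a seasonal (lag-4) component
rng = np.random.default_rng(19)
x = np.zeros(200)
for t in range(4, 200):
    x[t] = 0.5 + 0.4 * x[t - 1] + 0.3 * x[t - 4] + rng.normal(scale=0.2)

# Include lags 1 and 4, per the corrected model above
fit = AutoReg(x, lags=[1, 4]).fit()
print(fit.params)   # [intercept, lag-1 coefficient, lag-4 coefficient]
```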
Autoregressive conditional heteroskedasticity (ARCH)
• If the variance of the error in a time series depends on the
variance of the previous errors, then this condition is called
autoregressive conditional heteroskedasticity (ARCH).
• If ARCH exists, the standard errors for the regression parameters
will not be correct. We will have to use generalized least squares
or other methods that correct for heteroskedasticity.
• To test for first-order ARCH, we regress the squared residual on
the squared residual from the previous period:
ε̂t² = a0 + a1 ε̂t−1² + ut
If the coefficient a1 is statistically significant, the time-series
model has ARCH(1) errors.
• If a time-series model has significant ARCH, then we can predict
the next period’s error variance using the formula:
σ̂t+1² = â0 + â1 ε̂t²
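
A sketch of the ARCH(1) test as a plain OLS regression of squared
residuals on their lagged values; the residual series is simulated
for illustration:

```python
import numpy as np
import statsmodels.api as sm

# Simulated residuals standing in for the residuals of a time-series model
rng = np.random.default_rng(23)
resid = rng.normal(size=300)

eps2 = resid ** 2
X = sm.add_constant(eps2[:-1])     # lagged squared residual
fit = sm.OLS(eps2[1:], X).fit()

a0, a1 = fit.params
print(f"a1 = {a1:.3f}, t-stat = {fit.tvalues[1]:.2f}")
# If a1 is significant, next period's variance forecast is a0 + a1 * eps2[-1]
print(f"Forecast variance = {a0 + a1 * eps2[-1]:.3f}")
```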
Working with two time series
If a linear regression is used to model the relationship between two
time series, a test such as the Dickey-Fuller test should be
performed to determine whether either time series has a unit root.
• If neither of the time series has a unit root, then we can safely
use linear regression.
• If one of the two time series has a unit root, then we should not
use linear regression.
• If both time series have a unit root and they are cointegrated
(exposed to the same macroeconomic variables), we may
safely use linear regression.
• If both time series have a unit root but are not cointegrated,
then we cannot use linear regression.
The Engle–Granger/Dickey–Fuller test is used to determine whether
two time series are cointegrated.
Selecting an appropriate time- series model
Section 12 from the curriculum provides a step-by-step guide on
selecting an appropriate time-series model.
1. Understand the investment problem you have, and make an
initial choice of model. One alternative is a regression model that
predicts the future behavior of a variable based on hypothesized
causal relationships with other variables. Another is a time-series
model that attempts to predict the future behavior of a variable
based on the past behavior of the same variable.
2. If you have decided to use a time-series model, compile the time
series and plot it to see whether it looks covariance stationary.
The plot might show important deviations from covariance
stationarity, including the following:
• a linear trend;
• an exponential trend;
• seasonality; or
• a significant shift in the time series during the sample period
(for example, a change in mean or variance).
3. If you find no significant seasonality or shift in the time series,
then perhaps either a linear trend or an exponential trend will be
sufficient to model the time series. In that case, take the following
steps:
• Determine whether a linear or exponential trend seems most
reasonable (usually by plotting the series).

• Estimate the trend.
• Compute the residuals.
• Use the Durbin–Watson statistic to determine whether the
residuals have significant serial correlation. If you find no
significant serial correlation in the residuals, then the trend
model is sufficient to capture the dynamics of the time series
and you can use that model for forecasting.
4. If you find significant serial correlation in the residuals from the
trend model, use a more complex model, such as an
autoregressive model. First, however, reexamine whether the
time series is covariance stationary. Following is a list of
violations of stationarity, along with potential methods to adjust
the time series to make it covariance stationary:
• If the time series has a linear trend, first-difference the time
series.
• If the time series has an exponential trend, take the natural log
of the time series and then first-difference it.
• If the time series shifts significantly during the sample period,
estimate different time-series models before and after the shift.
• If the time series has significant seasonality, include seasonal
lags (discussed in Step 7).
5. After you have successfully transformed a raw time series into a
covariance-stationary time series, you can usually model the
transformed series with a short autoregression. To decide which
autoregressive model to use, take the following steps:
• Estimate an AR(1) model.
• Test to see whether the residuals from this model have
significant serial correlation.
• If you find no significant serial correlation in the residuals, you
can use the AR(1) model to forecast.

6. If you find significant serial correlation in the residuals, use an
AR(2) model and test for significant serial correlation of the
residuals of the AR(2) model.
• If you find no significant serial correlation, use the AR(2)
model.
• If you find significant serial correlation of the residuals, keep
increasing the order of the AR model until the residual serial
correlation is no longer significant.
7. Your next move is to check for seasonality. You can use one of two
approaches:
• Graph the data and check for regular seasonal patterns.
• Examine the data to see whether the seasonal autocorrelations
of the residuals from an AR model are significant (for example,
the fourth autocorrelation for quarterly data) and whether the
autocorrelations before and after the seasonal autocorrelations
are significant. To correct for seasonality, add seasonal lags to
your AR model. For example, if you are using quarterly data,
you might add the fourth lag of a time series as an additional
variable in an AR(1) or an AR(2) model.
8. Next, test whether the residuals have autoregressive conditional
heteroskedasticity. To test for ARCH(1), for example, do the
following:
• Regress the squared residual from your time-series model on a
lagged value of the squared residual.
• Test whether the coefficient on the squared lagged residual
differs significantly from 0.
• If the coefficient on the squared lagged residual does not differ
significantly from 0, the residuals do not display ARCH and you
can rely on the standard errors from your time-series
estimates.

• If the coefficient on the squared lagged residual does differ
significantly from 0, use generalized least squares or other
methods to correct for ARCH.
9. Finally, you may also want to perform tests of the model’s out-of-
sample forecasting performance to see how the model’s out-of-
sample performance compares to its in-sample performance.
Using these steps in sequence, you can be reasonably sure that your
model is correctly specified.

R10 Simulations
Steps in running a simulation
The four major steps used to run a simulation are as follows (a
short sketch in code appears after the list):
1. Determine probabilistic variables.
2. Define probability distributions for these variables.
3. Check for correlation across variables.
4. Run the simulation.
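
A minimal Monte Carlo sketch of these four steps; the probabilistic
variables, their distributions, and the valuation rule are all
made-up assumptions:

```python
import numpy as np

rng = np.random.default_rng(29)
n_trials = 10_000

# Steps 1-2: choose the probabilistic variables and their distributions
# (here, revenue growth ~ normal and margin ~ uniform, both assumed)
growth = rng.normal(0.05, 0.02, size=n_trials)
margin = rng.uniform(0.10, 0.20, size=n_trials)

# Step 3: the two inputs are drawn independently here; correlated draws
# would use, e.g., rng.multivariate_normal with a covariance matrix
# Step 4: run the simulation - a toy one-period earnings estimate
revenue0 = 100.0
earnings = revenue0 * (1 + growth) * margin

print(f"Mean = {earnings.mean():.2f}, 5th-95th pct = "
      f"({np.percentile(earnings, 5):.2f}, {np.percentile(earnings, 95):.2f})")
```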
Three ways to define the probability distributions for a
simulation’s variables
Three ways to define a probability distribution for a simulation’s
variables are:
1. Historical data
2. Cross sectional data
3. Statistical distribution and parameters (subjective estimation)
How to treat correlation across variables in a simulation
If the input variables are correlated with each other, there are two
ways to treat the correlation:
• Pick the input that has the greatest impact on value and drop
the other input.
• Build the correlation explicitly into the simulation.
Advantages of using simulations in decision making
The advantages of using simulations in decision making are as
follows:
• It encourages better input estimation.
• Instead of a point estimate, we get a distribution of expected
values, which is more informative.
Common constraints introduced into simulations
Some of the common constraints introduced into simulations are as
follows:
• Book value constraints – For example, regulatory capital
restrictions and negative book value for equity.
• Earnings and cash flow constraints – They may be imposed
internally or externally.
• Market value constraints – The equity value of a firm is the
firm value minus the value of debt, where firm value is
calculated by discounting expected cash flows at an appropriate
discount rate.
Issues in using simulations in risk assessment
The issues in using simulations in risk assessment are as follows:
• Garbage in, garbage out - The quality of the output depends
on the quality of each input
• Real data may not fit assumed distributions.
• Non-stationary distributions - The type of distribution may
change with time due to shifts in market structure.
• Changing correlation across inputs - Correlations may vary
with time and may not be predictable.

We should be careful not to double count risk. If we are comparing
two assets based on the variability of simulated values, then it is
appropriate to use the risk-free rate to discount the cash flows of
both assets. Using risk-adjusted discount rates would double count
risk.
Comparison of scenario analysis, decision trees, and
simulations
The table below summarizes which probabilistic approach to use
based on the type of risk and whether the risk elements are
correlated or not:
Discrete/Continuous   Correlated/Independent   Sequential/Concurrent   Risk Approach
Discrete              Independent              Sequential              Decision tree
Discrete              Correlated               Concurrent              Scenario analysis
Continuous            Either                   Either                  Simulations
The table below summarizes whether the three probabilistic
approaches can be used as complements or substitutes for risk-
adjusted value.
Approach            Complements Risk-Adjusted Value   Substitute for Risk-Adjusted Value
Scenario analysis   Yes                               No
Decision tree       Yes                               Yes
Simulations         Yes                               Yes

© IFT. All rights reserved.
