
NESUG 16 Statistics, Data Analysis & Econometrics

ST004

Using Cluster Analysis for Retail Portfolio Segmentation in an Economic Capital Model: Homogeneity by Common Default Behavior over Time

Shannon Kelly, Federal Reserve Bank of Philadelphia¹
Philadelphia, PA

ABSTRACT
For credit risk management in a retail banking portfolio, cluster analysis can be used to segment exposures into pools by common default probabilities over the next time horizon, usually one year. These default probabilities vary over time conditional upon the coincident economy, providing a dynamic segmentation scheme by near-term default behavior. For an economic capital model it may be preferable to create segments that remain stable over an economic cycle. Consider measuring default behavior by the sample paths of conditional default probabilities over multiple one-year time horizons. Hierarchical cluster analysis methods in PROC CLUSTER are employed to group exposures by similar sample paths, using distance measures defined to distinguish between these time trends.
It is assumed that one-year default probabilities are functionally related to coincident economic drivers. These drivers are taken from real economic time series data, and several sample paths of default probabilities are constructed from observed functional relationships between default probabilities and key economic drivers within different products and credit qualities. A simulation study is conducted to benchmark the results of the cluster analysis methods over the simulated time series processes, with several iterations to assess the stability of the cluster analysis results over different sample path realizations.

INTRODUCTION
The calculation of both regulatory and economic capital assessments for retail exposures in a bank portfolio typically
relies upon a segmentation scheme separating exposures/accounts into pools that are homogeneous with respect to
default behavior. These pools then form the basis for estimating a probability of default (PD) for loans in the pool. For
the purposes of this study, homogeneity in the context of a capital model is defined as a common path of PD values over multiple subsequent time intervals. This study examines a potential method of exploratory data analysis, employing the hierarchical clustering methods available in PROC CLUSTER, for creating a segmentation scheme under this definition of homogeneity.
Banks often develop internal economic capital measurements to cover their losses due to defaulted loans up to the
tails of the probability distribution of default, usually up to 99.97% of the possible outcomes. As the capital requirement for
a pool is measured from the tail event of this probability distribution, for capital to be correctly specified the retail
exposures must be pooled into a homogeneous group as defined. However, there is no generally accepted definition of
homogeneity that forms an objective basis for determining appropriate segmentation. This study theoretically defines
homogeneity as exposures within a pool having a common relationship between the PD and exogenous economic drivers
over the range of possible outcomes. This definition of homogeneity allows for the development of objective measures of whether the PDs over all sub-pools of a segment follow a common probability distribution of default conditional upon the economic environment.
Given a portfolio of accounts, it would be impossible to directly observe the pools of retail exposures that have different distributions of the PD conditional upon the economic environment. However, one observable trait of a common probability distribution of default would be the occurrence of similar sample paths of PDs over time among sub-pools of retail exposures, whereas a large difference between sample paths would indicate that the sub-pools have different distributions. A sample path is defined as a time series of one-year PDs for a sub-pool of the population measured at each month.²
A bank could divide its portfolio into fine cross-sectional units defined by borrower and product traits, allowing for
sufficient sample size within each unit to calculate a reliable PD estimate at each month in the time series, and then
calculate sample paths over each unit to create a cross-sectional time series data set. With such a fine breakdown of the
portfolio the difference between some sample paths would be small and the associated units aggregated, while other
dissimilar sample paths would lead to different aggregations of cross-sectional units. To utilize hierarchical clustering methods for this purpose, a distance/difference measure is defined to quantify the difference between two sample paths. Hierarchical cluster analysis aggregates sample paths of monthly PDs by minimizing the distances within the clusters formed. The resulting clusters would each be associated with an aggregation of cross-sectional borrower and product traits, and these traits would be used to define pools within a portfolio segmentation scheme.
Bank data of sufficient length and detail were not available to calculate sample paths for an actual bank portfolio.

¹ Any views expressed represent those of the author only and not necessarily those of the Federal Reserve Bank of Philadelphia or the Federal Reserve System.
² The one-month default probability within a subgroup is measured at each month as the dollar amount of accounts that default during that month divided by the total dollar value of all accounts at the end of that month. To calculate an equivalent one-year PD, the one-month PD is multiplied by 12 to reflect the default rate that would occur over a 12-month period (for example, a one-month PD of 0.25% corresponds to a one-year PD of 3%).


A cross-sectional time series for a portfolio was therefore simulated by fitting several short time series of bank default data to potential economic drivers of default, via regressions of actual monthly one-year default rates on actual economic drivers (or transformations thereof). The fitted relationships were used to create simulated sample paths of PDs over an eight-year period (1995-2002), using only the information about the economic drivers of default and not the original default data. The details of the data construction are discussed in the following section titled Data Simulation.
A discussion of hierarchical cluster analysis methods is contained in the section titled Clustering Methods. Two different methods of hierarchical cluster analysis (average linkage and Ward's minimum variance) are used to compare the clustering results of two different strategies for aggregating sample paths. Two distance measures are used in this analysis. Both distances are intended to capture the shape and scale of the sample paths, with the second measure adjusted to improve the separation by differences in shape. The utilization of these methods to create a portfolio segmentation scheme is discussed in the subsequent sections.
An important step in the use of hierarchical cluster analysis is the selection of the number of clusters. The section titled Number of Clusters discusses a method for determining an appropriate number of clusters. In choosing the number of clusters, inhomogeneous groups should not be aggregated. However, there is a significant cost to managing large numbers of pools, many of which might not be significantly different. The goal is to aggregate to a level at which, within some margin of error, the sample paths of PDs within a cluster are similar over the economic cycle.
Clusters are also formed within shorter time windows of both five and three years that move forward by one year each time, in order to analyze the stability of clusters formed over different realizations of the sample paths.³
titled Stability of Clustering Algorithm over Time discusses the details of this analysis. The time stability of certain clusters
is used to evaluate the process of developing a segmentation scheme, specifically whether certain clusters formed on the
full eight-year time series should be disaggregated.
The various results of the clustering methodologies are compared by inspection of the actual graphs of the sample paths. The results indicate that hierarchical clustering methods can be used to develop a reasonable segmentation scheme, and that the number of clusters selected generally leads to a reasonable aggregation of similar sample paths and the separation of dissimilar sample paths. The relative quality of the resulting segmentation scheme, in correctly clustering homogeneous observations, appears to depend more upon the distance measure used than upon the method of clustering.

DATA SIMULATION
The motivation for using linear regression to fit relationships between the PDs and economic drivers follows from the proposed Basel IRB standards for regulatory capital covering credit risk. In these standards a measure of the probability of default is mapped to the tails of a probability distribution of default using a regulatory capital model. This measure is an unconditional probability of default, or the expectation over the distribution of default probabilities conditional upon the economic environment.⁴ The paper by Michael Gordy (2001), upon which the model is based, specifies a linear relationship between the conditional default probability and economic drivers.
The SAS procedure PROC REG was used to fit the available time series of one-year PDs to economic drivers. A logit transformation of the time series of PDs provided an improved fit in the regression models. The regressions relate the dependent variable logit(PD) to independent variables of economic factors, with the fitted PDs taking the form

PD(i|Xt) = PDi · exp[ ∑(k=1..K) wik·x(k,t-1) ] / ( 1 + exp[ ∑(k=1..K) wik·x(k,t-1) ] ),

where Xt = (x(1,t), ..., x(K,t)) is the vector of economic variables at time t, PD(i|Xt) is the probability of default for pool i conditional upon the economic environment at time t, and the wik are the coefficients specifying the linear combination. PDi is the unconditional default probability for pool i, essentially a smoothed measure over the sample path of PDs.
Several time series of actual PDs were available, and for each time series the logit(PD) was fit to economic drivers via regression analysis. The equations were fit by stepwise regression and selected for the highest R², with standard regression diagnostics, goodness of fit tests and tests for collinearity. It should be noted here that the intent of this analysis was exploratory, to produce plausible sample paths of default probabilities as determined by relationships to economic drivers, not to produce predictive models of default. It is not likely that such a model could even result from such a regression analysis. An example sample path is

logit(PD) = log[PD/(1-PD)] = -77.1 + exp[Initial Unemployment Claimants]*(2.60E-29) +
exp[Fed Funds Target Rate (%)]*0.000995 + [Natural Gas Price]^4*(-8.57E-6) +
log[Consumer Price Index-Urban]*13.8 + [Consumer Expectations]^3*(1.90E-7).

³ Specifically, clusters are formed over the five-year windows of years 1-5, 2-6, 3-7 and 4-8, and over the three-year windows 1-3, 2-4, 3-5, 4-6, 5-7 and 6-8.
⁴ The unconditional default probability is PD = E[PD|Xt], where Xt is a realization of economic factors at time t.
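
As a rough illustration of how such a fitted equation generates a simulated path, the DATA step below applies the example equation to a monthly series of drivers and inverts the logit to recover the PD. The dataset econ_drivers and its variable names are hypothetical, and the drivers are assumed to be scaled exactly as in the fitted equation above; this is a sketch, not the original simulation code.

data pd_path;
    set econ_drivers;   * assumed: one observation per month, 1995-2002, with the driver series;
    logit_pd = -77.1 + exp(init_claims)*2.60E-29
                     + exp(ff_target)*0.000995
                     - (natgas**4)*8.57E-6
                     + log(cpi_u)*13.8
                     + (cons_expect**3)*1.90E-7;
    PD = exp(logit_pd)/(1 + exp(logit_pd));   * invert the logit to recover the PD;
run;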


For some of the available time series, more than one regression equation produced an acceptable fit, using slightly different transformations of the economic factors and/or additional independent variables. This provided a dataset with some very similar sample paths that would be expected to be aggregated into the same cluster, while sample paths determined by very different fitted equations would be captured in separate clusters. The data set input to the cluster procedure consists of 46 sample paths labeled PD_1 to PD_46, as defined by 46 different fitted regressions. The sample paths are plotted in Appendix A, with somewhat similar sample paths plotted together in the same graph. The sample paths were determined entirely by the economic drivers from 1995 to 2002, or 96 months of data, resulting in 46 PD time series each of length 96.

Economic Factors Used


All Loans: Total Past Due, U.S. (SA, %)
Loan Charge-Off Rate: Credit Cards, Residential Real Estate Loans and Other Consumer Loans
Nonbusiness Bankruptcy Filings, U.S. (Units)
Consumer Debt Service Payments & Mortgage Debt Service Payments as % of Disposable Personal Income
Natural Gas Price, Henry Hub, LA ($/mmbtu) & Consumer Price Index-Urban by Region
Federal Open Market Committee: Fed Funds Target Rate (%)
University of Michigan: Current Economic Conditions & Consumer Expectations
Unemployment Rate Full-time Labor Force (SA, %) & Average Duration of Unemployment (SA, Weeks)
Unemployment Insurance: Benefits Paid & Total Initial Claimants (Persons)
Mass Layoff Events: Total All Industries (Number)
Employment, Total Nonfarm, U.S. (Thous)

SAS Code for Regressions


proc reg data=&dataset.; by &group.;
model logit_PD = &econvars. / selection = stepwise slstay = 0.10 slentry = 0.10 corrb;
run;
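
The macro variables are resolved at call time; a hypothetical set of assignments (dataset and variable names illustrative only, with the input data assumed sorted by the BY variable) might be:

%let dataset  = pd_history;    * monthly logit(PD) series by cross-sectional unit;
%let group    = unit_id;       * cross-sectional unit identifier;
%let econvars = init_claims ff_target natgas cpi_u cons_expect;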

CLUSTERING METHODS
Hierarchical clustering involves the pairwise aggregation of units of observations, using a distance measure between observations, in a stepwise process starting with each observation forming its own cluster and ending with all observations forming a single cluster. The method of clustering is determined by the criterion by which observations are aggregated. At each step in the process, information is lost when observations or clusters are aggregated, and the average distance between observations within the aggregated clusters increases.
The clustering method of average linkage was chosen as it tends to minimize the variation within clusters but does not necessarily form clusters of equal size; there may be several similar sample paths and a few very different sample paths that should be clustered separately. Ward's minimum variance was also chosen as a comparison to average linkage, as this method minimizes the increase in the sum of squared error (or loss of information) at each step in the clustering process and, in contrast to average linkage, tends to form clusters of equal size. Though the results may differ slightly by clustering method, the methods may produce equally viable segmentation schemes in practice.
The clustering process for the method of average linkage aggregates clusters based upon the average of all pairwise distances between the members of the two clusters. Given a distance measure between two sample paths m and n, d(PDm, PDn), the difference between two clusters i and j of sizes Ni and Nj is defined as:

D(Ci, Cj) = [ ∑(m in Ci) ∑(n in Cj) d(PDm, PDn) ] / (Ni·Nj).
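
As a concrete illustration, the PROC IML fragment below computes this between-cluster distance for a small hand-built distance matrix; the matrix values and the cluster memberships ci and cj are invented for the example.

proc iml;
* hypothetical 5x5 matrix of d(PDm, PDn) values;
d = {0.00 0.10 0.12 0.40 0.45,
     0.10 0.00 0.09 0.38 0.41,
     0.12 0.09 0.00 0.39 0.44,
     0.40 0.38 0.39 0.00 0.08,
     0.45 0.41 0.44 0.08 0.00};
ci = {1 2 3};  cj = {4 5};                     * members of clusters i and j;
Dij = sum(d[ci, cj]) / (ncol(ci)*ncol(cj));    * average of the 6 cross-cluster distances;
print Dij;                                     * (0.40+0.45+0.38+0.41+0.39+0.44)/6 = 0.412;
quit;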

For Ward's minimum variance method, clusters are joined by minimizing the increase in the total ANOVA sum of squares (or the loss of information) when aggregating two clusters. The difference between two clusters i and j is defined as:

D(Ci, Cj) = d(PDi*, PDj*)² / (1/Ni + 1/Nj),

where PDi* and PDj* denote the mean sample paths over clusters i and j.


Two distance measures are employed in this study. First, the results using the root mean square distance measure are presented, which is defined as

d(PDi, PDj) = sqrt[(PDi − PDj)′(PDi − PDj)/T] = sqrt[ ∑(t=1..T) (PDit − PDjt)² / T ],

where the vectors PDi and PDj each contain a time series of T PD estimates. A second distance measure was defined to further emphasize more subtle differences in shape between two sample paths, specifically to scale the distance up for two sample paths that may be close in value but negatively correlated, and vice versa. This distance measure is termed the correlation adjusted distance measure and is defined as

d(PDi, PDj) = (1 − ρij) · sqrt[(PDi − PDj)′(PDi − PDj)/T],

where

ρij = [ ∑(t=1..T) (PDit − mean(PDi))·(PDjt − mean(PDj)) / T ] / sqrt{ [ ∑(t=1..T) (PDit − mean(PDi))² / T ] · [ ∑(t=1..T) (PDjt − mean(PDj))² / T ] }

is the pairwise correlation between the two sample paths, with mean(PDi) and mean(PDj) the time averages of the paths.
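
To see how the adjustment behaves, the short PROC IML check below uses two invented five-point paths that are close in level but perfectly negatively correlated; the correlation adjustment doubles the root mean square distance in this extreme case.

proc iml;
p1 = {1.0 1.2 1.4 1.2 1.0};                           * invented sample paths;
p2 = {1.4 1.2 1.0 1.2 1.4};
T  = ncol(p1);
rms  = sqrt(ssq(p1 - p2)/T);                          * root mean square distance;
rho  = (sum((p1 - p1[:])#(p2 - p2[:]))/T) /
       sqrt((ssq(p1 - p1[:])/T)*(ssq(p2 - p2[:])/T)); * pairwise correlation (= -1 here);
dadj = (1 - rho)*rms;                                 * correlation adjusted distance = 2*rms;
print rms rho dadj;
quit;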

Clustering methodology can be used to produce a segmentation scheme by aggregating time series of default
history by the minimum difference between clusters. Aggregations are made in a stepwise process starting with all
observations separately and ending with a single cluster. The question is at what step this process can be stopped to
achieve a reasonable segmentation scheme that aggregates similar sample paths but retains separate clusters of
significantly different sample paths. Statistics are available to indicate natural breaks in the process that could be
stopping points. The next section describes these statistics and how they are used in this study to select the number of
clusters.

SAS Macro to Calculate a Distance Matrix for Input into PROC CLUSTER
%macro distance(indata, outdata, T, N, dist);
/* indata  = input dataset of N sample paths of length T, NxT
   outdata = output dataset of the distance matrix between sample paths, NxN
   T = length of time series, N = number of observations
   dist = type of distance measure: 1 for root mean square, 2 for correlation adjusted */
proc iml;
d   = j(&N., &N., .);    * distance matrix;
z   = j(&T., 1, .);      * column vector of absolute differences;
cov = j(&T., 1, .);      * column vector of covariances;
use &indata.; read all var _NUM_ into y;   * NxT matrix of sample paths;
* create lower triangular distance matrix;
do i = 1 to &N.; d[i,i] = 0;
  do k = 1 to i-1;
    z[,1]  = abs((y[i,] - y[k,])`);
    d[i,k] = (z`*z/&T.)**(1/2);                  * root mean square distance[i,k];
    if &dist. = 2 then do;
      cov[,1] = (y[i,]#y[k,])` - y[i,:]*y[k,:];  * covariance among paths;
      var1 = (y[i,]-y[i,:])*(y[i,]-y[i,:])`/&T.; * variance of path i;
      var2 = (y[k,]-y[k,:])*(y[k,]-y[k,:])`/&T.; * variance of path k;
      corr = cov[:,1]/sqrt(var1*var2);           * correlation among paths;
      d[i,k] = d[i,k]*(1-corr);                  * correlation adjustment to RMS distance;
    end;
  end;
end;
create &outdata.(type=distance) var("dist1":"dist&N."); append from d;
quit;
%mend distance;
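
For example, under the dataset names used with the macros in this paper, the correlation adjusted distance matrix for the full eight-year series of 46 paths could be built with:

%distance(PD_SIMS, PD_DIST, 96, 46, 2);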

NUMBER OF CLUSTERS
There are competing considerations in selecting the number of clusters. Care should be taken not to aggregate inhomogeneous groups, as the resulting capital estimates would be mis-specified. However, there is also the issue of tractability, since there is a significant cost to managing a large number of pools, many of which might not be significantly different. The goal is to aggregate to a level at which the sample paths of PDs within a cluster are similar over the economic cycle. However, no two sample paths associated with different cross-sectional units of the portfolio will be identical, so some tolerance level of differences must be allowed within a cluster.
In this study, the longer the time period used, the larger the number of clusters selected under either average linkage or Ward's minimum variance, as local differences are smoothed over a longer time series. If the cluster results are recalculated every year for a rolling window of data with a shorter time period (five-year or three-year windows), then the number of clusters selected tends to decrease and stable clusters are not produced over time. This is likely because sample paths may be more similar over specific time windows but dissimilar over others. The recommendation is to use the longest time series possible to determine the number of clusters, but also to run the clustering algorithm over shorter time windows. This process should capture any local differences between sample paths, such as substantial differences in local peaks in the PD or the convexity (i.e., negative pairwise correlation) of the sample paths, that may lead to the disaggregation of some clusters.
There are natural breaks in selecting the appropriate number of clusters, as indicated by the pseudo F and pseudo t² statistics.⁵ A potential number of clusters is indicated when either there is a local peak in the pseudo F statistic, or a small pseudo t² statistic is followed by a much larger value. When possible, the agreement of these statistics was used to select the number of clusters in the following analyses.⁶ There could be multiple potential values for the number of clusters, so, balancing tractability against homogeneity within cluster for the 46 sample paths, a potential number of clusters was selected between 10 and 20. An example of SAS PROC CLUSTER output is presented below.

⁵ The formulas are from the section on Miscellaneous Formulas in the chapter titled "The CLUSTER Procedure" of the SAS/STAT User's Guide.

The SAS System

The CLUSTER Procedure
Average Linkage Cluster Analysis

Cluster History
                                     RMS                    RMS
NCL  --Clusters Joined---  FREQ      STD    PSF   PST2     Dist

 20   CL25    CL27            5   0.0544    263    5.6   0.0895
 19   CL20    OB5             6   0.0595    251    2.0   0.097
 18   CL37    CL28            6   0.0553    217   16.4   0.1004
 17   CL39    CL22            7   0.0569    189   17.9   0.1075
 16   CL26    CL21            9   0.0672    162   10.0   0.1164
 15   CL36    CL24            4   0.0718    157    9.9   0.1191
 14   CL18    CL38            9   0.0768    135   13.8   0.1356
 13   CL17    OB20            8   0.0695    136    4.4   0.1385
 12   OB4     CL32            3   0.0836    141   13.4   0.1414
 11   CL19    CL16           15   0.1114   91.2   29.0   0.2007
 10   CL11    CL12           18   0.1245   81.6    6.4   0.2157
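
The same history can be captured to a dataset for programmatic inspection of candidate stopping points; a sketch, assuming the ODS table name ClusterHistory used by PROC CLUSTER:

ods output ClusterHistory = hist;   * capture the printed history, including PSF and PST2;
proc cluster data=PD_DIST method=average nonorm pseudo rmsstd print=30
             outtree=tree;
run;
proc print data=hist; run;          * scan for PSF peaks and jumps in PST2;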

To choose the number of clusters over the eight-year time series, perturbations are added to the time series and results are produced from the clustering procedure over the "noisier" data. The perturbations are generated by independent normal random variables with mean 0 and standard deviation set to 5% of the mean PD over time for each sample path. By adding perturbations, clusters that could be very different but contained within a set of tolerance bounds would not necessarily be aggregated, and by running multiple (ten) trials, clusters that are erroneously aggregated due to the added noise would not be consistently created over multiple iterations. The criteria used to select the number of clusters were a small pseudo t² statistic followed by a larger pseudo t² statistic and a local peak in the pseudo F statistic, with agreement of the two statistics when possible.
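
A sketch of one way to run these trials, reusing the %perturb and %distance macros defined in this paper (the loop wrapper and output dataset names are illustrative):

%macro trials(ntrial);
  %do i = 1 %to &ntrial.;
    %perturb(clustsim.sample_paths, PD_NOISY, 0.05, 96, 46);  * noise at 5% of each path mean;
    %distance(PD_NOISY, PD_DIST, 96, 46, 1);                  * root mean square distance;
    proc cluster data=PD_DIST method=average nonorm pseudo rmsstd
                 print=30 outtree=tree&i.;
    run;
  %end;
%mend trials;
%trials(10);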
When using the root mean square distance, all ten iterations indicated a possible number of clusters of 12 for the method of average linkage and 19 for Ward's minimum variance method. The selection of 12 clusters using average linkage aggregates a few sample paths that are noticeably different in trend from the other sample paths in the same cluster, whereas the selection of 19 clusters using Ward's minimum variance method leaves some seemingly similar sample paths disaggregated. See the results for the eight-year window in Tables 1 and 2 of Appendix B, and compare the clusters to the graphs of the sample paths in Appendix A.
When using the correlation adjusted distance measure, 15 clusters is the number selected by the pseudo t² statistic over ten trials for both Ward's minimum variance method and the average linkage method. Both methods form the same 15 clusters over the eight-year time series, and by comparison to the results using the root mean square distance the clustering results are improved in that sample paths are better aggregated by both shape and scale. See the results for the eight-year window in Tables 3 and 4 of Appendix C, and compare the clusters to the graphs of the sample paths in Appendix A.

SAS Macro for Adding Perturbations to Sample Paths


%macro perturb(indata, dataset, a, T, N);
/* indata  = input dataset of N sample paths of length T, NxT
   dataset = output dataset of sample paths with perturbations
   a = scale of perturbations, roughly corresponds to a CV (a=0 adds no noise)
   T = length of time series, N = number of observations */
proc iml;
use &indata.; read all var _NUM_ into PD;   * NxT matrix of sample paths;
do s = 1 to &T.;
  do n = 1 to &N.;
    * add noise with std dev = a times the mean PD of path n, floored at 0.0001;
    PD[n,s] = max( PD[n,s] + &a.*PD[n,:]*normal(0), 0.0001 );
  end;
end;
create &dataset. var("PD1":"PD&T."); append from PD;
quit;
%mend perturb;

⁶ See Milligan and Cooper (1985) and Cooper and Milligan (1988) for studies on methods to select the number of clusters. In their studies the pseudo F statistic and pseudo t² statistic performed best for "noisy" data, or data with errors.


STABILITY OF CLUSTERING ALGORITHM OVER TIME

Root Mean Square Distance


In order to test the stability of the clustering algorithms over different realizations of the sample paths, the hierarchical clustering methods of Ward's minimum variance (with 19 clusters) and average linkage (with 12 clusters) are applied over moving five-year and three-year windows that shift forward one year between each fit of the clustering algorithms. Tables 1 and 2 in Appendix B list the clustering results over the five-year and three-year windows and the entire eight-year period, presented as tables listing the indices of the 46 sample paths as they are clustered together within each time window. This section demonstrates how the clustering algorithm forms clusters over shorter time periods and how much the clusters change over time, capturing local differences rather than longer-term trends.
The longer five-year windows give more stable results over subsequent realizations of the sample paths than the three-year windows, similar to the clusters formed over the full eight-year period. However, the clusters formed over the different five-year or three-year time periods can still be recognized as belonging to a few disjoint groups of clusters that persist across all subsequent time windows, where sample paths in disjoint groups are never aggregated into a cluster together. For Ward's minimum variance method using 19 clusters, there are 13 distinct groups of sample paths over the five-year windows and eight distinct groups over the three-year windows. For the average linkage method using 12 clusters, there are seven distinct groups over the five-year windows and four distinct groups over the three-year windows. See Tables 1 and 2 in Appendix B, where the clusters are presented (using the indices of the sample paths 1-46) within these groups for each time window.
To investigate what aggregations of sample paths may be appropriate within these disjoint groups, the intersections of all of the different combinations of clusters produced over the eight-year window and the five-year and three-year windows are taken, designated as "fundamental blocks". The question is whether the fundamental blocks within a disjoint group are significantly different or whether they are similar enough to be aggregated in a further enhancement of the clustering methods. The plots in Appendix A provide some investigation into this question. Within groups, sample paths are similar over some windows in time and dissimilar over others, as might happen with negatively correlated sample paths. These sample paths should be separated, as negatively correlated paths would have a much different contribution to the overall credit losses within a portfolio at any given point in time. Other similar sample paths with slight differences in shape or scale could be aggregated, given consideration for the tractability of a segmentation scheme.
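
One way to derive these fundamental blocks programmatically is to group sample paths on their full vector of cluster assignments across windows. A sketch for the four five-year windows, assuming the per-window assignments produced by the %sims macro (shown at the end of this section) have been stacked into a dataset allwindows with variables path, time and cluster:

proc sort data=allwindows; by path time; run;
proc transpose data=allwindows out=wide prefix=win;  * one row per path, columns win0-win3;
  by path; id time; var cluster;
run;
proc sort data=wide; by win0-win3; run;
data blocks;                                         * paths sharing every assignment;
  set wide; by win0-win3;
  if first.win3 then block + 1;  * new block whenever any window's assignment changes;
run;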
After looking at the plots in Appendix A and the clustering results in Appendix B, it appears that the root mean square distance measure does not necessarily distinguish well between sample paths that are close in the magnitude of the PDs over time but that have a negative correlation.
• This is indicated by the instability of the cluster (23, 24, 30, 33 & 36), which changes over time to exclude sample path 23 and include 20 and/or 21 over the three-year windows. These sample paths are contained in Group 2 over the five-year windows in Table 1, Appendix B. From Figures 1A and 1B, the aggregation of sample paths 24, 30, 33 & 36 should form a single cluster, while sample paths 23, 20 and 21 should form three separate clusters.
• An example of an inappropriate aggregation of negatively correlated sample paths is the clustering of the fundamental blocks (10 & 16) and (2 & 6), both together and then with other very different sample paths, over the five-year and three-year windows, where the sample paths of PDs are close within one time window but very different over others (see Figures 1E and 1F, Appendix A). This example is displayed in the clusters formed over the five-year windows within group 4 of Table 1 and group 1 of Table 2.
An example of grouping two sample paths that are similar except for a local spike in the point-in-time PDs is the clustering of sample paths 44 and 46 (Figure 1N, Appendix A). For the purposes of segmenting in an economic capital model, a bank might want to separate groups that exhibit different spikes in the PD under adverse conditions, or "stressed" PDs. Using both Ward's minimum variance and average linkage with the root mean square distance measure, these sample paths form separate single-observation clusters over most time windows.
An example where separate fundamental blocks within a group may be clustered together occurs with the blocks (5), (18), (17 & 19) and (1 & 25). From Figure 1G, it appears that blocks (1 & 25), (18) and (17 & 19) could be aggregated into one cluster, whereas sample path (5) could form a separate cluster or be aggregated into the cluster of sample paths 1, 5, 17, 18, 19 & 25. This example is displayed in Table 1 within group 7 of the five-year windows and group 3 of the three-year windows.
The results from taking a moving window over a shorter time interval do point out possible inappropriate aggregations of negatively correlated sample paths, as apparent in the instability of clustering over subsequent time windows. Also, the separation of sample paths with different magnitudes of a concurrent spike in the PD could be a subjective decision relative to the bank's emphasis on similar behavior in adverse conditions vs. similar behavior over the whole cycle. However, the distance measure should be adjusted to separate sample paths with high negative pairwise correlations among the time series of PDs, to provide better separation over the full eight-year window.

Correlation Adjusted Distance Measure


The introduction of a distance measure that captures both the correlation among sample paths and the distance between them provides a solution to the problem of aggregating negatively correlated sample paths described above. Define the distance measure to be the root mean square distance multiplied by one minus the pairwise correlation between the two sample paths.


Using 15 clusters, the same analysis of the stability of the clustering algorithm over time is run for both Ward's minimum variance method and the average linkage method. The results are listed in Tables 3 and 4 of Appendix C for the full eight-year window and the five-year and three-year windows.
When comparing the results in Appendix C to the sample paths plotted in Appendix A, the use of the correlation adjusted distance measure does appear to improve the ability of both clustering methods used in this analysis to aggregate sample paths with both a similar scale over time and a similar shape. Both clustering methods form the same ten disjoint groups of sample path clusters⁷, with almost identical results over the eight-year, five-year and three-year time windows. The fundamental blocks from taking the intersections of the clustering results over time are almost identical. There appears to be less of a tendency to aggregate sample paths that are negatively pairwise correlated. This is evidenced by the fact that fundamental blocks (10 & 16) and (2 & 6) are never grouped in the same cluster using the entire eight-year period or the shorter moving time windows. Also, sample path 23 is never clustered with the fundamental block (24, 30, 33 & 36), as the two are negatively pairwise correlated over the eight-year time series; see Figures 1A and 1B of Appendix A and groups 1 and 2 of Tables 3 and 4.
The separation of certain groups using the correlation adjusted distance that were variously grouped together over subsequent windows in time using the root mean square distance measure also indicates improved performance; e.g., the fundamental blocks of sample paths (3, 7 & 11), (9, 13 & 15) and (29, 32 & 35) contained in group 4 of Tables 3 and 4, Appendix C. From analysis of the plots in Figures 1H, 1I and 1J of Appendix A, these three subgroups should be separate clusters, as both the shape and scale of the sample paths differ.
Although this new distance measure does not appear to aggregate positively correlated sample paths that are significantly different in scale, which would be a concern with such a measure, its ability to capture differences in the stressed PD may be diminished. This is evidenced in the aggregation of sample paths 44 and 46 into a single cluster for the entire eight-year period and most of the five-year and three-year time windows. The high positive correlation between the sample paths "outweighs" the difference in the peak values over most time periods, except that sample paths 44 & 46 form separate clusters in the windows where the peak occurs.
The stability of the clusters formed over different time windows also appears to be improved, as instabilities in the cluster results over subsequent time windows appear to involve groups of similar sample paths. For example, consider group 3 of Tables 3 and 4 and the different aggregations and separations over time of the following fundamental blocks of sample paths: (37, 38, 43 & 45), (1, 17, 19 & 25), (5), (18), (8, 12 & 14) and (2 & 6). From visual inspection of Figures 1C, 1D, 1E and 1G, two clusters could be formed as [(37, 38, 43 & 45), (8, 12 & 14), (2 & 6)] and [(1, 17, 19 & 25), (5), (18)]. If a larger number of clusters is desirable, then four homogeneous clusters could be produced by aggregating only [(1, 17, 19 & 25), (5), (18)].
The adjustment of the distance measure by the pairwise correlation between two sample paths appears to improve the performance of the above exploratory analysis. The results are consistent between the Ward's minimum variance and average linkage methods of clustering. Also, the clustering results are more stable over subsequent time windows, and this leads to an easier, more objective determination of a portfolio segmentation scheme. Given the "fundamental blocks" derived from this exploratory data analysis, the method of determining which blocks should be aggregated or kept separate is somewhat subjective, as it involves analysis of the plots of the sample paths. However, until a robust "black-box" method is developed, the process will involve subjective judgment.

SAS Macro for the Study of Clusters over Different Time Windows
%macro sims(T, meth, dist, clusters, outclus, nclus);
/* T = length of time window, meth = ward or average (METHOD= option in PROC CLUSTER) */
/* dist = distance measure (1 for root mean square, 2 for correlation adjusted) */
/* clusters = name of output dataset, outclus = output dataset from PROC TREE */
/* nclus = number of clusters output from PROC TREE */
%let N = 46; * number of sample paths;
%let iter = %eval((96-&T.)/12); * number of one-year window shifts;
%do s = 0 %to &iter.;
%let first = %eval(&s.*12+1); %let last = %eval(&s.*12+&T.);
/* a=0: no perturbations are added for the time-window study */
%perturb(clustsim.sample_paths(keep=COL&first.-COL&last.
rename=(COL&first.-COL&last.=COL1-COL&T.)), PD_SIMS, 0, &T., &N.);
%distance(PD_SIMS, PD_DIST, &T., &N., &dist.);
proc cluster nonorm pseudo rmsstd print=30 data=PD_DIST method=&meth.
outtree=&clusters.&s.; run;
proc tree data=&clusters.&s. out=&outclus.&s. nclusters=&nclus.; run;
proc sort data=&outclus.&s.; by cluster _NAME_; run;
data &outclus.&s.; set &outclus.&s.; drop _NAME_;
time=&s.; path = substr(_NAME_,3,4)+0; * recover the path index from the OBnn leaf name;
run;
%end;
%mend sims;
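
For example, the five-year-window runs using Ward's method with the correlation adjusted distance and 15 clusters might be invoked as (argument values illustrative):

%sims(60, ward, 2, tree5, clus5, 15);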

⁷ There is one exception for Ward's minimum variance method for years 6-8, where sample path 5 is clustered with paths 9, 13 & 15 in the next group, as these sample paths are only similar over this single three-year window.


CONCLUSION
To produce a segmentation scheme over a retail portfolio for measuring capital requirements, the analysis should utilize as long a time series as possible. However, in the creation of a segmentation scheme, the performance of the clustering method(s) over subsequent shorter time windows should be observed to capture local differences between sample paths that may be smoothed over a longer time window. The most conservative approach to avoid aggregating inhomogeneous sample paths is to take the intersection of the clusters formed over the moving time windows, the fundamental blocks of sample paths, but this may leave many similar sample paths disaggregated. To balance the goal of homogeneity within pool against the tractability of a segmentation scheme, graphical analysis of these blocks can indicate further aggregations that produce a tractable portfolio segmentation scheme of homogeneous pools.
The arguments in this paper suggest a method of exploratory data analysis for the development of a portfolio segmentation scheme. Further study could produce a method for creating a segmentation scheme with exactly the desired properties directly from a more objective procedure. However, more study into the desired properties for a segmentation scheme and the priorities among sample path properties, such as the convexity or the magnitude of peaks, is needed. As is evident in the analysis above, a subjective inspection of the graphs of various sample paths within clusters indicates that while a distance measure may aggregate on some common properties (such as the positive pairwise correlation of sample paths), others may be missed (such as different concurrent peaks in the PD). The fundamental blocks of sample paths naturally fall into disjoint groups, so an improved algorithm would involve a more objective method of aggregating such sample paths within a group to capture the desired properties of the segmentation scheme.
The above exploratory analyses should be continued by in-depth investigation into the performance of clustering algorithms and distance measures over the type of data described in this paper. This study indicates that the distance measure used appears to have the most effect on the performance of the clustering methods, but the use of Ward's minimum variance method of hierarchical clustering may be best for this application. The correlation adjusted distance provides a more stable clustering algorithm, where within a cluster the PDs follow sample paths over time with a common shape and scale. Ward's minimum variance method tends to form clusters of roughly equal size, which for this application provides a more conservative approach against aggregating too many sample paths into one cluster. A continuation of this study will involve further refinement of the distance measures and clustering methods utilized.


APPENDIX A: PLOTS OF SAMPLE PATHS

[Figure 1, panels (A) through (O): overlaid plots of the 46 simulated sample paths, shown as one-year PD (%) by month (months 1-96). Each panel plots the paths listed in its legend: (A) PD_20, PD_21, PD_23; (B) PD_24, PD_30, PD_33, PD_36; (C) PD_8, PD_12, PD_14; (D) PD_37, PD_38, PD_43, PD_45; (E) PD_2, PD_6; (F) PD_10, PD_16; (G) PD_1, PD_5, PD_17, PD_18, PD_19, PD_25; (H) PD_3, PD_7, PD_11; (I) PD_9, PD_13, PD_15; (J) PD_29, PD_32, PD_35; (K) PD_4, PD_22, PD_28; (L) PD_39, PD_40, PD_41, PD_42; (M) PD_27, PD_31, PD_34; (N) PD_44, PD_46; (O) PD_26. Plots are not reproduced here.]


APPENDIX B: CLUSTER ANALYSIS RESULTS, ROOT MEAN SQUARE DISTANCE

Table 1: Ward's Minimum Variance Method with 19 Clusters


Groups 1 2 3 4 5 6 7 8
3-year Windows
20 21 23 37 38 43 45 10 16 3 7 11 22 28 17 19 39 40 27 34 44 26
1-3 years 8 12 14 29 32 35 1 5 18 25 31
30 33 36 24 2 6 9 13 15 4 41 42 46
20 21 23 3 7 11 4 17 19 27 34 44 26
2-4 years 8 12 14 24 30 33 36 37 38 43 45 10 16 2 6 29 32 35 1 5 18 25 39 40 41 42 31
9 13 15 22 28 46
20 21 37 38 43 45 3 7 4 17 18 19 27 34 44 26
3-5 years 24 30 33 36 10 16 2 6 11 29 32 35 1 5 25 39 40 41 42 31
8 12 14 23 9 13 15 22 28 46
20 23 21 10 16 3 7 4 17 18 19 39 40 27 34 44 26
4-6 years 8 12 14 24 30 33 36 2 6 9 13 15 29 32 35 11 1 5 25 31
37 38 43 45 22 28 41 42 46
20 23 43 45 2 6 3 7 11 4 5 39 40 27 34 44 26
5-7 years 8 12 14 21 24 30 33 36 10 16 9 13 15 29 32 35 1 17 18 19 25 31
37 38 22 28 41 42 46
20 23 43 45 3 7 11 4 39 40 44 26
6-8 years 8 12 14 21 24 30 33 36 10 16 2 6 29 32 35 1 17 18 19 25 41 27 34 31
37 38 9 13 15 5 22 28 42 46

Groups 1 2 3 4 5 6 7 8 9 10 11 12 13
5-year Windows
20 3 7 11 4 17 19 27 34 44 26
1-5 years 8 12 14 23 24 30 33 36 10 16 2 6 29 32 35 1 5 18 25 39 40 41 42 31
21 37 38 43 45 9 13 15 5 22 28 46
20 10 16 4 39 40 27 34 44 26
2-6 years 8 12 14 23 24 30 33 36 3 7 11 29 32 35 1 5 17 18 19 25 31
21 37 38 43 45 2 6 9 13 15 22 28 41 42 46
8 12 14 10 16 4 17 18 19 39 40 27 34 44 26
3-7 years 20 23 24 30 33 36 9 13 15 29 32 35 1 5 25 31
21 37 38 43 45 2 6 3 7 11 22 28 41 42 46
8 12 14 10 16 3 7 11 4 5 39 40 27 34 44 26
4-8 years 20 23 24 30 33 36 29 32 35 1 17 18 19 25 31
21 37 38 43 45 2 6 9 13 15 22 28 41 42 46
8-year Window
20 10 16 4 39 40 27 34 44 26
1-8 years 8 12 14 23 24 30 33 36 9 13 15 29 32 35 1 5 17 18 19 25 31
21 37 38 43 45 2 6 3 7 11 22 28 41 42 46

Fundamental 8 12 14 24 30 33 36 37 38 10 16 9 13 15 4 17 19 18 39 40 27 34 44 26
Blocks/ 29 32 35
Intersections 20 21 23 43 45 2 6 3 7 11 22 28 5 1 25 41 42 31 46

Table 2: Average Linkage Method with 12 Clusters


Group 1 2 3 4
3-year Windows
8 12 14 3 7 11 29 32 35 22 28 4 9 13 15 27 34 44 26
1-3 years 23 24 30 33 36 37 38 43 45 10 16 2 6 1 5 18 25 39 40
20 21 17 19 41 42 31 46
8 12 14 3 7 11 29 32 35 9 13 15 22 28 4 27 34 26
2-4 years 23 24 30 33 36 37 38 43 45 44 46
20 21 2 6 10 16 1 5 17 18 19 25 39 40 41 42 31
8 12 14 23 21 3 7 1 5 17 18 19 25 22 28 27 34 26
3-5 years 11 29 32 35 9 13 15 39 40 41 42 44 46
20 24 30 33 36 37 38 43 45 2 6 10 16 4 31
8 12 14 37 38 43 45 2 6 3 7 11 29 32 35 9 13 15 1 5 17 18 19 25 27 34 44 26
4-6 years
20 21 23 24 30 33 36 10 16 4 22 28 39 40 41 42 31 46
8 12 14 37 38 43 45 2 6 4 22 28 39 40 27 34 44 26
5-7 years 20 21 23 24 30 33 36 10 16 3 7 11 29 32 35 9 13 15 5
1 17 18 19 25 41 42 31 46
8 12 14 37 38 43 45 2 6 3 7 11 39 40 44 26
6-8 years 20 21 23 24 30 33 36 10 16 29 32 35 9 13 15 5 27 34 31
1 17 18 19 25 22 28 4 41 42 46

Group 1 2 3 4 5 6 7
5-year Windows
10 16 2 6 3 7 11 29 32 35 9 13 15 17 19 4 27 34 26
1-5 years 23 24 30 33 36 37 38 43 45 1 5 18 25 39 40 41 42 22 28 44 46
20 21 8 12 14 31
8 12 14 37 38 43 45 3 7 11 29 32 35 9 13 15 39 40 41 42 27 34 44 26
2-6 years 20 23 24 30 33 36 10 16 2 6 1 5 17 18 19 25 4 22 28 31
21 46
8 12 14 3 7 11 29 32 35 9 13 15 39 40 41 42 27 34 44 26
3-7 years 20 21 23 24 30 33 36 10 16 1 5 17 18 19 25 4 22 28 31
2 6 37 38 43 45 46
8 12 14 3 7 11 29 32 35 9 13 15 39 40 41 42 27 34 44 26
4-8 years 20 21 23 24 30 33 36 10 16 1 5 17 18 19 25 4 22 28 31
2 6 37 38 43 45 46
8-year Window
20 21 23 24 30 33 36 10 16 3 7 11 29 32 35 9 13 15 39 40 27 34 44 26
1-8 years 1 5 17 18 19 25 4 22 28 31
2 6 8 12 14 37 38 43 45 41 42 46

Fundamental 8 12 14 24 30 33 36 9 13 15 17 19 39 40 4 27 34 44 26
Blocks/ 10 16 2 6 29 32 35 5
Intersections 37 38 43 45 20 21 23 3 7 11 1 18 25 41 42 22 28 31 46


APPENDIX C: CLUSTER ANALYSIS RESULTS, CORRELATION ADJUSTED MEAN SQUARE DISTANCE

Table 3: Ward's Minimum Variance Method with Correlation Distance Measure


Group 1 2 3 4 5 6 7 8 9 10
3-year Windows
20 23 37 38 43 45 1 5 17 18 19 25 2 6 3 7 11 4 22 28 27 34 44 46
1-3 years 24 30 33 36 29 32 35 39 40 41 42
21 8 12 14 9 13 15 10 16 31 26
20 23 37 38 43 45 1 5 17 18 19 25 2 6 3 7 11 4 22 28 27 34 44 46
2-4 years 24 30 33 36 29 32 35 39 40 41 42
21 8 12 14 9 13 15 10 16 31 26
20 21 37 38 43 45 1 5 17 18 19 25 2 6 3 7 11 4 22 28 27 34 44 46
3-5 years 24 30 33 36 29 32 35 39 40 41 42
23 8 12 14 9 13 15 10 16 31 26
20 21 37 38 43 45 8 12 14 2 6 3 7 11 4 22 28 27 34 44 46
4-6 years 24 30 33 36 1 5 17 18 19 25 29 32 35 39 40 41 42
23 9 13 15 10 16 31 26
20 23 37 38 43 45 8 12 14 2 6 4 22 28 27 34 44 46
5-7 years 24 30 33 36 1 17 18 19 25 29 32 35 39 40 41 42
21 5 9 13 15 3 7 11 10 16 31 26
20 23 37 38 43 45 2 6 3 7 11 4 22 28 44 46
6-8 years 24 30 33 36 1 17 18 19 25 29 32 35 27 34 31
21 8 12 14 5 9 13 15 10 16 39 40 41 42 26

Group 1 2 3 4 5 6 7 8 9 10
5-year Windows
20 23 37 38 43 45 1 5 17 18 19 25 2 6 3 7 11 4 22 28 27 34 44 46
1-5 years 24 30 33 36 29 32 35 39 40 41 42
21 8 12 14 9 13 15 10 16 31 26
20 21 37 38 43 45 8 12 14 3 7 11 4 22 28 27 34 44 46
2-6 years 24 30 33 36 1 5 17 18 19 25 2 6 29 32 35 39 40 41 42
23 9 13 15 10 16 31 26
20 37 38 43 45 8 12 14 2 6 3 7 11 4 22 28 44 46
3-7 years 23 24 30 33 36 1 5 17 18 19 25 29 32 35 27 34 31
21 9 13 15 10 16 39 40 41 42 26
20 23 37 38 43 45 8 12 14 2 6 3 7 11 4 22 28 44
4-8 years 24 30 33 36 1 5 17 18 19 25 29 32 35 27 34 31 46
21 9 13 15 10 16 39 40 41 42 26
8-year Window
20 21 37 38 43 45 8 12 14 2 6 3 7 11 4 22 28 27 34 44 46
1-8 years 24 30 33 36 1 5 17 18 19 25 29 32 35 39 40 41 42
23 9 13 15 10 16 31 26

Fundamental 24 30 33 36 8 12 14 1 17 18 19 25 3 7 11 10 16 27 34 44
Blocks/ 2 6 5 29 32 35 4 22 28 46
Intersections 20 21 23 37 38 43 45 9 13 15 39 40 41 42 31 26

Table 4: Average Linkage Method with Correlation Distance Measure, 15 Clusters


Group 1 2 3 4 5 6 7 8 9 10
3-year Windows
20 37 38 43 45 1 5 17 18 19 25 2 6 4 22 28 27 34 44 46
1-3 years 23 24 30 33 36 3 7 11 29 32 35 39 40 41 42
21 8 12 14 9 13 15 10 16 31 26
20 23 37 38 43 45 1 5 17 18 19 25 2 6 3 7 11 4 22 28 27 34 44 46
2-4 years 24 30 33 36 29 32 35 10 16 39 40 41 42
21 8 12 14 9 13 15 31 26
20 21 37 38 43 45 1 5 17 18 19 25 2 6 3 7 11 4 22 28 27 34 44 46
3-5 years 24 30 33 36 29 32 35 10 16 39 40 41 42
23 8 12 14 9 13 15 31 26
20 21 37 38 43 45 8 12 14 2 6 3 7 11 4 22 28 27 34 44 46
4-6 years 24 30 33 36 1 5 17 19 25 29 32 35 10 16 39 40 41 42
23 18 9 13 15 31 26
20 23 37 38 43 45 8 12 14 2 6 4 22 28 44
5-7 years 24 30 33 36 1 17 18 19 25 29 32 35 10 16 39 40 41 42 46
21 5 9 13 15 3 7 11 27 34 31 26
20 23 37 38 43 45 2 6 1 17 18 19 25 3 7 11 4 22 28 44 46
6-8 years 24 30 33 36 5 29 32 35 10 16 27 34 31
21 8 12 14 9 13 15 39 40 41 42 26

Group 1 2 3 4 5 6 7 8 9 10
5-year Windows
20 23 37 38 43 45 1 5 17 18 19 25 2 6 3 7 11 4 22 28 27 34 44 46
1-5 years 24 30 33 36 29 32 35 10 16 39 40 41 42
21 8 12 14 9 13 15 31 26
20 21 37 38 43 45 8 12 14 3 7 11 4 22 28 27 34 44 46
2-6 years 24 30 33 36 1 5 17 18 19 25 2 6 29 32 35 10 16 39 40 41 42
23 9 13 15 31 26
20 37 38 43 45 8 12 14 2 6 3 7 11 4 22 28 44
3-7 years 23 24 30 33 36 1 5 17 18 19 25 10 16 27 34 31 46
21 9 13 15 29 32 35 39 40 41 42 26
20 23 37 38 43 45 8 12 14 2 6 3 7 11 4 22 28 44
4-8 years 24 30 33 36 1 5 17 18 19 25 29 32 35 10 16 27 34 31 46
21 9 13 15 39 40 41 42 26
8-year Window
20 21 37 38 43 45 8 12 14 2 6 3 7 11 4 22 28 27 34 44 46
1-8 years 24 30 33 36 1 5 17 18 19 25 29 32 35 10 16 39 40 41 42
23 9 13 15 31 26

Fundamental 24 30 33 36 8 12 14 1 17 19 25 3 7 11 10 16 27 34 44
Blocks/ 2 6 29 32 35 4 22 28 46
Intersections 20 21 23 37 38 43 45 5 18 9 13 15 39 40 41 42 31 26


BIBLIOGRAPHY

1. "The New Basel Capital Accord," Basel Committee on Banking Supervision, April 2003.
2. Calinski, T. and Harabasz, J. (1974), "A Dendrite Method for Cluster Analysis," Communications in Statistics, 3, 1-27.
3. Cooper, M.C. and Milligan, G.W. (1988), "The Effect of Error on Determining the Number of Clusters," Proceedings of the International Workshop on Data Analysis, Decision Support and Expert Knowledge Representation in Marketing and Related Areas of Research, 319-328.
4. Duda, R.O. and Hart, P.E. (1973), Pattern Classification and Scene Analysis, New York: John Wiley & Sons, Inc.
5. Carey, Mark, "Parameterizing Credit Risk Models with Rating Data," Journal of Banking and Finance 25:1, January 2001.
6. Carey, Mark, "Credit Risk in Private Debt Portfolios," Journal of Finance, vol. 53 (August 1998), pp. 1363-87.
7. Gordy, Michael, "A Risk-Factor Model Foundation for Ratings-Based Bank Capital Rules," working paper, Board of Governors of the Federal Reserve System, Feb. 2001.
8. Greene, William H., "Econometric Analysis," 5th Ed., Prentice Hall, NJ, 2003.
9. Johnson, Richard A. and Dean W. Wichern, "Applied Multivariate Statistical Analysis," 4th Ed., Prentice Hall, NJ, 1998.
10. Lang, William and Anthony Santomero, "Risk Quantification of Retail Credit: Current Practices and Future Challenges," 2002.
11. Loffler, Gunter, "An Anatomy of Rating through the Cycle," working paper, Goethe-Universitat Frankfurt, July 2002.
12. Milligan, G.W. and Cooper, M.C. (1985), "An Examination of Procedures for Determining the Number of Clusters in a Data Set," Psychometrika, 50, 159-179.
13. Neter, John, William Wasserman and Michael H. Kutner, "Applied Linear Statistical Models," 3rd Ed., Irwin, Boston, 1990.
14. "Retail Credit Economic Capital Estimation - Best Practices," The Risk Management Association, Feb. 2003.
15. SAS/STAT User's Guide, Chapter 23, "The CLUSTER Procedure," and SAS online documentation.

ACKNOWLEDGEMENTS
I would like to acknowledge the support and helpful comments from researchers in the Federal Reserve System;
specifically, Bill Lang, Paul Calem, Mark Levonian, Marc Saidenberg and Mark Carey. I would like to acknowledge Chris
Moriarity for introducing me to the field of cluster analysis and for providing insights into its applications.

SAS is a Registered Trademark of the SAS Institute, Inc. of Cary, North Carolina.

CONTACT INFORMATION
Your comments and questions are valued and encouraged. Contact the author at:

Shannon Kelly
Federal Reserve Bank of Philadelphia
Ten Independence Mall
Philadelphia, PA 19106
Work Phone: (215) 574-3824
Fax: (215) 574-4146
Email: Shannon.M.Kelly@phil.frb.org

