
Book Reviews
Analysis of Multivariate and High-Dimensional Data
Inge Koch
Subhadeep Mukhopadhyay 1321

Analyzing Spatial Models of Choice and Judgment with R
David A. Armstrong II, Ryan Bakker, Royce Carroll, Christopher Hare, Keith T. Poole, and Howard Rosenthal
Tatiyana V. Apanasovich 1321

Discrete Models of Financial Markets
Marek Capinski and Ekkehard Kopp
Paskalis Glabadanidis 1322

Linear Algebra and Matrix Analysis for Statistics
Sudipto Banerjee and Anindya Roy
Daniel B. Hall 1322

Richly Parameterized Linear Models: Additive, Time Series, and Spatial Models Using Random Effects
James S. Hodges
Ian M. McCarthy 1323
© 2015 American Statistical Association


Journal of the American Statistical Association
September 2015, Vol. 110, No. 511, Book Reviews
DOI: 10.1080/01621459.2015.1106150

Analysis of Multivariate and High-Dimensional Data


Inge KOCH. Cambridge: Cambridge University Press, 2013, xxv+504 pp., $89.00(H), ISBN: 978-0-521-88793-9.
Analysis of Multivariate and High-Dimensional Data provides impressive coverage of classical multivariate methods and their extensions to high-dimensional data. It attempts to fill the gap between traditional small-p and more contemporary large-p multivariate statistical modeling. This self-contained textbook will be an invaluable reference for anyone interested in the analysis of multivariate and high-dimensional data.
The book is organized into three broad sections.
Part I: Classical Methods. The first part deals with classical techniques like
principal component analysis, canonical correlation analysis, and discriminant analysis. In addition to the basic foundational topics, several recent
developments are also mentioned. For example, Sections 2.6 and 2.7 give a
flavor of high-dimensional principal component analysis. A nice collection
of real data examples and theoretical results is compiled in these sections.
Part II: Factors and Groupings. The main focus here is cluster analysis. Topics
like factor analysis and multidimensional scaling are covered; correspondence analysis appears in Section 8.6.
Part III: Non-Gaussian Analysis. This stands out as one of the most distinguishing features of the book. Chapter 10 explores the concept of independent component analysis and Chapter 11 covers projection pursuit for
non-Gaussian dimension reduction. More recent nonlinear approaches like
kernel independent component analysis are discussed in Chapter 12. Finally,
Chapter 13 presents feature selection and principal component analysis of
high-dimensional data.
This book excels in several areas. I mention only a few that impressed me
particularly: (1) balance between treatment of theory and data analysis; (2) a
good collection of contemporary real data examples for motivation; (3) attempt
to build a bridge between traditional and modern high-dimensional techniques
in a way that is accessible to graduate students; (4) excellent colorful graphics
which make it interesting to read; (5) strong (three-part) organization; and finally
(6) the material of Part III on non-Gaussian Analysis, which is often neglected
in conventional textbooks.
There are a few shortcomings as well: (1) A poor job of topic selection. Many
important classical topics (that students should know), such as multivariate regression and high-dimensional regularized regression, should have been included. The nonparametric treatment is disappointing and highly elementary. Modern nonparametric developments (Mukhopadhyay and Parzen 2014) could further enhance the
content. Other important topics that are omitted include nonparametric copula
modeling and graphical models. The role of correlation, which is at the heart of all multivariate dependence modeling, deserves greater emphasis. I was amused to
find that almost half of the book is about principal component analysis in one
way or another. From my point of view, the book burdens the reader with its
coverage of all manner of exotic results and theorems that have little or no practical
relevance; this reflects the prevailing disconnect between theory and practice.
All of this could safely be removed to make room for more valuable topics
and insights. (2) A less unified treatment. How different concepts are connected
receives little emphasis. Consequently, students may be overwhelmed when they try
to digest the great diversity of topics discussed in the book. As an example,
consider the discussion of the projection index (page 353). The key quantity should
be the distribution of the projection aᵀX. This is known as the comparison density (Parzen
1979) d(u; Φ, F), defined as f(Φ⁻¹(u))/φ(Φ⁻¹(u)) for 0 < u < 1, where Φ and φ denote the standard normal distribution and density functions and F and f those of the projection. It
can be shown that almost all of the projection-pursuit index measures considered
can be expressed in a compact and unified way using this single notation.
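To make the unifying role of the comparison density concrete, one natural non-Gaussianity index (a generic illustration, not necessarily one of the indices treated in the book) measures the departure of d from the uniform density on (0, 1); in LaTeX notation, for a standardized projection with distribution F_a and density f_a,

\[
d(u;\Phi,F_a) \;=\; \frac{f_a\!\left(\Phi^{-1}(u)\right)}{\phi\!\left(\Phi^{-1}(u)\right)},
\qquad
\mathrm{PI}(a) \;=\; \int_0^1 \bigl(d(u;\Phi,F_a)-1\bigr)^2\,du,
\]

which equals zero exactly when the projection is standard Gaussian.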
A future edition may be improved by addressing a few of the points I highlighted here. This would help to distinguish it from other, more traditional and
popular textbooks like Izenman (2008).
Overall, this is a well-written textbook with clear exposition. Students and
researchers alike would benefit from this book as training in traditional
multivariate and modern high-dimensional statistics. I recommend it as a
nice reference for statisticians who wish to enter the world of high-dimensional
statistical modeling, especially dimension-reduction techniques.
Subhadeep MUKHOPADHYAY
Temple University

REFERENCES
Izenman, A. (2008), Modern Multivariate Statistical Techniques, Vol. 1, New York: Springer.
Mukhopadhyay, S., and Parzen, E. (2014), "LP Approach to Statistical Modeling," unpublished technical report, available at arXiv:1405.2601.
Parzen, E. (1979), "Nonparametric Statistical Data Modeling" (with discussion), Journal of the American Statistical Association, 74, 105–131.

Analyzing Spatial Models of Choice and Judgment with R


David A. ARMSTRONG II, Ryan BAKKER, Royce CARROLL, Christopher
HARE, Keith T. POOLE, and Howard ROSENTHAL. Boca Raton, FL: CRC
Press 2014, xx+336 pp., $69.95(H), ISBN: 978-1-4665-1715-8.
The purpose of this book is to give a comprehensive introduction to estimating
spatial models from political choice data using R. The authors use the term
"spatial" to describe distance-based methods, which assume that it is possible to
compute a distance (or similarity) between each pair of objects in the domain.
Such data are also called relational data, and spatial models are used to produce
a spatial map that contains the same information as the list of distances, but
is more interpretable. Spatial models are also used to postulate latent quantities
using observed variables. The dimensionality of such a latent space can be
interpreted as the number of separate substantial sources of variation among the
subjects or objects. In this case, researchers use the relative positions of points
in an abstract space to discover and present patterns in the data.
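As a generic illustration of how a list of pairwise distances becomes a low-dimensional spatial map, here is a minimal classical (Torgerson) multidimensional scaling sketch in Python; it is not the R workflow the book teaches, and the toy distance matrix is invented for illustration.

    import numpy as np

    def classical_mds(D, k=2):
        """Embed an n x n distance matrix D into k dimensions (classical/Torgerson MDS)."""
        n = D.shape[0]
        J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
        B = -0.5 * J @ (D ** 2) @ J                  # double-centered squared distances
        eigval, eigvec = np.linalg.eigh(B)           # symmetric eigendecomposition
        idx = np.argsort(eigval)[::-1][:k]           # keep the k largest eigenvalues
        scale = np.sqrt(np.clip(eigval[idx], 0, None))
        return eigvec[:, idx] * scale                # n x k map; rows are point coordinates

    # hypothetical dissimilarities among four respondents
    D = np.array([[0, 1, 4, 5],
                  [1, 0, 3, 4],
                  [4, 3, 0, 2],
                  [5, 4, 2, 0]], dtype=float)
    print(classical_mds(D, k=2))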
The methodology discussed in the book was developed in fields such as
psychology, economics, and political science. In political science, for example,
spatial models can be used to measure a voter's level of conservatism via her
position on an ideological scale using a series of survey questions. The spatial
theory of voting is surveyed in Chapter 1.
The book considers various data types: data from issue scales, similarity and
dissimilarity data, rating scale data, and binary choice data. Each chapter is dedicated to a specific data type and the corresponding spatial models. The authors
discuss the assumptions the models make about the data-generating process and
about the utility functions used to formalize preferences. Since this is a survey
book, the material reflects the state of the art in spatial modeling of choice and
judgment. Some models and the subsequent inference enjoy quite a sophisticated treatment, while others are somewhat heuristic. In fact, some methods lack
formal statistical inference, including uncertainty quantification and hypothesis
testing. The authors do, however, discuss the use of nonparametric (Chapter 3)
and parametric (Chapter 5) bootstrap to estimate uncertainty in a few models. In
models for which a Bayesian formulation is given, the uncertainty is estimated
from the posterior distribution. Chapters 3, 4 and 6 address Bayesian treatments
of issue scales, perceptual, and preferential data models, respectively.
Chapter 7, the last chapter, discusses Bayesian extensions of spatial voting
models. The Bayesian analyses reviewed rely heavily on Markov chain Monte
Carlo (MCMC) to explore high-dimensional posterior distributions. The use
of Bayes' theorem and the incorporation of prior beliefs are not emphasized.
While MCMC is a useful tool, users may face difficulties with complex posterior
distributions or large datasets. In many cases, the careful use of convergence
diagnostics and burn-in that the authors advocate will be sufficient to ensure
proper inference. Assessing convergence, however, can be difficult in problems
with many parameters and users face additional challenges when the Markov
chain exhibits high autocorrelation, even after convergence. Neither these limitations nor recently developed tools such as variational approximations, which
can be used to facilitate the application of Bayesian models, are discussed. The
authors do discuss the sensitivity of Bayesian methods to the choice of the prior
distribution and of the more general specification of the model, and offer advice
on Monte Carlo methods to study the consequences of model misspecification.
(Although the book does not discuss robust Bayesian methods, it does describe,
in Chapter 6, the use of frequentist methods such as nonparametric procedures
that are robust against misspecification of the error distribution in classification problems.)
The book is well organized. Each chapter starts with a description of a particular data type and motivates it using examples. Then, the authors explain the
basic theory behind the relevant methods along with their historical developments. Estimation is discussed and demonstrated with the use of examples. The
implementation in R is presented along with an explanation of the computer code.
The authors do not assume the reader is familiar with R; they provide a description of the R programming environment in Chapter 2. The R code in the book is well
documented and the R outputs are clearly interpreted. The authors finish each
chapter with a review where they give a critical analysis of the methodology
discussed. Finally, each chapter contains a number of exercises, so that the book
can be used as a textbook for a graduate-level class in political science.
The book is accessible to applied researchers who are more interested in
applying the methods than in delving into their underlying theory. The step-by-step instructions given allow the reader to apply the methods directly. Understanding
the theoretical arguments, however, requires only college-level
algebra.

Tatiyana V. APANASOVICH
George Washington University

Discrete Models of Financial Markets


Marek CAPINSKI and Ekkehard KOPP. Cambridge: Cambridge University
Press, 2012, ix+181 pp., $44.99(P), ISBN: 978-0-521-17572-2.
This first volume in the Mastering Mathematical Finance series provides
an excellent and gentle introduction to the most important concepts in asset
pricing theory. The book should be readily accessible and useful to advanced
undergraduate students and will complement nicely the standard graduate level
texts on the subject. The coverage starts from first principles and builds quickly toward the first and second fundamental theorems of asset pricing, linking
the concepts of absence of arbitrage, risk-neutral probabilities, state prices, and
replicating portfolios, as well as their practical application. Each chapter is
peppered with a number of useful exercises, with solutions available on the
book's website, that will keep the interested reader busy for some time and curious
about the upcoming volumes in the series.
Chapter 1 presents a brief introduction to the concepts and topics that will
be covered in the book.
Chapter 2 introduces single-step binomial and trinomial option pricing trees.
The authors introduce early on the concept of arbitrage and discuss the restrictions on the binomial model parameters that are needed to avoid arbitrage. The
concept of a replicating portfolio coupled with the law of one price quickly leads
to the no-arbitrage value of a European call option. The authors also discuss the
issues arising in derivative pricing in incomplete markets when there are three
states of the world and only two underlying securities available to construct
a replicating portfolio. They discuss the sub-replicating and super-replicating
strategies and their prices. The chapter also has a brilliant discussion surrounding
the limiting case of market completion, when the sub-replicating price converges
to the super-replicating price, leading to the case of complete markets. There are
also a number of exercises to convince the reader as to the properties of option
prices, namely, convexity in the underlying price as well as the strike price, and
the important concept of put-call parity linking the values of European call and
put options to the underlying stock price and the strike price.
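To fix ideas, a minimal sketch of the one-step no-arbitrage argument follows; this is generic Python with invented parameter values, not code or notation from the book.

    def one_step_price(S0, u, d, r, payoff):
        """No-arbitrage price of a one-period claim via the risk-neutral probability."""
        q = (1 + r - d) / (u - d)                      # risk-neutral up probability
        assert 0 < q < 1, "need d < 1 + r < u, otherwise the model admits arbitrage"
        return (q * payoff(S0 * u) + (1 - q) * payoff(S0 * d)) / (1 + r)

    S0, K, u, d, r = 100.0, 100.0, 1.2, 0.9, 0.05      # illustrative values
    call = one_step_price(S0, u, d, r, lambda S: max(S - K, 0.0))
    put = one_step_price(S0, u, d, r, lambda S: max(K - S, 0.0))
    print(call - put, S0 - K / (1 + r))                # put-call parity: the two numbers agree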
Chapter 3 extends the idea of binomial option pricing to two periods and
introduces the important notion of filtration as a way of modeling the unfolding of uncertainty over time in a discrete setting. The chapter introduces the
workhorse model of discrete time option pricing following Cox, Ross, and
Rubinstein (1979) as well as the important concept of hedging the risk of a
derivative security using the option's delta, defined in a discrete setting. The authors also present and discuss the concept of a martingale as an important property
of the prices of financial securities under the risk-neutral probability measure.
Chapter 4 extends the ideas of martingales and conditional expectations developed in previous chapters to discuss the pricing of derivatives with multiple-step binomial models. The authors also introduce the most important theorem in asset
pricing, namely, the fundamental theorem of asset pricing, linking the absence of
arbitrage to a unique risk-neutral probability measure as well as a set of conforming Arrow–Debreu state prices. Several applications are included on calibrating
models to match the observed second moments of securities' returns as well as
the risk-free rate of return. The authors also discuss the pricing of forward and
futures contracts as well as more complex derivatives like knock-ins, knock-outs, and lookback options in the binomial pricing framework. There are also a
few examples of a protective put strategy, a covered call, and a butterfly spread.
Chapter 5 discusses the pricing of American-type options, where exercise
can take place at any point prior to the maturity of the option. It introduces the
concept of the Snell envelope as the discrete-time version of the free boundary of
early exercise and discusses its supermartingale properties. The chapter then
focuses on optimal stopping times and their properties in discrete-time models,
especially the case of an American call option on a dividend-paying stock.
The incorporation of the Doob decomposition and its importance to the optimal
exercise times for American-type options is of special interest. Finally, the
chapter concludes with the presentation of a few bounds on option prices when
early exercise is permitted. This includes the practically important generalization
of put-call parity to the case of American options, where instead of an equality
only two inequalities obtain.
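For readers who want to see the backward-induction (Snell envelope) idea in action, here is a generic sketch of an American put priced in a Cox–Ross–Rubinstein tree; the function name and parameter values are invented for illustration and are not taken from the book.

    import math

    def crr_american_put(S0, K, r, sigma, T, n):
        """American put in an n-step Cox-Ross-Rubinstein tree via backward induction."""
        dt = T / n
        u = math.exp(sigma * math.sqrt(dt))        # up factor
        d = 1 / u                                  # down factor
        disc = math.exp(-r * dt)
        q = (math.exp(r * dt) - d) / (u - d)       # risk-neutral up probability
        # option values at maturity, indexed by the number of up moves j
        values = [max(K - S0 * u**j * d**(n - j), 0.0) for j in range(n + 1)]
        # Snell envelope: value = max(immediate exercise, discounted continuation)
        for i in range(n - 1, -1, -1):
            values = [max(K - S0 * u**j * d**(i - j),
                          disc * (q * values[j + 1] + (1 - q) * values[j]))
                      for j in range(i + 1)]
        return values[0]

    print(crr_american_put(S0=100, K=100, r=0.05, sigma=0.2, T=1.0, n=200))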
Chapter 6 applies the ideas of a binomial process to the pricing of bonds in a
multi-period discrete-time setting. It introduces the important concepts of spot
and forward interest rates as well as the short rate. The authors apply
the binomial model to the pricing of fixed-coupon bonds, floating-rate bonds as
well as interest rate swaps. The chapter also contains a detailed discussion of the
Ho and Lee (1986) model of the term structure of interest rates as well as a lot
of detail on its calibration and properties, notably the constant variance of the
short rate. It would have been nice to also see a brief mention or discussion of
other popular fixed-income models like Black, Derman, and Toy (1990), Black
and Karasinski (1991), and Heath, Jarrow, and Morton (1990, 1992), but perhaps
space limitations constrained the choice of the leading fixed-income model for
the authors to focus on.
This book fills an important gap in the literature between standard undergraduate finance texts and graduate-level finance texts. The steep gradient between
those two sets of texts presents a void ripe for the taking. In the mind of this
reviewer, Discrete Models of Financial Markets fills this void rather nicely and
leaves the interested reader with a glimpse of what is to come in the follow-up
volumes in the series.
Paskalis GLABADANIDIS
University of Adelaide

REFERENCES
Black, F., Derman, E., and Toy, W. (1990), "A One-Factor Model of Interest Rates and Its Application to Treasury Bond Options," Financial Analysts Journal, 46, 33–39.
Black, F., and Karasinski, P. (1991), "Bond and Option Pricing When Short Rates Are Lognormal," Financial Analysts Journal, 47, 52–59.
Cox, J. C., Ross, S. A., and Rubinstein, M. (1979), "Option Pricing: A Simplified Approach," Journal of Financial Economics, 7, 229–263.
Heath, D., Jarrow, R., and Morton, A. (1990), "Bond Pricing and the Term Structure of Interest Rates: A Discrete Time Approximation," Journal of Financial and Quantitative Analysis, 25, 419–440.
Heath, D., Jarrow, R., and Morton, A. (1992), "Bond Pricing and the Term Structure of Interest Rates: A New Methodology for Contingent Claims Valuation," Econometrica, 60, 77–105.
Ho, T. S. Y., and Lee, S.-B. (1986), "Term Structure Movements and Pricing of Interest Rate Contingent Claims," Journal of Finance, 41, 1011–1029.

Linear Algebra and Matrix Analysis for Statistics


Sudipto BANERJEE and Anindya ROY. Boca Raton, FL: CRC Press 2014,
xvii+565 pp., $79.95(H), ISBN: 978-1-4200-9538-8.
There are many books on linear algebra intended for statisticians. After all,
linear algebra is the foundation of much of our field and a firm understanding
of matrix manipulations, vector spaces, orthogonality, projections, and various
matrix decompositions is essential for graduate study in statistics, biostatistics,
and related disciplines. The most direct connections between linear algebra and
statistics occur in the theory of the linear model, which is undoubtedly why so
many linear models experts have written texts on linear/matrix algebra. Books in
this category include works by Searle (1982), Harville (1997), Graybill (2001)
and Seber (2008). Other matrix analysis texts oriented to statistics include books
by Schott (2005) and Gentle (2007). Despite this crowded landscape, Banerjee
and Roy have managed to offer a unique and remarkable book, Linear Algebra
and Matrix Analysis for Statistics (or LAMAS hereafter), which has much to
offer that is not found elsewhere.
In the preface of LAMAS, the authors distinguish their approach from many
of their potential competitors. Their goal was to write a self-contained book
that does not assume prior knowledge of linear algebra, suitable for beginning
statistics or biostatistics students who may come to these fields from disciplines
other than mathematics. While accommodating such an audience, they were
mindful of not "dumbing down" the subject (p. xv). Indeed, their book could
never be accused of that fault; instead it is rigorous in its presentation and
comprehensive in its coverage. The authors suggest that LAMAS could be used
as a companion text in the more theoretical courses on linear regression or as
the basis for a one-semester course devoted to linear algebra for statistics and
econometrics (p. xvii).
The book comprises 16 chapters, starting with a basic introduction to matrices, vectors and operations on them (Ch. 1); followed by two chapters on
systems of linear equations covering Gaussian and Gauss-Jordan elimination,
matrix inverses (without determinants), LU, LDU and Cholesky factorizations
(Chs. 2 and 3); a chapter on (Euclidean) vector spaces and the four fundamental
subspaces (Ch. 4); a chapter on matrix rank (Ch. 5); a chapter on complementary subspaces (Ch. 6); and two chapters on orthogonality, including orthogonal
subspaces, matrices, and projections (Chs. 7 and 8). Generalized inverses are
introduced in Chapter 9, while Chapter 10 is entirely devoted to determinants.
Eigenvalues and eigenvectors are the focus of Chapter 11, and singular value
and Jordan decompositions are discussed in Chapter 12. The next two chapters
deal with quadratic forms (Ch. 13), and the Kronecker product and related tools
(Ch. 14). Chapter 15 discusses vector and matrix norms, linear iterative systems
of equations, and matrix convergence, and offers a rare (in this book) application
of these ideas involving internet search algorithms. The final chapter (Ch. 16)
discusses abstract linear algebra over fields, ending with a brief introduction to
Hilbert spaces.
The topical outline of the book should make it clear that it is ambitious
in scope. While the reader is allowed to start with the basics, s/he is drawn
rapidly along into the more advanced topics of linear algebra. In addition, the
presentation is no thumbnail sketch. Topics are developed systematically and
thoroughly, with a great deal of careful and well-written explanation. Often,
important results or concepts are revisited repeatedly after new methods and
ideas have been introduced that allow fresh perspectives or insights. Nearly all
results are proved in the text, and in many cases the authors offer several proofs
of the same theorem to highlight connections between newly introduced ideas
and material from earlier in the book.
The book is about matrix algebra per se. While the topics within that field
were chosen and given emphasis according to their importance to statistics,
the authors do not often explore the statistical problems to which these topics
apply. Certainly one reason for this must have been to limit the size and scope
of the text; as is, the book is a substantial volume. Nevertheless, I would have
liked to learn where some of the more advanced topics connect with statistical
methodology. Identifying such links more frequently, even if not pursuing them
to any great degree, would be a welcome addition if a second edition is ever
pursued.
The broad scope and detailed presentation of LAMAS are both a great strength
and a weakness of the text. Most alternatives to this book offer a limited survey
of linear algebra, focusing on areas of the field that are most relevant to linear
models and traditional multivariate analysis. Banerjee and Roy's book is much
more comprehensive and offers the reader a deeper and broader treatment of
linear algebra that will provide some of the necessary background for more
modern and advanced topics in statistics such as functional data analysis, large-p, small-n methods, and dimension reduction. And while many of the books on
matrix algebra for statisticians provide useful handbooks for quick reference,
Banerjee and Roy's book provides the basis for systematic and detailed study
of the topic. Indeed this is not the source for concise listings of the properties of
the trace/determinant/fill-in-the-blank. Given the ease with which such results
can be found on the internet, this is perhaps not much of a drawback.
The downside of Banerjee and Roy's thorough approach is that the book will
provide information overload for many potential readers. While, in principle, a
student could start from a modest mathematical background and learn all of the
linear algebra that most statisticians would ever need to know (and more) from
this book, this would require dedicated study over considerably longer than
a single semester. For most students (and most degree programs), a two-pass
approach is more feasible in which students learn the basics as an undergraduate
or in a remedial graduate course and then return to the subject later to study
more advanced topics as needed. Therefore, I am not enthusiastic about this
book as a primary text in a course on linear/matrix algebra for statisticians.
Rather I think it is an excellent choice as a supplementary resource in courses
on linear model theory and more advanced topics, and as the definitive resource
on linear algebra for research-oriented statisticians whose work intersects with
this branch of mathematics. Whether as a course text or for self-study, the reader
will benefit from numerous exercises at the end of each chapter (averaging 27
per chapter).
In Linear Algebra and Matrix Analysis for Statistics, Sudipto Banerjee and
Anindya Roy have raised the bar for textbooks in this genre. For me, this book
will be an invaluable resource for my teaching and research. While I do not think
that it is the ideal introduction to linear algebra for young statisticians or the
best reference for practitioners, it is an outstanding choice for research-oriented
statisticians who want a comprehensive theoretical treatment of the subject that
will take them well beyond the prerequisites for the study of linear models.

Daniel B. HALL
University of Georgia

REFERENCES
Gentle, J. E. (2007), Matrix Algebra: Theory, Computations, and Applications
in Statistics, New York: Springer.
Graybill, F. A. (2001), Matrices with Applications in Statistics (2nd ed.), Pacific
Grove, CA: Duxbury Press.
Harville, D. A. (1997), Matrix Algebra from a Statistician's Perspective, New
York: Springer.
Schott, J. R. (2005), Matrix Analysis for Statistics (2nd ed.), Hoboken, NJ:
Wiley.
Searle, S. R. (1982), Matrix Algebra Useful for Statistics, Hoboken, NJ:
Wiley.
Seber, G. A. F. (2008), A Matrix Handbook for Statisticians, Hoboken, NJ:
Wiley.

Richly Parameterized Linear Models: Additive, Time Series, and Spatial Models Using Random Effects

James S. HODGES. Boca Raton, FL: CRC Press, 2013, xxxviii+431 pp.,
$94.95(H), ISBN: 978-1-4398-6683-2.
Richly Parameterized Linear Models is a strong addition to the literature
on mixed models, offering a much more unified treatment and broader scope
than many existing texts. The book is intended for applied researchers and
covers a range of individual topics, including penalized splines, spatial analysis,
time series, and even Bayesian analysis. The book also provides an excellent
treatment of diagnostics for mixed models. Although Chapter 1 provides a brief
survey of mixed linear models, researchers already experienced in generalized
linear models and mixed linear models will benefit the most from this book.
A unique aspect of the book is its informal narrative. This is a refreshing
break from existing texts but is also counterproductive at times. For example,
the book regularly discusses "old-style" and "new-style" random effects. The
distinction between the two is necessary to treat several models under a common
framework, but this characterization may ultimately impede readers looking for
a more clear-cut distinction between different modeling assumptions.
Broadly, the book centers around two objectives. The first is to organize
under a common framework a series of models typically examined separately.
In particular, Chapters 1–7 serve to reexpress several models as variations
of a standard mixed linear model. With its broad scope, the book necessarily
sacrifices some theoretical details; however, for applied researchers most interested in the mechanics of mixed linear models and extensions, the book largely
succeeds in this first endeavor.
The second objective is to examine in detail the underlying assumptions of
different models and the effects of such assumptions on results in practice. The
book's opening quote characterizes this objective succinctly: "if you believe in
things that you don't understand, then you suffer." This theme persists in all
of the book's content and is particularly prevalent in Chapters 8–19, which
provide an excellent treatment of open questions in the literature and difficult
problems encountered in practice.
The literature on mixed models has expanded in recent decades, often in
disparate ways. Richly Parameterized Linear Models provides a step toward
unifying this growing area of research and serves as an excellent resource for
applied researchers with experience and interest in mixed models.

Ian M. MCCARTHY
Emory University
