Unveiling The Identity of PIN From The Flash Crash

Unveiling the Identity of PIN from the Flash Crash:
Illiquidity or Information Asymmetry?
Qin Lei
First Draft: October 25, 2010

Current Version: November 25, 2010
Qin Lei, Finance Department, Cox School of Business at Southern Methodist University, 6212
Bishop Blvd, Dallas, TX 75275-0333. Phone: (214) 768-3183. Email: qlei@smu.edu.
Electronic copy available at: http://ssrn.com/abstract=1697879

Abstract
This paper extends the original PIN framework to explicitly allow for the coexistence of
liquidity shocks and fundamental news, both of which can lead to order imbalances. The pseudo
market makers submit contrarian orders in the event of liquidity shocks and thus move the
stock prices back to the fundamental level. Consequently, the conventional PIN measure
consists of one component driven by the informed traders who receive the fundamental news
and another component due to pseudo market makers who arrive upon liquidity shocks. During
the flash crash on May 6, 2010, there is a nearly ten-fold market-wide increase in the
illiquidity component of PIN but no uniform increase in the information asymmetry
component, based on the estimation of the extended PIN model for common stocks
listed on NYSE and AMEX. In contrast, the original PIN model disallows liquidity shocks
and thus overestimates the extent of asymmetric information. In addition to introducing a
conceptually purer measure of asymmetric information than was previously available,
this paper also contributes to the literature through methodological improvements to the
PIN estimation: it provides a recipe to eradicate the numerical overflow and underflow
problems and to impute the daily PIN series from repeated estimations of quarterly PINs.
JEL Classifications: G10, G14
Keywords: Probability of Informed Trading (PIN), Information Asymmetry, Flash Crash,
Floating Point Overflow, Daily PIN Series

1 Introduction
The U.S. financial markets experienced a tumultuous day on May 6, 2010, when the Dow
Jones Industrial Average (DJIA) stock index witnessed its biggest intraday decline, 998.5
points, since its inception more than 100 years ago. Miraculously, the sharp decline across a
much wider spectrum of stocks than just the thirty component stocks in DJIA reversed itself
within a thirty-minute interval in the same day. Due to the dramatic plummet and swift reversal
of stock prices, this event became known as a “flash crash” in the popular press. The flash
crash continues to reverberate among market participants and policy makers, and academic
researchers are also keenly interested in the mechanism underlying this
event. I study the flash crash in this paper because this event amounts to a critical challenge
to the PIN literature, but at the same time presents a unique opportunity to unveil the
identity of PIN. This paper addresses the challenge in an extended PIN model that is better
supported by the data and exploits the flash crash as a natural experiment to reveal that
PIN does not always purely represent asymmetric information.
In a series of important papers, Easley, O’Hara and their coauthors design and test the
probability of information-based trading (with the shorthand PIN) to capture the most likely
fraction of informed trades among all orders submitted by informed and uninformed traders
under a statistical structure that describes the news and order arrival processes. The PIN
measure has subsequently been studied in a number of contexts to quantify the effect of
asymmetric information.¹ Yet the flash crash reveals a key weakness of the PIN measure in
that it overlooks the possibility of order imbalances induced by firm-specific or market-wide
liquidity shocks and thus would overestimate the fraction of informed trades during such
times. The notion of PIN as a measure of asymmetric information being contaminated by
liquidity effects extends beyond the flash crash event. In fact, the interpretation of PIN has
been subject to much controversy despite its popularity. For instance, Easley, Hvidkjaer, and
O’Hara (2002) find that a 10% difference in PIN of two stocks results in a 2.5% difference in
expected returns and interpret this finding as the information risk being priced. Duarte and
Young (2009) counter that the PIN factor is priced only because it is a proxy for illiquidity
in light of the evidence regarding the disappearance of the pricing power for the private
information factor after controlling for illiquidity. Even before the identity crisis of PIN in
the asset pricing context, there has been some confusion in the literature over the varying
interpretations of PIN.² In light of the potential dual roles of the PIN measure, it is important
¹ Here is an incomplete list of studies that apply the PIN measure outside the market microstructure field.
Easley and O’Hara (2004) and Duarte, Han, Harford and Young (2008) study the effect of PIN on the cost
of capital. Vega (2006) and Jayaraman (2008) examine the role of PIN in the context of corporate earnings.
Bharath, Pasquariello and Wu (2009) study the relationship between PIN and the capital structure. Easley,
Hvidkjaer and O’Hara (2002) and Duarte and Young (2009) examine the pricing power of an aggregated PIN
factor in the asset pricing context.
² For instance, Easley, Kiefer, O’Hara and Paperman (1996) understandably advocate PIN as a measure of
private information, yet Easley, Engle, O’Hara and Wu (2008) assert PIN as “a simple measure of illiquidity”
to ascertain the true identity of the PIN measure. Is it a pure measure of information
asymmetry as originally designed? Or is it confounded by the extent of illiquidity? If the
latter is true, how can one carve the illiquidity component out of the PIN measure
and obtain a purer measure of asymmetric information? Answering these questions
requires disentangling the role of asymmetric information from the role of illiquidity, and it
can be done either qualitatively or quantitatively. This paper takes the initiative to do both.
Given the elusive nature of illiquidity and information asymmetry, it is difficult to tell
them apart unless there is some exogenous shock that naturally separates them. The flash
crash on May 6, 2010, is such a natural experiment because it enables a qualitative distinction
between the two competing interpretations for the PIN measure. The U.S. Commodity
Futures Trading Commission (CFTC) and the Securities and Exchange Commission (SEC)
attribute the flash crash to a short-lived liquidity crisis on both the index futures market and
the equity market (e.g., CFTC-SEC, 2010). This quick episode of a market-wide liquidity
crunch would necessarily imply a hike in an illiquidity measure for many stocks on the day
of the flash crash. Consequently, one can largely rule out a uniform increase in the estimated
PIN if it purely measures the extent of asymmetric information. I repeatedly estimate the
quarterly PINs with and without the day of the flash crash and then impute the daily PIN
series based on the knowledge that the informed orders have to add up over time. It turns
out that there is a marked increase in the estimated PINs for almost all stocks on May 6,
2010, and this finding goes against PIN as a pure measure of asymmetric information. It
appears unlikely that there is a simultaneous hike in the amount of private information across
all stocks. A systematic liquidity shock is more plausible than the common arrival of private
information across all stocks at once, especially because of the sharp and swift price reversal
on the day of the flash crash. Stock prices are supposed to gradually incorporate information
revealed from the informed orders and thus the information-based trading activities would
imply a price continuation rather than a sharp reversal. In other words, the qualitative
inference around the flash crash event suggests that the empirical data lean in favor of the
illiquidity interpretation for the PIN measure at least during the flash crash.
Having achieved the qualitative separation between asymmetric information and illiquidity,
I turn to quantifying the distinction so as to obtain a purer measure of information
asymmetry. For this purpose, I extend the original PIN framework to explicitly allow for
the coexistence of liquidity shocks and fundamental news, both of which can lead to order
imbalances. The probability that a news event is a liquidity shock is taken from the actual
frequency with which sizeable intraday price reversals occur. The idea is that fundamental news
should be steadily incorporated into stock prices without major reversal within a short time span,
while a sharp and quick reversal in stock prices is the hallmark of liquidity shocks that are
(p. 190). Amihud (2002) treats the PIN measure as both a “finer and better measure of illiquidity” (p. 32)
and a “measure of microstructure risk ... that reflects the adverse selection cost resulting from asymmetric
information” (p. 34).
unrelated to the fundamental news. The pseudo market makers submit contrarian orders
in the event of liquidity shocks and thus move the stock prices back to the fundamental
level. Consequently, the conventional PIN measure consists of one component driven by the
informed traders who receive the fundamental news and another component due to pseudo
market makers who arrive upon liquidity shocks. During the flash crash on May 6, 2010, there
is a nearly ten-fold market-wide increase in the illiquidity component of PIN that would have
been mistakenly attributed to information-based trading under the original PIN model that
disallows liquidity shocks. The information asymmetry component also rises relative to the
preceding quarter but not in a uniform manner. The daily increment in asymmetric information
is not statistically significant at the 1% level for four of ten volume deciles, and stocks in
the highest two volume deciles experience the largest hike since they have a very low level of
asymmetric information in the preceding quarter. These findings could suggest an environment
with higher information asymmetry among those heavily traded stocks on the day of the
flash crash. The staff report in CFTC-SEC (2010) identifies the flash crash as initiated in the
S&P 500 index futures market. Since the index futures market leads the equity market on an
intraday basis (e.g., Chan, 2002), it is plausible that some of the S&P 500 index component
stocks were indeed traded as if accompanied by material private information on the day of
the flash crash. This circumstance would naturally translate into a more prominent effect
among the most heavily traded stocks.
Using a novel idea to trace the identity of PIN through the flash crash as a natural experiment,
this paper helps to address some challenges that the original PIN model faces. Aktas,
de Bodt, Declerck, and Van Oppens (2007) document the apparent difficulty in reconciling
the information leakage with the lower PIN estimates prior to the announcements of mergers
and acquisitions. One would have expected a higher estimated PIN during periods of information
leakage if PIN purely captures the extent of information asymmetry. Though Aktas
et al. label the inconsistency as a “PIN anomaly”, it is possible that the stock liquidity actually
improves when traders exploit the leaked information so long as the lower PIN estimates
reflect lower illiquidity. Therefore, it warrants further investigation to see if the extended
PIN model resolves the anomaly.
This paper is closely related to Duarte and Young (2009) in that both papers extend
the original PIN framework to address the concern that the original PIN measure captures
illiquidity as well as information asymmetry. One critical distinction is that my extension
explicitly allows pseudo market makers to submit one-sided orders upon the occurrence of
liquidity shocks and thus addresses the problematic premise in the original PIN framework
that informed traders are the exclusive source of order imbalances. Duarte and Young (2009)
acknowledge the possibility that order imbalances could also result from liquidity shocks
rather than informed trades, leading to potentially misleading inferences from the problematic
premise. However, they also carry on the tradition in Easley, Kiefer, O’Hara and Paperman
(1996) that order imbalances are interpreted as an exclusive indication for informed trades,
and leave the issue unaddressed as “an important caveat”. Moreover, Duarte and Young
(2009) have a fairly different motivation behind their extension compared to this paper. My
model extension is inspired by the flash crash and I introduce the pseudo market makers
to break the exclusivity of the informed traders in creating order imbalances. In contrast,
Duarte and Young (2009) are most concerned about the mismatch between theory and reality
because the observed correlation between buy orders and sell orders is positive even though
the original PIN framework implies a negative correlation. They introduce symmetric positive
order flow shocks on both the buy side and the sell side to accomplish the goal of eliminating
the mismatch.
Beyond the theoretical extension of the PIN framework to explicitly allow for liquidity
effects to coexist with, and thus be separated from, information asymmetry, this paper
also contributes to the literature by providing methodological improvements to the PIN
estimation. Specifically, I design one simple procedure to dynamically factorize the daily
log-likelihood function for the maximum likelihood estimation of PIN and effectively eliminate
the numerical overflow and underflow problems that have long plagued academic researchers
and practitioners alike. The buy and sell orders have steadily increased in recent years,
especially in light of the prevalence of algorithmic trading that often splits large orders into
smaller pieces. The explosive growth of the number of trades often contributes to the failure
of PIN estimations. After applying the dynamic factorization scheme, my estimation has a
100% convergence rate while avoiding corner solutions and local maxima. Without applying
the scheme, however, the estimation failure rate is a staggering 54.88%. I also make available
a technique to impute the daily PIN series through repeated estimations of quarterly
PINs. Researchers are expected to benefit from these methodological improvements in different
settings, especially among studies of short-lived corporate events where the change of
asymmetric information needs to be measured on a daily basis.
The balance of the paper proceeds as follows. Section 2 describes the original PIN frame-
work and proposes a few methodological improvements to the PIN estimation. The PIN
framework is then extended in Section 3 to explicitly allow for liquidity shocks. Section 4
contains the empirical analysis and the concluding remarks are in Section 5.
2 PIN Estimation
2.1 Original PIN Framework
The expanding literature concerning the probability of informed trading is built on the
theoretical foundation in Easley and O’Hara (1992). Easley, Kiefer and O’Hara (1996) and
especially Easley, Kiefer, O’Hara and Paperman (1996) popularize the PIN measure by
providing an empirical recipe for the maximum likelihood estimation of PIN. The parsimonious
structure in Easley, Kiefer, O’Hara and Paperman (1996) becomes the natural starting point
for many subsequent papers that extend the trading process, the parameterization underlying
the PIN measure or both. In essence the original PIN framework in Easley, Kiefer, O’Hara
and Paperman (1996) imposes a statistical structure on the observed order flows for a given
stock and relies on the parameter values that maximize the sample likelihood to compute the
average fraction of orders due to information-based trading.
Only two types of investors trade stocks in the setting of Easley, Kiefer, O’Hara and
Paperman (1996), either informed or uninformed. The orders from these traders are modeled
as Poisson processes with arrival rates $\mu$ and $\varepsilon$ for the informed traders and the
uninformed traders, respectively. While the uninformed traders submit both buy orders and sell
orders with equal probabilities on average, the informed traders commit to one-sided orders that
are consistent with the private news about the stock fundamentals. There is a probability $\alpha$
that the fundamental news would arrive on any trading interval, and the arrived news has a
probability $\delta$ of being negative. Therefore, there is a probability $\alpha\delta$ for a
trading interval to be associated with bad news, during which the informed traders submit only
sell orders. Likewise, there is a probability $\alpha(1-\delta)$ that the informed traders submit
only buy orders on a trading interval with good news. When there is a lack of news with
probability $1-\alpha$, only the uninformed traders participate in trading the stock.
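Easley, Kiefer, O’Hara and Paperman (1996) recommend aggregating the order flows at
the daily level for all stocks so that the modeled trading interval lasts exactly one day. The
daily likelihood of observing $B$ buy orders and $S$ sell orders on one specific stock is

$$
\begin{aligned}
L[(B,S)\mid\Theta] ={}& (1-\alpha)\,\exp(-2\varepsilon)\,\frac{\varepsilon^{B}}{B!}\,\frac{\varepsilon^{S}}{S!} \\
&+ \alpha\delta\,\exp(-\mu-2\varepsilon)\,\frac{\varepsilon^{B}}{B!}\,\frac{(\mu+\varepsilon)^{S}}{S!} \\
&+ \alpha(1-\delta)\,\exp(-\mu-2\varepsilon)\,\frac{(\mu+\varepsilon)^{B}}{B!}\,\frac{\varepsilon^{S}}{S!},
\end{aligned}
\tag{1}
$$

where $\Theta \equiv (\alpha, \delta, \mu, \varepsilon)$ denotes the vector of parameters to be estimated.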
The standard practice in the literature is that under the assumption of constant parameters
over each calendar quarter, one can estimate the set of parameters that maximize the
sample likelihood of observing the daily order flows. The average orders from the informed
traders are $\alpha\mu$ while the uninformed traders contribute $2\varepsilon$. Therefore, the most
likely fraction of informed orders, or the probability of informed trading, can be defined as

$$
\mathrm{PIN} = \frac{\alpha\mu}{\alpha\mu + 2\varepsilon}. \tag{2}
$$
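
As an illustration only, the daily likelihood in equation (1) and the PIN ratio in equation (2) can be sketched in a few lines of Python. The function and argument names (`poisson_pmf`, `daily_likelihood`, `pin`, `alpha`, `delta`, `mu`, `eps`) are assumptions of this sketch; the paper itself reports using SAS, and nothing below is taken from that code.

```python
import math


def poisson_pmf(n, rate):
    """Poisson probability exp(-rate) * rate**n / n!, evaluated through logs."""
    return math.exp(-rate + n * math.log(rate) - math.lgamma(n + 1))


def daily_likelihood(B, S, alpha, delta, mu, eps):
    """Mixture likelihood of one day with B buys and S sells (equation 1)."""
    no_news = (1 - alpha) * poisson_pmf(B, eps) * poisson_pmf(S, eps)
    bad_news = alpha * delta * poisson_pmf(B, eps) * poisson_pmf(S, mu + eps)
    good_news = alpha * (1 - delta) * poisson_pmf(B, mu + eps) * poisson_pmf(S, eps)
    return no_news + bad_news + good_news


def pin(alpha, mu, eps):
    """Probability of informed trading (equation 2)."""
    return alpha * mu / (alpha * mu + 2 * eps)
```

With thousands of orders per day the direct evaluation silently returns 0.0 in double precision, so its logarithm is undefined; this is the numerical problem that Sections 2.2 and 2.3 address.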
It is fairly intuitive that under this framework the informed traders are the sole source of
order imbalance by construction and the observation of high order imbalance is necessarily
associated with a high level of estimated PIN. As long as the order imbalance can be
exclusively attributed to the informed traders, it is straightforward to demonstrate that the PIN is
essentially equivalent to the absolute percentage order imbalance. Three teams of researchers
uncover this relationship independently around the same time. Kaul, Lei and Stoffman (2005)
derive it through the change of variables using a system of equations after invoking perfect
foresight. Aktas, de Bodt, Declerck, and Van Oppens (2007) find that the PIN is the ratio of
expected absolute order imbalances to expected total orders. Easley, Engle, O’Hara and Wu
(2008) document the same relationship with a first-order approximation while noting that
the expected absolute difference of Poisson variables is quite complicated.
2.2 Common Factorization of Log-likelihood
As the number of orders gets very large, the likelihood function becomes harder, and even
impossible in certain cases, to compute due to the factorial, the exponential and the power
functions. Regardless of the specific hardware and software used for the computation, there
are limits on the maximum and minimum numbers allowed, beyond which an overflow and
underflow error would be triggered, respectively. To get around this issue, one can re-arrange
the likelihood function to produce a common factor whose natural logarithm is easy to
compute. That is, rewrite the likelihood function as

$$
L[(B,S)\mid\Theta] = \frac{c \cdot m}{B!\,S!},
$$

where the common factor $c$ makes $\ln(c)$ easy to compute and the multiplicative factor $m$ is
constructed to moderate the magnitude of inputs to the exponential functions and the power
functions. One can skip calculating the common factorial in the denominator because it does
not involve any parameters to be estimated and thus affects only the absolute magnitude of
the likelihood value.
After purging some constants unrelated to parameters in $\Theta$, one can write the daily
log-likelihood function in the following computation-friendly form,

$$
\begin{aligned}
\mathcal{L}(\Theta) ={}& -2\varepsilon + (B+S)\ln(\varepsilon) \\
&+ \ln\bigl\{(1-\alpha) + \alpha\delta\,\exp[-\mu - S\ln(k)] + \alpha(1-\delta)\,\exp[-\mu - B\ln(k)]\bigr\},
\end{aligned}
\tag{3}
$$

where $k \equiv \dfrac{\varepsilon}{\mu+\varepsilon}$, and thus $\ln(k) \le 0$.
The ratio $k$ of arrival rates is bounded between 0 and 1, and thus the inputs for the exponential
functions can be of moderate size.
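
A direct transcription of equation (3) may help to see both what the common factorization buys and where it still breaks. The sketch below is a hedged illustration; the name `loglik_common` and its argument names are assumptions, not the author's code.

```python
import math


def loglik_common(B, S, alpha, delta, mu, eps):
    """Daily log-likelihood in the common-factor form of equation (3).

    The constant -ln(B!) - ln(S!) is dropped because it does not depend on
    the parameters being estimated.
    """
    ln_k = math.log(eps / (mu + eps))  # ln(k) <= 0 since 0 < k <= 1
    inner = (
        (1 - alpha)
        + alpha * delta * math.exp(-mu - S * ln_k)
        + alpha * (1 - delta) * math.exp(-mu - B * ln_k)
    )
    return -2 * eps + (B + S) * math.log(eps) + math.log(inner)
```

For moderate order counts this agrees with the logarithm of equation (1) up to the dropped factorial terms. With a heavily one-sided day such as `loglik_common(10, 2000, 0.4, 0.5, 300.0, 100.0)`, however, the input to `math.exp` is roughly 2473 and Python raises `OverflowError`, which is the situation the next subsection deals with.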
2.3 Eliminating Overflow and Underflow Problems
The common factorization in equation (3) works reasonably well to alleviate the overflow and
underflow problems among stocks with low to moderate trading volumes, but it is far from
eliminating these problems. Stocks with high trading volume often suffer from the overflow
and underflow problems even after the moderation introduced by the factorization. Given
the recent trends of institutional investors breaking up their orders into smaller pieces and
the increasing prevalence of high frequency traders who often submit orders of small size,
more and more stocks fall into the category for which the PIN estimation simply fails. Note,
however, that the overflow and underflow problems do not afflict only stocks with high
trading volumes. It takes no more than one day of severely one-sided order flows, or the
optimization procedure steering one of the parameters of interest into a region of values that
triggers an overflow or underflow, to break down the estimation process.
In contrast to the dire situation regarding the empirical estimation of PIN, the PIN measure
as a theoretical concept has clearly gained popularity among researchers who are keen
to measure the extent of information asymmetry in various contexts. It is thus understandably
desirable to effectively eliminate the overflow and underflow problems from the PIN
estimation. This paper provides one such solution by dynamically changing the factorization
process for each pair of order flows on a daily basis so as to actively avoid triggering any
overflow or underflow error.
To implement the dynamic factorization, it is necessary to first identify the trigger value
for overflow and underflow errors in the hardware and software combination for the PIN
estimations. In order to obtain the trigger value, the researcher can keep increasing the
input value $C$ to an exponential function $\exp\{|C|\}$ until the calculation fails. For instance, I
use the SAS software on a desktop computer that associates $\exp\{708\}$ with an overflow and
$\exp\{-708\}$ with an underflow. Alternatively, one can use the “constant” function in SAS to
identify the trigger values. In my computer, I find that constant(‘logsmall’) = −708.396 and
constant(‘logbig’) = 709.783. In other words, the combination of my computing hardware
and software yields an approximate critical value $C = 708$, and the factorization of the daily
log-likelihood function has to be done in a way to actively avoid numbers outside the range
$[\exp(-C), \exp(C)]$, which is equivalent to $[10^{-C/\ln(10)}, 10^{C/\ln(10)}]$. Otherwise, an overflow
or underflow problem can occur.
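
The same trigger values can be read off in other environments that use IEEE 754 double precision. The Python sketch below is offered as a cross-check rather than as the paper's procedure; `exp_input_limits` is a hypothetical helper name, while `sys.float_info.min` and `sys.float_info.max` are the standard-library constants for the smallest positive normalized double and the largest finite double.

```python
import math
import sys


def exp_input_limits():
    """Return (log_small, log_big): the most negative and most positive inputs
    for which exp() stays inside the normalized double-precision range."""
    log_small = math.log(sys.float_info.min)
    log_big = math.log(sys.float_info.max)
    return log_small, log_big


log_small, log_big = exp_input_limits()
print(round(log_small, 3), round(log_big, 3))  # IEEE 754 doubles: -708.396 709.783
```

One behavioral difference is worth noting: in Python, `math.exp` raises `OverflowError` above the upper limit but quietly returns 0.0 far below the lower limit, so underflow has to be guarded against explicitly rather than caught as an error.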
Note that the overflow and underflow problems affect mainly the multiplicative factor $m$
of the daily likelihood. The basic strategy of my factorization scheme is to pull the largest
exponential input out of the multiplicative factor $m$, make it part of the common factor $c$,
and identify all occasions on which an exponential that would trigger an underflow problem
must be replaced with zero. The Appendix spells out the full details of a simple three-step
procedure to dynamically factorize the daily log-likelihood function. Once the overflow and
underflow problems are completely eliminated through the dynamic factorization algorithm, it is
fairly easy to conduct a grid search over different regions of parameter values so as to ensure a
global maximization.
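
Because the Appendix is not reproduced here, the following is only a minimal sketch of the general idea, using the standard log-sum-exp device: the largest exponent is pulled out of the sum, and the remaining exponentials are at most one. It is not claimed to be the paper's exact three-step procedure, and the name `loglik_stable` is an assumption of the sketch.

```python
import math


def loglik_stable(B, S, alpha, delta, mu, eps):
    """Equation (3) evaluated with the largest exponent factored out."""
    ln_k = math.log(eps) - math.log(mu + eps)
    regimes = (
        (1 - alpha, 0.0),                           # no news
        (alpha * delta, -mu - S * ln_k),            # bad news
        (alpha * (1 - delta), -mu - B * ln_k),      # good news
    )
    # Work with ln(weight) + exponent; regimes with zero weight drop out.
    logs = [math.log(w) + e for w, e in regimes if w > 0]
    top = max(logs)
    # Every input to exp() is <= 0, so overflow is impossible; terms that
    # underflow are returned as 0.0 by math.exp and contribute nothing.
    rest = sum(math.exp(x - top) for x in logs)
    return -2 * eps + (B + S) * math.log(eps) + top + math.log(rest)
```

The design choice is that the maximum is recomputed for every day's pair $(B, S)$ and every trial parameter vector, which mirrors the "dynamic" aspect described above: the factor that is pulled out changes with the data and with the optimizer's current guess.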
2.4 Computing Daily PIN Series
Information asymmetry matters in many contexts. Empirical researchers very often seek to
measure the change in the extent of asymmetric
information around certain corporate events that can be short-lived. It is the common practice
in the literature to estimate the PIN measure over one quarter of daily data for a given stock.
Therefore, the estimated quarterly PIN measures are not well suited for studying corporate
events whose effects on information asymmetry may last only a day or two. While there are
studies that extend the original theoretical framework underlying the PIN measure to allow
for the estimation of daily PIN series (e.g., Lei and Wu, 2005; Easley, Engle, O’Hara and
Wu, 2008), the high frequency series comes at the cost of imposing more elaborate structures
on the observed order flows and thus the extended models are not nearly as popular as the
simple estimation of quarterly PINs. In fact, even the studies that promote the extended
PIN models allowing for high frequency PIN series often avoid a large scale estimation for
many stocks and restrict the exercise to a selected few stocks instead.
In this paper, I propose a simple method to impute the daily PIN series from the quarterly
PIN estimates. The basic idea is to estimate the quarterly PIN measures with and without
the trading day $t$ and infer the daily PIN measure from the difference in quarterly PIN
estimates. Denote by $N_x$ the total number of trades in the quarter (e.g., 62 trading days)
prior to trading day $t$. Denote by $N_t$ the total number of trades in trading day $t$. Then the
cumulative total number of trades over a 63-trading-day span ended on trading day $t$ is

$$
N_c = N_x + N_t.
$$

Denote by $\mathrm{PIN}_x$ the PIN estimated from using the first 62 days of trades. Denote by $\mathrm{PIN}_c$
the PIN estimated from using the 63 days of trades. Denote by $\mathrm{PIN}_t$ the imputed PIN
measure for trading day $t$. Clearly, the informed orders have to add up over the period of 63
days. In other words, the following relation holds:

$$
N_x\,\mathrm{PIN}_x + N_t\,\mathrm{PIN}_t = N_c\,\mathrm{PIN}_c.
$$
Substituting the definition of the cumulative total number of trades and re-arranging the
terms, the implied daily PIN measure is

$$
\mathrm{PIN}_t = \mathrm{PIN}_c + \frac{N_x}{N_t}\,(\mathrm{PIN}_c - \mathrm{PIN}_x). \tag{4}
$$

The daily incremental PIN relative to the prior 62 trading days is

$$
\mathrm{PIN}_t - \mathrm{PIN}_x = \frac{N_c}{N_t}\,(\mathrm{PIN}_c - \mathrm{PIN}_x).
$$

So one alternative representation of the imputed daily PIN measure is

$$
\mathrm{PIN}_t = \mathrm{PIN}_x + \frac{N_c}{N_t}\,(\mathrm{PIN}_c - \mathrm{PIN}_x). \tag{5}
$$
The intuition behind the daily PIN measure is straightforward. The PIN estimated over
63 trading days is essentially a weighted average of the PIN measure on day t and the PIN
estimated over the preceding 62 trading days. If the PIN measure over the period inclusive
of the trading day t is higher than the PIN measure excluding the trading day t, then it must
be the case that the PIN on the trading day t is higher than before. On the other hand,
if the PIN drops lower on the trading day t, then the inclusive PIN measure must be lower
than the PIN measure excluding the trading day t.
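
A small numerical example may make the imputation concrete. The sketch below implements the representation in equation (5); the function name `impute_daily_pin` and the trade counts are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def impute_daily_pin(pin_x, pin_c, n_x, n_t):
    """Imputed PIN on day t from the quarterly estimates without (pin_x)
    and with (pin_c) that day, weighted by trade counts n_x and n_t."""
    n_c = n_x + n_t
    return pin_x + (n_c / n_t) * (pin_c - pin_x)


# Hypothetical inputs: 62,000 trades over the prior 62 days, 2,000 trades on day t.
# A rise in the quarterly estimate from 0.10 to 0.11 then implies a daily PIN
# of 0.10 + (64,000 / 2,000) * 0.01 = 0.42.
daily = impute_daily_pin(pin_x=0.10, pin_c=0.11, n_x=62_000, n_t=2_000)
```

The example illustrates the leverage in the formula: because day $t$ carries only a small share of the trades, a one-percentage-point move in the quarterly estimate translates into a much larger imputed daily value, so small estimation errors in either quarterly PIN are magnified by the factor $N_c / N_t$.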
The inference above also delivers a boundary condition between $\mathrm{PIN}_x$ and $\mathrm{PIN}_c$ as an
added benefit. Since the daily PIN is bounded between zero and one, the estimated $\mathrm{PIN}_c$
must be bounded as well,

$$
\frac{N_x}{N_c}\,\mathrm{PIN}_x \;\le\; \mathrm{PIN}_c \;\le\; \frac{N_x}{N_c}\,\mathrm{PIN}_x + \frac{N_t}{N_c}. \tag{6}
$$
In practice, estimated pairs of $\mathrm{PIN}_x$ and $\mathrm{PIN}_c$ that do not satisfy the above boundary
condition should be re-visited. The violation of the boundary condition could have resulted
from local maximum rather than global maximum estimates for either $\mathrm{PIN}_x$ or $\mathrm{PIN}_c$ and
thus re-estimations may help. If the model structure is too rigid to fit the data well, however,
the estimated pairs have to be either discarded or re-estimated under an alternative model
structure. For instance, an estimation window with a different time span may fit the data
better, with daily order flows over one month as opposed to one quarter.
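
The boundary condition in (6) is straightforward to automate as a screening step after estimation. The following sketch assumes hypothetical helper names (`pin_c_bounds`, `satisfies_boundary`) and a small numerical tolerance `tol`, none of which come from the paper.

```python
def pin_c_bounds(pin_x, n_x, n_t):
    """Lower and upper bounds on the inclusive-quarter PIN implied by
    a daily PIN that must lie between zero and one."""
    n_c = n_x + n_t
    lower = (n_x / n_c) * pin_x
    upper = lower + n_t / n_c
    return lower, upper


def satisfies_boundary(pin_x, pin_c, n_x, n_t, tol=1e-12):
    """True when the estimated pair (pin_x, pin_c) is consistent with (6)."""
    lower, upper = pin_c_bounds(pin_x, n_x, n_t)
    return lower - tol <= pin_c <= upper + tol
```

With 62,000 prior trades, 2,000 trades on day $t$ and $\mathrm{PIN}_x = 0.10$, the admissible range for $\mathrm{PIN}_c$ is roughly 0.0969 to 0.1281; an estimate of 0.15 would fail the check and flag the pair for re-estimation.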
3 PIN Extension
The flash crash on May 6, 2010, provides a good motivation to introduce liquidity shocks into
an extended PIN framework. In the event of a firm-specific or market-wide liquidity shock,
the stock prices can experience a sizeable reversal over a short period of time that would not
necessarily be consistent with the presence of informed traders. Stock prices are typically
assumed to gradually incorporate the information revealed from the informed orders and
thus the information-based activities would imply a price continuation rather than a sizeable
reversal. One way of justifying the sizeable price reversals associated with liquidity shocks is
to introduce a third group of investors, known as the pseudo market makers, whose orders
arrive only upon the liquidity shocks. In the extended PIN model, the pseudo market makers
trade in a contrarian fashion in the same way as market makers would and thus move stock
prices back to the fundamental level.
3.1 Revised Trade Process and Sample Likelihood
It is useful to extend the news arrival process so that news reflects either signals about the
fundamental value of the stock or simply liquidity shocks. The trading process can be revised
as follows. There is a news event in each trading interval (e.g., one day) with probability $\alpha$.
Conditional on the arrival of a news event, there is a probability $\theta$ that the news reflects the
liquidity shock and a probability $1-\theta$ that the news reflects the value-relevant fundamental.
The orders from the pseudo market makers arrive only upon the occurrence of a liquidity shock
and follow a Poisson process with arrival rate $\eta$. As before, the informed orders arrive upon
the release of fundamental news and follow a Poisson process with arrival rate $\mu$. Regardless
of whether a news event occurs, the uninformed orders always arrive in each trading interval
and follow a Poisson process with arrival rate $\varepsilon$. Irrespective of the news type, each news
event has an identical probability $\delta$ of being negative. The uninformed orders are insensitive
to the nature of the news and thus balanced across the buy and sell sides. In contrast, both the
informed investors and the pseudo market makers submit only one-sided orders depending
upon the nature of the news. Specifically, the liquidity shock triggers only buy (or sell) orders from
the pseudo market makers on days with bad (or good) news, while the fundamental news
induces only sell (or buy) orders from the informed traders on days with bad (or good) news.
Denote by $\Theta \equiv (\alpha, \theta, \delta, \mu, \eta, \varepsilon)$ the vector of parameters to be estimated.
The daily likelihood of observing $B$ buy trades and $S$ sell trades on one specific stock is

$$
\begin{aligned}
L[(B,S)\mid\Theta] ={}& (1-\alpha)\,\exp(-2\varepsilon)\,\frac{\varepsilon^{B}}{B!}\,\frac{\varepsilon^{S}}{S!} \\
&+ \alpha\theta\delta\,\exp(-\eta-2\varepsilon)\,\frac{(\eta+\varepsilon)^{B}}{B!}\,\frac{\varepsilon^{S}}{S!} \\
&+ \alpha\theta(1-\delta)\,\exp(-\eta-2\varepsilon)\,\frac{\varepsilon^{B}}{B!}\,\frac{(\eta+\varepsilon)^{S}}{S!} \\
&+ \alpha(1-\theta)\delta\,\exp(-\mu-2\varepsilon)\,\frac{\varepsilon^{B}}{B!}\,\frac{(\mu+\varepsilon)^{S}}{S!} \\
&+ \alpha(1-\theta)(1-\delta)\,\exp(-\mu-2\varepsilon)\,\frac{(\mu+\varepsilon)^{B}}{B!}\,\frac{\varepsilon^{S}}{S!}.
\end{aligned}
\tag{7}
$$
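Purging some constants unrelated to parameters in $\Theta$, one can write the daily log-likelihood
function in the following computation-friendly form,

$$
\begin{aligned}
\mathcal{L}(\Theta) ={}& -2\varepsilon + (B+S)\ln(\varepsilon) \\
&+ \ln\Bigl\{(1-\alpha) + \alpha\theta\delta\,\exp[-\eta - B\ln(h)] + \alpha\theta(1-\delta)\,\exp[-\eta - S\ln(h)] \\
&\qquad + \alpha(1-\theta)\delta\,\exp[-\mu - S\ln(k)] + \alpha(1-\theta)(1-\delta)\,\exp[-\mu - B\ln(k)]\Bigr\},
\end{aligned}
$$

where $h \equiv \dfrac{\varepsilon}{\eta+\varepsilon}$ and $k \equiv \dfrac{\varepsilon}{\mu+\varepsilon}$.
The ratios $h$ and $k$ of arrival rates are bounded between 0 and 1. Consequently,
$\ln(h) \le 0$ and $\ln(k) \le 0$
help to moderate the magnitude of the inputs for the exponential functions.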
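
The five-regime log-likelihood can be evaluated with the same overflow-safe device as in the two-parameter-rate case. The sketch below is a hedged illustration: `theta` stands for the conditional probability that a news event is a liquidity shock and `eta` for the arrival rate of the pseudo market makers, and these labels, like the function name `loglik_extended`, are assumptions of the sketch rather than notation confirmed by the source.

```python
import math


def loglik_extended(B, S, alpha, theta, delta, mu, eta, eps):
    """Daily log-likelihood of the extended model, constants in B! and S! dropped."""
    ln_h = math.log(eps) - math.log(eta + eps)
    ln_k = math.log(eps) - math.log(mu + eps)
    regimes = (
        (1 - alpha, 0.0),                                        # no news
        (alpha * theta * delta, -eta - B * ln_h),                # liquidity shock, bad news: contrarian buys
        (alpha * theta * (1 - delta), -eta - S * ln_h),          # liquidity shock, good news: contrarian sells
        (alpha * (1 - theta) * delta, -mu - S * ln_k),           # fundamental bad news: informed sells
        (alpha * (1 - theta) * (1 - delta), -mu - B * ln_k),     # fundamental good news: informed buys
    )
    logs = [math.log(w) + e for w, e in regimes if w > 0]
    top = max(logs)  # factor out the largest exponent so exp() never overflows
    rest = sum(math.exp(x - top) for x in logs)
    return -2 * eps + (B + S) * math.log(eps) + top + math.log(rest)
```

Setting `theta` to zero removes the two liquidity-shock regimes, and the function then reproduces the log-likelihood of the original framework, which is a convenient sanity check when coding the extension.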
3.2 Modeling Choices of Liquidity Shocks
Clearly, there are different ways to model the liquidity shock in comparison to the fundamental
news and the mere observation of buy and sell order flows is no longer sufficient for this
purpose. The convention in the literature is to estimate PIN from daily order flows that span
over a quarter or longer. The recent flash crash illustrates that a sharp and swift reversal of
stock prices can take place in just half an hour. The order flows observed at the daily level
would almost certainly miss much of the underlying dynamics. Yet shrinking the trading
interval into a shorter time span runs into the risk of mismatching with the actual data most
of the time because liquidity shocks are not always present in each trading day. Therefore, it
appears a reasonable compromise to allow for exogenously determined liquidity shocks while
respecting the convention in the literature to retain daily order flows.
In this paper, I equate the probability of liquidity shocks to the empirical frequency
for the occurrence of a sizeable intraday reversal of stock prices across the entire estimation
period. The rationale behind linking the price reversal with the liquidity shock is fairly
intuitive. Stock prices are expected to continue in the same direction as they gradually
incorporate information revealed from informed orders following the release of fundamental
news. In contrast, the large price reversal over a short time interval is the hallmark of
a liquidity shock rather than fundamental news. It is also appealing to employ the price
reversal statistics along with the order flow data to jointly determine the input parameters
for the PIN measure through a maximum likelihood procedure. There is valuable information
to glean from the price dynamics after all, yet many models delivering a constant PIN almost
always exclusively focus on the order flows alone.3
3 One exception is Easley, Kiefer and O’Hara (1997), who introduce the role of trade size to the PIN
framework.
Instead of specifying a constant $\theta$ based on the observed frequency of intraday price
reversals, one may choose to directly introduce the price series into the model and make both
the probability of liquidity shock and the PIN time-varying. It is also possible to link the
time-varying probability of liquidity shock to certain well-known liquidity measures. One has
to carefully balance, though, the benefits of having a model with rich dynamics against the
costs of imposing a more elaborate and thus complex structure on the data. I choose to allow
for a constant $\theta$ in the PIN extension here because of its inherent parsimony.
3.3 PIN Decomposition
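The conventional PIN measure in the extended framework can be re-defined as
$$
PIN = \frac{\alpha(1-\theta)\mu + \alpha\theta\eta}{\alpha(1-\theta)\mu + \alpha\theta\eta + 2\varepsilon} \tag{8}
$$
to reflect the fact that both the informed investors and the pseudo market makers contribute
to order imbalances. The component of PIN measure that is purely related to information
asymmetry can be isolated as
$$
PIN_{inasy} = \frac{\alpha(1-\theta)\mu}{\alpha(1-\theta)\mu + \alpha\theta\eta + 2\varepsilon}, \tag{9}
$$
after carving out the component of PIN measure related to illiquidity
$$
PIN_{illiq} = \frac{\alpha\theta\eta}{\alpha(1-\theta)\mu + \alpha\theta\eta + 2\varepsilon}. \tag{10}
$$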
Note that the arrival rate $\eta$ of the pseudo market makers is endogenously determined by the
extended PIN framework and thus can have great influence over the PIN decomposition, even
though the constant probability $\theta$ is pinned down from the price reversal statistics that are
outside the PIN framework.
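The decomposition in equations (8) to (10) can be sketched in a few lines. The function and argument names are illustrative labels rather than the paper's notation: `alpha` is the probability of news, `theta` the probability of a liquidity shock, and `mu`, `eta`, `eps` the arrival rates of informed traders, pseudo market makers and uninformed traders.

```python
def pin_components(alpha, theta, mu, eta, eps):
    """Split the conventional PIN into its two components.

    Returns a dict with the conventional PIN, the information asymmetry
    component and the illiquidity component; the two components sum to PIN.
    """
    informed = alpha * (1.0 - theta) * mu   # expected informed order flow
    pseudo_mm = alpha * theta * eta         # expected pseudo-market-maker flow
    total = informed + pseudo_mm + 2.0 * eps
    return {
        "pin": (informed + pseudo_mm) / total,
        "pin_inasy": informed / total,
        "pin_illiq": pseudo_mm / total,
    }
```

Setting `theta=0` reduces the output to the original PIN with a zero illiquidity component, and `theta=1` leaves only the illiquidity component.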
It is clear now that the conventional PIN measure actually consists of both an illiquidity
component and an information asymmetry component. In the special case with $\theta = 0$, the
PIN measure fully represents the extent of asymmetric information. In the special case with
$\theta = 1$, the PIN measure fully represents the extent of illiquidity. One of these two roles can
dominate the other from time to time. Keeping in mind the coexisting roles of illiquidity
and information asymmetry helps one to reconcile the potential confusion over the varying
interpretations of the PIN measure in the literature (see the second footnote).
From the perspective of the dual roles that the PIN measure can take, it is possible to
address the PIN anomaly documented in Aktas, de Bodt, Declerck, and Van Oppens (2007).
Instead of finding higher PIN estimates in periods with information leakage prior to the
announcements of mergers and acquisitions, these authors find lower PIN estimates and thus
label the finding an “anomaly”. It is not inconceivable that the lower PIN estimates could
actually reflect the improved liquidity due to the heightened trading activities along with
information leakage. I examine this empirical possibility in a separate paper.
3.4 Literature Review
In a closely related paper, Duarte and Young (2009) decompose the PIN measure into two
components and attribute the pricing power of the PIN factor in the asset pricing context to
the illiquidity component rather than the information asymmetry component. This finding
stands in important contrast to the finding of information risk being priced in Easley, Hvidkjaer,
and O’Hara (2002). Despite the similarity in the decomposition of the PIN measure,
this paper is distinctly different from Duarte and Young (2009) in several aspects. As discussed
earlier, the PIN measure is essentially equivalent to absolute order imbalance under
the original framework in Easley, Kiefer, O’Hara and Paperman (1996). Duarte and Young
(2009) inherit the same critical premise as Easley, Kiefer, O’Hara and Paperman (1996) that
the informed investors are the exclusive source of order imbalances. Since this assumption
does not have to hold in reality, Duarte and Young (2009) carefully discuss the potential
problem with this assumption. They acknowledge the possibility that the order imbalances
could also result from liquidity shocks rather than informed trades, leading to potentially
problematic inferences. My paper directly tackles this “important caveat” in Duarte and
Young (2009) and explicitly allows both the informed investors and the pseudo market makers
to create order imbalances. This paper’s decomposition of the PIN measure clearly reflects
the importance of liquidity shocks. In light of this important distinction, it is worthwhile to
examine whether or not the conclusion in Duarte and Young (2009) regarding the pricing
power of the two components of the PIN factor is robust to introducing liquidity shocks to
the PIN framework. I carry out this empirical exercise in another paper.
Moreover, the motivation behind the PIN extension in Duarte and Young (2009) is quite
different. In this paper, I introduce the liquidity shocks, upon which the pseudo market
makers arrive to move prices back to the fundamental level, in order to break the exclusivity
of informed traders in creating order imbalances. In contrast, Duarte and Young (2009)
reasonably argue that the one-sided nature of informed orders necessarily implies a negative
correlation between buy orders and sell orders even though the observed daily correlation is
positive. The PIN extension in Duarte and Young (2009) is motivated by eliminating the
mismatch with the observed order flows in terms of correlation, and they accomplish the
goal by introducing symmetric positive shocks to both buy and sell orders. In a way, their
motivation and approach are quite similar to the PIN extension in Weston (2001) who also
worries about trading volume on information days being abnormally large on both buy and
sell sides. Weston (2001) argues that the positive correlation between buy and sell orders
is driven by noise trading, which is characterized as a third group of traders that submit
both buy and sell orders simultaneously. While Weston (2001) allows the symmetric order
flow hikes to take place only on a day with news arrival, Duarte and Young (2009) introduce
the symmetric order flow hikes regardless of whether the fundamental news arrives. Since
informed traders and pseudo market makers in my extension would submit only one-sided
orders depending upon the nature of news events, one limitation of my extension is that it
does not imply a positive correlation between buy and sell order flows. This is an empirical
limitation in the sense that the modelled trading interval does not have to span exactly one
day and the observed positive correlation between buy and sell orders does not necessarily
extend to sampling intervals other than one day. One way to address the limitation is
to have a finer grid of trading intervals so as to allow intraday interactions between different
news events and thus higher buy orders and sell orders on the same day. Alternatively, one
can follow the lead of Weston (2001) and Duarte and Young (2009) and further complement
the orders from the pseudo market makers with symmetric order flow hikes to ensure daily
order flows that are positively correlated.
Easley, Lopez and O’Hara (2010) also study the PIN around the flash crash but rely on
an approximation rather than the maximum likelihood estimation that is typically used in
the literature. In light of the finding that the original PIN measure is essentially equivalent
to absolute order imbalance as discussed earlier, Kaul, Lei and Stoffman (2005) advocate
using the absolute percentage of order imbalance (AIM) in place of the PIN measure that is
much harder to estimate than AIM. Easley, Lopez and O’Hara (2010) advance this proposal by
detailing a procedure to measure the absolute order imbalance in lieu of PIN and applying the
revised measure to a number of different security products beyond stocks. As a result, these
two papers step outside the typical PIN framework and do not conduct maximum likelihood
estimations for the proposed PIN alternative. It is noteworthy that Easley, Lopez and O’Hara
(2010) update the order imbalance more frequently among heavily traded stocks than thinly
traded stocks and thus partly address the issue that Kaul, Lei and Stoffman (2005) raise
regarding the practice in the literature of applying a uniform frequency to measure order
flows for all stocks. Unfortunately, however, the absolute percentage order imbalance can
be a proxy for both illiquidity and information asymmetry much in the same way as the
original PIN measure does. Both Kaul, Lei and Stoffman (2005) and Easley, Lopez and
O’Hara (2010) suffer from the lack of distinction between these two roles precisely because
of the flawed assumption that the informed traders are the sole source of order imbalance.
Moreover, Easley, Engle, O’Hara and Wu (2008) illustrate that using the absolute percentage
order imbalance as an approximation for PIN may actually miss the dynamics over short-lived
corporate events such as earnings announcements that a daily PIN series would have
captured. So it is not straightforward to conclude that the documented properties of the
alternative measure in Easley, Lopez and O’Hara (2010) necessarily reflect those of PIN
around the flash crash.
In sum, this paper’s extension to the PIN framework marks an important departure from
the extant literature and contributes a measure of information asymmetry that is conceptually
purer than those previously available.
4 Empirical Analysis
4.1 Construction of Sample
My primary data source is the detailed stock transactions from the New York Stock Exchange
(NYSE) Trade and Quote (TAQ) database between February 5 and May 6 of 2010. This
study focuses on stocks listed on NYSE and the American Stock Exchange (AMEX). Because
the auto-quotes are not filtered in TAQ, I follow Chordia, Roll and Subrahmanyam (2001) in
using only the primary market (NYSE) quotes, and retain quotes within the regular trading
block after purging those quotes with non-positive bid or ask prices, negative bid or ask sizes,
missing time stamps, or bid prices higher than ask prices. I also remove trades that are out of
sequence, recorded before the open or after the close time, have special settlement conditions,
or have missing trade size or time stamp. As is the standard practice in the literature, the
algorithm in Lee and Ready (1991) is utilized to determine the buyer-initiated or seller-initiated
nature of each trade.4 Basically, all trades with a price higher (or lower) than the
midpoint of the bid and ask prices are classified as buyer-initiated (or seller-initiated). Trades
with a price identical to the midpoint of the prevailing quote are subject to a tick test so
that a trade is classified as buyer-initiated (or seller-initiated) if the price is higher (or lower)
than the preceding trade. I follow the advice of Chordia, Roll and Subrahmanyam (2005)
who recommend revoking the five-second delay rule in Lee and Ready (1991) for matching
trades with quotes starting in 1999.
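The classification rule just described (quote test first, tick test for trades at the midpoint) can be sketched as follows. The function name and the return convention are illustrative. The sketch only covers the simple rule stated in the text: a midpoint trade at the same price as the preceding trade is left unclassified, whereas fuller implementations of the Lee and Ready (1991) algorithm typically look further back.

```python
from typing import Optional

def classify_trade(price: float, bid: float, ask: float,
                   prev_price: Optional[float]) -> int:
    """Return +1 for buyer-initiated, -1 for seller-initiated, 0 if unclassified.

    `bid` and `ask` are the prevailing quote matched to the trade with no delay.
    Real data would be better compared in integer ticks than in floats.
    """
    mid = (bid + ask) / 2.0
    # Quote test: compare the trade price with the quote midpoint.
    if price > mid:
        return 1
    if price < mid:
        return -1
    # Tick test for trades exactly at the midpoint.
    if prev_price is None:
        return 0
    if price > prev_price:
        return 1
    if price < prev_price:
        return -1
    return 0
```

Summing the `+1` and `-1` labels separately over a day gives the daily buy and sell counts that enter the likelihood.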
For each stock the PIN measure is estimated separately for the 62-day period ending on
May 5, 2010, and the 63-day period ending on May 6, 2010. With a minimum requirement of
order flows for 30 trading days, the maximum likelihood estimation is carried out using the
NLMIXED procedure in SAS. The dynamic factorization of the daily log-likelihood function
is remarkably successful. After an extensive grid search over different regions of parameter
values to ensure a global maximum, the optimization exercise finishes successfully for all
stocks in each estimation period. To facilitate imputing the daily PIN series on the date
of the flash crash, stocks with zero trades on May 6, 2010, are removed from the sample. As
discussed earlier in Section 2, the imputation of the daily PIN involves a set of boundary
conditions on the resulting pair of quarterly PIN estimates. Only stock quarters that survive
this additional requirement remain in the final sample.
4 Note that Boehmer, Grammig and Theissen (2007) study the bias on PIN estimates introduced by the
sometimes erroneous classification of the trade initiation and provide a method to correct this bias.
The master file of the TAQ database provides the CUSIP underlying each stock ticker
symbol and I rely on the Center for Research in Security Prices (CRSP) database to extract
the stock characteristics (such as primary exchange, share code and market equity) after
merging the two datasets on CUSIP. There are 1,765 stocks on the NYSE/AMEX with
qualified pairs of quarterly PIN estimates thus far. To check the sensitivity of results to the exact
grouping of stocks, I employ a set of filters to further refine the sample. After removing the
American Depositary Receipts (ADRs), the sample size becomes 1,600. Focusing on common
stocks with CRSP share code of either 10 or 11 further reduces the sample size to 998 stocks.
To guard against the potential confounding effects from the earnings announcements adjacent
to the flash crash event, I also remove stocks that have their earnings announced between
May 5 and May 7, 2010, inclusive on both ends. The announcement dates are extracted from
the actual earnings file for the U.S. firms in the I/B/E/S database. The sample size comes
down to 847.
4.2 Estimation of Original PIN Measure
The simple algorithm of dynamic factorization for the daily log-likelihood function outlined in
the Appendix is quite successful, achieving a 100% convergence rate in my sample while avoiding
corner solutions and local maxima. In contrast, the common factorization in equation
(3) fares much worse and has a success rate of 45.12% in the same sample. The staggering
failure rate from the common factorization in equation (3) illustrates the dire situation of the
PIN estimation for the trading data in recent years. With algorithmic trading increasingly
popular, many orders are split into smaller pieces, often resulting in tens of thousands of
trades for one stock on one typical day. The sharp increase in the observed order flows makes
it more likely to trigger a numerical overflow or underflow. Hence it is critical to have an
effective factorization scheme that is flexible enough to adapt to various patterns of daily
order flows in eradicating the overflow and underflow problem.
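A small illustration of why large trade counts break a direct evaluation of the Poisson terms may be useful. The two helper functions below are hypothetical and only contrast a naive evaluation with a log-space one. They do not reproduce either factorization discussed in the paper.

```python
import math

def poisson_pmf_naive(rate: float, count: int) -> float:
    """Direct evaluation of exp(-rate) * rate**count / count!."""
    return math.exp(-rate) * rate ** count / math.factorial(count)

def poisson_logpmf(rate: float, count: int) -> float:
    """Same quantity in logs; lgamma(count + 1) equals ln(count!)."""
    return count * math.log(rate) - rate - math.lgamma(count + 1)
```

For an arrival rate and a trade count of 20,000 each, `math.exp(-rate)` underflows to zero and `rate ** count` raises `OverflowError` in double precision, while the log-space version stays finite. Any workable factorization has to keep the exponents in a safe range in a similar spirit.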
To show that extreme cases of order imbalances significantly contribute to the estimation
complexity, I run a logit regression to explain the success of maximum likelihood estimations
for the original PIN framework with the common factorization of daily log-likelihood in
equation (3). The cross-sectional regression results are reported in Table 1. When the
total number of trades averaged across all trading days is the sole predictor, it is inversely
related to the estimation success. In other words, the PIN estimation is more difficult among
heavily traded stocks. The maximum absolute order imbalance also adds to the difficulty of
maximum likelihood estimation in that extreme imbalances often trigger numerical overflow
and underflow problems. Note that the extreme absolute order imbalance delivers a better
fit than the total trades as a sole predictor for the estimation success, and there is little
incremental explanatory power from the total trades after controlling for the extreme order
imbalance. The percentage absolute order imbalance averaged across all trading days beats
the aforementioned two predictors, however, by delivering a pseudo-$R^2$ of 0.42 as a sole
predictor. The positive coefficient on the percentage absolute order imbalance suggests that
the original PIN framework thrives in cases with extremely imbalanced orders on average,
which in turn strongly reflect the presence of informed orders. Putting these predictive
variables together to explain the estimation success retains their respective signs with the
exception of the total orders. Further augmenting the logit regression with the logarithmic
market equity does not materially change the inferences, and as expected the estimations for
large cap stocks are more difficult. All the estimated coefficients in the top panel of Table 1,
including the intercepts, are statistically significant at the 1% level.
In the bottom panel of Table 1, I repeat the same set of six regression designs while
replacing the independent variables by the cross-sectional percentile rank when possible.
The percentile ranks help us to gauge the result sensitivity to potential outliers since the
regressions in the top panel would place too much weight on observations with extreme
values. The qualitative pattern of results remains largely unchanged with a few exceptions.
The goodness of fit has improved after the transformation of independent variables. The
firm size is no longer statistically significant and the intercepts in two designs are also less
statistically significant than before. Moreover, the total number of trades has one extra
change of sign in the bottom panel compared to the top panel.
Overall order imbalances contribute to the estimation complexity in an interesting way.
While a high level of extreme imbalances implies a lower estimation success, stocks with a
higher percentage of order imbalances are actually easier to estimate. The former finding
speaks directly to the numerical overflow and underflow problems of the estimation and the
latter points to the strategy of the original PIN framework in identifying order imbalances
as informed trades.
4.3 Inferences based on Daily PIN Series
Table 2 reports the cross-sectional mean probability of informed trading related to the flash
crash on May 6, 2010, based on the estimations of the original PIN model for a number
of sub-samples. The reported PIN measures include the quarterly PIN excluding the flash
crash event, the imputed daily PIN on the day of the flash crash as well as the incremental
PIN on the day of the flash crash. Relative to the PIN estimated for the preceding quarter,
there appears to be a market-wide hike of about 0.12 (or a doubling effect) in the imputed
daily PIN on the flash crash event regardless of whether we exclude American Depositary
Receipts, focus on the common stocks only, or exclude stocks with earnings announced on
days immediately adjacent to the flash crash. The incremental PIN on May 6, 2010, is reliably
positive, as are the quarterly and daily PIN measures.
To better understand the cross-sectional differences, I further classify the 847 common
stocks in the final sample into ten volume deciles based on the daily average total number of
trades over the quarter ended on May 5, 2010. The pattern of PIN estimates in the quarter
leading up to the flash crash appears similar to that reported in Easley, Kiefer, O’Hara and
Paperman (1996). That is, thinly traded stocks have higher estimated PINs than heavily
traded stocks. The PIN estimates are monotonically declining as the volume decile gets
higher. The average PIN of 0.221 for the stocks in the lowest volume decile nearly triples
that for the stocks in the highest volume decile at 0.080.
The pattern of imputed PINs on the day of the flash crash is remarkably different. While
the stocks in the lowest volume decile continue to have the highest average daily PIN at
0.318, the average daily PIN for the remaining nine volume deciles ranges from 0.217 to 0.258
without any discernible pattern among them. The heavily traded stocks in the 9th and 10th
volume deciles share the same average imputed PIN of 0.238 on the day of the flash crash,
which almost triples their respective PIN level in the preceding quarter. The stark contrast
of the estimated PINs around the flash crash, coupled with the seeming lack of distinction
between stocks with high volume and those with modest volume on the event day, suggests
the uniqueness and the usefulness of the flash crash event in revealing the true identity of
the PIN. The pattern of daily imputed PIN series points out one key weakness of the original
PIN as a pure measure of information asymmetry. For someone holding such a pure view, it
is very worrisome that the level of asymmetric information is at least 0.217 for stocks in every
volume decile, even among the most heavily traded stocks. It is also difficult to make the
case that all stocks other than the most thinly traded ones exhibit the same extent of
asymmetric information on the day of the flash crash. In contrast, it is far easier for
someone viewing the PIN as “a simple
measure of illiquidity” to associate the flash crash event with a market-wide liquidity shock
that affects almost all stocks to the same degree on average.
There are at least two ways to present the contrast between the daily imputed PIN on
the day of the flash crash and the quarterly PIN just prior to that date. Table 2 reports
both the cross-sectional mean incremental PIN and the ratio of average daily PIN to average
quarterly PIN. The most thinly traded stocks experience the least increase in PIN on the day
of the flash crash while the most heavily traded stocks experience the largest hike. Based on
the ratio of means, the most thinly traded stocks register a 44% hike in PIN and the most
heavily traded stocks 199%. The degree of PIN hike is gradually increasing as the volume
decile climbs higher, but not in a strictly monotonic fashion. The finding of a stronger PIN
hike on the day of the flash crash among those most frequently traded stocks is another piece
of evidence corroborating the notion that the conventional PIN measure may actually better
proxy for illiquidity than information asymmetry on the day of the flash crash. After all, the
staff report by CFTC-SEC (2010) traces the flash crash to a large and aggressive trade in
the S&P 500 index futures market, and the highest two volume deciles indeed include many
stocks in the S&P 500 index.
In light of the findings above, it appears reasonable to conclude that the empirical evidence
surrounding the flash crash leans in favor of the illiquidity interpretation rather than the
information asymmetry interpretation for the conventional PIN measure. After all, it is very
difficult to exclusively attribute the market-wide hike in the PIN on the day of the flash crash
to asymmetric information as the original PIN model would. The extended PIN framework
demonstrates that the conventional PIN measure consists of both an illiquidity component
and an information asymmetry component. It is interesting to see how well the extended
PIN model addresses the situation.
4.4 Estimation of Extended PIN Measures
As discussed in Section 3, identifying possible liquidity shocks requires more than the
observation of daily order flows. Consequently, the constant probability $\theta$ of liquidity shocks is
determined outside the PIN structure and becomes a crucial input for the extended PIN
model. In this paper, I equate the constant probability $\theta$ of liquidity shock to the empirical
frequency for the occurrence of a sizeable intraday reversal of stock prices within a given
stock quarter. Here is the detailed procedure to identify sizeable intraday reversals. First,
one can cut each regular trading day into thirteen half-hour slots from 9:30am EST to 4:00pm
EST and find the minimum and maximum prices within each time slot. Second, the timing
information of these minimum and maximum prices along with the opening and closing prices
helps us create an intraday return series and determine the intraday maximum and minimum
returns. If the intraday maximum return is positive, the intraday minimum return is negative,
and both exceed a pre-specified threshold level in absolute value, then
this trading day qualifies as a day with sizeable intraday price reversals. Finally, one can
tally the number of trading days with sizeable price reversals and compute the fraction of
such days within all trading days over the entire estimation period. The resulting fraction is
the constant probability $\theta$ of liquidity shocks that is used to estimate the rest of parameters
in the maximum likelihood estimation and construct the two components of PIN.
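The procedure above can be sketched as follows, under one loudly stated assumption: the text does not spell out how the intraday return series is formed, so this sketch orders the opening price, each slot's low and high (by their timing within the slot) and the closing price chronologically, and takes simple returns between consecutive points. All names and the input layout are hypothetical.

```python
from typing import List, Sequence, Tuple

# One half-hour slot: (low price, high price, True if the low came before the high).
Slot = Tuple[float, float, bool]
# One trading day: (opening price, the day's slots, closing price).
Day = Tuple[float, Sequence[Slot], float]

def intraday_returns(open_px: float, slots: Sequence[Slot], close_px: float) -> List[float]:
    """Simple returns along the chronological path of slot extremes (an assumption)."""
    path = [open_px]
    for low, high, low_first in slots:
        path.extend((low, high) if low_first else (high, low))
    path.append(close_px)
    return [later / earlier - 1.0 for earlier, later in zip(path, path[1:])]

def is_reversal_day(day: Day, threshold: float) -> bool:
    """True when the largest up-move and the largest down-move both exceed the threshold."""
    returns = intraday_returns(*day)
    return max(returns) > threshold and min(returns) < -threshold

def shock_probability(days: Sequence[Day], threshold: float) -> float:
    """Fraction of trading days flagged as sizeable intraday reversals."""
    flagged = sum(is_reversal_day(day, threshold) for day in days)
    return flagged / len(days)
```

Each flagged day counts equally regardless of how far the reversal exceeds the threshold, which matches the equal weighting described in the text.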
The pre-specified return threshold can be either stock-specific or uniform across all stocks.
For the former, I use the sample standard deviation of daily stock returns based on the consecutive
daily closing prices over the entire estimation period. The intuition behind this
benchmark is that intraday stock price reversals exceeding one standard deviation of daily
returns in each direction constitute a sizeable swing within the day. In a robustness check, I
also try to set a uniform cutoff of 2% across all stocks to identify sizeable intraday reversals.
The cross-sectional average $\theta$ is 0.0805 based on the stock-specific cutoffs and 0.1191 based
on the uniform cutoff of 2% during the quarter ended on the flash crash. When the date
of the flash crash is excluded, the cross-sectional average $\theta$'s are 0.0801 and 0.1081, respectively.
Given the equal weight assigned for all trading days associated with liquidity shocks
irrespective of the magnitude of the price reversal beyond the threshold, the inclusion of the
flash crash event only slightly boosts the empirical frequency of liquidity shocks.
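The stock-specific cutoff can be sketched in a couple of lines. The function name is illustrative, and the use of simple (rather than logarithmic) close-to-close returns is an assumption, since the text does not say which is used.

```python
import statistics
from typing import Sequence

def stock_specific_threshold(closes: Sequence[float]) -> float:
    """Sample standard deviation of close-to-close simple returns."""
    returns = [later / earlier - 1.0 for earlier, later in zip(closes, closes[1:])]
    return statistics.stdev(returns)  # sample (n - 1) standard deviation

UNIFORM_THRESHOLD = 0.02  # the uniform 2% cutoff used in the robustness check
```

Either value can then serve as the return threshold when tallying days with sizeable intraday reversals.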
The extended PIN model is repeatedly estimated for the final sample of common stocks
listed on the NYSE/AMEX, excluding those with earnings announced on days immediately
adjacent to the flash crash. The maximum likelihood estimations for each stock produce
two pairs of PIN components, one for the quarter excluding the flash crash and the other
including the flash crash. As before, each PIN component can be imputed for the day of the
flash crash based on the set of quarterly PIN components with and without the flash crash.
Depending upon the cutoff used to identify liquidity shocks, it is possible that none of the
trading days in the estimation period qualifies as a day with liquidity shocks, resulting in
a zero probability of liquidity shock. For instance, 8.84% of stock quarters correspond to a
zero probability of liquidity shock when the stock-specific cutoff is used to identify liquidity
shocks. In such cases, the extended PIN model degenerates to the original PIN model and
no further estimation is needed.
4.5 PIN Decomposition
Under the extended PIN model, the conventional PIN measure can be decomposed into an
information asymmetry component and an illiquidity component. Table 3 presents the decomposition
for common stocks across ten volume deciles around the flash crash. In the quarter
ended one day before the flash crash, the information asymmetry component $PIN_{inasy}$ is
unsurprisingly large (at the level of around 0.20) among the lowest three volume deciles,
gradually declines in trading volume but not in a strictly monotonic fashion, and reaches the
lowest value 0.066 for the highest volume decile. The estimated $PIN_{inasy}$ for the low volume
deciles is two to three times larger than for the highest volume decile. The quantitative
pattern here appears comparable to the conventional PIN measure in Table 2. In the same
quarter, the illiquidity component $PIN_{illiq}$ for the lowest volume decile is about twice as
large as that for each of the remaining nine volume deciles (0.026 versus about 0.013).
The quarterly decomposition prior to the flash crash suggests that the information asymmetry
component strictly dominates the illiquidity component by a factor of 4.7 to 22.6. Even
at the lowest volume decile where the illiquidity component is twice as large as the rest of
volume deciles, the information asymmetry component is more than seven times as large as
the illiquidity component.
While the quarterly PIN decomposition is highlighted by the strict dominance of the information
asymmetry component over the illiquidity component, the imputed PIN components
on the day of the flash crash are characterized by the disappearance of this strong dominance
and the lack of any distinctive pattern across volume deciles. The daily $PIN_{inasy}$ for stocks
in each of the lowest three volume deciles exceeds 0.200, followed by the fourth and the ninth
volume decile at 0.192 and 0.191, respectively. One might have expected the lowest volume
decile to continue having the highest $PIN_{illiq}$ on the day of the flash crash as it does in the
quarter prior to the flash crash. This is actually not the case as the fifth volume decile has the
highest $PIN_{illiq}$. As far as the magnitude is concerned, the illiquidity component beats the
information asymmetry component in the fifth and sixth volume deciles, and is only slightly
behind in the other eight volume deciles.
The quarterly and daily PIN components reported in Table 3 are all reliably positive,
statistically significant at any conventional level. The daily incremental PIN relative to the
quarterly PIN in terms of the information asymmetry component shows a modest increase
among thinly traded stocks but registers a fairly large hike among heavily traded stocks,
ranging from 0.041 for the lowest volume decile to 0.106 for the highest volume decile. Also
note that the incremental $PIN_{inasy}$ is not statistically different from zero at the 1% level
for the lowest three volume deciles and the fifth decile. When expressed in terms of the
ratio of average daily $PIN_{inasy}$ to average quarterly $PIN_{inasy}$, the PIN hike varies from
21% to 124%. Overall the hike in asymmetric information on the day of the flash crash is
considerably weaker both economically and statistically under the extended model than under
the original PIN framework, which fails to distinguish the information asymmetry component
from the illiquidity component.
In a striking contrast, the daily PIN in terms of the illiquidity component is drastically
higher than its quarterly counterpart. The daily incremental $PIN_{illiq}$ is invariably positive
and reliably so for all ten volume deciles. The boost in $PIN_{illiq}$ on the day of the flash crash
amounts to a more than five-fold increase for the most thinly traded stocks and a nearly
14-fold increase for the fourth volume decile.
The aforementioned results of PIN decomposition are not confined to using the stock-specific
cutoffs to identify liquidity shocks. In a robustness check, I repeat the exercise after
defining liquidity shocks as intraday price reversals exceeding 2% for all stocks. The results
under the uniform 2% cutoff are reported in Table 4, which closely replicates all the qualitative
patterns in Table 3 under the stock-specific cutoffs.
There are a number of lessons we can draw from the PIN decomposition around the flash
crash. First and foremost, it is critically important to introduce liquidity shocks to extend
the original PIN framework. Otherwise, the original PIN measure can be misleadingly high in
cases where the illiquidity component deserves the credit. Second, even though the illiquidity
component of PIN is negligibly small in the quarter leading up to the flash crash, it accounts
for nearly as large a fraction as the information asymmetry component on the day of the flash
crash. Since the asset pricing tests of the information risk as measured by a PIN factor have
been done at the annual interval using the original PIN model, it may well be worthwhile to
revisit the test using the extended PIN model at the monthly interval that is typically used
by asset pricing studies. To the extent that the illiquidity component of PIN is declining in
the length of the sampling interval, the factor based on the illiquidity component of PIN may
be even stronger than previously reported in Duarte and Young (2009). Third, the roughly
similar magnitude of $PIN_{illiq}$ across volume deciles points to the commonality of liquidity
shocks across all stocks at the time of crisis.5 This is evidence further corroborating the staff
report by CFTC-SEC (2010) that documents the flash crash as twin liquidity crises on the
S&P 500 index futures market and the equity market.
4.6 Forecasting the Opening Bid-Ask Spread
To examine the role of PIN components in predicting future spreads, I run the regression

$$\ln(ospread_{i,t}) = a_0 + a_1 \ln(PIN_{inasy,i,x}) + a_2 \ln(PIN_{illiq,i,x}) + a_3 \ln(volume_{i,x}) + m_{i,t}.$$

The dependent variable is the logarithmic opening bid-ask spread as a percentage of midpoint
price on the day of the flash crash (with time subscript $t$). The set of predictors are measured
in the quarter immediately preceding the flash crash (with time subscript $x$) and include
the logarithmic PIN components on information asymmetry and illiquidity as well as the
logarithmic share volume. The individual stocks are denoted by the subscript $i$ and the
residuals are denoted by $m_{i,t}$. The logarithmic transformation of variables is partly motivated
by theory and has appeared in previous studies such as Weston (2001) and Easley, Engle,
O'Hara and Wu (2008). The cross-sectional regression results are presented in Table 5.
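As an illustration of the specification above, the following is a minimal sketch of a cross-sectional log-log regression estimated by ordinary least squares through the normal equations. It is not the estimation code behind Table 5. The function names `ols` and `fit_spread_regression` are hypothetical, and any inputs passed to them are assumed to be strictly positive so that the logarithms exist.

```python
import math

def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y], one row per regressor.
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def fit_spread_regression(ospread, pin_inasy, pin_illiq, volume):
    """Return (a0, a1, a2, a3) from regressing ln(ospread) on a constant,
    ln(PIN_inasy), ln(PIN_illiq) and ln(volume) across stocks."""
    y = [math.log(s) for s in ospread]
    X = [[1.0, math.log(p), math.log(q), math.log(v)]
         for p, q, v in zip(pin_inasy, pin_illiq, volume)]
    return ols(X, y)
```

Because every variable enters in logarithms, each slope can be read as an elasticity of the opening spread with respect to the corresponding predictor.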
When liquidity shocks are identified using firm-specific cutoffs, the information asymmetry
component and the illiquidity component are each positive and statistically significant as a
standalone predictor. The illiquidity component has a much weaker forecasting power for the opening
spread than does the asymmetric information component, with adjusted $R^2$ of 0.012 and
0.195, respectively. Both PIN components in the preceding quarter are positive and highly
statistically significant when they join the share volume in forecasting the opening spread on
the day of the flash crash. Not surprisingly, the average daily share volume in the preceding
quarter is negatively associated with the opening spread and statistically significant.
All the qualitative patterns of the cross-sectional regression results above are preserved
when liquidity shocks are identified through intraday price reversals that uniformly exceed 2%
for all stocks. As far as the goodness of fit is concerned under the extended PIN estimations
with a uniform cutoff, the forecasting power of the information asymmetry component
weakens somewhat while that of the illiquidity component strengthens, with adjusted $R^2$ of
0.095 and 0.181, respectively.

^5 Chordia, Roll and Subrahmanyam (2000), Hasbrouck and Seppi (2001) and Korajczyk and Sadka (2008)
study the cross-sectional commonality of liquidity. It would be interesting to carry out a principal component
analysis on the illiquidity component of PIN over an extended period of time, even though the data
limitation around the flash crash event prevents such an exercise here.
It is clear that the two PIN components and the share volume are able to jointly explain a
large fraction of the cross-sectional variations in the opening spread. The adjusted $R^2$ is 0.292
with stock-specific cutoffs and 0.394 with a uniform 2% cutoff. The power of the information
asymmetry component in forecasting the opening spread, and the positive and highly significant
association between these two variables, are consistent with the findings in the literature (e.g.,
Easley, Kiefer, O'Hara and Paperman, 1996; Weston, 2001; Lei and Wu, 2005; Easley, Engle,
O'Hara and Wu, 2008). This is the first paper to my knowledge that formally introduces
firm-specific or market-wide liquidity shocks into the PIN framework so as to break the
exclusivity of the informed trades in creating order imbalances. The new finding that the
illiquidity component of PIN is an important predictor for future bid-ask spreads therefore validates the
PIN extension in this paper. It contributes to the literature by helping us better understand
the role of liquidity shocks and allowing practitioners to better anticipate the trading costs
and design trading strategies accordingly.
4.7 Explaining the Illiquidity Component of PIN
In a further analysis of the illiquidity component of PIN, I run the contemporaneous
cross-sectional regression for stocks in the final sample on the day of the flash crash

$$PIN_{illiq,i,t} = b_0 + b_1 \ln(twspread_{i,t}) + b_2 \ln(volume_{i,t}) + n_{i,t}.$$

The individual stocks are denoted by subscript $i$ and the residuals are denoted by $n_{i,t}$. Since
the opening spread provides only a snapshot, it may not adequately represent the full-day
dynamics of the spread on the day of the flash crash. So I construct the time-weighted
average spread (denoted by $twspread$) as a percentage of the midpoint price.^6 The illiquidity
component of PIN is expected to be positively correlated with the time-weighted average
spread. The contemporaneous share volume is also included in the regression.
Table 6 reports the regression results with liquidity shocks defined using either
stock-specific cutoffs or the uniform 2% cutoff. Regardless of the cutoff scheme, the time-weighted
average spread is positive and highly statistically significant in explaining the cross-sectional
variations of the illiquidity component of PIN. This relationship is quite remarkable in that
the illiquidity component of PIN is based on the rather coarse price reversal statistics and
primarily driven by the daily order flows that are abstracted from any price information. So
the positive relationship is quite revealing in light of the fact that the time-weighted spread
is purely price information.

^6 For the construction of the time-weighted average spread on the day of the flash crash, I retain the same
set of quotes that are used to determine the trade initiation as the Lee and Ready (1991) procedure requires,
and purge other quotes that do not correspond to any actual trades. The time span in seconds between the
retained quotes is then used as the weight for each bid-ask spread in percentage of the midpoint price to
compute the time-weighted average spread for each day.
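The time-weighted average spread can be illustrated with a minimal sketch. It assumes that each retained quote stays in force until the next retained quote arrives and that the last quote is weighted up to a user-supplied `end_time`. Both conventions, and the function name `time_weighted_spread`, are assumptions made for illustration rather than details stated in the paper.

```python
def time_weighted_spread(quotes, end_time):
    """Time-weighted average bid-ask spread in percent of the midpoint.

    quotes   : list of (time_in_seconds, bid, ask), sorted by time
    end_time : time in seconds at which the last quote stops counting
               (an assumed convention, e.g. the market close)
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for idx, (t, bid, ask) in enumerate(quotes):
        # Weight = seconds until the next retained quote (or end_time).
        t_next = quotes[idx + 1][0] if idx + 1 < len(quotes) else end_time
        weight = t_next - t
        midpoint = 0.5 * (bid + ask)
        pct_spread = 100.0 * (ask - bid) / midpoint
        weighted_sum += weight * pct_spread
        total_weight += weight
    if total_weight <= 0:
        raise ValueError("quotes must span a positive amount of time")
    return weighted_sum / total_weight
```

For example, a 2% spread in force for 30 seconds followed by a 1% spread in force for 10 seconds averages to (30 x 2 + 10 x 1) / 40 = 1.75%.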
As a single explanatory variable, the share volume is inversely related to the illiquidity
component. This relationship is weak from the statistical viewpoint, however, reinforcing
the conclusion from the visual inspection of Tables 3 and 4 that volume is not a very good
sorting device for the illiquidity component of PIN on a standalone basis. After controlling for
the time-weighted average spread, however, the logarithmic share volume registers a positive
coefficient that is highly statistically significant. The positive relationship with share volume
is distinctive here because it alludes to the fact that some of the S&P 500 index component
stocks are hit the hardest with extreme price reversals during the flash crash, and the most
heavily traded stocks experience the largest hike in the illiquidity component of PIN.
When liquidity shocks are identified with stock-specific cutoffs, the time-weighted spread
and the share volume explain a fairly small fraction of the cross-sectional variations in the
illiquidity component, with an adjusted $R^2$ of 0.024. When a uniform 2% cutoff is used instead,
the time-weighted spread and the share volume fit the data much better, with an
adjusted $R^2$ of 0.161. While further research is needed to glean additional insights from the
illiquidity component of PIN, the findings thus far in this paper illustrate the importance of
introducing liquidity shocks into the PIN framework.
5 Conclusion
The flash crash event on May 6, 2010, provides both the motivation and the testing field
for this paper. During this event, the sharp drop in stock prices and the swift reversal over
a thirty-minute interval amount to a serious
challenge to the original PIN framework in Easley, Kiefer, O'Hara and Paperman (1996). On
the day of the flash crash, there is a widespread large increase in PIN for various sub-samples
of stocks, and the PIN nearly tripled among the most heavily traded stocks. Such a pervasive
PIN hike cannot be solely attributed to an increase in asymmetric information, as the
original PIN model would imply, and the gradual incorporation of private information into stock prices
seems at odds with the sizeable and quick reversal of stock prices across the board.
By explicitly allowing for liquidity shocks, this paper extends the original PIN framework
to introduce a third trading motive in addition to the private information and the exogenous
liquidity needs. The pseudo market makers can submit contrarian orders during periods of
liquidity shocks and thus help to restore stock prices back to the fundamental level, resulting
in an observed price reversal. The coexistence of fundamental news and liquidity shocks in
the extended PIN model implies that the informed investors are no longer the sole source
of order imbalances and the pseudo market makers can also submit one-sided orders during
liquidity shocks. Consequently, the conventional PIN measure consists of both an information
asymmetry component and an illiquidity component.
The extended PIN model is then put to the test around the flash crash. The illiquidity
component of PIN accounts for a negligible fraction during the quarter leading up to the flash
crash but experiences a five- to fourteen-fold hike on the day of the flash crash, reaching a
level nearly on par with the information asymmetry component. Even though the information
asymmetry component also witnesses an increase on the day of the flash crash, it is not nearly
as drastic as that of the illiquidity component. Compared to the original PIN framework, the hike
in asymmetric information on the day of the flash crash is weakened substantially under the
extended model from both the statistical and the economic perspectives. Moreover, there is
evidence that both the information asymmetry component and the illiquidity component of
PIN can forecast the opening bid-ask spread. On the day of the flash crash, the illiquidity
component of PIN is positively and contemporaneously correlated with the time-weighted average
spread, further supporting the notion that the firm-specific or market-wide liquidity shock
affects the inference on the information-based trading.
These new findings contribute to the literature and deepen our understanding of the
role of information asymmetry. They certainly point to the importance of accounting for
liquidity shocks in the PIN framework and invite us to revisit a number of interesting issues.
For instance, would the PIN decomposition under the extended model imply a stronger PIN
factor or a weaker one in the asset pricing context? Can we actually resolve the documented
"PIN anomaly" in the context of mergers and acquisitions announcements? I study these
and other interesting questions in a series of companion studies.
In addition to the development and testing of an extension to the PIN framework, this
paper also provides a number of methodological improvements to the PIN estimation. In the
Appendix, I outline one simple procedure to dynamically factorize the daily log-likelihood
function for the maximum likelihood estimation and effectively eliminate the numerical
overflow and underflow problems that have long plagued academic researchers and practitioners
alike in the PIN context. Moreover, this paper also furnishes guidelines for imputing the
daily PIN series through repeated estimations of quarterly PINs. Researchers are expected
to benefit from these methodological improvements in a wide variety of settings, especially
in corporate event studies that would most appreciate the availability of a daily PIN
series without the cost of imposing a complex data structure.
6 Appendix. Dynamic Factorization of Log-likelihood
For ease of exposition, I illustrate the factorization process under the original PIN framework.
The daily log-likelihood function can be written as

$$L(\theta) = -2\varepsilon + (B + S)\ln(\varepsilon) + \ln\Big[\sum_i w_i \exp(x_i)\Big],$$

where the weights and the exponential inputs are given in the table below.

Weight $w_i$            Exponential Input $x_i$
$1 - \alpha$            $0$
$\alpha\delta$          $-\mu - S\ln(k)$
$\alpha(1 - \delta)$    $-\mu - B\ln(k)$
The computational complexity lies in the weighted sum of exponential functions, each of which
has the potential of triggering an overflow or underflow. One can dynamically factorize the
log-likelihood function on a daily basis using a three-step procedure.
First, find the maximum input $x_{max}$ and pull $x_{max}$ into the common factor. Equivalently,
one can compute the modified exponential input

$$y_i = x_i - x_{max}.$$
Second, examine each modified exponential input $y_i$ and see if it falls below the critical
value $-C$ that is determined by the researcher's hardware and software for estimating the
PINs. Note that the maximum input $y_{max}$ is zero and thus $y_i \le 0$ always holds. If $y_i \le -C$,
then it is necessary to force a zero weight so as to avoid the underflow from evaluating $\exp(y_i)$.
If $-C < y_i \le 0$, then it is fine to directly evaluate $\exp(y_i)$. Equivalently, one can
compute the modified weight

$$v_i = w_i \cdot \mathbf{1}(-C < y_i \le 0),$$

where the indicator function takes the value of 1 if $-C < y_i \le 0$ and 0 otherwise.
Third, the daily log-likelihood function can be rewritten as

$$L(\theta) = -2\varepsilon + (B + S)\ln(\varepsilon) + x_{max} + \ln\Big[\sum_{v_j \neq 0} v_j \exp(y_j)\Big].$$

Note that there is no need to check for the logarithmic inputs. The arrival rates are often
coded as an exponential function to ensure their positiveness so $\ln(\varepsilon)$ will not cause numerical
problems. Moreover, the fact that $y_i \le 0$ implies that the input for the second logarithmic
function is properly bounded between 0 and 1.
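The three steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the estimation code used in the paper: the function name `daily_loglik` is hypothetical, the default `C = 700.0` is an assumed cutoff suited to double-precision arithmetic, and the parameters are assumed to be interior (0 < alpha < 1, 0 < delta < 1, mu > 0, eps > 0) so that every weight is positive. As in the expression above, the constant term involving B! and S! is omitted.

```python
import math

def daily_loglik(B, S, alpha, delta, mu, eps, C=700.0):
    """Daily log-likelihood of the original PIN model for B buys and S sells,
    evaluated with the dynamic factorization (constant -ln(B! S!) omitted)."""
    ln_k = math.log(eps) - math.log(mu + eps)      # k = eps / (mu + eps)
    weights = [1.0 - alpha, alpha * delta, alpha * (1.0 - delta)]
    inputs = [0.0, -mu - S * ln_k, -mu - B * ln_k]

    # Step 1: pull the largest exponential input into the common factor.
    x_max = max(inputs)

    total = 0.0
    for w, x in zip(weights, inputs):
        y = x - x_max                              # y <= 0 by construction
        # Step 2: keep the term only if exp(y) will not underflow.
        if y > -C:
            total += w * math.exp(y)

    # Step 3: reassemble the log-likelihood around x_max.
    return -2.0 * eps + (B + S) * math.log(eps) + x_max + math.log(total)
```

The term with the largest input always contributes its full weight, because its modified input is zero, so the argument of the final logarithm stays strictly positive under the interior-parameter assumption.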
Having discussed the process of factoring the log-likelihood function, I should also note the
importance of checking for the presence of overflow and underflow problems when handling
the various transformations of raw parameter inputs that help to ensure that $0 \le \alpha \le 1$,
$0 \le \delta \le 1$, $\mu > 0$, and $\varepsilon > 0$. Researchers often use the exponential transformation to ensure
a positive parameter and the logistic transformation for a parameter that is a probability. One
can use techniques similar to the ones documented above to handle these transformations.
For instance, denote by $\tilde{\alpha}$, $\tilde{\delta}$, $\tilde{\mu}$ and $\tilde{\varepsilon}$ the parameters before transformation and by $c$ a constant
that scales the arrival rates so as to make the Hessian matrix of the vector of parameters well
behaved. The transformation for the news probability and the probability of arrived news
being negative can be written as
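$$
\alpha = \begin{cases}
0 & \text{if } \tilde{\alpha} \le -C; \\
\dfrac{1}{1 + \exp(-\tilde{\alpha})} & \text{if } |\tilde{\alpha}| < C; \\
1 & \text{if } \tilde{\alpha} \ge C.
\end{cases}
\qquad
\delta = \begin{cases}
0 & \text{if } \tilde{\delta} \le -C; \\
\dfrac{1}{1 + \exp(-\tilde{\delta})} & \text{if } |\tilde{\delta}| < C; \\
1 & \text{if } \tilde{\delta} \ge C.
\end{cases}
$$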
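The guarded logistic transformation can be sketched as follows. The function name `to_probability` and the default cutoff `C = 700.0` are assumptions chosen for double-precision arithmetic, not values taken from the paper.

```python
import math

def to_probability(raw, C=700.0):
    """Map an unconstrained raw parameter to [0, 1] with a logistic
    transformation, short-circuiting inputs whose exponential would
    overflow or underflow."""
    if raw <= -C:
        return 0.0          # exp(-raw) would overflow; the limit is 0
    if raw >= C:
        return 1.0          # exp(-raw) underflows; the limit is 1
    return 1.0 / (1.0 + math.exp(-raw))
```

The same function serves both the news probability and the probability that the arrived news is negative, since the two transformations share one form.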
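The informed arrival rate $\mu$ and the uninformed arrival rate $\varepsilon$ can be written as

$$
\mu = \begin{cases}
0 & \text{if } \tilde{\mu} \le -C; \\
\exp(\tilde{\mu})\exp(c) & \text{if } |\tilde{\mu}| < C; \\
\exp(C)\exp(c) & \text{if } \tilde{\mu} \ge C.
\end{cases}
\qquad
\varepsilon = \begin{cases}
0 & \text{if } \tilde{\varepsilon} \le -C; \\
\exp(\tilde{\varepsilon})\exp(c) & \text{if } |\tilde{\varepsilon}| < C; \\
\exp(C)\exp(c) & \text{if } \tilde{\varepsilon} \ge C.
\end{cases}
$$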
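Finally, one needs to anticipate potential overflow and underflow problems for the
computation of

$$k = \frac{\varepsilon}{\mu + \varepsilon} = \frac{1}{1 + \exp(\tilde{\mu} - \tilde{\varepsilon})} \quad \text{and} \quad \ln(k) = -\ln\left[1 + \exp(\tilde{\mu} - \tilde{\varepsilon})\right].$$

There are four cases to consider.
(1) If $\tilde{\mu} - \tilde{\varepsilon} \le -C$, then $k = 1$ and $\ln(k) = 0$.
(2) If $\tilde{\mu} - \tilde{\varepsilon} \ge C$, then $k = 0$ and $\ln(k) = \tilde{\varepsilon} - \tilde{\mu}$.
(3) If $|\tilde{\mu} - \tilde{\varepsilon}| < C$ and $1 + \exp(\tilde{\mu} - \tilde{\varepsilon}) \ge 10^{C/\ln(10)}$, then $k = 0$ and $\ln(k) = \tilde{\varepsilon} - \tilde{\mu}$.
(4) If $|\tilde{\mu} - \tilde{\varepsilon}| < C$ and $1 + \exp(\tilde{\mu} - \tilde{\varepsilon}) < 10^{C/\ln(10)}$, then
$k = 1/[1 + \exp(\tilde{\mu} - \tilde{\varepsilon})]$ and $\ln(k) = -\ln[1 + \exp(\tilde{\mu} - \tilde{\varepsilon})]$.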
The log-likelihood function under the extended PIN framework can be handled in a similar
way. For instance, the table of weights and exponential inputs can be simply augmented with
two more rows along with the introduction of the additional parameter of the extended model.
I omit the details for brevity.
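The guarded arrival-rate transformation and the four cases for k and ln(k) in this appendix can be sketched as follows. The function names `to_arrival_rate` and `k_and_ln_k` are hypothetical, and the default cutoff `C = 700.0` is an assumption suited to double precision. The sketch also assumes that C + c stays below roughly 709, so that the product exp(C) exp(c) remains finite.

```python
import math

def to_arrival_rate(raw, c=0.0, C=700.0):
    """Map an unconstrained raw parameter to a non-negative arrival rate,
    scaled by exp(c), without overflowing or underflowing exp()."""
    if raw <= -C:
        return 0.0
    if raw >= C:
        return math.exp(C) * math.exp(c)   # cap at the largest safe value
    return math.exp(raw) * math.exp(c)

def k_and_ln_k(mu_raw, eps_raw, C=700.0):
    """Return (k, ln k) for k = eps / (mu + eps), working directly with the
    raw (log-scale) parameters; the scaling constant c cancels out."""
    d = mu_raw - eps_raw
    if d <= -C:                 # case 1: mu negligible relative to eps
        return 1.0, 0.0
    if d >= C:                  # case 2: eps negligible relative to mu
        return 0.0, -d
    z = 1.0 + math.exp(d)
    if z >= math.exp(C):        # case 3: 10**(C / ln 10) equals exp(C)
        return 0.0, -d
    return 1.0 / z, -math.log(z)   # case 4: safe to evaluate directly
```

In cases 2 and 3 the returned ln(k) uses the approximation ln[1 + exp(d)] ≈ d, which is exact to machine precision once exp(d) dwarfs 1.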
References
[1] Aktas, Nihat, Eric de Bodt, Fany Declerck, and Herve Van Oppens, 2007, The PIN anomaly around M&A announcements, Journal of Financial Markets 10, 169-191.
[2] Benos, Evangelos, and Marek Jochec, 2007, Testing the PIN variable, University of Illinois at Urbana-Champaign Working Paper.
[3] Bessembinder, Hendrik, Kalok Chan, and Paul J. Seguin, 1996, An empirical examination of information, differences of opinion, and trading activity, Journal of Financial Economics 40, 105-134.
[4] Bharath, Sreedhar T., Paolo Pasquariello, and Guojun Wu, 2009, Does asymmetric information drive capital structure decisions?, Review of Financial Studies 22, 3211-3243.
[5] Boehmer, Ekkehart, Joachim Grammig, and Erik Theissen, 2007, Estimating the probability of informed trading–does trade misclassification matter?, Journal of Financial Markets 10, 26-47.
[6] Chan, Kalok, 1992, A further analysis of the lead–lag relationship between the cash market and stock index futures market, Review of Financial Studies 5, 123-152.
[7] Chordia, Tarun, Richard Roll, and Avanidhar Subrahmanyam, 2000, Commonality in liquidity, Journal of Financial Economics 56, 3-28.
[8] Chordia, Tarun, Richard Roll, and Avanidhar Subrahmanyam, 2001, Market liquidity and trading activity, Journal of Finance 56, 501-530.
[9] Chordia, Tarun, Richard Roll, and Avanidhar Subrahmanyam, 2005, Evidence on the speed of convergence to market efficiency, Journal of Financial Economics 76, 271-292.
[10] CFTC-SEC, 2010, Findings regarding the market events of May 6, 2010, Report of the Staffs of the CFTC and SEC to the Joint Advisory Committee on Emerging Regulatory Issues.
[11] Duarte, Jefferson, Xi Han, Jarrad Harford, and Lance Young, 2008, Information asymmetry, information dissemination and the effect of regulation FD on the cost of capital, Journal of Financial Economics 87, 24-44.
[12] Duarte, Jefferson, and Lance Young, 2009, Why is PIN priced?, Journal of Financial Economics 91, 119-138.
[13] Easley, David, Robert F. Engle, Maureen O'Hara, and Liuren Wu, 2008, Time-varying arrival rates of informed and uninformed trades, Journal of Financial Econometrics 6, 171-207.
[14] Easley, David, Soeren Hvidkjaer, and Maureen O'Hara, 2002, Is information risk a determinant of asset returns?, Journal of Finance 57, 2185-2221.
[15] Easley, David, Nicholas M. Kiefer, and Maureen O'Hara, 1996, Cream-skimming or profit-sharing? The curious role of purchased order flow, Journal of Finance 51, 811-833.
[16] Easley, David, Nicholas M. Kiefer, and Maureen O'Hara, 1997, One day in the life of a very common stock, Review of Financial Studies 10, 805-835.
[17] Easley, David, Nicholas M. Kiefer, Maureen O'Hara, and Joseph B. Paperman, 1996, Liquidity, information, and infrequently traded stocks, Journal of Finance 51, 1405-1436.
[18] Easley, David, Marcos M. Lopez de Prado, and Maureen O'Hara, 2010, The microstructure of the "Flash crash": Flow toxicity, liquidity crashes and the probability of informed trading, Cornell University Working Paper.
[19] Easley, David, and Maureen O'Hara, 1992, Time and the process of security price adjustment, Journal of Finance 47, 577-605.
[20] Easley, David, and Maureen O'Hara, 2004, Information and the cost of capital, Journal of Finance 59, 1553-1583.
[21] Hasbrouck, Joel, and Duane J. Seppi, 2001, Common factors in prices, order flows, and liquidity, Journal of Financial Economics 59, 383-411.
[22] Jayaraman, Sudarshan, 2008, Earnings volatility, cash flow volatility, and informed trading, Journal of Accounting Research 46, 809-851.
[23] Kaul, Gautam, Qin Lei, and Noah Stoffman, 2005, AIMing at PIN: Order flow, information, and liquidity, University of Michigan Working Paper.
[24] Korajczyk, Robert A., and Ronnie Sadka, 2008, Pricing the commonality across alternative measures of liquidity, Journal of Financial Economics 87, 45-72.
[25] Lee, Charles M. C., and Mark J. Ready, 1991, Inferring trade direction from intraday data, Journal of Finance 46, 733-746.
[26] Lei, Qin, and Guojun Wu, 2005, Time-varying informed and uninformed trading activities, Journal of Financial Markets 8, 153-181.
[27] Vega, Clara, 2006, Stock price reaction to public and private information, Journal of Financial Economics 82, 103-133.
[28] Weston, James P., 2001, Information, liquidity and noise, Rice University Working Paper.
Table 1. Logit Model on the Success of PIN Estimations
This table presents the results from the logit regressions for the cross-sectional success of maximum likelihood estimations based on the
original PIN framework with the common factorization on the log-likelihood function in equation (3) for all trading days in the given
stock quarter. The dependent variable is the estimation outcome, taking the value 1 for a success (i.e., an estimation that converged
at a global maximum without being a corner solution for any of the estimated parameters) and 0 otherwise. For the results in the top
panel, the independent variables include the daily average total trades (in thousands of trades and denoted by TotTrade), the daily
maximum absolute order imbalance (in thousands of buy orders net of sell orders and denoted by AbsImbalance), the daily maximum
percentage absolute order imbalance (in absolute buy orders net of sell orders scaled by total orders and denoted by PctImbalance), and
the logarithmic market equity in million dollars as of December 31, 2009 (denoted by ln(MktEq)). The results in the bottom panel refer
to the logit regressions for the same dependent variable, while the independent variables are replaced by the cross-sectional percentile
rank of their counterparts in the top panel along with a suffix % in the variable names. The variable PctImbalance remains the same for
both the top and bottom panels. Also reported are the Pseudo-R2 for the goodness of fit and the Wald statistics (inside parentheses).

Top panel         Model (1)      Model (2)      Model (3)      Model (4)      Model (5)      Model (6)
Intercept         0.78 (325)     1.47 (668)     1.59 (692)     3.84 (1231)    2.56 (241)     1.35 (13)
TotTrade          0.41 (561)                    0.16 (63)                     0.14 (120)     0.15 (127)
AbsImbalance                     2.17 (838)     2.85 (537)                    1.28 (150)     1.25 (146)
PctImbalance                                                   28.42 (1069)   23.95 (519)    21.87 (343)
ln(MktEq)                                                                                    0.15 (12)
Pseudo-R2         0.253          0.338          0.345          0.422          0.455          0.456

Bottom panel      Model (1)      Model (2)      Model (3)      Model (4)      Model (5)      Model (6)
Intercept         3.83 (1142)    3.99 (1146)    4.11 (1145)    3.84 (1231)    0.79 (4)       0.56 (1)
TotTrade%         8.42 (1371)                   3.07 (41)                     2.23 (8)       2.08 (7)
AbsImbalance%                    8.78 (1362)    5.94 (147)                    8.06 (200)     8.04 (198)
PctImbalance                                                   28.42 (1069)   14.25 (69)     14.49 (69)
ln(MktEq)%                                                                                   0.04 (1)
Pseudo-R2         0.473          0.486          0.491          0.422          0.500          0.500
Table 2. Mean Estimates under Original PIN Framework
This table presents the cross-sectional mean probability of informed trading (PIN) related to
the flash crash on May 6, 2010. For each stock listed on the NYSE/AMEX, the conventional
PIN measure is estimated repeatedly over a quarter, with and without the flash crash, and
the resulting pair of quarterly PINs help to impute the daily PIN series. The PIN on the
day of the flash crash (denoted by PIN_t) is then compared to the quarterly PIN (denoted by
PIN_x) in the quarter ended one day before the flash crash, leading to the daily incremental
PIN. To assess the sensitivity of the results to the exact grouping of stocks, a set of increasingly
stringent filters are applied to further fine-tune the sample, such as excluding the American
Depositary Receipts (ADRs) and focusing on only common stocks. The common stocks on
the NYSE/AMEX, excluding those with earnings announced between May 5 and May 7,
2010, are then sliced into ten volume deciles. The cross-sectional means of PIN_x, PIN_t and
PIN_t - PIN_x are reported below, along with the t-statistics inside parentheses for the null
hypothesis of each respective measure being zero. Also reported is the ratio of mean PIN_t
to mean PIN_x.

Filtered Samples               Quarterly PIN_x    Daily PIN_t      PIN_t - PIN_x    PIN_t / PIN_x
All NYSE/AMEX stocks           0.140 (91.04)      0.256 (72.51)    0.116 (32.72)    1.83
Exclude ADRs                   0.139 (86.70)      0.256 (68.28)    0.118 (31.51)    1.85
Common stocks only             0.121 (73.57)      0.246 (53.77)    0.125 (27.42)    2.04
Exclude those with earnings    0.120 (65.16)      0.245 (49.80)    0.124 (25.43)    2.03
Lowest volume decile           0.221 (26.86)      0.318 (14.54)    0.097 (4.68)     1.44
Decile 2                       0.151 (34.38)      0.254 (20.02)    0.103 (7.96)     1.68
Decile 3                       0.130 (39.40)      0.258 (21.77)    0.128 (10.16)    1.98
Decile 4                       0.119 (39.72)      0.240 (18.05)    0.120 (8.64)     2.01
Decile 5                       0.116 (35.90)      0.217 (16.84)    0.101 (7.91)     1.87
Decile 6                       0.105 (50.46)      0.232 (17.75)    0.126 (9.47)     2.20
Decile 7                       0.100 (37.45)      0.232 (14.79)    0.132 (7.83)     2.32
Decile 8                       0.095 (30.73)      0.218 (15.64)    0.124 (9.05)     2.31
Decile 9                       0.084 (27.00)      0.238 (15.36)    0.154 (9.97)     2.83
Highest volume decile          0.080 (23.40)      0.238 (12.35)    0.159 (8.31)     2.99
Table 3. PIN with Liquidity Shocks based on Stock-Specific Cutoffs
This table presents the cross-sectional mean probability of informed trading (PIN) related
to the flash crash on May 6, 2010, after explicitly allowing for liquidity shocks in the extended
PIN model. In this exercise, liquidity shocks are identified through intraday price reversals
that exceed one standard deviation of daily stock returns. For each common stock listed
on the NYSE/AMEX, excluding those with earnings announced close enough to the flash
crash, the extended PIN model is estimated repeatedly over a quarter, with and without the
flash crash. The resulting pair of quarterly estimates help to impute a daily series. For both
the information asymmetry component (denoted by PIN_inasy) and the illiquidity component
(denoted by PIN_illiq), the daily series on the day of the flash crash (denoted by PIN_t) is
compared to the quarterly series excluding the flash crash (denoted by PIN_x), leading to the
daily incremental series. The common stocks in the final sample are sliced into ten volume
deciles and the cross-sectional means of PIN_x, PIN_t and PIN_t - PIN_x are reported below,
along with the t-statistics inside parentheses for the null hypothesis of each respective measure
being zero. Also reported is the ratio of mean PIN_t to mean PIN_x.

PIN_inasy    Quarterly PIN_x    Daily PIN_t      PIN_t - PIN_x    PIN_t / PIN_x
Decile 2     0.156 (14.02)      0.200 (9.05)     0.044 (2.16)     1.29
Decile 3     0.203 (10.64)      0.254 (9.44)     0.051 (2.30)     1.25
Decile 4     0.121 (13.67)      0.192 (7.91)     0.070 (2.91)     1.58
Decile 5     0.104 (31.33)      0.132 (8.00)     0.027 (1.66)     1.26
Decile 6     0.089 (38.86)      0.137 (8.58)     0.048 (3.01)     1.54
Decile 7     0.094 (20.98)      0.159 (7.39)     0.065 (2.98)     1.70
Decile 8     0.074 (35.80)      0.141 (7.96)     0.067 (3.63)     1.89
Decile 9     0.085 (17.10)      0.191 (6.66)     0.106 (4.00)     2.24

PIN_illiq    Quarterly PIN_x    Daily PIN_t      PIN_t - PIN_x    PIN_t / PIN_x
Decile 2     0.013 (6.05)       0.140 (7.52)     0.127 (7.00)     11.00
Decile 3     0.009 (5.45)       0.094 (6.86)     0.085 (6.56)     10.94
Decile 4     0.010 (7.09)       0.145 (6.93)     0.135 (6.66)     14.51
Decile 5     0.013 (8.72)       0.171 (8.73)     0.158 (8.30)     12.83
Decile 6     0.013 (10.59)      0.156 (8.69)     0.143 (8.21)     12.38
Decile 7     0.013 (6.67)       0.134 (8.98)     0.122 (8.45)     10.67
Decile 8     0.013 (9.51)       0.131 (12.05)    0.118 (11.06)    10.23
Decile 9     0.014 (9.06)       0.159 (8.11)     0.145 (7.80)     11.43
Table 4. PIN with Liquidity Shocks based on Uniform Cutoff
This table presents the cross-sectional mean probability of informed trading (PIN) related
to the flash crash on May 6, 2010, after explicitly allowing for liquidity shocks in the extended
PIN model. In this exercise, liquidity shocks are identified through intraday price reversals
that uniformly exceed 2% for all stocks. For each common stock listed on the NYSE/AMEX,
excluding those with earnings announced close enough to the flash crash, the extended PIN
model is estimated repeatedly over a quarter, with and without the flash crash. The resulting
pair of quarterly estimates help to impute a daily series. For both the information asymmetry
component (denoted by PIN_inasy) and the illiquidity component (denoted by PIN_illiq), the
daily series on the day of the flash crash (denoted by PIN_t) is compared to the quarterly series
excluding the flash crash (denoted by PIN_x), leading to the daily incremental series. The
common stocks in the final sample are sliced into ten volume deciles and the cross-sectional
means of PIN_x, PIN_t and PIN_t - PIN_x are reported below, along with the t-statistics inside
parentheses for the null hypothesis of each respective measure being zero. Also reported is
the ratio of mean PIN_t to mean PIN_x.

PIN_inasy    Quarterly PIN_x    Daily PIN_t      PIN_t - PIN_x    PIN_t / PIN_x
Decile 2     0.128 (16.12)      0.141 (11.18)    0.013 (1.11)     1.10
Decile 3     0.172 (12.72)      0.207 (9.17)     0.035 (2.04)     1.20
Decile 4     0.109 (18.48)      0.170 (8.99)     0.061 (3.35)     1.56
Decile 5     0.099 (28.48)      0.115 (9.42)     0.016 (1.34)     1.16
Decile 6     0.094 (22.51)      0.147 (8.51)     0.053 (3.45)     1.56
Decile 7     0.093 (20.82)      0.164 (11.07)    0.071 (4.90)     1.76
Decile 8     0.074 (39.18)      0.140 (8.83)     0.066 (4.18)     1.89
Decile 9     0.080 (17.78)      0.161 (8.56)     0.080 (4.53)     2.00

PIN_illiq    Quarterly PIN_x    Daily PIN_t      PIN_t - PIN_x    PIN_t / PIN_x
Decile 2     0.040 (5.37)       0.173 (8.20)     0.134 (7.48)     4.38
Decile 3     0.028 (3.22)       0.116 (6.22)     0.088 (6.10)     4.20
Decile 4     0.022 (4.13)       0.186 (8.39)     0.164 (8.63)     8.48
Decile 5     0.021 (5.67)       0.155 (8.83)     0.134 (7.91)     7.52
Decile 6     0.017 (5.97)       0.150 (7.10)     0.133 (6.68)     8.98
Decile 7     0.020 (4.74)       0.126 (9.14)     0.106 (8.42)     6.21
Decile 8     0.022 (4.40)       0.126 (9.26)     0.103 (8.09)     5.64
Decile 9     0.018 (5.31)       0.143 (7.52)     0.125 (7.49)     7.94
Table 5. Forecasting the Opening Bid-Ask Spread
This table presents the cross-sectional regression results from predicting the bid-ask spread
at the market opening on the day of the flash crash (i.e., May 6, 2010), after explicitly
allowing for liquidity shocks in the extended PIN model. In this exercise, liquidity shocks are
identified through intraday price reversals that either exceed the stock-specific one standard
deviation of daily stock returns or uniformly exceed 2% for all stocks. For each common
stock listed on the NYSE/AMEX, excluding those with earnings announced close enough to
the flash crash, the extended PIN model is estimated over the quarter immediately preceding
the flash crash, and the resulting information asymmetry component (denoted by PIN_inasy)
and the illiquidity component (denoted by PIN_illiq) are predictors for the opening spread.
Also included in the set of explanatory variables is the average daily share volume (denoted
by volume) in the quarter prior to the flash crash. The opening bid-ask spread (denoted by
ospread), the share volume and the two PIN components are transformed using the natural
logarithms. The estimated coefficients are reported below, along with the t-statistics inside
parentheses. Also reported are the adjusted R2 and the number of observations used in each
regression.

Forecasting ln(ospread) with Firm-Specific Cutoffs
                  Model (1a)      Model (2a)      Model (3a)
Intercept         5.35 (30.22)    3.96 (16.12)    6.71 (21.84)
ln(PIN_inasy)     0.89 (11.63)                    0.61 (6.06)
ln(PIN_illiq)                     0.14 (2.65)     0.21 (4.30)
ln(volume)                                        -0.17 (-5.84)
Adj. R2           0.195           0.012           0.292
Observations      554             488             488

Forecasting ln(ospread) with Uniform 2% Cutoff
                  Model (1b)      Model (2b)      Model (3b)
Intercept         4.83 (25.35)    4.76 (34.38)    6.71 (28.92)
ln(PIN_inasy)     0.66 (8.23)                     0.30 (3.27)
ln(PIN_illiq)                     0.33 (10.24)    0.31 (10.41)
ln(volume)                                        -0.22 (-8.31)
Adj. R2           0.095           0.181           0.394
Table 6. Explaining the Illiquidity Component of PIN
This table presents the regression results of explaining the cross-sectional variation in the
illiquidity component of PIN on the day of the flash crash (i.e., May 6, 2010), after explicitly
allowing for liquidity shocks in the extended PIN model. In this exercise, liquidity shocks are
identified through intraday price reversals that either exceed the stock-specific one standard
deviation of daily stock returns or uniformly exceed 2% for all stocks. For each common
stock listed on the NYSE/AMEX, excluding those with earnings announcements close to
the flash crash, the extended PIN model is estimated repeatedly over a quarter, with and
without the day of the flash crash. The resulting pair of quarterly estimates is used to impute
a daily series for both the information asymmetry component and the illiquidity component on the
day of the flash crash, the latter of which (denoted by PIN_illiq) is the dependent variable of
the contemporaneous regression. The set of explanatory variables includes the time-weighted
bid-ask spread (as a percentage of the midpoint price and denoted by twspread) and the
share volume on the day of the flash crash (denoted by volume), both in natural
logarithms. The estimated coefficients are reported below, with t-statistics in
parentheses. Also reported are the adjusted R^2 and the number of observations used in each
regression.
Explaining PIN_illiq with Firm-Specific Cutoffs

                 Model (1a)       Model (2a)       Model (3a)
Intercept        0.11 (6.20)      0.13 (5.61)      -0.05 (-0.93)
ln(twspread)     0.01 (2.19)                       0.03 (3.89)
ln(volume)                        0.00 (0.79)      0.01 (3.31)
Adj. R^2         0.007            0.001            0.024
Explaining PIN_illiq with Uniform 2% Cutoff

                 Model (1b)       Model (2b)       Model (3b)
Intercept        0.01 (0.58)      0.19 (8.17)      -0.34 (-6.44)
ln(twspread)     0.05 (8.55)                       0.10 (11.06)
ln(volume)                        -0.01 (-1.98)    0.03 (6.98)
Adj. R^2         0.099            0.004            0.161
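
Two procedural steps in the caption of Table 6 lend themselves to a short sketch: flagging a liquidity shock under the two cutoff rules, and backing out a one-day PIN value from a pair of quarterly estimates. The sketch rests on stated assumptions. The function names and arguments are hypothetical, the size of the intraday price reversal is taken as an already-computed input (its exact construction is not reproduced here), and the imputation reads the paper's identity N_x PIN_x + N_t PIN_t = N_c PIN_c with N as a count of trading days (62 days without the event day, 63 with it), which is an interpretation rather than a confirmed detail.

```python
def is_liquidity_shock(reversal, daily_return_std=None, uniform_cutoff=None):
    """Flag a liquidity shock from the size of an intraday price reversal.

    Pass uniform_cutoff (e.g. 0.02) for the uniform rule, or
    daily_return_std for the firm-specific one-standard-deviation rule.
    """
    if uniform_cutoff is not None:
        threshold = uniform_cutoff
    elif daily_return_std is not None:
        threshold = daily_return_std
    else:
        raise ValueError("provide daily_return_std or uniform_cutoff")
    return abs(reversal) > threshold


def impute_daily_pin(pin_with, pin_without, n_with=63, n_without=62):
    """Back out the event-day PIN from two quarterly estimates.

    Assumes N_x * PIN_x + N_t * PIN_t = N_c * PIN_c, where x is the window
    without the event day, c is the window with it, and N_t = N_c - N_x.
    The day counts are assumed weights, not values confirmed by the source.
    """
    n_day = n_with - n_without
    if n_day <= 0:
        raise ValueError("n_with must exceed n_without")
    return (n_with * pin_with - n_without * pin_without) / n_day


# Hypothetical inputs for illustration only.
reversal = 0.015            # a 1.5% intraday reversal
firm_std = 0.010            # stock-specific daily return standard deviation
flag_firm = is_liquidity_shock(reversal, daily_return_std=firm_std)
flag_uniform = is_liquidity_shock(reversal, uniform_cutoff=0.02)

daily_pin = impute_daily_pin(pin_with=0.105, pin_without=0.100)
print(flag_firm, flag_uniform, daily_pin)
```

The example shows why the two panels can differ: the same reversal counts as a shock for a low-volatility stock under the firm-specific rule but falls short of the uniform 2% cutoff. It also shows how a small change in the quarterly estimate translates into a large one-day value, since the single event day carries only 1/63 of the weight under the assumed day-count weights.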