Lada Adamic, Celso Brunetti, Jeffrey Harris, and Andrei Kirilenko September 9, 2009
ABSTRACT We apply network analysis to trace patterns of information transmission in an electronic limit order market. If market orders or large executable limit orders are submitted by informed traders, then the resulting star-shaped or diamond-shaped patterns of trading networks should be associated with large changes in returns, smaller volume, and short duration between trades. In contrast, the execution of small limit orders from uninformed traders should result in networks with many triangular and reciprocal patterns and be associated with smaller changes in returns, larger volume, and longer duration between trades. We compute a time series of trading networks using audit trail, transaction-level data for all regular transactions in the September 2008 E-mini S&P 500 futures contract, the cornerstone of price discovery for the S&P 500 Index. We find that network metrics that quantify the shape of a network are statistically significantly related to returns, volatility, volume, and duration.
Lada Adamic is with the University of Michigan and the Commodity Futures Trading Commission, Celso Brunetti is with Johns Hopkins University and the Commodity Futures Trading Commission, Jeffrey Harris is with the Commodity Futures Trading Commission and the University of Delaware, and Andrei Kirilenko is with the Commodity Futures Trading Commission. We are grateful to Paul Tsyhura for invaluable assistance with the retrieval, organization, and processing of transaction-level data. We thank Pat Fishe, Pete Kyle, Antonio Mele, Han Ozsoylev, and seminar participants at the Chicago Mercantile Exchange, the Commodity Futures Trading Commission, 2009 Econometric Society Summer Meetings in Barcelona, the Federal Reserve Board of Governors, NASDAQ, the Securities and Exchange Commission, and the University of Maryland for very helpful comments and suggestions. The views expressed in this paper are our own and do not constitute an official position of the Commodity Futures Trading Commission, its Commissioners or staff.
Most securities exchanges around the world are electronic limit order markets. Yet, the analysis of electronic limit order trading has proven to be very challenging. To quote from the survey by Parlour and Seppi (2008): "Despite the simplicity of limit orders themselves, the economic interactions in limit order markets are complex because the associated state and action spaces are extremely large and because trading with limit orders is dynamic and generates non-linear payoffs." In this paper, we apply network analysis to quantify the dynamics of information transmission in an electronic limit order market - a complex dynamic problem. The networks we analyze are trading networks. We define a trading network as a set of traders engaged in transactions within a period of time. In graph-theoretic terminology, a trading network is a graph, consisting of a set of nodes and a set of edges. Each node denotes a unique trader, and an edge between two nodes denotes the occurrence of trading between two unique counterparties within a period of time. The direction of an edge indicates buy or sell transactions between unique counterparties. Namely, a directed edge from node A to node B indicates that trader A sold (one time or several times) to trader B during a specified period of time. A trading network formed over a designated number of transactions traces a pattern of order execution in the limit order book. By analyzing the shape of that pattern, we can quantify the structure of the executed portion of the book. For example, the execution of a market order will result in a star-shaped pattern, with the node that submitted the market order in the center and, in the periphery, the nodes it connected to as the market order marched through the limit order book. This star-shaped network will also not have any triangular or reciprocal connections.
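To make the graph construction concrete, the following is a minimal Python sketch (not part of the original analysis; the account labels are hypothetical) that builds the directed edge structure from a list of transactions and shows the star pattern left by a market order:

```python
# Sketch: a trading network as a directed graph. Each transaction is a
# (seller, buyer) pair of hypothetical account IDs; an edge A -> B means that
# A sold to B at least once during the period.
from collections import defaultdict

def build_network(transactions):
    """Adjacency sets: edges[a] is the set of accounts a sold to."""
    edges = defaultdict(set)
    for seller, buyer in transactions:
        edges[seller].add(buyer)
    return edges

# A market sell order sweeping the book leaves a star: one central seller S
# matched against many resting buy limit orders, with no triangles and no
# reciprocal edges.
star = build_network([("S", "B1"), ("S", "B2"), ("S", "B3"), ("S", "B4")])
print(sorted(star["S"]))  # ['B1', 'B2', 'B3', 'B4']
```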
In contrast, the execution of two large limit orders that arrived at different times will result in a diamond-shaped pattern with the two nodes that submitted large limit orders on the ends and market makers that provided the immediacy of execution (in small installments) in the middle. Finally, an execution of a sequence of small limit orders will look different from the execution of market or large executable limit orders. Some nodes will have more connections than others, but there will be no central dominant node or a diamond shape. There will be a number of triangular connections and some pairs of nodes will have edges that go both ways. If market orders or large executable limit orders are submitted by informed traders, then patterns of order execution should be informative beyond transaction prices, volume or trade duration. Intuitively, if market orders or large executable limit orders are submitted by informed traders, then resulting star-shaped or diamond-shaped trading networks should be associated with large changes in returns, possibly smaller volume, and short duration between trades. Conversely, trading networks that are very dissimilar to a star or a diamond - e.g., those with triangular and reciprocal patterns - should be associated with smaller changes in returns, possibly larger volume and longer duration between trades. Various network metrics that quantify the shape of a network - e.g., the number of central nodes or triangular connections in a network - should then be statistically related to returns, volatility, volume, and duration.
In this paper we find evidence that network metrics serve as primitive measures of limit order book dynamics. Namely, we compute network and financial variables for all regular transactions that occurred during August 2008 in the nearby E-mini S&P 500 futures contract and find that network variables strongly Granger-cause intertrade duration and volume. This suggests that network metrics presage the appearance of this information in duration and volume. We also find that the network variable that quantifies centrality (or how star-shaped a pattern is) exhibits a very high contemporaneous correlation with returns. Similarly, the network variables that quantify the assortativity of connections (or how diamond-shaped a pattern is) exhibit high contemporaneous correlation with volatility. These results are robust with respect to different equity index futures markets (E-mini Dow Jones and Nasdaq 100), different observation periods (May 2008 and August 2008), different levels of aggregation (at the broker level and the individual trading account level), and different sampling frequencies (240 and 600 transactions). The correlation results can also be replicated in a simulated model, confirming that these empirical regularities do not arise by chance. Furthermore, the results do not depend on any parametric specifications or modeling assumptions. This is the first paper to empirically link trading networks that trace the execution of the limit order book with the dynamics of high frequency financial variables - transaction prices, quantities, and duration. As such, it offers a way to analyze the dynamics of the executed portion of the limit order book from transaction-level data.
Empirical network analysis has previously been applied in finance to study investment decisions and corporate governance.1 In contrast to strategically formed networks, where participants prefer to associate with specific counterparties, the networks we study are trading networks in which connections are formed as a result of an automated matching algorithm and reflect the participants' beliefs about the valuation of an asset. These networks are also highly dynamic: whereas boards of directors and portfolio holdings evolve gradually, over weeks, months, or years, financial trading networks change second by second. Our paper proceeds as follows. In Section I, we describe our unique ultra high frequency data, explain how we chose the sampling frequency, and describe financial variables. In Section II, we describe network variables. In Section III, we outline our conjecture of why patterns of order execution (trading networks) contain valuable information beyond prices, quantities, or intertrade duration. In Section IV, we present the empirical properties of network and financial variables. In Section V, we analyze time series properties and employ Granger-causality tests between and among network and financial variables. Section VI demonstrates that our results are robust with respect to different markets, different observation periods, and different sampling frequencies. In Section VII, we use an agent-based simulation model of trading networks to further test that our empirical results do not arise by chance. Finally, Section VIII
1 For
summarizes our findings and suggests further applications of the network analysis methodology to trading networks.
The second technique is developed by Bandi and Russell (2006) to select the sampling frequency that minimizes the variance of market microstructure noise. According to the second technique, the optimal sampling frequency is just below 100 transactions. Neither technique makes any use of network variables. We adopt a very conservative approach and select 240 transactions as the sampling frequency for our data.6 For each period consisting of 240 transactions (which amounts to a total of 25,104 such periods in our sample), we compute the following financial variables: returns, volatility, intertrade duration, and trading volume. These four variables are typically assumed to both contain and convey valuable information to market participants about the true (but unobserved) stochastic price process.7 Intuitively, market participants can learn about the true underlying price process by observing transaction prices, trading volume, and times between trades. Transaction prices contain valuable information about the true underlying price process, but with a possibly significant amount of noise due to, among other reasons, market microstructure issues (e.g., bid-ask bounce), measurement issues (e.g., time scale, discrete realizations from a continuous process), and seasonality (e.g., predictable intraday patterns).8 Both returns and their volatility are computed from observed prices and, thus, suffer from the same noise issues. However, a number of techniques have been developed to reduce the impact of different noise components in ultra high-frequency data. The techniques we use to deal with measurement errors and reduce market microstructure noise (filters and the optimal sampling frequency) are described just above. In addition, we remove a predictable intraday seasonal component from the computed raw returns by regressing them on a constant and a sequence of dummy variables for each half-hour during the trading period.
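As an illustration of the deseasonalization step (not part of the original analysis): regressing returns on a constant plus half-hour dummies and keeping the residual is equivalent to demeaning returns within each half-hour bucket, and a minimal sketch of that equivalent computation, with hypothetical numbers, is:

```python
# Sketch: removing the intraday seasonal component. Regressing returns on a
# constant plus half-hour dummies and keeping the residual is equivalent to
# demeaning within each half-hour bucket, which this does directly.
from collections import defaultdict

def deseasonalize(values, bucket_ids):
    """values[i] observed in intraday bucket bucket_ids[i]; remove bucket means."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, b in zip(values, bucket_ids):
        sums[b] += v
        counts[b] += 1
    means = {b: sums[b] / counts[b] for b in sums}
    return [v - means[b] for v, b in zip(values, bucket_ids)]

raw = [0.002, 0.004, -0.001, 0.001]   # hypothetical raw returns
buckets = [0, 0, 1, 1]                # two half-hour buckets
print(deseasonalize(raw, buckets))    # each bucket now averages zero
```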
We then use the unexplained term as our measure of returns.9 We compute returns as differences in log prices, using both the last price relative to the first price within the same period (close-to-open) and the last prices of consecutive periods (close-to-close). The results reported below refer to the close-to-open deseasonalized returns, because we believe it to be an intuitively more appealing measure to compare with network variables (also cleaned of seasonality), which are defined within each sampling period. Having said that, the main results are affected neither by the two different ways to compute returns nor by the deseasonalization procedure.10 Volatility is another measure that contains valuable information about the true underlying price process. As mentioned above, because it is computed from observed prices, it suffers from the same noise issues as returns. Moreover, unlike prices, volatility is never directly observed. Thus, volatility estimates contain not only the volatility of the noise, but also a possibly nontrivial factor due to the covariance between the
6 In order to ensure the robustness of our results, we repeat our analysis at a higher sampling frequency (see our discussion on robustness later in the paper). The main results are unaffected. 7 There is a vast theoretical and empirical literature on the subject. For a recent summary, see Manganelli (2005). 8 See, for example, Engle (2000). 9 We apply the same technique to all financial and network variables. 10 We also used a Fourier flexible form to remove seasonality. It did not qualitatively change our results.
true price process and the noise component.11 We use three measures to estimate volatility during each period: absolute returns, squared returns, and the price range. Absolute and squared returns are proxies for the standard deviation and variance of returns, respectively. The price range is defined as the difference between the high and low price (in logs) during the period. For the results reported below, we use the price range as the measure of volatility. Range-based volatility estimators have been shown to be more efficient than return-based volatility estimators, because they incorporate the full sample path of observed prices (to select a maximum and a minimum) rather than just open and close prices.12 Our main results are not affected by the choice of volatility estimator. Intertrade duration contains valuable information, because the estimation of characteristics of the true price process obtained during periods of shorter intertrade duration can be more precise. This would happen irrespective of the reasons for shorter intertrade duration: whether more frequent trading occurs due to more informed trading or more liquidity trading, more frequent sampling would result in greater precision with respect to the true process. Having said that, there is a view that since information is disseminated through trading, the interval of time between trades can be interpreted as a proxy for the arrival of new information to the market.13 We compute duration as the time (in seconds) elapsed between the start and end of the period. We compute three measures of duration: total (unweighted) period duration, volume-weighted period duration, and the average of the 239 intertrade (within-period) durations. The results reported below are for total period duration. The main results are unaffected by the way we compute intertrade duration.
Trading volume contains valuable information, because volume together with observed transaction prices can be driven by a common latent factor often referred to in the literature as information intensity.14 Intuitively, during periods of higher volume, transaction prices also exhibit greater precision about the characteristics of the true underlying price process. We compute volume as the number of contracts both bought and sold during the observation period.
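A minimal sketch of the three per-period measures as used in the results below - price range in logs, total period duration, and contract volume - computed from hypothetical trade tuples (the exact volume convention, counting each contract once here, is a simplifying assumption):

```python
# Sketch of the three per-period measures: range-based volatility (log high
# minus log low), total period duration in seconds, and volume in contracts
# (counting each contract once; the trade tuples are hypothetical).
import math

def period_stats(trades):
    """trades: list of (price, quantity, timestamp_seconds) in time order."""
    prices = [price for price, _, _ in trades]
    price_range = math.log(max(prices)) - math.log(min(prices))
    duration = trades[-1][2] - trades[0][2]
    volume = sum(qty for _, qty, _ in trades)
    return price_range, duration, volume

trades = [(1250.00, 2, 0.0), (1250.25, 1, 0.4), (1249.75, 3, 1.1)]
vol, dur, qty = period_stats(trades)
print(vol, dur, qty)
```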
number of techniques have been developed to estimate volatility components separately by varying the time window. See, for example, Zhang, Mykland, Ait-Sahalia (2005). Application of these techniques to trading networks will be explored in our future research. 12 For the literature on price range as an efcient estimator of asset price volatility, see, for example, Parkinson (1980), Garman and Klass (1980), Beckers (1983), and Brunetti and Lildtholdt (2006). In recent years, the price range has also been used to compute realized volatility in high frequency data. See, for example, Christensen and Podolski (2009). 13 See, for example, Engle and Russell (1998) and Engle (2000). 14 There is a vast theoretical and empirical literature on the subject. See, for example, Clark (1973), Epps and Epps (1976), Tauchen and Pitts (1983), Admati and Peiderer (1988), Easley and OHara (1992),and Andersen (1996).
11 A
transaction prices, some nodes may decide to modify or remove some existing stubs or grow new stubs, thus affecting the network formation process. Empirically, we construct trading networks as follows. At 9:30:00 a.m. EST on August 1, 2008, we start counting transactions in the September 2008 E-mini S&P 500 futures contract. For each transaction, we know which account bought from or sold to which other account (or itself), at what price, and for what number of contracts. We designate 240 consecutive transactions as one period. Transactions 1 through 240 mark the first period, transactions 241 through 480 mark the second period, and so on. While for each period we do not observe the limit order book itself, we know that transactions occurred because market orders or limit orders were matched with existing orders in the limit order book. We can then trace the pattern of order execution, or a trading network, within each period. Even though the number of transactions for each period is the same, a pattern for a large market order executed over the period will look very different from a pattern for several smaller limit orders. Metrics that we compute for each network should be interpreted as quantitative measures of the pattern of order execution in the limit order book. We realize that by taking snapshots of the market at equal transaction-time intervals, we cannot hope to characterize the whole complexity of changes that take place in the underlying limit order book. Specifically, we cannot observe how the revelation of transaction prices translates into modifications or cancellations of existing orders and submissions of new orders. Or, in terms of the network formation process, we cannot observe how nodes remove some existing stubs and grow new stubs. While we know that the process of trading network formation - stubs, edges, transaction prices, new stubs - goes on continuously, we must designate the number of transactions that add up to a trading network at a point in time.
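A sketch of the segmentation just described: the transaction stream is cut into consecutive 240-transaction periods, and one set of directed counterparty edges is extracted per period (the two-account stream below is a toy example, not market data):

```python
# Sketch: cut the transaction stream into consecutive 240-transaction periods
# and extract one set of directed (seller, buyer) edges per period.
def periods(transactions, size=240):
    for start in range(0, len(transactions) - size + 1, size):
        yield transactions[start:start + size]

def network_edges(period):
    """One directed edge per distinct (seller, buyer) pair in the period."""
    return {(seller, buyer) for seller, buyer in period}

stream = [("A", "B")] * 300 + [("B", "C")] * 180   # 480 toy transactions
nets = [network_edges(p) for p in periods(stream)]
print(len(nets))  # 2 periods of 240 transactions each
```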
This designated number of transactions could be at times too small and at times too large to clearly capture the impact of order execution on the order book through network analysis within each period. However, as we analyze the time series properties of trading networks, a statistically significant pattern, if there is one, should emerge. In other words, the approach we take is to compute and analyze network metrics for a time series of consecutive trading networks rather than those for one aggregate network that emerges over the whole period. Given our intuition about how patterns should be related to the dynamics of transaction prices and quantities, we are interested in network metrics that can measure centrality (or how star-shaped a network is); assortativity of connections (or how diamond-shaped a network is); as well as those that can measure reciprocity, triangular connections, and the size of the network. The size of the network can be characterized in terms of the total number of nodes, denoted by N, and the total number of edges, denoted by E. From these two quantities we can also compute the average degree, AVDEG = E/N, the average number of nodes that a node is connected to, and the standard deviation of degree, STDEG, the standard deviation around
this average. These two variables characterize the first and second moment, respectively, of the unconditional degree distribution. Node centrality quantifies the position of a specific node in a network. There are several node centrality measures, the simplest one being degree, or how many edges a node has. In a directed network, degree can be further separated into indegree and outdegree in accordance with the number of incoming or outgoing edges of a node. However, degree alone may not necessarily capture the role of a node in the network. For example, a node that has a relatively low degree, but acts as a connector between otherwise disconnected parts of the network, can be thought of as very central. To that end, there are measures of centrality that take into account not just the degree of a node, but its position relative to all other nodes in the network. For example, betweenness measures how many other pairs of nodes would have to go through the given node in order to reach one another in the shortest number of hops. Similarly, closeness measures how many hops away a node is, on average, from every other node in the network. Figure 2 illustrates different node centrality measures. Node centrality is a critical input into the calculation of network centralization, a measure that characterizes the inequality of connectivity among the nodes. In order to capture this inequality in connectivity within the network (whether there are a small number of nodes with high centrality and a large number of nodes with low centrality), we compute a centralization measure defined as a centralization Gini:

G = \frac{\sum_{r=1}^{N} (2r - N - 1)\, k_r}{N E}, (1)

where k_r is the centrality measure of the node with rank order number r (nodes ranked in ascending order of centrality). Taking a node's degree as its centrality measure, we use the formula above to compute separate centralization measures for indegree and outdegree: incentralization, INCEN, and outcentralization, OUTCEN, respectively.
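Equation (1) is straightforward to implement. A pure-Python sketch, applied to a hypothetical star-shaped period with a single dominant seller, together with the combined measure CEN = INCEN - OUTCEN:

```python
# Sketch of equation (1): a degree-based centralization Gini, with degrees
# taken in ascending rank order. Applied to indegree and outdegree it gives
# INCEN and OUTCEN; CEN = INCEN - OUTCEN. The example degrees describe a
# hypothetical star period with one dominant seller.
def centralization_gini(degrees):
    n, e = len(degrees), sum(degrees)
    if e == 0:
        return 0.0
    return sum((2 * r - n - 1) * k
               for r, k in enumerate(sorted(degrees), start=1)) / (n * e)

outdeg = [4, 0, 0, 0, 0]   # one seller hit four different buyers
indeg = [0, 1, 1, 1, 1]    # each buyer received one fill
outcen = centralization_gini(outdeg)  # 0.8: out-edges concentrated on one node
incen = centralization_gini(indeg)    # 0.2: in-edges spread fairly evenly
cen = incen - outcen                  # about -0.6, negative: a dominant seller
print(outcen, incen, cen)
```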
By construction, these measures are 0 if every node has the same number of (incoming or outgoing) edges, and positive with increasing inequality: e.g., when one node has all the incoming (outgoing) edges and the others have no incoming (outgoing) edges. We also compute a combined measure of incentralization and outcentralization: CEN = INCEN - OUTCEN. Intuitively, since we use a node's degree as a measure of its centrality, the difference between the in- and out-centralization measures can be interpreted as the presence of a dominant buyer or seller. CEN will be equal to 1 if there is a dominant buyer and -1 if there is a dominant seller. To measure whether a node is both a buyer and a seller, we compute the Pearson correlation coefficient between the indegree and the outdegree of each node, INOUT. A positive
correlation indicates that nodes with many in-edges also have many out-edges - i.e., such a node is both buying and selling. We also calculate statistical properties of nodes one edge away from each individual node, i.e., the connectivity of node B conditional on it being connected to node A. Assortativity in networks can represent any tendency of like to be connected with like for any node property (see Newman (2002)), but here we apply it to degree. Large-degree nodes (i.e., those with many edges) may connect more frequently to other large-degree nodes, or they may tend to connect to small-degree nodes. Two large-degree nodes connecting to a number of small-degree nodes between them will result in a diamond-shaped network. One way to measure assortativity is by the Pearson correlation coefficient \rho(k_i, k_j) over all edges e_{ij}. When the edges are directed, there are four possible assortativity measures: \rho(k_i^{in}, k_j^{in}), \rho(k_i^{in}, k_j^{out}), \rho(k_i^{out}, k_j^{in}), and \rho(k_i^{out}, k_j^{out}), corresponding to the four conditional degree distributions. From these four correlation coefficients, we construct the following compound measure, which we call the assortativity index for directed networks:

AI = \frac{1}{4}\left[\rho(k_i^{in}, k_j^{in}) + \rho(k_i^{out}, k_j^{out}) - \rho(k_i^{in}, k_j^{out}) - \rho(k_i^{out}, k_j^{in})\right], (2)
computed over all edges e_{ij}. Figure 3 illustrates network assortativity. For example, in the context of trading networks, the coefficient \rho(k_i^{out}, k_j^{in}) measures the correlation between the number of unique buyers (connected by an outward-pointing edge) a seller is selling to (denoted by k_i^{out}) and the number of unique sellers those buyers are buying from (denoted by k_j^{in}). A negative \rho(k_i^{out}, k_j^{in}) would mean that when a seller has matched with many buyers, those buyers are likely to be transacting with few or no other sellers. We also measure whether nodes one edge away from each individual node form particular (e.g., triangular) patterns. Transitivity, also termed clustering, measures the prevalence of closed triads in the network. In this paper, we use the global clustering coefficient, denoted by CC, as a measure of transitivity:19

CC = \frac{3 \times \text{number of triangles}}{\text{number of connected triples}}, (3)

19 See Newman (2003).
where a connected triple means three nodes A, B, C such that there is an edge AB and an edge BC.20 The prevalence of specific directed triads can be used to conduct a motif analysis on a directed network.21 Finally, in addition to regularities in connections between pairs and triplets of nodes, a network as a whole may be composed of several separate connected components. A connected component is a maximal subset of nodes such that any node can be reached from any other node by traversing edges. Within a strongly connected component, any node can be reached from any other by following directed edges. Figure 4 illustrates the largest strongly connected component (LSCC). Once the largest strongly connected component is identified, we can measure the global network structure by computing LSCC, the proportion of the network occupied by this component. Intuitively, the largest strongly connected component can only occupy a significant portion of the network if many nodes have both incoming and outgoing edges during the same time period, and there are cycles (the simplest of which are reciprocal ties and the triads mentioned above) within the network. In other words, a large strongly connected component is much more likely to emerge as a result of a large number of limit orders than of one large market order.
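The assortativity index of equation (2) can be sketched in pure Python as follows, assuming the reconstructed sign convention (like-with-like degree correlations enter positively, cross correlations negatively); the diamond-shaped example with two large traders intermediated by two small ones is hypothetical:

```python
# Sketch of the assortativity index of equation (2), assuming the sign
# convention reconstructed above. For every directed edge i -> j, node i's
# and node j's in- and outdegrees are paired and the four Pearson
# correlations averaged.
def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def assortativity_index(edges):
    indeg, outdeg = {}, {}
    for i, j in edges:
        outdeg[i] = outdeg.get(i, 0) + 1
        indeg[j] = indeg.get(j, 0) + 1
        indeg.setdefault(i, 0)
        outdeg.setdefault(j, 0)
    def rho(fi, fj):
        return pearson([fi[i] for i, j in edges], [fj[j] for i, j in edges])
    return 0.25 * (rho(indeg, indeg) + rho(outdeg, outdeg)
                   - rho(indeg, outdeg) - rho(outdeg, indeg))

# Diamond: two large traders A and D intermediated by small traders M1, M2.
diamond = [("A", "M1"), ("A", "M2"), ("M1", "D"), ("M2", "D")]
print(assortativity_index(diamond))  # 1.0 for this diamond
```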
index means that large traders are mostly matched with many small traders rather than with each other. A small largest strongly connected component suggests that large traders are not trading with each other: rather, they buy from or sell to small traders who quickly trade with another large trader. This pattern is associated with negative returns, as well as with higher volume and volatility. The right column of Figure 5 presents a fairly uniform network with many buyers and sellers of various sizes. This situation is reflected in a pattern of connections that exhibits network parameters close to their sample averages, with the exception of the number of edges - reflecting a larger and more interconnected trading network. The financial variables estimated from transaction prices - the rate of return and volatility - are very close to their averages. Volume is somewhat above its sample average and period duration is quite high. The examples above provide illustrative evidence in support of our intuitive conjecture. Our next step is to take our conjecture to the data - a time series of over 25,000 trading networks - and to show that metrics of order execution patterns are statistically related to returns, volatility, volume, and duration.
positive: AI = 0.09 ± 0.07. This means that, on average, when a seller (buyer) is matched with many buyers (sellers), they are just as likely to be transacting with many other sellers (buyers) as with a few or no sellers (buyers). A deviation from this pattern indicates that one buyer or one seller is dominant. The global clustering coefficient, or the ratio of observed triangular connections among nodes to all possible triangular connections, is 0.04 ± 0.03, nearly one standard deviation below the average clustering coefficient for randomized graphs with the same assignments of degrees. In other words, there is no tendency for the traders to cluster together. Similar to the clustering coefficient, the size of the largest strongly connected component (0.04 ± 0.04) does not deviate from what would be expected for networks of that size, density, and distribution of in- and outdegrees. But as we will see in the following section, it does strongly correlate with density and other network variables.
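The two statistics discussed above - the global clustering coefficient of equation (3) and the LSCC share - can be computed from a directed edge list with a short pure-Python sketch (the four-node example network is hypothetical, and a brute-force reachability pass stands in for a production SCC algorithm):

```python
# Sketch: the global clustering coefficient of equation (3) and the share of
# nodes in the largest strongly connected component, from a directed edge list.
from collections import defaultdict

def clustering_coefficient(edges):
    """CC = 3 * triangles / connected triples, on the undirected projection."""
    adj = defaultdict(set)
    for i, j in edges:
        if i != j:
            adj[i].add(j)
            adj[j].add(i)
    triples = sum(len(nbrs) * (len(nbrs) - 1) // 2 for nbrs in adj.values())
    triangles = sum(1 for a in adj for b in adj[a] for c in adj[b]
                    if a < b < c and c in adj[a])
    return 3 * triangles / triples if triples else 0.0

def lscc_share(edges):
    """Fraction of nodes in the largest strongly connected component."""
    fwd, rev = defaultdict(set), defaultdict(set)
    nodes = set()
    for i, j in edges:
        fwd[i].add(j)
        rev[j].add(i)
        nodes.update((i, j))
    def reachable(start, adj):
        seen, stack = set(), [start]
        while stack:
            n = stack.pop()
            if n not in seen:
                seen.add(n)
                stack.extend(adj[n])
        return seen
    # The SCC of n is the set reachable from n both forwards and backwards.
    biggest = max(len(reachable(n, fwd) & reachable(n, rev)) for n in nodes)
    return biggest / len(nodes)

# Hypothetical period: a 3-cycle A -> B -> C -> A plus a pendant trade C -> D.
example = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
print(clustering_coefficient(example), lscc_share(example))  # 0.6 0.75
```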
a situation when low-degree buyers are matched with low-degree sellers (several limit orders). Intuitively, in a deep and liquid market like the E-mini S&P 500 futures, an incoming market order has a significant chance of being executed against several limit orders sitting at (or near) the same tick, resulting in high assortativity and centralization, but very little price impact and, hence, a low high-frequency volatility estimate. At the same time, the intermediated execution of two large limit orders from both sides of the limit order book will result in a positive high-frequency estimate of volatility, if only due to the bid-ask bounce. Duration is positively correlated with the average degree and the in-out degree correlation, and negatively correlated with the standard deviation of degree and the assortativity index. Intuitively, a longer time interval between trades is associated with trades that are distributed more evenly among traders, increasing the average degree and decreasing the standard deviation of degree and the assortativity index. Over longer time intervals, it is also more likely that a node that has a high indegree also has a high outdegree (it has time to be both a buyer and a seller), which results in a positive in-out degree correlation.
B. Granger Causality
We next test for Granger causality in the context of Vector Autoregressive (VAR) models. Since the variables exhibit heteroskedasticity and serial correlation, we estimate VAR models using the generalized method of moments (GMM) and Newey-West robust standard errors. We first consider a VAR model with eight network variables. According to the Akaike Information Criterion, the system that includes all eight network variables has an optimal lag length of twenty.24 However, the results of the model with eight network variables (available from the authors upon request) show strong evidence of feedback effects among the network variables, i.e., network variables tend to Granger-cause each other. In light of this, we use standard tests to reduce the model to four network variables.25 Tables V-IX provide the results (p-values) of Granger-non-causality tests. The last column and the last row of each table are labelled "all". In the last column we test whether each variable is Granger-caused by all the other variables in the system, while in the last row we test whether each variable is Granger-causing any other variable in the system. The null hypothesis is that of Granger-non-causality. Therefore, a p-value greater than five percent indicates a failure to reject the null. Table V presents p-values for the Granger-non-causality test among three sets of four network variables (three panels). Panel 1 shows that centralization (CEN) is Granger-causing the other network variables (p-value = 0.5556), but is not Granger-caused by other network variables (p-value = 0.2387). On the other hand, Panels 2 and 3 show that the remaining network variables Granger-cause each other. Next, we test for Granger-causality between one financial variable and four network variables. Using standard techniques, we select groups of network variables that reflect degree properties at the level of a single node (e.g., centralization, standard deviation of degree, and in- and outdegree correlation), two nodes linked by an edge (assortativity index), connected triples of nodes (clustering coefficient), and the connectivity of the whole network (the proportion of nodes in the largest strongly connected component). Table VI presents p-values for the Granger-non-causality test for the rate of return and network variables. We find that the return process both is Granger-caused by and Granger-causes network variables. The network variable that has a strong impact on returns is centralization. This is in line with the correlation results in Table IV. Table VII reports Granger-non-causality test results for the volatility process and network variables. Similarly to the return process, we find a feedback effect between volatility and network variables: volatility is both Granger-caused by network variables and Granger-causes them. Table VIII reports Granger-non-causality test results for intertrade duration and network variables. We find that duration is Granger-caused by network variables (p-value = 0.0000), but does not Granger-cause network variables (p-value = 0.1811). Finally, Table IX presents p-values for the Granger-non-causality test for volume and network variables. The results show that volume is Granger-caused by network variables (p-value = 0.0000) but does not Granger-cause network variables (p-value = 0.3662). What are the possible reasons for the presence of feedback effects in the Granger causality test results for the rate of return and volatility (vis-a-vis the network variables) and the absence of such effects for volume and duration?

24 Throughout the analysis we use both the Akaike and Schwarz Information Criteria. 25 Standard test statistics are available from the authors upon request.
We believe that there is one fundamental reason for these empirical findings: our results for the price-based variables are polluted by noise. Unlike volume, duration, and all the network variables, which we can measure directly, the rate of return and volatility are estimated from transaction prices. As a result, the variables we call the rate of return and volatility are noisy proxies for the unobservable characteristics of the true price process. The level of noise at this very high frequency is so high that it is very hard to effectively measure the interaction between network variables and the true price process.
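The effect of microstructure noise on price-based measures can be illustrated numerically: adding i.i.d. noise to a latent random-walk price inflates realized variance computed from transaction-by-transaction returns. A minimal sketch, with all magnitudes chosen for illustration rather than calibrated to the E-mini data:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
# Latent efficient log-price: a random walk with small per-trade volatility
true_price = np.cumsum(rng.normal(0.0, 0.0001, n))
# Observed price = efficient price + i.i.d. microstructure noise
observed = true_price + rng.normal(0.0, 0.0005, n)

# Realized variance = sum of squared trade-by-trade returns
rv_true = np.sum(np.diff(true_price) ** 2)
rv_obs = np.sum(np.diff(observed) ** 2)
print(f"true RV: {rv_true:.5f}  observed RV: {rv_obs:.5f}")
```

Because the noise is uncorrelated across trades, its variance enters every squared return twice, so at high frequency the observed realized variance is dominated by noise rather than by the true price process.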
VI. Robustness
Our results are robust with respect to different markets, different observation periods, different levels of aggregation, and different sampling frequencies. The results we report are for the E-mini S&P 500 futures for the month of August 2008 (over 6 million transactions). The results remain qualitatively the same when we repeat all procedures for the same market for the month of May 2008 (5.15 million transactions) at the sampling frequency of 240 transactions. The main results also remain the same for the sampling frequency of 600 transactions; namely, both the correlation and Granger-causality results hold. The results are also the same whether we construct networks at the broker level or the trading account level. Finally, the results remain the same for other stock index futures markets, as confirmed by the analysis of the E-mini Nasdaq 100 (2.3 and 2.8 million transactions in May 2008 and August 2008, respectively) and E-mini Dow Jones futures contracts (1.8 and 2.4 million transactions in May 2008 and August 2008, respectively) at the sampling frequencies of 240 and 600 transactions.
match the remaining quantity against another, more recently placed order. Orders are set to expire after a fixed amount of time from when they are first created, at which point they are cancelled and withdrawn from the market. Using the resulting simulated transactions, we construct trading networks using a procedure identical to the one used for the empirically observed data. Namely, we simulate 6 million transactions, segment the data into periods of 240 consecutive transactions, and compute network and financial statistics for each period. Just as in actual trading, a single order may be reflected in multiple transactions in adjacent time windows. This setup allows for the possibility of heterogeneous beliefs about the price process, but imparts no intentionality or memory upon the traders. It allows us to discern which features of the trading networks are due to the arrival of information to the market, and which may be due to strategic behavior on the part of the traders. We find that a sequence of orders with randomly distributed prices and quantities results in network and financial variables that are very similar to those obtained from the futures market data, with the notable (and anticipated) exception of the dynamic structure. Specifically, we find that contemporaneous correlations among the network variables, as well as correlations between network and financial variables, are very similar to those we estimate from the actual market data.28 This confirms that our empirical results do not arise by chance. We also use the agent-based simulation model to investigate possible sources of the high correlation between network centralization and returns.
By observing the simulation, we find that the high correlation between centralization and returns reflects the network mechanics of the information arrival process: a trader submitting a large buy order at a high price will be matched against several existing sell orders, giving that trader a high indegree and increasing the centralization of the network. At the same time, because a greater number of sell orders was matched, the market price goes up, yielding a positive rate of return. Moreover, we find that for the simulated data (but not the market data), centralization and other network variables Granger-cause returns, but not vice versa.29 At the same time, we also find that, as expected in a model with no intentionality or memory, Granger-causality tests among network variables and volatility, volume, and duration yield very weak results: feedback effects, lack of significance, or very poor fit. This suggests that the Granger-causality results that we find in the futures markets data arise from the behavior of traders and are not a statistical artifact.
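The mechanics described above can be sketched with a toy order-matching routine: one large buy order sweeps several resting sell orders, which simultaneously raises the buyer's in-degree and moves the transaction price up. This is a minimal zero-intelligence sketch, not the paper's NetLogo model; the traders, prices, and quantities are illustrative assumptions.

```python
# Resting sell orders: (price, quantity, trader id)
book = [(100, 1, "S0"), (101, 1, "S1"), (102, 1, "S2")]
edges = []        # one (seller, buyer) edge per transaction
last_price = None

def submit_buy(buyer, price, qty):
    """Match a buy order against resting sells priced at or below `price`,
    cheapest first; each fill draws a seller -> buyer edge."""
    global last_price
    for order in sorted(book, key=lambda o: o[0]):
        if qty == 0 or order[0] > price:
            break
        p, q, seller = order
        fill = min(qty, q)
        qty -= fill
        last_price = p           # price rises as deeper sells are hit
        edges.append((seller, buyer))
        book.remove(order)
        if q > fill:
            book.append((p, q - fill, seller))
    return qty                   # unfilled remainder

# One large buy order sweeps all three resting sells.
submit_buy("B0", price=105, qty=3)

indegree = sum(1 for _, b in edges if b == "B0")
print(indegree, last_price)  # 3 102
```

The single aggressive buyer ends with in-degree 3 (a star-shaped pattern, raising centralization), and the last transaction price is the highest swept sell price, yielding a positive return.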
28 Results are available from the authors upon request.
29 We use the lag-length of order one in the VAR. As expected from the lack of dynamics in the simulated data, the Akaike information criterion selects a lag-length of order one in the VAR specification and the Schwarz information criterion selects a lag-length of order zero.
References
[1] Allen, Franklin, and Ana Babus, 2008, Networks in Finance, Working Paper 08-07, Wharton Financial Institutions Center, University of Pennsylvania.
[2] Andersen, T., Bollerslev, T., Diebold, F.X., and Labys, P., 2000, Great Realizations, Risk 13, 105-108.
[3] Bandi, F.M., and Russell, J.R., 2006, Separating microstructure noise from volatility, Journal of Financial Economics 79, 655-692.
[4] Barndorff-Nielsen, O.E., Hansen, P.A., Lunde, A., and Shephard, N., 2008, Realised kernels in practice: trades and quotes, manuscript.
[5] Beckers, S., 1983, Variance of security price returns based on high, low and closing prices, Journal of Business 56, 97-112.
[6] Braha, Dan, and Bar-Yam, Y., 2006, From Centrality to Temporary Fame: Dynamic Centrality in Complex Networks, Complexity 12(2), 59-63.
[7] Brunetti, Celso, and Lildholdt, P.M., 2006, Relative efficiency of return- and range-based volatility estimators, manuscript.
[8] Clark, P., 1973, A subordinated stochastic process model with finite variance for speculative prices, Econometrica 41, 135-155.
[9] Christensen, K., and Podolskij, M., 2005, Asymptotic theory of range-based estimation of integrated variance of a continuous semi-martingale, manuscript.
[10] Engle, Robert, 2000, The econometrics of ultra-high-frequency data, Econometrica 68, 1-22.
[11] Engle, R., and Gallo, G., 2006, A multiple indicators model for volatility using intra-daily data, Journal of Econometrics 131, 3-27.
[12] Engle, R., and Russell, J., 1998, Autoregressive conditional duration: A new model for irregularly spaced transaction data, Econometrica 66, 1127-1162.
[13] Epps, T., and Epps, M., 1976, The stochastic dependence of security price changes and transaction volumes: Implications for the mixture-of-distribution hypothesis, Econometrica 44, 305-321.
[14] Fagiolo, G., 2007, Clustering in complex directed networks, Physical Review E 76(2), 026107.
[15] Garman, M., and Klass, M., 1980, On the estimation of security price volatilities from historical data, Journal of Business 53(1), 67-78.
[16] Hansen, P., and Lunde, A., 2006, Realized variance and market microstructure noise, Journal of Business and Economic Statistics 24, 127-218.
[17] Hasbrouck, Joel, 2003, Intraday Price Formation in U.S. Equity Index Markets, Journal of Finance 58(6), 2375-2400.
[18] Hong, Harrison, and Jeremy C. Stein, 1999, A unified theory of underreaction, momentum trading and overreaction in asset markets, Journal of Finance 54, 2143-2184.
[19] Kossinets, G., and Watts, D.J., 2006, Empirical Analysis of an Evolving Social Network, Science 311(5757), 88-90.
[20] Milo, R., Itzkovitz, S., Kashtan, N., Levitt, R., Shen-Orr, S., Ayzenshtat, I., Sheffer, M., and Alon, U., 2004, Superfamilies of Evolved and Designed Networks, Science 303, 1538-1542.
[21] Newman, M.E.J., 2002, Assortative mixing in networks, Physical Review Letters 89, 208701.
[22] Newman, M.E.J., 2003, The structure and function of complex networks, SIAM Review 45, 167-256.
[23] Oomen, R., 2005, Properties of bias-corrected realized variance under alternative sampling schemes, Journal of Financial Econometrics 3, 555-577.
[24] Parlour, Christine A., and Duane J. Seppi, 2008, Limit Order Markets: A Survey, in Boot, Arnoud W.A., and Anjan V. Thakor, eds., Handbook of Financial Intermediation and Banking, Elsevier B.V., Oxford, UK.
[25] Scheinkman, Jose A., and Wei Xiong, 2003, Overconfidence and Speculative Bubbles, Journal of Political Economy 111(6), 1183-1219.
[26] Tauchen, G., and Pitts, M., 1983, The price variability-volume relationship on speculative markets, Econometrica 51, 485-505.
[27] Wilensky, U., 1999, NetLogo, http://ccl.northwestern.edu/netlogo, Center for Connected Learning and Computer-Based Modeling, Northwestern University, Evanston, IL.
[28] Zhang, L., Mykland, P.A., and Ait-Sahalia, Y., 2005, A tale of two scales: Determining integrated volatility with noisy high-frequency data, Journal of the American Statistical Association 100, 1394-1411.
[Figure 2 panels: indegree, outdegree, betweenness, closeness]

Figure 2: Example networks with node X having greater centrality than node Y for the specified measure.
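For concreteness, the four centrality measures in Figure 2 can be computed directly with Python's networkx on a toy directed network. The graph below is an illustrative stand-in, not the figure's exact topology; an edge u -> v can be read as "u sold to v".

```python
import networkx as nx

# Toy trading network: three sellers hit X, X trades with Y, Y with D.
G = nx.DiGraph([("A", "X"), ("B", "X"), ("C", "X"), ("X", "Y"), ("Y", "D")])

print("indegree:   ", dict(G.in_degree()))
print("outdegree:  ", dict(G.out_degree()))
print("betweenness:", nx.betweenness_centrality(G))
print("closeness:  ", nx.closeness_centrality(G))
```

Here X has the highest indegree (three incoming edges) and also lies on more shortest paths than any other node, so it dominates the betweenness ranking as well.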
Figure 4: A network containing two connected components, ABCDE and FGH. The largest strongly connected component is BCDE.
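The component structure described in the Figure 4 caption can be reproduced with networkx. The exact edges below are an assumption consistent with the caption: a cycle through B, C, D, E (making BCDE strongly connected) and a one-way chain through F, G, H.

```python
import networkx as nx

# Two connected components, ABCDE and FGH; only B->C->D->E->B is a cycle.
G = nx.DiGraph([("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "B"),
                ("F", "G"), ("G", "H")])

largest = max(nx.strongly_connected_components(G), key=len)
print(sorted(largest))  # ['B', 'C', 'D', 'E']

# The paper's LSCC variable: proportion of nodes in the largest SCC.
frac = len(largest) / G.number_of_nodes()
print(f"fraction of nodes in LSCC: {frac:.2f}")  # 0.50
```

Only the cycle is strongly connected; A, F, G, and H are reachable in one direction but not the other, so each forms a singleton strongly connected component.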
Figure 6: A screenshot of the agent-based simulation using NetLogo (Wilensky 1999). As orders (denoted by squares) are randomly assigned to traders (denoted by human figures), an edge is drawn between them. When sell orders (black squares) are matched with buy orders (red squares), their quantities are reduced, and a directed edge is drawn between the traders.
Table I: Financial Variables: Summary Statistics

            Returns          Volatility      Volume          Duration
Mean        0.0002           0.0425          1236.6720       19.4941
Median      0.0000           0.0392          1153            14
Maximum     0.2165           0.2165          6645            176
Minimum     -0.1378          0.0190          459             0
Std. Dev.   0.0271           0.0140          407.1451        17.4485
Skewness    0.0485           0.8273          2.6259          2.0299
Kurtosis    2.9876           6.4663          16.9377         9.3735
ADF prob    0.0001           0.0000          0.0000          0.0000
AC Lag 1    -0.001 [0.895]   0.187 [0.000]   0.528 [0.000]   0.473 [0.000]
AC Lag 5    -0.006 [0.062]   0.167 [0.000]   0.376 [0.000]   0.289 [0.000]
AC Lag 10   -0.011 [0.139]   0.151 [0.000]   0.284 [0.000]   0.241 [0.000]

ADF prob refers to the p-value of the ADF test for the null of a unit root. AC Lag X [Q-test prob] refers to the p-value of the Portmanteau Q-test for no serial correlation at lags X = 1, 5, and 10.
Table III: Pairwise correlations between network variables

         CEN      AV DEG   SDDEG    INOUT    AI       CC       LSCC
CEN      1.0000
AV DEG   -0.0012  1.0000
SDDEG    -0.0015  0.0031   1.0000
INOUT    -0.0019  0.2367   0.5119   1.0000
AI       -0.0008  -0.1079  -0.2022  -0.6787  1.0000
CC       -0.0008  0.8074   -0.0248  0.2052   -0.1095  1.0000
LSCC     -0.0006  0.5042   0.4070   0.7226   -0.5019  0.4248   1.0000
Table IV: Correlations between financial and network variables

         Returns   Range     Volume    Duration
CEN      0.6774    -0.0076   0.0264    -0.0065
AV DEG   -0.0034   0.0415    0.0061    0.1000
SDDEG    0.0037    0.0747    0.2363    -0.1620
INOUT    -0.0061   0.0429    0.0853    0.0467
AI       0.0016    0.0635    0.0129    -0.0810
CC       -0.0032   0.0314    0.0320    0.0360
LSCC     -0.0076   0.0331    0.0884    0.0058
26
Table V: Network Variables: P-values for the Null Hypothesis of Granger Non-causality

Panel 1: 20 lags
         AI       CC       LSCC
         0.2143   0.4227   0.1731
         0.0004   0.0000   0.0000
         0.0000   0.0002   0.0219
         0.0000   0.0000   0.0000

Panel 2: 18 lags
         AI       CC       LSCC
         0.1601   0.0054   0.0000
         0.0643   0.0078   0.0000
         0.0000   0.0005   0.0001
         0.0000   0.0000   0.0000

Panel 3: 14 lags
         INOUT    AI       CC       LSCC     All
INOUT    --       0.1384   0.0794   0.0000   0.0000
AI       0.0000   --       0.0016   0.0029   0.0000
CC       0.0475   0.0000   --       0.0000   0.0000
LSCC     0.0000   0.0000   0.0185   --       0.0000
All      0.0000   0.0000   0.0000   0.0000   --

VAR estimated using GMM with HAC robust standard errors. Optimal lag-length (26) is selected using the Akaike Information Criterion.
Table VI: Returns and Network Variables: P-values for the Null Hypothesis of Granger Non-causality

         Returns  CEN      AI       CC       LSCC     All
Returns  --       0.0148   0.8984   0.3530   0.7630   0.0320
CEN      0.0000   --       0.4459   0.8080   0.2615   0.0000
AI       0.0306   0.1491   --       0.0006   0.0000   0.0000
CC       0.0235   0.0632   0.0000   --       0.0000   0.0000
LSCC     0.1056   0.0826   0.0003   0.0240   --       0.0002
All      0.0000   0.0132   0.0000   0.0001   0.0000   --

VAR estimated using GMM with HAC robust standard errors. Optimal lag-length (18) is selected using the Akaike Information Criterion.
Table VII: Volatility and Network Variables: P-values for the Null Hypothesis of Granger Non-causality

            Volatility  SDDEG    AI       CC       LSCC     All
Volatility  --          0.0005   0.2350   0.0000   0.0019   0.0000
SDDEG       0.0000      --       0.0263   0.0063   0.0000   0.0000
AI          0.0020      0.0000   --       0.0717   0.0093   0.0000
CC          0.0000      0.0000   0.0000   --       0.0000   0.0000
LSCC        0.0000      0.0000   0.0003   0.0116   --       0.0000
All         0.0000      0.0000   0.0000   0.0000   0.0000   --

VAR estimated using GMM with HAC robust standard errors. Optimal lag-length (18) is selected using the Akaike Information Criterion.
Table VIII: Period Duration and Network Variables: P-values for the Null Hypothesis of Granger Non-causality

          Duration  INOUT    AI       CC       LSCC     All
Duration  --        0.3328   0.0017   0.0000   0.0000   0.0000
INOUT     0.9526    --       0.0000   0.0000   0.1215   0.0000
AI        0.3345    0.0000   --       0.0020   0.0021   0.0000
CC        0.5520    0.0498   0.0000   --       0.0000   0.0000
LSCC      0.1211    0.0000   0.0000   0.0336   --       0.0000
All       0.1811    0.0000   0.0000   0.0000   0.0000   --

VAR estimated using GMM with HAC robust standard errors. Optimal lag-length (15) is selected using the Akaike Information Criterion.
Table IX: Volume and Network Variables: P-values for the Null Hypothesis of Granger Non-causality

         Volume   SDDEG    AI       CC       LSCC     All
Volume   --       0.0014   0.0012   0.0000   0.0063   0.0000
SDDEG    0.0669   --       0.1752   0.0053   0.0000   0.0000
AI       0.2008   0.0000   --       0.0911   0.0166   0.0000
CC       0.3970   0.0000   0.0000   --       0.0000   0.0000
LSCC     0.4034   0.0000   0.0014   0.0002   --       0.0000
All      0.3662   0.0000   0.0000   0.0000   0.0000   --

VAR estimated using GMM with HAC robust standard errors. Optimal lag-length (15) is selected using the Akaike Information Criterion.