
A time series is a series of data points indexed (or listed or graphed) in time order. Most
commonly, a time series is a sequence taken at successive equally spaced points in time. Thus it
is a sequence of discrete-time data. Examples of time series are heights of ocean tides, counts
of sunspots, and the daily closing value of the Dow Jones Industrial Average.

Time series are most often plotted as line charts. They are used in statistics, signal
processing, pattern recognition, econometrics, mathematical finance, weather forecasting,
earthquake prediction, electroencephalography, control engineering, astronomy,
communications engineering, and in almost any domain of applied science and engineering
that involves temporal measurements.

Time series analysis comprises methods for analyzing time series data in order to extract
meaningful statistics and other characteristics of the data. Time series forecasting is the use of
a model to predict future values based on previously observed values. Regression analysis is
often used to test theories that the current values of one or more independent time series affect
the current value of another time series, but this type of analysis is not called "time series
analysis". Time series analysis instead focuses on comparing values of a single time series, or
of multiple dependent time series, at different points in time.[1] Interrupted time series
analysis is the analysis of interventions on a single time series.

Time series data have a natural temporal ordering. This makes time series analysis distinct
from cross-sectional studies, in which there is no natural ordering of the observations (e.g.
explaining people's wages by reference to their respective education levels, where the
individuals' data could be entered in any order). Time series analysis is also distinct from spatial
data analysis where the observations typically relate to geographical locations (e.g. accounting
for house prices by the location as well as the intrinsic characteristics of the houses).
A stochastic model for a time series will generally reflect the fact that observations close
together in time will be more closely related than observations further apart. In addition, time
series models will often make use of the natural one-way ordering of time so that values for a
given period will be expressed as deriving in some way from past values, rather than from future
values (see time reversibility).
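
As a rough illustration of both points, the sketch below simulates a first-order autoregressive
series, in which each value is derived from the previous one plus random noise, and compares the
sample autocorrelation at a short and a long lag. The AR(1) form, the coefficient 0.8, and the
function names are assumptions chosen for the example, not something these notes prescribe.

```python
import random


def simulate_ar1(n, phi, seed=0):
    """Simulate X_t = phi * X_{t-1} + noise, so each value derives from the past."""
    rng = random.Random(seed)
    values = [0.0]
    for _ in range(n - 1):
        values.append(phi * values[-1] + rng.gauss(0.0, 1.0))
    return values


def autocorrelation(series, lag):
    """Sample autocorrelation of the series at the given positive lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    numer = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return numer / denom


series = simulate_ar1(5000, phi=0.8)  # phi = 0.8 is an illustrative assumption
acf_lag1 = autocorrelation(series, 1)
acf_lag10 = autocorrelation(series, 10)
print(f"lag 1: {acf_lag1:.2f}, lag 10: {acf_lag10:.2f}")
```

With a positive coefficient, the lag-1 autocorrelation should come out clearly larger than the
lag-10 autocorrelation, matching the idea that nearby observations are more closely related.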
Time series analysis can be applied to real-valued continuous data, discrete numeric data, or
discrete symbolic data.

A number of different notations are in use for time-series analysis. A common notation
specifying a time series X that is indexed by the natural numbers is written
$X = \{X_1, X_2, \dots\}$.
Another common notation is
$Y = \{Y_t : t \in T\}$,
where $T$ is the index set.
An index set is a set whose members label (or index) members of another set.[1][2] For instance,
if the elements of a set $A$ may be indexed or labeled by means of a set $J$, then $J$ is an index
set. The indexing consists of a surjective function from $J$ onto $A$, and the indexed collection
is typically called an (indexed) family, often written as $(A_j)_{j \in J}$.

Example

An enumeration of a set $S$ gives an index set $J \subset \mathbb{N}$, where $f : J \to S$ is the
particular enumeration of $S$.

Panel data
A time series is one type of panel data. Panel data is the general class, a multidimensional data
set, whereas a time series data set is a one-dimensional panel (as is a cross-sectional data set).
A data set may exhibit characteristics of both panel data and time series data. One way to tell is
to ask what makes one data record distinct from the other records. If the answer is the time data
field, then this is a time series data set candidate. If determining a unique record requires a
time data field and an additional identifier which is unrelated to time (student ID, stock symbol,
country code), then it is a panel data candidate. If the differentiation lies in the non-time
identifier alone, then the data set is a cross-sectional data set candidate.
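
The uniqueness check described above can be sketched as a small heuristic. The function name,
the record layout (a list of dictionaries), and the field names "date" and "symbol" are all
hypothetical; the check only suggests a candidate type and is not a definitive classification.

```python
def classify(records, time_field, id_field):
    """Guess the data set type from which field(s) make each record unique."""
    n = len(records)
    times = {r[time_field] for r in records}
    ids = {r[id_field] for r in records}
    pairs = {(r[time_field], r[id_field]) for r in records}
    if len(times) == n:
        return "time series candidate"      # the time field alone is unique
    if len(ids) == n:
        return "cross-sectional candidate"  # the non-time identifier alone is unique
    if len(pairs) == n:
        return "panel data candidate"       # time and identifier are both needed
    return "undetermined"


# Hypothetical records: two stock symbols observed on two dates.
panel = [
    {"date": "2024-01-01", "symbol": "A"},
    {"date": "2024-01-01", "symbol": "B"},
    {"date": "2024-01-02", "symbol": "A"},
    {"date": "2024-01-02", "symbol": "B"},
]
print(classify(panel, "date", "symbol"))
```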

Cross-sectional data
Cross-sectional data, or a cross section of a study population, in statistics and econometrics is a
type of data collected by observing many subjects (such as individuals, firms, countries, or
regions) at the same point in time, or without regard to differences in time. Analysis of
cross-sectional data usually consists of comparing the differences among the subjects.
For example, if we want to measure current obesity levels in a population, we could draw a
sample of 1,000 people randomly from that population (also known as a cross section of that
population), measure their weight and height, and calculate what percentage of that sample is
categorized as obese. This cross-sectional sample provides us with a snapshot of that population,
at that one point in time. Note that we do not know based on one cross-sectional sample if
obesity is increasing or decreasing; we can only describe the current proportion.
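
A minimal sketch of that calculation follows. The weights and heights are simulated purely for
illustration, and the cutoff of a body mass index of 30 or more for "obese" is an assumption
added here; the notes do not say how the category is defined.

```python
import random

OBESE_BMI = 30.0  # assumed cutoff for "obese"; not specified in the notes


def bmi(weight_kg, height_m):
    """Body mass index: weight in kilograms divided by height in meters squared."""
    return weight_kg / height_m ** 2


def percent_obese(sample):
    """Percentage of (weight_kg, height_m) pairs at or above the assumed cutoff."""
    obese = sum(1 for weight, height in sample if bmi(weight, height) >= OBESE_BMI)
    return 100.0 * obese / len(sample)


# Simulated cross section of 1,000 people (synthetic data, not real measurements).
rng = random.Random(42)
sample = [(rng.gauss(80.0, 15.0), rng.gauss(1.70, 0.10)) for _ in range(1000)]
share = percent_obese(sample)
print(f"{share:.1f}% of the sample is categorized as obese")
```

The result is a single snapshot figure; nothing in it indicates whether the proportion is rising
or falling.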
Cross-sectional data differs from time series data, in which the same small-scale or aggregate
entity is observed at various points in time. Another type of data, panel data (or longitudinal
data), combines both cross-sectional and time series data ideas and looks at how the subjects
(firms, individuals, etc.) change over time. Panel data differs from pooled cross-section data
across time: panel data follows the same subjects at different times, whereas pooled
cross-section data observes different subjects in different time periods. Panel analysis uses panel
data to examine changes in variables over time and differences in variables between the subjects.
In a rolling cross-section, both the presence of an individual in the sample and the time at which
the individual is included in the sample are determined randomly. For example, a political poll
may decide to interview 1,000 individuals. It first selects these individuals randomly from the
entire population. It then assigns a random date to each individual. This is the date on which
the individual will be interviewed, and thus included in the survey.[1]
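
The two random steps described above can be sketched as follows. The function name, the
population of integer IDs, and the 30-day fieldwork window are hypothetical choices for the
example.

```python
import random
from datetime import date, timedelta


def rolling_cross_section(population, sample_size, start, days, seed=0):
    """Pick individuals at random, then give each a random interview date."""
    rng = random.Random(seed)
    sampled = rng.sample(population, sample_size)  # step 1: who is interviewed
    return {
        person: start + timedelta(days=rng.randrange(days))  # step 2: when
        for person in sampled
    }


# Hypothetical poll: 1,000 of 50,000 people, interviewed over a 30-day window.
schedule = rolling_cross_section(range(50_000), 1000, date(2024, 1, 1), days=30)
print(len(schedule), min(schedule.values()), max(schedule.values()))
```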
Cross-sectional data can be used in cross-sectional regression, which is regression analysis of
cross-sectional data. For example, the consumption expenditures of various individuals in a fixed
month could be regressed on their incomes, accumulated wealth levels, and their
various demographic features to find out how differences in those features lead to differences in
consumer behavior.
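
A minimal sketch of such a regression is shown below, simplified to a single regressor (income)
and fitted by ordinary least squares. The income and consumption figures are made up so that
the true relationship is known; a real analysis would also include wealth and demographic
variables, as described above.

```python
def ols_fit(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up incomes and consumption for four individuals in the same month.
incomes = [2000.0, 3000.0, 4000.0, 5000.0]
consumption = [500.0 + 0.6 * income for income in incomes]
intercept, slope = ols_fit(incomes, consumption)
print(f"consumption = {intercept:.1f} + {slope:.2f} * income")
```

The slope estimates how much consumption differs, on average, between individuals whose incomes
differ by one unit.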

Statistical stationarity: A stationary time series is one whose statistical properties such as mean,
variance, autocorrelation, etc. are all constant over time. Most statistical forecasting methods are
based on the assumption that the time series can be rendered approximately stationary (i.e.,
"stationarized") through the use of mathematical transformations. A stationarized series is
relatively easy to predict: you simply predict that its statistical properties will be the same in the
future as they have been in the past! (Recall our famous forecasting quotes.) The predictions
for the stationarized series can then be "untransformed," by reversing whatever mathematical
transformations were previously used, to obtain predictions for the original series. (The details
are normally taken care of by your software.) Thus, finding the sequence of transformations
needed to stationarize a time series often provides important clues in the search for an
appropriate forecasting model. Stationarizing a time series through differencing (where needed)
is an important part of the process of fitting an ARIMA model, as discussed in the ARIMA
pages of these notes.
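
The transform, predict, and untransform cycle can be sketched with first differencing. The
forecasting rule used here (future changes equal the average past change) is a deliberately
naive assumption for illustration, not a fitted ARIMA model, and the function names are
hypothetical.

```python
def difference(series):
    """Period-to-period changes: series[t] - series[t - 1]."""
    return [b - a for a, b in zip(series, series[1:])]


def undifference(start_value, diffs):
    """Reverse differencing by cumulatively adding changes to a starting value."""
    values = [start_value]
    for change in diffs:
        values.append(values[-1] + change)
    return values


series = [10.0, 12.0, 15.0, 17.0, 20.0]
diffs = difference(series)

# Naive forecast on the differenced scale: each future change equals the mean past change.
mean_change = sum(diffs) / len(diffs)
future_changes = [mean_change] * 3

# Untransform: add the predicted changes to the last observed value.
forecast = undifference(series[-1], future_changes)[1:]
print(forecast)
```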
Another reason for trying to stationarize a time series is to be able to obtain meaningful sample
statistics such as means, variances, and correlations with other variables. Such statistics are
useful as descriptors of future behavior only if the series is stationary. For example, if the series
is consistently increasing over time, the sample mean and variance will grow with the size of the
sample, and they will always underestimate the mean and variance in future periods. And if the
mean and variance of a series are not well-defined, then neither are its correlations with other
variables. For this reason you should be cautious about trying to extrapolate regression models
fitted to nonstationary data.
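
The effect on sample statistics can be seen with a simple deterministic trend. The series
1, 2, ..., 200 is an assumed toy example; it only demonstrates that the sample mean and variance
depend on how much of the series has been observed.

```python
from statistics import mean, pvariance

# A steadily increasing series: 1, 2, ..., 200.
trending = [float(t) for t in range(1, 201)]

# The sample mean and variance keep growing as more observations are included.
for n in (10, 50, 100):
    window = trending[:n]
    print(n, mean(window), pvariance(window))

# The mean of the first half understates the mean of the second half.
past_mean = mean(trending[:100])
future_mean = mean(trending[100:])
print(past_mean, future_mean)
```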
Most business and economic time series are far from stationary when expressed in their original
units of measurement, and even after deflation or seasonal adjustment they will typically still
exhibit trends, cycles, random-walking, and other non-stationary behavior. If the series has a
stable long-run trend and tends to revert to the trend line following a disturbance, it may be
possible to stationarize it by de-trending (e.g., by fitting a trend line and subtracting it out prior to
fitting a model, or else by including the time index as an independent variable in a regression or
ARIMA model), perhaps in conjunction with logging or deflating. Such a series is said to
be trend-stationary. However, sometimes even de-trending is not sufficient to make the series
stationary, in which case it may be necessary to transform it into a series of period-to-period
and/or season-to-season differences. If the mean, variance, and autocorrelations of the original
series are not constant in time, even after detrending, perhaps the statistics of the changes in the
series between periods or between seasons will be constant. Such a series is said to
be difference-stationary. (Sometimes it can be hard to tell the difference between a series that is
trend-stationary and one that is difference-stationary, and a so-called unit root test may be used
to get a more definitive answer. We will return to this topic later in the course.)
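
The two transformations can be sketched side by side. The helper names and the noise-free toy
series are assumptions for illustration; the sketch does not perform a unit root test and does
not decide which transformation a given real series needs.

```python
def detrend(series):
    """Subtract a least-squares trend line fitted against the time index."""
    n = len(series)
    times = range(n)
    mean_t = sum(times) / n
    mean_y = sum(series) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, series))
    sxx = sum((t - mean_t) ** 2 for t in times)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return [y - (intercept + slope * t) for t, y in zip(times, series)]


def lagged_difference(series, lag=1):
    """Changes between periods (lag=1) or between seasons (lag=season length)."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Toy series with a stable linear trend: de-trending leaves residuals near zero.
trend_series = [5.0 + 2.0 * t for t in range(24)]
residuals = detrend(trend_series)

# Toy monthly series with a seasonal pattern and a yearly step of 3.0:
# season-to-season (lag 12) differences are constant.
monthly = [100.0 + 10.0 * (t % 12) + 3.0 * (t // 12) for t in range(36)]
seasonal_changes = lagged_difference(monthly, lag=12)
print(max(abs(r) for r in residuals), set(seasonal_changes))
```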