Contents

Chapter 4: Basics of Statistical Modeling ..... 3
4.1 Introduction ..... 3
4.2 Statistical Experiments ..... 4
4.3 Random Variables ..... 8
4.3.1 Basic Concepts ..... 8
4.3.2 Properties of the CDF and the PDF of a Random Variable ..... 13
4.3.3 The Conditional Distribution ..... 15
4.3.4 Statistical Expectation ..... 17
4.3.5 Functions of a Random Variable ..... 22
4.4 Two Random Variables ..... 25
4.4.1 Statistical Expectation in Two Dimensions ..... 29
4.4.2 Functions of Two Random Variables ..... 31
4.4.3 Two Functions of Two Random Variables ..... 33
4.5 Simulation of Random Variables ..... 34
4.6 Summary ..... 35
Farag: Chapter 4: Basics of Statistical Modeling. Preprint 2011.
Fig. 4.1: Illustration of the random variable as a mapping function from the sample space of the sigma algebra into Borel sets on the real line. ..... 8
Fig. 4.2: Illustration of the domain and range of a discrete-valued random variable. ..... 9
Fig. 4.3: The CDF of a mixed random variable. ..... 12
Fig. 4.4: Illustration of the measurability of functions of random variables. ..... 23
Fig. 4.5: Illustration of the concept of two-dimensional random vectors for two random variables on the same sample space. ..... 26
Fig. 4.6: Illustration of the region of integration for the marginal density. ..... 28
Chapter 4
Basics of Statistical Modeling
4.1 Introduction
Statistical experiments are conducted to infer information about various processes, in order to extract outcomes that guide decision making. The effectiveness of a certain drug on a particular disease, surgical procedures, therapy techniques, and demographic effects on diseases are a few examples of statistical inference based on a statistical experiment. A statistical experiment may be described in terms of a population, a phenomenon to be investigated, and a scaling procedure to quantify the spread of the phenomenon in a population. Traditionally, a statistical experiment E is described in terms of a triple: a sample space Ω enlisting the elementary outcomes, or alphabet, of the experiment; a collection of possible outcomes (events) ℱ enlisting the dictionary of possible events; and a scale (measure) P that weighs the occurrences of events in the dictionary. That is, E = {Ω, ℱ, P} [Papoulis]. The form and level of complexity of each of these components depend on the specifics of the experiment. The power of statistical theory is in its ability to describe experiments of varying levels of complexity. This chapter discusses the basic elements that describe statistical experiments. In particular, we shall focus on the concepts of random variables and random processes. In the following chapter we discuss their generalization to random fields.
Navigating through a literature as wide and extensive as statistics is not an easy endeavor; indeed, it is quite a risky task given the space limitations as well as the purpose of the book. Therefore, we hasten to say upfront that there is no unique approach for selecting what subjects to cover or what an appropriate level of rigor is. However, our experience dictates that understanding the foundation of statistical theory is an essential ingredient for many approaches to image analysis. This chapter is a tour de force into some of the common terminologies and basics of statistical modeling that are essential to appreciate the material covered in the book. The statistical literature is accessible, and readers can learn more beyond what is covered here.
4.2 Statistical Experiments
We follow the tradition of describing a statistical experiment as a triple; i.e., E = {Ω, ℱ, P}.
- Sample space Ω:
The sample space enumerates the elementary outcomes of an experiment and, by construct, various combinations of them as well. For example, if ωᵢ, i = 1, …, n, for some n, are the elementary outcomes of E (i.e., its single-valued possibilities, or the entities of its alphabet), it is a natural requirement that combinations of these outcomes that may also happen in E should be included. However, events may be constructed arbitrarily and some may never happen! Hence, it is the tradition to list only the elementary outcomes (elements of the alphabet) in Ω, and the events (all possible combinations of the elementary outcomes, or the dictionary of the events in the experiment) are given a special entity, denoted by a field or algebra ℱ.
- Field (algebra) ℱ:
As the events need to be scaled or measured, an enumeration procedure is needed to generate the dictionary of events, including those that may never happen; these will be given zero weight or measure. Therefore, a desirable characteristic of ℱ is that it be closed under all countable (possibly infinite in number) combinations of elementary events, such as unions. Specifically, we define ℱ by the following requirements for events A, A₁, A₂, …:

a) Ω ∈ ℱ.
b) A ∈ ℱ implies that Aᶜ ∈ ℱ for all A ∈ ℱ.
c) If Aᵢ ∈ ℱ, i = 1, 2, …, then ⋃ᵢ Aᵢ ∈ ℱ.

The above requirements are all natural for the field ℱ to be an inclusive space that spans (includes) all measurable events.
- Probability measure P:
The probability measure P is a function that assigns weights to the events in ℱ. It is the tradition to have a positive scale normalized to the unit interval. It is natural that events may be related (dependent) or unrelated (independent). Events also may be joint (share some elementary outcomes) or disjoint (share no elementary outcomes) in their construction. Therefore, a measure P satisfies the following requirements (axioms) [Ash]:

a) P is defined on ℱ, and 0 ≤ P(A) ≤ 1 for every A ∈ ℱ.
b) The empty set (or null event) has zero weight and the sample space (sure event) has a weight of 1; that is, P(∅) = 0 and P(Ω) = 1.
c) If Aᵢ ∈ ℱ, i = 1, 2, …, then P(⋃ᵢ Aᵢ) = Σᵢ P(Aᵢ) if the events A₁, A₂, … are mutually disjoint.

As a consequence of axioms b and c, if A ∈ ℱ, then P(Aᶜ) = 1 − P(A).
Example 1: Consider a histology experiment conducted on cells from a tissue specimen taken during a procedure to determine the existence of cancer. In this experiment the tissue is treated and viewed under a microscope, with the viewed cells classified as cancerous (C) or noncancerous (NC). As biopsies are performed after reasonable doubts to justify doing them, the weights of the C and NC events differ with the procedure. A possible statistical experiment may be described as follows:

Ω = {C, NC}, ℱ = {∅, {C}, {NC}, Ω}, and P({C}) = p, P({NC}) = 1 − p, where p depends on the procedure.
Example 2: Consider the same experiment as in Example 1, in which viewing the cells under the microscope revealed a conclusion contradicting the agent used to prepare the tissue. The histologist's chart showed unspecified cells (US) under these circumstances. That is, the sample space became Ω = {C, NC, US}. As these events are disjoint, the sigma algebra would be simply the collection of five possibilities, ℱ = {∅, {C}, {NC}, {US}, Ω}, and the probability measure will be such that P({C}) = p₁, P({NC}) = p₂, and P({US}) = 1 − p₁ − p₂.
Example 3: Consider the experiment of recording the life span of cells in a specimen under the effect of an agent. Suppose that the viewing domain of the microscope allows individual cell viewing and tracking, and that it is possible to tag each cell as it moves within the specimen until its death. It is expected that a counting experiment will be a tedious one, especially with large degrees of motion of the cells; yet a histogram may be constructed indicating the number of cells that die with time.
Events containing common elementary outcomes are called joint, while those with no common elementary outcomes are called disjoint. Probability measures of disjoint events are additive. Events with no bearing on each other are called independent. Probability measures of independent events are multiplicative. As cause and effect is a natural rule of physics, it is also logical to ask about the likelihood of related events with respect to each other. The conditional probability measure addresses this issue, providing the probability of an event, given that a second event took place, as their joint probability normalized by the probability of the second event. We formally define these concepts as follows.
Definition 1: Two events A and B are disjoint (mutually exclusive) if they share no elementary outcomes; i.e., A ∩ B = ∅, and hence P(A ∩ B) = 0.

Definition 2: Two events A and B are independent if P(A ∩ B) = P(A)P(B).

Definition 3: The conditional probability measure of event A given event B, where A, B ∈ ℱ, is defined by:

P(A|B) = P(A ∩ B)/P(B), provided that P(B) ≠ 0.
The expression for the conditional probability in Definition 3, rewritten as P(A|B) = P(B|A)P(A)/P(B), is usually called Bayes' theorem. As a direct corollary of the above definitions, independent events do not affect the value of the conditional probability measure; i.e., P(A|B) = P(A). Likewise, disjoint events have a zero-valued joint probability measure and, subsequently, a zero-valued conditional probability measure.
Example 4: Consider events A and B such that B is a subset of A; symbolically, B ⊂ A. Then A ∩ B = B, and it follows that the conditional probability P(A|B) = P(A ∩ B)/P(B) = P(B)/P(B) = 1.
A stimulus may affect a set of events even though they may be disjoint. In this case, conditional probabilities enable measuring the overall effect of that stimulus. For example, suppose that an experiment produces three disjoint events A₁, A₂, and A₃ such that A₁ ∪ A₂ ∪ A₃ = Ω. We can measure the effect of a stimulus B (i.e., another event) on the set of events by noting that B = (B ∩ A₁) ∪ (B ∩ A₂) ∪ (B ∩ A₃); i.e.,

P(B) = P(B|A₁)P(A₁) + P(B|A₂)P(A₂) + P(B|A₃)P(A₃).

The proof of the total probability theorem follows from the fact that the probability measure of disjoint events is additive [Ash].
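The total probability computation above can be sketched numerically; the probabilities below are hypothetical and serve only to illustrate the theorem (and Bayes' theorem, which uses the same ingredients):

```python
# Hypothetical numbers (not from the text): three disjoint events A1, A2, A3
# partition the sample space, and B is a "stimulus" event with known
# conditional probabilities P(B|Ai).
p_A = [0.5, 0.3, 0.2]          # P(A1), P(A2), P(A3); must sum to 1
p_B_given_A = [0.1, 0.4, 0.8]  # P(B|A1), P(B|A2), P(B|A3)

# Total probability theorem: P(B) = sum_i P(B|Ai) * P(Ai)
p_B = sum(pb * pa for pb, pa in zip(p_B_given_A, p_A))

# Bayes' theorem recovers the posteriors P(Ai|B) from the same ingredients
p_A_given_B = [pb * pa / p_B for pb, pa in zip(p_B_given_A, p_A)]

print(p_B)               # 0.1*0.5 + 0.4*0.3 + 0.8*0.2 = 0.33
print(sum(p_A_given_B))  # the posteriors sum to 1
```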
In general, enlisting the events in Ω and ℱ will not be a simple task, if not an impossible one, in real-world problems. Likewise, obtaining the corresponding functional form for the probability measure will be a rough task to accomplish. A logical task that statisticians have faced has been to search for automatic procedures to generate events and their measures for particular experiments. Indeed, a general approach has evolved to accomplish this task through the concept of random functions, commonly referred to as random variables. This type of function maps events on the sample space and the sigma algebra, which may be quite abstract, into another representation on the Cartesian domain, on which assigning a scale (measure or weight) is possible using the common tools of calculus. We examine this concept next as a prelude to describing multivariable experiments and complicated stochastic phenomena.
4.3 Random Variables
4.3.1 Basic Concepts
A random variable X is a mapping from the sample space into the real line such that the event {X ≤ x} ∈ ℱ for every x. As illustrated in Fig. 4.1, the random variable maps events in ℱ into other forms of events (sets) defined on the real line; these sets are called Borel sets. For compact representation, we insist that the inverse mappings of the Borel sets are events on ℱ; i.e., the set X⁻¹(B) ∈ ℱ for every Borel set B. For example, consider a sample space consisting of eight elementary outcomes, and a function X defined on those outcomes. We may illustrate that X is a random variable by verifying that every set {X ≤ x} corresponds to an event in ℱ.
The above illustrates the value of the random-variable concept as an event generator. Equally important is to see whether the random variable also enables measurability of the generated events. In other words, can we evaluate P({X ≤ x})? The ability to evaluate this quantity achieves the measurability requirements for the events in ℱ, and thus completely describes the statistical experiment. Indeed, the quantity P({X ≤ x}) has a special name in mathematical statistics: it is known as the cumulative probability distribution function (CDF). From its definition, the event {X ≤ x} shrinks as x decreases and expands as x increases; therefore, the CDF is non-decreasing. Indeed it is also monotone; hence, continuous from the right; that is, F_X(x⁺) = F_X(x). As the CDF is a fundamental concept in the theory of statistics, we shall examine it in further detail. For ease of notation, we shall connote the CDF as a function of x and give it the special symbol F_X(x). To make the notation consistent, we shall define the CDF formally as follows:

Definition 3: Given a random variable X defined for a statistical experiment E = {Ω, ℱ, P}, the cumulative probability distribution function (CDF) of X is a function on the real line which measures the event {X ≤ x}. We connote this function as F_X(x) = P({X ≤ x}).

A random variable X is continuous when (i) its CDF F_X(x) is continuous everywhere, and (ii) F_X(x) is differentiable over the real line.
Definition 4: The probability density function (PDF) of a random variable X is the derivative of its CDF:

f_X(x) = dF_X(x)/dx.

We shall see later on that both F_X(x) and f_X(x) provide complete descriptions of the random variable. Point events for continuous random variables are empty sets; i.e., their measures will be zero: P({X = x}) = 0. The implication of these characteristics, and of the differentiability of F_X(x), is that

F_X(x) = ∫₋∞ˣ f_X(u) du;  (1)

hence, P({x₁ < X ≤ x₂}) = F_X(x₂) − F_X(x₁) = ∫ₓ₁ˣ² f_X(u) du.
In practice, however, a clear distinction between discrete and continuous random variables will be implicit in the definition of the experiment and the values that events may take. Indeed, in principle, we may define a random variable to describe events that have both continuous and discrete domains. We provide below some illustrative examples of the CDF and the PDF for various random variables.
Example 7: Consider a random variable with the CDF shown in Fig. 4.3. We note that the CDF in this example is not differentiable everywhere. The problem is at the point where the CDF has a jump (discontinuity); there the derivative does not exist in the ordinary sense, and the density carries an impulse whose weight equals the size of the jump.

Fig. 4.3: The CDF of a mixed random variable.

Hence, functional analysis enables us to address the various scenarios that the CDF may have.
4.3.2 Properties of the CDF and the PDF of a random variable
As the CDF is a probability measure for the event {X ≤ x}, the positivity and upper bound of the CDF are evident. Likewise, as the name implies, the CDF is non-decreasing and monotone; hence, the continuity from the right is evident. We can also infer the properties of the CDF for disjoint events as well as independent events using the basic definitions. We also note that the following identities follow straight from the definition of the CDF:

i) F_X(−∞) = 0 and F_X(∞) = 1;
ii) P({X > x}) = 1 − F_X(x);
iii) P({x₁ < X ≤ x₂}) = F_X(x₂) − F_X(x₁).

The PDF, being a derivative of the CDF, possesses the positivity and summability (integrability) constraints: f_X(x) ≥ 0 and ∫₋∞^∞ f_X(x) dx = 1. Writing the PDF for a discrete random variable requires impulses at the mass points:

f_X(x) = Σᵢ P({X = xᵢ}) δ(x − xᵢ).

The most important virtue of the CDF is its ability to provide a computational mechanism (or algorithm) to evaluate the probability measure, using the probability mass of the Borel sets corresponding to the events on the sigma algebra, through summation (discrete random variables) or integration (continuous random variables). Some examples will illustrate these concepts.
Example 8: Consider a statistical experiment defined within the boundaries illustrated in Fig. 4.1. Suppose the mass function of the Borel sets has an exponential form: f_X(x) = λe^(−λx) for x ≥ 0. We can evaluate the probability measure of an event by the mass of its equivalent Borel set. In particular, we can evaluate the probability of interval events as:

P({x₁ < X ≤ x₂}) = ∫ₓ₁ˣ² λe^(−λu) du = e^(−λx₁) − e^(−λx₂).
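The interval probability above can be sketched numerically with an assumed rate λ = 1, evaluating it both from the closed-form CDF and by direct integration of the density:

```python
import math

# Interval probability for an exponential density f(x) = lam*exp(-lam*x),
# x >= 0, with an assumed rate lam = 1: once via the closed-form CDF,
# once via a crude midpoint-rule integration of the density.
lam = 1.0

def cdf(x):
    return 1.0 - math.exp(-lam * x) if x > 0 else 0.0

def prob_interval(x1, x2):
    return cdf(x2) - cdf(x1)  # F(x2) - F(x1)

def prob_numeric(x1, x2, n=100000):
    dx = (x2 - x1) / n
    return sum(lam * math.exp(-lam * (x1 + (i + 0.5) * dx)) * dx
               for i in range(n))

print(prob_interval(1.0, 2.0))  # e^-1 - e^-2 ≈ 0.2325
print(prob_numeric(1.0, 2.0))   # same value, numerically
```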
Over the long history of statistics, various functional forms for the CDF and the PDF have been adopted; Table 1 lists some of the most common ones.

Table 1: Common probability density (mass) functions.

Distribution | PDF (or PMF) | Closed-form CDF
Exponential | f(x) = λe^(−λx), x ≥ 0 | F(x) = 1 − e^(−λx)
Gaussian | f(x) = (1/√(2πσ²)) e^(−(x−μ)²/(2σ²)) | N/A
Cauchy | f(x) = (γ/π)/((x − x₀)² + γ²) | N/A
Laplace | f(x) = (1/(2b)) e^(−|x−μ|/b) | N/A
Rayleigh | f(x) = (x/σ²) e^(−x²/(2σ²)), x ≥ 0 | N/A
Maxwell | f(x) = √(2/π) (x²/a³) e^(−x²/(2a²)), x ≥ 0 | N/A
Gamma | f(x) = (β^α/Γ(α)) x^(α−1) e^(−βx), x ≥ 0 | N/A
Beta | f(x) = x^(α−1)(1 − x)^(β−1)/B(α, β), 0 ≤ x ≤ 1 | N/A
Binomial (discrete-valued) | P(X = k) = C(n, k) p^k (1 − p)^(n−k), k = 0, …, n; n: independent trials | N/A
Poisson (discrete-valued) | P(X = k) = e^(−λ) λ^k/k!, k = 0, 1, … | N/A
The parameters in the CDF forms in Table 1 represent the modes as well as resolution measures
of the functions and will be addressed later on.
It is important to remember the context of the definitions of the CDF and the PDF and the
brilliance of the statisticians, physicists and mathematicians who, over the years, developed these
tools that enable proper description and characterization of statistical experiments. The random
variable concept has been the catalyst for mapping the statistical experiment from a guessing
game into a functional representation that is governed by the rules of calculus.
4.3.3 The Conditional Distribution
The CDF and the PDF may be defined for conditional events. For example, we may study the behavior of an event A defined as A = {X ≤ x} with respect to another event B, which may or may not be defined on the same sample space Ω. We can consider the conditional CDF as a measure of the event {X ≤ x | B}, in the same way we defined the conditional probability before:

F_X(x|B) = P({X ≤ x} ∩ B)/P(B), P(B) ≠ 0.

We point out that {X ≤ x | B} means that the event {X ≤ x} is considered given that B occurred, and the event B may or may not be a subset of the same sample space Ω. We maintain that the conditional CDF possesses all the properties of the CDF as discussed before. In particular, it is straightforward to justify the following properties:

F_X(−∞|B) = 0, F_X(∞|B) = 1, and
P({x₁ < X ≤ x₂}|B) = F_X(x₂|B) − F_X(x₁|B) for x₂ > x₁.

Indeed, the total probability theorem that we studied before may be expressed as follows: consider a partition A₁, A₂, …, Aₙ of Ω; then

F_X(x) = Σᵢ F_X(x|Aᵢ) P(Aᵢ).

In the same fashion that we studied the probability density function (PDF) before, we can define the conditional PDF as well; that is, for continuous random variables we have:

f_X(x|B) = dF_X(x|B)/dx.
Example 9: Consider the two events A = {X ≤ x} and B = {X ≤ b}. We can show that:

F_X(x|B) = F_X(x)/F_X(b) for x < b, and F_X(x|B) = 1 for x ≥ b.

By taking derivatives, we obtain the conditional PDF:

f_X(x|B) = f_X(x)/F_X(b) for x < b, and 0 otherwise.

Example 10: Consider the events A = {X ≤ x} and B = {b₁ < X ≤ b₂}. We can show that:

F_X(x|B) = 0 for x ≤ b₁; F_X(x|B) = (F_X(x) − F_X(b₁))/(F_X(b₂) − F_X(b₁)) for b₁ < x ≤ b₂; and F_X(x|B) = 1 for x > b₂.

The conditional PDF will be:

f_X(x|B) = f_X(x)/(F_X(b₂) − F_X(b₁)) for b₁ < x ≤ b₂, and 0 otherwise.
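A minimal numerical sketch of the conditional CDF, assuming an exponential random variable with hypothetical parameters λ = 1 and conditioning event B = {X ≤ 2}:

```python
import math

# Conditional CDF F(x | B) for B = {X <= b}, sketched for an exponential
# random variable with assumed parameters lam = 1 and b = 2. For x < b,
# F(x|B) = F(x)/F(b); for x >= b, F(x|B) = 1.
lam, b = 1.0, 2.0

def cdf(x):
    return 1.0 - math.exp(-lam * x) if x > 0 else 0.0

def cdf_cond(x):
    return min(cdf(x), cdf(b)) / cdf(b)  # uses monotonicity of the CDF

print(cdf_cond(1.0))  # F(1)/F(2)
print(cdf_cond(5.0))  # 1.0: given B, the event {X <= 5} is certain
```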
The conditional distribution is an important mechanism for measuring the effect of a stimulus on a certain behavior, or the degree of dependency of events on each other. Later on we will generalize the concept of a random variable to describe experiments that create an indexed set of random variables, where the index is defined on the set of real numbers or on a graph; we will exploit the dependency of the random variables to perform clustering of classes into meaningful descriptions, and we will be able to study various control mechanisms of experiments that can only be studied over a long period of observation.
4.3.4 Statistical Expectation
The CDF and the PDF in Table 1 are described in terms of parameters that quantify the basic
behavior of these functions, such as their mean value and the dispersion (variance) around that
mean value. Over the years of statistical analysis, statisticians have developed metrics that can be
computed from the CDF or the PDF which provide important clues about the behavior of a
statistical experiment. In fact, some of these metrics may provide an approximation or upper
bounds of the probabilities that a certain event may take without actual calculation of the CDF
and the PDF; thus a prediction may be possible for the behavior of events based on numerical
calculations. Among these metrics is the statistical expectation which will be formally defined
below.
Definition 5: Given a random variable X with PDF f_X(x), the expected value of X is defined by

E[X] = ∫₋∞^∞ x f_X(x) dx.

The expectation as defined above for a single random variable is a numerical indication of how symmetric the PDF is around the origin. Due to the positivity constraints of the PDF, a zero expected value connotes symmetry, while a positive value indicates that the PDF is tilted towards the positive (right-hand) side of the origin, and a negative value indicates that the PDF is tilted to the left of the origin. Hence, in reality the quantity E[X] provides the mean value of the PDF, or its locus of symmetry on the x-axis. Therefore, the mean value, or average, of the random variable is μ_X = E[X]. From the strict mathematical sense, we can define the effect of a general function of the random variable on the PDF. For example, we may define the expected value of a function as follows:
Definition 6: Given a random variable X with PDF f_X(x), the expected value of a function g(X) is defined by

E[g(X)] = ∫₋∞^∞ g(x) f_X(x) dx.

In this case, the mean is the special case g(X) = X. The expectation of g(X) measures the average of the function. Indeed, we shall study the fact that a function of a random variable is also a measurable function, which defines a random variable as well. Hence, we may calculate E[g(X)] as above, using the PDF of the random variable X, or first calculate the PDF f_Y(y) of Y = g(X) and then calculate E[Y] as follows:
E[Y] = ∫₋∞^∞ y f_Y(y) dy.

The above result is very fascinating, even though we have not formally derived it, and indeed it has enormous practical implications in statistical analysis; this is due to the fact that calculating the PDF of g(X) is not a simple matter, even for simple transformations. Therefore, the PDF of a random variable X serves a great deal for functions of X as well. We focus on a few definitions which are special cases of g(X) and provide some specific clues on the PDF of X.
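The equality of the two ways of computing the expectation can be checked numerically; the sketch below assumes X uniform on (0, 1) and g(x) = x², for which the PDF of Y = g(X) is f_Y(y) = 1/(2√y) on (0, 1):

```python
import math

# E[g(X)] for X ~ Uniform(0,1) and g(x) = x^2, computed two ways:
# (a) integrating g(x) against f_X(x) = 1 on (0,1);
# (b) integrating y against the PDF of Y = X^2, f_Y(y) = 1/(2*sqrt(y)).
# Both midpoint-rule sums should approach 1/3.
n = 200000
dx = 1.0 / n

e_a = sum(((i + 0.5) * dx) ** 2 * dx for i in range(n))

e_b = 0.0
for i in range(n):
    y = (i + 0.5) * dx
    e_b += y * (1.0 / (2.0 * math.sqrt(y))) * dx

print(e_a, e_b)  # both close to 1/3
```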
Definition 7: Given a random variable X with PDF f_X(x), the variance of X is defined by

σ_X² = E[(X − μ_X)²] = ∫₋∞^∞ (x − μ_X)² f_X(x) dx.

The variance measures the dispersion (width or concentration) of the PDF around the mean value. It serves to estimate the domain of the values that the random variable X may take. In fact, an upper bound has been derived, through the Chebyshev inequality, which estimates the upper bound of the probability of a normalized event.
Fact: The Chebyshev inequality [Papoulis]. Consider a random variable X with PDF having mean value μ and variance σ². Then, for any ε > 0,

P({|X − μ| ≥ ε}) ≤ σ²/ε².

The proof of the Chebyshev inequality follows directly from the definitions and the positivity of the density function. Its importance is in providing a measure for the concentration of the values of the random variable around its mean value.
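The inequality is easy to check empirically; the sketch below uses a Uniform(0, 1) sample, for which μ = 0.5 and σ² = 1/12 (an illustration of the bound, not a proof):

```python
import random

# Empirical check of Chebyshev's bound P(|X - mu| >= eps) <= sigma^2/eps^2
# for a Uniform(0,1) sample, where mu = 0.5 and sigma^2 = 1/12.
random.seed(0)
mu, var, eps = 0.5, 1.0 / 12.0, 0.4
n = 100000

count = sum(1 for _ in range(n) if abs(random.random() - mu) >= eps)
freq = count / n

print(freq)            # the exact probability is 0.2
print(var / eps ** 2)  # Chebyshev bound ≈ 0.52, comfortably above
```

As is typical, the bound is loose: the true probability is 0.2, while Chebyshev guarantees only that it does not exceed about 0.52.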
Definition 8: Given a random variable X with PDF f_X(x), the n-th moment of X is defined by

mₙ = E[Xⁿ] = ∫₋∞^∞ xⁿ f_X(x) dx,

and the n-th central moment of X is defined by

μₙ = E[(X − μ_X)ⁿ].

Similarly, the second central moment is the variance: μ₂ = E[(X − μ_X)²] = E[X²] − 2μ_X E[X] + μ_X². On the other hand, we note that E[X] = μ_X; hence μ₂ = E[X²] − μ_X²; i.e., the variance equals the second moment minus the square of the mean.
Definition 9: Given a random variable X with PDF f_X(x), the moment generating function of X is defined as

Φ_X(s) = E[e^(sX)] = ∫₋∞^∞ e^(sx) f_X(x) dx.

The moments may be generated from Φ_X(s) as follows: E[Xⁿ] = dⁿΦ_X(s)/dsⁿ evaluated at s = 0.
Example 13: Evaluate the moment generating function for the standard uniform PDF defined over the interval [0, 1]:

Φ_X(s) = ∫₀¹ e^(sx) dx = (eˢ − 1)/s.

It is evident that the mean value, for example, may be obtained from the first derivative of Φ_X(s) by taking the limit as s approaches zero (note the need for l'Hôpital's rule); this gives E[X] = 1/2.
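The limit can be checked numerically; the sketch below approximates Φ′(0) with a central difference:

```python
import math

# MGF of the standard uniform: Phi(s) = (e^s - 1)/s, with Phi(0) = 1 taken
# as the limit. The mean is Phi'(0), approximated here by a central
# difference (expm1 avoids cancellation for small s).
def mgf(s):
    return math.expm1(s) / s if s != 0 else 1.0

h = 1e-5
mean = (mgf(h) - mgf(-h)) / (2.0 * h)  # numerical Phi'(0)
print(mean)  # ≈ 0.5, the mean of Uniform(0,1)
```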
One form of expectation that mimics the moment generating function, yet provides a greater functional ability to analyze, and indeed construct, the PDF of a random variable, is known as the characteristic function, defined as follows:

Definition 10: Given a random variable X with PDF f_X(x), the characteristic function of X is defined as

Φ_X(ω) = E[e^(jωX)] = ∫₋∞^∞ e^(jωx) f_X(x) dx.

This is the conjugate of the Fourier transform of f_X(x), which is a unique mapping of the function; thus a direct link is established between the PDF and the characteristic function, which is very useful in a number of applications.

Example 14: Evaluate the characteristic function for the standard uniform PDF defined over the interval [0, 1]:

Φ_X(ω) = ∫₀¹ e^(jωx) dx = (e^(jω) − 1)/(jω).
4.3.5 Functions of a Random Variable
It is natural to ask how a transformation Y = g(X) of a random variable (which is a function) would be considered, in the context of the statistical experiment in which the random variable was introduced, in order to characterize Y in terms of the theory of calculus instead of enumeration and guessing. The key issue is the measurability of functions of random variables; i.e., is the mapping actually constrained by the original domain (and, of course, the range) of the original random variable? A random variable X is a mapping from the sample space into the real line such that the event {X ≤ x} ∈ ℱ, and we also insisted that the inverse mapping of a Borel set is an event as well; i.e., X⁻¹(B) ∈ ℱ for every Borel set B. Now let us define a function Y = g(X). It is evident that the transformation will generate events on the real line that must be Borel sets. Likewise, the inverse mapping will generate events that must be in ℱ and Ω, by construction. Hence, the measurability of g(X) is quite evident and inherent. Fig. 4.4 illustrates this concept, where the mappings of g(X) onto the real line produce Borel sets.
Fig. 4.4: Illustration of the measurability of functions of random variables.
Theorem [Piper]: If X is a random variable and g is a Borel function, then Y = g(X) is a random variable measurable by ℱ; i.e., the sigma algebra generated by Y is a subset of the sigma algebra generated by X.
As we have done before with random variables, we characterize Y = g(X) by its cumulative probability distribution function or by its probability density function. Depending on the functional form of g, we may obtain the CDF F_Y(y) in terms of F_X(x), as summarized in Table 2.

Table 2: The CDF and PDF of Y = g(X) for various forms of g.

Form of g | CDF / PDF of Y
Monotone increasing | F_Y(y) = F_X(g⁻¹(y))
Monotone decreasing | F_Y(y) = 1 − F_X(g⁻¹(y))
Continuous with a countable number of real roots xᵢ of y = g(x) | f_Y(y) = Σᵢ f_X(xᵢ)/|g′(xᵢ)|; integrate f_Y to obtain F_Y

Derivations of the entries in Table 2 are straightforward and follow from the definition of the CDF, which should be obtained first; the derivative operator is then applied to obtain the PDF.
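The monotone-increasing entry of Table 2 can be checked by simulation; the sketch below takes X uniform on (0, 1) and g(x) = x², so that F_Y(y) = F_X(√y) = √y on (0, 1):

```python
import math
import random

# Check of the monotone-increasing entry: X ~ Uniform(0,1), Y = X^2, so
# F_Y(y) = F_X(sqrt(y)) = sqrt(y) on (0,1). Compare with an empirical CDF.
random.seed(1)
samples = [random.random() ** 2 for _ in range(100000)]

def empirical_cdf(y):
    return sum(1 for s in samples if s <= y) / len(samples)

for y in (0.1, 0.25, 0.5, 0.9):
    print(y, empirical_cdf(y), math.sqrt(y))  # empirical vs. sqrt(y)
```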
Functions of random variables are an important concept in mathematical statistics that enables statistical inference about one random variable (or, in general, a random process, as will be studied later) in terms of another. Indeed, realizations of random variables from different distributions may be obtained by simulating a random variable from a standard uniform distribution; i.e., X = F_X⁻¹(U), where U is standard uniform.

4.4 Two Random Variables

Given two random variables X and Y defined on the same sample space Ω, the events {X ≤ x} and {Y ≤ y} may be combined on the product space as the joint event {X ≤ x, Y ≤ y}, or in a vector; i.e., Z = (X, Y) with realizations z = (x, y).
Fig. 4.5: Illustration of the concept of two-dimensional random vectors for two random variables on the same sample space.

For measurability, the sets {X ≤ x}, {Y ≤ y}, and {X ≤ x, Y ≤ y} must be events in ℱ. By the assignment Z = (X, Y), we have three induced probability measures: P_X, represented by a unit mass distributed along the x-axis; similarly, P_Y represents a unit mass distributed along the y-axis; and P_XY, a unit mass distributed over the (x, y) plane. The measures P_X and P_Y are related to P_XY as follows:

P_X({X ≤ x}) = P_XY({X ≤ x, Y < ∞}) and P_Y({Y ≤ y}) = P_XY({X < ∞, Y ≤ y}).

Now if B₁ is a Borel set on the x-axis, then {X ∈ B₁} = {X ∈ B₁, Y < ∞}, so that P_X(B₁) = P_XY(B₁ × ℝ). Similarly, if B₂ is a Borel set on the y-axis, then {Y ∈ B₂} = {X < ∞, Y ∈ B₂}, so that P_Y(B₂) = P_XY(ℝ × B₂). P_XY is called the joint probability measure induced by the joint mapping, and P_X and P_Y are the measures induced by X and Y, respectively. The corresponding probability mass distributions are called the marginal probability distributions.
Definition 12: The joint probability distribution function, or the joint cumulative distribution function, for random variables X and Y, denoted by F_XY(x, y), is defined by

F_XY(x, y) = P({X ≤ x, Y ≤ y}).

The functions F_X(x) = F_XY(x, ∞) and F_Y(y) = F_XY(∞, y) are called the marginal distribution functions for random variables X and Y, respectively.
We follow the same development as for the single random variable and define the probability density function for two random variables.

Fact (Feller, Vol. 1): If the joint probability measure P_XY induced by (X, Y) is absolutely continuous, there exists a function f_XY(x, y) ≥ 0 such that

F_XY(x, y) = ∫₋∞ˣ ∫₋∞ʸ f_XY(u, v) dv du.

Conversely, given F_XY and f_XY that satisfy the previous properties, there exist two random variables X and Y defined on some sample space with joint probability distribution and density functions F_XY and f_XY, respectively.

We may enlist the main properties of F_XY and f_XY as follows:

i. F_XY(−∞, y) = F_XY(x, −∞) = 0, and F_XY(∞, ∞) = 1;
ii. f_XY(x, y) ≥ 0, and ∫₋∞^∞ ∫₋∞^∞ f_XY(x, y) dx dy = 1;
iii. f_XY(x, y) = ∂²F_XY(x, y)/∂x∂y;
iv. f_X(x) = ∫₋∞^∞ f_XY(x, y) dy and f_Y(y) = ∫₋∞^∞ f_XY(x, y) dx (the marginal densities).
Example 15: Consider two jointly continuous and positive random variables X and Y with a given joint density function f_XY(x, y). Find the marginal densities and the joint CDF.

Solution: The required quantities may be obtained by integrating f_XY over the region illustrated in Fig. 4.6:

f_X(x) = ∫₀^∞ f_XY(x, y) dy.

Fig. 4.6: Illustration of the region of integration for the marginal density.

Also,

f_Y(y) = ∫₀^∞ f_XY(x, y) dx.

The cumulative distribution function will be computed from the basic definition:

F_XY(x, y) = ∫₀ˣ ∫₀ʸ f_XY(u, v) dv du,

given that X and Y are positive, which leads to F_XY(x, y) = 0 for x < 0 or y < 0.
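The marginal-density integration can be sketched numerically for an assumed joint density f_XY(x, y) = e^(−x−y), x, y > 0 (a hypothetical stand-in, not the density of the example above):

```python
import math

# Marginal density by numerical integration, for an assumed joint density
# f(x, y) = exp(-x - y), x, y > 0 (independent unit exponentials). The
# marginal f_X(x) should come out as exp(-x).
def f_joint(x, y):
    return math.exp(-x - y) if x > 0 and y > 0 else 0.0

def f_marginal_x(x, y_max=50.0, n=20000):
    dy = y_max / n  # truncate the infinite integral at y_max
    return sum(f_joint(x, (j + 0.5) * dy) * dy for j in range(n))

print(f_marginal_x(1.0))  # ≈ exp(-1) ≈ 0.3679
```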
We may address the issue of dependence of random variables here due to its importance. We maintain, from Definition 2 and Example 9, that the events {X ≤ x} and {Y ≤ y} are independent if and only if

F_XY(x, y) = F_X(x) F_Y(y) and, equivalently, f_XY(x, y) = f_X(x) f_Y(y).

Note that, in the above example, independence may be checked by comparing the product of the marginal densities with the joint density.

4.4.1 Statistical Expectation in Two Dimensions

The expectation of a function g(X, Y) of random variables X and Y is defined as follows:

Definition 13: Given a pair of random variables X and Y with joint PDF f_XY(x, y) (in a manner similar to Definition 6), the expected value of g(X, Y) is defined by

E[g(X, Y)] = ∫₋∞^∞ ∫₋∞^∞ g(x, y) f_XY(x, y) dx dy

for continuous random variables, and

E[g(X, Y)] = Σᵢ Σⱼ g(xᵢ, yⱼ) p_XY(xᵢ, yⱼ),

where p_XY(xᵢ, yⱼ) is the probability mass function, for discrete random variables.
Definition 14: Given a pair of random variables X and Y, the covariance of X and Y is defined by

C_XY = E[(X − μ_X)(Y − μ_Y)],

where μ_X and μ_Y are the marginal means as defined in Definition 5. That is, for any random variables X and Y, we have:

C_XY = ∫₋∞^∞ ∫₋∞^∞ (x − μ_X)(y − μ_Y) f_XY(x, y) dx dy

for continuous random variables, where f_XY(x, y) is the joint PDF; for discrete random variables, the double integral is replaced by a double sum over the probability mass function p_XY(xᵢ, yⱼ).

A special case of Definition 14 that does not include the marginal means is called the correlation; i.e., R_XY = E[XY].

We note that all the properties of the expectation operator in higher dimensions are similar to those in one dimension.
In two dimensions (and in multiple dimensions in general), the random variables may possess various degrees of relatedness, which may span the entire scale from fully related to completely unrelated, or independent. We quantify the relatedness in terms of the correlation coefficient ρ; i.e.,

ρ_XY = C_XY/(σ_X σ_Y), where −1 ≤ ρ_XY ≤ 1.

If the covariance is zero, the two random variables are said to be uncorrelated. If the two random variables are independent, then their covariance is zero and, therefore, by definition, the two random variables are uncorrelated. We note that the converse is not true, except in the case of Gaussian random variables, which will be considered later.
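A sample-based sketch of the covariance and correlation coefficient, for a hypothetical dependent pair Y = X + noise:

```python
import math
import random

# Sample covariance and correlation coefficient for a hypothetical
# dependent pair: Y = X + N with X, N independent standard Gaussians,
# so C_XY = 1 and rho = 1/sqrt(2) ≈ 0.707.
random.seed(2)
xs = [random.gauss(0.0, 1.0) for _ in range(50000)]
ys = [x + random.gauss(0.0, 1.0) for x in xs]

def mean(v):
    return sum(v) / len(v)

mx, my = mean(xs), mean(ys)
cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
sx = math.sqrt(mean([(x - mx) ** 2 for x in xs]))
sy = math.sqrt(mean([(y - my) ** 2 for y in ys]))
rho = cov / (sx * sy)

print(cov, rho)  # cov ≈ 1, rho ≈ 0.707
```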
4.4.2 Functions of Two Random Variables

It is natural to study the mappings (transformations) of two random variables as we did with single random variables. Here we consider the transformation Z = g(X, Y), where g is a general function. We use the boldface letter g to represent the functional form, since its output may be a vector. A given realization (a certain value or instance) of Z will be represented by the small boldface letter z. The key issue is the measurability of g(X, Y), which may be understood in a similar fashion as in the one-dimensional case; see Fig. 4.4. We maintain that, under normal conditions, g(X, Y) is measurable and, therefore, we may define a complete functional description for it in terms of the probability distribution function F_Z(z). We consider a few cases below.
Example 16: Consider two jointly continuous random variables X and Y, and let Z = X + Y. We maintain that Z is a measurable random variable whose CDF may be calculated as follows:

F_Z(z) = P({X + Y ≤ z}) = ∫₋∞^∞ ∫₋∞^(z−x) f_XY(x, y) dy dx,

and the density function is f_Z(z) = dF_Z(z)/dz. The inner differentiation may be evaluated using the Leibnitz rule; i.e., by differentiating under the integral sign. Hence,

f_Z(z) = ∫₋∞^∞ f_XY(x, z − x) dx.

As a special case, if X and Y are independent, f_Z(z) will have the following form:

f_Z(z) = ∫₋∞^∞ f_X(x) f_Y(z − x) dx,

which is the convolution integral. For this case, the CDF would be F_Z(z) = ∫₋∞^z f_Z(u) du.
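The convolution integral can be evaluated numerically; the sketch below takes X and Y independent uniform on (0, 1), for which f_Z is the triangular density:

```python
# Density of Z = X + Y for independent X, Y ~ Uniform(0,1), obtained by
# numerically evaluating the convolution integral; the result is the
# triangular density f_Z(z) = z on [0,1] and 2 - z on [1,2].
def f_uniform(x):
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def f_sum(z, n=20000):
    dx = 1.0 / n  # integrate over the support of f_X, which is [0,1]
    return sum(f_uniform(z - (i + 0.5) * dx) * dx for i in range(n))

print(f_sum(0.5), f_sum(1.0), f_sum(1.5))  # ≈ 0.5, 1.0, 0.5
```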
Table 3 lists a few examples of the distributions of functions of two random variables; all can be derived from the basic definitions. These functions map two random variables into a single random variable; i.e., there is a reduction in dimensionality involved in the transformation. We will consider the case where the functions of random variables are vectors next.

Table 3: Some one-dimensional functions of two random variables and the corresponding CDF and PDF (X and Y independent).

Function | CDF F_Z(z) | PDF f_Z(z)
Z = X + Y | ∫ F_X(z − y) f_Y(y) dy | ∫ f_X(z − y) f_Y(y) dy
Z = max(X, Y) | F_X(z) F_Y(z) | f_X(z) F_Y(z) + F_X(z) f_Y(z)
Z = min(X, Y) | 1 − (1 − F_X(z))(1 − F_Y(z)) | f_X(z)(1 − F_Y(z)) + (1 − F_X(z)) f_Y(z)
4.4.3 Two Functions of Two Random Variables
As a step towards a generalized representation of functions of random variables, let us consider
vector representations of the random variables; i.e., $\mathbf{Z} = \mathbf{g}(\mathbf{X})$, where $\mathbf{X} = (X, Y)^T$, $\mathbf{g} = (g_1, g_2)^T$, and $Z_1 = g_1(X, Y)$, $Z_2 = g_2(X, Y)$. Finding the distribution of $\mathbf{Z}$ will be quite
straightforward in the case where $\mathbf{g}$ is a linear mapping of $\mathbf{X}$; i.e., $\mathbf{Z} = A\mathbf{X}$ with $A$ invertible, for which $f_{\mathbf{Z}}(\mathbf{z}) = f_{\mathbf{X}}(A^{-1}\mathbf{z})/|\det A|$. More generally, if the transformation is one-to-one and differentiable, the joint density of the new pair of random
variables is
$$f_{Z_1 Z_2}(z_1, z_2) = \frac{f_{XY}(x, y)}{|J(x, y)|},$$
evaluated at the root $(x, y)$ of $z_1 = g_1(x, y)$ and $z_2 = g_2(x, y)$, where $J$ is the Jacobian of the transformation. As an illustration, consider the polar transformation $X = R\cos\Theta$, $Y = R\sin\Theta$. We
may define the Jacobian of the polar transformation as follows:
$$J = \det \begin{bmatrix} \partial x / \partial r & \partial x / \partial \theta \\ \partial y / \partial r & \partial y / \partial \theta \end{bmatrix} = \det \begin{bmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{bmatrix} = r.$$
Hence,
$$f_{R\Theta}(r, \theta) = r\, f_{XY}(r\cos\theta, r\sin\theta).$$
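The polar-transformation result can be checked by simulation (a Monte Carlo sketch; the sample size, seed, and bin choices are our own): for $X, Y$ independent standard normal, $f_{R\Theta}(r, \theta) = (r/2\pi)e^{-r^2/2}$, and integrating over $\theta$ gives the Rayleigh density $r\,e^{-r^2/2}$ for the radius $R$, which the histogram of simulated radii should match.

```python
import numpy as np

# Monte Carlo check of the polar transformation: X, Y i.i.d. standard normal.
rng = np.random.default_rng(0)
x = rng.standard_normal(200_000)
y = rng.standard_normal(200_000)

r = np.hypot(x, y)  # radius R = sqrt(X^2 + Y^2)

# f_{R,Theta}(r, t) = r f_{XY}(r cos t, r sin t) = (r / 2 pi) exp(-r^2 / 2);
# integrating over t in [0, 2 pi) gives the Rayleigh density r exp(-r^2 / 2).
hist, edges = np.histogram(r, bins=50, range=(0.0, 4.0), density=True)
mids = 0.5 * (edges[:-1] + edges[1:])
rayleigh = mids * np.exp(-mids**2 / 2.0)

# The empirical histogram should track the predicted density closely.
print(float(np.max(np.abs(hist - rayleigh))))
```

The maximum deviation printed is dominated by sampling noise and shrinks as the sample size grows.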
4.5 Simulation of Random Variables
In this section we discuss how to simulate one- and two-dimensional random variables of
general distributions, starting from the standard uniform distribution. This will bring about the
linkage between a distribution or density function and deterministic functional forms, which
represent instances or realizations of the random variables. Generating such instances may be used
in a reverse fashion to estimate the density function using histograms. Simulation is an art and a
well-studied subject in statistical design; stochastic systems are studied based on simulations. We
only scratch the surface here, to give some feeling for the outputs of random experiments in
the form of realizations of the random variables that map the sample space into Borel sets on the
real line. The fundamental step in simulation is the simple observation that, since the
distribution function is monotone and non-decreasing, an inverse may be possible; hence
a mapping from the uniform distribution to any other form is possible. We illustrate this as follows.
Let $X$ and $Y$ be two random variables, not necessarily defined on the same sample space or having the
same form of distribution. Suppose the two random variables are related to each other through the
monotone transformation $Y = g(X)$. The distribution function of $Y$ with respect to the
distribution of $X$ is as follows:
$$F_Y(y) = P(Y \le y) = P(g(X) \le y) = P(X \le g^{-1}(y)) = F_X(g^{-1}(y)).$$
In particular, if $X$ is standard uniform, then $F_X(u) = u$ on $(0, 1)$, and choosing $g = F_Y^{-1}$ yields exactly the target distribution $F_Y$. Therefore, if we generate a realization $u$ from $X$, a realization of $Y$ is obtained by the inverse
mapping $y = F_Y^{-1}(u)$.
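This inverse-mapping recipe can be sketched in a few lines (a minimal illustration; the exponential target with rate $\lambda = 2$ and the use of Python's `random` module are our own choices, made because that CDF inverts in closed form):

```python
import math
import random

random.seed(42)
lam = 2.0  # rate of the assumed exponential target, F(x) = 1 - exp(-lam * x)

def exponential_via_inverse(u, rate):
    """Map a standard-uniform realization u to an exponential one:
    x = F^{-1}(u) = -ln(1 - u) / rate."""
    return -math.log(1.0 - u) / rate

samples = [exponential_via_inverse(random.random(), lam) for _ in range(100_000)]

# The sample mean should be close to the theoretical mean 1 / lam = 0.5.
print(sum(samples) / len(samples))
```

The same two-step pattern (draw uniform, apply the inverse CDF) works for any target whose distribution function can be inverted, analytically or numerically.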
4.6 Summary
This chapter introduced the basic concepts of statistical modeling, which serve as building blocks
for the material in the book. We defined the concept of a random variable from an operational
point of view and illustrated its significance in describing statistical experiments. Characteristics of
a single random variable have been studied, including the probability distribution and density functions
and the concept of statistical expectation. Extensions to functions of a single random variable and
the joint distribution of two random variables were considered. The chapter concluded with the basics of
simulation of one- and two-dimensional random variables.
4.7 Computer Laboratory
A standard uniform random variable can be easily converted to any other distribution, such as the normal
distribution. The methodology of generating random numbers has several forms; however, the
one used in the great majority of cases is the linear congruential generator (LCG), whose recursive formula
is given by (A4.1), where $\{X_k\}$ is a sequence of integer values ranging from $0$ to $m - 1$, starting
from a seed point $X_0$:
$$X_{k+1} = (a X_k + c) \bmod m, \qquad \text{(A4.1)}$$
where $m$ is the modulus, $a$ is the multiplier, $c$ is the increment, and all are non-negative integers.
In order to obtain numbers in the interval $(0, 1)$, the generated integer sequence should be divided
by $m$ to obtain $U_k = X_k / m$. There are many variations of LCGs; the form used in this project is
the multiplicative LCG, where the addition of $c$ is not needed, so the recursive formula of the
generator can be given by
$$X_{k+1} = (a X_k) \bmod m. \qquad \text{(A4.2)}$$
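A minimal sketch of the multiplicative LCG in (A4.2); the parameter choice (the classical "minimal standard" $a = 16807$, $m = 2^{31} - 1$) is our own assumption, not one prescribed by the text:

```python
def lcg_uniform(seed, n, a=16807, m=2**31 - 1):
    """Multiplicative LCG: X_{k+1} = (a * X_k) mod m, scaled to (0, 1)."""
    x = seed
    out = []
    for _ in range(n):
        x = (a * x) % m
        out.append(x / m)  # divide by the modulus to land in (0, 1)
    return out

u = lcg_uniform(seed=12345, n=100_000)
print(min(u), max(u))   # all values fall strictly inside (0, 1)
print(sum(u) / len(u))  # close to 0.5 for a uniform stream
```

Because $m$ is prime and the seed is nonzero, the state never reaches zero, so every output is strictly positive.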
We may convert the generated uniform streams to follow a Gaussian distribution of zero mean
and unit variance using the Box-Muller approach, described as follows.
If $U_1$ and $U_2$ are two independent standard uniform variates, then
$$Z_1 = \sqrt{-2 \ln U_1}\,\cos(2\pi U_2) \quad \text{and} \quad Z_2 = \sqrt{-2 \ln U_1}\,\sin(2\pi U_2)$$
are two independent standard normal variates; $U_1$ and $U_2$ must be drawn from two separate streams in
order to guarantee their independence.
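A minimal sketch of the Box-Muller step (using Python's `random` module with two separately seeded generators to stand in for the two independent uniform streams; these choices are illustrative, not prescribed by the text):

```python
import math
import random

rng1 = random.Random(1)  # two separate generators stand in for
rng2 = random.Random(2)  # two independent uniform streams

def box_muller(u1, u2):
    """Map two independent U(0,1) variates to two independent N(0,1) variates."""
    r = math.sqrt(-2.0 * math.log(u1))
    z1 = r * math.cos(2.0 * math.pi * u2)
    z2 = r * math.sin(2.0 * math.pi * u2)
    return z1, z2

pairs = [box_muller(rng1.random(), rng2.random()) for _ in range(50_000)]
z = [v for pair in pairs for v in pair]

mean = sum(z) / len(z)
var = sum(v * v for v in z) / len(z) - mean**2
print(mean, var)  # close to 0 and 1
```

Each call consumes one value from each stream and produces two normal variates, so 50,000 pairs yield 100,000 samples.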
Task 1:
(a) Generate three streams (1D, 2D and 3D) of random numbers with 100,000 samples; you
can use the Matlab function for random number generation.
(b) Visualize the generated samples. Hint: you can use a scatter plot.
(c) Compute the histogram of the three streams, and then normalize them to become a
probability density function (pdf).
(d) Visualize the pdfs of the three streams. Are the samples uniformly distributed? Do the
pdfs represent a standard uniform distribution? Comment.
Task 2: Use the streams generated in Task 1 to do the following:
(a) Convert the generated streams to a standard normal distribution.
(b) Visualize the data samples. Are they normally distributed with mean 0 and unit variance?
(c) Compute the normalized histograms (pdfs) of the data samples from (a), visualize the
distributions, and comment.
Task 3: Use the streams generated in Task 2 to do the following:
(a) Convert the 1D normal variate to a normal distribution with mean -2 and standard
deviation 3.
(b) Convert the 2D normal variate to a normal distribution with mean vector
$$M = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$$
and covariance matrix
$$\Sigma = \begin{bmatrix} 9 & 4 \\ 4 & 4 \end{bmatrix}.$$
(c) Convert the 3D normal variate to a normal distribution with mean vector
$$M = \begin{bmatrix} 5 \\ 5 \\ 6 \end{bmatrix}$$
and covariance matrix
$$\Sigma = \begin{bmatrix} 5 & 2 & 1 \\ 2 & 5 & 0 \\ 1 & 0 & 4 \end{bmatrix}.$$
(d) For (a), (b), and (c), estimate the mean and covariance matrix and compare the estimated
values with the ground-truth ones; comment.
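The conversions in Task 3 rest on the affine property of the Gaussian: if $\mathbf{Z}$ is a standard normal vector and $\Sigma = LL^T$ is a Cholesky factorization, then $\mathbf{X} = M + L\mathbf{Z} \sim N(M, \Sigma)$. A minimal Python sketch for the 2D case with the part (b) parameters (the use of NumPy here is our illustrative choice; the tasks themselves suggest Matlab):

```python
import numpy as np

rng = np.random.default_rng(7)
z = rng.standard_normal((2, 200_000))  # 2D standard normal variates

mu = np.array([[2.0], [1.0]])          # target mean vector of part (b)
sigma = np.array([[9.0, 4.0],
                  [4.0, 4.0]])         # target covariance matrix of part (b)

L = np.linalg.cholesky(sigma)          # sigma = L @ L.T
x = mu + L @ z                         # X = M + L Z  ~  N(M, Sigma)

print(np.mean(x, axis=1))              # close to [2, 1]
print(np.cov(x))                       # close to [[9, 4], [4, 4]]
```

The 1D and 3D parts follow the same pattern; in 1D the Cholesky factor reduces to the standard deviation.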
4.8 Exercises
1. Given a probability space Let be the events such that
Prove/disprove the following:
(a)
(b)
(c) .
2. Given a sample space
(a) Construct the smallest algebra
and
(b) Construct the smallest algebra
and
(c) What members of
that and
measurable?
3. Let be a nonnegative finitely additive set function on the field . Show that
(
4. Let $A_1, A_2, \ldots, A_n$ be arbitrary subsets of a set $\Omega$. Describe explicitly the smallest field
containing them. How many sets are there in this field? (I.e., give an upper bound that is
attainable.)
5. Derive the entries in Table 2.
6. Derive the entries in Table 3.