
MAKERERE UNIVERSITY

COLLEGE OF ENGINEERING, DESIGN, ART AND TECHNOLOGY
DEPARTMENT OF CIVIL ENGINEERING

Math assignment
Group members

NAMES OLARA ALLAN ARIKOD RICHARD MUKIIZA JULIUS NDARAMA MICHEAL SIMON TWEHEYO DISHAN BULUMA MELINDA NAMIYA MARIAM SSONKO EMMANUEL BUYINZA ABBEY ARIKE PATRICK TUKAMUSHABA EMMY ANGURA GABRIEL SEMAHORO ALLAN NANKABIRWA ROSE MUTONGOLE SAMUEL NAMPEERA ROBINAH KIGONYA ALLAN OCAN GEOFREY MUTYOGOMA MAUYA DRATELE SIGFRIED BUDRA MUKIIBI SSEMAKULA PETER KINENE SERWANGA BRIAN OLUKA PATRICK

REGISTRATION NUMBER 10/U/683 10/U/657 10/U/671 10/X/3007/PSA 10/U/690 10/U/662 10/U/676 10/U/9979/PSA 10/U/663 10/U/9989/PSA 08/U/3053/PSA 10/U/9946/PSA 10/U/9945/PSA 10/U/678 10/U/9965/PSA 10/U/677 10/U/668 10/U/9971/PSA 10/U/9998/PS 10/U/1914 10/U/9964/PSA 10/U/687 10/U/1002/PS

STUDENT NUMBER 210001123 210001135 210001151 210004611 210001016 210000809 210000345 210005498 210000879 210018734 208006302 210018907 210006460 210000348 210009531 210001032 210000683 210017525 210006589 210001946 210006993 210001541 210006598

BAYESIAN ESTIMATION OF DISTRIBUTION PARAMETERS

Introduction


Bayes' theorem is a theorem with two distinct interpretations. In the Bayesian interpretation, it expresses how a subjective degree of belief should rationally change to account for evidence. In the frequentist interpretation, it relates inverse representations of the probabilities concerning two events. Bayesian statistics has applications in fields including science, engineering, medicine and law.

Basics of the Bayesian estimation method


Consider the problem of finding a point estimate of the parameter θ for the population f(x; θ). The classical approach would be to take a random sample of size n and substitute the information provided by the sample into the appropriate estimator or decision function. For the case of a binomial population b(x; n, p), the estimate of p, the proportion of successes, would be p̂ = x/n. Suppose that additional information is given about the parameter θ, namely, that θ is known to vary according to some probability distribution f(θ), often called the prior distribution, with prior mean μ₀ and prior variance σ₀². That is, we are now assuming θ to be a value of a random variable with probability distribution f(θ), and we wish to estimate the particular value of θ for the population from which we selected our sample. The probabilities associated with this prior distribution are called subjective probabilities, in that they measure a person's degree of belief in the location of the parameter. Bayesian techniques use the prior distribution f(θ) along with the joint distribution of the sample to compute the posterior distribution.

The posterior distribution consists of information from both the subjective prior distribution and the objective sample distribution, and it expresses our degree of belief in the location of the parameter θ after we have observed the sample. If we denote by f(x₁, x₂, ..., xₙ | θ) the joint probability distribution of the sample, conditional on the parameter θ (in a situation where θ is a value of a random variable), then the joint distribution of the sample and the parameter is

f(x₁, x₂, ..., xₙ, θ) = f(x₁, x₂, ..., xₙ | θ) f(θ),


from which we readily obtain the marginal distribution

g(x₁, x₂, ..., xₙ) = ∫ f(x₁, x₂, ..., xₙ, θ) dθ   (continuous case),

with the integral replaced by a sum over θ in the discrete case.

Hence the posterior distribution may be written

f(θ | x₁, x₂, ..., xₙ) = f(x₁, x₂, ..., xₙ, θ) / g(x₁, x₂, ..., xₙ).

Note: the mean of the posterior distribution, denoted by θ*, is called the Bayes estimate of θ. The density f(θ | x₁, x₂, ..., xₙ) is called the posterior density.
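As a numerical illustration of this construction, the short Python sketch below builds a posterior on a grid of parameter values. The binomial likelihood, the uniform prior and the sample counts (n = 20, x = 7) are illustrative assumptions, not values taken from the text above.

```python
import numpy as np

# Minimal grid-based sketch of the posterior construction above, assuming a
# binomial likelihood and a uniform prior (both illustrative choices).
n, x = 20, 7                                   # hypothetical sample: x successes in n trials
theta = np.linspace(0.001, 0.999, 999)         # grid of parameter values
dtheta = theta[1] - theta[0]

prior = np.ones_like(theta)                    # f(theta): uniform prior
likelihood = theta**x * (1 - theta)**(n - x)   # f(x | theta), up to a constant

joint = likelihood * prior                     # f(x, theta) = f(x | theta) f(theta)
marginal = np.sum(joint) * dtheta              # g(x): integrate the joint over theta
posterior = joint / marginal                   # f(theta | x): the posterior density

theta_star = np.sum(theta * posterior) * dtheta   # posterior mean = Bayes estimate
print(f"Bayes estimate theta* ~ {theta_star:.3f}")  # close to (x + 1)/(n + 2) = 0.364 here
```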

Consider the Bayes estimation of the probability p of an event, where p is a realization of a random variable P with probability density function f_P(p) whose range is [0, 1]. A prior estimate of p can be obtained from

p̂ = E[P] = ∫₀¹ p f_P(p) dp. .......... (1)

To improve on the estimate of p, we conduct an experiment of tossing a die n times and observing the number of aces to be k. Applying Bayes' theorem, the posterior density is written as

f(p | k) = P(k | p) f_P(p) / B, .......... (2)

where B = ∫₀¹ P(k | p) f_P(p) dp and P(k | p) is the probability of observing k aces in n tosses given p.

From the binomial probability, we obtain

P(k | p) = C(n, k) p^k (1 − p)^(n−k), .......... (3)

where C(n, k) is the binomial coefficient. Substituting equation (3) into (2), the binomial coefficients cancel, giving

f(p | k) = p^k (1 − p)^(n−k) f_P(p) / ∫₀¹ p^k (1 − p)^(n−k) f_P(p) dp. .......... (4)


The updated estimate of p can be obtained by substituting the posterior density f(p | k) from equation (4) for f_P(p) in (1):

p̂ = ∫₀¹ p f(p | k) dp. .......... (5)

Assuming that P is uniformly distributed in [0, 1], instead of having a general distribution in that range, equation (4) can be simplified. The integral

∫₀¹ p^k (1 − p)^(n−k) dp = k! (n − k)! / (n + 1)! .......... (6)

can be shown to be true using mathematical induction. Substituting f_P(p) = 1 and using equation (6), we can express the conditional density as

f(p | k) = (n + 1)! / (k! (n − k)!) · p^k (1 − p)^(n−k). .......... (7)

The posterior estimate for p is obtained from equation (5) as follows:

p̂ = ∫₀¹ p f(p | k) dp = (n + 1)! / (k! (n − k)!) · ∫₀¹ p^(k+1) (1 − p)^(n−k) dp, .......... (8)

and from equation (6), ∫₀¹ p^(k+1) (1 − p)^(n−k) dp = (k + 1)! (n − k)! / (n + 2)!, so that

p̂ = (k + 1) / (n + 2).
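The result p̂ = (k + 1)/(n + 2) can be checked numerically with the short sketch below; the values n = 60 tosses and k = 8 observed aces are arbitrary illustrative numbers.

```python
import numpy as np

# Numerical check: with a uniform prior on [0, 1], the posterior mean after
# observing k aces in n tosses should be (k + 1)/(n + 2).
n, k = 60, 8
p = np.linspace(0.0005, 0.9995, 1000)
dp = p[1] - p[0]

posterior = p**k * (1 - p)**(n - k)        # equation (4) with f_P(p) = 1, up to a constant
posterior /= np.sum(posterior) * dp        # normalise so it integrates to 1

p_hat = np.sum(p * posterior) * dp         # equation (5): posterior mean
print(p_hat, (k + 1) / (n + 2))            # both approximately 0.145
```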

Theorem

Bayesian methods of estimation concerning the mean μ of a normal population are based on the following theorem. If x̄ is the mean of a random sample of size n from a normal population with known variance σ², and the prior distribution of the population mean μ is a normal distribution with prior mean μ₀ and prior variance σ₀², then the posterior distribution of the population mean is also a normal distribution, with mean μ* and standard deviation σ*, where

μ* = (n x̄ σ₀² + μ₀ σ²) / (n σ₀² + σ²)   and   σ* = √( σ₀² σ² / (n σ₀² + σ²) ).

The posterior mean μ* is the Bayes estimate of the population mean μ, and a 100(1 − α)% Bayesian interval for μ can be constructed by computing the interval

μ* − z_(α/2) σ* < μ < μ* + z_(α/2) σ*,

which is centered at the posterior mean and contains 100(1 − α)% of the posterior probability.

Example

An electrical firm manufactures light bulbs that have a length of life that is approximately normally distributed with a standard deviation of 100 hours. Prior experience leads us to believe that μ is a value of a normal random variable with mean μ₀ = 800 hours and standard deviation σ₀ = 10 hours. If a random sample of 25 bulbs has an average life of 780 hours, find a 95% Bayesian interval for μ.

Solution

The posterior distribution of the mean is also normally distributed, with mean

μ* = (25 · 780 · 10² + 800 · 100²) / (25 · 10² + 100²) = 796

and standard deviation

σ* = √( 10² · 100² / (25 · 10² + 100²) ) = √80 ≈ 8.94.

The 95% Bayesian interval for μ is then given by

796 − 1.96 √80 < μ < 796 + 1.96 √80,

or 778.5 < μ < 813.5.
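The example can be reproduced with a few lines of Python using only the theorem's formulas; this is a sketch of the arithmetic rather than a general-purpose routine.

```python
from math import sqrt

# Recompute the light-bulb example: posterior mean, posterior standard deviation,
# and the 95% Bayesian interval.
sigma, mu0, sigma0 = 100.0, 800.0, 10.0    # known sd, prior mean, prior sd
n, xbar = 25, 780.0                        # sample size and sample mean
z = 1.96                                   # z_(alpha/2) for a 95% interval

mu_star = (n * xbar * sigma0**2 + mu0 * sigma**2) / (n * sigma0**2 + sigma**2)
sigma_star = sqrt(sigma0**2 * sigma**2 / (n * sigma0**2 + sigma**2))

print(mu_star, sigma_star)                                  # 796.0 and about 8.94
print(mu_star - z * sigma_star, mu_star + z * sigma_star)   # about 778.5 and 813.5
```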

By ignoring the prior information about μ in the example above, one could instead construct the classical 95% confidence interval

780 − (1.96)(100/√25) < μ < 780 + (1.96)(100/√25),

or 740.8 < μ < 819.2, which is observed to be wider than the corresponding Bayesian interval.

Advantages of Bayesian Estimation

i. The prior probability allows you to incorporate existing knowledge about a particular hypothesis. If, for example, a similar study on lemur foraging was conducted 20 years ago, you can incorporate that study into your scientific decision-making, since science works by building on established knowledge.

ii. The aim is to compare the relative probabilities of competing hypotheses, and the Bayesian framework allows you to make that comparison.


Disadvantages of Bayesian Estimation

i. You can get very different posterior distributions by changing which parameters are given uninformative priors; in other words, there are some tricky mechanical issues.

ii. The frequentist framework is ideal for the Popperian view of science because it allows you to falsify a hypothesis. Under Bayesian statistics there is no such thing as falsification, only relative degrees of belief.

iii. Frequentist statistics is "easy" and has accepted conventions of method and notation. The same cannot be said of Bayesian statistics, which requires an understanding of probability and likelihood.


VECTOR RANDOM VARIABLES

A random matrix (or random vector) is a matrix (vector) whose elements are random variables; its elements are jointly distributed. Two random matrices X1 and X2 are independent if the elements of X1 (as a collection of random variables) are independent of the elements of X2, although the elements within X1 or X2 do not have to be independent. Similarly, a collection of random matrices X1, ..., Xk is independent if their respective collections of random elements are (mutually) independent. (Again, the elements within any of the random matrices need not be independent.) Likewise, an infinite collection of random matrices is independent if every finite sub-collection is independent.

Expectation (mean) of a random matrix

The expected value or mean of an m × n random matrix X is the m × n matrix E(X) whose elements are the expected values of the corresponding elements of X, assuming that they all exist. That is, if X = [x_ij], then E(X) = [E(x_ij)].

Properties:
E(X′) = (E(X))′
If X is square, E(tr(X)) = tr(E(X))
If a is a constant, E(aX) = a E(X)
E(vec(X)) = vec(E(X))
If A and B are constant matrices, E(AXB) = A E(X) B (illustrated below)
E(X1 + X2) = E(X1) + E(X2)
If X1 and X2 are independent, E(X1X2) = E(X1) E(X2)
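The sketch below is a small Monte Carlo sanity check of the property E(AXB) = A E(X) B. The dimensions, the constant matrices and the noise distribution are arbitrary illustrative choices, not part of the text above.

```python
import numpy as np

# Monte Carlo check of E(AXB) = A E(X) B for a random matrix X.
rng = np.random.default_rng(0)
A = rng.normal(size=(2, 3))                # constant matrices
B = rng.normal(size=(4, 2))
M = rng.normal(size=(3, 4))                # E(X): chosen elementwise means

# Draw many X's with elementwise mean M and average A X B over the draws.
samples = M + rng.normal(size=(100_000, 3, 4))   # each X = M + noise, so E(X) = M
lhs = np.mean(A @ samples @ B, axis=0)           # Monte Carlo estimate of E(AXB)
rhs = A @ M @ B                                  # the claimed value

print(np.max(np.abs(lhs - rhs)))           # small (Monte Carlo error only)
```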

Covariance

Covariance measures the relationship between two random variables. If 3 or more random variables are jointly distributed, one must consider the covariances for all possible pairs. For 3 jointly distributed random variables x, y and z there are specifically 3 covariances: σ_xy for x and y, σ_yz for y and z, and σ_xz for x and z. Thus, in dealing with m jointly distributed random variables, it is convenient to collect them into a single vector. A random


vector is one whose components are jointly distributed random variables. Therefore, if x₁, x₂, ..., x_m are m jointly distributed random variables, the vector

x = [x₁, x₂, ..., x_m]′

is a random vector. If μ₁, μ₂, ..., μ_m are the mean values of x₁, x₂, ..., x_m respectively, then

Σ_xx = E[(x − μ)(x − μ)′],

where μ = [μ₁, μ₂, ..., μ_m]′, so that the (i, j) element of this matrix is E[(x_i − μ_i)(x_j − μ_j)]. Noting that E[(x_i − μ_i)(x_j − μ_j)] = σ_ij (the covariance of x_i and x_j), and that σ_ii = σ_i² if i = j, we obtain the symmetric matrix below:

Σ_xx = [ σ₁²   σ₁₂   ...   σ₁m
         σ₁₂   σ₂²   ...   σ₂m
         ...   ...   ...   ...
         σ₁m   σ₂m   ...   σ_m² ]

Note: the variances of the individual random variables form the main diagonal of Σ_xx. Σ_xx is the variance-covariance matrix of X.
If the random variables in X are uncorrelated, all covariance (off-diagonal) elements of Σ_xx are zero and the matrix is diagonal. The relationship between the weight matrix W and the corresponding variance-covariance matrix, with subscripts added to indicate reference to the random vector X, is restated as

W_xx = σ₀² Σ_xx⁻¹,

where σ₀² is the reference variance. Caution: if W_xx is non-diagonal, the simple weights calculated in

W₁ = σ₀² / σ₁², W₂ = σ₀² / σ₂², ..., W_m = σ₀² / σ_m² .......... (4-16)

are not to be used as diagonal elements of W_xx; only when W_xx is diagonal are the weights calculated in (4-16) identical to the diagonal elements.

Example 1

Two observations are represented by the random vector X = [x₁, x₂]′. The variances of X₁ and X₂ are σ₁² and σ₂² respectively, the covariance of X₁ and X₂ is σ₁₂, and the correlation coefficient is ρ₁₂.

(a) For a selected reference variance σ₀², derive the weight matrix of X in terms of the given parameters.

(b) Show that the weights calculated in (4-16) are identical to the diagonal elements of the weight matrix only when ρ₁₂ = 0.

Solution

(a) The weight matrix of X is

W_xx = σ₀² Σ_xx⁻¹ = σ₀² [ σ₁²   σ₁₂
                          σ₁₂   σ₂² ]⁻¹ = σ₀² / (σ₁² σ₂² − σ₁₂²) · [ σ₂²    −σ₁₂
                                                                     −σ₁₂    σ₁² ].

We know that σ₁₂ = ρ₁₂ σ₁ σ₂ and σ₁² σ₂² − σ₁₂² = σ₁² σ₂² (1 − ρ₁₂²); thus

W_xx = σ₀² / (σ₁² σ₂² (1 − ρ₁₂²)) · [ σ₂²         −ρ₁₂ σ₁ σ₂
                                      −ρ₁₂ σ₁ σ₂   σ₁² ].

(b) From (4-16), W₁ = σ₀² / σ₁² and W₂ = σ₀² / σ₂².



The diagonal elements of W_xx are σ₀² / (σ₁² (1 − ρ₁₂²)) and σ₀² / (σ₂² (1 − ρ₁₂²)), which are identical to W₁ and W₂ only when ρ₁₂ = 0.
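The conclusion of Example 1 can be checked numerically with the sketch below; the variances, correlation coefficient and reference variance are illustrative values chosen for the check.

```python
import numpy as np

# Numerical check of Example 1: W_xx = sigma_0^2 * inverse(Sigma_xx) for two observations,
# compared with the simple weights of (4-16).
sigma1_sq, sigma2_sq, rho12, sigma0_sq = 4.0, 9.0, 0.6, 1.0
sigma12 = rho12 * np.sqrt(sigma1_sq * sigma2_sq)

Sigma_xx = np.array([[sigma1_sq, sigma12],
                     [sigma12, sigma2_sq]])
W_xx = sigma0_sq * np.linalg.inv(Sigma_xx)

print(np.diag(W_xx))                                 # sigma_0^2 / (sigma_i^2 (1 - rho12^2))
print(sigma0_sq / sigma1_sq, sigma0_sq / sigma2_sq)  # (4-16) weights; equal only if rho12 = 0
```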

When ρ₁₂ ≠ 0, the weights W₁ and W₂ cannot be used as diagonal elements of W_xx. Each element of Σ_xx can be divided by σ₀² to yield a scaled version of Σ_xx called Q_xx (the cofactor matrix of X):

Q_xx = (1/σ₀²) Σ_xx,

or equivalently

Σ_xx = σ₀² Q_xx.

Q_xx is also called the relative covariance matrix.

The variance-covariance matrix (or covariance matrix) of an m × 1 random vector x is the m × m matrix V(x) (also written Var(x) or Cov(x)) defined by

V(x) = E((x − E(x))(x − E(x))′),

when the expectations all exist; its (i, j) element is the covariance of x_i and x_j.

In particular, V(x) is symmetric, and it is diagonal if the elements of x are independent.

Properties
If a is a constant (scalar), V(a x) = a² V(x).
If A is a constant matrix and b a constant vector, V(Ax + b) = A V(x) A′ (see the numerical check below).
V(x) is always non-negative definite.

The covariance between the m × 1 random vector x₁ and the n × 1 random vector x₂ is defined to be the m × n matrix

Cov(x₁, x₂) = E((x₁ − E(x₁))(x₂ − E(x₂))′),

when all expectations exist; Cov(a x₁, b x₂) = ab Cov(x₁, x₂) if a and b are constant scalars. If A and B are constant matrices and c and d are constant vectors,

Cov(Ax₁ + c, Bx₂ + d) = A Cov(x₁, x₂) B′.
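The sketch below checks the property V(Ax + b) = A V(x) A′ by simulation; the matrix A, the vector b and the covariance of x are arbitrary illustrative choices.

```python
import numpy as np

# Monte Carlo check of V(Ax + b) = A V(x) A' for a random vector x.
rng = np.random.default_rng(2)
Vx = np.array([[2.0, 0.3], [0.3, 1.0]])               # V(x)
A = np.array([[1.0, -2.0], [0.5, 0.0], [3.0, 1.0]])   # constant 3x2 matrix
b = np.array([1.0, 2.0, 3.0])                          # constant vector

x = rng.multivariate_normal([0.0, 0.0], Vx, size=300_000)  # samples of x (one per row)
y = x @ A.T + b                                            # samples of Ax + b

print(np.round(np.cov(y, rowvar=False), 2))   # sample covariance of Ax + b
print(np.round(A @ Vx @ A.T, 2))              # the claimed value A V(x) A'
```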

Conditional expectation

The conditional expectation E(X1 | X2 = A) of a random matrix X1 given another random matrix X2 (A being a constant matrix) is the expectation of X1 defined using the conditional distribution of its elements given X2 = A. The conditional expectation E(X1 | X2) is the expectation of X1 defined using the conditional distribution of its elements given X2. The double expectation formula is

E(E(X1 | X2)) = E(X1).

The conditional variance-covariance matrix V(x1 | X2 = A), or V(x1 | X2), for a random vector x1 is defined by substituting conditional expectations into the definition of the variance-covariance matrix appropriately. The conditional variance formula applies:

V(x1) = E(V(x1 | X2)) + V(E(x1 | X2)).

For random vectors x1 and x2, the conditional covariance Cov(x1, x2 | x3 = A), or Cov(x1, x2 | x3), can be defined by putting the appropriate conditional expectations into the definition of the covariance. An additional covariance formula is

Cov(x1, x2) = E(Cov(x1, x2 | X3)) + Cov(E(x1 | X3), E(x2 | X3)).
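A scalar simulation of the double expectation and conditional variance formulas is sketched below; the hierarchical model (X2 normal with mean 5 and standard deviation 2, X1 given X2 normal with mean X2 and standard deviation 3) is an assumed illustrative choice.

```python
import numpy as np

# Check E(E(X1|X2)) = E(X1) and V(X1) = E(V(X1|X2)) + V(E(X1|X2)) by simulation.
rng = np.random.default_rng(3)
x2 = rng.normal(5.0, 2.0, size=1_000_000)
x1 = rng.normal(x2, 3.0)                  # given X2, X1 has mean X2 and sd 3

print(np.mean(x1), 5.0)                   # E(X1) = E(E(X1|X2)) = E(X2) = 5
print(np.var(x1), 3.0**2 + 2.0**2)        # V(X1) = E(V(X1|X2)) + V(E(X1|X2)) = 9 + 4 = 13
```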


REFERENCES

Walpole, Ronald E., Raymond H. Myers and Sharon L. Myers, Probability and Statistics for Engineers and Scientists, 6th edition, pages 275-280.

Mikhail, Edward M., Analysis and Adjustment of Survey Measurements, School of Engineering, Purdue University, West Lafayette, Indiana.

Krishnan, Venkatarao, Probability and Random Processes, John Wiley and Sons, 2006, pages 384-405.

Storkey, Amos, MLPR lectures: Distributions and Models, School of Informatics, University of Edinburgh, 2009. http://www.inf.ed.ac.uk/teaching/courses/mlpr/lectures/distnsandmodelsprint4up.pdf

