
MAKERERE UNIVERSITY

COLLEGE OF ENGINEERING, DESIGN, ART AND TECHNOLOGY
DEPARTMENT OF CIVIL ENGINEERING

Math assignment
Group members

NAMES OLARA ALLAN ARIKOD RICHARD MUKIIZA JULIUS NDARAMA MICHEAL SIMON TWEHEYO DISHAN BULUMA MELINDA NAMIYA MARIAM SSONKO EMMANUEL BUYINZA ABBEY ARIKE PATRICK TUKAMUSHABA EMMY ANGURA GABRIEL SEMAHORO ALLAN NANKABIRWA ROSE MUTONGOLE SAMUEL NAMPEERA ROBINAH KIGONYA ALLAN OCAN GEOFREY MUTYOGOMA MAUYA DRATELE SIGFRIED BUDRA MUKIIBI SSEMAKULA PETER KINENE SERWANGA BRIAN OLUKA PATRICK

REGISTRATION NUMBER 10/U/683 10/U/657 10/U/671 10/X/3007/PSA 10/U/690 10/U/662 10/U/676 10/U/9979/PSA 10/U/663 10/U/9989/PSA 08/U/3053/PSA 10/U/9946/PSA 10/U/9945/PSA 10/U/678 10/U/9965/PSA 10/U/677 10/U/668 10/U/9971/PSA 10/U/9998/PS 10/U/1914 10/U/9964/PSA 10/U/687 10/U/1002/PS

STUDENT NUMBER 210001123 210001135 210001151 210004611 210001016 210000809 210000345 210005498 210000879 210018734 208006302 210018907 210006460 210000348 210009531 210001032 210000683 210017525 210006589 210001946 210006993 210001541 210006598

BAYESIAN ESTIMATION OF DISTRIBUTION PARAMETERS

Introduction


Bayes' theorem is a theorem with two distinct interpretations. In the Bayesian interpretation, it expresses how a subjective degree of belief should rationally change to account for evidence. In the frequentist interpretation, it relates inverse representations of the probabilities concerning two events. Bayesian statistics has applications in fields including science, engineering, medicine and law.

Basics of the Bayesian estimation method


Consider the problem of finding a point estimate of the parameter θ for the population f(x; θ). The classical approach would be to take a random sample of size n and substitute the information provided by the sample into the appropriate estimator or decision function. For the case of a binomial population b(x; n, p), the estimate of p, the proportion of successes, would be p̂ = x/n. Suppose that additional information is given about the parameter θ, namely, that θ is known to vary according to some probability distribution f(θ), often called the prior distribution, with prior mean μ₀ and prior variance σ₀². That is, we are now assuming θ to be a value of a random variable with probability distribution f(θ), and we wish to estimate the particular value of θ for the population from which we selected our sample. The probabilities associated with this prior distribution are called subjective probabilities, in that they measure a person's degree of belief in the location of the parameter. Bayesian techniques use the prior distribution f(θ) along with the joint distribution of the sample to compute the posterior distribution.

The posterior distribution consists of information from both the subjective prior distribution and the objective sample distribution, and it expresses our degree of belief in the location of the parameter θ after we have observed the sample. If we denote by f(x₁, x₂, ..., xₙ | θ) the joint probability distribution of the sample, conditional on the parameter θ (in a situation where θ is a value of a random variable), then the joint distribution of the sample and the parameter is

f(x₁, x₂, ..., xₙ, θ) = f(x₁, x₂, ..., xₙ | θ) f(θ),


from which we readily obtain the marginal distribution

g(x₁, x₂, ..., xₙ) = ∫ f(x₁, x₂, ..., xₙ, θ) dθ   (continuous case),

with the integral replaced by a sum over θ in the discrete case.

Hence the posterior distribution may be written

f(θ | x₁, x₂, ..., xₙ) = f(x₁, x₂, ..., xₙ, θ) / g(x₁, x₂, ..., xₙ).

Note: the mean of the posterior distribution, denoted by θ*, is called the Bayes estimate of θ. The density f(θ | x₁, x₂, ..., xₙ) is called the posterior density.
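As a numerical illustration of this construction, the short Python sketch below builds a posterior on a grid of parameter values. The binomial likelihood, the uniform prior and the sample counts (n = 20, x = 7) are illustrative assumptions, not values taken from the text above.

```python
import numpy as np

# Minimal grid-based sketch of the posterior construction above, assuming a
# binomial likelihood and a uniform prior (both illustrative choices).
n, x = 20, 7                                   # hypothetical sample: x successes in n trials
theta = np.linspace(0.001, 0.999, 999)         # grid of parameter values
dtheta = theta[1] - theta[0]

prior = np.ones_like(theta)                    # f(theta): uniform prior
likelihood = theta**x * (1 - theta)**(n - x)   # f(x | theta), up to a constant

joint = likelihood * prior                     # f(x, theta) = f(x | theta) f(theta)
marginal = np.sum(joint) * dtheta              # g(x): integrate the joint over theta
posterior = joint / marginal                   # f(theta | x): the posterior density

theta_star = np.sum(theta * posterior) * dtheta   # posterior mean = Bayes estimate
print(f"Bayes estimate theta* ~ {theta_star:.3f}")  # close to (x + 1)/(n + 2) = 0.364 here
```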

Consider the Bayes estimation of the probability p of an event, where p is a realization of a random variable P with probability density function f_P(p) whose range is [0, 1]. A prior estimate of p can be obtained from

p̂ = E[P] = ∫₀¹ p f_P(p) dp. .......... (1)

To improve on the estimate of p, we conduct an experiment of tossing a die n times and observing the number of aces to be k. Applying Bayes' theorem, the posterior density is written as

f(p | k) = P(k | p) f_P(p) / B, .......... (2)

where B = ∫₀¹ P(k | p) f_P(p) dp and P(k | p) is the probability of observing k aces in n tosses given p.

From the binomial probability, we obtain

P(k | p) = C(n, k) p^k (1 − p)^(n−k), .......... (3)

where C(n, k) is the binomial coefficient. Substituting equation (3) into (2), the binomial coefficients cancel, giving

f(p | k) = p^k (1 − p)^(n−k) f_P(p) / ∫₀¹ p^k (1 − p)^(n−k) f_P(p) dp. .......... (4)


The updated estimate of p can be obtained by substituting the posterior density f(p | k) from equation (4) for f_P(p) in (1):

p̂ = ∫₀¹ p f(p | k) dp. .......... (5)

Assuming that P is uniformly distributed in [0, 1], instead of having a general distribution in that range, equation (4) can be simplified. The integral

∫₀¹ p^k (1 − p)^(n−k) dp = k! (n − k)! / (n + 1)! .......... (6)

can be shown to be true using mathematical induction. Substituting f_P(p) = 1 and using equation (6), we can express the conditional density as

f(p | k) = (n + 1)! / (k! (n − k)!) · p^k (1 − p)^(n−k). .......... (7)

The posterior estimate for p is obtained from equation (5) as follows:

p̂ = ∫₀¹ p f(p | k) dp = (n + 1)! / (k! (n − k)!) · ∫₀¹ p^(k+1) (1 − p)^(n−k) dp, .......... (8)

and from equation (6), ∫₀¹ p^(k+1) (1 − p)^(n−k) dp = (k + 1)! (n − k)! / (n + 2)!, so that

p̂ = (k + 1) / (n + 2).
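The result p̂ = (k + 1)/(n + 2) can be checked numerically with the short sketch below; the values n = 60 tosses and k = 8 observed aces are arbitrary illustrative numbers.

```python
import numpy as np

# Numerical check: with a uniform prior on [0, 1], the posterior mean after
# observing k aces in n tosses should be (k + 1)/(n + 2).
n, k = 60, 8
p = np.linspace(0.0005, 0.9995, 1000)
dp = p[1] - p[0]

posterior = p**k * (1 - p)**(n - k)        # equation (4) with f_P(p) = 1, up to a constant
posterior /= np.sum(posterior) * dp        # normalise so it integrates to 1

p_hat = np.sum(p * posterior) * dp         # equation (5): posterior mean
print(p_hat, (k + 1) / (n + 2))            # both approximately 0.145
```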

Theorem

Bayesian methods of estimation concerning the mean μ of a normal population are based on the following theorem. If x̄ is the mean of a random sample of size n from a normal population with known variance σ², and the prior distribution of the population mean μ is a normal distribution with prior mean μ₀ and prior variance σ₀², then the posterior distribution of the population mean is also a normal distribution, with mean μ* and standard deviation σ*, where

μ* = (n x̄ σ₀² + μ₀ σ²) / (n σ₀² + σ²)   and   σ* = √( σ₀² σ² / (n σ₀² + σ²) ).

The posterior mean μ* is the Bayes estimate of the population mean μ, and a 100(1 − α)% Bayesian interval for μ can be constructed by computing the interval

μ* − z_(α/2) σ* < μ < μ* + z_(α/2) σ*,

which is centered at the posterior mean and contains 100(1 − α)% of the posterior probability.

Example

An electrical firm manufactures light bulbs that have a length of life that is approximately normally distributed with a standard deviation of 100 hours. Prior experience leads us to believe that μ is a value of a normal random variable with mean μ₀ = 800 hours and standard deviation σ₀ = 10 hours. If a random sample of 25 bulbs has an average life of 780 hours, find a 95% Bayesian interval for μ.

Solution

The posterior distribution of the mean is also normally distributed, with mean

μ* = (25 · 780 · 10² + 800 · 100²) / (25 · 10² + 100²) = 796

and standard deviation

σ* = √( 10² · 100² / (25 · 10² + 100²) ) = √80 ≈ 8.94.

The 95% Bayesian interval for μ is then given by

796 − 1.96 √80 < μ < 796 + 1.96 √80,

or 778.5 < μ < 813.5.
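The example can be reproduced with a few lines of Python using only the theorem's formulas; this is a sketch of the arithmetic rather than a general-purpose routine.

```python
from math import sqrt

# Recompute the light-bulb example: posterior mean, posterior standard deviation,
# and the 95% Bayesian interval.
sigma, mu0, sigma0 = 100.0, 800.0, 10.0    # known sd, prior mean, prior sd
n, xbar = 25, 780.0                        # sample size and sample mean
z = 1.96                                   # z_(alpha/2) for a 95% interval

mu_star = (n * xbar * sigma0**2 + mu0 * sigma**2) / (n * sigma0**2 + sigma**2)
sigma_star = sqrt(sigma0**2 * sigma**2 / (n * sigma0**2 + sigma**2))

print(mu_star, sigma_star)                                  # 796.0 and about 8.94
print(mu_star - z * sigma_star, mu_star + z * sigma_star)   # about 778.5 and 813.5
```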

By ignoring the prior information about μ in the example above, one could instead construct the classical 95% confidence interval

780 − (1.96)(100/√25) < μ < 780 + (1.96)(100/√25),

or 740.8 < μ < 819.2, which is observed to be wider than the corresponding Bayesian interval.

Advantages of Bayesian Estimation

i. The prior probability allows you to incorporate existing knowledge about a particular hypothesis. If, for example, a similar study on lemur foraging was conducted 20 years ago, you can incorporate that study into your scientific decision-making, since science works by building on established knowledge.

ii. The aim is to compare the relative probabilities of competing hypotheses, and the Bayesian framework allows you to make that comparison.


Disadvantages of Bayesian Estimation

i. You can get very different posterior distributions by changing which parameters are given uninformative priors; in other words, there are some tricky mechanical issues.

ii. The frequentist framework is ideal for the Popperian view of science because it allows you to falsify a hypothesis. Under Bayesian statistics there is no such thing as falsification, only relative degrees of belief.

iii. Frequentist statistics is "easy" and has accepted conventions of method and notation. The same cannot be said of Bayesian statistics, which requires an understanding of probability and likelihood.


VECTOR RANDOM VARIABLES

A random matrix (or random vector) is a matrix (vector) whose elements are random variables; its elements are jointly distributed. Two random matrices X1 and X2 are independent if the elements of X1 (as a collection of random variables) are independent of the elements of X2, although the elements within X1 or X2 do not have to be independent. Similarly, a collection of random matrices X1, ..., Xk is independent if their respective collections of random elements are (mutually) independent. (Again, the elements within any of the random matrices need not be independent.) Likewise, an infinite collection of random matrices is independent if every finite sub-collection is independent.

Expectation (mean) of a random matrix

The expected value or mean of an m × n random matrix X is the m × n matrix E(X) whose elements are the expected values of the corresponding elements of X, assuming that they all exist. That is, if X = [x_ij], then E(X) = [E(x_ij)].

Properties:
E(X′) = (E(X))′
If X is square, E(tr(X)) = tr(E(X))
If a is a constant, E(aX) = a E(X)
E(vec(X)) = vec(E(X))
If A and B are constant matrices, E(AXB) = A E(X) B (illustrated below)
E(X1 + X2) = E(X1) + E(X2)
If X1 and X2 are independent, E(X1X2) = E(X1) E(X2)
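The sketch below is a small Monte Carlo sanity check of the property E(AXB) = A E(X) B. The dimensions, the constant matrices and the noise distribution are arbitrary illustrative choices, not part of the text above.

```python
import numpy as np

# Monte Carlo check of E(AXB) = A E(X) B for a random matrix X.
rng = np.random.default_rng(0)
A = rng.normal(size=(2, 3))                # constant matrices
B = rng.normal(size=(4, 2))
M = rng.normal(size=(3, 4))                # E(X): chosen elementwise means

# Draw many X's with elementwise mean M and average A X B over the draws.
samples = M + rng.normal(size=(100_000, 3, 4))   # each X = M + noise, so E(X) = M
lhs = np.mean(A @ samples @ B, axis=0)           # Monte Carlo estimate of E(AXB)
rhs = A @ M @ B                                  # the claimed value

print(np.max(np.abs(lhs - rhs)))           # small (Monte Carlo error only)
```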

Covariance

Covariance measures the relationship between two random variables. If 3 or more random variables are jointly distributed, one must consider the covariances for all possible pairs. For 3 jointly distributed random variables x, y and z there are specifically 3 covariances: σ_xy for x and y, σ_yz for y and z, and σ_xz for x and z. Thus, in dealing with m jointly distributed random variables, it is convenient to collect them into a single vector. A random


vector is one whose components are jointly distributed random variables. Therefore, if x₁, x₂, ..., x_m are m jointly distributed random variables, the vector

x = [x₁, x₂, ..., x_m]′

is a random vector. If μ₁, μ₂, ..., μ_m are the mean values of x₁, x₂, ..., x_m respectively, then

Σ_xx = E[(x − μ)(x − μ)′],

where μ = [μ₁, μ₂, ..., μ_m]′, so that the (i, j) element of this matrix is E[(x_i − μ_i)(x_j − μ_j)]. Noting that E[(x_i − μ_i)(x_j − μ_j)] = σ_ij (the covariance of x_i and x_j), and that σ_ii = σ_i² if i = j, we obtain the symmetric matrix below:

Σ_xx = [ σ₁²   σ₁₂   ...   σ₁m
         σ₁₂   σ₂²   ...   σ₂m
         ...   ...   ...   ...
         σ₁m   σ₂m   ...   σ_m² ]

Note: the variances of the individual random variables form the main diagonal of Σ_xx. Σ_xx is the variance-covariance matrix of X.
If the random variables in X are uncorrelated, all covariance (off-diagonal) elements of Σ_xx are zero and the matrix is diagonal. The relationship between the weight matrix W and the corresponding variance-covariance matrix, with subscripts added to indicate reference to the random vector X, is restated as

W_xx = σ₀² Σ_xx⁻¹,

where σ₀² is the reference variance. Caution: if W_xx is non-diagonal, the simple weights calculated in

W₁ = σ₀² / σ₁², W₂ = σ₀² / σ₂², ..., W_m = σ₀² / σ_m² .......... (4-16)

are not to be used as diagonal elements of W_xx; only when W_xx is diagonal are the weights calculated in (4-16) identical to the diagonal elements.

Example 1

Two observations are represented by the random vector X = [x₁, x₂]′. The variances of X₁ and X₂ are σ₁² and σ₂² respectively, the covariance of X₁ and X₂ is σ₁₂, and the correlation coefficient is ρ₁₂.

(a) For a selected reference variance σ₀², derive the weight matrix of X in terms of the given parameters.

(b) Show that the weights calculated in (4-16) are identical to the diagonal elements of the weight matrix only when ρ₁₂ = 0.

Solution

(a) The weight matrix of X is

W_xx = σ₀² Σ_xx⁻¹ = σ₀² [ σ₁²   σ₁₂
                          σ₁₂   σ₂² ]⁻¹ = σ₀² / (σ₁² σ₂² − σ₁₂²) · [ σ₂²    −σ₁₂
                                                                     −σ₁₂    σ₁² ].

We know that σ₁₂ = ρ₁₂ σ₁ σ₂ and σ₁² σ₂² − σ₁₂² = σ₁² σ₂² (1 − ρ₁₂²); thus

W_xx = σ₀² / (σ₁² σ₂² (1 − ρ₁₂²)) · [ σ₂²         −ρ₁₂ σ₁ σ₂
                                      −ρ₁₂ σ₁ σ₂   σ₁² ].

(b) From (4-16), W₁ = σ₀² / σ₁² and W₂ = σ₀² / σ₂².



The diagonal elements of W_xx are σ₀² / (σ₁² (1 − ρ₁₂²)) and σ₀² / (σ₂² (1 − ρ₁₂²)), which are identical to W₁ and W₂ only when ρ₁₂ = 0.
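The conclusion of Example 1 can be checked numerically with the sketch below; the variances, correlation coefficient and reference variance are illustrative values chosen for the check.

```python
import numpy as np

# Numerical check of Example 1: W_xx = sigma_0^2 * inverse(Sigma_xx) for two observations,
# compared with the simple weights of (4-16).
sigma1_sq, sigma2_sq, rho12, sigma0_sq = 4.0, 9.0, 0.6, 1.0
sigma12 = rho12 * np.sqrt(sigma1_sq * sigma2_sq)

Sigma_xx = np.array([[sigma1_sq, sigma12],
                     [sigma12, sigma2_sq]])
W_xx = sigma0_sq * np.linalg.inv(Sigma_xx)

print(np.diag(W_xx))                                 # sigma_0^2 / (sigma_i^2 (1 - rho12^2))
print(sigma0_sq / sigma1_sq, sigma0_sq / sigma2_sq)  # (4-16) weights; equal only if rho12 = 0
```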

When ρ₁₂ ≠ 0, the weights W₁ and W₂ cannot be used as diagonal elements of W_xx. Each element of Σ_xx can be divided by σ₀² to yield a scaled version of Σ_xx called Q_xx (the cofactor matrix of X):

Q_xx = (1/σ₀²) Σ_xx,

or equivalently

Σ_xx = σ₀² Q_xx.

Q_xx is also called the relative covariance matrix.

The variance-covariance matrix (or covariance matrix) of an m × 1 random vector x is the m × m matrix V(x) (also written Var(x) or Cov(x)) defined by

V(x) = E((x − E(x))(x − E(x))′),

when the expectations all exist; its (i, j) element is the covariance of x_i and x_j.

In particular, V(x) is symmetric, and it is diagonal if the elements of x are independent.

Properties
If a is a constant (scalar), V(a x) = a² V(x).
If A is a constant matrix and b a constant vector, V(Ax + b) = A V(x) A′ (see the numerical check below).
V(x) is always non-negative definite.

The covariance between the m × 1 random vector x₁ and the n × 1 random vector x₂ is defined to be the m × n matrix

Cov(x₁, x₂) = E((x₁ − E(x₁))(x₂ − E(x₂))′),

when all expectations exist; Cov(a x₁, b x₂) = ab Cov(x₁, x₂) if a and b are constant scalars. If A and B are constant matrices and c and d are constant vectors,

Cov(Ax₁ + c, Bx₂ + d) = A Cov(x₁, x₂) B′.
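The sketch below checks the property V(Ax + b) = A V(x) A′ by simulation; the matrix A, the vector b and the covariance of x are arbitrary illustrative choices.

```python
import numpy as np

# Monte Carlo check of V(Ax + b) = A V(x) A' for a random vector x.
rng = np.random.default_rng(2)
Vx = np.array([[2.0, 0.3], [0.3, 1.0]])               # V(x)
A = np.array([[1.0, -2.0], [0.5, 0.0], [3.0, 1.0]])   # constant 3x2 matrix
b = np.array([1.0, 2.0, 3.0])                          # constant vector

x = rng.multivariate_normal([0.0, 0.0], Vx, size=300_000)  # samples of x (one per row)
y = x @ A.T + b                                            # samples of Ax + b

print(np.round(np.cov(y, rowvar=False), 2))   # sample covariance of Ax + b
print(np.round(A @ Vx @ A.T, 2))              # the claimed value A V(x) A'
```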

Conditional expectation

The conditional expectation E(X1 | X2 = A) of a random matrix X1 given another random matrix X2 (A being a constant matrix) is the expectation of X1 defined using the conditional distribution of its elements given X2 = A. The conditional expectation E(X1 | X2) is the expectation of X1 defined using the conditional distribution of its elements given X2. The double expectation formula is

E(E(X1 | X2)) = E(X1).

The conditional variance-covariance matrix V(x1 | X2 = A), or V(x1 | X2), for a random vector x1 is defined by substituting conditional expectations into the definition of the variance-covariance matrix appropriately. The conditional variance formula applies:

V(x1) = E(V(x1 | X2)) + V(E(x1 | X2)).

For random vectors x1 and x2, the conditional covariance Cov(x1, x2 | x3 = A), or Cov(x1, x2 | x3), can be defined by putting the appropriate conditional expectations into the definition of the covariance. An additional covariance formula is

Cov(x1, x2) = E(Cov(x1, x2 | X3)) + Cov(E(x1 | X3), E(x2 | X3)).
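A scalar simulation of the double expectation and conditional variance formulas is sketched below; the hierarchical model (X2 normal with mean 5 and standard deviation 2, X1 given X2 normal with mean X2 and standard deviation 3) is an assumed illustrative choice.

```python
import numpy as np

# Check E(E(X1|X2)) = E(X1) and V(X1) = E(V(X1|X2)) + V(E(X1|X2)) by simulation.
rng = np.random.default_rng(3)
x2 = rng.normal(5.0, 2.0, size=1_000_000)
x1 = rng.normal(x2, 3.0)                  # given X2, X1 has mean X2 and sd 3

print(np.mean(x1), 5.0)                   # E(X1) = E(E(X1|X2)) = E(X2) = 5
print(np.var(x1), 3.0**2 + 2.0**2)        # V(X1) = E(V(X1|X2)) + V(E(X1|X2)) = 9 + 4 = 13
```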


REFERENCES

Walpole, Ronald E., Raymond H. Myers and Sharon L. Myers, Probability and Statistics for Engineers and Scientists, 6th edition, pages 275-280.

Mikhail, Edward M., Analysis and Adjustment of Survey Measurements, School of Engineering, Purdue University, West Lafayette, Indiana.

Krishnan, Venkatarao, Probability and Random Processes, John Wiley and Sons, 2006, pages 384-405.

Storkey, Amos, MLPR lectures: Distributions and Models, School of Informatics, University of Edinburgh, 2009. http://www.inf.ed.ac.uk/teaching/courses/mlpr/lectures/distnsandmodelsprint4up.pdf

