
Course Code     : MS-8
Course Title    : Quantitative Analysis for Managerial Applications
Assignment Code : MS-8/SEM-II/2011
Coverage        : All Blocks

Note: Answer all the questions and submit this assignment on or before 31st October 2011, to the coordinator of your study center.

1. "Statistics can prove anything." "Figures cannot lie." Comment on the above two statements, indicating reasons for the existence of such divergent views regarding the nature and functions of statistics.

2. From the following data compute quartile deviation and the coefficient of skewness:

Size        5-7    8-10    11-13    14-16    17-19
Frequency   14     24      38       20       4

3. A bank has a test designed to establish the credit rating of a loan applicant. Of the persons who default (D), 90% fail the test (F). Of the persons who will repay the bank (ND), 5% fail the test. Furthermore, it is given that 4% of the population is not worthy of credit, i.e., P(D) = 0.04. Given that someone failed the test, what is the probability that he actually will default (when given a loan)?

4. Strength tests carried out on samples of two yarns spun to the same count gave the following results:

          Number in sample   Sample mean   Sample variance
Yarn A    4                  50            42
Yarn B    9                  42            56

The strengths are expressed in pounds. Does the difference in mean strengths indicate a real difference in the mean strengths of the yarns?

5. Write short notes on:
a) One-tail & two-tail tests
b) Standard normal distribution
c) Bayes' Theorem

PLEASE NOTE Q2-Q3-Q4 ARE OUT OF MY EXPERTISE AREA. REGARDS LEO LINGHAM

1. "Statistics can prove anything." "Figures cannot lie." Comment on the above two statements, indicating reasons for the existence of such divergent views regarding the nature and functions of statistics.

Statistics can prove anything

The trouble is that statistics are not raw data. Statistics is a method for analysing large data sets, raw or otherwise, and there are plenty of ways of getting the statistics to say what you want. To start with, you could just make up the data. If you don't want to go for quite such blatant fraud, cherry-picking the data (or the subjects giving the data) and leading questions are always good. Once you have the data, there are many, many different ways of looking at it, so there's bound to be one that will lead to the answer you want.

For example, take some of the surveys on the Twin Towers, since this is a popular topic for manipulating statistics. Let's say you want to show that the majority of American citizens think the government was responsible for this. Assuming it's a real poll and not just made up, the easiest way to get the "correct" answer is to pick the people you ask. Instead of randomly phoning people, go to a truther rally and ask your questions there. You're virtually guaranteed to get 100% thinking it was the government. If you can't quite bring yourself to get that close to them, just pick your questions carefully.

A good leading question would be something like: "Do you think the government could have done more to prevent the attacks?" With hindsight the answer is of course they could, and most people will answer something to that effect. The important thing is to have multiple-choice answers such as:

a) No, they did everything possible.
b) Yes, security could have been a bit tighter.
c) Yes, the government is hopelessly incompetent.
d) Yes, the government did it on purpose.
e) Not sure.

Hardly anyone will pick a), which means you can proudly announce to the world that 84% of Americans think 9/11 was the government's fault, even though that's not what most of them actually meant.

The important thing with all statistics is that unless you can see the raw data, the collection method and the workings, it is all extremely untrustworthy and shouldn't be taken too seriously. Even with respectable polling companies, the questions are usually written, at least partially, by the people commissioning the poll. Polls just asking questions in the street, write-ins or, horror of horrors, online polls are essentially worthless. Statistics published in peer-reviewed journals are likely to be better, and are at least open about the methods, but even then there are often arguments about the validity of the analysis used and possible biases in the population. As with anything, don't believe something just because someone has written about it. Look for peer review, openness, replication and so on.

Can you actually prove anything with statistics? Well, obviously not, since you can't prove something that isn't true. However, it's relatively easy to make it look like you've proved something to people who aren't skeptical enough.

-----------------------------------------

1. ZERO, if you're interested in trying to figure out what statistics to be skeptical of,

a good place to look is the sample size. This is the actual number of people who were asked the question. If your sample is large enough and taken from the general population, and not just one type of person, then you can say it's likely to be representative of the population as a whole. This is standard. When something tells you what a certain percentage of the population does, of course that survey hasn't asked everyone in the country. It has asked a representative sample and then weighted up the figures to the UK population. For example, if the UK population is 55% male and 45% female, and I ask 2000 people "What is your favourite drink?", then I need to make sure that my sample is roughly 55% male and 45% female if I want to make any claims about the UK population from my data.

Baileys!". Quite a big claim. So if we examine my sample and discover that of the 2000 people I asked, only 10 of them were women, then we know that only 9 out of 2000 people actually claim to drink Baileys. My sample is not representative of the population as a whole so I can't make any claims about the UK as a whole. The same goes for other demographics like age. When you are analysing survey data, unless you are looking at something very niche and specific, then as a rough guide you shouldn't trust any data with a sample size of less than 50. That's because once you start to break it down into demographics, you're talking about very small numbers of people who actually answered the question. Imagine you're looking at people who buy KitKats in the UK. You have a nice healthy sample taken from a nationwide internet survey of 3000 people. You want to break it down by geographic region. Your data tells you that 1000 of those surveyed live in London. That's fine, a random sample of 1000 Londonites will probably give you an accurate picture of Londonites in general. If 40% of your London survey respondents say they eat KitKats, you can safely claim that 40% of Londonites eat KitKats. But if you pull out the data for a tiny village in the Midlands, where only one person did the survey, is that one person representative of the whole village? Probably not. Bobdezon touched on this, beauty companies are VERY fond of studies with tiny sample sizes. Buyer beware! What 18 women in controlled conditions say about Dove shampoo is not likely to be what most women think. ------------------------------------------------------Recent article in Agrovox suggested that Canada is Importing illiteracy in the name of family reunification Recent article in Agrovox suggested that Canada is Importing illiteracy in the name of family reunification As per a study pointed out by agravox, more than 50% of the Canadians are not able to comprehend or analyze what they are reading and this illiteracy rate is growing up. The study continues to add that only about 23% of the immigrants pay taxes and rest of the 77% are just burden on these 23% tax payers. The author of the study goes on to suggest that entire country should go on to blush with embarrassment, it no where says about the similar figures about the non immigrant Canadians. Though, I have dont have any data currently on the subject, yet I guess given the ratio of children and aged people and assuming that at least 30% of the household would be single income households, where only a single member would be earning, this figure too will not be better than about 35%. Mostly immigrants, especially from India and Pakistan are seen to be more affluent and if the car sales statistics are any thing to go by, then they are doing definitely better than native Canadians. (Though there is nothing called Natives in Canada, some Canadians established at

(Though there is nothing called "Natives" in Canada, some Canadians established for at least 30 years or more would like to call themselves Natives of Canada.) Obviously, the immigrants are doing better. Then where is the need for raising all this controversy?

The article at Agrovox also continues to talk about the immense cost to society created by the crimes committed by immigrants. However, if you see the statistics released by the prison authorities from time to time, the ratio of immigrant inmates in jails is about the same as their share of the population. The article is typical of immigrant bashing by presenting half-truths.

Even I am against immigration. My reason is that there are no more jobs available for new immigrants, and these immigrants are eating away our jobs, even undercutting. No doubt, I agree with Agrovox's stand that immigration has generally been touted as a quick-fix solution to our problems, such as filling jobs that Canadians supposedly don't want to do, and with its question: why, then, do we bring in mostly people who add to the cost, instead of selecting those from whom the country can actually benefit? Yes, I agree, but the reason is mostly political rather than economic, though it has been shrouded under the garb of social equality. The fact is that Canada has pockets of immigrants, some places which resemble Punjab in India or look like parts of China. These areas make powerful vote banks, and hence some political compulsion calls for such statements.

Agrovox also mentioned that almost half of the 250,000 immigrants Canada admits a year enter the country under the family reunification scheme. This, it argues, has swelled the ranks of the elderly and infirm - a group that is growing among native-born Canadians and a source of future affordability bottlenecks when it comes to social programs and health care - and is one of the main reasons why Canadians have to wait for months, or years, for crucial medical procedures. If you have ever wondered why emergency rooms are always clogged, and some people die before staff can attend to them, look around: the majority of patients are elderly aunts, uncles, grandparents and cousins brought in under the family class, none of whom will ever learn English or contribute even a penny in taxes. In other words, they are completely useless and are dragging Canada down into the gutter. But the fact that an immigrant is almost sure to call over his parents is well known at the time of granting him immigration. This is a cost that is built into the hefty fees that an immigrant pays at the time of immigration. Why regret the same later on? In any case, as a percentage, old people in immigrant families would be about the same as in other families.

================================

Figures cannot lie

THE FIGURES GIVEN BELOW ARE GENERATED BY CENSUS COUNTS AND HENCE CANNOT LIE, AS THEY ARE NOT OPEN TO INTERPRETATION. THEY ARE FACTS.

India's Population 2011

Current Population of India in 2011 : 1,210,193,422 (1.21 billion)
Total Male Population in India      : 623,700,000 (623.7 million)
Total Female Population in India    : 586,500,000 (586.5 million)
Sex Ratio                           : 940 females per 1,000 males
Age structure (0 to 25 years)       : 50% of India's current population

Currently, there are about 51 births in India every minute.

India's Population in 2001  : 1.02 billion
Population of India in 1947 : 350 million

Current Population of India 2011 (by state or union territory)

Rank  State or Union Territory      Population (2011 Census)  Density (per km²)  Sex Ratio
01    Uttar Pradesh                 199,581,477               828                908
02    Maharashtra                   112,372,972               365                946
03    Bihar                         103,804,637               1,102              916
04    West Bengal                   91,347,736                1,029              947
05    Andhra Pradesh                84,665,533                308                992
06    Madhya Pradesh                72,597,565                236                930
07    Tamil Nadu                    72,138,958                555                995
08    Rajasthan                     68,621,012                201                926
09    Karnataka                     61,130,704                319                968
10    Gujarat                       60,383,628                308                918
11    Odisha                        41,947,358                269                978
12    Kerala                        33,387,677                859                1,084
13    Jharkhand                     32,966,238                414                947
14    Assam                         31,169,272                397                954
15    Punjab                        27,704,236                550                893
16    Haryana                       25,353,081                573                877
17    Chhattisgarh                  25,540,196                189                991
18    Jammu and Kashmir             12,548,926                56                 883
19    Uttarakhand                   10,116,752                189                963
20    Himachal Pradesh              6,856,509                 123                974
21    Tripura                       3,671,032                 350                961
22    Meghalaya                     2,964,007                 132                986
23    Manipur                       2,721,756                 122                987
24    Nagaland                      1,980,602                 119                931
25    Goa                           1,457,723                 394                968
26    Arunachal Pradesh             1,382,611                 17                 920
27    Mizoram                       1,091,014                 52                 975
28    Sikkim                        607,688                   86                 889
UT1   Delhi                         16,753,235                9,340              866
UT2   Puducherry                    1,244,464                 2,598              1,038
UT3   Chandigarh                    1,054,686                 9,252              818
UT4   Andaman and Nicobar Islands   379,944                   46                 878
UT5   Dadra and Nagar Haveli        342,853                   698                775
UT6   Daman and Diu                 242,911                   2,169              618
UT7   Lakshadweep                   64,429                    2,013              946
      Total (India)                 1,210,193,422             382                940

#####################################################################

5. Write short notes on

A] One-tail & two-tail tests

What are the differences between one-tailed and two-tailed tests?

When you conduct a test of statistical significance, whether it is from a correlation, an ANOVA, a regression or some other kind of test, you are given a p-value somewhere in the output.

If your test statistic is symmetrically distributed, you can select one of three alternative hypotheses. Two of these correspond to one-tailed tests and one corresponds to a two-tailed test. However, the p-value presented is (almost always) for a two-tailed test. But how do you choose which test? Is the p-value appropriate for your test? And, if it is not, how can you calculate the correct p-value for your test given the p-value in your output?

What is a two-tailed test?

First let's start with the meaning of a two-tailed test. If you are using a significance level of 0.05, a two-tailed test allots half of your alpha to testing the statistical significance in one direction and half of your alpha to testing statistical significance in the other direction. This means that 0.025 is in each tail of the distribution of your test statistic. When using a two-tailed test, regardless of the direction of the relationship you hypothesize, you are testing for the possibility of the relationship in both directions. For example, we may wish to compare the mean of a sample to a given value x using a t-test. Our null hypothesis is that the mean is equal to x. A two-tailed test will test both whether the mean is significantly greater than x and whether the mean is significantly less than x. The mean is considered significantly different from x if the test statistic is in the top 2.5% or bottom 2.5% of its probability distribution, resulting in a p-value less than 0.05.

What is a one-tailed test?

Next, let's discuss the meaning of a one-tailed test. If you are using a significance level of 0.05, a one-tailed test allots all of your alpha to testing the statistical significance in the one direction of interest. This means that 0.05 is in one tail of the distribution of your test statistic. When using a one-tailed test, you are testing for the possibility of the relationship in one direction and completely disregarding the possibility of a relationship in the other direction. Let's return to our example comparing the mean of a sample to a given value x using a t-test. Our null hypothesis is that the mean is equal to x. A one-tailed test will test either whether the mean is significantly greater than x or whether the mean is significantly less than x, but not both.

Then, depending on the chosen tail, the mean is significantly greater than or less than x if the test statistic is in the top 5% or bottom 5% of its probability distribution, resulting in a p-value less than 0.05. The one-tailed test provides more power to detect an effect in one direction by not testing the effect in the other direction. A discussion of when this is an appropriate option follows.
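As a rough illustration of how the two kinds of p-value relate for a symmetric test statistic, here is a minimal Python sketch (the t value and degrees of freedom are made-up illustration numbers, and the scipy library is assumed to be available):

# Hypothetical t statistic and degrees of freedom, purely for illustration.
from scipy import stats

t_stat = -2.10
df = 30

p_lower = stats.t.cdf(t_stat, df)    # one-tailed p-value for Ha: "less than"
p_upper = stats.t.sf(t_stat, df)     # one-tailed p-value for Ha: "greater than"
p_two   = 2 * min(p_lower, p_upper)  # two-tailed p-value for Ha: "not equal"

print(p_lower, p_upper, p_two)
# Because the t distribution is symmetric about zero, the two one-tailed
# p-values sum to 1, and the two-tailed p-value is twice the smaller one.

This same relationship is what the Stata examples below exploit when a one-tailed p-value is derived from the reported two-tailed one.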

When is a one-tailed test appropriate?

Because the one-tailed test provides more power to detect an effect, you may be tempted to use a one-tailed test whenever you have a hypothesis about the direction of an effect. Before doing so, consider the consequences of missing an effect in the other direction. Imagine you have developed a new drug that you believe is an improvement over an existing drug. You wish to maximize your ability to detect the improvement, so you opt for a one-tailed test. In doing so, you fail to test for the possibility that the new drug is less effective than the existing drug. The consequences in this example are extreme, but they illustrate a danger of inappropriate use of a one-tailed test.

So when is a one-tailed test appropriate? If you consider the consequences of missing an effect in the untested direction and conclude that they are negligible and in no way irresponsible or unethical, then you can proceed with a one-tailed test. For example, imagine again that you have developed a new drug. It is cheaper than the existing drug and, you believe, no less effective. In testing this drug, you are only interested in testing whether it is less effective than the existing drug. You do not care if it is significantly more effective. You only wish to show that it is not less effective. In this scenario, a one-tailed test would be appropriate.

When is a one-tailed test NOT appropriate?

Choosing a one-tailed test for the sole purpose of attaining significance is not appropriate. Choosing a one-tailed test after running a two-tailed test that failed to reject the null hypothesis is not appropriate, no matter how "close" to significant the two-tailed test was. Using statistical tests inappropriately can lead to invalid results that are not replicable and highly questionable - a steep price to pay for a significance star in your results table!

Deriving a one-tailed test from two-tailed output

The default among statistical packages performing tests is to report two-tailed p-values. Because the most commonly used test statistic distributions (standard normal, Student's t) are symmetric about zero, most one-tailed p-values can be derived from the two-tailed p-values.

Below, we have the output from a two-sample t-test in Stata. The test is comparing the mean male score to the mean female score. The null hypothesis is that the difference in means is zero. The two-sided alternative is that the difference in means is not zero. There are two one-sided alternatives that one could opt to test instead: that the male score is higher than the female score (diff > 0) or that the female score is higher than the male score (diff < 0). In this instance, Stata presents results for all three alternatives. Under the headings Ha: diff < 0 and Ha: diff > 0 are the results for the one-tailed tests. In the middle, under the heading Ha: diff != 0 (which means that the difference is not equal to 0), are the results for the two-tailed test.

Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
    male |      91    50.12088    1.080274    10.30516    47.97473    52.26703
  female |     109    54.99083    .7790686    8.133715    53.44658    56.53507
---------+--------------------------------------------------------------------
combined |     200      52.775    .6702372    9.478586    51.45332    54.09668
---------+--------------------------------------------------------------------
    diff |           -4.869947    1.304191               -7.441835   -2.298059
------------------------------------------------------------------------------
Degrees of freedom: 198

            Ho: mean(male) - mean(female) = diff = 0

   Ha: diff < 0            Ha: diff != 0            Ha: diff > 0
     t = -3.7341             t = -3.7341              t = -3.7341
 P < t =  0.0001         P > |t| =  0.0002        P > t =  0.9999

Note that the test statistic, -3.7341, is the same for all of these tests. The two-tailed p-value is P > |t|. This can be rewritten as P(> 3.7341) + P(< -3.7341). Because the t-distribution is symmetric about zero, these two probabilities are equal: P > |t| = 2 * P(< -3.7341). Thus, we can see that the two-tailed p-value is twice the one-tailed p-value for the alternative hypothesis that (diff < 0). The other one-tailed alternative hypothesis has a p-value of P(> -3.7341) = 1 - P(< -3.7341) = 1 - 0.0001 = 0.9999. So, depending on the direction of the one-tailed hypothesis, its p-value is either 0.5*(two-tailed p-value) or 1 - 0.5*(two-tailed p-value) if the test statistic is symmetrically distributed about zero.

In this example, the two-tailed p-value suggests rejecting the null hypothesis of no difference. Had we opted for the one-tailed test of (diff > 0), we would fail to reject the null because of our choice of tails.

The output below is from a regression analysis in Stata. Unlike the example above, only the two-sided p-values are presented in this output.

      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  2,   197) =   46.58
       Model |  7363.62077     2  3681.81039           Prob > F      =  0.0000
    Residual |  15572.5742   197  79.0486001           R-squared     =  0.3210
-------------+------------------------------           Adj R-squared =  0.3142
       Total |   22936.195   199  115.257261           Root MSE      =  8.8909

------------------------------------------------------------------------------
       socst |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     science |   .2191144   .0820323     2.67   0.008     .0573403    .3808885
        math |   .4778911   .0866945     5.51   0.000     .3069228    .6488594
       _cons |   15.88534   3.850786     4.13   0.000     8.291287    23.47939
------------------------------------------------------------------------------

For each regression coefficient, the tested null hypothesis is that the coefficient is equal to zero. Thus, the one-tailed alternatives are that the coefficient is greater than zero and that the coefficient is less than zero. To get the p-value for the one-tailed test of the variable science having a coefficient greater than zero, you would divide the .008 by 2, yielding .004, because the effect is going in the predicted direction. This is P(> 2.67). If you had made your prediction in the other direction (the opposite direction of the model effect), the p-value would have been 1 - .004 = .996. This is P(< 2.67). For all three p-values, the test statistic is 2.67.

@@@@@@@@@@@@@@@@@@@@@@@@@

B] Standard normal distribution

What is a normal distribution?

The normal distribution is a pattern for the distribution of a set of data which follows a bell-shaped curve. This distribution is sometimes called the Gaussian distribution in honor of Carl Friedrich Gauss, a famous mathematician.

The bell-shaped curve has several properties:

The curve is concentrated in the center and decreases on either side. This means that the data has less of a tendency to produce unusually extreme values, compared to some other distributions.

The bell-shaped curve is symmetric. This tells you that the probability of a deviation from the mean is comparable in either direction.

When you want to describe probability for a continuous variable, you do so by describing a certain area. A large area implies a large probability and a small area implies a small probability. Some people don't like this, because it forces them to remember a bit of geometry (or in more complex situations, calculus). But the relationship between probability and area is also useful, because it provides a visual interpretation for probability. Here's an example of a bell shaped curve. This represents a normal distribution with a mean of 50 and a standard deviation of 10.

Notice that most of the area falls between 20 and 80. This means that values smaller than 20 or larger than 80 are extremely rare for this variable. Compare this to a normal distribution with a mean of 50 and a standard deviation of 2.
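To make the comparison concrete, here is a minimal sketch (assuming the scipy library is available; the two curves are the ones just described) that computes how much of each distribution's area lies between 20 and 80:

from scipy import stats

wide = stats.norm(loc=50, scale=10)     # mean 50, standard deviation 10
narrow = stats.norm(loc=50, scale=2)    # mean 50, standard deviation 2

print(wide.cdf(80) - wide.cdf(20))      # about 0.9973: values outside 20-80 are rare
print(narrow.cdf(80) - narrow.cdf(20))  # about 1.0000: essentially all of the area

With a standard deviation of 2, the same interval captures essentially the entire distribution, so the curve is far more tightly concentrated around the mean.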

Standard Normal Distribution

The standard normal distribution is a special case of the normal distribution. It is the distribution that occurs when a normal random variable has a mean of zero and a standard deviation of one.

The normal random variable of a standard normal distribution is called a standard score or a z-score. Every normal random variable X can be transformed into a z-score via the following equation:

z = (X - μ) / σ

where X is a normal random variable, μ is the mean of X, and σ is the standard deviation of X.

Standard Normal Distribution Table

A standard normal distribution table shows a cumulative probability associated with a particular z-score. Table rows show the whole number and tenths place of the z-score. Table columns show the hundredths place. The cumulative probability (often from minus infinity to the z-score) appears in the cell of the table. For example, a section of the standard normal table is reproduced below. To find the cumulative probability of a z-score equal to -1.31, cross-reference the row of the table containing -1.3 with the column containing 0.01. The table shows that the probability that a standard normal random variable will be less than -1.31 is 0.0951; that is, P(Z < -1.31) = 0.0951.

Z      0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
-3.0   0.0013  0.0013  0.0013  0.0012  0.0012  0.0011  0.0011  0.0011  0.0010  0.0010
...
-1.4   0.0808  0.0793  0.0778  0.0764  0.0749  0.0735  0.0722  0.0708  0.0694  0.0681
-1.3   0.0968  0.0951  0.0934  0.0918  0.0901  0.0885  0.0869  0.0853  0.0838  0.0823
-1.2   0.1151  0.1131  0.1112  0.1093  0.1075  0.1056  0.1038  0.1020  0.1003  0.0985
...
 3.0   0.9987  0.9987  0.9987  0.9988  0.9988  0.9989  0.9989  0.9989  0.9990  0.9990

Of course, you may not be interested in the probability that a standard normal random variable falls between minus infinity and a given value. You may want to know the probability that it lies between a given value and plus infinity. Or you may want to know the probability that a standard normal random variable lies between two given values. These probabilities are easy to compute from a normal distribution table. Here's how.

Find P(Z > a). The probability that a standard normal random variable (z) is greater than a given value (a) is easy to find. The table shows the P(Z < a). The P(Z > a) = 1 - P(Z < a). Suppose, for example, that we want to know the probability that a z-score will be greater than 3.00. From the table (see above), we find that P(Z < 3.00) = 0.9987. Therefore, P(Z > 3.00) = 1 - P(Z < 3.00) = 1 - 0.9987 = 0.0013.

Find P(a < Z < b). The probability that a standard normal random variable lies between two values is also easy to find. The P(a < Z < b) = P(Z < b) - P(Z < a). For example, suppose we want to know the probability that a z-score will be greater than -1.40 and less than -1.20. From the table (see above), we find that P(Z < -1.20) = 0.1151; and P(Z < -1.40) = 0.0808. Therefore, P(-1.40 < Z < -1.20) = P(Z < -1.20) - P(Z < -1.40) = 0.1151 - 0.0808 = 0.0343.
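For readers who prefer software to tables, a minimal sketch (assuming the scipy library) reproduces the three lookups worked out above:

from scipy.stats import norm

print(norm.cdf(-1.31))                    # about 0.0951, i.e. P(Z < -1.31)
print(1 - norm.cdf(3.00))                 # about 0.0013, i.e. P(Z > 3.00)
print(norm.cdf(-1.20) - norm.cdf(-1.40))  # about 0.0343, i.e. P(-1.40 < Z < -1.20)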

In school or on the Advanced Placement Statistics Exam, you may be called upon to use or interpret standard normal distribution tables. Standard normal tables are commonly found in the appendices of most statistics texts.

The Normal Distribution as a Model for Measurements

Often, phenomena in the real world follow a normal (or near-normal) distribution. This allows researchers to use the normal distribution as a model for assessing probabilities associated with real-world phenomena. Typically, the analysis involves two steps.

Transform raw data. Usually, the raw data are not in the form of z-scores. They need to be transformed into z-scores, using the transformation equation presented earlier: z = (X - μ) / σ.

Find probability. Once the data have been transformed into z-scores, you can use standard normal distribution tables, online calculators (e.g., Stat Trek's free normal distribution calculator), or handheld graphing calculators to find probabilities associated with the z-scores.

The problem in the next section demonstrates the use of the normal distribution as a model for measurement.

Test Your Understanding of This Lesson

Problem 1

Molly earned a score of 940 on a national achievement test. The mean test score was 850 with a standard deviation of 100. What proportion of students had a higher score than Molly? (Assume that test scores are normally distributed.)

(A) 0.10
(B) 0.18
(C) 0.50
(D) 0.82
(E) 0.90

Solution

The correct answer is B. As part of the solution to this problem, we assume that test scores are normally distributed. In this way, we use the normal distribution as a model for measurement. Given an assumption of normality, the solution involves three steps.

First, we transform Molly's test score into a z-score, using the z-score transformation equation: z = (X - μ) / σ = (940 - 850) / 100 = 0.90.

Then, using an online calculator (e.g., Stat Trek's free normal distribution calculator), a handheld graphing calculator, or the standard normal distribution table, we find the cumulative probability associated with the z-score. In this case, we find P(Z < 0.90) = 0.8159.

Therefore, the P(Z > 0.90) = 1 - P(Z < 0.90) = 1 - 0.8159 = 0.1841. Thus, we estimate that 18.41 percent of the students tested had a higher score than Molly.
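A minimal sketch (assuming the scipy library) reproduces this answer: transform the raw score to a z-score, then take the upper-tail probability.

from scipy.stats import norm

z = (940 - 850) / 100     # z-score transformation gives z = 0.90
print(1 - norm.cdf(z))    # about 0.1841: the proportion scoring above Molly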

@@@@@@@@@@@@@@@@@@@@@

C] Bayes' Theorem

Bayes' Theorem is a theorem of probability theory originally stated by the Reverend Thomas Bayes. It can be seen as a way of understanding how the probability that a theory is true is affected by a new piece of evidence. It has been used in a wide variety of contexts, ranging from marine biology to the development of "Bayesian" spam blockers for email systems. In the philosophy of science, it has been used to try to clarify the relationship between theory and evidence. Many insights in the philosophy of science involving confirmation, falsification, the relation between science and pseudoscience, and other topics can be made more precise, and sometimes extended or corrected, by using Bayes' Theorem. The following will introduce the theorem and its use. Begin by having a look at the theorem, displayed below. Then we'll look at the notation and terminology involved.

P(T|E) = P(E|T) P(T) / [ P(E|T) P(T) + P(E|~T) P(~T) ]

In this formula, T stands for a theory or hypothesis that we are interested in testing, and E represents a new piece of evidence that seems to confirm or disconfirm the theory. For any proposition S, we will use P(S) to stand for our degree of belief, or "subjective probability," that S is true. In particular, P(T) represents our best estimate of the probability of the theory we are considering, prior to consideration of the new piece of evidence. It is known as the prior probability of T. What we want to discover is the probability that T is true supposing that our new piece of evidence is true. This is a conditional probability, the probability that one proposition is true provided that another proposition is true. For instance, suppose you draw a card from a deck of 52, without showing it to me. Assuming the deck has been well shuffled, I should believe that the probability that the card is a jack, P(J), is 4/52, or 1/13, since there are four jacks in the deck. But now suppose you tell me that the card is a face card. The probability that the card is a jack, given that it is a face card, is 4/12, or 1/3, since there are 12 face cards in the deck. We represent this conditional probability as P(J|F), meaning the probability that the card is a jack given that it is a face card. (We don't need to take conditional probability as a primitive notion; we can define it in terms of absolute probabilities: P(A|B) = P(A and B) / P(B), that is, the probability that A and B are both true divided by the probability that B is true.)
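As a tiny illustration of that definitional formula, here is a minimal sketch (plain Python, exact fractions) applied to the jack and face-card example above:

from fractions import Fraction

p_jack_and_face = Fraction(4, 52)    # every jack is also a face card
p_face = Fraction(12, 52)            # 12 face cards in the deck

print(p_jack_and_face / p_face)      # 1/3, i.e. P(J|F) = P(J and F) / P(F)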

Using this idea of conditional probability to express what we want to use Bayes' Theorem to discover, we say that P(T|E), the probability that T is true given that E is true, is the posterior probability of T. The idea is that P(T|E) represents the probability assigned to T after taking into account the new piece of evidence, E. To calculate this we need, in addition to the prior probability P(T), two further conditional probabilities indicating how probable our piece of evidence is depending on whether our theory is or is not true. We can represent these as P(E|T) and P(E|~T), where ~T is the negation of T, i.e. the proposition that T is false.

Simple example

Suppose there is a school with 60% boys and 40% girls as its students. The female students wear trousers or skirts in equal numbers; the boys all wear trousers. An observer sees a (random) student from a distance, and all the observer can see is that this student is wearing trousers. What is the probability this student is a girl? The correct answer can be computed using Bayes' theorem. The event G is that the student observed is a girl, and the event T is that the student observed is wearing trousers. To compute P(G|T), we first need to know:

P(T|G), or the probability of the student wearing trousers given that the student is a girl. Since girls are as likely to wear skirts as trousers, this is 0.5.

P(G), or the probability that the student is a girl regardless of any other information. Since the observer sees a random student, meaning that all students have the same probability of being observed, and the fraction of girls among the students is 40%, this probability equals 0.4.

P(T), or the probability of a (randomly selected) student wearing trousers regardless of any other information. Since half of the girls and all of the boys wear trousers, this is 0.5 * 0.4 + 1.0 * 0.6 = 0.8.

Given all this information, the probability of the observer having spotted a girl given that the observed student is wearing trousers can be computed by substituting these values in the formula:

P(G|T) = P(T|G) P(G) / P(T) = (0.5 * 0.4) / 0.8 = 0.25

Another, essentially equivalent way of obtaining the same result is as follows. Assume, for concreteness, that there are 100 students, 60 boys and 40 girls. Among these, 60 boys and 20 girls wear trousers. All together there are 80 trouser-wearers, of which 20 are girls. Therefore the chance that a random trouser-wearer is a girl equals 20/80 = 0.25. Put in terms of Bayes' theorem, the probability of a student being a girl is 40/100, and the probability that any given girl will wear trousers is 1/2. The product of these two is 20/100; but we know the student is wearing trousers, so one deducts the 20 students not wearing trousers and then calculates a probability of (20/100)/(80/100), or 20/80.

It is often helpful when calculating conditional probabilities to create a simple table containing the number of occurrences of each outcome, or the relative frequencies of each outcome, for each of the independent variables. The table below illustrates the use of this method for the above girl-or-boy example.

           Girls   Boys   Total
Trousers   20      60     80
Skirts     20      0      20
Total      40      60     100
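A minimal sketch (plain Python; the numbers are exactly those used above) confirms the same answer both ways, via Bayes' theorem and via the counting table:

p_g = 0.4                       # P(G): 40% of the students are girls
p_t_given_g = 0.5               # P(T|G): half of the girls wear trousers
p_t = 0.5 * 0.4 + 1.0 * 0.6     # P(T): trouser-wearing girls plus trouser-wearing boys

print(p_t_given_g * p_g / p_t)  # 0.25, via Bayes' theorem
print(20 / 80)                  # 0.25, via the count of trouser-wearers who are girls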

When to Apply Bayes' Theorem

Part of the challenge in applying Bayes' theorem involves recognizing the types of problems that warrant its use. You should consider Bayes' theorem when the following conditions exist.

The sample space is partitioned into a set of mutually exclusive events { A1, A2, . . . , An }.

Within the sample space, there exists an event B, for which P(B) > 0.

The analytical goal is to compute a conditional probability of the form P( Ak | B ).

You know at least one of the two sets of probabilities described below.

P( Ak ∩ B ) for each Ak
P( Ak ) and P( B | Ak ) for each Ak

Bayes Rule Calculator

Use the Bayes Rule Calculator to compute conditional probability, when Bayes' theorem can be applied. The calculator is free and easy to use; it can be found under the Stat Tools tab on the Stat Trek website.

Sample Problem

Bayes' theorem can be best understood through an example. This section presents an example that demonstrates how Bayes' theorem can be applied effectively to solve statistical problems.

Example 1

Marie is getting married tomorrow, at an outdoor ceremony in the desert. In recent years, it has rained only 5 days each year. Unfortunately, the weatherman has predicted rain for tomorrow.

When it actually rains, the weatherman correctly forecasts rain 90% of the time. When it doesn't rain, he incorrectly forecasts rain 10% of the time. What is the probability that it will rain on the day of Marie's wedding? Solution: The sample space is defined by two mutually-exclusive events - it rains or it does not rain. Additionally, a third event occurs when the weatherman predicts rain. Notation for these events appears below.

Event A1. It rains on Marie's wedding.
Event A2. It does not rain on Marie's wedding.

Event B. The weatherman predicts rain. In terms of probabilities, we know the following:

P( A1 ) = 5/365 = 0.0136985 [It rains 5 days out of the year.]
P( A2 ) = 360/365 = 0.9863014 [It does not rain 360 days out of the year.]
P( B | A1 ) = 0.9 [When it rains, the weatherman predicts rain 90% of the time.]
P( B | A2 ) = 0.1 [When it does not rain, the weatherman predicts rain 10% of the time.]
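Before working through the algebra, here is a minimal sketch (plain Python, using the four probabilities listed above) that cross-checks the calculation that follows:

p_rain = 5 / 365               # P(A1)
p_dry = 360 / 365              # P(A2)
p_forecast_given_rain = 0.9    # P(B | A1)
p_forecast_given_dry = 0.1     # P(B | A2)

numerator = p_rain * p_forecast_given_rain
posterior = numerator / (numerator + p_dry * p_forecast_given_dry)
print(posterior)               # about 0.111: it rains on roughly 11% of forecast-rain days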

We want to know P( A1 | B ), the probability it will rain on the day of Marie's wedding, given a forecast for rain by the weatherman. The answer can be determined from Bayes' theorem, as shown below.

P( A1 | B ) = P( A1 ) P( B | A1 ) / [ P( A1 ) P( B | A1 ) + P( A2 ) P( B | A2 ) ]

P( A1 | B ) = (0.014)(0.9) / [ (0.014)(0.9) + (0.986)(0.1) ]

P( A1 | B ) = 0.111

Note the somewhat unintuitive result. Even when the weatherman predicts rain, it rains only about 11% of the time. Despite the weatherman's gloomy prediction, there is a good chance that Marie will not get rained on at her wedding.

####################################################
