
1. Describe the difference between a measure we consider a statistic and a measure we consider a parameter.
   a. A statistic describes something about a sample. A parameter describes something about a population.
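A minimal Python sketch of the distinction, using made-up data: the population here is simply the integers 1 through 100 (an assumption for illustration, not data from the course), and the variable names are hypothetical. The mean of the whole population is a parameter; the mean of a random sample drawn from it is a statistic that estimates that parameter.

```python
import random
import statistics

# Hypothetical population: the integers 1..100 (illustrative data only).
population = list(range(1, 101))

# Parameter: computed from every member of the population.
population_mean = statistics.mean(population)

# Statistic: computed from a sample drawn from the population.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
sample = rng.sample(population, 10)
sample_mean = statistics.mean(sample)

print("parameter (population mean):", population_mean)
print("statistic (sample mean):", sample_mean)
```

Changing the seed changes the statistic but never the parameter, which is the point of the distinction.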

2. Give the meaning of each symbol used in class.
   a. *symbols chart*
3. Describe what areas you should consider when evaluating a statistical study.
   a. In class:
      i. That it's voluntary; if not, it could skew the data
      ii. Sample size
      iii. How random the sample is
      iv. Loaded questions
      v. Measurement area
   b. Text page 4:
      i. Context of the data
      ii. Source of the data
      iii. Sampling method
      iv. Conclusions
      v. Practical implications

4. Describe what blinding and double blinding is in a statistical study.
   a. Blinding and double blinding are techniques in which some of the people involved in the study are prevented from knowing certain information that might lead to conscious or subconscious bias on their part, invalidating the results.
   b. Blinding example: If asking consumers to compare the tastes of different brands of food, the identity of each product should be concealed.
   c. Double blinding example: When evaluating the effectiveness of a drug, both the patients and the doctors may be kept in the dark about the dosage being applied in each case.

5. Describe what confounding data is in a statistical study.
   a. In class:
      i. Data you didn't account for that messes up the study
   b. Text page 32:
      i. Confounding occurs in an experiment when you are not able to distinguish among the effects of different factors
   c. Risk factors that affect the results of a study
   d. Example: In a recent controversy over obesity, the CDC published a study indicating that slightly overweight people live longer than thin people. The Harvard School of Public Health and the American Cancer Society later criticized the results, noting that more of the thin people were sick (and were thin because they were sick) than the overweight people.
      i. In other words, the confounding variable was illness: some of the thin people who died were thin only because they were sick.
6. Describe what complementary events are.
   a. The complement of event A consists of all outcomes in which event A does not occur. Its probability is the probability of A not occurring: P(not A) = 1 - P(A).
7. Describe the Addition Rule and when the rule should be used and how it should be used.
   a. Text page 152:
      i. Used for finding the probability that either event A occurs or event B occurs (or they both occur) as the single outcome of a procedure.
      ii. P(A or B)
      iii. Key word is "or"
      iv. Formal addition rule: P(A or B) = P(A) + P(B) - P(A and B)
      v. P(A and B) is subtracted so that outcomes belonging to both A and B are not counted twice. If A and B are disjoint, then P(A and B) = 0 because they cannot both occur at the same time, so nothing needs to be subtracted.
      vi. Used when you are trying to find the probability of one event or another occurring in a single trial.
      vii. When finding the probability that event A occurs or event B occurs, add the number of ways A can occur and the number of ways B can occur, counting in such a way that no outcome is counted more than once. P(A or B) is equal to this sum divided by the total number of outcomes in the sample space.
   b. Example: If a card is drawn randomly from a deck of ordinary playing cards, what is the probability of getting a spade or an ace?
      i. A = the event that the card is a spade
      ii. B = the event that the card is an ace
      iii. Of the 52 cards in a deck, 13 are spades, so P(A) = 13/52
      iv. There are 4 aces, so P(B) = 4/52
      v. There is 1 ace that is also a spade, so P(A and B) = 1/52
      vi. P(A or B) = 13/52 + 4/52 - 1/52 = 16/52 = 4/13, or 0.308 (rounded to 3 significant digits)
8. Describe the Multiplication Rule and when the rule should be used and how it should be used.
   a. Text page 159:
      i. Formal multiplication rule: P(A and B) = P(A) × P(B given A has occurred)
      ii. Used for finding the probability that event A occurs in a first trial and event B occurs in a second trial.
      iii. If the outcome of the first event A somehow affects the probability of the second event B, it is important to adjust the probability of B to reflect the occurrence of event A.
      iv. Key word is "and"
      v. When finding the probability that event A occurs in one trial and event B occurs in the next trial, multiply the probability of event A by the probability of event B, but be sure that the probability of event B takes into account the previous occurrence of event A.
      vi. Two events A and B are independent if the occurrence of one does not affect the probability of the occurrence of the other. (Several events are similarly independent if the occurrence of any does not affect the probabilities of the occurrence of the others.) If A and B are not independent, they are said to be dependent.
      vii. If they are independent, then P(A and B) = P(A) × P(B)
      viii. If they are dependent, then P(A and B) = P(A) × P(B given that A has occurred)
   b. Stattrek:
      i. Use when you have two events from the same sample space and want to know the probability that both events occur.
      ii. Example: A bag contains 6 red marbles and 4 black marbles. Two are drawn with replacement. What is the probability that both of the marbles are black?
         1. A = event that the first marble is black
         2. B = event that the second marble is black
         3. There are a total of 10 marbles in the bag
         4. 4 of them are black, so P(A) = 4/10
         5. After the selection, we put the marble back in the bag. So there are still 10 marbles in the bag, 4 of which are black; therefore, P(B) = 4/10
            a. Because we are doing this experiment with replacement, A and B are independent, so no adjustment is necessary for P(B). If it were done without replacement, then after A there would be only 9 marbles left in the bag, 3 of which would be black. So P(B given A has occurred) = 3/9 instead of 4/10.
         6. P(A and B) = 4/10 × 4/10 = 16/100 = 4/25
            a. If dependent: P(A and B) = 4/10 × 3/9 = 12/90 = 2/15
9. Describe the Probability of at least one and when the rule should be used and how it should be used.
   a. Text page 171:
      i. P(at least one) = 1 - P(none)
      ii. At least one = one or more
      iii. The complement of getting at least one item of a particular type is that you get no items of that type.
      iv. To find the probability of at least one of some event:
         1. Identify the complement of the event (getting none of the items being considered) and find its probability.
         2. Subtract this probability from 1.
         3. The result is the probability of getting at least one.
      v. Example: Find the probability of a couple getting at least 1 girl in 3 births.
         1. A = at least 1 of the 3 children is a girl
         2. Complement of A = all 3 are boys = boy and boy and boy
            a. P(boy and boy and boy) = 1/2 × 1/2 × 1/2 = 1/8 (assuming boys and girls are equally likely and the births are independent)
         3. P(A) = 1 - 1/8 = 7/8
         4. So there is a 7/8 probability of getting at least 1 girl in 3 births
10. Describe Conditional Probability and when the rule should be used and how it should be used.
   a. Text page 173:
      i. A conditional probability of an event is used when the probability is affected by the knowledge of other circumstances.
      ii. A conditional probability of an event is a probability obtained with the additional information that some other event has already occurred. It can be found by dividing the probability of events A and B both occurring by the probability of event A.
      iii. A conditional probability means that the probability of some event is dependent on the outcome of some previous event.
      iv. Formula: P(B given A has occurred) = P(A and B) / P(A)
      v. Used when events are dependent. To find the probability of event B occurring given that event A has occurred, you must adjust P(B) to reflect the result of A.
   b. Example: A bag contains 6 red marbles and 4 black marbles. Two are drawn without replacement. What is the probability of the second marble being black given that the first one was also black?
      1. A = event that the first marble is black
      2. B = event that the second marble is black
      3. 4 of the 10 marbles are black, so P(A) = 4/10
      4. If we assume that A has already occurred, then there are now a total of 9 marbles in the bag, 3 of which are black. So P(B given A has occurred) = 3/9 = 1/3
      5. Check with the formula: P(A and B) = 4/10 × 3/9 = 12/90, so P(B given A has occurred) = (12/90) / (4/10) = 3/9 = 1/3
      6. This is different from the multiplication rule for dependent events because the question is not asking for the probability of both event A and event B occurring. It is asking for the probability of event B occurring assuming that event A has already occurred.
11. Define z-score. Describe two ways we used a z-score during this semester.
   a. Text page 254:
      i. The number of standard deviations that a given data point lies above or below the mean, measured along the horizontal scale of the standard normal distribution: z = (x - mean) / standard deviation.
   b. In class:
      i. Z-scores standardize data, thereby allowing us to compare values from two different samples
      ii. We also used z-scores to create confidence intervals.
12. Describe the Central Limit Theorem and why it is important to the practice of inferential statistics.
   a. Text page 287:
      i. The central limit theorem tells us that for a population with any distribution, the distribution of the sample means approaches a normal distribution as the sample size increases.
      ii. In other words, if the sample size is large enough, the distribution of sample means can be approximated by a normal distribution, even if the original population is not normally distributed.
      iii. The distribution of sample means gets closer to a normal distribution as the sample size becomes larger.
      iv. It is important to inferential statistics because it means that if our samples are large enough, the distribution of their means will be approximately normal. We can then use this normal distribution to make reasonably accurate inferences about the population mean even if the population distribution is not normal. This theorem is essential for many hypothesis tests.
13. Describe confidence level, margin of error and confidence interval and how they are used to estimate a population proportion or population mean.
   a. Text page 328:
      i. A confidence interval is a range of values used to estimate the true value of a population parameter.
         1. A confidence interval is the range of values that we believe contains the true value of whatever population parameter we are trying to estimate.
      ii. The confidence level gives us the success rate of the procedure used to construct the confidence interval.
         1. It is often expressed as the probability or area 1 - alpha, where alpha is the complement of the confidence level.
         2. So for a 0.95 confidence level, alpha = 0.05.
      iii. When data from a simple random sample are used to estimate a population proportion, the margin of error is the maximum likely difference (with probability 1 - alpha, such as 0.95) between the observed sample proportion and the true value of the population proportion.
         1. The margin of error can be found by multiplying the critical value by the standard deviation of sample proportions.
      iv. Example: The 0.95 confidence interval estimate of the population proportion is 0.667 < p < 0.723
         1. This is saying that we are 95% (confidence level) confident that the interval from 0.667 to 0.723 (confidence interval) actually does contain the true value of the population proportion.
         2. These interval values are found by finding the maximum likely difference between the sample proportion and the true value of the population proportion, called the margin of error, then subtracting it from the sample proportion (which we are using as our best point estimate of the true population proportion) to obtain the lowest likely value of the population proportion. We do the same thing to obtain the highest likely value, except that we add the margin of error to the sample proportion instead of subtracting it.
      v. So basically, when we estimate a population proportion or mean, we want to make sure we are as accurate as possible. In order to do this we create a range of values that we believe contains the actual population parameter. This range is called the confidence interval. To make a confidence interval we have to find the maximum likely difference between the observed sample statistic and the true value of the population parameter. This is called the margin of error. It is essentially the largest distance the sample statistic could be from the true population parameter without being unlikely. The confidence level describes the level of confidence that we have in the process used to determine the interval. So for a 0.95 confidence level, we are saying that we are 95% confident that this interval actually does contain the true value of the population parameter.
14. Describe what happens to the size of the Confidence Interval when the Confidence Level is increased.
   a. As the confidence level goes up, the confidence interval becomes wider. This is because as we increase the percentage of confidence we decrease alpha, or the probability of an unlikely event, which means our confidence interval must widen to cover the extra space.
   b. When we increase the confidence level, what we are saying is that we are more confident that the interval actually does contain the true value of the population parameter. The only way we can be more confident of this is if we widen the interval to include more potential values.
15. Describe the goal of hypothesis testing and what it means when we reject the null hypothesis.
   a. The goal of hypothesis testing is to assess the validity of some claim about a property of a population.
   b. When we test a hypothesis we are testing a claim that someone has made about some property of a population, which is done by identifying a null and an alternate hypothesis.
      i. Say we were testing a drug company's claim that a new medication lowers blood pressure.
         1. This claim is essentially saying that the medication has an effect, specifically that it lowers the average blood pressure of those who use it. It becomes the alternate hypothesis.
         2. The null hypothesis would be the opposite of this, that the drug has no effect at all.
         3. To test this claim, we have to find the probability of getting a result at least as extreme as the result of whatever study they are using to back up their claim, assuming that the null hypothesis is true.
         4. If this probability is low, then results like those of the company's study would be unlikely if the null hypothesis were true.
         5. Basically this means that if the drug doesn't have any effect (the null) there is a very low chance of these results occurring. It is far more likely that the drug does have an effect, so we reject the null hypothesis and support the drug company's claim.
16. Define P-value.
   a. Text page 400:
      i. The P-value is the probability of getting a value of the test statistic that is at least as extreme as the one representing the sample data, assuming that the null hypothesis is true.
      ii. P-values can be found after finding the area beyond the test statistic.
17. Describe how hypothesis testing and P-value are related.
   a. Text page 400:
      i. The null hypothesis is rejected if the P-value is very small, such as 0.05 or less.
   b. Example: A drug company is claiming that their new medicine lowers blood pressure. This claim is based on the results of whatever studies they conducted.
      i. If we were to test this claim, what we would essentially be doing is finding the probability of getting the same sample results as, or results more extreme than, the company did, under the assumption that the drug has no effect. If this probability is very small then we would support the claim; likewise, if it were large we would not support the claim. In statistics we call this probability the P-value. It is an essential part of hypothesis testing because it allows us to distinguish results that are likely to occur from those that are unlikely to occur.
18. Describe what is meant by correlation and causality.
   a. Text page 518:
      i. A correlation exists between two variables when the values of one variable are somehow associated with the values of the other variable.
         1. So if one variable tends to change when the other changes, there may be a correlation between the two. If they tend to increase together (or decrease together), we say the two have a positive correlation. If one tends to increase as the other decreases, they are said to have a negative correlation.
      ii. Causality is when one variable directly affects another.
      iii. A correlation is basically just the association between two variables. This does not mean that one causes the other, just that there seems to be a relationship between the two. Causality, on the other hand, is when one variable actually does cause a change in another.
19. Describe what the correlation coefficient measures.
   a. Text page 518:
      i. The linear correlation coefficient measures the strength of the linear correlation between the paired quantitative x and y values in a sample.
      ii. A measure of how well trends in the predicted values follow trends in past actual values.
      iii. It is a measure of how well the predicted values from a forecast model fit with the real-life data.
      iv. The linear correlation coefficient measures the strength and direction of a linear association. Its value ranges from -1 to 1.
20. Describe linear regression.
   a. Text page 536:
      i. For linear correlations, we can identify an equation that best fits the data, and we can use that equation to predict the value of one variable given the value of the other variable.
      ii. Linear regression is a method of estimating the conditional expected value of one variable, y, given the values of some other variable, x.
      iii. The variable y is called the dependent variable. The other variable, x, is called the independent variable.
      iv. Linear regression equation: Y = a + bX
         1. b = the slope of the line
         2. a = the intercept (the value of Y when X = 0)
21. Describe standard deviation and what it measures.
   a. Standard deviation measures how much the data values typically differ from the average (the mean).
   b. A measure of distance along the horizontal axis used to describe how far a given data point is from the mean.
   c. Example: The average IQ score is 100, with a standard deviation of 15. So the range that is considered normal, within one standard deviation of the average, runs from 85 (15 below the average) to 115 (15 above the average).
22. What have you learned from Stats? Why is it relevant? What value does statistics have in your everyday life?
   a. For me personally, since I am an international business major, statistics plays a very large role in my career path. It is impossible for anyone to predict the future with 100% accuracy, but with statistics we can make reasonable assumptions about future economic trends. Without this ability, the entire structure of the business world would come crashing down. This is true of many different industries. The medical field, for example, would not be able to conduct studies on its drugs and therefore would have no idea what they actually did or how safe they were.
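As a quick check of the worked probability examples in questions 7 through 10, the arithmetic can be sketched in Python with exact fractions. The variable names are hypothetical; the numbers are the ones from the card, marble, and births examples, and the births calculation assumes boys and girls are equally likely.

```python
from fractions import Fraction

# Addition rule (question 7): P(A or B) = P(A) + P(B) - P(A and B)
p_spade = Fraction(13, 52)
p_ace = Fraction(4, 52)
p_spade_and_ace = Fraction(1, 52)
p_spade_or_ace = p_spade + p_ace - p_spade_and_ace

# Multiplication rule (question 8): P(A and B) = P(A) * P(B given A)
p_first_black = Fraction(4, 10)
p_both_black_with_replacement = p_first_black * Fraction(4, 10)
p_both_black_without_replacement = p_first_black * Fraction(3, 9)

# At least one (question 9): P(at least one) = 1 - P(none)
p_all_boys = Fraction(1, 2) ** 3  # assumes P(boy) = 1/2 for each birth
p_at_least_one_girl = 1 - p_all_boys

# Conditional probability (question 10): P(B given A) = P(A and B) / P(A)
p_second_black_given_first = p_both_black_without_replacement / p_first_black

print("spade or ace:", p_spade_or_ace)
print("both black, with replacement:", p_both_black_with_replacement)
print("both black, without replacement:", p_both_black_without_replacement)
print("at least one girl in 3 births:", p_at_least_one_girl)
print("second black given first black:", p_second_black_given_first)
```

The conditional probability comes out to 3/9 = 1/3, which matches counting the marbles directly: once a black marble is removed, 3 of the remaining 9 are black.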

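The relationship described in questions 13 and 14 (a higher confidence level gives a wider interval) can also be sketched numerically. This is only an illustration under stated assumptions: the sample proportion 0.6 and sample size 500 are made up, the function name is hypothetical, and the interval uses the usual normal approximation for a proportion (critical z value times the standard deviation of sample proportions).

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(p_hat, n, confidence_level):
    """Return (low, high) for a proportion using the normal approximation."""
    alpha = 1 - confidence_level
    # Critical value: the z-score with area alpha/2 in the upper tail.
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    # Margin of error = critical value * standard deviation of sample proportions.
    margin_of_error = z_critical * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin_of_error, p_hat + margin_of_error


# Hypothetical sample: 60% of 500 respondents (illustrative values only).
p_hat, n = 0.6, 500
for level in (0.90, 0.95, 0.99):
    low, high = proportion_confidence_interval(p_hat, n, level)
    print(f"{level:.2f} level: {low:.3f} < p < {high:.3f} (width {high - low:.3f})")
```

Each interval is centered on the sample proportion; only the margin of error changes as the confidence level changes.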