
In the last handout the subject of statistics was defined to be the art and science of making inferences about a population on the basis of a sample. But what sort of inferences about a population does one seek to make? One often seeks to estimate the population mean and the population variance. So the point of taking a sample and computing a sample mean and a sample variance is to estimate the population mean and population variance, respectively.

The foregoing discussion is getting ahead of itself, in the sense that it presumes that the terms population mean and population variance have already been defined, which they haven't. There are two cases to consider: the case in which the population is finite, and the case in which the population is infinite.

Finite versus Infinite Populations

In the case of a finite population, say of size N, the definition of a population mean is the obvious analogy to the definition of a sample mean given in Handout 1: the population mean is

$$\mu = \frac{1}{N} \sum_{i=1}^{N} x_i .$$

This formula cannot be applied as it stands when the population is infinite. Infinite populations are not just the province of pure mathematicians spinning abstract theories for the sake of abstraction itself; the case of an infinite population is quite common in scientific settings. Suppose one is investigating the wing span (in millimeters) of the progeny resulting from the cross breeding of two strains of fruit fly. What is the population under investigation? The fruit flies bred by the investigator? The fruit flies bred by all researchers in the US that year? The fruit flies bred by anyone, anyplace, to date? If it's a truly general scientific investigation, the population of interest is the set of all wing spans of all potential fruit flies that have been or could be bred, anytime or anyplace; which is, by the nature of the concept, an infinite set. And any scientific investigation of sufficient generality to be of significant enduring interest is likely to be about an infinite population.

Developing a Definition for the Case of Discrete Random Variables

We will consider the problem of defining the population mean in the special case of a discrete random variable: that is, a random variable that takes on a finite number of different values. (For example, consider the random variable X that takes on the value 1 if a coin toss comes up heads, and the value 0 if the coin toss comes up tails. This is a discrete random variable since there are only two different values the random variable takes on; but one can, at least conceptually, keep tossing the coin from here to eternity, and hence the population of interest is infinite.)

The goal is to develop a definition which makes sense whether the population is finite or infinite. We're not going to just plop down an unmotivated definition! The motivation will be provided by a consideration of how to efficiently compute a sample mean when one has so-called grouped data. Keep in mind that the computational aspect is not what is of primary importance here (although some textbook authors make a ludicrously big deal out of dealing with grouped data); the significance lies in the motivation grouped data provide for defining the population mean and population variance.

Let's start by considering a particular example. Consider the experiment of randomly selecting a household in Fairfax County. The random variable of interest is the response to an inquiry as to the number of adults over 18 years of age residing in the selected household. Repeating this experiment 1000 times gives the data set given in the following table:

Number of Persons Over 18    Number of Households
1                            223
2                            437
3                            150
4                             86
5                             62
6                             22
7                             15
8                              5

The sample mean may be computed as:

$$\bar{x} = \frac{1(223) + 2(437) + 3(150) + 4(86) + 5(62) + 6(22) + 7(15) + 8(5)}{1000} = \frac{2478}{1000} = 2.478 .$$
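The grouped computation above can be checked with a few lines of Python. This is only a sketch: the dictionary layout and the names `households` and `grouped_mean` are illustrative choices, not part of the handout. It also expands the table back into 1000 individual observations to show that the grouped shortcut gives the same answer as averaging the raw data.

```python
# Household counts from the table: number of adults -> number of households.
households = {1: 223, 2: 437, 3: 150, 4: 86, 5: 62, 6: 22, 7: 15, 8: 5}

def grouped_mean(freq):
    """Sample mean of grouped data: sum of value * frequency, divided by n."""
    n = sum(freq.values())
    return sum(x * f for x, f in freq.items()) / n

# Expand the table into one entry per household and average the long way.
raw = [x for x, f in households.items() for _ in range(f)]
raw_mean = sum(raw) / len(raw)

n_total = sum(households.values())
mean_grouped = grouped_mean(households)

print(n_total)                    # 1000
print(mean_grouped)               # 2.478
print(raw_mean == mean_grouped)   # True
```

The grouped version touches 8 table rows instead of 1000 observations, which is the whole computational point of grouping.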

To generalize the computations in this example, say that the discrete random variable X takes on the value $x_i$ with frequency $f_i$. Then the sample mean is computed by

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{k} x_i f_i$$

where k is the number of different values of $x_i$ in the sample and n is the total number of observations.

Note, just doing some algebraic rewriting, that the last formula for $\bar{x}$ can be rewritten as:

$$\bar{x} = \sum_{i=1}^{k} x_i \frac{f_i}{n} .$$

The ratio $f_i / n$ is an approximation to $P(X = x_i)$. To keep the notation simple, let's agree to write $p_i$ for $P(X = x_i)$ and $\hat{p}_i$ for the approximation $f_i / n$. Then one may write the formula for $\bar{x}$ as

$$\bar{x} = \sum_{i=1}^{k} x_i \hat{p}_i .$$

$\bar{x}$ is the sample mean. There should be an analogous average or mean for the entire population. The last formula for $\bar{x}$ motivates the following definition.

Definition.

Given a discrete random variable X that takes on k different values $x_1, x_2, \ldots, x_k$, the expected value of X, or population mean $\mu_X$, is

$$E[X] = \mu_X = x_1 p_1 + x_2 p_2 + \cdots + x_k p_k$$

where $p_i$ is $P(X = x_i)$ for $i = 1, \ldots, k$.

Remarks

(1) The notations $E[X]$ and $\mu_X$ are two different views of the same thing: using the notation $\mu_X$ one is looking at the population mean as a property of a set of values; using the notation $E[X]$ one is thinking of the population mean as a property of the random variable that generates the set. Both ways of looking at things have advantages in different instances.

(2) Remark on notation: lower case Latin characters generally denote characteristics of a sample, e.g. $\bar{x}$ and $s$; lower case Greek characters generally denote population characteristics, e.g. $\mu$ and $\sigma$.

(3) It's a good idea to get used to using summation notation, so note that

$$\mu_X = \sum_{i=1}^{k} x_i p_i .$$

Computational Example

Consider a random variable Y which takes on values 2, 4 and 8 with probabilities 1/2, 1/4 and 1/4, respectively. Then

$$\mu_Y = E[Y] = (1/2)(2) + (1/4)(4) + (1/4)(8) = 1 + 1 + 2 = 4 .$$

Computational Example

If Y is a random variable, any function of Y, e.g. $Y^2$, $Y^3 - Y$, $|Y|$, etc., is a random variable, and so also has an expectation. For example, for Y as in the example above,

$$E[Y^2] = (1/2)(2^2) + (1/4)(4^2) + (1/4)(8^2) = 2 + 4 + 16 = 22 .$$

Note that $E[Y^2]$ is NOT the same as $E[Y]^2$, which here is $4^2 = 16$!

Computational Example

In a certain board game, the number of spaces a player moves his token is determined by using a spinner. The spinner is just a circle printed on a rectangle of cardboard with a light metal arrow attached to the center of the circle, so that the arrow rotates freely. The spinner for the particular game I have in mind is marked off into three sections: a 180° section colored blue, a 120° section colored red, and a 60° section colored yellow. If, when spun, the arrow lands in the blue section, the token is moved forward 4 spaces; if the arrow lands in the red section, the token is moved forward 6 spaces; and if the arrow lands in the yellow section, then the token is moved back 12 spaces. Let Y be the random variable corresponding to the number of spaces the token is moved. Assuming that the head of the arrow is equally likely to stop at any point on the perimeter of the circle, one has P(Y = 4) = 1/2, P(Y = 6) = 1/3, and P(Y = -12) = 1/6. Then one computes:

$$\mu_Y = E[Y] = (1/2)(4) + (1/3)(6) + (1/6)(-12) = 2 + 2 - 2 = 2 .$$

Computation ain't everything, ya know? How does one interpret this result? If one plays the game a very long time, and spins the spinner billions and billions of times (to echo the late Carl Sagan), one will average about 2 spaces forward per spin.
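Both the exact computations and the long-run interpretation can be tried out in Python. The sketch below is illustrative only: the helper names `expected_value` and `simulate_spinner`, the seed, and the number of spins are assumptions made for the example. Exact fractions are used for the expectations, and a pseudo-random simulation stands in for spinning the spinner many times.

```python
import random
from fractions import Fraction as F

def expected_value(dist, g=lambda y: y):
    """E[g(Y)] for a discrete distribution given as {value: probability}."""
    return sum(g(y) * p for y, p in dist.items())

# Y takes the values 2, 4, 8 with probabilities 1/2, 1/4, 1/4.
y_dist = {2: F(1, 2), 4: F(1, 4), 8: F(1, 4)}
e_y = expected_value(y_dist)
e_y_squared = expected_value(y_dist, lambda y: y * y)

# Spinner: blue (1/2) moves +4, red (1/3) moves +6, yellow (1/6) moves -12.
spinner = {4: F(1, 2), 6: F(1, 3), -12: F(1, 6)}
e_spin = expected_value(spinner)

def simulate_spinner(n_spins, seed=0):
    """Average move per spin over n_spins simulated spins."""
    rng = random.Random(seed)
    values = list(spinner)
    weights = [float(p) for p in spinner.values()]
    moves = rng.choices(values, weights=weights, k=n_spins)
    return sum(moves) / n_spins

print(e_y, e_y_squared, e_y ** 2)   # 4 22 16
print(e_spin)                       # 2
print(simulate_spinner(200_000))    # close to 2; exact value depends on the seed
```

Any single spin moves the token 4, 6 or -12 spaces, never 2; the value 2 only shows up as an average over many spins.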

Defining the Population Variance

The motivation for the definition of a population variance is very similar. If one writes out the definition of the sample variance long-hand, i.e. without using summation notation, one has

$$s^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1} .$$

Now for grouped data, where $x_1, x_2, \ldots, x_k$ are the k different values occurring in a sample of n observations, and where the frequency with which $x_i$ occurs in the sample is $f_i$, one has:

$$s^2 = \frac{(x_1 - \bar{x})^2 f_1 + (x_2 - \bar{x})^2 f_2 + \cdots + (x_k - \bar{x})^2 f_k}{n - 1} = \frac{1}{n - 1} \sum_{i=1}^{k} (x_i - \bar{x})^2 f_i .$$

If n is very large, dividing by n - 1 isn't all that much different computationally than dividing by n; and if n is large we expect $\bar{x}$ to be close to $\mu$, so that for large n,

$$s^2 \approx \sum_{i=1}^{k} (x_i - \mu)^2 \frac{f_i}{n} .$$

This expression motivates the following definition:

Definition. The population variance for a discrete random variable X, denoted Var[X] or $\sigma_X^2$, is

$$\sigma_X^2 = \sum_{i=1}^{k} (x_i - \mu_X)^2 p_i .$$

Remarks

(1) The square root of the population variance, $\sigma_X$, is called the population standard deviation.

(2) Notice again the convention that lower case Latin characters denote sample characteristics, while lower case Greek characters denote population characteristics.

(3) Note that one could also write the definition as $\mathrm{Var}[X] = E[(X - \mu)^2]$: an observation we will make use of later.

(4) Observe that, by definition, the variance of a random variable is non-negative.

(5) When the definition of the sample standard deviation was introduced in Handout 1 we noted that it would have been more natural if the denominator in the defining expression had been n instead of n - 1. Here's the intuitive explanation: what one would really like to measure is the dispersion of the data about the population mean, but since the population mean is unknown, one uses the sample mean in the computation. Data is information; so having to estimate the population mean is like having one less data point.
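The intuition in remark (5) can be illustrated with a small simulation. This is a sketch under stated assumptions, not part of the handout: it assumes rolls of a fair six-sided die (population mean 3.5, population variance 35/12, about 2.917), samples of size 5, and the hypothetical helper name `compare_divisors`. For each sample it measures dispersion three ways: about the sample mean with divisor n, about the sample mean with divisor n - 1, and about the true population mean with divisor n.

```python
import random

def compare_divisors(n=5, trials=20_000, seed=1):
    """Average three dispersion measures over many small samples of die rolls."""
    rng = random.Random(seed)
    mu = 3.5  # population mean of a fair die (known here only for comparison)
    totals = {"n_about_xbar": 0.0, "n_minus_1_about_xbar": 0.0, "n_about_mu": 0.0}
    for _ in range(trials):
        sample = [rng.randint(1, 6) for _ in range(n)]
        xbar = sum(sample) / n
        ss_xbar = sum((x - xbar) ** 2 for x in sample)  # squared deviations about x-bar
        ss_mu = sum((x - mu) ** 2 for x in sample)      # squared deviations about mu
        totals["n_about_xbar"] += ss_xbar / n
        totals["n_minus_1_about_xbar"] += ss_xbar / (n - 1)
        totals["n_about_mu"] += ss_mu / n
    return {name: total / trials for name, total in totals.items()}

results = compare_divisors()
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

On average, dividing by n about the sample mean comes out too small, while dividing by n - 1 lands close to the value obtained when the population mean is known, which is the sense in which estimating the mean costs one data point.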

Computational Example

Referring to the random variable associated with the game spinner of the previous example,

$$\mathrm{Var}[Y] = E[(Y - \mu)^2] = (1/2)(4 - 2)^2 + (1/3)(6 - 2)^2 + (1/6)(-12 - 2)^2 = 2 + 16/3 + 196/6 = 240/6 = 40 .$$

And the population standard deviation is $\sigma_Y = \sqrt{40} \approx 6.325$.

Computational Example

Consider the random experiment of rolling a fair die. Let X be the function that assigns to an outcome the number of spots that appear on the top face. Then P(X = 1) = P(X = 2) = P(X = 3) = P(X = 4) = P(X = 5) = P(X = 6) = 1/6, and

$$E[X] = (1/6)(1) + (1/6)(2) + (1/6)(3) + (1/6)(4) + (1/6)(5) + (1/6)(6) = 3.5 ,$$

$$\mathrm{Var}[X] = (1/6)(1 - 3.5)^2 + (1/6)(2 - 3.5)^2 + (1/6)(3 - 3.5)^2 + (1/6)(4 - 3.5)^2 + (1/6)(5 - 3.5)^2 + (1/6)(6 - 3.5)^2 \approx 2.917 .$$

Computational Example

To help illustrate the relationship between the sample mean and variance, and the population mean and variance, I rolled a fair die 600 times. (Well, I rolled it with the help of a random number generator that's built into Lotus 123.) I then computed the sample mean and variance for this sample of size 600. After that I rolled the die another 2400 times, and again computed the sample mean and variance. The results are summarized in the tables given below:

First 600 Rolls
X            1     2     3     4     5     6     Total
Frequency    91    92    118   101   99    99    600
$\bar{x} = 3.537$, $s^2 = 2.785$

Second 2400 Rolls
X            1     2     3     4     5     6     Total
Frequency    403   370   383   434   417   393   2400
$\bar{x} = 3.530$, $s^2 = 2.895$

Compare the sample means and the sample variances with the population mean and variance computed earlier.
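The comparison can be carried out directly in Python. The sketch below is illustrative: the function names and the `ddof` parameter (the amount subtracted from n in the divisor) are choices made for this example. It computes the population mean and variance of the spinner and the die exactly, then the grouped sample mean and variance from the two frequency tables. For these frequencies the divide-by-n variances round to 2.785 and 2.895, which appears to match the tabulated values, while the n - 1 versions round to 2.790 and 2.896.

```python
from fractions import Fraction as F

def pop_mean(dist):
    """Population mean: sum of value * probability."""
    return sum(x * p for x, p in dist.items())

def pop_variance(dist):
    """Population variance: sum of (value - mean)^2 * probability."""
    mu = pop_mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

def grouped_sample_stats(freq, ddof=1):
    """Sample mean and variance from {value: frequency}; the divisor is n - ddof."""
    n = sum(freq.values())
    xbar = sum(x * f for x, f in freq.items()) / n
    ss = sum((x - xbar) ** 2 * f for x, f in freq.items())
    return xbar, ss / (n - ddof)

spinner = {4: F(1, 2), 6: F(1, 3), -12: F(1, 6)}
die = {face: F(1, 6) for face in range(1, 7)}
print(pop_mean(spinner), pop_variance(spinner))   # 2 40
print(pop_mean(die), pop_variance(die))           # 7/2 35/12

first_600 = {1: 91, 2: 92, 3: 118, 4: 101, 5: 99, 6: 99}
next_2400 = {1: 403, 2: 370, 3: 383, 4: 434, 5: 417, 6: 393}
for label, freq in [("first 600", first_600), ("next 2400", next_2400)]:
    xbar, s2 = grouped_sample_stats(freq)             # divisor n - 1
    _, s2_n = grouped_sample_stats(freq, ddof=0)      # divisor n
    print(label, round(xbar, 3), round(s2, 3), round(s2_n, 3))
```

Either way, both samples give means near 3.5 and variances near 35/12, about 2.917.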

Nice Examination Problem

A randomly chosen drilling site in a certain area of West Texas is quite variable in its potential for producing oil. Eighty percent of the time the well will produce only 50,000 barrels of oil; ten percent of the time the well will produce 100,000 barrels of oil; seven percent of the time the well will produce 250,000 barrels of oil; two percent of the time the well produces 500,000 barrels of oil; and, on the average, one in a hundred wells is a real gusher and produces 3,000,000 barrels of oil.

(a) If X denotes the oil production of a randomly chosen well, compute the population mean (expected value) and standard deviation. (You may choose any units for oil production you find convenient.)

(b) If a barrel of crude oil sells for about $25, and the cost of drilling a well is about $2,000,000, what is the average profit per well for a company that has the financial resources to stay in the oil business for the long run?

And here is a solution:

Using X to denote oil production per well (in thousands of barrels), one computes the mean and standard deviation of X to be 107.5 and 301.3615, respectively. This means that in the long run the oil company will average a profit of 107,500(25) - 2,000,000 = $687,500 per well. It is well to emphasize "long run", since 80% of the time a well will, in fact, lose money for the company.
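The solution can be reproduced with a short Python sketch. The variable names are illustrative, and production is expressed in thousands of barrels as in the solution above. Exact fractions are used for the mean and variance; the standard deviation is then taken as a floating-point square root.

```python
import math
from fractions import Fraction as F

# Production in thousands of barrels -> probability.
oil = {50: F(80, 100), 100: F(10, 100), 250: F(7, 100), 500: F(2, 100), 3000: F(1, 100)}

PRICE_PER_BARREL = 25        # dollars
DRILLING_COST = 2_000_000    # dollars per well

mean_oil = sum(x * p for x, p in oil.items())
var_oil = sum((x - mean_oil) ** 2 * p for x, p in oil.items())
sd_oil = math.sqrt(var_oil)

# Long-run average profit per well, in dollars.
avg_profit = mean_oil * 1000 * PRICE_PER_BARREL - DRILLING_COST

# Probability that a single well fails to cover its drilling cost.
p_loss = sum(p for x, p in oil.items() if x * 1000 * PRICE_PER_BARREL < DRILLING_COST)

print(float(mean_oil))       # 107.5
print(round(sd_oil, 4))      # 301.3615
print(float(avg_profit))     # 687500.0
print(float(p_loss))         # 0.8
```

Only the 50,000-barrel outcome falls short of the $2,000,000 drilling cost at $25 per barrel, which is where the 80% figure comes from.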
