
Uploaded by DrSuchitra Choudhary

statistics


Definition

A set of brief descriptive coefficients that summarizes a given data set, which can be either a representation of the entire population or a sample. The measures used to describe the data set are measures of central tendency and measures of variability (or dispersion). Measures of central tendency include the mean, median and mode, while measures of variability include the standard deviation (or variance), the minimum and maximum values, kurtosis and skewness.

Definition

Mathematical methods (such as the mean, median and standard deviation) that summarize and interpret some of the properties of a set of data (a sample) but do not infer the properties of the population from which the sample was drawn.

What is data?

Descriptive statistics are numbers that are used to summarize and describe data. The word "data" refers to the information that has been collected from an experiment, a survey, a historical record, etc. "Data" is plural. One piece of information is called a "datum."

Example

If we are analyzing population data for Indian states, a descriptive statistic might be the percentage of people below 5 years of age in each state. Several descriptive statistics are often used at one time to give a full picture of the data.

No Inference

Descriptive statistics are just descriptive. They do not involve generalizing beyond the data at hand. Generalizing from our data to another set of cases is the business of inferential statistics. Here we focus on (mere) descriptive statistics.

Median

To find the median, arrange the observations in order from smallest to largest. If there is an odd number of observations, the median is the middle value. If there is an even number of observations, the median is the average of the two middle values. Thus, in a sample of five persons weighing 100, 100, 130, 140, and 150 pounds, the median would be 130 pounds, since 130 pounds is the middle weight.
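The odd/even rule above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the function name `median` and the variable `weights` are chosen here for the example, and the data are the five weights used in this document's examples.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)      # arrange from smallest to largest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                # odd count: take the middle value
        return ordered[mid]
    # even count: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


weights = [100, 100, 130, 140, 150]
print(median(weights))        # 130
print(median([3, 5, 5, 7]))   # 5.0
```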

Data set

[Table: 15. BSE Sensitive Index and NSE Nifty Index of Ordinary Share Prices - Mumbai, 2010 and 2011. Column labels recovered: Jan. 17, Jan. 18, Jan. 19, Jan. 21; monthly HIGH, LOW and CLOSE for Sept and Oct.]

BSE SENSEX (1978-79=100): 17051.14, 18882.25, 19092.05, 18978.32, 19046.54, 19007.53

NSE Nifty: 5094.15, 5654.75, 5724.05, 5711.60, 5696.50

HIGH / LOW / CLOSE (Sept, Oct): 20854.55, 17926.8, 18038.48, 19074.57, 18008.15, 18327.76, 20509.09, 18954.82, 19768.96, 19521.25, 20032.34, 20267.98, 18027.12, 20069

Mode

The mode is the most frequently appearing value in the population or sample. Suppose we draw a sample of five persons and measure their weights. They weigh 100 pounds, 100 pounds, 130 pounds, 140 pounds, and 150 pounds. Since more persons weigh 100 pounds than any other weight, the mode would equal 100 pounds.
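As a rough sketch of the counting behind the mode, the following Python snippet tallies each value and returns the most frequent one(s). The helper name `modes` is hypothetical; it returns a list because a data set can have more than one mode, a case the slide does not discuss.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)              # value -> number of occurrences
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


weights = [100, 100, 130, 140, 150]
print(modes(weights))   # [100]
```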

Mean

The mean of a sample or a population is computed by adding all of the observations and dividing by the number of observations. Returning to the example of the five persons, the mean weight would equal (100 + 100 + 130 + 140 + 150)/5 = 620/5 = 124 pounds. In general:

$$\text{Mean} = \frac{X_1 + X_2 + X_3 + \dots + X_n}{n}$$

where $X_1, \dots, X_n$ are the values of the variable and $n$ is the number of observations.
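The same calculation can be written as a short Python sketch. The function name `mean` is an illustrative choice; the data are the five weights from the example above.

```python
def mean(values):
    """Add all observations and divide by the number of observations."""
    return sum(values) / len(values)


weights = [100, 100, 130, 140, 150]
print(sum(weights))    # 620
print(mean(weights))   # 124.0
```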

No.: 1, 2, 3, 4, 5, 6, 7, 8
Value: 26, 25, 23, 12, 15, 18, 09, 12

A proportion refers to the fraction of the total that possesses a certain attribute. For example, we might ask what proportion of persons in our sample weigh less than 135 pounds. Since 3 persons weigh less than 135 pounds, the proportion would be 3/5 or 0.60. A percentage is another way of expressing a proportion. A percentage is equal to the proportion times 100. In our example of the five persons, the percentage of the total who weigh less than 135 pounds would be 100 * (3/5) or 60 percent.
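A small Python sketch of the proportion and percentage calculation follows. The helper `proportion` and its `predicate` argument are names assumed for this illustration only.

```python
def proportion(values, predicate):
    """Fraction of the values for which predicate(value) is true."""
    matching = sum(1 for v in values if predicate(v))
    return matching / len(values)


weights = [100, 100, 130, 140, 150]
p = proportion(weights, lambda w: w < 135)   # 3 of the 5 weights qualify
percentage = 100 * p

print(p)            # 0.6
print(percentage)   # 60.0
```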

Notation

Of the various measures, the mean and the proportion are the most important. The notation used to describe these measures appears below:

- X: The population mean.
- $\bar{x}$: The sample mean.
- P: The proportion of elements in the population that has a particular attribute.
- p: The proportion of elements in the sample that has a particular attribute.
- Q: The proportion of elements in the population that does not have a specified attribute. Note that Q = 1 - P.
- q: The proportion of elements in the sample that does not have a specified attribute. Note that q = 1 - p.

Note that capital letters refer to population parameters and lower-case letters refer to sample statistics.

Exercise

Rewrite these notations from memory. This is a self-checking exercise.

Measures of Variability

Some measures describe the amount of variation among the values in a data set. For example, consider a population of four values {6, 6, 6, 6}. Here all of the values are equal, so there is no variation. The set {3, 5, 5, 7}, on the other hand, has some variation, since some of its values differ. Three measures are commonly used to quantify the amount of variation in a set of values: the range, the variance, and the standard deviation. Other measures, such as the mean deviation, also exist.

Notation

- $\sigma^2$: The variance of the population.
- $\sigma$: The standard deviation of the population.
- $s^2$: The variance of the sample.
- $s$: The standard deviation of the sample.
- $\mu$: The population mean.
- $\bar{x}$: The sample mean.
- N: Number of observations in the population.
- n: Number of observations in the sample.
- P: The proportion of elements in the population that has a particular attribute.
- p: The proportion of elements in the sample that has a particular attribute.
- Q: The proportion of elements in the population that does not have a specified attribute. Note that Q = 1 - P.
- q: The proportion of elements in the sample that does not have a specified attribute. Note that q = 1 - p.

Exercise

Rewrite the notations given in the last slide. This is a self-checking exercise.

The Range

The range is the simplest measure of variation. It is the difference between the largest and smallest values.

Range = Maximum value - Minimum value

Therefore, the range of the four values {3, 5, 5, 7} would be 7 - 3, or 4.
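For completeness, the range is a one-line computation in Python. The function name `value_range` is an assumption made for this sketch (chosen to avoid shadowing Python's built-in `range`).

```python
def value_range(values):
    """Range = maximum value - minimum value."""
    return max(values) - min(values)


print(value_range([3, 5, 5, 7]))   # 4
print(value_range([6, 6, 6, 6]))   # 0
```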

It is important to distinguish between the variance of a population and the variance of a sample. They have different notation, and they are computed differently. The variance of a population is denoted by $\sigma^2$, and the variance of a sample by $s^2$. The variance of a population is the average squared deviation from the population mean, as defined by the following formula:

$$\sigma^2 = \frac{\sum (X_i - \mu)^2}{N}$$

where $\sigma^2$ is the population variance, $\mu$ is the population mean, $X_i$ is the ith element from the population, and $N$ is the number of elements in the population.

Variance of a Sample

The variance of a sample is defined by a slightly different formula:

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$

where $s^2$ is the sample variance, $\bar{x}$ is the sample mean, $x_i$ is the ith element from the sample, and $n$ is the number of elements in the sample. Using this formula, the sample variance can be considered an unbiased estimate of the true population variance. Therefore, if you need to estimate the unknown population variance based on known data from a sample, this is the formula to use.

Example 1

A population consists of four observations: {1, 3, 5, 7}. What is the variance?

Solution: First, we need to compute the population mean.

$$\mu = \frac{1 + 3 + 5 + 7}{4} = 4$$

Then we plug all of the known values into the formula for the variance of a population, as shown below:

$$
\begin{aligned}
\sigma^2 &= \frac{\sum (X_i - \mu)^2}{N} \\
&= \frac{(1-4)^2 + (3-4)^2 + (5-4)^2 + (7-4)^2}{4} \\
&= \frac{(-3)^2 + (-1)^2 + (1)^2 + (3)^2}{4} \\
&= \frac{9 + 1 + 1 + 9}{4} = \frac{20}{4} = 5
\end{aligned}
$$
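The worked example can be checked with a short Python sketch. The function `population_variance` is a hand-written illustration of the formula with N in the denominator; `statistics.pvariance` from the Python standard library is used only as a cross-check.

```python
import statistics


def population_variance(values):
    """Average squared deviation from the population mean (divide by N)."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


population = [1, 3, 5, 7]
print(population_variance(population))   # 5.0

# Cross-check against the standard library implementation.
print(population_variance(population) == statistics.pvariance(population))   # True
```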

Example 2

A sample consists of four observations: {1, 3, 5, 7}. What is the variance?

Solution: This problem is handled exactly like the previous problem, except that we use the formula for calculating the sample variance rather than the formula for calculating the population variance. The sample mean is $\bar{x} = 4$.

$$
\begin{aligned}
s^2 &= \frac{\sum (x_i - \bar{x})^2}{n - 1} \\
&= \frac{(1-4)^2 + (3-4)^2 + (5-4)^2 + (7-4)^2}{4 - 1} \\
&= \frac{(-3)^2 + (-1)^2 + (1)^2 + (3)^2}{3} \\
&= \frac{9 + 1 + 1 + 9}{3} = \frac{20}{3} = 6.667
\end{aligned}
$$
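The same check for the sample formula, sketched in Python. The function `sample_variance` is an illustrative name; the only change from the population version is the n - 1 denominator. `statistics.variance` from the standard library serves as a cross-check.

```python
import statistics


def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


sample = [1, 3, 5, 7]
print(round(sample_variance(sample), 3))         # 6.667
print(round(statistics.variance(sample), 3))     # 6.667
```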

Difference

Is there any difference between the two? Which is larger? Why is it larger? What happens as the sample size increases?
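One way to explore these questions: for the same set of n values, the two formulas share a numerator and differ only in the denominator (n versus n - 1), so the sample variance equals the population variance multiplied by n / (n - 1). The sketch below prints that factor for a few sample sizes; the function name `variance_ratio` and the chosen sizes are assumptions made for illustration.

```python
def variance_ratio(n):
    """Ratio of the (n - 1)-denominator variance to the n-denominator variance."""
    return n / (n - 1)


# The factor is always greater than 1 and shrinks toward 1 as n grows.
for n in (4, 10, 100, 1000):
    print(n, round(variance_ratio(n), 4))
```

For n = 4 the factor is 4/3, which matches 6.667 / 5 from the two examples with {1, 3, 5, 7}.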

Variance of a Proportion

The variance formulas introduced in the previous section can be used with confidence for any random variable, even proportions. However, for proportions the formulas can be expressed in a form that is easier to compute. With an infinite population or when sampling with replacement, the variance of a population proportion is defined by the following formula:

$$\sigma^2 = \frac{PQ}{n}$$

where $P$ is the population proportion, $Q$ equals $1 - P$, and $n$ is the sample size.

Given the same constraints (infinite population or sampling with replacement), the variance of the sample proportion is defined by a slightly different formula:

$$s^2 = \frac{pq}{n - 1}$$

where $n$ is the number of elements in the sample, $p$ is the sample estimate of the true proportion, and $q$ is equal to $1 - p$. Using this formula, the sample variance can be considered an unbiased estimate of the true population variance. Therefore, if you need to estimate the unknown population variance based on known data from a sample, this is the formula to use.

Many statistics texts present only the formula for the variance of the population proportion. If the sample size is very large, both formulas give similar results; but when the sample size is small, it is better to use the formula that matches the situation: PQ / n when the population proportion is known, and pq / (n - 1) when it is estimated from a sample.
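The two proportion formulas can be compared with a short Python sketch. The function names are hypothetical, and the inputs (a proportion of 0.6 and sizes of 5 and 500) are illustrative values chosen here, with 0.6 and 5 borrowed from the five-person weight example.

```python
def population_proportion_variance(P, n):
    """Variance of a proportion when the population proportion P is known: PQ / n."""
    return P * (1 - P) / n


def sample_proportion_variance(p, n):
    """Variance estimated from a sample proportion p: pq / (n - 1)."""
    return p * (1 - p) / (n - 1)


# Small sample: the two formulas differ noticeably.
print(round(population_proportion_variance(0.6, 5), 4))   # 0.048
print(round(sample_proportion_variance(0.6, 5), 4))       # 0.06

# Large sample: the two formulas give nearly the same result.
print(round(population_proportion_variance(0.6, 500), 6))
print(round(sample_proportion_variance(0.6, 500), 6))
```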

Standard Deviation of a Random Variable

The standard deviation is the square root of the variance. It is important to distinguish between the standard deviation of a population and the standard deviation of a sample. They have different notation, and they are computed differently. The standard deviation of a population is denoted by $\sigma$, and the standard deviation of a sample by $s$. The standard deviation of a population is defined by the following formula:

$$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}}$$

where $\sigma$ is the population standard deviation, $\mu$ is the population mean, $X_i$ is the ith element from the population, and $N$ is the number of elements in the population.

Standard Deviation of a Sample

The standard deviation of a sample is defined by a slightly different formula:

$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$$

where $s$ is the sample standard deviation, $\bar{x}$ is the sample mean, $x_i$ is the ith element from the sample, and $n$ is the number of elements in the sample. Because $s^2$ is an unbiased estimate of the population variance, $s$ is the usual estimate of the population standard deviation, although $s$ itself is slightly biased (it tends to underestimate $\sigma$). Therefore, if you need to estimate the unknown population standard deviation based on known data from a sample, this is the formula to use.
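Both standard deviations are simply the square roots of the corresponding variances, as the following Python sketch shows for the data set {1, 3, 5, 7} used in the variance examples. The function names are illustrative choices, not taken from the slides.

```python
import math


def population_std(values):
    """Square root of the population variance (denominator N)."""
    n = len(values)
    mu = sum(values) / n
    return math.sqrt(sum((x - mu) ** 2 for x in values) / n)


def sample_std(values):
    """Square root of the sample variance (denominator n - 1)."""
    n = len(values)
    x_bar = sum(values) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))


data = [1, 3, 5, 7]
print(round(population_std(data), 3))   # 2.236
print(round(sample_std(data), 3))       # 2.582
```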

Find the mean, median, standard deviation, variance and range.

Roll Number: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
Marks: 12, 15, 08, 24, 17, 27, 15, 17, 12, 13
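After working the exercise by hand, one way to check the answers is with Python's standard `statistics` module. This is a sketch for self-checking only; the variable names are chosen here, and since the exercise does not say whether the ten marks are a population or a sample, both versions of the variance and standard deviation are computed.

```python
import statistics

marks = [12, 15, 8, 24, 17, 27, 15, 17, 12, 13]

mean_mark = statistics.mean(marks)
median_mark = statistics.median(marks)
mark_range = max(marks) - min(marks)

# Treating the ten marks as a whole population (denominator N).
pop_variance = statistics.pvariance(marks)
pop_std = statistics.pstdev(marks)

# Treating the ten marks as a sample (denominator n - 1).
sample_variance = statistics.variance(marks)
sample_std = statistics.stdev(marks)

print(mean_mark)                      # 16
print(median_mark)                    # 15.0
print(mark_range)                     # 19
print(round(pop_variance, 3))         # 29.4
print(round(pop_std, 3))              # 5.422
print(round(sample_variance, 3))      # 32.667
print(round(sample_std, 3))           # 5.715
```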
