
Uploaded by MsKhan0078


Inference

Adnan Butt

4:28 AM

Course Outline

Random Variable and Mathematical Expectation
Discrete Probability Distributions (Binomial, Poisson)
Continuous Probability Distribution (Normal)
Sampling Theory
Confidence Intervals
Hypotheses Testing
Goodness of Fit
Regression and Correlation with ANOVA
Multiple Regression

All the topics will be SPSS oriented

Introduction to Mathematical Statistics, Hogg, (2004)
Statistical Inference, Casella, G. and Berger, R. L., 2nd Edition (2002)
Applied Regression Analysis: A Research Tool, Rawlings, J. O., Pantula, S. G. and Dickey, D. A., 2nd Edition (2001)
Introduction to Statistics, Walpole, R. E., 3rd Edition (2000)

Mode of Teaching

Lecture

SPSS Workshop

Discussion Session

Marks Distribution

Mid term               25 Marks
Final                  40 Marks
Quizzes                15 Marks
SPSS                   15 Marks
Class Participation     5 Marks
Total                 100 Marks

Variable

A characteristic or property that varies from individual to individual.

Constant

A characteristic or property that does not change from individual to individual.

Types of Variables

Qualitative
Quantitative
    Discrete
    Continuous

Nominal Scale

Variable categories are mutually exclusive and exhaustive.
Variable categories have no logical order.
Examples: Eye Color, Hair Color, Gender.

Ordinal Scale

Data categories are mutually exclusive and exhaustive.
Data classifications are ranked or ordered according to the particular trait they possess.
Example: Level of Knowledge about SPSS.

Interval Scale

Data categories are mutually exclusive and exhaustive.
Data classifications are ranked or ordered according to the particular trait they possess.
Equal differences in the characteristic are represented by equal differences in the measurements, but a meaningful zero point does not exist.
Examples: Temperature, Shoe Size and IQ scores.

Ratio Scale

Data categories are mutually exclusive and exhaustive.
Data classifications are ranked or ordered according to the particular trait they possess.
Equal differences in the characteristic are represented by equal differences in the measurements.
The zero point is the absence of the characteristic.
Examples: Height, Weight, Distance.

Measurement Scales

Scale      Characteristic                                    Examples
Nominal    Data can only be classified                       Eye Color, Hair Color, Gender
Ordinal    Data are ranked                                   Level of Knowledge about SPSS
Interval   Meaningful zero point does not exist              Temperature, Shoe Size, IQ Scores
Ratio      Meaningful zero point and ratio between values    Height, Weight, Distance

Data

The information collected for any kind of investigation.
Usually numerical, but can be qualitative.

Primary Data

The initial material collected during the research process.
The information collected directly from the respondent.
Sources: Personal Investigation, Through Investigator, Through Questionnaire, Through Local Sources, Through Telephone.

Secondary Data

The information collected and processed by people other than the researcher.
Sources: Government Organizations, Semi-Government Organizations.

Data Collection

Any of the following methods may be adopted:
(a) Personal interview
(b) Direct observation
(c) Mail interview (internet interview)
(d) Telephone interview

What are the pros and cons of each?

Data management

Office Editing,
Post Coding,
Data entry and Verification.

Preparing data for analysis,
Extracting descriptive measures from the data,
Using advanced statistical techniques to analyze the data and draw inferences from it.

Arithmetic Mean
Quantiles (Median, Quartiles, Deciles, Percentiles)
Mode

Arithmetic Mean

A value obtained by dividing the sum of all the observations by their number.

$$\text{Arithmetic Mean} = \frac{\text{Sum of all the observations}}{\text{Number of the observations}}$$

If $X_1, X_2, \ldots, X_n$ are $n$ observations of a variable $X$, then

$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n} = \frac{\sum_{i=1}^{n} X_i}{n}$$

Arithmetic Mean

The marks obtained by 8 students are:
67 72 68 70 65 68 75 63

$$\bar{X} = \frac{67 + 72 + \cdots + 63}{8} = \frac{548}{8} = 68.5 \text{ Marks}$$
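The marks example can be checked with a few lines of Python. This is only a minimal sketch; the names `marks` and `arithmetic_mean` are illustrative and do not come from the course material.

```python
# Marks of the 8 students, as listed on the slide.
marks = [67, 72, 68, 70, 65, 68, 75, 63]


def arithmetic_mean(values):
    """Return the sum of the observations divided by their number."""
    return sum(values) / len(values)


mean_marks = arithmetic_mean(marks)
print(sum(marks), mean_marks)  # 548 68.5
```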

Quantiles

For individual observations or a discrete frequency distribution, the quantiles are located by the following relations:

$$Q_i = \frac{i(n+1)}{4}\text{th observation in the distribution}, \quad i = 1, 2, 3$$

$$D_j = \frac{j(n+1)}{10}\text{th observation in the distribution}, \quad j = 1, 2, \ldots, 9$$

$$P_k = \frac{k(n+1)}{100}\text{th observation in the distribution}, \quad k = 1, 2, \ldots, 99$$

Quartiles

The weekly TV watching times (hours):
25 41 27 32 43 66 35 31 15 5
34 26 32 38 16 30 38 30 20 21

Arranged in ascending order:
5 15 16 20 21 25 26 27 30 30
31 32 32 34 35 37 38 41 43 66

$$Q_1 = \frac{1(20+1)}{4}\text{th observation} = 5.25\text{th observation in the distribution}$$

$$Q_1 = 5\text{th obs.} + 0.25\{6\text{th obs.} - 5\text{th obs.}\} = 21 + 0.25\{25 - 21\} = 22.0 \text{ Hours}$$

$$Q_2 = \frac{2(20+1)}{4}\text{th observation} = 10.50\text{th observation in the distribution}$$

$$Q_2 = 10\text{th obs.} + 0.50\{11\text{th obs.} - 10\text{th obs.}\} = 30 + 0.50\{31 - 30\} = 30.5 \text{ Hours}$$
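The location rule i(n+1)/4 followed by linear interpolation between neighbouring observations can be sketched in Python as below. The function name `quantile_n_plus_1` and its parameters are hypothetical, and the data are the ordered TV watching times from the slide. Note that statistical packages offer several quantile definitions, so other software may return slightly different values.

```python
def quantile_n_plus_1(sorted_values, i, parts):
    """Quantile located at the i(n+1)/parts-th ordered observation.

    parts = 4 gives quartiles, 10 deciles, 100 percentiles.
    A fractional position is resolved by linear interpolation.
    """
    n = len(sorted_values)
    position = i * (n + 1) / parts   # 1-based position, may be fractional
    whole = int(position)            # e.g. 5 for position 5.25
    fraction = position - whole      # e.g. 0.25
    if whole < 1:
        return float(sorted_values[0])
    if whole >= n:
        return float(sorted_values[-1])
    lower = sorted_values[whole - 1]
    upper = sorted_values[whole]
    return lower + fraction * (upper - lower)


# Ordered weekly TV watching times (hours) from the slide.
tv_hours = [5, 15, 16, 20, 21, 25, 26, 27, 30, 30,
            31, 32, 32, 34, 35, 37, 38, 41, 43, 66]

q1 = quantile_n_plus_1(tv_hours, 1, 4)
q2 = quantile_n_plus_1(tv_hours, 2, 4)
q3 = quantile_n_plus_1(tv_hours, 3, 4)
print(q1, q2, q3)  # 22.0 30.5 36.5
```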

Mode

The mode is the value which occurs most frequently in a set of data, that is, the value that occurs most often in a sequence of observations.

Mode

The total automobile sales (in millions) in the United States for the last 14 years:
9.0 8.2 8.0 9.1 10.3 11.0 11.5
10.3 10.5 9.8 9.3 8.2 8.2 8.5
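A small Python sketch for finding the mode of the automobile sales figures follows. The helper name `modes` is illustrative; it returns a list because a data set can have more than one most frequent value.

```python
from collections import Counter

# Automobile sales (in millions) for the 14 years listed on the slide.
sales = [9.0, 8.2, 8.0, 9.1, 10.3, 11.0, 11.5,
         10.3, 10.5, 9.8, 9.3, 8.2, 8.2, 8.5]


def modes(values):
    """Return every value that shares the highest frequency."""
    counts = Counter(values)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


sales_modes = modes(sales)
print(sales_modes)  # [8.2]
```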

Dispersion is the variation present among the values of a data set, so measures of variation are measures of the spread of values in the data.

Absolute Measures of Dispersion

Range
Quartile Deviation
Mean (Average) Deviation
Variance and Standard Deviation

Relative Measures of Dispersion

Coefficient of Range
Coefficient of Quartile Deviation
Coefficient of Mean Deviation
Coefficient of Variation (CV)

Range

Difference between the largest and the smallest observations.

$$\text{Range} = X_{\text{Largest}} - X_{\text{Smallest}}$$

Ignores the way in which data are distributed

[Figure: two dot plots on a 7 to 12 axis, each with Range = 12 - 7 = 5]

Sensitive to outliers

1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4

1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119

Inter-quartile range = 3rd quartile - 1st quartile = $Q_3 - Q_1$

Inter-quartile Range

[Figure: number line with X minimum = 12, Q1 = 30, Median (Q2) = 45, Q3 = 57, X maximum = 70; each of the four segments holds 25% of the values]

Inter-quartile range = 57 - 30 = 27

Mean Deviation is the average of absolute deviations taken from the mean value.

X        (X - X̄)    |X - X̄|
8         3          3
5         0          0
2        -3          3
Total     0          6

$$MD = \frac{\sum |X - \bar{X}|}{n} = \frac{6}{3} = 2$$
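The mean deviation table can be reproduced with a short Python sketch. The function name `mean_deviation` is illustrative; the three observations 8, 5 and 2 are those on the slide.

```python
def mean_deviation(values):
    """Average of the absolute deviations taken from the arithmetic mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)


observations = [8, 5, 2]
deviations = [x - sum(observations) / len(observations) for x in observations]
md = mean_deviation(observations)
print(deviations, md)  # [3.0, 0.0, -3.0] 2.0
```

The plain deviations always sum to zero, which is why the absolute values are averaged instead.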

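Variance

Variance is the average of the squared deviations taken from the mean value.

$$\text{(i)} \quad S^2 = \frac{\sum (X - \bar{X})^2}{n} = \frac{102}{6} = 17 \text{ cm}^2$$

$$\text{(ii)} \quad S^2 = \frac{\sum X^2}{n} - \left(\frac{\sum X}{n}\right)^2 = \frac{702}{6} - \left(\frac{60}{6}\right)^2 = 17 \text{ cm}^2$$

X (cm)    (X - Mean)^2    X^2
4         36              16
6         16              36
9         1               81
12        4               144
13        9               169
16        36              256
Total 60  102             702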
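Both variance formulas can be verified numerically. The sketch below is illustrative only: the six X values (4, 6, 9, 12, 13, 16) are inferred from the slide's X² column and its totals (ΣX = 60, Σ(X - mean)² = 102, ΣX² = 702), and the function names are hypothetical.

```python
# X values in cm, inferred from the slide's X^2 column and column totals.
x_cm = [4, 6, 9, 12, 13, 16]


def variance_definition(values):
    """Formula (i): average squared deviation from the mean (divisor n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def variance_shortcut(values):
    """Formula (ii): mean of the squares minus the square of the mean."""
    n = len(values)
    return sum(x * x for x in values) / n - (sum(values) / n) ** 2


var_i = variance_definition(x_cm)
var_ii = variance_shortcut(x_cm)
print(var_i, var_ii)  # 17.0 17.0
```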

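Standard Deviation

Standard deviation is the positive square root of the mean-square deviations of the observations from their arithmetic mean.

$$SD = \sqrt{\text{variance}}$$

Population: $$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$$

Sample: $$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{N - 1}}$$

For a frequency distribution, SD is:

$$SD = \sqrt{\frac{\sum f_i (x_i - \bar{x})^2}{N}}$$

Where

$$\bar{x} = \frac{\sum f_i x_i}{N}, \quad N = \sum f_i$$

Simplified formula

$$SD = \sqrt{\frac{\sum f x^2}{N} - \left(\frac{\sum f x}{N}\right)^2}$$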

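Ungrouped Data

Family No.    Size (x_i)    x_i - x̄    (x_i - x̄)^2    x_i^2
1             3             -2          4               9
2             3             -2          4               9
3             4             -1          1               16
4             4             -1          1               16
5             5             0           0               25
6             5             0           0               25
7             6             1           1               36
8             6             1           1               36
9             7             2           4               49
10            7             2           4               49
Total         50            0           20              270

Here,

$$\bar{x} = \frac{\sum x_i}{n} = \frac{50}{10} = 5$$

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n} = \frac{20}{10} = 2$$

$$s = \sqrt{2} = 1.41$$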
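The family-size example can be reproduced in Python. This is a sketch under stated assumptions: the ten family sizes are inferred from the slide's totals (Σx = 50, Σ(x - mean)² = 20, Σx² = 270), and the divisor is n, as in the worked example. The standard library's `statistics.pstdev` uses the same divisor, while `statistics.stdev` divides by n - 1.

```python
import math
import statistics

# Family sizes inferred from the slide's column totals (assumption).
family_sizes = [3, 3, 4, 4, 5, 5, 6, 6, 7, 7]

n = len(family_sizes)
mean_size = sum(family_sizes) / n
sum_sq_dev = sum((x - mean_size) ** 2 for x in family_sizes)

variance_n = sum_sq_dev / n          # divisor n, as in the worked example
sd_n = math.sqrt(variance_n)

print(mean_size, sum_sq_dev, variance_n, round(sd_n, 2))  # 5.0 20.0 2.0 1.41

# Cross-check against the standard library (population formula, divisor n).
sd_library = statistics.pstdev(family_sizes)
```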

[Figure: dot plots of three data sets on a common axis from 11 to 21]

Data A: Mean = 15.5, S = 3.338
Data B: Mean = 15.5, S = 0.926
Data C: Mean = 15.5, S = 4.567

The smaller the standard deviation, the more clustered the scores around the mean.
The larger the standard deviation, the more spread out the scores from the mean.

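$$\text{Coefficient of Range} = \frac{X_{\text{Largest}} - X_{\text{Smallest}}}{X_{\text{Largest}} + X_{\text{Smallest}}}$$

$$\text{Coefficient of Quartile Deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$

$$\text{Coefficient of Mean Deviation} = \frac{MD}{\text{Mean}}$$

$$CV = \frac{S}{\bar{X}} \times 100\%$$

The coefficient of variation can be used to compare two or more sets of data measured in different units, or in the same units but with different average size.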

Stock A:
Average price last year = $50
Standard deviation = $5

$$CV_A = \frac{S}{\bar{X}} \times 100\% = \frac{\$5}{\$50} \times 100\% = 10\%$$

Stock B:
Average price last year = $100
Standard deviation = $5

$$CV_B = \frac{S}{\bar{X}} \times 100\% = \frac{\$5}{\$100} \times 100\% = 5\%$$

Both stocks have the same standard deviation, but stock B is less variable relative to its price.
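The stock comparison can be expressed as a small Python sketch. The function name `coefficient_of_variation` is illustrative; the prices and standard deviations are the ones quoted for Stock A and Stock B.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return 100.0 * std_dev / mean


cv_a = coefficient_of_variation(std_dev=5, mean=50)    # Stock A
cv_b = coefficient_of_variation(std_dev=5, mean=100)   # Stock B
print(cv_a, cv_b)  # 10.0 5.0

# Same standard deviation, but B varies less relative to its price.
less_variable = "B" if cv_b < cv_a else "A"
```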

Choosing a Measure of Variability

If data are symmetric, with no serious outliers, use the standard deviation.
If data are skewed, and/or have serious outliers, use IQR.
If comparing variation across two data sets, use coefficient of variation (C.V.).

The five number summary of a data set consists of the minimum value, the first quartile, the second quartile, the third quartile and the maximum value, written in that order: Min, Q1, Q2, Q3, Max.

It provides a measure of central tendency (the median, Q2) and measures of variation of the two middle quarters of the distribution: Q2-Q1 for the second quarter and Q3-Q2 for the third quarter.

The weekly TV viewing times (in hours).

25 41 27 32 43 66 35 31 15 5
34 26 32 38 16 30 38 30 20 21

Arranged in ascending order:
5 15 16 20 21 25 26 27 30 30
31 32 32 34 35 37 38 41 43 66

Location of Q1: 1(20 + 1)/4 = 5.25th observation
Value of Q1: 5th obs. + 0.25{6th obs. - 5th obs.} = 21 + 0.25{25 - 21} = 22.0 Hrs

Location of Q2: 2(20 + 1)/4 = 10.50th observation
Value of Q2: 10th obs. + 0.50{11th obs. - 10th obs.} = 30 + 0.50{31 - 30} = 30.5 Hrs

Location of Q3: 3(20 + 1)/4 = 15.75th observation
Value of Q3: 15th obs. + 0.75{16th obs. - 15th obs.} = 35 + 0.75{37 - 35} = 36.5 Hrs

Minimum value = 5.0
Maximum value = 66.0
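A five number summary for the TV viewing data can be assembled as follows. This is a minimal Python sketch; `quartile` and `five_number_summary` are hypothetical helper names, and the quartiles use the i(n+1)/4 location rule from the slides rather than the default of any particular statistics package.

```python
def quartile(sorted_values, i):
    """i-th quartile at the i(n+1)/4-th ordered observation, interpolated."""
    n = len(sorted_values)
    position = i * (n + 1) / 4
    whole = min(max(int(position), 1), n - 1)   # keep both neighbours in range
    fraction = position - whole
    lower, upper = sorted_values[whole - 1], sorted_values[whole]
    return lower + fraction * (upper - lower)


def five_number_summary(values):
    """Return (Min, Q1, Q2, Q3, Max) in that order."""
    ordered = sorted(values)
    return (ordered[0], quartile(ordered, 1), quartile(ordered, 2),
            quartile(ordered, 3), ordered[-1])


# Ordered weekly TV viewing times (hours) from the slide.
tv_hours = [5, 15, 16, 20, 21, 25, 26, 27, 30, 30,
            31, 32, 32, 34, 35, 37, 38, 41, 43, 66]

summary = five_number_summary(tv_hours)
print(summary)  # (5, 22.0, 30.5, 36.5, 66)
```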

A box and whisker diagram, or box-plot, is a graphical means of displaying the five number summary of a set of data. In a box-plot the first quartile is placed at the lower hinge and the third quartile is placed at the upper hinge. The median is placed in between these two hinges. The two lines emanating from the box are called whiskers. The box and whisker diagram was introduced by Professor John W. Tukey.

Construction of Box-Plot

1. Start the box from Q1 and end at Q3
2. Within the box draw a line to represent Q2
3. Draw the lower whisker from Q1 down to the Min. Value
4. Draw the upper whisker from Q3 up to the Max. Value

[Figure: box plot labelled, from bottom to top, Min Value, Q1, Q2, Q3, Max Value]

Construction of Box-Plot

Q1=22.0, Q3=36.5
Q2=30.5
Minimum Value=5.0
Maximum Value=66.0

[Figure: box plot of these values on a 0 to 70 axis]

Interpretation of Box-Plot

Maximum and Minimum Values in the data
Median of the data
IQR=Q3-Q1; a lengthy box indicates more variability in the data
Line at the center of the box: Symmetrical
Line below the center of the box: Positively Skewed

[Figure: box plot on a 0 to 70 axis]

Outliers

An outlier is a value that falls well outside the overall pattern of the data. It might be a member from a different population, or simply an unusual extreme value. It may, instead, be an indication of skewness.

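If Q1=22.0, Q2=30.5 and Q3=36.5, then IQR = Q3 - Q1 = 14.5.

Inner Fences:
Lower Inner Fence = Q1 - 1.5 IQR = 0.25
Upper Inner Fence = Q3 + 1.5 IQR = 58.25

Outer Fences:
Lower Outer Fence = Q1 - 3 IQR = -21.5
Upper Outer Fence = Q3 + 3 IQR = 80.0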
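The fence calculation and the resulting outlier classification can be sketched in Python. The names `fences` and `classify` are illustrative, and the lower fences are computed by the same rule as the upper ones (Q1 minus 1.5 or 3 times the IQR), which is the usual convention and is assumed here.

```python
def fences(q1, q3):
    """Return ((lower_inner, upper_inner), (lower_outer, upper_outer))."""
    iqr = q3 - q1
    inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    outer = (q1 - 3.0 * iqr, q3 + 3.0 * iqr)
    return inner, outer


def classify(values, inner, outer):
    """Split values into normal values, mild outliers and sure outliers."""
    normal, mild, sure = [], [], []
    for x in values:
        if inner[0] <= x <= inner[1]:
            normal.append(x)
        elif outer[0] <= x <= outer[1]:
            mild.append(x)
        else:
            sure.append(x)
    return normal, mild, sure


# Ordered weekly TV viewing times (hours) and their quartiles from the slides.
tv_hours = [5, 15, 16, 20, 21, 25, 26, 27, 30, 30,
            31, 32, 32, 34, 35, 37, 38, 41, 43, 66]
inner, outer = fences(q1=22.0, q3=36.5)
normal, mild, sure = classify(tv_hours, inner, outer)

print(inner, outer)  # (0.25, 58.25) (-21.5, 80.0)
print(mild, sure)    # [66] []
```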

1. The values that lie inside the inner fences are normal values
2. The values that lie outside the inner fences but inside the outer fences are possible/suspected/mild outliers
3. The values that lie outside the outer fences are sure outliers

Each mild outlier is plotted with an asterisk (*) and each sure outlier with a hollow dot.

Only 66 is a mild outlier.

[Figure: box plot on a 10 to 80 axis with 66 marked by *]

Box plots are especially suitable for comparing two or more data sets. In such a situation the box plots are constructed on the same scale.

[Figure: side-by-side box plots for Male and Female]

Standardized Variable

A variable that has mean 0 and variance 1 is called a standardized variable.
Values of a standardized variable are called standard scores.
Values of a standardized variable, i.e. standard scores, are unit-less.

Construction

$$Z = \frac{\text{Variable} - \text{Mean of Variable}}{\text{Standard Deviation of Variable}}$$

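Standardized Variable

X        (X - X̄)^2    Z          (Z - Z̄)^2
3        25            -1.3624    1.8561
6        4             -0.5450    0.2970
11       9             0.8174     0.6682
12       16            1.0899     1.1879
Total 32 54            0          4.009

$$\bar{X} = \frac{\sum X}{n} = \frac{32}{4} = 8$$

$$S_x^2 = \frac{\sum (X - \bar{X})^2}{n} = \frac{54}{4} = 13.5, \quad S_x = 3.67$$

$$\bar{Z} = \frac{\sum Z}{n} = 0, \quad S_z^2 = \frac{\sum (Z - \bar{Z})^2}{n} = \frac{4.009}{4} \approx 1$$

Standard Score at X=11 is

$$Z = \frac{X - \bar{X}}{S_x} = \frac{11 - 8}{3.67} = 0.8174$$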
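The standardization step can be sketched in Python to confirm that the standard scores have mean 0 and variance 1. The four X values (3, 6, 11, 12) are inferred from the slide's totals (ΣX = 32, Σ(X - mean)² = 54) and its Z column, and the function name `standardize` is hypothetical. Because the slide rounds S_x to 3.67 before dividing, its Z values differ slightly in the third decimal place from the unrounded ones computed here.

```python
import math


def standardize(values):
    """Return standard scores using the mean and the divisor-n standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return [(x - mean) / sd for x in values]


# X values inferred from the slide's totals and Z column (assumption).
x_values = [3, 6, 11, 12]
z_scores = standardize(x_values)

z_mean = sum(z_scores) / len(z_scores)
z_variance = sum((z - z_mean) ** 2 for z in z_scores) / len(z_scores)

print([round(z, 2) for z in z_scores])
print(round(z_mean, 10), round(z_variance, 10))
```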

The industry in which sales rep Mr. Atif works has mean annual sales = $2,500 and standard deviation = $500. Mr. Atif's sales were $4,000.

The industry in which sales rep Mr. Asad works has mean annual sales = $4,800 and standard deviation = $600. Mr. Asad's sales were $6,000.

Which of the representatives would you hire if you have one sales position to fill?

Sales rep. Atif: $\bar{X}_B = \$2{,}500$, $S_B = \$500$, $X_B = \$4{,}000$

$$Z_B = \frac{X_B - \bar{X}_B}{S_B} = \frac{4{,}000 - 2{,}500}{500} = 3$$

Sales rep. Asad: $\bar{X}_P = \$4{,}800$, $S_P = \$600$, $X_P = \$6{,}000$

$$Z_P = \frac{X_P - \bar{X}_P}{S_P} = \frac{6{,}000 - 4{,}800}{600} = 2$$
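The comparison of the two sales representatives reduces to comparing their standard scores, which a short Python sketch makes explicit. The function name `z_score` and the dictionary layout are illustrative; the figures are those given for Mr. Atif and Mr. Asad.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations a value lies above (or below) the mean."""
    return (value - mean) / std_dev


reps = {
    "Atif": z_score(value=4000, mean=2500, std_dev=500),
    "Asad": z_score(value=6000, mean=4800, std_dev=600),
}
print(reps)  # {'Atif': 3.0, 'Asad': 2.0}

# The higher standard score marks the stronger performance relative to the industry.
best = max(reps, key=reps.get)
```

Although Mr. Asad sold more in absolute terms, Mr. Atif's sales are further above his industry's average when measured in standard deviations.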

$\bar{X} \pm 1S$ contains about 68% of values
$\bar{X} \pm 2S$ contains about 95% of values
$\bar{X} \pm 3S$ contains about 99.7% of values

Measures of Skewness

A distribution in which the values equidistant from the centre have equal frequencies is defined to be symmetrical, and any departure from symmetry is called skewness.

For a symmetrical distribution:
Mean = Median = Mode
Sk = 0

Measures of skewness:
a) Sk = (Mean - Mode)/SD
b) Sk = (Q3 - 2Q2 + Q1)/(Q3 - Q1)

Measures of Skewness

A distribution is positively skewed if the observations tend to concentrate more at the lower end of the possible values of the variable than the upper end. A positively skewed frequency curve has a longer tail on the right hand side.

Mean > Median > Mode
Sk > 0

Measures of Skewness

A distribution is negatively skewed if the observations tend to concentrate more at the upper end of the possible values of the variable than the lower end. A negatively skewed frequency curve has a longer tail on the left side.

Mean < Median < Mode
Sk < 0
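The two skewness coefficients, (Mean - Mode)/SD and (Q3 - 2Q2 + Q1)/(Q3 - Q1), can be sketched in Python as below. The function names are illustrative. The quartile-based coefficient is applied to the TV viewing quartiles from the earlier slides, and the mode-based coefficient to the eight students' marks, which have a single mode; pairing the formulas with these data sets is a choice made here for illustration, not something the slides do.

```python
import statistics


def skewness_mode_based(values):
    """Sk = (Mean - Mode) / SD, using the divisor-n standard deviation."""
    mean = statistics.mean(values)
    mode = statistics.mode(values)      # assumes the data have a single mode
    return (mean - mode) / statistics.pstdev(values)


def skewness_quartile_based(q1, q2, q3):
    """Sk = (Q3 - 2*Q2 + Q1) / (Q3 - Q1)."""
    return (q3 - 2 * q2 + q1) / (q3 - q1)


# Quartiles of the weekly TV viewing times from the earlier slides.
sk_quartiles = skewness_quartile_based(q1=22.0, q2=30.5, q3=36.5)

# Marks of the 8 students (mean 68.5, mode 68).
marks = [67, 72, 68, 70, 65, 68, 75, 63]
sk_marks = skewness_mode_based(marks)

print(round(sk_quartiles, 4), round(sk_marks, 4))
```

A positive result points to a longer right tail and a negative result to a longer left tail; the quartile-based version looks only at the middle half of the data, so a single extreme value such as 66 does not affect it.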

Measures of Kurtosis

Kurtosis describes the peakedness of a unimodal (single humped) distribution.

When the values of a variable are highly concentrated around the mode, the peak of the curve becomes relatively high; the curve is Leptokurtic.

When the values of a variable have low concentration around the mode, the peak of the curve becomes relatively flat; the curve is Platykurtic.

A curve which is neither very peaked nor very flat-topped, and which is taken as a basis for comparison, is called Mesokurtic/Normal.

Measures of Kurtosis

$$\text{Coefficient of Kurtosis} = \frac{n \sum (X - \bar{X})^4}{\left[\sum (X - \bar{X})^2\right]^2}$$

1. If Coefficient of Kurtosis > 3: Leptokurtic.
2. If Coefficient of Kurtosis = 3: Mesokurtic.
3. If Coefficient of Kurtosis < 3: Platykurtic.
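The moment-based coefficient of kurtosis, n Σ(X - mean)⁴ divided by the square of Σ(X - mean)², can be sketched in Python as below. The function names are illustrative, and the comparison against 3 follows the usual convention, with values above 3 read as leptokurtic and values below 3 as platykurtic. The data are the ordered TV viewing times from the earlier slides.

```python
def coefficient_of_kurtosis(values):
    """n * sum of fourth-power deviations / (sum of squared deviations)^2."""
    n = len(values)
    mean = sum(values) / n
    sum_sq = sum((x - mean) ** 2 for x in values)
    sum_fourth = sum((x - mean) ** 4 for x in values)
    return n * sum_fourth / sum_sq ** 2


def describe_kurtosis(coefficient):
    """Label a coefficient relative to the reference value 3."""
    if coefficient > 3:
        return "Leptokurtic"
    if coefficient < 3:
        return "Platykurtic"
    return "Mesokurtic"


# Ordered weekly TV viewing times (hours) from the earlier slides.
tv_hours = [5, 15, 16, 20, 21, 25, 26, 27, 30, 30,
            31, 32, 32, 34, 35, 37, 38, 41, 43, 66]

kurtosis_tv = coefficient_of_kurtosis(tv_hours)
print(round(kurtosis_tv, 3), describe_kurtosis(kurtosis_tv))
```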
