
1/16/2019

STATISTICS
&
PROBABILITY

Agung Nugroho, Ph.D

Assessment:

1. Assignments (10%)
2. Lab work (15%)
3. Quizzes (15%)
4. Midterm exam (UTS) (30%)
5. Final exam (UAS) (30%)

Course Topics

Week 1 Introduction to statistics and probability
Week 2 Descriptive statistics
Week 3 Descriptive statistics and Excel lab
Week 4 Probability theory
Week 5 Discrete probability distributions
Week 6 Continuous probability distributions
Week 7 Sampling distributions & sampling techniques
Week 8 Midterm exam (UTS)
Week 9 Point estimation and confidence intervals
Week 10 Hypothesis testing
Week 11 Hypothesis testing (part 2)
Week 12 Regression and correlation analysis
Week 13 Regression and correlation analysis
Week 14 Excel lab
Week 15 Introduction to statistical quality control
Week 16 Final exam (UAS)

References
1. Johnson, R. A., & Bhattacharyya, G. K., “Statistics: Principles and Methods”, 6th Edition, Wiley Global Education, 2014.
2. Montgomery, D. C., & Runger, G. C., “Applied Statistics and Probability for Engineers”, John Wiley & Sons, 2014.
3. Levine, D. M., Ramsey, P. P., & Smidt, R. K., “Applied Statistics for Engineers and Scientists: Using Microsoft Excel and Minitab”, Prentice Hall, 2001.

Introduction

1st week

What Do Engineers Do?

• An engineer is someone who solves problems of interest to society with the efficient application of scientific principles by:
  • Refining existing products
  • Designing new products or processes

The Creative Process

Figure 1-1 The engineering method

Statistics Supports the Creative Process

• The field of statistics deals with the collection, presentation, analysis, and use of data to:
  • Make decisions
  • Solve problems
  • Design products and processes
• It is the science of data.
• For students, statistics is important for collecting, organizing, analyzing, and interpreting data during research and thesis work.

Definition

• Statistics is the science of collecting, organizing, analyzing, and interpreting data in order to make decisions in the presence of uncertainty.
• The word also refers to a collection of facts, generally in the form of numbers arranged in a table or diagram.
• Examples: health statistics, birth statistics, etc.

Classification

• Descriptive statistics: collecting, organizing, analyzing, and interpreting data.
• Statistical inference makes use of information from a sample to draw conclusions about the population from which the sample was taken.

⚫ Descriptive Statistics
✓ Collect
✓ Organize
✓ Summarize
✓ Display
✓ Analyze

⚫ Inferential Statistics
✓ Predict and forecast values of population parameters
✓ Test hypotheses about values of population parameters
✓ Make decisions

Variability

• Statistical techniques are useful to describe and understand variability.
• By variability, we mean that successive observations of a system or phenomenon do not produce exactly the same result.
• Statistics gives us a framework for describing this variability and for learning about potential sources of variability.

An Engineering Example of Variability

Eight samples are taken from the output of a wastewater treatment plant, and their Cl concentrations are measured (in ppm):
12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1.

Not all of the samples have the same concentration; the measurements exhibit variability.
The dot diagram is a very useful plot for displaying a small body of data, say up to about 20 observations.

This plot allows us to see easily two features of the data: the location, or the middle, and the scatter, or variability.

Figure: dot diagram of Cl concentration (ppm)
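
The two features a dot diagram shows, location and scatter, can also be read off numerically. The following is a minimal sketch in Python using the eight Cl measurements above; the text-based diagram and the choice of mean and range as the summaries are illustrative assumptions, not part of the original slides.

```python
from collections import Counter

# Cl concentrations (ppm) from the eight wastewater samples
cl_ppm = [12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1]

def dot_diagram(values):
    """Return one text line per distinct value, with one 'o' per observation."""
    counts = Counter(values)
    return [f"{v:5.1f} | {'o' * counts[v]}" for v in sorted(counts)]

lines = dot_diagram(cl_ppm)
print("\n".join(lines))

location = sum(cl_ppm) / len(cl_ppm)   # middle of the data (mean)
scatter = max(cl_ppm) - min(cl_ppm)    # spread of the data (range)
print(f"location = {location:.2f} ppm, scatter = {scatter:.2f} ppm")
```

The repeated value 12.6 shows up as two dots on one line, which is the kind of detail a dot diagram makes visible for small data sets.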

Hypothesis Tests

Hypothesis test
• A statement about a process behavior value.
• It is compared to a claim about another process value.
• Data is gathered to support or refute the claim.

One-sample hypothesis test:
• Example: chlorine concentration (ppm) = 30
  vs.
  chlorine concentration (ppm) < 30

Two-sample hypothesis test:
• Example: chlorine conc. at A (ppm) – chlorine conc. at B (ppm) = 0
  vs.
  chlorine conc. at A (ppm) – chlorine conc. at B (ppm) > 0

An Experiment in Variation

W. Edwards Deming, a famous industrial statistician and contributor to the Japanese quality revolution, conducted an illustrative experiment on process over-control, or tampering.

Let’s look at his apparatus and experimental procedure.

Marbles were dropped through a funnel onto a target, and the location where each marble struck the target was recorded.
Variation was caused by several factors: marble placement in the funnel and release dynamics, vibration, air currents, and measurement errors.

How Is the Change Detected Graphically?

The center line on the control chart is just the average of the concentration measurements for the first 20 samples, x̄ = 91.5 g/L, when the process is stable.

The upper control limit and the lower control limit are located 3 standard deviations of the concentration values above and below the center line.

Figure 1-5 A control chart for the chemical process concentration data.
The process steps outside the control limits at hours 24 and 29. Shut down and adjust the process.
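
The control-limit rule described above (center line at the stable-period average, limits 3 standard deviations above and below) can be sketched as follows. The slides do not list the concentration data, so the numbers here are hypothetical and invented only for illustration; the function names and the use of the sample standard deviation are also assumptions.

```python
import statistics

def control_limits(baseline):
    """Return (lower limit, center line, upper limit) from stable-period data."""
    center = statistics.mean(baseline)
    spread = statistics.stdev(baseline)  # sample standard deviation (assumed)
    return center - 3 * spread, center, center + 3 * spread

def out_of_control(values, lower, upper):
    """Return the 0-based positions of values outside the control limits."""
    return [i for i, v in enumerate(values) if v < lower or v > upper]

# Hypothetical concentrations (g/L), NOT the data behind Figure 1-5.
baseline = [91.0, 92.0, 91.5, 90.5, 92.5, 91.5]
lower, center, upper = control_limits(baseline)

# Hypothetical later readings checked against the limits.
flagged = out_of_control([91.2, 95.0, 91.8, 88.0], lower, upper)
print(lower, center, upper, flagged)
```

In practice the standard deviation used for control limits is often estimated in other ways, so this sketch only mirrors the wording of the slide.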


Mechanistic and Empirical Models

A mechanistic model is built from our underlying knowledge of the basic physical mechanism that relates several variables.
Example: Ohm’s Law
Current = Voltage / Resistance
I = E/R
I = E/R + ε
where ε is a term added to the model to account for the fact that the observed values of current flow do not perfectly conform to the mechanistic model.
• The form of the function is known.

An empirical model is built from our engineering and scientific knowledge of the phenomenon, but is not directly developed from our theoretical or first-principles understanding of the underlying mechanism.
• The form of the function is not known.

An Example of an Empirical Model

• In a semiconductor manufacturing plant, the finished semiconductor is wire-bonded to a frame. In an observational study, the variables recorded were:
  • Pull strength to break the bond (y)
  • Wire length (x1)
  • Die height (x2)

Visualizing the Data and Resultant Model Using Regression Analysis

Figure panels:
• 3D plot of the pull strength (y), wire length (x1), and die height (x2) data.
• 3D plot of the predicted values (a plane) of pull strength from the empirical regression model.

DESCRIPTIVE
STATISTICS

Descriptive Statistics

• Describe the basic features of the data in a study.
• They provide simple summaries about the sample and the measures.
• Together with simple graphical analysis, they form the basis of virtually every quantitative analysis of data.

Key Terms

• A variable is a characteristic that changes or varies over time and/or for different individuals or objects under consideration.
  Examples: hair color, white blood cell count, bottom outlet of a distillation tower, etc.
• Data is a set of measurements, which can come from either a sample or a population.
• A population is the set representing all measurements of interest to the investigator. A population is any entire collection of people, animals, plants, or things from which we may collect data.
  In order to make any generalizations about a population, a sample that is meant to be representative of the population is often studied.
• A sample is a subset of measurements selected from the population of interest. The sample should be representative of the population.

Population and Sample: Example

• Population: 150-plus million adult Americans.
• Sample: 1,500 interviewed.

Figure: Population (N) and Sample (n)

Types of Data

Ex: red, black, blue, white

Ex: none, mild, moderate, severe

Ex: 1 person, 3 students, 5 pets

Ex: 166 cm, 63.9 kg, etc.

Types of Measurement Scales

• Nominal scale: groups, classes, categories
  ✓ Gender, color, professional classification, etc.
• Ordinal scale: order matters
  ✓ Ranks (top ten videos, products, etc.)
• Interval scale: difference or distance matters
  ✓ Temperatures (°F, °C)
• Ratio scale: ratio matters
  ✓ Salaries, weight, volume, area, length, etc.

Percentiles and Quartiles

⚫ Percentiles partition the data into 100 segments.
⚫ The Pth percentile in the ordered set is that value below which lie P% (P percent) of the observations in the set.
⚫ The position of the Pth percentile is given by (n + 1)P/100, where n is the number of observations in the set.

Example

The magazine Forbes publishes annually a list of the world’s wealthiest individuals. For 2007, the net worth of the 20 richest individuals, in $ billions, is as follows:

Billions   Sorted Billions
33         18
26         18
24         18
21         18
19         19
20         20
18         20
18         20
52         21
56         22
27         22
22         23
18         24
49         26
22         27
20         32
23         33
32         49
20         52
18         56

Find the 50th, 80th and the 90th percentiles of this data set.

Example (Continued)
Percentiles
⚫ To find the 50th percentile, determine the data point in position (n + 1)P/100 = (20 + 1)(50/100) = 10.5.
⚫ Thus, the percentile is located at the 10.5th position.
⚫ The 10th observation in the ordered set is 22, and the 11th observation is also 22.
⚫ The 50th percentile will lie halfway between the 10th and 11th values (which are both 22 in this case) and is thus 22.

Example

⚫ To find the 80th percentile, determine the data point in position (n + 1)P/100 = (20 + 1)(80/100) = 16.8.
⚫ Thus, the percentile is located at the 16.8th position.
⚫ The 16th observation is 32, and the 17th observation is 33.
⚫ The 80th percentile is a point lying 0.8 of the way from 32 to 33 and is thus 32.8.

Example

⚫ To find the 90th percentile, determine the data point in position (n + 1)P/100 = (20 + 1)(90/100) = 18.9.
⚫ Thus, the percentile is located at the 18.9th position.
⚫ The 18th observation is 49, and the 19th observation is 52.
⚫ The 90th percentile is a point lying 0.9 of the way from 49 to 52 and is thus 49 + 0.9(52 – 49) = 49 + 0.9 × 3 = 49 + 2.7 = 51.7.
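
The (n + 1)P/100 position rule with interpolation between neighboring observations can be written as a short function. This is a sketch of the rule as used in the worked examples; the function name and the handling of positions that fall outside the data (clamping to the smallest or largest value) are assumptions, since the slides do not cover that case.

```python
def percentile(data, p):
    """Pth percentile using the (n + 1)P/100 position rule with interpolation."""
    ordered = sorted(data)
    n = len(ordered)
    position = (n + 1) * p / 100       # 1-based position, may be fractional
    lower_rank = int(position)         # whole part of the position
    fraction = position - lower_rank   # how far to move toward the next value
    if lower_rank < 1:                 # assumed: clamp to the smallest value
        return ordered[0]
    if lower_rank >= n:                # assumed: clamp to the largest value
        return ordered[-1]
    low, high = ordered[lower_rank - 1], ordered[lower_rank]
    return low + fraction * (high - low)

# Forbes 2007 net worth data ($ billions) from the example
net_worth = [33, 26, 24, 21, 19, 20, 18, 18, 52, 56,
             27, 22, 18, 49, 22, 20, 23, 32, 20, 18]

for p in (50, 80, 90):
    print(p, round(percentile(net_worth, p), 2))
```

Spreadsheet and statistics packages may use a different position convention, so their percentiles can differ slightly from these hand-calculated values.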


Quartiles – Special Percentiles

⚫ Quartiles are the percentage points that break down the ordered data set into quarters.
⚫ The first quartile (lower quartile, Q1) is the 25th percentile. It is the point below which lie 1/4 of the data.
⚫ The second quartile (middle quartile, Q2) is the 50th percentile. It is the point below which lie 1/2 of the data. This is also called the median.
⚫ The third quartile (upper quartile, Q3) is the 75th percentile. It is the point below which lie 3/4 of the data.
⚫ The interquartile range (IQR) is the difference between the third and the first quartiles:
IQR = Q3 – Q1

Example: Finding Quartiles

Sorted data ($ billions): 18, 18, 18, 18, 19, 20, 20, 20, 21, 22, 22, 23, 24, 26, 27, 32, 33, 49, 52, 56

Quartile         Position (n+1)P/100      Value
First Quartile   (20+1)25/100 = 5.25      19 + (.25)(1) = 19.25
Median           (20+1)50/100 = 10.5      22 + (.5)(0) = 22
Third Quartile   (20+1)75/100 = 15.75     27 + (.75)(5) = 30.75

Your Turn!

Fifty statistics students were asked how much sleep they get per school night (rounded to the nearest hour). The results were (student data):

Amount of sleep per school night (hours)   Frequency
4                                          2
5                                          5
6                                          7
7                                          12
8                                          14
9                                          7
10                                         3

• 28th percentile =
• 3rd quartile =
• 80th percentile =
• 90th percentile =

Summary Measures:

⚫ Measures of Central Tendency
✓ Median
✓ Mode
✓ Mean

⚫ Measures of Variability
✓ Range
✓ Interquartile range
✓ Variance
✓ Standard Deviation

⚫ Measures of Shape:
✓ Skewness
✓ Kurtosis

MEASURES OF
CENTER

(Measures of Central Tendency)
Mean, Median, Mode

Arithmetic Mean or Average

The mean of a set of measurements is the sum of the measurements divided by the total number of measurements.
Symbol: $\bar{x}$ (x bar)

Ungrouped data: $\bar{x} = \dfrac{\sum x_i}{n}$
Grouped data: $\bar{x} = \dfrac{\sum f_i x_i}{n}$

where n = number of measurements
$\sum x_i$ = sum of measurements
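
To illustrate the grouped-data formula, the sketch below computes the mean from a frequency table, taking f_i as the number of times the value x_i occurs and n as the sum of the frequencies. The table is tallied from the milk-purchase data used in a later example in these notes; the function name and the dictionary layout are assumptions made for illustration.

```python
def grouped_mean(freq_table):
    """Mean of grouped data: sum(f_i * x_i) / n, where n = sum(f_i)."""
    n = sum(freq_table.values())
    weighted_sum = sum(x * f for x, f in freq_table.items())
    return weighted_sum / n

# value -> frequency, tallied from the 25 household milk purchases (quarts)
milk_freq = {0: 2, 1: 5, 2: 9, 3: 5, 4: 3, 5: 1}

print(grouped_mean(milk_freq))
```

The result agrees with the ungrouped calculation because multiplying each value by its frequency is the same as adding the value that many times.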

Example: Mean
Consider 8 observations (xi) of pull-off force from engine connectors as shown in the table.

i    xi
1    12.6
2    12.9
3    13.4
4    12.3
5    13.6
6    13.5
7    12.6
8    13.1

$$\bar{x} = \text{average} = \frac{\sum_{i=1}^{8} x_i}{8} = \frac{12.6 + 12.9 + \dots + 13.1}{8} = \frac{104}{8} = 13.0 \text{ pounds}$$

In Excel: = AVERAGE($B2:$B9) gives 13.00

Figure 6-1 The sample mean is the balance point.

If we were able to enumerate the whole population, the population mean would be called μ (the Greek letter “mu”).

Median
• The median of a set of measurements is the middle measurement when the measurements are ranked from smallest to largest.
• The position of the median is 0.5(n + 1) once the measurements have been ordered.
• Also called the second quartile or 50th percentile.

Example
• The set: 2, 4, 9, 8, 6, 5, 3 (n = 7)
• Sort: 2, 3, 4, 5, 6, 8, 9
• Position: 0.5(n + 1) = 0.5(7 + 1) = 4th
• Median = 5

• The set: 2, 4, 9, 8, 6, 5 (n = 6)
• Sort: 2, 4, 5, 6, 8, 9
• Position: 0.5(n + 1) = 0.5(6 + 1) = 3.5th → average of the 3rd and 4th data values
• Median = (5 + 6)/2 = 5.5

Mode

• The mode is the value that occurs most frequently.
• Example:
1. The set: 2, 4, 9, 8, 8, 5, 3
   The mode is 8, which occurs twice.
2. The set: 2, 2, 9, 8, 8, 5, 3
   There are two modes, 8 and 2 (bimodal).
3. The set: 2, 4, 9, 8, 5, 3
   There is no mode (each value is unique).

Example
The number of quarts of milk purchased by 25 households:
0 0 1 1 1 1 1 2 2 2 2 2 2 2 2 2 3 3 3 3 3 4 4 4 5

• Mean?
  $\bar{x} = \dfrac{\sum x_i}{n} = \dfrac{55}{25} = 2.2$
• Median?
  m = 2
• Mode? (Highest peak)
  mode = 2
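
The three measures of center for the milk data can be checked with Python's standard `statistics` module. This is only a verification sketch; the variable names are chosen for illustration.

```python
import statistics

# Quarts of milk purchased by 25 households
milk_quarts = [0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2,
               3, 3, 3, 3, 3, 4, 4, 4, 5]

mean = statistics.mean(milk_quarts)        # sum of values / count
median = statistics.median(milk_quarts)    # middle value of the sorted data
modes = statistics.multimode(milk_quarts)  # every most-frequent value

print(mean, median, modes)
```

Note that `statistics.multimode` returns every value when each occurs exactly once, whereas these notes describe such a set as having no mode, so the two conventions differ for that case.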

MEASURES OF
VARIABILITY

(Measures of Dispersion)
Range, Interquartile Range, Variance, Standard Deviation

Variability

• Tells us how far scores spread out.
• Tells us the degree to which scores deviate from the central tendency.

Figure: two data sets, each labeled Mean = 10

Measures of Variability or Dispersion
⚫ Range
✓ Difference between maximum and minimum values
⚫ Interquartile Range
✓ Difference between third and first quartile (Q3 - Q1)
⚫ Variance
✓ Average of the squared deviations from the mean
⚫ Standard Deviation
✓ Square root of the variance

Sample Range
If the n observations in a sample are denoted by x1, x2, …, xn, the sample range is:

r = max(xi) – min(xi)    (6-6)

It is the largest observation in the sample minus the smallest observation.

From the pull-off force example:
r = 13.6 – 12.3 = 1.30

Note that: population range ≥ sample range

Example 1-3: Finding range

Billions   Sorted Billions   Rank
33         18                1
26         18                2
24         18                3
21         18                4
19         19                5
20         20                6
18         20                7
18         20                8
52         21                9
56         22                10
27         22                11
22         23                12
18         24                13
49         26                14
22         27                15
20         32                16
23         33                17
32         49                18
20         52                19
18         56                20

Range = Maximum – Minimum = 56 – 18 = 38

First Quartile: position (20+1)25/100 = 5.25, value 19 + (.25)(1) = 19.25
Median: position (20+1)50/100 = 10.5, value 22 + (.5)(0) = 22
Third Quartile: position (20+1)75/100 = 15.75, value 27 + (.75)(5) = 30.75

Interquartile Range = Q3 – Q1 = 30.75 – 19.25 = 11.5

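Variance

Sample variance:
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{\sum x_i^2 - \dfrac{\left(\sum x_i\right)^2}{n}}{n - 1}$$

Population variance:
$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N} = \frac{\sum x_i^2 - \dfrac{\left(\sum x_i\right)^2}{N}}{N}$$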

Standard Deviation

• The standard deviation is the square root of the variance.
• σ is the population standard deviation symbol.
• s is the sample standard deviation symbol.

Sample standard deviation: $s = \sqrt{s^2}$
Population standard deviation: $\sigma = \sqrt{\sigma^2}$

Example: Sample Variance

The table below displays the quantities needed to calculate the sample variance and sample standard deviation.

i      xi       xi - xbar   (xi - xbar)²
1      12.6     -0.4        0.16
2      12.9     -0.1        0.01
3      13.4     0.4         0.16
4      12.3     -0.7        0.49
5      13.6     0.6         0.36
6      13.5     0.5         0.25
7      12.6     -0.4        0.16
8      13.1     0.1         0.01
sums   104.00   0.0         1.60

xbar = 104.00 / 8 = 13.00
variance = 1.60 / 7 = 0.2286
standard deviation = 0.48

Dimensions:
• xi is in pounds.
• Mean is in pounds.
• Variance is in pounds².
• Standard deviation is in pounds.

Desired accuracy is generally accepted to be one more place than the data.
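
The table calculation can be reproduced step by step in code. The sketch below follows the definition (squared deviations from the mean, divided by n - 1) and cross-checks the result against Python's `statistics` module; the variable names are illustrative.

```python
import math
import statistics

# Pull-off force observations (pounds)
force = [12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1]

n = len(force)
xbar = sum(force) / n                           # sample mean
squared_devs = [(x - xbar) ** 2 for x in force]
sample_var = sum(squared_devs) / (n - 1)        # divide by n - 1, not n
sample_sd = math.sqrt(sample_var)

# Cross-check against the standard library's sample variance.
assert math.isclose(sample_var, statistics.variance(force))

print(round(xbar, 2), round(sample_var, 4), round(sample_sd, 2))
```

Dividing the same sum of squared deviations by n = 8 instead of n - 1 = 7 would give the population-style variance, 0.20, which is what `statistics.pvariance` computes.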


Example: Variance by Shortcut

i      xi      xi²
1      12.6    158.76
2      12.9    166.41
3      13.4    179.56
4      12.3    151.29
5      13.6    184.96
6      13.5    182.25
7      12.6    158.76
8      13.1    171.61
sums   104.0   1,353.60

$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1} = \frac{1{,}353.60 - \dfrac{(104.0)^2}{8}}{7} = \frac{1.60}{7} = 0.2286 \text{ pounds}^2$$

$$s = \sqrt{0.2286} = 0.48 \text{ pounds}$$
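
The shortcut formula needs only two running totals, the sum of the values and the sum of their squares. A minimal sketch with the same pull-off force data follows; the variable names are illustrative.

```python
import math

# Pull-off force observations (pounds)
force = [12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1]

n = len(force)
sum_x = sum(force)                   # 104.0
sum_x2 = sum(x * x for x in force)   # 1,353.60

# Shortcut: (sum of squares - (sum)^2 / n) / (n - 1)
shortcut_var = (sum_x2 - sum_x ** 2 / n) / (n - 1)
shortcut_sd = math.sqrt(shortcut_var)

print(round(shortcut_var, 4), round(shortcut_sd, 2))
```

The shortcut is convenient for hand or spreadsheet work, but in floating-point arithmetic it subtracts two large, nearly equal numbers, so the definition-based calculation is usually the safer choice when the mean is large relative to the spread.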

Exercise

In an experiment, the Cl⁻ concentration of a solution is measured 8 times by one operator using the same instrument. She obtains the following data (ppm):

7.15, 7.20, 7.18, 7.19, 7.21, 7.20, 7.16, and 7.18

• Calculate the sample mean, mode, and median.
• Find the 28th percentile, 80th percentile, and 1st quartile.
• Calculate the variance and standard deviation.
