Fundamental of Statistic Kel. 1

Fundamentals of Statistics
MEMBERS OF GROUP 1:
DETYA INDRIAWAN
DIAH AULIA I
KARINA PRAVITASARI
MASATUL FARHAH
TIARA ARISENDA K
Pendidikan Biologi Reguler

2013
Statistics?
A collection of quantitative data from a sample or
population.
The science that deals with the collection, tabulation,
analysis, interpretation, and presentation of quantitative
data.
Types of statistics
Deductive or descriptive statistics
describe and analyze a complete data set.
Inductive statistics
deal with a limited amount of data (a sample).
Conclusions about the population are therefore stated in terms of probability.
Population
A population is any entire collection of people, animals,
plants or things from which we may collect data.
It is the entire group we are interested in, which we wish
to describe or draw conclusions about.
For each population there are many possible samples.
Sample
A sample is a group of units selected from a larger group
(population).
By studying the sample, we hope to draw valid conclusions about
the population.
The sample should be representative of the general population.
The best way to achieve this is random sampling.
Parameter
A parameter is a value, usually unknown (and which
therefore has to be estimated), used to represent a
certain population characteristic.
For example, the population mean is a parameter that is
often used to indicate the average value of a quantity.
Inferential Statistics
Statistical Inference makes use of information from a
sample to draw conclusions (inferences) about the
population from which the sample was taken.
Types of data
Variables data
quality characteristics that are measurable values.
measurable and normally continuous;
may take on any value - eg. weight in kg
Attribute data
quality characteristics that are observed to be either
present or absent, conforming or nonconforming.
countable and normally discrete; integer values, e.g. 0, 1,
5, 25, but never a fractional value such as 4.65
Describing the Data

Graphical:
Plot or picture of a frequency distribution.
Analytical:
Summarize data by computing measures of central tendency
and dispersion.
Sampling Methods
Sampling methods are methods for selecting a
sample from the population:
Simple random sampling - equal chance for each
member of the population to be selected for the sample.
Systematic sampling - the process of selecting every n-th
member of the population arranged in a list.
Stratified sampling - obtained by dividing the population
into subgroups and then randomly selecting from each
subgroup.
Cluster sampling - groups are selected
rather than individuals.
Incidental or convenience sampling - taking an intact
group (e.g. your own fourth grade class of pupils).
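The first four sampling methods can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 100 numbered pupils, the five classes of 20, and all function names are hypothetical and do not come from the slides.

```python
import random

def simple_random_sample(population, k, rng):
    """Every member has the same chance of being selected."""
    return rng.sample(population, k)

def systematic_sample(population, step, rng):
    """Pick a random start, then take every step-th member of the list."""
    start = rng.randrange(step)
    return population[start::step]

def stratified_sample(strata, k_per_stratum, rng):
    """Draw a simple random sample from each subgroup (stratum)."""
    return {name: rng.sample(members, k_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Select whole groups at random and keep every member of those groups."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

rng = random.Random(0)        # fixed seed so the sketch is repeatable
pupils = list(range(1, 101))  # hypothetical population: pupils numbered 1..100
classes = {f"class_{c}": pupils[c * 20:(c + 1) * 20] for c in range(5)}

srs = simple_random_sample(pupils, 10, rng)
systematic = systematic_sample(pupils, 10, rng)
stratified = stratified_sample(classes, 2, rng)
clustered = cluster_sample(classes, 2, rng)
```

A convenience sample needs no code: it is simply whichever intact group is at hand, for example `classes["class_0"]`.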
Frequency Distribution
Consider the following set of data, which are the high
temperatures recorded for 30 consecutive days.
We wish to summarize this data by creating a
frequency distribution of the temperatures.
Data Set - High Temperatures for 30 Days
50 45 49 50 43 49 50 49 45 49
47 47 44 51 51 44 47 46 50 44
51 49 43 43 49 45 46 45 51 46
Example of Frequency Distribution
Frequency Distribution for High Temperatures
Temperature   Tally     Frequency
51            ////      4
50            ////      4
49            //////    6
48                      0
47            ///       3
46            ///       3
45            ////      4
44            ///       3
43            ///       3
N = 30
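The tally above can also be produced programmatically. A minimal Python sketch, assuming the 30 temperatures are typed in as a list (the variable names are illustrative and not part of the slides):

```python
from collections import Counter

temps = [50, 45, 49, 50, 43, 49, 50, 49, 45, 49,
         47, 47, 44, 51, 51, 44, 47, 46, 50, 44,
         51, 49, 43, 43, 49, 45, 46, 45, 51, 46]

counts = Counter(temps)

# List every value from the highest to the lowest, including values
# that never occur (48 here), so the table has no gaps.
freq_table = [(t, counts[t]) for t in range(max(temps), min(temps) - 1, -1)]

print("Temperature  Tally   Frequency")
for t, f in freq_table:
    print(f"{t:<12} {'/' * f:<7} {f}")
print("N =", sum(f for _, f in freq_table))
```

`Counter` returns 0 for a value it has never seen, which is why 48 appears in the table with an empty tally.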
Cumulative Frequency Distribution
A cumulative frequency distribution can be created by
adding an additional column called "Cumulative
Frequency."
The cumulative frequency for a given value is obtained
by adding the frequency for that value to the
cumulative frequency for the value just below it.
For example: the cumulative frequency for 45 is 10, which is
the cumulative frequency for 44 (6) plus the frequency for 45
(4).
Finally, notice that the cumulative frequency for the highest
value should be the same as the total of the frequency
column.
Cumulative Frequency Distribution for High Temperatures
Temperature   Tally     Frequency   Cumulative Frequency
51            ////      4           30
50            ////      4           26
49            //////    6           22
48                      0           16
47            ///       3           16
46            ///       3           13
45            ////      4           10
44            ///       3           6
43            ///       3           3
N = 30
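The cumulative column is a running total that starts at the lowest value. A short Python sketch of that rule, assuming the same 30 temperatures (variable names are illustrative):

```python
from collections import Counter
from itertools import accumulate

temps = [50, 45, 49, 50, 43, 49, 50, 49, 45, 49,
         47, 47, 44, 51, 51, 44, 47, 46, 50, 44,
         51, 49, 43, 43, 49, 45, 46, 45, 51, 46]

counts = Counter(temps)
values = list(range(min(temps), max(temps) + 1))  # 43 .. 51, lowest first
freqs = [counts[v] for v in values]
cum_freqs = list(accumulate(freqs))               # running total from the lowest value up

# Print from the highest value down, as in the table.
for v, f, cf in reversed(list(zip(values, freqs, cum_freqs))):
    print(f"{v:<12} {f:<10} {cf}")
```

The last running total equals the number of observations, which is a quick check that no value was missed.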
Grouped frequency distribution
In some cases it is necessary to group the values of the
data to summarize the data properly.
E.g., we wish to create a frequency distribution for the IQ
scores of 30 pupils.
The IQ scores range from 73 to 139.
To include these scores in a frequency distribution we would
need 67 different score values (139 down to 73).
This would not summarize the data very much.
To solve this problem we would group scores together
and create a grouped frequency distribution.
If the data have more than 20 score values, we should create
a grouped frequency distribution by grouping score values
together into class intervals.
Grouped frequency
Look at the following data of high temperatures for 50 days.
The highest temperature is 59 and the lowest temperature is 39.
Listing every value would give 21 temperature values (59 - 39 + 1).
This is greater than 20 values, so we should create a
grouped frequency distribution.
Data Set - High Temperatures for 50 Days
57 39 52 52 43 50 53 42 58 55
58 50 53 50 49 45 49 51 44 54
49 57 55 59 45 50 45 51 54 58
53 49 52 51 41 52 40 44 49 45
43 47 47 43 51 55 55 46 54 41
Grouped Frequency Distribution for High Temperatures
Class Interval   Tally         Interval Midpoint   Frequency
57-59            //////        58                  6
54-56            ///////       55                  7
51-53            ///////////   52                  11
48-50            /////////     49                  9
45-47            ///////       46                  7
42-44            //////        43                  6
39-41            ////          40                  4
N = 50
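The grouping can be checked with a short Python sketch. It assumes a class width of 3 starting at the lowest temperature, which is how the intervals in the table are laid out; the variable names are illustrative.

```python
temps = [57, 39, 52, 52, 43, 50, 53, 42, 58, 55,
         58, 50, 53, 50, 49, 45, 49, 51, 44, 54,
         49, 57, 55, 59, 45, 50, 45, 51, 54, 58,
         53, 49, 52, 51, 41, 52, 40, 44, 49, 45,
         43, 47, 47, 43, 51, 55, 55, 46, 54, 41]

width = 3            # assumed class width, read off the table (39-41, 42-44, ...)
start = min(temps)   # lower limit of the first class interval

classes = []         # one (lower, upper, midpoint, frequency) tuple per interval
while start <= max(temps):
    end = start + width - 1
    freq = sum(start <= t <= end for t in temps)
    classes.append((start, end, (start + end) / 2, freq))
    start += width

# Print from the highest interval down, as in the table.
for lower, upper, midpoint, freq in reversed(classes):
    print(f"{lower}-{upper}   {'/' * freq:<12} {midpoint:<6} {freq}")
```

A different width or starting point gives a different but equally valid grouping; the choice here only mirrors the table.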
Histograms
Constructing a Histogram for Discrete Data
First, determine the frequency and relative frequency of each x value.
Then mark the possible x values on a horizontal scale.
Descriptive statistics
Measures of Central Tendency
Describes the center position of the data
Mean, Median, Mode
Measures of Dispersion
Describes the spread of the data
Range, Variance, Standard deviation
Measures of central tendency: Mean
Arithmetic mean: $\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i$
where $x_i$ is one observation, $\sum$ means "add up what
follows", and $N$ is the number of observations.
So, for example, if the data are 0, 2, 5, 9, 12,
the mean is (0+2+5+9+12)/5 = 28/5 = 5.6
Median - mode
Median = the observation in the middle of sorted data
Mode = the most frequently occurring value
Median and mode
100 91 85 84 75 72 72 69 65
Mode = 72 (the only value that occurs twice)
Median = 75 (the fifth of the nine sorted values)
Mean = 79.22
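These three measures can be verified with Python's standard `statistics` module. A minimal sketch using the two data sets from the slides (variable names are illustrative):

```python
import statistics

example = [0, 2, 5, 9, 12]
scores = [100, 91, 85, 84, 75, 72, 72, 69, 65]

mean_example = statistics.mean(example)    # (0 + 2 + 5 + 9 + 12) / 5
mean_scores = statistics.mean(scores)      # 713 / 9
median_scores = statistics.median(scores)  # middle value after sorting
mode_scores = statistics.mode(scores)      # most frequently occurring value

print(mean_example, round(mean_scores, 2), median_scores, mode_scores)
```

With an even number of observations there is no single middle value, and `statistics.median` returns the average of the two middle ones.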
Measures of dispersion: range
The range is calculated by taking the maximum value and
subtracting the minimum value.
2 4 6 8 10 12 14
Range = 14 - 2 = 12
variance
Calculate the deviation from the mean for every
observation.
Square each deviation.
Add them up and divide by the number of observations:
$\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^2$
standard deviation
The standard deviation is the square root of the variance.
The variance is in squared units, so the standard
deviation is in the same units as x.
$\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^2}$
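The three measures of dispersion can be worked through step by step. A small Python sketch using the range example (2, 4, ..., 14) and dividing by the number of observations, as described above; variable names are illustrative:

```python
import math

data = [2, 4, 6, 8, 10, 12, 14]

data_range = max(data) - min(data)              # maximum minus minimum

mean = sum(data) / len(data)
squared_devs = [(x - mean) ** 2 for x in data]  # squared deviation of each observation
variance = sum(squared_devs) / len(data)        # divide by the number of observations
std_dev = math.sqrt(variance)                   # back in the same units as the data

print(data_range, variance, std_dev)
```

The standard library functions `statistics.pvariance` and `statistics.pstdev` compute the same two quantities.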
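Standard Deviation for a Sample
General formula/ungrouped data:
$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$
For computation purposes:
$s = \sqrt{\frac{n \sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}{n(n - 1)}}$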
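Both forms give the same result, which a short Python sketch can confirm. The nine scores from the median and mode slide are reused here purely as sample data; that choice and the variable names are assumptions for illustration.

```python
import math
import statistics

scores = [100, 91, 85, 84, 75, 72, 72, 69, 65]
n = len(scores)
mean = sum(scores) / n

# General formula: squared deviations from the mean, divided by n - 1.
s_general = math.sqrt(sum((x - mean) ** 2 for x in scores) / (n - 1))

# Computational formula: needs only the sum and the sum of squares.
sum_x = sum(scores)
sum_x2 = sum(x * x for x in scores)
s_computational = math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))

print(s_general, s_computational, statistics.stdev(scores))
```

The computational form avoids calculating the mean first, which made it convenient for hand calculation.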
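Standard Deviation for a Sample
Grouped data:
$s = \sqrt{\frac{n \sum_{i=1}^{k} f_i X_i^2 - \left(\sum_{i=1}^{k} f_i X_i\right)^2}{n(n - 1)}}$
where $X_i$ is the midpoint of class interval $i$, $f_i$ is its frequency, $k$ is the number of class intervals, and $n = \sum f_i$.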
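As an illustration, the grouped formula can be applied to the grouped temperature table (midpoints 58 down to 40). This Python sketch assumes the usual reading of the formula, in which every observation in a class is treated as if it sat at the class midpoint; variable names are illustrative.

```python
import math
import statistics

midpoints = [58, 55, 52, 49, 46, 43, 40]
freqs = [6, 7, 11, 9, 7, 6, 4]

n = sum(freqs)                                              # total number of observations
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))       # sum of f_i * X_i
sum_fx2 = sum(f * x * x for f, x in zip(freqs, midpoints))  # sum of f_i * X_i^2

s_grouped = math.sqrt((n * sum_fx2 - sum_fx ** 2) / (n * (n - 1)))

# Cross-check: repeat each midpoint f_i times and use the ungrouped formula.
expanded = [x for f, x in zip(freqs, midpoints) for _ in range(f)]
s_check = statistics.stdev(expanded)

print(s_grouped, s_check)
```

Because the midpoints stand in for the real values, the result is an approximation of the standard deviation of the raw data.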
Standard deviation and curve shape
If σ is small, there is a high probability of getting a value
close to the mean.
If σ is large, there is a correspondingly higher probability
of getting values further away from the mean.
The Normal Curve

The normal curve or the normal
frequency distribution or Gaussian
distribution is a hypothetical
distribution that is widely used in
statistical analysis.
The characteristics of the normal
curve make it useful in education and
in the physical and social sciences.
Characteristics of the
Normal Curve
The normal curve is a symmetrical distribution
of data with an equal number of data above and
below the midpoint of the abscissa.
Since the distribution of data is symmetrical the
mean, median, and mode are all at the same
point on the abscissa.
In other words, mean = median = mode.
If we divide the distribution up into standard
deviation (σ) units, a known proportion of the data lies
within each portion of the curve.
34.13% of the data lie between the mean (μ) and 1σ above the mean.
34.13% lie between the mean and 1σ below the mean.
Approximately two-thirds (about 68.3%) lie within 1σ of the mean.
13.59% of the data lie between one and two standard
deviations above the mean, and another 13.59% between one and two below it.
Finally, almost all of the data (about 99.7%) are within 3σ of the
mean.
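These proportions can be recomputed from the standard normal distribution. A minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); variable names are illustrative:

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# cdf(z) is the proportion of data lying below z standard deviations.
mean_to_1sd = std_normal.cdf(1) - std_normal.cdf(0)
within_1sd = std_normal.cdf(1) - std_normal.cdf(-1)
between_1_and_2sd = std_normal.cdf(2) - std_normal.cdf(1)
within_3sd = std_normal.cdf(3) - std_normal.cdf(-3)

for label, p in [("mean to 1 SD", mean_to_1sd),
                 ("within 1 SD", within_1sd),
                 ("between 1 and 2 SD", between_1_and_2sd),
                 ("within 3 SD", within_3sd)]:
    print(f"{label}: {p:.2%}")
```

Figures quoted in textbooks can differ in the last digit because they are often built by adding already rounded table values.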
Standardized normal value, Z
When a score is expressed in standard deviation
units, it is referred to as a Z-score.
A score that is one standard deviation above the
mean has a Z-score of 1.
A score that is one standard deviation below the
mean has a Z-score of -1.
A score that is at the mean would have a Z-score
of 0.
The normal curve with Z-scores along the
abscissa looks exactly like the normal curve with
standard deviation units along the abscissa.
Z-value
Deviation IQ Scores, sometimes called Wechsler IQ scores,
are a standard score with a mean of 100 and a standard
deviation of 15.
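What percentage of the general population have deviation
IQs lower than 85?
z = (85 - 100) / 15 = -1, so an IQ of 85 is equivalent to a z-value of -1.
Half of the population (50%) lies below the mean, and 34.13% lies between z = -1 and the mean.
So 50% - 34.13% = 15.87% of the population has IQ
scores lower than 85.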
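The same answer can be obtained directly from a normal distribution with mean 100 and standard deviation 15. A short Python sketch, assuming IQ scores follow that normal distribution exactly (variable names are illustrative):

```python
from statistics import NormalDist

mean_iq, sd_iq = 100, 15
score = 85

z = (score - mean_iq) / sd_iq                    # score expressed in standard deviation units
share_below = NormalDist(mean_iq, sd_iq).cdf(score)  # proportion of the population below the score

print(f"z = {z}, share below {score}: {share_below:.2%}")
```

Changing `score` answers the same question for any other IQ value.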
Frequency Polygon
A frequency polygon is what you may think of as a curve.
A frequency polygon can be created with interval or ratio
data.
Let's create a frequency polygon with the data we used
earlier to create a histogram.
To create a frequency polygon
Arrange the values along the abscissa (horizontal axis).
Arrange the lowest data on the left and the highest on
the right.
Add one value below the lowest data and one above
the highest data.
Create an ordinate (vertical axis).
Arrange the frequency values along the ordinate.
Provide a label for the ordinate (Frequency).
Create the body of the frequency polygon by placing a
dot for each value.
Connect each of the dots to the next dot with a straight
line.
Provide a title for the frequency polygon.
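The steps above can be sketched in Python as the list of dots and connecting line segments that make up the polygon. The histogram data referred to on the slide is not reproduced in the text, so this sketch assumes the 30-day temperature data instead; variable names are illustrative.

```python
from collections import Counter

temps = [50, 45, 49, 50, 43, 49, 50, 49, 45, 49,
         47, 47, 44, 51, 51, 44, 47, 46, 50, 44,
         51, 49, 43, 43, 49, 45, 46, 45, 51, 46]

counts = Counter(temps)

# Abscissa: lowest value on the left, highest on the right, plus one extra
# value at each end. The extra values have frequency 0, so the polygon
# starts and ends on the horizontal axis.
xs = list(range(min(temps) - 1, max(temps) + 2))
ys = [counts[x] for x in xs]                 # ordinate: the frequency of each value

points = list(zip(xs, ys))                   # one dot per value
segments = list(zip(points, points[1:]))     # straight line from each dot to the next

title = "Frequency Polygon for High Temperatures"
print(title)
for (x0, y0), (x1, y1) in segments:
    print(f"({x0}, {y0}) -> ({x1}, {y1})")
```

The `xs` and `ys` lists can be handed to any plotting tool that draws connected line plots.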