
Business Statistics II

STAT8068

Management Laboratory

Binus University

2017
Modul Business Statistics II- Management Laboratory 2015

TABLE OF CONTENTS

SESSION 1 INTRODUCTION TO STATISTICS.......................................3

1.2 DATA CLASSIFICATION..........................................................................................3

1.3 DEFINITION OF SCALE..........................................................................................6

1.4 SCALING TECHNIQUES..........................................................................................8

1.4.1 Comparative Scale........................................................................................8

1.4.2 Noncomparative Scaling Techniques.....................................................9

1.5 ANALYSIS METHOD...........................................................................................10

1.5.1 Qualitative Analysis.....................................................................................10

1.5.2 Quantitative Analysis..................................................................................10

1.5.3 Statistical Analysis Method.......................................................................11

1.6 STATISTIC ELEMENT............................................................................................11

1.7 SPSS (STATISTICAL PRODUCT AND SERVICE SOLUTION)..............................12

1.8 DESCRIPTIVE STATISTIC.....................................................................................16

SESSION 2 VALIDITY, RELIABILITY, NORMALITY.....................30

2.1 VALIDITY AND RELIABILITY TEST.......................................................................30

2.2 NORMALITY TEST...............................................................................................40

SESSION 3 CORRELATION....................................................................49

3.1 BIVARIATE CORRELATION...................................................................................49

3.2 PARTIAL CORRELATION......................................................................................52

SESSION 4 THE CLASSICAL ASSUMPTIONS...............................59

4.1 MULTICOLLINEARITY TEST..................................................................................59

4.2 HETEROSCEDASTICITY TEST..............................................................................61


4.3 AUTOCORRELATION TEST..................................................................................65

SESSION 5 REGRESSION.......................................................................72

5.1 SIMPLE REGRESSION..........................................................................................72

5.2 MULTIPLE REGRESSION......................................................................................77

SESSION 6 ANOVA AND MANOVA...................................................85

SESSION 7 CHI-SQUARE..............................................................................97

7.1 CONSISTENCY TEST OR GOODNESS OF FIT TEST.................................................97

7.2 TEST OF INDEPENDENCE...................................................................................108

SESSION 8 MANN WHITNEY....................................................................116

SESSION 9 SIGN WILCOXON....................................................................124

SESSION 10 KRUSKAL WALLIS.............................................................130

BIBLIOGRAPHY.........................................................................................142

SESSION 1
INTRODUCTION TO STATISTICS

1.1 Statistics in Business

Webster's Third New International Dictionary (Black, 2013) defines
statistics as the science of gathering, analyzing, interpreting, and
presenting numerical data.

According to Black (2013), business statistics is about measuring
phenomena in the business world and organizing, analyzing, and
presenting the results as numerical information so that better
business decisions can be made. Business statistics typically involves
the variables studied, measurements, and data.

Variables in business statistics are the characteristics of each
entity being studied that are capable of taking different values
(Black, 2013). There are many variables in business, such as worker
productivity, employee salaries, the price of a product or service,
market share, sales, and so on. In the study of business statistics,
the variables used produce measurements that can be used in the
analysis.

Data are the facts and figures that are collected, analyzed, and
summarized for presentation and interpretation (Anderson, Sweeney, &
Williams, 2011). Data take various forms, such as height, sales,
production volume, the number of students, and others. Data that have
been processed so that they carry more meaning are referred to as
information, for example: the average height of junior high school
children who regularly swim is 100 centimeters. Information that has
been selected and organized so that it provides understanding,
recommendations, and a basis for decisions becomes knowledge, for
example: there is a difference between the average height of junior
high school students who regularly swim and those who do not.

1.2 Data classification

The classification of statistical data can be described as follows:

Source : (Putra, 2012)

A Classification of Data According to Acquisition Method

1 Primary Data

Primary data is first-hand information generated by the researcher
for the specific purpose of the study. Primary data can be obtained
through interviews and questionnaires, both of which are collected
directly from the object of research.

2 Secondary Data

Secondary data is information taken from existing sources. The data
can come from inside or outside the organization and can be accessed
via the internet or through published information. Secondary data can
come from books, census data, corporate databases, media, annual
reports, and so on.

B Classification of Data According to Source

1 Internal Data

Internal data is the variety of information found inside the
organization that can be used to make a decision. Examples of internal
data are sales data, cost data, inventory data, and employee data.

2 External Data

External data is information that describes the situation and
conditions outside the organization. Examples are the amount of
product used by consumers, the level of customer preference,
population distribution, and so on.

C Classification of Data According to Type

1 Quantitative Data

Quantitative data is data in the form of numbers. Examples of
quantitative data are population data, national income figures, and
the number of families in an area.

2 Qualitative Data

Qualitative data is data that is not in the form of numbers. The
information generated by this type of data is non-numeric. Gender,
educational level, and religious affiliation in a region are examples
of qualitative data.

Qualitative data can be converted into quantitative data so that it
can be processed with statistical techniques. For example, the
variable sex has the categories woman and man. To process the data
with statistical techniques, the categories must first be coded as
numbers, for instance female as one and male as two.
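
The coding step described above can be sketched in Python. This is only a minimal illustration and not part of the SPSS workflow of this module; the names `respondents` and `sex_codes` and the five observations are hypothetical, while the codes follow the text (female = 1, male = 2).

```python
# Hypothetical qualitative observations (not taken from the module).
respondents = ["female", "male", "male", "female", "female"]

# Coding scheme from the text: female as 1, male as 2.
sex_codes = {"female": 1, "male": 2}

# Replace each category label with its numeric code.
coded = [sex_codes[sex] for sex in respondents]

print(coded)
```

The codes only label the categories, so arithmetic on them (for example an average of 1.4) has no meaning.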

D Classification of Data According to Nature

1 Discrete Data

Discrete data is quantitative data that takes whole-number values
only, not fractions. The number of students in a university, for
example, is discrete because it is counted in whole units.

2 Continuous Data

Continuous data is data obtained by measuring something. The results
of the measurement depend on the accuracy of the measuring instrument
used. Height, temperature, and air humidity are examples of
continuous data. Such data can take fractional values.

E Classification of Data According to Time

1 Cross-Section Data

Cross-section data is data collected only once, possibly over several
days, weeks, or months, to answer the research question. An example
is a drug company willing to invest in research on a pill that
reduces obesity. The company conducts a study of obese people to
observe how many of them are interested in taking the pill.

2 Time Series Data

Time series data is data on the dependent (influenced) variable
collected at two or more different points in time to answer the
research question. Time series data comes from a single source; when
the study covers several sources over time, the data are longitudinal.
Collecting such data takes more time, effort, and cost than collecting
cross-section data. An example of longitudinal data: a marketing
manager wants to know the sales pattern of a certain product in four
regions on a quarterly basis over the next two years. The data will be
collected several times to determine the sales pattern in the four
regions.

(Sekaran & Bougie, 2013)

1.3 Definition of Scale

According to Anderson, Sweeney, & Williams (2011), the scale of
measurement determines the amount of information contained in the
data and indicates the most appropriate data summarization and
statistical analysis. Measurement is the set of rules for assigning
numbers to objects so that the numbers represent the attributes of
those objects. There are four types of scale that can be used to
measure attributes: nominal scale, ordinal scale, interval scale, and
ratio scale.

1 Nominal Scale

A nominal scale is a type of measurement in which numbers are
assigned to objects or classes of objects only to identify, classify,
or distinguish them. Nominal data is at the lowest level of
measurement. When the data for a variable consist of labels or names
used to identify an attribute of the element, the scale of
measurement is considered nominal. An example is type of work, where
1 stands for private employees, 2 for civil servants, and 3 for
entrepreneurs. The nominal scale is used for qualitative features and
indicates only similarity or dissimilarity.

2 Ordinal Scale

An ordinal scale has the properties of nominal data and, in addition,
the order or rank of the data is meaningful. It is a rating scale in
which numbers are assigned to indicate the relative degree to which a
characteristic is possessed, so we can tell whether one object has
more or less of the characteristic than another. An ordinal scale
indicates relative position only. For example, a student ranked 1 is
relatively better than a student ranked 2 or 3, but it is not known
whether the difference between ranks 1 and 2 is small or large.

3 Interval Scale

The scale of measurement of a variable is called an interval scale
if the data have the properties of ordinal data and the interval
between values is expressed in units of fixed size. The interval
scale occupies a higher level of measurement, and interval data are
always numeric. On an interval scale, equal numerical distances on
the scale represent equal distances in the characteristic measured,
so there is a constant interval between scale values. For example,
50-80 °C is said to be quite hot, 80-110 °C hot, and 110-140 °C very
hot.

4 Ratio Scale

A scale of measurement is called a ratio scale if the data have all
the properties of interval data and the ratio of two values is
meaningful. The ratio scale is the highest level of measurement. It
has an absolute zero point and can be used to identify or classify
objects, rank them, and compare intervals or differences between
them. It also makes it possible to calculate ratios of scale values.
So we not only know that the difference between 2 and 5 equals the
difference between 16 and 19, which is 3, but also that 16 is 8 times
2.
Scale      Distinguish   Have rating   Have distance   Have absolute 0 point
Nominal    +             -             -               -
Ordinal    +             +             -               -
Interval   +             +             +               -
Ratio      +             +             +               +

1.4 Scaling techniques

Scaling techniques can be classified into comparative scales and
noncomparative scales. A comparative scale involves a direct
comparison of the stimulus objects under study with one another, and
the resulting data are ordinal. Comparative scaling can therefore be
considered nonmetric scaling.

Example: a respondent is asked whether she or he would rather drink
Fanta or Coca-Cola. The respondent is in effect forced to choose
which one is preferred.

In a noncomparative scale (metric scale), each object is measured
independently of the other objects in the stimulus set. The data are
generally treated as interval or ratio scaled. For example, a
respondent is asked to evaluate Fanta on a scale of 1 (excellent) to
6 (very bad).

1.4.1 Comparative Scale

The comparative scales commonly used (especially in business
research) include paired comparison, rank order, and constant sum.

a Paired Comparison Scale

This scale compares objects in pairs. Two objects are presented to
the respondent, who compares them against given criteria. Example:
the respondent is asked to choose which of two vehicle brands, Brand
A or Brand B, they prefer on the same criteria. (The data obtained
are ordinal.)

b Rank Order Scale

The difference between this scale and paired comparison lies in the
number of objects being compared. The respondent is shown several
objects simultaneously (for example vehicle Brand A, Brand B, Brand
C, Brand D, and Brand E) and is asked to rank or order them based on
certain criteria. The data obtained are ordinal.

c Constant Sum Scale

Here, respondents are asked to allocate a constant amount of some
unit (for example dollars, kg, or points) among a set of stimulus
objects based on certain criteria. For example, respondents are asked
to allocate 100 points among the attributes that are important in a
vehicle, such as comfort, speed, flexibility, and so on.

1.4.2 Noncomparative Scaling Techniques

This technique measures only one object at a time, without reference
to others. The object in this case can be a brand, the price of a
product, and so on. Popular scales in this technique are the Likert,
semantic differential, and Stapel scales.

a Likert Scale

This scale asks respondents to indicate their agreement or
disagreement with a set of statements about an object. The scale was
developed by Rensis Likert and usually has 5 or 7 categories, from
strongly agree to strongly disagree. It is often used in business
research that uses the survey method and can be treated as an
interval scale. Respondents indicate their level of agreement by
choosing one of the available answers.

For example, respondents are asked to rate their level of agreement
with the statement "The price of Brand A is economical."

Strongly Agree   Agree   Undecided   Disagree   Strongly Disagree

b Semantic Differential Scale

This scale has several points (usually 5 to 7) between two opposite
extremes. Respondents are asked to rate an object according to where
it falls between the two extremes. For example, respondents are asked
to rate a new magazine using pairs of opposite adjectives:

Interesting |_1_|_2_|_3_|_4_|_5_|_6_|_7_| Boring

Up-to-date |_1_|_2_|_3_|_4_|_5_|_6_|_7_| Not up-to-date

c Stapel Scale

This scale was developed by Stapel and is similar to the semantic
differential scale in that it simultaneously measures the direction
and intensity of a characteristic of the observed item. The
difference is that this scale has both positive and negative values
(for example -5 to +5). Respondents are asked to rate an object
between -5, meaning not good, and +5, meaning very good.

1.5 Analysis Method

According to Sugiyono (2009:244), data analysis is the process of
systematically searching for and organizing the data obtained from
interviews, field notes, and other materials so that they are easier
to understand and the findings can be communicated to others.
Analysis turns data into information that is useful for answering a
statistical problem. In research design, we need to plan the analysis
tools that will be applied to the data. The purpose is to obtain
relevant information and use the results to solve the problem. The
analysis methods most often used are:

1.5.1 Qualitative Analysis

Qualitative data analysis explains the research results in depth
through a non-numeric, non-statistical approach. For example,
respondents are asked about the flavor of Chitato. The answers may be
delicious, crunchy, and so on.

1.5.2 Quantitative Analysis

Quantitative analysis processes data into information in the form of
numbers. Using numbers makes it easier to interpret the results
objectively. For example: a score of 1-2.4 is rated not good, 2.5-3.4
so-so, and 3.5-5 very good. The most widely used form of quantitative
analysis is statistical analysis.
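
The score bands in the example can be expressed as a small Python function. This is a sketch under stated assumptions: the function name `rating_category` and the sample scores are hypothetical, and scores that fall in the gaps between the quoted bands (for example 2.45) are assigned to the lower band.

```python
def rating_category(score: float) -> str:
    """Map a score on a 1-5 scale to the bands quoted in the text."""
    if not 1 <= score <= 5:
        raise ValueError("score must lie between 1 and 5")
    if score < 2.5:          # band 1-2.4
        return "not good"
    if score < 3.5:          # band 2.5-3.4
        return "so-so"
    return "very good"       # band 3.5-5


scores = [1.8, 2.5, 3.4, 4.2]  # hypothetical scores, for illustration only
labels = [rating_category(s) for s in scores]
print(labels)
```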

1.5.3 Statistical Analysis Method

According to Webster's Third New International Dictionary,
statistics is the science related to gathering, analyzing,
interpreting, and presenting numerical data. Because statistics deals
with many numbers, many people understand it as a numerical
description, for example the number of citizens. In the business
world, statistics is also associated with data sets such as movements
in the inflation rate, promotion costs, the number of customers, and
so on. Besides that, statistics is used for analyses such as
forecasting and hypothesis testing.

In application, statistical analysis methods can be divided into
two:

1 Descriptive Statistics

Descriptive statistics aims to turn a set of raw data into
information that is easy to understand. It explains or describes
characteristics of the data, such as the mean, median, variance, and
so on.

2 Inferential Statistics

Inferential statistics aims to provide a basis for forecasting and
estimation, which turns information into knowledge.

1.6 Statistic Element

Several elements usually appear in a statistical problem:

1 Population

The basic task in a statistical problem is to determine the
population. Generally, a population can be defined as the set of data
that identifies a phenomenon. All Indonesian citizens can be called a
population, the citizens of province A can also be called a
population, and even the citizens of a small village can be called a
population. So the definition of a population depends on the purpose
and the relevance of the data.

2 Sample

A sample can be defined as a set of data taken from a population. For
example, if the population is the citizens of province A, the sample
is a part of those citizens. A sample is therefore a part of the
population. Sampling is carried out because, in practice, many
constraints make it impossible to study the entire population. These
constraints can include circumstances, time, manpower, or cost.
Sampling methods are therefore an important part of statistics.

3 Variable

When making inferences about a population, not all of its
characteristics need to be known. Only one or a few characteristics
of the population are studied, and these are referred to as
variables. For example, to examine customer satisfaction, the
variables considered relevant might be the number of purchases, the
number of visits, and so on. (Santoso, 2011)

1.7 SPSS (Statistical Product and Service Solution)

SPSS is software with analytical capabilities designed to assist in
processing statistical data. For uniformity, the SPSS version used in
this lab is IBM SPSS Statistics 20.

Statistical process with SPSS:

Input data with DATA EDITOR -> Process with DATA EDITOR -> Output data with VIEWER


1 The data to be processed are entered through the DATA EDITOR menu,
which appears automatically on the screen when SPSS is run. The data
entered are then processed, also through the DATA EDITOR menu.

2 The results of data processing appear in a separate SPSS window,
the VIEWER. Output can be text, tables, or graphs.

When IBM SPSS Statistics 20 (SPSS 20) is first opened, it always
shows the following initial display.

The window on top is called the DATA EDITOR and is the main window in
SPSS. The main SPSS processes, namely entering data and processing
them further, take place in the DATA EDITOR.

The SPSS DATA EDITOR has two parts:

DATA VIEW, the place to enter the statistical data. This is the view
that always appears on the screen.

VARIABLE VIEW, the place to enter the statistical variables. This
section is used only when adding and defining variables.

The steps to enter data into SPSS are as follows:

1 Open a new worksheet

A new worksheet is always opened when new variables are to be
entered. From the main menu, select File > New > Data.

2 Name the necessary variables

The next step is to create a name for each new variable, using
VARIABLE VIEW in the Data Editor.

Here is an explanation of the columns in Variable View that must be
filled in to add new variables:

Name

This column gives the variable its name. A variable name can be up to
64 characters (bytes) long. To build a name from several words, use
an underscore to join them. Example: Data_Penjualan.

Type

Type sets the data type to match the data you want to enter. SPSS
offers many data types for each variable, but data analysis usually
uses the string, numeric, and date types. The data types available in
SPSS are:

Numeric: data in the form of numbers.

Comma: numeric data with a comma as the thousands separator.

Dot: numeric data with a dot as the thousands separator.

Scientific Notation: numeric data written with the symbol E.

Date: date data, with a number of date formats to choose from.

Dollar: numeric data marked with a dollar sign ($) and with a comma
as the thousands separator.

Custom Currency: displays a currency format created via the Options
dialog box in the Edit menu.

String: data in the form of characters/letters.

Width

Width determines how many digits can be entered in each data cell.
This option accepts values from 1 to 255 digits.

Decimals

Sets the number of decimal places for numeric data.

Label

Label is a description of the variable name and is optional. Labels
do not affect the data. Nevertheless, when there are many variables
with similar names, writing labels is highly recommended to clarify
the identity of each variable.

Values

Values holds the codes assigned to a categorical variable. For
example, we may use code 1 for men and code 2 for women.

Missing

Data treated as missing are not included in the analysis. There are
three options:

No missing values
Discrete missing values. Specific numbers that we treat as missing.
Range plus one optional discrete missing value. Used when the missing
data fall within a clear range.

Columns

Columns sets the width of the column as displayed in the Data Editor.
Its size should be at least equal to the value of WIDTH.

Align

Align changes the alignment of the displayed data, as in Ms. Word.

Measure

The scale of measurement of the variable concerned.

After filling in the VARIABLE VIEW, we can begin to enter data in the
DATA VIEW.

New features of IBM SPSS Statistics 20:

1 More than one data file can be open at a time

2 The Data menu is more complete, with features such as:

Split File (separates the contents of a file by certain criteria)

Select Cases (selects the contents of a file by certain criteria)

Sort Cases (sorts the data)

1.8 Descriptive Statistic

Descriptive statistics deals with how to describe, depict, or
summarize data so that they are easy to understand. Descriptive
analysis aims to transform a collection of raw data into easily
understood information in a more concise form. The data can be
obtained from a census, surveys, or other observations. Raw data are
generally still unordered and should be summarized properly and
systematically, in the form of either a table or a graphic
presentation. Descriptive statistics are the basis for decision
making in statistical inference.

The measures most often used in decision-making are:

1 Measures of central tendency, such as the mean, median, and mode.
Mean is the sum of a set of data divided by the number of data.
Median is the middle value, which divides the ordered observations
into two equal halves; it is the fiftieth percentile. For the
sequence 4 5 6 6 6 6 7 8 8, the median is 6.
Mode is the value that appears most frequently, that is, the value
with the highest frequency. For the data 5, 5, 6, 7, 2, 6, 5, 4, 1,
5, the mode is 5.

2 Measures of dispersion, such as the standard deviation and variance.
Variance of a set of observations is the average squared deviation
from the mean of the data.
Standard deviation is the positive square root of the variance.

3 Skewness and Kurtosis

In addition to central tendency and dispersion, skewness and kurtosis
are used to identify the shape of the distribution of a population.
Skewness measures the asymmetry (tilt) of the distribution. There are
three shapes of curve: positive, negative, and symmetrical. Positive
skewness indicates a distribution with a longer tail to the right.
Negative skewness indicates a distribution with a longer tail to the
left. Kurtosis measures how peaked the distribution is: the larger
the kurtosis, the more peaked the distribution. Kurtosis is
calculated and reported as both an absolute and a relative value. The
absolute value is always a positive number.

4 Histogram

A histogram is a chart made up of bars of different heights. The
height of each bar represents the frequency of the class that the bar
represents.
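
The two small examples above can be checked with Python's standard `statistics` module. This is only an illustrative sketch alongside the SPSS workflow; the variable names are hypothetical. Note that `statistics.variance` divides by n - 1 (the sample variance), which is an assumption about the intended definition and matches the variance SPSS reports in the worked example of this session.

```python
import statistics

sequence = [4, 5, 6, 6, 6, 6, 7, 8, 8]      # median example from the text
values = [5, 5, 6, 7, 2, 6, 5, 4, 1, 5]     # mode example from the text

median_value = statistics.median(sequence)  # middle value of the ordered data
mode_value = statistics.mode(values)        # most frequent value

# Dispersion of the second data set (sample formulas, divisor n - 1).
mean_value = statistics.mean(values)
sample_variance = statistics.variance(values)
std_dev = statistics.stdev(values)          # positive square root of the variance

print(median_value, mode_value, mean_value, sample_variance, std_dev)
```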

Descriptive statistics are divided into two, namely:

1 Single Data
2 Group Data

For grouped data, three things need to be considered when determining
the classes of the frequency distribution: the number of classes, the
class width, and the class boundaries.
1 Single Data

For example:

The owner of Untung Terus Kiosks has 15 kiosks scattered across West
Jakarta and North Jakarta. The owner wants to do research on the
business he has run for approximately 2 years. Help the owner of
Untung Terus Kiosks analyze the data so that he gets clear
information about the business. Here are the annual revenue data for
the 15 Untung Terus kiosks during 2013:

(in millions)

Kiosk   Revenue   Kiosk   Revenue   Kiosk   Revenue
1       10        6       25        11      18
2       20        7       10        12      20
3       15        8       12        13      15
4       17        9       25        14      25
5       20        10      17        15      12

For the data above, determine:

1 Mean, Median and Mode

2 Minimum and Maximum

3 Variance, Range and Standard Deviation

4 Kurtosis and Skewness

5 Quartile 1, 2, 3

6 Decile 3

7 Percentile 40, 85

The steps:

1 Open IBM SPSS Statistics 20 and input the data as in the table above.

2 Choose Analyze >> Descriptive Statistics >> Frequencies. The
Frequencies window will appear. Move the revenue variable into the
Variable(s) box. Leave Display frequency tables checked.

3 In the Statistics menu, choose the measures asked for in the
questions, then click Continue.

4 In the Charts menu there are three choices (Bar, Pie, and
Histogram); choose as needed. For Histogram, check Show normal curve
on histogram. Click Continue.

5 In the Format menu, under Order by, choose Ascending values (the
data will be ordered from smallest to largest, so a value coded 1,
for example men, is listed first in the output), then click Continue.

6 Click OK.

7 Interpret the results from the output generated by SPSS.

Data interpretation is done by examining the output of the IBM SPSS
application and then writing a report of the interpretation results.

Below is how to report the results of the data interpretation:

1 So, the mean of the Untung Terus Kiosks revenue data is 17.4 million
So, the median of the Untung Terus Kiosks revenue data is 17 million
So, the mode of the Untung Terus Kiosks revenue data is 20 million

2 So, the minimum of the Untung Terus Kiosks revenue data is 10 million
So, the maximum of the Untung Terus Kiosks revenue data is 25 million

3 So, the variance of the Untung Terus Kiosks revenue data is 26.686
So, the range of the Untung Terus Kiosks revenue data is 15
So, the standard deviation of the Untung Terus Kiosks revenue data is 5.166

4 So, the kurtosis of the Untung Terus Kiosks revenue data is -1.005
So, the skewness of the Untung Terus Kiosks revenue data is 0.130

5 So, quartile 1 of the Untung Terus Kiosks revenue data is 12 million
So, quartile 2 of the Untung Terus Kiosks revenue data is 17 million
So, quartile 3 of the Untung Terus Kiosks revenue data is 20 million

6 So, decile 3 of the Untung Terus Kiosks revenue data is 14.4 million

7 So, percentile 40 of the Untung Terus Kiosks revenue data is 15.8 million
So, percentile 85 of the Untung Terus Kiosks revenue data is 25 million
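
The reported figures can be cross-checked outside SPSS with Python's standard library. The sketch below is not the SPSS procedure itself. It assumes that the skewness and kurtosis shown by SPSS are the bias-adjusted sample versions and that the percentiles use positions of the form (n + 1)p, which corresponds to the default "exclusive" method of `statistics.quantiles`; under these assumptions the figures above are reproduced. All variable names are hypothetical.

```python
import statistics

# Annual revenue of the 15 kiosks (in millions), from the table above.
revenue = [10, 20, 15, 17, 20, 25, 10, 12, 25, 17, 18, 20, 15, 25, 12]
n = len(revenue)

mean = statistics.mean(revenue)
median = statistics.median(revenue)
modes = statistics.multimode(revenue)    # every value with the top frequency
variance = statistics.variance(revenue)  # sample variance, divisor n - 1
std_dev = statistics.stdev(revenue)
value_range = max(revenue) - min(revenue)

# Bias-adjusted sample skewness and excess kurtosis (assumed SPSS formulas).
sum_cubed = sum((x - mean) ** 3 for x in revenue)
sum_fourth = sum((x - mean) ** 4 for x in revenue)
skewness = n * sum_cubed / ((n - 1) * (n - 2) * std_dev ** 3)
kurtosis = (
    n * (n + 1) * sum_fourth / ((n - 1) * (n - 2) * (n - 3) * std_dev ** 4)
    - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
)

# Quartiles, deciles and percentiles with the default "exclusive" method.
quartiles = statistics.quantiles(revenue, n=4)
decile_3 = statistics.quantiles(revenue, n=10)[2]
percentiles = statistics.quantiles(revenue, n=100)
p40, p85 = percentiles[39], percentiles[84]

print(mean, median, modes, round(variance, 3), round(std_dev, 3))
print(round(skewness, 3), round(kurtosis, 3), quartiles, decile_3, p40, p85)
```

`multimode` shows that 20 and 25 both occur three times; the report lists 20, the smaller of the two modes.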

2 Group Data

Example:

The following data show the weights of 30 women after childbirth
during August 2014 at RSIA Kurnia, South Jakarta (data in kg)

Patient           Patient           Patient
Name     Weight   Name     Weight   Name       Weight
Asih     62       Mimi     53       Jenny      65
Yati     57       Marina   58       Dea        53
Ani      65       Saskia   58       Gita       49
Meli     56       Yanti    61       Ariana     50
Cindy    53       Rara     57       Britney    55
Lina     52       Nana     62       Kate       54
Susan    48       Yola     56       Christin   57
Lisa     56       Anne     55       Bella      56
Maria    60       Irena    52       Hana       60
Donita   57       Silvia   57       Rossi      59

From the data above, find:

1 Mean, Median and Mode

2 Minimum and Maximum

3 Variance, Standard Deviation

4 Kurtosis and Skewness

5 Quartile 1, 2, 3

6 Decile 7

7 Percentile 58

The steps:

1. Compute the number of classes (k) and the class width (l) with the
Sturges method:

k = 1 + 3.3 log n
l = (Xmax - Xmin) / k

Answer:

k = 1 + 3.3 log n
k = 1 + 3.3 log 30
k = 5.87, rounded up to 6 (k is always rounded up)

l = (Xmax - Xmin) / k (using k before it is rounded)
l = (65 - 48) / 5.87
l = 2.89
l = 3 (rounded normally)

2. Open SPSS. Input the data as in the table.

3. In the Transform menu, choose Recode, then Into Different Variables.

4. In Recode Into Different Variables, move the variable weight to
the right, then under Output Variable type Data1 as the new variable
name. Click Change, then choose Old and New Values.

5. Under Range, input 48 through 50, then under New Value input
Value: 49, the midpoint of that range. Click Add to move it to the
Old --> New list, and repeat for the remaining classes. Then click
Continue.

6. Click OK. The new variable Data1 will appear.

7. In the Analyze menu, choose Descriptive Statistics >> Frequencies.

8. Move the variable Data1 to the right.

9. Click Statistics, check the measures asked for, then check Values
are group midpoints, then click Continue.

10. In Charts, choose the chart asked for in the question. Click
Continue.

11. Click OK to get the result.
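
The class calculation and the recoding into midpoints can also be sketched in Python. This is an illustration of the same arithmetic, not a replacement for the SPSS menus. The helper `class_midpoint` and the variable names are hypothetical, and the classes after 48-50 are assumed to continue in steps of the class width (51-53, 54-56, and so on).

```python
import math

# Weights of the 30 patients (kg), read column by column from the table.
weights = [62, 57, 65, 56, 53, 52, 48, 56, 60, 57,
           53, 58, 58, 61, 57, 62, 56, 55, 52, 57,
           65, 53, 49, 50, 55, 54, 57, 56, 60, 59]

n = len(weights)
k_raw = 1 + 3.3 * math.log10(n)                     # Sturges rule
k = math.ceil(k_raw)                                # number of classes, rounded up
width_raw = (max(weights) - min(weights)) / k_raw   # uses k before rounding
width = round(width_raw)                            # class width, rounded normally

lowest = min(weights)


def class_midpoint(weight: int) -> float:
    """Return the midpoint of the class that contains the given weight."""
    index = (weight - lowest) // width   # 0 for 48-50, 1 for 51-53, ...
    lower = lowest + index * width
    upper = lower + width - 1
    return (lower + upper) / 2


data1 = [class_midpoint(w) for w in weights]   # plays the role of variable Data1

print(round(k_raw, 2), k, round(width_raw, 2), width)
print(sorted(set(data1)))
```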

Data interpretation is done by observing the output of the IBM SPSS
application and then reporting the results.

Data Interpretation Result:

1. The mean of the weight data is 56.3 kg.
The median of the weight data is 56.4 kg.
The mode of the weight data is 58 kg.
2. The minimum of the weight data is 49 kg.
The maximum of the weight data is 64 kg.
3. The variance of the weight data is 17.803.
The standard deviation of the weight data is 4.22 kg.
4. The kurtosis of the weight data is -0.689.
The skewness of the weight data is -0.053.
5. Quartile 1 of the weight data is 53 kg.
Quartile 2 of the weight data is 56.4 kg.
Quartile 3 of the weight data is 59.61 kg.
6. Decile 7 of the weight data is 58.92 kg.
7. Percentile 58 of the weight data is 57.36 kg.
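As a cross-check of the recoding and of some of the reported statistics, here is a minimal Python sketch. It assumes classes of width 3 starting at 48 kg (48-50, 51-53, ...), as in the Recode step, and it only reproduces the measures that depend on the midpoints alone (mean, mode, minimum, maximum, variance, standard deviation). It does not reproduce SPSS's interpolated median, quartiles, deciles, or percentiles.

```python
import statistics
from collections import Counter

# Weights (kg) of the 30 patients, read column by column from the table
weights = [62, 57, 65, 56, 53, 52, 48, 56, 60, 57,
           53, 58, 58, 61, 57, 62, 56, 55, 52, 57,
           65, 53, 49, 50, 55, 54, 57, 56, 60, 59]

LOWEST, WIDTH = 48, 3  # first class starts at 48 kg, class width 3


def to_midpoint(x):
    """Recode a raw weight into the midpoint of its class (48-50 -> 49, ...)."""
    class_index = (x - LOWEST) // WIDTH
    return LOWEST + class_index * WIDTH + (WIDTH - 1) // 2


midpoints = [to_midpoint(w) for w in weights]
frequency = Counter(midpoints)

mean = statistics.mean(midpoints)
mode = statistics.mode(midpoints)
variance = statistics.variance(midpoints)   # sample variance (n - 1)
std_dev = statistics.stdev(midpoints)

print(sorted(frequency.items()))
print(round(mean, 1), mode, min(midpoints), max(midpoints))
print(round(variance, 3), round(std_dev, 2))
```

The minimum and maximum come out as 49 and 64 rather than 48 and 65 because every value has been replaced by its class midpoint.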

EXERCISE

1. A manager is conducting a survey of the nuggets his company produced
this year. Here is the total number of packs of nuggets produced each
month (in thousands).

Month 1 2 3 4 5 6 7 8 9 10 11 12
Total 12 15 26 31 28 21 36 46 32 18 12 10
Please answer the questions below:

a Mean, Median, and Mode

b Maximum and Minimum Value

c Standard Deviation and Variance Value

d Kurtosis and Skewness Value

e Quartile 3

f Decile 7 and Percentile 85

2. PT EXIS Indonesia is researching the Internet quota that its
customers use in one month. Ardian, an R&D staff member of the company,
is responsible for the research and took the following sample of
customers:

3.2GB 1GB 2GB 2.3GB 3GB 1.5GB 4.5GB 2GB 2.5GB 4GB

1.5GB 1.5GB 5GB 1.5GB 2.5GB 1GB 2GB 3.5GB 1GB 3.2GB

From the data above, determine:

a. Median and Mode

b. Skewness and Kurtosis of the data

c. Minimum and Maximum Value

d. Decile 7

e. Percentile 76

f. Standard Deviation Value

3. Mr. Hendri is a math teacher at SDN 10. The students at this school
have taken the final school examinations. After spending approximately
3 days marking 48 answer sheets, Mr. Hendri recorded the students' scores
in the table below:

80 75 65 58 78 80
90 95 50 68 74 98
70 80 90 95 75 92
65 79 87 85 64 68
82 81 65 93 80 75
68 70 75 55 58 77
96 67 86 77 70 79
60 75 78 65 70 89
Determine:

a Number and Width of the classes

b Mean, Median, Mode, Minimum, Maximum
c Standard Deviation and Variance
d Q1, Q3, P65, D2, D8

4. Here are the results of the 2009 population census in Indonesia on
the age at which men and women married for the first time. A sample of
15 men and 15 women was taken from the population living in Jakarta.
The 2009 census data are as follows:
Number Man Number Woman
1 30 1 22
2 25 2 20
3 27 3 21
4 35 4 19
5 33 5 25
6 40 6 27
7 27 7 30
8 26 8 28

9 28 9 27
10 30 10 26
11 26 11 25
12 25 12 22
13 29 13 19
14 31 14 23
15 33 15 22

From the data above, please determine :

a Median age of men and of women at first marriage

b Mean age of men and of women at first marriage

c Quartiles 1 and 3 of the age of men and of women at first marriage

d Standard deviation and variance of the age of men and of women at
first marriage

e Kurtosis and Skewness of the age of men and of women at first
marriage

f If in 2014 studies showed that the average age at first marriage is
35 years for men and 27 years for women, what can you conclude?

SESSION 2
VALIDITY, RELIABILITY, NORMALITY

2.1 Validity and Reliability Test

One of the instruments often used in scientific research is a
questionnaire, which aims to determine a person's opinion on a matter. A
questionnaire can be prepared with open-ended questions (how old are
you today, what do you think of Bina Nusantara University) or closed
questions (such as your age category: <20 years or >20 years). One scale
often used in preparing a questionnaire is the Likert scale (see the
explanation of the Likert scale).

According to Sarjono (2011), the validity test is a process to prove
that the instrument, technique, or process used to measure a concept
actually measures the intended concept. It is used to determine whether
the questions in the questionnaire distributed to the respondents, as
the object of research, are valid for their purpose.
In a qualitative study that uses a questionnaire as a measuring tool,
there are two important requirements that must be met: the
questionnaire must be Valid and Reliable (other terms, such as the
level of accuracy, are not discussed in this module).
A questionnaire is considered valid if its questions are able to
express what the questionnaire is intended to measure. A questionnaire
is considered reliable when a respondent's answers to a question are
consistent and stable over time.
According to Ferdinand (2014), a scale or data measurement instrument
is trustworthy or reliable if it consistently leads to the same result
each time a measurement is made. Therefore, according to Sarjono (2011),
the purpose of reliability testing is to measure whether each
respondent's answers to the items in the questionnaire are consistent,
and hence trustworthy or reliable.

Reliability can basically be measured in two ways:

o Repeated measure, or re-measuring. Here, a respondent is presented
with the same questions at different times, and the answers are checked
for consistency.
o One shot, or measuring just once. Here, the measurement is done once
and the results are then compared with the results of the other
questions.
Steps to arrange the questionnaire:

1. Establish a construct, that is, define the boundaries of the
variable to be measured. If you want to study consumer behavior, you
need to state in advance what is meant by consumer behavior.

2. Establish factors, that is, find the elements that exist in a
construct.

3. Develop a number of questions, that is, describe each factor
further through a range of questions that interact directly with the
respondents. Each construct can comprise several factors, and each
factor can consist of several questions; the number of questions may
be the same for every factor or differ from one factor to another.

Validity and Reliability Analysis Objectives:

The validity and reliability test is the process of testing the
questions in a questionnaire to determine whether their content is
valid and reliable.

Example:

A researcher wants to know the influence of Service Quality and Product
Quality on Customer Loyalty. The study was conducted by distributing
questionnaires to 88 respondents. Each questionnaire contains 14
questions: 5 questions for the Service Quality variable, 5 questions
for the Product Quality variable, and 4 questions for the Customer
Loyalty variable. The results of the questionnaire are as follows:

R SQ1 SQ2 SQ3 SQ4 SQ5 PQ1 PQ2 PQ3 PQ4 PQ5 CL1 CL2 CL3 CL4
R1 4 3 4 5 3 4 5 3 3 5 5 4 4 4
R2 5 4 5 4 4 5 4 4 5 5 5 4 4 4
R3 4 4 5 3 2 5 3 2 4 4 5 4 4 5
R4 5 5 5 3 5 5 3 5 5 5 4 5 4 5
R5 4 4 4 3 4 4 3 4 5 5 4 4 5 2
R6 3 5 5 4 4 5 4 4 4 4 2 5 2 4
R7 2 3 2 3 3 2 3 3 3 3 2 5 5 3
R8 2 3 2 3 3 2 3 3 2 2 5 4 5 5
R9 3 5 4 3 3 3 3 3 3 4 5 1 2 3
R10 4 3 3 2 5 3 2 5 5 5 3 5 3 3
R11 4 4 4 4 3 4 4 3 4 5 3 2 4 4
R12 4 3 3 3 4 3 3 4 3 3 2 4 4 3
R13 5 5 4 3 3 4 3 3 4 5 5 2 5 5
R14 5 5 3 3 4 3 3 4 5 1 1 2 5 2
R15 4 4 4 4 4 4 4 4 4 4 5 4 1 5

R16 4 4 4 4 4 4 4 4 4 4 5 3 2 5
R17 5 4 5 5 5 5 5 5 4 3 4 1 4 4
R18 4 4 4 4 3 4 4 3 3 4 1 4 3 3
R19 3 2 4 3 2 4 3 2 2 3 5 4 3 2
R20 4 3 4 3 2 4 3 2 3 4 3 4 3 3
R21 3 4 4 3 3 4 3 3 4 3 1 4 1 4
R22 4 4 4 5 5 4 5 5 4 4 3 2 2 2
R23 4 4 4 3 5 4 3 5 4 3 4 3 3 3
R24 3 5 5 5 5 5 5 5 3 4 1 3 4 2
R25 4 4 4 3 3 4 3 3 4 4 5 2 2 4
R26 3 2 3 3 4 3 3 4 3 4 4 4 3 4
R27 3 3 4 4 4 4 4 4 3 3 4 1 3 4
R28 2 3 3 2 3 3 2 3 3 4 1 2 4 1
R29 2 3 3 4 4 3 4 4 4 4 3 3 2 4
R30 3 3 4 4 3 4 4 3 3 3 3 2 3 3
R31 3 4 4 5 3 4 5 3 4 4 4 5 3 5
R32 2 4 4 3 4 4 3 4 4 3 1 4 3 3
R33 2 4 3 2 4 3 2 4 4 2 2 3 4 2
R34 2 4 2 4 3 2 4 3 3 3 4 4 4 4
R35 3 4 4 4 3 4 4 3 3 3 2 5 2 5
R36 3 3 3 3 4 3 3 4 3 2 2 2 2 3
R37 4 4 4 3 2 4 3 2 3 3 4 4 4 2
R38 4 5 4 4 5 4 4 5 4 4 5 4 4 2
R39 3 3 2 2 2 2 2 2 3 3 3 1 3 2
R40 3 4 4 3 2 4 3 2 3 4 4 3 2 3
R41 3 2 3 3 4 3 3 4 2 4 3 3 3 2
R42 5 4 4 4 4 4 4 4 4 4 4 5 5 4
R43 4 4 4 4 5 4 4 5 4 4 3 3 4 3
R44 3 3 4 5 5 4 5 5 2 2 3 3 3 4
R45 3 4 3 3 3 3 3 3 3 4 4 5 5 5
R46 4 4 3 3 5 3 3 5 4 4 4 5 5 5
R47 2 3 4 4 4 4 4 4 3 5 5 5 5 5
R48 5 5 3 4 4 3 4 4 4 5 5 5 5 5
R49 3 4 3 5 5 3 5 5 3 5 5 5 5 5
R50 5 5 4 4 5 4 4 5 4 4 5 5 5 5
R51 3 4 4 4 5 4 4 5 4 4 4 5 4 5
R52 2 3 3 4 4 3 4 4 4 4 5 4 5 4
R53 3 4 4 3 2 4 3 2 3 4 3 5 4 4
R54 3 4 4 5 2 4 5 2 4 4 4 5 5 3
R55 4 5 4 4 5 4 4 5 4 4 4 5 3 4
R56 3 3 4 5 5 4 5 5 2 2 4 5 4 5
R57 3 4 5 5 5 5 5 5 4 3 4 5 4 5
R58 4 4 4 4 3 4 4 3 5 5 5 4 4 4
R59 5 5 5 5 4 5 5 4 5 5 5 5 5 5
R60 3 3 3 4 3 3 4 3 4 4 4 3 5 4
R61 5 5 4 3 5 4 3 5 3 3 4 4 3 4
R62 4 4 4 4 4 4 4 4 4 4 5 5 3 4
R63 4 3 4 5 3 4 5 3 4 4 5 5 3 5
R64 4 4 4 4 3 4 4 3 4 4 3 5 4 3

R65 4 4 5 5 4 5 5 4 5 5 4 3 4 4
R66 5 5 5 5 5 5 5 5 5 5 5 3 4 4
R67 5 4 5 4 5 5 4 5 5 4 5 1 5 4
R68 3 3 4 5 3 4 5 3 4 4 3 4 5 2
R69 4 4 3 4 3 3 4 3 4 5 4 4 3 4
R70 3 3 4 4 4 4 4 4 3 4 5 4 2 4
R71 5 4 3 5 4 3 5 4 3 4 5 4 4 4
R72 5 3 3 4 3 3 4 3 4 4 5 4 2 1
R73 4 4 3 4 5 3 4 5 4 3 3 4 4 3
R74 3 4 4 4 3 4 4 3 3 3 4 3 4 1
R75 4 4 4 3 4 4 3 4 3 4 4 1 4 4
R76 5 4 3 3 4 3 3 4 4 4 3 4 2 1
R77 5 5 4 4 3 4 4 3 3 3 3 4 4 2
R78 4 5 5 5 4 5 5 4 3 4 3 5 3 4
R79 3 5 4 3 3 4 3 3 4 5 1 4 2 4
R80 3 4 3 4 3 3 4 3 4 3 2 4 3 3
R81 5 5 4 4 4 4 4 4 3 4 4 3 4 1
R82 3 4 5 4 4 5 4 4 5 3 4 3 2 1
R83 4 5 4 5 4 4 5 4 5 3 1 1 3 4
R84 3 5 3 3 4 1 5 4 4 3 3 2 4 4
R85 4 5 4 5 4 4 5 4 4 4 2 3 4 3
R86 5 5 4 4 5 4 4 5 4 5 5 3 5 2
R87 4 4 4 4 3 4 4 3 5 5 5 4 5 3
R88 4 5 3 3 3 3 3 3 5 4 4 3 4 3

From the results of the questionnaire above, test whether the data are
valid and reliable. For testing validity and reliability, the data must
be run per variable.

To test the validity of the questionnaire, Rcritical (the corrected
item-total correlation that SPSS reports for each question) and Rtable
(the table value of r it is compared with) are needed. The steps for
finding Rtable are as follows:

1. Enter the data for items SQ1 to CL4 into SPSS.

2. Click the Transform menu, then select Compute Variable.

3. In the Compute Variable window, complete the steps below:

o Fill Target Variable with the letter t.

o In Function group, choose Inverse DF, then in the Functions and
Special Variables list, double-click Idf.T.

4. Replace the first question mark (?) in the Numeric Expression box
with the probability, which in this case is (1 - α = 1 - 0.05 =) 0.95
(please input 0.95).

5. Replace the second question mark (?) in the Numeric Expression box
with df, which in this case is (n - 2 = 88 - 2 =) 86.

6. Click OK. The variable t should appear in Data View.

7. Click the Transform menu again, then choose Compute Variable.

8. Change the letter t to the letter r in Target Variable.

9. Remove the content of Numeric Expression and replace it with this
formula:

t/sqrt(df+t**2)
so in this case, the formula must be: t/sqrt(86+t**2)

10. Click OK. The variable r will appear in Data View. That is the
Rtable value.
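The conversion in step 9 can be sketched in Python as follows. This is only an illustration: the standard library has no inverse t distribution, so the value of t is typed in by hand. The figure 1.6628 is an assumption, taken as the approximate 0.95 quantile of t with 86 degrees of freedom that Idf.T would return.

```python
import math


def r_from_t(t, df):
    """Convert a t value to the corresponding r value: r = t / sqrt(df + t^2)."""
    return t / math.sqrt(df + t ** 2)


T_VALUE = 1.6628   # assumption: approximate result of Idf.T(0.95, 86)
DF = 88 - 2        # n - 2

r_table = r_from_t(T_VALUE, DF)
print(round(r_table, 2))
```

Rounded to two decimals, the result agrees with the Rtable value of 0.18 used in the decision tables of this session.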

Steps to find Rcritical:

1. From the Analyze menu, click Scale, then Reliability Analysis.

2. In the Reliability Analysis window, do the following:

o Move all the items of the variable (SQ1-SQ5) to the Items box on the
right.

o Leave the Model option at Alpha.

o Click Statistics, check the three options (Item, Scale, Scale if
item deleted) under Descriptives for (top left), then click Continue
to return to the main window.

3. Click OK. The results will then be shown.
4. Interpretation of Results

As explained above, testing starts by examining the validity of the
questionnaire; after that, its reliability can be tested.

The steps for testing the validity of the questionnaire are as follows:

Hypothesis (for the Service Quality variable)

H0 : the questionnaire items are valid

Ha : the questionnaire items are invalid

Basis for Decision Making:

Rcritical ≥ Rtable, then H0 accepted

Rcritical < Rtable, then H0 rejected

Note: Rtable is 0.18, as obtained in the steps above, whereas Rcritical
can be seen in the Corrected Item-Total Correlation column of the
Item-Total Statistics table.

Decision

Number Rcritical Symbol Rtable Decision
1 0.46 > 0.18 Valid
2 0.462 > 0.18 Valid
3 0.499 > 0.18 Valid
4 0.365 > 0.18 Valid
5 0.35 > 0.18 Valid

Conclusion

So, all the items are valid.

Repeat the validity test for the Product Quality and Customer Loyalty
variables.
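To make the meaning of Rcritical concrete, here is a minimal Python sketch of a corrected item-total correlation: each item is correlated with the sum of the other items of the same variable. The six-respondent data set is hypothetical, and the 0.18 threshold is reused only to show the decision rule (it was derived for df = 86, not for six respondents).

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def corrected_item_total(rows):
    """rows holds one list of item scores per respondent.

    Returns, for every item, its correlation with the total of the
    remaining items (the item itself is left out of the total).
    """
    results = []
    for j in range(len(rows[0])):
        item = [row[j] for row in rows]
        rest = [sum(row) - row[j] for row in rows]
        results.append(pearson(item, rest))
    return results


R_TABLE = 0.18  # illustrative threshold taken from the worked example

# Hypothetical answers of six respondents to three items
sample = [[4, 3, 4], [5, 4, 5], [2, 3, 2], [3, 5, 4], [4, 4, 4], [2, 2, 3]]

for number, r in enumerate(corrected_item_total(sample), start=1):
    decision = "Valid" if r >= R_TABLE else "Invalid"
    print(number, round(r, 3), decision)
```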

After the validity test has been done, do the reliability test using
the latest output: compare Rtable with Cronbach's Alpha (R) in the
Reliability Statistics table.

Steps for Reliability testing:

1. Determine the hypothesis

H0 = the questionnaire items are reliable

Ha = the questionnaire items are unreliable

2. Determine the Rtable value

3. Find R (Cronbach's Alpha)

4. Make a decision

Basis for Decision Making:

If R ≥ Rtable, the items or variable are reliable

If R < Rtable, the items or variable are unreliable

Interpretation of Results

Hypothesis

H0 : the questionnaire items are reliable

Ha : the questionnaire items are unreliable

Basis of Decision Making

R ≥ Rtable, then H0 accepted

R < Rtable, then H0 rejected

Decision

R = 0.669, while Rtable = 0.18

0.669 > 0.18, so H0 accepted

Conclusion

So, the items are reliable.

Repeat the reliability test for the Product Quality and Customer
Loyalty variables.
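For readers who want to see what the Alpha model computes, the sketch below implements the usual Cronbach's Alpha formula, alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). The six-respondent data set is hypothetical, and comparing alpha with 0.18 simply mirrors the decision rule of this module.

```python
import statistics


def cronbach_alpha(rows):
    """rows holds one list of item scores per respondent."""
    k = len(rows[0])                                            # number of items
    item_variances = [statistics.variance(col) for col in zip(*rows)]
    total_variance = statistics.variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


R_TABLE = 0.18  # threshold used by the module's decision rule

# Hypothetical answers of six respondents to three items
sample = [[4, 3, 4], [5, 4, 5], [2, 3, 2], [3, 5, 4], [4, 4, 4], [2, 2, 3]]

alpha = cronbach_alpha(sample)
print(round(alpha, 3), "reliable" if alpha >= R_TABLE else "unreliable")
```

When every item gives exactly the same score for each respondent, the formula returns 1, which is a quick sanity check on the implementation.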

2.2 Normality Test

The normality test is done to determine whether the distribution of
the data follows or approaches the normal distribution (Sarjono and
Julianita, 2011: 53). Basically, the normality test compares our data
with normally distributed data that have the same mean and standard
deviation as our data.

The normality test is necessary because one of the requirements for
parametric testing is that the data must be normally distributed.
According to Budiawan (2012: 58), theoretically, as the number of
samples grows larger, the data tend to have a normal distribution.

This test is usually used for ordinal, interval, and ratio data.

Here is the Basis of Decision Making in the Normality Test (in this
observation, α = 0.05):

o If the number of respondents < 50, use the Sig. in the Shapiro-Wilk
table
o If the number of respondents > 50, use the Sig. in the
Kolmogorov-Smirnov table

o If sig > α, then the data are normally distributed
o If sig < α, then the data are not normally distributed

Service Quality Product Quality Customer Loyalty
3.80 4.00 4.25
4.40 4.60 4.25
3.60 3.60 4.50
4.60 4.60 4.50
3.80 4.20 3.75
4.20 4.20 3.25
2.60 2.80 3.75
2.60 2.40 4.75
3.60 3.20 2.75
3.40 4.00 3.50
3.80 4.00 3.25
3.40 3.20 3.25
4.00 3.80 4.25
4.00 3.20 2.50
4.00 4.00 3.75
4.00 4.00 3.75
4.80 4.40 3.25
3.80 3.60 2.75
2.80 2.80 3.50
3.20 3.20 3.25
3.40 3.40 2.50
4.40 4.40 2.25
4.00 3.80 3.25
4.60 4.40 2.50
3.60 3.60 3.25
3.00 3.40 3.75
3.60 3.60 3.00
2.60 3.00 2.00
3.20 3.80 3.00
3.40 3.40 2.75
3.80 4.00 4.25
3.40 3.60 2.75
3.00 3.00 2.75
3.00 3.00 4.00
3.60 3.40 3.50
3.20 3.00 2.25
3.40 3.00 3.50
4.40 4.20 3.75
2.40 2.40 2.25
3.20 3.20 3.00
3.00 3.20 2.75
4.20 4.00 4.50
4.20 4.20 3.25
4.00 3.60 3.25
3.20 3.20 4.75
3.80 3.80 4.75
3.40 4.00 5.00
The steps are as follows:

1. Input the data above into SPSS.

2. From the Analyze menu, choose Descriptive Statistics, then click Explore.

3. Move the Service Quality, Product Quality, and Customer Loyalty
variables to the Dependent List box.

4. In the Plots menu, check the option Normality plots with tests and
leave the other options blank, then click Continue.

5. In the Display group (bottom left), choose Plots, then click OK.

6. Output Layout:

Tests of Normality
                 Kolmogorov-Smirnov(a)      Shapiro-Wilk
                 Statistic  df  Sig.        Statistic  df  Sig.
ServiceQuality   0.085      88  0.165       0.98       88  0.195
ProductQuality   0.093      88  0.06        0.981      88  0.217
CustomerLoyalty  0.094      88  0.051       0.972      88  0.051
a. Lilliefors Significance Correction
After the analysis in SPSS has produced the result, we can continue
with the interpretation of the data as follows:

Hypothesis:

H0 = the Service Quality data are normally distributed

Ha = the Service Quality data are not normally distributed

Basis of Decision Making

Sig ≥ α, then H0 accepted

Sig < α, then H0 rejected

Decision

0.165 > 0.05, then H0 accepted

Conclusion

So, the Service Quality data are normally distributed.

Interpret the Product Quality and Customer Loyalty variables in the
same way.
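The decision rule above can be written down as a small Python sketch. It does not compute the test statistics; it only takes the Sig. values read from the SPSS Tests of Normality table. The module does not say which table to use for exactly 50 respondents, so treating that case like the "> 50" case is an assumption.

```python
ALPHA = 0.05


def normality_decision(n, ks_sig, sw_sig, alpha=ALPHA):
    """Pick the Sig. value by sample size and compare it with alpha.

    Returns (test name, Sig. used, True if H0 "normally distributed" is accepted).
    """
    if n < 50:
        test, sig = "Shapiro-Wilk", sw_sig
    else:
        # Assumption: n == 50 is handled like n > 50.
        test, sig = "Kolmogorov-Smirnov", ks_sig
    return test, sig, sig >= alpha


# Sig. values copied from the Tests of Normality output (n = 88)
output = {
    "ServiceQuality": (0.165, 0.195),
    "ProductQuality": (0.06, 0.217),
    "CustomerLoyalty": (0.051, 0.051),
}

for variable, (ks_sig, sw_sig) in output.items():
    test, sig, accepted = normality_decision(88, ks_sig, sw_sig)
    verdict = "normally distributed" if accepted else "not normally distributed"
    print(f"{variable}: {test} Sig. {sig} -> {verdict}")
```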

Exercise

1. Evantia is an entrepreneur who wants to establish her fast-food
culinary business, potato burger. To learn the market's interest in the
taste of her recipe, Evantia distributed a Likert-scale questionnaire
containing 10 questions to 30 randomly chosen respondents.
The answers to the questionnaire are as follows:

No. Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10
1 3 4 4 5 4 4 4 4 4 4
2 3 3 2 2 2 3 4 3 2 3
3 1 2 3 3 2 3 4 3 2 3
4 1 2 2 2 2 3 2 1 2 2
5 1 2 2 2 1 2 3 4 5 4
6 5 4 4 5 4 5 4 3 2 1
7 1 2 1 2 1 2 3 4 4 4
8 5 4 3 4 5 4 3 4 5 4
9 5 4 4 3 4 5 4 3 4 4
10 3 4 4 4 5 4 4 5 4 3
11 3 3 3 3 4 4 3 2 2 1
12 3 4 4 5 4 3 3 3 2 3
13 2 1 2 1 2 2 3 3 4 4
14 3 3 4 4 3 3 2 1 2 2
15 3 2 2 1 2 3 4 3 2 3
16 3 4 5 4 4 3 2 1 2 3
17 1 2 1 2 2 1 2 2 3 2
18 1 2 2 2 3 3 3 2 1 2
19 2 1 2 1 2 1 2 2 1 2
20 1 2 1 2 3 2 1 2 3 4
21 4 5 4 5 4 3 2 2 2 1
22 5 4 5 4 4 5 4 3 3 4
23 5 4 5 4 4 3 3 4 5 4
24 2 3 2 3 4 3 4 5 4 4
25 3 4 5 4 5 4 4 5 4 3
26 2 2 2 3 3 3 4 3 2 2
27 2 2 1 2 1 2 3 2 3 3
28 4 4 3 2 1 2 1 2 1 2
29 2 1 2 2 1 2 2 2 1 2
30 4 5 4 5 4 4 5 4 3 3
Questions:

a. Is the questionnaire that Evantia distributed valid and reliable?

b. Do the normality test on the market interest variable!

2. The validity and reliability of a job satisfaction variable will be
tested. The variable consists of 5 indicators adapted from the
intrinsic factors of Herzberg's two-factor theory: the job itself,
achievement, growth opportunities, career advancement, and
recognition from others.

The scale used is a Likert scale of 1-5 with a sample of 30. After the
questionnaire was tabulated, the following data were obtained:

P1 | P2 | P3 | P4 | P5
Agree | Agree | Agree | Agree | Agree
Undecided | Undecided | Undecided | Undecided | Undecided
Agree | Agree | Agree | Agree | Agree
Agree | Agree | Agree | Undecided | Agree
Undecided | Undecided | Undecided | Undecided | Undecided
Undecided | Undecided | Agree | Undecided | Undecided
Agree | Agree | Agree | Agree | Agree
Strongly Agree | Agree | Agree | Agree | Agree
Undecided | Agree | Undecided | Undecided | Agree
Agree | Agree | Agree | Agree | Agree
Agree | Agree | Agree | Agree | Agree
Undecided | Undecided | Undecided | Undecided | Undecided
Agree | Agree | Agree | Undecided | Agree
Agree | Agree | Agree | Agree | Agree
Undecided | Undecided | Undecided | Agree | Undecided
Undecided | Undecided | Undecided | Undecided | Undecided
Agree | Agree | Agree | Agree | Agree
Undecided | Undecided | Undecided | Undecided | Undecided
Undecided | Undecided | Undecided | Undecided | Undecided
Agree | Undecided | Agree | Undecided | Undecided
Agree | Undecided | Undecided | Undecided | Undecided
Agree | Agree | Agree | Agree | Agree
Undecided | Undecided | Agree | Agree | Undecided
Agree | Undecided | Agree | Agree | Undecided
Agree | Agree | Agree | Agree | Agree
Agree | Agree | Agree | Agree | Agree
Agree | Agree | Agree | Agree | Agree
Disagree | Disagree | Undecided | Disagree | Disagree
Agree | Agree | Agree | Agree | Disagree
Agree | Agree | Agree | Undecided | Undecided

a. Are all the questions above valid and reliable?

b. Is the job satisfaction variable normally distributed?

SESSION 3
CORRELATION

Correlation is an association (relation) between the variables of
interest. Correlation analysis asks whether the available sample data
provide sufficient evidence that there is a link between the variables
in the population from which the sample was drawn and, if there is a
relationship, how strong it is. The relationship is expressed by the
correlation coefficient, or simply the correlation. Note that in
correlation we do not yet determine which variable is independent and
which is dependent, as we do in regression analysis.
3.1 Bivariate Correlation

Bivariate correlation measures the closeness of the relationship
between observations of two variables (bivariate) from a population.
This calculation requires that the population from which the sample is
drawn has two variables and a normal distribution. The Pearson
correlation is used to measure the correlation of interval and ratio
data.
Example: using the data of Service Quality, Product Quality, and
Customer Loyalty
Steps of testing in SPSS:

1. Enter the data as in the previous example.

2. To do the correlation test, choose Analyze >> Correlate >> Bivariate.

3. Move the variables that will be tested to the Variables section.

4. Click the Options button to open the options screen.

Information:

o The Statistics section is an option used to display a descriptive
statistical summary of the data that will be correlated.

o The Missing Values section determines how cases with unavailable
data are treated. SPSS provides two alternative treatments:

Exclude cases pairwise: a case is left out only of the correlations
that involve a variable for which it has no data.

Exclude cases listwise: a case with missing data on any of the
variables is left out of all the calculations.

5. For uniformity, use pairwise, then press Continue - OK. After OK,
the result will appear as shown below.
Descriptive Statistics
                     Mean    Std. Deviation  N
KualitasPelayanan    3.7864  0.55568         88
KualitasProduk       3.7682  0.52073         88
LoyalitasPelanggan   3.5938  0.76217         88

Correlations
                                         KualitasPelayanan  KualitasProduk  LoyalitasPelanggan
KualitasPelayanan    Pearson Correlation  1                  .850**          .215*
                     Sig. (2-tailed)                         0.000           0.045
                     N                    88                 88              88
KualitasProduk       Pearson Correlation  .850**             1               .320**
                     Sig. (2-tailed)      0.000                              0.002
                     N                    88                 88              88
LoyalitasPelanggan   Pearson Correlation  .215*              .320**          1
                     Sig. (2-tailed)      0.045              0.002
                     N                    88                 88              88
**. Correlation is significant at the 0.01 level (2-tailed).
*. Correlation is significant at the 0.05 level (2-tailed).
(KualitasPelayanan = Service Quality, KualitasProduk = Product Quality,
LoyalitasPelanggan = Customer Loyalty)
Interpretation of Results

Hypothesis:

H0 : There is no significant relationship between Service Quality and
Product Quality
Ha : There is a significant relationship between Service Quality and
Product Quality

Basis of Decision Making

Sig > α : Accept H0

Sig < α : Reject H0

Pearson Correlation Test

r > 0.5 : strong        r = + : in the same direction
r < 0.5 : weak          r = - : in the opposite direction

Decision

0.000 < 0.05 : Reject H0

0.850 > 0.5 : Strong

r = + : In the same direction

Conclusion

There is a significant correlation between Service Quality and Product
Quality, with a strong relationship in the same direction.
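The Pearson coefficient and the interpretation rules above can be sketched in Python as follows. The helper names are illustrative, the five data pairs are hypothetical, and the Sig. value still has to come from SPSS. Two points are assumptions because the module does not state them: strength is judged on the absolute value of r, and r equal to exactly 0.5 is treated as weak.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def interpret(r, sig, alpha=0.05):
    """Apply the module's decision rules to r and its Sig. (2-tailed)."""
    decision = "Reject H0" if sig < alpha else "Accept H0"
    strength = "strong" if abs(r) > 0.5 else "weak"   # assumption: uses |r|
    if r > 0:
        direction = "same direction"
    elif r < 0:
        direction = "opposite direction"
    else:
        direction = "no direction"
    return decision, strength, direction


# Values read from the Correlations table (Service Quality vs Product Quality)
print(interpret(0.850, 0.000))

# Hypothetical data to show the coefficient itself
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(pearson(x, y), 3))
```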

3.2 Partial Correlation

The purpose of partial correlation testing is to describe the degree
of correlation between two variables after the effect of other
variables has been controlled (statistically).

Example: using the data of Service Quality, Product Quality, and
Customer Loyalty. The manager wants to know whether there is a
relationship between product quality and customer loyalty, with
service quality as a control variable.

Step by step testing in SPSS:

1. Enter the data as in the previous example.

2. To do the correlation test, choose Analyze >> Correlate >> Partial.

3. Move the Product Quality and Customer Loyalty variables into
Variables and the Service Quality variable into Controlling for.

4. Click the Options button to open the options screen.

Information:

o The Statistics section is an option used to display a descriptive
statistical summary of the data that will be correlated.

o The Missing Values section determines how cases with unavailable
data are treated. SPSS provides two alternative treatments:

Exclude cases pairwise: a case is left out only of the correlations
that involve a variable for which it has no data.

Exclude cases listwise: a case with missing data on any of the
variables is left out of all the calculations.

5. For uniformity, use pairwise, then press Continue - OK. After OK,
the result will appear as shown below.

Correlations
Control Variables: -none-(a)
                                                 KualitasProduk  LoyalitasPelanggan  KualitasPelayanan
KualitasProduk      Correlation                  1               0.320               0.850
                    Significance (2-tailed)      .               0.002               0.000
                    df                           0               86                  86
LoyalitasPelanggan  Correlation                  0.320           1                   0.215
                    Significance (2-tailed)      0.002           .                   0.045
                    df                           86              0                   86
KualitasPelayanan   Correlation                  0.850           0.215               1
                    Significance (2-tailed)      0.000           0.045               .
                    df                           86              86                  0

Control Variables: KualitasPelayanan
                                                 KualitasProduk  LoyalitasPelanggan
KualitasProduk      Correlation                  1               0.268
                    Significance (2-tailed)      .               0.012
                    df                           0               85
LoyalitasPelanggan  Correlation                  0.268           1
                    Significance (2-tailed)      0.012           .
                    df                           85              0
a. Cells contain zero-order (Pearson) correlations.

Interpretation of Results

Hypothesis:

H0 : There is no significant relationship between product quality and
customer loyalty with service quality as the control variable.
Ha : There is a significant relationship between product quality and
customer loyalty with service quality as the control variable.

Basis of Decision Making

Sig > α : Accept H0

Sig < α : Reject H0

Pearson Correlation Test

r > 0.5 : strong        r = + : in the same direction

r < 0.5 : weak          r = - : in the opposite direction

Decision

0.012 < 0.05 : Reject H0

0.268 < 0.5 : Weak

r = + : In the same direction

Conclusion

There is a significant correlation between product quality and
customer loyalty, with service quality as the control variable; the
relationship is weak and in the same direction.
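The first-order partial correlation reported by SPSS can be approximated from the three zero-order correlations with the standard formula r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)). The sketch below is a minimal illustration using the rounded values from the zero-order table, so the last digit can differ slightly from the SPSS output of 0.268.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x and y after controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# x = Product Quality, y = Customer Loyalty, z = Service Quality (control)
r = partial_correlation(r_xy=0.320, r_xz=0.850, r_yz=0.215)
print(round(r, 3))
```

Controlling for service quality lowers the correlation between product quality and customer loyalty from 0.320 to about 0.27, which is why the relationship is described as weak.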

Exercise:

1. A math lecturer wants to research whether there is a relation
between class attendance and students' final grades. The following
sample results were obtained:

No Attendance in class (%) Final Grade
1 60 65
2 70 70
3 75 75
4 80 75
5 80 80
6 90 80
7 95 85
8 95 95
9 100 90
10 100 98
11 70 75
12 80 75
13 85 80
14 90 80
15 65 85
16 100 95
17 80 90
18 85 98
19 90 75
20 65 80
Do a test to determine whether there is a relation or not!

2. Baey wants to know whether there is a relation between a person's
income and expenditure per month. The following data come from
interviews with 10 people. Help Baey determine whether there is a
relation between the two variables.

Name Income (Million) Expenditure (Million)


Young 8.2 3.2
Riky 9.2 3.2
Kelvin 7 2.5

Incen 6.3 1.8
Olan 7.5 2
Nans 8.4 2
Jon 7.7 3.3
Minul 8 4
Pacho 9.11 3.2

3. Yoomes Bond wants to do research using a measurement scale. Yoomes
wants to examine the relation between intelligence and academic
achievement, with students' stress level suspected as a control
variable. Each variable is measured by several questions using a
Likert scale, where 1 = Strongly Disagree, 2 = Disagree, 3 = Agree, and
4 = Strongly Agree. After distributing the scale to 12 respondents,
here are the total item scores:

Intelligence Stress levels Academic Achievement
32 24 58
31 27 52
20 33 48
33 26 49
33 25 52
34 25 57
31 29 55
20 30 50
20 33 48
34 27 54
35 28 56

Determine:

a. Is there a correlation between stress levels and academic
achievement?

b. Is there a relation between intelligence and academic achievement
with stress levels as the control variable?

4. Last year, PT. Kelana implemented an e-logistic system with the aim
of improving employee productivity. The manager of PT. Kelana now wants
to see whether the productivity of the employees of PT. Kelana is
influenced by e-logistic or not, with the level of employee motivation
as a control variable. Help the manager of PT. Kelana do his research
using the data below.

E-logistic Employee Motivation Employee Productivity
55 65 90
20 13 80
85 79 130
65 53 116
45 43 84
70 62 140
35 18 120
60 75 88
95 84 83
65 68 108
85 72 131
10 10 134
75 64 180
80 82 50
50 46 30
90 95 80
75 82 131
45 42 134
65 73 180
50 80 99

SESSION 4
THE CLASSICAL ASSUMPTIONS

The regression equation must be shown to be linear and to meet the
required levels of validity and accuracy in estimation, and to be
unbiased and consistent, so that it can be expected to forecast well.
This requires the classical assumption tests: the Normality Test,
Multicollinearity Test, Heteroscedasticity Test, and Autocorrelation
Test.

4.1 Multicollinearity Test

Multiple linear regression models assume that there is no significant
correlation between the independent variables. The multicollinearity
test checks whether the regression model contains a high or perfect
correlation between the independent variables. If perfect
multicollinearity occurs between independent variables, the regression
coefficients of these variables cannot be determined and their
standard errors become infinite. Multicollinearity is assessed from
the VIF (Variance Inflation Factor) and the tolerance value: no
multicollinearity is indicated when the VIF is less than 10 and the
tolerance is greater than 0.10.

Example of Data: still using the data of Service Quality, Product
Quality, and Customer Loyalty.

The steps of the multicollinearity test are as follows.

1. Insert the data into IBM SPSS v.20.

2. From the main menu of SPSS, choose Analyze, then the submenu
Regression, then choose Linear. A dialog box will be displayed.

Enter the variable that is affected (dependent) and the variables that
influence it (independent) into their respective boxes.

3. Choose Statistics. Then check Estimates, Covariance matrix (requests
the correlation matrix between the independent variables), Model fit
(requests the coefficient of determination R2), Part and partial
correlations (requests the partial and zero-order correlations), and
Collinearity diagnostics (requests the tolerance value and VIF).

4. Click OK. The SPSS output is as follows.

Coefficients(a)
                     Unstandardized     Standardized                  Correlations                 Collinearity
                     Coefficients       Coefficients                                               Statistics
Model                B       Std. Error Beta          t       Sig.    Zero-order  Partial  Part    Tolerance  VIF
1 (Constant)         1.932   0.577                    3.35    0.001
  KualitasPelayanan  -0.285  0.266      -0.208        -1.072  0.287   0.215       -0.116   -0.109  0.277      3.607
  KualitasProduk     0.728   0.284      0.497         2.565   0.012   0.32        0.268    0.262   0.277      3.607
a. Dependent Variable: LoyalitasPelanggan
Interpretation of Results

Hypothesis

H0 : Multicollinearity does not occur

Ha : Multicollinearity occurs

Basis of Decision Making

Tolerance Value > 0.10 : Accept H0        VIF > 10 : Reject H0

Tolerance Value < 0.10 : Reject H0        VIF < 10 : Accept H0

Decision

For the service quality and product quality variables:

0.277 > 0.10 : Accept H0

3.607 < 10 : Accept H0

Conclusion

Based on the Tolerance and VIF values, no Tolerance value is below
0.10 (it is 0.277), and no VIF value is greater than 10 (it is only
3.607). This indicates no multicollinearity.
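With only two independent variables, tolerance and VIF follow directly from the correlation between them: tolerance = 1 - r^2 and VIF = 1 / tolerance. The sketch below is a minimal illustration using the rounded Pearson correlation of 0.850 from Session 3, so the results differ slightly from the SPSS values of 0.277 and 3.607. With more than two predictors, r^2 would have to be replaced by the R2 of regressing each predictor on all the others, which this sketch does not do.

```python
def tolerance_and_vif(r_between_predictors):
    """Tolerance and VIF for a model with exactly two independent variables."""
    tolerance = 1 - r_between_predictors ** 2
    return tolerance, 1 / tolerance


# Correlation between Service Quality and Product Quality (rounded)
tolerance, vif = tolerance_and_vif(0.850)

# Decision rule of this section
no_multicollinearity = tolerance > 0.10 and vif < 10

print(round(tolerance, 3), round(vif, 3), no_multicollinearity)
```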

4.2 Heteroscedasticity Test

A heteroscedasticity test is needed to find out whether the variance
of the residuals in the regression model is unequal across the data.

Spearman Rank Correlation Test

Testing is done by looking for correlations between the independent
variables and the residual values. If a significant correlation is
found between an independent variable and the residual values, it can
be concluded that heteroscedasticity has occurred.

Still using the data of Service Quality, Product Quality, and Customer
Loyalty, the steps of the heteroscedasticity test are as follows:

1. From the main menu of SPSS, choose Analyze, then the submenu
Regression, and then choose Linear. A dialog box will be displayed.

Enter the variable that is affected (dependent) and the variables that
influence it (independent) into their respective boxes.

2. Choose Save, then in the Residuals section choose Unstandardized
(to save the residual values).

Then click Continue >> OK. The residuals will appear as a new variable
in the SPSS data.

3. After the residual values appear, continue with a correlation
analysis. Choose the menu Analyze >> Correlate >> Bivariate.

4. Move the service quality variable, the product quality variable, and
the Unstandardized Residual to the Variables box. Then tick Spearman
under Correlation Coefficients.

5. Click OK.

Correlations (Spearman's rho)
                                                  KualitasPelayanan  KualitasProduk  Unstandardized Residual
KualitasPelayanan        Correlation Coefficient  1                  .814**          0.048
                         Sig. (2-tailed)          .                  0.000           0.658
                         N                        88                 88              88
KualitasProduk           Correlation Coefficient  .814**             1               0.071
                         Sig. (2-tailed)          0.000              .               0.51
                         N                        88                 88              88
Unstandardized Residual  Correlation Coefficient  0.048              0.071           1
                         Sig. (2-tailed)          0.658              0.51            .
                         N                        88                 88              88
**. Correlation is significant at the 0.01 level (2-tailed).
Interpretation of Results

Hypothesis

H0 : Heteroscedasticity does not occur

Ha : Heteroscedasticity occurs

Basis for Decision

Sig > α (0.05) : Accept H0

Sig < α (0.05) : Reject H0

Decision

Service Quality : 0.658 > 0.05 Accept H0

Product Quality : 0.510 > 0.05 Accept H0

Conclusion

Both independent variables have a Sig. value greater than 0.05, so it is concluded that there is no heteroscedasticity problem.
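
For readers who want to see what the Spearman coefficient measures, the following is a minimal sketch of the calculation: rank both series (ties share the mean rank), then take the Pearson correlation of the ranks. It is not SPSS's implementation and does not compute the Sig. value. The function names and the small data set are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical illustration data (not the module's 88 cases).
predictor = [3.2, 3.5, 3.5, 4.0, 4.4, 4.8]
residual = [0.10, -0.20, 0.05, 0.30, -0.15, 0.00]
rho = spearman_rho(predictor, residual)
print(f"rho = {rho:.3f}")
```

A rho close to 0 between a predictor and the residuals is what the test above hopes to find.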

4.3 Autocorrelation Test

The autocorrelation test checks whether, in a linear regression model, the disturbance error (residual) in period t is correlated with the error in the previous period, t - 1. If there is a correlation, the model has an autocorrelation problem. Autocorrelation arises because successive observations over time are related to each other. It often appears in time series data, while in cross-section data it is relatively rare. This module uses the Runs Test, which checks whether the residuals occur in random order. If they do not, the residuals are correlated and there is an autocorrelation problem.

The steps of the Runs Test are as follows:

1. Use the previous data.

2. Choose Analyze >> Nonparametric Tests >> Legacy Dialogs >> Runs.

3. In the Test Variable List box, enter Unstandardized Residual (res_1). Under Cut Point, choose Median.

4. Click Continue, then OK. The SPSS output is as follows.

Runs Test
                          Unstandardized Residual
Test Value(a)             -.03251
Cases < Test Value        44
Cases >= Test Value       44
Total Cases               88
Number of Runs            27
Z                         -3.860
Asymp. Sig. (2-tailed)    .000
a. Median

Interpretation of Results

Hypothesis

H0 : Autocorrelation does not occur

Ha : Autocorrelation occurs

Basis for Decision

Sig > α (0.05) : Accept H0

Sig < α (0.05) : Reject H0

Decision

0.000 < 0.05 Reject H0

Conclusion

Autocorrelation occurs in the regression model.
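
The Z value in the Runs Test table can be reproduced from the three counts it reports. The sketch below uses the standard large-sample normal approximation of the runs test without a continuity correction, which is an assumption about how the figure was produced. The function names are hypothetical.

```python
import math


def runs_test_z(n_below, n_above, runs):
    """Z statistic of the runs test (normal approximation, no correction)."""
    n = n_below + n_above
    product = 2 * n_below * n_above
    expected_runs = product / n + 1
    variance = product * (product - n) / (n ** 2 * (n - 1))
    return (runs - expected_runs) / math.sqrt(variance)


def two_sided_p(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Counts from the Runs Test table: 44 cases below the median,
# 44 at or above it, and 27 runs.
z = runs_test_z(44, 44, 27)
p = two_sided_p(z)
print(f"Z = {z:.3f}, p = {p:.5f}")
```

With 44 cases on each side of the median, 45 runs would be expected. Observing only 27 gives a strongly negative Z, which is why H0 is rejected.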

Note:

Classical assumption tests are statistical requirements that must be met in multiple linear regression analysis based on ordinary least squares (OLS). Regression analysis that is not based on OLS, such as logistic regression or ordinal regression, does not require the classical assumptions. Likewise, not every classical assumption test has to be performed for every linear regression analysis: for example, the multicollinearity test is not performed in simple linear regression, and the autocorrelation test does not need to be applied to cross-sectional data.

The classical assumption tests are also not necessary for a linear regression analysis that only aims to calculate the value of a particular variable. For example, stock returns may be calculated with the market model or the market-adjusted model. The expected return can be calculated with the regression equation, but the classical assumptions do not need to be tested.

Exercise

1 A management student is currently doing research on the factors that influence employee satisfaction. The following data were collected:

No.  Relation with Boss  Quality of Work Environment  Career Opportunity  Job Satisfaction
1 4.14 4.75 4.57 5.08
2 3.21 3 2.86 3.05
3 3.29 3.5 3.43 3.5
4 2.86 2.5 2.57 2.54
5 3.43 3.75 3.71 3.73
6 3.36 3.75 2.57 3.73
7 3.07 3.5 3.14 2.8
8 3.36 3.5 3.86 3.25
9 3.71 2.25 3.86 2.8
10 3.21 4.5 3.86 3.73
11 3.5 3.75 3.71 3.02
12 4 4.5 3.71 3.73
13 3.64 2.75 3.86 2.8
14 4.21 3.75 3.29 3.25
15 2.86 3.75 3.71 3.25
16 3.14 4.25 3.57 2.51
17 4.5 3.75 4.29 3.73
18 2.93 3.5 4.57 4.48
19 3.79 4.5 3.71 3.73
20 3.5 2.75 3.57 3.85
21 3.5 3.25 3.86 4.1
22 3.57 3 3.14 3.62
23 3.5 4 2.86 2.8
24 3.14 2.75 3.71 3.25
25 3.36 4 2.57 3.73

Do the test of classical assumptions for the data above!

2 Ferlyn, a student majoring in management, wants to know which factors influence the work performance of employees. Ferlyn distributed questionnaires to 30 randomly selected employees. The following results were obtained:

Work Motivation  Job Satisfaction  Work Performance
3.79 4.00 3.57
3.57 4.00 3.71
3.00 2.50 2.71
4.64 5.00 4.86
3.50 3.50 3.29
3.57 3.50 3.57
3.00 2.25 2.57
3.43 3.25 3.29
3.93 4.25 4.14
3.50 3.75 3.57
3.43 3.50 3.57
3.86 4.25 4.14
3.29 3.50 3.43
3.00 2.75 3.00
3.93 4.50 4.29
4.36 5.00 4.71
3.93 4.50 4.14
3.14 3.25 3.29
3.50 3.75 3.71
4.07 4.50 4.14
3.71 4.25 4.00
4.36 4.75 4.43
3.79 4.25 4.14
3.86 4.25 4.00
3.86 4.00 3.86
3.36 3.25 3.29
3.71 4.25 3.71
3.86 4.00 3.86
3.29 3.50 3.14
3.64 4.00 3.71

Is the data fit for use in multiple regression testing?

SESSION 5
REGRESSION

5.1 Simple Regression

A simple regression analysis develops an estimating equation (regression equation): a formula that estimates the value of the dependent variable from a known value of the independent variable, where there is exactly one variable of each kind. Regression analysis is used primarily for forecasting.

Example: Still using the Service Quality, Product Quality, and Customer Loyalty data, the steps of testing in SPSS are as follows:

1. Input the data into IBM SPSS v.20.

From the main menu of SPSS, choose Analyze, then the submenu Regression, and then choose Linear. The Linear Regression dialog box opens.

Enter the affected (dependent) variable and the influencing (independent) variable into their respective boxes.

2. Choose Statistics, then tick Estimates, Model Fit, and Descriptives.

3. Click Continue and then choose Plots, and set the plot options.

4. Click Save and choose the Unstandardized option under Predicted Values.

5. Click Options. Under Stepping Method Criteria, choose Entry .05 (the F test uses a standard probability of 5%). Keep Include constant in equation selected. Under Missing Values, choose Exclude cases listwise (cases with any missing value are excluded). Then click Continue.

6. Click OK to see the output.

ANOVA(a)
Model            Sum of Squares   df   Mean Square   F       Sig.
1  Regression    2.331            1    2.331         4.158   .045(b)
   Residual      48.208           86   0.561
   Total         50.539           87
a. Dependent Variable: LoyalitasPelanggan
b. Predictors: (Constant), KualitasPelayanan

Model Summary(b)
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .215(a)   0.046      0.035               0.74871
a. Predictors: (Constant), KualitasPelayanan
b. Dependent Variable: LoyalitasPelanggan

Coefficients(a)
                       Unstandardized Coefficients   Standardized Coefficients
Model                  B       Std. Error            Beta                        t       Sig.
1  (Constant)          2.479   0.553                                             4.484   .000
   KualitasPelayanan   0.295   0.144                 0.215                       2.039   0.045
a. Dependent Variable: LoyalitasPelanggan
Result Interpretation

Regression Test

Hypothesis

H0 : Service Quality has no significant effect on Customer Loyalty.
Ha : Service Quality has a significant effect on Customer Loyalty.

Basis for Decision Making

Sig > α (0.05), then accept H0
Sig < α (0.05), then reject H0

Decision

0.045 < 0.05, reject H0

Conclusion

Service Quality has a significant effect on Customer Loyalty.

Correlation Test

Pearson correlation test

r > 0.5 : Strong        r = + : Positive
r < 0.5 : Weak          r = - : Negative

Decision

0.215 < 0.5, weak
r = +, positive

Conclusion

There is a weak and positive relationship between Service Quality and Customer Loyalty.

Magnitude test

Coefficient of determination (R-squared)

R Square = 0.046, so 4.6% of the variation in Customer Loyalty is explained by the Service Quality variable and the remaining 95.4% is influenced by other factors.

Regression equation

y = 2.479 + 0.295x
Description : y = Customer Loyalty
              x = Service Quality

o If x (Service Quality) rises by 1 point, then y (Customer Loyalty) increases by 0.295 point.

o If x (Service Quality) falls by 1 point, then y (Customer Loyalty) decreases by 0.295 point.

o If x (Service Quality) = 0, then y (Customer Loyalty) is 2.479.
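
The figures in the three output tables are linked, and the links can be checked with a few lines of arithmetic. The sketch below recomputes R Square and F from the sums of squares in the ANOVA table and evaluates the fitted equation. The constants are copied from the SPSS output above, and the function names are hypothetical.

```python
# Figures copied from the SPSS ANOVA and Coefficients tables.
SS_REGRESSION, SS_RESIDUAL = 2.331, 48.208
DF_REGRESSION, DF_RESIDUAL = 1, 86
INTERCEPT, SLOPE = 2.479, 0.295


def r_squared(ss_reg, ss_res):
    """Share of the total sum of squares explained by the regression."""
    return ss_reg / (ss_reg + ss_res)


def f_statistic(ss_reg, df_reg, ss_res, df_res):
    """F = mean square regression / mean square residual."""
    return (ss_reg / df_reg) / (ss_res / df_res)


def predict_loyalty(service_quality):
    """Fitted equation y = 2.479 + 0.295x."""
    return INTERCEPT + SLOPE * service_quality


r2 = r_squared(SS_REGRESSION, SS_RESIDUAL)
f = f_statistic(SS_REGRESSION, DF_REGRESSION, SS_RESIDUAL, DF_RESIDUAL)
print(f"R Square = {r2:.3f}, F = {f:.3f}")
print(f"Predicted loyalty at x = 4: {predict_loyalty(4):.3f}")
```

In simple regression F is also the square of the t value of the slope (2.039 squared is about 4.158), which is why both tests report the same Sig. of 0.045.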

5.2 Multiple Regression

As described above, simple linear regression has only one dependent variable (Y) and one independent variable (X), whereas multiple regression has one dependent variable and more than one independent variable. In business practice, multiple regression is more widely used, both because a business has many variables that need to be analyzed together and because in many cases multiple regression is more relevant.
Example: Still using the Service Quality, Product Quality, and Customer Loyalty data
Solution steps:

1. Input the data into IBM SPSS v.20.

2. From the SPSS main menu, choose Analyze, then Regression, and choose Linear. The Linear Regression dialog box opens.

Drag and drop the dependent variable and the independent variables into their respective boxes.

3. Choose Statistics, then tick Estimates, Model Fit, and Descriptives.

4. Click Continue and choose Plots, and set the plot options.

5. Click Save and choose the Unstandardized option under Predicted Values.

6. Click Options. Under Stepping Method Criteria, choose Entry .05 (the F test uses a standard probability of 5%). Tick Include constant in equation. Under Missing Values, choose Exclude cases listwise. Then click Continue.

7. Click OK to see the result.

Descriptive Statistics
                  Mean     Std. Deviation   N
CustomerLoyalty   3.5938   0.76217          88
ServiceQuality    3.7864   0.55568          88
ProductQuality    3.7682   0.52073          88

Correlations
                                        CustomerLoyalty   ServiceQuality   ProductQuality
Pearson Correlation   CustomerLoyalty   1.000             0.215            0.320
                      ServiceQuality    0.215             1.000            0.850
                      ProductQuality    0.320             0.850            1.000
Sig. (1-tailed)       CustomerLoyalty   .                 0.022            0.001
                      ServiceQuality    0.022             .                .000
                      ProductQuality    0.001             .000             .
N                     CustomerLoyalty   88                88               88
                      ServiceQuality    88                88               88
                      ProductQuality    88                88               88

Model Summary(b)
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .339(a)   0.115      0.094               0.72555
a. Predictors: (Constant), ProductQuality, ServiceQuality
b. Dependent Variable: CustomerLoyalty

ANOVA(a)
Model            Sum of Squares   df   Mean Square   F       Sig.
1  Regression    5.793            2    2.896         5.502   .006(b)
   Residual      44.746           85   0.526
   Total         50.539           87
a. Dependent Variable: CustomerLoyalty
b. Predictors: (Constant), ProductQuality, ServiceQuality

Coefficients(a)
                    Unstandardized Coefficients   Standardized Coefficients
Model               B        Std. Error           Beta                        t        Sig.
1  (Constant)       1.932    0.577                                            3.350    0.001
   ServiceQuality   -0.285   0.266                -0.208                      -1.072   0.287
   ProductQuality   0.728    0.284                0.497                       2.565    0.012
a. Dependent Variable: CustomerLoyalty

Significance Test

Hypothesis

H0 : ServiceQuality and ProductQuality have no significant effect on CustomerLoyalty.

Ha : ServiceQuality and ProductQuality have a significant effect on CustomerLoyalty.

Basis for Decision Making

Sig > α (0.05), then accept H0
Sig < α (0.05), then reject H0

Decision

0.006 < 0.05, reject H0

Conclusion

ServiceQuality and ProductQuality have a significant effect on CustomerLoyalty.

Coefficient of Determination (R-squared)

R Square = 0.115, so 11.5% of the variation in Customer Loyalty is explained by the Service Quality and Product Quality variables and the remaining 88.5% is influenced by other factors.

Regression equation

y = 1.932 - 0.285 x1 + 0.728 x2

Description : y = Customer Loyalty
              x1 = Service Quality
              x2 = Product Quality

o If x1 (Service Quality) rises by 1 unit and x2 (Product Quality) rises by 1 unit, then y (Customer Loyalty) increases by 0.443 units (-0.285 + 0.728 = 0.443).

o If x1 (Service Quality) falls by 1 unit and x2 (Product Quality) falls by 1 unit, then y (Customer Loyalty) decreases by 0.443 units.

o If x1 (Service Quality) and x2 (Product Quality) = 0, then y (Customer Loyalty) is 1.932.
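
The combined effect of changing both predictors is the sum of their coefficients, which can be confirmed by evaluating the fitted equation directly. This is a minimal sketch using the coefficients from the Coefficients table, with the negative sign of the ServiceQuality coefficient as printed there. The function name and the starting scores are hypothetical.

```python
# Coefficients copied from the SPSS Coefficients table.
B0, B1, B2 = 1.932, -0.285, 0.728


def predict_loyalty(service_quality, product_quality):
    """Fitted equation y = 1.932 - 0.285*x1 + 0.728*x2."""
    return B0 + B1 * service_quality + B2 * product_quality


# Hypothetical starting point: both predictors at 3.
base = predict_loyalty(3, 3)
both_up = predict_loyalty(4, 4)      # both predictors rise by 1 unit
only_service_up = predict_loyalty(4, 3)

print(f"Change when both rise by 1:   {both_up - base:+.3f}")
print(f"Change when only x1 rises:    {only_service_up - base:+.3f}")
print(f"Value when both are 0:        {predict_loyalty(0, 0):.3f}")
```

Holding Product Quality fixed, a 1-unit rise in Service Quality lowers the prediction by 0.285, although the Sig. of 0.287 for that coefficient means this individual effect is not statistically significant at the 5% level.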

Exercise

1. A math professor wants to investigate whether class attendance influences students' final grades. The following sample results were obtained:

No  Class attendance (%)  Final grade
1 60 65
2 70 70
3 75 75
4 80 75
5 80 80
6 90 80
7 95 85
8 95 95
9 100 90
10 100 98
11 70 75
12 80 75
13 85 80
14 90 80
15 65 85
16 100 95
17 80 90
18 85 98
19 90 75
20 65 80

Do a test to determine whether there is any influence or not!

2. Ferlyn, a student majoring in management, wants to know which factors affect employee performance. Ferlyn distributed questionnaires to 30 employees at random. The following results were obtained:

Job Motivation  Job Satisfaction  Work Performance
3.79 4.00 3.57
3.57 4.00 3.71
3.00 2.50 2.71
4.64 5.00 4.86
3.50 3.50 3.29
3.57 3.50 3.57
3.00 2.25 2.57
3.43 3.25 3.29
3.93 4.25 4.14
3.50 3.75 3.57
3.43 3.50 3.57
3.86 4.25 4.14
3.29 3.50 3.43
3.00 2.75 3.00
3.93 4.50 4.29
4.36 5.00 4.71
3.93 4.50 4.14
3.14 3.25 3.29
3.50 3.75 3.71
4.07 4.50 4.14
3.71 4.25 4.00
4.36 4.75 4.43
3.79 4.25 4.14
3.86 4.25 4.00
3.86 4.00 3.86
3.36 3.25 3.29
3.71 4.25 3.71
3.86 4.00 3.86
3.29 3.50 3.14
3.64 4.00 3.71

a. Does work motivation influence work performance?

b. Do the independent variables influence the dependent variable?

c. How large is the influence of the independent variables on the dependent variable?

d. What is the overall correlation among all variables?

e. Write down the regression equation, including its interpretation!

3. A student majoring in management is conducting research on the factors that influence employee job satisfaction. The following data were collected:

No.  Relationship with supervisor  Work Environment Quality  Career Opportunity  Job Satisfaction
1 4.14 4.75 4.57 5.08
2 3.21 3 2.86 3.05
3 3.29 3.5 3.43 3.5
4 2.86 2.5 2.57 2.54
5 3.43 3.75 3.71 3.73
6 3.36 3.75 2.57 3.73
7 3.07 3.5 3.14 2.8
8 3.36 3.5 3.86 3.25
9 3.71 2.25 3.86 2.8
10 3.21 4.5 3.86 3.73
11 3.5 3.75 3.71 3.02
12 4 4.5 3.71 3.73
13 3.64 2.75 3.86 2.8
14 4.21 3.75 3.29 3.25
15 2.86 3.75 3.71 3.25
16 3.14 4.25 3.57 2.51
17 4.5 3.75 4.29 3.73
18 2.93 3.5 4.57 4.48
19 3.79 4.5 3.71 3.73
20 3.5 2.75 3.57 3.85
21 3.5 3.25 3.86 4.1
22 3.57 3 3.14 3.62
23 3.5 4 2.86 2.8
24 3.14 2.75 3.71 3.25
25 3.36 4 2.57 3.73

a. Does Work Environment Quality influence Job Satisfaction?

b. Do the independent variables influence the dependent variable?

c. How large is the influence of the independent variables on the dependent variable?

d. What is the overall correlation among all variables?

e. Write down the regression equation, including its interpretation!


SESSION 6
ANOVA AND MANOVA

ANOVA

Analysis of variance (ANOVA) is a method for examining the relationship between one dependent variable (metric scale) and one or more independent variables (non-metric or categorical scale, with more than two categories). An analysis with one dependent variable and one independent variable is called One-Way ANOVA. When there is one metric dependent variable and two or three categorical independent variables, the analysis is called Two-Way ANOVA or Three-Way ANOVA, respectively.

ANOVA is used to determine the main effects and interaction effects of the categorical independent variables (often called factors) on the metric dependent variable.

Analysis of Variance assumptions

Some assumptions must be met when using the ANOVA test:

1. Homogeneity of variance: the dependent variable must have the same variance in each category of the independent variable. This is checked with Levene's test of homogeneity of variance. In the Levene test, the desired result is to accept the null hypothesis, that is, a probability (Sig.) > 0.05, which means the group variances are equal.
2. Random sampling: for the purposes of tests of significance, the subjects in each group must be taken at random.
3. Multivariate normality: for the purposes of tests of significance, the variables must follow a multivariate normal distribution.

Example:

The manager of the retail store "Semua Ada" wants to know the level of customer satisfaction in the store when 3 different air fresheners are used. The following data were collected by distributing questionnaires about customer satisfaction, using a scale of 1-20.

Lavender Citrus Vanilla


13 18 12
16 20 13
15 15 10
16 15 13
15 19 14
13 18 12
15 18 10
15 17 11
17 20 13
14 15 10

1. Is the data normally distributed?
2. Does the data have the same variance?
3. Are there any differences in the average level of customer satisfaction among the three air freshener scents?
4. Are there any differences in the average level of customer satisfaction between the lavender and citrus scents according to Tukey?
5. Are there any differences in the average level of customer satisfaction between the citrus and vanilla scents according to Bonferroni?

Solution steps:
1. Enter the Rating and Scent variables in Variable View. Then define Value Labels for the Scent variable.

2. Input the data vertically.

3. Perform the normality test first.

Tests of Normality
                    Kolmogorov-Smirnov(a)          Shapiro-Wilk
         scent      Statistic   df   Sig.          Statistic   df   Sig.
rating   lavender   0.231       10   0.139         0.924       10   0.392
         citrus     0.201       10   .200*         0.875       10   0.114
         vanilla    0.192       10   .200*         0.887       10   0.158
*. This is a lower bound of the true significance.
a. Lilliefors Significance Correction

4. After that, choose Analyze >> Compare Means >> One-Way ANOVA. Move the Rating variable to the Dependent List and the Scent variable to Factor.

5. Click Post Hoc, then tick Bonferroni and Tukey.

6. Click Options, then tick Descriptive and Homogeneity of variance test.

7. Click Continue and OK. See the result.

Test of Homogeneity of Variances

rating
Levene Statistic   df1   df2   Sig.
1.537              2     27    0.233

ANOVA

rating
                 Sum of Squares   df   Mean Square   F        Sig.
Between Groups   162.867          2    81.433        31.865   .000
Within Groups    69.000           27   2.556
Total            231.867          29


8. Find the F-table (critical) value. Choose Transform >> Compute Variable and compute the value there.

9. See the F-table result in Data View.

Result Interpretation

a. Normality Test (in the Tests of Normality table, use the Shapiro-Wilk columns)

Hypothesis

H0 : Rating data of customer satisfaction are normally distributed

Ha : Rating data of customer satisfaction are not normally distributed

Basis for Decision Making

Sig > α (0.05), then accept H0

Sig < α (0.05), then reject H0

Decision

Lavender : 0.392 > 0.05, then accept H0

Citrus : 0.114 > 0.05, then accept H0

Vanilla : 0.158 > 0.05, then accept H0

Conclusion

Rating data of customer satisfaction with the lavender scent are normally distributed.

Rating data of customer satisfaction with the citrus scent are normally distributed.

Rating data of customer satisfaction with the vanilla scent are normally distributed.

b. Variance Test (use the Levene test table)

Hypothesis

H0 : There is no difference in the variance of customer satisfaction ratings among the three air freshener scents

Ha : There is a difference in the variance of customer satisfaction ratings among the three air freshener scents

Basis for Decision Making

Sig > α (0.05), then accept H0

Sig < α (0.05), then reject H0

Decision

0.233 > 0.05, accept H0

Conclusion

There is no difference in the variance of customer satisfaction ratings among the three air freshener scents.

c. ANOVA Test (use the ANOVA table)

Hypothesis

H0 : There are no differences in the average level of customer satisfaction among the three air freshener scents.

Ha : There are differences in the average level of customer satisfaction among the three air freshener scents.

Basis for Decision Making

Sig > α (0.05), then accept H0

Sig < α (0.05), then reject H0

OR

F-calculated > F-table, then reject H0

F-calculated < F-table, then accept H0

Decision

31.865 > 3.35, reject H0

0.000 < 0.05, reject H0

Conclusion

There are differences in the average level of customer satisfaction among the three air freshener scents.
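
The ANOVA table can be reproduced from the raw ratings. The sketch below is a plain-Python illustration of the one-way ANOVA arithmetic (between-group and within-group sums of squares, then F), not a substitute for the SPSS procedure. The function name is hypothetical and the ratings are the ones listed in the example.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (SS between, SS within, df between, df within, F)."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f


# Ratings from the air freshener example (scale 1-20).
lavender = [13, 16, 15, 16, 15, 13, 15, 15, 17, 14]
citrus = [18, 20, 15, 15, 19, 18, 18, 17, 20, 15]
vanilla = [12, 13, 10, 13, 14, 12, 10, 11, 13, 10]

ss_b, ss_w, df_b, df_w, f_value = one_way_anova([lavender, citrus, vanilla])
print(f"SS between = {ss_b:.3f} (df {df_b})")
print(f"SS within  = {ss_w:.3f} (df {df_w})")
print(f"F = {f_value:.3f}")
```

The group means are 14.9 (lavender), 17.5 (citrus), and 11.8 (vanilla). The large F reflects how far apart these means are relative to the spread inside each group.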

d. Tukey Test

Hypothesis

H0 : There are no differences in the average level of customer satisfaction between the lavender and citrus scents according to Tukey

Ha : There are differences in the average level of customer satisfaction between the lavender and citrus scents according to Tukey

Basis for Decision Making

Sig > α (0.05), then accept H0

Sig < α (0.05), then reject H0

Decision

0.003 < 0.05, reject H0

Conclusion

There are differences in the average level of customer satisfaction between the lavender and citrus scents according to Tukey.

e. Bonferroni Test

Hypothesis

H0 : There are no differences in the average level of customer satisfaction between the citrus and vanilla scents according to Bonferroni

Ha : There are differences in the average level of customer satisfaction between the citrus and vanilla scents according to Bonferroni

Basis for Decision Making

Sig > α (0.05), then accept H0

Sig < α (0.05), then reject H0

Decision

0.000 < 0.05, reject H0

Conclusion

There are differences in the average level of customer satisfaction between the citrus and vanilla scents according to Bonferroni.


Exercise

1 Yosua, the owner of the grocery store "Variety Store", wants to know whether there are differences in the amount of bananas sold per week (in kg) when the bananas are displayed in different sections of the store: cereal, milk, and fruit. Below is the sales data for 3 months:

Cereal Milk Fruit


26 39 61
55 18 40
53 32 65
50 55 50
35 39 45
40 25 55
45 40 59
38 44 68
30 38 38
25 20 46
55 25 49
48 39 42
a Is the sales data normally distributed?
b Are the variances of the bananas sales data equal?
c Are the means of the bananas sales data from displaying in all three places
equal?

d Are the means of the bananas sales data from displaying in the milk and cereal
sections equal according to Tukey?

e Are the means of the bananas sales data from displaying in the fruits and cereal
sections equal according to Bonferroni?

2 A soap collector wants to know the sales rates of soaps in the market nowadays. He
picked up 3 different brands that are often used among the consumers, Lepboy, Luks
and Dav. Below is the sales data of the 3 brands shown as per month in a year.

Lepboy (in thousand)  Luks (in thousand)  Dav (in thousand)
790 784 775


832 790 814
768 810 792
793 805 811
802 799 774
782 772 790
798 782 801
812 804 794
792 811 807
821 801 792
823 790 794
816 805 804
The data above are assumed to be normally distributed.

a Are the variances of the data equal?

b Are the means of the sales from the 3 different brands equal?

c Are the means of the sales from Lepboy and Dav equal according to Bonferroni?

3 Ferlyn, a young entrepreneur who opened a business that sells snacks named "Tasty
Macaroni", is intending to find out how her snack taste rating is based on several
variants. Hence, Ferlyn decided to distribute questionnaires to customers randomly. The
questionnaire consists of taste rating (in scale of 1-10). Below is the result from the
questionnaires:

Balado  Cheese  Original  Smoked Beef
5 4 7 9
7 4 7 9
8 5 6 9
8 5 5 8
9 6 8 7
6 4 8 7
7 7 8 7
9 7 9 8
7 8 9 6
8 6 7 8

a Are the ratings and the amount purchased from the customers distributed normally?

b Determine the Variance-Covariance Matrix Test

c Are the dependent variables of all 4 variety flavors equal?


d Are the variances for each of the variable equal?
e Are the variances of the ratings from the 4 flavors equal?
f Are the variances of the amounts purchased of the 4 flavors equal?


SESSION 7
CHI-SQUARE

The Chi-Square test is used to examine the dependency between variables, which are usually measured on a nominal or ordinal scale. The Chi-square test tabulates one or more variables into categories and computes a test statistic from the tabulated frequencies. When one variable is analyzed, the test is a consistency test or goodness of fit test, which compares the observed frequency (fo) with the expected frequency (fe). When there are 2 variables, the test is usually known as the test of independence, which is used to examine the relationship between the 2 variables. Chi-square is a non-parametric method in statistics.

All variables to be analyzed have to be categorical: nominal or ordinal, coded numerically. The procedure rests on the fact that non-parametric tests do not require assumptions about the shape of the distribution. The data are assumed to come from a random sample. The expected frequencies (fe) must satisfy this requirement: categories with an expected frequency of less than 5 should not make up more than 20% of all categories.

(Suseno, 2013)

7.1 Consistency Test or Goodness of Fit Test

The Consistency Test or Goodness of Fit Test determines whether a population follows a particular distribution.

Example with the same expected frequency in all categories:

Marlene, a website owner, would like to offer a free gift to those who subscribe to her website. New customers get to choose one out of 3 prizes of equal value: gift vouchers, dolls, or free cinema tickets. After 1000 people have signed up, Marlene wants to review the numbers to see if the three prizes offered are equally popular. In this case, the categorical variable is the type of gift, with the categories gift voucher, doll, and free cinema ticket. The 1000 people who have signed up are the "cases"; a case can be anything from a "person" to an "animal", "object", "organization", and so on.

The data of the 1000 customers are as follows:

Type of gift  Respondents
Voucher 370
Doll 230
Movie Tickets 400

Steps:

1. Define the variables (type of gift and respondents) in Variable View.

2. Use Values for the type of gift variable to enter the prize labels.

3. Input the data in Data View.

4. Before proceeding to the test, run the Weight Cases command (Data >> Weight Cases), so that each row entered in Data View counts as the number of cases stated in its frequency.

5. Select "Weight cases by" and specify the frequency variable, in this case jumlah peminat (number of respondents).

6. After weighting the cases, proceed to the normality test by selecting Analyze >> Descriptive Statistics >> Explore.

Tests of Normality
                  Kolmogorov-Smirnov(a)          Shapiro-Wilk
                  Statistic   df     Sig.        Statistic   df     Sig.
jumlahpeminat     0.389       1000   .000        0.656       1000   .000
a. Lilliefors Significance Correction


7. The normality test rejects the null hypothesis (Sig. = 0.000 < 0.05), so the data are not normally distributed and we proceed with a non-parametric test. Select Analyze >> Nonparametric Tests >> Legacy Dialogs >> Chi-square. Move the variable tipe_hadiah to the Test Variable List.

8. Click OK to proceed.

tipe_hadiah
                               Observed N   Expected N   Residual
voucher                        370          333.3        36.7
boneka (doll)                  230          333.3        -103.3
tiket nonton (movie tickets)   400          333.3        66.7
Total                          1000

Test Statistics
              tipe_hadiah
Chi-Square    49.400(a)
df            2
Asymp. Sig.   .000
a. 0 cells (0.0%) have expected frequencies less than 5. The minimum expected cell frequency is 333.3.


Interpreting result

Hypothesis

H0: The data are consistent with the specified distribution among all types of gifts

Ha: The data are not consistent with the specified distribution among all types of gifts

The hypothesis test is formulated as follows:

Sig > α (0.05), accept H0

Sig < α (0.05), reject H0

Decision:

0.000 < 0.05, reject H0

Conclusion:

The data are not consistent with the specified distribution among all types of gifts, so the three prizes are not equally popular.
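
The Chi-Square value in the Test Statistics table is the sum of (observed - expected)^2 / expected over the categories. The sketch below reproduces it for the gift data under the equal-popularity hypothesis, where each expected frequency is 1000 / 3. The function name is hypothetical, and the code only computes the statistic and degrees of freedom, not the Sig. value.

```python
def chi_square_gof(observed, expected=None):
    """Goodness-of-fit statistic and degrees of freedom.

    If expected is omitted, all categories are assumed equally likely.
    """
    if expected is None:
        expected = [sum(observed) / len(observed)] * len(observed)
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1


# Observed counts: voucher, doll, movie tickets.
gift_counts = [370, 230, 400]
chi2, df = chi_square_gof(gift_counts)
print(f"Chi-Square = {chi2:.3f}, df = {df}")
```

Most of the statistic comes from the doll category, whose observed count of 230 is about 103 below the expected 333.3.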

Example with specified expected frequencies

The following data show the frequency of each definition of success according to female entrepreneurs.

Definition Frequency
Happiness 89
Luckiness 27
Helping Other People 41
Challenges 70

In this survey, male entrepreneurs were given a multiple-choice question on the definition of success. Based on the survey result, 42 chose happiness (kebahagiaan), 95 chose luckiness (keuntungan), 27 chose helping other people (membantu orang lain), and 63 chose challenges (tantangan). Perform a test to determine whether the distribution of answers among the male entrepreneurs is consistent with the distribution among the female entrepreneurs.

Steps:

Definition             Male - Frequency   Female - Frequency
Happiness              42                 89
Luckiness              95                 27
Helping Other People   27                 41
Challenges             63                 70
Total                  227                227

1. Define the variables (kategori_sukses and frekuensi) in Variable View.

2. Use Values for kategori_sukses to enter the definition of success labels.

3. Input the data in Data View. Input only the frequency data of the male entrepreneurs.

4. Before proceeding to the test, run the Weight Cases command (Data >> Weight Cases), so that each row entered in Data View counts as the number of cases stated in its frequency.

5. Select "Weight cases by" and specify the frequency variable, in this case frekuensi. Click OK to continue.

6. After weighting the cases, proceed to the normality test by selecting Analyze >> Descriptive Statistics >> Explore.

Tests of Normality
                   Kolmogorov-Smirnov(a)         Shapiro-Wilk
                   Statistic   df    Sig.        Statistic   df    Sig.
kategori_sukses    0.277       227   .000        0.830       227   .000
a. Lilliefors Significance Correction

7. The normality test rejects the null hypothesis (Sig. = 0.000 < 0.05), so the data are not normally distributed and we proceed with a non-parametric test. Select Analyze >> Nonparametric Tests >> Legacy Dialogs >> Chi-square. Move the variable kategori_sukses to the Test Variable List.

8. Next, enter the expected frequencies, taken from the female entrepreneurs, using the Values option under Expected Values.

9. Click OK to continue.

kategori_sukses
                      Observed N   Expected N   Residual
kebahagiaan           42           89.0         -47.0
keuntungan            95           27.0         68.0
membantu orang lain   27           41.0         -14.0
tantangan             63           70.0         -7.0
Total                 227

Test Statistics
              kategori_sukses
Chi-Square    201.560(a)
df            3
Asymp. Sig.   .000
a. 0 cells (0.0%) have expected frequencies less than 5. The minimum expected cell frequency is 27.0.

10. Steps to find the Chi-Square table (critical) value:

Select Transform >> Compute Variable.

Enter X in Target Variable.

Select Inverse DF from Function group, then Idf.Chisq from Functions and Special Variables.

IDF.CHISQ(?,?) is shown in Numeric Expression. The first ? is the cumulative probability; because the test uses the 0.05 significance level, this is 0.95. The second ? is the degrees of freedom (df), in this case 3. Enter IDF.CHISQ(0.95,3) and click OK to continue.

11. The Chi-Square table value is shown in Data View as X.


Interpreting result:

Hypothesis

H0: The data are consistent with the specified distribution among the male and female entrepreneurs.

Ha: The data are not consistent with the specified distribution among the male and female entrepreneurs.

The hypothesis test is formulated as follows:

Sig > α (0.05), accept H0

Sig < α (0.05), reject H0

Decision:

0.000 < 0.05, reject H0

Conclusion:

The data are not consistent with the specified distribution among the male and female entrepreneurs.
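
When the expected frequencies are supplied rather than assumed equal, the statistic is computed the same way and can be compared with the table value. In the sketch below the male counts are the observed frequencies and the female counts are the expected ones, as in the steps above. The constant 7.815 is the standard chi-square table value for probability 0.95 and 3 degrees of freedom, which is what IDF.CHISQ(0.95,3) returns. The names are hypothetical.

```python
# Standard chi-square table value for alpha = 0.05 and df = 3
# (the value IDF.CHISQ(0.95, 3) produces, rounded to 3 decimals).
CHI2_CRITICAL_05_DF3 = 7.815


def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Order: happiness, luckiness, helping other people, challenges.
male_observed = [42, 95, 27, 63]
female_expected = [89, 27, 41, 70]

chi2_success = chi_square_statistic(male_observed, female_expected)
reject_h0 = chi2_success > CHI2_CRITICAL_05_DF3
print(f"Chi-Square = {chi2_success:.3f}, reject H0: {reject_h0}")
```

The calculated value is far above the table value, which agrees with the decision based on Sig.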


7.2 Test of Independence

The test of independence analyzes the frequencies of 2 variables with multiple categories to determine whether there is any association between them.

Example:

A statistics professor wants to analyze whether there is any association between the final grade of the students and the length of time they study. The following sample data were collected from 215 students.

                            Grade
JamBelajar (study hours)    A    B    C
< 3 jam (hours)             18   48   16
3-5 jam                     30   28   12
> 5 jam                     33   25   5

Steps:

1. Input the variables (JamBelajar, Grade, and Frekuensi) in variable view.

2. Use Value in JamBelajar andGrade, input the labels as stated.

3. Input data in Data View.


4. Before running the test, apply the Weight Cases command (Data >> Weight Cases), because each row in Data View holds the count for one combination of categories rather than a single student. Select "Weight cases by" and move Frekuensi to Frequency Variable. Click OK to continue.
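The effect of Weight Cases can be illustrated with a small Python sketch. This is only an illustration of the idea, not how SPSS implements it; the variable names rows and cases are assumptions. Weighting by Frekuensi is treated here as equivalent to repeating each row as many times as its frequency.

```python
# Each row is (JamBelajar, Grade, Frekuensi), as entered in Data View.
rows = [
    ("< 3 jam", "A", 18), ("< 3 jam", "B", 48), ("< 3 jam", "C", 16),
    ("3-5 jam", "A", 30), ("3-5 jam", "B", 28), ("3-5 jam", "C", 12),
    ("> 5 jam", "A", 33), ("> 5 jam", "B", 25), ("> 5 jam", "C", 5),
]

# Repeat each (study time, grade) combination Frekuensi times.
cases = [(hours, grade) for hours, grade, freq in rows for _ in range(freq)]

print(len(rows))   # number of rows typed into Data View
print(len(cases))  # number of students represented after weighting
```

Without weighting, SPSS would treat the 9 rows as 9 cases instead of the 215 students they summarize.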

5. After weighting the cases, run the normality test by selecting Analyze >> Descriptive Statistics >> Explore.

Tests of Normality

             Kolmogorov-Smirnov(a)        Shapiro-Wilk
             Statistic   df    Sig.       Statistic   df    Sig.
Frekuensi    .176        215   .000       .910        215   .000
a. Lilliefors Significance Correction

6. Run the test of independence. Select Analyze >> Descriptive Statistics >> Crosstabs.

7. In Crosstabs, move JamBelajar to Row(s) and Grade to Column(s).

Click Statistics and tick Chi-square in the Crosstabs: Statistics dialog.

Click Cells and select Observed and Expected.

8. Click Continue, then OK to see the result.

JamBelajar * Grade Crosstabulation

                                        Grade
JamBelajar                      A       B       C       Total
< 3 jam     Count               18      48      16      82
            Expected Count      30.9    38.5    12.6    82
3-5 jam     Count               30      28      12      70
            Expected Count      26.4    32.9    10.7    70
>5 jam      Count               33      25      5       63
            Expected Count      23.7    29.6    9.7     63
Total       Count               81      101     33      215
            Expected Count      81      101     33      215

Chi-Square Tests

                               Value      df   Asymp. Sig. (2-sided)
Pearson Chi-Square             16.596a    4    0.002
Likelihood Ratio               17.456     4    0.002
Linear-by-Linear Association   13.222     1    0.000
N of Valid Cases               215
a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 9.67.
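The expected counts and the Pearson chi-square value in the output can be reproduced by hand. The following is a minimal Python sketch using only the standard library; the variable names are illustrative. Each expected count is (row total x column total) / N, and the statistic is the sum of (observed - expected)^2 / expected over all cells. The p-value line uses a closed form that holds only for df = 4.

```python
import math

# Observed counts: rows are study time, columns are grades A, B, C.
observed = [
    [18, 48, 16],  # < 3 jam
    [30, 28, 12],  # 3-5 jam
    [33, 25, 5],   # > 5 jam
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence for every cell.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# For df = 4 only: P(X > x) = exp(-x/2) * (1 + x/2).
p_value = math.exp(-chi_square / 2) * (1 + chi_square / 2)

print(round(chi_square, 3), df, round(p_value, 3))
```

The printed statistic, degrees of freedom, and p-value should agree with the Pearson Chi-Square row of the SPSS output.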

9. Steps to obtain the chi-square table (critical) value:

Select Transform >> Compute Variable.

Input X in Target Variable.

Select Inverse DF from Function group, then Idf.Chisq from Functions and Special Variables.

IDF.CHISQ(?,?) will be shown in Numeric Expression. The first ? is the cumulative probability, 1 - α; with a significance level of α = 0.05 this is 0.95. The second ? is the degrees of freedom (df), which in this case is 4, that is (rows - 1) x (columns - 1) = (3 - 1) x (3 - 1). Input IDF.CHISQ(0.95,4) and click OK to continue.

10. The chi-square table value is shown in Data View as the variable X.

Interpreting the result:

Hypothesis

H0: There is no association between study time (JamBelajar) and students' grades.

Ha: There is an association between study time (JamBelajar) and students' grades.

The decision rule is:

Sig > α, do not reject H0

Sig < α, reject H0

or, equivalently,

Observed chi-square > critical (table) value, reject H0

Observed chi-square < critical (table) value, do not reject H0

Decision (see Pearson Chi-Square):

0.002 < 0.05, so H0 is rejected

or

16.596 > 9.49, so H0 is rejected

Conclusion:

There is an association between study time and students' grades.

Exercise

1. A promoter of the Korean boy band Super Junior believes that the ages of ticket buyers for Supershow 5 are evenly distributed. The observed sample data of the buyers' age distribution are shown below. Test whether ticket buyers are evenly distributed across the age categories.

Age      Number of buyers
< 10     16
10-20    44
21-30    61
31-40    56
41-50    35
> 50     19

2. A professor claims that the typical grade distribution in his class is 20% A, 25% B, 40% C, 10% D, and 5% E. This semester, his class contains 85 students. a. Estimate the number of students expected to receive each grade based on the professor's claim. b. At the end of the semester, the grades of the 85 students are 22 A, 29 B, 20 C, 10 D, and 4 E. Test whether the observed grades fit the expected grade distribution.

3. A group of people in their 30s is interviewed to determine whether the type of music most often listened to by people in that age category is associated with the geographic location where they live.

                      Type of Music
Geographic location   Rock   RnB   Traditional   Classical
North                 140    32    5             18
South                 134    41    52            8
West                  154    27    8             13
East                  130    30    12            15

4. A psychology student is researching whether social class is related to the number of children in a family. The following data were collected:

                       Social Class
Number of children     Upper   Middle   Lower
0                      7       18       6
1                      9       38       23
2 or 3                 34      97       58
> 3                    47      31       30

Test whether the number of children in a family is related to social class.

SESSION 8
MANN WHITNEY

Mann Whitney is a non-parametric statistical test used to compare the central tendency of two populations that are not related to each other (independent). The test was developed by Henry B. Mann and D. R. Whitney in 1947. The Mann Whitney test can be carried out when the following assumptions are met:

The data do not need to follow a particular distribution (distribution-free); the test is typically used when the data are not normally distributed

The two samples are random and not related to each other (independent)

The data are measured on at least an ordinal scale

The sample is small (< 30)

Example

1. The private course Sukses Selalu has several tutors teaching in each class. Tika, the owner, wants to know which learning technique is more effective, so she conducted a test in two different classes. Class A is taught through video, games, and direct practice, while Class B is taught through textbooks. The following are the test results of 20 students sampled from each class:

Class A Class B

75 90

80 95

77 80


95 60

90 50

98 55

100 58

76 60

50 62

85 88

87 68

79 60

79 57

89 55

90 90

100 60

95 88

85 80

87 59

75 60

The steps:

1. Define the variables class and grade in Variable View. Use Values to label the categories of the variable class.

2. Input the data in Data View.

3. Run the normality test.

Tests of Normality

                  Kolmogorov-Smirnov(a)        Shapiro-Wilk
        Class     Statistic   df   Sig.        Statistic   df   Sig.
Score   Class A   0.154       20   .200*       0.898       20   0.038
        Class B   0.277       20   .000        0.836       20   0.003

*. This is a lower bound of the true significance.

a. Lilliefors Significance Correction

4. Choose Analyze >> Nonparametric Tests >> Legacy Dialogs >> 2 Independent Samples. In the window that appears, move the variable grade to Test Variable List and class to Grouping Variable.

5. Click Define Groups and enter the two codes used for the variable class.

6. Click Continue. Under Test Type, select Mann-Whitney U and then click OK to see the output.

Ranks

        Class     N    Mean Rank   Sum of Ranks
Score   Class A   20   25.73       514.5
        Class B   20   15.28       305.5
        Total     40

Test Statistics(a)

                                 nilai
Mann-Whitney U                   95.5
Wilcoxon W                       305.5
Z                                -2.833
Asymp. Sig. (2-tailed)           0.005
Exact Sig. [2*(1-tailed Sig.)]   .004b

a. Grouping Variable: Class

b. Not corrected for ties.
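The Mann-Whitney U and the rank sums in the output can be reproduced directly from the two classes' scores. The following is a minimal Python sketch, not the SPSS algorithm itself; the function name u_statistic is an assumption. It counts, for every pair of one Class A score and one Class B score, which score is larger (a tie counts as one half), and then recovers each rank sum from the identity R = U + n(n + 1)/2.

```python
class_a = [75, 80, 77, 95, 90, 98, 100, 76, 50, 85,
           87, 79, 79, 89, 90, 100, 95, 85, 87, 75]
class_b = [90, 95, 80, 60, 50, 55, 58, 60, 62, 88,
           68, 60, 57, 55, 90, 60, 88, 80, 59, 60]

def u_statistic(x, y):
    """Count pairs where x beats y; a tie counts as half a win."""
    return sum(1.0 if xi > yj else 0.5 if xi == yj else 0.0
               for xi in x for yj in y)

u_a = u_statistic(class_a, class_b)
u_b = u_statistic(class_b, class_a)
u = min(u_a, u_b)  # the smaller of the two U values is the one reported

# Rank sums follow from U: R = U + n(n + 1) / 2 for each group.
n_a, n_b = len(class_a), len(class_b)
rank_sum_a = u_a + n_a * (n_a + 1) / 2
rank_sum_b = u_b + n_b * (n_b + 1) / 2

print(u, rank_sum_a, rank_sum_b)
```

The three printed values correspond to Mann-Whitney U and the Sum of Ranks for Class A and Class B in the output above.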

Interpretation of Data:

Hypotheses

H0: There is no significant difference in grade between the students based on learning method.

Ha: There is a significant difference in grade between the students based on learning method.

Basis for Decision Making

Sig > α, so do not reject H0

Sig < α, so reject H0

Decision (Asymp. Sig. value)

0.005 < 0.05, so reject H0

Conclusion

There is a significant difference in grade between the students based on learning method. In the Ranks table, the mean rank of Class A (25.73) is higher than that of Class B (15.28), so the students in Class A tend to score higher.

Exercise

1. The following table shows the salaries of several high school teachers in California and Florida. Some people claim that the salaries of high school teachers in California are higher than those of teachers in Florida. Test whether there is a difference between the salaries of teachers in California and Florida.

California ($) Florida ($)

47700 48300

60500 57600

40900 43300

40700 30900

57100 43600

35500 41500

59900 47100

49600 37500

48400 38600

53600 41500

47700 36200

46000 49400

2. A health statistics survey indicates that people between the ages of 65 and 74 contact a psychiatrist an average of 9.8 times per year, while people aged over 75 contact a psychiatrist an average of 12.9 times per year. Steven, a psychiatrist at a certain hospital, wants to check whether those statistics hold. The following data show the number of visits to a psychiatrist in each age group.

65-74   > 75

12 16

13 15

8 10

11 17

9 13

6 12

11 14

10 9

13 13

9 10


SESSION 9
SIGN WILCOXON

The Wilcoxon signed rank test is a nonparametric method used to compare two samples of data that come from the same group (two related samples). It can be used to investigate whether values change from one point in time to another, or when the same individuals are measured under more than one condition (given different treatments).

The Wilcoxon signed rank test can be used when the following assumptions are met:

1. The dependent variable is measured on at least an ordinal scale

2. The data do not need to follow a particular distribution (distribution-free)

3. The samples are related (paired)

4. The sample is small (< 30)

Example:

1. Twelve adult men follow a liquid diet program to lose weight. Their weights before and after the diet were recorded. Test whether there is a difference in weight before and after the liquid diet. The data are as follows:

No.      1    2    3    4    5    6    7    8    9    10   11   12
Before   186  171  177  168  191  172  177  191  170  171  188  187
After    179  168  165  169  182  171  165  190  165  180  181  172

The steps:

1. Define the variables before and after (sebelum and sesudah in the output below) in Variable View.

2. Input the data in Data View.

3. Run the normality test. If the data are not normally distributed, use a nonparametric method.

Tests of Normality

                    Kolmogorov-Smirnov(a)        Shapiro-Wilk
                    Statistic   df   Sig.        Statistic   df   Sig.
sebelum (before)    0.204       12   0.18        0.864       12   0.055
sesudah (after)     0.175       12   .200*       0.902       12   0.167

*. This is a lower bound of the true significance.

a. Lilliefors Significance Correction

4. Click Analyze >> Nonparametric Tests >> 2 Related Samples (under Legacy Dialogs in newer SPSS versions).

5. In Test Pairs, input the variables before and after. Then choose Wilcoxon under Test Type. Ignore the other options and click OK.


6. Output result

Ranks

                                     N     Mean Rank   Sum of Ranks
sesudah - sebelum   Negative Ranks   10a   6.75        67.5
                    Positive Ranks   2b    5.25        10.5
                    Ties             0c
                    Total            12

a. sesudah < sebelum

b. sesudah > sebelum

c. sesudah = sebelum

Test Statistics(a)

                         sesudah - sebelum
Z                        -2.242b
Asymp. Sig. (2-tailed)   0.025

a. Wilcoxon Signed Ranks Test

b. Based on positive ranks.
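The rank sums and the Z value in the output can be recomputed from the before and after weights. The following is a minimal Python sketch using only the standard library; the helper average_rank and the variable names are assumptions, and the normal approximation with a tie correction is used here because it appears to reproduce the reported Z. It ranks the absolute differences (tied values share their average rank) and sums the ranks separately for negative and positive differences.

```python
import math

before = [186, 171, 177, 168, 191, 172, 177, 191, 170, 171, 188, 187]
after = [179, 168, 165, 169, 182, 171, 165, 190, 165, 180, 181, 172]

# Differences in the direction SPSS reports (sesudah - sebelum); zeros are dropped.
diffs = [a - b for a, b in zip(after, before) if a != b]
n = len(diffs)

abs_sorted = sorted(abs(d) for d in diffs)

def average_rank(value):
    """Average rank of a value among the sorted absolute differences."""
    first = abs_sorted.index(value) + 1
    count = abs_sorted.count(value)
    return first + (count - 1) / 2

w_pos = sum(average_rank(abs(d)) for d in diffs if d > 0)
w_neg = sum(average_rank(abs(d)) for d in diffs if d < 0)

# Normal approximation with a tie correction on the variance.
mean_w = n * (n + 1) / 4
tie_term = sum(c ** 3 - c for c in (abs_sorted.count(v) for v in set(abs_sorted)))
var_w = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
z = (min(w_pos, w_neg) - mean_w) / math.sqrt(var_w)

print(w_neg, w_pos, round(z, 3))
```

The printed values correspond to the Sum of Ranks for negative and positive ranks and to Z in the Test Statistics table.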

Interpretation of Data

Hypotheses

H0: There is no weight difference before and after the liquid diet

Ha: There is a weight difference before and after the liquid diet

Basis for Decision Making

Sig > α, so do not reject H0

Sig < α, so reject H0

Decision

0.025 < 0.05, so reject H0

Conclusion:

There is a weight difference before and after the liquid diet.

In the Ranks table, there are more negative ranks (10) than positive ranks (2), which indicates that, on average, the weight of the men who followed the diet program decreased.

Exercise

1. The following table shows the results of a taste test in which 10 people were asked to rate the taste of a soda, on a scale of 1-10, before and after an additional flavor was added. Test whether there is a difference in taste rating between the two versions of the soda at a 90% confidence level.

People   1   2   3   4    5    6   7   8    9   10
Before   4   3   3   9    5    5   9   9    5   3
After    7   7   3   10   10   3   7   10   7   8

2. A swimming club in Jakarta has 10 expert swimmers who regularly participate in swimming races. The 10 swimmers were enrolled in a new training program. The following are the times recorded before and after the swimmers took part in the new training program.

Swimmer   1    2    3    4    5    6    7    8    9    10
Before    31   39   26   45   30   45   30   26   26   27
After     28   29   35   20   20   34   35   35   29   20

SESSION 10
KRUSKAL WALLIS

Kruskal Wallis is a non-parametric, rank-based method. Its purpose is to determine whether there is a statistically significant difference between two or more independent groups on a dependent variable measured on an ordinal or numeric (interval/ratio) scale. Kruskal Wallis is the alternative when the data are not normally distributed, and it extends the Mann Whitney U test, which can only be used on two groups of samples, to more than 2 groups of samples.

Kruskal Wallis Assumptions

The assumptions of this method are:

1. The independent variable consists of more than 2 groups.

2. The dependent variable is measured on a numeric (interval/ratio) or ordinal scale.

3. The samples are independent: the sample in each category must be unrelated to the others, and no observation may belong to 2 or more categories.

4. Each category has similar variability, meaning the histograms of the groups show a similar distribution shape.

5. Each sample should contain at least 5 observations so that chi-square probabilities can be used.

Post Hoc Kruskal Wallis

If H0 is rejected, meaning there is a difference between the groups, the analysis can be followed by further tests, also known as post hoc tests. A post hoc test after Kruskal Wallis can be done with the Mann Whitney U test, which assesses whether a significant difference exists between each pair of groups. For example, with three methods (A, B, and C) the post hoc comparisons are:

1. The difference in scores between the students using Method A and Method B.

2. The difference in scores between the students using Method A and Method C.

3. The difference in scores between the students using Method B and Method C.
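The pairwise structure of the post hoc step can be sketched in Python. The scores below are hypothetical values invented purely for illustration, and the function name u_statistic is an assumption; the sketch only shows how every pair of groups is enumerated and given its own Mann Whitney U.

```python
from itertools import combinations

# Hypothetical scores, invented for illustration only.
scores = {
    "Method A": [70, 75, 80, 85],
    "Method B": [60, 65, 72, 78],
    "Method C": [50, 55, 58, 66],
}

def u_statistic(x, y):
    """Count pairs where x beats y; a tie counts as half a win."""
    return sum(1.0 if xi > yj else 0.5 if xi == yj else 0.0
               for xi in x for yj in y)

# Every unordered pair of groups gets its own two-sample comparison.
pairs = list(combinations(scores, 2))
results = {}
for first, second in pairs:
    u = min(u_statistic(scores[first], scores[second]),
            u_statistic(scores[second], scores[first]))
    results[(first, second)] = u
    print(first, "vs", second, "U =", u)
```

Because several pairs are tested, analysts often compare each p-value against a smaller threshold, for example α divided by the number of pairs (the Bonferroni adjustment). The module itself does not specify such an adjustment.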

Example :

A restaurant's management would like to know the customers' opinions regarding the service, cleanliness, and quality of the food. The management would like to compare the customers' ratings across 3 different shifts:

Shift 1: 16:00 - midnight

Shift 2: midnight - 08:00

Shift 3: 08:00 - 16:00

Customers are given a card on which to write their comments and suggestions. For each shift, 20 cards are drawn at random and rated on a 4-point scale:

4 = perfect, 3 = good, 2 = sufficient, 1 = bad

Shift 1 Shift 2 Shift 3


2 3 3
2 4 4
3 2 3
2 2 4
2 3 4
3 3 3
3 3 4
2 3 2
1 2 4
1 3 4
1 4 4
2 3 3
2 3 4
3 2 3
1 3 4
2 1 3
2 3 4
2 2 3
3 1 4
4 3 2

The steps in using SPSS:

1. Define the variables (Shift and Nilai) in Variable View.

2. Click Values; the Value Labels dialog box will appear. In the Value field, type 1, label it 16:00-midnight, and click Add. Repeat until the 3rd value.

3. In the Data View tab, enter 1 in the Shift column for the shift 1 rows (and 2 and 3 for the other shifts), and copy the ratings from the table into the Nilai column.

4. The Kruskal Wallis test is used when the data are not normally distributed, so first run the normality test. Click Analyze >> Descriptive Statistics >> Explore.

5. The Explore dialog box will appear; click Plots.

6. In the Explore: Plots dialog box, select None under Boxplots, uncheck Stem-and-leaf under Descriptive, and check Normality plots with tests. Click Continue, then click OK in the Explore dialog box, and the result of the normality test will appear.

7. Output of the normality test: because each group has df = 20, we read the Shapiro-Wilk columns. The Sig. values 0.009, 0.004, and 0.000 are all smaller than α (0.05), so the data are not normally distributed and the Kruskal-Wallis test can be used.

Tests of Normality

                             Kolmogorov-Smirnov(a)        Shapiro-Wilk
            Shift            Statistic   df   Sig.        Statistic   df   Sig.
penilaian   16.00-midnight   0.273       20   .000        0.864       20   0.009
            midnight-08.00   0.317       20   .000        0.843       20   0.004
            08.00-16.00      0.339       20   .000        0.739       20   .000

a. Lilliefors Significance Correction

8. Click Analyze >> Nonparametric Tests >> K Independent Samples.

9. The Tests for Several Independent Samples dialog box will appear. Move the variable Nilai to Test Variable List and Shift to Grouping Variable.

10. Click Define Range and input Minimum: 1 and Maximum: 3 (because there are 3 samples).

11. Click Continue and then OK. The results of the Kruskal Wallis test are as follows:

Ranks

            Shift            N    Mean Rank
penilaian   16.00-midnight   20   19.62
            midnight-08.00   20   28.7
            08.00-16.00      20   43.18
            Total            60

Test Statistics(a,b)

              penilaian
Chi-Square    20.389
df            2
Asymp. Sig.   .000

a. Kruskal Wallis Test

b. Grouping Variable: shift

12. After obtaining Asymp. Sig. and Chi-Square, we need the chi-square table (critical) value, which can be computed in SPSS. First, click Transform and choose Compute Variable.

13. In Function group choose Inverse DF, and in Functions and Special Variables choose Idf.Chisq. In Target Variable input chisq, and in Numeric Expression input the formula IDF.CHISQ(prob, df), where the probability is (1 - α) = 1 - 0.05 = 0.95 and df can be read from the Test Statistics output, which is 2, or K - 1 = 3 - 1 = 2 (note: K is the number of samples being tested).

14. The chi-square table value will appear in Data View; it should be 5.99.
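The Kruskal Wallis statistic and the table value from the steps above can be recomputed from the ratings of the three shifts. The following is a minimal Python sketch using only the standard library; the variable names are illustrative. It pools the ratings, assigns average ranks to tied ratings, computes H = 12 / (N(N + 1)) x Σ(R_j^2 / n_j) - 3(N + 1), and divides by the tie correction 1 - Σ(t^3 - t) / (N^3 - N). The critical value and p-value lines use closed forms that hold only for df = 2.

```python
import math
from collections import Counter

shifts = {
    "16:00-midnight": [2, 2, 3, 2, 2, 3, 3, 2, 1, 1, 1, 2, 2, 3, 1, 2, 2, 2, 3, 4],
    "midnight-08:00": [3, 4, 2, 2, 3, 3, 3, 3, 2, 3, 4, 3, 3, 2, 3, 1, 3, 2, 1, 3],
    "08:00-16:00": [3, 4, 3, 4, 4, 3, 4, 2, 4, 4, 4, 3, 4, 3, 4, 3, 4, 3, 4, 2],
}

counts = Counter(v for values in shifts.values() for v in values)
n_total = sum(counts.values())

# Average rank for each distinct rating (ties share the mean of their positions).
rank_of = {}
position = 0
for value in sorted(counts):
    rank_of[value] = position + (counts[value] + 1) / 2
    position += counts[value]

rank_sums = {name: sum(rank_of[v] for v in values) for name, values in shifts.items()}

h = 12 / (n_total * (n_total + 1)) * sum(
    rank_sums[name] ** 2 / len(values) for name, values in shifts.items()
) - 3 * (n_total + 1)

# Correct H for the many tied ratings.
tie_correction = 1 - sum(c ** 3 - c for c in counts.values()) / (n_total ** 3 - n_total)
h_corrected = h / tie_correction

df = len(shifts) - 1
critical = -2 * math.log(0.05)        # closed form, valid only for df = 2
p_value = math.exp(-h_corrected / 2)  # closed form, valid only for df = 2

print(round(h_corrected, 3), round(critical, 2), h_corrected > critical)
```

The tie-corrected H corresponds to the Chi-Square value in the Test Statistics table, and the mean ranks (rank sum divided by 20) correspond to the Ranks table.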

The Interpretation of Data

Hypotheses

H0: There is no significant difference in the ratings of service, cleanliness, and quality of the food among the three shifts.

Ha: There is a significant difference in the ratings of service, cleanliness, and quality of the food among the three shifts.

Basis for Decision Making:

Observed chi-square > critical value (df = k - 1), so H0 is rejected

Observed chi-square < critical value (df = k - 1), so H0 is not rejected

or

Sig > α, so do not reject H0

Sig < α, so reject H0

Decision:

20.389 > 5.99, reject H0

0.000 < 0.05, reject H0

Conclusion

There is a significant difference in the ratings of service, cleanliness, and quality of the food among the three shifts.

Exercise

1. The manager of Café "Palaloalali" wants to evaluate the work performance of three new workers during their first three months. The workers' names are Andi, Budi, and Dani. The manager must determine whether these three workers are good enough to be accepted as permanent employees. One of the requirements is based on customer satisfaction. The following are the customer satisfaction scores from questionnaires distributed randomly to customers:

Andi Budi Dani

3.20 4.00 2.60

3.10 3.50 2.90

3.90 3.70 3.30

3.90 4.00 3.00

3.30 3.50 2.90

3.20 3.80 2.50


4.00 3.70 3.50

3.40 3.80 3.40

3.80 3.50 2.80

3.20 3.80 3.00

3.10 4.00 2.90

3.00 3.50 3.00

3.20 4.00 3.30

3.30 4.00 2.90

3.20 3.70 3.00

4.00 3.70 3.00

3.20 3.90 2.50

3.40 3.80 3.40

4.00 4.00 2.90

3.40 4.00 3.50

3.20 3.80 3.50

3.80 4.00 2.70

3.90 4.00 2.50

3.80 3.70 3.00

3.30 4.00 3.30

3.10 3.90 3.50

4.00 3.70 3.50

3.10 3.80 2.80

3.80 4.00 3.50

4.00 3.70 3.40

Help the manager analyze the data so that the manager can make the best decision.

2. Kartika has one main beauty salon at Pluit Mall and three branches at Taman Anggrek Mall, Kelapa Gading, and Ciputra Mall. Kartika would like to know whether customer satisfaction is similar across her four beauty salons. The ratings obtained are shown below (scale of 1-100):

Pluit Village   Taman Anggrek   Ciputra   Kelapa Gading
74              88              45        61
86              66              94        74
87              70              78        67
95              94              55        74
90              61              72        78
92              89              40        60
70              65              40        40
70              60              95        40
70              60              94        40
90              95              40        74

Do a test with a confidence level of 90%!

Bibliography

Anderson, D. R., Sweeney, D. J., & Williams, T. A. (2011). Statistics For


Business and Economics 11e. South Western: Cengage Learning.

145
Module Business Statistics II- Management Laboratory 2015

Black, K. (2013). Applied Business Statistics Making Better Bussiness


Decisions 7th Edition. Singapore: John Wiley & Sons.

Ghozali, I. (2013). Aplikasi Analisis Multivariate dengan Program IBM SPSS


21 Updated PLS Regresi. Semarang: Badan Penerbit Universitas
Diponegoro.

Godam64. (2006, Juni 21). klasifikasi-jenis-dan-macam-data-pembagian-


data-dalam-ilmu-eksak-sains-statistik-statistika. Retrieved Juli 7, 2015,
from www.organisasi.org: http://www.organisasi.org/1970/01/klasifikasi-
jenis-dan-macam-data-pembagian-data-dalam-ilmu-eksak-sains-statistik-
statistika.html

Hidayat, A. (2014, July 17). Kruskall Wallis H. Retrieved Maret 20, 2015,
from www.statistikian.com: http://www.statistikian.com/2014/07/kruskall-
wallis-h.html

Kelley, W. M., & Donnelly, R. A. (2009). The Humongous Book of Statistics


Problems. USA: Penguin Group.

Ltd, L. R. (n.d.). Chi Square Goodness of Fit Test in SPSS Statistics.


Retrieved Juli 28, 2015, from www.statistics.laerd.com:
https://statistics.laerd.com/spss-tutorials/chi-square-goodness-of-fit-test-
in-spss-statistics.php

Putra, Z. (2012, Februari 23). Konsep Data Klasifikasi Jenis dan Macam
Pembagian Data dalam Statistik. Retrieved July 28, 2015, from
http://www.scribd.com: http://www.scribd.com/doc/101086535/Konsep-
Data-Klasifikasi-Jenis-Dan-Macam-Pembagian-Data-Dalam-
Statistika#scribd

Santosa, P. B., & Ashari. (2005). Analisis Statistik dengan Microsoft Excel &
SPSS. Yogyakarta: Andi.

Santoso, S. (2011). Mastering SPSS Versi 19. Jakarta: PT Elex Media


Komputindo.

Sekaran, U., & Bougie, R. (2013). Research Methods for Business. West
Sussex: John Wiley & Sons Ltd.

Suseno, B. (2013, April 18). Analisis chi square. Retrieved July 28, 2015,
from Statisti Olah Data: http://www.statistikolahdata.com/2013/04/analisis-
chi-square.html

https://statistics.laerd.com/spss-tutorials/wilcoxon-signed-rank-test-using-
spss-statistics.php
