
Statistics 1

STATISTICS FOR BUSINESS DECISION (HI 6007)


GROUP ASSIGNMENT

GROUP DETAILS

MAULIK THAKAR EGU8594

KIRANDEEP KAUR EMV8687

SHOUGANG CHEN DC5003

Statistics

Table of Contents
Task 1
Task 2
References

Task 1:

1. Descriptive statistics:

Descriptive statistics are a key tool for summarising the important variables in a data set (Holcomb, 2016). They can be calculated with software such as MS Excel or SPSS, which gives accurate results with little effort. Descriptive statistics can also be used to examine dependent and independent variables and, on that basis, to describe the relationship between the variables in the data set. The measures calculated include the mean, median, mode, range, standard deviation and variance. The results obtained from the software are as follows,

For X1= Pizza:

X1= startup costs for pizza

Mean 83

Median 80

Mode 74

Standard Deviation 34.13

Sample Variance 1165.16

Range 105

Count 13

For X2= Baker/Donuts:

X2= startup costs for baker/donuts

Mean 92.09090909

Standard Error 11.72677941

Median 87

Mode 76.82

Standard Deviation 38.89332731

Sample Variance 1512.690909

Kurtosis -0.436922711

Skewness 0.509844144

Range 120

Minimum 40

Maximum 160

Sum 1013

Count 11

For X3= Shoe stores:



X3= startup costs for shoe stores

Mean 72.3

Standard Error 9.918613254

Median 70

Mode 65.4

Standard Deviation 31.36540911

Sample Variance 983.7888889

Kurtosis -0.958969069

Skewness 0.546077569

Range 90

Minimum 35

Maximum 125

Sum 723

Count 10

For X4= Gift shops:

X4= startup costs for gift shops

Mean 87

Standard Error 11.3539029

Median 97.5

Mode 118.5

Standard Deviation 35.9041935

Sample Variance 1289.111111

Kurtosis -0.485709919

Skewness 0.077293703

Range 115

Minimum 35

Maximum 150

Sum 870

Count 10

For X5= Pet stores:

X5= startup costs for pet stores

Mean 51.625

Standard Error 6.76872403

Median 49

Mode 43.75

Standard Deviation 27.07489612

Sample Variance 733.05

Kurtosis -0.47673397

Skewness 0.633105979

Range 90

Minimum 20

Maximum 110

Sum 826

Count 16
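
The summary tables can be checked for internal consistency from the reported figures alone. The following is a minimal sketch, assuming the tables follow the usual spreadsheet definitions (mean = sum / count, sample variance = standard deviation squared, standard error = standard deviation / sqrt(count), range = maximum - minimum). The dictionary layout and the function name check_summary are illustrative, and only the X2 to X5 values printed above are used.

```python
import math

# Reported summary values, copied from the X2 to X5 tables above.
reported = {
    "X2": {"mean": 92.09090909, "se": 11.72677941, "sd": 38.89332731,
           "var": 1512.690909, "range": 120, "min": 40, "max": 160,
           "sum": 1013, "count": 11},
    "X3": {"mean": 72.3, "se": 9.918613254, "sd": 31.36540911,
           "var": 983.7888889, "range": 90, "min": 35, "max": 125,
           "sum": 723, "count": 10},
    "X4": {"mean": 87, "se": 11.3539029, "sd": 35.9041935,
           "var": 1289.111111, "range": 115, "min": 35, "max": 150,
           "sum": 870, "count": 10},
    "X5": {"mean": 51.625, "se": 6.76872403, "sd": 27.07489612,
           "var": 733.05, "range": 90, "min": 20, "max": 110,
           "sum": 826, "count": 16},
}


def check_summary(s, tol=1e-4):
    """Return True if the reported figures agree with the standard definitions."""
    return all([
        math.isclose(s["sum"] / s["count"], s["mean"], rel_tol=tol),
        math.isclose(s["sd"] ** 2, s["var"], rel_tol=tol),
        math.isclose(s["sd"] / math.sqrt(s["count"]), s["se"], rel_tol=tol),
        s["max"] - s["min"] == s["range"],
    ])


for name, summary in reported.items():
    print(name, "consistent" if check_summary(summary) else "inconsistent")
```

A check of this kind only confirms that the figures agree with each other; it does not confirm that they were computed from the right column of data.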

2. Frequency distribution:

a. Frequency as well as Relative Frequency Distributions:

For X1= Pizza:

X1 Interval Frequency Relative frequency

0-30 0 0

30-60 4 0.307692308

60-90 4 0.307692308

90-120 3 0.230769231

120-150 2 0.153846154

Total 13

For X2= Baker/Donuts:

X2 Interval Frequency Relative frequency

0-30 0 0

30-60 3 0.272727273

60-90 4 0.363636364

90-120 2 0.181818182

120-150 1 0.090909091

150-180 1 0.090909091

Total 11

For X3= Shoe stores:

X3 Interval Frequency Relative frequency

0-30 0 0

30-60 4 0.4

60-90 3 0.3

90-120 2 0.2

120-150 1 0.1

Total 10

For X4= Gift shops:

X4 Interval Frequency Relative frequency

0-30 0 0

30-60 3 0.3

60-90 1 0.1

90-120 5 0.5

120-150 1 0.1

Total 10

For X5= Pet stores:

X5 Interval Frequency Relative frequency

0-30 6 0.375

30-60 5 0.3125

60-90 4 0.25

90-120 1 0.0625

Total 16
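
The relative frequency of each class is its frequency divided by the total count. A short sketch of that calculation, using the X1 (pizza) frequencies from the table above; the function and variable names are illustrative.

```python
def relative_frequencies(counts):
    """Divide each class frequency by the total number of observations."""
    total = sum(counts)
    return [count / total for count in counts]


intervals = ["0-30", "30-60", "60-90", "90-120", "120-150"]
x1_counts = [0, 4, 4, 3, 2]  # X1 (pizza) frequencies from the table above

x1_rel = relative_frequencies(x1_counts)
for label, count, rel in zip(intervals, x1_counts, x1_rel):
    print(f"{label:>8} {count:>3} {rel:.9f}")
```

The relative frequencies of any one table should sum to 1, which is a quick way to catch a miscounted class.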

b. Relative Frequency Histogram:

For X1= Pizza:

[Figure: relative frequency histogram for X1 (pizza). Horizontal axis: intervals 0-30, 30-60, 60-90, 90-120, 120-150. Vertical axis: relative frequency, 0 to 0.4.]

For X2= Baker/Donuts:



[Figure: relative frequency histogram for X2 (baker/donuts). Horizontal axis: intervals 0-30, 30-60, 60-90, 90-120, 120-150, 150-180. Vertical axis: relative frequency, 0 to 0.4.]

For X3= Shoe stores:

[Figure: relative frequency histogram for X3 (shoe stores). Horizontal axis: intervals 0-30, 30-60, 60-90, 90-120, 120-150. Vertical axis: relative frequency, 0 to 0.6.]

For X4= Gift shops:

[Figure: relative frequency histogram for X4 (gift shops). Horizontal axis: intervals 0-30, 30-60, 60-90, 90-120, 120-150. Vertical axis: relative frequency, 0 to 0.6.]

For X5= Pet stores:



[Figure: relative frequency histogram for X5 (pet stores). Horizontal axis: intervals 0-30, 30-60, 60-90, 90-120. Vertical axis: relative frequency, 0 to 0.4.]
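
A relative frequency histogram can be redrawn directly from the frequency tables in part a. The sketch below prints a plain-text version for X5 (pet stores); the function name text_histogram and the bar width of 32 characters are illustrative choices, not part of the original analysis.

```python
def text_histogram(intervals, rel_freqs, width=32):
    """Return one text bar per class, with length proportional to relative frequency."""
    lines = []
    for label, rel in zip(intervals, rel_freqs):
        bar = "#" * round(rel * width)
        lines.append(f"{label:>8} | {bar} {rel:.4f}")
    return lines


x5_intervals = ["0-30", "30-60", "60-90", "90-120"]
x5_rel = [0.375, 0.3125, 0.25, 0.0625]  # X5 (pet stores) relative frequencies

x5_lines = text_histogram(x5_intervals, x5_rel)
print("\n".join(x5_lines))
```

The same function can be applied to the other four tables by passing their intervals and relative frequencies.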

3. Results:

The analysis above gives the following results for the businesses in the data set. For the pizza business, the mean is 83, the mode is 74, the median is 80, the variance is 1165.16, the standard deviation is 34.13 and the range is 105. The values for the other businesses were calculated in the same way using software such as SPSS and Excel. For the baker/donuts business, the mean is 92.09, the mode is 76.82, the median is 87, the variance is 1512.69, the standard deviation is 38.89 and the range is 120. For shoe stores, the mean is 72.3, the mode is 65.4, the median is 70, the variance is 983.79, the standard deviation is 31.37 and the range is 90. For gift shops, the mean is 87, the mode is 118.5, the median is 97.5, the variance is 1289.11, the standard deviation is 35.90 and the range is 115. For pet stores, the mean is 51.625, the mode is 43.75, the median is 49, the variance is 733.05, the standard deviation is 27.07 and the range is 90.

Further, the relative frequency histograms show that for the pizza business the intervals 30-60 and 60-90 share the highest frequency of 4 and the highest relative frequency of 0.308. For the baker/donuts business, the interval 60-90 has the highest frequency, 4, and the highest relative frequency, 0.364. For shoe stores, the interval 30-60 has the highest frequency, 4, and the highest relative frequency, 0.40. For gift shops, the interval 90-120 has the highest frequency, 5, and the highest relative frequency, 0.50. For pet stores, the interval 0-30 has the highest frequency, 6, and the highest relative frequency, 0.375.
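
The interval with the highest frequency (the modal class) can be read off the frequency tables mechanically. A small sketch using the frequencies from part 2a; the function name modal_classes and the dictionary layout are illustrative.

```python
def modal_classes(table):
    """Return every interval that shares the highest frequency."""
    top = max(table.values())
    return [interval for interval, freq in table.items() if freq == top]


frequency_tables = {
    "X1 pizza": {"0-30": 0, "30-60": 4, "60-90": 4, "90-120": 3, "120-150": 2},
    "X2 baker/donuts": {"0-30": 0, "30-60": 3, "60-90": 4, "90-120": 2,
                        "120-150": 1, "150-180": 1},
    "X3 shoe stores": {"0-30": 0, "30-60": 4, "60-90": 3, "90-120": 2, "120-150": 1},
    "X4 gift shops": {"0-30": 0, "30-60": 3, "60-90": 1, "90-120": 5, "120-150": 1},
    "X5 pet stores": {"0-30": 6, "30-60": 5, "60-90": 4, "90-120": 1},
}

for business, table in frequency_tables.items():
    print(business, modal_classes(table))
```

Returning a list rather than a single interval handles ties, as in the pizza table where two intervals share the highest frequency.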

4. Significance test:

To test the significance of the start-up costs of the different businesses, a runs test was conducted in SPSS in order to analyse the differences using the mean, median and mode (Wetcher-Hendricks, 2011).

Runs Test

                          X1       X2       X3       X4       X5
Test Value(a)             80.00    87.00    70.00    97.50    49.00
Cases < Test Value        6        5        5        5        8
Cases >= Test Value       7        6        5        5        8
Total Cases               13       11       10       10       16
Number of Runs            2        2        2        2        2
Z                         -2.893   -2.537   -2.348   -2.348   -3.364
Asymp. Sig. (2-tailed)    .004     .011     .019     .019     .001

a. Median

Runs Test 2

                          X1        X2        X3        X4        X5
Test Value(a)             83.0000   92.0909   72.3000   87.0000   51.6250
Cases < Test Value        7         7         5         4         9
Cases >= Test Value       6         4         5         6         7
Total Cases               13        11        10        10        16
Number of Runs            2         2         2         2         2
Z                         -2.893    -2.488    -2.348    -2.318    -3.356
Asymp. Sig. (2-tailed)    .004      .013      .019      .020      .001

a. Mean
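
The Z values in the runs test tables can be reproduced from the case counts and the number of runs. The sketch below uses the normal approximation to the Wald-Wolfowitz runs test. It assumes a 0.5 continuity correction, which is the convention under which the reported Z values are recovered; the function name runs_test_z is illustrative.

```python
import math


def runs_test_z(n_below, n_above, runs):
    """Normal approximation to the runs test, with a 0.5 continuity correction.

    Returns (z, two_tailed_p).
    """
    n = n_below + n_above
    product = 2 * n_below * n_above
    expected_runs = product / n + 1
    variance = product * (product - n) / (n ** 2 * (n - 1))
    diff = runs - expected_runs
    # Continuity correction (assumed convention): shrink the difference by 0.5.
    if diff <= -0.5:
        diff += 0.5
    elif diff >= 0.5:
        diff -= 0.5
    else:
        diff = 0.0
    z = diff / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed normal probability
    return z, p


# Median-based test for X1: 6 cases below, 7 at or above, 2 runs.
z_x1, p_x1 = runs_test_z(6, 7, 2)
print(f"X1: Z = {z_x1:.3f}, Sig. = {p_x1:.3f}")

# Median-based test for X5: 8 cases below, 8 at or above, 2 runs.
z_x5, p_x5 = runs_test_z(8, 8, 2)
print(f"X5: Z = {z_x5:.3f}, Sig. = {p_x5:.3f}")
```

A runs test assesses whether the order of values above and below the test value is random, so a small significance value points to non-random ordering within a variable rather than to a difference between variables.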

Using the mean start-up cost of each business, the analysis shows that start-up costs for most businesses in the data set fall mainly in the interval of 30-60 million dollars, which indicates that there is no significant difference in start-up costs between the types of business.

Task 2:

1. Regression equation:

Correlations

                          X1      X2      X3      X4      X5      X6
Pearson Correlation  X1   1.000   .894    .946    .914    .954    -.912
                     X2   .894    1.000   .844    .749    .838    -.766
                     X3   .946    .844    1.000   .906    .864    -.807
                     X4   .914    .749    .906    1.000   .795    -.841
                     X5   .954    .838    .864    .795    1.000   -.870
                     X6   -.912   -.766   -.807   -.841   -.870   1.000
Sig. (1-tailed)      X1   .       .000    .000    .000    .000    .000
                     X2   .000    .       .000    .000    .000    .000
                     X3   .000    .000    .       .000    .000    .000
                     X4   .000    .000    .000    .       .000    .000
                     X5   .000    .000    .000    .000    .       .000
                     X6   .000    .000    .000    .000    .000    .
N                    X1   27      27      27      27      27      27
                     X2   27      27      27      27      27      27
                     X3   27      27      27      27      27      27
                     X4   27      27      27      27      27      27
                     X5   27      27      27      27      27      27
                     X6   27      27      27      27      27      27

Variables Entered/Removed(b)

Model   Variables Entered        Variables Removed   Method
1       X6, X2, X4, X5, X3(a)    .                   Enter

a. All requested variables entered.

b. Dependent Variable: X1

The regression equation is based on the data given for the franchises of All Greens Pvt Ltd, using variables such as annual sales, floor area, advertising expenditure and the number of families in the different areas. From the regression assessment above, X1 is the dependent variable and the other variables are the independent variables for each franchise store. There is a positive relationship between X1 and each of X2 to X5, shown by the high correlation coefficients (.894 to .954). There is also a significant relationship between the number of families and sales, taking into account the competitors in the respective location (Draper and Smith, 2014). In addition, higher spending on inventory, floor area and advertising is associated with higher sales. Where there is little or no competition, the performance of a franchise store is very high, but where there are several competitors in the respective market, sales decline. Strong performance also brings higher costs, because a franchise operating in a highly competitive market needs a better advertising strategy to attract customers. Apart from this, X6 has a negative relationship with every other variable. In particular, there is a strong negative relationship between X1 and X6 (r = -.912), which gives a quantitative measure of this effect across the franchise stores.
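
The significance values of .000 in the correlation table follow from the size of the coefficients relative to the sample size. A minimal sketch of the usual t test for a Pearson correlation, t = r * sqrt(n - 2) / sqrt(1 - r^2), using the reported correlations with X1 and n = 27. The critical value 1.708 (one-tailed, alpha = 0.05, 25 degrees of freedom) is taken from a standard t table and is an assumption of this sketch, as are the names used.

```python
import math


def correlation_t(r, n):
    """t statistic for testing whether a Pearson correlation differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


N_CASES = 27
T_CRIT_ONE_TAILED = 1.708  # assumed table value: alpha = 0.05, df = 25

# Correlations with X1, copied from the correlation table.
r_with_x1 = {"X2": 0.894, "X3": 0.946, "X4": 0.914, "X5": 0.954, "X6": -0.912}

t_values = {name: correlation_t(r, N_CASES) for name, r in r_with_x1.items()}
for name, t in t_values.items():
    verdict = "significant" if abs(t) > T_CRIT_ONE_TAILED else "not significant"
    print(f"{name}: r = {r_with_x1[name]:+.3f}, t = {t:.3f}, {verdict}")
```

Every statistic is far beyond the critical value in absolute terms, which is consistent with the one-tailed significance values printed as .000.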

2. Applicability of model:

A regression model is used to determine the relationship between two or more variables, so that predictions can be made for the data concerned. The model fits the given franchise data well and supports its statistical interpretation. It also expresses the relationship between the variables in quantitative terms. In the current competitive environment, managers of large organisations use regression models to optimise business outcomes, and regression is considered an indispensable tool for the statistical evaluation of data in varying business situations. The results above report R square together with adjusted R square for the model, and they are useful for establishing a linear relationship between the variables. R square is higher than adjusted R square. In addition, the model can be adapted to use a single predictor or multiple predictors.

3. Hypothesis test:

Hypothesis testing is an analysis in which a prediction is assessed using sample values such as means; a stated guess is tested against the data. The null hypothesis is that there is no significant relationship between the dependent variable and any of the independent variables, which is the starting point for the test (Science, 2017). The alternative hypothesis is that such a relationship exists, and it frames the criterion for accepting or rejecting the null hypothesis. To test this hypothesis, one variable is treated as the dependent variable and the others as independent variables.

               Coefficients   Standard Error   t Stat        P-value    Lower 95%    Upper 95%     Lower 95.0%   Upper 95.0%
Intercept      -18.8594142    30.15022791      -0.62551481   0.538372   -81.560245   43.841417     -81.560245    43.841417
X Variable 1   16.2015736     3.544437306      4.570986073   0.000166   8.8305127    23.5726344    8.8305127     23.5726344
X Variable 2   0.17463515     0.057606068      3.031540961   0.006347   0.0548368    0.29443353    0.05483678    0.29443353
X Variable 3   11.526269      2.5321033        4.55205324    0.000174   6.260472     16.7920661    6.26047197    16.7920661
X Variable 4   13.5803129     1.770456609      7.670514392   1.61E-07   9.8984468    17.262179     9.89844684    17.262179
X Variable 5   -5.31097141    1.70542654       -3.11416017   0.005249   -8.857628    -1.76434278   -8.8576       -1.76434

In order to proceed with the testing (Science, 2017):

Here, the mean value chosen for the X5 variable is 51.625. The standard error taken from the table is 1.7054, and the P value is .005249.

On scrutinising the information above, it can be said that there is a negative relationship between the dependent and the independent variable. The lower and upper confidence limits are both negative, so the interval excludes zero. Together with the P value of .005249, which is below .05, this means that the null hypothesis is rejected for this variable.
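
The t statistics and 95% confidence limits in the coefficients table can be recomputed from each coefficient and its standard error. The sketch below assumes a two-tailed critical value of 2.0796 for 21 residual degrees of freedom (a standard t table value), and uses the coefficients and standard errors as read from the table; the names slope_test and T_CRIT are illustrative.

```python
T_CRIT = 2.0796  # assumed table value: two-tailed alpha = 0.05, df = 21

# (coefficient, standard error) pairs as read from the coefficients table.
coefficients = {
    "Intercept": (-18.8594142, 30.15022791),
    "X Variable 1": (16.2015736, 3.544437306),
    "X Variable 2": (0.17463515, 0.057606068),
    "X Variable 3": (11.526269, 2.5321033),
    "X Variable 4": (13.5803129, 1.770456609),
    "X Variable 5": (-5.31097141, 1.70542654),
}


def slope_test(coef, std_err, t_crit=T_CRIT):
    """Return (t, lower, upper, reject_null) for one coefficient."""
    t = coef / std_err
    lower = coef - t_crit * std_err
    upper = coef + t_crit * std_err
    # The null hypothesis (coefficient = 0) is rejected when the
    # confidence interval does not contain zero.
    reject_null = lower > 0 or upper < 0
    return t, lower, upper, reject_null


results = {name: slope_test(c, se) for name, (c, se) in coefficients.items()}
for name, (t, lower, upper, reject) in results.items():
    decision = "reject H0" if reject else "do not reject H0"
    print(f"{name:<13} t = {t:8.4f}  95% CI = ({lower:9.4f}, {upper:9.4f})  {decision}")
```

An interval that lies entirely on one side of zero corresponds to a P value below .05, so the two criteria always lead to the same decision at the 5% level.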

4. Slope coefficients:

ANOVA(b)

Model          Sum of Squares   df   Mean Square   F         Sig.
1 Regression   952538.942       5    190507.788    611.590   .000(a)
  Residual     6541.410         21   311.496
  Total        959080.352       26

a. Predictors: (Constant), X6, X2, X4, X5, X3

b. Dependent Variable: X1
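
The F statistic, R square and adjusted R square all follow from the sums of squares and degrees of freedom in the ANOVA table. A minimal sketch of those calculations with the reported values; the function name anova_summary and the returned dictionary keys are illustrative.

```python
def anova_summary(ss_regression, df_regression, ss_residual, df_residual):
    """Derive F, R square and adjusted R square from ANOVA sums of squares."""
    ss_total = ss_regression + ss_residual
    df_total = df_regression + df_residual
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    return {
        "F": ms_regression / ms_residual,
        "r_square": ss_regression / ss_total,
        "adj_r_square": 1 - ms_residual / (ss_total / df_total),
    }


# Values from the ANOVA table for the model with predictors X2 to X6.
full_model = anova_summary(952538.942, 5, 6541.410, 21)
for key, value in full_model.items():
    print(f"{key}: {value:.4f}")
```

Adjusted R square penalises each additional predictor, so it is never larger than R square for the same model.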

A slope coefficient is the coefficient of an independent variable; it gives the change in the y value for a one-unit change in that x variable, with the other variables held constant. In the regression equation, the sign of each slope coefficient agrees with the sign of the corresponding correlation coefficient.

5. Confidence interval:

A confidence interval is a range, estimated from a single sample, that is expected to contain the population value with a stated probability. Here the confidence interval is calculated for the chosen variable.

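Calculation of confidence interval: $\bar{X} \pm Z \cdot s/\sqrt{n}$ (O'Gorman, 2012)

For this, the mean chosen for X1 is 286.574, the standard deviation is 16.26 and the number of samples is 27, with Z = 1.60.

$286.574 \pm 1.60 \times 16.26/\sqrt{27} = 286.574 \pm 1.60 \times 3.13 = 286.574 \pm 5.01$, which gives an interval from 281.57 to 291.58.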
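
The confidence interval formula can be evaluated step by step with the stated inputs (mean 286.574, standard deviation 16.26, 27 samples, Z = 1.60). In this sketch the margin Z * s / sqrt(n) is computed first and then added to and subtracted from the mean; the function name z_interval is illustrative.

```python
import math


def z_interval(mean, std_dev, n, z):
    """Return (lower, upper) for mean +/- z * std_dev / sqrt(n)."""
    margin = z * std_dev / math.sqrt(n)
    return mean - margin, mean + margin


lower, upper = z_interval(mean=286.574, std_dev=16.26, n=27, z=1.60)
print(f"standard error = {16.26 / math.sqrt(27):.3f}")
print(f"interval = {lower:.2f} to {upper:.2f}")
```

Note that the multiplication and division are carried out before the addition, so the result is an interval centred on the mean.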

6. Significance test for slope coefficients:

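The significance of each slope coefficient is assessed with its t statistic and P value for the given sample of 27 cases. In the coefficients table, all five slope coefficients have P values below .05 (from 1.61E-07 to 0.006347), whereas the intercept does not (P value 0.538372).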

7. Model re-estimation:

Correlations

                          X1      X3      X6
Pearson Correlation  X1   1.000   .946    -.912
                     X3   .946    1.000   -.807
                     X6   -.912   -.807   1.000
Sig. (1-tailed)      X1   .       .000    .000
                     X3   .000    .       .000
                     X6   .000    .000    .
N                    X1   27      27      27
                     X3   27      27      27
                     X6   27      27      27

ANOVA(b)

Model          Sum of Squares   df   Mean Square   F         Sig.
1 Regression   918438.538       2    459219.269    271.180   .000(a)
  Residual     40641.814        24   1693.409
  Total        959080.352       26

a. Predictors: (Constant), X6, X3

b. Dependent Variable: X1

Coefficient Correlations(a)

Model                X6      X3
1 Correlations  X6   1.000   .807
                X3   .807    1.000
  Covariances   X6   7.805   .161
                X3   .161    .005

a. Dependent Variable: X1
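
One common way to compare the re-estimated model (predictors X3 and X6) with the original model (predictors X2 to X6) is a partial F statistic built from the two residual sums of squares. This comparison is not part of the original analysis; the sketch below is offered only as an illustration, using the residual rows of the two ANOVA tables and the shared total sum of squares. The function names are illustrative.

```python
def partial_f(ss_res_reduced, df_res_reduced, ss_res_full, df_res_full):
    """Partial F statistic for dropping predictors from a nested regression model."""
    extra_ss = ss_res_reduced - ss_res_full
    extra_df = df_res_reduced - df_res_full
    return (extra_ss / extra_df) / (ss_res_full / df_res_full)


def r_square(ss_residual, ss_total):
    """Share of total variation explained by the model."""
    return 1 - ss_residual / ss_total


SS_TOTAL = 959080.352

f_drop = partial_f(ss_res_reduced=40641.814, df_res_reduced=24,
                   ss_res_full=6541.410, df_res_full=21)
r2_full = r_square(6541.410, SS_TOTAL)
r2_reduced = r_square(40641.814, SS_TOTAL)

print(f"partial F for dropping three predictors = {f_drop:.2f}")
print(f"R square: full = {r2_full:.4f}, reduced = {r2_reduced:.4f}")
```

A large partial F suggests that the dropped predictors carried explanatory power, which is worth weighing against the simplicity of the smaller model.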

8. Annual sales:

Using the re-estimated regression model, sales can be predicted for a franchise with a floor area of 1000 sq. ft., inventory of $150000, expenses of $5000 and 2 competitors in the respective market.



Based on the information in the sheet, the predicted sales are $436000 for these values of the X variables and the constant.

References:

Holcomb, Z. (2016). Fundamentals of Descriptive Statistics. UK: Routledge.

Wetcher-Hendricks, D. (2011). Analyzing Quantitative Data: An Introduction for Social Researchers. USA: John Wiley and Sons.

Henkel, R. (2017). The Significance Test Controversy: A Reader. UK: Routledge.

O'Gorman, T. (2012). Adaptive Tests of Significance Using Permutations of Residuals with R and SAS. USA: John Wiley and Sons.

Draper, N. and Smith, H. (2014). Applied Regression Analysis. USA: John Wiley and Sons.

Science, (2017). Hypothesis Testing. Retrieved from: https://onlinecourses.science.psu.edu/stat200/node/54
