
QF2145 Statistics Ⅱ Final Project.

Comparison between 3C Popularity Rate and Some Indices of Countries

105061224 張傳佳

A. Introduction:
In the 21st century, 3C products (computers, communications, and consumer electronics) have become everyday necessities in modern society. In some Third World countries, however, the popularity of 3C products is still relatively low, although it has risen steadily in recent years. The differences may be due to the economy, education, age distribution, and other factors. This report focuses on this phenomenon and analyzes the relationship between the 3C popularity rate and several indices of countries at different levels of economic development.

B. Approach:
To analyze the 3C popularity rate, I took the 2015 statistics published by Pew Research Center [1]. The data are the rates of “smart phone ownership and Internet usage” and comprise four kinds of rates: the total rate and the rates broken down by age, education, and income. For these four categories, I collected five country indices, also for 2015. The human development index (HDI) [2] corresponds to “total”, the age distribution [3] to “age”, the education index [4] to “education”, and GDP per capita [5] and GNP per capita [6] to “income”.

First, all collected data go through a Goodness of Fit Test and a Variance Test to establish their basic characteristics. Then, Regression Analysis is used to check whether age, education, and the economy really influence the popularity rate of 3C products. Finally, the conclusion is drawn from the analysis results.

C. Raw Data:
The following is the raw data I collected from three authoritative institutions (Pew Research Center, the United Nations, and the World Bank). It consists of the popularity rate of 3C products and five country indices for thirty countries on five continents.
(a) Smart Phone Ownership and Internet Usage:

Smart Phone Ownership and Internet Usage (all values in %)

Columns: Country | TOTAL | Age 18-34 | Age 35+ | Less education | More education | Lower income | Higher income
Ethiopia 8 12 4 5 43 5 23
Pakistan 15 20 10 6 33 8 20
Burkina Faso 18 22 12 11 72 15 37
India 22 34 12 9 38 11 28
Ghana 25 32 18 14 62 13 30
Indonesia 30 52 12 13 55 17 41
Senegal 31 40 20 21 82 18 42
Nigeria 39 52 21 9 53 27 52
Philippines 40 58 23 15 57 26 52
Kenya 40 53 22 19 70 26 52
South Africa 42 52 33 24 61 22 57
Vietnam 50 81 25 32 79 42 70
Peru 52 76 37 16 74 23 63
Mexico 54 76 38 35 87 44 66
Ukraine 60 93 44 20 62 44 73
Brazil 60 82 44 39 86 42 76
China 65 93 49 48 91 56 80
Lebanon 66 89 50 34 90 41 92
Jordan 67 75 57 41 96 50 80
Malaysia 68 91 50 29 82 46 79
Poland 69 98 56 28 78 56 81
Japan 69 97 64 56 88 51 86
Argentina 71 92 58 61 94 47 76
Italy 72 100 65 68 95 56 87
France 75 98 66 65 95 61 87
Chile 78 96 65 26 87 62 90
Germany 85 99 80 74 92 73 95
Israel 86 96 80 80 93 78 94
Spain 87 100 82 81 97 80 95
UK 88 98 85 82 98 82 98
USA 89 99 85 80 95 84 97
Canada 90 100 87 81 95 85 99
Australia 93 100 90 87 98 84 99

(b) Country Indices:

Country Indices

Columns: Country | HDI | Education index | GDP per capita (US$) | GNP per capita (US$) | Population share (%) aged 0-14 | 15-59 | 60+ | 80+
Ethiopia 0.451 0.322 645.47 600 41.4 53.3 5.2 0.5
Pakistan 0.551 0.398 1428.64 1430 35 58.4 6.6 0.6
Burkina Faso 0.412 0.277 575.31 630 45.6 50.6 3.8 0.2
India 0.627 0.542 1606.95 1600 28.8 62.3 8.9 0.9
Ghana 0.585 0.556 1783.06 1960 38.8 55.9 5.3 0.4
Indonesia 0.686 0.616 3334.55 3430 27.7 64.1 8.2 0.7
Senegal 0.492 0.354 1186.33 1290 43.8 51.7 4.5 0.3
Nigeria 0.527 0.477 2729.76 2880 44 51.5 4.5 0.2
Philippines 0.693 0.661 2878.34 3520 31.9 60.8 7.3 0.6
Kenya 0.578 0.534 1355.06 1310 41.9 53.6 4.5 0.4
South Africa 0.692 0.708 5742.99 6060 29.2 63 7.7 1
Vietnam 0.684 0.619 2065.17 1950 23.1 66.6 10.3 2
Peru 0.745 0.686 6053.11 6160 27.9 62.1 10 1.4
Mexico 0.767 0.666 9298.24 9840 27.6 62.8 9.6 1.5
Ukraine 0.743 0.794 2124.66 2650 14.9 62.5 22.6 3.4
Brazil 0.757 0.68 8750.22 10090 23 65.2 11.7 1.5
China 0.743 0.641 8069.21 7950 17.2 67.6 15.2 1.6
Lebanon 0.752 0.63 8529.51 8110 24 64.6 11.5 1.5
Jordan 0.733 0.706 4096.10 3890 35.5 59 5.4 0.5
Malaysia 0.795 0.713 9655.14 10450 24.5 66.3 9.2 0.8
Poland 0.855 0.853 12556.36 13340 14.9 62.4 22.7 4
Japan 0.905 0.839 34567.75 38880 12.9 54.1 33.1 7.8
Argentina 0.822 0.812 13698.30 12510 25.2 59.7 15.1 2.7
Italy 0.876 0.791 30170.52 32960 13.7 57.7 28.6 6.8
France 0.898 0.84 36613.38 41080 18.5 56.3 25.2 6.1
Chile 0.84 0.797 13736.64 14310 20.1 64.2 15.7 2.7
Germany 0.933 0.94 41394.66 46020 12.9 59.5 27.6 5.7
Israel 0.901 0.876 35855.28 36160 27.8 56.3 15.8 3
Spain 0.885 0.819 25817.39 28460 14.9 60.7 24.4 5.9
UK 0.918 0.911 44472.15 43860 17.8 59.2 23 4.7
USA 0.92 0.9 56803.47 56700 19 60.4 20.7 3.8
Canada 0.92 0.89 43327.17 47380 16 61.7 22.3 4.2
Australia 0.936 0.926 56644.00 60440 18.7 60.9 20.4 3.9

D. Data Analyses and Results:


All of the following statistical tests use a significance level of 𝛼 = 0.05.
(a) Goodness of Fit Test:
Because the following analyses assume normally distributed data, a Goodness of Fit Test is needed to check the distribution of each data set. For the test, I assume that

\[
\begin{cases}
H_0: \text{the data follow a normal distribution,} \\
H_1: \text{the data do not follow a normal distribution.}
\end{cases}
\]

In detail, I divided the data into 6 classes, so each class is expected to contain 5 (30 ÷ 6 = 5) countries; this expected count is denoted $e_i$. The observed counts are denoted $f_i$. The tables below show the chi-square value computed for each data set, $\chi^2 = \sum_i (f_i - e_i)^2 / e_i$, and the critical value used is $\chi^2_{0.05} \cong 42.5570$ (the 5% critical value for 29 degrees of freedom). Although some values are fairly high, all of them are smaller than 42.5570. Consequently, it is concluded that all the data follow a normal distribution, and all the following tests are based on this conclusion.

Goodness of Fit Test: 3C popularity rates (e_i = expected count, f_i = observed count)

             TOTAL   Age 18-34   Age 35+   Less education   More education   Lower income   Higher income
e_i          f_i     f_i         f_i       f_i              f_i              f_i            f_i
5            6       5           9         5                5                6              6
5            4       5           1         9                4                5              4
5            2       1           4         4                3                4              2
5            6       3           6         2                3                4              6
5            6       16          4         3                12               5              6
5            6       0           6         7                3                6              6
Chi-Square   2.8     33.2        7.2       6.8              12.4             0.8            2.8
Conclusion   H0      H0          H0        H0               H0               H0             H0

Goodness of Fit Test: country indices (e_i = expected count, f_i = observed count)

             HDI   Education index   GDP per capita   GNP per capita   Age 0-14   Age 15-59   Age 60+   Age 80+
e_i          f_i   f_i               f_i              f_i              f_i        f_i         f_i       f_i
5            6     4                 0                0                6          5           6         3
5            2     4                 15               15               6          4           8         13
5            7     6                 6                6                4          5           2         1
5            3     3                 0                0                6          4           4         4
5            5     8                 2                2                3          7           2         3
5            7     5                 7                7                5          5           8         6
Chi-Square   4.4   3.2               32.8             32.8             1.6        1.2         7.6       18
Conclusion   H0    H0                H0               H0               H0         H0          H0        H0
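
The goodness-of-fit computation can be sketched in Python as follows. This is a minimal illustration, not the procedure actually used for the report: forming the six classes as equally likely intervals of a fitted normal distribution is an assumption, and the function names `equal_probability_edges` and `chi_square_statistic` are hypothetical. The observed counts are the TOTAL and Age 18-34 columns of the first Goodness of Fit table, and the mean and standard deviation are the TOTAL values reported in the Variance Test table.

```python
from statistics import NormalDist


def equal_probability_edges(mean, std, k):
    """Interior edges that split Normal(mean, std) into k equally likely classes.

    Assumption: the report does not state how its six classes were formed.
    """
    dist = NormalDist(mean, std)
    return [dist.inv_cdf(i / k) for i in range(1, k)]


def chi_square_statistic(observed, expected):
    """Pearson statistic: sum over classes of (f_i - e_i)^2 / e_i."""
    return sum((f - e) ** 2 / e for f, e in zip(observed, expected))


# Mean and standard deviation of the TOTAL rate (Variance Test table).
edges = equal_probability_edges(58.533, 23.263, 6)

expected = [5] * 6                     # e_i: 30 countries / 6 classes
observed_total = [6, 4, 2, 6, 6, 6]    # f_i, column TOTAL
observed_18_34 = [5, 5, 1, 3, 16, 0]   # f_i, column Age 18-34

chi2_total = chi_square_statistic(observed_total, expected)
chi2_18_34 = chi_square_statistic(observed_18_34, expected)

print("class edges:", [round(e, 2) for e in edges])
print("chi-square TOTAL:", round(chi2_total, 1))
print("chi-square 18-34:", round(chi2_18_34, 1))
```

The two statistics agree with the 2.8 and 33.2 listed in the table. The threshold is deliberately left out of the code: the report compares every statistic with 42.5570, whereas a common textbook convention for k classes and m estimated parameters is the critical value for k - 1 - m degrees of freedom, which is a stricter threshold.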

(b) Variance Test:


Having established that the data are normally distributed, I also want to know how widely the popularity of 3C products varies. Therefore, I used a chi-square test on the variance and assumed that the standard deviation $\sigma$ of the population, i.e., all the countries around the world, is about 20 percentage points ($\sigma^2 = 20^2$).

\[
\begin{cases}
H_0: \sigma^2 \le 20^2, \\
H_1: \sigma^2 > 20^2.
\end{cases}
\]

From the table below, we can see that the chi-square value, $\chi^2 = (n-1)s^2/\sigma_0^2$ with $n = 30$ and $\sigma_0 = 20$, is large for every data set. Although four values are lower than $\chi^2_{0.05} \cong 42.5570$ and therefore cannot reject $H_0$, the variation of the total rate and of the three categories (age, education, and income) is at a high level. It can then be concluded that an obvious gap in the popularity of 3C products exists between different countries.

Variance Test (values in %)

             TOTAL    Age 18-34   Age 35+   Less education   More education   Lower income   Higher income
Average      58.533   75.433      47.600    39.700           78.700           44.767         70.167
Std. Dev.    23.263   26.499      25.351    26.707           18.404           23.874         23.609
Exp. Std.    20       20          20        20               20               20             20
Chi-Square   39.234   50.908      46.593    51.711           24.556           41.323         40.410
Conclusion   H0       H1          H1        H1               H0               H0             H0
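
The chi-square values in the table are consistent with the statistic (n - 1)s²/σ₀², where s is the sample standard deviation, σ₀ = 20, and n = 30. The sketch below recomputes them from the tabulated values; the function name `variance_chi_square` is hypothetical, and small differences in the last digit come from the rounding of s in the table.

```python
def variance_chi_square(sample_std, hypothesized_std, n):
    """Statistic (n - 1) * s^2 / sigma_0^2 for a one-sample variance test."""
    return (n - 1) * sample_std ** 2 / hypothesized_std ** 2


N = 30              # number of countries
SIGMA_0 = 20.0      # hypothesized standard deviation (percentage points)
CRITICAL = 42.5570  # critical value quoted in the report (alpha = 0.05)

# Sample standard deviations as listed in the Variance Test table.
sample_std = {
    "TOTAL": 23.263,
    "Age 18-34": 26.499,
    "Age 35+": 25.351,
    "Less education": 26.707,
    "More education": 18.404,
    "Lower income": 23.874,
    "Higher income": 23.609,
}

results = {}
for name, s in sample_std.items():
    stat = variance_chi_square(s, SIGMA_0, N)
    decision = "H1" if stat > CRITICAL else "H0"
    results[name] = (stat, decision)
    print(f"{name:15s} chi-square = {stat:7.3f} -> {decision}")
```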

(c) Simple Linear Regression Analysis – Total 3C Products Popularity Rate:


For this part, the objective of the regression analysis is to find out whether any country index is strongly related to the total popularity rate of 3C products. The analysis is therefore divided into five simple linear regressions of the form $\hat{Y} = a + bX_i$. The total popularity rate is the dependent variable $Y$, and HDI, education index, GDP, GNP, and age distribution are the independent variables $X_i$. Because the age distribution calls for some additional discussion, the results are presented in two parts.
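
A least-squares fit of the line Ŷ = a + bX can be sketched as below. This is only an illustration: the helper `fit_line` is hypothetical, and the six (HDI, TOTAL) pairs are a small subset of the raw data tables (Ethiopia, Pakistan, India, China, Japan, Australia), so the fitted numbers are not expected to match the full-sample results reported later.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                  # slope
    a = mean_y - b * mean_x        # intercept
    r_squared = sxy ** 2 / (sxx * syy)
    return a, b, r_squared


# Illustrative subset of the raw data (not the full 30-country sample).
hdi = [0.451, 0.551, 0.627, 0.743, 0.905, 0.936]
total = [8, 15, 22, 65, 69, 93]

a, b, r2 = fit_line(hdi, total)
print(f"Y_hat = {a:.2f} + {b:.2f} * HDI, R^2 = {r2:.3f}")
```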

• Part Ⅰ - HDI, Education Index, GDP, and GNP:


First, the plots and tables below come from regressing the total popularity rate on HDI, education index, GDP, and GNP. In the plots, all four fitted lines have a positive slope and follow the observed points closely. In the Regression Statistics tables, the adjusted R square of each regression is large enough to confirm the fit. For HDI in particular, the adjusted R² is very high, about 0.84. From these results, it can be concluded that the proportion of people in a country using 3C products is highly related to the human development index (HDI), i.e., the degree of development of the country.


• Total vs. HDI:


[Figure: Total vs. HDI. Observed Y and predicted Y; x-axis: HDI, y-axis: Total (Y).]

Regression Statistics
Multiple R 0.921187778
R Square 0.848586923
Adjusted R Square 0.843179313
Standard Error 9.212173778
Observations 30

ANOVA
df SS MS F Significance F
Regression 1 13317.27059 13317.27059 156.9245819 5.36877E-13
Residual 28 2376.19608 84.86414572
Total 29 15693.46667

Linear Regression Line-HDI


Coefficient Standard Error t Stat P-value Lower 95% Upper 95%
Intercept -52.81736 9.04661 -5.83836 2.82966E-06 -71.34850 -34.28622
X 148.02680 11.81666 12.52695 5.36877E-13 123.82146 172.23214
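
The entries of the regression output are linked by a few identities, which can be checked directly from the ANOVA table above. The following sketch recomputes R², the adjusted R², the F statistic, the standard error, and the slope's t statistic from the reported sums of squares; the variable names are illustrative.

```python
n, k = 30, 1                   # observations, number of predictors

ss_regression = 13317.27059    # from the ANOVA table
ss_residual = 2376.19608
ss_total = ss_regression + ss_residual

r_squared = ss_regression / ss_total
adj_r_squared = 1 - (ss_residual / (n - k - 1)) / (ss_total / (n - 1))
f_stat = (ss_regression / k) / (ss_residual / (n - k - 1))
standard_error = (ss_residual / (n - k - 1)) ** 0.5

# Slope coefficient divided by its standard error.
t_slope = 148.02680 / 11.81666

print(f"R^2 = {r_squared:.6f}, adjusted R^2 = {adj_r_squared:.6f}")
print(f"F = {f_stat:.4f}, standard error = {standard_error:.6f}")
print(f"t(slope) = {t_slope:.5f}, t^2 = {t_slope ** 2:.4f}")
```

In a simple linear regression the square of the slope's t statistic equals F, which is why the P-value of X and the Significance F coincide in the tables.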

• Total vs. Education Index:


[Figure: Total vs. Edu. Index. Observed Y and predicted Y; x-axis: Education Index, y-axis: Total (Y).]

Regression Statistics
Multiple R 0.897946188
R Square 0.806307356
Adjusted R Square 0.799389762
Standard Error 10.41926281
Observations 30

ANOVA
df SS MS F Significance F
Regression 1 12653.75762 12653.75762 116.5589231 1.72744E-11
Residual 28 3039.709049 108.5610375
Total 29 15693.46667


Linear Regression Line-Education Index


Coefficient Standard Error t Stat P-value Lower 95% Upper 95%
Intercept -25.58855 8.02062 -3.19034 3.48932E-03 -42.01805 -9.15904
X 120.50119 11.16140 10.79625 1.72744E-11 97.63810 143.36428

• Total vs. GDP per Capita:


[Figure: Total vs. GDP. Observed Y and predicted Y; x-axis: GDP per Capita, y-axis: Total.]

Regression Statistics
Multiple R 0.794700865
R Square 0.631549464
Adjusted R Square 0.618390516
Standard Error 14.37043866
Observations 30

ANOVA
df SS MS F Significance F
Regression 1 9911.200463 9911.200463 47.99391851 1.56854E-07
Residual 28 5782.266203 206.5095073
Total 29 15693.46667

Linear Regression Line-GDP per Capita


Coefficient Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 41.54988 3.59076 11.57135 3.49658E-12 34.19455 48.90521
X 0.00106 0.00015 6.92776 1.56854E-07 0.00075 0.00137

• Total vs. GNP per Capita:


[Figure: Total vs. GNP. Observed Y and predicted Y; x-axis: GNP per Capita, y-axis: Total.]

Regression Statistics
Multiple R 0.790042587
R Square 0.62416729
Adjusted R Square 0.610744693
Standard Error 14.5136858
Observations 30


ANOVA
df SS MS F Significance F
Regression 1 9795.348556 9795.348556 46.50123216 2.08168E-07
Residual 28 5898.118111 210.6470754
Total 29 15693.46667

Linear Regression Line-GNP per Capita


Coefficient Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 41.62996 3.62850 11.47305 4.26488E-12 34.19732 49.06261
X 0.00100 0.00015 6.81918 2.08168E-07 0.00070 0.00130

• Part Ⅱ - Age Distribution:


For the age distribution, the comparison between the trends of the four age classes is what matters, so the discussion focuses on it. The regression tables are similar to those discussed above and are omitted in this part. For the first class, age 0~14, the slope of the trend is negative. The reason is probably the strong relation between HDI and the share of the population aged 0~14: in most cases, the lower a country's degree of development, the younger its population. Indeed, the adjusted R² between HDI and age 0~14 is high, about 0.7668. Therefore, the trend for this age class is negative.

For the second class, age 15~59, the adjusted R² is almost equal to 0, which means that this age class has almost no influence on the popularity rate of 3C products. For the remaining two classes, age 60+ and age 80+, the slopes of the regression lines are positive. The reason is probably the mirror image of that for age 0~14: a country with a younger population tends to have a lower HDI, whereas a country in which the aged population takes a high proportion tends to have a higher HDI. As a result, these two classes show a positive trend with the popularity rate of 3C products. Overall, although the adjusted R² values of the four classes are not high, especially for age 15~59, the age distribution still reveals characteristics of 3C usage in each country that are worth discussing.


• Total vs. Age Distribution:

[Figure: four scatter plots of observed Y and predicted Y; y-axis: Total, x-axis: population share of each age class.]
Total vs. Age 0~14: Adj. R² ≅ 0.5565
Total vs. Age 15~59: Adj. R² ≅ 0.0428
Total vs. Age 60+: Adj. R² ≅ 0.5341
Total vs. Age 80+: Adj. R² ≅ 0.4853

(d) Simple Linear Regression Analysis – Sub-items of 3C Products Popularity Rate:


After analyzing the relation between the total popularity rate and the country indices, we can look more closely at two sub-items of the popularity rate, education and income. The education item is divided into two categories, less education and more (high) education, and the income item into lower income and higher income.

In these analyses, the Significance F of the ANOVA and the P-values of the regression coefficients are all significant, so only the regression plots and the adjusted R² are shown below. From the plots, we can see that the education index and GNP have a larger impact on the less-education and lower-income rates, i.e., people in these two categories are more easily influenced by the development of a country. Deeper research could look for the reasons behind this phenomenon. Governments could then take the usage of 3C products into account when making new national policies to raise the education index and GNP of the country and make it more competitive in the world.


[Figure: four scatter plots of observed Y and predicted Y; x-axis: Edu. Index (top row) or GNP per Capita (bottom row).]
Less Education vs. Edu. Index: Adj. R² ≅ 0.5941
High Education vs. Edu. Index: Adj. R² ≅ 0.3706
Lower Income vs. GNP per Capita: Adj. R² ≅ 0.7023
Higher Income vs. GNP per Capita: Adj. R² ≅ 0.5319

(e) Multiple Regression Analysis:


The simple linear regression above shows a strong relation between the 3C product popularity rate and HDI. Because HDI is calculated from three components, namely life expectancy, education, and income per capita at purchasing power parity, I ran a multiple regression with the 3C product popularity rate as the dependent variable and education index, GNP, and age distribution as the three independent variables. The age distribution value is the sum of the proportions of the 60+ and 80+ populations and serves as a substitute for life expectancy. The parameters are described below.

𝒀 Total 3C Popularity Rate
𝑋1 Education Index
𝑋2 (Age 60+) + (Age 80+)
𝑋3 GNP per Capita

The results below include three plots, in each of which one independent variable changes while the other two are held fixed. The fitted multiple regression function is

\[
\hat{Y} = -13.69 + 97.27 X_1 - 0.08 X_2 + 0.33 \times 10^{-3} X_3 .
\]
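
As a worked example, the fitted function can be evaluated for individual countries using the values in the raw data tables. The sketch below uses the rounded coefficients of the equation; the function name `predict_total` is hypothetical, and Germany and Ethiopia are chosen only as illustrations.

```python
def predict_total(education_index, age_60_plus, age_80_plus, gnp_per_capita):
    """Evaluate the fitted multiple regression with the rounded coefficients."""
    x1 = education_index
    x2 = age_60_plus + age_80_plus   # age distribution as defined above
    x3 = gnp_per_capita
    return -13.69 + 97.27 * x1 - 0.08 * x2 + 0.33e-3 * x3


# Inputs taken from the Country Indices table.
germany = predict_total(0.94, 27.6, 5.7, 46020)
ethiopia = predict_total(0.322, 5.2, 0.5, 600)

# Observed TOTAL rates from the raw data: Germany 85, Ethiopia 8.
print(f"Germany:  predicted {germany:.1f}, observed 85")
print(f"Ethiopia: predicted {ethiopia:.1f}, observed 8")
```

Because the X₂ coefficient is small and the X₃ coefficient multiplies a value in the tens of thousands, the prediction is driven mostly by the education index and GNP terms.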

From the table, we can see that 𝑺𝒊𝒈𝒏𝒊𝒇𝒊𝒄𝒂𝒏𝒄𝒆 𝑭 ≅ 3.21E-10 is very small and that the adjusted R² ≅ 0.813 is close to the 0.843 obtained for HDI with simple linear regression. However, the table also shows that the P-values of three coefficients (the intercept, 𝑋2, and 𝑋3) are higher than 𝛼 = 0.05, especially for 𝑋2, the age distribution; only the coefficient of 𝑋1 is significant. It is rather strange that the F-test is significant while most of the coefficients are not. To find the reason, I went back to check whether the three “independent” variables are really independent of each other. The three adjusted R² values are shown below.

Variables 𝑨𝒅𝒋. 𝑹𝟐
HDI & Education Index 0.604901
HDI & Age Distribution 0.570967
Education & Age Distribution 0.584832

From the three values, we can see that the three variables are not truly independent of each other. Because they carry overlapping information, the individual contribution of each variable to the multiple regression line is reduced, and the P-values of their coefficients are not significant. This is a critical weakness of my multiple regression model, and it needs to be corrected in future work.

[Figure: three scatter plots of observed Y and predicted Y with Total on the y-axis: Total vs. Edu. Index, Total vs. Age Distribution, and Total vs. GDP.]

Regression Statistics
Multiple R 0.912260401
R Square 0.832219039
Adjusted R Square 0.812859697
Standard Error 10.06338549
Observations 30

\[
\hat{Y} = -13.69 + 97.27 X_1 - 0.08 X_2 + 0.33 \times 10^{-3} X_3
\]

ANOVA
df SS MS F Significance F
Regression 3 13060.40175 4353.46725 42.98798245 3.21005E-10
Residual 26 2633.064918 101.2717276
Total 29 15693.46667


Multiple Regression Line


Coefficient Standard Error t Stat P-value Lower 95% Upper 95%
Intercept -13.69399 10.24815 -1.33624 0.19305 -34.75937 7.37139
X1 97.27288 19.04779 5.10678 2.53586E-05 58.11958 136.42617
X2 -0.07795 0.31288 -0.24915 0.80520 -0.72109 0.56518
X3 0.00033 0.00017 1.90011 0.06856 -2.71921E-05 0.00069
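
The significance of each coefficient can also be read from the table without a t-distribution lookup, since a coefficient is significant at the 5% level exactly when its 95% interval excludes zero. The sketch below backs out the critical t value from the reported interval of X1 and applies it to the rounded coefficients and standard errors; with these rounded inputs the t statistic of X3 differs slightly from the 1.90011 in the table, but the conclusion is the same.

```python
# (coefficient, standard error) as printed in the coefficient table.
coefficients = {
    "Intercept": (-13.69399, 10.24815),
    "X1": (97.27288, 19.04779),
    "X2": (-0.07795, 0.31288),
    "X3": (0.00033, 0.00017),
}

# Critical t value implied by the reported 95% interval of X1:
# upper = coefficient + t_crit * standard_error.
t_crit = (136.42617 - 97.27288) / 19.04779

significant = set()
for name, (coef, se) in coefficients.items():
    t_stat = coef / se
    lower = coef - t_crit * se
    upper = coef + t_crit * se
    if abs(t_stat) > t_crit:
        significant.add(name)
    print(f"{name:9s} t = {t_stat:8.4f}, 95% interval = [{lower:.5f}, {upper:.5f}]")

print("t_crit =", round(t_crit, 4), "significant:", sorted(significant))
```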

(f) Non-linear Regression Analysis:


So far, the regression analyses have assumed that the relation between the variables is linear. However, several of the plots above show a different trend. Take Higher Income vs. GNP per Capita in part (d) as an example: the rate rises sharply at first and then levels off as it approaches 100%, which resembles a logarithmic curve. Therefore, I take the logarithm of the variable X, i.e., GNP per Capita, to check whether the transformed form increases the adjusted R² and the significance of the F-test.

The following compares the results of the initial linear form and the logarithmic form. The adjusted R² ≅ 0.77 of the logarithmic form is higher than the 0.53 of the linear form. The Significance F of the logarithmic form, 9.36527E-11, is also much lower than that of the linear form, 2.91515E-06. With the better adjusted R² and Significance F, it can be said that the relation between Higher Income and GNP per Capita is closer to a logarithmic form than to the initially assumed linear form.

[Figure: Higher Income vs. GNP (linear) and Higher Income vs. GNP (log). Observed Y and predicted Y; y-axis: Higher Income; x-axis: GNP per Capita (linear scale on the left, base-10 logarithm on the right).]

Regression Statistics   Linear form    Logarithmic form
Multiple R              0.740324615    0.884122649
R Square                0.548080536    0.781672858
Adjusted R Square       0.531940555    0.773875460
Standard Error          16.15206924    11.22668426
Observations            30             30
Significance F          2.91515E-06    9.36527E-11

\[
\text{Linear form: } \hat{Y} = 54.09 + 9.49 \times 10^{-4} X
\]
\[
\text{Logarithmic form: } \hat{Y} = -72.18 + 36.32 \log_{10} X
\]
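
The log transformation amounts to regressing Y on log₁₀ X instead of X. The sketch below shows this with a hypothetical helper `fit_log10` (checked here on synthetic points only) and then evaluates the two fitted equations of the report at a few GNP levels; the function names and the chosen GNP values are illustrative.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def fit_log10(xs, ys):
    """Fit y = a + b*log10(x) by transforming x before the linear fit."""
    return fit_line([math.log10(x) for x in xs], ys)


def linear_model(gnp):
    """Linear form reported above."""
    return 54.09 + 9.49e-4 * gnp


def log_model(gnp):
    """Logarithmic form reported above."""
    return -72.18 + 36.32 * math.log10(gnp)


# Synthetic check of the transformation: points lying on y = log10(x).
a_syn, b_syn = fit_log10([10, 100, 1000], [1, 2, 3])

for gnp in (600, 6000, 60000):
    print(f"GNP {gnp:6d}: linear {linear_model(gnp):6.1f}, log {log_model(gnp):6.1f}")
```

Under the logarithmic form every tenfold increase in GNP per capita adds the same 36.32 points to the predicted rate, which reproduces the fast early rise and later flattening seen in the plot. Neither form is bounded at 100%, so both can overshoot at the highest GNP levels.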


E. Conclusion:
Through a series of data analyses, we can get a rough sense of the relation between the 3C product popularity rate and several country indices.

First, all the data went through the Goodness of Fit Test, which indicated that each data set follows a normal distribution. Second, the Variance Test was carried out to see the variation of each data set, and the results show that gaps in the popularity rate exist between different countries.

Third, the total popularity rate was analyzed by Simple Linear Regression Analysis with HDI, education index, GDP, GNP, and age distribution. Almost all the regression lines have positive trends; the exception is the age class 0~14, and the reason for this special case is discussed in that part. Fourth, the sub-items of the popularity rate, less/high education and lower/higher income, were analyzed with the education index and GNP per capita. We found that the condition of a country has a larger impact on the less-education and lower-income categories.

Fifth, the Multiple Regression Analysis was done on the total rate with education index, GNP, and age distribution. The result is strange (a significant F-test with mostly insignificant coefficients), and the possible reason is also discussed in that part. Last but not least, the higher-income popularity rate and GNP per capita were taken as an example to see whether the logarithmic form gives a better adjusted R² and Significance F. The result shows that the transformation succeeds.

Throughout the work, the 3C popularity rates are treated as the dependent variables Y and the country indices as the independent variables X. This makes it seem that the X's are always the causes and Y is the effect. However, regression analysis only shows the relation between the data; which variable plays the role of X is our own choice. Deeper research is therefore needed to establish the actual cause-and-effect relation, and it is left as future work. Moreover, data from more recent years could be used to check whether the regression models built on the 2015 data still hold. If they do, the models could help in shaping national policies or other measures that make a country develop faster and better.
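
The point that the choice of X and Y is ours can be illustrated numerically: swapping the two variables changes the fitted line but not R², so the fit by itself says nothing about direction. The sketch below uses a hypothetical helper `slope_and_r2` and six (HDI, TOTAL) pairs from the raw data tables as an illustrative subset.

```python
def slope_and_r2(xs, ys):
    """Least-squares slope of y on x and the R^2 of that regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx, sxy ** 2 / (sxx * syy)


# Illustrative subset: Ethiopia, Pakistan, India, China, Japan, Australia.
hdi = [0.451, 0.551, 0.627, 0.743, 0.905, 0.936]
total = [8, 15, 22, 65, 69, 93]

slope_total_on_hdi, r2_total_on_hdi = slope_and_r2(hdi, total)
slope_hdi_on_total, r2_hdi_on_total = slope_and_r2(total, hdi)

print("R^2, TOTAL regressed on HDI:", round(r2_total_on_hdi, 4))
print("R^2, HDI regressed on TOTAL:", round(r2_hdi_on_total, 4))
print("product of the two slopes:  ", round(slope_total_on_hdi * slope_hdi_on_total, 4))
```

The product of the two slopes equals the common R², which is another way of seeing that the regression measures the strength of the association rather than which variable drives the other.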

References:
[1] Jacob Poushter, “Smartphone Ownership and Internet Usage Continues to Climb in Emerging Economies”, Pew Research Center, p. 11, 2016.
[2] United Nations Development Programme, http://hdr.undp.org/en/data
[3] “World Population Prospects”, United Nations DESA/Population Division, revision, pp. 27-31, 2015.
[4] United Nations Development Programme, http://hdr.undp.org/en/data
[5] The World Bank, https://data.worldbank.org/indicator/NY.GDP.PCAP.CD?end=2015&start=1960
[6] The World Bank, https://data.worldbank.org/indicator/NY.GNP.PCAP.CD?end=2015&order=wbapi_data_value_2014+wbapi_data_value+wbapi_data_value-last&sort=desc&start=1962
