31TH 2016
Faculty of Economics and Business, University of Zagreb
Multivariate analysis: Factor and Cluster Analysis
As a team, we decided to carry out a brief analysis of the business competitiveness of an economy.
Competitiveness is recognised worldwide and has long been measured with different methodologies.
The Global Competitiveness Report has been accepted as an effective, realistic and useful tool for
a country SWOT analysis based on 12 pillars. It shows a country's current position and has become
a starting point for building an efficient and substantial economic development strategy in many
of the countries it covers. It defines competitiveness as the set of institutions, policies,
and factors that determine the level of productivity of a country. A more competitive economy, logically,
is one that is likely to grow faster over time.
Many determinants drive productivity and competitiveness. Understanding the factors behind this
process has occupied the minds of economists for hundreds of years, engendering theories ranging from
Adam Smith’s focus on specialization and the division of labour to neoclassical economists’ emphasis
on investment in physical capital and infrastructure, and, more recently, to interest in other mechanisms
such as education and training, technological progress, macroeconomic stability, good governance, firm
sophistication, and market efficiency, among others. While all of these factors are likely to be important
for competitiveness and growth, they are not mutually exclusive—two or more of them can be
significant at the same time, and in fact that is what has been shown in the economic literature.
Our analysis takes into consideration small open economies in Europe. One of the defining features of
small open economies is that households and firms in these countries can borrow and lend at an interest
rate determined by international markets. But not all small open economies are alike. Although small
open economies share the feature of being price-takers in international bond markets — that is, they do
not influence prices in the marketplace — they differ substantially in other dimensions. Consequently,
economists sort these countries into two types: developed (or industrialized) economies and developing
(or emerging) economies. This classification was originally proposed in the 1980s by World Bank
economist Antoine van Agtmael. In spite of this deceptively simple classification, there is no consensus
about where the distinction between developed and developing vanishes.
To avoid these conflicting views about the definition of emerging countries, we rely on more concrete
quantitative measures based on the business cycle properties of these economies.
The regional component of every economy is getting increasing attention these days. Research
has shown that 75% of profit arises at the city level, while the rural parts of each country show
a whole spectrum of development problems. We therefore wanted to see which factors have the most
significant impact on the results that ultimately appear at the macroeconomic level.
The concept of “A Smart City” defines cities as accelerators of sustainable economic growth. The
whole concept rests on the idea of a social and technological infrastructure that improves the quality of
life in the city for everyone and, in the end, in the whole economy.
ICTs have evolved into the “general purpose technology” of our time, given their critical
spillovers to other economic sectors and their role as industry-wide enabling infrastructure.
Therefore ICT access and usage are key enablers of countries’ overall technological readiness.
Whether the technology used has or has not been developed within national borders is irrelevant
for its ability to enhance productivity. The central point is that the firms operating in the country
need to have access to advanced products and blueprints and the ability to absorb and use them.
Among the main sources of foreign technology, FDI often plays a key role, especially for
countries at a less advanced stage of technological development. It is important to note that, in
this context, the level of technology available to firms in a country needs to be distinguished
from the country’s ability to conduct blue-sky research and develop new technologies for
innovation that expand the frontiers of knowledge.
Variables taken into consideration: Firm-level technology absorption & FDI and technology
transfer
Variables taken into consideration: Nature of competitive advantage & Production process
sophistication
Variables taken into consideration: Capacity for innovation, Quality of scientific research
institutions, Company spending on R&D, University-industry collaboration in R&D, Gov’t
procurement of advanced tech products, Availability of scientists and engineers, PCT patents,
applications/million pop.
This last group is the largest for a reason: the modern economy is driven by a fast-changing
global environment that requires productive, fast solutions and a proactive role of
government and business policies to create “safe”, substantial economic growth. Cities need
to anticipate the future, not just adapt to a newly created, short-lived situation.
1. The variable that connects them is, of course, human capital, so we took the measures of the
quality of the education system and the quality of math and science education.
> setwd("C:\\Users\\Public\\Documents")
> comp=read.table(file="BusinessCompetitiveness.txt")
> comp
> colnames(comp)=c("QES","QMSE","FLTA","FDI","NCA","PPS","CI","QSRI","CSR&D","UICR&D",
+ "GPATP","ASE","PCTP")
> rownames(comp)=c("Austria","Belgium","Croatia","Cyprus","Estonia","Iceland","Latvia","Lithuania",
+ "Malta","Montenegro","Portugal","SlovakRepublic","Slovenia","Switzerland")
> comp
The numbers of rows and columns of our matrix were obtained using the following functions:
> n=nrow(comp)
> n
> p=ncol(comp)
> p
2.3. Standardization of data
The third step produces the matrix of standardized values required for further analysis. The
function scale was used because, by default, it centers and scales the columns of a numeric
matrix. The standardized values of the observed variables are stored in the matrix named
xs.
> xs=scale(comp)
> xs
> r=cov(xs)
> r
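Because scale centers each column and divides it by its standard deviation, the covariance matrix of the standardized data is the same as the correlation matrix of the raw data, which is why r can be treated as a correlation matrix in what follows. A minimal sketch on made-up data (the matrix toy is a hypothetical stand-in for comp, not our actual data):

```r
# Hypothetical stand-in for comp: 14 observations, 4 variables (random numbers)
set.seed(1)
toy <- matrix(rnorm(14 * 4, mean = 4, sd = 1), nrow = 14, ncol = 4)

toy_s <- scale(toy)        # center and scale each column
r_from_cov <- cov(toy_s)   # covariance of the standardized data
r_from_cor <- cor(toy)     # correlation of the raw data

# TRUE when both matrices agree up to numerical tolerance
same_matrix <- isTRUE(all.equal(r_from_cov, r_from_cor, check.attributes = FALSE))
same_matrix
```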
2.5. Defining the number of factors used in analysis
Function eigen is used for computing eigenvalues and eigenvectors of correlation matrix r.
> e=eigen(r)
> values=e$values
> vectors=e$vectors
Eigenvalues are important for selection of the number of factors which are going to be used in
the factor analysis. The K1 method for identification of the number of factors proposed by
Kaiser is perhaps the best known and most utilized in practice. According to this rule, only the
factors that have eigenvalues greater than one are retained for interpretation. In this case 2
factors have eigenvalues greater than one.
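The Kaiser rule can be applied directly to the vector of eigenvalues. The sketch below uses random toy data (an assumption for illustration only, not the competitiveness data) and counts the eigenvalues greater than one. Since the eigenvalues of a correlation matrix add up to the number of variables, they can never all exceed one.

```r
# Toy data (assumption): 30 observations, 5 variables, two of them correlated
set.seed(2)
toy <- matrix(rnorm(30 * 5), nrow = 30)
toy[, 2] <- toy[, 1] + 0.3 * toy[, 2]

r_toy <- cor(toy)
ev <- eigen(r_toy)$values   # eigenvalues in decreasing order

# Kaiser (K1) rule: keep the factors whose eigenvalue exceeds one
n_factors <- sum(ev > 1)
n_factors
```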
Eigenvalues
Eigenvectors
From the next table it is visible that only the first 2 eigenvalues are greater than one and together
explain the majority of the variation in the original data (74%).
> perc=values/sum(values)
> cumperc=cumsum(perc)
> table=cbind(values,perc,cumperc)
> table
Cattell’s Scree test
The alternative method of selecting the number of factors to extract is through the use of a
Cattell’s “scree” plot, which is a graph of the eigenvalues (variances) of each component plotted
against the component number. From this plot, it may be seen that the line “drops-off” after
about the second eigenvalue, so we have retained two components which together explain about
74 % of the variation in the original data.
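A scree plot of this kind could be drawn roughly as follows. This is only a sketch on random toy data (an assumption), with a dashed line at one added for comparison with the Kaiser rule. With our data, the vector values would take the place of ev.

```r
# Toy data (assumption): 30 observations, 6 variables, three of them correlated
set.seed(5)
toy <- matrix(rnorm(30 * 6), nrow = 30)
toy[, 2] <- toy[, 1] + 0.3 * toy[, 2]
toy[, 3] <- toy[, 1] + 0.3 * toy[, 3]

ev <- eigen(cor(toy))$values

# Eigenvalue against component number; the "elbow" suggests where to stop
plot(seq_along(ev), ev, type = "b",
     xlab = "Component number", ylab = "Eigenvalue",
     main = "Scree plot (toy data)")
abline(h = 1, lty = 2)   # reference line for the Kaiser rule

# Size of the drop between consecutive eigenvalues
drops <- -diff(ev)
drops
```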
> z=xs%*%vectors
> r1=cor(cbind(z,xs))
> compet2=r1[14:26,1:2]
> compet2
The percentage of variance of each variable explained by the common factors is given in the
following table. It is visible that most of the variance of each variable is explained by the
extracted factors.
> common=diag(compet2%*%t(compet2))
> common
Specific variance
The following table shows the remaining, unexplained part of variance of each variable.
> specific=diag(r)-common
> specific
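The relation between the two tables can be checked on a small example: for standardized variables the explained and the unexplained parts add up to one, and if all components were kept the explained part would equal one for every variable. The sketch below repeats the same steps on random toy data (an assumption; all names ending in _toy or starting with load_ are hypothetical).

```r
# Toy data (assumption): 40 observations, 4 variables
set.seed(3)
toy <- matrix(rnorm(40 * 4), nrow = 40)
toy[, 2] <- toy[, 1] + 0.5 * toy[, 2]

xs_toy <- scale(toy)
r_toy <- cov(xs_toy)
vec_toy <- eigen(r_toy)$vectors

# Loadings as correlations between components and variables
z_toy <- xs_toy %*% vec_toy
r1_toy <- cor(cbind(z_toy, xs_toy))
load_all <- r1_toy[5:8, 1:4]   # rows: variables, columns: all 4 components
load_2 <- load_all[, 1:2]      # keep the first two components only

common_2 <- diag(load_2 %*% t(load_2))        # explained by two factors
specific_2 <- diag(r_toy) - common_2          # remaining part
common_all <- diag(load_all %*% t(load_all))  # equals 1 when nothing is dropped
round(cbind(common_2, specific_2, common_all), 3)
```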
Factor loadings rotated
The following table shows the factor loadings matrix after varimax rotation. The values are
easier to interpret because each variable is significant for only one factor. Eleven variables are
significant for the first factor (QES, QMSE, FLTA, NCA, PPS, CI, QSRI, CSR&D, UICR&D,
ASE, PCTP) and two variables for the second factor (FDI, GPATP). Therefore we can easily
separate the variables between the two factors. The first factor explains 59.9% of the variation in
the data and the second factor explains 14.5% of the variation.
> rot=varimax(compet2)
> rotloadings=rot$loadings
> rotloadings
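Varimax is an orthogonal rotation, so it only redistributes the explained variance between the factors: the communality of each variable and the total share of variance explained stay the same. A minimal sketch on random toy data (an assumption; load_2 and rot_toy are hypothetical names) illustrates this.

```r
# Toy data (assumption): 50 observations, 5 variables, two correlated pairs
set.seed(4)
toy <- matrix(rnorm(50 * 5), nrow = 50)
toy[, 2] <- toy[, 1] + 0.4 * toy[, 2]
toy[, 4] <- toy[, 3] + 0.4 * toy[, 4]

e_toy <- eigen(cor(toy))
# Unrotated loadings of the first two components
load_2 <- e_toy$vectors[, 1:2] %*% diag(sqrt(e_toy$values[1:2]))

rot_toy <- varimax(load_2)
rot_load <- unclass(rot_toy$loadings)   # plain matrix of rotated loadings

common_before <- rowSums(load_2^2)      # communalities before rotation
common_after <- rowSums(rot_load^2)     # communalities after rotation

# Share of total variance attributed to each rotated factor
var_explained <- colSums(rot_load^2) / nrow(rot_load)
var_explained
```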
2.7. Factor scores before rotation
The score for a given factor is a linear combination of all of the measures, weighted by the
corresponding factor loading.
> fa=xs%*%solve(r)%*%compet2
> fa
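With unrotated loadings obtained from the eigen-decomposition, scores computed in this way are uncorrelated and have unit variance. The sketch below checks this on random toy data (an assumption; it does not reproduce our results).

```r
# Toy data (assumption): 40 observations, 4 variables
set.seed(6)
toy <- matrix(rnorm(40 * 4), nrow = 40)
toy[, 2] <- toy[, 1] + 0.5 * toy[, 2]

xs_toy <- scale(toy)
r_toy <- cov(xs_toy)
e_toy <- eigen(r_toy)

# Unrotated loadings of the first two components
load_2 <- e_toy$vectors[, 1:2] %*% diag(sqrt(e_toy$values[1:2]))

# Factor scores: standardized data weighted by solve(r) %*% loadings
fa_toy <- xs_toy %*% solve(r_toy) %*% load_2

# Covariance matrix of the scores (expected to be close to the identity)
score_cov <- cov(fa_toy)
round(score_cov, 3)
```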
Large open economies were added to the analysis in order to conduct a cluster analysis.
Data:
[Figure: dendrogram; leaf labels 12, 15, 14, 18, 19, 10, 13, 20, 16, 17, 11, 9]
In the next step we exclude the 12th observation (row 12 of the data):
> cluster2=cluster[-12,]
> dist.eu2=dist(cluster2,method="euclidean",p=2,diag=T)
> dist.eu22=dist.eu2^2
> s=hclust(dist.eu22, method = "single")
[Figure: dendrogram; leaf labels 15, 14, 18, 19, 10, 13, 20, 16, 17, 11]
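To show how single-linkage clustering on squared Euclidean distances works, here is a minimal sketch on six made-up points (the matrix toy and the labels obs1 to obs6 are assumptions, not our data). The function cutree then cuts the tree into a chosen number of clusters.

```r
# Six made-up observations in two well-separated groups (assumption)
toy <- rbind(
  matrix(c(1.0, 1.1, 0.9, 1.2, 1.0, 0.8), ncol = 2),
  matrix(c(5.0, 5.2, 4.9, 5.1, 5.3, 4.8), ncol = 2)
)
rownames(toy) <- paste0("obs", 1:6)

d_toy <- dist(toy, method = "euclidean")   # pairwise Euclidean distances
d_toy2 <- d_toy^2                          # squared distances
hc_toy <- hclust(d_toy2, method = "single")

# Cut the tree into two clusters and list the membership of each point
groups <- cutree(hc_toy, k = 2)
groups
```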
2. Manhattan distance:
> cluster=read.table(file="cluster.txt",header=T)
> cluster
> dist.ma=dist(cluster,method="manhattan",p=2,diag=T)
> dist.ma
[Figure: dendrogram; leaf labels 12, 15, 14, 18, 19, 10, 13, 20, 16, 17, 11, 9]
> cluster2=cluster[-12,]
> dist.ma2=dist(cluster2,method="manhattan",p=2,diag=T)
> dist.ma2
[Figure: dendrogram; leaf labels 15, 14, 18, 19, 10, 13, 20, 16, 17, 11, 9]
3. Maximum distance
> cluster =read.table(file="cluster.txt",header=T)
> dist.mi= dist(cluster,method="maximum",p=2,diag=T)
> s=hclust(dist.mi, method = "single")
> plot(s, hang = -0.1, frame.plot = TRUE, ann = FALSE)
> cluster2=cluster[-12,]
> dist.mi2= dist(cluster2,method="maximum",p=2,diag=T)
> dist.mi2
[Figure: dendrogram; leaf labels 12, 14, 10, 18, 19, 15, 13, 20, 16, 17, 11]
[Figure: dendrogram; leaf labels 15, 14, 18, 19, 10, 13, 20, 16, 17, 11]
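The difference between the three distance measures used above is easiest to see on a single pair of points. The sketch below uses two made-up points, a = (0, 0) and b = (3, 4), which are assumptions for illustration and not part of our data.

```r
# Two made-up points (assumption)
pts <- rbind(a = c(0, 0), b = c(3, 4))

d_euclid <- as.numeric(dist(pts, method = "euclidean"))  # sqrt(3^2 + 4^2)
d_manhat <- as.numeric(dist(pts, method = "manhattan"))  # |3| + |4|
d_max <- as.numeric(dist(pts, method = "maximum"))       # max(|3|, |4|)

c(euclidean = d_euclid, manhattan = d_manhat, maximum = d_max)
```

For these two points the Euclidean distance is 5, the Manhattan distance is 7 and the maximum distance is 4, so the same pair of observations can look closer or further apart depending on the measure chosen.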