
PALGRAVE STUDIES

IN DEMOCRACY,
INNOVATION, AND
ENTREPRENEURSHIP
FOR GROWTH

ANALYTICS, INNOVATION,
AND EXCELLENCE-DRIVEN
ENTERPRISE SUSTAINABILITY

Edited by Elias G. Carayannis
and Stavros Sindakis
Palgrave Studies in Democracy, Innovation, and
Entrepreneurship for Growth

Series Editor
Elias G. Carayannis
School of Business
George Washington University
Washington, DC, USA
The central theme of this series is to explore why some geographic areas
grow and others stagnate over time, and to measure the effects and impli-
cations in a trans-disciplinary context that takes both historical evolution
and geographical location into account. In other words, when, how, and
why do the nature and dynamics of a political regime inform and shape
the drivers of growth, and especially innovation and entrepreneurship? In this
socio-economic, socio-political, and socio-technical context, how could we
best achieve growth, financially and environmentally? This series aims to
address key questions framing policy and strategic decision-making at firm,
industry, national, and regional levels, such as:

How does technological advance occur, and what are the strategic
processes and institutions involved?
How are new businesses created? To what extent is intellectual prop-
erty protected?
Which cultural characteristics serve to promote or impede innovation?
In what ways is wealth distributed or concentrated?

A primary feature of the series is to consider the dynamics of innova-
tion and entrepreneurship in the context of globalization, with particu-
lar respect to emerging markets, such as China, India, Russia, and Latin
America. (For example, what are the implications of China's rapid transition
from providing low-cost manufacturing and services to becoming an inno-
vation powerhouse? How sustainable financially, technologically, socially,
and environmentally will that transition prove? How do the perspectives
of history and geography explain this phenomenon?) Contributions from
researchers in a wide variety of fields will connect and relate the relation-
ships and inter-dependencies among

Innovation,
Political Regime, and
Economic and Social Development.

We will consider whether innovation is demonstrated differently across
sectors (e.g., health, education, technology) and disciplines (e.g., social
sciences, physical sciences), with an emphasis on discovering emerging
patterns, factors, triggers, catalysts, and accelerators to innovation, and
their impact on future research, practice, and policy. This series will delve
into the sustainable and sufficient growth mechanisms for the
foreseeable future for developed, knowledge-based economies and societ-
ies (such as the EU and the US) in the context of multiple, concurrent,
and inter-connected tipping-point effects with short (MENA) as well
as long (China, India) term effects from a geo-strategic, geo-economic,
geo-political, and geo-technological (GEO-STEP) set of perspectives.
This conceptualization lies at the heart of the series, and offers to explore
the correlation between democracy, innovation, and entrepreneurship for
growth. Proposals should be sent to Elias Carayannis at caraye@gwu.edu.

More information about this series at http://www.springer.com/series/14635
Elias G. Carayannis and Stavros Sindakis
Editors

Analytics, Innovation, and Excellence-Driven
Enterprise Sustainability
Editors
Elias G. Carayannis
Department of Information Systems and Technology Management
George Washington University
Washington, District of Columbia, USA

Stavros Sindakis
School of Business
American University in Dubai
Dubai, UAE

Palgrave Studies in Democracy, Innovation, and Entrepreneurship for Growth


ISBN 978-1-137-39301-2    ISBN 978-1-137-37879-8 (eBook)
DOI 10.1057/978-1-137-37879-8

Library of Congress Control Number: 2016957534

© The Editor(s) (if applicable) and The Author(s) 2017


This work is subject to copyright. All rights are solely and exclusively licensed by the
Publisher, whether the whole or part of the material is concerned, specifically the rights of
translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on
microfilms or in any other physical way, and transmission or information storage and retrieval,
electronic adaptation, computer software, or by similar or dissimilar methodology now
known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are
exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information
in this book are believed to be true and accurate at the date of publication. Neither the pub-
lisher nor the authors or the editors give a warranty, express or implied, with respect to the
material contained herein or for any errors or omissions that may have been made. The
publisher remains neutral with regard to jurisdictional claims in published maps and institu-
tional affiliations.

Cover image © Katja Piolka / Alamy Stock Photo

Printed on acid-free paper

This Palgrave Macmillan imprint is published by Springer Nature


The registered company is Nature America Inc.
The registered company address is: 1 New York Plaza, New York, NY 10004, U.S.A.
Foreword

Being an active part of the business environment for a couple of decades
means that I have witnessed and experienced several stages and trends in
business. Going back a couple of decades, one can realize the differences
not only in styles of management but also in the way that companies grew
and developed their businesses. Today's business environment has dra-
matically changed. Technology enabled that. People enable it.
The past trend where companies wanted to fight alone and keep their
traditional approaches ("if it works, don't fix it") has ended. The change
in times leads to a change in companies. To withstand these changing times
and be successful, companies have to adopt an innovative approach sus-
tained by robust data (information) that can lead to sustained strategies.
Information and knowledge play an essential role, and companies need to
establish alliances and access information that can lead them to sustain-
ability. So, instead of saying "if it works, don't fix it," companies should be
saying "what else can I do?" or, better yet, "how can we be creative and
work together?"
This is what this book is all about. The authors will take the reader
through a fascinating journey of discussion and reflection on innova-
tiveness and competitiveness gained by means of alliances that can cre-
ate value and growth. They will also discuss the importance of digital
technology as a tool for creating new experiences and processes within
organizations.


This book is a complete work that will no doubt make you rethink
your business approach in this new challenging and interesting time. I am
pleased to commend it to readers.

Miguel Dias Costa,
THM School
London, UK
Contents

1 Analytics, Innovation, and Excellence-driven Enterprise
Sustainability in a Dynamic Era 1
Stavros Sindakis

2 Business Intelligence and Analytics: Big Systems for
Big Data 7
Herodotos Herodotou

3 Business Analytics for Price Trend Forecasting through
Textual Data 51
Marco Pospiech and Carsten Felden

4 Market Research and Predictive Analytics: Using Analytics
to Measure Customer and Marketing Behavior in Business
Ventures 77
D. Anthony Miles

5 Strategic Planning Revisited: Acquisition and Exploitation
of Information on Foreign Markets 109
Myropi Garri and Nikolaos Konstantopoulos

6 Innovation in the Open Data Ecosystem: Exploring the
Role of Real Options Thinking and Multi-sided Platforms
for Sustainable Value Generation through Open Data 137
Thorhildur Jetzek

7 Sustainability-Oriented Business Model Assessment: A
Conceptual Foundation 169
Florian Lüdeke-Freund, Birte Freudenreich, Iolanda Saviuc,
Stefan Schaltegger, and Marten Stock

8 Smart Decision-Making and Productivity in the
Digital World: The Case of PATAmPOWER 207
Alexander Rayner

9 Change Management: Planning for the Future
and the Competitive Environment 225
Konstantinos Biginas

10 EU Operational Program Education for Competitiveness
and Its Impact on Sustainable Development 255
Petr Svoboda and Jan Cerny

11 Applying Data Analytics for Innovation and Sustainable
Enterprise Excellence 271
Stavros Sindakis

Index 277
List of Contributors

Konstantinos Biginas is the assistant dean as well as a lecturer at London College
of International Business Studies. He has extensive teaching and research experi-
ence, mainly from his positions at the External Programmes of the University of
Central Lancashire, the University of London, and State University of New York
Empire State College. Biginas has considerable experience in both business eco-
nomics and international management fields. Teaching has been an important part
of his career. He has teaching experience in several types of courses, including
first-, second-, and third-year business economics, management, and marketing
classes as well as postgraduate business and management classes in a number of
academic institutions. Biginas holds an undergraduate degree in Economics. After
the completion of his postgraduate studies, he commenced a PhD. He has
researched and written extensively on international management, global competi-
tion, FDI strategies, and innovation management. He has published numerous
articles and essays.
Jan Cerny is a full professor of managerial science at the Faculty of
Management, University of Economics in Prague and, simultaneously, a visit-
ing professor of transport management at the University of Pardubice, both in
the Czech Republic. His research interests include network management and,
recently, the management of education. He is the author of the well-known
Cerny Conjecture concerning networks, which has neither been proved nor
disproved yet. As author or co-author, he has written 6 books and more than
160 research papers.
Carsten Felden is the director of the Institute of Information Science and dean of
the Faculty of Business Administration at TU Freiberg. He is a reviewer for national
and international journals, author of more than 100 publications, member of sev-
eral committees, and co-founder of the German group Computer Science within
the Utility Sector. He is the chief executive officer of his own business IT consult-
ing company. His research interests include business analytics, business intelli-
gence, eXtensible Business Reporting Language (XBRL), and e-science.
Birte Freudenreich is a researcher at the Centre for Sustainability Management
(CSM) at the Leuphana University of Lüneburg. Her research focuses on business
models and on managing their contribution to sustainable development. She holds
degrees in both Environmental Sciences and Strategic Leadership towards
Sustainability. Having worked in the sustainability management field for several
years, Freudenreich takes a keen interest in the applicability of her research in busi-
ness management practice.
Myropi Garri is a Senior Lecturer in Strategic Management at Portsmouth
Business School, University of Portsmouth. Her scientific interests focus on strate-
gic management, internationalization strategies, business administration, human
resources management, and public policy. Her studies and articles have been pub-
lished in scientific journals and in edited volumes.
Herodotos Herodotou is a tenure-track lecturer in the Department of Electrical
Engineering and Computer Engineering and Informatics (EECEI) at the Cyprus
University of Technology. He received his PhD in Computer Science from Duke
University in May 2012. His research interests are in large-scale data processing
systems and database systems. In particular, his work focuses on ease-of-use, man-
ageability, and automated tuning of both centralized and distributed data-inten-
sive computing systems. In addition, he is interested in applying database
techniques in other areas like scientific computing, bioinformatics, and numerical
analysis. His work experience includes research positions at Microsoft Research,
Yahoo! Labs, and Aster Data as well as software engineering positions at Microsoft
and RWD Technologies. He is the recipient of the SIGMOD Jim Gray Doctoral
Dissertation Award Honorable Mention, the Outstanding PhD Dissertation
Award in Computer Science at Duke, the Steele Endowed Fellowship, and the Cyprus
Fulbright Commission Scholarship.
Thorhildur Jetzek is a postdoctoral research fellow at the Department of IT
Management, Copenhagen Business School. She has an M.Sc. in Economics and a
PhD in Information Systems Management. During her PhD she worked at the IT
company KMD in Denmark, studying the implementation of an open data infra-
structure in the Danish public sector. Her current research focuses on value creation
through open and big data, with special attention to the role of digital platforms.
Building on 15 years of experience from the IT industry, Thorhildur strives to find
synergies between academic research and practical experiences in order to further
our understanding of how data and IT can be used to create value for society.
Nikolaos Konstantopoulos is an associate professor in the Department of Business
Administration at the University of the Aegean Business School. His research
interests include small business management, entrepreneurship and strategic deci-
sion making, and corporate communication. Konstantopoulos has extensively
published his research work in a number of academic journals.
Florian Lüdeke-Freund is a senior research associate at the University of Hamburg,
Faculty of Business, Economics and Social Sciences, and a research fellow at the
CSM at the Leuphana University of Lüneburg. He holds a PhD in Economics and
Social Sciences for his thesis on "Business Models for Sustainability Innovation".
His main research interests are sustainable entrepreneurship, corporate sustainabil-
ity, and innovation management with a particular focus on business models. In
2013, he founded www.SustainableBusinessModel.org as an international research
platform at the intersections of business model and sustainability research.
D. Anthony Miles is a visiting professor at the School of Business and Leadership
at Our Lady of the Lake University. He is also the CEO/Founder of Miles
Development Industries Corporation, a consulting practice and venture capital
acquisition firm. He is a nationally known expert in Entrepreneurship and
Marketing. In addition, he is a legal expert, where he provides expert witness tes-
timony for local, state, and federal court cases. He provides expert testimony in the
areas of Business, specifically with startup ventures and Marketing. In 2014, he
appeared as a guest expert on The Michael Dresser Show. He won the Best Research/
Paper Award for Research in Marketing at the 2014 Academy of Business Research
(ABR) Conference. In 2010, he won the Student Recognition for Teaching
Excellence Award from the Texas A&M University System, while at Texas A&M
University-San Antonio. He has over 20 years of industry experience in retail,
banking, financial services, and the non-profit sector. He has held positions with
Fortune 500 companies. He holds a PhD/MBA in Entrepreneurship and General
Business Administration from the University of the Incarnate Word (USA), and he
has four professional business certifications: Management Consultant Professional
(MCP), Registered Business Analyst (RBA), Certified Chartered Marketing
Analyst (CMA), and Master Business Consultant (MBC). He has published in
numerous journals, refereed publications, and authored two books.
Marco Pospiech is a research assistant and instructor at the Institute of Information
Science. He supervises the Competence Center Energy, manages third-party
funded projects, and is an independent IT consultant. He is the author of several
publications. His research interests include big data and data mining.
Iolanda Saviuc holds degrees in industrial engineering and in public and private
environmental management, and has gained professional experience in CSR, data
analysis and in the renewable energy sector. Her research focuses on sustainability,
green energy, and on developing assessment frameworks of environmental initiatives.
Stefan Schaltegger is Full Professor of Sustainability Management and Head of
the CSM and the MBA Sustainability Management at Leuphana University of
Lüneburg, Germany. His research deals with corporate sustainability management
with a special focus on performance measurement, accounting, management
methods, strategic and stakeholder management, and business practices in sustain-
ability management.
Stavros Sindakis has experience in dynamic academic and professional envi-
ronments. In addition to his experience in the healthcare sector in Greece, Stavros
has participated in consulting and research projects with Ortelio (UK), AIS
Telecommunications (Thailand), Laureate Online Education (Netherlands), and
other companies. He is Assistant Professor of Management at the American
University (Dubai), and he also teaches management and business courses at the
University of Roehampton London (UK), New School of Architecture & Design
(San Diego, California), and Anaheim University (Anaheim, California).
Stavros has co-authored scholarly books published by Palgrave Macmillan
(Entrepreneurial Rise in Southeast Asia and Analytics, Innovation and Excellence-
Driven Enterprise Sustainability) and contributed a chapter to the World Scientific
Publishing volume Intra-organizational Knowledge Flows. Stavros's academic work
has been published in the Journal of Knowledge Management, Journal of Technology
Transfer, The Asian Society of Management and Marketing Research (ASMMR),
Journal of the Knowledge Economy, International Journal of Knowledge and Systems
Science, and other journals.
Marten Stock studied business administration at the University of Hamburg and
Leuphana University of Lüneburg, and International Material Flow Management at
the University of Applied Sciences Trier. He is gaining professional experience in the
field of material flow management and life cycle assessment, working as a consultant.
Petr Svoboda received a BSc and an MSc in Management and Economics from the
Faculty of Management, University of Economics in Prague, Czech Republic, in
2009 and 2011, respectively. He is a PhD student at the same university. His research
interests include the management of education and innovative marketing strategies.
Alexander Rayner is the CEO of SmartData.travel Limited, which focuses on making
data useful. With over 30 years' experience in travel and tourism in operational,
policy, and strategic roles, over the past decade Alex has worked with the United
Nations, the Pacific Asia Travel Association (PATA), governments, and recently the
Asian Development Bank. During his time at PATA, Alex invented and developed
PATAmPOWER, a Data as a Service (DaaS) software platform that aggregates
data about the Asia Pacific visitor economy. Alex is a graduate of the University of
Technology Sydney (UTS) and is a visiting professor at Thammasat University in
Thailand. After serving as a member of the World Economic Forum Global Agenda
Council on New Models of Travel & Tourism, Alex continues to be a member of
the WEF's Expert Network.
List of Figures

Fig. 2.1 Parallel join types 15


Fig. 2.2 Hadoop ecosystem for big data analytics 21
Fig. 2.3 Hadoop architecture 22
Fig. 2.4 MapReduce job execution 25
Fig. 2.5 Dryad system architecture and execution 31
Fig. 2.6 SAP HANA architecture 36
Fig. 2.7 Dremel architecture and execution inside a server node 38
Fig. 3.1 General training process 55
Fig. 3.2 Trend calculation within the price forecast process 57
Fig. 3.3 Text mining example 60
Fig. 3.4 SVM 61
Fig. 3.5 General live process of the business analytics approach 63
Fig. 3.6 Computed results for the electricity market 66
Fig. 3.7 RapidMiner process 69
Fig. 3.8 Details of best model 70
Fig. 3.9 Graphical user interface 72
Fig. 4.1 Marketing Analytic Equation Model (MAEQ) 86
Fig. 4.2 Conceptual model of study: Path analysis of firm variable on
analytics 91
Fig. 4.3 SEM path analysis results for the MACS instrument
(k = 10 Items) 96
Fig. 6.1 Model of sustainable value generation in the open data
ecosystem 159
Fig. 7.1 Relationships of economic and social and/or ecological
performance (Adapted from Schaltegger and Synnestvedt
(2002: 341); Schaltegger and Burritt 2005) 173


Fig. 7.2 The location of the business model within management


levels and processes (Lüdeke-Freund 2009: 18) 182
Fig. 7.3 The five generic business model logics 185
Fig. 7.4 Basic perspectives of the balanced scorecard concept (Kaplan and
Norton 1996: 9) 188
Fig. 7.5 Basic layout of an SBSC with fifth, non-market perspective
(Figge et al. 2002) 190
Fig. 7.6 The basic SUST-BMA framework 192
Fig. 7.7 Illustration of a materiality matrix 197
Fig. 8.1 Left: Cover page of the PATA 1st annual statistical report 209
Fig. 8.2 Right: Cover page of the PATA annual tourism monitor 2015
early edition 209
Fig. 9.1 S-C-P diagram 237
List of Tables

Table 2.1 The system categories, subcategories, and example
systems (in alphabetical order) for large-scale data analytics 9
Table 3.1 Market data (Pospiech and Felden 2014) 65
Table 3.2 Gas market data 67
Table 3.3 Performance UNSTABLE/STABLE Model 70
Table 3.4 Performance UP/DOWN Model 71
Table 4.1 Model: marketing analytics and metric equations table 87
Table 4.2 Firm sociodemographic statistic results of the study 92
Table 4.3 Measurement properties (N = 123) 94
Table 4.4 AMOS path analysis coefficients and goodness-of-fit
statistics 97
Table 4.5 Correlations of observed analytics and metric items and
covariates 98
Table 4.6 Linear regression model of the firm variables' effect on
Analytic 1: Customer Credit 100
Table 4.7 Linear regression model of the firm variables' effect on
Analytic 2: Market Potential 102
Table 4.8 Linear regression model of the firm variables' effect on
Analytic 3: Customer Turnover 103
Table 4.9 Linear regression model of the firm variables' effect on
Analytic 4: Competition and Economic 104
Table 5.1 Operationalization of Dependent Variables 120
Table 5.2 Variables of Types of Information Acquired Means
per Cluster 121
Table 5.3 Logistic Regression Results for Types of Information
Obtained, and Characteristics, Strategies, and Structures
of the Firm 122


Table 5.4 Compare Means for the Institutional Information Sources
Variables 124
Table 5.5 Compare Means for the Inter-organizational and Market
Information Sources Variables 125
Table 5.6 Binary Logistic Regression Results: Institutional
Information Sources and Strategic and Structural
Characteristics of the Firm 127
Table 5.7 Binary Logistic Regression Results: Inter-Organizational
and Market Information Sources and Strategic and Structural
Characteristics of the Firm 127
Table 5.8 Information Software Marketing Strategies
Developed in Foreign Markets 128
Table 5.9 Information Software Level of Internationalization 129
Table 5.10 Information Software Strategic Complexity 129
Table 5.11 Binary Logistic Regression Results 130
Table 8.1 TIGA Value 215
CHAPTER 1

Analytics, Innovation, and Excellence-driven
Enterprise Sustainability in a Dynamic Era

Stavros Sindakis

The adoption of systems that help organizations to retain and transfer
knowledge, creating value at the same time, has become an element of
increased interest. Firms' innovativeness, and therefore competitiveness,
might improve when they establish alliances with partners who have
strong capabilities and broad social capital, allowing them to create value
and growth as well as technological knowledge and legitimacy through
new knowledge resources. Organizational intelligence integrates the tech-
nology variable into production and business systems, allowing not only
proper cooperation but also establishment of a basis in order to advance
decision-making processes, especially, those connected with the develop-
ment of innovative processes.

S. Sindakis (*)
American University in Dubai, School of Business, Dubai, UAE
e-mail: ssindakis@aud.edu

© The Author(s) 2017


E.G. Carayannis, S. Sindakis (eds.), Analytics, Innovation, and
Excellence-Driven Enterprise Sustainability, Palgrave Studies
in Democracy, Innovation, and Entrepreneurship for Growth,
DOI 10.1057/978-1-137-37879-8_1

When strategically integrated, these factors have the power to promote
enterprise resilience, robustness, and sustainability. Organizational resil-
ience may be regarded as the combined ability of an enterprise to recover
from negative shocks to its ecosystem and the rapidity with which it is
able to do so; hence, resilience manifests along a spectrum. In contrast,
organizational robustness is not so much enterprise ability to recover from
such shocks, but rather resistance or immunity to their impact. Enterprise
sustainability has been cast in many lights, but has progressively come to
be associated with sustained and sustainable performance in the familiar
triple bottom line areas of social responsibility, environmental compliance
and care, and financial security that are often collectively referred to as
people, planet, and profit. Nowadays, it is often perceived that globaliza-
tion serves as both a catalyst of accelerated development and an agent of
chaotic disruption resulting in socioeconomic and political dislocations.
In light of this, a key idea is that heterogeneity may be understood as
a mind-set and a practice where complexity and diversity are leveraged
strategically in a manner that promotes sustainable entrepreneurship and
intrapreneurship, thus contributing to resilience, robustness, and sustain-
ability across multiple levels.
In this context, the present book offers a unique view on innovativeness
and competitiveness that improve when organizations establish alliances
with partners who have strong capabilities and broad social capital, allow-
ing them to create value and growth as well as technological knowledge
and legitimacy through new knowledge resources. Additionally, the value
of digital technology, at both personal and industrial levels, leads to new
opportunities that emerge for creating experiences, processes, and orga-
nizational forms that fundamentally reshape organizations. For example,
organizational intelligence systems have become versatile enough to accumulate
internal information and track environmental changes, utilizing the insights
that emerge from the transformation of data into knowledge of strategic
value. Moreover, this book aims to show that organizational resilience is
linked to organizational competitiveness and robustness via organizational
intelligence and knowledge, information and data analytics for organiza-
tional intelligence competences and capabilities. Although there is a rela-
tionship between organizational resilience and organizational robustness,
they neither are identical nor are of necessity fully compatible: that is, a
set of strategies and actions that maximize resiliency may not be identi-
cal to the set of strategies and actions maximizing robustness. As such, a
critical organization design consideration is determination of an enterprise
form that jointly optimizes resilience and robustness. Whenever there are
differences in the sets of strategies and actions maximizing resiliency and
robustness, the organization should exercise care to elaborate and make
informed choices among the trade-offs between resiliency and robustness
that ultimately constrain any choice of strategies, actions, and organization
design. Overall, this book provides a unique perspective on how
knowledge, information, and data analytics create opportunities and chal-
lenges for sustainable enterprise excellence. It also illustrates the impor-
tance of knowledge, information, and data analytics for organizational
intelligence and entrepreneurial competitiveness.
This volume consists of 11 chapters, exploring and discussing the
importance of business intelligence and analytics and their impact on mar-
ket research, business ventures, organizational sustainability, and enter-
prise excellence. An alternative perspective of strategic planning is also
discussed, considering the power of information on foreign markets as well
as the dynamics of open data, innovation, and sustainable value genera-
tion. Finally, we investigate the role of data science in the decision-making
process, discuss and assess the novel business model of sustainability ori-
entation, and investigate the correlation of sustainability and excellence
in higher education institutions under a given operational programme
for competitiveness. More specifically, the second chapter explores the
value of converting big data into useful information and knowledge, and
examines the design principles and core features of systems for analysing
large datasets for business purposes, aiming at meeting the demand for
interactive analytics, a new class of systems that combine analytical and
transactional capabilities. Chapter 3 explores the underdeveloped field of
business analytics for price trend forecasting through the utilization of tex-
tual data. The study aims at identifying methods of exploiting data analyt-
ics, which enable and support traders in maximizing their business profits.
Developing various assumptions and evaluating existing solutions in price
trend forecasting, the study introduces a novel approach of applying news
tickers for price trend forecasts in the energy market: a method, which is
applicable in any domain where important events have to be considered
instantly. Considering the value of data analytics from another viewpoint,
Chap. 4 discusses the benefits of marketing analytics and metrics in female-
owned business enterprises, focusing on customer behaviour and market
behaviour patterns. The study reveals specific marketing analytics to have
significant value in both customer behaviour and marketing behaviour
in the female-owned business ventures. Chapter 5 reviews the current

approaches and theories of strategic management, identifying a need to
revisit the foundations of the strategy formulation process, and proposing
innovative thinking as to the ways by which managers and entrepreneurs
can adopt real contemporary practices to develop successful strategies in
foreign markets. More specifically, this study aims at understanding the
exploitation of latest technology for market research purposes as a value-
adding element for the firm, leading to the creation of successful strategies
in foreign markets as well as investigates the evolution of technology's effect
on the process of information obtainment and processing. Taking the data
concept one step further, Chap. 6 explores the prospects of innovation,
business development, and wealth creation through open data. The study
takes into consideration the real options theory and the theory of two-
sided markets to explain and analyse the complex relationships between
innovation and value generation in the open data ecosystem. Among
the contributions of the study is the finding that private sector initiatives
would benefit from open government data, stimulating innovation activity
and investment in the open data ecosystem. In other words, governments
should set the ground for the development of a data-oriented ecosystem
that helps private companies to use the data as a resource to provide free
information and generate value by utilizing the two-sided markets type of
business models as well as by capitalizing on the resulting positive network
externalities. In this regard, Chap. 7 examines the ways by which business
models effectively support sustainable development as a consequence of
the measurability and manageability of business model effects. A novel
conceptual framework is developed in the study, illustrating methods for
sustainability-oriented business model assessments, aiming at addressing
the identified research gap regarding the appropriate approaches for the
assessment and management of business models and their contribution to
the development of the civil society in a sustainable environment. Chapter
8 analyses the case and discusses the benefits of PATAmPOWER, a data as
a service software platform, which is an interactive and user-friendly online
tool, enabling the dynamic selection of indicators about visitor arrivals, ori-
gin markets, expenditure, accommodation, aviation, digital engagement,
and forecasts in the Asia Pacific region. This study also investigates ways in
which PATAmPOWER can evolve into a software as a service by creating
customized data platforms. Such instruments help in the identification of additional business opportunities by exploiting and combining available data and information, leading to the development of customized offerings and to sustainable development. Chapter 9, in turn, underlines the fast-paced competitive business environment of the present and coming era. Modern companies' responses to several innovative adaptations and the analysis of different market structures are combined to put the business function on the correct development path, based on the strategies discussed. Considering the value of higher education in
sustainable business growth, Chap. 10 examines the correlation between
sustainability and excellence and discusses the impact on business innova-
tion. In particular, the study outlines the substance of sustainable develop-
ment and analyses the conditions of sustainability and excellence in higher
education institutions. The authors examine the case of the University
of Economics in Prague, which participates in the EU Operational
Programme Education for Competitiveness, aiming at portraying ways
of advancing the innovation process in higher education institutions.
In the final chapter, Chap. 11, the evolution of organizational intelligence and analytical applications is discussed; these are two elements associated with the science of big data analytics. This association, together with several studies, leads to new knowledge and novelty for radical innovation, as well as to a focus on current knowledge that maintains incremental innovation as a competitive advantage in the information sphere.
CHAPTER 2

Business Intelligence and Analytics: Big Systems for Big Data

Herodotos Herodotou

2.1 Introduction
Modern industrial, government, and academic organizations are collecting
massive amounts of data (big data) at an unprecedented scale and pace.
Many enterprises continuously collect records of customer interactions,
product sales, results from advertising campaigns on the Web, and other
types of information. Powerful telescopes in astronomy, particle accelera-
tors in physics, and genome sequencers in biology are putting massive vol-
umes of data into the hands of scientists (Cohen et al. 2009; Thusoo et al.
2009). The ability to perform timely and cost-effective analytical process-
ing of such large datasets to extract deep insights is now a key ingredient
for success. These insights can drive automated processes for advertise-
ment placement, improve customer relationship management, and lead to
major scientific breakthroughs (Frankel and Reid 2008).

H. Herodotou (*)
Department of Electrical Engineering, Computer Engineering and Informatics (EECEI), Cyprus University of Technology, Limassol, Cyprus
e-mail: herodotos.herodotou@cut.ac.cy

© The Author(s) 2017


E.G. Carayannis, S. Sindakis (eds.), Analytics, Innovation, and
Excellence-Driven Enterprise Sustainability, Palgrave Studies
in Democracy, Innovation, and Entrepreneurship for Growth,
DOI 10.1057/978-1-137-37879-8_2
The set of techniques, systems, and tools that transform raw data into
meaningful and useful information for business analysis purposes is collec-
tively known as Business Intelligence (BI) (Chen et al. 2012). In addition
to the underlying data processing and analytical techniques, BI includes
business-centric practices and methodologies that can be applied to vari-
ous high-impact applications such as e-commerce, market intelligence,
healthcare, and security. The more recent explosion of data has led to the
development of advanced and unique data storage, management, analysis, and visualization technologies, termed big data analytics, in order to serve applications that are so large (from terabytes to exabytes) and complex (from sensor to social media data) that they could not be served effectively with previous technologies. Big data analytics can give organizations
an edge over their rivals and lead to business rewards, including more
potent promotion and enhanced revenue.
Existing database systems are adapting to the new status quo while
large-scale data analytical systems, like MapReduce (Dean and Ghemawat
2008) and Dryad (Isard et al. 2007), are becoming popular for analytical
workloads on big data. Industry leaders such as Teradata, SAP, Oracle,
and EMC/Greenplum have addressed this explosion of data volumes
by leveraging more powerful and parallel hardware in combination with
sophisticated parallelization techniques in the underlying data manage-
ment software. Internet service companies such as Twitter, LinkedIn,
Facebook, Google, and others address the scalability challenge by leverag-
ing a combination of new technologies in their clusters: key-value stores,
columnar storage, and the MapReduce programming paradigm (Wu et al. 2012; Thusoo et al. 2010; Lee et al. 2012; Melnik et al. 2010). Finally,
small and medium enterprises are slowly adopting the new technologies to
satisfy their needs for identifying, developing, and otherwise creating new
strategic business opportunities.
This monograph is an attempt to cover the design principles and core
features of systems for analyzing very large datasets for business purposes.
We organize systems into four main categories (Parallel Databases, MapReduce, Dataflow, and Interactive Analytics), each with multiple subcategories, based on some major and distinctive technological innovations. The categories loosely correspond to the chronological evolution of
systems as the requirements for large-scale analytics have evolved over the
last few decades. Table 2.1 lists all categories and subcategories we discuss
along with some example systems for each subcategory.
Table 2.1 The system categories, subcategories, and example systems (in alphabetical order) for large-scale data analytics

(Sub)Category: Example systems

Parallel databases
  Row-based parallel databases: Aster nCluster, DB2 Parallel Edition, Greenplum, Netezza, Teradata
  Columnar databases: C-Store, Infobright, MonetDB, ParAccel, Sybase IQ, VectorWise, Vertica
MapReduce
  Distributed file systems: Ceph, GFS, HDFS, Kosmos, MapR, Quantcast
  MapReduce execution engines: Google MapReduce, Hadoop, HadoopDB, Hadoop++
  MapReduce-based platforms: Cascading, Clydesdale, Hive, Jaql, Pig
Dataflow
  Generalized MapReduce: ASTERIX, Hyracks, Nephele, Stratosphere
  Directed acyclic graph systems: Dryad, DryadLINQ, SCOPE, Shark, Spark
  Graph processing systems: GraphLab, GraphX, HaLoop, Pregel, PrIter, Twister
Interactive analytics
  Mixed analytical and transactional: Bigtable, HBase, HyPer, HYRISE, Megastore, SAP HANA, Spanner
  Distributed SQL query engines: Apache Drill, Cloudera Impala, Dremel, Presto, Stinger.next
  Stream processing systems: Aurora, Borealis, Muppet, S4, Storm, STREAM

2.1.1 Evolution of Data Analytics Systems


The need for improvements in productivity and decision making pro-
cesses has led to considerable innovation in systems for large-scale data
analytics. Parallel databases dating back to 1980s have added tech-
niques like columnar data storage and processing (Boncz etal. 2006;
Lamb et al. 2012), while new distributed platforms like MapReduce
(Dean and Ghemawat 2008) have been developed. Other innovations
aimed at creating alternative system architectures for more general-
ized dataflow applications, including Dryad (Isard et al. 2007) and
Stratosphere (Alexandrov et al. 2014). More recently, the grow-
ing demand for interactive analytics has led to the emergence of a
new class of systems, like SAP HANA (Färber et al. 2012a, b) and
Spanner (Corbett et al. 2012), that combine analytical and transactional
capabilities.
2.1.1.1 Parallel Database Systems


Row-based parallel databases were the first systems to make parallel data
processing available to a wide class of users through an intuitive high-level
programming model, namely SQL. High performance and scalability were
achieved through partitioning tables across the nodes in a shared-nothing
cluster. Such a horizontal partitioning scheme enabled relational opera-
tions like filters, joins, and aggregations to be run in parallel over different
partitions of each table stored on different nodes. On the other hand,
columnar databases pioneered the concept of storing data tables as sec-
tions of columns rather than rows and performing vertical partitioning.
Systems with columnar storage and processing have been shown to use
CPU, memory, and I/O resources more efficiently in large-scale data ana-
lytics compared to row-oriented systems (Lamb etal. 2012). Some of the
main benefits come from reduced I/O in columnar systems by (a) reading
only the needed columns during query processing and (b) offering better
compression. Row-based and columnar systems are discussed in Sect. 2.2.

2.1.1.2 MapReduce Systems


MapReduce is a programming model and an associated implementation
developed by Google for processing massive datasets on large clusters of
thousands of commodity servers (Dean and Ghemawat 2008). Parallel
databases have traditionally struggled to scale to such levels. MapReduce
systems pioneered the concept of building multiple stand-alone scal-
able distributed systems and then composing two or more of these sys-
tems together in order to run analytical tasks on large datasets. Typical
MapReduce systems such as Hadoop (White 2010) store data in a stand-
alone block-oriented distributed file system and run computational tasks in
a MapReduce execution engine. The MapReduce model, although highly
flexible, has been found to be too low-level for routine use by practitio-
ners such as data analysts, statisticians, and scientists (Olston et al. 2008; Thusoo et al. 2009). As a result, the MapReduce framework has evolved
rapidly over the past few years into a MapReduce stack that includes a
number of higher-level layers added over the core MapReduce engine.
Prominent examples of these higher-level layers include Hive (with an
SQL-like declarative interface), Pig (with an interface that mixes declara-
tive and procedural elements), Cascading (with a Java interface for speci-
fying workflows), Cascalog (with a Datalog-inspired interface), and BigSheets (with a spreadsheet interface). MapReduce systems are covered in Sect. 2.3.
2.1.1.3 Dataflow Systems


As MapReduce systems were being adopted for a large number of
data analysis tasks, a number of shortcomings became apparent. The
MapReduce programming model is too restrictive to express certain data
analysis tasks easily, for example, joining two datasets together. More
importantly, the execution techniques used by MapReduce systems
are suboptimal for many common types of data analysis tasks such as
relational operations, iterative machine learning, and graph processing.
Some of these problems have been addressed by replacing MapReduce
with a more generalized MapReduce execution model that contains extra
operators in addition to Map and Reduce [e.g., Hyracks (Borkar et al. 2011), Nephele (Battré et al. 2010)]. A different class of dataflow systems such as Dryad (Isard et al. 2007) and Spark (Zaharia et al. 2012)
use the directed acyclic graph (DAG) model that can express a wide range
of data access and communication patterns. Finally, graph processing sys-
tems like Pregel (Malewicz et al. 2010) are specialized in running itera-
tive computations and other analytics tasks over data graphs. Dataflow
systems are described in Sect. 2.4.
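The DAG model can be illustrated with a minimal scheduler that runs operators in a dependency-respecting (topological) order. This is a hypothetical sketch; the operator names, edge structure, and data are invented for illustration:

```python
from graphlib import TopologicalSorter  # Python 3.9+

def execute_dag(operators, edges, inputs):
    # operators: name -> function taking a list of upstream results
    # edges: name -> list of upstream operator names (the DAG's edges)
    results = dict(inputs)
    for name in TopologicalSorter(edges).static_order():
        if name not in results:  # skip pre-supplied input datasets
            results[name] = operators[name]([results[d] for d in edges[name]])
    return results

# Example DAG: scan two datasets, filter one, then combine the results.
ops = {
    "filter": lambda deps: [x for x in deps[0] if x % 2 == 0],
    "union": lambda deps: sorted(deps[0] + deps[1]),
}
edges = {"filter": ["scan_a"], "union": ["filter", "scan_b"]}
out = execute_dag(ops, edges, {"scan_a": [1, 2, 3, 4], "scan_b": [7]})
# out["union"] == [2, 4, 7]
```

A real DAG engine additionally runs independent vertices in parallel and pipelines data along the edges; the topological order only captures the correctness constraint.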

2.1.1.4 Systems for Interactive Analytics


The need to reduce the gap between the generation of data and the
generation of analytics results over this data has required system devel-
opers to constantly raise the bar in large-scale data analytics. On one
hand, this need has led to the emergence of scalable distributed storage
and computer systems that support mixed analytical and transactional
workloads, such as Spanner (Corbett et al. 2012) and Megastore (Baker
et al. 2011). Support for transactions enables storage systems in par-
ticular to serve as the data store for online services while making the
data available concurrently in the same system for analytics. The same
need led to the emergence of distributed SQL query engines that run
over distributed file systems and support ad hoc analytics. For instance,
Cloudera Impala (Wanderman-Milne and Li 2014) enables users to issue
low-latency SQL queries to data stored in Hadoop Distributed File System
(HDFS) (Shvachko et al. 2010) and Apache HBase (George 2011) with-
out requiring data movement or transformations. Finally, stream pro-
cessing systems are driven by a data-centric model that allows for near
real-time consumption and analysis of data. We discuss systems for inter-
active analytics in Sect. 2.5.
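The near real-time, data-centric flavour of stream processing can be sketched with a count-based sliding window: each arriving event immediately updates an aggregate over the most recent events. The class and event names below are hypothetical:

```python
from collections import Counter, deque

class SlidingWindowCounter:
    def __init__(self, window_size):
        self.window = deque()      # events currently inside the window
        self.counts = Counter()
        self.window_size = window_size

    def process(self, event):
        self.window.append(event)
        self.counts[event] += 1
        if len(self.window) > self.window_size:
            expired = self.window.popleft()   # evict the oldest event
            self.counts[expired] -= 1
        return dict(+self.counts)             # snapshot, zero counts dropped

counter = SlidingWindowCounter(window_size=3)
for e in ["click", "view", "click", "view"]:
    snapshot = counter.process(e)
# After the last event the window holds the 3 most recent events:
# snapshot == {"view": 2, "click": 1}
```

Production systems like Storm or S4 distribute such operators over a cluster and typically use time-based rather than count-based windows, but the consume-update-emit cycle is the same.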
2.2 Parallel Database Systems


Traditionally, Enterprise Data Warehouses (EDWs) and BI tools built on
top of database systems have been providing the means for retrieving and
analyzing large amounts of data. In this monograph, we focus on Massively Parallel Processing (MPP) Database Management Systems (DBMSs) that
run on clusters of commodity servers and provide support for big data
analytics. As these systems were developed based on centralized DBMSs,
they use the Structured Query Language (SQL) for accessing, managing,
and analyzing data. Users can specify an analysis task using a SQL query,
while the DBMS will optimize and execute the query.
In addition, database systems require that data conforms to a well-
defined schema and is stored in a specialized data store. The storage
format is the main differentiator between the two categories of parallel
database systems we consider, namely row-oriented and column-oriented
systems. For both categories, we concentrate on the technological innova-
tions that differentiate them from earlier centralized database systems and
from each other.

2.2.1 Row-based Parallel Databases


A number of research prototypes and industry-strength parallel database
systems have been built using a shared-nothing architecture over the last
three decades. Examples include Gamma (DeWitt et al. 1990), Pivotal
Greenplum Database (Greenplum 2013), IBM DB2 Parallel Edition
(Baru et al. 1995), Netezza (IBM Netezza 2012), and Teradata (Teradata
2012). Given the parallel nature of the aforementioned systems, we focus
primarily on two key system aspects: (a) parallel data storage and (b) paral-
lel query execution.

2.2.1.1 Parallel Data Storage


The relational data model and SQL query language have the crucial ben-
efit of data independence, that is, SQL queries can be executed correctly
irrespective of how the data in the tables is physically stored in the system.
There are four noteworthy aspects of physical data storage in paral-
lel databases: (a) partitioning, (b) declustering, (c) collocation, and (d)
replication.
Table partitioning refers to the technique of distributing the tuples of a
table across disjoint fragments (or partitions) and is a standard feature in
parallel database systems today (IBM Corporation 2007; Morales 2007; Talmage 2009). The most common types of partitioning are:

Range partitioning, where tuples are assigned to partitions based on value ranges of one or more attributes.
Hash partitioning, where tuple assignment is based on the result of a hash function applied to one or more attributes.
List partitioning, where the unique values of one or more attributes in each partition are specified.
Random partitioning, where tuples are assigned to partitions in a random fashion.
Round-robin partitioning, where tuples are assigned to partitions in a round-robin fashion.
Block partitioning, where each consecutive block of tuples (or bytes) written to a table forms a partition.
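Three of the schemes above can be sketched as simple assignment functions mapping a tuple's key (or arrival index) to a partition number. This is a hypothetical Python illustration; the function names and boundary values are invented:

```python
import hashlib

def hash_partition(key, n_parts):
    # Hash partitioning: a stable hash of the attribute value.
    digest = hashlib.md5(str(key).encode()).hexdigest()
    return int(digest, 16) % n_parts

def range_partition(key, boundaries):
    # Range partitioning; boundaries [100, 200] define three ranges:
    # key < 100, 100 <= key < 200, and key >= 200.
    for i, b in enumerate(boundaries):
        if key < b:
            return i
    return len(boundaries)

def round_robin_partition(arrival_index, n_parts):
    # Round-robin partitioning based on the order tuples are written.
    return arrival_index % n_parts

assert hash_partition("cyprus", 4) in range(4)
assert range_partition(150, [100, 200]) == 1
assert round_robin_partition(7, 3) == 1
```

Note the use of a stable hash (here MD5) rather than Python's built-in `hash`, mirroring the requirement that the same key always lands in the same partition across processes and nodes.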

Benefits of partitioning range from more efficient loading and removal of data on a partition-by-partition basis to finer control over the choice of
physical design, statistics creation, and storage provisioning based on the
workload. Deciding how to partition tables, however, is now an involved process where multiple objectives (for example, getting fast data loading along with good query performance) and constraints (for example, on the maximum size or number of partitions per table) may need to be met (Herodotou et al. 2011). Various table partitioning schemes as well
as techniques to find a good partitioning scheme automatically have been
proposed as part of database physical design tuning (Agrawal et al. 2004; Rao et al. 2002).
The next task after table partitioning is deciding which node or nodes
in the cluster should store each partition of the tables in the database. The
number of nodes across which a table is distributed is called the degree
of declustering. When that number equals the number of nodes in the system, the table is said to be fully declustered; otherwise, it is partially declustered (DeWitt et al. 1990). With partial declustering, nodes are typically grouped in sets, called node groups (Baru et al. 1995) or relation clusters (Hsiao and DeWitt 1990), that can be referenced by name.
Each table is then assigned to one such group. Note that it is possible to
have multiple tables assigned to the same group but one table cannot be
assigned to multiple groups.
Having selective overlap among the nodes (or the group) on which
the partitions of two or more tables are stored can be beneficial, espe-
cially for join processing. Consider two tables R(a, b) and S(a, c), where
a is a common attribute. Suppose both tables are hash partitioned on the
respective attribute a using the same hash function and the same number
of partitions. Further, suppose the partitions of tables R and S are both
stored on the same group of nodes. In this case, there will be a one-to-one
correspondence between the partitions of both tables that can join with
one another on attribute a. That is, any pair of joining partitions will be
stored on the same node of the group. Under these conditions, the two
tables R and S are said to be collocated. The advantage of collocation is
that tables can be joined without the need to move any data from one
node to another.
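Collocation can be illustrated with a small sketch: both tables are hash partitioned on the shared attribute a with the same function and the same number of partitions, so each node joins its local partitions without any data movement. The data and helper name below are invented for illustration:

```python
def hash_partition_table(rows, key_idx, n_nodes):
    # Assign each row to a node by hashing the partitioning attribute.
    parts = [[] for _ in range(n_nodes)]
    for row in rows:
        parts[hash(row[key_idx]) % n_nodes].append(row)
    return parts

R = [(1, "b1"), (2, "b2"), (3, "b3")]   # R(a, b)
S = [(1, "c1"), (3, "c3")]              # S(a, c)
n_nodes = 2
R_parts = hash_partition_table(R, 0, n_nodes)  # same hash function,
S_parts = hash_partition_table(S, 0, n_nodes)  # same number of partitions

# Collocated join: each node joins only its own local partitions.
result = []
for node in range(n_nodes):
    local_s = {s[0]: s for s in S_parts[node]}
    for r in R_parts[node]:
        if r[0] in local_s:
            result.append(r + local_s[r[0]][1:])
# sorted(result) == [(1, "b1", "c1"), (3, "b3", "c3")]
```

Because joining partitions of R and S always land on the same node, the per-node loop never consults another node's data, which is exactly the property a collocated join exploits.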
In addition to collocation, data replication can often provide perfor-
mance benefits, both for join processing and for the concurrent execution
of multiple queries. Replication is usually done at the table level in two
scenarios. When a table is small, it can be replicated on all nodes in the
cluster or a group. Such replication is common for dimension tables in
star and snowflake schemas so that they can easily join with the partitions
of the distributed fact table(s). Replication can also be done such that dif-
ferent replicas are partitioned differently. For example, one replica of the
table may be hash partitioned while another may be range partitioned for
speeding up multiple workloads with different access and join patterns.
Apart from performance benefits, replication also helps reduce unavail-
ability or loss of data when faults arise in the parallel database system (e.g.,
a node fails permanently or becomes disconnected temporarily from other
nodes due to a network failure).
The diverse mix of partitioning, declustering, collocation, and replica-
tion techniques available can make it confusing for users of parallel database
systems to identify the best data layout for their workload. This problem
has motivated research on automated ways to recommend good data lay-
outs based on the workload (Mehta and DeWitt 1997; Rao et al. 2002) and on partition-aware optimization techniques to generate efficient plans for SQL queries over partitioned tables (Herodotou et al. 2011).

2.2.1.2 Parallel Query Execution


When a SQL query is submitted to the database system, the query
optimizer is responsible for generating a parallel execution plan for the
query. The plan is composed of operators that support both intra- and
inter-operator parallelism, as well as mechanisms to transfer data from producer operators to consumer operators. The plan is broken down into
schedulable tasks that are run on the nodes in the system. Upon comple-
tion of the plan, the results are transferred back to the user or application
that submitted the query.
Parallel database systems employ multiple forms of parallelism in execu-
tion plans, including join, partitioned, pipelined, and independent parallel-
ism. Join parallelism refers to the type of join used to execute table joins
and depends primarily on the partitioning, declustering, collocation, and
replication techniques used for storing the data. We discuss four main join
types (illustrated in Fig. 2.1) for joining two tables R and S based on the
equi-join condition R.a = S.a.

Fig. 2.1 Parallel join types

Collocated join: A collocated join can be used only when tables R and S are both partitioned on attribute a and the partitions are assigned such that any pair of joining partitions is stored on the same node. A collocated join operator is often the most efficient way to perform the join because it performs the join in parallel on each node while avoiding the need to transfer data between nodes.
Directed join: Suppose tables R and S are both partitioned on attri-
bute a but the respective partitions are not collocated. In this case,
a directed join can transfer each partition of one table (say R) to the
node where the joining partition of the other table is stored. Once
a partition from R is brought to where the joining partition in S is
stored, a local join can be performed. Compared to a collocated join,
a directed join incurs the cost of transferring one of the tables across
the network.
Repartitioned join: If tables R and S are not partitioned on the
joining attribute, then the repartitioned join is used. This join sim-
ply repartitions the tuples in both tables using the same partitioning
condition (e.g., hash). Joining partitions are brought to the same
node where they can be joined. This operator incurs the cost of
transferring both tables across the network.
Broadcast join: When tables R and S are not partitioned on the
joining attribute but one of them (say R) is very small, then the
broadcast join will transfer R in full to every node where any par-
tition of the other table (S) is stored. The join is then performed
locally. This operator incurs a data transfer cost equal to the size of R
times the degree of declustering of S.
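The relative network costs of the four strategies can be compared with a back-of-the-envelope sketch. The function below is hypothetical; sizes are abstract units, d_S denotes the degree of declustering of S, and the directed join is assumed to ship the smaller table:

```python
def join_transfer_cost(strategy, size_R, size_S, d_S):
    # Data transferred over the network by each join strategy, following
    # the descriptions above (illustrative only; real optimizers also
    # weigh CPU, memory, and skew).
    costs = {
        "collocated": 0,                   # no data movement at all
        "directed": min(size_R, size_S),   # ship one table's partitions
        "repartitioned": size_R + size_S,  # reshuffle both tables
        "broadcast": size_R * d_S,         # all of R to every node holding S
    }
    return costs[strategy]

# Broadcasting pays off only when R is small relative to S:
assert (join_transfer_cost("broadcast", 1, 1000, 8)
        < join_transfer_cost("repartitioned", 1, 1000, 8))
```

A cost model of this shape is how an optimizer chooses among the join types: collocation is free, broadcast wins for a tiny build table, and repartitioning is the general fallback.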

A typical issue with join processing is the presence of skew in partition


sizes. Hash or range partitioning can produce skewed partition sizes if
the attribute used in the partitioning function has a skewed distribution.
The load imbalance created by such skew can severely degrade the per-
formance of join operators such as the repartitioned join. This problem
can be addressed by identifying the skewed join keys and handling them
in special ways. In particular, tuples in a table with a join key value u that
has a skewed distribution can be further partitioned across multiple nodes.
The correct join result will be produced as long as the tuples in the joining
table with join key equal to u are replicated across the same nodes. In this
fashion, the resources in multiple nodes can be used to process the skewed
join keys (DeWitt et al. 1992).
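The skew-handling technique just described can be sketched as a placement policy: tuples of R with a skewed key are spread over `fanout` nodes, while the matching S tuples are replicated to those same nodes so every fragment still meets its join partners. All names and parameters here are invented for illustration:

```python
def place_with_skew_handling(R_keys, S_keys, skewed, fanout, n_nodes):
    # Returns (key, node) placements for the tuples of R and S.
    # Requires fanout <= n_nodes.
    R_nodes, S_nodes = [], []
    for i, k in enumerate(R_keys):
        if k in skewed:
            R_nodes.append((k, i % fanout))         # spread the skewed key
        else:
            R_nodes.append((k, hash(k) % n_nodes))  # normal hash placement
    for k in S_keys:
        if k in skewed:
            for node in range(fanout):              # replicate join partners
                S_nodes.append((k, node))
        else:
            S_nodes.append((k, hash(k) % n_nodes))
    return R_nodes, S_nodes

R_nodes, S_nodes = place_with_skew_handling(
    ["u", "u", "u", "v"], ["u", "v"], skewed={"u"}, fanout=2, n_nodes=4)
# Every R fragment of key "u" lands on a node in {0, 1}, where S's "u"
# tuples have been replicated, so all local joins remain correct.
```

The price of correctness is the replication of S's skewed tuples; the gain is that the heavy key's work is shared by `fanout` nodes instead of one.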
While our discussion is focused on the parallel execution of joins, the
same principles apply to the parallel execution of other relational opera-
tors like filtering and group by. The approach used to extract parallelism is to partition the input into multiple fragments and to process these fragments in parallel. This form of parallelism is called partitioned parallelism (DeWitt and Gray 1992).
Another form of parallelism employed commonly in execution plans in
parallel database systems is pipelined parallelism. A query execution plan may contain a sequence of operators linked together by producer-consumer relationships, where all operators can be run in parallel as data flows continuously across every producer-consumer pair. For example,
suppose an execution plan contains three operators: a table scan S, a filter
F, and a hash aggregator H. S starts scanning the table and places the tuples in F's input queue. At the same time, F reads from its input queue, performs the filtering, and writes to H's input queue. Finally, H starts building the hash table. Thus, S, F, and H can be working concurrently on stages from different iterations, thereby increasing performance.
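The S, F, H pipeline above can be modelled with Python generators, where tuples flow through the chain as they are produced. This is a hypothetical sketch of the data flow only; real systems run the three operators concurrently on separate threads or nodes:

```python
def scan(table):                 # operator S: table scan
    for row in table:
        yield row

def filter_op(rows, predicate):  # operator F: filter
    for row in rows:
        if predicate(row):
            yield row

def hash_aggregate(rows, key):   # operator H: hash-based aggregation
    groups = {}
    for row in rows:
        groups[key(row)] = groups.get(key(row), 0) + 1
    return groups

table = [("us", 10), ("de", 5), ("us", 7), ("de", 1)]
pipeline = filter_op(scan(table), lambda r: r[1] > 2)
result = hash_aggregate(pipeline, key=lambda r: r[0])
# result == {"us": 2, "de": 1}
```

Because generators are lazy, H pulls a tuple through F and S on demand: no operator waits for its upstream operator to finish, which is the essence of pipelined execution.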
Finally, independent parallelism refers to the parallel execution of inde-
pendent operators in a query plan. For example, consider a query that
joins together four tables R, S, T, and U. This query can be processed by
an execution plan where R is joined with S, T is joined with U, and then the results from both joins are joined together to produce the final result [(R ⋈ S) ⋈ (T ⋈ U)]. In this plan, R ⋈ S and T ⋈ U can be executed independently in parallel.

2.2.2 Columnar Databases
Columnar systems excel at data-warehousing-type applications, where (a)
data is loaded in bulk but typically not modified much and (b) the typical
access pattern is to scan through large parts of the data to perform aggre-
gations and joins. The first columnar database systems that appeared in the
1990s were MonetDB (Boncz et al. 2006) and Sybase IQ (MacNicol and
French 2004). The 2000s saw a number of new columnar database systems
such as C-Store (Stonebraker et al. 2005), Infobright (Infobright 2013), ParAccel (ParAccel 2013), VectorWise (Zukowski and Boncz 2012), and Vertica (Lamb et al. 2012). Similar to the row-based databases discussed
above, we focus on the data storage and query execution of columnar
database systems.

2.2.2.1 Columnar Data Storage


In a pure columnar data layout, each table column is stored contiguously
in a separate file on disk. Each file stores tuples of the form <k, u> (Boncz et al. 2006), where the key k is the unique identifier for a tuple and u is
the corresponding value. An entire tuple with tuple identifier k can be reconstructed by bringing together all the attribute values stored for k.
It is also possible to eliminate the explicit storage of tuple identifiers and
derive them implicitly based on the position of each attribute value in the
file (Lamb et al. 2012; Stonebraker et al. 2005).
Vertica stores two files per column (Lamb et al. 2012). One file con-
tains the attribute values while the other file, called position index, stores
corresponding metadata such as the start position, minimum value, and
maximum value for the attribute values. The position index helps with
tuple reconstruction as well as eliminating reads of disk blocks during
query processing. Furthermore, removing the storage of tuple identifiers
leads to more densely packed columnar storage (Abadi et al. 2009; Lamb et al. 2012).
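The layout just described, one array ("file") per column, implicit position-based tuple identifiers, and a per-column position index with min/max metadata, can be sketched as follows. This is a simplified, hypothetical illustration with invented data:

```python
class Column:
    def __init__(self, values, block_size=2):
        self.values = values  # positions double as implicit tuple ids
        # position index: (start_position, min, max) per block, similar in
        # spirit to Vertica's position index (simplified here)
        self.index = [
            (i, min(values[i:i + block_size]), max(values[i:i + block_size]))
            for i in range(0, len(values), block_size)
        ]

price = Column([3, 5, 21, 40, 2, 4])
city = Column(["nyc", "sfo", "lon", "par", "ber", "rom"])

# A scan for price >= 20 reads only blocks whose [min, max] overlaps:
candidate_blocks = [start for (start, lo, hi) in price.index if hi >= 20]
# candidate_blocks == [2]  (only the block starting at position 2)

# Tuple reconstruction: gather the values stored at the same position.
tuple_4 = (price.values[4], city.values[4])
# tuple_4 == (2, "ber")
```

The min/max metadata is what lets the scan skip disk blocks outright, one of the sources of reduced I/O cited for columnar systems.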
C-Store introduced the concept of projections. A projection is a set of
columns that are stored together. The concept is similar to a materialized
view that projects some columns of a base table. However, in C-Store, all
the data in a table is stored as one or more projections. That is, C-Store
does not have an explicit differentiation between base tables and material-
ized views. Each projection is stored and sorted on one or more attributes.
Vertica implemented a similar concept later, called super projections, that contains every column of the table (Lamb et al. 2012).
An important advantage of columnar data layouts is that columns can
be stored densely on disk using various compression techniques (Abadi et al. 2009; Lamb et al. 2012; Stonebraker et al. 2005):

Run Length Encoding (RLE): Sequences of identical values in a column are replaced with a single pair that contains the value and number of occurrences. This type of compression is best for sorted, low cardinality columns.
Delta Value: Each attribute value is stored as the difference from
the smallest value, so it is useful when the differences can be stored
in fewer bytes than the original attribute values. This type of com-
pression is best for many-valued, unsorted integer or integer-based
columns.
Compressed Delta Range: Each value is stored as a delta from the
previous one. This type of compression is best for many-valued float
columns that are either sorted or confined to a range.
Dictionary: The distinct values in the column are stored in a dictionary which assigns a short code to each distinct value. The actual values are replaced with the code assigned by the dictionary. Dictionary-based compression is a general-purpose scheme, but it is good for unsorted, low cardinality columns.
Bitmap: A column is represented by a sequence of tuples <u, b> such
that u is a value stored in the column and b is a bitmap indicating the
positions in which the value is stored. RLE can be further applied to
compress each bitmap.
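Two of the schemes above, RLE and dictionary encoding, can be sketched in a few lines. The helper names are hypothetical:

```python
from itertools import groupby

def rle_encode(column):
    # Run Length Encoding: (value, run length) pairs; best for sorted,
    # low-cardinality columns.
    return [(v, len(list(g))) for v, g in groupby(column)]

def dict_encode(column):
    # Dictionary encoding: short integer codes replace the actual values.
    dictionary = {v: i for i, v in enumerate(dict.fromkeys(column))}
    return dictionary, [dictionary[v] for v in column]

col = ["DE", "DE", "DE", "US", "US", "DE"]
assert rle_encode(col) == [("DE", 3), ("US", 2), ("DE", 1)]
dictionary, codes = dict_encode(col)
assert codes == [0, 0, 0, 1, 1, 0]
```

Note how sorting the column first would collapse the RLE output to two runs, which is why these encodings interact strongly with the sort order chosen for a projection.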

Hybrid combinations of the above schemes are also possible. For example, the Compressed Common Delta scheme used in Vertica builds a dictionary of all the deltas in each block (Lamb et al. 2012). This
type is best for sorted data with predictable sequences and occasional
sequence breaks (e.g., timestamps recorded at periodic intervals or pri-
mary keys).

2.2.2.2 Columnar Query Execution


The columnar data layout gives rise to a distinct space of execution plans
in columnar parallel database systems that provide opportunities for highly
efficient execution: (a) operations on compressed columns, (b) vectorized
operations, and (c) late materialization.
Given the typical use of compression in columnar systems, it is highly
desirable to have (some) operators operate on the compressed representa-
tion of their input whenever possible, in order to avoid the cost of decom-
pression. The ability to operate directly on compressed data depends on
the type of the operator and the compression scheme used. For example,
consider a filter operator whose filter predicate is on a column compressed
using the Bitmap compression technique. This operator can do its pro-
cessing directly on the stored unique values of the column and then only
read those bitmaps from disk whose values match the filter predicate.
Complex operators like range filters, aggregations, and joins can also oper-
ate directly on compressed data.
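Operating directly on compressed data can be illustrated with a filter over RLE-compressed input: the predicate is tested once per run rather than once per tuple, and matching tuple positions are reconstructed from the run lengths. This is a hypothetical sketch:

```python
def filter_rle(runs, predicate):
    # runs: list of (value, run_length) pairs, as produced by RLE.
    # Returns the positions (implicit tuple ids) of matching tuples
    # without ever decompressing the column.
    positions, pos = [], 0
    for value, length in runs:
        if predicate(value):                  # one test per run
            positions.extend(range(pos, pos + length))
        pos += length
    return positions

runs = [("DE", 3), ("US", 2), ("DE", 1)]
matches = filter_rle(runs, lambda v: v == "DE")
# matches == [0, 1, 2, 5]
```

For a column with long runs, the number of predicate evaluations drops from the number of tuples to the number of runs, which is where the savings from avoiding decompression come from.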
Columnar layouts encourage vectorized processing since it is more
efficient for operators to process their input in large chunks at a time as
opposed to one tuple at a time. A full or partial column of values can
be treated as an array (or a vector) on which SIMD (single instruction
multiple data) instructions in CPUs can be evaluated. SIMD instructions
20 H. HERODOTOU

can greatly increase performance when the same operations have to be
performed on multiple data objects. The X100 project (which was com-
mercialized later as VectorWise) explored a compromise between the
classic tuple-at-a-time pipelining and operator-at-a-time bulk processing
techniques (Boncz et al. 2005). X100 operates on chunks of data that
are large enough to amortize function call overheads but small enough
to fit in CPU caches and to avoid materialization of large intermediate
results into main memory. X100 shows significant performance benefits
when vectorized processing is combined with just-in-time light-weight
compression.
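The contrast between tuple-at-a-time and chunk-at-a-time execution can be sketched as follows (a toy Python rendering of the X100 idea; the operator names and chunk size are illustrative, and real engines use SIMD-friendly native loops rather than Python lists):

```python
# Chunk-at-a-time (vectorized) pipeline: operators exchange fixed-size
# vectors of values, so the per-call overhead is paid once per vector
# instead of once per tuple.

CHUNK = 1024  # small enough to stay cache-resident, large enough to
              # amortize function call overhead over many values

def scan(values, chunk=CHUNK):
    """Source operator: yields the column as a sequence of vectors."""
    for i in range(0, len(values), chunk):
        yield values[i:i + chunk]

def select_gt(vectors, threshold):
    """Filter operator: one call per vector, with a tight inner loop."""
    for vec in vectors:
        yield [v for v in vec if v > threshold]

def total(vectors):
    """Aggregate operator: consumes the vectors and sums them."""
    return sum(sum(vec) for vec in vectors)

data = list(range(10_000))
result = total(select_gt(scan(data), 9_995))  # 9996+9997+9998+9999
```

The inner list comprehensions stand in for the SIMD loops a compiled engine would emit over such vectors.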
Tuple reconstruction is expensive in columnar database systems since
information about a tuple is stored in multiple locations on disk, yet most
queries access more than one attribute from a tuple (Abadi et al. 2009).
Further, most users and applications (e.g., using ODBC or JDBC) access
query results tuple-at-a-time (not column-at-a-time). Thus, at some
point in a query plan, data from multiple columns must be materialized
as tuples. Many techniques have been developed to reduce such tuple
reconstruction costs (Abadi et al. 2007). For example, MonetDB uses
late tuple reconstruction (Idreos et al. 2012). All intermediate results
are kept in a columnar format during the entire query evaluation. Tuples
are constructed only just before sending the final result to the user or
application. This approach allows the query execution engine to exploit
CPU-optimized and cache-optimized vector-like operator implementa-
tions throughout the whole query evaluation. One disadvantage of this
approach is that larger intermediate results may need to be materialized
compared to the traditional tuple-at-a-time processing.
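A minimal late-materialization sketch (MonetDB-style in spirit only, not its actual API) looks like this: intermediate results are position lists over individual columns, and tuples are stitched together only for the final output.

```python
# Late materialization: filters produce row positions, not tuples;
# tuple reconstruction happens once, at the very end of the plan.

def positions_where(column, predicate):
    """Filter step: return qualifying row positions only."""
    return [i for i, v in enumerate(column) if predicate(v)]

def materialize(positions, *columns):
    """Final step: reconstruct tuples only for the surviving rows."""
    return [tuple(col[i] for col in columns) for i in positions]

price = [10, 55, 30, 80]
item  = ["pen", "lamp", "mug", "desk"]

hits = positions_where(price, lambda p: p > 25)   # [1, 2, 3]
rows = materialize(hits, item, price)             # tuples built last
```

The trade-off described above is visible here: `hits` is a compact vector-friendly intermediate, but if most rows qualify, the position list itself becomes a large intermediate that row-at-a-time processing would never have materialized.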

2.3 MapReduce Systems


MapReduce is both a programming model and an associated run-time sys-
tem for large-scale data processing (Dean and Ghemawat 2008). Hadoop
is the most popular open-source implementation of a MapReduce frame-
work that follows the design laid out in the original paper (Dean and
Ghemawat 2004). A number of companies use Hadoop in production
deployments for applications such as Web indexing, data mining, report
generation, log file analysis, machine learning, financial analysis, scientific
simulation, and bioinformatics research. Infrastructure-as-a-Service cloud
platforms like Amazon and Rackspace have made it easier than ever to run
Hadoop workloads by allowing users to instantly provision clusters and
pay only for the time and resources used.
BUSINESS INTELLIGENCE AND ANALYTICS: BIG SYSTEMS FOR BIG DATA 21

A combination of features contributes to Hadoop's increasing popularity,
including fault tolerance, data-local scheduling, the ability to operate in a
heterogeneous environment, handling of straggler tasks (a straggler is a
task that performs poorly, typically due to faulty hardware or misconfigu-
ration), as well as a modular and customizable architecture. In typical
Hadoop deployments, data is stored in a block-oriented distributed file
system (usually HDFS) and processed using either the Hadoop MapReduce
execution engine directly or one of the many MapReduce-based platforms
built on top of Hadoop (e.g., Hive, Pig, Jaql). The Hadoop ecosystem is
shown in Fig. 2.2.

2.3.1 Distributed Storage
The storage layer of a typical MapReduce cluster is an independent distrib-
uted file system. Typical Hadoop deployments use HDFS running on
the cluster's compute nodes (Shvachko et al. 2010). Alternatively, a Hadoop
cluster can process data from other file systems like the MapR File System
(MapR 2013), Ceph (Weil et al. 2006), Amazon Simple Storage Service (S3)
(Amazon S3 2013), and Windows Azure Blob Storage (Calder et al. 2011).
As HDFS focuses on batch processing rather than interactive use,
it emphasizes high throughput of data access rather than low latency. An
HDFS cluster employs a master-slave architecture consisting of a single
NameNode (the master) and multiple DataNodes (the slaves), usually
one per node in the cluster (see Fig. 2.3). The NameNode manages the
file system namespace and regulates access to files by clients, whereas the
DataNodes are responsible for serving read and write requests from the file
system's clients. HDFS is designed to reliably store very large files across
machines in a large cluster. Internally, a file is split into one or more blocks
that are replicated for fault tolerance and stored in a set of DataNodes.

Fig. 2.2 Hadoop ecosystem for big data analytics

Fig. 2.3 Hadoop architecture
A number of other distributed file systems are viable alternatives to
HDFS and offer full compatibility with Hadoop MapReduce. The MapR
File System (MapR 2013) and Ceph (Weil et al. 2006) have similar archi-
tectures to HDFS but both offer a distributed metadata service as opposed
to the centralized NameNode on HDFS. In MapR, metadata is shared
across the cluster and collocated with the data blocks, whereas Ceph uses
dedicated metadata servers with dynamic subtree partitioning to avoid
metadata access hot spots. The Quantcast File System (QFS) (Ovsiannikov
et al. 2013), which evolved from the Kosmos File System (KFS) (KFS
2013), employs erasure coding rather than replication as its fault tolerance
mechanism. Erasure coding enables QFS to not only reduce the amount
of storage but also accelerate large sequential write patterns common to
MapReduce workloads.
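QFS itself uses a stronger code than the one below, but single XOR parity is the simplest member of the erasure-coding family and already shows the storage argument: k data blocks plus one parity block tolerate the loss of any one block at (k+1)/k overhead, versus 3x for triple replication. The sketch is illustrative only:

```python
# Toy single-parity erasure code over a stripe of equal-sized blocks.

def xor_blocks(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def encode(blocks):
    """Return the parity block for a stripe of data blocks."""
    parity = blocks[0]
    for block in blocks[1:]:
        parity = xor_blocks(parity, block)
    return parity

def recover(surviving, parity):
    """Rebuild the single missing data block: XORing the parity with all
    surviving blocks cancels them out, leaving the lost block."""
    missing = parity
    for block in surviving:
        missing = xor_blocks(missing, block)
    return missing

stripe = [b"aaaa", b"bbbb", b"cccc"]   # 3 data blocks + 1 parity: 4/3x
parity = encode(stripe)                # storage instead of 3x replication
rebuilt = recover([stripe[0], stripe[2]], parity)  # block 1 was lost
```

Codes tolerating multiple simultaneous losses (e.g., Reed-Solomon) follow the same encode/recover structure with more parity blocks.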
Distributed file systems are primarily designed for accessing raw files
and, therefore, lack any advanced features found in the storage layer of
database systems. This limitation has inspired a significant amount of
research for introducing (a) indexing, (b) collocation, and (c) columnar
capabilities into such file systems.

2.3.1.1 Indexing
Hadoop++ (Dittrich et al. 2010) provides indexing functionality for data
stored in HDFS using the so-called Trojan Indexes. The indexing informa-
tion is created during the initial loading of data onto HDFS and is stored as
additional metadata in the data blocks. Hence, targeted data retrieval can
be very efficient at the expense of increased data loading time. This prob-
lem is addressed by HAIL (Dittrich et al. 2012), which improves query
processing speeds over Hadoop++. HAIL creates indexes during the I/O-
bound phases of writing to HDFS so that it consumes CPU cycles that are
otherwise wasted. In addition, HAIL builds a different clustered index in
each replica maintained by HDFS for fault tolerance purposes. The most
suitable index for a query is then selected at run-time, and the correspond-
ing replicas are read during the MapReduce execution over HAIL.

2.3.1.2 Collocation
In addition to indexing, Hadoop++ provides a data collocation technique
in MapReduce systems. Specifically, Hadoop++ allows users to co-partition
and collocate data at load time while writing metadata in the data blocks
(Dittrich et al. 2010). Hence, blocks of HDFS can now contain data from
multiple tables. With this approach, collocated joins can be processed
at each node without the overhead of sorting and shuffling data across
nodes. CoHadoop (Eltabakh et al. 2011) provides a different collocation
strategy by adding a file-locator attribute to HDFS files and implementing
a file layout policy such that all files with the same locator are placed on
the same set of nodes. Using this feature, CoHadoop can collocate any
related pair of files, for example, every pair of joining partitions across two
tables that are both hash partitioned on the join key, or a partition and an
index on that partition. CoHadoop can then run joins in a similar manner
as collocated joins in parallel database systems.
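The locator mechanism can be sketched as a placement policy (a hypothetical Python rendering; the real CoHadoop extends HDFS's Java block placement policy): the first file seen with a given locator picks a node set, and every later file with the same locator reuses it.

```python
# Locator-based placement: files tagged with the same locator value land
# on the same set of nodes, so a join between them needs no shuffle.

class LocatorPlacement:
    def __init__(self, nodes, replicas=3):
        self.nodes = nodes
        self.replicas = replicas
        self.locator_to_nodes = {}   # locator -> chosen node set

    def place(self, filename, locator):
        """Return the nodes that should store this file's blocks."""
        if locator not in self.locator_to_nodes:
            # First file with this locator: pick nodes (round-robin here;
            # the real policy also considers load and rack awareness).
            start = len(self.locator_to_nodes) % len(self.nodes)
            chosen = [self.nodes[(start + i) % len(self.nodes)]
                      for i in range(self.replicas)]
            self.locator_to_nodes[locator] = chosen
        return self.locator_to_nodes[locator]

policy = LocatorPlacement(["n1", "n2", "n3", "n4"])
a = policy.place("orders/part-0", locator=42)
b = policy.place("lineitem/part-0", locator=42)   # collocated with a
```

Because `a == b`, a map task on either node can join the two partitions locally, which is the collocated-join behavior described above.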

2.3.1.3 Columnar Layouts


It is also possible to implement columnar data layouts in HDFS. Llama
(Lin et al. 2011) and CIF (Floratou et al. 2011) use a pure column-
oriented design, based on which they partition attributes into vertical
groups like the projections in C-Store and Vertica (recall Sect. 2.2). Each
vertical group is sorted based on one of its component attributes. Each
column is stored in a separate HDFS file, which enables each column to be
accessed independently and, thus, reduces read I/O costs but may incur
run-time costs for tuple reconstruction. Unlike Llama, CIF uses an exten-
sion of HDFS to enable collocation of columns corresponding to the same
tuple on the same node and supports some late materialization techniques
for reducing tuple reconstruction costs (Floratou et al. 2011).
Cheetah (Chen 2010), RCFile (He et al. 2011), and Hadoop++ (Dittrich
et al. 2010) use a hybrid row-column design based on PAX (Ailamaki
et al. 2001). In particular, each file is horizontally partitioned into blocks
but a columnar format is used within each block. Since HDFS guarantees
that all the bytes of an HDFS block are stored on a single node, tuple
reconstruction never requires data transfer over the
network. The intra-block data layouts used by these systems differ in how
they use compression, how they treat replicas of the same block, and how
they are implemented. For example, Hadoop++ can use different layouts
in different replicas and choose the best layout at query processing time.
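A PAX-style hybrid layout can be sketched as follows (illustrative, not RCFile's on-disk format): rows are first split into horizontal blocks, and within each block the values are regrouped column by column.

```python
# Hybrid row-column (PAX-style) layout: horizontal blocks of columnar
# "minipages", so one attribute can be scanned without reading the rest,
# yet a whole tuple always lives inside a single block.

BLOCK_ROWS = 2  # rows per block; real systems size blocks in megabytes

def to_pax(rows, block_rows=BLOCK_ROWS):
    """Split rows into blocks and transpose each block into columns."""
    blocks = []
    for i in range(0, len(rows), block_rows):
        chunk = rows[i:i + block_rows]
        blocks.append([list(col) for col in zip(*chunk)])
    return blocks

def read_column(blocks, col_index):
    """Scan one attribute without touching the other columns."""
    out = []
    for block in blocks:
        out.extend(block[col_index])
    return out

rows = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
blocks = to_pax(rows)        # [[[1, 2], ['a', 'b']], [[3, 4], ['c', 'd']]]
ids = read_column(blocks, 0)
```

Because each block holds complete tuples, reconstructing row 2 only requires the second block, mirroring the no-network-transfer guarantee discussed above.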

2.3.1.4 MapReduce Execution Engines


MapReduce execution engines implement the MapReduce programming
model for dealing with data at massive scale (Dean and Ghemawat 2004).
Users specify computations in terms of Map and Reduce functions while
the underlying run-time system automatically parallelizes the computation
across large-scale clusters of commodity servers, handles machine failures,
and schedules inter-machine communication to make efficient use of the
network and disk bandwidth.
The MapReduce programming model consists of two functions: map
(k1, v1) and reduce (k2, list(v2)). Users can implement their own pro-
cessing logic by specifying customized map() and reduce() functions
written in a general-purpose language like Java or Python. The map (k1,
v1) function is invoked for every key-value pair <k1, v1> in the input
data to output zero or more key-value pairs of the form <k2, v2> (see
Fig. 2.4). The reduce (k2, list(v2)) function is invoked for every unique
key k2 and corresponding values list(v2) in the map output, and outputs
zero or more key-value pairs of the form <k3, v3>. The MapReduce pro-
gramming model allows for other functions as well, such as (a) partition
(k2), for controlling how the map output key-value pairs are partitioned
among the reduce tasks, and (b) combine (k2, list(v2)), for performing
partial aggregation on the map side. The keys k1, k2, and k3 as well as the
values v1, v2, and v3 can be of different and arbitrary types.
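The canonical word-count example makes the model concrete (a plain-Python sketch; an actual Hadoop job would implement Mapper and Reducer classes in Java, and the grouping below is what the framework's shuffle performs):

```python
# Word count in the MapReduce model: map emits <word, 1> pairs, the
# framework groups them by key, and reduce sums the counts per word.
from itertools import groupby
from operator import itemgetter

def map_fn(k1, v1):
    """map(k1, v1): k1 is a line offset, v1 the line of text."""
    for word in v1.split():
        yield (word, 1)

def reduce_fn(k2, values):
    """reduce(k2, list(v2)): emit <word, total count>."""
    yield (k2, sum(values))

def run_job(records):
    # Map phase over all input key-value pairs
    intermediate = [kv for k, v in records for kv in map_fn(k, v)]
    # Shuffle: the framework sorts and groups map output by key
    intermediate.sort(key=itemgetter(0))
    out = []
    for k2, group in groupby(intermediate, key=itemgetter(0)):
        out.extend(reduce_fn(k2, [v for _, v in group]))
    return out

result = run_job([(0, "big data big systems"), (1, "data systems")])
```

A combine() function with the same body as reduce_fn could be run on each mapper's output to shrink the data shuffled across the network, and a partition() function would decide which reduce task receives each key.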
Hadoop MapReduce (White 2010) is the most widely used implemen-
tation of a MapReduce execution engine. A Hadoop MapReduce cluster

Fig. 2.4 MapReduce job execution

employs a master-slave architecture where one master node (called
JobTracker) manages a number of slave nodes (called TaskTrackers), as
seen in Fig. 2.3. Hadoop launches a MapReduce job by first splitting (log-
ically) the input dataset into data splits. Each data split is then scheduled
to one TaskTracker node and is processed by a map task. A Task Scheduler
resides in the JobTracker and is responsible for scheduling the execution of
map tasks while taking data locality into account. Each TaskTracker has a
predefined number of task execution slots for running map (reduce) tasks.
If the job executes more map (reduce) tasks than there are slots, then
the map (reduce) tasks will run in multiple waves. When map tasks com-
plete, the run-time system groups all intermediate key-value pairs using
an external sort-merge algorithm. The intermediate data is then shuffled
(i.e., transferred) to the TaskTrackers scheduled to run the reduce tasks.
Finally, the reduce tasks will process the intermediate data to produce the
results of the job.
HadoopDB (Abouzeid et al. 2009) is a hybrid system that com-
bines features from parallel database systems with Hadoop. Specifically,
HadoopDB runs a centralized database system on each node of the clus-
ter and uses Hadoop primarily as the engine to schedule query execu-
tion plans as well as to provide fine-grained fault tolerance. The additional
storage system provided by the databases gives HadoopDB the ability to
overcome limitations of HDFS such as lack of collocation and indexing.
In addition, HadoopDB includes some advanced partitioning capabilities
such as reference-based partitioning, which enable multiway joins to be
performed in a collocated fashion.
HadoopDB introduced the concept of split query execution where a
query submitted by a user or application will be converted into an execu-
tion plan consisting of some parts that would run as queries in the data-
base and other parts that would run as map and reduce tasks in Hadoop
(Bajda-Pawlikowski et al. 2011). The best such splitting of work will be
identified during plan generation based on metadata stored in a system
catalog. Metadata information includes connection parameters, schema,
and statistics of the tables stored, locations of replicas, and data partition-
ing properties.

2.3.2 MapReduce-based Platforms
The MapReduce model, although highly flexible, has been found to be
too low-level for routine use by practitioners such as data analysts, statisti-
cians, and scientists (Olston et al. 2008; Thusoo et al. 2009). As a result,
the MapReduce framework has evolved into a MapReduce ecosystem shown
in Fig. 2.2, which includes a number of (a) high-level interfaces added over
the core MapReduce engine, (b) application development tools, (c) work-
flow management systems, and (d) data collection tools.

2.3.2.1 High-level Interfaces


The two most prominent examples of higher-level layers are Apache
Hive (Thusoo et al. 2009) with an SQL-like declarative interface (called
HiveQL) and Apache Pig (Olston et al. 2008) with an interface that mixes
declarative and procedural elements (called Pig Latin). Both Hive and
Pig will compile the respective HiveQL and Pig Latin queries into logical
plans, which consist of a tree of logical operators. The logical operators
are then converted into physical operators, which in turn are packed into
map and reduce tasks for execution. The execution plan generated for a
HiveQL or Pig Latin query is usually a workflow (i.e., a directed acyclic
graph) of MapReduce jobs. Workflows may be ad hoc, time-driven (e.g.,
run every hour), or data-driven. Yahoo! uses data-driven workflows to
generate a reconfigured preference model and an updated home-page for
any user within seven minutes of a home-page click by the user.
Similar to a data warehouse, Hive organizes and stores the data into
partitioned tables (Thusoo et al. 2009). Hive tables are analogous to
tables in relational databases and are represented using HDFS directories.
Partitions are then created using subdirectories while the actual data is
stored in files. Hive also includes a system catalog, called Metastore,
containing schema and statistics, which are useful in data exploration and
query optimization. In particular, Hive employs rule-based approaches for
a variety of optimizations such as filter and projection pushdown, shared
scans of input datasets across multiple operators from the same or different
analysis tasks (Nykiel et al. 2010), reducing the number of MapReduce
jobs in a workflow (Lee et al. 2011), and handling data skew in sorts and
joins.
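Rule-based filter pushdown of the kind mentioned above can be illustrated on a toy logical plan (the plan classes below are hypothetical, not Hive's internal operator tree): a Filter sitting above a Project is rewritten to run below it, so fewer rows flow through the upper operators.

```python
# Toy logical plan nodes and a single rewrite rule for filter pushdown.

class Scan:
    def __init__(self, table):
        self.table = table

class Project:
    def __init__(self, child, cols):
        self.child, self.cols = child, cols

class Filter:
    def __init__(self, child, pred_col):
        self.child, self.pred_col = child, pred_col

def push_down_filter(plan):
    """Rewrite Filter(Project(x)) to Project(Filter(x)) when the filter
    predicate only needs a column that the projection keeps."""
    if (isinstance(plan, Filter) and isinstance(plan.child, Project)
            and plan.pred_col in plan.child.cols):
        project = plan.child
        return Project(Filter(project.child, plan.pred_col), project.cols)
    return plan

plan = Filter(Project(Scan("logs"), ["user", "ts"]), pred_col="user")
optimized = push_down_filter(plan)   # Project(Filter(Scan("logs")))
```

A real optimizer applies a library of such rules repeatedly over the tree until no rule fires; the mechanics of each rule are as local as the one shown here.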

2.3.2.2 Application Development


Cascading (Cascading 2011) and FlumeJava (Chambers et al. 2010)
are software abstraction layers for MapReduce used to express data-
parallel pipelines. They both offer program-based interfaces that inte-
grate MapReduce job definitions into popular programming languages
such as Java, JRuby, and Clojure. Hence, application developers can
develop, test, and run efficient data-parallel pipelines without worry-
ing about the underlying complexity of MapReduce jobs. To enable
parallel operations to run efficiently, FlumeJava internally constructs an
execution plan as a dataflow graph but defers its evaluation. When the
final results are eventually needed, FlumeJava optimizes the execution
plan and then executes the optimized operations on the underlying
MapReduce primitives. Cascading and FlumeJava are most often used
for log file analysis, bioinformatics, machine learning, and predictive
analytics.
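FlumeJava-style deferred evaluation can be sketched as follows (hypothetical names; the real library is Java): operations only record nodes in a dataflow graph, and nothing runs until the result is demanded, at which point the whole chain can be optimized (here, adjacent maps are fused into one pass) and executed at once.

```python
# Deferred data-parallel pipeline: map() records a step; collect() fuses
# the recorded steps and runs them in a single pass over the data.

class Deferred:
    def __init__(self, data=None, parent=None, fn=None):
        self.data, self.parent, self.fn = data, parent, fn

    def map(self, fn):
        return Deferred(parent=self, fn=fn)   # record the step, run nothing

    def collect(self):
        # Walk the lineage back to the source, collecting the functions.
        fns, node = [], self
        while node.parent is not None:
            fns.append(node.fn)
            node = node.parent
        fns.reverse()
        # "Optimization": apply all functions in one fused loop per element
        # instead of materializing one intermediate list per map step.
        out = []
        for x in node.data:
            for fn in fns:
                x = fn(x)
            out.append(x)
        return out

pipeline = Deferred([1, 2, 3]).map(lambda x: x + 1).map(lambda x: x * 10)
result = pipeline.collect()                   # executes only here
```

In the real systems the same deferral lets the planner also choose how to pack steps into MapReduce jobs, not just fuse loops.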

2.3.2.3 Workflow Management


A given MapReduce program may be expressed in one among a variety
of programming languages like Java, C++, Python, or Ruby; may be gen-
erated by a query-based interface such as Hive or Pig; or may be gener-
ated by a program-based interface such as Cascading or FlumeJava. All
these MapReduce programs can then be connected to form a workflow
of MapReduce jobs using a workflow scheduler such as Oozie (Islam
et al. 2012) and Azkaban (Sumbaly et al. 2013). Workflow schedulers
ease construction of MapReduce workflows, which are typically defined
as a collection of actions (e.g., native MapReduce jobs, Pig, Hive, and
shell scripts) arranged in a control dependency DAG. The actions are
then executed in sequence based on the dependencies described by the
DAG.

2.3.2.4 Data Collection


MapReduce is designed to work on data stored in a distributed file system
like HDFS. As a result, a number of distributed data collection systems
have been built to copy data into distributed file systems, including Flume
(Hoffman 2015), Scribe (Thusoo et al. 2010), Chukwa (Rabkin and Katz
2010), and Kafka (Sumbaly et al. 2013). The basic abstraction for most
big data collection pipelines is the same: there is (a) a source that collects
the data and inserts it into the system, (b) a sink that delivers and stores the
data into the file system, and (c) a channel that acts as a conduit between
the source and the sink allowing data to be streamed to a range of destina-
tions. All systems are also designed to be scalable, reliable, extensible, and
robust to failures of the network or any specific machine.

2.4 Dataflow Systems


The application domain for data-intensive analytics is moving toward
complex data-processing tasks such as statistical modeling, graph analy-
sis, machine learning, and scientific computing. While MapReduce can be
used for these tasks, its programming model seems to be too restrictive in
certain cases (e.g., joining two datasets together) and its execution model
seems to be suboptimal for some common analysis tasks such as relational
operations and graph processing. Consequently, dataflow systems such as
Nephele (Battré et al. 2010) and Hyracks (Borkar et al. 2011) are extend-
ing the MapReduce framework with a more generalized MapReduce execu-
tion model that supports new primitive operations in addition to Map and
Reduce. A different class of dataflow systems such as Dryad (Isard et al.
2007) and Spark (Zaharia et al. 2012) aims at replacing MapReduce alto-
gether with a DAG model that can express a wide range of data access
and communication patterns. Finally, graph processing systems like Pregel
(Malewicz et al. 2010) use the bulk synchronous parallel processing model
for running iterative computations and analysis over data graphs.

2.4.1 Generalized MapReduce Systems


Similar to MapReduce, Nephele (Battré et al. 2010) and Hyracks (Borkar
et al. 2011) are two partitioned-parallel software systems designed to run
data-intensive computations on large shared-nothing clusters of comput-
ers. However, they offer a more versatile execution model compared to
MapReduce, with more data operators as well as data connectors. Nephele
and Hyracks differ mainly on the type of operators and connectors that
they support.
Nephele uses the Parallelization Contracts (PACT) programming
model (Alexandrov et al. 2010), a generalization of the well-known
MapReduce programming model. The PACT model extends MapReduce
with a total of five second-order functions:

Map is used to independently process each key-value pair.
Reduce and Combine partition and group key-value pairs by their
keys and process them together. They both assure that all pairs in a
partition have the same key but Combine does not assure that all
pairs with the same key are in the same partition.
Cross is defined as the Cartesian product over its input sets (two
or more). The user function is executed for each element of the
Cartesian product.
CoGroup partitions the key-value pairs of all input sets according to
their keys. For each input, all pairs with the same key form one sub-
set. Over all inputs, the subsets with same keys are grouped together
and handed to the user function.
Match is a relaxed version of the CoGroup contract and is equivalent
to an inner equi-join.
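The second-order functions beyond Map and Reduce can be sketched in plain Python (shapes follow the descriptions above; the real Nephele/PACT API is Java, and these names are only illustrative):

```python
# Sketches of the PACT second-order functions Cross, CoGroup, and Match.
# Inputs are lists of (key, value) pairs; fn is the first-order user code.

def cross(left, right, fn):
    """Cross: run fn on every element of the Cartesian product."""
    return [fn(l, r) for l in left for r in right]

def cogroup(left, right, fn):
    """CoGroup: for each key, hand ALL left values and ALL right values
    with that key to the user function together."""
    keys = sorted({k for k, _ in left} | {k for k, _ in right})
    out = []
    for k in keys:
        lvals = [v for key, v in left if key == k]
        rvals = [v for key, v in right if key == k]
        out.extend(fn(k, lvals, rvals))
    return out

def match(left, right, fn):
    """Match: inner equi-join; fn sees one matching left/right pair."""
    return [fn(k, lv, rv)
            for k, lv in left for k2, rv in right if k == k2]

pairs = match([(1, "a"), (2, "b")], [(2, "x"), (3, "y")],
              lambda k, l, r: (k, l, r))        # only key 2 matches
```

The distinction described above is visible here: CoGroup calls the user function once per key with complete groups, while Match calls it once per matching pair.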

In addition, the PACT model defines optional output contracts that
give guarantees about the behavior of a function:

Same-Key: Each key-value pair that is generated by the function has
the same key as the key-value pair(s) that it was generated from.
Super-Key: Each key-value pair that is generated by the function has
a super-key of the key-value pair(s) that it was generated from.
Unique-Key: Each key-value pair that is produced has a unique key.
Partitioned-by-Key: Key-value pairs are partitioned by key. This
property can be exploited when the contract is attached to a data
source that supports partitioned storage.

Complete PACT programs are DAGs of user functions, starting with
one or more data sources and ending with one or more data sinks. Finally,
Nephele uses certain declarative aspects of the second-order functions of
the PACT programs to guide a series of transformation and optimization
rules for generating an efficient parallel dataflow plan (Battré et al. 2010).

Nephele is the execution engine for Stratosphere (Alexandrov et al.
2014), a massively parallel data-processing platform. In addition to
Nephele and PACT, Stratosphere contains the Sopremo layer. A Sopremo
program consists of a set of logical operators connected in a DAG, akin
to a logical query plan in relational DBMSs. Programs for the Sopremo
layer can be written in Meteor, an operator-oriented query language that
uses a JSON-like data model to support the analysis of unstructured and
semi-structured data.
Similar to Nephele, Hyracks (Borkar et al. 2011) allows users to express
a computation as a DAG of data operators and connectors. Operators pro-
cess partitions of input data and produce partitions of output data, while
connectors repartition operator outputs to make the newly produced par-
titions available at the consuming operators. The most important Hyracks
operators are:

Mapper: Evaluates a user-defined function on each item in the input.
Sorter: Sorts input records using user-provided comparator
functions.
Joiner: Binary-input operator that performs equi-joins.
Aggregator: Performs aggregation using a user-defined aggregation
function.

Hyracks is the lowest level of ASTERIX (Behm et al. 2011), a scal-
able platform for large-scale information storage, search, and analytics.
The topmost layer of the ASTERIX stack is a parallel DBMS, with a full,
flexible data model (ADM) and a query language (AQL) for describing,
querying, and analyzing data. AQL is comparable to languages such as
HiveQL and Pig Latin but supports both native storage and indexing of
data as well as access to external data residing in a distributed file system
(e.g., HDFS). In between these layers sits Algebricks, a model-agnostic,
algebraic virtual machine for parallel query processing and optimization.
Algebricks is the target for AQL query compilation, but it can also be the
target for other declarative languages.

2.4.2 Directed Acyclic Graph Systems


The DAG model replaces the MapReduce or MapReduce-based execution
models in certain dataflow systems, such as Dryad (Isard et al. 2007) and
Spark (Zaharia et al. 2012), offering a wider range of possible analytical

tasks. Dryad is the execution engine used predominantly by Microsoft and
utilized by the higher-level languages DryadLINQ (Isard and Yu 2009)
and SCOPE (Zhou et al. 2012). Spark and its SQL-like interface Shark
(Xin et al. 2013b) have been developed at Berkeley's AMP Lab and have a
strong emphasis on utilizing the memory on the compute nodes.
Dryad is a general-purpose distributed execution engine for coarse-
grain data-parallel applications. A Dryad job has the form of a DAG,
where each vertex defines the operations that are to be performed on the
data and each edge represents the flow of data between the connected
vertices. Vertices can have an arbitrary number of input and output edges.
At execution time, vertices become processes communicating with each
other through data channels (edges) used to transport a finite sequence
of data records. The physical implementation of the channel abstraction is
realized by shared memory, TCP pipes, or disk files. The inputs to a Dryad
job are typically stored as partitioned files in the Cosmos Storage System.
Each input partition is represented as a source vertex in the job graph, and
any processing vertex that is connected to a source vertex reads the entire
partition sequentially through its input channel.
Fig. 2.5 Dryad system architecture and execution

Figure 2.5 shows the Dryad system architecture. The execution of a
Dryad job is orchestrated by a user-provided Job Manager. The primary
function of the Job Manager is to construct the run-time DAG from its
logical representation and execute it in the cluster. The Job Manager is
also responsible for scheduling the vertices on the processing nodes when
all the inputs are ready, monitoring progress, and re-executing vertices
upon failure. A Dryad cluster has a Name Server that enumerates all the
available compute nodes and exposes their location within the network
so that scheduling decisions can take better account of locality. There is
a processing Daemon running on each cluster node that is responsible
for creating processes on behalf of the Job Manager. Each process cor-
responds to a vertex in the graph. The Daemon acts as a proxy so that the
Job Manager can communicate with the remote vertices and monitor the
state and progress of the computation.
DryadLINQ (Isard and Yu 2009) is a hybrid declarative and impera-
tive language layer that targets the Dryad run-time and uses the Language
INtegrated Query (LINQ) model (Meijer et al. 2006). DryadLINQ
provides a set of .NET constructs for programming with datasets. A
DryadLINQ program is a sequential program composed of LINQ expres-
sions that perform arbitrary side-effect-free transformations on datasets.
SCOPE (Zhou et al. 2012), on the other hand, offers an SQL-like declara-
tive language with well-defined but constrained semantics. In particular,
SCOPE supports writing a program using traditional nested SQL expres-
sions as well as a series of simple data transformations.
Spark (Zaharia et al. 2012) is a similar DAG-based execution engine.
However, the main difference of Spark from Dryad is that it uses a memory
abstraction, called Resilient Distributed Datasets (RDDs), to explicitly
store data in memory. An RDD is a distributed shared memory abstraction
that represents an immutable collection of objects partitioned across a set
of nodes. Each RDD is either a collection backed by an external storage
system, such as a file in HDFS, or a derived dataset created by applying
various data-parallel operators (e.g., map, group by, hashjoin) to other
RDDs. The elements of an RDD need not exist in physical storage or
reside in memory explicitly; instead, an RDD can contain only the lineage
information necessary for computing the RDD elements starting from
data in reliable storage. This notion of lineage is crucial for achieving fault
tolerance in case a partition of an RDD is lost as well as managing how
much memory is used by RDDs. Currently, RDDs are used by Spark with
HDFS as the reliable back-end store.
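The lineage idea can be sketched in a few lines (names are illustrative, not Spark's API): a derived dataset stores only its parent and its transformation, so a partition evicted from memory can be recomputed from stable input data.

```python
# Minimal RDD-style lineage: derived datasets record where they came
# from and how, and recompute themselves on demand after a loss.

class RDD:
    def __init__(self, source=None, parent=None, transform=None):
        self.source = source       # backing data in "reliable storage"
        self.parent = parent       # lineage: where this RDD came from
        self.transform = transform # ... and how it was derived
        self.cache = None          # in-memory copy, may be evicted

    def map(self, fn):
        return RDD(parent=self,
                   transform=lambda data: [fn(x) for x in data])

    def compute(self):
        if self.cache is not None:     # fast path: already in memory
            return self.cache
        if self.source is not None:    # base RDD: read reliable storage
            data = list(self.source)
        else:                          # derived RDD: replay the lineage
            data = self.transform(self.parent.compute())
        self.cache = data
        return data

base = RDD(source=[1, 2, 3])
squares = base.map(lambda x: x * x)
first = squares.compute()          # [1, 4, 9]
squares.cache = None               # simulate losing the cached partition
recovered = squares.compute()      # rebuilt from lineage, no replication
```

Real RDDs are partitioned across nodes and recompute only the lost partitions, but the recovery mechanism is exactly this replay of recorded transformations.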
Shark (Xin et al. 2013b) is a higher-level system implemented over
Spark and uses HiveQL as its query interface. Shark supports dynamic
query optimization in a distributed setting by offering support for par-
tial DAG execution (PDE), a technique that allows dynamic alteration of
query plans based on data statistics collected at run-time. Shark uses PDE
to select the best join strategy at run-time based on the exact sizes of the
join's inputs as well as to determine the degree of parallelism for operators
and mitigate skew.

2.4.3 Graph Processing Systems
For a growing number of applications, the data takes the form of graphs
that connect many millions of nodes. The growing need for managing
graph-shaped data comes from applications such as (a) identifying influ-
ential people and trends propagating through a social-networking com-
munity, (b) tracking patterns of how diseases spread, and (c) finding
and fixing bottlenecks in computer networks. Graph processing systems,
such as Pregel (Malewicz et al. 2010), GraphLab (Low et al. 2012), and
GraphX (Xin et al. 2013a), use graph structures with nodes, edges, and
their properties to represent and store data.
Many graph processing systems such as Pregel (Malewicz et al. 2010) use the
Bulk Synchronous Parallel (BSP) computing model. A typical Pregel com-
putation consists of (a) initializing the graph from the input, (b) perform-
ing a sequence of iterations separated by global synchronization points
until the algorithm terminates, and (c) writing the output. Similar to
DAG-based systems, each vertex executes the same user-defined function
that expresses the logic of a given algorithm. Within each iteration, a ver-
tex can modify its state or that of its outgoing edges, receive messages
sent to it in the previous iteration, send messages to other vertices (to be
received in the next iteration), or even mutate the topology of the graph.
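A BSP computation in the Pregel mold can be sketched as follows (a toy, single-machine rendering; the vertex and message representations are mine, not Pregel's API). Every vertex runs the same compute function each superstep, messages sent in superstep i are delivered in i+1, and the run ends when all vertices vote to halt:

```python
# BSP sketch: propagate the maximum vertex value through the graph,
# the classic introductory Pregel example.

def max_value_compute(vertex, messages, out):
    """Runs once per vertex per superstep. Returns True to stay active."""
    new_value = max([vertex["value"]] + messages)
    if new_value > vertex["value"] or vertex["superstep"] == 0:
        vertex["value"] = new_value
        for nbr in vertex["edges"]:
            out.setdefault(nbr, []).append(new_value)  # send messages
        return True
    return False                                       # vote to halt

def run_pregel(graph, compute, max_steps=30):
    inbox = {v: [] for v in graph}
    for step in range(max_steps):
        out, active = {}, False
        for vid, vertex in graph.items():
            vertex["superstep"] = step
            active |= compute(vertex, inbox.get(vid, []), out)
        inbox = out                # global synchronization point
        if not active:
            break
    return {vid: v["value"] for vid, v in graph.items()}

graph = {
    "a": {"value": 3, "edges": ["b"]},
    "b": {"value": 6, "edges": ["a", "c"]},
    "c": {"value": 1, "edges": ["b"]},
}
result = run_pregel(graph, max_value_compute)  # every vertex ends at 6
```

The loop boundary between supersteps is where a distributed implementation performs its barrier and message shuffle; the toy keeps both as a dictionary swap.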
GraphLab (Low et al. 2012) uses similar primitives (called PowerGraph)
but directly targets asynchronous, dynamic, graph-parallel computations
in the shared-memory setting. In addition, GraphLab contains several
performance optimizations such as using data versioning to reduce net-
work congestion and pipelined distributed locking to mitigate the effects
of network latency. GraphX (Xin et al. 2013a) runs on Spark and intro-
duces a new abstraction called Resilient Distributed Graph (RDG). Graph
algorithms are specified as a sequence of transformations on RDGs, where
a transformation can affect nodes, edges, or both, and yields a new RDG.
Techniques have also been proposed to support the iterative and recur-
sive computational needs of graph analysis in MapReduce systems. For

example, HaLoop and Twister are designed to support iterative algorithms
in MapReduce systems (Bu et al. 2010; Ekanayake et al. 2010). HaLoop
employs specialized scheduling techniques and the use of caching between
each iteration, whereas Twister relies on a publish/subscribe mechanism to
handle all communication and data transfers. PrIter (Zhang et al. 2011), a
distributed framework for iterative workloads, enables faster convergence
of iterative tasks by providing support for prioritized iteration. Instead
of performing computations on all data records without discrimination,
PrIter prioritizes the computations that help convergence the most, so
that the convergence speed of the iterative process is significantly improved.

2.5 Systems for Interactive Analytics


The need to reduce the gap between the generation of data and the gen-
eration of analytics results over large-scale data has led to a new breed
of systems for interactive (i.e., with low latency) analytics. We separate
these systems into three distinct categories. The first category refers to
distributed storage and processing systems that support mixed analytical
and transactional workloads, such as Bigtable (Chang et al. 2008) and
Megastore (Baker et al. 2011). Support for transactions enables storage
systems in particular to serve as the data store for online services while
making the data available concurrently in the same system for analytics.
Second, distributed SQL query engines run over distributed file systems
and support ad hoc analytics. For instance, Cloudera Impala (Wanderman-Milne and Li 2014) enables users to issue low-latency SQL queries to data stored in HDFS (Shvachko et al. 2010) and Apache HBase (George 2011)
without requiring data movement or transformation. Finally, stream processing systems such as S4 (Neumeyer et al. 2010) and Storm (Storm 2013)
are driven by a data-centric model that allows for near real-time consump-
tion and analysis of data.

2.5.1 Mixed Analytical and Transactional Systems


Traditionally, parallel databases have used different systems to support OLTP and OLAP. OLTP workloads are characterized by a mix of reads and writes to a few tuples at a time, typically through index structures like B-Trees. OLAP workloads are characterized by bulk updates and large sequential scans that read only a few columns at a time. However, newer database workloads are increasingly a mix of the traditional OLTP and OLAP workloads, which led to the development of new systems that can support both. On one hand, multiple distributed storage systems like Bigtable (Chang et al. 2008) and Megastore (Baker et al. 2011) provide various degrees of transactional capabilities, enabling them to serve as the data store for online services while making the data available concurrently in the same system for analytics. On the other hand, processing systems like SAP HANA (Färber et al. 2012a, b) and HYRISE (Grund et al. 2012) can execute both OLTP and OLAP workloads.

BUSINESS INTELLIGENCE AND ANALYTICS: BIG SYSTEMS FOR BIG DATA 35

2.5.1.1 Mixed Storage Systems


The most prominent example of a mixed storage system is Google's Bigtable, which is a distributed, versioned, and column-oriented system that stores multidimensional and sorted datasets (Chang et al. 2008). Each Bigtable table is stored as a multidimensional sparse map, with rows and columns, where each cell contains a timestamp and an associated arbitrary byte array. A cell value at a given row and column is uniquely identified by the tuple <table, row, column-family:column, timestamp>. All table accesses are based on the aforementioned primary key, while secondary indices are possible through additional index tables. Bigtable provides atomicity at the level of individual tuples.
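The sparse-map data model can be made concrete with a toy in-memory sketch (illustrative only; Bigtable itself stores cells in sorted, immutable files distributed across tablet servers). Cells are keyed by (row, column-family:column, timestamp), and a read returns the latest version at or before the requested timestamp.

```python
class SparseTable:
    """Toy model of a Bigtable-style table: a sparse, multidimensional map
    from (row, column-family:column, timestamp) to an arbitrary byte array."""

    def __init__(self):
        self.cells = {}  # (row, column, timestamp) -> bytes

    def put(self, row, column, timestamp, value):
        self.cells[(row, column, timestamp)] = value

    def get(self, row, column, timestamp=None):
        """Return the most recent version at or before `timestamp`
        (or the latest version overall when no timestamp is given)."""
        versions = sorted(
            (ts, v) for (r, c, ts), v in self.cells.items()
            if r == row and c == column
            and (timestamp is None or ts <= timestamp))
        return versions[-1][1] if versions else None
```

Because every cell carries its own timestamp, multiple versions of the same (row, column) coexist, which is what makes the map "versioned" as well as sparse.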
Bigtable has motivated popular open-source implementations like
HBase (George 2011) and Cassandra (Lakshman and Malik 2010).
Both systems offer compression and secondary indexes, use data replication for fault tolerance within and across data centers, and support Hadoop MapReduce. However, Cassandra has a vastly different architecture: all nodes in the cluster have the same role and coordinate their activities using a pure peer-to-peer communication protocol. Hence, there is no single point of failure. Furthermore, Cassandra offers a tunable level of consistency per operation, ranging from weak, to eventual, to strong consistency. HBase, on the other hand, offers strong consistency by design.
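The tunable-consistency trade-off follows a simple quorum rule (a simplification that ignores mechanisms such as hinted handoff and read repair): with N replicas, a write acknowledged by W replicas and a read contacting R replicas are guaranteed to overlap — and the read therefore sees the latest write — whenever R + W > N.

```python
def is_strongly_consistent(n_replicas, write_level, read_level):
    """Classic quorum condition: a read overlaps the latest acknowledged
    write when the read and write replica counts sum to more than N."""
    return read_level + write_level > n_replicas

# For N = 3 replicas:
#   write ONE / read ONE       -> 1 + 1 <= 3 : eventual consistency
#   write QUORUM / read QUORUM -> 2 + 2 >  3 : strong consistency
```

Choosing levels per operation lets an application pay the latency cost of larger quorums only for the operations that actually need strong guarantees.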
Bigtable also led to the development of follow-up systems from Google such as Megastore (Baker et al. 2011) and Spanner (Corbett et al. 2012). Megastore and Spanner provide more fine-grained transactional support compared to Bigtable without sacrificing performance requirements in any significant way. Megastore supports ACID transactions at the level of user-specified groups of tuples called entity groups and looser consistency across entity groups. Spanner, on the other hand, supports transactions at a global scale across data centers.

2.5.1.2 Mixed Processing Systems


Systems such as SAP HANA, HYRISE, and HyPer aim to support OLTP and OLAP in a single system. SAP HANA (Färber et al. 2012a, b) is an in-memory relational database management system that can handle both high transaction rates and complex query processing. Figure 2.6 gives an
overview of the general SAP HANA architecture. At the core, SAP HANA
has a set of in-memory processing engines, each specialized in a different
category of data formats. Relational data resides in tables in column or
row layout in the combined column and row engine and can be converted
from one layout to the other to allow query expressions with tables in
both layouts. Graph data (e.g., XML, JSON) and text data reside in the
graph engine and the text engine, respectively; more engines are possible
due to the extensible architecture.
All engines in SAP HANA keep all data in main memory as long as
there is enough space available. All data structures are optimized for cache-
efficiency instead of being optimized for organization in traditional disk
blocks. Furthermore, the engines compress the data using a variety of com-
pression schemes. When the limit of available main memory is reached,
entire data objects, for example tables or partitions, are unloaded from
main memory under the control of application semantics and reloaded into
main memory when they are required again. While virtually all data is kept in main memory by the processing engines for performance reasons, data is stored by the persistence layer for backup and recovery in case of a system restart after an explicit shutdown or a failure (Färber et al. 2012a, b).

Fig. 2.6 SAP HANA architecture
HYRISE (Grund et al. 2012) is a main-memory hybrid database system, which automatically partitions tables into vertical groups of varying widths depending on how the columns of the table are accessed. Smaller
column groups are preferred for OLAP-style data access because, when
scanning a single column, cache locality is improved when the values of
that column are stored contiguously. On the other hand, wider column
groups are preferred for OLTP-style data access because such transactions
frequently insert, delete, update, or access many of the fields of a row, and
colocating those fields leads to better cache locality. Being an in-memory
system, HYRISE identifies the best column grouping based on a detailed
cost model of cache performance in mixed OLAP/OLTP settings.
HyPer (Kemper et al. 2012) is also a main-memory database system
that complements columnar data layouts with sophisticated main-memory
indexing structures based on hashing, balanced search trees (e.g., red-
black trees), and radix trees. Hash indexes enable exact match (e.g., pri-
mary key) accesses that are the most common in transactional processing,
while the tree-structured indexes are essential for small-range queries
that are also encountered here. Finally, HyPer uses adaptive compres-
sion techniques for separating cold (i.e., immutable) data for aggressive
compression from the hot (i.e., mutable) working set data that remains
uncompressed and readily available to mission-critical OLTP queries.
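The complementary roles of HyPer's two index families can be illustrated with a toy sketch (hypothetical stand-ins: a Python dict plays the hash index, and a sorted array with binary search plays the ordered index, where HyPer would use balanced search trees or radix trees):

```python
import bisect

class HybridIndexes:
    """Toy illustration of dual indexing: a hash structure for exact-match
    (primary-key) lookups plus an ordered structure for small-range queries."""

    def __init__(self, rows):
        # Hash index: O(1) expected point lookups, no order information.
        self.hash_index = {r["id"]: r for r in rows}
        # Ordered index: supports range scans via binary search.
        self.ordered_keys = sorted(self.hash_index)

    def point_lookup(self, key):
        return self.hash_index.get(key)

    def range_lookup(self, lo, hi):
        i = bisect.bisect_left(self.ordered_keys, lo)
        j = bisect.bisect_right(self.ordered_keys, hi)
        return [self.hash_index[k] for k in self.ordered_keys[i:j]]
```

The point is the division of labor: the hash structure cannot answer range predicates at all, while the ordered structure pays a logarithmic cost per lookup, so a mixed OLTP workload benefits from keeping both.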

2.5.2 Distributed SQL Query Engines


The demand for more interactive analysis of large datasets has led to the
development of new SQL-like query engines that run on top of distributed
file systems and are optimized for ad hoc analytics. Dremel (Melnik et al. 2010) is such a system that runs on top of GFS (Ghemawat et al. 2003) and Bigtable (Chang et al. 2008). Dremel exposes a SQL-like interface
with extra constructs to query read-only data stored in a new columnar
storage format that supports nested data. Each SQL statement in Dremel
(and the algebraic operators it translates to) takes as input one or multiple
nested tables and the input schema and produces a nested table and its
output schema. The two core technologies of Dremel are columnar stor-
age for nested data and the tree architecture for query execution.
Dremel's data model is based on strongly typed nested records with a schema that forms a tree hierarchy, originating from Protocol Buffers
(Protocol Buffers 2012). The key ideas behind the nested columnar for-
mat are (a) a lossless representation of record structure by encoding the
structure directly into the columnar format, (b) fast encoding of column
stripes by creating a tree of writers whose structure matches the field hier-
archy in the schema, and (c) efficient record assembly by utilizing finite
state machines (Melnik et al. 2010).
Dremel, along with corresponding open-source systems Cloudera Impala (Wanderman-Milne and Li 2014) and Apache Drill (Hausenblas and Nadeau 2013), uses the concept of a multilevel serving tree borrowed from distributed search engines (Croft et al. 2010) to execute queries.
Figure 2.7 shows Dremel's architecture and execution inside a server node. When a root server receives an incoming query, it will rewrite the
query into appropriate subqueries based on metadata information, and
then route the subqueries down to the next level in the serving tree. Each
serving level performs a similar rewriting and re-routing. Eventually, the
subqueries will reach the leaf servers, which communicate with the storage
layer or access the data from local disk. On the way up, the intermediate
servers perform a parallel aggregation of partial results until the result of
the query is assembled back in the root server.
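A single-process analogue of the serving tree makes the aggregation flow concrete (illustrative only; in Dremel each level runs on separate servers and the root also rewrites queries against metadata). Leaves evaluate the query on their data partitions, and each intermediate level merges a fixed fan-in of partial results until one value remains at the root.

```python
def execute_on_serving_tree(query_fn, partitions, fanout=2):
    """Toy multilevel serving tree for an additive aggregate (e.g., COUNT):
    leaves compute partial results; upper levels merge them pairwise."""
    # Leaf level: each "leaf server" evaluates the query on its partition.
    partials = [query_fn(p) for p in partitions]
    # Intermediate levels: repeatedly combine `fanout` partials at a time.
    while len(partials) > 1:
        partials = [sum(partials[i:i + fanout])
                    for i in range(0, len(partials), fanout)]
    return partials[0]  # the root server holds the final result
```

This structure works directly for algebraic aggregates such as COUNT, SUM, MIN, and MAX, where partial results combine associatively; non-algebraic aggregates need a different merge step.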
Fig. 2.7 Dremel architecture and execution inside a server node

Compared to Dremel, which can query only single tables, Cloudera Impala supports both join and aggregate queries over multiple tables. Cloudera Impala can query data stored in HDFS or Apache HBase and uses the same metadata, SQL syntax (HiveQL), and user interface as Apache Hive, providing a unified platform for batch-oriented or real-time queries.
Unlike Cloudera Impala, which was developed to fit nicely within the Hadoop ecosystem, Apache Drill is meant to provide distributed query capabilities across multiple big data platforms, including MongoDB, Cassandra, Riak, and Splunk. Finally, Presto (Traverso 2013) is a distributed SQL query engine developed at Facebook that, unlike Cloudera Impala and Apache Drill, supports standard ANSI SQL, including complex queries, aggregations, joins, and window functions.

2.5.3 Stream Processing Systems


Timely analysis of activity and operational data is critical for companies to stay competitive. Activity data from a company's Web site contains page and content views, searches, as well as advertisements shown and clicked. A user's activity data, in combination with similar data from social friends, can be analyzed for various purposes like providing personalized content and recommendations as well as showing targeted advertisements (Chandramouli et al. 2012). Operational data includes monitoring data collected from Web applications (e.g., request latency) and cluster resources (e.g., CPU usage). Proactive analysis of operational data is used to ensure that Web applications continue to meet all service-level requirements.
The vast majority of analysis over activity and operational data involves
continuous queries processed by stream processing systems. A continuous
query is issued once over streaming data that is constantly updated and
is run continuously. Hence, users get new results as the data changes,
without having to issue the same query repeatedly. Continuous queries
arise naturally over activity and operational data because (a) the data is
generated continuously in the form of append-only streams, and (b) the
data has a time component such that recent data is usually more relevant
than older data.
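A minimal sketch shows how a continuous query differs from a one-shot query (an illustrative generator-based analogue, not any system's API): the query is registered once, and a new result is emitted for every arriving event, computed over a sliding time window so that recent data dominates.

```python
from collections import deque

def continuous_windowed_count(events, window, predicate):
    """Continuous query: after each arriving (timestamp, event) pair, emit
    the count of matching events seen within the last `window` time units."""
    buffer = deque()  # (timestamp, event) pairs still inside the window
    for ts, event in events:
        buffer.append((ts, event))
        # Expire events that have slid out of the window.
        while buffer and buffer[0][0] <= ts - window:
            buffer.popleft()
        yield ts, sum(1 for _, e in buffer if predicate(e))
```

Because results are produced incrementally as the stream advances, the user never re-issues the query; the system pushes updated answers instead.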
The growing interest in continuous queries is reflected by the engi-
neering resources that companies have recently been investing in building
continuous query execution platforms. Yahoo! released S4 (Neumeyer
et al. 2010) in 2010, Twitter released Storm (Storm 2013) in 2011,
and Walmart Labs released Muppet in 2012 (Lam et al. 2012). In addition, systems such as MapReduce Online (Condie et al. 2010) and Facebook's real-time analytical systems (Borthakur et al. 2011) are adding continuous querying capabilities to the popular Hadoop platform for batch analytics. These platforms add to older research projects like Aurora (Abadi et al. 2003), Borealis (Abadi et al. 2005), and STREAM (Babu and Widom 2001), as well as commercial systems like InfoSphere Streams (Biem et al. 2010) and Truviso (Franklin et al. 2009).
S4 (Neumeyer et al. 2010) is a general-purpose, distributed, scalable
platform that allows programmers to develop applications for processing
continuous unbounded streams of data. S4 implements the actors programming paradigm. A user's program is defined in terms of Processing Elements (PEs) and Adapters, while the framework instantiates one PE
for each unique key in the data stream. Each PE consumes the events and
does one or both of the following: (a) emit one or more events which may
be consumed by other PEs, (b) publish results. Execution-wise, S4 uses
the push model for pushing events from one PE to the next. If a receiver
buffer gets full, events are dropped to ensure the system will not get over-
loaded. Finally, S4 provides state recovery via uncoordinated checkpoint-
ing. When a node crashes, a new node takes over its task and restarts from
a recent snapshot of its state. Events sent after the last checkpoint and
before the recovery are lost.
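The one-PE-per-key instantiation rule can be sketched in a few lines (a single-process illustration with hypothetical names, not S4's actual API): the framework lazily creates a PE the first time a key appears and pushes every subsequent event for that key to the same PE instance.

```python
class CounterPE:
    """A Processing Element that counts the events for a single key."""
    def __init__(self, key):
        self.key = key
        self.count = 0

    def process(self, event):
        self.count += 1  # a real PE could also emit events downstream

def dispatch(stream, pe_class):
    """S4-style keyed dispatch: one PE instance per unique key, created
    on first use; each event is pushed to its key's PE."""
    pes = {}
    for key, event in stream:
        if key not in pes:
            pes[key] = pe_class(key)
        pes[key].process(event)
    return pes
```

Keyed instantiation is what makes the model scale out naturally: disjoint key ranges (and their PEs) can be placed on different nodes without shared state.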
Storm (Storm 2013) is another platform for processing continuous
unbounded streams of data but with a different programming paradigm
and architecture compared to S4. A program in Storm is defined in terms
of spouts (the sources) and bolts (the processing vertices) arranged in a
specific topology. The number of bolts to instantiate is defined a priori
and each bolt will process a partition of the stream. Unlike S4, Storm uses
a pull model where each bolt pulls events from its source, be it a spout or
another bolt. Event loss can, therefore, happen only at ingestion time in
the spouts when the external event rate is higher than what the system can
process. Finally, Storm provides guaranteed delivery of events: an event either traverses the entire pipeline within a time interval or is declared failed and can be replayed from the start by the spout.
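The pull model contrasted above can be sketched with a toy spout/bolt pair (an illustrative single-process analogue with hypothetical class names; Storm itself wires bolts into a distributed topology and adds acking for guaranteed delivery): each bolt asks its upstream source for the next event instead of having events pushed at it, so a slow bolt naturally throttles the stage above it.

```python
from collections import deque

class Spout:
    """Source node: holds ingested events; downstream bolts pull from it."""
    def __init__(self, events):
        self.queue = deque(events)

    def pull(self):
        return self.queue.popleft() if self.queue else None

class FilterBolt:
    """Processing vertex that pulls from its upstream node (a spout or
    another bolt) and passes through only the events matching a predicate."""
    def __init__(self, source, predicate):
        self.source = source
        self.predicate = predicate

    def pull(self):
        while True:
            event = self.source.pull()
            if event is None:
                return None  # upstream exhausted (no loss mid-pipeline)
            if self.predicate(event):
                return event

def drain(node):
    """Pull events from a node until it is exhausted."""
    out = []
    while (e := node.pull()) is not None:
        out.append(e)
    return out
```

Because every transfer is initiated by the consumer, the only place events can be dropped is at ingestion in the spout, matching the behavior described above.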

2.6 Conclusions
A major part of the challenge in data analytics today comes from the sheer
volume of data available for processing. Data volumes that many companies want to process in timely and cost-efficient ways have grown steadily from the multigigabyte range to terabytes and now to many petabytes. All
data storage and processing systems that we presented in this monograph
were aimed at handling such large datasets. This challenge of dealing
with very large datasets has been termed the volume challenge. There are
two other related challenges, namely, those of velocity and variety (Laney
2001).
The velocity challenge refers to the short response-time require-
ments for collecting, storing, and processing data. Most of the sys-
tems in the MapReduce and Dataflow categories are batch systems. For
latency-sensitive applications, such as identifying potential fraud and
recommending personalized content, batch data processing is insuf-
ficient. The data may need to be processed as it streams into the system
in order to extract the maximum utility from the data. Systems for
interactive analytics are typically optimized for addressing the velocity
challenge.
The variety challenge refers to the growing list of data types (relational, time series, text, graphs, audio, video, images, and genetic codes) as well as the growing list of analysis techniques on such data. New insights are
found while analyzing more than one of these data types together using
a variety of analytical techniques such as linear algebra, statistical machine
learning, text search, signal processing, natural language processing, and
iterative graph processing.
Several higher-level systems and tools have been built on top of the
systems described in this monograph for implementing these techniques,
which drive automated processes for spam and fraud detection, advertise-
ment placement, Web site optimization, and customer relationship man-
agement. BI tools, such as SAS, SAP Business Objects, IBM Cognos, SPSS
Modeler, Oracle Hyperion, and Microsoft BI, provide support for reporting,
online analytical processing, data mining, process mining, and predictive
analytics based on data stored primarily in data warehouses. Other software platforms such as Tableau and Spotfire specialize in interactive data
visualization of business data. In particular, these platforms query rela-
tional databases, cubes, cloud databases, and spreadsheets to generate a
number of graph types that can be combined into analytical dashboards
and applications. Both platforms also support visualizing large-scale data
stored in distributed file systems such as HDFS. On the other hand, companies like Datameer, Karmasphere, and Platfora offer BI solutions that specifically target the Hadoop ecosystem.

References
Abadi, Daniel J., Don Carney, Ugur Cetintemel, Mitch Cherniack, Christian Convey, Sangdon Lee, Michael Stonebraker, Nesime Tatbul, and Stan Zdonik. 2003. Aurora: A new model and architecture for data stream management. The VLDB Journal: The International Journal on Very Large Data Bases 12(2): 120-139.
Abadi, Daniel J., Yanif Ahmad, Magdalena Balazinska, Ugur Cetintemel, Mitch Cherniack, Jeong-Hyon Hwang, Wolfgang Lindner, et al. 2005. The design of the Borealis stream processing engine. CIDR 5: 277-289.
Abadi, Daniel J., Daniel S. Myers, David J. DeWitt, and Samuel R. Madden. 2007. Materialization strategies in a column-oriented DBMS. In Data Engineering, IEEE 23rd International Conference on, 466-475.
Abadi, Daniel J., Peter A. Boncz, and Stavros Harizopoulos. 2009. Column-oriented database systems. Proceedings of the VLDB Endowment 2(2): 1664-1665.
Abouzeid, Azza, Kamil Bajda-Pawlikowski, Daniel Abadi, Avi Silberschatz, and Alexander Rasin. 2009. HadoopDB: An architectural hybrid of MapReduce and DBMS technologies for analytical workloads. Proceedings of the VLDB Endowment 2(1): 922-933.
Agrawal, Sanjay, Vivek Narasayya, and Beverly Yang. 2004. Integrating vertical and horizontal partitioning into automated physical database design. In Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data, 359-370.
Ailamaki, Anastassia, David J. DeWitt, Mark D. Hill, and Marios Skounakis. 2001. Weaving relations for cache performance. VLDB 1: 169-180.
Alexandrov, Alexander, Max Heimel, Volker Markl, Dominic Battré, Fabian Hueske, Erik Nijkamp, Stephan Ewen, Odej Kao, and Daniel Warneke. 2010. Massively parallel data analysis with PACTs on Nephele. Proceedings of the VLDB Endowment 3(1-2): 1625-1628.
Alexandrov, Alexander, Rico Bergmann, Stephan Ewen, Johann-Christoph Freytag, Fabian Hueske, Arvid Heise, Odej Kao, et al. 2014. The Stratosphere platform for big data analytics. The VLDB Journal: The International Journal on Very Large Data Bases 23(6): 939-964.
Amazon. 2013. Amazon simple storage service (S3). Accessed 2013. http://aws.amazon.com/s3/
Babu, Shivnath, and Jennifer Widom. 2001. Continuous queries over data streams. ACM SIGMOD Record 30(3): 109-120.
Bajda-Pawlikowski, Kamil, Daniel J. Abadi, Avi Silberschatz, and Erik Paulson. 2011. Efficient processing of data warehousing queries in a split execution environment. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data, 1165-1176.
Baker, Jason, Chris Bond, James C. Corbett, J. J. Furman, Andrey Khorlin, James Larson, Jean-Michel Leon, Yawei Li, Alexander Lloyd, and Vadim Yushprakh. 2011. Megastore: Providing scalable, highly available storage for interactive services. CIDR 11: 223-234.
Baru, Chaitanya K., Gilles Fecteau, Ambuj Goyal, H. Hsiao, Anant Jhingran, Sriram Padmanabhan, George P. Copeland, and Walter G. Wilson. 1995. DB2 parallel edition. IBM Systems Journal 34(2): 292-322.
Battré, Dominic, Stephan Ewen, Fabian Hueske, Odej Kao, Volker Markl, and Daniel Warneke. 2010. Nephele/PACTs: A programming model and execution framework for web-scale analytical processing. In Proceedings of the 1st ACM Symposium on Cloud Computing, 119-130.
Behm, Alexander, Vinayak R. Borkar, Michael J. Carey, Raman Grover, Chen Li, Nicola Onose, Rares Vernica, Alin Deutsch, Yannis Papakonstantinou, and Vassilis J. Tsotras. 2011. ASTERIX: Towards a scalable, semistructured data platform for evolving-world models. Distributed and Parallel Databases 29(3): 185-216.
Biem, Alain, Eric Bouillet, Hanhua Feng, Anand Ranganathan, Anton Riabov, Olivier Verscheure, Haris Koutsopoulos, and Carlos Moran. 2010. IBM InfoSphere Streams for scalable, real-time, intelligent transportation services. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data, 1093-1104.
Boncz, Peter A., Marcin Zukowski, and Niels Nes. 2005. MonetDB/X100: Hyper-pipelining query execution. CIDR 5: 225-237.
Boncz, Peter, Torsten Grust, Maurice Van Keulen, Stefan Manegold, Jan Rittinger, and Jens Teubner. 2006. MonetDB/XQuery: A fast XQuery processor powered by a relational engine. In Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data, 479-490.
Borkar, Vinayak, Michael Carey, Raman Grover, Nicola Onose, and Rares Vernica. 2011. Hyracks: A flexible and extensible foundation for data-intensive computing. In 2011 IEEE 27th International Conference on Data Engineering (ICDE), 1151-1162.
Borthakur, Dhruba, Jonathan Gray, Joydeep Sen Sarma, Kannan Muthukkaruppan, Nicolas Spiegelberg, Hairong Kuang, Karthik Ranganathan, et al. 2011. Apache Hadoop goes realtime at Facebook. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data, 1071-1080.
Bu, Yingyi, Bill Howe, Magdalena Balazinska, and Michael D. Ernst. 2010. HaLoop: Efficient iterative data processing on large clusters. Proceedings of the VLDB Endowment 3(1-2): 285-296.
Protocol Buffers. 2012. Developer guide. Accessed 2012.
Calder, Brad, Ju Wang, Aaron Ogus, Niranjan Nilakantan, Arild Skjolsvold, Sam McKelvie, Yikang Xu, et al. 2011. Windows Azure storage: A highly available cloud storage service with strong consistency. In Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles, 143-157.
Cascading. 2011. Cascading: Application platform for enterprise big data. http://www.cascading.org/
Chambers, Craig, Ashish Raniwala, Frances Perry, Stephen Adams, Robert R. Henry, Robert Bradshaw, and Nathan Weizenbaum. 2010. FlumeJava: Easy, efficient data-parallel pipelines. ACM SIGPLAN Notices 45(6): 363-375.
Chandramouli, Badrish, Jonathan Goldstein, and Songyun Duan. 2012. Temporal analytics on big data for web advertising. In 2012 IEEE 28th International Conference on Data Engineering (ICDE), 90-101.
Chang, Fay, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber. 2008. Bigtable: A distributed storage system for structured data. ACM Transactions on Computer Systems (TOCS) 26(2): 4.
Chen, Songting. 2010. Cheetah: A high performance, custom data warehouse on top of MapReduce. Proceedings of the VLDB Endowment 3(1-2): 1459-1468.
Chen, Hsinchun, Roger H. L. Chiang, and Veda C. Storey. 2012. Business intelligence and analytics: From big data to big impact. MIS Quarterly 36(4): 1165-1188.
Cohen, Jeffrey, Brian Dolan, Mark Dunlap, Joseph M. Hellerstein, and Caleb Welton. 2009. MAD skills: New analysis practices for big data. Proceedings of the VLDB Endowment 2(2): 1481-1492.
Condie, Tyson, Neil Conway, Peter Alvaro, Joseph M. Hellerstein, Khaled Elmeleegy, and Russell Sears. 2010. MapReduce online. NSDI 10(4): 20.
Corbett, J. C., J. Dean, and M. Epstein. 2012. Spanner: Google's globally distributed database. In Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation, 251-264. Berkeley, CA: USENIX Association.
Croft, W. Bruce, Donald Metzler, and Trevor Strohman. 2010. Search engines: Information retrieval in practice. Reading: Addison-Wesley.
Dean, Jeffrey, and Sanjay Ghemawat. 2004. MapReduce: Simplified data processing on large clusters. In Proceedings of the 6th Symposium on Operating Systems Design and Implementation (OSDI). San Francisco, CA: USENIX Association.
Dean, Jeffrey, and Sanjay Ghemawat. 2008. MapReduce: Simplified data processing on large clusters. Communications of the ACM 51(1): 107-113.
DeWitt, David, and Jim Gray. 1992. Parallel database systems: The future of high performance database systems. Communications of the ACM 35(6): 85-98.
DeWitt, David J., Shahram Ghandeharizadeh, Donovan Schneider, Allan Bricker, Hui-I. Hsiao, and Rick Rasmussen. 1990. The Gamma database machine project. IEEE Transactions on Knowledge and Data Engineering 2(1): 44-62.
DeWitt, David J., Jeffrey F. Naughton, Donovan A. Schneider, and Srinivasan Seshadri. 1992. Practical skew handling in parallel joins. Madison: University of Wisconsin-Madison, Computer Sciences Department.
Dittrich, Jens, Jorge-Arnulfo Quiané-Ruiz, Alekh Jindal, Yagiz Kargin, Vinay Setty, and Jörg Schad. 2010. Hadoop++: Making a yellow elephant run like a cheetah (without it even noticing). Proceedings of the VLDB Endowment 3(1-2): 515-529.
Dittrich, Jens, Jorge-Arnulfo Quiané-Ruiz, Stefan Richter, Stefan Schuh, Alekh Jindal, and Jörg Schad. 2012. Only aggressive elephants are fast elephants. Proceedings of the VLDB Endowment 5(11): 1591-1602.
Ekanayake, Jaliya, Hui Li, Bingjing Zhang, Thilina Gunarathne, Seung-Hee Bae, Judy Qiu, and Geoffrey Fox. 2010. Twister: A runtime for iterative MapReduce. In Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, 810-818.
Eltabakh, Mohamed Y., Yuanyuan Tian, Fatma Özcan, Rainer Gemulla, Aljoscha Krettek, and John McPherson. 2011. CoHadoop: Flexible data placement and its exploitation in Hadoop. Proceedings of the VLDB Endowment 4(9): 575-585.
Färber, Franz, Sang Kyun Cha, Jürgen Primsch, Christof Bornhövd, Stefan Sigg, and Wolfgang Lehner. 2012a. SAP HANA database: Data management for modern business applications. ACM SIGMOD Record 40(4): 45-51.
Färber, Franz, Norman May, Wolfgang Lehner, Philipp Große, Ingo Müller, Hannes Rauhe, and Jonathan Dees. 2012b. The SAP HANA database: An architecture overview. IEEE Data Engineering Bulletin 35(1): 28-33.
Floratou, Avrilia, Jignesh M. Patel, Eugene J. Shekita, and Sandeep Tata. 2011. Column-oriented storage techniques for MapReduce. Proceedings of the VLDB Endowment 4(7): 419-429.
Frankel, Felice, and Rosalind Reid. 2008. Big data: Distilling meaning from data. Nature 455(7209): 30.
Franklin, Michael J., Sailesh Krishnamurthy, Neil Conway, Alan Li, Alex Russakovsky, and Neil Thombre. 2009. Continuous analytics: Rethinking query processing in a network-effect world. In CIDR.
George, Lars. 2011. HBase: The definitive guide. USA: O'Reilly Media, Inc.
Ghemawat, Sanjay, Howard Gobioff, and Shun-Tak Leung. 2003. The Google file system. ACM SIGOPS Operating Systems Review 37(5): 29-43.
Greenplum. 2013. Pivotal Greenplum database. Accessed 2013. http://www.pivotal.io/big-data/pivotal-greenplum-database
Grund, Martin, Philippe Cudré-Mauroux, Jens Krüger, Samuel Madden, and Hasso Plattner. 2012. An overview of HYRISE: A main memory hybrid storage engine. IEEE Data Engineering Bulletin 35(1): 52-57.
Hausenblas, Michael, and Jacques Nadeau. 2013. Apache Drill: Interactive ad-hoc analysis at scale. Big Data 1(2): 100-104.
He, Yongqiang, Rubao Lee, Yin Huai, Zheng Shao, Namit Jain, Xiaodong Zhang, and Zhiwei Xu. 2011. RCFile: A fast and space-efficient data placement structure in MapReduce-based warehouse systems. In 2011 IEEE 27th International Conference on Data Engineering (ICDE), 1199-1208.
Herodotou, Herodotos, Nedyalko Borisov, and Shivnath Babu. 2011. Query optimization techniques for partitioned tables. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data, 49-60.
Hoffman, Steve. 2015. Apache Flume: Distributed log collection for Hadoop. Birmingham: Packt Publishing.
Hsiao, Hui-I, and David J. DeWitt. 1990. Chained declustering: A new availability strategy for multiprocessor database machines. Madison: University of Wisconsin-Madison, Computer Sciences Department.
IBM Corporation. 2007. IBM knowledge center: Partitioned tables. Accessed 2007. http://publib.boulder.ibm.com/infocenter/db2luw/v9r7/topic/com.ibm.db2.luw.admin.partition.doc/doc/c0021560.html
IBM Netezza. 2012. IBM Netezza data warehouse appliances. Accessed 2012. http://www-01.ibm.com/software/data/netezza/
Idreos, Stratos, Fabian Groffen, Niels Nes, Stefan Manegold, K. Sjoerd Mullender, and Martin L. Kersten. 2012. MonetDB: Two decades of research in column-oriented database architectures. IEEE Data Engineering Bulletin 35(1): 40-45.
Infobright. 2013. Infobright: Analytic database for the internet of things. Accessed 2013. http://www.infobright.com/
Isard, Michael, and Yuan Yu. 2009. Distributed data-parallel computing using a high-level programming language. In Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data, 987-994.
Isard, Michael, Mihai Budiu, Yuan Yu, Andrew Birrell, and Dennis Fetterly. 2007. Dryad: Distributed data-parallel programs from sequential building blocks. ACM SIGOPS Operating Systems Review 41(3): 59-72.
Islam, Mohammad, Angelo K. Huang, Mohamed Battisha, Michelle Chiang, Santhosh Srinivasan, Craig Peters, Andreas Neumann, and Alejandro Abdelnur. 2012. Oozie: Towards a scalable workflow management system for Hadoop. In Proceedings of the 1st ACM SIGMOD Workshop on Scalable Workflow Execution Engines and Technologies, 4.
Kemper, Alfons, Thomas Neumann, Florian Funke, Viktor Leis, and Henrik Mühe. 2012. HyPer: Adapting columnar main-memory data management for transactional and query processing. IEEE Data Engineering Bulletin 35(1): 46-51.
KFS. 2013. Kosmos distributed file system. Accessed 2013. http://code.google.com/p/kosmosfs/
Lakshman, Avinash, and Prashant Malik. 2010. Cassandra: A decentralized structured storage system. ACM SIGOPS Operating Systems Review 44(2): 35-40.
Lam, Wang, Lu Liu, S. T. S. Prasad, Anand Rajaraman, Zoheb Vacheri, and AnHai Doan. 2012. Muppet: MapReduce-style processing of fast data. Proceedings of the VLDB Endowment 5(12): 1814-1825.
Lamb, Andrew, Matt Fuller, Ramakrishna Varadarajan, Nga Tran, Ben Vandiver, Lyric Doshi, and Chuck Bear. 2012. The Vertica analytic database: C-Store 7 years later. Proceedings of the VLDB Endowment 5(12): 1790-1801.
Laney, Doug. 2001. 3D data management: Controlling data volume, velocity and variety. META Group Research Note 6: 70.
Lee, Rubao, Tian Luo, Yin Huai, Fusheng Wang, Yongqiang He, and Xiaodong Zhang. 2011. YSmart: Yet another SQL-to-MapReduce translator. In 2011 31st International Conference on Distributed Computing Systems (ICDCS), 25-36.
Lee, George, Jimmy Lin, Chuang Liu, Andrew Lorek, and Dmitriy Ryaboy. 2012. The unified logging infrastructure for data analytics at Twitter. Proceedings of the VLDB Endowment 5(12): 1771-1780.
Lin, Yuting, Divyakant Agrawal, Chun Chen, Beng Chin Ooi, and Sai Wu. 2011. Llama: Leveraging columnar storage for scalable join processing in the MapReduce framework. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data, 961-972.
Low, Yucheng, Danny Bickson, Joseph Gonzalez, Carlos Guestrin, Aapo Kyrola, and Joseph M. Hellerstein. 2012. Distributed GraphLab: A framework for machine learning and data mining in the cloud. Proceedings of the VLDB Endowment 5(8): 716-727.
MacNicol, Roger, and Blaine French. 2004. Sybase IQ Multiplex: Designed for analytics. In Proceedings of the Thirtieth International Conference on Very Large Data Bases, vol. 30, 1227-1230. Seoul: VLDB Endowment.
Malewicz, Grzegorz, Matthew H. Austern, Aart J. C. Bik, James C. Dehnert, Ilan Horn, Naty Leiser, and Grzegorz Czajkowski. 2010. Pregel: A system for large-scale graph processing. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data, 135-146.
MapR. 2013. MapR file system. Accessed 2013. http://www.mapr.com/products/apache-hadoop
Mehta, Manish, and David J. DeWitt. 1997. Data placement in shared-nothing parallel database systems. The VLDB Journal: The International Journal on Very Large Data Bases 6(1): 53-72.
Meijer, Erik, Brian Beckman, and Gavin Bierman. 2006. LINQ: Reconciling object, relations and XML in the .NET framework. In Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data, 706.
Melnik, Sergey, Andrey Gubarev, Jing Jing Long, Geoffrey Romer, Shiva Shivakumar, Matt Tolton, and Theo Vassilakis. 2010. Dremel: Interactive analysis of web-scale datasets. Proceedings of the VLDB Endowment 3(1-2): 330-339.
Morales, Tony. 2007. Oracle database VLDB and partitioning guide 11g release 1 (11.1). Oracle, July 2007.
Neumeyer, Leonardo, Bruce Robbins, Anish Nair, and Anand Kesari. 2010. S4: Distributed stream computing platform. In 2010 IEEE International Conference on Data Mining Workshops (ICDMW), 170-177.
Nykiel, Tomasz, Michalis Potamias, Chaitanya Mishra, George Kollios, and Nick Koudas. 2010. MRShare: Sharing across multiple queries in MapReduce. Proceedings of the VLDB Endowment 3(1-2): 494-505.
Olston, Christopher, Benjamin Reed, Utkarsh Srivastava, Ravi Kumar, and Andrew Tomkins. 2008. Pig Latin: A not-so-foreign language for data processing. In
Proceedings of the 2008 ACM SIGMOD International Conference on
Management of Data, 10991110.
Ovsiannikov, Michael, Silvius Rus, Damian Reeves, Paul Sutter, Sriram Rao, and
Jim Kelly. 2013. The quantcast file system. Proceedings of the VLDB Endowment
6(11): 10921101.
ParAccel. 2013. ParAccel analytic platform. Accessed 2013. http://www.paraccel.
com/
Rabkin, Ariel, and Randy H.Katz. 2010. Chukwa: A system for reliable large-scale
log collection. LISA 10: 115.
Rao, Jun, Chun Zhang, Nimrod Megiddo, and Guy Lohman. 2002. Automating
physical database design in a parallel database. In Proceedings of the 2002 ACM
SIGMOD International Conference on Management of Data, 558569.
Shvachko, Konstantin, Hairong Kuang, Sanjay Radia, and Robert Chansler. 2010.
The hadoop distributed file system. In 2010 IEEE 26th Symposium on Mass
Storage Systems and Technologies (MSST), 110.
Stonebraker, Mike, Daniel J. Abadi, Adam Batkin, Xuedong Chen, Mitch
Cherniack, Miguel Ferreira, Edmond Lau, et al. 2005. C-store: A column-
oriented DBMS. In Proceedings of the 31st International Conference on Very
Large Data Bases, 553564. Seoul: VLDB Endowment.
Storm, Apache. 2013. Storm, distributed and fault-tolerant real-time
computation.
Sumbaly, Roshan, Jay Kreps, and Sam Shah. 2013. The big data ecosystem at
linkedin. In Proceedings of the 2013 ACM SIGMOD International Conference
on Management of Data, 11251134.
Talmage, Ron. 2009. Partitioned table and index strategies using SQL server
2008. MSDN Library, March 2009.
Teradata. 2012. Teradata enterprise data warehouse. Accessed 2012. http://www.
teradata.com
Thusoo, Ashish, Joydeep Sen Sarma, Namit Jain, Zheng Shao, Prasad Chakka,
Suresh Anthony, Hao Liu, Pete Wyckoff, and Raghotham Murthy. 2009. Hive:
A warehousing solution over a map-reduce framework. Proceedings of the VLDB
Endowment 2(2): 16261629.
Thusoo, Ashish, Zheng Shao, Suresh Anthony, Dhruba Borthakur, Namit Jain,
Joydeep Sen Sarma, Raghotham Murthy, and Hao Liu. 2010. Data warehous-
ing and analytics infrastructure at Facebook. In Proceedings of the 2010 ACM
SIGMOD International Conference on Management of Data, 10131020.
Traverso, Martin. 2013. Presto: Interacting with petabytes of data at Facebook.
Retrieved February 4, 2014.
Wanderman-Milne, Skye, and Li Nong. 2014. Runtime code generation in clou-
dera impala. IEEE Data Eng. Bull. 37(1): 3137.
BUSINESS INTELLIGENCE ANDANALYTICS: BIG SYSTEMS FORBIG DATA 49

Weil, Sage A., Scott A. Brandt, Ethan L. Miller, Darrell DE Long, and Carlos
Maltzahn. 2006. Ceph: A scalable, high-performance distributed file system. In
Proceedings of the 7th Symposium on Operating Systems Design and
Implementation, 307320. Berkeley, CA: USENIX Association.
White, Tom. 2010. Hadoop: The definitive guide. Sunnyvale, CA: Yahoo.
Wu, Lili, Roshan Sumbaly, Chris Riccomini, Gordon Koo, Hyung Jin Kim, Jay
Kreps, and Sam Shah. 2012. Avatara: Olap for web-scale analytics products.
Proceedings of the VLDB Endowment 5(12): 18741877.
Xin, Reynold S., Joseph E.Gonzalez, Michael J.Franklin, and Ion Stoica. 2013a.
Graphx: A resilient distributed graph system on spark. In First International
Workshop on Graph Data Management Experiences and Systems, 2.
Xin, Reynold S., Josh Rosen, Matei Zaharia, Michael J.Franklin, Scott Shenker,
and Ion Stoica. 2013b. Shark: SQL and rich analytics at scale. In Proceedings of
the 2013 ACM SIGMOD International Conference on Management of Data,
1324.
Zaharia, Matei, Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Justin Ma,
Murphy McCauley, Michael J.Franklin, Scott Shenker, and Ion Stoica. 2012.
Resilient distributed datasets: A fault-tolerant abstraction for in-memory clus-
ter computing. In Proceedings of the 9th USENIX Conference on Networked
Systems Design and Implementation, 22. Berkeley, CA: USENIX Association.
Zhang, Yanfeng, Qixin Gao, Lixin Gao, and Cuirong Wang. 2011. Priter: A dis-
tributed framework for prioritized iterative computations. In Proceedings of the
2nd ACM Symposium on Cloud Computing, 13.
Zhou, Jingren, Nicolas Bruno, Ming-Chuan Wu, Per-Ake Larson, Ronnie
Chaiken, and Darren Shakib. 2012. SCOPE: Parallel databases meet
MapReduce. The VLDB JournalThe International Journal on Very Large
Data Bases 21(5): 611636.
Zukowski, Marcin, and Peter A.Boncz. 2012. Vectorwise: Beyond column stores.
IEEE Data Engineering Bulletin 35(1): 2127.
CHAPTER 3

Business Analytics for Price Trend Forecasting through Textual Data

Marco Pospiech and Carsten Felden

3.1 Introduction
Living in the era of Big Data (Labrinidis and Jagadish 2012) means having an increasing amount of structured and unstructured data accessible and useful for different business needs (Gartner 2013). Price prediction is one business analytics scenario among several that illustrates the need for integrating heterogeneous data (Pospiech and Felden 2014). In all business domains, markets react sensitively and quickly to relevant news (Chan et al. 2001). For example, news tickers contain a broad range of edited topics, from geopolitical events to financial data, and are currently evaluated and integrated with structured (usually internal) data manually in decision processes. Since the value of information decreases quickly and decision-makers are overwhelmed by the perceived information flood, business analytics is useful for automating data and information analysis. Existing business analytics approaches consider both data types separately, but the analytical benefit comes from an integrated perspective,

M. Pospiech (*) C. Felden (*)


TU Freiberg, Institute of Information Science, Freiberg, Germany
e-mail: marco.pospiech@bwl.tu-freiberg.de; carsten.felden@bwl.tu-freiberg.de

© The Author(s) 2017


E.G. Carayannis, S. Sindakis (eds.), Analytics, Innovation, and
Excellence-Driven Enterprise Sustainability, Palgrave Studies
in Democracy, Innovation, and Entrepreneurship for Growth,
DOI 10.1057/978-1-137-37879-8_3
which is in the spirit of Big Data. However, taking just one kind of data set into account leads to the limitation that an entire market overview is not possible. In addition, available approaches do not consider real-time events, and for this reason forecasts are calculated on predefined time intervals. Using the named example of price prediction, we illustrate how the combination of unstructured and structured data sources can generate value. This chapter presents a general forecasting approach based on news tickers and market-related indicators by applying data mining algorithms. A classification is performed to predict positive and negative price trends, and based on this, forecasting models are deduced. Patterns are extracted from historical price movements caused by various attributes, so that similar characteristics can be understood as the repetition of similar trends in the future. The functionality is demonstrated by two different case studies.
The chapter is organized as follows: after a literature review, we introduce the forecasting process in general. Here, unstructured textual data and various environment conditions such as currency exchange rates are mapped and classified. The approach is applied to the natural gas and electricity markets, whereby the specific environment conditions differ for both. Real projects show the implementation of the forecast system for trading floors. The chapter ends with a conclusion.

3.2 Text-Based Business Analytics in Price Forecasting
Since the advent of the Non-Random Walk theory, evidence exists that analytics are useful to scrutinize patterns and predict market prices. Thus, technical analysis can process valuable information automatically to predict price developments prior to market adjustments. As a consequence, traders have to adjust their strategies to earn profits (Lo and MacKinlay 1999). In the past, forecasts of stock prices were supported by quantitative techniques like regression analysis (Pring 1991). Nevertheless, Chan et al. (2001) found evidence that political and economic news articles affect trading activities. Hence, more recent approaches use news tickers to predict future developments. The beginnings date back to Wuthrich et al. (1998) and Lavrenko et al. (2000). Wuthrich et al. used web articles to predict daily closing trends (up, down, stable) of major stock exchanges. Given the technical circumstances at that time, those beginnings were limited, because only 400 individual word sequences, defined by experts, were applied. The accuracy was stated as
43.60 percent. In contrast, Lavrenko et al. (2000) developed an analytical system which suggests to traders the articles they should read, because those are most likely to indicate an upcoming trend. They used language models, wherein a bag of words based on probabilities leads to a specific trend (increasing, decreasing, flat). The results led to an average profit of 0.23 percent per trade. Later on, Mittermayer (2004) developed NewsCATS. The system categorizes news tickers based on the potential price movement (good news, bad news, no movers). Based on the given categories, trading strategies are suggested. The performance was low: only 6 percent of good and 5 percent of bad news were identified. In 2006, Schumaker and Chen introduced AZFinText. Compared to other approaches at that time, they tried to predict the exact value of a price 20 minutes after a news publication. Applying a support vector machine (SVM), the system outperformed the regression model. Evaluating the direction accuracy, the best model achieved 50.08 percent.
Within the next years, the era of Big Data emerged, and more and more different information sources were used. In fact, Felden and Chamoni (2003) were the first to classify relevant text documents and map them to price charts on a business intelligence platform. However, a real price prediction based on integrated structured market and unstructured text data to achieve a better forecast performance goes back to Geva and Zahavi (2010). They found evidence that the combination resulted in better accuracy than using the text or market data separately. The average return per trade was 0.62 percent. Nevertheless, the system predicts not in real time but rather within a 15-minute interval. Instead of financial reports or press releases, Oh and Sheng (2011) analyzed whether stock-related micro blogs can improve price forecasts. Here, the micro blog service allows users to monitor the activities of traders and investors. Their promising results support the usage of micro blogs. Similar micro blog forecasts can be found, for example, in Nann et al. (2013).
This brief discussion shows that this business analytics topic is still broadly discussed and of major interest. Step by step, more precise outcomes have been achieved. However, the examined systems contain drawbacks. Most of the approaches consider only company-specific news; market-related news tickers are faded out. Almost all systems forecast not in real time, but rather on a specified time interval. Taking these drawbacks into account, we discuss a different business analytics approach: whenever a news ticker is published, the system forecasts instantly. In contrast to other systems, the training data is labeled automatically and not
by experts. Another issue concerns the mapping hypothesis between news ticker and price reaction. Several approaches (Wuthrich et al. 1998; Lavrenko et al. 2000) linked tickers published several hours or even days before the price movement happened. But it is questionable whether such a late price movement is really caused by a news ticker.

3.3 Existing Solutions in Price Trend Forecasting: An Approach Using Textual Data
This section provides a possible solution to address the stated drawbacks of existing forecasting systems. In cooperation with a globally acting media and information (news) company, we developed a generalizable approach that uses news tickers and market data for price trend forecasts. The approach is subdivided into a training process (Fig. 3.1) and a live process (Fig. 3.5).
Business analytics refers to a process in which not only mass data are analyzed but also appropriate methods are processed and evaluated to enable strategic management control within organizations (Davenport and Harris 2007). Business analytics implies the use of models. Those models may be manual or automatic, as known from Knowledge Discovery in Databases (KDD) (Turban et al. 2004). We use established KDD techniques as the basis for our price trend prediction approach. According to Fayyad et al. (1996), KDD is the nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data. Hereby, knowledge extraction occurs through five process steps: (1) selection, (2) pre-processing, (3) transformation, (4) data/text mining, and (5) interpretation/evaluation.
According to the project goal, object-relevant data have to be selected first. Hereby, key aspects such as data access and data cleaning have to be considered. The proposed forecast approach (Fig. 3.1) uses structured market data and unstructured news tickers because, as the argumentation above has shown, their combination achieved the best results in the literature (Geva and Zahavi 2010; Pospiech and Felden 2014). In addition, the system uses trend classification, because exact price predictions are more inaccurate (Schumaker and Chen 2006; Oh and Sheng 2011). Based on historical prices, possible manifestations/classes are derived for all examples. Thus, an example will belong to the class UP if the price of the previous transaction is lower than the following one. If the price of the previous transaction is higher than the following
Fig. 3.1 General training process


one, the example will be put into the class DOWN. The class STABLE represents a no-mover status: here, no remarkable price change between two trade transactions happened (Lavrenko et al. 2000). Because markets differ from each other, domain experts have to specify how many price points of movement are needed to represent a meaningful movement. The historical movements need an explanation; in cooperation with domain experts, variables and data sources explaining the developments should be identified.
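The class derivation described above can be sketched as a small labeling function. The 0.1-point STABLE threshold mirrors the value used later in the electricity case; the price series itself is invented for illustration.

```python
def label_trend(prev_price, next_price, threshold=0.1):
    """Label a trade relative to the previous one.

    The threshold (a market-specific value set by domain experts)
    separates STABLE no-movers from meaningful movements.
    """
    delta = next_price - prev_price
    if delta > threshold:
        return "UP"
    if delta < -threshold:
        return "DOWN"
    return "STABLE"

# Example: four successive trade prices yield three labeled examples.
prices = [50.00, 50.20, 50.15, 49.80]
labels = [label_trend(a, b) for a, b in zip(prices, prices[1:])]
# labels == ["UP", "STABLE", "DOWN"]
```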
A possible explanation could be seen in historical news tickers (Chan et al. 2001). In fact, the amount of news tickers is growing. The filtering step aims at selecting as many relevant articles as possible. Depending on the market, specific topics, keywords, or time intervals are potential filters. This step is of major importance, because irrelevant articles contain no explanation for price movements and will decrease model accuracy (Khandar and Dani 2010). The next step maps historical news tickers, historical market data, and historical prices to investigate the effect upon the price trend. In fact, it is vague how quickly the trend development shows a response to a message event: some will need hours, others minutes, until a price adjustment occurs. A time interval needs to be specified by domain experts. Here, the mapping is one of the biggest challenges within prediction through text documents (Lavrenko et al. 2000). Figure 3.2 shows two possible mapping hypotheses (the time interval is set to two minutes).

Backward mapping: This approach assumes that a trade transaction is caused by a message in the past (Lavrenko et al. 2000) and the market data during the transaction. Thus, the market data are joined with the trade transaction through the timestamp. Based on a given trade transaction, every historical news ticker published within the two-minute period before it is linked. As a result, tickers after which no trade transaction happened within two minutes are filtered out and not used anymore; for this reason, noneffective news tickers are removed. For every remaining article, a trade transaction occurred within two minutes. Thus, we gain an increasing probability that the news ticker caused the trade.
Forward mapping: This approach assumes that a news ticker and the market data at the time of publication will cause a future trade transaction within a specific time. In contrast to backward mapping, all messages are selected and mapped to the previous trade transaction. Market data and news tickers are joined by the same timestamp. In fact, the forward mapping procedure requires a new trend calculation, because the news ticker itself forms the central artifact. Thus, the trend is no longer estimated between two trade transactions, but rather between the price of the previous trade transaction (relative to the news ticker) and the status two minutes after publication. Forward mapping has one important drawback: all news tickers are used, even irrelevant ones.

Fig. 3.2 Trend calculation within the price forecast process
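The backward-mapping join can be sketched as follows, assuming a simplified in-memory representation of trades and tickers as (timestamp, payload) pairs; the two-minute window follows the example above.

```python
from datetime import datetime, timedelta

def backward_map(trades, tickers, window=timedelta(minutes=2)):
    """Link every trade to the tickers published within `window` before it.

    Tickers that are not followed by a trade inside the window are never
    linked and are thereby filtered out, removing non-effective news.
    """
    pairs = []
    for trade_time, trade in trades:
        for ticker_time, ticker in tickers:
            if trade_time - window <= ticker_time < trade_time:
                pairs.append((ticker, trade))
    return pairs

trades = [(datetime(2014, 1, 1, 12, 2), {"price": 50.2})]
tickers = [
    (datetime(2014, 1, 1, 12, 1), "storage outage reported"),    # linked
    (datetime(2014, 1, 1, 11, 30), "stale ticker, filtered out"),
]
print(backward_map(trades, tickers))
```

Forward mapping would invert the join: every ticker keeps the last trade before its publication and the price status two minutes after it.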

However, the mapping procedures are not limited to text documents only: videos, audios, or images could affect the market as well and are possible events, too. A short-lived price movement (e.g., a price that drops and rises immediately) implies only a small effect of the news ticker or market data that perhaps caused the change. Relevant events will lead to
a permanent price change. According to the sampling literature, model accuracy will increase if relevant training examples are chosen (Khandar and Dani 2010). Relevant examples are identified by their durability. It represents a time length which defines the period within which a specific trend statement remains true. The durability calculation depends on the mapping procedure. In Fig. 3.2 (backward), Trade 1 is labeled as UP, because the price increased compared to Trade 0. The statement (the trend will increase) remains true until Trade 4, because until then the price does not drop below 50.00 points. This example has a durability of 5,400 seconds. Forward durability calculation considers a news ticker: Ticker 1 causes a positive price change, and the resulting statement remains true until Trade 4, because the price declines below 50.00 points afterwards. As a consequence, Ticker 1 has a durability of 5,520 seconds. Later on, nonrelevant examples with low durability are removed to increase model accuracy and to decrease computation time. The drawback of forward mapping should thereby be reduced, because the durability supports the identification of appropriate training examples. How long the durability should be to represent a relevant item depends on the market. In markets with high information needs, many relevant tickers will be published and will annul each other; thus, the durability threshold should be low. In cooperation with an expert, the minimal durability level has to be chosen to realize effective decision support.
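The durability rule can be expressed as a small function. The numbers below reproduce the backward example from Fig. 3.2 (an UP trade at 50.00 points invalidated 90 minutes later); the data layout is our own simplification.

```python
from datetime import datetime

def durability(start_time, start_price, later_trades, trend):
    """Seconds until the trend statement stops holding.

    `later_trades` is a time-ordered list of (timestamp, price).
    For an UP example the statement stays true while later prices do
    not fall back to (or below) the reference price; DOWN is symmetric.
    """
    for ts, price in later_trades:
        broken = price <= start_price if trend == "UP" else price >= start_price
        if broken:
            return (ts - start_time).total_seconds()
    return float("inf")  # never invalidated within the observed sample

start = datetime(2014, 1, 1, 12, 0)
later = [(datetime(2014, 1, 1, 12, 30), 50.40),   # still above 50.00
         (datetime(2014, 1, 1, 13, 30), 49.90)]   # statement broken here
print(durability(start, 50.00, later, "UP"))  # 5400.0 seconds
```

Examples whose durability falls below the expert-chosen minimum would then be dropped from the training set.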
In a next step, text mining is needed as an additional pre-processing step to extract necessary patterns. In this context, each news ticker contains specific features/terms. Those terms have to be identified and extracted to characterize the document (Miner et al. 2012). Various concepts can support this phase. Tokenizing breaks up text into individual terms or tokens and removes special characters. Terms are usually written in mixed cases, and algorithms interpret lower and upper cases differently, so they have to be transformed into a unique case; this reduces variability within the word corpus. Terms such as "the" or "to" are known as meaningless stop words; those are defined in lists and are removed. In order to reduce word corpus variation, stemming normalizes related word tokens into a single form. Typically, prefixes, suffixes, and inappropriate pluralizations are removed until the root word is disclosed. Sometimes not only single terms but rather groups of terms are meaningful, so n-grams, that is, n contiguous terms within a text, are used. They are generated by sliding a window with a width of n terms across the text, where each position of the window represents an n-gram.
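The pre-processing concepts above (tokenizing, case folding, stop-word removal, stemming, n-grams) might look as follows. The stop list and the crude suffix stemmer are illustrative placeholders; a production system would use a full stop-word list and a Porter-style stemmer.

```python
import re

STOP_WORDS = {"the", "to", "a", "of", "and"}  # excerpt of a stop list

def stem(token):
    """Crude suffix stripping; stands in for a real stemming algorithm."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def preprocess(text, n=2):
    tokens = re.findall(r"[a-z]+", text.lower())        # tokenize + case-fold
    tokens = [stem(t) for t in tokens if t not in STOP_WORDS]
    ngrams = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return tokens, ngrams

tokens, bigrams = preprocess("The utility announced rising gas prices")
# tokens == ["utility", "announc", "ris", "gas", "pric"]
```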
The most common representation of text in text mining is the vector space model. Hereby, a document is described through the weights of the terms within the word corpus. The resulting vector describes the position of a document as a point in a multidimensional space. The weight, or respectively the importance, of a term can be computed by various methods; term frequency-inverse document frequency (TF-IDF) is a common approach. The basic idea is that the greater the weight, the better the document is described by the term (Miner et al. 2012). A possible example is shown in Fig. 3.3. The news tickers are already pre-processed, the TF-IDF values are estimated, and the trend is calculated and mapped to a specific message. A 0 will occur if the text does not contain a specific term but other documents do. As shown, the text in row number 27 contains the term "announc" but not "academ".
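A compact TF-IDF computation, assuming documents are already tokenized; the weighting uses the plain tf x log(N/df) form, one of several common TF-IDF variants.

```python
import math

def tf_idf(docs):
    """Weight each term of each document by TF-IDF.

    docs: list of token lists (already pre-processed). Returns one dict
    per document mapping term -> weight; terms absent from a document
    implicitly carry weight 0, matching the sparse zeros in Fig. 3.3.
    """
    n = len(docs)
    df = {}                         # document frequency per term
    for doc in docs:
        for term in set(doc):
            df[term] = df.get(term, 0) + 1
    vectors = []
    for doc in docs:
        vec = {}
        for term in set(doc):
            tf = doc.count(term) / len(doc)
            idf = math.log(n / df[term])
            vec[term] = tf * idf
        vectors.append(vec)
    return vectors

vecs = tf_idf([["gas", "price", "up"], ["gas", "price", "down"], ["weather"]])
# "gas" occurs in two of three documents, "weather" in only one,
# so "weather" carries the larger weight.
```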
As a result, all weighted and pre-processed examples (market data and news tickers) are forwarded as input to the data mining phase, which is the application of specific algorithms for extracting patterns from data (Fayyad et al. 1996). Based on historical data, the algorithms try to extract the patterns that explain a future price trend. After model training, the patterns are represented by the model, and actual data are passed in to predict the future trend based on the extracted historic patterns.
In this context, it is common practice to split the items into training, validation, and test data (in a 60/30/10 relation) to avoid a model's overfitting and to allow the generalization of the identified relationships (Breiman et al. 1984). The training and validation data are forwarded to the classification. Based on calculation time and classification results, SVM (Lib), Naive Bayes (Kernel), and k-nearest neighbors (KNN) are chosen as suitable algorithms (see Sect. 3.4). As shown in Fig. 3.4, an SVM determines the best decision surface (hyperplane or line) that maximizes the margin between the data points referring to a specific class (Chang et al. 2010). This is done by structural risk minimization. The SVM (Lib) belongs to the SVM family and is a well-known text mining algorithm. Since it uses kernel functions, polynomial classification problems can be solved (Chang et al. 2010).
KNN assigns to an unclassified data point the classification of the nearest set of previously classified points. Hereby, the k nearest neighbors are determined by similarity or distance measures. In contrast, Naive Bayes is based on probability calculation. It computes the probability that a data point belongs to a class and assigns it to the class with the highest posterior probability. The posterior probability of a class is determined using Bayes'
Fig. 3.3 Text mining example


Fig. 3.4 SVM


rules. The test sample is assigned to the class with the highest posterior probability (Ni and Luh 2001). Hereby, the term "naive" assumes the independence of all features from each other and a Gaussian distribution. The estimation from the training data occurs through kernel smoothing (Mitchell 1997).
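Under the assumption that scikit-learn stands in for the operators named above (a LibSVM-based SVC, k-NN, and Gaussian Naive Bayes), the 60/30/10 split and the three classifiers can be sketched as follows on synthetic data; the chapter's own experiments were run in RapidMiner.

```python
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.datasets import make_classification

# Synthetic feature vectors standing in for the TF-IDF + market-data inputs.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# 60/30/10 split: first carve off the 10 percent test partition,
# then split the remainder into training (60%) and validation (30%).
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.1, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=1 / 3, random_state=0)  # 30% of the total

models = {
    "SVM (Lib)": SVC(kernel="rbf"),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Naive Bayes": GaussianNB(),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, "validation accuracy:", model.score(X_val, y_val))
```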
The amount of available news tickers is growing, and a decision-maker would be overwhelmed considering all documents. In this context, a first model is trained which identifies only relevant situations. Therefore, DOWN and UP examples are temporarily labeled as UNSTABLE; STABLE examples remain STABLE. Within the live system, only UNSTABLE examples are forwarded to the next model. In a next step, the original labels are reconstituted and a second model is trained to predict whether the example belongs to UP or DOWN. Ten percent of all examples remain for testing. These data are unknown to the trained models, which reflects reality. The model evaluation is done through the most popular classification performance measure, the accuracy (Zhang and Luh 2002). Hereby, the assignment of a class is a true positive (TP) if the item belongs to the positive class and the algorithm made a correct classification. In contrast, a true negative (TN) represents the correct assignment of an item that belongs to the negative case. False positives (FP) and false negatives (FN) are both wrong classifications. The accuracy represents the ratio of all correct classifications to all assignments. Whenever the distribution of classes is unequal or specific classes are of more interest than others, different measures are applied. Recall indicates how many elements
of a specific class are identified, and precision measures how correct the prediction was for a specific class (Miner et al. 2012):

Accuracy = (TP + TN) / (TP + FP + FN + TN)
Recall = TP / (TP + FN)
Precision = TP / (TP + FP)
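These three measures translate directly into code; the confusion counts below are invented for illustration.

```python
def accuracy(tp, tn, fp, fn):
    """Share of all assignments that were correct."""
    return (tp + tn) / (tp + fp + fn + tn)

def recall(tp, fn):
    """Share of the positive class that was actually found."""
    return tp / (tp + fn)

def precision(tp, fp):
    """Share of positive predictions that were correct."""
    return tp / (tp + fp)

# Example confusion counts for one class (e.g., UP):
tp, tn, fp, fn = 40, 30, 20, 10
print(accuracy(tp, tn, fp, fn))  # 0.7
print(recall(tp, fn))            # 0.8
print(precision(tp, fp))         # 0.666...
```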

The forecasting approach (Fig. 3.1) is iterative. Errors or unsatisfactory results lead to adjustments in previous stages until the results are acceptable. If the performance is adequate, the trained models are forwarded to the live system (Fig. 3.5).
The live process differs slightly from the training process, in that current data sources are processed. News tickers are filtered according to keywords, time intervals, or topics. Based on the publication timestamp, the most recent available price data and market data are selected and linked to the article by forward mapping. Backward mapping is not possible, because future trade transactions are unknown, so the last available market and price data need to be added. Since the data represent real-time data, a trend and durability calculation is impossible. Text mining techniques are used to pre-process the news ticker, and the determined feature/term vector is used as input for the stable/unstable model. The model decides whether the ticker belongs to a relevant or irrelevant situation. An UNSTABLE ticker is forwarded to the UP/DOWN model. The final classification is shown within a dashboard, where the user analyzes and interprets the forecast. Since the pre-classification removes irrelevant news tickers, the information flood is reduced. The live process is iterative: if the results are no longer acceptable, the training process is triggered again.
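The two-stage live classification can be sketched as follows; the stub classifier and feature vectors are hypothetical stand-ins for the trained models and the real feature/term vectors.

```python
def classify_live(features, stable_model, updown_model):
    """Two-stage live classification: a first model filters irrelevant
    (STABLE) situations, a second predicts the trend direction."""
    if stable_model.predict([features])[0] == "STABLE":
        return None  # irrelevant ticker: never reaches the dashboard
    return updown_model.predict([features])[0]  # "UP" or "DOWN"

class Always:
    """Stub classifier with a scikit-learn-style predict(); demo only."""
    def __init__(self, label):
        self.label = label
    def predict(self, rows):
        return [self.label] * len(rows)

print(classify_live([0.1, 0.9], Always("UNSTABLE"), Always("UP")))  # UP
print(classify_live([0.0, 0.0], Always("STABLE"), Always("UP")))    # None
```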

3.4 Application Within the Energy Markets

This section discusses two applications of the presented forecasting approach, whereby the second one is discussed in more detail. The first scenario belongs to Pospiech and Felden (2014): the approach was implemented within the German electricity market, and the scenario served as input to refine the general forecasting approach. At that time, no UNSTABLE/STABLE classification was done, only UP/DOWN/STABLE; even the durability was not estimated. The second scenario was realized within the British gas market. Here, the full process as shown in
Fig. 3.5 General live process of the business analytics approach


Sect. 3.3 was applied. Both scenarios were conducted with the data mining tool RapidMiner (Rapid-I Incorporation 2013). Within the scenarios, 56 GB of main memory and four 3.07 GHz processors were available.

3.4.1 Electricity Market
Many countries have restructured their electrical power industry and introduced deregulation and competition by unbundling generation, transmission, trading, and distribution of electricity. Market participants need to forecast the price development to be able to maximize their profits or hedge against risks of price volatility, as well as to ensure the safety of investments (Li et al. 2007). Text documents contain verifiable impacts to improve decision-making (Chang et al. 2010). Pospiech and Felden (2014) used news tickers to forecast the electricity price. They focused on the year-ahead product, where one electricity product is traded for the whole year. The product is liquid enough and is traded within seconds, so market participants are able to react instantly to published messages.
The historical price data range from October 2009 until December 2012 and were obtained from a German utility company. Trends were estimated, whereby the class remains STABLE as long as the price change does not exceed 0.1 price points. The transactions were mapped through forward mapping with news tickers from Thomson Reuters to investigate the effect of a message upon the price trend. In consultation with domain experts, a mapping time interval of two minutes was applied. The English-language news tickers were categorized by Thomson Reuters into specific topics; messages nonrelevant to the electricity price were deleted. Out of 1,532 topics, only 192 were selected as relevant. In the end, 1,442 items remained as input data. The electricity price is impacted by various elements (Duarte et al. 2009). Processing expert interviews, Pospiech and Felden (2014) identified valuable input factors (see Table 3.1). The factors form the market data and are linked to the news tickers.
The news tickers were transformed into a machine-readable format through text mining. TF-IDF values were calculated for all terms. Overall, 11,107 features/terms were used. The final input vectors were forwarded to the data mining stage. The items were split into training, validation, and test data to serve as input for the chosen algorithms SVM, KNN, and
Naive Bayes.

Table 3.1 Market data (Pospiech and Felden 2014)

Price: Price of a purchased/sold unit of electricity
Product year: Calendar year which is traded
Time trade: Time at which a trade transaction is done
Day trade: Day on which a trade transaction is done
Day-ahead: Price difference of the mean EEX day-ahead price auction value and the year-ahead
Delta CO2: Price difference of traded year-ahead CO2 certificates compared to previous CO2 transactions
Delta gas: Price difference of traded year-ahead gas transactions compared to previous gas transactions
Delta coal: Price difference of traded year-ahead coal transactions compared to previous coal transactions
Event type: Describes the nature of a message: alert, headline, update, or delete
Products: Thomson Reuters product code; relates messages to specific Thomson Reuters products
Agency: News agency which published the message
Topic: Subject area to which a news ticker belongs

During model development, various parameter settings were tested until the models reached their optimal results. Ten
percent of the examples remained for the evaluation. The SVM achieved
an accuracy of 59.03 percent; the best results (64.58 percent) came from
KNN and Naive Bayes (Kernel). Following Roomp et al. (2010), these
results are weak. As shown in Fig. 3.6, the class STABLE is predicted
wrongly too often. In this context, Pospiech and Felden (2014) investigated,
with interesting results, whether the results improve when STABLE
examples are removed: the accuracy increases up to 93.33 percent. However,
in reality, STABLE (no price movement after publication) news tickers are
possible and cannot be removed. For this reason, the general forecast
approach introduced a second modeling stage that first identifies irrelevant
tickers so that they can be removed from the set.
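The TF-IDF weighting used to vectorize the tickers can be computed as below. This is a generic sketch (raw term frequency times log(N/df) inverse document frequency) on an invented mini-corpus, not necessarily the exact variant used in the study.

```python
import math
from collections import Counter

def tf_idf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Compute TF-IDF vectors for tokenized documents.

    tf  = term count within the document
    idf = log(N / df), where df is the number of documents
          containing the term.
    """
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each term once per document
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: c * math.log(n / df[t]) for t, c in tf.items()})
    return vectors

# Invented mini-corpus for illustration.
corpus = [["gas", "price", "up"], ["gas", "storage", "full"], ["price", "down"]]
vecs = tf_idf(corpus)
# "gas" appears in 2 of 3 documents, so its weight in doc 0 is 1 * log(3/2)
```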

3.4.2 Gas Market
In Europe, natural gas is of high strategic importance for electricity and
heat supply. As a consequence of the extensive liberalization of the European
gas market, natural gas is nowadays freely tradable on open exchanges.
The gas price represents an indicator to adjust strategic behavior as well as
risk and investment management in organizations (Lin and Wesseh 2013).
66 M. POSPIECH AND C. FELDEN

Fig. 3.6 Computed results for the electricity market

Thus, the interest in predicting the natural gas price is high (Malliaris and
Malliaris 2005). The forecast is complex and depends on various factors
like currency exchange rates, liquefied natural gas, temperatures, or text
documents (Linn and Zhu 2004; Busse et al. 2012). However, within
the gas market, an automatic trend prediction from unstructured news
tickers has not been reported yet; traders have to analyze them manually.
The more news items are published, the higher the probability of missing
relevant ones. Moreover, the gas market is one of the most volatile
markets in the world (Lin and Wesseh 2013). As a result, the reaction time
is short, processing all relevant information in real time is indispensable,
and a forecast system is therefore needed. The scenario
is settled in the British gas market. The forecast product is month-ahead.
The product is highly volatile and text documents can contain valuable
information during this product horizon (Linn and Zhu 2004). As shown
in Fig. 3.1, three data sources are needed. Examples from November 2011
until April 2013 are used as training and validation data; the months May
until August 2013 are used as test data.
Historical prices are obtained from an archive that includes bids, offers,
asks, and deals for the month-ahead product. The deals are
extracted and trends (UP, DOWN, STABLE) are calculated. According
to domain experts, price movements not exceeding 0.1 price points should
be labeled STABLE. At first, relevant examples need to be identified.
Thus, the trends are temporarily relabeled as STABLE and UNSTABLE.
Overall, 97,637 trade transactions remain and were used in this business
analytics process. The news tickers are obtained from Thomson Reuters,
which provides more than 3,500,000 tickers. Non-English-language
documents are removed; only tickers published on weekdays and during
trading hours are kept. Thomson Reuters categorizes news tickers into
specific topics. In consultation with a domain expert, 8 out of 1,532 topics
remained as gas market relevant. Tickers not containing one of these
topics are filtered out. In addition, various keyword filters are applied:
documents containing terms like soy or wheat are removed. This filtering
reduced the relevant set to 117,699 tickers. Beyond the news tickers,
relevant market data were identified through expert interviews. In sum,
322 attributes are considered relevant as possible drivers of a price
development. These market data are available every 15 seconds. Table 3.2
provides a selective overview.

Table 3.2 Gas market data

Field                       Description
Topic                       One text subject out of 278
Trade volume                Volume traded in the last two minutes
Trade transactions          Number of trades in the last two minutes
Liquefied natural gas       Price of liquefied natural gas
Weather forecast            Temperatures for the next 15 days
Weather forecast to normal  Difference of the forecast to normal
Supply of pipelines         Amount of gas provided by pipeline
Gas storages                Amount of currently stored gas
Difference gas storages     Difference compared to last year
UK day-ahead electricity    The price of electricity one day ahead
Linepack                    Predicted gas in the network at closing day
Difference linepack         Difference of current and predicted gas within the network
Day-ahead                   Gas price for a day-ahead delivery
CO2 certificate             Price for an emission certificate
Coal                        Price of coal
Oil                         Price of oil
Pound-euro                  Exchange rate
Dollar-euro                 Exchange rate
Last price                  Last month-ahead price
Month                       The month of the trade
Hour                        The hour of the trade
Demand gas                  Amount of needed gas
Supply gas                  Amount of provided gas
Is Monday                   Monday is special because of the weekend
...                         ...
According to the domain expert, the reaction time of the gas market
to an event is a two-minute interval. Both mapping paradigms are
applied. The backward mapping leads to 34,653 mappings, of which only
6,687 belong to UNSTABLE. The distribution makes sense: only a few
articles will cause a price change. Still, assuming 20 working days
per month, approximately 3.4 important tickers are published per hour.
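The two mapping paradigms can be sketched with timestamps. The two-minute window follows the interval named above; the record identifiers and timestamps are invented for illustration.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=2)

def forward_mapping(tickers, trades):
    """Map each news ticker to all trades within two minutes AFTER it."""
    return {t_id: [tr_id for tr_id, tr_time in trades
                   if t_time <= tr_time <= t_time + WINDOW]
            for t_id, t_time in tickers}

def backward_mapping(trades, tickers):
    """Map each trade to all tickers published within two minutes BEFORE it."""
    return {tr_id: [t_id for t_id, t_time in tickers
                    if tr_time - WINDOW <= t_time <= tr_time]
            for tr_id, tr_time in trades}

# Invented timestamps for illustration.
tickers = [("news-1", datetime(2013, 5, 2, 10, 0))]
trades = [("trade-1", datetime(2013, 5, 2, 10, 1)),
          ("trade-2", datetime(2013, 5, 2, 10, 5))]
print(forward_mapping(tickers, trades))  # {'news-1': ['trade-1']}
```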
The discussions with domain experts resulted in a minimum durability of
30 seconds; a trend statement therefore needs to be true for at least half
a minute. A total of 3,865 UNSTABLE and 21,197 STABLE transactions
remain. To achieve meaningful results, both classes have to be balanced
(Chawla et al. 2004). In sum, 7,730 training examples are left.
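Balancing by randomly undersampling the majority class, as the counts above suggest (2 × 3,865 = 7,730), can be sketched as follows; the toy data and helper names are invented.

```python
import random

def undersample(examples, label_of, seed=0):
    """Balance a dataset by randomly undersampling every class
    down to the size of the smallest class."""
    rng = random.Random(seed)
    by_label = {}
    for ex in examples:
        by_label.setdefault(label_of(ex), []).append(ex)
    n_min = min(len(items) for items in by_label.values())
    balanced = []
    for items in by_label.values():
        balanced.extend(rng.sample(items, n_min))
    rng.shuffle(balanced)
    return balanced

# Invented toy data: 5 STABLE vs. 2 UNSTABLE examples.
data = ([("t%d" % i, "STABLE") for i in range(5)]
        + [("u%d" % i, "UNSTABLE") for i in range(2)])
balanced = undersample(data, label_of=lambda ex: ex[1])
print(len(balanced))  # 4 (2 per class)
```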
The forward mapping resulted in 117,699 news tickers, of which
8,148 UNSTABLE examples are available. After applying the durability
criterion, 2,312 UNSTABLE items remain; balancing the data, 4,624
examples are used to train the model. Figure 3.7 presents a simplified RapidMiner process.
Here, the pre-processed training examples are forwarded to the text mining
operator Process Documents from Data. The operator conducts the
tokenizing, transforms terms to lower case, removes stop words, processes
stemming and n-grams, and calculates the TF-IDF scores. It also creates
a wordlist, which represents all terms existing within the training
examples. As a result, the backward data set contains 9,900 terms
and the forward data set 5,116 terms. The Validation operator receives
the pre-processed examples; here, the model is trained and validated, and
several parameter settings are tried. The final model is forwarded to
the operator Apply Model and applied: the model receives the test data and
predicts the trend. Finally, the performance and the
labeled test data are transmitted and stored.
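The operator chain just described (tokenize, lower-case, remove stop words, build a wordlist) can be approximated in a few lines. The stop-word list is a simplified placeholder and stemming/n-grams are omitted.

```python
STOPWORDS = {"the", "a", "of", "is", "to"}  # simplified placeholder list

def preprocess(text: str) -> list[str]:
    """Tokenize, lower-case, and remove stop words (stemming omitted)."""
    tokens = [tok.lower() for tok in text.split()]
    return [tok for tok in tokens if tok not in STOPWORDS and tok.isalpha()]

def build_wordlist(docs: list[list[str]]) -> set[str]:
    """The wordlist holds every term seen in the training examples."""
    return {term for doc in docs for term in doc}

# Invented training sentences for illustration.
train_docs = [preprocess("The supply of gas is falling"),
              preprocess("Gas storage levels rise")]
wordlist = build_wordlist(train_docs)
print(sorted(wordlist))  # ['falling', 'gas', 'levels', 'rise', 'storage', 'supply']
```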
In the given case, examples predicted as UNSTABLE are forwarded
to the next model; STABLE examples are removed, because they represent
irrelevant situations. The second model is trained to predict UP
and DOWN trends. Here, the initial UNSTABLE trends need to be
reconstituted so that the classifier can extract historic patterns that explain
price movements. The training is processed with the same training examples
as before, except STABLE trends. Finally, the test data classified
as UNSTABLE in a first iteration are forwarded to the second model,
which predicts UP or DOWN. A database stores the results
and a dashboard triggers queries in real time. The performance of the
models is tested on 10,000 examples. The items belong to
the months May until August 2013 and are unknown to the model.
The same filters as in the training stage are applied. The mapping of data
sources follows the forward mapping. TF-IDF is calculated by the operator
Process Documents from Data (Fig. 3.7). The training wordlist is
used. Terms/attributes of test documents that are not available within
used. Terms/Attributes of test documents, which are not available within
the wordlist, have to be removed. If terms of the wordlist are not avail-
able, missing attributes will be added to the example vector, because the
BUSINESS ANALYTICS FORPRICE TREND FORECASTING THROUGH TEXTUAL... 69

Fig. 3.7 RapidMiner process


70 M. POSPIECH AND C. FELDEN

Table 3.3 Performance UNSTABLE/STABLE Model


Performance Accuracy Precision Precision Recall Recall
measure (%) STABLE UNSTABLE STABLE UNSTABLE
(%) (%) (%) (%)

Classifier SVM (Lib) 85.19 96.54 4.62 87.78 15.83


backward KNN 73.11 96.33 3.40 74.96 23.61
mapping Naive Bayes 93.23 96.94 15.01 96.01 18.89
(Kernel)
Classifier SVM (Lib) 94.52 96.47 6.88 97.89 4.17
forward KNN 75.68 96.43 3.71 77.65 23.06
mapping Naive Bayes 92.92 97.10 16.41 95.51 23.61
(Kernel)

Fig. 3.8 Details of best model

feature vector of training and test data has to be similar. The scenario
simulates the practical usage, because the models were trained with data
from November 2011 until April 2013. The predictions of the models
were checked as to whether they held true after two minutes.
Unfortunately, out of 10,000 examples, only 360 are UNSTABLE. Thus,
the model should identify the relevant items and label the remaining
examples STABLE. The accuracy results of the different models and
mapping hypotheses are shown in Table 3.3.
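Aligning a test document to the training wordlist, as described above (drop unseen terms, add missing ones with weight zero), can be sketched as below; the field names are invented for illustration.

```python
def align_to_wordlist(doc_vector: dict[str, float], wordlist: set[str]) -> dict[str, float]:
    """Project a test vector onto the training feature space.

    Terms absent from the training wordlist are removed; wordlist terms
    missing in the document are added with weight 0.0, so that training
    and test feature vectors share the same attributes.
    """
    return {term: doc_vector.get(term, 0.0) for term in wordlist}

wordlist = {"gas", "price", "storage"}
test_vec = {"gas": 1.5, "soy": 0.7}  # "soy" was never seen in training
aligned = align_to_wordlist(test_vec, wordlist)
print(aligned == {"gas": 1.5, "price": 0.0, "storage": 0.0})  # True
```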
The overall model accuracy is of minor importance: a STABLE
classification of all items would already lead to an accuracy of 96.40
percent. Instead, the accurate identification of relevant examples needs
to be addressed, so the UNSTABLE precision and recall are of major
interest. Comparing both mapping methods, forward mapping generates
the best results, but only minor differences to backward mapping are
observed; in future, both methods should be applied to choose the best
approach. The Naive Bayes (Kernel) model achieves the best results.
Figure 3.8 illustrates the details of the best model. A total of 23.61
percent of UNSTABLE cases are
identified. In this context, out of 10,000 test cases with 360 possible hits,
at least 85 are correctly identified.

Table 3.4 Performance of the UP/DOWN model

                               Accuracy  Precision  Precision  Recall    Recall
                               (%)       DOWN (%)   UP (%)     DOWN (%)  UP (%)
Backward  SVM (Lib)            84.42     84.42       0.00      100.00     0.00
mapping   KNN                  84.42     85.33      50.00       98.46     8.33
          Naive Bayes (Kernel) 90.91     91.43      85.71       98.46    50.00
Forward   SVM (Lib)            84.71     84.71      18.18       88.61    33.33
mapping   KNN                  20.00     76.19       1.56       20.25    16.67
          Naive Bayes (Kernel) 91.76     96.15      42.86       94.94    50.00

A total of 9,206 items are correctly identified
as irrelevant. Only 518 cases are forwarded to the second model, and
433 of them caused no price movement. However, that does not negate
the impact of those messages: it is conceivable that a price change needs
more than two minutes. In fact, a manual evaluation by a domain expert
points out that 16 percent of the 433 tickers are relevant. Nevertheless,
most of the irrelevant items are identified and almost a quarter of the
relevant cases are found. Thus, the results are practicable.
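The point about accuracy being misleading can be checked numerically. The counts below reuse figures from the text (360 UNSTABLE among 10,000 test cases, 85 true positives, 518 forwarded items); the remaining cells follow by subtraction.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Precision and recall for the positive (UNSTABLE) class."""
    return tp / (tp + fp), tp / (tp + fn)

# Figures from the text.
tp, forwarded, positives, total = 85, 518, 360, 10_000
fp = forwarded - tp   # forwarded items that were actually STABLE
fn = positives - tp   # UNSTABLE items the model missed

precision, recall = precision_recall(tp, fp, fn)
baseline = (total - positives) / total  # predict STABLE for everything

print(f"precision {precision:.2%}, recall {recall:.2%}, all-STABLE baseline {baseline:.2%}")
# precision 16.41%, recall 23.61%, all-STABLE baseline 96.40%
```

The recall of 23.61 percent and precision of 16.41 percent reproduce the UNSTABLE figures reported for the forward-mapped Naive Bayes (Kernel) model in Table 3.3.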
The performance of the second model is excellent (Roomp et al.
2010). Again, the best results are obtained when forward mapping
is selected during model training. The Naive Bayes (Kernel) model predicts
UP and DOWN examples correctly at 91.76 percent; just seven examples
are wrong. Note, however, that the accuracy is estimated on the 85
correctly forwarded items of Model 1. The 433 wrongly forwarded STABLE
cases are not counted, because they belong neither to UP nor to DOWN,
so a right prediction for them is impossible (Table 3.4).
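The two-model cascade can be sketched as below; `model1` and `model2` stand in for the trained classifiers and are hypothetical stubs here.

```python
def cascade_predict(example, model1, model2):
    """Two-stage forecast: model 1 filters relevance, model 2 sets direction.

    Examples predicted STABLE are discarded as irrelevant; only
    UNSTABLE ones are forwarded to the UP/DOWN classifier.
    """
    if model1(example) == "STABLE":
        return "STABLE"
    return model2(example)  # "UP" or "DOWN"

# Hypothetical stub classifiers for illustration.
model1 = lambda terms: "UNSTABLE" if "storage" in terms else "STABLE"
model2 = lambda terms: "DOWN" if "full" in terms else "UP"

print(cascade_predict({"storage", "full"}, model1, model2))  # DOWN
print(cascade_predict({"weather"}, model1, model2))          # STABLE
```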
The case study is implemented as a prototype (see Fig. 3.9) within a
trading floor and follows the live process (Fig. 3.5). News tickers are
pre-processed and filtered. Market data and price data are joined with the
remaining tickers through forward mapping. Text mining is applied and
item vectors are forwarded to the models, which process in real time.
Only UNSTABLE predictions will be labeled as UP or DOWN. The
calculated trends are stored in the database. The dashboard lists updates
and changes, which are immediately moved to the user interface. The most
recent news ticker is presented as a headline on top of the table. Besides
the news ticker, users can obtain additional market information through
Fig. 3.9 Graphical user interface
the details on the right-hand side. Here, all information used during the
model prediction is highlighted through a pop-up table. In addition,
the full text is provided at the bottom right-hand-side text box. Users can
select historic predictions within the table. Based on the selected item, a
chart illustrates the market price before and after the publication. A slider
allows an interactive time interval selection so that different horizons can
be observed. In this context, traders can analyze the impact of current
and historic items to gain knowledge of the market behavior. Finally,
the confidence column indicates how certain the model's prediction was.
A confidence filter can be applied to reduce the amount of news tickers
within the user interface; thus, only predictions reaching a minimum
confidence are shown.

3.5 Conclusion
Business analytics provides a wide field of possible application scenarios,
one of which is the prediction of price trends. In recent
years, great progress has been made; especially the rethinking driven by the
term Big Data has increased the interest in business analytics. Thus, new
data sources are combined to allow an extended market understanding
(Pospiech and Felden 2013). This section provided such an application
scenario and introduced a generic forecast approach to integrate unstructured
news tickers and structured market data. The approach was applied
within two different markets, and other scenarios are imaginable. The
results of the predictions are practicable and comparable to state-of-the-art
research. Even the drawbacks of Big Data are addressed by this business
analytics approach: the requirement of a more task-oriented provision of
data, given the increasing availability, variety, and complexity of new data
sources, is fulfilled, preventing an information flood (Pospiech and Felden
2012). Out of 10,000 examples, just 518 tickers are forwarded to the
user, which yields a benefit in the context of decision-making. In
contrast to other approaches, the given prototype is event based: changes
published by news tickers are immediately processed. Nevertheless, one
drawback remains. New information that is not published as text documents
is not perceived by the system, because audio and video formats
are not in the system's scope. Additionally, if there are no news tickers, no
price forecast will happen. It also has to be understood that not all news
tickers pulled by the dashboard are relevant, and the decision-maker still
has to decide how to handle the given information. Regarding the process,
it has to be stated that text and market data are weighted equally.
Thus, a prediction is perhaps not caused by a news ticker, but rather by
the market variables themselves. However, the calculated forecast does not
lose its validity.

References
Breiman, Leo, Jerome Friedman, Charles J.Stone, and Richard A.Olshen. 1984.
Classification and Regression Trees. Belmont: Wadsworth.
Busse, Sebastian, Patrick Helmholz, and Markus Weinmann. 2012. Forecasting
day-ahead spot price movements of natural gas – An analysis of potential influence
factors on basis of a NARX neural network. Paper Presented by the
Multikonferenz Wirtschaftsinformatik, Braunschweig, Germany.
Chan, Yue-cheong, Andy C.W. Chui, and Chuck C.Y. Kwok. 2001. The impact of
salient political and economic news on the trading activity. Pacific-Basin
Finance Journal 9(3): 195–217.
Chang, Yin-Wen, Cho-Jui Hsieh, and Kai-Wei Chang. 2010. Training and testing
low-degree polynomial data mappings via linear SVM. Journal of Machine
Learning Research 11(4): 1471–1490.
Chawla, Nitesh V., Nathalie Japkowicz, and Aleksander Kolcz. 2004. Editorial:
Learning from imbalanced datasets. SIGKDD Explorations Newsletter 6(1):
1–6.
Davenport, Thomas, and Jeanne Harris. 2007. Competing on Analytics: The New
Science of Winning. Boston: Harvard Business School Press.
Duarte, Andre, Jose Nuno Fidalgo, and Joao Tome Saraiva. 2009. Forecasting
electricity prices in spot markets – One week horizon approach. Paper Presented
by the IEEE PowerTech, Bucharest, Romania.
Fayyad, Usama, Gregory Piatetsky-Shapiro, and Padhraic Smyth. 1996. From data
mining to knowledge discovery. In Advances in Knowledge Discovery and Data
Mining, ed. Usama Fayyad, Gregory Piatetsky-Shapiro, and Padhraic Smyth.
Menlo Park: AAAI Press.
Felden, Carsten, and Peter Chamoni. 2003. Web farming and data warehousing
for energy tradefloors. Paper Presented by the IEEE Web Intelligence WI.
Gartner. 2013. Hype cycle for big data. https://www.gartner.com/doc/2574616.
Accessed 28 April 2014.
Geva, Tomer, and Jacob Zahavi. 2010. Predicting intraday stock returns by inte-
grating market data and financial news reports. Paper Presented by the
Mediterranean Conference on Information Systems MCISS.
Khandar, Punam V., and Sugandha V. Dani. 2010. Knowledge discovery and sampling
techniques with data mining for identifying trends in data sets.
International Journal on Computer Science and Engineering (IJCSE) (Special
Issue): 7–11.
Labrinidis, Alexandros, and Hosagrahar Jagadish. 2012. Challenges and opportunities
with big data. Proc. VLDB Endowment 5(12): 2032–2033.
Lavrenko, Victor, Matt Schmill, Dawn Lawrie, Paul Ogilvie, David Jensen, and
James Allan. 2000. Language models for financial news recommendation.
Paper Presented by the Proceedings of the Ninth International Conference on
Information and Knowledge Management, McLean, Virginia, USA.
Li, Guang, Chen-Ching Liu, and Chris Mattson. 2007. Day-ahead electricity price
forecasting in a grid environment. Transactions on Power Systems 22(1):
266–274.
Lin, Boqiang, and Presley K. Wesseh. 2013. What causes price volatility and regime
shifts in the natural gas market. Energy 55(2013): 553–563.
Linn, Scott C., and Zhen Zhu. 2004. Natural gas prices and the gas storage report:
Public news and volatility in energy futures markets. Journal of Futures Markets
24: 283–313.
Lo, Andrew W., and Craig A.MacKinlay. 1999. A Non-Random Walk Down Wall
Street. Princeton, New Jersey: Princeton University Press.
Malliaris, Mary E., and Steven G. Malliaris. 2005. Forecasting energy product
prices. Paper Presented by the IEEE Proceedings of International Joint
Conference on Neural Networks, Montreal, Canada.
Miner, Gary, Dursun Delen, Andrew Fast, and John Elder. 2012. Practical Text
Mining and Statistical Analysis for Non-structured Text Data. Waltham:
Academic Press.
Mitchell, Tom. 1997. Machine Learning. Boston: McGraw Hill.
Mittermayer, Marc-André. 2004. Forecasting intraday stock price trends with text
mining techniques. Paper Presented by the IEEE Computer Society, Proceedings
of the 10th Annual Hawaii International Conference on System Sciences, Big
Island, Hawaii, USA.
Nann, Stefan, Jonas Krauss, and Detlef Schoder. 2013. Predictive analytics on public
data – The case of stock markets. Paper Presented by the ECIS, Sofia, Bulgaria.
Ni, E., and Peter Luh. 2001. Forecasting power market clearing price and its dis-
crete PDF using a Bayesian-based classification method. Paper Presented by the
Power Engineering Society Winter Meeting.
Oh, C., and O. Sheng. 2011. Investigating predictive power stock micro blog senti-
ment in forecasting future stock price directional movement. Paper Presented by
the International Conference on Information Systems, Shanghai, China.
Pospiech, Marco, and Carsten Felden. 2012. Big data – A state-of-the-art. Paper
Presented by the AMCIS, Seattle, USA.
———. 2013. A descriptive big data model using grounded theory. Paper
Presented by the IEEE Big Data Science and Engineering, Sydney, Australia.
———. 2014. Towards a price forecast model for the German electricity market
based on structured and unstructured data. Paper Presented by the
Multikonferenz Wirtschaftsinformatik MKWI, Paderborn, Germany.
Pring, Martin J. 1991. Technical Analysis Explained. New York: McGraw-Hill.
Rapid-I Incorporation. 2013. Rapid-I report the future. http://rapid-i.com/.
Accessed 28 April 2014.
Roomp, Kirsten, Iris Antes, and Thomas Lengauer. 2010. Predicting MHC Class
I epitopes in large datasets. BMC Bioinformatics 11(1): 190.
Schumaker, Robert P., and Hsinchun Chen. 2006. Textual analysis of stock market
prediction using financial news articles. Paper Presented by the 12th Americas
Conference on Information Systems AMCIS, Acapulco, Mexico.
Turban, Efraim, Jay E. Aronson, and Ting-Peng Liang. 2004. Decision Support
Systems and Intelligent Systems. 7th ed. Upper Saddle River, N.J: Prentice Hall.
Wuthrich, Beat, Vincent Cho, and Jian Zhang. 1998. Daily stock market forecast
from textual web data. Paper Presented by the IEEE International Conference
on Systems, Man, and Cybernetics.
Zhang, Li, and Peter Luh. 2002. Power market clearing price prediction and con-
fidence interval estimation with fast neural network learning. Paper Presented
by the Power Engineering Society Winter Meeting.
CHAPTER 4

Market Research and Predictive Analytics:
Using Analytics to Measure Customer
and Marketing Behavior in Business
Ventures

D. Anthony Miles

4.1 Introduction
The use of analytics is becoming popular due to the popularity of such films
as Moneyball, from which emerged the philosophy of statistical thinking.
Using predictive analytics in business is not a secret anymore; it is now
becoming a big part of decision-making in companies. Using analytics
to study and predict patterns in businesses is important in this era of big
data.
The field of marketing has long suffered because it was hard to determine
the effectiveness and return on investment (ROI) of advertisements and
promotional campaigns. The use of analytics in business is now becoming
a standard practice among corporations and businesses. A major point of
using analytics is to help researchers

D.A. Miles (*)


Miles Development Industries, San Antonio, TX, USA
e-mail: dmiles@mdicorpventures.com

The Author(s) 2017 77


E.G. Carayannis, S. Sindakis (eds.), Analytics, Innovation, and
Excellence-Driven Enterprise Sustainability, Palgrave Studies
in Democracy, Innovation, and Entrepreneurship for Growth,
DOI 10.1057/978-1-137-37879-8_4
78 D.A. MILES

examine differences across different aspects of the business, identify
problems, and develop solutions (Bailey et al. 2009; Ghose and Lowengart
2012; Hair 2007; Lin and Hsu 2013; Morgan 2012).
In the past 50 years, the marketing literature has documented various
benefits of the use of such marketing analytics, including improved
decision consistency (Germann et al. 2011). Today, marketing analytics
continues to play an important role in measuring marketing and customer
behavior in firms. Customer behavior and market behavior play an important
role in small and medium-sized enterprise (SME) profitability. Both
researchers and statisticians have used analytics measures and documented
their benefits under specific circumstances. In this context, most researchers
write about the use of analytics to provide vital information that helps
companies make critical decisions.
This study attempts to provide greater insight into examining SMEs
through the use of analytics to measure firm behavior, and in particular
into how to use analytics to examine female-owned business enterprises
(FBEs). Using analytics as an indicator of firm behavior can give
strong insight into the dynamics of the firms. This study thereby
contributes to the emerging literature in the fields of entrepreneurship and
marketing.
Against this background, the purpose of this study is threefold. First,
the researchers wanted to test the theoretical model of four marketing
analytic categories. Second, this study attempts to explore the influence of
customer behavior on FBEs and its effect on profitability across a range of
four analytics. Lastly, this study examines the influence of market behavior
on FBEs and its effect on profitability across the same four analytics. More
specifically, this research has two objectives: (a) to develop four analytic
models that affect FBEs and (b) to measure the influence of customer
and market behavior on FBEs.
This chapter is organized into five parts. First, it provides the back-
ground of the study by reviewing prior research. Second, the theoreti-
cal foundation for the study is presented. Third, the research design is
discussed with the development of the theoretical model and conceptual
model of the study. Fourth, the methodology (statistical research design)
for the study is discussed. Lastly, the results and conclusions from the find-
ings are presented and discussed.
MARKET RESEARCH AND PREDICTIVE ANALYTICS: USING ANALYTICS... 79

4.2 Background and Prior Research

4.2.1 Marketing Analytics
The use of analytics in marketing is a much-welcomed tool for not only
measuring performance but also for predicting patterns and trends. The
quintessential question is: what makes for a good analytic marketing model?
There are five conditions that must be met. First, a good analytic model
uses techniques appropriate to the problem at hand and of course, makes
no technical mistakes. Second, good analytic modeling is couched in insti-
tutionally rich, real-world problems. Third, a goodness criterion is that
the results from the analytic model should not be something that a smart
MBA could figure out without the model. Fourth, a good analytic model
has influence beyond the immediate analysis at hand. Such models are
spurs to future research, some of which may extend broadly beyond the
first modeling effort. Lastly, good analytic models can contribute by per-
mitting the analysis of a market or a problem where other tools simply
do not (or do not yet) work (Coughlan etal. 2010). Understanding how
marketing analytics are developed can be an advantage for firms trying to
optimize their business and marketing efforts. The field of business analyt-
ics has improved significantly over the past few years, giving business users
better insights, particularly from operational data stored in transactional
systems (Kohavi etal. 2002).
Interestingly, the growth of quantitative analysis has been the second-biggest
revolution in management in the past two decades; the major
revolution in marketing has been the introduction of the internet.
This has caused marketing professionals to manage information that helps
well-targeted products satisfy customers and generate orders. Many
kinds of information are needed, and an increasing share of this information
is backed by hard data (Petti 2005).

Customer Behavior. With analytics, data about customers' brand
preferences, shopping frequency, and buying patterns can be effectively
captured from various sources like retail outlets, web data, and
survey data. Data can then be sliced and diced so as to gain useful
insights about customers' past, present, and future buying behavior
(Sathyanarayanan 2012). Marketing relationships are distinct
and idiosyncratic organizational assets; the development of such
unique relationships serves as a defensible barrier to external competition
(Panayides 2002). The effects of competition on the incentive
for marketing depend importantly on the nature of innovation.
Furthermore, an increase in competition intensity reduces the inno-
vation incentive (Chen 2006). Thus, we propose the following main
effect:

Hypothesis 1: The Customer Turnover Analytic shows significant evidence
of customer behavior and activity in female-owned business enterprises
(FBEs).

4.2.2 Analytic Modeling
The use of analytic modeling is a further evolution of analytics in
marketing. Surprisingly, analytic models are also used in conceptualizing
marketing analytic endeavors. They are characterized by precision of
expression. Furthermore, the use of analytic models is especially valuable
when they generate insights that are conditional or strategic in nature
as opposed to first-order or main effects. Such effects can be very diffi-
cult to document empirically, either because they cannot be disentangled
from the web of factors interacting in a complicated real-world market
or because their incremental effect on outcomes may not be measurably
large (Coughlan et al. 2010). Analytics have been historically rooted in
mathematical and statistical models (Chen etal. 2010; Drye 2011; Dufour
etal. 2012; Furness 2011; Gnatovich 2007; Marsella et al. 2005; Steinley
and Henson 2005).

Marketing Performance. The measurement of marketing performance
has been a concern in the field of marketing for decades. In
order to capture the current situation of companies with respect to
marketing measurement, research has identified the actors that are
involved in the process. Analytics have increasingly been used for
measuring performance (Morgan 2012). Analytics have been used for measuring
internet advertising networks (Lin and Hsu 2013). Analytics have
also been used in speech, which has now developed into speech
analytics. Within 6–10 years, speech analytics will become a mission-
critical enterprise application (Fluss 2010). Thus, we propose the
following main effect:

Hypothesis 2: The Customer Credit Analytic shows significant evidence
of credit behavior in FBEs.

4.2.3 Predictive Analytics
The transition from traditional analytics to predictive analytics has
been critical in the evolution of marketing and business intelligence (BI);
indeed, the latest shift in the BI market is the move from traditional to
predictive analytics. Although predictive analytics historically belongs to
the BI family, it is emerging as a distinct new software sector (Zaman 2003).

Predictive Analytics and Business Decisions. The use of predictive analytics in data analysis is crucial to understanding customer behavior and informing business decisions. Machine learning and predictive modeling-based solutions have been shown to be highly effective in solving many important business and industrial problems (Apte et al. 2002). Predictive analytics can also be used for forecasting and for creating forecasting models: the term refers to data mining procedures that use statistical techniques, such as multiple regression, to make forecasts in support of managerial decision-making (Kridel and Dolk 2013). Market predictions are one of four broad types of analytics generated across organizations: (a) market predictions, (b) customer segments, (c) need- and opportunity-focused analytics, and (d) customer value analytics (Bailey et al. 2009). Thus, we propose the following main effect:

Hypothesis 3 The Market Potential Analytic shows significant evidence of marketing behavior and activity in FBEs.

Effective Predictive Analytics. Measuring marketing performance has evolved from simply using predictive analytics to predicting behavioral patterns in the firm (Germann et al. 2013). Predictive analytics uses confirmed relationships between explanatory and criterion variables from past occurrences to predict future outcomes (Hair 2007). Effective predictive analytics requires a significant degree of statistical modeling expertise coupled with a thorough understanding of the data being used as the foundation for modeling (Kridel and Dolk 2013). Predictive analytics is most often thought of as predictive modeling, but the term increasingly includes descriptive and decision modeling as well. All three modeling approaches involve extensive data analysis, but they have different purposes and rely on different statistical techniques.
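The multiple-regression style of predictive analytics described above can be sketched in a few lines; the data here are invented for illustration and are not from the study:

```python
import numpy as np

# Illustrative only: forecast next-period sales from ad spend and unit price.
# Design matrix with an intercept column; y = b0 + b1*spend + b2*price.
X = np.array([[1.0, 10, 5.0],
              [1.0, 12, 4.8],
              [1.0, 15, 4.5],
              [1.0, 18, 4.4],
              [1.0, 20, 4.1]])
y = np.array([100.0, 112.0, 130.0, 146.0, 158.0])

# Ordinary least squares fit of the regression coefficients.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Forecast for a planned period with spend = 22 and price = 4.0.
forecast = np.array([1.0, 22.0, 4.0]) @ beta
```

This is exactly the pattern Kridel and Dolk describe: a statistical model fitted on past occurrences, then evaluated on a not-yet-observed input to support a managerial decision.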
82 D.A. MILES

Predictive analytics used in customer analytics may have a positive impact on firm performance. However, most analytical models that have been developed have tended to focus on customer transactions. Despite this rather narrow perspective, these more traditional and well-established models provide a promising starting point for discussing how customer engagement reflecting behavioral manifestations other than purchase may be modeled appropriately (Bijmolt et al. 2010). One obstacle is that most traditionalist marketing managers are unlikely to be experts in data analysis and statistics, yet they must now consider making data-driven decisions based on the data collected by and about their organizations. They must either rely on data analysts to extract information from the data or employ analytic applications that blend data analysis technologies with task-specific knowledge (Kohavi et al. 2002).

Predictive Modeling. Predictive analytics is most often thought of as predictive modeling; however, the term includes descriptive and decision modeling as well (Hair 2007). Marketing analytics can also be a source of competitive advantage, especially in the international sector. The increased activity of firms in the global arena has created a challenge for international marketers, as they need to compete against local products in diverse consumer markets and segments (Ghose and Lowengart 2012). Marketing analytics can thus help marketers understand consumers in foreign countries and make better strategic marketing decisions.
Data-Driven Marketing Decisions. Predictive analytics makes strategic use of data-driven marketing decision-making. Data-driven service marketing refers to the use of data to inform and optimize the ways through which marketing activities are carried out; it is not synonymous with automatic decision-making in which the human element is no longer relevant (Kumar et al. 2013). Using analytics to measure and predict customer behavior or consumption patterns can strengthen companies' capability to turn data into knowledge. Capturing this knowledge from data can provide valuable information on such things as customer buying patterns (Sathyanarayanan 2012).

Some marketing managers are traditionalists and still resist the use of analytics as a necessary measurement tool; many such skeptics prefer the rational analytics approach to marketing. The low prevalence of marketing analytics use implies that many managers remain unconvinced about their benefits. In addition, most research that documents their outcomes has focused on isolated firm or business unit success stories, without systematically exploring their performance implications at the firm level (Germann et al. 2011).

4.2.4 Social Media Analytics

Another revolution in the field of marketing is the emergence of social media, and with it, social media analytics. Social media analytics can be used to collect, monitor, analyze, summarize, and visualize social media data, usually driven by specific requirements from a target application (Zeng et al. 2010). The emergence of social media in marketing has increased the use of analytics in that channel and has given rise to the emerging discipline of Social Media Analytics, which draws from Social Network Analysis, Machine Learning, Data Mining, Information Retrieval (IR), and Natural Language Processing (NLP) (Melville et al. 2009).

Web Analytics. Many companies now use web analytics to measure website traffic; Google Analytics is the most widely used tool. Web analytics provides information about the number of visitors to a website and the number of page views, helping gauge traffic and popularity trends, which is useful for market research (Dash and Sharma 2012).

4.2.5 Marketing Metrics
Many marketing professionals have struggled with using metrics to measure the effectiveness of advertising or marketing efforts, and few marketers recognize the extraordinary range of metrics now available for evaluating their strategies and tactics. Companies are now using frameworks for presenting marketing metrics. There are basically five types of marketing metrics companies use: (a) Customer and Market Share-based, (b) Revenue and Cost-based, (c) Product-based and Portfolio-based, (d) Customer Profitability-based, and (e) Sales Force and Channel-based (Farris et al. 2006). A marketing metrics framework must demonstrate how marketing enables the organization to realize these outcomes; a company must therefore at least make the transition to outcome-based metrics. The use of metrics in measuring media has three primary needs: (a) the need for cross-media data, (b) the need for hybrid data collection that includes electronic and passive measurement of media use, and (c) the need for new metrics, such as measures of implicit processing of sponsored media content and measures of consumer-generated brand communications (Smit and Neijens 2011).
An interesting concern with marketing effectiveness is that it is sometimes difficult to determine. Marketing effectiveness is hard to determine for organizations of all sizes because: (a) marketing activity has both tangible and intangible effects, (b) marketing activity has both short-term and long-term (future) effects, (c) marketing operates within a volatile and uncontrollable external environment that includes its customers, competitors, and legislators, (d) marketing operates within an internal environment which is subject to constraint and change, (e) there is corporate confusion between marketing (the total business process) and what the marketing department does, and (f) when it comes to available metrics for measuring marketing performance and/or effectiveness, marketers are spoilt for choice (Brooks and Simkin 2011). Thus, we propose the following main effect:

Hypothesis 4 The Competition and Economic Analytic shows significant evidence of competition and economic behavior in FBEs.

Practitioners in the marketing field, especially those within the advertising industry, have long espoused that marketing should be capitalized, or treated as an investment on the balance sheet, rather than as an expense. Does it help with decisions, and does it have value? Opinions vary. The marketing metrics project indicated that the measures are collected but not communicated to the board, and in some firms marketing equity metrics are not seen as being very useful for determining the value of a firm. Solcansky et al. (2011) also argued that metrics could be divided into two groups: financial metrics and non-financial metrics. Some companies use a marketing dashboard as a comprehensive set of important tools for internal and external synthesis. Furthermore, financial metrics are used more often than non-financial metrics (Gai et al. 2007). The importance of justifying marketing investments and the metrics necessary to measure marketing performance has thus taken center stage (Grewal et al. 2009). Finance and marketing have traditionally been on different pages, talking different languages and unable to establish common goals (See 2006).

As with analytics, a marketing manager must be careful in the use of metrics. Many marketing professionals of the traditional school remain unconvinced by marketing metrics, as with marketing analytics, and still cling to the old way of doing marketing. There is, however, a dark side to metrics: overuse of marketing metrics can lead to disastrous results, because it can produce an over-reliance on statistical modeling techniques (Ozimek 2010).

4.3 Methodology

4.3.1 Population and Sample
The data were collected through an internet questionnaire and a paper questionnaire. The participants were FBEs, selected from the yellow pages, local women's chambers of commerce (with assistance from local contacts), and the Small Business Development Center (SBDC). The participants were able to complete the Marketing Activity and Customer Activity Scale (MACS) survey from their offices via the internet.
A total of 11 industry sectors were examined for this study. For each market, both convenience and random samples were drawn, with a sample size of approximately 123 FBEs from a population of 12,256. The questions about brand relations dealt with the respondent's particular brand. A five-point Likert scale was used, ranging from 1 (Strongly Agree) to 5 (Strongly Disagree). The data were collected over the duration of one year (2012-2013).

4.3.2 Research Hypotheses
Four statistical hypotheses were tested for this study. The general hypothesis is that there is significance in FBEs based on four marketing analytics. The hypotheses can be segregated and studied as follows:

1. H1: The Customer Turnover Analytic shows significant evidence of customer behavior and activity in FBEs.
2. H2: The Customer Credit Analytic shows significant evidence of credit behavior in FBEs.
3. H3: The Market Potential Analytic shows significant evidence of marketing behavior and activity in FBEs.
4. H4: The Competition and Economic Analytic shows significant evidence of competition and economic behavior in FBEs.

The first two hypotheses suggest that the customer behavior analytics are driven significantly by customer behavior activity in the FBEs. The second two hypotheses suggest that the market behavior and competition analytics are driven significantly by market behavior activity in the FBEs. The researchers suggest that an emphasis on one or more metrics within each analytic is necessary for examining customer behavior and marketing behavior.

4.3.3 Empirical Model of the Study

According to the hypotheses, the regression model can be formulated as follows:

MAEQ = CTAn + CCAn + MPAn + CEAn (4.1)

Based on the models presented, the research analytics are given in Fig. 4.1.

Marketing Analytic Equation Model

MAEQ = CTAn + CCAn + MPAn + CEAn

In the above equation:

CTAn: Customer Turnover Analytic
CCAn: Customer Credit Analytic
MPAn: Market Potential Analytic
CEAn: Competition and Economic Analytic

The regression coefficients (β1-β4) signify the effects of the firm variables (ethnicity, industry type, business entity type, employee number, and franchise/non-franchise) on CTAn, CCAn, MPAn, and CEAn.

Fig. 4.1 Marketing Analytic Equation Model (MAEQ)



4.3.4 Measures: Marketing Analytics Used for the Study

This study uses marketing analytics as dependent variables for measurement in FBEs. The independent variables used in the study were ethnicity, industry, business entity type, employee number, and franchise status. The study measures the effect of the independent variables on the dependent variables through the use of marketing analytics (see Table 4.1). A review of the prior research and literature revealed a significant number of studies
Table 4.1 Model: marketing analytics and metric equations table

CTAn = Customer Turnover Analytic (Equation: CTAn = VOP1 + CAT2)
  VOP1 = Velocity of Profit Metric: measures the speed of profitability in the enterprise
  CAT2 = Customer Activity/Turnover Metric: measures how many customers turn over in the business enterprise

CCAn = Customer Credit Analytic (Equation: CCAn = CUC1 + LOC2)
  CUC1 = Customer Credit Metric: measures the customer's credit activity and capabilities in the enterprise
  LOC2 = Line of Credit Metric: measures the line of credit capability in the business enterprise

MPAn = Market Potential Analytic (Equation: MPAn = MOP1 + BTE2 + SET3)
  MOP1 = Market Potential Metric: measures the market potential of the business enterprise
  BTE2 = Barriers to Entry Metric: measures the number of entry barriers that affect the business enterprise
  SET3 = Social Entrepreneurial Metric: measures the social benefits of the business enterprise

CEAn = Competition and Economic Analytic (Equation: CEAn = CPI1 + ECR2 + GOR3)
  CPI1 = Competition Intensity Metric: measures the competition intensity in the business enterprise
  ECR2 = Economic Risk Metric: measures the level of economic activity, such as economic anchors, in the business enterprise
  GOR3 = Government Regulation Metric: measures the level of government regulation in the industry affecting the business enterprise

on marketing analytics. The validity and reliability of the scale were assessed using principal component factor analysis (PCA) and structural equation modeling (SEM). The researcher also conducted validity tests on the MACS instrument through additional analyses (reported in the results), such as internal consistency using Cronbach's alpha and multivariate techniques.
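The internal-consistency check mentioned here uses Cronbach's alpha, which can be computed directly from an item-score matrix; the data below are invented for illustration and are not the MACS responses:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, k_items) score matrix."""
    k = items.shape[1]
    item_var_sum = items.var(axis=0, ddof=1).sum()  # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)       # variance of summed scores
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Five hypothetical respondents answering three related Likert items.
scores = np.array([[4, 5, 4],
                   [2, 2, 3],
                   [5, 4, 5],
                   [3, 3, 3],
                   [1, 2, 1]])
alpha = cronbach_alpha(scores)  # high alpha: the items move together
```

Because the three items rise and fall together across respondents, alpha here lands above 0.9; uncorrelated items would drive it toward zero.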

Customer Turnover Analytic. This test is used to examine customer activity and behavior in the FBEs. The validity test shows which test best describes the variable movements in the analytic; these tests are fixed and random effects models. The equation can be written as follows:

CTAn = VOP1 + CAT2 (4.2)

where CTAn, the analytic of customer behavior, is our dependent variable. VOP1 is the coefficient variable metric that measures the speed of profitability in the enterprise, whereas CAT2 is the coefficient variable metric that measures how many customers turn over in the business enterprise. This analytic captures the increase of customer activity in the business enterprise, which in turn enhances market and economic growth.
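One literal reading of Eq. 4.2 (and, analogously, Eqs. 4.3-4.5) is that an analytic score aggregates its metric scores; a minimal sketch with hypothetical Likert responses, not MACS data:

```python
import numpy as np

# Hypothetical 1-5 Likert scores for three firms on the two CTAn metrics.
vop1 = np.array([4, 2, 5])  # VOP1: Velocity of Profit metric
cat2 = np.array([3, 2, 4])  # CAT2: Customer Activity/Turnover metric

# Eq. 4.2: Customer Turnover Analytic as the sum of its metric scores.
ctan = vop1 + cat2
```

In the study itself the metrics enter as weighted regression terms; the unweighted sum is only the simplest reading of the printed equation.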

Customer Credit Analytic. This test is used to examine customer credit and enterprise line-of-credit capability and activity in the FBEs. The validity test shows which test best describes the variable movements in the analytic; these tests are fixed and random effects models. The equation can be written as follows:

CCAn = CUC1 + LOC2 (4.3)

where CCAn, the analytic of customer credit behavior, is our dependent variable. CUC1 is the coefficient variable metric that measures the customer's credit activity and capabilities in the enterprise, whereas LOC2 is the coefficient variable metric that measures the firm's line-of-credit capability in the business enterprise. This analytic captures the increase of customer credit and firm line-of-credit activity in the business enterprise, which in turn enhances market and economic growth.

Market Potential Analytic. This test is used to examine market potential in the FBEs. The validity test shows which test best describes the variable movements in the analytic; these tests are fixed and random effects models. The equation can be written as follows:

MPAn = MOP1 + BTE2 + SET3 (4.4)

where MPAn, the analytic of market potential, is our dependent variable. MOP1 is the coefficient variable metric that measures the market potential of the business enterprise, BTE2 is the coefficient variable metric that measures the number of entry barriers that affect the business enterprise, and SET3 is the coefficient variable metric that measures the social benefits of the business enterprise. This analytic captures the increase of potential in the business enterprise, which in turn enhances market and economic growth.

Competition and Economic Analytic. This test is used to examine competition and economic behavior and activity in the FBEs. The validity test shows which test best describes the variable movements in the analytic; these tests are fixed and random effects models. The equation can be written as follows:

CEAn = CPI1 + ECR2 + GOR3 (4.5)

where CEAn, the analytic of competition and economic behavior, is our dependent variable. CPI1 is the coefficient variable metric that measures the competition intensity in the business enterprise, ECR2 is the coefficient variable metric that measures the level of economic activity, such as economic anchors, in the business enterprise, and GOR3 is the coefficient variable metric that measures the level of government regulation in the industry that affects the business enterprise. This analytic captures the increase of competition and economic behavior activity in the business enterprise, which in turn enhances market and economic growth (see Table 4.1).

4.3.5 Study Instrument: MACS

The MACS instrument questionnaire was adapted from the previous literature and studies regarding economic and marketing activity in SMEs. The MACS instrument was adapted to ensure that it was appropriate for analyzing marketing analytics for FBEs. The questionnaire consisted of two sections: (a) Section 1, sociodemographic characteristics information, and (b) Section 2, marketing and economic characteristics information. The MACS instrument used a five-point Likert scale ranging from 1 (Strongly Disagree) to 5 (Strongly Agree). The participants were asked to rate the importance of each of the ten marketing and economic metrics to determine the significant analytics.

Data Analysis. The study used statistical analyses to examine the data from the sample. First, descriptive statistical methods such as frequencies and distribution analysis were used to analyze the characteristics of the FBEs. Second, an exploratory factor analysis (EFA) was used, followed by a Pearson correlation. Lastly, a structural equation model (SEM) was used for a path analysis of the data.
Statistical Analysis Tools. The statistical analyses were performed using SPSS (Statistical Package for the Social Sciences) Version 21.0. AMOS (Analysis of Moment Structures) Version 21.0 software (Arbuckle 1995) was used for the SEM. First, a data screening was conducted to inspect the variables for the multivariate analyses. SPSS was used for computing the descriptive, inferential, and multivariate statistics; AMOS was used for computing the SEM. The sample (N = 123) of FBEs was selected to test the psychometric properties of the 18-item MACS. First, the exploratory factor analysis (EFA) was performed. Then a path analysis was conducted to assess the model fit, confirming multivariate normality and the refined marketing analytics and metric items.

4.4 Conceptual Model of the Study

The conceptual framework of the study is presented in Fig. 4.2. It shows the path model of the effects of the firm variables (ethnicity, industry type, franchise, employee number, and business entity type) on the marketing analytics. The proposed framework articulates our predicted relationships, including the hypothesized relationship that marketing analytics have an effect on the FBE. The researcher proposes that the marketing analytics are significant and have a positive impact on firm behavior.

4.5 Results
This section presents the results of the statistical analyses. The purpose of this study is to examine marketing analytics in FBEs. Four hypotheses were tested with four marketing analytics. First, a descriptive statistical analysis was conducted on the sociodemographics; data such as age, gender, ethnicity, and so on were examined. Second, an EFA was conducted to determine the factor structure of the analytic metrics and variables. Third, a path analysis was conducted using an SEM, which was run in AMOS to test the validity of the factor structure and to determine which path structure best adjusts to the RMS instrument; its fit was measured through several indices. Lastly, a Cronbach's alpha was computed to measure internal consistency in the MACS instrument (see Table 4.2).

Fig. 4.2 Conceptual model of the study: path analysis of firm variables on analytics

Table 4.2 Firm sociodemographic statistic results of the study


Firm sociodemographic variables n %

Owner ethnicity
Asian (Pacific Islander) 4 3.3
Black (non-Hispanic) 25 20.3
Hispanic 56 45.5
Native American Indian 2 1.6
White (non-Hispanic) 33 27.0
Other 3 2.4
Industry type
Agriculture 3 2.4
Communications 3 2.4
Construction 10 8.1
Finance 4 3.3
Manufacturing 4 3.3
Retail Trade 12 9.8
Services 49 40.0
Technology 6 4.9
Transportation 1 0.8
Wholesale 5 4.1
Other Industry 26 21.1
Business entity type
Corporation 28 22.8
Limited Liability Corp or Limited Liability Part 15 12.2
Partnership 9 7.3
Sole Proprietorship 66 53.7
Other 5 4.1
Employee number
1-10 112 91.1
11-20 8 6.5
21-30 1 0.8
51-100 1 0.8
101-200 1 0.8
Franchise
Franchise 13 10.6
Non-franchise 110 89.4
(N = 123)

Four hypotheses were tested on a theoretical model based on four marketing analytic categories: (a) Customer Turnover Analytic, (b) Customer Credit Analytic, (c) Market Potential Analytic, and (d) Competition and Economics Analytic. The hypotheses can be segregated and studied as follows:

1. H1: The Customer Turnover Analytic shows significant evidence of customer behavior and activity in FBEs.
2. H2: The Customer Credit Analytic shows significant evidence of credit behavior in FBEs.
3. H3: The Market Potential Analytic shows significant evidence of marketing behavior and activity in FBEs.
4. H4: The Competition and Economic Analytic shows significant evidence of competition and economic behavior in FBEs.

The statistical analyses were conducted with SPSS (Statistical Package for the Social Sciences) Version 21.0 and AMOS (Analysis of Moment Structures) Version 21.0. After the data collection was completed, a data-cleaning process was implemented prior to the data analysis. The majority of the data-cleaning problems were of three types: (a) data entry or input errors, (b) misspellings, and (c) duplicate or redundant input. Other data-cleaning issues concerned incomplete surveys.

4.5.1 Sociodemographic Statistics on the FBEs

Table 4.2 outlines the sociodemographic characteristics of the FBEs in the study. The participants completed the MACS instrument, and the researchers examined five sociodemographic variables in the data. The primary objective of this section is to determine differences in the data between the participants in the study. Of the FBE participants, 45.5% were Hispanic, 40.0% were in the services industry, 53.7% were sole proprietorships, and 89.4% were non-franchise business enterprises.

4.5.2 The Results of the EFA

An EFA was conducted to identify the preliminary factors in the marketing analytics. The goal of the EFA was to examine the variable items in the analytics to see how they would cluster. Based on a sample size of 123 with a significance level of 0.05, items with factor coefficients of less than 0.3 on every component were dropped from the analytics. The initial principal components analysis with varimax rotation produced a five-factor solution based on 10 metric items; we eliminated the last factor and concluded with a four-factor solution. A Principal Axis Factoring (PAF) extraction was used for the EFA. Prior to conducting the EFA, the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy and Bartlett's Test of Sphericity (BTS) were evaluated to determine whether this multivariate analysis was appropriate for the data. The KMO value was 0.647 (values approaching 0.90 are considered very good; Tabachnick and Fidell 2007). The BTS was 136.056 (p < 0.000), testing whether the diagonal elements of the correlation matrix are 1 and all off-diagonal elements are 0; the null hypothesis that the variance-covariance matrix of the variables is an identity matrix was rejected. Based on the results of both the KMO and the BTS, the factor analysis methodology was appropriate for this study. In the EFA, a four-factor solution emerged, explaining a total variance of 61.2%. During the EFA, one grouping of analytics emerged that differed from the hypothesized model, so it was eliminated from the factors. The 10 extracted items in the four factors were relabeled as follows: Customer Credit (2 items), Market Potential (3 items), Customer Turnover (2 items), and Competition and Economics (3 items). The items were otherwise consistent with our theoretical framework (see Table 4.3).
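As a sanity check on the procedure described here, Bartlett's Test of Sphericity can be computed from any item matrix with the standard formula; the data below are simulated, since the study's raw responses are not reproduced in the chapter:

```python
import numpy as np

def bartlett_sphericity(data: np.ndarray):
    """Bartlett's test statistic for H0: the correlation matrix is an identity.

    chi2 = -(n - 1 - (2p + 5)/6) * ln(det(R)), with df = p(p - 1)/2.
    """
    n, p = data.shape
    R = np.corrcoef(data, rowvar=False)
    chi2 = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    df = p * (p - 1) // 2
    return chi2, df

rng = np.random.default_rng(0)
common = rng.normal(size=(123, 1))                # shared latent factor
items = common + 0.7 * rng.normal(size=(123, 4))  # four correlated items
chi2, df = bartlett_sphericity(items)
# chi2 far exceeds the 0.05 critical value for df = 6 (about 12.59),
# so the identity-matrix null is rejected and factoring is appropriate.
```

When the items share a common factor, det(R) shrinks toward zero and the statistic grows; for truly independent items it hovers near its degrees of freedom.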

Table 4.3 Measurement properties (N = 123)


Analytics and metric items Loadings Eigenvalues % of Variance

Analytic 1: Customer Credit


1. V18-Line of Credit Metric 0.858 2.496 24.961
2. V17-Customer Credit Metric 0.495
Analytic 2: Market Potential
1. V23-Government Regulation Metric 0.562 1.351 13.507
2. V20-Barriers-to-Entry Metric 0.532
3. V24-Social Entrepreneurial Metric 0.312
Analytic 3: Customer Turnover
1. V16-Customer Activity/Turnover 0.749 1.227 12.267
Metric
2. V15-Velocity of Profit Metric 0.391
Analytic 4: Competition and Economics
1. V21-Competition Intensity Metric 0.664 1.053 10.529
2. V19-Market Potential Metric 0.338
3. V22-Economic Climate Metric 0.649

Note: Extraction Method: Principal Axis Factoring. Rotation Method: Varimax with Kaiser Normalization. Rotation converged in 12 iterations. As the benchmark for this study, a minimum coefficient of 0.3 or higher was used as the standard.

4.5.3 The Results of the Path Analysis and SEM

A path analysis methodology was conducted to determine the causal effects of the firm variables on the four analytics. The AMOS statistical program was used to calculate each path to each analytic. This was done because AMOS could not calculate or estimate values between each of the analytics, so the researcher took a different approach to the path analysis by calculating the path estimates for each analytic separately. The path models are presented in Fig. 4.3. The path analyses were conducted to determine the causal effects of five firm variables: (a) owner ethnicity, (b) industry type, (c) business entity type, (d) employee number, and (e) franchise or non-franchise. This approach was used across all the FBEs in the sample.

The path model for the first analytic, Customer Credit, was conducted to determine the causal effects among customer and marketing analytics. The Customer Credit Analytic consists of the metrics (a) V17-Customer Credit and (b) V18-Line of Credit. It was hypothesized that firm characteristics would mediate the effect on the marketing analytics (Hypothesis 1). For Path 1, the firm characteristics' influence on the Customer Credit Analytic is a good model fit (z = 0.89, p < 0.05); thus, the hypothesis is accepted. The path model for the second analytic, Market Potential, consists of the metrics (a) V20-Barriers-to-Entry Metric, (b) V23-Government Regulation Metric, and (c) V24-Social Entrepreneurial Metric. It was hypothesized that firm characteristics would mediate the effect on the marketing analytics (Hypothesis 2). For Path 2, the firm characteristics' influence on the Market Potential Analytic is a good model fit (z = 0.66, p < 0.05); thus, the hypothesis is accepted.

The path model for the third analytic, Customer Turnover, consists of the metrics (a) V15-Velocity of Profit Metric and (b) V16-Customer Activity/Turnover Metric. It was hypothesized that firm characteristics would mediate the effect on the marketing analytics (Hypothesis 3). For Path 3, the firm characteristics' influence on the Customer Turnover Analytic is a good model fit (z = 0.84, p < 0.05); thus, the hypothesis is accepted. Lastly, the path model for the fourth analytic, Competition and Economics, consists of the metrics:

Fig. 4.3 SEM path analysis results for the MACS instrument (k = 10 items)

(a) V19-Market Potential Metric, (b) V21-Competition Intensity Metric, and (c) V22-Economics Metric. It was hypothesized that firm characteristics would mediate the effect on the marketing analytics (Hypothesis 4). For Path 4, the firm characteristics' influence on the Competition and Economics Analytic is a weaker model fit (z = 0.39, p < 0.05); thus, the hypothesis is rejected.
The goodness-of-fit indices revealed that the four-factor analytics model fit the data poorly to marginally: χ2(86) = 96.330, CFI = 0.92, RMSEA =

Table 4.4 AMOS path analysis coefficients and goodness-of-fit statistics

Goodness-of-fit statistic: Value

Chi-square: χ2 test (df = 86): 96.330 (test of close fit p = 1.00)
RMSEA (Root Mean Square Error of Approximation): 0.031
CFI (Comparative Fit Index): 0.922
IFI (Incremental Fit Index): 0.932
NFI (Normed Fit Index): 0.594
PGFI (Parsimony Goodness-of-Fit Index): 0.755
PNFI (Parsimonious Normed Fit Index): 0.486
TLI (Tucker-Lewis Index): 0.905
AIC (Akaike Information Criterion): 194.330
BCC (Browne-Cudeck Criterion): 209.122
(N = 123)

0.031, IFI = 0.932, AIC = 194.330, and BCC = 209.122. A notable
observation from goodness-of-fit theory is that a model demonstrating
poor to marginal fit does not imply that the path model is the best one,
only that it is plausible (Kline 1998). The cross-validation with the sample
examined the psychometric properties of the measurement model. A
chi-square difference test (χ² = 96.330, df = 86, p < 0.0000001) further
suggests the measurement model was invariant and that the scale constructs
were perceived in a similar manner across the sample (Kline 1998) (see
Fig. 4.3 and Table 4.4).
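As a quick cross-check (a sketch, not part of the original analysis), the reported RMSEA follows directly from the chi-square statistic, and the reported AIC is consistent with AMOS's χ² + 2q convention if the model has q = 49 free parameters, a value inferred here rather than stated in the text:

```python
import math

# Reported values from Table 4.4
chi2, df, n = 96.330, 86, 123

# RMSEA = sqrt(max(chi2 - df, 0) / (df * (N - 1)))
rmsea = math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))
print(round(rmsea, 3))  # 0.031, matching the reported RMSEA

# AMOS reports AIC as chi2 + 2q, q = number of free parameters.
# q = 49 is inferred so that chi2 + 2q reproduces AIC = 194.330.
q = 49
aic = chi2 + 2 * q
print(round(aic, 3))  # 194.33
```

The close agreement between the reported indices and the chi-square-based formulas is expected, since AMOS derives all of these statistics from the same fitted model.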

4.5.4 Correlation and Analytics
Correlation analyses were also used to examine relationships among mar-
keting metrics within the analytics. The MACS instrument measures the
market analytics in the FBEs. Table 4.5 shows strong correlations among
the analytic variable metrics. First, in the Customer Credit Analytic, the
results indicate a significant relationship between the metrics V17-Customer
Credit and V18-Line of Credit (r = 0.465, p < 0.01). Second, in the Market
Potential Analytic, the results indicate a significant relationship between the
metrics V20-Barriers to Entry and V23-Government Regulation (r = 0.381,
p < 0.01). Third, in the Customer Turnover Analytic, there was a significant
relationship between the metrics V15-Velocity of Profit and V16-Customer
Activity/Turnover (r = 0.306, p < 0.01). Lastly, the Competition and
Economics Analytic yielded no significant correlations in the data. In
summary, the Market Potential Analytic was found to be a potent predictor
of market potential (see Table 4.5).
Table 4.5 Correlations of observed analytics and metric items and covariates

Analytics and metric items               Mean   SD     V17      V18      V20      V23      V24      V15      V16      V19      V21
Analytic 1: Customer Credit Analytic
 (a) V17-Customer Credit Metric          2.37   1.288
 (b) V18-Line of Credit Metric           2.59   1.317  0.465**
Analytic 2: Market Potential Analytic
 (a) V20-Barriers to Entry Metric        2.76   1.325  0.267**  0.424**
 (b) V23-Government Regulation Metric    3.24   1.478  0.138    0.245**  0.381**
 (c) V24-Social Entrepre. Metric         1.64   1.153  0.072    0.065    0.213*   0.162
Analytic 3: Customer Turnover Analytic
 (a) V15-Velocity of Profit Metric       3.97   1.293  0.071    0.040    0.019    0.081    0.014
 (b) V16-Customer Activity/Turn. Metric  3.29   1.246  0.275**  0.343**  0.127    0.174    0.091    0.306**
Analytic 4: Competition and Economic Analytic
 (a) V19-Market Potential Metric         4.02   1.346  0.165    0.195*   0.049    0.026    0.026    0.151    0.309**
 (b) V21-Competition Intensity Metric    4.05   1.137  0.035    0.151    0.089    0.183*   0.082    0.023    0.016    0.117
 (c) V22-Economic Climate Metric         3.70   1.138  0.120    0.126    0.192*   0.293**  0.092    0.110    0.265**  0.133    0.109

Note: ** Correlation is significant at the 0.01 level (two-tailed). * Correlation is significant at the 0.05 level (two-tailed)
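The coefficients in Table 4.5 are ordinary Pearson correlations with two-tailed significance tests, which can be reproduced without a statistics package. The sketch below computes r and its t statistic for two invented metric-score vectors (not the study's data); with the study's N = 123, any |r| above roughly 0.18 clears the 0.05 two-tailed threshold, which is why a value such as 0.195 is starred while 0.174 is not:

```python
import math

def pearson_r(x, y):
    # Pearson product-moment correlation between two equal-length samples.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical 1-5 Likert scores for two metrics (illustration only)
v17 = [2, 1, 3, 4, 2, 5, 1, 3]
v18 = [3, 2, 3, 5, 2, 4, 2, 4]

r = pearson_r(v17, v18)
# t statistic for H0: rho = 0, with n - 2 degrees of freedom
t = r * math.sqrt((len(v17) - 2) / (1 - r ** 2))
print(round(r, 3), round(t, 2))
```

The two-tailed p value then comes from comparing t against a t distribution with n − 2 degrees of freedom.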

4.5.5 Regression Modeling and Analytics


A linear regression was conducted to determine which independent vari-
ables (Ethnicity, Industry Type, Franchise, and Employee Number) were
predictors of the firm performance analytics. The researcher used a data
screening process to identify any multivariate outliers. The collinearity
statistics indicate that variance analytic factors fall below 0.3 for all
variables, thus indicating a lack of collinearity among the variable metrics.
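A common way to run such a collinearity screen is with variance inflation factors: regress each predictor on the others and take VIF = 1/(1 − R²), where values near 1 indicate independence and values above about 10 are the usual warning sign. A minimal sketch with invented predictor data (an illustration of the diagnostic, not the study's screening output):

```python
import numpy as np

def vifs(X):
    # X: (n, p) predictor matrix; returns one VIF per column.
    out = []
    for j in range(X.shape[1]):
        y = X[:, j]
        Z = np.delete(X, j, axis=1)
        Z1 = np.column_stack([np.ones(len(y)), Z])   # add intercept
        beta, *_ = np.linalg.lstsq(Z1, y, rcond=None)
        resid = y - Z1 @ beta
        r2 = 1.0 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))
        out.append(1.0 / (1.0 - r2))
    return out

# Two mildly correlated hypothetical predictors
X = np.array([[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 5]], dtype=float)
print([round(v, 2) for v in vifs(X)])  # [3.19, 3.19]
```

With a single companion predictor, R² reduces to the squared pairwise correlation; with several predictors it captures joint redundancy that pairwise correlations can miss.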
As indicated in Table 4.6, in step one of the analysis for Analytic
1-Customer Credit, the metric V17-Customer Credit Metric proved to be
statistically insignificant with all five predictor variables. However, for the
metric V18-Line of Credit Metric, V12-Employee Number proved to be
statistically significant (β = 0.252, p = 0.005), as did V14-Business Entity
Type (β = 0.202, p = 0.025).

Table 4.6 Linear regression model of the firm variables' effect on Analytic 1:
Customer Credit

Analytic 1: Customer Credit           B      SE     β      t      p
DV: V17-Customer Credit Metric
 Predictor: V6-Ethnicity              0.069  0.105  0.060  0.656  0.513
 Predictor: V8-Industry Type          0.007  0.044  0.015  0.164  0.870
 Predictor: V11-Franchise             0.345  0.381  0.083  0.907  0.366
 Predictor: V12-Employee Number       0.264  0.183  0.131  1.448  0.150
 Predictor: V14-Business Entity Type  0.046  0.089  0.047  0.521  0.603
DV: V18-Line of Credit Metric
 Predictor: V6-Ethnicity              0.081  0.105  0.069  0.776  0.439
 Predictor: V8-Industry Type          0.004  0.043  0.008  0.091  0.928
 Predictor: V11-Franchise             0.516  0.378  0.121  1.364  0.175
 Predictor: V12-Employee Number       0.518  0.181  0.252  2.858  *0.005
 Predictor: V14-Business Entity Type  0.202  0.089  0.202  2.274  *0.025

Note: * p < 0.05, ** p < 0.01, *** p < 0.001

In step two, the variables Ethnicity, Industry Type, Franchise, and
Employee Number were again used as predictor variables. As indicated in
Table 4.7, for Analytic 2-Market Potential, under the metric V20-Barriers
to Entry, V12-Employee Number proved to be statistically significant
(β = 0.193, p = 0.032). However, the metrics V23-Government Regulation
Metric and V24-Social Entrepreneurial Metric proved statistically
insignificant with all five predictor variables. This suggests that ethnicity,
industry type, franchise status, and employee number are not strong
predictors of market behavior or market success in FBEs. That is an
interesting finding, because prior research has so often used gender,
ethnicity, and race as predictors of business success in the marketplace.
In steps three and four, the variables ethnicity, industry type, franchise,
and employee number were again examined as predictor variables. The
Competition and Economic Analytic proved statistically insignificant with
its metric variables. However, with the Customer Turnover Analytic,
V14-Business Entity Type proved to be statistically significant (β = 0.197,
p = 0.029).
The overall model yielded some interesting findings. Interestingly, the
variable metric V12-Employee Number proved statistically significant with
two analytics, indicating a positive relationship (a) between employee
number and the V18-Line of Credit Metric in FBEs and (b) between
employee number and the V20-Barriers to Entry Metric. Lastly, there was
no significant difference between franchises and non-franchises
(see Tables 4.6, 4.7, 4.8, and 4.9).
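The regression text reports standardized coefficients (e.g., β = 0.252 for V12-Employee Number on the V18-Line of Credit Metric) alongside the raw estimates; a standardized β rescales the unstandardized B by sd(x)/sd(y), so predictors on very different scales, such as employee counts and 1-5 metric scores, become comparable. A sketch of that computation on invented data:

```python
import numpy as np

def ols_with_betas(X, y):
    # Ordinary least squares with intercept; returns (intercept, B, beta),
    # where beta standardizes each B by sd(x_j) / sd(y).
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    b0, B = coef[0], coef[1:]
    beta = B * X.std(axis=0, ddof=1) / y.std(ddof=1)
    return b0, B, beta

# Hypothetical data: a 1-5 metric score driven mainly by employee number
emp = [1, 3, 2, 5, 4, 6]          # employee count (scaled)
franchise = [0, 1, 0, 1, 0, 1]    # dummy predictor
score = [1.2, 2.9, 2.1, 4.8, 3.9, 5.6]
b0, B, beta = ols_with_betas(np.column_stack([emp, franchise]), score)
print(np.round(beta, 3))
```

With real survey data, the p values in the tables would then come from comparing t = B/SE against a t distribution with the residual degrees of freedom.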

4.6 Conclusions and Implications


This study used analytics with FBEs to examine marketing and customer
behavior. The primary goal of the research was to measure both internal
firm behavior (customer influence) and external firm behavior (market
influence). We tested four hypotheses based on a theoretical model
comprising four marketing categories (CTAn + CCAn + MPAn + CEAn).
The researcher studied four marketing analytics: (a) Customer Turnover
Analytic, (b) Customer Credit Analytic, (c) Market Potential Analytic, and
(d) Competition and Economics Analytic. The basis for this study is to
examine analytics in FBEs. The hypotheses can be segregated and studied
as follows:

a) H1: The Customer Turnover Analytic shows some evidence of significance in FBEs;
b) H2: The Customer Credit Analytic shows some evidence of significance in FBEs;
c) H3: The Market Potential Analytic shows evidence of significance in FBEs;
d) H4: The Competition and Economic Analytic shows no evidence of significance in FBEs.

Table 4.7 Linear regression model of the firm variables' effect on Analytic 2:
Market Potential

Analytic 2: Market Potential          B      SE     β      t      p
DV: V20-Barriers to Entry Metric
 Predictor: V6-Ethnicity              0.192  0.106  0.161  1.812  0.073
 Predictor: V8-Industry Type          0.016  0.044  0.033  0.374  0.709
 Predictor: V11-Franchise             0.367  0.382  0.085  0.959  0.339
 Predictor: V12-Employee Number       0.399  0.183  0.193  2.173  *0.032
 Predictor: V14-Business Entity Type  0.140  0.090  0.140  1.552  0.123
DV: V23-Government Regulation Metric
 Predictor: V6-Ethnicity              0.037  0.122  0.028  0.302  0.763
 Predictor: V8-Industry Type          0.008  0.051  0.014  0.157  0.875
 Predictor: V11-Franchise             0.030  0.441  0.006  0.067  0.947
 Predictor: V12-Employee Number       0.189  0.212  0.082  0.891  0.375
 Predictor: V14-Business Entity Type  0.081  0.102  0.072  0.797  0.427
DV: V24-Social Entrepreneurial Metric
 Predictor: V6-Ethnicity              0.038  0.094  0.037  0.411  0.682
 Predictor: V8-Industry Type          0.071  0.039  0.164  1.821  0.071
 Predictor: V11-Franchise             0.189  0.338  0.051  0.560  0.577
 Predictor: V12-Employee Number       0.196  0.162  0.109  1.207  0.230
 Predictor: V14-Business Entity Type  0.065  0.079  0.074  0.819  0.415

Note: * p < 0.05, ** p < 0.01, *** p < 0.001

Second, although the measures were drawn from the marketing literature,
it must be noted that some of the analytics revealed interesting results,
specifically the Competition and Economics Analytic (CEAn = CPI1 +
ECR2 + GOR3). Considering that customer behavior is difficult to measure
in firms, a pivotal aspect of customer behavior is examining how analytics
can be used to identify patterns in the data and be directed at specific target
industries (Bailey et al. 2009; Kridel and Dolk 2013; Hair 2007). Therefore,
the specific research design was warranted to determine whether such
marketing and customer behaviors meet the noted condition for establishing
that the marketing analytics effects have indeed occurred.

Table 4.8 Linear regression model of the firm variables' effect on Analytic 3:
Customer Turnover

Analytic 3: Customer Turnover         B      SE     β      t      p
DV: V15-Velocity of Profit Risk Metric
 Predictor: V6-Ethnicity              0.021  0.107  0.018  0.194  0.847
 Predictor: V8-Industry Type          0.002  0.044  0.004  0.049  0.961
 Predictor: V11-Franchise             0.301  0.385  0.072  0.781  0.437
 Predictor: V12-Employee Number       0.171  0.185  0.085  0.926  0.356
 Predictor: V14-Business Entity Type  0.146  0.088  0.150  1.664  0.099
DV: V16-Customer Activity/Turnover Metric
 Predictor: V6-Ethnicity              0.042  0.101  0.038  0.416  0.678
 Predictor: V8-Industry Type          0.006  0.042  0.014  0.150  0.881
 Predictor: V11-Franchise             0.517  0.366  0.128  1.413  0.160
 Predictor: V12-Employee Number       0.290  0.175  0.149  1.656  0.100
 Predictor: V14-Business Entity Type  0.186  0.084  0.197  2.209  *0.029

Note: * p < 0.05, ** p < 0.01, *** p < 0.001
Third, the findings of the study underscore several key points. Four
hypotheses were tested on a theoretical model based on four marketing
analytic model categories: (a) Customer Turnover Analytic, (b) Customer
Credit Analytic, (c) Market Potential Analytic, and (d) Competition and
Economics Analytic. The most salient finding from the results is that half
of the analytic models supported the four hypotheses. The findings indicate
that H1 was supported, that H2 was supported, and that H3 was not
supported. However, H4 was supported. Interestingly, the findings of the
study could not support the marketing analytic models concerning the
predictor variables' (ethnicity, industry type, franchise, and employee
number) influence on FBEs.

Table 4.9 Linear regression model of the firm variables' effect on Analytic 4:
Competition and Economic

Analytic 4: Competition and Economic  B      SE     β      t      p
DV: V19-Market Potential Metric
 Predictor: V6-Ethnicity              0.014  0.109  0.011  0.124  0.901
 Predictor: V8-Industry Type          0.020  0.045  0.039  0.439  0.662
 Predictor: V11-Franchise             0.658  0.394  0.151  1.671  0.097
 Predictor: V12-Employee Number       0.309  0.189  0.147  1.637  0.104
 Predictor: V14-Business Entity Type  0.046  0.093  0.046  0.502  0.617
DV: V21-Competition Intensity Metric
 Predictor: V6-Ethnicity              0.006  0.093  0.006  0.067  0.946
 Predictor: V8-Industry Type          0.002  0.039  0.006  0.063  0.950
 Predictor: V11-Franchise             0.486  0.336  0.132  1.445  0.151
 Predictor: V12-Employee Number       0.182  0.161  0.102  1.126  0.262
 Predictor: V14-Business Entity Type  0.149  0.077  0.173  1.936  0.055
DV: V22-Economics Metric
 Predictor: V6-Ethnicity              0.086  0.093  0.084  0.919  0.360
 Predictor: V8-Industry Type          0.037  0.039  0.088  0.968  0.335
 Predictor: V11-Franchise             0.208  0.337  0.057  0.618  0.538
 Predictor: V12-Employee Number       0.118  0.162  0.066  0.727  0.468
 Predictor: V14-Business Entity Type  0.059  0.078  0.068  0.752  0.454

Note: * p < 0.05, ** p < 0.01, *** p < 0.001
Lastly, the results of the first three analytic models prove most relevant
to situations involving FBEs' ability to borrow money from financial
institutions. The model also shows that the fourth analytic (Competition
and Economic Analytic) needs to be modified because of its low coefficients.
With such a modification, the analytic model could possibly bear more
statistical significance in the data; thus, the researcher would consider
modifications to the metric items.

In conclusion, in light of the findings, further modification of the
marketing analytics model should improve the causal path model. It may
capture indirect, subtle forms of the causal relationships between the
predictor variables and the latent variables, which in turn will more clearly
delineate the causal relationships among the marketing and customer
analytic variables.

4.7 Limitations and Future Research


There were some limitations to this study, and some caveats are worth
noting, related to collecting the data and interpreting the results. First, one
limitation was the use of a single sample, which focused on FBEs; an
exclusive focus on one sample type is an important limitation. Second,
there were limitations in the research design and methodology: with survey
research, there is always the question of whether the measures used validly
captured the constructs of interest.
Third, the results are based on a limited geographical area, and a
non-probability sampling approach was used. On those statistical grounds,
the results of the study cannot be generalized to a larger population, and
we must proceed with caution in applying them to the general population.
Extending this work to other geographical areas and to the much larger
population of SMEs would be useful. Lastly, another limitation relates to
the measurement of customer behavior and market behavior. In particular,
confidence in the results of this study could be strengthened with more
access to customer behavioral data, particularly consumer purchase
histories. This would be fruitful in obtaining results for future studies
of FBEs.
There are some opportunities for extending the research. For example,
an interesting extension of this study would be to apply the model to other
SMEs in different industries. It would also be interesting to extend the
research to internet-based firms. This would provide an opportunity to
examine the different dynamics of those industries in terms of analytics.
Another interesting direction for future research would be to examine
firms owned by immigrants, providing an opportunity to examine analytics
and differences in that context. The research on this analytic framework
can also be extended to understand the dynamics of economic competition
and the effects of other forms of competition metrics and analytics. This
type of analysis could lead to new theoretical models for the further
refinement of marketing models, and could provide much richer theories
to assist firms in different industries or market sectors.

References
Apte, C.V., Ramesh Natarajan, Edwin P.D. Pednault, and F.A. Tipu. 2002. A probabilistic estimation framework for predictive modeling analytics. IBM Systems Journal 41(3): 438–448.
Arbuckle, J.L. 1995. AMOS User's Guide. Chicago: Smallwaters Publishing.
Bailey, Christine, Paul R. Baines, Hugh Wilson, and Moira Clark. 2009. Segmentation and customer insight in contemporary services marketing practice: Why grouping customers is no longer enough. Journal of Marketing Management 25(3–4): 227–252.
Bijmolt, Tammo H.A., Peter S.H. Leeflang, Frank Block, Maik Eisenbeiss, Bruce G.S. Hardie, Aurélie Lemmens, and Peter Saffert. 2010. Analytics for customer engagement. Journal of Service Research 13(3): 341–356.
Brooks, Neil, and Lyndon Simkin. 2011. Measuring marketing effectiveness: An agenda for SMEs. The Marketing Review 11(1): 3–24.
Chen, Yongmin. 2006. Marketing innovation. Journal of Economics & Management Strategy 15(1): 101–123.
Chen, Chun-An, Ming-Huang Lee, and Ya-Hui Yang. 2010. Branding Taiwan for tourism using Decision Making Trial and Evaluation Laboratory and Analytic Network Process methods. The Service Industries Journal 32(8): 1355–1373.
Coughlan, Anne T., S. Chan Choi, Wujin Chu, Charles A. Ingene, Sridhar Moorthy, V. Padmanabhan, Jagmohan S. Raju, David A. Soberman, Richard Staelin, and Z. John Zhang. 2010. Marketing modeling reality and the realities of marketing modeling. Marketing Letters 21(3): 317–333.
Dash, Debi Prasad, and Alok Sharma. 2012. B2B marketing through social media using web analytics. PRIMA 3(2): 22. Publishing India Group.
Drye, Tim. 2011. Neighbourhood effects and their implications for analytics and targeting. Journal of Direct, Data and Digital Marketing Practice 13(2): 119–131.
Farris, Paul W., Neil T. Bendle, Phillip E. Pfeifer, and David J. Reibstein. 2006. Marketing Metrics: 50+ Metrics Every Executive Should Master. Pearson Education.
Fluss, Donna. 2010. Why marketing needs speech analytics. Journal of Direct, Data and Digital Marketing Practice 11(4): 324–331.
Furness, Peter. 2011. Applications of Monte Carlo simulation in marketing analytics. Journal of Direct, Data and Digital Marketing Practice 13(2): 132–147.
Gai, Prasanna, Nigel Jenkinson, and Sujit Kapadia. 2007. Systemic risk in modern financial systems: Analytics and policy design. The Journal of Risk Finance 8(2): 156–165.
Germann, Frank, Gary L. Lilien, and Arvind Rangaswamy. 2011. Performance implications of deploying marketing analytics. ISBM Report 2. Institute for the Study of Business Markets, 1–43.
———. 2013. Performance implications of deploying marketing analytics. International Journal of Research in Marketing 30(2): 114–128.
Ghose, Sanjoy, and Oded Lowengart. 2012. Consumer choice and preference for brand categories. Journal of Marketing Analytics 1(1): 3–17.
Gnatovich, Rock. 2007. Making a case for business analytics. Strategic Finance 88(8): 46.
Grewal, Dhruv, Gopalkrishnan R. Iyer, Wagner A. Kamakura, Anuj Mehrotra, and Arun Sharma. 2009. Evaluation of subsidiary marketing performance: Combining process and outcome performance metrics. Journal of the Academy of Marketing Science 37(2): 117–129.
Hair Jr., Joe F. 2007. Knowledge creation in marketing: The role of predictive analytics. European Business Review 19(4): 303–315.
Horiguchi, M., and A.B. Piunovskiy. 2012. The expected total cost criterion for Markov decision processes under constraints: A convex analytic approach. Advances in Applied Probability 44(3): 774–793.
Kline, Rex. 1998. Principles and Practices of Structural Equation Modeling. New York, NY: The Guilford Press.
Kohavi, Ron, Neal J. Rothleder, and Evangelos Simoudis. 2002. Emerging trends in business analytics. Communications of the ACM 45(8): 45–49.
Kridel, Don, and Daniel Dolk. 2013. Automated self-service modeling: Predictive analytics as a service. Information Systems and e-Business Management 11(1): 119–140.
Kumar, V., Veena Chattaraman, Carmen Neghina, Bernd Skiera, Lerzan Aksoy, Alexander Buoye, and Joerg Henseler. 2013. Data-driven services marketing in a connected world. Journal of Service Management 24(3): 330–352.
Lin, Chin-Tsai, and Pi-Fang Hsu. 2013. Adopting an analytic hierarchy process to select Internet advertising networks. Marketing Intelligence & Planning 21(3): 183–191.
Marsella, Anthony, Merlin Stone, and Matthew Banks. 2005. Making customer analytics work for you! Journal of Targeting, Measurement and Analysis for Marketing 13(4): 299–303.
Melville, Prem, Vikas Sindhwani, and R. Lawrence. 2009. Social media analytics: Channeling the power of the blogosphere for marketing insight. Proc. of the WIN 1(1): 1–5.
Morgan, Neil A. 2012. Marketing and business performance. Journal of the Academy of Marketing Science 40(1): 102–119.
Ozimek, Jane Fae. 2010. Issues with statistical forecasting: The problems with climate science, and lessons to be drawn for marketing analytics. Journal of Database Marketing & Customer Strategy Management 17(2): 138–150.
Panayides, Photis M. 2002. Identification of strategic groups using relationship marketing criteria: A cluster analytic approach in professional services. Service Industries Journal 22(2): 149–166.
Petti, Richard J. 2005. The Revolution in Marketing Analytics. ModelSheet.
Sathyanarayanan, R.S. 2012. Customer analytics: The genie is in the detail. Marketing & Communication, 41.
See, Ed. 2006. Bridging the finance-marketing divide. Financial Executive 23(6): 50–53.
Smit, Edith G., and Peter C. Neijens. 2011. The march to reliable metrics: A half-century of coming closer to the truth. Journal of Advertising Research 51(Suppl): 124–135.
Solcansky, Marek, Lucie Sychrova, and Frantisek Milichovsky. 2011. Marketing effectiveness by way of metrics. Economics & Management 16: 1323–1328.
Steinley, Douglas, and Robert Henson. 2005. OCLUS: An analytic method for generating clusters with known overlap. Journal of Classification 22(2): 221–250.
Tabachnick, B., and L. Fidell. 2007. Using Multivariate Statistics, 5th ed. Boston, MA: Pearson.
Zaman, Mukhles. 2003. Predictive analytics: The future of business intelligence. Technology Evaluation Centers.
Zeng, Daniel, Hsinchun Chen, Robert Lusch, and Shu-Hsing Li. 2010. Social media analytics and intelligence. IEEE Computer Society 1(1): 1–4.
CHAPTER 5

Strategic Planning Revisited: Acquisition and Exploitation of Information on Foreign Markets

Myropi Garri and Nikolaos Konstantopoulos

5.1 Introduction
During the last four decades, a wide range of researchers have explored
the processes of developing and implementing successful strategies for
national and international markets. Strategic management, as a scientific
field, has grown quickly and today encompasses a wide plurality of research
questions, units of analysis, and modeling tools, as a plethora of theories
(e.g., Power of Competition, the Resource-Based View (RBV), Dynamic
Capabilities) and factors (macro- and micro-external, internal) have been
interrelated to it. However, in the light of this interconnectedness, the high
complexity of analysis we have reached, instead of enlightening and puzzling
out the process of developing strategy, has in contrast increased its level
of complexity, vagueness, and fragmentation.

M. Garri (*)
University of Portsmouth, School of Business, Portsmouth, UK
e-mail: myropi.garri@port.ac.uk

N. Konstantopoulos (*)
University of the Aegean Business School, Chios, Greece
e-mail: nkonsta@aegean.gr

© The Author(s) 2017
E.G. Carayannis, S. Sindakis (eds.), Analytics, Innovation, and Excellence-Driven Enterprise Sustainability, Palgrave Studies in Democracy, Innovation, and Entrepreneurship for Growth, DOI 10.1057/978-1-137-37879-8_5

Furthermore, during the
contemporary times of turbulence, uncertainty about the possible evolution
and future trends of all these factors has increased dramatically. In addition
to the complicated and complex business world that we must take into
account while shaping strategy, other contextual evolutions, such as
technological improvements and the rise of social media and big data, are
gradually changing the value-adding operations of the firm. Thus, we note
that a need has been created in the field of strategic management. Instead
of integrating ever more approaches and theories into the field, further
increasing the level of complexity and uncertainty of strategic decision
making, we should start to revisit the foundations of the strategy
formulation process. By revisiting the different stages of strategy formulation
one by one, we will understand whether and how much each process of
strategy development has been reshaped by current evolutions. Reviewing
the underpinnings of strategy formulation and adopting a pragmatic rather
than theoretical approach will decrease vagueness, providing clearer answers
about the real contemporary practices managers and entrepreneurs use to
develop successful strategies in foreign markets. Taking into account the
framework described above, in this chapter we revisit strategy in use.
Given that the current business environment worldwide is under constant
and radical change, we feel that one of the most important issues companies
have to address while formulating strategy is to develop processes to
capture the current trends in the market and the industry they belong to,
so as to incorporate them in their strategic planning. Therefore, this chapter
concentrates on one of the most important areas of the traditional approach
to strategy formulation: the information acquisition and processing process.
Understanding the exploitation of the latest technology for market research
purposes as a value-adding element for the firm, leading to the creation of
successful strategies in foreign markets, we examine the evolution of
technology's effect on the process of information obtainment and
processing. Specifically, we review the process internationalized enterprises
implement in practice in order to obtain information for market and
industry research purposes, investigating the types of information obtained,
the sources of information, and the processes of dealing with the acquired
information.
At the same time, we cannot ignore the fact that there is no magical recipe
or golden rule of success valid for all enterprises. The characteristics of each
company, such as the industry it belongs to, the product/service it offers,
its culture, its strategic orientation, and its resources, competences, and
capabilities, will actually define the way the company operates in foreign
markets, the strategies it develops, and the overall behavior of the firm.
Following this spirit, in this chapter we examine the strategies companies
develop in order to research foreign markets, the types of information they
obtain, the sources of information they use, and the effect of technology,
interconnecting each of the above fronts with the characteristics, structures,
and prior strategic behavior of the company.
The rest of the chapter is organized as follows. First, we review the
literature, establishing the theoretical background, describing the main
concepts we discuss, and presenting arguments leading to the development
of hypotheses. Next, we present the research methodology, the collection
and analysis of the data, and the empirical analysis and discussion of the
variables under investigation. The chapter ends with a short conclusion.

5.2 Setting the Framework: Understanding the Main Concepts

5.2.1 The Structural and Capitalized Strategic Characteristics of the Firm
According to the RBV, apart from the formulation and implementation of
the appropriate business strategy, another element of high importance
leading to the creation of competitive advantage is the development of
structural, strategic, operational, and management competences and
capabilities (Fortune and Mitchell 2012). These competences, when
absorbed and capitalized over time, are converted into resources which the
company's strategy is required to exploit in order to achieve the corporate
goals (Cepeda-Carrion et al. 2012). The dynamic capability school (Barney
1991) locates the departure point of strategy development in evaluating
internal resources and developing unique organizational capabilities and
core competences that are difficult to replicate. Teece et al. (1997)
underlined the need to integrate the resource-based view with organizational
learning theory, with social capital concepts, and with strategic management
in the theoretical analysis of the means to create and exploit competitive
advantage. In other words, the capitalized structures, operations, and
strategies of the firm constitute the main elements that make each company
a unique entity and lead to the generation of dynamic capabilities,
competitive advantage, and higher levels of performance (Eriksson et al.
2014; Villar et al. 2014).
A similar viewpoint holds that organizational culture affects decision
making in a company: the corporate orientation consists of the crucial
components leading the enterprise to sustainable development and growth,
combined with the knowledge the firm absorbs from the market, which
may redetermine its course (Dornberger and Nabi 2008). The effective
distribution of resources leverages the internalization of the external
environment's effect, a fact that drives strategic development and
organizational adjustment (Marciano 2011).
Summarizing, the transition of the company to a completely new busi-
ness environment calls for and is managed by the adjustment of business
strategy and policy to the newly emerging conditions. In addition, the
successful implementation of an internationalization strategy presupposes
the ownership or development of appropriate structures and processes by
the firm, in order to leverage the adaptation of the enterprise to the new
market settings in which it is called upon to operate and develop. Thus, a
first main research hypothesis can be formed:
H1: The characteristics, structures, and past strategies of the firm
constitute the main factors that facilitate or prevent the formulation and
implementation of future strategies.

5.2.2 Market and Industry Research: Types of Information Acquired
Knowledge-based strategy (Eisenhardt and Santos 2002) is a term that
integrates concepts of strategy development with the knowledge earned
through the acquisition of information. The term emerged in the relatively
recent strategic management literature and is highly related to processes
interconnecting the creation of novel knowledge with the creation of value
for the enterprise. The exploitation of informational resources can lead to
knowledge creation through a five-phase life cycle comprising the planning,
acquisition, stewardship, exploitation, and disposal of information (Souchon
and Diamantopoulos 1999).
Market research refers to the research activities of firms carried out so as
to reduce uncertainty of decision making in a market (Cavusgil 1985:
262). Enterprises are considered to succeed through the development
of strategies aiming to achieve: (a) innovativeness (launching new products,
services, and technology, and entering new markets); (b) proactiveness
(searching for new ways to succeed entrepreneurially); and (c) beneficial
risk-taking (making realistic decisions when faced with the uncertainties of
the environment) (Miles and Arnold 1991). Successful strategic
development and implementation require the development of relevant
dynamic capabilities, leading the organization to reconfigure its internal
and external competencies to address rapidly changing environments
(Teece et al. 1997). To be able to address these rapidly changing
environments, we first have to capture change, and there is no better way
to do so than by information acquisition.
The vital role of information acquisition leading to the creation of
organizational advantage, and especially the relationship between success-
ful development of market strategies and market research as elements of
competitive advantage development, has been discussed in the strategic
management literature (Morgan and Katsikeas 1998). A relatively recent
survey among European companies revealed that 80% of the participants
regard information acquisition as a strategic asset. At the same time, 78%
of them think that their organizations are missing out on business oppor-
tunities by failing to utilize their knowledge base successfully (Raub and
von Wittich 2004). Also, market research, understood as the process of
creative enhancement and exploitation of opportunities emerging in a
market, has been proven to have an overall positive effect on business
performance (Hult and Ketchen 2001) and competitive advantage devel-
opment (Zahra and George 2002).
The process applied to study and evaluate market potential is based
on the obtainment of market information, a process considered to be
highly important for successful strategy development. It includes
the use of logic and objectivity in the systematic planning, mining,
analysis, and presentation of relevant information and data regarding
the market under examination (Souchon and Diamantopoulos 1999).
Information acquisition comprises the processes involved in bringing information
about the external environment into the boundary of the organization
(Yeoh 2005). Using a broader approach, information acquisition can be
defined as the obtainment of information adequate for strategic decision
making. Updated and valid information is needed so as to develop suc-
cessful strategies for market potential evaluation, market selection, mar-
ket penetration, launching of new products/services, product/processes
readjustment, distribution channels selection, pricing, positioning in the
foreign market, competition evaluation, development of networks, and so
on (Leonidou 1997). However, information is not valuable per se, unless
it is turned into knowledge.
114 M. GARRI AND N. KONSTANTOPOULOS

An enterprise that is strategically proactive in its approach to doing business
can be expected to express a similarly proactive approach to obtaining
market information. It is expected to be involved in the processes that most
require market intelligence within the concept of knowledge management.
Innovativeness and proactiveness will lead organizations to incorporate
market intelligence in numerous ways, acquiring all types of market
information. Also, a company can be seen as a set of unique resources and
capabilities, which constitute the foundation for shaping strategy (Hitt et al. 2012).
The above approaches lead us to perceive the resources and capabilities of
each enterprise as factors dynamically interconnected with the process of
market information acquisition, as they define the strategic framework
of the enterprise. In other words, we see the characteristics, structures,
and previous strategies of the firm as the key aspects that dynamically
determine the market information acquisition strategy/behavior of the
firm. It is reasonable from a methodological point of view to examine the
interaction among the orientations that describe organizational capabilities
(i.e., the enterprises' overall characteristics) and processes or competencies
(i.e., the market information orientation/acquisition process). Accordingly,
the following hypothesis can be formulated:
H2: The characteristics, structures, or previous strategies of the enter-
prise constitute the main factors that facilitate or prevent the adoption of
a market information acquisition strategic behavior.

5.2.3 Sources of Information About Foreign Markets


Many studies exploring the relationship between organizational characteristics,
business strategy, and information management ignore the initial step
of the procedure: data acquisition. The process applied to search for and
obtain the required data is considered one of the most important in the
information management chain. The importance of addressing export
information needs for the firm's internationalization success is frequently
emphasized in the relevant literature (Liesch and Knight 1999; Knight and
Liesch 2002). As has been argued, information acquisition leads to enhanced
internationalization performance, as it enables decision makers to better
recognize marketing opportunities and threats and to improve their positioning
in the international marketplace (Yeoh 2005). Indeed, studies examining
the effect of foreign information acquisition on internationalization
performance have found evidence of an important positive effect of information
acquisition, leading to knowledge creation, on export satisfaction
(Wang and Olsen 2002), new
export product advantage, and market performance in host countries (Li
and Cavusgil 2000). On the other hand, the restricted availability of infor-
mation on locating overseas markets and conducting business abroad has
frequently been cited in the internationalization literature as a serious
barrier to creating sound internationalization business strategies (Leonidou
1997). Insufficient internationalization information is regarded as resulting
in an increased level of psychological distance, inadequate design and
implementation of foreign business plans, and greater limitations in making
strategic internationalization decisions (Leonidou 1997).
Knowing that the acquisition and exploitation of information about the
foreign market (knowledge creation) enhances internationalization performance
(Li and Cavusgil 2000), the next question we have to answer is
how exactly a company mines the necessary information. Companies mostly
obtain information in two ways: through their own experiences or through
the experiences of other organizations. Learning from one's own experiences
includes experimenting and interpreting earlier outcomes, or developing
one's own mechanisms for information mining. Learning from others
amounts to the transfer of information and knowledge embedded in products
or processes, or the transfer of knowledge in a purer form (Håkansson
et al. 1999). Vicarious experience coming from companies in the same
sector is particularly important, because the companies sharing information
have a common background: they share similar resources and technologies,
and they face similar challenges and opportunities (Ingram and Simons
2002). Thus, this type of vicarious experience is somewhat easier for
managers to assimilate, as it regularly fits their cognitive models
(Henisz and Delios 2002). Vicarious experience
deriving from companies belonging to a different sector can prove
valuable as well. Especially when such companies come from the same
home country, they can benefit from the fact that they have evolved within
a common cultural and business environment. Additionally, companies
coming from the same country tend to develop similar organizational
structures and resources, which increases the probability that they belong
to the same strategic group. In this way, companies can learn from others
that have previously invested in the target country, in order to develop
strategies more safely.
Aiming to examine the process of obtaining information in detail, in
this chapter we identify not only the types but also the sources of information
used by internationalized companies to support their internationalization
activity. In addition, we attempt to find out whether the characteristics,
structures, and previous strategies implemented by the firm affect the
process firms apply in practice to source information. Accordingly, the
following hypothesis can be formulated:
H3: The characteristics, structures, or previous strategies of the enter-
prise constitute the main factors that facilitate or prevent the adoption of
certain strategies to source information.

5.2.4 The Culture of CRM, and the Role of Customer Information Software

5.2.4.1 What Shapes Strategic Behavior?


Like human beings, companies are distinguished by special characteristics
that, when integrated, shape the special feature called identity.
Corporate identity is frequently described in terms of what the company
is or expects to be, what it does, or what it stands for (Melewar
2003). Although there is no full theoretical convergence on the
conceptualization of corporate identity, most researchers agree that
corporate identity consists of all the key and secondary organizational
characteristics that depict the spirit, the character, and the internal culture
of the organization (Jorda-Albinana et al. 2009). Corporate behavior is
visualized as the organization's actions performed in accordance with its
culture or, conversely, the actions occurring impulsively and without prior
planning, representing the way it acts toward the environment (Melewar
2003). The obtainment of customer information through relevant information
systems can be seen as an expression of the company's strategic behavior in
accordance with its culture, since it is not developed voluntarily or
altruistically. Indeed, the implementation of proactive customer-centered
strategies may be the primary expression of the corporate
culture, as customer values are often ingrained in the corporate principles.
Firms have every reason to obtain customer information. It allows them
to effectively target their most valuable customers, tailor offerings
to individual desires, increase customer satisfaction and loyalty, and
recognize opportunities for new products or services. All the proactive
strategies designed and implemented for these purposes are based on the
customer information acquisition process (Hagel and Rayport 1997) and
represent, in part, a strategically alert behavior.

5.2.4.2 CRM and Information Software


The original definition of customer relationship management (CRM)
describes it as a strategic process combining management practices,
knowledge, resources, and suitable customer information software. This
process helps the firm to better serve the needs of its customers, to
increase their loyalty and retention, to increase the return on investment
(ROI) of marketing campaigns through better targeting, and to improve
cross-selling and up-selling success. The practice greatly depends on the
use of information technology. But in reality, is the acquisition and use of
relevant software interconnected with the future strategy of the company
and the development of a new way of thinking and acting in the market?
In this section, we aim to illustrate how the existence of customer
information software affects the corporate strategies developed and
implemented by internationalized enterprises, reflecting their character
and strategic behavior.
In order to better serve customers' needs and support the relationship
between the company and its customers, software to mine and analyze
information is needed. The alignment of business strategies with information
systems research outcomes has been discussed in the relevant literature
(Kajalo et al. 2007). CRM has been defined as the management of
customers through a customer database and reporting software (Fink 2006).
A CRM system entails the organization-wide obtainment, processing, and
exploitation of knowledge about customers so as to sell more products
and services, as well as to improve customer satisfaction (Bose 2002). A CRM
system is a front-office software tool that aspires to build long-term,
profitable customer relationships. CRM assists the recognition, approach, and
satisfaction of new and existing customers. Enterprises have to differentiate
themselves from their competition and create effective switching costs
for their customers. Information technology and information systems can
be used to sustain and integrate the CRM process so as to satisfy and
retain customers (Trivellas and Santouridis 2009), as relevant software
gives businesses more complete insight into their clients and customers
and helps front-line employees make faster, better-informed decisions.
As a result, customer value creation appears to be interconnected with
the IT initiatives of a company, which are aligned to its corporate strategy,
culture, and identity (Lagos and Kutsikos 2011). The adoption of CRM
practices has been found to be highly related to increased customer
satisfaction, as CRM practices take into account and address the needs
and expectations of the company's customers. In this direction, CRM
provides valuable insights into the way the company's products can be
modified and promoted effectively, as it integrates the management of
clusters of customers, offering tailor-made solutions that address their
personalized needs.
At the same time, the information mined and exploited supports the
development of various growth strategies, such as internationalization
strategy. CRM has also been identified as a crucial part of internationalization
strategy development. Although there has not been wide convergence on the
concept, Ngai (2005) underlined the significance of understanding CRM
as a comprehensive set of "strategies for managing those relationships with
customers that relate to the overall process of marketing, sales, service,
and support within the organization" (p. 583). The effective use of CRM
software not only allows companies to build and maintain strong relationships
with domestic customers but also broadens the ability of the firm to
reach new market potential in foreign markets (Harrigan et al. 2008), an
important step toward the company's internationalization. As proposed
by McGowan and Durkin (2002), CRM increases both internal and
external organizational efficiency throughout all stages of the supply chain.
However, it should not be employed autonomously but in alignment with
the overall internationalization strategy (Harrigan et al. 2008).
The fact that a company owns and uses customer information software
can be a sign that it has adopted a proactive and strategically alert
behavior in its operation in local and foreign markets. To explore in
practice (a) the relationship between the implementation of CRM and
the marketing and organizational strategies of the firm, representing its
strategic thinking and strategic behavior, and (b) the relationship between
the use of customer information acquisition software and the level of
internationalization activity of the company, we develop the following
hypotheses:
H4: We expect companies owning a CRM system to design and apply
multiple, direct, and indirect marketing strategies in the foreign markets
in which they are active.
H5: We expect companies owning a CRM system to be actively involved
in and highly committed to the markets in which they develop their business
activity. We expect these companies to have a higher degree of
internationalization involvement.
H6: We expect companies owning a CRM system to have a high level of
strategic complexity so as to support their internationalization activities.

5.3 Methods
As proposed by many researchers in the field (e.g., Ramamurti 2004),
our research methodology involves both qualitative (Ritossa and Bulgacov
2009) and quantitative (Hutchinson et al. 2009) research methods. The
integration of important constructs proposed by prior research with the
variables derived from qualitative research yields a valid research
framework for the examined field. We conduct in-depth personal interviews
to refine our constructs and to develop a closed questionnaire. The
employment of interviews before embarking on the questionnaire gives a
feel for the key issues and confidence that the most important issues are
addressed (Saunders et al. 2011). Then, we test our hypotheses using the
survey data. To measure the internal cohesion of the questionnaire, we
use the reliability coefficient Cronbach's α. The value of the coefficient
is 0.967, indicating that the internal consistency of the questionnaire is
very high. We decided to exclude all internationalized companies providing
services, as they constitute a unique case requiring the development
of a different theoretical background.
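As a sketch of the reliability computation (the survey responses themselves are not reproduced here, so the data below are purely illustrative), Cronbach's α for a k-item scale is α = k/(k−1) · (1 − Σ item variances / variance of the summed scale):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents x k_items) score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # sample variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the summed scale
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Illustrative data only: 6 respondents answering 4 five-point Likert items.
scores = np.array([
    [5, 4, 5, 4],
    [4, 4, 4, 5],
    [2, 2, 3, 2],
    [3, 3, 3, 3],
    [5, 5, 4, 5],
    [1, 2, 1, 1],
])
print(round(cronbach_alpha(scores), 3))  # highly consistent items give alpha near 1
```

A value such as the chapter's 0.967 arises when respondents answer the items in a highly consistent way, as in this toy matrix.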
A total of 1400 internationalized manufacturing companies are identi-
fied by the HEPO (Hellenic Foreign Trade Board) directory. We apply
a multi-industry stratified sampling design so as to broaden the gener-
alizability of the findings. We address the questionnaires to internation-
alization managers/directors. The 158 usable questionnaires, out of the
165 received in total, correspond to 11.29 % of the population. An effective
response rate of 36.66 % was attained. We compare responding and non-
responding companies in terms of size and mode of internationalization.
We do not find any significant differences between these two groups, a fact
suggesting that there is no response bias.
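The population-coverage figure quoted above can be checked directly:

```python
population = 1400  # internationalized manufacturers in the HEPO directory
received = 165     # questionnaires returned
usable = 158       # usable questionnaires

coverage = usable / population
print(f"{coverage:.2%}")  # share of the population covered by usable responses
```

The 36.66 % effective response rate is computed against the eligible contacted firms, a base not restated in this excerpt, so it is not rederived here.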

5.3.1 Measures
Dependent Variables. Integrating information mined from the literature
review and the results of interviews with internationalization directors,
we decided to include the following dependent variables in our questionnaire
(Table 5.1). A five-point Likert scale was used to measure the impact
of each motive on the company's internationalization decision-making
process. The operationalization of the independent variables, which are
the structural, strategic, and contextual characteristics of the firm, is
available at http://tinyurl.com/p352ygz.

Table 5.1 Operationalization of Dependent Variables

Types of Information:
- Information about the behavior of the product
- Information about the consumer behavior of the foreign market
- Information about competitive products
- Information about cultural and socio-political dimensions of the market
- Information about the other factors affecting the market

Sources of Information:
- Local state's assistance (consulates, embassies, commercial attachés)
- Compatriot companies that operate in the foreign market
- Co-operating companies
- Institutional bodies that offer assistance/information
- Visits to the foreign country/market by managers of the company
- Company's executives seeking information in foreign markets
- Trade shows/exhibitions
- Company's representatives abroad
- Consultancy firms, research agencies

CRM:
- The company owns and uses CRM software

5.4 Analysis and Results

5.4.1 Strategic Behavior Regarding the Types of Information Acquired

5.4.1.1 Cluster Analysis


We conduct cluster analysis to identify the groups in which enterprises can
be classified, according to the types of market information they obtain. In
other words, cluster analysis results illustrate the market information acqui-
sition behavior of the enterprise. According to the analysis, two groups of
enterprises were recognized representing two different types of strategic
behavior. The Compare Means table provided by SPSS indicates the
mean of each variable per cluster and in total (Table 5.2). Combining this
information, we can determine and compare the two different kinds of
strategic market information acquisition behavior.
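The chapter does not state which SPSS clustering procedure was used; the same two-cluster workflow with a per-cluster "Compare Means" summary can be sketched in Python with k-means on simulated Likert responses (column names and data are assumptions for illustration only):

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Simulated stand-in for the 158 responses on the five "types of
# information acquired" variables (names assumed, not the survey data).
rng = np.random.default_rng(0)
cols = ["product", "consumer_behavior", "competition",
        "sociopolitical", "other_factors"]
high = rng.integers(3, 6, size=(110, 5))  # information-seeking group (scores 3-5)
low = rng.integers(1, 4, size=(48, 5))    # low-acquisition group (scores 1-3)
df = pd.DataFrame(np.vstack([high, low]), columns=cols)

# Partition the firms into two clusters on their information-acquisition scores.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(df)
df["cluster"] = km.labels_

# Analogue of SPSS's "Compare Means" table: per-cluster and overall means.
print(df.groupby("cluster")[cols].mean().round(2))
print(df[cols].mean().round(2))
```

With clearly separated response profiles, the per-cluster means reproduce the pattern of Table 5.2: one cluster above the overall mean on every variable, the other below it.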

For the first cluster of companies, we observe higher means for every
variable, compared to the total mean. Oppositely, for the firms of the sec-
ond cluster, we observe lower means for every variable, compared to the
total mean. The means of the first cluster range at almost the same value
for all variables (max: 3.90min: 3.33), except of the variable Information
about the Competitive Products. The mean of this variable is the higher
one (4.20) indicating that most enterprises primarily care about the level
of competition in the industry. The mean of this variable is the highest one
in both clusters and in total. Even for the second cluster of enterprises, the
one that tends not to acquire market information, the mean of this vari-
able is 2.46, while the total mean fluctuates around 3.67. That indicates
that for every enterprise, even for those that do not widely gather market
information, competition is their main concern before and after entering
a market. They collect information about the competitive products (price,
quality, distribution channels used, promotion strategy, etc.), even if they
do not care much to acquire any other type of market information. In a
sense, our findings highlight competition as a regulatory force, as it may
encourage or prevent the entrepreneurial decision to become active in a
market, as well as a market-strategy-shaping element (Menon and Varadarajan
1992).
The second cluster's means range from 1.54 to 2.46, revealing that
about 30% of the entrepreneurs seek little or no information about the

Table 5.2 Variables of "Types of Information Acquired": Means per Cluster

Cluster        Product    Consuming     Competitive   Social, political,   Sum of factors
               behavior   behavior of   products      cultural dimensions  affecting the
                          the market                  of the market        market

1      Mean    3.90       3.87          4.20          3.33                 3.86
       N       110        110           110           110                  110
       SD      0.812      0.968         0.833         1.033                0.851
2      Mean    2.19       1.81          2.46          1.54                 1.90
       N       48         48            48            48                   48
       SD      1.065      0.915         1.051         0.651                0.928
Total  Mean    3.38       3.25          3.67          2.78                 3.27
       N       158        158           158           158                  158
       SD      1.192      1.343         1.207         1.243                1.259

foreign market. Summarizing, we note that enterprises of the first cluster
are highly engaged in obtaining market information, while enterprises of
the second cluster are not. There is a group of companies that applies an
aggressive market information acquisition strategy, intensively mining
every kind of market information. They invest resources and use their
networks to gain market knowledge. On the contrary, enterprises of the
second cluster collect almost no information before entering a market,
except perhaps information about foreign competition. Their business
activity seems to be based on impulse rather than on strategic planning.

5.4.1.2 Discriminant Analysis and Binary Logistic Regression: Types of Information Acquired and the Effect of the Characteristics, Structures, and Strategies of the Company
What makes some companies develop different behaviors regarding infor-
mation acquisition? To investigate the factors differentiating the market
information acquisition behavior of enterprises, we conduct discriminant
analysis and binary logistic regression. We compare the two clusters'
strategic behavior, controlling for the effect of a wide variety of
operationalized characteristics of the enterprises, available at
http://tinyurl.com/p352ygz. Results are presented in Table 5.3.
We find that the strategies enterprises develop affect the decision to
mine information or not. As we observe, when an enterprise wishes to
enter a highly competitive market, such as the market of the EU, the
odds of its being actively engaged in the market information acquisition
process increase by a factor of 7.7. The same positive relationship
(odds multiplied by 8.048) appears when an enterprise has a high
Table 5.3 Logistic Regression Results for Types of Information Obtained, and
Characteristics, Strategies, and Structures of the Firm

                                          B       S.E.   Wald   df  Sig.   Exp(B)
Step 19a
  Strategy of Market Expansion: EU        2.045   0.831  6.059  1   0.014  7.731
  Degree of Strategic Complexity          2.085   0.697  8.962  1   0.003  8.048
  Lack of Strategic Differentiation      -2.049   0.757  7.325  1   0.007  0.129
    per Market
  Constant                               -2.668   1.639  2.649  1   0.104  0.069

degree of strategic complexity, for example, developing multiple strategic
targets for different foreign markets. On the contrary, when a company
lacks strategic differentiation, the odds of its being largely involved in the
process under examination are reduced (multiplied by 0.129). We find
evidence supporting our first hypothesis: the strategic characteristics of
the enterprise constitute the main factors that facilitate or prevent the
adoption of a market information acquisition behavior. Like Souchon and
Diamantopoulos (1999), we also find that when market information turns
into market knowledge, there is an increased probability that the enterprise
will recognize market opportunities in an external environment previously
considered hostile. Companies engaged in greater market information
search and evaluation of market potential tend to develop and implement
complex market penetration and development strategies, in order
to maximize their business performance in the examined market. It seems
that enterprises actively engaged in the information mining process are
able to discover market opportunities and create successful business
ventures. The fact that they acquire market information can lead them to
develop and apply real-time, market-oriented, targeted strategies. These
strategies will reinforce the company's penetration and performance in the
market. Enterprises with a low level of commitment to the process can be
expected to act reactively toward market challenges; their business
activity is based on impulse rather than on strategic planning. Managers
should search for opportunities to enhance the way market knowledge is
obtained (e.g., acquiring multiple types of market information and using
different sources to obtain it) in order to support their entrepreneurial
activities more reliably. Managers should also redesign the process applied
to diffuse the acquired information throughout the organization. Through
this redesign, the organization will increase its levels of market information
exploitation and market knowledge generation.
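The multipliers discussed above correspond to the Exp(B) column of Table 5.3: exponentiating a logit coefficient B gives the factor by which the odds change, and a coefficient whose Exp(B) is below 1 is negative. The reported values can be reproduced (up to coefficient rounding) as:

```python
import math

# B coefficients from Table 5.3; Exp(B) = e^B is the odds-ratio multiplier.
coefficients = {
    "Strategy of Market Expansion: EU": 2.045,
    "Degree of Strategic Complexity": 2.085,
    # Negative coefficient, consistent with its Exp(B) = 0.129 < 1.
    "Lack of Strategic Differentiation per Market": -2.049,
}
for name, b in coefficients.items():
    print(f"{name}: Exp(B) = {math.exp(b):.3f}")
```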

5.4.2 Strategic Behavior Regarding the Sources of Information Used

5.4.2.1 Factor Analysis: Sources of Information


The first question we aim to answer in this section is the following: when
companies mine information, do they use all information sources
simultaneously, or are information sources used in groups? To examine
this, we conduct a factor analysis. Results verify the creation of two groups
of factors (information sources). The first factor was named "institutional
information sources" (Cronbach's alpha reliability coefficient: 0.788),
while the second factor was named "inter-organizational and market
information sources" (Cronbach's alpha reliability coefficient: 0.580).
Results show that, according to the way they are used, sources of information
about the foreign market can be classified into two groups: institutional
sources and inter-organizational sources.
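A two-factor structure of this kind can be illustrated with scikit-learn's `FactorAnalysis` on simulated data (the chapter's analysis was run in SPSS; the item grouping and data below are assumptions for illustration):

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Two independent latent tendencies (institutional vs. inter-organizational
# source usage) generate nine observed source-usage items.
rng = np.random.default_rng(1)
n = 156
inst, inter = rng.normal(size=(2, n))
X = np.column_stack(
    [inst + rng.normal(scale=0.3, size=n) for _ in range(5)]     # institutional items
    + [inter + rng.normal(scale=0.3, size=n) for _ in range(4)]  # inter-organizational items
)

fa = FactorAnalysis(n_components=2, random_state=0).fit(X)
loadings = fa.components_.T.round(2)  # 9 items x 2 factors
print(loadings)
```

Because the two latent tendencies are independent, each item loads dominantly on one of the two extracted factors, mirroring the clean two-group classification reported above.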

5.4.2.2 Cluster Analysis: Sources of Information


To identify the strategic behavior of companies regarding the sources of
information they use to mine information on foreign markets, we use
cluster analysis. Two groups of companies are created per factor. The
"Compare Means" table produced by SPSS for the "Institutional Information
Sources" factor provides the mean of each variable belonging to this factor
per cluster (Table 5.4). Combining this information, we can determine the
status of the two clusters of the factor.
As shown, companies belonging to the first cluster (74 in total, 47.44 %)
have higher means for every variable, compared to the total mean.
Conversely, firms belonging to the second cluster (82 in total, 52.56 %)
have lower means for every variable, compared to the total mean. The
first cluster's means range around almost the same value

Table 5.4 Compare Means for the Institutional Information Sources Variables

Cluster        Local state's   Companies      Institutional   Consultancy   Institutional
               assistance      operating in   bodies of the   firms,        bodies offering
               (consulates,    the foreign    foreign         research      assistance/
               embassies,      market         country         agencies      information
               commercial
               attachés)

1      Mean    2.82            3.58           2.99            2.39          2.73
       N       74              74             74              74            74
       SD      0.927           0.844          0.914           1.108         1.064
2      Mean    1.35            2.10           1.56            1.45          1.50
       N       82              82             82              82            82
       SD      0.616           1.014          0.803           0.688         0.724
Total  Mean    2.05            2.80           2.24            1.90          2.08
       N       156             156            156             156           156
       SD      1.070           1.194          1.114           1.023         1.089

for all variables (max: 3.58, min: 2.39), while the second cluster's means
range from 1.35 to 2.10. We assume that the first cluster of companies has
obtained information for its internationalization activity mainly from
institutional information sources, while the second cluster has not. The
"Compare Means" table produced by SPSS for the "Inter-organizational
and Market Information Sources" factor provides the mean of each variable
belonging to this factor per cluster (Table 5.5). Combining this information,
we can determine the status of the two clusters of the factor.
As shown, companies belonging to the first cluster (70 in total, 44.30 %)
have higher means for every variable, compared to the total mean.
Conversely, firms belonging to the second cluster (88 in total, 55.70 %)
have lower means for every variable, compared to the total mean. The
means of the first cluster range around almost the same value for all
variables (max: 3.99, min: 3.23), while the second cluster's means range
from 1.61 to 3.15. Cluster analysis results reveal that some companies
obtain information about the foreign market from institutional information
sources, while other companies do not extensively use this kind of
information source. In addition, there is a group of companies that
acquires information to support its internationalization activity from
inter-organizational and market sources, while another group of companies
does not. Results show that there is a group of companies

Table 5.5 Compare Means for the Inter-organizational and Market Information
Sources Variables

Cluster        Compatriot   Co-operating   Visits to the     Trade shows/   Company's
               companies    companies      foreign country/  exhibitions    representatives
                                           market by                        abroad
                                           managers of
                                           the company

1      Mean    3.33         3.23           3.36              3.90           3.99
       N       70           70             70                70             70
       SD      1.059        1.038          1.192             0.995          1.000
2      Mean    1.74         1.61           2.72              3.15           2.70
       N       88           88             88                88             88
       SD      0.877        0.850          1.286             1.309          1.349
Total  Mean    2.44         2.33           3.00              3.48           3.27
       N       158          158            158               158            158
       SD      1.244        1.234          1.282             1.235          1.362

that follows an aggressive information acquisition strategy, in that they
do not rely on government or institutional information sources to mine
information on foreign markets. On the contrary, they commit human and
financial resources, form networks and co-operations abroad, and use their
own experiences and the experiences of their partner companies in order
to learn. Of course, this does not mean that they do not use institutional
sources of information at the same time, but their own data and information
mining activities constitute their main informational sources. On the
other side, there is a group of companies that relies largely on institutional
sources to acquire information on foreign markets. This indicates a rather
passive information acquisition strategy: they merely receive information
collected by others and do not get involved in the information production
process. As a result, they do not form their own perception of the market.
We may assume that the companies that have developed an active
information mining strategic behavior have also developed an outward-looking
development strategy and are dedicated to the idea of internationalization.
Those using second-hand information are more likely to be hesitant
internationalizers.

5.4.2.3 Discriminant Analysis and Logistic Regression


We conduct discriminant analysis and binary logistic regression to find evidence on the characteristics that differentiate companies that do or do not adopt a certain strategic behavior regarding the information sources they use. We conduct discriminant analysis using the two clusters of companies produced by the cluster analysis for each factor and a wide range of characteristics related to company structure and to the management strategies applied or not applied by the companies of the sample. Then logistic regression follows (Table 5.6).
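The second step can be sketched as follows; this is a minimal illustration with synthetic data and hypothetical predictor names (department staffing, written plan, foreign branch), not the chapter's sample, showing how the B and Exp(B) columns of a table like Table 5.6 arise from a binary logistic fit:

```python
import numpy as np

# Illustrative sketch only: logistic regression of cluster membership on
# firm characteristics, fitted by Newton-Raphson. Variable names are
# hypothetical stand-ins for the chapter's constructs.
def fit_logit(X, y, iters=25):
    """Return maximum-likelihood coefficients B for P(y=1) = 1/(1+e^(-XB))."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)                       # score vector
        hess = (X * (p * (1 - p))[:, None]).T @ X  # observed information
        beta += np.linalg.solve(hess, grad)        # Newton step
    return beta

rng = np.random.default_rng(0)
n = 158  # total sample size, as in Table 5.5
employees = rng.poisson(2, size=n).astype(float)   # staff in intl. dept.
plan = rng.integers(0, 2, size=n).astype(float)    # written intl. plan
branch = rng.integers(0, 2, size=n).astype(float)  # branch abroad
X = np.column_stack([np.ones(n), employees, plan, branch])

# Synthetic outcome, generated with signs matching the reported
# relationships: negative for employees and branch, positive for the plan.
true_b = np.array([1.0, -0.8, 1.2, -1.5])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ true_b)))

b_hat = fit_logit(X, y)
print("B:", b_hat.round(3), " Exp(B):", np.exp(b_hat).round(3))
```

The fitted signs mirror the generating ones, which is what the "positive/negative relationship" statements in the results below summarize.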
For the first factor, named "institutional information sources," we observed that our sample splits into two clusters of companies: one that uses institutional information sources to obtain information in support of its internationalization activity, and one that does not. Our results show that the characteristics differentiating the companies using and not using these sources of information are the number of employees occupied in the special department for the study/support of the company's internationalization activities (negative relationship), the existence of a written internationalization plan (positive relationship), and the existence of a branch in the foreign market (negative relationship). According to our results, the more employees occupied in the
STRATEGIC PLANNING REVISITED: ACQUISITION AND EXPLOITATION... 127

Table 5.6  Binary Logistic Regression Results: Institutional Information Sources and Strategic and Structural Characteristics of the Firm

                                             B        S.E.    Wald    df   Sig.    Exp(B)
Step 17c
  Number of employees occupied in the
  special department of study/support of
  internationalization activities of the
  company                                    -2.059   1.052   3.827   1    0.050   0.128
  Existence of a written
  internationalization plan                   1.361   0.598   5.184   1    0.023   3.899
  Existence of a branch in the foreign
  market                                     -3.506   1.590   4.860   1    0.027   0.030
  Constant                                    6.524   2.998   4.737   1    0.030   681.473

Table 5.7  Binary Logistic Regression Results: Inter-Organizational and Market Information Sources and Strategic and Structural Characteristics of the Firm

                                   B        S.E.    Wald    df   Sig.    Exp(B)
Step 2a
  Non-clustering strategy          1.471    0.829   3.147   1    0.076   4.352
  Aggressive promotion strategy    0.683    0.337   4.116   1    0.042   1.980
  Constant                         -3.663   1.684   4.728   1    0.030   0.026

internationalization study department of the company, the lower the probability that the company obtains information for its internationalization activity from institutional sources. The same holds for the existence of a company branch in the foreign market. Conversely, when a written internationalization plan has been formed, there is a higher probability that institutional sources have supplied information to the company (Table 5.7).
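A note on reading these tables: the Exp(B) column is simply e^B, the factor by which the odds of cluster membership change for a one-unit increase in the predictor, so negative coefficients correspond to odds multipliers below 1. A quick consistency check against the reported values (the signs of B follow from the negative/positive relationships stated in the text):

```python
import math

# Reported (B, Exp(B)) pairs from Tables 5.6 and 5.7; negative B values
# correspond to the "negative relationship" findings.
reported = [
    (-2.059, 0.128),  # employees in the internationalization department
    ( 1.361, 3.899),  # written internationalization plan
    (-3.506, 0.030),  # branch in the foreign market
    ( 1.471, 4.352),  # non-clustering strategy
    ( 0.683, 1.980),  # aggressive promotion strategy
]
for b, exp_b in reported:
    assert abs(math.exp(b) - exp_b) < 0.005  # Exp(B) = e^B
# e.g. a written plan multiplies the odds of using institutional sources
# by about 3.9, while a foreign branch multiplies them by about 0.03.
print("all Exp(B) values consistent with e^B")
```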
As far as the second factor, named "inter-organizational and market information sources," is concerned, the characteristics differentiating the two clusters of companies were the existence of a non-clustering strategy (positive relationship) and the existence of a co-operator in the foreign market whom the company has directly or indirectly paid to promote its products (positive relationship). The above results verify that companies (a) having an outward-looking strategy (creating a special department
to support their internationalization activities, with many employees); (b) being highly internationally engaged (having a foreign branch); (c) following an aggressive foreign involvement strategy (paying to establish co-operations); and (d) forming networks, develop aggressive information acquisition strategies as well as committing resources toward this strategic goal. In addition, the existence of an internationalization plan may lead to the recording and utilization of all kinds of available information sources, including those deriving from state or institutional sources.

5.4.3 CRM, Information Software, and Strategic Development


In this section, we aim to identify how the existence of CRM software affects the strategies developed by the company. We test the interconnection between the existence of CRM software and the formulation of marketing strategies in foreign markets, the level of internationalization, and the level of internationalization strategic engagement. The phi correlation coefficient is used, as all variables are dichotomously measured. As shown in Table 5.8, the phi coefficient = 0.257 (sig. = 0.001 < 0.05). Given that the phi coefficient lies in [-1, 1], we observe a positive correlation between the two variables: the existence of a CRM system in the enterprise is positively correlated with the existence of direct marketing strategies such as the advertisement of the company's products in foreign markets. We observe a weak positive correlation [0.159 (sig. = 0.046 < 0.05)] between the existence of a CRM system and the company being indirectly advertised abroad through sponsorships.
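For two dichotomous variables, the phi coefficient is the Pearson correlation applied to their 0/1 codings and can be computed directly from the 2x2 contingency table. A minimal sketch with made-up counts (not the chapter's data):

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi for a 2x2 table [[a, b], [c, d]]:
    (ad - bc) / sqrt((a+b)(c+d)(a+c)(b+d))."""
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom

# Hypothetical cross-tabulation of "has CRM" against "advertises abroad":
#                 advertises   does not
# has CRM             40          20
# no CRM              30          68
print(round(phi_coefficient(40, 20, 30, 68), 3))  # -> 0.352
```

Like any correlation, phi of 0 means no association and +/-1 means perfect association, which is why the chapter reads values such as 0.257 as weak-to-moderate positive correlations.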
The existence of a CRM system is (a) positively correlated [0.244 (sig. = 0.002 < 0.05)] with the promotion of the company's products through distributors in foreign markets; (b) weakly positively correlated [0.169 (sig. = 0.034 < 0.05)] with the strategic choice of entering a market through joint ventures; (c) positively correlated

Table 5.8  Information Software – Marketing Strategies Developed in Foreign Markets

                      Advertisement of the Company's   Sponsorship
                      Products in Foreign Markets
Information Software
  Phi correlation     0.257                            0.159
  Approx. sig.        0.001                            0.046
[0.245 (sig. = 0.002 < 0.05)] with the strategic choice of entering a market by establishing foreign subsidiaries; (d) positively correlated [0.208 (sig. = 0.009 < 0.05)] with the strategic choice of internationalizing through mergers and acquisitions; and (e) positively correlated [0.238 (sig. = 0.012 < 0.05)] with the strategic choice of operating a production unit(s) abroad (Table 5.9).
Table 5.10 shows that the existence of a CRM system in the enterprise is positively correlated [0.230 (sig. = 0.004 < 0.05)] with the existence of an internationalization department in the company. It is also positively correlated [0.355 (sig. = 0.000 < 0.05)] with the level of strategic complexity of the enterprise.
We conduct binary logistic regression, using the existence of CRM software as the dependent variable and all the variables correlated with the existence of CRM as independent variables. The final results are presented in Table 5.11.
According to our empirical results, the existence of a CRM system is positively correlated with the development of direct (advertisements) and indirect (sponsorships) marketing strategies in foreign markets. We note that direct marketing strategies, operationalized as any kind of advertisement in foreign markets, are highlighted as significant by the binary logistic regression (at the 10% level), multiplying by 1.920 the odds for the

Table 5.9  Information Software – Level of Internationalization

                      Distributors   Joint      Foreign        Mergers &       Production
                      in foreign     ventures   subsidiaries   acquisitions    unit abroad
                      markets
Information Software
  Phi correlation     0.244          0.169      0.245          0.208           0.238
  Approx. sig.        0.002          0.034      0.002          0.009           0.012

Table 5.10  Information Software – Strategic Complexity

                      Exports Department   Level of Strategic Complexity
Information Software
  Phi correlation     0.230                0.355
  Approx. sig.        0.004                0.000
Table 5.11  Binary Logistic Regression Results

                                         B         S.E.        Wald    df   Sig.    Exp(B)
Step 8b
  Advertisement in foreign markets       0.652     0.382       2.921   1    0.087   1.920
  Product promotion via distributors/
  wholesalers                            0.937     0.382       6.019   1    0.014   2.553
  Internationalization via mergers
  & acquisitions                         20.325    11307.471   0.000   1    0.999   671781900.133
  Multiple strategies applied            1.017     0.388       6.867   1    0.009   2.766
  Constant                               -44.736   22614.941   0.000   1    0.998   0.000
company to obtain and manage customer information through CRM software. This finding is in line with Marinagi and Akrivos (2011) and many other researchers, stressing that the exploitation of customer information facilitates the design of marketing strategies, expressed as targeting sales and marketing campaigns at the most valuable customers, retaining them, and increasing customer loyalty (Marinagi and Akrivos 2011). These results verify our research hypotheses. Companies owning a CRM system are expected to design and apply multiple direct and indirect marketing strategies in the foreign markets in which they are active, as they are able to acquire the information and knowledge that allow them to evaluate, design, and implement these marketing strategies. The investment in CRM software and the development of structured, targeted marketing strategies reflect their customer-centered corporate culture, leading to the creation of value, as also observed by Kutsikos and Mentzas (2012). We also find a positive correlation between the existence of CRM software in the enterprise and the company's internationalization mode (using distributors/wholesalers, founding subsidiaries in foreign markets, merging with or acquiring other companies abroad, or establishing production units in foreign markets). In particular, the internationalization modes via mergers and acquisitions and via distributors and wholesalers emerged as particularly significant in the binary logistic regression. As we see, the actively engaged, second-grade internationalized enterprises have a higher probability of obtaining and using information about their customers, so as to enhance their performance and increase their revenues in foreign markets. This finding is aligned with Harrigan et al.'s (2008) argument that the effective use of CRM software not only allows companies to build and maintain strong relationships with domestic customers but also broadens the ability of the firm to reach new market potential in foreign markets (Harrigan et al. 2008), an important step toward the company's internationalization. We find evidence supporting our research hypothesis, showing that companies owning a CRM system are expected to be actively involved in and highly committed to the markets in which they develop their business activity. As a result, they are expected to have a higher grade of internationalization involvement and to be active in foreign markets and outward-oriented, a fact that is clearly motivated by and interconnected with the corporate culture and the corporate vision.
Finally, we observed a positive correlation between the existence of CRM software and the existence of a special exports/internationalization department, and with a high level of strategic complexity, operationalized as the development of multiple targets for each foreign market. The existence of a special internationalization department enables the company to study foreign markets and develop multiple, structured, market-targeted strategies. These findings are aligned with Ngai (2005), who underlined the significance of understanding CRM as "a comprehensive set of strategies for managing those relationships with customers that relate to the overall process of marketing, sales, service, and support within the organization" (p. 583). These empirical results verify our research hypothesis: companies owning a CRM system are expected to be market-oriented, designing and applying multiple strategies for each market they wish to penetrate, develop, and compete in.

5.5 Conclusion
This chapter has looked at the strategic behavior of internationalized enterprises on three fronts: the kind of information they select about the foreign market, the sources of information they use, and the use of relevant software to organize and exploit this information. Market information acquisition is a crucial component of internationalization, given that it is important for the company to be informed about overall market conditions before deciding to invest in any market. We find evidence interconnecting the adoption of the market information acquisition process with the development of active marketing strategies. Enterprises that develop and implement complex foreign market penetration and development strategies tend to obtain all kinds of market information in order to maximize their business performance in the market. Regarding the strategic behavior of companies toward the sources of information they use, results showed that a wide range of information sources is used by internationalized companies. In detail, companies mainly use institutional information sources and inter-organizational and market information sources. Companies using institutional sources seem to be rather reactive, while companies using inter-organizational and market information sources seem to be more proactive and more engaged in internationalization. Companies of the second category show evidence of higher internationalization intensity and of the existence and implementation of outward-looking strategies and higher commitment to the internationalization vision.
This chapter also examined the interconnection between the use of CRM software in internationalized enterprises and the development of targeted marketing strategies, internationalization engagement, and the level of strategic complexity. Findings show that there is an increased probability for companies running CRM software to develop targeted direct and indirect marketing strategies, to be proactively internationalized, engaged in a second-grade internationalization mode, and to develop structured, differentiated, targeted market strategies. The integrated framework composed from the empirical results depicts the adoption of an outward-looking, customer- and market-oriented corporate culture, reflected in the strategies developed to mine information and in the knowledge gained from processing this information with adequate software.

References
Barney, Jay. 1991. Firm resources and sustained competitive advantage. Journal of Management 17(1): 99–120.
Bose, Ranjit. 2002. Customer relationship management: Key components for IT success. Industrial Management & Data Systems 102(2): 89–97.
Cavusgil, S. Tamer. 1985. Differences among exporting firms based on their degree of internationalization. Journal of Business Research 12(2): 195–208.
Cepeda-Carrion, Gabriel, Juan G. Cegarra-Navarro, and Daniel Jimenez-Jimenez. 2012. The effect of absorptive capacity on innovativeness: Context and information systems capability as catalysts. British Journal of Management 23(1): 110–129.
Dornberger, Utz, and Md. Noor Un Nabi. 2008. Internationalization dynamic of Eastern German SMEs. In Proceedings of the International Council for Small Business World Conference, Halifax, Nova Scotia, Canada, 22–25 June 2008. Available at: http://sbaer.uca.edu/research/sbi/2008/creak18f.html.
Eisenhardt, Kathleen M., and Filipe M. Santos. 2002. Knowledge-based view: A new theory of strategy. Handbook of Strategy and Management 1: 139–164.
Eriksson, Taina, Niina Nummela, and Sami Saarenketo. 2014. Dynamic capability in a small global factory. International Business Review 23(1): 169–180.
Fink, Dieter. 2006. Value decomposition of e-commerce performance. Benchmarking: An International Journal 13(1/2): 81–92.
Fortune, Annetta, and Will Mitchell. 2012. Unpacking firm exit at the firm and industry levels: The adaptation and selection of firm capabilities. Strategic Management Journal 33(7): 794–819.
Hagel, John, and Jeffrey F. Rayport. 1997. The coming battle for customer information. McKinsey Quarterly (3): 64–77.
Håkansson, Håkan, Virpi Havila, and Ann-Charlott Pedersen. 1999. Learning in networks. Industrial Marketing Management 28(5): 443–452.
Harrigan, Paul, Elaine Ramsey, and Patrick Ibbotson. 2008. e-CRM in SMEs: An exploratory study in Northern Ireland. Marketing Intelligence & Planning 26(4): 385–404.
Henisz, Witold J., and Andrew Delios. 2002. Learning about the institutional environment. Advances in Strategic Management 19: 339–372.
Hitt, Michael, R. Duane Ireland, and Robert Hoskisson. 2012. Strategic Management Cases: Competitiveness and Globalization. Boston: Cengage Learning.
Hult, G. Tomas M., and David J. Ketchen. 2001. Does market orientation matter?: A test of the relationship between positional advantage and performance. Strategic Management Journal 22(9): 899–906.
Hutchinson, Karise, Barry Quinn, Nicholas Alexander, and Anne Marie Doherty. 2009. Retailer internationalization: Overcoming barriers to expansion. The International Review of Retail, Distribution and Consumer Research 19(3): 251–272.
Ingram, Paul, and Tal Simons. 2002. The transfer of experience in groups of organizations: Implications for performance and competition. Management Science 48(12): 1517–1533.
Jorda-Albinana, Begona, Olga Ampuero-Canellas, Natalia Vila, and José Ignacio Rojas-Sola. 2009. Brand identity documentation: A cross-national examination of identity standards manuals. International Marketing Review 26(2): 172–197.
Kajalo, Sami, Risto Rajala, and Mika Westerlund. 2007. Approaches to strategic alignment of business and information systems: A study on application service acquisitions. Journal of Systems and Information Technology 9(2): 155–166.
Knight, Gary A., and Peter W. Liesch. 2002. Information internalisation in internationalising the firm. Journal of Business Research 55(12): 981–995.
Kutsikos, Konstadinos, and Gregoris Mentzas. 2012. Managing value creation. Knowledge Service Engineering Handbook, 123.
Lagos, Dimitrios, and Konstadinos Kutsikos. 2011. The role of IT-focused business incubators in managing regional development and innovation. European Research Studies Journal 14(3): 33–50.
Leonidou, Leonidas C. 1997. Finding the right information mix for the export manager. Long Range Planning 30(4): 479–584.
Li, Tiger, and S. Tamer Cavusgil. 2000. Decomposing the effects of market knowledge competence in new product export: A dimensionality analysis. European Journal of Marketing 34(1/2): 57–80.
Liesch, Peter W., and Gary A. Knight. 1999. Information internalization and hurdle rates in small and medium enterprise internationalization. Journal of International Business Studies 30(2): 383–394.
Marciano, Alain. 2011. Ronald Coase, "The problem of social cost" and the Coase theorem: An anniversary celebration. European Journal of Law and Economics 31(1): 1–9.
Marinagi, C.C., and C.K. Akrivos. 2011. Strategic alignment of ERP, CRM and e-business: A value creation. Advances on Integrated Information Conference Proceedings, 347–350.
McGowan, Pauric, and Mark G. Durkin. 2002. Toward an understanding of Internet adoption at the marketing/entrepreneurship interface. Journal of Marketing Management 18(3–4): 361–377.
Melewar, T.C. 2003. Determinants of the corporate identity construct: A review of the literature. Journal of Marketing Communications 9(4): 195–220.
Menon, Anil, and P. Rajan Varadarajan. 1992. A model of marketing knowledge use within firms. The Journal of Marketing 56(4): 53–71.
Miles, Morgan P., and Danny R. Arnold. 1991. The relationship between marketing orientation and entrepreneurial orientation. Entrepreneurship Theory and Practice 15(4): 49–65.
Morgan, Robert E., and Constantine S. Katsikeas. 1998. Exporting problems of industrial manufacturers. Industrial Marketing Management 27(2): 161–176.
Ngai, E.W.T. 2005. Customer relationship management research (1992–2002): An academic literature review and classification. Marketing Intelligence & Planning 23(6): 582–605.
Ramamurti, Ravi. 2004. Developing countries and MNEs: Extending and enriching the research agenda. Journal of International Business Studies 35(4): 277–283.
Raub, Steffen, and Daniel Von Wittich. 2004. Implementing knowledge management: Three strategies for effective CKOs. European Management Journal 22(6): 714–724.
Ritossa, Claudia Monica, and Sergio Bulgacov. 2009. Internationalization and diversification strategies of agricultural cooperatives: A quantitative study of the agricultural cooperatives in the state of Parana. BAR-Brazilian Administration Review 6(3): 187–212.
Saunders, Mark N.K., Philip Lewis, and Adrian Thornhill. 2011. Research Methods for Business Students, 5th ed. Pearson Education India.
Souchon, Anne L., and Adamantios Diamantopoulos. 1999. Export information acquisition modes: Measure development and validation. International Marketing Review 16(2): 143–168.
Teece, D.J., G. Pisano, and A. Shuen. 1997. Dynamic capabilities and strategic management. Strategic Management Journal 18(7): 509–533.
Trivellas, Panagiotis, and Ilias Santouridis. 2009. TQM and innovation performance in manufacturing SMEs: The mediating effect of job satisfaction. In Industrial Engineering and Engineering Management, 2009 (IEEM 2009), IEEE International Conference on, 458–462. USA: IEEE.
Villar, Cristina, Joaquín Alegre, and José Pla-Barber. 2014. Exploring the role of knowledge management practices on exports: A dynamic capabilities view. International Business Review 23(1): 38–44.
Wang, Guangping, and Janeen E. Olsen. 2002. Knowledge, performance, and exporter satisfaction: An exploratory study. Journal of Global Marketing 15(3–4): 39–64.
Yeoh, P.L. 2005. A conceptual framework of antecedents of information search in exporting: Importance of ability and motivation. International Marketing Review 22(2): 165–198.
Zahra, S.A., and G. George. 2002. International entrepreneurship: Research contributions and future directions. In Strategic Entrepreneurship: Creating a New Mindset, 255–288. New York, NY: Blackwell.
CHAPTER 6

Innovation in the Open Data Ecosystem: Exploring the Role of Real Options Thinking and Multi-sided Platforms for Sustainable Value Generation through Open Data

Thorhildur Jetzek

6.1 Introduction
"The miracle is this: The more we share the more we have." (Leonard Nimoy, 1931–2015)
Our world is at an inflection point where technological advances and boundary-crossing social challenges have come together to create a paradigm shift. Our societies are facing multiple and urgent social challenges, ranging from economic inequality, unemployment, and poor social conditions to chronic diseases and climate change. Given the complexity and cross-boundary nature of these challenges, a new approach where social and technological progress co-evolves in order to generate sustainable value is necessary (OECD 2011). While there are multiple potential paths that lead to the creation of sustainable value, we believe that innovation through open data is one path that is currently showing high promise for future value generation.

T. Jetzek (*)
Copenhagen Business School, Copenhagen, Denmark
e-mail: tj.itm@cbs.dk

© The Author(s) 2017
E.G. Carayannis, S. Sindakis (eds.), Analytics, Innovation, and Excellence-Driven Enterprise Sustainability, Palgrave Studies in Democracy, Innovation, and Entrepreneurship for Growth, DOI 10.1057/978-1-137-37879-8_6
All scientists understand the importance of new insights for scientific advances. These insights have in the past been based on careful analysis of data collected by researchers, who subsequently disseminated their findings through the education system and scientific publications. These insights have changed our view of the world, our attitudes, beliefs, and behavior, ultimately shaping everyday actions to generate value for individuals and society. However, due to scientific and technological advances, we are currently on the brink of a new era where we have started to see
we are currently on the brink of a new era where we have started to see
a significant change in the pace of these processes. The digital revolu-
tion, including the digitization of nearly all media, the ubiquity of Internet
access, the proliferation of mobile phones, and the growth of the Internet
of Things have led to exponentially increasing amounts of data that offer
a world of new information and insights. The current trend toward open
access to these data has furthermore allowed stakeholders to cross bor-
ders and sector-based boundaries, and has completely revolutionized how
public and private stakeholders are collectively addressing some of our
most difficult social challenges (Bakici et al. 2013). Moreover, personalized insights and awareness can now be available to individuals in near
real-time, continuously impacting the way we act and interact. All of these
changes require new theoretical lenses for analyzing value-maximizing
behavior in order for us to better understand and support the continu-
ing evolution toward openness, sharing, and new insights that create
value for all.
One of the important issues for safeguarding the ongoing supply of
open data from government is the ability to make the value of open data
explicit and visible to policy makers and open data users in the private
sector. This is, however, not a trivial task, as open data are generally a free
resource, exhibiting the features of a public good (Jetzek et al. 2013).
The openness of data guarantees that everyone can use the data for whatever purpose, an important feature when it comes to stimulating use that might lead to sustainable value generation (Jetzek et al. 2014b).
However, the same features make it extremely difficult to trace where
open data are used and for what purposes, primarily for two reasons: (1)
A significant share of the value generation from data happens through
information generation and network effects, so-called nonmarket produc-


tion (Brynjolfsson and Oh 2012). (2) As open data might be used for
future innovations, another important element of the value of open data
is the fact that the data are available for unforeseen or serendipitous use
in the future. While current methods of measuring market activity capture
only materialized market-based transactions, these two important aspects
remain unidentified.
The provision of high-quality data can require significant up-front
investments (OECD 2014). In the case of governments these investment
costs usually exceed the expected private benefits of data sharing to the
organizations that collect the data. Governments must therefore look for
evidence of societal-level value generation from open data. Accordingly, if
open data initiatives are not perceived to contribute much to value genera-
tion, we could potentially enter a downward-facing spiral instead of ben-
efitting from the synergistic relationship between information production
and dissemination and entrepreneurial activity that is expected to result
from innovation through open data. This could lead to a paradoxical situ-
ation that we call the open data value paradox. This paradox describes a
situation where entrepreneurs do not use the data because the data are
not usable enough, there is too much uncertainty over the sustainability
of initiatives, and therefore little or no value is generated. However, the
data providers are not willing to invest in the people and technology nec-
essary to make the data more usable and sustainable unless they observe
some evidence of value generation. We suggest that in order to maintain
and stimulate use of open data, we need more open data platforms or data
intermediaries. To justify the investment in such platforms, we propose a
model that can make the sustainable value of open data more explicit to
governments, businesses, and individuals, and thereby act to resolve the
open data value paradox.
In this chapter, we propose the use of two established theories that
might help make the contribution of openness of data toward the gen-
eration of sustainable value for society more explicit. The first theory we
borrow from is the theory of two-sided markets. The business models that
have been developed in these types of markets (now most commonly
called multi-sided platforms or MSPs) are generally based on two or more
sides of affiliated customers that interact via digital platforms. The intan-
gible value that market participants can gain from network externalities
is not internalized and will drive the subsequent generation of economic
value that can be used to justify the investment necessary to attract the
required number of participants. The second theory we borrow from is real options theory. Innovation researchers posit that an option-value approach can influence the motivations of early adopters. While most company managers know they must innovate to thrive, technical innovation (for instance, data analytics) is accompanied by uncertainty about the benefits of using the innovation and by the irreversibility that arises from high learning and adaptation costs during deployment, as well as high switching costs after deployment (Fichman 2004). In cases of high uncertainty and irreversibility, it is fruitful to view such investments through a real options lens (ibid.). Both approaches can alleviate the open data value paradox by emphasizing the factors that make value generation from open data unique, while using methods and theories that are already well established in industry and economics.
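A stylized one-period example (our own toy numbers, not from the chapter) shows the real options logic: when an investment is irreversible and its payoff uncertain, the flexibility to wait until uncertainty resolves has measurable value that a naive net-present-value rule ignores:

```python
# Toy illustration of real options reasoning (illustrative numbers only):
# an irreversible investment of 100 in, say, an open data platform pays
# off 150 or 60 with equal probability next period.
invest_cost = 100.0
payoff_up, payoff_down, p_up = 150.0, 60.0, 0.5
discount = 1.0  # discounting ignored for simplicity

# Naive NPV of committing now, before uncertainty resolves.
npv_now = p_up * payoff_up + (1 - p_up) * payoff_down - invest_cost
print(npv_now)   # 5.0 -> barely positive

# Value of the option to wait: invest only if the good state is revealed.
npv_wait = discount * (p_up * max(payoff_up - invest_cost, 0.0)
                       + (1 - p_up) * max(payoff_down - invest_cost, 0.0))
print(npv_wait)  # 25.0 -> deferral flexibility is worth 20 extra
```

The gap between the two figures is the option value created by uncertainty plus irreversibility, which is exactly the situation Fichman (2004) describes for investments in technical innovations such as data analytics.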

6.2 The Sustainable Value of Open Data


Openness as an overarching philosophy or concept implies that those who embrace openness are willing to share not only raw data but also their problems, experiences, and questions. This sharing leads to the generation and dissemination of shared information, which in turn can influence how individuals view the world, make their decisions, and go about their business. Unfortunately, while the value generation that might arise from sharing data and information might seem plausible and intuitive to modern citizens, we still struggle when we try to link these intuitive notions to the business world of measures, where thousands of initiatives must fight for the same funds. This struggle is relevant to governments and businesses alike: investment in a particular initiative must be justified as something that is inherently valuable to the owners of the funds (in the case of government, eventually the taxpayers). In order to sustain open data initiatives, governments (or businesses) must somehow quantify the potential social and environmental value, which might very well exceed the economic potential offered by open data (McKinsey 2013).
Hundreds of national and local governments have already opened pub-
lic access to various data sources, making them available for use and re-use
for commercial or other purposes. Moreover, these initiatives have been
followed by international institutions, civil society organizations, and even
businesses, although still to a limited degree. There is certainly mount-
ing evidence showing how innovative use of open data is contributing to
sustainable value generation. However, this evidence is mostly in the form


of anecdotes and use cases and is therefore not yet cogent or rigorous
enough to put a solid foundation under open government data initiatives. The Open Data Barometer report highlights that "strong empirical evidence on the impacts of open government data initiatives is almost universally lacking" (Davies 2013). This creates a risk that these initiatives
will not be sustained, for example, when new governments take power
or when worldwide attention to the phenomenon diminishes. Within a few years, interest in the concept of open data might start declining into the "trough of disillusionment" (Buytendijk 2014). When that happens, the uncertainty about the extent and nature of the return on investment in open data represents a clear risk to the sustainability of these initiatives (Martin et al. 2013).
If commercial firms do not realize a return on their innovative activi-
ties, they will tend to underinvest in those activities that are either highly
risky or easily imitated by free-riding competitors (West and Gallagher
2006). Using the same kind of logic, governments will not continue to
invest in infrastructure projects such as open data platforms if they do not
perceive public value from their efforts and good use being put to the
infrastructure in question. In the case of open data, the main barrier to
ongoing investment is the nature of value that is generated. The value of
open data is to a large degree mediated through network effects without
any market transactions taking place. Network effects exist when the value
of a good or a service increases as more consumers use it or as more
supply-side partners augment the service (Bharadwaj et al. 2013). Open
data are a good example of how network effects bring superior value:
digital data are a resource whose value increases the more it is used
(Nilsen 2010). However, most of our predominantly used valu-
ation methods still rely heavily on market activity and the generation of
economic profits, and do not explicitly recognize the intangible value of
information sharing.
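The claim that data gain value the more they are used can be made concrete with a toy Metcalfe-style calculation. The function and coefficient below are illustrative assumptions, not drawn from the cited literature:

```python
def network_value(users: int, value_per_link: float = 1.0) -> float:
    """Metcalfe-style toy model: value scales with the number of possible
    user-to-user connections, n*(n-1)/2. The coefficient is invented."""
    return value_per_link * users * (users - 1) / 2

# Doubling the user base roughly quadruples the potential value:
print(network_value(2_000) / network_value(1_000))  # ~4.0
```

This super-linear growth in potential value is precisely what valuation methods tied to bilateral market transactions fail to register.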
For instance, the resource-based view (Barney 1991; Wade and
Hulland 2004; Wernerfelt 1984) predicts that firms can only sustain their
competitive advantage through use of valuable resources that are neither
perfectly imitable nor substitutable without great effort. This definition
clearly excludes open data as a resource upon which firms can build com-
petitive advantage. The value chain approach (Porter 2008) looks at how
each step in a chain of activities within a single firm generates economic
value. However, this approach misses both the value generated from
142 T. JETZEK

network externalities as well as the intangible or social and environmen-
tal dimensions of value. Modern portfolio theory predicts that the riskier
the asset, the higher return on investment is demanded by the investor,
but the return on investment is generally calculated by dividing net profit
by the company's total assets, thereby excluding both the generation of
intangible value and use of external resources. All of these theories will
therefore drive the firm's focus toward internal assets and the generation
of economic value from those assets. The resulting approach will in some
cases result in products and services that contribute to solving social chal-
lenges, but unfortunately, there are also many cases where social wellbeing
and economic profits will end up as opposing interests, resulting in so-
called market failures.
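The profit-over-assets ratio described above can be made explicit. The sketch below, with purely hypothetical figures, shows why such a metric is blind to intangible and social value:

```python
def return_on_assets(net_profit: float, total_assets: float) -> float:
    """The conventional ratio described in the text: net profit divided by
    total assets. Intangible or social value never enters the formula."""
    return net_profit / total_assets

# Two hypothetical firms with identical books; firm B also generates
# substantial social value, but the metric cannot tell them apart:
firm_a = {"net_profit": 5.0, "total_assets": 100.0, "social_value": 0.0}
firm_b = {"net_profit": 5.0, "total_assets": 100.0, "social_value": 40.0}
assert return_on_assets(firm_a["net_profit"], firm_a["total_assets"]) == \
       return_on_assets(firm_b["net_profit"], firm_b["total_assets"]) == 0.05
```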
In order to fully embrace the value of open data, we need to include
value that is created through better decision-making and network effects
that might eventually lead to reduced CO2 emissions, increased availabil-
ity of clean water in areas where water is contaminated, better and more
equitable use of resources or healthier citizens, to name a few. This type
of value is not easily captured by looking only at current revenue streams
of companies. Porter and Kramer (2011: 3) state that companies "con-
tinue to view value creation narrowly, optimizing short-term financial per-
formance in a bubble while missing the most important customer needs
and ignoring the broader influences that determine their longer-term
success." The authors suggest a new approach, utilizing the principle of
shared value, which prescribes that companies should continue to create
economic value, but in a way that simultaneously creates value for society
by addressing its needs and challenges (ibid). To extend the concept of
shared value to include generation of value by not only businesses but
also governments and individuals, we introduce the concept of sustainable
value. The definition of sustainable value focuses on the proactive, con-
certed efforts of businesses, government institutions, and the overall com-
munity to address social challenges in innovative ways, thereby generating
social, environmental, and economic value for all stakeholders and future
generations (van Osch and Avital 2010).
If private and public stakeholders wish to formally or informally col-
laborate on generating sustainable value, they must move beyond the mar-
ket mechanism, a term that is borrowed from economics, referring to the
use of monetary exchange between buyers and sellers within an open and
understood system of value and time trade-offs to produce the best distri-
bution of goods and services. While the market mechanism still functions
well for distribution of goods and services, the shifts toward an economy
centered on information and the move to a networked Internet-based
environment have caused significant attenuation of the limitations that
market-based production places on the pursuit of value (Benkler 2006).
We must examine different types of mechanisms that facilitate shared or
sustainable value generation, and then subsequently highlight not only
economic implications of innovation but the social and environmental
implications as well.
Another mechanism has already become a foundation for generating
value from open data, that is, the network mechanism, which we define as
a mechanism that generates value from actionable insights gained through
information sharing and re-use over networks. The network mechanism
refers to the actions of what we can call information creators and infor-
mation consumers, but in fact, it is not simple to distinguish between
who creates and who consumes information. In many current business
models, the information consumers also generate valuable data for
platform owners; these data are crowdsourced to create new or improved
information. However, the main distinction between the market and the
network mechanisms is that in the latter, there is no monetary exchange
and the relationships are many-to-many, instead of the traditional one-to-
one relationship between buyers and sellers. We propose that intermediar-
ies can play a valuable role in leading the market and network mechanisms
together, thus creating a structure around these complex relationships
that allows for synergistic value generation.

6.3 The Role of Intermediaries in Open Data Ecosystems
Intermediaries are important in markets because of five limitations of
direct transactions that can be better managed by intermediaries: search
costs, lack of privacy, incomplete information, contracting risk, and pricing
(Resnick et al. 1995; Janssen and Zuiderwijk 2014). Accordingly, inter-
mediaries have basically four roles: (1) information aggregation, (2) pro-
viding trust, (3) facilitating, and (4) matching (Bailey and Bakos 1997).
The intermediary can be an agent of any kind: a government organization,
an individual, or a private company. Recently, the democratiza-
tion of content as well as the subsequent sharing, remixing, redistribution,
and re-syndication of content in newer and more useful forms has caused
dramatic power shifts in the intermediary market (Bharadwaj et al. 2013).
These trends have disrupted the traditional value chain of economic prof-
its while creating new sources of value. For instance, the so-called peer-
to-peer Internet-based business models (sometimes aggregated under
the heading Sharing Economy) have challenged various traditional inter-
mediaries, such as taxi services, and a new type of platform intermediary
has moved in to take their place (Cannon and Summers 2014). These
new types of intermediaries are creating an important layer that matches
demand and supply for services, utilizing economies of scale and digital
technologies as well as the business models of two-sided markets.
While we suggest that data intermediaries are important for open
data, we must address the question of why intermediaries are needed in
this open and networked world that promises to facilitate peer-to-peer
relationships. The answer lies in the still relevant transaction and search
costs, as well as in the fact that datasets are getting increasingly bigger,
introducing a barrier for users that cannot easily download or move these
datasets. Moreover, raw data are in many cases of little or no use to end-
users who do not have the capabilities or time to manipulate and process
these data (Janssen and Zuiderwijk 2014). While leading countries are
implementing National Data Infrastructures, including platforms where
users can directly access open data, the openness of many available gov-
ernment data is still surprisingly low (Davies 2013). Openness of data is
not a binary construct but has many dimensions, ranging from licenses to
prices to usability and technical accessibility. Making data available in its
current form is therefore far from the only milestone to cross when it comes
to enabling use of open data (Conradie and Choenni 2014; Janssen et al.
2012; Martin et al. 2013; Zuiderwijk and Janssen 2014a, b; van Veenstra
and van den Broek 2013; Zuiderwijk et al. 2012, 2014).
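Because openness is multi-dimensional, one can imagine scoring a dataset along several of the dimensions named above. The checklist below is an illustrative sketch, not an established metric; the dimension names are our own simplification:

```python
from dataclasses import dataclass

@dataclass
class DatasetOpenness:
    """Illustrative checklist only: a simplification of the licensing,
    pricing, usability, and accessibility dimensions of openness."""
    open_license: bool
    free_of_charge: bool
    machine_readable: bool   # e.g. CSV/JSON rather than PDF
    bulk_download: bool

    def score(self) -> float:
        dims = [self.open_license, self.free_of_charge,
                self.machine_readable, self.bulk_download]
        return sum(dims) / len(dims)

# An openly licensed but PDF-only release is only partially open:
pdf_release = DatasetOpenness(True, True, False, False)
print(pdf_release.score())  # 0.5
```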
A recent review of the open datasets provided in Berlin, Germany,
showed that approximately 90% of the data provided were published in a
PDF format (Martin et al. 2013). The Open Data Barometer reveals that
of the 821 datasets surveyed in 2013, less than 7% of the datasets were
published both in bulk machine-readable forms and under open licenses.
Only 1.2% of open data were published as linked data (Davies 2013).
Moreover, there are multiple open data initiatives in most countries,
where different private and civil society organizations, local governments,
and state government each have their own policies and standards for open
data. During these early days of open data, open data initiatives are hetero-
geneous in nature, licenses differ between initiatives, open data standards
are still underdeveloped and underused, and there are heterogeneous for-
mats and a lack of metadata, as well as limited network activity (Mayer-
Schönberger and Zappia 2011; Martin et al. 2013). For most individuals
and smaller developers, these issues come together to create a substantial
barrier to entry, as the efforts involved in acquiring, manipulating, and
analyzing these disparate data are simply too extensive, in comparison to
an uncertain and potentially noneconomic gain.
In most of the world, governments are already struggling with bud-
getary restraints and increased demand for services. Making data open is
never an effortless task, and these constraints limit governments' aspira-
tions for open data, even if the potential for value generation may be clear
to them. As governments may not be able to do everything on their own,
data intermediaries could play a crucial role in the open data ecosystem by
facilitating data and information access for smaller organizations that may
not have the capacity and capabilities to store, integrate, and analyze large
and heterogeneous datasets. Intermediaries might also contribute directly
to value generation by augmenting and amplifying the circulation of open
data by sanitizing and curating data coming from both public and private
sources. By making data easier to access, manipulate, and use, intermediar-
ies will drive information creation and product, service, or process innova-
tion based on these data.
Having easy, one-stop access to data services offers a value proposition
for companies striving to create a competitive advantage in an increasingly
data-driven world (Lindman et al. 2014). However, a large share of data-
driven services is provided for free, oftentimes in exchange for access to
personal data (OECD 2014). Data intermediaries need to adapt to mar-
ket conditions where users are accustomed to having free access to data,
information, and information services. To enable the ongoing generation
of valuable but free information, the data intermediaries must implement
business models that allow them to generate economic profit by capital-
izing on the positive network externalities that arise from the interactions
of multiple stakeholders using the provided platforms to gain access to the
services provided by these intermediaries and their affiliates.

6.4 The Economics of Two-Sided Markets


The theory of two-sided markets, more recently referred to as multi-
sided platforms (MSPs), has emerged over the past decade as one of the most
active areas of research in information systems, economics, and strategy.
It has also drawn considerable interest from practitioners. Two-sided mar-
kets are defined as "markets in which one or several platforms enable inter-
actions between end-users, and try to get the two (or multiple) sides on
board by appropriately charging each side" (Rochet and Tirole 2006: 2).
The importance of this approach for the analysis on how open data are
used to generate sustainable value is that, unlike classical economic theory,
it explicitly recognizes the value of network externalities (Katz and Shapiro
1985, 1986). The theory of two-sided markets builds on the notion that
there are non-internalized externalities among end-users: "The starting
point for the theory of two-sided markets by contrast is that an end-user
does not internalize the welfare impact of his use of the platform on other
end-users" (Rochet and Tirole 2006: 3). Two-sided markets, by playing
an intermediary role, will facilitate an interaction that would not occur
without them and therefore create value for both sides through direct and
indirect interactions and network effects.
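The cross-side externality at the heart of this definition can be sketched in a few lines: each side's value depends on the size of the other side, so the interactions a platform enables scale with the product of the two sides. The match rate below is an invented parameter:

```python
def platform_interactions(consumers: int, producers: int,
                          match_rate: float = 0.01) -> float:
    """Cross-side externality sketch: interactions enabled by a platform
    scale with the product of its two sides. The match rate is invented."""
    return match_rate * consumers * producers

# Growing either side raises the value available to the other side:
print(platform_interactions(1_000, 100))   # 1000.0
print(platform_interactions(1_000, 200))   # 2000.0
```

Because neither side pays the other directly, this value is invisible to a purely transaction-based accounting, which is why the platform itself must decide how to charge each side.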
A more recent definition views MSPs as technologies, products, or ser-
vices that create value primarily by enabling direct interactions between
two or more customer or participant groups (Hagiu and Wright 2011;
Hagiu 2014). Hagiu and Wright (2011: 2) define an MSP as "an organiza-
tion that creates value primarily by enabling direct interactions between
two (or more) distinct types of affiliated customers." It is important to
note that the term "organization" is used in order to make it clear that
the notion of MSP is not restricted to regular businesses, but also encom-
passes groups of firms, not-for-profit organizations, or even public sec-
tor entities that create a valuable interaction service (Hagiu and Wright
2011). In spite of the theoretical diversity of potential MSPs, there is a
certain "winner takes all" element to MSPs, resulting from the economies
of scale introduced by network effects. Therefore, through fierce compe-
tition, the trend is for a relatively low number of MSPs to own certain
domain areas. Moreover, there is an inherent chicken-and-egg problem
commonly present in multi-sided marketplaces; each side depends on the
other, which makes it a challenge to build up the required critical mass
needed to attract the other side (Caillaud and Jullien 2003; Hagiu 2014).
These elements introduce a considerable risk to any investment in an MSP.
Hagiu and Wright (2011) make a clear differentiation between an MSP
and a reseller: Requiring MSPs to enable direct interactions is crucial in
ruling out a broad category of intermediaries that buy goods or services
from suppliers and sell them to buyers. The relevant direct interactions for
a given organization are only those that the MSP specifically enables, that
is, the interactions happen on or through the platform (ibid). Successful
MSPs have been shown to create enormous value by reducing search costs
or transaction costs (or both) for participants (Hagiu 2014). MSPs have
also been shown to provide an efficient means of information sharing. In
the context of open data, the MSP is essentially a data intermediary that
provides a platform where the synergy between network externalities and
the profitable business opportunities can be exploited. Data MSPs can
aggregate the demand of several information requestors and standardize
the flow of information from large numbers of information providers from
public and private sectors (Bharosa et al. 2013).
An open data MSP would enable interactions between the main groups
of stakeholders, that is, data producers/data owners, producers of informa-
tion, products and services (developers, scientists, analysts, or journalists),
and information consumers. There can be different ways in which these
groups can interact over an MSP, and each group can be a mix of pub-
lic and private stakeholders. Government agencies have already started to
utilize the concept of MSPs in their information infrastructure ventures.
Building on the notion of collaborative value generation, rather than
developing an information infrastructure and demanding that businesses
use it, government agencies have started to move away from the clas-
sical approach toward actively tempting businesses to partner in achiev-
ing long-term goals, thereby contributing to sustainable value generation
(Bharosa et al. 2013). In the following section, we present three examples
of MSPs that have transformed open data to information that is openly
shared, creating value for information users and simultaneously attracting
paying customers. More importantly, all of these MSPs have addressed a
certain societal problem, thereby benefitting the public sector as a third
side of the platform.

6.4.1 Example One: Opower


Climate change has emerged as one of the most important economic pol-
icy issues of the early twenty-first century. The pollutants that contribute
to global warming are commonly known as greenhouse gas emissions.
Carbon dioxide (CO2) is probably the best known greenhouse gas, rep-
resenting 85% of all greenhouse gases in the USA. Electricity produc-
tion is the largest single source of global warming pollution in the USA,
responsible for nearly 40% of greenhouse gas emissions.1 After contem-
plating how to address this problem, two college friends, Alex Laskey
and Daniel Yates, founded the company Opower (then Positive Energy)
in 2007. Opower is an energy tech company with a mission to help
everyone, everywhere save energy. By the end of 2014, Opower worked
with over 95 energy utilities servicing more than 50 million homes.2 By
February 2015, the Opower home energy reports had helped people
around the world save over six terawatt hours of energy and more than
$700 million on their energy bills. Opower successfully went through an
initial public offering (IPO) in April 2014 and was acquired by Oracle in
May 2016.
As utilities deploy smart grid technologies, the volume of data they pro-
duce each day increases more than 3000-fold. Furthermore, as customers
begin to interact more with their utilities online, these interactions create
even more data.3 Opower's MSP can store and process 15-minute interval
data from smart meters from millions of in-home devices at large scale and
high speed, currently spanning more than 52 million households and busi-
nesses, and growing at a rate of more than 100 billion meter reads per year.
Opower's data analytics engine sits on top of this huge repository of data.
The engine runs hundreds of algorithms that process utility data, third-
party data, and customer behavioral data to power millions of personalized
communications with utility customers on the platform.4 Opower merges
the data streams from utilities with open data from the government to
create personalized energy-use profiles. In the USA, they use data from
the Residential Energy Consumption Survey (RECS) to understand how
households are using energy. The survey provides region-specific data on
end-use energy consumption patterns, such as the type and efficiency of
appliances used by the consumers and the systems and energy sources they
use to heat and cool homes, among other topics. Opower also uses data
from the US Census Bureau on the mix of gas and electric heating sources
in a given county in order to create location-specific profiles to use when
analyzing an individuals home energy consumption.
Opower's products are designed to enhance the utility's interactions
with their customers in order to both reduce demand and improve rela-
tionships. When designing the way the utilities interact with energy users,
Opower has utilized findings from behavioral science that predict
how people react to information about their own use, as com-
pared to the use of others (Bos et al. 2012). These results have highlighted
the importance of a feedback mechanism to drive behavioral change, cre-
ating a subtle aspect of peer pressure (Jetzek et al. 2014a). The energy
reports that Opower creates for each energy user offer a component where
this individual's energy use is compared to the use of other similar house-
holds, complete with a smiley token to indicate approval of good behav-
ior (Jetzek et al. 2014a). When provided with better information and
suggestions on how to decrease energy consumption, as well as a token
of appreciation for their efforts, customers are empowered to take greater
control of the way they use energy. On the other side of the platform, the
utilities benefit through increased customer engagement and better target-
ing of specific customer segments for efficiency. Opower has also created
an API to allow utility clients to run their own internal analytics programs
using the data in their analytics engine. Government might be labeled as
an indirect third party on the platform: it provides open data and in
return gains greater energy efficiency; however, there is no direct interaction
between government and the other two sides.
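The comparative-feedback logic of the energy reports can be sketched as follows. The threshold and labels are invented for illustration and do not reflect Opower's actual algorithms:

```python
from statistics import mean

def energy_report(own_kwh: float, neighbor_kwh: list[float]) -> str:
    """Sketch of comparative feedback: rate a household against the mean
    of similar homes. The 10% threshold and labels are invented."""
    benchmark = mean(neighbor_kwh)
    if own_kwh <= 0.9 * benchmark:
        return "great :)"        # the 'smiley token' for good behavior
    if own_kwh <= benchmark:
        return "good"
    return "above average"

# A household using 450 kWh against neighbors averaging 502.5 kWh
# earns the approval token:
print(energy_report(450, [500, 520, 480, 510]))
```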

6.4.2 Example Two: INRIX
According to the Texas Transportation Institute, the cost of congestion
in the USA in 2012 was more than $120 billion, nearly $820 for every
commuter, who is said to spend on average over 60 hours per year stuck in
traffic.5 Similar problems are endured by most of the world's biggest cities.
However, estimates suggest that since 2009, the global pool of personal
geo-location data is growing yearly by 20% and by 2020 this data pool
could provide $500 billion in value worldwide in the form of time and fuel
savings, or 380 megatons (million tons) of CO2 emissions saved (OECD
2014).
INRIX is a leading provider of traffic services worldwide, with the
vision to solve traffic, empower drivers, inform planning, and enhance
commerce.6 INRIX provides historical, real-time traffic information,
traffic forecasts, travel times, and travel time polygons to businesses and
individuals in 40 countries (as of September 2014).7 INRIX also gathers,
curates, and reports roadway incidents such as accidents, road closures,
and road works. INRIX was founded by former Microsoft employees,
Bryan Mistele and Craig Chapman, in July 2004. INRIX has not yet been
through an IPO, but Porsche recently invested $55 million in the com-
pany, which in July 2014 employed around 350 people. As of September
2014, INRIX collected data about roadway speeds from over 175 mil-
lion real-time anonymous mobile phones, connected cars, and other fleet
vehicles equipped with GPS locator devices. They also get data from cam-
eras and government road sensors. Moreover, INRIX keeps a database
of variables that affect traffic, including open government data such as
weather forecasts, special events, official accident and incident reports,
school schedules, and information on road construction, which they com-
bine with the crowdsourced data. The data collected are processed in real
time, creating traffic speed information for major freeways, highways, and
arterials across North America, as well as much of Europe, South America,
and Africa. The company's analysis of data provides drivers with insights
that help them choose the best way to go, minimizing the amount of time
spent, saving them frustration and money on gasoline. As the app used to
source traffic information from individuals is available for free, INRIX's
main source of income is from car producers, GPS providers, and media
companies who are interested in getting access to this type of information.
Moreover, they have recently started to provide data and tools to public
information services, so governments also pay for access to this type of
crowdsourced information. However, in return for providing INRIX
with relevant data, governments gain from reduced traffic congestion.
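The core aggregation step behind such a traffic service can be illustrated with a toy example: pooling crowdsourced speed reports per road segment and taking a robust summary statistic. Segment identifiers and readings are invented; INRIX's actual pipeline is proprietary:

```python
from collections import defaultdict
from statistics import median

def segment_speeds(probe_reports: list[tuple[str, float]]) -> dict[str, float]:
    """Toy aggregation of crowdsourced probe data: median observed speed
    (km/h) per road segment. Segment IDs and readings are invented."""
    by_segment: dict[str, list[float]] = defaultdict(list)
    for segment_id, speed_kmh in probe_reports:
        by_segment[segment_id].append(speed_kmh)
    # The median resists outliers such as a single parked or speeding probe.
    return {seg: median(speeds) for seg, speeds in by_segment.items()}

reports = [("I-5:42", 95.0), ("I-5:42", 20.0), ("I-5:42", 90.0),
           ("I-5:43", 100.0)]
print(segment_speeds(reports))  # {'I-5:42': 90.0, 'I-5:43': 100.0}
```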

6.4.3 Example Three: Zillow


The property market is an essential sector of the economy but also one
that has been a recent source of vulnerabilities and crises. While the recent
recovery in global housing markets is a welcome development, societies
still need to guard against another unsustainable boom, caused by over-
valuation of houses followed by unsustainable levels of private debt. A
detailed analysis and judgment are needed to make a call about over-
valuation of real estate property.8 Zillow is an online real estate database
of homes across the USA. Zillow was founded in 2005 by Rich Barton
and Lloyd Frink, former Microsoft executives and founders of Expedia.
Zillow's mission is to empower consumers with information and tools to
make smart decisions about homes, real estate, and mortgages. Currently,
the company has over 30 million unique users per month scrolling
through its database of more than 110 million US homes (Capgemini
2013). Moreover, home shoppers spent more than 5 billion minutes just
on Zillow apps in 2013, making 2013 a record year for mobile usage.9
Zillow successfully launched its IPO in July 2011. As of March 2014, the
company employed 887 people, a number that is expected to double due
to a merger with competitor Trulia.
Zillow provides increased transparency in the housing market, trans-
forming the way consumers make home-related decisions and connect
with professionals.10 In order to provide such transparency, Zillow has
built a database from a range of linked data such as county records,
tax data, listings of homes for sale or rent, and mortgage information,
as well as geographical data and information on local land value and
house prices (Capgemini 2013). Their website combines mapping data
with information on local land value and house prices to create a ser-
vice which accurately estimates the value of a house at a given address
(ibid). In addition to giving value estimates of homes, Zillow offers
several features, including value changes of each home in a given time
frame, aerial views of homes, and prices of comparable homes in the
area. Where they can access appropriate public data, they also provide
basic information on a given home, such as square footage and the num-
ber of bedrooms and bathrooms. Users can even get updated estimates
of homes if a significant change has been made, such as a recently
remodeled kitchen.
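The idea of estimating a home's value from linked public records can be sketched with a simple comparable-sales heuristic. Zillow's actual model is proprietary; this only illustrates the principle, and all figures are invented:

```python
from statistics import median

def estimate_value(sqft: float,
                   comparables: list[tuple[float, float]]) -> float:
    """Toy comparable-sales heuristic: median price per square foot among
    nearby (price, size) sales, scaled by the subject home's size."""
    price_per_sqft = median(price / size for price, size in comparables)
    return sqft * price_per_sqft

# Three invented comparable sales at $200-250 per square foot:
comps = [(300_000, 1_500), (380_000, 1_900), (250_000, 1_000)]
print(estimate_value(1_600, comps))  # 320000.0
```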
Zillow's business model is straightforward: They connect people look-
ing to buy houses with the real estate agents, mortgage lenders, and
advertisers who want to reach them. In the third quarter of 2014, 74%
of Zillow's revenue came from fees that agents paid for customer leads
and apartment leads, 8% came from fees that banks paid for mortgage
leads, and 18% came from advertising. They have also recently started
connecting renters with apartment listings, and homeowners with design
ideas and contractors.11 In order to make their business model work,
Zillow needs to attract visitors to their site who are interested in buying
houses, as well as those that offer mortgages and sales assistance. To do
so, the company has built a variety of products on top of their extensive
database.

6.5 Investing in Open Data MSPs: Insights from Real Options Theory

An open data ecosystem consists of many different participants: govern-
ments, academia, civil society organizations, businesses, and citizens. This
type of ecosystem is circular in nature, building upon a complicated net-
work of value that is generated by different participants that are creating
valuable information as well as products and services. The value appropri-
ated is oftentimes intangible in nature and in many cases the currency is
data, rather than money, as shown in the three examples above. Network
externalities are created when information that is partly created from open
government data draws users to information platforms. This information
production and consumption activity can be utilized to create economic
value, which will attract even more players to the ecosystem. More use of
data will also eventually benefit the data providers, even without them get-
ting monetary reimbursements, as market participants collectively address
various social challenges, which can rarely be solved by governments
alone. However, before this scenario can happen, the government data
must be open enough and of high quality to be of use for entrepreneurs
and the ecosystem must contribute other factors, like various important
skills, technologies, and low-cost funding. It is important that both data pro-
viders on the supply side and prospective data users on the demand side
have a relatively clear idea about potential gains from future use and what
is needed in order for those benefits to be harvested.
Data providers must appropriately recognize two important elements
of this ecosystem. (1) A significant share of the value generated is intan-
gible, resulting from improved decision-making and changed behavior,
which can impact society and the environment as a whole. (2) Those that
finally appropriate the value might be far removed from those that pro-
vide the resources. Therefore, a public sector organization might invest
in gathering data, ensuring the data quality and making the data open
across different dimensions, but future use of these data will in a minority
of cases directly benefit the organization itself. The organization depends
on the yearly budget allocations from government to sustain its activi-
ties. If top-level decision-makers do not sufficiently understand the com-
plicated mechanisms that explain how much value is generated from the
data, they might reduce this funding if the data are not being used, and
the level of openness and quality of data might become compromised as
a result. This is an open data value paradox, describing a situation where
entrepreneurs do not use the data because the data are not usable enough
and there is too much uncertainty over future provision of open data.
However, data providers are not willing to invest in the people and tech-
nology necessary to make the data more usable and sustainable unless
they observe some evidence of value generation, which again depends on
data being used.
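The feedback loop underlying this paradox can be caricatured in a few lines of code. All coefficients are invented, and the model is only meant to show how under-reinvestment can become self-reinforcing:

```python
def simulate_paradox(periods: int, usability: float, reinvest: bool) -> list[float]:
    """Caricature of the open data value paradox: usability drives use,
    observed use drives funding, and funding drives next-period usability.
    All coefficients are invented for illustration."""
    history = []
    for _ in range(periods):
        use = usability                           # uptake tracks usability (0..1)
        funding = use if reinvest else 0.5 * use  # hesitant funders reinvest less
        usability = min(1.0, 0.5 * usability + 0.5 * funding)
        history.append(round(usability, 3))
    return history

# With full reinvestment usability holds steady; with hesitant funding
# it decays, which in turn suppresses use even further.
steady = simulate_paradox(5, 0.8, reinvest=True)
decline = simulate_paradox(5, 0.8, reinvest=False)
```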
In order to resolve this paradox, which might lead to a downward spiral-
ing cycle, as discussed in the introduction, we propose to use the valuation
methods and ideas used in economics of real options. Uncertainty and the
option holder's ability to respond to it (flexibility) are the source of value
for an option. When a firm buys a traditional resource, it has already made
an investment decision. Alternatively, when a firm is provided with the
opportunity to use open data, it has gained an option to use a resource, but
it does not have to exercise this option right away. As contemporary firms
face intense rivalry, globalization, technological change, and time-to-mar-
ket pressures, it is agility, defined as the ability to detect and seize market
opportunities with speed and surprise, which is considered an impera-
tive for business success (Sambamurthy et al. 2003). Companies know
that they need to be agile and able to react quickly to external changes.
Therefore, an option to use a resource, without any current commitment,
must be of some worth to companies and to society as a whole. The ques-
tion remains, how much is it worth?
Financial options capture a specific investment opportunity to which
the holder has preferential advantage (Sandberg et al. 2014). Options the-
ory states that a financial option provides the option holder with the right
to buy (call) or sell (put) a specified quantity of an underlying financial
asset at a fixed price (called a strike price or an exercise price) at or before
the expiration date of the option (Black and Scholes 1973). The payoff to
the option holder is dependent on the price development of the underly-
ing financial asset which will influence whether the user will exercise the
option or not. Investing in financial options can be compared to buying
insurance. The maximum loss is the payment for the right to exercise the
option, while the upside potential is theoretically unlimited, but depends
on the price development of the underlying financial asset. Therefore, the
volatility of the underlying asset price will positively influence the worth of
an option, as it creates a higher probability of considerable gain.
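As a concrete illustration of this logic, the Black and Scholes (1973) formula for a European call option can be sketched as follows. This is a minimal sketch; the parameter values are hypothetical and chosen only to show how higher volatility of the underlying asset raises the worth of the option:

```python
from math import log, sqrt, exp
from statistics import NormalDist

def black_scholes_call(S, K, T, r, sigma):
    """Black-Scholes price of a European call option.

    S: current price of the underlying asset
    K: strike (exercise) price
    T: time to expiration, in years
    r: risk-free interest rate
    sigma: volatility of the underlying asset
    """
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    N = NormalDist().cdf  # standard normal cumulative distribution
    return S * N(d1) - K * exp(-r * T) * N(d2)

# Hypothetical at-the-money option: the only difference between the two
# prices below is the volatility of the underlying asset. The downside is
# capped at the option premium while the upside is unlimited, so higher
# volatility yields a more valuable option.
low_vol = black_scholes_call(S=100, K=100, T=1.0, r=0.02, sigma=0.10)
high_vol = black_scholes_call(S=100, K=100, T=1.0, r=0.02, sigma=0.40)
assert high_vol > low_vol
```

The asymmetry of the payoff is what makes uncertainty valuable here, in contrast to standard discounted-valuation methods where uncertainty only adds risk.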
Real options theory further extends financial option theory by using the
same kind of logic for valuing real investment opportunities, where, unlike
many other investment valuation techniques, uncertainty over future out-
comes is viewed as a positive factor (Bowman and Hurry 1993).The real
options valuation approach in strategic management describes how orga-
nizations position themselves to seize emergent opportunities. The theory
provides insight into how tangible as well as intangible resources can act
as options that enable strategic action (Adner and Levinthal 2004). Real
options are generally described as rights to future investment choices with-
out a current obligation for full investment (Sambamurthy et al. 2003). The
activation of real options is often seen as a form of incremental decision-
making on investments, originating with a so-called shadow option that
154 T. JETZEK

is present but has not been realized to become real options when exer-
cised (Bowman and Hurry 1993). After recognizing an option as such,
the holder of an option typically makes a small initial investment, holds it
open until an opportunity arrives, and then exercises a choice to strike the
option and capture the value inherent in that opportunity (Bowman and
Hurry 1993; Ghosh and Li 2013). The identification of real options is, to a
significant extent, subject to contingencies such as the firm's technological
capabilities, experience, and absorptive capacity, making the identification
of real options virtually unique to every firm (Saarikko 2014). The value of
holding an option becomes magnified especially when the option holder
has preferential advantages in exploiting the opportunity provided by the
option (Sambamurthy et al. 2003).
Within the Information Systems research field, real options have been
used to offer a novel perspective called digital options. Digital options
can be described as a set of IT-enabled capabilities in the form of digi-
tized enterprise work processes and knowledge systems which create value
through increased reach and richness of digitized processes and digitized
knowledge (Sambamurthy et al. 2003; Overby et al. 2006). Digital options
are at once a means of not only preserving the opportunity to capitalize
on a new technology or practice but also of mitigating the risks induced
by technological and market uncertainty (Woodard et al. 2013). While
the concept of digital options has been applied in studies on enterprise
resource planning (ERP) systems investments, it has also received criticism
for its apparent lack of detail in certain key aspects (Saarikko 2014). It
has been argued that restricting digital options to process and knowledge
reach and richness limits the concept's generative potential as well as its
relevance to IT capabilities (Sandberg et al. 2014).
Fichman (2004) compares and contrasts IT platform valuation through
the lens of discounted cash flow (DCF) analysis, on the one hand, and
through the lens of real options valuation, on the other, showing how
real options thinking will capture mechanisms that are important to the
firm's competitive advantage, although the value might be intangible and
neglected through methods such as DCF (Fichman 2004, 139). Two of the
discussed real option value determinants in Fichman's model have a special
relevance to open data: susceptibility to network externalities (the extent to
which a technology increases in value to individual adopters with the size
of the adoption network) and interpretive flexibility (the extent to which a
technology permits multiple interpretations on the part of adopters about
how it should be implemented and used).

6.6 Growing Sustainable Value from Open Data


While it is extremely difficult to predict how and for what purposes open
data will be used, we propose that certain enabling factors will influence the
opportunity, motivation, and ability of open data ecosystem participants to
use open data for sustainable value generation (Jetzek et al. 2014b). These
factors will influence how quickly data that have been opened up by the
government will be put to good use, creating a bigger set of possibilities
for value generation (variability) and therefore influencing the perceived
option value of data. Drawing from real options theory, the option value
of data is not equal to the current value of data, but rather an addition to
it. The option value is based on the probability of open data being used
to generate value in the future. More use creates more variability in the
ways through which value will be generated and increases the network
effects, thereby positively influencing the option value. As for any other
risky investment, the higher the perceived future value or profitability, the
more easily this opportunity will attract money and investors. To raise
the option value of data, governments can thus focus on improving the
opportunity, ability, and motivation of public and private stakeholders to
use these data for value generation. Higher option value will motivate
stakeholders to use the data, which will consequently underpin future
growth of information, products, and services based on these data and
subsequent generation of sustainable value, thus creating a virtuous cycle
of value generation and use of the data.
Drawing from the economics of two-sided markets, supported by three
use cases of open data MSPs, we propose that open data intermediaries
or MSPs will use data to generate information that is freely disseminated
to users, and the users themselves become an additional source of data,
thus creating positive network externalities that will be used to attract the
other side to the platform. These business models will contribute to
the generation of sustainable value through use of data, as the MSP busi-
ness model supports the generation of intangible value that is usually not
rewarded by economic profits. In the case of MSPs, this type of value gen-
eration is utilized as a tool to attract paying customers, and the two sides
will continue to feed upon each other. By extending real options think-
ing to the societal level, we propose that government is, by disseminating
open data, essentially writing an option (or a bundle of options in the case
of multiple open datasets) where more open distribution of more types
of data will increase the variability in potential outcomes and therefore

raise the upside potential, while the maximum loss for the government
is the investment made in making data fit for re-use (given that these
data have already been collected) and the eventual loss of income from
data. Of course, the decision to invest in open data in the first place also
depends on the perceived option value of data. We argue that if govern-
ments recognize the option value of open data for potential users, they will
be more willing to continue to provide high-quality open data to users,
even if doing so does demand some further investments in people and
technology.
As the value of data is dependent on network externalities, we propose
that the open data real option value increases with more use of the data,
but with diminishing marginal returns, as the market eventually becomes
saturated. This effect is not directly reflected in the model below,
but for those that would like to calculate the potential impact, we suggest
using growth formulas, such as that of von Bertalanffy (von Bertalanffy
1938). In that case, the growth factor (K) would be dependent on the
enabling factors we present below and, as suggested above, governments
could influence the option value of data by focusing on these factors, as
they have been found to increase use of the data (Jetzek et al. 2013).
Different growth factors will lead to different outcomes, which can be
assigned probabilities for a more accurate estimation of the distribution
of possible outcomes. Of course, this valuation is based on an estimation
of the base value (or use) of the data (as the underlying asset). The
option holders (i.e., all those that can access and use the data) are influ-
enced by these same factors, but also by their relative abilities as compared
to others. Hence, the value of the option is unique to them, reflecting
their own capabilities. We do not model the organizational level factors
here, in order to preserve clarity of representation. We propose only that
the eventual users will be influenced by the perceived value of the option
they hold, and leave more detailed organizational level modeling to future
research.
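To sketch the suggestion above, a von Bertalanffy-style growth curve can be used to model how different growth factors K (reflecting the strength of the enabling factors) lead to different levels of data use over time. All parameter values below are hypothetical placeholders, chosen only for illustration:

```python
from math import exp

def von_bertalanffy(t, v_max, k, v0=0.0):
    """von Bertalanffy growth curve: use of data at time t,
    rising from v0 toward the saturation level v_max at rate k,
    with diminishing marginal returns as saturation approaches."""
    return v_max - (v_max - v0) * exp(-k * t)

# Hypothetical scenarios: weak versus strong enabling factors imply
# different growth factors K, and hence different outcomes after 5 years.
v_max = 100.0  # saturation level of data use (market saturation)
use_weak = von_bertalanffy(5, v_max, k=0.1)
use_strong = von_bertalanffy(5, v_max, k=0.6)
assert use_strong > use_weak

# Assigning probabilities to such scenarios yields an expected outcome,
# a rough stand-in for the distribution of possible outcomes.
expected_use = 0.5 * use_weak + 0.5 * use_strong
```

Governments influencing the enabling factors would, in this framing, be shifting probability mass toward higher values of K, and thereby raising the option value of the data.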
To identify potential determinants of sustainable value of open data,
we have looked to previous research, where the most important enablers
and barriers of open data have been analyzed, as well as relying on inter-
views and participation in an open data initiative in Denmark. The under-
lying assumption we make is that people are generally willing (intrinsic
motivation) to use data for sustainable value generation if they are given
the opportunity and they have the ability to do so. Additionally, certain
structures in the economy can influence extrinsic motivation, both nega-
INNOVATION INTHEOPEN DATA ECOSYSTEM: EXPLORING THEROLE... 157

tively and positively. Governments can influence extrinsic motivation of


firms by reducing risk and uncertainty, by creating an environment that
encourages investors and entrepreneurs, and by highlighting the
opportunities present. We have developed the following propositions around these
relationships:

P1: IT Infrastructure Positively Influences Perceived Option Value of Open
Data (Ability) Here we propose that use of data is dependent on the level
of technical infrastructure in a country. Technical infrastructure facilitates
data exchange between government agencies and the public and influ-
ences the ability of stakeholders to use the data. Concerning individual
organizations that would like to use data, the perceived option value to
them depends on how their own IT infrastructure compares to that of
other ecosystem participants.
P2: Data-Related Skills Positively Influence Perceived Option Value of Open
Data (Ability) The growth of data-related skills within a society increases
the ability of stakeholders to generate value from data, and thus the prob-
ability that the data will be used for value generation. For individual orga-
nizations that would like to use data, the perceived option value to them
is dependent on the skills they have relative to those of other ecosystem
participants.
P3: Data Governance Positively Influences Perceived Option Value of Open
Data (Opportunity) Better data governance will improve data quality,
so that data are accurate, complete, updated, and reliable. Moreover, if
governance is good, it is more likely that the open data initiative that
provides the data will be sustainable. Improved data quality will, in turn,
enable better and more trustworthy information to be generated from the
data, thereby influencing the opportunity for value generation. The
option value of data for the potential user should be higher as a result of
this opportunity.
P4: Openness of Data Positively Influences Perceived Option Value of Open
Data (Opportunity) As the openness of a set of data increases, it becomes
easier for external stakeholders to access and re-use the data. Accordingly,
openness creates an opportunity for value generation. The option value of
data for the potential user should be higher as a result of this opportunity.
P5: Uncertainty about Data Protection Negatively Influences Perceived
Option Value of Open Data (Motivation) We propose that data are used
more if there is confidence and trust in the legal infrastructure that guards
individual privacy and guides those that want to use data to generate
value. Less risk of data fraud will motivate data users to actively participate
in responsible data use and re-use, thereby positively influencing use
of data. Accordingly, uncertainty over rights, responsibilities, and data
ownership is likely to negatively influence the motivation to use data,
thereby negatively impacting the perceived option value.
P6: Collaboration Positively Influences Perceived Option Value of Open Data
(Motivation) We propose that data are used more if government actively
engages and collaborates with external stakeholders in order to motivate
private and public stakeholders to use data for various purposes and subsequent
value generation. This collaboration can happen via public-private
partnerships, hackathons, living labs, or other types of formal and informal
interactions between different stakeholders in the open data ecosystem.
P7: The Risk-Free Rate Will Negatively Influence Perceived Option Value
of Open Data (Motivation) The higher the risk-free rate, the more likely
it is that money will be used for riskless investment rather than high-risk
investment. Therefore, a high risk-free rate negatively influences the
probability of investment and use, and thereby the perceived option value
of data.
P8: Perceived Option Value of Open Data Positively Influences Investment
in MSPs The higher the perceived option value of data, the more likely it
is that intermediaries will invest in MSPs.
P9: Investment in MSPs Supports the Generation of Information, Products
and Services Based on Data and Therefore Positively Influences Sustainable
Value Various stakeholders can provide information, products, and
services based on the data through these platforms and use the network and
market mechanisms to generate valuable synergies. The more the data are
used and the more synergy is created between the network mechanisms
and market mechanisms that facilitate dissemination of information, on
the one hand, and data-driven products and services on the other, the
more sustainable value will be generated and appropriated. The model
itself is presented in Fig. 6.1.
The various enabling factors are like the roots of a plant in the open data
ecosystem, and their main role is to provide nourishment so that the
seed-like data can grow into something of value. Each of these factors will
influence the opportunity, ability, or motivation of stakeholders in the
ecosystem to use data for value generation, thereby contributing to a larger
set of potential outcomes and increasing variability, which will positively
influence the perceived option value of data. If the perceived option value
is high, stakeholders will be more willing to make the investments neces-
sary to establish MSPs as data intermediaries, despite the inherent risks.
The establishment of MSPs will furthermore contribute to the generation
of sustainable value, as they allow for an interaction between diverse types
of affiliated stakeholders and play the network and market mechanisms
against each other to create synergies that contribute to the generation of
social, environmental, and economic types of value.

Fig. 6.1 Model of sustainable value generation in the open data ecosystem

6.7 Discussion
Our societies are changing fast, faster than many of us realize in the
midst of things. The interaction between technological and social ele-
ments is a big driver in these changes, influencing not only our ability
to generate value but also the way people perceive and think about
value (which might at the individual level be more accurately described
as values). We have not discussed these individual trends in depth here,
but suffice it to say that technology and network capabilities have come
together to create vast amounts of data that are currently being trans-
formed into information and used as a resource in new products and
services by a multiplicity of stakeholders. This new data-driven ecosys-
tem is highly dependent on unstructured many-to-many relationships
where data and information are flowing through networks without any
monetary transactions taking place, as opposed to the structured value
chains of the industrial economy. Network capabilities have allowed for
much more complex interactions between stakeholders, and old inter-
mediaries have been cut out while new ones have been created. The
new intermediaries are effectively playing the network mechanisms and
market mechanisms against each other, using network externalities as
a tool to generate the income necessary to sustain investments
in people and technology, while simultaneously contributing to
sustainable value.
In the context of this chapter, we have used the term open data mostly
for data generated and disseminated by governments, as they are currently
the biggest distributors of open data in the world. We have proposed
that openness is in itself an important enabler to the creation of sustain-
able value from data. Openness enables both generation and appropria-
tion of value, not only by the organization that owns the data but also
by external stakeholders. However, while openness of data might be a
necessary condition for external stakeholders that want to effectively uti-
lize the vast amounts of government data, it is insufficient on its own.
Just as governments aim to provide the necessary infrastructure for effi-
cient markets, they should be aware of the factors that are needed for a
thriving data ecosystem. Such an ecosystem relies to a large degree on
the generation of relevant information, which is further disseminated
through network-based mechanisms to generate value around society.
The network mechanisms do facilitate the appropriation of value by
society's stakeholders but operate under different rules than the traditional
market mechanisms.
We have made a few propositions about how sustainable value can be
created and how MSPs are enabling such value generation, as they are not
completely bound by rent seeking; rather, they gain from stakeholders that
together are addressing complicated societal challenges, previously the
responsibility of governments alone. Governments have started to realize
the power of these models, which thrive on sharing and interactions, and
are even creating their own platforms where public sector, businesses, and
citizens can meet and interact to create superior sustainable value (Janssen
and Estevez 2013). However, as in any complicated market, there are vari-
ous challenges present. One barrier that has been identified in prior work
on MSPs is the chicken-and-egg problem, describing the need to build up
a sufficient number of participants on one side of the platform in order to
attract the other side, which, in the case of open data, is usually the paying
side. In the case of government provision of open data, this translates to
government attracting enough users to justify the investments required
for making data open. When the users come, value will be generated, but
the users will not participate unless they have a current perception of the
future value to be gained.
The economics of real options help us conceptualize the worth of per-
ceived future value by building on the same ideas that underpin the finan-
cial options markets: The limited risk and the unlimited upside as well
as the ability to wait and see before striking the option. Using this type
of thinking might help resolve the open data value paradox. If supplying
open data is conceptualized as the act of writing an option that is handed
out to all market participants, we gain a tool that can help us evaluate the
potential gain, viewing unpredictability and variability as a positive factor
rather than as a negative one and focusing on the flexibility provided as the
data are out there when the company in question needs them. Trusting
that the companies will value the option they are given, governments can
focus on making data more open and create a nurturing environment for
interested stakeholders, which might in turn raise the option value even
further. The potential users will eventually pay back, not only by creating
jobs and paying taxes but also by finding innovative solutions to some of
our most pressing societal problems.
6.8 Limitations, Implications, and Conclusion

The model presented in Fig. 6.1 was created as a part of exploratory
research focusing on the emerging phenomenon of open data. The
goal is to uncover and visualize the complex relationship between open
data, the enabling factors, and barriers that can impact how much data
are used and the resulting generation of sustainable value. The model
has a number of limitations as such. The most obvious limitation is
that a simple model will not do justice to the level of complexity in
such a system. However, for conceptual clarity, we highlight only the
societal-level factors that are, based on our research and state of the
art literature on open data, most important for our argument. Many
of the constructs presented are multidimensional, and these dimensions
are also important for creating more depth in our understanding
of the underlying relationships. It is important to delve deeper into
the lower-level mechanisms that can explain these high-level relation-
ships, and we hope that this will be done in future research. A related
limitation is that this model presents only the societal-level relation-
ships, while there are, of course, many factors at the organizational (or
even individual) level that will determine whether, and when, companies decide
to strike the option, for instance, their ability to make the necessary
investments in technology and knowledge and their absorptive capacity
(Jetzek et al. 2014a).
There are several implications for theory and practice. The first practical
implication is for public organizations that are disseminating open data. In
many cases, the necessary investments in people and technology have been
based on the faith and belief that when open data as a valuable resource
are made available, value generation will happen (the if you build it they
will come argument). In other cases, such initiatives have been based on
careful planning and business case evaluation, where the resulting (fore-
seen) value is transformed to a monetary equivalent and used to calculate
net present value (NPV). In both cases, the unforeseen or serendipitous
future value generation that makes the act of open access so alluring is
not explicitly evaluated, although this type of value is in many cases what
drives these initiatives.
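The contrast can be made explicit: the net present value calculation mentioned above discounts only the foreseen cash flows and assigns nothing to serendipitous future use. A minimal sketch, with purely hypothetical figures:

```python
def npv(rate, cash_flows):
    """Net present value of a series of yearly cash flows,
    where cash_flows[0] occurs today and is not discounted."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

# Hypothetical open data business case: an up-front investment in people
# and technology, followed by the foreseen yearly benefits that planners
# were able to transform into a monetary equivalent.
foreseen = [-500_000, 150_000, 150_000, 150_000, 150_000]
value = npv(0.05, foreseen)

# Whatever the sign of this figure, it captures only the foreseen value;
# the unforeseen, serendipitous value generation that makes open access
# alluring never enters the calculation.
```

A real options evaluation would add to this figure a term for the value of the flexibility and variability of outcomes, which is precisely what the NPV omits.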
We suggest that it might prove helpful to governments to view the
act of publishing open data as writing an option. Real options evalu-
ation methods put value on flexibility and governments provide busi-
nesses with flexibility when they disseminate high-quality data that the
businesses can use whenever convenient. However, the company needs
to wait for the right circumstances in their environment and in their
organization. One easy way to visualize the potential future gains at the
societal level is to create different configurations of the enabling factors
that will result in low to high use of open data, resulting in a distribu-
tion of the value growth factors, from which the predicted future value
of data can be calculated (a type of fuzzy set approach, see Lee and Lee
2011). This approach can help make the future value of open
data more explicit, recognizing the potential without trying to foresee
every possible use case that might result in tangible or intangible value
generation in the future.
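Such a scenario-based estimation might be sketched as follows: each configuration of the enabling factors maps to a value growth factor and a subjective probability, yielding a distribution of predicted future value. All numbers below are hypothetical placeholders:

```python
# Hypothetical configurations of the enabling factors: each yields a
# yearly growth factor for the value of data use, with a probability.
scenarios = [
    {"config": "low use", "growth": 0.02, "prob": 0.3},
    {"config": "medium use", "growth": 0.10, "prob": 0.5},
    {"config": "high use", "growth": 0.25, "prob": 0.2},
]

base_value = 1_000_000  # estimated current (base) value of the data
horizon = 5             # years ahead

# Predicted future value under each scenario, compounded yearly,
# giving a distribution of outcomes rather than a single point estimate.
outcomes = [
    (s["config"], base_value * (1 + s["growth"]) ** horizon, s["prob"])
    for s in scenarios
]
expected_value = sum(v * p for _, v, p in outcomes)
assert abs(sum(s["prob"] for s in scenarios) - 1.0) < 1e-9
```

The spread of the distribution, not just its mean, is what matters in an options framing: wider variability in outcomes raises the upside while the downside remains capped.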
The second practical implication is for private firms that would like
to utilize the torrents of available data to build a successful business.
Not only is the business model of MSPs well suited to support free
dissemination of information, it also creates an attractive marketplace
for smaller companies that would like to use data to create data-driven
products and services that are available to multiple users, without having
the means to establish a platform of their own. While current MSPs
range from relatively closed to relatively open, there is usually a degree
of openness, as these platforms' main function is to bring together a
large number of users on both sides of the platform, benefitting both
the information consumers and the information producers. The more
open the platform, the more of the value generated will benefit society,
since, other things being equal, there will be more participants
and more network externalities (Parker and Van Alstyne 2010). We
have contributed to knowledge about how use of open data can result
in the generation of sustainable value by further extending the the-
ory of MSPs to open data intermediaries. The use of this theory helps
explain how multiple stakeholders can, through their dissemination
and use of open data, create synergies that result in the simultane-
ous generation of social, environmental, and economic value, or what
we conceptualize as sustainable value. Moreover, we suggest that by
extending the simple logic that is used in economics of real options, we
have theoretically supported the argument that the openness of data
creates an opportunity for value generation and will accordingly posi-
tively influence perceived option value, as more variability of outcomes
increases the likelihood of a favorable result. Drawing from the inher-
ent chicken-and-egg problem that often inhibits investment in MSPs,
we have looked at the open data value paradox, where the lack of use
of data results in insufficient levels of openness and data governance,
which then leads to less use. We propose that our model might help
resolve this paradox.
In conclusion, we believe that the unique features of open data offer
the potential for unprecedented generation of sustainable value. This
turn of events, however, is not inevitable, and there are a number of
different factors that governments and other participants in the open
data ecosystem have to consider. These factors will contribute to the
motivation, opportunity, and ability of individuals to use data for sus-
tainable value generation. Our proposition is that when these factors
are in place, both governments and other possible platform owners will
start to perceive that the option value of data, reflecting the potential
for future gain, is high enough for them to start investing in MSPs.
These platforms will enable individuals to appropriate value from open
data through consumption of information and information services.
Smaller entrepreneurs will moreover gain the ability to create and mar-
ket products and services to those individuals via these platforms. The
two-sided markets business model will offer the ability to create a Win-
Win-Win situation where governments, businesses, and individuals all
gain through a complex network of sharing and co-creating data, infor-
mation, and information services for sustainable value. If the synergies
between public/private and social/economic domains can be exploited
in this way, we believe we have the potential for a quantum leap in
increased productivity and social progress in the near future.

Notes
1. http://www.epa.gov/ghgreporting/ghgdata/reported/index.html
2. http://opower.com/
3. http://opower.com/platform/data-science
4. http://opower.com/platform/data-science
5. http://d2dtl5nnlpfr0r.cloudfront.net/tti.tamu.edu/documents/tti-umr.pdf
6. http://www.prnewswire.com/news-releases/inrix-partners-with-san-francisco-on-expanding-traffic-information-services-for-bay-area-drivers-229643681.html
7. http://en.wikipedia.org/wiki/INRIX
8. http://www.imf.org/external/np/speeches/2014/060514.htm
9. http://www.zillow.com/blog/zillow-mobile-2013-year-in-review-141305/
10. http://www.zillow.com/corp/About.htm
11. http://priceonomics.com/the-seo-dominance-of-zillow/

References
Adner, Ron, and Daniel A. Levinthal. 2004. What is not a real option: Considering boundaries for the application of real options to business strategy. Academy of Management Review 29(1): 74–85.
Bailey, Joseph P., and Yannis Bakos. 1997. An exploratory study of the emerging role of electronic intermediaries. International Journal of Electronic Commerce 1(3): 7–20.
Bakici, Tuba, Esteve Almirall, and Jonathan Wareham. 2013. The role of public open innovation intermediaries in local government and the public sector. Technology Analysis & Strategic Management 25(3): 311–327.
Barney, Jay. 1991. Firm resources and sustained competitive advantage. Journal of Management 17(1): 99–120.
Benkler, Yochai. 2006. The Wealth of Networks: How Social Production Transforms Markets and Freedom. New Haven: Yale University Press.
Bharosa, Nitesh, Marijn Janssen, Bram Klievink, and Yao-hua Tan. 2013. Developing multi-sided platforms for public-private information sharing: Design observations from two case studies. In Proceedings of the 14th Annual International Conference on Digital Government Research, 146–155.
Bharadwaj, Anandhi, Omar A. El Sawy, Paul A. Pavlou, and N. Venkatraman. 2013. Digital business strategy: Toward a next generation of insights. MIS Quarterly 37(2): 471–482.
Black, Fischer, and Myron Scholes. 1973. The pricing of options and corporate liabilities. The Journal of Political Economy 81(3): 637–654.
Bos, Maarten W., Amy J.C. Cuddy, and Kyle T. Doherty. 2012. OPOWER: Increasing energy efficiency through normative influence (B). Harvard Business School NOM Unit Case, 911061.
Bowman, Edward H., and Dileep Hurry. 1993. Strategy through the option lens: An integrated view of resource investments and the incremental-choice process. Academy of Management Review 18(4): 760–782.
Brynjolfsson, Erik, and JooHee Oh. 2012. The attention economy: Measuring the value of free digital services on the Internet. In Proceedings of the 33rd International Conference on Information Systems (ICIS), Orlando.
Buytendijk, Frank. 2014. Hype cycle for big data. https://www.gartner.com/doc/2814517/hype-cycle-big-data-
Caillaud, Bernard, and Bruno Jullien. 2003. Chicken and egg: Competition among intermediation service providers. RAND Journal of Economics 34(2): 521.
Cannon, Sarah, and Lawrence H. Summers. 2014. How Uber and the sharing economy can win over regulators. Harvard Business Review. https://hbr.org/2014/10/how-uber-and-the-sharing-economy-can-win-over-regulators/
Conradie, Peter, and Sunil Choenni. 2014. On the barriers for local government releasing open data. Government Information Quarterly 31: S10–S17.
Capgemini Consulting. 2013. The open data economy: Unlocking economic value by opening government and public data. https://www.capgemini-consulting.com/resource-file-access/resource/pdf/opendata_pov_6feb.pdf
Davies, Tim. 2013. Open data barometer: 2013 global report. World Wide Web Foundation and Open Data Institute. http://www.opendataresearch.org/dl/odb2013/Open-Data-Barometer-2013-Global-Report.pdf
Fichman, Robert G. 2004. Real options and IT platform adoption: Implications for theory and practice. Information Systems Research 15(2): 132–154.
Ghosh, Suvankar, and Xiaolin Li. 2013. A real options model for generalized meta-staged projects: Valuing the migration to SOA. Information Systems Research 24(4): 1011–1027.
Hagiu, Andrei, and Julian Wright. 2011. Multi-sided Platforms. Boston, MA: Harvard Business School.
Hagiu, A. 2014. Strategic decisions for multisided platforms. MIT Sloan Management Review 55(2): 71.
Janssen, Marijn, and Elsa Estevez. 2013. Lean government and platform-based governance: Doing more with less. Government Information Quarterly 30: S1–S8.
Janssen, Marijn, and Anneke Zuiderwijk. 2014. Infomediary business models for connecting open data providers and users. Social Science Computer Review 32(5): 694–711.
Janssen, Marijn, Yannis Charalabidis, and Anneke Zuiderwijk. 2012. Benefits, adoption barriers and myths of open data and open government. Information Systems Management 29(4): 258–268.
Jetzek, Thorhildur, Michel Avital, and Niels Bjørn-Andersen. 2013. Generating value from open government data. In The 34th International Conference on Information Systems. ICIS 2013.
———. 2014a. Data-driven innovation through open government data. Journal of Theoretical and Applied Electronic Commerce Research 9(2): 100–120.
———. 2014b. Generating sustainable value from open data in a sharing society. In Creating Value for All Through IT, 62–82. Berlin: Springer.
Katz, Michael L., and Carl Shapiro. 1985. Network externalities, competition, and compatibility. The American Economic Review 75(3): 424–440.
———. 1986. Technology adoption in the presence of network externalities. The Journal of Political Economy 94(4): 822–841.
INNOVATION INTHEOPEN DATA ECOSYSTEM: EXPLORING THEROLE... 167

Lee, Young-Chan, and Seung-Seok Lee. 2011. The valuation of RFID investment
using fuzzy real option. Expert Systems with Applications 38(10): 1219512201.
Lindman, Juho, Tomi Kinnari, and Matti Rossi. 2014. Industrial open data: Case
studies of early open data entrepreneurs. In System Sciences (HICSS), 2014 47th
Hawaii International Conference on, 739748. USA: IEEE.
Martin, Sbastien, Muriel Foulonneau, Slim Turki, and Madjid Ihadjadene. 2013.
Risk analysis to overcome barriers to open data. Electronic Journal of
e-Government 11(1): 348359.
Mayer-Schnberger, Viktor, and Zarino Zappia. 2011. Participation and power:
Intermediaries of open data. In 1st Berlin Symposium on Internet and Society
October.
McKinsey & Company. 2013. Open data: Unlocking innovation & performance
with liquid information. McKinsey Global Institute, McKinsey Center for
Government & McKinsey Business Technology Office.
Nilsen, Kirsti. 2010. Economic theory as it applies to public sector information.
Annual Review of Information Science and Technology 44(1): 419489.
OECD. 2011. Fostering innovation to address social challenges. http://www.
oecd.org/sti/inno/47861327.pdf
. 2014. Data-driven innovation for growth and well-being. interim synthe-
sis report. http://www.oecd.org/sti/inno/data-driven-innovation-interim-
synthesis.pdf
Overby, Eric, Anandhi Bharadwaj, and V.Sambamurthy. 2006. Enterprise agility
and the enabling role of information technology. European Journal of
Information Systems 15(2): 120131.
Parker, Geoffrey, and Marshall Van Alstyne. 2010. Innovation, openness & plat-
form control. In Proceedings of the 11th ACM Conference on Electronic
Commerce, 9596. NewYork, NY: ACM.
Porter, Michael E. 2008. Competitive Advantage: Creating and Sustaining
Superior Performance. NewYork, NY: Simon and Schuster.
Porter, Michael E., and Mark R.Kramer. 2011. Creating shared value. Harvard
Business Review 89(1/2): 6277.
Resnick, Paul, Richard Zeckhauser, and Chris Avery. 1995. Roles for electronic
brokers. In Toward a Competitive Telecommunication Industry: Selected Papers
from the 1994 Telecommunications Policy Research Conference, 289304.
Mahwah, NJ: Lawrence Erlbaum Associates.
Rochet, Jean-Charles, and Jean Tirole. 2006. Two-sided markets: A progress
report. The RAND Journal of Economics 37(3): 645667.
Saarikko, Ted. 2014. Here today, here tomorrow: Considering options theory in
digital platform development. In Creating Value for All Through IT, 243260.
Berlin: Springer.
Sambamurthy, Vallabh, Anandhi Bharadwaj, and Varun Grover. 2003. Shaping
agility through digital options: Reconceptualizing the role of information tech-
nology in contemporary firms. MIS quarterly 27(2): 237263.
168 T. JETZEK

Sandberg, Johan, Lars Mathiassen, and Nannette Napier. 2014. Digital options
theory for IT capability investment. Journal of the Association for Information
Systems 15(7): 422453.
van Osch, W., and M. Avital. 2010. The road to sustainable value: The path-
dependent construction of sustainable innovation as sociomaterial practices in
the car industry. Advances in Appreciative Inquiry 3(1): 99116.
van Veenstra, Anne Fleur, and Tijs A. van den Broek. 2013. Opening movesdriv-
ers, enablers and barriers of open data in a semi-public organization. In
Electronic Government, 5061. Berlin: Springer.
von Bertalanffy, Ludwig. 1938. A quantitative theory of organic growth (inquiries
on growth laws. II). Human Biology 10(2): 181213.
Wade, Michael, and John Hulland. 2004. Review: The resource-based view and
information systems research: Review, extension, and suggestions for future
research. MIS Quarterly 28(1): 107142.
Wernerfelt, Birger. 1984. A resource-based view of the firm. Strategic Management
Journal 5(2): 171180.
West, Joel, and Scott Gallagher. 2006. Challenges of open innovation: The para-
dox of firm investment in open-source software. R&D Management 36(3):
319331.
Woodard, C.J., N. Ramasubbu, F.T. Tschang, and V. Sambamurthy. 2013. Design
capital and design moves: The logic of digital business strategy. MIS Quarterly
37(2): 537564.
Zuiderwijk, Anneke, and Marijn Janssen. 2014a. Barriers and development direc-
tions for the publication and usage of open data: A socio-technical view. In
Open Government, 115135. NewYork: Springer.
. 2014b. Open data policies, their implementation and impact: A frame-
work for comparison. Government Information Quarterly 31(1): 1729.
Zuiderwijk, Anneke, Marijn Janssen, Sunil Choenni, Ronald Meijer, and R.Sheikh
Alibaks. 2012. Socio-technical impediments of open data. Electronic Journal of
eGovernment 10(2): 156172.
Zuiderwijk, Anneke, Marijn Janssen, Sunil Choenni, and Ronald Meijer. 2014.
Design principles for improving the process of publishing open data.
Transforming Government: People, Process and Policy 8(2): 185204.
CHAPTER 7

Sustainability-Oriented Business Model Assessment: A Conceptual Foundation

Florian Lüdeke-Freund, Birte Freudenreich, Iolanda Saviuc,
Stefan Schaltegger, and Marten Stock

7.1 Introduction
Corporate sustainability has long left its academic niche and has become an
integral part of today's business world. While companies around the globe
are trying to position themselves as economically competitive and at the

F. Lüdeke-Freund (*)
University of Hamburg, Hamburg, Germany, Research Fellow at Centre
for Sustainability Management (CSM), Leuphana University, and Governing
Responsible Business Fellow, Copenhagen Business School
e-mail: Florian.Luedeke-Freund@wiso.uni-hamburg.de
B. Freudenreich (*) • S. Schaltegger
Leuphana University, Lüneburg, Germany
e-mail: freudenreich@leuphana.de; schaltegger@uni.leuphana.de
I. Saviuc
University of Antwerp, Antwerp, Belgium
e-mail: iolanda.saviuc@uantwerpen.be
M. Stock
ifu Hamburg GmbH, Hamburg, Germany
e-mail: m.stock@ifu.com

© The Author(s) 2017
E.G. Carayannis, S. Sindakis (eds.), Analytics, Innovation, and
Excellence-Driven Enterprise Sustainability, Palgrave Studies
in Democracy, Innovation, and Entrepreneurship for Growth,
DOI 10.1057/978-1-137-37879-8_7
170 F. LÜDEKE-FREUND ET AL.

same time as ecologically and socially sound, for example, through more
efficient production processes or increasing product responsibility, some
pioneers are looking at new ways to meet this challenge on a more systemic
level: the development of sustainable business models (e.g. Beltramello et al.
2013; Bisgaard et al. 2012; Wells 2013a; Schaltegger et al. 2016). Empirical
studies show that this approach is already one of the most important topics
of sustainability and innovation management in practice (e.g. Kiron et al.
2013). The underlying assumption and expectation is that consciously
managed business models can lead to more effective ways of solving
ecological and social problems, while maintaining or even enhancing an
organisation's competitiveness (Schaltegger et al. 2012, 2016). Corporate
sustainability has arrived in the world of business models, and vice versa.
Corporate sustainability takes into account the risk of negative business
impacts on the natural environment and society as well as the challenge of
surviving as an organisation in partly radically changing ecological, social, and
economic contexts (Schaltegger and Burritt 2005; see also McElroy and van
Engelen 2012). But corporate sustainability is also about creating positive
effects in support of a prospering natural environment and human society;
a perspective that is emphasised in the emerging research field of sustainable
entrepreneurship (Schaltegger and Wagner 2011) and sometimes referred
to as flourishing (Ehrenfeld and Hoffman 2013). A business model can be
understood as the rationale, or logic, of organisational value creation. The
conventional business model perspective defines value mainly in financial
terms, whereas softer concepts are also discussed, for example, referring to
customer value, jobs-to-be-done, or knowledge gains (cf. e.g. Beattie and
Smith 2013; Chesbrough 2010; Johnson 2010). From a corporate
sustainability perspective, business models should be developed and
transformed in ways that secure the long-term viability of an organisation,
that is, maintain and improve its competitiveness through managerial and
innovative capabilities, while satisfying customers' and other stakeholders'
needs within the limits of the ecological and social systems in which every
human activity is embedded (cf. Boons and Lüdeke-Freund 2013;
Schaltegger et al. 2012, 2016).
Whether and how sustainable business models really act as business models
for sustainability, that is, whether they effectively contribute to sustainable
development, is not just a matter of business model design but also of
measurability and manageability. Being able to measure and manage business
model effects is an essential prerequisite for targeted activities to improve the
sustainability performance and long-term prospects of organisations,
especially in rapidly and radically changing business environments. However,
appropriate management approaches for the assessment and management
SUSTAINABILITY-ORIENTED BUSINESS MODEL ASSESSMENT... 171

of business models and their innovation as a means of corporate sustainable
development are currently not available. Therefore, the proposed method
for sustainability-oriented business model assessment (SUST-BMA) aims at
the systematic monitoring and evaluation of an organisation's sustainability
performance on the business model level. The goal is to provide a
methodical innovation that can be applied independently of organisation
type and industry, based on the transfer and combination of current
research, mainly in the fields of corporate sustainability management and
business models.
The chapter is structured as follows: Sect. 7.2 provides the theoretical
background with a focus on business cases and business models for
sustainability. The research gap, that is, the need for sustainability assessment
methods on the business model level, is discussed. Section 7.3 introduces
the proposed SUST-BMA framework and its major components, which are
a business model concept and the Sustainability Balanced Scorecard (SBSC).
The SUST-BMA process guiding the practical application of our framework
is presented in Sect. 7.4. The chapter concludes with an outlook on future
research necessary to develop the SUST-BMA method further (Sect. 7.5).

7.2 Background: Towards Business Models for Sustainability

7.2.1 Business Cases for Sustainability (The Ends)


Sustainable entrepreneurs, as for example, defined by Schaltegger and
Wagner (2008, 2011) or Hockerts and Wüstenhagen (2010), pursue
corporate sustainability mainly through the creation of so-called business
cases for sustainability (Schaltegger and Burritt 2005; Schaltegger and
Wagner 2006a).1 A business case for sustainability achieves economic
success through (and not just with) the deliberate and voluntary management
of ecological and social issues. A business case for sustainability is different
from conventional business success. It can be characterised by three
requirements which have to be met (Schaltegger and Lüdeke-Freund
2013; Schaltegger et al. 2012):

• The company has to realise a (mainly) voluntary activity with the
intention to contribute to the solution of an ecological and/or social
problem. It must be an intended activity for the natural environment
or society and not just a reaction to legal enforcement and regulations,
or an activity which would anyhow be expected for economic reasons
as part of conventional business behaviour.

• The activity must create a positive business effect, that is, positively
contribute to the economic success of the company which can be
measured or at least argued for in a convincing way. Such effects can
include cost savings, increased sales, improved competitiveness,
profitability, customer retention, or reputation, for example.
• A clear and logically convincing argumentation must exist that a
deliberate management or entrepreneurial activity has led to both the
intended ecological or social effect and the economic business effect.

Innovation, including new processes, products, and organisational forms,
is widely discussed as a crucial strategy to create business cases for
sustainability (e.g. Hansen et al. 2009; Hockerts and Wüstenhagen 2010;
Schaltegger and Wagner 2011; Wüstenhagen et al. 2008). While the
innovation potential of the business model has been recognised for more
than 15 years in mainstream management research (e.g. Chesbrough and
Rosenbloom 2002; Hamel 2000; Linder and Cantrell 2000; see e.g. Alt and
Zimmermann (2014), Zott and Amit (2013) for retrospective views on the
field), it has hardly been investigated from a sustainable entrepreneurship
perspective (Lüdeke-Freund 2013; Schaltegger et al. 2012, 2016). While
the strategy and innovation mainstream sees the business model mainly
as a mediator between technologies, strategies, and economic value (e.g.
Chesbrough and Rosenbloom 2002; Chesbrough 2010; Hamel 2000;
Johnson et al. 2008; Teece 2010), the question of how business models are
used by sustainable entrepreneurs to create, deliver, and capture ecological,
social, and economic value with their sustainability innovations has so far
received little attention. However, some authors have begun to deal with
this issue in more detail during the past eight or nine years (e.g. Charter
et al. 2008; Johnson and Suskewicz 2009; Lüdeke-Freund 2009, 2010;
Stubbs and Cocklin 2008; Verhulst et al. 2012; Wells and Seitz 2005; Wells
2008, 2013a, b; Wüstenhagen and Boehnke 2008; Upward 2013).
The most pressing risk for sustainable entrepreneurs is whether business
cases for their innovations can be realised (Schaltegger and Wagner
2011). Here, the business model comes into play as a mediating device
that can support sustainable entrepreneurs' attempts to create or maintain
successful businesses (Boons and Lüdeke-Freund 2013; Lüdeke-Freund
2013; Doganova and Eyquem-Renault 2009). Depending on their personal
worldviews, organisational or socio-cultural environments and requirements,
sustainable entrepreneurs define success as financial profit, non-financial
effects like improved reputation, or simply as cost coverage through the
sale of solutions for ecological and social problems (Schaltegger and
Wagner 2011). But regardless of their normative personal motivations and
organisational missions, at one time, sustainable entrepreneurs must
commercialise their innovations and be successful in mass markets to create
private and public benefits on a relevant scale, that is, significantly reduce
some of the market imperfections and negative externalities that lead to
humanity's unsustainable development (cf. Cohen and Winn 2007).

Assuming that radical innovations are crucial for improving a firm's
sustainability performance (without neglecting the effects of accumulated
incremental measures), the theoretical relationships between a firm's
economic success and the ecological and/or social performance of its
sustainability innovations are illustrated in Fig. 7.1.

[Fig. 7.1 plots economic performance (vertical axis) against social and/or
ecological performance (horizontal axis, marked ESP*, ESP1, ESP0). A
revisionist curve runs through points A (ES*/ESP*), B (ES0/ESP1), and C;
a traditionalist curve runs through points D and F. A dashed area in the
upper right is labelled "Extended business case potential due to sustainability
innovation and business model effects".]
Fig. 7.1 Relationships of economic and social and/or ecological performance
(Adapted from Schaltegger and Synnestvedt (2002: 341); Schaltegger and Burritt
2005)

The economically optimal business case is achieved in point A (ES*/
ESP*). Beyond this point, that is, towards points B and C, trade-offs occur
and the economic performance decreases because of rising marginal costs
of further sustainability innovations after the low hanging fruits have
been picked (cf. Hahn et al. 2010; Lankoski 2006). A socially or ecologically
maximised business case would be slightly above point B (ES0/ESP1).
However, even if profitable innovations exist, the economic performance
will at some point have its culmination and decline. Besides this revisionist
view, which accepts the existence of (limited) win-win situations (curve ES0-
A-B-C), the traditionalist view sees only trade-offs as soon as a company
goes beyond the legally required minimum, which corresponds to curve
ES0-D-E-F (for a more detailed discussion see Schaltegger and Burritt
2005; Schaltegger and Wagner 2006a; Schaltegger and Burritt 2016).
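The logic of the revisionist curve can be summarised in a stylised way. The notation and functional assumptions below are ours, introduced only for illustration; they are not taken from the cited sources:

```latex
% Stylised sketch (our own notation and assumptions): economic success ES as
% a function of ecological/social performance ESP, composed of concave
% business benefits B and convex costs C of sustainability activities.
\[
  ES(ESP) = B(ESP) - C(ESP), \qquad B' > 0,\; B'' < 0,\; C' > 0,\; C'' > 0
\]
% The economically optimal business case (point A) lies where marginal
% benefits equal marginal costs:
\[
  \left.\frac{dES}{dESP}\right|_{ESP^{*}} = 0
  \quad\Longleftrightarrow\quad
  B'(ESP^{*}) = C'(ESP^{*})
\]
% Beyond ESP*, rising marginal costs dominate and ES declines towards B and
% C. Business model effects can then be read as shifting B(.) upwards,
% extending the business case potential towards the upper right.
```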
In line with Chesbrough and Rosenbloom's (2002) findings on the
cognitive effects of business models on value creation from new
technologies, we assume that sustainability innovations together with
managed business models and their innovation can extend given, and create
new, business case opportunities, as indicated by the dashed line in the
upper right of Fig. 7.1 (cf. Schaltegger et al. 2012). However, this leads
to questions about the linkages between business models and sustainability
performance (Sect. 7.2.2) as well as about approaches to their systematic
assessment and management (Sect. 7.2.3). Without assessment, how could
sustainable entrepreneurs know whether their activities, such as introducing
new business models, improve their performance towards the upper right
corner?

7.2.2 Business Models for Sustainability (The Means)


Several publications discuss possible linkages between business models and
corporate sustainability or, in a wider sense, positive organisational
contributions to a sustainable development of nature, society, and economy
(cf. Carayannis et al. 2014; Charter et al. 2008; Lüdeke-Freund 2009, 2013;
Stubbs and Cocklin 2008; Tukker and Tischner 2006; Wells 2008, 2013a;
Wüstenhagen and Boehnke 2008). However, research in this field is still
rather limited with regard to both empirical analyses and rigorous theoretical
frameworks (Bocken et al. 2014; Boons and Lüdeke-Freund 2013;
Boons et al. 2013; Schaltegger et al. 2012, 2016). Prominent examples of
potentially sustainable business models found in the literature often refer
to new mobility technologies and modal concepts (e.g. Abdelkafi et al.
2013; Cohen and Kietzmann 2014; Johnson and Suskewicz 2009; Wells
2013b), approaches to marketing renewable energies (e.g. Loock 2012;
Lüdeke-Freund 2014; Richter 2012, 2013; Wüstenhagen and Boehnke
2008), and different forms of social enterprises (e.g. Seelos and Mair
2005, 2007; Yunus et al. 2010; Zeyen et al. 2014).
In these contexts, the effect of new business models is often described
as the breakup of dominant and purely financially oriented paradigms of
value creation (Lüdeke-Freund 2009, 2010). This can be achieved, for
example, through establishing closed-loop and zero-waste production
models (e.g. McDonough and Braungart 2013) that replace linear "fire
and forget" models and allow for the creation of ecological value (Wells
2008). Another role is the introduction of new ways of value distribution.
For example, some social businesses distinguish between those who
pay for a value proposition (like access to nutrition, health care, or
education) and those who benefit from it, thus creating additional social
value (e.g. Grassl 2012; Yunus et al. 2010). With regard to changing the
ways of producing and consuming services and goods that are culturally
and economically embedded and institutionalised, Wells argues that only
radical and sustainability driven innovations are capable of challenging
the persistent and continuously self-reproducing status quo (Wells 2008,
2013a, b; Charter et al. 2008; Hansen et al. 2009). However, such
innovations often start in niches and struggle either to create new markets
or to penetrate the existing mass markets; take e-mobility as a prime
example (cf. Bidmon and Knab 2014; Hockerts and Wüstenhagen 2010;
Schaltegger and Wagner 2011; Tukker et al. 2008). Creating business
models that not only bridge this gap between niche and mass markets
for radical and sustainability driven innovations and hereby deliver
ecological and social benefits, but are also economically viable, is the major
challenge for sustainable entrepreneurs dealing with business models and
their innovation (cf. Bocken et al. 2014; Carayannis et al. 2014; Charter
et al. 2008; Laukkanen and Patala 2014; Lüdeke-Freund 2013; Upward
2013).
The ability to deliberately provide market access for radical and
sustainability driven innovations, either by connecting to existing markets or
creating completely new markets, is the crucial feature that distinguishes
sustainable business models from what might be referred to as conventional
or mainstream business models (Boons and Lüdeke-Freund 2013).
The relationship between business models and sustainability innovations
has been described from two major perspectives: One sees the business
model as a support structure for bringing sustainability innovations to the
market (e.g. Charter et al. 2008); and the other sees it as a potential
sustainability innovation itself (e.g. Stubbs and Cocklin 2008; cf. Beltramello
et al. 2013; Bisgaard et al. 2012). Regardless of the theoretical differences
and commonalities between these perspectives, both underline the
significance of business models for sustainability. However, beyond such
theoretical issues, which still need further elaboration and clarification (Boons
et al. 2013), a more practical question is how entrepreneurs can develop
and manage their business models in order to achieve and retain economic
viability through providing ecological and social benefits.
There are, however, examples of entrepreneurs and corporations
who have managed to create business models that are successful in
ecological, social, and economic terms. Stubbs and Cocklin (2008), for
example, analyse the Australian Bendigo Bank as well as the famous
carpet tile maker Interface Inc. and describe their structural and cultural
attributes in relation to their socio-economic contexts and their inter-
organisational capabilities. Based on the analysis of both firms' exceptional
leadership styles and ability to provide positive external effects to
the natural environment and society, Stubbs and Cocklin propose five
key characteristics of sustainability business models including, inter
alia, adopting a stakeholder view of the firm and promoting environmental
stewardship. Taking ecological considerations as a starting point
for business model innovation is an approach analysed by Short and
colleagues in their case study of British Sugar (Short et al. 2014). The
company is the UK's largest sugar producer (as of 2014) and provides
an informative example of ecologically motivated business model
innovation. Their innovation led to remarkable synergies between product
lines as diverse as animal feed, electricity, tomatoes, and bioethanol,
thus bringing core ideas of industrial ecology to life, and to the market.
An example of a highly effective social business model is described
by Seelos (2014) in his case study of the Indian eye hospital group
Aravind. Founded in 1976 by Dr Govindappa Venkataswamy, Aravind
has grown into a group of eye care hospitals and temporary eye care
camps throughout India, serving millions of patients every year, most
of them at a very low price or for free. Aravind developed a hybrid
business model based on cross-subsidisation between paying and non-paying
patients. This model reduces the risk of blindness and thus poverty,
for example, caused by cataracts, for people who otherwise could not
afford professional health care.
7.2.3 Sustainability-Oriented Business Model Assessment (The Gap)
While compelling and inspiring, one crucial aspect is missing in the above-
mentioned theoretical, conceptual, and empirical studies: How can the
assumed positive sustainability effects of business models and their innovation
be assessed and measured to support management? And how can sustainable
entrepreneurs control their business models' contribution to business cases for
sustainability? So far, most authors argue for potentially positive linkages
between particular business model types and their expected sustainability
performance (e.g. Beltramello et al. 2013; Bisgaard et al. 2012; Bocken
et al. 2014; Clinton and Whisnant 2014), but neither the academic nor
the practitioner literature offers concrete concepts or tools for the
assessment and management of these linkages.
In some cases, such as Aravind, it might be obvious that a particular
business model (e.g. the hybrid cross-subsidisation model) leads to a
particular outcome (e.g. serving a great number of otherwise neglected
patients; Seelos 2014). But in most cases, the assumed positive effects will
be related rather indirectly to the respective business models. However,
sustainable entrepreneurs need to know how their decisions influence their
organisation's sustainability performance. They need to be able to compare
alternative models to make informed decisions, for example, about
how to combine technologies, markets, and business models, to achieve
the performance they aspire to (see e.g. Chesbrough and Rosenbloom
(2002) and Chesbrough (2010) for a traditionally strategic perspective on
this issue). Moreover, their business models' sustainability performance
must be tracked to provide information for ad hoc reactions to changes
in society, the market, or regulations as well as for ex post analyses and
strategic planning (see e.g. Burritt and Schaltegger (2010) for an overview
of different approaches to, and functions of, sustainability accounting).
Thus, at least three functions must be supported by concepts and tools for
a SUST-BMA:

• Systematic performance tracking to control the (non-)attainment of
an organisation's sustainability goals;
• Systematic information provision to adapt business models, support
ex post and longitudinal analyses as well as strategic planning; and
• Comparisons of alternative business models and their ecological, social,
and economic bottom lines.
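The three functions above can be made concrete with a minimal data-structure sketch. Everything here is our own illustration, not part of the SUST-BMA proposal: the class and field names, the example indicator, and the crude "share of attained goals" comparison are hypothetical assumptions.

```python
# Minimal, hypothetical sketch of the three SUST-BMA support functions
# (tracking, information provision, comparison). All names and example
# indicators are illustrative assumptions, not part of the framework itself.
from dataclasses import dataclass, field

@dataclass
class Indicator:
    name: str        # e.g. "CO2 per unit"
    target: float    # the organisation's sustainability goal
    actual: float    # measured value for the period

    def attained(self) -> bool:
        # Function 1: performance tracking against the goal
        # (here: lower is better; a real KPI would carry a direction flag)
        return self.actual <= self.target

@dataclass
class BusinessModelAssessment:
    model_name: str
    indicators: list = field(default_factory=list)

    def report(self) -> dict:
        # Function 2: information provision for ex post analysis and planning
        return {i.name: (i.actual, i.target, i.attained()) for i in self.indicators}

    def score(self) -> float:
        # Function 3: a crude basis for comparing alternative business models,
        # as the share of attained goals across the chosen indicator set
        return sum(i.attained() for i in self.indicators) / len(self.indicators)

# Usage: compare two alternative business models on the same indicator set
current = BusinessModelAssessment("status quo", [Indicator("CO2 per unit", 10, 14)])
redesign = BusinessModelAssessment("closed-loop", [Indicator("CO2 per unit", 10, 8)])
best = max((current, redesign), key=BusinessModelAssessment.score)
print(best.model_name)  # → closed-loop
```

The point of the sketch is only that the same indicator set must serve all three functions at once; any real implementation would need ecological, social, and economic indicators side by side.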

To the best of our knowledge, no SUST-BMA concepts or tools are
available to date. The current business model literature offers only very
few assessment methods. Wirtz (2011), for example, proposes a financial
business model controlling framework, which comprises basic criteria
to assess the realisation of a promised value proposition, the degree of
customer satisfaction, and profitability. This framework aims at securing a
business model's competitive advantage and is based on the assumption
that the three aspects directly interrelate: the (non-)realisation of a value
proposition has an influence on customers' satisfaction, which in turn has
an impact on a business model's profitability. Wirtz thus employs a cause-
and-effect view that brings Kaplan and Norton's Balanced Scorecard (BSC)
to mind (Kaplan and Norton (1992, 1996); see Sect. 7.3.2). Still, Wirtz'
framework cannot be applied as a standard tool to any business model, or
even sustainability issues, since the required mixture of soft information
and hard numbers, as well as directly and indirectly measurable information,
requires a customisation of the proposed criteria to the individual case in
question, for example, as a set of key performance indicators (KPIs) for each
criterion. Therefore, Wirtz suggests a checklist to guide the customisation
of individual performance management frameworks.
Other known approaches to assessing business models refer to their
strategic and financial value. For example, the software-supported business
quotient method scores a business model's financial robustness on a scale
from 0 to 100. It separates a model into its main elements and their
interrelations and estimates its financial strength and longevity (Muehlhausen
2013). Another software-based method is found in the Strategyzer toolbox
published by the Business Model Foundry, which commercialises
Osterwalder and Pigneur's (2009) Business Model Canvas. Strategyzer
offers design and innovation tools as well as a calculator to estimate a
business model's financial performance in a manner similar to a profit
and loss account. Business model designers can enter their assumptions,
for example, about market size, sales price, and buying frequency, and
estimate their model's profitability. However, Strategyzer, too, offers no
method for integrating and assessing sustainability-related performance
information.
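The kind of back-of-the-envelope profitability estimate such calculators support can be sketched as follows. The function, its parameter names, and the figures are our own illustrative assumptions, not Strategyzer's actual model:

```python
# Illustrative profit-and-loss style estimate from business model assumptions
# (market size, price, buying frequency). Names and figures are hypothetical,
# not taken from any particular tool.

def estimate_annual_profit(market_size: int, reach: float, price: float,
                           purchases_per_year: float, unit_cost: float,
                           fixed_costs: float) -> float:
    """Estimate yearly profit from a handful of business model assumptions."""
    customers = market_size * reach              # customers actually won
    units = customers * purchases_per_year       # units sold per year
    revenue = units * price
    variable_costs = units * unit_cost
    return revenue - variable_costs - fixed_costs

# Example: 100,000 potential customers, 2% reach, two purchases a year
profit = estimate_annual_profit(market_size=100_000, reach=0.02, price=50.0,
                                purchases_per_year=2, unit_cost=30.0,
                                fixed_costs=60_000)
print(profit)  # → 20000.0
```

The sketch makes the gap named above tangible: every input and output is financial, and there is no slot where an ecological or social performance figure could enter the calculation.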
The result of this review is not surprising. Due to the business model
concept's historical background, which is at least partly in the domain of
strategic management (cf. Wirtz 2011), it is closely related to financially
oriented performance assessment. For example, Osterwalder's (2004)
Business Model Ontology and its derivative innovation tool, the Business
Model Canvas (Osterwalder and Pigneur 2009), clearly show that the
performance of a business model and changes to it are to be expressed in
terms of financial costs and revenues. Magretta (2002) described this
conceptual kinship very clearly in her seminal article "Why Business Models
Matter": they matter because they tie narratives, that is, business ideas,
to numbers, that is, expected financial results (see also Doganova and
Eyquem-Renault 2009):

By enabling companies to tie their marketplace insights much more tightly to
the resulting economics – to link their assumptions about how people would
behave to the numbers of a pro forma P&L [profit and loss] – spreadsheets made
it possible to model businesses before they were launched. (Magretta 2002: 5,
orig. emphasis)

In this chapter, we argue that this conceptual (and practical) feature of
tying narratives to numbers should be extended to include information
that exceeds financial figures and involves ecological and social performance
data. Frameworks, concepts, and tools for a SUST-BMA would
allow, inter alia, evaluating the business case potential of sustainable business
model archetypes, patterns, and other design approaches, which are
being discussed with increasing frequency (e.g. Bocken et al. 2014; Breuer
and Lüdeke-Freund 2017a, 2017b; Clinton and Whisnant 2014). We
propose a basic conceptual assessment framework in the following section.

7.3 A Conceptual Framework for SUST-BMA

7.3.1 Conceptual Approach
One of the business model concept's most important advantages turns
into an impediment from an assessment perspective: its systemic nature.
Any business model concept comprises a particular number of other mod-
els and concepts. Zott and Amit (2010), for example, refer to the content,
structure, and governance of an activity system's architecture to describe
a business model. Al-Debei and Avison (2010) define four dimensions
as the basic ontological structure of their concept (value proposition,
value architecture, value finance, and value network). And Wirtz (2011),
as another example, defines a business model as integrating nine partial
models, such as the strategy, resource, or network model. These and many
more concepts share a particular feature: they are concepts of concepts.
180 F. LÜDEKE-FREUND ET AL.

While this feature allows for flexible and rich descriptions of empirical
phenomena and supports systemic thinking, its downside is that a thorough
assessment would lead to unmanageable data requirements. Therefore,
the SUST-BMA framework builds on an approach that circumvents this
impasse: since the business model, as defined by Osterwalder (2004),
is partly based on Kaplan and Norton's (1992, 1996) BSC, which can be
used for assessment purposes, a (re-)alignment of these two concepts would
support business model assessments. This approach provides structural guid-
ance and eliminates the need to develop an assessment framework from
scratch. Moreover, the conventional BSC has been further developed as
an SBSC that supports the management of sustainability information (e.g.
Figge et al. 2002; Schaltegger and Dyllick 2002), which is key to a sustain-
ability-oriented assessment. Therefore, our conceptual approach is based on
the alignment of the business model concept with the SBSC. It is assumed
that this approach provides a manageable SUST-BMA framework.
The following sections introduce the business model concept within
our approach (Sect. 7.3.2), provide an overview of the SBSC (Sect. 7.3.3),
and explain how both merge to form the basis of the SUST-BMA frame-
work (Sect. 7.3.4).

7.3.2 The Business Model Concept


7.3.2.1 Business Models
The business model is a rather young concept which emerged around 15
years ago as an explicitly defined notion. Since then, it has become increas-
ingly established in research and practice (Baden-Fuller et al. 2010). Most
publications relate its emergence to the dot-com hype, that is, the new
economy and e-commerce era (cf. Timmers 1998; Alt and Zimmermann
2014). Some influential works simultaneously emerged in the strategy
and innovation domains (e.g. Hamel 2000; Linder and Cantrell 2000;
Magretta 2002). At that time, business model concepts and research
topics were limited, whereas today the number of peer-reviewed jour-
nal articles is growing exponentially. Al-Debei and Avison (2010), for
example, compare 20 different scholarly definitions, and Wirtz (2011)
provides a 70-page overview covering the period from 1975 to 2010.
With regard to single concepts, the ones developed by Osterwalder and
Pigneur (Osterwalder et al. 2005; Osterwalder and Pigneur 2009) and
Johnson (Johnson et al. 2008; Johnson 2010) are widely accepted (in
terms of citations) and provide some coherence across disciplinary and
research-practice boundaries. From a theory-building perspective, the
works of Amit and Zott, Chesbrough, or Teece serve as primary references
(e.g. Amit and Zott 2001; Zott and Amit 2007, 2008; Chesbrough and
Rosenbloom 2002; Chesbrough 2010; Teece 2010).
To introduce the business model in general, the definition proposed by
Teece (2010) is used (see also Teece 2006): "A business model describes the
design or architecture of the value creation, delivery and capture mechanisms
employed. The essence of a business model is that it crystallizes customer
needs and ability to pay, defines the manner by which the business enter-
prise responds to and delivers value to customers, entices customers to pay
for value, and converts those payments to profit through the proper design
and operation of the various elements of the value chain" (Teece 2010:
179, italics added). This definition highlights functions mainly discussed
by strategy and innovation scholars: creating, delivering, and capturing
value (e.g. Afuah 2004; Chesbrough and Rosenbloom 2002; Chesbrough
2010; Hamel 2000; Johnson 2010; Linder and Cantrell 2000; Magretta
2002; Shafer et al. 2005; Zott and Amit 2010). Besides these value-related
functions, which refer to what business models do in practice (sometimes
referred to as "market devices", Doganova and Eyquem-Renault 2009),
Osterwalder also proposes conceptual business model functions, which
are understanding and sharing, analyzing, managing, prospects, and pat-
enting of business models (Osterwalder 2004: 19).
Notwithstanding its business or model functions, the business model
is mostly referred to as a concept (e.g. Amit and Zott 2001; Baden-Fuller
and Morgan 2010; Hedman and Kalling 2003; Schweizer 2005; Zott
et al. 2011), that is, a logical and abstract idea that supports higher-level
thinking, or as a framework (e.g. Al-Debei and Avison 2010; Chesbrough
and Rosenbloom 2002; Lambert 2010; Wirtz 2011), that is, a reference
system that combines different concepts and their (causal) interrelations.
As a concept or framework it has different model characteristics, which
allow researchers and practitioners to define and deal with chosen aspects
of reality (Baden-Fuller and Morgan 2010; Seelos 2014). A common
approach to capture the desired aspects, such as the offerings, processes,
or value creation of an organisation, is to define constituting elements
and relationships (for overviews, see Al-Debei and Avison 2010; Morris
et al. 2005; Shafer et al. 2005; Wirtz 2011), that is, to create some form
of representing construct. Therefore, the business model can be seen as an
artificial construct which can assume the form of a verbal definition, dia-
gram, concept, or formal ontology.

7.3.2.2 Business Model Functions

A key function of a business model – seen as a "market device" (Doganova
and Eyquem-Renault 2009) – is alignment, or mediation, between differ-
ent internal organisational spheres (most importantly strategy and opera-
tions, Fig. 7.2), as well as between an organisation's innovations and its
environment (Al-Debei and Avison 2010; Chesbrough and Rosenbloom
2002; Lambert 2010; Osterwalder 2004). This function is crucial for an
efficient and effective implementation of process, product, and service
innovations, and thus plays an important role in keeping up with com-
petition (cf. Pateli and Giaglis 2005; Teece 2010; Zott and Amit 2008).
Beyond internal alignment, a business model also mediates between an
organisation and its environment (Lüdeke-Freund 2013). In extreme
cases this can even lead to changes in an industry's dominant designs and
ways of doing business, if the respective business model possesses superior
features which are emphasised by customers' expectations, new regula-
tions, or competitors who adopt these features (cf. Chesbrough 2010;
Demil and Lecocq 2010; Johnson 2010). Amit and Zott (Amit and Zott
2001; Zott and Amit 2010) found that well-aligned and superior busi-
ness model designs evoke efficiency, novelty, complementarity, or lock-in
effects, leading to advantageous market positions.

Fig. 7.2 The location of the business model within management levels and
processes (Lüdeke-Freund 2009: 18). [Figure: "Generic Levels and Processes of
Corporate Sustainability Management" – the normative level of sustainability
management (normative foundation; orientation processes that orientate to
normative concepts, e.g. corporate sustainability, CSR, ecological modernization,
industrial ecology); the strategic level (strategic foundation; strategy development
and implementation processes, from vision to strategy, e.g. strategies to promote
sustainability by extracting private benefits from public goods); the architectural
level of business model management (business and money-making logic; business
model processes such as business model design, e.g. translating strategy into a
structural template of the business logic to provide sustainability-driven value
propositions); and the operative level of sustainability management (staff
supervision, financial management, quality management; operative management
processes that realize corporate sustainability in everyday business, e.g.
developing and implementing instruments and concepts from sustainability
management).]
Business models – seen as models (Arend 2013; Seelos 2014) – are
applied for a variety of purposes, such as business analyses or supporting
innovation and creativity projects, and in different organisational contexts,
ranging from non-governmental organisations (NGOs) and social enter-
prises to multinational corporations. Depending on purpose and context,
different model functions are emphasised. For our purposes, that is, the
development of an assessment framework, we will emphasise three of the
five model functions proposed by Osterwalder et al. (2005): understand-
ing and sharing, analysing, and managing.
Understanding and sharing allows capturing, visualising, and commu-
nicating an organisation's business logic (ibid., pp. 11–14). Business model
concepts, especially those that segregate elements and their relationships,
support clear and structured descriptions as they allow developing scale
models of real objects and circumstances. This is extremely important,
since entrepreneurs of all kinds often struggle to describe their businesses
in a clear and convincing manner (cf. Linder and Cantrell 2000) – take
the two-minute start-up pitch as the situation in which this is of utmost
importance. Other model functions build on this fundamental one.
Analysing refers to measuring, tracking, observing, and comparing
business model data and results (Osterwalder et al. 2005: 14). Using the
business model as a distinguishable unit of analysis (Stähler 2002) can help
to monitor and manage business activities in ways not thought of before
(cf. Camponovo and Pigneur 2004). Measuring, tracking, and observ-
ing relate to the development of relevant measures (e.g. in terms of busi-
ness model KPIs) and corresponding information management systems.
Comparing different business models builds on these functions and allows
understanding performance differences between organisations as well as
fostering inter-organisational learning. Comparisons, however, require a
description of different empirical entities in the same conceptual way and
with comparable KPIs – a major challenge for SUST-BMAs.
Finally, managing business models involves, inter alia, their design,
implementation, and change (Osterwalder et al. 2005, pp. 15–16). The
inherent complexity of real businesses is the main reason that their delib-
erate design and implementation is an interdisciplinary task that crosses
nearly all departmental boundaries within and beyond an organisation.
These complexities become particularly apparent in Wirtz's (2011) busi-
ness model concept, which describes a business model as the interplay
between nine different sub-models. Changing a business model is an
equally, if not more, challenging task, because of the inertia of once estab-
lished organisational worldviews and routines (Johnson 2010). Therefore,
understanding and analysing, as defined by Osterwalder et al. (2005),
together with a dedicated change strategy (cf. Linder and Cantrell 2000;
Johnson 2010), are the ultimate prerequisites for an effective business
model management.
The above discussion shows that a valid and reproducible framework
for sustainability-oriented performance assessment and management has
to start with a conceptual definition of a business model and its bound-
aries, allowing for understanding and sharing. Therefore, our concept
frames business model descriptions as a set of distinct, interrelated logics
(see below). Business model concepts with distinct elements and relation-
ships are particularly suitable to fulfil descriptive and analytical functions.
Such modular concepts make it easier to locate specific issues within a
business model (e.g. CO2 emissions or water consumption) and allow a
mapping of corresponding indicators to its elements. This in turn facili-
tates targeted adjustments and management of the business model (cf.
Osterwalder et al. 2005).
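The mapping of issues and indicators to business model elements described above can be illustrated with a short sketch. All element names, indicators, and units below are hypothetical examples introduced purely for illustration; they are not taken from the frameworks cited in this chapter:

```python
# Minimal sketch: attach sustainability indicators (KPIs) to the elements
# of a modular business model concept, so that specific issues can be
# located within the model. All names are hypothetical examples.

from collections import defaultdict

# Map each business model element to the indicators located there.
indicator_map = defaultdict(list)

def register(element: str, indicator: str, unit: str) -> None:
    """Attach a sustainability indicator to a business model element."""
    indicator_map[element].append((indicator, unit))

register("production logic", "CO2 emissions", "t CO2e / year")
register("production logic", "water consumption", "m3 / year")
register("marketing logic", "customer retention", "% per year")

def locate(indicator: str) -> list[str]:
    """Return the business model elements where an indicator is mapped."""
    return [e for e, kpis in indicator_map.items()
            if any(name == indicator for name, _ in kpis)]

print(locate("CO2 emissions"))  # ['production logic']
```

Such a lookup is what makes targeted adjustments possible: once an issue is located in a specific element, only that part of the model needs to be changed and re-assessed.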

7.3.2.3 The Five Business Model Logics

Based on the definition by Osterwalder and Pigneur (2009: 14) that "[a]
business model describes the rationale of how an organization creates,
delivers, and captures value", we consider business models to express the
fundamental rationale, or logic, of organisational value creation. This fun-
damental logic can be divided into secondary logics, analogous to busi-
ness model pillars (Osterwalder 2004), dimensions (Al-Debei and Avison
2010), or partial models (Wirtz 2011). Building on an extensive review of
34 business model concepts published between 2001 and 2014, we define
the following five secondary logics as the essence of a business model: the
marketing logic, financial logic, capabilities and resources logic, produc-
tion logic, and contextual logic (Fig. 7.3).2
Templates for these logics were found, amongst others, in the frame-
works developed by Abdelkafi et al. (2013), Al-Debei and Avison
(2010), Bieger and Reinhold (2011), Johnson (2010), Lambert (2012),
Osterwalder and Pigneur (2009), Rusnjak (2014), and Schallmo (2014).
Since we define the contextual logic as an explicit link to a business
model's wider stakeholder environment, and thus as a potential link to
sustainability issues that are usually not integrated into business model
concepts, only very few publications were found that propose, or at least
discuss, comparable conceptualisations (Lüdeke-Freund 2009; Schaltegger
et al. 2003; Upward 2013).

Fig. 7.3 The five generic business model logics. [Figure: the marketing,
production, capabilities & resources, and financial logics, framed by the
contextual logic.]

The five secondary logics are basically defined as follows:
The marketing logic describes the interaction between a firm and its tar-
get customers. It comprises the value proposition and customer interface,
and describes what is offered, how it is offered and delivered, and how
the firm interacts with its customers. It is enabled by a firm's production
and resources and capabilities logic and represents the primary source of
market-based revenues. The marketing logic relates to the value delivery
function of a business model.
The financial logic describes the costs incurred within the production
logic, as well as within the resources and capabilities logic. It also includes
the revenues generated within the marketing logic. It focuses on cost driv-
ers and how a firm generates revenues to cover costs and remain financially
viable. The financial logic relates to the value capture function of a busi-
ness model.
The capabilities and resources logic describes the foundations of the pro-
duction and marketing logic. It comprises the requirements in terms of
infrastructure, people, knowledge, and capabilities in order to enable pro-
duction and marketing activities. The development, acquisition, and main-
tenance of resources and capabilities incur costs that are captured by the
financial logic. The capabilities and resources logic, together with the pro-
duction logic, relates to the value creation function of a business model.
The production logic describes the activities that need to be performed
to create a business model's value proposition. In addition to activities
within the firm's boundaries, it includes activities carried out by partners.
Costs associated with the production logic are taken into account by the
financial logic. The production logic, together with the capabilities and
resources logic, relates to the value creation function of a business model.
The contextual logic comprises aspects that are crucial for the function-
ing of the business model but are situated outside the other four business
model logics and maybe even outside the market. This includes, for exam-
ple, legal requirements, technological changes, and societal aspects like a
firm's public reputation. The contextual logic expresses a business model's
value framing with regard to its socio-cultural, political, legal, economic,
and technological spheres.
These logics are interdependent (Fig. 7.3). That is, a complete descrip-
tion of a business model's fundamental value creation logic requires con-
sidering all five secondary logics. An exception is the contextual logic
which frames the other logics. It describes the context within which the
others are embedded and which cannot, or only to a limited extent, be
influenced by the firm itself.
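As a simple illustration of the business model as a "concept of concepts", the following sketch represents a business model as a record with one field per secondary logic; the completeness check reflects the statement that a full description requires all five logics. The field names follow the chapter, but the record layout and validation rule are our own simplifications:

```python
# Illustrative sketch of the five secondary logics as a record type.
# The field names follow the chapter; the completeness rule is an
# assumption made for illustration.

from dataclasses import dataclass, fields

@dataclass
class BusinessModel:
    marketing_logic: str               # value proposition, customer interface
    financial_logic: str               # cost drivers and revenue generation
    capabilities_resources_logic: str  # infrastructure, people, knowledge
    production_logic: str              # activities creating the value proposition
    contextual_logic: str              # legal, technological, societal framing

def is_complete(bm: BusinessModel) -> bool:
    """A full description requires all five secondary logics to be filled in."""
    return all(getattr(bm, f.name).strip() for f in fields(bm))

# Hypothetical example instance:
example = BusinessModel(
    marketing_logic="eco-label products sold via subscription",
    financial_logic="subscription revenues cover sourcing costs",
    capabilities_resources_logic="certified suppliers and skilled staff",
    production_logic="local assembly with partner logistics",
    contextual_logic="supportive regulation and reputation effects",
)
```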

7.3.3 The Sustainability Balanced Scorecard


The second major concept that forms the basis of our proposed SUST-
BMA framework is the SBSC (for a current literature review see Hansen
and Schaltegger 2016), which is an extension of Kaplan and Norton's
(1992, 1996) original BSC.
The BSC was developed as a reaction to one-sided, short-term, and
past-oriented management practices that mainly relied on quantitative
performance measurement and tended to overemphasise purely financial
indicators (Johnson and Kaplan 1987; Kaplan and Norton 1992, 1996).
The BSC was introduced as an alternative concept of multilevel perfor-
mance measurement and management that balances financial measures
(results from past activities) and operational measures (drivers of future
performance) and helps to assess corporate performance in several areas
simultaneously. The crucial point is that these operational measures and
related KPIs can have different characteristics: quantitative and qualitative,
financial and non-financial.
For this reason, corporate sustainability management scholars identified
the BSC as a promising starting point for the development of an integrated
sustainability performance measurement and management concept. The
SBSC was developed with the aim to integrate non-monetary, qualita-
tive, and sometimes "soft" factors related to an organisation's environ-
mental and social performance (cf. Hansen and Schaltegger 2016; Figge
et al. 2002; Schaltegger 2011; Schaltegger and Dyllick 2002; Schaltegger
and Wagner 2006a, 2006b).

7.3.3.1 Balanced Scorecard


In its default layout, the BSC is based on four perspectives which are the
financial, customer, internal business process, and learning and growth
perspective. However, the BSC concept is explicitly open to modifications
of its layout (Kaplan and Norton 1992, 1996). Reflecting on these per-
spectives broadens managers views beyond financial KPIs. However, the
financial perspective with its objectives and measures is at the top of the
BSC and serves as the starting point for the BSC process (see Kaplan and
Norton (1996) for a more detailed description of the four perspectives).

Financial perspective: All perspectives are directed towards the finan-
cial perspective, which includes measures that account for bottom-
line improvements through strategy implementation and execution.
Objectives and measures refer to profitability (e.g. operating income,
return-on-capital-employed, economic value-added), sales growth,
shareholder value, or cash flow generation. Economic performance
and viability as the main objectives are directly linked to market suc-
cess and customers.
Customer perspective: This perspective helps to identify current and
future market segments and customers. Customers are mainly con-
cerned about time, quality, service, and cost of offerings; thus, it
is important to understand how a firm is performing against these
criteria from its customers' point of view. The task is to evaluate
what they really value, today and in the future, and translate this into
value propositions that lead to customer satisfaction and retention.
Internal business process perspective: Here, the focus is on the internal
value-chain. It defines what the company must do to provide attrac-
tive customer value propositions and to realise an adequate financial
performance for shareholders. Critical innovation and operations
processes are identified, referring to product design and develop-
ment, manufacturing, marketing, and post-sale services. Executives
have to identify core competencies and technologies needed to suc-
ceed in both short- and long-term value creation.
Innovation and learning perspective: Global competition and chang-
ing business environments require companies to innovate, improve,
and learn continuously to develop better processes and offer compel-
ling value propositions. The ability of organisational learning is based
on employees, IT systems, and organisational quality. The innova-
tion and learning perspective identifies the infrastructure underlying
the other three perspectives.
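The structure shared by the four perspectives (objectives, measures, targets, and initiatives, cf. Fig. 7.4) can be sketched as a simple record; the layout below is an illustrative assumption, not part of Kaplan and Norton's concept:

```python
# Hedged sketch: each BSC perspective carries objectives, measures,
# targets, and initiatives. Perspective names follow Kaplan and Norton
# (1992, 1996); the record layout is an illustrative assumption.

from dataclasses import dataclass, field

@dataclass
class Perspective:
    name: str
    objectives: list = field(default_factory=list)
    measures: list = field(default_factory=list)   # KPIs, financial or not
    targets: dict = field(default_factory=dict)    # measure -> target value
    initiatives: list = field(default_factory=list)

# The default BSC layout consists of four such perspectives:
scorecard = [Perspective(n) for n in (
    "financial", "customer", "internal business process",
    "learning and growth")]
```

The point of the sketch is that all four perspectives share one structure, which is what makes the layout open to modification, for example, by appending a fifth perspective (see Sect. 7.3.3.2).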

Fig. 7.4 shows these four basic perspectives. Their hierarchical relation-
ships become clear when distinct indicators and causal chains are devel-
oped as part of a "strategy map" (Kaplan and Norton 2000).

Fig. 7.4 Basic perspectives of the balanced scorecard concept (Kaplan and
Norton 1996: 9). [Figure: "Vision and Strategy" at the centre, linked to the
financial perspective, customer perspective, internal process perspective, and
learning and growth perspective, each specified through objectives, measures,
targets, and initiatives.]

7.3.3.2 Sustainability Balanced Scorecard


Different approaches to developing SBSCs are proposed in the litera-
ture (cf. Figge et al. 2002; Hansen and Schaltegger 2016; Schaltegger
and Dyllick 2002). Two often discussed techniques are subsumption and
addition: SBSCs can be developed either by subsuming environmental and
social aspects to the basic BSC perspectives and/or adding a non-market
perspective. Further methods and SBSC layouts are discussed in Hansen
and Schaltegger's (2016) extensive literature review.
Subsumption requires the identification of ecological and social aspects'
relevance for the organisation's strategy and the definition of correspond-
ing strategic objectives and performance drivers. The resulting leading and
lagging indicators, as well as objectives and measures, then have to be
integrated into the existing four perspectives. An advantage is the direct
integration into cause-and-effect chains and orientation towards superior
financial objectives. This method requires ecological and social aspects to
be already incorporated in the market system – the four basic perspectives
do not go beyond the market mechanism, that is, market prices and trans-
actions (Figge et al. 2002).
However, most sustainability issues are currently being treated as exter-
nalities and are not reflected in market prices and transactions. Strategically
relevant issues are often neglected as they appear in the socio-cultural or
legal sphere and are thus not transformed into strategic objectives or
performance drivers (cf. Schaltegger and Burritt 2005). Therefore, Figge
et al. (2002) propose the introduction of a fifth, non-market perspective
(addition). Non-financial ecological and/or social aspects with a strategic
influence on the organisation's performance are included in this perspec-
tive. Its influence on the financial perspective can be either direct or indi-
rect through the other perspectives (Fig. 7.5). The addition of an explicit
non-market perspective must be justified through ecological and social
aspects from outside the market system that influence the implementation
and execution of the respective organisation's strategy. The challenge thus
is to identify previously unrecognised strategic influences from outside the
market.

7.3.4 The Basic SUST-BMA Framework


Some business model scholars point to similarities between the BSC and
the business model concept (e.g. Osterwalder 2004; Schallmo 2014;
Upward 2013). These similarities are rooted in their complementary roles
in ensuring the achievement of strategic aims through day-to-day business
operations. Their direct kinship is most obvious in Osterwalder's (2004)
original Business Model Ontology, which is partly based on Kaplan and
Norton's BSC. Building on this conceptual relationship, Lüdeke-Freund
(2009) discusses the possibility of using the SBSC as a basis for develop-
ing a sustainability-oriented business model template. Taking a related
approach, we propose to combine the business model concept, under-
stood as the interplay of five different logics (Sect. 7.3.2.3), and the SBSC
(Sect. 7.3.3.2) to construct the basic SUST-BMA framework (Fig. 7.6).

Fig. 7.5 Basic layout of an SBSC with a fifth, non-market perspective (Figge
et al. 2002). [Figure: the non-market perspective is placed above the financial
perspective; "Vision and Strategy" at the centre connects the financial, customer,
internal process, and learning and growth perspectives, each specified through
objectives, measures, targets, and initiatives.]
Fig. 7.6 provides an overview of the proposed SUST-BMA framework
with the business model on the left side, showing the rearranged five busi-
ness model logics, and the SBSC on the right side, showing the four basic
perspectives and a fifth non-market perspective.
Beginning at the top of Fig. 7.6, an organisation's strategy guides both
the business model design (moving left towards the business model; for
this link, see Schaltegger et al. (2012)) and the development of perfor-
mance measurement and management systems (moving right towards
the SBSC; for this link, see Figge et al. (2002)). The business model,
that is, its five logics, is operationalised by different organisational func-
tions (finance, marketing, production, development, and environmental
and social management function, respectively), which contribute to the
various value creation processes that an organisation is involved in (centre
of Fig. 7.6). Similarly, the SBSC concept is operationalised through the
definition of goals and measures relating to each perspective, mirroring
the business model logics. These goals and measures ensure that the value
creation process, based on the five logics, is performed in line with an
organisation's strategic aims, which in turn should reflect its overarching
vision and mission (Breuer and Lüdeke-Freund 2017a, 2017b).
While the business model represents the different logics of how an
organisation creates value, the SBSC assesses the realisation of its strategic
goals and measures. Thus, the SBSC supports performance measurement
and management and the business model concept supports value creation
management. Matching both concepts' structures ensures that they can be
applied in unison with a clear focus on organisational value creation (e.g.
in terms of shareholder value or jobs), delivery (e.g. through dividends
or hiring staff), and capture (e.g. in terms of financial profit for the focal
organisation).
The financial logic, with its focus on how costs and revenues are gener-
ated and balanced, is implemented through the finance function. Here,
activities relating to financial transactions, investment planning, and so
forth are coordinated. These actions are central to the financial value
created by an organisation, for example, as dividends paid to sharehold-
ers. The SBSC translates overarching strategic aims into departmental or
even individual goals and measures (Kaplan and Norton 1992, 1996), as
the business model translates a strategy into operations (cf. Casadesus-
Masanell and Ricart 2010). Within the financial logic and financial per-
spective, it has to be ensured that the value which is created is consistent
with the value that the business set out to create.

Fig. 7.6 The basic SUST-BMA framework. [Figure: strategy guides both the
business model design (left) and the performance measurement and management
concept (right). The five business model logics – financial, marketing,
production, capabilities & resources, and contextual – are implemented through
the corresponding organisational functions (finance, marketing, production,
development, environmental and social management), whose outcomes
constitute the created value (e.g. shareholder value and dividends; customer
solutions, products, and services; processes, including process innovation,
production, and logistics; jobs; tax payments and environmental benefits). Each
function's goals and measures are mirrored by the SBSC perspectives (financial,
customer, internal process, learning & growth, non-market). Both sides lead to
performance.]
The customer-focused marketing logic is implemented through an
organisation's marketing function, which delivers customer value in the
form of problem solutions ("jobs-to-be-done", Johnson 2010), products,
and services. The accordingly oriented customer perspective of the SBSC
defines the goals and measures that move the business's marketing logic
towards the achievement of an organisation's strategic customer and mar-
ket aims.
The production logic comprises the processes and activities taking place
within the business model and is implemented through an organisation's
production function. One of these activities, for example, is the procure-
ment of production inputs from suppliers. The value created through the
production logic is the purchase that supports suppliers' continued exis-
tence. The SBSC's internal process perspective provides the goals and mea-
sures that keep the production function on course.
The capability and resources logic is implemented through the organisa-
tional development function which looks at the assets and people within
an organisation, their capabilities and knowledge as well as their further
development. Value is created, for example, in the form of jobs and career
opportunities. The objectives and measures in the learning and growth
perspective ensure that the value created through the development func-
tion enhances the foundation for the other perspectives and respective
business model logics, for example, through skilled people who are capa-
ble of creating value together.
The contextual logic comprises political and legal requirements, tech-
nological change, and further ecological and socio-economic aspects
with significant impacts on an organisation's business model. This logic
is implemented mainly through the sustainability management function
(cf. Lüdeke-Freund 2009, 2010). The non-market perspective includes
goals and measures related to these aspects, which are usually not cap-
tured within the other perspectives (see Figge et al. (2002); Sect. 7.3.3.2).
It helps to steer organisational activities in relation to issues that may have
(severe) direct and indirect effects on the other perspectives and business
model logics, for example, through effects on an organisation's reputation
or its ability to attract intrinsically motivated staff (cf. Schaltegger and
Wagner 2006b).
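The alignment developed in the preceding paragraphs, in which each business model logic is implemented by an organisational function and steered by an SBSC perspective, can be summarised in a small lookup table. The triples restate the mapping shown in Fig. 7.6; the helper function `perspective_for` is a hypothetical addition for illustration:

```python
# Compact restatement of the logic-function-perspective alignment of the
# basic SUST-BMA framework. The mapping follows Fig. 7.6; the helper
# function is an illustrative assumption.

ALIGNMENT = {
    # business model logic: (organisational function, SBSC perspective)
    "financial": ("finance", "financial"),
    "marketing": ("marketing", "customer"),
    "production": ("production", "internal process"),
    "capabilities & resources": ("development", "learning and growth"),
    "contextual": ("environmental and social management", "non-market"),
}

def perspective_for(logic: str) -> str:
    """Return the SBSC perspective aligned with a business model logic."""
    return ALIGNMENT[logic][1]

print(perspective_for("contextual"))  # non-market
```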
In summary, the basic SUST-BMA framework suggests that perfor-
mance can be defined as the total of all different values that are created,
delivered, and captured by an organisation's business model (bottom line
of Fig. 7.6). Assessing and managing the respective processes of value cre-
ation, delivery, and capture becomes possible with an SBSC that is aligned
with the five business model logics, and vice versa. Applying the business
model concept and the SBSC in parallel is proposed as an approach to sup-
port the assessment and management of a business model's sustainability
performance – which is a precondition for the identification of business
case effects, as discussed in Sect. 7.2.

7.4 The SUST-BMA Process


Whereas the SBSC is an important tool for the continuous measurement
and management of an organisation's sustainability performance, and
thus the comparison of actual and planned results in the different
perspectives (Hansen and Schaltegger 2016), the approach proposed in Fig.
7.6 is explicitly meant to support decision-making regarding an
organisation's business model and its innovation. That is, by focusing on the
business model as a new unit of analysis (Stähler 2002), instead of a whole
organisation or business unit, we define a new field of application for the
SBSC. But while the above framework can only serve as a generic frame
that offers a new perspective and conceptual foundation, ways have to
be found to generate the necessary information to breathe life into this
frame. The practical assessment and management of business models is
impossible without information that has been generated and structured
accordingly.
Therefore, the SUST-BMA process is set up as a three-step approach:
the first step is the definition of the unit of analysis, that is, a particular
business model under consideration; the second step is the identification
of the most relevant sustainability aspects within this business model ("hot
spots"); and the third step is the assessment of its performance with regard
to these aspects.
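As a purely illustrative sketch (all function names, data structures, and values below are our assumptions, not part of the SUST-BMA framework itself), the three-step process could be expressed as a simple pipeline:

```python
def define_unit_of_analysis(name, logics):
    """Step 1: capture the business model under consideration,
    described here as a plain mapping of business model logics to activities."""
    return {"name": name, "logics": logics}

def identify_hot_spots(priorities, threshold=0.7):
    """Step 2: keep only the most relevant sustainability aspects ('hot spots'),
    ranked by an assumed priority score in [0, 1]."""
    ranked = sorted(priorities.items(), key=lambda item: item[1], reverse=True)
    return [aspect for aspect, score in ranked if score >= threshold]

def assess_performance(hot_spots, indicator_values):
    """Step 3: assess performance with regard to the identified hot spots."""
    return {aspect: indicator_values.get(aspect, "no indicator defined")
            for aspect in hot_spots}

# Hypothetical walk-through of the three steps
model = define_unit_of_analysis(
    "car-sharing service",
    {"production": ["fleet operation"], "contextual": ["emission regulation"]})
hot_spots = identify_hot_spots({"fleet emissions": 0.9, "office waste": 0.3})
assessment = assess_performance(hot_spots, {"fleet emissions": "120 g CO2/km"})
```

The point of the sketch is only the separation of concerns: the unit of analysis is fixed first, prioritisation happens before measurement, and indicators are attached only to the aspects that survived prioritisation.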

7.4.1 Defining the Unit of Analysis


SUSTAINABILITY-ORIENTED BUSINESS MODEL ASSESSMENT... 195

In order to assess empirical business models' sustainability performance,
they must be captured, described, and understood (see business model
functions in Sect. 7.3.2.2). Here, business model concepts come into play
for their re-construction. Osterwalder and Pigneur's (2009) Canvas has
become the leading visualisation tool because of its intuitive and easy-to-learn
approach to capturing, describing, and understanding business
models. We propose a different conceptualisation based on the five business
model logics defined above (Sect. 7.3.2.3). This conceptualisation
shall also allow for rigorous business model descriptions, but in a way that
is more aligned with the SBSC (Sect. 7.3.4). The business model concept
described above, including its contextual logic, shall guide the capturing,
description, and understanding of the empirical unit of analysis and make
it accessible for the SUST-BMA process.

7.4.2 Identifying Relevant Sustainability Aspects


Once the unit of analysis has been captured, it must be decided which sus-
tainability aspects are to be assessed. Ideally, the entire breadth of possible
sustainability aspects should be evaluated to avoid overlooking important
aspects, and thus potentially critical negative ecological or social business
model effects. However, given the myriad of theoretically possible aspects,
a prioritisation system is needed to identify the most significant aspects
that should be assessed in detail.
The framework created by the Global Reporting Initiative (GRI) offers
a compilation of sustainability aspects that is used as a guideline for
sustainability reporting worldwide and across diverse industries. The current
edition, G4, comprises 46 ecological, social, and economic aspects
with a total of 91 indicators that an organisation could report on: the
so-called specific standard disclosures (GRI 2013a, pp. 22–23). Since its
inception in 2000, the framework has been reviewed and revised several
times in an interactive process involving various stakeholder groups. G4
can therefore be considered the most comprehensive and accepted
compilation of sustainability aspects and can be used as an approximation
to a rigorous list of relevant sustainability aspects. In addition to its practical
relevance (the GRI database contains 17,000 sustainability reports as
of December 2014; see note 3), the GRI framework has also been acknowledged in
academia. It has been studied and used in a variety of scientific articles, for
example, with regard to its practical usage and acceptance or its suitability
as an assessment framework (e.g. del Mar Alonso-Almeida et al. 2014;
Kraut et al. 2012; Marimon et al. 2012; Toppinen et al. 2012).
The G4 edition offers a prioritisation tool known as the "materiality matrix".
This tool can be used to identify material, that is, relevant, aspects that
"[r]eflect the organization's significant economic, environmental and
social impacts; or [s]ubstantively influence the assessments and decisions
of stakeholders" (GRI 2013b: 11). Since an organisation cannot deal with
all possible sustainability aspects, GRI suggests a stakeholder-oriented
approach to defining priorities. That is, materiality matrices and the
priorities assigned to different sustainability aspects should be defined on
the basis of an organisation's as well as its stakeholders' perceptions and
judgments. The significance of sustainability aspects to the business model
is shown on the horizontal axis in Fig. 7.7, and their influence on
stakeholder assessments and decisions on the vertical axis. The aspects in the
top-right corner have the highest priority and those in the bottom-left
corner the lowest.
Accordingly, this tool can be used to identify sustainability aspects with
different priorities within business models, where those with the highest
priority could be thought of as business model "hot spots". Once identified,
these hot spots can be mapped to particular business model logics
and be assessed in detail (see below), for example, high resource consumption
caused by a particular production logic, or high end-of-life impacts
due to customer expectations such as convenience and easy disposal.
Appropriate measures, such as resource- or customer-oriented KPIs, will
be provided by the respective SBSC perspectives, here, the internal process
perspective and the customer perspective (cf. Fig. 7.6).
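To make the two-axis prioritisation concrete, the following sketch (a hypothetical illustration; the aspect names, scores, and threshold are our assumptions, not GRI's) classifies aspects scored on both materiality-matrix axes and ranks the resulting hot spots:

```python
def prioritise(aspects, threshold=0.5):
    """Split sustainability aspects into hot spots and lower-priority aspects.

    aspects maps each aspect name to a pair
    (significance_for_business, influence_on_stakeholders), both in [0, 1].
    Aspects in the top-right quadrant of the materiality matrix, i.e. with
    both scores at or above the threshold, count as hot spots.
    """
    hot_spots, lower_priority = [], []
    for name, (significance, influence) in aspects.items():
        if significance >= threshold and influence >= threshold:
            hot_spots.append(name)
        else:
            lower_priority.append(name)
    # Highest combined score first, mirroring the top-right corner of the matrix
    hot_spots.sort(key=lambda name: sum(aspects[name]), reverse=True)
    return hot_spots, lower_priority

scores = {"resource consumption": (0.9, 0.8),
          "end-of-life impacts": (0.7, 0.9),
          "office noise": (0.2, 0.3)}
hot, rest = prioritise(scores)
# hot == ["resource consumption", "end-of-life impacts"], rest == ["office noise"]
```

In practice the two scores would come from the organisation's and its stakeholders' judgments rather than from fixed numbers, but the quadrant logic stays the same.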

7.4.3 Assessing Performance
Once material aspects have been identified and mapped to the business
model, appropriate indicators need to be chosen. While the organisation's
goals will be defined based on its strategy, measures for the configuration of
the SBSC can be derived from the GRI framework (e.g. GRI's G4-EN1
indicator for material usage or G4-PR5 for customer satisfaction; GRI 2013a,
2013b). However, the interplay between business model concept, SBSC,
and GRI indicators needs to be carefully adapted for the kind of performance
assessment proposed by the SUST-BMA framework: GRI's default scenario
is a whole organisation, an entrepreneurial firm or larger corporation,
rather than a business model, and these are very different units of analysis.
However, the GRI framework, its aspects, indicators, and materiality
matrix should be applicable to the purpose of SUST-BMAs. Assuming
that it is possible to identify material aspects and their location within a
business model, accordingly adapted indicators could be implemented.
Fig. 7.7 Illustration of a materiality matrix (Source: Adapted from GRI 2013b: 12).
[The figure plots sustainability aspects as points on two axes: horizontal,
significance for the business that is being assessed (low to high); vertical,
influence on stakeholder assessments and decisions (low to high).]

Other sources of indicators, for example, industry- or product-specific standards
such as the Roundtable on Sustainable Biomaterials (RSB) or other
general standards such as the Occupational Health and Safety Assessment
Series (OHSAS), as well as indicators developed by individual organisations
themselves, can also be considered, as long as they can be adapted to
measure performance with regard to a business model's material aspects.
As mentioned above, the identified sustainability aspects and the respective
indicators managed by the SBSC can be mapped to a business model's
logics, which allows identifying the location of material sustainability issues
within a business model. Depending on the logic that an indicator is mapped
to, conclusions can be drawn about necessary changes to improve a business
model's performance with regard to that indicator. Some aspects (e.g.
resource and energy consumption, emissions) might be related to several
logics. The advantage, or the very nature, of the business model concept is
that systemic linkages across several logics can be captured, described,
understood, and shared, which is another crucial function when it comes to the
evaluation and communication of a business model's sustainability
performance. Based on such a performance assessment, sustainable business model
innovations can be initiated (cf. Bocken et al. 2014; Boons and Lüdeke-Freund
2013; Schaltegger et al. 2012, 2016), the outcomes of which can in
turn be evaluated using the SUST-BMA framework. In consequence, the
SUST-BMA framework and process will result in the continuous improvement
and development of more consistent and sustainable business models.
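As a final illustrative step (a hypothetical sketch: only the two GRI indicator codes and the perspective names come from the text; the logic labels and the data structure are our assumptions), the mapping from material indicators to business model logics and SBSC perspectives could be held in a simple lookup table:

```python
# Hypothetical mapping; G4-EN1 and G4-PR5 are real GRI G4 indicators
# mentioned in the text, everything else is an illustrative assumption.
INDICATOR_MAP = {
    "G4-EN1": {"aspect": "materials used",
               "logics": ["production"],
               "perspective": "internal process"},
    "G4-PR5": {"aspect": "customer satisfaction",
               "logics": ["value proposition"],  # assumed logic label
               "perspective": "customer"},
}

def locate(indicator, mapping=INDICATOR_MAP):
    """Return the business model logics and SBSC perspective in which a
    material indicator is 'located', so that necessary changes can be
    traced back to the right part of the business model."""
    if indicator not in mapping:
        raise KeyError(f"{indicator} is not yet mapped to a business model logic")
    entry = mapping[indicator]
    return entry["logics"], entry["perspective"]
```

Because an aspect such as energy consumption may touch several logics, the `logics` field is a list rather than a single value; an indicator that cannot be located raises an error instead of silently defaulting to the whole organisation.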

7.5 Implications and Further Research


The SUST-BMA framework needs further elaboration with regard to the
details of the business model concept, or ontology, which is used to capture
the targeted empirical units of analysis. The currently used five logics
provide a general definition of, and perspective on, business models, but need
more detail in terms of distinct elements. As discussed in Sects. 7.3.2.2
and 7.4.1, describing business models accurately is the foundation for any
further analysis or assessment. Detailed and consistent descriptions are
achieved more easily with a more fine-grained ontology, such as the
entity-relationship models developed by Osterwalder (2004) or Upward
(2013). However, the trade-off between accurate and feasible descriptions
has to be resolved. The basic ontology proposed in Sect. 7.3.2.3 provides
a frame; however, further research is required to develop a more detailed
conceptualisation to achieve more precise and practically feasible business
model descriptions with clear boundaries.
Practicability and feasibility are particularly crucial in our case. The research
presented in this chapter is part of a project that involves several SMEs which
provide the empirical test cases for the SUST-BMA framework and process.
While the first two steps of our methodology (defining the unit of analysis
and identifying material sustainability aspects) have been practically tested
with the SMEs, the testing for the third step (assessing performance) is still
outstanding. Beyond the scope of the project, applying the methodology
with a larger number of more diverse organisations will be required to itera-
tively review and revise the SUST-BMA framework and process.
Experience shows that organisations carrying out sustainability assessments
and taking sustainability seriously eventually reach a point where they
need to change their business model in order to become more sustainable
(cf. Beltramello et al. 2013; Bisgaard et al. 2012; Kiron et al. 2013); or, in
terms of the theoretical business case considerations presented above, to
move to the upper right corner (see Fig. 7.1 in Sect. 7.2.1). A business
model assessment tool should therefore also be able to capture changes
in business models in a dynamic perspective, not just indirectly through
changes in sustainability indicators. Change in terms of business model
innovation and evolution should also be tracked. At the moment, the GRI
indicators do not offer any options for this. Further research should therefore
explore possibilities of assessing business models in a dynamic perspective
and develop corresponding sustainability indicators, which could also
contribute to future adaptations of the GRI framework.

Notes
1. The term "sustainable entrepreneur" or "sustainable entrepreneurship"
is meant to include any form of leadership, entrepreneurship,
or managerial activity, mainly pursued in business organisations,
that deliberately aims at the integration of ecological, social, and
economic aspects and the creation of accordingly multiple kinds of
value for the natural environment, society, and the business organisation
itself (see e.g. Schaper 2010).
2. Knowing that the literature offers far more definitions and concepts
(e.g. Zott et al. 2011), our review includes only those which
explicitly define business model elements and their relationships, provide
minimum definitions of business model functions, and give information
about their theoretical or practical context (e.g. ICT, organisation,
or strategy).
3. http://database.globalreporting.org/

References
Abdelkafi, Nizar, Sergiy Makhotin, and Thorsten Posselt. 2013. Business model
innovations for electric mobility: What can be learned from existing business
model patterns? International Journal of Innovation Management 17(1):
1340003.
Afuah, Allan. 2004. Business Models: A Strategic Management Approach. New York:
McGraw-Hill.
Al-Debei, Mutaz M., and David Avison. 2010. Developing a unified framework of
the business model concept. European Journal of Information Systems 19(3):
359–376.
Alt, Rainer, and Hans-Dieter Zimmermann. 2014. Editorial 24/4: Electronic
markets and business models. Electronic Markets 24(4): 231–234.
Amit, Raphael, and Christoph Zott. 2001. Value creation in e-business. Strategic
Management Journal 22(6/7): 493–520.
Arend, Richard J. 2013. The business model: Present and future, beyond a
skeuomorph. Strategic Organization 11(4): 390–402.
Baden-Fuller, Charles, Benoît Demil, Xavier Lecocq, and Ian MacMillan. 2010.
Editorial. Long Range Planning 43(2–3): 143–145.
Baden-Fuller, Charles, and Mary S. Morgan. 2010. Business models as models.
Long Range Planning 43(2): 156–171.
Beattie, Vivien, and Sarah Jane Smith. 2013. Value creation and business models:
Refocusing the intellectual capital debate. The British Accounting Review 45(4):
243–254.
Beltramello, Andrea, Linda Haie-Fayle, and Dirk Pilat. 2013. Why New Business
Models Matter for Green Growth. Paris: OECD.
Bidmon, Christina Melanie, and Sebastian Knab. 2014. The three roles of business
models for socio-technical transitions. In The Proceedings of XXV ISPIM
Conference: Innovation for Sustainable Economy and Society, 8–11.
Bieger, Thomas, and Stephan Reinhold. 2011. Das wertbasierte Geschäftsmodell:
Ein aktualisierter Strukturierungsansatz. In Innovative Geschäftsmodelle, 13–70.
Berlin: Springer.
Bisgaard, Tanja, K. Henriksen, and M. Bjerre. 2012. Green business model
innovation: Conceptualisation, next practice and policy. Oslo: Nordic Innovation.
Bocken, N.M.P., S.W. Short, P. Rana, and S. Evans. 2014. A literature and practice
review to develop sustainable business model archetypes. Journal of Cleaner
Production 65: 42–56.
Boons, Frank, and Florian Lüdeke-Freund. 2013. Business models for sustainable
innovation: State-of-the-art and steps towards a research agenda. Journal of
Cleaner Production 45: 9–19.
Boons, Frank, Carlos Montalvo, Jaco Quist, and Marcus Wagner. 2013. Sustainable
innovation, business models and economic performance: An overview. Journal
of Cleaner Production 45: 1–8.
Breuer, Henning, and Florian Lüdeke-Freund. 2017a. Values-Based Innovation
Management: Innovating by What We Care About. Houndmills: Palgrave.
. 2017b. Values-based network and business model innovation. International
Journal of Innovation Management 21(3): 35. Art. 1750028.
Burritt, Roger L., and Stefan Schaltegger. 2010. Sustainability accounting and
reporting: Fad or trend? Accounting, Auditing & Accountability Journal 23(7):
829–846.
Camponovo, Giovanni, Yves Pigneur, and S. Lausanne. 2004. Information sys-
tems alignment in uncertain environments. Proceedings of Decision Support
Systems (DSS).
Carayannis, Elias G., Stavros Sindakis, and Christian Walter. 2014. Business model
innovation as lever of organizational sustainability. The Journal of Technology
Transfer 40(1): 120.
Casadesus-Masanell, Ramon, and Joan Enric Ricart. 2010. From strategy to business
models and onto tactics. Long Range Planning 43(2): 195–215.
Charter, Martin, Casper Gray, Tom Clark, and Tim Woolman. 2008. Review: The
role of business in realizing sustainable consumption and production. In System
Innovation for Sustainability 1: Perspectives on Radical Changes to Sustainable
Consumption and Production, 46–69. Sheffield, UK: Greenleaf Publishing in
association with GSE Research.
Chesbrough, Henry. 2010. Business model innovation: Opportunities and barriers.
Long Range Planning 43(2): 354–363.
Chesbrough, Henry, and Richard S. Rosenbloom. 2002. The role of the business
model in capturing value from innovation: Evidence from Xerox Corporation's
technology spin-off companies. Industrial and Corporate Change 11(3):
529–555.
Clinton, L., and R. Whisnant. 2014. Model behavior: 20 business model innovations
for sustainability. SustainAbility Report.
Cohen, Boyd, and Jan Kietzmann. 2014. Ride on! Mobility business models for
the sharing economy. Organization & Environment 27(3): 279–296.
Cohen, Boyd, and Monika I. Winn. 2007. Market imperfections, opportunity and
sustainable entrepreneurship. Journal of Business Venturing 22(1): 29–49.
del Mar Alonso-Almeida, María, Josep Llach, and Frederic Marimon. 2014. A
closer look at the Global Reporting Initiative sustainability reporting as a tool
to implement environmental and social policies: A worldwide sector analysis.
Corporate Social Responsibility and Environmental Management 21(6):
318–335.
Demil, Benoît, and Xavier Lecocq. 2010. Business model evolution: In search of
dynamic consistency. Long Range Planning 43(2): 227–246.
Doganova, Liliana, and Marie Eyquem-Renault. 2009. What do business models
do?: Innovation devices in technology entrepreneurship. Research Policy
38(10): 1559–1570.
Ehrenfeld, John, and Andrew Hoffman. 2013. Flourishing: A Frank Conversation
about Sustainability. California: Stanford University Press.
Figge, Frank, Tobias Hahn, Stefan Schaltegger, and Marcus Wagner. 2002. The
sustainability balanced scorecard: Linking sustainability management to business
strategy. Business Strategy and the Environment 11(5): 269–284.
Grassl, Wolfgang. 2012. Business models of social enterprise: A design approach
to hybridity. ACRN Journal of Social Entrepreneurship Perspectives 1(1): 37–60.
GRI. 2013a. Reporting Principles and Standard Disclosures. Amsterdam: Global
Reporting Initiative.
. 2013b. G4 Sustainability Reporting Guidelines: Implementation Manual.
Amsterdam: Global Reporting Initiative.
Hahn, Tobias, Frank Figge, Jonatan Pinkse, and Lutz Preuss. 2010. Trade-offs in
corporate sustainability: You can't have your cake and eat it. Business Strategy
and the Environment 19(4): 217–229.
Hamel, Gary. 2000. Leading the Revolution: How to Thrive in Turbulent Times by
Making Innovation a Way of Life. Boston: Harvard Business School Press.
Hansen, Erik, and Stefan Schaltegger. 2016. The Sustainability Balanced Scorecard:
A systematic review of architectures. Journal of Business Ethics 133(2):
193–221.
Hansen, Erik G., Friedrich Grosse-Dunker, and Ralf Reichwald. 2009.
Sustainability innovation cube: A framework to evaluate sustainability-oriented
innovations. International Journal of Innovation Management 13(4):
683–713.
Hedman, Jonas, and Thomas Kalling. 2003. The business model concept:
Theoretical underpinnings and empirical illustrations. European Journal of
Information Systems 12(1): 49–59.
Hockerts, Kai, and Rolf Wüstenhagen. 2010. Greening Goliaths versus emerging
Davids: Theorizing about the role of incumbents and new entrants in sustainable
entrepreneurship. Journal of Business Venturing 25(5): 481–492.
Johnson, Mark W. 2010. Seizing the White Space: Business Model Innovation for
Growth and Renewal. Brighton: Harvard Business Press.
Johnson, H. Thomas, and Robert S. Kaplan. 1987. The rise and fall of management
accounting. IEEE Engineering Management Review 3(15): 36–44.
Johnson, Mark W., and Josh Suskewicz. 2009. How to jump-start the clean
economy. Harvard Business Review 87(11): 52–60.
Johnson, Mark W., Clayton M. Christensen, and Henning Kagermann. 2008.
Reinventing your business model. Harvard Business Review 86(12): 50–59.
Kaplan, Robert S., and David P. Norton. 1992. The balanced scorecard: Measures
that drive performance. Harvard Business Review 70(1): 71–79.
. 1996. Using the balanced scorecard as a strategic management system.
Harvard Business Review 74(1): 75–85.
. 2000. Having trouble with your strategy? Then map it. Harvard Business
Review 78(5): 110.
Kiron, David, Nina Kruschwitz, Knut Haanaes, Martin Reeves, and Eugene
Goh. 2013. The innovation bottom line. MIT Sloan Management Review
54(3): 1.
Kraut, Marla, Philip Dennis, and Heidi Connole. 2012. The efficacy of voluntary
disclosure: A study of water disclosures by mining companies using the global
reporting initiative framework. Academy of Accounting and Financial Studies
17(2): 23.
Lambert, Susan Christine. 2010. Progressing business model research towards
mid-range theory building. PhD diss., University of South Australia.
. 2012. A Multi-Purpose Hierarchical Business Model Framework. Centre
for Accounting, Governance and Sustainability, School of Commerce,
University of South Australia.
Lankoski, Leena. 2006. Environmental and economic performance: The basic links.
In Managing the Business Case for Sustainability, eds. S. Schaltegger and
M. Wagner, 32–46. Sheffield: Greenleaf Publishing.
Laukkanen, Minttu, and Samuli Patala. 2014. Analyzing barriers to sustainable
business model innovations: Innovation systems approach. The Proceedings of
XXV ISPIM Conference: Innovation for Sustainable Economy and Society,
Dublin, Ireland.
Linder, J.C., and S. Cantrell. 2000. Changing business models: Surveying the
landscape. White paper, Institute for Strategic Change, Accenture. Available at
http://www.accenture.com/xd/xd.asp.
Loock, Moritz. 2012. Going beyond best technology and lowest price: On renewable
energy investors' preference for service-driven business models. Energy
Policy 40: 21–27.
Lüdeke-Freund, Florian. 2009. Business model concepts in corporate sustainability
contexts: From rhetoric to a generic template for "Business Models for
Sustainability". Centre for Sustainability Management (CSM), Leuphana
Universität Lüneburg.
. 2010. Towards a conceptual framework of "Business Models for
Sustainability". In Knowledge Collaboration & Learning for Sustainable Innovation,
eds. R. Wever, J. Quist, A. Tukker, J. Woudstra, F. Boons, and N. Beute. Delft.
. 2013. Business models for sustainability innovation: Conceptual foundations
and the case of solar energy. PhD diss. Lüneburg: Leuphana University.
. 2014. BP's solar business model: A case study on BP's solar business case
and its drivers. International Journal of Business Environment 6(3): 300–328.
Magretta, Joan. 2002. Why business models matter. Harvard Business Review
80(5): 86–92.
Marimon, Frederic, María del Mar Alonso-Almeida, Martha del Pilar Rodríguez, and
Klender Aimer Cortez Alejandro. 2012. The worldwide diffusion of the Global
Reporting Initiative: What is the point? Journal of Cleaner Production 33: 132–144.
McDonough, William, and Michael Braungart. 2013. The Upcycle. First ed.
New York: North Point Press.
McElroy, Mark W., and Jo M.L. Van Engelen. 2012. Corporate Sustainability
Management: The Art and Science of Managing Non-Financial Performance.
Hoboken: Taylor and Francis.
Morris, Michael, Minet Schindehutte, and Jeffrey Allen. 2005. The entrepreneur's
business model: Toward a unified perspective. Journal of Business Research
58(6): 726–735.
Muehlhausen, Jim. 2013. Business Models for Dummies. Hoboken, NJ: John Wiley
& Sons.
Osterwalder, Alexander. 2004. The Business Model Ontology: A Proposition in a
Design Science Approach. PhD Thesis. Lausanne: Université de Lausanne.
Osterwalder, Alexander, and Yves Pigneur. 2009. Business Model Generation: A
Handbook for Visionaries, Game Changers, and Challengers. Amsterdam:
Modderman Drukwerk.
Osterwalder, Alexander, Yves Pigneur, and Christopher L.Tucci. 2005. Clarifying
business models: Origins, present, and future of the concept. Communications
of the Association for Information Systems 16(1): 1.
Pateli, Adamantia G., and George M. Giaglis. 2005. Technology innovation-induced
business model change: A contingency approach. Journal of
Organizational Change Management 18(2): 167–183.
Richter, Mario. 2012. Utilities' business models for renewable energy: A review.
Renewable and Sustainable Energy Reviews 16(5): 2483–2493.
. 2013. Business model innovation for sustainable energy: How German
municipal utilities invest in offshore wind energy. International Journal of
Technology Management 63(1–2): 24–50.
Rusnjak, Andreas. 2014. Entrepreneurial Business Modeling. Wiesbaden: Springer.
Schallmo, Daniel R.A. 2014. Checklisten und Erläuterungen zum Kompendium
Geschäftsmodell-Innovation. In Kompendium Geschäftsmodell-Innovation,
441–451. Wiesbaden: Springer.
Schaltegger, Stefan. 2011. Sustainability as a driver for corporate economic
success. Society and Economy 33(1): 15–28.
Schaltegger, Stefan, Erik Hansen, and Florian Lüdeke-Freund. 2016. Business
models for sustainability: Origins, present research, and future avenues.
Organization & Environment 29(1): 3–10.
Schaltegger, Stefan, and Florian Lüdeke-Freund. 2013. Business cases for sustainability.
In Encyclopedia of Corporate Social Responsibility, 245–252. Berlin: Springer.
Schaltegger, Stefan, and Marcus Wagner. 2006a. Managing and measuring the
business case for sustainability: Capturing the relationship between sustainability
performance, business competitiveness and economic performance. In
Managing the Business Case for Sustainability: The Integration of Social,
Environmental and Economic Performance, 1–27. Sheffield, UK: Greenleaf
Publishing in association with GSE Research.
. 2006b. Integrative management of sustainability performance, measurement
and reporting. International Journal of Accounting, Auditing and
Performance Evaluation 3(1): 1–19.
Schaltegger, Stefan, and Marcus Wagner. 2008. Types of sustainable entrepreneurship
and conditions for sustainability innovation: From the administration of a
technical challenge to the management of an entrepreneurial opportunity. In
Sustainable Innovation and Entrepreneurship, 27–48. Cheltenham: Edward Elgar.
. 2011. Sustainable entrepreneurship and sustainability innovation:
Categories and interactions. Business Strategy and the Environment 20(4):
222–237.
Schaltegger, Stefan, and Roger Burritt. 2005. Corporate sustainability. In The
International Yearbook of Environmental and Resource Economics 2005/2006,
185–222. Cheltenham: Edward Elgar.
Schaltegger, Stefan, and Terje Synnestvedt. 2002. The link between green and
economic success: Environmental management as the crucial trigger between
environmental and economic performance. Journal of Environmental
Management 65(2): 339–346.
Schaltegger, Stefan, and Thomas Dyllick. 2002. Nachhaltig Managen mit der
Balanced Scorecard: Konzept und Fallstudien. Wiesbaden: Gabler.
Schaltegger, Stefan, Roger Burritt, and Holger Petersen. 2003. An Introduction to
Corporate Environmental Management: Striving for Sustainability. Vol. 14.
UK: Emerald Group Publishing Limited.
Schaltegger, Stefan, Florian Lüdeke-Freund, and Erik G. Hansen. 2012. Business
cases for sustainability: The role of business model innovation for corporate
sustainability. International Journal of Innovation and Sustainable Development
6(2): 95–119.
Schaper, Michael T. 2010. Making Ecopreneurs: Developing Sustainable
Entrepreneurship. Corporate Social Responsibility Series. 2nd ed. Burlington, VT: Ashgate.
Schweizer, Lars. 2005. Concept and evolution of business models. Journal of
General Management 31(2): 37–56.
Seelos, Christian. 2014. Theorizing and strategizing with models: Generative
models of social enterprises. International Journal of Entrepreneurial Venturing
6(1): 6–21.
Seelos, Christian, and Johanna Mair. 2005. Social entrepreneurship: Creating new
business models to serve the poor. Business Horizons 48(3): 241–246.
. 2007. Profitable business models and market creation in the context of
deep poverty: A strategic view. The Academy of Management Perspectives 21(4):
49–63.
Shafer, Scott M., H. Jeff Smith, and Jane C. Linder. 2005. The power of business
models. Business Horizons 48(3): 199–207.
Short, Samuel W., Nancy M.P. Bocken, Claire Y. Barlow, and Marian R. Chertow.
2014. From refining sugar to growing tomatoes. Journal of Industrial Ecology
18(5): 603–618.
Stähler, Patrick. 2002. Geschäftsmodelle in der digitalen Ökonomie: Merkmale,
Strategien und Auswirkungen. Lohmar: Josef Eul Verlag.
Stubbs, Wendy, and Chris Cocklin. 2008. Conceptualizing a sustainability business
model. Organization & Environment 21(2): 103–127.
Teece, David J. 2006. Reflections on profiting from innovation. Research Policy
35(8): 1131–1146.
. 2010. Business models, business strategy and innovation. Long Range
Planning 43(2): 172–194.
Timmers, Paul. 1998. Business models for electronic markets. Electronic Markets
8(2): 3–8.
Toppinen, Anne, Ning Li, Anni Tuppura, and Ying Xiong. 2012. Corporate
responsibility and strategic groups in the forest-based industry: Exploratory
analysis based on the Global Reporting Initiative (GRI) framework. Corporate
Social Responsibility and Environmental Management 19(4): 191–205.
Tukker, Arnold, and Ursula Tischner. 2006. New Business for Old Europe: Product-Service
Development, Competitiveness and Sustainability. Sheffield: Greenleaf
Publications.
Tukker, Arnold, Martin Charter, Carlo Vezzoli, Eivind Stø, and Maj Andersen,
eds. 2008. Perspectives on Radical Changes to Sustainable Consumption and
Production. System Innovation for Sustainability. Sheffield: Greenleaf Publishing.
Upward, Antony. 2013. Towards an ontology and canvas for strongly sustainable
business models: A systemic design science exploration. PhD diss., York
University, Toronto.
Verhulst, Elli, Ivo Dewit, and Casper Boks. 2012. Implementation of sustainable
innovations and business models. Entrepreneurship Innovation Sustainability
1(25): 32–66.
Wells, Peter. 2008. Alternative business models for a sustainable automotive
industry, 80–98.
. 2013a. Business Models for Sustainability. Cheltenham: Edward Elgar
Publishing.
. 2013b. Sustainable business models and the automotive industry: A commentary.
IIMB Management Review 25(4): 228–239.
Wells, Peter, and Margarete Seitz. 2005. Business models and closed-loop supply
chains: A typology. Supply Chain Management: An International Journal
10(4): 249–251.
Wirtz, Bernd W. 2011. Business Model Management: Design-Instruments-Success
Factors. Wiesbaden: Gabler.
Wüstenhagen, Rolf, Jost Hamschmidt, Sanjay Sharma, and Mark Starik, eds. 2008.
Sustainable Innovation and Entrepreneurship. New Perspectives in Research on
Corporate Sustainability. Cheltenham: Edward Elgar.
Wüstenhagen, Rolf, and Jasper Boehnke. 2008. Business models for sustainable
energy. In Perspectives on Radical Changes to Sustainable Consumption and
Production, 70–79. Sheffield, UK: Greenleaf Publishing in association with
GSE Research.
Yunus, Muhammad, Bertrand Moingeon, and Laurence Lehmann-Ortega. 2010.
Building social business models: Lessons from the Grameen experience. Long
Range Planning 43(2): 308–325.
Zeyen, Anica, Markus Beckmann, and Roya Akhavan. 2014. Social entrepreneurship
business models: Managing innovation for social and economic value creation.
In Managementperspektiven für die Zivilgesellschaft des 21. Jahrhunderts,
107–132. Wiesbaden: Springer Fachmedien.
Zott, Christoph, and Raphael Amit. 2007. Business model design and the performance
of entrepreneurial firms. Organization Science 18(2): 181–199.
. 2008. The fit between product market strategy and business model:
Implications for firm performance. Strategic Management Journal 29(1): 1–26.
. 2010. Business model design: An activity system perspective. Long Range
Planning 43(2): 216–226.
. 2013. The business model: A theoretically anchored robust construct for
strategic analysis. Strategic Organization 11(4): 403–411.
Zott, Christoph, Raphael Amit, and Lorenzo Massa. 2011. The business model:
Recent developments and future research. Journal of Management 37(4):
1019–1042.
CHAPTER 8

Smart Decision-Making and Productivity
in the Digital World: The Case
of PATAmPOWER

Alexander Rayner

8.1 Introduction
We live in a world overloaded with data, often referred to as the Big
Data era, in which both organizations and individuals are overwhelmed
by an abundance of existing data. Nobody knows how much of the data
collected and stored is being used effectively to make data-driven decisions,
let alone how much of that data is actually understood. Yet more
and more new data continues to be created from new sources such as
mobile positioning data, wearable technologies such as the Apple Watch,
and the Internet of Things (IoT), which, according to Gartner, is
expected to interconnect nearly 26 billion devices by 2020.
Data-driven decisions can create and sustain a competitive advantage,
but it must be recognized that competing in a data-driven world is about
people being able to collaborate effectively around an ecosystem of data.
To create and maintain a competitive advantage using data, the data first
needs to be understood, and it needs to be assessed and analyzed quickly,

A. Rayner (*)
SmartData.travel Limited, Hong Kong
e-mail: alex@smartdata.travel

© The Author(s) 2017
E.G. Carayannis, S. Sindakis (eds.), Analytics, Innovation, and
Excellence-Driven Enterprise Sustainability, Palgrave Studies
in Democracy, Innovation, and Entrepreneurship for Growth,
DOI 10.1057/978-1-137-37879-8_8
to enable faster decisions. The key is speed, and technology is a tool
available for quickly making data useful. One example is the use of visualization
to communicate data clearly and efficiently to users through graphics such
as graphs and charts. Effective visualization helps users to quickly analyze
data and identify trends, making complex data easier to understand and
more useful.
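As a toy illustration of that point (not any tool PATA used, and with invented figures), even a few lines of code can turn a column of numbers into a chart-like view in which the trend is immediately visible:

```python
# Hypothetical monthly visitor-arrival figures, invented for illustration.
arrivals = {"Jan": 120_000, "Feb": 95_000, "Mar": 140_000, "Apr": 160_000}

def text_bar_chart(data, width=20):
    """Render a crude text bar chart so the trend is visible at a glance."""
    peak = max(data.values())
    return "\n".join(
        f"{label} {'#' * round(value / peak * width)} {value:,}"
        for label, value in data.items()
    )

print(text_bar_chart(arrivals))
```

A real dashboard would of course render interactive graphics rather than text bars, but the principle is the same: the dip in February and the climb toward April are obvious from the bars in a way they are not from the raw numbers.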
Global tourism continues to grow every year: in 2015 the number
of international tourist arrivals (overnight visitors) reached 1.184 billion,
with spending of US$1.4 trillion, according to the United
Nations World Tourism Organization (UNWTO). By 2030, arrivals are
expected to reach 1.8 billion, which means that on average 5 million people
will be crossing international borders every day. The important economic
and social impact of international visitors is recognized by more and more
governments; consequently, competition between destinations to attract
visitors is extremely strong and continues to intensify.
Every stage of the travel journey, from dreaming, planning and booking
to experiencing and sharing, creates an abundance of data. Consequently, an
emerging strategy among National Tourism Organizations (NTOs) is to use
data metrics and analytics for decision-making, enabling marketing to be
more targeted and focused on attracting high value-adding visitors, with a
shift away from using the number of visitor arrivals as a performance indicator
towards more meaningful indicators such as visitor expenditure, length of
stay and number of jobs in tourism.
Many segments of the travel and tourism sector have started
to adopt data-driven decision-making, the most advanced being the airline,
hotel and online travel agent segments; however, there is plenty of scope
and opportunity to improve when compared with sectors beyond travel and
tourism.
The Pacific Asia Travel Association (PATA) is a not-for-profit travel
trade membership association, established in 1951, that acts as a catalyst
for developing the Asia Pacific travel and tourism industry. In partnership
with private and public sector members from all segments of travel
and tourism, PATA's mission is to enhance the sustainable growth, value
and quality of travel and tourism to, from and within the Asia Pacific
region.
At its inception, PATA pioneered the way in which travel and tourism
was managed and promoted by thinking outside the box, a key element
of which was accurate research and intelligence. In 2010, reflecting
on PATA's achievements, founding member Matt Lurie said strategic
intelligence was, and should remain, a core focus of PATA, particularly for
smaller destinations that lack the resources to do it themselves.
Since 1964, PATA's Annual Statistical Report, later renamed the
Annual Tourism Monitor (ATM), has aggregated and disseminated data
about the Asia Pacific travel and tourism sector to PATA members and
has always been considered a core member service and key membership
benefit (Figs. 8.1 and 8.2).
Recognizing the evolving importance of data, research and intelligence,
the PATA Strategic Intelligence Centre (SIC) was established in 1997 to
focus on producing a wide range of publications and market intelligence
reports, including the ATM and Forecasts that were distributed in print,
CD and DVD formats.
Fig. 8.1 Left: Cover page of the PATA 1st annual statistical report

Fig. 8.2 Right: Cover page of the PATA annual tourism monitor 2015 early
edition

As the Internet gained popularity and widespread usage, it changed the
way data is communicated and also created an expectation for data access
on demand, at any time, and from anywhere. PATA provided a unique
member service by aggregating data from NTOs, which in the past was
difficult to access and obtain unless you personally knew the person to
contact to get the data; however, with the Internet, many NTOs made
their data available on their websites so that anyone could access and download
data. PATA members were able to access and download travel and tourism
data from websites on the Internet, at their convenience, 24/7, and no
longer needed to rely as heavily on PATA.
The massive amount of data available on the Internet continues to
increase exponentially, making the task of searching for and finding the right
data an extremely time-consuming and frustrating experience. The challenge
today is to ensure that the data is valid and that it comes from a trusted
and credible source.
After the most recent Global Financial Crisis (GFC), new regulations
and governance requirements were introduced requiring organizations
and their management to become more accountable, resulting in
the growth of data-driven decision-making and increasing demand for more
detailed data and more frequent updates. PATA's highest-paying members,
particularly governments and carriers, wanted more comprehensive
data, more frequently, with the on-demand access that the Internet facilitated.
In 2009, PATA appointed a new CEO and, in 2010, PATA management
made the decision to create a data dashboard: a web-based software
platform, a digital system for data aggregation and dissemination, in
an effort to provide more benefits relevant to its existing and potential
members.
A consultant was engaged to develop the data dashboard, but there
was no budget available for the technical development, which created
limitations for the solution, content and functionality. Working within the
given constraints, a partnership was created with a software developer that
made technical services available in exchange for exposure and promotion
to the travel and tourism sector through PATA's network reach.
The vision of the data dashboard was to enable better decisions by
PATA member travel and tourism professionals by aggregating data and
allowing faster access to it through a web-based One Stop Shop: a dynamic
reporting mechanism allowing users to select what information they
needed, when they needed it.
The initial objective was to create a prototype and to showcase it at the
PATA Annual General Meeting (AGM) in April 2010. It was anticipated that
if members recognized value in an operational proof of concept, they
would want to continue the project and, hopefully, resource it.
In February 2010, development commenced using Flash, which was
considered the best available software for providing an interactive visualization
of data, graphically displaying instant trends. Before being able
to aggregate, enrich and disseminate data about the Asia Pacific visitor
economy in a systematic way, it was first necessary to harmonize the data
that PATA's SIC already collected from 46 destinations across the region.
The existing data collection process was manual, and therefore labor
intensive, because the data was obtained in a variety of ways,
ranging from handwritten and/or typewritten mail, to fax, to
email with Excel or PDF file attachments. Sometimes data was downloaded
from websites. This presented a number of difficulties: data was
provided at different times due to different NTO release dates, arrived
in different, often inconsistent formats, some daily, some monthly,
some quarterly, some annually, and was based on
different methodologies.
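The harmonization problem described above can be sketched in code. The snippet below is a hypothetical illustration (destination names and figures are invented, and real NTO feeds are far messier) of rolling monthly records up into the quarterly totals a platform like TIGA would need:

```python
from collections import defaultdict

# Hypothetical records as they might arrive from different NTOs:
# (destination, "YYYY-MM", arrivals). Names and figures are invented.
records = [
    ("Thailand", "2010-01", 100), ("Thailand", "2010-02", 110),
    ("Thailand", "2010-03", 120), ("Thailand", "2010-04", 130),
]

def to_quarterly(rows):
    """Aggregate monthly arrival counts into quarterly totals per destination."""
    totals = defaultdict(int)
    for dest, month, value in rows:
        year, m = month.split("-")
        quarter = (int(m) - 1) // 3 + 1  # months 1-3 -> Q1, 4-6 -> Q2, ...
        totals[(dest, f"{year}-Q{quarter}")] += value
    return dict(totals)

print(to_quarterly(records))
# {('Thailand', '2010-Q1'): 330, ('Thailand', '2010-Q2'): 130}
```

The hard part in practice was not this aggregation step but getting every source into a single monthly representation in the first place, which is exactly why the manual process was so labor intensive.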
Harmonization of data was initially focused on one indicator,
International Visitor Arrivals (IVA), on a monthly, quarterly and yearly
basis. Only a few destinations had IVA data available on a daily basis, and
this was flagged as a potential future development.
PATA had historically reported visitor data, so the decision was
made to continue using visitor data, as opposed to tourist data. A visitor
is defined as any person visiting a country other than that in which he/
she has his/her usual place of residence, for any reason other than following
an occupation remunerated from within the country visited. The
definition of a tourist, according to United Nations World Tourism
Organization (UNWTO) usage, stipulates a stay of at least 24 hours but
less than a year.
IVA data, primarily sourced from immigration arrival forms and
passports, was reported to PATA by either country of residence or
nationality of visitors. It was decided to display both indicators, allowing users to
recognize the different measures used by destinations.
A standard policy decision was made to report only official
data; consequently, uploads were done only after governments officially
released data. Although this created an issue, because not every destination's
data would be available when a user looked at a particular month, it
was more important for users to be confident about and able to rely
on credible and official data than to resort to alternative solutions such as
making estimates.
PATA's website had many limitations, so the quickest and most effective
way to integrate a dashboard was to create an iframe within the existing
website. The dashboard was embedded without users knowing that a link
was actually taking them to another website, hosted elsewhere on
a different server that had the capacity and capability to host large amounts of
data and quickly process complex queries.
The Chinese Year of the Tiger, which commenced on 14 February 2010,
was poised to become a Year of Transformation, with a spirit of innovation
and transformation and changing market boundaries. As the web-based
software platform was developed and launched during the Year of
the Tiger, and since it was an innovative transformation for PATA, it was
named TIGA, an acronym for Travel Intelligence Graphic Architecture.

8.2 TIGA Beta 1.0

TIGA beta version 1.0 was launched at the PATA Annual Conference and
AGM in Kuching, Malaysia, in April 2010 and contained IVA data and PATA
Forecasts for 2010–2012, data-based indicators that were traditionally
distributed through the ATM in print and CD formats.
PATA members were, for the first time ever, able to view PATA's data
on demand, by selecting any of the 46 destinations from PATA's five sub-regions,
for the period 2007–2010, and to view the IVA data either monthly,
quarterly or yearly, in both tabular and graphic formats.
During the PATA AGM, members were asked for feedback, and it
became very clear that PATA members liked TIGA and now wanted more;
specifically, members constantly reiterated that they wanted a one-stop
shop that would provide for all their data needs. As with going to a shopping
center, members very clearly wanted to save time by being able to go to
one central place containing the data they needed, and to have confidence
that the data came from credible sources they could trust.
Self-service was a global trend gaining prominence, and the travel and
tourism sector was very familiar with this model, with activities such as
online check-in for flights becoming the norm. It was recognized that
TIGA provided a direct productivity benefit for members, and also for
PATA internally, in addition to giving members the convenience
of accessing the latest available data from the PATA website, anytime, on
demand, 24/7.
A key issue raised was that, with hundreds of PATA members paying
different prices for their membership, how could the benefit of TIGA be
aligned to the value of the membership investment?
Another concern arose from Apple's release of the iPad tablet on 3
April 2010, because Flash software was not supported and hence TIGA could
not be viewed on the iPad. This was flagged as an issue and a concern:
if the iPad became significantly popular, then TIGA would
have limitations in terms of usage, and further development would be
needed to make it usable on the iPad.

8.3 Development of Beta 2.0

PATA has broad membership from different segments including
Government NTOs, Airlines, Hotels, Travel Agents, Tour Operators and
Education Institutions, each with different data needs relevant to their
activities.
PATA's membership structure has committees based on segments of
the travel and tourism sector, such as Government and Aviation, which
provided valuable feedback on the types of indicators that they would find
useful. One-to-one consultations with members identified a further,
and very important, aspect of the value proposition. An airline,
for example, already had aviation data indicators such as the number of seats,
flights and air passengers; however, airlines wanted to know more about
non-aviation indicators such as IVA and Internet usage data. Similarly,
hotels had data about accommodation indicators such as Occupancy,
Average Daily Rate (ADR) and Revenue Per Available Room (RevPAR);
however, hotels wanted to know about IVA and aviation indicators such as
the number of seats, flights and air passengers. The value proposition became
clear when it became evident that each segment wanted data from the
other segments.
Understanding the needs of the members made it easier to design the
structure of the data, and category clusters were identified and
developed: Visitor, Expenditure, Aviation, Accommodation, Digital
and Forecasts. A selection of indicators within each category could then
be made based on what data was available from the various sources.
Content development was a complex and time-consuming activity;
every data indicator recommended or identified went through a rigorous
process that began with identifying and validating its value to users, then
identifying the various sources, contacting the sources, negotiating the rights to
display supplier data on TIGA, reaching agreement on the collection method
(including legal agreements), establishing a data input process, testing and
finally promoting the indicator to users and PATA members.
To overcome the challenge of obtaining third-party data without incurring
cost, TIGA was positioned as a promotion tool for data suppliers,
who were given the opportunity to showcase a selection of high-level
indicators that provided insight, with greater value gained if the user
subscribed to the third party's data. Although this became a successful
win-win scenario, it took substantial time to convince the various data
suppliers and gain approvals, especially from their legal departments, who were
concerned about the risk of losing potential revenue. A new PATA membership
category called Preferred Partner was introduced and offered
to third-party suppliers when the data provided had a value, with PATA
products and services bartered for the equivalent value.
Formal arrangements and agreements for data collection and
dissemination were put in place for each data source, including the NTOs.
Harmonization issues then arose from the PATA definition of the Asia
Pacific region, its sub-regions and the destinations contained therein, as it
differed from that of many other organizations; in effect, each organization
grouped destinations into its own sub-regional and regional
clusters. The term PATA Region was adopted, and a comprehensive coding
system was developed. When data from different sources was input into
TIGA, the destinations had to be loaded separately to ensure alignment
with the PATA Region definitions and thereby allow for direct comparison.
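A minimal sketch of such a coding system might look like the following; the sub-region assignments and figures below are illustrative only, not PATA's actual coding scheme:

```python
# Each destination is keyed to a PATA sub-region so that figures from any
# supplier can be regrouped consistently. Illustrative assignments only.
PATA_SUBREGION = {
    "Thailand": "Southeast Asia",
    "Malaysia": "Southeast Asia",
    "Japan": "Northeast Asia",
    "Australia": "Pacific",
}

def regroup(supplier_rows):
    """Map supplier-labelled destination figures onto PATA sub-regions."""
    totals = {}
    for destination, value in supplier_rows:
        region = PATA_SUBREGION.get(destination)
        if region is None:
            continue  # destination outside the defined PATA Region
        totals[region] = totals.get(region, 0) + value
    return totals

print(regroup([("Thailand", 50), ("Malaysia", 30), ("Japan", 20)]))
# {'Southeast Asia': 80, 'Northeast Asia': 20}
```

Once every destination carries a single agreed code, data from any supplier can be rolled up to the same sub-regional and regional clusters, which is what makes direct comparison possible.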
The next issue was determining how to provide different value to
the members of PATA, many of whom paid different membership dues,
ranging from $250 to $50,000 per annum. After extensive member consultation,
debate and feedback, it was agreed to create a member login
mechanism and to provide three levels of access with varying data indicators
for members. Regional and sub-regional data remained available to
the public; however, when users wanted to access data at a destination
level, PATA membership was required.
PATA members would receive access to indicators and destination data
based on the amount of their investment in PATA membership (Table 8.1).
Table 8.1 TIGA value

TIGA Access Level       Local                    International            Strategic
Membership Investment   Under $1000              $1000–$3999              $4000+
Indicators              All except Forecasts     All except Forecasts     All
                        and Source markets       and Source markets
Destinations            Single                   All                      All

Members paying under $1000 per annum received Local access, which
provides all indicators except forecasts and source markets, for a single
destination. Members paying between $1000 and under $4000 per annum
received International access, which provides all indicators except forecasts
and source markets, for all destinations. Members paying $4000 and above
would receive everything: all indicators, including the forecasts and source
markets indicators, for all destinations.
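The tiering described above can be expressed as a simple lookup. This is a sketch of the published tier rules only; PATA's actual entitlement logic is not documented here:

```python
def tiga_access(annual_dues):
    """Return the TIGA access level implied by the Table 8.1 tiers."""
    if annual_dues >= 4000:
        return {"level": "Strategic", "indicators": "all",
                "destinations": "all"}
    if annual_dues >= 1000:
        return {"level": "International",
                "indicators": "all except Forecasts and Source markets",
                "destinations": "all"}
    return {"level": "Local",
            "indicators": "all except Forecasts and Source markets",
            "destinations": "single"}

print(tiga_access(250)["level"])     # Local
print(tiga_access(2500)["level"])    # International
print(tiga_access(50000)["level"])   # Strategic
```

Encoding the tiers as data rather than scattering them through the platform keeps the membership alignment question, raised at the AGM, answerable in one place.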

8.4 TIGA Beta 2.0

In September 2010, at the annual PATA Travel Mart in Macau SAR, the TIGA
beta 2.0 version was launched, adding Expenditure and Source Market
data and introducing third-party data indicators from the United Nations
International Telecommunication Union (ITU) on Internet and mobile
usage. Functionality was added providing user-friendly methods to download
content from TIGA directly into spreadsheets, and as images for reports
and presentations, including the ability to email contents, thereby enabling
collaboration.
Convenience and functionality were important to PATA members:
being able to download the data into a CSV or Excel file, to insert the
image into a Word document or a PowerPoint presentation, to share with
colleagues, or simply to print.
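At its core, a download-to-spreadsheet feature of this kind reduces to serializing the selected indicator rows as CSV. A minimal sketch, with invented rows (TIGA's actual export code is not published):

```python
import csv
import io

# Invented indicator rows for illustration: (month, arrivals).
rows = [("2010-01", 100), ("2010-02", 110)]

def export_csv(rows, header=("month", "arrivals")):
    """Serialize indicator rows to CSV text, ready to open in a spreadsheet."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()

print(export_csv(rows))
```

CSV is the natural interchange format here precisely because every spreadsheet tool the members already used, Excel included, opens it directly.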
PATA membership spanned all the segments of the travel and tourism
value chain, and hence each segment's members had different data needs.
Hoteliers, for example, knew all about hotel data but needed to know
about air capacity: how many flights, how many seats, how many passengers
and in which class of travel. Each segment had great data about its
own segment but lacked data on the other segments, and the real magic was in
the integration of the various data sets.
Ultimately, there was always the one objective of bringing value: TIGA
had to provide better data to members than they could obtain themselves.
Better not only in the type of data and indicators, but also in being presented
in ways that saved time, enabled collaboration and provided immediate
insights. PATA members could easily use the Internet to search for data, so
TIGA had to offer a higher value-added proposition.
Getting data from third parties was a challenge, especially with no budget,
even with partnership propositions, as PATA needed to select indicators
that were meaningful to the user without diminishing the role of
the data supplier's product and making it redundant or obsolete.
Strategic integration of the various data indicators was the magic
formula to create resilience, robustness and sustainability.
Smartphones became popular after the iPhone was introduced in 2007
and were becoming mainstream. This was compounded by the popularity
of tablets, in particular the iPad, so in late 2011 PATA management made the
decision to redevelop TIGA in HTML5 so that it would work
on all platforms, including mobiles; to reflect the change, TIGA was
rebranded and renamed PATAmPOWER.

8.5 PATAmPOWER
PATAmPOWER1 was launched in April 2012 at the PATA AGM in Kuala
Lumpur, Malaysia, in alignment with a PATA rebranding campaign called
PATA Next Gen.
PATAmPOWER could now be accessed on any device connected to the
Internet, including the tablets and smartphones that many PATA members
were using, which helped to make PATA more appealing to the younger and
emerging travel and tourism leaders, especially in the Asia region.
Travel and tourism professionals could now access travel and tourism
data on demand, when they wanted it, from their mobile devices: at a
lunch meeting, at an event, or when meeting clients at a coffee shop.
Travel and tourism data was now available beyond a computer on a desk
or a notebook, and this enhanced mobility enabled faster and smarter
decisions anywhere, anytime, 24/7, at the convenience of travel and
tourism professionals worldwide.

8.5.1 Value
The value created by PATAmPOWER is a combination of many factors
that include:

- providing a central one-stop-shop environment that contains many
  different travel and tourism data indicators that are important to
  travel and tourism professionals;
- containing data from trusted and credible sources;
- availability on demand, 24/7, accessible from any internet-connected
  device;
- immediate insights through the use of visualization technology;
- customized outputs enabled by an interactive platform with
  dynamic selection of indicators;
- collaboration, enabling users to export data and charts and to share
  them with others.

The key convenience is being able to go to one place, a central one-stop
shop, like a shopping center, to find a range of travel
and tourism data. This is a key time-saving proposition for many users:
precious time that would have been wasted searching for data
can instead be used to analyze it.
An example of this value is frequently highlighted by users from the
accommodation segment, where every year sales and marketing personnel
are required to produce plans and budgets, for which they need data
indicators that will impact demand and supply over the next 12-month
period. PATAmPOWER enables these users to collect data much faster,
saving precious time that can instead be used to analyze the data.
A further benefit exists for multiple-property portfolios: if each
property uses the same data sources, this enables consolidation and
comparison.
Traditionally, the key indicator that has been the focus of attention
and measurement is IVA, a headcount indicator that has become the basis
on which many governments determine the budget allocation for destination
marketing. However, it is important to monitor indicators beyond IVA,
because the impact and contribution of IVA vary greatly depending
on a range of factors that include expenditure, length of stay and many
others.
Governments increasingly understand and recognize that
some origin markets contribute more than others, for example, by staying
more nights, spending more and demanding fewer resources from the
resident community.
The value of each data indicator on PATAmPOWER varies with the needs
of the user, the segment and how it is used. Usually each segment has
outstanding information about its own segment but little data
from the other segments on which it is interdependent.
Aviation indicators, for example, create value for the accommodation
segment by providing an indication of air access to a destination: the
number of flights and seats planned for the next 12 months by airlines,
available for each origin market. If the quantity of flights and,
more importantly, the number of seats increases, this is an instant insight that
the size of the air traffic market is expected to increase based on capacity
and supply, and hence an indication to increase engagement in
that origin market so as to capture a share of the growing market.
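The capacity insight described above amounts to a simple growth calculation. A hedged sketch, with invented seat counts and an invented 5% threshold for flagging a market:

```python
def capacity_signal(seats_last_year, seats_next_year, threshold=0.05):
    """Flag an origin market when planned seat capacity grows beyond a threshold.

    A sketch of the insight described in the text: rising scheduled capacity
    suggests the market will grow, so more engagement there may be warranted.
    The 5% threshold is an invented illustration, not an industry standard.
    """
    growth = (seats_next_year - seats_last_year) / seats_last_year
    return growth, growth > threshold

growth, engage = capacity_signal(200_000, 230_000)
print(f"{growth:.0%} capacity growth -> engage: {engage}")
# 15% capacity growth -> engage: True
```

In practice a marketer would look at this per origin market and per season, but the underlying signal is exactly this ratio of planned to historical capacity.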
Micro, Small and Medium Enterprises (MSMEs) are considered the
highest-value beneficiaries of PATAmPOWER, as they usually have limited
resources and skill sets for any data analysis activity, unlike large corporations
that may have dedicated employees and, in some cases, entire
departments for data analysis. Although there are no globally agreed definitions,
micro enterprises typically have fewer than 5 employees, and small and
medium enterprises typically fewer than 50; hence easy and
convenient access to data can be of great value if it can improve activities
such as marketing.

8.5.2 Marketing
PATAmPOWER is promoted throughout PATA's marketing materials
and is highlighted as a key membership benefit. The PATA website,
www.PATA.org, features PATAmPOWER banners and links to
PATAmPOWER, as well as a link to a PDF flyer that provides a high-level
overview of PATAmPOWER. PATA's weekly newsletter, PATA Voice,
features new insights from PATAmPOWER in every edition.
A video highlighting the benefits of PATAmPOWER was produced
and posted on PATA's YouTube channel, PATA TV, and was
complemented by several webinars explaining how to use PATAmPOWER,
its data indicators and functionality.
Promotion at PATA events has been an effective way to raise awareness,
utilizing pull-up stands, workshops and live demonstrations.
The media occasionally acknowledge PATAmPOWER as a data source
when reporting on the travel and tourism sector; however, there
remains much more scope and opportunity to increase media engagement
in the future.
The ultimate publicity is when PATA members and the media make
reference to data sourced from PATAmPOWER and share how data-driven
decisions can help to increase revenue and profits.

8.5.3 PATAmPOWER Subscription: Data as a Service

Data as a Service (DaaS) is a concept that has evolved thanks to technology
improvements whereby data can be provided on demand, enabling any person
or organization to access data wherever it resides. DaaS began with
the notion that data quality management could happen in a centralized place,
cleansing and enriching data and offering it to different systems, applications or
users, irrespective of where they are.
Initially, PATAmPOWER was a service and benefit bundled with PATA
membership, and access to the data required joining PATA as a member.
It was recognized that some organizations simply do not have the time or
desire for membership of an association and the engagement that follows,
but are willing to pay for specific and focused services that bring
value. This is already a familiar and very successful strategy adopted by
low-cost airlines, where the airfare is unbundled and each component,
such as a seat with extra legroom or checked-in luggage, is available
separately and priced according to the value, or perceived value, it generates
for the passenger.
In 2014, PATAmPOWER became a DaaS when a subscription model
was introduced allowing anyone to access the data by paying a subscription
fee, providing a wider reach for the PATAmPOWER platform. Feedback
was received that some organizations have policies that do not allow them
to join an association, or find it difficult to gain the approvals to do so;
however, subscribing to a data service, DaaS, is much easier, and this brought
PATAmPOWER within reach of many organizations, ranging from libraries
to analysts.
The Singapore Tourism Board (STB), which had canceled its PATA
membership in 2011, re-engaged with PATA by becoming the first and
launch customer of the PATAmPOWER subscription service. Another
former PATA member, Bangkok Airways, also subscribed. Organizations
that for various reasons were not members of PATA were enabled to
re-engage with PATA, based on a very clear value proposition, DaaS, by
subscribing to PATAmPOWER.

8.5.4 PATAmPOWER: Software as a Service
There are many sources of data available, and when several sources are
used this usually means that multiple login usernames and passwords
are required, which is inconvenient and time consuming. The preference
is for one single platform that contains all the data; hence the emerging
opportunity for the PATAmPOWER platform to become a Software as a
Service (SaaS) offering that can be customized with all the data an
organization wants to use.
Australia's Queensland Government was the launch customer of the
PATAmPOWER white-label SaaS solution, licensing the software to
develop a customized version for Tourism and Events Queensland (TEQ).2
TEQ identified the opportunity to leverage PATAmPOWER by customizing
the content with Queensland regional data indicators, enabling the
Queensland visitor economy to make smarter decisions by using metrics
and data as a basis for decision-making.
Tourism Malaysia signed an agreement to purchase the PATAmPOWER
SaaS at the World Travel Mart in November 2014, to create a customized
platform, TMmPOWER, containing data about Malaysia's visitor
economy. It was developed during 2015, launched in 2016
as MyTourismData, and is available at http://www.MyTourismData.tourism.gov.my

8.5.5 Future of PATAmPOWER
PATAmPOWER is a key PATA membership benefit, and consequently
awareness is high among the PATA community; however, a huge market
opportunity exists among the millions of tourism professionals and
organizations that may not even be aware of PATA.
PATA has the opportunity to promote and drive data-driven decision-making
and to position PATAmPOWER as a central and core component,
especially in support of advocacy issues, where data can add credibility,
increase understanding and help stakeholders decide on their position.
PATA's advocacy and its use of the term visitor economy recognize
that tourism impacts an extensive value chain beyond travel and tourism.
PATAmPOWER data about the Asia Pacific visitor economy can be
relevant and useful to enterprises beyond travel and tourism, and a large
market opportunity exists with businesses that rely on the visitor economy
for revenue and their existence, such as restaurants, suppliers and retailers,
to name just a few.
A challenge that remains is to determine how each PATA mem-
ber organization can gain more value from more effective use of
PATAmPOWER. Some PATA member organization employees, who
could benefit from PATAmPOWER, continue to be unaware of its exis-
tence or of their entitlement to use it. This may be because their desig-
nated contact or liaison person for PATA may not have shared the benefits
of PATAmPOWER within their organizations, for reasons ranging from not making the time, to a limited understanding of the value of data and analytics, to simply having no interest in this area.
Getting people to use PATAmPOWER remains a challenge. Usage is similar to a gym membership: you have access and it is available for your use; if you use it, you benefit, but unfortunately many people don't make the time or are not sure how to use it.
Engagement can be increased by consistent education and training
about the meaning of the content indicators, and how they can be used to
generate insights and value.
For new and emerging data-focused job roles, such as the Data Scientist positions that are beginning to gain prominence among travel and tourism organizations, PATAmPOWER can be a very relevant support tool.
Constant development is another key challenge. In today's fast-changing world, the success of PATAmPOWER will depend on consistent investment in the development of both the technology and the content.
Content development in particular should remain an ongoing activity,
while the monitoring of existing indicators' usage and the identification of
new indicators will ensure that the content is what users need. It is clear,
for example, that data is increasingly being demanded at the city level as
well as at the national level. The challenge is securing the data at city level
and the resources to regularly aggregate the data.
Technology improvements need to be constantly monitored and evaluated, as innovation will not only continue but also accelerate in the future, resulting in better ways of aggregating, presenting and disseminating data.
222 A. RAYNER


Improving the process of data collection, and in particular automating data input, is an immediate step that could save time and improve efficiency.
A sensitive issue that remains without a clear solution is how to deal
with governments that are slow and inconsistent in releasing data. Incentives need to be created for governments to be consistently prompt with data updates.
Wearable technology emerging in 2015, such as the Apple Watch, offers the potential to create value with new apps that provide tourism data viewable on the device. The question arises: should the future development strategy be to lead, innovate and create the demand, or to wait for users to demand the app?
Integrating data from the Internet of Things (IoT) offers tremendous potential for new insights, and collaboration should be explored with organizations like IBM, which in March 2015 invested $3 billion in a new IoT unit, aiming to sell its
expertise in gathering and making sense of the anticipated surge in real-
time data.
As we move further into the twenty-first century and deal with Big Data overload, the future success of PATAmPOWER depends on PATA's commitment of resources to manage, develop and promote PATAmPOWER, which is in turn symbiotic with the future success of PATA.

8.6 Conclusion
As the Big Data era becomes more complex and the velocity of data continues to accelerate, the role of technology will become critical in making sense of data, understanding it, and making data-driven decisions quickly, whether to gain competitive advantage or merely to retain existing customers.
Smart data is already the key ingredient for strategies that drive constant improvement and innovation, and for creating and maintaining competitive advantage in many segments of the travel and tourism sector.
The PATAmPOWER system creates value by unleashing the potential buried in data; by stimulating the use of data for better decisions, it can enhance innovation, sustainability and competitiveness. Organizations and individuals in the travel and tourism sector have the opportunity to
benefit from smart decision-making and productivity in the digital world,
by engagement with PATAmPOWER at www.mpower.PATA.org

Notes
1. http://mpower.pata.org/
2. http://teq.queensland.com/teqmpower
CHAPTER 9

Change Management: Planning for the Future and the Competitive Environment

Konstantinos Biginas

9.1 Introduction
The business environment of the next decades will be significantly dif-
ferent to what might have been expected just two years ago. Over the
next 10–15 years, businesses will face major changes in finance and capi-
tal conditions. Finance will be more expensive and its availability will be
constrained by regulation and changes to the banking market. From an
era in which finance was cheap and readily available, these changes will
be a significant driver of adjustments to corporate finance models and
investment behavior. The next decade will almost certainly be character-
ized by a higher level of economic volatility and increased risk, clouding the certainty required for long-term planning. The financial crisis has
accelerated three other existing drivers of change or has changed their
character. Public trust in business and markets, already in decline, is now
at a low ebb. The profit motive is distrusted, and the onus is now on businesses to demonstrate their ethical credentials.

K. Biginas (*)
London College of International Business Studies, London, UK
e-mail: konstantinos.biginas@lcibs.org

© The Author(s) 2017
E.G. Carayannis, S. Sindakis (eds.), Analytics, Innovation, and Excellence-Driven Enterprise Sustainability, Palgrave Studies in Democracy, Innovation, and Entrepreneurship for Growth, DOI 10.1057/978-1-137-37879-8_9

There is greater skepticism about the capitalistic economic model and its ability to deliver desirable
and efficient outcomes; greater political activism, government interven-
tion and supervision can be expected. Businesses' approach to social and
demographic change will also alter as a result of the recession. Retirement
will still accentuate existing shortages of critical skills, but plugging these
gaps will have to be the responsibility of business rather than govern-
ment, whose spending will be constrained. In addition, pension problems
will force some to work longer, requiring businesses to manage staff with
wider age ranges, expectations and motivations than before. Lastly, the
recession has altered the economic climate in which business needs to
move to a low-carbon economy and improve resource use. The ability and
preferences of government and some consumers to pay for this movement
have been compromised, raising new questions about the role of business.
Energy costs will continue to increase in the medium term, affecting the
basic profit structure of many companies. At the same time, trends in tech-
nology change are set to continue, and as over the last decade, will have a
significant impact on business models and ways of working.

9.1.1 Future Implications
So, what will our world be like in 2030? This chapter aims to identify seven of the leading drivers of change that will affect our future.
These are public trust and confidence in businesses and markets, sustain-
ability and resource issues, climate change, energy, demographics, urban-
ization and technology and e-commerce. These seven drivers will be
analyzed in terms of social, technological, environmental, economic and
political aspects.
These seven drivers have been chosen based on contemporary challenges. Climate change on Earth began billions of years ago, but the extent and speed at which this phenomenon has been occurring since the industrial revolution, and more specifically over the last 60 years, are alarming. Climate change has been selected due to its significant influence on our lives and the changes that could result for human and ecological systems. Energy is closely related to climate change. There is a continuous increase in both demand for and supply of energy. Energy can be considered partly responsible for climate change, due to its wide use, but at the same time it is what will help us to overcome the impacts and threats that climate change has brought. In the not-too-distant future, energy and its alternative sources will most likely become the prime driver of change
in many aspects of our lives. Demography gives influential information about these lives: information about a nation (who this nation is, where it
comes from, etc.) and its population (fertility and mortality rates). The global population is growing at an enormous speed, and this growth is creating the need for more of the scarce resources. Demographics are also related to climate change: as the population grows, pollution and waste increase. Demography is a very contemporary issue with many aspects in society and the economy. Finally, urbanization is a global phenomenon created by people leaving rural areas and moving to urban ones in search of a better life and income. Urbanization is an emerging trend which causes many problems, such as a lack of proper housing, clean water, sanitation or roads.

9.2 Drivers of Change in the Business Environment
9.2.1 Public Trust and Confidence in Businesses and Markets
Public trust and confidence in businesses and markets have been shaken by
the magnitude and impact of the financial crisis and recession, and are at
risk of remaining depressed over our ten-year time horizon. If these prob-
lems are not addressed, brand loyalty will be affected, companies will face
increased pressure to justify their conduct from stakeholders (including
consumers and other businesses) and, ultimately, governments will inter-
vene with tougher regulations and tighter control of business and market operations, as has been seen already in the financial services sector.
Lightly regulated markets have been a cornerstone of the capitalistic eco-
nomic model for the last 30 years and have been a competitive strength for
many countries (especially the UK and USA), but that model is now treated
with suspicion. The consensus among policymakers was that, by and large,
markets, when left to themselves, deliver an optimal outcome. But the
financial crisis has shown some markets to deliver less desirable outcomes
and to be less robust than thought.

9.2.2 Sustainability and Resource Issues


The need to move to a low-carbon economy and reduce resource use
has not changed as a result of the recession, but with finances likely to
remain constrained for a number of years, the ability of the government
and consumers to take action has changed. It is uncertain whether either
will be able or willing to pay and this will inevitably have consequences for
their expectations of business. Businesses, populations and governments
face the stark choice of tackling the problem of climate change while the cost of doing so is still relatively bearable, or accepting a more costly adaptation to the impact of global warming and an increased level of detrimental impact in the future. The Stern Review estimated the current cost of addressing climate change to be around 1% of GDP, compared to a cost equivalent of 5–20% of GDP if action is delayed. Businesses were already under increasing consumer, NGO and regulatory pressure to significantly reduce carbon emissions from their operations and products prior to the recession; now they face doing so when it is not clear whether, or at what rate, global competitors will follow suit. Expected post-recession volatility in the oil price over the next ten years will also be disruptive for business, at the very least affecting companies' ability to plan and invest. In some sectors, the cost and availability of key minerals in the next ten years will be a major driver of innovation and change (Stern, N., The Economics of Climate Change: The Stern Review, first edition, Cambridge 2007: 251).

9.2.3 Climate Change
The climate on Earth is stable due to a continuous supply of energy from the sun. This energy passes through the Earth's atmosphere, warming the Earth's surface in the process. The warmed surface sends infrared radiation (heat energy) back towards the atmosphere, and part of this heat is absorbed by gases in the atmosphere. These gases give rise to the greenhouse effect (see Appendix 1). According to Sharma, in the 1950s few people were aware of the greenhouse effect (Sharma 2006). John Tyndall analyzed the gases of the atmosphere to see which of them have the most powerful greenhouse effect, and in 1865 he postulated that, in the atmospheric envelope, water vapor and CO2 retain heat. In the 1970s, the phenomenon known as atmospheric warming was renamed global warming. In an effort to stabilize greenhouse gas concentrations and global warming, the UN Framework Convention on Climate Change (UNFCCC) was established. In 1997, the UNFCCC agreed on the Kyoto Protocol (see Appendix 2), which set binding targets for 37 industrialized countries to reduce greenhouse gas (GHG) emissions: GHG emissions were to decrease by an average of 5.2% against 1990 levels over 2008–2012 (Kyoto Protocol 1998).
There are many aspects from which climate change can be analyzed. In terms of the political aspect, international cooperation on per capita emissions and the political implications of rising sea levels have to be considered. The economic, social and technological aspects are equally important. The environmental aspect is directly linked to climate change. First of all, the most severe impact that global warming has on the Earth is the disappearance of land and sea ice. A 50-year government study found that the world's glaciers are melting at an alarming rate; glaciers worldwide are melting faster than anyone had predicted they would just a few years ago (Erdman 2009). Furthermore, climate change has already affected natural systems, which will eventually drive species off the planet. According to Professor Will Steffen of the Australian National University, the extinction of species is now 100–1000 times faster than it used to be and is expected to increase further this century (Falcon-Lang 2011). Additionally, climate change will affect the oceans as well. Oceans absorb carbon dioxide emissions from human activities. This absorption causes the pH of the water to decrease and leads to chemical changes in the oceans (ocean acidification). The average pH at the surface of the oceans has decreased by 0.1 (from 8.2 to 8.1) since the industrial revolution, and according to predictions, by the end of the century the pH will drop by an additional 0.3. Corals and mollusks will be among the worst affected. The long-term consequences of ocean acidification are going to be changes in the stability of a number of ecosystems (National Academy of Sciences 2010).
Finally, deforestation together with climate change can have harmful effects on the planet. In his documentary Confronting Climate Change (see Appendix 3), Al Gore clearly states that human activities like deforestation are changing our climate in ways that pose increasing threats to human well-being, in both developing and industrialized nations (Al Gore, n.d.). Deforestation is a process by which natural forests are logged or burnt in order to use the timber or the land differently. An extent of 12–15 million hectares of forest is lost each year, the equivalent of 36 football fields per minute (wwf.panda.org).
All the impacts mentioned above result, among other things, in the reduction of biodiversity, the increased release of greenhouse gas emissions and, finally, the disruption of water cycles and of the livelihoods of human beings.

9.2.4 Energy
Energy is fundamental to people's everyday needs. Energy used to be cheap, and people believed it would last forever to cover present and future needs. Global demand for energy is therefore constantly increasing, and the supply of energy is important to developing countries for their economic development. Generally, the demand for energy can be seen as the combination of population, the economic activity that the population produces and the energy needed for that activity. Lately, the belief that climate change is being caused by humans has been growing, and energy is considered the main culprit; at the same time, the energy sector is one of the most important in overcoming the challenges that climate change has brought. As the impacts of climate change grow, so does the need to make drastic changes to the energy sources used. In order to make these changes so as to satisfy future demands, the main fuels first have to be identified. The majority of the energy produced comes from fossil fuels like coal, oil and natural gas.
Energy functions and interacts in five broad categories: social, technological, economic, environmental and political. Only the technological aspects, the changes that need to be made, will be analyzed further. First of all, electricity can be produced in coal-fired power stations, and efforts are being made to reduce the environmental impact of such stations. Power stations operate by harnessing suitable raw energy sources and transforming them into electrical energy, which in turn is sent to houses and industries. The most widely used fuel is coal; the global production of coal will increase by 30% in the coming decades, mainly in Australia, China, Russia, Ukraine, Kazakhstan and South Africa (Zittel and Schindler 2007). Another way to generate electricity more efficiently is the hydrogen economy, in which energy is stored as hydrogen and used to balance the electrical grid load and in mobile applications.
According to Friedemann, wind turbines alone can generate electricity at 30–40% efficiency, producing hydrogen at an overall 25% efficiency; that is, when the wind is blowing (Friedemann 2005). Demand-side management and micro-generation are also two ways to save energy. Demand-side management refers to the signals that a household can receive in order to warn householders of high consumption. Micro-generation is the production of energy or heat on a small scale, by individuals, small businesses and communities, to meet their own needs; it covers electricity produced at a capacity of under 50 kW and heat of less than 300 kW, from sources such as heat pumps, solar panels, biomass boilers and micro wind turbines. The UK government and the Department of Energy and Climate Change have launched a consultation on the micro-generation strategy they are following (Froley 2010).
Finally, there are energy technologies based on renewable sources (e.g. wind, solar, biomass, geothermal) for our future energy generation, allowing a move away from conventional fossil fuel-based energy sources. These technologies have seen rapid change in recent years, more specifically in terms of the wide range of implementations and their public and commercial use (O'Keefe et al. 2010). Unfortunately, for now, the electricity generated from renewable sources of energy accounts for only one-fifth of total energy consumption (see Appendix 4) (Hodgson 2010).

9.2.5 Demographics
The demographer Ronald Lee defines demography as the study of the causes and consequences of demographic rates and structures. The demographic rates are fertility, mortality and migration, whereas the structures include size and distribution by age, sex, race-ethnicity and geographic location (Birks 2007). Demographics as a driver of change can be seen as the combination of interrelated social, economic, political, technological and environmental factors. The processes related to demography are natural population development (fertility and mortality) and migration. Fertility can be defined as the mean number of children (alive) that a woman will give birth to during her lifetime. The fertility rate in European countries lies at 2.1, much lower compared to Africa, Asia, Oceania, Latin and Northern America. In the last decades of the twentieth century, it was noticed that continuous changes in fertility were occurring: the global fertility level from 1970 to 1980 was 4.6 and fell to half that in 1994–2005 (United Nations 2007). This reduction can be attributed mostly to social changes that had a fundamental impact in sustaining this development.
The aging of the population, on the other hand, is increasing, due among other things to decreasing fertility rates and increasing life expectancy (European Commission 2005). According to UN predictions, by 2030 at least half of the Western population will be over 50, with a life expectancy for 50-year-olds of a further 40 years, and by 2050 the share of those aged 65+ in the EU will hover at around 28% (see Appendix 5) (Birks 2007).
The more difficult demographic process is urban migration, which is directly related to urbanization; it has been estimated that over the next three decades the population of urban areas in less-developed regions will increase and even double in size (Bolay 2006). Migration is estimated to have risen from 75 million people in 1960 to 175 million in 2000 (Birks 2007). The primary reason that migrants travel is usually to find better working opportunities. Other reasons are resource wars, environmental disasters (climate refugees) or food emergencies, as in the case of the most powerful earthquake of March 11 in Japan: due to fuel shortages, fresh products and food could not be delivered, forcing thousands of people, so-called food refugees, to join queues in order to get a bowl of hot soup (Garcia 2011). Additionally, there are further subjects related to the social aspect of demography. First is social media: online tools where people share their opinions, thoughts or experiences with others, such as MySpace and Facebook. Obesity is a major contributor to the global burden of mortality; overall, more than one in ten of the world's adult population is obese, and 65% of the world's population lives in countries where overweight and obesity kill more people than underweight (WHO 2010). Finally, the shrinkage of households is another social aspect of demographics. There is a global trend towards smaller households (see Appendix 6). Some of the reasons for this are low fertility and high divorce rates, an aging population and high per capita income.

9.2.6 Urbanization
Urbanization is the process by which large numbers of people become permanently concentrated in relatively small areas, forming cities in that way. Urbanization can be caused by natural increase, migration or reclassification of rural areas as urban. Urban areas are subject to an increasing component of regional climate change: most of the increase of CO2 in the atmosphere is caused by energy consumption in the world's cities, and the more highly urbanized a developed region is, the greater the proportion of CO2 it generates. As with all the drivers of change, urbanization is a phenomenon in which social, technological, environmental and political aspects can be identified. Taking a closer look at the economic aspect of urbanization, the subjects that can be determined are employment, poverty, agriculture and congestion. Employment
can be considered the key factor in urbanization due to the fact that it
attracts people from rural to urban areas. The vulnerable employment
indicator provides hints about trends in employment quality. The main characteristics of vulnerable employment are low payments or difficult working conditions. The number of workers in vulnerable employment in 2009 is estimated at 1.53 billion, an increase of 146 million since 1999 (see Appendix 7) (I.L.O. 2011). Poverty in urban areas is spatially identified with slums, in which people live in overcrowded houses and suffer from a lack of basic services. Shelter deprivation is a dimension of urban poverty and is measured using five key indicators: access to water, access to sanitation, durability of housing, sufficient living area and secure tenure (UN-HABITAT 2006–7).
The Multidimensional Poverty Index (MPI) in 2010 showed that half of the world's poor live in South Asia (51%) and over one quarter in Africa (28%) (Alkire and Santos 2010). Urban agriculture is the growing, raising, processing and distribution of food in and around an urban area, and it offers food security to low-income households. Havana is the world's leader in urban agriculture: after the collapse of the Soviet Bloc, food production was decentralized from large mechanized state farms to urban cultivation systems, and today more than 50% of Havana's fresh products are grown within the city limits, using organic compost and simple irrigation systems (Abitz 2008). Finally, congestion is another impact of urbanization and a widespread problem for both developed and developing countries. The number of car owners is increasing continuously, and many countries are trying to find ways to minimize congestion. In Beijing, for example, the managing director of consultant Intelligence Asia Automotive, Ashvin Chotai, believes that demand will decrease by 45% due to restrictions on license plates, imposed because of the high congestion rates and levels of pollution; this strategy will be followed by other cities such as Guangdong province or Shanghai (Rowley and Inoue 2011).

9.2.7 E-commerce
E-commerce has changed the way companies do business today. It is no longer something that companies are merely considering, but something they have to do if they want to compete in the global environment and sustain or gain a competitive advantage. Today e-commerce links nations and organizations both locally and globally; it is all about speed, connectivity, and the sharing and exchanging of goods, services and information. New technology development has accelerated in recent years and has changed our lives. Innovations and new products have been developed at great speed, and one of these innovations is the Internet. The growth of the Internet and its increased use are forcing companies to evaluate their current distribution channels and redesign their strategies, since they are now able to target customers differently.
We could describe e-commerce as computer-to-computer, individual-to-computer, or computer-to-individual business relationships enabling an exchange of information or value (Rao 2000). A form of e-commerce has existed between a significant number of large companies for about two decades in the form of Electronic Data Interchange (EDI); ninety percent of e-commerce is EDI, which is unlikely to vanish (Nemzow 2002). Trading over the Internet, mainly in the USA and at an increasing rate in Europe, is accelerating the pace of change and for the first time providing the conditions for seriously free markets. The Internet provides affordable, accessible technology to bring together buyers and sellers, large and small; people from any location on the planet can enter competitive markets (Rao 2000). A World Trade Organization (WTO) study on e-commerce emphasizes the growth of opportunities that e-commerce offers, including for developing countries; it predicts that more than 300 million users will be transacting over the Net and estimates e-commerce worth $300 billion (Rao 2000).

9.2.7.1 Influence of E-commerce on the Market


The rapid developments in the technological field and the wide accep-
tance of e-commerce are changing the way businesses operate. The whole
infrastructure is influenced while at the same time marketing strategies are
changing. In order for e-commerce to grow, it demands infrastructure,
such as good telecommunications, computers and other related Internet
CHANGE MANAGEMENT: PLANNING FORTHEFUTURE... 235

technologies. E-commerce depends on high performance hardware, soft-


ware and communications to deliver voice, data, graphics and other infor-
mation wherever the user is located.
It is suggested that small firms will be able to benefit from the Internet since it reduces the importance of scale economies (Quelch and Klein, quoted in Arnott 2002); small firms will be able to adopt faster and achieve a greater level of sophistication in their Internet usage. On the other hand, it is suggested that large companies with high levels of resources make the most sophisticated use of the Internet. E-commerce development is helping companies to expand locally or internationally, helping them to create economies of scale. However, it is also suggested that the Internet and e-commerce lower the entry barriers for companies entering a specific market. For example, the number of e-banks has increased in recent years, since the costs of establishing an e-company are certainly lower than those of establishing an adequate branch network to compete with the market leaders.
The Internet can also be used for product innovation. Direct access to consumers can be used to collect information that will help to develop products to meet customers' needs; for international companies, this can provide adaptations and customizations for local markets (Allen 2001). With the use of the Internet, banks, for example, are becoming able to offer a number of home-banking services. They are capable of offering customers general and customer-specific information, the ability to conduct transactions, access to a variety of interactive financial calculators and worksheets, and customization of the content of the Internet bank and the messages sent from the bank. Finally, it is possible to interact with a bank adviser via e-mail and video-based advisory services (Mols 2000).
E-commerce puts the purchase anywhere a connection exists. The Internet will allow organizations to skip over parts of the value chain, that is, marketing the product on the Internet in order to bypass the retailer. It will also allow small niche producers easier access to the markets (Allen 2001). At the same time, the ability to compare prices across all suppliers using the Internet and online shopping services will lead to increased price competition. Organizations will have to use new pricing models when selling over the Internet in order to prevent the price of a product or service approaching its marginal cost as competition intensifies (Allen 2001). However, it could be argued that there is going to be a change in the way companies build their relationships with their customers. In banking, for example, under the branch network strategy a closer relationship is built between the two parties. The introduction and consumer acceptance of Internet-based home banking may bring a dramatic change in the way banks build and maintain their relationships with their customers, since the Internet reduces the costs of searching, negotiating and concluding deals and hence may make it easier for consumers to compare products. This forces the producers in some industries to compete on the basis of price alone and puts pressure on profitability. Therefore, a potential threat to the banks is the breakup of the close and long-term relationships the banks have had with some of their customers (Mols 2000).

9.2.8 Planning for the Future: Competing in the Contemporary Markets
Forecasts and predictions do not always materialize. It is important to keep in mind that forecasts are just forecasts: they suggest the likelihood of something happening unless measures are taken to avoid it. For climate, the predictions for the twenty-first century show warming in the Arctic, an increase in precipitation extremes, a decrease in snow and ice cover, and an increase in sea level (IPCC). The magnitude of future climate change has to be limited, and limiting climate change is a global issue. A fundamental strategy that should be followed is to reduce GHGs globally. For energy, on the other hand, it is impossible to make accurate predictions for the future. Nonetheless, there are pressures to develop new trajectories that reduce carbon emissions, such as a hydrogen energy economy, carbon trading in vehicles or other technological developments like biofuels. But people's lifestyles and the way energy is being used are the major issues. Despite the uncertainty of the future for technological advancements or climate change, only one thing is certain: births and deaths. Economic and social policies should be implemented based on demographic trends, and fertility rates and employment should be increased. Finally, there is no doubt that the global urban population will increase in the future. For urban settlements to become more sustainable and for poverty to be reduced, the world's population will need to be concentrated on less than 3% of the land area.
CHANGE MANAGEMENT: PLANNING FOR THE FUTURE... 237

9.3 Change and the Competitive Environment

9.3.1 S-C-P (Structure-Conduct-Performance) Model


Neil Harris (2009) mentions that the S-C-P model is a road map for identifying the factors that determine the competitiveness of a market, analyzing the behavior of firms and assessing the success of an industry in producing benefits for consumers. However, the S-C-P paradigm has several disadvantages. First, structure and conduct occur simultaneously, so the model's chronological separation of the two is not realistic. Secondly, the model assumes that the way in which businesses behave is a direct function of the structure of the market in which they operate. Finally, a market may behave differently from the way that basic economic theory predicts (Fig. 9.1).
Structure:

1. Number of firms
2. Barriers to entry
3. Nature of products
4. Knowledge of products

Fig. 9.1 S-C-P diagram



Conduct:

1. Price taker vs. price setter: Perfect Competition
2. Non-price competition: Monopolistic Competition & Oligopoly
3. Price wars: Oligopoly
4. Collusion: Oligopoly
5. Price discrimination: Monopoly

Performance:

1. Profitability
2. Efficiency
3. Equity
4. Innovation
5. Consumer choice

9.3.2 Porter's Five Forces Framework


Porter's five forces theory is a useful and powerful tool. It is a framework for industry analysis and business strategy development, shaped by Michael Porter of Harvard Business School in 1979. The main advantage of using this technique is that it provides a structure for management thinking about the competitive environment. Furthermore, it can help to define strategic segment boundaries and reveal insights about the key forces in the competitive environment. It is often useful to carry out several industry or market analyses, and it can be essential if two or more groups of managers carry out an appraisal independently. Lastly, it is an outside-in approach to strategy formulation, which stresses the need to adapt the firm to its environment as a strategic requirement. It is therefore a way of looking at any industry and understanding the underlying structural drivers of profitability and competition.
However, it suffers from a number of weaknesses:

1. It is a static, single-point analysis of the present industry structure.
2. It is extremely difficult to know where the industry begins and ends, and also to weight the factors and thereby understand their relative importance.
3. It does not allow for risk and does not take account of alliances and networks in industries.
4. It is qualitative and hence does not give accurate measurement.

Porter (2008) outlines each force thoroughly in "The Five Competitive Forces That Shape Strategy", Harvard Business Review:

1. Threat of entry

New entrants to an industry bring new capacity and a desire to gain market share that puts pressure on prices, costs and the rate of investment necessary to compete.

2. The power of suppliers

Powerful suppliers capture more of the value for themselves by charging higher prices, limiting quality or services, or shifting costs to industry participants.

3. The power of buyers

Powerful customers, the flip side of powerful suppliers, can capture more value by forcing down prices, demanding better quality or more service, and generally playing industry participants off against one another, all at the expense of industry profitability.

4. The threat of substitutes

A substitute performs the same or a similar function as an industry's product by a different means.

5. Rivalry among existing competitors

Rivalry among existing competitors takes many familiar forms, including price discounting, new product introductions, advertising campaigns and service improvements.

9.3.3 S-C-P Model vs. Porter's Five Forces Framework: A Comparative Study
On the one hand, David Begg and Damian Ward (2012) emphasize that Porter's five forces model is closely related to the assumptions of perfect competition. The assumption of many buyers and many sellers relates to the bargaining power of buyers and sellers. The threat of new entrants is linked to the assumption of no entry barriers. The threat of substitutes links to the assumption of homogeneous products, as the threat of substitutes is very high under perfect competition. From a business perspective, monopoly is preferable to perfect competition: in monopoly there is no competition, the price is higher, and barriers to entry ensure that supernormal profits are long term. A manager would therefore clearly like to be in a monopoly, or to be capable of devising a business strategy that takes the firm from a perfectly competitive market to a monopoly. The firms within the industry have to decide how to react: the level of competition and rivalry could be high, or they may decide to collude with each other and lower competition. Obviously, the intensity of each of these five competitive forces determines the overall level of competition within the industry, which ultimately determines profits.
On the other hand, George Djolov (2014) argues that the S-C-P model is grounded in the belief that concentration emanates from collusion or collective monopoly, with the result that any transactions, such as mergers and acquisitions, that smack of largeness are likely to be classed by the model as anticompetitive, i.e. as departing from the outcome of perfect competition. He shows that the theoretical parts of the S-C-P model require, perhaps, greater latitude of understanding than is afforded to them at present, which in turn makes them unsuitable for upholding the model. The S-C-P paradigm's ideal of competition is the perfect competition model. According to Bain (1951), the model hypothesizes that the average profit rate of firms in oligopolistic industries of a high concentration will tend to be significantly larger than that of firms in less concentrated oligopolies or industries of atomistic structure. This main hypothesis has two variations. Bain (1951) formulated the first one as follows: if we hold demand, cost and entry conditions constant, monopoly or effectively collusive oligopoly tends to yield higher profit aggregates and prices in long-run equilibrium than competition or imperfectly- or non-collusive oligopoly. He formulated the second variation as follows: there will be a systematic difference in average excess profit rates on sales between highly concentrated oligopolies and other industries; the difference should be found, strictly, even if there are on average identical entry conditions in the two groups.

9.4 Monopolistic Competition and Five Forces

McConnell (2010) points out that monopolistic competition exhibits a considerable amount of competition mixed with a small dose of monopoly power. In general, monopolistically competitive industries are far more competitive than they are monopolistic.

9.4.1 Monopolistic Competition: Main Characteristics


1. A relatively large number of sellers.
2. Differentiated products, often promoted by heavy advertising.
3. Easy entry to, and exit from, the industry.

Begg and Ward (2012) summarize monopolistic competition as a highly competitive market in which firms may use product differentiation; for the most part it is an industry much like perfect competition, except for the existence of product differentiation. Firms produce similar goods or services which are differentiated in some form or other. Monopolistic competition also requires an absence of economies of scale: a monopolistically competitive industry will be characterized by a large number of small firms without the ability, or the need, to exploit size and scale. Examples of monopolistic competition can be found in every high street. Monopolistically competitive firms are most common in industries where differentiation is possible, such as the restaurant business, hotels and pubs, general and specialist retailing, and consumer services.

9.4.2 Monopolistic Competition: Five Forces Analysis


1. Threat of entry

Monopolistically competitive firms are relatively free to enter and exit an industry, although there might be a few restrictions. These firms are not perfectly mobile, as they would be under perfect competition, but they are largely unrestricted by government rules and regulations, start-up costs or other substantial barriers to entry.

2. The power of suppliers

It is clear that each supplier offers a similar but not identical product. Each supplier therefore does not face a perfectly elastic demand line, as it would in perfect competition.

3. The power of buyers

In monopolistic competition, buyers do not know everything, but they have relatively complete information about alternative prices. They also have relatively complete information about product differences, brand names and so on. Whatever the reason, buyers treat the goods as similar but different. Most importantly, each good satisfies the same basic want or need. The goods might have subtle but actual physical differences, or they might only be perceived as different by the buyers.

4. The threat of substitutes

There are obviously similar products: each firm in a monopolistically competitive market sells a similar, but not absolutely identical, product. The goods sold by the firms are close substitutes for one another, just not perfect substitutes. The element of differentiation lowers the degree of substitutability between rival offerings and results in each firm facing a downward-sloping demand line.

5. Rivalry among existing competitors

A monopolistically competitive industry contains a large number of small firms, each of which is relatively small compared to the overall size of the market. This ensures that all firms are relatively competitive, with very little market control over price or quantity. In particular, each firm has hundreds or even thousands of potential competitors. Each monopolistically competitive firm can influence its market share to some extent by changing its price relative to its rivals (http://www.amosweb.com/cgi-bin/awb_nav.pl?s=wpd&c=dsp&k=monopolistic+competition).

9.5 Change and Market Structures

9.5.1 Perfect Competition and (In)Efficiency


Besanko et al. (2009) argue that in the theory of perfect competition, the
market conditions will tend to drive down prices when two or more of the
following conditions are met:

1. There are many sellers.


2. Consumers perceive the product to be homogeneous.
3. There is excess capacity.

9.5.2 Efficiency of Competition
Consumers are efficient at all points on the demand curve D, whereas producers are efficient at all points on the supply curve S. Moreover, the demand curve is the marginal benefit curve MB, while the supply curve is the marginal cost curve MC. It is important to underline that resources are used efficiently at the point A, where marginal benefit equals marginal cost and the sum of producer surplus and consumer surplus is maximized.
Therefore, the efficient use of resources requires:

1. Consumers to be efficient, which occurs when they are on their demand curves.
2. Firms to be efficient, which occurs when they are on their supply curves.
3. The market to be in equilibrium, with no external benefits or external costs.
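The conditions above can be put to work in a minimal numerical sketch. The linear marginal benefit and marginal cost curves below are illustrative assumptions, not figures from the text; the point A where MB equals MC is also where the sum of consumer and producer surplus is maximized:

```python
# Illustrative linear curves (assumed for this sketch):
# marginal benefit (demand):  MB(q) = 100 - 2q
# marginal cost (supply):     MC(q) = 20 + 2q

def mb(q):
    return 100 - 2 * q

def mc(q):
    return 20 + 2 * q

# Efficient point A: MB(q) = MC(q)  ->  100 - 2q = 20 + 2q  ->  q* = 20
q_star = (100 - 20) / (2 + 2)
p_star = mb(q_star)  # equilibrium price = 60

# For linear curves the surpluses are triangles:
consumer_surplus = 0.5 * q_star * (mb(0) - p_star)   # area under MB above price
producer_surplus = 0.5 * q_star * (p_star - mc(0))   # area above MC below price
total_surplus = consumer_surplus + producer_surplus  # maximized at q*

print(q_star, p_star, total_surplus)
```

Producing more or less than q* shrinks the total surplus, which is the sense in which resources are used efficiently at point A.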

9.5.3 (In)Efficiency and Perfect Competition

As noted above, resource use is efficient when marginal benefit equals marginal cost; in other words, resource use is efficient when we produce the goods and services that people value most highly. Resources are not being used efficiently if someone can be made better off without anyone else becoming worse off.
Consumers are efficient along the demand curve D, and producers are efficient along the supply curve S. Where the curves intersect, at the competitive equilibrium, both consumers and producers are efficient. Consumers and producers allocate their budgets to get the most value possible out of their resources at all points along the demand curve D and the supply curve S, which are also their marginal benefit and marginal cost curves. Markets achieve productive and allocative efficiency in perfectly competitive equilibrium. Parkin (2010) notes that perfect competition achieves efficiency if there are no external benefits and external costs. In such a case, the benefits accrue to the buyers of the good and the costs are borne by its producer.
There are three main obstacles to efficiency:

1. Monopoly, which restricts output below its competitive level to raise price and increase profit. Government policies arise to limit such use of monopoly power.
2. Public goods. There are many examples, such as the enforcement of law and order, the disposal of sewage and garbage, the provision of clean drinking water, and national defense. Government institutions help to overcome the problem of providing an efficient quantity of public goods.
3. External costs and external benefits. The production of chemicals and steel, for example, can generate air and water pollution, and perfect competition might produce too large a quantity of these goods. Government policies attempt to cope with external costs and external benefits.

9.5.3.1 Monopoly and Perfect Competition

A monopolist's marginal revenue curve MR lies below the demand curve D, to which a competitive industry is constrained. Compared to a perfectly competitive industry, a single-price monopoly restricts its output and charges a higher price.

9.5.3.2 Public Goods


Public goods are goods and services that are consumed either by everyone or by no one. For instance, national defense systems cannot isolate individuals and refuse to protect them. The market economy fails to deliver the efficient quantity of public goods because of a free-rider problem. Governments not only establish and maintain property rights and set the rules for the redistribution of income and wealth, but also provide a non-market mechanism for allocating scarce resources when the market economy results in inefficiency, that is, market failure. By reallocating resources, it is possible to make some people better off while making no one worse off.

9.5.3.3 External Costs and External Benefits

External benefits are benefits that accrue to people other than the buyer of a good. In the absence of external benefits, the market demand curve measures marginal social benefit, the value that everyone places on one more unit of a good or service. External costs are costs that are borne not by the producer of a good or service but by someone else. In the absence of external costs, the market supply curve measures marginal social cost, the entire marginal cost that anyone bears to produce one more unit of a good or service.
The three main methods that governments use to cope with externalities are:

1. Taxes, used by governments as an incentive for producers to cut back on pollution. These are Pigouvian taxes, named in honor of Arthur Cecil Pigou, the British economist who first worked out this method of dealing with externalities during the 1920s.
2. Emission charges, an alternative to a tax for confronting a polluter with the external cost of pollution. The government sets a price per unit of pollution.
3. Marketable permits, under which each potential polluter is assigned a permitted polluting limit. This method of dealing with pollution provides an even stronger incentive than emission charges.

Samuelson (1967), by means of a rather curious example, attempts to demonstrate that free enterprise will result in greater economic inefficiency than either monopoly or ideal planning. Acknowledging that perfect competition would lead to an optimal equilibrium state, he discusses the various monopolistic imperfections of the real world which prevent the existence of perfect competition.
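The first of these methods, the Pigouvian tax, can be sketched numerically; all curves and the per-unit external cost below are assumed for illustration only. Setting the tax equal to the marginal external cost makes the taxed market reproduce the socially efficient output:

```python
# Illustrative sketch of a Pigouvian tax (all numbers are assumed):
# private marginal cost:  MC(q) = 10 + q
# external marginal cost: 8 per unit (pollution damage)
# marginal benefit:       MB(q) = 50 - q

def mb(q):
    return 50 - q

def mc_private(q):
    return 10 + q

EXTERNAL_COST = 8  # per-unit damage borne by third parties

# Unregulated market ignores the externality: MB = private MC
q_market = (50 - 10) / 2                    # overproduces

# Socially efficient output: MB = private MC + external cost
q_social = (50 - 10 - EXTERNAL_COST) / 2

# A per-unit tax equal to the external cost shifts the producer's
# cost curve up by exactly that amount, so the taxed market
# settles at the socially efficient quantity.
tax = EXTERNAL_COST
q_taxed = (50 - 10 - tax) / 2

print(q_market, q_social, q_taxed)
```

The point of the sketch is only that a correctly sized tax internalizes the externality: the market quantity falls from its unregulated level to the efficient one.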

9.5.4 Imperfect Competition and Resource Allocation

The problem of resource allocation is one of the most important issues in economics because it is a result of the fundamental economic problem, the problem of resource scarcity. The issue of scarcity raises three main questions that, we can say, constitute the basis of economic science. These questions are what is to be produced, how it is to be produced and, finally, who gets the output (Katz and Rosen 1998: 2-3). The allocation of resources has to do with how we answer these questions, or else with how "society's resources are divided up among the various outputs, among the different organizations that produce these outputs, and among the members of society" (Katz and Rosen 1998: 3). This demonstrates that the way resources are allocated constitutes one of the most important issues for economic theory. According to Mansfield and Yohe, "the allocation of resources among alternative uses is one of the major functions of an economic system" (2000: 289). The relative scarcity of resources makes it impossible for the economy to produce unlimited quantities of the goods that people want in order to satisfy their needs. That is why each economy has to develop a mechanism that will allocate these scarce resources among the production of different goods and, then, among the consumption of them.
The effectiveness with which the problems of allocation are solved determines the efficiency of an economy. Economic efficiency has to do with allocating resources across various economic activities in a way that ensures the maximization of utility for the economic agents. However, because efficiency is not an absolutely specific concept, it is quite difficult to measure using assessable values. The most widely accepted way to measure the degree of economic efficiency is the Pareto criterion. Using an example of apartments as the produced good and renters as the sample of consumers, Varian explains that "one useful criterion for comparing the outcomes of different economic institutions is a concept known as Pareto efficiency or economic efficiency. In our context, a way of allocating apartments to the renters is Pareto efficient if there is no alternative allocation that makes everyone at least as well off and makes some people strictly better off" (1996: 15). This leads us to the first welfare theorem, according to which competitive markets will lead to a resource allocation in which it is impossible to improve the position of one economic agent, either in production or in consumption, without making another worse off. So the functioning of competitive markets leads to a Pareto-efficient condition.
Thus, in order to achieve a perfect allocation of resources, an economy must ensure that competitive conditions apply in the markets for all goods; otherwise, in a case of imperfect competition, resources will be misallocated. The market form of perfect competition is the only one that ensures efficient resource allocation, because only then does the main condition hold, namely that the price (P) of produced goods equals their marginal cost (MC). When P = MC, the price of each product reflects its cost of production, which means that resources are allocated according to the demand of the people while also reflecting the true costs of producing them.
In a competitive environment, efficiency is promoted and consumers are charged lower prices. Perfectly competitive firms have incentives to use the best available technology. With full knowledge of existing technologies, firms will choose the technology that produces the output they want at the least cost, and each firm uses inputs such that the marginal value of each input is just equal to its market price.
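The input-side condition in the paragraph above can be sketched numerically. The production function, output price and wage below are assumptions for illustration only; a price-taking firm hires the input until its marginal value equals its market price:

```python
import math

# Assumed setup for this sketch:
# production: q(L) = 10 * sqrt(L); output price p = 4; wage w = 5.
P_OUTPUT = 4.0
WAGE = 5.0

def marginal_product(L):
    # dq/dL for q = 10 * sqrt(L)
    return 5.0 / math.sqrt(L)

def marginal_value(L):
    # value of one more unit of labor = output price * marginal product
    return P_OUTPUT * marginal_product(L)

# Optimal hiring solves P_OUTPUT * 5/sqrt(L) = WAGE -> sqrt(L) = 4 -> L = 16
L_star = (P_OUTPUT * 5.0 / WAGE) ** 2

# At the optimum the condition holds exactly: marginal value = wage
assert abs(marginal_value(L_star) - WAGE) < 1e-9
print(L_star)
```

Hiring less labor leaves the marginal value above the wage (profit is being left on the table); hiring more pushes it below, which is why the equality pins down the efficient input choice.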
Hence, as the price-marginal cost equality (P = MC) defines the efficiency of resource allocation, it is obvious that only the existence of competitive markets will allow the promotion of perfect resource allocation. Competitive markets can ensure that in the long-run equilibrium condition (production level X0), P = MC. Katz and Rosen note that "a Pareto-efficient allocation of resources requires that prices be in the same ratios as marginal costs, and competition guarantees this condition will be met" (1998: 391).
Furthermore, competitive markets lead to efficiency because when consumers decide the quantities they will buy of each product, they equate the marginal utility (MU) they get from consuming an extra unit of the product with the marginal cost (MC) of buying that extra unit, which is the price they pay. So, in perfect competition, since P = MC and P = MU, it follows that MC = MU as well. Conversely, as the optimal allocation of resources is ensured by the maintenance of competitive markets, it is obvious that the absence of competitive conditions will lead to a suboptimal allocation (misallocation) of resources. According to Katz and Rosen (1998: 398), an economy with freely operating markets may fail to generate an efficient allocation of resources for two general reasons: the existence of market power held by businesses (monopolistic/oligopolistic power) and the nonexistence of markets altogether. In the first case, the existence of market power allows firms to set prices, charging more than the competitive price and supplying less output than competitive markets would. As prices then fail to reflect the real costs of production (P ≠ MC), this violates the fundamental condition of optimal resource allocation and, with it, all the other conditions of efficiency.
For example, in a monopoly there is a misallocation of resources: at the production level where the monopoly maximizes its profits (X0), the equilibrium price (P0) is much higher than the marginal cost (MC). As Katz and Rosen argue, "if all of the other goods in the economy are sold in perfectly competitive markets at prices equal to their marginal costs, then the monopolist violates the condition for allocation efficiency because it sets the price of its product greater than its marginal costs" (1998: 430). Another example of resource misallocation is given by a monopolistically competitive market. There, although in long-run equilibrium the firm has zero profits (point B), this is not a sign of economic efficiency; again the price is higher than the marginal cost (P0 > MC0), indicating the misallocation of resources. "As long as P > MC we know that there is someone who is willing to pay more for an extra unit of output than it costs to produce that extra unit" (Varian 1996: 413). Furthermore, apart from non-competitive markets, the worst problem is the nonexistence of markets altogether: there is the possibility that no supplier exists for a demanded good or service.
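The contrast between the competitive benchmark (P = MC) and the monopoly outcome (P0 > MC) can be made concrete with a small sketch; the linear demand curve and constant marginal cost below are assumed values, not taken from the text:

```python
# Assumed curves: inverse demand P(q) = 100 - q, constant MC = 20.
MC = 20.0

def price(q):
    return 100 - q

def marginal_revenue(q):
    # For linear inverse demand P = 100 - q, revenue = 100q - q^2,
    # so MR = 100 - 2q (twice the slope of the demand curve).
    return 100 - 2 * q

# Competitive outcome: P = MC  ->  100 - q = 20
q_comp = 100 - MC            # competitive quantity
p_comp = price(q_comp)       # price equals marginal cost

# Single-price monopoly: MR = MC  ->  100 - 2q = 20
q_mono = (100 - MC) / 2      # output restricted below competitive level
p_mono = price(q_mono)       # price set well above marginal cost

# Deadweight loss: surplus lost on units between q_mono and q_comp
deadweight_loss = 0.5 * (q_comp - q_mono) * (p_mono - MC)

print(q_comp, p_comp, q_mono, p_mono, deadweight_loss)
```

The monopolist sells half the competitive quantity at triple the marginal cost, and the triangle between the two outcomes is exactly the value destroyed by the P > MC wedge that the text describes.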
The appearance of market power or the absence of market mechanisms will result in the phenomenon of market failure. Market failure occurs when resources are misallocated or allocated inefficiently, and the result is waste or lost value. Evidence of market failure is revealed by the existence of imperfect market structures, external costs and benefits (externalities), and imperfect (or asymmetric) information. The appearance of market failure violates the first welfare theorem; however, market failure is a very common problem in real economies.

9.6 Correcting Market Failures

The most frequent way to resolve the problem of market failure is government intervention, either by regulating markets or by substituting for them through the provision of public goods. In a case of imperfect competition, the government has to regulate the market in order to promote competition, to decrease the monopolistic power of firms and to prevent the creation of collusion and cartels. Antitrust law in the USA and European Union competition law are characteristic examples of such regulations. The government can also deal with negative externalities by charging firms an extra fee in order to align the private economic cost with the true social cost of production. In the case of market absence, the government can also offer a kind of solution by supplying the products itself. However, it must be noted that it is not guaranteed that the intervention of the government will improve economic efficiency. Even the provision of public goods can be problematic, as providing these goods to the people who need them cannot prevent others from consuming them. As Katz and Rosen state, "that the market-generated allocation is imperfect does not mean that the government can do better" (1998: 399), as, for example, the cost of setting up an agency to deal with externalities can be higher than the cost of the externality itself.
Concluding, it can be said that only the promotion of competitive markets can ensure the efficient allocation of resources. In the case of market failure, the government has to develop the policies and mechanisms that will diminish the negative effects of this failure. However, the government's intervention in the allocation process must be restricted; otherwise, it will create more problems than those it may solve.

9.7 Conclusion
The next decades will bring fundamental changes for businesses around the globe, and the actions businesses take will begin to have a significant impact on the shape of each economy. In the short-term future, businesses will
typically be involved in a range of collaborations, partnerships and joint
ventures, supporting investment finance, R&D and innovation, train-
ing and new organizational structures. There will be much more rigor in
identifying investment and innovation projects for funding and businesses
will have outsourced the next level of activities, including many special-
ist tasks. The workforce will be more diverse, highly flexible and mobile,
making the most of new ways of working and using more business-relevant
professional skills. This will leave organizations focused on a smaller core
of people and projects, supported by a much wider range of individuals
and businesses around the periphery. Building and maintaining trust with
business partners and the public will become critical to the smooth opera-
tion of these structures, and compliance with governance and sustainabil-
ity standards will be a major objective.
Effective management and the wise use and allocation of available resources will be the key determinants of survival and success. Globalization has resulted in increased competition and expanded international operations for many US and European companies, and these effects will likely continue in the new era. In fact, the rapidly accelerating development of Brazil, Russia, India and China (BRIC), with Africa next on the horizon, will likely drive global economic growth, as well as many companies' financial prospects, in the new business environment. Accordingly, boards will need to continue to address these factors, especially by adapting corporate governance practices to take into account the ethical business practices of, and companies' relationships with, foreign governments. Government intervention and market regulation/deregulation in different market structures will be of great significance in the process of change and the transition to the new era. Factors such as human resources, culture change, creativity and innovation are vital to a business, and if they are not properly managed they could lead to the company faltering. Strategic intervention techniques would assist the company to carry on in certain situations, survive and continue to prosper. Resistance is to be expected in any change process, and transition is a painful period for companies and for economies in general. The crucial aspect will be the identification of the actual sources of resistance and the creation of an effective action plan based on communication, flexibility, and appropriate leadership styles and managerial techniques.

9.8 Appendix 1: The Greenhouse Effect


The gases, which are all naturally occurring, act as a blanket, trapping in the heat and preventing it from being reflected too far from the Earth. They keep the Earth's average temperature at about 15°C: warm enough to sustain life for humans, plants and animals. Without these gases, the average temperature would be about −18°C: too cold for most life forms. This natural warming effect is also sometimes called the greenhouse effect.

9.9 Appendix 2: KYOTO PROTOCOL to the United Nations Framework Convention on Climate Change

The Parties to this Protocol, Being Parties to the United Nations Framework Convention on Climate Change, hereinafter referred to as "the Convention", in pursuit of the ultimate objective of the Convention as stated in its Article 2, Recalling the provisions of the Convention, Being guided by Article 3 of the Convention, Pursuant to the Berlin Mandate adopted by decision 1/CP.1 of the Conference of the Parties to the Convention at its first session,
Have agreed as follows:

9.9.1 Article 1
For the purposes of this Protocol, the definitions contained in Article 1 of
the Convention shall apply. In addition:

1. "Conference of the Parties" means the Conference of the Parties to the Convention.
2. "Convention" means the United Nations Framework Convention on Climate Change, adopted in New York on 9 May 1992.
3. "Intergovernmental Panel on Climate Change" means the Intergovernmental Panel on Climate Change established in 1988 jointly by the World Meteorological Organization and the United Nations Environment Programme.
4. "Montreal Protocol" means the Montreal Protocol on Substances that Deplete the Ozone Layer, adopted in Montreal on 16 September 1987 and as subsequently adjusted and amended.
5. "Parties present and voting" means Parties present and casting an affirmative or negative vote.
6. "Party" means, unless the context otherwise indicates, a Party to this Protocol.
7. "Party included in Annex I" means a Party included in Annex I to the Convention, as may be amended, or a Party which has made a notification under Article 4, paragraph 2(g), of the Convention.

9.9.2 Article 28
The original of this Protocol, of which the Arabic, Chinese, English,
French, Russian and Spanish texts are equally authentic, shall be deposited
with the Secretary-General of the United Nations.
DONE at Kyoto this eleventh day of December one thousand nine
hundred and ninety-seven.
IN WITNESS WHEREOF the undersigned, being duly authorized
to that effect, have affixed their signatures to this Protocol on the dates
indicated.
252 K. BIGINAS

9.9.3 Annex A

9.9.3.1 Greenhouse Gases


Carbon dioxide (CO2), Methane (CH4), Nitrous oxide (N2O),
Hydrofluorocarbons (HFCs), Perfluorocarbons (PFCs), Sulfur
hexafluoride (SF6)

9.9.3.2 Sectors/Source Categories


Energy, Fuel combustion, Energy industries, Manufacturing industries
and construction, Transport, Fugitive emissions from fuels, Solid fuels,
Oil and natural gas, Industrial processes, Mineral products, Chemical
industry, Metal production, Production of halocarbons and sulfur
hexafluoride, Consumption of halocarbons and sulfur hexafluoride,
Solvent and other product use, Agriculture, Enteric fermentation,
Manure management, Rice cultivation, Agricultural soils, Prescribed
burning of savannas, Field burning of agricultural residues, Waste, Solid
waste disposal on land, Wastewater handling, Waste incineration.

9.9.4 Annex B

9.9.4.1 Party Quantified Emission Limitation or Reduction Commitment
(Percentage of base year or period)
Australia 108, Austria 92, Belgium 92, Bulgaria* 92, Canada 94,
Croatia* 95, Czech Republic* 92, Denmark 92, Estonia* 92, European
Community 92, Finland 92, France 92, Germany 92, Greece 92, Hungary*
94, Iceland 110, Ireland 92, Italy 92, Japan 94, Latvia* 92, Liechtenstein
92, Lithuania* 92, Luxembourg 92, Monaco 92, Netherlands 92, New
Zealand 100, Norway 101, Poland* 94, Portugal 92, Romania* 92,
Russian Federation* 100, Slovakia* 92, Slovenia* 92, Spain 92, Sweden
92, Switzerland 92, Ukraine* 100, United Kingdom of Great Britain and
Northern Ireland 92, United States of America 93
* Countries that are undergoing the process of transition to a market
economy.
Source: http://www.kyotoprotocol.com/resource/kpeng.pdf
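The Annex B figures are commitments expressed as percentages of base-year emissions. A minimal sketch of how such a cap reads in practice (the emissions figure below is hypothetical, chosen only for illustration; the 2008–2012 commitment period comes from the Protocol's Article 3, which is not reproduced in this excerpt):

```python
# Annex B commitments are percentages of base-year emissions.
# A value of 92 means average annual emissions over the commitment
# period (2008-2012) must not exceed 92% of the base-year level.

def commitment_cap(base_year_emissions: float, annex_b_percent: float) -> float:
    """Maximum allowed emissions under the given Annex B percentage."""
    return base_year_emissions * annex_b_percent / 100.0

# A hypothetical Party with base-year emissions of 500 Mt CO2-eq
# and a 92% commitment:
print(commitment_cap(500.0, 92))  # 460.0 Mt CO2-eq
```

Values above 100 (e.g., Australia 108, Iceland 110) permit a limited increase over the base year rather than a reduction.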

9.10 Appendix 3: Confronting Climate Change, with Al Gore

Human activities like deforestation and the burning of fossil fuels like
coal, oil, or gas are changing our climate in ways that pose increasing
threats to human well-being, in both developing and industrialized
nations. We are already experiencing the harmful effects of the climate
crisis, and we know that more severe damage lies ahead unless we act
quickly. The good news is that we can still avoid the most severe impacts
of global warming by reducing our emissions of heat-trapping gases and
by halting and reversing deforestation. However, as we work to reduce
these emissions of global-warming pollution, by investing in renewable
energy and by protecting our forests and soils, we must also begin to
prepare for the changes already coming, by working to better understand
the risks and by integrating these needs into our development planning.
Since the beginning of the industrial revolution, when we began to put
heat-trapping gases in large volumes into the atmosphere, the global
average temperature has risen almost one degree Celsius, and a further
fraction of a degree is already in store for us because it is in the ocean
and will be released into the atmosphere. If we do not dramatically
reduce our emissions, the global average temperature is expected to rise
by as much as four or more degrees Celsius by the end of this century,
and that would cause severe damage to natural systems and to human
health and well-being. Sustained warming of this magnitude could make
hundreds of millions of people climate refugees because of coastal
flooding, and put as many as a billion or more people at risk of increased
water stress. Sustained warming of this magnitude could cause large-scale
irreversible changes, including the extinction of up to 20–30% of the
world's plant and animal species. Some of the regions most at risk of
species extinction are areas that are expected to have the most species
turnover due to the changing climate. In addition, the destabilization
and extensive melting of the Greenland and West Antarctic ice sheets
(the number of days of Greenland ice sheet melting has increased
dramatically since 1979) and the disappearance of an Antarctic ice shelf
and a dozen others warn us that the melting of large areas in West
Antarctica and Greenland could cause sea level to rise between 4 and
12 meters, with each meter causing roughly another one hundred
million refugees.

CHAPTER 10

EU Operational Program Education for Competitiveness and Its Impact
on Sustainable Development

Petr Svoboda and Jan Cerny

10.1 Introduction
Sustainable development assumes a long-term vision of consistency in
pursuing goals by implementing measures that change the patterns of
economic, social, and environmental interaction. Key prerequisites for
sustainability are highly qualified professionals and more comprehensive
and operational information support. Securing these prerequisites
requires unprecedented financial resources invested in education and
research, a sector that has little connection with revenue-producing
activities. Aware of this situation, the countries of the European Union
have developed specific operational programs that aim at overcoming
this separation gap.
Generally, sustainability rests on three pillars: environmental,
social, and economic. For aspiring entrepreneurs, the third one is
the most important, because it creates resources for the other two.

P. Svoboda (*) · J. Cerny
University of Economics, Prague, Czech Republic
e-mail: psvoboda86@seznam.cz; cerny@fm.vse.cz

© The Author(s) 2017
E.G. Carayannis, S. Sindakis (eds.), Analytics, Innovation, and
Excellence-Driven Enterprise Sustainability, Palgrave Studies
in Democracy, Innovation, and Entrepreneurship for Growth,
DOI 10.1057/978-1-137-37879-8_10

The way to strengthen the economic pillar is to achieve good business
competitiveness.
The European Union has tried to enhance the entrepreneurial
competitiveness of its member countries through the Education for
Competitiveness Operational Program (ECOP). It is a multi-year
thematic program, within which it is possible to draw financial means
from the European Social Fund (ESF), one of the main Structural Funds
of the European Union in the programming period 2007–2013. The
ECOP focuses on the development of human resources through
education in all its various forms, with an emphasis on a comprehensive
system of lifelong learning, the creation of an appropriate environment
for research, development, and innovative activities, and also the
stimulation of cooperation among the entities involved. Thus, it
supports the sustainability of business activities and, as a side effect, the
sustainable development of higher education institutions (HEIs) as well.
This innovation-oriented program is primarily of the triple helix type:
government–industry–universities, with the innovation aimed at
industry. However, universities themselves are secondarily forced to
innovate for competitiveness as well. Thus, one can observe another
innovative helix drafted on the primary one. Its pillars are: the
government; the university as the object of innovation; the university as
the subject (creator of innovation); and the user (the student, educated
through the innovative curriculum).

10.1.1 Notion of Sustainable Development

The term sustainable development was more widely introduced at the
United Nations Conference on the Human Environment in Stockholm
in 1972. Two decades later, the Rio Earth Summit prioritized global
environmental discussions, and the term became prominent after 1992.
The resulting Rio Declaration on Environment and Development also
advocated the role of education in preventing ecological degradation
(Kubiszewski and Cleveland 2012).
The most widely accepted definition of the term sustainable development
is probably the one used in the publication Our Common Future:
"Development which meets the needs of the current generation without
compromising the ability of future generations to meet their needs"
(Brundtland Commission 1987, p. 8). The advantage of this definition
is that it describes a future in which all countries could engage, but it is
also easily contestable. In addition, a universal model for applying
sustainability and sustainable development has not been developed yet,
as the definition is not instructive. Therefore, developing the ideas
further, in terms of defining what "sustainable" means, became necessary
in order to implement sustainable development. In this chapter,
sustainability is understood as the end state and sustainable development
as the process of getting there.
It had been recognized by environmentalists and researchers that
development patterns were harming the environment and that social
problems were emerging. Thus, an additional challenge was to further
expand the elements of a new type of development. Many action
frameworks and models were created to identify priority areas in
sustainable development in an attempt to address these imbalances.
Social, environmental, and economic elements were identified as the
three main pillars of sustainable development and as ways to achieve
progress. These three pillars, identified at the Rio Earth Summit, also
clarified the definition of sustainable development and its application.
People, the planet, and profits are all inextricably linked and
interdependent and must therefore be synchronized accordingly. In
other words, each of the three pillars carries similar importance in
creating and maintaining stability and balance.

10.2 Higher Education Institutions and Sustainable Development

10.2.1 History and Basic Features

At the first Earth Summit in 1972 in Stockholm, as noted earlier, the
concept of sustainable development was originally introduced.
Participating government representatives and non-governmental
organizations identified education as fundamental to the successful
achievement of sustainable development. However, after the first Earth
Summit, progress was rather unsatisfactory. Finally, in 2005, the United
Nations adopted the Decade of Education for Sustainable Development,
and a badly needed injection of urgency was administered (UNESCO
2005).
The Decade aims at the integration of the principles, values, and
practices of sustainable development into all aspects of education and
learning. Such input should encourage changes in behavior that will
create a more sustainable future concerning society, economic viability,
and environmental integrity for present and future generations. The
philosophy of sustainable development has evolved to include more than
just recycling and constructing buildings with solar panels, by
recognizing that human behavior can be altered to limit harmful effects
on the environment. It now also includes how individuals and
communities behave and interact with the Earth.
The Decade of Education for Sustainable Development includes all
levels of formal and informal education, but formal higher education is
considered fundamental to the strategy for achieving sustainability as it
influences graduates who go on to become leaders in their organizations
and countries.
Two unique opportunities for HEIs to engage in sustainable development
were identified by UNESCO (2004). The first was that universities form
a link between knowledge generation and the transfer of knowledge to
society, preparing graduates for their entry into the labor market. This
includes the education of teachers, as they play the most important role
in providing education at all levels. The second was that universities
actively contribute to development through outreach and service to
society. Cortese (2003) underlined this notion by stating that universities
bear a moral responsibility to increase the awareness, knowledge, skills,
and values needed to create a just and sustainable future. Cortese also
stated that higher education plays a critical, and often overlooked, role
in making this vision a reality, and that it prepares most of the
professionals who teach, manage, lead, work in, and influence society's
institutions. Thus, universities have a tangible and critical role in
developing the principles and qualities needed to improve the awareness
and delivery of the sustainable development philosophy.

10.2.2 Sustainable Development of the HEIs Themselves: Czech Experience
In order for an HEI to fulfill its role, as described in the previous
subsection (10.2.1), it is essential that its own development be sustainable
as well. Of course, it is almost impossible for a school to directly
endanger the environment of the surrounding region, so the
environmental pillar of HEIs can be considered strong enough.
Unfortunately, however, this cannot be said about the other two pillars,
as the example of Czech universities illustrates.

10.2.3 Economic Situation of Czech Universities

Almost every Czech university consists of highly autonomous parts. Such
a part is called "fakulta" in Czech, which has a similar meaning to the
English word "college" or "department". For example, the famous
Charles University in Prague includes the Faculty of Law, the Faculty of
Mathematics and Physics, the Faculty of Philosophy, and 24 other
schools. The economic situation of the various faculties of the same
university may be considerably different. This is because the income of
each faculty is composed of three parts:

1. Capitation payments from the state for students. However, each
faculty has a predetermined target number, that is, the maximum
number of students for which it can get paid.
2. Payments for scientific results, such as published articles or
monographs, obtained grants, and so on.
3. Additional income from companies or public administration, for
example, for various studies, projects, staff training, expert reports,
consultations, and so on.
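A minimal sketch of the three-part income model described above (all figures, including the per-student capitation rate, are hypothetical; only the structure follows the text):

```python
# Faculty income = capitation (capped at the target number of students)
#                + payments for scientific results
#                + additional income from companies or public administration.

def faculty_income(enrolled: int, target: int, rate_per_student: float,
                   science_income: float, additional_income: float) -> float:
    """Annual income of a faculty under the three-part model."""
    # The state pays only up to the predetermined target number.
    capitation = min(enrolled, target) * rate_per_student
    return capitation + science_income + additional_income

# A faculty that misses its target of 1,000 students earns less capitation:
print(faculty_income(enrolled=850, target=1000, rate_per_student=1500.0,
                     science_income=200_000.0, additional_income=50_000.0))
# 1525000.0
```

The `min(enrolled, target)` cap is what makes declining student demand (type-1 income) a hard problem for deans: enrollment above the target brings no extra capitation, while any shortfall cuts income directly.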

For students of the same field (e.g., Economics and Management), the
financial contribution from the state is equal, regardless of whether the
student studies in Prague, Brno, or Plzen. On the contrary, the financial
situation of the faculties in which they study can be very different,
because:

(1) Sometimes faculties do not meet the target number, and therefore
their capitation income is lower.
(2) The teachers of one faculty may have much less significant scientific
results than the teachers of another faculty.
(3) Additional income can also be very different.

10.2.4 Measures for Overcoming These Problems

Deans of faculties know well what to do in order to increase revenues of
types 2 and 3. However, the problem of declining student demand,
causing a decline in income of type 1, is much more complicated. The
main reasons are the following:

R1: Applicants for study are recruited from smaller and smaller birth
cohorts. Since primary and secondary schooling in the Czech Republic
takes 13 years (5 + 8) and starts at age 6, most of the applicants in the
year 2013 were born in 1994. The numbers of live births in 1985–1996
were, successively: 135,881; 133,356; 130,921; 132,667; 128,356;
130,564; 129,354; 121,705; 121,025; 106,579; 96,097; and 90,446.
Over the past 10 years, the pool of potential students thus decreased
from 135,881 to 106,579, that is, by about 22%. And this development
will continue for at least the next 2 years. On the other hand, teaching
capacities have remained stationary or have even slightly increased.
R2: The decision-making of applicants is a complex process, which is
difficult to describe exactly. They take into account fuzzy information
such as the image of the faculty and the picture of the faculty drawn by
its current students, mainly concerning the quality of education, of
facilities, and of information and communication services.
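The decline quoted in R1 can be reproduced from the birth figures with a short calculation (the 19-year offset between birth year and application year is inferred from the 6 + 13 schooling description):

```python
# Live births in the Czech Republic, 1985-1996, as listed in R1.
births = {
    1985: 135_881, 1986: 133_356, 1987: 130_921, 1988: 132_667,
    1989: 128_356, 1990: 130_564, 1991: 129_354, 1992: 121_705,
    1993: 121_025, 1994: 106_579, 1995: 96_097, 1996: 90_446,
}

# Applicants born in year Y typically apply about 19 years later (6 + 13),
# so the 1994 cohort feeds the 2013 admission round.
decline = (births[1985] - births[1994]) / births[1985]
print(f"Cohort decline 1985 -> 1994: {decline:.1%}")  # about 21.6%, i.e. ~22%
```

The 1995 and 1996 cohorts (96,097 and 90,446) explain why the decline "will continue for at least the next 2 years".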

The first reason cannot be influenced by HEIs; thus, the authors focus
on the other one. Most deans try to adopt traditional marketing measures
such as advertising, open houses, and so on. The Faculty of Management
of the University of Economics is trying to extend this spectrum in a way
thoroughly described below.

10.3 Original Approach of the University of Economics in Prague,
Faculty of Management

The Faculty of Management has tried to overcome the problems outlined
in R2 in an original way over the past three years. The following section
focuses on two main points of its approach:

P1: Specification of the method for evaluating the quality of HEIs.
P2: Gaining support from the ESF for the project "Innovation of the
Field of Study and Educational Programs of the Faculty of
Management".

10.3.1 Specification of the Method for Evaluating the Quality of HEIs

Insight into Quality: The importance of the quality of HEIs is
continuously increasing, even in countries where the majority of local
HEIs are financed from public resources. Although such an arrangement
necessarily influences overall market functionality, there still remains a
reasonable space in which HEIs compete heavily against each other for
new students (Svoboda et al. 2012). Providing higher education has
become a product, and HEIs have been driven by competition to review
the quality of their services, to redefine their product, and to measure
and track customer satisfaction in ways that are well known to service
marketing specialists (Kotler and Fox 1985). The long-term survival of
HEIs depends on the quality of their services. HEI leaders have also
realized that quality distinguishes one HEI from the rest (Aly and
Akpovi 2001; Kanji et al. 1999).

10.3.2 Evaluating the Quality of HEIs

The Faculty of Management chose to approach the quality determinants
of HEIs rather than quality as such. Three main quality dimensions were
selected: Education (intangibles), Facilities (tangibles), and Information
and Communication Channels. The goal was to measure the weights of
particular aspects of quality in order to establish a hierarchy of the
impacts of these aspects on student satisfaction. The most important
demographic variable identified was the type of study, since significant
differences in the perception of HEI quality between distance and
full-time students were found. This indicates that HEI managers should
approach distance students in a different way than full-time students.
Providing optimal quality on all the dimensions identified by students
could be attractive for HEI managers, but assigning wrong priorities to
the important factors could result in an inefficient allocation of resources.
HEIs have to continuously identify the needs and wants of their students
as their primary customers and try to fully satisfy them (Svoboda and
Cerny 2013a, b).
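The chapter does not specify the statistical method behind the weight measurement. One common sketch, under the assumption that overall satisfaction is approximately a linear mix of the three dimension ratings, is an ordinary least-squares fit; all survey data below are invented for illustration only:

```python
import numpy as np

# Hypothetical survey data: each row is one student's 1-5 ratings of the
# three quality dimensions named in the text; `overall` is that student's
# overall satisfaction. (Illustrative numbers; the chapter publishes no data.)
ratings = np.array([
    # Education, Facilities, Info&Comm channels
    [5, 3, 4],
    [4, 4, 3],
    [3, 2, 4],
    [5, 4, 5],
    [2, 3, 2],
    [4, 5, 4],
])
overall = np.array([4.5, 3.8, 3.2, 4.9, 2.4, 4.3])

# Least-squares fit of overall satisfaction on the dimension ratings;
# the fitted coefficients act as empirical weights of each dimension.
X = np.column_stack([ratings, np.ones(len(ratings))])  # add intercept column
weights, *_ = np.linalg.lstsq(X, overall, rcond=None)

for name, w in zip(["Education", "Facilities", "Info&Comm"], weights):
    print(f"{name}: weight ~ {w:.2f}")
```

Fitting the model separately for distance and full-time students would surface the group difference the text reports, since each subgroup would yield its own weight hierarchy.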

Gaining Support from the European Social Fund: The second point of
the Faculty of Management's approach to overcoming the problems
outlined in R2 was gaining support from the ESF for the project
"Innovation of the Field of Study and Educational Programs of the
Faculty of Management". The project was supported by the ESF, one of
the Structural Funds of the European Union, which is described in the
following section in detail.

10.4 European Social Fund Support

10.4.1 Basic Approaches
European society must adapt to the dynamically evolving conditions of
the global economy and to the related demands on the flexibility,
knowledge, and skills of each individual. A growing number of people
will be forced to engage in lifelong learning due to global competition
and frequent changes in the labor market. The urgency of these
challenges is simultaneously intensified by the fact that the populations
of many European countries are experiencing a demographic decline,
and the labor market needs to replace the people who are retiring. A
lack of readiness, or an inability to respond effectively to these
conditions, may cause serious problems for the competitiveness of
European countries.
The European Union is aware of the importance of supporting the
development of human potential as one of the fundamental factors for
the sustainable economic growth of the knowledge economies of
developed countries. In this context, the educational system has to be
perceived as one of the main pillars of future economic success and
social cohesion. Taking into account the high degree of complexity of
education issues, the Czech Republic has prepared a multi-objective
operational program, Education for Competitiveness.

10.4.2 EU Operational Program Education for Competitiveness

In the Czech Republic, there are few financial sources for funding
teaching improvements apart from the OPEC or the special ministerial
fund for university development. The OPEC is one of the thematic
programs in the Czech Republic funded by the Structural Funds of the
European Union in the years 2007–2013. The OPEC has been compiled
in the context of the main strategic documents of the Czech Republic,
namely the National Innovation Policy, the National Strategic Reference
Framework, and the Economic Growth Strategy.

The program has been divided into five priority parts (axes) covering
the particular specific objectives of the OPEC: Initial education; Tertiary
education, research and development; Further education; System
framework of lifelong learning; and Technical assistance. The total
amount allocated to the OPEC from the ESF is EUR 1.83 billion.

10.4.2.1 Initial Education

The first priority axis deals with the development and quality
improvement of the system of initial education (i.e., primary and
secondary education). More specifically, this axis is focused on setting up
an initial education system that equalizes access to education, with
emphasis on assuring the educational quality of teaching staff, supporting
key skills, and taking each person's individual talents into consideration
to increase the employability of students. At the same time, it aims to
achieve a positive approach to further education for both teaching and
non-teaching staff. The main tool of the first priority axis is the
curriculum reform, which develops the key competencies of pupils.
Emphasis is also put on monitoring, evaluation, self-evaluation by
schools, and quality-assuring tools. The total amount allocated to this
axis is EUR 612.1 million (34% of the allocation).

10.4.2.2 Higher Education

The second priority axis is aimed at the modernization of higher
education, which also includes making the system of tertiary professional
education more attractive. In the area of higher education, emphasis is
placed on an offer of study programs that reflects labor market trends
and the requirements of the knowledge economy. The master's and
follow-up doctoral study programs should prepare high-quality
graduates, with a focus on their potential activity in research and
development. Simultaneously, focus is placed on making the research
and development environment more attractive for the people involved,
and attention is also paid to increasing the attractiveness of, and
promoting, research and development across the entire educational
system. The total amount allocated to this axis is EUR 626.5 million
(35% of the allocation).

10.4.2.3 Further Education

The third priority axis is focused on strengthening the adaptability and
flexibility of human resources in order to increase the economic
competitiveness and sustainable development of the country. The
intention is to perceive further education not only as an integrated
system but also as an open system in which the competencies and
responsibilities of individual institutions are defined. Another objective
of this priority axis is systemic support for the country's population in
mastering general skills, with emphasis on support for business,
information technologies, and language skills. The primary focus is on
supporting the offer of further education, particularly informal education
and competencies. The total amount allocated to this axis is EUR 289.9
million (16% of the allocation).

10.4.2.4 Lifelong Learning

The System framework of lifelong learning, as the fourth priority axis,
supports activities contributing to the development of the lifelong
learning system at all levels of education in the country (e.g., primary,
secondary, and higher). The total amount allocated to this axis is EUR
227.1 million (12% of the allocation).

10.4.2.5 Technical Assistance

The last priority axis, Technical assistance, supports the effective
management and implementation of the OPEC. The activities supporting
management, implementation, checking, monitoring, evaluation, and
publicity of the program are primarily funded within this priority axis.
The total amount allocated to this axis is EUR 72.4 million (4% of the
allocation).
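The five axis allocations can be cross-checked against the EUR 1.83 billion total with a short calculation (figures as quoted in the subsections above; note that the rounded percentages quoted in the text sum to 101%, and the shares computed against the 1,828.0 total differ slightly from them, suggesting they were rounded against a slightly different base):

```python
# Allocations by priority axis (EUR million), as quoted in Sect. 10.4.2.
axes = {
    "Initial education": 612.1,
    "Tertiary education, R&D": 626.5,
    "Further education": 289.9,
    "Lifelong learning framework": 227.1,
    "Technical assistance": 72.4,
}

total = sum(axes.values())
print(f"Total: EUR {total:.1f} million")  # Total: EUR 1828.0 million, ~1.83 billion

for name, amount in axes.items():
    print(f"{name}: {amount / total:.1%}")
```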

10.5 Practical Examples of the Innovation Process

Rather than providing a definitive list of approaches to incorporating
sustainable development in HEIs, this chapter aims at outlining how
particular HEIs are developing institutional approaches to incorporate
the values of, and opportunities for, sustainable development. It
demonstrates this by examining various functions and operations of
universities.

10.5.1 University of Hradec Kralove

The goal of the innovative project at the University of Hradec Kralove
was to improve the study of transcultural communication and to
implement teaching in English. The transcultural communication study
program was established as a response to current challenges in the
world, such as migration, globalization, and the merging of subcultures.
This program was accredited in Czech and English. New demands on
the implementation of transcultural dialogue require the constant
deepening of the knowledge obtained. Transcultural communication
overcomes the partiality of the multi-cultural concept by looking for
common ground, which is a condition of dialogue and coexistence
between different cultures.
The target groups are the students and academics of the university.
The main output of the innovative project is a significant improvement
of the study program through practical field excursions, the preparation
of curriculum materials and professional publications, the involvement
of experts from abroad in teaching, and increasing the professional
competencies of academic staff. The total amount allocated to this
project was approximately EUR 650,000 (Kurikova and Ulrichova
2012).

10.5.2 College of Business and Hotel Management in Brno

The innovative project of the College of Business and Hotel Management
in Brno was based on the needs of its target groups of students and
academic staff. The project has been implemented through five key
activities. The first key activity was focused on the modernization of
teaching facilities. The second key activity was the core of the project,
and its main outputs were new teaching materials prepared by academics.
The third key activity aimed at strengthening the integrity of part-time
studies, and its major output was adequate distance learning enabled by
modern e-learning facilities. The fourth key activity was focused on
strengthening the partnership with key institutions in the region, with
emphasis on the requirements of the knowledge economy. Thus, it
consisted of the coordinated work of the partners, whose main intent is
to provide internships for students, to help in setting up specialized
teaching materials, to participate in the exchange of experience with the
implementation of e-learning, and to collaborate in organizing joint
exhibitions, workshops, and conferences.
The final key activity is the evaluation of the innovative project. Its
outcome is the evaluation report, a comprehensive study evaluating the
efficacy and efficiency of the project. The evaluation report is not only
an evaluation document; it also focuses on the issue of the sustainability
of future outcomes. The evaluation report sets the main dynamic and
time-varying parameters of the educational strategy in relation to the
target groups of students and academic staff. It also sets the fundamental
axis for long-term sustainability. The total amount allocated to this
project was approximately EUR 300,000 (Trojan 2011).

10.5.3 University of Economics in Prague's Faculty of Management

The project "Innovation of the Field of Study and Educational Programs
of the Faculty of Management" aims to raise the quality of education
and to contribute to the better employability of graduates in the
domestic and foreign labor markets. The activities of the project also
include upgrading the library with quality literature, new software for
teaching support, creating e-learning study materials for the needs of
part-time students, preparing an automated system for tracking the
demand of the labor market, improving feedback from students and
graduates, and developing a database of potential practice providers.
The output of the innovative project is an attractive and flexible study
program addressing the target groups. The main benefits of the
innovated study program are: obtaining a deep base of knowledge and
skills in management without the need to select a specialization at the
beginning of the studies; the opportunity to specialize according to
students' interests after becoming familiar with the basics of the study
field; preparation for the application of these skills in a specific sector in
practice; and the improvement of theoretical knowledge if the student
decides to continue to the follow-up master's program. The total
amount allocated to this project was approximately EUR 650,000 (FM
VSE Innovation 2009).

10.5.4 Experience of the University of Economics in Prague

The main idea of the project was the development and innovation of
educational methods and curricula. The project, however, has had a
significant positive impact on other aspects of education as well. One of
the positive effects of the project was that it provided an opportunity, or
even the need, to implement innovations within individual subjects and
to reconsider the related study aids. Thus, outdated aspects of particular
subjects could be eliminated, the content of subjects adjusted, and
support materials extended for more effective teaching. All this, however,
stands beside the already outlined main idea of the project.
EU OPERATIONAL PROGRAM EDUCATION FOR COMPETITIVENESS AND ITS... 267

Difficulties of the innovation lie mainly in the fact that the new modular subjects carry a large number of credits, which corresponds to a higher allocation of teaching hours. This fact brings a number of complications. For example, there are higher demands on the cooperation of teachers than they were used to, since several teachers alternate within each group of students, which places greater demands on coordination. Another difficulty is the more complicated final evaluation of students.
Some of the ideas and objectives of the project reflect desired and modern trends in higher education related to the Bologna Declaration and European Union frameworks. However, although the idea is good, its implementation is currently precarious and not fully thought out in all aspects. Simply put, it cannot yet be stated that everything works well and that the implemented innovation is a positive step in every respect. To sum up, the realized project brought, in addition to a number of benefits, also some complications, and it will take some time before the situation settles down.

10.5.5 Good Experience with the Innovation of Teaching Mathematics
At the Faculty of Management, mathematics was taught before the innovation in two semesters, with two hours of lectures and two hours of exercises per week. The first semester focused on calculus and linear algebra, the second on their application to managerial decision-making, including the fundamentals of financial and actuarial mathematics.
The innovation project, however, gave more space to practically oriented topics such as management skills, personnel management, operational management, and so on. Therefore, the teaching hours for mathematics were reduced by half, to 2 + 2 per week, but without a significant reduction of the course program. This apparent imbalance was overcome by writing a variety of study materials with funds from the project:

12 case studies illustrating applications of different mathematical means to practical managerial problems,
4 worksheets, also describing practical situations, which students filled out and presented,
4 glossaries,
4 collections of training tasks,
4 practice tests.
268 P. SVOBODA AND J. CERNY

Students used these aids extensively and achieved good study results. More than 80% passed the mathematics exam at the first attempt, which had never happened before.

10.6 Conclusion
The goal of the chapter consists of three main points: (1) to present the relation between sustainability and the excellence of higher education institutions (HEIs), (2) to show their impact on the sustainability of enterprise excellence, and (3) to outline the role of the EU Operational Program Education for Competitiveness in the Czech Republic concerning (1) and (2). This goal was fulfilled: after the introductory Sect. 10.1, Sects. 10.2, 10.3, and 10.5 presented how the sustainable development of an HEI influences its educational excellence and what measures can be used to enhance it, and the Czech experience was described in detail. In Sect. 10.4, the EU Operational Program Education for Competitiveness was introduced and the role of education in improving excellence and competitiveness was discussed.
The chapter maps the possibilities of using the Structural Funds of the European Union for the innovation of educational processes at HEIs. It shows that the ESF plays the key role in the financing of tertiary education, with billions allocated to innovative programs such as the OPEC described above. The OPEC and its second priority axis, focused on tertiary education, aim to improve the quality and further the diversification of universities with an emphasis on the requirements of the knowledge economy. This leads to greater flexibility and creativity of graduates employable in the labor market.
The chapter further presents practical examples of the innovation process and documents comprehensive approaches that allow efficient utilization of resources throughout the innovation cycle while meeting the current objectives and priorities of the European Union.
A detailed presentation of an original solution to the problem of improving the quality of teaching at one of the leading Czech universities illustrates these facts in the closing sections.

Acknowledgments The research was supported by the Internal Grant Agency of the University of Economics, projects F6/11/2013 Knowledge-Based Modeling of Managerial Processes and F6/45/2015 HEI Quality: Perception, Management.

References
Aly, Nael, and Joseph Akpovi. 2001. Total quality management in California public higher education. Quality Assurance in Education 9(3): 127–131.
Brundtland Commission. 1987. Our common future. Oxford: Oxford University Press.
Cortese, Anthony D. 2003. The critical role of higher education in creating a sustainable future. Planning for Higher Education 31(3): 15–22.
FM VSE Innovation. 2009. http://projekty.fm.vse.cz/projekt-isovp/ (in Czech).
Kanji, Gopal K., Abdul Malek, and Bin A. Tambi. 1999. Total quality management in UK higher education institutions. Total Quality Management 10(1): 129–153.
Kotler, Philip, and Karen F.A. Fox. 1985. Strategic marketing for educational institutions. USA: Prentice-Hall.
Kubiszewski, Ida, and Cutler J. Cleveland. 2012. United Nations Conference on Environment and Development (UNCED), Rio de Janeiro, Brazil. The Encyclopedia of Earth. http://www.eoearth.org/view/article/156773/, 20 November 2014.
Kurikova, Veronika, and Monika Ulrichova. 2012. Innovation of the study program Transcultural Communication taught in English (in Czech). Czech Ministry of Education and Sport.
Svoboda, Petr, and Jan Cerny. 2013a. Customer satisfaction and loyalty in higher education: A case study over a five-year academic experience. In KDIR/KMIS, 431–436.
Svoboda, Petr, and Jan Cerny. 2013b. Quality of higher education institutions as a factor of students' decision-making process. In International Conference on Intellectual Capital and Knowledge Management and Organizational Learning, 622. UK: Academic Conferences International Limited.
Svoboda, Petr, Jan Voracek, and Michal Novak. 2012. Online marketing in higher education. In Knowledge Management, 1145–1152. Cartagena: Universidad Politécnica.
Trojan, Jan. 2011. Innovation of the study program of the College of Business and Hotel Management under the European Social Fund. In Proceedings of the 4th International Conference on Teaching of Tourism, Hotel and Restaurant Services. Brno: VSOH.
UNESCO. 2004. Higher education for sustainable development. In Education for Sustainable Development Information Brief. Paris: UNESCO.
UNESCO. 2005. United Nations Decade of Education for Sustainable Development 2005–2014: Draft international implementation scheme. Paris: UNESCO.
CHAPTER 11

Applying Data Analytics for Innovation and Sustainable Enterprise Excellence

Stavros Sindakis

There is an irreversible trend toward the criticality of big data analytics capability and its exercise: rather than relying exclusively on traditional data-driven decision-making approaches, sustainable organizational excellence will often demand a focus on more computationally intensive data and information generation, collection, extraction, and interpretation procedures that, when added to traditional data-driven methods, yield the area of sustainable enterprise excellence referred to as enterprise intelligence and analytics.
The evolution of organizational intelligence and analytical applications (for example, data analysis and mining, data standards, databases, and information systems that enable data creation and knowledge utilization) has even led to claims of a new science of big data analytics, with its own experiments and research questions (Chen et al. 2012; Ceri et al. 2012).
Generation of intelligence and foresight is often dependent on integration of vastly disparate sources yielding massive information, with connectivity that is often far less than apparent, and messages that may be

S. Sindakis (*)
American University in Dubai, School of Business, Dubai, UAE
e-mail: ssindakis@aud.edu

© The Author(s) 2017 271
E.G. Carayannis, S. Sindakis (eds.), Analytics, Innovation, and
Excellence-Driven Enterprise Sustainability, Palgrave Studies
in Democracy, Innovation, and Entrepreneurship for Growth,
DOI 10.1057/978-1-137-37879-8_11
272 S. SINDAKIS

conflicting. There are of course differences between information and intelligence/foresight, so that coalescence of true and actionable intelligence and foresight from information presents challenges worthy of the label big data analytics. Iteration or regeneration of intelligence and foresight of both short- and long-term natures implies that such massive information is regularly and freshly examined for changing and emerging patterns that may be highly complex in nature and may not yield easily to identification. Identifying and leveraging changing and emerging patterns for the purpose of yielding quasi-optimal enterprise resilience, robustness, and sustainable enterprise excellence strategies implies that iterative examination and extraction of intelligence and foresight must be highly dynamic, very nearly continuous in nature, and, almost certainly, founded on highly adaptive and complex algorithms.
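The near-continuous re-examination of a data stream for emerging patterns described above can be illustrated with a deliberately minimal sketch. Nothing here is taken from the chapter: the class name, the EWMA-based scheme, and the thresholds are all invented to show just one simple form an adaptive streaming monitor might take.

```python
# A minimal streaming drift monitor (illustrative only): it keeps an
# exponentially weighted moving average (EWMA) of a metric and of its
# variance, and flags observations that deviate sharply from the
# pattern learned so far.

class EwmaDriftMonitor:
    def __init__(self, alpha=0.2, threshold=4.0):
        self.alpha = alpha          # weight given to the newest observation
        self.threshold = threshold  # alert when deviation exceeds this many SDs
        self.mean = None
        self.var = 0.0

    def update(self, x):
        """Feed one observation; return True if it signals an emerging shift."""
        if self.mean is None:       # first observation just initializes state
            self.mean = x
            return False
        diff = x - self.mean
        # Test against the state learned *before* this observation.
        alert = self.var > 0 and abs(diff) > self.threshold * self.var ** 0.5
        incr = self.alpha * diff
        self.mean += incr                                       # EWMA level
        self.var = (1 - self.alpha) * (self.var + diff * incr)  # EWMA variance
        return alert


monitor = EwmaDriftMonitor(alpha=0.2, threshold=4.0)
stable = [10.0, 10.1, 9.9, 10.1, 10.0, 9.9, 10.1, 10.0]
alerts = [monitor.update(x) for x in stable]  # no alerts on a steady stream
shift_alert = monitor.update(25.0)            # an abrupt jump is flagged
```

Because the state is a handful of numbers updated per observation, such a monitor can run continuously over high-volume streams, in the spirit of the iterative re-examination described above, though production systems would use far richer models.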
Neither incidentally nor surprisingly, generated intelligence and fore-
sight may profoundly inform and impact organizational strategy and
practice in such important areas as innovation and, through innovation,
important co-products of sustainable enterprise excellence: enterprise
resilience and robustness. It has been documented that enterprises with
the greatest diversity in their portfolio of innovation strategies are also
among the most resilient enterprises (Reinmoeller and van Baardwijk
2005) as innovative enterprises are less susceptible and hence more robust
and more resilient in the face of sectorial and environmental pressures
than are less innovative organizations (Gunday et al. 2011). Similarly,
ambidextrous organizations are more likely to achieve sustainability and
enhance performance by combining explorative innovation strategies that
target new product and market areas with exploitative innovation strate-
gies that aim to improve current product and market positions (He and
Wong 2004). Additional studies stress the importance of organizational
ambidexterity with exploratory innovation developing new knowledge
and promoting novelty for radical innovation, while exploitative innova-
tion builds on current knowledge and aims at incremental innovation to
retain competitive advantage and achieve better performance (Benner and
Tushman 2003; Gibson and Birkinshaw 2004).
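The explorative/exploitative balance that the ambidexterity literature describes has a well-known computational analogue in the explore/exploit tradeoff studied in machine learning. As a loose illustration only (the function, the payoff numbers, and the epsilon value below are invented, not drawn from the studies cited), an epsilon-greedy rule devotes most trials to the best-known strategy while reserving a fixed share for exploration:

```python
import random

def epsilon_greedy(estimated_returns, epsilon=0.2, rng=random):
    """Pick a strategy index: explore a random option with probability
    epsilon, otherwise exploit the best current estimate."""
    if rng.random() < epsilon:
        return rng.randrange(len(estimated_returns))   # explorative move
    # exploitative move: index of the highest estimated return
    return max(range(len(estimated_returns)),
               key=estimated_returns.__getitem__)

random.seed(42)
returns = [0.12, 0.30, 0.05]   # hypothetical payoff estimates per strategy
choices = [epsilon_greedy(returns) for _ in range(1000)]
exploit_share = choices.count(1) / len(choices)  # mostly picks strategy 1
```

The analogy is only suggestive: organizational ambidexterity involves structures and cultures, not a single scalar policy, but the sketch makes the tradeoff between refining a known position and probing new ones concrete.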
In this regard, opportunities associated with data science and analysis in disparate organizations have increased interest in organizational intelligence and analytics that involve "techniques, technologies, systems, practices, methodologies, and applications that analyze critical data to help an enterprise better understand its business and market and make timely business decisions" (Chen et al. 2012: 1166).
APPLYING DATA ANALYTICS FOR INNOVATION AND SUSTAINABLE... 273

So rapidly evolving is the import to enterprises of such intelligence and analytics that a noted organization design authority forecasts that it will become a critical organization design component (Galbraith 2012), implying that
organizational progress toward resilience, robustness, and sustainable
enterprise excellence may be advanced through use of sophisticated,
computer-enabled analytic transformation and translation of information
into actionable enterprise intelligence and foresight and, subsequently, from foresight to value (LaValle et al. 2011). Joshi et al. (2010) add that the use of big data technologies is associated with significant additional productivity growth, as they enhance firms' learning capability by increasing the speed, intensity, and directionality of knowledge identification and selection.
Increasingly, organizations cannot afford to ignore big data analytics, not because these provide failsafe identification and evaluation of all important organizational issues and decisions, but rather because of the rapidly increasing volume of information generated, from which intelligence and foresight that account for complex interactions of factors must be gleaned, and for which data confidentiality and security must be assured. Big data analytics generate improved solutions to complex organizational challenges, including the marketplace, supply chain navigation, and societal and ecological performance, so that adept use of big data analytics will for many organizations be critical to progress toward sustainable enterprise excellence. To achieve this, organizations aiming to solve business problems and improve performance should systematically follow a process of well-defined stages in order to routinize acquiring, assessing, and activating valuable knowledge from data at an accelerated rate. Provost and Fawcett (2013) develop a strong argument for this approach, indicating that organizations should decompose their business problems into tractable components, using a framework to estimate probabilities and analyze expected value. In addition, Nickerson and Zenger (2004) argued that managers are increasingly faced with the challenge of efficiently generating new knowledge, rather than relying only on and further exploiting knowledge already developed. This argument is supported by a study demonstrating that while a firm's competence to acquire and store knowledge is vital, this competence does not produce innovation until the knowledge has been transformed and exploited in order to influence the creation of inventions and increase innovation levels (Joshi et al. 2010). Aligned with this, a survey conducted across 30 industries and 100 countries found that the maximization of value through data analytics has been a top priority for many

organizations, with numerous respondents raising the importance of adopting advanced information and analytics approaches to innovate and achieve competitive differentiation (LaValle et al. 2011).
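Provost and Fawcett's suggestion to decompose a business problem into components with estimated probabilities and expected values can be made concrete with a minimal sketch. The scenario (a customer-targeting offer) and all numbers below are invented for illustration and are not drawn from their paper.

```python
# Hedged sketch of an expected-value decomposition: a decision is broken
# into an outcome probability and per-outcome values, and the action is
# taken only when its expected value is positive.

def expected_value(p_respond, value_response, cost_contact):
    """Expected profit of targeting one customer with an offer."""
    return p_respond * value_response - cost_contact

def should_target(p_respond, value_response=50.0, cost_contact=1.0):
    """Target the customer only when the expected value is positive."""
    return expected_value(p_respond, value_response, cost_contact) > 0

# A model-estimated response probability of 3% clears a $1 contact cost
# when a response is worth $50: EV = 0.03 * 50 - 1 = 0.50.
ev = expected_value(0.03, 50.0, 1.0)
```

The point of the framework is exactly this factoring: the data-science task (estimating p_respond per customer) is separated from the business parameters (value and cost), so each component can be improved and audited on its own.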
Nowadays, companies feel the necessity to sustain their growth. Firms are confronted with access to a dynamic and augmenting volume of data. This creates a petabyte effect, which raises many challenges. Numerous varieties of data appear and, because of this, several processing systems have entered the field. The velocity challenge is addressed by systems optimized for interactive analytics, and specialized software has been produced to implement techniques that treat the malfunctions that may arise.
Big data also increases the interest in business analytics, within which price trend forecasting plays an extended role. A business analytics approach limits the drawbacks of big data, although the acceptance of textual data as the sole method for enriching the data remains a drawback. News tickers also feed into the price forecast; as a result, these two recent pillars, textual data and tickers, support the predictions.
Furthermore, female-owned business enterprises (FBEs) are a focal issue of this book. In Chap. 4, the influence of FBEs on customers and the market gives rise to several hypotheses and results through a structured study. The hypotheses create valuable expectations, as further modification of the marketing analytics model will open new perspectives for the causal path model. Moreover, the strategic behavior of international enterprises is based on the different types of information and the various software tools used to gain it; the international investing background is very broad, and the most innovative software sectors provide reliable solutions.
Expanding our scope, the Open Data Ecosystem is another part discussed. The utilization of platforms that gather and distribute information in the Open Data Ecosystem is inevitable under the affiliation of governments and individuals, and harmony between the two can produce extraordinary prospects for sustainable value generation. Additionally, the analysis of sustainable business models offers some empirical illustrations. At the same time, under the current global finance and capital conditions, the business environment is affected by the economic recession. Experience shows that organizations take business sustainability seriously enough for their development that adjustments to their business models are no longer feared.
All of the above is accompanied by contemporary technological and practical potential. The competitive and complex environment of the big data era calls for smart data tools such as PATAmPOWER in the travel and tourism sector, as described in Chap. 8. Finally, it is worth noting that the European Union has focused its education policies on innovation and on comprehension of the competitive business environment; the EU Operational Program Education for Competitiveness is a testament to the investment strategy followed. The knowledge economy and flexibility establish a new agenda, adapted to current data.
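The role assigned above to textual data in price trend forecasting can be sketched with a toy pipeline. This is not the system summarized in the chapter: the TF-IDF weighting, the nearest-neighbour rule, and every headline and label below are invented purely to show how news text can be mapped to a price-direction signal.

```python
# Toy text-based direction forecast (illustrative only): headlines are
# turned into TF-IDF weight vectors, and a new headline inherits the
# up/down label of its most similar training headline.

import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: tf-idf weight} dict per whitespace-tokenized doc."""
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(tokenized)
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: (c / len(tokens)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return vectors

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def predict_direction(headline, train_docs, train_labels):
    """1-nearest-neighbour label ('up'/'down') for a new headline."""
    vecs = tfidf_vectors(train_docs + [headline])
    query, train_vecs = vecs[-1], vecs[:-1]
    best = max(range(len(train_vecs)),
               key=lambda i: cosine(query, train_vecs[i]))
    return train_labels[best]

headlines = [
    "supply shortage pushes gas prices higher",
    "record output sends gas prices lower",
    "cold snap drives demand and prices higher",
    "mild winter leaves storage full prices fall",
]
labels = ["up", "down", "up", "down"]
pred = predict_direction("storage full after mild season", headlines, labels)
```

Real text-based forecasting systems of the kind the book discusses combine such textual features with market data and far larger corpora; the sketch only shows the basic mapping from words to a directional label.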

References
Benner, M.J., and M.L. Tushman. 2003. Exploitation, exploration, and process management: The productivity dilemma revisited. Academy of Management Review 28(2): 238–256.
Ceri, S., G. Gottlob, and L. Tanca. 2012. Logic programming and databases. Heidelberg: Springer Science & Business Media.
Chen, Hsinchun, Roger H.L. Chiang, and Veda C. Storey. 2012. Business intelligence and analytics: From big data to big impact. MIS Quarterly 36(4): 1165–1188.
Galbraith, J.K. 2012. Inequality and instability: A study of the world economy just before the great crisis. New York: Oxford University Press.
Gibson, C.B., and J. Birkinshaw. 2004. The antecedents, consequences, and mediating role of organizational ambidexterity. Academy of Management Journal 47(2): 209–226.
Gunday, G., G. Ulusoy, K. Kilic, and L. Alpkan. 2011. Effects of innovation types on firm performance. International Journal of Production Economics 133(2): 662–676.
He, Z.L., and P.K. Wong. 2004. Exploration vs. exploitation: An empirical test of the ambidexterity hypothesis. Organization Science 15(4): 481–494.
Joshi, K.D., L. Chi, A. Datta, and S. Han. 2010. Changing the competitive landscape: Continuous innovation through IT-enabled knowledge capabilities. Information Systems Research 21(3): 472–495.
LaValle, S., E. Lesser, R. Shockley, M.S. Hopkins, and N. Kruschwitz. 2011. Big data, analytics and the path from insights to value. MIT Sloan Management Review 52(2): 21.
Nickerson, J.A., and T.R. Zenger. 2004. A knowledge-based theory of the firm: The problem-solving perspective. Organization Science 15(6): 617–632.
Provost, F., and T. Fawcett. 2013. Data science and its relationship to big data and data-driven decision making. Big Data 1(1): 51–59.
Reinmoeller, P., and N. Van Baardwijk. 2005. The link between diversity and resilience. MIT Sloan Management Review 46(4): 61.
Index

A atmospheric envelope, 228


Abdelkafi, N., 184 atmospheric warming, 228
aging population, 232 Aurora, 40
Akrivos, C.K., 131 Australian Bendigo Bank, 176
Al-Debei, Mutaz M., 179, 180, 184 Avison, D., 179, 180, 184
Al Gore, 229, 253 AZFinText, 53
alternative business models, Azkaban, 27
comparisons of, 177
Amazon, 20
ambidextrous organizations, 272 B
Amit, R., 179, 182 Bain, Joe S., 240
AMOS statistical program, 95, 97 balanced scorecard (BSC) concept
analytics modeling basic perspectives of, 1878
first-order or main effects, 80 customer perspective, 187
marketing performance, 80 financial perspective, 187
Annual Tourism Monitor (ATM), 209, innovation and learning perspective,
212 1878
Apache Drill, 389 internal business process perspective,
Apache HBase, 11, 34, 38 187
Apache Pig, 10, 267 Bartletts Test of Sphericity (BTS)
Apple Watch, 207, 222 values, 94
ASTERIX, 30 Begg, D., 240, 241

Note: Page numbers followed by n refer to notes.

278 INDEX

Besanko, D., 243 C


BI. See business intelligence (BI) capabilities and resources logic, 185,
Bieger, T., 184 193
big data analytics, 8, 273 capitalistic economic model, 226
BigSheets, 10 Cascading, 10, 27
Bigtable, 345, 37 Cascalog, 10
Bitmap compression technique, 19 Cassandra, 35, 39
Borealis, 40 Ceph, 212
Bulk Synchronous Parallel (BSP), 33 Chamoni, P., 53
business analytics change management, 22553
Big Data, drawbacks of, 73 future implications, 2267
energy markets, application, 6273 Cheetah, 24
price forecasting approach, 5462 Chen, H., 53
text-based business analytics, 524 Chotai, A., 233
business environment, change in Chukwa, 28
climate change, 22830 climate change, 226, 22830
demographics, 2312 confronting, 253
drivers of, 22736 Cloudera Impala, 11, 34, 38
e-commerce, 2346 coal, 230
energy, 2301 Cocklin, C., 176
resource issues, 2278 CoHadoop, 23
sustainability, 2278 College of Business and Hotel
urbanization, 2323 Management in Brno, 2656
business intelligence (BI) collocation, 12, 14, 15, 225
categories and subcategories, 8, 9 columnar databases, 10
data analytics systems, evolution of, columnar data storage, 1719
911 columnar query execution, 1920
dataflow systems, 2834 columnar data storage
interactive analytics, systems for, compression techniques, 1819
3440 position index, 18
MapReduce systems, 208 projections, 18
parallel database systems, 1220 columnar query execution
Business Model Canvas, 1789 compression, 19
business models, 1801 late materialization, 19, 20
analysing, 183 vectorized processing, 1920
functions, 1814 competitive environment
internal alignment, 182 change and, 23741
location of, 182 Porters five forces theory, 2389
logics, 1846 SS-C-P (structure-conduct-
managing, 183 performance) model, 2378
for sustainability, 1746 Compressed Common Delta scheme, 19
understanding and sharing, 183 compression techniques, 1819

confidence, 227 Data as a Service (DaaS) concept,


congestion, 233 21920
contemporary markets, 236 Database Management Systems
contextual logic, 186, 193 (DBMSs), 12, 30, 36
continuous queries, 39 data-driven decision-making, 208,
conventional business model 210, 220, 271
perspective, 170 data-driven ecosystem, 160
conventional/mainstream business dataflow systems
models, 175 directed acyclic graph systems, 11,
corporate identity, 116 303
corporate sustainability, 169, 170 generalized MapReduce systems, 11,
Cortese, Anthony D., 258 2830
Cosmos Storage System, 31 graph processing systems, 11,
CRM. See customer relationship 334
management (CRM) DataNodes, 212
C-Store, 1718, 23 data-oriented ecosystem, 4
customer information software, role DBMSs. See Database Management
of, 11618 Systems (DBMSs)
customer relationship management Decade of Education for Sustainable
(CRM), 12832 Development, 258
binary logistic regression, 12931 decision-making process, 12, 142
culture of, 11618 data science, 2
and customer information software, organizational culture, 112
11618 declustering, 1216
definition of, 117 deforestations, 229
existence of, 1289 degree of declustering, 13, 16
information software, 117 demand side management, 231
internationalization strategy demographics, 227, 2312
development, 118, 133 digital options, 154
phi correlation coefficient, 128 directed acyclic graph (DAG) systems,
11, 303
discounted cash flow (DCF) analysis,
D 154
data analytics systems distributed file systems, 1011, 212,
dataflow systems, 9, 11 28, 30, 34, 37, 41
for innovation, 2715 distributed SQL query engines, 11,
interactive analytics, systems for, 9, 379
11 Djolov, G., 240
MapReduce systems, 9, 10 Dremel, 378
parallel database systems, 9, 10 Dryad, 9, 11, 28, 302
velocity and variety challenges, 41 DryadLINQ, 312

E exploratory factor analysis (EFA), 90


e-commerce, 2346 external benefits, 245
influence on market, 2346 external costs, 245
economic efficiency, 246, 248
Education for Competitiveness
Operational Program (ECOP), 256 F
EFA. See exploratory factor analysis Facebook, 232
(EFA) Fawcett, T., 273
effective visualization, 208 Fayyad, U., 54
electricity market FBEs. See female-owned business
German, 62 enterprises (FBEs)
historical price data, 64 Felden, C., 62, 645
market data, 64, 65 female-owned business enterprises
news tickers, 645 (FBEs), 78, 80, 274
price development, 64 sociodemographic statistics, 93
Electronic Data Interchange (EDI), fertility rates, 231, 236
234 Fichman, Robert G., 154
emission charges, 245 Figge, F., 189
employment, 236 financial crisis, 225, 227
energy, 226, 2301 financial logic, 185, 191, 193
energy markets, application financial option, 153
electricity market, 62, 645 FlumeJava, 27, 28
gas market, 62, 6573 food refugees, 232
Enterprise Data Warehouses (EDWs), foreign markets
12 information acquisition, 114, 132
enterprise resource planning (ERP) internationalization information,
systems, 154 115
EU Operational Program Education obtaining information, process of,
for Competitiveness, 2624 11516
further education, 2634 Friedemann, 230, 231
higher education, 263 fundamental rationale, business model,
human resources, 263 1846
informal education and
competencies, 264
initial education, 263 G
lifelong learning, 264 Gamma, 12
technical assistance, 264 gas market
European Social Fund (ESF), 256, British, 62
2614 graphical user interface, 71, 72
basic approaches, 262 historical prices, 66, 67
EU Operational Program mapping paradigms, 67, 71
Education for marked data and price data, 71
Competitiveness, 2624 natural gas, 65

performance UNSTABLE/STABLE HEIs. See higher education institutions


model, 68, 70 (HEIs)
performance UP/DOWN model, higher education, 5, 258, 263, 267
71 higher education institutions (HEIs),
process documents from data, 68 256
results, 70 European Social Fund support,
GDP, 228 2614
Geva, T., 53 evaluating quality of, 2612
GFS, 37 method specification, for quality
Glaciers, 229 evaluation, 261
Global Financial Crisis (GFC), 210 and sustainable development, 25760
globalization, 249 Hive, 10, 21, 267, 39
global population, 227 HiveQL, 26, 30, 32, 39
Global Reporting Initiative (GRI), home banking, 235
195, 196 hydrogen economy, 230
global tourism, 208 HyPer, 36, 37
global warming, 228, 229, 253 Hyracks, 11, 2830
GraphLab, 33 HYRISE, 357
graph processing systems, 11, 28,
334
GraphX, 33 I
greenhouse effect, 228, 250 IBM DB2 Parallel Edition, 12
greenhouse gas (GHG) emissions, imperfect competition, 2458
147, 229, 252 income, of faculty, 259
incremental innovation, 272
Infobright, 17
H information acquisition, 113
Hadoop, 201 information software, 12832
collocation, 23 initial public offering (IPO), 148
columnar data layouts, 234 innovation-oriented program, 256
indexing, 23 innovation process, 4
MapReduce, 245 College of Business and Hotel
Hadoop++, 23, 24 Management in Brno, 2656
HadoopDB, 256 data analytics for, 2715
partitioning, 256 practical examples of, 2648
split query execution, 26 teaching mathematics and, 2678
Hadoop Distributed File System through open data, 138
(HDFS), 11 University of Economics in Prague,
Hagiu, A., 146 2667
HaLoop, 34 University of Economics in Pragues
Hansen, E., 189 Faculty of Management, 266
Harris, N., 237 University of Hradec Kralove,
HBase, 11, 345, 38 2645

INRIX, 14950 Kyoto Protocol, 228, 2501


interactive analytics, systems for Annex A, 252
distributed SQL query engines, 11, Annex B, 252
379 Article 1, 251
mixed analytical and transactional Article 28, 251
systems, 11, 347
stream processing systems, 11,
3940 L
intermediaries, 1435 Lambert, Susan C., 184
International Telecommunications Language INtegrated Query (LINQ)
Union (ITU), 215 model, 32
International Visitor Arrivals (IVA), Laskey, A., 147
211, 217 Lavrenko, V., 523
Internet, 234, 235 Lee, R., 231
Internet of Things (IOT), 207, 222 linear regression model, 1004
low-carbon economy, 2278
Ludeke-Freund, F., 190
J Lurie, M., 208
JobTracker, 25
Johnson, Mark W., 184
join parallelism M
broadcast, 15, 16 MACS. See Marketing Activity and
collocated, 1516 Customer Activity Scale (MACS)
directed, 15, 16 survey
repartitioned, 15, 16 Magretta, J., 179
Joshi, K.D., 273 MapReduce execution engine, 10, 21,
24
MapReduce systems
K application development, 27
Kafka, 28 big data, analytical workloads, 8
Kaiser-Meyer-Olkin (KMO), 94 data collection, 28
Kaplan, Robert S., 180, 186 distributed storage, 10, 216
Katz, Michael L., 2479 Hadoop, 20
key performance indicators (KPIs), high-level interfaces, 267
178 job execution, 25
k-nearest neighbors (KNN), 59, 61 workflow management, 27
Knowledge Discovery in Database MapR File System, 22
(KDD), 54 Marinagi, C.C., 131
knowledge extraction, 54 marked data, 54
knowledge management, 112, 114 marketable permits, 245
Kosmos File System (KFS), 22 market and industry research
Kramer, Mark R., 142 types of information acquired,
Kutsikos, K., 131 11214

market devices, 181, 182. See also goodness-of-fit theory, 967


business models limitations and future research,
market failures, 248 1056
correcting, 2489 MACS, 90
Marketing Activity and Customer marketing analytics, 8790
Activity Scale (MACS) survey, 85 non-probability sampling approach,
data analysis, 90 105
instrument questionnaire, 90 path analysis and SEM, results of,
statistical analyses tools, 90 957
Marketing Analytic Equation Model path analysis of firm variable, 91
(MAEQ), 86 population and sample, 85
marketing analytics regression modeling and analytics,
benefits of, 3 1001
competition and economic analytic, research hypotheses, 856
8990, 956, 1024 sociodemographic statistic results,
customer behavior, 7980 912
customer credit analytic, 889, 95, theoretical models, 1056
101, 103 market structures
customer turnover analytic, 88, 95, change and, 2438
101, 103 competition, efficiency of, 243
good analytic model, conditions, 79 perfect competition and inefficiency,
market potential analytic, 89, 95, 2435
101, 103 Massive Parallel Processing (MPP), 12
and metric equations table, 87 materiality matrix, 195
quantitative analysis, 79 McConnell, B., 241
marketing logic, 185, 193 Megastore, 11, 345
marketing metrics Mentzas, G., 131
financial and non-financial metrics, Metastore, 27
845 Micro and Small Medium Enterprises
marketing effectiveness, 84 (MSMEs), 218
measuring media, primary needs, 84 micro-generation strategy, 231
statistical modeling techniques, 85 migration, 2312
types of companies, 83 Mittermayer, M.-A., 53
market research mixed analytical and transactional
causal relationships, 105 workloads, 11, 345
competition and economics analytic, model functions, 183
93 modern portfolio theory, 142
correlation and analytics, 979 monetary exchange, 142
definition, 112 MonetDB, 20
EFA, results of, 934 Moneyball, 77
empirical model of study, 86 monopolistic competition
FBEs, sociodemographic statistics, five forces analysis, 2412
93 main characteristics, 241

monopoly, 244
MSPs. See multi-sided platforms (MSPs)
Multidimensional Poverty Index (MPI), 233
multilevel serving tree, 38
multi-sided platforms (MSPs), 139, 146, 161, 163, 164
Muppet, 39
MySpace, 232

N
NameNode, 21–2
Nann, S., 53
National Tourism Organizations (NTOs), 208–10
natural population development, 231
Nephele, 11, 28–30
Netezza, 12
net present value (NPV), 162
network effects, 142
network mechanism, 143
NewsCATS, 53
news tickers, 52–4
Ngai, E.W.T., 132
Nickerson, J.A., 273
node groups, 13
non-collusive oligopoly, 240
nonmarket production, 139
Non-Random Walk theory, 52
Norton, David P., 180, 186

O
obesity, 232
Occupational Health and Safety Assessment Series (OHSAS), 197
Oh, C., 53
ontology, 198
Oozie, 27
open data
  perceived option value of, 157–8
  sustainable value, 140–3, 155–9
Open Data Barometer report, 141, 144
open data ecosystems, 137–65, 274
  role of intermediaries in, 143–5
open data MSPs, 151–5
open data value paradox, 139, 152
Opower, 147–9
organizational intelligence, 1, 5, 271
organizational resilience, 2
oriented business model assessment, 169–99
Osterwalder, A., 178, 180, 181, 183, 184, 190, 195, 198

P
Pacific Asia Travel Association (PATA)
  Annual General Meeting, 210
  annual statistical report, 209
  beta 2.0, development of, 213–15
  members, 210
  mission, 208
  SIC, 209, 211
  TIGA beta version 1.0, 212–13
  TIGA beta 2.0 version, 215–16
  visitor data, 211
  visitor economy, 220–1
ParAccel, 17
parallel database systems
  columnar databases, 10, 17–20
  row-based parallel databases, 10, 12–17
parallel data storage
  collocation, 14
  data replication, 14
  degree of declustering, 13
  table partitioning, 12–13
parallelism
  independent, 17
  join, 15–16

  partitioned, 16–17
  pipelined, 17
Parallelization Contracts (PACT) programming model, 29
Pareto efficiency or economic efficiency, 246
Parkin, M., 244
partial DAG execution (PDE), 33
partitioned parallelism, 16–17
partitioning, types, 12, 13
Party Quantified Emission Limitation, 252
PATAmPOWER, 4, 275
  future of, 220–2
  marketing, 218–19
  Software as a Service (SaaS), 220
  subscription, Data as a Service, 219–20
  value, 216–18
PATA Next Gen, 216
peer-to-peer Internet-based business models, 144
pentabyte effect, 274
perfect competition
  and inefficiency, 243–5
  monopoly and, 244
Pig Latin, 26, 30
Pigneur, Y., 178, 184, 195
Pivotal Greenplum Database, 12
Porter, Michael E., 142, 238
Porter's five forces theory, 238–9
  vs. SS-C-P (structure-conduct-performance) model, 240–1
position index, 18
Pospiech, M., 62, 64–5
poverty, 233
PowerGraph, 33
predictive analytics
  and business decisions, 81
  in customer analytics, 82
  data-driven marketing decisions, 82
  effective, 81
  predictive modeling, 82
  rational analytics approach, 83
Preferred Partner, 214
Pregel, 28, 33
Presto, 39
price forecasting approach, textual data
  business analytics approach, process of, 62, 63
  data mining phase, 59
  durability calculation, 58
  KNN, 59, 61
  mapping hypotheses, 56–7
  marked data and news tickers, 56–7
  regression analysis, 52
  SVM, 59, 61
  text mining, 58, 60
  TF-IDF value, 59
  training process, 54, 55, 62
  trend calculation, 56, 57
price-marginal cost equality, 247
principal component factor analysis (PCA), 88
PrIter, 34
Processing Elements (PEs), 40
production logic, 185, 193
projections, 18
Provost, F., 273
public goods, 244
public trust, 227

Q
Quantcast File System (QFS), 22

R
Rackspace, 20
radical innovation, 272
RapidMiner process, 68, 69
real options theory, 151–4
real option theory, 140
recession, 226, 227

Reduction Commitment, 252
Reinhold, S., 184
replication, 12, 14–15, 22, 35
representing construct, 181
Residential Energy Consumption Survey (RECS), 148
Resilient Distributed Datasets (RDDs), 32
Resilient Distributed Graph (RDG), 33
resilient enterprises, 272
resource allocation, 245–8
return on investment (ROI), 77
Rio Earth Summit, 256, 257
Rosen, Harvey S., 247–9
Roundtable on Sustainable Biomaterials (RSB), 197
row-based parallel databases, 10
  parallel data storage, 12–14
  parallel query execution, 14–17
Rusnjak, Andreas, 184

S
S4, 34, 39–40
Samuelson, Paul A., 245
SAP HANA, 9, 35, 36
Schallmo, Daniel R.A., 184
Schaltegger, S., 189
Schumaker, R.P., 53
SCOPE, 31–2
Scribe, 28
Seelos, C., 176
SEM. See structural equation model (SEM)
shadow option, 153
Sharing Economy, 144
Shark, 32–3
Sharma, 228
shelter deprivation, 233
Sheng, O., 53
Short, Samuel W., 176
SIC. See Strategic Intelligence Centre (SIC)
SIMD (single instruction multiple data), 19
Singapore Tourism Board (STB), 219
Small Business Development Center (SBDC), 85
small-medium business enterprises (SME), 78
smart decision-making, 207–23
social business model, 176
social media, 232
social media analytics, 83
social value, 175
Software as a Service (SaaS), 220
Solcansky, M., 84
Spanner, 11, 35
Spark, 30, 32
specific standard disclosures, 195
split query execution, 26
Spotfire, 41
SS-C-P (structure-conduct-performance) model, 237–8
  vs. Porter's five forces theory, 240–1
statistical thinking, 77
Steffen, W., 229
Stern Review, 228
Storm, 34, 39–40
strategic behavior
  cluster analysis, 120–2, 124–6
  corporate identity, 116
  customer information obtainment, 116
  discriminant analysis and binary logistic regression, 122–3, 126–8
  factor analysis, 123–4
strategic development, 128–32
Strategic Intelligence Centre (SIC), 209, 211
strategic management, 4
strategic planning, 3
  CRM, 116–18, 128–32
  dependent variables, operationalization of, 119–20

  firm, strategic characteristics, 111–12
  foreign markets, 114–16
  market and industry research, 112–14
  research methods, 119
  strategic behavior, 120–8
  strategic management, 109–10
  strategies companies, 111
  strategy formulation, 110
Strategyzer toolbox, 178
Stratosphere, 9, 30
STREAM, 40
stream processing systems, 11, 39–40
strike price or exercise price, 153
structural equation model (SEM), 88, 90
Structured Query Language (SQL), 12
Stubbs, W., 176
subsumption, 189
super projections, 18
support vector machine (SVM), 53
sustainability, 169–99, 227–8
  business cases for, 171–4
  business models for, 169, 170, 174–6
  entrepreneurs, 171–3, 175–7, 199n1
  sustainability oriented business model assessment, 177–9
Sustainability Balanced Scorecard (SBSC), 171, 180
  customer perspective of, 193
  internal process perspective, 193
  learning and growth perspective, 193
  non-market perspective, 189, 190, 193
sustainability oriented business model assessment (SUST-BMA), 171, 177–9
  basic framework, 189–94
  business model concept, 180–6
  concepts of concepts, 179
  conceptual approach, 179–80
  functions, concepts and tools, 177–9
  performance, assessing, 196–8
  relevant sustainability aspects, identifying, 195–6
  Sustainability Balanced Scorecard (SBSC), 180, 186–9
  systematic information provision, 177
  systematic performance tracking, 177
  unit of analysis, defining, 194–5
sustainable development, 4, 255–6
  Czech experience, HEIs, 258–9
  economic situation, of Czech Universities, 259
  history and basic features, HEIs, 257–8
  notion of, 256–7
  revenues increase measures, HEIs, 259–60
  University of Economics in Prague, Faculty of Management, 260–2
sustainable enterprise excellence, 271–5
sustainable entrepreneur, 171–3, 175–7, 199n1
sustainable entrepreneurship, 170, 172
sustainable value, 142
  of open data, 140–3
SVM. See support vector machine (SVM)
Sybase IQ, 17

T
Tableau, 41
TaskTrackers, 25
taxes, 245
Teece, David J., 181

Teradata, 12
term frequency inverse document frequency (TF-IDF), 59
TIGA beta version 1.0, 212–13
TIGA beta 2.0 version, 215–16
Tourism and Events Queensland (TEQ), 220
Travel Intelligence Graphic Architecture (TIGA), 212
Trojan Indexes, 23
Twister, 34
two-sided markets, 139
  climate change, 147–8
  economics of, 145–51
  Opower, 147–9
Tyndall, J., 228

U
UN Framework Convention on Climate Change (UNFCCC), 282
United Nations World Tourism Organization (UNWTO), 211
University of Economics in Prague, 266–7
University of Hradec Kralove, 264–5
Upward, A., 198
urban agriculture, 233
urbanization, 227, 232, 233
urban migration, 232

V
value capture function, 185
value creation function, 185, 186
value framing, business model, 186
value generation, 138, 139, 143, 145, 155, 159
values, 160, 170
Varian, Hal R., 246
VectorWise, 17, 20
Venkataswamy, Govindappa, 176
Vertica, 17–18, 23
visitor economy, 220–1
vulnerable employment, 233

W
Ward, D., 240, 241
wearable technology, 222
web analytics, 83
Wells, Peter E., 175
Wirtz, Bernd W., 178–80, 183
World Trade Organization (WTO), 234

X
X100 project, 20

Y
Yates, D., 148

Z
Zahavi, J., 53
Zenger, T.R., 273
Zillow, 150–1
Zott, C., 179, 182
