
2012 15th International Conference on Network-Based Information Systems

Database Research: Are We At A Crossroad?


Reflection on NoSQL

Maria Indrawan-Santiago
Faculty of Information Technology
Monash University
Melbourne, Australia
maria.indrawan@monash.edu

Abstract: The demand to process large sets of data has increased in the last few years in both the scientific and business communities. To serve this demand, a number of new databases have been introduced that are not based on the relational model. This group of databases is popularly known as NoSQL. The underlying data and transaction models in NoSQL databases differ from those of relational databases. Organizations have shown much interest in adopting this new technology, and it has created a buzz in database research. The fact that its underlying principles differ from those of the relational model has posed a dilemma for the database research community. Would this new technology change the shape of database research and industry?

Keywords: database, NoSQL, key-value pair, business analytics

I. INTRODUCTION

The database industry in the last few years has seen a number of non-relational DBMS introduced, such as MongoDB [21], Riak [32] and Neo4j [22]. These families of databases are popularly known as NoSQL databases. There is much debate over the role of these databases in serving our information needs. The pro-NoSQL camp claims that this technology is the future of databases [20]. On the other hand, the pro-relational camp claims that NoSQL databases have a major drawback in not providing strict treatment of data integrity [7].

This paper attempts to navigate the roads of current database research by exploring the technology and impact of NoSQL databases as the new kid on the block of database research. The exploration starts by looking at the trends in database research in the 40 years since the introduction of the relational database. An analysis of the different groups of NoSQL databases is given in Section III. The groups are compared on their data model, database transaction model and data analytics.

II. POST RELATIONAL DATABASE ERA

The relational model was introduced in 1970 by E.F. Codd [8]. It was introduced to overcome some problems with database systems based on the network and hierarchical models. A major drawback of the hierarchical and network models perceived by practitioners was their minimal support for data independence [9]. Hence, complex programs were often written to answer a simple query. From the academic perspective, the network and hierarchical models were inadequate since they did not have an accepted theoretical foundation. When E.F. Codd introduced the relational model in 1970, it was accepted by the academic community; however, practitioners were skeptical of the performance of a relational database compared to the existing network and hierarchical database systems [28]. The adoption of this type of database was very low until the introduction of System R in the 70s [4]. In the early 80s, relational DBMS became very competitive, and they have remained the dominant players in the database market to date.

Since the introduction of the relational model, a few other database models have been introduced, such as the object-oriented database, the object-relational database and the XML database. The object-oriented and object-relational databases were introduced as the object-oriented approach to software engineering gained momentum in the 90s. However, these databases never really became competitive in the marketplace. The reasons can be summarized as their lack of a theoretical foundation and their limited performance gain over the relational database [6]. The XML database suffers from challenges similar to those of the OODB. It aims to support the proliferation of XML documents; however, adoption of native XML databases is very limited. Major relational database vendors such as Oracle, Microsoft and MySQL have included XML support in their products, but native XML databases such as Tamino [29] have not captured much of the database market. The initial excitement over the XML database may be attributed to the ever-increasing number of web applications and service-oriented architectures that use XML to standardize their data exchange format. On the surface, a great deal of data appears to be in XML format; however, when it comes to storing the data, the format reverts to a relational model. XML does not have a strong theoretical foundation that can guarantee data integrity to the level that the relational model offers.

In the last few years, a new group popularly known as NoSQL databases has emerged. Their origin can be traced back to the introduction of BigTable [26] and MapReduce [27] by Google in 2006 and 2004 respectively. The buzz around this new technology is still high. While a number of these database systems have been introduced to the market, there are still not enough reports on the adoption rate of this new technology in the database market. The technology could still be on the climbing slope of the first phase of the hype cycle [13]. It is interesting to note that the hype around this technology comes mainly from industry, since much of the information on the technology is available in blogs and opinion pieces. There are few academic papers on

978-0-7695-4779-4/12 $26.00 © 2012 IEEE
DOI 10.1109/NBiS.2012.95
this topic. The discussion on this technology is presented next.

III. NOSQL DATABASES

A. Background

The relational database was popular in the 70s and 80s because the majority of the data processing needed was commercial data processing, also known as Online Transaction Processing (OLTP). Stonebraker and Cattell [30] suggest that this type of data processing is characterized by simple operations with a focus on writing to the database. Other applications, such as data warehousing, require complex operations and read-focused processing. Social networking applications, such as Facebook and Twitter, sit somewhere between OLTP and data warehousing. For these types of applications (non-traditional OLTP), the NoSQL movement has gained strong support because these applications have pushed the limits of relational databases in flexibility and scalability. Other factors influencing the development of this family of databases include [2]:
• Big Data
There is an increasing need to manage ‘big data’, both in the commercial domain with the proliferation of Web 2.0 applications and in academia with e-science applications. When dealing with large amounts of data in a relational database, operations such as join become unmanageable. For some applications that mainly query single rows across many attributes/columns, it is more practical to group together columns that would otherwise be spread over different tables in a relational database.
• Data as Profit Maker
Business organizations have developed confidence in data as a profit generator. Data capture, integration and analysis have become keys to business success and profit rather than a mere operating cost that needs to be minimized. This gives rise to the importance of business analytics and, to some extent, ad-hoc business analytics. Supporting ad-hoc business analytics requires flexibility in the schema underlying the enterprise’s view of data.
• Mixed-Structured Data
Many organizations face the challenge of managing both structured and unstructured data in their daily operations. In many cases, the unstructured data is ‘massaged’ into a structured schema and implemented in relational databases that support unstructured data types such as blob. During retrieval, the blob then needs to be post-processed to extract the content. This imposes two-step processing.
• Architectural Shift in Computing
Access to parallel computing has been made available to the wider community with the advent of cloud computing. Many organizations that did not have access to such computing facilities are now capable of parallel computation. In individual machine design, there has been a shift in focus from increasing the clock speed per chip to providing more processor cores and threads per chip. These developments encourage the development of database models that can exploit parallel computation.

Based on the aforementioned factors, NoSQL databases were designed and developed. In general, NoSQL databases are defined by the following characteristics:
• Non-relational model
The limitations of the relational model in supporting big data and mixed-structured data are among the main reasons for the introduction of NoSQL.
• Designed for distributed processing and horizontal scalability
The shift in computer architecture development supports the need for a database that scales horizontally and can be processed in a distributed manner.
• Less strict adherence to schema
A weak schema, or schema-less data warehousing, is perceived as an advantage in supporting ad-hoc business analytics queries.
• Less strict adherence to consistency
The ACID (atomicity, consistency, isolation and durability) properties for database transactions treat consistency of data as a first-class citizen. However, implementing this principle in distributed processing while delivering a quick response to the query is very difficult. To enable processing of large amounts of data in a distributed environment, relaxing the requirement of consistency or the ACID properties becomes a design choice in NoSQL databases.

Two fundamental differences between relational and NoSQL databases, based on the characteristics described above, are the data model and the treatment of consistency. In the next sections, the different transaction models and the different data models for NoSQL are presented.

B. Transaction Models

Relational databases use the ACID properties to ensure the integrity of data in the database. The concept is implemented by means of a lock mechanism. NoSQL databases consider the requirement of ACID properties to be too restrictive and impossible to achieve in distributed environments, as suggested initially by Brewer’s theorem on CAP (consistency, availability and partition tolerance) [14] and as later expanded into PACELC (partition, availability, consistency, else, latency, consistency) [1]. Brewer’s CAP theorem states that a designer of a distributed database system has to choose two of the three CAP properties. Fig. 1 depicts the relations between consistency, availability and partition tolerance. For example, a CP-based distributed database system will trade availability to make sure the database is consistent when a partition is introduced to the database. A PA-based system favours availability over consistency when the database is partitioned. A CA-based system treats consistency and availability as paramount, which is achieved by not introducing partitions.
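The CP and PA choices described above can be made concrete with a small toy model. The following is a minimal sketch, not a description of any real DBMS: the class `TwoReplicaStore`, its `mode` flag, the logical clock and the newest-write-wins rule in `heal` are all assumptions introduced purely for illustration.

```python
from dataclasses import dataclass, field


class Unavailable(Exception):
    """Raised when a CP-style store refuses a write during a partition."""


@dataclass
class Replica:
    # key -> (version, value); the version only orders writes in this toy.
    data: dict = field(default_factory=dict)


class TwoReplicaStore:
    """Hypothetical two-replica store; mode is "CP" or "PA"."""

    def __init__(self, mode: str):
        if mode not in ("CP", "PA"):
            raise ValueError("mode must be 'CP' or 'PA'")
        self.mode = mode
        self.replicas = [Replica(), Replica()]
        self.partitioned = False
        self.clock = 0  # logical clock, an assumption of this sketch

    def write(self, replica_id: int, key: str, value: str) -> None:
        if self.partitioned and self.mode == "CP":
            # CP: give up availability rather than let replicas diverge.
            raise Unavailable("partition in progress; write rejected")
        self.clock += 1
        record = (self.clock, value)
        if self.partitioned:
            # PA: stay available, but only the contacted replica is updated.
            self.replicas[replica_id].data[key] = record
        else:
            for replica in self.replicas:
                replica.data[key] = record

    def read(self, replica_id: int, key: str):
        record = self.replicas[replica_id].data.get(key)
        return None if record is None else record[1]

    def heal(self) -> None:
        """End the partition and keep the newest version of every key."""
        self.partitioned = False
        merged = {}
        for replica in self.replicas:
            for key, record in replica.data.items():
                if key not in merged or record[0] > merged[key][0]:
                    merged[key] = record
        for replica in self.replicas:
            replica.data = dict(merged)
```

With `mode="PA"`, two clients writing to different replicas during a partition can read different values until `heal` is called; with `mode="CP"`, the same writes raise `Unavailable` and both replicas keep returning the last agreed value.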

Figure 1. Brewer’s theorem (labels: Availability, Partition, Consistency; pairings: PA, CA, CP)

The PACELC model suggests that the tradeoff between consistency and availability is not solely based on partition tolerance, but also on the existence of a network partition itself.
“If there is a partition (P), how does the system trade off availability and consistency (A and C); else (E), when the system is running normally in the absence of partitions, how does the system trade off latency (L) and consistency (C)?”
In addition to the tradeoff among the three CAP properties, the PACELC model, as depicted in Fig. 2, suggests that latency is an important factor to consider, as most distributed database systems use replication to ensure availability.

Figure 2. PACELC model (labels: Partition, Availability, Consistency, Latency, Consistency)

An alternative protocol based on the principle of CAP, called BASE (Basically Available, Soft-State services with Eventual Consistency), was introduced by eBay to replace the two-phase commit protocol [24]. It aims to support partial failure so that total system failure can be avoided. In other words, BASE focuses on the availability rather than the consistency of the database. BASE can be considered to take an optimistic approach to consistency, compared to the pessimistic view of ACID-based protocols. By taking an optimistic approach, BASE allows best-effort and approximate answers to exist in a database transaction state. This creates a soft state that needs to be managed carefully by applications before a state of full consistency is eventually reached. If the soft states in the database transactions are recognized and the patterns of these transactions are detected, applications can be designed to be aware of these patterns and manage the soft states accordingly. In eBay’s implementation of BASE, a message queue is used to eventually resolve possible conflicts that may arise during the soft state of a transaction.

C. Data Models

1) Key-Value Pair
The idea of the key-value pair has existed in computing for many decades; it is a common data structure used in the development of file systems. In this type of database, data is stored as a pair of key and value. Each key is unique in a collection. Access to the values is achieved by means of the key-value association. The keys need to be kept in a data store that can be accessed quickly, for example a hash table. The binding from key to value varies depending on the programming language used. The values do not necessarily contain raw data; a value can contain another set of keys, which makes the cardinality of the value one of the major decisions in designing a key-value database. What would the values represent in the database? Is a value going to be an attribute, an entity or another key?

One interesting concept that may be too radical for someone who follows relational databases religiously is the key-value treatment of consistency. In a key-value database, it is acceptable to have two different values of the same data at read time, which implies inconsistency in relational database theory. The inconsistency in the data is left to the application program or client to resolve. Eventually the inconsistency will be removed from the database, following a procedure adopted by the DBMS. The inconsistency is the cost that the application pays as a tradeoff for availability and/or latency, as described in the transaction models. In general, key-value pair databases are suitable for applications that process single-key transactions and perform a lot of reading. An example of such an application is generating product catalogues on the fly. For this type of application, the key-value pair database can deliver high throughput and low latency.

2) Column-Family
A column-family database can be considered a specific type of key-value pair model. A column-family database defines the structure of the values as a predefined set of columns, hence the name column-family. The definition of the column family can be considered the schema of the database. The main driver of this approach is HBase [15], which is modeled on Google’s BigTable [26]. This data model may confuse those who are familiar with the relational model because its components share the same names. Column-family databases are made up of columns, column families and super-columns.
• Column
A column is an atomic unit of information supported by the database. It is expressed as a pair of name and value.
• Super-column

Super-columns group together associated columns that would be retrieved together from disk or that have a semantic association. They are useful for modeling complex data types such as an address.
• Column Family
A column family groups columns and super-columns together into highly structured data. It is the closest equivalent of a table in the relational model.

Consider sample data containing personal details as depicted in Table I. The data can be represented in a column-family database with:
• Super-columns of personal data and demographic.
• Columns of name, address, education, gender.
• Column family of person, identified by the key PersonID.

TABLE I. COLUMN-FAMILY EXAMPLE

Row key | Personal Data | Personal Data | Demographic | Demographic | …
PersonID | Name | Address | Education | Gender |
1 | Smith, H | Rome | Master | F |
2 | Jones, S | NY | PhD | M |
3 | Chin, P | Sydney | Bachelor | F |
4 | Santos, J | Lima | PhD | F |

The structure of the super-columns and the column family determines the schema for the database. However, it is not strictly fixed. A new column or super-column can be added to the design with ease when the database is already in production. Another important difference between column-family and relational databases is that the rows in a column-family database do not need to be of the same degree, i.e., a row can have a variable number of columns/super-columns. Hence, the column-family model is very effective in supporting highly sparse data collections.

3) Document
A document database uses the concept of the key-value pair to store data. However, it imposes some structure on how the value is stored. Unlike the column-family model, which stores the values in a family of columns, the document database stores the values in a document-like structure such as XML or JSON. Hence, it provides more information about the structure of the data than key-value oriented databases, and the structure can be exploited to serve more query types. In key-value databases, only queries by key or key range are possible.

4) Graph Database
A graph database is a database that uses graphs as the means to represent its schema.
The graph database differs from the relational database in its treatment of relationships. In a relational database, what matters are tuples and their collection, called a relation. The relationship between individual tuples is implicitly defined by means of foreign keys and primary keys. In a graph database, both relationships and relations matter. For some applications, such as social networks, depicting the relationship between each pair of entities is important. The popularity of social networks has contributed to the resurrection of graph database research, which was active in the 80s and early 90s [3].

D. Business Analytics

NoSQL databases were designed to support the availability of data to end-users rather than to help gather business data for decision making. Hence, at this stage of NoSQL development, there is very limited querying support for business analytics applications. HIVE [16] and PIG [23] are examples of available querying applications that run on top of the MapReduce framework. Unlike relational business analytics tools, which allow non-technical personnel to interrogate the database for intelligence, ad-hoc queries in NoSQL databases demand skilled programmers to code the query, as many of the available databases only provide an API as their interface and do not have a high-level query language.

IV. COMPARISON OF NOSQL DATABASES

In this section, a sample of these DBMS is presented. The list is not meant to be exhaustive; its purpose is to highlight the different characteristics described in the previous section. The DBMS were chosen because each is either one of the first in its category or the leader in its category. These DBMS are compared based on data model, transaction model, license, ad-hoc query, indexes and sharding.
• Data Model
The model supported by the database, which is one of key-value pair, column-family, document or graph.
• Transaction Model
The priority of the database in selecting the trade-off based on the CAP and PACELC models.
• License
The type of license for the software.
• Ad-hoc Query
Indication of whether the DBMS supports ad-hoc queries and, if so, what technique or programming language is used to write them.
• Indexes
Indication of whether the DBMS supports automatic maintenance of secondary indexes, in contrast to application-managed secondary indexes.
• Sharding
Indication of whether the DBMS supports automatic sharding, that is, the ability of the DBMS to automatically re-distribute data and replicas across servers when there is a change in resources such as the addition or removal of servers.
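The automatic sharding criterion can be illustrated with consistent hashing, one widely used way to re-distribute keys when servers are added or removed. The paper does not state which technique each DBMS uses, so the following is only a sketch under that assumption; the class `HashRing`, the server names and the `vnodes` parameter are hypothetical.

```python
import bisect
import hashlib


def _hash(text: str) -> int:
    """Map a string to a large integer position on the ring."""
    return int(hashlib.sha256(text.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hash ring that assigns keys to servers."""

    def __init__(self, servers=(), vnodes: int = 64):
        self.vnodes = vnodes  # virtual nodes per server, to even out load
        self._points = []     # sorted list of (position, server)
        for server in servers:
            self.add_server(server)

    def add_server(self, server: str) -> None:
        for i in range(self.vnodes):
            bisect.insort(self._points, (_hash(f"{server}#{i}"), server))

    def remove_server(self, server: str) -> None:
        self._points = [p for p in self._points if p[1] != server]

    def server_for(self, key: str) -> str:
        if not self._points:
            raise LookupError("no servers in the ring")
        # First ring point at or after the key's position, wrapping around.
        idx = bisect.bisect_left(self._points, (_hash(key), ""))
        return self._points[idx % len(self._points)][1]


if __name__ == "__main__":
    ring = HashRing(["server-a", "server-b", "server-c"])
    keys = [f"person:{i}" for i in range(1000)]
    before = {k: ring.server_for(k) for k in keys}
    ring.add_server("server-d")
    moved = [k for k in keys if ring.server_for(k) != before[k]]
    print(f"{len(moved)} of {len(keys)} keys moved to the new server")
```

The property that matters for elasticity is that adding a server only moves keys onto that new server, and removing it again restores the previous assignment; all other keys stay where they were.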

TABLE II. NOSQL DATABASES COMPARISON

Database | Data model | Transaction (CAP) | Transaction (PACELC) | Ad-hoc query | License | Indexes? | Sharding?
Cassandra [19] | column-family | PA | EL | HIVE, PIG | Open-source (Apache) | N | Y
CouchDB [10] | document | PA | EL | Lucene, Cloudant | Open-source (Apache) | Y | N
DynamoDB [11] | key-value | PA | EL | Built-in API | Commercial | |
HBase | column-family | PC | EC | HIVE, PIG | Open-source (Apache) | N |
MongoDB [21] | document | PA | EC | BSON-based format | GNU Affero General Public License (database), Apache (language drivers) | Y | Y
Neo4j [22] | graph | CA | EC | Cypher | Open-source (GNU Affero General Public License) and commercial | N | N
RavenDB [25] | document | ACID | | Limited, built-in | Open-source and commercial | Y | Y
Riak [32] | key-value | PA | EL | CorrugatedIron | Open-source (Apache) | Y | Y
Tokyo Cabinet [17] | key-value | PC | EC | No | GNU Lesser General Public License | N | N
Voldemort [31] | key-value | PA | EL | No | Open-source | Y | Y
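To make the "Data model" column of Table II more concrete, the sketch below lays out the first two rows of Table I under each of the four data models. Plain Python dictionaries and lists stand in for real stores; the key format `person:1`, the helper functions and the `KNOWS` relationship are hypothetical and do not come from any of the listed products.

```python
import json

# 1) Key-value: the store sees only an opaque value behind a unique key.
key_value = {"person:1": "Smith, H;Rome;Master;F"}

# 2) Column-family: row key -> super-column -> column name -> value.
person_family = {
    1: {
        "personal_data": {"name": "Smith, H", "address": "Rome"},
        "demographic": {"education": "Master", "gender": "F"},
    },
    # Rows need not have the same degree; the demographic super-column
    # is left out here on purpose to show a sparse row.
    2: {"personal_data": {"name": "Jones, S", "address": "NY"}},
}

# 3) Document: the value is a self-describing structure such as JSON.
documents = {
    "person:1": json.dumps(
        {"name": "Smith, H", "address": "Rome", "education": "Master", "gender": "F"}
    ),
    "person:2": json.dumps(
        {"name": "Jones, S", "address": "NY", "education": "PhD", "gender": "M"}
    ),
}


def find_by_field(docs, field_name, wanted):
    """Return the keys of documents whose field equals the wanted value."""
    return sorted(
        key for key, raw in docs.items() if json.loads(raw).get(field_name) == wanted
    )


# 4) Graph: nodes plus explicit relationships (the KNOWS edge is invented).
nodes = {1: {"name": "Smith, H"}, 2: {"name": "Jones, S"}}
edges = [(1, "KNOWS", 2)]


def neighbours(node_id, label):
    """Follow outgoing relationships with the given label."""
    return [dst for src, lbl, dst in edges if src == node_id and lbl == label]
```

Because the document store keeps the structure of each value, `find_by_field(documents, "education", "PhD")` can answer a query on a non-key attribute, whereas the key-value layout only supports lookup by `person:1` and leaves parsing of the value to the application.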

Many of the products listed in Table II were developed by companies that have large data storage requirements, such as Cassandra for Facebook, Voldemort for LinkedIn, DynamoDB for Amazon, HBase, which follows the design of Google’s BigTable, and Tokyo Cabinet for Mixi Inc., which operates a Japanese counterpart to Facebook. Their development was a necessity for the operation of these businesses. Other database systems, such as MongoDB, Riak and Neo4j, were developed as a service to other organizations, as their providers saw the potential in investing in the new technology. This group of DBMS was released after the group built in-house.

It is also interesting to see that many of the available databases have an open-source license, with the exceptions of DynamoDB, Neo4j and RavenDB. DynamoDB is available to Amazon Web Services users with tiered pricing. Neo4j and RavenDB provide different licensing for different levels of service/product.

In terms of the transaction model, most of the databases give priority to availability over consistency (PA) according to the CAP model, except HBase and Tokyo Cabinet, which prefer a PC model. Tokyo Cabinet’s performance is comparable to the PA systems in terms of availability for small to medium data sets [17]. Most of the PA systems under the CAP model tend to select the EL model when there is no network partition, with the exception of MongoDB and Tokyo Cabinet. These two databases can be considered to support consistency more than the PA/EL systems do. Unlike the rest of the databases, which do not support ACID, RavenDB supports ACID for transactions based on the key-value. It falls back to the BASE protocol when transactions involve secondary indexes.

NoSQL databases were designed to handle large volumes of data processing by removing some of the support that exists in an RDBMS. One example is ad-hoc query. Although many of the databases listed in Table II support ad-hoc queries, the level of programming expertise needed to write these queries is much higher than for an RDBMS. In addition, ad-hoc queries in NoSQL databases are usually performed on non-key data. An ad-hoc query on a non-key item causes the system to perform poorly, or is not available at all in a DBMS that does not support secondary indexing. Ad-hoc query is well supported in relational DBMS thanks to data independence and query optimization, two concepts that are hardly supported in NoSQL. The structure of a NoSQL database is designed by considering the most efficient access path used by critical queries. Selection of the access path is very dominant in the key-value and column-family models. The limited ad-hoc query support will limit the use of data from NoSQL databases for OLAP and data warehousing applications. This limitation challenges researchers to find new models and/or techniques to support business analytics from NoSQL data.

The ability to optimize resource allocation in a distributed environment by managing the expansion and contraction of available resources is an important feature of NoSQL DBMS. When the available resources change, it is possible to automatically re-distribute the data or to create different shards of data. This ability is important to the performance of the database because it influences the latency of the system. There are

several DBMS listed in Table II that support automatic sharding, such as Cassandra, MongoDB, RavenDB and Riak. Of the four data models, the graph data model is the one with the most challenges in supporting sharding. There is no support for sharding in graph databases at this stage, although some initial development has taken place [18].

V. CONCLUSION

We have explored and compared different types of NoSQL databases. The two main drivers for these databases are the needs of many organizations to process large amounts of data and to handle data that in some cases has no obvious tabular structure. The solutions proposed for these drivers are new, non-relational data models for distributed data processing. Although they are considered new models in the database domain, they are being developed based on existing and known theory. For example, the key-value store is a known data structure and has been used in many file systems. What makes the development of these databases novel is the design of the DBMS to support horizontal scale-out. This is done by relaxing the ACID protocol and building a protocol that allows an eventually consistent state in the database. Availability is the main concern, not consistency. This departs far from the relational database, which puts consistency as its main focus. So, are we at a crossroad? Not exactly.

The two camps of relational and non-relational (NoSQL) are like two different roads to the same destination. It is like having Route 66 and the interstate highways. Route 66 is a well-known route; it may be more scenic, offers multiple stops and may be a bit slower. On the other hand, interstate highways can make the trip faster, but only when all the interstate highways are developed and coordinated and the roads are designed to handle the volume and rate of traffic.

There are still many open challenges in making NoSQL databases a mainstream solution. The challenges come from three different domains: research, adoption/development and end-users. From the research perspective, there are still problems to solve, such as:
• Understanding latency and its influence on the overall design and performance of a database. PACELC is a step in the right direction, but a model with a strong theoretical foundation for understanding latency is still needed so that a new DBMS architecture can be designed accordingly.
• Sharding for graph databases. Unlike most of the other models, a graph database has a mutable structure at run time; hence it is difficult to design a sharding algorithm that provides high elasticity.
• Support for ad-hoc query is still limited in many systems; hence support for OLAP or data warehousing queries is still very limited. There is a need to find a new model of business analytics or a way to provide a simple interface for decision makers to perform ad-hoc queries.
• Unlike the relational model, NoSQL does not have a strong theoretical background; hence developing a methodology for database design is challenging. Currently, to the author’s knowledge, no data modeling technique has been prescribed as a method to perform database design for the different data models.
• A new model with the support of a strong theoretical foundation that works well on a large distributed data set.

NoSQL databases were designed to overcome the limitations of relational databases in supporting distributed processing of data. Hence, some aspects that are important in a relational database may not be relevant in NoSQL, for example query optimization. A query optimization engine is included in an RDBMS because the relational model imposes data independence and provides high-level support for ad-hoc queries. In NoSQL databases, the query is implicitly optimized during the design of the database by considering the type of distributed architecture available and the pattern of queries to be supported. NoSQL does not aim to be as flexible in serving ad-hoc queries as a relational database. Nevertheless, there is a need to serve ad-hoc queries more efficiently, but the approach should be different from query optimization in relational databases.

From the point of view of adoption and development, it is important to educate CTOs on the strengths and weaknesses of NoSQL databases. These databases should be seen as a solution complementary to the relational database for the data management problems of an organization, not as a replacement. It is important to understand the operational patterns in the organization to allow the development of best practices and methodologies that are appropriate for NoSQL database design and implementation. Without design tools, it will be difficult for this new technology to achieve mass adoption in the marketplace.

Many relational database end-users will find NoSQL difficult to use, as building queries in NoSQL requires more sophisticated programming skills. Providing a high-level query language will be important for the acceptance of this technology by end-users. It is also important to educate them that ad-hoc queries may take longer in NoSQL than in a relational database; hence the way they define their information needs may need to change. Information needs should be defined early, during the development of the database, rather than later during its operational stage.

NoSQL will not replace the relational database completely. Instead, it is complementary to relational databases in providing enhanced data management capability within an organization.

REFERENCES

[1] D.J. Abadi, "Consistency Tradeoffs in Modern Distributed Database System Design: CAP is Only Part of the Story," Computer, vol. 45, no. 2, pp. 37-42, Jan. 2012.
[2] R. Agrawal, A. Ailamaki, P. A. Bernstein, E. A. Brewer, M. J. Carey, S. Chaudhuri, et al. The Claremont report on database research. SIGMOD Rec. 37, 3 (September 2008), 9-19.
[3] R. Angles and C. Gutierrez. 2008. Survey of graph database models. ACM Comput. Surv. 40, 1, Article 1 (February 2008), 39 pages.
[4] M. M. Astrahan, M.W. Blasgen, D. D. Chamberlin, K. P. Eswaran, J. N. Gray, P.P. Griffiths, W.F. King, R.A. Lorie, P.R. McJones, J. W. Mehl, G.R. Putzolu, I.L. Traiger, B.W. Wade, and V. Watson. 1976. System R: relational approach to database management. ACM Trans. Database Syst. 1, 2 (June 1976), 97-137.
[5] http://ayende.com/blog/tags/nosql
[6] S. Bagui, "Achievements and Weaknesses of Object-Oriented Databases," Journal of Object Technology, vol. 2, no. 4, July-August 2003, pp. 29-41.
[7] http://cacm.acm.org/blogs/blog-cacm/99512-why-enterprises-are-uninterested-in-nosql/fulltext
[8] E.F. Codd. 1970. A relational model of data for large shared data banks. Commun. ACM 13, 6 (June 1970), 377-387.
[9] T.M. Connolly and C.E. Begg, Database Systems: A Practical Approach to Design, Implementation and Management, 4th ed., Addison Wesley, 2005.
[10] http://couchdb.apache.org/
[11] G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels. 2007. Dynamo: amazon's highly available key-value store. In Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles (SOSP '07). ACM, New York, NY, USA, 205-220.
[12] http://fallabs.com/tokyocabinet/
[13] J. Fenn and M. Raskino, Mastering the Hype Cycle: How to Choose the Right Innovation at the Right Time, Harvard Business Press, 2008.
[14] S. Gilbert and N. Lynch. 2002. Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services. SIGACT News 33, 2 (June 2002), 51-59.
[15] http://hbase.apache.org/
[16] http://hive.apache.org/
[17] http://www.igvita.com/2009/02/13/tokyo-cabinet-beyond-key-value-store/
[18] http://jim.webber.name/2011/02/16/3b8f4b3d-c884-4fba-ae6b-7b75a191fa22.aspx
[19] A. Lakshman and P. Malik. 2010. Cassandra: a decentralized structured storage system. SIGOPS Oper. Syst. Rev. 44, 2 (April 2010), 35-40.
[20] N. Leavitt, "Will NoSQL Databases Live Up to Their Promise?," Computer, vol. 43, no. 2, pp. 12-14, Feb. 2010.
[21] http://www.mongodb.org/
[22] http://neo4j.org/
[23] http://pig.apache.org/
[24] D. Pritchett. 2008. BASE: An Acid Alternative. Queue 6, 3 (May 2008), 48-55.
[25] http://ravendb.net/
[26] http://research.google.com/archive/bigtable.html
[27] http://research.google.com/archive/mapreduce.html
[28] A. Silberschatz, H.F. Korth, and S. Sudarshan, Database System Concepts, 5th ed., McGraw Hill, 2006.
[29] http://www.softwareag.com/Corporate/products/wm/tamino/default.asp
[30] M. Stonebraker and R. Cattell. 2011. 10 rules for scalable performance in 'simple operation' datastores. Commun. ACM 54, 6 (June 2011), 72-80.
[31] R. Sumbaly, J. Kreps, L. Gao, A. Feinberg, C. Soman, and S. Shah. 2012. Serving large-scale batch computed data with project Voldemort. In Proceedings of the 10th USENIX conference on File and Storage Technologies (FAST'12). USENIX Association, Berkeley, CA, USA, 18-18.
[32] http://wiki.basho.com/

