Margo Seltzer
October 29, 2013
Outline
- In the beginning
- The heyday of RDBMS
- The rebirth of key/value stores
- Key/Value stores today: NoSQL
- NoSQL & key/value use cases
CS109 10/29/13
In the Beginning
Where beginning equals 1960s
- Computers
  - Centralized systems
  - Spiffy new data channels let CPU and IO overlap.
  - Persistent storage is on drums.
  - Buffering and interrupt handling done in the OS.
  - Making these systems fast is becoming a research focus.
- Data
  - What did data look like?
Organizing Data: ISAM
- Indexed sequential access method
  - Pioneered by IBM for its mainframes
- Fixed length records
  - Each record lives at a specific location
  - Rapid record access even on a sequential medium (tape)
- All indexes are secondary
  - Allow key lookup, but
  - Do not correspond to physical organization
- Key is to build small, efficient index structures
- Fundamental access method in COBOL
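A minimal sketch of the ISAM idea, assuming a hypothetical 32-byte record layout and an in-memory file: because every record has the same length, a record number fixes its byte offset, and a small separate index maps keys to record numbers. The names `RECORD`, `build_file`, and `lookup` are illustrative only, and a plain dictionary stands in for the sorted index structures real ISAM implementations used.

```python
import io
import struct

# Hypothetical fixed-length layout: 8-byte key followed by a 24-byte payload.
RECORD = struct.Struct("8s24s")


def build_file(rows):
    """Write records back to back and return (data_file, index)."""
    data = io.BytesIO()
    index = {}  # key -> record number; small and separate from the data
    for recno, (key, payload) in enumerate(rows):
        data.write(RECORD.pack(key.encode(), payload.encode()))
        index[key] = recno
    return data, index


def lookup(data, index, key):
    """Find a record by key: one index probe, then one seek."""
    recno = index[key]
    data.seek(recno * RECORD.size)  # the offset follows from the record number
    raw_key, raw_payload = RECORD.unpack(data.read(RECORD.size))
    # struct pads short fields with NUL bytes; strip them again on the way out
    return raw_key.rstrip(b"\x00").decode(), raw_payload.rstrip(b"\x00").decode()


rows = [("smith", "ledger 17"), ("adams", "ledger 4"), ("jones", "ledger 9")]
data, index = build_file(rows)
print(lookup(data, index, "adams"))
```

The index order (here, insertion order) says nothing about where records sit in the file, which is the sense in which the index is secondary.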
Organizing Data: The Network Model
- Early data management systems represented data using something called a network model.
- Data are represented by collections of records (today we would call those key/data pairs).
- Relationships among records are expressed via links between records (today we can think of those links as pointers).
- Applications interacted with data by navigating through it:
  - Find a record
  - Follow links
  - Find other records
  - Repeat
The Network Model: Inside Records
- Records composed of attributes (think fields)
- Attributes are single-valued.
- Links connect exactly two records.
- Represent N-way relationships via link records
[Figure: Patient, Appointment, and Doctor records; recoverable attribute labels: name, office]
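A rough sketch of network-model navigation, assuming the patient/appointment/doctor example from the slide and treating the appointment as the link record. The `Record` class, the `link` and `follow` helpers, and all attribute values are hypothetical; real network databases exposed this through their own navigational interfaces, not Python objects.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    kind: str
    attrs: dict  # single-valued attributes (fields)
    # Links are pointers to other records; kept out of repr/eq to avoid cycles.
    links: list = field(default_factory=list, repr=False, compare=False)


def link(a, b):
    """A link connects exactly two records; remember it on both ends."""
    a.links.append(b)
    b.links.append(a)


def follow(record, kind):
    """Navigate: follow links from a record to neighbours of one kind."""
    return [r for r in record.links if r.kind == kind]


patient = Record("Patient", {"name": "Pat"})
doctor = Record("Doctor", {"name": "Dr. Lee", "office": "B12"})
# The appointment acts as a link record tying a patient to a doctor.
appointment = Record("Appointment", {"time": "09:30"})
link(patient, appointment)
link(appointment, doctor)

# Find a record, follow links, find other records, repeat.
offices = [
    d.attrs["office"]
    for appt in follow(patient, "Appointment")
    for d in follow(appt, "Doctor")
]
print(offices)
```

Note that the application code has to know the path Patient -> Appointment -> Doctor; if the links are rearranged, the traversal must be rewritten.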
The Relational Model: The Competition
- The Network Model had some problems
  - Applications had to know the structure of the data
  - Changing the representation required a massive rewrite
  - Fundamentally: the physical arrangement was tightly coupled to the application and the application logic.
- 1970: Ted Codd proposes the relational model
  - Decouple physical representation from logical representation
  - Store records as tables
  - Replace links with implicit joins among tables
- The big question: could it perform?
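To make "replace links with implicit joins among tables" concrete, here is a small sketch using Python's built-in `sqlite3` module. The table names, column names, and rows are hypothetical; the point is only that the query names values to match, not pointers to follow.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE doctor (id INTEGER PRIMARY KEY, name TEXT, office TEXT);
    CREATE TABLE appointment (patient_id INTEGER, doctor_id INTEGER, time TEXT);
    INSERT INTO patient VALUES (1, 'Pat');
    INSERT INTO doctor VALUES (1, 'Dr. Lee', 'B12');
    INSERT INTO appointment VALUES (1, 1, '09:30');
""")

# The query states what is wanted; the engine decides how to locate the rows.
rows = conn.execute(
    """
    SELECT doctor.office
    FROM patient
    JOIN appointment ON appointment.patient_id = patient.id
    JOIN doctor ON doctor.id = appointment.doctor_id
    WHERE patient.name = ?
    """,
    ("Pat",),
).fetchall()
print(rows)
```

Adding an index or changing how the tables are stored would not change the query text, which is the decoupling of physical from logical representation.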
The heyday of RDBMS
- After Codd's paper, much debate ensued, but two groups set out to turn an idea into software.
  - IBM: System R
  - U.C. Berkeley: Ingres
- Both were research projects to explore the feasibility of implementing the relational model.
- Both were hugely successful and had enormous impact.
- However, at their core, you'll notice something interesting ...
The Design of System R
[Figure: System R architecture; recoverable labels: read/write blocks, disk layer]
The Design of Ingres
[Figure: Ingres architecture; recoverable labels: command-line terminal program, EQUEL programs, UNIX files, file system, database, log, catalogs, tables, disk layer]
The Key/Value Store Within
- All these relational systems and most data management systems have hidden deep inside some sort of key/value storage engine.
- For years, the conventional wisdom was to hide those KV stores deep underneath a query language and schema level.
- Major exception was COBOL, which continued to use ISAM.
RDBMS matured
- Lots of new features (SQL-86, SQL-89, SQL-92, SQL:1999, SQL:2003, SQL:2008, SQL:2100?)
  - Triggers
  - Stored procedures
  - Report generators
  - Rules
  - ...
- Incorporated new data models:
  - Object-relational systems
  - XML
- Became enormously large, complex systems requiring expert administration.
RDBMS: The Good and the Bad
- RDBMS Advantages:
  - Declarative query language that decoupled physical and logical layout.
  - (Relative) ease of schema modification.
  - Great support for ad hoc queries.
- RDBMS Disadvantages:
  - Pay for the overhead of query processing even if you don't have complex queries.
  - Require DBA for tuning and maintenance.
  - Nearly always pay IPC to access database server.
  - Require schema definition.
  - Not great at managing hierarchical or other complex relationships.
The Age of the Internet
- New classes of applications emerged
  - Search
  - Authentication (LDAP)
  - Email
  - Browsing
  - Instant messaging
  - Web servers
  - Online retail
  - Public key management
These Applications were Different
- They weren't supporting ad hoc query mechanisms
  - Query set specified by the application
  - Users interacted with the applications, not the data
- They had fairly simple schemas
  - No need for fancy reports
- Performance was critical
Key/Value versus RDBMS
- Key-value databases: pros
  - Sometimes the "good enough" solution for simple data schemas
  - Much smaller footprint
  - Shorter execution path
  - More efficient, fewer OS resources
  - Usually no client/server
  - No application data mapping
- Key-value databases: cons
  - Application-centric schema control
  - High availability often missing (not BDB)
  - Higher level abstraction APIs often missing (not BDB)
  - Limited RDBMS integration
- Standard RDBMS: pros
  - Rich SQL support
  - Data typing
  - Data relationship management
  - Dynamic data relationships
  - Procedural languages (stored procedures)
  - Parallel query execution
  - Security, encryption
  - Centralized management
- Standard RDBMS: cons
  - Much larger engine, higher overhead, client/server overhead, general purpose RDBMS solution
  - Higher administrative burden
  - Higher TCO
From local to distributed
- As the web matured, data volume and velocity and customer demand grew exponentially.
- Exceeded single system capacity.
- Enter a new era of scalability demands.
- Infrastructure migrates from large-scale SMP to clusters of commodity blades.
Outline
- In the beginning
- The heyday of RDBMS
- The rebirth of key/value stores
- Key/Value stores today: NoSQL
  - Framing
  - Technology
- NoSQL & key/value use cases
Enter NoSQL
- Web service providers (e.g., Google, Amazon, eBay, Facebook, Yahoo) began pushing existing data management solutions to their limits.
- Introduced sharding:
  - Split the data up across multiple hosts to spread out load.
- Introduced replication:
  - Allow access from multiple sites
  - Increase availability
- Relax consistency:
  - Didn't always need transactional semantics
What is NoSQL?
- Not-only-SQL (2009)
- While RDBMSs have typically focused on transactions and the ACID properties, NoSQL focuses on BASE:
  - Basic Availability: Use replication to reduce the likelihood of data unavailability and sharding to make any remaining failures partial
  - Soft state: Allow data to be inconsistent and relegate designing around such inconsistencies to application developers.
  - Eventually consistent: Ensure only that at some future point in time the data assumes a consistent state
Why NoSQL?
- Three common problems
  1. Unprecedented transaction volumes
  2. Critical need for low latency access
  3. 100% availability
- Hardware shift: From massive SMP to blades
- Changing how we deal with the CAP theorem
- The CAP theorem states that you can't have all three of:
  - Consistency: All nodes see the same data at the same time.
  - Availability: Every request receives a success/fail response
  - Partition tolerance: The system continues to operate despite arbitrary message loss.
- RDBMSs (transactional systems) typically choose CA
- NoSQL typically chooses AP or CP
NoSQL Evolution
- 1997: Berkeley DB released providing transactional key/value store.
- 2001: Berkeley DB introduces replication for high availability.
- 2006: Google publishes Chubby and BigTable papers.
  - Each system provides scalable data management for loosely coupled systems
- 2007: Amazon publishes Dynamo paper
  - DHT-based eventually consistent key/value store
- 2008+: Multiple OSS projects to produce BigTable/Dynamo knockoffs
  - HBase: Apache Hadoop project implementation of BigTable (2008)
  - CouchDB: Erlang-based document store, MVCC and versioning (2008)
  - Cassandra: Write-optimized, column-oriented, secondary indices (2009)
  - MongoDB: JSON-based document store, indexing, sharded (2009)
- 2009+: Commercialization
  - Companies emerge to support open source products (e.g., DataStax/Cassandra, Basho/Riak, Cloudera/HBase, 10Gen/MongoDB, ...)
  - Companies develop commercial offerings (Oracle, Citrusleaf)
NoSQL Technology
- Some form of distribution and replication:
  - Distributed hash tables (DHTs)
  - Key partitioning
  - Replication in underlying storage or in NoSQL
- Relational or key/value store for underlying storage engine.
- Second generation systems building custom write-optimized engines.
  - Log-structured merge trees (LSM)
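As a rough illustration of why log-structured merge trees are write-optimized, the toy below buffers writes in memory, flushes them as immutable sorted runs, and merges runs during compaction. The class name `TinyLSM` and the `memtable_limit` parameter are hypothetical, everything stays in memory, and real engines add write-ahead logs, levels, and deletion markers that are left out here.

```python
import bisect


class TinyLSM:
    """Toy LSM tree: one in-memory memtable plus immutable sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.runs = []  # oldest run first; each run is a sorted list of (key, value)
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value  # a write only touches memory
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Emit the memtable as one sorted run (a sequential write on real disks)."""
        if self.memtable:
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):  # newest run first, so newest value wins
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        """Merge all runs into one, keeping only the newest value per key."""
        merged = {}
        for run in self.runs:  # oldest first, so later runs overwrite earlier ones
            merged.update(run)
        self.runs = [sorted(merged.items())]


db = TinyLSM()
for key, value in [("a", 1), ("b", 2), ("a", 3), ("c", 4)]:
    db.put(key, value)
print(db.get("a"), len(db.runs))
db.compact()
print(db.runs)
```

The trade-off visible even in this toy: writes never update data in place, while a read may have to check several runs until compaction merges them.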
NoSQL Design Space
- Local node storage system
- Distribution
- Data Model
- Consistency Model
  - Eventual consistency
  - No consistency
  - Transactional consistency
Local Storage Systems
- Native KVstore
  - Oracle NoSQL Database: Berkeley DB Java Edition
  - Basho: Riak's Bitcask, a log-structured hash table
  - Amazon: Dynamo, BDB Data Store, BDB JE (or MySQL)
- Log-Structured Merge Trees
  - LevelDB
- Custom
  - BigTable (and clones): Tablet servers on SSTables exploiting GFS-like systems.
Distribution
- Three main questions:
  - How do you partition data?
  - How many copies do you keep?
  - Are all copies equal?
- How do you partition data?
  - Key partitioning
  - Hash partitioning (often called sharding)
  - Geographic partitioning (also called sharding)
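The partitioning choices can be sketched in a few lines. This assumes "key partitioning" means splitting the key space into contiguous ranges, which is one reading of the slide; the node names, split points, and the simple "next nodes in the list" replica placement are all hypothetical.

```python
import bisect
import hashlib

NODES = ["node0", "node1", "node2"]  # hypothetical hosts


def hash_partition(key, nodes=NODES):
    """Hash partitioning: spreads keys evenly but scatters neighbouring keys."""
    digest = hashlib.sha256(key.encode()).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


# Key (range) partitioning: each node owns a contiguous slice of the key space.
# node0: key < "h", node1: "h" <= key < "p", node2: everything else.
SPLIT_POINTS = ["h", "p"]


def key_partition(key, nodes=NODES, splits=SPLIT_POINTS):
    return nodes[bisect.bisect_right(splits, key)]


def replicas(key, copies=2, nodes=NODES):
    """Place copies on consecutive nodes, starting at the key's hash partition."""
    start = nodes.index(hash_partition(key, nodes))
    return [nodes[(start + i) % len(nodes)] for i in range(min(copies, len(nodes)))]


for key in ["apple", "kiwi", "zebra"]:
    print(key, key_partition(key), hash_partition(key), replicas(key))
```

Range partitioning keeps adjacent keys together, which helps range scans but can create hot spots; hash partitioning evens out load at the cost of ordered access.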
More Distribution
- How many copies and how?
  - Use the underlying file system (GFS, HDFS)
  - Three is good; five is better; for hot things sometimes go to many.
- Copy Equality:
  - Single-Master: MongoDB, Oracle NoSQL Database, ...
  - Multi-Master/Masterless: Riak, CouchDB, Couchbase
NoSQL Data Models
- Common to most offerings:
  - Denormalization: some data values are duplicated to provide superior query performance.
  - No joins
- Key/Value: Little or no schema, minimal range query support
  - Oracle NoSQL DB, DynamoDB, Couchbase, Riak
- BigTable Column Family
  - HBase, Cassandra
- Document-stores
  - MongoDB, CouchDB
Data Model: Key/Value
- Keys and data are both opaque byte strings.
- Designed for point queries
  - Typically no notion of a range query
  - If iteration exists, it's usually not in key-order
- Examples:
  - DynamoDB
  - LevelDB
  - Riak
  - Tokyo/Kyoto Cabinet
  - Oracle NoSQL Database
- Extensions:
  - Oracle NoSQL: Major/Minor keys, batching
  - LevelDB: Atomic batches
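A minimal sketch of this data model, assuming an in-memory dictionary as the storage engine. The `KVStore` class and its `write_batch` method are hypothetical and only mimic the idea of an atomic batch; they are not the API of LevelDB or any other product named above.

```python
class KVStore:
    """Toy key/value store: opaque byte-string keys and values, point queries only."""

    def __init__(self):
        self._data = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: bytes):
        return self._data.get(key)  # point query; no range scan is offered

    def delete(self, key: bytes) -> None:
        self._data.pop(key, None)

    def write_batch(self, ops) -> None:
        """Apply ("put", key, value) / ("delete", key) operations all-or-nothing."""
        staged = dict(self._data)  # work on a copy so a bad op changes nothing
        for op in ops:
            if op[0] == "put":
                staged[op[1]] = op[2]
            elif op[0] == "delete":
                staged.pop(op[1], None)
            else:
                raise ValueError(f"unknown operation: {op[0]!r}")
        self._data = staged  # publish every change at once


store = KVStore()
store.put(b"user:1", b"\x00\x01 any bytes at all")
store.write_batch([("put", b"user:2", b"two"), ("delete", b"user:1")])
print(store.get(b"user:1"), store.get(b"user:2"))
```

The store never interprets keys or values, so any structure inside them (and any schema) is the application's responsibility.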
Data Model: Column Family (1)
- Columns are grouped into column families.
- Column families:
  - Are typically stored together
  - Can have different columns for each row
  - Can have duplicate items in any column
- No schema or type enforcement
  - All data are treated as byte strings
- Indexed by rows
  - Rows are grouped into tablets
  - No secondary indexes
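Data Model: Column Family (2)
[Figure: example row illustrating column families and timestamped versions; recoverable labels and values: Column Families, Timestamps, versions; columns No, Street, City, State, Zip; 1993; 394, East Riding Dr, Carlisle, MA, 01741]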
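One way to picture the column-family model is as nested maps: row key, then column family, then column, then timestamp. The sketch below assumes that nesting and a "newest version at or before a timestamp" read rule; the `ColumnFamilyTable` class, the row key, and the sample values are hypothetical and do not reproduce any specific system's API.

```python
class ColumnFamilyTable:
    """Toy table: row key -> family -> column -> {timestamp: value}."""

    def __init__(self, families):
        self.families = set(families)  # families are fixed; columns are not
        self.rows = {}

    def put(self, row, family, column, value: bytes, timestamp: int):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        cells = self.rows.setdefault(row, {}).setdefault(family, {})
        cells.setdefault(column, {})[timestamp] = value  # keep every version

    def get(self, row, family, column, timestamp=None):
        """Newest version at or before `timestamp`, or the newest overall."""
        versions = self.rows.get(row, {}).get(family, {}).get(column, {})
        candidates = [t for t in versions if timestamp is None or t <= timestamp]
        return versions[max(candidates)] if candidates else None


table = ColumnFamilyTable(["address"])
table.put("row1", "address", "city", b"Carlisle", timestamp=1)
table.put("row1", "address", "city", b"Boston", timestamp=2)
table.put("row2", "address", "zip", b"01741", timestamp=1)  # different columns per row
print(table.get("row1", "address", "city"))
print(table.get("row1", "address", "city", timestamp=1))
```

Values are plain byte strings and rows need not share columns, matching the "no schema or type enforcement" point; only the set of families is fixed up front.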
Data Model: Document
- Key/Value store where the value is a document with structure.
- Document store understands the structure of documents:
  - JSON, XML, BSON, PDF, Doc, ...
- Sometimes documents can have sub-documents.
- Key attribute of a document store is that it lets you search within documents as well as searching for documents.
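The difference between searching for documents and searching within them can be sketched as follows, assuming JSON-like documents held in memory. The `DocumentStore` class, the dotted-path convention in `find`, and the sample documents are hypothetical; real document stores use indexes and their own query languages instead of a full scan.

```python
import json


def _lookup(doc, path):
    """Walk a dotted path such as "address.city" through sub-documents."""
    node = doc
    for part in path.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


class DocumentStore:
    """Toy document store: key -> JSON document, searchable by field path."""

    def __init__(self):
        self._docs = {}

    def put(self, key, document: dict):
        # Round-trip through JSON so only JSON-representable structure is stored.
        self._docs[key] = json.loads(json.dumps(document))

    def get(self, key):
        return self._docs.get(key)  # searching *for* a document

    def find(self, path, value):
        """Searching *within* documents: keys whose `path` field equals `value`."""
        return sorted(k for k, doc in self._docs.items() if _lookup(doc, path) == value)


store = DocumentStore()
store.put("u1", {"name": "Ann", "address": {"city": "Carlisle", "state": "MA"}})
store.put("u2", {"name": "Bob", "address": {"city": "Boston", "state": "MA"}})
store.put("u3", {"name": "Cy"})  # documents need not share a shape
print(store.find("address.state", "MA"))
print(store.find("address.city", "Boston"))
```

A plain key/value store could only answer `get("u1")`; the `find` call is possible because the store understands the document's internal structure.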
NoSQL Consistency
- The CAP theorem (Eric Brewer): it is impossible for a distributed computer system to simultaneously provide:
  - Consistency: All nodes see the same data at the same time
  - Availability: Every request receives a response about whether it succeeded
  - Partition tolerance: The system continues to operate despite arbitrary message loss or failure of parts of the system
- Originally posed as a conjecture by Brewer, later precisely proven by Gilbert and Lynch.
  - It's an interesting history to read!
- Implications for NoSQL: Pick 2
  - Some systems pick CP
  - Other systems pick AP
  - Few systems pick CA
  - Why?
Forms of Consistency
- Consistent/Partitionable Systems
  - BigTable, HBase, Hypertable, MongoDB, Redis, MemcacheDB
- Available/Partitionable Systems
  - Cassandra, SimpleDB, CouchDB, Riak, TokyoCabinet, Dynamo
- Eventual Consistency
  - There exist multiple definitions. The most popular one is due to Vogels: "The storage system guarantees that if no new updates are made to the object eventually (after the inconsistency window closes) all accesses will return the last updated value."
  - You can be AP and eventually consistent.
- Variable: through careful configuration choose how consistent you want to be:
  - Riak: set R and W to determine consistency levels
  - Cassandra: read/write to one, quorum, all
  - Oracle NoSQL DB: Can be CP when setting ack policy for simple majority, else is AP.
  - Amazon Dynamo: Picking read/write quorums trades off performance and degree of inconsistency
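The R and W knobs can be illustrated with a toy replicated value: with N replicas, a read that contacts R replicas is guaranteed to overlap the latest successful write of W replicas whenever R + W > N. The `QuorumStore` class is a hypothetical sketch, not the behavior of Riak, Cassandra, or Dynamo; it assumes a single coordinator that assigns version numbers and ignores partial writes, conflicts, and repair.

```python
class QuorumStore:
    """Toy replicated value with N replicas, write quorum W, read quorum R."""

    def __init__(self, n=3, w=2, r=2):
        self.n, self.w, self.r = n, w, r
        self.replicas = [(0, None)] * n  # each replica holds (version, value)

    def write(self, value, reachable):
        """Write to the reachable replicas; succeed only if at least W exist."""
        if len(reachable) < self.w:
            return False
        # Simplification: one coordinator picks the next version number.
        version = max(v for v, _ in self.replicas) + 1
        for i in reachable:
            self.replicas[i] = (version, value)
        return True

    def read(self, reachable):
        """Contact R replicas and return the value carrying the highest version."""
        if len(reachable) < self.r:
            raise RuntimeError("not enough replicas for a read quorum")
        contacted = list(reachable)[: self.r]
        return max((self.replicas[i] for i in contacted), key=lambda vv: vv[0])[1]


strict = QuorumStore(n=3, w=2, r=2)   # R + W > N: read and write sets overlap
strict.write("v1", reachable=[0, 1])
print(strict.read(reachable=[1, 2]))  # replica 1 saw the write

loose = QuorumStore(n=3, w=2, r=1)    # R + W = N: a read can miss the write
loose.write("v1", reachable=[0, 1])
print(loose.read(reachable=[2]))      # replica 2 is stale
```

Lowering R or W makes operations cheaper and more available but widens the window in which a read can return stale data, which is the trade-off the slide describes.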
Cassandra Use Case: Netflix
Adrian Cockcroft
http://www.hpts.ws/sessions/GlobalNetflixHPTS.pdf
[Figure: Scalability; chart contents not recoverable]
HBase Use Case: Facebook Messages
Kannan Muthukkaruppan: http://www.hpts.ws/sessions/StorageInfraBehindMessages.pdf
Beyond NoSQL
- Google's Spanner: a globally distributed SQL database with atomic transactions, synchronous replication, and consistency!
- Data is sharded
- Replicated with Paxos state machines.
- Paxos group leaders use two phase commit to enforce atomicity.
- Multiple consistency models
  - Snapshot reads
  - Read-only transactions
  - ACID transactions
- Enabling technology: TrueTime
Wrapping Up
- When should you be considering NoSQL?
  - Scalability
  - Low latency
  - Redundancy
  - No ad hoc queries
  - Joins easily implementable in the application
  - Straightforward key structure
- A couple of sites I found pretty interesting:
  - List of NoSQL Databases: http://nosql-database.org
  - NoSQL Data Modeling: http://highlyscalable.wordpress.com/2012/03/01/nosql-data-modeling-techniques/
  - Recent Performance Comparison: http://www.networkworld.com/cgi-bin/mailto/x.cgi?pagetosend=/news/tech/2012/102212-nosql-263595.html