
Databases & Docker

A survival guide in 3 acts


Meanwhile...
Hey. Ops guy - I finished. Can you deploy?

Whoa - you finished what?

No worries - it's container images, can you just push it?

Container. Cool. Does it need a Database? Network? Storage?

Got you covered. I have a compose file - it's got everything you need...

(to self) Oh no, not again

Conclusion
Containers + Databases = Happy Developers
Ephemeral Containers + Databases = DevOps headaches
3 Things you must use to evaluate
Schema
Data Redundancy, Dynamic Self Discovery, Cluster formation & Self-healing
Orchestration & Scaling
Act 1
It's all about the
Schema
Meanwhile...
Ops Dude - here's the images for the Photo Sharing App

We have a single schema for all the microservices...

(to self) Oh no, not again

Service Architecture
[Diagram components: Web UI, API Gateway, Service Discovery, and the Authentication, Profile, Comment, Photo and Vote Services, each exposing a REST API]
Schema Considerations

Normalization

Relationships
Cardinality
Containment

Aggregation
Photo Schema
[Diagram: entity-relationship model]
users casts votes (0..*)
users creates photos (0..*)
users posts comments (0..*)
photos has_votes votes (0..*)
photos has_comments comments (0..*)
Photo Schema - Service Overlay
[Diagram: the Photo Schema relationships, with each entity assigned to its owning Service]
Profile Service, Authentication Service: users
Vote Service: votes
Photo Service: photos
Comment Service: comments
Use Case:
Most Recent Photos with # of Votes & Comments
select p.id, p.title,
       ifnull(votes.total_up, 0) as upvotes,
       ifnull(votes.total_down, 0) as downvotes,
       ifnull(comments.total_comments, 0) as comments
from photos p
left join
  (select p.id as id,
          sum(v.upvote) as total_up,
          sum(v.downvote) as total_down
   from photos p
   left join votes v on p.id = v.photo_id
   group by p.id) votes
  on p.id = votes.id
left join
  (select c.photo_id as photo_id,
          count(c.id) as total_comments
   from photos p
   left join comments c on p.id = c.photo_id
   group by c.photo_id) comments
  on p.id = comments.photo_id
order by p.createdAt DESC
limit 10;
OMG!
(annotations on the query above)
Joins across Domains / Services
Highly Coupled
Forces all related data to be kept together
Which Service should implement?
Votes? Comments? Profile?
Compound Services - Service Calls Replace Joins

[Sequence diagram between Photo Service, Comment Service, Profile Service and Vote Service]
1 /ipc/getLast10
1.1 /ipc/usernames returns { list usernames }
returns { list comments & who }
2 /ipc/getCount returns { total comments }
3 /ipc/usernames returns { list of posters }
4 /ipc/count returns { number of votes }
5 /ipc/voted returns { what you have voted for }
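
The idea of replacing the join with service calls can be sketched as below. This is a minimal illustration, not the deck's implementation: the three services are in-memory stand-ins, and every function name and data row is hypothetical. In a real deployment each function would be an inter-process call such as the /ipc endpoints listed above.

```python
# Hypothetical in-memory stand-ins for the data owned by each Service.
PHOTOS = [
    {"id": 1, "title": "Sunset", "created_at": 100},
    {"id": 2, "title": "Harbour", "created_at": 200},
]
COMMENTS = [{"photo_id": 1, "text": "Nice"}, {"photo_id": 1, "text": "Wow"}]
VOTES = [
    {"photo_id": 2, "upvote": 1, "downvote": 0},
    {"photo_id": 2, "upvote": 0, "downvote": 1},
    {"photo_id": 1, "upvote": 1, "downvote": 0},
]


def photo_service_last(n):
    """Photo Service: the n most recent photos, newest first."""
    return sorted(PHOTOS, key=lambda p: p["created_at"], reverse=True)[:n]


def comment_service_count(photo_id):
    """Comment Service: total comments for one photo."""
    return sum(1 for c in COMMENTS if c["photo_id"] == photo_id)


def vote_service_count(photo_id):
    """Vote Service: up and down vote totals for one photo."""
    mine = [v for v in VOTES if v["photo_id"] == photo_id]
    return {
        "upvotes": sum(v["upvote"] for v in mine),
        "downvotes": sum(v["downvote"] for v in mine),
    }


def recent_photos_with_counts(n=10):
    """Compound call: one request per Service replaces the SQL joins."""
    result = []
    for photo in photo_service_last(n):
        votes = vote_service_count(photo["id"])
        result.append({
            "id": photo["id"],
            "title": photo["title"],
            "upvotes": votes["upvotes"],
            "downvotes": votes["downvotes"],
            "comments": comment_service_count(photo["id"]),
        })
    return result
```

Note the cost this makes visible: one call to the Photo Service fans out into two further calls per photo, which is the coupling and latency trade-off the slide is pointing at.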


Alternative - Events / Notifications

Aggregation Service
Each Service publishes events
Aggregation Service ingests those events
computes required metrics
publishes metrics

Where to call?
From each Service?
From a coordination point (e.g. API Gateway)?

Command Query Responsibility Segregation

http://microservices.io/patterns/data/database-per-service.html
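
A minimal sketch of the event-driven alternative, assuming events arrive as plain dictionaries with a "type" and a "photo_id". The event names, the class and the callback interface are all hypothetical; a real system would sit behind a message broker rather than direct method calls.

```python
from collections import defaultdict

# Hypothetical event types mapped to the metric each one increments.
EVENT_TO_METRIC = {
    "vote.up": "upvotes",
    "vote.down": "downvotes",
    "comment.added": "comments",
}


class AggregationService:
    """Ingests events, computes per-photo metrics, publishes the new totals."""

    def __init__(self):
        self.metrics = defaultdict(
            lambda: {"upvotes": 0, "downvotes": 0, "comments": 0}
        )
        self.subscribers = []

    def subscribe(self, callback):
        """Register a callback(photo_id, totals) for published metrics."""
        self.subscribers.append(callback)

    def ingest(self, event):
        """Update totals for one event; unknown event types are ignored."""
        metric = EVENT_TO_METRIC.get(event["type"])
        if metric is None:
            return
        totals = self.metrics[event["photo_id"]]
        totals[metric] += 1
        for callback in self.subscribers:
            # Publish a copy so subscribers cannot mutate internal state.
            callback(event["photo_id"], dict(totals))
```

With this shape the read side ("most recent photos with counts") queries precomputed totals instead of joining across services, at the price of the totals being eventually consistent.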
Aggregation Service - Publishing Events to Aggregate

[Sequence diagram between Web Client, API Gateway, Vote Service and Aggr Service]
Each Service forwards interesting events
1 /vote
1.1 /vote
1.2 /vote
API Gateway forwards events
2 /vote
2.1 /vote
2.2 /vote
Data Integrity - Service boundaries

Relationships
Referential integrity
Re-parenting (foreign key updates)

De-normalization
Aggregates
Copies of data on other tables
Use Case: Photo & Comments

select p.id, p.title, p.photo,
       c.comment
from photos p
left join comments c
  on p.id = c.photo_id
where p.id = ?

Join Across Domains
Not permitted
Each Service must execute part of the query
Use Case: Photo & Comments

The joined query above, split so that each Service runs its own part:

Photo Service
select p.id, p.title, p.photo
from photos p
where p.id = ?

Comment Service
select c.comment
from comments c
where c.photo_id = ?

Join Across Domains
Not permitted
Each Service must execute part of the query
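
A small sketch of what "each Service executes part of the query" can look like, with the join done in application code. It uses two separate in-memory SQLite databases as stand-ins for the Photo and Comment Service databases (the deck itself uses MariaDB); the table contents and function names are hypothetical.

```python
import sqlite3


def make_photo_db():
    """Photo Service database (SQLite stand-in)."""
    db = sqlite3.connect(":memory:")
    db.execute("create table photos (id integer primary key, title text, photo text)")
    db.execute("insert into photos values (1, 'Sunset', 'sunset.jpg')")
    return db


def make_comment_db():
    """Comment Service database, deliberately separate from the photo database."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "create table comments (id integer primary key, photo_id integer, comment text)"
    )
    db.executemany(
        "insert into comments (photo_id, comment) values (?, ?)",
        [(1, "Nice"), (1, "Wow"), (2, "Other")],
    )
    return db


def get_photo(photo_db, photo_id):
    """Photo Service part of the query."""
    row = photo_db.execute(
        "select p.id, p.title, p.photo from photos p where p.id = ?", (photo_id,)
    ).fetchone()
    if row is None:
        return None
    return {"id": row[0], "title": row[1], "photo": row[2]}


def get_comments(comment_db, photo_id):
    """Comment Service part of the query."""
    rows = comment_db.execute(
        "select c.comment from comments c where c.photo_id = ? order by c.id",
        (photo_id,),
    ).fetchall()
    return [r[0] for r in rows]


def photo_with_comments(photo_db, comment_db, photo_id):
    """Application-level join across the two Services."""
    photo = get_photo(photo_db, photo_id)
    if photo is None:
        return None
    photo["comments"] = get_comments(comment_db, photo_id)
    return photo
```

Nothing stops the Comment database from holding comments for a photo that does not exist (photo_id 2 above), which is the referential integrity problem the following slides discuss.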
Lifelines of Data

[Sequence diagram between Photo Service, Comment Service, Profile Service and Vote Service]
1 /ipc/getLast10
1.1 /ipc/usernames returns { list usernames }
X Photo is removed as the Comment Service processes the request
returns { comments }
Referential Integrity

Insert & Update
  Update vs Delete/Insert
    Change a Comment vs Delete & Re-Insert Comment
  Immediate vs Eventual Consistency
    Can comments be viewed after the photo is removed?
  Begin vs End consistency
    Vote is added, but Photo removed during operation

Delete
  Hard vs Soft
    Mark as deleted vs really delete
  Cascade operations
    Removal of a Photo means removing Votes & Comments

Where Relationships cross Service boundaries, the Services must maintain integrity of the data
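
To make the hard vs soft delete choice concrete, here is a minimal sketch of a comment store that supports both when a photo is removed. The class and its method names are hypothetical; it only illustrates the difference in what remains afterwards.

```python
class CommentStore:
    """In-memory stand-in for the Comment Service's data."""

    def __init__(self):
        self.rows = []  # each row: id, photo_id, text, deleted

    def add(self, photo_id, text):
        row = {"id": len(self.rows) + 1, "photo_id": photo_id,
               "text": text, "deleted": False}
        self.rows.append(row)
        return row["id"]

    def soft_delete_for_photo(self, photo_id):
        """Soft delete: mark rows as deleted but keep them for audit or undo."""
        count = 0
        for row in self.rows:
            if row["photo_id"] == photo_id and not row["deleted"]:
                row["deleted"] = True
                count += 1
        return count

    def hard_delete_for_photo(self, photo_id):
        """Hard delete: the rows are really removed."""
        before = len(self.rows)
        self.rows = [r for r in self.rows if r["photo_id"] != photo_id]
        return before - len(self.rows)

    def visible(self, photo_id):
        """Comments a reader should still see for this photo."""
        return [r["text"] for r in self.rows
                if r["photo_id"] == photo_id and not r["deleted"]]
```

Either method would be triggered by the Photo Service announcing a removal, since the cascade can no longer be expressed as a foreign key inside one database.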
Transactions
Use Case - Delete Photo plus Comments & Votes
START TRANSACTION;

delete from photos
where id = ?;

delete from comments
where photo_id = ?;

delete from votes
where photo_id = ?;

COMMIT;
Transactions
Use Case - Delete Photo plus Comments & Votes
The three deletes in the transaction above span Domain boundaries
Transactions - Approaches

Validate Service Boundaries


Should Photo, Comments and Votes be a single Service?

Denormalize
Create a compound object (e.g. JSON) within a single service of denormalized data

Queue / Event / State Machine


Maintain an (ordered) list of idempotent steps required to process the transaction
Coordinate the processing across the Services

Don't Forget About Eventual Consistency
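
The Queue / Event / State Machine approach can be sketched as an ordered list of idempotent steps that is safe to re-run after a failure. This is an illustration under stated assumptions, not a production coordinator: the workflow class, the step functions and the simulated outage are all hypothetical, and the completed-step list would need to be persisted in a real system.

```python
# Hypothetical in-memory stand-ins for the data owned by each Service.
votes = {1: ["up", "down"], 2: ["up"]}
comments = {1: ["Nice"]}
photos = {1: "Sunset", 2: "Harbour"}
failures = {"comments": 1}  # simulate one transient Comment Service outage


def delete_votes(photo_id):
    votes.pop(photo_id, None)  # idempotent: deleting twice is harmless


def delete_comments(photo_id):
    if failures["comments"] > 0:
        failures["comments"] -= 1
        raise ConnectionError("Comment Service unavailable")
    comments.pop(photo_id, None)


def delete_photo(photo_id):
    photos.pop(photo_id, None)


class DeletePhotoWorkflow:
    """Runs ordered steps and remembers which ones already completed."""

    def __init__(self, photo_id, steps):
        self.photo_id = photo_id
        self.steps = list(steps)  # ordered (name, callable) pairs
        self.completed = []

    def run(self):
        for name, action in self.steps:
            if name in self.completed:
                continue  # finished in an earlier attempt
            action(self.photo_id)  # may raise; the caller retries run()
            self.completed.append(name)
        return self.completed
```

Dependants are deleted before the photo itself, so a failure part way through leaves a photo with missing votes rather than orphaned comments. Between attempts the data is inconsistent, which is the eventual consistency the slide warns about.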


Act 2
It's all about the
Infrastructure
Meanwhile...
Ok - Schema fixed up and deployed

AWESOME! When do we go live?

How many Databases do you need and in what configuration?

Errr like ONE?

(to self) Oh no, not again


Service Architecture
[Diagram components: Web UI, API Gateway, Service Discovery, and the Authentication, Profile, Comment, Photo and Vote Services, each exposing a REST API]
Database Instance Considerations

How many
Schema / Databases
Instances
Clusters?

High Availability

Disaster Recovery

Mean Time to Recovery / Recovery Time Objective


Databases: Multi-tenant or Per Service?

Multi-tenant
Pros
Single infrastructure to manage
Single upgrade, backup etc. across the landscape
Cons
Failure modes potentially catastrophic to all services
QOS between Services (e.g. noisy neighbour)
Maintenance can affect all services

Per Service
Pros
Each Service has independent DB Cluster
HA, DR etc. on a service-by-service basis
Isolation Between Services
Maintenance on a service-by-service basis
Cons
Duplicated processes across the landscape
Orders of magnitude more infrastructure to manage (and to go wrong)
Databases: Recommendations

Namespace each Service (e.g. separate Database name)

Keep High Value services on independent instances/clusters


Where service SLA and minimal variance is critical
Right size each Cluster for the needs of each Service
Enable HA, DR etc. as needed for the SLA of the Service

Cluster Lower Value services on a shared platform


Service SLA and variance is not critical
Consolidate lower value Services onto a multi-tenant Database Cluster
High Availability & Disaster Recovery

No Single Point Of Failure (SPOF)


Multiple Database Nodes
Spread Across Racks & Data Centers

Automatic Failover
On node failure, loss of connectivity etc

Automatic Cluster Formation


On container entry and exit
MariaDB Portfolio - HA, DR & Clustering

MARIADB SERVER
Enterprise-grade secure, highly available and scalable relational database with a modern, extensible architecture

MARIADB MAXSCALE
Next-generation database proxy that manages security, scalability and high availability in scale-out deployments

MARIADB CLUSTER
Multi-Master, synchronous replication - improves availability and scales reads and writes
MariaDB Cluster + R/W split routing

[Diagram: MaxScale + MariaDB Cluster]
Use Case

MariaDB Cluster
Multi-master
Synchronous Write
Consensus elections

MaxScale
Read/Write distribution
Automatic switch over on Master failure

Client
Connection to MaxScale, not to entire cluster
Service Discovery - How to mesh nodes?

DNS RESOLUTION
Docker assigns VIP to Service, each Task has
own IP
nslookup, dig, getent etc.

3rd PARTY
consul, etcd, zookeeper etc.

DOCKER EVENTS
https://docs.docker.com/engine/reference/api/docker_remote_api/
Interlock - https://github.com/ehazlett/interlock
Swarm Event Endpoint PR #26331
Cluster Formation - DNS Example
$ docker exec fb1076a6d716 dig tasks.mariadb_cluster
...
;; ANSWER SECTION:
tasks.mariadb_cluster. 600 IN A 10.0.0.11
tasks.mariadb_cluster. 600 IN A 10.0.0.10
tasks.mariadb_cluster. 600 IN A 10.0.0.5

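$ cat docker-entrypoint.sh
...
# Quote the variable: an unquoted empty $CLUSTER_NAME makes [ -n ] always true
if [ -n "$CLUSTER_NAME" ]; then
  # Collect the A records for tasks.<service> as a comma-separated list
  service_nodes=`dig tasks.$CLUSTER_NAME | \
    awk "/tasks.$CLUSTER_NAME./ {print \\$5}" | \
    awk 'NF'|tr '\n' ','|tr -d ' '|sed 's/,$//'`
  IFS=',' read -r -a cluster_nodes <<< "$service_nodes"
  # If any peers exist, join as a node using the first address returned
  if [ ${#cluster_nodes[@]} -gt 0 ]; then
    mode="node"
    master_node=${cluster_nodes[0]}
  fi
fi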
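
The same DNS-based discovery can be sketched in Python. This is an illustration, not part of the demo images: the function names are hypothetical, the resolver is injectable so the logic can be exercised without a Swarm, and the addresses are sorted for determinism (the shell script above simply takes the first record DNS returns). What happens when no peers are found is elided in the original script, so the sketch returns None for both values in that case.

```python
import socket


def resolve_tasks(cluster_name, resolver=socket.getaddrinfo):
    """Return the de-duplicated task IPs behind tasks.<cluster_name>."""
    try:
        infos = resolver("tasks." + cluster_name, None,
                         socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        return []  # name does not resolve yet: no peers are running
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})


def choose_mode(cluster_nodes):
    """Mirror the entrypoint: join as a node when peers already exist."""
    if cluster_nodes:
        return "node", cluster_nodes[0]
    return None, None  # assumption: the no-peer branch is not shown in the deck
```

Inside a container on the overlay network, `choose_mode(resolve_tasks("mariadb_cluster"))` would use the real resolver.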
Act 3
It's all about the
Orchestration
Meanwhile...
Deployed Databases that meet the corporate standard for HA & DR

Yeah - like whatever. Can we go live?

Just have to ensure we can scale when we need to.

Docker just scales my microservices right?

(to self) Oh no, not again


Roll The Application Behind HAProxy

[Diagram: HAProxy, app Virtual IP, app1, Virtual IP, MaxScale 1 and database node 1, on a scale from Development to Production]
Scale the Application Tier

[Diagram: HAProxy, app Virtual IP, app1 to appN, Virtual IP, MaxScale 1 and MaxScale 2, and database nodes 1 to 3, on a scale from Development to Production]
Docker Networking
[Diagram: Docker Hosts swarm-0 to swarm-3 running HAProxy, App, MaxScale and MariaDB containers, with container endpoints attached to the front Network and the back Overlay Network]


Docker Networking
$ docker network create -d overlay --attachable --opt encrypted myapp_back

$ cat docker-compose.stack.yml
...
networks:
  front:
  back:
    external:
      name: myapp_back
haproxy & web services
services:
  haproxy:
    image: dockercloud/haproxy
    networks:
      - front
      - back
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    ports:
      - 80:80
    deploy:
      placement:
        constraints: [node.role == manager]
  web:
    image: alvinr/demo-webapp-vote:mariadb
    environment:
      SERVICE_PORTS: "5000"
      VIRTUAL_HOST: "prod.myapp.com"
      APP_MARIADB_HOST: "maxscale"
      APP_USER: "app"
      APP_PASSWORD_FILE: "/run/secrets/app_password"
      APP_DATABASE: "test"
    networks:
      - back
    deploy:
      placement:
        constraints: [node.role != manager]
    secrets:
      - app_password
OMG! The developer hardcoded passwords!
services:
  web:
    build: .
    ports:
      - "5000:5000"
    links:
      - mariadb
    hostname: dev.myapp.com
    environment:
      APP_MARIADB_HOST: dev_mariadb_1
      APP_PASSWORD: foo
  mariadb:
    image: mariadb:10.1
    environment:
      MYSQL_ROOT_PASSWORD: foo
Docker secrets
$ cat ./app_password.txt
appfoo

$ cat docker-compose.stack.yml
...
secrets:
  app_password:
    file: ./app_password.txt
...
  web:
    image: alvinr/demo-webapp-vote:mariadb
    environment:
      APP_PASSWORD_FILE: "/run/secrets/app_password"

$ cat app.py
secrets_fn = os.environ.get("APP_PASSWORD_FILE", "")
if os.path.isfile(secrets_fn):
    with open(secrets_fn, 'r') as myfile:
        passwd = myfile.read().replace('\n', '')

db = mariadb.connect(
    host=app_host,
    user=app_user,
    passwd=passwd,
    db=app_db)
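
A slightly more defensive variant of the secret-reading step, as a sketch. The helper name `read_secret` and its `default` parameter are hypothetical; the point is that the password variable is always defined, even when the environment variable is unset or the secret file is missing.

```python
import os


def read_secret(env_var, default=None):
    """Read a secret from the file whose path is stored in env_var.

    Returns default when the variable is unset or the file does not exist.
    """
    path = os.environ.get(env_var, "")
    if path and os.path.isfile(path):
        with open(path, "r") as handle:
            # Secret files commonly end with a newline; strip only that.
            return handle.read().rstrip("\n")
    return default
```

In the app this would be called as `passwd = read_secret("APP_PASSWORD_FILE")`, after which the code can fail fast with a clear message if `passwd` is None instead of hitting an undefined name at connect time.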
Demo
Deploying & Scaling
Database Tier
Container Placement
[Diagram: Docker Hosts swarm-0 to swarm-3. HAProxy and App containers attach to the front Network; one MaxScale and two MariaDB containers attach to the back Overlay Network]


Container Placement - SPOFs
[Diagram: the same hosts, containers and networks as the previous Container Placement diagram, viewed for single points of failure]


Container Placement
mariadb_cluster:
  image: alvinr/mariadb-galera-swarm
  ...
  labels:
    com.mariadb.cluster: "myapp-prod-cluster"
  ...
  deploy:
    replicas: 1
    placement:
      constraints: [engine.labels.com.mariadb.cluster != myapp-prod-cluster]
Restarting on Failure

maxscale:
  image: alvinr/maxscale-swarm
  ...
  labels:
    com.mariadb.cluster: "myapp-maxscale"
  networks:
    - back
  deploy:
    replicas: 1
    restart_policy:
      condition: on-failure
      delay: 10s
    placement:
      constraints: [engine.labels.com.mariadb.cluster != myapp-maxscale]
  secrets:
    - app_password
Encore
Considerations
& Conclusions
Storage: Inside or Outside the Container?

[Diagram: Docker Daemon and Container with storage Inside versus Outside. Outside mounts /mnt/xx:/var/lib/mysql from Local Disk (e.g. /dev/xvdb, SSD / NVMe) or a Networked Volume (e.g. EBS)]

Inside
Encapsulation of Concerns

Outside
Separation of Concerns
Storage features (e.g. Snapshots)
3rd Party options
NetApp, Google Compute Engine, Rancher Convoy, Flocker, Portworx, Nutanix
Storage: Data Container?

[Diagram: two Hosts, each with a Docker Daemon and a Container; the container mounts storage with --volumes-from {container name}]

Inside
Managed like other containers
Special rule for Destruction
TBD: Performance
And...
Official images
Image verification (trusted Images)
Swarm locking
AppArmor / Seccomp profiles
Monitoring
Healthchecks
Rolling Upgrades
Summary

Schema
Foreign Keys, Joins, Aggregation and Denormalization - they will kill you
Service boundaries may impact your Availability and make deployment complex

Infrastructure
Plan for the worst case
Scale for the best case

Orchestration
Dev approved images to build upon
Ops inject policy
Thanks and Q&A
Code
https://github.com/alvinr/docker-demo/tree/master/mariadb/vote

Docker Images
https://hub.docker.com/_/mariadb/

MariaDB & Docker deployment guide


https://mariadb.com/kb/en/mariadb/installing-and-using-mariadb-via-docker/

Contact me!
alvin@mariadb.com
@jonnyeight
