History
Originally developed by Sharon Rosner
March 2007: 0.0.1 - First release
January 2008: 1.0 - Split into sequel, sequel_core, and sequel_model gems
February 2008: 1.2 - I started using Sequel
March 2008: 1.3 - Model associations; I became developer, then maintainer of Sequel
April 2008: 1.4 - Eager loading; sequel and sequel_model gems merged
April 2008: 1.5 - Dataset graphing; much deprecation
History (2)
May 2008: 2.0 - Expression filters; deprecated method removal; massive code cleanup and documentation updates
July 2008: 2.3 - JRuby/Ruby 1.9 support; sequel_core and sequel gems merged
August 2008: 2.4 - Bound variables/prepared statements; master/slave database support and sharding
Since: Many features and bug fixes
sequel_core vs sequel_model
NOT *-core vs. *-more
sequel_core:
Dataset-centric, returns plain hashes
Basically a ruby DSL for SQL
Good for aggregate reporting, dealing with sets of objects
Also houses the adapters, core extensions, connection pool, migrations, and some utilities
sequel_model:
Object-centric, returns model objects
An ORM built on top of sequel_core
Good for dealing with individual objects
Also houses the string inflection methods
Model classes proxy many methods to their underlying dataset, so you get the benefits of sequel_core when using sequel_model
Database Support
13 supported adapters: ADO, DataObjects, DB2, DBI, Firebird, Informix, JDBC, MySQL, ODBC, OpenBase, Oracle, PostgreSQL and SQLite3
Some adapters support multiple databases: DataObjects, JDBC
Some databases are supported by multiple adapters: MySQL, PostgreSQL, SQLite
PostgreSQL adapter can use pg, postgres, or postgres-pr driver
Adding adapters
Adding additional adapters is pretty easy
Need to define:
Database#connect method that returns an adapter-specific connection object
Database#disconnect_connection method that disconnects adapter-specific connection object
Database#execute method that runs SQL against the database
Database#dataset method that returns a dataset subclass instance for the database
Dataset#fetch_rows method that yields hashes with symbol keys
Potentially, that's it
About 1/3 of Sequel code is in the adapters
Sequel::Database
uri = 'postgres://user:pass@host/database'
DB = Sequel.connect(uri)
DB = Sequel.postgres(database, :host => host)
DB = Sequel.sqlite # Memory database
Sequel::Database (2)
# Set defaults for future datasets
DB.quote_identifiers = true
# Create Datasets
dataset = DB[:attendees]
# Setup SQL Loggers
DB.loggers << Logger.new($stdout)
# Handle transactions: block required, no way
# to leave a transaction open indefinitely
DB.transaction{}
# Execute SQL directly
rows = DB['SELECT * FROM ...'].all
DB["INSERT ..."].insert
DB["DELETE ..."].delete
DB << "SET ..."
Connection Pooling
Sequel has thread-safe connection pooling
No need for manual cleanup
Only way to get a connection is through a block
Block ensures the connection is returned to the pool before it exits
Makes it impossible to leak connections
Connection not checked out until final SQL string is ready
Connection returned as soon as iteration of results is finished
This allows for much better concurrency
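The block-only checkout pattern described above can be sketched in plain Ruby. This is a toy illustration (a hypothetical TinyPool class, not Sequel's actual pool implementation): the ensure clause guarantees the connection goes back to the pool even when the block raises, which is what makes leaking impossible.

```ruby
# Toy sketch of block-only connection checkout (hypothetical
# TinyPool; Sequel's real connection pool is more sophisticated).
class TinyPool
  def initialize(size, &make_conn)
    @queue = Queue.new # thread-safe FIFO of idle connections
    size.times { @queue << make_conn.call }
  end

  # The only way to get a connection: yield it to a block, and
  # ensure it is returned to the pool even if the block raises.
  def hold
    conn = @queue.pop
    begin
      yield conn
    ensure
      @queue << conn
    end
  end
end

pool = TinyPool.new(2) { Object.new }
pool.hold { |c| c.inspect }  # normal use
begin
  pool.hold { raise 'boom' } # connection is still returned
rescue RuntimeError
end
```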
Sequel::Dataset
Represents an SQL query, or more generally, an abstract set of rows/objects
Most methods return modified copies, functional style
Don't need to worry about the order of methods, usually
Build your SQL query by chaining methods
DB[:table].limit(5, 2).order(:column4).
  select(:column1, :column2).
  filter(:column3 => 0..100).all
Fetching Rows
#each iterates over the returned rows
Enumerable is included
Rows are returned as hashes with symbol keys
Can set an arbitrary proc to call with the hash before yielding (how models are implemented)
#all returns all rows as an array of hashes
No caching is done
If you don't want two identical queries for the same data, store the results of #all in a variable, and use that variable later
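The "arbitrary proc called with each hash" idea can be sketched in plain Ruby (hypothetical names below; Sequel's models install such a proc on their dataset so rows come back as model instances instead of hashes):

```ruby
# Rows as hashes with symbol keys, as described above
rows = [{:id => 1, :name => 'a'}, {:id => 2, :name => 'b'}]

# A proc that wraps each hash in an object before yielding it
Row = Struct.new(:id, :name)
row_proc = proc { |h| Row.new(h[:id], h[:name]) }

objects = rows.map { |h| row_proc.call(h) }
objects.first.name # => "a"
```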
Dataset Filtering
ds = DB[:attendees]
# Strings
ds.filter('n = 1') # n = 1
ds.filter('n > ?', 'M') # n > 'M'
# Hashes
ds.filter(:n => 1)      # n = 1
ds.filter(:n => nil)    # n IS NULL
ds.filter(:fn => 'ln')  # fn = 'ln'
ds.filter(:fn => :ln)   # fn = ln
ds.filter(:n => [1, 2]) # n IN (1, 2)
ds.filter(:n => 1..2)   # n >= 1 AND n <= 2
Searching
ds.filter(:p.like(:q)) # p LIKE q
ds.filter(:p.like('q', /r/)) # p LIKE 'q' OR p ~ 'r'
ds.filter([:p, :q].sql_string_join.like('Test'))
# (p || q) LIKE 'Test'
Identifier Symbols
As a shortcut, Sequel allows you to use plain symbols to signify qualified and/or aliased columns:
:table__column => table.column
:column___alias => column AS alias
:table__column___alias => table.column AS alias
You can use methods to do the same thing, if you want:
:column.qualify(:table)
:column.as(:alias)
:column.qualify(:table).as(:alias)
Can also be used for schemas (:schema__table)
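A rough pure-Ruby sketch of the convention (a hypothetical split_symbol helper; Sequel's real parsing happens inside its SQL literalization code):

```ruby
# Split an identifier symbol into [table, column, alias] parts:
# triple underscore separates the alias, double underscore the qualifier.
def split_symbol(sym)
  s = sym.to_s
  column, aliaz = s.split('___', 2)
  table, column = column.split('__', 2) if column.include?('__')
  [table, column, aliaz]
end

split_symbol(:table__column___alias) # => ["table", "column", "alias"]
split_symbol(:column___alias)        # => [nil, "column", "alias"]
split_symbol(:table__column)         # => ["table", "column", nil]
```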
Dataset Joining
ds = DB[:attendees].
  join_table(:inner, :events, :id => :event_id)
# FROM attendees INNER JOIN events
# ON (events.id = attendees.event_id)
ds = ds.left_outer_join(:caterers, \
  :id => :events__caterer_id)
# ... LEFT OUTER JOIN caterers ON
# (caterers.id = events.caterer_id)
Join Clobbering
# attendees: id, name, event_id
# events: id, name
ds = DB[:attendees].join(:events, :id =>:event_id)
ds.all
# => [{:id=>event.id, :name=>event.name}, ...]
Like SQL, returns all columns from both tables unless you choose which columns to select
SQL generally returns rows as arrays, so it's possible to differentiate columns that have the same name but are in separate tables
Sequel returns rows as hashes, so identical names will clobber each other (last one wins)
This makes a join problematic if the tables share column names
You can use Dataset#select to restrict the columns returned, and/or to alias them to eliminate clobbering
But that's ugly and cumbersome
There's got to be a better way!
Dataset Graphing
# attendees: id, name, event_id
# events: id, name
ds = DB[:attendees].graph(:events, :id => :event_id)
ds.all
# => [{:attendees=>{...}, :events=>{...}}, ...]
Sequel::Model
Allows easily adding methods to datasets and returned objects
Each model class is associated with a single dataset
The model class object proxies many methods to its dataset
Model instances represent individual rows in the dataset
The basics are similar to AR and DM
Model Associations
Player.many_to_one :team
Team.one_to_many :players
Player.first.team
# SELECT * FROM teams WHERE id = #{player.team_id}
Team.first.players
# SELECT * FROM players WHERE team_id = #{team.id}
Attendee.many_to_many :events
Attendee.first.events
# SELECT events.* FROM events INNER JOIN
# attendee_events ON attendee_events.event_id=events.id
# AND attendee_events.attendee_id = #{attendee.id}
Team.one_to_many :players
team.players # Array of Players
team.players_dataset # Sequel::Dataset for this team's players
team.add_player(player)
team.remove_player(player)
team.remove_all_players
Association Options
There are lots of options: 27 currently, not counting ones specific to certain associations
Most are only useful in fairly rare circumstances, but if you have that circumstance...
Common ones: :select, :order, :limit (also used for offsets), :conditions, :class (takes class or name), :read_only
Association datasets can be extended with custom modules:
module SameName
  def self_titled
    first(:name => model_object.name)
  end
end
Artist.one_to_many :albums, :extend => SameName
Artist.first.albums_dataset.self_titled
Eager Loading
Two separate methods: eager and eager_graph
eager loads each table separately
eager_graph does joins
Argument structure similar to AR's :include
Can combine the two, to a certain extent:
Works fine: eager(:blah).eager_graph(:bars, :foos)
Works fine: eager(:blah=>:bars).eager_graph(:foos=>:bazs)
Problematic: eager(:blah=>:bars).eager_graph(:blah=>:foos)
Possibly fixable, but no one has complained...
Why two methods?:
User choice (performance, explicitness)
Sequel does not parse SQL!
Advanced Associations
Sequel allows you full control over associations via the :dataset option
# AR has_many :through=>has_many
# Firm one_to_many Clients one_to_many Invoices
Firm.one_to_many :invoices, :dataset => proc{
  Invoice.eager_graph(:client).
    filter(:client__firm_id => pk)}
Validations
Philosophy: Only useful to display nice error messages to the user; actual data integrity should be handled by the database
9 standard validations available: acceptance_of, confirmation_of, format_of, inclusion_of, length_of, not_string, numericality_of, presence_of, uniqueness_of
Easy shorthand via validates:
validates do
format_of :a, :with=>/\A.*@.*\..*\z/
uniqueness_of :a_id, :b_id # both unique
uniqueness_of [:a_id, :b_id] # combination
end
Custom Validations
validates_each: Backbone of defining the standard validations and any custom ones
Requires a block (called with the object, attribute(s), and attribute value(s)); accepts multiple attribute arguments and a hash of options
Built-in support for :if, :allow_missing, :allow_nil, and :allow_blank options
Can use arrays of attributes in addition to individual attributes
validates_each(:amount, :if => :confirmed?) do |o, a, v|
  o.errors[a] << "is less than 100" if v < 100
end
Hooks
Philosophy: Should not be used for data integrity, use a database trigger for that
Called before or after certain model actions: initialize (after only), save, create, update, destroy, and validation
Arbitrary hook types can be defined via add_hook_type, useful for plugins (all standard hooks are implemented using it)
Can use a symbol specifying an instance method, or a proc
Class methods add hooks, instance methods call them
Returning false from any hook cancels the rest of the hook chain
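The cancel-on-false chain semantics can be sketched in plain Ruby (a hypothetical HookChain class, not Sequel's implementation):

```ruby
# Toy sketch of a hook chain where returning false from any hook
# cancels the rest of the chain (hypothetical, not Sequel's code).
class HookChain
  def initialize
    @hooks = Hash.new { |h, k| h[k] = [] }
  end

  def add_hook(type, &block)
    @hooks[type] << block
  end

  # Run hooks in order; stop and report failure if one returns false.
  def run(type, obj)
    @hooks[type].each { |h| return false if h.call(obj) == false }
    true
  end
end

hooks = HookChain.new
calls = []
hooks.add_hook(:before_save) { |o| calls << :first }
hooks.add_hook(:before_save) { |o| calls << :second; false }
hooks.add_hook(:before_save) { |o| calls << :never }
hooks.run(:before_save, nil) # => false; the :never hook is not reached
```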
Dataset Pagination
Built in support via Dataset#paginate and Dataset#each_page
Dataset#paginate applies a limit and offset and returns a dataset with helper methods such as next_page and prev_page
Useful for building a search engine or website showing a defined number of records per page
Dataset#each_page yields paginated datasets of a given length starting with page 1
Useful for processing all records, but only loading a given number at a time due to memory constraints
You should probably run #each_page inside of a transaction unless you know what you are doing
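The arithmetic behind paginate can be sketched in plain Ruby (hypothetical helpers below; Sequel applies the limit/offset to the dataset itself rather than exposing functions like these):

```ruby
# Limit and offset for a given 1-based page number and page size
def page_limit_offset(page_no, page_size)
  [page_size, (page_no - 1) * page_size]
end

# Total number of pages for a record count (integer ceiling division)
def page_count(record_count, page_size)
  (record_count + page_size - 1) / page_size
end

page_limit_offset(3, 25) # => [25, 50]
page_count(51, 25)       # => 3
```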
Model Caching
Built in support via Model.set_cache
Caches to any object with the following API:
#set(key, object, seconds): Store object with key for amount of seconds
#get(key): Return object with matching key, or nil if there is no object
This API is used by Ruby-MemCache, so it works with that by default
The cache is only used when Model.[] is called with the primary key
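A minimal in-memory object satisfying that #set/#get API might look like this (a hypothetical TinyCache stand-in for Ruby-MemCache; expiry is checked lazily on read, with no background eviction):

```ruby
# Minimal object implementing the #set/#get cache API above.
class TinyCache
  def initialize
    @store = {}
  end

  # Store object with key for the given number of seconds.
  def set(key, object, seconds)
    @store[key] = [object, Time.now + seconds]
  end

  # Return the object with matching key, or nil if missing or expired.
  def get(key)
    object, expires_at = @store[key]
    return nil if object.nil? || Time.now > expires_at
    object
  end
end

cache = TinyCache.new
cache.set(:a, 'row', 60)
cache.get(:a) # => "row"
cache.get(:b) # => nil
```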
Schema Definition
DB.create_table(:attendees) do
  primary_key :id                 # integer/serial/identity
  String :name                    # varchar(255)/text
  column :registered_at, DateTime # timestamp
  money :price                    # money
  foreign_key :event_id, :events
  index :name, :unique => true
  index [:name, :event_id]
  constraint :b, ~{:price => 0}   # price != 0
  check{|o| o.registered_at > '2008-12-31'}
  primary_key [:name, :price]     # composite pk
  foreign_key [:name, :price], :blah, \
    :key => [:att_name, :att_price] # composite fk
end
Schema Modification
DB.alter_table(:attendees) do
  add_column :confirmed, :boolean, :null => false
  drop_constraint :b
  add_constraint :c do |o| # price != 0 if confirmed
    {:confirmed => ~{:price => 0}}.case(true)
  end
  add_foreign_key :foo_id, :foos
  add_primary_key :id
  rename_column :id, :attendee_id
  drop_column :id
  set_column_default :name, 'Jeremy'
  set_column_type :price, Numeric
  set_column_allow_null :confirmed, true
end
Schema Modification (2)
DB.add_column :attendees, :confirmed, :boolean
DB.add_index :attendees, :confirmed
DB.drop_index :attendees, :confirmed
DB.rename_column :attendees, :id, :attendee_id
DB.drop_column :attendees, :attendee_id
DB.set_column_default :attendees, :name, 'Jeremy'
DB.set_column_type :attendees, :price, Numeric
DB.rename_table :attendees, :people
DB.drop_table :people
DB.create_view :ac, DB[:attendees].where(:confirmed)
DB.drop_view :ac
Migrations
Similar to ActiveRecord migrations
Migration class proxies most methods to Database
Migrator
Migrations are just classes that can be used individually via an API
Sequel::Migrator deals with a directory of files containing migrations, similar to AR
Filenames should start with an integer representing the state of the migration, similar to AR before timestamped migrations
You can use the Migrator API, or the sequel command line tool -m switch
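The integer-prefix ordering can be sketched in plain Ruby (a hypothetical helper; Sequel::Migrator's actual logic also tracks the database's current schema version). Note the numeric sort: 10 comes after 2, which a plain lexicographic filename sort would get wrong.

```ruby
# Select migration files and order them by their integer prefix.
def ordered_migrations(filenames)
  filenames.select { |f| f =~ /\A\d+_.+\.rb\z/ }.
    sort_by { |f| f.split('_', 2).first.to_i }
end

files = ['2_add_events.rb', '10_add_index.rb',
         '1_create_attendees.rb', 'README']
ordered_migrations(files)
# => ["1_create_attendees.rb", "2_add_events.rb", "10_add_index.rb"]
```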
Migration Philosophy
Migrations should preferably only do schema modification, no data modification unless necessary
Migrations should be self-contained, and not reference any part of your app (such as your models)
Migrations are deliberately not timestamped:
The whole point of timestamped migrations was to allow multiple teams working on the same app to add migrations to different branches without requiring manual intervention when merging
That is a poor idea, as there is no guarantee that the modifications will not conflict
Using integer versions instead of timestamps means the maintainer has to take manual effort when merging branches with different migrations, which is a good thing
Model Schemas
Models can use set_schema/create_table for a DataMapper-like way of handling things
This isn't recommended, as it makes schema changes difficult
Use migrations instead for any production app
For test code/examples, it is OK
However, even then I prefer using the standard database schema methods before the model definition
Sequel's philosophy is that the model is simply a nice front end to working with the database, not that the database is just a place to store the model's data
Bound Variables
Potentially faster depending on the query (no literalization of large objects)
Don't assume better performance, and don't use without profiling/benchmarking
Use :$blah placeholders on all databases
Native support on PostgreSQL, JDBC, and SQLite, others have emulated support
ds = DB[:items].filter(:name => :$n)
ds.call(:select, :n => 'Jim')
ds.call(:update, {:n => 'Jim', :new_n => 'Bob'}, \
  :name => :$new_n)
Prepared Statements
Similar to bound variable support:
Potentially faster due to reduced literalization and query plan caching
Only use after profiling/benchmarking
Uses same :$blah placeholders
Native support on PostgreSQL, JDBC, SQLite, and MySQL; emulated support on other databases
ds = DB[:items].filter(:name => :$n)
ps = ds.prepare(:select, :select_by_name)
ps.call(:n => 'Jim')
DB.call(:select_by_name, :n => 'Jim')
ps2 = ds.prepare(:update, :update_name, \
  :name => :$new_n)
ps2.call(:n => 'Jim', :new_n => 'Bob')
Stored Procedures
Only supported in the MySQL and JDBC adapters
Similar to prepared statement support
DB[:table].call_sproc(:select, :mysp, \
'param1', 'param2')
sp = DB[:table].prepare_sproc(:select, :mysp)
sp.call('param1', 'param2')
sp.call('param3', 'param4')
Master/Slave Databases
Sequel has built in support for master/slave database configurations
SELECT queries go to slave databases, all other requests go to master database
No code modifications are required, just need to modify the Sequel.connect call
Sharding/Partitioning
Sequel makes it simple to deal with a sharded/partitioned database setup
Basically, you can set any standard query to use whichever server you specify, using the Dataset#server method
Implemented in the generic connection pool, so all adapters are supported
s = {}
%w'a b c d'.each{|x| s[x.to_sym] = {:host=>"s#{x}"}}
DB=Sequel.connect('postgres://m/db', :servers=>s)
DB[:table].server(:a).filter(:num=>10).all
DB[:table].server(:b).filter(:num=>100).delete
Lightweight?
Requires: bigdecimal, bigdecimal/util, date, enumerator, thread, time, uri, yaml
Possible to trim some of those out with minor hacking
RAM Usage (VSZ change from requiring on i386):
Sequel 2.11.0:
sequel_core: 2064KB
sequel: 3700KB
DataMapper 0.9.10:
dm-core: 5560KB
dm-more: 14244KB
ActiveRecord
2.2.2: 12792KB
2.3.1rc2: 6588KB (nothing autoloaded)
2.3.1rc2: 9724KB (everything loaded)
Current Status
Current version: 2.11.0
Releases are generally done once a month, usually in the first week of the month
Generally there are no open bugs or features planned when a release is made
Every month there are small and not-so-small features that get added, mostly based on users' suggestions/code
Bugs on the tracker get priority, and generally are dealt with quickly (1 day-1 week)
Bugs 200-260: 48 Fixed, 5 WontFix, rest invalid/spam
The empty bug tracker is the SOP
Contributing
Contributing is very easy, no +1s required
No bureaucracy, just notify me via IRC (#sequel), the Google Group, the bug tracker, or github
Feedback for all patches is prompt, no exceptions
Patches should include specs, but they aren't required; I'll accept patches without specs and add the specs myself (if it's an obvious bug or a feature I like)
Feature requests are denied more often for philosophical reasons than for poor implementation
If I think a feature is good and the implementation is not, I'll rewrite the implementation myself
Questions?