Version 4.0
January 2005
Copyright © Daffodil Software Limited
SCO 42, 3rd Floor
Old Judicial Complex, Civil lines
Gurgaon - 122001
Haryana, India.
www.daffodildb.com
All rights reserved. Daffodil DB™ is a registered trademark of Daffodil Software Limited.
Java™ is a registered trademark of Sun Microsystems, Inc. All other brand and product
names are trademarks of their respective companies.
TABLE OF FIGURES
PREFACE
   PURPOSE
   TARGET AUDIENCE
   PRE-REQUISITES
TOP LEVEL INTERACTION DIAGRAMS
   EXECUTE DDL QUERY
   EXECUTE DML QUERY
   EXECUTE DQL QUERY
SERVER
   OVERVIEW
   SERVERSYSTEM
DML
   OVERVIEW
   CONSTRAINTSYSTEM
   TRIGGERSYSTEM
   SQL
DDL
   OVERVIEW
   SQL
   DESCRIPTORS
DQL
   OVERVIEW
   SQL
   NAVIGATOR
   EXECUTIONPLAN
SESSION
   OVERVIEW
   SESSIONSYSTEM
   DATADICTIONARY
DATA STORE
   OVERVIEW
   FILESYSTEM
   INDEXSYSTEM
PARSER
Purpose
The purpose of the Daffodil DB Design Document is to describe the design and architecture
of Daffodil DB in sufficient detail to enable developers to understand its underlying
architecture. The logical architecture of the JDBC driver, Server, DML, DDL, DQL,
Session, Data Store and Parser modules of Daffodil DB is explained.
The classes and interfaces of all the modules are explained in detail, and examples
with valid and invalid cases are provided wherever necessary.
This design document is intended to act as a technical reference for developers
involved in the development of Daffodil DB/One$DB.
This document assumes that you have sufficient understanding of the following
concepts:
Daffodil DB requires Java JRE 1.4 or higher. Since Daffodil DB is written in Java, it can
run on any platform that supports the Java runtime environment 1.4 or higher. The
compiled files are contained in Java Archives (JARs), which have to be added to the
CLASSPATH environment variable:
[Architecture diagram: the JDBC Driver sits above the Server; the Server uses the
Parser and the DDL, DML and DQL modules, which operate through the Session layer
on the Data Store.]
A user executes a DDL (Data Definition Language) query through the Daffodil DB
JDBC driver. DDL queries can be used to create, drop or alter objects in the database.
The objects allowed are schemas, tables, views, triggers, domains, procedures, roles and
constraints. Once created, objects remain persistent in the database.
Flow Explanations:
1: The JDBC driver will pass the user query to the server for execution. The server will
return an object containing information about the result of the query.
Exceptions:
Query is invalid.
Object already exists.
2 & 3: The server will give the query to the parser. The parser will parse the query and
create a tree-type object representing the content of the query; the tree nodes are Java
classes representing the different rules of the SQL grammar.
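As a rough illustration of the tree-type object, each grammar rule can be modelled as a small Java class. The sketch below is hypothetical (these are not Daffodil DB's actual parser classes) and its toy "parser" recognizes only one hard-coded statement shape:

```java
// Hypothetical sketch: each SQL grammar rule becomes a node class,
// and parsing produces a tree of these nodes.
import java.util.Arrays;
import java.util.List;

public class ParseTreeSketch {
    interface SqlNode { String describe(); }

    static class TableName implements SqlNode {
        final String name;
        TableName(String name) { this.name = name; }
        public String describe() { return "table:" + name; }
    }

    static class CreateTableStatement implements SqlNode {
        final TableName table;
        final List<String> columns;
        CreateTableStatement(TableName t, List<String> cols) { table = t; columns = cols; }
        public String describe() { return "CREATE " + table.describe() + " cols=" + columns; }
    }

    // A toy "parser" that recognizes only one fixed statement shape.
    static SqlNode parse(String sql) {
        if (sql.toUpperCase().startsWith("CREATE TABLE TAB1")) {
            return new CreateTableStatement(new TableName("Tab1"), Arrays.asList("a", "b"));
        }
        throw new IllegalArgumentException("Query is invalid");
    }

    public static void main(String[] args) {
        SqlNode tree = parse("Create Table Tab1 (a int, b char(1))");
        System.out.println(tree.describe());
    }
}
```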
Flow Explanations:
1: The JDBC driver will pass the user query to the server for execution. The server will
return the count of rows affected by the query.
Exceptions:
Constraint violation.
Exception raised during trigger execution.
Query is invalid.
2 & 3: The server will give the query to the parser. The parser will parse the query and
create a tree-type object representing the overall structure of the query; the tree nodes
are Java classes representing the different rules of the SQL grammar.
4: The server will pass the parsed object to DML for execution. DML will perform the
semantic checking of the query, verifying the rules specified in the SQL specification
for the given query.
5 & 6: Session will be requested to retrieve all the rows of the given table according to
the specified condition. Session will interact with Data System to retrieve rows from the
physical storage.
7: Session will check for the locking of the rows being changed by the DML.
Exception:
Row modified by another user
8: DML will also apply the constraints and triggers created by the user on the given
table.
9 & 10: After doing all the operations, DML will save the changes by interacting with the
Session.
Flow Explanations:
1: The JDBC driver will pass the user query to the server for execution. The server will
return an iterator object which can be used to iterate over the result of the DQL query.
Exceptions:
Query is invalid.
2 & 3: The server will give the query to the parser. The parser will parse the query and
create a tree-type object representing the overall structure of the query; the tree nodes
are Java classes representing the different rules of the SQL grammar.
4: The server will pass the parsed object to DQL for execution. DQL will perform the
semantic checking of the query, validating it against the rules of the SQL specification.
DQL will then make an execution plan for the query; the execution plan holds the
information needed to solve the query optimally. DQL will interact with the session to
retrieve rows according to the transaction isolation level of the user, and will create an
iterator object that can be used to iterate over the results of the query. The iterator can
be navigated in both the forward and backward directions.
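The forward/backward iterator described above can be sketched as a small navigator over an in-memory result list; the class and method names are illustrative only, not Daffodil DB's actual Navigator API:

```java
// Illustrative sketch of a result navigator that moves both forward
// and backward over the rows of a query result.
import java.util.Arrays;
import java.util.List;

public class NavigatorSketch {
    private final List<Object[]> rows;
    private int pos = -1; // positioned before the first row

    public NavigatorSketch(List<Object[]> rows) { this.rows = rows; }

    public boolean next()     { if (pos + 1 >= rows.size()) return false; pos++; return true; }
    public boolean previous() { if (pos - 1 < 0) return false; pos--; return true; }
    public Object[] current() { return rows.get(pos); }

    public static void main(String[] args) {
        NavigatorSketch nav = new NavigatorSketch(Arrays.asList(
                new Object[]{1, "A"}, new Object[]{2, "B"}));
        nav.next(); nav.next();               // forward to the second row
        System.out.println(nav.current()[1]); // B
        nav.previous();                       // back to the first row
        System.out.println(nav.current()[1]); // A
    }
}
```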
The JDBC driver is the main interface through which a user interacts with Daffodil
DB. It can be used to execute any type of query on the database and functions
according to the JDBC 3.0 API specified by Sun Microsystems. The JDBC driver
provides all the information about the database and its objects, giving access to the
meta-data as well as the data of the tables. It supports the execution of all types of
queries, including DDL, DML, DQL and DCL. Using the JDBC driver, a user can
execute SQL statements, retrieve results and make changes to the database.
For more comprehensive and up-to-date information on JDBC and JTA, please refer to
the following resources:
• http://java.sun.com/products/jdbc
• http://java.sun.com/products/jta
• http://developer.java.sun.com/developer/Books/JDBCTutorial/
• http://java.sun.com/products/jdbc/download.html#corespec30
A JDBC driver is provided for both the Embedded and the Network versions of Daffodil
DB. Other interfaces, such as a command-line tool and a GUI-based tool (Graphical
User Interface), will also be provided for performing the above-mentioned operations.
The server is responsible for processing JDBC driver requests and provides the
environment for executing the different types of queries: DDL, DML, DQL and DCL.
Queries can be executed on different databases concurrently, and the server provides
a multi-threaded environment for the concurrent execution of queries on the same
database.
[Class diagram showing Server/ServerImpl, User/UserImpl,
Connection/ConnectionImpl/DistributedConnectionImpl,
PreparedStatement/PreparedStatementImpl/NonParameterizedStatement and
XAResource/XAResourceImpl, together with their dependencies.]
Classes:
Connection (Interface)
Connection represents the interface through which the user interacts with the database
server. Connection is responsible for handling transaction and executing SQL queries
given by the user.
Server (Interface)
Server is responsible for providing connections to users and for ensuring that a
database is not used by another instance of the database server. The server verifies
the user before giving a connection to the database.
ServerImpl ( )
ServerImpl provides the implementation of the Server interface.
ConnectionImpl ( )
ConnectionImpl provides the implementation of Connection interface.
DistributedConnectionImpl ( )
DistributedConnectionImpl provides the implementation of Connection interface in
distributed transactions environment. DistributedConnectionImpl delegates its call to
underlying Connection instance.
PreparedStatementImpl ( )
PreparedStatementImpl provides the implementation of the PreparedStatement
interface.
NonParameterizedStatement ( )
NonParameterizedStatement provides the implementation of PreparedStatement
interface in case of queries having no parameters.
User (Interface)
User is responsible for interacting with the end-user through the GUI. User manages the
connections and provides functionality for creating/dropping a database.
UserImpl ( )
UserImpl provides the implementation of the User interface.
XAResource (Interface)
XAResource represents a distributed resource.
XAResourceImpl ( )
XAResourceImpl provides the implementation of XAResource interface.
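A minimal sketch of the Server/Connection split described above. The method names and the in-memory user check are hypothetical assumptions for illustration; the real interfaces are richer:

```java
// Hedged sketch: a Server hands out Connections after verifying the
// user, and a Connection executes queries. Names are illustrative.
import java.util.HashMap;
import java.util.Map;

public class ServerSketch {
    interface Connection { Object execute(String sql); void commit(); }
    interface Server { Connection getConnection(String db, String user, String password); }

    static class ServerImpl implements Server {
        private final Map<String, String> users = new HashMap<>();
        ServerImpl() { users.put("admin", "secret"); } // toy credential store
        public Connection getConnection(String db, String user, String password) {
            if (!password.equals(users.get(user)))
                throw new SecurityException("authentication failed");
            return new ConnectionImpl(db);
        }
    }

    static class ConnectionImpl implements Connection {
        private final String db;
        ConnectionImpl(String db) { this.db = db; }
        public Object execute(String sql) { return "executed on " + db + ": " + sql; }
        public void commit() { /* flush changes to the data store */ }
    }

    public static void main(String[] args) {
        Server server = new ServerImpl();
        Connection con = server.getConnection("demo", "admin", "secret");
        System.out.println(con.execute("select 1"));
    }
}
```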
DML responsibilities:
DML will be responsible for executing all types of DML queries specified in the SQL 99
specification. DML queries include insert, update and delete. DML also takes care of
constraints and triggers.
DML is crucial to the overall performance of the database server and needs to be
optimized for best performance.
Multiple users can execute different DML queries on the same table concurrently.
Constraints:
Constraints are used to specify or check the validity of the records inserted/modified.
Different constraints are:
Unique constraint: This constraint specifies that every value or combination of values of
the specified columns must be unique. Null values are not considered when checking
the constraint: if the constraint is applied to a combination of columns and any of the
columns has a null value, the constraint is not checked for that record/row.
Primary constraint: A primary constraint is similar to a unique constraint, but it does not
allow null values in its columns. If multiple columns are specified in the constraint, then
no column may have a null value.
Check constraint: A check constraint specifies a condition that must be satisfied by the
column values of each row of the table. The constraint can be applied to a single column
or to multiple columns. To check the constraint, the specified condition is evaluated, and
if the current value of the column does not satisfy the condition, an error is thrown.
Null constraint: A null constraint is used to disallow null values in the column specified
in the constraint.
Triggers: Triggers are used to perform specified actions on the occurrence of other
operations on the table. Triggers can be applied in a variety of ways and are applied
to the DML operations performed on the table. There are two types of triggers:
Row-Level Trigger: A row-level trigger is executed once for each record affected by the
execution of the DML statement.
Statement-Level Trigger: A statement-level trigger is executed once for the triggering
DML statement, regardless of the number of rows affected.
A trigger can be applied to any DML query, i.e. insert, update or delete, and can be
configured to execute before or after the operation. Its execution can be restricted by
applying a "condition"; in the case of update, a list of columns can also be specified.
A trigger has a set of SQL statements that are executed when the trigger fires.
Triggers can be nested, i.e. an SQL statement of one trigger can initiate another
trigger.
DML query execution can be divided into four parts: privileges checking, record
modification, constraints checking and triggers execution. A DML query specifies a
single table whose records/rows will be modified. In an insert query, the user specifies
the columns and values of the record; in an update query, a "where condition" for the
records to be modified and expressions for computing the new column values; in a
delete query, a "where condition" for the records to be deleted.
A DML query will first check the privileges of the user executing it: the current user
must have the privileges for the specified query and the columns being affected. The
query then locates the records to be operated on, either by solving the "where
condition" or by building the record from the specified values, and performs the
specified operation on the records found. Constraints are checked on every record
modified; if there is a violation, the query rolls back the changes made and throws an
error to the user. The query also executes the triggers applied by users to the
operation; as with constraints, if there is any error or violation, all the changes are
rolled back and an error is thrown. A DML query always starts a sub-transaction under
the current transaction so that the changes made by the current query can be reverted
in case of error; on successful execution, the changes made under the sub-transaction
become part of the current transaction.
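The sub-transaction behaviour described above can be sketched with an undo log that is replayed when a constraint violation occurs; all names here are illustrative assumptions, not Daffodil DB's transaction classes:

```java
// Sketch: changes are recorded with undo actions; a constraint
// violation rolls back only the current statement's changes, leaving
// the enclosing transaction's earlier work intact.
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class SubTransactionSketch {
    static class Table {
        final List<Integer> rows = new ArrayList<>();
    }

    // Insert values under a sub-transaction; negative values stand in
    // for a constraint violation.
    static void insertAll(Table table, int... values) {
        Deque<Runnable> undo = new ArrayDeque<>(); // sub-transaction log
        try {
            for (int v : values) {
                if (v < 0) throw new IllegalStateException("constraint violation");
                table.rows.add(v);
                undo.push(() -> table.rows.remove(table.rows.size() - 1));
            }
            // success: the sub-transaction merges into the enclosing transaction
        } catch (RuntimeException e) {
            while (!undo.isEmpty()) undo.pop().run(); // revert this statement only
            throw e;
        }
    }

    public static void main(String[] args) {
        Table t = new Table();
        insertAll(t, 1, 2);
        try { insertAll(t, 3, -1); } catch (RuntimeException ignored) { }
        System.out.println(t.rows); // the failed statement left no trace
    }
}
```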
More details:
Using Indexes:
In the case of update and delete statements, if a condition is specified then indexes
can be used to solve it.
• If more than one index is available on the columns involved in the condition, the
index containing the maximum number of condition columns should be selected.
• The size of the columns can also be considered when selecting an index.
• If the condition cannot be solved by a single index, it should be split if the
individual parts can be solved by indexes.
• The use of multiple indexes to solve the condition can also be considered.
• Redundant conditions involving the greater-than and less-than operators can be
removed.
• Conditions involving "AND" and "OR" can be simplified using the following
rules:
Predicate OR True → True
Predicate OR False → Predicate
Predicate AND True → Predicate
Predicate AND False → False
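The four rewrite rules above can be expressed directly in code; in this sketch, `null` stands for a predicate that cannot be evaluated statically (an illustration of the rules, not Daffodil DB's optimizer):

```java
// Sketch of the AND/OR simplification rules: a Boolean constant
// absorbs or disappears, leaving either a constant or the predicate.
public class ConditionSimplifier {
    // null stands for an unevaluable predicate; TRUE/FALSE are constants.
    static Boolean or(Boolean predicate, boolean constant) {
        if (constant) return Boolean.TRUE;      // Predicate OR  True  -> True
        return predicate;                       // Predicate OR  False -> Predicate
    }

    static Boolean and(Boolean predicate, boolean constant) {
        if (!constant) return Boolean.FALSE;    // Predicate AND False -> False
        return predicate;                       // Predicate AND True  -> Predicate
    }

    public static void main(String[] args) {
        System.out.println(or(null, true));   // condition removed entirely
        System.out.println(and(null, true));  // predicate kept, constant dropped
        System.out.println(and(null, false)); // condition removed entirely
    }
}
```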
Condition/query rewriting:
Indexes will be created on unique and primary constraint columns to speed up
constraint checking, retrieval and modification.
• For unique and primary constraints, an index can be used to check whether the
current value of the column already exists in the table.
• For a foreign key constraint, an index on the referenced table can be used to
check whether the current value is valid.
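A sketch of the index-based uniqueness check, with a `TreeMap` standing in for the on-disk index; this illustrates the idea only and is not the actual FileSystem/IndexSystem code:

```java
// Sketch: checking a unique constraint via an index lookup instead of
// scanning the whole table. Nulls are not checked, matching the
// unique-constraint semantics described above.
import java.util.TreeMap;

public class UniqueIndexCheck {
    private final TreeMap<Integer, Long> index = new TreeMap<>(); // value -> row id
    private long nextRowId = 1;

    public void insert(Integer value) {
        if (value != null && index.containsKey(value))   // index lookup, no table scan
            throw new IllegalStateException("unique constraint violation: " + value);
        if (value != null) index.put(value, nextRowId);
        nextRowId++;
    }

    public static void main(String[] args) {
        UniqueIndexCheck table = new UniqueIndexCheck();
        table.insert(1);
        table.insert(null); // null values are not checked for uniqueness
        try {
            table.insert(1);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```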
Triggers Optimization:
• The parsed objects of the SQL queries specified in a trigger can be cached, and
their semantic checking can be skipped at execution time, because the SQL
queries are checked semantically during trigger creation.
DML RESPONSIBILITIES
1. Insert
2. Update
3. Delete
4. Triggers
5. Constraints
6. Default clause
7. Auto Increment Values
8. Sequences
1. Semantic Checking
Check that the table and columns exist:
Invalid:
Insert into Tab1 values (1, 'A')
Comment: No object called Tab1.
Valid:
Create Table Tab1 (a int, b char (1))
Insert into Tab1 values (1, 'A')
Invalid:
Create Table Tab1 (a int, b char (1))
Insert into Tab1 (a, c) values (1, 'A')
Insert into Tab1 (T2.a) values (2)
Valid:
Create Table Tab1 (a int, b char (1))
Insert into Tab1 (a, b) values (1, 'A')
Check that no column is repeated:
Invalid:
Create Table Tab1 (a int, b char (1))
Insert into Tab1 (a, a) values (1, 2)
Valid:
Create Table Tab1 (a int, b char (1))
Insert into Tab1 (a) values (2)
Check the degree (the number of columns defined in the insert statement should be
equal to the number of values specified):
Invalid:
Create Table Tab1 (a int, b char (1))
Insert into Tab1 (a, b) values (1, 'A', 'B')
Insert into Tab1 (a) values (1, 'A')
Insert into Tab1 (a) values (select * from Tab1)
Valid:
Create Table Tab1 (a int, b char (1))
Insert into Tab1 (a) values (2)
Check the cardinality (the number of rows returned by a select query used as a value
should be equal to one):
Invalid:
Create Table Tab1 (a int, b char (1))
Insert into Tab1 (a, b) values (1, 'A')
Insert into Tab1 (a, b) values (2, 'B')
Insert into Tab1 (a, b) values (select * from Tab1)
Valid:
Create Table Tab1 (a int, b char (1))
Insert into Tab1 (a, b) values (1, 'A')
Check the data types of the values in the query against the data types of the
respective columns in the table:
Invalid:
Create Table Tab1 (a int, b char (1))
Insert into Tab1 (a) values ('abc')
Valid:
Create Table Tab1 (a int, b char (1))
Insert into Tab1 (a) values (1)
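The column-existence, duplicate-column and column/value count checks above can be sketched as a pre-execution validation; the method names are illustrative:

```java
// Sketch of the semantic checks run on an insert statement before
// execution: unknown columns, duplicate columns, and a column/value
// count mismatch are each rejected.
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

public class InsertChecks {
    static void check(List<String> tableColumns, List<String> insertColumns,
                      List<Object> values) {
        if (new HashSet<>(insertColumns).size() != insertColumns.size())
            throw new IllegalArgumentException("duplicate column in insert list");
        for (String c : insertColumns)
            if (!tableColumns.contains(c))
                throw new IllegalArgumentException("unknown column: " + c);
        if (insertColumns.size() != values.size())
            throw new IllegalArgumentException("column/value count mismatch");
    }

    public static void main(String[] args) {
        List<String> tab1 = Arrays.asList("a", "b");
        check(tab1, Arrays.asList("a", "b"), Arrays.asList(1, "A")); // valid
        try {
            check(tab1, Arrays.asList("a"), Arrays.asList(1, "A")); // too many values
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```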
Execution Insert:
1. Convert the data type of the values in the insert statement according to the data
types of the columns defined in the table. For example, if the user has created a
table with a column of varchar data type and inserts a numeric value without
quotes, Daffodil DB will convert the numeric value to the varchar type.
2. Add the default value if the user has not provided values for columns defined with
a default clause; otherwise the record is inserted with the user's value.
3. Add the auto-increment value if the table has a column with the auto-increment
property. If a value is provided in the statement by the user, it takes precedence
over the auto-generated value.
4. Add the value generated by a sequence, if the user has used a sequence in the
statement.
6. If any sub-query is used in the insert statement, Daffodil DB will fetch all the
records of the sub-query and insert them into the table.
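Step 1 above (converting a supplied value to the declared column type) might look like the following sketch; the type set and conversion rules are simplified assumptions, not Daffodil DB's full type system:

```java
// Sketch of insert-time type conversion: a numeric literal supplied
// for a varchar column becomes a string, and vice versa.
public class TypeConversion {
    enum SqlType { INT, VARCHAR }

    static Object convert(Object value, SqlType columnType) {
        switch (columnType) {
            case VARCHAR: return String.valueOf(value);       // 123 -> "123"
            case INT:
                if (value instanceof Number) return ((Number) value).intValue();
                return Integer.parseInt(value.toString());    // "45" -> 45
            default: throw new IllegalArgumentException("unsupported type");
        }
    }

    public static void main(String[] args) {
        System.out.println(convert(123, SqlType.VARCHAR)); // 123
        System.out.println(convert("45", SqlType.INT));    // 45
    }
}
```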
Update:
1. Convert the data type of the values in the set-clause list of the update statement
according to the data types of the columns defined in the table. For example, if
the user has created a table with a column of varchar data type and supplies a
numeric value without quotes, Daffodil DB will convert the numeric value to the
varchar type.
2. Add the value generated by a sequence, if the user has used a sequence in the
statement.
3. Update all the records if the user has not specified a 'where clause'. If one is
specified, get the navigator on the condition and update all records that satisfy
the condition given in the statement.
Delete:
1. Delete all records if the user has not specified a 'where clause'. If one is
specified, get the navigator on the condition and delete all records that satisfy
the condition given in the statement.
Examples:
e) Insert into Tab1 (a) values (seq1.next()), where seq1 is a sequence.
This statement inserts a row with the value returned by the sequence.
Default Clause:
The default clause allows one to specify a default value for a given column. Possible
default values are any literal, a null value, a datetime value function, USER,
CURRENT_USER, CURRENT_ROLE, SESSION_USER, CURRENT_PATH or any
implicitly typed value specification.
The default clause is set for a column at the time the table is created, and its value is
supplied at the time of insertion. The default value is inserted into a row when the user
has not included the defaulted column in the insert column list; it is overridden when
the user provides a value for that column.
For Example:
Create table student (Rollno integer, SName varchar (20), memo varchar (200)
DEFAULT CURRENT_USER);
Result:
Rollno SName memo
1      Rohan daffodil
1      Rohan deepak
To assign incremental values to a column, the auto-increment option is used. The user
does not need to provide a value for a column with the auto-increment option, because
the server itself supplies the incremental value; however, the user may also provide a
value for the column.
Only the following data types can be declared with the auto-increment option:
BIGINT, BYTE, INT, INTEGER, LONG, SMALLINT, TINYINT, DOUBLE PRECISION,
FLOAT, REAL, DEC, DECIMAL, NUMERIC
For Example:
Result:
Rollno SName
30 rohit
31 rohit
48 gohit
49 george
1 first
50 last
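The behaviour shown in the sample result can be sketched as follows. The rule that the counter resumes after a user-supplied value is an assumption inferred from the result above (inserting 48 is followed by a generated 49):

```java
// Sketch of an auto-increment column: the server supplies the next
// value unless the user provided one; a user-supplied value advances
// the counter past it (an assumption inferred from the sample result).
public class AutoIncrementSketch {
    private long next = 30;

    public long valueFor(Long userSupplied) {
        if (userSupplied == null) return next++;
        next = Math.max(next, userSupplied + 1); // resume after the user's value
        return userSupplied;                     // the user's value wins
    }

    public static void main(String[] args) {
        AutoIncrementSketch rollno = new AutoIncrementSketch();
        System.out.println(rollno.valueFor(null)); // 30
        System.out.println(rollno.valueFor(null)); // 31
        System.out.println(rollno.valueFor(48L));  // 48 (user value wins)
        System.out.println(rollno.valueFor(null)); // 49
    }
}
```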
Sequence:
Sequences are very similar to auto-increment columns, except that auto-increment
applies at the table level while a sequence applies at the database level: a single
sequence can be used in many tables.
Result:
roll address
2 2
4 3
6 4
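A database-level sequence can be sketched as a single shared counter that several tables draw from; the class is a minimal illustration:

```java
// Sketch of a database-level sequence: one counter, usable from the
// insert path of any number of tables.
public class SequenceSketch {
    static class Sequence {
        private long value;
        private final long step;
        Sequence(long start, long step) { this.value = start; this.step = step; }
        synchronized long next() { long v = value; value += step; return v; }
    }

    public static void main(String[] args) {
        Sequence seq1 = new Sequence(1, 1);
        // The same sequence feeds inserts into two different tables:
        long forStudentTable = seq1.next();
        long forAddressTable = seq1.next();
        System.out.println(forStudentTable + " " + forAddressTable);
    }
}
```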
Constraints:
Types of Constraints
1. Primary Constraints
2. Unique Constraints
3. Check Constraints
4. Referential Constraints
Primary Constraints:
Primary constraints are checked only for insert and update statements. They are
checked even if the user has not included the primary-key column in the statement.
Checking a primary constraint would require scanning the whole table to maintain the
uniqueness of the column; to check the constraint in an optimized way, the index
created on the primary column is used.
For Example:
Here we create a table with two columns: c1 with the integer and c2 with the char data
type. In addition, a primary key is applied on column c1. The following queries show
the behaviour of the primary constraint.
Unique Constraints:
Unique constraints are very similar to primary constraints except that unique
constraints allow null values in the column. Checking this constraint would require
scanning the whole table to maintain the uniqueness of the column; for optimization,
the database uses the index created on the columns included in the unique constraint.
The unique constraint is checked only if the user has included the column in the
statement.
For Example:
Here we create a table with two columns: c1 with the integer and c2 with the char data
type. In addition, a unique constraint is applied on column c1. The following queries
show the behaviour of the unique constraint.
To verify a check constraint, the database solves the condition applied by the user on
the column, using the values from the current row being inserted or updated. Check
constraints are applicable only to insert and update statements; Daffodil DB does not
verify this constraint for delete statements. If a sub-query is included in the check
constraint, the database first solves the select statement and then verifies the given
check.
Examples:
1. Create table tab1 (c1 integer check (c1 > 5), c2 char(10))
Invalid:
Insert into tab1 values (1, 'A')
Valid:
Insert into tab1 values (7, 'A')
Invalid:
Insert into tab1 values (null, 'A')
Valid:
Insert into tab1 values (21, 'A')
Invalid:
Insert into tab1 values (1, 'a')
This query will throw a check-constraint violation exception because the
sub-query will return 5 as the maximum value and the value assigned in this
statement is less than that.
Valid:
Insert into tab1 values (50, 'a')
This query will execute successfully because the sub-query will return 5 as
the maximum value and the value assigned in this statement is greater than
that, which satisfies the check constraint.
Referencing constraints are checked for insert and update statements. Checking a
referencing constraint would require scanning the referenced table completely
according to the specified match type; however, since the referenced columns (those
to which the referencing columns refer) are always primary or unique columns, and
indexes are created for primary and unique constraint columns, these indexes are
used for optimization.
Types of Match:
Simple
Full
Partial
Simple Match: Insertion or modification in the referencing table is allowed only if some
referencing column of the row has a null value, or all the non-null values match the
corresponding referenced column values of some row.
Full Match: Either every referencing column of the row must be null, or all the
referencing column values must be equal to the corresponding referenced column
values of some row.
Partial Match: The referencing columns of a row must include a non-null value, and
every non-null value must be equal to the corresponding referenced column value of
some row.
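The three match types can be sketched as predicates over a referencing row and the set of referenced key values; this is an illustrative interpretation of the rules above, not Daffodil DB's implementation:

```java
// Sketch of SIMPLE, FULL and PARTIAL match checks for a referencing
// row against the set of referenced-key rows.
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

public class MatchTypes {
    static boolean anyNull(List<Object> row) { return row.contains(null); }
    static boolean allNull(List<Object> row) { return row.stream().allMatch(Objects::isNull); }

    // SIMPLE: allowed if any column is null, or the full row matches.
    static boolean simple(List<Object> row, Set<List<Object>> refs) {
        return anyNull(row) || refs.contains(row);
    }

    // FULL: allowed if all columns are null, or the full row matches.
    static boolean full(List<Object> row, Set<List<Object>> refs) {
        return allNull(row) || refs.contains(row);
    }

    // PARTIAL: allowed if all null, or some referenced row agrees on
    // every non-null column of the referencing row.
    static boolean partial(List<Object> row, Set<List<Object>> refs) {
        if (allNull(row)) return true;
        for (List<Object> ref : refs) {
            boolean ok = true;
            for (int i = 0; i < row.size(); i++)
                if (row.get(i) != null && !row.get(i).equals(ref.get(i))) ok = false;
            if (ok) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Set<List<Object>> country = new HashSet<>();
        country.add(Arrays.asList("India", 91));
        System.out.println(simple(Arrays.asList("India", null), country));  // allowed
        System.out.println(full(Arrays.asList("India", null), country));    // rejected
        System.out.println(partial(Arrays.asList("India", null), country)); // allowed
    }
}
```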
Example:
Scenario 1:
Create table country (Cname varchar (20), Cid Integer, Population Integer,
Primary key (Cname, Cid, Population))
Scenario 2:
Invalid:
Insert into state values ('Delhi', null, null, null)
Insert into state values ('Delhi', 'India', 1, null)
Insert into state values ('Delhi', 'India', null, 200)
Valid:
Insert into state values ('De', 'India', null, null)
Insert into state values ('Delhi', null, 91, null)
Scenario 3:
Create table country (Cname varchar(20), Cid Integer, Population Integer,
Primary key(Cname, Cid, Population))
Invalid:
Insert into state values ('Delhi', 'India', 1, 200)
Insert into state values ('Delhi', 'India', 91, 91)
Insert into state values ('Delhi', 'Aus', 91, 200)
Valid:
Insert into state values ('De', 'India', null, null)
Insert into state values ('Delhi', 'India', 91, 200)
Referenced constraints are checked for update and delete statements. Whenever a
user modifies a record in the referenced table, the rule, action and match type
specified on the constraint determine the operations performed on the referencing
table.
i. Update Rule
ii. Delete Rule
The possible actions are:
Cascade
Restrict
Set null
Set default
No Action
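Applying a delete rule to the matching rows can be sketched as a switch over the five actions; the `Row` layout, the default value and all names are illustrative assumptions:

```java
// Sketch: applying the delete rule of a referential constraint to the
// matching rows of the referencing table. Each row is [sname, cname],
// where cname references the deleted country key.
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class DeleteRuleSketch {
    enum Rule { CASCADE, RESTRICT, SET_NULL, SET_DEFAULT, NO_ACTION }

    static void onDelete(List<Object[]> referencing, String deletedKey, Rule rule) {
        for (Iterator<Object[]> it = referencing.iterator(); it.hasNext(); ) {
            Object[] row = it.next();
            if (!deletedKey.equals(row[1])) continue; // not a matching row
            switch (rule) {
                case CASCADE:     it.remove(); break;
                case SET_NULL:    row[1] = null; break;
                case SET_DEFAULT: row[1] = "India1"; break; // illustrative default
                case RESTRICT:    throw new IllegalStateException(
                                      "Integrity constraint violation - restrict violation");
                case NO_ACTION:   break; // checked at the end of the statement instead
            }
        }
    }

    public static void main(String[] args) {
        List<Object[]> state = new ArrayList<>();
        state.add(new Object[]{"s4", "India4"});
        state.add(new Object[]{"s2", "India2"});
        onDelete(state, "India4", Rule.SET_NULL);
        System.out.println(state.get(0)[1]); // null
        onDelete(state, "India2", Rule.CASCADE);
        System.out.println(state.size());    // 1
    }
}
```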
In some of the following topics, you will find terms like matching rows and unique
matching rows. Here is a brief description of these terms:
Matching Rows:
If Simple Match or Full Match is specified, then for a given row in the referenced table,
every row in the referencing table whose referencing column values equal the
corresponding referenced column values for the referential constraint is a matching
row.
If Partial Match is specified, then for a given row in the referenced table, every row in
the referencing table that has at least one non-null referencing column value, and
whose non-null referencing column values equal the corresponding referenced column
values for the referential constraint, is a matching row.
If Partial Match is specified, then for a given row in the referenced table, a matching
row that matches only that given row for the referential constraint is a unique matching
row; a matching row that also matches other rows in the referenced table is a
non-unique matching row.
Case:
i. If the delete rule specifies cascade, then every matching row in the referencing
table corresponding to the referenced row will be deleted.
iii. If the delete rule specifies set default, then for every referencing table, in every
unique matching row, each referencing column is set to its default value.
iv. If the delete rule specifies restrict and there are matching rows, then an
exception will be raised: "Integrity constraint violation - restrict violation".
vi. If the update rule specifies cascade, then for every referencing table, every
matching row is updated to the new value of the referenced column.
vii. If the update rule specifies set null, then for every referencing table, in every
matching row, each referencing column that corresponds with a referenced
column is set to the null value.
viii. If the update rule specifies set default, then for every referencing table, in every
matching row, each referencing column that corresponds with a referenced
column is set to its default value.
ix. If the update rule specifies restrict and there are matching rows in the
referencing table, then an exception will be raised: "Integrity constraint
violation - restrict violation".
Case:
i. If the delete rule specifies cascade, then every unique matching row in the
referencing table corresponding to the referenced row will be deleted.
ii. If the delete rule specifies set null, then every unique matching row in the
referencing table corresponding to the referenced row will be set to null.
iii. If the delete rule specifies set default, then for every referencing table, in every
unique matching row, each referencing column is set to its default value.
iv. If the delete rule specifies restrict and there are unique matching rows, then an
exception will be raised: "Integrity constraint violation - restrict violation".
vii. If the update rule specifies set null, then in every unique matching row that
contains a non-null value in a referencing column corresponding with the
updated column, that referencing column is set to the null value.
viii. If the update rule specifies set default, then for every referencing table, in every
unique matching row that contains a non-null value in a referencing column
corresponding with the updated column, that referencing column is set to its
default value.
ix. If the update rule specifies restrict and there are unique matching rows, then an
exception will be raised: "Integrity constraint violation - restrict violation".
Examples:
Create table state (Sname varchar(10), Sid Integer PRIMARY KEY, Cname
varchar(10), Cid Integer, FOREIGN KEY(Cname, Cid) REFERENCES
country(Cname, Cid) MATCH SIMPLE ON DELETE set null)
Case:
1. On deleting the record from the country table with the condition cname = 'India4',
only one row in the state table will be updated with null values in columns cname
and cid: the row that has sname 's4'.
2. On deleting the record from the country table with the condition cname = 'India2',
one row in the state table will be updated with null values in columns cname and
cid: the row that has sname 's2'.
Create table state( Sname varchar(10), Sid Integer PRIMARY KEY, Cname
varchar(10) default 'India1', Cid Integer default 1, FOREIGN KEY(Cname,Cid)
REFERENCES country(Cname,Cid) MATCH SIMPLE ON DELETE set default )
Case:
1. On deleting the record from the country table with condition cname = 'India4', only
one row of the state table (the one with sname 's4') will have 'India1' and 1 written
to its cname and cid columns.
2. On deleting the record from the country table with condition cname = 'India2', the
one row of the state table with sname 's2' will have 'India1' and 1 written to its
cname and cid columns.
Create table state( Sname varchar(10), Sid Integer PRIMARY KEY, Cname
varchar(10), Cid Integer, FOREIGN KEY(Cname,Cid) REFERENCES
country(Cname,Cid) MATCH SIMPLE ON DELETE restrict )
Create table state( Sname varchar(10), Sid Integer PRIMARY KEY, Cname
varchar(10), Cid Integer, FOREIGN KEY(Cname,Cid) REFERENCES
country(Cname,Cid) MATCH SIMPLE ON DELETE no action )
Case:
1. On deleting the record from the country table with condition cname = 'India4', that
record will be deleted from the country table without affecting the state table.
Create table state( Sname varchar(10), Sid Integer PRIMARY KEY, Cname
varchar(10), Cid Integer, FOREIGN KEY(Cname,Cid) REFERENCES
country(Cname,Cid) MATCH FULL ON DELETE CASCADE )
Case:
1. On deleting the record from the country table with condition cname = 'India4', only
one row of the state table will be deleted, the one with sname 's4'.
2. On deleting the record from the country table with condition cname = 'India2', the
one row of the state table with sname 's2' will be deleted.
Create table state( Sname varchar(10), Sid Integer PRIMARY KEY, Cname
varchar(10), Cid Integer, FOREIGN KEY(Cname,Cid) REFERENCES
country(Cname,Cid) MATCH FULL ON DELETE set null )
Case:
1. On deleting the record from the country table with condition cname = 'India4', only
one row of the state table (the one with sname 's4') will have its cname and cid
columns set to null.
2. On deleting the record from the country table with condition cname = 'India2', the
one row of the state table with sname 's2' will have its cname and cid columns set
to null.
Case:
1. On deleting the record from the country table with condition cname = 'India4', only
one row of the state table (the one with sname 's4') will have 'India1' and 1 written
to its cname and cid columns.
2. On deleting the record from the country table with condition cname = 'India2', the
one row of the state table with sname 's2' will have 'India1' and 1 written to its
cname and cid columns.
Create table state( Sname varchar(10), Sid Integer PRIMARY KEY, Cname
varchar(10), Cid Integer, FOREIGN KEY(Cname,Cid) REFERENCES
country(Cname,Cid) MATCH FULL ON DELETE restrict )
Create table state( Sname varchar(10), Sid Integer PRIMARY KEY, Cname
varchar(10), Cid Integer, FOREIGN KEY(Cname,Cid) REFERENCES
country(Cname,Cid) MATCH FULL ON DELETE no action )
Case:
1. On deleting the record from the country table with condition cname = 'India4', that
record will be deleted from the country table without affecting the state table.
Create table state( Sname varchar(10), Sid Integer PRIMARY KEY, Cname
varchar(10), Cid Integer, FOREIGN KEY(Cname,Cid) REFERENCES
country(Cname,Cid) MATCH SIMPLE ON UPDATE set null )
Case:
1. On updating the record in the country table with condition cname = 'India4' to value
'India7', only one row of the state table (the one with sname 's4') will have its
cname and cid columns set to null.
2. On updating the record in the country table with condition cname = 'India2' to value
'India8', the one row of the state table with sname 's2' will have its cname and cid
columns set to null.
Create table state( Sname varchar(10), Sid Integer PRIMARY KEY, Cname
varchar(10) default 'India1', Cid Integer default 1, FOREIGN KEY(Cname,Cid)
REFERENCES country(Cname,Cid) MATCH SIMPLE ON UPDATE set default )
Create table state( Sname varchar(10), Sid Integer PRIMARY KEY, Cname
varchar(10), Cid Integer, FOREIGN KEY(Cname,Cid) REFERENCES
country(Cname,Cid) MATCH SIMPLE ON UPDATE restrict )
Case:
Create table state( Sname varchar(10), Sid Integer PRIMARY KEY, Cname
varchar(10), Cid Integer, FOREIGN KEY(Cname,Cid) REFERENCES
country(Cname,Cid) MATCH SIMPLE ON UPDATE no action )
Case:
Create table state( Sname varchar(10), Sid Integer PRIMARY KEY, Cname
varchar(10), Cid Integer, FOREIGN KEY(Cname,Cid) REFERENCES
country(Cname,Cid) MATCH FULL ON UPDATE no action )
Case:
1. On deleting the record from the country table with condition cname = 'India1', only
one row of the state table will be deleted, the one with sname 's1', because it is
the only unique matching row for the country table.
2. On deleting the record from the country table with condition cname = 'India2', the
one row of the state table with sname 's2' will be deleted.
Create table state( Sname varchar(10), Sid Integer PRIMARY KEY, Cname
varchar(10), Cid Integer, FOREIGN KEY(Cname,Cid) REFERENCES
country(Cname,Cid) MATCH PARTIAL ON DELETE set null )
Case:
1. On deleting the record from the country table with condition cname = 'India1', only
one row of the state table (the one with sname 's1') will have its cname and cid
columns set to null, because it is the only unique matching row for the country
table.
2. On deleting the record from the country table with condition cname = 'India2', the
one row of the state table with sname 's2' will have its cname and cid columns set
to null.
Create table state( Sname varchar(10), Sid Integer PRIMARY KEY, Cname
varchar(10) default 'India4', Cid Integer default 4, FOREIGN KEY(Cname,Cid)
REFERENCES country(Cname,Cid) MATCH PARTIAL ON DELETE set default )
Case:
1. On deleting the record from the country table with condition cname = 'India1', only
one row of the state table (the one with sname 's1') will have the values 'India4'
and 4 written to its cname and cid columns.
2. On deleting the record from the country table with condition cname = 'India2', the
one row of the state table with sname 's2' will have the values 'India4' and 4
written to its cname and cid columns.
Create table state( Sname varchar(10), Sid Integer PRIMARY KEY, Cname
varchar(10), Cid Integer, FOREIGN KEY(Cname,Cid) REFERENCES
country(Cname,Cid) MATCH PARTIAL ON DELETE restrict )
Create table state( Sname varchar(10), Sid Integer PRIMARY KEY, Cname
varchar(10), Cid Integer, FOREIGN KEY(Cname,Cid) REFERENCES
country(Cname,Cid) MATCH PARTIAL ON DELETE no action )
Case:
1. On deleting the record from the country table with condition cname = 'India1', that
record will be deleted from the country table without affecting the state table.
Create table country (Cname varchar (10), Cid Integer, Population Integer,
PRIMARY KEY (Cname, Cid))
Create table state( Sname varchar(10), Sid Integer PRIMARY KEY, Cname
varchar(10), Cid Integer, FOREIGN KEY(Cname,Cid) REFERENCES
country(Cname,Cid) MATCH PARTIAL ON UPDATE set null )
Case:
Create table state( Sname varchar(10), Sid Integer PRIMARY KEY, Cname
varchar(10) default 'India4', Cid Integer default 4, FOREIGN KEY(Cname,Cid)
REFERENCES country(Cname,Cid) MATCH PARTIAL ON UPDATE set default )
1. On updating the record in the country table with condition cname = 'India1' to value
'India7', only one row of the state table (the one with sname 's1') will have
'India4' and 4 written to its cname and cid columns.
2. On updating the record in the country table with condition cname = 'India2' to value
'India8', the one row of the state table with sname 's2' will have the default values
'India4' and 4 written to its cname and cid columns.
3. On updating the record in the country table with condition cname = 'India5', the
state table will remain the same because there is no matching row.
Create table state( Sname varchar(10), Sid Integer PRIMARY KEY, Cname
varchar(10), Cid Integer, FOREIGN KEY(Cname,Cid) REFERENCES
country(Cname,Cid) MATCH PARTIAL ON UPDATE restrict )
Case:
Create table state( Sname varchar(10), Sid Integer PRIMARY KEY, Cname
varchar(10), Cid Integer, FOREIGN KEY(Cname,Cid) REFERENCES
country(Cname,Cid) MATCH PARTIAL ON UPDATE no action )
Case:
Trigger
While executing a DML statement, we fire triggers to perform the action specified by
the user. First, we evaluate the when condition of the trigger. If it is satisfied, the
database executes all specified statements, and every statement starts a new save
point. The user can also specify aliases in a statement to distinguish the old state
from the new state. In row-level triggers, we bind the old row values to the old alias
and the new row values to the new alias so that the statement executes correctly.
Similarly, in statement-level triggers we bind the old row set to the old alias and the
new row set to the new alias. We also handle recursion in triggers. To stop recursion,
we use a trigger execution context to save trigger states: before executing the
statement specified in a trigger, we add the trigger's current state to the Trigger
Execution Context, and we throw an exception if the same state recurs.
More than one statement can be executed within a trigger. These statements are
executed in the same sequence in which they were specified in the trigger statement.
There are some limitations in defining aliases: we cannot specify an old alias with the
before insert option, because there is no older version of a row that is about to be
inserted. Similarly, we cannot specify a new alias with the after delete option,
because after a row is deleted it is meaningless to refer to it through an alias.
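The row-level old/new aliasing described above can be illustrated with Python's sqlite3 module, used here only as a stand-in engine (not Daffodil DB). SQLite exposes the transition rows implicitly as OLD and NEW, where SQL:1999 uses an explicit REFERENCING clause.

```python
import sqlite3

# Sketch of a row-level trigger reading both the old and new state of the
# updated row; OLD/NEW play the role of the old/new aliases discussed above.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE country (cname TEXT, population INTEGER)")
con.execute("CREATE TABLE audit (cname TEXT, old_pop INTEGER, new_pop INTEGER)")
con.execute("""
    CREATE TRIGGER log_update AFTER UPDATE ON country FOR EACH ROW
    BEGIN
        INSERT INTO audit VALUES (OLD.cname, OLD.population, NEW.population);
    END""")
con.execute("INSERT INTO country VALUES ('India', 100)")
con.execute("UPDATE country SET population = 1000 WHERE cname = 'India'")

audit_rows = con.execute("SELECT * FROM audit").fetchall()
print(audit_rows)  # [('India', 100, 1000)]
```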
Examples:
After Insert:-
create trigger abc1 after insert on country1 for each row insert into country2
values('NewIndia', 100, 1000)
After Update:-
create trigger abc1 after update on country1 for each row insert into country2
values('NewIndia', 100, 1000)
(In all, three rows will be inserted into table country2; the trigger fires after
each update in country1, as the update statement affects 3 rows in table country1
and therefore fires the trigger 3 times.)
After Delete:-
create trigger abc1 after delete on country1 for each row insert into country2
values('NewIndia', 100, 1000)
(In all, two rows will be inserted into table country2; the trigger fires after each
delete on country1, with values ('NewIndia', 100, 1000).)
Before Insert:-
create trigger abc1 before insert on country1 for each row insert into country2
values('NewIndia', 100, 1000)
(One row will be inserted into table country2 before a row is inserted into table
country1, with values ('NewIndia', 100, 1000).)
Before Update:-
create trigger abc1 before update on country1 for each row insert into country2
values('NewIndia', 100, 1000)
(In all, two rows will be inserted into table country2; the trigger fires before
each update in country1.)
Before Delete:-
create trigger abc1 before delete on country1 for each row insert into country2
values('NewIndia', 100, 1000)
(One row will be inserted into table country2 before the record is deleted from
country1, with values ('NewIndia', 100, 1000).)
After Insert:-
create trigger trigger3 after insert on country1_1 for each statement insert into
country1_2 select * from country1_1
(Will insert two rows in table country1_2 after inserting the row with Cid=2 in table
country1_1, as:
Cname Cid Population
india 1 2
)
After Update:-
create trigger trigger5 after update on country1_1 for each statement insert into
country1_2 select * from country1_1
(Will insert four rows in table country1_2 after updating population column in
table country1_1)
After Delete:-
create trigger trigger2 after delete on country1_1 for each statement insert
into country1_2 select * from country1_1
(Will insert one row in table country1_2 after deleting the row with Cid=1 from table
country1_1, as:
Cname Cid Population
india 1 2
mumbai 2 3
)
Before Insert:-
create trigger trigger3 before insert on country1_1 for each statement insert into
country1_2 select * from country1_1
(Will insert one row in table country1_2 before inserting row with Cid=2 in table
country1_1)
Before Update:-
create trigger trigger5 before update on country1_1 for each statement insert into
country1_2 select * from country1_1
(Will insert one row in table country1_2 before updating population column in
table country1_1)
Before Delete:-
create trigger trigger1 before delete on country1_1 for each statement insert into
country1_2 select * from country1_1
(Will insert one row in table country1_2 before deleting row with Cid=1 from
table country1_1)
Triggers with an atomic body ("begin ... end"), before delete, and a referencing
column name:
create trigger abc2 before delete on country4 referencing old o for each row
Begin
End
create trigger abc1 after insert on country4 for each row insert into country4
values ('acc', 1, 1)
(Will start inserting values into table country4 recursively and will throw an error
about recursion in the trigger.)
create trigger abc12 after update on country28 referencing new n for each row
when (n.cid < 20) update country29 set cid = 100 where cid = n.cid
(Will fire the trigger, updating country29, if the new value of Cid in country28 is
< 20; i.e., the above update statement will update country29 and set the value of
cid to 100.)
create trigger abc4 after update on country10 referencing old o for each row
update country11 set Cid = o.cid where Cid = 1
create trigger abc11 after update on country11 referencing new n for each row
delete from country where cid = n.cid
(Will fire the trigger abc11 and delete all the rows from country having cid equal to
the new update value for country11, i.e. cid = 100.)
(Will fire the trigger and update the row in country with population = 1000.)
1: execute query
2: semantic checking – Insert query will check the rules specified in SQL 99 for
validity.
Flow Explanations:
1 & 2: When the server calls execute query, Insert Query will check for:
Table existence
Column existence
The column list should be unique (if any).
The values specified (if any) should be valid according to the column data types.
Column indexes should be valid.
The columns and values specified should be equal in number.
3: Insert query will interact with Session to check the privileges of the user to execute
insert query
4: Trigger executer will execute the triggers applied for execution before the insert
operation.
6: Constraint verifier will check all the constraints applied on the table on the newly
inserted record.
For a newly inserted record, Verifier will ensure that it doesn't have null values in
not-null columns, that primary and unique columns have values distinct from other rows
of the table, and that referencing columns and columns of check constraints have valid
values.
7: Trigger executer will execute the triggers applied for execution after the insert
operation.
8: Session will save the newly inserted record and record modified through triggers and
constraints to the physical storage by interacting with Data Store
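The semantic checks of steps 1 & 2 can be sketched as a small validator. All names below (check_insert, the catalog shape) are hypothetical illustrations, not Daffodil DB APIs.

```python
# Hypothetical sketch of the semantic checks an insert query performs:
# table/column existence, a duplicate-free column list, and matching
# numbers of columns and values.
def check_insert(catalog, table, columns, values):
    if table not in catalog:
        raise ValueError("table does not exist")
    known = catalog[table]
    for col in columns:
        if col not in known:
            raise ValueError(f"column {col} does not exist")
    if len(set(columns)) != len(columns):
        raise ValueError("column list must be unique")
    if len(columns) != len(values):
        raise ValueError("columns and values differ in number")
    return True

# Toy catalog: table name -> known column names.
catalog = {"country": ["cname", "cid", "population"]}
print(check_insert(catalog, "country", ["cname", "cid"], ["India", 1]))  # True
```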
1: execute query
2: check rules of SQL 99 – Update query will check the rules specified in SQL 99 for
validity.
3: check update privileges – Update query will interact with Session to check the
privileges for update.
4: navigate rows of table – evaluate the update condition for all rows of the table.
5: evaluate update condition
1 & 2: When server calls for execute query, Update Query will check for:
3: Update query will interact with the session to check the privileges for update.
4: Update query will navigate all rows of the table to identify the rows to update.
To evaluate the condition optimally, we will have to make use of the DQL execution plan.
5: Evaluate the condition specified in the update condition for the row retrieved from
Session.
To evaluate the condition optimally, we will have to make use of the DQL execution plan.
6: Trigger executer will execute the triggers applied for execution before the update
operation.
7: Update query will assign the values to columns using the current values if an
expression is given for the column and will pass the values for update to session.
8: Constraint verifier will check the constraints according to the columns modified. In
other words, if a constraint is applicable on the modified column then only it will be
checked.
If a column referenced by some other table as foreign key is modified, then the update
rule of foreign constraints is considered to validate the modification of column value.
9: Trigger executer will execute the triggers applied for execution after the update
operation.
10: Session will save the affected records and records modified through triggers and
constraints to the physical storage by interacting with Data Store
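Step 7, computing new column values from expressions over the current row (e.g. SET population = population + 1), can be sketched as follows; apply_set_clause and the clause shape are hypothetical.

```python
# Hypothetical sketch of step 7: new column values are calculated from
# expressions evaluated against the current (old) row values.
def apply_set_clause(row, set_clause):
    # set_clause maps column name -> callable taking the current row
    new_row = dict(row)
    for col, expr in set_clause.items():
        new_row[col] = expr(row)  # expressions see the old row values
    return new_row

row = {"cname": "India", "population": 100}
updated = apply_set_clause(row, {"population": lambda r: r["population"] + 1})
print(updated)  # {'cname': 'India', 'population': 101}
```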
1: execute query
2: check rules of SQL 99 – Delete query will check the rules specified in SQL 99 for
validity.
3: check delete privileges – Delete query will interact with Session to check
privileges for delete.
4: navigate rows of table – evaluate the delete condition for all rows of the table.
5: evaluate delete condition
Flow Explanations:
1 & 2: When server calls for execute query, Delete Query will check for:
Table existence
3: Delete query will interact with session to check privileges for delete.
4: Delete query will navigate all rows of the table to identify the rows to delete.
To evaluate the condition optimally, we will have to make use of the DQL execution plan.
5: Evaluate the condition specified in the delete query for the row retrieved from
Session.
To evaluate the condition optimally, we will have to make use of the DQL execution plan.
6: Trigger executer will execute the triggers applied for execution before the delete
operation.
9: Trigger executer will execute the triggers applied for execution after the delete
operation.
10: Session will interact with Data Store to remove the records from physical storage.
Session will save the records modified or inserted due to execution of triggers or
constraints.
1: verify constraint
Flow Explanations:
1 & 2: When the constraint is passed on to Constraint Verifier, it will check for null
values of the columns having not null constraint. If any of the columns has null value,
verifier will throw an exception.
3: Verifier will evaluate the check constraints condition for the record passed for
verification. If any of the condition is not met, verifier will throw an exception.
4: Verifier will check the value of the primary key columns in the record passed,
ensuring it is not null and is distinct in the table. It will throw an exception if a
column has a null value or a duplicate value.
6: Verifier will check that the referencing columns of the foreign key either hold
null values or hold values that exist in a row of the referenced table.
In case of update, if a column of the table is being referred as foreign key in another
table, then the update rule of referencing constraints of referencing table is considered
for validation. If the update rule doesn't allow for changing the value, then an exception
is thrown.
In case of delete, if a column of the table is being referred as foreign key in another
table, then the delete rule of referencing constraints of referencing table is considered
for validation. If the delete rule doesn't allow for deletion with a row referring to the value
of the row being deleted, then an exception is thrown.
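The verification order above (not-null columns, then check conditions, then primary-key uniqueness) can be sketched as below. verify_record is a hypothetical name, and the referencing-column and update/delete rule checks are omitted for brevity.

```python
# Hypothetical sketch of the constraint verifier's checking order:
# 1) not-null columns, 2) check conditions, 3) primary-key uniqueness.
def verify_record(record, not_null_cols, checks, primary_cols, existing_keys):
    for col in not_null_cols:
        if record.get(col) is None:
            raise ValueError(f"null value in not-null column {col}")
    for cond in checks:
        if not cond(record):
            raise ValueError("check constraint violated")
    key = tuple(record[c] for c in primary_cols)
    if None in key or key in existing_keys:
        raise ValueError("primary key is null or duplicate")
    existing_keys.add(key)
    return True

keys = set()
verify_record({"cid": 1, "pop": 5}, ["cid"], [lambda r: r["pop"] >= 0], ["cid"], keys)
print(keys)  # {(1,)}
```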
1: execute triggers
Flow Explanations:
1 & 2: Trigger executer will evaluate the trigger execution condition using the record
passed in the execute triggers step. The trigger is executed only if the record
satisfies the condition.
3: Trigger executer will start a new sub-transaction under the current transaction so
that triggered SQL query can be executed independently of the current transaction.
4: Trigger executer will execute all the SQL queries specified in the trigger.
5: On successful execution of all SQL queries, trigger executer will commit the sub-
transaction so that the changes done by SQL query become part of the current
transaction.
If there is an error in any of the SQL queries, trigger executer will roll back the
sub-transaction and throw a trigger execution exception.
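The commit-on-success / rollback-on-error sub-transaction behaviour can be illustrated with SQL savepoints, here via Python's sqlite3 module as a stand-in engine (not Daffodil DB).

```python
import sqlite3

# Sketch: triggered statements run under a savepoint so a failure undoes
# only the triggered work, not the enclosing transaction.
con = sqlite3.connect(":memory:")
con.isolation_level = None            # manage transactions manually
con.execute("CREATE TABLE t (v INTEGER)")
con.execute("BEGIN")
con.execute("INSERT INTO t VALUES (1)")   # work of the outer transaction
con.execute("SAVEPOINT trig")             # start the sub-transaction
try:
    con.execute("INSERT INTO t VALUES (2)")
    con.execute("INSERT INTO t (nope) VALUES (3)")  # fails: no such column
    con.execute("RELEASE trig")           # commit sub-transaction on success
except sqlite3.OperationalError:
    con.execute("ROLLBACK TO trig")       # undo only the triggered work
    con.execute("RELEASE trig")
con.execute("COMMIT")

rows = con.execute("SELECT v FROM t").fetchall()
print(rows)  # [(1,)]
```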
ConstraintSystem, ConstraintDatabase, ConstraintTable
ConstraintDatabase (Interface)
ConstraintDatabase is responsible for managing constraint tables. It ensures that a
single instance of a constraint table is active in the database.
ConstraintTable (Interface)
ConstraintTable is responsible for checking the constraints applied on the tables. It
checks for constraint on insert, update and delete operations.
ConstraintTableImpl, ReferencingConstraints, ReferencingVerifier,
MatchFullReferencingExecuter, MatchPartialReferencingExecuter,
UpdateCascadeReferencedExecuter, UpdatePartialCascadeReferencedExecuter,
SetNullReferencedExecuter, SetDefaultReferencedExecuter, ConstraintStore
Classes:
ConstraintSystemImpl ( )
ConstraintSystemImpl provides the implementation of Constraint System interface.
ConstraintTableImpl ( )
ConstraintTableImpl provides the implementation of the ConstraintTable interface.
CheckVerifier ( )
CheckVerifier is responsible for verifying all the check constraints applied on a
table. Check constraints are verified on insert and update operations.
ReferencedVerifier ( )
ReferencedVerifier is responsible for verifying the foreign key constraint when a row in
the parent table is updated or deleted. All the foreign key constraints referencing the
parent table are verified.
UniqueVerifier ( )
UniqueVerifier is responsible for verifying all the unique constraints and the primary
constraint applied on the table. Primary and unique constraints are verified on insert
and update operations in the table.
ReferencingVerifier ( )
ReferencingVerifier is responsible for verifying all the foreign key constraints applied on
a table. It uses the ReferencingExecuter instances to verify the constraint according to
the match option specified.
ConstraintStore ( )
ConstraintStore stores information about the constraint condition, constraint columns
etc. It also manages the navigator taken on the constraint condition which is used to
evaluate the constraint.
ReferencedConstraints ( )
ReferencedConstraints is responsible for verifying the foreign key constraint in the child
table when a delete or update is performed in the parent table.
ReferencingConstraints ( )
ReferencingConstraint is responsible for verifying the foreign key value in the child table.
Referencing constraint verifies the foreign key on insert and update operation. It uses
ReferencingExecuter to verify the foreign key constraint.
ReferencedExecuter ( )
ReferencedExecuter is an abstract class providing the functionality to verify the foreign
key constraint's update or delete rule when a row in the parent table is updated or
deleted.
MatchSimpleReferencingExecuter ( )
MatchSimpleReferencingExecuter is responsible for checking the foreign key value of
the child table in the parent table when the match type Simple is specified in foreign key
constraint.
MatchPartialReferencingExecuter ( )
MatchPartialReferencingExecuter is responsible for checking the foreign key value of the
child table in the parent table when the match type Partial is specified in foreign key
constraint.
MatchFullReferencingExecuter ( )
MatchFullReferencingExecuter is responsible for checking the foreign key value of the
child table in the parent table when the match type Full is specified in foreign key
constraint.
DeleteCascadeReferencedExecuter ( )
DeleteCascadeReferencedExecuter is responsible for deleting the dependent rows in
the child table when a row of parent table is deleted and child table is having cascade
option on delete in its foreign key constraint. It handles all the match types
(Simple, Full and Partial).
UpdateCascadeReferencedExecuter ( )
UpdateCascadeReferencedExecuter is responsible for updating the dependent rows in
the child table when a row in the parent table is updated and foreign key constraint's
update rule specifies CASCADE option. This class handles Simple and Full match
option.
RestrictReferencedExecuter ( )
RestrictReferencedExecuter is responsible for verifying the dependent rows in the child
table when a row in the parent table is updated or deleted and foreign key constraint
specifies a RESTRICT option. This class throws an exception when there are dependent
rows in the child table.
UpdatePartialCascadeReferencedExecuter ( )
UpdatePartialCascadeExecuter is responsible for updating the dependent rows of the
child table when a row in the parent table is updated and foreign key constraint's update
rule specifies CASCADE option and match type is Partial.
SetNullReferencedExecuter ( )
SetNullReferencedExecuter is responsible for updating the dependent rows of the child
table with Null value when the parent table row is updated or deleted and update/delete
rule of the foreign key constraint specify the SET NULL option.
SetDefaultReferencedExecuter ( )
SetDefaultReferencedExecuter is responsible for updating the dependent rows of the
child table with a default value when the parent table's row is updated or deleted and
update/delete rule of foreign key constraint specifies SET DEFAULT option.
CLASS DIAGRAMS:
Class Diagram: Interfaces
TriggerSystem, TriggerDatabase, TriggerTable
Classes:
TriggerSystem (Interface)
TriggerSystem is responsible for managing the trigger database which is active in the
server. It ensures that only a single instance of trigger database is used in the server.
TriggerDatabase (Interface)
TriggerDatabase is responsible for managing the trigger table in a database. It ensures
that only a single instance of a trigger table is active in the database.
TriggerSystemImpl, TriggerTableImpl, StateChange
Classes:
TriggerSystemImpl ( )
TriggerSystemImpl provides the implementation of TriggerSystem interface.
TriggerDatabaseImpl ( )
TriggerDatabaseImpl provides implementation of TriggerDatabase.
StateChange ( )
StateChange stores the information about the trigger fired on the table. It is used to
avoid the recursion of triggers in a data modification statement execution.
CLASS DIAGRAMS:
Class Diagram: Classes
DefaultValues, ValuesConstructor, UpdateStatement
Classes:
InsertStatement ( )
InsertStatement represents an SQL insert statement. InsertStatement has three options
for specifying the column values. Options are values constructor, default and sub-query.
UpdateStatement ( )
UpdateStatement represents an SQL update statement. Update statement consists of a
column values list and where clause. Column values list specify the columns and
expressions for the column values. Where clause specifies the condition according to
which the records are to be updated.
InsertExecuterSimple ( )
InsertExecuterSimple is responsible for executing SQL insert statement with values
constructor. InsertExecuterSimple inserts a new row in the table with the values
specified in values constructor.
InsertExecuterDefault ( )
InsertExecuterDefault is responsible for executing the insert statement with default
clause. InsertExecuterDefault inserts a new row in the table and the columns of the row
have default values.
InsertExecuterSubQuery ( )
InsertExecuterSubQuery is responsible for executing an insert statement with select
sub-query. InsertExecuterSubQuery is responsible for executing the select query
represented by SubQuery and inserting the rows in the table by navigating the rows
returned by the navigator of the select query.
UpdateStatementExecuter ( )
UpdateStatementExecuter is responsible for executing the update statement. It takes a
navigator for the "where clause" of the update statement and updates all the rows of the
navigator with the values calculated from the value expression specified in column
values list.
DeleteExecuter ( )
DeleteExecuter is responsible for deleting the records from a table according to the
delete statement. DeleteExecuter makes a navigator according to the where clause of
delete statement and then deletes all the records of the navigator.
WhereClause ( )
Where clause represents the Boolean condition which must be satisfied by the result of
a select query.
SubQuery ( )
SubQuery represents a query which is used in value expression or Boolean condition.
ValueExpression (Interface)
ValueExpression represents an expression. It can be a simple column, a numeric
expression, string expression or a mathematical function.
DefaultValues ( )
DefaultValues represents default clause of SQL insert statement.
ValuesConstructor ( )
ValuesConstructor represents the columns and values of the insert statement. Columns
are represented using their name and values can be literals or expressions.
DDL will be responsible for handling all types of DDL queries. DDL will interact with
Session to store the meta-data about the objects. DDL will interact with Session to
create, drop and alter objects in the physical storage.
Database – Creation/Deletion
Schema – Creation/Deletion
Table – Creation/Deletion/Alteration
Index – Creation/Deletion
View – Creation/Deletion
Trigger – Creation/Deletion
Procedure – Creation/Deletion
Domains – Creation/Deletion/Alteration
User – Creation/Deletion/Alteration
Roles – Creation/Deletion
Privilege – Grant/Revoke
Create Object:
DDL query will first of all check the privileges of the user for creating a new
object. DDL query will check the SQL 99 rules corresponding to the object. DDL will
create meta-data for the object. All the information specified in the query will be
loaded into the meta-data. Meta-data can contain meta-data of sub-objects. DDL query
will make sure that no other object with the same name is present in the database.
DDL will assign names to the sub-objects whose names have not been given by the user.
DDL will save the content of the meta-data in the system tables of the database. DDL
will also grant all privileges associated with the object to the current user. In the
case of tables and indexes, DDL will interact with Session to allocate space in the
physical storage for the object. In the case of a table, DDL will create indexes on
all primary and unique constraint column(s).
Drop Object:
DDL query will first of all check the privileges of the user for dropping an object. A
drop behaviour (Restrict/Cascade) is associated with a DDL drop query. If the Restrict
drop behaviour is specified, DDL will check for dependent objects and will drop the
object only when there are no dependent objects; otherwise DDL will throw an error. In
the case of Cascade behaviour, DDL will drop the object along with its dependent
objects. DDL will delete the meta-data of the object and its sub-objects from the
system tables. DDL will also revoke all the privileges associated with the object and
its dependent objects. In the case of a table, DDL will drop all the triggers and
indexes created on the table and, with the help of Session, the space occupied by the
table and its indexes in the physical storage will be de-allocated.
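The Restrict/Cascade drop behaviour can be sketched over a simple dependency map; drop_object and the map's shape are hypothetical illustrations, not Daffodil DB structures.

```python
# Hypothetical sketch of drop behaviour: a dependency map records, for each
# object, the objects that depend on it. RESTRICT refuses the drop when
# dependents exist; CASCADE drops the object and its dependents recursively.
def drop_object(obj, deps, behaviour):
    dependents = deps.get(obj, [])
    if behaviour == "RESTRICT" and dependents:
        raise ValueError("cannot drop: dependent objects exist")
    dropped = [obj]
    if behaviour == "CASCADE":
        for d in dependents:
            dropped += drop_object(d, deps, "CASCADE")
    return dropped

deps = {"country": ["state"], "state": []}
print(drop_object("country", deps, "CASCADE"))  # ['country', 'state']
```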
Revoke privileges:
DDL query will check the user's privileges for revoking privileges from other users. A
drop behaviour is also associated with revoke privileges. If Restrict behaviour is
specified and there is any dependency between privileges, an error is thrown. If
Cascade is specified, then dependent privileges are also revoked. Dependent privileges
are calculated recursively. After locating the dependent privileges, DDL will delete
the meta-data of all the privileges, along with the dependent privileges, from the
system tables. A single revoke statement can be used for revoking multiple privileges
on an object from multiple users.
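The recursive calculation of dependent privileges can be sketched as below; dependent_privileges and the grant-map shape are hypothetical, meant only to show the recursion over re-granted privileges.

```python
# Hypothetical sketch of REVOKE ... CASCADE: grants form a graph mapping a
# grantee to the users they have re-granted the privilege to; dependents
# are collected recursively.
def dependent_privileges(grantee, grants):
    result = []
    for user in grants.get(grantee, []):
        result.append(user)
        result += dependent_privileges(user, grants)
    return result

grants = {"alice": ["bob"], "bob": ["carol"]}
print(dependent_privileges("alice", grants))  # ['bob', 'carol']
```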
1: create object
2: check semantic – DDL query will check the semantics of the query.
3: commit current transaction – DDL query will commit the current transaction before
execution.
4: check privileges – DDL query will check the privileges of the user for object
creation.
5: verify object uniqueness – DDL query will check if an object already exists with
the same name.
6: save meta-data of the object – DDL query will save the meta-data about the object
in the database.
7: grant privileges on new object – DDL query will grant the necessary privileges to
the user for accessing/modifying the object.
8: allocate space for object – DDL query will interact with Session to allocate space
for the object (if required).
Flow Explanations:
1: User gives a call to the system by executing a query through the server for the
creation of objects like table, view, schema, index, database etc.
2: DDL Object Creation Query will check the semantic of the query before executing it.
Semantic checking involves checking of rules specified in SQL 99 for object creation.
Rules differ according to the type of object created, but some rules are common, such
as naming of objects, uniqueness of sub-constituents like columns, and information
required according to the sub-constituents' specification and type.
4: DDL query will check the privileges of the current user for creating the object. The
current user must be the owner of the schema in which the object is created. If a new
schema is created, then the user must be a valid user.
5: DDL query will ensure that no other object with same name is existing in the
schema. Query will also check the other dependencies as specified in SQL 99 for the
object.
6: DDL query will save the meta data about the object in the database by interacting
with Session.
7: Query will grant privileges on the newly created object to the current user for
accessing/modifying the object. The current user will be allowed to grant these privileges
to other users for accessing/modifying the data. Other users will not be able to modify
the definition of the object.
8: DDL query will interact with session to allocate space for object, if required (Like in
case of table, index) in Data Store.
DDL drop query can be used for deleting objects from the database. DDL drop query
has a drop behaviour associated with it:
1. Cascade - all the dependent objects associated with the current object are dropped
from the database along with it.
2. Restrict - the object is dropped only if there are no dependent objects; otherwise an
error is thrown.
1: execute query
2: DDL drop object query will check whether any drop behaviour is specified and the
specified object exists in the database.
3: Before executing the DDL query, query will commit the current transaction so that
user data is saved.
4: DDL query will make sure that user executing the query is the owner of the object.
No other user has the privileges for dropping an object.
5: DDL query will find the dependent objects of the dropping object and on the basis of
drop behaviour specified will take the appropriate action.
6: DDL query will delete all the meta-data of the dropped object stored in the database.
7: DDL query will revoke all the privileges given on the dropping object from all the
users of the database.
8: DDL will interact with session to free the space used by the object in the datastore
so that free space can be used by other objects.
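The drop behaviour described above can be sketched as follows (an illustrative model with a hypothetical catalog structure, not the actual implementation):

```python
# Illustrative model: catalog maps an object name to its dependent objects.
class DependentObjectsError(Exception):
    pass

def drop_object(catalog, name, behaviour="RESTRICT"):
    """Drop an object; with RESTRICT fail if dependents exist, with CASCADE
    drop the dependents too. Returns the list of dropped objects."""
    dependents = catalog.get(name, [])
    if behaviour == "RESTRICT" and dependents:
        raise DependentObjectsError(f"{name} has dependent objects: {dependents}")
    dropped = [name]
    for dep in dependents:                # CASCADE: drop dependents recursively
        dropped += drop_object(catalog, dep, "CASCADE")
    for obj in dropped:
        catalog.pop(obj, None)            # delete meta-data from system tables
    return dropped

catalog = {"t1": ["v1", "idx1"], "v1": [], "idx1": []}
dropped = drop_object(catalog, "t1", "CASCADE")   # ['t1', 'v1', 'idx1']
```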
Grant statement can be used for giving privileges to multiple users simultaneously by the
owner of the object or the user having privileges with "grant option".
Privileges are specific to object type. Some privileges are granted only to the owner of
the object. Privileges are granted by the owner to other users initially, but after that a
user with the grant option can give part of their privileges to other users.
1: execute query
6: save privileges in object meta-data
   (Grant statement will save the new privileges in the object meta-data)
2: Grant statement will check the privileges to be granted are specific to object type.
Grant statement will also check whether the specified users exist in the database.
3: Grant statement will check for grant privileges of the current user.
5: Grant statement will assign the privileges to all the users specified in the statement;
every listed user receives the same set of privileges.
6: Grant statement will save the new privileges in the meta-data of the object.
1: execute query
2: check semantic
   (Revoke statement will perform semantic checking of the query)
4: commit current transaction
   (Revoke statement will commit the current transaction)
5: find privileges granted to other users
   (Revoke statement will find the privileges granted by users to other users in a
   recursive manner)
6: remove privileges from object meta-data
   (Revoke statement will remove the privileges from the object meta-data)
Flow Explanations:
2: Revoke statement will check for object existence in the database and users from
which privileges are to be revoked. Revoke statement will also ensure that privileges
being revoked are allowed on the object.
3: Revoke statement will check that the current user has granted the privileges to the
users. Privileges granted by a user to another user can only be revoked by the granting
user; no other user is allowed to revoke them.
5: If the users specified in revoke statement have granted privileges to other users,
revoke statement will find all such privileges in recursive fashion.
6: Revoke statement will remove the privileges from the object meta data.
2: isSchemaValid
5: check semantic
Flow Explanation:
1: The object's schema will be the schema specified in the DDL query. In case schema
is not specified in the DDL query, the current schema of the transaction is assigned to
the object.
3: DDL query will ensure that expression has appropriate data type according to the
rule in which expression is used.
For example: Default clause for a column must have its data-type compatible with
column's data type.
4: DDL query will assign default names to the clauses which have an optional name and
for which the user hasn't given any value. DDL query will take the help of Session in
getting the default name.
For example: In case of view definition, semantic checking of the select query will be
done to make sure that select query of the view is a valid query.
CLASS DIAGRAMS:
Class Diagram (classes shown):
TableConstraint, UniqueConstraintDefinition, TableDefinition,
ReferentialConstraintDefinition, SchemaDefinition, ColumnConstraint,
ColumnDefinition, DefaultClause, DomainDefinition, DomainConstraint,
RoleDefinition, UserDefinition, IndexDefinition, SequenceDefinition,
SQL_invokedProcedure, GrantPrivilegeStatement, ViewDefinition, TriggerDefinition,
GrantRoleStatement, AddDomainConstraint, DropDomainConstraint,
SetDomainDefaultClause, AlterSequenceStatement, AlterDomainStatement,
AlterDomainAction, DropDomainDefaultClause, AlterUserStatement,
AddColumnDefinition, AlterColumnDefinition, AlterColumnAction,
DropColumnDefinition, AlterTableStatement, AddTableConstraint, AlterTableAction,
SetColumnDefaultClause, DropColumnDefaultClause, DropTableConstraint,
DropDatabaseStatement, DropDomainStatement, DropIndexStatement,
DropSequenceStatement, DropUserStatement, RevokePrivilegesStatement,
RevokeRoleStatement.
AlterTableAction (Interface)
AlterTableAction represents a part of SQL query to make table level alteration in an
existing table.
AlterDomainAction (Interface)
AlterDomainAction represents a part of SQL query which is used to make alteration in an
existing domain. Alterations allowed are: adding or dropping of domain constraint and
setting or resetting the default clause of the domain.
AlterColumnAction (Interface)
AlterColumnAction represents a part of SQL query used for column level alteration in an
existing table.
SchemaDefinition ( )
SchemaDefinition represents an SQL query to create a new schema in the database.
Existing users can create the new schema. SchemaDefinition can contain other definition
statements to create objects in the schema.
TableDefinition ( )
TableDefinition represents an SQL query to create a new table in the database.
ColumnDefinition ( )
ColumnDefinition represents a column of the table. It includes column name, column
data type, default clause (if any) and column constraints (if any).
CheckConstraintDefinition ( )
CheckConstraintDefinition represents an SQL query for creating a new check constraint
in a table. CheckConstraintDefinition includes its name, characteristics and condition of
the constraint.
TableConstraint (Interface)
TableConstraint represents a constraint applied on the table level. It is usually used
when multiple columns are involved in the constraint.
UniqueConstraintDefinition ( )
UniqueConstraintDefinition represents a part of SQL query to create a new (unique or
primary) constraint in the table.
ReferentialConstraintDefinition ( )
ReferentialConstraintDefinition represents an SQL query which is used to create a new
referential constraint in an existing table.
DefaultClause ( )
DefaultClause represents the default value for a data type. It’s used in column definition
and domain definition to represent default value.
ColumnConstraint (Interface)
ColumnConstraint represents a constraint applied on a single column. ColumnConstraint
can be of any constraint that is allowed.
DomainDefinition ( )
DomainDefinition represents an SQL query to create a new domain in the database.
DomainConstraint ( )
DomainConstraint represents a constraint applied on the domain. Domain constraint is
basically a check constraint.
GrantPrivilegeStatement ( )
GrantPrivilegeStatement represents an SQL query to assign privileges to users on the
objects of the database. Statement can be used to assign privileges to multiple users
simultaneously.
GrantRoleStatement ( )
GrantRoleStatement represents an SQL query to assign roles to existing users.
IndexDefinition ( )
IndexDefinition represents an SQL query to create a new index on an existing table.
RoleDefinition ( )
RoleDefinition represents an SQL query to create a new role in the database.
SequenceDefinition ( )
SequenceDefinition represents an SQL query to create a new sequence in the database.
TriggerDefinition ( )
TriggerDefinition represents an SQL query to create a new trigger on the table.
ViewDefinition ( )
ViewDefinition represents an SQL query to create a new view in the database.
UserDefinition ( )
UserDefinition represents an SQL query to create a new user in the database. Only
database owner can create new users.
AlterSequenceStatement ( )
AlterSequenceStatement represents an SQL query to alter an existing sequence.
AlterUserStatement ( )
AlterUserStatement represents an SQL query to alter an existing user. Statement can be
used to alter the password for a user.
AlterDomainStatement ( )
AlterDomainStatement represents an SQL query to alter an existing domain.
AlterTableStatement ( )
AlterTableStatement represents an SQL query to alter an existing table.
AddTableConstraint ( )
AddTableConstraint represents a part of SQL query which is used to add a new table
constraint in an existing table.
DropTableConstraint ( )
DropTableConstraint represents a part of SQL query which is used to delete an existing
table constraint from a table.
AddDomainConstraint ( )
AddDomainConstraint represents a part of SQL query which is used to add a new
domain constraint to an existing domain.
AddColumnDefinition ( )
AddColumnDefinition represents a part of SQL query which is used to add a new column
in an existing table.
DropDomainConstraint ( )
DropDomainConstraint represents a part of SQL query which is used to drop a domain
constraint on an existing domain.
DropDomainDefaultClause ( )
DropDomainDefaultClause represents a part of SQL query which is used to delete the
default clause of an existing domain.
DropColumnDefaultClause ( )
DropColumnDefaultClause represents a part of SQL query which is used to reset the
default value of a column of an existing table.
DropColumnDefinition ( )
DropColumnDefinition represents a part of SQL query which is used to drop a column
from an existing table. Any index(es) created on the column are also dropped from the
database.
SQL_invokedProcedure ( )
SQL_invokedProcedure represents an SQL query to create a new procedure in the
database.
DropDatabaseStatement ( )
DropDatabaseStatement represents an SQL query to drop an existing database.
DropDomainStatement ( )
DropDomainStatement represents an SQL query to delete a domain from the database.
DropIndexStatement ( )
DropIndexStatement represents an SQL query to delete an existing index in an existing
table.
DropRoutineStatement ( )
DropRoutineStatement represents an SQL query to delete an existing routine from the
database.
DropSchemaStatement ( )
DropSchemaStatement represents an SQL query to delete an existing schema from the
database.
DropTableStatement ( )
DropTableStatement represents an SQL query to delete an existing table from database.
DropTriggerStatement ( )
DropTriggerStatement represents an SQL query to delete an existing trigger from a
table.
DropViewStatement ( )
DropViewStatement represents an SQL query to delete an existing view from the
database.
DropSequenceStatement ( )
DropSequenceStatement represents an SQL query to delete an existing sequence from
the database.
RevokeRoleStatement ( )
RevokeRoleStatement represents an SQL query which is used to revoke assigned role
from existing users in the database. Role can be revoked from multiple users
simultaneously.
RevokePrivilegesStatement ( )
RevokePrivilegesStatement represents an SQL query which is used to revoke the
assigned privileges from users on an existing object. Privileges from multiple users can
be revoked simultaneously.
DropUserStatement ( )
DropUserStatement represents an SQL query to delete an existing user from the
database.
SetColumnDefaultClause ( )
SetColumnDefaultClause represents a part of SQL query which is used to set default
clause in the column of an existing table.
SetDomainDefaultClause ( )
SetDomainDefaultClause represents an SQL query which is used to set a default clause
in an existing domain.
CLASS DIAGRAMS:
Class Diagram: Classes
All classes implement the Descriptor interface. Classes shown in the diagram:
Descriptor, ReferentialConstraintDescriptor, DomainDescriptor,
DomainConstraintDescriptor, IndexDescriptor, RoutineDescriptor,
ParametersDescriptor, IndexColumnDescriptor, SchemaDescriptor, TriggerDescriptor,
ViewDescriptor, SequenceDescriptor, RoleAuthorizationDescriptor, RoleDescriptor,
TablePrivilegeDescriptor, PrivilegeDescriptor, RoutinePrivilegeDescriptor,
UsagePrivilegeDescriptor, ColumnPrivilegeDescriptor.
Classes:
Descriptor (Interface)
Descriptor is responsible for storing and retrieving information of objects from the system
tables.
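As an illustration of this contract (hypothetical method names and an in-memory storage model, not the actual Daffodil DB API):

```python
from abc import ABC, abstractmethod

# Hypothetical sketch: a Descriptor persists one kind of object's meta-data.
class Descriptor(ABC):
    @abstractmethod
    def save(self, system_tables):
        """Insert the object's meta-data rows into the system tables."""
    @abstractmethod
    def delete(self, system_tables):
        """Remove the object's meta-data rows from the system tables."""

class SequenceDescriptor(Descriptor):
    def __init__(self, name, start=1, increment=1):
        self.name, self.start, self.increment = name, start, increment
    def save(self, system_tables):
        system_tables.setdefault("sequences", {})[self.name] = (self.start, self.increment)
    def delete(self, system_tables):
        system_tables.get("sequences", {}).pop(self.name, None)

system_tables = {}
SequenceDescriptor("seq1", start=100).save(system_tables)
```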
CheckConstraintDescriptor ( )
CheckConstraintDescriptor is responsible for storing (insert/delete) and retrieving
information about the check constraint from the system tables.
PrivilegeDescriptor ( )
PrivilegeDescriptor is an abstract class responsible for storing and retrieving information
about privileges from system tables.
RoleDescriptor ( )
RoleDescriptor is responsible for storing and retrieving information about the role from
the system tables.
RoutineDescriptor ( )
RoutineDescriptor is responsible for storing and retrieving information about the routine
from the system tables. RoutineDescriptor also manages its parameter information
through ParameterDescriptor.
RoutinePrivilegeDescriptor ( )
RoutinePrivilegeDescriptor is responsible for storing and retrieving information about
privileges given on routine from the system tables.
SchemaDescriptor ( )
SchemaDescriptor is responsible for storing and retrieving information about the schema
from the system tables.
SequenceDescriptor ( )
SequenceDescriptor is responsible for storing and retrieving information about the
sequence from the system tables.
TableConstraintDescriptor ( )
TableConstraintDescriptor is responsible for storing and retrieving information about the
table constraints from the system tables.
ColumnDescriptor ( )
ColumnDescriptor is responsible for storing and retrieving information about the column
from the system tables. ColumnDescriptor also manages its DataTypeDescriptor.
ColumnPrivilegeDescriptor ( )
ColumnPrivilegeDescriptor is responsible for storing and retrieving information related to
privilege given on a column from the system tables.
ConstraintDescriptor ( )
ConstraintDescriptor is an abstract class responsible for storing and retrieving
information related to constraints from system tables.
DataTypeDescriptor ( )
DataTypeDescriptor is responsible for storing and retrieving information related to data
type of a column or domain in the system tables.
DomainConstraintDescriptor ( )
DomainConstraintDescriptor is responsible for storing and retrieving information about
domain constraints from the system tables.
IndexDescriptor ( )
IndexDescriptor is responsible for storing and retrieving the information about the index
from system tables. IndexDescriptor also manages its column information through
IndexColumnDescriptors.
IndexColumnDescriptor ( )
IndexColumnDescriptor is responsible for storing and retrieving information about the
index columns from the system tables.
KeyColumnUsageDescriptor ( )
KeyColumnUsageDescriptor is responsible for storing and retrieving information about
the columns involved in constraints (unique and foreign key) from the system tables.
ParametersDescriptor ( )
ParametersDescriptor is responsible for storing and retrieving information about the
parameters used in a routine from the system tables.
ReferentialConstraintDescriptor ( )
ReferentialConstraintDescriptor is responsible for storing and retrieving information
about the referential constraint from the system tables.
This class also manages the information about the columns through
KeyColumnUsageDescriptors.
UniqueConstraintDescriptor ( )
UniqueConstraintDescriptor is responsible for storing and retrieving information about
the unique constraint from the system tables. UniqueConstraintDescriptor is used to store
unique as well as primary constraint. It also manages information about the columns
through KeyColumnUsageDescriptor.
TableDescriptor ( )
TableDescriptor is responsible for storing and retrieving information about the table from
the system tables. TableDescriptor also manages column and constraint descriptors.
TablePrivilegeDescriptor ( )
TablePrivilegeDescriptor is responsible for storing and retrieving information about the
privilege given on a table from the system tables.
TriggerDescriptor ( )
TriggerDescriptor is responsible for storing and retrieving information about the trigger
from the system tables.
UsagePrivilegeDescriptor ( )
UsagePrivilegeDescriptor is responsible for storing and retrieving information about the
privileges given on objects like domains, character sets etc. from the system tables.
DQL will be responsible for execution of all types of select queries. A select query can
be based on a single table or on multiple tables, and can reference tables as well as
views. Select queries can use cross joins, inner joins, outer joins, union, intersection
and the other features mentioned in the SQL 99 specification. DQL will perform semantic
checking of the query.
DQL query performance is crucial for the database server. DQL will take care that the
query is well optimized and gives its result in the best possible time.
Select query is used to retrieve a table-like set with specified columns according to some
criteria. Select query can be divided into:
Simple query [Single table query]
Joins [Cartesian of two or more tables]
Cross Join
Inner Join
Outer Join
Group by [Grouping of Rows according to grouping expression]
Set Operators
Union [Query1 result Union Query2 result]
Intersection [Query1 result intersect Query2 result]
Except [Query1 result Except/Difference Query2 result]
Views
Order by [sorting the result according to ordering expression]
Sub-query [Query using other query in where clause]
In case of a query containing set operators like union, intersect etc., each query
contained in the set will be solved individually, giving a set HQS[i] for each query. The
set operator is then applied on these sets to make a new set SQS. Finally, the order is
applied to sort the rows of SQS, giving a sorted set OSQS. The rows of OSQS are the
result of the select query.
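A minimal sketch of this pipeline, with rows modeled as plain Python values (hypothetical data):

```python
# Sketch of the pipeline: solve each query individually (HQS[i]), apply the
# set operator to get SQS, then sort to get OSQS.
def solve_set_query(hqs, operator, order_key):
    sets = [set(h) for h in hqs]       # HQS[i]: result of each individual query
    sqs = sets[0]
    for s in sets[1:]:
        if operator == "UNION":
            sqs = sqs | s
        elif operator == "INTERSECT":
            sqs = sqs & s
        elif operator == "EXCEPT":
            sqs = sqs - s
    return sorted(sqs, key=order_key)  # OSQS: SQS sorted by the order clause

hqs = [[3, 1, 2], [2, 3, 4]]           # results of the two individual queries
osqs = solve_set_query(hqs, "INTERSECT", order_key=lambda row: row)
# osqs == [2, 3]
```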
Handling of sub-query:
Sub-query is a select query used in where/join/having clause of another query. In this
case, sub-query will be executed using the above mentioned procedure. Set obtained by
executing the sub-query will be used to filter rows of the main query.
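A minimal sketch of this filtering, with hypothetical data:

```python
# Sketch: the sub-query's result set filters the rows of the main query.
orders = [(1, "brazil"), (2, "france"), (3, "brazil")]   # (orderid, shipcountry)

# Sub-query: select orderid from orders where shipcountry = 'brazil'
sub_result = {oid for oid, country in orders if country == "brazil"}

# Main query: select * from orders where orderid in (<sub-query>)
main_result = [row for row in orders if row[0] in sub_result]
# main_result == [(1, 'brazil'), (3, 'brazil')]
```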
For example, suppose we have two tables of 100 rows each; their Cartesian
product will have 10,000 rows if no “where condition” is applied on the individual
tables. Say that applying the “where condition” on these 10,000 rows leaves
4,000 rows as the result. Now suppose we instead apply the parts of the “where
condition” on the individual tables first, leaving, say, 80 and 60 rows; the
Cartesian product will then have 4,800 rows instead of 10,000. Applying the
remaining “where condition” (the part which cannot be shifted to the individual
tables), we will again have the same 4,000 rows as obtained above.
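The arithmetic in this example can be checked mechanically (the conditions below are made up to reproduce the 80-row and 60-row counts; this is not Daffodil DB code):

```python
from itertools import product

t1, t2 = list(range(100)), list(range(100))   # two 100-row tables
cond1 = lambda a: a < 80                      # pushable condition on table 1
cond2 = lambda b: b < 60                      # pushable condition on table 2
joint = lambda a, b: (a + b) % 2 == 0         # condition that cannot be pushed down

# Late filtering: Cartesian of 10000 rows, then the full where condition.
late = [(a, b) for a, b in product(t1, t2)
        if cond1(a) and cond2(b) and joint(a, b)]

# Early filtering: 80 x 60 = 4800 intermediate rows, then the remainder.
f1 = [a for a in t1 if cond1(a)]
f2 = [b for b in t2 if cond2(b)]
early = [(a, b) for a, b in product(f1, f2) if joint(a, b)]

assert len(f1) * len(f2) == 4800
assert late == early                          # same final result either way
```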
Views optimization:
If we have a query involving view and we have a condition applied on view, then
we should transfer this condition on the tables of the view’s query. This will result
in reduction of number of rows returned by view.
Also, if view doesn’t involve any set operator, group by and order by, then we can
even think of merging the tables and conditions of the view’s query with current
query and then executing the current query.
• If the condition involves only a single table, then we can evaluate the condition
on that table. If the condition involves two or more tables then it needs to be
checked whether we can split the condition into smaller parts which can be
solved on individual table. We can split the condition involving “AND” into two
parts easily and can solve the different parts on different tables.
• Smallest part of the condition is predicate. Each predicate reduces the number of
rows to some extent and has a cost for evaluating it. If a table has a number of
predicates to be evaluated on it, then we should evaluate the predicates on the
basis of cost and reduction factor. Predicate with the highest reduction factor and
smallest cost should be evaluated first.
• If we have to Cartesian more than two tables then we should start with the
Cartesian of tables having least number of rows.
• If we have a number of tables with similar number of rows then we should start
with the Cartesian of two tables whose join condition has the highest reduction
factor of rows.
• We can use index for solving the join condition. If we have index available on the
join columns of a table then we can seek the value of other table row in this index
to get the Cartesian rows.
• If more than one index is available on the columns involved in the condition then
the index containing maximum number of condition columns should be selected.
• We can also consider size of columns for selecting an index.
• If we have a condition and it can’t be solved by a single index then we should
split the condition, if the individual parts can be solved by indexes.
• We can also consider use of multiple indexes to solve the condition.
• We can remove redundant conditions involving greater than and less than
operator.
• We can remove the conditions involving “AND” and “OR” using the following
rules:
Predicate “OR” True --- True
Predicate “OR” False --- Predicate
Predicate “AND” True --- Predicate
Predicate “AND” False --- False
• We can replace predicate containing null value with “Unknown” value. For
example: a = null, a > null can be replaced with “Unknown” value.
• If order by clause contains columns of a single table only and we have index on
the columns of order by, then we can use the index to get the sorted results. If
the sorting of index columns and order by are opposite, even then we can use
the index. For example we have order by “a desc” and index available on “a” is in
ascending fashion then we can give the result by reversing the row sequence.
• If order by clause contains columns of more than one table then it needs to be
checked whether we can split the order in different parts belonging to individual
tables. If order can be split and indexes are available on these columns then we
should go for indexes.
• If group by clause contains columns of a single table only and we have index on
the columns of group by, then we can use the index to group the rows.
• If group by clause contains columns of more than one table then it needs to be
checked whether we can split the grouping columns in different parts belonging
to individual tables. If grouping columns can be split and indexes are available on
these columns then we should go for indexes.
• We can convert left outer join into inner join if a condition belonging to inner table
is present in “where condition”. For example, we have left outer join of A and B
table and a condition belonging to table B is present in where condition, then we
can convert left outer join into inner join.
• We can convert right outer join into inner join if a condition belonging to inner
table is present in “where condition”. For example, we have right outer join of A
and B table and a condition belonging to table A is present in where condition,
then we can convert right outer join into inner join.
• We can convert full outer join into inner join or simple outer join if a condition
belonging to either table is present in “where condition”. We have the following
cases:
o If condition is present on both the tables, we can convert full outer join
into inner join.
o If condition is present on inner table, we can convert full outer join into left
outer join.
o If condition is present on outer table, we can convert full outer join into
right outer join.
• In case of set operator, we can sort the result of individual queries on the
selected columns and use these sorted results to solve set operator. For
example, intersect of two sorted sets can be done more efficiently than intersect
of two un-sorted sets.
• In case of the union operator, if an order by clause is present then we can sort
the results of the individual queries on the order columns and use these sorted
results to get the overall sorted result. In case of union we have to give all the
rows, but since the individual results are already sorted on the order columns,
they can simply be merged to produce the final order.
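The merge-based evaluation over sorted inputs, described in the last two points, can be sketched as follows (a minimal sketch; rows are modeled as plain values):

```python
# Intersect two result sets that are already sorted on the selected columns:
# one linear merge pass instead of repeated scans of unsorted sets.
def sorted_intersect(xs, ys):
    out, i, j = [], 0, 0
    while i < len(xs) and j < len(ys):
        if xs[i] == ys[j]:
            out.append(xs[i])
            i += 1
            j += 1
        elif xs[i] < ys[j]:
            i += 1
        else:
            j += 1
    return out

# Sorted results of the two queries:
assert sorted_intersect([1, 3, 5, 7], [2, 3, 5, 8]) == [3, 5]
```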
Views optimization:
• If we have a condition involving view, then this condition should be solved at view
level. View will in turn check if the condition can be passed on the tables
involved. If possible view will solve the condition on the table.
• If the query of the view doesn’t contain any set operator, group by or aggregate
function, then we can solve the current query by merging the tables and where
condition of the view’s query with the current query. For example, given the query
“Select * from A inner join V on A.id = V.id” where V is defined as “Select * from B
where B.area > 500”, we can solve it as “Select * from A inner join B on A.id =
B.id where B.area > 500”.
• If order by is present in a query involving distinct clause then we use the order by
for solving distinct clause also.
• If order by is present in a query involving group by clause and order by column
doesn’t contain any aggregate function and grouping column and order column
sequence are same, then we can sort the result on order by before solving group
by and then we can make the groups.
• If order by is present in a query involving aggregate functions without group by,
then we can remove the order by clause. A query involving aggregate function
without group by returns a single row.
• If order by clause is present in the query and it can be solved on the table
involved, then we will be solving order by on the table itself. If some index is
available, then we will use that index otherwise we will sort the rows of the table
according to the order. In this manner we will be reducing the cost of sorting and
improving the query execution speed.
• If more than one table is involved and order is to be solved on the individual table
then during the Cartesian of tables we will be required to maintain the tables
sequence so that we get the order of rows as specified in the query.
• If we are using an index to solve condition or order on a table and all the selected
column of the query belonging to this table are present in the index, then we
should give the values of the columns from the index itself instead of the table. In
this manner we will be avoiding reading of the table from the physical file and this
will improve the speed of the query execution.
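The covering-index idea in the last point can be sketched as follows (the index layout and data are illustrative only):

```python
from bisect import bisect_left

# Covering index on emp(dept, salary): entries are kept sorted; each entry
# also carries a row id, but the query below never touches the table itself.
index = sorted([(10, 3000, "r1"), (20, 4000, "r2"),
                (10, 2500, "r3"), (10, 3500, "r4")])

# select salary from emp where dept = 10 order by salary
lo = bisect_left(index, (10,))      # first entry with dept >= 10
hi = bisect_left(index, (11,))      # first entry with dept >= 11
salaries = [salary for _dept, salary, _rowid in index[lo:hi]]
# Already in salary order thanks to the index ordering; no table read needed.
```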
Condition/query rewriting:
Solution:
Considering the details of select optimization, we will have to store conditions, sorting
info and index info for the individual tables as well as for joins of tables. We will have
to evaluate the different options for solving the query and choose a solution with the
best possible optimizations. Choosing a solution will involve comparing the different
options and selecting the option with minimum cost and maximum performance.
For solving a select query, we will have a sequence of steps to be performed to get the
resultset. The sequence of steps will involve the operations mentioned in “Select Query
Execution”. The sequence of steps can also be considered as a tree object whose
non-leaf nodes specify operations and whose leaf nodes represent the tables involved.
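The tree view of the step sequence can be sketched as follows (an illustrative model with hypothetical operator classes, not the actual plan representation):

```python
# Non-leaf nodes are operations, leaves are tables; executing the tree
# bottom-up yields the result set.
class Table:
    def __init__(self, name, rows):
        self.name, self.rows = name, rows
    def execute(self):
        return self.rows

class Filter:
    def __init__(self, child, predicate):
        self.child, self.predicate = child, predicate
    def execute(self):
        return [r for r in self.child.execute() if self.predicate(r)]

class Sort:
    def __init__(self, child, key):
        self.child, self.key = child, key
    def execute(self):
        return sorted(self.child.execute(), key=self.key)

# Plan tree for: select salary from emp where salary > 2000 order by salary
plan = Sort(Filter(Table("emp", [1000, 4000, 3000]), lambda s: s > 2000),
            key=lambda s: s)
```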
SELECT RESPONSIBILITIES
Glossary
1. Semantic Checking
a. Check for table existence
b. Check for table ambiguity
c. Check for column existence
d. Check for column ambiguity
e. Check that the select list contains only aggregate columns or columns
present in the group by clause, when either group by / aggregate
columns are present in the query.
f. Outer query columns are valid in the inner query, to support correlated
sub-queries.
g. Check the “on condition” scope management in case of qualified joins.
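As a small illustration of the column existence and ambiguity checks, a sketch of name resolution over hypothetical catalog data:

```python
# Hypothetical catalog: table name -> column names of the tables in the
# from clause (e.g. "select ... from orders, customers").
tables = {"orders": ["orderid", "customerid"],
          "customers": ["customerid", "name"]}

def resolve(column):
    """Resolve an unqualified column name against the from-clause tables."""
    owners = [t for t, cols in tables.items() if column in cols]
    if not owners:
        raise ValueError(f"invalid column name '{column}'")    # existence check
    if len(owners) > 1:
        raise ValueError(f"ambiguous column name '{column}'")  # ambiguity check
    return owners[0]

# 'customerid' appears in both tables, so an unqualified reference is
# ambiguous and resolve("customerid") raises an error.
```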
1. Semantic Checking
a. Check for table existence
Invalid:
select * from TT
Comment: No object called TT.
Valid:
Create Table TT(a int)
Select * from TT.
Invalid:
select * from emp, emp
Comment: Tables ‘emp’, ‘emp’ have the same exposed name.
Valid :
select * from emp ,emp e
Invalid:
select x from orders
Select * from orders where x = 56
Comment: invalid column name ‘x’
Invalid:
select customerid from orders , customers
Comment: Ambiguous column name, As both orders and customers
contain the same column customerid.
Valid:
Select orders.customerid , customers.customerid from orders,customers
Comment: This query works fine.
Invalid:
select orderid from orders group by customerid
Comment: does not work as the select list does not contain an aggregate
or one of the attributes in the group by.
f. Outer query columns are valid in the inner query to support Correlated
Sub Query.
Valid:
select * from orders o where 'brazil' =
(select shipcountry from orders p where p.orderid = o.orderid)
Valid:
Select * from A left outer join B left outer join C on B.id = C.id on A.id =
C.id
Invalid:
Select * from A left outer join B left outer join C on A.id = C.id on A.id =
B.id
Invalid:
select orderid, customerid from orders where orderid=shipname
Comment: Syntax error converting the nvarchar value to a column
of data type int.
Valid:
select orderid, customerid from orders where orderid = customerid
ii. Degree
Invalid:
Select * from emp where (salary, depno, empid) = (4000, 1)
Valid:
select * from emp where (salary, depno) = (4000, 1)
iii. Cardinality
Invalid:
select * from emp where empid = ( select empid from emp )
Comment: Because the inner query results in more than one row.
i. Check that the outer query columns are not used in “from subquery”
Invalid:
select * from (select * from emp where salary > 2000 and d.deptno =2 )
as e, dept as d order by e.salary DESC
Valid:
select * from (select * from emp where salary > 2000 ) as e,dept order by
e.salary DESC
Invalid:
Select max( min(salary)) from emp.
Valid:
Select max(salary) from emp.
Invalid:
select avg(shipcountry) from orders.
Comment: shipcountry is of type Char.
Valid:
select avg(orderid) from orders
l. Check that the having clause is permitted only when either group by or
some aggregate column is present
Invalid:
select depno ,avg(salary) from emp group by depno having avg(salary) >
3000 and depno=3 and empid=3
Comment: empid is neither included in the group by clause nor is it an
aggregate column.
Valid:
select depno, avg(salary) from emp group by depno having
avg(salary)>3000
Select Sum(salary) from emp having max(depno) > 10
Invalid:
select depno , sum(salary) from emp group by depno order by empid.
Comment: empid is neither included in the group by clause nor is it an
aggregate column.
n. An ordinal number in the order by clause is valid, but it must not exceed
the number of selected columns.
Invalid:
select empname,empid,salary from emp order by 5
Valid:
select empname,empid,salary from emp order by 3
Comment: works as the number of selected columns is 3.
Valid:
select depno no , empid id from emp order by id
Invalid:
select productid,Suppliers.supplierid as productid from products,suppliers
order by productid.
Comment: It is not clear which column to sort.
Valid:
select productid,Suppliers.supplierid as productid from products,suppliers
order by products.supplierid
q. Check that in the case of set operators the selected columns of each
query have a one-to-one correspondence: the queries must have the same
cardinality, and the corresponding data types must be comparable.
Invalid:
select productid from products
union
select supplierid, companyname from suppliers
Comment: The two queries do not have the same number of columns.
Valid:
select productid,productname from products
union
select supplierid, companyname from suppliers
Invalid:
select productid, productname from products
union
select supplierid, companyname from suppliers
order by supplierid
Comment: order by must refer to the column names of the first query.
valid:
select productid,productname from products
union
select supplierid, companyname from suppliers
order by productid
s. Check that in a view definition the columns are uniquely specified, and
that any aggregate, scalar or functional column in the select list has an
alias.
Invalid:
Create view V1 as (Select state.countryId , country.countryId from state ,
country)
Comment: the column countryId is ambiguous.
Invalid:
create view v2 as (select avg(salary), depno from emp group by depno)
Comment: the aggregate column avg(salary) has no alias.
Valid:
create view v2 (salary , dept_no )as (select avg(salary ) , depno from emp
group by depno)
Valid:
Create view v1 as select * from A.
Query:
select * from v1
Comment: If a user has rights on the view v1 but no rights on the table A,
the query still succeeds, because access is checked against the view.
w. Use the catalog name and schema name of the view for its underlying tables.
Suppose the definition of the view is Select * from A and the view is created in
schema S1; then table A must belong to schema S1.
x. If a natural join is specified, then a common column of the two sides must be
referenced only by its column name, not by its qualified name, in the query.
Invalid:
Select productname , products.supplierid from products natural join
suppliers
Comment: supplierid is the common column in products and suppliers.
Valid:
Select productname , supplierid from products natural join suppliers.
2. Condition
a. Rewriting of Condition
i. Merging of two conditions on same columns to single condition
1. A > 2 and A > 3 → A > 3
2. A > 3 and A < 3 → false
3. A = 3 and A = 4 → false
4. A = 3 and A > 4 → false
5. A = 4 and A >= 4 → A = 4
ii. For using index
1. Suppose the condition is b = 3 and a = 6 and an index exists on (a, b);
then the condition will be reordered as a = 6 and b = 3.
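The merging rules in (i) above can be sketched as a small routine (class and method names are illustrative, not Daffodil DB's actual code):

```java
// Illustrative sketch of rule (i): merging two comparisons on the same column.
public class ConditionMerger {

    /** Rule 1: "A > a and A > b" collapses to the tighter lower bound. */
    public static String mergeGt(String col, int a, int b) {
        return col + " > " + Math.max(a, b);
    }

    /** Rule 3: "A = a and A = b" is false unless the values are equal. */
    public static String mergeEq(String col, int a, int b) {
        return a == b ? col + " = " + a : "false";
    }

    /** Rule 2: "A > a and A < b" is false when the range is empty (a >= b). */
    public static String mergeGtLt(String col, int a, int b) {
        return a >= b ? "false" : col + " > " + a + " and " + col + " < " + b;
    }

    public static void main(String[] args) {
        System.out.println(mergeGt("A", 2, 3));   // A > 3
        System.out.println(mergeEq("A", 3, 4));   // false
        System.out.println(mergeGtLt("A", 3, 3)); // false
    }
}
```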
b. Shift the condition to lowest possible level.
i. Suppose Condition is A1 > 2 and B1 < 8 and A3 in (10,20,30),
then A1 > 2 and A3 in (10,20,30) will be shifted to Table A and B1
< 8 will be shifted to Table B.
ii. Suppose Condition is A1 = 7 and B2 > 9 then whole condition will
be treated as remaining condition.
iii. Suppose Condition is A1 = 7 and B1 = C1 and (b2 = 9 or A3 = 7)
then A1 = 7 will be shifted to Table A. B1 = C1 will be treated as
join condition of Table B and Table C. (b2 = 9 or A3 = 7) will be
treated as Remaining condition.
iv. Suppose Condition is (A1 = 8 and B1 >= 8) or (A2 < 9 and B3 !=
8) then A1 = 8 and A2 < 9 will be shifted to Table A. B1 >= 8 and
B3 != 8 will be shifted to Table B. And (A1 = 8 and B1 >= 8) or (A2
< 9 and B3 != 8) will be solved as remaining condition.
3. In Predicate
a. Condition like A In (2, 3, 4) should be written as A =
2 or A = 3 or A = 4
b. Condition like A Not In (2, 3, 4) should be written as
A != 2 and A != 3 and A != 4
c. Conditions like A.A1 in (Select C.C1 from C) should
be written as from A, C where A1 = C1.
d. Conditions like A.A1 not in (Select C.C1 from C)
should be written as from A, C where A1 != C1.
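The IN and NOT IN rewrites in (a) and (b) can be sketched as string rewrites (an illustrative helper; the engine works on parsed condition objects, not strings):

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch of rewriting IN / NOT IN predicates.
public class InPredicateRewriter {

    /** Rewrite "col IN (v1, v2, ...)" as "col = v1 or col = v2 or ...". */
    public static String rewriteIn(String col, List<Integer> values) {
        return values.stream()
                .map(v -> col + " = " + v)
                .collect(Collectors.joining(" or "));
    }

    /** Rewrite "col NOT IN (v1, v2, ...)" as "col != v1 and col != v2 and ...". */
    public static String rewriteNotIn(String col, List<Integer> values) {
        return values.stream()
                .map(v -> col + " != " + v)
                .collect(Collectors.joining(" and "));
    }

    public static void main(String[] args) {
        System.out.println(rewriteIn("A", List.of(2, 3, 4)));    // A = 2 or A = 3 or A = 4
        System.out.println(rewriteNotIn("A", List.of(2, 3, 4))); // A != 2 and A != 3 and A != 4
    }
}
```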
6: choose index for solving condition. Execution plan will use meta-data info
about the table to solve the query. The plan will also consider the indexes
available on the table.
7: execute. Select Query will call execute of the plan to identify the rows.
8: create, 9: navigate rows of table, 10: evaluate the where clause, 11: add row.
The plan will navigate the rows of the table using Session, evaluate the where
clause on each row, and add each satisfying row to the iterator object.
12: return iterator object.
Flow Explanation:
1: Server will call execute query of Select Query for query execution.
2: Select Query will perform semantic checking of the query:
Table existence
Columns existence of column-list and where clause
Data type compatibility of predicates used in where clause
3 & 4: Select Query will create an execution plan and set the following into it:
Table
Columns in column-list
Condition in where clause
5 & 6: Execution plan will retrieve info about indexes on the table. It will check if the
given condition can be solved using an index. If possible it will use this index to solve the
condition.
7: Execute will create an iterator object and initialize it with the rows satisfying the
query.
8 & 9: Session will return rows according to the current transaction isolation level. If
execution plan chooses an index for solving the condition, Session will return rows using
that index.
10: Row returned by session will be evaluated against the condition specified in where
clause of the query.
11, 12 & 13: Each row satisfying the where clause will be added to the iterator object, and
the iterator will then be passed on to the server through Select Query.
Join Types
Cross Join: This is a join between two tables without any join condition.
Inner Join: This is a join between two tables with a join condition relating the two tables.
Outer Join:
Left
Right
Full
1: execute query
2: check semantic. Select Query will do the semantic checking of the query.
3: create
9: execute
10: create
Flow Explanation:
1: Server will call execute query of Select Query for query execution.
2: Besides checking for points mentioned in "execute simple query", select query will
perform the following checks in case of joins:
Table
Columns in column-list
Condition in where clause
5 & 6: Execution plan will retrieve info about indexes on the table. It will check if the
given condition can be solved using an index. If possible it will use this index to solve the
condition.
7 & 8: Plan will check if join relation can be solved using any index of the tables
involved in the join relation. If possible, plan will be adjusted accordingly.
One of the major issues involved is the handling of multiple indexes on the same table.
Suppose the plan has chosen index Index1 for solving the table condition and index
Index2 for solving the join relation; then the rows to be returned are the intersection of
the rows obtained from the two indexes.
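The intersection described above can be sketched as combining the row identifiers produced by the two indexes (a simplified illustration; the integers stand in for row identifiers):

```java
import java.util.Set;
import java.util.TreeSet;

// Illustrative sketch: rows satisfying both the table condition (via Index1)
// and the join relation (via Index2) are the intersection of the two row-id sets.
public class IndexIntersection {

    public static Set<Integer> intersect(Set<Integer> fromIndex1, Set<Integer> fromIndex2) {
        Set<Integer> result = new TreeSet<>(fromIndex1);
        result.retainAll(fromIndex2);        // keep only rows present in both sets
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> byCondition = Set.of(1, 3, 5, 7);   // row ids matched via Index1
        Set<Integer> byJoin      = Set.of(3, 4, 5, 6);   // row ids matched via Index2
        System.out.println(intersect(byCondition, byJoin)); // [3, 5]
    }
}
```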
9: Execute will create an iterator object and initialize it with the rows satisfying the
query. Plan will compute the cartesian product of the tables involved and will evaluate
the table conditions and join relations on the resulting rows.
10 & 11: Session will return rows according to the current transaction isolation level. If
execution plan chooses an index for solving the condition, Session will return rows using
that index.
12: Row returned by session will be evaluated against the condition (if any) given for
the table
13: Row satisfying the table condition will be evaluated against the join relation
involving the table.
14, 15 & 16: Each row satisfying the conditions of all tables and all join relations will be
added to the iterator object, and the iterator will then be passed on to the server through
Select Query.
Group by clause is used for grouping the rows of the select query and retrieving info
about the group. The info includes sum, count, max, min, and avg of the group.
Max will give the maximum value among all the rows of the group.
Min will give the minimum value among all the rows of the group.
Avg. will give the average value among all the rows of the group. Avg is
equivalent to sum divided by count.
In general, Null values are not considered while calculating the group info.
If having clause is specified in the group by clause then the groups are evaluated
against the having condition and rows satisfying the condition are returned.
Examples:
Select countryid, Sum (population) from states group by countryid;
Select Count (orderid), Sum (units * itemAmount) from orderDetails inner join item on
orderDetails.itemid = item.id group by orderid having count (orderid) > 10
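The null-handling rule above, with avg as sum divided by the count of non-null values, can be sketched as follows (an illustrative helper, not the engine's implementation):

```java
// Illustrative sketch of computing a group aggregate while skipping nulls.
public class GroupInfo {

    /** Compute avg over one group: sum of non-null values divided by their count. */
    public static Double avg(Integer[] groupValues) {
        long sum = 0, count = 0;
        for (Integer v : groupValues) {
            if (v == null) continue;   // null values are not considered
            sum += v;
            count++;
        }
        return count == 0 ? null : (double) sum / count;  // an all-null group yields null
    }

    public static void main(String[] args) {
        Integer[] salaries = {4000, null, 2000};
        System.out.println(avg(salaries));   // 3000.0: the null row is skipped
    }
}
```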
1: execute query
11: execute
12: create
1: Server will call execute query of Select Query for query execution.
2: Besides checking for points mentioned in "execute simple query" and "execute query
involving joins", select query will perform following checks in case of group by:
Table
Columns in column-list
Condition in where clause
5 & 6: Execution plan will retrieve info about indexes on the table. It will check if the
given condition can be solved using an index. If possible it will use this index to solve the
condition.
7 & 8: Plan will check if join relation can be solved using any index of the tables
involved in the join relation. If possible, plan will be adjusted accordingly.
One of the major issues involved will be handling of multiple indexes on the same table.
Suppose the plan has chosen index Index1 for solving the table condition and index
Index2 for solving the join relation; then the rows to be returned are the intersection of
the rows obtained from the two indexes.
10: If no index has been chosen for solving table conditions and join relations and an
index is available on the group by columns, then the plan will use that index for solving
the group by. If the group by belongs to a table whose indexes have not been chosen for
solving a condition or join relation, and the group can be solved on any of its indexes,
then the plan will use that index.
11: Execute will create an iterator object and initialize it with the rows satisfying the
query.
14: Row returned by the session will be evaluated against the condition (if any) given
for the table.
15: Row satisfying the table condition will be evaluated against the join relation
involving the table.
16: Row satisfying the condition of all tables and all join relations will be added to the
iterator.
17: Computing group info means solving the aggregate functions for each group and
setting the values of the grouping columns. This reduces the number of rows present in
the iterator.
18: Plan will evaluate having condition on the rows available after computing group
info.
19: Remove the row not satisfying the having condition from the iterator.
20 & 21: Iterator object will be returned to server through select query.
UNION
INTERSECT
EXCEPT
Union is equivalent to mathematical union of two sets. In case of select query involving
union, there are two or more select queries and the result will contain all the rows of all
the queries.
Example
Select name, age from childs
union
Select name, age from parents
Example
Select name from students
intersect
Select name from students where age < 15
8 & 9: execute (do for all). Plan will execute the execution plans of the individual
select queries to get an iterator for each query.
10: return iterator
11: create
Flow Explanation:
1: Server will call execute query of Select Query for query execution.
2: Select query with set operator will call for checking the semantic of individual select
queries involved in current query.
4, 5, 6 & 7: Select query will set the individual select queries in Execution plan and
execution plan will interact with individual select query to get their execution plan.
11: Plan will create a new iterator for the set query and filter rows according to the set
operator specified.
12: Plan will filter the rows of different iterators on the basis of set operator.
13, 14 & 15: Each row satisfying the set operator will be added to the iterator object, and
the iterator will then be passed on to the server through Select Query.
Executing query with order by clause returns the rows sorted on the columns specified in
order by clause.
This query result will be sorted on countryname column of country table and statename
column of state table.
One important point: order by is specified with the top level select query only. Suppose
we have a select query involving a set operator; then order by can't be specified with
each individual query. Instead, it is associated with the overall query.
4: set order info, 5: choose index. Select query will set the order info into the plan,
and the plan will adjust the execution based upon the indexes available for solving
the order.
6: execute, 7: create, 8: add rows satisfying query. Plan will create an iterator and
add rows according to the type of query.
9: sort rows, 10: add sorted rows. Plan will sort the rows satisfying the query if no
index is used for the order by columns.
Flow Explanation:
1: Server will call execute query of Select Query for query execution.
2: Select query will perform the semantic checking according to the query type as
mentioned in Sequence diagrams "execute simple query", "execute query involving
joins", "execute query involving group by" and "execute query involving set operator".
5: If the plan has not chosen any index for the condition, then the plan will check if any
index can be used for solving the order and, if possible, it will use that index to navigate
the rows of the table.
The plan will give preference to choosing an index for condition solving or join relation
evaluation. However, if the order column and the condition column are the same, the
previously chosen index can be used to solve the order by. If the order by can't be solved
using any index on the table, then the execution plan will sort the rows satisfying the query.
6: Plan will execute the query without order, depending upon its type. Plan will have an
iterator with all rows satisfying the select query.
7 & 8: Plan will create an iterator and add rows according to the type of query
9: If the plan has added rows based on an index chosen for the order, it will do nothing.
Otherwise it will sort the rows and add them to a new iterator.
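The fallback in step 9, sorting the rows when no index covers the order by columns, can be sketched as follows (an illustrative sketch that assumes rows are Object arrays with an Integer order column):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: sort the rows of the result when no index yields sorted order.
public class OrderBySort {

    /** Sort rows on an Integer column at the given position. */
    public static List<Object[]> sortRows(List<Object[]> rows, int orderColumn) {
        List<Object[]> sorted = new ArrayList<>(rows);   // contents of the new iterator
        sorted.sort((a, b) -> ((Integer) a[orderColumn]).compareTo((Integer) b[orderColumn]));
        return sorted;
    }

    public static void main(String[] args) {
        List<Object[]> rows = List.of(
                new Object[]{"emp2", 3000},
                new Object[]{"emp1", 1000},
                new Object[]{"emp3", 2000});
        for (Object[] r : sortRows(rows, 1)) {
            System.out.println(r[0] + " " + r[1]);   // emp1, emp3, emp2 in salary order
        }
    }
}
```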
Scalar sub-query: a query which returns a single column and a single row.
Row sub-query: a query which returns multiple columns but a single row.
Table sub-query: a query which returns multiple columns and multiple rows.
The number of columns in a sub-query is defined as its cardinality, and the number of
rows in a sub-query is defined as its degree.
Flow Explanation:
1: Server will call execute query of Select Query for query execution.
4, 5, 6 & 7: Select query will set the table info in the Execution Plan and will in turn get
the execution plan of the sub-query involved.
Execution plan will check for a sub-query in the condition passed. If a sub-query is
involved, it will check its semantics and get the execution plan for the sub-query.
8, 9, 10 & 11: When the plan is executed, it will execute the sub-query plan and get its
iterator for solving the condition.
12, 13, 14 & 15: Plan will navigate the rows of the table and evaluate the condition using
the sub-query iterator; if the condition is met, it will add the row to the iterator created
for the query.
16 & 17: After adding the rows, iterator will be returned to the server.
Participants: Server, Select Query, Execution Plan, Session, View Query, View
Execution Plan.
Server will give a call to Select Query to execute it.
1: execute query
2: check semantic. Select Query will perform semantic checking according to the SQL 99
specifications.
3: create, 4: set Table Info, 5: get View Meta Data. Select Query will create an
execution plan for the query and set the table info for all the tables.
Plan will get the execution plan of the view query.
8: merge view execution plan. Plan will merge the view execution plan with its own
execution plan.
9: execute Iterator, 10: create, 11: navigate rows of table. Select Query will execute the
plan to get the iterator object.
12: evaluate query condition, 13: add row, 14: return iterator, 15: return iterator. Plan
will create an iterator and navigate the rows of the table. Plan will evaluate the query
condition on each row and will add rows satisfying the query to the iterator.
Flow Explanation:
2: There is no extra consideration in the case of views; views should be treated like
tables. Select query will check for view existence and column existence as is done in the
case of a table.
4: Select query will set the table info of all the tables involved. Plan will retrieve the
meta-data of the table and if the table is a view then it will retrieve meta-data of the view.
5: Session will return the meta data of the view involved in the query. Meta-data of the
view will contain the following info:
6: Execution plan will get the execution plan of the view query.
View query execution plan will be made by following the steps mentioned in the other
sequence diagrams of DQL.
8: Plan will merge the view execution plan with itself so as to optimize the select query.
Plan will add the table infos of the view execution plan to its own table info.
Plan will change conditions involving the view to the underlying view query.
Plan will also map the view columns present in the selected column list to the
underlying view query columns.
9: Execute will create an iterator object and initialize it with the rows satisfying the
query. Plan will compute the cartesian product of the tables involved and will evaluate
the table conditions and join relations on the resulting rows.
10 & 11: Session will return rows according to the current transaction isolation level.
If execution plan chooses an index for solving the condition, Session will return rows
using that index.
12: Plan will evaluate the condition involved in query on the rows returned by session.
For details, refer to Diagram - execute query involving joins.
13, 14 & 15: Plan will add each row satisfying the query to the iterator, and the iterator
will be returned to the server.
CLASS DIAGRAMS:
Class Diagram: Classes
Classes shown: SelectQuery, SelectQueryBody, ColumnList, ParenthesisedQuery,
UnionQuery, IntersectQuery, ExceptQuery, TableExpression, FromClause, WhereClause,
Groupbyclause, Havingclause, GroupingColumn, BooleanCondition, SubQuery,
ValueExpression, RelationalTable, TablesList, SingleTable, crossjoin, qualifiedjoin,
naturaljoin, joincondition.
Classes:
SelectStatement ( )
SelectStatement represents the SQL query. It consists of a SelectQueryBody and an
order by clause.
SelectQueryBody (Interface)
SelectQueryBody represents a query. Query can be a select query, union query,
intersect query or except query.
OrderByClause ( )
Order by clause represents the ordering information about the result. SelectStatement
sorts the result of SelectQueryBody according to the order by clause given.
ExceptQuery ( )
ExceptQuery represents the difference of two select queries. Result of except query is
equivalent to mathematical difference of two sets.
UnionQuery ( )
Union Query represents union of two select queries. Result of Union query is equivalent
to mathematical union of two sets.
SelectQuery ( )
Select Query represents a query involving tables and their cartesian, filtering and
grouping.
ColumnList ( )
Column List represents the value expressions whose values will be returned by the
select query. Value expression can be a simple column, computed column or scalar
query.
FromClause ( )
From Clause represents the "from clause" of the Select Query. It consists of a table
expression.
Groupbyclause ( )
Group by clause represents the grouping clause of the Select Query. Result of Select
Query is grouped according to the grouping column given.
Havingclause ( )
Having clause represents the condition which is evaluated on the result of group by
clause. Having clause can't be given without group by clause in Select Query.
TableExpression ( )
Table Expression represents Relational Tables involved in the query, where clause,
group by clause and having clause.
WhereClause ( )
Where clause represents the Boolean condition which must be satisfied by the result of
select query.
ValueExpression (Interface)
ValueExpression repesents an expression. It can be a simple column, numeric
expression, string expression or a mathematical function.
SubQuery ( )
SubQuery represents a query which is used in value expression or Boolean condition.
GroupingColumn ( )
Grouping Column represents a column from the column list of select query according to
which results are grouped.
BooleanCondition (Interface)
Boolean condition represents an expression which will evaluate to true, false or unknown
values. Records from a Select Query are given when the Boolean condition returns true.
TablesList ( )
Tables List represents a list of relational tables.
RelationalTable (Interface)
The relational table can be a database table or view, result of some other query,
cartesian of other relational tables.
crossjoin ( )
Cross join represents cartesian of two tables without any condition. Table can be a
database table, view or any other join or parenthesized query.
qualifiedjoin ( )
Qualified join represents cartesian of two tables according to the given join condition.
naturaljoin ( )
Natural join represents the cartesian of two tables with an implicit condition. The implicit
condition is derived using the common columns of the two tables.
SingleTable ( )
Single table represents a database table or database view.
ParenthesisedQuery ( )
Parenthesised Query represents a select query. It behaves like a view.
joincondition ( )
Join condition represents the condition which is evaluated on the cartesian of two tables.
Join condition is part of qualified join.
CLASS DIAGRAM
Classes shown: Navigator, SelectNavigator, SingleTableNavigator,
IndexedFilterNavigator, NonIndexedFilterNavigator, GroupByNavigator,
AggregateGroupByNavigator, ViewNavigator, DistinctNavigator,
AbstractSemiJoinNavigator, SemiJoinNavigator, SemiJoinIndexedNavigator,
SemiJoinWithoutConditionNavigator, JoinIndexedNavigator,
NestedLoopJoinNavigator, FullOuterJoinNavigator, NaturalFullOuterJoinNavigator,
TemporaryIndexNavigator, UnionAllNavigator, UnionAllOrderedNavigator,
UnionDistinctNavigator, IntersectAllNavigator, IntersectDistinctNavigator,
ExceptAllNavigator, ExceptDistinctNavigator.
Classes:
AbstractSemiJoinNavigator ( )
AbstractSemiJoinNavigator is responsible for giving rows of left/right outer join of
underlying navigators. It has two navigators representing the left and right relational
table of qualified join.
AggregateGroupByNavigator ( )
AggregateGroupByNavigator is responsible for returning the aggregate values computed
over the whole underlying navigator. It extends GroupByNavigator functionality and
comes into the picture when an aggregate function is present in the select column list and
no grouping column is given.
ExceptAllNavigator ( )
ExceptAllNavigator is responsible for giving rows from the underlying navigators
according to the Except All specification. It has two underlying navigators.
ExceptDistinctNavigator ( )
ExceptDistinctNavigator is responsible for giving rows of the underlying navigators
according to Except Distinct option. It has two underlying navigators.
FullOuterJoinNavigator ( )
FullOuterJoinNavigator is responsible for giving rows of the two underlying navigators
according to the full outer join specification.
GroupByNavigator ( )
GroupByNavigator is responsible for giving rows by making the group of the rows of the
underlying navigator. Grouping of rows is done according to the grouping columns.
IndexedFilterNavigator ( )
IndexedFilterNavigator is responsible for solving the condition using an index of the table
and returning rows satisfying the condition.
IntersectAllNavigator ( )
IntersectAllNavigator is responsible for returning rows of the two underlying navigators
according to intersect all specification. Intersection is done on the basis of values of
selected columns.
IntersectDistinctNavigator ( )
IntersectDistinctNavigator is responsible for returning rows of the two underlying
navigator according to intersect distinct specification. Intersection is done on the basis of
the values of the selected columns.
JoinIndexedNavigator ( )
JoinIndexedNavigator is responsible for the cartesian product of two underlying navigators
according to the join condition. It comes into the picture when an index is available on any
of the columns of the join condition. JoinIndexedNavigator solves the join condition by seeking the value of one
navigator into the index of the other navigator.
NaturalFullOuterJoinNavigator ( )
NaturalFullOuterJoinNavigator is responsible for giving rows of the two underlying
navigators according to full outer join specification with a join condition based on the
common columns of the underlying navigators.
NestedLoopJoinNavigator ( )
NestedLoopJoinNavigator is responsible for returning rows of the two underlying
navigators by doing cartesian of their rows. It comes into picture when no join condition
is present or join condition can't be solved using any index available.
SelectNavigator ( )
SelectNavigator is responsible for interacting with user for retrieving the rows of the
select statement and meta-data information about the select statement.
SemiJoinIndexedNavigator ( )
SemiJoinIndexedNavigator extends the functionality of AbstractSemiJoinNavigator. It
solves the join condition of outer join by using the index.
SemiJoinNavigator ( )
SemiJoinNavigator extends the functionality of the AbstractSemiJoinNavigator. It solves
the join condition on the cartesian of the underlying navigators. It comes into picture
when join condition can't be solved using an index.
SemiJoinWithoutConditionNavigator ( )
SemiJoinWithoutConditionNavigator extends the functionality of
AbstractSemiJoinNavigator. It comes into picture when no join condition is given in the
outer join.
SingleTableNavigator ( )
SingleTableNavigator is responsible for returning rows of a database table.
TemporaryIndexNavigator ( )
TemporaryIndexNavigator is responsible for sorting the rows of the underlying navigator
according to the order by clause. It comes into picture when data is not available in
sorted manner through any index.
UnionAllNavigator ( )
UnionAllNavigator is responsible for giving the rows of the two underlying navigators
according to Union all specification. Union is done on the basis of values of the selected
columns.
UnionAllOrderedNavigator ( )
UnionAllOrderedNavigator is responsible for giving the rows of the two underlying
navigators according to the union all specification and order by clause. It comes into
picture when order by and union all are both present in select statement.
UnionDistinctNavigator ( )
UnionDistinctNavigator is responsible for giving rows of the two underlying navigators
according to the union distinct specification.
ViewNavigator ( )
ViewNavigator is responsible for giving rows of the view by executing the select query of
the view. It comes into picture when select query of the view can't be merged with the
main query being executed.
CLASS DIAGRAM
Classes shown: RelationalTablePlan, SelectQueryPlan, SingleTablePlan, GroupByPlan,
ViewPlan, TableExpressionPlan, TableSequencePlan, SetQueriesPlan, AbstractJoinPlan,
TwoTableJoinPlan, NestedLoopJoinPlan, FullOuterJoinPlan, SemiQualifiedJoinPlan,
NaturalFullJoinPlan.
RelationalTablePlan (Interface)
RelationalTablePlan interface represents the plan for the tables present in a select query.
It is required in the formation of the query plan for optimal execution.
SelectQueryPlan ( )
SelectQueryPlan represents execution plan of the select query. It creates the
SelectNavigator for returning the rows of the select query.
SingleTablePlan ( )
SingleTablePlan represents the execution plan of a database table. It solves the
conditions restricted on the table and returns the rows in sorted manner according to the
ordering column (if any).
TableExpressionPlan ( )
TableExpressionPlan represents execution plan of the relational tables involved in "table
expression" of select query. It contains the plans for each table.
TableSequencePlan ( )
TableSequencePlan is responsible for keeping the tables according to the ordering
column sequence when order of the query can be solved by restricting it on the tables of
the database. In this case, TableSequencePlan ensures that tables are not shuffled
because of involvement in join.
TwoTableJoinPlan ( )
TwoTableJoinPlan represents the execution plan of two relational tables involving the
join condition. It creates the appropriate JoinNavigator.
ViewPlan ( )
ViewPlan represents the execution plan of the view's query which can't be merged with
the query involving the view reference.
AbstractJoinPlan ( )
AbstractJoinPlan represents the plan for joining two relational tables. It optimizes the
cartesian by solving join condition using index (if possible). It creates the appropriate
navigator for the join.
FullOuterJoinPlan ( )
FullOuterJoinPlan represents the execution plan of full outer join of two relational tables.
It checks for index usage in solving join condition. It creates the appropriate full outer
join navigator.
NestedLoopJoinPlan ( )
NestedLoopJoinPlan represents the execution plan for cartesian of relational tables.
SemiQualifiedJoinPlan ( )
SemiQualifiedJoinPlan represents the execution plan of the two relational tables present
in a qualified left/right outer join. It checks whether the join condition can be solved using
an index. It creates the appropriate QualifiedJoinNavigator.
GroupByPlan ( )
GroupByPlan represents the execution plan of a query involving a group by clause or
aggregate functions without a group by clause. It creates GroupByNavigator or
AggregateGroupByNavigator.
SetQueriesPlan ( )
SetQueriesPlan represents the execution plan of two select queries of
union/intersect/except. It creates the appropriate navigator according to the type of query
and the distinct option of the selected columns.
NaturalFullJoinPlan ( )
NaturalFullJoinPlan represents the execution plan of natural full outer join of two
relational tables. It creates the NaturalFullJoinNavigator for returning rows of the join.
Session will be responsible for managing currently active transactions. Session will be
executing the operations in multi-threaded environment. Session will take care of all the
locking issues involved in the multiple user environment. Session will be responsible for
maintaining the database in a stable state. Session will provide meta-data information
(details about table, view etc) to DQL, DML and DDL.
Session will be responsible for providing support for all the transaction isolation levels
mentioned in the JDBC and SQL 99 specifications. The following transaction isolation
levels are supported in Daffodil DB:
Read Uncommitted
Read Committed
Repeatable Read
Serializable
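These names correspond to the standard constants of java.sql.Connection; the mapping below is a small illustrative helper (the class name is assumed, not part of Daffodil DB):

```java
import java.sql.Connection;

// Illustrative mapping from the isolation level names above to JDBC constants.
public class IsolationLevels {

    public static int jdbcLevel(String name) {
        switch (name) {
            case "Read Uncommitted": return Connection.TRANSACTION_READ_UNCOMMITTED;
            case "Read Committed":   return Connection.TRANSACTION_READ_COMMITTED;
            case "Repeatable Read":  return Connection.TRANSACTION_REPEATABLE_READ;
            case "Serializable":     return Connection.TRANSACTION_SERIALIZABLE;
            default: throw new IllegalArgumentException(name);
        }
    }

    public static void main(String[] args) {
        // A client would apply the level via Connection.setTransactionIsolation,
        // e.g. con.setTransactionIsolation(jdbcLevel("Read Committed"));
        System.out.println(jdbcLevel("Serializable"));   // 8
    }
}
```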
Record Locking:
To preserve data integrity, locking of records is required. When a transaction
modifies record(s) of a table, the transaction is supposed to lock the record(s) before
modifying them. The lock is necessary to prevent other transactions from modifying the
record(s) concurrently. Locking is required in the case of update and delete. Session will
lock a record before modifying it. In case the transaction is unable to take the lock, an
error will be thrown. Locking should be on a first in, first out basis.
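The locking behaviour just described, first in first out granting with an error when the lock cannot be acquired in time, can be sketched with a fair ReentrantLock (class and method names are assumptions, not the engine's actual classes):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative sketch of per-record locking with FIFO (fair) granting.
public class RecordLocker {

    // One fair lock per record id: fairness gives first-in, first-out granting.
    private final Map<Integer, ReentrantLock> locks = new ConcurrentHashMap<>();

    /** Lock a record before update/delete; throw if it cannot be acquired in time. */
    public void lockRecord(int recordId, long timeoutMillis) throws InterruptedException {
        ReentrantLock lock = locks.computeIfAbsent(recordId, id -> new ReentrantLock(true));
        if (!lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
            throw new IllegalStateException(
                    "record " + recordId + " is locked by another transaction");
        }
    }

    public void unlockRecord(int recordId) {
        locks.get(recordId).unlock();
    }

    public static void main(String[] args) throws InterruptedException {
        RecordLocker locker = new RecordLocker();
        locker.lockRecord(42, 100);   // acquired immediately: no contention
        locker.unlockRecord(42);
    }
}
```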
Transaction Handling:
Commit: When the user calls commit of a transaction, session will take the lock on the
transaction so that no read/write operation is done. After taking lock, session will transfer
the records modified by this transaction from uncommitted data pool to the physical
storage and later delete the records from uncommitted data pool. After committing the
changes, session will adjust the boundary of the transaction to the latest view of
database. Adjustment of transaction boundary is required for handling transaction
isolation level properly.
Rollback: When the user calls rollback of a transaction, session will take the lock on the
transaction. Session will delete all the records modified by this transaction from the
uncommitted data pool. Session will also adjust the transaction boundary as in commit.
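The commit and rollback handling above can be sketched as moving records between a per-transaction uncommitted data pool and the physical storage (a simplified, single-threaded illustration with assumed names; the real session also handles locking and boundary adjustment):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of the uncommitted data pool described above.
public class TransactionPool {

    private final List<String> physicalStorage = new ArrayList<>();
    private final Map<Integer, List<String>> uncommittedPool = new HashMap<>();

    /** Record a modification under the given transaction. */
    public void write(int txnId, String record) {
        uncommittedPool.computeIfAbsent(txnId, id -> new ArrayList<>()).add(record);
    }

    /** Transfer the transaction's records to storage, then clear its pool. */
    public void commit(int txnId) {
        List<String> pending = uncommittedPool.remove(txnId);
        if (pending != null) {
            physicalStorage.addAll(pending);
        }
    }

    /** Discard the transaction's uncommitted records. */
    public void rollback(int txnId) {
        uncommittedPool.remove(txnId);
    }

    public List<String> storage() {
        return physicalStorage;
    }

    public static void main(String[] args) {
        TransactionPool session = new TransactionPool();
        session.write(1, "row-a");
        session.write(2, "row-b");
        session.commit(1);     // row-a reaches physical storage
        session.rollback(2);   // row-b is discarded
        System.out.println(session.storage());   // [row-a]
    }
}
```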
Phantom Read - SQL-transaction T1 reads the set of rows N that satisfy some <search
condition>. SQL-transaction T2 then executes SQL-statements that generate one or
more rows that satisfy the <search condition> used by SQL-transaction T1. If SQL-
transaction T1 then repeats the initial read with the same <search condition>, it obtains
a different collection of rows.
The following four isolation levels guarantee that each SQL-transaction will be executed
completely or not at all, and that no updates will be lost. The isolation levels differ with
respect to the terms described above.
In the Read Committed isolation level, if a transaction wants to read or modify dirty data,
it must wait for the commit/rollback of the other transactions (the transactions that
made the data dirty).
For supporting isolation levels, session will maintain multiple versions of the
uncommitted records. Session will give data to transactions depending upon their
isolation level and will be responsible for supplying the most appropriate version of
each record. Session will merge the committed data with the uncommitted data of the
current transaction to give the most appropriate view of the data. If dirty data is about
to be given out, session will wait for the other transaction to commit or roll back its
changes. If the user has specified an execution time, session will wait only that long
and after that it will throw an error.
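The wait-with-timeout described above can be sketched with a monitor. This is a hypothetical illustration under assumed names: a reader blocks while the record is dirty, is woken when the writing transaction completes, and gives up once the caller's execution time expires.

```java
// Hypothetical sketch: a reader waits for dirty data to become clean,
// bounded by the user-specified execution time.
class DirtyDataGate {
    private final Object monitor = new Object();
    private boolean dirty;

    public void markDirty() {
        synchronized (monitor) { dirty = true; }
    }

    // Called by the writing transaction on commit/rollback.
    public void completed() {
        synchronized (monitor) {
            dirty = false;
            monitor.notifyAll();   // wake readers waiting on this record
        }
    }

    // Returns true if the data became clean in time, false if the
    // execution time expired (the caller would then throw an error).
    public boolean awaitClean(long executionTimeMillis) {
        long deadline = System.currentTimeMillis() + executionTimeMillis;
        synchronized (monitor) {
            while (dirty) {
                long left = deadline - System.currentTimeMillis();
                if (left <= 0) {
                    return false;  // execution time expired
                }
                try {
                    monitor.wait(left);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            return true;
        }
    }
}
```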
Record Locking:
• Locking should be table specific.
• Locking should be on a first come, first served basis.
• Rows can be locked concurrently by different transactions.
Transaction Handling:
• When a commit/rollback is executing, no other operation can be performed under
the current transaction. Some sort of locking will be required.
• When a transaction is committing its data to physical storage, either it will be
transferred completely or nothing will be transferred. This is required to maintain
the integrity of data. Physical storage will be locking the tables involved in the
commit. Locking of tables at physical storage level is also required to enhance
the performance of commit.
• Adjustment of transaction boundary after commit/rollback is required to maintain
the transaction isolation level of the transaction.
o If a transaction has the Read Uncommitted isolation level, its boundary can be
defined as the latest data, i.e. the latest version of each row.
o If a transaction has the Read Committed isolation level, its boundary
can be adjusted by shifting the boundary to the last committed transaction
boundary.
o If a transaction has the Repeatable Read isolation level, its boundary
can be adjusted by shifting the boundary to the last committed transaction
boundary. All the data visible to this transaction will then remain the same
until the next commit/rollback of the transaction; however, newly inserted
records committed by other transactions will be visible.
o If a transaction has the Serializable isolation level, its boundary can be
adjusted by shifting the boundary to the last committed transaction
boundary.
To provide a multi-user environment and to maintain the consistency of the database,
we use locking facilities. We implement both row-level and table-level locking to
maintain data integrity. Similarly, we implement locking on retrieval in some cases,
such as the read committed isolation level and for-update statements.
Row Level
Row-level locking is used in write operations to avoid concurrent modification of the
same row. We allow only one user to modify a record at a particular time; others have
to wait till the first one unlocks the row. To lock the row, we use the value of the rowId
system column present in every table. The server itself provides the value of this
column; no two valid records have the same rowId value. We use a special locking
utility to handle row locking. This utility throws an exception if a user tries to access a
row that is locked by another user. The exception is then handled according to the
isolation level: if the user is working with the read committed isolation level, DML will
retry modifying the record after some time, whereas in an isolation level other than read
committed the user will get the exception ‘Row is locked by another user’.
Suppose both user A and user B want to modify the same row having rowId 10 and are
working with the session serializable isolation level. Both of them try to acquire the lock
for the row. If user B succeeds in acquiring the lock and is allowed to modify the record,
then user A will get the exception ‘Row is locked by another user’.
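The A/B scenario above can be sketched as a small locking utility keyed by rowId. This is an illustrative sketch, not the product's actual utility; the class name and the string-valued user identifiers are assumptions.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the row-locking utility: the first user to
// register a rowId owns the row; any other user gets the documented error.
class RowLockUtility {
    private final ConcurrentHashMap<Long, String> owners = new ConcurrentHashMap<>();

    public void lockRow(long rowId, String user) {
        String prev = owners.putIfAbsent(rowId, user);  // atomic claim
        if (prev != null && !prev.equals(user)) {
            throw new IllegalStateException("Row is locked by another user");
        }
    }

    public void unlockRow(long rowId, String user) {
        owners.remove(rowId, user);  // release only if this user holds it
    }
}
```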
Table level
When we want to transfer records from memory to the file system, we make use of
table-level locking. Table-level locking ensures that concurrent operations at the
physical level will not affect the consistency of the data. In the process of transferring
records from memory to file, we first collect the tables on which the user has performed
changes in the current transaction and acquire locks for only these tables. Only after
acquiring the locks do we start transferring the transaction's records from memory to
file. In the meantime, if another user also wants to transfer his changes to the same
table, he has to wait until the first user releases his lock.
Suppose users A, B and C are working on the database. User A has modified the data
of tables country and state, user B has modified the data of table country, and user C
has modified the data of table district. To make all the changes permanent, all users
forward the call to the session system using the commit method. Suppose the session
system gets the calls concurrently and starts working to make the records persistent.
There will not be any problem in doing the work of user C, because no other user is
working on table district; user C will therefore get the lock on table district. Users A and
B have worked on the same table, so only one of them will get control to transfer the
records and the other will have to wait for the first to finish. Therefore, if user A gets the
lock then user B will have to wait, and if user B gets the lock then user A will have to
wait for his turn.
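The commit path described for users A, B and C can be sketched as follows. This is an assumed illustration: the tables modified by the transaction are locked in a fixed (sorted) order before the memory-to-file transfer, so two committers that share tables (A and B on country) serialize rather than deadlock.

```java
import java.util.Map;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of table-level locking during commit.
class CommitTableLocks {
    private final Map<String, ReentrantLock> tableLocks = new ConcurrentHashMap<>();

    public void commit(TreeSet<String> modifiedTables, Runnable transferToFile) {
        // Lock only the tables changed in this transaction, in sorted order
        // so concurrent committers can never deadlock on each other.
        for (String table : modifiedTables) {
            tableLocks.computeIfAbsent(table, t -> new ReentrantLock()).lock();
        }
        try {
            transferToFile.run();          // transfer records memory -> file
        } finally {
            for (String table : modifiedTables.descendingSet()) {
                tableLocks.get(table).unlock();
            }
        }
    }
}
```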
For Update:
Through a select statement, a user can get a lock on a set of rows by specifying a
condition in the select statement. If the user has not specified any condition in the
select statement, then he gets the lock on all rows of the table. Now if another user
tries to modify these rows, he will get the exception “Row is locked by another user“. If
the user has specified a condition in the select statement, then only those rows that
satisfy the condition are locked for that user. To provide this functionality, we hold all
the conditions of for-update select statements and use these conditions when we get a
call to modify a record of that particular table. We evaluate these conditions on the
rows that are going to be updated by another user and throw an exception if a condition
is satisfied.
Suppose user A executes a select statement to get a lock on all rows of table country
that have the country name ‘India’ (select * from country where countryname = ‘India’
for update). As the result of this query, we return a result set of rows and retain the
specified condition (countryname = ‘India’) in the session system.
If user B tries to modify a row that has the value ‘Australia’ in this column, the session
system evaluates the condition (countryname = ‘India’) against the rows to be modified.
All such rows will be modified successfully, because the condition does not match them.
But if user B tries to modify rows with the condition countryname = ‘India’, he will get
the exception ‘Row is locked by another user’, because the condition held for user A
matches the rows that user B is going to modify.
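Holding and re-evaluating the for-update conditions can be sketched like this. The registry below is an assumed illustration: conditions are modeled as predicates over a row's column values, and a match on a row about to be modified raises the documented error.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Predicate;

// Hypothetical sketch: "select ... for update" conditions held per table
// and evaluated before another user's modification is allowed.
class SelectForUpdateRegistry {
    private final Map<String, List<Predicate<Map<String, Object>>>> held =
            new ConcurrentHashMap<>();

    // Retain the condition of a for-update select statement.
    public void hold(String table, Predicate<Map<String, Object>> condition) {
        held.computeIfAbsent(table, t -> new CopyOnWriteArrayList<>()).add(condition);
    }

    // Called before modifying a row of the table; throws if any held
    // condition matches the row.
    public void checkModifiable(String table, Map<String, Object> row) {
        for (Predicate<Map<String, Object>> cond :
                held.getOrDefault(table, List.of())) {
            if (cond.test(row)) {
                throw new IllegalStateException("Row is locked by another user");
            }
        }
    }
}
```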
Isolation Levels:
Read Uncommitted
Read Committed
Repeatable Read
Transaction Serializable
Session Serializable
Read Uncommitted:
In this isolation level, a user can access all the valid records of a table, whether they
are committed or uncommitted. We get the result sets from the memory system and
the file system for the specified condition. Both result sets are merged before returning
them to the user, i.e. the user is able to see other users' committed as well as
uncommitted records. In this isolation level, the user is in read-only mode.
For Example:
User A:
Insert into table students values (MCA01, ‘Maichael’);
Insert into table students values (MCA02, ‘George’);
Commit;
Insert into table students values (MCA03, ‘John’);
Result Set:
Rollno Name
MCA01 Maichael
MCA02 George
MCA03 John
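The result set above can be reproduced by the merge step just described: committed rows from the file system concatenated with uncommitted rows from the memory system. This is a deliberately simplified sketch with assumed names; rows are modeled as plain strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: Read Uncommitted merges committed (file) rows with
// uncommitted (memory) rows so every valid record is visible.
class ReadUncommittedMerge {
    public static List<String> merge(List<String> fileRows, List<String> memoryRows) {
        List<String> result = new ArrayList<>(fileRows);   // committed data
        result.addAll(memoryRows);                         // uncommitted data
        return result;
    }
}
```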
Read Committed:
We get the result sets from the memory system and the file system for the specified
condition. The result is returned to the user after merging both result sets. In this
isolation level, we use locking when a user tries to retrieve a row that is marked dirty at
that point of time (by dirty we mean that a transaction reads data that has been
modified by another transaction that has not yet committed). The user will not come out
of the lock until the first user completes his transaction by either committing or rolling
back. Whether a user will be locked or not is decided at the time of creation of the
result set for a non-parameterized query; for parameterized queries, we perform this
check when the user passes the parameters for the query.
For Example:
Result:
Rollno Name
1 Maichael
2 George
3 John
1 Maichael
2 George
Repeatable Read:
A new transaction id is provided to the session after every commit or rollback
statement. We get the result sets from the memory system and the file system for the
specified condition and merge both result sets before returning them to the user.
For Example:
User A:
Insert into table students values (1, ‘Maichael’);
Insert into table students values (2, ‘George’);
Commit;
Insert into table students values (3, ‘John’);
Update students set Rollno = ‘9’ where name = ‘George’;
Result:
Rollno Name
1 Maichael
2 George
9 George
Handling of this isolation level is very similar to the read committed isolation level,
except that a new transaction id is provided to the session after every commit or
rollback statement, whereas in Read Committed a new transaction id is not provided
after commit or rollback. We get the result sets from the memory system and the file
system for the specified condition and merge both result sets before returning them to
the user.
For Example:
Result:
Rollno Name
1 Maichael
2 George
3 John
1 Maichael
2 George
For Example:
Committed records in table before starting any session for user B:
1, ‘Maichael’
2, ‘George’
3, ‘Samantha’
User A:
Insert into table students values (4, ‘rohit’);
Insert into table students values (5, ‘vikas’);
Commit;
Update students set Rollno = ‘9’ where name = ‘George’;
Result:
1 Maichael
2 George
3 Samantha
9 George
Handling of transactions:
We delete records from memory as we transfer them from memory to the file system.
We cannot delete all the records immediately after transferring them from memory,
because some of the isolation levels require older versions of the records, as discussed
above. We delete these records only when there is no other active session left that can
access these older-version records.
Commit:
The commit operation is performed in the following steps:
• We take a lock for the commit/rollback operation so that no other user of this
session can issue a call to commit or roll back.
• Transfer the records of the transaction from the uncommitted record pool to
physical storage.
• Delete the records from the uncommitted record pool if no other session is
dependent on these records.
• Release the locks.
Save Mode:
We start a new session whenever a new write operation is performed. These new
sessions, one per operation, are called save points. After starting a save point, the user
can be sure that data inserted before the save point will not be rolled back if an error
occurs while executing the current transaction. After starting the save point, when an
insert, update or delete is performed, we record the key of the record on which the
operation is being performed. We use these record keys when a save point is
committed or rolled back. We maintain up to 100 keys for a save point; if this limit is
exceeded, we release all the keys from the list and perform the commit or rollback on a
condition basis. Working with record keys gives better performance.
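The key tracking with its 100-key limit can be sketched as follows. This is an assumed illustration of the behaviour described above, not the actual SavePointTracer class: past the limit the key list is dropped and the later commit or rollback falls back to a condition-based scan.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a save point traces the keys of records it touched,
// up to a limit of 100; beyond that, all keys are released.
class SavePointKeyTracer {
    private static final int KEY_LIMIT = 100;
    private final List<Long> recordKeys = new ArrayList<>();
    private boolean overflowed;

    public void trace(long recordKey) {
        if (overflowed) {
            return;                 // already fell back to condition basis
        }
        recordKeys.add(recordKey);
        if (recordKeys.size() > KEY_LIMIT) {
            recordKeys.clear();     // release all keys from the list
            overflowed = true;      // commit/rollback will scan by condition
        }
    }

    public boolean usesKeys() { return !overflowed; }

    public List<Long> keys() { return recordKeys; }
}
```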
[Sequence diagram: DML write operation. 1: perform operation; 5: update columns,
transaction and session of the record; 6: move modified record to uncommitted data in
data store; 7: release record.]
Flow Explanations:
1: DML query will pass the operation to Session.
2: Session will synchronize operation to maintain the transaction boundary. Session will
allow concurrent execution of insert, update and delete operations; but when a
transaction is executing commit or rollback no other operation will be executed.
3: Session will lock the record so that other transactions cannot modify it.
4: Session will check if the record is currently modified by another transaction. If so,
session will throw an exception indicating that record is modified by another transaction.
5: Session will change the column values, transaction and session of the record so as
to indicate that record has been modified by the current transaction and session.
6: Session will move the modified record to uncommitted data pool in data store. This is
required to maintain transaction boundary and support isolation level.
[Sequence diagram: commit operation. 1: commit; 3: get transaction uncommitted data
from data store; 4: save records in physical storage.]
Flow Explanations:
1: Session will move the uncommitted data of the current transaction onto physical
storage.
2: Session will synchronize operation to maintain the transaction boundary. Session will
allow concurrent execution of insert, update and delete operations; but when a
transaction is executing commit or rollback no other operation will be executed.
3: Session will retrieve the uncommitted records of the current transaction from the
uncommitted data pool of the data store.
4: Session will shift the uncommitted records from uncommitted data pool to physical
storage.
5: Session will delete the records from the uncommitted data pool of the data store.
6: Session will adjust the transaction boundary to the most recently committed data.
This is required for supporting the transaction isolation level.
[Sequence diagram: read committed navigation. 4: notify; when the transaction is
completed, the other session will notify the waiting session.]
Flow Explanations:
1: Session will navigate the rows of the table according to the read committed isolation
level.
2: Session will check for uncommitted data according to the condition passed for
navigating rows. If some other transaction has modified records satisfying the condition
then session will wait for other transaction to complete.
3: If Session finds dirty data in the Data Store for the current transaction, the
transaction will wait for the completion of all other transactions that have modified the
data.
4: When a transaction is completed and some other transaction is waiting for its
completion, transaction will notify the other transactions that the lock has been released.
On receiving notification, the transaction which was waiting will start working with the
dirty data.
5: Session will navigate the committed data of the table and uncommitted data of the
current transaction.
[Sequence diagram: meta-data retrieval. 4: populate meta-data object with the info
retrieved; 5: cache meta-data object for further usage.]
Flow Explanations:
2: Session will retrieve the information from the system tables in Data Store.
Information will be retrieved using the qualified name of the object.
4: Session will populate the meta-data object with the info retrieved above.
5: Session will cache the meta-data info object for further usage.
CLASS DIAGRAMS:
[Class diagram: SessionSystem, DataDictionary (from DataDictionary), Session,
SessionTable, UserSession, UserSessionTable.]
SessionSystem (Interface)
Session System is responsible for managing the currently active databases
SessionDatabase (Interface)
Session Database is responsible for:
Managing the currently active sessions on the database
Managing the objects and meta-data info. [Creates, drops, alters objects]
SystemSession (Interface)
System Session is a session with administrator rights. System Session has the
privileges to access the system tables and other related information.
DataDictionary (Interface)
DataDictionary is responsible for managing the meta-data info about the objects.
Session (Interface)
Session represents an active transaction on the database. Session is responsible for
managing:
Record-Locking
Transaction Isolation Level
Data Integrity
SessionTable (Interface)
Session Table is responsible for managing:
Data modification operations specific to a table
Data retrieval operations specific to a table
Handling the uncommitted data specific to a table on completion of a transaction
UserSessionTable (Interface)
UserSessionTable is a user's specific session table. It is responsible for managing
privileges of the user on the table. Privileges include insert, update, delete and select
etc.
UserSession (Interface)
User session is a user's specific session. It is responsible for managing user privileges
on the database. This is a Daffodil DB specific concept. In Daffodil DB, multiple users
can be part of a single session.
[Class diagram: SessionSystemImpl, DataDictionaryImpl (from DataDictionary),
SystemSessionImpl, LockedRecordsHandler, SessionDatabaseImpl,
SessionCharacteristics, SelectForUpdate, SessionImpl,
TransactionIsolationLevelHandler; IsolationLevel with subclasses Uncommitted,
ReadCommitted and RepeatableRead; iterators UncommittedIterator,
CommittedIterator and RepeatableIterator; SessionCondition,
UserSessionTableImpl, SavePoint, SavePointTracer.]
Classes:
SessionSystemImpl ( )
This class will provide the implementation of Session System interface.
SessionDatabaseImpl ( )
This class will provide the implementation of the Session Database interface.
DataDictionaryImpl ( )
This class will provide the implementation of DataDictionary interface. This class will be
caching the meta-data objects.
SystemSessionImpl ( )
This class will provide the implementation of the System Session interface
SessionImpl ( )
This class will provide the implementation of Session interface.
UserSessionImpl ( )
This class will provide the implementation of user session interface.
UserSessionTableImpl ( )
This class will provide the implementation of user session table interface.
IsolationLevel ( )
Abstract class representing the transaction isolation level.
TransactionIsolationLevelHandler ( )
This class is responsible for managing the uncommitted data for the currently active
transactions. This class will keep track of all the transactions.
Uncommitted ( )
Uncommitted class will be responsible for handling the uncommitted transaction isolation
level. This class will be extending the IsolationLevel class for general operations related
to transaction isolation level.
RepeatableRead ( )
RepeatableRead class is responsible for handling the repeatable read transaction
isolation level. This class will be extending the Isolation Level class.
Serializable ( )
Serializable class will be responsible for handling the Serializable transaction isolation
level. This class will be extending the Isolation Level for common operations related to
isolation level.
ReadCommitted ( )
Read Committed class will be responsible for the Read committed isolation level
handling. This class will be extending the Isolation Level class and implementing the
requirements specific to read committed isolation level.
SavePoint ( )
Save Point class will be responsible for handling a sub-transaction in the current
transaction. Save point will be used internally as well as externally for managing sub-
transactions. Operation like insert, update and delete will be executed in a SavePoint so
that effect of insert can be controlled in case of error.
SavePointTracer ( )
Save Point Tracer will be responsible for managing the operations performed in a save
point. It will keep track of all the insert, update or delete operations performed by a save
point.
SessionCharacteristics ( )
Session Characteristics class will be responsible for managing the properties of the
session like transaction isolation level, current user, current role, transaction mode,
commit mode etc. Session will be setting the properties if it is valid.
UncommittedIterator ( )
Uncommitted Iterator is a navigator on a table. Its main responsibility is to provide
records of the table according to Uncommitted transaction Isolation Level.
RepeatableIterator ( )
Repeatable Iterator is a navigator on a table. Its main responsibilities are:
Providing records of the table according to Repeatable Read Isolation Level.
Waiting for completion of transactions holding locks on the records required by
the current transaction.
CommittedIterator ( )
Committed Iterator is a navigator on a table. Its main responsibilities are:
Providing records of the table according to Read Committed Isolation Level.
Waiting for completion of transactions holding locks on the records required by
the current transaction.
SerializableIterator ( )
Serializable Iterator is a navigator on a table. Its main responsibilities are:
Providing records of the table according to Serializable transaction Isolation
Level.
Waiting for completion of transactions holding locks on the records required by
current transaction.
LockedRecordsHandler ( )
LockedRecordsHandler will be responsible for managing the records locked using the
"Select For Update" Query. A single instance of this class will be made for a database.
SessionTable class will be using this class to check for record locking.
CLASS DIAGRAM
[Class diagram: SessionCondition with subclasses CommittedSessionCondition,
RepeatableSessionCondition and SerializableSessionCondition.]
UncommittedSessionCondition ( )
UncommittedSessionCondition is a condition for filtering the records according to read
uncommitted transaction isolation level. This class will be implementing the Session
Condition interface.
SerializableSessionCondition ( )
SerializableSessionCondition is a condition for filtering the records according to
serializable transaction isolation level. This class will be implementing the Session
Condition interface.
RepeatableSessionCondition ( )
RepeatableSessionCondition is a condition for filtering the records according to
repeatable read transaction isolation level. This class will be implementing the Session
Condition interface.
CommittedSessionCondition ( )
CommittedSessionCondition is a condition for filtering the records according to read
committed transaction isolation level. This class will be implementing the Session
Condition interface.
SessionCondition (Interface)
Session condition interface represents the condition for a particular operation. For
example, in the case transaction isolation level we have condition for each isolation
level.
CLASS DIAGRAMS:
[Class diagram: DataDictionary with PrivilegesCharacteristics, Privileges,
TriggerCharacteristics, Trigger, IndexCharacteristics, SequenceCharacteristics,
Sequence, ColumnCharacteristics, ViewCharacteristics and
ConstraintCharacteristics; constraints UniqueConstraint, ReferentialConstraint and
CheckConstraint.]
DataDictionary (Interface)
DataDictionary is responsible for managing the meta-data info about the objects.
ColumnCharacteristics (Interface)
Column Characteristics is responsible for providing all the information related to columns
of a table like name, type, size, nullability etc.
ConstraintCharacteristics (Interface)
Constraint Characteristics will provide the information about the constraints applied on
the table. Constraint Characteristics will provide primary, unique, referential and check
constraints.
IndexCharacteristics (Interface)
IndexCharacteristics will provide information about the Indexes created on a table.
Information about the indexes will be used by DML and DDL to optimize the condition
evaluation.
TriggerCharacteristics (Interface)
TriggerCharacteristics will provide information about the triggers applied on the table.
Triggers will be categorized using the operation on which trigger is applied i.e. Insert,
update and delete.
ViewCharacteristics (Interface)
View Characteristics will provide information about the view. Information includes view
columns, view query etc. View Characteristics will be used by DQL to execute queries
containing views.
PrivilegesCharacteristics (Interface)
PrivilegesCharacteristics will provide information related to privileges for a particular
user.
SequenceCharacteristics (Interface)
SequenceCharacteristics will provide information about the Sequences available in the
database.
UniqueConstraint (Interface)
Unique constraint will provide information about the unique and primary constraint.
Constraint will be either a unique constraint or a primary constraint. Information includes
columns of the constraint, unique or primary constraint, and condition for constraint
evaluation.
CheckConstraint (Interface)
Check constraint interface will provide the information related to a check constraint.
Information includes columns, condition, constraint name etc. DML classes will be using
this interface to evaluate check constraints for records inserted and updated.
Sequence (Interface)
Sequence will provide information about the Sequence declared by the user. Information
includes data type, start value, current value, increment value etc. Sequence will be
used by DML and DQL.
Trigger (Interface)
Trigger will provide information about the trigger applied on a table. Information includes
trigger type, initiating operation, action type (before or after the operation), statements to
be executed, trigger condition etc.
Privileges (Interface)
Privileges will provide the information about the user's privileges on a particular object.
[Class diagram: DataDictionaryImpl with PrivilegesCharacteristicsImpl,
PrivilegesImpl, TriggerCharacteristicsImpl, TriggerImpl,
ColumnCharacteristicsImpl, IndexCharacteristicsImpl,
ConstraintCharacteristicsImpl, ViewCharacteristicsImpl, UniqueConstraintImpl
and ReferentialConstraintImpl.]
Classes:
DataDictionaryImpl ( )
This class will provide the implementation of DataDictionary interface. This class will be
caching the meta-data objects.
IndexCharacteristicsImpl ( )
This class will provide the implementation of the IndexCharacteristics interface. This
class will access the system tables to retrieve the information about the indexes on a
table.
ColumnCharacteristicsImpl ( )
ColumnCharacteristicsImpl class will provide the implementation of
ColumnCharacteristics interface. This class will access the system tables to load the
information related to columns of the table.
PrivilgesCharcacteristicsImpl ( )
This class will provide the implementation of the PrivilegesCharacteristics interface.
SequenceCharacteristicsImpl ( )
This class will be implementing the SequenceCharacteristics interface.
TriggerCharacteristicsImpl ( )
This class will provide the implementation of the TriggerCharacteristics interface.
PrivilegesImpl ( )
This class will provide the implementation of the Privileges interface.
SequenceImpl ( )
This class will be implementing the Sequence interface. This class will access the
system tables to retrieve information about the sequence.
TriggerImpl ( )
This class will provide the implementation of the Trigger interface. This class will access
the system tables to retrieve the information about the trigger.
ConstraintCharacteristicsImpl ( )
This class will provide the implementation of the ConstraintCharacteristics interface. This
class will access the system tables to retrieve information about the constraints.
UniqueConstraintImpl ( )
This class will be implementing the Unique Constraint interface.
ReferentialConstraintImpl ( )
This class will implement the ReferentialConstraint interface.
CheckConstraintImpl ( )
CheckConstraintImpl will provide the implementation of the Check Constraint class. This
class will access the system tables for the information of the constraint.
DataDictionarySystem ( )
DataDictionarySystem class will be managing the DataDictionary objects of various
databases.
Data Store will be responsible for interacting with the physical storage. Data Store will be
used to save and retrieve meta-data as well as data of the tables. Data store will take
care of all the issues related to physical storage. One of its major responsibilities will be
storing data in a platform-independent manner, so that a database built on one platform
can be used on other platforms without any changes.
Physical Storage:
Data Store will interact with the physical storage to store the data. Data Store will
allocate space to tables and indexes created in the database, in such a manner that the
space is sufficient for storing a significant number of records. When this space is fully
occupied, data store will allocate a new space and link both spaces in the physical
storage for that particular table. Data store will save records in the tables by converting
the column values into the physical storage format. Data store will de-allocate the
space after the deletion of a table. In case all the records present in a particular space
are deleted, data store will de-allocate the space from the table and re-use it further.
Optimization Points:
Indexes:
Data Store will handle the indexes created on the tables in the database. Data store will
store each index in the physical storage; space for indexes will be allocated and de-
allocated by the data store. Data store will keep the indexes in synchronization with the
tables: whenever a write operation is done on a table, data store will perform the same
operation on all the indexes of that table. For example, when a new record is inserted in
the table, data store will insert the same record in all its indexes. In case of an update
of a record, only those indexes whose columns have been updated will be modified. An
index will store the column values of the record and a key corresponding to the record;
the key will be used later to refer back to the record in the table. As explained under
“Physical Storage”, a record can have null values as well as variable-type columns, and
indexes will be required to handle both cases.
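The update rule above, touching only the indexes whose columns changed, can be sketched as a small selection step. This is an assumed illustration; the structure mapping index names to covered columns is my own.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch: on update, pick only the indexes whose covered
// columns intersect the set of changed columns.
class IndexSynchronizer {
    // index name -> columns that index covers (illustrative structure)
    private final Map<String, Set<String>> indexColumns;

    IndexSynchronizer(Map<String, Set<String>> indexColumns) {
        this.indexColumns = indexColumns;
    }

    // Returns the indexes that must be modified for the given changed columns.
    public List<String> indexesToUpdate(Set<String> changedColumns) {
        return indexColumns.entrySet().stream()
                .filter(e -> e.getValue().stream().anyMatch(changedColumns::contains))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

On insert, every index in the map would receive the new record; the selection above matters only for updates.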
Optimization Points:
Unique Index:
Indexes created on column(s) having unique values can be treated separately. In
this case, we can uniquely identify the record in case of delete and update and
there is no need to match the key of the record to be deleted or updated.
Fixed Columns:
Indexes can take advantage of fixed type of column(s) and can adopt a different
format for storing and reading them.
Uncommitted Data:
Data Store will be handling the uncommitted data of the currently active transactions.
The uncommitted data should be treated separately from the data present in physical
storage. Uncommitted data will also have the same considerations as explained in
“Physical Storage” point.
Optimization Point:
Indexes:
Data Store can create indexes on the uncommitted data, in case uncommitted
data increases for a particular table so that the condition solving performance
doesn’t degrade very much.
Caching:
• We can’t afford to read/write the physical file for every operation. Data Store
should do caching of pages of physical files to improve the speed and
performance. Data store will load the pages of physical file when some read/write
operation is performed. Data store will unload the pages from memory when the
total memory taken by the loaded pages reaches a threshold.
• We can also cache the record obtained by converting the physical file data into
objects.
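The page-caching policy described above can be sketched with an access-ordered map: pages stay loaded until the cache exceeds a threshold, then the least recently used page is unloaded. This is an assumed sketch in which the memory threshold is simplified to a page count.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: LRU cache of loaded pages, evicting the least
// recently used page once the threshold (page count) is exceeded.
class PageCache extends LinkedHashMap<Long, byte[]> {
    private final int maxPages;

    PageCache(int maxPages) {
        super(16, 0.75f, true);      // access-order enables LRU eviction
        this.maxPages = maxPages;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<Long, byte[]> eldest) {
        return size() > maxPages;    // unload when threshold reached
    }
}
```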
Fixed-type record:
We have two types of columns in the database. A column whose physical storage format
is the same for different values is called a fixed-type column. Examples of fixed-type
columns are integer, long, char(10) etc. If the user has not given a value matching the declared
size, padding is done to make up the size. A column whose physical storage format
changes according to the value is called a variable-type column. Variable-type columns
are used to save physical storage. The data store will handle both types of
columns.
• Only if a table has fixed-type columns can the data store optimize its
read/write operations: since the length of the record will always be the same, any
record can be read or written directly.
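The direct read/write property can be made concrete with a small sketch: with a constant record length, the byte offset of any record follows from its number alone. The class and the header-size parameter are illustrative assumptions, not the actual layout.

```java
// Illustrative sketch (not Daffodil DB's actual page layout): with
// fixed-size records, the position of record n is computable directly,
// so no scan over preceding records is needed.
public class FixedRecordLayout {
    private final int headerSize;   // bytes reserved at the start of the page (assumed)
    private final int recordLength; // fixed length of every record

    public FixedRecordLayout(int headerSize, int recordLength) {
        this.headerSize = headerSize;
        this.recordLength = recordLength;
    }

    // Byte offset of record 'recordNumber' (0-based) inside the page.
    public int offsetOf(int recordNumber) {
        return headerSize + recordNumber * recordLength;
    }
}
```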
Indexes:
Unique Index:
Indexes created on unique columns can be optimized for data seeking, data modification
etc. Since the index is unique, we know in advance that a particular value will exist only
once.
Fixed type record and variable type record both have different storage format in pages.
For example:
Full fixed type record- Suppose we create a table with 3 columns where the data types
are integer, char and long respectively. Then we insert a record in this table. As the
record must be of fixed size 4(integer) + 1(char) + 8(long) = 13 bytes, we insert these
bytes in the table by appending a row header which helps us in retrieving these inserted
values later.
Partial fixed type record- Suppose we create a table with 3 columns where the data
types are integer, char (page size (16384 default) + 100) and long respectively. Then we
insert a record in this table. As the record must be of fixed size 4 (integer) + 16484 (char)
+ 8 (long) = 16496 bytes, this record needs more bytes than the page size. In this
situation, we have to insert this record partially in 2 pages, say x and y respectively, and
we insert these bytes in the table by appending a row header which helps us in retrieving
these inserted values later.
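The page arithmetic of the partial-record examples can be sketched as a ceiling division. This ignores per-page header overhead, so the usable bytes per page would be somewhat less than the full page size in practice; the numbers below simply mirror the examples in the text.

```java
// Hedged sketch: how many pages a record occupies when it may exceed
// the page size. The last page may be only partially filled, hence the
// ceiling division. Header overhead per page is ignored here.
public class PartialRecord {
    public static int pagesNeeded(int recordSize, int usablePageBytes) {
        return (recordSize + usablePageBytes - 1) / usablePageBytes;
    }
}
```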
In partial fixed type record we keep record written in first page as active and on rest of
the pages it’s marked as Delete.
Full variable type record- Suppose we create a table with 3 columns where the data
types are integer, varchar (page size (16384 default) - 2000) and long respectively. We can
insert a record in this table whose size can vary from 4 (integer) + 1 (varchar) + 8 (long) =
13 bytes to 4 (integer) + 14384 (varchar) + 8 (long) = 14396 bytes. Suppose that we have
inserted the maximum number of bytes (i.e. 14396 bytes); the record can still be inserted
in a single page of the table. We insert the record by appending a row header which helps
us in retrieving these inserted values later.
Partial variable type record - Suppose we create a table with 3 columns where the data
types are integer, varchar (page size (16384 default) + 100) and long respectively. Then we
insert a record in this table with 4 (integer) + 16484 (varchar) + 8 (long) = 16496 bytes.
Now, as this record needs more bytes than the page size, we have to insert this
record partially in 2 pages, say x and y respectively, and we insert these bytes in the table
by appending a row header which helps us in retrieving these inserted values later.
In partial variable type record we keep record written in first page as active and on rest
of the pages it’s marked as Delete.
We leave some space in the page which can be used at the time of updation of records,
so that records can be kept on the same page even after updation. If a record to be
updated needs more space than is available in the page, then we insert the record, with
updated values, in a new page and maintain a pointer to it in the old page. After maintaining
the pointer in the old page, we delete the old record and shuffle all the records following
the record being updated. If a record is updated with a smaller or larger length in the
same page, then we perform shuffling by the difference in length between the older and the
newer record.
For example:
Suppose that we update a record of original length 2000 in page x with a new record of
length 3000, while the free space left in this page is only 500, i.e. the total
space available for the updated record = 2000 + 500 = 2500, but the updated record is of length 3000.
In this situation, we mark this record as Update in page x and keep a pointer to the new
page, say y, with record number 5.
Similarly, when we delete a record from a page, we change its status from active to
deleted and shuffle all the records following it. We add the page to the free list when all its
records are deleted.
For example:
If we delete a record inserted as record number 1 of size 4000 bytes in page x, then we
mark this record as Delete in page x, shift the bytes of the next written record by
4000 – 1 (the Delete byte for record 1) = 3999, and update the insertable address of the
page.
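The shift arithmetic of the deletion example can be captured in a couple of lines. This is purely illustrative: the names are hypothetical and the "1 byte retained" convention simply follows the example above.

```java
// Illustrative arithmetic for the deletion example: the deleted record
// keeps 1 byte for its delete marker, so following records shift left by
// (old length - 1) bytes and the page's insertable address moves back by
// the same amount. Names are assumptions, not Daffodil DB's API.
public class PageShift {
    public static int shiftOnDelete(int oldRecordLength) {
        return oldRecordLength - 1; // 1 byte retained as the delete marker
    }

    public static int newInsertableAddress(int insertableAddress, int oldRecordLength) {
        return insertableAddress - shiftOnDelete(oldRecordLength);
    }
}
```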
Large Objects are not stored on the pages of the table. They are handled on different
pages with different format. This separation is done for the optimization purpose. To
handle these pages we hold the first page address for every large object type column
and every page contains the address of the next page and previous page, to make the
proper sequence of the included pages. We always insert a new value in the last page of
the sequence and get a new page if the last page does not have sufficient space. If we have a
table with two large object type columns, then we will have two page addresses for these
columns. In addition, these addresses will be used to navigate through all the pages that
were used to store the values for these columns. We perform special handling for insert,
update, delete and retrieval of large object type column.
For example:
Suppose that we create a table with one integer column and two Blob Columns. Here we
will allocate one page (suppose Page1) to this table and two pages (suppose Page2,
Page3) each to the Blob Columns and address of the First page for respective Blob
columns is maintained.
Insertion:
We never insert the values of the large object columns into the pages of the table. We
insert these values in separate pages, and the pages of the table hold the addresses of the
pages in which the actual data is written. For every large object column, we have a
sequence of page addresses, but we always insert new row data at the last page address.
We maintain a row header for every row's values.
We maintain the following information on page for a record:
• Active/deleted: whether a row is active or deleted
• Full/partial: whether a row value is written partially or fully.
• Length of large object: length of data of large object in bytes.
• Start Pointer: start point for the record in this page.
The start pointer for the record is kept at byte = page size – 4 bytes (for the next-page
address that maintains the linked list) – 2 × number of active records in the page (as each
pointer takes 2 bytes).
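The pointer-slot formula above can be written directly as code. A minimal sketch; the class name is an assumption for illustration.

```java
// Sketch of the start-pointer arithmetic: record pointers grow backwards
// from the end of the page, after 4 bytes reserved for the next-page
// address, with 2 bytes per active record's pointer.
public class RowPointerSlot {
    public static int pointerSlot(int pageSize, int activeRecords) {
        return pageSize - 4 - 2 * activeRecords;
    }
}
```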
Now, for the above created table if we insert a record, then value for the integer column
and the pointers for the blob columns are kept in the record inserted in page 1 while
actual values for the blob columns get stored in page 2 and page 3 respectively.
While inserting a large object value in a page, we do not use the complete space of the
page. We always leave some part of the page which can be used, when we update large
object value in that page.
We use more than one page, when the given large object is larger than the available
space of the current page. Suppose a value is written in three pages then first two pages
will have the information ‘Partial’, for indicating the row as partially written and last page
will have the information ‘Complete’ for indicating the row as completed in this page.
For Example:
If the blob value stored in page 2 exceeds the space available for insertion in page 2, then it will be
stored in the next page, say page 4, and we keep on inserting until the complete value is
inserted (say the value completes with page 5).
Updation:
We update the row on the same page, if the size of the new value is less than or equal
to the size of older row and the available space of the page. We rearrange the page to
clear all unused space if the updating record is not the last record of the page.
Otherwise, we delete that large object value from the page and insert it again with the
new values.
For example:
If we update the blob column in the first record with fewer bytes (12000) than the original
bytes (suppose 15000), then we shift the bytes written after it by the difference (3000 bytes).
If we update the blob column in the first record with an equal number of bytes (15000), then we just
update the bytes with the new values.
If we update the blob column with bytes such that 12000 < bytes <= 13000, i.e. greater than the
size of the older row (12000) but less than or equal to the size of the older row (12000) + the free
available space of the page (1000), then we use that free space for updating. Here we
shift the bytes of the next written records in the page if the record being updated is not the
last record in the page; otherwise we just update the old bytes with the new ones.
Now if we update the blob column with bytes > 13000, then we mark this record as
Delete, shift all the bytes following this record by the length of the old record – 1 byte
(kept for marking the record as deleted), and insert it at a new location with the new values.
Suppose the record is partially written in 3 pages; then we mark the record as Delete in the
first page of the record, add the 2nd page to the free list (i.e. free that page for reuse),
adjust the bytes of the 3rd page so that they are shifted by the length of
the partial bytes written in that 3rd page, and then insert the record at the next available
address in the last cluster with the new updated bytes.
Deletion:
On deleting a record, we change the status of that particular record from active to deleted
and shift the row values of all the records written after it. In addition, we add the
page to the free list if there are no more rows in it. Shifting is performed to reuse the
space freed by the row.
For example:
If we give a call to delete record 1 in page 1, then for the blob columns (page 2, page 3)
we delete the column value by following the pointer stored for the respective value in page
1; moving to that particular pointer value in page 2, we mark the Active byte
as Delete and shift the bytes of all records written after it by the length of the old
record (15000) – 1 (the byte for Delete).
In case the blob column bytes are stored partially in 3 pages (page 2, page 4, page 5),
then we mark the record as Delete in the first page of the record, add the 2nd page to the
free list (i.e. free that page for reuse), and adjust the bytes of the 3rd page
so that they are shifted by the length of the partial bytes written in that 3rd page.
Retrieval:
We provide two options to retrieve large object column values: Full and Partial. The user
can retrieve the complete data with the full option or a few bytes with the partial option. We
provide the values from all pages if the large object column value is written in more than one
page.
Caching:
We avoid working directly on the physical file because of performance: performing every
read/write operation on the physical file degrades the overall performance of the database,
because the physical file does not provide sufficient speed to work on.
We load the data of the physical file into pages and keep these pages in memory to provide
better speed and performance. Therefore, for every operation we load data of a
specified size from the physical file, if it is not already loaded into memory. A page has
status read when it is loaded for a read operation and write when it is loaded for
a write operation. Caching pages may produce a memory overflow problem.
To handle the problem we unload the pages having read status from memory when the
pages in memory cross the defined threshold. We also remove the pages with write
status when required. We store these pages in a separate file called the temporary file
(because these pages contain modified data) and create a mapping between the pointer into the
temporary file data and the address in the actual file. When the user again accesses the
actual page, we load the data of the page into memory using the data written in the
temporary file and put the address in the temporary file in a list so it can be reused.
For example:
The user is performing some operations on the database, as a result of which we keep on
loading (caching) pages to improve performance, and this number
reaches the threshold level, say 200, with 50 pages loaded for read and 150
pages for write. Now if one more page is to be loaded at this moment, we remove the 50
pages taken for read from the cache and load this page. In case our problem is not
solved, say we have to load 51 pages but can remove only the 50 pages taken for
read, then we unload pages taken for write by writing them to the temp file with a proper
pointer back to the database file.
For example:
If a user performs a number of read, write and update operations, then until the user
commits, this data is stored in the temporary data store; when the user commits, we
write the data to the actual database file and delete the record present in the temporary store.
Multi Users:
On the data store level, multiple users can work concurrently. They can work even on the
same table if they want to read data simultaneously, but read/write and write/write
operations on the same table are not allowed, because this could corrupt the database.
For example:
If three users, say A, B and C, wish to work concurrently, they can work on three
different tables of the database simultaneously without corrupting the database. They
can also work on the same table if they only perform read operations. But if one user
(say A) wants to perform a write operation on that table, it is not possible until the other
2 users B and C have read their data from the table and released the lock. In the meantime A
has to wait, and when A starts working on that table, the other two users cannot perform
read/write operations on it until A releases the lock on that table.
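The per-table locking described above (many concurrent readers, exclusive writers, first-in-first-out granting) matches the semantics of a fair read-write lock. The sketch below uses Java's standard ReentrantReadWriteLock purely for illustration; whether Daffodil DB implements its UserLockManager this way is an assumption.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch of per-table locking: multiple readers may hold the lock at
// once, a writer holds it exclusively, and waiting threads are granted
// the lock in FIFO order because the lock is constructed as fair.
public class TableLock {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true); // fair = FIFO

    public void readData(Runnable readOp) {
        lock.readLock().lock();
        try {
            readOp.run(); // many readers can be in here concurrently
        } finally {
            lock.readLock().unlock();
        }
    }

    public void writeData(Runnable writeOp) {
        lock.writeLock().lock();
        try {
            writeOp.run(); // exclusive: no readers or other writers
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```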
For Example:
If the user's records are of small size, say 3K, and few in number, then the default
page size of 16K will waste a lot of space, so here the user can set the memory cluster
size to 4K.
On the contrary, if the user's record size is 17K and he keeps the default cluster size of
16K, then every record will be inserted partially, which degrades performance because of
the overhead of retrieving from more than one cluster. To avoid this degradation, the user
can set the cluster size to more than 17K and hence improve performance.
For example:
If the records added on page x are all deleted, then we add page x to the free page list so
that this space can be reused for the next incoming records, economizing the
physical space available. Whenever a new page is needed, we first check whether any page
is available in the free list; if so, we use that page from the free list.
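The free-list policy above (recycle a freed page before allocating a new one) can be sketched with a simple queue. Names and the new-page counter are illustrative assumptions, not the actual implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Minimal sketch of a free page list: pages whose records are all
// deleted are queued for reuse, and allocation prefers a recycled page
// over extending the file with a new one.
public class FreePageList {
    private final Deque<Long> freePages = new ArrayDeque<>();
    private long nextNewPage = 0; // stands in for extending the physical file

    public void pageFreed(long pageAddress) {
        freePages.addLast(pageAddress);
    }

    // Reuse a freed page if one exists; otherwise allocate a new one.
    public long allocatePage() {
        return freePages.isEmpty() ? nextNewPage++ : freePages.removeFirst();
    }
}
```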
For example: Suppose that the user requires a huge database, say of size 20 GB, with
initial size 50 MB and increment factor 50, and he doesn't opt for MULTIFILESUPPORT.
Then a single file will keep growing in length by 50% (25 MB) whenever its threshold
is reached, and traversing such a huge file won't give him
satisfactory performance. On the other hand, if he opts for MULTIFILESUPPORT with
initial size 50 MB and increment factor 50, then instead of a single file growing to 20
GB, every time the threshold of a file is reached a new file of 50% of 50 MB
(i.e. 25 MB) is created, which results in higher performance.
Indexes:
We use indexes to improve the performance of retrieval on a table, but we bear
some cost on insert, update and delete. To maintain the indexes we use a B+
tree. Whenever a record is inserted into a table, we insert the values into its btrees as well. Every
btree occupies some space in the physical file to maintain its data. We do not perform
write operations directly on the physical file, as direct operations on the physical file always
degrade performance; we use caching for better performance. We load the data of a
btree into pages as required, and unload these pages when the number of pages in memory
crosses its threshold. A btree is a set of key and value pairs, where the key is a set of columns
and the value is a pointer to the actual record in the physical file. We store the data in the pages
according to the btree type. We have two types of btrees: fixed-type btrees and variable-type
btrees. In a fixed-type btree every key has the same storage format; in a variable-type
btree, the format of the key can differ from key to key.
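The key/value view of an index described above can be illustrated in miniature: the key is the indexed column value, and the value points at the record (a cluster address plus record number, as the TableKey class in the class diagrams represents). A java.util.TreeMap stands in for the on-disk B+ tree here purely for illustration; it is not the document's actual btree implementation.

```java
import java.util.TreeMap;

// Hedged sketch of an index as a sorted key -> record-pointer map.
// RecordPointer mirrors the idea of TableKey (cluster address + record
// number); the rest of the API is an assumption for this sketch.
public class IndexSketch {
    public static final class RecordPointer {
        public final long clusterAddress;
        public final int recordNumber;
        public RecordPointer(long clusterAddress, int recordNumber) {
            this.clusterAddress = clusterAddress;
            this.recordNumber = recordNumber;
        }
    }

    // sorted map stands in for the B+ tree's ordered key space
    private final TreeMap<String, RecordPointer> tree = new TreeMap<>();

    public void insert(String key, long clusterAddress, int recordNumber) {
        tree.put(key, new RecordPointer(clusterAddress, recordNumber));
    }

    public RecordPointer seek(String key) {
        return tree.get(key);
    }

    public String firstKey() {
        return tree.firstKey(); // ordered traversal, as a btree provides
    }
}
```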
Fixed type btree: the storage structure of the fixed type btree is illustrated by the following example:
For example:
If we insert a record in a fixed table with 3 columns having data types integer, char and
long respectively, and there exists an index on all 3 columns of this table, then we
first insert the record in a page of the table, say page x, record number 5, and then insert
this record in the index existing on this table as well. The format for this index record will be:
length of the whole row + NOTNULL + NOTNULL + NOTNULL + values of the columns
(key) + pointer to the record in the physical file where the record is stored in the table
(value = page x, 5).
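The size of such a fixed index entry can be tallied from the format above. The field widths for the row-length field and the record pointer are not given in the text, so the 2-byte and 6-byte values below are assumptions made only to complete the sketch.

```java
// Illustrative size computation for the fixed index entry format:
// row-length field + one NOTNULL status byte per column + the fixed
// column values (4 + 1 + 8 bytes) + a record pointer. The 2-byte length
// field and 6-byte pointer widths are assumptions for this sketch.
public class FixedIndexEntry {
    static final int LENGTH_FIELD = 2;         // length of the whole row (assumed width)
    static final int NULL_FLAGS   = 3;         // one NOTNULL byte per column
    static final int KEY_BYTES    = 4 + 1 + 8; // integer + char + long
    static final int POINTER      = 6;         // page address + record number (assumed width)

    public static int entrySize() {
        return LENGTH_FIELD + NULL_FLAGS + KEY_BYTES + POINTER;
    }
}
```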
For example:
If we insert a record in a variable table with 3 columns having data types integer, varchar
(1000) and long, and there exists an index on all 3 columns of this table, then we first
insert the record in a page of the table, say page x, record number 5, and then insert this
record in the index existing on this table as well. The format for this index record will be:
length of the whole row + NOTNULL + NOTNULL + NOTNULL + length of the variable
column (i.e. the varchar actually stored) + values of the columns (key) + pointer to
the record in the physical file where the record is stored in the table (value = page x, 5).
1: insert uncommitted
Session will insert a record as
uncommitted in Data Store
2: insert
Data store will ask Uncommitted Data
Pool to add the record of the table
3: lock Table
Uncommitted Data Pool will lock the
table for modification before inserting
the record
5: insert in Indexes
Data pool will insert the record in
indexes to keep a consistent view
Flow Explanations:
3: Uncommitted data pool will lock the table for modification before proceeding with any
changes.
Locking should allow multiple read and multiple write.
Locking should be done in first in, first out fashion.
4: Uncommitted data pool will add the record in the table. It will convert objects into
bytes.
6: After modifying the table and indexes, uncommitted data pool will release the table
for other threads to access it.
1: save Record
Data Store will save record on physical
storage through Committed Data Pool
2: check Null
Data Pool will check for null value in the
column
3: get Object bytes
Byte Handler will give the bytes for the
column value.
(Steps 2-4 are repeated for all the columns of the record.)
Flow Explanations:
2: Committed Data Pool will check for null value of column and will keep status of
nullability.
3: Data Pool will convert the non-null column value into bytes with the help of Byte
Handler. There are two types of columns:
Fixed Size – Column value will always take a fixed number of bytes.
Variable Size – Column value takes a varying number of bytes depending on the value.
4: Data Pool will add the null status and column bytes of all columns to make bytes for
the record.
5: Data Pool will seek the record position in the physical file to write the bytes.
6: Data pool will write the record bytes in the physical file starting from the location
sought above.
Insertion in Index
1: insert record
Data Store will give a call to Data Pool
for inserting a record
2: lock table
Data Pool will lock the table before
inserting the record
7: release lock
Data Pool will release the lock taken
above
Flow Explanations:
2: Data Pool will take the lock on the table to maintain consistency of the indexes.
When an insert/delete operation is in progress, no other operation (read or write) will be
allowed on the table. When an update operation is in progress and no index is affected
3: Data pool will insert the record in all the indexes of the table.
4: Index will locate the record position using the index column values from the record.
5: Index will insert the record at the position found in the above step. Index will make
the readjustment in the index so as to maintain consistent performance.
CLASS DIAGRAMS:
Class Diagram: Interfaces
[Class diagram showing the interfaces DataSystem, Database, Table, TableCharacteristics,
DatabaseUser, RecordCluster, UserTableOperations, TableOperations, Navigator and
TableNavigator, with their <<instantiates>> and <<depends>> relationships.]
RecordCluster (Interface)
RecordCluster will be responsible for insertion, updation, deletion and retrieval of
records from the clusters. RecordCluster will also provide information such as the number of
records in the cluster, free space in the cluster, etc.
DataSystem (Interface)
Data System will be responsible for managing:
• Currently active databases. By an active database we mean a database on which the
user is doing some operations.
• Creation and deletion of databases.
Database (Interface)
Database will be responsible for managing creation, deletion and alteration of table
objects for data modifications and retrieval.
Table (Interface)
Table interface will be responsible for providing information related to a table like table
characteristics.
Navigator (Interface)
Navigator will be responsible for navigation of records in a table. Navigation supported
will be of scrollable type i.e. to and fro movement is allowed. Navigator will also provide
key for current row. Key can be used later to align the iterator on a particular row.
DatabaseUser (Interface)
Database user will be used for locking the database for data modifications. Database
User will keep track of the clusters affected during data modifications, and finally
these clusters will be stored in the physical file. Locking of the database will be done on a
per-table basis.
TableNavigator (Interface)
TableNavigator will extend the Navigator interface for providing functionality for retrieving
columns values in different ways.
TableCharacteristics (Interface)
TableCharacteristics will represent the information about the columns. All information
about the columns like type, size, name etc. can be retrieved. TableCharacteristics will
be responsible for conversion of objects into bytes and vice versa.
TableOperations (Interface)
TableOperations will be responsible for insertion, updation and deletion in a table.
TableOperations will be providing functionality to facilitate the data modifications
operations.
UserTableOperations (Interface)
UserTableOperations will provide functionality to DatabaseUser for data modifications
operations.
[Class diagram showing the classes PersistentDatabase, WritableClustersFile,
ClusterManager, FreeSpaceManager, DatabaseProperties, ClustersMap,
ClusterCharacteristics, TableManager, TableCharacteristicsGenerator, ColumnBytesTable,
UserLockManager, LOBManager, PersistentTable, TableCharacteristicsImpl, PersistentUser,
TableProperties, TableKey, DClobUpdatable, FixedRecordCluster, VariableRecordCluster,
PartialFixedRecordCluster and PartialVariableRecordCluster, with their <<depends>>
relationships.]
PersistentSystem ( )
PersistentSystem will provide the implementation of DataSystem interface.
PersistentDatabase ( )
PersistentDatabase will provide the implementation of the Database interface.
PersistentDatabase will also be responsible for managing:
• the physical file
• caching of clusters
PersistentTable ( )
PersistentTable will be providing the implementation of Table Interface.
PersistentUser ( )
PersistentUser will be providing the implementation of DatabaseUser Interface.
PhysicalFile ( )
PhysicalFile will be responsible for interacting with the underlying operating system. It
will use the RandomAccessFile provided by Java to do the operations. PhysicalFile will
also handle multiple files for a single database.
ClusterManager ( )
ClusterManager will be caching the clusters used during data modifications and data
retrieval. ClusterManager will also ensure that a single instance of cluster is being used
in the data store at a given time.
Clusters will be cached on the basis of operation in which they are involved. If a cluster
is required for data retrieval, it will be loaded in a read mode. If a cluster is required for
data modification, it will be loaded in a write mode. ClusterManager will manage the
read-mode clusters and write-mode clusters differently. The number of clusters cached in both
modes can be configured by the end user.
EncryptedPhysicalFile ( )
EncryptedPhysicalFile will be responsible for encryption and decryption of data stored in
physical file. This class will be doing encryption of data being written in physical file
using the encryption key and algorithm specified by the user at the time of creation. Also,
data being read from the physical file will be decrypted using the same key and algorithm.
FileGenerator ( )
FileGenerator class will be responsible for creation and deletion of files on the operating
system in case of multiple files for a single database. File Generator will also be
managing the names and size of the files being created.
WritableClustersFile ( )
WritableClustersFile will be responsible for handling clusters (write mode) flushed by the
cluster manager. Cluster Manager will flush clusters whenever the limit
specified by the user is exceeded. We can't flush a cluster in write mode as such, because
that could cause loss of data. So before flushing the cluster, cluster manager will save
the contents of the cluster in WritableClustersFile.
ClustersMap ( )
ClusterMap will be responsible for storing clusters loaded by cluster manager.
ClusterMap will be using ClusterCharacteristics as key.
FixedRecordCluster ( )
FixedRecordCluster will be implementing the RecordCluster interface.
FixedRecordCluster will be handling records consisting of only fixed type of columns.
VariableRecordCluster ( )
VariableRecordCluster will be providing implementation of RecordCluster interface. It will
be responsible for managing the records of table having variable type of columns. A
table may have all or some columns of variable type. VariableRecordCluster will be
responsible for insertion, updation, deletion and retrieval of variable type of records from
the clusters of the table.
PartialFixedRecordCluster ( )
PartialFixedRecordCluster handles the same responsibility as FixedRecordCluster. This
class is used when the size of records is larger than the cluster size and all the columns are
of fixed data type.
PartialVariableRecordCluster ( )
PartialVariableRecordCluster will be handling same responsibilities as
VariableRecordCluster. This class will come into picture when the bytes of the records
are larger than cluster size and some columns of the table are of variable type.
LOBManager ( )
LOBManager is responsible for storing and retrieving blob and clob data type columns.
Each table having a large object data type column will have its own LOBManager.
LOBManager will be doing data modifications and will be interacting with the persistent
database for allocation and de-allocation of clusters.
DBlobUpdatable ( )
DBlobUpdatable will be responsible for handling binary large data objects. This class will
retrieve the contents of the object from the database.
DClobUpdatable ( )
DClobUpdatable will be responsible for handling character large object. This class will
retrieve the contents of the object from the database.
TableCharacteristicsGenerator ( )
TableCharacteristicsGenerator class will be making the TableCharacteristics objects by
reading the information from the system table used for columns info.
TableCharacteristicsImpl ( )
TableCharacteristicsImpl will be implementing the TableCharacteristics interface.
ClusterIterator ( )
ClusterIterator will be providing implementation of the Navigator interface. ClusterIterator
will use the clusters of the tables to navigate the records and their column values.
FreeSpaceManager ( )
FreeSpaceManager will be responsible for managing the clusters marked as free
because of deletion of records from clusters. A cluster is marked as free when all the
records in the cluster are deleted.
UserLockManager ( )
UserLockManager will be responsible for giving locks to DatabaseUser for accessing
different tables of the database. UserLockManager will ensure that the tables are not
accessed concurrently by different users. Also, locks will be given to the users on a
First In, First Out basis.
TableManager ( )
TableManager will be responsible for creating the Table object. TableManager will be
doing manipulation on the system tables to save and retrieve the meta data about the
tables. TableManager will ensure that only single instance of a table is created.
TableKey ( )
TableKey represents the relative address of a record of a table in the database file.
TableKey consists of cluster address and record number in the cluster.
TableProperties ( )
TableProperties class will be providing information of columns in a ready to use manner
to RecordCluster classes. Difference between TableProperties and TableCharacteristics
is that TableCharacteristics provide functionality for converting objects into bytes and
vice versa.
ClusterCharacteristics ( )
ClusterCharacteristics is a unique value with which a cluster can be identified.
ClusterCharacteristics is used as key for storing the clusters in the cache. Also,
clustercharacteristics are stored in place of cluster itself so that cluster can be freed on
the requirement of memory by the database server.
ColumnBytesTable ( )
ColumnBytesTable will implement the Table interface. ColumnBytesTable will be
converting individual column bytes into row bytes and vice versa.
DatabaseProperties ( )
DatabaseProperties class will be representing the properties of the database. Properties
will include creation time properties as well as run-time properties.
For example, properties will include cluster size of the database, unicode support etc.
CLASS DIAGRAMS:
Class Diagram: Interfaces
[Class diagram showing the interfaces IndexDatabase, BTreeCharacteristics, IndexTable,
IndexIterator, ClusterProvider, NodeManager, IndexTableIterator and Node, with their
<<depends>> relationships.]
Classes:
NodeManager (Interface)
NodeManager provides the functionality for creation, deletion and retrieval of the nodes.
It interacts with cluster provider to manage the clusters used by the nodes.
Node (Interface)
Node provides the functionality for insertion, updation, deletion and retrieval of elements
in a btree node. It also provides functionality to manage the node like element count,
level and split point.
BTreeCharacteristics (Interface)
BTreeCharacteristics interface provides the functionality for retrieving column values of
the index without accessing the table.
ClusterProvider (Interface)
ClusterProvider is responsible for providing clusters to the NodeManager.
ClusterProvider provides the functionality for creating cluster, reading clusters and
adding free clusters.
IndexTable (Interface)
IndexTable extends the functionality of the table interface. IndexTable is responsible for
managing indexes created on the table. IndexTable is responsible for insertion, updation
and deletion of the data from the indexes of the table.
IndexTableIterator (Interface)
IndexTableIterator extends the functionality of the Navigator and IndexIterator. It also
provides methods to get info about table and retrieve columns.
IndexIterator (Interface)
IndexIterator provides the functionality for searching data in the index and retrieving
values of the columns involved in the index.
[Class diagram showing the classes BTreeCharacteristics, ByteComparatorSingleColumn,
IndexSystem, BTreeNavigator, BTreeKey, BTree, IndexTableImpl, ColumnObjectTable,
BTreeNode, BTreeElement, IndexDatabaseUser, FileNodeManager,
BlobClobColumnObjectTable, IndexTableIteratorImpl, FixedFileNode, FileBTreeElement,
IndexColumnInformation, VariableFileNode, FixedBTreeCluster, PersistentDatabase and
BTreeControlCluster (from FileSystem), with their <<depends>> relationships.]
IndexSystem ( )
IndexSystem provides implementation of the DataSystem interface at the index level.
IndexSystem is responsible for managing databases, including the creation and deletion
of databases.
IndexDatabaseImpl ( )
IndexDatabaseImpl provides the implementation of the IndexDatabase. It interacts with
persistent database to provide access to tables.
IndexTableImpl ( )
IndexTableImpl provides implementation of the IndexTable interface.
IndexDatabaseUser ( )
IndexDatabaseUser implements the DatabaseUser interface. It uses the PersistentUser
class to provide the functionality of the database user.
IndexTableIteratorImpl ( )
IndexTableIteratorImpl provides implementation of the IndexTableIterator interface. This
class also takes care of the lock for read and write operations.
BTree ( )
BTree is responsible for managing an index. BTree provides the functionality for data
manipulation and data retrieval on the index. BTree also provides the functionality to
search for or seek particular data in the index.
BTreeNode ( )
BTreeNode represents a node of the btree. It consists of btreeElement arranged in a
sorted manner. Each btreenode has a parent node and a parent btreeElement.
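As a rough illustration of the sorted arrangement described above, the sketch below keeps a node's elements ordered with binary-search insertion and exposes a median split point. The class and method names are hypothetical, not the actual Daffodil DB implementation, and integer keys stand in for full BTreeElement key/value pairs:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of a btree node whose elements stay sorted.
class SortedNode {
    private final List<Integer> elements = new ArrayList<>();

    // Insert at the position found by binary search, keeping order.
    void insert(int key) {
        int pos = Collections.binarySearch(elements, key);
        if (pos < 0) pos = -pos - 1; // not found: convert to insertion point
        elements.add(pos, key);
    }

    int elementCount() { return elements.size(); }

    int elementAt(int pos) { return elements.get(pos); }

    // A full node splits at its median element, which moves up
    // into the parent node.
    int splitPoint() { return elements.size() / 2; }

    public static void main(String[] args) {
        SortedNode node = new SortedNode();
        node.insert(30);
        node.insert(10);
        node.insert(20);
        System.out.println(node.elementAt(1)); // prints 20
    }
}
```

Keeping elements sorted inside the node is what makes binary search within a node, and therefore logarithmic lookup over the whole btree, possible.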
BTreeElement ( )
BTreeElement represents a key and value pair of the index. BTreeElement key is the
value of index columns and value is the record pointer of the record in the table.
BTreeElement stores information about the node to which this element belongs. Also,
information about the child nodes is stored in the btreeElement.
FileNodeManager ( )
FileNodeManager implements the NodeManager interface. FileNodeManager stores the
information in the btree control cluster.
FileNodeManager also caches the nodes to avoid re-reading clusters from the
database.
FixedFileNode ( )
FixedFileNode implements the Node interface. A FixedFileNode object is created when all
the columns of the btree are of fixed data type. FixedFileNode converts objects into
bytes and vice versa and uses fixed btree cluster class to store and retrieve the bytes of
the elements in the database.
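The object-to-byte conversion for fixed data types can be sketched with java.nio.ByteBuffer. The layout below (an int key followed by a long record pointer) is an assumed example for illustration, not the actual Daffodil DB cluster format:

```java
import java.nio.ByteBuffer;

// Sketch of the fixed-width object<->byte conversion a FixedFileNode
// performs when every index column has a fixed-size data type.
class FixedWidthCodec {
    static final int RECORD_SIZE = Integer.BYTES + Long.BYTES; // 12 bytes

    // Encode one element: the key (int column) and the record pointer.
    static byte[] encode(int key, long recordPointer) {
        ByteBuffer buf = ByteBuffer.allocate(RECORD_SIZE);
        buf.putInt(key);
        buf.putLong(recordPointer);
        return buf.array();
    }

    // Decode the key back from the fixed-size byte record.
    static int decodeKey(byte[] record) {
        return ByteBuffer.wrap(record).getInt();
    }

    // Decode the record pointer, which starts after the int key.
    static long decodePointer(byte[] record) {
        return ByteBuffer.wrap(record).getLong(Integer.BYTES);
    }
}
```

Because every record is exactly RECORD_SIZE bytes, the node can locate the i-th element in a cluster by simple offset arithmetic, which is the advantage of the fixed-node case.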
BTreeNavigator ( )
BTreeNavigator is a navigator on the btree. It is also used to retrieve the column values
of the record. BTreeNavigator provides scrollable type of navigation on the btree.
BTreeKey ( )
BTreeKey represents the address of a key in the btree. It consists of a btreeElement and
the position of btreeElement in the node. It is mainly used in the navigation of the btree.
VariableFileNode ( )
VariableFileNode implements the Node interface. A VariableFileNode object is created
when at least one of the columns of the index is of a variable data type. VariableFileNode
converts object into bytes and vice versa. It interacts with btree cluster class to store and
retrieve the bytes of the elements in the database.
BTreeControlCluster ( )
BTreeControlCluster represents a page/cluster of the physical file. BTreeControlCluster
is used to store information related to a btree. Information stored is starting cluster of the
btree, size of the btree, index columns etc.
IndexColumnInformation ( )
IndexColumnInformation provides information about the index columns. This information
includes the type, size, whether the data type is fixed, the number of variable columns, etc.
PersistentDatabase ( )
PersistentDatabase will provide the implementation of the Database interface.
PersistentDatabase will also be responsible for managing:
- The physical file
- Caching of clusters
- Locks for data modifications
- Free space in the physical file
BTreeCharacteristicsImpl ( )
BTreeCharacteristicsImpl provides the implementation of BTreeCharacteristics when
multiple columns are involved in an index.
BTreeCharacteristicsSingleColumn ( )
BTreeCharacteristicsSingleColumn provides the implementation of BTreeCharacteristics
when a single column is involved in an index.
Comparator (Interface)
Comparator provides the functionality to compare the keys of the index for the btree.
ByteComparator ( )
ByteComparator implements the Comparator interface and is used by the btree to sort
the data in the index. ByteComparator uses the bytes of the columns to compare them.
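A byte-level comparison of this kind might look like the following sketch, assuming keys are already serialized to byte arrays and should sort as unsigned bytes; the class and method names are illustrative, not the actual Daffodil DB code:

```java
// Sketch of a byte-level key comparator: keys are compared
// lexicographically by their serialized bytes.
class ByteKeyComparator {
    static int compare(byte[] a, byte[] b) {
        int len = Math.min(a.length, b.length);
        for (int i = 0; i < len; i++) {
            // Compare as unsigned bytes so values above 0x7F sort correctly.
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) return diff;
        }
        return a.length - b.length; // on a common prefix, the shorter key sorts first
    }
}
```

Comparing serialized bytes directly avoids deserializing column objects during every node search, which is why a byte comparator is attractive inside a btree.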
FixedBTreeCluster ( )
FixedBTreeCluster extends the functionality of the cluster. It also provides the
functionality to insert, update, delete and retrieve a btree element in a node. It also
manages other information such as the next node address and the node type (leaf or
non-leaf).
ColumnObjectTable ( )
ColumnObjectTable is responsible for converting objects into bytes and vice versa.
ColumnObjectTable implements the IndexTable interface.
BlobClobColumnObjectTable ( )
BlobClobColumnObjectTable is responsible for converting objects into bytes and vice
versa. It also handles large object data type columns by interacting with LOBManager.
Parser responsibilities:
Parsing SQL statements
Creating tree-type object corresponding to the SQL statements
Generating Java classes
Parser will be responsible for parsing the SQL statements according to the SQL-99
grammar. Parser will create a tree-type Java object representing the content of the
query. The classes of the tree-type object will represent the different rules of the SQL
grammar, and the parser will initialize these classes with the corresponding information
from the query.
Example:
"Select * from Country".
class SelectQuery {
    ColumnList columnList;
    TableList tableList;
    WhereClause where;
    OrderByClause order;
}
For the above query, the parser will create a SelectQuery object. Parser will assign "*" to
the columnList object and "Country" to the tableList object. The where clause and order
by clause will remain null because the query has neither a where condition nor an order by.
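The population step can be sketched as below, using a simplified variant of the SelectQuery class whose fields are plain Strings; the naive string splitting is purely illustrative and is not the real parsing algorithm:

```java
// Simplified SelectQuery: plain Strings stand in for the rule classes.
class SimpleSelectQuery {
    String columnList;
    String tableList;
    String where; // stays null: the example query has no WHERE clause
    String order; // stays null: the example query has no ORDER BY clause
}

class SelectQueryDemo {
    // Naive extraction for illustration only: split the query on "from".
    static SimpleSelectQuery parse(String sql) {
        String[] parts = sql.trim().split("(?i)\\s+from\\s+");
        SimpleSelectQuery q = new SimpleSelectQuery();
        q.columnList = parts[0].replaceFirst("(?i)select\\s+", "");
        q.tableList = parts[1];
        return q;
    }

    public static void main(String[] args) {
        SimpleSelectQuery q = parse("Select * from Country");
        System.out.println(q.columnList + " / " + q.tableList); // prints: * / Country
    }
}
```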
Parser will throw an exception in case the syntax is invalid. Parser will specify the position
in the query and the cause of the error so that the user can correct the query easily.
Parsing a query will involve token generation, checking the grammar rules, and finally
making a tree-type object corresponding to the satisfied rule and loading the query
information into the object. Token generation will involve breaking the query into tokens
using the grammar. After the tokens are generated, the parser will match the first token of
the query with the first token of each grammar rule. A rule whose first token matches the
query token will be selected, and the parser will match the next token of the query with
the next token in the rule. The parser will repeat this procedure, trying to match all the
tokens of the rule with the tokens in the query. If all the tokens match, the parser will
create a tree-type object and load the token information into it. If the tokens do not all
match, the parser will check the next rule of the grammar. If all the rules of the grammar
have been checked and no rule satisfies the given query, the parser will throw an error.
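The token-generation step described above can be sketched as follows; the delimiter set is an assumption chosen for illustration:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of token generation: the query is broken into tokens on
// whitespace and single-character delimiters, and spaces are discarded.
class SimpleTokenizer {
    private static final String DELIMITERS = ",()*=<>";

    static List<String> tokenize(String query) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : query.toCharArray()) {
            if (Character.isWhitespace(c) || DELIMITERS.indexOf(c) >= 0) {
                if (current.length() > 0) {       // flush the pending word token
                    tokens.add(current.toString());
                    current.setLength(0);
                }
                if (DELIMITERS.indexOf(c) >= 0) { // delimiters are tokens too
                    tokens.add(String.valueOf(c));
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) tokens.add(current.toString());
        return tokens;
    }
}
```

For "Select * from Country" this produces the four tokens Select, *, from and Country, which the rule-matching step then consumes one at a time.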
Generating Classes:
To execute the query, we will need classes corresponding to the grammar rules, and the
parser will provide a utility to generate the classes for a given grammar. The generated
classes will represent the rules of the grammar. Each class will have a variable
corresponding to each token or sub-rule occurring in its sequence. All classes will have a
common method to execute/run them.
Grammar Rules:
A rule can be defined as a sequence of tokens or rules. A rule can contain other rules in
its sequence. A grammar can have the following types of rules:
Multi option Rule: A rule containing a set of rules such that any rule can be
assigned to this rule. Parser will have to evaluate all the rules to check whether
this rule is satisfied or not.
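A multi option rule can be sketched as trying each alternative in turn, as below; modeling sub-rules as predicates on a single token is a simplification made for illustration, not the actual rule representation:

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Sketch of a multi option rule: it holds a set of alternative
// sub-rules and is satisfied if ANY alternative matches.
class MultiOptionRule {
    private final List<Predicate<String>> options;

    MultiOptionRule(Predicate<String>... options) {
        this.options = Arrays.asList(options);
    }

    // Try each alternative in order; succeed on the first match.
    boolean matches(String token) {
        for (Predicate<String> option : options) {
            if (option.test(token)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // A "literal" is either a quoted string or a run of digits.
        MultiOptionRule literal = new MultiOptionRule(
                t -> t.startsWith("'") && t.endsWith("'"),
                t -> !t.isEmpty() && t.chars().allMatch(Character::isDigit));
        System.out.println(literal.matches("123"));  // prints true
        System.out.println(literal.matches("name")); // prints false
    }
}
```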
Flow Explanations:
1: Server will give the query to Parser for parsing. If the query is not valid according to the
SQL 99 grammar, an error will be returned.
2: Parser will request the Tokenizer to make tokens of the query passed. A token is the
smallest meaningful unit of information according to the grammar. Tokenizer will make
tokens using the delimiters and will ignore spaces and comments in the query.
3: Tokenizer will create token objects. Each token will have its information and a type.
The type will represent whether the token is a keyword, identifier, constant, delimiter, etc.
4: Parser will give the tokens to a grammar rule to parse. If the grammar rule fails to
parse the query, Parser will try the next grammar rule. Parser will repeat this until all
the rules of the grammar have been checked. If the query is not parsed by any rule,
parser will throw an error.
5: Grammar Rule will match the current token with its first token to check whether
query is parsable under the rule.
6: If current token matches with the first token, rule will create the Java Class Object
representing the Grammar Rule.
9: Rule will set the sub-rule Java Class object in its Java Class object to create a tree
type object. The tree type object will have the contents of the query represented in terms
of Java objects.
10 & 11: Parser will return the Java class object to server
[Sequence diagram — 6: generate java class name (Rule will generate the name of its java class file); 7: define java class (Rule will create the physical file according to the java language); 8: set sub-rule variables (Rule will set the sub-rule variables in the physical file)]
Flow Explanations:
1: Class generator will give a call to Grammar rule to generate Java class
corresponding to the rule.
2: Grammar rule will ask its sub-rule to generate their java classes.
4: Sub rule will return the name of java class generated by it.
6: Grammar rule will generate the name of the java class using the rule definition.
8: Rule will set the sub-rule variables in the java class file for all the sub-rules.
CLASS DIAGRAMS:
Class Diagram: Classes
[Class diagram showing the parser classes and their dependencies: Parser, ParseException, DaffodilClassLoader, GrammarRule, GrammarRuleFactory, TokenGrammarRuleFactory, CharacterStringLiteralGrammarRule, MultipleOptionGrammarRule, MultipleSubRulesGrammarRule, OptionalGrammarRule, RangeGrammarRule, RepetitiveGrammarRule, SimpleGrammarRule, StringGrammarRule, KeyWordGrammarRule, TokenGrammarRule, GrammarRuleComparable, MultipleOptionGrammarRuleComparable, MultipleSubRulesGrammarRuleComparable, OptionalGrammarRuleComparable, RepetitiveGrammarRuleComparable, SimpleGrammarRuleComparable]
Parser ( )
Parser is responsible for parsing the query using the GrammarRule. Parser uses
GrammarRuleFactory to create the GrammarRule objects.
GrammarRule ( )
GrammarRule is an abstract class representing a rule of the SQL grammar. GrammarRule
is responsible for parsing the SQL query, creating the object of its class and loading the
content of the query into it. GrammarRule also manages its sub-rules, and the query is
parsed in a recursive fashion.
GrammarRuleComparable ( )
GrammarRuleComparable is an abstract class extending GrammarRule. It is created
when the starting token of a rule can be compared and a decision can be taken if a
query will be parsed in this rule.
DaffodilClassLoader ( )
DaffodilClassLoader extends Java class loader to load the class corresponding to
grammar rules. Classes are loaded when a grammar rule is parsed and rule creates an
instance of that class. Every grammar rule has the reference to class loader.
MultipleOptionGrammarRule ( )
MultipleOptionGrammarRule represents a rule which involves multiple rules and query will
be parsed if any of the rules is satisfied. This rule will create the object of the rule which
is satisfied.
MultipleOptionGrammarRuleComparable ( )
MultipleOptionGrammarRuleComparable represents a rule which involves multiple rules
which are comparable and query will be parsed if any of the rules is satisfied.
Functionality of this class is very similar to MultipleOptionGrammarRule.
MultipleSubRulesGrammarRule ( )
MultipleSubRulesGrammarRule represents a grammar rule which involves multiple rules
and query will be parsed only when all the rules are satisfied. This rule will create an
instance of class and set the objects of sub-rules in the newly created object.
MultipleSubRulesGrammarRuleComparable ( )
MultipleSubRulesGrammarRuleComparable is the same as MultipleSubRulesGrammarRule
except that the first sub-rule is of comparable type.
OptionalGrammarRule ( )
OptionalGrammarRule represents a rule which is not mandatory in the query to be
parsed. This rule returns an object if users have specified it in the query.
OptionalGrammarRule always contains another grammar rule.
OptionalGrammarRuleComparable ( )
OptionalGrammarRuleComparable is the same as OptionalGrammarRule except that the
rule contained is of comparable type.
RangeGrammarRule ( )
RangeGrammarRule represents a rule which contains values lying in a range. For
example, digit = 0-9 will be a RangeGrammarRule whose value lies between 0 and 9.
RepetitiveGrammarRule ( )
RepetitiveGrammarRule represents a grammar rule which can be repeated to specify
more than one set of values. RepetitiveGrammarRule parses the query, creates an
object of the class, and continues to parse the query as long as the rule is satisfied.
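The repetition behaviour can be sketched on a comma-separated identifier list: the sub-rule matches one identifier, and parsing continues while a comma signals another repetition. The token handling below is simplified for illustration and is not the actual rule implementation:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a repetitive rule: the same sub-rule is applied again and
// again ("identifier, identifier, ...") until it stops matching.
class RepetitiveRuleDemo {
    // Consume a comma-separated identifier list from the token stream,
    // returning the identifiers matched by the repeated sub-rule.
    static List<String> parseIdentifierList(List<String> tokens) {
        List<String> matched = new ArrayList<>();
        int pos = 0;
        while (pos < tokens.size()) {
            matched.add(tokens.get(pos));              // sub-rule: one identifier
            if (pos + 1 < tokens.size() && tokens.get(pos + 1).equals(",")) {
                pos += 2;                              // a comma: repeat the sub-rule
            } else {
                break;                                 // rule no longer satisfied
            }
        }
        return matched;
    }
}
```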
RepetitiveGrammarRuleComparable ( )
RepetitiveGrammarRuleComparable is the same in functionality as
RepetitiveGrammarRule except that the rule is of comparable type.
SimpleGrammarRuleComparable ( )
SimpleGrammarRuleComparable contains a comparable type of rule.
SimpleGrammarRule ( )
SimpleGrammarRule represents a rule consisting of a single token or another rule.
StringGrammarRule ( )
StringGrammarRule represents a token of the grammar.
CharacterStringLiteralGrammarRule ( )
CharacterStringLiteralGrammarRule is responsible for parsing string literals given using
single quotes. This class makes a token for the string literal.
KeyWordGrammarRule ( )
KeyWordGrammarRule is responsible for parsing keyword of a grammar.
TokenGrammarRule ( )
TokenGrammarRule is responsible for parsing tokens.
GrammarRuleFactory ( )
GrammarRuleFactory is responsible for creating GrammarRule and
GrammarRuleComparable objects corresponding to a grammar.
TokenGrammarRuleFactory ( )
TokenGrammarRuleFactory is responsible for creating GrammarRule and
GrammarRuleComparable objects corresponding to tokens of a grammar.
For latest information and updates on Daffodil DB/One$DB, please see our Release
Notes at: http://www.daffodildb.com/daffodil-release-notes.html
If you have successfully installed and started working with Daffodil DB/One$DB, please
remember to sign up for the benefits you are entitled to as a Daffodil DB customer.
For free support, be a part of our online developer community at Daffodil Developer
Forum
If you spot a typographical error in the Daffodil DB Design Document, or if you have
thought of a way to make this manual better, we would love to hear from you!
This manual, as well as the software described in it, is furnished under license and may
be used or copied only in accordance with the terms of such license. The content of this
manual is for informational purposes only, and is liable to change without prior notice.
Daffodil Software Limited assumes no responsibility or liability whatsoever for any errors
or inaccuracies that may appear in this documentation. No part of this product or any
other product of Daffodil Software Limited or related documentation may be stored,
transmitted, reproduced or used in any other manner in any form by any means without
prior written authorization from Daffodil Software Limited.