You are on page 1of 52

Introduction to SQL

Vijay Kanabar

Boston University

Ron McFadyen

University of Winnipeg
Table of Contents 2

TABLE OF CONTENTS

TABLE OF CONTENTS.....................................................................................................................................2
Chapter 1.........................................................................................................................................................................1

SQL Architecture___________________________________1
Introduction..............................................................................................................................................................1
Client/Server Architecture.....................................................................................................................................1
Databases.................................................................................................................................................................3
The Relational Model............................................................................................................................................3
Structured Query Language (SQL)............................................................................................................................4
Chapter Two....................................................................................................................................................................6

Database Definition__________________________________6
The Library Database................................................................................................................................................6
Patron...................................................................................................................................................................8
Creating Tables.........................................................................................................................................................9
Altering a Table......................................................................................................................................................10
Dropping Tables.................................................................................................................................................10
Copying Tables...................................................................................................................................................11
Data Types.............................................................................................................................................................12
Predefined Types....................................................................................................................................................12
Numeric types.....................................................................................................................................................12
String types.........................................................................................................................................................13
SQL Data Types.....................................................................................................................................................13
The null value.....................................................................................................................................................14
Chapter Three................................................................................................................................................................15

Data Manipulation__________________________________15
Selecting Columns..................................................................................................................................................15
Selecting Rows ......................................................................................................................................................17
Arithmetic Operations.............................................................................................................................................18
Boolean Operators .................................................................................................................................................19
Special Operators for the WHERE clause ..........................................................................................................20
System Variables ..............................................................................................................................................23
Column Functions ..................................................................................................................................................23
Use of COUNT...................................................................................................................................................24
Ordering the Result ...........................................................................................................................................25
Grouping Data ......................................................................................................................................................26
Restricting groups using HAVING ....................................................................................................................28
Interaction of clauses in the SELECT .................................................................................................................29
Joins.......................................................................................................................................................................29
Outer Join...........................................................................................................................................................31
Nested Queries ......................................................................................................................................................32
Correlated Subqueries .......................................................................................................................................35
Exists..................................................................................................................................................................36
Use of ALL, ANY..............................................................................................................................................36
Set Operations........................................................................................................................................................37
Union.................................................................................................................................................................37
Difference .........................................................................................................................................................38
Chapter Four.................................................................................................................................................................39
Table of Contents 3

Updating The Database _____________________________39


Modifying Rows ...................................................................................................................................................39
Deleting Rows .....................................................................................................................................................41
Inserting New Rows ..............................................................................................................................................42
Chapter Five..................................................................................................................................................................45

V i e w s__________________________________________45
Creating Views ......................................................................................................................................................45
Dropping views .................................................................................................................................................46
Advantages of Views..........................................................................................................................................46
Grouped Views .................................................................................................................................................46
Check Option ........................................................................................................................................................47
Chapter 1

SQL Architecture
Client/Server and Web Technology Architecture
Introduction

There are two types of architectures used to develop SQL applications. The first category is
the traditional mainframe application and the second is the client/server architecture.

1) Mainframe Architecture: With this architecture we essentially use a powerful mainframe


to host the Data Base Management System (DBMS). Several terminals, largely passive
dumb terminals communicate with the mainframe and retrieve data from it.

2) Client/Server Architecture: With client/server architecture, the DBMS system and user
applications are partitioned into two or more logical segments or tiers. The first tier is the
front-end or client, which is responsible for the Graphical User Interface (GUI) and some
of the application business logic. The second tier (last tier) functions as the back-end
database server. This is typically a powerful computer and hosts the DBMS. A key
component of the client/server architecture is middleware. Middleware is software “glue”
that ties a client and server together. The partition of the client and database server
represents a functional division of work and also included a division of hardware and
software components amongst the various tiers.

Client/Server Architecture

The client/server architecture is a computational architecture that separates an application into


two processes:

1. One process plays the role of the client, the other plays the role of the server

2. The client process requests services from server process

Middleware glues the two components together.

The first generation client/server systems were two tiered architectures. Such architectures tend to
lose their efficiency when scaled up to support large-scale computing environments that service
hundreds of clients. Scalability is poor because the business logic (application program
components) resides in the front-end. This creates heavy network traffic between the client and
database server. Three tiered client/server architectures overcome this limitation.

1
INTRODUCTION 2

With three tiered architectures an application is typically divided into three logical components. The
client front-end is only responsible for the GUI and does no application programming. The
intermediate layer can also support a variety of services. For example the application server is
typically on the intermediate layer. Such a server typically shares application logic with other
servers or clients.

Since this book deals with the SQL-environment of typical client/server architectures we
introduce its key components next. An SQL-environment comprises:

One SQL-agent.

One SQL-implementation.

Zero or more SQL-client modules, containing externally-invoked procedures available to the


SQL-agent.

Zero or more authorization identifiers.

Zero or more catalogs, each of which contains one or more SQL-schemas.

The sites, principally tables that constitute ‘‘the database’’.

SQL Client/Server Terms


The client/server architecture translates into parallel SQL architecture concepts. ISO introduces the
following terms in their standard.

SQL-Agent: An SQL-agent is that which causes the execution of SQL-statements. In the case
of the direct invocation of SQL it is implementation-defined. Alternatively, it may consist of one
or more compilation units that, when executed, invoke externally invoked procedures in an
SQL-client module.

SQL-Implementation: An SQL-implementation is a processor that executes SQL-statements,


as required by the SQL-agent. An SQL-implementation, as perceived by the SQL-agent,
includes one SQL-client, to which that SQL-agent is bound, and one or more SQL-servers.

SQL-Client: An SQL-client is a processor, perceived by the SQL-agent as part of the SQL-


implementation that establishes SQL-connections between itself and SQL-servers and
maintains a diagnostics area and other state data relating to interactions between itself, the
SQL-agent, and the SQL-servers.

SQL-Server: Each SQL-server is a processor, perceived by the SQL-agent as part of the


SQL-implementation that manages SQL-data. Each SQL-server:
INTRODUCTION 3
Manages the SQL-session taking place over the SQL-connection between itself and the SQL-
client.

Executes SQL-statements received from the SQL-client, receiving and sending data as
required.

Maintains the state of the SQL-session, including the authorization identifier and certain
session defaults.

Other Key Concepts

SQL-connection: An association between an SQL-client and an SQL-server.

o SQL-environment: The context in which SQL-data exists and SQL-statements are


executed.
SQL-session: The context within which a single user, from a single SQL-agent, executes a
sequence of consecutive SQL-statements over a single SQL-connection.

SQL-statement: A string of characters that conforms, or purports to conform, to the Format

Databases

Most organizations today store their data resource in a database. So then, what is a database? A
database is a central reservoir for all data. A computerized database system facilitates sharing of
data (and information). Consider a library users database. Any branch of the same library should
be capable of checking your book in or out. They should also be in a position to answer questions
such as “How much do I owe in fines?” or “Is this book available?” In order to respond to such
questions the library database should be available to all the employees at all the branches.

Databases are queried and updated frequently - new data inserted, old removed, and existing data
changed; we use a Data Base Management System (DBMS) to facilitate these tasks. It enables us
to store, retrieve and manipulate the database.

The Relational Model

The relational model is almost the exclusive choice for implementation and use on a
microcomputer. The model is simple, flexible and easy to use. We consider a database to be a
collection of tables. Each table is a two dimensional construct of rows and columns. The
mathematical concept comparable to the term table is relation--hence, the name relational
database.

Consider the sample table shown below - the first book has a call number 100, title Physics
Handbook and subject category Physics. The information for each book is stored in rows; rows are
sometimes referred to as records, and more formally in database terminology, rows are called
tuples. The columns define fields: Title field, Subject field, etc. A field is also called an attribute.
INTRODUCTION 4

CALLN TITLE SUBJECT


O

100 Physics Handbook Physics

200 Database Systems Computing

300 HTML Computing

400 XML Computing

500 Software Testing Computing

600 E-Commerce Business

In the relational model each row should have a unique identifier - a primary key, for example,
callno, is a unique identifier in the table above. When one gives a specific value for the
primary key, it identifies a single row. Generally speaking, one does not put information into a
database unless it is possible to recall it later. Each book, and each patron, of the library is
uniquely identified.

To define and access information in the database (to retrieve and update the data) we need a
language for expressing our requests. In this book we discuss the Structured Query
Language (SQL) for accessing data in a relational database.

Structured Query Language (SQL)

Every DBMS has a data sub-language embedded in its architecture. This data sub- language is used to
communicate with the database. On the basis of the types of operations performed, the data sub language can
be classified into a data definition language DDL), a data manipulation language (DML), and a data control
language (DCL).

The DDL commands facilitate the creation and description of a database, the DML commands deal with
manipulation of the data, and the DCL commands are used to specify security constraints.

SQL is based on a data language initially designed for the IBM prototype relational database system called
System R. SQL is often pronounced sequel. SQL has been standardized by ISO and will continue to be
enhanced; when appropriate we make references to the standard as ISO SQL. It has been adopted as an
industry standard for relational database systems and such systems appear on microcomputers,
minicomputers, and mainframes. In some cases the same product is available on all these types. The number
of SQL database systems is enormous and growing; we cannot list all of them here, but the list includes IBM’s
SQL/DS, and DB2, Relational Technologies’ INGRES, Oracle Corporation’s Oracle, Cincom’s SUPRA, Ashton-
Tate’s dBASE IV.

The SQL language is comprised of:


INTRODUCTION 5
1.DDL: data definition language. DDL is used to define the tables that make up the database.
Example: Create Table syntax.

2.DML: data manipulation language. DML is concerned with the retrieval and update of the
database. Example: Select * from syntax.

3.DCL: data control language. DCL is used to specify who can access data in the database
and what operations they can perform. Example: Grant Table syntax.

We have introduced the relational model. A relational database consists of tables; tables, of
course, are a very simple data structure which everyone is capable of understanding. Every
database system requires a language for accessing data; the language discussed in this text is
SQL. We discussed the term primary key. Every table should have a primary key if we are to
access the information in our database. A discussion of primary keys would lead to another
topic, structural integrity, which is part of the relational model. This is discussed in a later
chapter. Implementations of primary key integrity and referential integrity are beginning to
appear in commercial systems, but such aspects of the relational model are not currently part
of ISO SQL.
Chapter Two

Database Definition
The database consists of tables; each table is a two dimensional structure of rows and columns. A
row is sometimes referred to as a record and a column is sometimes referred to as a field (these
terms are carried over from data processing). The DDL is used to define (create) tables, views,
and indexes.

We create tables in accordance with our data management needs; as our needs grow it may be
necessary to add new fields to existing tables.

A view is a virtual table. It is a table defined in terms of other tables; it does not occupy any real
space in the database. The view mechanism is very powerful and useful, providing us a way to
simplify a user’s perception of the database or for providing some types of security.

Indexes provide the database system with efficient access paths to rows of tables. We create
indexes in order to improve the efficiency of our requests for data.

In this chapter we concentrate on the CREATE TABLE command for creating tables, and leave the
discussion of views and indexes to subsequent chapters.

The Library Database

To illustrate features of SQL we employ a sample database consisting of three tables used in a
library environment.

BOOK

CALLNO TITLE SUBJECT


100 Physics Handbook Physics
200 Database Systems Computing
300 HTML Computing
400 XML Computing
500 Software Testing Computing
600 E-Commerce Business

6
DATA DEFINITION 7

PATRON

USERID NAME AGE


10 Wong 22
15 Colin 31
20 King 21
25 Das 67
30 Niall 17
35 Smith 72
40 Jones 41

LOAN

LOANID CALLNO USERID DATEDUE DATERET FINE PAID


1 100 10 10-FEB-00 09-FEB-00 0
2 200 10 30-MAY-00 30-MAY-00 0
3 300 20 05-AUG-88 50 No
4 500 30 26-APR-90 50 Yes
5 600 40 22-DEC-00 10 No

Figure 2.1: Relational Tables for a Library

Book

Information on each book in the library is recorded in the BOOK table:

BOOK: callno title subject

For each book we have:


 a call number (callno) which uniquely identifies the book

 the title of the book

 the subject matter of the book

Loan

Each time a book is borrowed information is recorded in the LOAN table:


DATA DEFINITION 8
LOAN: loanid callno userid datedue dateret fine paid

For each loan of a book to a person we record:

the loan id (loanid) key that uniquely identifies the loan row.

the call number (callno) of the book borrowed

the identifier (userid) of the person borrowing the book

the date the book is due (datedue)

 the date the book was returned (dateret); no value is assigned until the book is
returned

 the fine is calculated for all books returned late, at the rate of 10 cents per day (no
value is assigned until the book is actually returned)

 the attribute paid; if it’s value is yes it indicates that the fine has been paid.

The LOAN table is illustrated in Figure 2.1. You will note three separate cases:

Example 1: A book has not been returned. Fine accrues and has not been paid.

LOANID CALLNO USERID DATEDUE DATERET FINE PAID


3 300 20 05-AUG-88 50 No

Example 2: Case where the book was returned before the due date. No Fines.

LOANID CALLNO USERID DATEDUE DATERET FINE PAID


1 100 10 10-FEB-00 09-FEB-00 0

Example 3: The book was returned late. A fine was assessed, and paid.

LOANID CALLNO USERID DATEDUE DATERET FINE PAID


4 500 30 26-APR-90 50 Yes

Patron

We refer to users of the library as patrons; for each patron of the library we record information
in the PATRON table:
DATA DEFINITION 9
PATRON: userid name age

For each patron we have:

an identification number which uniquely identifies them (userid)

their name

their age

This database represents a subset of the information requirements of any library. The contents of our library
database is illustrated below:

Creating Tables

To create a table one uses CREATE TABLE command. The general syntax of CREATE TABLE
is:

CREATE TABLE table-name


(column-specifications) ;

The table is named (table-name) and each column is defined (column-specification). A


column-specification is the definition of a column giving its data type and other properties. The
execution of this command causes the system to save the definition of the table; initially the
table is empty. We now illustrate the use of the command and then we discuss data types and
other options. To create the BOOK table we could use:

CREATE TABLE BOOK (


CALLNO NUMBER(3) NOT NULL,
TITLE VARCHAR2(20),
SUBJECT VARCHAR2(13),
CONSTRAINT BOOK_PRIMARY_KEY PRIMARY KEY CALLNO));

In the above we have defined three columns for the BOOK table as CALLNO, TITLE,
SUBJECT. Call numbers are defined as numbers, titles and subjects are defined as character
strings of lengths 20 and 13 respectively. The specification of data type is required and is
restrictive. Once the book table is defined as above we could not have a book with call
number QA76.9 since QA76.9 is not a number. For book titles and subjects we must refer to
specific values by enclosing them in quotes, such as ‘Introduction to Database Systems’.

Suitable definitions for the PATRON table and the LOAN tables would be:

CREATE TABLE PATRON (


USERID NUMBER(2) NOT NULL,
NAME VARCHAR2(14),
DATA DEFINITION 10

AGE NUMBER(2),
CONSTRAINT PATRON_PRIMARY_KEY PRIMARY KEY
(USERID));

CREATE TABLE LOAN (


LOANID NUMBER(3) NOT NULL,
CALLNO NUMBER(3) NOT NULL,
USERID NUMBER(2) NOT NULL,
DATEDUE DATE,
DATERET DATE,
FINE NUMBER(5,2),
PAID VARCHAR2(3),
CONSTRAINT LOANID_PRIMARY_KEY PRIMARY KEY
(LOANID));

Altering a Table

Most SQL systems include a command to allow one to include new columns in an existing table.
For instance, to add a new column for addresses to the PATRON table one would execute a
command such as:

ALTER TABLE tablename

ADD|DROP column datatype [NULL|NOTNULL];

Example:

ALTER TABLE patron


ADD address CHARACTER(30) ;

The ALTER TABLE command can typically be used for other purposes such as changing the
column length or adding a property such as NOT NULL (these may only be allowed in very
restrictive cases).

Dropping Tables

To remove or drop a table from the database one uses the DROP TABLE command. To remove
the BOOK table one executes a command such as:
DATA DEFINITION 11
DROP TABLE book ;
This command is very powerful; all rows in the table are deleted and the table object
disappears.

Copying Tables

Some SQL systems, such as Oracle, allow the user to create a new table from an old one using
CREATE TABLE with a subquery. Queries (SELECT statements) are discussed in the next
chapter. To create a new table SENIORS (age 65 or older) from the PATRON table we could
execute the command:

CREATE TABLE seniors

AS

SELECT * FROM patron

WHERE age >= 65;

This results in creating a table with the following characteristics:

Name Null? Type

USERID NOT NULL NUMBER(2)

NAME VARCHAR2(14)

AGE NUMBER(2)

The following rows are automatically inserted into the Seniors table:

USERID NAME AGE

25 Das 67

35 Smith 72

The table inherited the columns of the PATRON table since we did not name any. In some
cases we would need to name the columns. For instance, from the LOAN table we could
derive a table giving the call number, user id, and the number of days the book was kept:

CREATE TABLE dayskept (callno, userid, kept) AS


DATA DEFINITION 12
SELECT (callno, userid, dateret - datedue)

FROM loan;

Data Types

Every data value belongs to some data type. ISO SQL has several data types: character,
numeric, decimal, integer, small integer, float, real, and double precision. A SQL system may
not provide all these types, and may include others. Most systems provide additional types
that really are necessary (SQL might be considered deficient on this) such as MONEY, DATE,
and TIME.

Some SQL systems simplify the choices; Oracle, for instance, places all the numeric types
-numeric, integer, small integer, float, real, double precision - into one category: NUMBER. It
is best, when using a SQL system to refer to its documentation to determine the data types
available.

Every data type is either:

 Predefined

 Constructed

 User-defined.

A user-defined data type is a schema object. A predefined data type is a data type provided by
the SQL-implementation. A data type is predefined even though the user is required (or
allowed) to provide certain parameters when specifying it (for example the precision of a
number).

A predefined data type is atomic. An atomic type is a data type whose values are not
composed of values of other data types. The existence of an operation (SUBSTRING,
EXTRACT) that is capable of selecting part of a string or datetime value does not imply that a
string or datetime is not atomic.

A constructed type is either atomic or composite. A composite type is a data type each of
whose values is composed of zero or more values, each of a declared data type.

Predefined Types

Numeric types

There are two classes of numeric type: exact numeric, which includes integer types and types with
specified precision and scale; and approximate numeric, which is essentially floating point, and for
DATA DEFINITION 13
which a precision may optionally be specified. Every number has a precision (number of digits),
and exact numeric types also have a scale (digits after the radix point). Arithmetic operations may
be performed on operands of different or the same numeric type, and the result is of a numeric
type that depends only on the numeric type of the operands. If the result cannot be represented
exactly in the result type, then whether it is rounded or truncated is implementation-defined. An
exception condition is raised if the result is outside the range of numeric values of the result type, or
if the arithmetic operation is not defined for the operands.

String types

A value of character type is a string (sequence) of characters drawn from some character
repertoire. The characters in a character string S are all drawn from the same character set CS. If
S is the value of some expression E, then CS is the character set specified for the declared type of
E.A character string type is either of fixed length, or of variable length up to some implementation-
defined maximum. A value of character large object (CLOB) type is a string of characters from
some character repertoire and is always associated with exactly one character set. A character
large object is of variable length, up to some implementation-defined maximum that is probably
greater than that of other character strings.

Either a character string or character large object may be specified as being based on a specific
character set by specifying CHARACTER SET in the data type; a particular character set chosen
by the implementation to be the national character set may be specified by specifying NATIONAL
CHARACTER, NATIONAL CHARACTER VARYING, or NATIONAL CHARACTER LARGE
OBJECT (or one of several syntactic equivalents) as the data type.

A value of bit string type is a string of bits (binary digits). A bit string type is either of fixed length, or
of variable length up to some implementation-defined maximum.

SQL Data Types

A value of binary string type (known as a binary large object, or BLOB) is a variable length
sequence of octets, up to an implementation-defined maximum.

Boolean type

A value of the Boolean data type is either true or false. The truth value of unknown is sometimes
represented by the null value.

Datetime types

There are three datetime types, each of which specifies values comprising datetime fields. A value
of data type TIMESTAMP comprises values of the datetime fields YEAR (between 0001 and
9999), MONTH, DAY, HOUR, MINUTE and SECOND. A value of data type TIME comprises
values of the datetime fields HOUR, MINUTE and SECOND. A value of data type DATE
comprises values of the datetime fields YEAR (between 0001 and 9999), MONTH and DAY. A
value of DATE is a valid Gregorian date. A value of TIME is a valid time of day.

TIMESTAMP and TIME may be specified with a number of (decimal) digits of fractional seconds
precision.
DATA DEFINITION 14
TIMESTAMP and TIME may also be specified as being WITH TIME ZONE, in which case every
value has associated with it a time zone displacement. In comparing values of a data type WITH

TIME ZONE, the value of the time zone displacement is disregarded.

Interval types

A value of an interval type represents the duration of a period of time. There are two classes of
intervals. One class, called year-month intervals, has a datetime precision that includes a YEAR
field or a MONTH field, or both. The other class, called day-time intervals, has an express or
implied interval precision that can include any set of contiguous fields other than YEAR or MONTH.

Constructed atomic types

Reference types

A reference type is a predefined data type, a value of which references (or points to) some site

holding a value of the referenced type. The only sites that may be so referenced are the rows of

typed tables. It follows that every referenced type is a structured type.

The null value

Every data type includes a special value, called the null value, sometimes denoted by the
keyword NULL. This value differs from other values in the following respects:

— Since the null value is in every data type, the data type of the null value implied by the
keyword NULL cannot be inferred; hence NULL can be used to denote the null value only in
certain contexts, rather than everywhere that a literal is permitted.

Other attributes for fields

ISO SQL specifies two additional properties that can be specified for a column: NOT NULL,
and UNIQUE. If we specify NOT NULL for a column, then whenever a row is inserted or
updated that column must have a value assigned to it. Null is a special term which applies
when a field has never been assigned a value; its value is unknown or missing; null is not the
same as blank characters or zero value.

If we specify UNIQUE for a column then every row in the table must have a unique value in
that column. DB2 does not allow the UNIQUE property to be specified in the table definition;
but one can achieve this by creating a unique index (see the chapter on indexing). We shall
not use UNIQUE in our examples.
Chapter Three

Data Manipulation
The SELECT command is primarily used to retrieve data from the database. It is also used
when creating a copy of a table, creating views, and can be used to specify rows for updating.
In this chapter we concentrate on its use for retrieving data.

The basic form of the SELECT is


SELECT field-list

FROM table-list

WHERE field-expression

GROUP BY group-fields

HAVING group-expression

ORDER BY field-list;

The result of the SELECT is a listing of data derived from some set of tables in the database.
The “field-list” specifies the fields to be listed, such as USERID, NAME, AGE. The data listed
is obtained from the set of tables (table-list) specified in the FROM clause. The “field-
expression” in the WHERE clause specifies a Boolean expression that rows in the “table-list”
must satisfy to be included in the listing. The GROUP BY clause is used when we wish to
summarize information in the underlying tables. For example, GROUP BY subject causes
the rows to be organized into groups, one group for each unique value of the subject field. The
HAVING clause is used to specify which groups are to be included. ORDER BY is used to
sequence the rows of the listing. The WHERE, GROUP BY, HAVING, and ORDER BY
clauses are optional.

Selecting Columns

We illustrate queries that retrieve all rows of a table.

Example 3.1 List the titles of books in the database.

SQL> SELECT title


2 FROM book;
DATA MANIPULATION 16

TITLE
Physics Handbook
Database Systems
HTML
XML
Software Testing
E-Commerce

6 rows selected.

When an SQL system executes the above example it accesses the BOOK table.
Since the SELECT does not involve a WHERE clause all rows of BOOK are
accessed and from each row the system extracts and displays the title. There
are as many rows in the result as there are rows in BOOK; the result are
shown above.

Example 3.2 List the title and subject for each book.

SQL> SELECT title, subject


2 FROM book;

TITLE SUBJECT
Physics Handbook Physics
Database Systems Computing
HTML Computing
XML Computing
Software Testing Computing
E-Commerce Business

6 rows selected.

Example 3.3 List all fields for each book.

a) SELECT callno, title, subject

FROM book;

b) SELECT *

FROM book;

The * above is an abbreviation for “all fields”.


DATA MANIPULATION 17

CALLNO TITLE SUBJECT


100 Physics Handbook Physics
200 Database Systems Computing
300 HTML Computing
400 XML Computing
500 Software Testing Computing
600 E-Commerce Business

6 rows selected.

Example 3.4 What are the subject areas of the library?

SQL> SELECT subject


2 FROM book ;

SUBJECT
Physics
Computing
Computing
Computing
Computing
Business

6 rows selected.

The answer to this query has as many lines as there are books in the library. Since there are
very few subject areas the result appears awkward. To remove the redundancies from the
display we can use the DISTINCT key word; it causes redundant rows to be eliminated:

SQL> SELECT DISTINCT subject


2 FROM book ;

SUBJECT
Business
Computing
Physics

Selecting Rows

In the previous section we were concerned with listing one or more fields from a table; every row of the table
corresponded to a line in the listing. Now we consider retrieval of a subset of the rows of a table. To limit the
retrieval to specific rows of a table we include the WHERE clause in our commands. The WHERE clause gives
DATA MANIPULATION 18
the condition that a row must satisfy to be included; any row of the table not satisfying the condition is not
considered. In the following we discuss operators that may appear in conditions:

=, <>, >, >=, <, <=


IN, BETWEEN, AND, LIKE, IS NULL, AND, OR, NOT

Example 3.5 List the titles of Computing books.

SQL> SELECT title


2 FROM book
3 WHERE subject = 'Computing';

TITLE
Database Systems
HTML
XML
Software Testing

The condition subject = ‘Computing’ is evaluated for each row of BOOK. If the
expression evaluates to true then the row contributes to the result.

Example 3.6 List the book with call number 200.

SQL> SELECT title


2 FROM book
3 WHERE callno = 200 ;

TITLE
Database Systems

Arithmetic Operations

The standard operators +, -, *, / are available for addition, subtraction, multiplication and division respectively.
These can be applied to the fields being retrieved or to fields within the WHERE clause. The use of
parentheses is encouraged to force, or clarify, the order of computation. Expressions in parentheses are
evaluated first. Unless parentheses or priorities dictate otherwise, an expression is evaluated from left to right. *
and / have equal priority; likewise + and - have equal priority. However, * and / have higher priority (hence
evaluated first) than + and -.
DATA MANIPULATION 19

Example 3.7 List patrons fines in Canadian Dollars (assuming one US


dollar is equivalent to one and half Canadian dollars).

SQL> SELECT callno, userid, fine*1.5


2 FROM loan;

CALLNO USERID FINE*1.5


100 10 0
200 10 0
300 20 75
500 30 75
600 40 15

Example 3.8 List loans where the fine is over 50 Canadian dollars.

SQL> SELECT *
2 FROM loan
3 WHERE (fine*1.5) > 50.00;

LOANID CALLNO USERID DATEDUE DATERET FINE


PAID
3 300 20 05-AUG-88 50 No
4 500 30 26-APR-90 50 Yes

Boolean Operators

The condition specified in the WHERE clause can be a Boolean expression involving AND, OR,
and NOT. The priority of the Boolean operators from highest to lowest is NOT, AND, OR. Of
course parentheses can be used to clarify or force evaluation to be performed in a certain order.

Example 3.9 List the call numbers of books borrowed by patron 30 or


patron 40 and where the fine paid is greater than $2.00.

SQL> SELECT DISTINCT callno


2 FROM loan
3 WHERE fine > 2.00
4 AND (userid=30 OR userid=40) ;

CALLNO
500
600
DATA MANIPULATION 20

The results could be different if parenthesis is not used. (This is not evident in
our data set.)

Special Operators for the WHERE clause

Four operators are available to handle special cases:

LIKE is used when searching for the appearance of a particular character string.

BETWEEN is used when searching for a value within some range.

IS NULL is used to test for a field not having been assigned any value.

IN is used to test for a field having a value contained in some set of values.

LIKE

LIKE is used with character data to determine the presence of a substring. Special notations
are available to specify unknown or irrelevant characters in the field being tested:

a single unknown character: _ (underscore)

any number of unknown characters: %

Example 3.10 List books with Database in the title.

SQL> SELECT *
2 FROM book
3 WHERE title LIKE '%Database%';

CALLNO TITLE SUBJECT


200 Database Systems Computing

In example 3.10 each BOOK title is examined to determine if it contains the character string
Database.

Example 3.11 List books with titles having an o as the second character.

SQL> SELECT *
2 FROM book
3 WHERE title LIKE '_o%';

CALLNO TITLE SUBJECT


DATA MANIPULATION 21

500 Software Testing Computing


In this example the title field of each row in BOOK is examined to determine if it has an o in the
second character position.

BETWEEN

The between operator is used with numeric data to determine if some field lies in a certain
range.

Example 3.12 List books with call numbers between 200 and 400.
SQL> SELECT *

2 FROM book

3 WHERE callno BETWEEN 200 AND 400 ;

CALLNO TITLE SUBJECT

--------- -------------------- -------------

200 Database Systems Computing

300 HTML Computing

400 XML Computing

Note that this is the same as the command:

SQL> SELECT *

2 FROM book

3 WHERE (callno>=200) AND (callno<=400);

IS NULL

NULL is a keyword that must be used to determine whether or not a field has been assigned a
value. Note that in our database the date returned field is not assigned any value until a book
is returned. If a book is returned on time then no value is assigned to the fine or paid fields.

Example 3.13 List the books currently out on loan.

SQL> SELECT callno


2 FROM loan
3 WHERE dateret IS NULL;
DATA MANIPULATION 22

CALLNO
300
500
600

Example 3.14 List the books that have been returned by patron 100.

SQL> SELECT callno


2 FROM loan
3 WHERE (userid=10) AND (dateret IS NOT NULL);

CALLNO
100
200

IS NOT NULL is the expression used to determine if a field of a row has been assigned a value. It
is not permitted to use the expressions DATERET = NULL, or DATERET <> NULL.

IN

The set of values used for comparison can be either explicitly specified or given as a subquery.
We illustrate the first case here and leave subqueries for a later section.

Example 3.15 List the names of patrons whose user id is 100, 200, 300, or
350.

SQL> SELECT name


2 FROM patron
3 WHERE userid IN (10,20,30);

NAME
Niall
King
Wong

The list of values used for comparison is enclosed in parentheses; the name of a patron is
listed if the userid is in the list.

Example 3.16 List all Computing and Business titles.

SQL> SELECT title


2 FROM book
3 WHERE subject IN ('Computing','Business');
DATA MANIPULATION 23

TITLE
Database Systems
HTML
XML
Software Testing
E-Commerce

System Variables

SQL systems have system variables used to hold values of general use or interest. All systems
have the variable USER which holds the user id of the current logged on user. Special uses of
USER can be made if it also corresponds to a column of one or more tables in the database.
Certainly the system does make use of it to determine if the current user has the access privileges
pertinent to a request (see the section on DCL).

Systems like Oracle and DB2 have many other variables such as: CURRENT DATE, CURRENT TIME,
CURRENT TIMESTAMP, CURRENT TIMEZONE. Oracle has a variable SYSDATE which can be used to
determine the current time and/or date.

Example 3.17 List patrons who have outstanding books. We shall use
Oracle’s SYSDATE in this example. Any system that includes date as a data
type would likely permit arithmetic operations and comparisons.

SQL> SELECT userid


2 FROM loan
3 WHERE SYSDATE > datedue;

USERID
10
10

Column Functions

SQL offers special functions, also called aggregate functions, to determine maximums, minimums,
averages, totals and counts for entire columns. These are MAX, SUM, AVG, and COUNT
respectively. Note that AVG and SUM are defined for numeric values only, and that their
computations exclude NULL values.

Example 3.18 What is the largest fine paid for an overdue book?

SQL> SELECT MAX(fine)


2 FROM loan;

MAX(FINE)
DATA MANIPULATION 24

50

Example 3.19 How much has the library assessed in fines?

SQL> SELECT SUM(fine)


2 FROM loan;

SUM(FINE)
110

Use of COUNT

ISO SQL specifies that COUNT can be used in just two ways. First, we can use COUNT(*) to
count the number of rows which satisfy a query. Second, we can use COUNT with DISTINCT to
count the number of unique values in a column. We illustrate this use of COUNT in the next
section.

Example 3.20 How many books are in the libraries?

SQL> SELECT COUNT(*)


2 FROM book ;

COUNT(*)
6

Example 3.21 How many times has a fine been assessed?

SQL> SELECT COUNT(fine)


2 FROM loan
3 WHERE fine > 0;

COUNT(FINE)
3

Example 3.22 How many computing books are there?

SQL> SELECT COUNT(*)


2 FROM book
3 WHERE subject='Computing';

COUNT(*)
4
DATA MANIPULATION 25
This use of COUNT, counting rows, does not involve any elimination of null values. That is, if a row
has all null values it will be counted.

Use of Distinct

ISO SQL permits the specification of DISTINCT with an aggregate function; the effect is to remove
duplicate values prior to the function being applied.

Example 3.23 How many subject areas are there?

SQL> SELECT COUNT(DISTINCT subject)


2 FROM book;

COUNT(DISTINCTSUBJECT)
----------------------
3

Example 3.24 How many patrons have borrowed books?

SQL> SELECT COUNT(DISTINCT userid)


2 FROM loan;

COUNT(DISTINCTUSERID)
---------------------
4

Whenever DISTINCT is used, the argument must be a simple field reference as shown above.

Ordering the Result

The ORDER BY clause is used to force the result to be ordered by one or more column values in
either ascending or descending order.

Example 3.25 List books in alphabetical order by title.


SQL> SELECT callno, title, subject
2 FROM book
3 ORDER BY title;

CALLNO TITLE SUBJECT


200 Database Systems Computing
600 E-Commerce Business
300 HTML Computing
DATA MANIPULATION 26

100 Physics Handbook Physics


500 Software Testing Computing
400 XML Computing

6 rows selected.

The default is ascending order; this could be specified explicitly by coding ORDER BY title ASC.
Instead of specifying a field name we can refer to it indirectly by its relative position in the field-list.
For the above we could have used: ORDER BY 2.

Example 3.26 List books in subject order and within subject order them by
call number.

SQL> SELECT *
2 FROM book
3 ORDER BY subject ASC, callno DESC;

CALLNO TITLE SUBJECT


600 E-Commerce Business
500 Software Testing Computing
400 XML Computing
300 HTML Computing
200 Database Systems Computing
100 Physics Handbook Physics

6 rows selected.

Note that the result is ordered first by subject; biology is first, followed by business, then
computing and finally mathematics. Within each group the ordering is by call number.

Grouping Data

The GROUP BY clause is used to specify one or more fields that are to be used for organizing
rows into groups. Rows are grouped together that have the same value(s) for the specified field(s).
The only simple fields that can be displayed are the ones used for grouping; any result from other
fields must be specified using a column function. The column function will be applied to a group of
rows instead of to the entire table.

Example 3.27 For each patron list the number of books he has borrowed.

SQL> SELECT userid, COUNT(*)


2 FROM loan
3 GROUP BY userid;
DATA MANIPULATION 27

USERID COUNT(*)
10 2
20 1
30 1
40 1

The effect of this SELECT is to

1.cause the SQL system to group the rows of loan by user id

2.display the user id for each group and a count of the number of rows in the group

A common error is to include a field in the listing that is not unique for the group. For example,

SQL> SELECT userid, callno, COUNT(*)

2 FROM loan

3 GROUP BY userid;

SELECT userid, callno, COUNT(*)

ERROR at line 1:

ORA-00979: not a GROUP BY expression

The above would be incorrect because callno is not single valued for a group. When GROUP
BY is used each element of the select list must be single valued; each element must either be
specified in the GROUP BY or be the result of a column function.

SQL> SELECT userid, callno, COUNT(*)


2 FROM loan
3 GROUP BY userid, callno;

USERID CALLNO COUNT(*)


10 100 1
10 200 1
20 300 1
30 500 1
40 600 1

Example 3.28 For each patron list his total fines paid.

SQL> SELECT userid, SUM(fine)


DATA MANIPULATION 28

2 FROM loan
3 GROUP BY userid;

USERID SUM(FINE)
10 0
20 50
30 50
40 10

In the above we see some patrons have not paid a fine but are still listed. To exclude rows from
the grouping process (and from the SELECT) we specify the appropriate WHERE clause.

Example 3.29 For each patron who has paid a fine, list his total.

SQL> SELECT userid, SUM(fine)


2 FROM loan
3 WHERE fine <>0
4 GROUP BY userid;

USERID SUM(FINE)
20 50
30 50
40 10

The only difference between the two preceding examples is that the latter does not report anything
for patrons who have not paid any fines.

Restricting groups using HAVING

In the foregoing examples all groups have been reported. To eliminate groups from the result we
use the HAVING clause specifying an appropriate group oriented boolean expression.

Example 3.30 List the patron id’s for those who have paid more than $30 in
fines on books with call numbers greater than 400.

SQL> SELECT userid


2 FROM loan
3 WHERE callno > 400
4 GROUP BY userid
5 HAVING SUM(fine) > 30;

USERID
30
DATA MANIPULATION 29

Interaction of clauses in the SELECT

The previous example has used all components of the SELECT command. It is important to
understand the order in which the above SELECT is processed.

 The WHERE is done first

 GROUP BY second, and

 HAVING third.

The rows of LOAN are accessed; each row of LOAN which satisfies the WHERE clause is retained
for further processing (the grouping). So only those rows of LOAN which correspond to books with
call numbers larger than 400 are retained. These rows are organized into groups according to the
specifications in the GROUP BY clause. In our case the rows are grouped by userid. When a
HAVING clause is present each group must satisfy the condition in order to be displayed in the
result; in our case each group must have a total for fine that is greater than $30.

Joins

A SELECT which performs a join is one which specifies more than one table in the FROM clause.
That is, it corresponds to a situation where it is necessary to retrieve information from more than
one table.

Example 3.31 Suppose we wish to list the names of patrons, their id’s, and the call numbers
of the books they have borrowed. There are two tables we use to get the information: PATRON
and LOAN. What we need to do is specify that a row of PATRON should be matched with a row of
LOAN whenever they have the same value for the user id field. Note that one row of PATRON can
be matched with many rows of LOAN since a patron is expected to borrow many books. With SQL
it is necessary for us to specify this join condition explicitly:

SQL> SELECT patron.name, patron.userid, loan.callno


2 FROM patron, loan
3 WHERE patron.userid = loan.userid;

NAME USERID CALLNO


Wong 10 100
Wong 10 200
King 20 300
Niall 30 500
Jones 40 600

Note that we have used prefixes for our field specifications. This is only necessary whenever
there would be ambiguity regarding the table where the field appears. In this case prefixes are
only necessary for the user id field.
DATA MANIPULATION 30
If we forgot to include the clause

WHERE patron.userid = loan.userid

and had just typed

SELECT patron.name, patron.userid, loan.callno

FROM patron, loan

our SELECT would produce a result referred to as a cartesian product. For our sample
database we would have 7*5=35 rows in the result, since PATRON has 7 rows and LOAN has
5 rows. Without the WHERE clause each row of PATRON would be joined with every row of
LOAN.

NAME USERID CALLNO


Wong 10 100
Colin 15 100
King 20 100
Das 25 100
Niall 30 100
Smith 35 100
Jones 40 100
Wong 10 200
Colin 15 200
King 20 200
Das 25 200
Niall 30 200
Smith 35 200
Jones 40 200
(repeats for callno 300, 400, 500, and 600; not shown above.)

Example 3.32 Suppose we wish to list the names of patrons who have
books out on loan. The information needed to do this is found in two tables:
LOAN contains the record of books loaned out (specifically we are interested in
those rows where the DATERET column has no value) and PATRON contains
the name for each patron. We need to match up a selected row of LOAN with
the pertinent row of PATRON; this is done by requiring them to have the same
value for the USERID field.

SQL> SELECT name


2 FROM loan, patron
3 WHERE loan.dateret IS NULL
4 AND loan.userid = patron.userid ;

NAME
DATA MANIPULATION 31
King
Niall
Jones

Aliases

Sometimes it is necessary and often convenient to use an alias (an alternate name) for a table.
An alias is specified in the FROM clause immediately following the actual table name. In the
next example we need the alias - a table is joined with itself - a self-join. It may be useful to
think of the join being performed between two copies of the same table.

Example 3.33 List each patrons name and the number of other patrons
older than him/her. In the following, one “copy” of the PATRON table is
referred to as A; the other as B.
SQL> SELECT a.name, COUNT(*)
2 FROM patron a, patron b
3 WHERE a.age < b.age
4 GROUP BY a.userid, a.name;

NAME COUNT(*)
Wong 4
Colin 3
King 5
Das 1
Niall 6
Jones 2

Note the GROUP BY clause in the example; it illustrates an important point regarding the
elements listed by the query. We have grouped the response using two fields: USERID and
NAME. In the SELECT clause we have included NAME and COUNT(*). If we grouped only
by USERID, then NAME would not be considered by SQL to be single-valued for each group.
As far as SQL is concerned, NAME could have a different value in each row of a group.
Everything included in the SELECT must be single-valued for a group when GROUP BY is
used.

Outer Join

ISO SQL does not provide for the outer join operation, but it is a useful form of join for many
situations. Some systems (such as Oracle) provide the outer join and other systems (such as
DB2) do not. Let us examine the very simple database request

For each book in the library list its title and the number of times it has been loaned out.

One approach that comes quickly to mind is


SQL> SELECT title, COUNT(*)
2 FROM book b, loan l
3 WHERE b.callno=l.callno
DATA MANIPULATION 32

4 GROUP BY b.callno, title;

TITLE COUNT(*)
Physics Handbook 1
Database Systems 1
HTML 1
Software Testing 1
E-Commerce 1

This query does give us correct counts for those books that have been loaned out, but nothing is
listed for any book that has never been loaned out at all. Such books will not be matched to any
row of LOAN and therefore do not contribute to the result.

Let us consider now how Oracle and the outer join can be used to formulate a simple and correct
query. In the WHERE clause where we specify the join condition (“b.callno=l.callno”) Oracle allows
us to place a “(+)” to designate a column which is to be matched with an imaginary row of null’s if
there are no matches otherwise. Instead of specifying COUNT(*) we specify COUNT(l.userid):

SQL> SELECT title, COUNT(l.userid)


2 FROM book b, loan l
3 WHERE b.callno=l.callno(+)
4 GROUP BY b.callno, title;

TITLE COUNT(L.USERID)
Physics Handbook 1
Database Systems 1
HTML 1
XML 0
Software Testing 1
E-Commerce 1

6 rows selected.

For the above, if a row of BOOK is not matched on call number to a row of LOAN then it will be
matched to an imaginary row where all column values are null. Hence every row of BOOK is
matched to something and the title and a count will be produced. Note that Oracle permits
COUNT(l.userid) to be specified without DISTINCT. Oracle’s COUNT(...) ignores null’s and
produces a value of zero when a book is matched to a row of NULL’s.

Nested Queries

SQL allows us to nest one query inside another, but only in the WHERE clause and the HAVING
clause. ISO SQL permits a subquery only on the right hand side of an operator. The operators
available are IN, EXISTS, and the relational operators =, <>, >, >=, <, <=. When a subquery will
return just a single row (normally a single field value) SQL allows the use of a relational operator.
DATA MANIPULATION 33
When a subquery returns more than one row (a set of rows) a relational operator must be used
with ALL or ANY.

Use of IN

The IN operator is used to determine if some value is present in a list of values. That list can
be explicitly specified (covered previously) or generated by a subquery.

Example 3.34 List names and ages of patrons who have books out on loan.
We shall proceed by giving two separate queries to accomplish this and then
we specify it as one nested query. Consider the query

SQL> SELECT userid


2 FROM loan
3 WHERE dateret IS NULL;

USERID
20
30
40

We can now specify and execute the query:

SQL> SELECT name, age


2 FROM patron
3* WHERE userid IN (20,30,40);

NAME AGE
Jones 41
Niall 17
King 21

These two queries can easily be combined; we replace the list (100,250,400) with the first
subquery:

SQL> SELECT name, age


2 FROM patron
3 WHERE userid IN
4 (SELECT userid FROM loan
5 WHERE dateret IS NULL) ;

Example 3.35 List names of patrons who have borrowed a computing or


history book.

SQL> SELECT name


2 FROM patron
DATA MANIPULATION 34

3 WHERE userid IN
4 (SELECT userid FROM loan
5 WHERE callno IN
6 (SELECT callno FROM book
7 WHERE subject IN
8 ('Business','Computing')));

NAME
Wong
King
Niall
Jones

Let’s analyze the above SELECT; it is composed of three queries. The innermost query is a
simple one and we can consider it separately. That is, we can consider that the database system
replaces it by a list of call numbers that have been retrieved from the BOOK table. The “middle”
query retrieves user id’s where the person has borrowed a book in the computing or history
categories. The outer query lists the names of these people.

Simple Comparison Operators: =, <, ...

We use two examples to illustrate the use of the comparison operators with subqueries. To the
beginning SQL user, the need for subqueries in these cases is not obvious.

Example 3.36 Who has paid the largest fine? This is such a simple
question, but unfortunately it is not trivial to formulate the correct SQL query.
One may be first tempted to express the query as

SELECT userid, MAX(fine)


FROM loan;

However, this is not correct since MAX is a function which operates on the
whole table and hence is single-valued and userid is multi-valued - there are
multiple values, one for each row. We cannot mix these two types of
expressions.

To explain the correct formulation of the query, we begin by constructing the


query in two parts. First (and this corresponds to the innermost query we shall
use) we find the largest fine:

SQL> SELECT MAX(fine)


2 FROM loan ;

MAX(FINE)
50
DATA MANIPULATION 35

This query becomes our nested query and all that we need to do is compare each fine to this
maximum value:

SQL> SELECT userid


2 FROM loan
3 WHERE fine = (SELECT MAX(fine)
4 FROM loan) ;

USERID
20
30

The ‘equals’ is permitted for testing with respect to the subquery since it is known that the
subquery returns just one value.

Example 3.37 What is the name of the oldest patron?

SQL> SELECT name


2 FROM patron
3 WHERE age =
4 (SELECT MAX(age)
5 FROM patron);

NAME
Smith

Correlated Subqueries

The previous subqueries have been simple in that there was no “interaction” between rows of the
outer table and rows of the inner table.

Example 3.38 Suppose we wish to list the names of patrons who have
borrowed more than one book. The PATRON table has the names of patrons
and we need to access the relevant rows of LOAN for each patron to determine
how many books have been borrowed.

1 SELECT name
2 FROM patron
3 WHERE 2 <= (SELECT COUNT(*)
4 FROM loan
5 WHERE loan.userid = patron.userid)

NAME
DATA MANIPULATION 36

--------------
Wong

We can think of the subquery being executed once for each row of PATRON. For each row of
PATRON the loan table is accessed and a count made of the number of rows retrieved. If
that count is greater than or equal to 2 then the patrons name is listed.

Exists

Sometimes one is only concerned with whether or not a subquery retrieves any rows. The
EXISTS operator evaluates to true or false according to whether a subquery retrieves at least
one or no rows respectively.

Example 3.39 Which titles have been borrowed by patrons?


SQL> SELECT title
2 FROM book
3 WHERE EXISTS
4 (SELECT *
5 FROM loan
6 WHERE loan.callno = book.callno);

TITLE
Physics Handbook
Database Systems
HTML
Software Testing
E-Commerce

In this example a book title is listed only when the subquery retrieves one or more rows. Note that
Database Systems is not listed. To list the books which have not been borrowed at all one just
replaces the EXISTS above by NOT EXISTS.

Use of ALL, ANY

These two qualifications can easily lead to difficulties and so we have left them
till last. Any of the comparison operators: =, <, >, <>, <=, >=, can be used
with ANY or ALL to test a specific value against a list of values. The following
outlines alternative approaches/operators.

original alternative
= ANY IN
<> ANY NOT IN
>= ALL = ... MAX(..) ...
<= ALL = ... MIN(..) ...
DATA MANIPULATION 37

Example 3.40 Determine the name of the oldest library patron. This was
done previously in example 37. Using >=ALL we can reformulate the query as

SQL> SELECT name


2 FROM patron
3 WHERE age >=ALL
4 (SELECT age FROM patron);

NAME
Smith

We consider example 37 to be a clearer statement and in general we suggest


the use of alternatives to the use of ALL and ANY.

Set Operations

ISO SQL provides the UNION operator to combine the results of two SELECTs. Some SQL
implementations include DIFFERENCE or MINUS to subtract one result from another, and
INTERSECTION to determine the rows in common for two results

Union

These operators require their operands to be union compatible. This means that the columns
of one select must agree in number and type with the other select. The results of two
SELECTs can be combined using UNION.

Example 3.41 List the call numbers of books borrowed by user 200, and
users with id’s larger than 300.

SQL> SELECT callno


2 FROM loan
3 WHERE userid = 20
4 UNION
5 SELECT callno
6 FROM loan
7 WHERE userid > 30 ;

CALLNO
300
600

The result of the above is the union of two sets - the union of the result of the two SELECTs.
SQL permits any number of SELECTs to be combined with UNION. Note that redundant rows
in the result were eliminated; if this is not desired then one specifies UNION ALL.
DATA MANIPULATION 38

Difference

Example 3.42 Suppose we wish to determine which computing books have


not been borrowed from the library. One way to approach this is to use two
SELECTs. The first of these would retrieve the call numbers of computing
books from the BOOK table. The second query would retrieve the call numbers
of books in rows of the LOAN table. If we ‘subtract’ the second from the first,
we will be left with those of the first query which were not present in the
second query.

SQL> SELECT callno


2 FROM book
3 WHERE subject='Computing'
4 MINUS
5 SELECT callno FROM loan;

CALLNO
400

Intersection
Example 3.43 Suppose we wish to determine which computing books have
been borrowed. We can proceed as above using two SELECTs, but now we
want to determine what rows (call numbers) the two queries have in common:

SQL> SELECT callno FROM book WHERE subject='Computing'


2 INTERSECT
3 SELECT callno FROM loan;
CALLNO
200
300
500
Chapter Four

Updating The Database


In this Chapter we introduce Data Manipulation Language statements responsible for updating
the database.

There are three commands for updating:

1. UPDATE modify rows of tables

2. DELETE remove rows from tables

3. INSERT add new rows to tables

Modifying Rows

The general form of the UPDATE command is


UPDATE table

SET field-assignments

WHERE condition
The field assignments are of the form field = expression and are used to assign specific
values to the fields of a row. The expressions must be of the appropriate type for the data type
of the corresponding column. These expressions cannot be subqueries or involve the
aggregate operators (AVG, COUNT, ...). The WHERE clause is optional; if absent then the
UPDATE applies to all rows.
DATA UPDATING 40

Example 4.1. Increase every patron’s age by 10.

SQL> UPDATE patron


2 SET age=age+10;

7 rows updated.

SQL> select * from patron;

userid name age


100 Wong 32
150 Colin 41
200 King 31
250 Das 77
300 Niall 27
350 Smith 82
400 Jones 51

Note that every row of PATRON is modified since there is no WHERE clause. Each row of
PATRON is modified according to the field assignments. There is only one field assignment:
AGE=AGE+10. The effect of this is to cause the current value of age in a row to be
incremented by 10; the value of this expression becomes the new value of the AGE field for
the row.

Example 4.2. Determine fines for books at the rate of one cent a day. In this example we
want to modify only the Loan row for userid 30.

aSQL> edit
Wrote file afiedt.buf

1 UPDATE loan
2 SET fine = (datedue - SYSDATE) * 0.01
3 WHERE dateret is NULL
4* AND userid = 30
SQL> /

1 row updated.

SQL> SELECT *
2 FROM loan
3 WHERE userid = 30;

LOANID CALLNO USERID DATEDUE DATERET FINE PAID


4 500 30 26-APR-90 327.62 Yes
DATA UPDATING 41

Deleting Rows

The general form of the DELETE command is

DELETE
FROM table

WHERE condition;

The effect of DELETE is to remove rows from a table; the rows deleted are those that satisfy
the condition specified in the WHERE clause. The WHERE clause is optional; if absent then
all rows are deleted.

Example 4.3. Remove Computing books from the database.

DELETE FROM book


WHERE subject=’Computing’;
4 rows deleted.

SQL> select * from book;

CALLNO TITLE SUBJECT


100 Physics Handbook Physics
600 E-Commerce Business

2 rows selected.

Note: If referential integrity is implemented (as illustrated in the next chapter)


we will get the following error message.

SQL> DELETE FROM book


2 WHERE subject='Computing';
DELETE FROM book
*
ERROR at line 1:
ORA-02292: integrity constraint (SCOTT.BOOK_FOREIGN_KEY) violated - child
record found.

Example 4.4. Remove all loan records for patron “King”.

DELETE FROM loan


DATA UPDATING 42

WHERE userid = (SELECT userid


FROM patron
WHERE name=’King’);

Inserting New Rows

In this section we illustrate how new rows are appended to a table. There are two forms of the INSERT command
a) INSERT

INTO table (field-list)

VALUES (constant, constant, ... );

b) INSERT

INTO table (field-list)

subquery;

The first form is used to insert a single row in a table; the second form is used to insert multiple rows that come
from one or more existing tables. If all fields are included and the values given in order, then the field list can be
omitted. Example 4.5 illustrates how one adds one row to a database.

Example 4.5. Add a new patron to the database.

INSERT INTO patron (userid, name, age)


VALUES (90,'Dattani',20);

SQL> SELECT * FROM PATRON;

USERID NAME AGE


10 Wong 22
15 Colin 31
20 King 21
25 Das 67
30 Niall 17
35 Smith 72
40 Jones 41
90 Dattani 20

8 rows selected.
DATA UPDATING 43

This method is clumsy and some other approach is needed if one is to add
several rows to a table. SQL systems typically provide a forms-based
interface, or a load utility, for entering many rows or to load a database.

The second form for INSERT is illustrated in Example 4.6.

Example 4.6. Create a table of senior citizens.

CREATE TABLE seniors


(userid NUMERIC NOT NULL,
name CHARACTER(30) NOT NULL);

INSERT INTO seniors


SELECT userid, name
FROM patron
WHERE age >= 65 ;
DATA UPDATING 45

Chapter Five

Views
All of our discussions to this point have dealt with tables. Views are similar to tables;
in many cases they can be used interchangeably. A view is defined using the
CREATE VIEW command and is defined in terms of tables or other views. A view
exists in definition only; it does not occupy space in the database as a table does; it is
materialized by the database system whenever it is used or referenced. There are
two reasons for using views: convenience and security.

Creating Views

The CREATE VIEW command is used to create a view; it has the general syntax:

CREATE VIEW view-name (column-specification-list)

AS select-specification

WITH CHECK OPTION;

The column-specification-list is the list of columns that appear in the view. This
can be omitted when each column in the view is derived directly and unambiguously
from a column of an underlying table. The select-specification is the SELECT
statement which defines the view. The WITH CHECK OPTION is optional and is
concerned with security aspects of the view.

As an example, suppose we wanted to supply our patrons with a view of the LOAN
table consisting of the call number and date due columns for those books that are
currently out on loan. Consider the definition BOOKSONLOAN:

CREATE VIEW BOOKSONLOAN AS


SELECT LOAN.LOANID, BOOK.CALLNO, BOOK.TITLE
FROM BOOK, LOAN
WHERE BOOK.CALLNO = LOAN.CALLNO
AND LOAN.DATERET IS NULL;

One can query the data using the SELECT syntax discussed in Chapter 3.

SELECT * FROM BOOKSONLOAN ;


DATA UPDATING 46

Dropping views

ISO SQL does not include a command to drop (remove) a view, but SQL systems
generally include a DROP statement for this purpose. So, to remove the view
OUT_BOOKS from the system one would execute:

DROP VIEW BOOKSONLOAN;

Advantages of Views

The view mechanism of SQL is a very useful tool for providing a form of security in the
database system. A view represents a subset of some underlying tables. The users
perception is simplified and what may be sensitive data is excluded from a user’s view.
For example, it may be undesirable for people to know who has the book they want. If
somehow we can force users to use BOOKSONLOAN instead of LOAN then we have
excluded (hidden) the sensitive field userid in LOAN from them.

Grouped Views

A grouped view is one where the GROUP BY clause has been used in the definition.
Such views are very useful, but unfortunately they are subject to numerous
restrictions in ISO SQL. Consider the view:

SQL> CREATE VIEW loans_by_book (callno, book_count)


2 AS
3 SELECT callno, COUNT(*)
4 FROM loan
5 GROUP BY callno;

View created.

ISO SQL does not allow one to use a WHERE clause, a GROUP BY clause, or a
HAVING clause with respect to a grouped view referenced in a SELECT. For
example,

SELECT * FROM loans_by_book WHERE book_count > 2 ;

is not acceptable since the SELECT references a grouped view and contains a
WHERE clause on that view. As well, if one of the tables/views in the FROM clause
of a SELECT references a grouped view then no other tables/views can be
referenced. For example,

SELECT loans_by_book.callno, title

FROM loans_by_book, book

WHERE loans_by_book.callno = book.callno;


DATA UPDATING 47

is illegal since there are two tables/views in the FROM clause and one of these is a
grouped view. If a SELECT references a grouped view in it’s FROM clause, then an
associated WHERE, GROUP BY, or HAVING is not permitted. Some SQL
implementations are more liberal in their treatment of grouped views.

Check Option

The WITH CHECK OPTION phrase is optional, but certainly recommended if the
view could be used for updating the database. Regardless of whether or not the WITH
CHECK OPTION is specified the user of a view is restricted to retrieving only that
portion of the database specified in the view. However, that is not the case for the
update operations of INSERT and UPDATE. Consider the view

SQL> CREATE VIEW computing_books


2 AS
3 SELECT * FROM book
4 WHERE subject='Computing';

View created.

A SELECT command referencing this view can only access computing books, but it
is possible in the absence of WITH CHECK OPTION to insert non-computing books
or change a

SQL> INSERT INTO computing_books


2 VALUES
3 (111,'Twentieth Century','History');

1 row created.

and insert the history text into the database. This book, however vanishes from the
view; we cannot retrieve it using COMPUTING_BOOKS.

SQL> select * from computing_books;

CALLNO TITLE SUBJECT


--------- -------------------- -------------
200 Database Systems Computing
300 HTML Computing
400 XML Computing
500 Software Testing Computing

A similar situation exists for the UPDATE command. To avoid this kind of behavior
we just include the WITH CHECK OPTION when we define the view.

SQL> CREATE VIEW computing_books


2 AS
3 SELECT * FROM book
4 WHERE subject='Computing'
5 WITH CHECK OPTION;
DATA UPDATING 48

View created.

This option tells the system to check and reject updates (INSERTs and UPDATEs) for
rows that vanish from the view.

SQL> INSERT INTO computing_books


2 VALUES
3 (111,'Twentieth Century','History');
INSERT INTO computing_books
*
ERROR at line 1:
ORA-01402: view WITH CHECK OPTION where-clause violation

Updating through Views

Some views are inherently non updatable. That is, for some views an update operation
does not make any sense and must be disallowed. For example, consider the update
operation

SQL> UPDATE loans_by_book


2 SET book_count = 25
3 WHERE callno=100;
UPDATE loans_by_book
*
ERROR at line 1:
ORA-01732: data manipulation operation not legal on this view

Updating the BOOK_COUNT field does not make any sense; it doesn’t even correspond
to a field in an underlying real table. What views can be updated sensibly, and which
cannot, is an area of current research in database systems. Consequently, SQL systems
are quite restrictive on what types of views can be updated. In general, a view is
updateable if it corresponds to a simple row and column subset of a single table. The
specific conditions that an updateable view must meet are:

 The FROM clause specifies exactly one table.

 The columns are derived directly from underlying columns: no arithmetic


expressions, no aggregate functions, no DISTINCT specification.

 The WHERE clause does not have a subquery, a GROUP BY clause, or a


HAVING clause.

 In the future these restrictions will in all likelihood be relaxed; at the present
time some database systems have eased these restrictions. For instance,
DB2 allows a view to be updated even if the WHERE clause contains a
subquery as long as the FROM clause in the subquery and the “outer” FROM
clause reference different tables.
DATA UPDATING 49

You might also like