
V. Matos - CIS611_LECTURE_NOTES_ALGEBRA.docx
Cleveland State University
CIS611 Relational Database Systems
Lecture Notes
Prof. Victor Matos

RELATIONAL ALGEBRA
THE RELATIONAL DATA MODEL (RM)
and the Relational Algebra

The relational model of data (RM) was introduced by Dr. E. F. Codd (CACM, June 1970).

The RM is simpler and more uniform than the preceding Network and Hierarchical models.

S. Todd (IBM, 1976) presented PRTV, the first implementation of a relational algebra DBMS.

A. Klug added summary functions for statistical computing (ACM SIGMOD 1982).

Roth, Ozsoyoglu et al. (1987) extended the model to allow nested data structures.

Clifford, Tansel, Navathe, and others have added time specifications to the model.

Current research is aimed at extending the model to support complex data objects, multimedia management, hyperdata, and geographical, temporal and logical processing.




A relational database is a collection of relations

A relation is a 2-dimensional table, in which
each row represents a collection of related data
values

The values in a relation can be interpreted as a
fact describing an entity or a relationship


Relation name: STUDENT. Attributes: Name, SSN, Address, GPA. Each row below is a tuple.

STUDENT   Name                 SSN           Address             GPA
          Mary Poppins         111-22-3333   77 Picadilly St     4.00
          Pepe Gonzalez        123-45-6789   123 Bonita Rd.      3.09
          V. Sundarabatharan   999-88-7777   105 Calcara Ave.    3.87
          Shi-Wua Yan          881-99-0101   778 Tienamen Sq.    3.88


V. Matos - CIS611_LECTURE_NOTES_ALGEBRA.docx 4
THE RELATIONAL DATA MODEL (RM)
and the Relational Algebra

Domains, Tuples, Attributes, and Relations

A domain D is a set of atomic values. A domain is given a name, data type and format.

A relation schema R, denoted $R(A_1, A_2, \ldots, A_n)$, is a set of attributes (column names). The degree of a relation is the number of attributes of its relation schema.

Each attribute $A_i$ is the name of a role played by some domain D in $R(A_1, \ldots, A_n)$. D is called the domain of $A_i$ and is denoted by $dom(A_i)$.

A relation r defined on schema $R(A_1, \ldots, A_n)$, also denoted by r(R), is a set of n-tuples $r = \{ t_1, t_2, \ldots, t_m \}$.

Each n-tuple t is an ordered list of n values $t = \langle v_1, v_2, \ldots, v_n \rangle$, where each value $v_i$ is an element of $dom(A_i)$ or a special null value.

A relation r(R) is a subset of the cartesian product of the domains $dom(A_i)$ that define R:
$r(R) \subseteq dom(A_1) \times dom(A_2) \times \cdots \times dom(A_n)$
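
The last definition can be checked mechanically. The sketch below is only an illustration: the two toy domains and their values are assumptions, not part of the notes. It builds the cartesian product of the domains and confirms that a relation instance is a subset of it.

```python
from itertools import product

# Hypothetical toy domains; names and values are illustrative assumptions.
dom_name = {"Mary", "Pepe"}
dom_gpa = {3.09, 4.00}

# Cartesian product of the domains: every conceivable (Name, GPA) tuple.
all_tuples = set(product(dom_name, dom_gpa))

# A relation instance r(R) is any set of tuples drawn from that product.
r = {("Mary", 4.00), ("Pepe", 3.09)}

degree = len(next(iter(r)))          # number of attributes in the schema
is_valid_relation = r <= all_tuples  # subset test
```

Using a Python set for r also mirrors the rule that a relation holds no duplicate tuples and has no tuple order.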


Characteristics of Relations

Relations are defined as a (mathematical) set
of tuples.

Duplicate tuples are not allowed

Order of tuples inside a relation is immaterial

Ordering of values within a tuple is irrelevant as long as each value stays
paired with its attribute; therefore the column ordering is not important.

Each value in a tuple is atomic (not divisible)

Recent research is oriented toward removing
the atomicity of First Normal Form databases


Key Attributes of a Relation

A superkey SK of relation r(R) is a group of
attributes whose values uniquely identify each tuple of r(R),
and therefore determine all the other attributes of that tuple.

A key K of a relation schema R is a minimal
superkey of R.

A relation schema R may have more than one
key. Each of those is called a Candidate Key.

It is common to select one of the candidate
keys and elevate it to Primary Key.

Convention:
The attributes representing the primary key of
schema R are underlined. Example:
EMPLOYEE (SSN, Name, Address, Salary)
STOCK (PartNum, SupNum, Quantity)
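
The superkey and key definitions can be tried out on a small instance. The sketch below is an illustration under assumptions: the STOCK rows are invented, and the helper names `is_superkey` and `candidate_keys` are hypothetical. Note that a key is a property of the schema; checking one instance can only show that a set of attributes is consistent with being a key, or refute it.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def candidate_keys(rows, schema):
    """Minimal superkeys of this instance, smallest attribute sets first."""
    keys = []
    for size in range(1, len(schema) + 1):
        for attrs in combinations(schema, size):
            # Skip any attribute set that contains an already-found key.
            minimal = not any(set(k) <= set(attrs) for k in keys)
            if minimal and is_superkey(rows, attrs):
                keys.append(attrs)
    return keys

schema = ("PartNum", "SupNum", "Quantity")
# Invented sample rows (assumption), with no duplicate tuples.
stock = [
    {"PartNum": 1, "SupNum": 10, "Quantity": 5},
    {"PartNum": 1, "SupNum": 20, "Quantity": 5},
    {"PartNum": 2, "SupNum": 10, "Quantity": 5},
]

keys = candidate_keys(stock, schema)
```

On this instance the whole schema is a superkey, but only the pair (PartNum, SupNum) is minimal.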



Integrity Constraints

Integrity constraints are rules specified on the database schema
and are expected to hold on every instance of that schema.

1. Key constraints specify the candidate keys of
each relation schema R.

2. Entity integrity constraints state that no primary
key value can be null.

3. Referential integrity constraints are specified
between two relations and are used to maintain
consistency among the tuples of the two relations.
Foreign key(s) of one relation are used to refer
to primary key values in the other relation.

RELATIONAL QUERY LANGUAGES

Classification based on the underlying language model

Model                             Example

1. Pure Algebraic                 ISBL - IBM (Information System Base Language)

2. Pure Predicate Calculus
   Tuple Oriented                 QUEL (Ingres)
   Domain Oriented                QBE, STBE

3. Mixed Algebra-Calculus         SQL

4. Object Oriented                DB4O

5. Associative                    Sentences - LazySoft

6. Other                          Cache (Object-Relational)



Relational Algebra

A collection of operators which are used to manipulate
entire relations.

The result of each operation is a new relation.

It consists of two groups: operations on sets, and operations
specifically designed to manipulate relational databases.

Operations on sets:
UNION
DIFFERENCE
INTERSECTION
CARTESIAN PRODUCT

Operations on databases
SELECT
PROJECT
JOIN
AGGREGATE
DIVISION
RENAME


RELATIONAL ALGEBRA
UNION
The result of this operation, denoted $(r \cup s)$ or $(r + s)$, is a
relation that includes all tuples that either are in r or s or both
in r and s.
Duplicate tuples are eliminated.
Combined relations must be union-compatible.
$r + s = \{\, t \mid t \in r \text{ or } t \in s \,\}$

r      A  B  C        s      A  B  C
       1  1  1               1  2  3
       2  2  2               1  1  1
       3  3  3               3  2  1

r + s  A  B  C
       1  1  1
       2  2  2
       3  3  3
       1  2  3
       3  2  1
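
The union example can be reproduced with a short Python sketch. Modelling a relation as a set of tuples is an illustration choice, and the degree check is only a partial stand-in for union compatibility (it does not compare domains).

```python
# Relations r and s from the example, as sets of (A, B, C) tuples.
r = {(1, 1, 1), (2, 2, 2), (3, 3, 3)}
s = {(1, 2, 3), (1, 1, 1), (3, 2, 1)}

def union(r, s):
    """r + s: tuples in r, in s, or in both; the set type drops duplicates."""
    # Partial union-compatibility check: every tuple has the same degree.
    if len({len(t) for t in r | s}) > 1:
        raise ValueError("relations are not union-compatible")
    return r | s

r_plus_s = union(r, s)
```

The tuple (1, 1, 1) occurs in both inputs but only once in the result, so r + s has five tuples.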



DIFFERENCE
The result of this operation, denoted $(r - s)$, is a relation
that includes all tuples that are in r but not in s.
Participating relations must be union-compatible.
$r - s = \{\, t \mid t \in r \text{ and } t \notin s \,\}$


Example

r      A  B  C        s      A  B  C
       1  1  1               1  2  3
       2  2  2               1  1  1
       3  3  3               3  2  1

r - s  A  B  C
       2  2  2
       3  3  3


NOTE: The difference operator is not commutative,
that is (in general) $r - s \neq s - r$
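
A small Python sketch, assuming relations are modelled as sets of tuples, reproduces the example and shows the lack of commutativity.

```python
# Relations r and s from the example, as sets of (A, B, C) tuples.
r = {(1, 1, 1), (2, 2, 2), (3, 3, 3)}
s = {(1, 2, 3), (1, 1, 1), (3, 2, 1)}

def difference(r, s):
    """r - s: tuples that are in r but not in s."""
    return {t for t in r if t not in s}

r_minus_s = difference(r, s)
s_minus_r = difference(s, r)
```

Here r - s keeps (2, 2, 2) and (3, 3, 3), while s - r keeps (1, 2, 3) and (3, 2, 1).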



INTERSECTION
The result of this operation, denoted $(r \cap s)$, is a relation
that includes all tuples that are present in both r and s.
Participating relations must be union-compatible.
$r \cap s = \{\, t \mid t \in r \text{ and } t \in s \,\}$


Example

r      A  B  C        s      A  B  C
       1  1  1               1  2  3
       2  2  2               1  1  1
       3  3  3               3  2  1

r ∩ s  A  B  C
       1  1  1
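
The sketch below, assuming relations are modelled as Python sets of tuples, reproduces the intersection example. It also checks the standard identity that intersection can be written with difference alone, r ∩ s = r - (r - s).

```python
# Relations r and s from the example, as sets of (A, B, C) tuples.
r = {(1, 1, 1), (2, 2, 2), (3, 3, 3)}
s = {(1, 2, 3), (1, 1, 1), (3, 2, 1)}

def intersection(r, s):
    """Tuples present in both r and s."""
    return {t for t in r if t in s}

r_and_s = intersection(r, s)

# The same result using only the difference operator.
via_difference = r - (r - s)
```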





CARTESIAN PRODUCT
The operation, denoted $(r \times s)$, is also known as the cross
product or cross join. The purpose of the operator is to
concatenate rows from two relations, making all possible
combinations of rows.

Consider relation schemas $r(A_1, A_2, \ldots, A_n)$ and $s(B_1, B_2, \ldots, B_m)$.
Relations r and s do not have to be union-compatible.
If r has p tuples and s has q tuples, then $(r \times s)$ will have
a total of (p * q) tuples.
The resulting relation schema is $(A_1, A_2, \ldots, A_n, B_1, \ldots, B_m)$.

Example

r2     A  B  C        s2     D   E
       1  1  1               10  a
       2  2  2               20  b
       3  3  3

r2 x s2   A  B  C  D   E
          1  1  1  10  a
          1  1  1  20  b
          2  2  2  10  a
          2  2  2  20  b
          3  3  3  10  a
          3  3  3  20  b
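
As a rough illustration, the cross product can be sketched in Python by concatenating every pair of tuples. The function name `cartesian_product` is a hypothetical helper, and relations are assumed to be collections of tuples.

```python
from itertools import product

# r2(A, B, C) and s2(D, E) from the example.
r2 = [(1, 1, 1), (2, 2, 2), (3, 3, 3)]
s2 = [(10, "a"), (20, "b")]

def cartesian_product(r, s):
    """Concatenate every tuple of r with every tuple of s."""
    return {tr + ts for tr, ts in product(r, s)}

r2_x_s2 = cartesian_product(r2, s2)
```

With 3 tuples in r2 and 2 in s2 the result holds 3 * 2 = 6 tuples, each of degree 5.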






PROJECTION
The project operator extracts certain columns from the table
and discards the other columns.
Syntax: $Result = \pi_{Col}(Table)$

where
Col is the list of columns to be extracted from the Table.
Duplicate tuples in the resulting table are eliminated.


EXAMPLE
Evaluate the expressions $Temp1 = \pi_{A}(r)$ and $Temp2 = \pi_{B,C}(r)$

r      A   B    C        Temp1  A        Temp2  B    C
       1   610  3               1               610  3
       1   620  3               2               620  3
       1   600  2                               600  2
       1   650  2                               650  2
       2   610  3                               634  4
       2   634  4
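
A minimal sketch of projection, assuming a relation is a list of tuples plus a separate schema tuple (both modelling choices made only for this illustration):

```python
SCHEMA = ("A", "B", "C")
r = [(1, 610, 3), (1, 620, 3), (1, 600, 2),
     (1, 650, 2), (2, 610, 3), (2, 634, 4)]

def project(rows, schema, cols):
    """Keep only the listed columns; building a set removes duplicates."""
    idx = [schema.index(c) for c in cols]
    return {tuple(row[i] for i in idx) for row in rows}

temp1 = project(r, SCHEMA, ["A"])
temp2 = project(r, SCHEMA, ["B", "C"])
```

Temp2 has five tuples rather than six because (610, 3) appears twice in r and duplicates are eliminated.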




SELECTION
The selection operator extracts certain rows from the table and discards
the others. Retrieved tuples must satisfy a given filtering condition.
Syntax: $Result = \sigma_{cond}(table)$

where
Cond is a logical expression containing and, or, not operators on
clauses of the form $(table.column \;\theta\; value)$ or $(table.col1 \;\theta\; table.col2)$
and $\theta \in \{ =, >, >=, <, <=, <> \}$.
Entire rows (with all of their columns) are retrieved when the
condition is met.


EXAMPLE
Evaluate the expression
$Temp1 = \sigma_{(B >= 620) \text{ and } (C < 4)}(r)$

r      A   B    C        Temp1  A   B    C
       1   610  3               1   620  3
       1   620  3               1   650  2
       1   600  2
       1   650  2
       2   610  3
       2   634  4
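
The selection example can be sketched in Python by passing the condition as a predicate. Representing rows as dictionaries keyed by column name is an assumption made for readability.

```python
r = [
    {"A": 1, "B": 610, "C": 3},
    {"A": 1, "B": 620, "C": 3},
    {"A": 1, "B": 600, "C": 2},
    {"A": 1, "B": 650, "C": 2},
    {"A": 2, "B": 610, "C": 3},
    {"A": 2, "B": 634, "C": 4},
]

def select(rows, cond):
    """Keep the whole row whenever the condition holds."""
    return [row for row in rows if cond(row)]

# Condition (B >= 620) and (C < 4)
temp1 = select(r, lambda t: t["B"] >= 620 and t["C"] < 4)
```

The row (2, 634, 4) passes the test on B but fails C < 4, so only two rows remain.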




RENAME
In some cases, we may want to rename the attributes of a relation or the
relation name or both.
The rename operator is useful to avoid situations in which a query
produces columns with the same name (perhaps different meaning).
Syntax: $Result = \rho_{oldName \rightarrow newName}(table)$

where
oldName is a column in the table and newName is the new
identification for the column.
Only the column name is changed; the data remains intact.


EXAMPLE
Evaluate the expression
$Temp1 = \rho_{A \rightarrow Section,\; B \rightarrow Course,\; C \rightarrow Credits}(r)$

r      A   B    C        Temp1  Section  Course  Credits
       1   610  3               1        610     3
       1   620  3               1        620     3
       1   600  2               1        600     2
       1   650  2               1        650     2
       2   610  3               2        610     3
       2   634  4               2        634     4
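
A possible sketch of rename in Python, assuming rows are dictionaries keyed by column name. The mapping sends A to Section, B to Course and C to Credits, which is what the result table shows; only two of the six rows are listed to keep the sketch short.

```python
def rename(rows, mapping):
    """Relabel columns according to mapping; the values are untouched."""
    return [{mapping.get(col, col): val for col, val in row.items()}
            for row in rows]

r = [
    {"A": 1, "B": 610, "C": 3},
    {"A": 2, "B": 634, "C": 4},
]

temp1 = rename(r, {"A": "Section", "B": "Course", "C": "Credits"})
```

Columns missing from the mapping keep their old name, so a partial rename is also possible.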



JOIN
The join operation, denoted by $(Tab_1 \bowtie_{\langle join\ condition \rangle} Tab_2)$, is
used to combine related tuples from two relations.
Join-condition format is: $(Table_1.Col_1 \;\theta\; Table_2.Col_2)$,
where $\theta$ could be { =, >, >=, <, <=, <> }.
Restrictions of the form $(Table.Col \;\theta\; Value)$ can be
and-ed or or-ed to the joining condition.

EXAMPLE
Consider relation schemas r(A,B,C) and s(D,E), and the
expression: $Temp1 = (r \bowtie_{(C = D)} s)$

r      A  B  C        s      D  E
       1  1  1               1  a
       2  2  2               2  b
       3  3  3               2  c

Temp1  A  B  C  D  E
       1  1  1  1  a
       2  2  2  2  b
       2  2  2  2  c
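
One way to picture the join is as a cartesian product followed by a selection on the join condition. The sketch below follows that reading; it assumes rows are dictionaries and that r and s have no column names in common, and the name `theta_join` is a hypothetical helper.

```python
from itertools import product

r = [{"A": 1, "B": 1, "C": 1},
     {"A": 2, "B": 2, "C": 2},
     {"A": 3, "B": 3, "C": 3}]
s = [{"D": 1, "E": "a"},
     {"D": 2, "E": "b"},
     {"D": 2, "E": "c"}]

def theta_join(r, s, cond):
    """Pair every r-row with every s-row and keep pairs satisfying cond."""
    return [{**tr, **ts} for tr, ts in product(r, s) if cond(tr, ts)]

# Join condition (C = D)
temp1 = theta_join(r, s, lambda tr, ts: tr["C"] == ts["D"])
```

The r-tuple (3, 3, 3) finds no partner in s and therefore does not appear in the result.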






NATURAL JOIN
The natural join operation, denoted by $(Tab_1 * Tab_2)$, is
used to combine tuples of two relations under an equi-join.
Related columns must have the same name & domain.
Implicit join-conditions are: $(Table_1.Col_1 = Table_2.Col_2)$

EXAMPLE
Consider relation schemas r(A,B,C) and s(B,C,D), and the
expression: Temp = ( r * s )

r      A  B  C        s      B  C  D
       1  1  1               1  1  a
       2  1  0               1  2  b
       4  3  2               3  2  c
                             4  3  d

Temp   A  B  C  D
       1  1  1  a
       4  3  2  c

The implicit join-condition is
(r.B=s.B) and (r.C=s.C)
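
A brief Python sketch of the natural join, assuming rows are dictionaries and that the shared column names can be read off the first row of each relation (a simplification made for this illustration):

```python
def natural_join(r, s):
    """Equi-join on all columns the two relations have in common."""
    if not r or not s:
        return []
    common = [c for c in r[0] if c in s[0]]
    return [{**tr, **ts}
            for tr in r for ts in s
            if all(tr[c] == ts[c] for c in common)]

r = [{"A": 1, "B": 1, "C": 1},
     {"A": 2, "B": 1, "C": 0},
     {"A": 4, "B": 3, "C": 2}]
s = [{"B": 1, "C": 1, "D": "a"},
     {"B": 1, "C": 2, "D": "b"},
     {"B": 3, "C": 2, "D": "c"},
     {"B": 4, "C": 3, "D": "d"}]

temp = natural_join(r, s)
```

Merging the two dictionaries keeps a single copy of B and C, which matches the result schema (A, B, C, D).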


LEFT OUTER JOIN
The left outer join operation, denoted by $(r \; ⟕_{\langle join\ condition \rangle} \; s)$, is a special
case of the general join.
LOJ keeps in the resulting table a representation of every tuple that
appears in the first (or left) relation, whether or not it has a match.
If no matching value for r is found in s, then the attributes of s appear
in the result as null values.

EXAMPLE
Consider relation schemas r(A,B,C) and s(D,E), and the expression:
$Temp1 = (r \; ⟕_{(A = D)} \; s)$

r      A  B  C        s      D  E
       1  1  1               1  a
       2  2  2               2  b
       3  3  3               2  c

Temp1  A  B  C  D     E
       1  1  1  1     a
       2  2  2  2     b
       2  2  2  2     c
       3  3  3  null  null



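NOTE:
Outer-join is not a primitive operator. It could be expressed as the union of
the join $(r \bowtie_{(A = D)} s)$ with the tuples of r that find no match in s,
each one extended with a tuple $Null^{L}$ of L null values.
Where: Y is the set of attributes shared by Schema(r) and Schema(s), and
L = degree(s) - |Y|. In the example above Y is empty, so L = 2.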
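
The idea in the note, keeping the joined tuples and adding each unmatched r-tuple padded with nulls, can be sketched as follows. Rows are assumed to be dictionaries, Python's None stands in for null, and the helper name and its `s_columns` parameter are hypothetical.

```python
def left_outer_join(r, s, cond, s_columns):
    """Join r with s; unmatched r-rows are padded with None for s's columns."""
    result = []
    for tr in r:
        matches = [{**tr, **ts} for ts in s if cond(tr, ts)]
        if not matches:
            # No partner in s: keep the r-row, pad with nulls.
            matches = [{**tr, **{c: None for c in s_columns}}]
        result.extend(matches)
    return result

r = [{"A": 1, "B": 1, "C": 1},
     {"A": 2, "B": 2, "C": 2},
     {"A": 3, "B": 3, "C": 3}]
s = [{"D": 1, "E": "a"},
     {"D": 2, "E": "b"},
     {"D": 2, "E": "c"}]

# Join condition (A = D)
temp1 = left_outer_join(r, s, lambda tr, ts: tr["A"] == ts["D"], ["D", "E"])
```

The tuple (3, 3, 3) has no match in s, so it appears once with D and E set to None.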

AGGREGATE FUNCTIONS
Originally proposed by A. Klug (1982) to extend the scope of
relational algebra allowing mathematical computations of
summary functions.
Syntax: ${}_{\langle grouping\ attributes \rangle}\, \mathcal{F}_{\langle function\ list \rangle}(\langle relation\ name \rangle)$
Common functions are: MAX, MIN, AVG, SUM, COUNT.
Grouping attributes force a fragmentation of the relation;
the function is computed in each independent group.
Output consists of the grouping attributes and the result of
the summary functions on each group.
If no grouping field(s) is given, the function(s) applies on
the entire table.

EXAMPLE
Compute $Temp = {}_{A}\, \mathcal{F}_{Sum(B),\, Max(C)}(r)$

r      A  B   C        Temp   A  Sum_B  Max_C
       1  10  1               1  12     5
       1  2   5               2  3      3
       2  3   3               3  11     10
       3  6   10
       3  5   7

In Temp, A is the group-by field; Sum_B and Max_C are the summary data.
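
The grouping step can be illustrated with a dictionary of groups. This is only a sketch that hard-codes the example's functions Sum(B) and Max(C); the function name is hypothetical and relations are assumed to be lists of (A, B, C) tuples.

```python
from collections import defaultdict

r = [(1, 10, 1), (1, 2, 5), (2, 3, 3), (3, 6, 10), (3, 5, 7)]

def sum_b_max_c_by_a(rows):
    """Group rows on A, then compute Sum(B) and Max(C) inside each group."""
    groups = defaultdict(list)
    for a, b, c in rows:
        groups[a].append((b, c))
    return {(a, sum(b for b, _ in g), max(c for _, c in g))
            for a, g in groups.items()}

temp = sum_b_max_c_by_a(r)
```

Each output tuple carries the grouping value followed by the two summary values, matching Temp(A, Sum_B, Max_C).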






DIVISION
The division operation, denoted by (r / s), is useful when you need a
mechanism to identify the tuples of some table that are related to each
and every one of the tuples of a second group.

EXAMPLE
Consider relation schemas r(A,B) and s(B), and the expression:
Temp1 = ( r / s )

r      A  B        s      B        Temp1  A
       1  1               1               1
       1  2               2
       1  3               3
       1  4
       2  1
       2  3
       3  3

NOTE
Division is not a primitive operation; it could be expressed as:
$r / s = \pi_{A}(r) - \pi_{A}\big( (\pi_{A}(r) \times s) - r \big)$
for table schemes r(A,B) and s(B).

The algebraic expression
r[A,B] / s[B]
selects the A-values from the dividend table
r[A,B], whose B-values are a super-set of those
B-values held in the divisor table s[B].
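
Both readings of division can be sketched in Python: the "super-set" description above, and the usual textbook identity built from projection, product and difference, $\pi_A(r) - \pi_A((\pi_A(r) \times s) - r)$. Modelling r as a set of (A, B) pairs and s as a set of B-values is an assumption of this illustration, as are the function names.

```python
r = {(1, 1), (1, 2), (1, 3), (1, 4), (2, 1), (2, 3), (3, 3)}
s = {1, 2, 3}

def divide(r, s):
    """A-values whose set of B-values is a superset of s."""
    a_values = {a for a, _ in r}
    return {a for a in a_values
            if s <= {b for x, b in r if x == a}}

def divide_primitive(r, s):
    """Same result using only projection, product and difference."""
    pa = {a for a, _ in r}                      # projection of r on A
    all_pairs = {(a, b) for a in pa for b in s}  # product with s
    missing = all_pairs - r                     # pairs r would need but lacks
    return pa - {a for a, _ in missing}

temp1 = divide(r, s)
```

A-value 2 lacks the pair (2, 2) and A-value 3 lacks (3, 1) and (3, 2), so only A = 1 qualifies.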
PACK
Assume A is an attribute in Schema(r). The $Pack_A(r)$ operator transforms
the A-values into a nested representation.
A nested field is a set of related atomic values.

EXAMPLE
Consider relation schema r(A,B,C) and the expression:
$temp1 = Pack_C(r)$

r      A   B   C
       b1  40  a1
       b1  40  a2
       b2  50  a3
       b3  60  a4
       b3  60  a2
       b3  60  a5
       b4  60  a6

temp1  A   B   C
       b1  40  {a1, a2}
       b2  50  {a3}
       b3  60  {a2, a4, a5}
       b4  60  {a6}


Consider relation schema s(A,B,C) and the expression:
$temp2 = Pack_C(s)$

s      A   B   C
       m1  1   {a1, a2}
       m1  1   {a3}
       m2  2   {a4}
       m2  1   {a5, a6}
       m2  2   {a4, a7, a8}

temp2  A   B   C
       m1  1   {a1, a2, a3}
       m2  2   {a4, a7, a8}
       m2  1   {a5, a6}




PACK (cont)
Let A be one of the n attributes in $Schema(A_1 \ldots A_n)$. Assume the
relation r is defined over the $Schema(A_1 \ldots A_n)$.

Let $C_A = Schema(A_1 \ldots A_n) - \{A\}$. Therefore $|C_A| = n-1$.

For each (n-1)-tuple $g \in \pi_{C_A}(r)$ we define the sets $W_g[C_A]$ and $W_g[A]$ as
follows:
$W_g[C_A] = g$,
$W_g[A] = \{\, t[A] \mid t \in r \text{ and } t[C_A] = g \,\}$ if A is atomic, and
$W_g[A] = \{\, x \mid (\exists t)\; t \in r \text{ and } t[C_A] = g \text{ and } x \in t[A] \,\}$ otherwise.

Then $Pack_A(r) = \{\, W_g \mid g \in \pi_{C_A}(r) \,\}$
Therefore, the Pack operator converts sets of r-tuples whose (n-1)
attributes for $C_A$ are the same into a single tuple.
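
The definition can be tried out with a short Python sketch. It assumes rows are dictionaries and that a nested field is a Python set; the function name `pack` and the use of frozenset for the packed value are illustration choices. The same code covers both cases of the definition: an atomic A-value is treated as a one-element set, and set-valued A-values are merged.

```python
def pack(rows, attr):
    """Merge rows that agree on every column except attr into one row."""
    groups = {}
    for row in rows:
        # g: the (n-1)-tuple of the remaining columns, used as group key.
        g = tuple(sorted((k, v) for k, v in row.items() if k != attr))
        value = row[attr]
        members = value if isinstance(value, (set, frozenset)) else {value}
        groups.setdefault(g, set()).update(members)
    return [{**dict(g), attr: frozenset(w)} for g, w in groups.items()]

r = [
    {"A": "b1", "B": 40, "C": "a1"}, {"A": "b1", "B": 40, "C": "a2"},
    {"A": "b2", "B": 50, "C": "a3"}, {"A": "b3", "B": 60, "C": "a4"},
    {"A": "b3", "B": 60, "C": "a2"}, {"A": "b3", "B": 60, "C": "a5"},
    {"A": "b4", "B": 60, "C": "a6"},
]
s = [
    {"A": "m1", "B": 1, "C": {"a1", "a2"}}, {"A": "m1", "B": 1, "C": {"a3"}},
    {"A": "m2", "B": 2, "C": {"a4"}}, {"A": "m2", "B": 1, "C": {"a5", "a6"}},
    {"A": "m2", "B": 2, "C": {"a4", "a7", "a8"}},
]

temp1 = pack(r, "C")
temp2 = pack(s, "C")
```

Packing r on C yields four rows and packing s on C yields three, as in the PACK example tables.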

UNPACK
Unpack is the counterpart of the Pack operator. When applied on the set-valued
attribute A of a relation r, this operator transforms the single non-atomic
version of the tuple into a group of records in which the attribute
A is atomic.
EXAMPLE
Consider relation schema r(A,B,C) and the expression:
$temp1 = Unpack_C(r)$

r      A   B   C
       b1  1   {a1, a2}
       b2  2   {a3}
       b3  2   {a2, a4, a5}
       b4  3   {a6}

temp1  A   B   C
       b1  1   a1
       b1  1   a2
       b2  2   a3
       b3  2   a4
       b3  2   a2
       b3  2   a5
       b4  3   a6


Let A be one of the n attributes in $Schema(A_1 \ldots A_n)$. Assume the
relation r is defined over the $Schema(A_1 \ldots A_n)$.

Let $C_A = Schema(A_1 \ldots A_n) - \{A\}$. Therefore $|C_A| = n-1$.

$UP_A(\{t\}) = \{\, t \,\}$ if A is atomic, and
$UP_A(\{t\}) = \{\, t' \mid (t'[A] \in t[A]) \text{ and } (t'[C_A] = t[C_A]) \,\}$ otherwise.

Then $UP_A(r) = \bigcup_{t \in r} UP_A(\{t\})$
If A is atomic then $UP_A(r) = r$, otherwise $UP_A(r)$ maps each tuple t in r
into a set of (decompressed) tuples such that each element in t[A]
becomes the atomic A-value of a new decompressed tuple.
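
A minimal sketch of Unpack in Python, assuming rows are dictionaries and nested fields are Python sets. The third input row is labelled b3 here so that it agrees with the unpacked rows listed in the example; that labelling, like the function name `unpack`, is an assumption of this illustration.

```python
def unpack(rows, attr):
    """Emit one row per element of a set-valued attr; atomic rows pass through."""
    result = []
    for row in rows:
        value = row[attr]
        if isinstance(value, (set, frozenset)):
            # sorted() only makes the output order predictable.
            result.extend({**row, attr: x} for x in sorted(value))
        else:
            result.append(dict(row))
    return result

r = [
    {"A": "b1", "B": 1, "C": {"a1", "a2"}},
    {"A": "b2", "B": 2, "C": {"a3"}},
    {"A": "b3", "B": 2, "C": {"a2", "a4", "a5"}},
    {"A": "b4", "B": 3, "C": {"a6"}},
]

temp1 = unpack(r, "C")
```

Applying `unpack` to a relation whose attribute is already atomic returns the same rows, in line with the remark that the operator leaves such a relation unchanged.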

EXAMPLE QUERIES
QUERY 1. Retrieve the name and address of all employees
who work in the 'Research' department.

QUERY 2. For every project located in 'Cleveland', list the
project number, the controlling department number, and the
department manager's last name, address, and birthdate.

QUERY 3. Find the name of employees who work on all
projects controlled by department number 5.

QUERY 4. Make a list of project numbers for projects that
involve an employee whose last name is 'Smith', either as a
worker or as a manager of the department that controls the
project.

QUERY 5. List the name of all employees with two or
more dependents.

QUERY 6. Retrieve the name of employees who have no
dependents.

QUERY 7. List the name of managers who have at least
one dependent.



Company Database


DEPARTMENT (DNAME, DNUMBER, MGRSSN, MGRSTARTDATE)
DEPENDENT (ESSN, DEPENDENT_NAME, SEX, BDATE, RELATIONSHIP)
DEPT_LOCATIONS (DNUMBER, DLOCATION)
EMPLOYEE (FNAME, MINIT, LNAME, SSN, BDATE, ADDRESS, SEX, SALARY, SUPERSSN, DNO)
PROJECT (PNAME, PNUMBER, PLOCATION, DNUM)
WORKS_ON (ESSN, PNO, Hours)


IST 331 Brief Notes on Relational Algebra
V. Matos.

Consider the relation schema of the COMPANY database given below

EMPLOYEE (fname, minit, lname, ssn, birthdate, address, sex, salary, superssn, dno)
DEPARTMENT (dname, dnumber, mgrssn, mgrstartdate)
PROJECT (pname, pnumber, plocation, dnum)
WORKS_ON (essn, pno, hours)
DEPENDENT (essn, dependent-name, sex, bdate, relationship)


Operator / Example / Comments

Selection
Example:  $Answer = \sigma_{(sex = 'F') \text{ and } (salary \geq 25000)}(Employee)$
Comments: Find the female employees earning at least $25K.

Projection
Example:  $Answer = \pi_{ssn,\, Fname}(Employee)$
Comments: Get the Social Sec. No. and first name of each employee.

Join
Example:  $Answer = Employee \bowtie_{(L.dno = R.dnumber)} Department$
Comments: Merge employee and department records according to matching dept. numbers.

Union
Example:  $Answer = \sigma_{(sex = 'F') \text{ and } (salary = 25000)}(Employee) \cup \sigma_{(sex = 'M') \text{ and } (salary \geq 35000)}(Employee)$
Comments: Get the male employees earning at least $35K and the female employees whose salary is exactly $25K.

Minus
Example:  $Answer = Employee - \sigma_{dno = 4}(Employee)$
Comments: Get the employees who do not work for department No. 4.

Intersection
Example:  $Answer = \sigma_{sex = 'F'}(Employee) \cap \sigma_{dno = 4}(Employee)$
Comments: Get all the female employees who work for dept. 4.

Aggregation
Example:  $Answer = {}_{dno,\, sex}\, \mathcal{F}_{average(salary)}(Employee)$
Comments: Find the average salary of employees grouping by sex and dept. no. Put the results in the table defined as: Answer(dno, sex, average_salary).

Division
Example:  $Answer = \pi_{essn,\, pno}(WorksOn) \,/\, \pi_{Pnumber}(\sigma_{plocation = 'Clev'}(Project))$
Comments: Get the SSN of employees working in each one of the projects located in Cleveland.

Rename
Example:  $Answer = \rho_{fname,\, lname \rightarrow First,\, Last}(Employee)$
Comments: Change the labels fname and lname in the Employee table to First and Last.


Q01. Get the SSN and Last name of each of the female managers

$Answer1 = \pi_{ssn,\, lname}\big( \sigma_{sex = 'F'}(Employee) \bowtie_{(L.ssn = R.MgrSsn)} Department \big)$ [1]


Q02. Get the Social Sec. No. of those employees who are married.

$Answer2 = \pi_{essn}\big( \sigma_{(relationship = 'Spouse')}(Dependent) \big)$

[1] Observation: The notation $(e1 \bowtie_{(L.a = R.b)} e2)$ merges the tables produced by the expressions e1 and e2. The match is dictated by the joining
condition (L.a = R.b). The fragment L.a identifies the a-column as part of the table produced by the table/expression e1. The L and R
qualifications indicate whether the source columns are located to the left or right side of the join-operator $\bowtie$.


Q03. Get the last name of the married female managers. Rename the column to Mgr Name

$Answer3 = \rho_{lname \rightarrow \text{"Mgr Name"}}\big( \pi_{Lname}( Answer1 \bowtie_{(L.ssn = R.essn)} Answer2 ) \big)$

Q04. Get the last name of married employees who have at least one daughter and one son.

$Boys = \pi_{essn}( \sigma_{(relationship = 'Son')}(Dependent) )$
$Girls = \pi_{essn}( \sigma_{(relationship = 'Daughter')}(Dependent) )$
$Married = \pi_{essn}( \sigma_{(relationship = 'Spouse')}(Dependent) )$
$TheSsn = Married \cap Boys \cap Girls$
$Answer = \pi_{Lname}( Employee \bowtie_{L.ssn = R.essn} TheSsn )$


Q05. Get the last name of married employees who have no children.

$TheSsn = Married - (Boys \cup Girls)$
$Answer = \pi_{Lname}( Employee \bowtie_{L.ssn = R.essn} TheSsn )$


Q06. Get the last name of married employees who only have daughters.

$TheSsn = (Married \cap Girls) - Boys$
$Answer = \pi_{Lname}( Employee \bowtie_{L.ssn = R.essn} TheSsn )$


Q07. Get the last name and salary of each employee as well as that of their corresponding (direct)
supervisor.

$Answer = \pi_{EmpSsn,\, EmpSalary,\, BossSsn,\, BossSalary}\big( \rho_{ssn,\, salary \rightarrow EmpSsn,\, EmpSalary}(Employee) \bowtie_{(L.superSsn = R.BossSsn)} \rho_{ssn,\, salary \rightarrow BossSsn,\, BossSalary}(Employee) \big)$

Q08. Get the last name of employees who work on five or more projects.

$theTally = {}_{essn}\, \mathcal{F}_{count(pno)}(worksOn)$
$Answer = \pi_{Lname}\big( Employee \bowtie_{(L.ssn = R.essn)} \sigma_{(count\_pno \geq 5)}(theTally) \big)$



Relational Algebra Practice Test

Last Name: ___________________________ First Name:__________________


Consider the relation schema of the COMPANY database given below
EMPLOYEE (fname, minit, lname, ssn, birthdate, address, sex, salary, superssn, dno) KEY: ssn

DEPARTMENT (dname, dnumber, mgrssn, mgrstartdate) KEY: dnumber.

PROJECT (pname, pnumber, plocation, dnum) KEY: pnumber.

WORKS_ON (essn, pno, hours) KEY: (essn, pno)

DEPENDENT (essn, dependent-name, sex, bdate, relationship) KEY: (essn, dependent-name)

Formulate the following question in Relational Algebra query language:
1. Give the last name of those employees who work in any project(s) where there are more female than male
employees.
























2. Give the last name of those female managers who work in each of the projects located in Cleveland.
