DBMS & SQL

Content Developers: Badrish Prakash & RS Sudhir


Learning Objectives:

- Understand database systems
- Learn how to use SQL to query and update relational databases, and how to use SQL together with a programming language

Table of Contents:

Chapter  Topic
1        Database
2        Relational Database Management System
3        DBMS Assignment
4        History of SQL
5        Data Definition Language
6        Data Modification Language
7        Assignment 1
8        Assignment 2
9        Assignment 3
10       Assignment 4
11       Assignment 5
12       Assignment 6
13       Assignment 7
14       References & Books
15       Reference web sites

Introduction

Every organization has data that needs to be collected, managed, and analyzed. Most people are familiar with some kind of spreadsheet, such as Microsoft Excel. Spreadsheets are easy and convenient to use, and they may be employed by an individual. Spreadsheets are commonly used to store information in a tabular format. A spreadsheet can store data in rows and columns, it can link cells on one sheet to those on another sheet, and it can force data to be entered in a specific cell in a specific format. It's easy to calculate formulas from groups of cells on the spreadsheet, create charts, and work with data in other ways.

A database fulfills these needs. Along with the powerful features of a relational database come requirements for developing and maintaining the database. Data analysts, database designers, and database administrators (DBAs) need to be able to translate the data in a database into useful information for both day-to-day operations and long-term planning.

Database

Originally the database was flat
- Information was stored in a plain-text file called a delimited file (for example, a tab-delimited file)
- Each entry in the file was separated by a special character, such as a vertical bar (|)
- It was difficult to search for specific information

Lname, FName, Age, Salary|Smith, John, 35, $280|Doe, Jane, 28, $325|Brown, Scott, 41, $265|Howard, Shemp, 48, $359|Taylor, Tom, 22, $250
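To make the search problem concrete, here is a minimal Python sketch that reads the flat record string above and looks for employees older than 40. The function names `parse_flat_file` and `find_older_than` are illustrative assumptions, not part of the original material.

```python
# Flat-file records copied from the example above: "|" separates records,
# "," separates the fields within a record.
flat_file = (
    "Lname, FName, Age, Salary|Smith, John, 35, $280|Doe, Jane, 28, $325|"
    "Brown, Scott, 41, $265|Howard, Shemp, 48, $359|Taylor, Tom, 22, $250"
)

def parse_flat_file(text):
    """Split the text into a header record and a list of row dictionaries."""
    records = [[field.strip() for field in rec.split(",")] for rec in text.split("|")]
    header, rows = records[0], records[1:]
    return [dict(zip(header, row)) for row in rows]

def find_older_than(rows, min_age):
    """Linear scan: every record is read and converted on every search."""
    return [row["Lname"] for row in rows if int(row["Age"]) > min_age]

rows = parse_flat_file(flat_file)
older_than_40 = find_older_than(rows, 40)
print(older_than_40)
```

Every search has to parse and scan the whole file, and the program itself must know the file layout. Those are the tasks a DBMS takes over.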

Types of Databases

By Function
- Analytical database: also referred to as On-Line Analytical Processing (OLAP); used to keep track of statistics. Access is read-only, for analyzing the data.
- Operational database: also referred to as On-Line Transaction Processing (OLTP); lets you actually change and manipulate the data in the database.

Types of Databases

By Data Model
- Flat-file database model: data is stored in numerous files, with no linkage between the files, so information is repeated in different files.
- Relational database model: data can be stored in different tables, and the tables can be connected using keys.
- Object-oriented database model: stores not only text, but also sounds, images, and all sorts of media clips.

Database Systems

The big commercial database vendors:
- Oracle
- IBM (with DB2)
- Microsoft (SQL Server)
- Sybase
- Teradata

Some free database systems (Unix):
- Postgres
- MySQL
- Predator

What is a DBMS?

- Need for information management
- A database is a very large, integrated collection of data.
- It models a real-world enterprise:
  - Entities (e.g., students, courses)
  - Relationships
- A Database Management System (DBMS) is a software package designed to store and manage databases.

Why Use a DBMS?

- Data independence and efficient access
- Data integrity and security
- Uniform data administration
- Concurrent access, recovery from crashes
- Replication control
- Reduced application development time

Why Study Databases?

- Shift from computation to information
  - at the low end: access to physical world
  - at the high end: scientific applications
- Datasets increasing in diversity and volume
  - Digital libraries, interactive video, Human Genome project, e-commerce, sensor networks
  - ... need for DBMS/data services exploding
- DBMS encompasses several areas of CS
  - OS, languages, theory, AI, multimedia, logic

Data Models
- A data model is a collection of concepts for describing data.
- A schema is a description of a particular collection of data, using a given data model.
- The relational model of data is the most widely used model today.
  - Main concept: relation, basically a table with rows and columns.
  - Every relation has a schema, which describes the columns, or fields.


Key

- Primary key: a field (or fields) whose value makes every record unique and cannot be null. Ex.: SSO id in Emp_Table in Genpact.
- Foreign key: a field in one table that matches the primary key of another table. Ex.: CoE in COE_Table is the primary key, while the CoE field in Emp_Table is a foreign key.

Key

Emp_Table (primary key: SSO_Id; foreign key: CoE)
SSO_Id   Emp_Name        CoE
123456   David D'Souza   Industrial
234567   Ram Kumar       Analytics
345678   Naveen P        External

COE_Table
COE_ID   COE_Name
ICoE     Industrial
ACoE     Analytics
ECoE     External
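The way keys connect the two tables can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: SQLite stands in for whatever DBMS is actually used, and the CoE column in Emp_Table is assumed to reference COE_Name (declared UNIQUE here), because the sample rows store the CoE name rather than the COE_ID.

```python
import sqlite3

# In-memory SQLite database used purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE COE_Table (
        COE_ID   TEXT PRIMARY KEY,
        COE_Name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE Emp_Table (
        SSO_Id   INTEGER PRIMARY KEY,
        Emp_Name TEXT NOT NULL,
        -- Assumption: CoE holds the CoE name, as in the sample rows.
        CoE      TEXT REFERENCES COE_Table (COE_Name)
    );
    INSERT INTO COE_Table VALUES
        ('ICoE', 'Industrial'), ('ACoE', 'Analytics'), ('ECoE', 'External');
    INSERT INTO Emp_Table VALUES
        (123456, 'David D''Souza', 'Industrial'),
        (234567, 'Ram Kumar', 'Analytics'),
        (345678, 'Naveen P', 'External');
""")

# Follow the foreign key from each employee to the matching CoE row.
rows = conn.execute("""
    SELECT e.SSO_Id, e.Emp_Name, c.COE_ID
    FROM Emp_Table AS e
    JOIN COE_Table AS c ON e.CoE = c.COE_Name
    ORDER BY e.SSO_Id
""").fetchall()

for row in rows:
    print(row)
```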


Example of a Traditional Database Application

Suppose we are building a system to store the information about:
- Employee
- CoE
- Supervisor
- who goes where, who reports to whom

Can we do it without a DBMS ?

Sure we can! Start by storing the data in files:
- Employee.txt
- coe.txt
- supervisor.txt

Now write programs to implement specific tasks.

Doing it without a DBMS...

Add XYZ in Analytics:

Write an algorithm to do the following:
- Read Employee.txt
- Read coe.txt
- Find & update the record XYZ
- Find & update the record Analytics
- Write Employee.txt
- Write coe.txt

Problems without a DBMS...

System crashes:
- Read Employee.txt
- Read coe.txt
- Find & update the record XYZ
- Find & update the record Analytics
- Write Employee.txt
- Write coe.txt
- CRASH!
- What is the problem?

Large data sets (say 50GB):
- Why is this a problem?

Simultaneous access by many users:
- Lock Employee.txt: what is the problem?
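The crash question can be illustrated with a transaction. The sketch below uses Python's sqlite3 module as a stand-in DBMS. The `headcount` column and the `add_employee` helper are hypothetical, and a raised exception stands in for a real crash. The point is that both updates are applied together or not at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT PRIMARY KEY, coe TEXT)")
conn.execute("CREATE TABLE coe (name TEXT PRIMARY KEY, headcount INTEGER)")
conn.execute("INSERT INTO coe VALUES ('Analytics', 10)")
conn.commit()

def add_employee(conn, name, coe_name, crash=False):
    """Insert the employee and bump the CoE headcount as one transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("INSERT INTO employee VALUES (?, ?)", (name, coe_name))
        if crash:
            # Simulated failure between the two updates.
            raise RuntimeError("simulated crash")
        conn.execute(
            "UPDATE coe SET headcount = headcount + 1 WHERE name = ?",
            (coe_name,),
        )

def snapshot(conn):
    """Return (number of employees, Analytics headcount)."""
    employees = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
    headcount = conn.execute(
        "SELECT headcount FROM coe WHERE name = 'Analytics'"
    ).fetchone()[0]
    return employees, headcount

try:
    add_employee(conn, "XYZ", "Analytics", crash=True)
except RuntimeError:
    pass
after_crash = snapshot(conn)   # the half-finished insert was rolled back

add_employee(conn, "XYZ", "Analytics")
after_retry = snapshot(conn)   # both changes applied together

print(after_crash, after_retry)
```

With plain files, the same failure could leave Employee.txt updated and coe.txt stale, and the program would have to detect and repair that itself.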


Enter a DBMS

Two-tier system, or client-server:

Applications <-- connection (ODBC, JDBC) --> Database server (someone else's C program) <--> Data files

Relational Database Management System


What is RDBMS?
RDBMS stands for Relational Database Management System.

- It is a general-purpose software system that facilitates the process of defining, constructing and manipulating a database.
- It helps in defining data types, structures and constraints.
- It provides functions such as:
  - Querying the database to retrieve specific data
  - Updating the database to reflect the changes in the miniworld
  - Generating reports from the data

How is data represented in RDBMS?
- Data is represented by a collection of relations
  - Some cardholders make transactions
  - Every account is sent a statement
- Each relation is depicted as a table
  - Account
- Each row in a table represents a tuple
  - Account Number 1, Account Number 2, etc.
- Attributes of an entity are stored in different columns
  - First Name, Last Name, Credit Line

What is a relational database?
- A relational database is a collection of related data
- Data represents known facts that can be recorded and have an implicit meaning
- Each database consists of various database objects. Some of the database objects are:
  - Tables, Views, Indexes, Partitions
  - Constraints
  - Meta Data

What are the benefits of a Relational Database?
- It provides a mechanism to organize data, which helps in reducing data redundancy
- Each logical data item is stored in one place, allowing a consistent way to store data
- Access to information can be controlled depending upon the role of the user

Popular products: Oracle, IBM DB2, MySQL, Sybase, etc.

Let's learn about the different database objects next.

Database Objects
What is a Table?
- Data is stored in rows and columns
- Rows represent different tuples
- Columns represent attributes of entities
- Ex: Account_Dim in CDCI
  - Columns: Account Number, First Name, Last Name

         Col 1            Col 2        Col 3
         Account Number   First Name   Last Name
Row 1    ABC              George       Bush
Row 2    DBC              Manmohan     Singh

What is a View?
- A view is a single table derived from other table(s) in the database
- Views are dynamically updated whenever the underlying tables are refreshed
- Views are also called virtual tables
- Ex: Account_Dim_Level1, Account_Dim_Level2 in CDCI
  - Views have been used to support field masking in CDCI

What is an Index?
- Pointers to the physical location of the information in the DB
- Defined on one or more fields in the DB
- Used to retrieve or update information faster in the DB
- Ex: Account_Key in Account_Dim

What are Partitions?
- Partitions are a way of dividing tables and indexes into manageable pieces
- Partitions can be spread out across different locations/disks
- Partitions support parallelism
- Types: List, Range, Hash
- Ex: Posting_date in Transaction_fact, Billing_cycle_date in Statement_Fact

[Figure: the Statement Fact table partitioned by Billing Cycle Date into STMT_20060301, STMT_20060302 and STMT_20060303, each holding data for two clients, SAMS CLUB and Wal-Mart]

What are Constraints?
- Constraints are restrictions imposed on data
- Constraints help in enforcing consistency in the representation of data
- Constraints enforce integrity in the data

Types:
- Domain constraint: Account Number can only be of 16 digits
- Key constraint: no 2 account numbers can have the same account key
- Entity integrity: Account Key cannot be null on Account_dim
- Referential integrity: every transaction made by a cardholder needs to be associated with the cardholder's information in Account_dim
- Foreign key constraint: Account Key in transaction fact is a primary key in Account_dim
- NOT NULL constraint: Account number cannot be NULL on Account_dim
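A few of these constraint types can be tried out with Python's sqlite3 module. This is a sketch only: the simplified `account_dim` table and the `try_insert` helper are assumptions made for illustration, and the CHECK below tests only the length of the account number, not that every character is a digit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE account_dim (
        account_key    INTEGER PRIMARY KEY,            -- key constraint
        account_number TEXT NOT NULL                   -- NOT NULL constraint
                       CHECK (length(account_number) = 16)  -- domain constraint
    )
""")

def try_insert(key, number):
    """Attempt an insert and report whether the DBMS accepted it."""
    try:
        with conn:
            conn.execute("INSERT INTO account_dim VALUES (?, ?)", (key, number))
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

results = [
    try_insert(1, "1234567890123456"),   # valid row
    try_insert(2, "12345"),              # violates the domain constraint
    try_insert(1, "6543210987654321"),   # duplicate account_key
    try_insert(3, None),                 # violates NOT NULL
]
row_count = conn.execute("SELECT COUNT(*) FROM account_dim").fetchone()[0]

for result in results:
    print(result)
```

The database itself refuses the bad rows, so every application that writes to the table gets the same checks without repeating them in its own code.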


Meta Data
What is Meta Data?
- Meta data is data about data: information about the data stored in the DB
- Ex: All_tables stores information about every table in the DB
- Ex: All_Indexes stores information about every index available in the DB

[Figures: All Tables output; All Indexes output]
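All_tables and All_Indexes are Oracle catalog views. As a rough analogue that can be run without an Oracle installation, the sketch below queries SQLite's `sqlite_master` catalog through Python's sqlite3 module. The table, the index and the `list_objects` helper are hypothetical examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account_dim (account_key INTEGER PRIMARY KEY, account_number TEXT)"
)
conn.execute("CREATE INDEX idx_account_number ON account_dim (account_number)")

def list_objects(conn, object_type):
    """Read object names of one type from SQLite's catalog table."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = ? ORDER BY name",
        (object_type,),
    )
    return [name for (name,) in rows]

tables = list_objects(conn, "table")
indexes = list_objects(conn, "index")
print(tables, indexes)
```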


Are Spreadsheets Like Databases?


Spreadsheet
- More than one data type can be stored in a spreadsheet column.
- Cells in a spreadsheet can be defined as a formula, making the contents variable depending on other cells.
- A spreadsheet has only the physical row number to make a row unique, and no built-in way to enforce uniqueness of a given spreadsheet row.
- Usually, only one user can have write access to the spreadsheet at any given time; anyone else is locked out, even if the second user is on a different part of the spreadsheet.
- A spreadsheet does not have any built-in transaction-control capabilities, such as ensuring that a group of changes to the sheet is completely applied or not applied at all. The Save button is about the best a spreadsheet can do to simulate transaction control.
- A corrupt spreadsheet cannot usually be repaired; the entire spreadsheet must be restored from a backup, which may have occurred yesterday, last week, or never!

Database
- Usually, only one data type can be stored in a database table column.
- Columns in a database table hold fixed values, not formulas.
- Single rows of a database table are uniquely identified by a unique value (typically a primary key).
- Multiple users can access a database table at the same time, with various combinations of read and write capabilities in different parts of the database.
- A database usually has transaction-control capabilities, making it possible to roll back a change if something happened to prevent it from completing successfully (such as a power failure).
- There are many tools for repairing and recovering databases.

Things to Remember

- Database Management System = DBMS
- Relational DBMS = RDBMS

Additional Information

SQL for Web Nerds, by Philip Greenspun, http://philip.greenspun.com/sql/


Database Assignments

- What is the difference between DBMS & RDBMS?
- What is a Primary Key in any database table? Can a Primary Key be null?
- What is Metadata?
- How do you differentiate between a View and a table?
- A row of a table represents a tuple. Is it true?
- What is the advantage of creating an index on a table?
- If Account_Schdl_No is the Primary Key in table Account, it can store a null value as well. Is this statement true?

History of SQL

An influential paper, "A Relational Model of Data for Large Shared Data Banks", by Dr. Edgar F. Codd, was published in June, 1970 in the Association for Computing Machinery (ACM) journal, Communications of the ACM, although drafts of it were circulated internally within IBM in 1969. Codd's model became widely accepted as the definitive model for relational database management systems (RDBMS or RDMS).


During the 1970s, a group at IBM's San Jose research center developed a database system "System R" based upon, but not strictly faithful to, Codd's model. Structured English Query Language ("SEQUEL") was designed to manipulate and retrieve data stored in System R. The acronym SEQUEL was later condensed to SQL because the word 'SEQUEL' was held as a trademark by the Hawker-Siddeley aircraft company of the UK. Although SQL was influenced by Codd's work, Donald D. Chamberlin and Raymond F. Boyce at IBM were the authors of the SEQUEL language design.[1] Their concepts were published to increase interest in SQL.


The first non-commercial, relational, non-SQL database, Ingres, was developed in 1974 at U.C. Berkeley. In 1978, methodical testing commenced at customer test sites. Demonstrating both the usefulness and practicality of the system, this testing proved to be a success for IBM. As a result, IBM began to develop commercial products based on their System R prototype that implemented SQL, including the System/38 (announced in 1978 and commercially available in August 1979), SQL/DS (introduced in 1981), and DB2 (in 1983).[1]


At the same time Relational Software, Inc. (now Oracle Corporation) saw the potential of the concepts described by Chamberlin and Boyce and developed their own version of a RDBMS for the Navy, CIA and others. In the summer of 1979 Relational Software, Inc. introduced Oracle V2 (Version2) for VAX computers as the first commercially available implementation of SQL. Oracle is often incorrectly cited as beating IBM to market by two years, when in fact they only beat IBM's release of the System/38 by a few weeks. Considerable public interest then developed; soon many other vendors developed versions, and Oracle's future was ensured.


Standardization

SQL was adopted as a standard by ANSI (American National Standards Institute) in 1986 and ISO (International Organization for Standardization) in 1987. ANSI has declared that the official pronunciation for SQL is "ess-cue-el", although many English-speaking database professionals still pronounce it as "sequel".

Year   Name       Alias    Comments
1986   SQL-86     SQL-87   First published by ANSI. Ratified by ISO in 1987.
1989   SQL-89              Minor revision.
1992   SQL-92     SQL2     Major revision.
1999   SQL:1999   SQL3     Added regular expression matching, recursive queries, triggers, non-scalar types and some object-oriented features. (The last two are somewhat controversial and not yet widely supported.)
2003   SQL:2003            Introduced XML-related features, window functions, standardized sequences and columns with auto-generated values (including identity columns).

What is SQL?

SQL is a syntax for querying and manipulating relational databases. It was originally known as SEQUEL (Structured English Query Language), but this was shortened to SQL due to a trademark dispute.

You can use SQL to read data from a database. Such queries can be quite sophisticated: you can choose which columns of the table to extract, you can use conditional expressions to decide which rows to extract, you can sort the result, and you can limit the number of rows returned. It is also possible to "join" tables, that is, to retrieve data from multiple related tables in a single query.

SQL also allows you to insert and modify records in a table. In that sense, the term "query" is something of a misnomer. You can also create, modify or delete entire tables within the database using SQL queries.

SQL stands for Structured Query Language. It is the most commonly used relational database language today. SQL works with a variety of different fourth-generation (4GL) programming languages, such as Visual Basic.


SQL is used for:
- Data Manipulation
- Data Definition
- Data Administration

All are expressed as an SQL statement or command.

- Represent all info in the database as tables
- Keep the logical representation of data independent from its physical storage characteristics
- Use one high-level language for structuring, querying, and changing info in the database
- Support the main relational operations
- Support alternate ways of looking at data in tables
- Provide a method for differentiating between unknown values and nulls (zero or blank)
- Support mechanisms for integrity, authorization, transactions, and recovery

SQL commands can be divided into two main sublanguages. The Data Definition Language (DDL) contains the commands used to create and destroy databases and database objects. After the database structure is defined with DDL, database administrators and users can utilize the Data Manipulation Language to insert, retrieve and modify the data contained within it.

Topic: Data Definition Language

The Data Definition Language (DDL) is used to create and destroy databases and database objects. These commands will primarily be used by database administrators during the setup and removal phases of a database project. Let's take a look at the structure and usage of the basic DDL commands.

SQL Data Definition Language (DDL)

The Data Definition Language (DDL) part of SQL permits database tables to be created or deleted. We can also define indexes (keys), specify links between tables, and impose constraints between database tables. The most important DDL statements in SQL are:
- CREATE TABLE - creates a new database table
- ALTER TABLE - alters (changes) a database table
- DROP TABLE - deletes a database table
- CREATE INDEX - creates an index (search key)
- DROP INDEX - deletes an index

Topic: Create Table

CREATE

The CREATE command is used to create database objects such as tables. The command:

CREATE TABLE personal_info (first_name char(20) not null, last_name char(20) not null, employee_id int not null)

establishes a table titled "personal_info" in the current database. In our example, the table contains three attributes: first_name, last_name and employee_id.
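As a quick check of what the CREATE TABLE statement for personal_info does, the following sketch runs it against an in-memory SQLite database through Python's sqlite3 module. SQLite is an assumption here and the inserted rows are made-up sample data; other database systems may treat the char(20) and int types differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE personal_info (
        first_name  char(20) NOT NULL,
        last_name   char(20) NOT NULL,
        employee_id int      NOT NULL
    )
""")

# PRAGMA table_info lists one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(personal_info)")]

conn.execute("INSERT INTO personal_info VALUES ('Jane', 'Doe', 1)")

# Leaving out last_name violates its NOT NULL declaration.
try:
    conn.execute(
        "INSERT INTO personal_info (first_name, employee_id) VALUES ('John', 2)"
    )
    missing_name_rejected = False
except sqlite3.IntegrityError:
    missing_name_rejected = True

print(columns, missing_name_rejected)
```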


Topic: Alter Table

Once you've created a table within a database, you may wish to modify the definition of it. The ALTER command allows you to make changes to the structure of a table without deleting and recreating it. Take a look at the following command:

ALTER TABLE personal_info ADD salary money null

This example adds a new attribute to the personal_info table -- an employee's salary. The "money" argument specifies that an employee's salary will be stored using a dollars and cents format. Finally, the "null" keyword tells the database that it's OK for this field to contain no value for any given employee.
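The effect of adding a nullable column can be sketched with Python's sqlite3 module. Two assumptions apply: SQLite stands in for the target DBMS, and because "money" is a vendor-specific type, the sketch declares the column as NUMERIC and leaves out the explicit "null" keyword. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE personal_info (
        first_name  char(20) NOT NULL,
        last_name   char(20) NOT NULL,
        employee_id int      NOT NULL
    )
""")
conn.execute("INSERT INTO personal_info VALUES ('Jane', 'Doe', 1)")

# Adapted from the ALTER TABLE example: NUMERIC replaces the "money" type.
conn.execute("ALTER TABLE personal_info ADD COLUMN salary NUMERIC")

# New rows can supply a salary; the existing row simply has none.
conn.execute("INSERT INTO personal_info VALUES ('John', 'Smith', 2, 325.50)")
rows = conn.execute(
    "SELECT first_name, salary FROM personal_info ORDER BY employee_id"
).fetchall()
print(rows)
```

The row inserted before the ALTER keeps a NULL salary (None in Python), which is what allowing nulls on the new column permits.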


Topic: Drop Table

The final command of the Data Definition Language, DROP, allows us to remove entire database objects from our DBMS. For example, if we want to permanently remove the personal_info table that we created, we'd use the following command:

DROP TABLE personal_info

Similarly, the command below would be used to remove the entire employees database:

DROP DATABASE employees

Use this command with care! Remember that the DROP command removes entire data structures from your database. If you want to remove individual records, use the DELETE command of the Data Manipulation Language.

Topic: Create Table

Tables are the basic structure where data is stored in the database. Given that in most cases, there is no way for the database vendor to know ahead of time what your data storage needs are, chances are that you will need to create tables in the database yourself. Many database tools allow you to create tables without writing SQL, but given that tables are the container of all the data, it is important to include the CREATE TABLE syntax in this tutorial.


Before we dive into the SQL syntax for CREATE TABLE, it is a good idea to understand what goes into a table. Tables are divided into rows and columns. Each row represents one piece of data, and each column can be thought of as representing a component of that piece of data. So, for example, if we have a table for recording customer information, then the columns may include information such as First Name, Last Name, Address, City, Country, Birth Date, and so on. As a result, when we specify a table, we include the column headers and the data types for that particular column.


So what are data types? Typically, data comes in a variety of forms. It could be an integer (such as 1), a real number (such as 0.55), a string (such as 'sql'), a date/time expression (such as '2000-JAN-25 03:22:22'), or even in binary format. When we specify a table, we need to specify the data type associated with each column (i.e., we will specify that 'First Name' is of type char(50) - meaning it is a string with 50 characters). One thing to note is that different relational databases allow for different data types, so it is wise to consult with a database-specific reference first.


The SQL syntax for CREATE TABLE is

CREATE TABLE "table_name" ("column 1" "data_type_for_column_1", "column 2" "data_type_for_column_2", ... )

So, if we are to create the customer table specified as above, we would type in

CREATE TABLE customer (First_Name char(50), Last_Name char(50), Address char(50), City char(50), Country char(25), Birth_Date date)

Topic: SQL View

Views can be considered as virtual tables. Generally speaking, a table has a set of definitions, and it physically stores the data. A view also has a set of definitions, which is built on top of table(s) or other view(s), and it does not physically store the data. The syntax for creating a view is as follows:

CREATE VIEW "VIEW_NAME" AS "SQL Statement"

"SQL Statement" is a SELECT query that defines what the view shows.

Let's use a simple example to illustrate. Say we have the following table:

TABLE Customer (First_Name char(50), Last_Name char(50), Address char(50), City char(50), Country char(25), Birth_Date date)

and we want to create a view called V_Customer that contains only the First_Name, Last_Name, and Country columns from this table. We would type in:

CREATE VIEW V_Customer AS SELECT First_Name, Last_Name, Country FROM Customer

Now we have a view called V_Customer with the following structure:

View V_Customer (First_Name char(50), Last_Name char(50), Country char(25))

We can also use a view to apply joins to two tables. In this case, users only see one view rather than two tables, and the SQL statement users need to issue becomes much simpler. Let's say we have the following two tables:

Table Store_Information
store_name    Sales    Date
Los Angeles   $1,500   Jan-05-1999
San Diego     $250     Jan-07-1999
Los Angeles   $300     Jan-08-1999
Boston        $700     Jan-08-1999

Table Geography
region_name   store_name
East          Boston
East          New York
West          Los Angeles
West          San Diego

and we want to build a view that has sales by region information. We would issue the following SQL statement:

CREATE VIEW V_REGION_SALES AS
SELECT A1.region_name REGION, SUM(A2.Sales) SALES
FROM Geography A1, Store_Information A2
WHERE A1.store_name = A2.store_name
GROUP BY A1.region_name

This gives us a view, V_REGION_SALES, that has been defined to show sales by region. If we want to find out the content of this view, we type in:

SELECT * FROM V_REGION_SALES
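The V_REGION_SALES view can be tried end to end with Python's sqlite3 module. In this sketch SQLite is an assumed stand-in DBMS, sales are stored as plain integers, and the extra New York row added at the end is hypothetical. It is there to show that a view is recomputed from its base tables rather than stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Store_Information (store_name TEXT, Sales INTEGER, Date TEXT);
    CREATE TABLE Geography (region_name TEXT, store_name TEXT);

    INSERT INTO Store_Information VALUES
        ('Los Angeles', 1500, 'Jan-05-1999'),
        ('San Diego',    250, 'Jan-07-1999'),
        ('Los Angeles',  300, 'Jan-08-1999'),
        ('Boston',       700, 'Jan-08-1999');
    INSERT INTO Geography VALUES
        ('East', 'Boston'), ('East', 'New York'),
        ('West', 'Los Angeles'), ('West', 'San Diego');

    CREATE VIEW V_REGION_SALES AS
    SELECT A1.region_name AS REGION, SUM(A2.Sales) AS SALES
    FROM Geography A1, Store_Information A2
    WHERE A1.store_name = A2.store_name
    GROUP BY A1.region_name;
""")

query = "SELECT * FROM V_REGION_SALES ORDER BY REGION"
before = conn.execute(query).fetchall()

# Hypothetical new sale: the view reflects it without being redefined.
conn.execute("INSERT INTO Store_Information VALUES ('New York', 100, 'Jan-09-1999')")
after = conn.execute(query).fetchall()

print(before)
print(after)
```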

Topic: Create Index

Indexes help us retrieve data from tables more quickly. Let's use an example to illustrate this point: Say we are interested in reading about how to grow peppers in a gardening book. Instead of reading the book from the beginning until we find a section on peppers, it is much quicker for us to go to the index section at the end of the book, locate which pages contain information on peppers, and then go to these pages directly. Going to the index first saves us time and is by far a more efficient method for locating the information we need.


The same principle applies for retrieving data from a database table. Without an index, the database system reads through the entire table (this process is called a 'table scan') to locate the desired information. With the proper index in place, the database system can then first go through the index to find out where to retrieve the data, and then go to these locations directly to get the needed data. This is much faster.



Therefore, it is often desirable to create indexes on tables. An index can cover one or more columns. The general syntax for creating an index is:

CREATE INDEX "INDEX_NAME" ON "TABLE_NAME" (COLUMN_NAME)

Let's assume that we have the following table,

TABLE Customer (First_Name char(50), Last_Name char(50), Address char(50), City char(50), Country char(25), Birth_Date date)

and we want to create an index on the column Last_Name. We would type in:

CREATE INDEX IDX_CUSTOMER_LAST_NAME on CUSTOMER (Last_Name)

If we want to create an index on both City and Country, we would type in:

CREATE INDEX IDX_CUSTOMER_LOCATION on CUSTOMER (City, Country)

There is no strict rule on how to name an index. The generally accepted method is to place a prefix, such as "IDX_", before an index name to avoid confusion with other database objects. It is also a good idea to provide information on which table and column(s) the index is used on.
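The difference between a table scan and an index lookup can be observed with SQLite's EXPLAIN QUERY PLAN, run here through Python's sqlite3 module. SQLite is an assumption, the `plan` helper is illustrative, and the exact wording of the plan text differs between SQLite versions, so the sketch only prints it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer (
        First_Name char(50), Last_Name char(50), Address char(50),
        City char(50), Country char(25), Birth_Date date
    )
""")

query = "SELECT First_Name, City FROM customer WHERE Last_Name = 'Smith'"

def plan(conn, sql):
    """Return the human-readable plan text (the last column of each row)."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

plan_before = plan(conn, query)   # expected to describe a full table scan

conn.execute("CREATE INDEX IDX_CUSTOMER_LAST_NAME ON customer (Last_Name)")
plan_after = plan(conn, query)    # expected to mention the new index

print(plan_before)
print(plan_after)
```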


Topic: Alter Table


Once a table is created in the database, there are many occasions where one may wish to change the structure of the table. Typical cases include the following:
- Add a column
- Drop a column
- Change a column name
- Change the data type for a column

Please note that the above is not an exhaustive list. There are other instances where ALTER TABLE is used to change the table structure, such as changing the primary key specification.


The SQL syntax for ALTER TABLE is

ALTER TABLE "table_name" [alter specification]

[alter specification] is dependent on the type of alteration we wish to perform. For the uses cited above, the [alter specification] statements are:
- Add a column: ADD "column 1" "data type for column 1"
- Drop a column: DROP "column 1"
- Change a column name: CHANGE "old column name" "new column name" "data type for new column name"
- Change the data type for a column: MODIFY "column 1" "new data type"

Note that the exact keywords vary between database systems; the CHANGE and MODIFY forms shown here follow MySQL's syntax.


Let's run through examples for each one of the above, using the "customer" table created in the CREATE TABLE section:

Table customer
Column Name   Data Type
First_Name    char(50)
Last_Name     char(50)
Address       char(50)
City          char(50)
Country       char(25)
Birth_Date    date

First, we want to add a column called "Gender" to this table. To do this, we key in:

ALTER table customer add Gender char(1)

Resulting table structure:

Table customer
Column Name   Data Type
First_Name    char(50)
Last_Name     char(50)
Address       char(50)
City          char(50)
Country       char(25)
Birth_Date    date
Gender        char(1)


Next, we want to rename "Address" to "Addr". To do this, we key in:

ALTER table customer change Address Addr char(50)

Resulting table structure:

Table customer
Column Name   Data Type
First_Name    char(50)
Last_Name     char(50)
Addr          char(50)
City          char(50)
Country       char(25)
Birth_Date    date
Gender        char(1)


Then, we want to change the data type for "Addr" to char(30). To do this, we key in:

ALTER table customer modify Addr char(30)

Resulting table structure:

Table customer
Column Name   Data Type
First_Name    char(50)
Last_Name     char(50)
Addr          char(30)
City          char(50)
Country       char(25)
Birth_Date    date
Gender        char(1)


Finally, we want to drop the column "Gender". To do this, we key in:

ALTER table customer drop Gender

Resulting table structure:

Table customer
Column Name   Data Type
First_Name    char(50)
Last_Name     char(50)
Addr          char(30)
City          char(50)
Country       char(25)
Birth_Date    date

Topic: Primary Key


A primary key is used to uniquely identify each row in a table. It can either be part of the actual record itself, or it can be an artificial field (one that has nothing to do with the actual record). A primary key can consist of one or more fields on a table. When multiple fields are used as a primary key, they are called a composite key. Primary keys can be specified either when the table is created (using CREATE TABLE) or by changing the existing table structure (using ALTER TABLE).


Below is an example of specifying a primary key when creating a table:

CREATE TABLE Customer (SID integer PRIMARY KEY, Last_Name varchar(30), First_Name varchar(30));

Below is an example of specifying a primary key by altering a table:

ALTER TABLE Customer ADD PRIMARY KEY (SID);
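What a primary key buys in practice is that the DBMS refuses a second row with the same key. A small sketch, assuming SQLite through Python's sqlite3 module and made-up customer rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer (SID integer PRIMARY KEY, "
    "Last_Name varchar(30), First_Name varchar(30))"
)
conn.execute("INSERT INTO Customer VALUES (1, 'Smith', 'John')")

# A second row with the same SID violates the primary key.
try:
    conn.execute("INSERT INTO Customer VALUES (1, 'Doe', 'Jane')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Customer").fetchone()[0]
print(duplicate_rejected, row_count)
```

Only the CREATE TABLE form is exercised here; support for adding a primary key through ALTER TABLE depends on the database system.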


Topic: Foreign Key


A foreign key is a field (or fields) that points to the primary key of another table. The purpose of the foreign key is to ensure referential integrity of the data. In other words, only values that are supposed to appear in the database are permitted. For example, say we have two tables, a CUSTOMER table that includes all customer data, and an ORDERS table that includes all customer orders. The constraint here is that all orders must be associated with a customer that is already in the CUSTOMER table. In this case, we will place a foreign key on the ORDERS table and have it relate to the primary key of the CUSTOMER table. This way, we can ensure that all orders in the ORDERS table are related to a customer in the CUSTOMER table. In other words, the ORDERS table cannot contain information on a customer that is not in the CUSTOMER table.



The structure of these two tables will be as follows:

Table CUSTOMER
column name    characteristic
SID            Primary Key
Last_Name
First_Name

Table ORDERS
column name    characteristic
Order_ID       Primary Key
Order_Date
Customer_SID   Foreign Key
Amount

In the above example, the Customer_SID column in the ORDERS table is a foreign key pointing to the SID column in the CUSTOMER table.


Below is an example of specifying the foreign key when creating the ORDERS table:

CREATE TABLE ORDERS (Order_ID integer primary key, Order_Date date, Customer_SID integer references CUSTOMER(SID), Amount double);

Below is an example of specifying a foreign key by altering a table. This assumes that the ORDERS table has been created, and the foreign key has not yet been put in:

ALTER TABLE ORDERS ADD CONSTRAINT fk_orders1 FOREIGN KEY (Customer_SID) REFERENCES CUSTOMER(SID);
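Referential integrity can be seen in action with Python's sqlite3 module. This is a sketch under assumptions: SQLite is used as the DBMS, where foreign key enforcement has to be switched on per connection with a PRAGMA, and the sample customer and orders are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute(
    "CREATE TABLE CUSTOMER (SID integer PRIMARY KEY, "
    "Last_Name varchar(30), First_Name varchar(30))"
)
conn.execute(
    "CREATE TABLE ORDERS (Order_ID integer PRIMARY KEY, Order_Date date, "
    "Customer_SID integer REFERENCES CUSTOMER(SID), Amount double)"
)

conn.execute("INSERT INTO CUSTOMER VALUES (1, 'Smith', 'John')")
conn.execute("INSERT INTO ORDERS VALUES (100, '1999-01-05', 1, 250.0)")

# Customer 99 does not exist, so this order would be an orphan.
try:
    conn.execute("INSERT INTO ORDERS VALUES (101, '1999-01-06', 99, 75.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM ORDERS").fetchone()[0]
print(orphan_rejected, order_count)
```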


Topic: Drop Table


Sometimes we may decide that we need to get rid of a table in the database for some reason. In fact, it would be problematic if we could not do so, because this could create a maintenance nightmare for the DBAs. Fortunately, SQL allows us to do it, as we can use the DROP TABLE command. The syntax for DROP TABLE is

DROP TABLE "table_name"

So, if we wanted to drop the table called customer that we created in the CREATE TABLE section, we simply type

DROP TABLE customer

Topic: Truncate Table


Sometimes we wish to get rid of all the data in a table. One way of doing this is with DROP TABLE, which we saw in the last section. But what if we wish to simply get rid of the data but not the table itself? For this, we can use the TRUNCATE TABLE command. The syntax for TRUNCATE TABLE is TRUNCATE TABLE "table_name" So, if we wanted to truncate the table called customer that we created in SQL CREATE, we simply type, TRUNCATE TABLE customer


Data Modification Language (DML)

SELECT Statement
The SELECT statement is used to query the database and retrieve the data that match the criteria you specify. The SELECT statement has five main clauses to choose from, although FROM is the only required clause. Each of the clauses has a wide selection of options and parameters. The clauses are listed below, and each of them is covered in more detail later in the tutorial.

Here is the format of the SELECT statement:


SELECT [ALL | DISTINCT] column1[,column2]
FROM table1[,table2]
[WHERE "conditions"]
[GROUP BY "column-list"]
[HAVING "conditions"]
[ORDER BY "column-list" [ASC | DESC] ]


Example 1.1:

SELECT name, age, salary FROM employee WHERE age > 50;

The above statement will select the values in the name, age, and salary columns from every row of the employee table whose age is greater than 50.

Note: Remember to put a semicolon at the end of your SQL statements. The ; indicates that your SQL statement is complete and is ready to be interpreted.

Comparison Operators
=           Equal
>           Greater than
<           Less than
>=          Greater than or equal to
<=          Less than or equal to
<> or !=    Not equal to
LIKE        String comparison test

Example: Using Comparison Operators


SELECT name, title, dept FROM employee WHERE title LIKE 'Pro%';

The above statement will select the values in the name, title, and dept columns from every row of the employee table whose title starts with 'Pro'. This may return job titles including Programmer or Pro-wrestler.
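How the position of the % wildcard changes the match can be shown with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in DBMS and a hypothetical employee table with made-up rows. Note that SQLite's LIKE ignores letter case for ASCII characters by default, which other systems may not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, title TEXT, dept TEXT)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    ("Ann", "Programmer", "IT"),
    ("Bob", "Project Lead", "IT"),
    ("Cy", "Sales Rep", "Sales"),
    ("Dee", "Vice Provost", "Admin"),
])

# 'Pro%' : the title must START with Pro.
starts_with = [r[0] for r in conn.execute(
    "SELECT name FROM employee WHERE title LIKE 'Pro%' ORDER BY name")]

# '%Pro%' : Pro may appear ANYWHERE in the title.
contains = [r[0] for r in conn.execute(
    "SELECT name FROM employee WHERE title LIKE '%Pro%' ORDER BY name")]

print(starts_with)  # ['Ann', 'Bob']
print(contains)     # ['Ann', 'Bob', 'Dee']
```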


USING ALL AND DISTINCT


ALL and DISTINCT are keywords used to select either all records (the default) or only the distinct, that is unique, records in your query results. If you would like to retrieve just the unique records in specified columns, you can use the DISTINCT keyword. DISTINCT discards the duplicate records for the columns you specify after the SELECT keyword.

Example:

SELECT DISTINCT age FROM employee_info;

This statement will return all of the unique ages in the employee_info table. ALL will display all of the specified columns including all of the duplicates. The ALL keyword is the default if nothing is specified.
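The effect of the two keywords can be compared side by side in a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in DBMS, and the employee_info rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee_info (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO employee_info VALUES (?, ?)", [
    ("Ann", 30), ("Bob", 45), ("Cy", 30), ("Dee", 45), ("Eve", 28),
])

# ALL (the default) keeps duplicate ages.
all_ages = [r[0] for r in conn.execute(
    "SELECT ALL age FROM employee_info ORDER BY age")]

# DISTINCT returns each age once.
distinct_ages = [r[0] for r in conn.execute(
    "SELECT DISTINCT age FROM employee_info ORDER BY age")]

print(all_ages)       # [28, 30, 30, 45, 45]
print(distinct_ages)  # [28, 30, 45]
```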

Note: The following two tables will be used throughout this course. It is recommended to have them open in another window or print them out


Assignment 1:
1. From the items_ordered table, select a list of all items purchased for customerid 10449. Display the customerid, item, and price for this customer.
2. Select all columns from the items_ordered table for whoever purchased a Tent.
3. Select the customerid, order_date, and item values from the items_ordered table for any items in the item column that start with the letter "S".
4. Select the distinct items in the items_ordered table. In other words, display a listing of each of the unique items from the items_ordered table.
5. Make up your own select statements and submit them.

Aggregate Functions:
MIN         returns the smallest value in a given column
MAX         returns the largest value in a given column
SUM         returns the sum of the numeric values in a given column
AVG         returns the average value of a given column
COUNT       returns the total number of values in a given column
COUNT(*)    returns the number of rows in a table

Aggregate functions are used to compute against a "returned column of numeric data" from your SELECT statement. They basically summarize the results of a particular column of selected data. We are covering these here since they are required by the next topic, "GROUP BY". Although they are required for the "GROUP BY" clause, these functions can be used without the "GROUP BY" clause.


Examples:

SELECT AVG(salary) FROM employee;

This statement will return a single result which contains the average value of everything returned in the salary column from the employee table.

SELECT AVG(salary) FROM employee WHERE title = 'Programmer';

This statement will return the average salary for all employees whose title is equal to 'Programmer'.

SELECT COUNT(*) FROM employees;

This particular statement is slightly different from the other aggregate functions since there isn't a column supplied to the count function. This statement will return the number of rows in the employees table.
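The difference between COUNT on a column and COUNT(*) shows up as soon as a column contains a NULL. The sketch below assumes Python's built-in sqlite3 module as a stand-in DBMS and a hypothetical employee table in which one salary is missing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, title TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    ("Ann", "Programmer", 40000),
    ("Bob", "Programmer", 60000),
    ("Cy", "Clerk", None),  # salary not recorded (NULL)
])

# COUNT(*) counts rows; COUNT(salary) and AVG(salary) skip the NULL.
total_rows, salaries, avg_salary = conn.execute(
    "SELECT COUNT(*), COUNT(salary), AVG(salary) FROM employee").fetchone()

lowest, highest, total = conn.execute(
    "SELECT MIN(salary), MAX(salary), SUM(salary) FROM employee").fetchone()

print(total_rows, salaries, avg_salary)  # 3 2 50000.0
print(lowest, highest, total)            # 40000 60000 100000
```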


Assignment 2:
1. Select the maximum price of any item ordered in the items_ordered table. Hint: Select the maximum price only.
2. Select the average price of all of the items ordered that were purchased in the month of Dec.
3. What is the total number of rows in the items_ordered table?
4. For all of the tents that were ordered in the items_ordered table, what is the price of the lowest-priced tent? Hint: Your query should return the price only.

GROUP BY clause:
The GROUP BY clause gathers together all of the rows that share the same value in the specified column(s) and allows aggregate functions to be computed for each group.

GROUP BY clause syntax:

SELECT column1, SUM(column2)
FROM "list-of-tables"
GROUP BY "column-list";

Examples:
Let's say you would like to retrieve a list of the highest salaries paid in each dept:

SELECT max(salary), dept FROM employee GROUP BY dept;

This statement will select the maximum salary for the people in each unique department. Basically, the salary of the person who makes the most in each department will be displayed. Their salary and their department will be returned.

Let's say you want to group everything of quantity 1 together, everything of quantity 2 together, everything of quantity 3 together, etc. If you would like to determine what the largest cost item is for each grouped quantity (all quantity 1's, all quantity 2's, all quantity 3's, etc.), you would enter:

SELECT quantity, max(price) FROM items_ordered GROUP BY quantity;

Enter the statement above, and take a look at the results to see if it returned what you were expecting. Verify that the maximum price in each quantity group is really the maximum price.
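The per-department grouping described above can be run on a few rows to see one result row per group. This is a minimal sketch assuming Python's built-in sqlite3 module as a stand-in DBMS; the employee rows are hypothetical, and COUNT(*) and ORDER BY are added only to make the groups easy to read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    ("Ann", "IT", 50000),
    ("Bob", "IT", 65000),
    ("Cy", "Sales", 40000),
    ("Dee", "Sales", 42000),
    ("Eve", "HR", 38000),
])

# One output row per department: its top salary and its head count.
per_dept = conn.execute(
    "SELECT dept, MAX(salary), COUNT(*) FROM employee "
    "GROUP BY dept ORDER BY dept").fetchall()

print(per_dept)  # [('HR', 38000, 1), ('IT', 65000, 2), ('Sales', 42000, 2)]
```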

Assignment 3:
1. How many people are in each unique state in the customers table? Select the state and display the number of people in each. Hint: count is used to count rows in a column, sum works on numeric data only.
2. From the items_ordered table, select the item, maximum price, and minimum price for each specific item in the table. Hint: The items will need to be broken up into separate groups.
3. How many orders did each customer make? Use the items_ordered table. Select the customerid, number of orders they made, and the sum of their orders.

HAVING clause:
The HAVING clause allows you to specify conditions on each group. In other words, the conditions you specify determine which groups appear in the result. The HAVING clause should follow the GROUP BY clause if you are going to use it.

HAVING clause syntax:

SELECT column1, SUM(column2)
FROM "list-of-tables"
GROUP BY "column-list"
HAVING "condition";


Examples:
Let's say you have an employee table containing the employee's name, department, salary, and age. If you would like to select the average salary of the employees in each department, you could enter:

SELECT dept, avg(salary) FROM employee GROUP BY dept;

But let's say that you want to display only the departments whose average salary is over 20000:

SELECT dept, avg(salary) FROM employee GROUP BY dept HAVING avg(salary) > 20000;
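Because HAVING filters groups after aggregation while WHERE filters rows before it, the two can give different averages. The sketch below contrasts them, assuming Python's built-in sqlite3 module as a stand-in DBMS and hypothetical employee rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    ("Ann", "IT", 30000),
    ("Bob", "IT", 18000),
    ("Cy", "Sales", 15000),
    ("Dee", "Sales", 19000),
    ("Eve", "HR", 25000),
])

# HAVING: average every department first, then keep averages over 20000.
having_rows = conn.execute(
    "SELECT dept, AVG(salary) FROM employee "
    "GROUP BY dept HAVING AVG(salary) > 20000 ORDER BY dept").fetchall()

# WHERE: drop low-paid employees first, then average whoever is left.
where_rows = conn.execute(
    "SELECT dept, AVG(salary) FROM employee "
    "WHERE salary > 20000 GROUP BY dept ORDER BY dept").fetchall()

print(having_rows)  # [('HR', 25000.0), ('IT', 24000.0)]
print(where_rows)   # [('HR', 25000.0), ('IT', 30000.0)]
```

In the WHERE version Bob's 18000 is removed before averaging, so the IT average rises to 30000.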


ORDER BY clause:
ORDER BY is an optional clause which allows you to display the results of your query in sorted order (either ascending or descending) based on the columns that you specify.

ORDER BY clause syntax:

SELECT column1, SUM(column2)
FROM "list-of-tables"
ORDER BY "column-list" [ASC | DESC];

[ ] = optional
ASC = Ascending Order (default)
DESC = Descending Order


Examples:
This statement will select the employee_id, dept, name, age, and salary from the employee_info table where the dept equals 'Sales' and will list the results in ascending (default) order based on their salary:

SELECT employee_id, dept, name, age, salary FROM employee_info WHERE dept = 'Sales' ORDER BY salary;

If you would like to order based on multiple columns, you must separate the columns with commas. ASC or DESC applies only to the column it follows, so the statement below sorts by salary in ascending order and, among rows with equal salary, by age in descending order:

SELECT employee_id, dept, name, age, salary FROM employee_info WHERE dept = 'Sales' ORDER BY salary, age DESC;
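Sorting on several columns is easiest to follow on a few rows. This is a minimal sketch assuming Python's built-in sqlite3 module as a stand-in DBMS; the employee_info rows are hypothetical and two of them deliberately share a salary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee_info (employee_id INTEGER, dept TEXT, name TEXT, "
    "age INTEGER, salary INTEGER)")
conn.executemany("INSERT INTO employee_info VALUES (?, ?, ?, ?, ?)", [
    (1, "Sales", "Ann", 30, 40000),
    (2, "Sales", "Bob", 45, 40000),
    (3, "Sales", "Cy", 28, 35000),
])

# salary ascending (default); ties on salary are broken by age, oldest first.
mixed = [r[0] for r in conn.execute(
    "SELECT name FROM employee_info WHERE dept = 'Sales' "
    "ORDER BY salary, age DESC")]

# DESC must be repeated for every column that should sort downward.
both_desc = [r[0] for r in conn.execute(
    "SELECT name FROM employee_info WHERE dept = 'Sales' "
    "ORDER BY salary DESC, age DESC")]

print(mixed)      # ['Cy', 'Bob', 'Ann']
print(both_desc)  # ['Bob', 'Ann', 'Cy']
```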


Assignment 4:
1. Select the lastname, firstname, and city for all customers in the customers table. Display the results in ascending order based on the lastname.
2. Same thing as exercise #1, but display the results in descending order.
3. Select the item and price for all of the items in the items_ordered table whose price is greater than 10.00. Display the results in ascending order based on the price.

Combining conditions and Boolean Operators:

The AND operator can be used to join two or more conditions in the WHERE clause. Both sides of the AND condition must be true in order for the condition to be met and for those rows to be displayed.

SELECT column1, SUM(column2)
FROM "list-of-tables"
WHERE "condition1" AND "condition2";

The OR operator can also be used to join two or more conditions in the WHERE clause. With the OR operator, the condition is met, and the rows are displayed, if either side is true or both sides are true.


Examples:

SELECT employeeid, firstname, lastname, title, salary FROM employee_info WHERE salary >= 50000.00 AND title = 'Programmer';

This statement will select the employeeid, firstname, lastname, title, and salary from the employee_info table where the salary is greater than or equal to 50000.00 AND the title is equal to 'Programmer'. Both of these conditions must be true in order for the rows to be returned in the query. If either is false, the row will not be displayed.

Although they are not required, you can use parentheses around your conditional expressions to make them easier to read:

SELECT employeeid, firstname, lastname, title, salary FROM employee_info WHERE (salary >= 50000.00) AND (title = 'Programmer');

SELECT firstname, lastname, title, salary FROM employee_info WHERE (title = 'Sales') OR (title = 'Programmer');

This statement will select the firstname, lastname, title, and salary from the employee_info table where the title is either equal to 'Sales' OR the title is equal to 'Programmer'.
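Parentheses stop being optional once AND and OR are mixed, because AND is evaluated before OR. The sketch below shows the two readings, assuming Python's built-in sqlite3 module as a stand-in DBMS and hypothetical employee_info rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee_info (firstname TEXT, title TEXT, salary REAL)")
conn.executemany("INSERT INTO employee_info VALUES (?, ?, ?)", [
    ("Ann", "Programmer", 60000.00),
    ("Bob", "Programmer", 40000.00),
    ("Cy", "Sales", 40000.00),
    ("Dee", "Sales", 70000.00),
])

# Without parentheses: title = 'Sales' OR (title = 'Programmer' AND salary >= 50000)
unparenthesized = [r[0] for r in conn.execute(
    "SELECT firstname FROM employee_info "
    "WHERE title = 'Sales' OR title = 'Programmer' AND salary >= 50000.00 "
    "ORDER BY firstname")]

# With parentheses: the salary test applies to both titles.
parenthesized = [r[0] for r in conn.execute(
    "SELECT firstname FROM employee_info "
    "WHERE (title = 'Sales' OR title = 'Programmer') AND salary >= 50000.00 "
    "ORDER BY firstname")]

print(unparenthesized)  # ['Ann', 'Cy', 'Dee']
print(parenthesized)    # ['Ann', 'Dee']
```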

Assignment 5:
1. Select the customerid, order_date, and item from the items_ordered table for all items unless they are 'Snow Shoes' or 'Ear Muffs'. Display the rows as long as they are not either of these two items.
2. Select the item and price of all items that start with the letters 'S', 'P', or 'F'.

IN and BETWEEN Conditional Operators:

SELECT col1, SUM(col2)
FROM "list-of-tables"
WHERE col3 IN (list-of-values);

SELECT col1, SUM(col2)
FROM "list-of-tables"
WHERE col3 BETWEEN value1 AND value2;

The IN conditional operator is really a set membership test operator. That is, it is used to test whether or not a value (stated before the keyword IN) is "in" the list of values provided after the keyword IN.


Examples:

SELECT employeeid, lastname, salary FROM employee_info WHERE lastname IN ('Hernandez', 'Jones', 'Roberts', 'Ruiz');

This statement will select the employeeid, lastname, and salary from the employee_info table where the lastname is equal to either Hernandez, Jones, Roberts, or Ruiz. It will return the rows if it is ANY of these values.

The IN conditional operator can be rewritten as a compound condition that uses the equals operator combined with OR, with exactly the same results:

SELECT employeeid, lastname, salary FROM employee_info WHERE lastname = 'Hernandez' OR lastname = 'Jones' OR lastname = 'Roberts' OR lastname = 'Ruiz';

As you can see, the IN operator is much shorter and easier to read when you are testing for more than two or three values.

You can also use NOT IN to exclude the rows in your list.
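The equivalence between IN and a chain of OR conditions, and the effect of NOT IN, can be checked directly. This is a minimal sketch assuming Python's built-in sqlite3 module as a stand-in DBMS; the employee_info rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee_info (employeeid INTEGER, lastname TEXT, salary REAL)")
conn.executemany("INSERT INTO employee_info VALUES (?, ?, ?)", [
    (1, "Hernandez", 52000.0),
    (2, "Jones", 41000.0),
    (3, "Smith", 38000.0),
    (4, "Ruiz", 47000.0),
])

names = "('Hernandez', 'Jones', 'Roberts', 'Ruiz')"

with_in = [r[0] for r in conn.execute(
    f"SELECT lastname FROM employee_info WHERE lastname IN {names} ORDER BY lastname")]

with_or = [r[0] for r in conn.execute(
    "SELECT lastname FROM employee_info "
    "WHERE lastname = 'Hernandez' OR lastname = 'Jones' "
    "OR lastname = 'Roberts' OR lastname = 'Ruiz' ORDER BY lastname")]

# NOT IN keeps only the rows whose lastname is absent from the list.
with_not_in = [r[0] for r in conn.execute(
    f"SELECT lastname FROM employee_info WHERE lastname NOT IN {names} ORDER BY lastname")]

print(with_in)      # ['Hernandez', 'Jones', 'Ruiz']
print(with_or)      # ['Hernandez', 'Jones', 'Ruiz']
print(with_not_in)  # ['Smith']
```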


Examples:
The BETWEEN conditional operator is used to test whether or not a value (stated before the keyword BETWEEN) is "between" the two values stated after the keyword BETWEEN.

SELECT employeeid, age, lastname, salary FROM employee_info WHERE age BETWEEN 30 AND 40;

This statement will select the employeeid, age, lastname, and salary from the employee_info table where the age is between 30 and 40 (including 30 and 40).

This statement can also be rewritten without the BETWEEN operator:

SELECT employeeid, age, lastname, salary FROM employee_info WHERE age >= 30 AND age <= 40;

You can also use NOT BETWEEN to exclude the values within your range.
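That BETWEEN includes both end values is easy to confirm with ages sitting exactly on and just outside the boundaries. The sketch assumes Python's built-in sqlite3 module as a stand-in DBMS and hypothetical employee_info rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee_info (employeeid INTEGER, age INTEGER)")
conn.executemany("INSERT INTO employee_info VALUES (?, ?)", [
    (1, 29), (2, 30), (3, 35), (4, 40), (5, 41),
])

between = [r[0] for r in conn.execute(
    "SELECT age FROM employee_info WHERE age BETWEEN 30 AND 40 ORDER BY age")]

comparisons = [r[0] for r in conn.execute(
    "SELECT age FROM employee_info WHERE age >= 30 AND age <= 40 ORDER BY age")]

not_between = [r[0] for r in conn.execute(
    "SELECT age FROM employee_info WHERE age NOT BETWEEN 30 AND 40 ORDER BY age")]

print(between)      # [30, 35, 40]
print(comparisons)  # [30, 35, 40]
print(not_between)  # [29, 41]
```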


Mathematical Functions:
ABS(x)                   returns the absolute value of x
SIGN(x)                  returns the sign of input x as -1, 0, or 1 (negative, zero, or positive respectively)
MOD(x,y)                 modulo: returns the integer remainder of x divided by y (same as x%y)
FLOOR(x)                 returns the largest integer value that is less than or equal to x
CEILING(x) or CEIL(x)    returns the smallest integer value that is greater than or equal to x
POWER(x,y)               returns the value of x raised to the power of y
ROUND(x)                 returns the value of x rounded to the nearest whole integer
ROUND(x,d)               returns the value of x rounded to the number of decimal places specified by the value d
SQRT(x)                  returns the square-root value of x

Examples:

SELECT round(salary), firstname FROM employee_info;

This statement will select the salary rounded to the nearest whole value and the firstname from the employee_info table.
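A few of these functions can be tried out in a short sketch. It assumes Python's built-in sqlite3 module as a stand-in DBMS and a hypothetical one-row employee_info table. Only ABS, ROUND, and the % operator are used, because the availability of the other functions varies between database systems and builds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee_info (firstname TEXT, salary REAL)")
conn.execute("INSERT INTO employee_info VALUES ('Ann', 45678.94)")

# ROUND(x) rounds to a whole value, ROUND(x, d) to d decimal places.
rounded = conn.execute(
    "SELECT ROUND(salary), ROUND(salary, 1), firstname FROM employee_info"
).fetchone()

# ABS on a literal, and the % operator as the modulo form mentioned for MOD.
others = conn.execute("SELECT ABS(-7), 7 % 3").fetchone()

print(rounded)  # (45679.0, 45678.9, 'Ann')
print(others)   # (7, 1)
```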


Assignment 6:
1. Select the item and per unit price for each item in the items_ordered table. Hint: Divide the price by the quantity.

Table Joins:
All of the queries up to this point have been useful, with one major limitation: you've been selecting from only one table at a time with your SELECT statement. It is time to introduce one of the most beneficial features of SQL and relational database systems, the "Join". To put it simply, the "Join" makes relational database systems "relational". Joins allow you to link data from two or more tables together into a single query result from one single SELECT statement. A "Join" can be recognized in a SQL SELECT statement if it has more than one table after the FROM keyword:

SELECT "list-of-columns"
FROM table1,table2
WHERE "search-condition(s)"


Joins can be explained more easily by demonstrating what would happen if you worked with one table only and didn't have the ability to use "joins". This single-table database is also sometimes referred to as a "flat table". Let's say you have a one-table database that is used to keep track of all of your customers and what they purchase from your store.

Every time a new row is inserted into the table, all columns must be filled in, resulting in unnecessary "redundant data". For example, every time Wolfgang Schultz purchases something, a row like one of the following is inserted into the table:

id     first     last     address         city  state  zip    date    item         price
10982  Wolfgang  Schultz  300 N. 1st Ave  Yuma  AZ     85002  032299  snowboard    45.00
10982  Wolfgang  Schultz  300 N. 1st Ave  Yuma  AZ     85002  082899  snow shovel  35.00
10982  Wolfgang  Schultz  300 N. 1st Ave  Yuma  AZ     85002  091199  gloves       15.00
10982  Wolfgang  Schultz  300 N. 1st Ave  Yuma  AZ     85002  100999  lantern      35.00
10982  Wolfgang  Schultz  300 N. 1st Ave  Yuma  AZ     85002  022900  tent         85.00

An ideal database would have two tables:
1. One for keeping track of your customers
2. And the other to keep track of what they purchase:

"Customer_info" table:
customer_number  firstname  lastname  address  city  state  zip

"Purchases" table:
customer_number  date  item  price

Now, whenever a purchase is made by a repeat customer, only the second table, "Purchases", needs to be updated. We've just eliminated useless redundant data; that is, we've just normalized this database. Notice how each of the tables has a common "customer_number" column. This column, which contains the unique customer number, will be used to JOIN the two tables. Using the two new tables, let's say you would like to select the customer's name and the items they've purchased.

Examples:

SELECT customer_info.firstname, customer_info.lastname, purchases.item FROM customer_info, purchases WHERE customer_info.customer_number = purchases.customer_number;

This particular "Join" is known as an "Inner Join" or "Equijoin". This is the most common type of "Join" that you will see or use. Notice that each of the columns is preceded by the table name and a period. This isn't always required; however, it IS good practice so that you won't confuse which columns go with which tables. It is required if the column names are the same in the two tables. It is recommended to precede all of your columns with the table names when using joins.

SELECT employee_info.employeeid, employee_info.lastname, employee_sales.comission FROM employee_info, employee_sales WHERE employee_info.employeeid = employee_sales.employeeid;

This statement will select the employeeid and lastname (from the employee_info table) and the comission value (from the employee_sales table) for all of the rows where the employeeid in the employee_info table matches the employeeid in the employee_sales table.
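The customer and purchase join above can be run end to end in a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in DBMS. The second customer, Anna Lee, is hypothetical and has no purchases, and the explicit INNER JOIN ... ON form is shown as an equivalent spelling that the text above does not cover.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_info (customer_number INTEGER, firstname TEXT, lastname TEXT)")
conn.execute("CREATE TABLE purchases (customer_number INTEGER, date TEXT, item TEXT, price REAL)")
conn.executemany("INSERT INTO customer_info VALUES (?, ?, ?)", [
    (10982, "Wolfgang", "Schultz"),
    (10983, "Anna", "Lee"),  # hypothetical customer with no purchases
])
conn.executemany("INSERT INTO purchases VALUES (?, ?, ?, ?)", [
    (10982, "032299", "snowboard", 45.00),
    (10982, "082899", "snow shovel", 35.00),
])

# Join written as in the text: two tables after FROM, matched in WHERE.
implicit = conn.execute(
    "SELECT customer_info.firstname, customer_info.lastname, purchases.item "
    "FROM customer_info, purchases "
    "WHERE customer_info.customer_number = purchases.customer_number "
    "ORDER BY purchases.item").fetchall()

# Same inner join with the explicit JOIN ... ON spelling.
explicit = conn.execute(
    "SELECT customer_info.firstname, customer_info.lastname, purchases.item "
    "FROM customer_info INNER JOIN purchases "
    "ON customer_info.customer_number = purchases.customer_number "
    "ORDER BY purchases.item").fetchall()

print(implicit)
# [('Wolfgang', 'Schultz', 'snow shovel'), ('Wolfgang', 'Schultz', 'snowboard')]
print(implicit == explicit)  # True
```

Anna Lee does not appear in the result because an inner join returns only rows that have a match in both tables.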

Assignment 7:
1. Write a query using a join to determine which items were ordered by each of the customers in the customers table. Select the customerid, firstname, lastname, order_date, item, and price for everything each customer purchased in the items_ordered table.
2. Repeat exercise #1, but display the results sorted by state in descending order.

Additional Reference:
Judith Bowman, Sandra Emerson, and Marcy Darnovsky, The Practical SQL Handbook: Using Structured Query Language, Third Edition, Addison-Wesley, ISBN 0-201-44787-8, 1996.
C. J. Date and Hugh Darwen, A Guide to the SQL Standard: A User's Guide to the Standard Database Language SQL, Fourth Edition, Addison-Wesley, ISBN 0-201-96426-0, 1997.
C. J. Date, An Introduction to Database Systems, Volume 1, Sixth Edition, Addison-Wesley, 1994.
Ramez Elmasri and Shamkant Navathe, Fundamentals of Database Systems, 3rd Edition, Addison-Wesley, ISBN 0-805-31755-4, August 1999.
Jim Melton and Alan R. Simon, Understanding the New SQL: A Complete Guide, Morgan Kaufmann, ISBN 1-55860-245-3, 1993.
Jeffrey D. Ullman, Principles of Database and Knowledge-Base Systems, Volume 1, Computer Science Press, 1988.

Reference websites:
Wikipedia: SQL - http://en.wikipedia.org/wiki/SQL History and overview of the language.
SQLCourse - http://www.sqlcourse.com/ Interactive/On-line SQL Tutorial with SQL Interpreter & live practice database.
A Gentle Introduction to SQL - http://sqlzoo.net/ Online SQL tutorial featuring a live interpreter to test SQL commands.
An Introduction to Database Normalization - http://dev.mysql.com/tech-resources/articles/intro-to-normalization.html
SQL Tutorial - http://www.firstsql.com/tutor.htm Complete SQL Tutorial using SQL92.
SQL Tutorial - http://www.1keydata.com/sql/sql.html This site aims to teach beginners the building blocks of SQL.
Database and SQL eLearning - http://db.grussell.org/ Database theory and an online tutorial interface to an Oracle database system, allowing a user to learn SQL interactively. The site automatically checks and marks SQL and gives instant feedback.
SQL exercises - http://www.sql-ex.ru
Introduction to Structured Query Language - http://riki-lb1.vet.ohio-state.edu/mqlin/computec/tutorials/SQLTutorial.htm
SQL for Web Nerds - http://eveander.com/arsdigita/books/sql/ A nicely structured manuscript on SQL by Philip Greenspun, based on the Oracle database. Queries, transactions, triggers, and RDBMS concepts are covered.
SQL School - http://www.w3schools.com/sql/
http://directory.google.com/Top/Computers/Programming/Languages/SQL/FAQs,_Help,_and_Tutorials/

-The End-