You are on page 1of 60

An Introduction to SQL

Kirk Anne
Computing & Information Technology
SUNY Geneseo
kma@geneseo.edu
What is a database?
Parts of a database
• Attributes (fields)
– An attribute or field is a component of a record that
describes something about an item.
• Records
– A record is the representation of an individual item.
• Table
– A collection of records
• Database
– A collection of tables and rules for accessing the
tables
What is a relational database?
• Originally developed by E.F. Codd in 1970
• Organizes data into tables where each
item is a row and the attributes of the item
are in columns.
• Different from “flat file” databases because
you can define “relationships” between
items in different tables.
Parts of a database
Record

Tables

Attribute/Field

• Records become “rows”


• Attributes/fields become “columns”
• Rules determine the relationship
between the tables and tie the data
together to form a database
I need a new database!
• Many people ask for “new databases”
when in fact they only need a new table
within an existing database.
• The data within the tables should be all
related somehow.
– By owner
– By project
Creating a database
• What information are we trying to store?
• How do we describe the information?
• Phone Book/Contact entries
– Name
– Address
– Company
– Phone Number
– URL/Web Page
– Age
– Height (in meters)
– Birthday
– When we added the entry
Data Types
• Binary
– Database specific binary objects
– Pictures, digital signatures, etc.
• Boolean
– True/False values
• Character
– Fixed width or variable size
• Numeric
– Integer, Real (floating decimal point), Money
• Temporal
– Time, Date, Timestamp
Phone Book/Contact Record
Name Character
Address Character
Company Character
Phone Number Character
URL/Web Page Character
Age Integer
Height Real (float)
Birthday Date
When we added the entry Timestamp
“Normal Forms”

Summarized from Barry Wise’s


article on Database Normalization
http://www.phpbuilder.com/columns/barry20000731.php3?page=1
What are the “normal forms”?
• E. F. Codd in 1972 wrote a paper on
“Further Normalization of the Data Base
Relational Model”
• Normal forms reduce the amount of
redundancy and inconsistent dependency
within databases.
• Codd proposed three normal forms and
through the years two more have been
added.
The Zero Form
• No rules have Contacts
been applied Name Company Address Phone1 Phone2 Phone3 ZipCode

• Where most
Joe ABC 123 5532 2234 3211 12345
people start
Jane XYZ 456 3421 14454
(and stop)
Chris PDQ 789 2341 6655 14423
• No room for
growth
• Usually
wastes space
First Normal Form
• Eliminate
repeating columns
Contacts
in each table Id Name Company Address Phone ZipCode

• Create a separate 1 Joe ABC 123 5532 12345


table for each set 1 Joe ABC 123 2234 12345
of related data 1 Joe ABC 123 3211 12345
• Identify each set of 2 Jane XYZ 456 3421 14454
related data with a 3 Chris PDQ 789 2341 14423
primary key 3 Chris PDQ 789 6655 14423

Benefits: Now we can have infinite phone numbers or


company addresses for each contact.

Drawback: Now we have to type in everything over and over


again. This leads to inconsistency, redundancy and wasting
space. Thus, the second normal form…
Second Normal Form
People
• Create separate Id Name Company Address Zip
tables for sets of 1 Joe ABC 123 12345

values that apply to 2 Jane XYZ 456 14454


3 Chris PDQ 789 14423
multiple records
• Relate these tables PhoneNumbers
PhoneID Id Phone
with a “foreign key”.
1 1 5532
2 1 2234
3 1 3211
4 2 3421
5 3 2341
6 3 6655
Third Normal Form
• Eliminate fields that People
do not depend on the Id Name AddressID
1 Joe 1
primary key.
2 Jane 2
3 Chris 3
PhoneNumbers
PhoneID Id Phone
Address
1 1 5532
AddressID Company Address Zip
2 1 2234
1 ABC 123 12345
3 1 3211
2 XYZ 456 14454
4 2 3421
3 PDQ 789 14423
5 3 2341
6 3 6655 Is this enough? Codd thought so…
What about “many to many”?
Kinds of Relationships
• “One to One”
– One row of a table matches exactly to another
• One person, one id number, one address
• “One to Many”
– One row of a table matches many of another
• One person, many phone numbers
• “Many to Many”
– One row may match many of another or many
rows match one row of another
Fourth Normal Form
• In a “many to many” relationship,
PhoneNumbers
independent entities cannot be PhoneID Phone

stored in the same table. 1 5532


2 2234

People 3 3211
4 3421
Id Name AddressID
5 2341
1 Joe 1 6 6655
2 Jane 2
3 Chris 3
PhoneRelations
PhoneRelID Id PhoneID

Address 1 1 1

AddressID Company Address Zip 2 1 2

1 ABC 123 12345 3 1 3

4 2 4
2 XYZ 456 14454
5 3 5
3 PDQ 789 14423
6 3 6
Fifth Normal Form
• The “very esoteric” one that is probably not
required to get the most out of your database.
• “The original table must be reconstructed from
the tables into which it has been broken down.”
• The rule ensures that you have not created any
extraneous columns and all the tables are only
as large as they need to be.
Why normalize?
• Increases the integrity of the data
• Reduces redundancy
• Improves efficiency
• Although normalization can be hard, it is
worth it in the long run.
What do I need to remember?
• Keep normalization in mind.
• Don’t replicate data in a table.
• If you break the rules, know why you are
breaking the rules and do it for a good
reason.
All you need to know about
SQL in 30 minutes (or less)
Basic SQL Commands
• Creating tables with CREATE
• Adding data with INSERT
• Viewing data with SELECT
• Removing data with DELETE
• Modifying data with UPDATE
• Destroying tables with DROP
Creating tables with CREATE
• Generic form

CREATE TABLE tablename (


column_name data_type attributes…,
column_name data_type attributes…,

)

• Table and column names can’t have spaces or


be “reserved words” like TABLE, CREATE, etc.
Phone Book/Contact Record
Name Character
Address Character
Company Character
Phone Number Character
URL/Web Page Character
Age Integer
Height Real (float)
Birthday Date
When we added the entry Timestamp
Phone Book/Contact Table
CREATE TABLE contacts (
Name VARCHAR(40),
Address VARCHAR(60),
Company VARCHAR(60),
Phone VARCHAR(11),
URL VARCHAR(80),
Age INT,
Height FLOAT,
Birthday DATE,
WhenEntered TIMESTAMP
);
Plan your tables very carefully!
Once created, they are difficult to change!
Phone Book/Contact Table
CREATE TABLE contacts (
ContactID INT PRIMARY KEY,
Name VARCHAR(40),
Address VARCHAR(60),
Company VARCHAR(60),
Phone VARCHAR(11),
URL VARCHAR(80),
Age INT,
Height FLOAT,
Birthday DATE,
WhenEntered TIMESTAMP
);
If you are going to use the relational nature of a database,
don’t forget you need to have a unique way to access records!
There is a way to make the key automatically increment,
so you don’t have to worry about which one is next.
Data Types
• Binary
– Database specific binary objects (BLOB)
• Boolean
– True/False values (BOOLEAN)
• Character
– Fixed width (CHAR) or variable size (VARCHAR)
• Numeric
– Integer (INT), Real (FLOAT), Money (MONEY)
• Temporal
– Time (TIME), Date (DATE), Timestamp (TIMESTAMP)
Adding data with INSERT
• Generic Form

INSERT INTO tablename (column_name,…)


VALUES (value,…)
Inserting a record into ‘contacts’
INSERT INTO contacts
(contactid,name,address,company,phone,
url,age,height,birthday,whenentered)
VALUES
(1,‘Joe’,’123 Any St.’,’ABC’,
’800-555-1212’,‘http://abc.com’,30,1.9,
’6/14/1972’,
now());
Inserting a partial record
INSERT INTO contacts
(contactid,name,phone)
VALUES (2,’Jane’,’212-555-1212’);
Automatic key generation
• CREATE SEQUENCE contactidseq;
• Change the ContactID line in the
CREATE TABLE to:
ContactID INT DEFAULT nextval(‘contactidseq’) PRIMARY KEY

• Or when inserting into a table


INSERT contacts (contactid,name,phone)
VALUES (nextval(‘contactidseq’),’Jack’,
‘716-555-1212’);
Viewing data with SELECT
• Generic Form
SELECT column,… FROM table,…
WHERE condition
GROUP BY group_by_expression
HAVING condition
ORDER BY order_expression
• The most used command
• Probably the most complicated also
• If used improperly, can cause very long waits
because complex computations
A few simple SELECTs
• SELECT * FROM contacts;
– Display all records in the ‘contacts’ table
• SELECT contactid,name FROM contacts;
– Display only the record number and names
• SELECT DISTINCT url FROM contacts;
– Display only one entry for every value of URL.
Refining selections with WHERE
• The WHERE “subclause” allows you to
select records based on a condition.
• SELECT * FROM contacts
WHERE age<10;
– Display records from contacts where age<10
• SELECT * FROM contacts
WHERE age BETWEEN 18 AND 35;
– Display records where age is 18-35
Additional selections
• The “LIKE” condition
– Allows you to look at strings that are alike
• SELECT * FROM contacts
WHERE name LIKE ‘J%’;
– Display records where the name starts with ‘J’
• SELECT * FROM contacts
WHERE url LIKE ‘%.com’;
– Display records where url ends in “.com”
Removing data with DELETE
• Generic Form

DELETE FROM table WHERE condition;

DELETE FROM contacts WHERE age<13;


Modifying data with UPDATE
• Generic Form

UPDATE table SET column=expression


WHERE condition;

UPDATE contacts SET company=‘AOL’


WHERE company=‘Time Warner’;
Destroying tables with DROP
• Generic Form

DROP TABLE tablename;

DROP TABLE contacts;


More about SELECT
“Normal Forms” and SELECT
• Good database design using the normal
forms requires data to be separated into
different tables
• SELECT allows us to join the data back
together
• We can use “views” to create virtual tables
The Normal Forms
• First Form
– Eliminate replicated data in tables
– Create separate tables for each set of related data
– Identify each set of related data with a primary key
• Second Form
– Create separate tables for sets of values that apply to multiple
records
– Relate the tables with a foreign key
• Third Form
– Eliminate fields that do not depend on the primary key
• Fourth Form
– In many-to-many relationships, independent entities cannot be
stored in the same table
Joining together tables
• SELECT name,phone,zip FROM
people, phonenumbers, address People
WHERE Id Name AddressID
people.addressid=address.addressid
1 Joe 1
AND people.id=phonenumbers.id;
2 Jane 2
3 Chris 3
PhoneNumbers
PhoneID Id Phone
Address
1 1 5532
AddressID Company Address Zip
2 1 2234
1 ABC 123 12345
3 1 3211
2 XYZ 456 14454
4 2 3421
3 PDQ 789 14423
5 3 2341
6 3 6655
Different types of JOINs
• “Inner Join”
– Unmatched rows in either table aren’t printed
• “Left Outer Join”
– All records from the “left” side are printed
• “Right Outer Join”
– All records from the “right” side are printed
• “Full Outer Join”
– All records are printed
• Multiple Table Join
– Join records from multiple tables
General form of SELECT/JOIN
SELECT columns,…
FROM left_table
join_type JOIN right_table ON condition;

SELECT name,phone FROM people


JOIN phonenumbers ON
people.id=phonenumbers.id;
Other versions
SELECT name,phone FROM people
LEFT JOIN phonenumbers ON
people.id=phonenumbers.id;
SELECT name,phone FROM people
RIGHT JOIN phonenumbers ON
people.id=phonenumbers.id;
SELECT name,phone FROM people
FULL JOIN phonenumbers ON
people.id=phonenumbers.id;
“Theta style” vs. ANSI
• Theta Style (used in most SQL books)
SELECT name,phone,zip FROM people, phonenumbers, address
WHERE people.addressid=address.addressid AND
people.id=phonenumbers.id;

• ANSI Style uses JOIN


SELECT name,phone,zip FROM people
JOIN phonenumbers ON people.id=phonenumbers.id
JOIN address ON people.addressid=address.addressid;
Other SELECT examples
• SELECT * FROM contacts
WHERE name is NULL;
• SELECT * FROM contacts
WHERE zip IN (‘14454’,’12345’);
• SELECT * FROM contacts
WHERE zip IN (
SELECT zip FROM address
WHERE state=‘NY’
);
GROUP BY/HAVING
• The “GROUP BY” clause allows you to
group results together with “aggregate
functions”
– AVG(), COUNT(), MAX(), MIN(), SUM()
– COUNT DISTINCT
• HAVING allows you to search the GROUP
BY results
GROUP BY Examples
SELECT company,count(company)
FROM contacts
GROUP BY company;

SELECT company,count(company)
FROM contacts
GROUP BY company
HAVING count(company) > 5;
ORDER BY
• The “ORDER BY” clause allows you to
sort the results returned by SELECT.

SELECT * FROM contacts


ORDER BY company;

SELECT * FROM contacts


ORDER BY company, name;
Views
• You can use “CREATE VIEW” to create a
virtual table from a SELECT statement.

CREATE VIEW contactview AS


(SELECT name,phone,zip FROM
people,phonenumbers,address
WHERE people.id=phonenumbers.id AND
people.addressid=address.addressid);
Now… Let’s do it on the web

PHP style
Basic PHP/SQL interaction
1. Open a connection to the database
2. If ok, generate SQL command
3. “Execute” SQL command
4. Handle responses from the server
5. If not done, go back to step 2
6. If done, close connection to database
Creating a table with PHP
<?php
// create a connection with the database
$connection=pg_Connect("host=oracle.geneseo.edu port=5432
dbname=mgmt357 user=mgmt357 password=ecommerce");
// if there is no connection, generate an error and get out
if (!$connection) {
print pg_ErrorMessage(); exit(-1);
};
// Create the table with the pg_exec command
$result=pg_exec($connection,"create table customer (
id int8,
name varchar(50),
address varchar(50),
email varchar(64)
)");
if ($result) {
print "The customer table has been created.";
} else {
print pg_ErrorMessage();
}
pg_close($connection);
?>
Entering data with PHP
<?php pg_close($connection);
} else {
if ($submit) { // display form
// create a connection with the database
?>
$connection=pg_Connect("host=oracle.geneseo.edu
port=5432 dbname=mgmt357 user=mgmt357
password=ecommerce"); <form method="post"
action="<?php echo $PHP_SELF?>">
if (!$connection) { Name: <input type="Text" name="name”>
print pg_ErrorMessage(); Address: <input type="Text" name="address”>
exit(-1); Email:<input type="Text" name="email”>
};
<input type="Submit" name="submit" value="Enter
information”>
$id=time(); // set a unique id number to each record
</form>
$result=pg_exec($connection, "insert into customer
VALUES ($id,'$name','$address','$email')"); <?
}
if ($result) { ?>
print "The customer data has been inserted.<p>";
} else {
print pg_ErrorMessage();
}
Updating data with PHP
<?php pg_close($connection);
} else {
if ($submit) { // display form
// create a connection with the database ?>
$connection=pg_Connect("host=oracle.geneseo.edu port=5432
dbname=mgmt357 user=mgmt357
password=ecommerce"); <form method="post"
action="<?php echo $PHP_SELF?>">
if (!$connection) { Id: <input type=“Text” name=“id”>
print pg_ErrorMessage(); Name: <input type="Text" name="name”>
exit(-1); Address: <input type="Text" name="address”>
}; Email:<input type="Text" name="email”>
<input type="Submit" name="submit" value="Enter information”>
$result=pg_exec($connection, "update customer </form>
set name='$name',
set address='$address',
<?
set email='$email' where id='$id'");
}
?>
if ($result) {
print "The customer data has been inserted.<p>";
} else {
print pg_ErrorMessage();
}
Displaying data with PHP
$result=pg_exec($connection,"select * from customer");

if ($result) {
// get the number of rows and store it in $r
$r = pg_numrows($result);
print "The customer table has $r row(s) of data.<p>";
// set the row counter to the beginning 0
$row=0;
print "<table border=1>\n";
// while there are records to deal with...
while ($data=pg_fetch_object($result,$row)) {
print "<tr>";
print "<td>$data->name</td>";
print "<td>$data->address</td>";
print "<td>$data->email</td>";
print "</tr>\n";
$row++; // increment the counter
};
print "</table>\n";
} else {
print pg_ErrorMessage();
}
Updating data with PHP part 2
<?php pg_close($connection);
} else {
if ($submit) { // display form
// create a connection with the database ?>
$connection=pg_Connect("host=oracle.geneseo.edu port=5432 <form method="post" action="<?php echo $PHP_SELF?>">
dbname=mgmt357 user=mgmt357 <?php
password=ecommerce"); $connection=pg_Connect("host=oracle.geneseo.edu port=5432
dbname=mgmt357 user=mgmt357 password=ecommerce");
if (!$connection) { $result=pg_exec($connection,"select id from customer limit 30");
print pg_ErrorMessage(); $row=0;
exit(-1); print "<select name=id>\n";
}; while ($data=pg_fetch_object($result,$row)) {
print "<option>$data->id\n";
$result=pg_exec($connection, "update customer $row++;
set name='$name', };
set address='$address', print "</select>\n";
set email='$email' where id='$id'"); ?>
Name: <input type="Text" name="name”>
if ($result) { Address: <input type="Text" name="address”>
print "The customer data has been inserted.<p>"; Email:<input type="Text" name="email”>
} else { <input type="Submit" name="submit" value="Enter information”>
print pg_ErrorMessage(); </form>
} <?
}
?>
Getting even easier… PEAR
• Abstracting the database interface
<?php
include('dbinfo.php');
require_once( 'DB.php' );
$db = DB::connect( "mysql://$user:$pass@$host/$dbname" );
// no need to select DB
$sql = 'SELECT * FROM demo';
$demoResult = $db->query($sql);
while ($demoRow = $demoResult->fetchRow()) {
    echo $demoRow[2] . '<br>';
}
$db->disconnect();
?>

http://www.phpbuilder.com/columns/allan20010115.php3?page=1
Resources
• PHP Sites
– http://www.php.net
– http://www.phpbuilder.com
– http://www.devshed.com
– http://www.geneseo.edu/~kma/PHP_Intro
• Books
– SQL in a Nutshell, Kevin Kline, O’Reilly
– PostgreSQL: Introduction and Concepts, Bruce
Momjian, Addision Wesley
– Introduction to Database Systems, C.J. Date

You might also like