Presentation Outline
SQL Writing Process
SQL Standards
Using Indexes
The Optimizer
FROM, WHERE Clauses
EXPLAIN
SQL Trace
Sub-Selects and Joins
Tips and Tricks
2
Caveat
Although many of these principles apply
to all databases, Oracle will be used in
the examples.
3
SQL Writing Process
Step 1: What information do I need? Columns
Step 2: Where is it? Tables
Step 3: Write SQL:
SELECT columns
FROM tables
WHERE ... (joins, filters, subqueries)
I'M FINISHED!
4
SQL Writing Process
YOU'RE NOT FINISHED YET! You've got the results
you want, but at what cost?

There are many, many ways to get the right results,
but only one is the fastest way. 1000-to-1
improvements are attainable!

Inefficient SQL can dramatically degrade the
performance of the entire system

Developers and DBAs must work together to tune
the database and the application
5
Pre-Tuning Questions
How long is too long?
Is the statement running on near-production
volumes?
Is the optimal retrieval path being used?
How often will it execute?
When will it execute?
6
SQL Standards
Why are SQL standards important?

Maintainability, readability

Performance: If SQL is the same as a (recently)
executed statement, it can be re-used instead of
needing to be reparsed
7
SQL Standards
Question: which of these statements are the same?
A. SELECT LNAME FROM EMP WHERE EMPNO = 12;
B. SELECT lname FROM emp WHERE empno = 12;
C. SELECT lname FROM emp WHERE empno = :id;
D. SELECT lname FROM emp
WHERE empno = 12;
8
SQL Standards
Answer: None

Whitespace, case, bind variables vs. constants all
matter

Using standards helps to ensure that equivalent SQL
can be reused.
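
The quiz above can be sketched in a few lines. This is only an illustration, assuming statement reuse depends on an exact match of the statement text; SHA-256 is a stand-in chosen for the sketch, not Oracle's internal mechanism, and `text_fingerprint` is a hypothetical helper name.

```python
import hashlib

# The four statements from the quiz slide, exactly as written there.
statements = {
    "A": "SELECT LNAME FROM EMP WHERE EMPNO = 12;",
    "B": "SELECT lname FROM emp WHERE empno = 12;",
    "C": "SELECT lname FROM emp WHERE empno = :id;",
    "D": "SELECT lname FROM emp\nWHERE empno = 12;",
}


def text_fingerprint(sql: str) -> str:
    """Hash the raw statement text; any character difference changes the hash."""
    return hashlib.sha256(sql.encode("utf-8")).hexdigest()


fingerprints = {label: text_fingerprint(sql) for label, sql in statements.items()}
distinct = len(set(fingerprints.values()))

for label, digest in fingerprints.items():
    print(label, digest[:12])
print("distinct statements:", distinct)
```

Case (A vs. B), a bind variable vs. a constant (B vs. C), and a line break (B vs. D) each produce a different fingerprint, which is why a single written standard matters.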

9
Tables Used in the Examples
DEPT: deptno, dname, loc
EMP: empno, mgr, job, deptno, fname, lname, comm, hiredate, grade, sal
SALGRADE: grade, losal, hisal
10

SQL Standards: Example
SELECT E.empno,
D.dname
FROM emp E,
dept D
WHERE E.deptno = D.deptno
AND (D.deptno = :vardept
OR E.empno = :varemp);

Keywords upper case and left-aligned
Columns on new lines
Use std. table aliases
Separate w/ one space
Use bind variables
AND/OR on new lines
No space before/after parentheses
11
Indexes: What are they?
An index is a database object used to speed retrieval
of rows in a table.
The index contains only the indexed value--usually the
key(s)--and a pointer to the row in the table.
Multiple indexes may be created for a table
Not all indexes contain unique values
Indexes may have multiple columns (e.g., Oracle
allows up to 32)
12
Indexes and SQL
If a column appears in a WHERE clause it is a
candidate for being indexed.
If a column is indexed the database can use the
index to find the rows instead of scanning the table.
If the column is not referenced properly, however,
the database may not be able to use the index and
will have to scan the table anyway.
Knowing what columns are and are not indexed can
help you write more efficient SQL
13
Example: Query without Index
No index exists for column EMPNO on table EMP, so
a table scan must be performed:
SELECT *
FROM emp
WHERE empno = 8
Table: EMP
empno fname lname ...
4 lisa baker
9 jackie miller
1 john larson
3 larry jones
5 jim clark
2 mary smith
7 harold simmons
8 mark burns
6 gene harris
14
Example: Query with Index
Column EMPNO is indexed, so it can be used to find
the requested row:
SELECT *
FROM emp
WHERE empno = 8
Index: PK_EMP on EMP (EMPNO)
[Figure: index tree over EMPNO values 1 through 9, each entry pointing to its row in EMP]
Table: EMP
empno fname lname ...
4 lisa baker
9 jackie miller
1 john larson
3 larry jones
5 jim clark
2 mary smith
7 harold simmons
8 mark burns
6 gene harris
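
The difference between the two access paths can be observed directly. The sketch below uses SQLite through Python's standard `sqlite3` module as a stand-in for Oracle, so the plan wording differs from Oracle's EXPLAIN PLAN; the `plan` helper is a hypothetical name introduced for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER, fname TEXT, lname TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(4, "lisa", "baker"), (9, "jackie", "miller"), (1, "john", "larson"),
     (3, "larry", "jones"), (5, "jim", "clark"), (2, "mary", "smith"),
     (7, "harold", "simmons"), (8, "mark", "burns"), (6, "gene", "harris")],
)


def plan(sql):
    """Return SQLite's access-path description for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(r[3] for r in rows)  # column 3 holds the plan text


query = "SELECT * FROM emp WHERE empno = 8"

plan_without_index = plan(query)           # no index yet: full table scan
conn.execute("CREATE UNIQUE INDEX pk_emp ON emp (empno)")
plan_with_index = plan(query)              # index lookup on empno

row = conn.execute(query).fetchone()
print(plan_without_index)
print(plan_with_index)
print(row)
```

The query text and its result are identical in both cases; only the access path changes.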
15
Indexes: Caveats
Sometimes a table scan cannot be avoided
Not every column should be indexed--there is
performance overhead on Inserts, Updates, Deletes
Small tables may be faster with a table scan
Queries returning a large number (> 5-20%) of the
rows in the table may be faster with a table scan
16
Indexes: Column Order
Must use the leading column(s) of the index for
the index to be used
Example: Index on (EMPNO, DEPTNO)

Will NOT use index:
SELECT *
FROM emp
WHERE deptno = 10;

WILL use index:
SELECT *
FROM emp
WHERE empno > 0
AND deptno = 10;
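
A small check of the leading-column rule, again using SQLite via Python's `sqlite3` as a stand-in for Oracle (other optimizers, or the same one with statistics gathered, may behave differently). The index name `idx_emp_empno_deptno` and the `plan` helper are assumptions made for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER, deptno INTEGER, lname TEXT)")
# Composite index with EMPNO as the leading column, as on the slide.
conn.execute("CREATE INDEX idx_emp_empno_deptno ON emp (empno, deptno)")


def plan(sql):
    """Return SQLite's access-path description for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(r[3] for r in rows)


# Only the second index column is referenced: the index is not used.
plan_deptno_only = plan("SELECT * FROM emp WHERE deptno = 10")

# The leading column is referenced: the index is used.
plan_leading_col = plan("SELECT * FROM emp WHERE empno > 0 AND deptno = 10")

print(plan_deptno_only)
print(plan_leading_col)
```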
17
Indexes: Functions
Using a function, calculation, or other operation on an
indexed column disables the use of the index

Will NOT use index:
SELECT *
FROM emp
WHERE TRUNC(hiredate) = TRUNC(SYSDATE);
...
WHERE fname || lname = 'MARYSMITH';

WILL use index:
SELECT *
FROM emp
WHERE hiredate >= TRUNC(SYSDATE)
AND hiredate < TRUNC(SYSDATE)+1
...
WHERE fname = 'MARY'
AND lname = 'SMITH';
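
The same effect can be reproduced in SQLite (used here through Python's `sqlite3` purely as a stand-in for Oracle). SQLite's `date()` plays the role of `TRUNC()` in this sketch; the index names `idx_hiredate` and `idx_name` and the `plan` helper are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (empno INTEGER, fname TEXT, lname TEXT, hiredate TEXT)"
)
conn.execute("CREATE INDEX idx_hiredate ON emp (hiredate)")
conn.execute("CREATE INDEX idx_name ON emp (fname, lname)")


def plan(sql):
    """Return SQLite's access-path description for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(r[3] for r in rows)


# Indexed column wrapped in a function or expression: index not usable.
plan_wrapped = plan("SELECT * FROM emp WHERE date(hiredate) = date('now')")
plan_concat = plan("SELECT * FROM emp WHERE fname || lname = 'MARYSMITH'")

# Bare indexed column compared against computed values: index usable.
plan_range = plan(
    "SELECT * FROM emp "
    "WHERE hiredate >= date('now') AND hiredate < date('now', '+1 day')"
)
plan_columns = plan(
    "SELECT * FROM emp WHERE fname = 'MARY' AND lname = 'SMITH'"
)

for p in (plan_wrapped, plan_concat, plan_range, plan_columns):
    print(p)
```

The rewrite moves the computation to the constant side of the comparison and leaves the column untouched.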
18
Indexes: NOT
Using NOT excludes indexed columns:

Will NOT use index:
SELECT *
FROM dept
WHERE deptno != 0;
... deptno NOT = 0;
... deptno IS NOT NULL;

WILL use index:
SELECT *
FROM dept
WHERE deptno > 0;
19
The Optimizer
The WHERE/FROM rules on the following pages apply
to the Rule-based optimizer (Oracle).

If the Cost-based Optimizer is used, Oracle will attempt
to reorder the statements as efficiently as possible
(assuming statistics are available).

DB2 and Sybase use only a Cost-based optimizer

The Optimizer's access paths can be overridden in
Oracle and Sybase (not DB2)
20
The Optimizer: Hints
Return the first rows in the result set as fast as possible:
SELECT /*+ FIRST_ROWS */ empno
FROM emp E,
dept D
WHERE E.deptno = D.deptno;

Force Optimizer to use index IDX_HIREDATE:
SELECT /*+ INDEX (E idx_hiredate) */ empno
FROM emp E
WHERE E.hiredate > TO_DATE('01-JAN-2000', 'DD-MON-YYYY');

21
FROM Clause: Driving Table
Specify the driving table last in the FROM Clause:

Driving table is EMP:
SELECT *
FROM dept D, -- 10 rows
emp E -- 1,000 rows
WHERE E.deptno = D.deptno;

Driving table is DEPT:
SELECT *
FROM emp E, -- 1,000 rows
dept D -- 10 rows
WHERE E.deptno = D.deptno;
22
FROM Clause: Intersection Table
When joining 3 or more tables, use the Intersection table
(with the most shared columns) as the driving table:
SELECT *
FROM dept D,
salgrade S,
emp E
WHERE E.deptno = D.deptno
AND E.grade = S.grade;
EMP shares columns with
DEPT and SALGRADE,
so use as the driving table
23
WHERE: Discard Early
Put the WHERE conditions that discard the most rows
first:
SELECT *
FROM emp E
WHERE E.empno IN (101, 102, 103) -- 3 rows
AND E.deptno > 10; -- 90,000 rows
24
WHERE: AND Subquery First
When using an "AND" subquery, place it first:

Subquery last (CPU = 156 sec):
SELECT *
FROM emp E
WHERE E.sal > 50000
AND 25 > (SELECT COUNT(*)
FROM emp M
WHERE M.mgr = E.empno)

Subquery first (CPU = 10 sec):
SELECT *
FROM emp E
WHERE 25 > (SELECT COUNT(*)
FROM emp M
WHERE M.mgr = E.empno)
AND E.sal > 50000
25
WHERE: OR Subquery Last
When using an "OR" subquery, place it last:

Subquery first (CPU = 100 sec):
SELECT *
FROM emp E
WHERE 25 > (SELECT COUNT(*)
FROM emp M
WHERE M.mgr = E.empno)
OR E.sal > 50000

Subquery last (CPU = 30 sec):
SELECT *
FROM emp E
WHERE E.sal > 50000
OR 25 > (SELECT COUNT(*)
FROM emp M
WHERE M.mgr = E.empno)
26
WHERE: Filter First, Join Last
When joining and filtering, specify the filter conditions
first and the joins last.
SELECT *
FROM emp E,
dept D
WHERE (E.empno = 123 -- Filter criteria
OR D.deptno > 10)
AND E.deptno = D.deptno; -- Join criteria
27
Subqueries: IN vs. EXISTS
Use EXISTS instead of IN in subqueries:

IN: Both tables are scanned
SELECT E.*
FROM emp E
WHERE E.deptno IN (
SELECT D.deptno
FROM dept D
WHERE D.dname = 'SALES');

EXISTS: Only outer table is scanned; subquery uses index
SELECT *
FROM emp E
WHERE EXISTS (
SELECT 'X'
FROM dept D
WHERE D.deptno = E.deptno
AND D.dname = 'SALES');
28
Subquery vs. Join
Use Join instead of Subquery:

IN: Both tables are scanned
SELECT *
FROM emp E
WHERE E.deptno IN (
SELECT D.deptno
FROM dept D
WHERE D.dname = 'SALES');

JOIN: Only one table is scanned, other uses index
SELECT E.*
FROM emp E,
dept D
WHERE D.dname = 'SALES'
AND D.deptno = E.deptno;
29
Join vs. EXISTS
Best performance depends on subquery/driving table:

EXISTS: better than Join if the number of matching
rows in DEPT is small
SELECT *
FROM emp E
WHERE EXISTS (
SELECT 'X'
FROM dept D
WHERE D.deptno = E.deptno
AND D.dname = 'SALES');

JOIN: better than Exists if the number of matching
rows in DEPT is large
SELECT E.*
FROM emp E,
dept D
WHERE D.dname = 'SALES'
AND D.deptno = E.deptno;
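
Before picking a form for performance, it is worth confirming that the IN, EXISTS, and join formulations return the same rows. A minimal sketch, using SQLite through Python's `sqlite3` as a stand-in for Oracle and a few made-up sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dept (deptno INTEGER PRIMARY KEY, dname TEXT);
CREATE TABLE emp (empno INTEGER PRIMARY KEY, lname TEXT, deptno INTEGER);
INSERT INTO dept VALUES (10, 'ACCOUNTING'), (20, 'RESEARCH'), (30, 'SALES');
INSERT INTO emp VALUES (1, 'larson', 30), (2, 'smith', 10),
                       (3, 'jones', 30), (4, 'baker', 20);
""")

in_form = """
SELECT E.empno, E.lname FROM emp E
WHERE E.deptno IN (SELECT D.deptno FROM dept D WHERE D.dname = 'SALES')
ORDER BY E.empno"""

exists_form = """
SELECT E.empno, E.lname FROM emp E
WHERE EXISTS (SELECT 'X' FROM dept D
              WHERE D.deptno = E.deptno AND D.dname = 'SALES')
ORDER BY E.empno"""

# The join matches the other two only because dept.deptno is unique;
# a non-unique join key would duplicate employee rows.
join_form = """
SELECT E.empno, E.lname FROM emp E, dept D
WHERE D.dname = 'SALES' AND D.deptno = E.deptno
ORDER BY E.empno"""

in_rows = conn.execute(in_form).fetchall()
exists_rows = conn.execute(exists_form).fetchall()
join_rows = conn.execute(join_form).fetchall()
print(in_rows, exists_rows, join_rows)
```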
30
Explain

Display the access path the database will use (e.g., use
of indexes, sorts, joins, table scans)
Oracle: EXPLAIN
Sybase: SHOWPLAN
DB2: EXPLAIN
Oracle Syntax:
EXPLAIN PLAN
SET STATEMENT_ID = 'statement id'
INTO PLAN_TABLE FOR
statement
Requires Select/Insert privileges on PLAN_TABLE
31
Explain

Example 1: IN subquery
SELECT *
FROM emp E
WHERE E.deptno IN (
SELECT D.deptno
FROM dept D
WHERE D.dname = 'SALES');
Result:
MERGE JOIN
SORT (JOIN)
TABLE ACCESS (FULL) OF EMP
SORT (JOIN)
VIEW
SORT (UNIQUE)
TABLE ACCESS (FULL) OF DEPT
3 joins
1 dynamic view
2 table scans
3 sorts
32
Explain

Example 2: "EXISTS" subquery
SELECT *
FROM emp e
WHERE EXISTS (
SELECT 'x'
FROM dept d
WHERE d.deptno = e.deptno
AND d.dname = 'SALES');

Result:
FILTER
TABLE ACCESS (FULL) OF EMP
TABLE ACCESS (BY INDEX ROWID) OF DEPT
INDEX (UNIQUE SCAN) OF PK_DEPT (UNIQUE)
1 table scan
1 index scan
1 index access
33
Explain

Example 3: Join (no subquery)
SELECT E.*
FROM emp E,
dept D
WHERE D.dname = 'SALES'
AND D.deptno = E.deptno;

Result:
NESTED LOOPS
TABLE ACCESS (FULL) OF EMP
TABLE ACCESS (BY INDEX ROWID) OF DEPT
INDEX (UNIQUE SCAN) OF PK_DEPT (UNIQUE)

1 table scan
1 index scan
1 index access
34
SQL Trace

Use SQL Trace to determine the actual time and
resource costs for a statement to execute.
Step 1: ALTER SESSION SET SQL_TRACE = TRUE;
Step 2: Execute SQL to be traced:
SELECT E.*
FROM emp E,
dept D
WHERE D.dname = 'SALES'
AND D.deptno = E.deptno;

Step 3: ALTER SESSION SET SQL_TRACE = FALSE;

35
SQL Trace

Step 4: Trace file is created in <USER_DUMP_DEST>
directory on the server (specified by the DBA).
Step 5: Run TKPROF (UNIX) to create a formatted
output file:

tkprof echd_ora_15319.trc $HOME/prof.out table=plan_table explain=dbuser/passwd

echd_ora_15319.trc: Trace file
$HOME/prof.out: Formatted output file
table=plan_table: destination for Explain
explain=dbuser/passwd: user/passwd for Explain
36
SQL Trace

Step 6: view the output file:

...
SELECT E.*
FROM emp E, dept D
WHERE D.dname = 'SALES' AND D.deptno = E.deptno;

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.00 0.00 4 19 3 6
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.00 0.00 4 19 3 6

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 62 (PMARKS)

Rows Row Source Operation
------- ---------------------------------------------------
6 NESTED LOOPS
14 TABLE ACCESS FULL EMP
14 TABLE ACCESS BY INDEX ROWID DEPT
14 INDEX UNIQUE SCAN (object id 4628)

Notes:
TIMED_STATISTICS must be turned on to get the cpu and elapsed values
The Row Source Operation section is the EXPLAIN output
37
Tips and Tricks: UNION ALL
Use UNION ALL instead of UNION if there are no
duplicate rows (or if you don't mind duplicates):

UNION: requires sort
SELECT * FROM emp
UNION
SELECT * FROM emp_arch;

UNION ALL: no sort
SELECT * FROM emp
UNION ALL
SELECT * FROM emp_arch;
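
The semantic difference is easy to see on a tiny data set. A minimal sketch using SQLite through Python's `sqlite3` as a stand-in for Oracle; the sample rows are made up, with one employee deliberately present in both tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp (empno INTEGER, lname TEXT);
CREATE TABLE emp_arch (empno INTEGER, lname TEXT);
INSERT INTO emp VALUES (1, 'larson'), (2, 'smith');
INSERT INTO emp_arch VALUES (2, 'smith'), (3, 'jones');
""")

# UNION removes the duplicate (2, 'smith') row.
union_rows = conn.execute(
    "SELECT * FROM emp UNION SELECT * FROM emp_arch"
).fetchall()

# UNION ALL simply concatenates both result sets.
union_all_rows = conn.execute(
    "SELECT * FROM emp UNION ALL SELECT * FROM emp_arch"
).fetchall()

print(len(union_rows), len(union_all_rows))
```

When the two inputs cannot overlap, both forms return the same rows and UNION ALL skips the duplicate-elimination work.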
38
Tips and Tricks: HAVING vs. WHERE
With GROUP BY, use WHERE instead of HAVING (if the
filter criterion does not apply to a group function):

HAVING: rows are filtered after the groups are built
SELECT deptno,
AVG(sal)
FROM emp
GROUP BY deptno
HAVING deptno IN (10, 20);

WHERE: rows are filtered first--possibly far fewer to process
SELECT deptno,
AVG(sal)
FROM emp
WHERE deptno IN (10, 20)
GROUP BY deptno;
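
Both forms give the same answer when the filter is on a grouping column, which is what makes the rewrite safe. A minimal sketch using SQLite through Python's `sqlite3` as a stand-in for Oracle, with made-up salaries:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER, deptno INTEGER, sal REAL)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(1, 10, 1000), (2, 10, 3000), (3, 20, 2000), (4, 30, 5000)],
)

# Filter applied to the groups, after every department has been aggregated.
having_rows = conn.execute(
    "SELECT deptno, AVG(sal) FROM emp "
    "GROUP BY deptno HAVING deptno IN (10, 20) ORDER BY deptno"
).fetchall()

# Filter applied to the rows, before aggregation starts.
where_rows = conn.execute(
    "SELECT deptno, AVG(sal) FROM emp "
    "WHERE deptno IN (10, 20) GROUP BY deptno ORDER BY deptno"
).fetchall()

print(having_rows)
print(where_rows)
```

A condition on an aggregate, such as `AVG(sal) > 1500`, still has to stay in HAVING.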
39
Tips and Tricks: EXISTS vs DISTINCT
Use EXISTS instead of DISTINCT to avoid implicit sort (if
the column is indexed):

DISTINCT: implicit sort is performed to filter
duplicate rows
SELECT DISTINCT
e.deptno,
e.lname
FROM dept d,
emp e
WHERE d.deptno = e.deptno;

EXISTS: no sort
SELECT e.deptno, e.lname
FROM emp e
WHERE EXISTS (
SELECT 'X'
FROM dept d
WHERE d.deptno = e.deptno);
40
Tips and Tricks: Consolidate SQL

Select from Sequences and use SYSDATE in the
statement in which they are used:

BEFORE: 3 statements are used to perform 1 Insert
SELECT SYSDATE INTO :vardate FROM dual;
SELECT arch_seq.NEXTVAL INTO :varid FROM dual;
INSERT INTO archive
VALUES (:vardate, :varid, ...)

AFTER: only 1 statement is needed
INSERT INTO emp_archive
VALUES (SYSDATE, emp_seq.NEXTVAL, ...)
41
Tips and Tricks: Consolidate SQL

Consolidate unrelated statements using outer-joins to
the DUAL (dummy) table:

BEFORE: 2 round-trips
SELECT dname FROM dept WHERE deptno = 10;
SELECT lname FROM emp WHERE empno = 7369;

AFTER: only 1 round-trip
SELECT d.dname,
e.lname
FROM dept d,
emp e,
dual x
WHERE d.deptno (+) = 10
AND e.empno (+) = 7369
AND NVL('X', x.dummy) = NVL('X', e.ROWID (+))
AND NVL('X', x.dummy) = NVL('X', d.ROWID (+));
42
Tips and Tricks: COUNT

Use COUNT(*) instead of COUNT(column):
SELECT COUNT(empno)
FROM emp;

SELECT COUNT(*)
FROM emp;
COUNT(*): ~ 50% faster
43
Tips and Tricks: Self-Join

Use a self-join (joining a table to itself) instead of two
queries on the same table:

BEFORE: 2 round-trips
SELECT mgr INTO :varmgr FROM emp WHERE deptno = 10;
LOOP...
SELECT mgr, lname FROM emp WHERE mgr = :varmgr;

AFTER: only 1
SELECT E.mgr,
E.lname
FROM emp E,
emp M
WHERE M.deptno = 10
AND E.empno = M.mgr;
44
Tips and Tricks: ROWNUM

Use the ROWNUM pseudo-column to return only the first
N rows of a result set. (For example, if you just want a
sampling of data):
SELECT *
FROM emp
WHERE ROWNUM <= 10;
Returns only the first 10
employees in the table,
in no particular order
45
Tips and Tricks: ROWID

The ROWID pseudo-column uniquely identifies a row,
and is the fastest way to access a row:
CURSOR retired_emp_cur IS
SELECT ROWID
FROM emp
WHERE retired = 'Y';
...
FOR retired_emp_rec IN retired_emp_cur LOOP
SELECT fname || ' ' || lname
INTO :printable_name
FROM emp
WHERE ROWID = retired_emp_rec.ROWID;
...
Instead of selecting the
key column(s), ROWID is
used to identify the row
for later use
46
Tips and Tricks: Sequences

Use a Sequence to generate unique values for a table:
SELECT MAX(empno)
INTO :new_empno
FROM emp;
...
INSERT INTO emp
VALUES (:new_empno + 1, ...);
MAX(empno) requires a sort and an index scan
INSERT could fail with a Duplicate error if someone
else gets there first

INSERT INTO emp
VALUES (emp_seq.NEXTVAL, ...);
or
SELECT emp_seq.NEXTVAL
INTO :new_empno FROM dual;
Using a Sequence ensures that you always have a
unique number, and does not require any table reads
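
The duplicate-key risk of the MAX(empno) + 1 approach can be simulated in a single process. This sketch uses SQLite through Python's `sqlite3`; SQLite has no sequences, so its automatically assigned INTEGER PRIMARY KEY serves as a stand-in for `emp_seq.NEXTVAL`, and `next_empno_from_max` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, lname TEXT)")
conn.execute("INSERT INTO emp VALUES (1, 'larson')")


def next_empno_from_max():
    """The MAX(empno) + 1 approach from the slide."""
    return conn.execute("SELECT MAX(empno) FROM emp").fetchone()[0] + 1


# Two "sessions" both read the current maximum before either one inserts.
id_a = next_empno_from_max()
id_b = next_empno_from_max()

conn.execute("INSERT INTO emp VALUES (?, ?)", (id_a, "smith"))
try:
    conn.execute("INSERT INTO emp VALUES (?, ?)", (id_b, "jones"))
    duplicate_error = False
except sqlite3.IntegrityError:
    duplicate_error = True  # the second session loses the race

# Letting the database hand out the key avoids the read-then-insert gap.
cur = conn.execute("INSERT INTO emp (lname) VALUES (?)", ("jones",))
generated_empno = cur.lastrowid

print(id_a, id_b, duplicate_error, generated_empno)
```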
47
Tips and Tricks: Connect By

Use CONNECT BY to construct hierarchical queries:
SELECT LPAD(' ',4*(LEVEL-1)) || lname Name,
Job
FROM emp
WHERE job != 'CLERK'
START WITH job = 'PRESIDENT'
CONNECT BY PRIOR empno = mgr;
Result:
Name                  Job
King                  PRESIDENT
    Jones             MANAGER
        Scott         ANALYST
        Ford          ANALYST
    Blake             MANAGER
        Allen         SALESMAN
        Ward          SALESMAN
        Martin        SALESMAN
        Turner        SALESMAN
    Clark             MANAGER
48
Tips and Tricks: Cartesian Products

Avoid Cartesian products by ensuring that the tables are
joined on all shared keys:
SELECT *
FROM dept, -- 10 rows
salgrade, -- 20 rows
emp; -- 1,000 rows
Result: 10 * 20 * 1,000 = 200,000 rows

SELECT *
FROM dept D, -- 10 rows
salgrade S, -- 20 rows
emp E -- 1,000 rows
WHERE E.deptno = D.deptno
AND E.grade = S.grade;
Result: 1,000 rows
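
The row counts on this slide can be reproduced with generated data. A minimal sketch using SQLite through Python's `sqlite3` as a stand-in for Oracle; the table contents are synthetic, sized to match the slide (10, 20, and 1,000 rows), and every employee is given a valid department and grade.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dept (deptno INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE salgrade (grade INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE emp (empno INTEGER PRIMARY KEY, deptno INTEGER, grade INTEGER)"
)
conn.executemany("INSERT INTO dept VALUES (?)", [(d,) for d in range(1, 11)])
conn.executemany("INSERT INTO salgrade VALUES (?)", [(g,) for g in range(1, 21)])
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(n, n % 10 + 1, n % 20 + 1) for n in range(1, 1001)],
)

# No join conditions: every row is paired with every other row.
cartesian_rows = conn.execute(
    "SELECT COUNT(*) FROM dept, salgrade, emp"
).fetchone()[0]

# Joined on the shared keys: one result row per employee.
joined_rows = conn.execute(
    "SELECT COUNT(*) FROM dept D, salgrade S, emp E "
    "WHERE E.deptno = D.deptno AND E.grade = S.grade"
).fetchone()[0]

print(cartesian_rows, joined_rows)
```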
