Analytic Functions Demo

Overview
Analytic functions, which have been available since Oracle 8.1.6, are designed to address such problems as:
"Calculate a running total",
"Find percentages within a group",
"Top-N queries",
"Compute a moving average".
With enough effort, all of these results can be achieved with standard PL/SQL.
However, analytic functions are tightly integrated into the Oracle kernel and avoid much of the overhead
and recursive calls one would incur if the same logic were coded in PL/SQL.
Analytic functions are Oracle SQL functions; many of them correspond to the window functions of the ANSI/ISO SQL standard (SQL:2003).
What are they?

Analytic functions compute an aggregate value based on a group of rows.
The group of rows is called a window and is defined by the analytic clause.
For each row, a "sliding" window of rows is defined.
The window determines the range of rows used to perform the calculations for the "current row".
Window sizes can be based on either a physical number of rows or a logical interval such as time.
Analytic functions are the last set of operations performed in a query except for the final ORDER BY clause.
All joins and all WHERE, GROUP BY, and HAVING clauses are completed before the analytic functions are processed.
Therefore, analytic functions can appear only in the select list or ORDER BY clause.
Analytic list:
AVG CORR COUNT CUME_DIST
DENSE_RANK FIRST FIRST_VALUE LAG
LAST LAST_VALUE LEAD MAX
MIN NTILE PERCENT_RANK PERCENTILE_CONT
PERCENTILE_DISC RANK RATIO_TO_REPORT REGR_AVGX
REGR_AVGY REGR_COUNT REGR_INTERCEPT REGR_R2
REGR_SLOPE REGR_SXX REGR_SXY REGR_SYY
ROW_NUMBER STDDEV STDDEV_POP STDDEV_SAMP
SUM VAR_POP VAR_SAMP VARIANCE
Syntax
All analytic functions have the general format syntax of:
Analytic-Function(<Argument>,<Argument>,...)
OVER (
<Query-Partition-Clause>
<Order-By-Clause>
<Windowing-Clause>
)
Query-Partition-Clause (group)
The PARTITION BY clause logically breaks a single result set into N groups,
according to the criteria set by the partition expressions.
The words "partition" and "group" are used synonymously here.
The analytic functions are applied to each group independently; they are reset for each group.
Order-By-Clause (within group)
The ORDER BY clause specifies how the data is sorted within each group (partition).
This will definitely affect the outcome of any analytic function.
Windowing-Clause
The windowing clause gives us a way to define a sliding or anchored window of data, within a group,
on which the analytic function will operate.
The window can be any arbitrary sliding or anchored set of rows within the group.
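The three parts of the OVER clause can be tried out without an Oracle instance. The following is a minimal sketch, assuming Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in for Oracle; the rows are copied from the mydata0 output shown in this document, while the column types and the alias period_total are assumptions made for the sketch.

```python
import sqlite3

# Sample rows copied from the mydata0 output in this document (two periods only).
rows = [
    (200601, "26078", 0.0), (200601, "26107", 175.0),
    (200601, "26116", 250.0), (200601, "29083", 137.5),
    (200602, "26078", 0.0), (200602, "26107", 175.0),
    (200602, "26116", 175.0), (200602, "29083", 87.5),
]

conn = sqlite3.connect(":memory:")
# Assumed column types; the real table is described with "desc mydata0".
conn.execute("CREATE TABLE mydata0 (cal_mnth INTEGER, emply_id TEXT, amt REAL)")
conn.executemany("INSERT INTO mydata0 VALUES (?, ?, ?)", rows)

result = conn.execute("""
    SELECT cal_mnth, emply_id, amt,
           -- no PARTITION BY: one running total over the whole result set
           SUM(amt) OVER (ORDER BY cal_mnth, emply_id) AS running_total,
           -- PARTITION BY: the running total restarts for each cal_mnth
           SUM(amt) OVER (PARTITION BY cal_mnth ORDER BY emply_id) AS period_total,
           ROW_NUMBER() OVER (PARTITION BY cal_mnth ORDER BY emply_id) AS seq
    FROM mydata0
    ORDER BY cal_mnth, emply_id
""").fetchall()

for row in result:
    print(row)
```

The partitioned columns restart at each new cal_mnth, while the unpartitioned running total keeps accumulating across periods.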
Performance (changes to explain plan)
desc mydata0
Name Null? Type
------------------------------------------------------------------------ -------- ------------
CAL_MNTH NOT NULL NUMBER(28)
EMPLY_ID NOT NULL VARCHAR2(10)
ASMNTPLN_ID NOT NULL NUMBER(28)
RCGNTNLVL_ID NUMBER(28)
PREREQ_ASMNTPLN_ID NUMBER(28)
AMT NUMBER(7,2)
ALTER SESSION SET OPTIMIZER_MODE = 'FIRST_ROWS';

SET AUTOTRACE on
break on cal_mnth skip 1 dup
SELECT cal_mnth, emply_id, AMT
FROM mydata0
where 1=1
and emply_id in (26078, 26107, 26116, 29083) and cal_mnth between 200601 and 200606
ORDER BY cal_mnth, emply_id
CAL_MNTH EMPLY_ID AMT
---------- ---------- ----------
200601 26078 0
200601 26107 175
200601 26116 250
200601 29083 137.5
200602 26078 0
200602 26107 175
200602 26116 175
200602 29083 87.5
200603 26078 0
200603 26107 175
200603 26116 175
200603 29083 0
200604 26078 50
200604 26107 250
200604 26116 250
200604 29083 50
200605 26078 87.5

200605 26107 175
200605 26116 400
200605 29083 50
200606 26078 50
200606 26107 175
200606 26116 250
200606 29083 87.5
24 rows selected.
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=FIRST_ROWS (Cost=30 Card=26 Bytes=624)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'MYDATA0' (TABLE) (Cost=30 Card=26 Bytes=624)
2 1 INDEX (RANGE SCAN) OF 'MY_PK' (INDEX (UNIQUE)) (Cost=17 Card=26)
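break on cal_mnth skip 1 dup
SELECT cal_mnth, emply_id, AMT,
SUM(amt)
OVER (ORDER BY cal_mnth, emply_id
) Running_Total,
SUM(AMT)
OVER (PARTITION BY cal_mnth
ORDER BY emply_id
) reporting_period_Total ,
ROW_NUMBER()
OVER (PARTITION BY cal_mnth
ORDER BY EMPLY_ID
) seq
FROM mydata0
where 1=1
and emply_id in (26078, 26107, 26116, 29083) and cal_mnth between 200601 and 200606
ORDER BY cal_mnth, emply_id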
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=FIRST_ROWS (Cost=30 Card=26 Bytes=624)
1 0 WINDOW (BUFFER) (Cost=30 Card=26 Bytes=624)
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'MYDATA0' (TABLE) (Cost=30 Card=26 Bytes=624)
3 2 INDEX (RANGE SCAN) OF 'MY_PK' (INDEX (UNIQUE)) (Cost=17 Card=26)
Output from the running total query listed above.
CAL_MNTH EMPLY_ID AMT RUNNING_TOTAL REPORTING_PERIOD_TOTAL SEQ
------------ ---------- --------------------- ------------- ---------------------- ----------
200601 26078 0 0 0 1
200601 26107 175 175 175 2
200601 26116 250 425 425 3
200601 29083 137.5 562.5 562.5 4
200602 26078 0 562.5 0 1

200602 26107 175 737.5 175 2
200602 26116 175 912.5 350 3
200602 29083 87.5 1000 437.5 4
200603 26078 0 1000 0 1

200603 26107 175 1175 175 2
200603 26116 175 1350 350 3
200603 29083 0 1350 350 4
200604 26078 50 1400 50 1

200604 26107 250 1650 300 2
200604 26116 250 1900 550 3
200604 29083 50 1950 600 4
200605 26078 87.5 2037.5 87.5 1

200605 26107 175 2212.5 262.5 2
200605 26116 400 2612.5 662.5 3
200605 29083 50 2662.5 712.5 4
200606 26078 50 2712.5 50 1

200606 26107 175 2887.5 225 2
200606 26116 250 3137.5 475 3
200606 29083 87.5 3225 562.5 4
24 rows selected.
Notice/warning
Compare and contrast the two result sets below. The first one looks
wrong because its "running total" is computed in a different order (amt descending) than the order in which the rows are displayed.
With most analytic functions, the ORDER BY clauses are essential to understanding what is being presented.
The point of this example is to illustrate that ORDER BY clauses can and do affect the meaning of a report.
The query below lets you easily find your emply_id, since the output is sorted by that column.
You can then say: my amt was x, and the running total of the top contributors with my amount or higher is y.
In the report below, if you are employee 26107 in period 200605, your amount was 175, and the
running total of all people who contributed at your level or higher comes to 575 (175+400=575).

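SELECT cal_mnth, emply_id, amt,
SUM(amt)
OVER (PARTITION BY cal_mnth ORDER BY cal_mnth, AMT desc ) Running_Total
FROM mydata0
where 1=1
and emply_id in (26078, 26107, 26116, 29083) and cal_mnth between 200601 and 200606
ORDER BY cal_mnth, emply_id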
CAL_MNTH EMPLY_ID AMT RUNNING_TOTAL

------------ ---------- --------------------- -------------
200601 26078 0 562.5
200601 26107 175 425
200601 26116 250 250
200601 29083 137.5 562.5
200602 26078 0 437.5

200602 26107 175 350
200602 26116 175 350
200602 29083 87.5 437.5
200603 26078 0 350

200603 26107 175 350
200603 26116 175 350
200603 29083 0 350
200604 26078 50 600

200604 26107 250 500
200604 26116 250 500
200604 29083 50 600
200605 26078 87.5 662.5

200605 26107 175 575
200605 26116 400 400
200605 29083 50 712.5
200606 26078 50 562.5

200606 26107 175 425
200606 26116 250 250
200606 29083 87.5 512.5
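The above report is correct; it just answers a different question than a pure running total of the rows as displayed.
The SQL for a running total in display order would be:

SELECT cal_mnth, emply_id, amt,
SUM(amt)
OVER (PARTITION BY cal_mnth ORDER BY cal_mnth, emply_id ) Running_Total
FROM mydata0
where 1=1
and emply_id in (26078, 26107, 26116, 29083) and cal_mnth between 200601 and 200606
ORDER BY cal_mnth, emply_id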
CAL_MNTH EMPLY_ID AMT RUNNING_TOTAL

------------ ---------- --------------------- -------------
200601 26078 0 0
200601 26107 175 175
200601 26116 250 425
200601 29083 137.5 562.5
200602 26078 0 0
200602 26107 175 175
200602 26116 175 350
200602 29083 87.5 437.5
200603 26078 0 0
200603 26107 175 175
200603 26116 175 350
200603 29083 0 350
200604 26078 50 50
200604 26107 250 300
200604 26116 250 550
200604 29083 50 600
200605 26078 87.5 87.5

200605 26107 175 262.5
200605 26116 400 662.5
200605 29083 50 712.5
200606 26078 50 50
200606 26107 175 225
200606 26116 250 475
200606 29083 87.5 562.5
24 rows selected.
Top-N Queries
Example 1
TOP x people in each reporting period
Break on cal_mnth skip 1
SELECT *
FROM (
SELECT cal_mnth, emply_id, amt,
DENSE_RANK()
OVER (
PARTITION BY cal_mnth ORDER BY amt DESC
) dr,
RANK()
OVER (
PARTITION BY cal_mnth ORDER BY amt DESC
) r,
ROW_NUMBER()
OVER (
PARTITION BY cal_mnth ORDER BY amt DESC
) seq
FROM mydata0
where 1=1
-- and emply_id in (26078, 26107, 26116, 29083)
and cal_mnth between 200601 and 200603 )
WHERE dr <= 4
order by cal_mnth, amt DESC
Examine the data below for 200601: by DENSE_RANK (dr), four people are tied for 1st and nine people are tied for 2nd.
Looked at through RANK (r), the same data reads differently:
four people are tied for 1st, nine people are tied for 5th, and then come 14th and 15th place.
The last column, seq, is a straight sequence (ROW_NUMBER) within each group.
CAL_MNTH EMPLY_ID AMT DR R SEQ
------------ ---------- --------------------- ---------- ---------- ----------
200601 09158 1500 1 1 1
03389 1500 1 1 2
28918 1500 1 1 3
27001 1500 1 1 4
27501 450 2 5 5
08201 450 2 5 6
08010 450 2 5 7
27237 450 2 5 8
27028 450 2 5 9
27866 450 2 5 10
26290 450 2 5 11
29249 450 2 5 12
26681 450 2 5 13
25585 400 3 14 14
28286 350 4 15 15
200602 28158 2250 1 1 1

09158 2125 2 2 2
28918 2125 2 2 3
28025 2000 3 4 4
03389 2000 3 4 5
28773 2000 3 4 6
02057 2000 3 4 7
03389 1500 4 8 8
04058 1500 4 8 9
08011 1500 4 8 10
06893 1500 4 8 11
28036 1500 4 8 12
28149 1500 4 8 13
07769 1500 4 8 14
27001 1500 4 8 15
. . .
. . .
. . .
49 rows selected.
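The difference between the three ranking functions, and the Top-N filter built on them, can be reproduced on a handful of rows. This is a minimal sketch, assuming Python's sqlite3 module (SQLite 3.25 or later) in place of Oracle; the table name scores and its contents are hypothetical values chosen to contain ties.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (emply_id TEXT, amt INTEGER)")
# Hypothetical amounts with deliberate ties.
conn.executemany("INSERT INTO scores VALUES (?, ?)",
                 [("A", 1500), ("B", 1500), ("C", 450),
                  ("D", 450), ("E", 450), ("F", 400)])

ranked = conn.execute("""
    SELECT emply_id, amt,
           DENSE_RANK() OVER (ORDER BY amt DESC) AS dr,  -- ties share a rank, no gaps
           RANK()       OVER (ORDER BY amt DESC) AS r,   -- ties share a rank, gaps follow
           ROW_NUMBER() OVER (ORDER BY amt DESC, emply_id) AS seq  -- always unique
    FROM scores
    ORDER BY amt DESC, emply_id
""").fetchall()

# Top-N by DENSE_RANK: the analytic function is computed in an inline view,
# because it cannot be referenced in the WHERE clause of the same query block.
top2 = conn.execute("""
    SELECT emply_id
    FROM (SELECT emply_id, DENSE_RANK() OVER (ORDER BY amt DESC) AS dr FROM scores)
    WHERE dr <= 2
    ORDER BY emply_id
""").fetchall()

for row in ranked:
    print(row)
print([r[0] for r in top2])
```

Filtering on dr <= 2 returns five rows here, not two, because every row tied for a qualifying rank is kept.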
Windows
The windowing clause gives us a way to define a sliding window, within a group,
on which the analytic function will operate.
The default window is an anchored window that simply starts at the first row of a group
and continues to the current row.
The ORDER BY in an analytic function implies a default window clause of RANGE UNBOUNDED PRECEDING.
That says to get all rows in our partition that came before us as specified by the ORDER BY clause,
including any rows that tie with the current row on the ORDER BY value.
Windows can be based on only two criteria: RANGES of data values or ROWS offset from the current row.
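The effect of the default RANGE window on tied values can be seen by placing it next to an explicit ROWS window. The sketch below is an illustration only, assuming Python's sqlite3 module (SQLite 3.25 or later) in place of Oracle; the four rows are the 200604 values from the reports in this document, and the emply_id tie-breaker in the ROWS version is an assumption added to make the row order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mydata0 (emply_id TEXT, amt REAL)")
# The 200604 rows from the reports in this document; two pairs of ties.
conn.executemany("INSERT INTO mydata0 VALUES (?, ?)",
                 [("26078", 50.0), ("26107", 250.0),
                  ("26116", 250.0), ("29083", 50.0)])

frames = conn.execute("""
    SELECT emply_id, amt,
           -- default window: RANGE UNBOUNDED PRECEDING, so tied rows are summed together
           SUM(amt) OVER (ORDER BY amt DESC) AS default_range,
           -- explicit ROWS window: strictly one row at a time
           SUM(amt) OVER (ORDER BY amt DESC, emply_id
                          ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS by_rows
    FROM mydata0
    ORDER BY amt DESC, emply_id
""").fetchall()

for row in frames:
    print(row)
```

Both employees with 250 show 500 under the default window, which matches the 200604 lines of the report sorted by amount, whereas the ROWS window steps through 250, 500, 550, 600.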
Range Windows
Range windows collect rows together based on a WHERE clause.
' range 5 preceding ' means generate a sliding window of preceding rows in the group such
that they are within 5 units of the current row.
These units may either be numeric comparisons or date comparisons.
It is not valid to use RANGE with datatypes other than numbers and dates.
Count of users created WITHIN 5 days of the current row.
alter session set nls_date_format = 'yyyymmdd_hh24miss';

SELECT username, created, default_tablespace from dba_users order by created;
USERNAME CREATED DEFAULT_TABLESPACE

------------------------------ --------------- ------------------------------
SYS 20040309_235807 SYSTEM
SYSTEM 20040309_235808 SYSTEM
OUTLN 20040309_235811 SYSTEM
DIP 20040310_000508 USERS
DBSNMP 20040310_001451 SYSAUX
WMSYS 20040310_001652 SYSAUX
EXFSYS 20040310_003034 SYSAUX
MDSYS 20040310_003115 SYSAUX
ORDPLUGINS 20040310_003115 SYSAUX
SI_INFORMTN_SCHEMA 20040310_003115 SYSAUX
ORDSYS 20040310_003115 SYSAUX
DMSYS 20040310_004139 SYSAUX
CTXSYS 20040310_004249 SYSAUX
ANONYMOUS 20040310_004418 SYSAUX
XDB 20040310_004418 SYSAUX
OLAPSYS 20040310_004836 SYSAUX
MDDATA 20040310_005112 USERS
WKSYS 20040310_005559 SYSAUX
WKPROXY 20040310_005559 SYSAUX
WK_TEST 20040310_005602 SYSAUX
SYSMAN 20040310_005716 SYSAUX
MGMT_VIEW 20040310_010057 SYSAUX
SCOTT 20040310_010532 USERS
BI 20080327_110350 USERS
HR 20080327_110350 USERS
OE 20080327_110350 USERS
PM 20080327_110350 USERS
IX 20080327_110350 USERS
SH 20080327_110350 USERS
29 rows selected.

SELECT username, created,
COUNT(*)
OVER (
ORDER BY created ASC
RANGE 5 PRECEDING
) cnt
FROM dba_users
ORDER BY created ASC
USERNAME CREATED CNT

------------------------------ --------------- ----------
SYS 20040309_235807 1
SYSTEM 20040309_235808 2
OUTLN 20040309_235811 3
DIP 20040310_000508 4
DBSNMP 20040310_001451 5
WMSYS 20040310_001652 6
EXFSYS 20040310_003034 7
MDSYS 20040310_003115 11
ORDPLUGINS 20040310_003115 11
SI_INFORMTN_SCHEMA 20040310_003115 11
ORDSYS 20040310_003115 11
DMSYS 20040310_004139 12
CTXSYS 20040310_004249 13
ANONYMOUS 20040310_004418 15
XDB 20040310_004418 15
OLAPSYS 20040310_004836 16
MDDATA 20040310_005112 17
WKSYS 20040310_005559 19
WKPROXY 20040310_005559 19
WK_TEST 20040310_005602 20
SYSMAN 20040310_005716 21
MGMT_VIEW 20040310_010057 22
SCOTT 20040310_010532 23
BI 20080327_110350 6
HR 20080327_110350 6
OE 20080327_110350 6
PM 20080327_110350 6
IX 20080327_110350 6
SH 20080327_110350 6
29 rows selected.
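In the report above, the window covers the 5 days before each row's CREATED value, up to and including the current row.
Because RANGE compares values rather than row positions, rows that share the same CREATED value (down to the second)
fall into each other's windows. This explains why the count jumps from 7 to 11 above.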
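The same jump can be reproduced on a tiny data set. This is a minimal sketch, assuming Python's sqlite3 module with SQLite 3.28 or later (the first release to accept RANGE with a numeric offset) in place of Oracle; the users table, its names, and the plain day numbers used instead of Oracle DATE values are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: created_day is a plain day number instead of an Oracle DATE.
conn.execute("CREATE TABLE users (username TEXT, created_day INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("U1", 1), ("U2", 3), ("U3", 3), ("U4", 20)])

counts = conn.execute("""
    SELECT username, created_day,
           COUNT(*) OVER (ORDER BY created_day
                          RANGE BETWEEN 5 PRECEDING AND CURRENT ROW) AS cnt
    FROM users
    ORDER BY created_day, username
""").fetchall()

for row in counts:
    print(row)
```

U2 and U3 share a created_day, so each counts the other and the count goes from 1 straight to 3; U4 is more than 5 days after everyone else and counts only itself.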
Row Windows
Row windows are defined in physical units: the physical number of rows to include in the window.
In the example below we sum the AMT column for the current person and the two people above them,
for example to see what the total would be if teams of three people were formed.
SELECT *
FROM (
SELECT cal_mnth, emply_id, amt,
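DENSE_RANK()
OVER (
PARTITION BY cal_mnth ORDER BY amt DESC
) dr,
RANK()
OVER (
PARTITION BY cal_mnth ORDER BY amt DESC
) r,
ROW_NUMBER()
OVER (
PARTITION BY cal_mnth ORDER BY amt DESC
) seq ,
SUM(amt)
OVER (
PARTITION BY cal_mnth
ORDER BY amt DESC
ROWS 2 PRECEDING) You_and_2above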
FROM mydata0
where 1=1
-- and emply_id in (26078, 26107, 26116)
and cal_mnth between 200601 and 200603 )
WHERE dr <= 4
order by cal_mnth, amt DESC
CAL_MNTH EMPLY_ID AMT DR R SEQ YOU_AND_2ABOVE

------------ ---------- --------------------- ---------- ---------- ---------- --------------
200601 09158 1500 1 1 1 1500
03389 1500 1 1 2 3000
28918 1500 1 1 3 4500
27001 1500 1 1 4 4500
27501 450 2 5 5 3450
08201 450 2 5 6 2400
08010 450 2 5 7 1350
27237 450 2 5 8 1350
27028 450 2 5 9 1350
27866 450 2 5 10 1350
26290 450 2 5 11 1350
29249 450 2 5 12 1350
26681 450 2 5 13 1350
25585 400 3 14 14 1300
28286 350 4 15 15 1200
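The arithmetic behind the YOU_AND_2ABOVE column can be checked on the first six rows of the report. The sketch below assumes Python's sqlite3 module (SQLite 3.25 or later) in place of Oracle; the amounts come from the 200601 rows above, and the emply_id tie-breaker in the ORDER BY is an assumption added so that the row order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mydata0 (emply_id TEXT, amt INTEGER)")
# First six 200601 rows from the report above.
conn.executemany("INSERT INTO mydata0 VALUES (?, ?)",
                 [("09158", 1500), ("03389", 1500), ("28918", 1500),
                  ("27001", 1500), ("27501", 450), ("08201", 450)])

teams = conn.execute("""
    SELECT emply_id, amt,
           -- current row plus the two physical rows before it
           SUM(amt) OVER (ORDER BY amt DESC, emply_id
                          ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS you_and_2above
    FROM mydata0
    ORDER BY amt DESC, emply_id
""").fetchall()

for row in teams:
    print(row)
```

The first two rows have fewer than two predecessors, so their sums cover only the rows that exist (1500 and 3000).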
Accessing Rows Around Your Current Row

Frequently you want to access data not only from the current row but also from the rows "in front of" or "behind" it.
For example, let's say you need a report that shows, for each user, when it was created,
how many days had passed since the previous user was created,
and how many days passed until the next user was created.
Lag (looking behind)

LAG ( value_expr [, offset] [, default] )
OVER ( [query_partition_clause] order_by_clause )
Lead (looking ahead)

LEAD ( value_expr [, offset] [, default] )
OVER ( [query_partition_clause] order_by_clause )
If you do not specify offset, then its default is 1.
The optional default value is returned if the offset goes beyond the scope of the window.
If you do not specify default, then its default value is null.
The following query shows when each user was created and its default tablespace.
It also shows the default tablespace used for the previously created user and the one used for the next user created.
Column username format a15 trunc
SELECT created, USERNAME, DEFAULT_TABLESPACE,

LAG(DEFAULT_TABLESPACE, 1, '?')
OVER (ORDER BY created) AS prior_dflt,
lead (DEFAULT_TABLESPACE, 1, '?')
OVER (ORDER BY created) AS next_dflt
FROM dba_users order by created ;
CREATED USERNAME DEFAULT_TABLESPACE PRIOR_DFLT NEXT_DFLT

--------------- --------------- ------------------------------ ------------------------------ ------
20040309_235807 SYS SYSTEM ? SYSTEM
20040309_235808 SYSTEM SYSTEM SYSTEM SYSTEM
20040309_235811 OUTLN SYSTEM SYSTEM USERS
20040310_000508 DIP USERS SYSTEM SYSAUX
20040310_001451 DBSNMP SYSAUX USERS SYSAUX
20040310_001652 WMSYS SYSAUX SYSAUX SYSAUX
20040310_003034 EXFSYS SYSAUX SYSAUX SYSAUX
20040310_003115 MDSYS SYSAUX SYSAUX SYSAUX
20040310_003115 ORDPLUGINS SYSAUX SYSAUX SYSAUX
20040310_003115 SI_INFORMTN_SCH SYSAUX SYSAUX SYSAUX
20040310_003115 ORDSYS SYSAUX SYSAUX SYSAUX
20040310_004139 DMSYS SYSAUX SYSAUX SYSAUX
20040310_004249 CTXSYS SYSAUX SYSAUX SYSAUX
20040310_004418 ANONYMOUS SYSAUX SYSAUX SYSAUX
20040310_004418 XDB SYSAUX SYSAUX SYSAUX
20040310_004836 OLAPSYS SYSAUX SYSAUX USERS
20040310_005112 MDDATA USERS SYSAUX SYSAUX
20040310_005559 WKSYS SYSAUX USERS SYSAUX
20040310_005559 WKPROXY SYSAUX SYSAUX SYSAUX
20040310_005602 WK_TEST SYSAUX SYSAUX SYSAUX
20040310_005716 SYSMAN SYSAUX SYSAUX SYSAUX
20040310_010057 MGMT_VIEW SYSAUX SYSAUX USERS
20040310_010532 SCOTT USERS SYSAUX USERS
20080327_110350 BI USERS USERS USERS
20080327_110350 HR USERS USERS USERS
20080327_110350 OE USERS USERS USERS
20080327_110350 PM USERS USERS USERS
20080327_110350 IX USERS USERS USERS
20080327_110350 SH USERS USERS ?
(Try the same query as above with the rows displayed in a different order; the LAG and LEAD values for each user stay the same.)

SELECT created, USERNAME, DEFAULT_TABLESPACE,
LAG(DEFAULT_TABLESPACE, 1, '?')
OVER (ORDER BY created) AS prior_dflt,
lead (DEFAULT_TABLESPACE, 1, '?')
OVER (ORDER BY created) AS next_dflt
FROM dba_users order by username ;
CREATED USERNAME DEFAULT_TABLESPACE PRIOR_DFLT NEXT_DFLT

--------------- --------------- ------------------------------ ------------------------------ ------
20040310_004418 ANONYMOUS SYSAUX SYSAUX SYSAUX
20080327_110350 BI USERS USERS USERS
20040310_004249 CTXSYS SYSAUX SYSAUX SYSAUX
20040310_001451 DBSNMP SYSAUX USERS SYSAUX
20040310_000508 DIP USERS SYSTEM SYSAUX
20040310_004139 DMSYS SYSAUX SYSAUX SYSAUX
20040310_003034 EXFSYS SYSAUX SYSAUX SYSAUX
20080327_110350 HR USERS USERS USERS
20080327_110350 IX USERS USERS USERS
20040310_005112 MDDATA USERS SYSAUX SYSAUX
20040310_003115 MDSYS SYSAUX SYSAUX SYSAUX
20040310_010057 MGMT_VIEW SYSAUX SYSAUX USERS
20080327_110350 OE USERS USERS USERS
20040310_004836 OLAPSYS SYSAUX SYSAUX USERS
20040310_003115 ORDPLUGINS SYSAUX SYSAUX SYSAUX
20040310_003115 ORDSYS SYSAUX SYSAUX SYSAUX
20040309_235811 OUTLN SYSTEM SYSTEM USERS
20080327_110350 PM USERS USERS USERS
20040310_010532 SCOTT USERS SYSAUX USERS
20080327_110350 SH USERS USERS ?
20040310_003115 SI_INFORMTN_SCH SYSAUX SYSAUX SYSAUX
20040309_235807 SYS SYSTEM ? SYSTEM
20040310_005716 SYSMAN SYSAUX SYSAUX SYSAUX
20040309_235808 SYSTEM SYSTEM SYSTEM SYSTEM
20040310_005559 WKPROXY SYSAUX SYSAUX SYSAUX
20040310_005559 WKSYS SYSAUX USERS SYSAUX
20040310_005602 WK_TEST SYSAUX SYSAUX SYSAUX
20040310_001652 WMSYS SYSAUX SYSAUX SYSAUX
20040310_004418 XDB SYSAUX SYSAUX SYSAUX
29 rows selected.
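Both points made in this section, the "days since the previous user" calculation and the fact that the final ORDER BY does not change LAG and LEAD results, can be sketched as follows. This assumes Python's sqlite3 module (SQLite 3.25 or later) in place of Oracle; the users table, the created_day day numbers, and the three sample rows are hypothetical stand-ins for dba_users.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for dba_users; created_day is a plain day number.
conn.execute(
    "CREATE TABLE users (username TEXT, created_day INTEGER, default_tablespace TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?, ?)",
                 [("SYS", 1, "SYSTEM"), ("DIP", 2, "USERS"), ("DBSNMP", 9, "SYSAUX")])

# The analytic ORDER BY (created_day) is fixed; only the final ORDER BY varies.
sql = """
    SELECT username,
           LAG(default_tablespace, 1, '?')  OVER (ORDER BY created_day) AS prior_dflt,
           LEAD(default_tablespace, 1, '?') OVER (ORDER BY created_day) AS next_dflt,
           created_day - LAG(created_day) OVER (ORDER BY created_day) AS days_since_prev
    FROM users
    ORDER BY {}
"""
by_created = conn.execute(sql.format("created_day")).fetchall()
by_name = conn.execute(sql.format("username")).fetchall()

print(by_created)
print(by_name)
```

The first user has no predecessor, so prior_dflt falls back to the default '?' while days_since_prev, which has no default, is NULL (None in Python).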
Determine the First Value / Last Value of a Group

The FIRST_VALUE and LAST_VALUE functions allow you to select the first and last rows from a group.
These rows are especially valuable because they are often used as the baselines in calculations.
Example
break on cal_mnth skip 1
SELECT *
FROM (
SELECT
cal_mnth, emply_id, amt,
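FIRST_VALUE(emply_id)
OVER (PARTITION BY cal_mnth
ORDER BY amt ASC) AS MIN_prsn,
FIRST_VALUE(amt)
OVER (PARTITION BY cal_mnth
ORDER BY amt ASC) AS MIN_amt,
FIRST_VALUE(emply_id)
OVER (PARTITION BY cal_mnth
ORDER BY amt desc) AS max_prsn,
FIRST_VALUE(amt)
OVER (PARTITION BY cal_mnth
ORDER BY amt desc) AS max_amt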
FROM mydata0
where 1=1
and emply_id in (26078, 26107, 26116, 29083)
and cal_mnth between 200601 and 200603
)
WHERE 1=1
order by cal_mnth, emply_id DESC
CAL_MNTH EMPLY_ID AMT MIN_PRSN MIN_AMT MAX_PRSN MAX_AMT

---------- ---------- ---------- ---------- ---------- ---------- ----------
200601 29083 137.5 26078 0 26116 250
26116 250 26078 0 26116 250
26107 175 26078 0 26116 250
26078 0 26078 0 26116 250
200602 29083 87.5 26078 0 26107 175

26116 175 26078 0 26107 175
26107 175 26078 0 26107 175
26078 0 26078 0 26107 175
200603 29083 0 26078 0 26107 175

26116 175 26078 0 26107 175
26107 175 26078 0 26107 175
26078 0 26078 0 26107 175
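The baseline idea can be tried on two periods that contain no ties, so the first row in each ordering is unambiguous. This is a minimal sketch, assuming Python's sqlite3 module (SQLite 3.25 or later) in place of Oracle; the rows are the 200601 and 200605 values from the earlier reports, and the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mydata0 (cal_mnth INTEGER, emply_id TEXT, amt REAL)")
# 200601 and 200605 rows from the earlier reports (no tied amounts within a period).
conn.executemany("INSERT INTO mydata0 VALUES (?, ?, ?)", [
    (200601, "26078", 0.0), (200601, "26107", 175.0),
    (200601, "26116", 250.0), (200601, "29083", 137.5),
    (200605, "26078", 87.5), (200605, "26107", 175.0),
    (200605, "26116", 400.0), (200605, "29083", 50.0),
])

baseline = conn.execute("""
    SELECT cal_mnth, emply_id, amt,
           FIRST_VALUE(emply_id) OVER (PARTITION BY cal_mnth ORDER BY amt ASC)  AS min_prsn,
           FIRST_VALUE(emply_id) OVER (PARTITION BY cal_mnth ORDER BY amt DESC) AS max_prsn
    FROM mydata0
    ORDER BY cal_mnth, emply_id
""").fetchall()

for row in baseline:
    print(row)
```

When amounts tie within a period, as in 200602 and 200603 above, which of the tied rows comes first is not determined by the ORDER BY, so a tie-breaking column would be needed for repeatable results.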
Crosstab or Pivot Queries

A crosstab query, sometimes known as a pivot query, turns row values into columns.
The SQL below is the basis for a true crosstab; it is close, but there are holes in the report.
SELECT cal_mnth,
DECODE(seq,1,emply_id,null) first,
DECODE(seq,2,emply_id,null) second,
DECODE(seq,3,emply_id,null) third,
DECODE(seq,1,amt,null) firstamt,
DECODE(seq,2,amt,null) secondamt,
DECODE(seq,3,amt,null) thirdamt
FROM (SELECT cal_mnth, emply_id, amt,
row_number()
OVER (PARTITION BY cal_mnth
ORDER BY amt desc NULLS LAST) seq
FROM mydata0
where 1=1
)
WHERE seq <= 3
CAL_MNTH FIRST SECOND THIRD FIRSTAMT SECONDAMT THIRDAMT
---------- ---------- ---------- ---------- ---------- ---------- ----------
200601 09158 1500
03389 1500
28918 1500
200602 28158 2250

09158 2125
28918 2125
200603 28918 2125

07769 2125
28036 2000
9 rows selected.
To turn the above into a true crosstab, wrap each column in MAX and group by cal_mnth.
SELECT cal_mnth,
MAX(DECODE(seq,1,emply_id,null) ) first,
MAX(DECODE(seq,2,emply_id,null) ) second,
MAX(DECODE(seq,3,emply_id,null) ) third,
MAX(DECODE(seq,1,amt,null)) firstamt,
MAX(DECODE(seq,2,amt,null)) secondamt,
MAX(DECODE(seq,3,amt,null)) thirdamt
FROM (SELECT cal_mnth, emply_id, amt,
row_number()
OVER (PARTITION BY cal_mnth
ORDER BY amt desc NULLS LAST) seq
FROM mydata0
where 1=1
)
WHERE seq <= 3
GROUP BY cal_mnth;
CAL_MNTH FIRST SECOND THIRD FIRSTAMT SECONDAMT THIRDAMT

---------- ---------- ---------- ---------- ---------- ---------- ----------
200601 09158 03389 28918 1500 1500 1500
200602 28158 09158 28918 2250 2125 2125
200603 28918 07769 28036 2125 2125 2000
200604 27978 28761 29128 2400 2400 2400
200605 04027 27978 28158 2400 2250 2250
200606 04027 27978 28158 2400 2250 2250
6 rows selected.
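The collapse from one row per rank to one row per period can be sketched outside Oracle as well. The example below assumes Python's sqlite3 module (SQLite 3.25 or later); because SQLite has no DECODE, a CASE expression is swapped in for it, and the sample rows and the column aliases (first_id and so on) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mydata0 (cal_mnth INTEGER, emply_id TEXT, amt INTEGER)")
# Hypothetical rows without ties, so ROW_NUMBER is deterministic.
conn.executemany("INSERT INTO mydata0 VALUES (?, ?, ?)", [
    (200601, "A", 300), (200601, "B", 200), (200601, "C", 100), (200601, "D", 50),
    (200602, "B", 400), (200602, "A", 350), (200602, "D", 20),
])

pivot = conn.execute("""
    SELECT cal_mnth,
           -- CASE replaces Oracle's DECODE; MAX ignores the NULL "holes"
           MAX(CASE WHEN seq = 1 THEN emply_id END) AS first_id,
           MAX(CASE WHEN seq = 2 THEN emply_id END) AS second_id,
           MAX(CASE WHEN seq = 3 THEN emply_id END) AS third_id,
           MAX(CASE WHEN seq = 1 THEN amt END) AS first_amt,
           MAX(CASE WHEN seq = 2 THEN amt END) AS second_amt,
           MAX(CASE WHEN seq = 3 THEN amt END) AS third_amt
    FROM (SELECT cal_mnth, emply_id, amt,
                 ROW_NUMBER() OVER (PARTITION BY cal_mnth ORDER BY amt DESC) AS seq
          FROM mydata0)
    WHERE seq <= 3
    GROUP BY cal_mnth
    ORDER BY cal_mnth
""").fetchall()

for row in pivot:
    print(row)
```

Within each group only one row carries a non-NULL value for a given seq, so MAX simply picks that value and the three sparse rows fold into one.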
NTILE Divides an ordered data set into a number of buckets
The example below answers the question: which key values divide a table into x groups of equal row count?
column from_key format a40

column to_key format a40
column cnt format 9999
column NT format 99
SELECT
NT,
COUNT(*) CNT,
MIN(mykey ) from_key,
MAX(mykey ) to_key
FROM (SELECT
owner||'.'||table_name mykey,
NTILE(10) OVER (ORDER BY owner||'.'||table_name ) NT
FROM dba_tables
)
GROUP BY NT
NT CNT FROM_KEY TO_KEY

--- ----- ---------------------------------------- ----------------------------------------
1 154 CTXSYS.DR$CLASS MDSYS.SDO_TXN_IDX_DELETES
2 154 MDSYS.SDO_TXN_IDX_EXP_UPD_RGN SCOTT.EMP
3 154 SCOTT.MYDATA0 SYS.DIR$MIGRATE_OPERATIONS
4 154 SYS.DIR$NODE_ATTRIBUTES SYS.NCOMP_DLL$
5 154 SYS.NOEXP$ SYS.SUMDETAIL$
6 154 SYS.SUMINLINE$ SYS.WRI$_ADV_SQLA_STMTS
7 154 SYS.WRI$_ADV_SQLA_TMP SYSMAN.MGMT_ECM_CSA
8 153 SYSMAN.MGMT_ECM_CSA_COOKIES SYSMAN.MGMT_PRIVS
9 153 SYSMAN.MGMT_PRIV_GRANTS SYSTEM.REPCAT$_AUDIT_ATTRIBUTE
10 153 SYSTEM.REPCAT$_AUDIT_COLUMN XDB.XDB$ROOT_INFO
10 rows selected.
Elapsed: 00:00:00.14
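The bucket-boundary technique can be tried on a small key list. This is a minimal sketch, assuming Python's sqlite3 module (SQLite 3.25 or later) in place of Oracle; the table t and its ten keys K01 to K10 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (mykey TEXT)")
# Ten hypothetical keys, K01 .. K10.
conn.executemany("INSERT INTO t VALUES (?)",
                 [("K%02d" % i,) for i in range(1, 11)])

buckets = conn.execute("""
    SELECT nt, COUNT(*) AS cnt, MIN(mykey) AS from_key, MAX(mykey) AS to_key
    FROM (SELECT mykey, NTILE(3) OVER (ORDER BY mykey) AS nt FROM t)
    GROUP BY nt
    ORDER BY nt
""").fetchall()

for row in buckets:
    print(row)
```

Ten rows do not divide evenly into three buckets, so the extra row goes to the first bucket, the same pattern as the 154/153 split in the report above.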
The following query uses the same technique as the one above. In this example a caret (^) is used as a separator between the
values of a multi-column key. The caret-separated list can then be split back into its key columns, which gives the
boundary values for dividing the table into 10 groups of equal row count.
column from_key format a24

column to_key format a24
column fr_key1 format a7
column to_key1 format a7
column nt format 99
column cnt format 99999
set null ?
select nt, cnt,
substr(from_key,1 ,instr(from_key,'^',1,1)-1) fr_key1,
substr(from_key,instr(from_key,'^',1,1)+1 ,(instr(from_key,'^',1,2)-instr(from_key,'^',1,1))-1) fr_key2,
substr(to_key,1 ,instr(to_key,'^',1,1)-1) to_key1,
substr(to_key,instr(to_key,'^',1,1)+1 ,(instr(to_key,'^',1,2)-instr(to_key,'^',1,1))-1) to_key2,
from_key,
to_key
from (
SELECT
NT,
COUNT(*) CNT,
MIN(CAL_MNTH||'^'||EMPLY_ID||'^'||ASMNTPLN_ID)||'^^^^^^^' from_key,
MAX(CAL_MNTH||'^'||EMPLY_ID||'^'||ASMNTPLN_ID)||'^^^^^^^' to_key
FROM (SELECT
CAL_MNTH, EMPLY_ID, ASMNTPLN_ID,
NTILE(10) OVER (ORDER BY CAL_MNTH, EMPLY_ID, ASMNTPLN_ID) NT
FROM mydata0
)
GROUP BY NT
) mstr
ORDER BY 1;
NT CNT FR_KEY1 FR_KEY2 FR_KEY3 FR_KEY4 TO_KEY1 TO_KEY2 TO_KEY3 TO_KEY4 FROM_KEY TO_KEY
--- ------ ------- ------- ------- ------- ------- ------- ------- ------- ------------------------
1 469 200601 00231 2 ? 200601 27866 3023 ? 200601^00231^2^^^^^^^ 200601^27
2 469 200601 27871 3023 ? 200602 06182 3 ? 200601^27871^3023^^^^^^^ 200602^06
3 469 200602 06190 2 ? 200602 28834 1 ? 200602^06190^2^^^^^^^ 200602^28834
4 469 200602 28837 3 ? 200603 25304 1 ? 200602^28837^3^^^^^^^ 200603^25304
5 469 200603 25305 1 ? 200603 29573 1 ? 200603^25305^1^^^^^^^ 200603^29573
6 469 200603 29574 126 ? 200604 27871 3 ? 200603^29574^126^^^^^^^ 200604^278
7 468 200604 27896 3 ? 200605 05374 3 ? 200604^27896^3^^^^^^^ 200605^05374
8 468 200605 05394 2 ? 200605 28865 2 ? 200605^05394^2^^^^^^^ 200605^28865
9 468 200605 28867 1 ? 200606 25571 1 ? 200605^28867^1^^^^^^^ 200606^25571
10 468 200606 25585 2 ? 200606 29842 1 ? 200606^25585^2^^^^^^^ 200606^29842
10 rows selected.
http://www.psoug.org/reference/analytic_functions.html
AVG This example returns a Running average

CREATE TABLE vote_count (
submit_date DATE NOT NULL,
num_votes NUMBER NOT NULL);
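INSERT INTO vote_count VALUES (TRUNC(SYSDATE)-4, 100);
INSERT INTO vote_count VALUES (TRUNC(SYSDATE)-3, 150);
INSERT INTO vote_count VALUES (TRUNC(SYSDATE)-2, 75);
INSERT INTO vote_count VALUES (TRUNC(SYSDATE)-3, 25);
INSERT INTO vote_count VALUES (TRUNC(SYSDATE)-1, 50);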

COMMIT;
SELECT * FROM vote_count;
SUBMIT_DA NUM_VOTES
--------- ----------
23-MAR-08 100
24-MAR-08 150
25-MAR-08 75
24-MAR-08 25
26-MAR-08 50
SELECT submit_date, num_votes, TRUNC(AVG(num_votes)

OVER(ORDER BY submit_date ROWS UNBOUNDED PRECEDING)) AVG_VOTE_PER_DAY
FROM vote_count
ORDER BY submit_date;
SUBMIT_DA NUM_VOTES AVG_VOTE_PER_DAY
--------- ---------- ----------------
23-MAR-08 100 100
24-MAR-08 150 125
24-MAR-08 25 91
25-MAR-08 75 87
26-MAR-08 50 80
SELECT submit_date, num_votes, TRUNC(AVG(num_votes)

OVER(PARTITION BY submit_date ORDER BY submit_date ROWS UNBOUNDED PRECEDING)) AVG_VOTE_PER_DAY
FROM vote_count
SUBMIT_DA NUM_VOTES AVG_VOTE_PER_DAY

--------- ---------- ----------------
23-MAR-08 100 100
24-MAR-08 150 150
24-MAR-08 25 87
25-MAR-08 75 75
26-MAR-08 50 50
CORR This example returns a coefficient of correlation of a set of number pairs

SELECT t.calendar_month_number,
CORR (SUM(s.amount_sold), SUM(s.quantity_sold))
OVER (ORDER BY t.calendar_month_number) AS CUM_CORR
FROM SH.sales s, SH.times t
WHERE s.time_id = t.time_id AND calendar_year = 1998
GROUP BY t.calendar_month_number;
CALENDAR_MONTH_NUMBER CUM_CORR
--------------------- ----------
1 ?
2 -1
3 .405926438
4 .455153818
5 .611876132
6 .64470279
7 .408235439
8 .297245237
9 .376635396
10 .484668332
11 .486030215
12 .471473718
12 rows selected.
COUNT This example returns a running count of all records or by partition

SELECT submit_date, num_votes, TRUNC(COUNT(num_votes)
OVER(ORDER BY submit_date ROWS UNBOUNDED PRECEDING)) AS DAY_COUNT
FROM vote_count;
SELECT submit_date, COUNT(*)

OVER(PARTITION BY submit_date ORDER BY submit_date
ROWS UNBOUNDED PRECEDING) NUM_RECS
FROM vote_count;
SUBMIT_DA NUM_RECS
--------- ----------
23-MAR-08 1
24-MAR-08 1
24-MAR-08 2
25-MAR-08 1
26-MAR-08 1
CUME_DIST This example returns a cumulative distribution of a value in a group of values

SELECT job_id, last_name, salary, CUME_DIST()
OVER (PARTITION BY job_id ORDER BY salary) AS cume_dist
FROM HR.employees
WHERE job_id LIKE 'PU%';
JOB_ID LAST_NAME SALARY CUME_DIST

---------- ------------------------- ---------- ----------
PU_CLERK Colmenares 2500 .2
PU_CLERK Himuro 2600 .4
PU_CLERK Tobias 2800 .6
PU_CLERK Baida 2900 .8
PU_CLERK Khoo 3100 1
PU_MAN Raphaely 11000 1
6 rows selected.
SELECT job, ename, sal, CUME_DIST()

OVER (PARTITION BY job ORDER BY sal) AS cume_dist
FROM emp
JOB ENAME SAL CUME_DIST
--------- ---------- ---------- ----------
ANALYST SCOTT 3000 1
ANALYST FORD 3000 1
CLERK SMITH 800 .25
CLERK JAMES 950 .5
CLERK ADAMS 1100 .75
CLERK MILLER 1300 1
MANAGER CLARK 2450 .333333333
MANAGER BLAKE 2850 .666666667
MANAGER JONES 2975 1
PRESIDENT KING 5000 1
SALESMAN WARD 1250 .5
SALESMAN MARTIN 1250 .5
SALESMAN TURNER 1500 .75
SALESMAN ALLEN 1600 1
14 rows selected.
DENSE_RANK This example ranks the rows within a group, leaving no gaps in the ranking sequence when there are ties
SELECT d.department_name, e.last_name, e.salary, DENSE_RANK()
OVER (PARTITION BY e.department_id ORDER BY e.salary) AS DENSE_RANK
FROM HR.employees e, HR.departments d
WHERE e.department_id = d.department_id
AND d.department_id IN (30, 60);
DEPARTMENT_NAME LAST_NAME SALARY DENSE_RANK

------------------------------ ------------------------- ---------- ----------
Purchasing Colmenares 2500 1
Purchasing Himuro 2600 2
Purchasing Tobias 2800 3
Purchasing Baida 2900 4
Purchasing Khoo 3100 5
Purchasing Raphaely 11000 6
IT Lorentz 4200 1
IT Austin 4800 2
IT Pataballa 4800 2
IT Ernst 6000 3
IT Hunold 9000 4
11 rows selected.
FIRST This example returns the row ranked first using DENSE_RANK
SELECT last_name, department_id, salary,
MIN(salary) KEEP (DENSE_RANK FIRST ORDER BY commission_pct)
OVER (PARTITION BY department_id) "Worst",
MAX(salary) KEEP (DENSE_RANK LAST ORDER BY commission_pct)
OVER (PARTITION BY department_id) "Best"
FROM HR.employees
WHERE department_id IN (30, 60)
ORDER BY department_id, salary;
LAST_NAME DEPARTMENT_ID SALARY Worst Best

------------------------- ------------- ---------- ---------- ----------
Colmenares 30 2500 2500 11000
Himuro 30 2600 2500 11000
Tobias 30 2800 2500 11000
Baida 30 2900 2500 11000
Khoo 30 3100 2500 11000
Raphaely 30 11000 2500 11000
Lorentz 60 4200 4200 9000
Austin 60 4800 4200 9000
Pataballa 60 4800 4200 9000
Ernst 60 6000 4200 9000
Hunold 60 9000 4200 9000
11 rows selected.
FIRST_VALUE This example returns the first value in an ordered set of values.
If the first value in the set is null, then the function returns NULL unless you specify IGNORE NULLS
SELECT last_name, salary, hire_date, FIRST_VALUE(hire_date)
OVER (ORDER BY salary ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS lv
FROM (SELECT * FROM HR.employees WHERE department_id = 90
ORDER BY hire_date);
LAST_NAME SALARY HIRE_DATE LV

------------------------- ---------- --------- ---------
Kochhar 17000 21-SEP-89 21-SEP-89
De Haan 17000 13-JAN-93 21-SEP-89
King 24000 17-JUN-87 21-SEP-89
3 rows selected.
LAG This example returns a row by offset (prior data).

LAG provides access to more than one row of a table at the same time without a self-join.
Given a series of rows returned from a query and a position of the cursor,
LAG provides access to a row at a given physical offset prior to that position.
SELECT last_name, hire_date, salary,

LAG(salary, 1, 0) OVER (ORDER BY hire_date) AS PREV_SAL
FROM HR.employees
WHERE job_id = 'PU_CLERK';
LAST_NAME HIRE_DATE SALARY PREV_SAL

------------------------- --------- ---------- ----------
Khoo 18-MAY-95 3100 0
Tobias 24-JUL-97 2800 3100
Baida 24-DEC-97 2900 2800
Himuro 15-NOV-98 2600 2900
Colmenares 10-AUG-99 2500 2600
5 rows selected.
LAST_VALUE This example returns the last value in an ordered set of values.

If the last value in the set is null, then the function returns NULL unless you specify IGNORE NULLS.
This setting is useful for data densification. If you specify IGNORE NULLS,
then LAST_VALUE returns the last non-null value in the set, or NULL if all values are null.
SELECT last_name, salary, hire_date, LAST_VALUE(hire_date)

OVER (ORDER BY salary ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS lv
FROM (SELECT * FROM HR.employees WHERE department_id = 90
ORDER BY hire_date);
LAST_NAME SALARY HIRE_DATE LV

------------------------- ---------- --------- ---------
Kochhar 17000 21-SEP-89 17-JUN-87
De Haan 17000 13-JAN-93 17-JUN-87
King 24000 17-JUN-87 17-JUN-87
3 rows selected.
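The explicit ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING in the query above matters for LAST_VALUE: with only an ORDER BY, the default window ends at the current row, so the "last" value is the current row's own. The sketch below illustrates this, assuming Python's sqlite3 module (SQLite 3.25 or later) in place of Oracle; the three rows are the department 90 rows shown above with dates rewritten as ISO text, and the last_name tie-breaker is an assumption added for a deterministic order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (last_name TEXT, salary INTEGER, hire_date TEXT)")
# Department 90 rows from the output above, dates stored as ISO text.
conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", [
    ("Kochhar", 17000, "1989-09-21"),
    ("De Haan", 17000, "1993-01-13"),
    ("King", 24000, "1987-06-17"),
])

last_vals = conn.execute("""
    SELECT last_name,
           -- default window stops at the current row
           LAST_VALUE(hire_date) OVER (ORDER BY salary, last_name) AS lv_default,
           -- full window reaches the true last row of the ordered set
           LAST_VALUE(hire_date) OVER (ORDER BY salary, last_name
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS lv_full
    FROM emp
    ORDER BY salary, last_name
""").fetchall()

for row in last_vals:
    print(row)
```

With the default window every row reports its own hire date; with the full window every row reports the hire date of King, the highest-paid row.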
LEAD This example provides access to a row at a given offset beyond the current position.
SELECT submit_date, num_votes,
LEAD(num_votes, 1, 0) OVER (ORDER BY submit_date) AS NEXT_VAL
FROM vote_count;
SUBMIT_DA NUM_VOTES NEXT_VAL

--------- ---------- ----------
23-MAR-08 100 150
24-MAR-08 150 25
24-MAR-08 25 75
25-MAR-08 75 50
26-MAR-08 50 0
5 rows selected.
MAX This example returns a maximum value by partition

SELECT manager_id, last_name, salary
FROM (
SELECT manager_id, last_name, salary,
MAX(salary) OVER (PARTITION BY manager_id) AS rmax_sal
FROM HR.employees)
WHERE salary = rmax_sal;
MANAGER_ID LAST_NAME SALARY

---------- ------------------------- ----------
100 Kochhar 17000
100 De Haan 17000
101 Greenberg 12000
101 Higgins 12000
102 Hunold 9000
103 Ernst 6000
108 Faviet 9000
114 Khoo 3100
120 Nayer 3200
120 Taylor 3200
121 Sarchand 4200
122 Chung 3800
123 Bell 4000
124 Rajs 3500
145 Tucker 10000
146 King 10000
147 Vishney 10500
148 Ozer 11500
149 Abel 11000
201 Fay 6000
205 Gietz 8300
? King 24000
22 rows selected.
MIN This example returns the minimum value by partition

NTILE Divides an ordered data set into a number of buckets indicated by expr
and assigns the appropriate bucket number to each row.
The buckets are numbered 1 through expr.
The expr value must resolve to a positive constant for each partition.
PERCENT_RANK Calculates the value of (r-1)/(rows-1)

For a row r, PERCENT_RANK calculates the rank of r minus 1,
divided by 1 less than the number of rows being evaluated
(the entire query result set or a partition).
SELECT department_id, last_name, salary, PERCENT_RANK()
OVER (PARTITION BY department_id ORDER BY salary DESC) AS pr
FROM HR.employees
ORDER BY pr, salary;
DEPARTMENT_ID LAST_NAME SALARY PR
------------- ------------------------- ---------- ----------
10 Whalen 4400 0
40 Mavris 6500 0
? Grant 7000 0
50 Fripp 8200 0
60 Hunold 9000 0
70 Baer 10000 0
30 Raphaely 11000 0
100 Greenberg 12000 0
110 Higgins 12000 0
20 Hartstein 13000 0
80 Russell 14000 0
90 King 24000 0
50 Weiss 8000 .022727273
80 Partners 13500 .03030303
50 Kaufling 7900 .045454545
80 Errazuriz 12000 .060606061
50 Vollman 6500 .068181818
50 Mourgos 5800 .090909091
80 Ozer 11500 .090909091
50 Sarchand 4200 .113636364
80 Cambrault 11000 .121212121
80 Abel 11000 .121212121
50 Bull 4100 .136363636
50 Bell 4000 .159090909
...
80 Banda 6200 .939393939
80 Johnson 6200 .939393939
50 Markle 2200 .954545455
50 Philtanker 2200 .954545455
50 Olson 2100 1
30 Colmenares 2500 1
60 Lorentz 4200 1
20 Fay 6000 1
80 Kumar 6100 1
100 Popp 6900 1
110 Gietz 8300 1
107 rows selected.
PERCENTILE_CONT An inverse distribution function that assumes a continuous distribution model.

It takes a percentile value and a sort specification, and returns an interpolated
value that would fall into that percentile value with respect to the sort specification.
Nulls are ignored in the calculation.
SELECT last_name, salary, department_id,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY salary DESC)
OVER (PARTITION BY department_id) PCT_CONT,
PERCENT_RANK()
OVER (PARTITION BY department_id ORDER BY salary DESC) PCT_RANK
FROM HR.employees
WHERE department_id IN (30, 60);
LAST_NAME SALARY DEPARTMENT_ID PCT_CONT PCT_RANK
------------------------- ---------- ------------- ---------- ----------
Raphaely 11000 30 2850 0
Khoo 3100 30 2850 .2
Baida 2900 30 2850 .4
Tobias 2800 30 2850 .6
Himuro 2600 30 2850 .8
Colmenares 2500 30 2850 1
Hunold 9000 60 4800 0
Ernst 6000 60 4800 .25
Austin 4800 60 4800 .5
Pataballa 4800 60 4800 .5
Lorentz 4200 60 4800 1
11 rows selected.
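The interpolation behind the 2850 shown for department 30 can be checked with a small self-contained sketch. It copies the six department 30 salaries from the output above into inline rows (the name dept30 is illustrative) and uses the aggregate form of the function. PERCENTILE_DISC, described in the next entry, is included for contrast.

```sql
-- Six department 30 salaries from the output above, as inline rows.
WITH dept30 AS (
  SELECT 11000 AS salary FROM dual UNION ALL
  SELECT 3100 FROM dual UNION ALL
  SELECT 2900 FROM dual UNION ALL
  SELECT 2800 FROM dual UNION ALL
  SELECT 2600 FROM dual UNION ALL
  SELECT 2500 FROM dual
)
SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY salary DESC) AS pct_cont,
       PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY salary DESC) AS pct_disc
FROM dept30;
-- Target row number = 1 + 0.5 * (6 - 1) = 3.5, which falls between
-- row 3 (2900) and row 4 (2800) in descending order.
-- pct_cont interpolates halfway between them: 2850.
-- pct_disc returns an actual element, the first whose CUME_DIST >= 0.5: 2900.
```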
PERCENTILE_DISC An inverse distribution function that assumes a discrete distribution model.

It takes a percentile value and a sort specification and returns an element from the set.
Nulls are ignored in the calculation.
SELECT last_name, salary, department_id,
PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY salary DESC)
OVER (PARTITION BY department_id) PCT_DISC,
CUME_DIST() OVER (PARTITION BY department_id
ORDER BY salary DESC) CUME_DIST
FROM HR.employees
WHERE department_id IN (30, 60);
LAST_NAME SALARY DEPARTMENT_ID PCT_DISC CUME_DIST
------------------------- ---------- ------------- ---------- ----------
Raphaely 11000 30 2900 .166666667
Khoo 3100 30 2900 .333333333
Baida 2900 30 2900 .5
Tobias 2800 30 2900 .666666667
Himuro 2600 30 2900 .833333333
Colmenares 2500 30 2900 1
Hunold 9000 60 4800 .2
Ernst 6000 60 4800 .4
Austin 4800 60 4800 .8
Pataballa 4800 60 4800 .8
Lorentz 4200 60 4800 1
11 rows selected.
RANK Returns the rank of a value within a group of values

SELECT department_id, last_name, salary, commission_pct,
RANK() OVER (PARTITION BY department_id
ORDER BY salary DESC, commission_pct) "Rank"
FROM HR.employees
WHERE department_id = 80;
DEPARTMENT_ID LAST_NAME SALARY COMMISSION_PCT Rank
------------- ------------------------- ---------- -------------- ----------
80 Russell 14000 .4 1
80 Partners 13500 .3 2
80 Errazuriz 12000 .3 3
80 Ozer 11500 .25 4
80 Cambrault 11000 .3 5
80 Abel 11000 .3 5
80 Zlotkey 10500 .2 7
80 Vishney 10500 .25 8
80 Bloom 10000 .2 9
80 Tucker 10000 .3 10
80 King 10000 .35 11
80 Fox 9600 .2 12
80 Greene 9500 .15 13
80 Bernstein 9500 .25 14
80 Sully 9500 .35 15
80 Hall 9000 .25 16
80 McEwen 9000 .35 17
80 Hutton 8800 .25 18
80 Taylor 8600 .2 19
...
80 Banda 6200 .1 32
80 Johnson 6200 .1 32
80 Kumar 6100 .1 34
34 rows selected.
RATIO_TO_REPORT Computes the ratio of a value to the sum of a set of values.

If expr evaluates to null, then the ratio-to-report value also evaluates to null.
SELECT last_name, salary, RATIO_TO_REPORT(salary) OVER () AS RR
FROM HR.employees
WHERE job_id = 'PU_CLERK';
LAST_NAME SALARY RR
------------------------- ---------- ----------
Khoo 3100 .223021583
Baida 2900 .208633094
Tobias 2800 .201438849
Himuro 2600 .18705036
Colmenares 2500 .179856115
5 rows selected.
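The linear regression functions take the pair (dependent expression y, independent expression x) and ignore pairs in which either value is null.
REGR_AVGX Linear regression function: average of the independent variable x
REGR_AVGY Linear regression function: average of the dependent variable y
REGR_COUNT Linear regression function: number of non-null pairs used to fit the line
REGR_INTERCEPT Linear regression function: y-intercept of the regression line
REGR_R2 Linear regression function: coefficient of determination (R-squared)
REGR_SLOPE Linear regression function: slope of the regression line
REGR_SXX Linear regression function: REGR_COUNT * VAR_POP(x)
REGR_SXY Linear regression function: REGR_COUNT * COVAR_POP(y, x)
REGR_SYY Linear regression function: REGR_COUNT * VAR_POP(y)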
SELECT job_id, employee_id ID, salary,
REGR_SLOPE(SYSDATE-hire_date, salary)
OVER (PARTITION BY job_id) slope,
REGR_INTERCEPT(SYSDATE-hire_date, salary)
OVER (PARTITION BY job_id) intcpt,
REGR_R2(SYSDATE-hire_date, salary)
OVER (PARTITION BY job_id) rsqr,
REGR_COUNT(SYSDATE-hire_date, salary)
OVER (PARTITION BY job_id) count,
REGR_AVGX(SYSDATE-hire_date, salary)
OVER (PARTITION BY job_id) avgx,
REGR_AVGY(SYSDATE-hire_date, salary)
OVER (PARTITION BY job_id) avgy
FROM HR.employees
WHERE department_id in (50, 80)
ORDER BY job_id, employee_id;
JOB_ID ID SALARY SLOPE INTCPT RSQR COUNT AVGX AVGY
---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
SA_MAN 145 14000 .355215054 -653.91528 .83244748 5 12200 3679.70838
SA_MAN 146 13500 .355215054 -653.91528 .83244748 5 12200 3679.70838
SA_MAN 147 12000 .355215054 -653.91528 .83244748 5 12200 3679.70838
SA_MAN 148 11000 .355215054 -653.91528 .83244748 5 12200 3679.70838
SA_MAN 149 10500 .355215054 -653.91528 .83244748 5 12200 3679.70838
SA_REP 150 10000 .256829349 1457.88264 .647007156 29 8396.55172 3614.36355
SA_REP 151 9500 .256829349 1457.88264 .647007156 29 8396.55172 3614.36355
SA_REP 152 9000 .256829349 1457.88264 .647007156 29 8396.55172 3614.36355
SA_REP 153 8000 .256829349 1457.88264 .647007156 29 8396.55172 3614.36355
SA_REP 154 7500 .256829349 1457.88264 .647007156 29 8396.55172 3614.36355
SA_REP 155 7000 .256829349 1457.88264 .647007156 29 8396.55172 3614.36355
...
ST_CLERK 138 3200 .904249136 1187.52454 .742808493 20 2785 3705.85838
ST_CLERK 139 2700 .904249136 1187.52454 .742808493 20 2785 3705.85838
ST_CLERK 140 2500 .904249136 1187.52454 .742808493 20 2785 3705.85838
ST_CLERK 141 3500 .904249136 1187.52454 .742808493 20 2785 3705.85838
ST_CLERK 142 3100 .904249136 1187.52454 .742808493 20 2785 3705.85838
ST_CLERK 143 2600 .904249136 1187.52454 .742808493 20 2785 3705.85838
ST_CLERK 144 2500 .904249136 1187.52454 .742808493 20 2785 3705.85838
ST_MAN 120 8000 .479432718 483.038195 .69418508 5 7280 3973.30838
ST_MAN 121 8200 .479432718 483.038195 .69418508 5 7280 3973.30838
ST_MAN 122 7900 .479432718 483.038195 .69418508 5 7280 3973.30838
ST_MAN 123 6500 .479432718 483.038195 .69418508 5 7280 3973.30838
ST_MAN 124 5800 .479432718 483.038195 .69418508 5 7280 3973.30838
79 rows selected.
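The query above does not exercise REGR_SXX, REGR_SXY, or REGR_SYY. A hedged sketch in the same shape, assuming the same HR.employees table, might look like the following. The aliases sxx, sxy, and syy are illustrative.

```sql
-- Same arguments as the query above: y = days employed, x = salary.
SELECT job_id, employee_id ID, salary,
       REGR_SXX(SYSDATE-hire_date, salary)
         OVER (PARTITION BY job_id) sxx,  -- REGR_COUNT * VAR_POP(salary)
       REGR_SXY(SYSDATE-hire_date, salary)
         OVER (PARTITION BY job_id) sxy,  -- REGR_COUNT * COVAR_POP(days employed, salary)
       REGR_SYY(SYSDATE-hire_date, salary)
         OVER (PARTITION BY job_id) syy   -- REGR_COUNT * VAR_POP(days employed)
FROM HR.employees
WHERE department_id IN (50, 80)
ORDER BY job_id, employee_id;
```

Because SYSDATE moves every day, the sxy and syy values depend on when the query is run.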
ROW_NUMBER Assigns a unique number to each row to which it is applied
(either each row in the partition or each row returned by the query),
in the ordered sequence of rows specified in the ORDER BY clause, beginning with 1.
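One common use is a Top-N query per group. The sketch below assumes the HR.employees sample table and keeps the three highest-paid employees in each department. The alias rn and the last_name tie-breaker are choices made here for illustration.

```sql
-- Top 3 salaries per department (assumes the HR sample schema).
SELECT department_id, last_name, salary
FROM (
  SELECT department_id, last_name, salary,
         -- rn restarts at 1 in every department; last_name breaks salary ties
         ROW_NUMBER() OVER (PARTITION BY department_id
                            ORDER BY salary DESC, last_name) AS rn
  FROM HR.employees)
WHERE rn <= 3
ORDER BY department_id, rn;
```

Unlike RANK, ROW_NUMBER never repeats a number, so exactly three rows per department come back even when salaries tie. Use RANK or DENSE_RANK instead if tied rows should all be kept.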
STDDEV Returns the sample standard deviation; with ORDER BY in the analytic clause the value is cumulative

SELECT last_name, salary,
STDDEV(salary) OVER (ORDER BY hire_date) "StdDev"
FROM HR.employees
WHERE department_id = 30;
LAST_NAME SALARY StdDev
------------------------- ---------- ----------
Raphaely 11000 0
Khoo 3100 5586.14357
Tobias 2800 4650.0896
Baida 2900 4035.26125
Himuro 2600 3649.2465
Colmenares 2500 3362.58829
6 rows selected.
STDDEV_POP Square root of the population variance

SELECT department_id, last_name, salary,
STDDEV_POP(salary) OVER (PARTITION BY department_id) AS pop_std
FROM HR.employees;
STDDEV_SAMP Cumulative sample standard deviation

Computes the cumulative sample standard deviation
and returns the square root of the sample variance.
SELECT department_id, last_name, hire_date, salary,
STDDEV_SAMP(salary) OVER (PARTITION BY department_id
ORDER BY hire_date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cum_sdev
FROM HR.employees;
SUM Cumulative running total
SELECT submit_date, num_votes, SUM(num_votes)
OVER (ORDER BY submit_date ROWS UNBOUNDED PRECEDING) TOT_VOTE
FROM vote_count;
SUBMIT_DA NUM_VOTES TOT_VOTE
--------- ---------- ----------
23-MAR-08 100 100
24-MAR-08 150 250
24-MAR-08 25 275
25-MAR-08 75 350
26-MAR-08 50 400
5 rows selected.
VAR_POP Population variance of a set of numbers
VAR_SAMP Sample variance of a set of numbers
SELECT t.calendar_month_desc, VAR_POP(SUM(s.amount_sold))
OVER (ORDER BY t.calendar_month_desc) "Var_Pop",
VAR_SAMP(SUM(s.amount_sold))
OVER (ORDER BY t.calendar_month_desc) "Var_Samp"
FROM SH.sales s, SH.times t
WHERE s.time_id = t.time_id AND t.calendar_year = 2001
GROUP BY t.calendar_month_desc;
VARIANCE Returns the variance of an expression

SELECT last_name, salary,
VARIANCE(salary) OVER (ORDER BY hire_date) AS VARIANCE
FROM HR.employees;
