19 views

Uploaded by bui ton

Quick Sort

- Searching s
- EXPERIMENT4-5
- Descriptive Unit 4_50 Percent
- Laursen_CMAME187
- Bubble Sort
- 10.1.1.124
- Solving Recurrences
- 13 Fast Sorting
- Sorting 2
- 03-recursion
- Final St Manual
- c Labmanual
- Updated Scheme and Syllabus 2016-17 Version16Feb2016 (1)
- Programare 1
- Function Overloading
- data structure
- rptIpPrintNew-2.pdf
- Comprehensive Viva Voce
- DAA PROJECT REPORT.docx
- Efficiency

You are on page 1of 67

2.3 Q UICKSORT

quicksort

selection

duplicate keys

Algorithms

system sorts

F O U R T H E D I T I O N

http://algs4.cs.princeton.edu

Two classic sorting algorithms: mergesort and quicksort

Full scientific understanding of their properties has enabled us

to develop them into practical system sorts.

Quicksort honored as one of top 10 algorithms of 20th century

in science and engineering.

...

...

2

Quicksort t-shirt

3

2.3 Q UICKSORT

quicksort

selection

duplicate keys

Algorithms

system sorts

http://algs4.cs.princeton.edu

Quicksort

Basic plan.

Shuffle the array.

Partition so that, for some j

entry a[j] is in place

no larger entry to the left of j

no smaller entry to the right of j

Sort each subarray recursively.

input Q U I C K S O R T E X A M P L E

shuffle K R A T E L E P U I M Q C X O S

partitioning item

partition E C A I E K L P U T M Q R X O S

not greater not less

sort left A C E E I K L P U T M Q R X O S

sort right A C E E I K L M O P Q R S T U X

result A C E E I K L M O P Q R S T U X

Quicksort overview

5

Tony Hoare

n u m b e r ) . 9.9 X 10 45 is u s e d to r e p r e s e n t i n f i n i t y . I m a g i n a r y

v a l u e s of x m a y n o t be n e g a t i v e a n d reM v a l u e s of x m a y n o t be

conunent I a n d J are o u t p u t v a r i a b l e s , a n d A is t h e a r r a y ( w i t h

s u b s c r i p t b o u n d s M : N ) w h i c h is o p e r a t e d u p o n b y t h i s p r o c e d u r e .

s m a l l e r t h a n 1. P a r t i t i o n t a k e s t h e v a l u e X of a r a n d o m e l e m e n t of t h e a r r a y A,

V a l u e s of Qd~'(x) m a y be c a l c u l a t e d e a s i l y by h y p e r g e o m e t r i c a n d r e a r r a n g e s t h e v a l u e s of t h e e l e m e n t s of t h e a r r a y in s u c h a

series if x is n o t t o o s m a l l n o r (n - m ) t o o large. Q~m(x) c a n be w a y t h a t t h e r e e x i s t i n t e g e r s I a n d J w i t h t h e following p r o p e r t i e s :

c o m p u t e d f r o m a n a p p r o p r i a t e set of v a l u e s of Pnm(X) if X is n e a r M _-< J < I =< N p r o v i d e d M < N

1.0 or ix is n e a r 0. L o s s of s i g n i f i c a n t d i g i t s o c c u r s for x as s m a l l as A[R] =< X f o r M =< R _-< J

1.1 if n is l a r g e r t h a n 10. L o s s of s i g n i f i c a n t d i g i t s is a m a j o r diffi- A[R] = X f o r J < R < I

c u l t y in u s i n g finite p o l y n o m i M r e p r e s e n t a t i o n s also if n is l a r g e r A[R] ~ X f o r I =< R ~ N

t h a n m . H o w e v e r , Q L E G h a s b e e n t e s t e d in r e g i o n s of x a n d n The procedure uses an integer procedure random (M,N) which

b o t h large a n d small; chooses equiprobably a random integer F between M and N, and

procedure Q L E G ( m , n m a x , x, ri, R, Q); v a l u e In, n m a x , x, ri; also a p r o c e d u r e e x c h a n g e , w h i c h e x c h a n g e s t h e v a l u e s of its t w o

r e a l In, m n a x , x, ri; r e a l a r r a y R , Q; parameters ;

begin r e a l t, i, n, q0, s;

n : = 20;

i f n m a x > 13 t h e n

begin real X; integer F;

F : = r a n d o m ( M , N ) ; X : = A[F];

I:=M; J:=N;

Implemented quicksort.

n := nmax +7; up: for I : = I s t e p 1 u n t i l N d o

i f ri = 0 t h e n i f X < A [I] t h e n g o to d o w n ;

begin ifm = 0then I:=N;

Q[0] : = 0.5 X 10g((x + 1 ) / ( x - 1)) down: f o r J : = J s t e p --1 u n t i l M d o

else if A[J]<X then go to change;

begin t : = - - 1 . 0 / s q r t ( x X x - - 1); J:=M;

q0 : = 0; c h a n g e : i f I < J t h e n b e g i n e x c h a n g e (A[IL A[J]); Tony Hoare

Q[O] : = t ; I := I+ 1;J:= J - 1;

for i : = 1 step 1 until m do g o to u p 1980 Turing Award

begin s := (x+x)X(i-1)Xt end 2: b e g i n

4

Q[0]+ (3i-ii-2)q0; else i f [ < F t h e n b e g i n e x c h a n g e (A[IL A[F]) i i f c >= 0 t h e n

q0 : = Q[0]; I:=I+l 3:begin

Q[0] : = s e n d e n d ; end e:= aXe;f := bXd;goto8

if x = 1 then else i f F < J t l l e n b e g i n e x c h a n g e (A[F], A[J]) ; end 3 ;

Q[0] : = 9.9 I" 45; J:=J-1 e:=bXc;

R [ n + 1] : = x - s q r t ( x X x - 1); end ; ifd ~ 0then

for i : = n s t e p --1 u n t i l 1 d o end partition 4: begin

R[i] : = (i + m ) / ( ( i + i + 1) X x f:=bXd; goto8

+(m-i- 1) X R [ i + l ] ) ; e n d 4;

ALGORITHM 61

g o to t h e e n d ; f:=aXd; goto8

if m = 0 then ALGORITHM

PROCEDURES 64F O R R A N G E ARITHMETIC 5: e n d 2;

b e g i n i f x < 0.5 t b e n QUICKSORT

ALLAN GIBB* ifb > 0 then

Q[0] : = a r c t a n ( x ) - 1.5707963 e l s e U n iA.

C. i t y HOARE

v e r sR. of A l b e r t a , C a l g a r y , A l b e r t a , C a n a d a 6: b e g i n

Q[0] : = - a r e t a n ( 1 / x ) e n d e l s e if d > 0 then

Elliott

b e g i n Brothers Ltd., Borehamwood, Hertfordshire, Eng.

begin t : = 1 / s q r t ( x X x + 1); begin

p r o c e d u r e R A N G E S U M (a, b, c, d, e, f);

q0 : = 0; procedure quicksort (A,M,N); value M,N; e : = M I N ( a X d, b X c);

real a , b , c , d , e , f ;

q[0] := t; a r r a y A; i n t e g e r M , N ; f : = M A X ( a X c , b X d); go t o 8

c o m m e n t The term "range n u m b e r " was used by P. S. Dwyer,

for i : = 2 step 1 until m do comment Q u i c k s o r t is a v e r y f a s t a n d c o n v e n i e n t m e t h o d of e n d 6;

b e g i n s : = (x + x) X (i -- 1) X t X Q[0I

Linear Computations (Wiley, 1951). Machine procedures for

s o r t i n g a n a r r a y in t h e r a n d o m - a c c e s s s t o r e of a c o m p u t e r . T h e e : = b X c; f : = a X c; go t o 8

range arithmetic were developed about 1958 by Ramon Moore,

+(3i+iX i -- 2) q0; e n t i r e c o n t e n t s of t h e s t o r e m a y be s o r t e d , since no e x t r a s p a c e is e n d 5;

"Automatic Error Analysis in Digital C o m p u t a t i o n , " LMSD

qO : = Q[0]; r e q u i r e d . T h e a v e r a g e n u m b e r of c o m p a r i s o n s m a d e is 2 ( M - - N ) In f:=aXc;

Report 48421, 28 Jan. 1959, Lockheed Missiles and Space Divi-

Q[0] := s e n d e n d ; ( N - - M ) , a n d t h e a v e r a g e n m n b e r of e x c h a n g e s is one s i x t h t h i s i f d _-<O t h e n

sion, Palo Alto, California, 59 pp. If a _< x -< b and c ~ y ~ d,

R[n + 1] : = x - s q r t ( x x + 1); a m o u n t . S u i t a b l e r e f i n e m e n t s of t h i s m e t h o d will be d e s i r a b l e for 7: b e g i n

then R A N G E S U M yields an interval [e, f] such t h a t e =< (x + y)

for i := n step - 1 until 1 do its i m p l e m e n t a t i o n on a n y a c t u a l c o m p u t e r ; e:=bXd; goto8

f. Because of machine operation (truncation or rounding) the

R[i] : = (i + m ) / ( ( i -- m + 1) R[i + 1] begin i n t e g e r 1,J ; end 7 ;

machine sums a -4- c and b -4- d may not provide safe end-points

--(i+i+ 1) X x); if M < N then begin partition (A,M,N,I,J); e:=aXd;

of the output interval. Thus R A N G E S U M requires a non-local

for i : = 1 step 2 until nmax do quicksort (A,M,J) ; 8: e : = A D J U S T P R O D (e, - 1 ) ;

real procedure A D J U S T S U M which will compensate for the

Rill : = -- Rill; q u i c k s o r t (A, I, N ) f := A D J U S T P R O D (f, 1)

machine arithmetic. The body of A D J U S T S U M will be de-

the: for i : = 1 step 1 until nnmx do end end RANGEMPY;

pendent upon the type of machine for which it is written and so

Q[i] : = Q[i - 1] X R[i] end quieksort p r o c e d u r e R A N G E D V D (a, b, c, d, e, f) ;

is not given here. (An example, however, appears below.) I t

end QLEG; real a, b, c, d, e, f;

is assumed t h a t A D J U S T S U M has as parameters real v and w,

c o m m e n t If the range divisor includes zero the program

* T h i s p r o c e d u r e w a s d e v e l o p e d in p a r t u n d e r t h e s p o n s o r s h i p and integer i, and is accompanied by a non-local real procedure

exists to a non-local label " z e r o d v s r " . R A N G E D V D assumes a

of t h e A i r F o r c e C a m b r i d g e R e s e a r c h C e n t e r .

Communications of the ACM (July 1961)

C O R R E C T I O N 65

ALGORITHM which gives an upper bound to the magnitude

of the error involved in the machine representation of a number.

non-local real procedure A D J U S T Q U O T which is analogous

FIND (possibly identical) to A D J U S T P R O D ;

The output A D J U S T S U M provides the left end-point of the 6

C.output A. R.intervalHOARE begin

of R A N G E S U M when A D J U S T S U M is called

Tony Hoare

[ but couldn't explain his algorithm or implement it! ]

Learned Algol 60 (and recursion).

Implemented quicksort.

Tony Hoare

1980 Turing Award

There are two ways of constructing a software design: One way is

to make it so simple that there are obviously no deficiencies, and

the other way is to make it so complicated that there are no obvious

deficiencies. The first method is far more difficult.

reference in 1965 This has led to innumerable errors,

vulnerabilities, and system crashes, which have probably caused

a billion dollars of pain and damage in the last forty years.

7

Bob Sedgewick

Analyzed many versions of quicksort.

good algorithms from bad ones. Third, if the tile fits into

Bob Sedgewick

the memory of the computer, there is one algorithm,

called Quicksort, which has been shown to perform well

Programming S. L. Graham, R. L. Rivest in a variety of situations. Not only is this algorithm

Techniques Editors simpler than many other sorting algorithms, but empir-

Implementing ical [2, ll, 13, 21] and analytic [9] studies show that

Quicksort can be expected to be up to twice as fast as its

learned by programmersActa Informatica

9 by

who have no

Springer-Verlag

7, 327--355 (1977)

previous experi-

1977

ence with sorting, and those who do know other sorting

Robert Sedgewick methods should also find it profitable to learn about

Brown University Quicksort.

Because of its prominence, it is appropriate to study

The Analysis of Quicksort Programs*

how Quicksort might be improved. This subject has

This paper is a practical study of how to implement received considerable attention (see, for example, [1, 4,

the Quicksort sorting algorithm and its best variants on 11, 13, 14, 18, 20]), but few real improvements have been Robert Sedgewick

real computers, including how to apply various code suggested beyond those described by C.A.R. Hoare, the

optimization techniques. A detailed implementation inventor of Quicksort, in his original papers [5, 6].Received Hoare January 19, t976

combining the most effective improvements to also showed how to analyze Quicksort and predict its

running time. The analysis Summary. The been

has since Quicksort

extendedsorting

to algorithm and its best variants are presented

Quicksort is given, along with a discussion of how to and analyzed. Results are derived which make it possible to obtain exact formulas de-

implement it in assembly language. Analytic results the improvements that he suggested, and used to indicate

scribing the total expected running time of particular implementations on real com-

describing the performance of the programs are how they may best be implemented

puters of Quick,sort [9,and15,an17]. The

improvement called the median-of-three modification.

summarized. A variety of special situations are subject of the carefulDetailed analysis of the effect of has

implementation of Quicksort an implementation technique called loop unwrapping

considered from a practical standpoint to illustrate not been studied as widely as global

is presented. The paper improvements

is intended tonot only to present results of direct practical utility,

Quicksort's wide applicability as an internal sorting the algorithm, butbut thealso to illustrate

savings to be the intriguing

realized are asmathematics which arises in the complete analysis

of thisofimportant

significant. The history Quicksortalgorithm.

is quite complex,

method which requires negligible extra storage.

Key Words and Phrases: Quicksort, analysis of and [15] contains a full survey of the many variants

algorithms, code optimization, sorting which, have been proposed. 1. Introduction

CR Categories: 4.0, 4.6, 5.25, 5.31, 5.5 The purpose of this paper is to describe in detail how

Quicksort can best beInimplemented

t96t-62 C.A.R. Hoareactual

to handle presented a new algorithm called Quicksort [7, 8]

applications on real computers. A general description ofinto order by computer. This method combines

which is suitable for putting files

elegancebyand

the algorithm is followed efficiency,of and

descriptions it remains today the most useful general-purpose

the most

effective improvementssortingthat method

have for beencomputers.

proposed The (as practical utility of the algorithm has meant

demonstrated in [15]). Next, an implementation of to countless modifications (though few real

not only that it has been sfibjected 8

Quicksort partitioning demo

Scan i from left to right so long as (a[i] < a[lo]).

K R A T E L E P U I M Q C X O S

lo i j

9

Quicksort partitioning demo

Scan i from left to right so long as (a[i] < a[lo]).

When pointers cross.

Exchange a[lo] with a[j].

E C A I E K L P U T M Q R X O S

lo j hi

partitioned!

10

The music of quicksort partitioning (by Brad Lyon)

https://googledrive.com/host/0B2GQktu-wcTicjRaRjV1NmRFN1U/index.html

11

Quicksort: Java code for partitioning

{

int i = lo, j = hi+1;

while (true)

{

while (less(a[++i], a[lo])) find item on left to swap

if (i == hi) break;

if (j == lo) break;

exch(a, i, j); swap

}

before v

return j; return index of item now known to be in place

} lo hi

lo hi i j

before v v "v !v

lo hi i j lo j hi

12

Quicksort quiz 1

Q. How many compares (in the worst case) to partition an array of length N ?

A. ~N

B. ~N

C. ~N

D. ~ N lg N

E. I don't know.

13

Quicksort: Java implementation

{

private static int partition(Comparable[] a, int lo, int hi)

{ /* see previous slide */ }

{

shuffle needed for

StdRandom.shuffle(a);

performance guarantee

sort(a, 0, a.length - 1);

(stay tuned)

}

{

if (hi <= lo) return;

int j = partition(a, lo, hi);

sort(a, lo, j-1);

sort(a, j+1, hi);

}

}

14

Quicksort trace

lo j hi 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

initial values Q U I C K S O R T E X A M P L E

random shuffle K R A T E L E P U I M Q C X O S

0 5 15 E C A I E K L P U T M Q R X O S

0 3 4 E C A E I K L P U T M Q R X O S

0 2 2 A C E E I K L P U T M Q R X O S

0 0 1 A C E E I K L P U T M Q R X O S

1 1 A C E E I K L P U T M Q R X O S

4 4 A C E E I K L P U T M Q R X O S

6 6 15 A C E E I K L P U T M Q R X O S

no partition 7 9 15 A C E E I K L M O P T Q R X U S

for subarrays

of size 1 7 7 8 A C E E I K L M O P T Q R X U S

8 8 A C E E I K L M O P T Q R X U S

10 13 15 A C E E I K L M O P S Q R T U X

10 12 12 A C E E I K L M O P R Q S T U X

10 11 11 A C E E I K L M O P Q R S T U X

10 10 A C E E I K L M O P Q R S T U X

14 14 15 A C E E I K L M O P Q R S T U X

15 15 A C E E I K L M O P Q R S T U X

result A C E E I K L M O P Q R S T U X

15

Quicksort animation

50 random items

algorithm position

in order

current subarray

not in order

http://www.sorting-algorithms.com/quick-sort

16

Quicksort: implementation details

(and stable), but is not worth the cost.

than it might seem.

better to stop scans on keys equal to the partitioning item's key. stay tuned

Equivalent alternative. Pick a random partitioning item in each subarray.

17

Quicksort: empirical analysis (1961)

Algol 60 implementation.

National-Elliott 405 computer.

sorting N 6-word items with 1-word keys

(16K words)

18

Quicksort: empirical analysis

Home PC executes 108 compares/second.

Supercomputer executes 1012 compares/second.

computer thousand million billion thousand million billion thousand million billion

home instant 2.8 hours 317 years instant 1 second 18 min instant 0.6 sec 12 min

super instant 1 second 1 week instant instant instant instant instant instant

Lesson 2. Great algorithms are better than good ones.

19

Quicksort: best-case analysis

initial values

random shuffle

20

Quicksort: worst-case analysis

initial values

random shuffle

21

Quicksort: average-case analysis

N distinct keys is ~ 2N ln N (and the number of exchanges is ~ N ln N ).

left right

partitioning

C0 + CN 1 C1 + CN 2 CN 1+ C0

CN = (N + 1) + + + ... +

N N N

N CN = N (N + 1) + 2(C0 + C1 + . . . + CN 1)

N CN (N 1)CN 1 = 2N + 2CN 1

CN CN 1 2

= +

N +1 N N +1

22

Quicksort: average-case analysis

CCNN CC

NN 1 1 22

= ++

NN++11 NN NN++1 1

CN 2 2 2

= + + substitute previous equation

N 1 N N +1

CN 3 2 2 2

= + + +

N 2 N 1 N N +1

2 2 2 2

= + + + ... +

3 4 5 N +1

1 1 1 1

CN = 2(N + 1) + + + ...

3 4 5 N +1

Z N +1

1

2(N + 1) dx

3 x

CN 2(N + 1) ln N 1.39N lg N

23

Quicksort: summary of performance characteristics

Guaranteed to be correct.

Running time depends on random shuffle.

Average case. Expected number of compares is ~ 1.39 N lg N.

39% more compares than mergesort.

Faster than mergesort in practice because of less data movement.

Best case. Number of compares is ~ N lg N.

Worst case. Number of compares is ~ N 2.

[ but more likely that lightning bolt strikes computer during execution ]

24

Quicksort properties

Pf.

Partitioning: constant extra space.

Depth of recursion: logarithmic extra space (with high probability).

can guarantee logarithmic depth by recurring

on smaller subarray before larger subarray

(requires using an explicit stack)

Pf. [ by counterexample ]

i j 0 1 2 3

B1 C1 C2 A1

1 3 B1 C1 C2 A1

1 3 B1 A1 C2 C1

0 1 A1 B1 C2 C1

25

Quicksort: practical improvements

Even quicksort has too much overhead for tiny subarrays.

Cutoff to insertion sort for 10 items.

{

if (hi <= lo + CUTOFF - 1)

{

Insertion.sort(a, lo, hi);

return;

}

int j = partition(a, lo, hi);

sort(a, lo, j-1);

sort(a, j+1, hi);

}

26

Quicksort: practical improvements

Median of sample.

Best choice of pivot item = median.

Estimate true median by taking median of sample.

Median-of-3 (random) items.

~ 12/7 N ln N compares (14% fewer)

~ 12/35 N ln N exchanges (3% more)

{

if (hi <= lo) return;

swap(a, lo, median);

sort(a, lo, j-1);

sort(a, j+1, hi);

}

27

2.3 Q UICKSORT

quicksort

selection

duplicate keys

Algorithms

system sorts

http://algs4.cs.princeton.edu

Selection

Ex. Min (k = 0), max (k = N - 1), median (k = N / 2).

Applications.

Order statistics.

Find the "top k."

Use theory as a guide.

Easy N log N upper bound. How?

Easy N upper bound for k = 1, 2, 3. How?

Easy N lower bound. Why?

Which is true?

N log N lower bound? is selection as hard as sorting?

29

Quick-select

Entry a[j] is in place.

No larger entry to the left of j.

No smaller entry to the right of j.

Repeat in one subarray, depending on j; finished when j equals k.

k) v

before

public static Comparable select(Comparable[] a, int

{

lo if a[k] is here if a[k] is here

hi

StdRandom.shuffle(a); set hi to j-1 set lo to j+1

int lo = 0, hi = a.length - 1; during v " v !v

while (hi > lo)

{ i j

"v v

if (j < k) lo = j + 1;

else if (j > k) hi = j - 1; lo j hi

else return a[k];

Quicksort partitioning overview

}

return a[k];

}

30

Quick-select: mathematical analysis

Pf sketch.

Intuitively, each partitioning step splits array approximately in half:

N + N / 2 + N / 4 + + 1 ~ 2N compares.

Formal analysis similar to quicksort analysis yields:

CN = 2 N + 2 k ln (N / k) + 2 (N k) ln (N / (N k))

31

Theoretical context for selection

selection algorithm whose worst-case running time is linear.

bY .

Manuel Blum, Robert W. Floyd, Vaughan Watt,

Ronald L. Rive&, and Robert E. Tarjan

Abstract

i

a new selection algorithm -- PICK. Specifically, no more than

i 5.4305 n comparisons are ever required. This bound is improved for

L

extreme values of i , and a new lower bound on the requisite number

L

Until one is discovered, use quick-select if you dont need a full sort.

32

2.3 Q UICKSORT

quicksort

selection

duplicate keys

Algorithms

system sorts

http://algs4.cs.princeton.edu

Duplicate keys

Sort population by age.

Remove duplicates from mailing list.

Sort job applicants by college attended.

sorted by time sorted by city (unstable)

Chicago 09:00:00 Chicago 09:25:52 C

Typical characteristics of such applications.

Phoenix 09:00:03 Chicago 09:03:13 C

Houston 09:00:13 Chicago 09:21:05 C

Huge array. Chicago 09:00:59

Houston 09:01:10

Chicago 09:19:46

Chicago 09:19:32

C

C

Small number of key values. Chicago 09:03:13

Seattle 09:10:11

Chicago 09:00:00

Chicago 09:35:21

C

C

Seattle 09:10:25 Chicago 09:00:59 C

Phoenix 09:14:25 Houston 09:01:10 H

Chicago 09:19:32 Houston 09:00:13 NOT H

Chicago 09:19:46 Phoenix 09:37:44 sorted P

Chicago 09:21:05 Phoenix 09:00:03 P

Seattle 09:22:43 Phoenix 09:14:25 P

Seattle 09:22:54 Seattle 09:10:25 S

Chicago 09:25:52 Seattle 09:36:14 S

Chicago 09:35:21 Seattle 09:22:43 S

Seattle 09:36:14 Seattle 09:10:11 S

Phoenix 09:37:44 Seattle 09:22:54 S

key

34

Duplicate keys: stop on equal keys

P G E P A Q B P Y C O U P Z S R

P G E P A Q B P Y C O U P Z S R

35

Quicksort quiz 2

What is the result of partitioning the following array (skip over equal keys)?

A A A A A A A A A A A A A A A A

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

A. A A A A A A A A A A A A A A A A

B. A A A A A A A A A A A A A A A A

C. A A A A A A A A A A A A A A A A

D. I don't know.

36

Quicksort quiz 3

What is the result of partitioning the following array (stop on equal keys)?

A A A A A A A A A A A A A A A A

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

A. A A A A A A A A A A A A A A A A

B. A A A A A A A A A A A A A A A A

C. A A A A A A A A A A A A A A A A

D. I don't know.

37

Partitioning an array with all equal keys

38

Duplicate keys: partitioning strategies

[ ~ N 2 compares when all keys equal ]

B A A B A B B B C C C A A A A A A A A A A A

[ ~ N lg N compares when all keys equal ]

B A A B A B C C B C B A A A A A A A A A A A

[ ~ N compares when all keys equal ]

A A A B B B B B C C C A A A A A A A A A A A

39

DUTCH NATIONAL FLAG PROBLEM

red, white, or blue pebble, sort them by color.

input

sorted

Operations allowed.

swap(i, j): swap the pebble in bucket i with the pebble in bucket j.

color(i): color of pebble in bucket i.

Requirements.

Exactly N calls to color().

At most N calls to swap().

Constant extra space.

40

3-way partitioning

Entries between lt and gt equal to the partition item.

No larger entries to left of lt.

No smaller entries to right of gt.

before v

lo hi

before v <v =v >v

during

lo hi

lt i gt

during <v =v >v

after <v =v >v

lt i gt

lo lt gt hi

after <v =v >v

3-way partitioning

lo lt gt hi

3-way partitioning

Dutch national flag problem. [Edsger Dijkstra]

Conventional wisdom until mid 1990s: not worth doing.

Now incorporated into C library qsort() and Java 6 system sort.

41

Dijkstra 3-way partitioning demo

Scan i from left to right.

(a[i] < v): exchange a[lt] with a[i]; increment both lt and i

(a[i] > v): exchange a[gt] with a[i]; decrement gt

(a[i] == v): increment i

lt i gt

P A B X W P P V P D P C Y Z

lo hi

before v

invariant

lo hi

lt i gt

Dijkstra 3-way partitioning demo

Scan i from left to right.

(a[i] < v): exchange a[lt] with a[i]; increment both lt and i

(a[i] > v): exchange a[gt] with a[i]; decrement gt

(a[i] == v): increment i

lt gt

A B C D P P P P P V W Y Z X

lo hi

before v

invariant

lo hi

lt i gt

Dijkstra's 3-way partitioning: trace

v a[]

lt i gt 0 1 2 3 4 5 6 7 8 9 10 11

0 0 11 R B W W R W B R R W B R

0 1 11 R B W W R W B R R W B R

1 2 11 B R W W R W B R R W B R

1 2 10 B R R W R W B R R W B W

1 3 10 B R R W R W B R R W B W

1 3 9 B R R B R W B R R W W W

2 4 9 B B R R R W B R R W W W

2 5 9 B B R R R W B R R W W W

2 5 8 B B R R R W B R R W W W

2 5 7 B B R R R R B R W W W W

2 6 7 B B R R R R B R W W W W

3 7 7 B B B R R R R R W W W W

3 8 7 B B B R R R R R W W W W

3 8 7 B B B R R R R R W W W W

3-way partitioning trace (array contents after each loop iteration)

44

3-way quicksort: Java implementation

{

if (hi <= lo) return;

int lt = lo, gt = hi;

Comparable v = a[lo];

int i = lo;

while (i <= gt)

{

int cmp = a[i].compareTo(v);

if (cmp < 0) exch(a, lt++, i++);

else if (cmp > 0) exch(a, i, gt--);

else i++;

}

before v

sort(a, lo, lt - 1); lo hi

sort(a, gt + 1, hi); during <v =v >v

}

lt i gt

lo lt gt hi

3-way partitioning

45

3-way quicksort: visual trace

46

Duplicate keys: lower bound

Sorting lower bound. If there are n distinct keys and the ith one occurs

xi times, any compare-based sorting algorithm must use at least

n

N! xi

lg xi lg N lg N when all distinct;

x1 ! x2 ! xn ! i=1

N linear when only a constant number of distinct keys

Pf. [beyond scope of course]

from linearithmic to linear in broad class of applications.

47

Sorting summary

selection N2 N2 N2 N exchanges

insertion N N2 N2

or partially ordered

tight code;

shell N log3 N ? c N 3/2

subquadratic

N log N guarantee;

merge N lg N N lg N N lg N

stable

improves mergesort

timsort N N lg N N lg N

when preexisting order

quick N lg N 2 N ln N N2

fastest in practice

improves quicksort

3-way quick N 2 N ln N N2

when duplicate keys

48

2.3 Q UICKSORT

quicksort

selection

duplicate keys

Algorithms

system sorts

http://algs4.cs.princeton.edu

Sorting applications

Sort a list of names.

Organize an MP3 library. obvious applications

List RSS feed in reverse chronological order.

Find the median.

Identify statistical outliers. problems become easy once items

are in sorted order

Find duplicates in a mailing list.

Data compression.

Computer graphics. non-obvious applications

Computational biology.

Load balancing on a parallel computer.

...

50

War story (system sort in C)

int n = atoi(argv[1]), i, x[100000];

for (i = 0; i < n; i++)

x[i] = i;

for ( ; i < 2*n; i++)

x[i] = 2*n-i-1;

qsort(x, 2*n, sizeof(int), intcmp);

}

$ time a.out 2000

real 5.85s

$ time a.out 4000

real 21.64s

$time a.out 8000

real 85.11s

51

War story (system sort in C)

Bug. A qsort() call that should have taken seconds was taking minutes.

Version 7 Unix (1979): quadratic time to sort organ-pipe arrays.

BSD Unix (1983): quadratic time to sort random arrays of 0s and 1s.

52

Engineering a system sort (in 1993)

Bentley-McIlroy quicksort.

samples 9 items

Cutoff to insertion sort for small subarrays.

Partitioning item: median of 3 or Tukey's ninther.

Partitioning scheme: Bentley-McIlroy 3-way partitioning.

similar to Dijkstra 3-way partitioning

SOFTWAREPRACTICE AND EXPERIENCE, VOL. 23(11), 12491265 (NOVEMBER 1993)

(but fewer exchanges when not many equal keys)

JON L. BENTLEY

M. DOUGLAS McILROY

AT&T Bell Laboratories, 600 Mountain Avenue, Murray Hill, NJ 07974, U.S.A.

SUMMARY

We recount the history of a new qsort function for a C library. Our function is clearer, faster and more

robust than existing sorts. It chooses partitioning elements by a new sampling scheme; it partitions by a

novel solution to Dijkstras Dutch National Flag problem; and it swaps efficiently. Its behavior was

assessed with timing and debugging testbeds, and with a program to certify performance. The design

techniques apply in domains beyond sorting.

KEY WORDS Quicksort Sorting algorithms Performance tuning Algorithm design and implementation Testing

INTRODUCTION

C libraries have long included a qsort function to sort an array, usually implemented by

Very widely used. C, C++, Java 6, .

Hoares Quicksort. 1 Because existing qsorts are flawed, we built a new one. This paper

summarizes its evolution.

Compared to existing library sorts, our new qsort is fastertypically about twice as

fastclearer, and more robust under nonrandom inputs. It uses some standard Quicksort 53

A beautiful mailing list post (Yaroslavskiy, September 2009)

Hello All,

I'd like to share with you new Dual-Pivot Quicksort which is faster than the

known implementations (theoretically and experimental). I'd like to propose

to replace the JDK's Quicksort implementation by new one.

...

The new Dual-Pivot Quicksort uses *two* pivots elements in this manner:

2. Assume that P1 <= P2, otherwise swap it.

3. Reorder the array into three parts: those less than the smaller pivot,

those larger than the larger pivot, and in between are those elements

between (or equal to) the two pivots.

4. Recursively sort the sub-arrays.

...

http://mail.openjdk.java.net/pipermail/core-libs-dev/2009-September/002630.html

54

A beautiful mailing list post (Yaroslavskiy-Bloch-Bentley, October 2009)

Subject: Replace quicksort in java.util.Arrays with dual-pivot implementation

Changeset: b05abb410c52

Author: alanb

Date: 2009-10-29 11:18 +0000

URL: http://hg.openjdk.java.net/jdk7/tl/jdk/rev/b05abb410c52

Reviewed-by: jjb

Contributed-by: vladimir.yaroslavskiy at sun.com, joshua.bloch at google.com,

jbentley at avaya.com

! make/java/java/FILES_java.gmk

! src/share/classes/java/util/Arrays.java

+ src/share/classes/java/util/DualPivotQuicksort.java

http://mail.openjdk.java.net/pipermail/compiler-dev/2009-October.txt

55

Dual-pivot quicksort

Use two partitioning items p1 and p2 and partition into three subarrays:

Keys less than p . 1

1 2

lo lt gt hi

56

Dual-pivot partitioning demo

Initialization.

Choose a[lo] and a[hi] as partitioning items.

Exchange if necessary to ensure a[lo] a[hi].

S E A Y R L F V Z Q T C M K

lo hi

Dual-pivot partitioning demo

If (a[i] < a[lo]), exchange a[i] with a[lt] and increment lt and i.

Else if (a[i] > a[hi]), exchange a[i] with a[gt] and decrement gt.

Else, increment i.

lo lt i gt hi

K E A F R L M C Z Q T V Y S

lo lt i gt hi

Dual-pivot partitioning demo

Finalize.

Exchange a[lo] with a[--lt].

Exchange a[hi] with a[++gt].

lo lt gt hi

C E A F K L M R Q S Z V Y T

lo lt gt hi

3-way partitioned

Dual-pivot quicksort

Use two partitioning items p1 and p2 and partition into three subarrays:

Keys less than p . 1

1 2

lo lt gt hi

60

Three-pivot quicksort

Use three partitioning items p1, p2, and p3 and partition into four subarrays:

Keys less than p . 1

lo a1 a2 a3 hi

ight; see http://www.siam.org/journals/ojsa.php

skushagr@uwaterloo.ca alopez-o@uwaterloo.ca imunro@uwaterloo.ca

University of Waterloo University of Waterloo University of Waterloo

Aurick Qiao

a2qiao@uwaterloo.ca

University of Waterloo

November 7, 2013

Quicksort quiz 4

A. Fewer compares.

B. Fewer exchanges.

D. I don't know.

62

Quicksort quiz 4

A. Fewer compares.

for cache performance when

comparing quicksort variants

C. Fewer cache misses.

1-pivot 2 N ln N 0.333 N ln N 2 N ln N

beyond scope of this course

63

Which sorting algorithm to use?

sorts algorithms

elementary sorts insertion sort, selection sort, bubblesort, shaker sort, ...

parallel sorts bitonic sort, odd-even sort, smooth sort, GPUsort, ...

64

Which sorting algorithm to use?

Stable?

Parallel?

In-place?

Deterministic?

Duplicate keys?

Multiple key types?

Linked list or arrays?

Large or small items? many more combinations of

Guaranteed performance?

A. Usually.

65

System sort in Java 7

Arrays.sort().

Has overloaded method for each primitive type.

Has overloaded method for use with a Comparator.

Has overloaded methods for sorting subarrays.

Algorithms.

Dual-pivot quicksort for primitive types.

Timsort for reference types.

66

Ineffective sorts

http://xkcd.com/1185 67

- Searching sUploaded byreddyhanu
- EXPERIMENT4-5Uploaded byNaimish Dixit
- Descriptive Unit 4_50 PercentUploaded byGopal Shyam
- Laursen_CMAME187Uploaded bytincho9
- Bubble SortUploaded byasbasti007
- 10.1.1.124Uploaded byRimi Rahman
- Solving RecurrencesUploaded byGarv Mathur
- 13 Fast SortingUploaded byapi-3799621
- Sorting 2Uploaded byWilLy Libres-de la Cerna
- 03-recursionUploaded byAbhishek Sharma
- Final St ManualUploaded byKiran Kumar
- c LabmanualUploaded byChintha Venu
- Updated Scheme and Syllabus 2016-17 Version16Feb2016 (1)Uploaded byvarun3dec1
- Programare 1Uploaded byAlecsandra Rusu
- Function OverloadingUploaded byShalini Veeraragavan
- data structureUploaded byabhi123000
- rptIpPrintNew-2.pdfUploaded byarjit7890
- Comprehensive Viva VoceUploaded byDhanalakshmi Ramanujam
- DAA PROJECT REPORT.docxUploaded bySanthosh Kumar
- EfficiencyUploaded bypruebin652
- 2002-6-2-DEB-NSGA-IIUploaded bybillalou
- Design and Analysis of AlgorithmsUploaded byRajveer Kaur
- DS _AbcUploaded bysaleem010
- Ada TheoryUploaded byVikram Rao
- TestUploaded byPrateek Sinha
- AAD 1Uploaded bySruthy Sasi
- HGGGFX.pdfUploaded byMadan Mohan Reddy
- Algo About AnnotatedUploaded bynavediitr
- SigmaUploaded byStoriesofsuperheroes
- cormen_growth_of_functions.pdfUploaded byNidhiGoel Gupta

- 22Mergesort.pdfUploaded bybui ton
- 21ElementarySorts.pdfUploaded bybui ton
- 24PriorityQueues.pdfUploaded bybui ton
- 23DemoPartitioning.pdfUploaded bybui ton
- 23DemoQuickSelect.pdfUploaded bybui ton
- tính chất HỢP CHẤT CỦA CROM VÀ CROMUploaded bybui ton
- Đề Thi Thử Thpt Quốc Gia Năm 2015 Môn Hóa - Moon.vn Lần 2Uploaded bybui ton

- Configuring Hub and Spoke Route Based VPNUploaded byRyanb378
- Top 100 Automation Testing Interview Questions - Most Automation Testing Interview Questions and Answers _ Wisdom JobsUploaded bykartheekbeeramjula
- DecodeUploaded byRajat Mandiwal
- Optimal Multipath Congestion Control and Request Forwarding in Information-Centric NetworksUploaded byNgocThanhDinh
- #1 Introduction to Database.pdfUploaded byMuhammad Husain
- Nikhils DungeonUploaded byfullmetal
- Oracle C++ Call Interface Programmer's Guide 10g Release 2 (10.2) B14294-02Uploaded byrk
- cplusUploaded byAnurag Goel
- AN12-aix7-3G03(部分)不全Uploaded bychengab
- vr mount1Uploaded byAnonymous QIuAGIadXm
- DbText MannualUploaded byasifch_lums
- External PolymorphismUploaded bynewtonapples
- COMPUTER SHORT CUT & FULL NAMEUploaded byAMJAD
- HP Intelligent Management Center Installation Guide (IMC PLAT 5.1 (E0202))Uploaded byErwin Mendoza
- BCA 4010 - COMPUTER NETWORKING.docxUploaded byabhishek
- DATABASES.docxUploaded byAlejandro Areiza
- ACE 7.0Uploaded byRoy
- Chapter 8 Process ManagementUploaded byqwety300
- UnderstandingRasters.pdfUploaded byLizette Zareh Cortes Macías
- Label Inspection System Using LabVIEWUploaded bysanket_yavalkar
- AVSIM Lockheed Martin Prepar3D GuideUploaded byjoe_whalley
- Guide for DIALuxUploaded bySidra Shaikh
- Integrating Adobe Interactive Forms in SAP NetweaverUploaded byአንበሳዉ ዳንኤል
- Log SanderUploaded bys_knack
- Stack AlgoUploaded byAnand Patel
- SQL InjectionUploaded bysnehagowda
- [268]Omron_CJ1M_Motion_IntroductionUploaded byMahadzir Bin Mat Rabi'
- Bdm Pinout DescriptionsUploaded byEnzo Licul
- Zps Omnisense 4 1 Sw Launch FaqUploaded bymarcomedin
- Android Project Titles 2016-2017Uploaded byShaka Technologies