
Sorting

The Sorting Problem


 Input:

– A sequence of n numbers a1, a2, . . . , an

 Output:

– A permutation (reordering) a1’, a2’, . . . , an’ of the

input sequence such that a1’ ≤ a2’ ≤ · · · ≤ an’

What is Sorting?
– Sorting: an operation that segregates items into groups according to a specified criterion.
A = { 3 1 6 2 1 3 4 5 9 0 }
A = { 0 1 1 2 3 3 4 5 6 9 }
– Put data in order based on a primary key
– Structure of data
Why Study Sorting?
 There are a variety of situations that we
can encounter
– Do we have randomly ordered keys?
– Are all keys distinct?
– How large is the set of keys to be ordered?
– Need guaranteed performance?

 Various algorithms are better suited to
some of these situations
Why Sort and Examples
 Data searching can be optimized to a very high level.
 Data can be represented in more readable formats.
 The following are some examples of sorting in real-life scenarios:
– Telephone Directory – The telephone directory stores the
telephone numbers of people sorted by their names, so that the
names can be searched easily.
– Dictionary – The dictionary stores words in an alphabetical
order so that searching of any word becomes easy.
– Sorting Books in Library (Dewey system)
– Sorting Individuals by Height (Feet and Inches)
– Sorting Movies in Blockbuster (Alphabetical)
– Sorting Numbers (Sequential)
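The searching benefit mentioned above comes from algorithms such as binary search, which require sorted data. A minimal C sketch (the function name binary_search is our own, not from the slides):

```c
/* Binary search: return the index of key in the ascending-sorted
   array a[0..n-1], or -1 if key is not present. */
int binary_search(const int a[], int n, int key)
{
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;  /* avoids overflow of low+high */
        if (a[mid] == key)
            return mid;
        else if (a[mid] < key)
            low = mid + 1;
        else
            high = mid - 1;
    }
    return -1;
}
```

On sorted data each lookup takes O(log n) comparisons, versus O(n) for scanning unsorted data; this is exactly why telephone directories and dictionaries keep their entries sorted.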
Some Definitions
 Internal Sort
– The data to be sorted is all stored in the
computer’s main memory.
 External Sort
– Some of the data to be sorted might be stored on some external, slower device, or in files.
 In-Place Sort
– The amount of extra space required to sort the data is constant with respect to the input size.
Stability
 A sorting algorithm is stable if, after sorting the contents, it does not change the relative order of records with equal keys.
Sorted on first key:
Instability
 A sorting algorithm is unstable if, after sorting the contents, it changes the relative order of records with equal keys.
Records with key value 3 are not in order on first key!!
Sort file on second key:
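To make stability concrete, here is a small C illustration of our own (not from the slides): records carry a key and a tag, and a stable insertion sort keeps equal-keyed records in their original order.

```c
typedef struct { int key; char tag; } record;

/* Insertion sort on key only. Stable: the shifting condition uses a strict
   comparison (>), so records with equal keys are never moved past each other. */
void stable_insertion_sort(record r[], int n)
{
    for (int j = 1; j < n; j++) {
        record x = r[j];
        int i = j - 1;
        while (i >= 0 && r[i].key > x.key) {
            r[i + 1] = r[i];
            i--;
        }
        r[i + 1] = x;
    }
}
```

Sorting {(3,'a'), (1,'b'), (3,'c'), (2,'d')} yields (1,'b'), (2,'d'), (3,'a'), (3,'c'): the two key-3 records keep their original a-before-c order.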
Adaptive and Non-Adaptive Sorting Algorithm

Adaptive algorithm:
 An algorithm is adaptive if it takes advantage of already-sorted elements in the list that is to be sorted.
 That is, while sorting, if the source list has some elements already in order, an adaptive algorithm will take this into account and try not to re-order them.
Non-adaptive algorithm:
 A non-adaptive algorithm does not take into account the elements which are already sorted; it re-orders every single element regardless of existing order.
Increasing Order and Decreasing Order
Increasing Order
 A sequence of values is in increasing order if each successive element is greater than the previous one.
 For example, 1, 3, 4, 6, 8, 9 are in increasing order, as every next element is greater than the previous element.
Decreasing Order
 A sequence of values is in decreasing order if each successive element is less than the previous one.
 For example, 9, 8, 6, 4, 3, 1 are in decreasing order, as every next element is less than the previous element.
Types of Sorting Algorithms
 Exchange (swap) sort
– Bubble sort
– Quicksort
 Selection sort
– Straight selection sort
– Binary tree sort
– Heap sort
 Insertion sort
– Simple (linear) insertion sort
– Shell sort
– Address calculation sort
 Merge sort
– Merge sort
– Radix sort
Insertion Sort
 Idea: like sorting a hand of playing cards
– Start with an empty left hand and the cards facing down on
the table.
– Remove one card at a time from the table, and insert it into
the correct position in the left hand
 compare it with each of the cards already in the hand,
from right to left
– The cards held in the left hand are sorted
 these cards were originally the top cards of the pile on
the table

Insertion Sort
To insert 12, we need to
make room for it by moving
first 36 and then 24.

Insertion Sort
input array
5 2 4 6 1 3
 Assume the first element is in its correct position.
 Start from the second index; compare this value with the values at previous indices and insert it at its correct position.
 At each iteration, the array is divided into two sub-arrays:
left sub-array (sorted)    right sub-array (unsorted)
Insertion Sort

INSERTION-SORT
INSERTION-SORT(A)
for j ← 2 to n
    do key ← A[j]
       Insert A[j] into the sorted sequence A[1 . . j-1]:
       i ← j - 1
       while i > 0 and A[i] > key
           do A[i + 1] ← A[i]
              i ← i - 1
       A[i + 1] ← key

 Total iterations: n-1
 Insertion sort sorts the elements in place.
 The outer for loop selects the item; the inner while loop finds its correct position.
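The pseudocode above translates directly to C; a minimal sketch of ours (0-based indices instead of the slides' 1-based A[1..n]):

```c
/* Insertion sort, in place: for each j, slide A[j] left into the
   sorted prefix A[0..j-1] (mirrors the INSERTION-SORT pseudocode). */
void insertion_sort(int A[], int n)
{
    for (int j = 1; j < n; j++) {
        int key = A[j];
        int i = j - 1;
        while (i >= 0 && A[i] > key) {  /* shift larger elements right */
            A[i + 1] = A[i];
            i--;
        }
        A[i + 1] = key;
    }
}
```

Sorting the slide's input 5 2 4 6 1 3 yields 1 2 3 4 5 6.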
Analysis of Insertion Sort
INSERTION-SORT(A)                                 cost   times
for j ← 2 to n                                    c1     n
    do key ← A[j]                                 c2     n-1
       Insert A[j] into the sorted
       sequence A[1 . . j-1]:                     0      n-1
       i ← j - 1                                  c4     n-1
       while i > 0 and A[i] > key                 c5     Σ_{j=2}^{n} tj
           do A[i + 1] ← A[i]                     c6     Σ_{j=2}^{n} (tj - 1)
              i ← i - 1                           c7     Σ_{j=2}^{n} (tj - 1)
       A[i + 1] ← key                             c8     n-1

tj: # of times the while statement is executed at iteration j

T(n) = c1·n + c2(n-1) + c4(n-1) + c5 Σ_{j=2}^{n} tj + c6 Σ_{j=2}^{n} (tj - 1) + c7 Σ_{j=2}^{n} (tj - 1) + c8(n-1)
Best Case Analysis
 The array is already sorted: in "while i > 0 and A[i] > key",
– A[i] ≤ key upon the first time the while loop test is run (when i = j-1)
– tj = 1
 T(n) = c1·n + c2(n-1) + c4(n-1) + c5(n-1) + c8(n-1)
      = (c1 + c2 + c4 + c5 + c8)·n - (c2 + c4 + c5 + c8)
      = an + b = Θ(n)
Worst Case Analysis
 The array is in reverse sorted order: in "while i > 0 and A[i] > key",
– A[i] > key always holds in the while loop test
– key must be compared with all elements to the left of the j-th position, i.e., with j-1 elements, so tj = j

Using Σ_{j=1}^{n} j = n(n+1)/2, Σ_{j=2}^{n} j = n(n+1)/2 - 1, and Σ_{j=2}^{n} (j-1) = n(n-1)/2, we have:

T(n) = c1·n + c2(n-1) + c4(n-1) + c5·(n(n+1)/2 - 1) + c6·n(n-1)/2 + c7·n(n-1)/2 + c8(n-1)
     = an² + bn + c, a quadratic function of n

 T(n) = Θ(n²): the order of growth is n²
Insertion Sort - Summary
 Advantages
– Good running time for "almost sorted" arrays: Θ(n)
 Disadvantages
– Θ(n²) running time in worst and average case
– ≈ n²/2 comparisons and exchanges
Selection Sort
 Idea:
– Find the smallest element in the array
– Exchange it with the element in the first position
– Find the second smallest element and exchange it with
the element in the second position
– Continue until the array is sorted
 Disadvantage:
– Running time depends only slightly on the amount of
order in the file

Selection sort
 Given an array of length n,
– Search elements 0 through n-1 and select the smallest
 Swap it with the element in location 0

– Search elements 1 through n-1 and select the smallest


 Swap it with the element in location 1

– Search elements 2 through n-1 and select the smallest


 Swap it with the element in location 2

– Search elements 3 through n-1 and select the smallest


 Swap it with the element in location 3

– Continue in this fashion until there’s nothing left to search

Example and analysis of selection sort
7 2 8 5 4
2 7 8 5 4
2 4 8 5 7
2 4 5 8 7
2 4 5 7 8
 The selection sort might swap an array element with itself; this is harmless, and not worth checking for.
 Analysis:
– The outer loop executes n-1 times
– The inner loop executes about n/2 times on average (from n down to 2 times)
– Work done in the inner loop is constant (swap two array elements)
– Time required is roughly (n-1)·(n/2)
– You should recognize this as O(n²)
Example
8 4 6 9 2 3 1
1 4 6 9 2 3 8
1 2 6 9 4 3 8
1 2 3 9 4 6 8
1 2 3 4 9 6 8
1 2 3 4 6 9 8
1 2 3 4 6 8 9
1 2 3 4 6 8 9
Selection Sort
Alg.: SELECTION-SORT(A)
n ← length[A]
for j ← 1 to n - 1
    do smallest ← j
       for i ← j + 1 to n
           do if A[i] < A[smallest]
                  then smallest ← i
       exchange A[j] ↔ A[smallest]
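The same algorithm in C (a sketch of ours, 0-based rather than the slides' 1-based indexing):

```c
/* Selection sort: repeatedly select the smallest of A[j..n-1]
   and exchange it with A[j]. */
void selection_sort(int A[], int n)
{
    for (int j = 0; j < n - 1; j++) {
        int smallest = j;
        for (int i = j + 1; i < n; i++)
            if (A[i] < A[smallest])
                smallest = i;
        int tmp = A[j];             /* exchange A[j] <-> A[smallest] */
        A[j] = A[smallest];
        A[smallest] = tmp;
    }
}
```

Sorting the slide's input 8 4 6 9 2 3 1 yields 1 2 3 4 6 8 9.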
Analysis of Selection Sort
Alg.: SELECTION-SORT(A)                       cost   times
n ← length[A]                                 c1     1
for j ← 1 to n - 1                            c2     n
    do smallest ← j                           c3     n-1
       for i ← j + 1 to n                     c4     Σ_{j=1}^{n-1} (n-j+1)   ≈ n²/2 comparisons
           do if A[i] < A[smallest]           c5     Σ_{j=1}^{n-1} (n-j)
                  then smallest ← i           c6     Σ_{j=1}^{n-1} (n-j)
       exchange A[j] ↔ A[smallest]            c7     n-1                     ≈ n exchanges

T(n) = c1 + c2·n + c3(n-1) + c4 Σ_{j=1}^{n-1} (n-j+1) + c5 Σ_{j=1}^{n-1} (n-j) + c6 Σ_{j=1}^{n-1} (n-j) + c7(n-1) = Θ(n²)
Straight selection sort
 It is also called push-down sort.
 Implement the descending priority queue as an unordered array.
 The input array x is used to hold the priority queue, to save additional space.
 It is an in-place sort.
Merge Sort
 Given two sorted lists
(list[i], …, list[m])
(list[m+1], …, list[n])
generate a single sorted list
(sorted[i], …, sorted[n])
 O(n) space vs. O(1) space

CHAPTER 7 31
Bubble Sort
 Idea:
– Repeatedly pass through the array
– Swaps adjacent elements that are out of order

 Easier to implement, but slower than insertion sort
Example (the smallest element bubbles toward the front; j scans from right to left)
Pass i=1: 8 4 6 9 2 3 1
          8 4 6 9 2 1 3
          8 4 6 9 1 2 3
          8 4 6 1 9 2 3
          8 4 1 6 9 2 3
          8 1 4 6 9 2 3
          1 8 4 6 9 2 3
After pass i=2: 1 2 8 4 6 9 3
After pass i=3: 1 2 3 8 4 6 9
After pass i=4: 1 2 3 4 8 6 9
After pass i=5: 1 2 3 4 6 8 9
After pass i=6: 1 2 3 4 6 8 9
After pass i=7: 1 2 3 4 6 8 9
Bubble Sort-Example (the largest element bubbles up each pass)
Pass 1: 7 2 8 5 4 → 2 7 8 5 4 → 2 7 8 5 4 → 2 7 5 8 4 → 2 7 5 4 8
Pass 2: 2 7 5 4 8 → 2 7 5 4 8 → 2 5 7 4 8 → 2 5 4 7 8
Pass 3: 2 5 4 7 8 → 2 5 4 7 8 → 2 4 5 7 8
Pass 4: 2 4 5 7 8 → 2 4 5 7 8 (done)
Bubble Sort-Algorithm
for i ← 1 to length[A]
    do for j ← length[A] downto i + 1
           do if A[j] < A[j-1]
                  then exchange A[j] ↔ A[j-1]
Bubble Sort-Code
public static void bubbleSort(int[] a) {
int outer, inner;
for (outer = a.length - 1; outer > 0; outer--) { // counting down
for (inner = 0; inner < outer; inner++) { // bubbling up
if (a[inner] > a[inner + 1]) { // if out of order...
int temp = a[inner]; // ...then swap
a[inner] = a[inner + 1];
a[inner + 1] = temp;
}
}
}
}

Bubble Sort Running Time
Alg.: BUBBLESORT(A)
for i ← 1 to length[A]                          c1
    do for j ← length[A] downto i + 1           c2
           do if A[j] < A[j-1]                  c3    Comparisons: ≈ n²/2
                  then exchange A[j] ↔ A[j-1]   c4    Exchanges:   ≈ n²/2

T(n) = c1(n+1) + c2 Σ_{i=1}^{n} (n-i+1) + c3 Σ_{i=1}^{n} (n-i) + c4 Σ_{i=1}^{n} (n-i)
     = Θ(n) + (c2 + c3 + c4) Σ_{i=1}^{n} (n-i)

where Σ_{i=1}^{n} (n-i) = n² - Σ_{i=1}^{n} i = n² - n(n+1)/2 = n²/2 - n/2

Thus, T(n) = Θ(n²)
Analysis of bubble sort
 for (outer = a.length - 1; outer > 0; outer--) {
for (inner = 0; inner < outer; inner++) {
if (a[inner] > a[inner + 1]) {
// code for swap omitted
} } }
 Let n = a.length = size of the array
 The outer loop is executed n-1 times (call it n, that’s close
enough)
 Each time the outer loop is executed, the inner loop is executed
– Inner loop executes n-1 times at first, linearly dropping to just
once
– On average, inner loop executes about n/2 times for each
execution of the outer loop
– In the inner loop, the comparison is always done (constant
time), the swap might be done (also constant time)

 Result is n * n/2 * k constant-time steps, that is, O(kn²/2) = O(n²)
Quick sort
 It is also called partition-exchange sort.
 Choose an element a from a specific position in the array (the pivot).
 After partitioning, a is positioned at index j and the following conditions hold:
– Each of the elements in positions 0 through j-1 is less than or equal to a.
– Each of the elements in positions j+1 through n-1 is greater than or equal to a.
Quick Sort
 Given (R0, R1, …, Rn-1), with pivot key Ki:
if Ki is placed in position s(i), then
    Kj ≤ Ks(i) for j < s(i),
    Kj ≥ Ks(i) for j > s(i).
 R0, …, Rs(i)-1, Rs(i), Rs(i)+1, …, Rn-1

   two partitions
Example for Quick Sort

R0 R1 R2 R3 R4 R5 R6 R7 R8 R9 left right
{ 26 5 37 1 61 11 59 15 48 19} 0 9
{ 11 5 19 1 15} 26 { 59 61 48 37} 0 4
{ 1 5} 11 { 19 15} 26 { 59 61 48 37} 0 1
1 5 11 15 19 26 { 59 61 48 37} 3 4
1 5 11 15 19 26 { 48 37} 59 { 61} 6 9
1 5 11 15 19 26 37 48 59 { 61} 6 7
1 5 11 15 19 26 37 48 59 61 9 9
1 5 11 15 19 26 37 48 59 61

Quick Sort
void quicksort(element list[], int left,
int right)
{
int pivot, i, j;
element temp;
if (left < right) {
i = left; j = right+1;
pivot = list[left].key;
do {
do i++; while (list[i].key < pivot);
do j--; while (list[j].key > pivot);
if (i < j) SWAP(list[i], list[j], temp);
} while (i < j);
SWAP(list[left], list[j], temp);
quicksort(list, left, j-1);
quicksort(list, j+1, right);
}
}
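The quicksort function above relies on an element type and a SWAP macro that the slides do not show; one plausible minimal set of supporting definitions (our assumption, modeled on common textbook conventions) is:

```c
/* Supporting definitions assumed by quicksort() above; the slides do not
   show them, so these are one plausible choice. */
typedef struct {
    int key;              /* sort key; real records may carry more fields */
} element;

/* Exchange x and y using temporary t. */
#define SWAP(x, y, t) ((t) = (x), (x) = (y), (y) = (t))
```

Note that the inner `do i++` scan stops only when it reaches a key ≥ the pivot, so the caller must guarantee such a key exists within bounds (e.g., via a sentinel beyond the right end).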
Analysis for Quick Sort
 Quick sort is naturally defined as a recursive process.
 Assume that each time a record is positioned, the list is divided into two parts of roughly equal size.
 Positioning a record in a list with n elements takes O(n).
 If T(n) is the time taken to sort n elements:
T(n) ≤ cn + 2T(n/2) for some c
     ≤ cn + 2(cn/2 + 2T(n/4))
     ...
     ≤ cn·log n + n·T(1) = O(n log n)
Time and Space for Quick Sort
 Space complexity:
– Average case and best case: O(log n)
– Worst case: O(n)
 Time complexity:
– Average case and best case: O(n log n)
– Worst case: O(n²)
Merge Sort (O(n) space)
void merge(element list[], element sorted[],
           int i, int m, int n)
{
  int j, k, t;          /* additional space: n-i+1       */
  j = m+1;              /* # of data movements: M(n-i+1) */
  k = i;
  while (i<=m && j<=n) {
    if (list[i].key<=list[j].key)
      sorted[k++]= list[i++];
    else sorted[k++]= list[j++];
  }
  if (i>m) for (t=j; t<=n; t++)
             sorted[k+t-j]= list[t];
  else for (t=i; t<=m; t++)
         sorted[k+t-i] = list[t];
}
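The slides give only the merge step; a recursive merge-sort driver built on the same idea (a sketch of ours, using plain int arrays rather than the element type) could look like:

```c
#include <string.h>

/* Recursive merge sort on A[lo..hi], using scratch[] of the same length.
   Divide, sort each half, then merge the two sorted halves. */
void merge_sort(int A[], int scratch[], int lo, int hi)
{
    if (lo >= hi) return;
    int mid = lo + (hi - lo) / 2;
    merge_sort(A, scratch, lo, mid);
    merge_sort(A, scratch, mid + 1, hi);

    int i = lo, j = mid + 1, k = lo;
    while (i <= mid && j <= hi)              /* merge the two sorted halves */
        scratch[k++] = (A[i] <= A[j]) ? A[i++] : A[j++];
    while (i <= mid) scratch[k++] = A[i++];
    while (j <= hi)  scratch[k++] = A[j++];
    memcpy(&A[lo], &scratch[lo], (hi - lo + 1) * sizeof(int));
}
```

As the slide title says, this needs O(n) extra space for the scratch array; the merge itself is the O(n) step performed at each of the O(log n) levels of recursion.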
Analysis
 Array vs. linked list representation
– array: O(M(n-i+1)), where M is the record length, for copying records
– linked list representation: O(n-i+1), using the n-i+1 link fields
Radix Sort
Sort by keys K0, K1, …, Kr-1
(K0 is the most significant key, Kr-1 the least significant key)

R0, R1, …, Rn-1 are said to be sorted w.r.t. K0, K1, …, Kr-1 iff
(k0^(i), k1^(i), …, kr-1^(i)) ≤ (k0^(i+1), k1^(i+1), …, kr-1^(i+1)), 0 ≤ i < n-1

Most significant digit first: sort on K0, then K1, ...

Least significant digit first: sort on Kr-1, then Kr-2, ...
Figure 7.14: Arrangement of cards after first pass of an MSD sort (p.353)

Suits: ♣ < ♦ < ♥ < ♠
Face values: 2 < 3 < 4 < … < J < Q < K < A

(1) MSD sort first, e.g., bin sort with four bins (♣ ♦ ♥ ♠);
    LSD sort second, e.g., insertion sort

(2) LSD sort first, e.g., bin sort with 13 bins
    (2, 3, 4, …, 10, J, Q, K, A);
    MSD sort second, e.g., bin sort with four bins (♣ ♦ ♥ ♠)
Figure 7.15: Arrangement of cards after first pass of LSD sort (p.353)
Radix Sort
0 ≤ K ≤ 999
Decompose K into digits (K0, K1, K2), K0 the MSD and K2 the LSD, each in 0-9:
a radix 10 sort.
(The same key could instead be decomposed into bits: a radix 2 sort.)
Example for LSD Radix Sort
d (digit) = 3, r (radix) = 10, ascending order
179, 208, 306, 93, 859, 984, 55, 9, 271, 33

Sort by the least significant digit (first pass):
front[0] → NULL                   rear[0]
front[1] → 271 → NULL             rear[1]
front[2] → NULL                   rear[2]
front[3] → 93 → 33 → NULL         rear[3]
front[4] → 984 → NULL             rear[4]
front[5] → 55 → NULL              rear[5]
front[6] → 306 → NULL             rear[6]
front[7] → NULL                   rear[7]
front[8] → 208 → NULL             rear[8]
front[9] → 179 → 859 → 9 → NULL   rear[9]

Concatenate: 271, 93, 33, 984, 55, 306, 208, 179, 859, 9 (after the first pass)
Sort by the middle digit (second pass):
front[0] → 306 → 208 → 9 → NULL   rear[0]
front[1] → NULL                   rear[1]
front[2] → NULL                   rear[2]
front[3] → 33 → NULL              rear[3]
front[4] → NULL                   rear[4]
front[5] → 55 → 859 → NULL        rear[5]
front[6] → NULL                   rear[6]
front[7] → 271 → 179 → NULL       rear[7]
front[8] → 984 → NULL             rear[8]
front[9] → 93 → NULL              rear[9]
Sort by the most significant digit (third pass):
front[0] → 9 → 33 → 55 → 93 → NULL   rear[0]
front[1] → 179 → NULL                rear[1]
front[2] → 208 → 271 → NULL          rear[2]
front[3] → 306 → NULL                rear[3]
front[4] → NULL                      rear[4]
front[5] → NULL                      rear[5]
front[6] → NULL                      rear[6]
front[7] → NULL                      rear[7]
front[8] → 859 → NULL                rear[8]
front[9] → 984 → NULL                rear[9]
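The three passes above can also be reproduced with a compact array-based version (a sketch of ours; the slides use linked-list queues instead, and all names here are our own):

```c
#include <string.h>

/* LSD radix sort for non-negative ints with at most d decimal digits.
   One stable counting pass per digit, least significant digit first. */
void radix_sort_lsd(int a[], int n, int d)
{
    int out[64];                 /* scratch; this sketch assumes n <= 64 */
    int divisor = 1;
    for (int pass = 0; pass < d; pass++, divisor *= 10) {
        int count[10] = {0};
        for (int i = 0; i < n; i++)         /* histogram of current digit */
            count[(a[i] / divisor) % 10]++;
        for (int dgt = 1; dgt < 10; dgt++)  /* prefix sums -> bin boundaries */
            count[dgt] += count[dgt - 1];
        for (int i = n - 1; i >= 0; i--)    /* stable placement, back-to-front */
            out[--count[(a[i] / divisor) % 10]] = a[i];
        memcpy(a, out, n * sizeof(int));
    }
}
```

Running it on the example input 179, 208, 306, 93, 859, 984, 55, 9, 271, 33 with d = 3 produces the same final order as the three linked-list passes: 9, 33, 55, 93, 179, 208, 271, 306, 859, 984.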
Data Structures for LSD Radix Sort
 For an LSD radix-r sort,
 R0, R1, ..., Rn-1 have keys that are d-tuples (x0, x1, ..., xd-1)
#define MAX_DIGIT 3
#define RADIX_SIZE 10
typedef struct list_node *list_pointer;
typedef struct list_node {
  int key[MAX_DIGIT];
  list_pointer link;
} list_node;
LSD Radix Sort
list_pointer radix_sort(list_pointer ptr)
{
  list_pointer front[RADIX_SIZE], rear[RADIX_SIZE];
  int i, j, digit;
  for (i=MAX_DIGIT-1; i>=0; i--) {
    for (j=0; j<RADIX_SIZE; j++)       /* initialize bins to be   */
      front[j] = rear[j] = NULL;       /* empty queues            */
    while (ptr) {                      /* put records into queues */
      digit = ptr->key[i];
      if (!front[digit]) front[digit] = ptr;
      else rear[digit]->link = ptr;
      rear[digit] = ptr;
      ptr = ptr->link;                 /* get next record: O(n)   */
    }
    /* reestablish the linked list for the next pass: O(r) */
    ptr = NULL;
    for (j=RADIX_SIZE-1; j>=0; j--)
      if (front[j]) {
        rear[j]->link = ptr;
        ptr = front[j];
      }
  }
  return ptr;                          /* total: O(d(n+r)) */
}
Heap Sort
*Figure 7.11: Array interpreted as a binary tree (p.349)
1 2 3 4 5 6 7 8 9 10
26 5 77 1 61 11 59 15 48 19

input file [1] 26

[2] 5 [3] 77

[4] 1 [5] 61 [6] 11 [7] 59

[8] 15 [9] 48 [10] 19

*Figure 7.12: Max heap following first for loop of heapsort(p.350)

initial heap [1] 77

[2] 61 [3] 59
exchange
[4] 48 [5] 19 [6] 11 [7] 26

[8] 15 [9] 1 [10] 5

Figure 7.13: Heap sort example(p.351)

[1] 61

[2] 48 [3] 59

[4] 15 [5] 19 [6] 11 [7] 26

[8] 5 [9] 1 [10] 77


(a)

[1] 59

[2] 48 [3] 26

[4] 15 [5] 19 [6] 11 [7] 1

[8] 5 [9] 61 [10] 77

(b)

Figure 7.13 (continued): Heap sort example (p.351)

[1] 48

[2] 19 [3] 26

[4] 15 [5] 5 [6] 11 [7] 1

[8] 59 [9] 61 [10] 77

(c)

[1] 26

[2] 19 [3] 11

[4] 15 [5] 5 [6] 1 [7] 48

[8] 59 [9] 61 [10] 77

(d)
Heap Sort
void adjust(element list[], int root, int n)
{
  int child, rootkey; element temp;
  temp=list[root]; rootkey=list[root].key;
  child=2*root;        /* node i has children 2i and 2i+1 */
  while (child <= n) {
    if ((child < n) &&
        (list[child].key < list[child+1].key))
      child++;
    if (rootkey > list[child].key) break;
    else {
      list[child/2] = list[child];
      child *= 2;
    }
  }
  list[child/2] = temp;
}
Heap Sort
void heapsort(element list[], int n)
{  /* ascending order (max heap) */
  int i, j;
  element temp;
  for (i=n/2; i>0; i--) adjust(list, i, n);  /* bottom-up heap construction */
  for (i=n-1; i>0; i--) {                    /* n-1 cycles                  */
    SWAP(list[1], list[i+1], temp);
    adjust(list, 1, i);                      /* top-down re-adjustment      */
  }
}
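A self-contained int version of the same two-phase procedure (our sketch; 1-based indexing as in the slides, so a[0] is unused, and the function names are our own):

```c
/* Sift a[root] down within the max-heap a[1..n]. */
static void sift_down(int a[], int root, int n)
{
    int temp = a[root];
    int child = 2 * root;
    while (child <= n) {
        if (child < n && a[child] < a[child + 1])
            child++;                     /* pick the larger child */
        if (temp >= a[child]) break;
        a[child / 2] = a[child];         /* move child up one level */
        child *= 2;
    }
    a[child / 2] = temp;
}

/* Heap sort a[1..n] into ascending order. */
void heap_sort(int a[], int n)
{
    for (int i = n / 2; i > 0; i--)      /* bottom-up heap construction */
        sift_down(a, i, n);
    for (int i = n - 1; i > 0; i--) {    /* n-1 cycles of swap + re-adjust */
        int t = a[1]; a[1] = a[i + 1]; a[i + 1] = t;
        sift_down(a, 1, i);
    }
}
```

On the figure's input 26 5 77 1 61 11 59 15 48 19 this produces 1 5 11 15 19 26 48 59 61 77.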
Complexity of Sort

Sort            stability   space      time (best)  time (average)  time (worst)
Bubble Sort     stable      little     O(n)         O(n²)           O(n²)
Insertion Sort  stable      little     O(n)         O(n²)           O(n²)
Quick Sort      unstable    O(log n)   O(n log n)   O(n log n)      O(n²)
Merge Sort      stable      O(n)       O(n log n)   O(n log n)      O(n log n)
Heap Sort       unstable    little     O(n log n)   O(n log n)      O(n log n)
Radix Sort      stable      O(np)      O(n log n)   O(n log n)      O(n log n)
List Sort       ?           O(n)       O(1)         O(n)            O(n)
Table Sort      ?           O(n)       O(1)         O(n)            O(n)
Summary
 Most of the sorting techniques we have discussed are O(n²)
 As we will see later, we can do much better than this with somewhat more complicated sorting algorithms
 Within O(n²),
– Bubble sort is very slow, and should probably never be used for anything
– Selection sort is intermediate in speed
– Insertion sort is usually faster than selection sort; in fact, for small arrays (say, 10 or 20 elements), insertion sort is faster than more complicated sorting algorithms
– Merge sort, if done in memory, is O(n log n)
 Selection sort and insertion sort are "good enough" for small arrays
 Merge sort is good for sorting data that doesn't fit in main memory
