Quick Sort and Insertion Sort are two of the most common sorting algorithms. Quick sort is popular because it is among the fastest general-purpose sorting algorithms in practice, with O(n log n) average-case run-time. Insertion sort, on the other hand, works very well when the array is partially sorted and when the array is not too large. In this project, we combine the two algorithms so as to keep the speed of quick sort while exploiting the effectiveness of insertion sort on small, partially sorted arrays. We then look for the hybrid algorithm (the combination of insertion sort and quick sort) that is optimal in the sense of minimum average run-time.

Hybrid Quick Sort + Insertion Sort: Runtime Comparison

Anirban Ray

16 March 2018

Anirban Ray Hybrid Quick Sort + Insertion Sort: Runtime Comparison 16 March 2018 1 / 26

Introduction

Some well-known methods include Merge Sort, Heap Sort, Quick Sort, Insertion Sort, Selection Sort, etc.

Focus on Quick Sort and Insertion Sort


Quick sort: Advantages and Disadvantages

Provides the best run-time in the average case

Recursive algorithm

High overhead cost from repeated recursive calls on relatively small arrays


Insertion Sort: Advantages and Disadvantages

Iterative algorithm, and takes only constant memory space

Sorts an array as it receives it

Does not lag too much for moderately large arrays, provided the array is substantially sorted


Objective

The speed of quick sort should be retained, if not improved

Take advantage of the effectiveness of insertion sort on smaller, partially sorted arrays

Finding an “optimal” sorting algorithm


Insertion Sort: Steps

The first k elements are already sorted

Removes the (k+1)-th element

Finds its ordered position relative to the first k (already sorted) elements

Inserts it there

The first (k+1) elements become sorted


Insertion Sort: Algorithm

INSERTIONSORT(A)
    for j = 2 to A.length
        key = A[j]
        i = j - 1
        while i > 0 and A[i] > key
            A[i + 1] = A[i]
            i = i - 1
        A[i + 1] = key
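For readers who want something runnable, the pseudocode above can be sketched in Python as follows. This is a minimal translation, assuming 0-based list indexing (the pseudocode is 1-based); the function name `insertion_sort` is chosen here for illustration.

```python
def insertion_sort(a):
    """Sort the list `a` in place and return it (0-based INSERTIONSORT)."""
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        # Shift the larger elements of the sorted prefix a[0..j-1] one slot right.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        # Drop the key into the gap that the shifting opened up.
        a[i + 1] = key
    return a


print(insertion_sort([5, 2, 4, 6, 1, 3]))  # prints [1, 2, 3, 4, 5, 6]
```

On an already sorted input the inner loop never runs, which is why the method is cheap for substantially sorted arrays.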


Quick Sort: Steps

Picks a pivot element and places it such that all elements before it are not more than the pivot element

And all elements after it are not less than it

Divides the array into two sub-arrays with respect to this pivot element

Neither sub-array is sorted in itself, but no element in the former exceeds any element in the latter

Repeats the same for both sub-arrays

Continues until all sub-arrays are of size 1

All these sub-arrays are trivially sorted within themselves

Since the sub-arrays are already in order relative to one another, putting them together completes the sorting process


Quick Sort: Choice of Pivot

Each choice is suitable for different types of input array

The last element may always be used, but it is not ideal for a sorted array (due to Lomuto)

A similar problem arises for the first element of the array (due to Hoare)

A random element may be chosen, but random number generation has some cost

The ideal choice is the median, but it is hard to find the median without sorting the array

A compromise is the median of the first, last and middle entries (due to Singleton)

For random inputs, no particular pivot is preferred over the others

We will use random arrays throughout this project

We will use the first element of the array
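Although this project uses the first element as pivot, the median-of-three compromise listed above can be sketched as follows. The helper name `median_of_three_index` is hypothetical and is not part of the slides; it only illustrates the idea.

```python
def median_of_three_index(a, p, r):
    """Return the index (p, middle, or r) that holds the median of
    a[p], a[middle] and a[r]. Hypothetical helper, for illustration only."""
    mid = (p + r) // 2
    # Sort the three (value, index) pairs and take the middle one.
    candidates = sorted([(a[p], p), (a[mid], mid), (a[r], r)])
    return candidates[1][1]


a = [9, 1, 5, 3, 7]
m = median_of_three_index(a, 0, len(a) - 1)
# Move the chosen pivot to the front so that a first-element
# partition scheme can be reused unchanged.
a[0], a[m] = a[m], a[0]
print(a)  # prints [7, 1, 5, 3, 9]
```

Swapping the chosen element into the first position is one simple way to combine this choice with a partition routine that always takes the first element as pivot.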


Quick Sort: Partition Algorithm

PARTITION(A, p, r)
    x = A[p]
    i = p - 1
    j = r + 1
    while TRUE
        repeat
            j = j - 1
        until A[j] <= x
        repeat
            i = i + 1
        until A[i] >= x
        if (i < j) exchange A[i] with A[j]
        else return j


Quick Sort: Main Sorting Algorithm

QUICKSORT(A, p, r)
    if p < r
        q = PARTITION(A, p, r)
        QUICKSORT(A, p, q)
        QUICKSORT(A, q + 1, r)
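A runnable Python sketch of PARTITION and QUICKSORT is given below, assuming 0-based indexing and lists of mutually comparable items. The function names and the default arguments of `quick_sort` are conveniences added here, not part of the slides.

```python
def partition(a, p, r):
    """Hoare-style partition of a[p..r] around the first element.
    Returns j such that every element of a[p..j] is <= every element
    of a[j+1..r]."""
    x = a[p]
    i, j = p - 1, r + 1
    while True:
        # "repeat j = j - 1 until A[j] <= x"
        j -= 1
        while a[j] > x:
            j -= 1
        # "repeat i = i + 1 until A[i] >= x"
        i += 1
        while a[i] < x:
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j


def quick_sort(a, p=0, r=None):
    """Sort a[p..r] in place and return the list."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = partition(a, p, r)
        quick_sort(a, p, q)
        quick_sort(a, q + 1, r)
    return a


print(quick_sort([3, 6, 1, 8, 2, 9, 2]))  # prints [1, 2, 2, 3, 6, 8, 9]
```

Note that with the first element as pivot, an already sorted input makes the recursion depth grow linearly with the array size, so this sketch is intended for random inputs like those used in this project.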


Hybrid Sort: Formulation

Quick sort works by recursively dividing the array into smaller sub-arrays

These are not sorted within themselves, but are sorted between themselves

Insertion sort works better for arrays made up of small, partially sorted sub-arrays

The hybrid method should start the sorting procedure with the partition approach of quick sort

It should continue to do so until the sub-arrays are of size not more than a specified cut-off

At this stage, we have an array of partially sorted sub-arrays

Apply insertion sort over the array to get the sorted output


Hybrid Sort: Algorithm

HYBRIDSORT(A, p, r, k)
    if (p < r)
        if (r - p + 1 > k)
            q = PARTITION(A, p, r)
            HYBRIDSORT(A, p, q, k)
            HYBRIDSORT(A, q + 1, r, k)

Once, after the top-level call HYBRIDSORT(A, 1, A.length, k) returns:
    INSERTIONSORT(A)
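The hybrid scheme can be sketched in Python as below. This is one possible reading of the pseudocode, assuming that insertion sort runs once over the whole array after the partitioning phase has finished; the names `_partial_quick_sort` and `hybrid_sort` are chosen here for illustration, and indexing is 0-based.

```python
def insertion_sort(a):
    """Plain insertion sort over the whole list, in place."""
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key


def partition(a, p, r):
    """Hoare-style partition of a[p..r] around the first element."""
    x = a[p]
    i, j = p - 1, r + 1
    while True:
        j -= 1
        while a[j] > x:
            j -= 1
        i += 1
        while a[i] < x:
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j


def _partial_quick_sort(a, p, r, k):
    """Partition recursively, but leave sub-arrays of size <= k untouched."""
    if p < r and r - p + 1 > k:
        q = partition(a, p, r)
        _partial_quick_sort(a, p, q, k)
        _partial_quick_sort(a, q + 1, r, k)


def hybrid_sort(a, k):
    """Quick sort down to cut-off k, then one insertion sort pass."""
    _partial_quick_sort(a, 0, len(a) - 1, k)
    insertion_sort(a)  # single final pass over the nearly sorted array
    return a


print(hybrid_sort([7, 3, 9, 1, 8, 2, 6, 4, 5, 0], k=3))
# prints [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

After the partitioning phase every element is at most k positions' worth of sub-array away from its final place, so the final insertion pass only moves elements within their own sub-array.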


Steps for finding the Optimum Cutoff Size

Now, we wish to find the “optimum” cut-off for the sub-array size

Optimality will be considered in terms of average run-time

We will use a simulation study to choose this optimum cut-off

We take some choices of cut-off in the range from 1 to 1000

For a fixed array size, we find the average run-time for each choice over 25 replications

We then vary the array size and repeat the same procedure
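The simulation steps above can be sketched as a small timing harness. Everything named here (`average_runtime`, `sweep_cutoffs`, the seed, the list of cut-offs) is an assumption made for illustration, and `placeholder_sort` is only a stand-in that ignores the cut-off: in a real study it would be replaced by the hybrid sort.

```python
import random
import time


def average_runtime(sort_fn, n, cutoff, reps=25, seed=0):
    """Average wall-clock seconds of sort_fn(array, cutoff) over
    `reps` random arrays of size n."""
    rng = random.Random(seed)  # same arrays for every cut-off
    total = 0.0
    for _ in range(reps):
        a = [rng.random() for _ in range(n)]
        start = time.perf_counter()
        sort_fn(a, cutoff)
        total += time.perf_counter() - start
    return total / reps


def sweep_cutoffs(sort_fn, n, cutoffs, reps=25):
    """Map each candidate cut-off to its average run-time."""
    return {k: average_runtime(sort_fn, n, k, reps) for k in cutoffs}


def placeholder_sort(a, k):
    """Stand-in only: ignores k. Substitute the hybrid sort here."""
    a.sort()


results = sweep_cutoffs(placeholder_sort, n=1000,
                        cutoffs=[1, 50, 100, 200, 500, 1000])
best = min(results, key=results.get)
print("cut-off with the smallest average run-time:", best)
```

Reusing the same seed for every cut-off means all candidates are timed on identical arrays, which removes one source of noise from the comparison.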


Graph 1


Graph 2


Graph 3


Graph 4


Observations from the Graphs: Initial sharp fall

For a cut-off of 1, the hybrid and quick sort algorithms are equivalent


Observations from the Graphs: Increasing trend in run-time with increase in cut-off size

As the cut-off increases, insertion sort is applied over larger sub-arrays

The benefit of partial sortedness is significant only for “small” arrays


Observations from the Graphs: Skewed U-shaped pattern

The trade-off between the two effects (the initial sharp fall and the later increasing trend) is balanced in the lower part of the graph


Observations from the Graphs: No unique point of minima

A minimum occurs in every graph, but its location varies from graph to graph

This is quite expected in simulation studies

(100, 200) can be considered a broad interval containing the optimum

We will use 140 as the cut-off in the following section


Improvement over Quick Sort

We now wish to measure the extent of the improvement, if any

At the least, we want to see whether it varies with array size

We again do a simulation study

Fix an array size and sort 50 random arrays of that size by both methods

Calculate the average run-time in both cases and compute the percentage improvement

Vary the array size and do the same
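The slides do not spell out the formula for the percentage improvement. One natural definition, assumed here, is the reduction in average run-time relative to quick sort; the function name `percentage_improvement` is likewise chosen only for illustration.

```python
def percentage_improvement(t_quick, t_hybrid):
    """Relative reduction in average run-time, in percent.
    Assumed definition: 100 * (t_quick - t_hybrid) / t_quick."""
    return 100.0 * (t_quick - t_hybrid) / t_quick


# Made-up average run-times in seconds, for illustration only.
print(percentage_improvement(2.0, 1.5))  # prints 25.0
```

A positive value means the hybrid method is faster on average, and a negative value means it is slower.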


Graph 5


Explanation of Graph

However, the improvement decreases as the array size increases

The reason is the insertion sort applied in the last step over the whole array

A large array size restricts the efficiency of insertion sort for partially sorted arrays

Alternatively, insertion sort may be applied to each sub-array

But then there will be a high overhead cost from too many calls

We can ignore this, as the hybrid still beats quick sort by around 40%, which is sufficient for real-life situations
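The alternative mentioned above, applying insertion sort to each small sub-array instead of once at the end, could look like the sketch below. This variant was not the one timed in the study; the names `insertion_sort_range` and `hybrid_sort_local` are hypothetical, and indexing is 0-based.

```python
def insertion_sort_range(a, p, r):
    """Insertion sort restricted to the slice a[p..r], in place."""
    for j in range(p + 1, r + 1):
        key = a[j]
        i = j - 1
        while i >= p and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key


def partition(a, p, r):
    """Hoare-style partition of a[p..r] around the first element."""
    x = a[p]
    i, j = p - 1, r + 1
    while True:
        j -= 1
        while a[j] > x:
            j -= 1
        i += 1
        while a[i] < x:
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j


def hybrid_sort_local(a, k, p=0, r=None):
    """Quick sort down to cut-off k, sorting each small sub-array
    with its own insertion sort call."""
    if r is None:
        r = len(a) - 1
    if p < r and r - p + 1 > k:
        q = partition(a, p, r)
        hybrid_sort_local(a, k, p, q)
        hybrid_sort_local(a, k, q + 1, r)
    else:
        insertion_sort_range(a, p, r)  # one call per small sub-array
    return a


print(hybrid_sort_local([7, 3, 9, 1, 8, 2, 6, 4, 5, 0], k=3))
# prints [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

This version makes one insertion sort call per leaf of the recursion, which is the source of the extra call overhead noted above.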


Summary

We have provided a guideline for the optimum cut-off size

We verified that the hybrid algorithm outperforms quick sort significantly for all practical purposes

The hybrid algorithm can be implemented quite easily once Insertion Sort and Quick Sort have been defined


