
Lower Bounds for Sorting, Counting Sort

Term Paper: Algorithm Analysis & Design

Sadashiv Srivastava 10810742

Acknowledgement

First and foremost, I would like to thank my teacher, Ms Shivani Malhotra, who assigned me this topic to bring out my capabilities.

I express my gratitude to my parents for being a continuous source of encouragement and for all their financial aid given to me.

I would like to acknowledge the assistance provided to me by the library staff of LPU.

My heartfelt gratitude to my friends for helping me to complete my work in time.

Counting Sort
Counting sort is a linear-time sorting algorithm used to sort items drawn from a fixed and finite set; integers that lie in a fixed interval, say k1 to k2, are an example of such items. The algorithm relies on an ordering relation between the items of the superset from which the input is drawn (for a set of integers, this relation is trivial). Let the input to be sorted be an array A. An auxiliary array B is then defined, with size equal to the number of items in the superset. For each element e in A, the algorithm stores in B[e] the number of items in A less than or equal to e. If the sorted output is to be stored in an array C, then for each e in A, taken in reverse order, C[B[e]] = e; after each such step, the value of B[e] is decremented. The algorithm makes two passes over A and one pass over B. If the size k of the range is no larger than the size n of the input, the time complexity is O(n). Note also that counting sort is a stable algorithm, meaning that ties are resolved by reporting first those elements which occur first in the input.

Algorithm for Counting Sort


COUNTING-SORT(A, B, k)
 1. for i = 1 to k
 2.     do C[i] = 0
 3. for j = 1 to length[A]
 4.     do C[A[j]] = C[A[j]] + 1
 5. > C[i] now contains the number of elements equal to i
 6. for i = 2 to k
 7.     do C[i] = C[i] + C[i-1]
 8. > C[i] now contains the number of elements less than or equal to i
 9. for j = length[A] down to 1
10.     do B[C[A[j]]] = A[j]
11.        C[A[j]] = C[A[j]] - 1
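The pseudocode above can be sketched in Python as follows (a minimal sketch: the input and output lists are 0-indexed in the usual Python style, while C is indexed by the key values 1..k):

```python
def counting_sort(A, k):
    """Stable counting sort of a list A of integers in the range 1..k."""
    n = len(A)
    C = [0] * (k + 1)          # C[i] will hold the count for key value i
    B = [0] * n                # output array

    for key in A:              # lines 3-4: count occurrences of each key
        C[key] += 1

    for i in range(2, k + 1):  # lines 6-7: prefix sums, so C[i] = # of keys <= i
        C[i] += C[i - 1]

    for key in reversed(A):    # lines 9-11: scan backwards for stability
        B[C[key] - 1] = key    # C[key] is a 1-based output position
        C[key] -= 1
    return B
```

For example, `counting_sort([3, 6, 4, 1, 3, 4, 1, 4], 6)` returns `[1, 1, 3, 3, 4, 4, 4, 6]`.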

Analysis
Because the algorithm uses only simple for loops, without recursion or subroutine calls, it is straightforward to analyze.

The initialization of the Count array, and the second for loop which performs a prefix sum on the count array, each iterate at most k + 1 times and therefore take O(k) time.

The other two for loops, and the initialization of the output array, each take O(n) time. Therefore the time for the whole algorithm is the sum of the times for these steps, O(n + k).

Because it uses arrays of length k + 1 and n, the total space usage of the algorithm is also O(n + k).

For problem instances in which the maximum key value is significantly smaller than the number of items, counting sort can be highly space-efficient, as the only storage it uses other than its input and output arrays is the Count array which uses space O(k).

Example
Each line below shows the step by step operation of counting sort.

[Figure omitted: the step-by-step contents of arrays A, C, and B.]
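The step-by-step operation can be reproduced by a short Python trace that prints the working arrays after each phase (a sketch; the input values here are our own illustration):

```python
def counting_sort_trace(A, k):
    """Trace counting sort of A (keys in 1..k), printing each phase."""
    C = [0] * (k + 1)
    for key in A:                      # count occurrences of each key
        C[key] += 1
    print("C after counting:   ", C[1:])
    for i in range(2, k + 1):          # prefix sums: C[i] = # of keys <= i
        C[i] += C[i - 1]
    print("C after prefix sums:", C[1:])
    B = [0] * len(A)
    for key in reversed(A):            # place keys, scanning backwards
        B[C[key] - 1] = key
        C[key] -= 1
    print("B (sorted output):  ", B)
    return B

counting_sort_trace([2, 5, 3, 2, 2, 3], 5)
```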

Analysis

1. The loop of lines 1-2 takes O(k) time.
2. The loop of lines 3-4 takes O(n) time.
3. The loop of lines 6-7 takes O(k) time.
4. The loop of lines 9-11 takes O(n) time.

Therefore, the overall time of counting sort is O(k) + O(n) + O(k) + O(n) = O(k + n). In practice, we usually use counting sort when k = O(n), in which case the running time is O(n).

Counting sort is a stable sort, i.e., multiple keys with the same value are placed in the sorted array in the same order in which they appear in the input array. Suppose that the for-loop in line 9 of counting sort is rewritten as

9. for j = 1 to length[A]

Then stability no longer holds. Notice that the correctness argument in CLR does not depend on the order in which the array A[1..n] is processed; the algorithm is correct no matter what order is used. In particular, the modified algorithm still places the elements with value k in positions C[k-1] + 1 through C[k], but in reverse order of their appearance in A[1..n].

Note that counting sort beats the lower bound of Ω(n lg n) because it is not a comparison sort: there is no comparison between elements. Counting sort uses the actual values of the elements to index into an array.
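The stability claim can be demonstrated by sorting (key, tag) pairs and toggling the direction of the final pass (a sketch; the function and parameter names are our own):

```python
def counting_sort_pairs(pairs, k, reverse_pass=True):
    """Counting sort of (key, tag) pairs; the tag tracks original order."""
    C = [0] * (k + 1)
    for key, _ in pairs:               # count occurrences of each key
        C[key] += 1
    for i in range(2, k + 1):          # prefix sums
        C[i] += C[i - 1]
    B = [None] * len(pairs)
    # Backward pass (as in line 9) is stable; forward pass is not.
    order = reversed(pairs) if reverse_pass else pairs
    for key, tag in order:
        B[C[key] - 1] = (key, tag)
        C[key] -= 1
    return B

data = [(3, 'a'), (1, 'b'), (3, 'c')]
print(counting_sort_pairs(data, 3, reverse_pass=True))   # [(1,'b'), (3,'a'), (3,'c')] - stable
print(counting_sort_pairs(data, 3, reverse_pass=False))  # [(1,'b'), (3,'c'), (3,'a')] - ties reversed
```

Both outputs are correctly sorted by key; only the backward pass preserves the input order of equal keys.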

Lower Bounds for Sorting


1. Overview

Here we will discuss the notion of lower bounds, in particular for the problem of sorting. We show that any deterministic comparison-based sorting algorithm must take Ω(n log n) time to sort an array of n elements in the worst case.

We then extend this result to average case performance, and to randomized algorithms. In the process, we introduce the 2-player game view of algorithm design and analysis.

2. Lower Bound on Complexity for Sorting Methods

Result 1. The worst case complexity of any sorting algorithm that only uses key comparisons is Ω(n log n).

Result 2. The average case complexity of any sorting algorithm that only uses key comparisons is Ω(n log n).

The above results are proved using a Decision Tree which is a binary tree in which the nodes represent the status of the algorithm after making some comparisons.

Consider a node x in a decision tree and let y be its left child and z its right child. See Figure 1.

Figure 1: A decision tree scenario

Basically, y represents a state consisting of the information known at x plus the fact that the key k1 is less than key k2. For a decision tree for insertion sort on 3 elements, see Figure 2

Figure 2: Decision tree for a 3-element insertion sort

3. Result 1: Lower Bound on Worst Case Complexity

Given a list of n distinct elements, there are n! possible outcomes that represent correct sorted orders.

Any decision tree describing a correct sorting algorithm on a list of n elements will have at least n! leaves. In fact, if we delete nodes corresponding to unnecessary comparisons and if we delete leaves that correspond to an inconsistent sequence of comparison results, there will be exactly n! leaves.

The length of a path from the root to a leaf gives the number of comparisons made when the ordering represented by that leaf is the sorted order for a given input list L.

The worst case complexity of an algorithm is given by the length of the longest path in the associated decision tree. To obtain a lower bound on the worst case complexity of any sorting algorithm, we have to consider all possible decision trees having n! leaves and take the minimum longest path.
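Since a binary tree of depth d has at most 2^d leaves, and a correct decision tree needs at least n! leaves, the worst case requires at least ceil(log2(n!)) comparisons. For small n this bound can be computed directly (a sketch; the helper name is our own):

```python
import math

def comparison_lower_bound(n):
    """Minimum worst-case comparisons to sort n keys: ceil(log2(n!))."""
    return math.ceil(math.log2(math.factorial(n)))

for n in range(2, 8):
    print(n, comparison_lower_bound(n))
```

For n = 5 the bound is 7, and 7 comparisons are in fact achievable, so the bound can be tight.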

In any decision tree with n! leaves, it is clear that the longest path will have a length of at least log(n!). More precisely, since

n! ≥ (n/2)^(n/2),

we have

log(n!) ≥ (n/2) log(n/2) = Ω(n log n).

Thus any sorting algorithm that only uses comparisons has a worst case complexity of Ω(n log n).

4. Result 2: Lower Bound on Average Case Complexity

We shall show that in any decision tree with K leaves, the average depth of a leaf is at least log K; in fact, we shall show the result for any binary tree with K leaves. Suppose the result is not true, and let T be a counterexample with the fewest nodes. T cannot be a single node, because log 1 = 0. Let T have k leaves. Then T can take only one of the two forms shown in Figure 3.

Figure 3: Two possibilities for a counterexample with fewest nodes

Suppose T is of the form of Tree 1. The tree rooted at n1 has fewer nodes than T but the same number of leaves, and is hence an even smaller counterexample than T. Thus T cannot be of the Tree 1 form.

Suppose T is of the form of Tree 2. The trees T1 and T2 rooted at n1 and n2 are smaller than T, with k1 and k2 leaves respectively (k1 + k2 = k), and therefore

average depth of T1 ≥ log k1,
average depth of T2 ≥ log k2.

Every leaf of T lies one level deeper than the corresponding leaf of T1 or T2, so the average depth of T is at least

(k1/k)(log k1 + 1) + (k2/k)(log k2 + 1),

which is minimized when k1 = k2 = k/2 and is then equal to log k. This contradicts the premise that the average depth of T is < log k. Thus T cannot be of the form of Tree 2.

Thus in any decision tree with n! leaves, the average path length to a leaf is at least log(n!) = Ω(n log n).
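The estimate log(n!) = Ω(n log n) rests on the fact that n! ≥ (n/2)^(n/2); this can be checked numerically (a quick sketch):

```python
import math

for n in [4, 16, 64, 256]:
    exact = math.log2(math.factorial(n))       # log2(n!)
    bound = (n / 2) * math.log2(n / 2)         # (n/2) log2(n/2)
    print(f"n={n}: log2(n!) = {exact:.1f} >= (n/2)log2(n/2) = {bound:.1f}")
    assert exact >= bound
```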

Bibliography

Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001), "8.2 Counting Sort", Introduction to Algorithms (2nd ed.), MIT Press and McGraw-Hill, pp. 168-170, ISBN 0-262-03293-7. See also the historical notes on page 181.

Edmonds, Jeff (2008), "5.2 Counting Sort (a Stable Sort)", How to Think about Algorithms, Cambridge University Press, pp. 72-75, ISBN 978-0-521-84931-9.

Sedgewick, Robert (2003), "6.10 Key-Indexed Counting", Algorithms in Java, Parts 1-4: Fundamentals, Data Structures, Sorting, and Searching (3rd ed.), Addison-Wesley, pp. 312-314.

Knuth, D. E. (1998), The Art of Computer Programming, Volume 3: Sorting and Searching (2nd ed.), Addison-Wesley, ISBN 0-201-89685-0. Section 5.2, Sorting by counting, pp. 75-80, and historical notes, p. 170.

Burris, David S.; Schember, Kurt (1980), "Sorting sequential files with limited auxiliary storage", Proceedings of the 18th annual Southeast Regional Conference, New York, NY, USA: ACM, pp. 23-31, doi:10.1145/503838.503855.

Zagha, Marco; Blelloch, Guy E. (1991), "Radix sort for vector multiprocessors", Proceedings of Supercomputing '91, November 18-22, 1991, Albuquerque, NM, USA, IEEE Computer Society / ACM, pp. 712-721, doi:10.1145/125826.126164.
