

F. D. Lewis
Professor Emeritus
Department of Computer Science
University of Kentucky



Copyright by F. D. Lewis.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, chemical, or mechanical, (including photocopying, recording, and information storage and retrieval), without prior written permission from the author.

Preface
This small collection of essays evolved over the years due to a lack of coverage in standard algorithms textbooks. As textbooks became more extensive, the material found here came to provide an alternate treatment for students in data structures and algorithms courses. Many topics are personal favorites (2-3 trees, tournaments), while others were needed for some reason at one time or another. None of the algorithms described, developed, and analyzed here were invented by the author, although the presentations come from courses taught over the last twenty years or so. In particular, most of the specifications and correctness proofs given here are not those included in the original papers.

In the coverage of each family of algorithms or methods for particular objects, three topics are presented: how the algorithm works, why it works, and how efficient it is. The last two, correctness and complexity, are explained fully in the companion piece Essays on Algorithm Analysis, which of course should be consulted. However, the complexity and correctness arguments should be able to stand alone in this collection. It is hoped that the ideas in this collection will still be useful in examining algorithms, analyzing them, and explaining them to others.

Bangkok, Thailand
August 2002

Essays on Algorithm Design

Algorithms (Methods) for Specific Objects
  Order Statistics and Tournaments
  2-3 Search Trees
  Tables and Hashing
  Sorting Networks
  Fast Matrix Multiplication
Algorithm Fragments
  Minimum Spanning Trees
  External Sorting
  String Searching and Pattern Matching
  Network Flow
  Faster Arithmetic


Order Statistics and Tournaments


Sequential examination is not the only way to find the maximum element on a list. Another way to do this is to have the elements compete in a tournament much like those in athletic events. This is accomplished by comparing each pair of elements on the list and moving the largest on to the next round. An example appears below as figure 1.

[Figure 1 - A Tournament of Integers: first-round matches pair 3 with 17, 22 with 14, 87 with 56, and 37 with 8; the winners 17, 22, 87, and 37 meet in round two, and 87 emerges as the champion]

Note that round two featured the elements 17, 22, 87, and 37 since they were larger than their partners in the first round. Eventually 87 became the champion and won the tournament since it was the largest of all the numbers and thus triumphed in every round of the tournament. Suppose that we were to design an algorithm to run the tournament and find the maximum. We begin with a list of n numbers and enter them in
the tournament by placing them on the list of players. Then we have them play matches (by comparing pairs of them), and retain the winners on the list of players for the next round. We continue doing this until only one element remains. The complete algorithm is presented in figure 2. Note that for convenience we required the number of elements on the list to be a power of two.

function tournament(a, n)
  PRE: 0 < n, n is a power of 2
  POST: return largest of a[1], ..., a[n]
  place all n elements on the player list
  k = n (the match list size)
  while k > 1 do
    {k = k/2
     play k matches between pairs of players
     post winners to a new player list}
  return element remaining on player list

Figure 2 - Tournament Algorithm

This looks good, but we need to verify that it indeed finds the maximum element on a list. We shall do this by arguing that at each round a certain number of elements are smaller than each of the participants in that round, and that in the final round, where we claim the winner emerges, all other elements are smaller than the tournament survivor.

Let us examine the while loop in the context of the picture in figure 1. The player lists of the program are just the elements in each round or vertical column of the picture in figure 1. At the beginning we have 8 elements and we know nothing about them. After one round, 4 remain in competition and we know that each of them is larger than one other element since they had beaten an element to reach the second round. After the second round, 2 elements remain and each of them is larger than three other elements. This is because each of the winners has beaten two elements, and its most recent victim had beaten another element in the first round. For example: the number 22 is larger than three elements since it beat both 17 and 14, and 17 beat 3. The general case is shown in the chart presented as figure 3. It contains the relationships between the round, the number of elements (players) still competing in that round, and the minimum number of elements smaller than each of the players in that round. Again, for convenience, the original number of players is assumed to be a power of two.


round:    0    1    2    3    ...    m
players:  n    n/2  n/4  n/8  ...    n/2^m
smaller:  0    1    3    7    ...    2^m - 1

Figure 3 - Tournament Parameter Relationships

Since the number of elements in the round is halved each time, the number of rounds is O(logn), and thus n/2^logn, or one element, emerges from the final round. This element is larger than 2^logn - 1 = n - 1 elements, and for that reason it is the maximum on the list. Now we know exactly why a tournament will select the maximum element. We still need to show that the algorithm does the same.

With this in mind, we can now form the loop invariant for the while loop of our tournament algorithm. It merely states that the elements about to compete are larger than at least a certain number of elements. Here it is in formal, precise terms.

INVARIANT: The elements on the player list are greater than at least (n/k) - 1 of the elements on the original list.

All that remains is to show that the algorithm is correct and to complete the implementation details about playing and posting the winners to the new player list. Here are the arguments for verifying the three conditions needed to show that the while loop is correct.

Entry: PRE implies INV. Since k = n, obviously all of these elements are larger than at least (n/k) - 1 = (n/n) - 1 = 0 elements.

Execution: INV and (k > 1) {set(k); play; post} INV. Since k > 1, at least one match is about to be played. Let us examine a match. Because of the invariant, both elements in the match are greater than at least n/k - 1 others. This means that the winner must be greater than: a) those it was greater than before (n/k - 1), b) the loser (1), and c) those smaller than the loser (n/k - 1).

Adding this up, we find that the winner is larger than 2(n/k - 1) + 1 elements. This is equal to 2n/k - 1, or n/(k/2) - 1, elements. This is exactly the number of elements it must be greater than for the invariant to be true in the next round, which will be played with half as many elements as this round.

Exit: INV and (k = 1) imply POST. If k equals 1 then, by the invariant, the element remaining on the player list is greater than at least (n/k) - 1 = (n/1) - 1 = n - 1 elements on the list. Thus it has to be the maximum.

All that remains is to fill in the details about playing and posting. Playing is just comparing the numbers, and posting can be implemented by writing the winners to the top of the tournament list. This is done by using a new list named p[ ] as the player list and posting the winners to the front of it. Examine the algorithm presented in figure 4.

function tournament(a, n)
  PRE: 0 < n, n is a power of 2
  POST: return largest of a[1], ..., a[n]
  for i = 1 to n do p[i] = a[i]
  k = n (the match list size)
  while k > 1 do
    {k = k/2
     for i = 1 to k do
       if p[2i] > p[2i-1] then p[i] = p[2i] else p[i] = p[2i-1]}
  return(p[1])

Figure 4 - The Final Tournament Algorithm

Showing that the inner for-loop does indeed perform the playing and posting required for the tournament is left as an exercise. Let us now turn to complexity. In each round, k compares are made. Adding them up produces the sum: n/2 + n/4 + ... + 2 + 1 = n - 1. Thus the complexity is O(n), or linear. This is the best that can be done since every element on the list must be examined.
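The loop of figure 4 translates almost directly into a short executable sketch. The version below is an illustrative 0-indexed Python rendering, not the essay's code, and it assumes (as the essay does) that the number of elements is a power of two.

def tournament(a):
    # run the tournament: each round halves the player list,
    # posting the winner of each match to the front of the list
    p = list(a)                 # the player list
    k = len(p)
    while k > 1:
        k //= 2
        for i in range(k):      # play k matches, keep the winners
            p[i] = p[2*i] if p[2*i] > p[2*i + 1] else p[2*i + 1]
    return p[0]

# tournament([3, 17, 22, 14, 87, 56, 37, 8]) returns 87, as in figure 1.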


Our next question concerns finding both the maximum and minimum. If we note that the first round separated the maximum candidates from the minimum candidates, we see that we need only to: a) finish the maximum tournament, and b) play a minimum tournament between the n/2 first-round losers. This adds up to (n - 1) + (n/2 - 1) = 3n/2 - 2 compares, which, by the way, is optimum.

Finding the maximum and second largest is a little more interesting. At first glance, we might award second place to the loser in the finals. After all, this is done in sports tournaments where the last game is touted as the championship match. One of the first people to protest this practice in writing was the Reverend Dodgson [Do83], an Oxford don better known under his pen name, Lewis Carroll. He correctly pointed out that the wrong contestants were being awarded second prizes in lawn tennis. Darts throwers have long known that the second best thrower might just be eliminated at an early stage in any particular tournament. In an attempt to ensure fairness, they usually play double elimination tournaments. That is, they have two simultaneous tournaments: the original, plus a losers' tournament to determine second best. Doing this requires (n - 1) + ((n - 1) - 1) = 2n - 3 comparisons or games. But is the second tournament necessary?

To investigate this, we first formulate a precise definition of the second largest element on a list.

Definition. The second largest element on a list is the element that is larger than every element except the maximum.

In terms of tournaments this means that the second largest can only be beaten by the maximum. Thus, there must have been a comparison between the largest and second largest elements during the tournament. Recall the tournament in figure 1. The second largest element is 56 and it was indeed knocked out by 87 (the winner). In fact, as we feared, it was eliminated in the very first round!


This means that the second-place tournament can be played among the logn elements beaten by the maximum. Thus the number of compares needed to find the maximum and second largest is n + logn - 2. This also is optimum.

[Do83] Dodgson, Rev. Charles Lutwidge. 'Lawn Tennis Tournaments'. 1883. Reprinted in The Works of Lewis Carroll, ed. Roger Lancelyn Green, Hamlyn Publishing Group Ltd., Middlesex, 1968.
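Both counting results can be checked with small sketches. The two functions below are illustrative Python (the names and the bookkeeping are mine, not the essay's); each assumes the list length is a power of two and at least 2. The first finds the maximum and minimum in 3n/2 - 2 comparisons by splitting the first round into winners and losers; the second finds the maximum and second largest in n + logn - 2 comparisons by remembering whom each survivor has beaten.

def max_and_min(a):
    # first round: n/2 comparisons separate maximum candidates from minimum candidates
    winners, losers = [], []
    for i in range(0, len(a), 2):
        x, y = a[i], a[i + 1]
        if x > y:
            winners.append(x); losers.append(y)
        else:
            winners.append(y); losers.append(x)
    # finish a max-tournament and a min-tournament: n/2 - 1 comparisons each
    return max(winners), min(losers)

def max_and_second(a):
    # a tournament that records, for each survivor, the elements it has beaten
    players = [(x, []) for x in a]
    while len(players) > 1:
        nxt = []
        for i in range(0, len(players), 2):
            (x, bx), (y, by) = players[i], players[i + 1]
            nxt.append((x, bx + [y]) if x > y else (y, by + [x]))
        players = nxt
    champion, beaten = players[0]     # beaten holds the logn elements the champion met
    return champion, max(beaten)      # logn - 1 further comparisons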

2-3 Trees as Search Trees


Balanced search trees come in many flavors and have been the major data structure used for dictionaries, objects in which insertion, deletion, and searching must all take place. The earliest objects used for searching were balanced binary trees. In 1970, John Hopcroft introduced 2-3 search trees as a more civilized way to implement balanced search trees. Later they were generalized to B-trees by Bayer and McCreight, and these became a popular object for performing external searching. B-trees were then simplified by Bayer to form red-black trees.

The rules for 2-3 search trees are really rather elementary.
1. All data appears at the leaves.
2. Data elements are ordered from left (minimum) to right (maximum).
3. Every path through the tree is the same length.
4. Interior nodes have two or three subtrees.

Thus if there are n elements stored in the leaves of a 2-3 search tree, the tree is somewhere between log3n and log2n in height. Interior nodes hold no data elements, but contain some information about the elements stored in their subtrees. Consider figure 1 below.
[Figure 1 - A 2-3 Search Tree Interior Node: an interior node labeled with its mlow and rlow values above its subtrees]


The interior node contains up to two values taken from elements in the tree: mlow, the smallest element in the M (middle) subtree, and rlow, the smallest element in the R (right) subtree. A node with three subtrees holds both keys; a node with two subtrees holds only the one key (mlow). An example of an entire 2-3 search tree is provided as figure 2.
[Figure 2 - A 2-3 Search Tree: the root is labeled 17 65, its children are interior nodes labeled 26 44 and 89, and the leaves from left to right are 17, 26, 44, 65, 89]

The information stored at the interior nodes, together with the rule that the data elements are in order from left to right, provides us with a mechanism for navigating through the search tree. In the example of figure 2, we would look for 47 by first checking the subtree minima stored at the root. This would lead us to the middle subtree, since 47 is between the minimum element in that subtree (17) and the minimum element in the right subtree (65). After checking the information stored at the top of the middle subtree we go to the right, since 44 is smaller than what we are searching for. Since we are now at a leaf, we know that the target is not in the tree. If we were searching for an 89 or a 94, we would traverse the rightmost path in the 2-3 search tree.

Thus searching involves traversing the tree by traveling through the only subtrees that can possibly contain the target element. In other words, we go to the rightmost subtree whose minimum value is no greater than the target. There is an exception: if we are looking for something that is smaller than any element in the tree, we merely travel down the leftmost branches of the tree. In figure 3 we present the specifications for the problem of traversing a 2-3 search tree looking for a target value.

locate(t, x)
  PRE: t is a 2-3 search tree
  POST: return the location of the rightmost element in the tree t that is no greater than x; if x is smaller than anything in the tree, return the leftmost leaf's location

Figure 3 - Traversal Specifications To implement our journey down the path to the leaf most likely to have the same value as our target x we shall use a loop. As mentioned above, we wish to stay in the rightmost subtree that contains elements no greater than the target element x. So, instructions in the loop will move a pointer named p down through the proper subtrees until it reaches a leaf. Consider the loop invariant described in figure 4.

INVARIANT: p points to the root of a subtree whose leaves contain the rightmost element of t that is no greater than x (or the leftmost leaf of t, if x is smaller than everything in the tree)

Figure 4 - Tree Traversal Loop Invariant

locate(t, x)
  p = root(t)
  while p is not a leaf do
    {cases:
       x > rlow(p): p = rightkid(p)
       x > mlow(p): p = middlekid(p)
       otherwise:   p = leftkid(p)}
  locate = p

Figure 5 - Searching Algorithm

Proving correctness involves two things. First, we must be convinced that the algorithm will eventually halt. This is not difficult if we note that at every stage p is set to a node further down the tree, and thus must eventually reach a leaf since the tree is finite. Next, we must show that the loop is correct, that is, that executing it takes us from the precondition to the postcondition. This involves a proof by induction on the number of times the while-loop is executed. Verifying the loop in this algorithm consists of three parts:

Entry: the invariant is true when we enter the loop. This should be obvious when we recall that p points to the root of the entire tree.

Execution: the invariant remains true as we execute the loop. If p is the root of a subtree which obeys the invariant, the case statement assigns p to its rightmost subtree whose minimum value is not greater than x.

Exit: the postcondition is true upon exit from the loop. If the invariant is true and we are at a leaf, then the postcondition is true, since it is the same as the invariant applied to a leaf.

It is easy now to see that if x is in the tree, it is at the leaf pointed to by p. We now swiftly employ this location algorithm to develop the entire search or find algorithm presented below in figure 6.


find(t, x)
  PRE: t is a 2-3 search tree
  POST: if x is in t then return the location of x, otherwise return 0
  p = locate(t, x)
  if value(p) = x then return(p) else return(0)

Figure 6 - Search Algorithm
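Both routines can be rendered as a small executable sketch. The Node class below is a hypothetical representation, not the essay's data structure, and the descent uses a non-strict comparison so that a search for a value equal to a stored minimum enters the subtree containing it.

class Node:
    # a leaf carries a value; an interior node carries mlow, rlow
    # (rlow is None when there are only two children) and its subtrees
    def __init__(self, value=None, left=None, middle=None, right=None,
                 mlow=None, rlow=None):
        self.value, self.mlow, self.rlow = value, mlow, rlow
        self.left, self.middle, self.right = left, middle, right

    def is_leaf(self):
        return self.left is None

def locate(t, x):
    # descend into the rightmost subtree whose minimum is no greater than x
    p = t
    while not p.is_leaf():
        if p.rlow is not None and x >= p.rlow:
            p = p.right
        elif x >= p.mlow:
            p = p.middle
        else:
            p = p.left
    return p

def find(t, x):
    # figure 6: return the leaf holding x, or 0 if x is not in the tree
    p = locate(t, x)
    return p if p.value == x else 0

# example: a two-leaf tree holding 17 and 26
# t = Node(mlow=26, left=Node(value=17), middle=Node(value=26))
# find(t, 26) returns the leaf holding 26; find(t, 30) returns 0.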


Note that if x is in the tree, this algorithm finds it. Since all we do is traverse from the root to a leaf, the complexity of this algorithm is the height of the tree: somewhere between log3n and log2n for a tree with n leaves. This is O(logn). Now let us develop the methods necessary to insert a node in a 2-3 tree. Examine the 2-3 tree below in figure 7 and note where nodes containing the values 5 and 35 should appear in the collection of leaves.
[Figure 7 - A 2-3 Search Tree With Leaves to Insert: the tree of figure 2 with the intended positions of the new leaves 5 and 35 marked among the leaves]


If we search for a 5 in the tree, the locate routine returns a value of p pointing to the leaf containing the 3. Inserting the 5 is rather simple. We take the parent of p (i.e., the interior node holding the <mlow, rlow> pair <8, ->) and, as illustrated below in figure 8, just
a) make the leaf holding 8 the right child,
b) set rlow to 8,
c) attach a new leaf with 5 in it as middle child, and
d) set mlow to 5.

Inserting a 35 is quite another matter. When we search for 35, the locate routine ends up at the leaf containing the 26. This means that 35 must go just to the right of it. But, since the parent of 26 already has three
children, we must take these plus the new leaf and divide them between two nodes. This is done in Figure 8.
[Figure 8 - A 2-3 Search Tree With an Interior Node to Insert: the leaf 35 has been added, the parent of 26 has been split, and a new interior node holding the leaves 35 and 44 must now be attached to the tree]


A new interior node was created and the leaves containing 35 and 44 were connected to it. All interior node values have been set properly. We must now attach the new node. One very important fact should be noted at this point. We need not change any interior node values at nodes other than the parents of the nodes we are attaching. This is because nodes higher in the tree have interior values (mlow and rlow) which come from the least element in the leftmost subtree of the parent interior node. For example, in figure 8 the root holds the values 17 and 65. These do not occur at any other level.

Let us continue with our insertion problem. We must now insert the new interior node in our tree. We know several things about it.
a) It holds the pair <44, ->.
b) It should be inserted to the right of the node containing <26, ->.
c) The smallest leaf in its tree is 35.

Since the parent of <26, -> has three children, we must split it as we did before and then connect the new node (in figure 9 below this holds the pair <65, ->) into the tree. Note that its smallest leaf contains the 35. Another aspect surfaces here. Since the node that was split was the root, we must create a new root and set its mlow value to 35. The final result appears as figure 9.

[Figure 9 - The Final 2-3 Search Tree: a new root holding <35, ->, with interior nodes labeled 17, 65, 26, 44, and 89 below it, and the leaves 17, 26, 35, 44, 65, 89]


Let us recap. If a new node is to be inserted as the child of a node with two children, everything is fairly straightforward. When the proposed parent of the new node already has three children, things are not so simple. A new interior node must be created, and the four nodes in question (the three children and the new node) are divided between the old parent and the new interior node. If we always insert nodes to the right of another node, there are just three cases of this type. These are enumerated in figure 10.

[Figure 10 - Insertion Cases for Nodes with 3 Children: three before-and-after diagrams showing how the parent is split into two interior nodes when the new node's left neighbor is the right, middle, or left child of a parent that already has three children]


This does, however, bring up the special case in which one seeks to insert a leaf that holds a value smaller than anything in the tree. Here we merely swap values with the leftmost leaf and attach as before. Figure 11 illustrates this.
[Figure 11 - Dealing with a Minimum Value: the new minimum swaps values with the leftmost leaf and is then attached to its right as usual]


Now to create the insert routine. We first run the locate routine and find the leaf that is to the immediate left of the value we wish to insert, unless our insertee is smaller than anything in the tree. Then we construct a leaf, place the element to be inserted in it and attach the leaf to the tree. We shall need several new routines for our 2-3 trees. The first is the function createleaf(x) which returns the location of a new leaf node containing x and the second is attach(t, p, q, value(q)) which attaches the leaf located at q to the 2-3 tree t. Examine the insert routine of figure 12.

insert(t, x)
  PRE: t is a 2-3 search tree
  POST: x has been inserted in t
  p = locate(t, x)
  q = createleaf(x)
  if value(p) > x then swap(p, q)
  attach(t, p, q, value(q))

Figure 12 - Insert Algorithm
Note that the insert routine takes care of the case where x is less than anything in the tree (as shown in figure 11). The attach routine, which will be developed below, must be designed to attach subtrees and thus works with both interior and leaf nodes. Figure 13 contains the specifications for the attach routine. Note that the precondition follows from the locate routine's postcondition.


attach(t, p, q, qlow)
  PRE: t is a 2-3 search tree
       p points to a node in t
       q points to the root of a 2-3 search tree or a leaf node
       the subtree at p and the tree at q are the same height
       qlow is the value of the smallest leaf in q's tree
       the leaves in q's tree are larger than those in p's and smaller than those to the right in t
  POST: q has been added to t
        t is still a 2-3 search tree

Figure 13 - Attach Specifications


Of special interest is the fact that the subtree rooted at q should go just to the right of the one rooted at p in order to add it to t and maintain t as a 2-3 search tree. Thus we wish to connect q to p's parent. First, though, we must ask if p points to the root of t. If so, then we are forced to construct a new root and attach both p and q to it. We also need a new method at this point to do the work. In the algorithm of figure 14, the routine connect(p, r, position) does the actual connecting of p's tree to r at the left, middle, or right. The diagram at the right of figure 14 shows the result of attaching everything to the new root.
if p = root(t) then
  {r = createnode
   connect(p, r, left)
   connect(q, r, middle)
   r(mlow) = qlow}

Figure 14 - Inserting at the Root


If p has a parent, then this parent has either two or three subtrees under the rules for 2-3 search trees. Figure 15 presents the algorithm for attaching a new node to one that has two subtrees. At the right are the initial configurations for the two cases that occur when this happens.

p has one sibling:
  s = parent(p)
  if p = middle child of s then
    {connect(q, s, right)
     s(rlow) = qlow}
  if p = left child of s then
    {u = middle child of s
     connect(u, s, right)
     s(rlow) = s(mlow)
     s(mlow) = qlow
     connect(q, s, middle)}

Figure 15 - Inserting With Two Subtrees


The configurations which occur when the parent of p has three children are enumerated in Figure 10. The parent must be split and the portion on the right is attached to the tree. First we set things up by creating a new interior node to which we shall attach two of the subtrees of the parent of p.

r = createnode

Figure 16 - Creating an Interior Node


Then we take the cases from right to left. First is the situation where the subtree rooted at q must be inserted to the right of the subtrees of s, the parent of the subtree rooted at p. Note that we remember the smallest value in the new tree rooted at r.

if p is the right child of s then
  {connect(q, r, middle)
   r(mlow) = qlow
   connect(p, r, left)
   newlow = s(rlow)}

Figure 17 - New Node on Right


In the remaining two cases, q's tree will be inserted to the left of the right child of s, the parent of p. So, we can immediately move this right subtree over to the new node (r) and proceed.

if p is not the right child of s then
  {v = right child of s
   connect(v, r, middle)
   r(mlow) = s(rlow)}

Figure 18 - Attaching to New Node
At this stage, note that the subtree rooted at v has been attached to the new node as shown in figure 19. Also, the rlow value of s is now the mlow value for r. Now we consider the case where p is the center child of s, so that q falls between the largest and second largest of the children of s.

if p is the middle child of s then
  {connect(q, r, left)
   newlow = qlow}

Figure 19 - New Node Second from Right


And finally we come to the last case, where p's tree is the leftmost subtree of s.

if p is the left child of s then
  {u = middle child of s
   connect(u, r, left)
   newlow = s(mlow)
   connect(q, s, middle)
   s(mlow) = qlow}

Figure 20 - New Node Second from Left


After all of the subtrees have been connected to r, we must remove any evidence of s's old right subtree and attach r to the parent of s. This is done in figure 21.


remove old links to s's right subtree
s(rlow) = null
attach(t, s, r, newlow)

Figure 21 - Finishing the Connection
Showing correctness involves tracing through the algorithm and verifying that the new leaf has been added to the tree and that the result is still a 2-3 search tree. Since the attach routine is recursive, this involves verifying that its precondition holds when, at the end, it is called to attach the new interior node to the tree.

The complexity of insertion is rather straightforward. It is the sum of that of the locate and attach routines. The locate routine requires O(logn) steps, and the attach routine attaches a subtree one level taller each time it is called. (By the way, this fact also ensures that the process halts.) This means that its complexity is at worst the height of the tree. Thus inserting an element is of complexity O(logn).

Tables and Hashing


Tables are objects that contain data items primarily accessed by a lookup operation or method. These objects appear in quite a few computational applications since it is often convenient to file elements away in tables and look them up if they are needed. Here are specifications for the methods needed in table lookup.

locate(t, x, M)
  pre: t is a table of size M
  post: return the address in t where x should be

lookup(t, x, M)
  pre: t is a table of size M
  post: if x is not in t then insert it in t; return the address of x

Figure 1 - Table Lookup Specifications


Lookup is very much a combination of searching and insertion. We have seen many ways to do lookup that use comparisons as a basic step. Among them are sequences or arrays, linked lists, and search trees. Since the methods for these use comparisons we have found that O(logn) steps are always necessary for searching and insertion.


It would be nice if we could do better than O(logn). This might be possible if we did not compare things to find the data we desire. Suppose we knew where the target item was and could go straight for it. For example, let t be a sequence or array and define the methods of figure 1 so that locate(t, x, M) returns x, and lookup(t, x, M) is just the instruction pair {t[x] = x; return(x)}. This does do the job and besides, it is very fast. In fact, the complexity is constant time. There is one drawback though. We shall need a table that is the size of the data range. This could be extremely large. But, if the range of the data is small, then this is a good implementation. The sophisticated name for this technique is key-indexed search. But, could we use something similar to this technique if our data universe was quite large? This would make table lookup very attractive. A technique called hashing is a relative of key-indexed search and fits our requirements. It is used in many applications (one familiar to every computer scientist or engineer is compiler symbol tables) and has been studied in great detail by researchers since the early days of computing. Here is how hashing works. We have a hash function named h(x) that takes any member of the data universe and computes the address where this element either is located or should be located in the table. This is the locate function of figure 1. Another notational convention we shall impose is that the table size is M. This is illustrated in figure 2.

[Figure 2 - Hashing: the hash function h(x) maps a key x from the data universe to an address in a table of size M]


This scheme makes the complexity of locate(t, x, M) the same as computing the hashing function h(x). So, if h(x) can be computed quickly, then lookup should also be very fast. If all of this is possible, then our problems are solved. Let us examine hashing in detail.

Hash Functions. A hash function maps data into a table that is of size M. Thus it should change keys into integers in the range from 1 to M. A good hash function should possess the following properties.
  simple to implement
  fast to compute
  maps to table addresses in a random manner

Here are three traditional methods for the computation of hash functions. Note that the first two techniques both spread the data universe throughout the table, but with different strategies.

a) Multiplicative: if 0 <= x <= k then h(x) = 1 + (x/k)M
b) Modular: h(x) = 1 + x mod M
c) Hybrid: h(x) = 1 + floor(M((ax) mod 1)), for a fixed constant a between 0 and 1

Function (a) has been used in situations where the key size is well known, and it is quite fast if the keys are a known number of bits: at the machine-language level one merely uses shifts and a multiplication. Modular hash functions are even easier to implement, and the operand x can be the machine representation of a key.

Several observations from the literature (or the wealth of hashing experience) are in order. First, hashing works better when the table size M is a prime number. A useful sequence of primes for this purpose is: 251, 509, 1021, 2039, 4093, 8191, 16381, 32749, 65521, where each is approximately twice the size of the preceding one. (This sequence came from Robert Sedgewick's text Algorithms in C++.) The second observation is that the golden ratio, 0.618033..., is a favorite value for the constant in the hybrid hash function. Special techniques are required when the keys are very long numbers. Strings are a good example of this, since the ASCII encoding of a string is a sequence of base-128 numbers. Examine figure 3; it should look familiar.


[worked example: the long division of 53782 by 7, giving quotient 7683 and remainder 1]

remainder(x, y)
  pre: x and y are integers
  post: return the remainder of x/y
  z = 0
  for each digit of x do
    {z = z * 10
     z = z + next digit of x
     z = remainder of z/y}
  return(z)

Figure 3 - Long Division


On the left is an elementary school division exercise and on the right is the algorithm used to do the division. Well, not quite. On the right is an algorithm for finding the remainder after the division is performed. (By the way, mathematicians call this technique Horner's Rule when they use it to do polynomial evaluation.) Note that the dividend need not be saved if all that is desired is the remainder. So, if the key is a rather long number, this is the way to quickly do modular arithmetic. To do this with strings, just use base 128 arithmetic and set the digits or characters of x to the ASCII codes and use 128 rather than 10 as the base. A final observation is to use 127 rather than 128 as the base since a prime base has been found to provide a nicer distribution of hashed addresses.
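Applied to strings, the scheme above amounts to a few lines of code. The sketch below is illustrative Python, not the essay's code; it scans the characters, treats them as digits of a base-127 number, and keeps only the remainder modulo the table size.

def string_hash(s, M):
    # Horner-style evaluation: only the remainder mod M is ever stored
    z = 0
    for ch in s:
        z = (z * 127 + ord(ch)) % M
    return 1 + z          # the essay's tables use addresses 1..M

# string_hash("averybigone", 1021) gives an address between 1 and 1021.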

Collision Resolution. So far, so good. But, one problem arises as soon as we begin to hash lots of numbers into our table. Every now and then, two keys will be hashed into the same table address. This is called a collision and something must be done to fix this.
A simple and popular answer to this problem is called separate chaining. Whenever a key is hashed into an address that is occupied, we place it on a list attached, or chained, to that address. In figure 4, individual letters are hashed into a table that has only seven addresses with the modular hash function shown on the left. Since there are twenty-six letters, collisions are bound to occur. At each address, we form a chain of the letters that have been hashed there.


x:    a  v  e  r  y  b  i  g  o  n  e
h(x): 1  1  5  4  4  2  2  7  1  7  5

[chains formed in the seven-address table: for example, a, v, and o all hash to address 1 and are chained together]

Figure 4 - Separate Chaining
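The chained table of figure 4 can be sketched in a few lines of Python. The helper names and the 0-indexed addresses are mine, not the essay's; any hash function mapping keys to 0..M-1 can be supplied.

def make_table(M):
    # one (initially empty) chain per table address
    return [[] for _ in range(M)]

def chained_lookup(table, x, h):
    # hash x, add it to its chain if it is new, and return the address
    k = h(x)
    if x not in table[k]:
        table[k].append(x)
    return k

# example: the letters of figure 4 with h(ch) = (alphabet position - 1) mod 7
table = make_table(7)
for ch in "averybigone":
    chained_lookup(table, ch, lambda c: (ord(c) - ord('a')) % 7)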


In order to implement this strategy, the lookup method must be modified so that it first hashes the key to an address and then follows the chain of keys until it finds the target or the end of the chain. If the hash function is fairly random and the table is large enough, we should not have too many collisions and lookup will remain efficient. Another popular method of collision resolution is called linear probing. It is really very simple. If the address a key is hashed into is occupied, then we proceed down the table until the target or an empty slot is found. If the target is not found, it is placed in the empty address. The lookup routine is now like that of figure 5.

lookup(t, x, M)
  pre: t is a table of size M
  post: if x is not in t then insert it in t; return the address of x
  k = locate(t, x, M)
  while t[k] ≠ x and t[k] ≠ empty do {k = k + 1}
  insert x at t[k]
  return(k)

Figure 5 - Linear Probing
It is called linear probing because each examination of an address is called a probe, and here the probes simply march linearly down the table.
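Figure 5 can be turned into a tiny executable sketch. The version below is illustrative Python (my names, 0-indexed addresses); it adds a wrap-around step, and it assumes the table never completely fills.

def lookup_linear(t, x, M, h):
    # t is a list of length M with None marking empty slots;
    # h maps keys to starting addresses 0..M-1
    k = h(x)
    while t[k] is not None and t[k] != x:
        k = (k + 1) % M            # probe the next address, wrapping around
    if t[k] is None:
        t[k] = x                   # insert on an unsuccessful search, as in the essay
    return k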


One problem with these two strategies for collision resolution is that even though the hash function is fairly random, lots of keys that occur in standard data sets are hashed into the same address. This phenomenon is named clustering since the keys that collide appear throughout the table in groups or clusters. When separate chaining is implemented this means that long chains must be searched. With linear probing, local areas of the table fill and again, long searches must be made to find the target. This is a very powerful example of time-space tradeoff. If the table is small, collisions are inevitable. But if the table is large, there is less likelihood of collisions. So, it would appear that large tables lead to faster lookup.

Many people resort to a technique called double hashing as a better way to prevent clustering. Instead of looking for the closest available address, we hash the number again and look that much further down the table. A favorite second hashing function for this is x mod 97. A double hashing algorithm is presented as Figure 6.

lookup(t, x, M)
  pre: t is a table of size M
  post: if x is not in t then insert it in t; return the address of x
  k = locate(t, x, M)
  while t[k] ≠ x and t[k] ≠ empty do {k = (k + (x mod 97)) mod M}
  insert x at t[k]
  return(k)

Figure 6 - Double Hashing

Dynamic Methods. Sometimes tables are just too small because we did not have a good idea of how much data was coming in. In cases like this, we must make the table larger. Doubling the table is the most used strategy. This means rehashing the data, but it must be done if we are to store everything in the table. Also, if the table is made larger before it fills, lookups in the future will be more efficient.
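A sketch of the doubling step follows. It is illustrative Python with hypothetical names: the hash function takes the table size as a second argument, addresses are 0-indexed, and reinsertion uses linear probing. Ideally the new size would be the next prime in the sequence quoted above rather than simply 2*len(t) + 1.

def grow(t, h):
    # allocate a table about twice as large and rehash every stored key into it
    M = 2 * len(t) + 1
    new = [None] * M
    for x in t:
        if x is not None:
            k = h(x, M)
            while new[k] is not None:      # reinsert with linear probing
                k = (k + 1) % M
            new[k] = x
    return new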

Sorting Networks
When sorting lists of some specific size, sorting networks are often employed. These are very simple devices built solely from modules that only do compare-exchange operations. A compare-exchange operation on the pair <x, y> is merely the following code.

if x > y then {t = x; x = y; y = t}
Consider the network in figure 1.

[Figure 1 - A Four-Element Sorting Network: the inputs 3, 8, 6, 2 enter on the left and emerge on the right as 2, 3, 6, 8]


From the lists of numbers shown at the input and output ends, it seems to perform a sort. Here is how it works. The pairs of blue dots are modules that perform compare-exchange operations. Thus, in the first column, the pairs of numbers <3, 8> and <6, 2> enter the modules where the 6 and 2 are exchanged while the 3 and 8 remain in their original positions. Tracing the numbers as they go through the network provides the following table.

3 8 6 2  ->  3 8 2 6  ->  2 8 3 6  ->  2 6 3 8  ->  2 3 6 8

Note that the first column of modules exchanged the <6, 2> pair. Next, the second column exchanged the <3, 2> pair. Then the pair <8, 6> was exchanged, and finally the pair <6, 3> was exchanged.


It is not difficult to extract a straight-line sorting procedure for four elements from the network. This procedure is presented in figure 2.

if x1 > x2 then {t = x1; x1 = x2; x2 = t}
if x3 > x4 then {t = x3; x3 = x4; x4 = t}
if x1 > x3 then {t = x1; x1 = x3; x3 = t}
if x2 > x4 then {t = x2; x2 = x4; x4 = t}
if x2 > x3 then {t = x2; x2 = x3; x3 = t}

Figure 2 - Sorting Four Elements


Now that we know how it works, we need to explore why it works and to do this we shall examine the strategy that motivated the design of the network. Understanding why generalizations of the above sorting network are correct is nontrivial. Networks similar to that of figure 1 are derived from a strange mergesort algorithm developed by K. E. Batcher in the 1960's. Thus it is known as Batcher's Odd-Even Mergesort algorithm. The main sorting algorithm is exactly the same as the standard mergesort presented in figure 3.

sort(a, n)
  pre: 0 < n
  post: a[1] ≤ ... ≤ a[n]
  if n > 1 then
    {m = (n + 1)/2
     b = a[1], ..., a[m]
     c = a[m+1], ..., a[n]
     sort(b, m)
     sort(c, n-m)
     merge(b, c, a, n)}

Figure 3 - Mergesort
The only difference is in the merge. In our first presentation of the merge, we shall assume that the two lists to be merged are both of the same size and that this size is a power of two. We shall need two auxiliary methods or algorithms in our new merge. One is a shuffle operation on lists exactly like that used with playing cards and the other is the compare exchange operation. The specifications for these appear as figure 4 along with the specifications for merging.


shuffle(a, b, c, n)
  pre: 0 < n
  post: c = a[1], b[1], a[2], b[2], ..., a[n], b[n]

compex(a, i, k)
  pre: 0 < i < k
  post: a[i] ≤ a[k], and they are the original items

merge(a, b, c, n)
  pre: 0 < n; a[1] ≤ ... ≤ a[n], b[1] ≤ ... ≤ b[n]
  post: c contains the elements of a and b; c[1] ≤ ... ≤ c[2n]

Figure 4 - Specifications
The strategy behind Batcher's odd-even merge is to merge the elements in the odd-numbered positions of lists a and b. That is, merge:

a[1], a[3], ... , a[n-1] with b[1], b[3], ... , b[n-1].


Then merge the elements of a and b that occupy the even numbered positions. After this, the two recently merged lists are shuffled and adjacent items are checked for proper order. Here is an example:

a = 5, 9, 14, 25
b = 3, 12, 19, 32

merge the odd elements of a with those of b:
  ao = 5, 14    bo = 3, 19    co = 3, 5, 14, 19
merge the even elements of a with those of b:
  ae = 9, 25    be = 12, 32   ce = 9, 12, 25, 32
shuffle co with ce:
  c = 3, 9, 5, 12, 14, 25, 19, 32
Note that at this point exchanging adjacent elements of c provides us with a perfectly sorted list. The recursive algorithm appears in figure 5.


merge(a, b, c, n)
  if n = 1 then
    {c[1] = a[1]; c[2] = b[1]
     compex(c, 1, 2)}
  otherwise
    {ao = a[1], a[3], ..., a[n-1];  ae = a[2], a[4], ..., a[n]
     bo = b[1], b[3], ..., b[n-1];  be = b[2], b[4], ..., b[n]
     merge(ao, bo, co, n/2)
     merge(ae, be, ce, n/2)
     shuffle(co, ce, c, n)
     for i = 2 to 2n-2 in steps of 2 do compex(c, i, i+1)}

Figure 5 - Batcher's Odd-Even Merge
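The merge of figure 5 can be checked with a short recursive sketch. The version below is illustrative 0-indexed Python, not the essay's notation; it assumes both input lists are sorted and of equal length, a power of two.

def odd_even_merge(a, b):
    n = len(a)
    if n == 1:
        x, y = a[0], b[0]
        return [x, y] if x <= y else [y, x]
    co = odd_even_merge(a[0::2], b[0::2])     # odd-position elements (1-indexed)
    ce = odd_even_merge(a[1::2], b[1::2])     # even-position elements
    c = [None] * (2 * n)
    c[0::2], c[1::2] = co, ce                 # the shuffle
    for i in range(1, 2 * n - 2, 2):          # compare-exchange adjacent pairs
        if c[i] > c[i + 1]:
            c[i], c[i + 1] = c[i + 1], c[i]
    return c

# odd_even_merge([5, 9, 14, 25], [3, 12, 19, 32]) returns [3, 5, 9, 12, 14, 19, 25, 32],
# reproducing the worked example above.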


The proof of correctness for this strange merge involves showing that each of the elements in co and ce is larger than a certain number of elements of c. Since both a[1] and b[1] are in co, we know that the first element of co is the smallest on the list. For co[i] and ce[i-1], an intricate counting argument shows that each is larger than 2i - 3 elements and thus should be in either position 2i - 2 or 2i - 1. Deriving the complexity of this merge is not so difficult. We note that the shuffle and the compare-exchange operations take O(n) operations and solve this recurrence relation for merging two lists of size n.

Merge(n, n) = 2Merge(n/2, n/2) + n


Since n is a power of two, we may rewrite and solve the equation as shown below.

Merge(2^k, 2^k) = 2 Merge(2^(k-1), 2^(k-1)) + 2^k
                = 2(2 Merge(2^(k-2), 2^(k-2)) + 2^(k-1)) + 2^k
                = 2^2 Merge(2^(k-2), 2^(k-2)) + 2^k + 2^k
                = 2^k + ... + 2^k    (k terms)
                = k 2^k = nlogn
This results in a complexity of O(n(logn)^2) for the mergesort algorithm using the odd-even merge. This is still much better than O(n^2). Two important observations lead from the odd-even mergesort to sorting networks.


All of the rearranging of the list to be sorted is done by the compare-exchange operations.
By keeping track of where the a, b, and c list elements are on the original list, the sort can be done in place.

So, if we note the order of the compare-exchange operations during a sort, we can easily construct a sorting network and even build a sorting circuit. The compare-exchange operations for a list of size four are:

sort the first half of the list
  merge the first two elements: compex(a, 1, 2)
sort the last half of the list
  merge the last two elements: compex(a, 3, 4)
sort the odd and even elements on the list
  merge the odd elements: compex(a, 1, 3)
  merge the even elements: compex(a, 2, 4)
shuffle and correct: compex(a, 2, 3)

This produces the sorting network of figure 1 and the straight-line code of figure 2. Omitting the operations that involve element 4 gives us:

sort the first half of the list
  merge the first two elements: compex(a, 1, 2)
sort the odd and even elements on the list
  merge the odd elements: compex(a, 1, 3)
shuffle and correct: compex(a, 2, 3)

This provides the sorting network for lists of size three shown in figure 6. It is easy to verify that this is correct.

[Figure 6 - A Three-Element Sorting Network: the inputs 3, 8, 6 emerge as 3, 6, 8]


If we wish to perform the odd-even mergesort on lists which are not powers of two in length, we do exactly the same thing that we would do
in order to mergesort such a list. Consider the operations that we would go through to sort a list of size five.

divide the list into (a[1],a[2],a[3]) and (a[4],a[5])
divide the first list into (a[1],a[2]) and (a[3])
merge a[1] and a[2]: compex(a, 1, 2)
now (a[1], a[2]) is sorted
form odd lists (a[1]) and (a[3])
merge a[1] and a[3]: compex(a, 1, 3)
shuffle (a[1],a[3]) and (a[2]) to get (a[1],a[2],a[3])
correct (a[1],a[2],a[3]): compex(a, 2, 3)
now (a[1], a[2], a[3]) is sorted
merge a[4] and a[5]: compex(a, 4, 5)
now (a[1], a[2], a[3]) and (a[4], a[5]) are sorted
form odd lists (a[1],a[3]) and (a[4])
form odd lists (a[1]) and (a[4])
merge a[1] and a[4]: compex(a, 1, 4)
shuffle (a[1],a[4]) and (a[3]) to get (a[1], a[3], a[4])
correct (a[1],a[3],a[4]): compex(a, 3, 4)
now (a[1], a[3], a[4]) is sorted
form even list (a[2], a[5])
merge a[2] and a[5]: compex(a, 2, 5)
now (a[1], a[3], a[4]) and (a[2], a[5]) are sorted
shuffle (a[1],a[3],a[4]) and (a[2],a[5]) to get (a[1],a[2],a[3],a[5],a[4])
correct (a[1],a[2],a[3],a[4],a[5]): compex(a, 2, 3); compex(a, 4, 5)
Extracting the compare-exchange operations provides both a straight-line algorithm and a sorting network for lists of five elements. The network is shown in figure 7.

[Figure 7 - A Five-Element Sorting Network: the inputs 3, 8, 6, 2, 5 emerge as 2, 3, 5, 6, 8]

Fast Matrix Multiplication


Consider the following two-by-two matrix multiplication:

  [ a  b ] [ e  f ]     [ ae + bg   af + bh ]
  [ c  d ] [ g  h ]  =  [ ce + dg   cf + dh ]
Note that we need eight multiplications and four additions in order to perform the multiplication. This means that in order to multiply two n-by-n matrices, it is necessary to perform n^3 multiplications and n^2(n-1) additions. In 1969, Strassen [Str69] surprised everyone by discovering a way to multiply the above matrices using only seven multiplications. He first computed:

x1 = (a + d)(e + h)
x2 = (b - d)(g + h)
x3 = (a - c)(e + f)
x4 = (a + b)h
x5 = (c + d)e
x6 = a(f - h)
x7 = d(-e + g)

Then he noted that:

ae + bg = x1 + x2 - x4 + x7
af + bh = x4 + x6
ce + dg = x5 + x7
cf + dh = x1 - x3 - x5 + x6
Of course, the number of additions and subtractions went up to eighteen, but this is fine since multiplications take longer to perform. In order to extend this from two-by-two to n-by-n, we merely divide the matrices into four equal quarters and multiply the parts as above in a recursive manner. This of course requires n to be a power of two, but that is fine. Then, if you solve the recurrence relation that results (setting it up is a nice exercise), the complexity comes out to be O(n^(log 7)), roughly O(n^2.81). For the curious, Pan [Pan80] discovered an even faster way to do this.

[Pan80] Pan, V. Y. New fast algorithms for matrix operations. SIAM Journal on Computing 9, 1980, pp. 321-342.
[Str69] Strassen, V. Gaussian elimination is not optimal. Numerische Mathematik 13, 1969, pp. 354-356.
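To close the section, here is an illustrative recursive sketch of the scheme in Python, following the seven products x1..x7 given above. It works on square matrices (lists of lists) whose dimension is a power of two; the helper names are mine.

def strassen(A, B):
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    half = n // 2
    def quad(M):   # split M into four half-by-half blocks
        return ([row[:half] for row in M[:half]], [row[half:] for row in M[:half]],
                [row[:half] for row in M[half:]], [row[half:] for row in M[half:]])
    def add(X, Y): return [[x + y for x, y in zip(r, s)] for r, s in zip(X, Y)]
    def sub(X, Y): return [[x - y for x, y in zip(r, s)] for r, s in zip(X, Y)]
    a, b, c, d = quad(A)
    e, f, g, h = quad(B)
    x1 = strassen(add(a, d), add(e, h))
    x2 = strassen(sub(b, d), add(g, h))
    x3 = strassen(sub(a, c), add(e, f))
    x4 = strassen(add(a, b), h)
    x5 = strassen(add(c, d), e)
    x6 = strassen(a, sub(f, h))
    x7 = strassen(d, sub(g, e))
    # assemble the quarters of the product using the four combinations above
    top = [r1 + r2 for r1, r2 in zip(add(sub(add(x1, x2), x4), x7), add(x4, x6))]
    bot = [r1 + r2 for r1, r2 in zip(add(x5, x7), add(sub(sub(x1, x3), x5), x6))]
    return top + bot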

Minimum Spanning Trees


Preconditions and Postconditions
pre: G = (V, E) is a connected, weighted graph; V = {v1, ..., vn}, E = {e1, ..., ek}
post: T is a minimum-weight tree spanning V

Prim's Method

T := {}; U := {v1}
while U ≠ V do
  select the smallest edge <u, w> from U to V - U
  add w to U
  add <u, w> to T

Invariant: T is the mst for U
Complexity: O(klogk) or O(n^2)

Kruskal's Method

T := {}
place the edges on a priority queue Q
build the singleton sets Si = {vi}
while T does not span V do
  e := deletemin(Q)
  if e goes from one Si to a different Sm then
    merge Si and Sm into Si; delete Sm
    add e to T

Invariant: T is a minimum spanning forest over the Si sets remaining
Complexity: O(klogk)
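A compact way to realize Kruskal's method is to replace the explicit Si sets with a union-find structure. The sketch below is illustrative Python, with the priority queue replaced by an ordinary sort; vertices are numbered 0..n-1 and edges given as (weight, u, v) triples.

def kruskal(n, edges):
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]     # path halving keeps the trees shallow
            x = parent[x]
        return x
    tree = []
    for w, u, v in sorted(edges):             # smallest edge first
        ru, rv = find(u), find(v)
        if ru != rv:                          # the edge joins two different sets
            parent[ru] = rv
            tree.append((u, v, w))
    return tree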


Round Robin Method


T := {}
build the singleton sets Si = {vi}
while T does not span V do
  for all Si remaining do
    e := smallest edge leading out of Si
    merge Si and the Sm that e goes to into Si; delete Sm
    add e to T

Invariant: T is a minimum spanning forest over the Si sets remaining
Complexity: O(klogn)
Question: why are cycles not formed in the above construction?

External Sorting
Problems when the records cannot be stored in main memory:

Cost of access is high: 10-12 msec for hard drives, as opposed to 100-300 nsec for primary storage.
Sometimes there are restrictions on access depending upon the medium used, for example tapes.

Many times external sorts depend upon the technology used for storage. Complexity steps are record moves.

Balanced k-way merge

Main algorithm:
  setup(F, I1, I2)
  while runs remain
    merge(I1, I2, O1, O2)

merge(I1, I2, O1, O2)
  while runs remain
    for k := 1 to 2
      take runs from I1 and I2
      merge them on to Ok
  exchange the Is and Os

setup(F, I1, I2)
  while records remain on F
    for k := 1 to 2
      read b records into primary storage
      sort them
      output them to Ik

Improve it by:
  using heapsort to make the initial runs longer
  keeping two heaps, one for sorting and one for the next run
Complexity for 2 input and 2 output files: nlog(n/b) record moves.
But how many files do we wish to have? Select k such that logk(n/b) = 2.
Polyphase sorting (Fibonacci sorting)
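The balanced merge can be simulated in memory to see the run structure. The sketch below is illustrative Python (names and structure are mine): it builds initial sorted runs of b records and then merges runs pairwise until one remains.

import heapq

def external_sort(records, b):
    # initial runs: read b records at a time and sort them
    runs = [sorted(records[i:i + b]) for i in range(0, len(records), b)]
    while len(runs) > 1:
        merged = []
        for i in range(0, len(runs), 2):
            pair = runs[i:i + 2]
            merged.append(list(heapq.merge(*pair)))   # merge one or two runs
        runs = merged
    return runs[0] if runs else []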


String Searching and Pattern Matching


Specifications

search(s, p)
  pre: s and p are strings over some alphabet, with m = |p| <= |s| = n
  post: return the starting position of p in s, or 0 if p does not occur in s
Theorem (Cook). Pattern matching can be done in O(m + n).

Brute-Force Algorithm
i := 1; j := 1
repeat
  if s[i] = p[j] then {i++; j++}
  otherwise {i := i - j + 2; j := 1}
until j > m or i > n
if j > m then return(i - m) else return(0)
Invariant: p[1]...p[j-1] matches s[i-j+1]...s[i-1] (also: p is not found starting before s[i-j+1])
Complexity = O(mn), but in real cases we do not search from each of the characters in s, only those that could begin p, so it is O(km + n).

Knuth-Morris-Pratt Algorithm
strategy: use the unsuccessful pattern checking to help calculate the position at which to begin the next comparison.
jump[i] = the maximum k such that p[1]...p[k-1] matches p[i-k+1]...p[i-1] (note that jump[1] = 0)

build the jump[] table
i := 1; j := 1
repeat
  if (s[i] = p[j] or j = 0) then {i++; j++}
  otherwise j := jump[j]
until j > m or i > n
if j > m then return(i - m) else return(0)
Same invariant. Complexity = O(n) + building jump table


Jump Table Building


i := 1; j := 0; jump[1] := 0
repeat
  if (p[i] = p[j] or j = 0) then {i++; j++; jump[i] := j}
  otherwise j := jump[j]
until i > m
Invariant: p[1]...p[j-1] matches p[i-j+1]...p[i-1]
Complexity = O(m)
Improve it by looking at the mismatched character. Replace jump[i] := j by:
  if p[i] ≠ p[j] then jump[i] := j otherwise jump[i] := jump[j]
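The two routines combine into a short 0-indexed sketch. The Python below is illustrative (my names; -1 plays the role of the essay's jump value 0, and positions are returned 0-indexed with -1 for failure).

def build_jump(p):
    # jump[j] is where to resume the pattern after a mismatch at position j
    jump = [-1] * (len(p) + 1)
    i, j = 0, -1
    while i < len(p):
        while j >= 0 and p[i] != p[j]:
            j = jump[j]
        i += 1; j += 1
        jump[i] = j
    return jump

def kmp_search(s, p):
    jump = build_jump(p)
    j = 0
    for i, ch in enumerate(s):
        while j >= 0 and ch != p[j]:
            j = jump[j]          # slide the pattern using the precomputed table
        j += 1
        if j == len(p):
            return i - len(p) + 1
    return -1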

Network Flow
Defn. A network is a directed, weighted graph with one source (node with indegree = 0) and one sink (node with outdegree = 0). Flow conservation rule: the flow into a vertex must equal the flow out.

Maximum-flow Problem. What is the greatest possible flow from the source to the sink in a network?
Defn. Residual capacity is capacity - flow. (Note that if there is a flow of k and a capacity of k+m from u to v, and zero capacity from v to u, then the residual capacity from v to u is k and from u to v is m.) This is how much more can be added between u and v (or v and u).
Defn. An augmenting path is a path in the residual network.
Defn. Cuts, and the flow and capacity across cuts.

Theorem. If every path from the source to the sink has a full forward edge or empty backward edge, then the flow is maximal. (i.e., there is no path in the residual network.)
maxflow(G, s, t)
  pre: G = (V, C), s = source, t = sink
  post: f is a maximum flow
  f(u, v) := 0
  while there is a path from s to t in the residual network do
    increase the flow along the path by the maximum amount (its smallest residual capacity)
Do breadth-first search to find the best (shortest) augmenting path; the complexity is then O(VE^2).
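The augmenting-path scheme with breadth-first search (often called Edmonds-Karp) can be sketched as below. The dictionary-of-dictionaries representation and the names are mine, not the essay's; cap[u][v] is the capacity of edge u -> v.

from collections import deque

def max_flow(cap, s, t):
    nodes = set(cap) | {v for u in cap for v in cap[u]}
    residual = {u: dict(cap.get(u, {})) for u in nodes}
    for u in nodes:
        for v in cap.get(u, {}):
            residual[v].setdefault(u, 0)     # room for flow to be pushed back
    total = 0
    while True:
        prev = {s: None}
        queue = deque([s])
        while queue and t not in prev:       # breadth-first search for a path
            u = queue.popleft()
            for v, r in residual[u].items():
                if r > 0 and v not in prev:
                    prev[v] = u
                    queue.append(v)
        if t not in prev:
            return total                     # no augmenting path: the flow is maximal
        path, v = [], t
        while prev[v] is not None:
            path.append((prev[v], v)); v = prev[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:                    # push the bottleneck along the path
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        total += bottleneck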

Faster Arithmetic
First, do adding and multiplication normally. Complexity of multiplication = O(n^2). Is this the best way to multiply? Why do we do it that way?

Now try divide and conquer (it works sometimes!). Split each n-bit number into halves, x = x1x2 and y = y1y2, so that

  xy = x1y1 2^n + (x1y2 + x2y1) 2^(n/2) + x2y2
  T(n) = cn + 4T(n/2) = O(n^2)

Consider this new factorization:

  xy = x1y1 2^n + [(x1 - x2)(y2 - y1) + x1y1 + x2y2] 2^(n/2) + x2y2
  T(n) = cn + 3T(n/2)

Solving the recurrence with n = 2^k:

  T(2^k) = c 2^k + 3T(2^(k-1)) = c(2^k + 3*2^(k-1) + 3^2*2^(k-2) + ...) = c 2^k ((3/2)^(k+1) - 1)/((3/2) - 1) = O(3^k) = O(n^(log 3)) = O(n^1.59)

Cook-Toom: O(n^(1+e)) for any e > 0
Schönhage-Strassen: O(n logn loglogn)
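The three-multiplication factorization is usually credited to Karatsuba. Here is an illustrative recursive sketch on Python integers (my names and base case), splitting each operand at half of its binary length.

def karatsuba(x, y):
    if x < 0 or y < 0:
        # handle signs separately so the recursion only splits nonnegative values
        sign = -1 if (x < 0) != (y < 0) else 1
        return sign * karatsuba(abs(x), abs(y))
    if x < 16 or y < 16:
        return x * y
    half = max(x.bit_length(), y.bit_length()) // 2
    x1, x2 = x >> half, x & ((1 << half) - 1)      # x = x1 * 2^half + x2
    y1, y2 = y >> half, y & ((1 << half) - 1)
    high = karatsuba(x1, y1)
    low = karatsuba(x2, y2)
    middle = karatsuba(x1 - x2, y2 - y1) + high + low   # equals x1*y2 + x2*y1
    return (high << (2 * half)) + (middle << half) + low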
