Computing
© by Springer-Verlag 1986
A new algorithm for computing the product of two arbitrary $N \times N$ Boolean matrices is presented. The algorithm requires $O(N^3/\log N)$ bit operations and only $O(N \log N)$ bits of additional storage. This represents an improvement on the Four Russians' method, which requires the same number of operations but uses $O(N^3/\log N)$ bits of additional storage.
1. Introduction
In parallel with the analysis of asymptotically fast methods, the research on Boolean matrix multiplication has also focused on the determination of "efficient" bounds; that is, finding non-optimal but practical algorithms that outperform the asymptotic algorithms for a bounded matrix size or for special classes of matrices. Examples of these results are the $O(N^2)$ algorithms for multiplying $N \times N$ sparse or dense Boolean matrices [2, 3], and the $O(N^3/\log N)$ algorithm for multiplying arbitrary $N \times N$ Boolean matrices [1]. The latter algorithm, known as the "Four Russians' Method", unfortunately requires $O(N^3/\log N)$ additional bits to store the $O(N/\log N)$ sets, each containing $N$ rows of $N$ bits each. In this paper, a new algorithm for Boolean matrix multiplication is presented; this algorithm is based on some properties of the product matrix, and it is shown to require $O(N^3/\log N)$ bit operations (thus achieving the Four Russians' bound) but only $O(N \log N)$ bits of additional storage.
* This work was supported in part by the Natural Sciences and Engineering Research Council of Canada under Grant No. A.2415.
The paper is organized as follows. In the next section, some properties of the product matrix are discussed; based on these properties, in Section 3, an algorithm is presented which multiplies a $p \times N$ matrix by an $N \times N$ matrix in $O(N^2)$ bit operations using $O(N \log N)$ bits of additional storage, where $p \le \lfloor \log_2 N \rfloor$; finally, in Section 4, this algorithm is employed to obtain the claimed result. In the following, all logarithms are in base two and $(P, Q, R)$ denotes the problem of multiplying a $P \times Q$ by a $Q \times R$ Boolean matrix.
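The step from the $(p, N, N)$ algorithm to the overall bound can be sketched as follows. This is only a plausible reading of the outline above, assuming that $A$ is cut into horizontal strips of $p = \lfloor \log_2 N \rfloor$ rows each and that the auxiliary storage is reused from one strip to the next; it is not a quotation of Section 4.

```latex
% Assumption: A is split into \lceil N/p \rceil strips of p rows, p = \lfloor \log_2 N \rfloor.
\[
  \underbrace{\left\lceil \frac{N}{p} \right\rceil}_{\text{number of strips}}
  \cdot
  \underbrace{O(N^2)}_{\text{one } (p,N,N) \text{ problem}}
  \;=\; O\!\left(\frac{N^3}{\log N}\right)
  \quad\text{bit operations,}
\]
\[
  \text{additional storage: } O(N \log N) \text{ bits, reused for every strip.}
\]
```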
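Let $A$ be a $p \times N$ Boolean matrix, and define the sets $Z^i_k$ by

$$Z^{i+1}_{2k-1} = \{x \in Z^i_k : A[i+1, x] = 1\} \qquad (1a)$$

and

$$Z^{i+1}_{2k} = \{x \in Z^i_k : A[i+1, x] = 0\} \qquad (1b)$$

($0 \le i < p$, $1 \le k \le 2^i$), where $Z^0_1 = \{1, 2, \ldots, N\}$. Obviously, $Z^{i+1}_{2k-1} \cap Z^{i+1}_{2k} = \emptyset$ and $Z^{i+1}_{2k-1} \cup Z^{i+1}_{2k} = Z^i_k$ ($0 \le i < p$, $1 \le k \le 2^i$).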
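Example 1: Consider the matrix

$$A = \begin{pmatrix} 0 & 1 & 1 & 0 & 0 & 1 & 0 & 1 \\ 0 & 1 & 1 & 1 & 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 1 & 0 & 0 & 1 \end{pmatrix}.$$

In this case, the sets $Z^i_k$ are as follows:

$Z^1_1 = \{2,3,6,8\}$, $Z^1_2 = \{1,4,5,7\}$;

$Z^2_1 = \{2,3\}$, $Z^2_2 = \{6,8\}$, $Z^2_3 = \{4,5\}$, $Z^2_4 = \{1,7\}$;

$Z^3_1 = \{2\}$, $Z^3_2 = \{3\}$, $Z^3_3 = \{8\}$, $Z^3_4 = \{6\}$, $Z^3_5 = \{5\}$, $Z^3_6 = \{4\}$, $Z^3_7 = \{1\}$, $Z^3_8 = \{7\}$.

Let $f^i : \{1, 2, \ldots, N\} \to \{1, 2, \ldots, N\}$ ($0 \le i \le p$) be the mapping

$$f^i(x) = k \ \text{ iff } \ x \in Z^i_k. \qquad (2)$$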
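The definition of the sets and of the mapping (2) can be checked mechanically. The following is a minimal sketch, not part of the original paper; the helper names `z_sets` and `f` are introduced here for illustration only, and columns are numbered from 1 as in the text.

```python
def z_sets(A):
    """Return levels, where levels[i][k-1] is the set Z^i_k as a list.

    A is a list of p rows, each a list of N bits.
    """
    n = len(A[0])
    levels = [[list(range(1, n + 1))]]            # Z^0_1 = {1, ..., N}
    for row in A:                                  # row i+1 of A splits level i
        nxt = []
        for z in levels[-1]:
            nxt.append([x for x in z if row[x - 1] == 1])   # (1a): Z^{i+1}_{2k-1}
            nxt.append([x for x in z if row[x - 1] == 0])   # (1b): Z^{i+1}_{2k}
        levels.append(nxt)
    return levels


def f(levels, i, x):
    """The mapping (2): the index k such that x belongs to Z^i_k."""
    return next(k for k, z in enumerate(levels[i], start=1) if x in z)


# The matrix of Example 1.
A = [[0, 1, 1, 0, 0, 1, 0, 1],
     [0, 1, 1, 1, 1, 0, 0, 0],
     [1, 1, 0, 0, 1, 0, 0, 1]]
levels = z_sets(A)
print(levels[1])   # [[2, 3, 6, 8], [1, 4, 5, 7]]
print(levels[2])   # [[2, 3], [6, 8], [4, 5], [1, 7]]
print(levels[3])   # [[2], [3], [8], [6], [5], [4], [1], [7]]
```

The printed lists agree with the sets listed in Example 1.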
A bijection $\pi : \{1, 2, \ldots, N\} \to \{1, 2, \ldots, N\}$ is said to be $i$-canonical ($0 \le i \le p$) if for all $x, y \in \{1, 2, \ldots, N\}$

$$\pi^{-1}(x) < \pi^{-1}(y) \ \text{ if } \ f^i(x) < f^i(y). \qquad (3)$$
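For a bijection $\pi$, let

$$\pi[i,k] = \{\pi^{-1}(x) : x \in Z^i_k\},$$

$$l_\pi[i,k] = \min\{a \in \pi[i,k]\}, \qquad h_\pi[i,k] = \max\{a \in \pi[i,k]\}.$$

Lemma 1: Let $\pi$ be $i$-canonical. Then, for all non-empty $Z^i_k$ ($1 \le k \le 2^i$),

$$Z^i_k = \{\pi(l_\pi[i,k]),\ \pi(l_\pi[i,k]+1),\ \ldots,\ \pi(h_\pi[i,k])\}.$$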
Proof: By definition of $i$-canonical mapping, each $x \in Z^i_k$ is such that $\pi^{-1}(w) < \pi^{-1}(x) < \pi^{-1}(z)$ for all $w \in Z^i_{k'}$, $z \in Z^i_{k''}$, where $k' < k < k''$. Hence $\pi[i,k]$ is a sequence of consecutive integers. By definition of $l_\pi[i,k]$ and $h_\pi[i,k]$, and since $\pi$ is a bijection, the Lemma follows. □
The sets $Z^i_k$ may conveniently be imagined as lying in a binary tree, each $Z^i_k$ having $Z^{i+1}_{2k-1}$ and $Z^{i+1}_{2k}$ as left and right children. The $i$-th level of the tree then consists of the sets $Z^i_k$ ($k = 1, 2, \ldots, 2^i$), which form a partition (possibly with some empty parts) of $\{1, 2, \ldots, N\}$; the function $f^i$ is the characteristic function of the partition. If the nodes at the $i$-th level, $Z^i_1, Z^i_2, \ldots$, are sets of size $m^i_1, m^i_2, \ldots$, an $i$-canonical bijection is one that maps the first $m^i_1$ positive integers onto $Z^i_1$, the next $m^i_2$ integers onto $Z^i_2$, and so on.
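The description above suggests a direct way to obtain one $i$-canonical bijection and to check condition (3) and Lemma 1 on small inputs. The sketch below is illustrative only: the function names are hypothetical, a bijection is stored as a list whose position $a$ (counted from 1) holds $\pi(a)$, and the convention of recording $(0, 0)$ for an empty part is borrowed from the proof of Lemma 5.

```python
def canonical_bijection(level):
    """Concatenate the parts Z^i_1, Z^i_2, ...; position a (1-based) holds pi(a)."""
    return [x for part in level for x in part]


def is_canonical(pi, level):
    """Check (3): pi^{-1}(x) < pi^{-1}(y) whenever f^i(x) < f^i(y)."""
    pos = {x: a for a, x in enumerate(pi, start=1)}          # pos[x] = pi^{-1}(x)
    f = {x: k for k, part in enumerate(level, start=1) for x in part}
    return all(pos[x] < pos[y] for x in f for y in f if f[x] < f[y])


def intervals(pi, level):
    """Return (l, h) for every part; (0, 0) stands for an empty part."""
    pos = {x: a for a, x in enumerate(pi, start=1)}
    out = []
    for part in level:
        if part:
            ps = [pos[x] for x in part]
            out.append((min(ps), max(ps)))
        else:
            out.append((0, 0))
    return out


level2 = [[2, 3], [6, 8], [4, 5], [1, 7]]      # the sets Z^2_k of Example 1
pi = canonical_bijection(level2)
print(pi)                                       # [2, 3, 6, 8, 4, 5, 1, 7]
print(is_canonical(pi, level2))                 # True
print(intervals(pi, level2))                    # [(1, 2), (3, 4), (5, 6), (7, 8)]
print(is_canonical(list(range(1, 9)), level2))  # False: the identity is not 2-canonical
```

As Lemma 1 states, each non-empty part occupies a block of consecutive positions, so the pair $(l, h)$ is enough to recover it from $\pi$.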
[Figure: the binary tree of the sets $Z^i_k$ for the matrix of Example 1; its leaves are {2}, {3}, {8}, {6}, {5}, {4}, {1}, {7}.]

Construction 2.1:
1. For all $k$ ($1 \le k \le 2^i$), if $Z^i_k \neq \emptyset$:
a) Partition $\pi_i[i,k]$ into two sets $P^i_k$ and $Q^i_k$ such that $P^i_k = \{a \in \pi_i[i,k] : A[i+1, \pi_i(a)] = 1\}$ and $Q^i_k = \{a \in \pi_i[i,k] : A[i+1, \pi_i(a)] = 0\}$.
b) Define $\theta^i_k : \pi_i[i,k] \to \pi_i[i,k]$ to be a bijection such that for all $a \in P^i_k$, $b \in Q^i_k$, $\theta^i_k(a) < \theta^i_k(b)$.
2. Define $\pi_{i+1} : \{1, 2, \ldots, N\} \to \{1, 2, \ldots, N\}$ to be the bijection defined by $\pi_{i+1}^{-1}(x) = \theta^i_k(\pi_i^{-1}(x))$ iff $\pi_i^{-1}(x) \in \pi_i[i,k]$.
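One application of Construction 2.1 can be sketched as below. This is an illustrative reading rather than the authors' implementation: the function name `refine` is hypothetical, each set $\pi_i[i,k]$ is represented by its endpoints $(l, h)$ with $(0, 0)$ for an empty set, and the bijection of Step 1(b) is taken to be the one that keeps the relative order inside $P^i_k$ and inside $Q^i_k$ (any bijection putting $P^i_k$ before $Q^i_k$ would do).

```python
def refine(pi, intervals, next_row):
    """Apply Construction 2.1 once.

    pi        : list with pi[a-1] = pi_i(a), an i-canonical bijection
    intervals : list of (l, h) pairs delimiting pi_i[i, k]; (0, 0) marks an empty set
    next_row  : row i+1 of A as a list of bits
    Returns (pi_next, intervals_next) for level i+1.
    """
    pi_next = list(pi)
    intervals_next = []
    for l, h in intervals:
        if l == 0:                                   # empty Z^i_k: both children empty
            intervals_next += [(0, 0), (0, 0)]
            continue
        block = range(l, h + 1)
        P = [a for a in block if next_row[pi[a - 1] - 1] == 1]    # Step 1(a)
        Q = [a for a in block if next_row[pi[a - 1] - 1] == 0]
        # Step 1(b): theta sends P, then Q, to the positions l, l+1, ..., h.
        for new_pos, a in enumerate(P + Q, start=l):
            pi_next[new_pos - 1] = pi[a - 1]         # Step 2: pi_{i+1}(theta(a)) = pi_i(a)
        mid = l + len(P)
        intervals_next.append((l, mid - 1) if P else (0, 0))
        intervals_next.append((mid, h) if Q else (0, 0))
    return pi_next, intervals_next


# A 2-canonical bijection for the matrix of Example 1, refined with row 3 of A.
pi2 = [2, 3, 6, 8, 4, 5, 1, 7]
iv2 = [(1, 2), (3, 4), (5, 6), (7, 8)]
row3 = [1, 1, 0, 0, 1, 0, 0, 1]
pi3, iv3 = refine(pi2, iv2, row3)
print(pi3)   # [2, 3, 8, 6, 5, 4, 1, 7]
```

Each position is read and written a constant number of times, which is consistent with the $O(N \log N)$ bit-operation bound claimed for the construction when positions are $\lceil \log N \rceil$-bit integers.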
N. Santoro and J. Urrutia:
Example 2: It is easy to verify that the bijection $\pi_2$ defined as follows

$$1 \to 2,\ 2 \to 3,\ 3 \to 6,\ 4 \to 8,\ 5 \to 4,\ 6 \to 5,\ 7 \to 1,\ 8 \to 7$$

is 2-canonical for the matrix of Example 1. The sets $\pi_2[2,k]$ are as follows:

$\pi_2[2,1] = \{1,2\}$, $\pi_2[2,2] = \{3,4\}$, $\pi_2[2,3] = \{5,6\}$, $\pi_2[2,4] = \{7,8\}$;

the partitions $P^2_k$, $Q^2_k$ obtained by applying Construction 2.1 are

$P^2_1 = \{1\}$, $P^2_2 = \{4\}$, $P^2_3 = \{6\}$, $P^2_4 = \{7\}$;

$Q^2_1 = \{2\}$, $Q^2_2 = \{3\}$, $Q^2_3 = \{5\}$, $Q^2_4 = \{8\}$;

and the bijections $\theta^2_k$ are

$\theta^2_1 : 1 \to 1,\ 2 \to 2$; $\theta^2_2 : 4 \to 3,\ 3 \to 4$; $\theta^2_3 : 6 \to 5,\ 5 \to 6$; $\theta^2_4 : 7 \to 7,\ 8 \to 8$.

The bijection $\pi_3$ obtained by Step 2 of Construction 2.1 is then

$$\pi_3 : 1 \to 2,\ 2 \to 3,\ 3 \to 8,\ 4 \to 6,\ 5 \to 5,\ 6 \to 4,\ 7 \to 1,\ 8 \to 7.$$

Lemma 2: Let $\pi_i$ be $i$-canonical, and let $\pi_{i+1}$ be the bijection obtained by Construction 2.1. Then, $\pi_{i+1}$ is $(i+1)$-canonical.
Proof: It is not difficult to see that if $\pi_i$ is $i$-canonical, then $\pi_{i+1}$ is $i$-canonical. Hence, it suffices to prove that $\pi_{i+1}^{-1}(x) < \pi_{i+1}^{-1}(y)$ if $f^{i+1}(x) < f^{i+1}(y)$. Let $x \in Z^{i+1}_{k'}$, $y \in Z^{i+1}_{k''}$ with $k' < k''$. Two cases may arise.

Case 1 ($k' = k'' - 1 = 2k - 1$): In this case, $\pi_i^{-1}(x) \in P^i_k$ and $\pi_i^{-1}(y) \in Q^i_k$; by Step 1(b) of Construction 2.1,

$$\pi_{i+1}^{-1}(x) = \theta^i_k(\pi_i^{-1}(x)) < \theta^i_k(\pi_i^{-1}(y)) = \pi_{i+1}^{-1}(y).$$

Case 2: Let $c' = \lceil k'/2 \rceil$ and $c'' = \lceil k''/2 \rceil$; obviously $c' < c''$. By definition (1), $x \in Z^i_{c'}$ and $y \in Z^i_{c''}$; since $\pi_{i+1}$ is $i$-canonical, then $\pi_{i+1}^{-1}(x) < \pi_{i+1}^{-1}(y)$. □

Let $\phi^i_k(j)$ be the Boolean function ($1 \le i \le p$, $1 \le k \le 2^i$, $1 \le j \le N$)
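$$\phi^i_k(j) = \bigvee_{x \in Z^i_k} B[x,j].$$

Lemma 3: Let $\pi$ be a $p$-canonical bijection. Then

$$\phi^p_k(j) = \begin{cases} \displaystyle\bigvee_{s = l_\pi[p,k]}^{h_\pi[p,k]} B[\pi(s), j] & \text{if } Z^p_k \neq \emptyset \\ 0 & \text{otherwise} \end{cases}$$

for $1 \le k \le 2^p$, $1 \le j \le N$.

Proof: By Lemma 1. □

Lemma 4: $\phi^i_k(j) = \phi^{i+1}_{2k-1}(j) \lor \phi^{i+1}_{2k}(j)$ for $0 \le i < p$, $1 \le k \le 2^i$, $1 \le j \le N$.

Proof: By definition (1), $Z^i_k = Z^{i+1}_{2k-1} \cup Z^{i+1}_{2k}$. □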
An Improved Algorithm
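Theorem 1: For $1 \le i \le p$ and $1 \le j \le N$,

$$C[i,j] = 1 \ \text{ iff } \ \bigvee_{r=1}^{2^{i-1}} \phi^i_{2r-1}(j) = 1.$$

Proof: $C[i,j] = 1$ iff there exist $r \in \{1, \ldots, 2^{i-1}\}$ and $x \in Z^i_{2r-1}$ such that $B[x,j] = 1$, iff $\bigvee_{r=1}^{2^{i-1}} \phi^i_{2r-1}(j) = 1$. □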
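Theorem 1 can be checked numerically on the matrix of Example 1. The sketch below recomputes the sets $Z^i_k$ directly from definition (1), without using canonical bijections, and compares the OR over the odd-indexed sets with the ordinary Boolean product. The function names and the matrix `B` are arbitrary test data introduced here, not taken from the paper.

```python
def z_level(A, i):
    """The sets Z^i_1, ..., Z^i_{2^i} (columns numbered from 1), from definition (1)."""
    level = [list(range(1, len(A[0]) + 1))]
    for row in A[:i]:
        level = [part for z in level
                 for part in ([x for x in z if row[x - 1] == 1],
                              [x for x in z if row[x - 1] == 0])]
    return level


def phi(A, B, i, k, j):
    """phi^i_k(j): the OR of B[x, j] over x in Z^i_k (0 for an empty set)."""
    return int(any(B[x - 1][j - 1] for x in z_level(A, i)[k - 1]))


def c_entry(A, B, i, j):
    """C[i, j] as the OR of phi^i_{2r-1}(j) for r = 1, ..., 2^(i-1)."""
    return int(any(phi(A, B, i, 2 * r - 1, j) for r in range(1, 2 ** (i - 1) + 1)))


def naive_entry(A, B, i, j):
    """C[i, j] straight from the definition of the Boolean product."""
    return int(any(A[i - 1][x] and B[x][j - 1] for x in range(len(B))))


A = [[0, 1, 1, 0, 0, 1, 0, 1],
     [0, 1, 1, 1, 1, 0, 0, 0],
     [1, 1, 0, 0, 1, 0, 0, 1]]
# Arbitrary 8 x 8 test matrix: B[x, j] = 1 iff x * j leaves remainder 1 modulo 3.
B = [[1 if (x * j) % 3 == 1 else 0 for j in range(1, 9)] for x in range(1, 9)]

agree = all(c_entry(A, B, i, j) == naive_entry(A, B, i, j)
            for i in range(1, 4) for j in range(1, 9))
print(agree)
```

This sketch recomputes the sets for every entry, so it illustrates the statement only; the efficiency comes from Lemmas 3 and 4.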
Theorem 1 shows that to compute the entries of the $j$-th column of the product matrix $C = A \times B$ it is sufficient to compute the OR of the Boolean functions $\phi^i_k(j)$ over the appropriate indices. Lemma 4 gives a method for computing the $\phi^i_k(j)$'s once the $\phi^{i+1}_k(j)$'s have been computed; Lemma 3 shows how to compute the starting values $\phi^p_k(j)$ once a $p$-canonical bijection $\pi$ and the values $h_\pi[p,k]$ and $l_\pi[p,k]$ are known. Finally, Construction 2.1 provides a method for determining an $(i+1)$-canonical bijection once an $i$-canonical bijection is known. Observe that, by Lemma 1, each set $\pi_i[i,k]$ is uniquely defined by the values $l_{\pi_i}[i,k]$ and $h_{\pi_i}[i,k]$ ($1 \le k \le 2^i$). The above considerations lead to the following algorithm for computing the product $C = A \times B$.
Algorithm 3.1:

Step 4: (Computation of $j$-th column of product matrix: Initialization)
a) Set $i := p$;
b) Compute $\phi^p_k(j)$ ($1 \le k \le 2^p$) using Lemma 3.

Step 5: (Computation of $j$-th column of product matrix: Iteration)
a) Compute $C[i,j]$ using Theorem 1;
b) Compute $\phi^{i-1}_k(j)$ ($1 \le k \le 2^{i-1}$) using Lemma 4;
c) Set $i := i - 1$; if $i \ge 1$, repeat this step.

Step 6: (Computation of product matrix: Iteration)
Set $j := j + 1$; if $j \le N$, go to Step 4.
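The whole procedure can be sketched in a few lines. The preprocessing part below is an assumption about the steps that precede Step 4 (start from the identity bijection and apply Construction 2.1 once per row of $A$); the column loop follows Steps 4 to 6 as listed. The names `multiply_strip` and `naive` are hypothetical, and the code operates on Python lists rather than on packed bits, so it illustrates the logic, not the bit-operation count.

```python
def multiply_strip(A, B):
    """Boolean product of a p x N matrix A and an N x N matrix B (lists of 0/1 lists)."""
    p, n = len(A), len(B)
    # Preprocessing (assumed): refine the identity bijection p times so that
    # pi becomes p-canonical; intervals holds (l, h) per set, (0, 0) if empty.
    pi = list(range(1, n + 1))
    intervals = [(1, n)]
    for row in A:
        new_pi, new_intervals = list(pi), []
        for l, h in intervals:
            if l == 0:
                new_intervals += [(0, 0), (0, 0)]
                continue
            ones = [pi[a - 1] for a in range(l, h + 1) if row[pi[a - 1] - 1] == 1]
            zeros = [pi[a - 1] for a in range(l, h + 1) if row[pi[a - 1] - 1] == 0]
            new_pi[l - 1:h] = ones + zeros
            mid = l + len(ones)
            new_intervals.append((l, mid - 1) if ones else (0, 0))
            new_intervals.append((mid, h) if zeros else (0, 0))
        pi, intervals = new_pi, new_intervals

    C = [[0] * n for _ in range(p)]
    for j in range(n):                                    # Step 6: next column
        # Step 4: phi^p_k(j) by Lemma 3 (0 for an empty set).
        phi = [int(l > 0 and any(B[pi[s - 1] - 1][j] for s in range(l, h + 1)))
               for l, h in intervals]
        for i in range(p, 0, -1):                         # Step 5
            C[i - 1][j] = int(any(phi[0::2]))             # (a) Theorem 1: odd-indexed k
            phi = [phi[2 * k] | phi[2 * k + 1]            # (b) Lemma 4
                   for k in range(len(phi) // 2)]
    return C


def naive(A, B):
    """Reference Boolean product, used only for comparison."""
    n = len(B)
    return [[int(any(row[x] and B[x][j] for x in range(n))) for j in range(n)]
            for row in A]


A = [[0, 1, 1, 0, 0, 1, 0, 1],
     [0, 1, 1, 1, 1, 0, 0, 0],
     [1, 1, 0, 0, 1, 0, 0, 1]]
# Arbitrary test matrix, not from the paper.
B = [[1 if (x + 2 * j) % 3 == 0 else 0 for j in range(8)] for x in range(8)]
print(multiply_strip(A, B) == naive(A, B))
```

For each column the list `phi` halves in length at every iteration, which mirrors the geometric sum behind the $O(N)$ bound per column.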
Theorem 2: Algorithm 3.1 correctly computes the product $C = A \times B$ within finite time.
Lemma 5: Given an $i$-canonical bijection $\pi_i$, and the values $l_{\pi_i}[i,k]$ and $h_{\pi_i}[i,k]$ for each non-empty $Z^i_k$ ($1 \le k \le 2^i$), an $(i+1)$-canonical bijection can be computed by Construction 2.1 using $O(N \log N)$ bit operations.
Proof: Each non-empty $\pi_i[i,k]$ is composed (see Lemma 1) of the consecutive integers between $l_{\pi_i}[i,k]$ and $h_{\pi_i}[i,k]$. To construct $P^i_k$ and $Q^i_k$, it is sufficient to test each entry $A[i+1, \pi_i(x)]$ ($l_{\pi_i}[i,k] \le x \le h_{\pi_i}[i,k]$); if the entry is zero, then $x$ is added to $Q^i_k$; otherwise, it is added to $P^i_k$. Note that since $x \le h_{\pi_i}[i,k] \le N$, the addition of $x$ to either $P^i_k$ or $Q^i_k$ requires $\lceil \log N \rceil$ bit operations. Hence, the construction of $P^i_k$ and $Q^i_k$ requires a total of $|\pi_i[i,k]|(\lceil \log N \rceil + 1)$ bit operations; since $\theta^i_k$ can be constructed by simply examining the sets $P^i_k$ and $Q^i_k$ (in that order) and assigning consecutive integers between $l_{\pi_i}[i,k]$ and $h_{\pi_i}[i,k]$, an additional $|\pi_i[i,k]| \lceil \log N \rceil$ bit operations suffice. Assuming that testing whether $\pi_i[i,k]$ is empty can be done in $\lceil \log N \rceil$ bit operations (note: this can obviously be achieved by setting $l_{\pi_i}[i,k] = h_{\pi_i}[i,k] = 0$ for empty $\pi_i[i,k]$), the execution of Step 1 requires at most

$$\sum_{k=1}^{2^i} \Big( |\pi_i[i,k]|(\lceil \log N \rceil + 1) + |\pi_i[i,k]| \lceil \log N \rceil + \lceil \log N \rceil \Big)$$

bit operations. Since the $\pi_i[i,k]$'s are all disjoint and $\bigcup_k \pi_i[i,k] = \{1, \ldots, N\}$, $O(N \log N)$ bit operations are required in total by Step 1. Step 2 can obviously be performed in an additional $O(N \log N)$ bit operations; hence, the Lemma holds. □
Lemma 6: The total execution of Step 2 of Algorithm 3.1 requires $O(pN \log N)$ bit operations.

Proof: By Lemma 5, the $i$-th iteration of Step 2(a) requires $O(N \log N)$ bit operations. Since the values $l_{\pi_i}[i,k]$ and $h_{\pi_i}[i,k]$ (in Step 2(b)) can be computed from the sets $P^{i-1}_k$ and $Q^{i-1}_k$ in $|\pi_{i-1}[i-1,k]| \lceil \log N \rceil$ bit operations, and since $\sum_k |\pi_{i-1}[i-1,k]| = N$, it follows that the $i$-th iteration of Step 2(b) requires $O(N \log N)$ bit operations. Since Steps 2(a) and 2(b) are performed $p$ times, the Lemma holds. □

Lemma 7: For a given $j$, $1 \le j \le N$, the total execution of Step 5 of Algorithm 3.1 requires $O(N)$ bit operations.

Proof: The $i$-th execution of Step 5(a) (for a fixed $j$) requires at most $2^{i-1} - 1$ bit operations (see Theorem 1); Step 5(b) requires $2^{i-1}$ bit operations, one for each $\phi^{i-1}_k(j)$ (see Lemma 4). Since Steps 5(a) and 5(b) are executed for $i \in \{1, \ldots, p\}$,

$$\sum_{i=1}^{p} \left( 2^{i-1} - 1 + 2^{i-1} \right) = \sum_{i=1}^{p} \left( 2^i - 1 \right) = O(N)$$

bit operations are required in total. □
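The closed form of the sum in the proof of Lemma 7 can be written out explicitly. The derivation below is a worked check, assuming the per-iteration costs $2^{i-1} - 1$ for Step 5(a) and $2^{i-1}$ for Step 5(b), and $p \le \lfloor \log_2 N \rfloor$ (so that $2^p \le N$).

```latex
\[
  \sum_{i=1}^{p} \bigl( (2^{i-1} - 1) + 2^{i-1} \bigr)
  \;=\; \sum_{i=1}^{p} (2^{i} - 1)
  \;=\; (2^{p+1} - 2) - p
  \;\le\; 2 \cdot 2^{p}
  \;\le\; 2N
  \;=\; O(N).
\]
```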
Theorem 3: Algorithm 3.1 requires $O(N^2)$ bit operations.
Proof: By Lemmas 6 and 7, and by observing that each execution of Step 4 requires $O(N)$ bit operations (see Lemma 3), and that Steps 4 and 5 are executed $N$ times while Step 2 is executed only once. □

Algorithm 3.1 can be implemented so as to employ only $O(N \log N)$ bits of additional storage. The basic ideas of this implementation are the following.

An $i$-canonical bijection $\pi_i$ can obviously be stored in an integer array of $N$ elements, where the $j$-th entry contains $\pi_i(j)$. Since $\pi_{i+1}(j)$ is constructed only afterwards, the storage area for $\pi_i(j)$ can be reused for $\pi_{i+1}(j)$. Since $P^i_k \cup Q^i_k \subseteq \{1, \ldots, N\}$ and $P^i_k \cap Q^i_k = \emptyset$, each of the auxiliary sets $P^i_k$ and $Q^i_k$ can obviously be stored in an integer array of $N$ elements; furthermore, the same storage area can be used to store $P^{i+1}_k$ and $Q^{i+1}_k$ once $\theta^i_k$ has been computed. Since the mapping $\theta^i_k$ is only a "fragment" of $\pi_{i+1}$ (see Step 2 of Construction 2.1), it is implicitly contained in $\pi_{i+1}$.

To store all values $l_{\pi_i}[i,k]$ and $h_{\pi_i}[i,k]$, two integer arrays of size $2N$ each suffice (recall, there are $2^{p+1} - 1 \le 2N - 1$ possible $l$'s, and the same number of $h$'s). Since all these integers are in $\{1, \ldots, N\}$, $O(N \log N)$ bits of additional storage in total suffice to implement Steps 1 and 2 of Algorithm 3.1.

For a fixed $j$, $1 \le j \le N$, all the $2^{p+1} - 1$ values $\phi^i_k(j)$ can be stored in a Boolean array of $2N$ elements (recall, $2^{p+1} - 1 \le 2N - 1$); the same array can obviously be employed for successive $j$'s. Hence, $O(N)$ bits of additional storage in total are sufficient to implement the remaining Steps 3-6 of Algorithm 3.1. Therefore

Theorem 4: Algorithm 3.1 can be implemented so as to use at most $O(N \log N)$ bits of additional space.
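One way to lay out the arrays of size $2N$ mentioned above is the usual level-by-level numbering of a complete binary tree. The paper does not prescribe a layout; the functions `slot` and `children` below are hypothetical helpers showing that the pairs $(i, k)$ with $0 \le i \le p$ and $1 \le k \le 2^i$ fit into $2^{p+1} - 1$ consecutive cells.

```python
def slot(i, k):
    """Index of the pair (i, k), 0 <= i <= p, 1 <= k <= 2^i, in a flat array.

    Level i occupies positions 2^i - 1 ... 2^(i+1) - 2, so levels 0..p use
    2^(p+1) - 1 cells in total.
    """
    return (1 << i) - 1 + (k - 1)


def children(i, k):
    """Cells of the two children (i+1, 2k-1) and (i+1, 2k) of node (i, k)."""
    return slot(i + 1, 2 * k - 1), slot(i + 1, 2 * k)


p, n = 3, 8                                   # the sizes of Example 1
cells = [slot(i, k) for i in range(p + 1) for k in range(1, 2 ** i + 1)]
print(len(cells), max(cells))                 # 15 14, i.e. 2^(p+1) - 1 <= 2N - 1 cells
print(children(1, 2))                         # (5, 6)
```

With this numbering the children of cell $c$ are cells $2c + 1$ and $2c + 2$, so Lemma 4 can be applied by index arithmetic alone, without storing pointers.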
5. Conclusions
A new algorithm for computing the product of two arbitrary $N \times N$ Boolean matrices has been presented. It has been shown that the proposed algorithm requires $O(N^3/\log N)$ bit operations (thus achieving the Four Russians' bound) but only $O(N \log N)$ bits of additional storage. It should be pointed out that, unlike the Four Russians' Method, the proposed algorithm cannot be directly executed on a vector computer.
Acknowledgements
The authors wish to thank Mike Atkinson for the helpful discussions.
References
[1] Arlazarof, V. L., Dinic, E. A., Kronrod, M. A., Faradzev, I. A.: On economical construction of the transitive closure of a directed graph. Dokl. Akad. Nauk SSSR 194, 487-488 (1970).
[2] Santoro, N.: Four O(N^2) multiplication methods for sparse and dense Boolean matrices. In: Proc. 10th Conf. Numerical Mathematics and Computing, pp. 241-253. Winnipeg, Manitoba, 1980.
[3] Vyskoč, J.: A note on Boolean matrix multiplication. Inf. Proc. Letters 19, 249-251 (1984).

Dr. Nicola Santoro
School of Computer Science
Carleton University
Ottawa, K1S 5B6, Canada

Dr. Jorge Urrutia
Computer Science Department
University of Ottawa
Ottawa, K1N 5B4, Canada
Publisher: Springer-Verlag KG, Mölkerbastei 5, A-1010 Wien. Editor: Prof. Dr. Hans J. Stetter, Institut für Angewandte und Numerische Mathematik der Technischen Universität Wien, Wiedner Hauptstraße 6-10, A-1040 Wien. Editorial office: Wiedner Hauptstraße 6-10, A-1040 Wien. Production: typesetting by Austro-Filmsatz Richard Gerin, Zirkusgasse 13, A-1020 Wien; printing by Paul Gerin, Zirkusgasse 13, A-1021 Wien. Place of publication: Wien. Place of production: Wien. Printed in Austria.