
CSE 594: Combinatorial and Graph Algorithms

SUNY at Buffalo, Spring 2005

Lecturer: Hung Q. Ngo


Last update: March 26, 2005

Analyzing approximation algorithms with the dual-fitting method

A greedy algorithm for SET COVER

One of the best examples of combinatorial approximation algorithms is a greedy algorithm approximating the (weighted) SET COVER problem. An instance of the SET COVER problem consists of a universe set $U = \{1, \dots, m\}$ and a family $\mathcal{S} = \{S_1, \dots, S_n\}$ of subsets of $U$, where each set $S \in \mathcal{S}$ has weight $w_S$. We want to find a sub-family of $\mathcal{S}$ with minimum total weight such that the union of the sub-family is $U$ (i.e. covers $U$).
Consider the following greedy algorithm:
Algorithm 1.1. GREEDY-SET-COVER($U, \mathcal{S}, w$)
1: $C = \emptyset$
2: while $U \neq \emptyset$ do
3:     Pick $S \in \mathcal{S}$ with the least cost per uncovered element, i.e. pick $S$ such that $\frac{w_S}{|S \cap U|}$ is minimized.
4:     $U \leftarrow U - S$
5:     $C = C \cup \{S\}$
6: end while
7: return $C$
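The greedy rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `greedy_set_cover`, the dictionary-based input format, and the small example instance are not from the notes, and sets that no longer cover any new element are skipped because their cost ratio is undefined.

```python
def greedy_set_cover(universe, sets, weights):
    """Return the names of the picked sets, in the order greedy picks them.

    sets maps a (hypothetical) set name to a Python set of elements,
    weights maps the same names to positive weights.
    """
    uncovered = set(universe)
    cover = []
    while uncovered:
        # Only sets that still cover something new have a finite cost ratio.
        candidates = [name for name in sets if sets[name] & uncovered]
        if not candidates:
            raise ValueError("the family does not cover the universe")
        # Least weight per newly covered element: w_S / |S ∩ U|.
        best = min(candidates,
                   key=lambda name: weights[name] / len(sets[name] & uncovered))
        uncovered -= sets[best]
        cover.append(best)
    return cover


# Illustrative instance (an assumption, not taken from the notes).
sets = {"A": {1, 2, 3}, "B": {3, 4, 5}, "C": {1, 2, 3, 4, 5}, "D": {4, 5}}
weights = {"A": 3, "B": 3, "C": 6, "D": 1}
cover = greedy_set_cover(range(1, 6), sets, weights)
print(cover)  # ['D', 'A']
```

On this instance the first pick is D (ratio 1/2), after which A has ratio 1 while B and C have ratios 3 and 2, so the returned cover has total weight 4.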

In this section, we analyze this algorithm combinatorially. Then, a linear programming based analysis will be derived in the next section.
Without loss of generality, suppose the algorithm returns a collection $\{S_1, \dots, S_k\}$ of $k$ sets. Let $X_i$ be the set of newly covered elements of $U$ after the $i$th step. Let $x_i = |X_i|$, and $w_i = w_{S_i}$, which is the weight of the $i$th set picked by the algorithm. Assign a cost $c(u) = w_i/x_i$ to each element $u \in X_i$, for all $i \le k$.
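For any set $S \in \mathcal{S}$, we first estimate $\sum_{u \in S} c(u)$. Let $a_i = |S \cap X_i|$. Just before the $i$th step, $S$ still contains $a_i + \dots + a_k$ uncovered elements, and the greedy rule preferred $S_i$ to $S$. Hence we have the following:
\[
\begin{aligned}
\frac{w_1}{x_1} &\le \frac{w_S}{a_1 + \dots + a_k} \\
\frac{w_2}{x_2} &\le \frac{w_S}{a_2 + \dots + a_k} \\
&\;\;\vdots \\
\frac{w_k}{x_k} &\le \frac{w_S}{a_k}.
\end{aligned}
\]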
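Hence,
\[
\sum_{u \in S} c(u) = \sum_{i=1}^{k} a_i \frac{w_i}{x_i} \le w_S \sum_{i=1}^{k} \frac{a_i}{a_i + \dots + a_k} \le w_S \cdot H_{|S|},
\]
where $H_{|S|} = 1 + 1/2 + \dots + 1/|S|$ is the $|S|$th harmonic number. The last inequality holds because, writing $t_i = a_i + \dots + a_k$ (with $t_{k+1} = 0$), we have $a_i/t_i \le 1/t_i + 1/(t_i - 1) + \dots + 1/(t_{i+1} + 1)$, and summing over $i$ gives at most $H_{t_1} = H_{|S|}$. Since $|S| \le m$ for all $S$, we conclude that
\[
\sum_{u \in S} c(u) \le H_m \cdot w_S, \quad \forall S \in \mathcal{S}. \tag{1}
\]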

One may ask: what if $a_i + \dots + a_k = 0$ for some $i$? This is not a problem. Since $S \neq \emptyset$, $a_1 + \dots + a_k \neq 0$. If $a_i + \dots + a_k = 0$ for some $i$, then all the terms $a_i \frac{w_i}{x_i}, \dots, a_k \frac{w_k}{x_k}$ can be ignored.
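The harmonic bound used above, $\sum_i a_i/(a_i + \dots + a_k) \le H_{a_1 + \dots + a_k}$, can be checked exhaustively on small inputs. The sketch below is a sanity check rather than a proof; the helper names `harmonic` and `charged_fraction` are illustrative, and terms with a zero tail sum are skipped exactly as the remark above allows.

```python
from fractions import Fraction
from itertools import product


def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, i) for i in range(1, n + 1))


def charged_fraction(a):
    """Sum of a_i / (a_i + ... + a_k), skipping terms whose tail sum is zero."""
    total = Fraction(0)
    for i in range(len(a)):
        tail = sum(a[i:])
        if tail > 0:
            total += Fraction(a[i], tail)
    return total


# Try every sequence (a_1, ..., a_4) with entries in {0, 1, 2, 3} and a nonzero sum.
worst_gap = min(harmonic(sum(a)) - charged_fraction(a)
                for a in product(range(4), repeat=4) if sum(a) > 0)
print(worst_gap)  # 0
```

The smallest gap is 0, attained for instance at $(1, 1, 1, 1)$, so the bound cannot be improved in general.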
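Let $\mathcal{T}$ be any optimal solution, then
\[
\text{cost}(C) \le \sum_{T \in \mathcal{T}} \sum_{u \in T} c(u) \le \sum_{T \in \mathcal{T}} H_{|T|} \cdot w_T \le H_m \cdot \text{cost}(\mathcal{T}).
\]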

We thus have proved the following theorem.


Theorem 1.2. GREEDY-SET-COVER has approximation ratio $H_m$.
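To get a feel for how close the greedy algorithm can come to the $H_m$ bound, the sketch below runs it on a standard bad instance that is not part of these notes: singletons $\{i\}$ of weight $1/i$ plus the whole universe at weight slightly above 1 (here $11/10$, an arbitrary choice). The function name `greedy_cost` and the value $m = 4$ are illustrative assumptions.

```python
from fractions import Fraction


def greedy_cost(universe, sets, weights):
    """Total weight of the cover produced by the greedy rule of Algorithm 1.1."""
    uncovered, total = set(universe), Fraction(0)
    while uncovered:
        candidates = [s for s in sets if sets[s] & uncovered]
        best = min(candidates, key=lambda s: weights[s] / len(sets[s] & uncovered))
        uncovered -= sets[best]
        total += weights[best]
    return total


m = 4
sets = {i: {i} for i in range(1, m + 1)}            # singleton {i} ...
weights = {i: Fraction(1, i) for i in range(1, m + 1)}  # ... with weight 1/i
sets["all"] = set(range(1, m + 1))                   # the whole universe
weights["all"] = Fraction(11, 10)                    # slightly heavier than 1

greedy = greedy_cost(range(1, m + 1), sets, weights)
H_m = sum(Fraction(1, i) for i in range(1, m + 1))
optimum = weights["all"]   # any cover avoiding "all" needs every singleton
print(greedy, optimum, greedy / optimum)  # 25/12 11/10 125/66
```

Greedy picks the singletons $\{4\}, \{3\}, \{2\}, \{1\}$ in that order and pays exactly $H_4 = 25/12$, while the optimum pays $11/10$; the ratio $125/66 \approx 1.89$ stays below $H_4 \approx 2.08$, as the theorem guarantees.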
Exercise 1. In the SET MULTICOVER problem, each element $u$ is required to be covered $m_u$ times, where $m_u$ is a positive integer. Each set can be picked multiple times. The cost of picking $S$ $k$ times is $k w_S$. Devise a greedy algorithm for SET MULTICOVER with approximation ratio $H_m$ (and prove that!).
Exercise 2. In the MAXIMUM COVERAGE problem, we are given a universe $U$, a collection $\mathcal{S}$ of subsets of $U$, and a positive integer $k$. Each element $u$ in the universe has a non-negative integer weight $w_u$. The problem is to find $k$ members of $\mathcal{S}$ whose union has the maximum total weight.
Suppose we solve this problem by greedily picking the best set in each iteration until $k$ sets are picked. (The best set is the set maximizing the total weight of uncovered elements.) Prove that this strategy has approximation ratio $1 - \left(1 - \frac{1}{k}\right)^k$.
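The greedy strategy described in Exercise 2 might look as follows in Python. This only illustrates the strategy and does not address the proof asked for; the name `greedy_max_coverage` and the example instance are assumptions made for the illustration.

```python
def greedy_max_coverage(sets, element_weight, k):
    """Pick k sets greedily; return (picked names, total weight covered)."""
    covered, picked = set(), []
    for _ in range(k):
        def gain(name):
            # Total weight of the elements this set would newly cover.
            return sum(element_weight[u] for u in sets[name] - covered)
        best = max(sets, key=gain)   # ties go to the first set in dict order
        picked.append(best)
        covered |= sets[best]
    return picked, sum(element_weight[u] for u in covered)


# Illustrative instance with unit weights (an assumption, not from the notes).
sets = {"X": {1, 2, 3, 4}, "Y": {1, 2, 5}, "Z": {3, 4, 6}}
element_weight = {u: 1 for u in range(1, 7)}
picked, value = greedy_max_coverage(sets, element_weight, k=2)
print(picked, value)  # ['X', 'Y'] 5
```

Here the optimum for $k = 2$ is Y together with Z, covering weight 6, so greedy achieves $5/6$ of the optimum, which is above the claimed ratio $1 - (1 - 1/2)^2 = 3/4$.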
Exercise 3. Consider the WEIGHTED VERTEX COVER problem in which each vertex $v$ is weighted with $w_v > 0$. Consider the following algorithm

Algorithm 1.3. LR-VERTEX-COVER($G, w$)
1: $C = \emptyset$
2: For each $v \in V(G)$, let $c(v) \leftarrow w_v$
3: while $C$ is not a vertex cover do
4:     Pick an uncovered edge $(u, v)$, let $\epsilon \leftarrow \min\{c(u), c(v)\}$
5:     $c(u) \leftarrow c(u) - \epsilon$; $c(v) \leftarrow c(v) - \epsilon$
6:     Add into $C$ all vertices $v$ having $c(v) = 0$.
7: end while
8: return $C$
Prove that this is a 2-approximation algorithm.
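A minimal Python sketch of Algorithm 1.3 is given below, assuming the graph is supplied as an edge list. The name `lr_vertex_cover` and the star-shaped example are illustrative. A single pass over the edges is one valid order for the "pick an uncovered edge" step, since every processed edge gets an endpoint with residual weight 0 and edges never become uncovered again.

```python
def lr_vertex_cover(edges, weights):
    """Local-ratio style vertex cover; weights must be positive."""
    residual = dict(weights)   # c(v), initialised to w_v
    cover = set()
    for u, v in edges:
        if u in cover or v in cover:
            continue           # this edge is already covered
        eps = min(residual[u], residual[v])
        residual[u] -= eps
        residual[v] -= eps
        # Only u or v can have just reached residual weight 0.
        cover.update(x for x in (u, v) if residual[x] == 0)
    return cover


# Star with centre "c" (weight 2) and three leaves of weight 1 (illustrative).
edges = [("c", "l1"), ("c", "l2"), ("c", "l3")]
weights = {"c": 2, "l1": 1, "l2": 1, "l3": 1}
cover = lr_vertex_cover(edges, weights)
print(sorted(cover))  # ['c', 'l1', 'l2']
```

On this instance the algorithm returns a cover of weight 4, while the centre alone is an optimal cover of weight 2, so the factor 2 in the exercise is attained.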

Analyzing GREEDY SET COVER with dual-fitting

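It is natural to find out how Algorithm 1.1 relates to the integer programming formulation of SET COVER. Recall the integer program for SET COVER is
\[
\begin{array}{lll}
\min & \displaystyle\sum_{S \in \mathcal{S}} w_S x_S & \\
\text{subject to} & \displaystyle\sum_{S \ni u} x_S \ge 1, & \forall u \in U, \\
 & x_S \in \{0, 1\}, & \forall S \in \mathcal{S}.
\end{array} \tag{2}
\]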
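The LP-relaxation is
\[
\begin{array}{lll}
\min & \displaystyle\sum_{S \in \mathcal{S}} w_S x_S & \\
\text{subject to} & \displaystyle\sum_{S \ni u} x_S \ge 1, & \forall u \in U, \\
 & x_S \ge 0, & \forall S \in \mathcal{S}.
\end{array} \tag{3}
\]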

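And, the dual linear program is
\[
\begin{array}{lll}
\max & \displaystyle\sum_{u \in U} y_u & \\
\text{subject to} & \displaystyle\sum_{u \in S} y_u \le w_S, & \forall S \in \mathcal{S}, \\
 & y_u \ge 0, & \forall u \in U.
\end{array} \tag{4}
\]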
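The dual constraints look very much like relation (1), except that we need to divide both sides of (1) by $H_m$. Thus, for each $u \in U$, if we set $y_u = c(u)/H_m$, then $y$ is a dual feasible solution. Since the value of any dual feasible solution is at most the optimal value of the LP-relaxation (3), which in turn is at most OPT, it follows that
\[
\text{cost}(C) = \sum_{u \in U} c(u) = H_m \cdot \text{cost}(y) \le H_m \cdot \text{OPT}.
\]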
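The dual-fitting argument can be replayed numerically. The sketch below recomputes the charges $c(u)$ made by the greedy algorithm, scales them by $H_m$, and checks that the resulting $y$ satisfies the constraints of (4). It assumes integer weights (so exact fractions can be used) and a family that covers the universe; the function name `greedy_charges` and the instance are illustrative.

```python
from fractions import Fraction


def greedy_charges(universe, sets, weights):
    """Run the greedy rule; return (charge c(u) per element, cost of the cover)."""
    uncovered, charge, cost = set(universe), {}, Fraction(0)
    while uncovered:
        candidates = [s for s in sets if sets[s] & uncovered]
        best = min(candidates,
                   key=lambda s: Fraction(weights[s], len(sets[s] & uncovered)))
        newly = sets[best] & uncovered          # the set X_i of this step
        for u in newly:
            charge[u] = Fraction(weights[best], len(newly))   # c(u) = w_i / x_i
        uncovered -= newly
        cost += weights[best]
    return charge, cost


universe = range(1, 6)
sets = {"A": {1, 2, 3}, "B": {3, 4, 5}, "C": {1, 2, 3, 4, 5}, "D": {4, 5}}
weights = {"A": 3, "B": 3, "C": 6, "D": 1}

charge, cost = greedy_charges(universe, sets, weights)
H_m = sum(Fraction(1, i) for i in range(1, len(universe) + 1))
y = {u: charge[u] / H_m for u in universe}

# Dual feasibility for (4): sum of y_u over each S is at most w_S, and y >= 0.
feasible = all(sum(y[u] for u in sets[s]) <= weights[s] for s in sets)
print(feasible, cost, H_m * sum(y.values()))  # True 4 4
```

The last two printed values agree, which is the identity $\text{cost}(C) = H_m \cdot \text{cost}(y)$ on this instance.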

More general covering problems

The CONSTRAINED SET MULTICOVER problem is a generalization of the SET COVER problem in which each element $u \in U$ needs to be covered $m_u$ times, where $m_u$ is a positive integer, and each set can be picked at most once.
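The corresponding integer program can be written as
\[
\begin{array}{lll}
\min & \displaystyle\sum_{S \in \mathcal{S}} w_S x_S & \\
\text{subject to} & \displaystyle\sum_{S \ni u} x_S \ge m_u, & \forall u \in U, \\
 & x_S \in \{0, 1\}, & \forall S \in \mathcal{S}.
\end{array} \tag{5}
\]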
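When relaxing this program, it is no longer possible to remove the upper bounds $x_S \le 1$ (otherwise an integral optimal solution to the LP may not be an optimal solution to the IP). The LP-relaxation is
\[
\begin{array}{lll}
\min & \displaystyle\sum_{S \in \mathcal{S}} w_S x_S & \\
\text{subject to} & \displaystyle\sum_{S \ni u} x_S \ge m_u, & \forall u \in U, \\
 & -x_S \ge -1, & \forall S \in \mathcal{S}, \\
 & x_S \ge 0, & \forall S \in \mathcal{S}.
\end{array} \tag{6}
\]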
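The dual linear program is now
\[
\begin{array}{lll}
\max & \displaystyle\sum_{u \in U} m_u y_u - \sum_{S \in \mathcal{S}} z_S & \\
\text{subject to} & \displaystyle\sum_{u \in S} y_u - z_S \le w_S, & \forall S \in \mathcal{S}, \\
 & y_u, z_S \ge 0, & \forall u \in U, \ \forall S \in \mathcal{S}.
\end{array} \tag{7}
\]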
We will try to devise a greedy algorithm to solve this problem and analyze it using the dual-fitting
method.
Algorithm 3.1. GREEDY-SET-MULTICOVER($U, \mathcal{S}, w, m$)
1: $C = \emptyset$; $A \leftarrow U$
2: // We call an element $u \in U$ alive if $m_u > 0$. Initially all of $A$ are alive
3: while $A \neq \emptyset$ do
4:     Pick $S \in \mathcal{S} - C$ such that $\frac{w_S}{|S \cap A|}$ is minimized.
5:     $C = C \cup \{S\}$
6:     $m_u \leftarrow m_u - 1$ for each $u \in S \cap A$
7:     Remove from $A$ all $u$ with $m_u = 0$
8: end while
9: return $C$
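Algorithm 3.1 can be sketched in Python as follows. The sketch assumes that each set may be picked at most once, as required by the constraint $x_S \in \{0, 1\}$ in (5), and that the demands can actually be met; the name `greedy_set_multicover` and the example instance are illustrative.

```python
def greedy_set_multicover(universe, sets, weights, demand):
    """Return the picked set names, in pick order, for constrained multicover."""
    remaining = {u: demand[u] for u in universe}       # the current m_u
    alive = {u for u in universe if remaining[u] > 0}  # the set A
    picked = []
    while alive:
        # Unpicked sets that still contain an alive element.
        candidates = [s for s in sets if s not in picked and sets[s] & alive]
        if not candidates:
            raise ValueError("the demands cannot be met")
        best = min(candidates, key=lambda s: weights[s] / len(sets[s] & alive))
        picked.append(best)
        for u in sets[best] & alive:
            remaining[u] -= 1
        alive = {u for u in alive if remaining[u] > 0}
    return picked


# Illustrative instance: element 1 must be covered twice (an assumption).
sets = {"A": {1, 2}, "B": {1, 3}, "C": {1, 2, 3}}
weights = {"A": 2, "B": 3, "C": 4}
demand = {1: 2, 2: 1, 3: 1}
picked = greedy_set_multicover([1, 2, 3], sets, weights, demand)
print(picked)  # ['A', 'B']
```

The first pick is A (ratio 1, against 3/2 for B and 4/3 for C). Afterwards only elements 1 and 3 are alive, so B has ratio 3/2 and C has ratio 2, and B is picked.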

The next step is to write the cost of $C$ in the form of the objective function of (7). For each element $u \in U$, and each $j \in [m_u]$, let $c(u, j)$ be the cost of covering $u$ for the $j$th time. If $S$ covers $u$ for the $j$th time, and $A_S$ is the set of alive elements before $S$ was picked, then $c(u, j) = w_S / |S \cap A_S|$. If $S$ was chosen before $T$, then $A_T \subseteq A_S$, and thus
\[
\frac{w_S}{|S \cap A_S|} \le \frac{w_T}{|T \cap A_S|} \le \frac{w_T}{|T \cap A_T|}.
\]
Consequently, for any $u$ we have $c(u, 1) \le \dots \le c(u, m_u)$. The final cost is
\[
\text{cost}(C) = \sum_{u \in U} \sum_{j=1}^{m_u} c(u, j).
\]

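In order to write this sum in the form $\sum_{u \in U} m_u y_u - \sum_{S \in \mathcal{S}} z_S$ (keeping in mind that $y_u, z_S \ge 0$), it makes sense to try
\[
\begin{aligned}
\text{cost}(C) &= \sum_{u \in U} m_u c(u, m_u) - \sum_{u \in U} \sum_{j=1}^{m_u - 1} \left[ c(u, m_u) - c(u, j) \right] \\
&= \sum_{u \in U} m_u c(u, m_u) - \sum_{u \in U} \sum_{j=1}^{m_u} \left[ c(u, m_u) - c(u, j) \right].
\end{aligned}
\]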

The second double sum (after the minus sign) is non-negative, which is good. We need to write it in the form $\sum_{S \in \mathcal{S}} z_S$ somehow. Note that, each time $u$ is covered, a term $c(u, m_u) - c(u, j)$ is added into the sum. For each $S \in C$, suppose $S$ covers $u \in S \cap A_S$ the $j_{u,S}$th time. Then,
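\[
\sum_{u \in U} \sum_{j=1}^{m_u} \left[ c(u, m_u) - c(u, j) \right] = \sum_{S \in C} \sum_{u \in S \cap A_S} \left[ c(u, m_u) - c(u, j_{u,S}) \right].
\]
Consequently, the sum $\sum_{u \in S \cap A_S} \left[ c(u, m_u) - c(u, j_{u,S}) \right]$ can roughly play the role of $z_S$. (If $S \notin C$, we can set $z_S = 0$.) Just as in the normal SET COVER case, we will have to scale down the (hypothetical) $y_u$ and $z_S$ to make them feasible. Suppose we scale them down by a factor $\alpha$ to be determined. Formally, define
\[
y_u = \frac{1}{\alpha} c(u, m_u), \quad \forall u \in U,
\qquad
z_S = \begin{cases}
\dfrac{1}{\alpha} \displaystyle\sum_{u \in S \cap A_S} \left[ c(u, m_u) - c(u, j_{u,S}) \right] & S \in C, \\
0 & S \notin C.
\end{cases}
\]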
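We want to find $\alpha$ so that, for each $S \in \mathcal{S}$,
\[
\sum_{u \in S} y_u - z_S \le w_S.
\]
Consider first $S \notin C$. In this case,
\[
\sum_{u \in S} y_u - z_S = \frac{1}{\alpha} \sum_{u \in S} c(u, m_u).
\]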

Let $u_1, \dots, u_k$ be the elements of $S$. Without loss of generality, assume that $u_1$ was completely covered before $u_2$, and so on. Then, right before $u_i$ is completely covered, $S$ still has at least $k - (i - 1)$ alive elements. Hence, $c(u_i, m_{u_i}) \le w_S / (k - i + 1)$. Consequently,
\[
\sum_{u \in S} y_u - z_S \le \frac{1}{\alpha} \sum_{i=1}^{k} \frac{w_S}{k - i + 1} \le \frac{H_m}{\alpha} \cdot w_S.
\]

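Secondly, suppose $S \in C$. In this case we have
\[
\begin{aligned}
\sum_{u \in S} y_u - z_S &= \frac{1}{\alpha} \sum_{u \in S} c(u, m_u) - \frac{1}{\alpha} \sum_{u \in S \cap A_S} \left[ c(u, m_u) - c(u, j_{u,S}) \right] \\
&= \frac{1}{\alpha} \left[ \sum_{u \in S \setminus A_S} c(u, m_u) + \sum_{u \in S \cap A_S} c(u, j_{u,S}) \right].
\end{aligned}
\]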

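Let $u_1, \dots, u_{k'}$ be the elements of $S \setminus A_S$, listed in the order in which they were completely covered, and let $k = |S|$. Note that $0 \le k' < k$. Note also that $\sum_{u \in S \cap A_S} c(u, j_{u,S}) = w_S$. Similar to the previous reasoning, we get
\[
\sum_{u \in S} y_u - z_S \le \frac{1}{\alpha} \left( \sum_{i=1}^{k'} \frac{w_S}{k - i + 1} + w_S \right) \le \frac{H_m}{\alpha} \cdot w_S.
\]
The last inequality holds because $\sum_{i=1}^{k'} \frac{1}{k - i + 1} + 1 = H_k - H_{k - k'} + 1 \le H_k \le H_m$, using $H_{k - k'} \ge 1$.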

Hence, $(y, z)$ would be a dual feasible solution if we pick $\alpha = H_m$, which would also be an approximation ratio for Algorithm 3.1.
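The construction of $(y, z)$ can be replayed on a small instance. The sketch below records every charge $c(u, j)$ made by the greedy multicover algorithm, builds $y$ and $z$ with $\alpha = H_m$, and checks both the constraints of (7) and the identity $\text{cost}(C) = \alpha \left( \sum_u m_u y_u - \sum_S z_S \right)$. It assumes integer weights, positive demands, and a feasible instance; the helper name `multicover_charges` and the instance are illustrative.

```python
from fractions import Fraction


def multicover_charges(universe, sets, weights, demand):
    """Greedy constrained multicover, recording the charges c(u, j)."""
    remaining = dict(demand)
    alive = {u for u in universe if remaining[u] > 0}
    charges = {u: [] for u in universe}   # charges[u][j - 1] = c(u, j)
    hit = {}                              # hit[S] = [(u, j_{u,S}) for u in S ∩ A_S]
    while alive:
        candidates = [s for s in sets if s not in hit and sets[s] & alive]
        best = min(candidates,
                   key=lambda s: Fraction(weights[s], len(sets[s] & alive)))
        served = sets[best] & alive       # S ∩ A_S
        price = Fraction(weights[best], len(served))
        hit[best] = []
        for u in served:
            charges[u].append(price)
            hit[best].append((u, len(charges[u])))
            remaining[u] -= 1
        alive = {u for u in alive if remaining[u] > 0}
    return charges, hit


universe = [1, 2, 3]
sets = {"A": {1, 2}, "B": {1, 3}, "C": {1, 2, 3}}
weights = {"A": 2, "B": 3, "C": 4}
demand = {1: 2, 2: 1, 3: 1}

charges, hit = multicover_charges(universe, sets, weights, demand)
alpha = sum(Fraction(1, i) for i in range(1, len(universe) + 1))   # H_m
y = {u: charges[u][-1] / alpha for u in universe}                  # c(u, m_u) / alpha
z = {s: sum(charges[u][-1] - charges[u][j - 1] for u, j in hit.get(s, [])) / alpha
     for s in sets}

feasible = all(sum(y[u] for u in sets[s]) - z[s] <= weights[s] for s in sets)
cost = sum(weights[s] for s in hit)
dual_value = sum(demand[u] * y[u] for u in universe) - sum(z.values())
print(feasible, cost, alpha * dual_value)  # True 5 5
```

On this instance only $z_A = 3/11$ is non-zero, because element 1 is charged $1$ when A covers it for the first time and $3/2$ when B covers it for the second time.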
Exercise 4. Devise a greedy algorithm for SET MULTICOVER with approximation ratio $H_m$. Analyze your algorithm using the dual-fitting method.
Exercise 5. In the MULTISET MULTICOVER problem, we are given a collection $\mathcal{S}$ of multisets of a universe $U$. For each $S \in \mathcal{S}$, let $M(S, u)$ be the multiplicity of $u$ in $S$. Each element $u$ needs to be covered $m_u$ times. We can assume $M(S, u) \le m_u$ for all $S, u$.
Devise a greedy algorithm for MULTISET MULTICOVER with approximation ratio $H_d$, where $d$ is the largest multiset size. The size of a multiset is the total multiplicity of its elements. Analyze your algorithm using the dual-fitting method.
Exercise 6. Consider the integer program $\min\{c^T x \mid Ax \ge b\}$, where $A, b$ have non-negative integral entries, and $x$ is required to be non-negative and integral also. This is called a covering integer program.
Use scaling and rounding to reduce covering integer programs to MULTISET MULTICOVER, so that we can use the greedy algorithm for the MULTISET MULTICOVER instance to get a greedy algorithm for the COVERING INTEGER PROGRAM instance with approximation ratio $O(\lg n)$, where $n$ is the input size of the covering integer program. (Thus, the instance of MULTISET MULTICOVER must have size polynomial in $n$.)
Exercise 7. Vazirani's book, Problem 24.12, page 241.

Historical Notes
The greedy approximation algorithm for SET COVER is due to Johnson [5], Lovász [6], and Chvátal [2]. Feige [4] showed that SET COVER cannot be approximated in polynomial time to an asymptotically better ratio than $\ln m$ unless NP has $n^{O(\log \log n)}$-time algorithms. The dual-fitting analysis for GREEDY SET COVER was given by Lovász [6]. Dobson [3] and Rajagopalan and Vazirani [8] studied approximation algorithms for covering integer programs. The dual-fitting method has found applications in other places [1, 7].

References
[1] P. Carmi, T. Erlebach, and Y. Okamoto, Greedy edge-disjoint paths in complete graphs, in Graph-theoretic concepts in computer science, vol. 2880 of Lecture Notes in Comput. Sci., Springer, Berlin, 2003, pp. 143-155.
[2] V. Chvátal, A greedy heuristic for the set-covering problem, Math. Oper. Res., 4 (1979), pp. 233-235.
[3] G. Dobson, Worst-case analysis of greedy heuristics for integer programming with nonnegative data, Math. Oper. Res., 7 (1982), pp. 515-531.
[4] U. Feige, A threshold of ln n for approximating set cover (preliminary version), in Proceedings of the Twenty-eighth Annual ACM Symposium on the Theory of Computing (Philadelphia, PA, 1996), New York, 1996, ACM, pp. 314-318.
[5] D. S. Johnson, Approximation algorithms for combinatorial problems, J. Comput. System Sci., 9 (1974), pp. 256-278. Fifth Annual ACM Symposium on the Theory of Computing (Austin, Tex., 1973).
[6] L. Lovász, On the ratio of optimal integral and fractional covers, Discrete Math., 13 (1975), pp. 383-390.
[7] M. Mahdian, E. Markakis, A. Saberi, and V. Vazirani, A greedy facility location algorithm analyzed using dual fitting, in Approximation, randomization, and combinatorial optimization (Berkeley, CA, 2001), vol. 2129 of Lecture Notes in Comput. Sci., Springer, Berlin, 2001, pp. 127-137.
[8] S. Rajagopalan and V. V. Vazirani, Primal-dual RNC approximation algorithms for set cover and covering integer programs, SIAM J. Comput., 28 (1999), pp. 525-540 (electronic). A preliminary version appeared in FOCS'93.
