Branch and Bound Hand Out Uw Madison

ISyE/CS 720: Integer Programming

Branch-and-bound
University of Wisconsin-Madison
Spring 2014
Outline
Branch-and-bound
Overview of the algorithm
Example
Many details on possible variations

Overview
Branch-and-bound (Chap. 7)
Basic idea behind most algorithms for solving integer programming problems
Solve a relaxation of the problem
Some constraints are ignored or replaced with less stringent constraints
For a maximization problem, this gives an upper bound on the true optimal value
If the relaxation solution is feasible for the original problem, it is optimal
Otherwise, we divide the feasible region (branch) and repeat

Overview
Linear Programming Relaxation
Consider the mixed-integer program:
$$z_{IP} = \max \{ c^T x + h^T y : Ax + Gy \le b,\ x \ge 0,\ y \in \mathbb{Z}^p_+ \}$$
Its linear programming relaxation is:
$$z_{LP} = \max \{ c^T x + h^T y : Ax + Gy \le b,\ x \ge 0,\ y \ge 0 \}$$
How does $z_{LP}$ compare to $z_{IP}$? $z_{LP} \ge z_{IP}$.
What do we know if the solution $(x^*_{LP}, y^*_{LP})$ of the LP relaxation has $y^*_{LP} \in \mathbb{Z}^p$?
Overview
Branching: The divide in Divide-and-conquer
Generic optimization problem:
$$z^* = \max \{ c^T x \mid x \in S \}$$
Consider subsets $S_1, \ldots, S_k$ of $S$ which cover $S$: $S = \bigcup_i S_i$. Then
$$\max \{ c^T x \mid x \in S \} = \max_{1 \le i \le k} \max \{ c^T x \mid x \in S_i \}$$
In other words, we can optimize over each subset separately.
Usually want the $S_i$ sets to be disjoint ($S_i \cap S_j = \emptyset$ for all $i \ne j$)
Dividing the original problem into subproblems is called branching
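As a small illustration of why optimizing over each subset separately is valid, the sketch below enumerates a made-up finite set S and checks that the best value over two covering subsets equals the best value over S. The set, the objective c, and the split on x1 are assumptions chosen only for illustration.

```python
def best_value(points, c):
    """Maximum of c^T x over a finite, non-empty set of points."""
    return max(sum(ci * xi for ci, xi in zip(c, x)) for x in points)

# Hypothetical feasible set: integer points with 0 <= x1, x2 <= 3 and x1 + x2 <= 4.
S = [(x1, x2) for x1 in range(4) for x2 in range(4) if x1 + x2 <= 4]
c = (3, 2)

# Branch on x1: the two subsets are disjoint and together cover S.
S1 = [x for x in S if x[0] <= 1]
S2 = [x for x in S if x[0] >= 2]

z_whole = best_value(S, c)
z_parts = max(best_value(S1, c), best_value(S2, c))
print(z_whole, z_parts)
```

Both numbers agree because every point of S lies in S1 or S2, so no candidate is lost by the split.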
Overview
Bounding: The conquer in Divide-and-conquer
Any feasible solution to the problem provides a lower bound L on the optimal solution value ($\hat{x} \in S \Rightarrow z^* \ge c^T \hat{x}$).
We can use heuristics to find a feasible solution $\hat{x}$
After branching, for each subproblem i we solve a relaxation yielding an upper bound $u(S_i)$ on the optimal solution value for the subproblem.
Overall bound: $U = \max_i u(S_i)$
Key: If $u(S_i) \le L$, then we don't need to consider subproblem i.
In MIP, we usually get the upper bound by solving the LP relaxation, but there are other ways too.
Overview
LP-based branch and bound for MIP
Let $z_{IP}$ be the optimal value of the MIP
In LP-based branch and bound, we first solve the LP relaxation of the original problem. The result is one of the following:
1. The LP is unbounded ⇒ the MIP is unbounded or infeasible.
2. The LP is infeasible ⇒ the MIP is infeasible.
3. The optimal LP solution is feasible for the MIP ⇒ it is an optimal solution to the MIP. ($L = z_{IP} = U$)
4. We obtain an optimal solution to the LP that is not feasible for the MIP ⇒ upper bound. ($U = z_{LP}$)
In the first three cases, we are finished.
In the final case, we must branch and recursively solve the resulting subproblems.
Overview
Terminology
If we picture the subproblems graphically, they form a search tree.
Eliminating a problem from further consideration is called pruning.
The act of bounding and then branching is called processing.
A subproblem that has not yet been processed is called a candidate.
The set of candidates is the candidate list.
Overview
LP-based branch and bound algorithm
1. To start, derive a lower bound L using a heuristic method (if possible).
2. Put the original problem on the candidate list.
3. Select a problem S from the candidate list and solve the LP relaxation to obtain the bound u(S)
If the LP is infeasible ⇒ node can be pruned.
If u(S) > L and the solution is feasible for the MIP ⇒ set L ← u(S).
If u(S) ≤ L ⇒ node can be pruned.
Otherwise, branch. Add the new subproblems to the list.
4. If the candidate list is nonempty, go to Step 3. Otherwise, the algorithm is completed.
The global upper bound:
$$U_t = \max \{ u(\mathrm{parent}(S)) : S \text{ in candidate list at step } t \}$$
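The steps of the algorithm can be sketched on a 0-1 knapsack problem, whose LP relaxation is solved exactly by a greedy fill in value/weight order, so no LP solver is needed. This is a minimal sketch under assumptions that are not from the slides: the knapsack instance, the function names, depth-first node selection, and branching on the first fractional variable are all illustrative choices.

```python
from fractions import Fraction

def lp_relaxation(values, weights, capacity, fixed):
    """LP relaxation of a 0-1 knapsack with some variables fixed to 0 or 1.

    Returns (bound, x), or None if the fixed variables exceed the capacity.
    Assumes positive integer weights; the greedy ratio fill is LP-optimal.
    """
    n = len(values)
    x = [Fraction(0)] * n
    room = Fraction(capacity)
    for j, v in fixed.items():
        x[j] = Fraction(v)
        room -= v * weights[j]
    if room < 0:
        return None  # infeasible node
    free = sorted((j for j in range(n) if j not in fixed),
                  key=lambda j: Fraction(values[j], weights[j]), reverse=True)
    for j in free:
        take = min(Fraction(1), room / weights[j])
        x[j] = take
        room -= take * weights[j]
        if room == 0:
            break
    return sum(values[j] * x[j] for j in range(n)), x

def branch_and_bound(values, weights, capacity):
    n = len(values)
    L, incumbent = 0, [0] * n   # Step 1: trivial heuristic, take nothing
    candidates = [{}]           # Step 2: original problem, nothing fixed
    nodes = 0
    while candidates:           # Step 4: loop until the list is empty
        fixed = candidates.pop()  # Step 3: select a node (depth first here)
        nodes += 1
        result = lp_relaxation(values, weights, capacity, fixed)
        if result is None:
            continue            # prune: LP infeasible
        u, x = result
        if u <= L:
            continue            # prune: bound no better than incumbent
        fractional = [j for j in range(n) if x[j].denominator != 1]
        if not fractional:
            L, incumbent = u, [int(v) for v in x]  # feasible and u > L
            continue
        j = fractional[0]       # branch on the first fractional variable
        candidates.append({**fixed, j: 0})
        candidates.append({**fixed, j: 1})
    return L, incumbent, nodes

best, x_best, node_count = branch_and_bound([10, 13, 7, 8], [3, 4, 2, 3], 7)
print(best, x_best, node_count)
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the integrality test and the comparison u ≤ L are not affected by floating-point rounding.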

Overview
Let's Do an Example
maximize
$$z = 4x_1 - x_2$$
subject to
$$7x_1 - 2x_2 \le 14$$
$$x_2 \le 3$$
$$2x_1 - 2x_2 \le 3$$
$$x_1, x_2 \in \mathbb{Z}_+$$
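A quick way to check a hand-run of branch-and-bound on this example is to enumerate the integer points directly. The sketch below assumes the example reads max 4x1 − x2 subject to 7x1 − 2x2 ≤ 14, x2 ≤ 3, 2x1 − 2x2 ≤ 3 with x1, x2 nonnegative integers (the signs and relations are an assumption, since they are not legible in the extracted slide). The LP vertex (20/7, 3) used for comparison was obtained by checking the vertices of this small polygon by hand.

```python
from fractions import Fraction

def feasible(x1, x2):
    """Constraints of the example as assumed in the lead-in."""
    return 7 * x1 - 2 * x2 <= 14 and x2 <= 3 and 2 * x1 - 2 * x2 <= 3

def objective(x1, x2):
    return 4 * x1 - x2

# With x2 <= 3 the first constraint forces x1 <= 20/7, so a small box suffices.
points = [(x1, x2) for x1 in range(6) for x2 in range(4) if feasible(x1, x2)]
x_ip = max(points, key=lambda p: objective(*p))
z_ip = objective(*x_ip)

# LP relaxation optimum at the vertex where 7x1 - 2x2 = 14 meets x2 = 3.
x_lp = (Fraction(20, 7), Fraction(3))
z_lp = objective(*x_lp)

print(x_ip, z_ip, z_lp)
```

Under these assumptions the LP bound 59/7 is strictly above the integer optimum, so the root node has to be branched on x1.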
Choices in branch-and-bound
Each of the steps in a branch-and-bound algorithm can be done in many different ways
Heuristics to find feasible solutions: yields lower bounds
Solving a relaxation: yields upper bounds
Node selection: which subproblem to look at next
Branching: dividing the feasible region
You can help an integer programming solver by telling it how it should do these steps
You can even implement your own better way to do one or more of these steps
You can do better because you know more about your problem
How long does branch-and-bound take?
Simplistic (but useful) approximation:
Total time = (Time to process a node) × (Number of nodes)
When making choices in branch-and-bound, think about the effect on each factor separately
Question
Which of these is likely to be most important?
Heuristics
Choices in branch-and-bound: Heuristics
Practical perspective: finding good feasible solutions is most important
Manager won't be happy if you tell her you have no solution, but you know the optimal value is at most U
A heuristic is an algorithm that tries to find a good feasible solution
No guarantees: maybe fails to find a solution, maybe finds a poor one
But, typically runs fast
Sometimes called primal heuristics
Good heuristics help find an optimal solution in branch-and-bound
Key to success: Prune early and often
We prune when $u(S_i) \le L$, where L is the best lower bound
Good heuristics ⇒ larger L ⇒ prune more
Heuristics
Heuristics examples
Solving the LP relaxation can be interpreted as a heuristic
Often (usually) fails to give a feasible solution
Rounding/Diving
Round the fractional integer variables
With those fixed, solve LP to find continuous variable values
Diving: fix one fractional integer variable, solve LP, continue
Many more clever possibilities
Metaheuristics: Simulated Annealing, Tabu Search, Genetic Algorithms, etc.
Optimization-based heuristics
Solve a heavily restricted version of the problem optimally
Relaxation-induced neighborhood search (RINS), local branching
Problem-specific heuristics
This is often a very good way to help an IP solver
May run heuristic once, or throughout search (via a callback)

Heuristics
Problem-specific heuristic for machine scheduling
Minimize weighted start time: $x_{jt} = 1$ iff job j starts at time t
$$\min \sum_{j=1}^{J} w_j \sum_{t=1}^{T} t\, x_{jt}$$
$$\sum_{j=1}^{J} \sum_{s=t-p_j+1}^{t} x_{js} \le 1, \quad t = 1, \ldots, T,$$
$$\sum_{t=1}^{T} x_{jt} = 1, \quad j = 1, \ldots, J$$
$$x_{jt} \in \{0, 1\}, \quad j = 1, \ldots, J,\ t = 1, \ldots, T$$
Heuristics
Problem-specific heuristic for machine scheduling
Decision variables: $x_{jt} = 1$ iff job j starts at time t
LP-based heuristic
Solve the LP relaxation ⇒ $x^*_{jt}$
Calculate $v_j = \sum_{t=1}^{T} t\, x^*_{jt}$ for all j
$v_j$ is when the LP wants to schedule job j
Schedule jobs in increasing order of $v_j$
Works well for this problem because this LP formulation is a good approximation of the IP problem
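The LP-based ordering heuristic can be sketched as follows. The fractional values in `x_lp` are made up for illustration (they are not the output of an actual LP solve), and the function name, the processing times, and the choice to start each job as soon as the machine is free are assumptions.

```python
def lp_order_schedule(x_lp, proc_times):
    """Order jobs by their LP-weighted start time v_j and schedule them back to back.

    x_lp[j][t - 1] holds the LP value of x_{jt}; times are 1-based.
    Returns (v, order, start) where start[j] is the start time given to job j.
    """
    v = [sum(t * x for t, x in enumerate(row, start=1)) for row in x_lp]
    order = sorted(range(len(x_lp)), key=lambda j: v[j])
    start, clock = {}, 1
    for j in order:
        start[j] = clock          # start as soon as the machine is free
        clock += proc_times[j]
    return v, order, start

# Hypothetical fractional LP solution for 3 jobs over T = 6 periods.
x_lp = [
    [0.5, 0.0, 0.0, 0.5, 0.0, 0.0],   # job 0: v = 2.5
    [0.0, 0.0, 0.5, 0.5, 0.0, 0.0],   # job 1: v = 3.5
    [1.0, 0.0, 0.0, 0.0, 0.0, 0.0],   # job 2: v = 1.0
]
v, order, start = lp_order_schedule(x_lp, proc_times=[2, 1, 2])
print(v, order, start)
```

The resulting schedule is always feasible for a single machine, so its weighted start time gives a valid bound from the feasible side that can serve as the incumbent.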
Relaxation
Choices in branch-and-bound: Choosing/solving the relaxation
The relaxation is the most important factor for proving a solution is optimal
Optimal value of the relaxation yields the upper bound
Recall: we prune when $u(S_i) \le L$
Smaller (tighter) upper bounds ⇒ prune more
So the formulation is very important
Much of this course will be devoted to understanding good formulations and automatically improving formulations
Time spent solving the relaxation at each node usually dominates the total solution time
Want to solve each relaxation fast, but also want to solve fewer of them
Potential trade-off: a formulation that yields a better upper bound may be larger (more time to solve relaxation)
Usually, the formulation with a better upper bound wins
Relaxation
Solving the LP relaxation efficiently
Branching is usually done by changing bounds on a variable which is fractional in the current solution ($x_j \le 0$ or $x_j \ge 1$)
Only difference in the LP relaxation in the new subproblem is this bound change
LP dual solution remains feasible
Reoptimize with dual simplex
If we choose to process the new subproblem next, we can even avoid refactoring the basis
Another advantage of dual simplex: it works by improving an upper bound on the optimal value of the relaxation
Let $u^k$ be the upper bound at iteration k of dual simplex: $u^k \ge u(S_i)$
If $u^k \le L$, then $u(S_i) \le u^k \le L$, so we can prune the node
We didn't even have to solve the LP relaxation completely!

Node Selection
Choices in branch-and-bound: Node selection
Node selection: Strategy for selecting the next subproblem (node) to be processed.
Important, but not as important as heuristics, relaxations, or branching (to be discussed next)
Often called search strategy
Two different goals:
Minimize overall solution time.
Find a good feasible solution quickly.
Node Selection
The Best First Approach
One way to minimize overall solution time is to try to minimize the size of the search tree.
We can achieve this if we choose the subproblem with the best bound (highest upper bound if we are maximizing).
Let's prove this
A candidate node is said to be critical if its bound exceeds the value of an optimal solution to the IP.
Every critical node will be processed no matter what the search order
Best first is guaranteed to examine only critical nodes, thereby minimizing the size of the search tree (for a given fixed choice of branching decisions).
Node Selection
Drawbacks of Best First
1. Doesn't necessarily find feasible solutions quickly
Feasible solutions are more likely to be found deep in the tree
2. Node setup costs are high
The linear program being solved may change quite a bit from one node LP solve to the next
3. Memory usage is high
It can require a lot of memory to store the candidate list, since the tree can grow broad
Node Selection
The Depth First Approach
Depth first: always choose the deepest node to process next
Dive until you prune, then back up and go the other way
Avoids most of the problems with best first:
Number of candidate nodes is minimized (saving memory)
Node set-up costs are minimized since LPs change very little from one iteration to the next
Feasible solutions are usually found quickly
Unfortunately, if the initial lower bound is not very good, then we may end up processing lots of non-critical nodes.
We want to avoid this extra expense if possible.
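The difference between best first and depth first comes down to the data structure behind the candidate list. The sketch below is one possible way to express that, assuming a maximization problem; the class names, the `drain` helper, and the node labels and bounds are hypothetical.

```python
import heapq

class BestFirstList:
    """Candidate list that returns the node with the highest upper bound."""
    def __init__(self):
        self._heap = []
        self._counter = 0  # tie-breaker so node objects are never compared

    def push(self, bound, node):
        # heapq is a min-heap, so store the negated bound.
        heapq.heappush(self._heap, (-bound, self._counter, node))
        self._counter += 1

    def pop(self):
        neg_bound, _, node = heapq.heappop(self._heap)
        return -neg_bound, node

    def __len__(self):
        return len(self._heap)

class DepthFirstList:
    """Candidate list that returns the most recently added node."""
    def __init__(self):
        self._stack = []

    def push(self, bound, node):
        self._stack.append((bound, node))

    def pop(self):
        return self._stack.pop()

    def __len__(self):
        return len(self._stack)

def drain(candidates):
    """Pop every node and report the order in which they came out."""
    order = []
    while len(candidates):
        order.append(candidates.pop()[1])
    return order

best_first, depth_first = BestFirstList(), DepthFirstList()
for bound, name in [(8.4, "A"), (7.9, "B"), (8.1, "C")]:
    best_first.push(bound, name)
    depth_first.push(bound, name)

best_order = drain(best_first)
depth_order = drain(depth_first)
print(best_order, depth_order)
```

In a real search the stack yields the deepest node because children are pushed right after their parent is processed; here the nodes are pushed all at once only to show the two pop orders.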

Node Selection
Hybrid Strategies
A Key Insight
If you knew the optimal solution value, the best thing to do would be to go depth first
Idea: Go depth-first until $z_{LP}$ goes below the optimal value $z_{IP}$, then make a best-first move.
But we don't know the optimal value!
Make an estimate $z_E$ of the optimal solution value
Go depth-first until $z_{LP} \le z_E$
Then jump to a better node

Branching
Choices in branch-and-bound: Branching
If our relaxed solution $\hat{x}$ is not integer feasible, we must decide how to partition the search space into smaller subproblems
The strategy for doing this is called a Branching Rule
Branching wisely is very important
Significantly impacts bounds in subproblems
It is most important at the top of the branch-and-bound tree

Branching
Branching in integer programming
Most common approach: Changing variable bounds
If $\hat{x}$ is not integer feasible, choose $j \in N$ such that $f_j := \hat{x}_j - \lfloor \hat{x}_j \rfloor > 0$
Create two problems with additional constraints
1. $x_j \le \lfloor \hat{x}_j \rfloor$ on one branch
2. $x_j \ge \lceil \hat{x}_j \rceil$ on other branch
In the case of 0-1 IP, this dichotomy reduces to
1. $x_j = 0$ on one branch
2. $x_j = 1$ on other branch
Review: Why is branching by changing variable bounds convenient when using LP relaxations?
Key question
Which variable to branch on?
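Branching by changing variable bounds can be sketched in a few lines: find the integer variables with a fractional LP value, then copy the bound table twice, tightening the upper bound to the floor on one child and the lower bound to the ceiling on the other. The function names, the tolerance, and the sample values below are assumptions for illustration.

```python
import math

def fractional_indices(x_hat, integer_vars, tol=1e-9):
    """Indices j of integer variables whose fractional part f_j is nonzero."""
    result = []
    for j in integer_vars:
        f = x_hat[j] - math.floor(x_hat[j])
        if tol < f < 1 - tol:
            result.append(j)
    return result

def branch(bounds, x_hat, j):
    """Return the (down, up) bound tables for the two children of a node."""
    lo, hi = bounds[j]
    down = dict(bounds)
    down[j] = (lo, math.floor(x_hat[j]))   # x_j <= floor(x_hat_j)
    up = dict(bounds)
    up[j] = (math.ceil(x_hat[j]), hi)      # x_j >= ceil(x_hat_j)
    return down, up

# Hypothetical LP solution and variable bounds.
x_hat = {0: 20 / 7, 1: 3.0}
bounds = {0: (0, 10), 1: (0, 3)}

candidates = fractional_indices(x_hat, [0, 1])
down, up = branch(bounds, x_hat, candidates[0])
print(candidates, down, up)
```

Because only one bound differs from the parent, each child LP keeps the parent's constraint matrix unchanged, which is what makes this form of branching convenient for warm starts.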
Branching
The goal of branching
Branching divides one problem into two or more subproblems
We would like to choose the branching that minimizes the sum of the solution times of all the created subproblems.
This is the solution time of the entire subtree rooted at the node.
How do we know how long it will take to solve each subproblem?
Answer: We don't.
Idea: Try to branch on variables that will cause the upper bounds to decrease the most
This will lead to more pruning, and smaller subtrees

Branching
Finding a good branching variable
I want to branch on a variable that causes the upper bound to decrease a lot in the subproblems!
Then I can prune those nodes, or should be able to prune them quickly
So a branching variable that changes these bounds the most is likely to be a good choice.
Ideas?
What are some ideas you have for deciding on a branching variable?
Branching
Predicting the bound change in a subproblem
How can I (quickly?) estimate the upper bounds that would result from branching on a variable?
Strong branching
Actually solve the LP relaxation of each subproblem for each potential branching variable
Pseudo-costs
Approximate the bound change based on previous information collected in the branch-and-bound tree
Hybrid: Reliability branching
Tentative branching
Like strong branching, but also add valid inequalities to the subproblems, and possibly branch a few times
Branching
Strong branching: Practicalities
Don't fully solve the subproblem LPs: just do a few dual simplex pivots
This gives an upper bound on the subproblem bound
How many is a few? Empirical study suggests 25 or so
Don't check subproblems for every candidate branching variable
Which to evaluate?
Look at an estimate of their effectiveness that is very cheap to evaluate
E.g., most fractional variables, or pseudo-costs (next slide)
Perhaps evaluate more candidates near the top of the tree
Fully solving the LPs or evaluating more candidates will probably reduce search tree size, but likely increases total time
Branching
Using pseudo-costs
The pseudo-cost of a variable is an estimate of the per-unit change in the objective function from forcing the value of the variable to be rounded up or down. Like a gradient!
For each variable $x_j$, we maintain an up and a down pseudo-cost, denoted $P^+_j$ and $P^-_j$.
Let $f_j$ be the fractional part of the current value of variable $x_j$.
An estimate of the change in objective function in each of the subproblems resulting from branching on $x_j$ is given by
$$D^+_j = P^+_j (1 - f_j), \qquad D^-_j = P^-_j f_j.$$
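The two estimates D+ = P+ (1 − f) and D− = P− f translate directly into code. A minimal sketch, in which the function name and the sample pseudo-costs are assumptions:

```python
import math

def pseudo_cost_estimates(x_value, p_up, p_down):
    """Estimated objective change on the up and down branches of one variable.

    x_value is the variable's current LP value; p_up and p_down are its
    per-unit pseudo-costs for rounding up and down.
    """
    f = x_value - math.floor(x_value)     # fractional part f_j
    d_up = p_up * (1 - f)                 # distance to round up is 1 - f
    d_down = p_down * f                   # distance to round down is f
    return d_up, d_down

d_up, d_down = pseudo_cost_estimates(2.25, p_up=4.0, p_down=2.0)
print(d_up, d_down)
```

With a value of 2.25 the variable is close to its floor, so the down branch is expected to change the bound only slightly while the up branch has to move three quarters of a unit.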
How to get the pseudo-costs?

Branching
Obtaining and updating pseudo-costs
Empirical data
Observe the actual change that occurs after branching on each one of the variables and use that as the pseudo-cost
We can either choose to update the pseudo-cost as the calculation progresses or just use the first pseudo-cost found
Pseudo-costs tend to remain fairly constant
How to initialize? Possibilities:
Use the objective function coefficient
Use the average of all known pseudo-costs
Explicitly initialize the pseudo-costs using strong branching: this is the hybrid reliability branching approach
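One way to update pseudo-costs as the calculation progresses is to keep a running average of the observed per-unit bound changes for each direction. The sketch below assumes that averaging rule and a caller-supplied default for variables with no history; the class and method names are hypothetical.

```python
class PseudoCost:
    """Running average of observed per-unit bound changes for one variable."""

    def __init__(self):
        self.total = {"up": 0.0, "down": 0.0}
        self.count = {"up": 0, "down": 0}

    def observe(self, direction, bound_change, f):
        """Record a branching result.

        bound_change is the observed decrease of the LP bound in the child;
        f is the fractional part of the variable before branching (0 < f < 1).
        """
        distance = (1 - f) if direction == "up" else f
        self.total[direction] += bound_change / distance
        self.count[direction] += 1

    def value(self, direction, default=1.0):
        """Average per-unit change, or the default if nothing was observed."""
        n = self.count[direction]
        return self.total[direction] / n if n else default

pc = PseudoCost()
pc.observe("up", bound_change=1.5, f=0.25)   # per-unit change 1.5 / 0.75 = 2.0
pc.observe("up", bound_change=2.0, f=0.5)    # per-unit change 2.0 / 0.5 = 4.0
up_cost = pc.value("up")
down_cost = pc.value("down")
print(up_cost, down_cost)
```

The `default` argument is where an initialization rule plugs in, for example the objective coefficient, the average of known pseudo-costs, or a value obtained from strong branching.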
Branching
Combining multiple subproblem bounds
For each candidate branching variable, we calculate an estimate of the upper bound change for each subproblem
Either via strong branching or pseudo-costs
How do we combine the two numbers together to form one measure of goodness for choosing a branching variable?
Idea: branch on variable $x_{j^*}$ with:
$$j^* = \arg\max_j \{ D^+_j \cdot D^-_j \}$$
Older alternative: A weighted sum of (min/max)...
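A product-style score rewards variables whose two children both change the bound. The sketch below assumes the score is the product of the up and down estimates, with a small floor `eps` so that a zero estimate on one side does not erase the other side; the floor, the function name, and the sample estimates are assumptions.

```python
def choose_branching_variable(estimates, eps=1e-6):
    """Pick the variable with the largest product of its two bound-change estimates.

    estimates maps a variable index to a pair (d_up, d_down).
    """
    def score(j):
        d_up, d_down = estimates[j]
        return max(d_up, eps) * max(d_down, eps)
    return max(estimates, key=score)

# Hypothetical estimates for three fractional variables.
estimates = {
    0: (3.0, 0.5),   # product 1.5
    1: (1.5, 1.5),   # product 2.25
    2: (4.0, 0.0),   # one child would not move the bound at all
}
chosen = choose_branching_variable(estimates)
print(chosen)
```

Variable 2 has the largest single estimate but loses, because one of its children would leave the bound unchanged and so could not be pruned any sooner.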

Branching
Putting it all together
Choices we've discussed in branching:
Strong branching or pseudo-costs?
Pseudo-costs
How should we initialize?
How should we update?
Strong branching
How do we choose the list of branching candidates?
How many pivots to do on each?
Once we have the bound estimates, how do we finally choose the branching variable?
Ultimately, we must use empirical evidence and intuition to answer these questions.
Final Branching Topics
GUB/SOS1 Sets
Special ordered set of type 1 (SOS1)
A set of non-negative variables $\{x_j : j \in S\}$ is an SOS1 set if we require that at most one variable $x_j$ for $j \in S$ can be positive.
Generalized upper bounds (GUB)
If the variables $x_j$ in an SOS1 set are binary, it is called a GUB constraint:
$$\sum_{j \in S} x_j \le 1$$
Why generalized upper bound?
My guess:
If $x_j \in \{0, 1\}$, $u > 0$ and y is a continuous variable: $y \le u x_j$ is called a variable upper bound
Consider model with weights $u_j$ for $j \in J$, and constraints:
$$y \le \sum_{j \in J} u_j x_j, \qquad \sum_{j \in J} x_j \le 1$$
Choose capacity of something from a set $\{u_j : j \in J\}$
GUB Branching
Suppose $x_j \in \{0, 1\}$ and we have the constraint $\sum_{j=1}^{J} x_j = 1$ and weights $u_j$ for $j = 1, \ldots, J$ with $u_1 \le u_2 \le \cdots \le u_J$
Which branching do you think would be better?
1. $x_k = 1$ & $x_k = 0$ ($\sum_{j \ne k} x_j = 1$), or
2. $\sum_{j=1}^{k} x_j = 1$ & $\sum_{j=k+1}^{J} x_j = 1$
First branch: Either choose capacity $u_k$ or don't choose $u_k$
Second branch: Either choose capacity $\le u_k$ or choose capacity $\ge u_{k+1}$
Answer: It depends
But the answer is almost surely (2)
But it is important that there be natural weights $u_j$
Implementing GUB Branching
Suppose $x_j \in \{0, 1\}$ and we have the constraint $\sum_{j=1}^{J} x_j = 1$ and weights $u_j$ for $j = 1, \ldots, J$ with $u_1 \le u_2 \le \cdots \le u_J$
A GUB branch: $\sum_{j=1}^{k} x_j = 1$ or $\sum_{j=k+1}^{J} x_j = 1$
To enforce $\sum_{j=1}^{k} x_j = 1$, set upper bound on $x_j$ to 0 for $j = k + 1, \ldots, J$
To enforce $\sum_{j=k+1}^{J} x_j = 1$, set upper bound on $x_j$ to 0 for $j = 1, \ldots, k$
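Implementing a GUB branch through upper bounds can be sketched as below. The bound manipulation follows the rule stated above; the way the split index k is picked (the number of weights not exceeding the weighted average of the fractional solution, clamped so both sides are non-empty) is an assumption, as are the function names and the sample data.

```python
def split_index(x_hat, u):
    """Pick k so that u_k <= sum_j u_j * x_hat_j, keeping both sides non-empty.

    u must be sorted in nondecreasing order; k is 1-based as in the slides.
    """
    target = sum(uj * xj for uj, xj in zip(u, x_hat))
    k = sum(1 for uj in u if uj <= target)
    return min(max(k, 1), len(u) - 1)

def gub_branch(J, k):
    """Upper bounds on x_1..x_J for the two children of a GUB branch at k.

    Left child forces x_j = 0 for j > k; right child forces x_j = 0 for j <= k.
    Position j - 1 of each list holds the upper bound of x_j.
    """
    left = [1 if j <= k else 0 for j in range(1, J + 1)]
    right = [0 if j <= k else 1 for j in range(1, J + 1)]
    return left, right

# Hypothetical weights and a fractional LP solution satisfying sum x_j = 1.
u = [10, 20, 30, 40]
x_hat = [0.5, 0.0, 0.0, 0.5]

k = split_index(x_hat, u)
left_ub, right_ub = gub_branch(len(u), k)
print(k, left_ub, right_ub)
```

In this example the fractional solution puts weight 0.5 on each side of the split, so it violates both children and is cut off by the branch.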
Branch-and-bound Wrap-up
We've seen a lot of details of branch-and-bound, but there is much more that goes into an effective implementation
Preprocessing
Dealing with symmetry
Many other tricks

