You are on page 1of 52

ESI 4313

Operations Research 2

Dynamic Programming
Dynamic Programming
 Dynamic programming is a technique for
solving certain types of optimization
problems
 The idea is to break up a large, complex
problem into many smaller, much easier ones
 Usually, this technique can be applied to
problems in which a sequence of decisions
over time needs to be made to optimize
some criterion
Dynamic Programming
 In many cases, solving a problem by
dynamic programming means
 formulating this problem as a shortest path
problem in an acyclic network
 The art of dynamic programming lies in
how to construct this network!
Example:
Travel from coast to coast
 You currently live in NYC (1), but plan to move
to LA (10)
 You will drive
 To save money, you will spend each night of your
trip at a friend’s house
 You structure your potential stopovers as follows:
 In 1 day you can reach Columbus (2), Nashville (3), or
Louisville (4)
 On the 2nd day, you can reach Kansas City (5), Omaha (6),
or Dallas (7)
 On the 3rd day, you can reach San Antonio (8) or Denver (9)
 On the 4th day, you can reach LA
 To minimize your gas expenses, you are looking for
the route of minimum length
Example:
Travel from coast to coast
 We can classify the cities as follows:
 Call all cities that you can be in at the
beginning of your nth day the Stage n cities
 The idea of solving this problem by
dynamic programming is to start by
solving easy problems that will eventually
help you solve the entire problem
 In particular, we will work backward
Example:
Travel from coast to coast
 Denote the distance between city i and
city j by ci,j
 If city i is a stage t city, we denote the
length of the shortest path from city i to
LA by ft(i)
 Clearly, we would like to find f0(1)
Example:
Travel from coast to coast
 First, find the shortest path to LA from
each of the cities from which you can
reach LA in 1 day – the stage 4 cities
 Note that these problems are trivial, since in
each case there’s only 1 way to go to LA
 More formally,
 f4(8) = c8,10
 f4(9) = c9,10
Example:
Travel from coast to coast
 Then, find the shortest path to LA from
each of the stage 3 cities
 Note that this means that you should first go
to a stage 4 city, and then use the shortest
path from this stage 4 city to LA
 These problems are not as trivial as the first
ones, but by simply looking at all possible city
4 problems and the solutions to the first set
of problems this remains relatively easy
Example:
Travel from coast to coast
 From each stage 3 city
 go to a stage 4 city, and then use the
shortest path from this stage 4 city to LA
 So, for example, f3(5) is equal to
 c5,8 + f4(8), or
 c5,9 + f4(9)
 Since we’re interested in the shortest
path, we have
 f3(5) = min{c5,8 + f4(8) , c5,9 + f4(9)}
Example:
Travel from coast to coast
 Perform the same procedure for the stage
2 cities
 Perform the same procedure for the stage
1 city, NYC
 From NYC you should first go to a stage 2
city, and then use the shortest path from this
stage 2 city to LA
 We can find the best route from NYC to LA
by considering all possible stage 2 cities
Example:
Travel from coast to coast
 In general, in stage t we are interested in
finding ft(i) for all stage t cities i
 Using the earlier approach, we can write
 ft(i) = minj: j is a stage t+1 city {ci,j + ft+1(j) }
for all stage t cities i
Computational efficiency of
dynamic programming
 In the example, we could simply enumerate all
possible paths from NYC to LA
 It is easy to see that there are 3x3x2=18 paths
 However, suppose that we have more options:
 Starting city is again stage 1
 5 cities in each of 5 stages (stages 2,…,6)
 Destination city is stage 7
 Then there are 55=3,125 paths
 Determining the length of each of these paths takes
a total of 5x55 = 15,625 additions and 3,124
comparisons
Computational efficiency of
dynamic programming
 How much work is the dynamic
programming algorithm?
 The stage 6 problems are trivial
 Each of the other problems require
 5 additions (potential choices for next city to
visit) and 4 comparisons
 For a total of 4x5x5 + 5 = 105 additions and
4x5x4 + 4 = 84 comparisons
Characteristics of dynamic
programming
 The problem should have stages
 Each stage corresponds to a point at which a
decision needs to be made
 Each stage should have a number of
associated states
 The state contains all information that is
needed to make an optimal decision for the
remaining problem
Characteristics of dynamic
programming
 The decision chosen at each stage
describes how the state at the current
stage is transformed in the state at the
next stage
 The optimal decision at the current state
should not depend on previously visited
states or previous decisions
 This is called the principle of optimality
Characteristics of dynamic
programming
 There must be a recursion that relates
the cost or reward for stages t, t+1, …, T
to the cost or reward for stages t+1, t+2,
…, T
 This recursion formalizes the procedure of
working backwards from the last stage to the
first stage
Dynamic programming
formulation
 Stages: t =1,…,5
 States: city
 Decision in each stage:
 Choose the stage t+1 city to go to
 Dynamic programming recursion:
 f4(i) = ci,10 for all stage 4 cities i
 ft(i) = minj: j is a stage t+1 city {ci,j + ft+1(j) }
for all stage t cities i
Dynamic programming
without stages
 You must drive from Bloomington to
Cleveland
 You are interested in the route that takes
the least amount of time
Dynamic programming
without stages

Gary Toledo Cleveland


3 hours 3 hours

2 hours 1 hour 3 hours

Indianapolis Dayton Columbus


3 hours 2 hours

1 hour 2 hours
2.5 hours
Bloomington Cincinnati
3 hours
Production & inventory
planning
 Consider the following production & inventory
planning problem for a single item:
 Consider a planning period of T periods, and assume
that
 the demand for the item in each of the periods is known
 the initial inventory level is known
 At the start of each period, you must decide how
many units to produce; production capacity is limited
 Each period’s demand must be met on time
 There is a limited amount of storage space available
 The goal is to minimize the total production &
inventory costs over the planning horizon
Production & inventory
planning
 This is a periodic review model
 Denote the demand in period t by dt
(t =1,…,T )
 Denote the cost of producing x units in
period t by ct(x) (often, this function is
independent of t, i.e., ct(x)=c(x) )
 If at the end of period t the inventory
level is I, a cost of ht(I) is charged (often,
these costs are independent of t, i.e.,
ht(I)=h(I) )
Production & inventory
planning
 If the production and inventory holding
cost functions are linear, we can
formulate this problem as an LP problem
(how?)
 Often, the production costs are assumed
to have a fixed-charge structure:
 c(x) = 0 if x = 0, c(x) = a + bx if x > 0
 In that case, we can formulate this problem
as a mixed-integer LP problem (how?)
Production & inventory
planning
 More generally, we can formulate this
problem as an NLP problem (how?)
 Production (and inventory) costs are
often assumed to be concave – reflecting
economies of scale
 What does that mean for the ease of
solvability of the NLP problem?
Production & inventory
planning
 NLP formulation:
T T
min  ct ( xt )   h (I )
t t
t 1 t 1

subject to
It 1  xt  dt  It t  1,..., T
0  It  B t  1,..., T
0  xt  C t  1,..., T
Production & inventory
planning
 Dynamic programming provides a solution
methodology that can be applied for
general cost functions
 We only need to assume that the units of
demand, production, and inventory are
integers – which is not unrealistic in many
practical situations
 This methodology will be efficient if the
magnitude of the numbers involved is not too
large
Production & inventory
planning
 We must identify:
 Stages
time: t  1,..., T
 States
(starting) inventory level: I  0,..., B
 Decisions
production quantity: x  0,..., C
 Recursion
minimal cost from start of stage t : ft (I )
 Clearly, we are looking for f1(I0)
Production & inventory
planning
 Recursion:
 Cost at the beginning of stage T :

fT (I )  min  cT (x)  hT (I  x  dT )
0 x C

 Note that you will always want to end up


with 0 inventory – so the final period’s
production will be x  dT  I
 So: fT (I )  cT (dT  I )
 (what if dT-I < 0 or dT-I > C ???)
Production & inventory
planning
 Recursion:
 Cost at the beginning of stage t :
ft (I )  min
max(0,d  I ) x min(C ,d  B  I )
 ct ( x)  ht (I  x  dt )
 ft 1(I  x  dt )
t t

 We have to make sure that we have


sufficient storage capacity, i.e., we need
I  x  dt  B or x  dt  B  I
 We have to make sure that we deliver on
time I  x  d  0 or x  d  I
t t
Production & inventory
planning example
 Example:
 T = 4 periods
 Demands: 1, 4, 2, 3
 Inventory holding costs: $0.50 per unit
 Production costs:
 fixed setup cost $3
 variable cost $1 per unit
 Production capacity C = 5 units
 Inventory capacity B = 4 units
Production & inventory
planning
 Initialization: T = 4
 d4 = 3
 Cost at the beginning of stage T = 4 :
fT (I )  cT (dT  I )
 I = 0: f4(0) = c4(3-0) = 3 + 31 = 6
 I = 1: f4(1) = c4(3-1) = 3 + 21 = 5
 I = 2: f4(2) = c4(3-2) = 3 + 11 = 4
 I = 3: f4(3) = c4(3-3) = 0
 I = 4: f4(4) = c4(3-4) = ???  0
Production & inventory
planning
 Next stage: t = 3
 d3 = 2
 Cost at the beginning of stage t = 3:

ft (I )  min
max(0,dt  I ) x min(C ,dt  B  I )

 ct (x)  ht (I  x  dt )  ft 1(I  x  dt )
f3 (I )  min
max(0,2  I ) x min(5,2  4  I )

 c3( x)  h3(I  x  2)  f4(I  x  2)


Production & inventory
planning
 I = 0:

f3 (0)  min
max(0,2 0) x min(5,2  4 0)

 c3(x)  h3(0  x  2)  f4(0  x  2)

f3 (0)  min  c3 (x)  12 (x  2)  f4 ( x  2)


2 x 5
Production & inventory
planning
 I = 0:
f3 (0) 
x  2: 2  1 12  2  f4 (0)  5  6  11

x  3: 2  1 12  3  f4 (1)  5 12  5  10 12
min 
x  4: 2  1 12  4  f4 (2)  6  4  10
 x  5: 2  1 12  5  f4 (3)  6 12  0  6 12
Production & inventory
planning
 I = 2:
f3 (2)  min
max(0,2 2) x min(5,2  4 2)

 c3(x)  h3(2  x  2)  f4(2  x  2)


f3 (2)  min  c3 (x )  1
2
x  f4 (x )
0 x  4


f3 (2)  min f4 (0),min  3  1 12 x  f4 ( x)
1 x  4

Production & inventory
planning
 I = 2:
f3 (2) 
x  0 : f4 (0)  6
x  1 : 3  1 12  1  f4 (1)  4 12  5  9 12

min  x  2 : 3  1 12  2  f4 (2)  4 12  4  8 12
x  3 : 3  1 12  3  f4 (3)  4 12  0  4 12

 x  4 : 3  1 12  4  f4 (4)  4 12  0  4 12
Production & inventory
planning
 Network representation:
 Nodes: stage/state combinations (t,I)
 Arcs: decisions x
 Arc from node (t,I) corresponding to decision
x leads to node (t+1,I+x-dt)
 Cost of this arc is ct(x) + ht(I+x-dt)
Resource allocation:
the knapsack problem (1)
 Stockco is considering 4 investments
 Investment 1 will yield a NPV of $16K, but
requires a cash outflow of $5K
 Investment 2 will yield a NPV of $22K, but
requires a cash outflow of $7K
 Investment 3 will yield a NPV of $12K, but
requires a cash outflow of $4K
 Investment 4 will yield a NPV of $8K, but
requires a cash outflow of $3K
 You have a budget of $14K
Resource allocation:
the knapsack problem (1)
 IP formulation:
4
max  NPVi xi  16 x1  22 x2  12 x3  8 x4
i 1

subject to
4

C x
i 1
i i  5x1  7 x2  4 x3  3x4  14

xi  {0,1} i  1,..., 4
Resource allocation:
the knapsack problem (2)
 You are planning an overnight hike, and are
considering taking 4 items along on your trip
 Item 1 yields a “benefit” of 16, but weighs 5 lbs
 Item 2 yields a “benefit” of 22, but weighs 7 lbs
 Item 3 yields a “benefit” of 12, but weighs 4 lbs
 Item 4 yields a “benefit” of 8, but weighs 3 lbs
 You do not want to carry more than 14 lbs
 You want to maximize your “benefit”
 Mathematically, this is the same problem as the
investment problem!
Resource allocation:
more general
 Stockco is considering n investments
 Investment n will yield a NPV of rn(dn) when
dn$1,000 is invested
 You only want to (or can) invest in integer multiples of
$1,000
 You have a budget of B  $1,000
 Example
 n = 3, B = 6
 r1(d1) = 7d1+2 (d1>0), r1(0) = 0
 r2(d2) = 3d2+7 (d2>0), r2(0) = 0
 r3(d3) = 4d3+5 (d3>0), r3(0) = 0
Resource allocation:
more general
 NLP formulation:
n
max  ri (di )
i 1

subject to
n

d
i 1
i B

di  {0,1,...} i  1,..., n
Resource allocation
 To formulate this problem as a DP problem, we
must identify:
 Stages
investment categories: i  1,2,3
 States
budget available: y  0,1,...,6
 Decisions
investment amount: d  0,1,...,6
 Recursion
maximal return from inv. categories i,…,3 : fi (y )
 Clearly, we are looking for f1(6)
Resource allocation
 Recursion:
 Return from investment in category 3 only:
f3 (y )  max r3 (d )
0 d  y
 Note that you will always invest all remaining
budget in category 3 at this stage, i.e., d=y

0 if y  0
f3 (y )  r3 (y )  
4y  5 if y  1,...,6
Resource allocation
 Recursion:
 Return from investment in categories 2 and 3:
f2 (y )  max  r2 (d )  f3(y  d )
0 d  y
 These subproblems are a little harder…
 y=0: f2(0) = 0
 y=1: f2(1) =max(r2(0)+f3(1),r2(1)+f3(0))
=max(0+9,10+0) = 10
 y=2: f2(2) =
max(r2(0)+f3(2),r2(1)+f3(1),r2(2)+f3(0))
=max(0+13,10+9,13+0) = 19
Resource allocation
 Network representation:
 Nodes: stage/state combinations (i,y)
 Arcs: decisions d
 Arc from node (i,y) corresponding to decision
x leads to node (i+1,y-d)
 Return of this arc is ri(d)
Resource allocation:
even more general
 NLP formulation:
n
max  ri (di )
i 1

subject to
n

 g (d )  B
i 1
i i

di  {0,1,..., Ui } i  1,..., n
 Find the DP formulation for this general
case
Equipment replacement
problem
 A company faces the problem of how long a
machine should be utilized before it should be
traded in for a new one
 Example
 A new machine costs p=$1,000, and has a useful
lifetime of 3 years
 Maintaining a machine during its first 3 years costs
m1=$60, m2=$80, m3=$120, respectively
 If a machine is traded in, a salvage value is
obtained: s1=$800, s2=$600, and s3=$500,
respectively, after the first 3 years
Equipment replacement
problem
 We currently have a y year old machine
 Find a policy that minimizes total net costs
over the next 5 years
Equipment replacement
problem
 To formulate this problem as a DP problem, we
must identify:
 Stages
time: t  0,1,...,5
 States
age of machine: y  0,1,2,3
 Decisions
keep or trade-in: d  0,1
 Recursion
minimal net cost after period t : ft (y )
 Clearly, we are looking for f0(y)
Equipment replacement
problem
 Recursion:
 Note that you will always salvage the
machine at the end of year 5:
 Net cost after period 5:
f5 (y )  sy y  1,2,3
Equipment replacement
problem
 Recursion:
 At the end of period t < 5, you must decide
whether to keep or trade-in the machine
 If y=3, you must trade it in

ft (3)  s3  p  ft 1(1) t  0,1,..., 4


 If y<3, you have a real choice:
 x  0 : my  ft 1(y  1)
ft (y )  
 x  1 :  sy  p  ft 1(1)
y  1,2; t  0,1,..., 4
Equipment replacement
problem
 Network representation:
 Nodes: stage/state combinations (t,y)
 Arcs: decisions x
 Arc from node (t,y) corresponding to decision
 x=0 leads to node (t+1,y+1)
 x=1 leads to node (t+1,1)
 Return of the arc is
 my when x=0
 – sy + p when x=1

You might also like