Strategic and Tactical Planning: A Glimpse of the Future

Zinovi Rabinovich
Jeffrey S. Rosenschein
School of Engineering and Computer Sciences

Hebrew University in Jerusalem

Agenda
Planning - the common view
Blocks world example
Driving a car
Strategic vs. Tactical Planning
’Let it be’ plans
Tactics - proposed solution
Potential applicability

Planning - AI inheritance
Recall the most classical AI view of the world: State
Oriented Domains (SOD)[3]:

A world is perceived to be in one of a preset
group of states
and a set of actions is provided to shift the world
from one state to another

Notice that this is a very convenient kind of world:
The world’s state is known
Actions perform exactly as prescribed
The closed-world assumption holds

Planning - AI inheritance (cont)
In SODs, the aim of planning is to move the world
from one state to another in some optimal way.

Thus, a plan is a sequence of actions that brings the
world from the current state to a certain target state.
The usual measure of optimality for such plans is the
number of steps (actions) the plan prescribes to
reach the desired state.
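
A minimal sketch of this classical view, assuming a small Blocks World with three table positions. Breadth-first search returns a plan with the fewest actions, which matches the step-count optimality measure above. The state encoding, the function names, and the start and goal configurations are illustrative assumptions, not taken from the slides.

```python
from collections import deque


def shortest_plan(start, goal, successors):
    """Breadth-first search: return a shortest list of actions, or None."""
    frontier = deque([start])
    parent = {start: None}  # state -> (previous state, action taken)
    while frontier:
        state = frontier.popleft()
        if state == goal:
            plan = []
            while parent[state] is not None:
                state, action = parent[state]
                plan.append(action)
            return plan[::-1]
        for action, nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = (state, action)
                frontier.append(nxt)
    return None


def blocks_successors(state):
    """A state is a tuple of stacks (bottom to top), one per table position."""
    for i, src in enumerate(state):
        if not src:
            continue
        for j in range(len(state)):
            if i != j:
                stacks = list(state)
                stacks[i] = src[:-1]                 # lift the top block
                stacks[j] = state[j] + (src[-1],)    # put it on position j
                yield f"move {src[-1]} from {i + 1} to {j + 1}", tuple(stacks)


# Assumed configurations: white at 1, black at 2, gray at 3;
# the target has white on gray at 2 and black on the table at 3.
start = (("white",), ("black",), ("gray",))
goal = ((), ("gray", "white"), ("black",))
plan = shortest_plan(start, goal, blocks_successors)
```

The search relies on exactly the three "nice world" assumptions: the state is known, actions are deterministic, and nothing changes unless the planner acts.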

Blocks World
For example consider the Blocks World domain...

Plan:
Move black from 2 onto white at 1
Move gray from 3 onto table at 2
Move black from 1 onto table at 3
Move white from 1 onto gray at 2

Rigidity of approach
However, all the major assumptions of SODs break down
as we move toward a more realistic setup.

Our sensory system is clogged by noise and is
subject to aliasing
Actions are seldom accurate and tend to have side
effects
The world keeps ’spinning’ without any intervention
from us

Planning solutions to world dynamics
Conditional plans [2, 1] help deal with
contingencies of plan execution
Partial (Global) Planning uses multiple
participating entities and develops the plan on the fly
Many others: planning via negotiation, mixed-initiative
planning, etc.
... but they all assume that we would like to keep the world
state under control, or at least within certain descriptive
bounds...

Driving a car
Imagine a car running on a single-lane road, e.g. a Formula
One race car
What is the set of all possible states?
The set of all possible margins from the road edge
We can discretize the domain for simplicity
What is the subset of all states we’d like to be in?
Those in the middle of the road, far away
from the edges, raw ground and pedestrians

Driving a car (cont)
Consider the following plan: push the car into (roughly)
the middle of the road and leave it there

The plan achieves the presence of a car in the middle
of the road
The plan almost always works (we cannot rule out the case where a
16-ton truck propels the car into oblivion) and does not need revision
Though the plan was correct, we did not mean for the car
to stay stationary...
So what happened?

Driving a (moving) car
There are actually two different levels of reasoning for
driving a (moving) car:

The reason for being in the car - the desire to trace a
trajectory from point A to point B over time
The reason for adjusting the wheels - forcing the car to
stay on a given trajectory over time
We do not have a stationary goal margin (or margins) to the road edges.
Rather, we would like the margin to develop according to a certain design.

Strategic vs. Tactical Planning
Planning (especially in a dynamic environment) is
(roughly) a two-level construction:
Strategic - high level - transforming system goals
into desired system dynamics
Tactical - low level - building a sequence of actions
that attempts to force the system into the desired form
of dynamics.

Strategic/Tactical Loop
The two levels of the hierarchy create a continual flow of
planning:
Given the global goal and the previous success in following
strategic directives, update the strategy or formulate an
alternative one
Given a strategy, provide tactical (= implementation)
support and an evaluation of feasibility
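
The loop can be sketched on the driving example, assuming a discretized margin from the road edge. Everything here is a hypothetical illustration: the strategic level turns a goal margin into a desired margin trajectory, the tactical level picks a steering action for each step, and a large deviation is reported upward and triggers a strategic update. The function names, the three steering actions, and the tolerance are assumptions.

```python
def strategic_level(goal_margin, current_margin, horizon):
    """Turn a goal into desired dynamics: approach the goal one cell per step."""
    trajectory = []
    margin = current_margin
    for _ in range(horizon):
        if margin < goal_margin:
            margin += 1
        elif margin > goal_margin:
            margin -= 1
        trajectory.append(margin)
    return trajectory


def tactical_level(margin, target):
    """Pick the steering action whose predicted margin is closest to the target."""
    return min((-1, 0, 1), key=lambda steer: abs(margin + steer - target))


def drive(goal_margin, start_margin, disturbances, tolerance=1):
    """Run the loop; disturbances model the world moving without our intervention."""
    margin = start_margin
    trajectory = strategic_level(goal_margin, margin, len(disturbances))
    history, replans = [], 0
    for t, disturbance in enumerate(disturbances):
        steer = tactical_level(margin, trajectory[t])
        margin += steer + disturbance
        history.append(margin)
        if abs(margin - trajectory[t]) > tolerance:
            # Tactical failure: the strategic level reformulates the remaining trajectory.
            remaining = len(disturbances) - (t + 1)
            trajectory[t + 1:] = strategic_level(goal_margin, margin, remaining)
            replans += 1
    return history, replans
```

In this sketch re-planning is an ordinary branch of the loop rather than an exceptional event.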

Compare:
In classical planning the tactical level is degenerate
Even in conditional planning we allow the system to
develop freely and simply describe the desired
continuation for each contingency
A strong ’tactical’ level allows re-planning procedures
(should they occur) to be part of the standard planning
loop
A (strategic) plan failure that was previously exceptional, radical
and potentially fatal now becomes a common, mild, recoverable
situation, part of normal activity

Formalism
To formalize the tactical-level operations we use a
POMDP-like description.
Given a system described by:
A set of possible states
A set of possible control actions
System transition dynamics
An initial state
A set of possible observations
Observation probabilities
A strategic target
Find the sequence of actions such that the observed system dynamics
are as close as possible to the strategic target, i.e. minimize the tactical
distance

Stayin’ alive plans
How can we measure the distance between two
functions?
The functions are actually probability distributions, so use the
Kullback-Leibler distance
How do we treat time and value over time?
Compute the resulting probability distribution of the
distance and keep the probability of breaking a
threshold low - just stay alive
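
A minimal sketch of the two quantities above for discrete distributions. The Kullback-Leibler divergence is computed with its standard definition. The second function is an assumption about how "the probability of breaking a threshold" might be estimated, here from sampled distance values. Note that the divergence is not symmetric, so it is a distance only in a loose sense.

```python
import math


def kl_divergence(p, q):
    """D_KL(p || q) for two discrete distributions given as equal-length lists."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi == 0:
                return math.inf  # p puts mass where q puts none
            total += pi * math.log(pi / qi)
    return total


def prob_threshold_broken(distance_samples, threshold):
    """Empirical probability that the distance exceeds the threshold."""
    broken = sum(1 for d in distance_samples if d > threshold)
    return broken / len(distance_samples)
```

A "stay alive" criterion would then require `prob_threshold_broken(...)` to remain below some small bound, instead of minimizing an accumulated cost.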

Proposed solution
Keep track of the probable system state (a belief over states)
Keep track of the estimated system transitions
Given the current beliefs, select an action:
[Action-selection equation not recoverable from the extracted slide]
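
The selection rule itself did not survive extraction, so the following is only one plausible reading, stated as an assumption: choose the action whose estimated transition distribution, averaged over the current state belief, is closest in Kullback-Leibler terms to the strategic target dynamics. The data layout and all names are hypothetical.

```python
import math


def kl(p, q):
    """D_KL(p || q) for discrete distributions given as lists."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi == 0:
                return math.inf
            total += pi * math.log(pi / qi)
    return total


def select_action(belief, est_transitions, target):
    """
    belief:          belief[s] = probability that the system is in state s
    est_transitions: est_transitions[a][s] = estimated next-state distribution
    target:          target[s] = desired next-state distribution (strategic target)
    Returns the action with the smallest belief-weighted divergence from the target.
    """
    def expected_distance(action):
        return sum(
            b * kl(target[s], est_transitions[action][s])
            for s, b in enumerate(belief)
            if b > 0
        )

    return min(est_transitions, key=expected_distance)


# Two states; the strategic target asks the system to end up in state 1.
target = [[0.0, 1.0], [0.0, 1.0]]
est_transitions = {
    "stay": [[0.9, 0.1], [0.1, 0.9]],
    "go": [[0.2, 0.8], [0.1, 0.9]],
}
chosen = select_action([0.5, 0.5], est_transitions, target)
```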

But how can we keep track of the state belief and the transition estimates?
Proposed solution (cont)
Initialize your beliefs to some prior distribution
Use “Bayesian anti-aliasing” for the state belief:
[Belief-update equation not recoverable from the extracted slide]
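
The slide's update formula is lost, so this sketch substitutes the standard Bayes filter for a discrete POMDP as an assumed stand-in for "Bayesian anti-aliasing": predict the next-state distribution with the estimated transitions, then reweight by the likelihood of the received observation. The argument names and data layout are hypothetical.

```python
def update_belief(belief, action, observation, transitions, obs_probs):
    """
    belief:      belief[s] = current probability of state s
    transitions: transitions[action][s][s2] = P(next state s2 | state s, action)
    obs_probs:   obs_probs[s2][observation] = P(observation | state s2)
    Returns the posterior belief over states after acting and observing.
    """
    n = len(belief)
    # Prediction step: push the belief through the transition model.
    predicted = [
        sum(transitions[action][s][s2] * belief[s] for s in range(n))
        for s2 in range(n)
    ]
    # Correction step: weight each state by how well it explains the observation.
    unnormalized = [obs_probs[s2][observation] * predicted[s2] for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("observation has zero probability under the current belief")
    return [u / total for u in unnormalized]


# Two aliased states whose noisy sensor reports the true state 80% of the time.
transitions = {"stay": [[1.0, 0.0], [0.0, 1.0]]}
obs_probs = [[0.8, 0.2], [0.2, 0.8]]
posterior = update_belief([0.5, 0.5], "stay", 0, transitions, obs_probs)
```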
For the transition estimates, solve the following constrained optimization:
[Objective and constraints (s.t.) not recoverable from the extracted slide]

Conditional applicability
Consider a multi-agent system with communication under the
following assumptions:
Communication activity does not change the environment and
is integrated into the global action set
Action cost and state transition evaluation are separable:
if we denote the overall value of a transition from one state
to another under a given action, then a decomposition exists:
[Decomposition equation not recoverable from the extracted slide]
Then it is possible to create (optimal) control of the system using the
above strategic vs. tactical planning paradigm

MAS Strategic vs. Tactical protocol
The strategic levels of the different agents agree upon a
target in a communication session
Basically, they create a common evaluation function
and convert it into a target distribution
Under the assumption of complete coordination, an
agent will use the proposed tactical planning to comply
with the strategy
A strong failure of the strategy will initiate communication
to repeat the strategic-layer operation

Tactical communication timing
Communication cost is equivalent to one decision step:
Tactical-level communication will arise
automatically as the sole action that cannot
hinder the system development
dynamics
[Action-selection equation not recoverable from the extracted slide]

Tactical communication timing
Communication cost is an elaborate function:
Convert the function into a distribution
Keep track of the action usage distribution
Use the following to select an action:
[Action-selection equation not recoverable from the extracted slide]

Conclusion
A novel view of planning and plans was introduced.
A new optimality measure for agent behavior in a
stochastic environment was developed in the
framework of continual planning.
A feasible algorithm for utilizing multi-agent communication
under the measure was proposed.

Future Work
Investigate the connections between the classical
optimality measure(s) and tactical distance
Prove or disprove that the proposed tactical solution is
optimal relative to tactical distance
Investigate the effects of the tactical solution on
multi-agent systems with communication

References
[1] Jim Blythe. An overview of planning under
uncertainty. volume 1600 of Lecture Notes in
Computer Science, pages 85–?? 1999.
[2] Craig Boutilier, Thomas Dean, and Steve Hanks.
Decision-theoretic planning: structural assumptions
and computational leverage. Journal of Artificial
Intelligence Research, 11:1–94, 1999.
[3] Jeffrey S. Rosenschein and Gilad Zlotkin. Rules of
Encounter: Designing Conventions for Automated
Negotiation Among Computers. MIT Press,
Cambridge, Massachusetts, 1994.
