2.1.1 Informed (Heuristic) Search Strategies
2.1.2 Heuristic Functions
2.1.3 Local Search Algorithms and Optimization Problems
2.1.4 Local Search in Continuous Spaces
2.1.5 Online Search Agents and Unknown Environments
2.2 CONSTRAINT SATISFACTION PROBLEMS (CSP)
2.2.1 Constraint Satisfaction Problems
2.2.2 Backtracking Search for CSPs
2.2.3 The Structure of Problems
2.3 ADVERSARIAL SEARCH
2.3.1 Games
2.3.2 Optimal Decisions in Games
2.3.3 Alpha-Beta Pruning
2.3.4 Imperfect, Real-Time Decisions
2.3.5 Games that Include an Element of Chance

------------------------------------------------------------------------

2.1 INFORMED SEARCH AND EXPLORATION

2.1.1 Informed (Heuristic) Search Strategies

An informed search strategy is one that uses problem-specific knowledge beyond the definition of the problem itself. It can find solutions more efficiently than an uninformed strategy.

Best-first search
Best-first search is an instance of the general TREE-SEARCH or GRAPH-SEARCH algorithm in which a node is selected for expansion based on an evaluation function f(n). The node with the lowest evaluation is selected for expansion, because the evaluation measures the distance to the goal. This can be implemented using a priority queue, a data structure that maintains the fringe in ascending order of f-values.

2.1.2 Heuristic Functions
A heuristic function, or simply a heuristic, is a function that ranks the alternatives at each branching step of a search algorithm, based on the available information, in order to decide which branch to follow. The key component of a best-first search algorithm is a heuristic function, denoted by h(n).
h(n) = estimated cost of the cheapest path from node n to a goal node.

For example, in Romania, one might estimate the cost of the cheapest path from Arad to Bucharest by the straight-line distance from Arad to Bucharest (Figure 2.1). Heuristic functions are the most common form in which additional knowledge is imparted to the search algorithm.

Greedy best-first search
Greedy best-first search tries to expand the node that is closest to the goal, on the grounds that this is likely to lead to a solution quickly. It evaluates nodes by using just the heuristic function: f(n) = h(n).
Take the route-finding problem in Romania: the goal is to reach Bucharest starting from the city Arad. We need to know the straight-line distances to Bucharest from the various cities, as shown in Figure 2.1. For example, the initial state is In(Arad), and the straight-line distance heuristic gives hSLD(In(Arad)) = 366. Using the straight-line distance heuristic hSLD, the goal state can be reached faster.

Figure 2.1 Values of hSLD – straight-line distances to Bucharest.
Figure 2.2 Stages in a greedy best-first search for Bucharest using the straight-line distance heuristic hSLD. Nodes are labeled with their h-values.

Figure 2.2 shows the progress of a greedy best-first search using hSLD to find a path from Arad to Bucharest. The first node to be expanded from Arad will be Sibiu, because it is closer to Bucharest than either Zerind or Timisoara. The next node to be expanded will be Fagaras, because it is now closest. Fagaras in turn generates Bucharest, which is the goal.

Properties of greedy best-first search
o Complete: No – it can get stuck in loops, e.g., Iasi -> Neamt -> Iasi -> Neamt. It is complete in a finite space with repeated-state checking.
o Time: O(b^m), but a good heuristic can give dramatic improvement.
o Space: O(b^m) – it keeps all nodes in memory.
o Optimal: No.

Greedy best-first search is not optimal, and it is incomplete. The worst-case time and space complexity is O(b^m), where m is the maximum depth of the search space.
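The greedy strategy above can be sketched in Python. This is a minimal illustration, not a full implementation: the hSLD values and road distances follow Figures 2.1 and 2.2, restricted to a small subgraph of the Romania map for brevity.

```python
import heapq

# Straight-line distances to Bucharest (hSLD), from Figure 2.1.
H = {'Arad': 366, 'Zerind': 374, 'Timisoara': 329, 'Sibiu': 253,
     'Oradea': 380, 'Fagaras': 176, 'Rimnicu': 193, 'Pitesti': 100,
     'Bucharest': 0}

# Road distances (step costs) between neighboring cities (partial map).
EDGES = {'Arad': {'Zerind': 75, 'Sibiu': 140, 'Timisoara': 118},
         'Sibiu': {'Arad': 140, 'Oradea': 151, 'Fagaras': 99, 'Rimnicu': 80},
         'Fagaras': {'Sibiu': 99, 'Bucharest': 211},
         'Rimnicu': {'Sibiu': 80, 'Pitesti': 97},
         'Pitesti': {'Rimnicu': 97, 'Bucharest': 101}}

def greedy_best_first(start, goal):
    """Always expand the fringe node with the smallest h-value: f(n) = h(n)."""
    fringe = [(H[start], [start])]       # priority queue ordered by h
    visited = set()                      # repeated-state checking
    while fringe:
        _, path = heapq.heappop(fringe)
        state = path[-1]
        if state == goal:
            return path
        if state in visited:
            continue
        visited.add(state)
        for nbr in EDGES.get(state, {}):
            heapq.heappush(fringe, (H[nbr], path + [nbr]))
    return None
```

On this subgraph the search expands Arad, then Sibiu, then Fagaras, reproducing the sequence described above and returning the route Arad, Sibiu, Fagaras, Bucharest.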
A* Search
A* search is the most widely used form of best-first search. The evaluation function f(n) is obtained by combining
(1) g(n) = the cost to reach the node, and
(2) h(n) = the estimated cost to get from the node to the goal:
f(n) = g(n) + h(n).
A* search is both optimal and complete. A* is optimal if h(n) is an admissible heuristic – that is, provided that h(n) never overestimates the cost to reach the goal. An obvious example of an admissible heuristic is the straight-line distance hSLD that we used in getting to Bucharest; since a straight line is the shortest path between two points, it cannot be an overestimate.
The progress of an A* tree search for Bucharest is shown in Figure 2.3. The values of g are computed from the step costs shown in the Romania map (Figure 2.1), and the values of hSLD are also given in Figure 2.1.

Recursive Best-First Search (RBFS)
Recursive best-first search is a simple recursive algorithm that attempts to mimic the operation of standard best-first search, but using only linear space. The algorithm is shown in Figure 2.4. Its structure is similar to that of recursive depth-first search, but rather than continuing indefinitely down the current path, it keeps track of the f-value of the best alternative path available from any ancestor of the current node. If the current node exceeds this limit, the recursion unwinds back to the alternative path. As the recursion unwinds, RBFS replaces the f-value of each node along the path with the best f-value of its children. Figure 2.5 shows how RBFS reaches Bucharest.

Figure 2.3 Stages in an A* search for Bucharest. Nodes are labeled with f = g + h. The h-values are the straight-line distances to Bucharest taken from Figure 2.1.
Figure 2.4 The algorithm for recursive best-first search.
Figure 2.5 Stages in an RBFS search for the shortest route to Bucharest.
The f-limit value for each recursive call is shown on top of each current node. (a) The path via Rimnicu Vilcea is followed until the current best leaf (Pitesti) has a value that is worse than the best alternative path (Fagaras). (b) The recursion unwinds and the best leaf value of the forgotten subtree (417) is backed up to Rimnicu Vilcea; then Fagaras is expanded, revealing a best leaf value of 450. (c) The recursion unwinds and the best leaf value of the forgotten subtree (450) is backed up to Fagaras; then Rimnicu Vilcea is expanded. This time, because the best alternative path (through Timisoara) costs at least 447, the expansion continues to Bucharest.

RBFS evaluation
o RBFS is somewhat more efficient than IDA*, but it still suffers from excessive node regeneration as it changes its mind.
o Like A*, it is optimal if h(n) is admissible.
o Its space complexity is O(bd), whereas IDA* retains only a single number (the current f-cost limit).
o Its time complexity is difficult to characterize: it depends on the accuracy of h(n) and on how often the best path changes.
o Both IDA* and RBFS suffer from using too little memory.

2.1.2 Heuristic Functions

Figure 2.6 A typical instance of the 8-puzzle. The solution is 26 steps long.

The 8-puzzle
The 8-puzzle is an example of a heuristic search problem. The object of the puzzle is to slide the tiles horizontally or vertically into the empty space until the configuration matches the goal configuration (Figure 2.6).
The average solution cost for a randomly generated 8-puzzle instance is about 22 steps. The branching factor is about 3. (When the empty tile is in the middle, there are four possible moves; when it is in a corner, there are two; and when it is along an edge, there are three.) This means that an exhaustive search to depth 22 would look at about 3^22, approximately 3.1 x 10^10, states.
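The A* combination f(n) = g(n) + h(n) introduced above can be sketched concretely on the Romania data. As with the earlier greedy sketch, this is an illustrative subgraph, not the full map; hSLD values and step costs are taken from Figure 2.1.

```python
import heapq

# Straight-line distances to Bucharest (hSLD), from Figure 2.1.
H = {'Arad': 366, 'Sibiu': 253, 'Timisoara': 329, 'Zerind': 374,
     'Fagaras': 176, 'Rimnicu': 193, 'Pitesti': 100, 'Bucharest': 0}

# Step costs along the two candidate routes from Arad to Bucharest.
EDGES = {'Arad': {'Sibiu': 140, 'Timisoara': 118, 'Zerind': 75},
         'Sibiu': {'Fagaras': 99, 'Rimnicu': 80},
         'Fagaras': {'Bucharest': 211},
         'Rimnicu': {'Pitesti': 97},
         'Pitesti': {'Bucharest': 101}}

def a_star(start, goal):
    """Expand the node with the smallest f = g + h; admissible h => optimal."""
    fringe = [(H[start], 0, [start])]    # entries are (f, g, path)
    best_g = {}                          # cheapest g found so far per state
    while fringe:
        f, g, path = heapq.heappop(fringe)
        state = path[-1]
        if state == goal:
            return g, path
        if best_g.get(state, float('inf')) <= g:
            continue                     # already reached this state more cheaply
        best_g[state] = g
        for nbr, cost in EDGES.get(state, {}).items():
            heapq.heappush(fringe, (g + cost + H[nbr], g + cost, path + [nbr]))
    return None
```

Note the contrast with greedy best-first search: greedy follows the Fagaras route (total cost 450), while A* returns the cheaper Rimnicu Vilcea and Pitesti route, whose cost is 140 + 80 + 97 + 101 = 418.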
By keeping track of repeated states, we could cut this down by a factor of about 170,000, because there are only 9!/2 = 181,440 distinct states that are reachable. This is a manageable number, but the corresponding number for the 15-puzzle is roughly 10^13.
If we want to find the shortest solutions by using A*, we need a heuristic function that never overestimates the number of steps to the goal. The two commonly used heuristic functions for the 8-puzzle are:
(1) h1 = the number of misplaced tiles. In Figure 2.6, all eight tiles are out of position, so the start state has h1 = 8. h1 is an admissible heuristic.
(2) h2 = the sum of the distances of the tiles from their goal positions. This is called the city-block distance or Manhattan distance. h2 is admissible, because all any move can do is move one tile one step closer to the goal. Tiles 1 to 8 in the start state give a Manhattan distance of h2 = 3 + 1 + 2 + 2 + 2 + 3 + 3 + 2 = 18.
Neither of these overestimates the true solution cost, which is 26.

The effective branching factor
One way to characterize the quality of a heuristic is the effective branching factor b*. If the total number of nodes generated by A* for a particular problem is N, and the solution depth is d, then b*
is the branching factor that a uniform tree of depth d would have to have in order to contain N + 1 nodes. Thus,

N + 1 = 1 + b* + (b*)^2 + ... + (b*)^d

For example, if A* finds a solution at depth 5 using 52 nodes, then the effective branching factor is 1.92. A well-designed heuristic has a value of b* close to 1, allowing fairly large problems to be solved.
To test the heuristic functions h1 and h2, 1200 random problems were generated with solution lengths from 2 to 24 and solved with iterative deepening search and with A* search using both h1 and h2. Figure 2.7 gives the average number of nodes expanded by each strategy and the effective branching factor. The results suggest that h2 is better than h1, and that both are far better than iterative deepening search. For a solution length of 14, A* with h2 is 30,000 times more efficient than uninformed iterative deepening search.

Figure 2.7 Comparison of search costs and effective branching factors for the ITERATIVE-DEEPENING-SEARCH and A* algorithms with h1 and h2. Data are averaged over 100 instances of the 8-puzzle, for various solution lengths.

Inventing admissible heuristic functions
Relaxed problems
o A problem with fewer restrictions on the actions is called a relaxed problem.
o The cost of an optimal solution to a relaxed problem is an admissible heuristic for the original problem.
o If the rules of the 8-puzzle are relaxed so that a tile can move anywhere, then h1(n) gives the length of the shortest solution.
o If the rules are relaxed so that a tile can move to any adjacent square, then h2(n) gives the length of the shortest solution.

2.1.3 LOCAL SEARCH ALGORITHMS AND OPTIMIZATION PROBLEMS
o In many optimization problems, the path to the goal is irrelevant; the goal state itself is the solution.
o For example, in the 8-queens problem, what matters is the final configuration of queens, not the order in which they are added.
o In such cases, we can use local search algorithms.
They operate using a single current state (rather than multiple paths) and generally move only to neighbors of that state.
o Important applications of this class of problems include (a) integrated-circuit design, (b) factory-floor layout, (c) job-shop scheduling, (d) automatic programming, (e) telecommunications network optimization, (f) vehicle routing, and (g) portfolio management.

Key advantages of local search algorithms
(1) They use very little memory – usually a constant amount; and
(2) they can often find reasonable solutions in large or infinite (continuous) state spaces for which systematic algorithms are unsuitable.

OPTIMIZATION PROBLEMS
In addition to finding goals, local search algorithms are useful for solving pure optimization problems, in which the aim is to find the best state according to an objective function.

State space landscape
Local search is best explained using the state space landscape shown in Figure 2.8. A landscape has both "location" (defined by the state) and "elevation" (defined by the value of the heuristic cost function or objective function). If elevation corresponds to cost, then the aim is to find the lowest valley – a global minimum; if elevation corresponds to an objective function, then the aim is to find the highest peak – a global maximum. Local search algorithms explore this landscape. A complete local search algorithm always finds a goal if one exists; an optimal algorithm always finds a global minimum/maximum.

Figure 2.8 A one-dimensional state space landscape in which elevation corresponds to the objective function. The aim is to find the global maximum. Hill-climbing search modifies the current state to try to improve it, as shown by the arrow. The various topographic features are defined in the text.

Hill-climbing search
The hill-climbing search algorithm, shown in Figure 2.9, is simply a loop that continually moves in the direction of increasing value – that is, uphill.
It terminates when it reaches a "peak" where no neighbor has a higher value.

function HILL-CLIMBING(problem) returns a state that is a local maximum
  inputs: problem, a problem
  local variables: current, a node
                   neighbor, a node
  current <- MAKE-NODE(INITIAL-STATE[problem])
  loop do
    neighbor <- a highest-valued successor of current
    if VALUE[neighbor] <= VALUE[current] then return STATE[current]
    current <- neighbor

Figure 2.9 The hill-climbing search algorithm (steepest-ascent version), which is the most basic local search technique. At each step the current node is replaced by the best neighbor; that is, the neighbor with the highest VALUE. If a heuristic cost estimate h were used instead, we would find the neighbor with the lowest h.

Hill-climbing is sometimes called greedy local search because it grabs a good neighbor state without thinking ahead about where to go next. Greedy algorithms often perform quite well.

Problems with hill-climbing
Hill-climbing often gets stuck for the following reasons:
o Local maxima: a local maximum is a peak that is higher than each of its neighboring states, but lower than the global maximum. Hill-climbing algorithms that reach the vicinity of a local maximum will be drawn upward toward the peak, but will then be stuck with nowhere else to go.
o Ridges: a ridge is shown in Figure 2.10. Ridges result in a sequence of local maxima that is very difficult for greedy algorithms to navigate.
o Plateaux: a plateau is an area of the state space landscape where the evaluation function is flat. It can be a flat local maximum, from which no uphill exit exists, or a shoulder, from which it is possible to make progress.

Figure 2.10 Illustration of why ridges cause difficulties for hill-climbing. The grid of states (dark circles) is superimposed on a ridge rising from left to right, creating a sequence of local maxima that are not directly connected to each other. From each local maximum, all the available options point downhill.
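The steepest-ascent loop of Figure 2.9 can be sketched on the 8-queens problem. This is a hedged illustration, not the book's code: here the state is a list giving each column's queen row, the number of attacking pairs plays the role of a cost to minimize (the mirror image of maximizing VALUE), and a random-restart wrapper is included since plain hill-climbing usually stalls at a local optimum.

```python
import random

def conflicts(state):
    """Number of attacking queen pairs; state[c] = row of the queen in column c."""
    n = len(state)
    return sum(1 for a in range(n) for b in range(a + 1, n)
               if state[a] == state[b] or abs(state[a] - state[b]) == b - a)

def hill_climb(state):
    """Steepest ascent: move one queen within its column to reduce conflicts."""
    while True:
        best, best_val = state, conflicts(state)
        for col in range(len(state)):
            for row in range(len(state)):
                if row != state[col]:
                    nbr = state[:col] + [row] + state[col + 1:]
                    v = conflicts(nbr)
                    if v < best_val:
                        best, best_val = nbr, v
        if best == state:        # no better neighbor: a (possibly local) optimum
            return state
        state = best

def random_restart(n=8, seed=0):
    """Random-restart hill climbing: retry from random states until solved."""
    rng = random.Random(seed)
    while True:
        result = hill_climb([rng.randrange(n) for _ in range(n)])
        if conflicts(result) == 0:
            return result
```

Each individual climb either solves the puzzle or gets stuck at a local minimum of the conflict count; restarting from fresh random states makes eventual success overwhelmingly likely.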
Hill-climbing variations
Stochastic hill-climbing
o Chooses at random among the uphill moves.
o The selection probability can vary with the steepness of the uphill move.
First-choice hill-climbing
o Implements stochastic hill-climbing by generating successors randomly until one better than the current state is found.
Random-restart hill-climbing
o Tries to avoid getting stuck in local maxima by conducting a series of hill-climbing searches from randomly generated initial states.

Simulated annealing search
A hill-climbing algorithm that never makes "downhill" moves toward states with lower value (or higher cost) is guaranteed to be incomplete, because it can get stuck on a local maximum. In contrast, a purely random walk – that is, moving to a successor chosen uniformly at random from the set of successors – is complete, but extremely inefficient. Simulated annealing is an algorithm that combines hill-climbing with a random walk in a way that yields both efficiency and completeness.
Figure 2.11 shows the simulated annealing algorithm. It is quite similar to hill-climbing, except that instead of picking the best move, it picks a random move. If the move improves the situation, it is always accepted. Otherwise, the algorithm accepts the move with some probability less than 1. The probability decreases exponentially with the "badness" of the move – the amount dE by which the evaluation is worsened.
Simulated annealing was first used extensively to solve VLSI layout problems in the early 1980s. It has since been applied widely to factory scheduling and other large-scale optimization tasks.

Figure 2.11 The simulated annealing search algorithm, a version of stochastic hill-climbing where some downhill moves are allowed.

Genetic algorithms
A genetic algorithm (or GA) is a variant of stochastic beam search in which successor states are generated by combining two parent states, rather than by modifying a single state. Like beam search, GAs begin with a set of k randomly generated states, called the population.
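Before looking at genetic algorithms in detail, the acceptance rule of simulated annealing described above can be sketched generically. The temperature schedule, its parameters, and the toy energy function used below are illustrative assumptions, not prescribed by the text.

```python
import math
import random

def simulated_annealing(state, energy, neighbor, t0=10.0, cooling=0.995,
                        t_min=1e-3, seed=0):
    """Pick a random move; accept it always if it lowers the energy,
    otherwise accept with probability e^(dE/T), where dE < 0 is the
    worsening and T is the (geometrically cooling) temperature."""
    rng = random.Random(seed)
    current, t = state, t0
    while t > t_min:
        nxt = neighbor(current, rng)
        delta = energy(current) - energy(nxt)   # > 0 means the move improves
        if delta > 0 or rng.random() < math.exp(delta / t):
            current = nxt
        t *= cooling                            # cooling schedule
        if energy(current) == 0:                # optional early exit at optimum
            break
    return current
```

For example, minimizing energy(x) = |x| over the integers with neighbors x - 1 and x + 1 behaves like a random walk at high temperature and like pure hill-climbing (descent) once the temperature is low.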
Each state, or individual, is represented as a string over a finite alphabet – most commonly, a string of 0s and 1s. For example, an 8-queens state must specify the positions of 8 queens, each in a column of 8 squares, and so requires 8 x log2 8 = 24 bits.

Figure 2.12 The genetic algorithm. The initial population in (a) is ranked by the fitness function in (b), resulting in pairs for mating in (c). They produce offspring in (d), which are subjected to mutation in (e).

Figure 2.12 shows a population of four 8-digit strings representing 8-queens states. The production of the next generation of states is shown in Figure 2.12(b) to (e). In (b), each state is rated by the evaluation function, or fitness function. In (c), two pairs are selected at random for reproduction, in accordance with the probabilities in (b). Figure 2.13 describes the algorithm that implements all these steps.

function GENETIC-ALGORITHM(population, FITNESS-FN) returns an individual
  inputs: population, a set of individuals
          FITNESS-FN, a function which determines the quality of an individual
  repeat
    new_population <- empty set
    loop for i from 1 to SIZE(population) do
      x <- RANDOM-SELECTION(population, FITNESS-FN)
      y <- RANDOM-SELECTION(population, FITNESS-FN)
      child <- REPRODUCE(x, y)
      if (small random probability) then child <- MUTATE(child)
      add child to new_population
    population <- new_population
  until some individual is fit enough or enough time has elapsed
  return the best individual

Figure 2.13 A genetic algorithm. The algorithm is the same as the one diagrammed in Figure 2.12, with one variation: each mating of two parents produces only one offspring, not two.

2.1.4 LOCAL SEARCH IN CONTINUOUS SPACES
We have considered algorithms that work only in discrete environments, but real-world environments are continuous. Local search in a continuous space amounts to maximizing a continuous objective function in a multi-dimensional vector space. This is hard to do in general.
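One simple workaround, estimating an empirical gradient by finite differences and then climbing uphill along it, can be sketched as follows. The objective function, step size, and perturbation size are illustrative assumptions.

```python
def empirical_gradient_ascent(f, x, step=0.1, eps=1e-6, iters=1000):
    """Hill-climb a continuous objective f over a vector x by estimating the
    gradient with finite differences instead of a closed-form derivative."""
    x = list(x)
    for _ in range(iters):
        grad = []
        for i in range(len(x)):
            bumped = x[:]                 # perturb one coordinate at a time
            bumped[i] += eps
            grad.append((f(bumped) - f(x)) / eps)
        new_x = [xi + step * gi for xi, gi in zip(x, grad)]
        if f(new_x) <= f(x):              # no uphill progress: stop at a (local) maximum
            return x
        x = new_x
    return x

# A smooth toy objective with a single peak at (1, 2), used purely for illustration.
peak = lambda v: -((v[0] - 1) ** 2 + (v[1] - 2) ** 2)
```

Like its discrete counterpart, this climber can stop at a local maximum, plateau, or ridge of the landscape; only on well-behaved objectives like the one above does it reliably approach the global peak.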
Approaches:
o Discretize the space: discretize the neighborhood of each state and apply a discrete local search strategy (e.g., stochastic hill-climbing or simulated annealing).
o When the objective resists a closed-form solution, compute an empirical gradient; this amounts to greedy hill-climbing in the discretized state space.
o The Newton-Raphson method can be employed to find maxima.
o Continuous problems have the same difficulties as discrete ones: plateaux, ridges, local maxima, and so on.

2.1.5 Online Search Agents and Unknown Environments

Online search problems
Offline search (all the algorithms so far) computes a complete solution, ignoring the environment, and then carries out the action sequence. Online search instead interleaves computation and action: compute, act, observe, compute, and so on. Online search is good for dynamic, semi-dynamic, and stochastic domains, and whenever offline search would yield exponentially many contingencies. It is necessary for exploration problems, where the states and actions are unknown to the agent and the agent must use actions as experiments to determine what to do. Examples: a robot exploring an unknown building; a classical hero escaping a labyrinth.

Assume the agent knows:
o the actions available in state s;
o the step-cost function c(s, a, s');
o whether state s is a goal state;
o whether it has visited a state s previously;
o an admissible heuristic function h(s).
Note that the agent does not know the outcome state s' for a given action a until it tries the action, and it does not know the full consequences of all actions from a state s in advance.
The competitive ratio compares the actual cost with the cost the agent would incur if it knew the search space. No agent can avoid dead ends in all state spaces (robotics examples: staircases, ramps, cliffs, rough terrain). We assume the state space is safely explorable – some goal state is always reachable.

Online search agents
Interleaving planning and acting hamstrings offline search: A* expands arbitrary nodes without waiting for the outcome of an action, whereas an online algorithm can expand only the node it physically occupies. It is therefore best to explore nodes in a physically local order, which suggests depth-first search, since the next node expanded is always a child of the current one. When all actions in a state have been tried, the agent cannot simply
drop the state from consideration; it must physically backtrack.

Online depth-first search
o May have an arbitrarily bad competitive ratio (it can wander far past the goal).
o Acceptable for exploration; bad for minimizing path cost.
Online iterative-deepening search
o The competitive ratio stays small when the state space is a uniform tree.

Online local search
Hill-climbing search
o Also has physical locality in its node expansions and is, in fact, already an online search algorithm.
o Local maxima are problematic: we cannot randomly transport the agent to a new state in an effort to escape a local maximum.
Random walk as an alternative
o Select an action at random from the current state.
o Will eventually find a goal node in a finite space, but can be very slow, especially if "backward" steps are as common as "forward" steps.
Hill-climbing with memory instead of randomness
o Store a "current best estimate" of the cost to the goal at each visited state; the starting estimate is just h(s).
o Update the estimate based on experience in the state space; this tends to "flatten out" local minima, allowing progress.
o Employ optimism under uncertainty: untried actions are assumed to have the least possible cost, which encourages exploration of untried paths.

Learning in online search
o Rampant ignorance is a ripe opportunity for learning. The agent learns a "map" of the environment: the outcome of each action in each state.
o Local search agents improve the accuracy of their evaluation function by updating the estimated value at each visited state.
o We would like the agent to infer a higher-level domain model – for example, that "Up" in a maze search increases the y-coordinate. This requires (a) a formal way to represent and manipulate such general rules (so far, the rules have been hidden within the successor function), and (b) algorithms that can construct general rules based on observations of the effects of actions.

2.2 CONSTRAINT SATISFACTION PROBLEMS (CSP)

A Constraint Satisfaction Problem (or CSP) is defined by a set of variables X1, X2, ..., Xn, and a set of constraints C1, C2, ..., Cm. Each variable Xi has a nonempty domain Di of possible values.
Each constraint Ci involves some subset of the variables and specifies the allowable combinations of values for that subset. A state of the problem is defined by an assignment of values to some or all of the variables, {Xi = vi, Xj = vj, ...}. An assignment that does not violate any constraints is called a consistent or legal assignment. A complete assignment is one in which every variable is mentioned, and a solution to a CSP is a complete assignment that satisfies all the constraints. Some CSPs also require a solution that maximizes an objective function.

Example of a Constraint Satisfaction Problem:
Figure 2.15(a) shows the map of Australia with its states and territories. We are given the task of coloring each region red, green, or blue in such a way that no neighboring regions have the same color. To formulate this as a CSP, we define the variables to be the regions: WA, NT, Q, NSW, V, SA, and T. The domain of each variable is the set {red, green, blue}. The constraints require neighboring regions to have distinct colors; for example, the allowable combinations for WA and NT are the pairs
{(red, green), (red, blue), (green, red), (green, blue), (blue, red), (blue, green)}.
(The constraint can also be represented more succinctly as the inequality WA != NT, provided the constraint satisfaction algorithm has some way to evaluate such expressions.) There are many possible solutions, such as
{WA = red, NT = green, Q = red, NSW = green, V = red, SA = blue, T = red}.
It is helpful to visualize a CSP as a constraint graph, as shown in Figure 2.15(b). The nodes of the graph correspond to the variables of the problem, and the arcs correspond to the constraints.

Figure 2.15 (a) The principal states and territories of Australia. Coloring this map can be viewed as a constraint satisfaction problem. The goal is to assign colors to each region so that no neighboring regions have the same color.
Figure 2.15 (b) The map-coloring problem represented as a constraint graph.

A CSP can be viewed as a standard search problem as follows:
o Initial state: the empty assignment {}, in which all variables are unassigned.
o Successor function: a value can be assigned to any unassigned variable, provided that it does not conflict with previously assigned variables.
o Goal test: the current assignment is complete.
o Path cost: a constant cost (e.g., 1) for every step.
Every solution must be a complete assignment and therefore appears at depth n if there are n variables. For this reason, depth-first search algorithms are popular for CSPs.

Varieties of CSPs
(i) Discrete variables
Finite domains
The simplest kind of CSP involves variables that are discrete and have finite domains. Map-coloring problems are of this kind. The 8-queens problem can also be viewed as a finite-domain CSP, where the variables Q1, Q2, ..., Q8 are the positions of the queens in columns 1, ..., 8, and each variable has the domain {1, 2, 3, 4, 5, 6, 7, 8}. If the maximum domain size of any variable in a CSP is d, then the number of possible complete assignments is O(d^n) – that is, exponential in the number of variables. Finite-domain CSPs include Boolean CSPs, whose variables can be either true or false.
Infinite domains
Discrete variables can also have infinite domains – for example, the set of integers or the set of strings. With infinite domains, it is no longer possible to describe constraints by enumerating all allowed combinations of values. Instead, a constraint language of algebraic inequalities must be used, such as StartJob1 + 5 <= StartJob3.
(ii) CSPs with continuous domains
CSPs with continuous domains are very common in the real world. For example, in the field of operations research, the scheduling of experiments on the Hubble Space Telescope requires very precise timing of observations; the start and finish of each observation and maneuver are continuous-valued variables that must obey a variety of astronomical, precedence, and power constraints.
The best-known category of continuous-domain CSPs is that of linear programming problems, where the constraints must be linear inequalities forming a convex region. Linear programming problems can be solved in time polynomial in the number of variables.

Varieties of constraints
(i) Unary constraints involve a single variable. Example: SA != green.
(ii) Binary constraints involve pairs of variables. Example: SA != WA.
(iii) Higher-order constraints involve 3 or more variables. Example: cryptarithmetic puzzles.

Figure 2.16 (a) A cryptarithmetic problem. Each letter stands for a distinct digit; the aim is to find a substitution of digits for letters such that the resulting sum is arithmetically correct, with the added restriction that no leading zeros are allowed. (b) The constraint hypergraph for the cryptarithmetic problem, showing the Alldiff constraint as well as the column addition constraints. Each constraint is a square box connected to the variables it constrains.

2.2.2 Backtracking Search for CSPs
The term backtracking search is used for a depth-first search that chooses values for one variable at a time and backtracks when a variable has no legal values left to assign. The algorithm is shown in Figure 2.17.

Figure 2.17 A simple backtracking algorithm for constraint satisfaction problems. The algorithm is modeled on recursive depth-first search.
Figure 2.17(b) Part of the search tree generated by simple backtracking for the map-coloring problem.

Propagating information through constraints
So far our search algorithm considers the constraints on a variable only at the time that the variable is chosen by SELECT-UNASSIGNED-VARIABLE. But by looking at some of the constraints earlier in the search, or even before the search has started, we can drastically reduce the search space.

Forward checking
One way to make better use of constraints during search is called forward checking.
Whenever a variable X is assigned, the forward-checking process looks at each unassigned variable Y that is connected to X by a constraint and deletes from Y's domain any value that is inconsistent with the value chosen for X. Figure 5.6 shows the progress of a map-coloring search with forward checking.

Constraint propagation
Although forward checking detects many inconsistencies, it does not detect all of them. Constraint propagation is the general term for propagating the implications of a constraint on one variable onto other variables.

Arc consistency
k-consistency
Local search for CSPs

2.2.3 The Structure of Problems
Problem structure
Independent subproblems
Tree-structured CSPs

2.3 ADVERSARIAL SEARCH
Competitive environments, in which the agents' goals are in conflict, give rise to adversarial search problems – often known as games.

2.3.1 Games
Mathematical game theory, a branch of economics, views any multiagent environment as a game provided that the impact of each agent on the others is "significant", regardless of whether the agents are cooperative or competitive. In AI, "games" are deterministic, turn-taking, two-player, zero-sum games of perfect information. This means deterministic, fully observable environments in which there are two agents whose actions must alternate and in which the utility values at the end of the game are always equal and opposite. For example, if one player wins a game of chess (+1), the other player necessarily loses (-1). It is this opposition between the agents' utility functions that makes the situation adversarial.

Formal definition of a game
We will consider games with two players, whom we will call MAX and MIN. MAX moves first, and then the players take turns moving until the game is over. At the end of the game, points are awarded to the winning player and penalties are given to the loser.
A game can be formally defined as a kind of search problem with the following components:
o An initial state, which includes the board position and identifies the player to move.
o A successor function, which returns a list of (move, state) pairs, each indicating a legal move and the resulting state.
o A terminal test, which determines when the game is over. States where the game has ended are called terminal states.
o A utility function (also called an objective function or payoff function), which gives a numeric value for the terminal states. In chess, the outcome is a win, loss, or draw, with values +1, -1, or 0. The payoffs in backgammon range from +192 to -192.

Game tree
The initial state and the legal moves for each side define the game tree for the game. Figure 2.18 shows part of the game tree for tic-tac-toe (noughts and crosses). From the initial state, MAX has nine possible moves. Play alternates between MAX placing an X and MIN placing an O until we reach leaf nodes corresponding to terminal states, such that one player has three in a row or all the squares are filled. The number on each leaf node indicates the utility value of the terminal state from the point of view of MAX; high values are assumed to be good for MAX and bad for MIN. It is MAX's job to use the search tree (particularly the utility of terminal states) to determine the best move.

Figure 2.18 A partial search tree for tic-tac-toe. The top node is the initial state, and MAX moves first, placing an X in an empty square.

2.3.2 Optimal Decisions in Games
In a normal search problem, the optimal solution would be a sequence of moves leading to a goal state – a terminal state that is a win.
In a game, on the other hand, MIN has something to say about it. MAX therefore must find a contingent strategy, which specifies MAX's move in the initial state, then MAX's moves in the states resulting from every possible response by MIN, then MAX's moves in the states resulting from every possible response by MIN to those moves, and so on. An optimal strategy leads to outcomes at least as good as any other strategy when one is playing an infallible opponent.

Figure 2.19 A two-ply game tree. The triangle nodes are "MAX nodes", in which it is MAX's turn to move, and the inverted-triangle nodes are "MIN nodes". The terminal nodes show the utility values for MAX; the other nodes are labeled with their minimax values. MAX's best move at the root is a1, because it leads to the successor with the highest minimax value, and MIN's best reply is b1, because it leads to the successor with the lowest minimax value.

function MINIMAX-DECISION(state) returns an action
  inputs: state, current state in game
  v <- MAX-VALUE(state)
  return the action in SUCCESSORS(state) with value v

Figure 2.20 An algorithm for calculating minimax decisions. It returns the action corresponding to the best possible move, that is, the move that leads to the outcome with the best utility, under the assumption that the opponent plays to minimize utility. The functions MAX-VALUE and MIN-VALUE go through the whole game tree, all the way to the leaves, to determine the backed-up value of a state.

The minimax algorithm
The minimax algorithm (Figure 2.20) computes the minimax decision from the current state. It uses a simple recursive computation of the minimax values of each successor state, directly implementing the defining equations. The recursion proceeds all the way down to the leaves of the tree, and then the minimax values are backed up through the tree as the recursion unwinds.
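The recursive back-up just described can be sketched compactly. As a simplifying assumption, the game tree is encoded as nested lists: a leaf is a utility number for MAX, and an internal node is the list of its successors, with MAX and MIN alternating by level.

```python
def minimax(node, to_move='MAX'):
    """Return the minimax value of a nested-list game tree: MAX levels take
    the maximum of their children's values, MIN levels take the minimum."""
    if not isinstance(node, list):            # terminal test: a leaf utility
        return node
    values = [minimax(child, 'MIN' if to_move == 'MAX' else 'MAX')
              for child in node]
    return max(values) if to_move == 'MAX' else min(values)

# The two-ply tree of Figure 2.19: B = (3, 12, 8), C = (2, 4, 6), D = (14, 5, 2).
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
```

On this tree the MIN nodes back up 3, 2, and 2, and the root (a MAX node) backs up their maximum, 3, matching the walkthrough of Figure 2.19.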
For example, in Figure 2.19, the algorithm first recurses down to the three bottom-left nodes and uses the utility function on them to discover that their values are 3, 12, and 8 respectively. Then it takes the minimum of these values, 3, and returns it as the backed-up value of node B. A similar process gives the backed-up values of 2 for C and 2 for D. Finally, we take the maximum of 3, 2, and 2 to get the backed-up value of 3 at the root node.

The minimax algorithm performs a complete depth-first exploration of the game tree. If the maximum depth of the tree is m, and there are b legal moves at each point, then the time complexity of the minimax algorithm is O(b^m). The space complexity is O(bm) for an algorithm that generates all successors at once.

2.3.3 Alpha-Beta Pruning
The problem with minimax search is that the number of game states it has to examine is exponential in the number of moves. Unfortunately, we can't eliminate the exponent, but we can effectively cut it in half. By performing pruning, we can eliminate large parts of the tree from consideration. The technique known as alpha-beta pruning, when applied to a minimax tree, returns the same move as minimax would, but prunes away branches that cannot possibly influence the final decision.

Alpha-beta pruning gets its name from the following two parameters, which describe bounds on the backed-up values that appear anywhere along the path:
o α: the value of the best (i.e., highest-value) choice we have found so far at any choice point along the path for MAX.
o β: the value of the best (i.e., lowest-value) choice we have found so far at any choice point along the path for MIN.

Alpha-beta search updates the values of α and β as it goes along, and prunes the remaining branches at a node (i.e., terminates the recursive call) as soon as the value of the current node is known to be worse than the current α or β value for MAX or MIN, respectively. The complete algorithm is given in Figure 2.21.
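The update-and-prune behaviour can be sketched in Python. This is not the book's Figure 2.21 listing, just a minimal rendering of the same idea over a small two-ply tree; the leaf values under node B (3, 12, 8) come from the worked example in the text, and the C and D leaves are illustrative.

```python
# A sketch of alpha-beta search: it returns the same value as minimax, but
# a node's remaining successors are skipped as soon as its value is known
# to be worse than the current alpha (for MAX) or beta (for MIN).
import math

def ab_max_value(state, alpha, beta, succ, util):
    if state not in succ:                      # terminal test
        return util[state]
    v = -math.inf
    for _, s in succ[state]:
        v = max(v, ab_min_value(s, alpha, beta, succ, util))
        if v >= beta:                          # MIN would never allow this node
            return v                           # prune the remaining branches
        alpha = max(alpha, v)
    return v

def ab_min_value(state, alpha, beta, succ, util):
    if state not in succ:
        return util[state]
    v = math.inf
    for _, s in succ[state]:
        v = min(v, ab_max_value(s, alpha, beta, succ, util))
        if v <= alpha:                         # MAX already has something better
            return v
        beta = min(beta, v)
    return v

# Two-ply tree: A is a MAX node; B, C, D are MIN nodes over leaf utilities.
succ = {'A': [('a1', 'B'), ('a2', 'C'), ('a3', 'D')],
        'B': [('b1', 'B1'), ('b2', 'B2'), ('b3', 'B3')],
        'C': [('c1', 'C1'), ('c2', 'C2'), ('c3', 'C3')],
        'D': [('d1', 'D1'), ('d2', 'D2'), ('d3', 'D3')]}
util = {'B1': 3, 'B2': 12, 'B3': 8,
        'C1': 2, 'C2': 4, 'C3': 6,
        'D1': 14, 'D2': 5, 'D3': 2}
root_value = ab_max_value('A', -math.inf, math.inf, succ, util)
```

On this tree, once B has backed up 3, α at the root is 3; C's first leaf has value 2 ≤ α, so `ab_min_value` returns without examining C's other successors. The root value is still 3, the same as plain minimax.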
The effectiveness of alpha-beta pruning is highly dependent on the order in which the successors are examined. It may be worthwhile to try to examine first the successors that are likely to be best. In that case, it turns out that alpha-beta needs to examine only O(b^(d/2)) nodes to pick the best move, instead of O(b^d) for minimax. This means that the effective branching factor becomes sqrt(b) instead of b: for chess, 6 instead of 35. Put another way, alpha-beta can look ahead roughly twice as far as minimax in the same amount of time.

Figure 2.21 The alpha-beta search algorithm. These routines are the same as the minimax routines in Figure 2.20, except for the two lines in each of MIN-VALUE and MAX-VALUE that maintain α and β.

2.3.4 Imperfect, Real-Time Decisions
The minimax algorithm generates the entire game search space, whereas the alpha-beta algorithm allows us to prune large parts of it. However, alpha-beta still has to search all the way to terminal states for at least a portion of the search space. Shannon's 1950 paper, "Programming a Computer for Playing Chess," proposed that programs should cut off the search earlier and apply a heuristic evaluation function to states in the search, effectively turning nonterminal nodes into terminal leaves. The basic idea is to alter minimax or alpha-beta in two ways: (1) the utility function is replaced by a heuristic evaluation function EVAL, which gives an estimate of the position's utility, and (2) the terminal test is replaced by a cutoff test that decides when to apply EVAL.

2.3.5 Games that Include an Element of Chance

Evaluation functions
An evaluation function returns an estimate of the expected utility of the game from a given position, just as the heuristic function returns an estimate of the distance to the goal.

Games of imperfect information
o Minimax and alpha-beta pruning require too many leaf-node evaluations, which may be impractical within a reasonable amount of time.
o SHANNON (1950) proposed:
  o Cut off search earlier (replace TERMINAL-TEST by CUTOFF-TEST).
  o Apply a heuristic evaluation function EVAL (replacing the utility function of alpha-beta).

Cutting off search
Change:
  if TERMINAL-TEST(state) then return UTILITY(state)
into:
  if CUTOFF-TEST(state, depth) then return EVAL(state)
This introduces a fixed depth limit, selected so that the amount of time used will not exceed what the rules of the game allow. When the cutoff occurs, the evaluation is performed.

Heuristic EVAL
Idea: produce an estimate of the expected utility of the game from a given position. Performance depends on the quality of EVAL. Requirements: EVAL should order terminal nodes in the same way as UTILITY; the computation must not take too long; and for non-terminal states, EVAL should be strongly correlated with the actual chances of winning. It is only useful for quiescent states (those with no wild swings in value in the near future).

Weighted Linear Function
The introductory chess books give an approximate material value for each piece: each pawn is worth 1, a knight or bishop is worth 3, a rook 5, and the queen 9. These feature values are added up to obtain the evaluation of the position. Mathematically, this kind of evaluation function is called a weighted linear function, and it can be expressed as:
  Eval(s) = w1 f1(s) + w2 f2(s) + ... + wn fn(s)
e.g., w1 = 9 with f1(s) = (number of white queens) - (number of black queens), etc.

Games that include chance
In real life, there are many unpredictable external events that put us into unforeseen situations. Many games mirror this unpredictability by including a random element, such as the throwing of dice. Backgammon is a typical game that combines luck and skill. Dice are rolled at the beginning of a player's turn to determine the legal moves. In the backgammon position of Figure 2.23, for example, White has rolled a 6-5 and has four possible moves.

Figure 2.23 A typical backgammon position. The goal of the game is to move all one's pieces off the board.
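The two alterations described above (a fixed-depth CUTOFF-TEST and a weighted linear EVAL) can be sketched together in Python. The position encoding, the toy positions, and the move generator below are hypothetical stand-ins invented for this sketch; only the material weights come from the text.

```python
# A sketch of depth-limited minimax: TERMINAL-TEST is replaced by a
# fixed-depth CUTOFF-TEST, and UTILITY by a weighted linear EVAL,
#   Eval(s) = w1*f1(s) + ... + wn*fn(s),
# where each feature fi is (white count - black count) for one piece type
# and wi is that piece's material value (pawn 1, knight/bishop 3, rook 5,
# queen 9).

WEIGHTS = {'P': 1, 'N': 3, 'B': 3, 'R': 5, 'Q': 9}

def material_eval(position):
    """position maps a piece letter to (white_count, black_count)."""
    return sum(w * (position[p][0] - position[p][1])
               for p, w in WEIGHTS.items() if p in position)

def cutoff_test(position, depth):
    return depth >= 2                      # hypothetical two-ply depth limit

def h_minimax(position, depth, successors, is_max=True):
    if cutoff_test(position, depth):       # was: if TERMINAL-TEST(state)
        return material_eval(position)     # was: return UTILITY(state)
    values = [h_minimax(s, depth + 1, successors, not is_max)
              for s in successors(position)]
    return max(values) if is_max else min(values)

# Hypothetical toy positions: queens even, or White up a queen.
level = {'Q': (1, 1), 'P': (8, 8)}
queen_up = {'Q': (1, 0), 'P': (8, 8)}

def demo_successors(position):
    # Hypothetical move generator: both positions reachable from anywhere.
    return [queen_up, level]
```

Here `material_eval(queen_up)` is 9, but with the cutoff at depth 2 the minimizing ply gets the last word: from either successor, MIN steers back to the level position, so `h_minimax(level, 0, demo_successors)` backs up 0.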
White moves clockwise toward 25, and Black moves counterclockwise toward 0. A piece can move to any position unless there are multiple opponent pieces there; if there is one opponent piece, it is captured and must start over. In the position shown, White has rolled 6-5 and must choose among four legal moves: (5-10, 5-11), (5-11, 19-24), (5-10, 10-16), and (5-11, 11-16).

Figure 2.24 Schematic game tree for a backgammon position.

Expected minimax value
EXPECTED-MINIMAX-VALUE(n) =
  UTILITY(n)                                                      if n is a terminal state
  max_{s in Successors(n)} EXPECTED-MINIMAX-VALUE(s)              if n is a MAX node
  min_{s in Successors(n)} EXPECTED-MINIMAX-VALUE(s)              if n is a MIN node
  sum_{s in Successors(n)} P(s) * EXPECTED-MINIMAX-VALUE(s)       if n is a chance node
These equations can be backed up recursively all the way to the root of the game tree.
------------------------------------------------------------------------------------------------------
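The four cases of the expected-minimax recursion can be sketched in Python. The node encoding ('max'/'min'/'chance'/'leaf' tags) and the tiny example tree are illustrative, not the tree of Figure 2.24.

```python
# A sketch of EXPECTED-MINIMAX-VALUE: MAX and MIN nodes take the max/min of
# their children, and chance nodes average their children weighted by the
# probability of each outcome.

def expectiminimax(node):
    kind = node[0]
    if kind == 'leaf':                 # ('leaf', utility): terminal state
        return node[1]
    if kind == 'max':                  # ('max', [child, ...])
        return max(expectiminimax(c) for c in node[1])
    if kind == 'min':                  # ('min', [child, ...])
        return min(expectiminimax(c) for c in node[1])
    if kind == 'chance':               # ('chance', [(prob, child), ...])
        return sum(p * expectiminimax(c) for p, c in node[1])
    raise ValueError('unknown node kind: %r' % (kind,))

# MAX chooses between the expected outcomes of two chance nodes,
# e.g. two possible continuations after a dice roll:
tree = ('max', [
    ('chance', [(0.5, ('leaf', 2)), (0.5, ('leaf', 4))]),   # expectation 3.0
    ('chance', [(0.5, ('leaf', 1)), (0.5, ('leaf', 3))]),   # expectation 2.0
])
```

Evaluating `expectiminimax(tree)` backs the expectations 3.0 and 2.0 up to the MAX root, which returns 3.0, exactly the recursive back-up the equations describe.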