
#2 Tutorial -- Designing a BDI agent in Java

14:30-16:00, 16:15-17:45

Running example: Goldminers


3 agents in a partially observable environment like this:

Percepts:
1. p = (x, y): the agent's current position
2. G = {(x, y)}: locations of gold stones the agent sees
3. A = {(x, y)}: locations of other agents that the agent sees
4. F = {(x, y)}: locations of forests the agent sees
5. d = (x, y): the location of the depot, if the agent sees it
6. c: a boolean flag determining whether the agent is carrying a gold stone

Actions: {north, south, east, west, grab, drop, skip}

Rules:
- 1 point for each piece of gold carried to the depot
- trees and the depot stay static
- an agent cannot pass cells with forest, and cannot pass cells occupied by other agents
- 1% of the actions will fail and end up with a random effect
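The percepts and actions above can be sketched as plain Java types. This is only an illustrative encoding, assuming the conventions listed above; the names (`GoldminersTypes`, `Position`, `Percepts`, `Action`) are not part of any fixed contest API:

```java
import java.util.Set;

// Illustrative Java encoding of the Goldminers percepts and actions.
public class GoldminersTypes {
    public record Position(int x, int y) {}

    public record Percepts(
            Position p,            // the agent's current position
            Set<Position> gold,    // G: visible gold stone locations
            Set<Position> agents,  // A: visible other agents
            Set<Position> forests, // F: visible forest cells
            Position depot,        // d: depot location, or null if unseen
            boolean carrying       // c: is the agent carrying a gold stone?
    ) {}

    public enum Action { NORTH, SOUTH, EAST, WEST, GRAB, DROP, SKIP }
}
```

Records keep the percept snapshot immutable, which fits the perceive-deliberate-act cycle discussed below: each cycle receives a fresh `Percepts` value.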

Implementing agents
Agent function: specifies the behavior of the agent. For each sequence of percepts, it chooses an action:

f: P* -> A

Rational agent: given some performance measure, a rational agent always performs the action that maximizes its performance measure.

Perceive-Deliberate-Act cycle:
1. Perceive
2. Deliberate
3. Act

Agent architectures (according to Russell and Norvig):
1. Reflex (Reactive) Agent
   - chooses an action based on the current percepts (only considers the part of the world that is currently visible)
   - typically in the form of if-then rules
2. Model-based Reflex Agent
   - builds a model of the world that contains the expected state of the parts of the world that are unobserved
   - chooses an action based on the current model of the world
3. Model-based Goal-based Agent
   - the agent has a declaratively specified goal
   - performs actions that pursue the goal
   - Techniques:
     i. classical planning (sequence of actions)
     ii. planning with uncertainty (policy)
     iii. adversarial planning (policy)
     iv. Belief-Desire-Intention architecture (reactive planning; can be used when full planning is not tractable)
4. Model-based Utility-based Agent
   - each state of the world has a certain utility for the agent
   - find a sequence of actions (in a deterministic env.) or a policy (in a non-deterministic env.) that maximizes the utility gathered over time
   - Techniques:
     i. sequential decision making: MDP, POMDP (non-adversarial)
     ii. sequential games, imperfect-information games (adversarial)
5. Learning-based Agent
   - the agent function is learnt through interaction with the environment
   - typically involves both learning the model and the optimal policy
   - Techniques:
     i. reinforcement learning
        - the agent receives percepts (observations) together with a reward/penalty
        - learns the transition table and Q-values: state x action -> R
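The simplest of the architectures above, the reflex agent, can be sketched directly as if-then rules over the current percept. The `Percept` record here is a simplified, hypothetical stand-in (just three flags), not the full percept set from the running example:

```java
// Sketch of a reflex agent: the agent function maps the CURRENT percept
// to an action via if-then rules, ignoring percept history.
public class ReflexAgentSketch {
    public enum Action { GRAB, DROP, SKIP }

    // Simplified percept: only the flags the rules below need.
    public record Percept(boolean onGold, boolean atDepot, boolean carrying) {}

    // f : P -> A, as a sequence of if-then rules.
    public static Action reflex(Percept p) {
        if (p.carrying() && p.atDepot()) return Action.DROP; // deliver gold
        if (!p.carrying() && p.onGold()) return Action.GRAB; // pick it up
        return Action.SKIP;                                  // default rule
    }
}
```

Note what this agent cannot do: with no model of unobserved parts of the world, it cannot walk toward a depot it saw earlier; that is exactly what the model-based architectures add.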

Belief-Desire-Intention Architecture
Story:
- Programming rooted in psychology. Inspired by folk psychology (how people think other people think).
- Gave rise to specialized BDI languages such as Jason, 3APL, 2APL, GOAL, which however never really took off. Still not comparable to mainstream languages in comfort of use.
- Could be implemented in any language.
- Hoped to provide a more computationally tractable way to compute intelligent behavior than full-scale planning; the behaviour is easier for a human to understand, and it is relatively easy to incorporate human expertise into the agent.
- Many flavours, different authors, different opinions on why BDI is good and why it is not.

Main components (internal state divided into):
- Beliefs: what do I believe to hold in the world (model of the world)
- Desires: what would I like the world to look like (my goals)
- Intentions: what goals am I committed to pursue

Unlike desires, intentions should satisfy some properties:
- the agent should not intend something that it believes to already be true
- the agent should not intend something that it believes is unachievable
- the agent should not intend something that is not desired
- intentions should be consistent

Question: how long should an intention persist?
- blind commitment: also referred to as fanatical commitment; the agent maintains the intention until it believes that it has been achieved (persistent intention)
- single-minded commitment: besides the above, it maintains the intention until it believes that it is no longer possible to achieve the goal
- open-minded commitment: besides the above, it maintains the intention as long as it is sure that the intention is achievable
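A minimal sketch of the three components and the intention-adoption properties as plain Java state. `Goal` and the field names are hypothetical; the check mirrors the three adoption properties listed above (believed achievability is passed in as a flag rather than derived from the belief base, to keep the sketch short):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Illustrative BDI internal state: beliefs, desires, intentions.
public class BdiState {
    public record Goal(String name) {}

    // Goals the agent believes already hold in the world.
    final Set<Goal> beliefsAchieved = new HashSet<>();
    final Set<Goal> desires = new HashSet<>();
    final Deque<Goal> intentions = new ArrayDeque<>();

    // Check the adoption properties before committing to an intention.
    public boolean mayIntend(Goal g, boolean believedAchievable) {
        return !beliefsAchieved.contains(g)  // not believed to already be true
            && believedAchievable            // not believed unachievable
            && desires.contains(g);          // actually desired
    }
}
```

The commitment strategies differ only in when an adopted intention is *dropped* again: blind commitment drops it only once `beliefsAchieved` contains it; single-minded commitment also drops it when achievability is lost.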

Typical BDI deliberation cycle:
* Process percepts / communication
* Deliberate about intentions/goals: drop the achieved ones
* Pursue intentions / apply rules
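The cycle above can be sketched as a plain Java loop body. `Intention` is a hypothetical stand-in interface; in a real agent the first step would update the belief base, and `pursue()` would execute one step of the intention's plan:

```java
import java.util.LinkedList;
import java.util.List;

// Sketch of one iteration of the BDI deliberation cycle.
public class DeliberationCycle {
    public interface Intention {
        boolean achieved();  // does the agent believe this goal now holds?
        void pursue();       // apply one step of the intention's plan
    }

    final List<Intention> intentions = new LinkedList<>();

    public void step() {
        // 1. Process percepts / communication (belief update, omitted here)
        // 2. Deliberate: drop the intentions believed to be achieved
        intentions.removeIf(Intention::achieved);
        // 3. Pursue the remaining intentions / apply rules
        for (Intention i : intentions) i.pursue();
    }
}
```

Running `step()` in a loop, fed by fresh percepts each iteration, gives the perceive-deliberate-act behaviour described earlier.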

How would you implement it in Java?
