LEARNING
Learning from Observation
Inductive Learning
Decision Trees
Explanation-based Learning
Statistical Learning Methods
Reinforcement Learning
Learning
An agent tries to improve its behavior through observation, reasoning, or reflection
learning from experience
memorization of past percepts, states, and actions
generalization, identification of similar experiences
forecasting
prediction of changes in the environment
theories
generation of complex models based on observations and reasoning
© 2000-2012 Franz Kurfess
Learning Agents
based on previous agent designs, such as reflex, model-based, and goal-based agents
those aspects of agents are encapsulated into the performance element of a learning agent
[Diagram: learning agent architecture. The Critic uses sensor feedback to send feedback to the Learning Element; the Learning Element makes changes to the Performance Element, draws on shared Knowledge, sets Learning Goals, and drives a Problem Generator; Sensors and Effectors connect the agent to the Environment.]
Forms of Learning
supervised learning
an agent tries to find a function that matches examples from a sample set
each example provides an input together with the correct output
unsupervised learning
the agent tries to learn from patterns without corresponding output values
reinforcement learning
the agent does not know the exact output for an input, but it receives feedback on the desirability of its behavior
the feedback can come from an outside entity, the environment, or the agent itself; the feedback may be delayed, and not follow the respective action immediately
Feedback
provides information about the actual outcome of actions
supervised learning
both the input and the output of a component can be perceived by the agent directly; the output may be provided by a teacher
reinforcement learning
feedback concerning the desirability of the agent's behavior is available, but not in the form of the correct output
Prior Knowledge
background knowledge available before a task is tackled can increase performance or decrease learning time considerably
many learning schemes assume that no prior knowledge is available
in reality, some prior knowledge is almost always available, but often in a form that is not immediately usable by the agent
Inductive Learning
tries to find a function h (the hypothesis) that approximates a set of samples defining a function f
the samples are usually provided as input-output pairs (x, f(x))
Hypotheses
finding a suitable hypothesis can be difficult
since the function f is unknown, it is hard to tell if the hypothesis h is a good approximation
input-output pairs displayed as points in a plane; the task is to find a hypothesis (a function) that connects the points
either all of them, or most of them
hypothesis: a function consisting of linear segments
fully incorporates all sample pairs (goes through all points)
very easy to calculate
has discontinuities at the joints of the segments
moderate predictive performance
hypothesis expressed as a polynomial function
incorporates all samples
more complicated to calculate than linear segments
no discontinuities
better predictive power
hypothesis: a linear function
does not incorporate all samples
extremely easy to compute
low predictive power
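The trade-off between these hypotheses can be sketched in a few lines of Python (the sample points are invented for illustration): the piecewise-linear hypothesis reproduces every sample exactly, while the single linear function is simpler but misses most points.

```python
# Hypothetical samples (x, f(x)); the function f itself is unknown to the learner
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 0.9, 4.2, 8.8, 16.1]

# Hypothesis 1: a single linear function fitted by least squares
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

def h_linear(x):
    return intercept + slope * x

# Hypothesis 2: linear segments connecting consecutive sample points
def h_segments(x):
    for (x0, y0), (x1, y1) in zip(zip(xs, ys), zip(xs[1:], ys[1:])):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x outside sample range")

def sum_squared_error(h):
    return sum((h(x) - y) ** 2 for x, y in zip(xs, ys))

print(sum_squared_error(h_segments) < 1e-9)   # True: goes through all points
print(sum_squared_error(h_linear) > 1.0)      # True: misses some samples
```

Neither error measure says anything about predictive power on new points, which is exactly why hypothesis choice is hard when f is unknown.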
Decision Trees
making decisions
a sequence of tests is performed, testing the value of one of the attributes in each step
when a leaf node is reached, its value is returned
the learning aspect is to predict the value of a goal predicate (also called goal concept)
Terminology
example or sample
describes the values of the attributes and the goal
a positive sample has the value true for the goal predicate, a negative sample has false
sample set
collection of samples used for training and validation
training
the training set consists of samples used for constructing the decision tree
validation
the test set is used to determine whether the decision tree performs correctly on samples not used for training
Attributes and goal predicate for the 12 training samples:

Sample  Bar  Fri/Sat  Hungry  Patrons  Price  Raining  Reservation  Type     WaitEst  Goal
X1      No   No       Yes     Some     $$$    No       Yes          French   0-10     Yes
X2      No   No       Yes     Full     $      No       No           Thai     30-60    No
X3      Yes  No       No      Some     $      No       No           Burger   0-10     Yes
X4      No   Yes      Yes     Full     $      No       No           Thai     10-30    Yes
X5      No   Yes      No      Full     $$$    No       Yes          French   >60      No
X6      Yes  No       Yes     Some     $$     Yes      Yes          Italian  0-10     Yes
X7      Yes  No       No      None     $      Yes      No           Burger   0-10     No
X8      No   No       Yes     Some     $$     Yes      Yes          Thai     0-10     Yes
X9      Yes  Yes      No      Full     $      Yes      No           Burger   >60      No
X10     Yes  Yes      Yes     Full     $$$    No       Yes          Italian  10-30    No
X11     No   No       No      None     $      No       No           Thai     0-10     No
X12     Yes  Yes      Yes     Full     $      No       No           Burger   30-60    Yes
[Diagram: a decision tree that tests attributes such as Hungry?, Alternative?, Walkable?, EstWait?, and Bar?, ending in Yes/No leaves.]
better solution: find a concise tree that still agrees with all samples
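One common way to grow such a concise tree, used by ID3-style algorithms (the slides do not name the heuristic, so this is an assumption), is to split on the attribute with the highest information gain. A minimal sketch:

```python
from math import log2

def entropy(pos, neg):
    """Entropy in bits of a sample set with pos positive and neg negative samples."""
    total = pos + neg
    e = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            e -= p * log2(p)
    return e

def information_gain(parent, splits):
    """parent: (pos, neg) before the test; splits: one (pos, neg) per attribute value."""
    total = sum(p + n for p, n in splits)
    remainder = sum((p + n) / total * entropy(p, n) for p, n in splits)
    return entropy(*parent) - remainder

# hypothetical attribute that splits 6 positive / 6 negative samples into
# a pure branch (4 pos, 0 neg) and a mixed branch (2 pos, 6 neg)
print(entropy(6, 6))                                         # 1.0
print(round(information_gain((6, 6), [(4, 0), (2, 6)]), 3))  # 0.459
```

Choosing the attribute with the largest gain at each node tends to keep the tree shallow, which is the concision the slide asks for.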
Ockham's Razor
The most likely hypothesis is the simplest one that is consistent with all observations.
general principle for inductive learning: a simple hypothesis that is consistent with all observations is more likely to be correct than a complex one
if we have positive and negative examples left, but no attributes to split them, we are in trouble
overfitting makes use of irrelevant attributes to distinguish between samples that have no meaningful differences
e.g. using the day of the week when rolling dice
overfitting is a general problem for all learning algorithms
cross-validation splits the sample data into different training and test sets
results are averaged
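The split-and-average procedure can be sketched as k-fold cross-validation (the learner itself is omitted here; the samples and the choice k = 3 are made up):

```python
def k_fold_splits(samples, k):
    """Partition the samples into k folds and yield (training_set, test_set) pairs."""
    folds = [samples[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield train, test

samples = list(range(12))
for train, test in k_fold_splits(samples, k=3):
    # every sample appears exactly once across train and test in each round
    assert sorted(train + test) == samples
    print(len(train), len(test))   # 8 4, on each of the 3 rounds
```

A real run would train a hypothesis on each `train` set, score it on the corresponding `test` set, and average the k scores.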
Ensemble Learning
Multiple hypotheses (an ensemble) are generated, and their predictions combined
by using multiple hypotheses, the likelihood of misclassification is hopefully lower
using an ensemble also enlarges the hypothesis space
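Combining by majority vote, together with the standard calculation of how three independent hypotheses reduce the misclassification probability, can be sketched as follows (the per-hypothesis error rate p = 0.1 is an illustrative assumption):

```python
from collections import Counter

def majority_vote(predictions):
    """Combine the predictions of several hypotheses by majority vote."""
    return Counter(predictions).most_common(1)[0][0]

print(majority_vote(["spam", "spam", "ham"]))   # spam

# With 3 independent hypotheses that each err with probability p = 0.1,
# the majority is wrong only if at least 2 of them err:
p = 0.1
ensemble_error = p**3 + 3 * p**2 * (1 - p)
print(round(ensemble_error, 3))                 # 0.028, well below 0.1
```

The reduction relies on the errors being (approximately) independent, which is why ensemble methods try to generate diverse hypotheses.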
Explanation-Based Learning
Learning complex concepts using induction procedures typically requires a substantial number of training instances, but people seem to be able to learn quite a bit from single examples. An EBL system attempts to learn from a single example x by explaining why x is an example of the target concept. The explanation is then generalized, and the system's performance is improved through the availability of this knowledge.
EBL
EBL programs accept the following as input:
A training example
A goal concept: a high-level description of what the program is supposed to learn
An operationality criterion: a description of which concepts are usable
A domain theory: a set of rules that describe relationships between objects and actions in a domain
From this EBL computes a generalization of the training example that is sufficient to describe the goal concept, and also satisfies the operationality criterion.
Explanation-based generalization (EBG) is an algorithm for EBL with two steps: (1) explain, (2) generalize.
During the explanation step, all the unimportant aspects of the training example are pruned away with respect to the goal concept; this yields the explanation.
The next step is to generalize the explanation as far as possible while still describing the goal concept.
Statistical Learning
Data: instantiations of some or all of the random variables describing the domain; they are evidence
Hypotheses: probabilistic theories of how the domain works
The Surprise candy example: two flavors in very large bags of 5 kinds, indistinguishable from the outside
h1: 100% cherry, i.e. P(c|h1) = 1, P(l|h1) = 0
h2: 75% cherry + 25% lime
h3: 50% cherry + 50% lime
Problem formulation
Given a new bag, the random variable H denotes the bag type (h1, ..., h5); Di is a random variable (cherry or lime); after seeing D1, D2, ..., DN, predict the flavor (value) of DN+1.
Bayesian learning
Calculates the probability of each hypothesis, given the data and makes predictions on that basis
P(hi|d) = αP(d|hi)P(hi), where d are the observed values of D
Predictions use a likelihood-weighted average over the hypotheses
The hi are intermediaries between raw data and predictions
No need to pick one best-guess hypothesis
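Applied to the candy example, the posterior update and the likelihood-weighted prediction can be sketched as follows; the hypotheses h4 (25% cherry) and h5 (100% lime) and the prior (0.1, 0.2, 0.4, 0.2, 0.1) follow the textbook version of the example and are assumptions here:

```python
priors = [0.1, 0.2, 0.4, 0.2, 0.1]     # P(h1)..P(h5), textbook priors (an assumption)
p_lime = [0.0, 0.25, 0.5, 0.75, 1.0]   # P(lime | hi) for the five bag types

def posteriors(n_limes):
    """P(hi | d) after unwrapping n_limes lime candies in a row."""
    unnorm = [prior * (pl ** n_limes) for prior, pl in zip(priors, p_lime)]
    alpha = 1.0 / sum(unnorm)           # normalization constant
    return [alpha * u for u in unnorm]

def predict_lime(n_limes):
    """Likelihood-weighted average: P(next is lime | d) = sum_i P(lime|hi) P(hi|d)."""
    return sum(pl * po for pl, po in zip(p_lime, posteriors(n_limes)))

post = posteriors(10)
print(post.index(max(post)))        # 4: after 10 limes, h5 (all lime) dominates
print(round(predict_lime(10), 3))   # 0.973
```

Note that the prediction averages over all five hypotheses rather than committing to the single most probable one.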
Data are complete when each data point contains values for every variable in the model
Maximum-likelihood parameter learning: discrete model
With complete data, maximum-likelihood parameter learning for a discrete model reduces to counting: the ML estimate of each parameter is the observed relative frequency of the corresponding value
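This frequency-counting view fits in a one-function sketch (the counts below are invented):

```python
def ml_parameters(counts):
    """Maximum-likelihood parameters of a discrete distribution:
       theta_v = N_v / N, i.e. the observed relative frequencies."""
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}

# hypothetical complete data: 750 cherry and 250 lime candies unwrapped
theta = ml_parameters({"cherry": 750, "lime": 250})
print(theta["cherry"])   # 0.75
```

With little data this estimate can be overconfident (e.g. a count of zero yields probability zero), which is one motivation for Bayesian priors.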
If we knew the parameters of each component, we would know which component ci each data point xj belongs to; however, we know neither.
E-step: computes the expected value pij of the hidden indicator variables Zij, where Zij is 1 if xj was generated by the i-th component, and 0 otherwise
M-step: finds the new values of the parameters that maximize the log likelihood of the data, given the expected values of Zij
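The two steps can be sketched for a two-component, one-dimensional Gaussian mixture; the data and the crude initialization below are invented for illustration (production code would use random restarts or k-means to initialize):

```python
import math

def gauss(x, mu, sigma):
    """Gaussian density N(x; mu, sigma)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def em_two_components(data, iters=50):
    # crude initialization: put the two components at the extremes (an assumption)
    mu = [min(data), max(data)]
    sigma = [1.0, 1.0]
    w = [0.5, 0.5]
    for _ in range(iters):
        # E-step: expected indicators p[j][i] = P(component i generated x_j)
        p = []
        for x in data:
            u = [w[i] * gauss(x, mu[i], sigma[i]) for i in range(2)]
            s = sum(u)
            p.append([ui / s for ui in u])
        # M-step: parameters maximizing expected log likelihood given the indicators
        for i in range(2):
            n_i = sum(pj[i] for pj in p)
            w[i] = n_i / len(data)
            mu[i] = sum(pj[i] * x for pj, x in zip(p, data)) / n_i
            var = sum(pj[i] * (x - mu[i]) ** 2 for pj, x in zip(p, data)) / n_i
            sigma[i] = max(math.sqrt(var), 1e-3)   # floor to avoid degenerate components
    return mu, sigma, w

# two well-separated clusters near 0.1 and 5.05
data = [0.0, 0.1, 0.15, 0.2, 4.9, 5.0, 5.1, 5.2]
mu, sigma, w = em_two_components(data)
print(mu)   # one mean near 0.11, the other near 5.05
```

Because EM only finds local maxima of the likelihood, the result can depend on the initialization.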
Instance-based Learning
Parametric vs. nonparametric learning
Parametric learning focuses on fitting the parameters of a restricted family of probability models to an unrestricted data set
Parametric learning methods are often simple and effective, but can oversimplify what's really happening
Nonparametric learning allows the hypothesis complexity to grow with the data
IBL is nonparametric, as it constructs hypotheses directly from the training data
Nearest-neighbor models
The key idea: Neighbors are similar
Density estimation example: estimate x's probability density by the density of its neighbors
Connections with table lookup, naive Bayes classifiers, decision trees, ...
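A minimal k-nearest-neighbor classifier using Euclidean distance (the sample points and labels are made up):

```python
def knn_classify(query, samples, k=3):
    """Classify `query` by majority label among its k nearest samples.
       samples: list of (point, label) pairs; points are coordinate tuples."""
    def distance(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5
    nearest = sorted(samples, key=lambda s: distance(query, s[0]))[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)

samples = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
           ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
print(knn_classify((1, 1), samples))   # a: its neighbors are the 'a' cluster
print(knn_classify((5, 5), samples))   # b
```

The "hypothesis" here is just the stored data plus a distance function, which is what makes the method nonparametric: its complexity grows with the sample set.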
Summary
Bayesian learning formulates learning as a form of probabilistic inference, using the observations to update a prior distribution over hypotheses. Maximum a posteriori (MAP) selects a single most likely hypothesis given the data. Maximum likelihood simply selects the hypothesis that maximizes the likelihood of the data (= MAP with a uniform prior). EM can find local maximum likelihood solutions for hidden variables. Instance-based models use the collection of data to represent a distribution.
Nearest-neighbor method
Reinforcement Learning
In which we examine how an agent can learn from success and failure, reward and punishment.
Introduction
Learning to ride a bicycle:
The goal given to the Reinforcement Learning system is simply to ride the bicycle without falling over
It begins riding the bicycle and performs a series of actions that result in the bicycle being tilted 45 degrees to the right
Photo:http://www.roanoke.com/outdoors/bikepages/bikerattler.html
RL system turns the handlebars to the LEFT; result: CRASH!!! It receives negative reinforcement
RL system turns the handlebars to the RIGHT; result: CRASH!!! It receives negative reinforcement
RL system has learned that the state of being tilted 45 degrees to the right is bad
The trial is repeated with a tilt of 40 degrees to the right
By performing enough of these trial-and-error interactions with the environment, the RL system will ultimately learn how to prevent the bicycle from ever falling over
Agent can move {North, East, South, West}; terminates on reaching [4,2] or [4,3]
An active agent must consider:
what actions to take
what their outcomes may be
how they will affect the rewards received
the environment model now incorporates the probabilities of transitions to other states, given a particular action
to maximize its expected utility, the agent needs a performance element to choose an action at each step
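Choosing the action with the highest expected utility under a known transition model can be sketched as follows; the states, probabilities, and utilities are invented for illustration:

```python
def best_action(state, actions, transition, utility):
    """argmax over actions a of sum over s' of P(s' | state, a) * U(s')."""
    def expected_utility(a):
        return sum(p * utility[s2] for s2, p in transition(state, a).items())
    return max(actions, key=expected_utility)

# toy model: moving east usually reaches the +1 terminal state,
# moving north usually ends in the -1 terminal state
utility = {"goal": 1.0, "pit": -1.0, "start": 0.0}

def transition(state, action):
    return {"east":  {"goal": 0.8, "pit": 0.1, "start": 0.1},
            "north": {"goal": 0.1, "pit": 0.8, "start": 0.1}}[action]

print(best_action("start", ["east", "north"], transition, utility))   # east
```

Here EU(east) = 0.8 - 0.1 = 0.7 and EU(north) = 0.1 - 0.8 = -0.7, so the performance element picks east; the hard part of RL is learning the utilities (and possibly the model) from reward signals in the first place.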
Neural Networks
complex networks of simple computing elements capable of learning from examples
with appropriate learning methods
sensory input analysis
memory storage and retrieval
reasoning
feelings
consciousness
neurons
basic computational elements
heavily interconnected with other neurons
weighted inputs are summed up by the input function; the (nonlinear) activation function calculates the activation value, which determines the output
Network Structures
in principle, networks can be arbitrarily connected
occasionally done to represent specific structures
semantic networks logical sentences
layered structures
networks are arranged into layers
interconnections mostly between two layers
some networks have feedback connections
Perceptrons
single-layer, feed-forward network
historically one of the first types of neural networks (late 1950s)
late 1950s
the output is calculated as a step function applied to the weighted sum of inputs
capable of learning simple functions
linearly separable
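The perceptron learning rule can be sketched on the (linearly separable) AND function; the learning rate and epoch count below are arbitrary choices:

```python
def train_perceptron(samples, epochs=25, lr=0.1):
    """Perceptron learning rule for a single threshold unit.
       samples: list of ((x1, x2), target) pairs with 0/1 targets."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for x, target in samples:
            y = 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0   # step activation
            error = target - y
            w = [wi + lr * error * xi for wi, xi in zip(w, x)]  # adjust weights
            b += lr * error                                     # adjust bias
    return lambda x: 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

# AND is linearly separable, so the perceptron converges on it
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
h = train_perceptron(AND)
print([h(x) for x, _ in AND])   # [0, 0, 0, 1]
```

The same loop never converges on XOR, which is not linearly separable; that limitation is what motivated multi-layer networks.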
this is a gradient descent search through the weight space
Multi-Layer Networks
research in the more complex networks with more than one layer was very limited until the 1980s
learning in such networks is much more complicated; the problem is to assign the blame for an error to the respective units and their weights in a constructive way
usually all nodes of one layer have weighted connections to all nodes of the next layer
Back-Propagation Algorithm
assigns blame to individual units in the respective layers
essentially based on the connection strength
proceeds from the output layer to the hidden layer(s)
updates the weights of the units leading to the layer
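A single backward pass for a tiny 2-2-1 sigmoid network can be sketched as follows; the starting weights, learning rate, and training sample are invented for illustration. Repeating the step shrinks the output error, showing how blame flows from the output unit back to the hidden units in proportion to connection strength:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# tiny 2-2-1 network with arbitrary starting weights (no bias terms, for brevity)
w_hidden = [[0.5, -0.4], [0.3, 0.8]]   # w_hidden[j][i]: weight from input i to hidden unit j
w_out = [0.6, -0.2]                    # weight from hidden unit j to the output unit

def forward(x):
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    y = sigmoid(sum(w * hj for w, hj in zip(w_out, h)))
    return h, y

def backprop_step(x, target, lr=0.5):
    """One gradient-descent step on the squared error for a single sample."""
    h, y = forward(x)
    delta_out = (target - y) * y * (1 - y)                # blame at the output unit
    delta_h = [delta_out * w_out[j] * h[j] * (1 - h[j])   # blame at the hidden units,
               for j in range(2)]                         # weighted by connection strength
    for j in range(2):                                    # update output-layer weights
        w_out[j] += lr * delta_out * h[j]
    for j in range(2):                                    # update hidden-layer weights
        for i in range(2):
            w_hidden[j][i] += lr * delta_h[j] * x[i]

x, target = [1.0, 0.0], 1.0
_, y_before = forward(x)
for _ in range(100):
    backprop_step(x, target)
_, y_after = forward(x)
improved = abs(target - y_after) < abs(target - y_before)
print(improved)   # True
```

Note that the hidden deltas are computed with the pre-update output weights, which is why the backward pass must finish assigning blame before any weights change.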
computational efficiency
training time can be exponential in the number of inputs
depends critically on parameters like the learning rate
local minima are problematic
can be overcome by simulated annealing, at additional cost
generalization
transparency
neural networks are essentially black boxes
there is no explanation or trace for a particular answer
tools for the analysis of networks are very limited
some limited methods to extract rules from networks exist
Applications
domains and tasks where neural networks are successfully used
handwriting recognition control problems
juggling, truck backup problem
series prediction
weather, financial forecasting
categorization
sorting of items (fruit, characters, phonemes, ...)