
Ayush Shakya (069bct505)

AI assignment 2
1)
I)
mango(x) -> x is a mango
color(x,y) -> x has the color y
∀x[ mango(x) → ~color(x, blue) ]
II)
girl(x) -> x is a girl
likes(x, y) -> x likes y
∀x[ girl(x) → likes(x, ice-cream) ]
III)
boy(x) -> x is a boy
similar(x,y) -> x is like y
∃x[ boy(x) & similar(x, monkey) ]
IV)
father(x, y) -> x is father of y
father(Gum Prasad, Hari)
V)
cat(x) -> x is a cat
has(x, y) -> x has y
∀x[ cat(x) → ( has(x, tail) & has(x, whiskers) ) ]
VI)
loyal(x, y) -> x is loyal to y
∀x∃y[ loyal(x, y) ]
VII)
like(x, y) -> x likes y
easycourse(x) -> x is an easy course
∀x[ like(Steve, x) → easycourse(x) ]

2)
Let,

dog(x) : x is a dog

animal(x) : x is an animal
die(x) : x can die
i) All dogs are animals: ∀x[ dog(x) → animal(x) ]

~dog(x) V animal(x)

ii) Fido is a dog:

dog(Fido)

dog(Fido)

iii) All animals will die: ∀y[ animal(y) → die(y) ]

~animal(y) V die(y)

iv) Fido will die (the goal, negated for resolution):

~die(Fido)

Using resolution:

Resolving ~die(Fido) with ~animal(y) V die(y) gives ~animal(Fido); resolving this with ~dog(x) V animal(x) gives ~dog(Fido); resolving this with dog(Fido) gives the empty clause.

The appearance of an empty clause contradicts our assumption ~die(Fido); hence Fido will die.
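The same refutation can be checked mechanically. Below is a minimal resolution sketch in Python for this example only: clauses are sets of ground literals (all clauses here mention only Fido, so no unification is needed), and the negate/resolve/resolution helpers are illustrative names, not part of the assignment.

# Minimal ground-clause resolution sketch for the Fido example.
def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(c1, c2):
    # Return every resolvent obtainable from clauses c1 and c2.
    resolvents = []
    for lit in c1:
        if negate(lit) in c2:
            resolvents.append((c1 - {lit}) | (c2 - {negate(lit)}))
    return resolvents

def resolution(clauses):
    # True means the empty clause was derived, i.e. the clause set is contradictory.
    clauses = set(clauses)
    while True:
        new = set()
        for a in clauses:
            for b in clauses:
                if a == b:
                    continue
                for r in resolve(a, b):
                    if not r:              # empty clause => contradiction
                        return True
                    new.add(frozenset(r))
        if new.issubset(clauses):          # nothing new can be derived
            return False
        clauses |= new

kb = [
    frozenset({"~dog(Fido)", "animal(Fido)"}),   # all dogs are animals (instantiated)
    frozenset({"dog(Fido)"}),                    # Fido is a dog
    frozenset({"~animal(Fido)", "die(Fido)"}),   # all animals die (instantiated)
    frozenset({"~die(Fido)"}),                   # negated goal
]
print(resolution(kb))                            # True: Fido will die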

3)
Let,
student(x) : x is a student.
score(x, y) : x scores y%.
distinction(x): x gets distinction.
isregular(x): x is regular in class.
hasTakenExam(x): x has taken the exam.
issincere(x): x is sincere.
ispunctual(x): x submits homeworks in time.
recommend(x, y): Professor x recommends student y.
greater(x, y): x >= y
less(x, y): x < y
i) All students should appear in the examination.
(∀x)( student(x) → hasTakenExam(x) )
ii) Only those students who score 80% and above will get distinction.
(∀x)( distinction(x) → (∃y)( score(x, y) & greater(y, 80) ) )
iii) To get distinction, the students should be regular in the class, submit homeworks in time and be
sincere.
(∀x)( distinction(x) → ( isregular(x) & issincere(x) & ispunctual(x) ) )
iv) The students with distinction will get good recommendation from the professors
(∀x)(∀y)( distinction(x) → recommend(y, x) )
v) Some of the professors do not like to recommend students scoring less than 75%.
(∃y)(∀x)(∀z)( ( score(x, z) & less(z, 75) ) → ~recommend(y, x) )

4)
Let,
food(x) : x is a food
eats(x, y) : x eats y
likes(x, y) : x likes y
killed(x) : x is killed
i) John likes all kinds of food:

∀x[ food(x) → likes(John, x) ], clause form: ~food(x) V likes(John, x)

ii) Apples are food :

food(apple)

iii) Chicken is a food:

food(chicken)

iv) Anything anyone eats and isn't killed by is food:


∀x∀y[ ( eats(y, x) & ~killed(y) ) → food(x) ], clause form: ~eats(y, x) V killed(y) V food(x)
v) Bill eats peanuts and is still alive: eats(Bill, peanuts) & ~killed(Bill), clause forms: eats(Bill, peanuts), ~killed(Bill)
vi) Sue eats everything bill eats:

∀x[ eats(Bill, x) → eats(Sue, x) ], clause form: ~eats(Bill, w) V eats(Sue, w)

Using resolution with the negated goal ~likes(John, peanuts): resolving it with ~food(x) V likes(John, x) gives ~food(peanuts); resolving this with ~eats(y, x) V killed(y) V food(x) gives ~eats(y, peanuts) V killed(y); resolving this with eats(Bill, peanuts) gives killed(Bill); resolving this with ~killed(Bill) gives the empty clause.

Hence, John likes peanuts. Proved.

5) Let,
easy(x) : x is an easy course.
likes(x, y) : x likes course y.
weaving(x) : x is a course in the basket weaving department
Predicates:
i) Steve only likes easy courses:

∀x[ easy(x) → likes(Steve, x) ]

ii) Science courses are hard:

~easy(Science)

iii) All the courses in the basket weaving department are easy:

∀x[ weaving(x) → easy(x) ]

iv) BK301 is a basket weaving course:

weaving(BK301)

Clauses:
i) ~easy(x) V likes(Steve, x)
ii) ~easy(Science)
iii) ~weaving(w) V easy(w)
iv) weaving(BK301)

Using resolution with the negated goal ~likes(Steve, BK301): resolving it with clause i) gives ~easy(BK301); resolving this with clause iii) gives ~weaving(BK301); resolving this with clause iv) gives the empty clause.

Hence, Steve would like the course BK301.

6.
Perceptrons, or single-layer neural networks, are artificial neural networks in which the input layer is connected directly to the output layer. One major drawback of the perceptron is that it can represent only linearly separable functions. Linearly separable functions are functions whose outputs can be completely separated by a hyperplane in the input space. Multilayer perceptrons (MLPs) address this drawback and can be used to model a much broader class of functions. An MLP consists of multiple layers of nodes in a directed graph, with each layer fully connected to the next one. Except for the input nodes, each node is a neuron (or processing element) with a nonlinear activation function. An MLP utilizes a supervised learning technique called backpropagation for training the network.
Backpropagation algorithm: learning and validation
In backpropagation, the loss is calculated by comparing the obtained output value with the expected output value. Because the true output value must be available, backpropagation is considered a supervised learning algorithm. Consider an MLP. After initializing the weights of all the synapses connected to each neuron to random values, the first training example is passed to the network, from which the error in the output is calculated. The error propagates backwards from the output nodes toward the input nodes. Technically speaking, backpropagation calculates the gradient of the network's error with respect to the network's modifiable weights. By adjusting the weights after each training iteration, the MLP is said to "learn". This learning process stops when the weights no longer change significantly between two iterations, and the current state of the ANN (i.e. the weight values) is then considered valid.
Consider an example of the logical gate XOR:

The above is an example of implementing the XOR function. The need for an MLP arises for XOR because its outputs cannot be separated by an (n-1)-dimensional hyperplane in the n-dimensional input space (i.e. XOR is not linearly separable).

Following the backpropagation principle, the output obtained in each iteration is compared with the desired result and the error is fed back into the network for weight adjustment. This process continues until the weight difference between adjacent iterations becomes insignificant.
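A minimal sketch of such a network is given below: a 2-2-1 MLP trained with backpropagation on the four XOR patterns, written in Python with NumPy. The layer sizes, learning rate and iteration count are illustrative choices, not values from the assignment.

# 2-2-1 MLP trained with plain gradient-descent backpropagation on XOR.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)        # XOR targets

W1 = rng.normal(size=(2, 2)); b1 = np.zeros((1, 2))    # random initial weights
W2 = rng.normal(size=(2, 1)); b2 = np.zeros((1, 1))
lr = 0.5

for _ in range(10000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backward pass: propagate the output error toward the input layer
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # weight adjustment after this iteration
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0, keepdims=True)

print(np.round(out, 2))   # typically approaches [0, 1, 1, 0] as training converges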

7.
Advantages of CNF and DNF:
DNF, or disjunctive normal form, consists of conjunctive clauses combined by disjunctions, for example (p & q) V (q & r). One obvious advantage of this form is that if one of the clauses evaluates to true, the rest can simply be ignored. Also, on systems where the AND operator has higher precedence than the OR operator, this form reduces the number of parentheses needed.
In conjunctive normal form (CNF), the AND operator combines a group of OR clauses. CNF has the advantage of supporting resolution, which produces a new clause from two clauses containing complementary literals.
For example:
X ↔ (Y V Z)
Step 1: Eliminate ↔ and convert it to implications:
X ↔ (Y V Z) ≡ (X → (Y V Z)) & ((Y V Z) → X)
Step 2: Convert the implications to disjunctive form:
(X → (Y V Z)) & ((Y V Z) → X) ≡ (~X V (Y V Z)) & (~(Y V Z) V X)
Step 3: Use De Morgan's law and double negation:
(~X V (Y V Z)) & (~(Y V Z) V X) ≡ (~X V (Y V Z)) & ((~Y & ~Z) V X)
Step 4: Apply the distributive law (V over &):
(~X V (Y V Z)) & ((~Y & ~Z) V X) ≡ (~X V Y V Z) & (~Y V X) & (~Z V X)
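The same conversion can be reproduced mechanically with sympy's to_cnf, assuming sympy is available (the variable names are the ones used above):

# CNF conversion of X <-> (Y v Z) using sympy.
from sympy import symbols
from sympy.logic.boolalg import Equivalent, to_cnf

X, Y, Z = symbols("X Y Z")
expr = Equivalent(X, Y | Z)   # X <-> (Y V Z)
print(to_cnf(expr))           # e.g. (X | ~Y) & (X | ~Z) & (Y | Z | ~X)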
8)
a) ( ~(P & Q) V (P V S) ) → R
( (~P V ~Q) V (P V S) ) → R
( ~P V ~Q V P V S ) → R
~( ~P V ~Q V P V S ) V R
( P & Q & ~P & ~S ) V R
( P V R ) & ( Q V R ) & ( ~P V R ) & ( ~S V R )
b) ~(P & ~Q) & (R → ~S)
( ~P V Q ) & ( ~R V ~S )
c) P → ( (Q & ~R) ↔ S )
~P V [ { (Q & ~R) → S } & { S → (Q & ~R) } ]
~P V [ { ~(Q & ~R) V S } & { ~S V (Q & ~R) } ]
~P V [ ( ~Q V R V S ) & ( ~S V Q ) & ( ~S V ~R ) ]
( ~P V ~Q V R V S ) & ( ~P V ~S V Q ) & ( ~P V ~S V ~R )
9.
Inference refers to the derivation of a logical conclusion from premises known or assumed to be true.
For example, from the premises:
All men are mortal
Socrates is a man
the conclusion can be derived as "Socrates is mortal"
Reasoning, on the other hand, is a search process involving the application of multiple inferences; thus reasoning is an abstraction of inference.
Probabilistic reasoning is the use of logic and probability to handle uncertain situations. An example of probabilistic reasoning is using past situations and statistics to predict an outcome. Probabilistic reasoning allows us to deal with uncertain and incomplete data, which plays a major part in many AI systems. It is also suitable when nearly optimal solutions are sufficient, or when general rules of inference are known or can be found for the problem. Probabilistic reasoning is applicable in many areas such as stock market prediction and sentiment analysis. Naive Bayes is one of the most widely used implementations of probabilistic reasoning, where statistical reasoning is used to classify previously unseen data into output classes on the basis of the training data.
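As a small illustration (not part of the original answer), the sketch below uses scikit-learn's GaussianNB on made-up two-feature data to classify previously unseen points; the data and class labels are purely hypothetical.

# Naive Bayes classification sketch with scikit-learn.
from sklearn.naive_bayes import GaussianNB

X_train = [[0.1, 1.2], [0.3, 0.9], [2.1, 3.3], [2.4, 2.9]]   # training features (made up)
y_train = [0, 0, 1, 1]                                       # known output classes

clf = GaussianNB().fit(X_train, y_train)
print(clf.predict([[0.2, 1.0], [2.2, 3.0]]))                 # classify previously unseen data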

10.
Using chain rule:
P(C,R,S,W) = P(C).P(S|C).P(R|C,S).P(W|C,S,R)
From conditional independence we know,
Rain(R) is independent of Sprinkler(S) given clouds(C) and wet grass(W) is independent of C given S and
R.
Hence,
P(C,R,S,W) = P(C).P(S|C).P(R|C).P(W|S,R).

Given,
P(C=true) = 0.5, P(S=true | C=true) = 0.1, P(R | C=true) = 0.8, P(W=true | S=true, R=false) = 0.9
Finally,
P(C=true, S=true, R=false, W=true) = 0.5 × 0.1 × 0.8 × 0.9 = 0.036
Hence, 3.6% of the samples are of the given type.
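The factored computation can also be written out as a short Python sketch, multiplying the CPT entries exactly as given above (variable names are for illustration only):

# Factored joint for the cloudy/sprinkler/rain/wet-grass network.
p_c      = 0.5    # P(C=true)
p_s_c    = 0.1    # P(S=true | C=true)
p_r_c    = 0.8    # P(R | C=true), as given above
p_w_s_nr = 0.9    # P(W=true | S=true, R=false)

# P(C, S, R, W) = P(C) * P(S|C) * P(R|C) * P(W|S,R)
joint = p_c * p_s_c * p_r_c * p_w_s_nr
print(joint)      # 0.036, i.e. about 3.6% of the samples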

11.
The best attribute for any given node of a decision tree is determined by how well it classifies the problem at hand. More formally, the attribute that causes the biggest entropy decrease is chosen as the root node. Then, with the branches of the root representing the values of the root attribute, the entropy decrease is again calculated for each remaining attribute, considering only the examples that follow the path of the branch we are currently at in the tree.
Let entropy[Y, N] denote the entropy of a set with Y 'yes' outcomes and N 'no' outcomes.
Number of total 'yes' outcomes: 9
Number of total 'no' outcomes: 5
Hence, entropy[9, 5] ≈ 0.28 (with logarithms taken to base 10; in bits, i.e. base 2, this is ≈ 0.94).
To select the root node, the attribute that creates the greatest entropy decrease is chosen.
In this case, the entropy decrease for outlook is the greatest, so it is chosen as the root.
Taking each value x of the outlook attribute as a branch, the entropy of each remaining attribute is calculated over the rows having value x for outlook. This is compared with the total entropy of the rows with value x, and the attribute with the greatest entropy decrease is chosen as the child of the branch represented by x.
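The entropy and information-gain computation behind this choice can be sketched in Python as below; the per-branch counts assume the classic 14-row play-tennis style data set (9 yes, 5 no overall), and the code uses log base 2, so values are in bits.

# Entropy and information gain for choosing the root attribute.
import math

def entropy(counts):
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)

print(entropy([9, 5]))               # about 0.94 bits for the whole data set

# Gain of splitting on "outlook", assuming the usual partitions:
# sunny = [2 yes, 3 no], overcast = [4 yes, 0 no], rainy = [3 yes, 2 no].
subsets = [(5, [2, 3]), (4, [4, 0]), (5, [3, 2])]
remainder = sum(n / 14 * entropy(c) for n, c in subsets)
print(entropy([9, 5]) - remainder)   # about 0.25, the largest gain, so outlook is the root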
Following this procedure, the decision tree is created as follows:

12.
A)Machine learning
Machine learning is the construction or modification of representations of the data being acquired by the machine. It depends on three factors: at a basic level, learning is the determination of the changes caused by the acquired data and of their representation; secondly, the changes must also carry over to how the machine handles similar tasks; and finally, the machine must properly address the changes in its performance due to the data and prevent performance degradation.
Machine learning is usually applied in areas where human interaction is hard or simply impossible, such as Mars exploration and speech processing.
Machine learning is also applied in classification tasks, where the input is required to be classified into a certain number of predetermined classes. The most prominent example is OCR. In OCR, the goal is to generate the character(s) shown in an input image, which may be numbers, letters or other symbols. The machine is first given a large amount of preclassified data, from which it must extract a pattern that helps it recognize such symbols in the future.
B)Learning by analogy

Learning by analogy refers to the acquisition of new knowledge about an input entity by transferring it from a known similar entity. One of the most classical examples of learning by analogy is the relation between MOSFETs and a water tap. As we all know, water cannot flow through the tap without turning the knob; only after turning it does the water flow from the source to the sink. Similarly, in the case of MOSFETs, without the gate (the knob) being turned on (by a certain voltage), current (the water) cannot flow between the source and the drain.
If a partially known target entity T and a goal are given along with domain knowledge about the known entities, new knowledge about T is derived with the help of the background information from the related entities.
C)Learning by induction
Learning by induction involves taking pairs of inputs and their functional values and deriving the correct form of the function that represents such pairs. Formally,
given pairs (x, f(x)),
we have to find an h(x) that approximates f(x).
The function h(x) is called a hypothesis, and it must be able to predict the functional values of unseen x correctly.
A classical example of inductive learning is curve fitting: given a set of two-dimensional points, we have to generate the equation of the curve. Inductive learning may sometimes produce multiple consistent hypotheses; in that case we choose the simplest hypothesis consistent with the data, as in the sketch below.
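A small curve-fitting sketch in Python follows; the sample points are made up purely for illustration, and the choice of a degree-2 polynomial stands in for "the simplest consistent hypothesis".

# Inductive learning as curve fitting: find h(x) approximating an unknown f(x).
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 9.2, 19.1, 32.8])   # observed pairs (x, f(x)) from an unknown f

coeffs = np.polyfit(x, y, deg=2)            # hypothesis: a degree-2 polynomial
h = np.poly1d(coeffs)

print(h(5.0))                               # predicted functional value for an unseen x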
D)Decision tree and ID3
Decision trees take as input an object described by a set of attributes and make a decision regarding the input (most commonly, yes or no).
Decision trees reach their decisions by performing a sequence of tests, where each test is represented by a tree node. Each test applies to a single attribute of the object. The branches of each node are labelled with the possible outcomes of the test. Finally, each leaf node represents the value to be returned if it is reached.
Decision tree learning algorithm ID3:
ID3 is a top-down, greedy search algorithm which always places the best available attribute at the root of the current (sub)tree. The quality of a node (an object attribute) is dictated by how well that attribute classifies the problem at hand.
Formally, ID3 uses information gain as a quantitative measure of the classification power of an attribute. Information gain is based on the entropy difference produced by splitting on an attribute; basically, the attribute that causes the biggest decrease in entropy is chosen as the root node. Then, for its descendants, the remaining attributes are compared in the same way, and the process continues until the leaf nodes are reached.

E)Supervised and unsupervised learning


Supervised learning involves training a program on training sets consisting of inputs together with their desired outputs; the program then improves its accuracy when given new data.
For example, to recognize an image of an elephant, the machine is first given images of elephants, from which it learns the pattern, and when given a new image the machine recognizes it. It refines its outputs by comparing them with the given outputs. This is the essence of supervised learning.
Unsupervised learning is a technique in which the program is given a collection of data and must find patterns and relationships in that data. It is required when there isn't an example data set with known answers; imagine searching for a hidden pattern in a data set. An application of this is clustering, i.e. dividing a set of elements into groups according to some unknown pattern.
