
FUZZY RULE-BASED SYSTEMS
Lecture 15
from: Fuzzy Rule-Based Modeling with Applications to Geophysical, Biological and Engineering Systems, by A. Bardossy and L. Duckstein, CRC Press, 1995

A. Fuzzy Sets
Dealing with uncertainty:
  Probability theory and statistics
  Bayesian statistics
  Fuzzy sets: nonfrequentist approach
Fuzzy set:
  Set of objects without clear boundaries or not well defined
  Partial membership

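Fuzzy set A on X:
$A = \{(x, \mu_A(x)) : x \in X,\ \mu_A(x) \in [0, 1]\}$

Crisp set: $\mu_A(x) = 0$ or $1$

Example: the set of young persons is a fuzzy set, with a linear transition between ages 25 and 40:

$\mu_A(x) = \begin{cases} 1 & x \le 25 \\ \dfrac{40 - x}{15} & 25 < x \le 40 \\ 0 & x > 40 \end{cases}$

[Figure: membership function of "young", equal to 1 up to x = 25 and falling linearly to 0 at x = 40]

$\mu_A(x)$ defines the truth value of $x \in A$

h-level set: $A(h) = \{x \mid \mu_A(x) \ge h\}$, the elements that belong to A to at least a degree h
$A(h_1) \subseteq A(h_2)$ if $h_1 > h_2$
so: A is described by its level sets $A(h)$, $0 < h < 1$

Learning: process of removing or reducing fuzziness

Cardinality of $A = \{(a_1, \mu_1), \ldots, (a_I, \mu_I)\}$:
$\mathrm{car}(A) = \sum_{i=1}^{I} \mu_i$

Complement: $C = A^c$
$\mu_C(x) = 1 - \mu_A(x)$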

Intersection: $D = A \cap B$
$\mu_D(x) = \min(\mu_A(x), \mu_B(x))$

Union: $E = A \cup B$
$\mu_E(x) = \max(\mu_A(x), \mu_B(x))$

Note: $A \cap A^c$ is not necessarily $\emptyset$

t-norm (intersection operator): let $D = A \cap B$
$\mu_D(x) = t(\mu_A(x), \mu_B(x))$

Examples of t-norms $t(a, b)$:
$ab$ (algebraic product)
$\dfrac{ab}{a + b - ab}$
$\dfrac{ab}{1 + (1 - a)(1 - b)}$
$\max(0, a + b - 1)$
... many others

t-conorm c (union operator), dual to the t-norm t:
$c(x, y) = 1 - t(1 - x, 1 - y)$
$t(x, y) = 1 - c(1 - x, 1 - y)$

Linguistic variables: words, phrases, expressions
e.g., tall:

$\mu_A(x) = \begin{cases} 0 & x \le 170 \text{ cm} \\ 1 - \dfrac{185 - x}{15} & 170 < x \le 185 \text{ cm} \\ 1 & x > 185 \text{ cm} \end{cases}$

Linguistic modifiers: VERY, MOSTLY, PRACTICALLY, SORT OF, NOT, INDEED, SOMEWHAT, ROUGHLY, etc.

e.g. $\mu_{\mathrm{VERY}}(x) = \mu(x)^2$
called concentration: since $\mu(x) \in [0, 1]$, squaring actually has the effect of decreasing membership

Dilation modifier:
$\mu_{\mathrm{MORE\ or\ LESS}}(x) = \sqrt{\mu(x)}$

Contrast modifier:
$\mu_{\mathrm{INDEED}}(x) = \begin{cases} 2\mu(x)^2 & 0 \le \mu(x) < 0.5 \\ 1 - 2(1 - \mu(x))^2 & 0.5 \le \mu(x) \le 1.0 \end{cases}$

To translate back to natural language, find the statement whose membership function is closest in Euclidean distance
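The basic operations of this section can be tried out numerically. The sketch below is illustrative only: the function names are made up here, and the "young" membership function assumes the denominator 15 that makes it fall linearly from 1 at age 25 to 0 at age 40.

```python
def mu_young(x):
    """Membership in "young": 1 up to 25, linear down to 0 at 40 (assumed slope 1/15)."""
    if x <= 25:
        return 1.0
    if x <= 40:
        return (40.0 - x) / 15.0
    return 0.0

def complement(mu):
    return 1.0 - mu

def very(mu):           # concentration
    return mu ** 2

def more_or_less(mu):   # dilation
    return mu ** 0.5

def indeed(mu):         # contrast modifier
    if mu < 0.5:
        return 2.0 * mu ** 2
    return 1.0 - 2.0 * (1.0 - mu) ** 2

def t_conorm(t_norm):
    """Build the dual t-conorm c(x, y) = 1 - t(1 - x, 1 - y)."""
    return lambda x, y: 1.0 - t_norm(1.0 - x, 1.0 - y)

def algebraic_product(a, b):
    return a * b

algebraic_sum = t_conorm(algebraic_product)   # works out to a + b - ab

m = mu_young(31)                              # (40 - 31) / 15
print(m, complement(m), very(m), more_or_less(m), indeed(m))
print(min(m, complement(m)))                  # A and its complement overlap
print(algebraic_sum(0.5, 0.8))
```

With min as the intersection, a 31-year-old belongs to both "young" and "not young" to degree 0.4, which illustrates the note that the intersection of A with its complement need not be empty.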

B. Fuzzy Numbers
Special case of fuzzy sets
Involves arithmetic operations
A fuzzy subset A of the real numbers is a fuzzy number if:
  there is at least one z such that $\mu_A(z) = 1$ [normality]
  for all $a < c < b$: $\mu_A(c) \ge \min(\mu_A(a), \mu_A(b))$ [convexity]

Convexity ensures that the h-level sets and the support of the fuzzy number are intervals
$\mathrm{supp}(A) = \{x \mid \mu_A(x) > 0\}$
[Figure: a nonconvex and a convex membership function, each with maximum 1]

Level sets: different numbers with a given minimum likeliness
Zadeh: $\mu_A(x)$ is the possibility of x ($\mu_A(x) = 0$: impossible; $\mu_A(x) = 1$: totally possible)
A crisp number is a fuzzy number with a single point
Membership value of a real number is the likeliness of occurrence of that number, not the likelihood

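Triangular fuzzy numbers [symmetric or asymmetric]: $(a_1, a_2, a_3)_T$

$\mu_A(x) = \begin{cases} 0 & x \le a_1 \\ \dfrac{x - a_1}{a_2 - a_1} & a_1 < x \le a_2 \\ \dfrac{a_3 - x}{a_3 - a_2} & a_2 < x \le a_3 \\ 0 & a_3 < x \end{cases}$

[Figure: triangular membership function rising from 0 at a1 to 1 at a2 and falling back to 0 at a3]

Trapezoidal fuzzy number with breakpoints $a_1, a_2, a_3, a_4$:

$\mu_A(x) = \begin{cases} 0 & x \le a_1 \\ \dfrac{x - a_1}{a_2 - a_1} & a_1 < x \le a_2 \\ 1 & a_2 < x \le a_3 \\ \dfrac{a_4 - x}{a_4 - a_3} & a_3 < x \le a_4 \\ 0 & a_4 < x \end{cases}$

[Figure: trapezoidal membership function rising from 0 at a1 to 1 at a2, flat to a3, and falling to 0 at a4]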
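A minimal sketch of the triangular and trapezoidal membership functions, together with the h-level set of a triangular fuzzy number. The function names are illustrative, and treating the peak explicitly is a choice made here so that one-sided numbers such as (3, 3, 9)_T do not divide by zero.

```python
def triangular(a1, a2, a3):
    """Membership function of the triangular fuzzy number (a1, a2, a3)_T."""
    def mu(x):
        if x == a2:
            return 1.0                      # peak; also covers a1 == a2 or a2 == a3
        if a1 < x < a2:
            return (x - a1) / (a2 - a1)
        if a2 < x < a3:
            return (a3 - x) / (a3 - a2)
        return 0.0
    return mu

def trapezoidal(a1, a2, a3, a4):
    """Membership function of a trapezoidal fuzzy number with breakpoints a1..a4."""
    def mu(x):
        if a2 <= x <= a3:
            return 1.0
        if a1 < x < a2:
            return (x - a1) / (a2 - a1)
        if a3 < x < a4:
            return (a4 - x) / (a4 - a3)
        return 0.0
    return mu

def level_set_triangular(a1, a2, a3, h):
    """Interval {x : mu(x) >= h} for 0 < h <= 1; convexity makes it an interval."""
    return (a1 + h * (a2 - a1), a3 - h * (a3 - a2))

print(triangular(1, 2, 3)(1.5))
print(trapezoidal(1, 2, 3, 4)(3.5))
print(level_set_triangular(1, 2, 3, 0.5))
```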

Fuzzy mean
we often need to defuzzify a fuzzy set, i.e., replace it with a crisp value
could take the element with highest membership value, but it may not be unique or representative of the fuzzy set, e.g. $(3, 3, 9)_T$, whose maximum is at 3
[Figure: membership function of (3, 3, 9)_T]

Fuzzy mean:
$M(A) = \dfrac{\int_{-\infty}^{+\infty} t\,\mu_A(t)\,dt}{\int_{-\infty}^{+\infty} \mu_A(t)\,dt}$

for $A = (a_1, a_2, a_3)_T$:
$M(A) = \dfrac{a_1 + a_2 + a_3}{3}$

Fuzzy mean for piecewise linear fuzzy membership function, for $x_0 < x_1 < \cdots < x_L$ breakpoints:
$M(A) = \dfrac{\sum_{l=0}^{L-1} (x_{l+1} - x_l)\left[\mu_A(x_l)\dfrac{2x_l + x_{l+1}}{6} + \mu_A(x_{l+1})\dfrac{x_l + 2x_{l+1}}{6}\right]}{\sum_{l=0}^{L-1} (x_{l+1} - x_l)\dfrac{\mu_A(x_{l+1}) + \mu_A(x_l)}{2}}$

Note: not all fuzzy sets have fuzzy means: e.g., $(a_1, a_2, \infty)_T$
Advantage of fuzzy mean: continuous, i.e., differences in membership functions produce smooth changes in fuzzy mean

Fuzzy median $m(A)$:
$\int_{-\infty}^{m(A)} \mu_A(t)\,dt = \int_{m(A)}^{+\infty} \mu_A(t)\,dt$
advantage: good for fuzzy sets defined on DISCRETE ordered sets X
disadvantage: not continuous; also, not necessarily UNIQUE

Expressing similarity between fuzzy sets:
$d(A, B) = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2 + (a_3 - b_3)^2}$
similar to Euclidean distance, but only applicable to TRIANGULAR membership functions

C. Fuzzy Rules
Problem of imprecise information and measurements
Crisp systems tolerate no exceptions and no errors
Wide differences in opinion: a statement AND its opposite may both be true to a certain degree
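The fuzzy mean formulas can be checked against each other with a short sketch. The helper names are illustrative; the piecewise-linear version takes the breakpoints and the membership values at those breakpoints as two lists.

```python
def fuzzy_mean_piecewise(xs, mus):
    """Fuzzy mean of a piecewise-linear membership function.

    xs  : breakpoints x_0 < x_1 < ... < x_L
    mus : membership values at those breakpoints
    """
    num = 0.0
    den = 0.0
    for l in range(len(xs) - 1):
        d = xs[l + 1] - xs[l]
        # exact integral of t * mu(t) over one linear segment
        num += d * (mus[l] * (2 * xs[l] + xs[l + 1])
                    + mus[l + 1] * (xs[l] + 2 * xs[l + 1])) / 6.0
        # exact integral of mu(t) over the same segment (trapezoid area)
        den += d * (mus[l] + mus[l + 1]) / 2.0
    if den == 0.0:
        raise ValueError("membership function has zero area")
    return num / den

def fuzzy_mean_triangular(a1, a2, a3):
    return (a1 + a2 + a3) / 3.0

def triangular_distance(a, b):
    """Distance between two triangular fuzzy numbers given as (a1, a2, a3) tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5

# (3, 3, 9)_T: the maximum is at 3, the fuzzy mean is further right
print(fuzzy_mean_piecewise([3, 9], [1, 0]), fuzzy_mean_triangular(3, 3, 9))
print(fuzzy_mean_piecewise([1, 2, 4], [0, 1, 0]), fuzzy_mean_triangular(1, 2, 4))
print(triangular_distance((1, 2, 3), (1, 2, 6)))
```

For (3, 3, 9)_T both routes give a mean of 5, while the element of highest membership is 3, which is the sense in which the maximum may not be representative.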

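Structure of fuzzy rule:
IF: $a_1$ is $A_{i1}$ $\odot$ $a_2$ is $A_{i2}$ $\odot \cdots \odot$ $a_K$ is $A_{iK}$
THEN: $B_i$
where each connective $\odot$ is one of {AND, OR, XOR}
or simplified:
IF: $A_{i1} \odot A_{i2} \odot \cdots \odot A_{iK}$ [premises] THEN: $B_i$ [consequence]
[different rules with different consequences can be applied to same premises]

Degree of fulfillment [DOF]
Product inference:
$\nu(\text{not } A_1) = 1 - \mu_{A_1}(a_1)$
$\nu(A_1 \text{ AND } A_2) = \mu_{A_1}(a_1)\,\mu_{A_2}(a_2)$
$\nu(A_1 \text{ OR } A_2) = \mu_{A_1}(a_1) + \mu_{A_2}(a_2) - \mu_{A_1}(a_1)\,\mu_{A_2}(a_2)$
$\nu(A_1 \text{ XOR } A_2) = \mu_{A_1}(a_1) + \mu_{A_2}(a_2) - 2\,\mu_{A_1}(a_1)\,\mu_{A_2}(a_2)$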

min-max inference:
$\nu(A_1 \text{ AND } A_2) = \min(\mu_{A_1}(a_1), \mu_{A_2}(a_2))$
$\nu(A_1 \text{ OR } A_2) = \max(\mu_{A_1}(a_1), \mu_{A_2}(a_2))$
$\nu(A_1 \text{ XOR } A_2) = \max\left(\min(1 - \mu_{A_1}(a_1), \mu_{A_2}(a_2)),\ \min(\mu_{A_1}(a_1), 1 - \mu_{A_2}(a_2))\right)$
[both reproduce Boolean truth table for crisp or 0-1 case]

min-max: accounts for only limiting or extreme arguments
product: accounts for fulfillment of ALL arguments

Example:
IF: $(1,2,3)_T$ AND ($(1,2,6)_T$ OR $(4,5,7)_T$) AND $(2.5,4,4.5)_T$
THEN: $(0,1,2)_T$
If:
  a1 = 1.5: membership 0.5
  a2 = 4: membership 0.5
  a3 = 4.8: membership 0.8
  a4 = 3: membership 0.333
product gives DOF = 0.150
min-max gives DOF = 0.333
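The example can be reproduced with a few lines. This is a sketch with illustrative helper names, using triangular membership functions and the product and min-max operators listed above.

```python
def tri(a1, a2, a3):
    """Membership function of the triangular fuzzy number (a1, a2, a3)_T."""
    def mu(x):
        if x == a2:
            return 1.0
        if a1 < x < a2:
            return (x - a1) / (a2 - a1)
        if a2 < x < a3:
            return (a3 - x) / (a3 - a2)
        return 0.0
    return mu

# memberships of the four facts in their premises
m1 = tri(1, 2, 3)(1.5)
m2 = tri(1, 2, 6)(4.0)
m3 = tri(4, 5, 7)(4.8)
m4 = tri(2.5, 4, 4.5)(3.0)

def product_and(a, b):
    return a * b

def product_or(a, b):
    return a + b - a * b

# IF A1 AND (A2 OR A3) AND A4
dof_product = product_and(product_and(m1, product_or(m2, m3)), m4)
dof_minmax = min(m1, max(m2, m3), m4)

print(round(dof_product, 3), round(dof_minmax, 3))
```

The min-max result is set entirely by the weakest argument (a4), whereas the product result is lowered by every partially fulfilled argument.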

Degree of fulfillment for AND coupling:
$D_i = \prod_{k=1}^{K} \mu_{A_{i,k}}(a_k) = \prod_{k=1}^{K} \nu_k$

Degree of fulfillment for OR coupling:
2 clauses:
$D_i = \mu_{A_{i,1}}(a_1) + \mu_{A_{i,2}}(a_2) - \mu_{A_{i,1}}(a_1)\,\mu_{A_{i,2}}(a_2)$
multiple clauses: recursive OR
$D_i(\nu_1, \ldots, \nu_K) = D_i(D_i(\nu_1, \ldots, \nu_{K-1}), \nu_K)$

Degree of fulfillment for MOST OF:
$D_i = 1 - \left[\dfrac{1}{K}\sum_{k=1}^{K}\left(1 - \mu_{A_{i,k}}(a_k)\right)^p\right]^{1/p}$

Degree of fulfillment for AT LEAST A FEW:
$D_i = \left[\dfrac{1}{K}\sum_{k=1}^{K}\mu_{A_{i,k}}(a_k)^p\right]^{1/p}$

p-norm:
  p = 1: perfect compensation
  p = $\infty$: no compensation
  p = 2: good compromise

Example: suppose we have the following 5 rule arguments for a given set of facts $a_1, a_2, a_3, a_4, a_5$:
$\nu_1 = \mu_{A_1}(a_1) = 0.9$
$\nu_2 = \mu_{A_2}(a_2) = 0.9$
$\nu_3 = \mu_{A_3}(a_3) = 0.5$
$\nu_4 = \mu_{A_4}(a_4) = 0.1$
$\nu_5 = \mu_{A_5}(a_5) = 0$

For AND coupling:
$D_i = 0.9 \cdot 0.9 \cdot 0.5 \cdot 0.1 \cdot 0.0 = 0$

For OR coupling (applied recursively):
$0.9 + 0.9 - 0.9 \cdot 0.9 = 0.99$
$0.99 + 0.5 - 0.99 \cdot 0.5 = 0.995$
$0.995 + 0.1 - 0.995 \cdot 0.1 = 0.9955$
$D_i = 0.9955 + 0 - 0.9955 \cdot 0 = 0.9955$

For MOST OF coupling (p = 2):
$D_i = 1 - \left[\dfrac{(1 - 0.9)^2 + (1 - 0.9)^2 + (1 - 0.5)^2 + (1 - 0.1)^2 + (1 - 0)^2}{5}\right]^{1/2} = 0.355$

For AT LEAST A FEW coupling (p = 2):
$D_i = \left[\dfrac{(0.9)^2 + (0.9)^2 + (0.5)^2 + (0.1)^2 + (0)^2}{5}\right]^{1/2} = 0.613$
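The four couplings can be compared on the five example arguments with a short sketch. Function names are illustrative; the OR coupling is the recursive two-clause formula, and MOST OF and AT LEAST A FEW use the p-norm form with p = 2 by default.

```python
from functools import reduce
from math import prod

def dof_and(nus):
    """AND coupling: product of the argument memberships."""
    return prod(nus)

def dof_or(nus):
    """OR coupling: two-clause formula applied recursively."""
    return reduce(lambda d, v: d + v - d * v, nus)

def dof_most_of(nus, p=2.0):
    k = len(nus)
    return 1.0 - (sum((1.0 - v) ** p for v in nus) / k) ** (1.0 / p)

def dof_at_least_a_few(nus, p=2.0):
    k = len(nus)
    return (sum(v ** p for v in nus) / k) ** (1.0 / p)

nus = [0.9, 0.9, 0.5, 0.1, 0.0]
print(dof_and(nus))
print(round(dof_or(nus), 4))
print(round(dof_most_of(nus), 3))
print(round(dof_at_least_a_few(nus), 3))
```

A single unfulfilled argument drives the AND coupling to zero, while the two p-norm couplings let well-fulfilled arguments partly compensate for it.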

D. Combination of Fuzzy Rule Responses
With fuzzy rules (unlike crisp rules), several rules can be applied from the same premise vector $(a_1, \ldots, a_K)$
Need to be able to specify an overall response
DOF of rule i: $\nu_i = D_i(a_1, \ldots, a_K) \Rightarrow B_i$ (response)

Problem: define fuzzy set B based on the individual $(B_i, \nu_i)$,
i.e., $B = C((B_1, \nu_1), \ldots, (B_I, \nu_I))$ [combination operator]
Assume all rules have consequences which are fuzzy subsets of the same set
3 methods:
  minimum
  maximum
  additive

Minimum combination:
$\mu_B(x) = \min_{\nu_i > 0} \nu_i\,\mu_{B_i}(x)$
disadvantage: requires much care; can easily get $\mu_B(x) = 0$

Cresting minimum combination:
$\mu_B(x) = \min_{\nu_i > 0} \min(\nu_i, \mu_{B_i}(x))$

Maximum combination:
$\mu_B(x) = \max_{i = 1, \ldots, I} \nu_i\,\mu_{B_i}(x)$
tolerates disagreements but does not emphasize agreement: vague

Cresting maximum combination:
$\mu_B(x) = \max_{i = 1, \ldots, I} \min(\nu_i, \mu_{B_i}(x))$

Additive combination:
1) Weighted sum:
$\mu_B(x) = \dfrac{\sum_{i=1}^{I} \nu_i\,\mu_{B_i}(x)}{\max_u \sum_{i=1}^{I} \nu_i\,\mu_{B_i}(u)}$

2) Normed weighted sum:
$\mu_B(x) = \dfrac{\sum_{i=1}^{I} \nu_i\,\beta_i\,\mu_{B_i}(x)}{\max_u \sum_{i=1}^{I} \nu_i\,\beta_i\,\mu_{B_i}(u)}$
where $\dfrac{1}{\beta_i} = \int \mu_{B_i}(x)\,dx$ [continuous] $= \mathrm{car}(B_i) = \sum_j \mu_i(j)$ [discrete]
A rule is better if its consequence is more specific, i.e., not vague

3) Cresting weighted sum:
$\mu_B(x) = \dfrac{\sum_{i=1}^{I} \min(\nu_i, \mu_{B_i}(x))}{\max_u \sum_{i=1}^{I} \min(\nu_i, \mu_{B_i}(u))}$

4) Cresting normed weighted sum:
$\mu_B(x) = \dfrac{\sum_{i=1}^{I} \beta_i \min(\nu_i, \mu_{B_i}(x))}{\max_u \sum_{i=1}^{I} \beta_i \min(\nu_i, \mu_{B_i}(u))}$
rules with crisper answers carry more weight

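Example:
Rule 1: $\nu_1 = 0.4$, $B_1 = (0, 2, 4)_T$
Rule 2: $\nu_2 = 0.5$, $B_2 = (3, 4, 5)_T$

minimum combination:
$\min\left(0.4\,\dfrac{4 - x}{2},\ 0.5(x - 3)\right)$, where $0.4\,\dfrac{4 - x}{2} = 0.5(x - 3)$ at $x = \dfrac{23}{7}$

$\mu_B(x) = \begin{cases} 0.5(x - 3) & 3 < x \le \dfrac{23}{7} \\ 0.4\,\dfrac{4 - x}{2} & \dfrac{23}{7} < x < 4 \end{cases}$

cresting minimum:
near the crossing point, $\min\left(0.4, \dfrac{4 - x}{2}\right) = \dfrac{4 - x}{2}$ and $\min(0.5, x - 3) = x - 3$, where $\dfrac{4 - x}{2} = x - 3$ at $x = \dfrac{10}{3}$

$\mu_B(x) = \begin{cases} x - 3 & 3 < x \le \dfrac{10}{3} \\ \dfrac{4 - x}{2} & \dfrac{10}{3} < x < 4 \end{cases}$

[Figure: two panels over x from 0 to 6; minimum and cresting minimum combinations, and maximum and cresting maximum combinations]

weighted sum combination:

$f(x) = \sum_i \nu_i\,\mu_{B_i}(x) = \begin{cases} 0.4\,\dfrac{x}{2} & 0 < x \le 2 \\ 0.4\,\dfrac{4 - x}{2} & 2 < x \le 3 \\ 0.4\,\dfrac{4 - x}{2} + 0.5(x - 3) & 3 < x \le 4 \\ 0.5(5 - x) & 4 < x \le 5 \end{cases}$

$\max_u \sum_i \nu_i\,\mu_{B_i}(u) = 0.5$, so $\mu_B(x) = 2 f(x)$

normed weighted sum:
area under $\mu_{B_1}$: $\dfrac{1}{\beta_1} = 2$
area under $\mu_{B_2}$: $\dfrac{1}{\beta_2} = 1$

$f(x) = \sum_i \nu_i\,\beta_i\,\mu_{B_i}(x) = \begin{cases} 0.2\,\dfrac{x}{2} & 0 < x \le 2 \\ 0.2\,\dfrac{4 - x}{2} & 2 < x \le 3 \\ 0.2\,\dfrac{4 - x}{2} + 0.5(x - 3) & 3 < x \le 4 \\ 0.5(5 - x) & 4 < x \le 5 \end{cases}$

$\mu_B(x) = 2 f(x)$

[Figure: two panels over x from 0 to 6; weighted sum and cresting weighted sum combinations, and normed weighted sum and cresting normed weighted sum combinations]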
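The combinations in this example can be evaluated numerically. The sketch below is illustrative: the function names are made up here, the normalizing maximum is taken over a grid on [0, 5] rather than found analytically, and the weight of each rule in the normed variants is taken as the inverse of the triangle area.

```python
def tri(a1, a2, a3):
    """Membership function of the triangular fuzzy number (a1, a2, a3)_T."""
    def mu(x):
        if x == a2:
            return 1.0
        if a1 < x < a2:
            return (x - a1) / (a2 - a1)
        if a2 < x < a3:
            return (a3 - x) / (a3 - a2)
        return 0.0
    return mu

rules = [(0.4, (0, 2, 4)), (0.5, (3, 4, 5))]    # (DOF, consequence B_i)
grid = [i / 100 for i in range(501)]            # 0.00, 0.01, ..., 5.00

def scaled(x, crest=False):
    """Per-rule responses: DOF * mu_Bi(x), or min(DOF, mu_Bi(x)) when cresting."""
    out = []
    for nu, b in rules:
        m = tri(*b)(x)
        out.append(min(nu, m) if crest else nu * m)
    return out

def minimum_comb(x, crest=False):
    return min(scaled(x, crest))

def maximum_comb(x, crest=False):
    return max(scaled(x, crest))

def additive_comb(x, normed=False, crest=False):
    def f(u):
        total = 0.0
        for (nu, b), r in zip(rules, scaled(u, crest)):
            # inverse area of a unit-height triangle, used only when normed
            beta = 2.0 / (b[2] - b[0]) if normed else 1.0
            total += beta * r
        return total
    return f(x) / max(f(u) for u in grid)

print(minimum_comb(23 / 7))                 # crossing point of the two branches
print(minimum_comb(10 / 3, crest=True))
print(additive_comb(3.5), additive_comb(3.5, normed=True))
```

Because the maximum is located on a grid, the normalization is only as accurate as the grid spacing in general; in this example the peak at x = 4 is a grid point.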

Advantage of additive methods for combination: same support
Disadvantage of additive methods: repeating a rule gives a different response

E. Defuzzification
Often necessary to replace fuzzy consequence B with single crisp consequence b
Example: prediction for forecasting or control decision: $b = D_f(B)$
Methods:
  1) maximum
  2) mean
  3) median

Defuzzification by maximum:
$\mu_B(b) = \max_x \mu_B(x)$
not necessarily unique

Defuzzification by mean:
$b = D_f(B) = M(B)$
for weighted sum combination:
$M(B) = \dfrac{\sum_{i=1}^{I} \nu_i\,\dfrac{1}{\beta_i}\,M(B_i)}{\sum_{i=1}^{I} \nu_i\,\dfrac{1}{\beta_i}}$
easy!
for normed weighted sum combination:
$M(B) = \dfrac{\sum_{i=1}^{I} \nu_i\,M(B_i)}{\sum_{i=1}^{I} \nu_i}$

For weighted sum combination and mean defuzzification, rules can be:
IF $A_{i,1}$ AND $\cdots$ AND $A_{i,K}$
THEN: $(M(B_i), \beta_i)$
That is, we only have to represent fuzzy consequence $B_i$ by its mean $M(B_i)$ and the area under $\mu_{B_i}$, given by $1/\beta_i$
For normed weighted sum combination and mean defuzzification, only need $M(B_i)$

Defuzzification by median:
$b = D_f(B) = m(B)$

Best methods are:
  product inference
  additive combinations
  fuzzy mean defuzzification
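A small sketch of mean defuzzification for the two additive combinations, applied to the two rules of the combination example (DOFs 0.4 and 0.5, consequences (0, 2, 4)_T and (3, 4, 5)_T). The function names are illustrative, and the areas are passed directly, on the assumption that each rule's weight in the weighted sum is the area under its consequence.

```python
def mean_defuzz_weighted_sum(nus, means, areas):
    """Mean of the weighted sum combination: each rule counts with DOF * area."""
    num = sum(nu * area * m for nu, m, area in zip(nus, means, areas))
    den = sum(nu * area for nu, area in zip(nus, areas))
    return num / den

def mean_defuzz_normed(nus, means):
    """Mean of the normed weighted sum combination: only DOFs and means needed."""
    return sum(nu * m for nu, m in zip(nus, means)) / sum(nus)

nus = [0.4, 0.5]
means = [(0 + 2 + 4) / 3, (3 + 4 + 5) / 3]   # fuzzy means of the consequences
areas = [2.0, 1.0]                           # areas under the two triangles

print(mean_defuzz_weighted_sum(nus, means, areas))
print(mean_defuzz_normed(nus, means))
```

The normed version gives the wide consequence B1 less influence, so its crisp answer lies closer to the mean of the crisper consequence B2.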

F. Rule Systems
Rule system:
IF $A_{i,1} \odot A_{i,2} \odot \cdots \odot A_{i,K}$ THEN $B_i$ (for $i = 1, \ldots, I$), with connectives $\odot$ such as AND, OR
$A_{i,k}$: fuzzy subsets of $X_k$
$B_i$: fuzzy subsets of Y
Numerical Rule System: $A_{i,k}$ and $B_i$ are fuzzy numbers
this does not imply that all arguments are used for every rule: e.g., assign $\mu_{A_{i,k}}(x) = 1$ for all arguments k not used in rule i

To calculate response of rule system:
  inference method
  combination method
  defuzzification method
complete if for every premise vector $(a_1, \ldots, a_K) \in A$, response $B(a_1, \ldots, a_K)$ is a nonempty fuzzy set

Rule system with maximum or additive combination methods is complete on Q iff, for every premise vector in Q:
$\exists$ a rule i such that $D_i(a_1, \ldots, a_K) > 0$
[minimum does not work]

Example:
$A_{1,1} = (1, 2, 3)_T$, $B_1 = (1, 2, 3)_T$
$A_{2,1} = (2, 3, 4)_T$, $B_2 = (3, 4, 5)_T$
not complete under minimum combination:
$\min(\nu_1\,\mu_{B_1}(x), \nu_2\,\mu_{B_2}(x)) = 0$ for support [2, 3]
It is complete for maximum or additive

For maximum and additive combinations, all we need for completeness is:
$A \subset \bigcup_{i=1}^{I} \mathrm{supp}(\nu_i)$
e.g., above: $1 \le x \le 3$; $2 \le x \le 4$ $\Rightarrow$ $1 \le x \le 4$

Rule system is nondegenerate if:
for all i and k, $\mu_{A_{i,k}}$ and $\mu_{B_i}$ are continuous
If nondegenerate and complete, then the mean defuzzification response is a continuous function on A.

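Example:
if $(1, 2, 3)_T$ then $(0, 1, 2)_T$
if $(3, 4, 5)_T$ then $(2, 3, 4)_T$
if $(0, 3, 6)_T$ then $(0, 2, 4)_T$
complete and nondegenerate on interval [1, 5]
Use normed weighted sum combination with mean defuzzification

for $1 \le x \le 2$: $D_i$ values are $(x - 1)$, $0$, $\dfrac{x}{3}$ resp.
for $2 \le x \le 3$: $D_i$ values are $(3 - x)$, $0$, $\dfrac{x}{3}$ resp.
for $3 \le x \le 4$: $D_i$ values are $0$, $(x - 3)$, $\dfrac{6 - x}{3}$ resp.
for $4 \le x \le 5$: $D_i$ values are $0$, $(5 - x)$, $\dfrac{6 - x}{3}$ resp.

With consequence means 1, 3 and 2, the defuzzified response is:

for $1 \le x \le 2$: $\dfrac{1(x - 1) + 3(0) + 2\,\dfrac{x}{3}}{(x - 1) + 0 + \dfrac{x}{3}} = \dfrac{5x - 3}{4x - 3}$

for $2 \le x \le 3$: $\dfrac{1(3 - x) + 3(0) + 2\,\dfrac{x}{3}}{(3 - x) + 0 + \dfrac{x}{3}} = \dfrac{9 - x}{9 - 2x}$

for $3 \le x \le 4$: $\dfrac{1(0) + 3(x - 3) + 2\,\dfrac{6 - x}{3}}{0 + (x - 3) + \dfrac{6 - x}{3}} = \dfrac{7x - 15}{2x - 3}$

for $4 \le x \le 5$: $\dfrac{1(0) + 3(5 - x) + 2\,\dfrac{6 - x}{3}}{0 + (5 - x) + \dfrac{6 - x}{3}} = \dfrac{57 - 11x}{21 - 4x}$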
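The closed-form pieces of this example can be cross-checked by evaluating the rule system directly. This is a sketch with illustrative names: it assumes the three rules (1,2,3)_T to (0,1,2)_T, (3,4,5)_T to (2,3,4)_T and (0,3,6)_T to (0,2,4)_T, normed weighted sum combination, and mean defuzzification.

```python
def tri(a1, a2, a3):
    """Membership function of the triangular fuzzy number (a1, a2, a3)_T."""
    def mu(x):
        if x == a2:
            return 1.0
        if a1 < x < a2:
            return (x - a1) / (a2 - a1)
        if a2 < x < a3:
            return (a3 - x) / (a3 - a2)
        return 0.0
    return mu

# (premise, consequence) pairs as triangular fuzzy numbers
rules = [((1, 2, 3), (0, 1, 2)),
         ((3, 4, 5), (2, 3, 4)),
         ((0, 3, 6), (0, 2, 4))]

def response(x):
    """Crisp response: DOF-weighted average of the consequence means."""
    nus = [tri(*premise)(x) for premise, _ in rules]
    means = [sum(consequence) / 3 for _, consequence in rules]
    return sum(nu * m for nu, m in zip(nus, means)) / sum(nus)

def closed_form(x):
    """Piecewise rational expressions derived for the four sub-intervals."""
    if x <= 2:
        return (5 * x - 3) / (4 * x - 3)
    if x <= 3:
        return (9 - x) / (9 - 2 * x)
    if x <= 4:
        return (7 * x - 15) / (2 * x - 3)
    return (57 - 11 * x) / (21 - 4 * x)

for x in (1.5, 2.5, 3.5, 4.5):
    print(x, response(x), closed_form(x))
```

The two columns agree, and the neighbouring pieces meet at the same value at x = 2, 3 and 4, which is the continuity expected from a complete, nondegenerate system with mean defuzzification.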

[Figure: defuzzified response of the rule system plotted against x, with response values between 0 and 3]

3 crisp rules would give 3 different values as consequences
Objects in discrete categories (e.g., HIGH, MEDIUM, LOW) can be given continuous representation

Proposition:
If $A = [a_1^-, a_1^+] \times \cdots \times [a_K^-, a_K^+]$ and $f(a_1, \ldots, a_K)$ is a continuous function on A, then for any $\varepsilon > 0$, any inference and combination methods and any defuzzification method, $\exists$ a rule system R such that:
$|f(a_1, \ldots, a_K) - R(a_1, \ldots, a_K)| < \varepsilon \quad \forall (a_1, \ldots, a_K) \in A$

Corollary:
Every nondegenerate rule system, inference and combination method with mean defuzzification can be replaced by a rule system using:
  AND operator
  product inference
  weighted combination
  fuzzy mean defuzzification

Use of closed form functions:
  have specific shape, e.g., linear
  parameters assessed with specific technique, e.g., least-squares
  try to approximate observed data
  parameters may not have physical meaning

Advantage of rule systems:
  rules can define function in specific neighborhoods
  errors in coefficients of polynomial can be disastrous
  errors in a single rule only influence function on the support of that rule

G. Membership Functions in Rule Systems
What is a good rule?
How crisp should arguments and responses be?
What shape for membership functions?
If we are using normed weighted sum combination and mean defuzzification: DOESN'T MATTER!

Arguments with narrow (i.e., crisp) supports: need many rules
Very wide supports: nonspecific responses
Triangular or trapezoidal membership functions are the most popular

H. Fuzzy Rule Construction
Assessment of rules: knowledge and/or available data encoded into rules
Ways:
  1) rules known by experts*
  2) rules from experts, but updated from data
  3) not known, but variables specified
  4) only observations available*

Example: algae growth model
x: available nutrients
y: algae biomass
[use scale between 0 and 1]
States described by high (H) or low (L) nutrients and algae biomass
Experts: HH $\to$ LH $\to$ LL $\to$ HL $\to$ HH
[$\to$ represents time transition; first letter: nutrients, second letter: algae]

Expert Rules: Starting at HH, if algae is high, nutrients are reduced to L at t+1; then, since nutrients are insufficient, algae becomes L in the next period; as algae breaks down, nutrients are replenished and become H again; then algae becomes H again in the next time period.

Let H and L be triangular fuzzy numbers:
$H = (0.4, 1.0, 1.0)_T$
$L = (0.0, 0.0, 0.7)_T$
[Figure: membership functions of L and H on the scale 0 to 1.0, with breakpoints at 0.4 and 0.7]

Calculate means:
$M(H) = \dfrac{0.4 + 1.0 + 1.0}{3} = \dfrac{2.4}{3} = 0.8$
$M(L) = \dfrac{0.0 + 0.0 + 0.7}{3} = \dfrac{0.7}{3} = 0.233$

Construct state vector trajectory:
Let initial state be: $(x(0), y(0)) = (0.5, 0.6)$

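Membership values at the initial state:
for nutrients x: $\mu_L(0.5) = \dfrac{2}{7}$, $\mu_H(0.5) = \dfrac{1}{6}$
for algae y: $\mu_L(0.6) = \dfrac{1}{7}$, $\mu_H(0.6) = \dfrac{2}{6}$

Use product inference for DOF:
AND rule fulfillment grade $\nu(i, j)$ for $(i, j) \in \{L, H\}$ is: $\nu(i, j) = \mu_i(0.5)\,\mu_j(0.6)$

DOF:
              mu_H(0.6)   mu_L(0.6)
mu_H(0.5)     1/18        1/42
mu_L(0.5)     2/21        2/49

Transition table:
Rule          DOF
HH -> LH      1/18
LH -> LL      2/21
LL -> HL      2/49
HL -> HH      1/42
there are really 8 rules, since each transition sets both variables: HH $\to$ L [x] AND H [y]

Fuzzy mean combination of rules: $\mu_B(x)$ replaced with mean value, and:
$\beta_H = \dfrac{1}{0.3}$, $\beta_L = \dfrac{1}{0.35}$

$x(1) = \dfrac{\dfrac{1}{18}\cdot\dfrac{0.7}{3}\cdot 0.35 + \dfrac{2}{21}\cdot\dfrac{0.7}{3}\cdot 0.35 + \dfrac{2}{49}\cdot\dfrac{2.4}{3}\cdot 0.3 + \dfrac{1}{42}\cdot\dfrac{2.4}{3}\cdot 0.3}{\dfrac{1}{18}\cdot 0.35 + \dfrac{2}{21}\cdot 0.35 + \dfrac{2}{49}\cdot 0.3 + \dfrac{1}{42}\cdot 0.3} = 0.3856$

$y(1) = \dfrac{\dfrac{1}{18}\cdot\dfrac{2.4}{3}\cdot 0.3 + \dfrac{2}{21}\cdot\dfrac{0.7}{3}\cdot 0.35 + \dfrac{2}{49}\cdot\dfrac{0.7}{3}\cdot 0.35 + \dfrac{1}{42}\cdot\dfrac{2.4}{3}\cdot 0.3}{\dfrac{1}{18}\cdot 0.3 + \dfrac{2}{21}\cdot 0.35 + \dfrac{2}{49}\cdot 0.35 + \dfrac{1}{42}\cdot 0.3} = 0.4222$

Same procedure used to obtain x(t+1), y(t+1) as function of x(t), y(t), etc.
[Figure: nutrients x(t) and algae y(t) plotted against t, starting from 0.5 and 0.6; oscillatory behavior]
same as solution of coupled diff. eqs.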
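One step of the algae model can be sketched as follows. The names are illustrative, and the sketch assumes the four expert transitions HH to LH, LH to LL, LL to HL, HL to HH, product inference, and weights equal to DOF times the area under the consequence (0.3 for H, 0.35 for L). Evaluated this way, the x(1) expression comes to about 0.386 and the y(1) expression to about 0.422.

```python
def tri(a1, a2, a3):
    """Membership function of the triangular fuzzy number (a1, a2, a3)_T."""
    def mu(x):
        if x == a2:
            return 1.0
        if a1 < x < a2:
            return (x - a1) / (a2 - a1)
        if a2 < x < a3:
            return (a3 - x) / (a3 - a2)
        return 0.0
    return mu

SETS = {"H": (0.4, 1.0, 1.0), "L": (0.0, 0.0, 0.7)}
MEAN = {k: sum(v) / 3 for k, v in SETS.items()}           # fuzzy means
AREA = {k: (v[2] - v[0]) / 2 for k, v in SETS.items()}    # areas, i.e. 1/beta

# state "XY" = (nutrient level X, algae level Y) -> state at the next time step
RULES = {"HH": "LH", "LH": "LL", "LL": "HL", "HL": "HH"}

def step(x, y):
    """Advance (nutrients, algae) by one time step."""
    num_x = den_x = num_y = den_y = 0.0
    for state, nxt in RULES.items():
        nu = tri(*SETS[state[0]])(x) * tri(*SETS[state[1]])(y)   # product inference
        wx = nu * AREA[nxt[0]]
        wy = nu * AREA[nxt[1]]
        num_x += wx * MEAN[nxt[0]]
        den_x += wx
        num_y += wy * MEAN[nxt[1]]
        den_y += wy
    return num_x / den_x, num_y / den_y

x, y = 0.5, 0.6
for t in range(1, 9):
    x, y = step(x, y)
    print(t, round(x, 4), round(y, 4))
```

Printing a few steps gives a quick way to inspect the trajectory produced by repeated application of the same rules.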

I. Deriving Rule Systems from Data Sets
Training set:
$T = \{(a_1(s), \ldots, a_K(s), b(s));\ s = 1, \ldots, S\}$
Fuzzy rules can be used to:
  simplify complex models
  can generate synthetic training sets

Counting algorithm:
1. Define support $(\alpha_{i,k}^-, \alpha_{i,k}^+)$ for each $A_{i,k}$

2. Assume $A_{i,k} = (\alpha_{i,k}^-, \alpha_{i,k}^1, \alpha_{i,k}^+)_T$ for each $A_{i,k}$, where the mean is
$\alpha_{i,k}^1 = \dfrac{1}{N_i}\sum_{s \in R_i} a_k(s)$
where $R_i = \{(a_1(s), \ldots, a_K(s), b(s)) \in T$ such that $a_k(s) \in (\alpha_{i,k}^-, \alpha_{i,k}^+),\ k = 1, \ldots, K\}$
$R_i$ denotes the set of all those premise vectors that fulfill at least in part the i-th rule; it forms a subset of the training set T; $N_i$ is the number of elements in $R_i$.

3. Define response $(\beta_i^-, \beta_i^1, \beta_i^+)_T$:
$\beta_i^- = \min_{s \in R_i} b(s)$
$\beta_i^1 = \dfrac{1}{N_i}\sum_{s \in R_i} b(s)$
$\beta_i^+ = \max_{s \in R_i} b(s)$

Rule System is:
IF $(\alpha_{i1}^-, \alpha_{i1}^1, \alpha_{i1}^+)_T$ AND $\cdots$ AND $(\alpha_{iK}^-, \alpha_{iK}^1, \alpha_{iK}^+)_T$
THEN: $(\beta_i^-, \beta_i^1, \beta_i^+)_T$
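The three steps of the counting algorithm can be sketched for one rule as below. The function name, the data layout (a list of (premise tuple, output) pairs) and the toy training set are all assumptions made for illustration.

```python
def counting_rule(training, supports):
    """Assess one rule from the training points that fall inside its supports.

    training : list of (a, b) with a = tuple of K premise values, b = output
    supports : list of K (lower, upper) bounds chosen for this rule
    Returns (premises, response) as triangular fuzzy numbers, or None if no
    training point falls inside the supports.
    """
    # R_i: training points whose every argument lies inside its support
    r_i = [(a, b) for a, b in training
           if all(lo < a_k < hi for a_k, (lo, hi) in zip(a, supports))]
    if not r_i:
        return None
    n_i = len(r_i)
    # premise k: (lower bound, mean of the selected a_k, upper bound)
    premises = [(lo, sum(a[k] for a, _ in r_i) / n_i, hi)
                for k, (lo, hi) in enumerate(supports)]
    # response: (min, mean, max) of the selected outputs
    outputs = [b for _, b in r_i]
    response = (min(outputs), sum(outputs) / n_i, max(outputs))
    return premises, response

# made-up one-argument training set
training = [((1.0,), 2.0), ((2.0,), 4.0), ((3.0,), 3.0), ((5.0,), 9.0)]
print(counting_rule(training, [(0.0, 4.0)]))
```

In the toy data the point at a = 5 lies outside the support (0, 4), so it does not influence this rule.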

Questions:
  How to define supports?
  Use split sampling methods?

Weighted counting algorithm
considers minimal DOF $\varepsilon$
calculate DOFs $\nu_i(s)$
For each rule i:
$\beta_i^- = \min_{\nu_i(s) > \varepsilon} b(s)$
$\beta_i^1 = \dfrac{\sum_{\nu_i(s) > \varepsilon} \nu_i(s)\,b(s)}{\sum_{\nu_i(s) > \varepsilon} \nu_i(s)}$
$\beta_i^+ = \max_{\nu_i(s) > \varepsilon} b(s)$
The higher the value of $\varepsilon$, the fewer elements used to define the responses, and the crisper are the assessed rules
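A sketch of the weighted counting step for one rule, assuming the DOFs of the training points have already been computed from the premise membership functions. The function name, the symbol eps for the minimal DOF, and the numbers are illustrative.

```python
def weighted_counting_response(dofs, outputs, eps):
    """Response (min, DOF-weighted mean, max) from points with DOF above eps."""
    kept = [(nu, b) for nu, b in zip(dofs, outputs) if nu > eps]
    if not kept:
        return None
    total = sum(nu for nu, _ in kept)
    mean = sum(nu * b for nu, b in kept) / total
    values = [b for _, b in kept]
    return (min(values), mean, max(values))

dofs = [0.2, 0.6, 1.0, 0.8]        # made-up DOFs of four training points
outputs = [10.0, 4.0, 6.0, 8.0]    # corresponding observed outputs

print(weighted_counting_response(dofs, outputs, eps=0.5))
print(weighted_counting_response(dofs, outputs, eps=0.0))
```

Raising eps from 0 to 0.5 drops the weakly fulfilled point with output 10, which narrows the response from (4, ..., 10) to (4, ..., 8), in line with the remark that higher thresholds give crisper rules.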

Least Squares Method:
$\min \sum_{s} \left[R(a_1(s), \ldots, a_K(s)) - b(s)\right]^2$

Rule response for normed weighted sum combination:
$R(a_1(s), \ldots, a_K(s)) = \dfrac{\sum_{i=1}^{I} \nu_i(s)\,M(B_i)}{\sum_{i=1}^{I} \nu_i(s)}$

If we are using normed weighted sum combination and mean defuzzification, then the unknowns are: $M(B_1), \ldots, M(B_I)$
Shapes of premise membership functions are assumed in order to get the $\nu_i$
Note: the other methods can be used for any combination and defuzzification
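Because the response is linear in the unknown means, the least-squares problem reduces to a linear system. The sketch below is one way to set it up for a single argument, using triangular premises and plain normal equations solved by Gaussian elimination; all names and the toy data are assumptions, and it fails with a division by zero if some rule is never fulfilled by the training data.

```python
def tri(a1, a2, a3):
    """Membership function of the triangular fuzzy number (a1, a2, a3)_T."""
    def mu(x):
        if x == a2:
            return 1.0
        if a1 < x < a2:
            return (x - a1) / (a2 - a1)
        if a2 < x < a3:
            return (a3 - x) / (a3 - a2)
        return 0.0
    return mu

def solve(a, y):
    """Solve the square system a m = y by Gaussian elimination with pivoting."""
    n = len(y)
    aug = [row[:] + [y[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[r][j] -= f * aug[c][j]
    m = [0.0] * n
    for r in range(n - 1, -1, -1):
        m[r] = (aug[r][n] - sum(aug[r][j] * m[j] for j in range(r + 1, n))) / aug[r][r]
    return m

def fit_means(premises, samples):
    """Least-squares estimates of the consequence means M(B_i).

    premises : list of triangular premises (a1, a2, a3), one per rule
    samples  : list of (a, b) training pairs with a single argument a
    """
    rows, targets = [], []
    for a, b in samples:
        nus = [tri(*p)(a) for p in premises]
        total = sum(nus)
        if total == 0.0:
            continue                      # no rule applies to this point
        rows.append([nu / total for nu in nus])
        targets.append(b)
    n = len(premises)
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(n)]
    return solve(ata, atb)

premises = [(0, 1, 2), (1, 2, 3)]
samples = [(1.0, 3.0), (1.25, 3.5), (1.5, 4.0), (2.0, 5.0)]   # made-up data
print(fit_means(premises, samples))
```

The toy outputs were generated from means 3 and 5, so the fit recovers those values; with noisy data the same code returns the least-squares compromise.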

Example:
For comparison of the algorithms
Training set with 25 sets of observation data $(a_1(s), b(s))$, $s = 1, \ldots, 25$
Develop rule system for the interval [0, 8]
Highly variable, nonlinear behavior
[Figure: training set]

Rule system from counting algorithm:
  7 rules; supports of equal length
  2 or 3 rules applicable to each a(s)
[Figure: rule system from counting algorithm]

Rule system from weighted counting algorithm:
  For $\varepsilon = 0.5$; smaller supports; crisper
[Figure: rule system from weighted counting algorithm]

Least-squares algorithm:
  For $\varepsilon = 0.5$; smaller supports; crisper
  L-R fuzzy numbers with $L(x) = R(x) = \dfrac{1}{2}(\cos(\pi x) + 1)$: smooth approximations to triangular fuzzy numbers
[Figure: rule system from least-squares algorithm]

[Figure: training set compared with the counting algorithm and weighted counting algorithm]
[Figure: training set compared with least-squares (correlation = 0.96) and 6th order polynomial (correlation = 0.89)]

J. Application: Reservoir Operation*
Premises:
  Reservoir pool elevations
  Inflows [net]
  Forecasted demands (power)
  Time of year
Consequence:
  Actual release
Typical Rule:
IF pool elevation is $A_{i,1}$ AND
net inflows is $A_{i,2}$ AND
forecast demand is $A_{i,3}$ AND
time of year is $A_{i,4}$
THEN release is $B_i$

* Shrestha, B., L. Duckstein, and E. Stakhiv. (1996). Fuzzy Rule-Based Modeling of Reservoir Operation, ASCE Journal of Water Resources Planning and Management, 122(4), 262-269.

Uses split sampling
Calibration:
  Training set
  Weighted counting algorithm
Storage elevation premises calculated from mass balance:
$S_{t+1} = S_t + I_t - R_t - L_t$ (the release $R_t$ is the rule consequence)
Following constraints imposed:
$S_{t,\min} < S_t < S_{t,\max}$
$R_{t,\min} < R_t < R_{t,\max}$

Applied to Tenkiller Lake in Oklahoma
Project purposes:
  flood control
  water supply
  hydropower
  recreation
  habitat

Capacity: 371,000 AF
Daily data from 1980 to 1992
Training set uses 1989
9 TFNs cover elevation 620 ft. to 677.2 ft.
8 TFNs: inflows from 0 to 180,000 day-second-ft
Supports cover entire range: 25% overlap
Training for each month separately
Power demands: low, medium-low, medium, medium-high, high
Product inference used [danger of incompleteness]
Additive combination
