
Automata, Grammars and Languages

Discourse 03
Finite Automata

C SC 473 Automata, Grammars & Languages

Finite Automata (CS) / Switching Theory (CE)

Boolean operators / Gates (Elem. Switching Ops)

[Figure: gate symbols for the elementary Boolean operators applied to inputs x, y.]

Boolean Functions / Combinatorial Circuits

[Figure: half adder H. Inputs x1, x2; outputs z1 (sum) and z2 (carry).]

[Figure: full adder F, built from half adders H. Inputs x1, x2 and x3 (old carry); outputs z1 (sum) and z2 (new carry).]

Boolean Functions / Comb. Circuits (contd)

Table representing F

x1  x2  x3  |  z1  z2
 0   0   0  |   0   0
 0   0   1  |   1   0
 0   1   0  |   1   0
 0   1   1  |   0   1
 1   0   0  |   1   0
 1   0   1  |   0   1
 1   1   0  |   0   1
 1   1   1  |   1   1

Boolean Functions / Comb. Circuits (contd)


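Equations representing F

$z_1 = g_1(x) = x_1 \oplus x_2 \oplus x_3$
$z_2 = g_2(x) = ((x_1 \oplus x_2) \wedge x_3) \vee (x_1 \wedge x_2)$

General scheme (n inputs, m outputs)

$z = g(x)$
$(z_1, z_2, \ldots, z_m) = g(x_1, x_2, \ldots, x_n) = (g_1(x_1, x_2, \ldots, x_n), \ldots, g_m(x_1, x_2, \ldots, x_n))$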

Finite Automata / Sequential Circuits


Add memory elements = delay elements

[Figure: a combinatorial circuit whose output y(i+1) is fed back through a delay element as y(i).]

$z(i) = f(x(i), x(i-1), x(i-2), \ldots)$

Only a finite number $d$ of delay elements is possible:

$z(i) = f(x(i), x(i-1), \ldots, x(i-d))$

Finite Automata / Sequential Circuits


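Ex: sequential adder: add 2 binary numbers; low-order bits received first

    x1:  1001
    x2:  0101
    z1:  1110

(a) sequential net (circuit):

[Figure: full adder F with inputs x1(i), x2(i) and y1(i), and output z1(i). The carry output y1(i+1) is fed back through a delay element as y1(i). The low-order bits x1(0), x2(0) arrive first.]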

Finite Automata / Sequential Circuits

(b) Next-State & Output Equations:

$y_1(i+1) = ((x_1(i) \oplus x_2(i)) \wedge y_1(i)) \vee (x_1(i) \wedge x_2(i))$
$z_1(i) = x_1(i) \oplus x_2(i) \oplus y_1(i)$

(c) Transition Table:

state space $\{0, 1\} \cong \{q_0, q_1\}$
input alphabet = {(00),(01),(10),(11)}
output alphabet = {0,1}

next state/output table

 q \ x |  (00)    (01)    (10)    (11)
 ------+--------------------------------
 q0    |  q0/0    q0/1    q0/1    q1/0
 q1    |  q0/1    q1/0    q1/0    q1/1
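The next state/output table can be run directly as a small program. The following is a minimal sketch, not part of the original slides: the names `DELTA` and `run_adder` are illustrative, the operands are assumed to have equal length, and a carry left over after the last bit pair is dropped, exactly as a single pass through the machine would drop it.

```python
# Next state/output table of the sequential adder.
# Key: (state, (x1, x2)) -> (next state, output bit z1)
DELTA = {
    ("q0", (0, 0)): ("q0", 0), ("q0", (0, 1)): ("q0", 1),
    ("q0", (1, 0)): ("q0", 1), ("q0", (1, 1)): ("q1", 0),
    ("q1", (0, 0)): ("q0", 1), ("q1", (0, 1)): ("q1", 0),
    ("q1", (1, 0)): ("q1", 0), ("q1", (1, 1)): ("q1", 1),
}


def run_adder(a: str, b: str) -> str:
    """Add two equal-length binary strings written high-order bit first."""
    state = "q0"  # q0 = no carry, q1 = carry
    out = []
    # The machine receives the low-order bits first, so feed the pairs reversed.
    for x1, x2 in zip(reversed(a), reversed(b)):
        state, z = DELTA[(state, (int(x1), int(x2)))]
        out.append(str(z))
    return "".join(reversed(out))


result = run_adder("1001", "0101")
print(result)
```

On the operands from the slide, 1001 and 0101, the machine passes through q1, q0, q0, q0 and emits the sum 1110.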

Finite Automata / Sequential Circuits


(d) State Diagram:

[Figure: state diagram with edges labeled input/output]
q0 -> q0: (00)/0, (01)/1, (10)/1
q0 -> q1: (11)/0
q1 -> q0: (00)/1
q1 -> q1: (01)/0, (10)/0, (11)/1

Finite Automata / Sequential Circuits


(e) Finite-State Transducer (Mealy Machine)

A 5-tuple $M = (Q, \Sigma, \Delta, \delta, q_0)$ where

$Q = \{q_0, q_1\}$    finite set of states
$q_0$    start state
$\Sigma = \{(00),(01),(10),(11)\}$    input alphabet
$\Delta = \{0,1\}$    output alphabet
$\delta : Q \times \Sigma \to Q \times \Delta$    transition/output function

$\delta(q_0,(00)) = (q_0,0)$    $\delta(q_0,(01)) = (q_0,1)$
$\delta(q_0,(10)) = (q_0,1)$    $\delta(q_0,(11)) = (q_1,0)$
$\delta(q_1,(01)) = (q_1,0)$    etc.

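General Sequential Network

[Figure: network with inputs x1, ..., xn, outputs z1, ..., zm, and state lines y1, ..., ys fed back through delay elements.]

$B = \{0,1\}$
state space = $B^s$
$\delta : B^{n+s} \to B^{m+s}$ is a boolean function

$\delta(y(i), x(i)) = (y(i+1), z(i))$
$\delta(q, a) = (q', b)$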

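Three Types of Automata

[Figure: three kinds of machine; the input symbols $a_i$ arrive over time.]

transducer: input $a_n \cdots a_3 a_2 a_1$, output $c_n \cdots c_3 c_2 c_1$
recognizer (acceptor): input $a_n \cdots a_3 a_2 a_1$, output Yes(1) / No(0)
enumerator (generator): no input, output $c_n \cdots c_3 c_2 c_1$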

Machines that Recognize

Detection of an event, i.e., a pattern in input


Recognition of just those words in some language L
Definition of a language
Ex: detect abab all non-overlapping occurrences

a
b

b
b

b
a


Ex: C Comments /* */
Filter in the lexical scanner
:

a:a

(transducer)

/:/

a:/a

/:

a:

empty
notation
in : out

/:

Recognizer

a {,/}

a:

/:
a


Finite Automaton (Finite State Machine, FSA)

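Defn 1.5: A (deterministic) finite automaton is a 5-tuple $M = (Q, \Sigma, \delta, q_0, F)$ where

$Q$ is a finite set, the states
$\Sigma$ is a finite set, the alphabet
$\delta : Q \times \Sigma \to Q$ is the transition function
$q_0 \in Q$ is the start state
$F \subseteq Q$ is the set of accepting (final) states

Ex: $M_0 = (Q, \Sigma, \delta, q_0, F)$ with $Q = \{q_0, q_1, q_2\}$, $\Sigma = \{a, b\}$, $F = \{q_1\}$

$\delta(q_0, a) = q_1$    $\delta(q_0, b) = q_2$
$\delta(q_1, a) = q_2$    $\delta(q_1, b) = q_0$
$\delta(q_2, a) = q_2$    $\delta(q_2, b) = q_2$

[Figure: state diagram of $M_0$; $q_2$ has a self-loop on a, b.]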
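The 5-tuple for $M_0$ translates almost literally into a table-driven program. A minimal sketch, with illustrative names (`DELTA`, `trace`, `accepts`) that are not from the slides: `trace` records each configuration (state, remaining input) the machine passes through.

```python
# Transition function of M0, copied from the example.
DELTA = {
    ("q0", "a"): "q1", ("q0", "b"): "q2",
    ("q1", "a"): "q2", ("q1", "b"): "q0",
    ("q2", "a"): "q2", ("q2", "b"): "q2",
}
START, FINAL = "q0", {"q1"}


def trace(w: str):
    """Return the configurations (state, remaining input) visited on input w."""
    state = START
    configs = [(state, w)]
    for i, symbol in enumerate(w):
        state = DELTA[(state, symbol)]
        configs.append((state, w[i + 1:]))
    return configs


def accepts(w: str) -> bool:
    """w is accepted iff the state reached when the input is exhausted is final."""
    return trace(w)[-1][0] in FINAL


for state, rest in trace("ababa"):
    print(state, rest or "ε")
```

Running it on ababa lists six configurations, ending in state q1 with no input left.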

How FA Compute
FA $M = (Q, \Sigma, \delta, q_0, F)$ is a finite structure, like a program: fixed and static
Need to define the behavior of M on input w

  Sequence of configurations
  Like trace of a program on given data
  Dynamic and input-dependent

Ex: start $M_0$ on input $w = ababa$. Look at the sequence of moves determined by the transition function:

$(q_0, ababa) \vdash (q_1, baba) \vdash (q_0, aba) \vdash (q_1, ba) \vdash (q_0, a) \vdash (q_1, \varepsilon)$

Since $M_0$ is in an accepting state when the input is exhausted, w is recognized by $M_0$.

www.jflap.org
JFLAP is a package of graphical tools which can be used as an aid in
learning the basic concepts of Formal Languages and Automata Theory.


Info on JFLAP

Website http://www.jflap.org/
  Downloads
  Tutorial

Lectura (linux) install
  cd /usr/local/jflap
  java -jar JFLAP.jar

X11 forwarding (graphics)
  ssh -X lectura


How FA Compute (contd)


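Given a FA $M = (Q, \Sigma, \delta, q_0, F)$

Defn: A configuration of M is an element of $Q \times \Sigma^*$.

Defn: The yields in one step (or moves) relation $\vdash_M$ between configurations is defined by

$(q, aw) \vdash_M (q', w) \iff \delta(q, a) = q'$, where $a \in \Sigma$, $w \in \Sigma^*$, $q, q' \in Q$

Notes: $\vdash_M$ is a function, since $\delta$ is. $(q, \varepsilon) \vdash_M$ is undefined.

Defn: The yields relation $\vdash_M^*$ is the reflexive, transitive closure of $\vdash_M$. It means "moves in zero or more steps to".

Defn: A string w is recognized (accepted) by M iff $(q_0, w) \vdash_M^* (f, \varepsilon)$ for some $f \in F$.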

How FA Compute (contd)


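Defn: The language recognized (accepted) by M is

$L(M) = \{ w \in \Sigma^* : (q_0, w) \vdash_M^* (f, \varepsilon) \text{ for some } f \in F \}$

Defn 1.16: A language S is regular iff there is some FA that recognizes it, i.e.,

$(\exists M)[M \text{ is a FA} \wedge S = L(M)]$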
Ex: In FA

M0

(q0, a)



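Example:

Coin checker for 30¢ coffee. $\Sigma$ = {n, d, q}

[Figure: state diagram whose states are amounts deposited so far (0, 10, 15, 20, 25, 30, 35, 40, 45, 50), with edges labeled n, d, q. Legend: accepting state i = make change for i - 30 & vend coffee.]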

Regular Operations & Regular Expressions


The regular operations on languages are:

union ($\cup$), concatenation ($\circ$) and Kleene star (*).

So called because the class of regular languages is closed under them, i.e., applying these operators to regular languages results in a regular language. (We will prove these closure results later.)
In fact, these three operations ($\cup$, $\circ$, *) actually characterize what it means to be a regular language: any regular language can be built up from alphabet symbols and a finite number of these regular operations.
This motivates the notion of regular expression: a sequence of symbols, like an arithmetic expression, that defines a regular language using regular ops.

Regular Expressions
A syntax for describing sets of strings (languages)

  Terse
  Eliminates fussy { }
  Reminiscent of arithmetic expressions
  Obeys some useful algebra, e.g., (E*F*)* = (E+F)*

Syntax for regular expressions over $\Sigma$, +, $\circ$, *, (, )

E → (E+E)           (text uses $\cup$ not +; some authors use |)
  | (E$\circ$E)     (usually suppress the $\circ$ in E$\circ$E)
  | (E*)
  | $\emptyset$
  | $\varepsilon$   (some authors use a different symbol)
  | a               for each a in $\Sigma$

suppress (,) where possible: (a+b)*a not ( ( (a + b)* ) a )

Regular Expressions (contd)


Meaning rules for the syntax
The meaning (denotation) of an expression, L(E), is a set of strings (a language)
Rules

expression E       language L(E)
$\emptyset$        {}
$\varepsilon$      $\{\varepsilon\}$
a                  {a}
(E+F)              $L(E) \cup L(F)$
(EF)               $L(E)L(F)$
(E*)               $L(E)^*$

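Reg. Expr.: Examples, Equivalence(=)

(a+b)*a                      $L = (\{a\} \cup \{b\})^* \{a\} = \{a, b\}^* \{a\}$
(a*b*)* = (a+b)*
(a+b)*a(a+b)*a(a+b)*         {w : w has $\ge 2$ a's}
(b*ab*ab*ab*)*               {w : ... }
PASCAL unsigned numbers. d = {0,1,...,9}
dd*($\varepsilon$ + .dd*)($\varepsilon$ + E($\varepsilon$ + '+' + '-')dd*)     {w : ?? }
a$\Sigma^*$a + b$\Sigma^*$b + a + b     {w : w begins & ends same}

Defn: E=F $\iff$ L(E) = L(F)

$\emptyset^* = \varepsilon$     (E*F*)* = (E+F)*     $E\emptyset = \emptyset E = \emptyset$
E(FG) = (EF)G     E(F+G) = EF+EG     $E\varepsilon = \varepsilon E = E$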
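Equivalences such as (E*F*)* = (E+F)* can be sanity-checked by brute force before attempting a proof. A minimal sketch, under stated assumptions: it uses Python's `re` syntax, where union is written `|` rather than `+`; the helper names `strings` and `same_language` are illustrative; and agreement on all strings up to a length bound is evidence, not a proof of equivalence.

```python
import re
from itertools import product


def strings(alphabet: str, max_len: int):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def same_language(e: str, f: str, alphabet: str = "ab", max_len: int = 6) -> bool:
    """True if e and f fully match exactly the same strings up to max_len."""
    return all(
        bool(re.fullmatch(e, w)) == bool(re.fullmatch(f, w))
        for w in strings(alphabet, max_len)
    )


# (E*F*)* = (E+F)* with E = a and F = b
print(same_language(r"(a*b*)*", r"(a|b)*"))
# A non-equivalence is caught by a short counterexample (here the empty string).
print(same_language(r"(a|b)*a", r"(a|b)*"))
```

The same harness can test a claimed description of a language, for example that a(a|b)*a|b(a|b)*b|a|b matches exactly the nonempty strings that begin and end with the same symbol.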

Nondeterminism
Real computing devices are deterministic: the current configuration and instruction determine the next configuration. The relation $\vdash$ is a function.
Why the concept of nondeterminism?

Provides powerful, economical descriptive ability


Provides a way to specify languages without over-specifying and
complex handling of cases
Can be algorithmically converted to a deterministic description (at
the sacrifice of some economy and with added complexity)
Generalization of determinism

Ex: "abab occurs somewhere in w": $\Sigma^* abab \Sigma^*$

[Figure: state diagrams for this language, with edges labeled a, b and a,b.]

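Nondeterminism (contd)
Ex: $w \in \Sigma^*$ has penultimate symbol b: $w = \Sigma^* b \Sigma$

[Figure: NFA with a self-loop on a,b at the start state, then an edge labeled b, then an edge labeled a,b.]

Ex: $w \in \Sigma^*$ has $\ge 2$ a's: $w = \Sigma^* a \Sigma^* a \Sigma^*$

[Figure: NFA with three states, each with a self-loop on a,b, joined by two edges labeled a.]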

$\varepsilon$-Moves Can Be Useful

SNOBOL arithmetic constants (no floating E)

Use $\varepsilon$ to specify optional characters like Unix command line [opt]

d=digit

Nondeterministic Finite Automaton

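Defn 1.37: A nondeterministic finite automaton is a 5-tuple $M = (Q, \Sigma, \delta, q_0, F)$ where

$Q$ is a finite set, the states
$\Sigma$ is a finite set, the alphabet
$\delta : Q \times (\Sigma \cup \{\varepsilon\}) \to P(Q)$ is the transition function
$q_0 \in Q$ is the start state
$F \subseteq Q$ is the set of accepting (final) states

Ex: $M_1 = (Q, \Sigma, \delta, q_0, F)$ with $Q = \{q_0, q_1\}$, $\Sigma = \{a, b\}$, $F = \{q_1\}$

$\delta(q_0, a) = \{q_0\}$    $\delta(q_0, b) = \{q_0, q_1\}$
$\delta(q_1, a) = \{q_1\}$    $\delta(q_1, b) = \{q_1\}$

[Figure: state diagram of $M_1$: an edge labeled b from $q_0$ to $q_1$, and self-loops on a,b at both states.]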
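One way to run an NFA without guessing is to track the set of all states it could be in after each symbol. A minimal sketch for $M_1$, which has no $\varepsilon$-moves; the names `DELTA`, `reachable` and `accepts` are illustrative, not from the slides.

```python
# Transition function of M1; a missing key means "no move on that symbol".
DELTA = {
    ("q0", "a"): {"q0"},
    ("q0", "b"): {"q0", "q1"},
    ("q1", "a"): {"q1"},
    ("q1", "b"): {"q1"},
}
START, FINAL = "q0", {"q1"}


def reachable(w: str) -> set:
    """Set of states M1 could be in after reading all of w."""
    current = {START}
    for symbol in w:
        # Union of the moves available from every state we might be in.
        current = set().union(*(DELTA.get((q, symbol), set()) for q in current))
    return current


def accepts(w: str) -> bool:
    """Accept iff SOME computation path ends in a final state."""
    return bool(reachable(w) & FINAL)


print(reachable("aabbba"), accepts("aabbba"))
```

For aabbba the reachable set is {q0, q1}: one path stays in q0 and gives no evidence either way, another ends in q1 and so the string is accepted.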

DFA vs NFA

DFA
  For each state q and input symbol a, there is exactly one choice of new state (or no transition is defined at all)
  Each transition consumes an input symbol
  Special case of NFA!

NFA
  There may be multiple choices for the same input symbol
  There may be $\varepsilon$-moves that do not consume an input character
  There can be chains of $\varepsilon$-moves
  $\varepsilon$-moves can create even more choice for the next input character

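How NFA Compute

Given a NFA $M = (Q, \Sigma, \delta, q_0, F)$

Defn: configuration $(q, w) \in Q \times \Sigma^*$
Defn: yields in one step (or moves) relation $\vdash_M$ between configurations

$(q, aw) \vdash_M (q', w) \iff q' \in \delta(q, a)$
$(q, w) \vdash_M (q', w) \iff q' \in \delta(q, \varepsilon)$    ($\varepsilon$-move)

Defn: yields $\vdash_M^*$ = reflexive, transitive closure of $\vdash_M$. Means COULD move in zero or more steps to

Defn: w is recognized (accepted) by M iff $(q_0, w) \vdash_M^* (f, \varepsilon)$ for some $f \in F$

Same as before, but now has the meaning: there exists some sequence of moves from the start config to some accepting config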

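How NFA Compute (contd)

Defn: The language recognized (accepted) by M is

$L(M) = \{ w \in \Sigma^* : (q_0, w) \vdash_M^* (f, \varepsilon) \text{ for some } f \in F \}$

Ex: In NFA $M_1$: $(q_0, aabbba) \vdash^* (q_0, \varepsilon)$

This provides no evidence that aabbba is accepted (or not)

However, also $(q_0, aabbba) \vdash^* (q_1, \varepsilon)$ via a separate computation sequence.

And so aabbba is recognized!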

Tree of Computations
Ex: NFA M1 (q 0, aabbba)

(q 0, abbba)

(q 0, bbba)
(q 1, bba)
(q 0, bba)
(q 0, ba)

(q 1, ba)

(q 0, a)

(q 1, a)

(q 0, )

null
evidence

(q 1, )

$q_1 \in F$

accepting computation
$w \in L(M_1)$

Computation Tree: Example

a,b

Ex: L = {w: w begins & ends same }


(1, ababa)

(1, abab)

(2, baba)

(2, bab)

(2, aba)

(2, ab)

(2, ba)

(3, ba)

(2, a)

(2, )

(2, b)
(2, )

(3, ) 3F
accept ababa
some path to F

a
3

a,b

(3, b)

reject abab
path to F


Example with $\varepsilon$-Moves


String length a multiple of 2 or 3
(0, aaa)

(2, aaa) (4, aaa)


(3, aa)

(5, aa)

(2, a)

(6, a)

(3, )

(4, )

-moves

a
4

a
a
6

a
a 5

4F
Accept aaa


Example with $\varepsilon$-Moves


a*b*

(0, aab)
(1, aab)

(0, ab)
(1, ab)

(0, b)

1
b

(1, b)
(1, )
1F
Accept aab

$\varepsilon$-moves consume no input symbols

Equivalence of NFA to DFA

There is an algorithm to convert any NFA into a DFA

We show basic idea assuming NFA has no $\varepsilon$-moves
Then (later) modify the construction for NFAs with $\varepsilon$-moves

Ex: L = {x $\in \Sigma^*$ : last symbol of x appeared previously}, $\Sigma$ = {a,b}

[Figure: NFA N0 with states p, q, r, s. Self-loops on a,b at p, q and r; edges p -a-> q, q -a-> s, p -b-> r, r -b-> s; s is the accepting state.]

Idea: given input string, keep track of all possible reached states after reading each letter. At end of input, see if a final state is among those reached

Equivalence of NFA and DFA (contd)


Computation paths through NFA N0 on w = abba
a

a
b
b

b
b

r
r

s
q

a
a

s F

q
s

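Equivalence of NFA and DFA (contd)

Idea: keep a list of all possible states reachable by each prefix of w ("parallel worlds"). For NFA N0:

$\{p\} \xrightarrow{a} \{p, q\} \xrightarrow{b} \{p, q, r\} \xrightarrow{b} \{p, q, r, s\} \xrightarrow{a} \{p, q, r, s\}$

$\{p\} \xrightarrow{abba} \{p, q, r, s\}$
$abba \in L(N_0)$ since $\{p, q, r, s\} \cap F \neq \emptyset$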

Equivalence of NFA and DFA (contd)

Equivalent DFA M will have:
State set $P(Q)$
Alphabet $\Sigma$
Start state set $\{q_0\}$
Accepting states $\{X \subseteq Q : X \cap F \neq \emptyset\}$
Deterministic transition function $\delta' : P(Q) \times \Sigma \to P(Q)$
Ex: For NFA N0:

$\delta'(\{p\}, a) = \{p, q\}$        $\delta'(\{p, q\}, a) = \{p, q, s\}$
$\delta'(\{p\}, b) = \{p, r\}$        $\delta'(\{p, q\}, b) = \{p, q, r\}$
...
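The subset idea can be sketched as a worklist algorithm that builds only the DFA states reachable from the start set. This is an illustration under assumptions: the transition table `NFA_DELTA` for N0 is reconstructed from the transition examples given for N0 (the diagram itself is not reproduced here), and all names are hypothetical.

```python
# NFA N0 (assumed reconstruction): p is the start state, s the accepting state.
NFA_DELTA = {
    ("p", "a"): {"p", "q"}, ("p", "b"): {"p", "r"},
    ("q", "a"): {"q", "s"}, ("q", "b"): {"q"},
    ("r", "a"): {"r"},      ("r", "b"): {"r", "s"},
}
NFA_START, NFA_FINAL, SIGMA = "p", {"s"}, "ab"


def subset_construction():
    """Return (start, delta', finals, states) of the reachable part of the DFA."""
    start = frozenset({NFA_START})
    dfa_delta, todo, seen = {}, [start], {start}
    while todo:
        R = todo.pop()
        for a in SIGMA:
            # delta'(R, a) = union of delta(r, a) over all r in R
            S = frozenset(q for r in R for q in NFA_DELTA.get((r, a), ()))
            dfa_delta[(R, a)] = S
            if S not in seen:
                seen.add(S)
                todo.append(S)
    finals = {R for R in seen if R & NFA_FINAL}
    return start, dfa_delta, finals, seen


def dfa_accepts(w: str) -> bool:
    start, delta, finals, _ = subset_construction()
    state = start
    for a in w:
        state = delta[(state, a)]
    return state in finals


_, _, _, states = subset_construction()
print(len(states), dfa_accepts("abba"))
```

Only 7 of the 16 subsets of {p, q, r, s} turn out to be reachable, which is why the full power set rarely needs to be built in practice.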

Equivalence of NFA and DFA (contd)

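Thm: [Rabin-Scott Construction]. Let L = L(N) for some NFA N with no $\varepsilon$-moves. There is an algorithm to construct a DFA M equivalent to N, i.e. with L(M) = L(N).
Pf: Given N we construct a DFA M and then verify that it recognizes the same set as N.
Construction: Given NFA $N = (Q, \Sigma, \delta, s, F)$ construct $M = (Q', \Sigma, \delta', s', F')$ where

$Q' = P(Q)$
$s' = \{s\}$
$F' = \{X \subseteq Q : X \cap F \neq \emptyset\}$
$\delta'(R, a) = \bigcup_{r \in R} \delta(r, a)$ for $R \subseteq Q$, $a \in \Sigma$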

Equivalence of NFA and DFA (contd)

Picture of

a
a
b
b

S Q

S Q



Equivalence of NFA and DFA (contd)


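Verification: Show (1) M is a DFA and (2) L(M)=L(N)
(1) $\delta'$ is a function by the construction, and $Q'$ is finite: $|Q'| = 2^{|Q|}$. So M is a DFA.
(2) To show equivalence we prove the
Lemma: $(p, w) \vdash_N^* (q, \varepsilon) \iff (\{p\}, w) \vdash_M^* (Q', \varepsilon)$ for some $Q' \subseteq Q$ with $q \in Q'$.

Pf: By induction on the length of the input string w.

Base |w|=0. $(p, \varepsilon) \vdash_N^* (q, \varepsilon) \iff q = p \iff (\{p\}, \varepsilon) \vdash_M^* (\{p\}, \varepsilon)$ with $q \in \{p\}$.

Step: Suppose (IH) the lemma is true for all w with $|w| \le k$.
Let $|w| = k+1$, $w = ua$.
To show: $(p, ua) \vdash_N^* (q, \varepsilon) \iff (\{p\}, ua) \vdash_M^* (Q', \varepsilon)$ for some $Q'$ with $q \in Q'$.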

Equivalence of NFA and DFA (contd)


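($\Rightarrow$). Assume $(p, ua) \vdash_N^* (q, \varepsilon)$.

Then $\exists$ state r with $(p, ua) \vdash_N^* (r, a) \vdash_N (q, \varepsilon)$ and $q \in \delta(r, a)$.

Then $(p, u) \vdash_N^* (r, \varepsilon)$. By (IH), $(\{p\}, u) \vdash_M^* (R, \varepsilon)$ for some R with $r \in R$.    (*)

By construction of M, $q \in \delta'(R, a)$. Let $Q' = \delta'(R, a)$. Then $(R, a) \vdash_M (Q', \varepsilon)$.
Using this with (*) results in: $(\{p\}, ua) \vdash_M^* (R, a) \vdash_M (Q', \varepsilon)$.

So $(\{p\}, ua) \vdash_M^* (Q', \varepsilon)$ with $q \in Q'$.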

Equivalence of NFA and DFA (contd)



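($\Leftarrow$). Assume $(\{p\}, ua) \vdash_M^* (Q', \varepsilon)$ with $q \in Q'$.
Then $\exists$ state R with $(\{p\}, ua) \vdash_M^* (R, a) \vdash_M (Q', \varepsilon)$.    (1)

So $Q' = \delta'(R, a)$. By construction, $\exists r \in R$ with $q \in \delta(r, a)$, and so

$(r, a) \vdash_N (q, \varepsilon)$    (2)

Since $(\{p\}, u) \vdash_M^* (R, \varepsilon)$ we have from (IH)

$(p, u) \vdash_N^* (r, \varepsilon)$    (3)

Combining (2) & (3): $(p, ua) \vdash_N^* (r, a) \vdash_N (q, \varepsilon)$

So $(p, ua) \vdash_N^* (q, \varepsilon)$.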

Equivalence of NFA and DFA (contd)


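We now finish the verification proof. Let $w \in \Sigma^*$. Then

$w \in L(N) \iff (s, w) \vdash_N^* (f, \varepsilon)$ for some $f \in F$
$\iff (\{s\}, w) \vdash_M^* (Q', \varepsilon)$ for some $Q'$ containing some $f \in F$ (from the Lemma)
$\iff (\{s\}, w) \vdash_M^* (Q', \varepsilon)$ for some $Q'$ with $Q' \cap F \neq \emptyset$, i.e., $Q' \in F'$
$\iff w \in L(M)$.

That is, $L(M) = L(N)$. QED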

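Example: $\varepsilon$-Free NFA → DFA

Consider the previous NFA $N_0 = (Q, \Sigma, \delta, p, \{s\})$

$M_0 = (P(Q), \Sigma, \delta', \{p\}, F')$

[Figure: the reachable part of $M_0$, with states {p}, {p,q}, {p,r}, {p,q,s}, {p,q,r}, {p,r,s}, {p,q,r,s} and edges labeled a, b.]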

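NFA with $\varepsilon$-Moves

$\varepsilon$-closure(R) = E(R) for a set of states R

[Figure: example NFA with $\varepsilon$-moves, with numbered states and edges labeled a, b, d.]

For $R \subseteq Q$ the $\varepsilon$-closure, E(R), of R is:

$E(R) = \{ q \in Q : q \text{ is reachable from some } r \in R \text{ by zero or more } \varepsilon\text{-moves} \}$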
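The closure E(R) is a plain graph search that follows only $\varepsilon$-edges. A minimal sketch, with assumptions stated: the machine below is the "length a multiple of 2 or 3" NFA as read off its computation tree (state 0 has $\varepsilon$-moves to 2 and 4, and both 2 and 4 are taken to be accepting), and the empty string is used as the label for $\varepsilon$-moves purely as a convention of this sketch.

```python
EPS = ""  # label for ε-moves (a convention of this sketch)

# Assumed reconstruction: 0 -ε-> 2, 0 -ε-> 4; cycle 2 -> 3 -> 2; cycle 4 -> 5 -> 6 -> 4.
DELTA = {
    (0, EPS): {2, 4},
    (2, "a"): {3}, (3, "a"): {2},
    (4, "a"): {5}, (5, "a"): {6}, (6, "a"): {4},
}
FINAL = {2, 4}


def eps_closure(states):
    """E(R): every state reachable from R using zero or more ε-moves."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in DELTA.get((q, EPS), ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


def accepts(w: str) -> bool:
    current = eps_closure({0})
    for a in w:
        moved = {r for q in current for r in DELTA.get((q, a), ())}
        current = eps_closure(moved)  # close under ε after every real move
    return bool(current & FINAL)


print(eps_closure({0}), accepts("aaa"))
```

Taking the closure after every input symbol is what lets the set-of-states simulation handle chains of $\varepsilon$-moves without consuming input.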

$\varepsilon$-closure of a set of states

Coalesce all nodes reachable from {4,5} by $\varepsilon$-moves:
10

9
a
d

{1,2,3,4,5,6,7,8}
b

a
15

a
14

11

b
12
13

Note: still an
NFA
a

E({4,5})

E({9})

a
Etc.

E({13})


Conversion: NFA → DFA

Thm: There is an algorithm to convert any NFA to an equivalent DFA.
Pf: Construction: Given NFA $N = (Q, \Sigma, \delta, s, F)$ construct new NFA M where





Verification.

(q, w) N

Pf: By induction on |w|



Conversion: NFA → DFA

Thm 1.39: [Rabin-Scott Theorem]: There is an algorithm to convert any NFA into an equivalent DFA.
Corollary 1.40: A language is regular $\iff$ some NFA recognizes it.
Ex: Start with an NFA N1 as follows:

N1
1

b
b

b
3



Conversion: NFA → DFA


b

Ex:

N1

b
3

Useful summary

E(1)


Conversion: NFA → DFA


b

Ex:

N1

b
3









Ex: NFA → DFA (contd)


b

Ex:

N1

b
3








Conversion: NFA → DFA (contd)


M1
b
1

123

13

b
1234


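Regular Expression → NFA

Thm 1.55: There is an algorithm that, given a regular expression E, constructs a NFA N such that L(E) = L(N).
Pf: Induction on the # of operator symbols in E.
Base: E = $\emptyset$, E = $\varepsilon$, or E = a for some a $\in \Sigma$ (no operators).

[Figure: the NFA for each base case; for E = a, a single edge labeled a from the start state to an accepting state.]

Step: Assume (IH) the result is true of all expressions with $\le k$ operator symbols (+, $\circ$, *). Let E have k+1 ops.
Three cases:
Case E = (E1+E2). By IH, $\exists$ FA M1, M2 with L(E1) = L(M1) and L(E2) = L(M2). Construct the following NFA M.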

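Case +

[Figure: NFA M with a new start state and $\varepsilon$-moves to the start states of M1 and M2; the states of F1 and F2 remain accepting.]

$F = F_1 \cup F_2$
$L(M) = L(M_1) \cup L(M_2)$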

Regular Expression → NFA (contd)

Case E = (E1E2). By IH, $\exists$ FA M1, M2 with L(E1) = L(M1) and L(E2) = L(M2). Construct the following NFA M.

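Case $\circ$

[Figure: NFA M made of M1 followed by M2, with $\varepsilon$-moves from the states of F1 to the start state of M2.]

Unmark final states in M1

$F = F_2$
$L(M) = L(M_1) \circ L(M_2)$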

Regular Expression → NFA (contd)

Case E = (E1)*. By IH, $\exists$ FA M1 with L(E1) = L(M1).
Construct the following NFA M.

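Case *

[Figure: NFA M with a new accepting start state s, an $\varepsilon$-move from s to the start state of M1, and $\varepsilon$-moves from the states of F1 back to the start state of M1.]

$F = F_1 \cup \{s\}$
$L(M) = L(M_1)^*$
QED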

Example: Reg. Exp. → NFA


(b+aa)*

a
b

a
Not very economical
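The three inductive cases can be sketched as functions that glue smaller NFAs together with $\varepsilon$-moves. This is an illustration, not the course's code: an NFA is represented here as a triple (delta, start, finals), `None` stands for $\varepsilon$, and the wiring of the star case (new accepting start state, $\varepsilon$-moves from old final states back to the old start) is one standard choice that is assumed rather than read from the slides.

```python
from itertools import count

_fresh = count()  # source of fresh state names, so sub-machines never clash


def _merge(*deltas):
    """Copy several transition tables into one (state sets are disjoint)."""
    return {k: set(v) for d in deltas for k, v in d.items()}


def nfa_symbol(a):
    s, f = next(_fresh), next(_fresh)
    return {(s, a): {f}}, s, {f}


def nfa_union(m1, m2):
    (d1, s1, f1), (d2, s2, f2) = m1, m2
    s = next(_fresh)
    d = _merge(d1, d2)
    d[(s, None)] = {s1, s2}          # new start, ε-moves to both machines
    return d, s, f1 | f2             # F = F1 ∪ F2


def nfa_concat(m1, m2):
    (d1, s1, f1), (d2, s2, f2) = m1, m2
    d = _merge(d1, d2)
    for f in f1:                     # ε-move from each old final state of M1
        d.setdefault((f, None), set()).add(s2)
    return d, s1, set(f2)            # F = F2 (final states of M1 unmarked)


def nfa_star(m1):
    d1, s1, f1 = m1
    s = next(_fresh)
    d = _merge(d1)
    d[(s, None)] = {s1}
    for f in f1:                     # loop back for another iteration
        d.setdefault((f, None), set()).add(s1)
    return d, s, f1 | {s}            # F = F1 ∪ {s}


def _closure(d, states):
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in d.get((q, None), ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen


def accepts(m, w):
    d, s, finals = m
    cur = _closure(d, {s})
    for a in w:
        cur = _closure(d, {r for q in cur for r in d.get((q, a), ())})
    return bool(cur & finals)


# (b+aa)*
M = nfa_star(nfa_union(nfa_symbol("b"), nfa_concat(nfa_symbol("a"), nfa_symbol("a"))))
print(accepts(M, "baab"), accepts(M, "aba"))
```

The machine built for (b+aa)* has eight states, far more than a hand-drawn NFA for the same language needs, which matches the slide's remark about economy.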


Regular Expressions: Applications
Regexp used in various development tools

qed interactive text editor. 1st version Lampson & Deutsch 1967

Regexp added by Ken Thompson, Bell Labs, ca. 1968

Regexp compiled into NFA in machine code


Rabin-Scott idea used to scan on the fly
One of the first software patents

Offspring ed by Ken for Unix


Many others followed: em, vi / ex, sam, qedx,

grep, egrep - pattern search in a file


shell command line interpreter
lex lexical analyzer generator
sed non-interactive stream editor
awk pattern scanning and processing language
perl pattern-driven programming language


Applications (contd)
Regular expressions = patterns

meaning                 awk regexp
matches >=1 r           r+
matches >=0 r           r*
matches 0 or 1 r        r?
matches r then s        rs
matches r or s          r|s
match literal c         \c
match begin/end line    ^  $
match any char          .
group exprs             (s)
character list          [abc]
negated char list       [^abc]

Applications--Examples
-?[0-9]+        nonempty digit strings, optional sign
[^0-9]          any char except digit
\[.*\]          reference citations in a paper
g/^[ ]*$/d      delete blank lines
g/[ ]+/d        delete lines with a blank
Ex: match is always (1) leftmost and (2) longest
  file: abcddddef
  vi: s/d*/x/   →  xabcddddef
      s/d+/x/   →  abcxef
Ex: csh: sort roll[1-5] | egrep 'C SC|MATH' | pr
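The leftmost rule in the vi example can be reproduced with Python's `re` module. A small sketch, with one caveat: Python's engine is a backtracking matcher that is leftmost but not "longest" in the POSIX sense for every pattern; for the two patterns below the results agree with the vi behavior shown.

```python
import re

line = "abcddddef"

# d* can match the empty string, and the leftmost match starts at index 0,
# so the empty match in front of "a" is the one that gets replaced.
first = re.sub(r"d*", "x", line, count=1)

# d+ needs at least one d: its leftmost match starts at the first d and
# extends greedily over the whole run of d's.
second = re.sub(r"d+", "x", line, count=1)

print(first)
print(second)
```

The first substitution surprises many users: because the empty string matches d*, nothing is removed and an x is simply inserted at the front.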

Applications--Examples
Ex: traditional spelling mnemonic
i before e, except after c,
or when pronounced a,
as in neighbor and weigh
--except for weird examples.
grep '[^c]ei' /usr/share/dict/words > foo
cat foo
abseil Aeneid ageing Alamein albeit atheist
Boeing Budweiser caffein canoeist deice deictic
dilettanteism dreidl ...
if you think this spelling rule is sufficient, you will be deficient,
inefficient, unscientific and far from omniscient

Applications--Examples
Ex: lex generates a lexical analyzer yylex(). Example: wordcount (wc)
%{
#include <stdio.h>
int nchar, nword, nline;
%}
%%
\n          { nline++; nchar++; }
[^ \t\n]+   { nword++; nchar += yyleng; /* yyleng = length of matched string */ }
.           { nchar++; }
%%
int main(void) {
    yylex();    /* invoke generated lexer */
    printf("%d\t%d\t%d\n", nchar, nword, nline);
    return 0;
}

L Regular $\iff$ L Denoted by a Reg. Expr.

We've defined "regular" as meaning: recognized by a DFA (equiv. to rec. by an NFA)
This equivalence result is known as Kleene's Theorem
We've already shown the direction $\Leftarrow$: we constructed an NFA from a regular expression. (Using Rabin-Scott we could convert this NFA to a DFA.)
Now we show the direction $\Rightarrow$: given a DFA M construct a regular expr. E with L(M) = L(E).
Thm (Kleene): There is an algorithm that, given a DFA M, computes a regular expression E such that L(M) = L(E).
Pf: Given the graph of the DFA, use the node elimination algorithm to gradually eliminate all nodes in favor of expressions on the edges of the graph.

Kleene's Thm: use Generalized NFA


0

1
A

1
B

1
A

A
S
a

2. Elim. C

1
0

01
00

Add init. S, &


final a

1
0

1.

1(1+01)*00+0
4. Elim. A

3. Elim. B

(1(1+01)*00+0)*

Order: CBA

E =(1(1+01)*00+0)*


Ex: Node Elimination Algorithm


b
B

b
b

a
A

Add $\varepsilon$-moves:
S

b
B

b
b

a
A


Ex: Node Elimination Algorithm


Elim. A:

b
B

(ACB)

ba*a

Elim. C:

bb

AC

bba*a

Elim. B:

(bb+bba*a)*b

ACB


Ex: Node Elimination: other orders


S

b
B

b
b

a
A


Ex: Other elimination orders (CAB)


Elim. C:

bb

a
bb

Elim. A:
bb

CA= AC (above)

a
bba*a

Elim. B:

(bb+bba*a)*b

CAB=ACB


Ex: Other elimination orders (CBA)


Elim. C:

bb

a
bb

Elim. B:
(bb)*b
S

CB

a
(bb)*bb

Elim. A:

a(bb)*b
a

(bb)*b+(bb)*bba*a(bb)*b

CBA


Ex: Other elimination orders (BAC)


bb

Elim. B:

ab

b
A

Elim. A:

b
S

bb
C

BA =AB
a

ba*ab

Elim. C:
S

b(bb+ba*ab)*

BAC=ABC


Ex: All elimination orders equiv (BAC = ACB)


BAC

b(bb+ba*ab)*

ACB

(bb+bba*a)*b

Easy to prove by induction: for any expression E, b(Eb)* = (bE)*b. Using this identity:
b(bb+ba*ab)* = b[(b+ba*a)b]*
             = [b(b+ba*a)]*b
             = (bb+bba*a)*b
Further regular expression simplification is possible:

b(bb+ba*ab)* = b[b(b+a*ab)]* = b[b($\varepsilon$+a*a)b]* = b[ba*b]*

Good exercise: show results of all other elimination orders are equivalent to these, using regular expression algebra

Algebra of Regular Expressions


An algebra for simplifying regular expressions
Can use this algebra to construct RegExps from FSA
r+s = s+r                                       (r+s)+t = r+(s+t)
r+$\emptyset$ = r                               (rs)t = r(st)
r$\emptyset$ = $\emptyset$r = $\emptyset$       r$\varepsilon$ = $\varepsilon$r = r
r+r = r
r(s+t) = rs + rt                                (r+s)t = rt + st

* =
r*

r* = r +

(r*)* = r*

r* = $\varepsilon$ + rr*

(r*s*)* = (r+s)*

Solving Regular Expr Equations


Can solve linear equations with regexp variables
X = aX + b
  = a(aX + b) + b = $a^2$X + ab + b
  = $a^2$(aX + b) + ab + b = $a^3$X + $a^2$b + ab + b
  = ...
X = a*b
Check: a[a*b] + b = aa*b + b = (aa* + $\varepsilon$)b = a*b
Ex:

X
X = aX + bY
Y = X = a*b


Solving RegExp Equations (contd)


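Ex: NFA → RegExp

[Figure: NFA with states A, B, C and edges labeled 0, 1; A is both the start state and the accepting state.]

A = 0A + 1B + $\varepsilon$
B = 1B + 0C
C = 0A + 1B

elim. B:   B = 1*0C
           A = 0A + 11*0C + $\varepsilon$
           C = 0A + 11*0C

elim. C:   C = (11*0)*0A
           A = 0A + 11*0(11*0)*0A + $\varepsilon$
           A = [0 + 11*0(11*0)*0]*

Simplify using reg. algebra:

[0 + 11*0(11*0)*0]* = [0 + 1(1+01)*00]*

(Gauss-Jordan elimination & back-substitution)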

Ex: Node Elimination Example via Algebra


[Figure: the NFA with states A, B, C from the node elimination example, edges labeled a, b.]

Want B. C is accept state.

A = aA + aB
B = bC
C = bA + bB + $\varepsilon$

Elim. A:   A = a*aB
           B = bC
           C = ba*aB + bB + $\varepsilon$

Elim. B:   C = ba*abC + bbC + $\varepsilon$
             = (ba*ab + bb)C + $\varepsilon$

Elim. C:   C = (ba*ab + bb)*
           B = bC = b(ba*ab + bb)*

Simplifies to B = b(ba*b)*

Closure Properties
A class of languages is said to be closed under an operation if
applying that operation to members of the class results in a
language that is again a member of the class. Example: the
regular languages are closed under the operations of union,
concatenation and Kleene star.
Thm: The regular languages are closed under intersection
and complementation.
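Pf: Complementation. Let L = L(M) where $M = (Q, \Sigma, \delta, s, F)$ is a DFA. Then the FA $\overline{M} = (Q, \Sigma, \delta, s, Q - F)$ is also deterministic, and

$(s, w) \vdash_M^* (q, \varepsilon) \iff (s, w) \vdash_{\overline{M}}^* (q, \varepsilon)$.

So w leads to a non-accepting state in M $\iff$ w leads to an accepting state in $\overline{M}$. So $L(\overline{M}) = \overline{L(M)}$.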

Closure Properties (contd)


Intersection. Let L1, L2 be regular. By DeMorgan's law

$L_1 \cap L_2 = \overline{\overline{L_1} \cup \overline{L_2}}$.

Since the regular languages are closed under complementation and union, the result follows.

Closure Properties (contd)


Another proof of closure under $\cap$ illustrates the technique called cross-product construction. See Sipser text, Theorem 1.25.
Thm: The class of regular languages is closed under the intersection operation.
Pf: Assume $L_1 = L(M_1)$, $M_1 = (Q_1, \Sigma_1, \delta_1, s_1, F_1)$ and $L_2 = L(M_2)$, $M_2 = (Q_2, \Sigma_2, \delta_2, s_2, F_2)$, where the automata are deterministic.
Construction. Construct a cross-product machine M as follows: $M = (Q_1 \times Q_2, \Sigma_1 \cap \Sigma_2, \delta_1 \times \delta_2, (s_1, s_2), F_1 \times F_2)$ where the transition function is defined by:

$(\delta_1 \times \delta_2)((q_1, q_2), a) = (\delta_1(q_1, a), \delta_2(q_2, a))$.

Machine M simulates the two given machines in parallel, keeping each machine state in one component of $(q_1, q_2)$.
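The cross-product machine is short to write down in code. A minimal sketch, with hypothetical example machines: `EVEN_A` (even number of a's) and `ENDS_B` (last symbol is b) are made up for illustration, and a DFA is represented as a tuple (states, sigma, delta, start, finals).

```python
from itertools import product


def product_dfa(m1, m2):
    """Cross-product DFA recognizing L(m1) ∩ L(m2)."""
    q1, sig1, d1, s1, f1 = m1
    q2, sig2, d2, s2, f2 = m2
    sigma = sig1 & sig2
    states = set(product(q1, q2))
    # Each component follows its own machine on the same input symbol.
    delta = {((p, q), a): (d1[(p, a)], d2[(q, a)]) for (p, q) in states for a in sigma}
    return states, sigma, delta, (s1, s2), set(product(f1, f2))


def accepts(m, w):
    _, sigma, delta, state, finals = m
    for a in w:
        if a not in sigma:
            return False
        state = delta[(state, a)]
    return state in finals


# Hypothetical example machines over {a, b}.
EVEN_A = ({"e", "o"}, {"a", "b"},
          {("e", "a"): "o", ("o", "a"): "e", ("e", "b"): "e", ("o", "b"): "o"},
          "e", {"e"})
ENDS_B = ({"n", "y"}, {"a", "b"},
          {("n", "a"): "n", ("y", "a"): "n", ("n", "b"): "y", ("y", "b"): "y"},
          "n", {"y"})

BOTH = product_dfa(EVEN_A, ENDS_B)
print(accepts(BOTH, "aab"), accepts(BOTH, "ab"))
```

Replacing the final-state set with the pairs in which at least one component is accepting would give the union instead of the intersection.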

Closure Properties (contd)


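Verification. By an easy induction on |x|, can show that

$((q_1, q_2), x) \vdash_M^* ((q_1', q_2'), \varepsilon) \iff (q_1, x) \vdash_{M_1}^* (q_1', \varepsilon) \wedge (q_2, x) \vdash_{M_2}^* (q_2', \varepsilon)$

Therefore, for a pair of final states $f_1 \in F_1$, $f_2 \in F_2$,

$((s_1, s_2), x) \vdash_M^* ((f_1, f_2), \varepsilon) \iff (s_1, x) \vdash_{M_1}^* (f_1, \varepsilon) \wedge (s_2, x) \vdash_{M_2}^* (f_2, \varepsilon)$

This says that

$x \in L(M) \iff x \in L(M_1) \wedge x \in L(M_2)$

i.e., that

$L(M) = L(M_1) \cap L(M_2)$. QED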

Closure Properties (contd)


Defn: A homomorphism h is a function that maps each symbol of $\Sigma = \{a_1, a_2, \ldots, a_n\}$ to a string over some alphabet $\Delta$, i.e.,

$h(a_1) = w_1, h(a_2) = w_2, \ldots, h(a_n) = w_n$

The homomorphism is extended to operate on strings character-by-character, i.e.,

$h(c_1 c_2 \cdots c_n) = h(c_1) h(c_2) \cdots h(c_n)$.

It is further extended to languages element-wise, i.e.,

$h(L) = \{h(w) : w \in L\}$.

Thm: If L is regular and h is a homomorphism, then h(L) is regular.
Pf: Assume L is recognized by a DFA $M = (Q, \Sigma, \delta, s, F)$.

Closure Properties (contd)


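Construction: Construct the machine $M_h = (Q, \Delta, \delta_h, s, F)$ where for each transition $p \xrightarrow{a} q$ in M, put into $M_h$ the transition $p \xrightarrow{h(a)} q$.
Note: $M_h$ is a GNFA, since its edges are labeled with strings.
An easy induction relates each computation $(s, w) \vdash_M^* (q, \varepsilon)$ of M to a computation of $M_h$ on h(w), from which it follows that

$L(M_h) = h(L(M))$. QED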

What is Not Regular?


FA have a very limited computing ability. They cannot, for example, recognize strings of well-nested parentheses, or well-formed arithmetic expressions, or even the language of strings of the form w#w, having two copies of the same substring.
How can we show some languages are not regular? We
will give a property that all regular languages must have
(called the pumping property). Then, to show that a
language L is not regular, we argue that it lacks this
pumping property.


Pumping Lemma
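Thm [Pumping Lemma for Regular Languages]. Suppose that L is an infinite regular language. Then

$(\exists p)(\forall w)[\, w \in L \wedge |w| \ge p \Rightarrow (\exists x, y, z)(w = xyz \wedge y \ne \varepsilon \wedge |xy| \le p \wedge (\forall i \ge 0)\, xy^iz \in L)\,]$

All finite languages are regular, so only infinite languages are of interest.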

Pumping Lemma (English)


Thm [Pumping Lemma for Regular Languages]. Suppose that L is an infinite regular language. Then there is some number p (called the pumping length) such that:
if w is any string in L with $|w| \ge p$, then w can be factored into 3 substrings, w = xyz, that satisfy the following 3 conditions:
(i) $y \ne \varepsilon$ [y is not empty]
(ii) $|xy| \le p$ [the prefix and pumped part are short]
(iii) for every $i \ge 0$, $xy^iz \in L$ [pumped up and pumped down (i = 0) versions of the string must also be in L]
All finite languages are regular, so only infinite languages are of interest.

Pumping Lemma (contd)


Pf: Let $M = (Q, \Sigma, \delta, s, F)$ be a DFA recognizing L and let p be the number of its states.
Let $w = a_1 a_2 \cdots a_n$ be an input string of length n where $n \ge p$.
Let $r_1, r_2, \ldots, r_n, r_{n+1}$ be the sequence of states that M enters while processing w, so that $r_{i+1} = \delta(r_i, a_i)$ for $1 \le i \le n$. This state sequence has length $n+1 \ge p+1$.
Among the first p+1 states of this sequence, at least 2 must be the same state [pigeonhole principle]. Call the first of these 2 $r_j$ and the second $r_k = r_j$. Because $r_k = r_j$ occurs among the first p+1 places in the sequence $r_1, r_2, \ldots, r_n, r_{n+1}$ we have that $k \le p+1$.
Define the following substrings of w:

$x = a_1 \cdots a_{j-1}$, $y = a_j \cdots a_{k-1}$, $z = a_k \cdots a_n$
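The proof is constructive: running the DFA and stopping at the first repeated state yields the factorization w = xyz. A minimal sketch on the three-state DFA $M_0$ from the earlier example (so p = 3); the function name `pump_split` and the 0-based indexing are choices of this illustration.

```python
# M0 from the earlier example: three states, so the pumping length is p = 3.
DELTA = {
    ("q0", "a"): "q1", ("q0", "b"): "q2",
    ("q1", "a"): "q2", ("q1", "b"): "q0",
    ("q2", "a"): "q2", ("q2", "b"): "q2",
}
START, FINAL = "q0", {"q1"}


def accepts(w: str) -> bool:
    state = START
    for a in w:
        state = DELTA[(state, a)]
    return state in FINAL


def pump_split(w: str):
    """Split w = xyz at the first repeated state on M0's run, as in the proof."""
    state = START
    first_seen = {state: 0}  # state -> number of symbols read when first entered
    for i, a in enumerate(w, start=1):
        state = DELTA[(state, a)]
        if state in first_seen:
            j = first_seen[state]
            return w[:j], w[j:i], w[i:]  # y = w[j:i] labels a loop on `state`
        first_seen[state] = i
    return None  # no repeat: w is shorter than the number of states


x, y, z = pump_split("ababa")
print(repr(x), repr(y), repr(z))
print([accepts(x + y * i + z) for i in range(5)])
```

For ababa the run revisits q0 after two symbols, giving x empty, y = ab and z = aba; every pumped string (ab)^i aba is again accepted.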

Pumping Lemma (contd)


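Picture:

[Figure: path through M from $r_1 = s$ to $r_{n+1}$. $x = a_1 \cdots a_{j-1}$ leads from $r_1$ to $r_j$; $y = a_j \cdots a_{k-1}$ is a loop from $r_j$ back to $r_k = r_j$; $z = a_k \cdots a_n$ leads from $r_k$ to $r_{n+1}$.]

From the picture, we see that there is an accepting path from $r_1 = s$ to a final state $r_{n+1}$ for all strings of the form $xy^iz$, $i \ge 0$. Furthermore, $j \ne k$ so $y \ne \varepsilon$. Also, since $k \le p+1$ it must be that $|xy| \le p$. QED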

Non-regular Examples
Ex: $L = \{a^{k^2} : k \ge 0\}$ is not regular.

Pf: By contradiction. Suppose L is regular. Then by the Pumping Lemma,

$(\exists x, y, z)$ $x = a^p$, $y = a^q$ $(q > 0)$, $z = a^r$ $[(\forall n \ge 0)\ a^p (a^q)^n a^r \in L]$.

Then it follows that $(\forall n \ge 0)$ $p + qn + r$ is a perfect square. This is impossible. For suppose $p + r + q n_0 = k_0^2$ for a $k_0$ so large that $2k_0 + 1 > q$. Then

$k_0^2 < p + r + q(n_0 + 1) = k_0^2 + q < k_0^2 + 2k_0 + 1 = (k_0 + 1)^2$.

Hence $p + r + q(n_0 + 1)$ falls in between perfect squares, a contradiction. QED

Non-regular Examples (contd)

Ex: $L = \{w w^R : w \in \{a, b\}^*\}$ is not regular.

Pf: By contradiction. Suppose L is regular. Then by closure properties of the regular languages $L_1 = L \cap a^*bba^*$ is regular. Now $L_1 = \{a^n bb a^n : n \ge 0\}$. We show $L_1$ cannot be regular, which provides a contradiction.
If $L_1$ is regular, then there are substrings x, y, z with $y \ne \varepsilon$ such that $(\forall n \ge 0)$ $xy^nz \in L_1$.
Case 1. y is entirely in the a's. Assume it is in the a's before the 2 b's (the other subcase is symmetric). Then

$x = a^p$, $y = a^q$, $z = a^r bb a^s$ $(q > 0)$ and $p + q + r = s$.

But then $xz = a^p a^r bb a^s \in L_1$ where $p + r \ne s$. This is a contradiction.

Non-regular Examples (contd)


Case 2. y contains a b. Then $xy^3z$ has more than 2 b's, and so $xy^3z \notin L_1$. This is a contradiction.
Contradictions in all cases $\Rightarrow$ contradiction to the assumption that $L_1$ is regular. So $L_1$ is not regular. QED

Ex: $L = \{a^n b^n : n \ge 0\}$ is not regular. See Text, Example 1.73.

Ex: $B = \{w \in \{(, )\}^* : w \text{ is a well-nested string of parentheses}\}$ is not regular.
Pf: Suppose B is regular. Then so is its intersection with the regular language of all strings of ('s followed by )'s, namely $\{(^n )^n : n \ge 0\}$, as is its homomorphic image $\{a^n b^n : n \ge 0\}$. Contradiction. QED

Decision Problems

For a property/predicate P the decision problem for P is:

Given: x
Question: Is P(x) true?

Ex: Given DFA M, is it true that $L(M) = \emptyset$?

Thm: For DFA M all the following decision problems are solvable, i.e. there exists an algorithm to decide the question for any input:

     Given      Question
(1)  M, w       $w \in L(M)$ ?
(2)  M          $L(M) = \emptyset$ ?
(3)  M          $L(M) = \Sigma^*$ ?
(4)  M, M'      $L(M) \subseteq L(M')$ ?
(5)  M, M'      $L(M) = L(M')$ ?

Decision Problems (contd)


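Pf: Assume given DFAs for inputs.
(1) Trace w through M. Yes iff w leads from s to some $q \in F$.
(2) Yes iff no $q \in F$ is reachable from s ($L(M) \ne \emptyset$ exactly when some accepting state is reachable).
(3) Convert $M \to \overline{M}$. $L(M) = \Sigma^* \iff L(\overline{M}) = \emptyset$.
(4) $L(M_1) \subseteq L(M_2) \iff \overline{L(M_2)} \cap L(M_1) = \emptyset$

[Figure: Venn diagram of $L(M_1)$ inside $L(M_2)$.]

(5) Use (4) twice.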
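Steps (2) and (3) of the proof reduce to graph reachability. A minimal sketch, assuming a DFA is given as a tuple (states, sigma, delta, start, finals) with a total transition function; the function names and the second example machine `DEAD` are illustrative.

```python
from collections import deque


def reachable_states(delta, start, sigma):
    """Breadth-first search over the DFA's transition graph."""
    seen, queue = {start}, deque([start])
    while queue:
        q = queue.popleft()
        for a in sigma:
            r = delta[(q, a)]
            if r not in seen:
                seen.add(r)
                queue.append(r)
    return seen


def is_empty(dfa) -> bool:
    """L(M) is empty iff no accepting state is reachable from the start state."""
    _, sigma, delta, start, finals = dfa
    return not (reachable_states(delta, start, sigma) & finals)


def is_universal(dfa) -> bool:
    """L(M) = Σ* iff the complement machine (final states flipped) is empty."""
    states, sigma, delta, start, finals = dfa
    return is_empty((states, sigma, delta, start, states - finals))


# M0 from the earlier example.
M0 = ({"q0", "q1", "q2"}, {"a", "b"},
      {("q0", "a"): "q1", ("q0", "b"): "q2",
       ("q1", "a"): "q2", ("q1", "b"): "q0",
       ("q2", "a"): "q2", ("q2", "b"): "q2"},
      "q0", {"q1"})

# Hypothetical machine whose only accepting state u cannot be reached.
DEAD = ({"t", "u"}, {"a"}, {("t", "a"): "t", ("u", "a"): "u"}, "t", {"u"})

print(is_empty(M0), is_universal(M0), is_empty(DEAD))
```

Inclusion and equality, steps (4) and (5), follow the same pattern: build the cross-product of one machine with the complement of the other and test it for emptiness.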
