Nondeterministic Finite Automata

Motivation for Nondeterminism

Nondeterminism in Computer Science is not so much about guessing as having a number of viable choices available at some point and not knowing which choice "leads to a solution." Algorithms, unless randomized, are deterministic. For our purposes, nondeterministic computing is more about the ability to computationally verify a solution once we have somehow come up with it.

The issue of non-determinism presents itself immediately when we try to take a regular expression and create an automata which accepts its language. What we are trying to establish is the notion of a Nondeterministic Finite Automata, or NFA.

Example 1

a(bab)*∪a(ba)*
Although we could reason it out and find a DFA, an NFA is much simpler:
We can verify that the string ababa is accepted by this NFA once we "guess" the state path q0,q2,q5,q2,q5,q2 ∈ F. Of course the only choice is the first one. If we made the wrong start q0,q1,q3,q4,q1 we reach a point where we have a remaining a symbol to process with no place to go. This is obviously a failure. We could make an "a" transition to a dead state, but the omission of the transition is more succinct.

In mathematical terms, the transition function, δ: K×Σ ⇾ K becomes a transition relation, Δ ⊆ K×Σ×K. Examples:
  1. both (q0,a,q1), (q0,a,q2) ∈ Δ, so it's not functional
  2. there is no state q such that (q0,b,q) ∈ Δ

Example 2 (textbook)

The textbook example is also quite good: (ab∪aba)*.

A DFA is presented in Figure 2-4 and an NFA for it in Figures 2-5 and 2.6. You can see how much more simple and intuitive the NFAs are.

Example 3

As a final addition to our NFA's structure we consider this example: (ab)*∪(aba)*.

This is, of course, different than the former example (ab∪aba)*. Why? What we need to avoid is coming back the same start state after taking either an "ab" or "aba" cycle. We can do so by adding ε transitions:
Loosely, an ε transition is one which consumes no input and represents a pure nondeterministic choice of being in the state or the target state without having done any processing.

Definition of an NFA

An NFA is (K,Σ,Δ,s,F) where all components, with exception of the new transition relation Δ, have the same significance as in a DFA. The transition relation is this:
Δ ⊆  K × (Σ∪{ε}) × K
The yields in one step relation uses basically the same idea as in a DFA, except that we must define the behavior for ε. The yields relation, ⊢* is defined exactly the same as in a DFA, i.e., the reflexive, transitive closure of the yields relation. Acceptance of a string is defined exactly the same as in a DFA:
(s,w) ⊢* (f,ε)  for some f ∈ F
In this case, of course, the string is not decided, since we have to guess the accepting path. Consider the NFA from Example 3 and show that aba is accepted.
(q0, aba) ⊢ (q2, aba)    because (q0,ε,q2) ∈ Δ
          ⊢ (q4, ba)     because (q2,a,q4) ∈ Δ
          ⊢ (q5, a)      because (q4,a,q5) ∈ Δ
          ⊢ (q2, ε)      because (q5,a,q2) ∈ Δ
    q2 ∈ F, and so aba is accepted.

Not all authors include ε transitions in an NFA's definition. Having them makes the NFA⇒DFA construction (and thus the proof) more complicated.

Example 4 (textbook)

Consider the example 2.2.2 in the textbook. Given an alphabet Σ, the language is:
{ w : w omits at least one symbol σ ∈ Σ }
This only gets interesting with 3 or more symbols. Suppose Σ = {a,b,c}. Then we can easily express this language as:
(a∪b)*∪(a∪c)*∪(b∪c)*
Likewise, it is just as easy to construct an NFA for this language. Figure 2-8 is a generalization. When you have n symbols, the NFA requires only n+1 states. We will show that a minimal DFA for this language with n symbols has a worst-case of 2n states.

Rabin-Scott: NFA to DFA

This is one of the classical theorems in the study of Automata which dates back to 1959.

Theorem (Rabin/Scott): For each NFA there is an equivalent DFA (they accept the same language).

Start with an NFA (K,Σ,Δ,s,F). We first do the construction and some illustrative examples and then deal with the proof.

The basic idea is that the non-deterministic choice of states which can be reached by the string w is captured as a set of these possible states. Thus the equivalent DFA is
(2K,Σ,δ,s′,F′)
where these are to be defined:
s′ ⊆ K
δ: 2K×Σ ⇾ 2K
F′ ⊆ 2K
We'll use Example 3 to make definite the construction details. The start state s′ consists of all possible "initial" state choices = {q0,q1,q2}. How to characterize this set? It is the ε-closure of q0, written:
Ε(q0)

The ε-closure

The ε-closure of a state p is the state p and all those reachable from p by one or more ε-transition. In our formalism, we write:
Ε: K ⇾ 2K
Ε(p) = { q : (p,ε) ⊢* (q,ε) }  

The idea is that all states in the constructed DFA, which are sets of states from the NFA, should be ε-closed, meaning that the epsilon closures have already been added.

In particular, the start state of the DFA is the ε-closure of the NFA start state, i.e.,
s′ = Ε(s)

Final States

Reviewing our idea that, for a given string w the relevant DFA state is the set of all NFA states reachable from the start state by w. Thus a final state of the DFA should be any state set which contains some final state of the NFA. In formal terms:
F' = { Q ∈ 2K :  Q ∩ F ≠ ∅ }   is all sets of state sets which contain at least one final state
Not all these state are "reachable", but it doesn't hurt to list them all as final.

Transitions

In general, for Q ⊆ K and σ ∈ Σ, define:
δ(Q, σ) =  Ε(p) for all p such that (q,σ,p) ∈ Δ for some q ∈ Q
The idea is that, assuming Q is already ε-closed, take a σ transition from any state in Q, and then include the ε closure of that state.

In our example,
δ( {q0,q1,q2}, a ) = {q3,q4}
getting all states reachable from one of {q0,q1,q2} by an a-transition, and then again adding the ε-closures of all these states (which produces nothing new in this case). Taking transitions out of the first state gives:
δ( {q0,q1,q2}, a ) = {q3,q4}
δ( {q0,q1,q2}, b ) = ∅
The state is a dead state since it cannot be final and all transitions out of it yield .

This process of finding all states seems rather daunting. There are 6 states in the NFA, and therefore, up to 26 = 64 states in an equivalent DFA. Proceed by branching out from the start state and keep going until no new states are produced. Here is the complete set of transitions:
δ( {q0,q1,q2}, a ) = {q3,q4}
δ( {q0,q1,q2}, b ) = ∅

δ( {q3,q4}, a ) = ∅
δ( {q3,q4}, b ) = {q1,q5}

δ( {q1,q5}, a ) = {q2,q3}
δ( {q1,q5}, b ) = ∅

δ( {q2,q3}, a ) = {q4}
δ( {q2,q3}, b ) = {q1}

δ( {q4}, a ) = ∅
δ( {q4}, b ) = {q5}
δ( {q1}, a ) = {q3}
δ( {q1}, b ) = ∅

δ( {q3}, a ) = ∅
δ( {q3}, b ) = {q1}    

δ( {q5}, a ) = {q2}
δ( {q5}, b ) = ∅

δ( {q2}, a ) = {q4}
δ( {q2}, b ) = ∅



 NFA (Example 3 above):
Our DFA consists of these 10 NFA state sets:
final                    non-final
{q0,q1,q2}             {q3,q4}
{q1,q5}                {q3}
{q2,q3}                {q4}
{q1}                   {q5}
{q2}                   ∅

Other NFA to DFA examples

Good examples are the NFAs from figures 2-5, 2-6 and 2-7 in the textbook.

This is the NFA from Figure 2-6:
We convert it into the following DFA, matching Figure 2-4:
For another first example, consider the regular expression representing all strings with aba as a substring:
(a∪b)*aba(a∪b)*
The NFA which realizes it is obvious:
It is converted into the following DFA:

The three final states of this DFA are equivalent (we'll discuss that later). The implication is that they can be grouped together into a set which acts as a single state which loops back to itself. In this way, we get the expected DFA with a single final state.


As a second example, consider the regular expression

(a*∪(ab)*)b*
Here is a suitable 5-state NFA:
The equivalent DFA is:

When drawing the DFA, it is probably a good idea to proceed in the indicated tree-like expansion from the start state in order to provide for yourself a methodology for placement of new states.

Worst-case size increase NFA to DFA

Consider the alphabet {a,b,c,d} and the language where every string omits at least one symbol. The language is this:
(b∪c∪d)*∪(a∪c∪d)*∪(a∪b∪d)*∪(a∪b∪c)*
Here's the NFA:
We've named the key states by the symbols converted to upper case. The implication of state A is that it accepts all and only the strings the symbol a, etc. The DFA's start state is {A,B,C,D,0}. The NFA's start state, 0, appears only once, within the DFA's start state and does not contribute toward the construction in any way; thus we'll ignore it.

Claim: The state set of the constructed DFA consists of all 24 = 16 states which are subsets of {A,B,C,D}.

The argument is based on the observation that for Q ⊆ {A,B,C,D},
δ(Q,σ) = Q - {upper_case(σ)}, σ = a,b,c,d
For example,
δ( {A,B,C,D}, a ) = {B,C,D}
δ( {B,C,D}, a ) = {B,C,D}.

In general, for the an alphabet with n symbols and n states we would prove that all possible subsets are in the DFA by induction on the size of the set, going down in size. For our example we would argue: For example with n=4, to argue that an arbitrary set of size 2 is in the DFA, say {B,D}, put back one missing NFA state, say C, getting {B,C,D} of size 3. Inductively, {B,C,D} is in the DFA and consequently δ({B,C,D},c) = {B,D} is in the DFA. Here is a diagram of the resulting DFA for n=4:
Except for the empty set, all states are final. The crucial point which we will address later is that all 16 states are necessary. Technically speaking, we will show that all states are distinguishable, meaning that this is the minimal DFA for the given NFA.

Proof of Rabin-Scott Theorem

There are two configuration transition relations to consider:
    defined by Δ from the NFA
⊢′   defined by δ from the DFA
We prove the following: for all strings w and states q:
A: For all NFA states p:  (q,w) ⊢* (p,ε)  ⇒  (Ε(q),w) ⊢′* (P,ε), where P ∋ p
B: (Ε(q),w) ⊢′* (P,ε)  ⇒  for all states p ∈ P, (q,w) ⊢* (p,ε)
The theorem follows directly from this result. If w is accepted by the NFA it means there is a final state f and a transition: (s,w) ⊢* (f,ε). Then, by (A), using s′ = E(s), the state state of the DFA, (s',w) ⊢′* (P,ε), where P ∋ f, meaning that P is final in the DFA, and so w is accepted by the DFA.

Conversely if w is accepted by the DFA, it means that (s′,w) ⊢′* (P,ε), where P is a final state. Then P must contain some NFA final state f, and by (B), (s,w) ⊢* (f,ε), meaning that w is accepted by the NFA.

The proof of both parts is by induction on the length of w.

Base case

w = ε
A: For all states p:    (q,ε) ⊢* (p,ε)  ⇒  (Ε(q),ε) ⊢′* (P,ε), where P ∋ p
B: (Ε(q),ε) ⊢′* (P,ε)  ⇒ for all states p ∈ P, (q,ε) ⊢* (p,ε)
(q,ε) ⊢* (p,ε) implies that p ∈ Ε(q). On the other hand, since ⊢′* is deterministic, (Ε(q),ε) ⊢′* (P,ε) implies that P = Ε(q). The two implications are obvious.

Induction

Make inductive assumptions:
Aind: For all states p:   (q,v) ⊢* (p,ε)  ⇒  (Ε(q),v) ⊢′* (P,ε), where P ∋ p
Bind: (Ε(q),v) ⊢′* (P,ε)  ⇒  for all states p ∈ P, (q,v) ⊢* (p,ε)
and prove, for w = vσ:
A: For all states p:    (q,vσ) ⊢* (p,ε)  ⇒  (Ε(q),vσ) ⊢′* (P,ε), where P ∋ p
B: (Ε(q),vσ) ⊢′* (P,ε)  ⇒  for all states p ∈ P, (q,vσ) ⊢* (p,ε)

proof of A:

Isolating the σ transition on the left implies there exist states r1, r2 such that:
(q,vσ) ⊢* (r1,σ) ⊢ (r2,ε) ⊢* (p,ε)
Analyze each part:
  1. (q,vσ) ⊢* (r1,σ) implies (q,v) ⊢* (r1,ε)
    which implies by induction that: (Ε(q),v) ⊢′* (R1,ε), where r1 ∈ R1
  2. (r1,σ) ⊢ (r2,ε) implies by definition of δ that r2 ∈ δ(R1,σ)
    Setting P = δ(R1,σ), it follows by definition that (R1,σ) ⊢′ (P,ε)
  3. (r2,ε) ⊢* (p,ε) implies p ∈ Ε(r2) ⊆ δ(R1,σ) = P — since δ(R1,σ) is the union of all ε-closures

The result in (a) implies that: (Ε(q),vσ) ⊢′* (R1,σ);

add (b) to get

(Ε(q),vσ) ⊢′* (R1,σ) ⊢′ (P,ε)

(c) tells us that p ∈ P.

We derive the desired result by collapsing the middle term and restating:

(Ε(q),vσ) ⊢′* (P,ε) where P ∋ p.

proof of B:

A similar approach: isolate the σ transition, writing:
(Ε(q),vσ) ⊢′* (R1,σ) ⊢′ (P,ε)
Again, analyze the parts:
  1. (Ε(q),vσ) ⊢′* (R1,σ) implies (Ε(q),v) ⊢′* (R1,ε)
    which implies by induction that (q,v) ⊢* (r,ε) for all r ∈ R1.
  2. (R,σ) ⊢′ (P,ε) means that P = δ(R1,σ), which, by definition of the DFA transitions, means that,
    for each p ∈ P, there exist states r1 and r2 such that r1 ∈ R1, (r1,σ,r2) ∈ Δ, and p ∈ Ε(r2).
  3. Restating (b) says: for each p ∈ P, there exists states r1 ∈ R1, and r2
    so that (r1,σ) ⊢ (r2,ε) and (r2,ε) ⊢* (p,ε).
  4. Restating and simplifying (c) says: for each p ∈ P, there exists a state r1 ∈ R1 such that (r1,σ) ⊢* (p,ε).
Choose a state p ∈ P. By (d) we get a state r1 ∈ R1, so that (r1,σ) ⊢* (p,ε). By (a), since r1 ∈ R1, (q,v) ⊢* (r1,ε) which implies that (q,vσ) ⊢* (r1,σ)

Then join these two results: (q,vσ) ⊢* (r1,σ) ⊢* (p,ε)

Restating and collapsing the inner transitions we have our proof: for each p ∈ P, (q,vσ) ⊢* (p,ε). ■

Textbook ambiguity ?

The textbook, at the top of page 71, states basically the same claim as we have proved, but with the following wording:
For all strings w and states p, q:
(q,w) ⊢* (p,ε)  if and only if  (Ε(q),w) ⊢′* (P,ε)
for some set P containing p.
Although slicker than my version, it is, in my opinion, too slick. Consider the "if":
For all strings w and states p, q:
(Ε(q),w) ⊢′* (P,ε)  ⇒  (q,w) ⊢* (p,ε)
for some set P containing p.
The problem is that the "for some set P containing p" is ambiguous in its placement at the end of this phrase. One interpretation might be this:
For all strings w and states p, q, there exists a set P containing p such that
(Ε(q),w) ⊢′* (P,ε)  ⇒  (q,w) ⊢* (p,ε)
There are other possible interpretations, but the problem with all of them is that the set P is specified after the state p is chosen. Thus one cannot exclude that possibility that P contains other states p′ which do not satisfy (q,w) ⊢* (p′,ε). What you really want to say is the way it is phrased in (B) above. Both the textbook's and my proof rely upon this interpretation.


© Robert M. Kline