Motivation for Nondeterminism
Nondeterminism in Computer Science is not so much about guessing as about having a number of viable choices available at some point and not knowing which choice "leads to a solution." Algorithms, unless randomized, are deterministic. For our purposes, nondeterministic computing is more about the ability to computationally verify a solution once we have somehow come up with it. The issue of nondeterminism presents itself immediately when we try to take a regular expression and create an automaton which accepts its language. What we are trying to establish is the notion of a Nondeterministic Finite Automaton, or NFA.
Example 1
a(bab)* ∪ a(ba)*
Although we could reason it out and find a DFA, an NFA is much simpler:
- both (q0,a,q1) and (q0,a,q2) are in Δ, so Δ is not functional
- there is no state q such that (q0,b,q) ∈ Δ, so Δ is not total
Example 2 (textbook)
The textbook example is also quite good: (ab∪aba)*. A DFA is presented in Figure 2.4 and NFAs for it in Figures 2.5 and 2.6. You can see how much simpler and more intuitive the NFAs are.
Example 3
As a final addition to our NFA's structure we consider this example: (ab)*∪(aba)*. This is, of course, different from the former example (ab∪aba)*. Why? We need to avoid coming back to the same start state after taking either an "ab" cycle or an "aba" cycle. We can do so by adding ε-transitions:
Definition of an NFA
An NFA is (K,Σ,Δ,s,F) where all components, with the exception of the new transition relation Δ, have the same significance as in a DFA. The transition relation is:

Δ ⊆ K × (Σ∪{ε}) × K

In a relation such as this, we can think of the elements as triples, as indicated, or as pairs:

Δ ⊆ ( K × (Σ∪{ε}) ) × K

where the first element is itself a pair. The yields-in-one-step relation uses basically the same idea as in a DFA, except that we must define the behavior for ε.

For each σ ∈ Σ and w ∈ Σ*: (p,σw) ⊢ (q,w) if and only if (p,σ,q) ∈ Δ

For all w ∈ Σ*: (p,w) ⊢ (q,w) if and only if (p,ε,q) ∈ Δ

A string w is accepted if and only if

(s,w) ⊢* (f,ε) for some f ∈ F

In this case, of course, the string is not decided, since we have to guess the accepting path. Consider the NFA from Example 3 and show that aba is accepted.

(q0, aba) ⊢ (q2, aba)  because (q0,ε,q2) ∈ Δ
          ⊢ (q4, ba)   because (q2,a,q4) ∈ Δ
          ⊢ (q5, a)    because (q4,b,q5) ∈ Δ
          ⊢ (q2, ε)    because (q5,a,q2) ∈ Δ

Since q2 ∈ F, aba is accepted. Not all authors include ε-transitions in an NFA's definition. Having them makes the NFA⇒DFA construction (and thus the proof) more complicated.
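The accepting computation above can be checked mechanically. Below is a minimal Python sketch of NFA acceptance for Example 3; the state names and transition triples follow the derivation above, and taking F = {q1, q2} is an assumption consistent with the run ending in q2.

```python
from itertools import chain

# Transition relation of the Example 3 NFA for (ab)* ∪ (aba)*,
# with "" standing for ε.  FINAL = {q1, q2} is an assumed final-state set.
DELTA = {
    ("q0", ""): {"q1", "q2"},                        # ε-moves out of the start
    ("q1", "a"): {"q3"}, ("q3", "b"): {"q1"},        # (ab)* loop
    ("q2", "a"): {"q4"}, ("q4", "b"): {"q5"}, ("q5", "a"): {"q2"},  # (aba)* loop
}
FINAL = {"q1", "q2"}

def eclose(states):
    """The states plus everything reachable by ε-transitions."""
    seen, todo = set(states), list(states)
    while todo:
        p = todo.pop()
        for q in DELTA.get((p, ""), ()):
            if q not in seen:
                seen.add(q)
                todo.append(q)
    return seen

def accepts(w):
    """Simulate the NFA by tracking the set of currently possible states."""
    current = eclose({"q0"})
    for sym in w:
        current = eclose(set(chain.from_iterable(
            DELTA.get((p, sym), ()) for p in current)))
    return bool(current & FINAL)
```

For instance, accepts("aba") and accepts("abab") are True, while accepts("ababa") is False, matching (ab)* ∪ (aba)*.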
Example 4 (textbook)
Consider Example 2.2.2 in the textbook. Given an alphabet Σ, the language is:

{ w : w omits at least one symbol σ ∈ Σ }
This only gets interesting with 3 or more symbols.
Suppose Σ = {a,b,c}. Then we can easily express
this language as:
(a∪b)* ∪ (a∪c)* ∪ (b∪c)*

Likewise, it is just as easy to construct an NFA for this language. Figure 2.8 is a generalization. When you have n symbols, the NFA requires only n+1 states. We will show that a minimal DFA for this language with n symbols has a worst case of 2^{n} states.
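As a sanity check, the condition "omits at least one symbol" and the regular expression can be compared by brute force. The sketch below uses Python's re module; the helper name omits_one is ours.

```python
import re
from itertools import product

# (a∪b)* ∪ (a∪c)* ∪ (b∪c)* transliterated into Python regex syntax
PATTERN = re.compile(r"(?:[ab]*|[ac]*|[bc]*)\Z")

def omits_one(w, sigma="abc"):
    """True when some symbol of Σ does not occur in w."""
    return set(w) != set(sigma)

# brute force: the two definitions agree on all strings of length <= 5
for n in range(6):
    for letters in product("abc", repeat=n):
        w = "".join(letters)
        assert (PATTERN.match(w) is not None) == omits_one(w)
```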
Rabin-Scott: NFA to DFA
This is one of the classical theorems in the study of automata, dating back to 1959.

Theorem (Rabin/Scott): For each NFA there is an equivalent DFA (they accept the same language).

Start with an NFA (K,Σ,Δ,s,F). We first do the construction and some illustrative examples and then deal with the proof. The basic idea is that the nondeterministic choice of states which can be reached by the string w is captured as a set of these possible states. Thus the equivalent DFA is

(2^{K}, Σ, δ, s′, F′)

where these are to be defined:

s′ ⊆ K
δ: 2^{K} × Σ ⇾ 2^{K}
F′ ⊆ 2^{K}

We'll use Example 3 to make the construction details definite. The start state s′ consists of all possible "initial" state choices = {q0,q1,q2}. How to characterize this set? It is the ε-closure of q0, written:

Ε(q0)
The ε-closure

The ε-closure of a state p is the state p itself together with all states reachable from p by one or more ε-transitions. In our formalism, we write:

Ε: K ⇾ 2^{K}
Ε(p) = { q : (p,ε) ⊢* (q,ε) }

The idea is that all states in the constructed DFA, which are sets of states from the NFA, should be ε-closed, meaning that the ε-closures have already been added.
In particular, the start state of the DFA is the ε-closure of the NFA start state, i.e.,
s′ = Ε(s)
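For Example 3, where the only ε-transitions are assumed to be (q0,ε,q1) and (q0,ε,q2), the closure can be computed by a small worklist search:

```python
# ε-edges of the Example 3 NFA (assumed: q0 has ε-moves to q1 and q2)
EPS = {"q0": {"q1", "q2"}}

def E(p):
    """Ε(p): p itself plus all states reachable by ε-transitions alone."""
    closure, frontier = {p}, [p]
    while frontier:
        q = frontier.pop()
        for r in EPS.get(q, ()):
            if r not in closure:
                closure.add(r)
                frontier.append(r)
    return closure
```

E("q0") yields {q0, q1, q2}, which is exactly s′; for any state with no outgoing ε-transitions, E returns just that state.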
Final States
Reviewing our idea: for a given string w, the relevant DFA state is the set of all NFA states reachable from the start state by w. Thus a final state of the DFA should be any state set which contains some final state of the NFA. In formal terms:

F′ = { Q ∈ 2^{K} : Q ∩ F ≠ ∅ }

i.e., all state sets which contain at least one final state. Not all these states are "reachable" from the start state, but it doesn't hurt to list them all as final.
Transitions
In general, for Q ⊆ K and σ ∈ Σ, define:

δ(Q, σ) = ∪ { Ε(p) : (q,σ,p) ∈ Δ for some q ∈ Q }

which, for a single state transition from a state in Q, could be depicted like this:

δ( {q0,q1,q2}, a ) = {q3,q4}

getting all states reachable from one of {q0,q1,q2} by an a-transition, and then adding the ε-closures of all these states (which produces nothing new in this case). Taking transitions out of the first state gives:

δ( {q0,q1,q2}, a ) = {q3,q4}
δ( {q0,q1,q2}, b ) = ∅

The state ∅ is a dead state since it cannot be final and all transitions out of it yield ∅. This process of finding all states seems rather daunting. There are 6 states in the NFA, and therefore up to 2^{6} = 64 states in an equivalent DFA. Proceed by branching out from the start state and keep going until no new states are produced. Here is the complete set of transitions:
δ( {q0,q1,q2}, a ) = {q3,q4}
δ( {q0,q1,q2}, b ) = ∅
δ( {q3,q4}, a ) = ∅
δ( {q3,q4}, b ) = {q1,q5}
δ( {q1,q5}, a ) = {q2,q3}
δ( {q1,q5}, b ) = ∅
δ( {q2,q3}, a ) = {q4}
δ( {q2,q3}, b ) = {q1}
δ( {q4}, a ) = ∅
δ( {q4}, b ) = {q5}
δ( {q1}, a ) = {q3}
δ( {q1}, b ) = ∅
δ( {q3}, a ) = ∅
δ( {q3}, b ) = {q1}
δ( {q5}, a ) = {q2}
δ( {q5}, b ) = ∅
δ( {q2}, a ) = {q4}
δ( {q2}, b ) = ∅
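The branching-out process just described can be sketched as a worklist subset construction. The code below assumes the Example 3 transition triples read off from the table above; it recovers the same ten reachable state sets, including the dead state ∅.

```python
# Transition triples of the Example 3 NFA, with "" standing for ε
NFA = {
    ("q0", ""): {"q1", "q2"},
    ("q1", "a"): {"q3"}, ("q3", "b"): {"q1"},
    ("q2", "a"): {"q4"}, ("q4", "b"): {"q5"}, ("q5", "a"): {"q2"},
}

def eclose(S):
    """ε-closure of a set of states, returned as a hashable frozenset."""
    S, todo = set(S), list(S)
    while todo:
        p = todo.pop()
        for q in NFA.get((p, ""), ()):
            if q not in S:
                S.add(q)
                todo.append(q)
    return frozenset(S)

def subset_construction(start="q0", sigma="ab"):
    """Branch out from Ε(start), adding new state sets until none appear."""
    s0 = eclose({start})
    states, todo, delta = {s0}, [s0], {}
    while todo:
        Q = todo.pop()
        for a in sigma:
            # union of ε-closures of all σ-successors of states in Q
            R = eclose(set().union(*(NFA.get((p, a), set()) for p in Q)))
            delta[(Q, a)] = R
            if R not in states:
                states.add(R)
                todo.append(R)
    return s0, states, delta
```

Running subset_construction() reproduces the transition table: ten reachable DFA states rather than the worst-case 64.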
The DFA states for the NFA of Example 3 above, grouped as final and non-final:

final        non-final
{q0,q1,q2}   {q3,q4}
{q1,q5}      {q3}
{q2,q3}      {q4}
{q1}         {q5}
{q2}         ∅
Other NFA to DFA examples
Good examples are the NFAs from Figures 2.5, 2.6 and 2.7 in the textbook. This is the regular expression from Figure 2.6:

(a∪b)* aba (a∪b)*

The NFA which realizes it is obvious:
The three final states of this DFA are equivalent (we'll discuss that later). The implication is that they can be grouped together into a set which acts as a single state which loops back to itself. In this way, we get the expected DFA with a single final state.
As a second example, consider the regular expression
(a*∪(ab)*)b*
When drawing the DFA, it is probably a good idea to proceed in the indicated tree-like expansion from the start state, to give yourself a methodology for placing new states.
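While working out the DFA, it can help to spot-check membership mechanically. Here is a sketch using Python's re module; the pattern is our transliteration of the regular expression.

```python
import re

# (a*∪(ab)*)b* transliterated into Python regex syntax;
# \Z anchors the match at the end of the string
PATTERN = re.compile(r"(?:a*|(?:ab)*)b*\Z")

def in_language(w):
    """True when w is in the language of (a*∪(ab)*)b*."""
    return PATTERN.match(w) is not None
```

For example, in_language("ababb") and in_language("aabbb") are True, while in_language("aba") and in_language("ba") are False.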
Worst-case size increase: NFA to DFA
Consider the alphabet {a,b,c,d} and the language in which every string omits at least one symbol:

(b∪c∪d)* ∪ (a∪c∪d)* ∪ (a∪b∪d)* ∪ (a∪b∪c)*

Here's the NFA:

Claim: The state set of the constructed DFA consists of all 2^{4} = 16 subsets of {A,B,C,D}.

The argument is based on the observation that, for Q ⊆ {A,B,C,D},

δ(Q,σ) = Q − {upper_case(σ)}, for σ = a,b,c,d

For example,

δ( {B,C,D}, a ) = {B,C,D}.
- All sets of size n are in the DFA (there's only one).
- All sets of size n−1 are in the DFA (there are n of them).
- ...
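The counting argument above can be sketched directly: starting from {A,B,C,D} and applying δ(Q,σ) = Q − {upper_case(σ)}, every one of the 16 subsets is reached.

```python
# DFA states are subsets of {A,B,C,D}; reading σ kills the possibility
# "omits σ", i.e. δ(Q,σ) = Q − {upper_case(σ)}
def reachable(start=frozenset("ABCD"), sigma="abcd"):
    """All DFA state sets reachable from the start set."""
    seen, todo = {start}, [start]
    while todo:
        Q = todo.pop()
        for s in sigma:
            R = Q - {s.upper()}
            if R not in seen:
                seen.add(R)
                todo.append(R)
    return seen
```

Every subset of {A,B,C,D} is reachable, so the constructed DFA really does have 2^{4} = 16 states.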
Proof of Rabin-Scott Theorem

There are two configuration transition relations to consider:
- ⊢ defined by Δ from the NFA
- ⊢′ defined by δ from the DFA

We prove two claims:

A: For all NFA states p: (q,w) ⊢* (p,ε) ⇒ (Ε(q),w) ⊢′* (P,ε), where P ∋ p
B: (Ε(q),w) ⊢′* (P,ε) ⇒ for all states p ∈ P, (q,w) ⊢* (p,ε)

The theorem follows directly from this result. If w is accepted by the NFA, there is a final state f such that (s,w) ⊢* (f,ε). Then, by (A), using s′ = Ε(s), the start state of the DFA, (s′,w) ⊢′* (P,ε), where P ∋ f, meaning that P is final in the DFA, and so w is accepted by the DFA.

Conversely, if w is accepted by the DFA, then (s′,w) ⊢′* (P,ε), where P is a final state. Then P must contain some NFA final state f, and by (B), (s,w) ⊢* (f,ε), meaning that w is accepted by the NFA.
The proof of both parts is by induction on the length of w.

Base case: w = ε

A: For all states p: (q,ε) ⊢* (p,ε) ⇒ (Ε(q),ε) ⊢′* (P,ε), where P ∋ p
B: (Ε(q),ε) ⊢′* (P,ε) ⇒ for all states p ∈ P, (q,ε) ⊢* (p,ε)

(q,ε) ⊢* (p,ε) implies that p ∈ Ε(q). On the other hand, since ⊢′ is deterministic, (Ε(q),ε) ⊢′* (P,ε) implies that P = Ε(q). Both implications are then obvious.
Induction
Make the inductive assumptions:

A_{ind}: For all states p: (q,v) ⊢* (p,ε) ⇒ (Ε(q),v) ⊢′* (P,ε), where P ∋ p
B_{ind}: (Ε(q),v) ⊢′* (P,ε) ⇒ for all states p ∈ P, (q,v) ⊢* (p,ε)

and prove, for w = vσ:

A: For all states p: (q,vσ) ⊢* (p,ε) ⇒ (Ε(q),vσ) ⊢′* (P,ε), where P ∋ p
B: (Ε(q),vσ) ⊢′* (P,ε) ⇒ for all states p ∈ P, (q,vσ) ⊢* (p,ε)
Proof of A:

Isolating the σ transition on the left implies there exist states r_{1}, r_{2} such that:

(q,vσ) ⊢* (r_{1},σ) ⊢ (r_{2},ε) ⊢* (p,ε)

Analyze each part:

(a) (q,vσ) ⊢* (r_{1},σ) implies (q,v) ⊢* (r_{1},ε), which implies by induction that (Ε(q),v) ⊢′* (R_{1},ε), where r_{1} ∈ R_{1}.

(b) (r_{1},σ) ⊢ (r_{2},ε) implies, by definition of δ, that r_{2} ∈ δ(R_{1},σ). Setting P = δ(R_{1},σ), it follows by definition that (R_{1},σ) ⊢′ (P,ε).

(c) (r_{2},ε) ⊢* (p,ε) implies p ∈ Ε(r_{2}) ⊆ δ(R_{1},σ) = P, since δ(R_{1},σ) is a union of ε-closures.

The result in (a) implies that (Ε(q),vσ) ⊢′* (R_{1},σ); add (b) to get

(Ε(q),vσ) ⊢′* (R_{1},σ) ⊢′ (P,ε)

and (c) tells us that p ∈ P. We derive the desired result by collapsing the middle term and restating:

(Ε(q),vσ) ⊢′* (P,ε) where P ∋ p.
Proof of B:

A similar approach: isolate the σ transition, writing:

(Ε(q),vσ) ⊢′* (R_{1},σ) ⊢′ (P,ε)

Again, analyze the parts:

(a) (Ε(q),vσ) ⊢′* (R_{1},σ) implies (Ε(q),v) ⊢′* (R_{1},ε), which implies by induction that (q,v) ⊢* (r,ε) for all r ∈ R_{1}.

(b) (R_{1},σ) ⊢′ (P,ε) means that P = δ(R_{1},σ), which, by definition of the DFA transitions, means that for each p ∈ P there exist states r_{1} and r_{2} such that r_{1} ∈ R_{1}, (r_{1},σ,r_{2}) ∈ Δ, and p ∈ Ε(r_{2}).

(c) Restating (b): for each p ∈ P, there exist states r_{1} ∈ R_{1} and r_{2} such that (r_{1},σ) ⊢ (r_{2},ε) and (r_{2},ε) ⊢* (p,ε). Simplifying: for each p ∈ P, there exists a state r_{1} ∈ R_{1} such that (r_{1},σ) ⊢* (p,ε).

Combining (a) and (c): for each p ∈ P we have (q,v) ⊢* (r_{1},ε) and (r_{1},σ) ⊢* (p,ε), hence (q,vσ) ⊢* (r_{1},σ) ⊢* (p,ε), which is the desired result.
Textbook ambiguity?

The textbook, at the top of page 71, states basically the same claim as we have proved, but with the following wording:

(q,w) ⊢* (p,ε) if and only if (Ε(q),w) ⊢′* (P,ε) for some set P containing p.

The forward direction is fine, but read backwards this says:

(Ε(q),w) ⊢′* (P,ε) ⇒ (q,w) ⊢* (p,ε), for some set P containing p

and it is ambiguous how "for some set P containing p" is quantified. Our part (B) avoids the ambiguity by asserting (q,w) ⊢* (p,ε) for every p ∈ P.