a  b  
{q1,q3}  {q2}  {q4,q6} 
{q4,q6}  {q1,q3}  {q5} 
{q2}  {q5}  {q1,q3} 
{q5}  {q5}  {q5} 
Minimization Theory in a Nutshell
String equivalence with respect to a language
Definition 2.5.1 in the textbook. Given a language, L, write x ≈ y iffor all z ∈ Σ*, xz ∈ L ⇔ yz ∈ LLoosely we say that x and y "share a common fate" in terms of acceptance. Claim: The relation ≈ is an equivalence relation. Thus, we can talk about the equivalence classes with respect to ≈. Write [x] as the equivalence class containing string x:
[x] = { u : x ≈ u }
The minimal DFA
It turns out that these equivalence classes, if finitely many, form the minimal DFA for L:Min(L) = (K_{L},Σ,δ_{L},s_{L},F_{L})where:
K_{L} = { [x] : x ∈ Σ* }We will talk more about the other DFA structure features later. As an example, let
L = { w ∈ {a,b}*: #a(w) is even }Take the string "a". It is equivalent to "ab" w.r.t. L because
for all z ∈ Σ*,
az has an even # of a's if and only if abz has an even # of a's
There are only 2 equivalence classes, depending on the parity (even or odd)
of the number of a's:
az has an even # of a's if and only if abz has an even # of a's
[ε] = { ε, aa, b, bb, aba, baa, ... } [a] = { a, ab, ba, bab, abb, aaba, ... }These 2 classes form the states of the minimal DFA for L:
L = { w ∈ {a,b}*: #a(w) is even and #b(w) is odd }There are 4 equivalence classes based on the parity of the number of a's and the number of b's. The minimal DFA for L is:
The δ^{*} function
Given a DFA M=(K,Σ,δ,s,F), defineδ^{*}: K × Σ* ⇾ Kin terms of the yields relation:
δ^{*}(p,w) = q if and only if (p,w) ⊢* (q,ε)
Relationship of any DFA to the DFA Min(L)
Let L be a regular language. and let this be some arbitrary DFA for L:M=(K,Σ,δ,s,F)Consider the mapping: T : K ⇾ K_{L} defined by:
T( δ*(s,x) ) = [x]We need to argue that the T function is welldefined, meaning that:
δ*(s,x) = δ*(s,y) ⇒ x ≈ yThis is pretty obvious. In poetic English,
If x and y run to the same state,The fact that we can choose any string x implies that the T function is a surjection (an onto mapping) from K ⇾ K_{L}. Consequently,
then x and y must share the same fate.
 K ≥ K_{L}, and therefore
 K_{L} represents the minimal number of states of any DFA for L
MyhillNerode Theorem
Define Min(L) = (K_{L},Σ,δ_{L},s_{L},F_{L}) where:K_{L} = { [x] : x ∈ Σ^{*} } s_{L} = [ε] F_{L} = { [x] : x ∈ L } δ_{L}([x],σ) = [xσ]The theorem states that this is the unique minimal DFA for L. A key step in proving the theorem is showing that δ_{L} is welldefined, i.e.
x ≈ y ⇒ xσ ≈ yσObserve that Min(L) is defined in terms of the language L per se, affording it a status of being independent of any DFA which recognizes L.
Uniqueness of Min(L)
We know that Min(L) is minimal in terms of the number of states, but it is important to understand better what it means to be the unique DFA for L. Loosely, is that any other minimal DFA M = (K,Σ,δ,s,F) is essentially the same as Min(L). More precisely, consider the mappingT(q) = [x], where q = δ^{*}(s,x)T is onto (surjective) and if M is minimal then the number of states must be equal. Consequently T must be onetoone (injective), making T a bijection, or a onetoone correspondence between states of M and states of Min(L). The additional key point is that T is structurepreserving, i.e.,
T(δ(p,σ)) = δ_{L}(T(p),σ)In English, this means that the states of M and Min(L) correspond to each other in a way that the transitions also correspond. In algebraic terms, T is called an isomorphism.
Recognizing a minimal DFA
Regarding the Lewis textbook, we have outlined the material through page 97. The MyhillNerode Theorem is theoretically useful, but it is ultimately only an existence proof, not the construction of the minimal DFA. We need to better understand how to proceed with a given DFA for a language L. To start, we need to define yet another equivalence relation among states of a DFA. Write p ≡ q if,
for all z ∈ Σ^{*}, δ^{*}(p,z) ∈ F ⇔ δ^{*}(q,z) ∈ F.
If p ≡ q, we say p and q are
indistinguishable, otherwise, they are distinguishable.
This is basically expressing the same "share a common fate"
idea as is expressed by ≈, except
that it is expressed in terms of states.
We want to say that the two equivalence relations, ≈
and ≡ express exactly the
same idea. Technically, this is what needs to be proved:
(A) x ≈ y ⇔ δ^{*}(s,x) ≡ δ^{*}(s,y)Here's the proof:
x ≈ y ⇔ for all z, xz ∈ L ⇔ yz ∈ L ⇔ for all z, δ^{*}(s,xz) ∈ F ⇔ δ^{*}(s,xz) ∈ F ⇔ for all z, δ^{*}(δ^{*}(s,x),z) ∈ F ⇔ δ^{*}(δ^{*}(s,x),z) ∈ F ⇔ δ^{*}(s,x) ≡ δ^{*}(s,y)
Alternative representation of the minimal DFA
Given DFA M = (K,Σ,δ,s,F), the equivalence class of state p is [p] = { q : p ≡ q }. We can now give another more concrete representation of the minimal DFA:M′ = (K′,Σ,δ′,s′,F′)where
K′ = { [q] : q ∈ K } s′ = [s] F′ = { [f] : f ∈ F } δ′([q],σ) = [ δ(q,σ) ]In addition to other steps we have to argue that δ′ is welldefined, i.e.,
p ≡ q ⇒ δ(p,σ) ≡ δ(q,σ)The correspondence between the states of M′ and Min(L) is then written as follows:
T′: K′ ⇾ K_{L} T′( [ δ′^{*}([s],x) ] ) = [ x ]The statement (A) above guarantees that T′ is welldefined and onetoone, implying that M′ is indeed (another representation of) the minimal DFA.
Test for being a minimal DFA
One consequence of this minimal DFA construction is thatA DFA is minimal if and only if all states are distinguishable.This means that no two states are "≡" equivalent, or better this:
A DFA is minimal if, for every pair of states p and q, we can find a distinguishing string x ∈ Σ^{*} such that δ^{*}(p,x) ∈ F and δ^{*}(q,x) ∉ F, or viceversa.
RabinScott "worst case" example is minimal
Reconsider the example 2.2.2 in the textbook. We want to argue the worst case scenario of RabinScott construction exists, namely an NFA with O(n) states whose minimal DFA contains O(2^{n}) states. In particular our NFA contains exactly n+1 states and the DFA exactly 2^{n} states. The language in question is constructed from alphabet Σ with n symbols defined as follows:{ w : w omits at least one symbol σ ∈ Σ }
We have already visited the example of NFA
for the foursymbol alphabet {a,b,c,d} in the
Nondeterministic Finite Automata document.
The language is this:
(b∪c∪d)*∪(a∪c∪d)*∪(a∪b∪d)*∪(a∪b∪c)*and the NFA is this:
The minimal DFA consists of all 2^{n}
states matching every subset of Σ^{UP}, where n = Σ.
The state transitions of the DFA are easy to define based on the way
the NFA is set up with symbols corresponding to state names:
δ(Q,σ) = Q  { upper_case(σ) }, for all Q ⊆ Σ^{UP}, σ ∈ ΣThere are two parts to proving this claim:
 The DFA constructed by the RabinScott construction consists of all 2^{n} states as indicated. We have already argued this point.
 Any pair of states of the DFA are distinguishable.
Take any two distinct states P and Q,
represented by subsets of Σ^{UP}.
For simplicity, ignore the distinction between upper case states and lower case symbols.
Both PQ and QP cannot be empty,
and so assume, without loss of generality,
that PQ is nonempty. We're saying that
there is at least one symbol in P which is not in Q.
Create the string
w = σ_{1}···σ_{k}, where Q = { σ_{1}, ..., σ_{k} } (w = ε if Q = ∅)
Then, because w contains every symbol of Q, running the DFA from Q will deplete the set, i.e.,δ^{*}(Q,w) = ∅ (a nonfinal state)
However, since P contains symbols which are not in Q, running the DFA from Q will leave the remaining symbols, i.e.,δ^{*}(P,w) = PQ ≠ ∅ (a final state)
This proves that that P and Q are distinguishable.
Algorithmic construction of the Minimal DFA
The MyhillNerode theorem gives us a theoretical representation of the minimal DFA in terms of string equivalence classes. The previous section gives as a less theoretical representation in terms of stateequivalence classes; however, we do not yet know an algorithmic way of constructing these stateequivalence classes. Saying that two states are indistinguishable means that they both run to final states or both to nonfinal states for all strings. Obviously we do not test all strings. The idea is to compute the indistinguishability equivalence classes incrementally. We say:p and q are kdistinguishable if they are distinguishable by a string of length ≤ kIt's easy to understand the inductive property of this relation:
p and q are kdistinguishable if they are (k1)distinguishable per se, orThe construction of the equivalence classes starts like this:
δ(p,σ) and δ(q,σ) are (k1)distinguishable for some symbol σ ∈ Σ
 p and q are 0indistinguishable if they are both final or both nonfinal. So we start the algorithm with the states divided in two partitions: final and nonfinal.
 within each of these two partitions: p and q are 1distinguishable if there is a symbol σ so that δ(p,σ) and δ(q,σ) are 0distinguishable, i.e. one is final and the other not. By doing so we further partition each group into sets of 1indistinguishable states.
p and q
within a partition set are kdistinguishable if there is a symbol
σ so that δ(p,σ)
and δ(q,σ) are (k1)distinguishable.
At some point we cannot subdivide the partitions further.
At that point terminate the
algorithm, because no further step can produce any new subdivision.
When we terminate, we have the indistinguishability equivalence classes which
form the states of the minimal DFA.
The transition from one equivalence class to another is obtained by
choosing an arbitrary state in the source class, applying the transition and
then taking the entire target class as the target state.
Example from figure 220
 Start by distinguishing final and nonfinal: {q1,q3}, {q2,q4,q5,q6}
 Distinguish states within the groups, to get the next partition:
{q1,q3}, {q4,q6}, {q5}, {q2}
How? b distinguishes q2, q4:δ(q2,b) ∈ {q1,q3}, δ(q4,b) ∈ {q2,q4,q5,q6}
b distinguishes q2, q5:δ(q2,b) ∈ {q1,q3}, δ(q5,b) ∈ {q2,q4,q5,q6}
a distinguishes q4, q5:δ(q4,a) ∈ {q1,q3}, δ(q5,a) ∈ {q2,q4,q5,q6}
neither a nor b distinguishes (q4,q6)  We cannot split the two nonsingleton groups further, the algorithm terminates.
Examples from NFAs Document
Recall the examples in The first was a NFA constructed to accept (a∪b)*aba(a∪b)*, here with states rename for convenience: The initial grouping is by final/nonfinal: {A,B,C} {D,E,F}

a distinguishes C from {A,B}, getting {A,B}, {C}, {D,E,F}
δ(C,a) ∈ {D,E,F}, δ(A,a), δ(B,a) ∈ {A,B,C}

b distinguishes A and B, getting {A}, {B}, {C}, {D,E,F}
δ(A,b) ∈ {A,B}, δ(B,b) ∈ {C}
 Nothing distinguishes states in {D,E,F}
The second example is the DFA constructed from an NFA which accepts the language
(a*∪(ab)*)b*where the states have been relabeled after the NFAtoDFA construction.
Final Nonfinal
Example 2
 Start by separating final and nonfinal states: {1,2}, {0,3,4,5}

All of 0,3,4 go to final states by a b transition,
5 does not (5 is a dead state). So we have 3 groups:
{1,2}, {0,3,4}, {5}

An atransition takes 3 to 5 and
0,4 to 3. So we split
{0,3,4} into two groups, getting:
{1,2}, {0,4}, {3}, {5}
© Robert M. Kline