Deterministic Finite Automata (DFA)
We now begin the machine view of processing a string over alphabet Σ. The machine has a semi-inifinite tape of squares holding one alphabet symbol per square. The machine has a finite state set, K, with a known start state. Initially we want the processing to be deterministic, that is, there is only one possible outcome from processing a string. Here is how we process it:- place string on a tape with one symbol in each square
- place machine in start state and the read head on the first square
- a computation step is done by considering the (current state, current tape symbol) and based the value of this pair, move to a new state and move the tape head one square to the right
- stop the machine when there are no more symbols to process
| ⊆ (K×Σ*) × (K×Σ*) the yields in one step relation
It is defined as follows: for all σ ∈ Σ,
(p,σw) | (q,w) for all w ∈ Σ* if and only if δ(p,σ) = qThis makes rigorous several notions about processing in a DFA:
- symbols are processed from left to right, processing each one only once
- the state change only consults the current symbol (what is ahead in the string is irrelevant)
(s,w)A terminal state, q, is one in which
(s, w) |* (q, ε)Because the state transition is a function, it implies that there is only one state p such that (q,w) |* (p,ε). It is sometimes convenient to express the state transition as a function:
δ* : K × Σ* → Kwhere
δ*(q, w) = p if and only if (q, w) |* (p, ε)
Because of the nature of the DFA, this function is well-defined.
Acceptance
We say the string w is accepted if, proceeding from the start state, the terminal state we reach is final i.e., ∈ F. Concisely, w is accepted if δ*(s,w) ∈ F. Because of the determinism, it is sometimes said that w is decided by a DFA. The language accepted by a DFA is the set of all strings accepted by a DFA.DFA and Regular Language Equivalence
One of the main goals of Chapter 2 is to show the following: Theorem: A language is accepted by a DFA if and only if it is a regular language (i.e., has a regular expression).Graphical representation of DFA
Finite automata lend themselves readily to graphical interpretation.-
Each state is a node in the graph:
-
The start state is designated by:
and final states are designated by:
-
A transition δ(p,σ) = q
is represented by the labelled edge (p,σ,q):
Claim: (q0,σ1,q1) (q1,σ2,q2) ... (qn-1,σn,qn) are labelled edges if and only if (q0, σ1σ2...σn) |* (qn, ε) The empty path (no labelled edges) corresponds to the empty string (no symbols).
Example 2.1.1 in textbook:
This is the even-parity checker for the number of b's in a string, i.e., the machine accepts the language
L = { w : #b's in w is even }
Accepted strings include: ε, a*, a*ba*b.
We'll soon see the construction of a RE from a DFA, but this one is easy.
You can see two looping paths from the start/final state back to itself by
either a or ba*b.
Choose from (a∪ba*b) represents one loop going by two general
"paths". We can repeat this 0 or more times, getting (a∪ba*b)*
as our RE for the DFA.
Switching final and non-final states gives us a DFA which represents the complement of the previous language:
L = { w : #b's in w is odd }.
Note that the RE doesn't easily "complement" in any way.
Example 2.1.2 in textbook:
L = { w : w does not contain the substring bbb }
The state q3 is called a dead state,
because no paths beyond this point can reach a final state.
Again, switch final and non-final states constructs the DFA
for the complement language:
L = { w : w contains the substring bbb }It's also easy to see that a regular expression for L is (a∪b)*bbb(a∪b)*, since we only need to find the substring bbb somewhere in the string, not necessarily finding the first occurrence, which is what the DFA does.
DFAs for substring acceptance
The previous example can be generalized. Consider the language
{ w ∈ Σ* : w contains the substring u }
This is very easily expressed by the regular expression
(Σ*)u(Σ*).
To generate a DFA, write
u = σ1...σnDraw the partial DFA representing the "success" transitions:
(q0,σ1,q1) (q1,σ2,q2) ... (qn-1,σn,qn)Start state = q0, final state = qn. What is missing are the failure transitions:
(qi-1, σ, ??), for σ ≠ σi
We don't always go back to the start state when we fail.
In general, look at the failure string:
y = σ1...σi-1σ where σ ≠ σi
What we want to find is the string, x where
x = largest suffix of the failure string, y,
which is also a prefix of the success string, u
Take this string x and run the DFA from the
start state to state p, and add the failure transition
(qi-1, σ, p)Here is another example with Σ = {a,b}: L = { w : w contains the substring abaa }. The setup is this:
ab·x = failure string x·aa = success stringand that x is the largest possible substring satisfying this requirement. Therefore, to get the target state of the failure transition, run x from the start, thereby adding:
( q3, b, q2 )Completing this procedure, we get this DFA:
Complement
We will state precisely one concept that we have been suggesting in the above examples. Theorem: If a language L is accepted by a DFA, then there is a derived DFA which accepts L = Σ* - L. Given the DFA (K,Σ,δ,s,F), the derived DFA is (K,Σ,δ,s,K-F). Namely, the final state set of the derived DFA is the complement of the final state set of the original DFA.Product Construction: intersection and union
Theorem: If languages L1 and L2 over Σ are accepted by a DFAs, then there are derived DFAs which accept- the intersection: L1 ∩ L2
- the union: L1 ∪ L2
(K1×K2,Σ,δ,(s1,s2),F)where
δ(p,q) = ( δ1(p), δ2(q) )and F is either:
- for the intersection: F = F1×F2
- for the union: F = F1×K2 ∪ K1×F2
- (for the intersection) both q1 and q2 are final in their respective DFAs.
- (for the union) at least one of q1 and q2 are final in their respective DFAs.
Intersection Example
Find a DFA for the language over {a,b}:
{ w : w has an even number of b's and does not contain the substring bb }
Here are the two languages and their DFAs:
{ w : w has an even number of b's }
|
{ w : w does not contain the substring bb }
|
|
≡ |
|