Pushdown Automata: PDA/DPDA

Description

A pushdown automaton (PDA) is a finite state machine with additional stack storage. The transitions a machine makes are based not only on the input and current state, but also on the stack. The formal definition (in our textbook) of a PDA is this:
M = (K,Σ,Γ,Δ,s,F)
where
  K is a finite set of states
  Σ is the input alphabet
  Γ is the stack alphabet
  s ∈ K is the start state
  F ⊆ K is the set of final states
  Δ is the transition relation, a finite subset of (K × (Σ ∪ {ε}) × Γ*) × (K × Γ*)
We have to have the finite qualifier because the full product is infinite by virtue of the Γ* components. The meaning of the transition relation is that, for σ ∈ Σ, if ((p,σ,α),(q,β)) ∈ Δ, then: if the machine is in state p, the next input symbol is σ, and α is on top of the stack, the machine may consume σ, replace α by β on the stack, and move to state q. Otherwise, if ((p,ε,α),(q,β)) ∈ Δ, this means that if the machine is in state p and α is on top of the stack, then (not consulting the input symbol), we can replace α by β and move to state q.

Machine Configuration, yields, acceptance

A machine configuration is an element of K×Σ*×Γ*.
(p,w,γ) = (current state, unprocessed input, stack content)
We define the usual yields relationships:
  (p,σw,αγ) ⊢ (q,w,βγ) if ((p,σ,α),(q,β)) ∈ Δ
or
  (p,w,αγ) ⊢ (q,w,βγ) if ((p,ε,α),(q,β)) ∈ Δ 
As expected, ⊢* is the reflexive, transitive closure of ⊢.

A string w is accepted by the PDA if
(s,w,ε) ⊢* (f,ε,ε)    for some f ∈ F
Namely, from the start state with empty stack, we consume the entire input and reach a final state with an empty stack. The language accepted by a PDA M, L(M), is the set of all accepted strings.

The empty stack is our key new requirement relative to finite state machines. The examples that we generate have very few states; in general, far more of the control comes from the stack memory than from the states. Acceptance by empty stack only or final state only is addressed in problems 3.3.3 and 3.3.4.
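Since the machines in these notes are small, it is easy to experiment with them directly. Below is a minimal Python sketch of a PDA simulator based on the definitions above; the function name accepts and the tuple encoding of Δ are our own conventions, ε is encoded as the empty string, and the stack is a string whose top is the leftmost character. Because a PDA computation need not terminate in general, the search is breadth-first with a step bound.

    from collections import deque

    def accepts(delta, start, finals, w, max_steps=100000):
        # delta: set of transitions ((p, x, alpha), (q, beta)), where x == ""
        # is an epsilon-move and alpha, beta are stack strings (top at left).
        # Accept when the input is consumed and the stack is empty in a
        # final state: (start, w, "") yields (f, "", "") with f in finals.
        frontier = deque([(start, w, "")])
        seen = set()
        steps = 0
        while frontier and steps < max_steps:
            steps += 1
            p, rest, stack = frontier.popleft()
            if rest == "" and stack == "" and p in finals:
                return True
            if (p, rest, stack) in seen:
                continue
            seen.add((p, rest, stack))
            for (p1, x, alpha), (q, beta) in delta:
                if p1 == p and rest.startswith(x) and stack.startswith(alpha):
                    # consume x, pop alpha, push beta
                    frontier.append((q, rest[len(x):], beta + stack[len(alpha):]))
        return False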

Graphical Representation and ε-transition

The book does not indicate so, but there is a graphical representation of PDAs. A transition
((p,x,α),(q,β))     where x = ε or x ∈ Σ
would be depicted (respectively) as an edge from p to q labeled
σ  α/β      or      ε  α/β
The stack usage represented by
α/β
represents these actions: pop α off the top of the stack, then push β on. A PDA is non-deterministic. There are several forms of non-determinism in the description: a state may have several transitions applicable to the same input symbol and stack top, and a transition consuming a symbol may compete with an ε-transition. The true PDA ε-transition, in the sense of being equivalent to the NFA ε-transition, is
((p,ε,ε),(q,ε))
because it consults neither the input nor the stack and leaves both intact.

Palindrome examples

These are examples 3.3.1 and 3.3.2 in the textbook. The first is this:
{x ∈ {a,b,c}* : x = wcw^R for w ∈ {a,b}*}
The machine pushes a's and b's in state s, makes a transition to f when it sees the middle marker, c, and then matches input symbols with those on the stack and pops the stack symbol. Non-accepting string examples are these:
ε           in state s
ab          in state s with non-empty stack
abcab       in state f with unconsumed input and non-empty stack
abcb        in state f with non-empty stack
abcbab      in state f with unconsumed input and empty stack
Observe that this PDA is deterministic in the sense that there are no choices in transitions.
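Encoded for the simulator sketched earlier (with the same state names s and f), the machine and the sample strings behave as expected:

    # wcw^R: push a's and b's in s, switch to f on c, match-and-pop in f
    delta = {
        (("s", "a", ""), ("s", "a")),
        (("s", "b", ""), ("s", "b")),
        (("s", "c", ""), ("f", "")),
        (("f", "a", "a"), ("f", "")),
        (("f", "b", "b"), ("f", "")),
    }
    for x in ["abcba", "c", "ab", "abcb", "abcbab"]:
        print(x, accepts(delta, "s", {"f"}, x))
    # abcba True, c True, ab False, abcb False, abcbab False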

The second example is:
{x ∈ {a,b}* : x = ww^R for w ∈ {a,b}*}
This PDA is identical to the previous one except that the transition on the middle marker c is replaced by the ε-transition ((s,ε,ε),(f,ε)). Nevertheless, there is a significant difference in that this PDA must guess when to stop pushing symbols, jump to the final state and start matching off of the stack. Therefore this machine is decidedly non-deterministic. In a general programming model (like Turing Machines), we have the luxury of preprocessing the string to determine its length and thereby knowing when the middle is coming.
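In the simulator encoding, the only change from the previous machine is that the transition on c becomes an ε-move; the breadth-first search simply explores every possible guess for the middle:

    # ww^R: the middle of the string is guessed with an epsilon-move
    delta = {
        (("s", "a", ""), ("s", "a")),
        (("s", "b", ""), ("s", "b")),
        (("s", "", ""), ("f", "")),      # guess: stop pushing, start matching
        (("f", "a", "a"), ("f", "")),
        (("f", "b", "b"), ("f", "")),
    }
    print(accepts(delta, "s", {"f"}, "abba"))   # True
    print(accepts(delta, "s", {"f"}, "aba"))    # False: no middle to guess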

The a^n b^n language

The language is L = { w ∈ {a,b}* : w = a^n b^n, n ≥ 0 }. Here are two PDAs for L. The first has the transitions
((s,a,ε),(s,a)),  ((s,ε,ε),(f,ε)),  ((f,b,a),(f,ε))
and the second has
((s,a,ε),(s,a)),  ((s,b,a),(f,ε)),  ((f,b,a),(f,ε))    with both s and f final
The idea in both of these machines is to stack the a's and match off the b's. The first one is non-deterministic in the sense that it could prematurely guess that the a's are done and start matching off b's. The second version is deterministic in that the first b acts as a trigger to start matching off. Note that we have to make both states final in the second version in order to accept ε.
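Encoded for the earlier simulator, with the state names s and f again:

    # version 1: non-deterministically guess that the a's are done
    delta1 = {
        (("s", "a", ""), ("s", "a")),
        (("s", "", ""), ("f", "")),
        (("f", "b", "a"), ("f", "")),
    }
    # version 2: the first b is the trigger; both states are final
    delta2 = {
        (("s", "a", ""), ("s", "a")),
        (("s", "b", "a"), ("f", "")),
        (("f", "b", "a"), ("f", "")),
    }
    print(accepts(delta1, "s", {"f"}, "aabb"))       # True
    print(accepts(delta2, "s", {"s", "f"}, "aabb"))  # True
    print(accepts(delta2, "s", {"s", "f"}, ""))      # True: s must be final
    print(accepts(delta2, "s", {"s", "f"}, "aab"))   # False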

The equal a's and b's language

This is example 3.3.3 in the textbook. Let us use the convenience notation:
#σ(w) = the number of occurrences of σ in w
The language is L = {w ∈ {a,b}*: #a(w) = #b(w) }. Here is the PDA:
((s,ε,ε),(q,c))
((q,a,c),(q,ac)),  ((q,a,a),(q,aa)),  ((q,a,b),(q,ε))
((q,b,c),(q,bc)),  ((q,b,b),(q,bb)),  ((q,b,a),(q,ε))
((q,ε,c),(f,ε))
As you can see, most of the activity surrounds the behavior in state q. The idea is to have the stack maintain the excess of one symbol over the other. In order to achieve our goal, we must know when the stack is empty.

Empty Stack Knowledge

There is no mechanism built into a PDA to determine whether the stack is empty or not. It's important to realize that a transition of the form
((p,x,ε),(q,β))     where x = σ ∈ Σ   or   x = ε
means to act without consulting the stack; it says nothing about whether the stack is empty or not.

Nevertheless, one can maintain knowledge of an empty stack by using a dedicated stack symbol, c, representing the "stack bottom", with these properties: it is pushed onto the empty stack at the very start, before any input is read; it is never pushed again; c on top of the stack means the stack is effectively empty; and it is popped only on the transition into the final state.

Behavior of PDA

The three groups of loop transitions in state q represent these respective functions: start a new excess when the stack is effectively empty (c on top), extend an existing excess of the same symbol, and cancel one input symbol against an excess of the opposite symbol. For example, if we have seen 5 b's and 3 a's in any order, then the stack should be "bbc". The transition to the final state represents the only non-determinism in the PDA in that it must guess when the input is empty in order to pop off the stack bottom.
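Using the earlier simulator, the machine can be checked directly; the comments mark the three groups of loops in state q:

    delta = {
        (("s", "", ""), ("q", "c")),      # push the stack bottom
        (("q", "a", "c"), ("q", "ac")),   # stack effectively empty: new a-excess
        (("q", "b", "c"), ("q", "bc")),   # stack effectively empty: new b-excess
        (("q", "a", "a"), ("q", "aa")),   # extend an a-excess
        (("q", "b", "b"), ("q", "bb")),   # extend a b-excess
        (("q", "a", "b"), ("q", "")),     # cancel an a against a b-excess
        (("q", "b", "a"), ("q", "")),     # cancel a b against an a-excess
        (("q", "", "c"), ("f", "")),      # guess end of input, pop the bottom
    }
    for x in ["", "abbaba", "aab"]:
        print(repr(x), accepts(delta, "s", {"f"}, x))
    # '' True, 'abbaba' True, 'aab' False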

DPDA/DCFL

The textbook defines DPDAs (Deterministic PDAs) and DCFLs (Deterministic CFLs) in the introductory part of section 3.7. According to the textbook's definition, a DPDA is a PDA in which no state p has two different outgoing transitions
((p,x,α),(q,β)) and ((p,x′,α′),(q′,β′))
which are compatible in the sense that both could be applied. A DCFL is basically a language which is accepted by a DPDA, but we need to qualify this further.

We want to argue that the language L = { w ∈ {a,b}* : #a(w) = #b(w)} is deterministic context free in the sense that there is a DPDA which accepts it.

In the above PDA, the only non-determinism is the issue of guessing the end of input; however this form of non-determinism is considered artificial. When one considers whether a language L supports a DPDA or not, a dedicated end-of-input symbol is always added to strings in the language.

Formally, a language L over Σ is deterministic context free, or L is a DCFL, if
L$  is accepted by a DPDA  M
where $ is a dedicated symbol not belonging to Σ. The significance is that we can make intelligent usage of the knowledge of the end of input to decide what to do about the stack. In our case, we would simply replace the transition into the final state by:
((q,$,c),(f,ε))
With this change, our PDA is now a DPDA.
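In the simulator encoding, the change amounts to replacing the guessing ε-move into f by a move that reads $ with only the stack bottom left:

    delta = {
        (("s", "", ""), ("q", "c")),
        (("q", "a", "c"), ("q", "ac")),
        (("q", "b", "c"), ("q", "bc")),
        (("q", "a", "a"), ("q", "aa")),
        (("q", "b", "b"), ("q", "bb")),
        (("q", "a", "b"), ("q", "")),
        (("q", "b", "a"), ("q", "")),
        (("q", "$", "c"), ("f", "")),    # deterministic: read $ at the bottom
    }
    print(accepts(delta, "s", {"f"}, "abba" + "$"))  # True
    print(accepts(delta, "s", {"f"}, "aba" + "$"))   # False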

a*b* examples

Two common variations on a's followed by b's. When they're equal, no stack bottom is necessary. When they're unequal, you have to be prepared to recognize whether the stacked a's have been completely matched or not.
a. { a^n b^n : n ≥ 0 }
b. { a^m b^n : 0 ≤ m < n }
Let's look at a few sample runs of (b). The idea is that you cannot enter the final state with an "a" still on the stack. Once you get to the final state, you can consume the remaining b's and the end marker.

We can start from state 1 with the stack bottom pushed on:
success: abb
state   input   stack
1 abb$ c
1 bb$ ac
2 b$ ac
2 $ c
3 ε ε
success: abbbb
state   input   stack
1 abbbb$ c
1 bbbb$ ac
2 bbb$ ac
2 bb$ c
3 b$ ε
3 $ ε
3 ε ε
fail: ab
state   input   stack
1 ab$ c
1 b$ ac
2 $ ac
fail: ba
state   input   stack
1 ba$ c
2 a$ c

Observe that a string like abbba also fails due to the inability to consume the very last a.
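The runs above pin down the transitions of machine (b). Here it is in the earlier simulator's encoding; state 0 is our own addition, used only to push the stack bottom c that the runs start with:

    delta = {
        (("0", "", ""), ("1", "c")),     # push the stack bottom
        (("1", "a", ""), ("1", "a")),    # stack the a's
        (("1", "b", ""), ("2", "")),     # the first b is the guaranteed extra
        (("2", "b", "a"), ("2", "")),    # match b's against stacked a's
        (("2", "b", "c"), ("3", "")),    # a's all matched, more b's remain
        (("2", "$", "c"), ("3", "")),    # a's all matched right at the end
        (("3", "b", ""), ("3", "")),     # consume remaining b's
        (("3", "$", ""), ("3", "")),     # consume the end marker
    }
    for x in ["abb", "abbbb", "ab", "ba", "abbba"]:
        print(x, accepts(delta, "0", {"3"}, x + "$"))
    # abb True, abbbb True, ab False, ba False, abbba False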

CFL/PDA Equivalence

The goal of section 3.4 in the textbook is to prove that CFLs and PDAs are equivalent, namely, that for every CFG G there is a PDA M such that L(M) = L(G), and vice versa. The author splits this into two lemmas, one for each direction. Of the two, the first uses a very simple, intuitive construction which mimics a leftmost derivation by using a PDA. The converse step is far more complicated: its construction first converts the PDA into an equivalent simplified form and then extracts a grammar from the simplified machine.

Grammar to PDA construction

This construction is quite simple. Given G = (V,Σ,R,S), what the PDA will do is effect a leftmost derivation of a string w ∈ L(G). The PDA is
M = ( {0,1}, Σ, V, Δ, 0, {1} )
Namely, there are only two states, 0 and 1, of which 1 is final. The PDA moves immediately to state 1, pushing the start symbol S onto the stack, and then stays in this state.

These are the transitions in Δ:
1: ( (0,ε,ε), (1,S) )
2: ( (1,ε,A), (1,α) )  for A → α
3: ( (1,σ,σ), (1,ε) )  for σ ∈ Σ
This constructed PDA is inherently non-deterministic; if there is a choice of rules to apply to a non-terminal, then there is a non-deterministic choice of processing steps. Graphically, the presentation is a single transition from state 0 to state 1 pushing S, with all the type-2 and type-3 transitions appearing as self-loops on state 1.
To say that G and M are equivalent means that L(M) = L(G), or, considering an arbitrary string w ∈ Σ*:
S ⇒* w  ⇔  (0,w,ε) ⊢* (1,ε,ε)
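The construction is mechanical enough to write down as code. Here is a sketch that builds Δ from a grammar and reuses the accepts simulator from earlier; the encoding of a grammar as a dict from a single-character non-terminal to a list of rule bodies is our own:

    def grammar_pda(rules, start_symbol, sigma):
        # type 1: move to the final state 1, pushing the start symbol
        delta = {(("0", "", ""), ("1", start_symbol))}
        for a, bodies in rules.items():
            for alpha in bodies:
                # type 2: replace non-terminal A on top of the stack by a body
                delta.add((("1", "", a), ("1", alpha)))
        for c in sigma:
            # type 3: match an input terminal against the top of the stack
            delta.add((("1", c, c), ("1", "")))
        return delta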

The grammar for a^n b^n

Use the grammar: S ⟶ ε | aSb

Here is the PDA:
1: ( (0,ε,ε), (1,S) )
2: ( (1,ε,S), (1,ε) ),  ( (1,ε,S), (1,aSb) )
3: ( (1,a,a), (1,ε) ),  ( (1,b,b), (1,ε) )
A simple run is:
 (0, aabb, ε)
⊢ (1, aabb, S)
⊢ (1, aabb, aSb)
⊢ (1,  abb, Sb)
⊢ (1,  abb, aSbb)
⊢ (1,   bb, Sbb)
⊢ (1,   bb, bb)
⊢ (1,    b, b)
⊢ (1,    ε, ε)
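Feeding this grammar through the grammar_pda sketch reproduces the run above:

    delta = grammar_pda({"S": ["", "aSb"]}, "S", "ab")   # S -> e | aSb
    print(accepts(delta, "0", {"1"}, "aabb"))   # True
    print(accepts(delta, "0", {"1"}, "aab"))    # False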

Expression Grammar

The most informative examples are those in which there exists the possibility of not using a leftmost derivation, such as our expression grammar:
E → E + T | T
T → T * F | F
F → ( E ) | a
We can readily match up a leftmost derivation of a + a * a with the corresponding machine configuration processing as follows:
 (0, a + a * a, ε)
⊢ (1, a + a * a, E)
⊢ (1, a + a * a, E + T)             E ⇒ E + T
⊢ (1, a + a * a, T + T)               ⇒ T + T 
⊢ (1, a + a * a, F + T)               ⇒ F + T 
⊢ (1, a + a * a, a + T)               ⇒ a + T 
⊢ (1,   + a * a, + T)
⊢ (1,     a * a, T)
⊢ (1,     a * a, T * F)               ⇒ a + T * F
⊢ (1,     a * a, F * F)               ⇒ a + F * F
⊢ (1,     a * a, a * F)               ⇒ a + a * F
⊢ (1,       * a, * F)
⊢ (1,         a, F)
⊢ (1,         a, a)                   ⇒ a + a * a
⊢ (1,         ε, ε)
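The same sketch handles the expression grammar, with one caveat: the left-recursive rule E → E + T lets a blind search grow the stack without bound on a rejected string, so it is the step bound in accepts that forces termination there:

    rules = {"E": ["E+T", "T"], "T": ["T*F", "F"], "F": ["(E)", "a"]}
    delta = grammar_pda(rules, "E", "a+*()")
    print(accepts(delta, "0", {"1"}, "a+a*a"))  # True
    print(accepts(delta, "0", {"1"}, "a+"))     # False, via the step bound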

Grammar to PDA proof

For simplicity, assume that ⇒ means a leftmost derivation step.

Claim: For w ∈ Σ*,   α ∈ (V-Σ)V* ∪ {ε}:
S ⇒* wα  ⇔  (1,w,S) ⊢* (1,ε,α)

(Proof ⇒):

Induction on the length of the leftmost derivation. The base case of zero steps is w = ε and α = S, for which (1,ε,S) ⊢* (1,ε,S) holds trivially. For the induction step, suppose
S ⇒^n α′ ⇒ wα
Then, since the last step was leftmost, we can write:
α′ = xAβ  for  x ∈ Σ*
and then
xγβ = wα  for  A → γ    (A)
By induction, since S ⇒^n xAβ:
(1,x,S) ⊢* (1,ε,Aβ)
Furthermore, applying the transition of type 2, we have:
(1,ε,Aβ) ⊢ (1,ε,γβ)
Putting these two together:
(1,x,S) ⊢* (1,ε,γβ)    (B)
Looking at (A) we see that the string x must be a prefix of w because α begins with a non-terminal, or is empty. Write
w = xy  and therefore,  γβ = yα   (C)
As a consequence of (B) we get:
(1,xy,S) ⊢* (1,y,γβ)   (D)
Combine (C) and (D) to get:
(1,w,S) ⊢* (1,y,yα)    (E)
Apply |y| transitions of type 3 to get
(1,y,yα) ⊢* (1,ε,α)    (F)
Combine (E) and (F) to get the desired result:
(1,w,S) ⊢* (1,ε,α)

(Proof ⇐):

The proof going this direction is by induction on the number of type-2 steps in the computation. This restriction makes the entire proof simpler than the direction we just proved. For the base case of zero type-2 steps, the non-terminal S remains on top of the stack throughout, so no type-3 transition can apply either; hence w = ε and α = S.

We'll proceed to the induction step. Assume true for up to n type-2 steps and prove true for n+1 type-2 steps. Write:
(1,w,S) ⊢* (1,y,Aβ) ⊢ (1,y,γβ)  ⊢* (1,ε,α)
where the use of the rule
A → γ
represents the final type-2 step and the last part of the chain is type-3 steps only. The string y must be a suffix of w, and so writing
w = xy
we have:
(1,xy,S) ⊢* (1,y,Aβ)
and therefore,
(1,x,S) ⊢* (1,ε,Aβ)
Therefore by induction:
S ⇒* xAβ
and consequently,
S ⇒* xγβ     (A)
Now if we look at the last part:
(1,y,γβ) ⊢* (1,ε,α)
we observe that, since this consists of only type-3 transitions, it must be that
yα = γβ      (B)
and so, putting (A) and (B) together, we get:
S ⇒* xyα
Knowing that w = xy gives us the result we want:
S ⇒* wα


© Robert M. Kline