General Grammars

This document completes the discussion of the Chomsky grammar hierarchy by introducing the last two types: type-0 and type-1. It also shows the equivalence of type-0 languages with the languages accepted by TMs.

Type-0 grammar

A type-0 grammar G = (V,Σ,R,S) is defined like a context-free grammar except for the way productions are defined:
R ⊆ V*(V-Σ)V* × V*
A rule is depicted like this:
αAβ → γ
Conceptually, the focus is on the replacement of the non-terminal A, but only in the context of being surrounded by the strings α and β. Thus, the replacement is not context-free. Of course, α and β can contain other non-terminals, so there is no precise notion of which symbol is "the" non-terminal being replaced and which part is its context.

The derives relation on V* is basically the same as it is for context-free grammars:
α′αAββ′ ⇒ α′γβ′  if αAβ → γ is a rule
The forced presence of at least one non-terminal in the left-hand side of a rule means that we must stop when we derive a string of all terminal symbols. All other concepts regarding unrestricted grammars are the same as for context-free grammars.
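Since a derivation step is just substring replacement, the derives relation can be sketched in a few lines of code. This is an illustrative helper of my own (not from the text), representing a rule as a pair of strings:

```python
def successors(form, rules):
    """All sentential forms reachable from `form` in one derivation step."""
    result = set()
    for lhs, rhs in rules:
        start = 0
        while True:
            i = form.find(lhs, start)
            if i < 0:
                break
            # replace one occurrence of the left-hand side by the right-hand side
            result.add(form[:i] + rhs + form[i + len(lhs):])
            start = i + 1
    return result

# one application of the rule CB -> BC inside aBCBC
print(successors("aBCBC", [("CB", "BC")]))  # {'aBBCC'}
```

Non-terminals here are uppercase letters and terminals lowercase, matching the conventions of the examples below.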

In Chomsky hierarchy terms, an unrestricted grammar is called a type-0 grammar.

Example 4.6.2

The obligatory one. Find a grammar G for
L = { w ∈ {a,b,c}* : w = a^n b^n c^n, n ≥ 0 }
The textbook has one solution. Here is another:
S  → aSBC | ε
CB → BC
aB → ab
bB → bb
bC → bc
cC → cc
Here is a sample derivation:
S ⇒^4 aaaBCBCBC
  ⇒  aaaBBCCBC
  ⇒  aaaBBCBCC
  ⇒  aaaBBBCCC
  ⇒  aaabBBCCC
  ⇒  aaabbBCCC
  ⇒  aaabbbCCC
  ⇒  aaabbbcCC
  ⇒  aaabbbccC
  ⇒  aaabbbccc
From the start rule we generate a sentential form a^n S (BC)^n, and then a^n (BC)^n. The rule CB → BC gives you the ability to change this to a^n B^n C^n. The remaining rules allow you to convert this form to a^n b^n c^n.
It is fairly easy to argue that L ⊆ L(G), i.e., every string in L can be generated by the grammar. It is tricky to show that L(G) ⊆ L in the sense that the grammar only generates strings in L.
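One way to gain some confidence in the grammar (including the CB → BC interchange rule used in the derivation above) is a brute-force search over sentential forms, capped in length: rules other than S → ε never shrink a form, so forms of length |w|+1 suffice to derive a string w. This is only an illustrative sketch, not a proof:

```python
from collections import deque

ABC_RULES = [("S", "aSBC"), ("S", ""), ("CB", "BC"),
             ("aB", "ab"), ("bB", "bb"), ("bC", "bc"), ("cC", "cc")]

def generated(max_len):
    """Terminal strings of length <= max_len derivable from S."""
    seen, out, queue = {"S"}, set(), deque(["S"])
    while queue:
        form = queue.popleft()
        for lhs, rhs in ABC_RULES:
            i = form.find(lhs)
            while i >= 0:
                nxt = form[:i] + rhs + form[i + len(lhs):]
                # only S -> epsilon shrinks a form, so one cell of slack suffices
                if len(nxt) <= max_len + 1 and nxt not in seen:
                    seen.add(nxt)
                    if nxt.islower() or nxt == "":   # all-terminal string
                        out.add(nxt)
                    else:
                        queue.append(nxt)
                i = form.find(lhs, i + 1)
    return out

print(generated(6) == {"", "abc", "aabbcc"})  # True
```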

Type-1 Grammars

This topic is relegated to exercise 5.7.4 in the textbook. A grammar is context-sensitive if all of its rules are of one of two forms:
S → ε   and S never appears on the right-hand side of a rule.
α → β   where  |α| ≤ |β|
In particular, a derivation is never "collapsing," i.e., sentential forms in the derivation sequence never decrease in length.

In Chomsky hierarchy terms, a context-sensitive grammar is called a type-1 grammar. We know that a context-free grammar is effectively also context-sensitive, by virtue of the Chomsky Normal Form, or simply by removing the ε-productions.

A language is a Context Sensitive Language (CSL) if it is generated by a context-sensitive grammar.
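The non-contracting condition on the rules is easy to state as code; a minimal sketch (ignoring the special S → ε rule for brevity):

```python
def is_noncontracting(rules):
    """True if every rule alpha -> beta satisfies |alpha| <= |beta|."""
    return all(len(lhs) <= len(rhs) for lhs, rhs in rules)

print(is_noncontracting([("S", "aSBC"), ("CB", "BC")]))  # True
print(is_noncontracting([("DR", "R")]))                  # False (collapsing)
```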

The a^n b^n c^n language is context-sensitive. The grammar above is technically not context-sensitive (S → ε is present while S appears on the right-hand side of a rule), but we can make it so by "removing ε-productions" as follows:
S′ → S | ε
S  → aSBC | aBC
CB → BC
aB → ab
bB → bb
bC → bc
cC → cc

A non-type-1 grammar

This grammar generates the language { a^(2^n) : n ≥ 0 }. The grammar itself is not context-sensitive (although the language turns out to be: an LBA can check that the input length is a power of 2).
S  → LaR
L  → LD | ε
Da → aaD
DR → R
R  → ε
D is the "doubler" non-terminal. Here is an example:
S ⟹  LaR
  ⟹^3 LDDDaR
  ⟹  DDDaR
  ⟹  DDaaDR
  ⟹  DaaDaDR
  ⟹  aaDaDaDR
  ⟹  aaaaDDaDR
  ⟹* aaaaaaaaDDDR
  ⟹  aaaaaaaaDDR
  ⟹* aaaaaaaaR
  ⟹  aaaaaaaa = a^8 = a^(2^3)
The grammar is not context-sensitive: observe how it requires explicit "collapsing" rules to finalize the derivation, in particular:
DR → R
R → ε
Proving that a language is not type-1 means showing that no context-sensitive grammar at all can generate it, which is a much more involved task than exhibiting one non-context-sensitive grammar for it.
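As a sanity check on what this grammar generates, we can again do a capped brute-force search over sentential forms. The bound 12 is enough to reach a^8 (the derivation above passes through the length-12 form aaaaaaaaDDDR); this is an illustrative sketch, not a proof:

```python
from collections import deque

DOUBLER_RULES = [("S", "LaR"), ("L", "LD"), ("L", ""),
                 ("Da", "aaD"), ("DR", "R"), ("R", "")]

def terminals(cap):
    """All-terminal strings derivable from S via forms of length <= cap."""
    seen, out, queue = {"S"}, set(), deque(["S"])
    while queue:
        form = queue.popleft()
        for lhs, rhs in DOUBLER_RULES:
            i = form.find(lhs)
            while i >= 0:
                nxt = form[:i] + rhs + form[i + len(lhs):]
                if len(nxt) <= cap and nxt not in seen:
                    seen.add(nxt)
                    if set(nxt) <= {"a"}:   # no non-terminals left
                        out.add(nxt)
                    else:
                        queue.append(nxt)
                i = form.find(lhs, i + 1)
    return out

print(sorted(map(len, terminals(12))))  # [1, 2, 4, 8]
```

Every terminal string found up to the bound has length a power of 2, as expected.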

Linear Bounded Automaton

Recall how we decided the anbncn language. We did so by scanning back and forth across a string, replacing characters by a special marker symbol. That is, the accepting TM stayed within the bounds of the string itself. This is no coincidence.

A Linear Bounded Automaton (LBA) is an NTM M = (K,Σ,Δ,s,H) such that Σ contains a right-end marker, $, whose definition is symmetric to that of the left-end marker, i.e., the computation never goes to the right of the fixed right end. A starting configuration for an LBA is (s, #w$): the input enclosed between the two end markers.
All tape head movements must stay within the left and right ends. A linear bounded NTM can, of course, simulate multiple tracks. Consequently, a LBA can just as well start with extra finite space in its starting configuration. An alternative definition of an LBA is this:
An NTM M is an LBA if there is a constant N > 0, depending on M, such that the machine never goes beyond the bounds of the starting configuration (s, #w#^(N·|w|)$), i.e., the input followed by N·|w| extra blank cells.
This definition makes better sense of the word Linear in LBA: the available tape is linear in the length of the input.

Equivalence of Type-1 and LBA

The following theorem characterizes LBA acceptance.

Theorem: A language is context-sensitive (CSL) if and only if it is accepted by an LBA.

Evidence of this is provided by the machine which decides the a^n b^n c^n language in the TM Computations document. It does so by staying entirely within the bounds of the input string.

A key point about a type-1 language is that the following problem is decidable:
Given a type-1 grammar G and a string w, is w ∈ L(G)?
Recall that this problem is decidable for a CFL by applying the CYK algorithm.
Proof: Since type-1 rules are not collapsing, any sentential form (in V*) used in a derivation S ⇒* w must have length ≤ n = |w|.

Make a graph of all strings in V* whose length is ≤ n, and draw an edge from α to β if α ⇒ β via a rule in R.

Then S ⇒* w if and only if there is a path from S to w.
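The graph-search argument translates directly into a decision procedure. This sketch uses breadth-first search instead of building the whole graph explicitly, with the ε-free a^n b^n c^n grammar as a test case (helper names are my own):

```python
from collections import deque

def csl_member(w, rules, start="S"):
    """Decide w in L(G) for a non-contracting (type-1) grammar G."""
    n = max(len(w), 1)
    seen, queue = {start}, deque([start])
    while queue:
        form = queue.popleft()
        if form == w:
            return True
        for lhs, rhs in rules:
            i = form.find(lhs)
            while i >= 0:
                nxt = form[:i] + rhs + form[i + len(lhs):]
                if len(nxt) <= n and nxt not in seen:  # never exceed |w|
                    seen.add(nxt)
                    queue.append(nxt)
                i = form.find(lhs, i + 1)
    return False   # finitely many forms of length <= n, so this terminates

CS_RULES = [("S", "aSBC"), ("S", "aBC"), ("CB", "BC"),
            ("aB", "ab"), ("bB", "bb"), ("bC", "bc"), ("cC", "cc")]
print(csl_member("aabbcc", CS_RULES))  # True
print(csl_member("aabbc",  CS_RULES))  # False
```

Because the rules never shrink a form, the length cap is sound, and because there are only finitely many strings of length ≤ |w|, the search always halts.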

Complete Chomsky Grammar hierarchy

Again, referring to the Chomsky hierarchy, we now have a complete picture:
  1. type 0: unrestricted: accepted by a TM (recursively enumerable)
  2. type 1: context-sensitive: accepted by an LBA
  3. type 2: context-free: accepted by a PDA
  4. type 3: regular: accepted by a DFA
Recall that type-3, or regular, grammars are ones in which all rules are of one of the forms:
A → wB
A → w
where w ∈ Σ*.
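Rules of this shape behave exactly like automaton transitions: A → wB consumes w and moves to "state" B, while A → w consumes w and accepts. A small sketch of a recognizer built directly on this reading (the representation is my own):

```python
def regular_accepts(s, rules, start="S"):
    """Accept s using right-linear rules (A, w, B), with B=None for A -> w."""
    stack, seen = [(start, s)], set()
    while stack:
        state, rest = stack.pop()
        if (state, rest) in seen:
            continue
        seen.add((state, rest))
        for head, w, nxt in rules:
            if head == state and rest.startswith(w):
                if nxt is None:           # A -> w : consume w and accept
                    if rest == w:
                        return True
                else:                     # A -> wB : consume w, go to B
                    stack.append((nxt, rest[len(w):]))
    return False

# (ab)* via the right-linear grammar  S -> abS | epsilon
AB_RULES = [("S", "ab", "S"), ("S", "", None)]
print(regular_accepts("abab", AB_RULES))  # True
print(regular_accepts("aba",  AB_RULES))  # False
```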

Equivalence of Type-0 Grammar and TM acceptance

Theorem: A language is generated by an unrestricted grammar, G, if and only if it is accepted by a TM M.

G ⇒ TM

Given a grammar G = (V,Σ,R,S), we will construct a 2-tape NTM, M = (K,V,Δ,s,H). In TM Extensions we argued that a (deterministic) multi-tape TM is equivalent to a standard TM. The argument that a multi-tape NTM is equivalent to a standard (single-tape) NTM is exactly the same.

Here is the tape setup used by M:
  1. contains a copy of the input string to accept
  2. contains a sentential form which represents a derivation from the start symbol
The start configuration of M is
(s, #w, #S)
The procedure is this:
  1. Guess a position k within the string on tape 2.
  2. Guess a replacement rule α → β to use.
  3. Check if α equals the substring of tape 2 starting at position k; if not, run forever; otherwise go to the next step.
  4. Replace α by β on tape 2.
  5. Check if tape 2 is equal to tape 1 (containing w). If so, halt; if not, go back to step 1.
The argument that this is correct is that if there is a derivation S ⇒* w, and we correctly guess the replacements which were made, then the machine halts. If there is no such derivation, then the machine will run forever.

Non-deterministic machines always seem like a trick of some kind. But "guessing" simply means that all possibilities are available. For example, in step 1 we say:
guess a position k on the string on tape 2
This means that we go right at least once from the leftmost blank. After going right once, we have a choice to go right again, or stop and proceed to step 3. If we go right too far, we recognize it by reaching the right-end blank, and run forever to represent a failed guess sequence.
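The whole guess-and-check loop can be sketched as a search in which every guess becomes a branch; a failed branch (the machine "running forever") is simply abandoned. The step bound below is only so the sketch terminates on non-members (the real NTM has no such bound), and names are my own:

```python
def ntm_accepts(w, rules, start="S", max_steps=6):
    def run(tape2, steps):
        if tape2 == w:                          # step 5: compare with tape 1
            return True
        if steps == max_steps:
            return False
        for k in range(len(tape2)):             # step 1: guess a position k
            for lhs, rhs in rules:              # step 2: guess a rule
                if tape2.startswith(lhs, k):    # step 3: check the match
                    # step 4: replace lhs by rhs at position k, then repeat
                    if run(tape2[:k] + rhs + tape2[k + len(lhs):], steps + 1):
                        return True
        return False                            # every guess sequence failed
    return run(start, 0)

G_RULES = [("S", "aSBC"), ("S", ""), ("CB", "BC"),
           ("aB", "ab"), ("bB", "bb"), ("bC", "bc"), ("cC", "cc")]
print(ntm_accepts("abc", G_RULES))  # True
print(ntm_accepts("ab",  G_RULES))  # False
```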

Context Sensitive Grammar ⇒ LBA

Based on this construction, if the grammar G is context-sensitive, then the sentential forms of a derivation of w placed on tape 2 must all have length ≤ |w|, suggesting that an LBA could accept L(G).

Our construction, however, is a 2-tape NTM rather than an LBA proper, so we would have to work out some further details, in particular that a single-tape machine simulating it still stays within the linear bound.

TM ⇒ G

An alternative way of expressing a TM configuration (q,wau) is by the (unmarked) string wqau, where the state precedes the symbol marked by the tape head. If we think of q as a non-terminal then the machine transition δ(q,a) = (p,b) can be thought of as the production qa → pb, thereby creating the new configuration string wpbu. This is the essence of how a grammar can be thought of as simulating a TM.
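The correspondence can be made concrete with a toy single-step simulator on configuration strings, where the state is the unique uppercase letter and everything else is tape content (the transition table below is invented for illustration):

```python
def step(config, delta):
    """Apply one transition to a configuration string such as '#aQab'."""
    i = next(j for j, c in enumerate(config) if c.isupper())  # locate state
    q, a = config[i], config[i + 1]          # state and scanned symbol
    p, act = delta[(q, a)]
    if act == "R":                           # qa -> aq : move right
        return config[:i] + a + p + config[i + 2:]
    if act == "L":                           # cqa -> pca : move left
        return config[:i - 1] + p + config[i - 1] + config[i + 1:]
    return config[:i] + p + act + config[i + 2:]   # qa -> pb : write b

# toy machine: Q moves right over a's, then writes # over b and enters P
delta = {("Q", "a"): ("Q", "R"), ("Q", "b"): ("P", "#")}
c = "#Qaab"
for _ in range(3):
    c = step(c, delta)
print(c)  # #aaP#
```

Each branch of `step` is exactly one of the production shapes qa → aq, cqa → pca, and qa → pb used below.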

The key problem to solve is how to somehow "end" with the string we want to accept, because the computation starts from, say, s#w and ends in a halt state with some entirely different string on the tape. There are two possible ways to solve this problem:
  1. make the grammar "run the computation backwards" from a halt state to the starting configuration with the target string on the tape
  2. generate the input string duplicated in a sentential form so that one side can be used to represent the computation and the other simply holds the string fixed; if the computation side halts then somehow erase the computation part, leaving the input string as the result
The textbook chooses the former approach. For variety, we'll use the latter approach.

The grammar is G = (V,Σ,R,S), where the non-terminal set V−Σ contains every symbol listed below which is not a terminal, including the machine states and dedicated non-terminals Xa for each a ∈ Σ:
S   →  FQR
Q   →  aXaQ | M    for all a ∈ Σ

Xab →  bXa         for all a, b ∈ Σ
XaM →  Ma          for all a ∈ Σ
Fa  →  aF          for all a ∈ Σ
FM  →  Ms#

pa  →  qb          if δ(p,a) = (q,b)
pa  →  aq          if δ(p,a) = (q,R)
pR  →  p#R
cpa →  qca         if δ(p,a) = (q,L), 
                         for all c ∈ Σ∪{#}

ha  →  h           for all a ∈ Σ∪{#}, h ∈ H
ah  →  h           for all a ∈ Σ∪{#}, h ∈ H

MhR →  ε
Read F as Front, M as Middle, R as Rear. The initial goal is to get the input string duplicated on both sides of the middle. Suppose the string ababb were accepted by M. Assume the starting configuration is (s, #ababb) and that the machine halts with the tape in the configuration (h, #a#), written #ah# in string form. Then we would derive ababb as follows:
S ⇒* FaXabXbaXabXbbXbMR    generate start string and duplicators
  ⇒* FababbXaXbXaXbXbMR    move duplicators to the middle

  ⇒* FababbMababbR        duplicators cross middle, creating the duplicate string
  ⇒* ababbFMababbR        move F down to reach the middle
  ⇒* ababbMs#ababbR       create the TM starting configuration (s,#ababb)

  ⇒* ababbM#ah#R          mimic M's actions up to halting

  ⇒* ababbMhR             h erases all working input symbols and blanks between M and R

  ⇒  ababb                clear the working section
The first and second groups of productions create a working copy of the input string between the symbols M (middle) and R (rear). The front symbol, F, must move right to join with M. To allow this, the X symbols move to the right of M, turning into their associated terminal symbols. After F joins with M, the last production in the second group creates the initial configuration of M on the input string, with the state (initially s) positioned in front of the symbol under the read/write head.

The third group of productions represents the moves made by the Turing machine from the initial configuration. The state is consistently positioned in front of the scanned symbol. The M symbol becomes the effective left end of the tape. A halting computation will eventually produce the halt state h to the left of some symbol in the working area bounded by M and R.

The last two groups of productions represent the cleanup after the halt state h has been produced. It finishes with the original copy of the input which produced a halting configuration.

This grammar is not type-1

All the rules used to simulate the TM are type-1 except those in the last two groups, which erase the working section:
ha  →  h              for all a ∈ Σ∪{#}, h ∈ H
ah  →  h              for all a ∈ Σ∪{#}, h ∈ H
MhR →  ε

© Robert M. Kline