### Type-0 grammar

A type-0 grammar`G = (V,Σ,R,S)`is defined like a context-free grammar except for the way productions are defined:

R ⊆ V*(V-Σ)V* × V*A rule is depicted like this:

αAβ → γConceptually, the focus is on the replacement of the non-terminal

`A`, but only in the

**context**of being surrounded by other parts,

`α`and

`β`. Thus, the replacement is not

*context-free*. Of course,

`α`and

`β`can contain other non-terminals, and so there is no precise notion of what is "the" non-terminal and its context. The derives relation

`⇒`on

`V`is basically the same as it is for context-free grammars:

^{*}α′αAββ′ ⇒ α′γβ′ if αAβ → γ is a ruleThe forced presence of at least one terminal in the left-hand side of a rule means that we must stop when we derive a string of all terminal symbols. All other concepts regarding unrestricted grammars are the same as for context-free grammars. In the Chomsky Hierarchy terms, an unrestricted grammar is called a

**type-0**grammar.

#### Example 4.6.2

The obligatory one. Find a grammar`G`for

L = { w ∈ {a,b,c}* : w = aThe textbooks has one solution. Here is another:^{n}b^{n}c^{n}, n ≥ 0 }

S → aSBC | ε CB → BC aB → ab bB → bb bC → bc cC → ccHere is a sample derivation:

S ⇒From the start we generate a sentential form:^{4}aaaBCBCBC ⇒ aaaBBCCBC ⇒ aaaBBCBCC ⇒ aaaBBBCCC ⇒ aaabBBCCC ⇒ aaabbBCCC ⇒ aaabbbCCC ⇒ aaabbbcCC ⇒ aaabbbccC ⇒ aaabbbccc

athe rule^{n}(BC)^{n}

`CB → BC`gives you the ability to change this to

aThe remaining rules allow you to convert this form to:^{n}B^{n}C^{n}

aIt is fairly easy to argue that^{n}b^{n}c^{n}

`L ⊆ L(G)`, i.e., every string in

`L`can be generated by the grammar. It is tricky to show that

`L(G) ⊆ L`in the sense that the grammar

**only**generates strings in

`L`.

### Type-1 Grammars

This topic is relegated to exercise 5.7.4 in the textbook. A grammar is*context sensitive*if all rules are of one of two forms:

`S → ε and ``S` never appears on the right-hand side of a rule.

or
```
α → β where |α| ≤ |β|
```

In particular, a derivation is never "collapsing," i.e.,
sentential forms in the derivation sequence never decrease in length.
In the Chomsky Hierarchy terms,
a context-sensitive grammar is called a **type-1** grammar.
We know that a Context-Free grammar is also context-sensitive by virtue of the
Chomsky Normal Form, or simply by removing the `ε`-productions.

*Context Sensitive Language*(CSL) if it is generated by a context-sensitive grammar. The

`a`language is context sensitive. The grammar above is technically not context sensitive, but we can make it so by "removing ε-productions" as follows:

^{n}b^{n}c^{n}S′ → S | ε S → aSBC | aBC CB → BC aB → ab bB → bb bC → bc cC → cc

#### A non-type 1 language

This grammar generates the language`{a`which is not context-sensitive.

^{2n}: n ≥ 0 }S → LaR L → LD | ε Da → aaD DR → R R → ε

`D`is the "doubler" non-terminal. Here is an example:

S ⟹ LaR ⟹The grammar is not context-sensitive, in particular, observe how it requires explicit "collapsing rules" to finalize the derivation, in particular the rules:^{3}LDDDaR ⟹ DDDaR ⟹ DDaaDR ⟹ DaaDaDR ⟹ aaDaDaDR ⟹ aaaaDDaDR ⟹^{*}aaaaaaaaDDDR ⟹ aaaaaaaaDDR ⟹^{*}aaaaaaaaR ⟹ aaaaaaaa = a^{8}= a^{23}

DR → R R → εProving that the language is

**not**type-1 means there is no context-sensitive grammar which can generate it, so this is much more involved.

### Linear Bounded Automaton

Recall how we decided the`a`language. We did so by scanning back and forth across a string, replacing characters by a special marker symbol. That is, the accepting TM stayed within the bounds of the string itself. This is no coincidence. A Linear Bounded Automaton (LBA) is a NTM

^{n}b^{n}c^{n}`M = (K,Σ,Δ,s,H)`such that

`Σ`contains a

**right-end**marker,

`, whose definition is symmetric to that of the left-end marker, i.e., the computation never goes right of the fixed right-end. A starting configuration for an LBA is:`

**$**All tape head movements must stay within the left and right ends. A>#w#$

*linear bounded*NTM can, of course, simulate multiple tracks. Consequently, a LBA can just as well start with extra finite space in its starting configuration. An alternative definition of an LBA is this:

An NTM

This definition makes more sense of `M`is an LBA if there is a constant`N > 0`depending on`M`such that the machine never goes beyond the following starting configuration:>#w#^{N*|w|}$

**Linear**in an LBA.

#### Equivalence of Type-1 and LBA

The following theorem characterizes LBA acceptance.
**Theorem**: A language is context-sensitive (CSL) if and only if it is
accepted by an LBA.

`a`language. from the

^{n}b^{n}c^{n}**TM Computations**document. It does so by staying entirely within the range of the input string to be decided.

A key point about a type-1 language is that the following problem is

**decidable**:

Given a type-1 grammar, G and a string

Recall that this problem is decidable for a CFL by applying the CYK algorithm.
`w`, is`w ∈ L(G)`?### Complete Chomsky Grammar hierarchy

Again, referring to the Chomsky hierarchy, we now have a complete picture:**type 0**: unrestricted: accepted by a TM (recursively enumerable)**type 1**: context-sensitive: accepted by an LBA**type 2**: context-free: accepted by an PDA**type 3**: regular: accepted by an DFA

A → wB A → wwhere

`w ∈ Σ*`.

### Equivalence of Type-0 Grammar and TM acceptance

**Theorem**: A language is generated by an unrestricted grammar,

`G`, if and only if it is accepted by a TM

`M`.

#### G ⇒ TM

Given a grammar`G = (V,Σ,R,S)`, we will construct is a 2-tape NTM,

`M = (K,V,Δ,s,H)`. In TM Extensions we argued that a multi-tape TM (deterministic) is equivalent to a standard TM. The argument that a multi-tape NTM is equivalent to a standard (single tape) NTM is exactly the same. Here is the tape setup used by

`M`:

- contains a copy of the input string to accept
- contains a sentential form which represents a derivation from the start symbol

`M`is

(s,The procedure is this:#w,#S)

- Guess a position within the string on tape 2:
**#**X_{1}...X_{k}...X_{n}**#**↑ -
Guess a replacement rule
`α → β`to use. -
Check if
`α`= the substring starting at position`k`; if not, run forever, otherwise go to the next step. - Replace
`α`by`β`on tape 2 - Check if tape 2 is equal to tape 1 (containing
`w`). If so halt, if not go back to step 1.

`S ⇒* w`, and we correctly guess the replacements which were made, then the machine halts. If there is no such derivation, then the machine will run forever. Non-deterministic machines always seem like a trick of some kind. But "guessing" simply means that all possibilities are available. For example, in step 2 we say:

guess a position

This means that we go right at least once from the leftmost blank. After
going right once, we have a choice to go right again, or stop and
proceed to step 3.
If we go right too far, we recognize it by reaching the right-end blank,
and run forever to represent a failed guess sequence.
`k`on the string on tape 2#### Context Sensitive Grammar ⇒ LBA

Based on what we're seeing, if the grammar G is context sensitive, the derivation steps of`w`, placed on a second tape must all be of length ≤ |w|. Suggesting that an LBA could accept L(G). Of course an LBA is defined deterministically, so we'd have to work out some further details, in particular, how the TM which would simulate such an NTM is still an LBA.

#### TM ⇒ G

An alternative way of expressing a TM configuration`(q,wau)`is by the (unmarked) string

`wqau`, where the state precedes the symbol marked by the tape head. If we think of

`q`as a non-terminal then the machine transition

`δ(q,a) = (p,b)`can be thought of as the production

`qa → pb`, thereby creating the new configuration string

`wpbu`. This is the essence of how a grammar can be thought of as simulating a TM. The key problem to solve is how to somehow "end" with the string we want to accept, because the computation starts from, say,

`s#w`and ends in a halt state with some entirely different string on the tape. There are two possible ways to solve this problem:

- make the grammar "run the computation backwards" from a halt state to the starting configuration with the target string on the tape
- generate the input string duplicated in a sentential form so that one side can be used to represent the computation and the other simply holds the string fixed; if the computation side halts then somehow erase the computation part, leaving the input string as the result

`G = (V,Σ,S,R)`where the non-terminal set

`V-Σ`contains every symbol listed below which is not a terminal, including the machine states and dedicated nonteminals,

`X`, for each

_{a}`a ∈ Σ`

S → FQR Q → aXRead_{a}Q | M for all a ∈ Σ X_{a}b → bX_{a}for all a, b ∈ Σ X_{a}M → Ma for all a ∈ Σ Fa → aF for all a ∈ Σ FM → Ms# for all a ∈ Σ pa → qb if δ(p,a) = (q,b) pa → aq if δ(p,a) = (q,R) pR → p#R cpa → qca if δ(p,a) = (q,L), for all c ∈ Σ∪{#} ha → h for all a ∈ Σ∪{#}, h ∈ H ah → h for all a ∈ Σ∪{#}, h ∈ H MhR → ε

`F`as

**F**ront,

`M`as

**M**iddle,

`R`as

**R**ear. The initial goal is to get the input string duplicated on both sides of the middle. Suppose the string

`ababb`were accepted by

`M`. Assume the starting configuration is

and the machine halts with the tape in this configuration#ababb

then we would derive#a#

`ababb`as follows:

S ⇒* FaXThe first and second group of productions creates a working copy of the input string between the symbols_{a}bX_{b}aX_{a}bX_{b}bX_{b}MR generate start string and duplicaters ⇒* FababbX_{a}X_{b}X_{a}X_{b}X_{b}MR move duplicater to middle ⇒* FababbMababbR duplicaters cross middle creating duplicate string ⇒* ababbFMababbR move F down to reach the middle ⇒* ababbMs#ababbR create the TM starting configuration (s,#ababb) ⇒* ababbM#ah#R mimicM's actions up to halting ⇒* ababbMhRherases all working input symbols and blanks betweenMandR⇒ ababb clear the working section

`M`(middle) and

`R`(rear). The front symbol,

`F`, must move to join with

`M`. To do so, the

`X`symbols move to the right of

`M`turning into the associated terminal symbol. After

`F`joins with

`M`, the last production in the second group creates the initial configuration of

`M`with the input string where the state (initially

`s`) is positioned in front of the symbol under the read/write head. The third group of productions represents the moves made by the Turing machine from the initial configuration. The state is consistently positioned in front of the scanned position. The

`M`symbol becomes the effective left-end of the tape. A halting computation will eventually produce the halt state

`h`to the left of some symbol in the working area bounded by

`M`and

`R`. The last two groups of productions represent the cleanup after the halt state

`h`has been produced. It finishes with the original copy of the input which produced a halting configuration.

#### This grammar is not type-1

All the rules used to simulate the TM are type-1 except for those in the last two groups which erase the working section.ha → h for all a ∈ Σ∪{#}, h ∈ H ah → h for all a ∈ Σ∪{#}, h ∈ H MhR → ε