Chapter 18: Decidability

Chapter 18 Part I: Decidability

Topics
Introduction
An algorithm for deciding whether a CFL is empty
Helper algorithm: remove useless nonterminals
Algorithm for deciding whether or not a CFL is infinite

Introduction

We addressed the issue of decidability for regular languages in Chapter 11. We showed that:

there is an effective procedure (algorithm) for deciding whether a regular language is empty
there is an effective procedure for deciding whether a RL is finite
there is an effective procedure for deciding whether a string is in a given RL (just run it on the FA!)
there is an effective procedure for deciding whether two RL’s are the same language

Context-free languages are not as well-behaved when it comes to decidability questions. For questions 1, 2, and 3 above, the answer is yes, but for question 4, the answer is no. In fact, there are lots of undecidable issues for CFL’s, such as:

whether or not the complement of a given CFL is also CFL. Of course, one can tell for certain CFL’s, but there is no general procedure
whether or not two CFL’s have a word in common
whether or not a CFL generates all strings

We will not prove the undecidability results, but will prove 1 and 2 above (for CFL’s) by demonstrating appropriate effective decision procedures (algorithms).

An algorithm for deciding whether a CFL is empty

The method is a grammar method:

find the set of nullable nonterminals: if it is non-empty, the CFL is non-empty
find the set of nonterminals that can derive a string of terminals: if the sentence symbol is in that set, the CFL is non-empty
otherwise, the CFL is empty

Demonstration of this method on the following CFG:

S ® XY | AX, A ® a, Y ® BY | BB, B ® b

1. Find nullable non-terminals (there are none)
2. Remove l-productions (there are none)
3. Underline all occurrences of nonterminals	S ® XY \| AX, A ® a, Y ® BY \| BB, B ® b
4. Underline all occurrences of nonterminals any of whose right hand sides consists solely of underlined symbols. This is the heart of the algorithm.	We have A ® a, and B ® b, so underline all A’s and B’s: S ® XY \| AX, A®a, Y ®BY \| BB, B®b
5. Repeat 4, until no new nonterminals can be underlined.	We have Y ® BB, so underline all Y’s: S ® XY \| AX, A®a, Y®BY \| BB, B®b
6. Repeat 4. No new underlinings. Stop.
7. S is not underlined, so the language produces no sentences, and is empty.
The underlined symbols are those that can derive a string of terminals

Helper algorithm: remove useless nonterminals

A useless nonterminal is one that can never be used in the derivation of a sentence. This can happen in two ways:

it can never be derived from the sentence symbol
it can never derive a string of terminals

Here’s the algorithm:

1. do the "emptiness" algorithm.

the non-underlined nonterminals are those which cannot derive a string of terminals
remove all productions involving those nonterminals

Now we want to determine which nonterminals can be derived from the sentence symbol. We use the example: S ® ABa, A ® Xb, B ® bAA, X ® bA | aaa, Y ® b (already cleansed according to 1 above)

2. underline S (on lhs)	S ® ABa, A ® Xb, B ® bAA, X ® bA \| aaa, Y ® b
3. For each production with an underline, find the nonterminals on the rhs of that production. Underline any lhs occurrence of such a nonterminal. This is the heart of the algorithm.	We have S ® ABa, so we underline A and B, where they occur as a lhs: S ® ABa, A ® Xb, B ® bAA, X ® bA \| aaa, Y ® b
4. Repeat 3	We have A ® Xb, B ® bAA, so underline X (A is already underlined): S ® ABa, A ® Xb, B ® bAA, X ® bA \| aaa, Y ® b
5. Repeat 3	No new nonterminals underlined. Stop. Y remains non-underlined.

6. Remove all productions with Y (on lhs or rhs):

S ® ABa, A ® Xb, B ® bAA, X ® bA | aaa This is our final grammar, containing only useful nonterminals.

Algorithm for deciding whether or not a CFL is infinite

We will assume that our CFL is generated by a CFG that is l-production free.

Recall that self-embedded nonterminals are those that can derive themselves:

N Þ* …N…

If a CFL has no self-embedded nonterminals:

all derivations of sentences are bounded in length (at most n = #nonterminals)
hence, there are only at most n² (i.e. finitely many) sentences.

On the other hand, if N Þ* …N…, we can pump N and generate infinitely many strings.

So all we need to do is determine if there are any self-embedded nonterminals.

Here’s the algorithm. We will use as an example grammar:

S ® ABa, A ® Xb, B ® bAA, X ® bA | aaa 1. Remove useless nonterminals; some of them may be self-embedded, but so what? They would never be used anyway. Already done for our example. Remove unit productions. Already done.

2. For each non-terminal, find those nonterminals it can derive. Let’s do it for B.

3. Examine the rhs for each B ® production. For each nonterminal found there, underline that nonterminal everywhere it appears as a lhs. These are the nonterminals that can be derived from B in one derivation step. We have B ® bAA, so we want to underline all lhs A’s:
S ® ABa, A ® Xb, B ® bAA, X ® bA | aaa

4. For each production with a lhs underlined, underline lhs occurrences of nonterminals in the corresponding rhs. This is the heart of the algorithm. We have A ® Xb, so underline lhs X’s:
S ® ABa, A ® Xb, B ® bAA, X ® bA | aaa

5. Repeat 4. No new underlinings. Stop.

The underlined non-terminals are those than can be derived from B in 1 or more steps. They are: {A, X}. B is not self-embedding.

If we do 3-5 for A, we find that A can derive {X, A}. Thus A is self-embedding, and the language is infinite.

Note:

we can stop as soon as we find one self-embedded nonterminal, and declare the language infinite
if we find no self-embedded nonterminals, the language is finite

Hosted by www.Geocities.ws

3. Examine the rhs for each B ® production. For each nonterminal found there, underline that nonterminal everywhere it appears as a lhs. These are the nonterminals that can be derived from B in one derivation step.	We have B ® bAA, so we want to underline all lhs A’s: S ® ABa, A ® Xb, B ® bAA, X ® bA \| aaa
4. For each production with a lhs underlined, underline lhs occurrences of nonterminals in the corresponding rhs. This is the heart of the algorithm.	We have A ® Xb, so underline lhs X’s: S ® ABa, A ® Xb, B ® bAA, X ® bA \| aaa
5. Repeat 4. No new underlinings. Stop.