Chapter 18 Part I: Decidability

Topics
Introduction
An algorithm for deciding whether a CFL is empty
Helper algorithm: remove useless nonterminals
Algorithm for deciding whether or not a CFL is infinite


Introduction

We addressed the issue of decidability for regular languages in Chapter 11. We showed that:

  1. there is an effective procedure (algorithm) for deciding whether a regular language is empty
  2. there is an effective procedure for deciding whether a RL is finite
  3. there is an effective procedure for deciding whether a string is in a given RL (just run it on the FA!)
  4. there is an effective procedure for deciding whether two RL’s are the same language
Context-free languages are not as well-behaved when it comes to decidability questions. For questions 1, 2, and 3 above, the answer is yes, but for question 4, the answer is no. In fact, there are lots of undecidable issues for CFL’s, such as: We will not prove the undecidability results, but will prove 1 and 2 above (for CFL’s) by demonstrating appropriate effective decision procedures (algorithms).


An algorithm for deciding whether a CFL is empty

The method is a grammar method:

  1. find the set of nullable nonterminals:  if it is non-empty, the CFL is non-empty
  2. find the set of nonterminals that can derive a string of terminals:  if the sentence symbol is in that set, the CFL is non-empty
  3. otherwise, the CFL is empty
Demonstration of this method on the following CFG:

S ® XY | AX, A ® a, Y ® BY | BB, B ® b

1. Find nullable non-terminals (there are none)  
2. Remove l-productions (there are none)  
3. Underline all occurrences of nonterminals S ® XY | AX, A ® a, Y ® BY | BB, B ® b
4. Underline all occurrences of nonterminals any of whose right hand sides consists solely of underlined symbols. This is the heart of the algorithm. We have A ® a, and B ® b, so underline all A’s and B’s:
S ® XY | AX, A®a, Y ®BY | BB, B®b
5. Repeat 4, until no new nonterminals can be underlined. We have Y ® BB, so underline all Y’s:
S ® XY | AX, A®a, Y®BY | BB, B®b
6. Repeat 4. No new underlinings. Stop.  
7. S is not underlined, so the language produces no sentences, and is empty.  
The underlined symbols are those that can derive a string of terminals  


Helper algorithm: remove useless nonterminals

A useless nonterminal is one that can never be used in the derivation of a sentence. This can happen in two ways:

Here’s the algorithm:

1. do the "emptiness" algorithm.

Now we want to determine which nonterminals can be derived from the sentence symbol.  We use the example: S ® ABa, A ® Xb, B ® bAA, X ® bA | aaa, Y ® b (already cleansed according to 1 above)
 
2.  underline S (on lhs) S ® ABa, A ® Xb, B ® bAA, X ® bA | aaa, Y ® b
3.  For each production with an underline, find the nonterminals on the rhs of that production. Underline any lhs occurrence of such a nonterminal. 
This is the heart of the algorithm.
We have S ® ABa, so we underline A and B, where they occur as a lhs:

S ® ABa, A ® Xb, B ® bAA, X ® bA | aaa, Y ® b

4. Repeat 3  We have A ® Xb, B ® bAA, so underline X (A is already underlined):
S ® ABa, A ® Xb, B ® bAA, X ® bA | aaa, Y ® b
5. Repeat 3 No new nonterminals underlined. Stop.
Y remains non-underlined.

6. Remove all productions with Y (on lhs or rhs):

S ® ABa, A ® Xb, B ® bAA, X ® bA | aaa This is our final grammar, containing only useful nonterminals.


Algorithm for deciding whether or not a CFL is infinite

We will assume that our CFL is generated by a CFG that is l-production free.

Recall that self-embedded nonterminals are those that can derive themselves:

N Þ* …N…

If a CFL has no self-embedded nonterminals:

On the other hand, if N Þ* …N…, we can pump N and generate infinitely many strings.

So all we need to do is determine if there are any self-embedded nonterminals.

Here’s the algorithm. We will use as an example grammar:

S ® ABa, A ® Xb, B ® bAA, X ® bA | aaa 1. Remove useless nonterminals; some of them may be self-embedded, but so what? They would never be used anyway. Already done for our example. Remove unit productions. Already done.

2. For each non-terminal, find those nonterminals it can derive. Let’s do it for B.
 
3. Examine the rhs for each B ® production. For each nonterminal found there, underline that nonterminal everywhere it appears as a lhs. These are the nonterminals that can be derived from B in one derivation step. We have B ® bAA, so we want to underline all lhs A’s:

S ® ABa, A ® Xb, B ® bAA, X ® bA | aaa

4. For each production with a lhs underlined, underline lhs occurrences of nonterminals in the corresponding rhs. This is the heart of the algorithm. We have A ® Xb, so underline lhs X’s:

S ® ABa, A ® Xb, B ® bAA, X ® bA | aaa

5. Repeat 4. No new underlinings. Stop.

The underlined non-terminals are those than can be derived from B in 1 or more steps.  They are: {A, X}. B is not self-embedding.

If we do 3-5 for A, we find that A can derive {X, A}.  Thus A is self-embedding, and the language is infinite.

Note:

Hosted by www.Geocities.ws

1