Chapter 12: Context-Free Grammars

Topics
Context-free grammars
How do you use a CFG to generate a sentence?
Parse trees
Lukasiewicz (Polish) notation
Evaluating a formula written in prefix notation
Ambiguity


Context-free grammars
 
Context-free grammars (CFG’s): *standard terminology sentence will be used to stand for what the book calls "word in the language"

Example CFG:
S ® aSb  read: "S goes to aSb" or
"S becomes aSb" or "S produces aSb"
S ® l l is the empty string, as usual
 
Terminology:

If there are multiple productions with the same LHS, we usually group them using the alternation sign "|", e.g.:

S ® aSb | l


How do you use a CFG to generate a sentence?
 
Start with the sentence symbol S  
Add a Þ (the derives symbol) S Þ  
Replace a non-terminal by a RHS whose LHS is that non-terminal, and add it: S Þ aSb we used production 1 to replace S by aSb
Repeat as often as you can: S Þ aSb Þ aaSbb production 1
  S Þ aSb Þ aaSbb Þ aabb production 2
When there are no non-terminals to be replaced, you have a sentence in the language generated by the grammar

We can, of course, continue to replace S by aSb for as many steps as we please. The language generated?

anbn      (!!)

More terminology:

S Þ aSb Þ aaSbb Þ aabb is called a derivation of the string aabb we say "S derives aSb derives aaSbb derives aabb"
 
S, aSb, aaSbb, aabb are called sentential forms (book: "working string"), because they can all be derived from S in 0 or more steps


Parse trees
 
This is an alternative scheme for showing the derivation of a sentence, resembling "diagramming (English) sentences"

Derivation: S Þ aSb Þ aaSbb Þ aabb
Parse tree:

It’s not too hard to see how the tree represents a derivation.  Notice that the sentence is just the terminal symbols at the leaves of the tree, read from left to right (ignore l ).


Lukasiewicz (Polish) notation

Our book shows us a grammar for producing parse trees of the following form:
 
This is sometimes referred to as an abstract syntax tree, and represents the algebraic expression
(3 + 1)*5  
Notice: 
  • the expression needs the parentheses
  • 3 + 1*5 is not the same!
  • the tree, by its structure, 
  • indicates that the 3 + 1 must be done before 

  • its result (4) can be combined with the 5 to get 20

The abstract syntax tree becomes a device for writing arithmetic expressions without parentheses, by traversing the tree, as follows:

Trace the tree, top to bottom, left to right, recording symbols as you go (but not re-recording any you have already seen on the way down). This is called a pre-order traversal. For our tree:

*+315
This style of representation of an expression is called (Polish) prefix notation.

Standard (infix) notation: 3*(1 + 5)            Prefix notation: *+315 (parenthesis-free)

Notice, for prefix notation,


Evaluating a formula written in prefix notation
 
1. locate the rightmost operator and the two operands to its right +**2-*32421
2. replace that substring by its value

*32 = 3*2 = 6

+**2-6421
repeat 1 +**2-6421
repeat 2 +**2221
repeat 1 and 2 +*421
repeat 1 and 2 +81
repeat 1 and 2  9
you are done when all you have left is a single result
if you get something like: 43+9, step 1 cannot be performed, and the expression was not valid

Now:



Ambiguity

Consider:

S ® aSb
S ® aaSbb
S ® l
and the string: aabb

Two parse trees:

and:

When any sentence has more than one parse tree, we say the grammar is ambiguous. Grammars for computer languages are constructed so as to be unambiguous.

Hosted by www.Geocities.ws

1