What is BNF notation?
BNF stands for "Backus Naur Form". John Backus and
Peter Naur were the first to introduce a formal notation
for describing the syntax of a language. It was used to describe
the ALGOL 60
programming language (see [Naur 60]).
To be precise, most of BNF was introduced by Backus in a report presented at
an earlier UNESCO conference on ALGOL 58. Few read the report, but when
Peter Naur read it he was surprised at some of the differences he found
between his and Backus's interpretation of ALGOL 58. He decided that the
successor to ALGOL 58 (a first design in which all participants had come to
recognize some weaknesses) should be given in a similar form so that all
participants would be aware of what they were agreeing to. He made a
few modifications that are now almost universally used and drew up, on his
own, the BNF for ALGOL 60 at the meeting where it was designed. Depending
on whom you credit with presenting it to the world, it was introduced either
by Backus in 1959 or by Naur in 1960. (For more details on this period of
programming language history, see the
introduction to Backus's Turing Award article in Communications of the
ACM, Vol. 21, No. 8, August 1978. This note was suggested by
William B. Clodius
of Los Alamos Natl. Lab.)
Since then, almost every author of a book on a new programming language
has used it to specify the syntax rules of the language. See [Jensen 74] and [Wirth 82] for examples.
The following is taken from [Marcotty 86]:
The meta-symbols of BNF are:
- ::=   meaning "is defined as"
- |     meaning "or"
- < >   angle brackets used to surround category names.
The angle brackets distinguish syntax rule names (also called
non-terminal symbols) from terminal symbols, which are written exactly
as they are to be represented. A BNF rule defining a nonterminal has the form:
nonterminal ::= sequence_of_alternatives consisting of strings of
terminals or nonterminals separated by the meta-symbol |
For example, the BNF production for a mini-language is:
    <program> ::= program
                      <declaration_sequence>
                  begin
                      <statements_sequence>
                  end ;
This shows that a mini-language program consists of the keyword "program"
followed by the declaration sequence, then the keyword "begin" and the
statements sequence, finally the keyword "end" and a semicolon.
(end of quotation)
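As a rough illustration of how "::=" and "|" are read, the following Python sketch stores a tiny grammar as a dictionary and enumerates every sentence it derives. The grammar (<greeting>, <name>) and the names GRAMMAR and sentences are invented for this illustration and do not come from [Marcotty 86]. The approach only works for grammars without recursion.

```python
# Hypothetical toy grammar, written as a Python dict:
#   <greeting> ::= hello <name> | goodbye <name>
#   <name>     ::= world | ALGOL
# Each nonterminal maps to a list of alternatives (the "|" choices);
# each alternative is a sequence of terminals and nonterminals.
GRAMMAR = {
    "<greeting>": [["hello", "<name>"], ["goodbye", "<name>"]],
    "<name>": [["world"], ["ALGOL"]],
}

def sentences(symbol):
    """Return every terminal string derivable from symbol (non-recursive grammars only)."""
    if symbol not in GRAMMAR:
        # A terminal is written exactly as it is to be represented.
        return [symbol]
    results = []
    for alternative in GRAMMAR[symbol]:
        partial = [""]
        for item in alternative:
            # Concatenate every expansion of this item onto every prefix so far.
            partial = [(p + " " + s).strip() for p in partial for s in sentences(item)]
        results.extend(partial)
    return results

print(sentences("<greeting>"))
# ['hello world', 'hello ALGOL', 'goodbye world', 'goodbye ALGOL']
```

Angle brackets are kept in the dictionary keys so that a nonterminal can be told apart from a terminal by a simple lookup, mirroring the role they play in the notation.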
In fact, many authors have introduced some slight extensions of BNF for the
ease of use:
- optional items are enclosed in meta symbols [ and ], example:
    <if_statement> ::= if <boolean_expression> then
                           <statement_sequence>
                       [ else
                           <statement_sequence> ]
                       end if ;
- repetitive items (zero or more times) are enclosed in meta
symbols { and }, example:
<identifier> ::= <letter> { <letter> | <digit> }
this rule is equivalent to the recursive rule, written in plain BNF:
    <identifier> ::= <letter> |
                     <identifier> <letter> |
                     <identifier> <digit>
- terminals of only one character are surrounded by quotes (") to
distinguish them from meta-symbols, example:
<statement_sequence> ::= <statement> { ";" <statement> }
- in recent textbooks, terminal and non-terminal symbols are distinguished
by using boldface for terminals and omitting the < and > around
non-terminals. This greatly improves readability. The example then becomes:
    if_statement ::= if boolean_expression then
                         statement_sequence
                     [ else
                         statement_sequence ]
                     end if ";"
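The equivalence between the { } form and a recursive rule for <identifier> can be checked with a small Python sketch. It is only an illustration under stated assumptions: <letter> is taken to be whatever str.isalpha accepts and <digit> whatever str.isdigit accepts, and both function names are made up here. The braces turn naturally into a loop, while the recursive rule turns into a recursive call on a shorter string.

```python
def is_identifier_loop(s):
    """<identifier> ::= <letter> { <letter> | <digit> }  (braces become a loop)."""
    if not s or not s[0].isalpha():
        return False
    i = 1
    # Zero or more repetitions of <letter> | <digit>.
    while i < len(s) and (s[i].isalpha() or s[i].isdigit()):
        i += 1
    return i == len(s)

def is_identifier_rec(s):
    """<identifier> ::= <letter> | <identifier> <letter> | <identifier> <digit>"""
    if len(s) == 1:
        # Base alternative: a single <letter>.
        return s.isalpha()
    if len(s) > 1 and (s[-1].isalpha() or s[-1].isdigit()):
        # Recursive alternatives: a shorter <identifier> followed by one symbol.
        return is_identifier_rec(s[:-1])
    return False

for candidate in ["x25", "2x", "total"]:
    print(candidate, is_identifier_loop(candidate), is_identifier_rec(candidate))
```

Both functions accept and reject the same strings. An optional item in [ ] would be handled the same way with a single if test instead of a loop.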
As a last example (maybe not the easiest to read!), here is the
definition of BNF expressed in BNF:
syntax ::= { rule }
rule ::= identifier "::=" expression
expression ::= term { "|" term }
term ::= factor { factor }
factor ::= identifier |
           quoted_symbol |
           "(" expression ")" |
           "[" expression "]" |
           "{" expression "}"
identifier ::= letter { letter | digit }
quoted_symbol ::= """ { any_character } """
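The grammar above maps almost line for line onto a recursive-descent parser, with one method per rule. The Python sketch below is a minimal illustration, not a reference implementation. All names (tokenize, Parser, the tuple tags "alt", "seq", "rep", "opt", "group", "sym") are invented here. It assumes that underscores are allowed inside identifiers, that a quoted symbol contains no double quote (so the quoted_symbol rule itself is left out of the demo input), and that an identifier followed by "::=" starts a new rule, since the grammar has no explicit rule terminator.

```python
import re

# One token: "::=", a single meta-symbol, a quoted symbol, or an identifier.
TOKEN = re.compile(r'\s*(::=|[|()\[\]{}]|"[^"]*"|[A-Za-z][A-Za-z0-9_]*)')

def tokenize(text):
    tokens, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected input near offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def is_ident(tok):
    return tok[0].isalpha()

class Parser:
    def __init__(self, tokens):
        self.toks, self.i = tokens, 0

    def peek(self, k=0):
        j = self.i + k
        return self.toks[j] if j < len(self.toks) else None

    def expect(self, tok):
        if self.peek() != tok:
            raise SyntaxError(f"expected {tok!r}, got {self.peek()!r}")
        self.i += 1

    def syntax(self):                 # syntax ::= { rule }
        rules = {}
        while self.peek() is not None:
            name, body = self.rule()
            rules[name] = body
        return rules

    def rule(self):                   # rule ::= identifier "::=" expression
        name = self.peek()
        if name is None or not is_ident(name):
            raise SyntaxError(f"expected a rule name, got {name!r}")
        self.i += 1
        self.expect("::=")
        return name, self.expression()

    def expression(self):             # expression ::= term { "|" term }
        alts = [self.term()]
        while self.peek() == "|":
            self.i += 1
            alts.append(self.term())
        return ("alt", alts)

    def term(self):                   # term ::= factor { factor }
        factors = [self.factor()]
        while self.starts_factor():
            factors.append(self.factor())
        return ("seq", factors)

    def starts_factor(self):
        tok = self.peek()
        if tok is None or tok in ("|", ")", "]", "}", "::="):
            return False
        # Assumption: an identifier followed by "::=" begins the next rule.
        return not (is_ident(tok) and self.peek(1) == "::=")

    def factor(self):                 # factor ::= identifier | quoted_symbol | (..) | [..] | {..}
        tok = self.peek()
        kinds = {"(": ("group", ")"), "[": ("opt", "]"), "{": ("rep", "}")}
        if tok in kinds:
            kind, closer = kinds[tok]
            self.i += 1
            inner = self.expression()
            self.expect(closer)
            return (kind, inner)
        if tok is not None and (is_ident(tok) or tok.startswith('"')):
            self.i += 1
            return ("sym", tok)
        raise SyntaxError(f"unexpected token {tok!r}")

BNF_IN_BNF = '''
syntax ::= { rule }
rule ::= identifier "::=" expression
expression ::= term { "|" term }
term ::= factor { factor }
factor ::= identifier | quoted_symbol | "(" expression ")" |
           "[" expression "]" | "{" expression "}"
identifier ::= letter { letter | digit }
'''

rules = Parser(tokenize(BNF_IN_BNF)).syntax()
print(list(rules))
# ['syntax', 'rule', 'expression', 'term', 'factor', 'identifier']
```

Each { } in the grammar became a while loop and each | became a test on the next token, which is the usual way such a grammar is turned into a hand-written parser.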
BNF is not only important for describing syntax rules in books; it is
also very commonly used (with variants) by syntactic tools. See for example
any book on LEX and YACC (for YACC, see also [Johnson 75]),
the standard UNIX lexical analyzer and parser generators. If you
have access to a Unix machine, you will probably find a chapter of
the documentation on these tools.
Some references:
- [Naur 60]
- NAUR, Peter (ed.), "Report on the Algorithmic
Language ALGOL 60", Communications of the ACM, Vol. 3, No. 5, pp.
299-314, May 1960.
- [Jensen 74]
- JENSEN, Kathleen, and WIRTH, Niklaus,
"PASCAL User Manual and Report",
Lecture Notes in Computer Science, Vol. 18,
Berlin: Springer, 1974.
- [Johnson 75]
- S.C. Johnson,
"Yacc: Yet Another Compiler Compiler",
Computer Science Technical Report #32,
Bell Laboratories,
Murray Hill, NJ, 1975.
- [Wirth 82]
- WIRTH, Niklaus,
Programming in Modula-2,
Berlin, Heidelberg: Springer, 1982.
- [Marcotty 86]
- M. Marcotty & H. Ledgard,
The World of Programming Languages,
Springer-Verlag,
Berlin, 1986, pp. 41 and following.
Th. Estier, CUI - University of Geneva