Talk:BNF Grammar: Difference between revisions

From Rosetta Code
Content added Content deleted
(Ouch!)
 
(bnf for c)
Line 1: Line 1:
Ouch! Impedance mismatch. Is it just me, or does this entry read like no other on RC? --[[User:Paddy3118|Paddy3118]] 05:17, 21 June 2009 (UTC)
Ouch! Impedance mismatch. Is it just me, or does this entry read like no other on RC? --[[User:Paddy3118|Paddy3118]] 05:17, 21 June 2009 (UTC)
:It may well be me :-) --[[User:Paddy3118|Paddy3118]] 05:17, 21 June 2009 (UTC)
:It may well be me :-) --[[User:Paddy3118|Paddy3118]] 05:17, 21 June 2009 (UTC)
This page is getting kind of big... seems to be crashing. I was moving the BNF for c, and can't put it back, so here it is:
=={{header|C}}==
<pre>
! -----------------------------------------------------------------------
! ANSI C
!
! The C programming language evolved at Bell Labs from a series of
! programming languages: 'CPL', 'BCPL', and then 'B'. As a result, C's
! development was a combined effort between Dennis Ritchie, Ken Thompson,
! and Martin Richards.
!
! C was designed for the creation and implementation of low-level systems
! such as operating systems, device drivers, firmware, etc... To realize
! this goal, the language contains the ability to perform operations
! directly on memory and has direct access to system pointers. While this
! gives an enormous amount of control and flexibility, it also makes C a
! professional programming language - not to be used by an inexperienced
! programmer.
!
! C (and later C++) quickly became the de facto standard for developing
! operating systems, applications and most other large projects. UNIX as
! well as Windows, Linux, and Mac-OS X were developed using this
! language (and its successors).
!
! More information is available at Dennis Ritchie's website:
! http://cm.bell-labs.com/cm/cs/who/dmr/
!
! The C grammar is inherently ambigious and requires a large number of
! LALR(1) states to parse. As a result, the time required by the GOLD
! Parser Builder to compile this grammar is extensive.
!
! C is not a line-based grammar with the notable exception of compiler
! directives (which are preceeded by a '#' character). These are usually not
! handled directly by the actual parser, but, rather, the pre-processor.
! Before the program is analyzed by the parser, C compilers scan the code and
! act on these commands. The final C program is then passed to the parser.
! -----------------------------------------------------------------------

! -----------------------------------------------------------------------
! This grammar does not contain the compiler directives.
!
! Note: This is an ad hoc version of the language. If there are any flaws,
! please visit the contact page and tell me.
!
! SPECIAL THANKS TO:
! BOB MEAGHER
! MIKE WISDOM
! VLADIMIR MOROZOV
! TOM VAN DIJCK
!
! Modified 06/14/2002
! * The correct definition for "return" was added. Thanks to Bob Meagher.
! * Added the missing rules for labels, storage specifications, etc...
! which were left out. Thanks to Mike Wisdom for calling this to
! my attention.
!
! Modified 06/21/2002
! * I fixed an error in the grammar for declaring functions.
!
! Modified 06/15/2003
! * Vladimir Morozov fixed an error for calling functions with no parameters
!
! Modified 01/31/2004
! * Tom van Dijck found a bug in the grammar concerning variable
! initialization.
!
! Modified 04/26/2004
! * Some errors in the grammar were fixed.
!
! Modified 01/19/2005
! * The definition for comments was modified. In ANSI C, block comments
! cannot be nested. As a result, they are defined using the whitespace
! terminal.
!
! Modified 03/28/2007
! * The commented-out definition for non-nested comments was updated. The
! previous version did not work in all cases.
! -----------------------------------------------------------------------


"Name" = 'ANSI C'
"Version" = '1973'
"Author" = 'Dennis Ritchie, Ken Thompson, Martin Richards'
"About" = 'C is one of the most common, and complex, programming languages in use today.'

"Case Sensitive" = True
"Start Symbol" = <Decls>

{Hex Digit} = {Digit} + [abcdefABCDEF]
{Oct Digit} = [01234567]

{Id Head} = {Letter} + [_]
{Id Tail} = {Id Head} + {Digit}

{String Ch} = {Printable} - ["]
{Char Ch} = {Printable} - ['']

DecLiteral = [123456789]{digit}*
OctLiteral = 0{Oct Digit}*
HexLiteral = 0x{Hex Digit}+
FloatLiteral = {Digit}*'.'{Digit}+

StringLiteral = '"'( {String Ch} | '\'{Printable} )* '"'
CharLiteral = '' ( {Char Ch} | '\'{Printable} )''

Id = {Id Head}{Id Tail}*


! ===================================================================
! Comments
! ===================================================================

Comment Start = '/*'
Comment End = '*/'
Comment Line = '//'


! Typically, C comments cannot be nested. As a result, the
! Comment Start and Comment End terminals cannot be used.
!
! To implement non-nested comments, the whitespace terminal is
! modified to accept them. In the definition below, Whitespace
! is defined as one or more {Whitespace} characters OR a series
! of characters delimited by /* and */. Note that the characters
! between the two delimiters cannot contain the */ sequence.
!
! Uncomment the following to prevent block commments. Make sure
! to comment the Comment Start and Comment End definitions.
!
! {Non Slash} = {Printable} - [/]
! {Non Asterisk} = {Printable} - [*]
!
! Whitespace = {Whitespace}+
! | '/*' ( {Non Asterisk} | '*' {Non Slash}? )* '*/'

!=======================================================

<Decls> ::= <Decl> <Decls>
|

<Decl> ::= <Func Decl>
| <Func Proto>
| <Struct Decl>
| <Union Decl>
| <Enum Decl>
| <Var Decl>
| <Typedef Decl>
! ===================================================================
! Function Declaration
! ===================================================================

<Func Proto> ::= <Func ID> '(' <Types> ')' ';'
| <Func ID> '(' <Params> ')' ';'
| <Func ID> '(' ')' ';'

<Func Decl> ::= <Func ID> '(' <Params> ')' <Block>
| <Func ID> '(' <Id List> ')' <Struct Def> <Block>
| <Func ID> '(' ')' <Block>


<Params> ::= <Param> ',' <Params>
| <Param>
<Param> ::= const <Type> ID
| <Type> ID
<Types> ::= <Type> ',' <Types>
| <Type>
<Id List> ::= Id ',' <Id List>
| Id

<Func ID> ::= <Type> ID
| ID

! ===================================================================
! Type Declaration
! ===================================================================

<Typedef Decl> ::= typedef <Type> ID ';'

<Struct Decl> ::= struct Id '{' <Struct Def> '}' ';'

<Union Decl> ::= union Id '{' <Struct Def> '}' ';'


<Struct Def> ::= <Var Decl> <Struct Def>
| <Var Decl>

! ===================================================================
! Variable Declaration
! ===================================================================

<Var Decl> ::= <Mod> <Type> <Var> <Var List> ';'
| <Type> <Var> <Var List> ';'
| <Mod> <Var> <Var List> ';'
<Var> ::= ID <Array>
| ID <Array> '=' <Op If>

<Array> ::= '[' <Expr> ']'
| '[' ']'
|
<Var List> ::= ',' <Var Item> <Var List>
|

<Var Item> ::= <Pointers> <Var>

<Mod> ::= extern
| static
| register
| auto
| volatile
| const

! ===================================================================
! Enumerations
! ===================================================================

<Enum Decl> ::= enum Id '{' <Enum Def> '}' ';'
<Enum Def> ::= <Enum Val> ',' <Enum Def>
| <Enum Val>

<Enum Val> ::= Id
| Id '=' OctLiteral
| Id '=' HexLiteral
| Id '=' DecLiteral


! ===================================================================
! Types
! ===================================================================

<Type> ::= <Base> <Pointers>

<Base> ::= <Sign> <Scalar>
| struct Id
| struct '{' <Struct Def> '}'
| union Id
| union '{' <Struct Def> '}'
| enum Id


<Sign> ::= signed
| unsigned
|

<Scalar> ::= char
| int
| short
| long
| short int
| long int
| float
| double
| void

<Pointers> ::= '*' <Pointers>
|

! ===================================================================
! Statements
! ===================================================================

<Stm> ::= <Var Decl>
| Id ':' !Label
| if '(' <Expr> ')' <Stm>
| if '(' <Expr> ')' <Then Stm> else <Stm>
| while '(' <Expr> ')' <Stm>
| for '(' <Arg> ';' <Arg> ';' <Arg> ')' <Stm>
| <Normal Stm>

<Then Stm> ::= if '(' <Expr> ')' <Then Stm> else <Then Stm>
| while '(' <Expr> ')' <Then Stm>
| for '(' <Arg> ';' <Arg> ';' <Arg> ')' <Then Stm>
| <Normal Stm>

<Normal Stm> ::= do <Stm> while '(' <Expr> ')'
| switch '(' <Expr> ')' '{' <Case Stms> '}'
| <Block>
| <Expr> ';'
| goto Id ';'
| break ';'
| continue ';'
| return <Expr> ';'
| ';' !Null statement


<Arg> ::= <Expr>
|

<Case Stms> ::= case <Value> ':' <Stm List> <Case Stms>
| default ':' <Stm List>
|

<Block> ::= '{' <Stm List> '}'

<Stm List> ::= <Stm> <Stm List>
|


! ===================================================================
! Here begins the C's 15 levels of operator precedence.
! ===================================================================

<Expr> ::= <Expr> ',' <Op Assign>
| <Op Assign>

<Op Assign> ::= <Op If> '=' <Op Assign>
| <Op If> '+=' <Op Assign>
| <Op If> '-=' <Op Assign>
| <Op If> '*=' <Op Assign>
| <Op If> '/=' <Op Assign>
| <Op If> '^=' <Op Assign>
| <Op If> '&=' <Op Assign>
| <Op If> '|=' <Op Assign>
| <Op If> '>>=' <Op Assign>
| <Op If> '<<=' <Op Assign>
| <Op If>

<Op If> ::= <Op Or> '?' <Op If> ':' <Op If>
| <Op Or>

<Op Or> ::= <Op Or> '||' <Op And>
| <Op And>

<Op And> ::= <Op And> '&&' <Op BinOR>
| <Op BinOR>

<Op BinOR> ::= <Op BinOr> '|' <Op BinXOR>
| <Op BinXOR>

<Op BinXOR> ::= <Op BinXOR> '^' <Op BinAND>
| <Op BinAND>

<Op BinAND> ::= <Op BinAND> '&' <Op Equate>
| <Op Equate>

<Op Equate> ::= <Op Equate> '==' <Op Compare>
| <Op Equate> '!=' <Op Compare>
| <Op Compare>

<Op Compare> ::= <Op Compare> '<' <Op Shift>
| <Op Compare> '>' <Op Shift>
| <Op Compare> '<=' <Op Shift>
| <Op Compare> '>=' <Op Shift>
| <Op Shift>

<Op Shift> ::= <Op Shift> '<<' <Op Add>
| <Op Shift> '>>' <Op Add>
| <Op Add>

<Op Add> ::= <Op Add> '+' <Op Mult>
| <Op Add> '-' <Op Mult>
| <Op Mult>

<Op Mult> ::= <Op Mult> '*' <Op Unary>
| <Op Mult> '/' <Op Unary>
| <Op Mult> '%' <Op Unary>
| <Op Unary>

<Op Unary> ::= '!' <Op Unary>
| '~' <Op Unary>
| '-' <Op Unary>
| '*' <Op Unary>
| '&' <Op Unary>
| '++' <Op Unary>
| '--' <Op Unary>
| <Op Pointer> '++'
| <Op Pointer> '--'
| '(' <Type> ')' <Op Unary> !CAST
| sizeof '(' <Type> ')'
| sizeof '(' ID <Pointers> ')'
| <Op Pointer>

<Op Pointer> ::= <Op Pointer> '.' <Value>
| <Op Pointer> '->' <Value>
| <Op Pointer> '[' <Expr> ']'
| <Value>

<Value> ::= OctLiteral
| HexLiteral
| DecLiteral
| StringLiteral
| CharLiteral
| FloatLiteral
| Id '(' <Expr> ')'
| Id '(' ')'

| Id
| '(' <Expr> ')'
</pre>

Revision as of 08:08, 21 June 2009

Ouch! Impedance mismatch. Is it just me, or does this entry read like no other on RC? --Paddy3118 05:17, 21 June 2009 (UTC)

It may well be me :-) --Paddy3118 05:17, 21 June 2009 (UTC)

This page is getting kind of big... seems to be crashing. I was moving the BNF for c, and can't put it back, so here it is:

C

! -----------------------------------------------------------------------
! ANSI C
!
! The C programming language evolved at Bell Labs from a series of 
! programming languages: 'CPL', 'BCPL', and then 'B'. As a result, C's
! development was a combined effort between Dennis Ritchie, Ken Thompson,
! and Martin Richards.  
!
! C was designed for the creation and implementation of low-level systems
! such as operating systems, device drivers, firmware, etc... To realize 
! this goal, the language contains the ability to perform operations 
! directly on memory and has direct access to system pointers. While this 
! gives an enormous amount of control and flexibility, it also makes C a 
! professional programming language - not to be used by an inexperienced
! programmer.
!
! C (and later C++) quickly became the de facto standard for developing
! operating systems, applications and most other large projects. UNIX as 
! well as Windows, Linux, and Mac-OS X were developed using this 
! language (and its successors).
!
! More information is available at Dennis Ritchie's website:
!     http://cm.bell-labs.com/cm/cs/who/dmr/
!
! The C grammar is inherently ambigious and requires a large number of
! LALR(1) states to parse. As a result, the time required by the GOLD 
! Parser Builder to compile this grammar is extensive.
! 
! C is not a line-based grammar with the notable exception of compiler
! directives (which are preceeded by a '#' character). These are usually not
! handled directly by the actual parser, but, rather, the pre-processor. 
! Before the program is analyzed by the parser, C compilers scan the code and
! act on these commands. The final C program is then passed to the parser.
! -----------------------------------------------------------------------

! -----------------------------------------------------------------------
! This grammar does not contain the compiler directives.
!
! Note: This is an ad hoc version of the language. If there are any flaws, 
! please visit the contact page and tell me.
!
! SPECIAL THANKS TO:
!     BOB MEAGHER
!     MIKE WISDOM
!     VLADIMIR MOROZOV
!     TOM VAN DIJCK
!
! Modified 06/14/2002
!   * The correct definition for "return" was added. Thanks to Bob Meagher.
!   * Added the missing rules for labels, storage specifications, etc...
!     which were left out. Thanks to Mike Wisdom for calling this to
!     my attention.
!
! Modified 06/21/2002
!   * I fixed an error in the grammar for declaring functions. 
!
! Modified 06/15/2003
!   * Vladimir Morozov fixed an error for calling functions with no parameters
!
! Modified 01/31/2004
!   * Tom van Dijck found a bug in the grammar concerning variable 
!     initialization.
!
! Modified 04/26/2004
!   * Some errors in the grammar were fixed.
!
! Modified 01/19/2005
!   * The definition for comments was modified. In ANSI C, block comments
!     cannot be nested. As a result, they are defined using the whitespace
!     terminal.
!
! Modified 03/28/2007
!   * The commented-out definition for non-nested comments was updated. The
!     previous version did not work in all cases.
! -----------------------------------------------------------------------


"Name"    = 'ANSI C' 
"Version" = '1973'
"Author"  = 'Dennis Ritchie, Ken Thompson, Martin Richards' 
"About"   = 'C is one of the most common, and complex, programming languages in use today.'

"Case Sensitive" = True
"Start Symbol"   = <Decls>

{Hex Digit}      = {Digit} + [abcdefABCDEF]
{Oct Digit}      = [01234567]

{Id Head}        = {Letter} + [_]
{Id Tail}        = {Id Head} + {Digit}

{String Ch}      = {Printable} - ["]
{Char Ch}        = {Printable} - ['']

DecLiteral       = [123456789]{digit}*
OctLiteral       = 0{Oct Digit}*
HexLiteral       = 0x{Hex Digit}+
FloatLiteral     = {Digit}*'.'{Digit}+

StringLiteral    = '"'( {String Ch} | '\'{Printable} )* '"'
CharLiteral      = '' ( {Char Ch} | '\'{Printable} )''

Id               = {Id Head}{Id Tail}*


! ===================================================================
! Comments
! ===================================================================

Comment Start = '/*'
Comment End   = '*/'
Comment Line  = '//'


! Typically, C comments cannot be nested. As a result, the 
! Comment Start and Comment End terminals cannot be used.
!
! To implement non-nested comments, the whitespace terminal is
! modified to accept them. In the definition below, Whitespace 
! is defined as one or more {Whitespace} characters OR a series
! of characters delimited by /* and */. Note that the characters
! between the two delimiters cannot contain the */ sequence. 
!
! Uncomment the following to prevent block commments. Make sure 
! to comment the Comment Start and Comment End definitions.
!
! {Non Slash}     = {Printable} - [/]  
! {Non Asterisk}  = {Printable} - [*]
! 
! Whitespace     = {Whitespace}+   
!                | '/*' (  {Non Asterisk} | '*' {Non Slash}? )*  '*/'

!=======================================================

<Decls> ::= <Decl> <Decls>
          |

<Decl>  ::= <Func Decl>
          | <Func Proto>
          | <Struct Decl>
          | <Union Decl>
          | <Enum Decl>
          | <Var Decl>    
          | <Typedef Decl>
              
! ===================================================================
! Function  Declaration
! ===================================================================

<Func Proto> ::= <Func ID> '(' <Types>  ')' ';'
               | <Func ID> '(' <Params> ')' ';'
               | <Func ID> '(' ')' ';'

<Func Decl>  ::= <Func ID> '(' <Params>  ')' <Block>
               | <Func ID> '(' <Id List> ')' <Struct Def> <Block>
               | <Func ID> '(' ')' <Block>


<Params>     ::= <Param> ',' <Params>
               | <Param>
               
<Param>      ::= const <Type> ID
               |       <Type> ID
               
<Types>      ::= <Type>  ',' <Types>
               | <Type> 
   
<Id List>    ::= Id ',' <Id List>
               | Id

<Func ID>    ::= <Type> ID
               | ID

! ===================================================================
! Type Declaration
! ===================================================================

<Typedef Decl> ::= typedef <Type> ID ';'

<Struct Decl>  ::= struct Id '{' <Struct Def> '}'  ';' 

<Union Decl>   ::= union Id '{' <Struct Def> '}'  ';' 


<Struct Def>   ::= <Var Decl> <Struct Def>
                 | <Var Decl>

! ===================================================================
! Variable Declaration
! ===================================================================

<Var Decl>     ::= <Mod> <Type> <Var> <Var List>  ';'
                 |       <Type> <Var> <Var List>  ';'
                 | <Mod>        <Var> <Var List>  ';'
             
<Var>      ::= ID <Array>
             | ID <Array> '=' <Op If> 

<Array>    ::= '[' <Expr> ']'
             | '[' ']'
             |
             
<Var List> ::=  ',' <Var Item> <Var List>
             | 

<Var Item> ::= <Pointers> <Var>

             
<Mod>      ::= extern 
             | static
             | register
             | auto
             | volatile
             | const   

! ===================================================================
! Enumerations
! ===================================================================

<Enum Decl>    ::= enum Id '{' <Enum Def> '}'  ';'
 
<Enum Def>     ::= <Enum Val> ',' <Enum Def>
                 | <Enum Val>

<Enum Val>     ::= Id
                 | Id '=' OctLiteral
                 | Id '=' HexLiteral
                 | Id '=' DecLiteral  


! ===================================================================
! Types
! ===================================================================

<Type>     ::= <Base> <Pointers> 

<Base>     ::= <Sign> <Scalar>
             | struct Id 
             | struct '{' <Struct Def> '}' 
             | union Id
             | union '{' <Struct Def> '}' 
             | enum Id  


<Sign>     ::= signed 
             | unsigned
             |

<Scalar>   ::= char
             | int
             | short
             | long
             | short int
             | long int
             | float
             | double
             | void           

<Pointers> ::= '*' <Pointers>
             |

! ===================================================================
! Statements
! ===================================================================

<Stm>        ::= <Var Decl>
               | Id ':'                            !Label
               | if '(' <Expr> ')' <Stm>          
               | if '(' <Expr> ')' <Then Stm> else <Stm>         
               | while '(' <Expr> ')' <Stm> 
               | for '(' <Arg> ';' <Arg> ';' <Arg> ')' <Stm>
               | <Normal Stm>

<Then Stm>   ::= if '(' <Expr> ')' <Then Stm> else <Then Stm> 
               | while '(' <Expr> ')' <Then Stm> 
               | for '(' <Arg> ';' <Arg> ';' <Arg> ')' <Then Stm>
               | <Normal Stm>

<Normal Stm> ::= do <Stm> while '(' <Expr> ')'
               | switch '(' <Expr> ')' '{' <Case Stms> '}'
               | <Block>
               | <Expr> ';'               
               | goto Id ';'
               | break ';'
               | continue ';'
               | return <Expr> ';'
               | ';'              !Null statement


<Arg>       ::= <Expr> 
              | 

<Case Stms> ::= case <Value> ':' <Stm List> <Case Stms>
              | default ':' <Stm List>                  
              |

<Block>     ::= '{' <Stm List> '}' 

<Stm List>  ::=  <Stm> <Stm List> 
              | 


! ===================================================================
! Here begins the C's 15 levels of operator precedence.
! ===================================================================

<Expr>       ::= <Expr> ',' <Op Assign>   
               | <Op Assign>

<Op Assign>  ::= <Op If> '='   <Op Assign>
               | <Op If> '+='  <Op Assign>
               | <Op If> '-='  <Op Assign>
               | <Op If> '*='  <Op Assign>
               | <Op If> '/='  <Op Assign>
               | <Op If> '^='  <Op Assign>
               | <Op If> '&='  <Op Assign>
               | <Op If> '|='  <Op Assign>
               | <Op If> '>>=' <Op Assign>
               | <Op If> '<<=' <Op Assign>
               | <Op If>

<Op If>      ::= <Op Or> '?' <Op If> ':' <Op If>
               | <Op Or>

<Op Or>      ::= <Op Or> '||' <Op And>
               | <Op And>

<Op And>     ::= <Op And> '&&' <Op BinOR>
               | <Op BinOR>

<Op BinOR>   ::= <Op BinOr> '|' <Op BinXOR>
               | <Op BinXOR>

<Op BinXOR>  ::= <Op BinXOR> '^' <Op BinAND>
               | <Op BinAND>

<Op BinAND>  ::= <Op BinAND> '&' <Op Equate>
               | <Op Equate>

<Op Equate>  ::= <Op Equate> '==' <Op Compare>
               | <Op Equate> '!=' <Op Compare>
               | <Op Compare>

<Op Compare> ::= <Op Compare> '<'  <Op Shift>
               | <Op Compare> '>'  <Op Shift>
               | <Op Compare> '<=' <Op Shift>
               | <Op Compare> '>=' <Op Shift>
               | <Op Shift>

<Op Shift>   ::= <Op Shift> '<<' <Op Add>
               | <Op Shift> '>>' <Op Add>
               | <Op Add>

<Op Add>     ::= <Op Add> '+' <Op Mult>
               | <Op Add> '-' <Op Mult>
               | <Op Mult>

<Op Mult>    ::= <Op Mult> '*' <Op Unary>
               | <Op Mult> '/' <Op Unary>
               | <Op Mult> '%' <Op Unary>
               | <Op Unary>

<Op Unary>   ::= '!'    <Op Unary>
               | '~'    <Op Unary>   
               | '-'    <Op Unary>
               | '*'    <Op Unary>
               | '&'    <Op Unary>               
               | '++'   <Op Unary>
               | '--'   <Op Unary>
               | <Op Pointer> '++'
               | <Op Pointer> '--'
               | '(' <Type> ')' <Op Unary>   !CAST
               | sizeof '(' <Type> ')'
               | sizeof '(' ID <Pointers> ')'
               | <Op Pointer>

<Op Pointer> ::= <Op Pointer> '.' <Value>
               | <Op Pointer> '->' <Value>
               | <Op Pointer> '[' <Expr> ']'
               | <Value>

<Value>      ::= OctLiteral
               | HexLiteral
               | DecLiteral  
               | StringLiteral
               | CharLiteral
               | FloatLiteral
               | Id '(' <Expr> ')'
               | Id '(' ')'          

               | Id
               | '(' <Expr> ')'