A grammar for the MAY2004 language (aka mini-pas)
Metasymbols
The vertical bar, |, is used for OR. Non-terminals are in angle brackets, e.g. <sttmt>. Terminals are in bold face, e.g. if. An asterisk superscript on a set of braces indicates zero or more repetitions, and a plus superscript indicates at least one repetition. Braces without a superscript group alternatives. e is the empty string.
Lexical rules
<number> => { <digit>}+
<id> => <letter>{<letter> | <digit>}*
<relop> => = | > | < | <> | >= | <=
<addop> => + | - | or
<multop> => * | div | and
<digit> => 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
<letter> => a | b | c | ... | z | _ (all letters are converted to lower case for lexical analysis)
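As a rough illustration of the <number> and <id> rules above, the following Python sketch classifies a single lexeme. The helper name classify_lexeme is hypothetical, and the sketch assumes that case folding happens before matching, as the note on <letter> suggests. It does not separate keywords or the word operators (or, div, and) from identifiers.

```python
import re

# <number> => {<digit>}+
NUMBER_RE = re.compile(r"[0-9]+")
# <id> => <letter>{<letter> | <digit>}*, where <letter> includes the underscore
ID_RE = re.compile(r"[a-z_][a-z_0-9]*")


def classify_lexeme(text):
    """Return 'number', 'id', or None for one lexeme (hypothetical helper)."""
    lexeme = text.lower()  # letters are converted to lower case first
    if NUMBER_RE.fullmatch(lexeme):
        return "number"
    if ID_RE.fullmatch(lexeme):
        return "id"
    return None
```

Under these assumptions, Count_1 is accepted as an <id> and 2004 as a <number>, while 1abc matches neither rule because an <id> must start with a <letter>.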
Syntax rules
<program> => program <id> ; <globdecl> {<procdec> | <funcdec>}* <compsttmt> .
<globdecl> => {{const {<id> = {+ | - | e} <number> ;}+ } | e} {{var { <id> : integer ;}+ } | e }
<procdec> => procedure <id> { ( <paramlst> ) | e } ; <lcldec> <compsttmt> ;
<funcdec> => function <id> ( { <paramlst> | e } ) : integer ; <lcldec> <compsttmt> ;
<compsttmt> => begin <sttmt> {; <sttmt> }* end
<paramlst> => { e | var} <id> : integer { ; { e | var} <id> : integer }*
<lcldec> => e | {var { <id> : integer ;}+ }
<sttmt> => <ifsttmt> | <whilesttmt> | <inputsttmt> | <outputsttmt> | <forsttmt> | <compsttmt> | <assignsttmt> | <proccallsttmt> | e
<ifsttmt> => if ( <bexpr> ) then <sttmt> { e | else <sttmt>}
<whilesttmt> => while ( <bexpr> ) do <sttmt>
<inputsttmt> => readln ( <id> )
<outputsttmt> => writeln ( { <id> | <number>} )
(optional) <outputsttmt> => writeln ( <expr> )
<forsttmt> => for <id> := <expr> to <expr> do <sttmt>
<assignsttmt> => <id> := <expr>
<proccallsttmt> => <id> { e | ( <id> { , <id> }* ) }
<expr> => {<addop>}* <term> { <addop> <term>}*
<term> => <factor> {<multop> <factor>}*
<factor> => <id> | <number> | ( <expr> ) | <funccall>
<funccall> => <id> ( { e | <id> {, <id> }* } )
<bexpr> => <expr> { <relop> <expr> | e }
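The rules for <bexpr>, <expr>, <term> and <factor> map naturally onto a recursive-descent parser, with one function per non-terminal. The Python sketch below is one possible reading, not a reference implementation. The name parse_bexpr and the tuple-shaped tree are illustrative choices. The sketch omits <funccall>, does not reject keywords used as identifiers, and its simple token splitter silently skips characters it does not recognise.

```python
import re

ADDOPS = {"+", "-", "or"}
MULTOPS = {"*", "div", "and"}
RELOPS = {"=", ">", "<", "<>", ">=", "<="}
# Two-character operators are listed first so that they win over single ones.
TOKEN_RE = re.compile(r"[0-9]+|[a-z_][a-z_0-9]*|<>|>=|<=|[=<>+\-*()]")


def parse_bexpr(text):
    """Parse a <bexpr> and return a nested-tuple tree (illustrative shape)."""
    tokens = TOKEN_RE.findall(text.lower())
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        pos += 1
        return token

    def factor():
        # <factor> => <id> | <number> | ( <expr> )   (<funccall> is omitted)
        token = advance()
        if token == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token[0].isdigit():
            return ("number", int(token))
        if (token[0].isalpha() or token[0] == "_") and token not in ADDOPS | MULTOPS:
            return ("id", token)
        raise SyntaxError(f"unexpected token {token!r}")

    def term():
        # <term> => <factor> {<multop> <factor>}*
        node = factor()
        while peek() in MULTOPS:
            op = advance()
            node = (op, node, factor())
        return node

    def expr():
        # <expr> => leading <addop>s, then <term> { <addop> <term> }*
        prefixes = []
        while peek() in ADDOPS:
            prefixes.append(advance())
        node = term()
        for op in reversed(prefixes):  # the innermost prefix binds first
            node = ("unary", op, node)
        while peek() in ADDOPS:
            op = advance()
            node = (op, node, term())
        return node

    # <bexpr> => <expr> { <relop> <expr> | e }
    node = expr()
    if peek() in RELOPS:
        op = advance()
        node = (op, node, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return node
```

Because <term> is parsed inside <expr>, a <multop> binds tighter than an <addop>: parsing a + 2 * b yields ("+", ("id", "a"), ("*", ("number", 2), ("id", "b"))). The repetition loops make both operator classes left-associative, which is a consequence of this sketch and not something the grammar itself states.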
Keywords and Special Symbols
The following are keywords in mini-pas: and* begin case* const do downto* else end for function if integer not* of* or* procedure program readln repeat* then to until* var while writeln. An asterisk indicates that implementing the keyword is optional.
The following are single symbol tokens: = < > : ; . , + - * ( )
The following are multiple symbol tokens: := >= <= <> div
Additional token types are <number> <id> <addop> <multop>
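The token classes listed above can be combined into a small scanner. The Python sketch below is a minimal illustration under several assumptions: comments have already been removed, the comma is accepted because the <proccallsttmt> and <funccall> rules use it, and relational operators are reported with the generic kind "symbol". The names tokenize, KEYWORDS, ADDOPS and MULTOPS are hypothetical.

```python
import re

# Keywords from the list above, including the optional ones.
KEYWORDS = {
    "and", "begin", "case", "const", "do", "downto", "else", "end", "for",
    "function", "if", "integer", "not", "of", "or", "procedure", "program",
    "readln", "repeat", "then", "to", "until", "var", "while", "writeln",
}
ADDOPS = {"+", "-", "or"}
MULTOPS = {"*", "div", "and"}

# Multiple symbol tokens come before single symbols so that ":=" is not
# split into ":" followed by "=".
TOKEN_RE = re.compile(r"""
    (?P<space>\s+)
  | (?P<number>[0-9]+)
  | (?P<word>[a-z_][a-z_0-9]*)
  | (?P<symbol>:=|>=|<=|<>|[=<>:;.,+\-*()])
""", re.VERBOSE)


def tokenize(source):
    """Return a list of (kind, lexeme) pairs for comment-free source text."""
    text = source.lower()  # letters are converted to lower case
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        pos = match.end()
        kind, lexeme = match.lastgroup, match.group()
        if kind == "space":
            continue
        if lexeme in ADDOPS:
            kind = "addop"
        elif lexeme in MULTOPS:
            kind = "multop"
        elif kind == "word":
            kind = "keyword" if lexeme in KEYWORDS else "id"
        tokens.append((kind, lexeme))
    return tokens
```

For example, tokenize("X := Y div 2;") produces an id, the symbol :=, another id, a multop, a number and the symbol ; in that order, with the identifiers folded to lower case.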
Comments
Comments are indicated by braces, e.g. {This is a comment}, and may appear anywhere in the program. Comments may not be "nested".
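One way to handle the comment rule is to remove comments before scanning. The Python sketch below is an illustration only. The name strip_comments is hypothetical, and two choices are assumptions that the text above does not specify: each comment is replaced by a single space so that neighbouring tokens do not run together, and an unterminated comment is reported as an error.

```python
def strip_comments(source):
    """Replace each {...} comment with one space (hypothetical helper)."""
    out = []
    in_comment = False
    for ch in source:
        if in_comment:
            # No nesting: the first closing brace ends the comment, and an
            # opening brace inside a comment is just more comment text.
            if ch == "}":
                in_comment = False
                out.append(" ")
        elif ch == "{":
            in_comment = True
        else:
            out.append(ch)
    if in_comment:
        raise ValueError("unterminated comment")
    return "".join(out)
```

Because nesting is not supported, the text a{one {two}b becomes a b: the inner opening brace is swallowed as part of the comment and the first closing brace ends it.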