The file parse.y
contains the “bison” source code of GNU
Pascal's parser. This stage of the compilation analyzes and checks
the syntax of your Pascal program, and it generates an intermediate,
language-independent code which is then passed to the GNU back-end.
The bison language essentially is a machine-readable form of the Backus-Naur Form, the symbolic notation of grammars of computer languages. “Syntax diagrams” are a graphical variant of the Backus-Naur Form.
For details about the “bison” language, see the Bison manual. A short overview of how to pick up some information you might need for programming follows.
Suppose you have forgotten how a variable is declared in Pascal.
After some searching in parse.y
you have found the following:
simple_decl_1:
    ...
  | p_var variable_declaration_list
      { [...] }
  ;

variable_declaration_list:
    variable_declaration
      { }
  | variable_declaration_list variable_declaration
  ;
Translated into English, this means: “A declaration can (among
other things like types and constants, omitted here) consist of the
keyword (lexical token) var
followed by a `variable
declaration list'. A `variable declaration list' in turn consists of
one or more `variable declarations'.” (The latter explanation
requires that you understand the recursive nature of the definition
of variable_declaration_list.)
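The recursive definition can be read as a loop: the first alternative demands one declaration, and each further use of the second alternative appends another. The following C sketch illustrates this under strong simplifications. The token codes and function names are hypothetical and do not come from parse.y, and a declaration is reduced to `identifier ':' identifier ';'` instead of the full rule. It is hand-written code for illustration, not what bison generates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token codes; GPC's real tokens are defined in parse.y. */
enum token { T_VAR, T_IDENT, T_COLON, T_SEMICOLON, T_END };

/* Simplified variable_declaration: IDENT ':' IDENT ';'
   Returns the position after the declaration, or 0 if none starts at pos.
   The token array must be terminated by T_END. */
static size_t match_variable_declaration(const enum token *t, size_t pos)
{
    if (t[pos] == T_IDENT && t[pos + 1] == T_COLON &&
        t[pos + 2] == T_IDENT && t[pos + 3] == T_SEMICOLON)
        return pos + 4;
    return 0;
}

/* p_var variable_declaration_list
   Returns the number of declarations found, or -1 on a syntax error. */
int count_var_declarations(const enum token *t)
{
    size_t pos = 0, next;
    int count;

    assert(t != NULL);
    if (t[pos++] != T_VAR)
        return -1;

    /* First alternative: at least one variable_declaration is required. */
    next = match_variable_declaration(t, pos);
    if (next == 0)
        return -1;
    pos = next;
    count = 1;

    /* Second alternative, applied repeatedly:
       variable_declaration_list variable_declaration */
    while ((next = match_variable_declaration(t, pos)) != 0) {
        pos = next;
        count++;
    }
    return t[pos] == T_END ? count : -1;
}
```

For a token sequence corresponding to `var a: Integer; b: Real;` the function counts two declarations; `var` with nothing after it is rejected, because the list must contain at least one declaration.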
Now we can go on and search for variable_declaration:
variable_declaration:
    id_list_limited ':' type_denoter_with_attributes
      { [...] }
    absolute_or_value_specification optional_variable_directive_list ';'
      { [...] }
  ;
The [...] are placeholders for some C statements, the
semantic actions, which (for the most part) aren't important
for understanding GPC's grammar.
From this you can see that a variable declaration in GNU Pascal
consists of an identifier list, followed by a colon, a “type denoter
with attributes”, an “absolute or value specification” and an
“optional variable directive list”, terminated by a semicolon.
Some of these parts are easy to understand; the others you can look
up in parse.y. Remember that the reserved word var
precedes all this.
Now you know how to get the exact grammar of the GNU Pascal language from the source.
The semantic actions, not shown above, are in some sense the most
important part of the bison source, because they are responsible for
the generation of the intermediate code of the GNU Pascal front-end,
the so-called tree nodes (which are used to represent most
things in the compiler). For instance, the C code in “type
denoter” returns (assigns to $$) information about the type
in a variable of type tree.
The “variable declaration” gets this and other information in the
numbered arguments ($1 etc.) and passes it to some C
functions declared in the other source files. Generally, those
functions do the real work, while the main job of the C statements
in the parser is to call them with the right arguments.
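The flow of values between actions can be imitated in plain C. The sketch below is only an illustration: `tree_node`, `semantic_value`, `make_type`, `declare_variable` and `variable_declaration_action` are hypothetical names, not GPC's real types or functions, and the identifier list is reduced to a single identifier. One function plays the role of the “type denoter” action (its return value stands for `$$`), another plays the role of the “variable declaration” action, which reads its numbered arguments and hands them to a helper that does the work.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the compiler's tree nodes. */
typedef struct tree_node {
    const char *name;              /* identifier or type name */
    const struct tree_node *type;  /* NULL for a type node */
} tree_node;

/* Hypothetical semantic value: what each grammar symbol carries. */
typedef union {
    const char *text;  /* for tokens such as identifiers */
    tree_node node;    /* for symbols whose action built a node */
} semantic_value;

/* Role of the "type denoter" action: the result corresponds to $$. */
tree_node make_type(const char *type_name)
{
    tree_node n = { type_name, NULL };
    return n;
}

/* Role of a function "declared in the other source files":
   this is where the real work would happen. */
tree_node declare_variable(const char *id, const tree_node *type)
{
    tree_node n = { id, type };
    assert(id != NULL && type != NULL);
    return n;
}

/* Role of the "variable declaration" action:
   values[0] corresponds to $1 (identifier), values[1] to $2 (':'),
   values[2] to $3 (the type built by the rule above). */
tree_node variable_declaration_action(const semantic_value *values)
{
    return declare_variable(values[0].text, &values[2].node);
}
```

The action itself is a one-liner that only selects the right arguments, which mirrors the division of labour described above: the parser's C statements call, and the called functions do the work. The returned node points into the `values` array, so in this sketch the array must outlive the result.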
This, the parser, is the place where it becomes Pascal.