implementation, constructor, destructor, operator, uses, import and initialization as weak keywords

In ISO 7185 Pascal, each section of the source code is uniquely introduced by a keyword (program, const, type, var, label, procedure, function, begin). However, the end of some of these sections (in particular const, type and var) is not intrinsically defined; it is determined only by the context, i.e. by the next of these “critical” keywords. E.g., var Foo: Integer; can be a complete variable declaration part (if one of those keywords follows), or only a part of one, as in var Foo: Integer; Bar: Integer;. (For the other keywords, the end is intrinsically defined: the program heading and label declarations end with the next ;. For procedure and function it's a little more complicated, due to forward declarations, but still well-defined, and begin ends with the matching end.) The same applies to sections within one routine, except that program cannot occur there.

Extended Pascal adds to (in to begin do and to end do) and end (in interface modules and implementation modules without initializer and finalizer) to those “critical” keywords.

But it also adds two keywords which are not defined in classic Pascal, namely export and import. They can only occur at the beginning of the source or of a module implementation, however, so they have fewer chances to conflict with those other keywords. The same applies to UCSD/Borland Pascal's uses for units. (uses terminates at the first ;, whereas export and import, like var etc., do not necessarily.)

The problem gets bigger with UCSD/Borland Pascal's implementation in units. It can occur after the interface part, so it might follow, e.g., a variable declaration part. And it is not an ISO 7185 Pascal keyword.

If we want to treat implementation as a weak keyword, it must not conflict with new identifiers anywhere in the grammar.

However, variable declaration parts are not self-contained in the sense described above, so after a variable declaration it is not immediately clear whether the part is finished or will continue. So this is a place where a new identifier is acceptable. E.g.:

     interface
     
     var
       Bar: Integer;
       Implementation: Integer;

vs.

     interface
     
     var
       Bar: Integer;
     
     implementation

The same applies to implementation after const, type, export and import parts.

The same problem also occurs with the Borland Pascal and Object Pascal keywords constructor and destructor, the Borland Delphi keyword initialization, and the PXSC keyword operator, since the respective declarations can follow variable declaration blocks etc. It also happens with import (though import is only possible after an export part) and with uses if we allow it after other declarations (GPC extension).

Again, we play some lexer tricks. We observe that the new identifier in export, var, const and type is always followed by either ,, : or = while none of the keywords implementation, constructor, destructor, operator, import and uses is ever followed by one of these symbols ... with two exceptions. The first is that operator = is valid, overloading the = operator. Consider:

     type
       Foo = record end;
       Operator = (a, b);  { enum type }

vs.

     type
       Foo = record end;
     
     operator = (a, b: Foo) c: Foo;

To decide whether operator is a keyword, we would have to look ahead six tokens! Anyway, that seems to be a new record (where “record” in this sentence can be read either as a Pascal keyword or in at least one of the usual English meanings ;–).

The other exception is that initialization can, in principle, be followed by (, as in:

     implementation
     
     type
       Foo = Integer;
       Initialization (Obj: Integer)

vs.

     implementation
     
     type
       Foo = Integer;
     
     Initialization
       (Obj as SubObj).Method;

This would require three tokens of look-ahead. However, a ( at the beginning of a statement is quite uncommon, so we simply disallow it directly after Initialization. The use of Initialization as an identifier is therefore not restricted.
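The two look-ahead counts quoted above can be checked mechanically. The following is only a sketch in Python, not GPC's lexer: the tokenizer, the function names and the string constants are all made up for this illustration, and the tokenizer is far cruder than a real Pascal lexer (identifiers, single punctuation characters and brace comments only). It counts how many tokens after the weak keyword must be read before the identifier reading and the keyword reading first differ.

```python
import re

# Hypothetical, deliberately tiny tokenizer: a { ... } comment, an
# identifier/keyword, or any single non-blank character.
TOKEN = re.compile(r"\{[^}]*\}|[A-Za-z_][A-Za-z0-9_]*|\S")


def tokens(source):
    """Return lower-cased tokens, dropping { ... } comments."""
    return [t.lower() for t in TOKEN.findall(source) if not t.startswith("{")]


def lookahead_needed(as_identifier, as_keyword, word):
    """Number of tokens after `word` that must be read until the two
    readings differ (the differing token is counted)."""
    a, b = tokens(as_identifier), tokens(as_keyword)
    rest_a = a[a.index(word) + 1:]
    rest_b = b[b.index(word) + 1:]
    for n, (x, y) in enumerate(zip(rest_a, rest_b), start=1):
        if x != y:
            return n
    return None  # no difference within the shorter token stream


# The example pairs from the text, flattened to one line each.
OPERATOR_AS_IDENTIFIER = "type Foo = record end; Operator = (a, b);  { enum type }"
OPERATOR_AS_KEYWORD = "type Foo = record end; operator = (a, b: Foo) c: Foo;"

INIT_AS_IDENTIFIER = "implementation type Foo = Integer; Initialization (Obj: Integer)"
INIT_AS_KEYWORD = "implementation type Foo = Integer; Initialization (Obj as SubObj).Method;"

print(lookahead_needed(OPERATOR_AS_IDENTIFIER, OPERATOR_AS_KEYWORD, "operator"))  # 6
print(lookahead_needed(INIT_AS_IDENTIFIER, INIT_AS_KEYWORD, "initialization"))    # 3
```

In the operator pair the streams agree on = ( a , b and only differ at the sixth token () versus :). In the Initialization pair they agree on ( obj and differ at the third token (: versus as).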

Doing the six-token look-ahead needed for operator = would be a huge effort and cause some complications as noted above. This seems inappropriate for such an academic example. So, until someone comes up with a clever trick to cope with this case, we give up here and let operator before = be a keyword, so overloading = is possible. This means that operator cannot be used as an export interface, a type or an (untyped) constant, unless the keyword is disabled explicitly or by dialect options. (Letting the parser enable and disable the keyword would not have been an option here either, since the parser would need the same 6-token look-ahead, which it cannot do.)
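The resulting one-token decision can be summarized in a few lines. This is a minimal sketch in Python under stated assumptions, not GPC's actual code: the names WEAK_KEYWORDS, AFTER_NEW_IDENTIFIER and classify_after_declaration are invented here, the function only models a word that appears where an export, var, const or type part might continue, and the special handling after an import part is left out.

```python
# Hypothetical names; this only restates the policy described in the text.
WEAK_KEYWORDS = {
    "implementation", "constructor", "destructor", "operator",
    "uses", "import", "initialization",
}

# Symbols that follow a new identifier in export, var, const and type parts.
AFTER_NEW_IDENTIFIER = {",", ":", "="}


def classify_after_declaration(word, next_token):
    """Decide whether `word` is read as a keyword or as an identifier,
    looking only at the single token that follows it."""
    word = word.lower()  # Pascal is case-insensitive
    if word not in WEAK_KEYWORDS:
        return "identifier"
    if word == "operator" and next_token == "=":
        # Given up: `operator =` is always the overloading of `=`.
        return "keyword"
    if word == "initialization" and next_token == "(":
        # A statement starting with `(` is disallowed after the keyword,
        # so this form is always read as an identifier.
        return "identifier"
    if next_token in AFTER_NEW_IDENTIFIER:
        return "identifier"
    return "keyword"


print(classify_after_declaration("Implementation", ":"))  # identifier
print(classify_after_declaration("Operator", "="))        # keyword
print(classify_after_declaration("Operator", ":"))        # identifier
```

The second and third calls show the cost of the compromise: Operator remains usable as a variable name (followed by :), but not where it would be followed by =, i.e. as a type or an untyped constant.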

You may have noticed that we “forgot” import (in the list of possibly unfinished sections; not in the list of critical following keywords where it was alright; it actually plays both roles in this discussion).

This is because the identifier at the beginning of an import specification can be followed by qualified, only, in, ( or ;. The first two of these are non-standard keywords as well and would therefore conflict with a new identifier after, e.g., uses and operator.

This means that there's no simple general solution. So let's consider the problematic keywords after an import part in detail:

We forbid all of these keywords immediately after an import part. This is achieved using parser precedence rules.