.\" $Id$
.\" troff -ms m2ref.doc
.TL
The ACK Modula-2 Compiler
.AU
Ceriel J.H. Jacobs
.AI
Department of Mathematics and Computer Science
Vrije Universiteit
Amsterdam
The Netherlands
.AB no
.AE
.NH
Introduction
.PP
This document describes the implementation-specific features of the
ACK Modula-2 compiler.
It is not intended to teach Modula-2 programming.
For a description of the Modula-2 language,
the reader is referred to [1].
.PP
The ACK Modula-2 compiler is currently available for use with the VAX,
Motorola MC68020,
Motorola MC68000,
PDP-11,
and Intel 8086 code-generators.
For the 8086,
MC68000,
and MC68020,
floating point emulation is used.
This is made available with the \fI-fp\fP
option,
which must be passed to \fIack\fP[4,5].
.NH
The language implemented
.PP
This section discusses the deviations from the Modula-2 language as described
in the "Report on The Programming Language Modula-2",
as it appeared in [1],
from now on referred to as "the Report".
It also covers the points where
the Report leaves room for interpretation.
The section numbers
mentioned are the section numbers of the Report.
.NH 2
Syntax (section 2)
.PP
The syntax recognized is that of the Report,
with some extensions to
also recognize the syntax of an earlier definition,
given in [2].
Only one compilation unit per file is accepted.
.NH 2
Vocabulary and Representation (section 3)
.PP
The input "\f(CW10..\fP" is parsed as two tokens: "\f(CW10\fP" and "\f(CW..\fP".
.PP
The empty string \f(CW""\fP has type
.DS
.ft CW
ARRAY [0 .. 0] OF CHAR
.ft P
.DE
and contains one character: \f(CW0C\fP.
.PP
When the text of a comment starts with a '\f(CW$\fP',
it may be a pragma.
Currently,
the following pragmas exist:
.DS
.ft CW
(*$F (F stands for Foreign) *)
(*$R[+|-] (Runtime checks, on or off, default on) *)
(*$A[+|-] (Array bound checks, on or off, default off) *)
(*$U (Allow for underscores within identifiers) *)
.ft P
.DE
The Foreign pragma is only meaningful in a \f(CWDEFINITION MODULE\fP,
and indicates that this
\f(CWDEFINITION MODULE\fP describes an interface to a module written in another
language (for instance C,
Pascal,
or EM).
Runtime checks that can be disabled are:
range checks,
\f(CWCARDINAL\fP overflow checks,
checks when assigning a \f(CWCARDINAL\fP to an \f(CWINTEGER\fP and vice versa,
and checks that \f(CWFOR\fP-loop control-variables are not changed
in the body of the loop.
Array bound checks can be enabled,
because many EM implementations do not
implement the array bound checking of the EM array instructions.
When enabled,
the compiler generates a check before generating an
EM array instruction.
Even when underscores are enabled,
they still may not start an identifier.
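.PP
As an illustration only,
the following sketch shows how these pragmas might appear in source text.
The module names \f(CWCLib\fP and \f(CWChecks\fP and the procedure \f(CWabs\fP
are made up for the example.
Placing the Foreign pragma before the module header,
and the assumption that \f(CW(*$R-*)\fP and \f(CW(*$R+*)\fP take effect from the
point where they appear,
are assumptions as well.
.DS
.ft CW
(*$F*)
DEFINITION MODULE CLib;
  (* hypothetical interface to a routine written in C *)
  PROCEDURE abs(i: INTEGER): INTEGER;
END CLib.

MODULE Checks;
  VAR i: INTEGER; c: CARDINAL;
BEGIN
  c := 1;
(*$R-*)
  i := c;   (* no CARDINAL-to-INTEGER check expected here *)
(*$R+*)
  i := c    (* checked again *)
END Checks.
.ft P
.DE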
.PP
Constants of type \f(CWLONGINT\fP are integers with a suffix letter \f(CWD\fP
(for instance \f(CW1987D\fP).
Constants of type \f(CWLONGREAL\fP have suffix \f(CWD\fP if a scale factor is missing,
or have \f(CWD\fP in place of \f(CWE\fP in the scale factor (for instance \f(CW1.0D\fP,
\f(CW0.314D1\fP).
This addition was made
because there was no way to indicate long constants,
and also because the addition was made in Wirth's newest Modula-2 compiler.
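.PP
A short sketch of such constants in a declaration;
the identifiers are made up for the example:
.DS
.ft CW
CONST
  Year   = 1987D;     (* LONGINT *)
  One    = 1.0D;      (* LONGREAL, no scale factor *)
  Approx = 0.314D1;   (* LONGREAL, D in place of E *)
  Short  = 0.314E1;   (* REAL, for comparison *)
.ft P
.DE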
.NH 2
Declarations and scope rules (section 4)
.PP
Standard identifiers are considered to be predeclared,
and valid in all
parts of a program.
They are called \fIpervasive\fP.
Unfortunately,
the Report does not state how this pervasiveness is accomplished.
However,
page 87 of [1] states: "Standard identifiers are automatically
imported into all modules".
Our implementation therefore allows
redeclarations of standard identifiers within procedures,
but not within
modules.
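.PP
The following sketch illustrates the rule.
The module and procedure names are made up,
and the choice of \f(CWCARDINAL\fP as the redeclared identifier is arbitrary.
.DS
.ft CW
MODULE Pervasive;

  PROCEDURE P;
    TYPE CARDINAL = [0 .. 99];   (* accepted: redeclared in a procedure *)
    VAR c: CARDINAL;
  BEGIN
    c := 42
  END P;

  (* The same TYPE declaration placed here, at module level,
     would be rejected. *)

BEGIN
  P
END Pervasive.
.ft P
.DE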
.NH 2
Constant expressions (section 5)
.PP
Each operand of a constant expression must be a constant:
a string,
a number,
a set,
an enumeration literal,
a qualifier denoting a
constant expression,
a type transfer with a constant argument,
or one of the standard procedures
\f(CWABS\fP,
\f(CWCAP\fP,
\f(CWCHR\fP,
\f(CWLONG\fP,
\f(CWMAX\fP,
\f(CWMIN\fP,
\f(CWODD\fP,
\f(CWORD\fP,
\f(CWSIZE\fP,
\f(CWSHORT\fP,
\f(CWTSIZE\fP,
or \f(CWVAL\fP,
with constant argument(s);
\f(CWTSIZE\fP and \f(CWSIZE\fP may also have a variable as argument.
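.PP
A few declarations that fit this description are sketched below.
The identifiers are made up for the example.
.DS
.ft CW
CONST
  N       = 10;
  Limit   = 2 * N + 1;           (* numbers and a constant qualifier *)
  Upper   = CAP("a");            (* standard procedure, constant argument *)
  Last    = ORD(MAX(CHAR));      (* nested standard procedures *)
  WordLen = 8 * SIZE(INTEGER);   (* SIZE applied to a type *)
.ft P
.DE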
.PP
Floating point expressions are never evaluated at compile time,
because the compiler basically functions as a cross-compiler,
and thus cannot
use the floating point instructions of the machine on which it runs.
Also,
\f(CWMAX(REAL)\fP and \f(CWMIN(REAL)\fP are not allowed.
.NH 2
Type declarations (section 6)
.NH 3
Basic types (section 6.1)
.PP
The type \f(CWCHAR\fP includes the ASCII character set as a subset.
Values range from
\f(CW0C\fP to \f(CW377C\fP,
not from \f(CW0C\fP to \f(CW177C\fP.
.NH 3
Enumerations (section 6.2)
.PP
The maximum number of enumeration literals in any one enumeration type
is \f(CWMAX(INTEGER)\fP.
.NH 3
Record types (section 6.5)
.PP
The syntax of variant sections in [1] is different from the one in [2].
Our implementation recognizes both,
giving a warning for the older one.
However,
see section 3 of this document (Backwards compatibility).
.NH 3
Set types (section 6.6)
.PP
The only limitation imposed by the compiler is that the base type of the
set must be a subrange type,
an enumeration type,
\f(CWCHAR\fP,
or \f(CWBOOLEAN\fP.
So,
the lower bound may be negative.
However,
if a negative lower bound is used,
the compiler gives a warning of the \fIrestricted\fP class (see the manual
page of the compiler).
.PP
The standard type \f(CWBITSET\fP is defined as
.DS
.ft CW
TYPE BITSET = SET OF [0 .. 8*SIZE(INTEGER)-1];
.ft P
.DE
.NH 2
Expressions (section 8)
.NH 3
Operators (section 8.2)
.NH 4
Arithmetic operators (section 8.2.1)
.PP
The Report does not specify the priority of the unary
operators \f(CW+\fP or \f(CW-\fP:
it does not specify whether
.DS
.ft CW
- 1 + 1
.ft P
.DE
means
.DS
.ft CW
- (1 + 1)
.ft P
.DE
or
.DS
.ft CW
(-1) + 1
.ft P
.DE
I have seen some compilers that implement the first alternative,
and others that implement the second.
Our compiler implements the second.
This choice is suggested by the fact that the priority of the unary operators is not specified,
which might indicate that it is the same as that of their binary counterparts;
the rule about left to right evaluation then decides for the second.
On the other hand one might argue that,
since the grammar only allows for one unary operator in a simple expression,
it must apply to the whole simple expression,
not just the first term.
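.PP
Written out for the example above,
the two readings give different values:
.DS
.ft CW
(-1) + 1   =  0     (* the second reading, implemented here *)
- (1 + 1)  = -2     (* the first reading *)
.ft P
.DE
Writing the parentheses explicitly avoids depending on either reading.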
.NH 2
Statements (section 9)
.NH 3
Assignments (section 9.1)
.PP
The Report does not define the evaluation order in an assignment.
Our compiler certainly chooses an evaluation order,
but it is explicitly left undefined.
Therefore,
programs that depend on it may cease to work later.
.PP
The types \f(CWINTEGER\fP and \f(CWCARDINAL\fP are assignment-compatible with
\f(CWLONGINT\fP,
and \f(CWREAL\fP is assignment-compatible with \f(CWLONGREAL\fP.
.NH 3
Case statements (section 9.5)
.PP
The size of the type of the case-expression must be less than or equal to
the word-size.
.PP
The Report does not specify what happens if the value of the case-expression
does not occur as a label of any case,
and there is no \f(CWELSE\fP-part.
In our implementation,
this results in a runtime error.
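.PP
A minimal sketch of a case statement that avoids this error by
supplying an \f(CWELSE\fP-part;
the module and variable names are made up for the example:
.DS
.ft CW
MODULE CaseDemo;
  VAR i, k: INTEGER;
BEGIN
  i := 3;
  CASE i OF
    1: k := 10
  | 2: k := 20
  ELSE
    k := 0    (* without this ELSE-part, i = 3 gives a runtime error *)
  END
END CaseDemo.
.ft P
.DE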
.NH 3
For statements (section 9.8)
.PP
The Report does not specify the legal types for a control variable.
Our implementation allows the basic types (except \f(CWREAL\fP),
enumeration types,
and subranges.
A runtime warning is generated when the value of the control variable
is changed by the statement sequence that forms the body of the loop,
unless runtime checking is disabled.
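.PP
The sketch below shows control variables of an enumeration type and of
the basic type \f(CWCHAR\fP.
All names are made up for the example.
.DS
.ft CW
MODULE ForDemo;
  TYPE Color = (red, green, blue);
  VAR c: Color; ch: CHAR; n: CARDINAL;
BEGIN
  n := 0;
  FOR c := red TO blue DO INC(n) END;    (* enumeration type *)
  FOR ch := "a" TO "z" DO INC(n) END     (* basic type CHAR *)
  (* an assignment to c or ch inside these bodies would be
     reported at run time, unless runtime checking is disabled *)
END ForDemo.
.ft P
.DE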
.NH 3
Return and exit statements (section 9.11)
.PP
The Report does not specify which result-types are legal.
Our implementation allows any result type.
.NH 2
Procedure declarations (section 10)
.PP
Function procedures must exit through a \f(CWRETURN\fP statement,
or a runtime error occurs.
.NH 3
Standard procedures (section 10.2)
.PP
Our implementation supports \f(CWNEW\fP and \f(CWDISPOSE\fP
for backwards compatibility,
but issues warnings for their use.
However,
see section 3 of this document (Backwards compatibility).
.PP
Also,
some new standard procedures were added,
similar to the new standard procedures in Wirth's newest compiler:
.IP \-
\f(CWLONG\fP converts an argument of type \f(CWINTEGER\fP or \f(CWREAL\fP to the
type \f(CWLONGINT\fP or \f(CWLONGREAL\fP, respectively.
.IP \-
\f(CWSHORT\fP performs the inverse transformation,
without range checks.
.IP \-
\f(CWFLOATD\fP is analogous to \f(CWFLOAT\fP,
but yields a result of type
\f(CWLONGREAL\fP.
.IP \-
\f(CWTRUNCD\fP is analogous to \f(CWTRUNC\fP,
but yields a result of type
\f(CWLONGINT\fP.
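.PP
The four procedures listed above might be used as in the following sketch.
The module and variable names are made up,
and the argument of \f(CWFLOATD\fP is assumed to be of type \f(CWCARDINAL\fP,
as for \f(CWFLOAT\fP in the Report.
.DS
.ft CW
MODULE Convert;
  VAR i: INTEGER; c: CARDINAL; li: LONGINT;
      r: REAL; lr: LONGREAL;
BEGIN
  i := 7; c := 7; r := 1.5;
  li := LONG(i);      (* INTEGER to LONGINT *)
  lr := LONG(r);      (* REAL to LONGREAL *)
  i  := SHORT(li);    (* LONGINT to INTEGER, no range check *)
  r  := SHORT(lr);    (* LONGREAL to REAL *)
  lr := FLOATD(c);    (* like FLOAT, but the result is a LONGREAL *)
  li := TRUNCD(lr)    (* like TRUNC, but the result is a LONGINT *)
END Convert.
.ft P
.DE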
.NH 2
System-dependent facilities (section 12)
.PP
The type \f(CWBYTE\fP is added to the \f(CWSYSTEM\fP module.
It occupies a storage unit of 8 bits.
\f(CWARRAY OF BYTE\fP has a similar effect to \f(CWARRAY OF WORD\fP,
but is safer.
In some obscure cases the \f(CWARRAY OF WORD\fP mechanism does not quite
work properly.
.PP
The procedure \f(CWIOTRANSFER\fP is not implemented.
.NH 1
Backwards compatibility
.PP
Besides recognizing the language as described in [1],
the compiler recognizes most of the language described in [2],
for backwards compatibility.
It warns the user about old-fashioned
constructions (constructions that [1] does not allow).
If the \fI-Rm2-3\fP option (see [6]) is passed to \fIack\fP,
this backwards compatibility feature is disabled.
Also,
it may not be present on some
smaller machines,
like the PDP-11.
.NH 1
Compile time errors
.PP
The compile time error messages are intended to be self-explanatory,
and are not listed here.
The compiler also sometimes issues warnings,
recognizable by a warning-classification between parentheses.
Currently,
there are 3 classifications:
.IP "(old-fashioned use)"
.br
These warnings are given on constructions that are not allowed by [1],
but are allowed by [2].
.IP (strict)
.br
These warnings are given on constructions that are supported by the
ACK Modula-2 compiler,
but might not be supported by others.
Examples: functions returning structured types,
SET types of subranges with
negative lower bound.
.IP (warning)
.br
The other warnings,
such as warnings about variables that are never assigned,
never used,
etc.
.NH 1
Runtime errors
.PP
The ACK Modula-2 compiler produces code for an EM machine as defined in [3].
Therefore,
it depends on the implementation
of the EM machine for the detection of some of the runtime errors that could occur.
.PP
The \fITraps\fP module enables the user to install his own runtime
error handler.
The default one just displays what happened and exits.
Basically,
a trap handler is just a procedure that takes an INTEGER as
parameter.
The INTEGER is the trap number.
This INTEGER can be one of the
EM trap numbers,
listed in [3],
or one of the numbers listed in the
\fITraps\fP definition module.
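.PP
A handler therefore has the general shape sketched below.
The procedure name and its body are made up for the example.
The call that installs the handler is deliberately not shown;
its name and signature should be taken from the \fITraps\fP definition module.
.DS
.ft CW
PROCEDURE MyHandler(trapno: INTEGER);
BEGIN
  (* trapno is either an EM trap number [3] or one of the
     numbers listed in the Traps definition module *)
  HALT
END MyHandler;
.ft P
.DE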
.PP
The following runtime errors may occur:
.IP "array bound error"
.br
The detection of this error depends on the EM implementation.
.IP "range bound error"
.br
Range bound errors are always detected,
unless runtime checks are disabled.
.IP "set bound error"
.br
The detection of this error depends on the EM implementation.
The current implementations detect this error.
.IP "integer overflow"
.br
The detection of this error depends on the EM implementation.
.IP "cardinal overflow"
.br
This error is detected,
unless runtime checks are disabled.
.IP "cardinal underflow"
.br
This error is detected,
unless runtime checks are disabled.
.IP "real overflow"
.br
The detection of this error depends on the EM implementation.
.IP "real underflow"
.br
The detection of this error depends on the EM implementation.
.IP "divide by 0"
.br
The detection of this error depends on the EM implementation.
.IP "divide by 0.0"
.br
The detection of this error depends on the EM implementation.
.IP "undefined integer"
.br
The detection of this error depends on the EM implementation.
.IP "undefined real"
.br
The detection of this error depends on the EM implementation.
.IP "conversion error"
.br
This error occurs when assigning a negative value of type INTEGER to a
variable of type CARDINAL,
or when assigning a value of type CARDINAL that is greater than MAX(INTEGER)
to a variable of type INTEGER.
It is detected,
unless runtime checking is disabled.
.IP "stack overflow"
.br
The detection of this error depends on the EM implementation.
.IP "heap overflow"
.br
The detection of this error depends on the EM implementation.
It might happen when ALLOCATE fails.
.IP "case error"
.br
This error occurs when none of the cases in a CASE statement is selected,
and the CASE statement has no ELSE part.
The detection of this error depends on the EM implementation.
All current EM implementations detect this error.
.IP "stack size of process too large"
.br
This is most likely to happen if the reserved space for a coroutine stack
is too small.
In this case,
increase the size of the area given to
\f(CWNEWPROCESS\fP.
It can also happen if the stack needed for the main
process is too large and there are coroutines.
In this case,
the only fix is to reduce the stack size needed by the main process,
for instance by avoiding local arrays.
.IP "too many nested traps + handlers"
.br
This error can only occur when the user has installed his own trap handler.
It means that during execution of the trap handler another trap has occurred,
and that this has happened several times.
In some cases,
this is an error because of overflow of some internal tables.
.IP "no RETURN from function procedure"
.br
This error occurs when a function procedure does not return properly
("falls" through).
.IP "illegal instruction"
.br
This error might occur when floating point operations are used on an
implementation that does not have floating point.
.PP
In addition,
some of the library modules may give error messages.
The \fITraps\fP module has a suitable mechanism for this.
.NH 1
Calling the compiler
.PP
See [4,5,6] for a detailed explanation.
.PP
The compiler itself has no version checking mechanism.
A special linker
would be needed to do that.
Therefore,
a makefile generator is included [7].
.NH 1
The procedure call interface
.PP
Parameters are pushed on the stack in reversed order,
so that the EM AB
(argument base) register indicates the first parameter.
For a VAR parameter,
its address is passed;
for a value parameter,
its value is passed.
The only exception to this rule is with conformant arrays.
For a conformant array,
both the address of the array
and an array descriptor are
passed.
The descriptor is an EM array descriptor.
It consists of three
fields: the lower bound (always 0),
upper bound - lower bound,
and the size of the elements.
The descriptor is pushed first.
If the parameter is a value parameter,
the called routine must make sure
that its value is never changed,
for instance by making its own copy
of the array.
The Modula-2 compiler does exactly this.
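.PP
Seen from Modula-2,
the three fields of the descriptor could be pictured as the record below.
The type and field names are made up for this illustration,
and the use of \f(CWINTEGER\fP for the fields is an assumption;
the actual layout is that of the EM array descriptor [3].
.DS
.ft CW
TYPE ArrayDescriptor = RECORD
       lower: INTEGER;   (* lower bound, always 0 *)
       range: INTEGER;   (* upper bound - lower bound *)
       size:  INTEGER    (* size of one element *)
     END;
.ft P
.DE
For instance,
when a variable of type \f(CWARRAY [5 .. 14] OF INTEGER\fP
is passed to a parameter of type \f(CWARRAY OF INTEGER\fP,
the fields would hold 0, 9, and \f(CWSIZE(INTEGER)\fP.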
.PP
When the size of the return value of a function procedure is larger than
the maximum of \f(CWSIZE(LONGREAL)\fP and twice the pointer-size,
the caller reserves this space on the stack,
above the parameters.
The callee then stores
its result there,
and returns no other value.
.NH 1
References
.IP [1]
Niklaus Wirth,
.I
Programming in Modula-2, third, corrected edition,
.R
Springer-Verlag, Berlin (1985)
.IP [2]
Niklaus Wirth,
.I
Programming in Modula-2,
.R
Springer-Verlag, Berlin (1983)
.IP [3]
A.S. Tanenbaum, J.W. Stevenson, Hans van Staveren, E.G. Keizer,
.I
Description of a machine architecture for use with block structured languages,
.R
Informatica rapport IR-81, Vrije Universiteit, Amsterdam
.IP [4]
UNIX manual \fIack\fP(1)
.IP [5]
UNIX manual \fImodula-2\fP(1)
.IP [6]
UNIX manual \fIem_m2\fP(6)
.IP [7]
UNIX manual \fIm2mm\fP(1)