.TL
The ACK Modula-2 Compiler
.AU
Ceriel J.H. Jacobs
.AI
Department of Mathematics and Computer Science
Vrije Universiteit
Amsterdam
The Netherlands
.AB no
.AE
.NH
Introduction
.PP
This document describes the implementation-specific features of the
ACK Modula-2 compiler. It is not intended to teach Modula-2 programming.
For a description of the Modula-2 language, the reader is referred to [1].
.PP
The ACK Modula-2 compiler is currently available for use with the VAX,
Motorola MC68020, Motorola MC68000,
PDP-11, and Intel 8086 code-generators.
For the 8086, MC68000, and MC68020,
floating point emulation is used. This is made available with the \fI-fp\fP
option, which must be passed to \fIack\fP[4,5].
.NH
The language implemented
.PP
This section discusses the deviations from the Modula-2 language as described
in the "Report on The Programming Language Modula-2", as it appeared in [1],
from now on referred to as "the Report".
It also covers the points where the Report leaves room for interpretation.
The section numbers
mentioned are the section numbers of the Report.
.NH 2
Syntax (section 2)
.PP
The syntax recognized is that of the Report, with some extensions to
also recognize the syntax of an earlier definition, given in [2].
Only one compilation unit per file is accepted.
.NH 2
Vocabulary and Representation (section 3)
.PP
The input "\f(CW10..\fP" is parsed as two tokens: "\f(CW10\fP" and "\f(CW..\fP".
.PP
The empty string \f(CW""\fP has type
.DS
.ft CW
ARRAY [0 .. 0] OF CHAR
.ft P
.DE
and contains one character: \f(CW0C\fP.
.PP
When the text of a comment starts with a '\f(CW$\fP', it may be a pragma.
Currently, the following pragmas exist:
.DS
.ft CW
(*$F      (F stands for Foreign) *)
(*$R[+|-] (Runtime checks, on or off, default on) *)
(*$A[+|-] (Array bound checks, on or off, default off) *)
(*$U      (Allow for underscores within identifiers) *)
.ft P
.DE
The Foreign pragma is only meaningful in a \f(CWDEFINITION MODULE\fP,
and indicates that this
\f(CWDEFINITION MODULE\fP describes an interface to a module written in another
language (for instance C, Pascal, or EM).
Runtime checks that can be disabled are:
range checks,
\f(CWCARDINAL\fP overflow checks,
checks when assigning a \f(CWCARDINAL\fP to an \f(CWINTEGER\fP and vice versa,
and checks that \f(CWFOR\fP-loop control-variables are not changed
in the body of the loop.
Array bound checks can be enabled, because many EM implementations do not
implement the array bound checking of the EM array instructions.
When enabled, the compiler generates a check before generating an
EM array instruction.
Even when underscores are enabled, they still may not start an identifier.
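.PP
As an illustration, the following sketch shows the \f(CWU\fP and \f(CWR\fP
pragmas in use. The module name and its variables are invented for this
example, and the placement of the pragma comments reflects an assumption
about the point from which they take effect.
.DS
.ft CW
(*$U*)
MODULE PragmaDemo;
  VAR
    line_count: CARDINAL;      (* underscore allowed by the U pragma *)
    digit: [0 .. 9];
BEGIN
  line_count := 1987;
  (*$R-*)
  digit := line_count MOD 10;  (* runtime checks are off here *)
  (*$R+*)
END PragmaDemo.
.ft P
.DE
An identifier such as \f(CW_count\fP would still be rejected, because an
underscore may not start an identifier.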
.PP
Constants of type \f(CWLONGINT\fP are integers with a suffix letter \f(CWD\fP
(for instance \f(CW1987D\fP).
Constants of type \f(CWLONGREAL\fP have suffix \f(CWD\fP if a scale factor is missing,
or have \f(CWD\fP in place of \f(CWE\fP in the scale factor (for instance \f(CW1.0D\fP,
\f(CW0.314D1\fP).
This addition was made because there was no way to indicate long constants,
and also because the addition was made in Wirth's newest Modula-2 compiler.
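.PP
The notation for long constants can be summarized with a small sketch.
The module and constant names are invented for this example; the literals
themselves follow the rules given above.
.DS
.ft CW
MODULE LongConstants;
  CONST
    Year  = 1987D;      (* LONGINT: integer with suffix D *)
    One   = 1.0D;       (* LONGREAL: no scale factor, suffix D *)
    Pi10  = 0.314D1;    (* LONGREAL: D in place of E *)
    Plain = 0.314E1;    (* REAL: ordinary scale factor *)
  VAR
    li: LONGINT;
    lr: LONGREAL;
    r:  REAL;
BEGIN
  li := Year;
  lr := One + Pi10;
  r  := Plain
END LongConstants.
.ft P
.DE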
.NH 2
Declarations and scope rules (section 4)
.PP
Standard identifiers are considered to be predeclared, and valid in all
parts of a program. They are called \fIpervasive\fP.
Unfortunately, the Report does not state how this pervasiveness is accomplished.
However, page 87 of [1] states: "Standard identifiers are automatically
imported into all modules". Our implementation therefore allows
redeclarations of standard identifiers within procedures, but not within
modules.
.NH 2
Constant expressions (section 5)
.PP
Each operand of a constant expression must be a constant:
a string, a number, a set, an enumeration literal, a qualifier denoting a
constant expression, a type transfer with a constant argument, or
one of the standard procedures
\f(CWABS\fP, \f(CWCAP\fP, \f(CWCHR\fP, \f(CWLONG\fP, \f(CWMAX\fP, \f(CWMIN\fP,
\f(CWODD\fP, \f(CWORD\fP,
\f(CWSIZE\fP, \f(CWSHORT\fP, \f(CWTSIZE\fP, or \f(CWVAL\fP, with constant argument(s);
\f(CWTSIZE\fP and \f(CWSIZE\fP may also have a variable as argument.
.PP
Floating point expressions are never evaluated at compile time, because
the compiler basically functions as a cross-compiler, and thus cannot
use the floating point instructions of the machine on which it runs.
Also, \f(CWMAX(REAL)\fP and \f(CWMIN(REAL)\fP are not allowed.
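.PP
The following declarations sketch what is accepted as a constant
expression under the rules above. All names are invented for this example.
.DS
.ft CW
MODULE ConstDemo;
  CONST
    BufSize  = 512;
    LastIdx  = BufSize - 1;       (* numbers and constant qualifiers *)
    Upper    = CAP("a");          (* standard procedure, constant argument *)
    Biggest  = MAX(INTEGER);
    WordBits = 8 * SIZE(INTEGER);
  VAR
    buf: ARRAY [0 .. LastIdx] OF CHAR;
  CONST
    BufBytes = SIZE(buf);         (* SIZE may take a variable *)
    (* MAX(REAL) and MIN(REAL) would be rejected here *)
END ConstDemo.
.ft P
.DE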
.NH 2
Type declarations (section 6)
.NH 3
Basic types (section 6.1)
.PP
The type \f(CWCHAR\fP includes the ASCII character set as a subset.
Values range from
\f(CW0C\fP to \f(CW377C\fP, not from \f(CW0C\fP to \f(CW177C\fP.
.NH 3
Enumerations (section 6.2)
.PP
The maximum number of enumeration literals in any one enumeration type
is \f(CWMAX(INTEGER)\fP.
.NH 3
Record types (section 6.5)
.PP
The syntax of variant sections in [1] is different from the one in [2].
Our implementation recognizes both, giving a warning for the older one.
However, see section 3 of this document (Backwards compatibility).
.NH 3
Set types (section 6.6)
.PP
The only limitation imposed by the compiler is that the base type of the
set must be a subrange type, an enumeration type, \f(CWCHAR\fP, or
\f(CWBOOLEAN\fP.
So, the lower bound may be negative.
However, if a negative lower bound is used,
the compiler gives a warning of the \fIrestricted\fP class (see the manual
page of the compiler).
.PP
The standard type \f(CWBITSET\fP is defined as
.DS
.ft CW
TYPE BITSET = SET OF [0 .. 8*SIZE(INTEGER)-1];
.ft P
.DE
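.PP
As a sketch of the set rules above, the following module declares a set
type over a subrange with a negative lower bound, which is accepted but
draws a warning, and uses \f(CWBITSET\fP with its lowest and highest
element. The module and type names are invented for this example.
.DS
.ft CW
MODULE SetDemo;
  TYPE
    Offset    = [-8 .. 7];
    OffsetSet = SET OF Offset;    (* accepted, but gives a warning *)
  VAR
    s: OffsetSet;
    b: BITSET;
BEGIN
  s := OffsetSet{-8, 0, 7};
  INCL(s, -1);
  b := {0, 8*SIZE(INTEGER) - 1}   (* lowest and highest element *)
END SetDemo.
.ft P
.DE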
.NH 2
Expressions (section 8)
.NH 3
Operators (section 8.2)
.NH 4
Arithmetic operators (section 8.2.1)
.PP
The Report does not specify the priority of the unary
operators \f(CW+\fP and \f(CW-\fP:
it does not specify whether
.DS
.ft CW
- 1 + 1
.ft P
.DE
means
.DS
.ft CW
- (1 + 1)
.ft P
.DE
or
.DS
.ft CW
(-1) + 1
.ft P
.DE
I have seen some compilers that implement the first alternative, and others
that implement the second. Our compiler implements the second, which is
suggested by the fact that their priority is not specified, which might
indicate that it is the same as that of their binary counterparts.
The left-to-right rule then decides in favor of the second.
On the other hand, one might argue that, since the grammar only allows
for one unary operator in a simple expression, it must apply to the
whole simple expression, not just the first term.
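.PP
Because compilers differ on this point, portable programs can use explicit
parentheses. The sketch below, with invented names, shows the
interpretation chosen by our compiler next to the fully parenthesized forms.
.DS
.ft CW
MODULE UnaryDemo;
  VAR
    a, b, x, y, z: INTEGER;
BEGIN
  a := 1; b := 1;
  x := - a + b;      (* our compiler: (-a) + b, so x = 0 *)
  y := (-a) + b;     (* explicit, y = 0 on any compiler *)
  z := - (a + b)     (* explicit, z = -2 on any compiler *)
END UnaryDemo.
.ft P
.DE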
.NH 2
Statements (section 9)
.NH 3
Assignments (section 9.1)
.PP
The Report does not define the evaluation order in an assignment.
Our compiler certainly chooses an evaluation order, but it is explicitly
left undefined. Therefore, programs that depend on it may cease to
work later.
.PP
The types \f(CWINTEGER\fP and \f(CWCARDINAL\fP are assignment-compatible with
\f(CWLONGINT\fP, and \f(CWREAL\fP is assignment-compatible with \f(CWLONGREAL\fP.
.NH 3
Case statements (section 9.5)
.PP
The size of the type of the case-expression must be less than or equal to
the word-size.
.PP
The Report does not specify what happens if the value of the case-expression
does not occur as a label of any case, and there is no \f(CWELSE\fP-part.
In our implementation, this results in a runtime error.
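.PP
The following sketch, with invented names, shows a \f(CWCASE\fP statement
whose \f(CWELSE\fP-part catches all values that are not listed as labels.
.DS
.ft CW
MODULE CaseDemo;
  VAR
    c: CHAR;
    vowels, others: CARDINAL;
BEGIN
  vowels := 0; others := 0;
  c := "x";
  CASE c OF
    "a", "e", "i", "o", "u": INC(vowels)
  ELSE
    INC(others)
  END
END CaseDemo.
.ft P
.DE
Without the \f(CWELSE\fP-part, the value \f(CW"x"\fP would match no label,
and the program would stop with a runtime error.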
.NH 3
For statements (section 9.8)
.PP
The Report does not specify the legal types for a control variable.
Our implementation allows the basic types (except \f(CWREAL\fP),
enumeration types, and subranges.
A runtime warning is generated when the value of the control variable
is changed by the statement sequence that forms the body of the loop,
unless runtime checking is disabled.
.NH 3
Return and exit statements (section 9.11)
.PP
The Report does not specify which result-types are legal.
Our implementation allows any result type.
.NH 2
Procedure declarations (section 10)
.PP
Function procedures must exit through a RETURN statement, or a runtime error
occurs.
.NH 3
Standard procedures (section 10.2)
.PP
Our implementation supports \f(CWNEW\fP and \f(CWDISPOSE\fP
for backwards compatibility,
but issues warnings for their use.
However, see section 3 of this document (Backwards compatibility).
.PP
Also, some new standard procedures were added, similar to the new standard
procedures in Wirth's newest compiler:
.IP \-
\f(CWLONG\fP converts an argument of type \f(CWINTEGER\fP or \f(CWREAL\fP to the
types \f(CWLONGINT\fP or \f(CWLONGREAL\fP.
.IP \-
\f(CWSHORT\fP performs the inverse transformation, without range checks.
.IP \-
\f(CWFLOATD\fP is analogous to \f(CWFLOAT\fP, but yields a result of type
\f(CWLONGREAL\fP.
.IP \-
\f(CWTRUNCD\fP is analogous to \f(CWTRUNC\fP, but yields a result of type
\f(CWLONGINT\fP.
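.PP
The four added procedures can be sketched as follows. The module and
variable names are invented, and passing a \f(CWCARDINAL\fP to
\f(CWFLOATD\fP assumes that it accepts the same argument type as
\f(CWFLOAT\fP.
.DS
.ft CW
MODULE LongDemo;
  VAR
    c:  CARDINAL;
    i:  INTEGER;
    li: LONGINT;
    r:  REAL;
    lr: LONGREAL;
BEGIN
  i := 42; r := 1.5; c := 7;
  li := LONG(i);      (* INTEGER to LONGINT *)
  lr := LONG(r);      (* REAL to LONGREAL *)
  i  := SHORT(li);    (* LONGINT to INTEGER, no range check *)
  r  := SHORT(lr);    (* LONGREAL to REAL *)
  lr := FLOATD(c);    (* like FLOAT, result is LONGREAL *)
  li := TRUNCD(lr)    (* like TRUNC, result is LONGINT *)
END LongDemo.
.ft P
.DE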
.NH 2
System-dependent facilities (section 12)
.PP
The type \f(CWBYTE\fP is added to the \f(CWSYSTEM\fP module.
It occupies a storage unit of 8 bits.
\f(CWARRAY OF BYTE\fP has a similar effect to \f(CWARRAY OF WORD\fP, but is
safer. In some obscure cases the \f(CWARRAY OF WORD\fP mechanism does not quite
work properly.
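.PP
A typical use of \f(CWARRAY OF BYTE\fP is a procedure that accepts a
variable of any type. The sketch below uses invented names and relies only
on \f(CWHIGH\fP; it assumes that, as with \f(CWARRAY OF WORD\fP, any
variable may be passed as the actual parameter.
.DS
.ft CW
MODULE ByteDemo;
  FROM SYSTEM IMPORT BYTE;
  TYPE
    Point = RECORD x, y: INTEGER END;
  VAR
    p: Point;
    n: CARDINAL;

  PROCEDURE ByteCount(VAR obj: ARRAY OF BYTE): CARDINAL;
  BEGIN
    RETURN HIGH(obj) + 1    (* number of bytes occupied by obj *)
  END ByteCount;

BEGIN
  n := ByteCount(p)
END ByteDemo.
.ft P
.DE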
.PP
The procedure \f(CWIOTRANSFER\fP is not implemented.
.NH 1
Backwards compatibility
.PP
Besides recognizing the language as described in [1], the compiler recognizes
most of the language described in [2], for backwards compatibility.
It warns the user about old-fashioned
constructions (constructions that [1] does not allow).
If the \fI-Rm2-3\fP option (see [6]) is passed to \fIack\fP, this backwards
compatibility feature is disabled. Also, it may not be present on some
smaller machines, like the PDP-11.
.NH 1
Compile time errors
.PP
The compile time error messages are intended to be self-explanatory,
and are not listed here. The compiler also sometimes issues warnings,
recognizable by a warning-classification between parentheses.
Currently, there are three classifications:
.IP "(old-fashioned use)"
.br
These warnings are given on constructions that are not allowed by [1], but are
allowed by [2].
.IP (strict)
.br
These warnings are given on constructions that are supported by the
ACK Modula-2 compiler, but might not be supported by others.
Examples: functions returning structured types, SET types of subranges with
negative lower bound.
.IP (warning)
.br
The other warnings, such as warnings about variables that are never assigned,
never used, etc.
.NH 1
Runtime errors
.PP
The ACK Modula-2 compiler produces code for an EM machine as defined in [3].
Therefore, it depends on the implementation
of the EM machine for the detection of some of the runtime errors that could occur.
.PP
The \fITraps\fP module enables the user to install his own runtime
error handler.
The default one just displays what happened and exits.
Basically, a trap handler is just a procedure that takes an INTEGER as
parameter. The INTEGER is the trap number. This INTEGER can be one of the
EM trap numbers, listed in [3], or one of the numbers listed in the
\fITraps\fP definition module.
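.PP
A trap handler in the sense described above might look as follows. This is
only a sketch: the type name \f(CWHandler\fP and the procedure
\f(CWMyHandler\fP are invented here, and the call that installs a handler
is deliberately not shown, since its name and signature must be taken from
the \fITraps\fP definition module.
.DS
.ft CW
MODULE HandlerDemo;
  TYPE
    Handler = PROCEDURE(INTEGER);   (* shape of a trap handler *)
  VAR
    h: Handler;
    lastTrap: INTEGER;

  PROCEDURE MyHandler(trapno: INTEGER);
  BEGIN
    lastTrap := trapno    (* an EM trap number or one from Traps *)
  END MyHandler;

BEGIN
  lastTrap := 0;
  h := MyHandler    (* a value that could be passed to Traps *)
END HandlerDemo.
.ft P
.DE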
.PP
The following runtime errors may occur:
.IP "array bound error"
.br
The detection of this error depends on the EM implementation.
.IP "range bound error"
.br
Range bound errors are always detected, unless runtime checks are disabled.
.IP "set bound error"
.br
The detection of this error depends on the EM implementation.
The current implementations detect this error.
.IP "integer overflow"
.br
The detection of this error depends on the EM implementation.
.IP "cardinal overflow"
.br
This error is detected, unless runtime checks are disabled.
.IP "cardinal underflow"
.br
This error is detected, unless runtime checks are disabled.
.IP "real overflow"
.br
The detection of this error depends on the EM implementation.
.IP "real underflow"
.br
The detection of this error depends on the EM implementation.
.IP "divide by 0"
.br
The detection of this error depends on the EM implementation.
.IP "divide by 0.0"
.br
The detection of this error depends on the EM implementation.
.IP "undefined integer"
.br
The detection of this error depends on the EM implementation.
.IP "undefined real"
.br
The detection of this error depends on the EM implementation.
.IP "conversion error"
.br
This error occurs when assigning a negative value of type INTEGER to a
variable of type CARDINAL,
or when assigning a value of type CARDINAL that is greater than MAX(INTEGER)
to a variable of type INTEGER.
It is detected, unless runtime checking is disabled.
.IP "stack overflow"
.br
The detection of this error depends on the EM implementation.
.IP "heap overflow"
.br
The detection of this error depends on the EM implementation.
It might occur when ALLOCATE fails.
.IP "case error"
.br
This error occurs when none of the cases in a CASE statement is selected,
and the CASE statement has no ELSE part.
The detection of this error depends on the EM implementation.
All current EM implementations detect this error.
.IP "stack size of process too large"
.br
The current implementation limits the stack size of processes to 1024 bytes.
.IP "too many nested traps + handlers"
.br
This error can only occur when the user has installed his own trap handler.
It means that during execution of the trap handler another trap has occurred,
and that this has happened several times.
In some cases, this is an error because of overflow of some internal tables.
.IP "no RETURN from procedure function"
.br
This error occurs when a procedure function does not return properly
("falls" through).
.IP "illegal instruction"
.br
This error might occur when you use floating point operations on an
implementation that does not have floating point.
.PP
In addition, some of the library modules may give error messages.
The \fBTraps\fP-module has a suitable mechanism for this.
.NH 1
Calling the compiler
.PP
See [4,5,6] for a detailed explanation.
.PP
The compiler itself has no version checking mechanism. A special linker
would be needed to do that. Therefore, a makefile generator is included [7].
.NH 1
The procedure call interface
.PP
Parameters are pushed on the stack in reversed order, so that the EM AB
(argument base) register indicates the first parameter.
For a VAR parameter, its address is passed; for a value parameter, its value.
The only exception to this rule is with conformant arrays.
For conformant arrays, the address is passed, and an array descriptor is
passed. The descriptor is an EM array descriptor. It consists of three
fields: the lower bound (always 0), upper bound - lower bound, and the
size of the elements.
The descriptor is pushed first.
If the parameter is a value parameter, the called routine must make sure
that its value is never changed, for instance by making its own copy
of the array. The Modula-2 compiler does exactly this.
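.PP
The layout described above can be pictured with the following sketch.
The record type is only a model of the EM array descriptor written in
Modula-2; its name, its field names, and the choice of \f(CWINTEGER\fP for
the fields are assumptions made for this illustration.
.DS
.ft CW
MODULE CallDemo;
  TYPE
    ArrayDescriptor = RECORD
      lowerBound: INTEGER;   (* always 0 for a conformant array *)
      span:       INTEGER;   (* upper bound - lower bound *)
      elemSize:   INTEGER    (* size of one element *)
    END;
  VAR
    v: ARRAY [1 .. 10] OF INTEGER;
    total: INTEGER;
    k: CARDINAL;

  PROCEDURE Sum(a: ARRAY OF INTEGER): INTEGER;
    VAR i: CARDINAL; s: INTEGER;
  BEGIN
    s := 0;
    FOR i := 0 TO HIGH(a) DO s := s + a[i] END;
    RETURN s
  END Sum;

BEGIN
  FOR k := 1 TO 10 DO v[k] := 1 END;
  total := Sum(v)
END CallDemo.
.ft P
.DE
For the call \f(CWSum(v)\fP, the descriptor would hold 0, 9, and the size
of an \f(CWINTEGER\fP, and would be pushed before the address of \f(CWv\fP.
Because \f(CWa\fP is a value parameter, \f(CWSum\fP works on its own copy
of the array.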
.NH 1
References
.IP [1]
Niklaus Wirth,
.I
Programming in Modula-2, third, corrected edition,
.R
Springer-Verlag, Berlin (1985)
.IP [2]
Niklaus Wirth,
.I
Programming in Modula-2,
.R
Springer-Verlag, Berlin (1983)
.IP [3]
A.S.Tanenbaum, J.W.Stevenson, Hans van Staveren, E.G.Keizer,
.I
Description of a machine architecture for use with block structured languages,
.R
Informatica rapport IR-81, Vrije Universiteit, Amsterdam
.IP [4]
UNIX manual \fIack\fP(1)
.IP [5]
UNIX manual \fImodula-2\fP(1)
.IP [6]
UNIX manual \fIem_m2\fP(6)
.IP [7]
UNIX manual \fIm2mm\fP(1)