| .\" $Id$
 | |
| .\" troff -ms m2ref.doc
 | |
| .TL
 | |
| The ACK Modula-2 Compiler
 | |
| .AU
 | |
| Ceriel J.H. Jacobs
 | |
| .AI
 | |
| Department of Mathematics and Computer Science
 | |
| Vrije Universiteit
 | |
| Amsterdam
 | |
| The Netherlands
 | |
| .AB no
 | |
| .AE
 | |
| .NH
 | |
| Introduction
 | |
| .PP
 | |
| This document describes the implementation-specific features of the
 | |
| ACK Modula-2 compiler.
 | |
| It is not intended to teach Modula-2 programming.
 | |
| For a description of the Modula-2 language,
 | |
| the reader is referred to [1].
 | |
| .PP
 | |
| The ACK Modula-2 compiler is currently available for use with the VAX,
 | |
| Motorola MC68020,
 | |
| Motorola MC68000,
 | |
| PDP-11,
 | |
| and Intel 8086 code-generators.
 | |
| For the 8086,
 | |
| MC68000,
 | |
| and MC68020,
 | |
| floating point emulation is used.
 | |
| This is made available with the \fI-fp\fP
 | |
| option,
 | |
| which must be passed to \fIack\fP[4,5].
 | |
| .NH
 | |
| The language implemented
 | |
| .PP
 | |
| This section discusses the deviations from the Modula-2 language as described
 | |
| in the "Report on The Programming Language Modula-2",
 | |
| as it appeared in [1],
 | |
| from now on referred to as "the Report".
 | |
| Also,
 | |
| the places where the Report leaves room for interpretation are discussed.
 | |
| The section numbers
 | |
| mentioned are the section numbers of the Report.
 | |
| .NH 2
 | |
| Syntax (section 2)
 | |
| .PP
 | |
| The syntax recognized is that of the Report,
 | |
| with some extensions to
 | |
| also recognize the syntax of an earlier definition,
 | |
| given in [2].
 | |
| Only one compilation unit per file is accepted.
 | |
| .NH 2
 | |
| Vocabulary and Representation (section 3)
 | |
| .PP
 | |
| The input "\f(CW10..\fP" is parsed as two tokens: "\f(CW10\fP" and "\f(CW..\fP".
 | |
| .PP
 | |
| The empty string \f(CW""\fP has type
 | |
| .DS
 | |
| .ft CW
 | |
| ARRAY [0 .. 0] OF CHAR
 | |
| .ft P
 | |
| .DE
 | |
| and contains one character: \f(CW0C\fP.
 | |
| .PP
 | |
| When the text of a comment starts with a '\f(CW$\fP',
 | |
| it may be a pragma.
 | |
| Currently,
 | |
| the following pragmas exist:
 | |
| .DS
 | |
| .ft CW
 | |
| (*$F      (F stands for Foreign) *)
 | |
| (*$R[+|-] (Runtime checks, on or off, default on) *)
 | |
| (*$A[+|-] (Array bound checks, on or off, default off) *)
 | |
| (*$U      (Allow for underscores within identifiers) *)
 | |
| .ft P
 | |
| .DE
 | |
| The Foreign pragma is only meaningful in a \f(CWDEFINITION MODULE\fP,
 | |
| and indicates that this
 | |
| \f(CWDEFINITION MODULE\fP describes an interface to a module written in another
 | |
| language (for instance C,
 | |
| Pascal,
 | |
| or EM).
 | |
| Runtime checks that can be disabled are:
 | |
| range checks,
 | |
| \f(CWCARDINAL\fP overflow checks,
 | |
| checks when assigning a \f(CWCARDINAL\fP to an \f(CWINTEGER\fP and vice versa,
 | |
| and checks that \f(CWFOR\fP-loop control-variables are not changed
 | |
| in the body of the loop.
 | |
| Array bound checks can be enabled,
 | |
| because many EM implementations do not
 | |
| implement the array bound checking of the EM array instructions.
 | |
| When enabled,
 | |
| the compiler generates a check before generating an
 | |
| EM array instruction.
 | |
| Even when underscores are enabled,
 | |
| they still may not start an identifier.
 | |
| .PP
 | |
| Constants of type \f(CWLONGINT\fP are integers with a suffix letter \f(CWD\fP
 | |
| (for instance \f(CW1987D\fP).
 | |
| Constants of type \f(CWLONGREAL\fP have suffix \f(CWD\fP if a scale factor is missing,
 | |
| or have \f(CWD\fP in place of \f(CWE\fP in the scale factor (for instance \f(CW1.0D\fP,
 | |
| \f(CW0.314D1\fP).
 | |
| This addition was made,
 | |
| because there was no way to indicate long constants,
 | |
| and also because the addition was made in Wirth's newest Modula-2 compiler.
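 | |
| .PP
 | |
| As a small illustration of the suffix notation,
 | |
| the declarations below reuse the constants mentioned above.
 | |
| The constant names are hypothetical and chosen for this sketch only:
 | |
| .DS
 | |
| .ft CW
 | |
| CONST
 | |
|   Year     = 1987D;     (* LONGINT: integer with suffix D *)
 | |
|   One      = 1.0D;      (* LONGREAL: no scale factor, suffix D *)
 | |
|   Scaled   = 0.314D1;   (* LONGREAL: D in place of E *)
 | |
|   ScaledR  = 0.314E1;   (* REAL: ordinary scale factor *)
 | |
| .ft P
 | |
| .DE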
 | |
| .NH 2
 | |
| Declarations and scope rules (section 4)
 | |
| .PP
 | |
| Standard identifiers are considered to be predeclared,
 | |
| and valid in all
 | |
| parts of a program.
 | |
| They are called \fIpervasive\fP.
 | |
| Unfortunately,
 | |
| the Report does not state how this pervasiveness is accomplished.
 | |
| However,
 | |
| page 87 of [1] states: "Standard identifiers are automatically
 | |
| imported into all modules".
 | |
| Our implementation therefore allows
 | |
| redeclarations of standard identifiers within procedures,
 | |
| but not within
 | |
| modules.
 | |
| .NH 2
 | |
| Constant expressions (section 5)
 | |
| .PP
 | |
| Each operand of a constant expression must be a constant:
 | |
| a string,
 | |
| a number,
 | |
| a set,
 | |
| an enumeration literal,
 | |
| a qualifier denoting a
 | |
| constant expression,
 | |
| a type transfer with a constant argument,
 | |
| or one of the standard procedures
 | |
| \f(CWABS\fP,
 | |
| \f(CWCAP\fP,
 | |
| \f(CWCHR\fP,
 | |
| \f(CWLONG\fP,
 | |
| \f(CWMAX\fP,
 | |
| \f(CWMIN\fP,
 | |
| \f(CWODD\fP,
 | |
| \f(CWORD\fP,
 | |
| \f(CWSIZE\fP,
 | |
| \f(CWSHORT\fP,
 | |
| \f(CWTSIZE\fP,
 | |
| or \f(CWVAL\fP,
 | |
| with constant argument(s);
 | |
| \f(CWTSIZE\fP and \f(CWSIZE\fP may also have a variable as argument.
 | |
| .PP
 | |
| Floating point expressions are never evaluated at compile time,
 | |
| because the compiler basically functions as a cross-compiler,
 | |
| and thus cannot
 | |
| use the floating point instructions of the machine on which it runs.
 | |
| Also,
 | |
| \f(CWMAX(REAL)\fP and \f(CWMIN(REAL)\fP are not allowed.
 | |
| .NH 2
 | |
| Type declarations (section 6)
 | |
| .NH 3
 | |
| Basic types (section 6.1)
 | |
| .PP
 | |
| The type \f(CWCHAR\fP includes the ASCII character set as a subset.
 | |
| Values range from
 | |
| \f(CW0C\fP to \f(CW377C\fP,
 | |
| not from \f(CW0C\fP to \f(CW177C\fP.
 | |
| .NH 3
 | |
| Enumerations (section 6.2)
 | |
| .PP
 | |
| The maximum number of enumeration literals in any one enumeration type
 | |
| is \f(CWMAX(INTEGER)\fP.
 | |
| .NH 3
 | |
| Record types (section 6.5)
 | |
| .PP
 | |
| The syntax of variant sections in [1] is different from the one in [2].
 | |
| Our implementation recognizes both,
 | |
| giving a warning for the older one.
 | |
| However,
 | |
| see section 3 of this document (Backwards compatibility).
 | |
| .NH 3
 | |
| Set types (section 6.6)
 | |
| .PP
 | |
| The only limitation imposed by the compiler is that the base type of the
 | |
| set must be a subrange type,
 | |
| an enumeration type,
 | |
| \f(CWCHAR\fP,
 | |
| or \f(CWBOOLEAN\fP.
 | |
| So,
 | |
| the lower bound may be negative.
 | |
| However,
 | |
| if a negative lower bound is used,
 | |
| the compiler gives a warning of the \fIrestricted\fP class (see the manual
 | |
| page of the compiler).
 | |
| .PP
 | |
| The standard type \f(CWBITSET\fP is defined as
 | |
| .DS
 | |
| .ft CW
 | |
| TYPE BITSET = SET OF [0 .. 8*SIZE(INTEGER)-1];
 | |
| .ft P
 | |
| .DE
 | |
| .NH 2
 | |
| Expressions (section 8)
 | |
| .NH 3
 | |
| Operators (section 8.2)
 | |
| .NH 4
 | |
| Arithmetic operators (section 8.2.1)
 | |
| .PP
 | |
| The Report does not specify the priority of the unary
 | |
| operators \f(CW+\fP or \f(CW-\fP:
 | |
| It does not specify whether
 | |
| .DS
 | |
| .ft CW
 | |
| - 1 + 1
 | |
| .ft P
 | |
| .DE
 | |
| means
 | |
| .DS
 | |
| .ft CW
 | |
| - (1 + 1)
 | |
| .ft P
 | |
| .DE
 | |
| or
 | |
| .DS
 | |
| .ft CW
 | |
| (-1) + 1
 | |
| .ft P
 | |
| .DE
 | |
| I have seen some compilers that implement the first alternative,
 | |
| and others that implement the second.
 | |
| Our compiler implements the second.
 | |
| The fact that the priority of the unary operators is not specified
 | |
| might indicate that it is the same as that of their binary counterparts,
 | |
| in which case left-to-right evaluation selects the second alternative.
 | |
| On the other hand one might argue that,
 | |
| since the grammar only allows for one unary operator in a simple expression,
 | |
| it must apply to the whole simple expression,
 | |
| not just the first term.
 | |
| .NH 2
 | |
| Statements (section 9)
 | |
| .NH 3
 | |
| Assignments (section 9.1)
 | |
| .PP
 | |
| The Report does not define the evaluation order in an assignment.
 | |
| Our compiler certainly chooses an evaluation order,
 | |
| but it is explicitly left undefined.
 | |
| Therefore,
 | |
| programs that depend on it may cease to work later.
 | |
| .PP
 | |
| The types \f(CWINTEGER\fP and \f(CWCARDINAL\fP are assignment-compatible with
 | |
| \f(CWLONGINT\fP,
 | |
| and \f(CWREAL\fP is assignment-compatible with \f(CWLONGREAL\fP.
 | |
| .NH 3
 | |
| Case statements (section 9.5)
 | |
| .PP
 | |
| The size of the type of the case-expression must be less than or equal to
 | |
| the word-size.
 | |
| .PP
 | |
| The Report does not specify what happens if the value of the case-expression
 | |
| does not occur as a label of any case,
 | |
| and there is no \f(CWELSE\fP-part.
 | |
| In our implementation,
 | |
| this results in a runtime error.
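 | |
| .PP
 | |
| A minimal sketch of how an \f(CWELSE\fP-part avoids this runtime error
 | |
| is given below.
 | |
| The module and variable names are hypothetical:
 | |
| .DS
 | |
| .ft CW
 | |
| MODULE CaseDemo;
 | |
| VAR
 | |
|   ch:   CHAR;
 | |
|   kind: INTEGER;
 | |
| BEGIN
 | |
|   ch := "+";
 | |
|   CASE ch OF
 | |
|     "0" .. "9": kind := 1
 | |
|   | "a" .. "z": kind := 2
 | |
|   ELSE
 | |
|     kind := 0   (* taken for "+"; no label matches *)
 | |
|   END
 | |
| END CaseDemo.
 | |
| .ft P
 | |
| .DE
 | |
| Without the \f(CWELSE\fP-part,
 | |
| the value "+" would match no label and execution would stop with a case error.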
 | |
| .NH 3
 | |
| For statements (section 9.8)
 | |
| .PP
 | |
| The Report does not specify the legal types for a control variable.
 | |
| Our implementation allows the basic types (except \f(CWREAL\fP),
 | |
| enumeration types,
 | |
| and subranges.
 | |
| A runtime warning is generated when the value of the control variable
 | |
| is changed by the statement sequence that forms the body of the loop,
 | |
| unless runtime checking is disabled.
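 | |
| .PP
 | |
| For instance,
 | |
| a control variable of an enumeration type could be used as sketched below;
 | |
| all names are hypothetical:
 | |
| .DS
 | |
| .ft CW
 | |
| MODULE LoopDemo;
 | |
| TYPE
 | |
|   Color = (red, green, blue);
 | |
| VAR
 | |
|   c: Color;
 | |
|   n: CARDINAL;
 | |
| BEGIN
 | |
|   n := 0;
 | |
|   FOR c := red TO blue DO
 | |
|     INC(n)   (* changing c here would be reported at run time *)
 | |
|   END
 | |
| END LoopDemo.
 | |
| .ft P
 | |
| .DE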
 | |
| .NH 3
 | |
| Return and exit statements (section 9.11)
 | |
| .PP
 | |
| The Report does not specify which result-types are legal.
 | |
| Our implementation allows any result type.
 | |
| .NH 2
 | |
| Procedure declarations (section 10)
 | |
| .PP
 | |
| Function procedures must exit through a RETURN statement,
 | |
| or a runtime error occurs.
 | |
| .NH 3
 | |
| Standard procedures (section 10.2)
 | |
| .PP
 | |
| Our implementation supports \f(CWNEW\fP and \f(CWDISPOSE\fP
 | |
| for backwards compatibility,
 | |
| but issues warnings for their use.
 | |
| However,
 | |
| see section 3 of this document (Backwards compatibility).
 | |
| .PP
 | |
| Also,
 | |
| some new standard procedures were added,
 | |
| similar to the new standard procedures in Wirth's newest compiler:
 | |
| .IP \-
 | |
| \f(CWLONG\fP converts an argument of type \f(CWINTEGER\fP or \f(CWREAL\fP to the
 | |
| types \f(CWLONGINT\fP or \f(CWLONGREAL\fP.
 | |
| .IP \-
 | |
| \f(CWSHORT\fP performs the inverse transformation,
 | |
| without range checks.
 | |
| .IP \-
 | |
| \f(CWFLOATD\fP is analogous to \f(CWFLOAT\fP,
 | |
| but yields a result of type
 | |
| \f(CWLONGREAL\fP.
 | |
| .IP \-
 | |
| \f(CWTRUNCD\fP is analogous to \f(CWTRUNC\fP,
 | |
| but yields a result of type
 | |
| \f(CWLONGINT\fP.
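 | |
| .PP
 | |
| The following sketch shows how these procedures might be used.
 | |
| It assumes that \f(CWFLOATD\fP and \f(CWTRUNCD\fP accept the same argument
 | |
| types as \f(CWFLOAT\fP and \f(CWTRUNC\fP
 | |
| (taken here to be \f(CWCARDINAL\fP and \f(CWREAL\fP),
 | |
| which this document does not state explicitly;
 | |
| the module and variable names are hypothetical:
 | |
| .DS
 | |
| .ft CW
 | |
| MODULE Conversions;
 | |
| VAR
 | |
|   c:  CARDINAL;
 | |
|   i:  INTEGER;
 | |
|   li: LONGINT;
 | |
|   r:  REAL;
 | |
|   lr: LONGREAL;
 | |
| BEGIN
 | |
|   i  := 42;
 | |
|   li := LONG(i);      (* INTEGER to LONGINT *)
 | |
|   i  := SHORT(li);    (* LONGINT to INTEGER, no range check *)
 | |
|   r  := 1.5;
 | |
|   lr := LONG(r);      (* REAL to LONGREAL *)
 | |
|   r  := SHORT(lr);    (* LONGREAL to REAL *)
 | |
|   c  := 7;
 | |
|   lr := FLOATD(c);    (* like FLOAT, but yields LONGREAL *)
 | |
|   li := TRUNCD(r)     (* like TRUNC, but yields LONGINT *)
 | |
| END Conversions.
 | |
| .ft P
 | |
| .DE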
 | |
| .NH 2
 | |
| System-dependent facilities (section 12)
 | |
| .PP
 | |
| The type \f(CWBYTE\fP is added to the \f(CWSYSTEM\fP module.
 | |
| It occupies a storage unit of 8 bits.
 | |
| \f(CWARRAY OF BYTE\fP has a similar effect to \f(CWARRAY OF WORD\fP,
 | |
| but is safer.
 | |
| In some obscure cases the \f(CWARRAY OF WORD\fP mechanism does not quite
 | |
| work properly.
 | |
| .PP
 | |
| The procedure \f(CWIOTRANSFER\fP is not implemented.
 | |
| .NH 1
 | |
| Backwards compatibility
 | |
| .PP
 | |
| Besides recognizing the language as described in [1],
 | |
| the compiler recognizes most of the language described in [2],
 | |
| for backwards compatibility.
 | |
| It warns the user about old-fashioned
 | |
| constructions (constructions that [1] does not allow).
 | |
| If the \fI-Rm2-3\fP option (see [6]) is passed to \fIack\fP,
 | |
| this backwards compatibility feature is disabled.
 | |
| Also,
 | |
| it may not be present on some
 | |
| smaller machines,
 | |
| like the PDP-11.
 | |
| .NH 1
 | |
| Compile time errors
 | |
| .PP
 | |
| The compile time error messages are intended to be self-explanatory,
 | |
| and are not listed here.
 | |
| The compiler also sometimes issues warnings,
 | |
| recognizable by a warning-classification between parentheses.
 | |
| Currently,
 | |
| there are 3 classifications:
 | |
| .IP "(old-fashioned use)"
 | |
| .br
 | |
| These warnings are given on constructions that are not allowed by [1],
 | |
| but are allowed by [2].
 | |
| .IP (strict)
 | |
| .br
 | |
| These warnings are given on constructions that are supported by the
 | |
| ACK Modula-2 compiler,
 | |
| but might not be supported by others.
 | |
| Examples: functions returning structured types,
 | |
| SET types of subranges with
 | |
| negative lower bound.
 | |
| .IP (warning)
 | |
| .br
 | |
| The other warnings,
 | |
| such as warnings about variables that are never assigned,
 | |
| never used,
 | |
| etc.
 | |
| .NH 1
 | |
| Runtime errors
 | |
| .PP
 | |
| The ACK Modula-2 compiler produces code for an EM machine as defined in [3].
 | |
| Therefore,
 | |
| it depends on the implementation
 | |
| of the EM machine for detection of some of the runtime errors that could occur.
 | |
| .PP
 | |
| The \fITraps\fP module enables the user to install his own runtime
 | |
| error handler.
 | |
| The default one just displays what happened and exits.
 | |
| Basically,
 | |
| a trap handler is just a procedure that takes an INTEGER as
 | |
| parameter.
 | |
| The INTEGER is the trap number.
 | |
| This INTEGER can be one of the
 | |
| EM trap numbers,
 | |
| listed in [3],
 | |
| or one of the numbers listed in the
 | |
| \fITraps\fP definition module.
 | |
| .PP
 | |
| The following runtime errors may occur:
 | |
| .IP "array bound error"
 | |
| .br
 | |
| The detection of this error depends on the EM implementation.
 | |
| .IP "range bound error"
 | |
| .br
 | |
| Range bound errors are always detected,
 | |
| unless runtime checks are disabled.
 | |
| .IP "set bound error"
 | |
| .br
 | |
| The detection of this error depends on the EM implementation.
 | |
| The current implementations detect this error.
 | |
| .IP "integer overflow"
 | |
| .br
 | |
| The detection of this error depends on the EM implementation.
 | |
| .IP "cardinal overflow"
 | |
| .br
 | |
| This error is detected,
 | |
| unless runtime checks are disabled.
 | |
| .IP "cardinal underflow"
 | |
| .br
 | |
| This error is detected,
 | |
| unless runtime checks are disabled.
 | |
| .IP "real overflow"
 | |
| .br
 | |
| The detection of this error depends on the EM implementation.
 | |
| .IP "real underflow"
 | |
| .br
 | |
| The detection of this error depends on the EM implementation.
 | |
| .IP "divide by 0"
 | |
| .br
 | |
| The detection of this error depends on the EM implementation.
 | |
| .IP "divide by 0.0"
 | |
| .br
 | |
| The detection of this error depends on the EM implementation.
 | |
| .IP "undefined integer"
 | |
| .br
 | |
| The detection of this error depends on the EM implementation.
 | |
| .IP "undefined real"
 | |
| .br
 | |
| The detection of this error depends on the EM implementation.
 | |
| .IP "conversion error"
 | |
| .br
 | |
| This error occurs when assigning a negative value of type INTEGER to a
 | |
| variable of type CARDINAL,
 | |
| or when assigning a value of CARDINAL that is > MAX(INTEGER),
 | |
| to a variable of type INTEGER.
 | |
| It is detected,
 | |
| unless runtime checking is disabled.
 | |
| .IP "stack overflow"
 | |
| .br
 | |
| The detection of this error depends on the EM implementation.
 | |
| .IP "heap overflow"
 | |
| .br
 | |
| The detection of this error depends on the EM implementation.
 | |
| It might happen when ALLOCATE fails.
 | |
| .IP "case error"
 | |
| .br
 | |
| This error occurs when none of the cases in a CASE statement are selected,
 | |
| and the CASE statement has no ELSE part.
 | |
| The detection of this error depends on the EM implementation.
 | |
| All current EM implementations detect this error.
 | |
| .IP "stack size of process too large"
 | |
| .br
 | |
| This is most likely to happen if the reserved space for a coroutine stack
 | |
| is too small.
 | |
| In this case,
 | |
| increase the size of the area given to
 | |
| \f(CWNEWPROCESS\fP.
 | |
| It can also happen if the stack needed for the main
 | |
| process is too large and there are coroutines.
 | |
| In this case,
 | |
| the only fix is to reduce the stack size needed by the main process,
 | |
| for instance by avoiding local arrays.
 | |
| .IP "too many nested traps + handlers"
 | |
| .br
 | |
| This error can only occur when the user has installed his own trap handler.
 | |
| It means that another trap occurred during execution of the trap handler,
 | |
| and that this happened several times.
 | |
| In some cases,
 | |
| this is an error because of overflow of some internal tables.
 | |
| .IP "no RETURN from function procedure"
 | |
| .br
 | |
| This error occurs when a function procedure does not return properly
 | |
| ("falls" through).
 | |
| .IP "illegal instruction"
 | |
| .br
 | |
| This error might occur when floating point operations are used on an
 | |
| implementation that does not have floating point.
 | |
| .PP
 | |
| In addition,
 | |
| some of the library modules may give error messages.
 | |
| The \fITraps\fP module has a suitable mechanism for this.
 | |
| .NH 1
 | |
| Calling the compiler
 | |
| .PP
 | |
| See [4,5,6] for a detailed explanation.
 | |
| .PP
 | |
| The compiler itself has no version checking mechanism.
 | |
| A special linker
 | |
| would be needed to do that.
 | |
| Therefore,
 | |
| a makefile generator is included [7].
 | |
| .NH 1
 | |
| The procedure call interface
 | |
| .PP
 | |
| Parameters are pushed on the stack in reversed order,
 | |
| so that the EM AB
 | |
| (argument base) register indicates the first parameter.
 | |
| For a VAR parameter,
 | |
| the address is passed;
 | |
| for a value parameter, the value itself.
 | |
| The only exception to this rule is with conformant arrays.
 | |
| For conformant arrays,
 | |
| the address is passed,
 | |
| and an array descriptor is
 | |
| passed.
 | |
| The descriptor is an EM array descriptor.
 | |
| It consists of three
 | |
| fields: the lower bound (always 0),
 | |
| upper bound - lower bound,
 | |
| and the size of the elements.
 | |
| The descriptor is pushed first.
 | |
| If the parameter is a value parameter,
 | |
| the called routine must make sure
 | |
| that its value is never changed,
 | |
| for instance by making its own copy
 | |
| of the array.
 | |
| The Modula-2 compiler does exactly this.
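 | |
| .PP
 | |
| Viewed from Modula-2,
 | |
| the descriptor layout described above could be pictured roughly as follows.
 | |
| This is only a sketch: the type and field names are hypothetical,
 | |
| and the use of \f(CWINTEGER\fP for each field is an assumption,
 | |
| not something this document specifies:
 | |
| .DS
 | |
| .ft CW
 | |
| TYPE
 | |
|   ArrayDescriptor = RECORD
 | |
|     lowBound:     INTEGER;   (* always 0 for a conformant array *)
 | |
|     highMinusLow: INTEGER;   (* upper bound - lower bound *)
 | |
|     elemSize:     INTEGER    (* size of one element *)
 | |
|   END;
 | |
| .ft P
 | |
| .DE
 | |
| Since the lower bound is always 0,
 | |
| the second field equals the value that \f(CWHIGH\fP yields for the array.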
 | |
| .PP
 | |
| When the size of the return value of a function procedure is larger than
 | |
| the maximum of \f(CWSIZE(LONGREAL)\fP and twice the pointer-size,
 | |
| the caller reserves this space on the stack,
 | |
| above the parameters.
 | |
| The callee then stores
 | |
| its result there,
 | |
| and returns no other value.
 | |
| .NH 1
 | |
| References
 | |
| .IP [1]
 | |
| Niklaus Wirth,
 | |
| .I
 | |
| Programming in Modula-2, third, corrected edition,
 | |
| .R
 | |
| Springer-Verlag, Berlin (1985)
 | |
| .IP [2]
 | |
| Niklaus Wirth,
 | |
| .I
 | |
| Programming in Modula-2,
 | |
| .R
 | |
| Springer-Verlag, Berlin (1983)
 | |
| .IP [3]
 | |
| A.S.Tanenbaum, J.W.Stevenson, Hans van Staveren, E.G.Keizer,
 | |
| .I
 | |
| Description of a machine architecture for use with block structured languages,
 | |
| .R
 | |
| Informatica rapport IR-81, Vrije Universiteit, Amsterdam
 | |
| .IP [4]
 | |
| UNIX manual \fIack\fP(1)
 | |
| .IP [5]
 | |
| UNIX manual \fImodula-2\fP(1)
 | |
| .IP [6]
 | |
| UNIX manual \fIem_m2\fP(6)
 | |
| .IP [7]
 | |
| UNIX manual \fIm2mm\fP(1)
 |