Added
parent
0633c900a8
commit
44cc075183
23 changed files with 2115 additions and 0 deletions
50
doc/pascal/ab+intro.doc
Normal file
|
@ -0,0 +1,50 @@
|
|||
.TL
|
||||
The ACK Pascal Compiler
|
||||
.AU
|
||||
Aad Geudeke
|
||||
Frans Hofmeester
|
||||
.AI
|
||||
Dept. of Mathematics and Computer Science
|
||||
Vrije Universiteit
|
||||
Amsterdam, The Netherlands
|
||||
.AB
|
||||
This document describes the implementation of a Pascal to EM compiler. The
|
||||
compiler is written in C. The lexical analysis is done using a hand-written
|
||||
lexical analyzer. Syntax analysis makes use of the extended LL(1) parser
|
||||
generator LLgen. Several EM utility modules are used in the compiler.
|
||||
.AE
|
||||
.sp 2
|
||||
.NH
|
||||
Introduction
|
||||
|
||||
.PP
|
||||
.nh
|
||||
The Pascal front end of the Amsterdam Compiler Kit (ACK) complies with the
|
||||
requirements of the international standard published by the International
|
||||
Organization for Standardization (ISO) [ISO]. An informal description of the
programming language Pascal, which unfortunately does not conform to the
standard, is given in [JEN].
|
||||
|
||||
.PP
|
||||
The main reason for rewriting the Pascal compiler was the lack of flexibility
of the old compiler, which was written in Pascal itself. The old compiler did
not meet the needs of the current ACK framework, which makes use of modern
parsing techniques and utility modules. In this framework it is, for example,
possible to use a fast back end, which translates directly to object code
[ACK]. Our compiler is written in C and its design is similar to that of the
current C and Modula-2 compilers of ACK.
|
||||
|
||||
.PP
|
||||
Chapter 2 describes the basic structure of the compiler. Chapter 3 discusses
|
||||
the code generation of the main Pascal constructs. Chapter 4 covers one of
|
||||
the major components of Pascal, viz. the conformant array. In Chapter 5 the
|
||||
various compiler options that can be used are enumerated. The extensions
|
||||
to the standard and the deviations from the standard are listed in Chapters
6 and 7. Chapter 8 presents some ideas to improve the standard. Chapter 9
|
||||
gives a short overview of testing the compiler. The major differences
|
||||
between the old and new compiler can be found in Chapter 10. Suggestions
|
||||
to improve the compiler are described in Chapter 11. The appendices
|
||||
contain the grammar of Pascal and the changes made to the ACK Pascal run time
library. As an example, a translation of a Pascal program to EM code is also presented.
|
||||
.bp
|
89
doc/pascal/compar.doc
Normal file
|
@ -0,0 +1,89 @@
|
|||
.sp 2
|
||||
.NH
|
||||
Comparison with the Pascal-VU compiler
|
||||
.nh
|
||||
|
||||
.LP
|
||||
In this chapter, the differences with the Pascal-VU compiler [IM2] are listed.
The points enumerated below can serve as a basis for improvements to the
compiler (see also Chapter 11).
|
||||
.sp
|
||||
.NH 2
|
||||
Deviations
|
||||
.LP
|
||||
.sp
|
||||
- large labels
|
||||
.in +3m
|
||||
only labels in the closed interval 0..9999 are allowed, whereas the
Pascal-VU compiler allows any unsigned integer as a label.
|
||||
.in -3m
|
||||
|
||||
- goto
|
||||
.in +3m
|
||||
the new compiler conforms to the standard as opposed to the old one. The
|
||||
following program, which contains an illegal jump to label 1, is accepted
|
||||
by the Pascal-VU compiler.
|
||||
|
||||
.nf
|
||||
\fBprogram\fR illegal_goto(output);
|
||||
\fBlabel\fR 1;
|
||||
\fBvar\fR i : integer;
|
||||
\fBbegin\fR
|
||||
\fBgoto\fR 1;
|
||||
\fBfor\fR i := 1 \fBto\fR 10 \fBdo\fR
|
||||
\fBbegin\fR
|
||||
1 : writeln(i);
|
||||
\fBend\fR;
|
||||
\fBend\fR.
|
||||
.fi
|
||||
|
||||
This program is rejected by the new compiler.
|
||||
.in -3m
|
||||
|
||||
.NH 2
|
||||
Extensions
|
||||
.LP
|
||||
.sp
|
||||
The extensions implemented by the Pascal-VU compiler are listed in
|
||||
Chapter 5 of [IM2].
|
||||
.sp
|
||||
- separate compilation
|
||||
.ti +3m
|
||||
the new compiler only accepts programs, not modules.
|
||||
|
||||
- assertions
|
||||
.ti +3m
|
||||
not implemented.
|
||||
|
||||
- additional procedures
|
||||
.ti +3m
|
||||
the procedures \fIhalt, mark\fR and \fIrelease\fR are not available.
|
||||
.bp
|
||||
- UNIX\(tm interfacing
|
||||
.ti +3m
|
||||
the \-c option is not implemented.
|
||||
.FS
|
||||
\(tm UNIX is a Trademark of Bell Laboratories.
|
||||
.FE
|
||||
|
||||
- double length integers
|
||||
.ti +3m
|
||||
integer size can be set with the \-V option, so the additional type \fIlong\fR
|
||||
is not implemented.
|
||||
|
||||
|
||||
.NH 2
|
||||
Compiler options
|
||||
.LP
|
||||
.sp
|
||||
The options implemented by the Pascal-VU compiler are listed in
|
||||
Chapter 7 of [IM2].
|
||||
.sp
|
||||
The construction "{$....}" is not recognized.
|
||||
|
||||
The options \fIa, c, d, s\fR and \fIt\fR are not available.
|
||||
|
||||
The \-l option has been changed into the \-L option.
|
||||
|
||||
The size of reals can be set with the \-V option.
|
88
doc/pascal/conf.doc
Normal file
|
@ -0,0 +1,88 @@
|
|||
.sp 1.5i
|
||||
.nr H1 3
|
||||
.NH
|
||||
Conformant Arrays
|
||||
.nh
|
||||
.LP
|
||||
.sp
|
||||
A fifth kind of parameter, besides the value, variable, procedure, and function
|
||||
parameter, is the conformant array parameter (\fBISO 6.6.3.7\fR). This
|
||||
parameter, undoubtedly the major addition to Pascal from the compiler writer's
|
||||
point of view, has been implemented. With this kind of parameter, the required
|
||||
bounds of the index-type of an actual parameter are not fixed, but are
|
||||
restricted to a specified range of values. Two types of conformant array
|
||||
parameters can be distinguished: variable conformant array parameters and
|
||||
value conformant array parameters.
|
||||
.sp
|
||||
.NH 2
|
||||
Variable conformant array parameters
|
||||
.LP
|
||||
.sp
|
||||
The treatment of variable conformant array parameters is comparable to that of
the normal variable parameter.
|
||||
Both have in common that the parameter mechanism used is \fIcall by
|
||||
reference\fR.
|
||||
.br
|
||||
An example is:
|
||||
.br
|
||||
.in +5m
|
||||
to sort variable length arrays of integers, the following Pascal procedure could be used:
|
||||
|
||||
.nf
|
||||
\fBprocedure\fR bubblesort(\fBvar\fR A : \fBarray\fR[low..high : integer] \fBof\fR integer);
|
||||
\fBvar\fR i, j : integer;
|
||||
\fBbegin
|
||||
for\fR j := high - 1 \fBdownto\fR low \fBdo
|
||||
for\fR i := low \fBto\fR j \fBdo
|
||||
if\fR A[i+1] < A[i] \fBthen\fI interchange A[i] and A[i+1]
|
||||
\fBend\fR;
|
||||
.fi
|
||||
.in -5m
|
||||
|
||||
For every actual parameter, the base address of the array is pushed on the
|
||||
stack and for every index-type-specification, exactly one array descriptor
|
||||
is pushed.
|
||||
.sp
|
||||
.NH 2
|
||||
Value conformant array parameters
|
||||
.LP
|
||||
.sp
|
||||
The treatment of value conformant array parameters is more complex than that of
their variable counterparts.
|
||||
.br
|
||||
An example is:
|
||||
.br
|
||||
.in +5m
|
||||
an unpacked array of characters could be printed as a string with the following program part:
|
||||
|
||||
.nf
|
||||
\fBprocedure\fR WriteAsString( A : \fBarray\fR[low..high : integer] \fBof\fR char);
|
||||
\fBvar\fR i : integer;
|
||||
\fBbegin
|
||||
for\fR i := low \fBto\fR high \fBdo\fR write(A[i]);
|
||||
\fBend\fR;
|
||||
.fi
|
||||
.in -5m
|
||||
|
||||
The calling procedure pushes the base address of the actual parameter and
|
||||
the array descriptors belonging to it on the stack. Subsequently the procedure
|
||||
using the conformant array parameter is called. Because it is a \fIcall by
|
||||
value\fR, the called procedure has to create a copy of the actual parameter.
|
||||
This implies that the calling procedure knows how much space on the stack
|
||||
must be reserved for the parameters. If the actual-parameter is a conformant
|
||||
array, the called procedure keeps track of the size of the activation record.
|
||||
Hence the restrictions on the use of value conformant array parameters, as
|
||||
specified in \fBISO 6.6.3.7.2\fR, are dropped.
|
||||
|
||||
A description of the EM code generated by the compiler is:
|
||||
|
||||
.nf
|
||||
.ft I
|
||||
load the stack adjustment so far
|
||||
load base address of array parameter
|
||||
compute the size in bytes of the array
|
||||
add this size to the stack adjustment
|
||||
copy the array
|
||||
remember the new address of the array
|
||||
.ft R
|
||||
.fi
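.sp
The steps listed above can be sketched in C. This is only an illustration
under stated assumptions, not the EM code the compiler emits: the structure
\fIdescriptor\fR (assumed here to hold the lower bound, the upper bound minus
the lower bound, and the element size) and the routines \fIarray_size\fR and
\fIcopy_value_array\fR are hypothetical names, and a plain byte buffer stands
in for the run-time stack.
.nf
```c
#include <stddef.h>
#include <string.h>

/* Hypothetical array descriptor, assumed to hold the lower bound,
 * the number of elements minus one, and the element size in bytes. */
struct descriptor {
    long low;
    long range;      /* high - low */
    size_t elsize;
};

/* Compute the size in bytes of the array described by d. */
size_t array_size(const struct descriptor *d)
{
    return (size_t)(d->range + 1) * d->elsize;
}

/* Copy the actual parameter into the local area of the called procedure.
 * area:   start of the local area (stands in for the stack)
 * adjust: the stack adjustment so far, updated on return
 * base:   base address of the actual parameter
 * Returns the new address of the array. */
void *copy_value_array(unsigned char *area, size_t *adjust,
                       const void *base, const struct descriptor *d)
{
    size_t size = array_size(d);
    void *copy = area + *adjust;   /* space just past earlier copies */

    memcpy(copy, base, size);      /* copy the array */
    *adjust += size;               /* add the size to the stack adjustment */
    return copy;                   /* remember the new address */
}
```
.fi
In this sketch the copy is placed at increasing addresses for simplicity; the
direction in which a real stack grows is left out of consideration.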
|
41
doc/pascal/contents.doc
Normal file
|
@ -0,0 +1,41 @@
|
|||
.sp 1.5i
|
||||
.ps 12
|
||||
.vs 14
|
||||
.ft B
|
||||
Contents\fR\h'+108u'\h'+5i'Page
|
||||
|
||||
|
||||
\h'+34u'1. Introduction \h'+34u'\h'+1.5i'1
|
||||
|
||||
\h'+34u'2. The compiler \h'+34u'\h'+1.5i'2
|
||||
|
||||
\h'+34u'3. Translation of Pascal to EM \h'+34u'\h'+1.5i'5
|
||||
|
||||
\h'+34u'4. Conformant arrays \h'+1.5i'10
|
||||
|
||||
\h'+34u'5. Compiler options \h'+1.5i'11
|
||||
|
||||
\h'+34u'6. Extensions to the standard \h'+1.5i'13
|
||||
|
||||
\h'+34u'7. Deviations from the standard \h'+1.5i'13
|
||||
|
||||
\h'+34u'8. Hints to change the standard \h'+1.5i'15
|
||||
|
||||
\h'+34u'9. Testing the compiler \h'+1.5i'16
|
||||
|
||||
10. Comparison with the old compiler \h'+1.5i'16
|
||||
|
||||
11. Improvements to the compiler \h'+1.5i'17
|
||||
|
||||
12. History & Acknowledgements \h'+1.5i'18
|
||||
|
||||
13. References \h'+1.5i'19
|
||||
|
||||
|
||||
\fBAppendices\fR
|
||||
|
||||
\h'+16u'A. ISO-PASCAL Grammar \h'+1.5i'20
|
||||
|
||||
\h'+24u'B. Changes to run time library \h'+1.5i'26
|
||||
|
||||
\h'+20u'C. An example \h'+1.5i'28
|
118
doc/pascal/deviations.doc
Normal file
|
@ -0,0 +1,118 @@
|
|||
.sp 2
|
||||
.NH
|
||||
Deviations from the standard
|
||||
.nh
|
||||
|
||||
.PP
|
||||
The compiler deviates from the ISO 7185 standard with respect to the
|
||||
following clauses:
|
||||
|
||||
.IP "\fBISO 6.1.3:\fR" 14
|
||||
\h'-5u'Identifiers may be of any length and all characters of an identifier
|
||||
shall be significant in distinguishing between them.
|
||||
.sp
|
||||
.in +3m
|
||||
The constant IDFSIZE, defined in the file \fIidfsize.h\fR, determines
|
||||
the (maximum) significant length of an identifier. It can be set at run
|
||||
time with the \-M option (see also section on compiler options).
|
||||
.in -3m
|
||||
.sp
|
||||
.IP "\fBISO 6.1.8:\fR"
|
||||
\h'-5u'There shall be at least one separator between any pair of consecutive tokens
|
||||
made up of identifiers, word-symbols, labels or unsigned-numbers.
|
||||
.sp
|
||||
.in +3m
|
||||
A token separator is not needed when a number is followed by an identifier
|
||||
or a word-symbol. For example, the input sequence 2\fBthen\fR is recognized
|
||||
as the integer 2 followed by the keyword \fBthen\fR.
|
||||
.in -3m
|
||||
.sp
|
||||
.IP "\fBISO 6.2.1:\fR"
|
||||
\h'-29u'The label-declaration-part shall specify all labels that prefix a statement
|
||||
in the corresponding statement-part.
|
||||
.sp
|
||||
.ti +3m
|
||||
The compiler generates a warning if a label is declared but never defined.
|
||||
.bp
|
||||
.IP "\fBISO 6.2.2:\fR"
|
||||
\h'-9u'The scope of identifiers and labels should start at the beginning of the
|
||||
block in which these identifiers or labels are declared.
|
||||
.sp
|
||||
.in +3m
|
||||
The compiler, like most other one-pass compilers, deviates in this respect,
because the scope of variables and labels starts at their defining-point.
|
||||
.nf
|
||||
.in +4m
|
||||
\fBprogram\fR deviates\fB;
|
||||
const\fR
|
||||
x \fB=\fR 3\fB;
|
||||
procedure\fR p\fB;
|
||||
const\fR
|
||||
y \fB=\fR x\fB;\fR
|
||||
x \fB=\fR true\fB;
|
||||
begin end;
|
||||
begin
|
||||
end.\fR
|
||||
.in -4m
|
||||
.fi
|
||||
|
||||
In procedure p, the constant y has the integer value 3. This program does not
conform to the standard. In [SAL] a simple algorithm is described for
enforcing the scope rules. It involves numbering all scopes encountered in the
program in order of their opening, and recording in each identifier table
entry the number of the latest scope in which it is used.
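.sp
The scope-numbering idea can be sketched in C as follows. This is our own
minimal reading of the description given here, not the algorithm as published
in [SAL] and not code from the compiler; the table layout and the routines
\fIopen_scope\fR, \fIclose_scope\fR, \fIuse\fR and \fIdeclare\fR are
hypothetical. A declaration is rejected when an outer declaration of the same
name was last used in a scope whose number is not smaller than that of the
current scope.
.nf
```c
#include <string.h>

#define MAXIDS 64
#define MAXDEPTH 16

struct entry {
    const char *name;
    int scope;       /* number of the declaring scope */
    int last_used;   /* number of the latest scope in which it was used */
    int alive;       /* is the declaring scope still open? */
};

static struct entry table[MAXIDS];
static int nentries;
static int open_scopes[MAXDEPTH];  /* stack of open scope numbers */
static int depth;
static int scope_counter;          /* scopes are numbered as they open */

void open_scope(void)
{
    open_scopes[depth++] = ++scope_counter;
}

void close_scope(void)
{
    int i, s = open_scopes[--depth];

    for (i = 0; i < nentries; i++)
        if (table[i].scope == s)
            table[i].alive = 0;
}

/* Find the innermost visible declaration of name, or a null pointer. */
static struct entry *lookup(const char *name)
{
    int i;

    for (i = nentries - 1; i >= 0; i--)
        if (table[i].alive && strcmp(table[i].name, name) == 0)
            return &table[i];
    return 0;
}

/* Record a use of name in the current scope; returns 0 if undeclared. */
int use(const char *name)
{
    struct entry *e = lookup(name);

    if (e == 0)
        return 0;
    e->last_used = open_scopes[depth - 1];
    return 1;
}

/* Declare name in the current scope. Returns 0 when an outer
 * declaration of name was already used inside this scope. */
int declare(const char *name)
{
    int current = open_scopes[depth - 1];
    struct entry *outer = lookup(name);
    int ok = outer == 0 || outer->last_used < current;

    table[nentries].name = name;
    table[nentries].scope = current;
    table[nentries].last_used = 0;
    table[nentries].alive = 1;
    nentries++;
    return ok;
}
```
.fi
Applied to the program above, the use of x in the definition of y records the
scope number of procedure p in the entry of the outer x, so the subsequent
declaration of x inside p is rejected.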
|
||||
|
||||
Note: The compiler does not deviate from the standard in the following program:
|
||||
.nf
|
||||
.in +4m
|
||||
\fBprogram\fR conforms\fB;
|
||||
type\fR
|
||||
x \fB=\fR real\fB;
|
||||
procedure\fR p\fB;
|
||||
type\fR
|
||||
y \fB= ^\fRx\fB;\fR
|
||||
x \fB=\fR boolean\fB;
|
||||
var\fR
|
||||
p \fB:\fR y\fB;
|
||||
begin end;
|
||||
begin
|
||||
end.\fR
|
||||
.in -4m
|
||||
.fi
|
||||
|
||||
In procedure p, the variable p is a pointer to boolean.
|
||||
.fi
|
||||
.in -3m
|
||||
.sp
|
||||
.IP "\fBISO 6.4.3.2:\fR"
|
||||
The standard specifies that any ordinal type is allowed as index-type.
|
||||
.sp
|
||||
.in +3m
|
||||
The required type \fIinteger\fR is not allowed as index-type, i.e.
|
||||
.ti +2m
|
||||
\fBARRAY [ \fIinteger\fB ] OF\fR <component-type>
|
||||
is not permitted.
|
||||
.br
|
||||
This could be implemented, but this might cause problems on machines with
|
||||
a small memory.
|
||||
.in -3m
|
||||
.sp
|
||||
.IP "\fBISO 6.4.3.3:\fR"
|
||||
\h'-1u'The type possessed by the variant-selector, called the tag-type, must
|
||||
be an ordinal type, so the integer type is permitted. The values denoted by
|
||||
all case-constants shall be distinct and the set thereof shall be equal
|
||||
to the set of values specified by the tag-type.
|
||||
.sp
|
||||
.in +3m
|
||||
Because it is impractical to enumerate all integers as case-constants,
the integer type is not permitted as tag-type. It would not make a great
difference to allow it as tag-type.
|
||||
.in -3m
|
||||
.sp
|
||||
.IP "\fBISO 6.8.3.9:\fR"
|
||||
The standard specifies that the control-variable of a for-statement is not
|
||||
allowed to be modified while executing the loop.
|
||||
.sp
|
||||
.in +3m
|
||||
Violation of this rule is not detected. An algorithm to implement this rule
|
||||
can be found in [PCV].
|
92
doc/pascal/example.doc
Normal file
|
@ -0,0 +1,92 @@
|
|||
.sp 1.5i
|
||||
.ft B
|
||||
Appendix C: An example
|
||||
.ft R
|
||||
.nh
|
||||
.nf
|
||||
|
||||
|
||||
\h'+10u' 1 \fBprogram\fR factorials(input, output);
|
||||
\h'+10u' 2 { This program prints factorials }
|
||||
\h'+10u' 3
|
||||
\h'+10u' 4 \fBconst\fR
|
||||
\h'+10u' 5 FAC1 = 1;
|
||||
\h'+10u' 6 \fBvar\fR
|
||||
\h'+10u' 7 i : integer;
|
||||
\h'+10u' 8
|
||||
\h'+10u' 9 \fBfunction\fR factorial(n : integer) : integer;
|
||||
10 \fBbegin\fR
|
||||
11 \fBif\fR n = FAC1 \fBthen\fR
|
||||
12 factorial := FAC1
|
||||
13 \fBelse\fR
|
||||
14 factorial := n * factorial(n-1);
|
||||
15 \fBend\fR;
|
||||
16
|
||||
17 \fBbegin\fR
|
||||
18 write('Give a number : ');
|
||||
19 readln(i);
|
||||
20 \fBif\fR i < 1 \fBthen\fR
|
||||
21 writeln('No factorial')
|
||||
22 \fBelse\fR
|
||||
23 writeln(factorial(i):1);
|
||||
24 \fBend\fR.
|
||||
.bp
|
||||
.po
|
||||
.DS
|
||||
mes 2,4,4 loc 16
|
||||
\&.1 cal $_wrs
|
||||
rom 'factorials.p\(rs000' asp 12
|
||||
i lin 19
|
||||
bss 4,0,0 lae input
|
||||
output cal $_rdi
|
||||
bss 540,0,0 asp 4
|
||||
input lfr 4
|
||||
bss 540,0,0 ste i
|
||||
exp $factorial lae input
|
||||
pro $factorial, ? cal $_rln
|
||||
mes 9,4 asp 4
|
||||
lin 11 lin 20
|
||||
lol 0 loe i
|
||||
loc 1 loc 1
|
||||
cmi 4 cmi 4
|
||||
teq tlt
|
||||
zeq *1 zeq *1
|
||||
lin 12 lin 21
|
||||
loc 1 .4
|
||||
stl -4 rom 'No factorial'
|
||||
bra *2 lae output
|
||||
1 lae .4
|
||||
lin 14 loc 12
|
||||
lol 0 cal $_wrs
|
||||
lol 0 asp 12
|
||||
loc 1 lae output
|
||||
sbi 4 cal $_wln
|
||||
cal $factorial asp 4
|
||||
asp 4 bra *2
|
||||
lfr 4 1
|
||||
mli 4 lin 23
|
||||
stl -4 lae output
|
||||
2 loe i
|
||||
lin 15 cal $factorial
|
||||
mes 3,0,4,0,0 asp 4
|
||||
lol -4 lfr 4
|
||||
ret 4 loc 1
|
||||
end 4 cal $_wsi
|
||||
exp $m_a_i_n asp 12
|
||||
pro $m_a_i_n, ? lae output
|
||||
mes 9,0 cal $_wln
|
||||
fil .1 asp 4
|
||||
\&.2 2
|
||||
con input, output lin 24
|
||||
lxl 0 loc 0
|
||||
lae .2 cal $_hlt
|
||||
loc 2 end 0
|
||||
lxa 0 mes 4,24,'factorials.p\(rs000'
|
||||
cal $_ini
|
||||
asp 16
|
||||
lin 18
|
||||
\&.3
|
||||
rom 'Give a number : '
|
||||
lae output
|
||||
lae .3
|
||||
.DE
|
60
doc/pascal/extensions.doc
Normal file
|
@ -0,0 +1,60 @@
|
|||
.pl 12i
|
||||
.sp 1.5i
|
||||
.NH
|
||||
Extensions to Pascal as specified by ISO 7185
|
||||
.nh
|
||||
|
||||
.IP "\fBISO 6.1.3:\fR" 14
|
||||
\h'-11u'The underscore is treated as a letter when the \-u option is turned
|
||||
on (see also section 5.2). This is implemented to be compatible with
|
||||
Pascal-VU and can be used in identifiers to increase readability.
|
||||
.sp
|
||||
.IP "\fBISO 6.1.4:\fR"
|
||||
\h'-12u'The directive \fIextern\fR can be used in a procedure-declaration or
|
||||
function-declaration to specify that the procedure-block or function-block
|
||||
corresponding to that declaration is external to the program-block. This can
|
||||
be used in conjunction with library routines.
|
||||
.sp
|
||||
.IP "\fBISO 6.1.9:\fR"
|
||||
\h'-22u'An alternative representation for the following tokens and delimiting
|
||||
characters is recognized:
|
||||
.in +5m
|
||||
.ft 5
|
||||
\fBtoken
|
||||
.ft 5
|
||||
\& \fBalternative token
|
||||
.ft 5
|
||||
.sp
|
||||
^
|
||||
\& @
|
||||
.br
|
||||
[
|
||||
\& (.
|
||||
.br
|
||||
]
|
||||
\& .)
|
||||
|
||||
.ft 5
|
||||
\fBdelimiting character
|
||||
.ft 5
|
||||
\& \fBalternative delimiting pair of characters
|
||||
.ft 5
|
||||
.sp
|
||||
{
|
||||
\& (*
|
||||
.br
|
||||
}
|
||||
\& *)
|
||||
.in -5m
|
||||
.sp
|
||||
.IP "\fBISO 6.6.3.7.2:\fR"
|
||||
\h'-1u'A conformant array parameter can be passed as value conformant array
|
||||
parameter without the restrictions imposed by the standard. The compiler
|
||||
gives a warning. This is implemented to keep the parameter mechanism orthogonal (see also Chapter 4).
|
||||
.sp
|
||||
.IP "\fBISO 6.9.3.1:\fR"
|
||||
\h'-16u'If the value of the argument \fITotalWidth\fR of the required
|
||||
procedure \fIwrite\fR is zero or negative, no characters are written for
|
||||
character, string or boolean type arguments. If the value of the argument
|
||||
\fIFracDigits\fR of the required procedure \fIwrite\fR is zero or negative,
|
||||
the fraction and '.' character are suppressed for fixed-point arguments.
|
76
doc/pascal/hints.doc
Normal file
|
@ -0,0 +1,76 @@
|
|||
.sp 1.5i
|
||||
.nr H1 7
|
||||
.NH
|
||||
Hints to change the standard
|
||||
.nh
|
||||
.sp
|
||||
.LP
|
||||
We encountered some difficulties when the compiler was developed. In this
chapter some hints for changing the standard are presented, which would make
the implementation of the compiler less difficult. The semantics of Pascal
would not be altered by these adaptations.
|
||||
.sp 2
|
||||
.LP
|
||||
\- Some minor changes in the grammar of Pascal from the user's point of view,
|
||||
but which make the writing of an LL(1) parser considerably easier, could be:
|
||||
.in +3m
|
||||
.nf
|
||||
field-list : [ ( fixed-part [ variant-part ] | variant-part ) ] .
|
||||
fixed-part : record-section \fB;\fR { record-section \fB;\fR } .
|
||||
variant-part : \fBcase\fR variant-selector \fBof\fR variant \fB;\fR { variant \fB;\fR } .
|
||||
|
||||
case-statement : \fBcase\fR case-index \fBof\fR case-list-element \fB;\fR { case-list-element \fB;\fR } \fBend\fR .
|
||||
.fi
|
||||
.in -3m
|
||||
|
||||
|
||||
.LP
|
||||
\- To ease the semantic checking on sets, the principle of qualified sets could
be used, in which every set-constructor must be preceded by its type-identifier:
|
||||
.nf
|
||||
.ti +3m
|
||||
set-constructor : type-identifier \fB[\fR [ member-designator { \fB,\fR member-designator } ] \fB]\fR .
|
||||
|
||||
Example:
|
||||
t1 = set of 1..5;
|
||||
t2 = set of integer;
|
||||
|
||||
The type of [3, 5] would be ambiguous, but the type of t1[3, 5] would not.
|
||||
.fi
|
||||
|
||||
|
||||
.LP
|
||||
\- Another problem arises from the fact that a function name can appear in
|
||||
three distinct 'use' contexts: function call, assignment of function
|
||||
result and as function parameter.
|
||||
.br
|
||||
Example:
|
||||
.in +5m
|
||||
.nf
|
||||
\fBprogram\fR function_name;
|
||||
|
||||
\fBfunction\fR p(x : integer; function y : integer) : integer;
|
||||
\fBbegin\fR .. \fBend\fR;
|
||||
|
||||
\fBfunction\fR f : integer;
|
||||
\fBbegin\fR
|
||||
f := p(f, f); (*)
|
||||
\fBend\fR;
|
||||
|
||||
\fBbegin\fR .. \fBend\fR.
|
||||
.fi
|
||||
.in -5m
|
||||
|
||||
A possible solution in case of a call (also a procedure call) would be to
|
||||
make the (possibly empty) actual-parameter-list mandatory. The assignment
|
||||
of the function result could be changed into a \fIreturn\fR statement,
although this would change the semantics of the program slightly.
|
||||
.br
|
||||
The above statement (*) would look like this: return p(f(), f);
|
||||
|
||||
|
||||
.LP
|
||||
\- Another extension to the standard could be the implementation of an
|
||||
\fIotherwise\fR clause in a case-statement. This would behave exactly like
|
||||
the \fIdefault\fR clause in a switch-statement in C.
|
||||
.bp
|
36
doc/pascal/his.doc
Normal file
|
@ -0,0 +1,36 @@
|
|||
.sp 2
|
||||
.NH
|
||||
History & Acknowledgements
|
||||
.nh
|
||||
.sp 2
|
||||
.ft B
|
||||
History
|
||||
.ft R
|
||||
.sp
|
||||
.LP
|
||||
The purpose of this project was to make a Pascal compiler that satisfies
the requirements of the ISO standard. The task was considerably simplified,
|
||||
because parts of the Modula-2 compiler were used. This gave the advantage of
|
||||
increasing the uniformity of the compilers in ACK.
|
||||
.br
|
||||
While developing the compiler, a number of errors were detected in the Modula-2
|
||||
compiler, EM utility modules and the old Pascal compiler.
|
||||
|
||||
.sp 2
|
||||
.ft B
|
||||
Acknowledgements
|
||||
.ft R
|
||||
.sp
|
||||
.LP
|
||||
During the development of the compiler, valuable support was received from
|
||||
a number of persons. In this regard we owe a debt of gratitude to
|
||||
Fred van Beek, Casper Capel, Rob Dekker, Frank Engel, Jos\('e Gouweleeuw
|
||||
and Sonja Keijzer (Jut and Jul !!), Herold Kroon, Martin van Nieuwkerk,
|
||||
Sjaak Schouten, Eric Valk, and Didan Westra.
|
||||
.br
|
||||
Special thanks are reserved for Dick Grune, who introduced us to the field of
Compiler Design and who helped test the compiler, and for Ceriel Jacobs, who
developed LLgen and the Modula-2 compiler of ACK. Finally, we would like to
thank Erik Baalbergen, who supervised this entire project and
gave us many valuable suggestions.
|
||||
.bp
|
87
doc/pascal/improv.doc
Normal file
|
@ -0,0 +1,87 @@
|
|||
.sp 2
|
||||
.NH
|
||||
Improvements to the compiler
|
||||
.nh
|
||||
.sp
|
||||
.LP
|
||||
In consideration of portability, a restricted option could be implemented.
|
||||
Under this option, the extensions and warnings should be considered as errors.
|
||||
|
||||
|
||||
.LP
|
||||
The restrictions imposed by the standard on the control variable of a
|
||||
for-statement should be implemented (\fBISO 6.8.3.9\fR).
|
||||
|
||||
.LP
|
||||
To check whether a function returns a valid result, the following algorithm
|
||||
could be used. When a function is entered a hidden temporary variable of
|
||||
type boolean is created. This variable is initialized with the value false.
|
||||
The variable is set to true, when an assignment to the function name occurs.
|
||||
On exit of the function a test is performed on the variable. If the value
|
||||
of the variable is false, a run-time error occurs.
|
||||
.br
|
||||
Note: The check has to be done at run time.
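.sp
The check described above can be sketched in C as a hand translation of a
small Pascal function. The structure \fIresult\fR and the routines
\fIenter_function\fR, \fIassign_result\fR and \fIexit_function\fR are
hypothetical names introduced for illustration only; the compiler does not
contain them, and a real implementation would raise a run-time error where the
sketch returns false.
.nf
```c
#include <stdbool.h>

/* Result slot of a translated Pascal function returning an integer. */
struct result {
    int value;       /* the function result */
    bool assigned;   /* hidden boolean variable */
};

/* On entry: the hidden variable is initialized with false. */
void enter_function(struct result *r)
{
    r->assigned = false;
}

/* Assignment to the function name: store the value, set the flag. */
void assign_result(struct result *r, int v)
{
    r->value = v;
    r->assigned = true;
}

/* On exit: test the flag. Returns false where the generated code
 * would raise a run-time error; otherwise stores the result in *out. */
bool exit_function(const struct result *r, int *out)
{
    if (!r->assigned)
        return false;
    *out = r->value;
    return true;
}

/* Hand translation of:
 *   function half(n : integer) : integer;
 *   begin if not odd(n) then half := n div 2 end;
 * which leaves its result undefined for odd n. */
bool half(int n, int *out)
{
    struct result r;

    enter_function(&r);
    if (n % 2 == 0)
        assign_result(&r, n / 2);
    return exit_function(&r, out);
}
```
.fi
Whether the result is assigned depends on the value of n, which is only known
when the function is executed; this is why the test cannot be moved to compile
time.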
|
||||
|
||||
|
||||
.LP
|
||||
The \fIundefined value\fR should be implemented. A problem arises with
|
||||
local variables, for which space on the stack is allocated. A possible
|
||||
solution would be to generate code for the initialization of the local
|
||||
variables with the undefined value at the beginning of a procedure or
|
||||
function.
|
||||
.br
|
||||
The implementation for the global variables is easy, because \fBbss\fR
|
||||
blocks are used.
|
||||
|
||||
|
||||
.LP
|
||||
Closely related to the last point is the generation of warnings when
|
||||
variables are never used or assigned. This is not yet implemented.
|
||||
|
||||
|
||||
.LP
|
||||
The error messages could specify more details about the errors that occurred,
|
||||
if some additional testing is done.
|
||||
|
||||
.bp
|
||||
.LP
|
||||
Every time the compiler detects sets with different base-types, a warning
|
||||
is given. Sometimes this is superfluous.
|
||||
|
||||
.nf
|
||||
\fBprogram\fR sets(output);
|
||||
\fBtype\fR
|
||||
week = (sunday, monday, tuesday, wednesday, thursday, friday, saturday);
|
||||
workweek = monday..friday;
|
||||
\fBvar\fR
|
||||
s : \fBset of\fR workweek;
|
||||
day : week;
|
||||
\fBbegin\fR
|
||||
day := monday;
|
||||
s := [day]; (* warning *)
|
||||
day := saturday;
|
||||
s := [day]; (* warning *)
|
||||
\fBend\fR.
|
||||
.fi
|
||||
The new compiler gives two warnings; the first one is redundant.
|
||||
|
||||
|
||||
.LP
|
||||
A nasty point in the compiler is the way the procedures \fIread, readln,
|
||||
write\fR and \fIwriteln\fR are handled (see also section 2.2). They have
|
||||
been added to the grammar. This implies that they cannot be redefined, as
opposed to the other required procedures and functions. They should be
|
||||
removed from the grammar altogether. This could imply that more semantic
|
||||
checks have to be performed.
|
||||
|
||||
|
||||
.LP
|
||||
No effort is made to detect possible run-time errors during compilation.
|
||||
.br
|
||||
E.g. given the declaration a : \fBarray\fR[1..10] \fBof\fI something\fR, the
array selection a[11] is not reported at compile time.
|
||||
|
||||
|
||||
.LP
|
||||
Some assistance in implementing the improvements mentioned above can be
obtained from [PCV].
|
342
doc/pascal/internal.doc
Normal file
|
@ -0,0 +1,342 @@
|
|||
.pl 12.5i
|
||||
.sp 1.5i
|
||||
.NH
|
||||
The compiler
|
||||
|
||||
.nh
|
||||
.LP
|
||||
The compiler can be divided roughly into four modules:
|
||||
|
||||
\(bu lexical analysis
|
||||
.br
|
||||
\(bu syntax analysis
|
||||
.br
|
||||
\(bu semantic analysis
|
||||
.br
|
||||
\(bu code generation
|
||||
.br
|
||||
|
||||
The four modules are grouped into one pass. The activity of these modules
|
||||
is interleaved during the pass.
|
||||
.br
|
||||
The lexical analyzer, some expression handling routines and various
|
||||
data structures from the Modula-2 compiler contributed to the project.
|
||||
.sp 2
|
||||
.NH 2
|
||||
Lexical Analysis
|
||||
|
||||
.LP
|
||||
The first module of the compiler is the lexical analyzer. In this module, the
|
||||
stream of input characters making up the source program is grouped into
|
||||
\fItokens\fR, as defined in \fBISO 6.1\fR. The analyzer is hand-written,
|
||||
because the lexical analyzer generator, which was at our disposal,
|
||||
\fILex\fR [LEX], produces much slower analyzers. A character table, in the file
|
||||
\fIchar.c\fR, is created using the program \fItab\fR which takes as input
|
||||
the file \fIchar.tab\fR. In this table each character is placed into a
|
||||
particular class. The classes, as defined in the file \fIclass.h\fR,
|
||||
represent a set of tokens. The strategy of the analyzer is as follows: the
|
||||
first character of a new token is used in a multiway branch to eliminate as
|
||||
many candidate tokens as possible. Then the remaining characters of the token
|
||||
are read. The constant INP_NPUSHBACK, defined in the file \fIinput.h\fR,
|
||||
specifies the maximum number of characters the analyzer looks ahead. The
|
||||
value has to be at least 3, to handle input sequences such as:
|
||||
.br
|
||||
1e+4 (which is a real number)
|
||||
.br
|
||||
1e+a (which is the integer 1, followed by the identifier "e", a plus, and the identifier "a")
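.sp
The need for three characters of lookahead can be illustrated with a small C
routine. It is a sketch under our own assumptions and not the code of the
analyzer: the name \fIscan_number\fR is hypothetical, and instead of pushing
characters back on an input stream it simply reports how many characters of a
string belong to the number.
.nf
```c
#include <ctype.h>
#include <stddef.h>

/* Scan an unsigned number at the start of s. Returns the number of
 * characters that belong to the token and sets *is_real. Characters
 * inspected beyond the returned length are the ones a real analyzer
 * would have to push back. */
size_t scan_number(const char *s, int *is_real)
{
    size_t i = 0, j;

    *is_real = 0;
    while (isdigit((unsigned char)s[i]))
        i++;
    if (i == 0)
        return 0;                       /* not a number */
    if (s[i] == '.' && isdigit((unsigned char)s[i + 1])) {
        *is_real = 1;                   /* fractional part */
        i++;
        while (isdigit((unsigned char)s[i]))
            i++;
    }
    if (s[i] == 'e' || s[i] == 'E') {
        j = i + 1;                      /* first lookahead: the e */
        if (s[j] == '+' || s[j] == '-')
            j++;                        /* second lookahead: the sign */
        if (isdigit((unsigned char)s[j])) {   /* third lookahead */
            *is_real = 1;
            while (isdigit((unsigned char)s[j]))
                j++;
            i = j;                      /* the scale factor is accepted */
        }
        /* otherwise up to three characters are given back */
    }
    return i;
}
```
.fi
For the input 1e+4 all four characters are accepted as a real number; for 1e+a
only the first character is accepted, after the e, the plus and the a have
been inspected.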
|
||||
|
||||
Another aspect of this module is the insertion and deletion of tokens
|
||||
required by the parser for the recovery of syntactic errors (see also section
|
||||
2.2). A generic input module [ACK] is used to avoid the burden of I/O.
|
||||
.sp 2
|
||||
.NH 2
|
||||
Syntax Analysis
|
||||
|
||||
.LP
|
||||
The second module of the compiler is the parser, which is the central part of
|
||||
the compiler. It invokes the routines of the other modules. The tokens obtained
|
||||
from the lexical analyzer are grouped into grammatical phrases. These phrases
|
||||
are stored as parse trees and handed over to the next part. The parser is
|
||||
generated using \fILLgen\fR [LL], a tool for generating an efficient recursive
descent parser without backtracking from an Extended Context Free Syntax.
|
||||
.br
|
||||
An error recovery mechanism is generated almost completely automatically. A
|
||||
routine called \fILLmessage\fR had to be written, which gives the necessary
|
||||
error messages and deals with the insertion and deletion of tokens.
|
||||
The routine \fILLmessage\fR must accept one parameter, whose value is
|
||||
a token number, zero or -1. A zero parameter indicates that the current token
|
||||
(the one in the external variable \fILLsymb\fR) is deleted.
|
||||
A -1 parameter indicates that the parser expected end of file, but did
|
||||
not get it. The parser will then skip tokens until end of file is detected.
|
||||
A parameter that is a token number (a positive parameter) indicates that
|
||||
this token is to be inserted in front of the token currently in \fILLsymb\fR.
|
||||
Also, care must be taken, that the token currently in \fILLsymb\fR is again
|
||||
returned by the \fBnext\fR call to the lexical analyzer, with the proper
|
||||
attributes. So, the lexical analyzer must have a facility to push back one
|
||||
token.
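.sp
The interplay between \fILLmessage\fR and the push-back facility can be
sketched in C as follows. Only the names \fILLmessage\fR and \fILLsymb\fR are
taken from the description above; the token numbers, the routines \fIlex\fR
and \fIset_input\fR, and the counter \fIerror_count\fR are hypothetical. Token
attributes and the printing of the error messages are left out, so this is an
outline of the mechanism rather than the routine used in the compiler.
.nf
```c
enum { EOFILE = 256, IDENT, SEMICOLON };   /* hypothetical token numbers */

int LLsymb;                        /* the current token */
int error_count;                   /* number of syntax errors reported */

static const int *next_token;      /* hypothetical input, ends with EOFILE */
static int pushed;                 /* one pushed-back token, 0 if none */

void set_input(const int *tokens)
{
    next_token = tokens;
    pushed = 0;
}

/* Lexical analyzer with a facility to push back one token. */
int lex(void)
{
    if (pushed) {
        LLsymb = pushed;           /* return the same token again */
        pushed = 0;
    } else {
        LLsymb = *next_token;
        if (LLsymb != EOFILE)
            next_token++;
    }
    return LLsymb;
}

void LLmessage(int tk)
{
    error_count++;                 /* a message would be printed here */
    if (tk == 0) {
        /* the current token is deleted: nothing has to be undone */
    } else if (tk == -1) {
        /* end of file was expected: the parser skips the rest itself */
    } else {
        /* tk is inserted in front of the current token, so the
         * current token must be returned again by the next call */
        pushed = LLsymb;
    }
}
```
.fi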
|
||||
.br
|
||||
Calls to the two standard procedures \fIwrite\fR and \fIwriteln\fR can be
|
||||
different from calls to other procedures. The syntax of a write-parameter
|
||||
is different from the syntax of an actual-parameter. We decided to include
|
||||
them, together with \fIread\fR and \fIreadln\fR, in the grammar. An alternate
|
||||
solution would be to make the syntax of an actual-parameter identical to the
|
||||
syntax of a write-parameter. Afterwards the parameter has to be checked to
|
||||
see whether it is used properly or not.
|
||||
.bp
|
||||
As the parser is LL(1), it must always be able to determine what to do,
|
||||
based on the last token read (\fILLsymb\fR). Unfortunately, this was not the
|
||||
case with the grammar as specified in [ISO]. Two kinds of problems
|
||||
appeared, viz. the \fBalternation\fR and \fBrepetition\fR conflicts.
|
||||
The examples given in the following paragraphs are taken from the grammar.
|
||||
|
||||
.NH 3
|
||||
Alternation conflict
|
||||
|
||||
.LP
|
||||
An alternation conflict arises when the parser cannot decide which
|
||||
production to choose.
|
||||
.br
|
||||
\fBExample:\fR
|
||||
.in +2m
|
||||
.ft 5
|
||||
.nf
|
||||
procedure-declaration : procedure-heading \fB';'\f5 directive |
|
||||
.br
|
||||
\h'\w'procedure-declaration : 'u'procedure-identification \fB';'\f5 procedure-block |
|
||||
.br
|
||||
\h'\w'procedure-declaration : 'u'procedure-heading \fB';'\f5 procedure-block ;
|
||||
.br
|
||||
procedure-heading : \fBprocedure\f5 identifier [ formal-parameter-list ]? ;
|
||||
.br
|
||||
procedure-identification : \fBprocedure\f5 procedure-identifier ;
|
||||
.fi
|
||||
.ft R
|
||||
.in -2m
|
||||
|
||||
A sentence that starts with the terminal \fBprocedure\fR can be derived from any of the
|
||||
three alternative productions. This conflict can be resolved in two ways:
|
||||
(1) adjusting the grammar, in which case some rules are usually replaced by
one rule and more work has to be done in the semantic analysis; or (2) using
the LLgen conflict resolver "\fB%if\fR (C-expression)": if the C-expression
evaluates to non-zero, the production in question is chosen, otherwise one of
the remaining rules is chosen. The grammar rules were rewritten to solve this
|
||||
conflict. The new rules are given below. For more details see the file
|
||||
\fIdeclar.g\fR.
|
||||
|
||||
.in +2m
|
||||
.ft 5
|
||||
.nf
|
||||
procedure-declaration : procedure-heading \fB';'\f5 ( directive | procedure-block ) ;
|
||||
.br
|
||||
procedure-heading : \fBprocedure\f5 identifier [ formal-parameter-list ]? ;
|
||||
.fi
|
||||
.ft R
|
||||
.in -2m
|
||||
|
||||
A special case of an alternation conflict, which is common to many block
|
||||
structured languages, is the \fI"dangling-else"\fR ambiguity.
|
||||
|
||||
.in +2m
|
||||
.ft 5
|
||||
.nf
|
||||
if-statement : \fBif\f5 boolean-expression \fBthen\f5 statement [ else-part ]? ;
|
||||
.br
|
||||
else-part : \fBelse\f5 statement ;
|
||||
.fi
|
||||
.ft R
|
||||
.in -2m
|
||||
|
||||
The following statement that can be derived from the rules above is ambiguous:
|
||||
|
||||
.ti +2m
|
||||
\fBif\f5 boolean-expr-1 \fBthen\f5 \fBif\f5 boolean-expr-2 \fBthen\f5 statement-1 \fBelse\f5 statement-2
|
||||
.ft R
|
||||
|
||||
|
||||
.ps 8
|
||||
.vs 7
|
||||
.PS
|
||||
move right 1.1i
|
||||
S: line down 0.5i
|
||||
"if-statement" at S.start above
|
||||
.ft B
|
||||
"then" at S.end below
|
||||
.ft R
|
||||
move to S.start then down 0.25i
|
||||
L: line left 0.5i then down 0.25i
|
||||
box ht 0.33i wid 0.6i "boolean" "expression-1"
|
||||
move to L.start then left 0.5i
|
||||
L: line left 0.5i then down 0.25i
|
||||
.ft B
|
||||
"if" at L.end below
|
||||
.ft R
|
||||
move to L.start then right 0.5i
|
||||
L: line right 0.5i then down 0.25i
|
||||
"statement" at L.end below
|
||||
move to L.end then down 0.10i
|
||||
L: line down 0.25i dashed
|
||||
"if-statement" at L.end below
|
||||
move to L.end then down 0.10i
|
||||
L: line down 0.5i
|
||||
.ft B
|
||||
"then" at L.end below
|
||||
.ft R
|
||||
move to L.start then down 0.25i
|
||||
L: line left 0.5i then down 0.25i
|
||||
box ht 0.33i wid 0.6i "boolean" "expression-2"
|
||||
move to L.start then left 0.5i
|
||||
L: line left 0.5i then down 0.25i
|
||||
.ft B
|
||||
"if" at L.end below
|
||||
.ft R
|
||||
move to L.start then right 0.5i
|
||||
L: line right 0.5i then down 0.25i
|
||||
box ht 0.33i wid 0.6i "statement-1"
|
||||
move to L.start then right 0.5i
|
||||
L: line right 0.5i then down 0.25i
|
||||
.ft B
|
||||
"else" at L.end below
|
||||
.ft R
|
||||
move to L.start then right 0.5i
|
||||
L: line right 0.5i then down 0.25i
|
||||
box ht 0.33i wid 0.6i "statement-2"
|
||||
move to S.start
|
||||
move right 3.5i
|
||||
L: line down 0.5i
|
||||
"if-statement" at L.start above
|
||||
.ft B
|
||||
"then" at L.end below
|
||||
.ft R
|
||||
move to L.start then down 0.25i
|
||||
L: line left 0.5i then down 0.25i
|
||||
box ht 0.33i wid 0.6i "boolean" "expression-1"
|
||||
move to L.start then left 0.5i
|
||||
L: line left 0.5i then down 0.25i
|
||||
.ft B
|
||||
"if" at L.end below
|
||||
.ft R
|
||||
move to L.start then right 0.5i
|
||||
S: line right 0.5i then down 0.25i
|
||||
"statement" at S.end below
|
||||
move to S.start then right 0.5i
|
||||
L: line right 0.5i then down 0.25i
|
||||
.ft B
|
||||
"else" at L.end below
|
||||
.ft R
|
||||
move to L.start then right 0.5i
|
||||
L: line right 0.5i then down 0.25i
|
||||
box ht 0.33i wid 0.6i "statement-2"
|
||||
move to S.end then down 0.10i
|
||||
L: line down 0.25i dashed
|
||||
"if-statement" at L.end below
|
||||
move to L.end then down 0.10i
|
||||
L: line down 0.5i
|
||||
.ft B
|
||||
"then" at L.end below
|
||||
.ft R
|
||||
move to L.start then down 0.25i
|
||||
L: line left 0.5i then down 0.25i
|
||||
box ht 0.33i wid 0.6i "boolean" "expression-2"
|
||||
move to L.start then left 0.5i
|
||||
L: line left 0.5i then down 0.25i
|
||||
.ft B
|
||||
"if" at L.end below
|
||||
.ft R
|
||||
move to L.start then right 0.5i
|
||||
L: line right 0.5i then down 0.25i
|
||||
box ht 0.33i wid 0.6i "statement-1"
|
||||
.PE
|
||||
.ps
|
||||
.vs
|
||||
\h'615u'(a)\h'1339u'(b)
|
||||
.sp
|
||||
.ce
|
||||
Two parse trees showing the \fIdangling-else\fR ambiguity
|
||||
.sp 2
|
||||
According to the standard, \fBelse\fR is matched with the nearest preceding
|
||||
unmatched \fBthen\fR, i.e. parse tree (a) is valid (\fBISO 6.8.3.4\fR).
|
||||
This conflict is statically resolved in LLgen by using "\fB%prefer\fR",
|
||||
which is equivalent in behaviour to "\fB%if\fR(1)".
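The effect of always preferring the else-part can be illustrated with a small hand-written recursive-descent parser in C. This is a sketch under stated assumptions, not code generated by LLgen: the token names and the function \fIelse_owner\fR are hypothetical, conditions and simple statements are reduced to single tokens, and the parser merely reports the nesting depth of the \fBif\fR that received the last \fBelse\fR it saw.

```c
#include <assert.h>

/* Hypothetical token set: conditions and simple statements are
 * reduced to the single tokens T_COND and T_STMT. */
enum tok { T_IF, T_THEN, T_ELSE, T_COND, T_STMT, T_END };

static const enum tok *cur;   /* current position in the token array */
static int depth;             /* nesting depth of the if being parsed */
static int else_depth;        /* depth of the if that got the last else */

static void statement(void);

static void if_statement(void)
{
    depth++;
    assert(*cur == T_IF);   cur++;
    assert(*cur == T_COND); cur++;
    assert(*cur == T_THEN); cur++;
    statement();
    /* Prefer the else-part: take it whenever the next token is else,
     * so the else binds to the innermost if still being parsed. */
    if (*cur == T_ELSE) {
        cur++;
        else_depth = depth;
        statement();
    }
    depth--;
}

static void statement(void)
{
    if (*cur == T_IF) {
        if_statement();
    } else {
        assert(*cur == T_STMT);
        cur++;
    }
}

/* Parse one statement; return the depth of the if that received the
 * last else encountered (0 if there was no else). */
int else_owner(const enum tok *tokens)
{
    cur = tokens;
    depth = 0;
    else_depth = 0;
    statement();
    return else_depth;
}
```

For the ambiguous statement shown earlier, the \fBelse\fR is attached to the inner \fBif\fR (depth 2), which corresponds to parse tree (a).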
|
||||
.bp
|
||||
.NH 3
|
||||
Repetition conflict
|
||||
|
||||
.LP
|
||||
A repetition conflict arises when the parser can not decide whether to choose
|
||||
a production once more, or not.
|
||||
.br
|
||||
\fBExample:\fR
|
||||
.in +2m
|
||||
.ft 5
|
||||
.nf
|
||||
field-list : [ ( fixed-part [ \fB';'\f5 variant-part ]? | variant-part ) [ \fB';'\f5 ]? ]? ;
|
||||
.br
|
||||
fixed-part : record-section [ \fB';'\f5 record-section ]* ;
|
||||
.fi
|
||||
.in -2m
|
||||
.ft R
|
||||
|
||||
When the parser sees the semicolon, it can not decide whether another
|
||||
record-section or a variant-part follows. This conflict can be resolved in
|
||||
two ways: adjusting the grammar or using the conflict resolver,
|
||||
"\fB%while\fR (C-expression)". The grammar rules that deal with this conflict
|
||||
were completely rewritten. For more details, the reader is referred to the
|
||||
file \fIdeclar.g\fR.
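One way of adjusting the grammar is to consume the semicolon first and only then decide, on the next token, what follows. The C sketch below illustrates that idea only; it is not the rewritten grammar of \fIdeclar.g\fR. The token names and the structure \fIfield_info\fR are assumptions, and T_SECTION stands for a complete record-section.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tokens: T_SECTION stands for a whole record-section,
 * T_CASE for the start of a variant-part. */
enum tok { T_SECTION, T_SEMI, T_CASE, T_END };

struct field_info {
    int sections;      /* number of record-sections seen */
    int has_variant;   /* non-zero if a variant-part follows */
};

/* Scan a field-list with one token of lookahead.  The semicolon is
 * consumed before the decision is made, so the parser never has to
 * choose between "repeat" and "stop" while looking at the ';'. */
struct field_info field_list(const enum tok *t)
{
    struct field_info info = { 0, 0 };

    assert(t != NULL);
    if (*t == T_SECTION) {
        t++;
        info.sections++;
        while (*t == T_SEMI) {
            t++;                      /* consume ';' first */
            if (*t == T_SECTION) {    /* another record-section */
                t++;
                info.sections++;
            } else {
                break;                /* variant-part or trailing ';' */
            }
        }
    }
    if (*t == T_CASE)
        info.has_variant = 1;
    return info;
}
```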
|
||||
.sp 2
|
||||
.NH 2
|
||||
Semantic Analysis
|
||||
|
||||
.LP
|
||||
The third module of the compiler checks the semantic conventions of
|
||||
ISO-Pascal. To check the program being parsed, actions have been used in
|
||||
LLgen. An action consists of several C-statements, enclosed in braces
|
||||
"{" and "}". In order to facilitate communication between the actions and
|
||||
\fILLparse\fR, the parsing routines can be given C-like parameters and
|
||||
local variables. An important part of the semantic analyzer is the symbol
|
||||
table. This table stores all information concerning identifiers and their
|
||||
definitions. Symbol-table lookup and hashing are done by a generic namelist
|
||||
module [ACK]. The parser turns each program construction into a parse tree,
|
||||
which is the major data structure in the compiler. This parse tree is used
|
||||
to exchange information between various routines.
|
||||
.sp 2
|
||||
.NH 2
|
||||
Code Generation
|
||||
|
||||
.LP
|
||||
The final module in the compiler is that of code generation. The information
|
||||
stored in the parse trees is used to generate the EM code [EM]. EM code is
|
||||
generated with the help of a procedural EM-code interface [ACK]. The use of
|
||||
static exchanges is not desired, since the fast back end can not cope with
|
||||
static code exchanges; hence the EM pseudoinstruction \fBexc\fR is never
|
||||
generated.
|
||||
.br
|
||||
Chapter 3 discusses the code generation in more detail.
|
||||
.sp 2
|
||||
.NH 2
|
||||
Error Handling
|
||||
|
||||
.LP
|
||||
The first three modules have in common that they can detect errors in the
|
||||
Pascal program being compiled. If this is the case, a proper message is given
|
||||
and some action is performed. If code generation has to be aborted, an error
|
||||
message is given; otherwise a warning is given. The constant MAXERR_LINE,
|
||||
defined in the file \fIerrout.h\fR, specifies the maximum number of messages
|
||||
given per line. This can be used to avoid long lists of error messages caused
|
||||
by, for example, the omission of a ';'. Three kinds of errors can be
|
||||
distinguished: the lexical error, the syntactic error, and the semantic error.
|
||||
Examples of these errors are, respectively, nested comments, an expression with
|
||||
unbalanced parentheses, and the addition of two characters.
|
||||
.sp 2
|
||||
.NH 2
|
||||
Memory Allocation and Garbage Collection
|
||||
|
||||
.LP
|
||||
The routines \fIst_alloc\fR and \fIst_free\fR provide a mechanism for
|
||||
maintaining free lists of structures, whose first field is a pointer called
|
||||
\fBnext\fR. This field is used to chain free structures together. Each
|
||||
structure, suppose the tag of the structure is ST, has a free list pointed to
|
||||
by h_ST. Associated with this list are the operations: \fInew_ST()\fR, an
|
||||
allocating mechanism which supplies the space for a new ST struct; and
|
||||
\fIfree_ST()\fR, a garbage collecting mechanism which links the specified
|
||||
structure into the free list.
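The free-list scheme can be sketched in C as follows. This is a minimal sketch assuming a structure with tag ST and a single hypothetical \fIpayload\fR field; the real \fIst_alloc\fR and \fIst_free\fR routines are not reproduced here and may, for instance, allocate several structures at a time.

```c
#include <assert.h>
#include <stdlib.h>

struct ST {
    struct ST *next;   /* first field: chains free structures together */
    int payload;       /* hypothetical contents */
};

static struct ST *h_ST = NULL;   /* head of the free list for ST */

/* Supply space for a new ST: reuse a free structure if there is one,
 * otherwise obtain fresh memory. */
struct ST *new_ST(void)
{
    struct ST *p;

    if (h_ST != NULL) {
        p = h_ST;
        h_ST = p->next;
    } else {
        p = malloc(sizeof *p);
        assert(p != NULL);
    }
    p->next = NULL;
    p->payload = 0;
    return p;
}

/* Link the given structure into the free list instead of returning
 * its memory to the system. */
void free_ST(struct ST *p)
{
    p->next = h_ST;
    h_ST = p;
}
```

A structure released with \fIfree_ST\fR is handed out again by the next call to \fInew_ST\fR, which avoids repeated calls to the general-purpose allocator.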
|
||||
.bp
|
166
doc/pascal/options.doc
Normal file
|
@ -0,0 +1,166 @@
|
|||
.sp 1.5i
|
||||
.NH
|
||||
Compiler options
|
||||
.nh
|
||||
.PP
|
||||
There are some options available to control the behaviour of the compiler.
|
||||
Two types of options can be distinguished: compile-time options and
|
||||
run-time options.
|
||||
.sp
|
||||
.NH 2
|
||||
Compile time options
|
||||
.LP
|
||||
.sp
|
||||
There are some options that can be set when the compiler is installed.
|
||||
Those options can be found in the file \fIParameters\fR. To set a parameter
|
||||
just modify its definition in the file \fIParameters\fR. The shell script
|
||||
in the file \fImake.hfiles\fR creates a separate .h file for each parameter.
|
||||
This mechanism is derived from the C compiler in ACK.
|
||||
.sp
|
||||
\fBIDFSIZE\fR
|
||||
.in +3m
|
||||
The maximum number of characters that are significant in an identifier. This
|
||||
value has to be at least the value of \fBMINIDFSIZE\fR, defined in the file
|
||||
\fIoptions.c\fR. A compile-time check is included to see if the value of
|
||||
\fBMINIDFSIZE\fR is legal. The compiler will not recognize some keywords
|
||||
if \fBIDFSIZE\fR is too small.
|
||||
.in -3m
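The effect of a bound on the number of significant characters can be illustrated with a short C sketch. The function \fIsame_identifier\fR is hypothetical and is not taken from the compiler. It may also explain the lower bound: the longest word-symbol listed in Appendix A.1, \fBprocedure\fR, has nine characters, so with fewer significant characters it could no longer be told apart from a longer identifier with the same prefix.

```c
#include <assert.h>
#include <string.h>

/* Compare two identifiers, looking at no more than `significant`
 * characters, as a compiler with a bounded identifier size would. */
int same_identifier(const char *a, const char *b, size_t significant)
{
    assert(a != NULL && b != NULL);
    return strncmp(a, b, significant) == 0;
}
```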
|
||||
.sp
|
||||
\fBISTRSIZE\fR, \fBRSTRSIZE\fR
|
||||
.in +3m
|
||||
The lexical analyzer uses these two values for the allocation of memory needed
|
||||
to store a string. \fBISTRSIZE\fR is the initial number of bytes allocated.
|
||||
\fBRSTRSIZE\fR is the step size used for enlarging the memory needed.
|
||||
.in -3m
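The allocation strategy described for strings can be sketched in C as follows. This is a sketch under stated assumptions: the structure \fIstrbuf\fR and its two functions are hypothetical, and the values chosen here for ISTRSIZE and RSTRSIZE are illustrative, not the compiler's defaults.

```c
#include <assert.h>
#include <stdlib.h>

#define ISTRSIZE 8   /* initial allocation; illustrative value only */
#define RSTRSIZE 4   /* step size for enlarging; illustrative value only */

struct strbuf {
    char *data;    /* characters collected so far */
    size_t len;    /* number of characters stored */
    size_t cap;    /* number of bytes allocated */
};

/* Start with ISTRSIZE bytes. */
void strbuf_init(struct strbuf *b)
{
    b->data = malloc(ISTRSIZE);
    assert(b->data != NULL);
    b->len = 0;
    b->cap = ISTRSIZE;
}

/* Append one character, enlarging the buffer by RSTRSIZE bytes
 * whenever it is full. */
void strbuf_put(struct strbuf *b, char c)
{
    if (b->len == b->cap) {
        char *p = realloc(b->data, b->cap + RSTRSIZE);
        assert(p != NULL);
        b->data = p;
        b->cap += RSTRSIZE;
    }
    b->data[b->len++] = c;
}
```

A larger RSTRSIZE means fewer reallocations for long strings at the cost of more unused memory.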
|
||||
.sp
|
||||
\fBNUMSIZE\fR
|
||||
.in +3m
|
||||
The maximum length of a numeric constant recognized by the lexical analyzer.
|
||||
It is an error if this length is exceeded.
|
||||
.in -3m
|
||||
.sp
|
||||
\fBERROUT\fR, \fBMAXERR_LINE\fR
|
||||
.in +3m
|
||||
Used for error messages. \fBERROUT\fR defines the file on which the
|
||||
messages are written. \fBMAXERR_LINE\fR is the maximum number of error
|
||||
messages given per line.
|
||||
.in -3m
|
||||
.sp
|
||||
\fBSZ_CHAR\fR, \fBAL_CHAR\fR, etc
|
||||
.in +3m
|
||||
The default values of the target machine sizes and alignments. The values
|
||||
can be overridden with the \-V option.
|
||||
.in -3m
|
||||
.sp
|
||||
\fBMAXSIZE\fR
|
||||
.in +3m
|
||||
This value must be set to the maximum of the values of the target machine
|
||||
sizes. This parameter is used in overflow detection (see also section 3.2).
|
||||
.in -3m
|
||||
.sp
|
||||
\fBDENSITY\fR
|
||||
.in +3m
|
||||
This parameter is used to decide what EM instruction has to be generated
|
||||
for a case-statement. If the range of the index value is sparse, i.e.
|
||||
.br
|
||||
.ti +5m
|
||||
(upperbound - lowerbound) / number_of_cases
|
||||
.br
|
||||
is more than some threshold (\fBDENSITY\fR), the \fBcsb\fR instruction is
|
||||
chosen. If the range is dense, a jump table is generated (\fBcsa\fR). A jump table
|
||||
uses more space. Reasonable values are 2, 3 or 4.
|
||||
.br
|
||||
Higher values might also be reasonable on machines that have lots of
|
||||
address space and memory (see also section 3.3.3).
|
||||
.in -3m
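The decision described for \fBDENSITY\fR can be written as a small C function. This is a sketch only: the function and enumeration names are hypothetical, the threshold value 3 is just one of the reasonable values mentioned above, and the use of integer division is an assumption.

```c
#include <assert.h>

#define DENSITY 3   /* illustrative threshold */

enum case_insn { USE_CSA, USE_CSB };

/* Choose the EM instruction for a case-statement from the bounds of
 * the case constants and the number of cases. */
enum case_insn choose_case_insn(long lowerbound, long upperbound,
                                long number_of_cases)
{
    assert(number_of_cases > 0);
    assert(upperbound >= lowerbound);
    /* Sparse range: use csb; dense range: use a jump table (csa). */
    if ((upperbound - lowerbound) / number_of_cases > DENSITY)
        return USE_CSB;
    return USE_CSA;
}
```

With these settings, ten cases covering 1 to 10 give a jump table, whereas five cases spread over 1 to 1000 give \fBcsb\fR.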
|
||||
.sp
|
||||
\fBINP_READ_IN_ONE\fR
|
||||
.in +3m
|
||||
Used by the generic input module. It can either be defined or not defined.
|
||||
Defining it has the effect that files will be read completely into memory
|
||||
using only one read-system call. This should be used only on machines with
|
||||
lots of memory.
|
||||
.in -3m
|
||||
.sp
|
||||
.bp
|
||||
\fBDEBUG\fR
|
||||
.in +3m
|
||||
.nf
|
||||
If this parameter is defined some built-in compiler-debugging tools can be used:
|
||||
.in +2m
|
||||
\(bu only lexical analyzing is done, if the \-l option is given.
|
||||
\(bu if the \-I option is turned on, the allocated number of structures is printed.
|
||||
\(bu the routine debug can be used to print miscellaneous information.
|
||||
\(bu the routine PrNode prints a tree of nodes.
|
||||
\(bu the routine DumpType prints information about a type structure.
|
||||
\(bu the macro DO_DEBUG(x,y) defined as ((x) && (y)) can be used to perform
|
||||
several actions.
|
||||
.in -2m
|
||||
.in -3m
|
||||
.sp
|
||||
.NH 2
|
||||
Run time options
|
||||
.LP
|
||||
.sp
|
||||
The run time options can be given in the command line when the compiler is
|
||||
called.
|
||||
.br
|
||||
They all have the form: \-<character>
|
||||
.br
|
||||
Depending on the option, a character string may have to be specified. The following
|
||||
options are currently available:
|
||||
.sp
|
||||
.IP \-\fBC\fR 18
|
||||
Lower-case and upper-case letters are treated differently (\fBISO 6.1.1\fR).
|
||||
.sp
|
||||
.IP \-\fBu\fR
|
||||
The character '_' is treated like a letter, so it is allowed to use the
|
||||
underscore in identifiers.
|
||||
.br
|
||||
Note: identifiers starting with an underscore may cause problems, because
|
||||
.br
|
||||
\h'\w'Note: 'u'most identifiers in library routines start with an underscore.
|
||||
.sp
|
||||
.IP \-\fBn\fR
|
||||
This option suppresses the generation of register messages.
|
||||
.sp
|
||||
.IP \-\fBr\fR
|
||||
With this option range checks are generated where necessary.
|
||||
.sp
|
||||
.IP \-\fBL\fR
|
||||
Do not generate EM \fBlin\fR and \fBfil\fR instructions. These instructions
|
||||
are used only for profiling.
|
||||
.sp
|
||||
.IP \-\fBM\fR<number>
|
||||
Set the number of characters that are significant in an identifier to <number>.
|
||||
The maximum significant identifier length depends on the constant IDFSIZE,
|
||||
defined in \fIidfsize.h\fR.
|
||||
.sp
|
||||
.IP \-\fBi\fR<number>
|
||||
With this flag the setsize for a set of integers can be changed. The number must
|
||||
be the number of bits per set. Default value : (#bits in a word) \- 1
|
||||
.sp
|
||||
.IP \-\fBw\fR
|
||||
Suppress warning messages (see also section 2.5).
|
||||
.sp
|
||||
.IP \-\fBV\fR[[\fBw\fR|\fBi\fR|\fBf\fR|\fBp\fR|\fBS\fR][\fIsize\fR]?[\fI.alignment\fR]?]*
|
||||
.br
|
||||
Option to set the object sizes and alignments on the target machine
|
||||
dynamically. The objects that can be manipulated are:
|
||||
.br
|
||||
\fBw\fR\h'\w'ifpS'u' word
|
||||
.br
|
||||
\fBi\fR\h'\w'wfpS'u' integer
|
||||
.br
|
||||
\fBf\fR\h'\w'wipS'u' float
|
||||
.br
|
||||
\fBp\fR\h'\w'wifS'u' pointer
|
||||
.br
|
||||
\fBS\fR\h'\w'wifp'u' structure
|
||||
.br
|
||||
In case of a structure, \fIsize\fR is discarded and the \fIalignment\fR is
|
||||
the initial alignment of the structure. The effective alignment is the least
|
||||
common multiple of \fIalignment\fR and the alignment of its members. This
|
||||
option has been implemented so that the compiler can be used as a cross
|
||||
compiler.
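The syntax of the \-V option and the rule for the effective alignment of a structure can be illustrated with the C sketch below. It is not the compiler's option parser: the names \fIobjsize\fR, \fIparse_V\fR and \fIlcm\fR are assumptions, a value of 0 is used here to mean "not given", and the function expects the text that follows \-V.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

struct objsize {
    long size;    /* 0 means "not given" */
    long align;   /* 0 means "not given" */
};

/* Position of each object letter in the table filled by parse_V(). */
static const char V_LETTERS[] = "wifpS";

/* Parse the text following -V, e.g. "w2.2p4S.4".
 * Returns 0 on success, -1 on a syntax error. */
int parse_V(const char *s, struct objsize tab[5])
{
    assert(s != NULL && tab != NULL);
    while (*s != '\0') {
        const char *hit = strchr(V_LETTERS, *s);
        struct objsize *o;
        char *end;

        if (hit == NULL)
            return -1;                 /* unknown object letter */
        o = &tab[hit - V_LETTERS];
        s++;
        if (isdigit((unsigned char)*s)) {
            long size = strtol(s, &end, 10);
            if (*hit != 'S')           /* size is discarded for S */
                o->size = size;
            s = end;
        }
        if (*s == '.') {
            s++;
            if (!isdigit((unsigned char)*s))
                return -1;             /* '.' must be followed by digits */
            o->align = strtol(s, &end, 10);
            s = end;
        }
    }
    return 0;
}

/* Least common multiple, as used for the effective alignment of a
 * structure: lcm(initial alignment, alignment of its members). */
long lcm(long a, long b)
{
    long x = a, y = b;

    assert(a > 0 && b > 0);
    while (y != 0) {                   /* Euclid's algorithm for the gcd */
        long r = x % y;
        x = y;
        y = r;
    }
    return a / x * b;
}
```

For example, the text "w2.2p4S.4" sets the word size and alignment to 2, the pointer size to 4, and the initial structure alignment to 4.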
|
||||
.bp
|
1
doc/pascal/p1-9
Executable file
|
@ -0,0 +1 @@
|
|||
pic ab+intro.doc internal.doc transpem.doc | troff -ms > p1-9.dit
|
1
doc/pascal/p10-14
Executable file
|
@ -0,0 +1 @@
|
|||
troff -ms -n10 conf.doc options.doc extensions.doc deviations.doc > p10-14.dit
|
1
doc/pascal/p15-19
Executable file
|
@ -0,0 +1 @@
|
|||
troff -ms -n15 hints.doc test.doc compar.doc improv.doc his.doc reference.doc > p15-19.dit
|
1
doc/pascal/p20-29
Executable file
|
@ -0,0 +1 @@
|
|||
troff -ms -n20 syntax.doc rtl.doc example.doc > p20-29.dit
|
50
doc/pascal/reference.doc
Normal file
|
@ -0,0 +1,50 @@
|
|||
.ps 12
|
||||
.vs 14
|
||||
.NH
|
||||
References
|
||||
.sp
|
||||
.nh
|
||||
.IP [ISO] 8
|
||||
ISO 7185 Specification for Computer Programming Language Pascal, 1982,
|
||||
Acornsoft ISO-PASCAL, 1984
|
||||
.sp
|
||||
.IP [EM]
|
||||
A.S. Tanenbaum, H. van Staveren, E.G. Keizer and J.W. Stevenson,
|
||||
\fIDescription Of A Machine Architecture for use with Block Structured
|
||||
Languages\fR, Informatica Rapport IR-81, Vrije Universiteit, Amsterdam, 1983
|
||||
.sp
|
||||
.IP [C]
|
||||
B.W. Kernighan and D.M. Ritchie, \fIThe C Programming Language\fR,
|
||||
Prentice-Hall, 1978
|
||||
.sp
|
||||
.IP [LL]
|
||||
C.J.H. Jacobs, \fISome Topics in Parser Generation\fR, Informatica Rapport
|
||||
IR-105, Vrije Universiteit, Amsterdam, October 1985
|
||||
.sp
|
||||
.IP [IM2]
|
||||
J.W. Stevenson, \fIPascal-VU Reference Manual and Unix Manual Pages\fR,
|
||||
Informatica Manual IM-2, Vrije Universiteit, Amsterdam, 1980
|
||||
.sp
|
||||
.IP [JEN]
|
||||
K. Jensen and N. Wirth, \fIPascal User Manual and Report\fR,
|
||||
Springer-Verlag, 1978
|
||||
.sp
|
||||
.IP [ACK]
|
||||
\fIACK Manual Pages\fR: ALLOC, ASSERT, EM_CODE, EM_MES, IDF, INPUT, PRINT,
|
||||
STRING, SYSTEM
|
||||
.sp
|
||||
.IP [AHO]
|
||||
A.V. Aho, R. Sethi and J.D. Ullman, \fICompilers: Principles, Techniques, and
|
||||
Tools\fR, Addison-Wesley, 1985
|
||||
.sp
|
||||
.IP [LEX]
|
||||
M.E. Lesk, \fILex - A Lexical Analyser Generator\fR, Comp. Sci. Tech. Rep.
|
||||
No. 39, Bell Laboratories, Murray Hill, New Jersey, October 1975
|
||||
.sp
|
||||
.IP [PCV]
|
||||
B.A. Wichmann and Z.J. Ciechanowicz, \fIPascal Compiler Validation\fR, John
|
||||
Wiley & Sons, 1983
|
||||
.sp
|
||||
.IP [SAL]
|
||||
A.H.J. Sale, \fIA Note on Scope, One-Pass Compilers and Pascal\fR, Australian
|
||||
Communications, 1, 1, 80-82, 1979
|
85
doc/pascal/rtl.doc
Normal file
|
@ -0,0 +1,85 @@
|
|||
.sp 1.5i
|
||||
.ft B
|
||||
Appendix B: Changes to the run time library
|
||||
.ft R
|
||||
.nh
|
||||
.sp
|
||||
Some minor changes in the run time library have been made concerning the
|
||||
external files (i.e. program arguments). The old compiler reserved
|
||||
space for the file structures of the external files in one \fBhol\fR block.
|
||||
In the new compiler, every file structure is placed in a separate \fBbss\fR
|
||||
block. This implies that the arguments with which \fI_ini\fR is called are
|
||||
slightly different. The second argument was the base of the \fBhol\fR block
|
||||
to relocate the buffer addresses; it is changed into an integer denoting the
|
||||
size of the array passed as third argument. The third argument was a pointer
|
||||
to an array of integers containing the description of external files; this
|
||||
argument is changed into a pointer to an array of pointers to file structures.
|
||||
|
||||
The differences in the generated EM code for an arbitrary Pascal program are
|
||||
listed below (only the relevant parts are shown):
|
||||
.in +5m
|
||||
.nf
|
||||
\fBprogram\fR external_files(output,f);
|
||||
\fBvar\fR
|
||||
f : \fBfile of \fIsome-type\fR;
|
||||
.
|
||||
.
|
||||
\fBend\fR.
|
||||
.in -5m
|
||||
|
||||
EM code generated by Pascal-VU:
|
||||
.in +5m
|
||||
.
|
||||
.
|
||||
hol 1088,-2147483648,0 ; space belonging to file structures of the program arguments
|
||||
.
|
||||
.
|
||||
.
|
||||
\&.2
|
||||
con 3, -1, 544, 0 \h'80u'; description of external files
|
||||
lxl 0
|
||||
lae .2
|
||||
lae 0 \h'146u'; base of hol block, to relocate buffer addresses
|
||||
lxa 0
|
||||
cal $_ini
|
||||
asp 16
|
||||
.
|
||||
.
|
||||
.in -5m
|
||||
|
||||
EM code generated by our compiler:
|
||||
.in +5m
|
||||
.
|
||||
.
|
||||
f
|
||||
bss 540,0,0 \h'100u'; space belonging to file structure of program argument f
|
||||
output
|
||||
bss 540,0,0 \h'100u'; space belonging to file structure of standard output
|
||||
.
|
||||
.
|
||||
.
|
||||
\&.2
|
||||
con 0U4, output, f \h'50u'; the absence of standard input is denoted by a null pointer
|
||||
lxl 0
|
||||
lae .2
|
||||
loc 3 \h'144u'; denotes the size of the array of pointers to file structures
|
||||
lxa 0
|
||||
cal $_ini
|
||||
asp 16
|
||||
.
|
||||
.
|
||||
.in -5m
|
||||
|
||||
.po
|
||||
The following files in the run time library have been changed:
|
||||
.in +1m
|
||||
pc_file.h
|
||||
hlt.c
|
||||
ini.c
|
||||
opn.c
|
||||
pentry.c
|
||||
pexit.c
|
||||
.in -1m
|
||||
.fi
|
||||
.bp
|
||||
.po
|
269
doc/pascal/syntax.doc
Normal file
|
@ -0,0 +1,269 @@
|
|||
.sp 1.5i
|
||||
.LP
|
||||
.vs 14
|
||||
.nh
|
||||
.ft B
|
||||
Appendix A: ISO-PASCAL grammar
|
||||
.ft R
|
||||
|
||||
|
||||
\fBA.1 Lexical tokens\fR
|
||||
|
||||
The syntax describes the formation of lexical tokens from characters and the
|
||||
separation of these tokens, and therefore does not adhere to the same rules
|
||||
as the syntax in A.2.
|
||||
|
||||
The lexical tokens used to construct Pascal programs shall be classified into
|
||||
special-symbols, identifiers, directives, unsigned-numbers, labels and
|
||||
character-strings. The representation of any letter (upper-case or lower-case,
|
||||
differences of font, etc) occurring anywhere outside of a character-string
|
||||
shall be insignificant in that occurrence to the meaning of the program.
|
||||
|
||||
letter = \fBa\fR | \fBb\fR | \fBc\fR | \fBd\fR | \fBe\fR | \fBf\fR | \fBg\fR | \fBh\fR | \fBi\fR | \fBj\fR | \fBk\fR | \fBl\fR | \fBm\fR | \fBn\fR | \fBo\fR | \fBp\fR | \fBq\fR | \fBr\fR | \fBs\fR | \fBt\fR | \fBu\fR | \fBv\fR | \fBw\fR | \fBx\fR | \fBy\fR | \fBz\fR .
|
||||
|
||||
digit = \fB0\fR | \fB1\fR | \fB2\fR | \fB3\fR | \fB4\fR | \fB5\fR | \fB6\fR | \fB7\fR | \fB8\fR | \fB9\fR .
|
||||
|
||||
|
||||
The special symbols are tokens having special meanings and shall be used to
|
||||
delimit the syntactic units of the language.
|
||||
|
||||
special-symbol = \fB+\fR | \fB\-\fR | \fB*\fR | \fB/\fR | \fB=\fR | \fB<\fR | \fB>\fR | \fB[\fR | \fB]\fR | \fB.\fR | \fB,\fR | \fB:\fR | \fB;\fR | \fB^\fR | \fB(\fR | \fB)\fR | \fB<>\fR | \fB<=\fR | \fB>=\fR | \fB:=\fR | \fB..\fR |
|
||||
\h'\w'special-symbol = 'u'word-symbol .
|
||||
|
||||
word-symbol = \fBand\fR | \fBarray\fR | \fBbegin\fR | \fBcase\fR | \fBconst\fR | \fBdiv\fR | \fBdo\fR | \fBdownto\fR | \fBelse\fR | \fBend\fR | \fBfile\fR | \fBfor\fR | \fBfunction\fR |
|
||||
\h'\w'word-symbol = 'u'\fBgoto\fR | \fBif\fR | \fBin\fR | \fBlabel\fR | \fBmod\fR | \fBnil\fR | \fBnot\fR | \fBof\fR | \fBor\fR | \fBpacked\fR | \fBprocedure\fR | \fBprogram\fR | \fBrecord\fR |
|
||||
\h'\w'word-symbol = 'u'\fBrepeat\fR | \fBset\fR | \fBthen\fR | \fBto\fR | \fBtype\fR | \fBuntil\fR | \fBvar\fR | \fBwhile\fR | \fBwith\fR .
|
||||
|
||||
|
||||
Identifiers may be of any length. All characters of an identifier shall be
|
||||
significant. No identifier shall have the same spelling as any word-symbol.
|
||||
|
||||
identifier = letter { letter | digit } .
|
||||
|
||||
|
||||
A directive shall only occur in a procedure-declaration or function-declaration.
|
||||
No directive shall have the same spelling as any word-symbol.
|
||||
|
||||
directive = letter {letter | digit} .
|
||||
|
||||
|
||||
Numbers are given in decimal notation.
|
||||
|
||||
.nf
|
||||
unsigned-integer = digit-sequence .
|
||||
unsigned-real = unsigned-integer \fB.\fR fractional-part [ \fBe\fR scale-factor ] | unsigned-integer \fBe\fR scale-factor .
|
||||
digit-sequence = digit {digit} .
|
||||
fractional-part = digit-sequence .
|
||||
scale-factor = signed-integer .
|
||||
signed-integer = [sign] unsigned-integer .
|
||||
sign = \fB+\fR | \fB\-\fR .
|
||||
.fi
|
||||
|
||||
.bp
|
||||
Labels shall be digit-sequences and shall be distinguished by their apparent
|
||||
integral values and shall be in the closed interval 0 to 9999.
|
||||
|
||||
label = digit-sequence .
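Because labels are distinguished by their integral values, the labels 7 and 007 denote the same label. A minimal C sketch of this rule is given below; the function name \fIlabel_value\fR and the convention of returning \-1 for an illegal label are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Return the integral value of a label, or -1 if the text is not a
 * digit-sequence or its value lies outside the interval 0 to 9999. */
long label_value(const char *digits)
{
    long value = 0;

    assert(digits != NULL);
    if (*digits == '\0')
        return -1;                    /* at least one digit is required */
    for (; *digits != '\0'; digits++) {
        if (*digits < '0' || *digits > '9')
            return -1;                /* not a digit-sequence */
        value = value * 10 + (*digits - '0');
        if (value > 9999)
            return -1;                /* outside the closed interval */
    }
    return value;
}
```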
|
||||
|
||||
|
||||
A character-string containing a single string-element shall denote a value of
|
||||
the required char-type. Each string-character shall denote an implementation-
|
||||
defined value of the required char-type.
|
||||
|
||||
.nf
|
||||
character-string = \fB'\fR string-element { string-element } \fB'\fR .
|
||||
string-element = apostrophe-image | string-character .
|
||||
apostrophe-image = \fB''\fR .
|
||||
string-character = All 7-bit ASCII characters except linefeed (10), vertical tab (11), and new page (12).
|
||||
.fi
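The rules for character-strings can be illustrated with a small C decoder that turns each apostrophe-image into a single apostrophe. This is a sketch, not the compiler's lexical analyzer: the function name \fIdecode_string\fR is hypothetical, and for brevity only the end of the line is treated as a character that may not occur in a string.

```c
#include <assert.h>
#include <stddef.h>

/* Decode a character-string that starts at src (which must point at
 * the opening apostrophe) into dst.  Returns the number of characters
 * stored, or -1 if the string is malformed, empty, or does not fit. */
long decode_string(const char *src, char *dst, size_t dstsize)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    if (dstsize == 0 || *src++ != '\'')
        return -1;
    for (;;) {
        char c = *src++;

        if (c == '\0' || c == '\n')
            return -1;                /* unterminated string */
        if (c == '\'') {
            if (*src == '\'')
                src++;                /* apostrophe-image: keep one ' */
            else
                break;                /* closing apostrophe */
        }
        if (n + 1 >= dstsize)
            return -1;                /* no room for c and the final NUL */
        dst[n++] = c;
    }
    dst[n] = '\0';
    if (n == 0)
        return -1;                    /* at least one string-element */
    return (long)n;
}
```

A result of 1 corresponds to a string with a single string-element, which according to the text above denotes a value of the char-type.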
|
||||
|
||||
|
||||
The construct:
|
||||
|
||||
\fB{\fR any-sequence-of-characters-and-separations-of-lines-not-containing-right-brace \fB}\fR
|
||||
|
||||
shall be a comment if the "{" does not occur within a character-string or
|
||||
within a comment. The substitution of a space for a comment shall not alter
|
||||
the meaning of a program.
|
||||
|
||||
Comments, spaces (except in character-strings), and the separation of
|
||||
consecutive lines shall be considered to be token separators. Zero or more
|
||||
token separators may occur between any two consecutive tokens, or before
|
||||
the first token of a program text. No separators shall occur within tokens.
|
||||
.bp
|
||||
.po
|
||||
\fBA.2 Grammar\fR
|
||||
|
||||
The non-terminal symbol \fIprogram\fR is the start symbol of the grammar.
|
||||
|
||||
.nf
|
||||
actual-parameter : expression | variable-access | procedure-identifier | function-identifier .
|
||||
actual-parameter-list : \fB(\fR actual-parameter { \fB,\fR actual-parameter } \fB)\fR .
|
||||
adding-operator : \fB+\fR | \fB\-\fR | \fBor\fR .
|
||||
array-type : \fBarray\fR \fB[\fR index-type { \fB,\fR index-type } \fB]\fR \fBof\fR component-type .
|
||||
array-variable : variable-access .
|
||||
assignment-statement : ( variable-access | function-identifier ) \fB:=\fR expression .
|
||||
|
||||
base-type : ordinal-type .
|
||||
block : label-declaration-part constant-definition-part type-definition-part variable-declaration-part
|
||||
\h'\w'block : 'u'procedure-and-function-declaration-part statement-part .
|
||||
Boolean-expression : expression .
|
||||
bound-identifier : identifier .
|
||||
buffer-variable : file-variable \fB^\fR .
|
||||
|
||||
case-constant : constant .
|
||||
case-constant-list : case-constant { \fB,\fR case-constant } .
|
||||
case-index : expression .
|
||||
case-list-element : case-constant-list \fB:\fR statement .
|
||||
case-statement : \fBcase\fR case-index \fBof\fR case-list-element { \fB;\fR case-list-element } [ \fB;\fR ] \fBend\fR .
|
||||
component-type : type-denoter .
|
||||
component-variable : indexed-variable | field-designator .
|
||||
compound-statement : \fBbegin\fR statement-sequence \fBend\fR .
|
||||
conditional-statement : if-statement | case-statement .
|
||||
conformant-array-parameter-specification : value-conformant-array-specification |
|
||||
\h'+18.5m'variable-conformant-array-specification .
|
||||
conformant-array-schema : packed-conformant-array-schema | unpacked-conformant-array-schema .
|
||||
constant : [ sign ] ( unsigned-number | constant-identifier ) | character-string .
|
||||
constant-definition : identifier \fB=\fR constant .
|
||||
constant-definition-part : [ \fBconst\fR constant-definition \fB;\fR { constant-definition \fB;\fR } ] .
|
||||
constant-identifier : identifier .
|
||||
control-variable : entire-variable .
|
||||
|
||||
domain-type : type-identifier .
|
||||
|
||||
else-part : \fBelse\fR statement .
|
||||
empty-statement : .
|
||||
entire-variable : variable-identifier .
|
||||
enumerated-type : \fB(\fR identifier-list \fB)\fR .
|
||||
expression : simple-expression [ relational-operator simple-expression ] .
|
||||
.bp
|
||||
.po
|
||||
factor : variable-access | unsigned-constant | bound-identifier | function-designator | set-constructor |
|
||||
\h'\w'factor : 'u'\fB(\fR expression \fB)\fR | \fBnot\fR factor .
|
||||
field-designator : record-variable \fB.\fR field-specifier | field-designator-identifier .
|
||||
field-designator-identifier : identifier .
|
||||
field-identifier : identifier .
|
||||
field-list : [ ( fixed-part [ \fB;\fR variant-part ] | variant-part ) [ \fB;\fR ] ] .
|
||||
field-specifier : field-identifier .
|
||||
file-type : \fBfile\fR \fBof\fR component-type .
|
||||
file-variable : variable-access .
|
||||
final-value : expression .
|
||||
fixed-part : record-section { \fB;\fR record-section } .
|
||||
for-statement : \fBfor\fR control-variable \fB:=\fR initial-value ( \fBto\fR | \fBdownto\fR ) final-value \fBdo\fR statement .
|
||||
formal-parameter-list : \fB(\fR formal-parameter-section { \fB;\fR formal-parameter-section } \fB)\fR .
|
||||
formal-parameter-section : value-parameter-specification | variable-parameter-specification |
|
||||
\h'\w'formal-parameter-section : 'u'procedural-parameter-specification | functional-parameter-specification |
|
||||
\h'\w'formal-parameter-section : 'u'conformant-array-parameter-specification .
|
||||
function-block : block .
|
||||
function-declaration : function-heading \fB;\fR directive | function-identification \fB;\fR function-block |
|
||||
\h'\w'function-declaration : 'u'function-heading \fB;\fR function-block .
|
||||
function-designator : function-identifier [ actual-parameter-list ] .
|
||||
function-heading : \fBfunction\fR identifier [ formal-parameter-list ] \fB:\fR result-type .
|
||||
function-identification : \fBfunction\fR function-identifier .
|
||||
function-identifier : identifier .
|
||||
functional-parameter-specification : function-heading .
|
||||
|
||||
goto-statement : \fBgoto\fR label .
|
||||
|
||||
identified-variable : pointer-variable \fB^\fR .
|
||||
identifier-list : identifier { \fB,\fR identifier } .
|
||||
if-statement : \fBif\fR Boolean-expression \fBthen\fR statement [ else-part ] .
|
||||
index-expression : expression .
|
||||
index-type : ordinal-type .
|
||||
index-type-specification : identifier \fB..\fR identifier \fB:\fR ordinal-type-identifier .
|
||||
indexed-variable : array-variable \fB[\fR index-expression { \fB,\fR index-expression } \fB]\fR .
|
||||
initial-value : expression .
|
||||
|
||||
label : digit-sequence .
|
||||
label-declaration-part : [ \fBlabel\fR label { \fB,\fR label } \fB;\fR ] .
|
||||
|
||||
member-designator : expression [ \fB..\fR expression ] .
|
||||
multiplying-operator : \fB*\fR | \fB/\fR | \fBdiv\fR | \fBmod\fR | \fBand\fR .
|
||||
.bp
|
||||
.po
|
||||
new-ordinal-type : enumerated-type | subrange-type .
|
||||
new-pointer-type : \fB^\fR domain-type .
|
||||
new-structured-type : [ \fBpacked\fR ] unpacked-structured-type .
|
||||
new-type : new-ordinal-type | new-structured-type | new-pointer-type .
|
||||
|
||||
ordinal-type : new-ordinal-type | ordinal-type-identifier .
|
||||
ordinal-type-identifier : type-identifier .
|
||||
|
||||
packed-conformant-array-schema : \fBpacked\fR \fBarray\fR \fB[\fR index-type-specification \fB]\fR \fBof\fR type-identifier .
|
||||
pointer-type-identifier : type-identifier .
|
||||
pointer-variable : variable-access .
|
||||
procedural-parameter-specification : procedure-heading .
|
||||
procedure-and-function-declaration-part : { ( procedure-declaration | function-declaration ) \fB;\fR } .
|
||||
procedure-block : block .
|
||||
procedure-declaration : procedure-heading \fB;\fR directive | procedure-identification \fB;\fR procedure-block |
|
||||
\h'\w'procedure-declaration : 'u'procedure-heading \fB;\fR procedure-block .
|
||||
procedure-heading : \fBprocedure\fR identifier [ formal-parameter-list ] .
|
||||
procedure-identification : \fBprocedure\fR procedure-identifier .
|
||||
procedure-identifier : identifier .
|
||||
procedure-statement : procedure-identifier ( [ actual-parameter-list ] | read-parameter-list | readln-parameter-list |
|
||||
\h'\w'procedure-statement : procedure-identifier ( ['u'write-parameter-list | writeln-parameter-list ) .
|
||||
program : program-heading \fB;\fR program-block \fB.\fR .
|
||||
program-block : block .
|
||||
program-heading : \fBprogram\fR identifier [ \fB(\fR program-parameters \fB)\fR ] .
|
||||
program-parameters : identifier-list .
|
||||
|
||||
read-parameter-list : \fB(\fR [ file-variable \fB,\fR ] variable-access { \fB,\fR variable-access } \fB)\fR .
|
||||
readln-parameter-list : [ \fB(\fR ( file-variable | variable-access ) { \fB,\fR variable-access } \fB)\fR ] .
|
||||
record-section : identifier-list \fB:\fR type-denoter .
|
||||
record-type : \fBrecord\fR field-list \fBend\fR .
|
||||
record-variable : variable-access .
|
||||
record-variable-list : record-variable { \fB,\fR record-variable } .
|
||||
relational-operator : \fB=\fR | \fB<>\fR | \fB<\fR | \fB>\fR | \fB<=\fR | \fB>=\fR | \fBin\fR .
|
||||
repeat-statement : \fBrepeat\fR statement-sequence \fBuntil\fR Boolean-expression .
|
||||
repetitive-statement : repeat-statement | while-statement | for-statement .
|
||||
result-type : simple-type-identifier | pointer-type-identifier .
|
||||
|
||||
set-constructor : \fB[\fR [ member-designator { \fB,\fR member-designator } ] \fB]\fR .
|
||||
set-type : \fBset\fR \fBof\fR base-type .
|
||||
sign : \fB+\fR | \fB\-\fR .
|
||||
simple-expression : [ sign ] term { adding-operator term } .
|
||||
simple-statement : empty-statement | assignment-statement | procedure-statement | goto-statement .
|
||||
simple-type-identifier : type-identifier .
|
||||
.bp
|
||||
.po
|
||||
statement : [ label \fB:\fR ] ( simple-statement | structured-statement ) .
|
||||
statement-part : compound-statement .
|
||||
statement-sequence : statement { \fB;\fR statement } .
|
||||
structured-statement : compound-statement | conditional-statement | repetitive-statement | with-statement .
|
||||
subrange-type : constant \fB..\fR constant .
|
||||
|
||||
tag-field : identifier .
|
||||
tag-type : ordinal-type-identifier .
|
||||
term : factor { multiplying-operator factor } .
|
||||
type-definition : identifier \fB=\fR type-denoter .
|
||||
type-definition-part : [ \fBtype\fR type-definition \fB;\fR { type-definition \fB;\fR } ] .
|
||||
type-denoter : type-identifier | new-type .
|
||||
type-identifier : identifier .
|
||||
|
||||
unpacked-conformant-array-schema : \fBarray\fR \fB[\fR index-type-specification { \fB;\fR index-type-specification } \fB]\fR \fBof\fR
|
||||
\h'\w'unpacked-conformant-array-schema : 'u'( type-identifier | conformant-array-schema ) .
|
||||
unpacked-structured-type : array-type | record-type | set-type | file-type .
|
||||
unsigned-constant : unsigned-number | character-string | constant-identifier | \fBnil\fR .
|
||||
unsigned-number : unsigned-integer | unsigned-real .
|
||||
|
||||
value-conformant-array-specification : identifier-list \fB:\fR conformant-array-schema .
|
||||
value-parameter-specification : identifier-list \fB:\fR type-identifier .
|
||||
variable-access : entire-variable | component-variable | identified-variable | buffer-variable .
|
||||
variable-conformant-array-specification : \fBvar\fR identifier-list \fB:\fR conformant-array-schema .
|
||||
variable-declaration : identifier-list \fB:\fR type-denoter .
|
||||
variable-declaration-part : [ \fBvar\fR variable-declaration \fB;\fR { variable-declaration \fB;\fR } ] .
|
||||
variable-identifier : identifier .
|
||||
variable-parameter-specification : \fBvar\fR identifier-list \fB:\fR type-identifier .
|
||||
variant : case-constant-list \fB:\fR \fB(\fR field-list \fB)\fR .
|
||||
variant-part : \fBcase\fR variant-selector \fBof\fR variant { \fB;\fR variant } .
|
||||
variant-selector : [ tag-field \fB:\fR ] tag-type .
|
||||
|
||||
while-statement : \fBwhile\fR Boolean-expression \fBdo\fR statement .
|
||||
with-statement : \fBwith\fR record-variable-list \fBdo\fR statement .
|
||||
write-parameter : expression [ \fB:\fR expression [ \fB:\fR expression ] ] .
|
||||
write-parameter-list : \fB(\fR [ file-variable \fB,\fR ] write-parameter { \fB,\fR write-parameter } \fB)\fR .
|
||||
writeln-parameter-list : [ \fB(\fR ( file-variable | write-parameter ) { \fB,\fR write-parameter } \fB)\fR ] .
|
||||
.fi
|
||||
.vs
|
||||
.bp
|
||||
.po
|
19
doc/pascal/test.doc
Normal file
|
@ -0,0 +1,19 @@
|
|||
.sp 2
|
||||
.NH
|
||||
Testing the compiler
|
||||
.nh
|
||||
.sp
|
||||
.LP
|
||||
Although it is practically impossible to prove the correctness of a compiler,
|
||||
a systematic method of testing the compiler was used to increase the confidence
|
||||
that it will work satisfactorily in practice. The first step was to see if
|
||||
the lexical analysis was performed correctly. For this purpose, the routine
|
||||
LexScan() was used (see also the \-l option). Next we tested the parser
|
||||
generated by LLgen, to see whether correct Pascal programs were accepted and
|
||||
garbage was dealt with gracefully. The biggest test involved was the
|
||||
validation of the semantic analysis. Simultaneously we tested the code
|
||||
generation. First some small Pascal test programs were translated and
|
||||
executed. Once these programs worked correctly, the Pascal validation suite
|
||||
and a large set of Pascal test programs were compiled to see whether they
|
||||
behaved in the manner the standard specifies. For more details about the
|
||||
Pascal validation suite, the reader is referred to [PCV].
|
13
doc/pascal/titlepg.doc
Normal file
|
@ -0,0 +1,13 @@
|
|||
\v'3i'
|
||||
.ps 36
|
||||
The ACK Pascal Compiler
|
||||
.ps 12
|
||||
.sp 30
|
||||
.ce 5
|
||||
.ft I
|
||||
There is always something like something that there should not be.
|
||||
.sp 2
|
||||
.ps 10
|
||||
For Whom The Bell Tolls
|
||||
.ft R
|
||||
Ernest Hemingway
|
407
doc/pascal/transpem.doc
Normal file
|
@ -0,0 +1,407 @@
|
|||
.sp 1.5i
|
||||
.de CL
|
||||
.ft R
|
||||
c\\$1
|
||||
.ft 5
|
||||
\fIcode statement-\\$1
|
||||
.ft 5
|
||||
\fBbra *\fRexit_label
|
||||
.ft 5
|
||||
..
|
||||
.NH
|
||||
Translation of Pascal to EM code
|
||||
.nh
|
||||
.LP
|
||||
.sp
|
||||
A short description of the translation of Pascal constructs to EM code is
|
||||
given in the following paragraphs. The EM instructions and Pascal terminal
|
||||
symbols are printed in \fBboldface\fR. A sentence in \fIitalics\fR is a
|
||||
description of a group of EM (pseudo)instructions.
|
||||
.sp
|
||||
.NH 2
|
||||
Global Variables
|
||||
.LP
|
||||
.sp
|
||||
For every global variable, a \fBbss\fR block is reserved. To enhance the
|
||||
readability of the EM-code generated, the variable-identifier is used as
|
||||
a data label to address the block.
|
||||
.sp
|
||||
.NH 2
|
||||
Expressions
|
||||
.LP
|
||||
.sp
|
||||
All operands of an expression are always evaluated (there is no short-circuit evaluation), so the execution of
|
||||
.br
|
||||
.ti +3m
|
||||
\fBif\fR ( p <> nil ) \fBand\fR ( p^.value <> 0 ) \fBthen\fR .....
|
||||
.br
|
||||
might cause a run-time error, if p is equal to nil.
|
||||
.LP
|
||||
The left-hand operand of a dyadic operator is almost always evaluated before
|
||||
the right-hand side. Peculiar evaluations exist for the following cases:
|
||||
.sp
|
||||
The expression set1 <= set2 is evaluated as follows:
|
||||
.nf
|
||||
- evaluate set2
|
||||
- evaluate set1
|
||||
- compute set2+set1
|
||||
- test set2 and set2+set1 for equality
|
||||
.fi
|
||||
.sp
|
||||
The expression set1 >= set2 is evaluated as follows:
|
||||
.nf
|
||||
- evaluate set1
|
||||
- evaluate set2
|
||||
- compute set1+set2
|
||||
- test set1 and set1+set2 for equality
|
||||
.fi
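.LP
The two evaluation schemes above can be sketched in C. This is only an
illustration, not the code the compiler emits: the type small_set and the
functions set_le and set_ge are hypothetical names, and a set is assumed to
fit in a 32-bit mask, so that set union becomes a bitwise or.
.nf
```c
#include <stdint.h>

/* Hypothetical representation: a Pascal set of at most 32 elements. */
typedef uint32_t small_set;

/* set1 <= set2: set1 is included in set2 iff set2+set1 equals set2. */
static int set_le(small_set set1, small_set set2)
{
    small_set sum = set2 | set1;   /* compute set2+set1 */
    return sum == set2;            /* test set2 and set2+set1 for equality */
}

/* set1 >= set2: set1 contains set2 iff set1+set2 equals set1. */
static int set_ge(small_set set1, small_set set2)
{
    small_set sum = set1 | set2;   /* compute set1+set2 */
    return sum == set1;            /* test set1 and set1+set2 for equality */
}
```
.fi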
|
||||
.sp
|
||||
Where the standard allows it, constant integral expressions are evaluated at
|
||||
compile time, and an effort is made to report overflow on a target-machine
|
||||
basis. The integral expressions are evaluated in the type \fIarith\fR.
|
||||
The size of an arith is assumed to be at least the size of the integer type
|
||||
on the target machine. If the target machine's integer size is less than the
|
||||
size of an arith, overflow can be detected at compile-time. However, the
|
||||
following call to the standard procedure new, \fInew(p, 3+5)\fR, is illegal,
|
||||
because the second parameter is not a constant according to the grammar.
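.LP
A minimal sketch of such a compile-time overflow check, written in C. The
function fold_add and its parameters are hypothetical, arith is assumed here
to be a 64-bit host type, and the target integer is assumed to be a two's
complement type of fewer than 64 bits; the compiler's actual routine may
differ.
.nf
```c
#include <stdint.h>

typedef int64_t arith;   /* assumption: the host type used for folding */

/*
 * Fold a + b for a target whose integer type has target_bits bits.
 * Returns 1 and stores the sum if it fits in the target integer,
 * returns 0 if the target machine would overflow.
 * Assumes 1 < target_bits < 64 and that a and b fit in the target type,
 * so the sum itself cannot overflow arith.
 */
static int fold_add(arith a, arith b, int target_bits, arith *result)
{
    arith max = ((arith)1 << (target_bits - 1)) - 1;
    arith min = -max - 1;
    arith sum = a + b;

    if (sum < min || sum > max)
        return 0;        /* overflow on the target machine */
    *result = sum;
    return 1;
}
```
.fi
With a 16-bit target integer, folding 32767 + 1 is reported as overflow, while
the same sum is accepted for a 32-bit target.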
|
||||
.sp
|
||||
Constant floating expressions are not compile-time evaluated, because the
|
||||
precision on the target machine and the precision on the machine on which the
|
||||
compiler runs could be different. The boolean expression \fI(1.0 + 1.0) = 2.0\fR
|
||||
could evaluate to false.
|
||||
.sp
|
||||
.NH 2
|
||||
Statements
|
||||
.NH 3
|
||||
Assignment Statement
|
||||
|
||||
\fRPASCAL :
|
||||
.ti +3m
|
||||
\f5(variable-access | function-identifier) \fB:=\f5 expression
|
||||
|
||||
\fREM :
|
||||
.nf
|
||||
.in +3m
|
||||
.ft I
|
||||
evaluate expression
|
||||
store in variable-access or function-identifier
|
||||
.ft R
|
||||
.in -3m
|
||||
.fi
|
||||
|
||||
In case of a function-identifier, a hidden temporary variable is used to
|
||||
keep the function result.
|
||||
.bp
|
||||
.NH 3
|
||||
Goto Statement
|
||||
|
||||
\fRPASCAL :
|
||||
.ti +3m
|
||||
\fBGOTO\f5 label
|
||||
|
||||
\fREM :
|
||||
.in +3m
|
||||
Two cases can be distinguished :
|
||||
.br
|
||||
- local goto,
|
||||
.ti +2m
|
||||
in which a \fBbra\fR is generated.
|
||||
|
||||
- non-local goto,
|
||||
.in +2m
|
||||
.ll -1i
|
||||
a goto_descriptor is built, containing the ProgramCounter of the instruction
|
||||
jumped to and an offset in the target procedure frame which contains the
|
||||
value of the StackPointer after the jump. The code for the jump itself is to
|
||||
load the address of the goto_descriptor, followed by a push of the LocalBase
|
||||
of the target procedure and a \fBcal\fR $_gto. A message is generated to
|
||||
indicate that a procedure or function contains a statement which is the
|
||||
target of a non-local goto.
|
||||
.ll +1i
|
||||
.in -2m
|
||||
.in -3m
|
||||
.sp 2
|
||||
.NH 3
|
||||
If Statement
|
||||
|
||||
\fRPASCAL :
|
||||
.in +3m
|
||||
.ft 5
|
||||
\fBIF\f5 boolean-expression \fBTHEN\f5 statement
|
||||
|
||||
.in -3m
|
||||
\fREM :
|
||||
.nf
|
||||
.in +3m
|
||||
\fIevaluation boolean-expression
|
||||
\fBzeq \fR*exit_label
|
||||
\fIcode statement
|
||||
\fRexit_label
|
||||
.in -3m
|
||||
.fi
|
||||
.sp 2
|
||||
\fRPASCAL :
|
||||
.in +3m
|
||||
.ft 5
|
||||
\fBIF\f5 boolean-expression \fBTHEN\f5 statement-1 \fBELSE\f5 statement-2
|
||||
|
||||
.in -3m
|
||||
\fREM :
|
||||
.nf
|
||||
.in +3m
|
||||
\fIevaluation boolean-expression
|
||||
\fBzeq \fR*else_label
|
||||
\fIcode statement-1
|
||||
\fBbra \fR*exit_label
|
||||
\fRelse_label
|
||||
\fIcode statement-2
|
||||
\fRexit_label
|
||||
.in -3m
|
||||
.fi
|
||||
.sp 2
|
||||
.NH 3
|
||||
Repeat Statement
|
||||
|
||||
\fRPASCAL :
|
||||
.in +3m
|
||||
.ft 5
|
||||
\fBREPEAT\f5 statement-sequence \fBUNTIL\f5 boolean-expression
|
||||
|
||||
.in -3m
|
||||
\fREM :
|
||||
.nf
|
||||
.in +3m
|
||||
\fRrepeat_label
|
||||
\fIcode statement-sequence
|
||||
\fIevaluation boolean-expression
|
||||
\fBzeq\fR *repeat_label
|
||||
.in -3m
|
||||
.fi
|
||||
.bp
|
||||
.NH 3
|
||||
While Statement
|
||||
|
||||
\fRPASCAL :
|
||||
.in +3m
|
||||
.ft 5
|
||||
\fBWHILE\f5 boolean-expression \fBDO\f5 statement
|
||||
|
||||
.in -3m
|
||||
\fREM :
|
||||
.nf
|
||||
.in +3m
|
||||
\fRwhile_label
|
||||
\fIevaluation boolean-expression
|
||||
\fBzeq\fR *exit_label
|
||||
\fIcode statement
|
||||
\fBbra\fR *while_label
|
||||
\fRexit_label
|
||||
.in -3m
|
||||
.fi
|
||||
.sp 2
|
||||
.NH 3
|
||||
Case Statement
|
||||
.LP
|
||||
.sp
|
||||
The case-statement is implemented using the \fBcsa\fR and \fBcsb\fR
|
||||
instructions.
|
||||
|
||||
\fRPASCAL :
|
||||
.in +3m
|
||||
\fBCASE\f5 case-expression \fBOF\f5
|
||||
.in +5m
|
||||
case-constant-list-1 \fB:\f5 statement-1 \fB;\f5
|
||||
.br
|
||||
case-constant-list-2 \fB:\f5 statement-2 \fB;\f5
|
||||
.br
|
||||
\&.
|
||||
.br
|
||||
\&.
|
||||
.br
|
||||
case-constant-list-n \fB:\f5 statement-n [\fB;\f5]
|
||||
.in -5m
|
||||
\fBEND\fR
|
||||
.in -3m
|
||||
.sp 2
|
||||
.LP
|
||||
.ll -1i
|
||||
The \fBcsa\fR instruction is used if the range of the case-expression
|
||||
value is dense, i.e.
|
||||
.br
|
||||
.ti +3m
|
||||
\f5( upperbound \- lowerbound ) / number_of_cases\fR
|
||||
.br
|
||||
is less than the constant DENSITY, defined in the file \fIdensity.h\fR.
|
||||
|
||||
If the range is sparse, a \fBcsb\fR instruction is used.
|
||||
|
||||
.ll +1i
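.LP
The choice between the two instructions can be sketched in C as follows. The
value given to DENSITY is an assumption made for this example only (the real
value is the one defined in density.h), the function name use_csa is
hypothetical, and integer division is assumed.
.nf
```c
#define DENSITY 3   /* assumed value, for illustration only */

/*
 * Returns 1 if a csa (jump table) should be used and 0 if a csb
 * (table search) should be used.
 * Assumes number_of_cases > 0 and lowerbound <= upperbound.
 */
static int use_csa(long lowerbound, long upperbound, long number_of_cases)
{
    return (upperbound - lowerbound) / number_of_cases < DENSITY;
}
```
.fi
Under this assumed value, ten cases covering 1..10 select csa, whereas three
cases spread over 1..1000 select csb.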
|
||||
\fREM :
|
||||
.nf
|
||||
.in +3m
|
||||
\fIevaluation case-expression
|
||||
\fBbra\fR *l1
|
||||
.CL 1
|
||||
.CL 2
|
||||
.
|
||||
.
|
||||
.CL n
|
||||
.ft R
|
||||
\&.case_descriptor
|
||||
.ft 5
|
||||
\fIgeneration case_descriptor
|
||||
\fRl1
|
||||
.ft 5
|
||||
\fBlae\fR .case_descriptor
|
||||
.ft 5
|
||||
\fBcsa\fR size of (case-expression)
|
||||
\fRexit_label
|
||||
.in -3m
|
||||
.fi
|
||||
.bp
|
||||
.NH 3
|
||||
For Statement
|
||||
|
||||
\fRPASCAL :
|
||||
.in +3m
|
||||
.ft 5
|
||||
\fBFOR\f5 control-variable \fB:=\f5 initial-value (\fBTO\f5 | \fBDOWNTO\f5) final-value \fBDO\f5 statement
|
||||
|
||||
.ft R
|
||||
.in -3m
|
||||
The initial-value and final-value are evaluated at the beginning of the loop.
|
||||
If the values are not constant, they are evaluated once and stored in a
|
||||
temporary.
|
||||
|
||||
EM :
|
||||
.nf
|
||||
.in +3m
|
||||
\fIload initial-value
|
||||
\fIload final-value
|
||||
\fBbgt\fR exit_label (* DOWNTO : \fBblt\fI exit_label\fR *)
|
||||
\fIload initial-value
|
||||
\fRl1
|
||||
\fIstore in control-variable
|
||||
\fIcode statement
|
||||
\fIload control-variable
|
||||
\fBdup\fI control-variable
|
||||
\fIload final-value
|
||||
\fBbeq\fR exit_label
|
||||
\fBinc\fI control-variable\fR (* DOWNTO : \fBdec\fI control-variable\fR *)
|
||||
\fBbra *\fRl1
|
||||
\fRexit_label
|
||||
.in -3m
|
||||
.fi
|
||||
|
||||
Note: testing must be done before incrementing (decrementing) the
|
||||
control-variable,
|
||||
.br
|
||||
\h'\w'Note: 'u'because wraparound could occur, which could lead to an infinite
|
||||
loop.
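.LP
The effect of testing before incrementing can be sketched in C. The function
count_iterations is a hypothetical helper that mirrors the TO variant of the
scheme above and counts how often the body would be executed; it is not part
of the compiler.
.nf
```c
#include <limits.h>

/* Number of times the body of FOR i := first TO last runs. */
static long count_iterations(int first, int last)
{
    long n = 0;
    int i;

    if (first > last)    /* bgt exit_label */
        return 0;
    i = first;
    for (;;) {
        n++;             /* code statement */
        if (i == last)   /* beq exit_label, tested before the increment */
            break;
        i++;             /* inc control-variable, never reached at last */
    }
    return n;
}
```
.fi
Because the comparison with the final-value precedes the increment, a loop
whose final-value is INT_MAX still terminates: the control-variable is never
incremented past the final-value.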
|
||||
.sp 2
|
||||
.NH 3
|
||||
With Statement
|
||||
|
||||
\fRPASCAL :
|
||||
.ti +3m
|
||||
\fBWITH\f5 record-variable-list \fBDO\f5 statement
|
||||
|
||||
.ft R
|
||||
The statement
|
||||
.ti +3m
|
||||
\fBWITH\fR r\s-3\d1\u\s0, r\s-3\d2\u\s0, ..., r\s-3\dn\u\s0 \fBDO\f5 statement
|
||||
|
||||
.ft R
|
||||
is equivalent to
|
||||
.in +3m
|
||||
\fBWITH\fR r\s-3\d1\u\s0 \fBDO\fR
|
||||
\fBWITH\fR r\s-3\d2\u\s0 \fBDO\fR
|
||||
...
|
||||
\fBWITH\fR r\s-3\dn\u\s0 \fBDO\f5 statement
|
||||
|
||||
.ft R
|
||||
.in -3m
|
||||
The translation of
|
||||
.ti +3m
|
||||
\fBWITH\fR r\s-3\d1\u\s0 \fBDO\f5 statement
|
||||
.br
|
||||
.ft R
|
||||
is
|
||||
.nf
|
||||
.in +3m
|
||||
\fIpush address of r\s-3\d1\u\s0
|
||||
\fIstore address in temporary
|
||||
\fIcode statement
|
||||
.in -3m
|
||||
.fi
|
||||
|
||||
.ft R
|
||||
An occurrence of a field is translated into:
|
||||
.in +3m
|
||||
\fIload temporary
|
||||
.br
|
||||
\fIadd field-offset
|
||||
.in -3m
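.LP
In C terms, the translation of a with-statement resembles the sketch below.
The record type point and the function with_assign_y are hypothetical; they
stand for the Pascal fragment WITH r DO y := value.
.nf
```c
#include <stddef.h>

struct point { int x; int y; };   /* hypothetical record type */

/* Corresponds to: WITH r DO y := value */
static void with_assign_y(struct point *r, int value)
{
    /* push address of r, store address in temporary */
    char *temporary = (char *)r;

    /* occurrence of field y: load temporary, add field-offset */
    int *field = (int *)(temporary + offsetof(struct point, y));

    *field = value;
}
```
.fi
The address of the record is computed once, so every field access inside the
statement costs only a load of the temporary and an addition.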
|
||||
.bp
|
||||
.NH 2
|
||||
Procedure and Function Calls
|
||||
|
||||
.ft R
|
||||
In general, the call
|
||||
.ti +5m
|
||||
p(a\s-3\d1\u\s0, a\s-3\d2\u\s0, ...., a\s-3\dn\u\s0)
|
||||
.br
|
||||
is translated into the sequence:
|
||||
|
||||
.in +5m
|
||||
.nf
|
||||
\fIevaluate a\s-3\dn\u\s0
|
||||
\&.
|
||||
\&.
|
||||
\fIevaluate a\s-3\d2\u\s0
|
||||
\fIevaluate a\s-3\d1\u\s0
|
||||
\fIpush localbase
|
||||
\fBcal\fR $p
|
||||
\fIpop parameters
|
||||
.ft R
|
||||
.fi
|
||||
.in -5m
|
||||
|
||||
That is, the order of evaluation and binding of the actual-parameters is from
|
||||
right to left. In general, a copy of the actual-parameter is made when the
|
||||
formal-parameter is a value-parameter. If the formal-parameter is a
|
||||
variable-parameter, a pointer to the actual-parameter is pushed.
|
||||
|
||||
In case of a function call, a \fBlfr\fR is generated, which pushes the
|
||||
function result on top of the stack.
|
||||
.sp 2
|
||||
.NH 2
|
||||
Register Messages
|
||||
|
||||
.ft R
|
||||
A register message can be generated to indicate that a local variable is never
|
||||
referenced indirectly. This implies that a register can be used for such a variable.
|
||||
We distinguish the following classes, given in decreasing priority:
|
||||
|
||||
\(bu control-variable and final-value of a for-statement
|
||||
.br
|
||||
.ti +5m
|
||||
to speed up testing, and execution of the body of the for-statement
|
||||
.sp
|
||||
\(bu record-variable of a with-statement
|
||||
.br
|
||||
.ti +5m
|
||||
to improve the field selection of a record
|
||||
.sp
|
||||
\(bu remaining local variables and parameters
|
||||
.sp 2
|
||||
.NH 2
|
||||
Compile-time optimizations
|
||||
|
||||
.ft R
|
||||
The only optimization that is performed is the evaluation of constant
|
||||
integral expressions. The optimization of constructs like
|
||||
.ti +5m
|
||||
\fBif\f5 false \fBthen\f5 statement\fR,
|
||||
.br
|
||||
is left to either the peephole optimizer, or a global optimizer.
|
23
doc/pascal/vrk.doc
Normal file
|
@ -0,0 +1,23 @@
|
|||
.TL
|
||||
|
||||
|
||||
|
||||
The ACK Pascal Compiler
|
||||
.AU
|
||||
Aad Geudeke
|
||||
Frans Hofmeester
|
||||
.AI
|
||||
Dept. of Mathematics and Computer Science
|
||||
Vrije Universiteit
|
||||
Amsterdam, The Netherlands
|
||||
.LP
|
||||
.ps 12
|
||||
.sp 24
|
||||
.ce 5
|
||||
.ft I
|
||||
There is always something like something that there should not be.
|
||||
.sp 2
|
||||
.ps 10
|
||||
For Whom The Bell Tolls
|
||||
.ft R
|
||||
Ernest Hemingway
|