From 44cc0751839f54bdf6e48fef362a3e5272eeab15 Mon Sep 17 00:00:00 2001 From: ceriel Date: Fri, 1 Nov 1991 09:43:36 +0000 Subject: [PATCH] Added --- doc/pascal/ab+intro.doc | 50 +++++ doc/pascal/compar.doc | 89 +++++++++ doc/pascal/conf.doc | 88 +++++++++ doc/pascal/contents.doc | 41 ++++ doc/pascal/deviations.doc | 118 +++++++++++ doc/pascal/example.doc | 92 +++++++++ doc/pascal/extensions.doc | 60 ++++++ doc/pascal/hints.doc | 76 +++++++ doc/pascal/his.doc | 36 ++++ doc/pascal/improv.doc | 87 ++++++++ doc/pascal/internal.doc | 342 ++++++++++++++++++++++++++++++++ doc/pascal/options.doc | 166 ++++++++++++++++ doc/pascal/p1-9 | 1 + doc/pascal/p10-14 | 1 + doc/pascal/p15-19 | 1 + doc/pascal/p20-29 | 1 + doc/pascal/reference.doc | 50 +++++ doc/pascal/rtl.doc | 85 ++++++++ doc/pascal/syntax.doc | 269 +++++++++++++++++++++++++ doc/pascal/test.doc | 19 ++ doc/pascal/titlepg.doc | 13 ++ doc/pascal/transpem.doc | 407 ++++++++++++++++++++++++++++++++++++++ doc/pascal/vrk.doc | 23 +++ 23 files changed, 2115 insertions(+) create mode 100644 doc/pascal/ab+intro.doc create mode 100644 doc/pascal/compar.doc create mode 100644 doc/pascal/conf.doc create mode 100644 doc/pascal/contents.doc create mode 100644 doc/pascal/deviations.doc create mode 100644 doc/pascal/example.doc create mode 100644 doc/pascal/extensions.doc create mode 100644 doc/pascal/hints.doc create mode 100644 doc/pascal/his.doc create mode 100644 doc/pascal/improv.doc create mode 100644 doc/pascal/internal.doc create mode 100644 doc/pascal/options.doc create mode 100755 doc/pascal/p1-9 create mode 100755 doc/pascal/p10-14 create mode 100755 doc/pascal/p15-19 create mode 100755 doc/pascal/p20-29 create mode 100644 doc/pascal/reference.doc create mode 100644 doc/pascal/rtl.doc create mode 100644 doc/pascal/syntax.doc create mode 100644 doc/pascal/test.doc create mode 100644 doc/pascal/titlepg.doc create mode 100644 doc/pascal/transpem.doc create mode 100644 doc/pascal/vrk.doc diff --git a/doc/pascal/ab+intro.doc 
b/doc/pascal/ab+intro.doc new file mode 100644 index 000000000..bd99d00ec --- /dev/null +++ b/doc/pascal/ab+intro.doc @@ -0,0 +1,50 @@ +.TL +The ACK Pascal Compiler +.AU +Aad Geudeke +Frans Hofmeester +.AI +Dept. of Mathematics and Computer Science +Vrije Universiteit +Amsterdam, The Netherlands +.AB +This document describes the implementation of a Pascal to EM compiler. The +compiler is written in C. The lexical analysis is done using a hand-written +lexical analyzer. Semantic analysis makes use of the extended LL(1) parser +generator LLgen. Several EM utility modules are used in the compiler. +.AE +.sp 2 +.NH +Introduction + +.PP +.nh +The Pascal front end of the Amsterdam Compiler Kit (ACK) complies with the +requirements of the international standard published by the International +Organization for Standardization (ISO) [ISO]. An informal description, which +unfortunately is not conforming to the standard, of the programming language +Pascal is given in [JEN]. + +.PP +The main reason for rewriting the Pascal compiler was that the old Pascal +compiler was written in Pascal itself, and a disadvantage of it was its +lack of flexibility. The compiler did not meet the needs of the current +ACK-framework, which makes use of modern parsing techniques and utility +modules. In this framework it is, for example, possible to use a fast back +end. Such a back end translates directly to object code [ACK]. Our compiler is +written in C and it is designed similar to the current C and Modula-2 compiler +of ACK. + +.PP +Chapter 2 describes the basic structure of the compiler. Chapter 3 discusses +the code generation of the main Pascal constructs. Chapter 4 covers one of +the major components of Pascal, viz. the conformant array. In Chapter 5 the +various compiler options that can be used are enumerated. The extensions +to the standard and the deviations from the standard are listed in Chapter +6 and 7. Chapter 8 presents some ideas to improve the standard. 
Chapter 9 +gives a short overview of testing the compiler. The major differences +between the old and new compiler can be found in Chapter 10. Suggestions +to improve the compiler are described in Chapter 11. The appendices +contain the grammar of Pascal and the changes made to the ACK Pascal run time +library. A translation of a Pascal program to EM code as example is presented. +.bp diff --git a/doc/pascal/compar.doc b/doc/pascal/compar.doc new file mode 100644 index 000000000..e712435c3 --- /dev/null +++ b/doc/pascal/compar.doc @@ -0,0 +1,89 @@ +.sp 2 +.NH +Comparison with the Pascal-VU compiler +.nh + +.LP +In this chapter, the differences with the Pascal-VU compiler [IM2] are listed. +The points enumerated below can be used as improvements to the compiler (see +also Chapter 11). +.sp +.NH 2 +Deviations +.LP +.sp +- large labels +.in +3m +only labels in the closed interval 0..9999 are allowed, as opposed to the +Pascal-VU compiler. The Pascal-VU compiler allows every unsigned integer +as label. +.in -3m + +- goto +.in +3m +the new compiler conforms to the standard as opposed to the old one. The +following program, which contains an illegal jump to label 1, is accepted +by the Pascal-VU compiler. + +.nf +\fBprogram\fR illegal_goto(output); +\fBlabel\fR 1; +\fBvar\fR i : integer; +\fBbegin\fR + \fBgoto\fR 1; + \fBfor\fR i := 1 \fBto\fR 10 \fBdo\fR + \fBbegin\fR + 1 : writeln(i); + \fBend\fR; +\fBend\fR. +.fi + +This program is rejected by the new compiler. +.in -3m + +.NH 2 +Extensions +.LP +.sp +The extensions implemented by the Pascal-VU compiler are listed in +Chapter 5 of [IM2]. +.sp +- separate compilation +.ti +3m +the new compiler only accepts programs, not modules. + +- assertions +.ti +3m +not implemented. + +- additional procedures +.ti +3m +the procedures \fIhalt, mark\fR and \fIrelease\fR are not available. +.bp +- UNIX\(tm interfacing +.ti +3m +the \-c option is not implemented. +.FS +\(tm UNIX is a Trademark of Bell Laboratories. 
+.FE + +- double length integers +.ti +3m +integer size can be set with the \-V option, so the additional type \fIlong\fR +is not implemented. + + +.NH 2 +Compiler options +.LP +.sp +The options implemented by the Pascal-VU compiler are listed in +Chapter 7 of [IM2]. +.sp +The construction "{$....}" is not recognized. + +The options: \fIa, c, d, s\fR and \fIt\fR are not available. + +The \-l option has been changed into the \-L option. + +The size of reals can be set with the \-V option. diff --git a/doc/pascal/conf.doc b/doc/pascal/conf.doc new file mode 100644 index 000000000..ff85003bc --- /dev/null +++ b/doc/pascal/conf.doc @@ -0,0 +1,88 @@ +.sp 1.5i +.nr H1 3 +.NH +Conformant Arrays +.nh +.LP +.sp +A fifth kind of parameter, besides the value, variable, procedure, and function +parameter, is the conformant array parameter (\fBISO 6.6.3.7\fR). This +parameter, undoubtedly the major addition to Pascal from the compiler writer's +point of view, has been implemented. With this kind of parameter, the required +bounds of the index-type of an actual parameter are not fixed, but are +restricted to a specified range of values. Two types of conformant array +parameters can be distinguished: variable conformant array parameters and +value conformant array parameters. +.sp +.NH 2 +Variable conformant array parameters +.LP +.sp +The treatment of variable conformant array parameters is comparable with the +normal variable parameter. +Both have in common that the parameter mechanism used is \fIcall by +reference\fR. 
+.br +An example is: +.br +.in +5m +to sort variable length arrays of integers, the following Pascal procedure could be used: + +.nf +\fBprocedure\fR bubblesort(\fBvar\fR A : \fBarray\fR[low..high : integer] \fBof\fR integer); +\fBvar\fR i, j : integer; +\fBbegin + for\fR j := high - 1 \fBdownto\fR low \fBdo + for\fR i := low \fBto\fR j \fBdo + if\fR A[i+1] < A[i] \fBthen\fI interchange A[i] and A[i+1] +\fBend\fR; +.fi +.in -5m + +For every actual parameter, the base address of the array is pushed on the +stack and for every index-type-specification, exactly one array descriptor +is pushed. +.sp +.NH 2 +Value conformant array parameters +.LP +.sp +The treatment of value conformant array parameters is more complex than its +variable counterpart. +.br +An example is: +.br +.in +5m +an unpacked array of characters could be printed as a string with the following program part: + +.nf +\fBprocedure\fR WriteAsString( A : \fBarray\fR[low..high : integer] \fBof\fR char); +\fBvar\fR i : integer; +\fBbegin + for\fR i := low \fBto\fR high \fBdo\fR write(A[i]); +\fBend\fR; +.fi +.in -5m + +The calling procedure pushes the base address of the actual parameter and +the array descriptors belonging to it on the stack. Subsequently the procedure +using the conformant array parameter is called. Because it is a \fIcall by +value\fR, the called procedure has to create a copy of the actual parameter. +This implies that the calling procedure knows how much space on the stack +must be reserved for the parameters. If the actual-parameter is a conformant +array, the called procedure keeps track of the size of the activation record. +Hence the restrictions on the use of value conformant array parameters, as +specified in \fBISO 6.6.3.7.2\fR, are dropped. 
+ +A description of the EM code generated by the compiler is: + +.nf +.ft I +load the stack adjustment sofar +load base address of array parameter +compute the size in bytes of the array +add this size to the stack adjustment +copy the array +remember the new address of the array +.ft R +.fi diff --git a/doc/pascal/contents.doc b/doc/pascal/contents.doc new file mode 100644 index 000000000..f744c9e32 --- /dev/null +++ b/doc/pascal/contents.doc @@ -0,0 +1,41 @@ +.sp 1.5i +.ps 12 +.vs 14 +.ft B +Contents\fR\h'+108u'\h'+5i'Page + + +\h'+34u'1. Introduction \h'+34u'\h'+1.5i'1 + +\h'+34u'2. The compiler \h'+34u'\h'+1.5i'2 + +\h'+34u'3. Translation of Pascal to EM \h'+34u'\h'+1.5i'5 + +\h'+34u'4. Conformant arrays \h'+1.5i'10 + +\h'+34u'5. Compiler options \h'+1.5i'11 + +\h'+34u'6. Extensions to the standard \h'+1.5i'13 + +\h'+34u'7. Deviations from the standard \h'+1.5i'13 + +\h'+34u'8. Hints to change the standard \h'+1.5i'15 + +\h'+34u'9. Testing the compiler \h'+1.5i'16 + +10. Comparison with the old compiler \h'+1.5i'16 + +11. Improvements to the compiler \h'+1.5i'17 + +12. History & Acknowledgements \h'+1.5i'18 + +13. References \h'+1.5i'19 + + +\fBAppendices\fR + +\h'+16u'A. ISO-PASCAL Grammar \h'+1.5i'20 + +\h'+24u'B. Changes to run time library \h'+1.5i'26 + +\h'+20u'C. An example \h'+1.5i'28 diff --git a/doc/pascal/deviations.doc b/doc/pascal/deviations.doc new file mode 100644 index 000000000..53ee571ac --- /dev/null +++ b/doc/pascal/deviations.doc @@ -0,0 +1,118 @@ +.sp 2 +.NH +Deviations from the standard +.nh + +.PP +The compiler deviates from the ISO 7185 standard with respect to the +following clauses: + +.IP "\fBISO 6.1.3:\fR" 14 +\h'-5u'Identifiers may be of any length and all characters of an identifier +shall be significant in distinguishing between them. +.sp +.in +3m +The constant IDFSIZE, defined in the file \fIidfsize.h\fR, determines +the (maximum) significant length of an identifier. 
It can be set at run +time with the \-M option (see also section on compiler options). +.in -3m +.sp +.IP "\fBISO 6.1.8:\fR" +\h'-5u'There shall be at least one separator between any pair of consecutive tokens +made up of identifiers, word-symbols, labels or unsigned-numbers. +.sp +.in +3m +A token separator is not needed when a number is followed by an identifier +or a word-symbol. For example the input sequence, 2\fBthen\fR, is recognized +as the integer 2 followed by the keyword \fBthen\fR. +.in -3m +.sp +.IP "\fBISO 6.2.1:\fR" +\h'-29u'The label-declaration-part shall specify all labels that prefix a statement +in the corresponding statement-part. +.sp +.ti +3m +The compiler generates a warning if a label is declared but never defined. +.bp +.IP "\fBISO 6.2.2:\fR" +\h'-9u'The scope of identifiers and labels should start at the beginning of the +block in which these identifiers or labels are declared. +.sp +.in +3m +The compiler, like most other one-pass compilers, deviates in this respect, +because the scope of variables and labels starts at their defining-point. +.nf +.in +4m +\fBprogram\fR deviates\fB; +const\fR + x \fB=\fR 3\fB; +procedure\fR p\fB; +const\fR + y \fB=\fR x\fB;\fR + x \fB=\fR true\fB; +begin end; +begin +end.\fR +.in -4m +.fi + +In procedure p, the constant y has the integer value 3. This program does not +conform to the standard. In [SAL] a simple algorithm is described for +enforcing the scope rules; it involves numbering all scopes encountered in the +program in order of their opening, and recording in each identifier table +entry the number of the latest scope in which it is used. + +Note: The compiler does not deviate from the standard in the following program: +.nf +.in +4m +\fBprogram\fR conforms\fB; +type\fR + x \fB=\fR real\fB; +procedure\fR p\fB; +type\fR + y \fB= ^\fRx\fB;\fR + x \fB=\fR boolean\fB; +var\fR + p \fB:\fR y\fB; +begin end; +begin +end.\fR +.in -4m +.fi + +In procedure p, the variable p is a pointer to boolean. 
+.fi +.in -3m +.sp +.IP "\fBISO 6.4.3.2:\fR" +The standard specifies that any ordinal type is allowed as index-type. +.sp +.in +3m +The required type \fIinteger\fR is not allowed as index-type, i.e. +.ti +2m +\fBARRAY [ \fIinteger\fB ] OF\fR +is not permitted. +.br +This could be implemented, but this might cause problems on machines with +a small memory. +.in -3m +.sp +.IP "\fBISO 6.4.3.3:\fR" +\h'-1u'The type possessed by the variant-selector, called the tag-type, must +be an ordinal type, so the integer type is permitted. The values denoted by +all case-constants shall be distinct and the set thereof shall be equal +to the set of values specified by the tag-type. +.sp +.in +3m +Because it is impracticable to enumerate all integers as case-constants, +the integer type is not permitted as tag-type. It would not make a great +difference to allow it as tag-type. +.in -3m +.sp +.IP "\fBISO 6.8.3.9:\fR" +The standard specifies that the control-variable of a for-statement is not +allowed to be modified while executing the loop. +.sp +.in +3m +Violation of this rule is not detected. An algorithm to implement this rule +can be found in [PCV]. diff --git a/doc/pascal/example.doc b/doc/pascal/example.doc new file mode 100644 index 000000000..f8350f01e --- /dev/null +++ b/doc/pascal/example.doc @@ -0,0 +1,92 @@ +.sp 1.5i +.ft B +Appendix C: An example +.ft R +.nh +.nf + + +\h'+10u' 1 \fBprogram\fR factorials(input, output); +\h'+10u' 2 { This program prints factorials } +\h'+10u' 3 +\h'+10u' 4 \fBconst\fR +\h'+10u' 5 FAC1 = 1; +\h'+10u' 6 \fBvar\fR +\h'+10u' 7 i : integer; +\h'+10u' 8 +\h'+10u' 9 \fBfunction\fR factorial(n : integer) : integer; +10 \fBbegin\fR +11 \fBif\fR n = FAC1 \fBthen\fR +12 factorial := FAC1 +13 \fBelse\fR +14 factorial := n * factorial(n-1); +15 \fBend\fR; +16 +17 \fBbegin\fR +18 write('Give a number : '); +19 readln(i); +20 \fBif\fR i < 1 \fBthen\fR +21 writeln('No factorial') +22 \fBelse\fR +23 writeln(factorial(i):1); +24 \fBend\fR. 
+.bp +.po +.DS + mes 2,4,4 loc 16 +\&.1 cal $_wrs + rom 'factorials.p\(rs000' asp 12 +i lin 19 + bss 4,0,0 lae input +output cal $_rdi + bss 540,0,0 asp 4 +input lfr 4 + bss 540,0,0 ste i + exp $factorial lae input + pro $factorial, ? cal $_rln + mes 9,4 asp 4 + lin 11 lin 20 + lol 0 loe i + loc 1 loc 1 + cmi 4 cmi 4 + teq tlt + zeq *1 zeq *1 + lin 12 lin 21 + loc 1 .4 + stl -4 rom 'No factorial' + bra *2 lae output +1 lae .4 + lin 14 loc 12 + lol 0 cal $_wrs + lol 0 asp 12 + loc 1 lae output + sbi 4 cal $_wln + cal $factorial asp 4 + asp 4 bra *2 + lfr 4 1 + mli 4 lin 23 + stl -4 lae output +2 loe i + lin 15 cal $factorial + mes 3,0,4,0,0 asp 4 + lol -4 lfr 4 + ret 4 loc 1 + end 4 cal $_wsi + exp $m_a_i_n asp 12 + pro $m_a_i_n, ? lae output + mes 9,0 cal $_wln + fil .1 asp 4 +\&.2 2 + con input, output lin 24 + lxl 0 loc 0 + lae .2 cal $_hlt + loc 2 end 0 + lxa 0 mes 4,24,'factorials.p\(rs000' + cal $_ini + asp 16 + lin 18 +\&.3 + rom 'Give a number : ' + lae output + lae .3 +.DE diff --git a/doc/pascal/extensions.doc b/doc/pascal/extensions.doc new file mode 100644 index 000000000..44febcc49 --- /dev/null +++ b/doc/pascal/extensions.doc @@ -0,0 +1,60 @@ +.pl 12i +.sp 1.5i +.NH +Extensions to Pascal as specified by ISO 7185 +.nh + +.IP "\fBISO 6.1.3:\fR" 14 +\h'-11u'The underscore is treated as a letter when the \-u option is turned +on (see also section 5.2). This is implemented to be compatible with +Pascal-VU and can be used in identifiers to increase readability. +.sp +.IP "\fBISO 6.1.4:\fR" +\h'-12u'The directive \fIextern\fR can be used in a procedure-declaration or +function-declaration to specify that the procedure-block or function-block +corresponding to that declaration is external to the program-block. This can +be used in conjunction with library routines. 
+.sp +.IP "\fBISO 6.1.9:\fR" +\h'-22u'An alternative representation for the following tokens and delimiting +characters is recognized: +.in +5m +.ft 5 +\fBtoken +.ft 5 +\& \fBalternative token +.ft 5 +.sp +^ +\& @ +.br +[ +\& (. +.br +] +\& .) + +.ft 5 +\fBdelimiting character +.ft 5 +\& \fBalternative delimiting pair of characters +.ft 5 +.sp +{ +\& (* +.br +} +\& *) +.in -5m +.sp +.IP "\fBISO 6.6.3.7.2:\fR" +\h'-1u'A conformant array parameter can be passed as value conformant array +parameter without the restrictions imposed by the standard. The compiler +gives a warning. This is implemented to keep the parameter mechanism orthogonal (see also Chapter 4). +.sp +.IP "\fBISO 6.9.3.1:\fR" +\h'-16u'If the value of the argument \fITotalWidth\fR of the required +procedure \fIwrite\fR is zero or negative, no characters are written for +character, string or boolean type arguments. If the value of the argument +\fIFracDigits\fR of the required procedure \fIwrite\fR is zero or negative, +the fraction and '.' character are suppressed for fixed-point arguments. diff --git a/doc/pascal/hints.doc b/doc/pascal/hints.doc new file mode 100644 index 000000000..a1c7fc1ba --- /dev/null +++ b/doc/pascal/hints.doc @@ -0,0 +1,76 @@ +.sp 1.5i +.nr H1 7 +.NH +Hints to change the standard +.nh +.sp +.LP +We encountered some difficulties when the compiler was developed. In this +chapter some hints are presented to change the standard, which would make +the implementation of the compiler less difficult. The semantics of Pascal +would not be altered by these adaptations. +.sp 2 +.LP +\- Some minor changes in the grammar of Pascal from the user's point of view, +but which make the writing of an LL(1) parser considerably easier, could be: +.in +3m +.nf +field-list : [ ( fixed-part [ variant-part ] | variant-part ) ] . +fixed-part : record-section \fB;\fR { record-section \fB;\fR } . +variant-part : \fBcase\fR variant-selector \fBof\fR variant \fB;\fR { variant \fB;\fR } . 
+ +case-statement : \fBcase\fR case-index \fBof\fR case-list-element \fB;\fR { case-list-element \fB;\fR } \fBend\fR . +.fi +.in -3m + + +.LP +\- To ease the semantic checking on sets, the principle of qualified sets could +be used: every set-constructor must be preceded by its type-identifier: +.nf +.ti +3m +set-constructor : type-identifier \fB[\fR [ member-designator { \fB,\fR member-designator } ] \fB]\fR . + +Example: + t1 = set of 1..5; + t2 = set of integer; + +The type of [3, 5] would be ambiguous, but the type of t1[3, 5] not. +.fi + + +.LP +\- Another problem arises from the fact that a function name can appear in +three distinct 'use' contexts: function call, assignment of function +result and as function parameter. +.br +Example: +.in +5m +.nf +\fBprogram\fR function_name; + +\fBfunction\fR p(x : integer; function y : integer) : integer; +\fBbegin\fR .. \fBend\fR; + +\fBfunction\fR f : integer; +\fBbegin\fR + f := p(f, f); (*) +\fBend\fR; + +\fBbegin\fR .. \fBend\fR. +.fi +.in -5m + +A possible solution in case of a call (also a procedure call) would be to +make the (possibly empty) actual-parameter-list mandatory. The assignment +of the function result could be changed into a \fIreturn\fR statement, +though this would change the semantics of the program slightly. +.br +The above statement (*) would look like this: return p(f(), f); + + +.LP +\- Another extension to the standard could be the implementation of an +\fIotherwise\fR clause in a case-statement. This would behave exactly like +the \fIdefault\fR clause in a switch-statement in C. +.bp diff --git a/doc/pascal/his.doc b/doc/pascal/his.doc new file mode 100644 index 000000000..d4c64a2a5 --- /dev/null +++ b/doc/pascal/his.doc @@ -0,0 +1,36 @@ +.sp 2 +.NH +History & Acknowledgements +.nh +.sp 2 +.ft B +History +.ft R +.sp +.LP +The purpose of this project was to make a Pascal compiler which should satisfy +the conditions of the ISO standard. 
The task was considerably simplified, +because parts of the Modula-2 compiler were used. This gave the advantage of +increasing the uniformity of the compilers in ACK. +.br +While developing the compiler, a number of errors were detected in the Modula-2 +compiler, EM utility modules and the old Pascal compiler. + +.sp 2 +.ft B +Acknowledgements +.ft R +.sp +.LP +During the development of the compiler, valuable support was received from +a number of persons. In this regard we owe a debt of gratitude to +Fred van Beek, Casper Capel, Rob Dekker, Frank Engel, Jos\('e Gouweleeuw +and Sonja Keijzer (Jut and Jul !!), Herold Kroon, Martin van Nieuwkerk, +Sjaak Schouten, Eric Valk, and Didan Westra. +.br +Special thanks are reserved for Dick Grune, who introduced us to the field of +Compiler Design and who helped testing the compiler. Ceriel Jacobs, who +developed LLgen and the Modula-2 compiler of ACK. Finally we would like to +thank Erik Baalbergen, who had the supervision on this entire project and +gave us many valuable suggestions. +.bp diff --git a/doc/pascal/improv.doc b/doc/pascal/improv.doc new file mode 100644 index 000000000..3c15ee8b8 --- /dev/null +++ b/doc/pascal/improv.doc @@ -0,0 +1,87 @@ +.sp 2 +.NH +Improvements to the compiler +.nh +.sp +.LP +In consideration of portability, a restricted option could be implemented. +Under this option, the extensions and warnings should be considered as errors. + + +.LP +The restrictions imposed by the standard on the control variable of a +for-statement should be implemented (\fBISO 6.8.3.9\fR). + +.LP +To check whether a function returns a valid result, the following algorithm +could be used. When a function is entered a hidden temporary variable of +type boolean is created. This variable is initialized with the value false. +The variable is set to true, when an assignment to the function name occurs. +On exit of the function a test is performed on the variable. 
If the value +of the variable is false, a run-time error occurs. +.br +Note: The check has to be done run-time. + + +.LP +The \fIundefined value\fR should be implemented. A problem arises with +local variables, for which space on the stack is allocated. A possible +solution would be to generate code for the initialization of the local +variables with the undefined value at the beginning of a procedure or +function. +.br +The implementation for the global variables is easy, because \fBbss\fR +blocks are used. + + +.LP +Closely related to the last point is the generation of warnings when +variables are never used or assigned. This is not yet implemented. + + +.LP +The error messages could specify more details about the errors occurred, +if some additional testing is done. + +.bp +.LP +Every time the compiler detects sets with different base-types, a warning +is given. Sometimes this is superfluous. + +.nf +\fBprogram\fR sets(output); +\fBtype\fR + week = (sunday, monday, tuesday, wednesday, thursday, friday, saturday); + workweek = monday..friday; +\fBvar\fR + s : \fBset of\fR workweek; + day : week; +\fBbegin\fR + day := monday; + s := [day]; (* warning *) + day := saturday; + s := [day]; (* warning *) +\fBend\fR. +.fi +The new compiler gives two warnings, the first one is redundant. + + +.LP +A nasty point in the compiler is the way the procedures \fIread, readln, +write\fR and \fIwriteln\fR are handled (see also section 2.2). They have +been added to the grammar. This implies that they can not be redefined as +opposed to the other required procedures and functions. They should be +removed from the grammar altogether. This could imply that more semantic +checks have to be performed. + + +.LP +No effort is made to detect possible run-time errors during compilation. +.br +E.g. a : \fBarray\fR[1..10] \fBof\fI something\fR, and the array selection +a[11] would occur. + + +.LP +Some assistance to implement the improvements mentioned above, can be +obtained from [PCV]. 
diff --git a/doc/pascal/internal.doc b/doc/pascal/internal.doc new file mode 100644 index 000000000..d1a94e7ae --- /dev/null +++ b/doc/pascal/internal.doc @@ -0,0 +1,342 @@ +.pl 12.5i +.sp 1.5i +.NH +The compiler + +.nh +.LP +The compiler can be divided roughly into four modules: + +\(bu lexical analysis +.br +\(bu syntax analysis +.br +\(bu semantic analysis +.br +\(bu code generation +.br + +The four modules are grouped into one pass. The activity of these modules +is interleaved during the pass. +.br +The lexical analyzer, some expression handling routines and various +datastructures from the Modula-2 compiler contributed to the project. +.sp 2 +.NH 2 +Lexical Analysis + +.LP +The first module of the compiler is the lexical analyzer. In this module, the +stream of input characters making up the source program is grouped into +\fItokens\fR, as defined in \fBISO 6.1\fR. The analyzer is hand-written, +because the lexical analyzer generator, which was at our disposal, +\fILex\fR [LEX], produces much slower analyzers. A character table, in the file +\fIchar.c\fR, is created using the program \fItab\fR which takes as input +the file \fIchar.tab\fR. In this table each character is placed into a +particular class. The classes, as defined in the file \fIclass.h\fR, +represent a set of tokens. The strategy of the analyzer is as follows: the +first character of a new token is used in a multiway branch to eliminate as +many candidate tokens as possible. Then the remaining characters of the token +are read. The constant INP_NPUSHBACK, defined in the file \fIinput.h\fR, +specifies the maximum number of characters the analyzer looks ahead. 
The +value has to be at least 3, to handle input sequences such as: +.br + 1e+4 (which is a real number) +.br + 1e+a (which is the integer 1, followed by the identifier "e", a plus, and the identifier "a") + +Another aspect of this module is the insertion and deletion of tokens +required by the parser for the recovery of syntactic errors (see also section +2.2). A generic input module [ACK] is used to avoid the burden of I/O. +.sp 2 +.NH 2 +Syntax Analysis + +.LP +The second module of the compiler is the parser, which is the central part of +the compiler. It invokes the routines of the other modules. The tokens obtained +from the lexical analyzer are grouped into grammatical phrases. These phrases +are stored as parse trees and handed over to the next part. The parser is +generated using \fILLgen\fR[LL], a tool for generating an efficient recursive +descent parser with no backtrack from an Extended Context Free Syntax. +.br +An error recovery mechanism is generated almost completely automatically. A +routine called \fILLmessage\fR had to be written, which gives the necessary +error messages and deals with the insertion and deletion of tokens. +The routine \fILLmessage\fR must accept one parameter, whose value is +a token number, zero or -1. A zero parameter indicates that the current token +(the one in the external variable \fILLsymb\fR) is deleted. +A -1 parameter indicates that the parser expected end of file, but did +not get it. The parser will then skip tokens until end of file is detected. +A parameter that is a token number (a positive parameter) indicates that +this token is to be inserted in front of the token currently in \fILLsymb\fR. +Also, care must be taken, that the token currently in \fILLsymb\fR is again +returned by the \fBnext\fR call to the lexical analyzer, with the proper +attributes. So, the lexical analyzer must have a facility to push back one +token. 
+.br +Calls to the two standard procedures \fIwrite\fR and \fIwriteln\fR can be +different from calls to other procedures. The syntax of a write-parameter +is different from the syntax of an actual-parameter. We decided to include +them, together with \fIread\fR and \fIreadln\fR, in the grammar. An alternate +solution would be to make the syntax of an actual-parameter identical to the +syntax of a write-parameter. Afterwards the parameter has to be checked to +see whether it is used properly or not. +.bp +As the parser is LL(1), it must always be able to determine what to do, +based on the last token read (\fILLsymb\fR). Unfortunately, this was not the +case with the grammar as specified in [ISO]. Two kinds of problems +appeared, viz. the \fBalternation\fR and \fBrepetition\fR conflict. +The examples given in the following paragraphs are taken from the grammar. + +.NH 3 +Alternation conflict + +.LP +An alternation conflict arises when the parser can not decide which +production to choose. +.br +\fBExample:\fR +.in +2m +.ft 5 +.nf +procedure-declaration : procedure-heading \fB';'\f5 directive | +.br +\h'\w'procedure-declaration : 'u'procedure-identification \fB';'\f5 procedure-block | +.br +\h'\w'procedure-declaration : 'u'procedure-heading \fB';'\f5 procedure-block ; +.br +procedure-heading : \fBprocedure\f5 identifier [ formal-parameter-list ]? ; +.br +procedure-identification : \fBprocedure\f5 procedure-identifier ; +.fi +.ft R +.in -2m + +A sentence that starts with the terminal \fBprocedure\fR is derived from the +three alternative productions. This conflict can be resolved in two ways: +adjusting the grammar, usually some rules are replaced by one rule and more +work has to be done in the semantic analysis; using the LLgen conflict +resolver, "\fB%if\fR (C-expression)", if the C-expression evaluates to +non-zero, the production in question is chosen, otherwise one of the +remaining rules is chosen. The grammar rules were rewritten to solve this +conflict. 
The new rules are given below. For more details see the file +\fIdeclar.g\fR. + +.in +2m +.ft 5 +.nf +procedure-declaration : procedure-heading \fB';'\f5 ( directive | procedure-block ) ; +.br +procedure-heading : \fBprocedure\f5 identifier [ formal-parameter-list ]? ; +.fi +.ft R +.in -2m + +A special case of an alternation conflict, which is common to many block +structured languages, is the \fI"dangling-else"\fR ambiguity. + +.in +2m +.ft 5 +.nf +if-statement : \fBif\f5 boolean-expression \fBthen\f5 statement [ else-part ]? ; +.br +else-part : \fBelse\f5 statement ; +.fi +.ft R +.in -2m + +The following statement that can be derived from the rules above is ambiguous: + +.ti +2m +\fBif\f5 boolean-expr-1 \fBthen\f5 \fBif\f5 boolean-expr-2 \fBthen\f5 statement-1 \fBelse\f5 statement-2 +.ft R + + +.ps 8 +.vs 7 +.PS +move right 1.1i +S: line down 0.5i +"if-statement" at S.start above +.ft B +"then" at S.end below +.ft R +move to S.start then down 0.25i +L: line left 0.5i then down 0.25i +box ht 0.33i wid 0.6i "boolean" "expression-1" +move to L.start then left 0.5i +L: line left 0.5i then down 0.25i +.ft B +"if" at L.end below +.ft R +move to L.start then right 0.5i +L: line right 0.5i then down 0.25i +"statement" at L.end below +move to L.end then down 0.10i +L: line down 0.25i dashed +"if-statement" at L.end below +move to L.end then down 0.10i +L: line down 0.5i +.ft B +"then" at L.end below +.ft R +move to L.start then down 0.25i +L: line left 0.5i then down 0.25i +box ht 0.33i wid 0.6i "boolean" "expression-2" +move to L.start then left 0.5i +L: line left 0.5i then down 0.25i +.ft B +"if" at L.end below +.ft R +move to L.start then right 0.5i +L: line right 0.5i then down 0.25i +box ht 0.33i wid 0.6i "statement-1" +move to L.start then right 0.5i +L: line right 0.5i then down 0.25i +.ft B +"else" at L.end below +.ft R +move to L.start then right 0.5i +L: line right 0.5i then down 0.25i +box ht 0.33i wid 0.6i "statement-2" +move to S.start +move right 3.5i +L: 
line down 0.5i +"if-statement" at L.start above +.ft B +"then" at L.end below +.ft R +move to L.start then down 0.25i +L: line left 0.5i then down 0.25i +box ht 0.33i wid 0.6i "boolean" "expression-1" +move to L.start then left 0.5i +L: line left 0.5i then down 0.25i +.ft B +"if" at L.end below +.ft R +move to L.start then right 0.5i +S: line right 0.5i then down 0.25i +"statement" at S.end below +move to S.start then right 0.5i +L: line right 0.5i then down 0.25i +.ft B +"else" at L.end below +.ft R +move to L.start then right 0.5i +L: line right 0.5i then down 0.25i +box ht 0.33i wid 0.6i "statement-2" +move to S.end then down 0.10i +L: line down 0.25i dashed +"if-statement" at L.end below +move to L.end then down 0.10i +L: line down 0.5i +.ft B +"then" at L.end below +.ft R +move to L.start then down 0.25i +L: line left 0.5i then down 0.25i +box ht 0.33i wid 0.6i "boolean" "expression-2" +move to L.start then left 0.5i +L: line left 0.5i then down 0.25i +.ft B +"if" at L.end below +.ft R +move to L.start then right 0.5i +L: line right 0.5i then down 0.25i +box ht 0.33i wid 0.6i "statement-1" +.PE +.ps +.vs +\h'615u'(a)\h'1339u'(b) +.sp +.ce +Two parse trees showing the \fIdangling-else\fR ambiguity +.sp 2 +According to the standard, \fBelse\fR is matched with the nearest preceding +unmatched \fBthen\fR, i.e. parse tree (a) is valid (\fBISO 6.8.3.4\fR). +This conflict is statically resolved in LLgen by using "\fB%prefer\fR", +which is equivalent in behaviour to "\fB%if\fR(1)". +.bp +.NH 3 +Repetition conflict + +.LP +A repetition conflict arises when the parser cannot decide whether to choose +a production once more, or not. +.br +\fBExample:\fR +.in +2m +.ft 5 +.nf +field-list : [ ( fixed-part [ \fB';'\f5 variant-part ]? | variant-part ) [;]? ]? ; +.br +fixed-part : record-section [ \fB';'\f5 record-section ]* ; +.fi +.in -2m +.ft R + +When the parser sees the semicolon, it cannot decide whether another +record-section or a variant-part follows.
This conflict can be resolved in +two ways: adjusting the grammar or using the conflict resolver, +"\fB%while\fR (C-expression)". The grammar rules that deal with this conflict +were completely rewritten. For more details, the reader is referred to the +file \fIdeclar.g\fR. +.sp 2 +.NH 2 +Semantic Analysis + +.LP +The third module of the compiler checks the semantic conventions of +ISO-Pascal. To check the program being parsed, actions have been used in +LLgen. An action consists of several C-statements, enclosed in braces +"{" and "}". In order to facilitate communication between the actions and +\fILLparse\fR, the parsing routines can be given C-like parameters and +local variables. An important part of the semantic analyzer is the symbol +table. This table stores all information concerning identifiers and their +definitions. Symbol-table lookup and hashing are done by a generic namelist +module [ACK]. The parser turns each program construct into a parse tree, +which is the major data structure in the compiler. This parse tree is used +to exchange information between various routines. +.sp 2 +.NH 2 +Code Generation + +.LP +The final module in the compiler is that of code generation. The information +stored in the parse trees is used to generate the EM code [EM]. EM code is +generated with the help of a procedural EM-code interface [ACK]. The use of +static exchanges is avoided, since the fast back end cannot cope with +static code exchanges; hence the EM pseudoinstruction \fBexc\fR is never +generated. +.br +Chapter 3 discusses the code generation in more detail. +.sp 2 +.NH 2 +Error Handling + +.LP +The first three modules have in common that they can detect errors in the +Pascal program being compiled. If this is the case, an appropriate message is given +and some action is performed. If code generation has to be aborted, an error +message is given; otherwise a warning is given.
The constant MAXERR_LINE, +defined in the file \fIerrout.h\fR, specifies the maximum number of messages +given per line. This can be used to avoid long lists of error messages caused +by, for example, the omission of a ';'. Three kinds of errors can be +distinguished: the lexical error, the syntactic error, and the semantic error. +Examples of these errors are, respectively, nested comments, an expression with +unbalanced parentheses, and the addition of two characters. +.sp 2 +.NH 2 +Memory Allocation and Garbage Collection + +.LP +The routines \fIst_alloc\fR and \fIst_free\fR provide a mechanism for +maintaining free lists of structures, whose first field is a pointer called +\fBnext\fR. This field is used to chain free structures together. Each +structure type, say with tag ST, has a free list pointed to +by h_ST. Associated with this list are the operations: \fInew_ST()\fR, an +allocating mechanism which supplies the space for a new ST struct; and +\fIfree_ST()\fR, a garbage collecting mechanism which links the specified +structure into the free list. +.bp diff --git a/doc/pascal/options.doc b/doc/pascal/options.doc new file mode 100644 index 000000000..a278b5e69 --- /dev/null +++ b/doc/pascal/options.doc @@ -0,0 +1,166 @@ +.sp 1.5i +.NH +Compiler options +.nh +.PP +There are some options available to control the behaviour of the compiler. +Two types of options can be distinguished: compile-time options and +run-time options. +.sp +.NH 2 +Compile time options +.LP +.sp +There are some options that can be set when the compiler is installed. +Those options can be found in the file \fIParameters\fR. To set a parameter, +just modify its definition in the file \fIParameters\fR. The shell script +in the file \fImake.hfiles\fR creates a separate .h file for each parameter. +This mechanism is derived from the C compiler in ACK. +.sp +\fBIDFSIZE\fR +.in +3m +The maximum number of characters that are significant in an identifier.
This +value has to be at least the value of \fBMINIDFSIZE\fR, defined in the file +\fIoptions.c\fR. A compile-time check is included to see if the value of +\fBMINIDFSIZE\fR is legal. The compiler will not recognize some keywords +if \fBIDFSIZE\fR is too small. +.in -3m +.sp +\fBISTRSIZE\fR, \fBRSTRSIZE\fR +.in +3m +The lexical analyzer uses these two values for the allocation of memory needed +to store a string. \fBISTRSIZE\fR is the initial number of bytes allocated. +\fBRSTRSIZE\fR is the step size used for enlarging the memory needed. +.in -3m +.sp +\fBNUMSIZE\fR +.in +3m +The maximum length of a numeric constant recognized by the lexical analyzer. +It is an error if this length is exceeded. +.in -3m +.sp +\fBERROUT\fR, \fBMAXERR_LINE\fR +.in +3m +Used for error messages. \fBERROUT\fR defines the file on which the +messages are written. \fBMAXERR_LINE\fR is the maximum number of error +messages given per line. +.in -3m +.sp +\fBSZ_CHAR\fR, \fBAL_CHAR\fR, etc. +.in +3m +The default values of the target machine sizes and alignments. The values +can be overridden with the \-V option. +.in -3m +.sp +\fBMAXSIZE\fR +.in +3m +This value must be set to the maximum of the values of the target machine +sizes. This parameter is used in overflow detection (see also section 3.2). +.in -3m +.sp +\fBDENSITY\fR +.in +3m +This parameter is used to decide which EM instruction has to be generated +for a case-statement. If the range of the index value is sparse, i.e. +.br +.ti +5m +(upperbound - lowerbound) / number_of_cases +.br +is more than some threshold (\fBDENSITY\fR), the \fBcsb\fR instruction is +chosen. If the range is dense, a jump table is generated (\fBcsa\fR). The +jump table uses more space. Reasonable values are 2, 3 or 4. +.br +Higher values might also be reasonable on machines that have lots of +address space and memory (see also section 3.3.3). +.in -3m +.sp +\fBINP_READ_IN_ONE\fR +.in +3m +Used by the generic input module. It can either be defined or not defined.
+Defining it has the effect that files will be read completely into memory +using only one read system call. This should be used only on machines with +lots of memory. +.in -3m +.sp +.bp +\fBDEBUG\fR +.in +3m +.nf +If this parameter is defined, some built-in compiler-debugging tools can be used: +.in +2m +\(bu only lexical analysis is done if the \-l option is given. +\(bu if the \-I option is turned on, the number of allocated structures is printed. +\(bu the routine debug can be used to print miscellaneous information. +\(bu the routine PrNode prints a tree of nodes. +\(bu the routine DumpType prints information about a type structure. +\(bu the macro DO_DEBUG(x,y) defined as ((x) && (y)) can be used to perform + several actions. +.in -2m +.in -3m +.sp +.NH 2 +Run time options +.LP +.sp +The run time options can be given on the command line when the compiler is +called. +.br +They all have the form: \- +.br +Depending on the option, a character string has to be specified. The following +options are currently available: +.sp +.IP \-\fBC\fR 18 +The lower case and upper case letters are treated differently (\fBISO 6.1.1\fR). +.sp +.IP \-\fBu\fR +The character '_' is treated like a letter, so it is allowed to use the +underscore in identifiers. +.br +Note: identifiers starting with an underscore may cause problems, because +.br +\h'\w'Note: 'u'most identifiers in library routines start with an underscore. +.sp +.IP \-\fBn\fR +This option suppresses the generation of register messages. +.sp +.IP \-\fBr\fR +With this option range checks are generated where necessary. +.sp +.IP \-\fBL\fR +Do not generate EM \fBlin\fR and \fBfil\fR instructions. These instructions +are used only for profiling. +.sp +.IP \-\fBM\fR +Set the number of characters that are significant in an identifier to . +The maximum significant identifier length depends on the constant IDFSIZE, +defined in \fIidfsize.h\fR. +.sp +.IP \-\fBi\fR +With this flag the set size for a set of integers can be changed.
The number must +be the number of bits per set. Default value : (#bits in a word) \- 1 +.sp +.IP \-\fBw\fR +Suppress warning messages (see also section 2.5). +.sp +.IP \-\fBV\fR[[\fBw\fR|\fBi\fR|\fBf\fR|\fBp\fR|\fBS\fR][\fIsize\fR]?[\fI.alignment\fR]?]* +.br +Option to set the object sizes and alignments on the target machine +dynamically. The objects that can be manipulated are: +.br +\fBw\fR\h'\w'ifpS'u' word +.br +\fBi\fR\h'\w'wfpS'u' integer +.br +\fBf\fR\h'\w'wipS'u' float +.br +\fBp\fR\h'\w'wifS'u' pointer +.br +\fBS\fR\h'\w'wifp'u' structure +.br +In case of a structure, \fIsize\fR is discarded and the \fIalignment\fR is +the initial alignment of the structure. The effective alignment is the least +common multiple of \fIalignment\fR and the alignment of its members. This +option has been implemented so that the compiler can be used as cross +compiler. +.bp diff --git a/doc/pascal/p1-9 b/doc/pascal/p1-9 new file mode 100755 index 000000000..7455d2ef7 --- /dev/null +++ b/doc/pascal/p1-9 @@ -0,0 +1 @@ +pic ab+intro.doc internal.doc transpem.doc | troff -ms > p1-9.dit diff --git a/doc/pascal/p10-14 b/doc/pascal/p10-14 new file mode 100755 index 000000000..529162663 --- /dev/null +++ b/doc/pascal/p10-14 @@ -0,0 +1 @@ +troff -ms -n10 conf.doc options.doc extensions.doc deviations.doc > p10-14.dit diff --git a/doc/pascal/p15-19 b/doc/pascal/p15-19 new file mode 100755 index 000000000..808edc2fe --- /dev/null +++ b/doc/pascal/p15-19 @@ -0,0 +1 @@ +troff -ms -n15 hints.doc test.doc compar.doc improv.doc his.doc reference.doc > p15-19.dit diff --git a/doc/pascal/p20-29 b/doc/pascal/p20-29 new file mode 100755 index 000000000..11c4b4ec6 --- /dev/null +++ b/doc/pascal/p20-29 @@ -0,0 +1 @@ +troff -ms -n20 syntax.doc rtl.doc example.doc > p20-29.dit diff --git a/doc/pascal/reference.doc b/doc/pascal/reference.doc new file mode 100644 index 000000000..e99f16da6 --- /dev/null +++ b/doc/pascal/reference.doc @@ -0,0 +1,50 @@ +.ps 12 +.vs 14 +.NH +References +.sp +.nh +.IP 
[ISO] 8 +ISO 7185 Specification for Computer Programming Language Pascal, 1982, +Acornsoft ISO-PASCAL, 1984 +.sp +.IP [EM] +A.S. Tanenbaum, H. van Staveren, E.G. Keizer and J.W. Stevenson, +\fIDescription Of A Machine Architecture for use with Block Structured +Languages\fR, Informatica Rapport IR-81, Vrije Universiteit, Amsterdam, 1983 +.sp +.IP [C] +B.W. Kernighan and D.M. Ritchie, \fIThe C Programming Language\fR, +Prentice-Hall, 1978 +.sp +.IP [LL] +C.J.H. Jacobs, \fISome Topics in Parser Generation\fR, Informatica Rapport +IR-105, Vrije Universiteit, Amsterdam, October 1985 +.sp +.IP [IM2] +J.W. Stevenson, \fIPascal-VU Reference Manual and Unix Manual Pages\fR, +Informatica Manual IM-2, Vrije Universiteit, Amsterdam, 1980 +.sp +.IP [JEN] +K. Jensen and N. Wirth, \fIPascal User Manual and Report\fR, +Springer-Verlag, 1978 +.sp +.IP [ACK] +\fIACK Manual Pages\fR: ALLOC, ASSERT, EM_CODE, EM_MES, IDF, INPUT, PRINT, +STRING, SYSTEM +.sp +.IP [AHO] +A.V. Aho, R. Sethi and J.D. Ullman, \fICompilers: Principles, Techniques, and +Tools\fR, Addison-Wesley, 1986 +.sp +.IP [LEX] +M.E. Lesk, \fILex - A Lexical Analyser Generator\fR, Comp. Sci. Tech. Rep. +No. 39, Bell Laboratories, Murray Hill, New Jersey, October 1975 +.sp +.IP [PCV] +B.A. Wichmann and Z.J. Ciechanowicz, \fIPascal Compiler Validation\fR, John +Wiley & Sons, 1983 +.sp +.IP [SAL] +A.H.J. Sale, \fIA Note on Scope, One-Pass Compilers and Pascal\fR, Australian +Communications, 1, 1, 80-82, 1979 diff --git a/doc/pascal/rtl.doc b/doc/pascal/rtl.doc new file mode 100644 index 000000000..011375b14 --- /dev/null +++ b/doc/pascal/rtl.doc @@ -0,0 +1,85 @@ +.sp 1.5i +.ft B +Appendix B: Changes to the run time library +.ft R +.nh +.sp +Some minor changes in the run time library have been made concerning the +external files (i.e. program arguments). The old compiler reserved +space for the file structures of the external files in one \fBhol\fR block.
+In the new compiler, every file structure is placed in a separate \fBbss\fR +block. This implies that the arguments with which \fI_ini\fR is called are +slightly different. The second argument was the base of the \fBhol\fR block +to relocate the buffer addresses, it is changed into an integer denoting the +size of the array passed as third argument. The third argument was a pointer +to an array of integers containing the description of external files, this +argument is changed into a pointer to an array of pointers to file structures. + +The differences in the generated EM code for an arbitrary Pascal program are +listed below (only the relevant parts are shown): +.in +5m +.nf +\fBprogram\fR external_files(output,f); +\fBvar\fR + f : \fBfile of \fIsome-type\fR; + . + . +\fBend\fR. +.in -5m + +EM code generated by Pascal-VU: +.in +5m + . + . + hol 1088,-2147483648,0 ; space belonging to file structures of the program arguments + . + . + . +\&.2 + con 3, -1, 544, 0 \h'80u'; description of external files + lxl 0 + lae .2 + lae 0 \h'146u'; base of hol block, to relocate buffer addresses + lxa 0 + cal $_ini + asp 16 + . + . +.in -5m + +EM code generated by our compiler: +.in +5m + . + . +f + bss 540,0,0 \h'100u'; space belonging to file structure of program argument f +output + bss 540,0,0 \h'100u'; space belonging to file structure of standard output + . + . + . +\&.2 + con 0U4, output, f \h'50u'; the absence of standard input is denoted by a null pointer + lxl 0 + lae .2 + loc 3 \h'144u'; denotes the size of the array of pointers to file structures + lxa 0 + cal $_ini + asp 16 + . + . 
+.in -5m + +.po +The following files in the run time library have been changed: +.in +1m +pc_file.h +hlt.c +ini.c +opn.c +pentry.c +pexit.c +.in -1m +.fi +.bp +.po diff --git a/doc/pascal/syntax.doc b/doc/pascal/syntax.doc new file mode 100644 index 000000000..ba6cfbee7 --- /dev/null +++ b/doc/pascal/syntax.doc @@ -0,0 +1,269 @@ +.sp 1.5i +.LP +.vs 14 +.nh +.ft B +Appendix A: ISO-PASCAL grammar +.ft R + + +\fBA.1 Lexical tokens\fR + +The syntax describes the formation of lexical tokens from characters and the +separation of these tokens, and therefore does not adhere to the same rules +as the syntax in A.2. + +The lexical tokens used to construct Pascal programs shall be classified into +special-symbols, identifiers, directives, unsigned-numbers, labels and +character-strings. The representation of any letter (upper-case or lower-case, +differences of font, etc) occurring anywhere outside of a character-string +shall be insignificant in that occurrence to the meaning of the program. + +letter = \fBa\fR | \fBb\fR | \fBc\fR | \fBd\fR | \fBe\fR | \fBf\fR | \fBg\fR | \fBh\fR | \fBi\fR | \fBj\fR | \fBk\fR | \fBl\fR | \fBm\fR | \fBn\fR | \fBo\fR | \fBp\fR | \fBq\fR | \fBr\fR | \fBs\fR | \fBt\fR | \fBu\fR | \fBv\fR | \fBw\fR | \fBx\fR | \fBy\fR | \fBz\fR . + +digit = \fB0\fR | \fB1\fR | \fB2\fR | \fB3\fR | \fB4\fR | \fB5\fR | \fB6\fR | \fB7\fR | \fB8\fR | \fB9\fR . + + +The special symbols are tokens having special meanings and shall be used to +delimit the syntactic units of the language. + +special-symbol = \fB+\fR | \fB\-\fR | \fB*\fR | \fB/\fR | \fB=\fR | \fB<\fR | \fB>\fR | \fB[\fR | \fB]\fR | \fB.\fR | \fB,\fR | \fB:\fR | \fB;\fR | \fB^\fR | \fB(\fR | \fB)\fR | \fB<>\fR | \fB<=\fR | \fB>=\fR | \fB:=\fR | \fB..\fR | +\h'\w'special-symbol = 'u'word-symbol . 
+ +word-symbol = \fBand\fR | \fBarray\fR | \fBbegin\fR | \fBcase\fR | \fBconst\fR | \fBdiv\fR | \fBdo\fR | \fBdownto\fR | \fBelse\fR | \fBend\fR | \fBfile\fR | \fBfor\fR | \fBfunction\fR | +\h'\w'word-symbol = 'u'\fBgoto\fR | \fBif\fR | \fBin\fR | \fBlabel\fR | \fBmod\fR | \fBnil\fR | \fBnot\fR | \fBof\fR | \fBor\fR | \fBpacked\fR | \fBprocedure\fR | \fBprogram\fR | \fBrecord\fR | +\h'\w'word-symbol = 'u'\fBrepeat\fR | \fBset\fR | \fBthen\fR | \fBto\fR | \fBtype\fR | \fBuntil\fR | \fBvar\fR | \fBwhile\fR | \fBwith\fR . + + +Identifiers may be of any length. All characters of an identifier shall be +significant. No identifier shall have the same spelling as any word-symbol. + +identifier = letter { letter | digit } . + + +A directive shall only occur in a procedure-declaration or function-declaration. +No directive shall have the same spelling as any word-symbol. + +directive = letter {letter | digit} . + + +Numbers are given in decimal notation. + +.nf +unsigned-integer = digit-sequence . +unsigned-real = unsigned-integer \fB.\fR fractional-part [ \fBe\fR scale-factor ] | unsigned-integer \fBe\fR scale-factor . +digit-sequence = digit {digit} . +fractional-part = digit-sequence . +scale-factor = signed-integer . +signed-integer = [sign] unsigned-integer . +sign = \fB+\fR | \fB\-\fR . +.fi + +.bp +Labels shall be digit-sequences and shall be distinguished by their apparent +integral values and shall be in the closed interval 0 to 9999. + +label = digit-sequence . + + +A character-string containing a single string-element shall denote a value of +the required char-type. Each string-character shall denote an implementation- +defined value of the required char-type. + +.nf +character-string = \fB'\fR string-element { string-element } \fB'\fR . +string-element = apostrophe-image | string-character . +apostrophe-image = \fB''\fR . +string-character = All 7-bits ASCII characters except linefeed (10), vertical tab (11), and new page (12). 
+.fi + + +The construct: + + \fB{\fR any-sequence-of-characters-and-separations-of-lines- not-containing-right-brace \fB}\fR + +shall be a comment if the "{" does not occur within a character-string or +within a comment. The substitution of a space for a comment shall not alter +the meaning of a program. + +Comments, spaces (except in character-strings), and the separation of +consecutive lines shall be considered to be token separators. Zero or more +token separators may occur between any two consecutive tokens, or before +the first token of a program text. No separators shall occur within tokens. +.bp +.po +\fBA.2 Grammar\fR + +The non-terminal symbol \fIprogram\fR is the start symbol of the grammar. + +.nf +actual-parameter : expression | variable-access | procedure-identifier | function-identifier . +actual-parameter-list : \fB(\fR actual-parameter { \fB,\fR actual-parameter } \fB)\fR . +adding-operator : \fB+\fR | \fB\-\fR | \fBor\fR . +array-type : \fBarray\fR \fB[\fR index-type { \fB,\fR index-type } \fB]\fR \fBof\fR component-type . +array-variable : variable-access . +assignment-statement : ( variable-access | function-identifier ) \fB:=\fR expression . + +base-type : ordinal-type . +block : label-declaration-part constant-definition-part type-definition-part variable-declaration-part +\h'\w'block : 'u'procedure-and-function-declaration-part statement-part . +Boolean-expression : expression . +bound-identifier : identifier . +buffer-variable : file-variable \fB^\fR . + +case-constant : constant . +case-constant-list : case-constant { \fB,\fR case-constant } . +case-index : expression . +case-list-element : case-constant-list \fB:\fR statement . +case-statement : \fBcase\fR case-index \fBof\fR case-list-element { \fB;\fR case-list-element } [ \fB;\fR ] \fBend\fR . +component-type : type-denoter . +component-variable : indexed-variable | field-designator . +compound-statement : \fBbegin\fR statement-sequence \fBend\fR . 
+conditional-statement : if-statement | case-statement . +conformant-array-parameter-specification : value-conformant-array-specification | +\h'+18.5m'variable-conformant-array-specification . +conformant-array-schema : packed-conformant-array-schema | unpacked-conformant-array-schema . +constant : [ sign ] ( unsigned-number | constant-identifier ) | character-string . +constant-definition : identifier \fB=\fR constant . +constant-definition-part : [ \fBconst\fR constant-definition \fB;\fR { constant-definition \fB;\fR } ] . +constant-identifier : identifier . +control-variable : entire-variable . + +domain-type : type-identifier . + +else-part : \fBelse\fR statement . +empty-statement : . +entire-variable : variable-identifier . +enumerated-type : \fB(\fR identifier-list \fB)\fR . +expression : simple-expression [ relational-operator simple-expression ] . +.bp +.po +factor : variable-access | unsigned-constant | bound-identifier | function-designator | set-constructor | +\h'\w'factor : 'u'\fB(\fR expression \fB)\fR | \fBnot\fR factor . +field-designator : record-variable \fB.\fR field-specifier | field-designator-identifier . +field-designator-identifier : identifier . +field-identifier : identifier . +field-list : [ ( fixed-part [ \fB;\fR variant-part ] | variant-part ) [ \fB;\fR ] ] . +field-specifier : field-identifier . +file-type : \fBfile\fR \fBof\fR component-type . +file-variable : variable-access . +final-value : expression . +fixed-part : record-section { \fB;\fR record-section } . +for-statement : \fBfor\fR control-variable \fB:=\fR initial-value ( \fBto\fR | \fBdownto\fR ) final-value \fBdo\fR statement . +formal-parameter-list : \fB(\fR formal-parameter-section { \fB;\fR formal-parameter-section } \fB)\fR . 
+formal-parameter-section : value-parameter-specification | variable-parameter-specification | +\h'\w'formal-parameter-section : 'u'procedural-parameter-specification | functional-parameter-specification | +\h'\w'formal-parameter-section : 'u'conformant-array-parameter-specification . +function-block : block . +function-declaration : function-heading \fB;\fR directive | function-identification \fB;\fR function-block | +\h'\w'function-declaration : 'u'function-heading \fB;\fR function-block . +function-designator : function-identifier [ actual-parameter-list ] . +function-heading : \fBfunction\fR identifier [ formal-parameter-list ] \fB:\fR result-type . +function-identification : \fBfunction\fR function-identifier . +function-identifier : identifier . +functional-parameter-specification : function-heading . + +goto-statement : \fBgoto\fR label . + +identified-variable : pointer-variable \fB^\fR . +identifier-list : identifier { \fB,\fR identifier } . +if-statement : \fBif\fR Boolean-expression \fBthen\fR statement [ else-part ] . +index-expression : expression . +index-type : ordinal-type . +index-type-specification : identifier \fB..\fR identifier \fB:\fR ordinal-type-identifier . +indexed-variable : array-variable \fB[\fR index-expression { \fB,\fR index-expression } \fB]\fR . +initial-value : expression . + +label : digit-sequence . +label-declaration-part : [ \fBlabel\fR label { \fB,\fR label } \fB;\fR ] . + +member-designator : expression [ \fB..\fR expression ] . +multiplying-operator : \fB*\fR | \fB/\fR | \fBdiv\fR | \fBmod\fR | \fBand\fR . +.bp +.po +new-ordinal-type : enumerated-type | subrange-type . +new-pointer-type : \fB^\fR domain-type . +new-structured-type : [ \fBpacked\fR ] unpacked-structured-type . +new-type : new-ordinal-type | new-structured-type | new-pointer-type . + +ordinal-type : new-ordinal-type | ordinal-type-identifier . +ordinal-type-identifier : type-identifier . 
+ +packed-conformant-array-schema : \fBpacked\fR \fBarray\fR \fB[\fR index-type-specification \fB]\fR \fBof\fR type-identifier . +pointer-type-identifier : type-identifier . +pointer-variable : variable-access . +procedural-parameter-specification : procedure-heading . +procedure-and-function-declaration-part : { ( procedure-declaration | function-declaration ) \fB;\fR } . +procedure-block : block . +procedure-declaration : procedure-heading \fB;\fR directive | procedure-identification \fB;\fR procedure-block | +\h'\w'procedure-declaration : 'u'procedure-heading \fB;\fR procedure-block . +procedure-heading : \fBprocedure\fR identifier [ formal-parameter-list ] . +procedure-identification : \fBprocedure \fR procedure-identifier . +procedure-identifier : identifier . +procedure-statement : procedure-identifier ( [ actual-parameter-list ] | read-parameter-list | readln-parameter-list | +\h'\w'procedure-statement : procedure-identifier ( ['u'write-parameter-list | writeln-parameter-list ) . +program : program-heading \fB;\fR program-block \fB.\fR . +program-block : block . +program-heading : \fBprogram\fR identifier [ \fB(\fR program-parameters \fB)\fR ] . +program-parameters : identifier-list . + +read-parameter-list : \fB(\fR [ file-variable \fB,\fR ] variable-access { \fB,\fR variable-access } \fB)\fR . +readln-parameter-list : [ \fB(\fR ( file-variable | variable-access ) { \fB,\fR variable-access } \fB)\fR ] . +record-section : identifier-list \fB:\fR type-denoter . +record-type : \fBrecord\fR field-list \fBend\fR . +record-variable : variable-access . +record-variable-list : record-variable { \fB,\fR record-variable } . +relational-operator : \fB=\fR | \fB<>\fR | \fB<\fR | \fB>\fR | \fB<=\fR | \fB>=\fR | \fBin\fR . +repeat-statement : \fBrepeat\fR statement-sequence \fBuntil\fR Boolean-expression . +repetitive-statement : repeat-statement | while-statement | for-statement . +result-type : simple-type-identifier | pointer-type-identifier . 
+ +set-constructor : \fB[\fR [ member-designator { \fB,\fR member-designator } ] \fB]\fR . +set-type : \fBset\fR \fBof\fR base-type . +sign : \fB+\fR | \fB\-\fR . +simple-expression : [ sign ] term { adding-operator term } . +simple-statement : empty-statement | assignment-statement | procedure-statement | goto-statement . +simple-type-identifier : type-identifier . +.bp +.po +statement : [ label \fB:\fR ] ( simple-statement | structured-statement ) . +statement-part : compound-statement . +statement-sequence : statement { \fB;\fR statement } . +structured-statement : compound-statement | conditional-statement | repetitive-statement | with-statement . +subrange-type : constant \fB..\fR constant . + +tag-field : identifier . +tag-type : ordinal-type-identifier . +term : factor { multiplying-operator factor } . +type-definition : identifier \fB=\fR type-denoter . +type-definition-part : [ \fBtype\fR type-definition \fB;\fR { type-definition \fB;\fR } ] . +type-denoter : type-identifier | new-type . +type-identifier : identifier . + +unpacked-conformant-array-schema : \fBarray\fR \fB[\fR index-type-specification { \fB;\fR index-type-specification } \fB]\fR \fBof\fR +\h'\w'unpacked-conformant-array-schema : 'u'( type-identifier | conformant-array-schema ) . +unpacked-structured-type : array-type | record-type | set-type | file-type . +unsigned-constant : unsigned-number | character-string | constant-identifier | \fBnil\fR . +unsigned-number : unsigned-integer | unsigned-real . + +value-conformant-array-specification : identifier-list \fB:\fR conformant-array-schema . +value-parameter-specification : identifier-list \fB:\fR type-identifier . +variable-access : entire-variable | component-variable | identified-variable | buffer-variable . +variable-conformant-array-specification : \fBvar\fR identifier-list \fB:\fR conformant-array-schema . +variable-declaration : identifier-list \fB:\fR type-denoter . 
+variable-declaration-part : [ \fBvar\fR variable-declaration \fB;\fR { variable-declaration \fB;\fR } ] . +variable-identifier : identifier . +variable-parameter-specification : \fBvar\fR identifier-list \fB:\fR type-identifier . +variant : case-constant-list \fB:\fR \fB(\fR field-list \fB)\fR . +variant-part : \fBcase\fR variant-selector \fBof\fR variant { \fB;\fR variant } . +variant-selector : [ tag-field \fB:\fR ] tag-type . + +while-statement : \fBwhile\fR Boolean-expression \fBdo\fR statement . +with-statement : \fBwith\fR record-variable-list \fBdo\fR statement . +write-parameter : expression [ \fB:\fR expression [ \fB:\fR expression ] ] . +write-parameter-list : \fB(\fR [ file-variable \fB,\fR ] write-parameter { \fB,\fR write-parameter } \fB)\fR . +writeln-parameter-list : [ \fB(\fR ( file-variable | write-parameter ) { \fB,\fR write-parameter } \fB)\fR ] . +.fi +.vs +.bp +.po diff --git a/doc/pascal/test.doc b/doc/pascal/test.doc new file mode 100644 index 000000000..60220a0e9 --- /dev/null +++ b/doc/pascal/test.doc @@ -0,0 +1,19 @@ +.sp 2 +.NH +Testing the compiler +.nh +.sp +.LP +Although it is practically impossible to prove the correctness of a compiler, +a systematic method of testing the compiler was used to increase the confidence +that it will work satisfactorily in practice. The first step was to see if +the lexical analysis was performed correctly. For this purpose, the routine +LexScan() was used (see also the \-l option). Next we tested the parser +generated by LLgen, to see whether correct Pascal programs were accepted and +garbage was dealt with gracefully. The biggest test involved was the +validation of the semantic analysis. Simultaneously we tested the code +generation. First some small Pascal test programs were translated and +executed. Once these programs worked correctly, the Pascal validation suite +and a large set of Pascal test programs were compiled to see whether they +behaved in the manner the standard specifies.
For more details about the +Pascal validation suite, the reader is referred to [PCV]. diff --git a/doc/pascal/titlepg.doc b/doc/pascal/titlepg.doc new file mode 100644 index 000000000..af074c0f9 --- /dev/null +++ b/doc/pascal/titlepg.doc @@ -0,0 +1,13 @@ +\v'3i' +.ps 36 +The ACK Pascal Compiler +.ps 12 +.sp 30 +.ce 5 +.ft I +There is always something like something that there should not be. +.sp 2 +.ps 10 +For Whom The Bell Tolls +.ft R +Ernest Hemingway diff --git a/doc/pascal/transpem.doc b/doc/pascal/transpem.doc new file mode 100644 index 000000000..ede79369a --- /dev/null +++ b/doc/pascal/transpem.doc @@ -0,0 +1,407 @@ +.sp 1.5i +.de CL +.ft R +c\\$1 +.ft 5 + \fIcode statement-\\$1 +.ft 5 + \fBbra *\fRexit_label +.ft 5 +.. +.NH +Translation of Pascal to EM code +.nh +.LP +.sp +A short description of the translation of Pascal constructs to EM code is +given in the following paragraphs. The EM instructions and Pascal terminal +symbols are printed in \fBboldface\fR. A sentence in \fIitalics\fR is a +description of a group of EM (pseudo)instructions. +.sp +.NH 2 +Global Variables +.LP +.sp +For every global variable, a \fBbss\fR block is reserved. To enhance the +readability of the EM-code generated, the variable-identifier is used as +a data label to address the block. +.sp +.NH 2 +Expressions +.LP +.sp +Operands are always evaluated, so the execution of +.br +.ti +3m +\fBif\fR ( p <> nil ) \fBand\fR ( p^.value <> 0 ) \fBthen\fR ..... +.br +might cause a run-time error, if p is equal to nil. +.LP +The left-hand operand of a dyadic operator is almost always evaluated before +the right-hand side. 
Peculiar evaluations exist for the following cases: +.sp +the expression: set1 <= set2, is evaluated as follows : +.nf +- evaluate set2 +- evaluate set1 +- compute set2+set1 +- test set2 and set2+set1 for equality +.fi +.sp +the expression: set1 >= set2, is evaluated as follows : +.nf +- evaluate set1 +- evaluate set2 +- compute set1+set2 +- test set1 and set1+set2 for equality +.fi +.sp +Where allowed, according to the standard, constant integral expressions are +compile-time evaluated while an effort is made to report overflow on target +machine basis. The integral expressions are evaluated in the type \fIarith\fR. +The size of an arith is assumed to be at least the size of the integer type +on the target machine. If the target machine's integer size is less than the +size of an arith, overflow can be detected at compile-time. However, the +following call to the standard procedure new, \fInew(p, 3+5)\fR, is illegal, +because the second parameter is not a constant according to the grammar. +.sp +Constant floating expressions are not compile-time evaluated, because the +precision on the target machine and the precision on the machine on which the +compiler runs could be different. The boolean expression \fI(1.0 + 1.0) = 2.0\fR +could evaluate to false. +.sp +.NH 2 +Statements +.NH 3 +Assignment Statement + +\fRPASCAL : +.ti +3m +\f5(variable-access | function-identifier) \fB:=\f5 expression + +\fREM : +.nf +.in +3m +.ft I +evaluate expression +store in variable-access or function-identifier +.ft R +.in -3m +.fi + +In case of a function-identifier, a hidden temporary variable is used to +keep the function result. +.bp +.NH 3 +Goto Statement + +\fRPASCAL : +.ti +3m +\fBGOTO\f5 label + +\fREM : +.in +3m +Two cases can be distinguished : +.br +- local goto, +.ti +2m +in which a \fBbra\fR is generated. 
+ +- non-local goto, +.in +2m +.ll -1i +a goto_descriptor is built, containing the ProgramCounter of the instruction +jumped to and an offset in the target procedure frame which contains the +value of the StackPointer after the jump. The code for the jump itself is to +load the address of the goto_descriptor, followed by a push of the LocalBase +of the target procedure and a \fBcal\fR $_gto. A message is generated to +indicate that a procedure or function contains a statement which is the +target of a non-local goto. +.ll +1i +.in -2m +.in -3m +.sp 2 +.NH 3 +If Statement + +\fRPASCAL : +.in +3m +.ft 5 +\fBIF\f5 boolean-expression \fBTHEN\f5 statement + +.in -3m +\fREM : +.nf +.in +3m + \fIevaluation boolean-expression + \fBzeq \fR*exit_label + \fIcode statement +\fRexit_label +.in -3m +.fi +.sp 2 +\fRPASCAL : +.in +3m +.ft 5 +\fBIF\f5 boolean-expression \fBTHEN\f5 statement-1 \fBELSE\f5 statement-2 + +.in -3m +\fREM : +.nf +.in +3m + \fIevaluation boolean-expression + \fBzeq \fR*else_label + \fIcode statement-1 + \fBbra \fR*exit_label +\fRelse_label + \fIcode statement-2 +\fRexit_label +.in -3m +.fi +.sp 2 +.NH 3 +Repeat Statement + +\fRPASCAL : +.in +3m +.ft 5 +\fBREPEAT\f5 statement-sequence \fBUNTIL\f5 boolean-expression + +.in -3m +\fREM : +.nf +.in +3m +\fRrepeat_label + \fIcode statement-sequence + \fIevaluation boolean-expression + \fBzeq\fR *repeat_label +.in -3m +.fi +.bp +.NH 3 +While Statement + +\fRPASCAL : +.in +3m +.ft 5 +\fBWHILE\f5 boolean-expression \fBDO\f5 statement + +.in -3m +\fREM : +.nf +.in +3m +\fRwhile_label + \fIevaluation boolean-expression + \fBzeq\fR *exit_label + \fIcode statement + \fBbra\fR *while_label +\fRexit_label +.in -3m +.fi +.sp 2 +.NH 3 +Case Statement +.LP +.sp +The case-statement is implemented using the \fBcsa\fR and \fBcsb\fR +instructions.
+ +\fRPASCAL : +.in +3m +\fBCASE\f5 case-expression \fBOF\f5 +.in +5m +case-constant-list-1 \fB:\f5 statement-1 \fB;\f5 +.br +case-constant-list-2 \fB:\f5 statement-2 \fB;\f5 +.br +\&. +.br +\&. +.br +case-constant-list-n \fB:\f5 statement-n [\fB;\f5] +.in -5m +\fBEND\fR +.in -3m +.sp 2 +.LP +.ll -1i +The \fBcsa\fR instruction is used if the range of the case-expression +value is dense, i.e. +.br +.ti +3m +\f5( upperbound \- lowerbound ) / number_of_cases\fR +.br +is less than the constant DENSITY, defined in the file \fIdensity.h\fR. + +If the range is sparse, a \fBcsb\fR instruction is used. + +.ll +1i +\fREM : +.nf +.in +3m + \fIevaluation case-expression + \fBbra\fR *l1 +.CL 1 +.CL 2 + . + . +.CL n +.ft R +\&.case_descriptor +.ft 5 + \fIgeneration case_descriptor +\fRl1 +.ft 5 + \fBlae\fR .case_descriptor +.ft 5 + \fBcsa\fR size of (case-expression) +\fRexit_label +.in -3m +.fi +.bp +.NH 3 +For Statement + +\fRPASCAL : +.in +3m +.ft 5 +\fBFOR\f5 control-variable \fB:=\f5 initial-value (\fBTO\f5 | \fBDOWNTO\f5) final-value \fBDO\f5 statement + +.ft R +.in -3m +The initial-value and final-value are evaluated at the beginning of the loop. +If the values are not constant, they are evaluated once and stored in a +temporary. + +EM : +.nf +.in +3m + \fIload initial-value + \fIload final-value + \fBbgt\fR exit-label (* DOWNTO : \fBblt\fI exit-label\fR *) + \fIload initial-value +\fRl1 + \fIstore in control-variable + \fIcode statement + \fIload control-variable + \fBdup\fI control-variable + \fIload final-value + \fBbeq\fR exit_label + \fBinc\fI control-variable\fR (* DOWNTO : \fBdec\fI control-variable\fR *) + \fBbra *\fRl1 +\fRexit_label +.in -3m +.fi + +Note: testing must be done before incrementing(decrementing) the +control-variable, +.br +\h'\w'Note: 'u'because wraparound could occur, which could lead to an infinite +loop. 
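The ordering described in the note above (test for equality first, then step the control-variable) can be illustrated with a small C sketch, C being the language the compiler itself is written in. This is only an illustration of the control flow of the \fBTO\fR variant, not the compiler's actual code; the function name count_to is hypothetical and the loop body is reduced to counting iterations.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical illustration of FOR i := first TO last DO statement,
 * following the EM scheme above: an entry test, then the body, then an
 * equality test that is made BEFORE the control-variable is incremented.
 * Returns the number of times the body was executed.
 */
unsigned long count_to(int first, int last)
{
    unsigned long executed = 0;
    int i;

    if (first > last)           /* corresponds to: bgt exit_label */
        return 0;

    i = first;                  /* store in control-variable */
    for (;;) {
        executed++;             /* stands in for: code statement */
        if (i == last)          /* corresponds to: beq exit_label */
            break;
        assert(i < last);       /* so the increment below cannot overflow */
        i++;                    /* corresponds to: inc control-variable */
    }
    return executed;
}
```

Under these assumptions, count_to(INT_MAX - 2, INT_MAX) executes the body three times and stops without ever incrementing past INT_MAX. A loop that incremented first and then tested with "i <= last" would step beyond the largest integer when last equals INT_MAX, which is the wraparound the note warns about.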
+.sp 2 +.NH 3 +With Statement + +\fRPASCAL : +.ti +3m +\fBWITH\f5 record-variable-list \fBDO\f5 statement + +.ft R +The statement +.ti +3m +\fBWITH\fR r\s-3\d1\u\s0, r\s-3\d2\u\s0, ..., r\s-3\dn\u\s0 \fBDO\f5 statement + +.ft R +is equivalent to +.in +3m +\fBWITH\fR r\s-3\d1\u\s0 \fBDO\fR + \fBWITH\fR r\s-3\d2\u\s0 \fBDO\fR + ... + \fBWITH\fR r\s-3\dn\u\s0 \fBDO\f5 statement + +.ft R +.in -3m +The translation of +.ti +3m +\fBWITH\fR r\s-3\d1\u\s0 \fBDO\f5 statement +.br +.ft R +is +.nf +.in +3m +\fIpush address of r\s-3\d1\u\s0 +\fIstore address in temporary +\fIcode statement +.in -3m +.fi + +.ft R +An occurrence of a field is translated into: +.in +3m +\fIload temporary +.br +\fIadd field-offset +.in -3m +.bp +.NH 2 +Procedure and Function Calls + +.ft R +In general, the call +.ti +5m +p(a\s-3\d1\u\s0, a\s-3\d2\u\s0, ...., a\s-3\dn\u\s0) +.br +is translated into the sequence: + +.in +5m +.nf +\fIevaluate a\s-3\dn\u\s0 +\&. +\&. +\fIevaluate a\s-3\d2\u\s0 +\fIevaluate a\s-3\d1\u\s0 +\fIpush localbase +\fBcal\fR $p +\fIpop parameters +.ft R +.fi +.in -5m + +i.e. the order of evaluation and binding of the actual-parameters is from +right to left. In general, a copy of the actual-parameter is made when the +formal-parameter is a value-parameter. If the formal-parameter is a +variable-parameter, a pointer to the actual-parameter is pushed. + +In case of a function call, a \fBlfr\fR is generated, which pushes the +function result on top of the stack. +.sp 2 +.NH 2 +Register Messages + +.ft R +A register message can be generated to indicate that a local variable is never +referenced indirectly. This implies that a register can be used for a variable. 
+We distinguish the following classes, given in decreasing priority: + +\(bu control-variable and final-value of a for-statement +.br +.ti +5m +to speed up testing, and execution of the body of the for-statement +.sp +\(bu record-variable of a with-statement +.br +.ti +5m +to improve the field selection of a record +.sp +\(bu remaining local variables and parameters +.sp 2 +.NH 2 +Compile-time optimizations + +.ft R +The only optimization that is performed is the evaluation of constant +integral expressions. The optimization of constructs like +.ti +5m +\fBif\f5 false \fBthen\f5 statement\fR, +.br +is left to either the peephole optimizer, or a global optimizer. diff --git a/doc/pascal/vrk.doc b/doc/pascal/vrk.doc new file mode 100644 index 000000000..c5622a5d7 --- /dev/null +++ b/doc/pascal/vrk.doc @@ -0,0 +1,23 @@ +.TL + + + +The ACK Pascal Compiler +.AU +Aad Geudeke +Frans Hofmeester +.AI +Dept. of Mathematics and Computer Science +Vrije Universiteit +Amsterdam, The Netherlands +.LP +.ps 12 +.sp 24 +.ce 5 +.ft I +There is always something like something that there should not be. +.sp 2 +.ps 10 +For Whom The Bell Tolls +.ft R +Ernest Hemingway