Changed to be printed on laserprinter.

Removed paragraph about bug, since bug is now solved
sater 1985-12-04 15:52:51 +00:00
parent a91e33ce96
commit 5c0660793d

@ -57,7 +57,7 @@ and the assembly code of the machine at hand.
.NH 1 .NH 1
What has changed since version 1 ? What has changed since version 1 ?
.PP .PP
This chapter can be skipped by anyone not familiar with the first version. This section can be skipped by anyone not familiar with the first version.
It is not needed to understand the current version. It is not needed to understand the current version.
.PP .PP
This paper describes the second version of the code generator system. This paper describes the second version of the code generator system.
@ -116,39 +116,40 @@ Alternatively one can think of the real stack as an infinite extension
at the bottom of the fake stack. at the bottom of the fake stack.
Both ways, the concatenation of the real stack and the fake stack Both ways, the concatenation of the real stack and the fake stack
will be the stack as it would have been on a real EM machine (see figure). will be the stack as it would have been on a real EM machine (see figure).
.KF .TS
.DS L center;
.ta 8 16 24 32 40 48 56 64 72 cw(3.5c) cw(3c) cw(3.5c)
EM machine target machine cw(3.5c) cw(3c) cw(3.5c)
|cw(3.5c)| cw(3c) |cw(3.5c)| .
| | | | EM machine target machine
| | | |
| | | |
| | | |
| | | real stack |
| | | | |
| | | | | growing
| EM stack | | | |
| | |_______________| \e|/
| | | |
| | | |
| | | |
| | | fake stack |
| | | |
|_______________| |_______________|
.I
Relation between EM stack, real stack and fake stack.
.R
.DE real stack
.KE stack
grows
EM stack \s+2\(br\s0
\s+2\(br\s0
\s+2\(br\s0 _
\s+2\(br\s0
\s+2\(da\s0
fake stack
_ _
.T&
ci s s.
Relation between EM stack, real stack and fake stack.
.TE
During code generation tokens will be kept on the fake stack as long During code generation tokens will be kept on the fake stack as long
as possible but when they are moved to the real stack, as possible but when they are moved to the real stack,
by generating code for the push, by generating code for the push,
all tokens above\u*\d all tokens above\v'-.25m'\(dg\v'.25m'
.FS .FS
* in this document the stack is assumed to grow downwards, \(dg in this document the stack is assumed to grow downwards,
although the top of the stack will mean the first element that will although the top of the stack will mean the first element that will
be popped. be popped.
.FE .FE
@ -297,8 +298,9 @@ at will to improve legibility.
Identifiers used in the table have the same syntax as C identifiers, Identifiers used in the table have the same syntax as C identifiers,
upper and lower case considered different, all characters significant. upper and lower case considered different, all characters significant.
Here is a list of reserved words; all of these are unavailable as identifiers. Here is a list of reserved words; all of these are unavailable as identifiers.
.DS L .TS
.ta 14 28 42 56 box;
l l l l l.
ADDR STACK from reg_any test ADDR STACK from reg_any test
COERCIONS STACKINGRULES gen reg_float to COERCIONS STACKINGRULES gen reg_float to
INSTRUCTIONS TESTS highw reg_loop ufit INSTRUCTIONS TESTS highw reg_loop ufit
@ -309,7 +311,7 @@ PROPERTIES cost loww reusing
REGISTERS defined move rom REGISTERS defined move rom
SETS exact pat samesign SETS exact pat samesign
SIZEFACTOR example proc sfit SIZEFACTOR example proc sfit
.DE .TE
C style comments are accepted. C style comments are accepted.
.DS .DS
/* this is a comment */ /* this is a comment */
@ -330,7 +332,7 @@ NAME=value
.DE .DE
value being an integer or string. value being an integer or string.
Three constants must be defined here: Three constants must be defined here:
.IP EM_WSIZE 10 .IP EM_WSIZE 14
Number of bytes in a machine word. Number of bytes in a machine word.
This is the number of bytes This is the number of bytes
a \fBloc\fP instruction will put on the stack. a \fBloc\fP instruction will put on the stack.
@ -368,13 +370,13 @@ FORMAT= "0%o"
to satisfy the old UNIX assembler that reads octal unless followed by to satisfy the old UNIX assembler that reads octal unless followed by
a period, and the ACK assembler that follows C conventions. a period, and the ACK assembler that follows C conventions.
.PP .PP
Tables under control of programs like Tables under control of source code control systems like
.I sccs .I sccs
or or
.I rcs .I rcs
can put their id-string here, for example can put their id-string here, for example
.DS .DS
rcsid="$Header$" rcsid="$\&Header$"
.DE .DE
These strings, like all strings in the table, will eventually These strings, like all strings in the table, will eventually
end up in the binary code generator produced. end up in the binary code generator produced.
@ -385,6 +387,7 @@ same order of magnitude.
This can be done as This can be done as
.DS .DS
SIZEFACTOR = C\d3\u/C\d4\u SIZEFACTOR = C\d3\u/C\d4\u
.sp
TIMEFACTOR = C\d1\u/C\d2\u TIMEFACTOR = C\d1\u/C\d2\u
.DE .DE
Above numbers must be read as rational numbers. Above numbers must be read as rational numbers.
@ -403,8 +406,8 @@ It consists of a list of user-defined
identifiers optionally followed by the size identifiers optionally followed by the size
of the property in parentheses, default EM_WSIZE. of the property in parentheses, default EM_WSIZE.
Example for the PDP-11: Example for the PDP-11:
.DS .TS
.ta 8 16 24 32 40 l l.
PROPERTIES /* The header word for this section */ PROPERTIES /* The header word for this section */
GENREG /* All PDP registers */ GENREG /* All PDP registers */
@ -420,15 +423,11 @@ DBLREGPAIR(16) /* Same, double precision */
LOCALBASE /* Guess what */ LOCALBASE /* Guess what */
STACKPOINTER STACKPOINTER
PROGRAMCOUNTER PROGRAMCOUNTER
.DE .TE
Registers are allocated by asking for a property, Registers are allocated by asking for a property,
so if for some reason in later parts of the table so if for some reason in later parts of the table
one particular register must be allocated it one particular register must be allocated it
has to have a unique property. has to have a unique property.
.PP
There is a bug in the codegenerator that can be circumvented by
providing a dummy property at the start of the property list.
The example has not been updated to show this.
.NH 2 .NH 2
Register definition Register definition
.PP .PP
@ -442,8 +441,8 @@ Syntax:
<register> : ident [ '(' string ')' ] [ '=' ident [ '+' ident ] ] <register> : ident [ '(' string ')' ] [ '=' ident [ '+' ident ] ]
.DE .DE
Example for the PDP-11: Example for the PDP-11:
.DS L .TS
.ta 8 16 24 32 40 48 56 64 l l.
REGISTERS REGISTERS
r0,r2,r4 : GENREG,REG. r0,r2,r4 : GENREG,REG.
@ -457,7 +456,7 @@ dr01("r0")=dr0+dr1,dr23("r2")=dr2+dr3 : DBLREGPAIR.
lb("r5") : GENREG,LOCALBASE. lb("r5") : GENREG,LOCALBASE.
sp : GENREG,STACKPOINTER. sp : GENREG,STACKPOINTER.
pc : GENREG,PROGRAMCOUNTER. pc : GENREG,PROGRAMCOUNTER.
.DE .TE
.PP .PP
The names in the left hand lists are names of registers as used The names in the left hand lists are names of registers as used
in the table. in the table.
@ -529,7 +528,8 @@ Tokens should usually be declared for every addressing mode
of the machine at hand and for every size directly usable in of the machine at hand and for every size directly usable in
a machine instruction. a machine instruction.
Example for the PDP-11 (incomplete): Example for the PDP-11 (incomplete):
.DS L .TS
l l.
TOKENS TOKENS
const2 = { INT num; } 2 cost(2,300) "$" num . const2 = { INT num; } 2 cost(2,300) "$" num .
@ -542,7 +542,7 @@ reginddef2 = { GENREG reg; ADDR off; } 2 "*" off "(" reg ")" .
regconst2 = { GENREG reg; ADDR off; } 2 . regconst2 = { GENREG reg; ADDR off; } 2 .
relative2 = { ADDR off; } 2 off . relative2 = { ADDR off; } 2 off .
reldef2 = { ADDR off; } 2 "*" off. reldef2 = { ADDR off; } 2 "*" off.
.DE .TE
.PP .PP
Types allowed in the struct are ADDR, INT and all register properties. Types allowed in the struct are ADDR, INT and all register properties.
The type ADDR means a string and an integer, The type ADDR means a string and an integer,
@ -624,13 +624,13 @@ in the remainder of the table,
but for clarity it is usually better not to. but for clarity it is usually better not to.
.LP .LP
Example for the PDP-11 (incomplete): Example for the PDP-11 (incomplete):
.DS L .TS
.ta 8 16 24 32 40 48 56 64 l l.
SETS SETS
src2 = GENREG + regdef2 + regind2 + reginddef2 + relative2 + src2 = GENREG + regdef2 + regind2 + reginddef2 + relative2 +
reldef2 + addr_external + const2 + LOCAL + ILOCAL + \h'\w'= 'u'reldef2 + addr_external + const2 + LOCAL + ILOCAL +
autodec + autoinc . \h'\w'= 'u'autodec + autoinc .
dst2 = src2 - ( const2 + addr_external ) . dst2 = src2 - ( const2 + addr_external ) .
xsrc2 = src2 + ftoint . xsrc2 = src2 + ftoint .
src1 = regdef1 + regind1 + reginddef1 + relative1 + reldef1 . src1 = regdef1 + regind1 + reginddef1 + relative1 + reldef1 .
@ -638,7 +638,7 @@ dst1 = src1 .
src1or2 = src1 + src2 . src1or2 = src1 + src2 .
src4 = relative4 + regdef4 + DLOCAL + regind4 . src4 = relative4 + regdef4 + DLOCAL + regind4 .
dst4 = src4 . dst4 = src4 .
.DE .TE
Permissible in the set construction are all the usual set operators, i.e. Permissible in the set construction are all the usual set operators, i.e.
.IP + .IP +
set union set union
@ -1252,7 +1252,7 @@ The author of
.I cgg .I cgg
could not get could not get
.I yacc .I yacc
to be silent without it. to accept his syntax without it.
Sorry about this. Sorry about this.
.IP 2) .IP 2)
a a
@ -1370,15 +1370,14 @@ A list of examples for the PDP-11 is given here.
Far from being complete it gives examples of most kinds Far from being complete it gives examples of most kinds
of instructions. of instructions.
.DS .DS
.ta 8 16 24 32 40 48 56 64 .ta 7.5c
pat loc yields {const2, $1} pat loc yields {const2, $1}
pat ldc yields {const2, loww($1)} pat ldc yields {const2, loww($1)} {const2, highw($1)}
{const2, highw($1)}
.DE .DE
These simple patterns just push one or more tokens onto the fake stack. These simple patterns just push one or more tokens onto the fake stack.
.DS .DS
.ta 8 16 24 32 40 48 56 64 .ta 7.5c
pat lof pat lof
with REG yields {regind2,%1,$1} with REG yields {regind2,%1,$1}
with exact regconst2 yields {regind2,%1.reg,$1+%1.off} with exact regconst2 yields {regind2,%1.reg,$1+%1.off}
@ -1393,10 +1392,9 @@ not preceded by
that can always be taken after a coercion, that can always be taken after a coercion,
if necessary. if necessary.
.DS .DS
.ta 8 16 24 32 40 48 56 64 .ta 7.5c
pat lxl $1>3 pat lxl $1>3
uses REG={LOCAL, SL, 2}, uses REG={LOCAL, SL, 2}, REG={const2,$1-1}
REG={const2,$1-1}
gen 1: gen 1:
move {regind2,%a, SL},%a move {regind2,%a, SL},%a
sob %b,{label,1b} yields %a sob %b,{label,1b} yields %a
@ -1408,7 +1406,7 @@ of the static link,
that is pushed by the Pascal compiler as the last argument of that is pushed by the Pascal compiler as the last argument of
a function. a function.
.DS .DS
.ta 8 16 24 32 40 48 56 64 .ta 7.5c
pat stf pat stf
with regconst2 xsrc2 with regconst2 xsrc2
kills allexeptcon kills allexeptcon
@ -1423,7 +1421,7 @@ part in a store instruction.
The set allexeptcon contains all tokens that can be the destination The set allexeptcon contains all tokens that can be the destination
of an indirect store. of an indirect store.
.DS .DS
.ta 8 16 24 32 40 48 56 64 .ta 7.5c
pat sde pat sde
with exact FLTREG with exact FLTREG
kills posextern kills posextern
@ -1449,7 +1447,7 @@ The third rule is taken by default,
resulting in two separate stores, resulting in two separate stores,
nothing better exists on the PDP-11. nothing better exists on the PDP-11.
.DS .DS
.ta 8 16 24 32 40 48 56 64 .ta 7.5c
pat sbi $1==2 pat sbi $1==2
with src2 REG with src2 REG
gen sub %1,%2 yields %2 gen sub %1,%2 yields %2
@ -1462,7 +1460,7 @@ This rule for
has a normal first part, has a normal first part,
and a hand optimized special case as its second part. and a hand optimized special case as its second part.
.DS .DS
.ta 8 16 24 32 40 48 56 64 .ta 7.5c
pat mli $1==2 pat mli $1==2
with ODDREG src2 with ODDREG src2
gen mul %2,%1 yields %1 gen mul %2,%1 yields %1
@ -1473,7 +1471,7 @@ This shows the general property for rules with commutative
operators, operators,
heuristics or look ahead will have to decide which rule is the best. heuristics or look ahead will have to decide which rule is the best.
.DS .DS
.ta 8 16 24 32 40 48 56 64 .ta 7.5c
pat loc sli $1==1 && $2==2 pat loc sli $1==1 && $2==2
with REG with REG
gen asl %1 yields %1 gen asl %1 yields %1
@ -1481,7 +1479,7 @@ gen asl %1 yields %1
A simple rule involving a longer EM-pattern, A simple rule involving a longer EM-pattern,
to make use of a specialized instruction available. to make use of a specialized instruction available.
.DS .DS
.ta 8 16 24 32 40 48 56 64 .ta 7.5c
pat loc loc cii $1==1 && $2==2 pat loc loc cii $1==1 && $2==2
with src1or2 with src1or2
uses reusing %1,REG uses reusing %1,REG
@ -1492,8 +1490,9 @@ Note the
.I reusing .I reusing
clause. clause.
.DS .DS
.ta 8 16 24 32 40 48 56 64 .ta 7.5c
pat loc loc loc cii $1>=0 && $2==2 && $3==4 leaving loc $1 loc 0 pat loc loc loc cii $1>=0 && $2==2 && $3==4
leaving loc $1 loc 0
.DE .DE
Shows a trivial example of EM-replacement. Shows a trivial example of EM-replacement.
This is a rule that could be done by the This is a rule that could be done by the
@ -1502,7 +1501,7 @@ if word order in longs was defined in EM.
On a `big-endian' machine the two replacement On a `big-endian' machine the two replacement
instructions would be the other way around. instructions would be the other way around.
.DS .DS
.ta 8 16 24 32 40 48 56 64 .ta 7.5c
pat and $1==2 pat and $1==2
with const2 REG with const2 REG
gen bic {const2,~%1.num},%2 yields %2 gen bic {const2,~%1.num},%2 yields %2
@ -1517,7 +1516,7 @@ if an
.I and -instruction .I and -instruction
is not available on your machine. is not available on your machine.
.DS .DS
.ta 8 16 24 32 40 48 56 64 .ta 7.5c
pat set $1==2 pat set $1==2
with REG with REG
uses REG={const2,1} uses REG={const2,1}
@ -1525,7 +1524,7 @@ gen ash %1,%a yields %a
.DE .DE
Shows the building of a word-size set. Shows the building of a word-size set.
.DS .DS
.ta 8 16 24 32 40 48 56 64 .ta 7.5c
pat lae aar $2==2 && rom($1,3)==1 && rom($1,1)==0 pat lae aar $2==2 && rom($1,3)==1 && rom($1,1)==0
leaving adi 2 leaving adi 2
@ -1535,7 +1534,7 @@ pat lae aar $2==2 && rom($1,3)==1 && rom($1,1)!=0
Two rules showing the use of the rom pseudo function, Two rules showing the use of the rom pseudo function,
and some array optimisation. and some array optimisation.
.DS .DS
.ta 8 16 24 32 40 48 56 64 .ta 7.5c
pat bra pat bra
with STACK with STACK
gen jbr {label, $1} gen jbr {label, $1}
@ -1544,7 +1543,7 @@ A simple jump.
The stack pattern guarantees that everything will be stacked The stack pattern guarantees that everything will be stacked
before the jump is taken. before the jump is taken.
.DS .DS
.ta 8 16 24 32 40 48 56 64 .ta 7.5c
pat cal pat cal
with STACK with STACK
gen jsr pc,{label, $1} gen jsr pc,{label, $1}
@ -1552,7 +1551,7 @@ gen jsr pc,{label, $1}
A simple call. A simple call.
Same comments as previous rule. Same comments as previous rule.
.DS .DS
.ta 8 16 24 32 40 48 56 64 .ta 7.5c
pat lfr $1==2 yields r0 pat lfr $1==2 yields r0
pat lfr $1==4 yields r1 r0 pat lfr $1==4 yields r1 r0
.DE .DE
@ -1564,7 +1563,7 @@ instruction, and some other instructions must leave
the function return area intact. the function return area intact.
See the defining document for EM for exact information. See the defining document for EM for exact information.
.DS .DS
.ta 8 16 24 32 40 48 56 64 .ta 7.5c
pat ret $1==0 pat ret $1==0
with STACK with STACK
gen mov lb,sp gen mov lb,sp
@ -1578,7 +1577,7 @@ In a table with register variables the
part would just contain part would just contain
.I return . .I return .
.DS .DS
.ta 8 16 24 32 40 48 56 64 .ta 7.5c
pat blm pat blm
with REG REG with REG REG
uses REG={const2,$1/2} uses REG={const2,$1/2}
@ -1596,7 +1595,7 @@ It uses the marriage thesis from Hall,
a thesis from combinatorial mathematics, a thesis from combinatorial mathematics,
to accomplish this. to accomplish this.
.DS .DS
.ta 8 16 24 32 40 48 56 64 .ta 7.5c
pat exg $1==2 pat exg $1==2
with src2 src2 yields %1 %2 with src2 src2 yields %1 %2
.DE .DE
@ -1604,7 +1603,7 @@ This rule shows the exchanging of two elements on the fake stack.
.NH 2 .NH 2
Code rules using procedures Code rules using procedures
.PP .PP
To start this chapter it must be admitted at once that the To start this section it must be admitted at once that the
word procedure is chosen here mainly for its advertising word procedure is chosen here mainly for its advertising
value. value.
It more resembles a glorified goto but this of course can It more resembles a glorified goto but this of course can
@ -1664,7 +1663,7 @@ The string `*' can be used as an equivalent for `[1]'.
Just in case this is not clear, here is an example for Just in case this is not clear, here is an example for
a procedure to increment/decrement a register. a procedure to increment/decrement a register.
.DS .DS
.ta 8 16 24 32 40 48 56 64 .ta 7.5c
incop REG:rw:cc . /* in the INSTRUCTIONS part of course */ incop REG:rw:cc . /* in the INSTRUCTIONS part of course */
proc incdec proc incdec
@ -1680,7 +1679,7 @@ call <identifier> '(' string [ ',' string ] ')'
.DE .DE
which leads to the following large example: which leads to the following large example:
.DS .DS
.ta 8 16 24 32 40 48 56 64 .ta 7.5c
proc bxx example beq proc bxx example beq
with src2 src2 STACK with src2 src2 STACK
gen cmp %2,%1 gen cmp %2,%1
@ -1856,7 +1855,7 @@ The next part of the table defines the coercions that are possible
on the defined tokens. on the defined tokens.
Example for the PDP-11: Example for the PDP-11:
.DS .DS
.ta 8 16 24 32 40 48 56 64 .ta 7.5c
COERCIONS COERCIONS
from STACK from STACK
@ -1875,7 +1874,7 @@ gen mov {autoinc,sp},%a.1
These three coercions just deliver a certain type These three coercions just deliver a certain type
of register by popping it from the real stack. of register by popping it from the real stack.
.DS .DS
.ta 8 16 24 32 40 48 56 64 .ta 7.5c
from LOCAL yields {regind2,lb,%1.ind} from LOCAL yields {regind2,lb,%1.ind}
from DLOCAL yields {regind4,lb,%1.ind} from DLOCAL yields {regind4,lb,%1.ind}
@ -1884,7 +1883,7 @@ from REG yields {regconst2, %1, 0}
.DE .DE
These three are zero-cost rewriting rules. These three are zero-cost rewriting rules.
.DS .DS
.ta 8 16 24 32 40 48 56 64 .ta 7.5c
from regconst2 %1.off==1 from regconst2 %1.off==1
uses reusing %1,REG=%1.reg uses reusing %1,REG=%1.reg
gen inc %a yields %a gen inc %a yields %a
@ -1904,7 +1903,7 @@ Only in the last case is it always necessary to allocate
an extra register, an extra register,
since arithmetic on the localbase is unthinkable. since arithmetic on the localbase is unthinkable.
.DS .DS
.ta 8 16 24 32 40 48 56 64 .ta 7.5c
from xsrc2 from xsrc2
uses reusing %1, REG=%1 yields %a uses reusing %1, REG=%1 yields %a
@ -1925,7 +1924,7 @@ ensure bytes are not sign-extended.
In EM it is defined that the result of a \fBloi\fP\ 1 In EM it is defined that the result of a \fBloi\fP\ 1
instruction is an integer in the range 0..255. instruction is an integer in the range 0..255.
.DS .DS
.ta 8 16 24 32 40 48 56 64 .ta 7.5c
from REGPAIR yields %1.2 %1.1 from REGPAIR yields %1.2 %1.1
from regind4 yields {regind2,%1.reg,2+%1.off} from regind4 yields {regind2,%1.reg,2+%1.off}
@ -2086,7 +2085,7 @@ If omitted no initialization is assumed.
.NH 3 .NH 3
Example mach.h for the PDP-11 Example mach.h for the PDP-11
.DS L .DS L
.ta 8 16 24 32 40 48 56 .ta 4c
#define ex_ap(y) fprintf(codefile,"\et.globl %s\en",y) #define ex_ap(y) fprintf(codefile,"\et.globl %s\en",y)
#define in_ap(y) /* nothing */ #define in_ap(y) /* nothing */
@ -2157,7 +2156,7 @@ mes(w_mesno)
This function is called when a This function is called when a
.B mes .B mes
pseudo is seen that is not handled by the machine independent part. pseudo is seen that is not handled by the machine independent part.
Example below shows all you probably have to know about that. The example below shows all you probably have to know about that.
.IP - .IP -
segname[] segname[]
.br .br
@ -2216,7 +2215,7 @@ Example mach.c for the PDP-11
As an example of the sort of code expected, As an example of the sort of code expected,
the mach.c for the PDP-11 is presented here. the mach.c for the PDP-11 is presented here.
.DS L .DS L
.ta 8 16 24 32 40 48 56 64 .ta 0.5i 1i 1.5i 2i 2.5i 3i 3.5i 4i 4.5i
/* /*
* machine dependent back end routines for the PDP-11 * machine dependent back end routines for the PDP-11
*/ */