Changed to be printed on laserprinter.
Removed paragraph about bug, since bug is now solved
parent
a91e33ce96
commit
5c0660793d
173
doc/ncg.doc
|
@ -57,7 +57,7 @@ and the assembly code of the machine at hand.
|
|||
.NH 1
|
||||
What has changed since version 1 ?
|
||||
.PP
|
||||
This chapter can be skipped by anyone not familiar with the first version.
|
||||
This section can be skipped by anyone not familiar with the first version.
|
||||
It is not needed to understand the current version.
|
||||
.PP
|
||||
This paper describes the second version of the code generator system.
|
||||
|
@ -116,39 +116,40 @@ Alternatively one can think of the real stack as an infinite extension
|
|||
at the bottom of the fake stack.
|
||||
Both ways, the concatenation of the real stack and the fake stack
|
||||
will be the stack as it would have been on a real EM machine (see figure).
|
||||
.KF
|
||||
.DS L
|
||||
.ta 8 16 24 32 40 48 56 64 72
|
||||
        EM machine                      target machine

        |               |               |               |
        |               |               |               |
        |               |               |               |
        |               |               |               |
        |               |               |  real stack   |
        |               |               |               |       |
        |               |               |               |       | growing
        |   EM stack    |               |               |       |
        |               |               |_______________|      \e|/
        |               |               |               |
        |               |               |               |
        |               |               |               |
        |               |               |  fake stack   |
        |               |               |               |
        |_______________|               |_______________|
|
||||
.TS
|
||||
center;
|
||||
cw(3.5c) cw(3c) cw(3.5c)
|
||||
cw(3.5c) cw(3c) cw(3.5c)
|
||||
|cw(3.5c)| cw(3c) |cw(3.5c)| .
|
||||
EM machine target machine
|
||||
|
||||
|
||||
.I
|
||||
Relation between EM stack, real stack and fake stack.
|
||||
.R
|
||||
.DE
|
||||
.KE
|
||||
|
||||
|
||||
|
||||
real stack
|
||||
stack
|
||||
grows
|
||||
EM stack \s+2\(br\s0
|
||||
\s+2\(br\s0
|
||||
\s+2\(br\s0 _
|
||||
\s+2\(br\s0
|
||||
\s+2\(da\s0
|
||||
fake stack
|
||||
|
||||
|
||||
|
||||
_ _
|
||||
.T&
|
||||
ci s s.
|
||||
Relation between EM stack, real stack and fake stack.
|
||||
.TE
|
||||
During code generation tokens will be kept on the fake stack as long
|
||||
as possible but when they are moved to the real stack,
|
||||
by generating code for the push,
|
||||
all tokens above\u*\d
|
||||
all tokens above\v'-.25m'\(dg\v'.25m'
|
||||
.FS
|
||||
* in this document the stack is assumed to grow downwards,
|
||||
\(dg in this document the stack is assumed to grow downwards,
|
||||
although the top of the stack will mean the first element that will
|
||||
be popped.
|
||||
.FE
|
||||
|
@ -297,8 +298,9 @@ at will to improve legibility.
|
|||
Identifiers used in the table have the same syntax as C identifiers,
|
||||
upper and lower case considered different, all characters significant.
|
||||
Here is a list of reserved words; all of these are unavailable as identifiers.
|
||||
.DS L
|
||||
.ta 14 28 42 56
|
||||
.TS
|
||||
box;
|
||||
l l l l l.
|
||||
ADDR STACK from reg_any test
|
||||
COERCIONS STACKINGRULES gen reg_float to
|
||||
INSTRUCTIONS TESTS highw reg_loop ufit
|
||||
|
@ -309,7 +311,7 @@ PROPERTIES cost loww reusing
|
|||
REGISTERS defined move rom
|
||||
SETS exact pat samesign
|
||||
SIZEFACTOR example proc sfit
|
||||
.DE
|
||||
.TE
|
||||
C style comments are accepted.
|
||||
.DS
|
||||
/* this is a comment */
|
||||
|
@ -330,7 +332,7 @@ NAME=value
|
|||
.DE
|
||||
value being an integer or string.
|
||||
Three constants must be defined here:
|
||||
.IP EM_WSIZE 10
|
||||
.IP EM_WSIZE 14
|
||||
Number of bytes in a machine word.
|
||||
This is the number of bytes
|
||||
a \fBloc\fP instruction will put on the stack.
|
||||
|
@ -368,13 +370,13 @@ FORMAT= "0%o"
|
|||
to satisfy the old UNIX assembler that reads octal unless followed by
|
||||
a period, and the ACK assembler that follows C conventions.
|
||||
.PP
|
||||
Tables under control of programs like
|
||||
Tables under control of source code control systems like
|
||||
.I sccs
|
||||
or
|
||||
.I rcs
|
||||
can put their id-string here, for example
|
||||
.DS
|
||||
rcsid="$Header$"
|
||||
rcsid="$\&Header$"
|
||||
.DE
|
||||
These strings, like all strings in the table, will eventually
|
||||
end up in the binary code generator produced.
|
||||
|
@ -385,6 +387,7 @@ same order of magnitude.
|
|||
This can be done as
|
||||
.DS
|
||||
SIZEFACTOR = C\d3\u/C\d4\u
|
||||
.sp
|
||||
TIMEFACTOR = C\d1\u/C\d2\u
|
||||
.DE
|
||||
Above numbers must be read as rational numbers.
|
||||
|
@ -403,8 +406,8 @@ It consists of a list of user-defined
|
|||
identifiers optionally followed by the size
|
||||
of the property in parentheses, default EM_WSIZE.
|
||||
Example for the PDP-11:
|
||||
.DS
|
||||
.ta 8 16 24 32 40
|
||||
.TS
|
||||
l l.
|
||||
PROPERTIES /* The header word for this section */
|
||||
|
||||
GENREG /* All PDP registers */
|
||||
|
@ -420,15 +423,11 @@ DBLREGPAIR(16) /* Same, double precision */
|
|||
LOCALBASE /* Guess what */
|
||||
STACKPOINTER
|
||||
PROGRAMCOUNTER
|
||||
.DE
|
||||
.TE
|
||||
Registers are allocated by asking for a property,
|
||||
so if for some reason in later parts of the table
|
||||
one particular register must be allocated it
|
||||
has to have a unique property.
|
||||
.PP
|
||||
There is a bug in the codegenerator that can be circumvented by
|
||||
providing a dummy property at the start of the property list.
|
||||
The example has not been updated to show this.
|
||||
.NH 2
|
||||
Register definition
|
||||
.PP
|
||||
|
@ -442,8 +441,8 @@ Syntax:
|
|||
<register> : ident [ '(' string ')' ] [ '=' ident [ '+' ident ] ]
|
||||
.DE
|
||||
Example for the PDP-11:
|
||||
.DS L
|
||||
.ta 8 16 24 32 40 48 56 64
|
||||
.TS
|
||||
l l.
|
||||
REGISTERS
|
||||
|
||||
r0,r2,r4 : GENREG,REG.
|
||||
|
@ -457,7 +456,7 @@ dr01("r0")=dr0+dr1,dr23("r2")=dr2+dr3 : DBLREGPAIR.
|
|||
lb("r5") : GENREG,LOCALBASE.
|
||||
sp : GENREG,STACKPOINTER.
|
||||
pc : GENREG,PROGRAMCOUNTER.
|
||||
.DE
|
||||
.TE
|
||||
.PP
|
||||
The names in the left hand lists are names of registers as used
|
||||
in the table.
|
||||
|
@ -529,7 +528,8 @@ Tokens should usually be declared for every addressing mode
|
|||
of the machine at hand and for every size directly usable in
|
||||
a machine instruction.
|
||||
Example for the PDP-11 (incomplete):
|
||||
.DS L
|
||||
.TS
|
||||
l l.
|
||||
TOKENS
|
||||
|
||||
const2 = { INT num; } 2 cost(2,300) "$" num .
|
||||
|
@ -542,7 +542,7 @@ reginddef2 = { GENREG reg; ADDR off; } 2 "*" off "(" reg ")" .
|
|||
regconst2 = { GENREG reg; ADDR off; } 2 .
|
||||
relative2 = { ADDR off; } 2 off .
|
||||
reldef2 = { ADDR off; } 2 "*" off.
|
||||
.DE
|
||||
.TE
|
||||
.PP
|
||||
Types allowed in the struct are ADDR, INT and all register properties.
|
||||
The type ADDR means a string and an integer,
|
||||
|
@ -624,13 +624,13 @@ in the remainder of the table,
|
|||
but for clarity it is usually better not to.
|
||||
.LP
|
||||
Example for the PDP-11 (incomplete):
|
||||
.DS L
|
||||
.ta 8 16 24 32 40 48 56 64
|
||||
.TS
|
||||
l l.
|
||||
SETS
|
||||
|
||||
src2 = GENREG + regdef2 + regind2 + reginddef2 + relative2 +
|
||||
reldef2 + addr_external + const2 + LOCAL + ILOCAL +
|
||||
autodec + autoinc .
|
||||
\h'\w'= 'u'reldef2 + addr_external + const2 + LOCAL + ILOCAL +
|
||||
\h'\w'= 'u'autodec + autoinc .
|
||||
dst2 = src2 - ( const2 + addr_external ) .
|
||||
xsrc2 = src2 + ftoint .
|
||||
src1 = regdef1 + regind1 + reginddef1 + relative1 + reldef1 .
|
||||
|
@ -638,7 +638,7 @@ dst1 = src1 .
|
|||
src1or2 = src1 + src2 .
|
||||
src4 = relative4 + regdef4 + DLOCAL + regind4 .
|
||||
dst4 = src4 .
|
||||
.DE
|
||||
.TE
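The set definitions in this example can be read as ordinary set algebra. The following is a minimal sketch in Python, not in the table language, assuming that `+` denotes union (as the text states) and that `-` denotes set difference. The token names are copied from the example; treating a register property such as GENREG as a single element is a simplification made for illustration only.

```python
# Sketch only: models the PDP-11 SETS example with Python sets.
# GENREG, LOCAL and ILOCAL are treated as single elements here,
# which is a simplification of what the table language does.
src2 = {
    "GENREG", "regdef2", "regind2", "reginddef2", "relative2",
    "reldef2", "addr_external", "const2", "LOCAL", "ILOCAL",
    "autodec", "autoinc",
}

# dst2 = src2 - ( const2 + addr_external )
dst2 = src2 - ({"const2"} | {"addr_external"})

# xsrc2 = src2 + ftoint
xsrc2 = src2 | {"ftoint"}

if __name__ == "__main__":
    print(sorted(dst2))
    print(sorted(xsrc2))
```

A constant or an external address can be a source but never a destination, which is why both are subtracted to form dst2.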
|
||||
Permissible in the set construction are all the usual set operators, i.e.
|
||||
.IP +
|
||||
set union
|
||||
|
@ -1252,7 +1252,7 @@ The author of
|
|||
.I cgg
|
||||
could not get
|
||||
.I yacc
|
||||
to be silent without it.
|
||||
to accept his syntax without it.
|
||||
Sorry about this.
|
||||
.IP 2)
|
||||
a
|
||||
|
@ -1370,15 +1370,14 @@ A list of examples for the PDP-11 is given here.
|
|||
Far from being complete it gives examples of most kinds
|
||||
of instructions.
|
||||
.DS
|
||||
.ta 8 16 24 32 40 48 56 64
|
||||
.ta 7.5c
|
||||
pat loc yields {const2, $1}
|
||||
|
||||
pat ldc yields {const2, loww($1)}
|
||||
{const2, highw($1)}
|
||||
pat ldc yields {const2, loww($1)} {const2, highw($1)}
|
||||
.DE
|
||||
These simple patterns just push one or more tokens onto the fake stack.
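As an illustration of the ldc pattern, here is a minimal sketch in Python rather than in the table language. It assumes that loww and highw select the low-order and high-order 16 bits of a four-byte constant; the function names merely mirror the table and are not part of cgg.

```python
def loww(value):
    """Low-order 16 bits of a constant (assumed meaning of loww)."""
    return value & 0xFFFF


def highw(value):
    """High-order 16 bits of a 32-bit constant (assumed meaning of highw)."""
    return (value >> 16) & 0xFFFF


def ldc(value):
    """Tokens for 'pat ldc', listed in the order the rule yields them."""
    return [("const2", loww(value)), ("const2", highw(value))]


if __name__ == "__main__":
    print(ldc(0x12345678))
```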
|
||||
.DS
|
||||
.ta 8 16 24 32 40 48 56 64
|
||||
.ta 7.5c
|
||||
pat lof
|
||||
with REG yields {regind2,%1,$1}
|
||||
with exact regconst2 yields {regind2,%1.reg,$1+%1.off}
|
||||
|
@ -1393,10 +1392,9 @@ not preceded by
|
|||
that can always be taken after a coercion,
|
||||
if necessary.
|
||||
.DS
|
||||
.ta 8 16 24 32 40 48 56 64
|
||||
.ta 7.5c
|
||||
pat lxl $1>3
|
||||
uses REG={LOCAL, SL, 2},
|
||||
REG={const2,$1-1}
|
||||
uses REG={LOCAL, SL, 2}, REG={const2,$1-1}
|
||||
gen 1:
|
||||
move {regind2,%a, SL},%a
|
||||
sob %b,{label,1b} yields %a
|
||||
|
@ -1408,7 +1406,7 @@ of the static link,
|
|||
that is pushed by the Pascal compiler as the last argument of
|
||||
a function.
|
||||
.DS
|
||||
.ta 8 16 24 32 40 48 56 64
|
||||
.ta 7.5c
|
||||
pat stf
|
||||
with regconst2 xsrc2
|
||||
kills allexeptcon
|
||||
|
@ -1423,7 +1421,7 @@ part in a store instruction.
|
|||
The set allexeptcon contains all tokens that can be the destination
|
||||
of an indirect store.
|
||||
.DS
|
||||
.ta 8 16 24 32 40 48 56 64
|
||||
.ta 7.5c
|
||||
pat sde
|
||||
with exact FLTREG
|
||||
kills posextern
|
||||
|
@ -1449,7 +1447,7 @@ The third rule is taken by default,
|
|||
resulting in two separate stores,
|
||||
nothing better exists on the PDP-11.
|
||||
.DS
|
||||
.ta 8 16 24 32 40 48 56 64
|
||||
.ta 7.5c
|
||||
pat sbi $1==2
|
||||
with src2 REG
|
||||
gen sub %1,%2 yields %2
|
||||
|
@ -1462,7 +1460,7 @@ This rule for
|
|||
has a normal first part,
|
||||
and a hand-optimized special case as its second part.
|
||||
.DS
|
||||
.ta 8 16 24 32 40 48 56 64
|
||||
.ta 7.5c
|
||||
pat mli $1==2
|
||||
with ODDREG src2
|
||||
gen mul %2,%1 yields %1
|
||||
|
@ -1473,7 +1471,7 @@ This shows the general property for rules with commutative
|
|||
operators,
|
||||
heuristics or look ahead will have to decide which rule is the best.
|
||||
.DS
|
||||
.ta 8 16 24 32 40 48 56 64
|
||||
.ta 7.5c
|
||||
pat loc sli $1==1 && $2==2
|
||||
with REG
|
||||
gen asl %1 yields %1
|
||||
|
@ -1481,7 +1479,7 @@ gen asl %1 yields %1
|
|||
A simple rule involving a longer EM-pattern,
|
||||
to make use of a specialized instruction available.
|
||||
.DS
|
||||
.ta 8 16 24 32 40 48 56 64
|
||||
.ta 7.5c
|
||||
pat loc loc cii $1==1 && $2==2
|
||||
with src1or2
|
||||
uses reusing %1,REG
|
||||
|
@ -1492,8 +1490,9 @@ Note the
|
|||
.I reusing
|
||||
clause.
|
||||
.DS
|
||||
.ta 8 16 24 32 40 48 56 64
|
||||
pat loc loc loc cii $1>=0 && $2==2 && $3==4 leaving loc $1 loc 0
|
||||
.ta 7.5c
|
||||
pat loc loc loc cii $1>=0 && $2==2 && $3==4
|
||||
leaving loc $1 loc 0
|
||||
.DE
|
||||
Shows a trivial example of EM-replacement.
|
||||
This is a rule that could be done by the
|
||||
|
@ -1502,7 +1501,7 @@ if word order in longs was defined in EM.
|
|||
On a `big-endian' machine the two replacement
|
||||
instructions would be the other way around.
|
||||
.DS
|
||||
.ta 8 16 24 32 40 48 56 64
|
||||
.ta 7.5c
|
||||
pat and $1==2
|
||||
with const2 REG
|
||||
gen bic {const2,~%1.num},%2 yields %2
|
||||
|
@ -1517,7 +1516,7 @@ if an
|
|||
.I and -instruction
|
||||
is not available on your machine.
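The rule for and with a constant operand relies on the identity x AND c = x BIC (NOT c). The sketch below, in Python and assuming 16-bit words, checks that identity; the helper names are hypothetical and only model the PDP-11 bic instruction, which clears in its destination every bit that is set in its source.

```python
WORD_MASK = 0xFFFF  # 16-bit machine word, as on the PDP-11


def bic(src, dst):
    """Model of 'bic src,dst': clear in dst the bits that are set in src."""
    return dst & ~src & WORD_MASK


def and_via_bic(constant, reg):
    """Compute reg AND constant using only bic, as the table rule does."""
    complemented = ~constant & WORD_MASK  # the {const2,~%1.num} operand
    return bic(complemented, reg)


if __name__ == "__main__":
    print(hex(and_via_bic(0x00FF, 0x1234)))
```

Complementing the constant happens at code generation time, so the generated code still consists of a single instruction.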
|
||||
.DS
|
||||
.ta 8 16 24 32 40 48 56 64
|
||||
.ta 7.5c
|
||||
pat set $1==2
|
||||
with REG
|
||||
uses REG={const2,1}
|
||||
|
@ -1525,7 +1524,7 @@ gen ash %1,%a yields %a
|
|||
.DE
|
||||
Shows the building of a word-size set.
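In effect the rule loads the constant 1 into a register and shifts it left by the bit number found on the stack. A minimal Python sketch of that computation follows, assuming 16-bit words and a bit number in the range 0..15; the function name is hypothetical.

```python
def word_size_set(bit_number):
    """Return a 16-bit set with only bit 'bit_number' set (1 shifted left)."""
    return (1 << bit_number) & 0xFFFF


if __name__ == "__main__":
    for n in (0, 3, 15):
        print(n, hex(word_size_set(n)))
```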
|
||||
.DS
|
||||
.ta 8 16 24 32 40 48 56 64
|
||||
.ta 7.5c
|
||||
pat lae aar $2==2 && rom($1,3)==1 && rom($1,1)==0
|
||||
leaving adi 2
|
||||
|
||||
|
@ -1535,7 +1534,7 @@ pat lae aar $2==2 && rom($1,3)==1 && rom($1,1)!=0
|
|||
Two rules showing the use of the rom pseudo function,
|
||||
and some array optimization.
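The first of these rules can be understood from the general array address computation. The Python sketch below assumes that an EM array descriptor consists of three words (lower bound, upper bound minus lower bound, element size), so that rom($1,1) is the lower bound and rom($1,3) is the element size. With a lower bound of 0 and an element size of 1 the computation collapses to a plain addition, which is what the replacement adi 2 expresses. The function names are hypothetical.

```python
def element_address(base, index, descriptor):
    """General case: address of element 'index' given an array descriptor."""
    lower_bound, _extent, element_size = descriptor
    return base + (index - lower_bound) * element_size


def simplified_address(base, index):
    """Special case for lower bound 0 and element size 1: a plain add."""
    return base + index


if __name__ == "__main__":
    descriptor = (0, 9, 1)  # lower bound 0, ten elements, one byte each
    print(element_address(100, 7, descriptor), simplified_address(100, 7))
```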
|
||||
.DS
|
||||
.ta 8 16 24 32 40 48 56 64
|
||||
.ta 7.5c
|
||||
pat bra
|
||||
with STACK
|
||||
gen jbr {label, $1}
|
||||
|
@ -1544,7 +1543,7 @@ A simple jump.
|
|||
The stack pattern guarantees that everything will be stacked
|
||||
before the jump is taken.
|
||||
.DS
|
||||
.ta 8 16 24 32 40 48 56 64
|
||||
.ta 7.5c
|
||||
pat cal
|
||||
with STACK
|
||||
gen jsr pc,{label, $1}
|
||||
|
@ -1552,7 +1551,7 @@ gen jsr pc,{label, $1}
|
|||
A simple call.
|
||||
Same comments as previous rule.
|
||||
.DS
|
||||
.ta 8 16 24 32 40 48 56 64
|
||||
.ta 7.5c
|
||||
pat lfr $1==2 yields r0
|
||||
pat lfr $1==4 yields r1 r0
|
||||
.DE
|
||||
|
@ -1564,7 +1563,7 @@ instruction, and some other instructions must leave
|
|||
the function return area intact.
|
||||
See the defining document for EM for exact information.
|
||||
.DS
|
||||
.ta 8 16 24 32 40 48 56 64
|
||||
.ta 7.5c
|
||||
pat ret $1==0
|
||||
with STACK
|
||||
gen mov lb,sp
|
||||
|
@ -1578,7 +1577,7 @@ In a table with register variables the
|
|||
part would just contain
|
||||
.I return .
|
||||
.DS
|
||||
.ta 8 16 24 32 40 48 56 64
|
||||
.ta 7.5c
|
||||
pat blm
|
||||
with REG REG
|
||||
uses REG={const2,$1/2}
|
||||
|
@ -1596,7 +1595,7 @@ It uses the marriage thesis from Hall,
|
|||
a thesis from combinatorial mathematics,
|
||||
to accomplish this.
|
||||
.DS
|
||||
.ta 8 16 24 32 40 48 56 64
|
||||
.ta 7.5c
|
||||
pat exg $1==2
|
||||
with src2 src2 yields %1 %2
|
||||
.DE
|
||||
|
@ -1604,7 +1603,7 @@ This rule shows the exchanging of two elements on the fake stack.
|
|||
.NH 2
|
||||
Code rules using procedures
|
||||
.PP
|
||||
To start this chapter it must be admitted at once that the
|
||||
To start this section it must be admitted at once that the
|
||||
word procedure is chosen here mainly for its advertising
|
||||
value.
|
||||
It more resembles a glorified goto but this of course can
|
||||
|
@ -1664,7 +1663,7 @@ The string `*' can be used as an equivalent for `[1]'.
|
|||
Just in case this is not clear, here is an example for
|
||||
a procedure to increment/decrement a register.
|
||||
.DS
|
||||
.ta 8 16 24 32 40 48 56 64
|
||||
.ta 7.5c
|
||||
incop REG:rw:cc . /* in the INSTRUCTIONS part of course */
|
||||
|
||||
proc incdec
|
||||
|
@ -1680,7 +1679,7 @@ call <identifier> '(' string [ ',' string ] ')'
|
|||
.DE
|
||||
which leads to the following large example:
|
||||
.DS
|
||||
.ta 8 16 24 32 40 48 56 64
|
||||
.ta 7.5c
|
||||
proc bxx example beq
|
||||
with src2 src2 STACK
|
||||
gen cmp %2,%1
|
||||
|
@ -1856,7 +1855,7 @@ The next part of the table defines the coercions that are possible
|
|||
on the defined tokens.
|
||||
Example for the PDP-11:
|
||||
.DS
|
||||
.ta 8 16 24 32 40 48 56 64
|
||||
.ta 7.5c
|
||||
COERCIONS
|
||||
|
||||
from STACK
|
||||
|
@ -1875,7 +1874,7 @@ gen mov {autoinc,sp},%a.1
|
|||
These three coercions just deliver a certain type
|
||||
of register by popping it from the real stack.
|
||||
.DS
|
||||
.ta 8 16 24 32 40 48 56 64
|
||||
.ta 7.5c
|
||||
from LOCAL yields {regind2,lb,%1.ind}
|
||||
|
||||
from DLOCAL yields {regind4,lb,%1.ind}
|
||||
|
@ -1884,7 +1883,7 @@ from REG yields {regconst2, %1, 0}
|
|||
.DE
|
||||
These three are zero-cost rewriting rules.
|
||||
.DS
|
||||
.ta 8 16 24 32 40 48 56 64
|
||||
.ta 7.5c
|
||||
from regconst2 %1.off==1
|
||||
uses reusing %1,REG=%1.reg
|
||||
gen inc %a yields %a
|
||||
|
@ -1904,7 +1903,7 @@ Only in the last case is it always necessary to allocate
|
|||
an extra register,
|
||||
since arithmetic on the localbase is unthinkable.
|
||||
.DS
|
||||
.ta 8 16 24 32 40 48 56 64
|
||||
.ta 7.5c
|
||||
from xsrc2
|
||||
uses reusing %1, REG=%1 yields %a
|
||||
|
||||
|
@ -1925,7 +1924,7 @@ ensure bytes are not sign-extended.
|
|||
In EM it is defined that the result of a \fBloi\fP\ 1
|
||||
instruction is an integer in the range 0..255.
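The difference between the two ways of widening a byte can be sketched as follows. This is an illustration in Python, assuming 8-bit bytes widened to a machine word; the function names are hypothetical. A sign-extending byte load turns 0x80 into -128, whereas EM requires 128.

```python
def sign_extend_byte(byte):
    """Widen a byte the way a sign-extending byte load would."""
    byte &= 0xFF
    return byte - 0x100 if byte & 0x80 else byte


def zero_extend_byte(byte):
    """Widen a byte as EM requires for loi 1: result in the range 0..255."""
    return byte & 0xFF


if __name__ == "__main__":
    print(sign_extend_byte(0x80), zero_extend_byte(0x80))
```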
|
||||
.DS
|
||||
.ta 8 16 24 32 40 48 56 64
|
||||
.ta 7.5c
|
||||
from REGPAIR yields %1.2 %1.1
|
||||
|
||||
from regind4 yields {regind2,%1.reg,2+%1.off}
|
||||
|
@ -2086,7 +2085,7 @@ If omitted no initialization is assumed.
|
|||
.NH 3
|
||||
Example mach.h for the PDP-11
|
||||
.DS L
|
||||
.ta 8 16 24 32 40 48 56
|
||||
.ta 4c
|
||||
#define ex_ap(y) fprintf(codefile,"\et.globl %s\en",y)
|
||||
#define in_ap(y) /* nothing */
|
||||
|
||||
|
@ -2157,7 +2156,7 @@ mes(w_mesno)
|
|||
This function is called when a
|
||||
.B mes
|
||||
pseudo is seen that is not handled by the machine independent part.
|
||||
Example below shows all you probably have to know about that.
|
||||
The example below shows all you probably have to know about that.
|
||||
.IP -
|
||||
segname[]
|
||||
.br
|
||||
|
@ -2216,7 +2215,7 @@ Example mach.c for the PDP-11
|
|||
As an example of the sort of code expected,
|
||||
the mach.c for the PDP-11 is presented here.
|
||||
.DS L
|
||||
.ta 8 16 24 32 40 48 56 64
|
||||
.ta 0.5i 1i 1.5i 2i 2.5i 3i 3.5i 4i 4.5i
|
||||
/*
|
||||
* machine dependent back end routines for the PDP-11
|
||||
*/
|
||||
|
|