1986-12-10 11:40:00 +00:00
|
|
|
.ND
|
|
|
|
.tr ~
|
|
|
|
.ds >. .
|
|
|
|
.TL
|
|
|
|
The ACK Target Optimizer
|
|
|
|
.AU
|
|
|
|
H.E. Bal
|
|
|
|
.AI
|
|
|
|
Vrije Universiteit
|
|
|
|
Wiskundig Seminarium, Amsterdam
|
|
|
|
.AB
|
|
|
|
The Target Optimizer is one of several optimizers that are part of
|
|
|
|
the Amsterdam Compiler Kit.
|
|
|
|
It operates directly on assembly code,
|
|
|
|
rather than on a higher level intermediate code,
|
|
|
|
as the Peephole Optimizer and Global Optimizer do.
|
|
|
|
Consequently, the Target Optimizer can do optimizations
|
|
|
|
that are highly machine-dependent.
|
|
|
|
.PP
|
|
|
|
Each target machine has its own Target Optimizer.
|
|
|
|
New optimizers are generated by the Target Optimizer Generator,
|
|
|
|
which uses a machine-dependent table as input.
|
|
|
|
This document contains full information on how to
|
|
|
|
write such a table for a new machine.
|
|
|
|
It also discusses the implementation of the
|
|
|
|
Target Optimizer and its generator.
|
|
|
|
.AE
|
|
|
|
.bp
|
|
|
|
.NH 1
|
|
|
|
Introduction
|
|
|
|
.PP
|
|
|
|
.FS
|
|
|
|
This work was supported by the
|
|
|
|
Stichting Technische Wetenschappen (STW)
|
|
|
|
under grant VWI03.0001.
|
|
|
|
.FE
|
|
|
|
This document describes the target optimizer component
|
|
|
|
of the Amsterdam Compiler Kit (ACK) .
|
|
|
|
.[
|
|
|
|
tanenbaum staveren amsterdam toolkit
|
|
|
|
.]
|
|
|
|
.[
|
|
|
|
tanenbaum staveren cacm
|
|
|
|
.]
|
|
|
|
.[
|
|
|
|
tanenbaum staveren toronto
|
|
|
|
.]
|
|
|
|
Optimization takes place in several parts of ACK compilers,
|
|
|
|
most notably in the Peephole Optimizer
|
|
|
|
.[
|
|
|
|
staveren peephole toplas
|
|
|
|
.]
|
|
|
|
and
|
|
|
|
the Global Optimizer,
|
|
|
|
.[
|
|
|
|
bal tanenbaum global optimization
|
|
|
|
.]
|
|
|
|
.[
|
|
|
|
bal implementation global optimizer
|
|
|
|
.]
|
|
|
|
which are both language- and machine-independent,
|
|
|
|
and in the machine-specific code generators.
|
|
|
|
.[
|
|
|
|
documentation amsterdam compiler kit
|
|
|
|
.]
|
|
|
|
The target optimizer is the finishing touch in this sequence of
|
|
|
|
optimizers.
|
|
|
|
It can be used to capture those optimizations that are hard
|
|
|
|
to express in the other parts of ACK.
|
|
|
|
These optimizations will typically be very machine-specific.
|
|
|
|
.PP
|
|
|
|
The target optimizer operates on the assembly code of some target machine.
|
|
|
|
Hence there is one target optimizer per machine.
|
|
|
|
However, just as for the ACK code generators and assemblers,
|
|
|
|
a framework has been build that allows easy generation of
|
|
|
|
target optimizers out of machine-independent parts and a
|
|
|
|
machine-dependent description table (see figure 1.).
|
|
|
|
So the major part of the code of a target optimizer is
|
|
|
|
shared among all target optimizers.
|
|
|
|
.DS
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft 5
|
1986-12-10 11:40:00 +00:00
|
|
|
|
|
|
|
|
|
|
|
|-------------------------|
|
|
|
|
| machine-independent |
|
|
|
|
| code |
|
|
|
|
| |
|
|
|
|
|-----------------| |-------------------------|
|
|
|
|
descrip- |target optimizer | | machine-dependent code |
|
|
|
|
tion --> |generator | ----> | + tables |
|
|
|
|
table | | | |
|
|
|
|
|-----------------| |-------------------------|
|
|
|
|
|
|
|
|
target optimizer
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft R
|
1986-12-10 11:40:00 +00:00
|
|
|
|
|
|
|
Figure 1: Generation of a target optimizer.
|
|
|
|
|
|
|
|
.DE
|
|
|
|
.PP
|
|
|
|
This document focusses on the description of the machine-dependent table.
|
|
|
|
In chapter 2 we give an informal introduction to the optimization
|
|
|
|
algorithm and to the definition of the table format.
|
|
|
|
Chapters 3 and 4 discuss the implementation of the target optimizer
|
|
|
|
and the target optimizer generator.
|
|
|
|
Appendix A gives full information for writing a description table.
|
|
|
|
.bp
|
|
|
|
.NH 1
|
|
|
|
Global structure of the target optimizer
|
|
|
|
.PP
|
|
|
|
The target optimizer is based on the well understood model
|
|
|
|
of a \fIpeephole optimizer\fR.
|
|
|
|
.[
|
|
|
|
aho ullman compiler
|
|
|
|
.]
|
|
|
|
It contains a machine-dependent table
|
|
|
|
of (pattern,replacement) pairs.
|
|
|
|
Each pattern describes
|
|
|
|
a sequence of one or more assembler instructions
|
|
|
|
that can be replaced by zero or more equivalent, yet cheaper,
|
|
|
|
instructions (the 'replacement').
|
|
|
|
The optimizer maintains a \fIwindow\fR that moves over the input.
|
|
|
|
At any moment, the window contains some contiguous part of the input.
|
|
|
|
If the instructions in the current window match some pattern
|
|
|
|
in the table,
|
|
|
|
they are replaced by the corresponding replacement;
|
|
|
|
else, the window moves one instruction to the right.
|
|
|
|
.PP
|
|
|
|
In the remainder of this section we will give an informal
|
|
|
|
description of the machine-dependent table.
|
|
|
|
A more precise definition is given in appendix A.
|
|
|
|
We will first discuss the restrictions put on the
|
|
|
|
format of the assembly code.
|
|
|
|
.NH 2
|
|
|
|
Assumptions about the assembly code format
|
|
|
|
.PP
|
|
|
|
We assume that a line of assembly code begins with an
|
|
|
|
instruction \fImnemonic\fR (opcode),
|
|
|
|
followed by zero or more \fIoperands\fR.
|
|
|
|
The mnemonic and the first operand must be separated by a special
|
|
|
|
character (e.g. a space or a tab).
|
|
|
|
Likewise, the operands must be separated by a special
|
|
|
|
character (e.g. a comma).
|
|
|
|
These separators need not be the same for all machines.
|
|
|
|
.NH 2
|
|
|
|
Informal description of the machine-dependent tables
|
|
|
|
.PP
|
|
|
|
The major part of the table consists of (pattern,replacement) pairs
|
|
|
|
called \fIentries\fR.
|
|
|
|
.PP
|
|
|
|
A pattern is a list of instruction descriptions.
|
|
|
|
Each instruction description describes the instruction mnemonic and
|
|
|
|
the operands.
|
|
|
|
.PP
|
|
|
|
A mnemonic is described either by a string constant or by the
|
|
|
|
keyword ANY.
|
|
|
|
As all entities dealt with by the target optimizer are strings,
|
|
|
|
string constants do not contain quotes.
|
|
|
|
A string constant matches only itself.
|
|
|
|
ANY matches every instruction mnemonic.
|
|
|
|
.nf
|
|
|
|
|
|
|
|
Examples of mnemonic descriptions:
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft 5
|
1986-12-10 11:40:00 +00:00
|
|
|
|
|
|
|
add
|
|
|
|
sub.l
|
|
|
|
mulw3
|
|
|
|
ANY
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft R
|
1986-12-10 11:40:00 +00:00
|
|
|
.fi
|
|
|
|
.PP
|
|
|
|
An operand can also be described by a string constant.
|
|
|
|
.nf
|
|
|
|
|
|
|
|
Examples:
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft 5
|
1986-12-10 11:40:00 +00:00
|
|
|
|
|
|
|
(sp)+
|
|
|
|
r5
|
|
|
|
-4(r6)
|
|
|
|
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft R
|
1986-12-10 11:40:00 +00:00
|
|
|
.fi
|
|
|
|
Alternatively, it can be described by means of a \fIvariable name\fR.
|
|
|
|
Variables have values which are strings.
|
|
|
|
They have to be declared in the table before the patterns.
|
|
|
|
Each such declaration defines the name of a variable and
|
|
|
|
a \fIrestriction\fR to which its value is subjected.
|
|
|
|
.nf
|
|
|
|
Example of variable declarations:
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft 5
|
1986-12-10 11:40:00 +00:00
|
|
|
|
|
|
|
CONST { VAL[0] == '$' };
|
|
|
|
REG { VAL[0] == 'r' && VAL[1] >= '0' && VAL[1] <= '3' &&
|
|
|
|
VAL[2] == '\\0' };
|
|
|
|
X { TRUE };
|
|
|
|
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft R
|
1986-12-10 11:40:00 +00:00
|
|
|
.fi
|
|
|
|
The keyword VAL denotes the value of the variable, which is
|
|
|
|
a null-terminated string.
|
|
|
|
An operand description given via a variable name matches an
|
|
|
|
actual operand if the actual operand obeys the associated restriction.
|
|
|
|
.nf
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft 5
|
1986-12-10 11:40:00 +00:00
|
|
|
|
|
|
|
CONST matches $1, $-5, $foo etc.
|
|
|
|
REG matches r0, r1, r2 and r3
|
|
|
|
X matches anything
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft R
|
|
|
|
|
1986-12-10 11:40:00 +00:00
|
|
|
.fi
|
|
|
|
The restriction (between curly braces) may be any legal "C"
|
|
|
|
.[
|
|
|
|
kernighan ritchie c programming
|
|
|
|
.]
|
|
|
|
expression.
|
|
|
|
It may also contain calls to user-defined procedures.
|
|
|
|
These procedures must be added to the table after the patterns.
|
|
|
|
.nf
|
|
|
|
|
|
|
|
Example:
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft 5
|
1986-12-10 11:40:00 +00:00
|
|
|
|
|
|
|
FERMAT_NUMBER { VAL[0] == '$' && is_fermat_number(&VAL[1]) };
|
|
|
|
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft R
|
1986-12-10 11:40:00 +00:00
|
|
|
.fi
|
|
|
|
An operand can also be described by a mixture of a string constant
|
|
|
|
and a variable name.
|
|
|
|
The most general form allowed is:
|
|
|
|
.nf
|
|
|
|
|
|
|
|
string_constant1 variable_name string_constant2
|
|
|
|
|
|
|
|
Example:
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft 5
|
1986-12-10 11:40:00 +00:00
|
|
|
|
|
|
|
(REG)+ matches (r0)+, (r1)+, (r2)+ and (r3)+
|
|
|
|
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft R
|
1986-12-10 11:40:00 +00:00
|
|
|
.fi
|
|
|
|
Any of the three components may be omitted,
|
|
|
|
so the first two forms are just special cases of the general form.
|
|
|
|
The name of a variable can not be used as a string constant.
|
|
|
|
In the above context, it is impossible to define an operand that
|
|
|
|
matches the string "REG".
|
|
|
|
This limitation is of little consequence,
|
|
|
|
as the table writer is free to choose the names of variables.
|
|
|
|
This approach, however, avoids the need for awkward escape sequences.
|
|
|
|
.PP
|
|
|
|
A pattern consists of one or more instruction descriptions
|
|
|
|
(separated by a colon)
|
|
|
|
followed by an optional constraint.
|
|
|
|
A pattern "P1 : P2 : .. : Pn C" matches the sequence of
|
|
|
|
instructions "I1 I2 .. In" if:
|
|
|
|
.IP (i) 7
|
|
|
|
for each i, 1 <= i <= n, Pi matches Ii, as described above;
|
|
|
|
.IP (ii)
|
|
|
|
multiple occurrences of the same variable name or of
|
|
|
|
the keyword ANY stand for the same values throughout the pattern;
|
|
|
|
.IP (iii)
|
|
|
|
the optional constraint C is satisfied, i.e. it evaluates to TRUE.
|
|
|
|
.LP
|
|
|
|
.nf
|
|
|
|
The pattern:
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft 5
|
1986-12-10 11:40:00 +00:00
|
|
|
|
|
|
|
dec REG : move.b CONST,(REG)
|
|
|
|
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft R
|
1986-12-10 11:40:00 +00:00
|
|
|
matches:
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft 5
|
1986-12-10 11:40:00 +00:00
|
|
|
|
|
|
|
dec r0 : move.b $4,(r0)
|
|
|
|
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft R
|
1986-12-10 11:40:00 +00:00
|
|
|
but not:
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft 5
|
1986-12-10 11:40:00 +00:00
|
|
|
|
|
|
|
dec r0 : move.b $4,(r1)
|
|
|
|
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft R
|
1986-12-10 11:40:00 +00:00
|
|
|
(as the variable REG matches two different strings).
|
|
|
|
.fi
|
|
|
|
If a pattern containing different registers must be described,
|
|
|
|
extra names for a register should be declared, all sharing
|
|
|
|
the same restriction.
|
|
|
|
.nf
|
|
|
|
Example:
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft 5
|
1986-12-10 11:40:00 +00:00
|
|
|
|
|
|
|
REG1,REG2 { VAL[0] == 'r' && ..... };
|
|
|
|
|
|
|
|
addl3 REG1,REG1,REG2 : subl2 REG2,REG1
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft R
|
1986-12-10 11:40:00 +00:00
|
|
|
.fi
|
|
|
|
.PP
|
|
|
|
The optional constraint is an auxiliary "C" expression (just like
|
|
|
|
the parameter restrictions).
|
|
|
|
The expression may refer to the variables and to ANY.
|
|
|
|
.nf
|
|
|
|
Example:
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft 5
|
1986-12-10 11:40:00 +00:00
|
|
|
|
|
|
|
move REG1,REG2 { REG1[1] == REG2[1] + 1 }
|
|
|
|
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft R
|
1986-12-10 11:40:00 +00:00
|
|
|
matches
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft 5
|
1986-12-10 11:40:00 +00:00
|
|
|
|
|
|
|
move r1,r0
|
|
|
|
move r2,r1
|
|
|
|
move r3,r2
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft R
|
1986-12-10 11:40:00 +00:00
|
|
|
.fi
|
|
|
|
.PP
|
|
|
|
The replacement part of a (pattern,replacement) table entry
|
|
|
|
has the same structure as a pattern, except that:
|
|
|
|
.IP (i)
|
|
|
|
it may not contain an additional constraint;
|
|
|
|
.IP (ii)
|
|
|
|
it may be empty.
|
|
|
|
.LP
|
|
|
|
A replacement may also refer to the values of variables and ANY.
|
|
|
|
.NH 2
|
|
|
|
Examples
|
|
|
|
.PP
|
|
|
|
This section contains some realistic examples for
|
|
|
|
optimization on PDP-11 and Vax assembly code.
|
|
|
|
.NH 3
|
|
|
|
Vax examples
|
|
|
|
.PP
|
|
|
|
Suppose the table contains the following declarations:
|
|
|
|
.nf
|
1986-12-10 15:28:30 +00:00
|
|
|
|
|
|
|
.ft 5
|
1986-12-10 11:40:00 +00:00
|
|
|
X, LOG { TRUE };
|
|
|
|
LAB { VAL[0] == 'L' }; /* e.g. L0017 */
|
|
|
|
A { no_side_effects(VAL) };
|
|
|
|
NUM { is_number(VAL) };
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft R
|
1986-12-10 11:40:00 +00:00
|
|
|
|
|
|
|
.fi
|
|
|
|
The procedure "no_side_effects" checks if its argument
|
|
|
|
contains any side effects, i.e. auto increment or auto decrement.
|
|
|
|
The procedure "is_number" checks if its argument contains only digits.
|
|
|
|
These procedures must be supplied by the table-writer and must be
|
|
|
|
included in the table.
|
|
|
|
.PP
|
|
|
|
.nf
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft 5
|
|
|
|
\fIentry:\fP addl3 X,A,A -> addl2 X,A;
|
|
|
|
.ft R
|
1986-12-10 11:40:00 +00:00
|
|
|
|
|
|
|
.fi
|
|
|
|
This entry changes a 3-operand instruction into a cheaper 2-operand
|
|
|
|
instruction.
|
|
|
|
An optimization like:
|
|
|
|
.nf
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft 5
|
1986-12-10 11:40:00 +00:00
|
|
|
|
|
|
|
addl3 r0,(r2)+,(r2)+ -> addl2 r0,(r2)+
|
|
|
|
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft R
|
1986-12-10 11:40:00 +00:00
|
|
|
.fi
|
|
|
|
is illegal, as r2 should be incremented twice.
|
|
|
|
Hence the second argument is required to
|
|
|
|
be side-effect free.
|
|
|
|
.PP
|
|
|
|
.nf
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft 5
|
|
|
|
\fIentry:\fP addw2 $-NUM,X -> subw2 $NUM,X;
|
|
|
|
.ft R
|
1986-12-10 11:40:00 +00:00
|
|
|
|
|
|
|
.fi
|
|
|
|
An instruction like "subw2 $5,r0" is cheaper
|
|
|
|
than "addw2 $-5,r0",
|
|
|
|
because constants in the range 0 to 63 are represented
|
|
|
|
very efficiently on the Vax.
|
|
|
|
.PP
|
|
|
|
.nf
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft 5
|
|
|
|
\fIentry:\fP bitw $NUM,A : jneq LAB
|
1986-12-10 11:40:00 +00:00
|
|
|
{ is_poweroftwo(NUM,LOG) } -> jbs $LOG,A,LAB;
|
|
|
|
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft R
|
1986-12-10 11:40:00 +00:00
|
|
|
.fi
|
|
|
|
A "bitw x,y" sets the condition codes to the bitwise "and" of
|
|
|
|
x and y.
|
|
|
|
A "jbs n,x,l" branches to l if bit n of x is set.
|
|
|
|
So, for example, the following transformation is possible:
|
|
|
|
.nf
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft 5
|
1986-12-10 11:40:00 +00:00
|
|
|
|
|
|
|
bitw $32,r0 : jneq L0017 -> jbs $5,r0,L0017
|
|
|
|
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft R
|
1986-12-10 11:40:00 +00:00
|
|
|
.fi
|
|
|
|
The user-defined procedure "is_poweroftwo" checks if its first argument is
|
|
|
|
a power of 2 and, if so, sets its second argument to the logarithm
|
|
|
|
of the first argument. (Both arguments are strings).
|
|
|
|
Note that the variable LOG is not used in the pattern itself.
|
|
|
|
It is assigned a (string) value by "is_poweroftwo" and is used
|
|
|
|
in the replacement.
|
|
|
|
.NH 3
|
|
|
|
PDP-11 examples
|
|
|
|
.PP
|
|
|
|
Suppose we have the following declarations:
|
|
|
|
.nf
|
1986-12-10 15:28:30 +00:00
|
|
|
|
|
|
|
.ft 5
|
1986-12-10 11:40:00 +00:00
|
|
|
X { TRUE };
|
|
|
|
A { no_side_effects(VAL) };
|
|
|
|
L1, L2 { VAL[0] == 'I' };
|
|
|
|
REG { VAL[0] == 'r' && VAL[1] >= '0' && VAL[1] <= '5' &&
|
|
|
|
VAL[2] == '\\0' };
|
|
|
|
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft P
|
1986-12-10 11:40:00 +00:00
|
|
|
.fi
|
|
|
|
The implementation of "no_side_effects" may of course
|
|
|
|
differ for the PDP-11 and the Vax.
|
|
|
|
.PP
|
|
|
|
.nf
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft 5
|
|
|
|
\fIentry:\fP mov REG,A : ANY A,X -> mov REG,A : ANY REG,X ;
|
|
|
|
.ft R
|
1986-12-10 11:40:00 +00:00
|
|
|
|
|
|
|
.fi
|
|
|
|
This entry implements register subsumption.
|
|
|
|
If A and REG hold the same value (which is true after "mov REG,A")
|
|
|
|
and A is used as source (first) operand, it is cheaper to use REG instead.
|
|
|
|
.PP
|
|
|
|
.nf
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft 5
|
|
|
|
\fIentry:\fP jeq L1 : jbr L2 : labdef L1 -> jne L2 : labdef L1;
|
|
|
|
.ft R
|
1986-12-10 11:40:00 +00:00
|
|
|
|
|
|
|
.fi
|
|
|
|
The "jeq L1" is a "skip over an unconditional jump". "labdef L1"
|
|
|
|
denotes the definition (i.e. defining occurrence) of label L1.
|
|
|
|
As the target optimizer has to know how such a definition
|
|
|
|
looks like, this must be expressed in the table (see Appendix A).
|
|
|
|
.PP
|
|
|
|
.nf
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft 5
|
|
|
|
\fIentry:\fP add $01,X { carry_dead(REST) } -> inc X;
|
|
|
|
.ft R
|
1986-12-10 11:40:00 +00:00
|
|
|
|
|
|
|
.fi
|
|
|
|
On the PDP-11, an add-one is not equivalent to an increment.
|
|
|
|
The latter does not set the carry-bit of the condition codes,
|
|
|
|
while the former does.
|
|
|
|
So a look-ahead is needed to see if the rest of the input uses
|
|
|
|
the carry-bit before changing the condition codes.
|
|
|
|
A look-ahead of one instruction is provided by
|
|
|
|
the target optimizer.
|
|
|
|
This will normally be sufficient for compiler-generated code.
|
|
|
|
The keyword REST contains the mnemonic of the first instruction of
|
|
|
|
the rest of the input.
|
|
|
|
If this instruction uses the carry-bit (e.g. an adc, subc, bhis)
|
|
|
|
the transformation is not allowed.
|
|
|
|
.bp
|
|
|
|
.NH 1
|
|
|
|
Implementation of the target optimizer
|
|
|
|
.PP
|
|
|
|
The target optimizer reads one input file of assembler instructions,
|
|
|
|
processes it, and writes the optimized code
|
|
|
|
to the output file.
|
|
|
|
So it performs one pass over the input.
|
|
|
|
.NH 2
|
|
|
|
The window mechanism
|
|
|
|
.PP
|
|
|
|
The optimizer uses a \fIwindow\fR that moves over the input.
|
|
|
|
It repeatedly tries to match the instructions in the window
|
|
|
|
with the patterns in the table.
|
|
|
|
If no match is possible, the window moves
|
|
|
|
one instruction forwards (to the right).
|
|
|
|
After a successful match the matched instructions are
|
|
|
|
removed from the window and are replaced by the
|
|
|
|
replacement part of the table entry.
|
|
|
|
Furthermore, the window is moved a few instructions
|
|
|
|
backwards,
|
|
|
|
as it is possible that instructions that were rejected earlier now do match.
|
|
|
|
For example, consider the following patterns:
|
|
|
|
.DS
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft 5
|
1986-12-10 11:40:00 +00:00
|
|
|
cmp $0, X -> tst X ;
|
|
|
|
mov REG,X : tst X -> move REG.X ; /* redundant test */
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft R
|
1986-12-10 11:40:00 +00:00
|
|
|
.DE
|
|
|
|
If the input is:
|
|
|
|
.DS
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft 5
|
1986-12-10 11:40:00 +00:00
|
|
|
mov r0,foo : cmp $0,foo
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft R
|
1986-12-10 11:40:00 +00:00
|
|
|
.DE
|
|
|
|
then the first instruction is initially rejected.
|
|
|
|
However, after the transformation
|
|
|
|
.DS
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft 5
|
1986-12-10 11:40:00 +00:00
|
|
|
cmp $0,foo -> tst foo
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft R
|
1986-12-10 11:40:00 +00:00
|
|
|
.DE
|
|
|
|
the following optimization is possible:
|
|
|
|
.DS
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft 5
|
1986-12-10 11:40:00 +00:00
|
|
|
mov r0,foo : tst foo -> mov r0,foo
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft R
|
1986-12-10 11:40:00 +00:00
|
|
|
.DE
|
|
|
|
.PP
|
1986-12-10 15:28:30 +00:00
|
|
|
The window is implemented as a \fIqueue\fR.
|
1986-12-10 11:40:00 +00:00
|
|
|
Matching takes place at the head of the queue.
|
|
|
|
New instructions are added at the tail.
|
|
|
|
If the window is moved forwards, the instruction at the head
|
|
|
|
is not yet written to the output,
|
|
|
|
as it may be needed later on.
|
|
|
|
Instead it is added to a second queue,
|
|
|
|
the \fIbackup queue\fR.
|
|
|
|
After a successful match, the entire backup queue is
|
|
|
|
inserted at the front of the window queue,
|
|
|
|
which effectively implements the shift backwards.
|
|
|
|
.PP
|
|
|
|
Both queues have the length of the longest pattern in the table.
|
|
|
|
If, as a result of a forward window move,
|
|
|
|
the backup queue gets full,
|
|
|
|
the instruction at its head is outputted and removed.
|
|
|
|
Instructions are read from the input whenever the
|
|
|
|
window queue contains fewer elements than the length
|
|
|
|
of the longest pattern.
|
|
|
|
.NH 2
|
|
|
|
Pattern matching
|
|
|
|
.PP
|
|
|
|
Pattern matching is done in three steps:
|
|
|
|
.IP (i) 7
|
|
|
|
find patterns in the table whose instruction mnemonics
|
|
|
|
match the mnemonics of the instructions in the
|
|
|
|
current window;
|
|
|
|
.IP (ii)
|
|
|
|
check if the operands of the pattern match the operands of the
|
|
|
|
instructions in the current window;
|
|
|
|
.IP (iii)
|
|
|
|
check if the optional constraint is satisfied.
|
|
|
|
.LP
|
|
|
|
For step (i) hashing is used.
|
|
|
|
The mnemonic of the first instruction of the window
|
|
|
|
is used to determine a list of possible patterns.
|
|
|
|
Patterns starting with ANY are always tried.
|
|
|
|
.PP
|
|
|
|
Matching of operand descriptions against actual operands
|
|
|
|
takes place as follows.
|
|
|
|
The general form of an operand description is:
|
|
|
|
.DS
|
|
|
|
string_constant1 variable_name string_constant2
|
|
|
|
.DE
|
|
|
|
The actual operand should begin with string_constant1 and end
|
|
|
|
on string_constant2.
|
|
|
|
If so, these strings are stripped from it and the remaining string is
|
|
|
|
matched against the variable.
|
|
|
|
Matching a string against a variable is
|
|
|
|
defined as follows:
|
|
|
|
.IP 1.
|
|
|
|
initially (before the entire pattern match)
|
|
|
|
all variables are uninstantiated;
|
|
|
|
.IP 2.
|
|
|
|
matching a string against an uninstantiated variable
|
|
|
|
succeeds if the restriction associated with the variable is
|
|
|
|
satisfied.
|
|
|
|
As a side effect, it causes the variable to be instantiated to
|
|
|
|
the string;
|
|
|
|
.IP 3.
|
|
|
|
matching a string against an instantiated variable succeeds
|
|
|
|
only if the variable was instantiated to the same string.
|
|
|
|
.LP
|
|
|
|
Matching an actual mnemonic against the keyword ANY is defined likewise.
|
|
|
|
.PP
|
|
|
|
The matching scheme implements the requirement that multiple occurrences
|
|
|
|
of the same variable name or of the keyword ANY should
|
|
|
|
stand for the same values throughout the entire pattern
|
|
|
|
(see section 2.).
|
|
|
|
.PP
|
|
|
|
Both the parameter restriction of 2. and the constraint of step (iii)
|
|
|
|
are checked by executing the "C" expression.
|
|
|
|
.NH 2
|
|
|
|
Data structures
|
|
|
|
.PP
|
|
|
|
The most important data structure is the representation
|
|
|
|
of the input instructions.
|
|
|
|
For every instruction we use two representations:
|
|
|
|
.IP (i)
|
|
|
|
the textual representation,
|
|
|
|
i.e. the exact code as it appeared in the input;
|
|
|
|
.IP (ii)
|
|
|
|
a structural representation,
|
|
|
|
containing the opcode and the operands.
|
|
|
|
.LP
|
|
|
|
The opcode of an instruction is determined as soon as it is read.
|
|
|
|
If the line contains a label definition, the opcode is set
|
|
|
|
to "labdef", so a label definition is treated like a normal
|
|
|
|
instruction.
|
|
|
|
.PP
|
|
|
|
The operands of an instruction are not determined until
|
|
|
|
they are needed, i.e. until step (i) of the pattern matching
|
|
|
|
process has succeeded.
|
|
|
|
For every instruction we keep track of a \fIstate\fR.
|
|
|
|
After the opcode has successfully been determined,
|
|
|
|
the state is OPC_ONLY.
|
|
|
|
Once the operands have been recognized, the state is set to DONE.
|
|
|
|
If the opcode or operands can not be determined,
|
|
|
|
or if the instruction cannot be optimized for any other
|
|
|
|
reason (see Appendix A), the state is set to JUNK
|
|
|
|
and any attempt to match it will fail.
|
|
|
|
.PP
|
|
|
|
For each table entry we record the following information:
|
|
|
|
.IP (i) 7
|
|
|
|
the length of the pattern (i.e. the number of instruction descriptions)
|
|
|
|
.IP (ii)
|
|
|
|
a description of the instructions of the pattern
|
|
|
|
.IP (iii)
|
|
|
|
the length of the replacement
|
|
|
|
.IP (iv)
|
|
|
|
a description of the instructions of the replacement.
|
|
|
|
.LP
|
|
|
|
The description of an instruction consists of:
|
|
|
|
.IP (i)
|
|
|
|
the opcode
|
|
|
|
.IP (ii)
|
|
|
|
for each operand, a description of the operand.
|
|
|
|
.LP
|
|
|
|
The description of an operand of the form:
|
|
|
|
.DS
|
|
|
|
string_constant1 variable_name string_constant2
|
|
|
|
.DE
|
|
|
|
contains:
|
|
|
|
.IP (i)
|
|
|
|
both string constants
|
|
|
|
.IP (ii)
|
|
|
|
the number of the variable.
|
|
|
|
.LP
|
|
|
|
Each declared variable is assigned a unique number.
|
|
|
|
For every variable we maintain:
|
|
|
|
.IP (i)
|
|
|
|
its state (instantiated or not instantiated)
|
|
|
|
.IP (ii)
|
|
|
|
its current value (a string).
|
|
|
|
.LP
|
|
|
|
The restrictions on variables and the constraints are stored
|
|
|
|
in a switch-statement,
|
|
|
|
indexed by variable number and entry number respectively.
|
|
|
|
.bp
|
|
|
|
.NH 1
|
|
|
|
Implementation of the target optimizer generator
|
|
|
|
.PP
|
|
|
|
The target optimizer generator (\fItopgen\fR)
|
|
|
|
reads a target machine description table and produces
|
|
|
|
two files:
|
|
|
|
.IP gen.h: 9
|
|
|
|
contains macro definitions for
|
|
|
|
machine parameters that were changed
|
|
|
|
in the parameter section of the table (see appendix A)
|
|
|
|
and for some attributes derived from the table
|
|
|
|
(longest pattern, number of patterns, number
|
|
|
|
of variables).
|
|
|
|
.IP gen.c:
|
|
|
|
contains the entry description tables,
|
|
|
|
code for checking the parameter restrictions and constraints
|
|
|
|
(switch statements)
|
|
|
|
and the user-defined procedures.
|
|
|
|
.LP
|
|
|
|
These two files are compiled together with some machine-independent
|
|
|
|
files to produce a target optimizer.
|
|
|
|
.PP
|
|
|
|
Topgen is implemented using
|
1986-12-10 15:28:30 +00:00
|
|
|
the LL(1) parser generator system LLgen ,
|
|
|
|
.[
|
|
|
|
jacobs topics parser generation
|
|
|
|
.]
|
1986-12-10 11:40:00 +00:00
|
|
|
a powerful tool of the Amsterdam Compiler Kit.
|
|
|
|
This system provides a flexible way of describing the syntax of the tables.
|
|
|
|
The syntactical description of the table format included
|
|
|
|
in Appendix A was derived from the LLgen syntax rules.
|
|
|
|
.PP
|
|
|
|
The parser uses a simple, hand-written, lexical analyzer (scanner).
|
|
|
|
The scanner returns a single character in most cases.
|
|
|
|
The recognition of identifiers is left to the parser, as
|
|
|
|
this eases the analysis of operand descriptions.
|
|
|
|
Comments are removed from the input by the scanner,
|
|
|
|
but white space is passed to the parser,
|
|
|
|
as it is meaningful in some contexts (it separates the
|
|
|
|
opcode description from the description of the first operand).
|
|
|
|
.PP
|
|
|
|
Topgen maintains two symbol tables, one for variable names and one
|
|
|
|
for tunable parameters.
|
|
|
|
The symbol tables are organized as binary trees.
|
|
|
|
.bp
|
|
|
|
.SH
|
|
|
|
Appendix A
|
|
|
|
.PP
|
|
|
|
In this appendix we present a complete definition of the target
|
|
|
|
optimizer description table format.
|
|
|
|
This appendix is intended for table-writers.
|
|
|
|
We use syntax rules for the description of the table format.
|
|
|
|
The following notation is used:
|
1986-12-10 15:28:30 +00:00
|
|
|
.TS
|
|
|
|
center;
|
|
|
|
l l.
|
|
|
|
{ a } zero or more of a
|
|
|
|
[ a ] zero or one of a
|
|
|
|
a b a followed by b
|
|
|
|
a | b a or b
|
|
|
|
.TE
|
1986-12-10 11:40:00 +00:00
|
|
|
Terminals are given in quotes, as in ';'.
|
|
|
|
.PP
|
|
|
|
The table may contain white space and comment at all reasonable places.
|
|
|
|
Comments are as in "C", so they begin with /* and end on */.
|
|
|
|
Identifiers are sequences of letters, digits and the underscore ('_'),
|
|
|
|
beginning with a letter.
|
|
|
|
.PP
|
|
|
|
.DS
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft 5
|
1986-12-10 11:40:00 +00:00
|
|
|
table -> {parameter_line} '%%;' {variable_declaration} '%%;'
|
|
|
|
{entry} '%%;' user_routines.
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft R
|
1986-12-10 11:40:00 +00:00
|
|
|
.DE
|
|
|
|
A table consists of four sections, containing machine-dependent
|
|
|
|
constants, variable declarations, pattern rules and
|
|
|
|
user-supplied subroutines.
|
|
|
|
.PP
|
|
|
|
.DS
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft 5
|
1986-12-10 11:40:00 +00:00
|
|
|
parameter_line -> identifier value ';' .
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft R
|
1986-12-10 11:40:00 +00:00
|
|
|
.DE
|
|
|
|
A parameter line defines some attributes of the target machines
|
|
|
|
assembly code.
|
|
|
|
For unspecified parameters default values apply.
|
|
|
|
The names of the parameters and the corresponding defaults
|
|
|
|
are shown in table 1.
|
1986-12-10 15:28:30 +00:00
|
|
|
.TS
|
|
|
|
center;
|
|
|
|
l l.
|
|
|
|
OPC_TERMINATOR ' '
|
|
|
|
OP_SEPARATOR ','
|
|
|
|
LABEL_STARTER 'I'
|
|
|
|
LABEL_TERMINATOR ':'
|
|
|
|
MAXOP 2
|
|
|
|
MAXOPLEN 25
|
|
|
|
MAX_OPC_LEN 10
|
|
|
|
MAXVARLEN 25
|
|
|
|
MAXLINELEN 100
|
|
|
|
.TE
|
|
|
|
.ce 1
|
|
|
|
table 1: parameter names and defaults
|
1986-12-10 11:40:00 +00:00
|
|
|
.DE
|
|
|
|
The OPC_TERMINATOR is the character that separates the instruction
|
|
|
|
mnemonic from the first operand (if any).
|
|
|
|
The OP_SEPARATOR separates adjacent operands.
|
|
|
|
A LABEL_STARTER is the first character of an instruction label.
|
|
|
|
(Instruction labels are assumed to start with the same character).
|
|
|
|
The LABEL_TERMINATOR is the last character of a label definition.
|
|
|
|
It is assumed that this character is not used in an applied
|
|
|
|
occurrence of the label identifier.
|
|
|
|
For example, the defining occurrence may be "I0017:"
|
|
|
|
and the applied occurrence may be "I0017"
|
|
|
|
as in "jmp I0017".
|
|
|
|
MAXOP defines the maximum number of operands an instruction can have.
|
|
|
|
MAXOPLEN is the maximum length (in characters) of an operand.
|
|
|
|
MAX_OPC_LEN is the maximum length of an instruction opcode.
|
|
|
|
MAXVARLEN is the maximum length of a declared string variable.
|
|
|
|
As variables may be set by user routines (see "bitw" example for
|
|
|
|
the Vax) the table-writer must have access to this length and
|
|
|
|
must be able to change it.
|
|
|
|
MAXLINELEN denotes the maximum length of a line of assembly code.
|
|
|
|
.PP
|
|
|
|
If a line of assembly code violates any of the assumptions or
|
|
|
|
exceeds some limit,
|
|
|
|
the line is not optimized.
|
|
|
|
Optimization does, however, proceed with the rest of the input.
|
|
|
|
.PP
|
|
|
|
.DS
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft 5
|
1986-12-10 11:40:00 +00:00
|
|
|
variable_declaration -> identifier {',' identifier} restriction ';' .
|
|
|
|
|
|
|
|
restriction -> '{' anything '}' .
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft R
|
1986-12-10 11:40:00 +00:00
|
|
|
.DE
|
|
|
|
A variable declaration declares one or more string variables
|
|
|
|
that may be used in the patterns and in the replacements.
|
|
|
|
If a variable is used as part of an operand description in
|
|
|
|
a pattern, the entire pattern can only match if the
|
|
|
|
restriction evaluates to TRUE.
|
|
|
|
If the pattern does match, the variable is assigned the matching
|
|
|
|
part of the actual operand.
|
|
|
|
Variables that are not used in a pattern are initialized to
|
|
|
|
null-strings and may be assigned a value in the constraint-part of
|
|
|
|
the pattern.
|
|
|
|
.PP
|
|
|
|
The restriction must be a legal "C" expression.
|
|
|
|
It may not contain a closing bracket ('}').
|
|
|
|
Inside the expression, the name VAL stands for the part of the actual
|
|
|
|
(matching) operand.
|
|
|
|
The expression may contain calls to procedures that are defined in the
|
|
|
|
user-routines section.
|
|
|
|
.DS
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft 5
|
1986-12-10 11:40:00 +00:00
|
|
|
entry -> pattern '->' replacement ';' .
|
|
|
|
|
|
|
|
pattern -> instruction_descr
|
|
|
|
{ ':' instruction_descr }
|
|
|
|
constraint .
|
|
|
|
|
|
|
|
replacement -> [ instruction_descr { ':' instruction_descr } ] .
|
|
|
|
|
|
|
|
instruction_descr -> opcode
|
|
|
|
white
|
|
|
|
[ operand_descr { ',' operand_descr } ] .
|
|
|
|
|
|
|
|
constraint -> '{' anything '}' .
|
|
|
|
|
|
|
|
operand_descr -> [ string_constant ]
|
|
|
|
[ variable_name ]
|
|
|
|
[ string_constant ] .
|
|
|
|
|
|
|
|
variable_name -> identifier .
|
|
|
|
|
|
|
|
opcode -> anything .
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft R
|
1986-12-10 11:40:00 +00:00
|
|
|
.DE
|
|
|
|
The symbol 'white' stands for white space (space or tab).
|
|
|
|
An opcode can be any string not containing the special
|
|
|
|
symbols ';', '{', '}', ':', ',', '->' or white space.
|
|
|
|
To be recognized, it must begin with a letter.
|
|
|
|
The opcode should either be a mnemonic of a target machine
|
|
|
|
instruction or it should be one of the keywords ANY and labdef.
|
|
|
|
ANY matches any actual opcode. labdef matches only label definitions.
|
|
|
|
.PP
|
|
|
|
If an operand description contains an identifier (as defined earlier),
|
|
|
|
it is checked if the identifier is the name of a declared variable.
|
|
|
|
This effects the semantics of the matching rules for the operand,
|
|
|
|
as described in section 2.
|
|
|
|
An operand may contain at most one such variable name.
|
|
|
|
.PP
|
|
|
|
The constraint must be a legal "C" expression, just as the operand restriction.
|
|
|
|
It may call user-defined procedures and use or change the value of
|
|
|
|
declared variables.
|
|
|
|
It may also use the string variable REST,
|
|
|
|
which contains the mnemonic of the first instruction of the
|
|
|
|
rest of the input. (REST is a null-string if this mnemonic can
|
|
|
|
not be determined).
|
|
|
|
.DS
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft 5
|
1986-12-10 11:40:00 +00:00
|
|
|
user_routines -> anything .
|
1986-12-10 15:28:30 +00:00
|
|
|
.ft R
|
1986-12-10 11:40:00 +00:00
|
|
|
.DE
|
|
|
|
The remainder of the table consists of user-defined subroutines.
|
|
|
|
.bp
|
|
|
|
.[
|
|
|
|
$LIST$
|
|
|
|
.]
|