233 lines
6.8 KiB
Plaintext
233 lines
6.8 KiB
Plaintext
|
.NH 2
|
||
|
Implementation
|
||
|
.PP
|
||
|
Like most phases, SR deals with one procedure
|
||
|
at a time.
|
||
|
Within a procedure, SR works on one loop at a time.
|
||
|
Loops are processed in textual order.
|
||
|
If loops are nested inside each other,
|
||
|
SR starts with the outermost loop and proceeds in the
|
||
|
inwards direction.
|
||
|
This order is chosen,
|
||
|
because it enables the optimization
|
||
|
of multi-dimensional array address computations,
|
||
|
if the elements are accessed in the usual way
|
||
|
(i.e. row after row, rather than column after column).
|
||
|
For every loop, SR first detects all induction variables
|
||
|
and then tries to recognize
|
||
|
expressions that can be optimized.
|
||
|
.NH 3
|
||
|
Finding induction variables
|
||
|
.PP
|
||
|
The process of finding induction variables
|
||
|
can conveniently be split up
|
||
|
into two parts.
|
||
|
First, the EM text of the loop is scanned to find
|
||
|
all \fIcandidate\fR induction variables,
|
||
|
which are word-sized local variables
|
||
|
that are assigned precisely once
|
||
|
in the loop, within a firm block.
|
||
|
Second, for every candidate, the single assignment
|
||
|
is inspected, to see if it has the form
|
||
|
required by the definition of an induction variable.
|
||
|
.PP
|
||
|
Candidates are found by scanning the EM code of the loop.
|
||
|
During this scan, two sets are maintained.
|
||
|
The set "cand" contains all variables that were
|
||
|
assigned exactly once so far, within a firm block.
|
||
|
The set "dismiss" contains all variables that
|
||
|
should not be made a candidate.
|
||
|
Initially, both sets are empty.
|
||
|
If a variable is assigned to, it is put
|
||
|
in the cand set, if three conditions are met:
|
||
|
.IP 1.
|
||
|
the variable was not in cand or dismiss already
|
||
|
.IP 2.
|
||
|
the assignment takes place in a firm block
|
||
|
.IP 3.
|
||
|
the assignment is not a ZRL instruction (assignment by zero)
|
||
|
or a SDL instruction (store double local).
|
||
|
.LP
|
||
|
If any condition fails, the variable is dismissed from cand
|
||
|
(if it was there already) and put in dismiss
|
||
|
(if it was not there already).
|
||
|
.sp 0
|
||
|
All variables for which no register message was generated (i.e. those
|
||
|
variables that may be accessed indirectly) are assumed
|
||
|
to be changed in the loop.
|
||
|
.sp 0
|
||
|
All variables that remain in cand are candidate induction variables.
|
||
|
.PP
|
||
|
From the set of candidates, the induction variables can
|
||
|
be determined, by inspecting the single assignment.
|
||
|
The assignment must match one of the EM patterns below.
|
||
|
('x' is the candidate. 'ws' is the word size of the target machine.
|
||
|
'n' is any number.)
|
||
|
.DS
|
||
|
\fIpattern\fR \fIstep size\fR
|
||
|
INL x | +1
|
||
|
DEL x | -1
|
||
|
LOL x ; (INC | DEC) ; STL x | +1 | -1
|
||
|
LOL x ; LOC n ; (ADI ws | SBI ws) ; STL x | +n | -n
|
||
|
LOC n ; LOL x ; ADI ws ; STL x. +n
|
||
|
.DE
|
||
|
From the patterns the step size of the induction variable
|
||
|
can also be determined.
|
||
|
These step sizes are displayed on the right hand side.
|
||
|
.sp
|
||
|
For every induction variable we maintain the following information:
|
||
|
.IP -
|
||
|
the offset of the variable in the stackframe of its procedure
|
||
|
.IP -
|
||
|
a pointer to the EM text of the assignment statement
|
||
|
.IP -
|
||
|
the step value
|
||
|
.LP
|
||
|
.NH 3
|
||
|
Optimizing expressions
|
||
|
.PP
|
||
|
If any induction variables of the loop were found,
|
||
|
the EM text of the loop is scanned again,
|
||
|
to detect expressions that can be optimized.
|
||
|
SR scans for multiplication and array instructions.
|
||
|
Whenever it finds such an instruction, it analyses the
|
||
|
code in front of it.
|
||
|
If an expression is to be optimized, it must
|
||
|
be generated by the following syntax rules.
|
||
|
.DS
|
||
|
optimizable_expr:
|
||
|
iv_expr const mult |
|
||
|
const iv_expr mult |
|
||
|
address iv_expr address array_instr;
|
||
|
mult:
|
||
|
MLI ws |
|
||
|
MLU ws ;
|
||
|
array_instr:
|
||
|
LAR ws |
|
||
|
SAR ws |
|
||
|
AAR ws ;
|
||
|
const:
|
||
|
LOC n ;
|
||
|
.DE
|
||
|
An 'address' is an EM instruction that loads an
|
||
|
address on the stack.
|
||
|
An instruction like LOL may be an 'address', if
|
||
|
the size of an address (pointer size, =ps) is
|
||
|
the same as the word size.
|
||
|
If the pointer size is twice the word size,
|
||
|
instructions like LDL are an 'address'.
|
||
|
(The addresses in the third grammar rule
|
||
|
denote resp. the array address and the
|
||
|
array descriptor address).
|
||
|
.DS
|
||
|
address:
|
||
|
LAE |
|
||
|
LAL |
|
||
|
LOL if ps=ws |
|
||
|
LOE ,, |
|
||
|
LIL ,, |
|
||
|
LDL if ps=2*ws |
|
||
|
LDE ,, ;
|
||
|
.DE
|
||
|
The notion of an iv-expression was introduced earlier.
|
||
|
.DS
|
||
|
iv_expr:
|
||
|
iv_expr unair_op |
|
||
|
iv_expr iv_expr binary_op |
|
||
|
loopconst |
|
||
|
iv ;
|
||
|
unair_op:
|
||
|
NGI ws |
|
||
|
INC |
|
||
|
DEC ;
|
||
|
binary_op:
|
||
|
ADI ws |
|
||
|
ADU ws |
|
||
|
SBI ws |
|
||
|
SBU ws ;
|
||
|
loopconst:
|
||
|
const |
|
||
|
LOL x if x is not changed in loop ;
|
||
|
iv:
|
||
|
LOL x if x is an induction variable ;
|
||
|
.DE
|
||
|
An iv_expression must satisfy one additional constraint:
|
||
|
it must use exactly one operand that is an induction
|
||
|
variable.
|
||
|
A simple, hand written, top-down parser is used
|
||
|
to recognize an iv-expression.
|
||
|
It scans the EM code from right to left
|
||
|
(recall that EM is essentially postfix).
|
||
|
It uses semantic attributes (inherited as well as
|
||
|
derived) to check the additional constraint.
|
||
|
.PP
|
||
|
All information assembled during the recognition
|
||
|
process is put in a 'code_info' structure.
|
||
|
This structure contains the following information:
|
||
|
.IP -
|
||
|
the optimizable code itself
|
||
|
.IP -
|
||
|
the loop and basic block the code is part of
|
||
|
.IP -
|
||
|
the induction variable
|
||
|
.IP -
|
||
|
the iv-expression
|
||
|
.IP -
|
||
|
the sign of the induction variable in the
|
||
|
iv-expression
|
||
|
.IP -
|
||
|
the offset and size of the temporary local variable
|
||
|
.IP -
|
||
|
the expensive operator (MLI, LAR etc.)
|
||
|
.IP -
|
||
|
the instruction that loads the constant
|
||
|
(for multiplication) or the array descriptor
|
||
|
(for arrays).
|
||
|
.LP
|
||
|
The entire transformation process is driven
|
||
|
by this information.
|
||
|
As the EM text is represented internally
|
||
|
as a list, this process consists
|
||
|
mainly of straightforward list manipulations.
|
||
|
.sp 0
|
||
|
The initialization code must be put
|
||
|
immediately before the loop entry.
|
||
|
For this purpose a \fIheader block\fR is
|
||
|
created that has the loop entry block as
|
||
|
its only successor and that dominates the
|
||
|
entry block.
|
||
|
The CFG and all relations (SUCC,PRED, IDOM, LOOPS etc.)
|
||
|
are updated.
|
||
|
.sp 0
|
||
|
An EM instruction that will
|
||
|
replace the optimizable code
|
||
|
is created and put at the place of the old code.
|
||
|
The list representing the old optimizable code
|
||
|
is used to create a list for the initializing code,
|
||
|
as they are similar.
|
||
|
Only two modifications are required:
|
||
|
.IP -
|
||
|
if the expensive operator is a LAR or SAR,
|
||
|
it must be replaced by an AAR, as the initial value
|
||
|
of TMP is the \fIaddress\fR of the first
|
||
|
array element that is accessed.
|
||
|
.IP -
|
||
|
code must be appended to store the result of the
|
||
|
expression in TMP.
|
||
|
.LP
|
||
|
Finally, code to increment TMP is created and put after
|
||
|
the code of the single assignment to the
|
||
|
induction variable.
|
||
|
The generated code uses either an integer addition
|
||
|
(ADI) or an integer-to-pointer addition (ADS)
|
||
|
to do the increment.
|
||
|
.PP
|
||
|
SR maintains a set of all expressions that have already
|
||
|
been recognized in the present loop.
|
||
|
Such expressions are said to be \fIavailable\fR.
|
||
|
If an expression is recognized that is
|
||
|
already available,
|
||
|
no new temporary local variable is allocated for it,
|
||
|
and the code to initialize and increment the local
|
||
|
is not generated.
|