.NH 2
Implementation
.PP
Like most phases, SR deals with one procedure
at a time.
Within a procedure, SR works on one loop at a time.
Loops are processed in textual order.
If loops are nested inside each other,
SR starts with the outermost loop and proceeds
inwards.
This order is chosen
because it enables the optimization
of multi-dimensional array address computations,
if the elements are accessed in the usual way
(i.e. row after row, rather than column after column).
For every loop, SR first detects all induction variables
and then tries to recognize
expressions that can be optimized.
.NH 3
Finding induction variables
.PP
The process of finding induction variables
can conveniently be split up
into two parts.
First, the EM text of the loop is scanned to find
all \fIcandidate\fR induction variables,
which are word-sized local variables
that are assigned precisely once
in the loop, within a firm block.
Second, for every candidate, the single assignment
is inspected to see if it has the form
required by the definition of an induction variable.
.PP
Candidates are found by scanning the EM code of the loop.
During this scan, two sets are maintained.
The set "cand" contains all variables that have been
assigned exactly once so far, within a firm block.
The set "dismiss" contains all variables that
should not be made a candidate.
Initially, both sets are empty.
If a variable is assigned to, it is put
in the cand set if three conditions are met:
.IP 1.
the variable was not in cand or dismiss already
.IP 2.
the assignment takes place in a firm block
.IP 3.
the assignment is not a ZRL instruction (assignment of zero)
or an SDL instruction (store double local).
.LP
If any condition fails, the variable is removed from cand
(if it was there already) and put in dismiss
(if it was not there already).
.sp 0
All variables for which no register message was generated (i.e. those
variables that may be accessed indirectly) are assumed
to be changed in the loop.
.sp 0
All variables that remain in cand are candidate induction variables.
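.PP
The candidate scan described above can be sketched as follows.
This is an illustration only, not the code used by SR:
the function name, the tuple representation of an assignment and the
\fIindirectly_accessible\fR parameter (standing in for "no register
message was generated") are assumptions made for this example.
.DS
```python
def find_candidates(assignments, indirectly_accessible=()):
    """Return the candidate induction variables of one loop.

    assignments is a sequence of (variable, opcode, in_firm_block)
    tuples, one per assignment, in textual order (hypothetical
    representation).
    """
    cand = set()
    dismiss = set()
    for var, opcode, in_firm_block in assignments:
        acceptable = (
            var not in cand
            and var not in dismiss
            and in_firm_block
            and opcode not in ("ZRL", "SDL")
        )
        if acceptable:
            cand.add(var)
        else:
            # A second assignment, or an unsuitable one, disqualifies
            # the variable for the rest of the scan.
            cand.discard(var)
            dismiss.add(var)
    # Variables that may be accessed indirectly are assumed to be
    # changed in the loop, so they never remain candidates.
    return cand - set(indirectly_accessible)
```
.DE
.LP
A variable that is assigned twice enters cand at its first assignment
and is moved to dismiss at its second, so it can never become a candidate again.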
.PP
From the set of candidates, the induction variables can
be determined by inspecting the single assignment.
The assignment must match one of the EM patterns below.
('x' is the candidate. 'ws' is the word size of the target machine.
'n' is any number.)
.DS
\fIpattern\fR                                     \fIstep size\fR
INL x  |                                      +1
DEL x  |                                      -1
LOL x ; (INC | DEC) ; STL x  |                +1 | -1
LOL x ; LOC n ; (ADI ws | SBI ws) ; STL x  |  +n | -n
LOC n ; LOL x ; ADI ws ; STL x                +n
.DE
From the patterns the step size of the induction variable
can also be determined.
These step sizes are displayed on the right-hand side.
.sp
For every induction variable we maintain the following information:
.IP -
the offset of the variable in the stack frame of its procedure
.IP -
a pointer to the EM text of the assignment statement
.IP -
the step value
.LP
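.PP
As an illustration of the pattern table, the sketch below maps the single
assignment of a candidate to its step size.
The function name and the representation of EM code as a list of
(opcode, operand) pairs are assumptions made for this example;
they do not describe the data structures used by SR.
.DS
```python
def step_size(code, x, ws):
    """Return the step of candidate x, or None if no pattern matches.

    code is the single assignment as a list of (opcode, operand)
    pairs; the operand is None for INC and DEC.  ws is the word size.
    """
    ops = [op for op, _ in code]
    args = [arg for _, arg in code]
    if ops == ["INL"] and args == [x]:
        return 1
    if ops == ["DEL"] and args == [x]:
        return -1
    if ops in (["LOL", "INC", "STL"], ["LOL", "DEC", "STL"]):
        if args[0] == x and args[2] == x:
            return 1 if ops[1] == "INC" else -1
    if ops in (["LOL", "LOC", "ADI", "STL"], ["LOL", "LOC", "SBI", "STL"]):
        if args[0] == x and args[2] == ws and args[3] == x:
            return args[1] if ops[2] == "ADI" else -args[1]
    if ops == ["LOC", "LOL", "ADI", "STL"]:
        # Only addition is allowed here: n - x is not an increment of x.
        if args[1] == x and args[2] == ws and args[3] == x:
            return args[0]
    return None
```
.DE
.LP
The last pattern has no SBI variant, because subtracting x from a constant
does not change x by a fixed amount.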
.NH 3
Optimizing expressions
.PP
If any induction variables of the loop were found,
the EM text of the loop is scanned again
to detect expressions that can be optimized.
SR scans for multiplication and array instructions.
Whenever it finds such an instruction, it analyses the
code preceding it.
If an expression is to be optimized, it must
be generated by the following syntax rules.
.DS
   optimizable_expr:
		iv_expr const mult |
		const iv_expr mult |
		address iv_expr address array_instr;
   mult:
		MLI ws |
		MLU ws ;
   array_instr:
		LAR ws |
		SAR ws |
		AAR ws ;
   const:
		LOC n ;
.DE
An 'address' is an EM instruction that loads an
address on the stack.
An instruction like LOL may be an 'address' if
the size of an address (the pointer size, ps) is
the same as the word size.
If the pointer size is twice the word size,
instructions like LDL are an 'address'.
(The addresses in the third grammar rule
denote the array address and the
array descriptor address, respectively.)
.DS
   address:
		LAE |
		LAL |
		LOL if ps=ws |
		LOE    ,,    |
		LIL    ,,    |
		LDL if ps=2*ws |
		LDE    ,,      ;
.DE
The notion of an iv-expression was introduced earlier.
.DS
   iv_expr:
		iv_expr unary_op |
		iv_expr iv_expr binary_op |
		loopconst |
		iv ;
   unary_op:
		NGI ws |
		INC |
		DEC ;
   binary_op:
		ADI ws |
		ADU ws |
		SBI ws |
		SBU ws ;
   loopconst:
		const |
		LOL x  if x is not changed in loop ;
   iv:
		LOL x  if x is an induction variable ;
.DE
An iv-expression must satisfy one additional constraint:
it must use exactly one operand that is an induction
variable.
A simple, hand-written, top-down parser is used
to recognize an iv-expression.
It scans the EM code from right to left
(recall that EM is essentially postfix).
It uses semantic attributes (inherited as well as
derived) to check the additional constraint.
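.PP
A right-to-left recognizer for the iv_expr grammar might look like the
sketch below.
It is a simplified illustration, not the parser used by SR:
the function names, the (opcode, operand) representation and the way the
sign is passed down as an inherited attribute are assumptions made
for this example.
.DS
```python
BINARY = {"ADI", "ADU", "SBI", "SBU"}


def parse_iv_expr(code, end, ivs, changed, ws, sign=1):
    """Parse an iv_expr that ends at code[end], scanning right to left.

    Returns (start, found), where start is the index of the first
    instruction of the expression and found lists (variable, sign)
    for every induction variable operand, or None on failure.
    sign is inherited: the sign with which this subexpression
    contributes to the whole expression.
    """
    if end < 0:
        return None
    op, arg = code[end]
    if op == "LOC":
        return end, []
    if op == "LOL":
        if arg in ivs:
            return end, [(arg, sign)]
        return (end, []) if arg not in changed else None
    if op in ("INC", "DEC") or (op == "NGI" and arg == ws):
        inner = -sign if op == "NGI" else sign
        return parse_iv_expr(code, end - 1, ivs, changed, ws, inner)
    if op in BINARY and arg == ws:
        # In postfix code the right operand is the last one pushed;
        # a subtraction negates its contribution.
        right_sign = -sign if op in ("SBI", "SBU") else sign
        right = parse_iv_expr(code, end - 1, ivs, changed, ws, right_sign)
        if right is None:
            return None
        left = parse_iv_expr(code, right[0] - 1, ivs, changed, ws, sign)
        if left is None:
            return None
        return left[0], left[1] + right[1]
    return None


def recognize(code, end, ivs, changed, ws):
    """Return (start, iv, sign) if exactly one operand is an iv."""
    result = parse_iv_expr(code, end, ivs, changed, ws)
    if result is None or len(result[1]) != 1:
        return None
    start, [(iv, sign)] = result
    return start, iv, sign
```
.DE
.LP
Here the derived attribute is the list of induction variable operands
that were found; the additional constraint is checked once, at the top level.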
.PP
All information assembled during the recognition
process is put in a 'code_info' structure.
This structure contains the following information:
.IP -
the optimizable code itself
.IP -
the loop and basic block the code is part of
.IP -
the induction variable
.IP -
the iv-expression
.IP -
the sign of the induction variable in the
iv-expression
.IP -
the offset and size of the temporary local variable
.IP -
the expensive operator (MLI, LAR etc.)
.IP -
the instruction that loads the constant
(for multiplication) or the array descriptor
(for arrays).
.LP
The entire transformation process is driven
by this information.
As the EM text is represented internally
as a list, this process consists
mainly of straightforward list manipulations.
.sp 0
The initialization code must be put
immediately before the loop entry.
For this purpose a \fIheader block\fR is
created that has the loop entry block as
its only successor and that dominates the
entry block.
The CFG and all relations (SUCC, PRED, IDOM, LOOPS etc.)
are updated.
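.PP
The effect of inserting a header block on SUCC, PRED and IDOM can be
sketched as follows.
This is a generic illustration of the technique, not SR's code:
the dictionary representation of the relations and all names are
assumptions, the LOOPS relation is left out, and the entry block is
assumed to dominate every block of the loop.
.DS
```python
def insert_header(succ, pred, idom, entry, loop_blocks, header):
    """Insert header in front of the loop entry block (in place).

    succ and pred map a block to a list of blocks; idom maps a block
    to its immediate dominator.
    """
    outside = [p for p in pred[entry] if p not in loop_blocks]
    inside = [p for p in pred[entry] if p in loop_blocks]
    # Every edge that enters the loop from outside now goes to header.
    for p in outside:
        succ[p] = [header if s == entry else s for s in succ[p]]
    pred[header] = outside
    succ[header] = [entry]
    # Back edges still go to the entry block itself.
    pred[entry] = [header] + inside
    # header takes over the old immediate dominator of the entry.
    idom[header] = idom.get(entry)
    idom[entry] = header
```
.DE
.LP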
.sp 0
An EM instruction that will
replace the optimizable code
is created and put at the place of the old code.
The list representing the old optimizable code
is used to create a list for the initializing code,
as they are similar.
Only two modifications are required:
.IP -
if the expensive operator is a LAR or SAR,
it must be replaced by an AAR, as the initial value
of the temporary local variable TMP is the \fIaddress\fR of the first
array element that is accessed.
.IP -
code must be appended to store the result of the
expression in TMP.
.LP
Finally, code to increment TMP is created and put after
the code of the single assignment to the
induction variable.
The generated code uses either an integer addition
(ADI) or an integer-to-pointer addition (ADS)
to do the increment.
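.PP
The two list manipulations above, and the increment for the
multiplication case, can be sketched as follows.
This is an illustration under stated assumptions, not SR's code:
the function names and the (opcode, operand) representation are invented
for this example, the choice between STL and SDL for the store is an
assumption based on the size of TMP, and the increment is shown only
for a multiplication by a constant.
.DS
```python
def make_init_code(old_code, tmp_offset, ps, ws):
    """Derive the initialization code from the old optimizable code."""
    init = list(old_code)          # copy; the old list is left intact
    op, arg = init[-1]             # the expensive operator comes last
    is_array = op in ("LAR", "SAR", "AAR")
    if op in ("LAR", "SAR"):
        init[-1] = ("AAR", arg)    # compute the element address only
    # TMP holds an address (ps bytes) for arrays, a product (ws bytes)
    # for multiplications.  Assumption: a double-word TMP uses SDL.
    size = ps if is_array else ws
    init.append(("STL" if size == ws else "SDL", tmp_offset))
    return init


def make_mult_increment(tmp_offset, step, sign, const, ws):
    """Code that keeps TMP equal to iv_expr * const after the iv changes.

    When the induction variable changes by step, the iv-expression
    changes by sign * step, so the product changes by
    sign * step * const.
    """
    delta = sign * step * const
    return [
        ("LOL", tmp_offset),
        ("LOC", delta),
        ("ADI", ws),
        ("STL", tmp_offset),
    ]
```
.DE
.LP
For example, with an induction variable of step 2 that occurs negated in
an expression multiplied by 6, TMP is incremented by -12 after every
assignment to the induction variable.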
.PP
SR maintains a set of all expressions that have already
been recognized in the present loop.
Such expressions are said to be \fIavailable\fR.
If an expression is recognized that is
already available,
no new temporary local variable is allocated for it,
and the code to initialize and increment the local
is not generated.
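.PP
The bookkeeping of available expressions can be sketched with a table
that maps an expression to its temporary.
This is an illustration only:
the function name, the use of structural equality on
(opcode, operand) lists and the \fIalloc_temp\fR callback are
assumptions, and SR may compare expressions differently.
.DS
```python
def temp_for(available, expr, alloc_temp):
    """Return (tmp_offset, is_new) for a recognized expression.

    available maps an expression, stored as a tuple of
    (opcode, operand) pairs, to the offset of its temporary.
    alloc_temp is a hypothetical callback that allocates a new
    temporary local and returns its offset.
    """
    key = tuple(expr)
    if key in available:
        # Already available: reuse the temporary; no initialization
        # or increment code has to be generated.
        return available[key], False
    available[key] = alloc_temp()
    return available[key], True
```
.DE
.LP
The caller generates initialization and increment code only when
the second component of the result is true.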