.NH 2
Implementation
.PP
Like most phases, SR deals with one procedure
at a time.
Within a procedure, SR works on one loop at a time.
Loops are processed in textual order.
If loops are nested inside each other,
SR starts with the outermost loop and proceeds
inwards.
This order is chosen
because it enables the optimization
of multi-dimensional array address computations,
provided the elements are accessed in the usual way
(i.e. row after row, rather than column after column).
For every loop, SR first detects all induction variables
and then tries to recognize
expressions that can be optimized.
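.PP
The processing order can be sketched as follows.
This is only an illustration in Python, not the SR source:
the Loop record and its fields (name, entry, depth) are hypothetical,
and the sketch assumes that the entry of an enclosing loop
textually precedes the entries of the loops nested in it.
.DS
```python
from dataclasses import dataclass

@dataclass
class Loop:
    name: str    # label used only in this illustration
    entry: int   # textual position of the loop entry
    depth: int   # nesting depth, 1 = outermost

def processing_order(loops):
    """Textual order; an enclosing loop precedes the loops nested in it."""
    return sorted(loops, key=lambda l: (l.entry, l.depth))

loops = [Loop("inner", 12, 2), Loop("second", 30, 1), Loop("outer", 10, 1)]
order = [l.name for l in processing_order(loops)]
```
.DE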
.NH 3
Finding induction variables
.PP
The process of finding induction variables
can conveniently be split
into two parts.
First, the EM text of the loop is scanned to find
all \fIcandidate\fR induction variables,
which are word-sized local variables
that are assigned precisely once
in the loop, within a firm block.
Second, for every candidate, the single assignment
is inspected to see if it has the form
required by the definition of an induction variable.
.PP
Candidates are found by scanning the EM code of the loop.
During this scan, two sets are maintained.
The set "cand" contains all variables that have been
assigned exactly once so far, within a firm block.
The set "dismiss" contains all variables that
must not become candidates.
Initially, both sets are empty.
When a variable is assigned to, it is put
in the cand set if three conditions are met:
.IP 1.
the variable is not already in cand or dismiss
.IP 2.
the assignment takes place in a firm block
.IP 3.
the assignment is not a ZRL instruction (assignment of zero)
or an SDL instruction (store double local).
.LP
If any condition fails, the variable is removed from cand
(if it was there) and put in dismiss
(if it was not there already).
.sp 0
All variables for which no register message was generated (i.e. those
variables that may be accessed indirectly) are assumed
to be changed in the loop.
.sp 0
All variables that remain in cand are candidate induction variables.
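.PP
The scan described above can be sketched in a few lines of Python.
The encoding of an assignment as a (variable, opcode, in_firm_block)
tuple and the names find_candidates and register_vars are assumptions
made for this illustration.
Dropping variables without a register message from the result is one
possible reading of "assumed to be changed in the loop".
.DS
```python
def find_candidates(assignments, register_vars):
    """Return the candidate induction variables of one loop.

    assignments: (var, opcode, in_firm_block) tuples in scan order
        (hypothetical encoding of the assignments found in the EM text).
    register_vars: variables for which a register message was generated.
    """
    cand, dismiss = set(), set()
    for var, opcode, in_firm_block in assignments:
        ok = (
            var not in cand and var not in dismiss   # condition 1
            and in_firm_block                         # condition 2
            and opcode not in ("ZRL", "SDL")          # condition 3
        )
        if ok:
            cand.add(var)
        else:
            cand.discard(var)   # no longer assigned exactly once
            dismiss.add(var)
    # Assumption: a variable that may be accessed indirectly is never a candidate.
    return {v for v in cand if v in register_vars}
```
.DE
.LP
A variable that is assigned twice enters cand on the first assignment
and moves to dismiss on the second, so it can never re-enter cand.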
.PP
From the set of candidates, the induction variables can
be determined by inspecting the single assignment.
The assignment must match one of the EM patterns below.
('x' is the candidate, 'ws' is the word size of the target machine
and 'n' is any number.)
.DS
\fIpattern\fR                                     \fIstep size\fR
INL x  |                                      +1
DEL x  |                                      -1
LOL x ; (INC | DEC) ; STL x  |                +1 | -1
LOL x ; LOC n ; (ADI ws | SBI ws) ; STL x  |  +n | -n
LOC n ; LOL x ; ADI ws ; STL x.               +n
.DE
The step size of the induction variable
follows from the matched pattern;
it is shown in the right-hand column.
.sp
For every induction variable we maintain the following information:
.IP -
the offset of the variable in the stack frame of its procedure
.IP -
a pointer to the EM text of the assignment statement
.IP -
the step value
.LP
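.PP
As an illustration, the five patterns and the record kept per
induction variable could be expressed as below.
EM instructions are modelled as Python tuples such as ("LOL", x)
or ("INC",); this encoding and the names InductionVar and match_step
are assumptions of the sketch, not names taken from SR.
.DS
```python
from dataclasses import dataclass

@dataclass
class InductionVar:
    offset: int        # offset of the variable in the stack frame
    assignment: list   # EM instructions of the single assignment
    step: int          # step value

def match_step(code, x, ws):
    """Return the step size if code matches one of the patterns, else None."""
    if code == [("INL", x)]:
        return 1
    if code == [("DEL", x)]:
        return -1
    if len(code) == 3 and code[0] == ("LOL", x) and code[2] == ("STL", x):
        return {("INC",): 1, ("DEC",): -1}.get(code[1])
    if len(code) == 4 and code[3] == ("STL", x):
        a, b, op = code[:3]
        if a == ("LOL", x) and b[0] == "LOC" and op in (("ADI", ws), ("SBI", ws)):
            return b[1] if op[0] == "ADI" else -b[1]
        if a[0] == "LOC" and b == ("LOL", x) and op == ("ADI", ws):
            return a[1]
    return None
```
.DE
.LP
Note that the last pattern has no SBI variant:
"LOC n ; LOL x ; SBI ws" computes n-x, which is not an increment of x.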
.NH 3
Optimizing expressions
.PP
If any induction variables of the loop were found,
the EM text of the loop is scanned again
to detect expressions that can be optimized.
SR scans for multiplication and array instructions.
Whenever it finds such an instruction, it analyses the
code in front of it.
To be optimized, an expression must
be generated by the following syntax rules.
.DS
   optimizable_expr:
		iv_expr const mult |
		const iv_expr mult |
		address iv_expr address array_instr;
   mult:
		MLI ws |
		MLU ws ;
   array_instr:
		LAR ws |
		SAR ws |
		AAR ws ;
   const:
		LOC n ;
.DE
An 'address' is an EM instruction that loads an
address on the stack.
An instruction like LOL may be an 'address' if
the size of an address (the pointer size, ps) is
the same as the word size.
If the pointer size is twice the word size,
instructions like LDL are an 'address'.
(The addresses in the third grammar rule
denote the array address and the
array descriptor address, respectively.)
.DS
   address:
		LAE |
		LAL |
		LOL if ps=ws |
		LOE    ,,    |
		LIL    ,,    |
		LDL if ps=2*ws |
		LDE    ,,      ;
.DE
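.LP
The 'address' rule amounts to a small predicate on the opcode,
the pointer size and the word size.
A minimal Python sketch follows; the function name is_address
is hypothetical.
.DS
```python
def is_address(opcode, ps, ws):
    """True if the EM instruction loads an address, per the 'address' rule."""
    if opcode in ("LAE", "LAL"):
        return True            # always load an address
    if opcode in ("LOL", "LOE", "LIL"):
        return ps == ws        # one-word load: an address only if ps = ws
    if opcode in ("LDL", "LDE"):
        return ps == 2 * ws    # two-word load: an address only if ps = 2*ws
    return False
```
.DE
.LP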
The notion of an iv-expression was introduced earlier.
.DS
   iv_expr:
		iv_expr unary_op |
		iv_expr iv_expr binary_op |
		loopconst |
		iv ;
   unary_op:
		NGI ws |
		INC |
		DEC ;
   binary_op:
		ADI ws |
		ADU ws |
		SBI ws |
		SBU ws ;
   loopconst:
		const |
		LOL x  if x is not changed in loop ;
   iv:
		LOL x  if x is an induction variable ;
.DE
An iv-expression must satisfy one additional constraint:
it must use exactly one operand that is an induction
variable.
A simple, hand-written, top-down parser is used
to recognize an iv-expression.
It scans the EM code from right to left
(recall that EM is essentially postfix).
It uses semantic attributes (inherited as well as
derived) to check the additional constraint.
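.PP
A right-to-left recognizer for this grammar might look as follows.
This is a sketch under stated assumptions, not the SR parser:
instructions are modelled as (opcode, argument) pairs,
the sign of the operand is passed down as an inherited attribute,
and the list of induction variables found is returned as a derived
attribute.
Which attributes SR really uses is not stated here.
.DS
```python
UNARY = {"NGI", "INC", "DEC"}
BINARY = {"ADI", "ADU", "SBI", "SBU"}

def parse_iv_expr(code, end, ws, ivs, changed, sign=1):
    """Parse the iv_expr ending at code[end], scanning leftwards.

    Returns (start, found): the index of its first instruction and a list
    of (iv, sign) pairs, or None if no iv_expr ends at code[end].
    """
    if end < 0:
        return None
    op, arg = code[end]
    if op in UNARY:
        if op == "NGI" and arg != ws:
            return None
        inner = -sign if op == "NGI" else sign
        return parse_iv_expr(code, end - 1, ws, ivs, changed, inner)
    if op in BINARY:
        if arg != ws:
            return None
        right_sign = -sign if op in ("SBI", "SBU") else sign
        right = parse_iv_expr(code, end - 1, ws, ivs, changed, right_sign)
        if right is None:
            return None
        left = parse_iv_expr(code, right[0] - 1, ws, ivs, changed, sign)
        if left is None:
            return None
        return left[0], left[1] + right[1]
    if op == "LOC":
        return end, []                 # loopconst: a constant
    if op == "LOL" and arg in ivs:
        return end, [(arg, sign)]      # iv
    if op == "LOL" and arg not in changed:
        return end, []                 # loopconst: local not changed in loop
    return None

def recognize(code, ws, ivs, changed):
    """Return (start, iv, sign) if exactly one induction variable is used."""
    result = parse_iv_expr(code, len(code) - 1, ws, ivs, changed)
    if result is None or len(result[1]) != 1:
        return None
    start, found = result
    iv, sign = found[0]
    return start, iv, sign
```
.DE
.LP
Because the scan runs from the operator back to its operands,
the parser also finds where the expression starts,
which a left-to-right scan of postfix code cannot know in advance.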
.PP
All information assembled during the recognition
process is put in a 'code_info' structure.
This structure contains the following information:
.IP -
the optimizable code itself
.IP -
the loop and basic block the code is part of
.IP -
the induction variable
.IP -
the iv-expression
.IP -
the sign of the induction variable in the
iv-expression
.IP -
the offset and size of the temporary local variable
.IP -
the expensive operator (MLI, LAR etc.)
.IP -
the instruction that loads the constant
(for multiplication) or the array descriptor
(for arrays).
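.LP
For illustration, the items listed above map onto a record like the
following Python dataclass.
All field names are hypothetical; only the list of items comes from
the description of code_info.
.DS
```python
from dataclasses import dataclass

@dataclass
class CodeInfo:
    code: list        # the optimizable code itself (EM instructions)
    loop: object      # the loop the code is part of
    block: object     # the basic block the code is part of
    iv: object        # the induction variable
    iv_expr: list     # the iv-expression
    sign: int         # +1 or -1: sign of the induction variable in iv_expr
    tmp_offset: int   # offset of the temporary local variable
    tmp_size: int     # size of the temporary local variable
    operator: str     # the expensive operator, e.g. "MLI" or "LAR"
    operand: tuple    # instruction loading the constant or array descriptor

info = CodeInfo(
    code=[("LOL", -4), ("LOC", 6), ("MLI", 2)],
    loop="loop1", block="b3", iv=-4,
    iv_expr=[("LOL", -4)], sign=1,
    tmp_offset=-8, tmp_size=2,
    operator="MLI", operand=("LOC", 6),
)
```
.DE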
.LP
The entire transformation process is driven
by this information.
As the EM text is represented internally
as a list, this process consists
mainly of straightforward list manipulations.
.sp 0
The initialization code must be put
immediately before the loop entry.
For this purpose a \fIheader block\fR is
created that has the loop entry block as
its only successor and that dominates the
entry block.
The CFG and all relations (SUCC, PRED, IDOM, LOOPS etc.)
are updated.
.sp 0
An EM instruction that will
replace the optimizable code
is created and put in the place of the old code.
The list representing the old optimizable code
is used to create a list for the initializing code,
as the two are similar.
Only two modifications are required:
.IP -
if the expensive operator is a LAR or SAR,
it must be replaced by an AAR, as the initial value
of the temporary local variable (TMP) is the \fIaddress\fR of the first
array element that is accessed.
.IP -
code must be appended to store the result of the
expression in TMP.
.LP
Finally, code to increment TMP is created and put after
the code of the single assignment to the
induction variable.
The generated code uses either an integer addition
(ADI) or an integer-to-pointer addition (ADS)
to do the increment.
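.PP
The two list manipulations on the initializing code can be sketched
as below, with EM instructions modelled as (opcode, argument) pairs.
The function names, the store_tmp parameter and the exact shape of
the increment sequence are assumptions of this sketch;
the increment shown covers the multiplication case only,
where TMP must change by sign * step * constant whenever the
induction variable changes by step.
.DS
```python
def make_init_code(old_code, store_tmp):
    """Derive the initializing code from the old optimizable code."""
    init = list(old_code)          # copy; the old list is left untouched
    op, arg = init[-1]
    if op in ("LAR", "SAR"):
        init[-1] = ("AAR", arg)    # TMP holds an address, not an element
    init.append(store_tmp)         # store the result of the expression in TMP
    return init

def make_increment_code(tmp, step, const, sign, ws):
    """Illustrative increment for multiplication: TMP += sign*step*const."""
    return [("LOL", tmp), ("LOC", sign * step * const), ("ADI", ws), ("STL", tmp)]

def check_reduction(start, step, const, iterations):
    """Check numerically that the running TMP always equals i * const."""
    i, tmp = start, start * const
    for _ in range(iterations):
        if tmp != i * const:
            return False
        i += step
        tmp += step * const
    return True
```
.DE
.LP
The last function is only a sanity check of the arithmetic behind
the transformation; it is not part of the code generation.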
.PP
SR maintains a set of all expressions that have already
been recognized in the present loop.
Such expressions are said to be \fIavailable\fR.
If an expression is recognized that is
already available,
no new temporary local variable is allocated for it,
and the code to initialize and increment the local
is not generated.
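.PP
A minimal sketch of such a set of available expressions is given below.
It assumes that two expressions are the same when their instruction
lists are equal; how SR compares expressions is not stated here.
The class and method names are hypothetical.
.DS
```python
class AvailableExprs:
    """Expressions already recognized in the present loop."""

    def __init__(self):
        self._tmp_of = {}     # expression -> offset of its temporary local
        self.allocated = 0    # number of temporaries handed out so far

    def temporary_for(self, expr, new_tmp):
        """Return (offset, is_new); new_tmp() runs only for unseen expressions."""
        key = tuple(expr)
        if key in self._tmp_of:
            # Already available: reuse TMP, no init or increment code needed.
            return self._tmp_of[key], False
        offset = new_tmp()
        self._tmp_of[key] = offset
        self.allocated += 1
        return offset, True
```
.DE
.LP
The caller generates initialization and increment code only when
is_new is true; otherwise it merely replaces the expression by a load
of the existing temporary.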