.NH 2 The model of strength reduction .PP In this section we will describe the transformations performed by Strength Reduction (SR). Before doing so, we will introduce the central notion of an induction variable. .NH 3 Induction variables .PP SR looks for variables whose values form an arithmetic progression at the beginning of a loop. These variables are called induction variables. The most frequently occurring example of such a variable is a loop-variable in a high-order programming language. Several quite sophisticated models of strength reduction can be found in the literature. .[ cocke reduction strength cacm .] .[ allen cocke kennedy reduction strength .] .[ lowry medlock cacm .] .[ aho compiler design .] In these models the notion of an induction variable is far more general than the intuitive notion of a loop-variable. The definition of an induction variable we present here is more restricted, yielding a simpler model and simpler transformations. We think the principle source for strength reduction lies in expressions using a loop-variable, i.e. a variable that is incremented or decremented by the same amount after every loop iteration, and that cannot be changed in any other way. .PP Of course, the EM code does not contain high level constructs such as for-statements. We will define an induction variable in terms of the Intermediate Code of the optimizer. Note that the notions of a loop in the EM text and of a firm basic block were defined in section 3.3.5. .sp .UL definition .sp 0 An induction variable i of a loop L is a local variable that is never accessed indirectly, whose size is the word size of the target machine, and that is assigned exactly once within L, the assignment: .IP - being of the form i := i + c or i := c +i, c is a constant called the \fIstep value\fR of i. .IP - occurring in a firm block of L. .LP (Note that the first restriction on the assignment is not described in terms of the Intermediate Code; we will give such a description later; the current definition is easier to understand however). .NH 3 Recognized expressions .PP SR recognizes certain expressions using an induction variable and replaces them by cheaper ones. Two kinds of expensive operations are recognized: multiplication and array address computations. The expressions that are simplified must use an induction variable as an operand of a multiplication or as index in an array expression. .PP Often a linear function of an induction variable is used, rather than the variable itself. In these cases optimization is still possible. We call such expressions \fIiv-expressions\fR. .sp .UL definition: .sp 0 An iv-expression of an induction variable i of a loop L is an expression that: .IP - uses only the operators + and - (unary as well as binary) .IP - uses i as operand exactly once .IP - uses (besides i) only constants or variables that are never changed in L as operands. .LP .PP The expressions recognized by SR are of the following forms: .IP (1) iv_expression * constant .IP (2) constant * iv_expression .IP (3) A[iv-expression] := (assign to array element) .IP (4) A[iv-expression] (use array element) .IP (5) & A[iv-expression] (take address of array element) .LP (Note that EM has different instructions to use an array element, store into one, or take the address of one, resp. LAR, SAR, and AAR). .sp 0 The size of the elements of A must be known statically. In cases (3) and (4) this size must equal the word size of the target machine. .NH 3 Transformations .PP With every recognized expression we associate a new temporary local variable TMP, allocated in the stack frame of the procedure containing the expression. At any program point within the loop, TMP will contain the following value: .IP multiplication: 18 the current value of iv-expression * constant .IP arrays: the current value of &A[iv-expression]. .LP In the second case, TMP essentially is a pointer variable, pointing to the element of A that is currently in use. .sp 0 If the same expression occurs several times in the loop, the same temporary local is used each time. .PP Three transformations are applied to the EM text: .IP (1) TMP is initialized with the right value. This initialization takes place just before the loop. .IP (2) The recognized expression is simplified. .IP (3) TMP is incremented; this takes place just after the induction variable is incremented. .LP For multiplication, the initial value of TMP is the value of the recognized expression at the program point immediately before the loop. For arrays, TMP is initialized with the address of the first array element that is accessed. So the initialization code is: .DS TMP := iv-expression * constant; or TMP := &A[iv-expression] .DE At the point immediately before the loop, the induction variable will already have been initialized, so the value used in the code above will be the value it has during the first iteration. .PP For multiplication, the recognized expression can simply be replaced by TMP. For array optimizations, the replacement depends on the form: .DS \fIform\fR \fIreplacement\fR (3) A[iv-expr] := *TMP := (assign indirect) (4) A[iv-expr] *TMP (use indirect) (5) &A[iv-expr] TMP .DE The '*' denotes the indirect operator. (Note that EM has different instructions to do an assign-indirect and a use-indirect). As the size of the array elements is restricted to be the word size in case (3) and (4), only one EM instruction needs to be generated in all cases. .PP The amount by which TMP is incremented is: .IP multiplication: 18 step value * constant .IP arrays: step value * element size .LP Note that the step value (see definition of induction variable above), the constant, and the element size (see previous section) can all be determined statically. If the sign of the induction variable in the iv-expression is negative, the amount must be negated. .PP The transformations are demonstrated by an example. .DS i := 100; i := 100; while i > 1 loop TMP := (6-i) * 5; X := (6-i) * 5 + 2; while i > 1 loop Y := (6-i) * 5 - 8; --> X := TMP + 2; i := i - 3; Y := TMP - 8; end loop; i := i - 3; TMP := TMP + 15; end loop; Fig. 6.2 Example of complex Strength Reduction transformations .DE The expression '(6-i)*5' is recognized twice. The constant is 5. The step value is -3. The sign of i in the recognized expression is '-'. So the increment value of TMP is -(-3*5) = +15.