132 lines
		
	
	
	
		
			4.5 KiB
		
	
	
	
		
			Text
		
	
	
	
	
	
			
		
		
	
	
			132 lines
		
	
	
	
		
			4.5 KiB
		
	
	
	
		
			Text
		
	
	
	
	
	
.NH 2
 | 
						|
Heuristic rules
 | 
						|
.PP
 | 
						|
Using the information described
 | 
						|
in the previous section,
 | 
						|
we can find all calls that can
 | 
						|
be expanded in line, and for which
 | 
						|
this expansion is desirable.
 | 
						|
In general, we cannot expand all these calls,
 | 
						|
so we have to choose the 'best' ones.
 | 
						|
With every CAL instruction
 | 
						|
that may be expanded, we associate
 | 
						|
a \fIpay off\fR,
 | 
						|
which expresses how desirable it is
 | 
						|
to expand this specific CAL.
 | 
						|
.sp
 | 
						|
Let Tc denote the portion of EM text involved
 | 
						|
in a specific call, i.e. the pushing of the actual
 | 
						|
parameter expressions, the CAL itself,
 | 
						|
the popping of the parameters and the
 | 
						|
pushing of the result (if any, via an LFR).
 | 
						|
Let Te denote the EM text that would be obtained
 | 
						|
by expanding the call in line.
 | 
						|
Let Pc be the original program and Pe the program
 | 
						|
with Te substituted for Tc.
 | 
						|
The pay off of the CAL depends on two factors:
 | 
						|
.IP -
 | 
						|
T = execution_time(Pe) - execution_time(Pc)
 | 
						|
.IP -
 | 
						|
S = code_size(Pe) - code_size(Pc)
 | 
						|
.LP
 | 
						|
The change in execution time (T) depends on:
 | 
						|
.IP -
 | 
						|
T1 = execution_time(Te) - execution_time(Tc)
 | 
						|
.IP -
 | 
						|
N = number of times Te or Tc get executed.
 | 
						|
.LP
 | 
						|
We assume that T1 will be the same every
 | 
						|
time the code gets executed.
 | 
						|
This is a reasonable assumption.
 | 
						|
(Note that we are talking about one CAL,
 | 
						|
not about different calls to the same procedure).
 | 
						|
Hence
 | 
						|
.DS
 | 
						|
T = N * T1
 | 
						|
.DE
 | 
						|
T1 can be estimated by a careful analysis
 | 
						|
of the transformations that are performed.
 | 
						|
Below, we list everything that will be
 | 
						|
different when a call is expanded in line:
 | 
						|
.IP -
 | 
						|
The CAL instruction is not executed.
 | 
						|
This saves a subroutine jump.
 | 
						|
.IP -
 | 
						|
The instructions in the procedure prolog
 | 
						|
are not executed.
 | 
						|
These instructions, generated from the PRO pseudo,
 | 
						|
save some machine registers 
 | 
						|
(including the old LB), set the new LB and allocate space
 | 
						|
for the locals of the called routine.
 | 
						|
The savings may be less if there are no
 | 
						|
locals to allocate.
 | 
						|
.IP -
 | 
						|
In line parameters are not evaluated before the call
 | 
						|
and are not pushed on the stack.
 | 
						|
.IP -
 | 
						|
All remaining parameters are stored in local variables,
 | 
						|
instead of being pushed on the stack.
 | 
						|
.IP -
 | 
						|
If the number of parameters is nonzero,
 | 
						|
the ASP instruction after the CAL is not executed.
 | 
						|
.IP -
 | 
						|
Every reference to an in line parameter is
 | 
						|
substituted by the parameter expression.
 | 
						|
.IP -
 | 
						|
RET (return) instructions are replaced by
 | 
						|
BRA (branch) instructions.
 | 
						|
If the called procedure 'falls through'
 | 
						|
(i.e. it has only one RET, at the end of its code),
 | 
						|
even the BRA is not needed.
 | 
						|
.IP -
 | 
						|
The LFR (fetch function result) is not executed
 | 
						|
.PP
 | 
						|
Besides these changes, which are caused directly by IL,
 | 
						|
other changes may occur as IL influences other optimization
 | 
						|
techniques, such as Register Allocation and Constant Propagation.
 | 
						|
Our heuristic rules do not take into account the quite
 | 
						|
inpredictable effects on Register Allocation.
 | 
						|
It does, however, favour calls that have numeric \fIconstants\fR
 | 
						|
as parameter; especially the constant "0" as an inline
 | 
						|
parameter gets high scores,
 | 
						|
as further optimizations may often be possible.
 | 
						|
.PP
 | 
						|
It cannot be determined statically how often a CAL instruction gets
 | 
						|
executed.
 | 
						|
We will use \fIloop nesting\fR information here.
 | 
						|
The nesting level of the loop in which
 | 
						|
the CAL appears (if any) will be used as an
 | 
						|
indication for the number of times it gets executed.
 | 
						|
.PP
 | 
						|
Based on all these facts,
 | 
						|
the pay off of a call will be computed.
 | 
						|
The following model was developed empirically.
 | 
						|
Assume procedure P calls procedure Q.
 | 
						|
The call takes place in basic block B.
 | 
						|
.DS
 | 
						|
ZP = # zero parameters
 | 
						|
CP = # constant parameters - ZP
 | 
						|
LN = Loop Nesting level (0 if outside any loop)
 | 
						|
F  = \fIif\fR # formal parameters of Q > 0 \fIthen\fR 1 \fIelse\fR 0
 | 
						|
FT = \fIif\fR Q falls through \fIthen\fR 1 \fIelse\fR 0
 | 
						|
S  = size(Q) - 1 - # inline_parameters - F
 | 
						|
L  = \fIif\fR # local variables of P > 0 \fIthen\fR 0 \fIelse\fR -1
 | 
						|
A  = CP + 2 * ZP
 | 
						|
N  = \fIif\fR LN=0 and P is never called from a loop \fIthen\fR 0 \fIelse\fR (LN+1)**2
 | 
						|
FM = \fIif\fR B is a firm block \fIthen\fR 2 \fIelse\fR 1
 | 
						|
 | 
						|
pay_off = (100/S + FT + F + L + A) * N * FM
 | 
						|
.DE
 | 
						|
S stands for the size increase of the program,
 | 
						|
which is slightly less than the size of Q.
 | 
						|
The size of a procedure is taken to be its number
 | 
						|
of (non-pseudo) EM instructions.
 | 
						|
The terms "loop nesting level" and "firm" were defined
 | 
						|
in the chapter on the Intermediate Code (section "loop tables").
 | 
						|
If a call is not inside a loop and the calling procedure
 | 
						|
is itself never called from a loop (transitively),
 | 
						|
then the call will probably be executed at most once.
 | 
						|
Such a call is never expanded in line (its pay off is zero).
 | 
						|
If the calling procedure doesn't have local variables, a penalty (L)
 | 
						|
is introduced, as it will most likely get local variables if the
 | 
						|
call gets expanded.
 |