135 lines
		
	
	
	
		
			4.6 KiB
		
	
	
	
		
			Text
		
	
	
	
	
	
			
		
		
	
	
			135 lines
		
	
	
	
		
			4.6 KiB
		
	
	
	
		
			Text
		
	
	
	
	
	
| .NH 2
 | |
| Heuristic rules
 | |
| .PP
 | |
| Using the information described
 | |
| in the previous section,
 | |
| we can find all calls that can
 | |
| be expanded in line, and for which
 | |
| this expansion is desirable.
 | |
| In general, we cannot expand all these calls,
 | |
| so we have to choose the 'best' ones.
 | |
| With every CAL instruction
 | |
| that may be expanded, we associate
 | |
| a \fIpay off\fR,
 | |
| which expresses how desirable it is
 | |
| to expand this specific CAL.
 | |
| .sp
 | |
| Let Tc denote the portion of EM text involved
 | |
| in a specific call, i.e. the pushing of the actual
 | |
| parameter expressions, the CAL itself,
 | |
| the popping of the parameters and the
 | |
| pushing of the result (if any, via an LFR).
 | |
| Let Te denote the EM text that would be obtained
 | |
| by expanding the call in line.
 | |
| Let Pc be the original program and Pe the program
 | |
| with Te substituted for Tc.
 | |
| The pay off of the CAL depends on two factors:
 | |
| .IP -
 | |
| T = execution_time(Pe) - execution_time(Pc)
 | |
| .IP -
 | |
| S = code_size(Pe) - code_size(Pc)
 | |
| .LP
 | |
| The change in execution time (T) depends on:
 | |
| .IP -
 | |
| T1 = execution_time(Te) - execution_time(Tc)
 | |
| .IP -
 | |
| N = number of times Te or Tc get executed.
 | |
| .LP
 | |
| We assume that T1 will be the same every
 | |
| time the code gets executed.
 | |
| This is a reasonable assumption.
 | |
| (Note that we are talking about one CAL,
 | |
| not about different calls to the same procedure).
 | |
| Hence
 | |
| .DS
 | |
| T = N * T1
 | |
| .DE
 | |
| T1 can be estimated by a careful analysis
 | |
| of the transformations that are performed.
 | |
| Below, we list everything that will be
 | |
| different when a call is expanded in line:
 | |
| .IP -
 | |
| The CAL instruction is not executed.
 | |
| This saves a subroutine jump.
 | |
| .IP -
 | |
| The instructions in the procedure prolog
 | |
| are not executed.
 | |
| These instructions, generated from the PRO pseudo,
 | |
| save some machine registers 
 | |
| (including the old LB), set the new LB and allocate space
 | |
| for the locals of the called routine.
 | |
| The savings may be less if there are no
 | |
| locals to allocate.
 | |
| .IP -
 | |
| In line parameters are not evaluated before the call
 | |
| and are not pushed on the stack.
 | |
| .IP -
 | |
| All remaining parameters are stored in local variables,
 | |
| instead of being pushed on the stack.
 | |
| .IP -
 | |
| If the number of parameters is nonzero,
 | |
| the ASP instruction after the CAL is not executed.
 | |
| .IP -
 | |
| Every reference to an in line parameter is
 | |
| substituted by the parameter expression.
 | |
| .IP -
 | |
| RET (return) instructions are replaced by
 | |
| BRA (branch) instructions.
 | |
| If the called procedure 'falls through'
 | |
| (i.e. it has only one RET, at the end of its code),
 | |
| even the BRA is not needed.
 | |
| .IP -
 | |
| The LFR (fetch function result) is not executed
 | |
| .PP
 | |
| Besides these changes, which are caused directly by IL,
 | |
| other changes may occur as IL influences other optimization
 | |
| techniques, such as Register Allocation and Constant Propagation.
 | |
| Our heuristic rules do not take into account the quite
 | |
| inpredictable effects on Register Allocation.
 | |
| It does, however, favour calls that have numeric \fIconstants\fR
 | |
| as parameter; especially the constant "0" as an inline
 | |
| parameter gets high scores,
 | |
| as further optimizations may often be possible.
 | |
| .PP
 | |
| It cannot be determined statically how often a CAL instruction gets
 | |
| executed.
 | |
| We will use \fIloop nesting\fR information here.
 | |
| The nesting level of the loop in which
 | |
| the CAL appears (if any) will be used as an
 | |
| indication for the number of times it gets executed.
 | |
| .PP
 | |
| Based on all these facts,
 | |
| the pay off of a call will be computed.
 | |
| The following model was developed empirically.
 | |
| Assume procedure P calls procedure Q.
 | |
| The call takes place in basic block B.
 | |
| .DS
 | |
| .TS
 | |
| l l l.
 | |
| ZP	\&=	# zero parameters
 | |
| CP	\&=	# constant parameters - ZP
 | |
| LN	\&=	Loop Nesting level (0 if outside any loop)
 | |
| F	\&=	\fIif\fR # formal parameters of Q > 0 \fIthen\fR 1 \fIelse\fR 0
 | |
| FT	\&=	\fIif\fR Q falls through \fIthen\fR 1 \fIelse\fR 0
 | |
| S	\&=	size(Q) - 1 - # inline_parameters - F
 | |
| L	\&=	\fIif\fR # local variables of P > 0 \fIthen\fR 0 \fIelse\fR -1
 | |
| A	\&=	CP + 2 * ZP
 | |
| N	\&=	\fIif\fR LN=0 and P is never called from a loop \fIthen\fR 0 \fIelse\fR (LN+1)**2
 | |
| FM	\&=	\fIif\fR B is a firm block \fIthen\fR 2 \fIelse\fR 1
 | |
| 
 | |
| pay_off	\&=	(100/S + FT + F + L + A) * N * FM
 | |
| .TE
 | |
| .DE
 | |
| S stands for the size increase of the program,
 | |
| which is slightly less than the size of Q.
 | |
| The size of a procedure is taken to be its number
 | |
| of (non-pseudo) EM instructions.
 | |
| The terms "loop nesting level" and "firm" were defined
 | |
| in the chapter on the Intermediate Code (section "loop tables").
 | |
| If a call is not inside a loop and the calling procedure
 | |
| is itself never called from a loop (transitively),
 | |
| then the call will probably be executed at most once.
 | |
| Such a call is never expanded in line (its pay off is zero).
 | |
| If the calling procedure doesn't have local variables, a penalty (L)
 | |
| is introduced, as it will most likely get local variables if the
 | |
| call gets expanded.
 |