| .NH 2
 | |
| Implementation
 | |
| .PP
 | |
| A major factor in the implementation
 | |
| of Inline Substitution is the requirement
 | |
| not to use an excessive amount of memory.
 | |
| IL essentially analyzes the entire program;
 | |
| it makes decisions based on which procedure calls
 | |
| appear in the whole program.
 | |
| Yet, because of the memory restriction, it is
 | |
| not feasible to read the entire program
 | |
| into main memory.
 | |
| To solve this problem, the IL phase has been
 | |
| split up into three subphases that are executed sequentially:
 | |
| .IP 1.
 | |
| analyze every procedure; see how it accesses its parameters;
 | |
| simultaneously collect all calls
 | |
| appearing in the whole program and put them
 | |
| in a \fIcall-list\fR.
 | |
| .IP 2.
 | |
| use the call-list and decide which calls will be substituted
 | |
| in line.
 | |
| .IP 3.
 | |
| take the decisions of subphase 2 and modify the
 | |
| program accordingly.
 | |
| .LP
 | |
| Subphases 1 and 3 scan the input program; only
 | |
| subphase 3 modifies it.
 | |
| It is essential that the decisions can be made
 | |
| in subphase 2
 | |
| without using the input program,
 | |
| provided that subphase 1 puts enough information
 | |
| in the call-list.
 | |
| Subphase 2 keeps the entire call-list in main memory
 | |
| and repeatedly scans it, to
 | |
| find the next best candidate for expansion.
 | |
| .PP
 | |
| We will specify the
 | |
| data structures used by IL before 
 | |
| describing the subphases.
 | |
| .NH 3
 | |
| Data structures
 | |
| .NH 4
 | |
| The procedure table
 | |
| .PP
 | |
| In subphase 1 information is gathered about every procedure
 | |
| and added to the procedure table.
 | |
| This information is used by the heuristic rules.
 | |
| A proctable entry for procedure p has
 | |
| the following extra information:
 | |
| .IP -
 | |
| is it allowed to substitute an invocation of p in line?
 | |
| .IP -
 | |
| is it allowed to put any parameter of such a call in line?
 | |
| .IP -
 | |
| the size of p (number of EM instructions)
 | |
| .IP -
 | |
| does p 'fall through'?
 | |
| .IP -
 | |
| a description of the formal parameters that p accesses; this information
 | |
| is obtained by looking at the code of p. For every parameter f,
 | |
| we record:
 | |
| .RS
 | |
| .IP -
 | |
| the offset of f
 | |
| .IP -
 | |
| the type of f (word, double word, pointer)
 | |
| .IP -
 | |
| may the corresponding actual parameter be put in line?
 | |
| .IP -
 | |
| is f ever accessed indirectly?
 | |
| .IP -
 | |
| is f used never, once, or more than once?
 | |
| .RE
 | |
| .IP -
 | |
| the number of times p is called (see below)
 | |
| .IP -
 | |
| the file address of its call-count information (see below).
 | |
| .LP
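The attributes listed above could be grouped in a record along the following lines. This is only a sketch in C: every type, field and function name is an assumption made for illustration, not a declaration taken from IL itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* How often a formal parameter is used in the procedure body. */
enum use_count { USED_NEVER, USED_ONCE, USED_MANY };

/* Type of a formal parameter. */
enum formal_type { T_WORD, T_DOUBLE_WORD, T_POINTER };

/* Description of one formal parameter f (hypothetical layout). */
struct formal {
    long             f_offset;     /* offset of f */
    enum formal_type f_type;       /* word, double word or pointer */
    bool             f_inline_ok;  /* may the actual be put in line? */
    bool             f_indirect;   /* is f ever accessed indirectly? */
    enum use_count   f_use;        /* never, once or more than once */
    struct formal   *f_next;       /* next formal, NULL at the end */
};

/* Extra proctable information for a procedure p (hypothetical layout). */
struct proc_info {
    bool           p_inline_ok;         /* may a call to p be put in line? */
    bool           p_params_inline_ok;  /* may any parameter be put in line? */
    int            p_size;              /* number of EM instructions */
    bool           p_falls_through;     /* does p 'fall through'? */
    struct formal *p_formals;           /* formals that p accesses */
    int            p_ncalls;            /* number of times p is called */
    long           p_cc_addr;           /* file address of call-count list */
};

/* Count the formals of p whose actuals may be put in line. */
int count_inline_formals(const struct proc_info *p)
{
    int n = 0;
    for (const struct formal *f = p->p_formals; f != NULL; f = f->f_next)
        if (f->f_inline_ok)
            n++;
    return n;
}
```

A linked list is used for the formals here because their number is only known after the code of p has been inspected; an array would serve equally well.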
 | |
| .NH 4
 | |
| Call-count information
 | |
| .PP
 | |
| As a result of Inline Substitution, some procedures may
 | |
| become useless, because all their invocations have been
 | |
| substituted in line.
 | |
| One of the tasks of IL is to keep track of which
 | |
| procedures are no longer called.
 | |
| Note that IL is especially keen on procedures that are
 | |
| called only once
 | |
| (possibly as a result of expanding all other calls to it).
 | |
| So we want to know how many times a procedure
 | |
| is called \fIduring\fR Inline Substitution.
 | |
| It is not good enough to compute this
 | |
| information afterwards.
 | |
| The task is rather complex, because
 | |
| the number of times a procedure is called
 | |
| varies during the entire process:
 | |
| .IP 1.
 | |
| If a call to p is substituted in line,
 | |
| the number of calls to p gets decremented by 1.
 | |
| .IP 2.
 | |
| If a call to p is substituted in line,
 | |
| and p contains n calls to q, then the number of calls to q
 | |
| gets incremented by n.
 | |
| .IP 3.
 | |
| If a procedure p is removed (because it is no
 | |
| longer called) and p contains n calls to q,
 | |
| then the number of calls to q gets decremented by n.
 | |
| .LP
 | |
| (Note that p may be the same as q, if p is recursive).
 | |
| .sp 0
 | |
| So we actually want to have the following information:
 | |
| .DS
 | |
| NRCALL(p,q) = number of calls to q appearing in p,
 | |
| 
 | |
| for all procedures p and q that may be put in line.
 | |
| .DE
 | |
| This information, called \fIcall-count information\fR, is
 | |
| computed by the first subphase.
 | |
| It is stored in a file.
 | |
| It is represented as a number of lists, rather than as
 | |
| a (very sparse) matrix.
 | |
| Every procedure has a list of (proc,count) pairs,
 | |
| telling which procedures it calls, and how many times.
 | |
| The file address of its call-count list is stored
 | |
| in its proctable entry.
 | |
| Whenever this information is needed, it is fetched from
 | |
| the file, using direct access.
 | |
| The proctable entry also contains the number of times
 | |
| a procedure is called, at any moment.
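The three update rules given earlier can be sketched as follows. The sketch assumes that the call-count lists are held in memory and that procedures are identified by small integers; IL itself fetches the lists from a file. All names are hypothetical.

```c
#include <stddef.h>

/* One (proc, count) pair: the owning procedure calls `callee` `count` times. */
struct cc_pair {
    int callee;
    int count;
};

/* Call-count list of one procedure, i.e. one row of NRCALL. */
struct cc_list {
    const struct cc_pair *pairs;
    size_t                npairs;
};

/* A call to p has been substituted in line (rules 1 and 2).
 * ncalls[q] holds the current number of calls to procedure q. */
void call_expanded(int ncalls[], const struct cc_list nrcall[], int p)
{
    ncalls[p] -= 1;                           /* rule 1 */
    for (size_t i = 0; i < nrcall[p].npairs; i++)
        ncalls[nrcall[p].pairs[i].callee] += nrcall[p].pairs[i].count;  /* rule 2 */
}

/* Procedure p has been removed because it is no longer called (rule 3). */
void proc_removed(int ncalls[], const struct cc_list nrcall[], int p)
{
    for (size_t i = 0; i < nrcall[p].npairs; i++)
        ncalls[nrcall[p].pairs[i].callee] -= nrcall[p].pairs[i].count;
}
```

If p calls a once and b twice, expanding the only call to p raises the counts of a and b by 1 and 2; removing p afterwards lowers them again by the same amounts.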
 | |
| .NH 4
 | |
| The call-list
 | |
| .PP
 | |
| The call-list is the major data structure used by IL.
 | |
| Every item of the list describes one procedure call.
 | |
| It contains the following attributes:
 | |
| .IP -
 | |
| the calling procedure (caller)
 | |
| .IP -
 | |
| the called procedure (callee)
 | |
| .IP -
 | |
| identification of the CAL instruction (sequence number)
 | |
| .IP -
 | |
| the loop nesting level; our heuristic rules value
 | |
| calls inside a loop (or even inside a loop nested inside
 | |
| another loop, etc.) more than other calls
 | |
| .IP -
 | |
| the actual parameter expressions involved in the call;
 | |
| for every actual, we record:
 | |
| .RS
 | |
| .IP -
 | |
| the EM code of the expression
 | |
| .IP -
 | |
| the number of bytes of its result (size)
 | |
| .IP -
 | |
| an indication if the actual may be put in line
 | |
| .RE
 | |
| .LP
 | |
| The structure of the call-list is rather complex.
 | |
| Whenever a call is expanded in line, new calls
 | |
| will suddenly appear in the program,
 | |
| that were not contained in the original body
 | |
| of the calling subroutine.
 | |
| These calls are inherited from the called procedure.
 | |
| We will refer to these invocations as \fInested calls\fR
 | |
| (see Fig. 5.1).
 | |
| .DS
 | |
| procedure p is
 | |
| begin                           .
 | |
|      a();                       .
 | |
|      b();                       .
 | |
| end;
 | |
| 
 | |
| procedure r is            procedure r is
 | |
| begin                     begin
 | |
|      x();                      x();
 | |
|      p();  -- in line          a();  -- nested call
 | |
|      y();                      b();  -- nested call
 | |
| end;                           y();
 | |
|                           end;
 | |
| 
 | |
| Fig. 5.1 Example of nested procedure calls
 | |
| .DE
 | |
| Nested calls may subsequently be put in line too
 | |
| (probably resulting in a yet deeper nesting level, etc.).
 | |
| So the call-list does not always reflect the source program,
 | |
| but changes dynamically, as decisions are made.
 | |
| If a call to p is expanded, all calls appearing in p
 | |
| will be added to the call-list.
 | |
| .sp 0
 | |
| A convenient and elegant way to represent
 | |
| the call-list is to use a LISP-like list.
 | |
| .[
 | |
| poel lisp trac
 | |
| .]
 | |
| Calls that appear at the same level
 | |
| are linked in the CDR direction. If a call C
 | |
| to a procedure p is expanded,
 | |
| all calls appearing in p are put in a sub-list
 | |
| of C, i.e. in its CAR.
 | |
| In the example above, before the decision
 | |
| to expand the call to p is made, the
 | |
| call-list of procedure r looks like:
 | |
| .DS
 | |
| (call-to-x, call-to-p, call-to-y)
 | |
| .DE
 | |
| After the decision, it looks like:
 | |
| .DS
 | |
| (call-to-x, (call-to-p*, call-to-a, call-to-b), call-to-y)
 | |
| .DE
 | |
| The call to p is marked, because it has been
 | |
| substituted.
 | |
| Whenever IL wants to traverse the call-list of some procedure,
 | |
| it uses the well-known LISP technique of
 | |
| recursion in the CAR direction and
 | |
| iteration in the CDR direction
 | |
| (see page 1.19-2 of
 | |
| .[
 | |
| poel lisp trac
 | |
| .]
 | |
| ).
 | |
| All list traversals look like:
 | |
| .DS
 | |
| traverse(list)
 | |
| {
 | |
|     for (c = first(list); c != 0; c = CDR(c)) {
 | |
| 	if (c is marked) {
 | |
| 	    traverse(CAR(c));
 | |
| 	} else {
 | |
| 	    do something with c
 | |
| 	}
 | |
|     }
 | |
| }
 | |
| .DE
 | |
| The entire call-list consists of a number of LISP-like lists,
 | |
| one for every procedure.
 | |
| The proctable entry of a procedure contains a pointer
 | |
| to the beginning of the list.
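The list structure and its traversal might look as follows in C. This is a sketch under stated assumptions: the field names are invented, only the attributes needed for traversal are shown, and the nested calls are supplied by the caller of expand() rather than copied from the callee's own list.

```c
#include <stdbool.h>
#include <stddef.h>

/* One call-list item (hypothetical layout, most attributes omitted). */
struct call {
    const char  *callee;  /* name of the called procedure */
    bool         marked;  /* the call has been chosen for expansion */
    struct call *car;     /* sub-list of nested calls, used if marked */
    struct call *cdr;     /* next call at the same level */
};

/* Mark c as expanded and hang the calls inherited from the callee under it. */
void expand(struct call *c, struct call *nested)
{
    c->marked = true;
    c->car = nested;
}

/* Count the calls that are not expanded:
 * recursion in the CAR direction, iteration in the CDR direction. */
int count_unexpanded(const struct call *list)
{
    int n = 0;
    for (const struct call *c = list; c != NULL; c = c->cdr) {
        if (c->marked)
            n += count_unexpanded(c->car);
        else
            n++;
    }
    return n;
}
```

For the list of procedure r in Fig. 5.1, count_unexpanded() yields 3 before the call to p is expanded (x, p, y) and 4 afterwards (x, a, b, y).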
 | |
| .NH 3
 | |
| The first subphase: procedure analysis
 | |
| .PP
 | |
| The tasks of the first subphase are to determine
 | |
| several attributes of every procedure
 | |
| and to construct the basic call-list,
 | |
| i.e. without nested calls.
 | |
| The size of a procedure is determined
 | |
| by simply counting its EM instructions.
 | |
| Pseudo instructions are skipped.
 | |
| A procedure does not 'fall through' if its CFG
 | |
| contains a basic block
 | |
| that is not the last block of the CFG and
 | |
| that ends on a RET instruction.
 | |
| The formal parameters of a procedure are determined
 | |
| by inspection of
 | |
| its code.
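The fall-through test described above amounts to a single scan over the basic blocks. In this sketch the CFG is reduced to an array of blocks in textual order, with one flag per block; that representation and the names are assumptions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal view of a basic block (hypothetical). */
struct bblock {
    bool ends_in_ret;  /* last instruction of the block is a RET */
};

/* A procedure falls through unless some block other than the last
 * one ends in a RET instruction. */
bool falls_through(const struct bblock *blocks, size_t n)
{
    for (size_t i = 0; i + 1 < n; i++)
        if (blocks[i].ends_in_ret)
            return false;
    return true;
}
```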
 | |
| .PP
 | |
| The call-list is constructed by looking at all CAL instructions
 | |
| appearing in the program.
 | |
| The call-list should only contain calls to procedures
 | |
| that may be put in line.
 | |
| This fact is only known if the procedure was
 | |
| analyzed earlier.
 | |
| If a call to a procedure p appears in the program
 | |
| before the body of p,
 | |
| the call will always be put in the call-list.
 | |
| If p is later found to be unsuitable,
 | |
| the call will be removed from the list by the
 | |
| second subphase.
 | |
| .PP
 | |
| An important issue is the recognition
 | |
| of the actual parameter expressions of the call.
 | |
| The front ends produce messages telling how many
 | |
| bytes of formal parameters every procedure accesses.
 | |
| (If there is no such message for a procedure, it
 | |
| cannot be put in line).
 | |
| The actual parameters together must account for
 | |
| the same number of bytes.
 | |
| A recursive descent parser is used
 | |
| to parse side-effect free EM expressions.
 | |
| It uses a table and some
 | |
| auxiliary routines to determine
 | |
| how many bytes every EM instruction pops from the stack
 | |
| and how many bytes it pushes onto the stack.
 | |
| These numbers depend on the EM instruction, its argument,
 | |
| and the wordsize and pointersize of the target machine.
 | |
| Initially, the parser has to recognize the
 | |
| number of bytes specified in the formals-message,
 | |
| say N.
 | |
| Assume the first instruction before the CAL pops S bytes
 | |
| and pushes R bytes.
 | |
| If R > N, too many bytes are recognized
 | |
| and the parser fails.
 | |
| Else, it calls itself recursively to recognize the
 | |
| S bytes used as operand of the instruction.
 | |
| If it succeeds in doing so, it continues with the next instruction,
 | |
| i.e. the first instruction before the code recognized by
 | |
| the recursive call, to recognize N-R more bytes.
 | |
| The result is a number of EM instructions that collectively push N bytes.
 | |
| If the parser encounters an instruction that has side-effects
 | |
| (e.g. a store or a procedure call) or whose R and S cannot
 | |
| be computed statically (e.g. a LOS), it fails.
 | |
| .sp 0
 | |
| Note that the parser traverses the code backwards.
 | |
| As EM code is essentially postfix code, the parser works top down.
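The backward recognition can be sketched as follows. The sketch assumes that the table lookup has already been done, so that every instruction carries its S (pops) and R (pushes) values together with a flag telling whether it is side-effect free and statically known. The struct and function names are hypothetical.

```c
#include <stdbool.h>

/* Stack behaviour of one EM instruction (hypothetical summary). */
struct em_instr {
    int  pops;    /* S: bytes popped from the stack */
    int  pushes;  /* R: bytes pushed onto the stack */
    bool pure;    /* no side-effects, and S and R are known statically */
};

/* Recognize code that pushes exactly n bytes and ends just before
 * index `end` (the position of the CAL on the first invocation).
 * Returns the index of the first recognized instruction, or -1. */
int recognize(const struct em_instr *code, int end, int n)
{
    while (n > 0) {
        if (end == 0)
            return -1;                    /* ran out of code */
        const struct em_instr *in = &code[end - 1];
        if (!in->pure || in->pushes > n)
            return -1;                    /* side-effect, or R > N */
        /* Recognize the S bytes used as operands of this instruction. */
        int start = recognize(code, end - 1, in->pops);
        if (start < 0)
            return -1;
        n -= in->pushes;                  /* N-R more bytes to go */
        end = start;                      /* continue before the operands */
    }
    return end;
}
```

With a word size of 2, take two loads, an add and another load followed by the CAL, and N = 4: the last load accounts for 2 bytes, the add for the other 2, and the recursive call on the add consumes the first two loads as its operands.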
 | |
| .PP
 | |
| If the parser fails to recognize the parameters, the call will not
 | |
| be substituted in line.
 | |
| If the parameters can be determined, they still have to
 | |
| match the formal parameters of the called procedure.
 | |
| This check is performed by the second subphase; it cannot be
 | |
| done here, because it is possible that the called
 | |
| procedure has not been analyzed yet.
 | |
| .PP
 | |
| The entire call-list is written to a file,
 | |
| to be processed by the second subphase.
 | |
| .NH 3
 | |
| The second subphase: making decisions
 | |
| .PP
 | |
| The task of the second subphase is quite easy
 | |
| to understand.
 | |
| It reads the call-list file,
 | |
| builds an incore call-list and deletes every
 | |
| call that may not be expanded in line (either because the called
 | |
| procedure may not be put in line, or because the actual parameters
 | |
| of the call do not match the formal parameters of the called procedure).
 | |
| It assigns a \fIpay-off\fR to every call,
 | |
| indicating how desirable it is to expand it.
 | |
| .PP
 | |
| The subphase repeatedly scans the call-list and takes
 | |
| the call with the highest ratio.
 | |
| The chosen one gets marked,
 | |
| and the call-list is extended with the nested calls,
 | |
| as described above.
 | |
| These nested calls are also assigned a ratio,
 | |
| and will be considered too during the next scans.
 | |
| .sp 0
 | |
| After every decision the number of times
 | |
| every procedure is called is updated, using
 | |
| the call-count information.
 | |
| Meanwhile, the subphase keeps track of the amount of space left
 | |
| available.
 | |
| If all space is used, or if there are no more calls left to
 | |
| be expanded, it exits this loop.
 | |
| Finally, calls to procedures that are called only
 | |
| once are also chosen.
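The selection loop can be sketched as a repeated scan for the best remaining candidate. The sketch makes several assumptions that go beyond the text: the call-list is flattened to an array, each call has a known space cost, and a call that no longer fits in the remaining space is skipped. It leaves out the nested calls, the call-count updates and the final pass for procedures called only once.

```c
#include <stdbool.h>
#include <stddef.h>

/* Abstract of one call (hypothetical). */
struct cand {
    double ratio;   /* desirability of expanding this call */
    int    cost;    /* assumed space needed to expand it */
    bool   chosen;  /* set once the call has been selected */
};

/* Repeatedly choose the call with the highest ratio that still fits.
 * Returns the number of calls chosen. */
int choose_calls(struct cand *c, size_t n, int space)
{
    int nchosen = 0;
    for (;;) {
        struct cand *best = NULL;
        for (size_t i = 0; i < n; i++) {
            if (c[i].chosen || c[i].cost > space)
                continue;
            if (best == NULL || c[i].ratio > best->ratio)
                best = &c[i];
        }
        if (best == NULL)
            break;             /* no space left or no candidates left */
        best->chosen = true;
        space -= best->cost;
        nchosen++;
    }
    return nchosen;
}
```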
 | |
| .PP
 | |
| The actual parameters of a call are only needed by
 | |
| this subphase to assign a ratio to a call.
 | |
| To save some space, these actuals are not kept in main memory.
 | |
| They are removed after the call has been read and a ratio
 | |
| has been assigned to it.
 | |
| So this subphase works with \fIabstracts\fR of calls.
 | |
| After all work has been done,
 | |
| the actual parameters of the chosen calls are retrieved
 | |
| from a file,
 | |
| as they are needed by the transformation subphase.
 | |
| .NH 3
 | |
| The third subphase: doing transformations
 | |
| .PP
 | |
| The third subphase makes the actual modifications to
 | |
| the EM text.
 | |
| It is directed by the decisions made in the previous subphase,
 | |
| as expressed via the call-list.
 | |
| The call-list read by this subphase contains
 | |
| only calls that were selected for expansion.
 | |
| The list is ordered in the same way as the EM text,
 | |
| i.e. if a call C1 appears before a call C2 in the call-list,
 | |
| C1 also appears before C2 in the EM text.
 | |
| So the EM text is traversed linearly,
 | |
| the calls that have to be substituted are determined
 | |
| and the modifications are made.
 | |
| If a procedure is encountered that is no longer needed,
 | |
| it is simply not written to the output EM file.
 | |
| The substitution of a call takes place in distinct steps:
 | |
| .IP "change the calling sequence" 7
 | |
| .sp 0
 | |
| The actual parameter expressions are changed.
 | |
| Parameters that are put in line are removed.
 | |
| All remaining ones must store their result in a
 | |
| temporary local variable, rather than
 | |
| push it on the stack.
 | |
| The CAL instruction and any ASP (to pop actual parameters)
 | |
| or LFR (to fetch the result of a function)
 | |
| are deleted.
 | |
| .IP "fetch the text of the called procedure"
 | |
| .sp 0
 | |
| Direct disk access is used to read the text of the
 | |
| called procedure.
 | |
| The file offset is obtained from the proctable entry.
 | |
| .IP "allocate bytes for locals and temporaries"
 | |
| .sp 0
 | |
| The local variables of the called procedure will be put in the
 | |
| stack frame of the calling procedure.
 | |
| The same applies to any temporary variables
 | |
| that hold the result of parameters
 | |
| that were not put in line.
 | |
| The proctable entry of the caller is updated.
 | |
| .IP "put a label after the CAL"
 | |
| .sp 0
 | |
| If the called procedure contains a RET (return) instruction
 | |
| somewhere in the middle of its text (i.e. it does
 | |
| not fall through), the RET must be changed into
 | |
| a BRA (branch) to this label, to jump over the
 | |
| remainder of the text.
 | |
| This label is not needed if the called
 | |
| procedure falls through.
 | |
| .IP "copy the text of the called procedure and modify it"
 | |
| .sp 0
 | |
| References to local variables of the called routine
 | |
| and to parameters that are not put in line
 | |
| are changed to refer to the
 | |
| new locals of the caller.
 | |
| References to in line parameters are replaced
 | |
| by the actual parameter expression.
 | |
| Returns (RETs) are either deleted or
 | |
| replaced by a BRA.
 | |
| Messages containing information about local
 | |
| variables or parameters are changed.
 | |
| Global data declarations and the PRO and END pseudos
 | |
| are removed.
 | |
| Instruction labels and references to them are
 | |
| changed to make sure they do not have the
 | |
| same identifying number as
 | |
| labels in the calling procedure.
 | |
| .IP "insert the modified text"
 | |
| .sp 0
 | |
| The pseudos of the called procedure are put after the pseudos
 | |
| of the calling procedure.
 | |
| The real text of the callee is put at
 | |
| the place where the CAL was.
 | |
| .IP "take care of nested substitutions"
 | |
| .sp 0
 | |
| The expanded procedure may contain calls that
 | |
| have to be expanded too (nested calls).
 | |
| If the descriptor of such a call contains actual
 | |
| parameter expressions,
 | |
| the code of the expressions has to be changed
 | |
| the same way as the code of the callee was changed.
 | |
| Next, the entire process of finding CALs and doing
 | |
| the substitutions is repeated recursively.
 | |
| .LP
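Two of the modifications made to the copied text, renumbering the instruction labels and rewriting the returns, can be sketched as follows. The instruction representation is hypothetical, and the sketch assumes that a RET is deleted only when it is the very last instruction of the text; every other RET becomes a BRA to the label placed after the CAL.

```c
#include <stddef.h>

enum opcode { OP_LAB, OP_BRA, OP_RET, OP_OTHER };

/* One instruction of the copied callee text (hypothetical). */
struct text_instr {
    enum opcode op;
    int         arg;  /* label number for OP_LAB and OP_BRA */
};

/* Adapt the callee text in place and return its new length.
 * label_base is added to every label number to keep the labels
 * distinct from those of the caller; exit_label is the label that
 * was put after the CAL. */
size_t adapt_body(struct text_instr *text, size_t n,
                  int label_base, int exit_label)
{
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        struct text_instr in = text[i];
        switch (in.op) {
        case OP_LAB:
        case OP_BRA:
            in.arg += label_base;      /* renumber label or reference */
            break;
        case OP_RET:
            if (i == n - 1)
                continue;              /* final RET: simply fall through */
            in.op = OP_BRA;            /* RET in the middle: jump to the end */
            in.arg = exit_label;
            break;
        default:
            break;
        }
        text[out++] = in;
    }
    return out;
}
```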
 |