.NH 2
Implementation
.PP
A major factor in the implementation
of Inline Substitution is the requirement
not to use an excessive amount of memory.
IL essentially analyzes the entire program;
it makes decisions based on which procedure calls
appear in the whole program.
Yet, because of the memory restriction, it is
not feasible to read the entire program
into main memory.
To solve this problem, the IL phase has been
split up into three subphases that are executed sequentially:
.IP 1.
analyze every procedure; see how it accesses its parameters;
simultaneously collect all calls
appearing in the whole program and put them
in a \fIcall-list\fR.
.IP 2.
use the call-list and decide which calls will be substituted
in line.
.IP 3.
take the decisions of subphase 2 and modify the
program accordingly.
.LP
Subphases 1 and 3 scan the input program; only
subphase 3 modifies it.
It is essential that the decisions can be made
in subphase 2
without using the input program,
provided that subphase 1 puts enough information
in the call-list.
Subphase 2 keeps the entire call-list in main memory
and repeatedly scans it, to
find the next best candidate for expansion.
.PP
We will specify the
data structures used by IL before
describing the subphases.
.NH 3
Data structures
.NH 4
The procedure table
.PP
In subphase 1 information is gathered about every procedure
and added to the procedure table.
This information is used by the heuristic rules.
A proctable entry for procedure p has
the following extra information:
.IP -
is it allowed to substitute an invocation of p in line?
.IP -
is it allowed to put any parameter of such a call in line?
.IP -
the size of p (number of EM instructions)
.IP -
does p 'fall through'?
.IP -
a description of the formal parameters that p accesses; this information
is obtained by looking at the code of p. For every parameter f,
we record:
.RS
.IP -
the offset of f
.IP -
the type of f (word, double word, pointer)
.IP -
may the corresponding actual parameter be put in line?
.IP -
is f ever accessed indirectly?
.IP -
is f used never, once or more than once?
.RE
.IP -
the number of times p is called (see below)
.IP -
the file address of its call-count information (see below).
.LP
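.PP
As an illustration only, a proctable entry with the attributes
listed above could be declared in C roughly as follows.
All identifiers are hypothetical and are not taken from the IL sources.
The rule in formal_inlinable that combines the per-procedure
and per-parameter flags is likewise an assumption of this sketch.
.DS
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; only the list of fields comes from the text. */
enum formal_type { F_WORD, F_DWORD, F_POINTER };
enum formal_use  { USED_NEVER, USED_ONCE, USED_MANY };

struct formal {
    long offset;             /* offset of the formal */
    enum formal_type type;   /* word, double word or pointer */
    int may_inline;          /* may the actual be put in line? */
    int indirect;            /* is the formal ever accessed indirectly? */
    enum formal_use use;     /* never, once or more than once */
    struct formal *next;     /* next formal of the same procedure */
};

struct proc_entry {
    int may_substitute;      /* may calls to p be substituted in line? */
    int may_inline_params;   /* may any parameter be put in line? */
    int size;                /* number of EM instructions */
    int falls_through;       /* does p fall through? */
    struct formal *formals;  /* formals that p accesses */
    int times_called;        /* current number of calls to p */
    long cc_offset;          /* file address of the call-count list */
};

/* Assumed rule: an actual may be put in line only if the procedure
 * flags and the flag of the formal all allow it. */
int formal_inlinable(const struct proc_entry *p, const struct formal *f)
{
    assert(p != NULL && f != NULL);
    return p->may_substitute && p->may_inline_params && f->may_inline;
}

/* Find the descriptor of the formal at a given offset, or NULL. */
struct formal *find_formal(const struct proc_entry *p, long offset)
{
    struct formal *f;

    assert(p != NULL);
    for (f = p->formals; f != NULL; f = f->next)
        if (f->offset == offset)
            return f;
    return NULL;
}
```
.DE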
.NH 4
Call-count information
.PP
As a result of Inline Substitution, some procedures may
become useless, because all their invocations have been
substituted in line.
One of the tasks of IL is to keep track of which
procedures are no longer called.
Note that IL is especially keen on procedures that are
called only once
(possibly as a result of expanding all other calls to it).
So we want to know how many times a procedure
is called \fIduring\fR Inline Substitution.
It is not good enough to compute this
information afterwards.
The task is rather complex, because
the number of times a procedure is called
varies during the entire process:
.IP 1.
If a call to p is substituted in line,
the number of calls to p gets decremented by 1.
.IP 2.
If a call to p is substituted in line,
and p contains n calls to q, then the number of calls to q
gets incremented by n.
.IP 3.
If a procedure p is removed (because it is no
longer called) and p contains n calls to q,
then the number of calls to q gets decremented by n.
.LP
(Note that p may be the same as q, if p is recursive).
.sp 0
So we actually want to have the following information:
.DS
NRCALL(p,q) = number of calls to q appearing in p,

for all procedures p and q that may be put in line.
.DE
This information, called \fIcall-count information\fR, is
computed by the first subphase.
It is stored in a file.
It is represented as a number of lists, rather than as
a (very sparse) matrix.
Every procedure has a list of (proc,count) pairs,
telling which procedures it calls, and how many times.
The file address of its call-count list is stored
in its proctable entry.
Whenever this information is needed, it is fetched from
the file, using direct access.
The proctable entry also contains the number of times
a procedure is called, at any moment.
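.PP
The three update rules can be sketched in C as shown below.
This is a simplified model under stated assumptions, not the IL code:
procedures are small integers, the (proc,count) lists are kept in
fixed-size arrays in memory instead of in a file, and all names are
hypothetical.
The sketch also updates the list of the calling procedure when a call
is substituted, which the text does not spell out.
.DS
```c
#include <assert.h>

#define MAXPROC 8                 /* illustrative limits */
#define MAXPAIR 8

struct pair { int proc; int count; };          /* one (proc,count) pair */
struct cc_list { int n; struct pair p[MAXPAIR]; };

struct cc_list calls_in[MAXPROC]; /* calls_in[p] holds NRCALL(p,q) for all q */
int times_called[MAXPROC];        /* number of calls to each procedure */

/* Record that procedure p contains count more calls to q. */
void add_calls(int p, int q, int count)
{
    struct cc_list *l = &calls_in[p];
    int i;

    for (i = 0; i < l->n && l->p[i].proc != q; i++)
        ;
    if (i == l->n) {              /* q is not yet in the list of p */
        assert(l->n < MAXPAIR);
        l->p[i].proc = q;
        l->p[i].count = 0;
        l->n++;
    }
    l->p[i].count += count;
    times_called[q] += count;
}

/* Rules 1 and 2: one call to p, appearing in caller, is substituted.
 * Precondition: caller really contains a call to p. */
void call_substituted(int caller, int p)
{
    struct cc_list copy = calls_in[p];   /* copy, as p may equal caller */
    int i;

    times_called[p]--;                                   /* rule 1 */
    for (i = 0; i < calls_in[caller].n; i++)
        if (calls_in[caller].p[i].proc == p)
            calls_in[caller].p[i].count--;
    for (i = 0; i < copy.n; i++)                         /* rule 2 */
        add_calls(caller, copy.p[i].proc, copy.p[i].count);
}

/* Rule 3: p is removed because it is no longer called. */
void proc_removed(int p)
{
    int i;

    for (i = 0; i < calls_in[p].n; i++)
        times_called[calls_in[p].p[i].proc] -= calls_in[p].p[i].count;
    calls_in[p].n = 0;
}
```
.DE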
.NH 4
The call-list
.PP
The call-list is the major data structure used by IL.
Every item of the list describes one procedure call.
It contains the following attributes:
.IP -
the calling procedure (caller)
.IP -
the called procedure (callee)
.IP -
identification of the CAL instruction (sequence number)
.IP -
the loop nesting level; our heuristic rules appreciate
calls inside a loop (or even inside a loop nested inside
another loop, etc.) more than other calls
.IP -
the actual parameter expressions involved in the call;
for every actual, we record:
.RS
.IP -
the EM code of the expression
.IP -
the number of bytes of its result (size)
.IP -
an indication if the actual may be put in line
.RE
.LP
The structure of the call-list is rather complex.
Whenever a call is expanded in line, new calls
will suddenly appear in the program,
that were not contained in the original body
of the calling subroutine.
These calls are inherited from the called procedure.
We will refer to these invocations as \fInested calls\fR
(see Fig. 5.1).
.DS
.TS
lw(2.5i) l.
procedure p is
begin	.
     a();	.
     b();	.
end;
.TE

.TS
lw(2.5i) l.
procedure r is	procedure r is
begin	begin
     x();	    x();
     p();  -- in line	    a();  -- nested call
     y();	    b();  -- nested call
end;	    y();
	end;
.TE

Fig. 5.1 Example of nested procedure calls
.DE
Nested calls may subsequently be put in line too
(probably resulting in a yet deeper nesting level, etc.).
So the call-list does not always reflect the source program,
but changes dynamically, as decisions are made.
If a call to p is expanded, all calls appearing in p
will be added to the call-list.
.sp 0
A convenient and elegant way to represent
the call-list is to use a LISP-like list.
.[
poel lisp trac
.]
Calls that appear at the same level
are linked in the CDR direction. If a call C
to a procedure p is expanded,
all calls appearing in p are put in a sub-list
of C, i.e. in its CAR.
In the example above, before the decision
to expand the call to p is made, the
call-list of procedure r looks like:
.DS
(call-to-x, call-to-p, call-to-y)
.DE
After the decision, it looks like:
.DS
(call-to-x, (call-to-p*, call-to-a, call-to-b), call-to-y)
.DE
The call to p is marked, because it has been
substituted.
Whenever IL wants to traverse the call-list of some procedure,
it uses the well-known LISP technique of
recursion in the CAR direction and
iteration in the CDR direction
(see page 1.19-2 of
.[
poel lisp trac
.]
).
All list traversals look like:
.DS
traverse(list)
{
    for (c = first(list); c != 0; c = CDR(c)) {
        if (c is marked) {
            traverse(CAR(c));
        } else {
            do something with c
        }
    }
}
.DE
The entire call-list consists of a number of LISP-like lists,
one for every procedure.
The proctable entry of a procedure contains a pointer
to the beginning of the list.
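.PP
The list representation and the traversal scheme can be made concrete
with the following C sketch.
It is a minimal model, assuming a call item that carries only the
name of the callee; the type and function names are hypothetical.
The function collect follows the traversal pattern shown above
and gathers all calls that have not been expanded.
.DS
```c
#include <assert.h>
#include <stddef.h>

struct call {
    const char *callee;  /* name of the called procedure */
    int marked;          /* set once the call is chosen for expansion */
    struct call *car;    /* sub-list of nested calls (marked calls only) */
    struct call *cdr;    /* next call at the same level */
};

/* Mark call c as expanded and hang the calls of its callee under it. */
void expand(struct call *c, struct call *nested)
{
    assert(c != NULL && !c->marked);
    c->marked = 1;
    c->car = nested;
}

/* Recursion in the CAR direction, iteration in the CDR direction.
 * Stores the callee names of all unmarked calls in out[], starting
 * at index n, and returns the new number of entries (at most max). */
int collect(const struct call *list, const char **out, int n, int max)
{
    const struct call *c;

    for (c = list; c != NULL; c = c->cdr) {
        if (c->marked)
            n = collect(c->car, out, n, max);
        else if (n < max)
            out[n++] = c->callee;
    }
    return n;
}
```
.DE
.PP
Applied to the example of Fig. 5.1, the list of r yields the calls
to x, p and y before the expansion of p, and the calls to x, a, b and y
afterwards.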
.NH 3
The first subphase: procedure analysis
.PP
The tasks of the first subphase are to determine
several attributes of every procedure
and to construct the basic call-list,
i.e. without nested calls.
The size of a procedure is determined
by simply counting its EM instructions.
Pseudo instructions are skipped.
A procedure does not 'fall through' if its CFG
contains a basic block
that is not the last block of the CFG and
that ends on a RET instruction.
The formal parameters of a procedure are determined
by inspection of
its code.
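.PP
The fall-through test described above amounts to a single scan over
the basic blocks.
The sketch below assumes that the blocks of the CFG are available
as an array in textual order, so that the last array element is the
last block; the names are hypothetical.
.DS
```c
#include <assert.h>

struct block {
    int ends_in_ret;   /* last instruction of the basic block is a RET */
};

/* Returns 1 if no block other than the last one ends in a RET,
 * i.e. if the procedure falls through, and 0 otherwise. */
int falls_through(const struct block *blocks, int nblocks)
{
    int i;

    assert(nblocks >= 0);
    for (i = 0; i + 1 < nblocks; i++)
        if (blocks[i].ends_in_ret)
            return 0;
    return 1;
}
```
.DE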
.PP
The call-list is constructed by looking at all CAL instructions
appearing in the program.
The call-list should only contain calls to procedures
that may be put in line.
This fact is only known if the procedure was
analyzed earlier.
If a call to a procedure p appears in the program
before the body of p,
the call will always be put in the call-list.
If p is later found to be unsuitable,
the call will be removed from the list by the
second subphase.
.PP
An important issue is the recognition
of the actual parameter expressions of the call.
The front ends produce messages telling how many
bytes of formal parameters every procedure accesses.
(If there is no such message for a procedure, it
cannot be put in line).
The actual parameters together must account for
the same number of bytes.
A recursive descent parser is used
to parse side-effect free EM expressions.
It uses a table and some
auxiliary routines to determine
how many bytes every EM instruction pops from the stack
and how many bytes it pushes onto the stack.
These numbers depend on the EM instruction, its argument,
and the wordsize and pointersize of the target machine.
Initially, the parser has to recognize the
number of bytes specified in the formals-message,
say N.
Assume the first instruction before the CAL pops S bytes
and pushes R bytes.
If R > N, too many bytes are recognized
and the parser fails.
Otherwise, it calls itself recursively to recognize the
S bytes used as operand of the instruction.
If it succeeds in doing so, it continues with the next instruction,
i.e. the first instruction before the code recognized by
the recursive call, to recognize N-R more bytes.
The result is a number of EM instructions that collectively push N bytes.
If the parser comes across an instruction that has side effects
(e.g. a store or a procedure call) or for which R and S cannot
be computed statically (e.g. a LOS), it fails.
.sp 0
Note that the parser traverses the code backwards.
As EM code is essentially postfix code, the parser works top down.
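.PP
The backward recognition of N bytes can be sketched in C as follows.
This is an illustration under simplifying assumptions, not the parser
used by IL: every instruction is described by a precomputed record
giving S and R, whereas the real numbers come from a table and depend
on the argument and on the word and pointer sizes.
The record layout and the function name are hypothetical.
.DS
```c
#include <assert.h>

struct instr {
    const char *mnem;   /* mnemonic, for readability only */
    int pops;           /* S: bytes popped, -1 if not known statically */
    int pushes;         /* R: bytes pushed, -1 if not known statically */
    int side_effect;    /* nonzero for stores, calls and the like */
};

/* Recognize code that pushes exactly n bytes and whose final
 * instruction is code[last]; the code is scanned backwards.
 * Returns the index of the first recognized instruction, or -1 on
 * failure. For n equal to 0 the empty sequence is recognized and
 * last + 1 is returned. */
int recognize(const struct instr *code, int last, int n)
{
    int start;

    assert(n >= 0 && last >= -1);
    while (n > 0) {
        if (last < 0)
            return -1;                   /* ran out of code */
        if (code[last].side_effect ||
            code[last].pops < 0 || code[last].pushes < 0)
            return -1;                   /* e.g. a store, a call, a LOS */
        if (code[last].pushes > n)
            return -1;                   /* R > N: too many bytes */
        start = recognize(code, last - 1, code[last].pops);
        if (start < 0)
            return -1;                   /* operands not recognized */
        n -= code[last].pushes;          /* N-R bytes remain */
        last = start - 1;                /* continue before the operands */
    }
    return last + 1;
}
```
.DE
.PP
For a word size of 2 and the sequence STL, LOL, LOL, LOL, ADI
preceding a CAL, a request for 4 bytes recognizes the last four
instructions; a request for 6 bytes fails, because the STL has a
side effect.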
.PP
If the parser fails to recognize the parameters, the call will not
be substituted in line.
If the parameters can be determined, they still have to
match the formal parameters of the called procedure.
This check is performed by the second subphase; it cannot be
done here, because it is possible that the called
procedure has not been analyzed yet.
.PP
The entire call-list is written to a file,
to be processed by the second subphase.
.NH 3
The second subphase: making decisions
.PP
The task of the second subphase is quite easy
to understand.
It reads the call-list file,
builds an in-core call-list and deletes every
call that may not be expanded in line (either because the called
procedure may not be put in line, or because the actual parameters
of the call do not match the formal parameters of the called procedure).
It assigns a \fIpay-off\fR to every call,
indicating how desirable it is to expand it.
.PP
The subphase repeatedly scans the call-list and takes
the call with the highest ratio.
The chosen one gets marked,
and the call-list is extended with the nested calls,
as described above.
These nested calls are also assigned a ratio,
and will be considered too during the next scans.
.sp 0
After every decision the number of times
every procedure is called is updated, using
the call-count information.
Meanwhile, the subphase keeps track of the amount of space left
available.
If all space is used, or if there are no more calls left to
be expanded, it exits this loop.
Finally, calls to procedures that are called only
once are also chosen.
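.PP
The core of the selection loop can be sketched as follows.
The sketch assumes that every candidate already carries a ratio and
a space cost, and that a candidate too large for the remaining space
is simply skipped; both the field names and the skipping rule are
assumptions.
The extension of the list with nested calls, the call-count updates
and the final selection of procedures called only once are left out.
.DS
```c
#include <assert.h>

struct cand {
    double ratio;   /* desirability of expanding this call (assumed given) */
    int cost;       /* space the expansion would consume (assumed given) */
    int chosen;     /* set when the call is selected */
};

/* Index of the unchosen call with the highest ratio that still fits
 * in the remaining space, or -1 if there is none. */
int best(const struct cand *c, int n, int space)
{
    int i, b = -1;

    for (i = 0; i < n; i++)
        if (!c[i].chosen && c[i].cost <= space &&
            (b < 0 || c[i].ratio > c[b].ratio))
            b = i;
    return b;
}

/* Repeatedly scan the list and take the best candidate, until the
 * space is used up or no candidate is left.
 * Returns the number of calls chosen. */
int select_calls(struct cand *c, int n, int space)
{
    int picked = 0, b;

    assert(n >= 0 && space >= 0);
    while ((b = best(c, n, space)) >= 0) {
        c[b].chosen = 1;
        space -= c[b].cost;
        picked++;
    }
    return picked;
}
```
.DE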
.PP
The actual parameters of a call are only needed by
this subphase to assign a ratio to a call.
To save some space, these actuals are not kept in main memory.
They are removed after the call has been read and a ratio
has been assigned to it.
So this subphase works with \fIabstracts\fR of calls.
After all work has been done,
the actual parameters of the chosen calls are retrieved
from a file,
as they are needed by the transformation subphase.
.NH 3
The third subphase: doing transformations
.PP
The third subphase makes the actual modifications to
the EM text.
It is directed by the decisions made in the previous subphase,
as expressed via the call-list.
The call-list read by this subphase contains
only calls that were selected for expansion.
The list is ordered in the same way as the EM text,
i.e. if a call C1 appears before a call C2 in the call-list,
C1 also appears before C2 in the EM text.
So the EM text is traversed linearly,
the calls that have to be substituted are determined
and the modifications are made.
If a procedure is encountered that is no longer needed,
it is simply not written to the output EM file.
The substitution of a call takes place in distinct steps:
.IP "change the calling sequence" 7
.sp 0
The actual parameter expressions are changed.
Parameters that are put in line are removed.
All remaining ones must store their result in a
temporary local variable, rather than
push it on the stack.
The CAL instruction and any ASP (to pop actual parameters)
or LFR (to fetch the result of a function)
are deleted.
.IP "fetch the text of the called procedure"
.sp 0
Direct disk access is used to read the text of the
called procedure.
The file offset is obtained from the proctable entry.
.IP "allocate bytes for locals and temporaries"
.sp 0
The local variables of the called procedure will be put in the
stack frame of the calling procedure.
The same applies to any temporary variables
that hold the result of parameters
that were not put in line.
The proctable entry of the caller is updated.
.IP "put a label after the CAL"
.sp 0
If the called procedure contains a RET (return) instruction
somewhere in the middle of its text (i.e. it does
not fall through), the RET must be changed into
a BRA (branch) to a label placed after the CAL, to jump over the
remainder of the text.
This label is not needed if the called
procedure falls through.
.IP "copy the text of the called procedure and modify it"
.sp 0
References to local variables of the called routine
and to parameters that are not put in line
are changed to refer to the
new locals of the caller.
References to in line parameters are replaced
by the actual parameter expression.
Returns (RETs) are either deleted or
replaced by a BRA.
Messages containing information about local
variables or parameters are changed.
Global data declarations and the PRO and END pseudos
are removed.
Instruction labels and references to them are
changed to make sure they do not have the
same identifying number as
labels in the calling procedure.
.IP "insert the modified text"
.sp 0
The pseudos of the called procedure are put after the pseudos
of the calling procedure.
The real text of the callee is put at
the place where the CAL was.
.IP "take care of nested substitutions"
.sp 0
The expanded procedure may contain calls that
have to be expanded too (nested calls).
If the descriptor of this call contains actual
parameter expressions,
the code of the expressions has to be changed
the same way as the code of the callee was changed.
Next, the entire process of finding CALs and doing
the substitutions is repeated recursively.
.LP
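.PP
Two of the modifications made while copying the text of the callee,
the renumbering of labels and the treatment of RETs, can be sketched
as follows.
The sketch works on a toy instruction record and makes several
assumptions that are not taken from the text: labels are renumbered
by adding a fixed base, a deleted RET is represented by a no-op
marker, and only a RET that is the very last instruction is deleted.
All names are hypothetical.
.DS
```c
#include <assert.h>

enum op { OP_OTHER, OP_LAB, OP_BRA, OP_RET, OP_NOP };

struct ins {
    enum op op;
    int label;   /* label number for OP_LAB and OP_BRA, else unused */
};

/* Rewrite a copy of the text of the callee in place.
 * base:       added to every callee label, so that it cannot clash
 *             with a label of the caller (assumed scheme).
 * exit_label: label placed after the former CAL.
 * Returns 1 if the exit label is needed, i.e. if some RET was
 * turned into a BRA, and 0 otherwise. */
int adapt_text(struct ins *text, int n, int base, int exit_label)
{
    int i, need_exit = 0;

    assert(n >= 0);
    for (i = 0; i < n; i++) {
        switch (text[i].op) {
        case OP_LAB:
        case OP_BRA:
            text[i].label += base;
            break;
        case OP_RET:
            if (i == n - 1) {
                text[i].op = OP_NOP;       /* final RET: deleted */
            } else {
                text[i].op = OP_BRA;       /* RET in the middle */
                text[i].label = exit_label;
                need_exit = 1;
            }
            break;
        default:
            break;
        }
    }
    return need_exit;
}
```
.DE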