.NH 2
Implementation
.PP
A major factor in the implementation
of Inline Substitution is the requirement
not to use an excessive amount of memory.
IL essentially analyzes the entire program;
it makes decisions based on which procedure calls
appear in the whole program.
Yet, because of the memory restriction, it is
not feasible to read the entire program
into main memory.
To solve this problem, the IL phase has been
split up into three subphases that are executed sequentially:
.IP 1.
analyze every procedure; see how it accesses its parameters;
simultaneously collect all calls
appearing in the whole program and put them
in a \fIcall-list\fR.
.IP 2.
use the call-list and decide which calls will be substituted
in line.
.IP 3.
take the decisions of subphase 2 and modify the
program accordingly.
.LP
Subphases 1 and 3 scan the input program; only
subphase 3 modifies it.
It is essential that the decisions can be made
in subphase 2
without using the input program,
provided that subphase 1 puts enough information
in the call-list.
Subphase 2 keeps the entire call-list in main memory
and repeatedly scans it, to
find the next best candidate for expansion.
.PP
We will specify the
data structures used by IL before
describing the subphases.
.NH 3
Data structures
.NH 4
The procedure table
.PP
In subphase 1 information is gathered about every procedure
and added to the procedure table.
This information is used by the heuristic rules.
A proctable entry for procedure p has
the following extra information:
.IP -
is it allowed to substitute an invocation of p in line?
.IP -
is it allowed to put any parameter of such a call in line?
.IP -
the size of p (number of EM instructions)
.IP -
does p 'fall through'?
.IP -
a description of the formal parameters that p accesses; this information
is obtained by looking at the code of p. For every parameter f,
we record:
.RS
.IP -
the offset of f
.IP -
the type of f (word, double word, pointer)
.IP -
may the corresponding actual parameter be put in line?
.IP -
is f ever accessed indirectly?
.IP -
is f used never, once or more than once?
.RE
.IP -
the number of times p is called (see below)
.IP -
the file address of its call-count information (see below).
.LP
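.PP
As an illustration, the attributes listed above could be laid out in C
roughly as follows.
This is only a sketch: all type, field and function names are hypothetical
and are not taken from the IL sources.
The helper counts the formals whose actuals may be put in line.
.DS
```c
#include <stddef.h>

/* Hypothetical layout of the IL-specific part of a proctable entry. */
enum param_type { PT_WORD, PT_DOUBLE_WORD, PT_POINTER };
enum use_count  { USED_NEVER, USED_ONCE, USED_MORE };

struct formal {
    long offset;             /* offset of the formal */
    enum param_type type;    /* word, double word or pointer */
    int may_inline;          /* may the actual be put in line? */
    int indirect;            /* is it ever accessed indirectly? */
    enum use_count uses;     /* never, once or more than once */
    struct formal *next;     /* next formal of the same procedure */
};

struct proc_info {
    int may_substitute;      /* may calls to p be substituted? */
    int may_inline_params;   /* may any parameter be put in line? */
    long size;               /* number of EM instructions */
    int falls_through;       /* does p 'fall through'? */
    struct formal *formals;  /* formals that p accesses */
    int times_called;        /* current number of calls to p */
    long callcount_addr;     /* file address of call-count list */
};

/* Count the formals of p whose actuals may be put in line. */
int inlinable_formals(const struct proc_info *p)
{
    const struct formal *f;
    int n = 0;

    if (!p->may_inline_params)
        return 0;
    for (f = p->formals; f != NULL; f = f->next)
        if (f->may_inline)
            n++;
    return n;
}
```
.DE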
.NH 4
Call-count information
.PP
As a result of Inline Substitution, some procedures may
become useless, because all their invocations have been
substituted in line.
One of the tasks of IL is to keep track of which
procedures are no longer called.
Note that IL is especially keen on procedures that are
called only once
(possibly as a result of expanding all other calls to it).
So we want to know how many times a procedure
is called \fIduring\fR Inline Substitution.
It is not good enough to compute this
information afterwards.
The task is rather complex, because
the number of times a procedure is called
varies during the entire process:
.IP 1.
If a call to p is substituted in line,
the number of calls to p gets decremented by 1.
.IP 2.
If a call to p is substituted in line,
and p contains n calls to q, then the number of calls to q
gets incremented by n.
.IP 3.
If a procedure p is removed (because it is no
longer called) and p contains n calls to q,
then the number of calls to q gets decremented by n.
.LP
(Note that p may be the same as q, if p is recursive).
.sp 0
So we actually want to have the following information:
.DS
NRCALL(p,q) = number of calls to q appearing in p,

for all procedures p and q that may be put in line.
.DE
This information, called \fIcall-count information\fR, is
computed by the first subphase.
It is stored in a file.
It is represented as a number of lists, rather than as
a (very sparse) matrix.
Every procedure has a list of (proc,count) pairs,
telling which procedures it calls, and how many times.
The file address of its call-count list is stored
in its proctable entry.
Whenever this information is needed, it is fetched from
the file, using direct access.
The proctable entry also contains the number of times
a procedure is called, at any moment.
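.PP
The three update rules can be sketched in C as follows.
For brevity the sketch keeps NRCALL as a small in-core matrix, whereas IL
stores it as per-procedure lists in a file.
It also adds the inherited calls to the row of the caller; this is an
assumption of the sketch, not something stated above.
All names are hypothetical.
.DS
```c
enum { NPROC = 3 };   /* hypothetical: number of procedures */

struct call_counts {
    int nrcall[NPROC][NPROC];  /* nrcall[p][q] = NRCALL(p,q) */
    int times_called[NPROC];   /* current number of calls to q */
};

/* Derive the initial number of calls to every procedure. */
void init_times_called(struct call_counts *cc)
{
    int p, q;

    for (q = 0; q < NPROC; q++) {
        cc->times_called[q] = 0;
        for (p = 0; p < NPROC; p++)
            cc->times_called[q] += cc->nrcall[p][q];
    }
}

/* A call to p appearing in 'caller' is substituted in line. */
void note_expansion(struct call_counts *cc, int caller, int p)
{
    int row[NPROC];
    int q;

    /* Copy the row of p first; caller may be p itself. */
    for (q = 0; q < NPROC; q++)
        row[q] = cc->nrcall[p][q];
    cc->times_called[p]--;             /* rule 1 */
    cc->nrcall[caller][p]--;           /* assumption of the sketch */
    for (q = 0; q < NPROC; q++) {
        cc->times_called[q] += row[q]; /* rule 2 */
        cc->nrcall[caller][q] += row[q];
    }
}

/* Procedure p is removed because it is no longer called. */
void note_removal(struct call_counts *cc, int p)
{
    int q;

    for (q = 0; q < NPROC; q++) {
        cc->times_called[q] -= cc->nrcall[p][q];  /* rule 3 */
        cc->nrcall[p][q] = 0;
    }
}
```
.DE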
.NH 4
The call-list
.PP
The call-list is the major data structure used by IL.
Every item of the list describes one procedure call.
It contains the following attributes:
.IP -
the calling procedure (caller)
.IP -
the called procedure (callee)
.IP -
identification of the CAL instruction (sequence number)
.IP -
the loop nesting level; our heuristic rules favor
calls inside a loop (or even inside a loop nested inside
another loop, etc.) over other calls
.IP -
the actual parameter expressions involved in the call;
for every actual, we record:
.RS
.IP -
the EM code of the expression
.IP -
the number of bytes of its result (size)
.IP -
an indication if the actual may be put in line
.RE
.LP
The structure of the call-list is rather complex.
Whenever a call is expanded in line, new calls
will suddenly appear in the program,
that were not contained in the original body
of the calling subroutine.
These calls are inherited from the called procedure.
We will refer to these invocations as \fInested calls\fR
(see Fig. 5.1).
.DS
.TS
lw(2.5i) l.
procedure p is
   begin	.
      a();	.
      b();	.
   end;
.TE

.TS
lw(2.5i) l.
procedure r is	procedure r is
   begin	   begin
      x();	      x();
      p();  -- in line	      a(); -- nested call
      y();	      b(); -- nested call
   end;	      y();
	   end;
.TE

Fig. 5.1 Example of nested procedure calls
.DE
Nested calls may subsequently be put in line too
(probably resulting in a yet deeper nesting level, etc.).
So the call-list does not always reflect the source program,
but changes dynamically, as decisions are made.
If a call to p is expanded, all calls appearing in p
will be added to the call-list.
.sp 0
A convenient and elegant way to represent
the call-list is to use a LISP-like list.
.[
poel lisp trac
.]
Calls that appear at the same level
are linked in the CDR direction. If a call C
to a procedure p is expanded,
all calls appearing in p are put in a sub-list
of C, i.e. in its CAR.
In the example above, before the decision
to expand the call to p is made, the
call-list of procedure r looks like:
.DS
(call-to-x, call-to-p, call-to-y)
.DE
After the decision, it looks like:
.DS
(call-to-x, (call-to-p*, call-to-a, call-to-b), call-to-y)
.DE
The call to p is marked, because it has been
substituted.
Whenever IL wants to traverse the call-list of some procedure,
it uses the well-known LISP technique of
recursion in the CAR direction and
iteration in the CDR direction
(see page 1.19-2 of
.[
poel lisp trac
.]
).
All list traversals look like:
.DS
traverse(list)
{
    for (c = first(list); c != 0; c = CDR(c)) {
        if (c is marked) {
            traverse(CAR(c));
        } else {
            do something with c
        }
    }
}
.DE
The entire call-list consists of a number of LISP-like lists,
one for every procedure.
The proctable entry of a procedure contains a pointer
to the beginning of the list.
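.PP
A minimal C sketch of such a list, replaying the example of Fig. 5.1, is
given below.
The struct layout and the names expand and collect are hypothetical;
collect is one instance of the traversal scheme, in which "do something"
means appending the name of the callee to a buffer.
.DS
```c
#include <stddef.h>
#include <string.h>

/* Hypothetical call descriptor; only the list links are shown. */
struct call {
    const char *callee;  /* name of the called procedure */
    int marked;          /* set once the call has been expanded */
    struct call *car;    /* nested calls inherited from callee */
    struct call *cdr;    /* next call at the same level */
};

/* Mark c as expanded and hang the nested calls under its CAR. */
void expand(struct call *c, struct call *nested)
{
    c->marked = 1;
    c->car = nested;
}

/* Append the callees of all unmarked calls to buf, in order.
 * buf must hold a (possibly empty) string on entry. */
void collect(const struct call *list, char *buf, size_t bufsize)
{
    const struct call *c;

    for (c = list; c != NULL; c = c->cdr) {
        if (c->marked) {
            collect(c->car, buf, bufsize);   /* recurse on CAR */
        } else if (strlen(buf) + strlen(c->callee) + 2 <= bufsize) {
            if (buf[0] != 0)
                strcat(buf, " ");
            strcat(buf, c->callee);
        }
    }
}
```
.DE
Before the call to p is expanded, collect yields the callees x, p and y
for procedure r; afterwards it yields x, a, b and y.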
.NH 3
The first subphase: procedure analysis
.PP
The tasks of the first subphase are to determine
several attributes of every procedure
and to construct the basic call-list,
i.e. without nested calls.
The size of a procedure is determined
by simply counting its EM instructions.
Pseudo instructions are skipped.
A procedure does not 'fall through' if its CFG
contains a basic block
that is not the last block of the CFG and
that ends in a RET instruction.
The formal parameters of a procedure are determined
by inspection of
its code.
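.PP
The size and fall-through attributes can be computed in one pass over the
basic blocks, as the following sketch suggests.
It assumes the CFG is available as an array of blocks in textual order and
that pseudo instructions have already been left out of the per-block count;
the struct and function names are hypothetical.
.DS
```c
/* Hypothetical summary of one basic block of a procedure. */
struct basic_block {
    int ninstr;       /* EM instructions, pseudos not counted */
    int ends_in_ret;  /* does the block end in a RET? */
};

/* Size of a procedure: the number of its EM instructions. */
long proc_size(const struct basic_block *cfg, int nblocks)
{
    long size = 0;
    int i;

    for (i = 0; i < nblocks; i++)
        size += cfg[i].ninstr;
    return size;
}

/* 1 if the procedure falls through, 0 if some block other than
 * the last one ends in a RET. */
int falls_through(const struct basic_block *cfg, int nblocks)
{
    int i;

    for (i = 0; i + 1 < nblocks; i++)
        if (cfg[i].ends_in_ret)
            return 0;
    return 1;
}
```
.DE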
.PP
The call-list is constructed by looking at all CAL instructions
appearing in the program.
The call-list should only contain calls to procedures
that may be put in line.
This fact is only known if the procedure was
analyzed earlier.
If a call to a procedure p appears in the program
before the body of p,
the call will always be put in the call-list.
If p is later found to be unsuitable,
the call will be removed from the list by the
second subphase.
.PP
An important issue is the recognition
of the actual parameter expressions of the call.
The front ends produce messages telling how many
bytes of formal parameters every procedure accesses.
(If there is no such message for a procedure, it
cannot be put in line).
The actual parameters together must account for
the same number of bytes.
A recursive descent parser is used
to parse side-effect free EM expressions.
It uses a table and some
auxiliary routines to determine
how many bytes every EM instruction pops from the stack
and how many bytes it pushes onto the stack.
These numbers depend on the EM instruction, its argument,
and the wordsize and pointersize of the target machine.
Initially, the parser has to recognize the
number of bytes specified in the formals-message,
say N.
Assume the first instruction before the CAL pops S bytes
and pushes R bytes.
If R > N, too many bytes are recognized
and the parser fails.
Otherwise, it calls itself recursively to recognize the
S bytes used as operand of the instruction.
If it succeeds in doing so, it continues with the next instruction,
i.e. the first instruction before the code recognized by
the recursive call, to recognize N-R more bytes.
The result is a number of EM instructions that collectively push N bytes.
If the parser comes across an instruction that has side-effects
(e.g. a store or a procedure call) or for which R and S cannot
be computed statically (e.g. a LOS), it fails.
.sp 0
Note that the parser traverses the code backwards.
As EM code is essentially postfix code, the parser works top down.
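.PP
The backward recognition can be sketched in C as follows.
The sketch assumes that the pop and push counts of every instruction have
already been looked up and stored with it, and that a negative count means
"not statically known"; the real parser derives these numbers from a table
and auxiliary routines.
The struct and the function recognize are hypothetical names.
.DS
```c
/* Hypothetical per-instruction summary. */
struct em_instr {
    const char *mnem;  /* mnemonic, for readability only */
    int pops;          /* S: bytes popped, negative if unknown */
    int pushes;        /* R: bytes pushed, negative if unknown */
    int side_effect;   /* e.g. a store or a procedure call */
};

/* Recognize code that pushes exactly n bytes and ends just
 * before index 'end'.  Returns the index of its first
 * instruction, or -1 if the parser fails. */
int recognize(const struct em_instr *code, int end, int n)
{
    while (n > 0) {
        const struct em_instr *in;

        if (end == 0)
            return -1;           /* ran out of instructions */
        in = &code[end - 1];
        if (in->side_effect || in->pops < 0 || in->pushes < 0)
            return -1;
        if (in->pushes > n)
            return -1;           /* R > N: too many bytes */
        /* recognize the S bytes used as operands */
        end = recognize(code, end - 1, in->pops);
        if (end < 0)
            return -1;
        n -= in->pushes;         /* N-R more bytes to go */
    }
    return end;
}
```
.DE
For instance, assuming a wordsize of 2, the sequence LOL, LOL, ADI 2, LOL
in front of a CAL with a 4-byte formals-message is recognized as a whole:
the last LOL accounts for 2 bytes, and the ADI with its two operands
accounts for the other 2.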
.PP
If the parser fails to recognize the parameters, the call will not
be substituted in line.
If the parameters can be determined, they still have to
match the formal parameters of the called procedure.
This check is performed by the second subphase; it cannot be
done here, because it is possible that the called
procedure has not been analyzed yet.
.PP
The entire call-list is written to a file,
to be processed by the second subphase.
.NH 3
The second subphase: making decisions
.PP
The task of the second subphase is quite easy
to understand.
It reads the call-list file,
builds an incore call-list and deletes every
call that may not be expanded in line (either because the called
procedure may not be put in line, or because the actual parameters
of the call do not match the formal parameters of the called procedure).
It assigns a \fIpay-off\fR to every call,
indicating how desirable it is to expand it.
.PP
The subphase repeatedly scans the call-list and takes
the call with the highest ratio.
The chosen one gets marked,
and the call-list is extended with the nested calls,
as described above.
These nested calls are also assigned a ratio,
and will be considered too during the next scans.
.sp 0
After every decision the number of times
every procedure is called is updated, using
the call-count information.
Meanwhile, the subphase keeps track of the amount of space left
available.
If all space is used, or if there are no more calls left to
be expanded, it exits this loop.
Finally, calls to procedures that are called only
once are also chosen.
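.PP
The core of this selection loop can be sketched as follows.
The sketch is deliberately simplified: it works on a flat array of
candidates, does not add nested calls or update call counts, and leaves out
the final step for procedures that are called only once.
Skipping a candidate that no longer fits in the remaining space is an
assumption of the sketch.
All names are hypothetical.
.DS
```c
/* Hypothetical abstract of a call, as used during selection. */
struct candidate {
    double ratio;  /* desirability of expanding the call */
    long cost;     /* space needed for the expansion */
    int chosen;    /* marked for expansion? */
};

/* Repeatedly take the unchosen call with the highest ratio that
 * still fits in the space left.  Returns the number chosen. */
int select_calls(struct candidate *c, int n, long space)
{
    int nchosen = 0;

    for (;;) {
        int best = -1;
        int i;

        for (i = 0; i < n; i++) {
            if (c[i].chosen || c[i].cost > space)
                continue;
            if (best < 0 || c[i].ratio > c[best].ratio)
                best = i;
        }
        if (best < 0)
            break;   /* no space or no candidates left */
        c[best].chosen = 1;
        space -= c[best].cost;
        nchosen++;
    }
    return nchosen;
}
```
.DE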
.PP
The actual parameters of a call are only needed by
this subphase to assign a ratio to a call.
To save some space, these actuals are not kept in main memory.
They are removed after the call has been read and a ratio
has been assigned to it.
So this subphase works with \fIabstracts\fR of calls.
After all work has been done,
the actual parameters of the chosen calls are retrieved
from a file,
as they are needed by the transformation subphase.
.NH 3
The third subphase: doing transformations
.PP
The third subphase makes the actual modifications to
the EM text.
It is directed by the decisions made in the previous subphase,
as expressed via the call-list.
The call-list read by this subphase contains
only calls that were selected for expansion.
The list is ordered in the same way as the EM text,
i.e. if a call C1 appears before a call C2 in the call-list,
C1 also appears before C2 in the EM text.
So the EM text is traversed linearly,
the calls that have to be substituted are determined
and the modifications are made.
If a procedure is encountered that is no longer needed,
it is simply not written to the output EM file.
The substitution of a call takes place in distinct steps:
.IP "change the calling sequence" 7
.sp 0
The actual parameter expressions are changed.
Parameters that are put in line are removed.
All remaining ones must store their result in a
temporary local variable, rather than
push it on the stack.
The CAL instruction and any ASP (to pop actual parameters)
or LFR (to fetch the result of a function)
are deleted.
.IP "fetch the text of the called procedure"
.sp 0
Direct disk access is used to read the text of the
called procedure.
The file offset is obtained from the proctable entry.
.IP "allocate bytes for locals and temporaries"
.sp 0
The local variables of the called procedure will be put in the
stack frame of the calling procedure.
The same applies to any temporary variables
that hold the result of parameters
that were not put in line.
The proctable entry of the caller is updated.
.IP "put a label after the CAL"
.sp 0
If the called procedure contains a RET (return) instruction
somewhere in the middle of its text (i.e. it does
not fall through), the RET must be changed into
a BRA (branch) to this label, to jump over the
remainder of the text.
The label is not needed if the called
procedure falls through.
.IP "copy the text of the called procedure and modify it"
.sp 0
References to local variables of the called routine
and to parameters that are not put in line
are changed to refer to the
new locals of the caller.
References to in line parameters are replaced
by the actual parameter expression.
Returns (RETs) are either deleted or
replaced by a BRA.
Messages containing information about local
variables or parameters are changed.
Global data declarations and the PRO and END pseudos
are removed.
Instruction labels and references to them are
changed to make sure they do not have the
same identifying number as
labels in the calling procedure.
.IP "insert the modified text"
.sp 0
The pseudos of the called procedure are put after the pseudos
of the calling procedure.
The real text of the callee is put at
the place where the CAL was.
.IP "take care of nested substitutions"
.sp 0
The expanded procedure may contain calls that
have to be expanded too (nested calls).
If the descriptor of this call contains actual
parameter expressions,
the code of the expressions has to be changed
the same way as the code of the callee was changed.
Next, the entire process of finding CALs and doing
the substitutions is repeated recursively.
.LP
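.PP
Two of the modifications made while copying the text of the callee, namely
the treatment of RETs and the renumbering of instruction labels, can be
sketched as follows.
The instruction representation is hypothetical and heavily simplified.
The sketch assumes that a RET at the very end of the text is deleted and
every other RET becomes a BRA to the label put after the CAL, and that
labels are made unique by adding a fixed base to their numbers.
.DS
```c
/* Hypothetical, simplified instruction representation. */
enum op { OP_RET, OP_BRA, OP_LAB, OP_OTHER };

struct ins {
    enum op op;
    int arg;   /* label number for OP_LAB and OP_BRA */
};

/* Adapt the copied text of the callee in place.
 * label_base is added to every label number of the callee;
 * exit_label is the label that was put after the CAL.
 * Returns the new number of instructions. */
int adapt_body(struct ins *body, int n, int label_base,
               int exit_label)
{
    int i, out = 0;

    for (i = 0; i < n; i++) {
        struct ins in = body[i];

        if (in.op == OP_RET) {
            if (i == n - 1)
                continue;         /* final RET: just delete it */
            in.op = OP_BRA;       /* jump over the remainder */
            in.arg = exit_label;
        } else if (in.op == OP_LAB || in.op == OP_BRA) {
            in.arg += label_base; /* avoid label clashes */
        }
        body[out++] = in;
    }
    return out;
}
```
.DE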