172 lines
5.6 KiB
Plaintext
172 lines
5.6 KiB
Plaintext
.bp
|
|
.NH 1
|
|
Stack pollution
|
|
.NH 2
|
|
Introduction
|
|
.PP
|
|
The "Stack Pollution" optimization technique (SP) decreases the costs
|
|
(time as well as space) of procedure calls.
|
|
In the EM calling sequence, the actual parameters are popped from
|
|
the stack by the \fIcalling\fR procedure.
|
|
The ASP (Adjust Stack Pointer) instruction is used for this purpose.
|
|
A call in EM is shown in Fig. 8.1
|
|
.DS
|
|
Pascal: EM:
|
|
|
|
f(a,2) LOC 2
|
|
LOE A
|
|
CAL F
|
|
ASP 4 -- pop 4 bytes
|
|
|
|
Fig. 8.1 An example procedure call in Pascal and EM
|
|
.DE
|
|
As procedure calls occur often in most programs,
|
|
the ASP is one of the most frequently used EM instructions.
|
|
.PP
|
|
The main intention of removing the actual parameters after a procedure call
|
|
is to avoid the stack size to increase rapidly.
|
|
Yet, in some cases, it is possible to \fIdelay\fR or even \fIavoid\fR the
|
|
removal of the parameters without letting the stack grow
|
|
significantly.
|
|
In this way, considerable savings in code size and execution time may
|
|
be achieved, at the cost of a slightly increased stack size.
|
|
.PP
|
|
A stack adjustment may be delayed if there is some other stack adjustment
|
|
later on in the same basic block.
|
|
The two ASPs can be combined into one.
|
|
.DS
|
|
Pascal: EM: optimized EM:
|
|
|
|
f(a,2) LOC 2 LOC 2
|
|
g(3,b,c) LOE A LOE A
|
|
CAL F CAL F
|
|
ASP 4 LOE C
|
|
LOE C LOE B
|
|
LOE B LOC 3
|
|
LOC 3 CAL G
|
|
CAL G ASP 10
|
|
ASP 6
|
|
|
|
Fig. 8.2 An example of local Stack Pollution
|
|
.DE
|
|
The stacksize will be increased only temporarily.
|
|
If the basic block contains another ASP, the ASP 10 may subsequently be
|
|
combined with that next ASP, and so on.
|
|
.PP
|
|
For some back ends, a stack adjustment also takes place
|
|
at the point of a procedure return.
|
|
There is no need to specify the number of bytes to be popped at a
|
|
return.
|
|
This provides an opportunity to remove ASPs more globally.
|
|
If all ASPs outside any loop are removed, the increase of the
|
|
stack size will still only be small, as no such ASP is executed more
|
|
than once without an intervening return from the procedure it is part of.
|
|
.PP
|
|
This second approach is not generally applicable to all target machines,
|
|
as some back ends require the stack to be cleaned up at the point of
|
|
a procedure return.
|
|
.NH 2
|
|
Implementation
|
|
.PP
|
|
There is one main problem the implementation has to solve.
|
|
In EM, the stack is not only used for passing parameters,
|
|
but also for evaluating expressions.
|
|
Hence, ASP instructions can only be combined or removed
|
|
if certain conditions are satisfied.
|
|
.PP
|
|
Two consecutive ASPs of one basic block can only be combined
|
|
(as described above) if:
|
|
.IP 1.
|
|
On no point of text in between the two ASPs, any item is popped from
|
|
the stack that was pushed onto it before the first ASP.
|
|
.IP 2.
|
|
The number of bytes popped from the stack by the second ASP must equal
|
|
the number of bytes pushed since the first ASP.
|
|
.LP
|
|
Condition 1. is not satisfied in Fig. 8.3.
|
|
.DS
|
|
Pascal: EM:
|
|
|
|
5 + f(10) + g(30) LOC 5
|
|
LOC 10
|
|
CAL F
|
|
ASP 2 -- cannot be removed
|
|
LFR 2 -- push function result
|
|
ADI 2
|
|
LOC 30
|
|
CAL G
|
|
ASP 2
|
|
LFR 2
|
|
ADI 2
|
|
Fig. 8.3 An illegal transformation
|
|
.DE
|
|
If the first ASP were removed (delayed), the first ADI would add
|
|
10 and f(10), instead of 5 and f(10).
|
|
.sp
|
|
Condition 2. is not satisfied in Fig. 8.4.
|
|
.DS
|
|
Pascal: EM:
|
|
|
|
f(10) + 5 * g(30) LOC 10
|
|
CAL F
|
|
ASP 2
|
|
LFR 2
|
|
LOC 5
|
|
LOC 30
|
|
CAL G
|
|
ASP 2
|
|
LFR 2
|
|
MLI 2 -- 5 * g(30)
|
|
ADI 2
|
|
|
|
Fig. 8.4 A second illegal transformation
|
|
.DE
|
|
If the two ASPs were combined into one 'ASP 4', the constant 5 would
|
|
have been popped, rather than the parameter 10 (so '10 + f(10)*g(30)'
|
|
would have been computed).
|
|
.PP
|
|
The second approach to deleting ASPs (i.e. let the procedure return
|
|
do the stack clean-up)
|
|
is only applied to the last ASP of every basic block.
|
|
Any preceding ASPs are dealt with by the first approach.
|
|
The last ASP of a basic block B will only be removed if:
|
|
.IP -
|
|
on no path in the control flow graph from B to any block containing a
|
|
RET (return) there is a basic block that, at some point of its text, pops
|
|
items from the stack that it has not itself pushed earlier.
|
|
.LP
|
|
Clearly, if this condition is satisfied, no harm can be done; no
|
|
other basic block will ever access items that were pushed
|
|
on the stack before the ASP.
|
|
.PP
|
|
The number of bytes pushed onto or popped from the stack can be
|
|
easily encoded in a so called "pop-push table".
|
|
The numbers in general depend on the target machine word- and pointer
|
|
size and on the argument given to the instruction.
|
|
For example, an ADS instruction is described by:
|
|
.DS
|
|
-a-p+p
|
|
.DE
|
|
which means: an 'ADS n' first pops an n-byte value (n being the argument),
|
|
next pops a pointer-size value and finally pushes a pointer-size value.
|
|
For some infrequently used EM instructions the pop-push numbers
|
|
cannot be computed statically.
|
|
.PP
|
|
The stack pollution algorithm first performs a depth first search over
|
|
the control flow graph and marks all blocks that do not satisfy
|
|
the global condition.
|
|
Next it visits all basic blocks in turn.
|
|
For every pair of adjacent ASPs, it checks conditions 1. and 2. and
|
|
combines the ASPs if they are satisfied.
|
|
The new ASP may be used as first ASP in the next pair.
|
|
If a condition fails, it simply continues with the next ASP.
|
|
Finally, the last ASP is removed if:
|
|
.IP -
|
|
nothing has been popped from the stack after the last ASP that was
|
|
pushed before it
|
|
.IP -
|
|
the block was not marked by the depth first search
|
|
.IP -
|
|
the block is not in a loop
|
|
.LP
|