145 lines
3.7 KiB
Plaintext
145 lines
3.7 KiB
Plaintext
.bp
|
|
.NH 1
|
|
Cross jumping
|
|
.NH 2
|
|
Introduction
|
|
.PP
|
|
The "Cross Jumping" optimization technique (CJ)
|
|
.[
|
|
wulf design optimizing compiler
|
|
.]
|
|
is basically a space optimization technique. It looks for pairs of
|
|
basic blocks (B1,B2), for which:
|
|
.DS
|
|
SUCC(B1) = SUCC(B2) = {S}
|
|
.DE
|
|
(So B1 and B2 both have one and the same successor).
|
|
If the last few non-branch instructions are the same for B1 and B2,
|
|
one such sequence can be eliminated.
|
|
.DS
|
|
Pascal:
|
|
|
|
if cond then
|
|
S1
|
|
S3
|
|
else
|
|
S2
|
|
S3
|
|
|
|
(pseudo) EM:
|
|
.TS
|
|
l l l.
|
|
TEST COND TEST COND
|
|
BNE *1 BNE *1
|
|
S1 S1
|
|
S3 ---> BRA *2
|
|
BRA *2 1:
|
|
1: S2
|
|
S2 2:
|
|
S3 S3
|
|
2:
|
|
.TE
|
|
|
|
Fig. 9.1 An example of Cross Jumping
|
|
.DE
|
|
As the basic blocks have the same successor,
|
|
at least one of them ends in an unconditional branch instruction (BRA).
|
|
Hence no extra branch instruction is ever needed, just the target
|
|
of an existing branch needs to be changed; neither the program size
|
|
nor the execution time will ever increase.
|
|
In general, the execution time will remain the same, unless
|
|
further optimizations can be applied because of this optimization.
|
|
.PP
|
|
This optimization is particularly effective,
|
|
because it cannot always be done by the programmer at the source level,
|
|
as demonstrated by the Fig. 8.2.
|
|
.DS
|
|
Pascal:
|
|
|
|
if cond then
|
|
x := f(4)
|
|
else
|
|
x := g(5)
|
|
|
|
|
|
EM:
|
|
|
|
.TS
|
|
l l.
|
|
... ...
|
|
LOC 4 LOC 5
|
|
CAL F CAL G
|
|
ASP 2 ASP 2
|
|
LFR 2 LFR 2
|
|
STL X STL X
|
|
.TE
|
|
|
|
Fig. 9.2 Effectiveness of Cross Jumping
|
|
.DE
|
|
At the source level there is no common tail,
|
|
but at the EM level there is a common tail.
|
|
.NH 2
|
|
Implementation
|
|
.PP
|
|
The implementation of cross jumping is rather straightforward.
|
|
The technique is applied to one procedure at a time.
|
|
The control flow graph of the procedure
|
|
is scanned for pairs of basic blocks
|
|
with the same (single) successor and with common tails.
|
|
Note that there may be more than two such blocks (e.g. as the result
|
|
of a case statement).
|
|
This is dealt with by repeating the entire process until no
|
|
further optimizations can de done for the current procedure.
|
|
.sp
|
|
If a suitable pair of basic blocks has been found, the control flow
|
|
graph must be altered. One of the basic
|
|
blocks must be split into two.
|
|
The control flow graphs before and after the optimization are shown
|
|
in Fig. 9.3 and Fig. 9.4.
|
|
.DS
|
|
.ft 5
|
|
|
|
-------- --------
|
|
| | | |
|
|
| S1 | | S2 |
|
|
| S3 | | S3 |
|
|
| | | |
|
|
-------- --------
|
|
| |
|
|
|------------------|--------------------|
|
|
|
|
|
v
|
|
.ft R
|
|
|
|
Fig. 9.3 CFG before optimization
|
|
.DE
|
|
.DS
|
|
.ft 5
|
|
-------- --------
|
|
| | | |
|
|
| S1 | | S2 |
|
|
| | | |
|
|
-------- --------
|
|
| |
|
|
|--------------------<------------------|
|
|
v
|
|
--------
|
|
| |
|
|
| S3 |
|
|
| |
|
|
--------
|
|
|
|
|
v
|
|
.ft R
|
|
|
|
Fig. 9.4 CFG after optimization
|
|
.DE
|
|
Some attributes of the three resulting blocks (such as immediate dominator)
|
|
are updated.
|
|
.PP
|
|
In some cases, cross jumping might split the computation of an expression
|
|
into two, by inserting a branch somewhere in the middle.
|
|
Most code generators will generate very poor assembly code when
|
|
presented with such EM code.
|
|
Therefor, cross jumping is not performed in these cases.
|