.bp
.NH 1
Branch Optimization
.NH 2
Introduction
.PP
The Branch Optimization phase (BO) performs two related
(branch) optimizations.
.NH 3
Fusion of basic blocks
.PP
If two basic blocks B1 and B2 have the following properties:
.DS
SUCC(B1) = {B2}
PRED(B2) = {B1}
.DE
then B1 and B2 can be combined into one basic block.
If B1 ends in an unconditional jump to the beginning of B2, this
jump can be eliminated,
hence saving a little execution time and object code size.
This technique can be used to eliminate some deficiencies
introduced by the front ends (for example, the "C" front end
translates switch statements inefficiently due to its one-pass nature).
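.PP
The fusion condition and the merge itself can be illustrated by a minimal
Python sketch.
It is not taken from the BO sources: the dictionary-based block
representation and the names can_fuse and fuse are assumptions made
for this example, and the textual order of the blocks is ignored here.
.DS
```python
# Hypothetical CFG: each block has a list of instructions (tuples),
# a set of successor names and a set of predecessor names.
def can_fuse(cfg, b1, b2):
    """True if SUCC(b1) = {b2} and PRED(b2) = {b1}."""
    return (b1 != b2
            and cfg[b1]["succ"] == {b2}
            and cfg[b2]["pred"] == {b1})


def fuse(cfg, b1, b2):
    """Merge b2 into b1, dropping a trailing unconditional jump to b2."""
    if not can_fuse(cfg, b1, b2):
        raise ValueError("blocks do not satisfy the fusion condition")
    instrs = list(cfg[b1]["instrs"])
    if instrs and instrs[-1] == ("BRA", b2):
        instrs.pop()                      # the jump is now redundant
    cfg[b1]["instrs"] = instrs + list(cfg[b2]["instrs"])
    cfg[b1]["succ"] = set(cfg[b2]["succ"])
    for s in cfg[b1]["succ"]:             # successors now see b1, not b2
        cfg[s]["pred"] = (cfg[s]["pred"] - {b2}) | {b1}
    del cfg[b2]
    return cfg


cfg = {
    "B1": {"instrs": [("S1",), ("BRA", "B2")],
           "succ": {"B2"}, "pred": set()},
    "B2": {"instrs": [("S2",), ("RET",)],
           "succ": set(), "pred": {"B1"}},
}
fuse(cfg, "B1", "B2")
print(cfg["B1"]["instrs"])
```
.DE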
.NH 3
While-loop optimization
.PP
The straightforward way to translate a while loop is to
put the test for loop termination at the beginning of the loop.
.DS
while cond loop \kyLAB1: \kxTest cond
body of the loop --->\h'|\nxu'Branch On False To LAB2
end loop\h'|\nxu'code for body of loop
\h'|\nxu'Branch To LAB1
\h'|\nyu'LAB2:
Fig. 10.1 Example of Branch Optimization
.DE
If the condition fails at the Nth iteration, the following code
gets executed (dynamically):
.DS
.TS
l l l.
N	*	conditional branch (which fails N-1 times)
N-1	*	unconditional branch
N-1	*	body of the loop
.TE
.DE
An alternative translation is:
.DS
Branch To LAB2
LAB1:
code for body of loop
LAB2:
Test cond
Branch On True To LAB1
.DE
This translation results in the following profile:
.DS
.TS
l l l.
N	*	conditional branch (which succeeds N-1 times)
1	*	unconditional branch
N-1	*	body of the loop
.TE
.DE
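So the second translation executes N-2 fewer unconditional branches
and will be significantly faster if N >> 2.
By these counts the two translations break even at N=2;
if N=1 (the body is never executed), the second translation executes
one extra unconditional branch, so execution time will be slightly increased.
On average, the program will be sped up.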
Note that the code sizes of the two translations will be the same.
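.PP
The two profiles can be tabulated with a small Python sketch.
The function name branch_profile and its arguments are illustrative
assumptions; the counts are simply those of the two tables above.
.DS
```python
def branch_profile(n, test_at_top):
    """Count executed instructions when the condition fails at the n-th test.

    test_at_top=True models the straightforward translation,
    test_at_top=False the alternative with the test at the bottom.
    """
    if n < 1:
        raise ValueError("the condition is tested at least once")
    return {
        "conditional": n,
        "unconditional": n - 1 if test_at_top else 1,
        "body": n - 1,
    }


for n in (1, 2, 10):
    top = branch_profile(n, test_at_top=True)
    bottom = branch_profile(n, test_at_top=False)
    saved = top["unconditional"] - bottom["unconditional"]
    print(n, top["unconditional"], bottom["unconditional"], saved)
```
.DE
.PP
The sketch counts executed instructions only; it does not model any
difference in cost between a taken and an untaken conditional branch.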
.NH 2
Implementation
.PP
The basic block fusion technique is implemented
by traversing the control flow graph of a procedure,
looking for basic blocks B with only one successor (S).
If one is found, it is checked whether S has only one predecessor
(which has to be B).
If so, the two basic blocks can in principle be combined.
However, as one basic block will have to be moved,
the textual order of the basic blocks will be altered.
This reordering causes severe problems in the presence
of conditional jumps.
For example, if S ends in a conditional branch,
the basic block that comes textually next to S must stay
in that position.
So the transformation in Fig. 10.2 is illegal.
.DS
.TS
l l l l l.
LAB1:	S1		LAB1:	S1
	BRA LAB2			S2
	...	-->		BEQ LAB3
LAB2:	S2			...
	BEQ LAB3			S3
	S3
.TE
Fig. 10.2 An illegal transformation of Branch Optimization
.DE
If B is moved towards S, the same problem occurs if the block before B
ends in a conditional jump.
The problem could be solved by adding one extra branch,
but this would reduce the gains of the optimization to zero.
Hence the optimization will only be done if the block that
follows S (in the textual order) is not a successor of S.
This condition ensures that S does not end in a conditional branch.
The condition always holds for the code generated by the "C"
front end for a switch statement.
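.PP
The search for fusible blocks, including the restriction on the textual
order, can be sketched in Python as follows.
The representation (a list of block names in textual order plus a
successor map) and the names preds and fusion_candidates are assumptions
for illustration, not the data structures used by BO.
.DS
```python
def preds(succ):
    """Derive predecessor sets from successor sets."""
    pred = {b: set() for b in succ}
    for b, targets in succ.items():
        for t in targets:
            pred[t].add(b)
    return pred


def fusion_candidates(order, succ):
    """Return the (B, S) pairs that may be fused.

    order: block names in textual order (hypothetical representation)
    succ:  dict mapping a block name to its set of successors
    """
    pred = preds(succ)
    result = []
    for b in order:
        if len(succ[b]) != 1:
            continue
        (s,) = succ[b]
        if s == b or pred[s] != {b}:
            continue
        i = order.index(s)
        follower = order[i + 1] if i + 1 < len(order) else None
        # Reject the pair if the block textually after S is a successor of S.
        if follower is not None and follower in succ[s]:
            continue
        result.append((b, s))
    return result


order = ["LAB1", "X", "LAB2", "S3", "LAB3"]

# The situation of Fig. 10.2: LAB2 ends in a conditional branch.
illegal = {"LAB1": {"LAB2"}, "X": {"LAB3"}, "LAB2": {"S3", "LAB3"},
           "S3": {"LAB3"}, "LAB3": set()}

# A variant in which LAB2 ends in an unconditional branch to LAB3.
legal = {"LAB1": {"LAB2"}, "X": {"S3"}, "LAB2": {"LAB3"},
         "S3": {"LAB3"}, "LAB3": set()}

print(fusion_candidates(order, illegal))
print(fusion_candidates(order, legal))
```
.DE
.PP
In the first graph no pair is reported, because the block after LAB2 is
one of its successors; in the second graph only LAB1 and LAB2 qualify.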
.PP
After the transformation has been performed,
some attributes of the basic blocks involved (such as successor and
predecessor sets and immediate dominator) must be recomputed.
.PP
The while-loop technique is applied to one loop at a time.
The list of basic blocks of the loop is traversed to find
a block B that satisfies the following conditions:
.IP 1.
the block that textually follows B is not part of the loop
.IP 2.
the last instruction of B is an unconditional branch;
hence B has only one successor, say S
.IP 3.
the block that textually follows B is a successor of S
.IP 4.
the last instruction of S is a conditional branch
.LP
If such a block B is found, the control flow graph is changed
as depicted in Fig. 10.3.
.DS
.ft 5
| |
| v
v |
|-----<------| ----->-----|
____|____ | |
| | | |-------| |
| S1 | | | v |
| Bcc | | | .... |
|--| | | | |
| --------- | | ----|---- |
| | | | | |
| .... ^ | | S2 | |
| | | | | |
| --------- | | | | |
v | | | ^ --------- |
| | S2 | | | | |
| | BRA | | | |-----<-----
| | | | | v
| --------- | | ____|____
| | | | | |
| ------>------ | | S1 |
| | | Bnn |
|-------| | | |
| | ----|----
v | |
|----<--|
|
v
.ft R
Fig. 10.3 Transformation of the CFG by Branch Optimization
.DE
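.PP
The search for a block B that satisfies conditions 1 to 4 can be sketched
in Python as follows.
The function name find_loop_end_block, the tags "BRA" and "Bcc" for the
kind of the last instruction, and the list and dictionary representation
are assumptions made for this example.
The sketch only locates B and S; the rewriting of the control flow graph
shown in Fig. 10.3 is not attempted.
.DS
```python
def find_loop_end_block(order, loop, succ, last_kind):
    """Return (B, S) for the first block B meeting conditions 1 to 4.

    order:     block names in textual order
    loop:      set of block names belonging to the loop
    succ:      dict mapping a block name to its set of successors
    last_kind: dict mapping a block name to "BRA", "Bcc" or "other"
    """
    for i, b in enumerate(order):
        if b not in loop:
            continue
        nxt = order[i + 1] if i + 1 < len(order) else None
        # 1. the textually next block must exist and lie outside the loop
        if nxt is None or nxt in loop:
            continue
        # 2. B ends in an unconditional branch, so it has one successor S
        if last_kind[b] != "BRA" or len(succ[b]) != 1:
            continue
        (s,) = succ[b]
        # 3. the textually next block of B is a successor of S
        if nxt not in succ[s]:
            continue
        # 4. S ends in a conditional branch
        if last_kind[s] != "Bcc":
            continue
        return b, s
    return None


# The loop of Fig. 10.1: TEST holds the test, BODY the loop body.
order = ["TEST", "BODY", "EXIT"]
loop = {"TEST", "BODY"}
succ = {"TEST": {"BODY", "EXIT"}, "BODY": {"TEST"}, "EXIT": set()}
last_kind = {"TEST": "Bcc", "BODY": "BRA", "EXIT": "other"}

print(find_loop_end_block(order, loop, succ, last_kind))
```
.DE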