.bp
.NH 1
Branch Optimization
.NH 2
Introduction
.PP
The Branch Optimization phase (BO) performs two related
branch optimizations: the fusion of basic blocks and
the optimization of while loops.
.NH 3
Fusion of basic blocks
.PP
If two basic blocks B1 and B2 have the following properties:
.DS
SUCC(B1) = {B2}
PRED(B2) = {B1}
.DE
then B1 and B2 can be combined into one basic block.
If B1 ends in an unconditional jump to the beginning of B2, this
jump can be eliminated,
which saves a little execution time and object code size.
This technique can be used to eliminate some inefficiencies
introduced by the front ends (for example, the "C" front end
translates switch statements inefficiently because of its one-pass nature).
.NH 3
While-loop optimization
.PP
The straightforward way to translate a while loop is to
put the test for loop termination at the beginning of the loop.
.DS
while cond loop          \kyLAB1:  \kxTest cond
   body of the loop --->\h'|\nxu'Branch On False To LAB2
end loop\h'|\nxu'code for body of loop
\h'|\nxu'Branch To LAB1
\h'|\nyu'LAB2:

Fig. 10.1 Example of Branch Optimization
.DE
If the condition fails at the Nth iteration (that is, the Nth time
it is tested), the following code
gets executed (dynamically):
.DS
.TS
tab(@);
l l l.
N@*@conditional branch (which fails N-1 times)
N-1@*@unconditional branch
N-1@*@body of the loop
.TE
.DE
An alternative translation is:
.DS
     Branch To LAB2
LAB1:
     code for body of loop
LAB2:
     Test cond
     Branch On True To LAB1
.DE
This translation results in the following profile:
.DS
.TS
tab(@);
l l l.
N@*@conditional branch (which succeeds N-1 times)
1@*@unconditional branch
N-1@*@body of the loop
.TE
.DE
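So the second translation will be significantly faster if N >> 2.
If N = 2, both translations execute the same number of branches,
so nothing is gained.
If N = 1 (the body is never executed), the second translation executes
one extra unconditional branch, so execution time is slightly increased.
On average, the program will run faster.
Note that the two translations have the same code size.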
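.PP
The two profiles can be cross-checked by stepping through both
translations and counting what is executed.
The following is only a minimal Python sketch under stated assumptions:
the function names are hypothetical, and n stands for the N of the text
(the test at which the condition first fails, n >= 1).
.DS
```python
def straightforward(n):
    """Count what runs when the test sits at the top of the loop.

    Returns (conditional branches, unconditional branches, bodies).
    """
    cond = uncond = body = 0
    tests = 0
    while True:
        tests += 1          # LAB1: Test cond
        cond += 1           # Branch On False To LAB2
        if tests == n:      # condition fails: leave the loop
            break
        body += 1           # code for body of loop
        uncond += 1         # Branch To LAB1
    return cond, uncond, body


def alternative(n):
    """Same counts for the translation with the test at the bottom."""
    cond = uncond = body = 0
    uncond += 1             # Branch To LAB2 (executed once, on entry)
    tests = 0
    while True:
        tests += 1          # LAB2: Test cond
        cond += 1           # Branch On True To LAB1
        if tests == n:      # condition fails: fall out of the loop
            break
        body += 1           # LAB1: code for body of loop
    return cond, uncond, body
```
.DE
.PP
For n = 10, straightforward(10) yields (10, 9, 9) and alternative(10)
yields (10, 1, 9), in agreement with the two tables.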
.NH 2
Implementation
.PP
The basic block fusion technique is implemented
by traversing the control flow graph of a procedure,
looking for basic blocks B with only one successor (S).
If one is found, it is checked whether S has only one predecessor
(which has to be B).
If so, the two basic blocks can in principle be combined.
However, as one basic block will have to be moved,
the textual order of the basic blocks will be altered.
This reordering causes severe problems in the presence
of conditional jumps.
For example, if S ends in a conditional branch,
the basic block that textually follows S must stay
in that position, because control falls through to it
when the branch is not taken.
So the transformation in Fig. 10.2 is illegal.
.DS
.TS
tab(@);
l l l l l.
LAB1:@S1@@LAB1:@S1
@BRA LAB2@@@S2
@...@-->@@BEQ LAB3
LAB2:@S2@@@...
@BEQ LAB3@@@S3
@S3
.TE

Fig. 10.2 An illegal transformation of Branch Optimization
.DE
If B is moved towards S, the same problem occurs if the block before B
ends in a conditional jump.
The problem could be solved by adding one extra branch,
but this would reduce the gains of the optimization to zero.
Hence the optimization will only be done if the block that
follows S (in the textual order) is not a successor of S.
This condition ensures that S does not end in a conditional branch.
The condition always holds for the code generated by the "C"
front end for a switch statement.
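.PP
The test described above can be sketched in Python.
This is an illustration under stated assumptions, not the code of the BO phase:
the function name and the representation (a list giving the textual order,
and dictionaries of successor and predecessor sets) are hypothetical.
.DS
```python
def can_fuse(b, order, succ, pred):
    """Return the block S that b may be fused with, or None.

    order : block names in textual order (hypothetical representation)
    succ  : dict mapping a block to the set of its successors
    pred  : dict mapping a block to the set of its predecessors
    """
    if len(succ[b]) != 1:
        return None                     # B must have exactly one successor
    (s,) = succ[b]
    if s == b:
        return None                     # assumption: skip a block that only loops to itself
    if pred[s] != {b}:
        return None                     # S must have B as its only predecessor
    # The block that textually follows S must not be a successor of S;
    # otherwise S may fall through into it and must stay where it is.
    i = order.index(s)
    if i + 1 < len(order) and order[i + 1] in succ[s]:
        return None
    return s
```
.DE
.PP
Applied to the left-hand side of Fig. 10.2, with S the block starting at LAB2,
the check returns None, because the block containing S3 textually follows S
and is also one of its successors.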
.PP
After the transformation has been performed,
some attributes of the basic blocks involved (such as successor and
predecessor sets and immediate dominator) must be recomputed.
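.PP
How the successor and predecessor sets change when S is merged into B
can be sketched as follows.
This is a hypothetical Python illustration: the function name and the
representation (a list for the textual order, dictionaries of sets for
successors and predecessors) are assumptions, and the recomputation of
immediate dominators is not modelled.
.DS
```python
def fuse(b, s, order, succ, pred):
    """Merge block s into block b, updating the three structures in place.

    Assumes s is the only successor of b and b the only predecessor of s.
    """
    # b inherits the successors of s; s no longer exists as a block.
    succ[b] = succ.pop(s)
    del pred[s]
    # Every former successor of s now has b as a predecessor instead.
    for t in succ[b]:
        pred[t].discard(s)
        pred[t].add(b)
    # The code of s now sits at the end of b, so s leaves the textual order.
    order.remove(s)
```
.DE
.PP
For a chain B1, B2, B3 in which each block has only the next one as
successor, fusing B1 and B2 leaves the order B1, B3, with B3 as the only
successor of B1 and B1 as the only predecessor of B3.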
.PP
The while-loop technique is applied to one loop at a time.
The list of basic blocks of the loop is traversed to find
a block B that satisfies the following conditions:
.IP 1.
the block that textually follows B is not part of the loop
.IP 2.
the last instruction of B is an unconditional branch;
hence B has only one successor, say S
.IP 3.
the block that textually follows B is a successor of S
.IP 4.
the last instruction of S is a conditional branch
.LP
If such a block B is found, the control flow graph is changed
as depicted in Fig. 10.3.
.DS
.ft 5
        |                                 |
        |                                 v
        v                                 |
        |-----<------|                    ----->-----|
    ____|____        |                               |
    |       |        |            |-------|          |
    |  S1   |        |            |       v          |
    |  Bcc  |        |            |     ....         |
 |--|       |        |            |       |          |
 |  ---------        |            |   ----|----      |
 |      |            |            |   |       |      |
 |    ....           ^            |   |  S2   |      |
 |      |            |            |   |       |      |
 |  ---------        |            ^   ---------      |
 v  |       |        |            |       |          |
 |  |  S2   |        |            |       |-----<----|
 |  |  BRA  |        |            |       v
 |  |       |        |            |   ____|____
 |  ---------        |            |   |       |
 |      |            |            |   |  S1   |
 |      ------>------|            |   |  Bnn  |
 |                                |   |       |
 |------|                         |   ----|----
        |                         |       |
        v                         |----<--|
                                          |
                                          v
.ft R

Fig. 10.3 Transformation of the CFG by Branch Optimization
.DE
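.PP
The search for a block B that meets conditions 1 to 4 can be sketched
in Python.
This is a minimal illustration under stated assumptions, not the code of the
BO phase: the function name, the parameters and the way block endings are
recorded (two sets of block names) are all hypothetical.
.DS
```python
def find_candidate(loop, order, succ, ends_uncond, ends_cond):
    """Return a pair (B, S) satisfying conditions 1 to 4, or None.

    loop        : list of the blocks that form the loop
    order       : all blocks of the procedure in textual order
    succ        : dict mapping a block to the set of its successors
    ends_uncond : blocks whose last instruction is an unconditional branch
    ends_cond   : blocks whose last instruction is a conditional branch
    """
    for b in loop:
        i = order.index(b)
        if i + 1 >= len(order):
            continue                    # assumption: no textually next block, skip b
        nxt = order[i + 1]
        if nxt in loop:                 # condition 1
            continue
        if b not in ends_uncond or len(succ[b]) != 1:   # condition 2
            continue
        (s,) = succ[b]
        if nxt not in succ[s]:          # condition 3
            continue
        if s not in ends_cond:          # condition 4
            continue
        return b, s
    return None
```
.DE
.PP
For the loop of Fig. 10.1, with one block for the test and one for the body,
the body block is found as B and the test block as S:
the block after the body lies outside the loop and is the target of the
conditional branch that ends the test block.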