.bp
.NH 1
Branch Optimization
.NH 2
Introduction
.PP
The Branch Optimization phase (BO) performs two related
branch optimizations: fusion of basic blocks and while-loop optimization.
.NH 3
Fusion of basic blocks
.PP
If two basic blocks B1 and B2 have the following properties:
.DS
SUCC(B1) = {B2}
PRED(B2) = {B1}
.DE
then B1 and B2 can be combined into one basic block.
If B1 ends in an unconditional jump to the beginning of B2, this
jump can be eliminated,
saving a little execution time and object code size.
This technique can be used to eliminate some deficiencies
introduced by the front ends (for example, the "C" front end
translates switch statements inefficiently due to its one-pass nature).
.NH 3
While-loop optimization
.PP
The straightforward way to translate a while loop is to
put the test for loop termination at the beginning of the loop.
.DS
while cond loop                       \kyLAB1: \kxTest cond
   body of the loop     --->\h'|\nxu'Branch On False To LAB2
end loop\h'|\nxu'code for body of loop
\h'|\nxu'Branch To LAB1
\h'|\nyu'LAB2:

Fig. 10.1 Example of Branch Optimization
.DE
If the condition fails at the Nth iteration, the following code
gets executed (dynamically):
.DS
.TS
l l l.
N	*	conditional branch (which fails N-1 times)
N-1	*	unconditional branch
N-1	*	body of the loop
.TE
.DE
An alternative translation is:
.DS
     Branch To LAB2
LAB1:
     code for body of loop
LAB2:
     Test cond
     Branch On True To LAB1
.DE
This translation results in the following profile:
.DS
.TS
l l l.
N	*	conditional branch (which succeeds N-1 times)
1	*	unconditional branch
N-1	*	body of the loop
.TE
.DE
So the second translation will be significantly faster if N >> 2.
If N=1 (the body is never executed), execution time will be slightly
increased by the extra unconditional branch;
if N=2, both translations execute the same number of branches.
On average, the program will be sped up.
Note that the code sizes of the two translations are the same.
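.PP
The two profiles can be checked with a small simulation.
The following Python sketch is illustrative only and is not part of the optimizer;
the function names and the count labels are assumptions made for this example.
It counts the branches executed by each translation
when the condition fails on the Nth test.

```python
def run_test_at_top(n):
    """Simulate the straightforward translation (test at the top).

    The condition fails on its n-th evaluation (n >= 1).
    """
    counts = {"cond": 0, "uncond": 0, "body": 0}
    evaluations = 0
    while True:
        evaluations += 1          # LAB1: Test cond
        counts["cond"] += 1       # Branch On False To LAB2
        if evaluations == n:      # condition fails: leave the loop
            break
        counts["body"] += 1       # code for body of loop
        counts["uncond"] += 1     # Branch To LAB1
    return counts


def run_test_at_bottom(n):
    """Simulate the alternative translation (test at the bottom)."""
    counts = {"cond": 0, "uncond": 1, "body": 0}  # initial Branch To LAB2
    evaluations = 0
    while True:
        evaluations += 1          # LAB2: Test cond
        counts["cond"] += 1       # Branch On True To LAB1
        if evaluations == n:      # condition fails: fall out of the loop
            break
        counts["body"] += 1       # LAB1: code for body of loop
    return counts


if __name__ == "__main__":
    for n in (1, 2, 5, 100):
        print(n, run_test_at_top(n), run_test_at_bottom(n))
```
.PP
For N=5 the first translation executes 4 unconditional branches and the second only 1;
in general the difference is (N-1) - 1 = N-2 unconditional branches.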
.NH 2
Implementation
.PP
The basic block fusion technique is implemented
by traversing the control flow graph of a procedure,
looking for basic blocks B with only one successor (S).
If one is found, it is checked whether S has only one predecessor
(which has to be B).
If so, the two basic blocks can in principle be combined.
However, as one basic block will have to be moved,
the textual order of the basic blocks will be altered.
This reordering causes severe problems in the presence
of conditional jumps.
For example, if S ends in a conditional branch,
the basic block that textually follows S must stay
in that position, because control falls through into it
when the branch is not taken.
So the transformation in Fig. 10.2 is illegal.
.DS
.TS
l l l l l.
LAB1:	S1		LAB1:	S1
	BRA LAB2			S2
	...	-->		BEQ LAB3
LAB2:	S2			...
	BEQ LAB3			S3
	S3
.TE

Fig. 10.2 An illegal transformation of Branch Optimization
.DE
If B is moved towards S, the same problem occurs if the block before B
ends in a conditional jump.
The problem could be solved by adding one extra branch,
but this would reduce the gains of the optimization to zero.
Hence the optimization will only be done if the block that
follows S (in the textual order) is not a successor of S.
This condition ensures that S does not end in a conditional branch.
The condition always holds for the code generated by the "C"
front end for a switch statement.
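.PP
The search for fusable blocks described above can be sketched as follows.
This is a minimal Python illustration under assumptions, not the code of the optimizer:
the control flow graph is assumed to be given as a list of block names in textual order
plus a dictionary of successor lists, and the names fusion_candidates and predecessors
are hypothetical.
Only the case in which S is moved up to B is checked.

```python
def predecessors(succ):
    """Derive predecessor lists from a dict mapping block -> successor list."""
    pred = {name: [] for name in succ}
    for name, targets in succ.items():
        for target in targets:
            pred[target].append(name)
    return pred


def fusion_candidates(order, succ):
    """Return (B, S) pairs that may be fused.

    order: block names in textual order (assumed representation).
    succ:  dict mapping each block name to its list of successors.
    """
    pred = predecessors(succ)
    position = {name: i for i, name in enumerate(order)}
    pairs = []
    for b in order:
        if len(succ[b]) != 1:
            continue                      # B must have exactly one successor
        s = succ[b][0]
        if s == b or pred[s] != [b]:
            continue                      # S must have B as its only predecessor
        after_s = position[s] + 1
        if after_s < len(order) and order[after_s] in succ[s]:
            continue                      # block after S is a successor of S
        pairs.append((b, s))
    return pairs


if __name__ == "__main__":
    # The situation of Fig. 10.2: B2 ends in a conditional branch.
    order = ["B1", "X", "B2", "B3", "L3"]
    succ = {"B1": ["B2"], "X": ["L3"], "B2": ["L3", "B3"],
            "B3": ["L3"], "L3": []}
    print(fusion_candidates(order, succ))
```
.PP
For the graph of Fig. 10.2 the sketch reports no candidates,
because the block that textually follows LAB2 is one of its successors.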
.PP
After the transformation has been performed,
some attributes of the basic blocks involved (such as successor and
predecessor sets and immediate dominator) must be recomputed.
.PP
The while-loop technique is applied to one loop at a time.
The list of basic blocks of the loop is traversed to find
a block B that satisfies the following conditions:
.IP 1.
the textually next block of B is not part of the loop
.IP 2.
the last instruction of B is an unconditional branch;
hence B has only one successor, say S
.IP 3.
the textually next block of B is a successor of S
.IP 4.
the last instruction of S is a conditional branch
.LP
If such a block B is found, the control flow graph is changed
as depicted in Fig. 10.3.
.DS
.ft 5
       |                                    |
       |                                    v
       v                                    |
       |-----<------|                       ----->-----|
   ____|____        |                                  |
   |       |        |               |-------|          |
   |  S1   |        |               |       v          |
   |  Bcc  |        |               |     ....         |
|--|       |        |               |                  |
|  ---------        |               |   ----|----      |
|                   |               |   |       |      |
|     ....          ^               |   |  S2   |      |
|                   |               |   |       |      |
|   ---------       |               |   |       |      |
v   |       |       |               ^   ---------      |
|   |  S2   |       |               |       |          |
|   | BRA   |       |               |       |-----<-----
|   |       |       |               |       v
|   ---------       |               |   ____|____
|       |           |               |   |       |
|       ------>------               |   |  S1   |
|                                   |   |  Bnn  |
|-------|                           |   |       |
        |                           |   ----|----
        v                           |       |
                                    |----<--|
                                            |
                                            v
.ft R

Fig. 10.3 Transformation of the CFG by Branch Optimization
.DE
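.PP
The four conditions on B can be tested as in the following Python sketch.
It is an illustration under assumptions rather than the optimizer's own code:
the textual order, the successor lists, the set of loop blocks and a per-block tag
for the kind of the last instruction ("uncond", "cond" or "other") are all assumed inputs,
and the name find_while_loop_block is hypothetical.

```python
def find_while_loop_block(order, succ, loop, last_instr):
    """Return (B, S) satisfying conditions 1 to 4, or None.

    order:      block names in textual order.
    succ:       dict mapping each block name to its list of successors.
    loop:       set of names of the blocks belonging to the loop.
    last_instr: dict mapping block name to "uncond", "cond" or "other".
    """
    position = {name: i for i, name in enumerate(order)}
    for b in order:
        if b not in loop:
            continue
        next_index = position[b] + 1
        if next_index >= len(order):
            continue                      # B has no textually next block
        nxt = order[next_index]
        if nxt in loop:
            continue                      # condition 1 fails
        if last_instr[b] != "uncond" or len(succ[b]) != 1:
            continue                      # condition 2 fails
        s = succ[b][0]
        if nxt not in succ[s]:
            continue                      # condition 3 fails
        if last_instr[s] != "cond":
            continue                      # condition 4 fails
        return b, s
    return None


if __name__ == "__main__":
    # The straightforward translation of Fig. 10.1.
    order = ["test", "body", "exit"]
    succ = {"test": ["body", "exit"], "body": ["test"], "exit": []}
    loop = {"test", "body"}
    last_instr = {"test": "cond", "body": "uncond", "exit": "other"}
    print(find_while_loop_block(order, succ, loop, last_instr))
```
.PP
For the straightforward translation of Fig. 10.1 the sketch selects
the body block as B and the test block as S.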