.bp
.NH 1
Branch Optimization
.NH 2
Introduction
.PP
The Branch Optimization phase (BO) performs two related
branch optimizations: fusion of basic blocks and while-loop optimization.
.NH 3
Fusion of basic blocks
.PP
If two basic blocks B1 and B2 have the following properties:
.DS
SUCC(B1) = {B2}
PRED(B2) = {B1}
.DE
then B1 and B2 can be combined into one basic block.
If B1 ends in an unconditional jump to the beginning of B2, this
jump can be eliminated,
saving a little execution time and object code size.
This technique can be used to eliminate some inefficiencies
introduced by the front ends (for example, the "C" front end
translates switch statements inefficiently because of its one-pass nature).
.NH 3
While-loop optimization
.PP
The straightforward way to translate a while loop is to
put the test for loop termination at the beginning of the loop.
.DS
while cond loop                  LAB1: Test cond
   body of the loop     --->           Branch On False To LAB2
end loop                               code for body of loop
				       Branch To LAB1
				 LAB2:

Fig. 10.1 Example of Branch Optimization
.DE
If the condition fails at the Nth iteration, the following code
gets executed (dynamically):
.DS
N   *  conditional branch (which fails N-1 times)
N-1 *  unconditional branch
N-1 *  body of the loop
.DE
An alternative translation is:
.DS
     Branch To LAB2
LAB1:
     code for body of loop
LAB2:
     Test cond
     Branch On True To LAB1
.DE
This translation results in the following profile:
.DS
N   *  conditional branch (which succeeds N-1 times)
1   *  unconditional branch
N-1 *  body of the loop
.DE
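So the second translation will be significantly faster if N >> 2.
If N=2, both translations execute the same number of branches.
If N=1 (the body is never executed), the second translation executes
one extra unconditional branch, so execution time will be slightly increased.
On average, the program will be sped up.
Note that the code sizes of the two translations are the same.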
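.PP
The two profiles can be compared with a small C sketch.
The names used here (struct profile, test_at_top, test_at_bottom,
branches_saved) are illustrative assumptions and are not part of BO;
the counts are taken directly from the two profiles given above.
.DS
```c
/* Dynamic counts for one execution of a while loop whose condition
 * fails at the Nth test, so that the body runs N-1 times. */
struct profile {
    long cond_branches;    /* conditional branches executed   */
    long uncond_branches;  /* unconditional branches executed */
    long bodies;           /* executions of the loop body     */
};

/* Test at the top: every completed iteration jumps back to LAB1. */
static struct profile test_at_top(long n)
{
    struct profile p = { n, n - 1, n - 1 };
    return p;
}

/* Test at the bottom: a single jump to LAB2 enters the loop. */
static struct profile test_at_bottom(long n)
{
    struct profile p = { n, 1, n - 1 };
    return p;
}

/* Branches saved by the second translation (negative means a loss). */
static long branches_saved(long n)
{
    struct profile a = test_at_top(n);
    struct profile b = test_at_bottom(n);

    return (a.cond_branches + a.uncond_branches)
         - (b.cond_branches + b.uncond_branches);
}
```
.DE
.PP
For N=10, for instance, the first translation executes 19 branches
and the second 11, so branches_saved(10) yields 8.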
.NH 2
Implementation
.PP
The basic block fusion technique is implemented
by traversing the control flow graph of a procedure,
looking for basic blocks B with only one successor (S).
If one is found, it is checked whether S has only one predecessor
(which has to be B).
If so, the two basic blocks can in principle be combined.
However, as one basic block will have to be moved,
the textual order of the basic blocks will be altered.
This reordering causes severe problems in the presence
of conditional jumps.
For example, if S ends in a conditional branch,
the basic block that textually follows S must stay
in that position, because control falls through to it
when the branch is not taken.
So the transformation in Fig. 10.2 is illegal.
.DS
LAB1: S1              LAB1: S1
      BRA LAB2              S2
      ...       -->         BEQ LAB3
LAB2: S2                    ...
      BEQ LAB3              S3
      S3

Fig. 10.2 An illegal transformation of Branch Optimization
.DE
If B is moved towards S, the same problem occurs if the block before B
ends in a conditional jump.
The problem could be solved by adding one extra branch,
but this would reduce the gains of the optimization to zero.
Hence the optimization will only be done if the block that
follows S (in the textual order) is not a successor of S.
This condition ensures that S does not end in a conditional branch.
The condition always holds for the code generated by the "C"
front end for a switch statement.
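.PP
The legality test described above can be sketched in C as follows.
This is a minimal sketch under stated assumptions, not the actual code of BO:
struct block, MAXEDGES, is_succ and can_fuse are hypothetical names,
and the explicit self-loop check is an added assumption.
.DS
```c
#include <stddef.h>

#define MAXEDGES 4   /* arbitrary bound chosen for this sketch */

/* Hypothetical basic-block record; not the optimizer's real type. */
struct block {
    struct block *succ[MAXEDGES];  /* SUCC set                     */
    int nsucc;
    struct block *pred[MAXEDGES];  /* PRED set                     */
    int npred;
    struct block *next;            /* next block in textual order  */
};

/* Nonzero if b is a member of SUCC(s). */
static int is_succ(const struct block *s, const struct block *b)
{
    int i;

    for (i = 0; i < s->nsucc; i++)
        if (s->succ[i] == b)
            return 1;
    return 0;
}

/* Nonzero if B may be fused with its only successor S:
 * SUCC(B) = {S}, PRED(S) = {B}, and the block that textually
 * follows S is not a successor of S (S cannot fall through). */
static int can_fuse(const struct block *b)
{
    const struct block *s;

    if (b->nsucc != 1)
        return 0;
    s = b->succ[0];
    if (s == b)                      /* assumption: never fuse a self loop */
        return 0;
    if (s->npred != 1 || s->pred[0] != b)
        return 0;
    if (s->next != NULL && is_succ(s, s->next))
        return 0;
    return 1;
}
```
.DE
.PP
Applied to the left-hand side of Fig. 10.2, can_fuse rejects the block
labelled LAB1, because the block containing S3 textually follows the
block labelled LAB2 and is also one of its successors.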
.PP
After the transformation has been performed,
some attributes of the basic blocks involved (such as successor and
predecessor sets and immediate dominator) must be recomputed.
.PP
The while-loop technique is applied to one loop at a time.
The list of basic blocks of the loop is traversed to find
a block B that satisfies the following conditions:
.IP 1.
the block that textually follows B is not part of the loop
.IP 2.
the last instruction of B is an unconditional branch;
hence B has only one successor, say S
.IP 3.
the block that textually follows B is a successor of S
.IP 4.
the last instruction of S is a conditional branch
.LP
If such a block B is found, the control flow graph is changed
as depicted in Fig. 10.3.
.DS
       |				    |
       |				    v
       v				    |
       |-----<------|			    ----->-----|
   ____|____	    |				       |
   |	   |	    |		    |-------|	       |
   |  S1   |	    |		    |	    v	       |
   |  Bcc  |	    |		    |	  ....	       |
|--|	   |	    |		    |		       |
|  ---------	    |		    |	----|----      |
|		    |		    |	|	|      |
|     ....	    ^		    |	|  S2	|      |
|		    |		    |	|	|      |
|   ---------	    |		    |	|	|      |
v   |	    |	    |		    ^	---------      |
|   |  S2   |	    |		    |	    |	       |
|   | BRA   |	    |		    |	    |-----<-----
|   |	    |	    |		    |	    v
|   ---------	    |		    |	____|____
|	|	    |		    |	|	|
|	------>------		    |	|  S1	|
|				    |	|  Bnn  |
|-------|			    |	|	|
	|			    |	----|----
	v			    |	    |
				    |----<--|
					    |
					    v

Fig. 10.3 Transformation of the CFG by Branch Optimization
.DE
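.PP
The four conditions used to select a block B for the while-loop
technique can be sketched in C as follows.
This is a hedged illustration only: struct loop_block, enum last_instr,
is_successor and is_candidate are hypothetical names, and the sketch
assumes that each block has at most one explicit branch target.
.DS
```c
#include <stddef.h>

enum last_instr { FALLS_THROUGH, UNCOND_BRANCH, COND_BRANCH };

/* Hypothetical block record used for this sketch only. */
struct loop_block {
    enum last_instr last;        /* kind of the final instruction */
    int in_loop;                 /* nonzero if part of the loop   */
    struct loop_block *target;   /* branch target, if any         */
    struct loop_block *next;     /* next block in textual order   */
};

/* Nonzero if x is a successor of s: either its branch target or,
 * unless s ends in an unconditional branch, its fall-through block. */
static int is_successor(const struct loop_block *s,
                        const struct loop_block *x)
{
    if (s->last != FALLS_THROUGH && s->target == x)
        return 1;
    return s->last != UNCOND_BRANCH && s->next == x;
}

/* Nonzero if b satisfies conditions 1 to 4 of the text. */
static int is_candidate(const struct loop_block *b)
{
    const struct loop_block *s;

    if (b->next == NULL || b->next->in_loop)            /* condition 1 */
        return 0;
    if (b->last != UNCOND_BRANCH || b->target == NULL)  /* condition 2 */
        return 0;
    s = b->target;                                      /* the block S */
    if (!is_successor(s, b->next))                      /* condition 3 */
        return 0;
    return s->last == COND_BRANCH;                      /* condition 4 */
}
```
.DE
.PP
In the straightforward translation of Fig. 10.1, the block ending in
"Branch To LAB1" is such a candidate: the block at LAB2 follows it
textually, lies outside the loop, and is a successor of the block at LAB1,
which ends in a conditional branch.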