| .bp
 | |
| .NH 1
 | |
| Branch Optimization
 | |
| .NH 2
 | |
| Introduction
 | |
| .PP
 | |
| The Branch Optimization phase (BO) performs two related
 | |
| (branch) optimizations.
 | |
| .NH 3
 | |
| Fusion of basic blocks
 | |
| .PP
 | |
| If two basic blocks B1 and B2 have the following properties:
 | |
| .DS
 | |
| SUCC(B1) = {B2}
 | |
| PRED(B2) = {B1}
 | |
| .DE
 | |
| then B1 and B2 can be combined into one basic block.
 | |
| If B1 ends in an unconditional jump to the beginning of B2, this
 | |
| jump can be eliminated,
 | |
| hence saving a little execution time and object code size.
 | |
| This technique can be used to eliminate some deficiencies
 | |
| introduced by the front ends (for example, the "C" front end
 | |
| translates switch statements inefficiently due to its one-pass nature).
 | |
| .NH 3
 | |
| While-loop optimization
 | |
| .PP
 | |
| The straightforward way to translate a while loop is to
 | |
| put the test for loop termination at the beginning of the loop.
 | |
| .DS
 | |
| while cond loop                  LAB1: Test cond
 | |
|    body of the loop     --->           Branch On False To LAB2
 | |
| end loop                               code for body of loop
 | |
| 				       Branch To LAB1
 | |
| 				 LAB2:
 | |
| 
 | |
| Fig. 10.1 Example of Branch Optimization
 | |
| .DE
 | |
| If the condition first evaluates to false on its Nth evaluation,
 | |
| the following instructions are executed (dynamically):
 | |
| .DS
 | |
| N   *  conditional branch (which is not taken N-1 times)
 | |
| N-1 *  unconditional branch
 | |
| N-1 *  body of the loop
 | |
| .DE
 | |
| An alternative translation is:
 | |
| .DS
 | |
|      Branch To LAB2
 | |
| LAB1:
 | |
|      code for body of loop
 | |
| LAB2:
 | |
|      Test cond
 | |
|      Branch On True To LAB1
 | |
| .DE
 | |
| This translation results in the following profile:
 | |
| .DS
 | |
| N   *  conditional branch (which is taken N-1 times)
 | |
| 1   *  unconditional branch
 | |
| N-1 *  body of the loop
 | |
| .DE
 | |
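| So the second translation will be significantly faster if N >> 2.
 | |
| If N=2, both translations execute the same number of branches.
 | |
| If N=1 (the body is never executed), execution time will be slightly
 | |
| increased, because the second translation executes one unconditional
 | |
| branch where the first executes none.
 | |
| On average, the program will be sped up.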
 | |
| Note that the code sizes of the two translations will be the same.
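The two profiles above can be compared with a small sketch. The function names are hypothetical, and the sketch counts executed branches only; it assumes nothing about the relative cost of taken and not-taken branches on a particular machine.

```python
def branch_profile(n):
    """Return the dynamic profiles of both translations.

    n is the number of times the loop condition is evaluated; the n-th
    evaluation is the first one that is false, so the body runs n - 1 times.
    """
    if n < 1:
        raise ValueError("the condition is evaluated at least once")
    # Test at the top of the loop: one jump back to LAB1 per iteration.
    test_at_top = {"conditional": n, "unconditional": n - 1, "body": n - 1}
    # Test at the bottom: a single jump to LAB2 on loop entry.
    test_at_bottom = {"conditional": n, "unconditional": 1, "body": n - 1}
    return test_at_top, test_at_bottom


def unconditional_branches_saved(n):
    """Unconditional branches saved by the second translation (may be negative)."""
    top, bottom = branch_profile(n)
    return top["unconditional"] - bottom["unconditional"]


if __name__ == "__main__":
    for n in (1, 2, 10, 1000):
        print(n, unconditional_branches_saved(n))
```

The saving is N-2 unconditional branches, which is why the benefit grows with the iteration count.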
 | |
| .NH 2
 | |
| Implementation
 | |
| .PP
 | |
| The basic block fusion technique is implemented
 | |
| by traversing the control flow graph of a procedure,
 | |
| looking for basic blocks B with only one successor (S).
 | |
| If one is found, it is checked whether S has only one predecessor
 | |
| (which has to be B).
 | |
| If so, the two basic blocks can in principle be combined.
 | |
| However, as one basic block will have to be moved,
 | |
| the textual order of the basic blocks will be altered.
 | |
| This reordering causes severe problems in the presence
 | |
| of conditional jumps.
 | |
| For example, if S ends in a conditional branch,
 | |
| the basic block that comes textually next to S must stay
 | |
| in that position.
 | |
| So the transformation in Fig. 10.2 is illegal.
 | |
| .DS
 | |
| LAB1: S1              LAB1: S1
 | |
|       BRA LAB2              S2
 | |
|       ...       -->         BEQ LAB3
 | |
| LAB2: S2                    ...
 | |
|       BEQ LAB3              S3
 | |
|       S3
 | |
| 
 | |
| Fig. 10.2 An illegal transformation of Branch Optimization
 | |
| .DE
 | |
| If B is moved towards S, the same problem occurs if the block before B
 | |
| ends in a conditional jump.
 | |
| The problem could be solved by adding one extra branch,
 | |
| but this would reduce the gains of the optimization to zero.
 | |
| Hence the optimization will only be done if the block that
 | |
| follows S (in the textual order) is not a successor of S.
 | |
| This condition ensures that S does not end in a conditional branch.
 | |
| The condition always holds for the code generated by the "C"
 | |
| front end for a switch statement.
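The candidate test described above can be sketched as follows. This is a minimal illustration, not the actual implementation: the representation (a list of block names in textual order plus a successor map) and the function name are assumptions, and the guard against a block whose only successor is itself is an extra assumption not stated in the text.

```python
def fusion_candidates(order, succ):
    """Return (B, S) pairs that satisfy the fusion conditions.

    order: block names in textual order (hypothetical representation)
    succ:  dict mapping each block name to the set of its successors
    """
    # Derive predecessor sets from the successor map.
    pred = {b: set() for b in order}
    for b in order:
        for s in succ[b]:
            pred[s].add(b)

    result = []
    for b in order:
        if len(succ[b]) != 1:
            continue                      # B must have exactly one successor
        (s,) = succ[b]
        if s == b or pred[s] != {b}:
            continue                      # S must have B as its only predecessor
        i = order.index(s)
        nxt = order[i + 1] if i + 1 < len(order) else None
        if nxt is not None and nxt in succ[s]:
            continue                      # S may fall through into nxt
        result.append((b, s))
    return result


# The situation of Fig. 10.2: B2 ends in BEQ and may fall through into B3.
order = ["B1", "X", "B2", "B3", "L3"]
illegal = {"B1": {"B2"}, "X": {"L3"}, "B2": {"B3", "L3"},
           "B3": {"L3"}, "L3": set()}
# The same layout, but B2 now ends in an unconditional branch to L3.
legal = {"B1": {"B2"}, "X": {"L3"}, "B2": {"L3"},
         "B3": {"L3"}, "L3": set()}
```

With the first graph no pair is reported, because the block after B2 is one of its successors; with the second graph the pair (B1, B2) qualifies.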
 | |
| .PP
 | |
| After the transformation has been performed,
 | |
| some attributes of the basic blocks involved (such as successor and
 | |
| predecessor sets and immediate dominator) must be recomputed.
 | |
| .PP
 | |
| The while-loop technique is applied to one loop at a time.
 | |
| The list of basic blocks of the loop is traversed to find
 | |
| a block B that satisfies the following conditions:
 | |
| .IP 1.
 | |
| the block that textually follows B is not part of the loop
 | |
| .IP 2.
 | |
| the last instruction of B is an unconditional branch;
 | |
| hence B has only one successor, say S
 | |
| .IP 3.
 | |
| the block that textually follows B is a successor of S
 | |
| .IP 4.
 | |
| the last instruction of S is a conditional branch
 | |
| .LP
 | |
| If such a block B is found, the control flow graph is changed
 | |
| as depicted in Fig. 10.3.
 | |
| .DS
 | |
|        |				    |
 | |
|        |				    v
 | |
|        v				    |
 | |
|        |-----<------|			    ----->-----|
 | |
|    ____|____	    |				       |
 | |
|    |	   |	    |		    |-------|	       |
 | |
|    |  S1   |	    |		    |	    v	       |
 | |
|    |  Bcc  |	    |		    |	  ....	       |
 | |
| |--|	   |	    |		    |		       |
 | |
| |  ---------	    |		    |	----|----      |
 | |
| |		    |		    |	|	|      |
 | |
| |     ....	    ^		    |	|  S2	|      |
 | |
| |		    |		    |	|	|      |
 | |
| |   ---------	    |		    |	|	|      |
 | |
| v   |	    |	    |		    ^	---------      |
 | |
| |   |  S2   |	    |		    |	    |	       |
 | |
| |   | BRA   |	    |		    |	    |-----<-----
 | |
| |   |	    |	    |		    |	    v
 | |
| |   ---------	    |		    |	____|____
 | |
| |	|	    |		    |	|	|
 | |
| |	------>------		    |	|  S1	|
 | |
| |				    |	|  Bnn  |
 | |
| |-------|			    |	|	|
 | |
| 	|			    |	----|----
 | |
| 	v			    |	    |
 | |
| 				    |----<--|
 | |
| 					    |
 | |
| 					    v
 | |
| 
 | |
| Fig. 10.3 Transformation of the CFG by Branch Optimization
 | |
| .DE
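The four conditions for the while-loop technique can be checked with a sketch like the one below. The data model is hypothetical: block names in textual order, the set of blocks in the loop, a successor map, and a map giving the kind of each block's last instruction. Skipping a block that has no textual successor is also an assumption.

```python
def find_loop_end_block(order, loop, succ, last_kind):
    """Return (B, S) for the first block B meeting conditions 1-4, else None.

    order:     block names in textual order
    loop:      set of block names belonging to the loop
    succ:      dict mapping each block name to the set of its successors
    last_kind: dict mapping each block name to "conditional",
               "unconditional" or "other" (kind of its last instruction)
    """
    for i, b in enumerate(order):
        if b not in loop or i + 1 >= len(order):
            continue
        nxt = order[i + 1]
        if nxt in loop:
            continue        # condition 1: next block must be outside the loop
        if last_kind[b] != "unconditional" or len(succ[b]) != 1:
            continue        # condition 2: B ends in an unconditional branch
        (s,) = succ[b]
        if nxt not in succ[s]:
            continue        # condition 3: next block is a successor of S
        if last_kind[s] != "conditional":
            continue        # condition 4: S ends in a conditional branch
        return b, s
    return None


# The left-hand graph of Fig. 10.3: "head" holds S1 and Bcc, "tail" holds
# S2 and BRA, "body" stands for the blocks in between.
order = ["head", "body", "tail", "exit"]
loop = {"head", "body", "tail"}
succ = {"head": {"body", "exit"}, "body": {"tail"},
        "tail": {"head"}, "exit": set()}
last_kind = {"head": "conditional", "body": "other",
             "tail": "unconditional", "exit": "other"}
```

Here the search yields B = "tail" and S = "head", the pair that is rearranged in the figure.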
 |