| .bp
 | |
| .NH 1
 | |
| Branch Optimization
 | |
| .NH 2
 | |
| Introduction
 | |
| .PP
 | |
| The Branch Optimization phase (BO) performs two related
 | |
| branch optimizations: fusion of basic blocks and while-loop optimization.
 | |
| .NH 3
 | |
| Fusion of basic blocks
 | |
| .PP
 | |
| If two basic blocks B1 and B2 have the following properties:
 | |
| .DS
 | |
| SUCC(B1) = {B2}
 | |
| PRED(B2) = {B1}
 | |
| .DE
 | |
| then B1 and B2 can be combined into one basic block.
 | |
| If B1 ends in an unconditional jump to the beginning of B2, this
 | |
| jump can be eliminated,
 | |
| hence saving a little execution time and object code size.
 | |
| This technique can be used to eliminate some deficiencies
 | |
| introduced by the front ends (for example, the "C" front end
 | |
| translates switch statements inefficiently due to its one-pass nature).
 | |
| .NH 3
 | |
| While-loop optimization
 | |
| .PP
 | |
| The straightforward way to translate a while loop is to
 | |
| put the test for loop termination at the beginning of the loop.
 | |
| .DS
 | |
| while cond loop                       \kyLAB1: \kxTest cond
 | |
|    body of the loop     --->\h'|\nxu'Branch On False To LAB2
 | |
| end loop\h'|\nxu'code for body of loop
 | |
| \h'|\nxu'Branch To LAB1
 | |
| \h'|\nyu'LAB2:
 | |
| 
 | |
| Fig. 10.1 Example of Branch Optimization
 | |
| .DE
 | |
| If the condition fails at the Nth iteration, the following code
 | |
| gets executed (dynamically):
 | |
| .DS
 | |
| .TS
 | |
| l l l.
 | |
| N	*	conditional branch (which fails N-1 times)
 | |
| N-1	*	unconditional branch
 | |
| N-1	*	body of the loop
 | |
| .TE
 | |
| .DE
 | |
| An alternative translation is:
 | |
| .DS
 | |
|      Branch To LAB2
 | |
| LAB1:
 | |
|      code for body of loop
 | |
| LAB2:
 | |
|      Test cond
 | |
|      Branch On True To LAB1
 | |
| .DE
 | |
| This translation results in the following profile:
 | |
| .DS
 | |
| .TS
 | |
| l l l.
 | |
| N	*	conditional branch (which succeeds N-1 times)
 | |
| 1	*	unconditional branch
 | |
| N-1	*	body of the loop
 | |
| .TE
 | |
| .DE
 | |
| By these counts the second translation executes N-2 fewer unconditional
 | |
| branches, so it will be significantly faster if N >> 2.
 | |
| For very small N (N <= 2) nothing is saved and execution time may be slightly increased.
 | |
| On average, the program will be sped up.
 | |
| Note that the code sizes of the two translations will be the same.
 | |
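The two profiles above can be checked with a small simulation. The following is only an illustrative sketch, not part of BO: the instruction encoding (`label`, `test`, `bfalse`, `btrue`, `bra`, `body`) and the function name `profile` are assumptions made for this example. The condition is modelled as being true for the first N-1 tests and false at the Nth.

```python
from collections import Counter

# Straightforward translation: test at the top of the loop.
TOP = [("label", "LAB1"), ("test",), ("bfalse", "LAB2"),
       ("body",), ("bra", "LAB1"), ("label", "LAB2")]

# Alternative translation: test at the bottom of the loop.
BOTTOM = [("bra", "LAB2"), ("label", "LAB1"), ("body",),
          ("label", "LAB2"), ("test",), ("btrue", "LAB1")]


def profile(program, n):
    """Count executed branches and loop bodies when the test fails at the n-th try."""
    labels = {ins[1]: i for i, ins in enumerate(program) if ins[0] == "label"}
    counts = Counter()
    tests = 0
    cond = False
    pc = 0
    while pc < len(program):
        op = program[pc][0]
        next_pc = pc + 1
        if op == "test":
            tests += 1
            cond = tests < n          # true for the first n-1 tests only
        elif op in ("bfalse", "btrue"):
            counts["conditional"] += 1
            if cond == (op == "btrue"):   # branch is taken
                next_pc = labels[program[pc][1]]
        elif op == "bra":
            counts["unconditional"] += 1
            next_pc = labels[program[pc][1]]
        elif op == "body":
            counts["body"] += 1
        pc = next_pc
    return counts
```

For example, `profile(TOP, 10)` counts 9 unconditional branches, while `profile(BOTTOM, 10)` counts only 1; both execute 10 conditional branches and 9 loop bodies.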
| .NH 2
 | |
| Implementation
 | |
| .PP
 | |
| The basic block fusion technique is implemented
 | |
| by traversing the control flow graph of a procedure,
 | |
| looking for basic blocks B with only one successor (S).
 | |
| If one is found, it is checked whether S has only one predecessor
 | |
| (which has to be B).
 | |
| If so, the two basic blocks can in principle be combined.
 | |
| However, as one basic block will have to be moved,
 | |
| the textual order of the basic blocks will be altered.
 | |
| This reordering causes severe problems in the presence
 | |
| of conditional jumps.
 | |
| For example, if S ends in a conditional branch,
 | |
| the basic block that textually follows S must stay
 | |
| in that position.
 | |
| So the transformation in Fig. 10.2 is illegal.
 | |
| .DS
 | |
| .TS
 | |
| l l l l l.
 | |
| LAB1:	S1		LAB1:	S1
 | |
| 	BRA LAB2			S2
 | |
| 	...	-->		BEQ LAB3
 | |
| LAB2:	S2			...
 | |
| 	BEQ LAB3			S3
 | |
| 	S3
 | |
| .TE
 | |
| 
 | |
| Fig. 10.2 An illegal transformation of Branch Optimization
 | |
| .DE
 | |
| If B is moved towards S, the same problem occurs if the block before B
 | |
| ends in a conditional jump.
 | |
| The problem could be solved by adding one extra branch,
 | |
| but this would reduce the gains of the optimization to zero.
 | |
| Hence the optimization will only be done if the block that
 | |
| follows S (in the textual order) is not a successor of S.
 | |
| This condition ensures that S does not end in a conditional branch.
 | |
| The condition always holds for the code generated by the "C"
 | |
| front end for a switch statement.
 | |
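The selection rule described above might be sketched as follows. This is an assumption-laden illustration rather than the actual BO code: the control flow graph is represented by hypothetical dictionaries `succ` and `pred` (block name to set of block names) and a list `order` giving the textual order of the blocks.

```python
def fusion_candidates(order, succ, pred):
    """Return the (B, S) pairs that satisfy the fusion conditions.

    order: block names in textual order (hypothetical representation)
    succ, pred: dicts mapping a block name to a set of block names
    """
    result = []
    for b in order:
        if len(succ[b]) != 1:
            continue                  # B must have exactly one successor
        (s,) = succ[b]
        if s == b or pred[s] != {b}:
            continue                  # S must have B as its only predecessor
        # The block that textually follows S must not be a successor of S,
        # otherwise S may fall through into it and cannot be moved.
        i = order.index(s)
        if i + 1 < len(order) and order[i + 1] in succ[s]:
            continue
        result.append((b, s))
    return result
```

With the blocks of Fig. 10.2, where the block holding S2 ends in a conditional branch and falls through into S3, the function returns no candidates. If that block ended in an unconditional branch instead, the pair would be reported.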
| .PP
 | |
| After the transformation has been performed,
 | |
| some attributes of the basic blocks involved (such as successor and
 | |
| predecessor sets and immediate dominator) must be recomputed.
 | |
| .PP
 | |
| The while-loop technique is applied to one loop at a time.
 | |
| The list of basic blocks of the loop is traversed to find
 | |
| a block B that satisfies the following conditions:
 | |
| .IP 1.
 | |
| the block that textually follows B is not part of the loop
 | |
| .IP 2.
 | |
| the last instruction of B is an unconditional branch;
 | |
| hence B has only one successor, say S
 | |
| .IP 3.
 | |
| the block that textually follows B is a successor of S
 | |
| .IP 4.
 | |
| the last instruction of S is a conditional branch
 | |
| .LP
 | |
| If such a block B is found, the control flow graph is changed
 | |
| as depicted in Fig. 10.3.
 | |
| .DS
 | |
| .ft 5
 | |
|        |                                    |
 | |
|        |                                    v
 | |
|        v                                    |
 | |
|        |-----<------|                       ----->-----|
 | |
|    ____|____        |                                  |
 | |
|    |       |        |               |-------|          |
 | |
|    |  S1   |        |               |       v          |
 | |
|    |  Bcc  |        |               |     ....         |
 | |
| |--|       |        |               |                  |
 | |
| |  ---------        |               |   ----|----      |
 | |
| |                   |               |   |       |      |
 | |
| |     ....          ^               |   |  S2   |      |
 | |
| |                   |               |   |       |      |
 | |
| |   ---------       |               |   |       |      |
 | |
| v   |       |       |               ^   ---------      |
 | |
| |   |  S2   |       |               |       |          |
 | |
| |   | BRA   |       |               |       |-----<-----
 | |
| |   |       |       |               |       v
 | |
| |   ---------       |               |   ____|____
 | |
| |       |           |               |   |       |
 | |
| |       ------>------               |   |  S1   |
 | |
| |                                   |   |  Bnn  |
 | |
| |-------|                           |   |       |
 | |
|         |                           |   ----|----
 | |
|         v                           |       |
 | |
|                                     |----<--|
 | |
|                                             |
 | |
|                                             v
 | |
| .ft R
 | |
| 
 | |
| Fig. 10.3 Transformation of the CFG by Branch Optimization
 | |
| .DE
 |
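The search for a block B that meets the four conditions listed above could look roughly like this. It is a minimal sketch under stated assumptions: `order`, `loop`, `succ` and `last_op` are hypothetical data structures introduced for this example, and the opcode names `"BRA"` and `"Bcc"` are borrowed from the figures only as labels.

```python
def find_while_loop_block(order, loop, succ, last_op):
    """Return (B, S) for the first block B of the loop meeting conditions 1-4.

    order:   all block names in textual order
    loop:    set of block names belonging to the loop
    succ:    dict mapping a block name to its set of successors
    last_op: dict mapping a block name to "BRA", "Bcc" or None
    Returns None if no block qualifies.
    """
    for i, b in enumerate(order):
        if b not in loop or i + 1 >= len(order):
            continue
        nxt = order[i + 1]
        if nxt in loop:
            continue                  # 1. next block must lie outside the loop
        if last_op[b] != "BRA" or len(succ[b]) != 1:
            continue                  # 2. B ends in an unconditional branch
        (s,) = succ[b]
        if nxt not in succ[s]:
            continue                  # 3. next block is a successor of S
        if last_op[s] != "Bcc":
            continue                  # 4. S ends in a conditional branch
        return b, s
    return None
```

For the left-hand graph of Fig. 10.3, B is the block containing S2 and BRA, and S is the block containing S1 and Bcc. After the transformation no block of the loop ends in an unconditional branch, so the search finds nothing further.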