311 lines
		
	
	
	
		
			11 KiB
		
	
	
	
		
			Text
		
	
	
	
	
	
			
		
		
	
	
			311 lines
		
	
	
	
		
			11 KiB
		
	
	
	
		
			Text
		
	
	
	
	
	
| .NH 2
 | |
| Implementation.
 | |
| .PP
 | |
| In this section we will discuss the implementation of the CS phase.
 | |
| We will first describe the basic actions that are undertaken
 | |
| by the algorithm, than the algorithm itself.
 | |
| .NH 3
 | |
| Partioning the EM instructions
 | |
| .PP
 | |
| There are over 100 EM instructions.
 | |
| For our purpose we partition this huge set into groups of
 | |
| instructions which can be more or less conveniently handled together.
 | |
| .PP
 | |
| There are groups for all sorts of load instructions:
 | |
| simple loads, expensive loads, loads of an array element.
 | |
| A load is considered \fIexpensive\fP when more than one EM instructions
 | |
| are involved in loading it.
 | |
| The load of a lexical entity is also considered expensive.
 | |
| For instance: LOF is expensive, LAL is not.
 | |
| LAR forms a group on its own, 
 | |
| because it is not only an expensive load,
 | |
| but also implicitly includes the ternary operator AAR,
 | |
| which computes the address of the array element.
 | |
| .PP
 | |
| There are groups for all sorts of operators:
 | |
| unary, binary, and ternary.
 | |
| The groups of operators are further partitioned according to the size
 | |
| of their operand(s) and result.
 | |
| .\" .PP
 | |
| .\" The distinction between operators and expensive loads is not always clear.
 | |
| .\" The ADP instruction for example,
 | |
| .\" might seem a unary operator because it pops one item
 | |
| .\" (a pointer) from the stack.
 | |
| .\" However, two ADP-instructions which pop an item with the same value number
 | |
| .\" need not have the same result,
 | |
| .\" because the attributes (an offset, to be added to the pointer)
 | |
| .\" can be different.
 | |
| .\" Is it then a binary operator?
 | |
| .\" That would give rise to the strange, and undesirable,
 | |
| .\" situation that some binary operators pop two operands
 | |
| .\" and others pop one.
 | |
| .\" The conclusion is inevitable:
 | |
| .\" we have been fooled by the name (ADd Pointer).
 | |
| .\" The ADP-instruction is an expensive load.
 | |
| .\" In this context LAF, meaning Load Address of oFfsetted,
 | |
| .\" would have been a better name,
 | |
| .\" corresponding to LOF, like LAL,
 | |
| .\" Load Address of Local, corresponds to LOL.
 | |
| .PP
 | |
| There are groups for all sorts of stores:
 | |
| direct, indirect, array element.
 | |
| The SAR forms a group on its own for the same reason
 | |
| as appeared with LAR.
 | |
| .PP
 | |
| The effect of the remaining instructions is less clear.
 | |
| They do not help very much in parsing expressions or
 | |
| in constructing our pseudo symboltable.
 | |
| They are partitioned according to the following criteria:
 | |
| .RS
 | |
| .IP "-"
 | |
| They change the value of an entity without using the stack
 | |
| (e.g. ZRL, DEE).
 | |
| .IP "-"
 | |
| They are subroutine calls (CAI, CAL).
 | |
| .IP "-"
 | |
| They change the stack in some irreproduceable way (e.g. ASP, LFR, DUP).
 | |
| .IP "-"
 | |
| They have no effect whatever on the stack or on the entities.
 | |
| This does not mean they can be deleted,
 | |
| but they can be ignored for the moment
 | |
| (e.g. MES, LIN, NOP).
 | |
| .IP "-"
 | |
| Their effect is too complicate too compute,
 | |
| so we just assume worst case behaviour.
 | |
| Hopefully, they do not occur very often.
 | |
| (e.g. MON, STR, BLM).
 | |
| .IP "-"
 | |
| They signal the end of the basic block (e.g. BLT, RET, TRP).
 | |
| .RE
 | |
| .NH 3
 | |
| Parsing expressions
 | |
| .PP
 | |
| To recognize expressions,
 | |
| we simulate the behaviour of the EM machine,
 | |
| by means of a fake-stack.
 | |
| When we scan the instructions in sequential order,
 | |
| we first encounter the instructions that load
 | |
| the operands on the stack,
 | |
| and then the instruction that indicates the operator,
 | |
| because EM expressions are postfix.
 | |
| When we find an instruction to load an operand,
 | |
| we load on the fake-stack a struct with the following information:
 | |
| .DS
 | |
| .TS
 | |
| l l.
 | |
| (1)	the value number of the operand
 | |
| (2)	the size of the operand
 | |
| (3)	a pointer to the first line of EM-code
 | |
| 	that constitutes the operand
 | |
| .TE
 | |
| .DE
 | |
| In most cases, (3) will point to the line
 | |
| that loaded the operand (e.g. LOL, LOC),
 | |
| i.e. there is only one line that refers to this operand,
 | |
| but sometimes some information must be popped
 | |
| to load the operand (e.g. LOI, LAR).
 | |
| This information must have been pushed before,
 | |
| so we also pop a pointer to the first line that pushed
 | |
| the information.
 | |
| This line is now the first line that defines the operand.
 | |
| .PP
 | |
| When we find the operator instruction,
 | |
| we pop its operand(s) from the fake-stack.
 | |
| The first line that defines the first operand is
 | |
| now the first line of the expression.
 | |
| We now have all information to determine
 | |
| whether the just parsed expression has occurred before.
 | |
| We also know the first and last line of the expression;
 | |
| we need this when we decide to eliminate it.
 | |
| Associated with each available expression is a set of
 | |
| which the elements contains the first and last line of
 | |
| a recurrence of this expression.
 | |
| .PP
 | |
| Not only will the operand(s) be popped from the fake-stack,
 | |
| but the following will be pushed:
 | |
| .DS
 | |
| .TS
 | |
| l l.
 | |
| (1)	the value number of the result
 | |
| (2)	the size of the result
 | |
| (3)	a pointer to the first line of the expression
 | |
| .TE
 | |
| .DE
 | |
| In this way an item on the fake-stack always contains
 | |
| the necessary information.
 | |
| EM expressions are parsed bottum up.
 | |
| .NH 3
 | |
| Updating entities
 | |
| .PP
 | |
| As said before,
 | |
| we build our private "symboltable",
 | |
| while scanning the EM-instructions.
 | |
| The behaviour of the EM-machine is not only reflected
 | |
| in the fake-stack,
 | |
| but also in the entities.
 | |
| When an entity is created,
 | |
| we do not yet know its value,
 | |
| so we assign a brand new value number to it.
 | |
| Each time a store-instruction is encountered,
 | |
| we change the value number of the target entity of this store
 | |
| to the value number of the token that was popped
 | |
| from the fake-stack.
 | |
| Because entities may overlap,
 | |
| we must also "forget" the value numbers of entities
 | |
| that might be affected by this store.
 | |
| Each such entity will be \fIkilled\fP,
 | |
| i.e. assigned a brand new valuenumber.
 | |
| .PP
 | |
| Because we lose information when we forget
 | |
| the value number of an entity,
 | |
| we try to save as much entities as possible.
 | |
| When we store into an external,
 | |
| we don't have to kill locals and vice versa.
 | |
| Furthermore, we can see whether two locals or
 | |
| two externals overlap,
 | |
| because we know the offset from the local base,
 | |
| resp. the offset within the data block,
 | |
| and the size.
 | |
| The situation becomes more complicated when we have
 | |
| to consider indirection.
 | |
| The worst case is that we store through an unknown pointer.
 | |
| In that case we kill all entities except those locals
 | |
| for which a so-called \fIregister message\fP has been generated;
 | |
| this register message indicates that this local can never be
 | |
| accessed indirectly.
 | |
| If we know this pointer we can be more careful.
 | |
| If it points to a local then the entity that is accessed through
 | |
| this pointer can never overlap with an external.
 | |
| If it points to an external this entity can never overlap with a local.
 | |
| Furthermore, in the latter case,
 | |
| we can find the data block this entity belongs to.
 | |
| Since pointer arithmetic is only defined within a data block,
 | |
| this entity can never overlap with entities that are known to
 | |
| belong to another data block.
 | |
| .PP
 | |
| Not only after a store-instruction but also after a 
 | |
| subroutine-call it may be necessary to kill entities;
 | |
| the subroutine may affect global variables or store
 | |
| through a pointer.
 | |
| If a subroutine is called that is not available as EM-text,
 | |
| we assume worst case behaviour,
 | |
| i.e. we kill all entities without register message.
 | |
| .NH 3
 | |
| Additions and replacements.
 | |
| .PP
 | |
| When a new expression comes available,
 | |
| we check whether the result is saved in a local
 | |
| that may go in a register.
 | |
| The last line of the expression must be followed
 | |
| by a STL or SDL instruction,
 | |
| depending on the size of the result
 | |
| (resp. WS and 2*WS),
 | |
| and a register message must be present for
 | |
| this local.
 | |
| If we have found such a local,
 | |
| we store a pointer to it with the available expression.
 | |
| Each time a new occurrence of this expression
 | |
| is found,
 | |
| we compare the value number of the local against
 | |
| the value number of the result.
 | |
| When they are different we remove the pointer to it,
 | |
| because we cannot use it.
 | |
| .PP
 | |
| The available expressions are singly linked in a list.
 | |
| When a new expression comes available,
 | |
| we link it at the head of the list.
 | |
| In this way expressions that are contained within other
 | |
| expressions appear later in the list,
 | |
| because EM-expressions are postfix.
 | |
| When we are going to eliminate expressions,
 | |
| we walk through the list,
 | |
| starting at the head, to find the largest expressions first.
 | |
| When we decide to eliminate an expression,
 | |
| we look at the expressions in the tail of the list,
 | |
| starting from where we are now,
 | |
| to delete expressions that are contained within
 | |
| the chosen one because
 | |
| we cannot eliminate an expression more than once.
 | |
| .PP
 | |
| When we are going to eliminate expressions,
 | |
| and we do not have a local that holds the result,
 | |
| we emit a STL or SDL after the line where the expression
 | |
| was first found.
 | |
| The other occurrences are simply removed,
 | |
| unless they contain instructions that not only have
 | |
| effect on the stack; e.g. messages, stores, calls.
 | |
| Before each instruction that needs the result on the stack,
 | |
| we emit a LOL or LDL.
 | |
| When the expression was an AAR,
 | |
| but the instruction was a LAR or a SAR,
 | |
| we append a LOI resp. a STI of the number of bytes
 | |
| in an array-element after each LOL/LDL.
 | |
| .NH 3
 | |
| Desirability analysis
 | |
| .PP
 | |
| Although the global optimizer works on EM code,
 | |
| the goal is to improve the quality of the object code.
 | |
| Therefore we need some machine dependent information
 | |
| to decide whether it is desirable to
 | |
| eliminate a given expression.
 | |
| Because it is impossible for the CS phase to know
 | |
| exactly what code will be generated,
 | |
| we use some heuristics.
 | |
| In most cases it will save time when we eliminate an
 | |
| operator, so we just do it.
 | |
| We only look for some special cases.
 | |
| .PP
 | |
| Some operators can in some cases be translated
 | |
| into an addressing mode for the machine at hand.
 | |
| We only eliminate such an operator,
 | |
| when its operand is itself "expensive",
 | |
| i.e. not just a simple load.
 | |
| The user of the CS phase has to supply
 | |
| a set of such operators.
 | |
| .PP
 | |
| Eliminating the loading of the Local Base or
 | |
| the Argument Base by the LXL resp. LXA instruction
 | |
| is only beneficial when the number of lexical levels
 | |
| we have to go back exceeds a certain threshold.
 | |
| This threshold will be different when registers
 | |
| are saved by the back end.
 | |
| The user must supply this threshold.
 | |
| .PP
 | |
| Replacing a SAR or a LAR by an AAR followed by a LOI
 | |
| may possibly increase the size of the object code.
 | |
| We assume that this is only possible when the
 | |
| size of the array element is greater than some
 | |
| (user-supplied) limit.
 | |
| .PP
 | |
| There are back ends that can very efficiently translate
 | |
| the index computing instruction sequence LOC SLI ADS.
 | |
| If this is the case,
 | |
| we do not eliminate the SLI instruction between a LOC
 | |
| and an ADS.
 | |
| .PP
 | |
| To handle unforeseen cases, the user may also supply
 | |
| a set of operators that should never be eliminated.
 | |
| .NH 3
 | |
| The algorithm
 | |
| .PP
 | |
| After these preparatory explanations,
 | |
| we can be short about the algorithm itself.
 | |
| For each instruction within our window,
 | |
| the following steps are performed in the order given:
 | |
| .IP 1.
 | |
| We check if this instructin defines an entity.
 | |
| If this is the case the set of entities is updated accordingly.
 | |
| .IP 2.
 | |
| We kill all entities that might be affected by this instruction.
 | |
| .IP 3.
 | |
| The instruction is simulated on the fake-stack.
 | |
| Copy propagation is done.
 | |
| If this instruction is an operator,
 | |
| we update the list of available expressions accordingly.
 | |
| .PP
 | |
| When we have processed all instructions this way,
 | |
| we have built a list of available expressions plus the information we
 | |
| need to eliminate them.
 | |
| Those expressions of which desirability analysis tells us so,
 | |
| we eliminate.
 | |
| The we shift our window and continue.
 |