311 lines
11 KiB
Text
311 lines
11 KiB
Text
.NH 2
|
|
Implementation.
|
|
.PP
|
|
In this section we will discuss the implementation of the CS phase.
|
|
We will first describe the basic actions that are undertaken
|
|
by the algorithm, than the algorithm itself.
|
|
.NH 3
|
|
Partioning the EM instructions
|
|
.PP
|
|
There are over 100 EM instructions.
|
|
For our purpose we partition this huge set into groups of
|
|
instructions which can be more or less conveniently handled together.
|
|
.PP
|
|
There are groups for all sorts of load instructions:
|
|
simple loads, expensive loads, loads of an array element.
|
|
A load is considered \fIexpensive\fP when more than one EM instructions
|
|
are involved in loading it.
|
|
The load of a lexical entity is also considered expensive.
|
|
For instance: LOF is expensive, LAL is not.
|
|
LAR forms a group on its own,
|
|
because it is not only an expensive load,
|
|
but also implicitly includes the ternary operator AAR,
|
|
which computes the address of the array element.
|
|
.PP
|
|
There are groups for all sorts of operators:
|
|
unary, binary, and ternary.
|
|
The groups of operators are further partitioned according to the size
|
|
of their operand(s) and result.
|
|
.\" .PP
|
|
.\" The distinction between operators and expensive loads is not always clear.
|
|
.\" The ADP instruction for example,
|
|
.\" might seem a unary operator because it pops one item
|
|
.\" (a pointer) from the stack.
|
|
.\" However, two ADP-instructions which pop an item with the same value number
|
|
.\" need not have the same result,
|
|
.\" because the attributes (an offset, to be added to the pointer)
|
|
.\" can be different.
|
|
.\" Is it then a binary operator?
|
|
.\" That would give rise to the strange, and undesirable,
|
|
.\" situation that some binary operators pop two operands
|
|
.\" and others pop one.
|
|
.\" The conclusion is inevitable:
|
|
.\" we have been fooled by the name (ADd Pointer).
|
|
.\" The ADP-instruction is an expensive load.
|
|
.\" In this context LAF, meaning Load Address of oFfsetted,
|
|
.\" would have been a better name,
|
|
.\" corresponding to LOF, like LAL,
|
|
.\" Load Address of Local, corresponds to LOL.
|
|
.PP
|
|
There are groups for all sorts of stores:
|
|
direct, indirect, array element.
|
|
The SAR forms a group on its own for the same reason
|
|
as appeared with LAR.
|
|
.PP
|
|
The effect of the remaining instructions is less clear.
|
|
They do not help very much in parsing expressions or
|
|
in constructing our pseudo symboltable.
|
|
They are partitioned according to the following criteria:
|
|
.RS
|
|
.IP "-"
|
|
They change the value of an entity without using the stack
|
|
(e.g. ZRL, DEE).
|
|
.IP "-"
|
|
They are subroutine calls (CAI, CAL).
|
|
.IP "-"
|
|
They change the stack in some irreproduceable way (e.g. ASP, LFR, DUP).
|
|
.IP "-"
|
|
They have no effect whatever on the stack or on the entities.
|
|
This does not mean they can be deleted,
|
|
but they can be ignored for the moment
|
|
(e.g. MES, LIN, NOP).
|
|
.IP "-"
|
|
Their effect is too complicate too compute,
|
|
so we just assume worst case behaviour.
|
|
Hopefully, they do not occur very often.
|
|
(e.g. MON, STR, BLM).
|
|
.IP "-"
|
|
They signal the end of the basic block (e.g. BLT, RET, TRP).
|
|
.RE
|
|
.NH 3
|
|
Parsing expressions
|
|
.PP
|
|
To recognize expressions,
|
|
we simulate the behaviour of the EM machine,
|
|
by means of a fake-stack.
|
|
When we scan the instructions in sequential order,
|
|
we first encounter the instructions that load
|
|
the operands on the stack,
|
|
and then the instruction that indicates the operator,
|
|
because EM expressions are postfix.
|
|
When we find an instruction to load an operand,
|
|
we load on the fake-stack a struct with the following information:
|
|
.DS
|
|
.TS
|
|
l l.
|
|
(1) the value number of the operand
|
|
(2) the size of the operand
|
|
(3) a pointer to the first line of EM-code
|
|
that constitutes the operand
|
|
.TE
|
|
.DE
|
|
In most cases, (3) will point to the line
|
|
that loaded the operand (e.g. LOL, LOC),
|
|
i.e. there is only one line that refers to this operand,
|
|
but sometimes some information must be popped
|
|
to load the operand (e.g. LOI, LAR).
|
|
This information must have been pushed before,
|
|
so we also pop a pointer to the first line that pushed
|
|
the information.
|
|
This line is now the first line that defines the operand.
|
|
.PP
|
|
When we find the operator instruction,
|
|
we pop its operand(s) from the fake-stack.
|
|
The first line that defines the first operand is
|
|
now the first line of the expression.
|
|
We now have all information to determine
|
|
whether the just parsed expression has occurred before.
|
|
We also know the first and last line of the expression;
|
|
we need this when we decide to eliminate it.
|
|
Associated with each available expression is a set of
|
|
which the elements contains the first and last line of
|
|
a recurrence of this expression.
|
|
.PP
|
|
Not only will the operand(s) be popped from the fake-stack,
|
|
but the following will be pushed:
|
|
.DS
|
|
.TS
|
|
l l.
|
|
(1) the value number of the result
|
|
(2) the size of the result
|
|
(3) a pointer to the first line of the expression
|
|
.TE
|
|
.DE
|
|
In this way an item on the fake-stack always contains
|
|
the necessary information.
|
|
EM expressions are parsed bottum up.
|
|
.NH 3
|
|
Updating entities
|
|
.PP
|
|
As said before,
|
|
we build our private "symboltable",
|
|
while scanning the EM-instructions.
|
|
The behaviour of the EM-machine is not only reflected
|
|
in the fake-stack,
|
|
but also in the entities.
|
|
When an entity is created,
|
|
we do not yet know its value,
|
|
so we assign a brand new value number to it.
|
|
Each time a store-instruction is encountered,
|
|
we change the value number of the target entity of this store
|
|
to the value number of the token that was popped
|
|
from the fake-stack.
|
|
Because entities may overlap,
|
|
we must also "forget" the value numbers of entities
|
|
that might be affected by this store.
|
|
Each such entity will be \fIkilled\fP,
|
|
i.e. assigned a brand new valuenumber.
|
|
.PP
|
|
Because we lose information when we forget
|
|
the value number of an entity,
|
|
we try to save as much entities as possible.
|
|
When we store into an external,
|
|
we don't have to kill locals and vice versa.
|
|
Furthermore, we can see whether two locals or
|
|
two externals overlap,
|
|
because we know the offset from the local base,
|
|
resp. the offset within the data block,
|
|
and the size.
|
|
The situation becomes more complicated when we have
|
|
to consider indirection.
|
|
The worst case is that we store through an unknown pointer.
|
|
In that case we kill all entities except those locals
|
|
for which a so-called \fIregister message\fP has been generated;
|
|
this register message indicates that this local can never be
|
|
accessed indirectly.
|
|
If we know this pointer we can be more careful.
|
|
If it points to a local then the entity that is accessed through
|
|
this pointer can never overlap with an external.
|
|
If it points to an external this entity can never overlap with a local.
|
|
Furthermore, in the latter case,
|
|
we can find the data block this entity belongs to.
|
|
Since pointer arithmetic is only defined within a data block,
|
|
this entity can never overlap with entities that are known to
|
|
belong to another data block.
|
|
.PP
|
|
Not only after a store-instruction but also after a
|
|
subroutine-call it may be necessary to kill entities;
|
|
the subroutine may affect global variables or store
|
|
through a pointer.
|
|
If a subroutine is called that is not available as EM-text,
|
|
we assume worst case behaviour,
|
|
i.e. we kill all entities without register message.
|
|
.NH 3
|
|
Additions and replacements.
|
|
.PP
|
|
When a new expression comes available,
|
|
we check whether the result is saved in a local
|
|
that may go in a register.
|
|
The last line of the expression must be followed
|
|
by a STL or SDL instruction,
|
|
depending on the size of the result
|
|
(resp. WS and 2*WS),
|
|
and a register message must be present for
|
|
this local.
|
|
If we have found such a local,
|
|
we store a pointer to it with the available expression.
|
|
Each time a new occurrence of this expression
|
|
is found,
|
|
we compare the value number of the local against
|
|
the value number of the result.
|
|
When they are different we remove the pointer to it,
|
|
because we cannot use it.
|
|
.PP
|
|
The available expressions are singly linked in a list.
|
|
When a new expression comes available,
|
|
we link it at the head of the list.
|
|
In this way expressions that are contained within other
|
|
expressions appear later in the list,
|
|
because EM-expressions are postfix.
|
|
When we are going to eliminate expressions,
|
|
we walk through the list,
|
|
starting at the head, to find the largest expressions first.
|
|
When we decide to eliminate an expression,
|
|
we look at the expressions in the tail of the list,
|
|
starting from where we are now,
|
|
to delete expressions that are contained within
|
|
the chosen one because
|
|
we cannot eliminate an expression more than once.
|
|
.PP
|
|
When we are going to eliminate expressions,
|
|
and we do not have a local that holds the result,
|
|
we emit a STL or SDL after the line where the expression
|
|
was first found.
|
|
The other occurrences are simply removed,
|
|
unless they contain instructions that not only have
|
|
effect on the stack; e.g. messages, stores, calls.
|
|
Before each instruction that needs the result on the stack,
|
|
we emit a LOL or LDL.
|
|
When the expression was an AAR,
|
|
but the instruction was a LAR or a SAR,
|
|
we append a LOI resp. a STI of the number of bytes
|
|
in an array-element after each LOL/LDL.
|
|
.NH 3
|
|
Desirability analysis
|
|
.PP
|
|
Although the global optimizer works on EM code,
|
|
the goal is to improve the quality of the object code.
|
|
Therefore we need some machine dependent information
|
|
to decide whether it is desirable to
|
|
eliminate a given expression.
|
|
Because it is impossible for the CS phase to know
|
|
exactly what code will be generated,
|
|
we use some heuristics.
|
|
In most cases it will save time when we eliminate an
|
|
operator, so we just do it.
|
|
We only look for some special cases.
|
|
.PP
|
|
Some operators can in some cases be translated
|
|
into an addressing mode for the machine at hand.
|
|
We only eliminate such an operator,
|
|
when its operand is itself "expensive",
|
|
i.e. not just a simple load.
|
|
The user of the CS phase has to supply
|
|
a set of such operators.
|
|
.PP
|
|
Eliminating the loading of the Local Base or
|
|
the Argument Base by the LXL resp. LXA instruction
|
|
is only beneficial when the number of lexical levels
|
|
we have to go back exceeds a certain threshold.
|
|
This threshold will be different when registers
|
|
are saved by the back end.
|
|
The user must supply this threshold.
|
|
.PP
|
|
Replacing a SAR or a LAR by an AAR followed by a LOI
|
|
may possibly increase the size of the object code.
|
|
We assume that this is only possible when the
|
|
size of the array element is greater than some
|
|
(user-supplied) limit.
|
|
.PP
|
|
There are back ends that can very efficiently translate
|
|
the index computing instruction sequence LOC SLI ADS.
|
|
If this is the case,
|
|
we do not eliminate the SLI instruction between a LOC
|
|
and an ADS.
|
|
.PP
|
|
To handle unforeseen cases, the user may also supply
|
|
a set of operators that should never be eliminated.
|
|
.NH 3
|
|
The algorithm
|
|
.PP
|
|
After these preparatory explanations,
|
|
we can be short about the algorithm itself.
|
|
For each instruction within our window,
|
|
the following steps are performed in the order given:
|
|
.IP 1.
|
|
We check if this instructin defines an entity.
|
|
If this is the case the set of entities is updated accordingly.
|
|
.IP 2.
|
|
We kill all entities that might be affected by this instruction.
|
|
.IP 3.
|
|
The instruction is simulated on the fake-stack.
|
|
Copy propagation is done.
|
|
If this instruction is an operator,
|
|
we update the list of available expressions accordingly.
|
|
.PP
|
|
When we have processed all instructions this way,
|
|
we have built a list of available expressions plus the information we
|
|
need to eliminate them.
|
|
Those expressions of which desirability analysis tells us so,
|
|
we eliminate.
|
|
The we shift our window and continue.
|