Initial revision

1987-03-03 10:44:56 +00:00 · 1987-03-03 10:44:56 +00:00 · 295380491f
commit 295380491f
parent 2d362c2274
6 changed files with 855 additions and 0 deletions
--- a/doc/ego/cj/cj1
+++ b/doc/ego/cj/cj1
@ -0,0 +1,136 @@
+.bp
+.NH 1
+Cross jumping
+.NH 2
+Introduction
+.PP
+The "Cross Jumping" optimization technique (CJ)
+.[
+wulf design optimizing compiler
+.]
+is basically a space optimization technique. It looks for pairs of
+basic blocks (B1,B2), for which:
+.DS
+SUCC(B1) = SUCC(B2) = {S}
+.DE
+(So B1 and B2 both have one and the same successor).
+If the last few non-branch instructions are the same for B1 and B2,
+one such sequence can be eliminated.
+.DS
+Pascal:
+
+if cond then
+    S1
+    S3
+else
+    S2
+    S3
+
+(pseudo) EM:
+
+TEST COND		TEST COND
+BNE *1			BNE *1
+S1			S1
+S3	   --->		BRA *2
+BRA *2			1:
+1:			S2
+S2			2:
+S3			S3
+2:
+
+Fig. 9.1 An example of Cross Jumping
+.DE
+As the basic blocks have the same successor,
+at least one of them ends in an unconditional branch instruction (BRA).
+Hence no extra branch instruction is ever needed, just the target
+of an existing branch needs to be changed; neither the program size
+nor the execution time will ever increase.
+In general, the execution time will remain the same, unless
+further optimizations can be applied because of this optimization.
+.PP
+This optimization is particularly effective,
+because it cannot always be done by the programmer at the source level,
+as demonstrated by Fig. 9.2.
+.DS
+	Pascal:
+
+	if cond then
+	   x := f(4)
+	else
+	   x := g(5)
+
+
+	EM:
+
+	...                     ...
+	LOC 4			LOC 5
+	CAL F			CAL G
+	ASP 2			ASP 2
+	LFR 2			LFR 2
+	STL X			STL X
+
+Fig. 9.2 Effectiveness of Cross Jumping
+.DE
+At the source level there is no common tail,
+but at the EM level there is a common tail.
+.NH 2
+Implementation
+.PP
+The implementation of cross jumping is rather straightforward.
+The technique is applied to one procedure at a time.
+The control flow graph of the procedure 
+is scanned for pairs of basic blocks
+with the same (single) successor and with common tails.
+Note that there may be more than two such blocks (e.g. as the result
+of a case statement).
+This is dealt with by repeating the entire process until no
+further optimizations can be done for the current procedure.
+.sp
+If a suitable pair of basic blocks has been found, the control flow
+graph must be altered. One of the basic
+blocks must be split into two.
+The control flow graphs before and after the optimization are shown
+in Fig. 9.3 and Fig. 9.4.
+.DS
+
+	--------				--------
+	|      |				|      |
+	| S1   |			        | S2   |
+	| S3   |   				| S3   |
+	|      |				|      |
+	--------				--------
+	   |					   |
+	   |------------------|--------------------|
+			      |
+			      v
+
+Fig. 9.3 CFG before optimization
+.DE
+.DS
+
+	--------				--------
+	|      |				|      |
+	| S1   |			        | S2   |
+	|      |				|      |
+	--------				--------
+	   |					   |
+	   |--------------------<------------------|
+	   v
+	--------
+	|      |
+	| S3   |
+	|      |
+	--------
+	   |
+	   v
+
+Fig. 9.4 CFG after optimization
+.DE
+Some attributes of the three resulting blocks (such as immediate dominator)
+are updated.
+.PP
+In some cases, cross jumping might split the computation of an expression
+into two, by inserting a branch somewhere in the middle.
+Most code generators will generate very poor assembly code when
+presented with such EM code. 
+Therefore, cross jumping is not performed in these cases.
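The common-tail test at the heart of cross jumping can be sketched in C as follows. This is only an illustration: the `struct instr` record and the function name `common_tail` are hypothetical and are not taken from the ego sources. The sequences passed in are assumed to have had any trailing branch instruction stripped already, since only non-branch instructions are compared.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical instruction record: a mnemonic plus one integer argument. */
struct instr {
    const char *mnem;
    long arg;
};

/* Two instructions match when mnemonic and argument are both equal. */
static int same_instr(const struct instr *a, const struct instr *b)
{
    return strcmp(a->mnem, b->mnem) == 0 && a->arg == b->arg;
}

/* Return the number of trailing instructions that b1 and b2 share. */
size_t common_tail(const struct instr *b1, size_t n1,
                   const struct instr *b2, size_t n2)
{
    size_t k = 0;

    assert(b1 != NULL && b2 != NULL);
    while (k < n1 && k < n2 &&
           same_instr(&b1[n1 - 1 - k], &b2[n2 - 1 - k]))
        k++;
    return k;
}
```

Applied to the two blocks of Fig. 9.2 (with F, G and X encoded as arbitrary integers), the function reports a common tail of three instructions: ASP 2, LFR 2 and STL X.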
--- a/doc/ego/cs/cs1
+++ b/doc/ego/cs/cs1
@ -0,0 +1,42 @@
+.bp
+.NH 1
+Common subexpression elimination
+.NH 2
+Introduction
+.PP
+The Common Subexpression Elimination optimization technique (CS)
+tries to eliminate multiple computations of EM expressions
+that yield the same result.
+It places the result of one such computation
+in a temporary variable,
+and replaces the other computations by a reference
+to this temporary variable.
+The primary goal of this technique is to decrease
+the execution time of the program,
+but in general it will save space too.
+.PP
+As an example of the application of Common Subexpression Elimination,
+consider the piece of program in Fig. 7.1(a).
+.DS
+x := a * b;          TMP := a * b;       x := a * b;
+CODE;                x := TMP;           CODE
+y := c + a * b;      CODE                y := x;
+                     y := c + TMP;
+
+   (a)                  (b)                 (c)
+
+Fig. 7.1  Examples of Common Subexpression Elimination
+.DE
+If neither a nor b is changed in CODE,
+the instructions can be replaced by those of Fig. 7.1(b),
+which saves one multiplication,
+but costs an extra store instruction.
+If the value of x is not changed in CODE either,
+the instructions can be replaced by those of Fig. 7.1(c).
+In this case
+the extra store is not needed.
+.PP
+In the following sections we will describe
+which transformations are done
+by CS and how this phase
+was implemented.
--- a/doc/ego/cs/cs2
+++ b/doc/ego/cs/cs2
@ -0,0 +1,83 @@
+.NH 2
+Specification of the Common Subexpression Elimination phase
+.PP
+In this section we will describe
+the window
+through which CS examines the code,
+the expressions recognized by CS,
+and finally the changes made to the code.
+.NH 3
+The working window
+.PP
+The CS algorithm is applied to the
+largest sequence of textually adjacent basic blocks
+B1,..,Bn, for which
+.DS
+PRED(Bj) = {Bj-1},  j = 2,..,n.
+.DE
+Intuitively, this window consists of straight line code,
+with only one entry point (at the beginning); it may
+contain jumps, which should all have their targets outside the window.
+This is illustrated in Fig. 7.2.
+.DS
+x := a * b;	(1)
+if x < 10 then	(2)
+    y := a * b;	(3)
+
+Fig. 7.2 The working window of CS
+.DE
+Line (2) can only be executed after line (1).
+Likewise, line (3) can only be executed after
+line (2).
+Both a and b have the same values at line (1) and at line (3).
+.PP
+Larger windows were avoided.
+In Fig. 7.3, the value of a at line (4) may have been obtained
+at more than one point.
+.DS
+x := a * b;	(1)
+if x < 10 then	(2)
+    a := 100;	(3)
+y := a * b;	(4)
+
+Fig. 7.3 Several working windows
+.DE
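The window rule PRED(Bj) = {Bj-1} can be sketched in C as below. The block representation (`struct block` with an array of predecessor indices) and the name `window_end` are assumptions made for illustration; blocks are assumed to be numbered in textual order.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PRED 4

/* Hypothetical basic block: only the predecessor list matters here. */
struct block {
    size_t npred;            /* number of predecessors */
    size_t pred[MAX_PRED];   /* indices of the predecessor blocks */
};

/*
 * Return the index one past the last block of the window that starts
 * at block `start`. A block joins the window only if its single
 * predecessor is the textually preceding block.
 */
size_t window_end(const struct block *blocks, size_t nblocks, size_t start)
{
    size_t j;

    assert(blocks != NULL && start < nblocks);
    j = start + 1;
    while (j < nblocks &&
           blocks[j].npred == 1 && blocks[j].pred[0] == j - 1)
        j++;
    return j;
}
```

For Fig. 7.3, with lines (1)-(2) as block 0, line (3) as block 1 and line (4) as block 2 (which has both block 0 and block 1 as predecessors), the first window covers blocks 0 and 1, and block 2 starts a new window.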
+.NH 3
+Recognized expressions.
+.PP
+The computations eliminated by CS need not be normal expressions
+(like "a * b"),
+but can even consist of a single operand that is expensive to access,
+such as an array element or a record field.
+If an array element is used,
+its address is computed implicitly.
+CS is able to eliminate either the element itself or its
+address, whichever one is most profitable.
+A variable of a textually enclosing procedure may also be
+expensive to access, depending on the lexical level difference.
+.NH 3
+Transformations
+.PP
+CS creates a new temporary local variable (TMP)
+for every eliminated expression,
+unless it is able to use an existing local variable.
+It emits code to initialize this variable with the
+result of the expression.
+Most recurrences of the expression
+can simply be replaced by a reference to TMP.
+If the address of an array element is recognized as
+a common subexpression,
+references to the element itself are replaced by
+indirect references through TMP (see Fig. 7.4).
+.DS
+x := A[i];			TMP := &A[i];
+  . . .			-->	x := *TMP;
+A[i] := y;			   . . .
+				*TMP := y;
+
+Fig. 7.4 Elimination of an array address computation
+.DE
+Here, '&' is the 'address of' operator,
+and unary '*' is the indirection operator.
+(Note that EM actually has different instructions to do
+a use-indirect or an assign-indirect.)
--- a/doc/ego/cs/cs3
+++ b/doc/ego/cs/cs3
@ -0,0 +1,243 @@
+.NH 2
+Implementation
+.PP
+.NH 3
+The value number method
+.PP
+To determine whether two expressions have the same result,
+there must be some way to determine whether their operands have
+the same values.
+We use a system of \fIvalue numbers\fP
+.[
+kennedy data flow analysis 
+.]
+in which each distinct value of whatever type,
+created or used within the working window,
+receives a unique identifying number, its value number.
+Two items have the same value number if and only if,
+based only upon information from the instructions in the window,
+their values are provably identical.
+For example, after processing the statement
+.DS
+a := 4;
+.DE
+the variable a and the constant 4 have the same value number.
+.PP
+The value number of the result of an expression depends only
+on the kind of operator and the value number(s) of the operand(s).
+The expressions need not be textually equal, as shown in Fig. 7.5.
+.DS
+a := c;		(1)
+use(a * b);	(2)
+d := b;		(3)
+use(c * d);	(4)
+
+Fig. 7.5 Different expressions with the same value number
+.DE
+At line (1) a receives the same value number as c.
+At line (3) d receives the same value number as b.
+At line (4) the expression "c * d" receives the same value number
+as the expression "a * b" at line (2),
+because the value numbers of their left and right operands are the same,
+and the operator (*) is the same.
+.PP
+As another example of the value number method, consider Fig. 7.6.
+.DS
+use(a * b);	(1)
+a := 123;	(2)
+use(a * b);	(3)
+
+Fig. 7.6 Identical expressions with different value numbers
+.DE
+Although textually the expressions "a * b" in line 1 and line 3 are equal,
+a will have different value numbers at line 3 and line 1.
+The two expressions will not mistakenly be recognized as equivalent.
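A minimal C sketch of the value number method, under simplifying assumptions: variables are single lower-case letters, only binary operators are handled, and constants simply receive a fresh number. All names (`vn_state`, `vn_use`, `vn_assign`, `vn_fresh`, `vn_binop`) are hypothetical and do not come from the CS sources.

```c
#include <assert.h>

#define NVARS   26
#define MAXEXPR 64

/* One entry in the table of available expressions. */
struct expr {
    char op;
    int  left, right;   /* value numbers of the operands */
    int  vn;            /* value number of the result */
};

struct vn_state {
    int next_vn;              /* next unused value number */
    int var_vn[NVARS];        /* 0 means: not yet seen in the window */
    struct expr avail[MAXEXPR];
    int navail;
};

void vn_init(struct vn_state *s)
{
    int i;

    s->next_vn = 1;
    s->navail = 0;
    for (i = 0; i < NVARS; i++)
        s->var_vn[i] = 0;
}

/* Hand out a value number that was never used before. */
int vn_fresh(struct vn_state *s)
{
    return s->next_vn++;
}

/* Value number of a variable; a first use gets a brand new number. */
int vn_use(struct vn_state *s, char var)
{
    int i = var - 'a';

    assert(i >= 0 && i < NVARS);
    if (s->var_vn[i] == 0)
        s->var_vn[i] = vn_fresh(s);
    return s->var_vn[i];
}

/* Assignment: the left hand side takes the number of the right hand side. */
void vn_assign(struct vn_state *s, char var, int vn)
{
    int i = var - 'a';

    assert(i >= 0 && i < NVARS);
    s->var_vn[i] = vn;
}

/* The result number depends only on the operator and operand numbers. */
int vn_binop(struct vn_state *s, char op, int left, int right)
{
    int i;

    for (i = 0; i < s->navail; i++) {
        const struct expr *e = &s->avail[i];
        if (e->op == op && e->left == left && e->right == right)
            return e->vn;
    }
    assert(s->navail < MAXEXPR);
    s->avail[s->navail].op = op;
    s->avail[s->navail].left = left;
    s->avail[s->navail].right = right;
    s->avail[s->navail].vn = vn_fresh(s);
    return s->avail[s->navail++].vn;
}
```

Replaying Fig. 7.5 through these routines gives "a * b" and "c * d" the same value number, while replaying Fig. 7.6 gives the two occurrences of "a * b" different numbers, because the assignment to a changes its value number in between.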
+.NH 3
+Entities
+.PP
+The Value Number Method distinguishes between operators and operands.
+The value numbers of operands are stored in a table,
+called the \fIsymbol table\fR.
+The value number of a subexpression depends on the
+(root) operator of the expression and on the value numbers
+of its operands.
+A table of "available expressions" is used to do this mapping.
+.PP
+CS recognizes the following kinds of EM operands, called \fIentities\fR:
+.IP
+- constant
+- local variable
+- external variable
+- indirectly accessed entity
+- offsetted entity
+- address of local variable
+- address of external variable
+- address of offsetted entity
+- address of local base
+- address of argument base
+- array element
+- procedure identifier
+- floating zero
+- local base
+- heap pointer
+- ignore mask
+.LP
+Whenever a new entity is encountered in the working window,
+it is entered in the symbol table and given a brand new value number.
+Most entities have attributes (e.g. the offset in
+the current stackframe for local variables),
+which are also stored in the symbol table.
+.PP
+An entity is called static if its value cannot be changed
+(e.g. a constant or an address).
+.NH 3
+Parsing expressions
+.PP
+Common subexpressions are recognized by simulating the behaviour
+of the EM machine.
+The EM code is parsed from left to right;
+as EM is postfix code, this is a bottom up parse.
+At any point the current state of the EM runtime stack is
+reflected by a simulated "fake stack",
+containing descriptions of the parsed operands and expressions.
+A descriptor consists of:
+.DS
+(1) the value number of the operand or expression
+(2) the size of the operand or expression
+(3) a pointer to the first line of EM-code
+    that constitutes the operand or expression
+.DE
+Note that operands may consist of several EM instructions.
+Whenever an operator is encountered, the
+descriptors of its operands are on top of the fake stack.
+The operator and the value numbers of the operands 
+are used as indices in the table of available expressions,
+to determine the value number of the expression.
+.PP
+During the parsing process,
+we keep track of the first line of each expression;
+we need this information when we decide to eliminate the expression.
+.NH 3
+Updating entities
+.PP
+An entity is assigned a value number when it is
+used for the first time
+in the working window.
+If the entity is used as left hand side of an assignment,
+it gets the value number of the right hand side.
+Sometimes the effects of an instruction on an entity cannot
+be determined exactly;
+the current value and value number of the entity may become
+inconsistent.
+Hence the current value number must be forgotten.
+This is achieved by giving the entity a new value number
+that was not used before.
+The entity is said to be \fIkilled\fR.
+.PP
+As information is lost when an entity is killed,
+CS tries to save as many entities as possible.
+In case of an indirect assignment through a pointer,
+some analysis is done to see which variables cannot be altered.
+For a procedure call, the interprocedural information contained
+in the procedure table is used to restrict the set of entities that may
+be changed by the call.
+Local variables for which the front end generated 
+a register message can never be changed by an indirect assignment
+or a procedure call.
+.NH 3
+Changing the EM text
+.PP
+When a new expression becomes available,
+it is checked whether its result is saved in a local
+that may go in a register.
+The last line of the expression must be followed
+by a STL or SDL instruction
+(depending on the size of the result)
+and a register message must be present for
+this local.
+If there is such a local,
+it is recorded in the available expressions table.
+Each time a new occurrence of this expression
+is found,
+the value number of the local is compared against
+the value number of the result.
+If they are different the local cannot be used and is forgotten.
+.PP
+The available expressions are linked in a list.
+New expressions are linked at the head of the list.
+In this way expressions that are contained within other
+expressions appear later in the list,
+because EM-expressions are postfix.
+The elimination process walks through the list,
+starting at the head, to find the largest expressions first.
+If an expression is eliminated,
+any expression later on in the list, contained in the former expression,
+is removed from the list,
+as expressions can only be eliminated once.
+.PP
+A STL or SDL is emitted after the first occurrence of the expression,
+unless there was an existing local variable that could hold the result.
+.NH 3
+Desirability analysis
+.PP
+Although the global optimizer works on EM code,
+the goal is to improve the quality of the object code.
+Therefore some machine-dependent information is needed
+to decide whether it is desirable to
+eliminate a given expression.
+Because it is impossible for the CS phase to know
+exactly what code will be generated,
+some heuristics are used.
+CS essentially looks for some special cases
+that should not be eliminated.
+These special cases can be turned on or off for a given machine,
+as indicated in a machine descriptor file.
+.PP
+Some operators can sometimes be translated
+into an addressing mode for the machine at hand.
+Such an operator is only eliminated
+if its operand is itself expensive,
+i.e. it is not just a simple load.
+The machine descriptor file contains a set of such operators.
+.PP
+Eliminating the loading of the Local Base or
+the Argument Base by the LXL resp. LXA instruction
+is only beneficial if the difference in lexical levels
+exceeds a certain threshold.
+The machine descriptor file contains this threshold.
+.PP
+Replacing a SAR or a LAR by an AAR followed by a LOI
+may possibly increase the size of the object code.
+We assume that this is only possible when the
+size of the array element is greater than some limit.
+.PP
+There are back ends that can very efficiently translate
+the index computing instruction sequence LOC SLI ADS.
+If this is the case,
+the SLI instruction between a LOC
+and an ADS is not eliminated.
+.PP
+To handle unforeseen cases, the descriptor file may also contain
+a set of operators that should never be eliminated.
+.NH 3
+The algorithm
+.PP
+After these preparatory explanations,
+the algorithm itself is easy to understand.
+For each instruction within the current window,
+the following steps are performed in the given order :
+.IP 1.
+Check if this instruction defines an entity.
+If so, the set of entities is updated accordingly.
+.IP 2.
+Kill all entities that might be affected by this instruction.
+.IP 3.
+Simulate the instruction on the fake-stack.
+If this instruction is an operator,
+update the list of available expressions accordingly.
+.PP
+The result of this process is
+a list of available expressions plus the information
+needed to eliminate them.
+Expressions that are desirable to eliminate are eliminated.
+Next, the window is shifted and the process is repeated.
--- a/doc/ego/cs/cs4
+++ b/doc/ego/cs/cs4
@ -0,0 +1,305 @@
+.NH 2
+Implementation.
+.PP
+In this section we will discuss the implementation of the CS phase.
+We will first describe the basic actions that are undertaken
+by the algorithm, then the algorithm itself.
+.NH 3
+Partitioning the EM instructions
+.PP
+There are over 100 EM instructions.
+For our purpose we partition this huge set into groups of
+instructions which can be more or less conveniently handled together.
+.PP
+There are groups for all sorts of load instructions:
+simple loads, expensive loads, loads of an array element.
+A load is considered \fIexpensive\fP when more than one EM instruction
+is involved in loading it.
+The load of a lexical entity is also considered expensive.
+For instance: LOF is expensive, LAL is not.
+LAR forms a group on its own, 
+because it is not only an expensive load,
+but also implicitly includes the ternary operator AAR,
+which computes the address of the array element.
+.PP
+There are groups for all sorts of operators:
+unary, binary, and ternary.
+The groups of operators are further partitioned according to the size
+of their operand(s) and result.
+\" .PP
+\" The distinction between operators and expensive loads is not always clear.
+\" The ADP instruction for example,
+\" might seem a unary operator because it pops one item
+\" (a pointer) from the stack.
+\" However, two ADP-instructions which pop an item with the same value number
+\" need not have the same result,
+\" because the attributes (an offset, to be added to the pointer)
+\" can be different.
+\" Is it then a binary operator?
+\" That would give rise to the strange, and undesirable,
+\" situation that some binary operators pop two operands
+\" and others pop one.
+\" The conclusion is inevitable:
+\" we have been fooled by the name (ADd Pointer).
+\" The ADP-instruction is an expensive load.
+\" In this context LAF, meaning Load Address of oFfsetted,
+\" would have been a better name,
+\" corresponding to LOF, like LAL,
+\" Load Address of Local, corresponds to LOL.
+.PP
+There are groups for all sorts of stores:
+direct, indirect, array element.
+The SAR forms a group on its own for the same reason
+as appeared with LAR.
+.PP
+The effect of the remaining instructions is less clear.
+They do not help very much in parsing expressions or
+in constructing our pseudo symboltable.
+They are partitioned according to the following criteria:
+.RS
+.IP "-"
+They change the value of an entity without using the stack
+(e.g. ZRL, DEE).
+.IP "-"
+They are subroutine calls (CAI, CAL).
+.IP "-"
+They change the stack in some irreproducible way (e.g. ASP, LFR, DUP).
+.IP "-"
+They have no effect whatever on the stack or on the entities.
+This does not mean they can be deleted,
+but they can be ignored for the moment
+(e.g. MES, LIN, NOP).
+.IP "-"
+Their effect is too complicated to compute,
+so we just assume worst case behaviour.
+Hopefully, they do not occur very often.
+(e.g. MON, STR, BLM).
+.IP "-"
+They signal the end of the basic block (e.g. BLT, RET, TRP).
+.RE
+.NH 3
+Parsing expressions
+.PP
+To recognize expressions,
+we simulate the behaviour of the EM machine,
+by means of a fake-stack.
+When we scan the instructions in sequential order,
+we first encounter the instructions that load
+the operands on the stack,
+and then the instruction that indicates the operator,
+because EM expressions are postfix.
+When we find an instruction to load an operand,
+we load on the fake-stack a struct with the following information:
+.DS
+(1) the value number of the operand
+(2) the size of the operand
+(3) a pointer to the first line of EM-code
+    that constitutes the operand
+.DE
+In most cases, (3) will point to the line
+that loaded the operand (e.g. LOL, LOC),
+i.e. there is only one line that refers to this operand,
+but sometimes some information must be popped
+to load the operand (e.g. LOI, LAR).
+This information must have been pushed before,
+so we also pop a pointer to the first line that pushed
+the information.
+This line is now the first line that defines the operand.
+.PP
+When we find the operator instruction,
+we pop its operand(s) from the fake-stack.
+The first line that defines the first operand is
+now the first line of the expression.
+We now have all information to determine
+whether the just parsed expression has occurred before.
+We also know the first and last line of the expression;
+we need this when we decide to eliminate it.
+Associated with each available expression is a set
+whose elements each contain the first and last line of
+a recurrence of this expression.
+.PP
+Not only will the operand(s) be popped from the fake-stack,
+but the following will be pushed:
+.DS
+(1) the value number of the result
+(2) the size of the result
+(3) a pointer to the first line of the expression
+.DE
+In this way an item on the fake-stack always contains
+the necessary information.
+As you see, EM expressions are parsed bottom up.
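The fake-stack bookkeeping described in this section can be sketched in C as follows. The names (`token`, `fake_stack`, `span`, `fs_push`, `fs_pop`, `fs_binop`) are hypothetical, and the sketch assumes the caller has already determined the value number of the result; only binary operators are shown.

```c
#include <assert.h>

#define STACK_MAX 32

/* Descriptor of an operand or expression on the fake-stack. */
struct token {
    int vn;          /* value number */
    int size;        /* size in bytes */
    int first_line;  /* first EM line that constitutes the item */
};

struct fake_stack {
    struct token tok[STACK_MAX];
    int depth;
};

/* First and last EM line of a parsed expression. */
struct span {
    int first_line;
    int last_line;
};

void fs_init(struct fake_stack *fs)
{
    fs->depth = 0;
}

void fs_push(struct fake_stack *fs, int vn, int size, int first_line)
{
    assert(fs->depth < STACK_MAX);
    fs->tok[fs->depth].vn = vn;
    fs->tok[fs->depth].size = size;
    fs->tok[fs->depth].first_line = first_line;
    fs->depth++;
}

struct token fs_pop(struct fake_stack *fs)
{
    assert(fs->depth > 0);
    return fs->tok[--fs->depth];
}

/*
 * Simulate a binary operator found at EM line `line`: pop both operands,
 * push a descriptor for the result, and report the lines the expression
 * spans. The expression starts where its first (left) operand starts.
 */
struct span fs_binop(struct fake_stack *fs, int result_vn,
                     int result_size, int line)
{
    struct token right = fs_pop(fs);
    struct token left = fs_pop(fs);
    struct span sp;

    (void)right;   /* the right operand always starts after the left one */
    sp.first_line = left.first_line;
    sp.last_line = line;
    fs_push(fs, result_vn, result_size, sp.first_line);
    return sp;
}
```

For "c + a * b", with the three loads on lines 1 to 3, the multiply on line 4 and the add on line 5, the multiplication spans lines 2 to 4 and the addition spans lines 1 to 5, leaving one descriptor on the fake-stack.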
+.NH 3
+Updating entities
+.PP
+As said before,
+we build our private "symboltable",
+while scanning the EM-instructions.
+The behaviour of the EM-machine is not only reflected
+in the fake-stack,
+but also in the entities.
+When an entity is created,
+we do not yet know its value,
+so we assign a brand new value number to it.
+Each time a store-instruction is encountered,
+we change the value number of the target entity of this store
+to the value number of the token that was popped
+from the fake-stack.
+Because entities may overlap,
+we must also "forget" the value numbers of entities
+that might be affected by this store.
+Each such entity will be \fIkilled\fP,
+i.e. assigned a brand new value number.
+.PP
+Because we lose information when we forget
+the value number of an entity,
+we try to save as many entities as possible.
+When we store into an external,
+we don't have to kill locals and vice versa.
+Furthermore, we can see whether two locals or
+two externals overlap,
+because we know the offset from the local base,
+resp. the offset within the data block,
+and the size.
+The situation becomes more complicated when we have
+to consider indirection.
+The worst case is that we store through an unknown pointer.
+In that case we kill all entities except those locals
+for which a so-called \fIregister message\fP has been generated;
+this register message indicates that this local can never be
+accessed indirectly.
+If we know this pointer we can be more careful.
+If it points to a local then the entity that is accessed through
+this pointer can never overlap with an external.
+If it points to an external this entity can never overlap with a local.
+Furthermore, in the latter case,
+we can find the data block this entity belongs to.
+Since pointer arithmetic is only defined within a data block,
+this entity can never overlap with entities that are known to
+belong to another data block.
+.PP
+Not only after a store-instruction but also after a 
+subroutine-call it may be necessary to kill entities;
+the subroutine may affect global variables or store
+through a pointer.
+If a subroutine is called that is not available as EM-text,
+we assume worst case behaviour,
+i.e. we kill all entities without register message.
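The simplest case above, a direct store into a local that may overlap other locals, can be sketched in C like this. The `struct local` table and the function `store_local` are hypothetical; indirection, externals, register messages and procedure calls are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical symbol table entry for a local variable. */
struct local {
    long off;    /* offset from the local base */
    long size;   /* size in bytes */
    int  vn;     /* current value number */
};

/* Two locals overlap when their byte ranges [off, off+size) intersect. */
static int overlaps(const struct local *a, const struct local *b)
{
    return a->off < b->off + b->size && b->off < a->off + a->size;
}

/*
 * Simulate a store of a value with number `value_vn` into tab[target].
 * Every other local that overlaps the target is killed, i.e. given a
 * brand new value number taken from *next_vn.
 */
void store_local(struct local *tab, size_t n, size_t target,
                 int value_vn, int *next_vn)
{
    size_t i;

    assert(tab != NULL && target < n && next_vn != NULL);
    for (i = 0; i < n; i++) {
        if (i != target && overlaps(&tab[i], &tab[target]))
            tab[i].vn = (*next_vn)++;
    }
    tab[target].vn = value_vn;
}
```

With a 2-byte local at offset -2, a 4-byte local at offset -4 and a 2-byte local at offset -8, a store into the 4-byte local kills the first entry and leaves the third untouched.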
+.NH 3
+Additions and replacements.
+.PP
+When a new expression becomes available,
+we check whether the result is saved in a local
+that may go in a register.
+The last line of the expression must be followed
+by a STL or SDL instruction,
+depending on the size of the result
+(resp. WS and 2*WS),
+and a register message must be present for
+this local.
+If we have found such a local,
+we store a pointer to it with the available expression.
+Each time a new occurrence of this expression
+is found,
+we compare the value number of the local against
+the value number of the result.
+When they are different we remove the pointer to it,
+because we cannot use it.
+.PP
+The available expressions are singly linked in a list.
+When a new expression becomes available,
+we link it at the head of the list.
+In this way expressions that are contained within other
+expressions appear later in the list,
+because EM-expressions are postfix.
+When we are going to eliminate expressions,
+we walk through the list,
+starting at the head, to find the largest expressions first.
+When we decide to eliminate an expression,
+we look at the expressions in the tail of the list,
+starting from where we are now,
+to delete expressions that are contained within
+the chosen one because
+we cannot eliminate an expression more than once.
+.PP
+When we are going to eliminate expressions,
+and we do not have a local that holds the result,
+we emit a STL or SDL after the line where the expression
+was first found.
+The other occurrences are simply removed,
+unless they contain instructions whose effect is not
+limited to the stack, e.g. messages, stores, calls.
+Before each instruction that needs the result on the stack,
+we emit a LOL or LDL.
+When the expression was an AAR,
+but the instruction was a LAR or a SAR,
+we append a LOI resp. a STI of the number of bytes
+in an array-element after each LOL/LDL.
+.NH 3
+Desirability analysis
+.PP
+Although the global optimizer works on EM code,
+the goal is to improve the quality of the object code.
+Therefore we need some machine dependent information
+to decide whether it is desirable to
+eliminate a given expression.
+Because it is impossible for the CS phase to know
+exactly what code will be generated,
+we use some heuristics.
+In most cases it will save time when we eliminate an
+operator, so we just do it.
+We only look for some special cases.
+.PP
+Some operators can in some cases be translated
+into an addressing mode for the machine at hand.
+We only eliminate such an operator,
+when its operand is itself "expensive",
+i.e. not just a simple load.
+The user of the CS phase has to supply
+a set of such operators.
+.PP
+Eliminating the loading of the Local Base or
+the Argument Base by the LXL resp. LXA instruction
+is only beneficial when the number of lexical levels
+we have to go back exceeds a certain threshold.
+This threshold will be different when registers
+are saved by the back end.
+The user must supply this threshold.
+.PP
+Replacing a SAR or a LAR by an AAR followed by a LOI
+may possibly increase the size of the object code.
+We assume that this is only possible when the
+size of the array element is greater than some
+(user-supplied) limit.
+.PP
+There are back ends that can very efficiently translate
+the index computing instruction sequence LOC SLI ADS.
+If this is the case,
+we do not eliminate the SLI instruction between a LOC
+and an ADS.
+.PP
+To handle unforeseen cases, the user may also supply
+a set of operators that should never be eliminated.
+.NH 3
+The algorithm
+.PP
+After these preparatory explanations,
+we can be short about the algorithm itself.
+For each instruction within our window,
+the following steps are performed in the order given:
+.IP 1.
+We check if this instruction defines an entity.
+If this is the case the set of entities is updated accordingly.
+.IP 2.
+We kill all entities that might be affected by this instruction.
+.IP 3.
+The instruction is simulated on the fake-stack.
+Copy propagation is done.
+If this instruction is an operator,
+we update the list of available expressions accordingly.
+.PP
+When we have processed all instructions this way,
+we have built a list of available expressions plus the information we
+need to eliminate them.
+We eliminate those expressions that desirability analysis
+finds worthwhile.
+Then we shift our window and continue.
--- a/doc/ego/cs/cs5
+++ b/doc/ego/cs/cs5
@ -0,0 +1,46 @@
+.NH 2
+Source files of CS
+.PP
+The sources of CS are in the following files and packages:
+.IP cs.h 14
+declarations of global variables and data structures
+.IP cs.c
+the routine main;
+a driving routine to process
+the basic blocks in the right order
+.IP vnm
+implements a procedure that performs
+the value numbering on one basic block
+.IP eliminate
+implements a procedure that does the
+transformations, if desirable
+.IP avail
+implements a procedure that manipulates the list of available expressions
+.IP entity
+implements a procedure that manipulates the set of entities
+.IP getentity
+implements a procedure that extracts the
+pseudo symboltable information from EM-instructions;
+uses a small table
+.IP kill
+implements several routines that find the entities
+that might be changed by EM-instructions
+and kill them
+.IP partition
+implements several routines that partition the huge set
+of EM-instructions into more or less manageable,
+more or less logical chunks
+.IP profit
+implements a procedure that decides whether it
+is advantageous to eliminate an expression;
+also removes expressions with side-effects
+.IP stack
+implements the fake-stack and operations on it
+.IP alloc
+implements several allocation routines
+.IP aux
+implements several auxiliary routines
+.IP debug
+implements several routines to provide debugging
+and verbose output
+.LP