166 lines
		
	
	
	
		
			5 KiB
		
	
	
	
		
			Text
		
	
	
	
	
	
			
		
		
	
	
			166 lines
		
	
	
	
		
			5 KiB
		
	
	
	
		
			Text
		
	
	
	
	
	
.NH 2
 | 
						|
The Intermediate Code construction phase
 | 
						|
.PP
 | 
						|
The first phase of the global optimizer,
 | 
						|
called
 | 
						|
.UL IC,
 | 
						|
constructs a major part of the intermediate code.
 | 
						|
To be specific, it produces:
 | 
						|
.IP -
 | 
						|
the EM text
 | 
						|
.IP -
 | 
						|
the object table
 | 
						|
.IP -
 | 
						|
part of the procedure table
 | 
						|
.LP
 | 
						|
The calling, change and use attributes of a procedure
 | 
						|
and all its flags except the external and bodyseen flags
 | 
						|
are computed by the next phase (Control Flow phase).
 | 
						|
.PP
 | 
						|
As explained before,
 | 
						|
the intermediate code does not contain
 | 
						|
any names of variables or procedures.
 | 
						|
The normal identifiers are replaced by identifying
 | 
						|
numbers.
 | 
						|
Yet, the output of the global optimizer must
 | 
						|
contain normal identifiers, as this
 | 
						|
output is in Compact Assembly Language format.
 | 
						|
We certainly want all externally visible names
 | 
						|
to be the same in the input as in the output,
 | 
						|
because the optimized EM module may be a library unit,
 | 
						|
used by other modules.
 | 
						|
IC dumps the names of all procedures and data labels
 | 
						|
on two files:
 | 
						|
.IP -
 | 
						|
the procedure dump file, containing tuples (P_ID, procedure name)
 | 
						|
.IP -
 | 
						|
the data dump file, containing tuples (D_ID, data label name)
 | 
						|
.LP
 | 
						|
The names of instruction labels are not dumped,
 | 
						|
as they are not visible outside the procedure
 | 
						|
in which they are defined.
 | 
						|
.PP
 | 
						|
The input to IC consists of one or more files.
 | 
						|
Each file is either an EM module in Compact Assembly Language
 | 
						|
format, or a Unix archive file (library) containing such modules.
 | 
						|
IC only extracts those modules from a library that are
 | 
						|
needed somehow, just as a linker does.
 | 
						|
It is advisable to present as much code
 | 
						|
of the EM program as possible to the optimizer,
 | 
						|
although it is not required to present the whole program.
 | 
						|
If a procedure is called somewhere in the EM text,
 | 
						|
but its body (text) is not included in the input,
 | 
						|
its bodyseen flag in the procedure table will still
 | 
						|
be off.
 | 
						|
Whenever such a procedure is called,
 | 
						|
we assume the worst case for everything;
 | 
						|
it will change and use all variables it has access to,
 | 
						|
it will call every procedure etc.
 | 
						|
.sp
 | 
						|
Similarly, if a data label is used
 | 
						|
but not defined, the PSEUDO attribute in its data block
 | 
						|
will be set to UNKNOWN.
 | 
						|
.NH 3
 | 
						|
Implementation
 | 
						|
.PP
 | 
						|
Part of the code for the EM Peephole Optimizer
 | 
						|
.[
 | 
						|
staveren peephole toplass
 | 
						|
.]
 | 
						|
has been used for IC.
 | 
						|
Especially the routines that read and unravel
 | 
						|
Compact Assembly Language and the identifier
 | 
						|
lookup mechanism have been used.
 | 
						|
New code was added to recognize objects,
 | 
						|
build the object and procedure tables and to
 | 
						|
output the intermediate code.
 | 
						|
.PP
 | 
						|
IC uses singly linked linear lists for both the
 | 
						|
procedure and object table.
 | 
						|
Hence there are no limits on the size of such
 | 
						|
a table (except for the trivial fact that it must fit
 | 
						|
in main memory).
 | 
						|
Both tables are outputted after all EM code has
 | 
						|
been processed.
 | 
						|
IC reads the EM text of one entire procedure
 | 
						|
at a time,
 | 
						|
processes it and appends the modified code to
 | 
						|
the EM text file.
 | 
						|
EM code is represented internally as a doubly linked linear
 | 
						|
list of EM instructions.
 | 
						|
.PP
 | 
						|
Objects are recognized by looking at the operands
 | 
						|
of instructions that reference global data.
 | 
						|
If we come across the instructions:
 | 
						|
.DS
 | 
						|
.TS
 | 
						|
l l.
 | 
						|
LDE X+6	-- Load Double External
 | 
						|
LAE X+20	-- Load Address External
 | 
						|
.TE
 | 
						|
.DE
 | 
						|
we conclude that the data block
 | 
						|
preceded by the data label X contains an object
 | 
						|
at offset 6 of size twice the word size,
 | 
						|
and an object at offset 20 of unknown size.
 | 
						|
.sp
 | 
						|
A data block entry of the object table is allocated
 | 
						|
at the first reference to a data label.
 | 
						|
If this reference is a defining occurrence
 | 
						|
or a INA pseudo instruction,
 | 
						|
the label is not externally visible
 | 
						|
.[~[
 | 
						|
keizer architecture
 | 
						|
.], section 11.1.4.3]
 | 
						|
In this case, the external flag of the data block
 | 
						|
is turned off.
 | 
						|
If the first reference is an applied occurrence
 | 
						|
or a EXA pseudo instruction, the flag is set.
 | 
						|
We record this information, because the
 | 
						|
optimizer may change the order of defining and
 | 
						|
applied occurrences.
 | 
						|
The INA and EXA pseudos are removed from the EM text.
 | 
						|
They may be regenerated by the last phase
 | 
						|
of the optimizer.
 | 
						|
.sp
 | 
						|
Similar rules hold for the procedure table
 | 
						|
and the INP and EXP pseudos.
 | 
						|
.NH 3
 | 
						|
Source files of IC
 | 
						|
.PP
 | 
						|
The source files of IC consist
 | 
						|
of the files ic.c, ic.h and several packages.
 | 
						|
.UL ic.h
 | 
						|
contains type definitions, macros and
 | 
						|
variable declarations that may be used by
 | 
						|
ic.c and by every package.
 | 
						|
.UL ic.c
 | 
						|
contains the definitions of these variables,
 | 
						|
the procedure
 | 
						|
.UL main
 | 
						|
and some high level I/O routines used by main.
 | 
						|
.sp
 | 
						|
Every package xxx consists of two files.
 | 
						|
ic_xxx.h contains type definitions,
 | 
						|
macros, variable declarations and
 | 
						|
procedure declarations that may be used by
 | 
						|
every .c file that includes this .h file.
 | 
						|
The file ic_xxx.c provides the
 | 
						|
definitions of these variables and
 | 
						|
the implementation of the declared procedures.
 | 
						|
IC uses the following packages:
 | 
						|
.IP lookup: 18
 | 
						|
procedures that loop up procedure, data label
 | 
						|
and instruction label names; procedures to dump
 | 
						|
the procedure and data label names.
 | 
						|
.IP lib:
 | 
						|
one procedure that gets the next useful input module;
 | 
						|
while scanning archives, it skips unnecessary modules.
 | 
						|
.IP aux:
 | 
						|
several auxiliary routines.
 | 
						|
.IP io:
 | 
						|
low-level I/O routines that unravel the Compact
 | 
						|
Assembly Language.
 | 
						|
.IP put:
 | 
						|
routines that output the intermediate code
 | 
						|
.LP
 |