75 lines
3 KiB
Text
75 lines
3 KiB
Text
.bp
|
|
.NH 1
|
|
Compact assembly generation
|
|
.NH 2
|
|
Introduction
|
|
.PP
|
|
The "Compact Assembly generation phase" (CA) transforms the
|
|
intermediate code of the optimizer into EM code in
|
|
Compact Assembly Language (CAL) format.
|
|
In the intermediate code, all program entities
|
|
(such as procedures, labels, global variables)
|
|
are denoted by a unique identifying number (see 3.5).
|
|
In the CAL output of the optimizer these numbers have to
|
|
be replaced by normal identifiers (strings).
|
|
The original identifiers of the input program are used whenever possible.
|
|
Recall that the IC phase generates two files that can be
|
|
used to map unique identifying numbers to procedure names and
|
|
global variable names.
|
|
For instruction labels CA always generates new names.
|
|
The reasons for doing so are:
|
|
.IP -
|
|
instruction labels are only visible inside one procedure, so they can
|
|
not be referenced in other modules
|
|
.IP -
|
|
the names are not very suggestive anyway, as they must be integer numbers
|
|
.IP -
|
|
the optimizer considerably changes the control structure of the program,
|
|
so there is really no one to one mapping of instruction labels in
|
|
the input and the output program.
|
|
.LP
|
|
As the optimizer combines all input modules into one module,
|
|
visibility problems may occur.
|
|
Two modules M1 and M2 can both define an identifier X (provided that
|
|
X is not externally visible in any of these modules).
|
|
If M1 and M2 are combined into one module M, two distinct
|
|
entities with the same name would exist in M, which
|
|
is not allowed.
|
|
.[~[
|
|
tanenbaum machine architecture
|
|
.], section 11.1.4.3]
|
|
In these cases, CA invents a new unique name for one of the entities.
|
|
.NH 2
|
|
Implementation
|
|
.PP
|
|
CA first reads the files containing the procedure and global variable names
|
|
and stores the names in two tables.
|
|
It scans these tables to make sure that all names are different.
|
|
Subsequently it reads the EM text, one procedure at a time,
|
|
and outputs it in CAL format.
|
|
The major part of the code that does the latter transformation
|
|
is adapted from the EM Peephole Optimizer.
|
|
.PP
|
|
The main problem of the implementation of CA is to
|
|
assure that the visibility rules are obeyed.
|
|
If an identifier must be externally visible (i.e.
|
|
it was externally visible in the input program)
|
|
and the identifier is defined (in the output program) before
|
|
being referenced,
|
|
an EXA or EXP pseudo must be generated for it.
|
|
(Note that the optimizer may change the order of definitions and
|
|
references, so some pseudos may be needed that were not
|
|
present in the input program).
|
|
On the other hand, an identifier may be only internally visible.
|
|
If such an identifier is referenced before being defined,
|
|
an INA or INP pseudo must be emitted prior to its first reference.
|
|
.UH
|
|
Acknowledgements
|
|
.PP
|
|
The author would like to thank Andy Tanenbaum for his guidance,
|
|
Duk Bekema for implementing the Common Subexpression Elimination phase
|
|
and writing the initial documentation of that phase,
|
|
Dick Grune for reading the manuscript of this report
|
|
and Ceriel Jacobs, Ed Keizer, Martin Kersten, Hans van Staveren
|
|
and the members of the S.T.W. user's group for their
|
|
interest and assistance.
|