comments Dick
parent
8c20160cb6
commit
5b4ae84255
1 changed files with 196 additions and 292 deletions
488
doc/ceg/ceg.tr
|
@ -15,22 +15,22 @@ Amsterdam, The Netherlands
|
||||||
Introduction
|
Introduction
|
||||||
.PP
|
.PP
|
||||||
A \fBcode expander\fR (\fBce\fR for short) is a part of the
|
A \fBcode expander\fR (\fBce\fR for short) is a part of the
|
||||||
Amsterdam Compiler Kit (\fBACK\fR), which provides the user with
|
Amsterdam Compiler Kit (\fBACK\fR) and provides the user with
|
||||||
high-speed generation of medium-quality code. Although conceptually
|
high-speed generation of medium-quality code. Although conceptually
|
||||||
equivalent to the more usual \fBcode generator\fR, it differs in some
|
equivalent to the more usual \fBcode generator\fR, it differs in some
|
||||||
aspects.
|
aspects.
|
||||||
.LP
|
.LP
|
||||||
Normally, a program to be compiled with \fBACK\fR
|
Normally, a program to be compiled with \fBACK\fR
|
||||||
is first fed into the preprocessor. The output of the preprocessor goes
|
is first fed to the preprocessor. The output of the preprocessor goes
|
||||||
into the appropiate front end, which produces EM
|
into the appropriate front end, which produces EM
|
||||||
.[~[
|
.[~[
|
||||||
IR-81
|
Tanenbaum
|
||||||
.]]
|
.]]
|
||||||
(a
|
(a
|
||||||
machine independent low level intermediate code). The generated EM code is fed
|
machine independent low level intermediate code). The generated EM code is fed
|
||||||
into the peephole optimizer, which scans it with a window of a few instructions,
|
into the peephole optimizer, which scans it with a window of a few instructions,
|
||||||
replacing certain inefficient code sequences by better ones. After the
|
replacing certain inefficient code sequences by better ones. After the
|
||||||
peephole optimizer a backend follows, which produces high-quality assembly code.
|
peephole optimizer a back end follows, which produces high-quality assembly code.
|
||||||
The assembly code goes via the target optimizer into the assembler and the
|
The assembly code goes via the target optimizer into the assembler and the
|
||||||
object code then goes into the
|
object code then goes into the
|
||||||
linker/loader, the final component in the pipeline.
|
linker/loader, the final component in the pipeline.
|
||||||
|
@ -41,14 +41,15 @@ reducing compile time is more important than execution time of a program.
|
||||||
For this purpose a new scheme is introduced:
|
For this purpose a new scheme is introduced:
|
||||||
.IP \ \ 1:
|
.IP \ \ 1:
|
||||||
The code generator and assembler are
|
The code generator and assembler are
|
||||||
replaced by one program: the \fBcode expander\fR, which directly expands
|
replaced by a library, the \fBcode expander\fR, consisting of a set of routines
|
||||||
the EM-instructions into a relocatable objectfile.
|
which directly expand
|
||||||
|
the EM-instructions into a relocatable object file.
|
||||||
The peephole and target optimizer are not used.
|
The peephole and target optimizer are not used.
|
||||||
.IP \ \ 2:
|
.IP \ \ 2:
|
||||||
The front end and \fBce\fR are combined into a single
|
These routines replace the usual EM-generating routines in the front end; this
|
||||||
program, eliminating the overhead of intermediate files.
|
eliminates the overhead of intermediate files.
|
||||||
.LP
|
.LP
|
||||||
This results in a fast compiler producing objectfile, ready to be
|
This results in a fast compiler producing object file, ready to be
|
||||||
linked and loaded, at the cost of unoptimized object code.
|
linked and loaded, at the cost of unoptimized object code.
|
||||||
.LP
|
.LP
|
||||||
Extra speedup is obtained by generating code for a single EM-instruction
|
Extra speedup is obtained by generating code for a single EM-instruction
|
||||||
|
@ -63,16 +64,21 @@ debugged and tested in less than two weeks.
|
||||||
This document describes the tools for automatically generating a
|
This document describes the tools for automatically generating a
|
||||||
\fBce\fR (a library of C files), from two tables and
|
\fBce\fR (a library of C files), from two tables and
|
||||||
a few machine-dependent functions.
|
a few machine-dependent functions.
|
||||||
A throughout knowledge of EM is necessary to understand this document.
|
A thorough knowledge of EM is necessary to understand this document.
|
||||||
.NH
|
.NH
|
||||||
An overview (? Inside the code expander generator)
|
The code expander generator
|
||||||
|
.PP
|
||||||
|
The code expander generator (\fBceg\fR) generates a code expander from
|
||||||
|
two tables and a few machine-dependent functions. This section explains how
|
||||||
|
the \fBceg\fR works. The first half describes the transformations on the
|
||||||
|
two tables. The second half tells how these transformations are done by the
|
||||||
|
\fBceg\fR.
|
||||||
.PP
|
.PP
|
||||||
A code expander consists of a set of routines that convert EM-instructions
|
A code expander consists of a set of routines that convert EM-instructions
|
||||||
directly to relocatable object code. These routines are called by a front
|
directly to relocatable object code. These routines are called by a front
|
||||||
end through the
|
end through the EM_CODE(3ACK)
|
||||||
EM_CODE(3ACK)
|
|
||||||
.[~[
|
.[~[
|
||||||
EM_CODE(3ACK)
|
EM_CODE
|
||||||
.]]
|
.]]
|
||||||
interface. To free the table writer of the burden of building
|
interface. To free the table writer of the burden of building
|
||||||
an object file, we supply a set of routines that build an object file
|
an object file, we supply a set of routines that build an object file
|
||||||
|
@ -82,7 +88,9 @@ ACK_A.OUT(5L)
|
||||||
.]]
|
.]]
|
||||||
format (see appendix B). This set of routines is called
|
format (see appendix B). This set of routines is called
|
||||||
the
|
the
|
||||||
\fBback\fR-primitives (see appendix A).
|
\fBback\fR-primitives (see appendix A). In short, a code expander consists of a
|
||||||
|
set of routines which map the EM_CODE interface on the
|
||||||
|
\fBback\fR-primitives interface, which generate object code.
|
||||||
.PP
|
.PP
|
||||||
To avoid repetition of the same sequences of
|
To avoid repetition of the same sequences of
|
||||||
\fBback\fR-primitives in different
|
\fBback\fR-primitives in different
|
||||||
|
@ -91,8 +99,9 @@ and to improve readability, the EM-to-object information must be supplied in
|
||||||
two
|
two
|
||||||
tables. The EM_table maps EM to an assembly language, and the as_table
|
tables. The EM_table maps EM to an assembly language, and the as_table
|
||||||
maps
|
maps
|
||||||
assembly to \fBback\fR-primitives. The assembly language may be an
|
assembly to \fBback\fR-primitives. The assembly language is chosen by the
|
||||||
actual assembly language or an ad-hoc one designed by the table writer.
|
table writer. It can either be an actual assembly language or his ad-hoc
|
||||||
|
designed language.
|
||||||
.LP
|
.LP
|
||||||
The following picture shows the dependencies between the different components:
|
The following picture shows the dependencies between the different components:
|
||||||
.sp
|
.sp
|
||||||
|
@ -105,7 +114,7 @@ D: arrow right with .start at A.center - (0.25i, 0)
|
||||||
E: arrow right with .start at B.center - (0.25i, 0)
|
E: arrow right with .start at B.center - (0.25i, 0)
|
||||||
F: arrow right with .start at C.center - (0.25i, 0)
|
F: arrow right with .start at C.center - (0.25i, 0)
|
||||||
"EM_CODE(3ACK)" at A.start above
|
"EM_CODE(3ACK)" at A.start above
|
||||||
"EM_TABLE" at B.start above
|
"EM_table" at B.start above
|
||||||
"as_table" at C.start above
|
"as_table" at C.start above
|
||||||
"source language " at D.start rjust
|
"source language " at D.start rjust
|
||||||
"EM" at 0.5 of the way between D.end and E.start
|
"EM" at 0.5 of the way between D.end and E.start
|
||||||
|
@ -115,17 +124,28 @@ H: " back primitives" at F.end ljust
|
||||||
" (ACK_A.OUT)" at H - (0, 0.2i) ljust
|
" (ACK_A.OUT)" at H - (0, 0.2i) ljust
|
||||||
.PE
|
.PE
|
||||||
.PP
|
.PP
|
||||||
The entries in the as_table map assembly instructions on \fBback\fR-primitives.
|
Although the picture suggests that during compilation of the EM instructions are
|
||||||
The as_table is used to transform the EM->assembly mapping into an EM->
|
first transformed into assembly instructions and then the assembly instructions
|
||||||
\fBback\fR- primitives mapping;
|
are transformed into object-generating calls, the \fBback-primitives\fR, this
|
||||||
the expanded EM_table is then transformed into a set of C
|
is not what happens in practice, although the user is free to think it does.
|
||||||
|
Actually, however the EM_table and the as_table are combined during code
|
||||||
|
expander generation time, yielding an imaginary compound table that results in
|
||||||
|
routines from the EM_CODE interface that generate object code directly.
|
||||||
|
.PP
|
||||||
|
As already indicated, the compound table does not exist either. Instead, each
|
||||||
|
assembly instruction in the as_table is converted to a routine generating C
|
||||||
.[~[
|
.[~[
|
||||||
Kernighan
|
Kernighan
|
||||||
.]]
|
.]]
|
||||||
routines, which are
|
code to generate C code to call the \fBback\fR-primitives. The EM_table is
|
||||||
normally incorporated in a compiler. All this happens during compiler
|
converted into a program that for each EM instruction generates a routine,
|
||||||
generation time. The C routines are activated during the
|
using the routines generated from the as_table. Execution of the latter program
|
||||||
execution of the compiler.
|
will then generate the code expander.
|
||||||
|
.PP
|
||||||
|
This scheme allows great flexibility in the table writing, while still
|
||||||
|
resulting in a very efficient code expander. One implication is that the
|
||||||
|
as_table is interpreted twice and the EM_table only once. This has consequences
|
||||||
|
for their structure.
|
||||||
.PP
|
.PP
|
||||||
To illustrate what happens, we give an example. The example is an entry in
|
To illustrate what happens, we give an example. The example is an entry in
|
||||||
the tables for the VAX-machine. The assembly language chosen is a subset of the
|
the tables for the VAX-machine. The assembly language chosen is a subset of the
|
||||||
|
@ -135,19 +155,35 @@ One of the most fundamental operations in EM is ``loc c", load the value of c
|
||||||
on the stack. To expand this instruction the
|
on the stack. To expand this instruction the
|
||||||
tables contain the following information:
|
tables contain the following information:
|
||||||
.DS
|
.DS
|
||||||
\f5
|
EM_table : \f5
|
||||||
EM_table : C_loc ==> "pushl $$$1".
|
C_loc ==> "pushl $$$1".
|
||||||
/* $1 refers to the first argument of C_loc. */
|
/* $1 refers to the first argument of C_loc.
|
||||||
|
* $$ is a quoted $. */
|
||||||
|
|
||||||
|
|
||||||
as_table : pushl src : CONST ==>
|
\fRas_table :\f5
|
||||||
|
pushl src : CONST ==>
|
||||||
@text1( 0xd0);
|
@text1( 0xd0);
|
||||||
@text1( 0xef);
|
@text1( 0xef);
|
||||||
@text4( %$( src->num)).
|
@text4( %$( src->num)).
|
||||||
\fR
|
\fR
|
||||||
.DE
|
.DE
|
||||||
.LP
|
.LP
|
||||||
The following routine will be generated for C_loc:
|
The as_table is transformed in the following routine:
|
||||||
|
.DS
|
||||||
|
\f5
|
||||||
|
pushl_instr(src)
|
||||||
|
t_operand *src;
|
||||||
|
/* "t_operand" is a struct defined by the table writer. */
|
||||||
|
{
|
||||||
|
printf("swtxt();");
|
||||||
|
printf("text1( 0xd0);");
|
||||||
|
printf("text1( 0xef);");
|
||||||
|
printf("text4( %s );", substitute_dollar( src->num) );
|
||||||
|
}
|
||||||
|
\fR
|
||||||
|
.DE
|
||||||
|
Using "pushl_instr()", the following routine is generated from the EM_table:
|
||||||
.DS
|
.DS
|
||||||
\f5
|
\f5
|
||||||
C_loc( c)
|
C_loc( c)
|
||||||
|
@ -161,19 +197,20 @@ arith c;
|
||||||
\fR
|
\fR
|
||||||
.DE
|
.DE
|
||||||
.LP
|
.LP
|
||||||
A call by the compiler to "C_loc" will cause the 1-byte numbers "0xd0"
|
A compiler call to "C_loc" will cause the 1-byte numbers "0xd0"
|
||||||
and "0xef"
|
and "0xef"
|
||||||
and the 4-byte value of the variable "c" to be stored in the text segment.
|
and the 4-byte value of the variable "c" to be stored in the text segment.
|
||||||
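The effect of such a call can be sketched in C. This is a minimal illustration, not the generated code: the array `text_seg`, its size, the use of `long` for the argument, and the least-significant-byte-first layout in `text4` are assumptions made for the example. The real back-primitives build an ACK_A.OUT object file rather than filling a plain array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the text segment kept by the back-primitives. */
unsigned char text_seg[64];
size_t text_len = 0;

/* Append one byte to the text segment. */
void text1(unsigned int byte)
{
    assert(text_len < sizeof text_seg);
    text_seg[text_len++] = (unsigned char)(byte & 0xff);
}

/* Append a 4-byte value, least significant byte first (assumed byte order). */
void text4(unsigned long value)
{
    int i;

    for (i = 0; i < 4; i++)
        text1((unsigned int)((value >> (8 * i)) & 0xff));
}

/* Sketch of the routine produced for "loc c": two fixed bytes, then c. */
void C_loc(long c)
{
    text1(0xd0);
    text1(0xef);
    text4((unsigned long)c);
}
```

After one call to `C_loc`, six bytes have been appended: the two fixed bytes followed by the four bytes of the argument.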
.PP
|
.PP
|
||||||
The transformations on the tables are done automatically by the code expander
|
The transformations on the tables are done automatically by the code expander
|
||||||
generator.
|
generator.
|
||||||
The code expander generator consists of two tools, one to handle the
|
The code expander generator consists of two tools, one to handle the
|
||||||
EM_table, \fBemg\fR, and one to handle the as_table, \fBasg\fR. Asg transforms
|
EM_table, \fBemg\fR, and one to handle the as_table, \fBasg\fR. \fBAsg\fR
|
||||||
|
transforms
|
||||||
each assembly instruction into a C routine. These C routines generate calls
|
each assembly instruction into a C routine. These C routines generate calls
|
||||||
to the \fBback\fR-primitives. Finally, the generated C routines are used
|
to the \fBback\fR-primitives. Finally, the generated C routines are used
|
||||||
by emg to generate the actual code expander from the EM_table.
|
by \fBemg\fR to generate the actual code expander from the EM_table.
|
||||||
.PP
|
.PP
|
||||||
The link between emg and \fBasg\fR is an assembly language.
|
The link between \fBemg\fR and \fBasg\fR is an assembly language.
|
||||||
We did not enforce a specific syntax for the assembly language;
|
We did not enforce a specific syntax for the assembly language;
|
||||||
instead we have chosen to give the table writer the freedom
|
instead we have chosen to give the table writer the freedom
|
||||||
to make an ad-hoc assembly language or to use an actual assembly language
|
to make an ad-hoc assembly language or to use an actual assembly language
|
||||||
|
@ -183,26 +220,29 @@ runs on the machine at hand, he can test the EM_table independently from the
|
||||||
as_table. Of course there is a price to pay: the table writer has to
|
as_table. Of course there is a price to pay: the table writer has to
|
||||||
do the decoding of the operands himself. See section 4 for more details.
|
do the decoding of the operands himself. See section 4 for more details.
|
||||||
.PP
|
.PP
|
||||||
Before we explain the several parts of the ceg, we will give an overview of
|
Before we describe the structure of the tables in detail, we will give
|
||||||
the four main phases.
|
an overview of the four main phases.
|
||||||
.IP "phase 1):"
|
.IP "phase 1:"
|
||||||
.br
|
.br
|
||||||
The as_table is transformed by \fBasg\fR. This results in a set of C routines.
|
The as_table is transformed by \fBasg\fR. This results in a set of C routines.
|
||||||
Each assembly-opcode generates one C routine.
|
Each assembly-opcode generates one C routine. Note that a call to such a
|
||||||
.IP "phase 2):"
|
routine does not generate the corresponding object code; it generates C code,
|
||||||
|
which, when executed, generates the desired object code.
|
||||||
|
.IP "phase 2:"
|
||||||
.br
|
.br
|
||||||
The C routines generated by \fBasg\fR are used by emg to expand the EM_table.
|
The C routines generated by \fBasg\fR are used by emg to expand the EM_table.
|
||||||
This
|
This
|
||||||
results in a set of C routines, the code expander, which form the procedural
|
results in a set of C routines, the code expander, which conform to the
|
||||||
interface EM_CODE(3ACK).
|
procedural interface EM_CODE(3ACK). A call to such a routine does indeed
|
||||||
.IP "phase 3):"
|
generate the desired object code.
|
||||||
|
.IP "phase 3:"
|
||||||
.br
|
.br
|
||||||
The front end that uses the procedural interface is linked/loaded with the
|
The front end that uses the procedural interface is linked/loaded with the
|
||||||
code expander generated in phase 2) and the \fBback\fR-primitives.
|
code expander generated in phase 2 and the \fBback\fR-primitives (a supplied
|
||||||
This results in a compiler.
|
library). This results in a compiler.
|
||||||
.IP "phase 4):"
|
.IP "phase 4:"
|
||||||
.br
|
.br
|
||||||
Execution of the compiler; The routines in the code expander are
|
Execution of the compiler. The routines in the code expander are
|
||||||
executed and produce object code.
|
executed and produce object code.
|
||||||
.RE
|
.RE
|
||||||
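The distinction between the output of phase 1 and phase 2 can be illustrated with a small sketch. The routine below is a hypothetical stand-in for what \fBasg\fR derives from the "pushl" rule: it produces C source text, not object code. It writes into a caller-supplied buffer instead of calling printf, purely so that the result can be inspected; the name `pushl_instr_sketch` and the buffer interface are assumptions, not part of the tools.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Phase 1 style routine (sketch): emit the C text that, once compiled into
 * the code expander, calls the back-primitives. operand_expr is the C
 * expression that will yield the operand value at compiler run time.
 * Returns the number of characters snprintf reports for the full text.
 */
int pushl_instr_sketch(char *out, size_t size, const char *operand_expr)
{
    assert(out != NULL && size > 0);
    return snprintf(out, size,
        "swtxt(); text1( 0xd0); text1( 0xef); text4( %s);",
        operand_expr);
}
```

Called with the operand expression "c", the routine yields the body of a phase 2 routine such as C_loc. Only when that body is executed inside the compiler (phase 4) are bytes actually written.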
.NH
|
.NH
|
||||||
|
@ -213,7 +253,7 @@ the first 3 sections describe the syntax of the EM_table,
|
||||||
the
|
the
|
||||||
semantics of the EM_table, and a list of the functions and
|
semantics of the EM_table, and a list of the functions and
|
||||||
constants that must be present in the EM_table, in the file "mach.c" or in
|
constants that must be present in the EM_table, in the file "mach.c" or in
|
||||||
the file "mach.h"; the last section deals with the case that the table
|
the file "mach.h"; and the last section deals with the case that the table
|
||||||
writer wants to generate assembly instead of object code. The section on
|
writer wants to generate assembly instead of object code. The section on
|
||||||
semantics contains many examples.
|
semantics contains many examples.
|
||||||
.NH 2
|
.NH 2
|
||||||
|
@ -244,8 +284,8 @@ a name in the EM_CODE(3ACK) interface. \fBcondition\fR is a C expression.
|
||||||
\fBfunction-call\fR is a call of a C function. \fBlabel\fR, \fBmnemonic\fR
|
\fBfunction-call\fR is a call of a C function. \fBlabel\fR, \fBmnemonic\fR
|
||||||
and \fBoperand\fR are arbitrary strings. If an \fBoperand\fR
|
and \fBoperand\fR are arbitrary strings. If an \fBoperand\fR
|
||||||
contains brackets, the
|
contains brackets, the
|
||||||
brackets must match. In reality there is an upperbound on the number of
|
brackets must match. In reality there is an upper bound on the number of
|
||||||
operands; The maxium number is defined by the constant MAX_OPERANDS in the
|
operands; the maximum number is defined by the constant MAX_OPERANDS in the
|
||||||
file "const.h" in the directory assemble.c. Comments in the table should be
|
file "const.h" in the directory assemble.c. Comments in the table should be
|
||||||
placed between "/*" and "*/". Finally, before the table is parsed, the
|
placed between "/*" and "*/". Finally, before the table is parsed, the
|
||||||
C preprocessor runs.
|
C preprocessor runs.
|
||||||
|
@ -257,13 +297,13 @@ for every instruction in the EM_CODE(3ACK).
|
||||||
For every EM-instruction not mentioned in the EM_table, a
|
For every EM-instruction not mentioned in the EM_table, a
|
||||||
C function that prints an error message is generated.
|
C function that prints an error message is generated.
|
||||||
It is possible to divide the EM_CODE(3ACK)-interface in four parts :
|
It is possible to divide the EM_CODE(3ACK)-interface in four parts :
|
||||||
.IP \0\01)
|
.IP \0\01:
|
||||||
text instructions (e.g., C_loc, C_adi, ..)
|
text instructions (e.g., C_loc, C_adi, ..)
|
||||||
.IP \0\02)
|
.IP \0\02:
|
||||||
pseudo instructions (e.g., C_open, C_df_ilb, ..)
|
pseudo instructions (e.g., C_open, C_df_ilb, ..)
|
||||||
.IP \0\03)
|
.IP \0\03:
|
||||||
storage instructions (e.g., C_rom_icon, ..)
|
storage instructions (e.g., C_rom_icon, ..)
|
||||||
.IP \0\04)
|
.IP \0\04:
|
||||||
message instructions (e.g., C_mes_begin, ..)
|
message instructions (e.g., C_mes_begin, ..)
|
||||||
.LP
|
.LP
|
||||||
This section starts with giving the semantics of the grammar. The examples
|
This section starts with giving the semantics of the grammar. The examples
|
||||||
|
@ -275,7 +315,7 @@ useful for a code expander, they are ignored.
|
||||||
Actions
|
Actions
|
||||||
.PP
|
.PP
|
||||||
The EM_table consists of rules which describe how to expand a \fBC_instr\fR
|
The EM_table consists of rules which describe how to expand a \fBC_instr\fR
|
||||||
from the EM_CODE(3ACK)-interface, an EM instruction, into actions.
|
from the EM_CODE(3ACK)-interface (corresponding to an EM instruction) into actions.
|
||||||
There are two kinds of actions: assembly instructions and C function calls.
|
There are two kinds of actions: assembly instructions and C function calls.
|
||||||
An assembly instruction is defined as a mnemonic followed by zero or more
|
An assembly instruction is defined as a mnemonic followed by zero or more
|
||||||
operands, separated by commas. The semantics of an assembly instruction is
|
operands, separated by commas. The semantics of an assembly instruction is
|
||||||
|
@ -305,9 +345,9 @@ Labels
|
||||||
Since an assembly language without instruction labels is a rather weak
|
Since an assembly language without instruction labels is a rather weak
|
||||||
language, labels inside a contiguous block of assembly instructions are
|
language, labels inside a contiguous block of assembly instructions are
|
||||||
allowed. When using labels two rules must be observed:
|
allowed. When using labels two rules must be observed:
|
||||||
.IP \0\01)
|
.IP \0\01:
|
||||||
The name of a label should be unique inside an action list.
|
The name of a label should be unique inside an action list.
|
||||||
.IP \0\02)
|
.IP \0\02:
|
||||||
The labels used in an assembler instruction should be defined in the same
|
The labels used in an assembler instruction should be defined in the same
|
||||||
action list.
|
action list.
|
||||||
.LP
|
.LP
|
||||||
|
@ -337,11 +377,11 @@ is the
|
||||||
total number of arguments of the current \fBC_instr\fR (there are a few
|
total number of arguments of the current \fBC_instr\fR (there are a few
|
||||||
exceptions, see Implicit arguments). The table writer may
|
exceptions, see Implicit arguments). The table writer may
|
||||||
refer to an argument as $\fIi\fR. If a plain $-sign is needed in an
|
refer to an argument as $\fIi\fR. If a plain $-sign is needed in an
|
||||||
assembly instruction, it must be preceeded by an extra $-sign.
|
assembly instruction, it must be preceded by an extra $-sign.
|
||||||
.PP
|
.PP
|
||||||
There are two groups of \fBC_instr\fRs whose arguments are handled specially:
|
There are two groups of \fBC_instr\fRs whose arguments are handled specially:
|
||||||
.RS
|
.RS
|
||||||
.IP "1) Instructions dealing with local offsets."
|
.IP "1: Instructions dealing with local offsets."
|
||||||
.br
|
.br
|
||||||
The value of the $\fIi\fR argument referring to a parameter ($\fIi\fR >= 0),
|
The value of the $\fIi\fR argument referring to a parameter ($\fIi\fR >= 0),
|
||||||
is increased by "EM_BSIZE". "EM_BSIZE" is the size of the return status block
|
is increased by "EM_BSIZE". "EM_BSIZE" is the size of the return status block
|
||||||
|
@ -352,7 +392,7 @@ C_lol ==> "push $1(bp)".
|
||||||
/* automatic conversion of $1 */
|
/* automatic conversion of $1 */
|
||||||
\fR
|
\fR
|
||||||
.DE
|
.DE
|
||||||
.IP "2) Instructions using global names or instruction labels"
|
.IP "2: Instructions using global names or instruction labels"
|
||||||
.br
|
.br
|
||||||
All the arguments referring to global names or instruction labels will be
|
All the arguments referring to global names or instruction labels will be
|
||||||
transformed into a unique assembly name. To prevent name clashes with library
|
transformed into a unique assembly name. To prevent name clashes with library
|
||||||
|
@ -400,7 +440,7 @@ Equivalence rule
|
||||||
Among the simple rules there is a special case rule:
|
Among the simple rules there is a special case rule:
|
||||||
the equivalence rule. This rule declares two \fBC_instr\fR equivalent. To
|
the equivalence rule. This rule declares two \fBC_instr\fR equivalent. To
|
||||||
distinguish it from the usual simple rule "==>" is replaced by a "::=".
|
distinguish it from the usual simple rule "==>" is replaced by a "::=".
|
||||||
The benefit of an equivalence rule is that the arguments are not
|
The advantage of an equivalence rule is that the arguments are not
|
||||||
converted (see 3.2.3).
|
converted (see 3.2.3).
|
||||||
.DS
|
.DS
|
||||||
\f5
|
\f5
|
||||||
|
@ -410,7 +450,7 @@ C_slu ::= C_sli( $1).
|
||||||
.NH 3
|
.NH 3
|
||||||
Abbreviations
|
Abbreviations
|
||||||
.PP
|
.PP
|
||||||
EM instructions with an external as argument come in three variants in
|
EM instructions with an external as an argument come in three variants in
|
||||||
the EM_CODE(3ACK) interface. In most cases it will be possible to take
|
the EM_CODE(3ACK) interface. In most cases it will be possible to take
|
||||||
these variants together. For this purpose the ".." notation is introduced.
|
these variants together. For this purpose the ".." notation is introduced.
|
||||||
.DS
|
.DS
|
||||||
|
@ -583,7 +623,7 @@ Notice that EM_BSIZE is zero. The vax4 takes care of this automatically.
|
||||||
.PP
|
.PP
|
||||||
There are three routines which have to be defined by the table writer. The
|
There are three routines which have to be defined by the table writer. The
|
||||||
table writer can define them as ordinary C functions in the file "mach.c" or
|
table writer can define them as ordinary C functions in the file "mach.c" or
|
||||||
define them in the EM_table. For example, for the 8086 it looks like this:
|
define them in the EM_table. For example, for the 8086 they look like this:
|
||||||
.DS
|
.DS
|
||||||
\f5
|
\f5
|
||||||
jump ==> "jmp $1".
|
jump ==> "jmp $1".
|
||||||
|
@ -600,9 +640,12 @@ locals
|
||||||
\fR
|
\fR
|
||||||
.DE
|
.DE
|
||||||
.NH 2
|
.NH 2
|
||||||
Generating assembly code
|
Generating assembly code
|
||||||
.PP
|
.PP
|
||||||
The constants "BYTES_REVERSED" and "WORDS_REVERSED" are not needed.
|
When the code expander generator is used for generating assembly instead of
|
||||||
|
object code, not all the above mentioned constants and functions have to
|
||||||
|
be defined. In this case, the constants "BYTES_REVERSED" and "WORDS_REVERSED"
|
||||||
|
are not used.
|
||||||
.NH 1
|
.NH 1
|
||||||
Description of the as_table
|
Description of the as_table
|
||||||
.PP
|
.PP
|
||||||
|
@ -617,7 +660,7 @@ VAX or for the 8086.
|
||||||
.NH 2
|
.NH 2
|
||||||
Grammar
|
Grammar
|
||||||
.PP
|
.PP
|
||||||
The formal form of the as_table is given by the following grammar :
|
The form of the as_table is given by the following grammar :
|
||||||
.VS +4
|
.VS +4
|
||||||
.TS
|
.TS
|
||||||
center tab(#);
|
center tab(#);
|
||||||
|
@ -639,7 +682,12 @@ IF_STATEMENT#::=#"@if" "(" condition ")" ACTION_LIST
|
||||||
.LP
|
.LP
|
||||||
\fBmnemonic\fR, \fBoperand\fR and \fBtype\fR are all C identifiers,
|
\fBmnemonic\fR, \fBoperand\fR and \fBtype\fR are all C identifiers,
|
||||||
\fBcondition\fR is a normal C expression.
|
\fBcondition\fR is a normal C expression.
|
||||||
\fBfunction-call\fR must be a C function call.
|
\fBfunction-call\fR must be a C function call.
|
||||||
|
Since the as_table is
|
||||||
|
interpreted on two levels, during code expander generation and during code
|
||||||
|
expander execution, two levels of calls are present in it. A "function-call"
|
||||||
|
is done during code expander generation, a "@function-call" during code
|
||||||
|
expander execution.
|
||||||
.NH 2
|
.NH 2
|
||||||
Semantics
|
Semantics
|
||||||
.PP
|
.PP
|
||||||
|
@ -650,7 +698,7 @@ one for each assembler mnemonic. (The names of
|
||||||
these functions are the assembler mnemonics postfixed with "_instr", e.g.
|
these functions are the assembler mnemonics postfixed with "_instr", e.g.
|
||||||
\"add" becomes "add_instr()".) These functions will be used by the function
|
\"add" becomes "add_instr()".) These functions will be used by the function
|
||||||
assemble() during the expansion of the EM_table.
|
assemble() during the expansion of the EM_table.
|
||||||
After explainig the semantics of the as_table the function
|
After explaining the semantics of the as_table the function
|
||||||
assemble() will be described.
|
assemble() will be described.
|
||||||
.NH 3
|
.NH 3
|
||||||
Rules
|
Rules
|
||||||
|
@ -683,20 +731,20 @@ determine the opcode. Both cases can be easily expressed in the as_table.
|
||||||
The first case is obvious. For the second case type fields for the operands
|
The first case is obvious. For the second case type fields for the operands
|
||||||
are introduced.
|
are introduced.
|
||||||
.LP
|
.LP
|
||||||
When both mnemonic and operands determine the opcode, the table writer has
|
When mnemonic and operands together determine the opcode, the table writer has
|
||||||
to give several rules for each combination of mnemonic and operands. The rules
|
to give several rules for each combination of mnemonic and operands. The rules
|
||||||
differ in the type fields of the operands.
|
differ in the type fields of the operands.
|
||||||
The table writer has to supply functions that check the type
|
The table writer has to supply functions that check the type
|
||||||
of the operand. The name of such a function is the name of the type; it
|
of the operand. The name of such a function is the name of the type; it
|
||||||
has one argument: a pointer to a struct of type t_operand; it returns
|
has one argument: a pointer to a struct of type t_operand; it returns
|
||||||
1 when the operand is of this type, otherwise it returns 0.
|
non-zero when the operand is of this type, otherwise it returns 0.
|
||||||
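A pair of such type-checking functions might look as follows. This is only a sketch: the layout of `t_operand` is up to the table writer, so the `kind` tag and the `operand_kind` enumeration are assumptions introduced for the example (the `num` field mirrors the one used in the "pushl" example).

```c
#include <stddef.h>

/* Hypothetical operand classification; not prescribed by asg. */
enum operand_kind { OP_REG, OP_CONST };

/* Assumed layout; the real t_operand is defined by the table writer. */
typedef struct t_operand {
    enum operand_kind kind;  /* assumed tag set by the operand decoder */
    long num;                /* constant value or register number */
} t_operand;

/* Returns non-zero when the operand is a register, otherwise 0. */
int REG(t_operand *op)
{
    return op != NULL && op->kind == OP_REG;
}

/* Returns non-zero when the operand is a constant, otherwise 0. */
int CONST(t_operand *op)
{
    return op != NULL && op->kind == OP_CONST;
}
```

Each function carries the name of the type it tests, so a rule such as "pushl src : CONST" selects its right hand side only when `CONST(src)` is non-zero.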
.LP
|
.LP
|
||||||
This will usually lead to a list of rules per mnemonic. To reduce the amount of
|
This will usually lead to a list of rules per mnemonic. To reduce the amount of
|
||||||
work an abbrevation is supplied. Once the mnemonic is specified it can be
|
work an abbreviation is supplied. Once the mnemonic is specified it can be
|
||||||
referred to in the following rules by "...".
|
referred to in the following rules by "...".
|
||||||
One has to make sure
|
One has to make sure
|
||||||
that each mnemonic is mentioned only once in the as_table, otherwise \fBasg\fR
|
that each mnemonic is mentioned only once in the as_table, as otherwise
|
||||||
will generate more than one function with the same name.
|
\fBasg\fR will generate more than one function with the same name.
|
||||||
.LP
|
.LP
|
||||||
The following example shows the usage of type fields.
|
The following example shows the usage of type fields.
|
||||||
.DS L
|
.DS L
|
||||||
|
@ -715,16 +763,20 @@ The table-writer must supply the restriction functions, \f5REG\fR and
|
||||||
.NH 3
|
.NH 3
|
||||||
The function of the @-sign and the if-statement.
|
The function of the @-sign and the if-statement.
|
||||||
.PP
|
.PP
|
||||||
The righthand side of a rule consists of function calls. Some of the
|
The right hand side of a rule consists of function calls.
|
||||||
functions generate object code directly (e.g., the \fBback\fR-primitives),
|
Since the as_table is
|
||||||
others are needed for further assemblation (e.g., \f5gen_operand()\fR in the
|
interpreted on two levels, during code expander generation and during code
|
||||||
first example). The last group will be evaluated during the expansion
|
expander execution, two levels of calls are present in it. A function-call
|
||||||
of the EM_table, while the first group is incorporated in the compiler.
|
without a "@"-sign
|
||||||
This is denoted by the @-sign in front of the \fBback\fR-primitives.
|
is called during code expander generation (e.g., the \f5gen_operand()\fR in the
|
||||||
|
first example).
|
||||||
|
A function call with a "@"-sign is called during code expander execution (e.g.,
|
||||||
|
the \fBback\fR-primitives). So the last group is a part of the compiler.
|
||||||
.LP
|
.LP
|
||||||
The next example concerns the use of the "@"-sign in front of a table writer
|
The next example concerns the use of the "@"-sign in front of a table writer
|
||||||
written
|
written
|
||||||
function. The need for this construction arises when you implement push/pop
|
function. The need for this construction arises, e.g., when you
|
||||||
|
implement push/pop
|
||||||
optimization; flags need to be set/unset and tested during the execution of
|
optimization; flags need to be set/unset and tested during the execution of
|
||||||
the compiler:
|
the compiler:
|
||||||
.DS L
|
.DS L
|
||||||
|
@ -750,7 +802,7 @@ the compiler. For example one needs to know if a "$\fIi\fR" argument fits in
|
||||||
one byte.
|
one byte.
|
||||||
In this case one can use a special if-statement provided by \fBasg\fR:
|
In this case one can use a special if-statement provided by \fBasg\fR:
|
||||||
@if, @elsif, @else, @fi. This means that the conditions will be evaluated at
|
@if, @elsif, @else, @fi. This means that the conditions will be evaluated at
|
||||||
runtime of the \fBce\fR. In such a condition one may of course refer to the
|
run time of the \fBce\fR. In such a condition one may of course refer to the
|
||||||
"$\fIi\fR" arguments. For example, constants can be packed into one or two byte
|
"$\fIi\fR" arguments. For example, constants can be packed into one or two byte
|
||||||
arguments:
|
arguments:
|
||||||
.DS L
|
.DS L
|
||||||
|
@ -766,10 +818,10 @@ mov dst:ACCU, src:DATA ==> @if ( fits_byte( %$(dst->expr)))
|
||||||
.NH 3
|
.NH 3
|
||||||
References to operands
|
References to operands
|
||||||
.PP
|
.PP
|
||||||
As mentioned before, the operands of an assembler instruction may be used as
|
As noted before, the operands of an assembler instruction may be used as
|
||||||
pointers, to the struct t_operand, in the righthand side of the table.
|
pointers, to the struct t_operand, in the right hand side of the table.
|
||||||
Because of the free format assembler, the types of the fields in the struct
|
Because of the free format assembler, the types of the fields in the struct
|
||||||
t_operand are unknown to \fBasg\fR. Clearly \fBasg\fR must know these types.
|
t_operand are unknown to \fBasg\fR. Clearly, however, \fBasg\fR must know these types.
|
||||||
This section explains how these types must be specified.
|
This section explains how these types must be specified.
|
||||||
.LP
|
.LP
|
||||||
References to operands come in three forms: ordinary operands, operands that
|
References to operands come in three forms: ordinary operands, operands that
|
||||||
|
@ -797,7 +849,7 @@ The three cases differ only in the conversion field. The first conversion
|
||||||
applies to ordinary operands. The second applies to operands that contain
|
applies to ordinary operands. The second applies to operands that contain
|
||||||
a "$\fIi\fR". The expression between brackets must be of type char *. The
|
a "$\fIi\fR". The expression between brackets must be of type char *. The
|
||||||
result of "%$" is of the type of "$\fIi\fR". The
|
result of "%$" is of the type of "$\fIi\fR". The
|
||||||
third applies operands that refer to a local label. The expression between
|
third applies to operands that refer to a local label. The expression between
|
||||||
the brackets must be of type char *. The result of "%dist" is of type arith.
|
the brackets must be of type char *. The result of "%dist" is of type arith.
|
||||||
.LP
|
.LP
|
||||||
The following example illustrates the usage of "%$". (For an
|
The following example illustrates the usage of "%$". (For an
|
||||||
|
@ -821,12 +873,12 @@ arg_type.h must contain the definition of STRING, ARITH and INT.
|
||||||
%dist is only guaranteed to work when called as a parameter of text1(), text2() or text4().
|
%dist is only guaranteed to work when called as a parameter of text1(), text2() or text4().
|
||||||
The goal of the %dist conversion is to reduce the number of reloc1(), reloc2()
|
The goal of the %dist conversion is to reduce the number of reloc1(), reloc2()
|
||||||
and reloc4()
|
and reloc4()
|
||||||
calls, saving space and time (no relocation at compiler runtime).
|
calls, saving space and time (no relocation at compiler run time).
|
||||||
.LP
|
.LP
|
||||||
The following example illustrates the usage of "%dist".
|
The following example illustrates the usage of "%dist".
|
||||||
.DS L
|
.DS L
|
||||||
\f5
|
\f5
|
||||||
jmp dst:ILB ==> /* label in an instructionlist */
|
jmp dst:ILB ==> /* label in an instruction list */
|
||||||
@text1( 0xeb);
|
@text1( 0xeb);
|
||||||
@text1( %dist( dst->lab)).
|
@text1( %dist( dst->lab)).
|
||||||
|
|
||||||
|
@ -836,20 +888,20 @@ The following example illustrates the usage of "%dist".
|
||||||
\fR
|
\fR
|
||||||
.DE
|
.DE
|
||||||
.NH 3
|
.NH 3
|
||||||
The functions assemble() and block_assemble
|
The functions assemble() and block_assemble()
|
||||||
.PP
|
.PP
|
||||||
Assemble() and block_assemble() are two functions provided by \fBceg\fR.
|
Assemble() and block_assemble() are two functions provided by \fBceg\fR.
|
||||||
However, if one is not satisfied with the way they work the table writer can
|
However, if one is not satisfied with the way they work the table writer can
|
||||||
supply his own assemble or block_assemble().
|
supply his own assemble() or block_assemble().
|
||||||
The default function assemble() splits an assembly string in a label, mnemonic,
|
The default function assemble() splits an assembly string in a label, mnemonic,
|
||||||
and operands and performs the following actions on them:
|
and operands and performs the following actions on them:
|
||||||
.IP \0\01)
|
.IP \0\01:
|
||||||
It processes the local label; it records the name and current position. Thereafter it calls the function process_label() with one argument of type string,
|
It processes the local label; it records the name and current position. Thereafter it calls the function process_label() with one argument of type string,
|
||||||
the label. The table writer has to define this function.
|
the label. The table writer has to define this function.
|
||||||
.IP \0\02)
|
.IP \0\02:
|
||||||
Thereafter it calls the function process_mnemonic() with one argument of
|
Thereafter it calls the function process_mnemonic() with one argument of
|
||||||
type string, the mnemonic. The table writer has to define this function.
|
type string, the mnemonic. The table writer has to define this function.
|
||||||
.IP \0\03)
|
.IP \0\03:
|
||||||
It calls process_operand() for each operand. Process_operand() must be
|
It calls process_operand() for each operand. Process_operand() must be
|
||||||
written by the table-writer since no fixed representation for operands
|
written by the table-writer since no fixed representation for operands
|
||||||
is enforced. It has two arguments, a string (the operand to decode)
|
is enforced. It has two arguments, a string (the operand to decode)
|
||||||
|
@ -857,7 +909,7 @@ and a pointer to the struct t_operand. The declaration of the struct
|
||||||
t_operand must be given in the
|
t_operand must be given in the
|
||||||
file "as.h", and the table-writer can put in it all the information needed for
|
file "as.h", and the table-writer can put in it all the information needed for
|
||||||
encoding the operand in machine format.
|
encoding the operand in machine format.
|
||||||
.IP \0\04)
|
.IP \0\04:
|
||||||
It examines the mnemonic and calls the associated function, generated by
|
It examines the mnemonic and calls the associated function, generated by
|
||||||
\fBasg\fR, with pointers to the decoded operands as arguments. This makes it
|
\fBasg\fR, with pointers to the decoded operands as arguments. This makes it
|
||||||
possible to use the decoded operands in the right hand side of a rule (see
|
possible to use the decoded operands in the right hand side of a rule (see
|
||||||
|
@ -868,15 +920,16 @@ instructions that belong to one action list. For every assembly instruction
|
||||||
in
|
in
|
||||||
this block assemble() is called. But, if a special action is
|
this block assemble() is called. But, if a special action is
|
||||||
required on block of assembly instructions, the table writer only has to
|
required on block of assembly instructions, the table writer only has to
|
||||||
rewrite this function to get a new \fBceg\fR that oblies to his wishes.
|
rewrite this function to get a new \fBceg\fR that meets his wishes.
|
||||||
.PP
|
.PP
|
||||||
Only four things have to be specified in "as.h" and "as.c". First the user must
|
Only four things have to be specified in "as.h" and "as.c". First the user must
|
||||||
give the declaration of struct t_operand in "as.h", and the functions
|
give the declaration of struct t_operand in "as.h", and the functions
|
||||||
process_operand(), process_mnemonic() and process_label() must be given
|
process_operand(), process_mnemonic() and process_label() must be given
|
||||||
in "as.c". If the right side of the as_table
|
in "as.c". If the right side of the as_table
|
||||||
contains function calls other than the \fBback\fR-primitives, these functions
|
contains function calls other than the \fBback\fR-primitives, these functions
|
||||||
must also be present in "as.c". Note that both the "@"-sign and "references"
|
must also be present in "as.c". Note that both the "@"-sign (see 4.2.3)
|
||||||
also work in
|
and "references"
|
||||||
|
(see 4.2.4) also work in
|
||||||
the functions defined in "as.c". Example, part of 8086 "as.h" and "as.c"
|
the functions defined in "as.c". Example, part of 8086 "as.h" and "as.c"
|
||||||
files :
|
files :
|
||||||
.nr PS 10
|
.nr PS 10
|
||||||
|
@ -884,13 +937,13 @@ files :
|
||||||
.DS L
|
.DS L
|
||||||
\f5
|
\f5
|
||||||
#define UNKNOWN 0
|
#define UNKNOWN 0
|
||||||
#define IS_REG 0x1
|
#define IS_REG 0x1
|
||||||
#define IS_ACCU 0x2
|
#define IS_ACCU 0x2
|
||||||
#define IS_DATA 0x4
|
#define IS_DATA 0x4
|
||||||
#define IS_LABEL 0x8
|
#define IS_LABEL 0x8
|
||||||
#define IS_MEM 0x10
|
#define IS_MEM 0x10
|
||||||
#define IS_ADDR 0x20
|
#define IS_ADDR 0x20
|
||||||
#define IS_ILB 0x40
|
#define IS_ILB 0x40
|
||||||
|
|
||||||
#define AX 0
|
#define AX 0
|
||||||
#define BX 3
|
#define BX 3
|
||||||
|
@ -900,22 +953,19 @@ files :
|
||||||
#define SI 6
|
#define SI 6
|
||||||
#define DI 7
|
#define DI 7
|
||||||
|
|
||||||
#define REG( op) ( op->type & IS_REG)
|
#define REG( op) ( op->type & IS_REG)
|
||||||
#define ACCU( op) ( op->type & IS_REG && op->reg == AX)
|
#define ACCU( op) ( op->type & IS_REG && op->reg == AX)
|
||||||
#define REG_CL( op) ( op->type & IS_REG && op->reg == CL)
|
#define REG_CL( op) ( op->type & IS_REG && op->reg == CL)
|
||||||
#define DATA( op) ( op->type & IS_DATA)
|
#define DATA( op) ( op->type & IS_DATA)
|
||||||
#define lABEL( op) ( op->type & IS_LABEL)
|
#define LABEL( op) ( op->type & IS_LABEL)
|
||||||
#define ILB( op) ( op->type & IS_ILB)
|
#define ILB( op) ( op->type & IS_ILB)
|
||||||
#define MEM( op) ( op->type & IS_MEM)
|
#define MEM( op) ( op->type & IS_MEM)
|
||||||
#define ADDR( op) ( op->type & IS_ADDR)
|
#define ADDR( op) ( op->type & IS_ADDR)
|
||||||
#define EADDR( op) ( op->type & ( IS_ADDR | IS_MEM | IS_REG))
|
#define EADDR( op) ( op->type & ( IS_ADDR | IS_MEM | IS_REG))
|
||||||
#define CONST1( op) ( op->type & IS_DATA && strcmp( "1", op->expr) == 0)
|
#define CONST1( op) ( op->type & IS_DATA && strcmp( "1", op->expr) == 0)
|
||||||
#define MOVS( op) ( op->type & IS_LABEL&&strcmp("\"movs\"", op->lab) == 0)
|
#define MOVS( op) ( op->type & IS_LABEL&&strcmp("\"movs\"", op->lab) == 0)
|
||||||
#define IMMEDIATE( op) ( op->type & ( IS_DATA | IS_LABEL))
|
#define IMMEDIATE( op) ( op->type & ( IS_DATA | IS_LABEL))
|
||||||
|
|
||||||
#define TRUE 1
|
|
||||||
#define FALSE 0
|
|
||||||
|
|
||||||
struct t_operand {
|
struct t_operand {
|
||||||
unsigned type;
|
unsigned type;
|
||||||
int reg;
|
int reg;
|
||||||
|
@ -930,23 +980,10 @@ extern struct t_operand saved_op, *AX_oper;
|
||||||
#include "arg_type.h"
|
#include "arg_type.h"
|
||||||
#include "as.h"
|
#include "as.h"
|
||||||
|
|
||||||
static struct t_operand dummy = { IS_REG, AX, 0, 0, 0};
|
#define last( s) ( s + strlen( s) - 1)
|
||||||
struct t_operand saved_op, *AX_oper = &dummy;
|
#define LEFT '('
|
||||||
|
|
||||||
save_op( op)
|
|
||||||
struct t_operand *op;
|
|
||||||
{
|
|
||||||
saved_op.type = op->type;
|
|
||||||
saved_op.reg = op->reg;
|
|
||||||
saved_op.expr = op->expr;
|
|
||||||
saved_op.lab = op->lab;
|
|
||||||
saved_op.off = op->off;
|
|
||||||
}
|
|
||||||
|
|
||||||
#define last( s) ( s + strlen( s) - 1)
|
|
||||||
#define LEFT '('
|
|
||||||
#define RIGHT ')'
|
#define RIGHT ')'
|
||||||
#define DOLLAR '$'
|
#define DOLLAR '$'
|
||||||
|
|
||||||
|
|
||||||
process_label( l)
|
process_label( l)
|
||||||
|
@ -1000,129 +1037,14 @@ struct t_operand *op;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
int is_reg( str, op)
|
|
||||||
char *str;
|
|
||||||
struct t_operand *op;
|
|
||||||
{
|
|
||||||
if ( strlen( str) != 2)
|
|
||||||
return( 0);
|
|
||||||
|
|
||||||
switch ( *(str+1)) {
|
|
||||||
case 'x' :
|
|
||||||
case 'l' : switch( *str) {
|
|
||||||
case 'a' : op->reg = 0;
|
|
||||||
return( TRUE);
|
|
||||||
|
|
||||||
case 'c' : op->reg = 1;
|
|
||||||
return( TRUE);
|
|
||||||
|
|
||||||
case 'd' : op->reg = 2;
|
|
||||||
return( TRUE);
|
|
||||||
|
|
||||||
case 'b' : op->reg = 3;
|
|
||||||
return( TRUE);
|
|
||||||
|
|
||||||
default : return( FALSE);
|
|
||||||
}
|
|
||||||
|
|
||||||
case 'h' : switch( *str) {
|
|
||||||
case 'a' : op->reg = 4;
|
|
||||||
return( TRUE);
|
|
||||||
|
|
||||||
case 'c' : op->reg = 5;
|
|
||||||
return( TRUE);
|
|
||||||
|
|
||||||
case 'd' : op->reg = 6;
|
|
||||||
return( TRUE);
|
|
||||||
|
|
||||||
case 'b' : op->reg = 7;
|
|
||||||
return( TRUE);
|
|
||||||
|
|
||||||
default : return( FALSE);
|
|
||||||
}
|
|
||||||
|
|
||||||
case 'p' : switch ( *str) {
|
|
||||||
case 's' : op->reg = 4;
|
|
||||||
return( TRUE);
|
|
||||||
|
|
||||||
case 'b' : op->reg = 5;
|
|
||||||
return( TRUE);
|
|
||||||
|
|
||||||
default : return( FALSE);
|
|
||||||
}
|
|
||||||
|
|
||||||
case 'i' : switch ( *str) {
|
|
||||||
case 's' : op->reg = 6;
|
|
||||||
return( TRUE);
|
|
||||||
|
|
||||||
case 'd' : op->reg = 7;
|
|
||||||
return( TRUE);
|
|
||||||
|
|
||||||
default : return( FALSE);
|
|
||||||
}
|
|
||||||
|
|
||||||
default : return( FALSE);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
#include <ctype.h>
|
|
||||||
#define isletter( c) ( isalpha( c) || c == '_')
|
|
||||||
|
|
||||||
int contains_label( str)
|
|
||||||
char *str;
|
|
||||||
{
|
|
||||||
while( !isletter( *str) && *str != '\0')
|
|
||||||
if ( *str == '$')
|
|
||||||
if ( arg_type( str) == STRING)
|
|
||||||
return( TRUE);
|
|
||||||
else
|
|
||||||
str += 5;
|
|
||||||
else
|
|
||||||
str++;
|
|
||||||
|
|
||||||
return( isletter( *str));
|
|
||||||
}
|
|
||||||
|
|
||||||
set_label( str, op)
|
|
||||||
char *str;
|
|
||||||
struct t_operand *op;
|
|
||||||
{
|
|
||||||
char *ptr, *index(), *sprint();
|
|
||||||
static char buf[256];
|
|
||||||
|
|
||||||
ptr = index( str, '+');
|
|
||||||
|
|
||||||
if ( ptr == 0)
|
|
||||||
op->off = "0";
|
|
||||||
else {
|
|
||||||
*ptr = '\0';
|
|
||||||
op->off = ptr + 1;
|
|
||||||
}
|
|
||||||
|
|
||||||
if ( isdigit( *str) && ( *(str+1) == 'b' || *(str+1) == 'f') &&
|
|
||||||
*(str+2) == '\0') {
|
|
||||||
*(str+1) = '\0'; /* remove the b or f! */
|
|
||||||
op->lab = str;
|
|
||||||
op->type = IS_ILB;
|
|
||||||
}
|
|
||||||
else {
|
|
||||||
op->type = IS_LABEL;
|
|
||||||
if ( index( str, DOLLAR) != 0)
|
|
||||||
op->lab = str;
|
|
||||||
else
|
|
||||||
/* stopgap solution */
|
|
||||||
op->lab = sprint( buf, "\"%s\"", str);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
/******************************************************************************/
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
mod_RM( reg, op)
|
mod_RM( reg, op)
|
||||||
int reg;
|
int reg;
|
||||||
struct t_operand *op;
|
struct t_operand *op;
|
||||||
|
|
||||||
|
/* This function helps to decode operands in machine format.
|
||||||
|
* Note the $-operators
|
||||||
|
*/
|
||||||
{
|
{
|
||||||
if ( REG( op))
|
if ( REG( op))
|
||||||
R233( 0x3, reg, op->reg);
|
R233( 0x3, reg, op->reg);
|
||||||
|
@ -1138,7 +1060,7 @@ struct t_operand *op;
|
||||||
case DI : R233( 0x0, reg, 0x5);
|
case DI : R233( 0x0, reg, 0x5);
|
||||||
break;
|
break;
|
||||||
|
|
||||||
case BP : R233( 0x1, reg, 0x6); /* Uitzondering! */
|
case BP : R233( 0x1, reg, 0x6); /* exception! */
|
||||||
@text1( 0);
|
@text1( 0);
|
||||||
break;
|
break;
|
||||||
|
|
||||||
|
@ -1188,40 +1110,18 @@ struct t_operand *op;
|
||||||
@fi
|
@fi
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
mov_REG_EADDR( dst, src)
|
|
||||||
struct t_operand *dst, *src;
|
|
||||||
{
|
|
||||||
if ( REG(src) && dst->reg == src->reg)
|
|
||||||
; /* Nothing!! result of push/pop optimization */
|
|
||||||
else {
|
|
||||||
@text1( 0x8b);
|
|
||||||
mod_RM( dst->reg, src);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
R233( a, b, c)
|
|
||||||
int a,b,c;
|
|
||||||
{
|
|
||||||
@text1( %d( (a << 6) | ( b << 3) | c));
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
R53( a, b)
|
|
||||||
int a,b;
|
|
||||||
{
|
|
||||||
@text1( %d( (a << 3) | b));
|
|
||||||
}
|
|
||||||
\fR
|
\fR
|
||||||
.DE
|
.DE
|
||||||
|
.nr PS 12
|
||||||
|
.nr VS 14
|
||||||
|
.LP
|
||||||
If a different function assemble() is needed, it can be placed in
|
If a different function assemble() is needed, it can be placed in
|
||||||
the file "as.c"; assemble() has one argument of type char *.
|
the file "as.c"; assemble() has one argument of type char *.
|
||||||
.NH 2
|
.NH 2
|
||||||
Generating assembly
|
Generating assembly
|
||||||
.PP
|
.PP
|
||||||
It is possible to generate assembly in stead of objectfiles (see section 5), in
|
It is possible to generate assembly instead of object files (see section 5), in
|
||||||
which case one does not have to supply "as_table", "as.h" and "as.c".
|
which case there is no need to supply "as_table", "as.h" and "as.c".
|
||||||
This option is useful for debugging the EM_table.
|
This option is useful for debugging the EM_table.
|
||||||
.NH 1
|
.NH 1
|
||||||
Building a ce
|
Building a ce
|
||||||
|
@ -1233,13 +1133,13 @@ written and tested. In the second phase, the as_table is written and tested.
|
||||||
Phase one
|
Phase one
|
||||||
.PP
|
.PP
|
||||||
The following is a list of instructions that describe how to make a
|
The following is a list of instructions that describe how to make a
|
||||||
code expander that generates assembly instruction.
|
code expander that generates assembly instructions.
|
||||||
.IP \0\0-1
|
.IP \0\01:
|
||||||
Create a new directory.
|
Create a new directory.
|
||||||
.IP \0\0-2
|
.IP \0\02:
|
||||||
Create the "EM_table", "mach.h" and "mach.c" files; there is no need
|
Create the "EM_table", "mach.h" and "mach.c" files; there is no need
|
||||||
for "as_table", "as.h" and "as.c" at this moment.
|
for "as_table", "as.h" and "as.c" at this moment.
|
||||||
.IP \0\0-3
|
.IP \0\03:
|
||||||
type
|
type
|
||||||
.br
|
.br
|
||||||
\f5
|
\f5
|
||||||
|
@ -1255,7 +1155,7 @@ EM-instruction. All these files will be compiled and put in a library called
|
||||||
.br
|
.br
|
||||||
The option \f5-as\fR means that a \fBback\fR-library will be generated (in the directory back) that
|
The option \f5-as\fR means that a \fBback\fR-library will be generated (in the directory back) that
|
||||||
supports the generation of assembly language. The library is named "back.a".
|
supports the generation of assembly language. The library is named "back.a".
|
||||||
.IP \0\0-4
|
.IP \0\04:
|
||||||
Link a front end, "ce.a" and "back.a" together resulting in a compiler.
|
Link a front end, "ce.a" and "back.a" together resulting in a compiler.
|
||||||
.LP
|
.LP
|
||||||
Now, the EM_table can be tested; if an error occurs, change the table
|
Now, the EM_table can be tested; if an error occurs, change the table
|
||||||
|
@ -1271,11 +1171,11 @@ Phase two
|
||||||
.PP
|
.PP
|
||||||
The next phase is to generate a \fBce\fR that produces relocatable object
|
The next phase is to generate a \fBce\fR that produces relocatable object
|
||||||
code.
|
code.
|
||||||
.IP \0\0-1
|
.IP \0\01:
|
||||||
Remove the "ce" and "ceg" directories.
|
Remove the "ce" and "ceg" directories.
|
||||||
.IP \0\0-2
|
.IP \0\02:
|
||||||
Write the "as_table", "as.h" and "as.c" files.
|
Write the "as_table", "as.h" and "as.c" files.
|
||||||
.IP \0\0-3
|
.IP \0\03:
|
||||||
type
|
type
|
||||||
.br
|
.br
|
||||||
\f5
|
\f5
|
||||||
|
@ -1283,21 +1183,20 @@ install_ceg -obj
|
||||||
\fR
|
\fR
|
||||||
.br
|
.br
|
||||||
The option \f5-obj\fR means that "back.a" will contain a library for generating
|
The option \f5-obj\fR means that "back.a" will contain a library for generating
|
||||||
ACK_A.OUT(5L) object files, see appendix B. If another "back.a" is used,
|
ACK_A.OUT(5L) object files, see appendix B. If a different "back.a" is used,
|
||||||
omit the \f5-obj\fR flag.
|
omit the \f5-obj\fR flag.
|
||||||
.IP \0\0-4
|
.IP \0\04:
|
||||||
Link a front end, "ce.a" and "back.a" together resulting in a compiler.
|
Link a front end, "ce.a" and "back.a" together resulting in a compiler.
|
||||||
.LP
|
.LP
|
||||||
The as_table is ready to be tested. If an error occurs, change the table.
|
The as_table is ready to be tested. If an error occurs, change the table.
|
||||||
Then there are two ways to proceed:
|
Then there are two ways to proceed:
|
||||||
.IP \0\0-1
|
.IP \0\01:
|
||||||
recompile the whole EM_table,
|
recompile the whole EM_table,
|
||||||
.br
|
.br
|
||||||
\f5
|
\f5
|
||||||
update ALL
|
update ALL
|
||||||
\fR
|
\fR
|
||||||
.br
|
.IP \0\02:
|
||||||
.IP \0\0-2
|
|
||||||
recompile just the few EM-instructions that contained the error,
|
recompile just the few EM-instructions that contained the error,
|
||||||
\f5
|
\f5
|
||||||
.br
|
.br
|
||||||
|
@ -1310,6 +1209,11 @@ assembly instruction.
|
||||||
,where \fBC_instr\fR is an erroneous EM-instruction.
|
,where \fBC_instr\fR is an erroneous EM-instruction.
|
||||||
\fR
|
\fR
|
||||||
.NH
|
.NH
|
||||||
|
Acknowledgements
|
||||||
|
.LP
|
||||||
|
We want to thank Henri Bal, Dick Grune, and Ceriel Jacobs for their
|
||||||
|
valuable suggestions and the critical reading of this paper.
|
||||||
|
.NH
|
||||||
References
|
References
|
||||||
.LP
|
.LP
|
||||||
.[
|
.[
|
||||||
|
@ -1319,7 +1223,7 @@ $LIST$
|
||||||
.SH
|
.SH
|
||||||
Appendix A, \fRthe \fBback\fR-primitives
|
Appendix A, \fRthe \fBback\fR-primitives
|
||||||
.PP
|
.PP
|
||||||
This appendix describes the routines avaible to generate relocatable
|
This appendix describes the routines available to generate relocatable
|
||||||
object code. If the default back.a is used, the object code is in
|
object code. If the default back.a is used, the object code is in
|
||||||
ACK A.OUT(5L) format.
|
ACK A.OUT(5L) format.
|
||||||
.nr PS 10
|
.nr PS 10
|
||||||
|
@ -1399,8 +1303,8 @@ Symbol table interaction; with int seg; char *s;
|
||||||
tab(#);
|
tab(#);
|
||||||
l c lw(10c).
|
l c lw(10c).
|
||||||
switch_segment( seg)#:#T{
|
switch_segment( seg)#:#T{
|
||||||
sets current segment to "seg", and does alignment if necessary.
|
sets current segment to "seg", and does alignment if necessary. "seg"
|
||||||
"seg" can be one of the four constants defined in "back.h": SEGTXT, SEGROM,
|
can be one of the four constants defined in "back.h": SEGTXT, SEGROM,
|
||||||
SEGCON, SEGBSS.
|
SEGCON, SEGBSS.
|
||||||
T}
|
T}
|
||||||
#
|
#
|
||||||
|
@ -1427,7 +1331,7 @@ output()#:#T{
|
||||||
End of the job, flush output.
|
End of the job, flush output.
|
||||||
T}
|
T}
|
||||||
do_close()#:#T{
|
do_close()#:#T{
|
||||||
close outputstream.
|
close output stream.
|
||||||
T}
|
T}
|
||||||
init_back()#:#T{
|
init_back()#:#T{
|
||||||
Only used with user-written back-library, gives the opportunity to initialize.
|
Only used with user-written back-library, gives the opportunity to initialize.
|
||||||
|
@ -1448,7 +1352,7 @@ format. The object file consists of one header, followed by
|
||||||
four segment headers, followed by text, data, relocation information,
|
four segment headers, followed by text, data, relocation information,
|
||||||
symbol table and the string area. The object file is tuned for the ACK-LED,
|
symbol table and the string area. The object file is tuned for the ACK-LED,
|
||||||
so there are some special things done just before the object file is dumped.
|
so there are some special things done just before the object file is dumped.
|
||||||
First, the four relocation records are added which contain the names of the four
|
First, four relocation records are added which contain the names of the four
|
||||||
segments. Second, all the local relocation is resolved. This is done by the
|
segments. Second, all the local relocation is resolved. This is done by the
|
||||||
function do_relo(). If there is a record belonging to a local
|
function do_relo(). If there is a record belonging to a local
|
||||||
name this address is relocated in the segment to which the record belongs.
|
name this address is relocated in the segment to which the record belongs.
|
||||||
|
|
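Note on the 8086 "as.c" example in this diff: mod_RM() still calls R233(), whose body this commit drops from the documentation. The removed lines show that it packs three fields into a single byte as (a << 6) | (b << 3) | c, which matches the 2-3-3 bit layout of an 8086 ModRM byte. The following is a minimal stand-alone sketch of that packing in plain C, not the table-writer's code: the names pack_233 and pack_53 and the range checks are assumptions made for illustration, and the values are returned rather than emitted through @text1().

```c
#include <assert.h>

/* Pack a 2-bit, a 3-bit and a 3-bit field into one byte, using the same
 * expression as R233() in the documented as.c. For a ModRM byte the fields
 * are mod, reg and r/m. The name and the range checks are illustrative. */
unsigned pack_233(unsigned mod, unsigned reg, unsigned rm)
{
    assert(mod <= 0x3 && reg <= 0x7 && rm <= 0x7);
    return (mod << 6) | (reg << 3) | rm;
}

/* Pack a 5-bit and a 3-bit field into one byte, using the same expression
 * as R53() in the documented as.c. */
unsigned pack_53(unsigned a, unsigned b)
{
    assert(a <= 0x1f && b <= 0x7);
    return (a << 3) | b;
}
```

With the register numbers from "as.h" (AX is 0, BX is 3), the register case R233( 0x3, reg, op->reg) corresponds to pack_233(0x3, 0, 3), and the BP case R233( 0x1, reg, 0x6) corresponds to pack_233(0x1, reg, 0x6), which in the example is followed by a zero displacement byte.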