1987-03-10 16:49:13 +00:00
|
|
|
.nr PS 11
|
|
|
|
.nr VS 13p
|
|
|
|
.EQ
|
|
|
|
delim @@
|
|
|
|
.EN
|
|
|
|
.EQ
|
|
|
|
gfont R
|
|
|
|
.EN
|
|
|
|
.ND
|
|
|
|
.RP
|
|
|
|
.TL
|
|
|
|
A back end table for the Motorola MC68000, MC68010 and MC68020 microprocessors
|
|
|
|
.AU
|
|
|
|
Frank Doodeman
|
|
|
|
.AB
|
|
|
|
A back end table is part of the Amsterdam Compiler Kit (ACK). It is used
|
|
|
|
to produce the actual back end, a program that translates the intermediate
|
|
|
|
language family EM to assembly language for some target machine. The table
|
|
|
|
discussed here can be used for two back ends, which together serve three
|
|
|
|
machines: the MC68000 and MC68010 (the difference between these two is
|
|
|
|
so small that one back end table can be used for either one), or
|
|
|
|
for the MC68020.
|
|
|
|
.AE
|
|
|
|
.NH
|
|
|
|
Introduction
|
|
|
|
.PP
|
|
|
|
To simplify the task of producing portable (cross) compilers and interpreters
|
|
|
|
the Vrije Universiteit designed an integrated collection of programs, the
|
|
|
|
Amsterdam Compiler Kit (ACK) [2]. It is based on the old UNCOL idea [1] which
|
|
|
|
attempts to solve the problem of how to make a compiler for each of @ N @
|
|
|
|
languages on @ M @ different machines without having to write @ N times M @
|
|
|
|
programs.
|
|
|
|
.PP
|
|
|
|
The UNCOL approach is to write @ N @
|
|
|
|
.I
|
|
|
|
front ends,
|
|
|
|
.R
|
|
|
|
which translate the
|
|
|
|
source language into a common intermediate language UNCOL (Universal Computer
|
|
|
|
Oriented Language), and @ M @
|
|
|
|
.I
|
|
|
|
back ends,
|
|
|
|
.R
|
|
|
|
each of which translates programs in
|
|
|
|
UNCOL into a specific machine language. Under these conditions only @ M + N @
|
|
|
|
programs must be written to provide all @ N @ languages on all @ M @
|
|
|
|
machines, instead of @ M times N @ programs.
|
|
|
|
.PP
|
|
|
|
The intermediate language for the Amsterdam Compiler Kit is the machine language
|
|
|
|
for a simple stack machine called EM (Encoding Machine) [3]. So a back end for
|
|
|
|
the MC68020 translates EM code into MC68020 assembly language. Writing a back
end table [4] suffices to get the back end.
|
|
|
|
.PP
|
|
|
|
The back end is a single program that is driven by a machine dependent driving
|
|
|
|
table. This table, the back end table, defines the mapping of EM code to
|
|
|
|
the MC68000, MC68010 or MC68020 assembly language.
|
|
|
|
.NH
|
|
|
|
The MC68000 and MC68020 microprocessors
|
|
|
|
.PP
|
|
|
|
In this document the name MC68000 will be used for both the MC68000 and the
|
|
|
|
MC68010 microprocessors, because as far as the back end table is concerned
|
|
|
|
there is no difference between them. For a complete and detailed description
|
|
|
|
of the MC68020 one is referred to [5]; for the MC68000 one might also use [6].
|
|
|
|
This section covers only the parts that are relevant to the back end table.
|
|
|
|
.NH 2
|
|
|
|
Registers
|
|
|
|
.PP
|
|
|
|
Both the MC68000 and the MC68020 have eight 32-bit data registers (@ D sub 0 @-@ D sub 7 @) that can
|
|
|
|
be used for byte (8-bit), word (16-bit) and long word (32-bit) data operations.
|
|
|
|
They also have seven 32-bit address registers (@ A sub 0 @-@ A sub 6 @) that may be used as
|
|
|
|
software stack pointers and base address registers; address register @ A sub 7 @ is
|
|
|
|
used as the system stack pointer. Address registers may also be used for
|
|
|
|
word and long word address operations.
|
|
|
|
.NH 2
|
|
|
|
Addressing modes
|
|
|
|
.PP
|
|
|
|
First the MC68000 addressing modes will be discussed. Since the MC68020's
|
|
|
|
set of addressing modes is an extension of the MC68000's set, of course this
|
|
|
|
section also applies to the MC68020.
|
|
|
|
.PP
|
|
|
|
In the description we use:
|
|
|
|
.IP @ A sub n @
|
|
|
|
for address register;
|
|
|
|
.IP @ D sub n @
|
|
|
|
for data register;
|
|
|
|
.IP @ R sub n @
|
|
|
|
for address or data register;
|
|
|
|
.IP @ X sub n @
|
|
|
|
for index register (either data or address register);
|
|
|
|
.IP @ PC @
|
|
|
|
for program counter;
|
|
|
|
.IP @ d sub 8 @
|
|
|
|
for 8-bit displacement integer;
|
|
|
|
.IP @ d sub 16 @
|
|
|
|
for 16-bit displacement integer;
|
|
|
|
.IP @ bd @
|
|
|
|
for base displacement (may be null, word or long);
|
|
|
|
.IP @ od @
|
|
|
|
for outer displacement (may be null, word or long).
|
|
|
|
.NH 3
|
|
|
|
General addressing modes
|
|
|
|
.NH 4
|
|
|
|
Register Direct Addressing
|
|
|
|
.IP Syntax: 8
|
|
|
|
@ R sub n @
|
|
|
|
.PP
|
|
|
|
This addressing mode (it can be used with either a data register or an address
|
|
|
|
register) specifies that the operand is in one of
|
|
|
|
the 16 multifunction registers.
|
|
|
|
.NH 4
|
|
|
|
Address Register Indirect
|
|
|
|
.IP Syntax: 8
|
|
|
|
@ ( A sub n ) @
|
|
|
|
.PP
|
|
|
|
The address of the operand is in the address register specified.
|
|
|
|
.NH 4
|
|
|
|
Address Register Indirect With Postincrement
|
|
|
|
.IP Syntax: 8
|
|
|
|
@ ( A sub n )+ @
|
|
|
|
.PP
|
|
|
|
The address of the operand is in the address register specified. After the
|
|
|
|
operand address is used, the address register is incremented by one, two or
|
|
|
|
four depending upon whether the size of the operand is byte, word or long.
|
|
|
|
If the address register is the stack pointer and the operand size is byte, the
|
|
|
|
address register is incremented by two rather than one to keep the stack pointer
|
|
|
|
on a word boundary.
|
|
|
|
.NH 4
|
|
|
|
Address Register Indirect With Predecrement
|
|
|
|
.IP Syntax: 8
|
|
|
|
@ -( A sub n ) @
|
|
|
|
.PP
|
|
|
|
The address of the operand is in the address register specified. Before the
|
|
|
|
operand address is used, the address register is decremented by one, two or
|
|
|
|
four depending upon whether the size of the operand is byte, word or long.
|
|
|
|
If the address register is the stack pointer and the operand size is byte, the
|
|
|
|
address register is decremented by two rather than one to keep the stack pointer
|
|
|
|
on a word boundary.
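.PP
The postincrement and predecrement rules can be summarized in a few lines.
The following is only a sketch in Python; the names SIZES, step,
postincrement and predecrement are made up for this illustration and do not
occur in the table, and 32-bit wrap-around is ignored.
.DS
```python
SIZES = {"byte": 1, "word": 2, "long": 4}

def step(size, is_stack_pointer=False):
    """Amount by which (An)+ or -(An) adjusts the address register."""
    n = SIZES[size]
    # A byte access through the stack pointer moves it by two,
    # so that the stack pointer stays on a word boundary.
    if is_stack_pointer and n == 1:
        return 2
    return n

def postincrement(an, size, is_stack_pointer=False):
    """Return (operand address, new register value) for (An)+."""
    return an, an + step(size, is_stack_pointer)

def predecrement(an, size, is_stack_pointer=False):
    """Return (operand address, new register value) for -(An)."""
    new = an - step(size, is_stack_pointer)
    return new, new
```
.DE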
|
|
|
|
.NH 4
|
|
|
|
Address Register Indirect With Displacement
|
|
|
|
.IP Syntax: 8
|
|
|
|
@ d sub 16 ( A sub n ) @ for the MC68000, @ ( d sub 16 , A sub n ) @ for the MC68020
|
|
|
|
.PP
|
|
|
|
This address mode requires one word of extension. The address of the operand is
|
|
|
|
the sum of the contents of the address register and the sign extended 16-bit
|
|
|
|
integer in the extension word.
|
|
|
|
.NH 4
|
|
|
|
Address Register Indirect With Index
|
|
|
|
.IP Syntax: 8
|
|
|
|
@ d sub 8 ( A sub n , X sub n .size) @ for the MC68000, @ ( d sub 8 , A sub n , X sub n .size) @ for the MC68020
|
|
|
|
.PP
|
|
|
|
This address mode requires one word of extension according to a certain format,
|
|
|
|
which specifies
|
|
|
|
.IP 1.
|
|
|
|
which register to use as index register;
|
|
|
|
.IP 2.
|
|
|
|
a flag that indicates whether the index register is a data register or an
|
|
|
|
address register;
|
|
|
|
.IP 3.
|
|
|
|
a flag that indicates the index size; this is
|
|
|
|
.I word
|
|
|
|
when the low order part of the index register is to be used, and
|
|
|
|
.I long
|
|
|
|
when the whole long value in the register is to be used as index;
|
|
|
|
.IP 4.
|
|
|
|
an 8-bit displacement integer (the low order byte of the extension word).
|
|
|
|
.PP
|
|
|
|
The address of the operand is the sum of the contents of the address register,
|
|
|
|
the possibly sign extended contents of index register and the sign
|
|
|
|
extended 8-bit displacement.
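.PP
As an illustration, the address calculation for this mode might be written
as below. This is a sketch in Python, not part of the table; the helper names
sign_extend and indexed_address are assumptions, and register contents are
modelled as plain 32-bit unsigned integers.
.DS
```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a signed number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value

def indexed_address(an, xn, d8, index_is_long):
    """Operand address for d8(An,Xn.size) on the MC68000."""
    # A word index uses only the low order 16 bits, sign extended;
    # a long index uses the whole register.
    index = sign_extend(xn, 32 if index_is_long else 16)
    return (an + index + sign_extend(d8, 8)) & 0xFFFFFFFF
```
.DE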
|
|
|
|
.NH 4
|
|
|
|
Absolute Data Addressing
|
|
|
|
.IP Syntax: 8
|
|
|
|
@ address @ for the MC68000, @ ( address ) @ for the MC68020
|
|
|
|
.PP
|
|
|
|
Two different kinds of this mode are available:
|
|
|
|
.IP 1.
|
|
|
|
Absolute Short Address; this mode requires one word of extension. The address of
|
|
|
|
the operand is the sign extended 16-bit extension word.
|
|
|
|
.IP 2.
|
|
|
|
Absolute Long Address; this mode requires two words of extension. The address of
|
|
|
|
the operand is developed by concatenation of the two extension words; the high
|
|
|
|
order part of the address is the first extension word, the low order part is
|
|
|
|
the second.
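.PP
The two ways of forming an absolute address can be sketched as follows
(Python, for illustration only; the function names absolute_short and
absolute_long are hypothetical).
.DS
```python
def absolute_short(ext):
    """Sign extend one 16-bit extension word to a 32-bit address."""
    ext &= 0xFFFF
    if ext & 0x8000:
        return ext | 0xFFFF0000
    return ext

def absolute_long(first, second):
    """Concatenate two extension words; the first is the high order part."""
    return ((first & 0xFFFF) << 16) | (second & 0xFFFF)
```
.DE
Note that a short address can therefore only reach the lowest and the
highest 32K bytes of the address space.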
|
|
|
|
.NH 4
|
|
|
|
Program Counter With Displacement
|
|
|
|
.IP Syntax: 8
|
|
|
|
@ d sub 16 ( PC ) @ for the MC68000, @ ( d sub 16 , PC ) @ for the MC68020
|
|
|
|
.PP
|
|
|
|
This mode requires one word of extension. The address of the operand is the sum
|
|
|
|
of the address in the program counter and the sign extended 16-bit displacement
|
|
|
|
integer in the extension word. The value in the program counter is the
|
|
|
|
address of the extension word.
|
|
|
|
.NH 4
|
|
|
|
Program Counter With Index
|
|
|
|
.IP Syntax: 8
|
|
|
|
@ d sub 8 ( PC , X sub n .size ) @ for the MC68000, @ ( d sub 8 , PC, X sub n .size ) @ for the MC68020
|
|
|
|
.PP
|
|
|
|
This mode requires one word of extension as described under
|
|
|
|
.I
|
|
|
|
Address Register Indirect With Index.
|
|
|
|
.R
|
|
|
|
The address of the operand is the sum of the value in the
|
|
|
|
program counter, the possibly sign extended index register and the sign
|
|
|
|
extended 8-bit displacement integer in the extension word.
|
|
|
|
The value in the program counter is the address of the extension word.
|
|
|
|
.NH 4
|
|
|
|
Immediate Data
|
|
|
|
.IP Syntax: 8
|
|
|
|
@ "\#data" @
|
|
|
|
.PP
|
|
|
|
This addressing mode requires either one or two words of extension, depending
|
|
|
|
on the size of the operation:
|
|
|
|
.IP
|
|
|
|
byte operation - the operand is in the low order byte of extension word;
|
|
|
|
.IP
|
|
|
|
word operation - the operand is in the extension word;
|
|
|
|
.IP
|
|
|
|
long operation - the operand is in the two extension words, the high order
|
|
|
|
16 bits are in the first extension word, the low order 16 bits in the second.
|
|
|
|
.NH 3
|
|
|
|
Extra MC68020 addressing modes
|
|
|
|
.PP
|
|
|
|
The MC68020 has three more addressing modes. These modes all use a displacement
|
|
|
|
(some even two), an address register and an index register. Instead of the
|
|
|
|
address register one may also use the program counter. Any of these
|
|
|
|
may be omitted. If all addends are omitted the processor creates an
|
|
|
|
effective address of zero. All of these three modes require at least one
|
|
|
|
extension word, the
|
|
|
|
.I
|
|
|
|
Full Format Extension Word,
|
|
|
|
.R
|
|
|
|
which specifies:
|
|
|
|
.IP 1.
|
|
|
|
the index register number (0-7);
|
|
|
|
.IP 2.
|
|
|
|
the index register type (address or data register);
|
|
|
|
.IP 3.
|
|
|
|
the size of the index (only low order part or the whole register);
|
|
|
|
.IP 4.
|
|
|
|
a scale factor. This is a number from 0 to 3 which specifies how many bits
|
|
|
|
the contents of the index register are to be shifted to the left before being
|
|
|
|
used as an index;
|
|
|
|
.IP 5.
|
|
|
|
a flag that specifies whether the base (address) register is to be added or
|
|
|
|
to be suppressed;
|
|
|
|
.IP 6.
|
|
|
|
a flag that specifies whether to add or suppress the index operand;
|
|
|
|
.IP 7.
|
|
|
|
two bits that specify the size of the base displacement (null, word or long);
|
|
|
|
.IP 8.
|
|
|
|
three bits that in combination with (6) above specify which of the three
|
|
|
|
addressing modes (described below) to use and, if used, the size of the
|
|
|
|
outer displacement (null, word or long).
|
|
|
|
.IP N.B.
|
|
|
|
All modes mentioned above for the MC68000
|
|
|
|
that use an index register may have this register
|
|
|
|
scaled (only when using the MC68020).
|
|
|
|
.PP
|
|
|
|
The three extra addressing modes are:
|
|
|
|
.NH 4
|
|
|
|
Address Register Indirect With Index (Base Displacement)
|
|
|
|
.IP Syntax: 8
|
|
|
|
@ ( bd , A sub n , X sub n .size*scale ) @ (MC68020 only)
|
|
|
|
.PP
|
|
|
|
The address of the operand is the sum of the contents of the address register,
|
|
|
|
the scaled contents of the possibly sign extended index register and the possibly
|
|
|
|
sign extended base displacement. When the program counter is used instead
|
|
|
|
of the address register, the value in the program counter is the address
|
|
|
|
of the full format extension word. This mode requires one or two more extension
|
|
|
|
words when the size of the base displacement is word or long respectively.
|
|
|
|
.PP
|
|
|
|
Note that without the index operand, this mode is an extension of the
|
|
|
|
.I
|
|
|
|
Address Register Indirect With Displacement
|
|
|
|
.R
|
|
|
|
mode; when using the MC68020 one is no longer limited to a 16-bit displacement.
|
|
|
|
Also note that with the index operand added, this mode is an extension
|
|
|
|
of the
|
|
|
|
.I
|
|
|
|
Address Register Indirect With Index
|
|
|
|
.R
|
|
|
|
mode; when using the MC68020 one is no longer limited to an 8-bit displacement.
|
|
|
|
.NH 4
|
|
|
|
Memory Indirect Post-Indexed
|
|
|
|
.IP Syntax: 8
|
|
|
|
@ ( [ bd , A sub n ] , X sub n .size*scale , od ) @ (MC68020 only)
|
|
|
|
.PP
|
|
|
|
This mode may use an outer displacement. First an intermediate memory
|
|
|
|
address is calculated by adding the contents of the address register and
|
|
|
|
the possibly sign extended base displacement. This address is used
|
|
|
|
for an indirect memory access of a long word, followed by adding
|
|
|
|
the index operand (scaled and possibly sign extended). Finally the
|
|
|
|
outer displacement is added to yield the address of the operand.
|
|
|
|
When the program counter is used, the value in the program counter is the
|
|
|
|
address of the full format extension word.
|
|
|
|
.NH 4
|
|
|
|
Memory Indirect Pre-Indexed
|
|
|
|
.IP Syntax: 8
|
|
|
|
@ ( [ bd , A sub n , X sub n .size*scale ] , od ) @ (MC68020 only)
|
|
|
|
.PP
|
|
|
|
This mode may use an outer displacement. First an intermediate memory
|
|
|
|
address is calculated by adding the contents of the address register,
|
|
|
|
the scaled contents of the possibly sign extended index register and
|
|
|
|
the possibly sign extended base displacement. This address is used
|
|
|
|
for an indirect memory access of a long word, followed by adding
|
|
|
|
the outer displacement to yield the address of the operand.
|
|
|
|
When the program counter is used, the value in the program counter is the
|
|
|
|
address of the full format extension word.
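.PP
The difference between the two memory indirect modes is only the moment at
which the index operand is added. The sketch below (Python, illustration
only) models memory as a dictionary from addresses to long words; the names
MASK, scaled_index, post_indexed and pre_indexed are assumptions, and the
index passed in is taken to be sign extended and scaled already.
.DS
```python
MASK = 0xFFFFFFFF  # addresses are 32 bits wide

def scaled_index(xn, scale):
    """Shift the index left by `scale` bits, where scale is 0 to 3."""
    assert 0 <= scale <= 3
    return xn << scale

def post_indexed(mem, bd, an, index, od):
    """Operand address for ([bd,An],Xn.size*scale,od)."""
    intermediate = (an + bd) & MASK
    # The index is added after the indirect memory access.
    return (mem[intermediate] + index + od) & MASK

def pre_indexed(mem, bd, an, index, od):
    """Operand address for ([bd,An,Xn.size*scale],od)."""
    # The index takes part in the intermediate address.
    intermediate = (an + bd + index) & MASK
    return (mem[intermediate] + od) & MASK
```
.DE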
|
|
|
|
.NH 3
|
|
|
|
Addressing modes used in the table
|
|
|
|
.PP
|
|
|
|
Not all addressing modes mentioned above are used in code generation. It is
|
|
|
|
clear that none of the modes that use the program counter PC can be used,
|
|
|
|
since at code generation time nothing is known about the value in PC.
|
|
|
|
Also some of the possibilities of the three MC68020 addressing modes are not
|
|
|
|
used; e.g. it is possible to use a
|
|
|
|
.I
|
|
|
|
Data Register Indirect
|
|
|
|
.R
|
|
|
|
mode, which actually is the
|
|
|
|
.I
|
|
|
|
Address Register Indirect With Index
|
|
|
|
.R
|
|
|
|
mode, with the address register and the displacement left out. However
|
|
|
|
such a mode would require two extra bytes for the full format extension word,
|
|
|
|
and it would also be much slower than using
|
|
|
|
.I
|
|
|
|
Address Register Indirect.
|
|
|
|
.R
|
|
|
|
For reasons of this kind several possible addressing modes are not used in the
|
|
|
|
generation of code.
|
|
|
|
In the table address registers are only used for holding addresses, and
|
|
|
|
for index registers only data registers are used.
|
|
|
|
.NH
|
|
|
|
The MC68000 and MC68020 back end table
|
|
|
|
.PP
|
|
|
|
The table itself has to be run through the C preprocessor
|
|
|
|
before it can be used to generate
|
|
|
|
the back end (called
|
|
|
|
.I
|
|
|
|
code generator
|
|
|
|
.R
|
|
|
|
or
|
|
|
|
.I cg
|
|
|
|
for short). When no flags are given to
|
|
|
|
the preprocessor an MC68020 code generator is produced; for the MC68000
|
|
|
|
code generator one has to run the table through the preprocessor using the
|
|
|
|
.I -Dm68k4
|
|
|
|
flag.
|
|
|
|
.PP
|
|
|
|
The table is designed as described in [4]. For the overall design of a back
|
|
|
|
end table one is referred to this document. This section only deals
|
|
|
|
with problems encountered in writing the table and other things worth noting.
|
|
|
|
.NH 2
|
|
|
|
Constant Definitions
|
|
|
|
.PP
|
|
|
|
Wordsize and pointersize (EM_WSIZE and EM_PSIZE respectively) are defined
|
|
|
|
as four (bytes). EM_BSIZE, the hole between AB (the parameter base) and
|
|
|
|
LB (the local base), is eight bytes: only
|
|
|
|
the return address and the localbase are saved.
|
|
|
|
.NH 2
|
|
|
|
Properties
|
|
|
|
.PP
|
|
|
|
Since Hans van Staveren in his document [4] clearly states that
|
|
|
|
.I cg
|
|
|
|
execution time is negatively influenced by the number of properties, only
|
|
|
|
four different properties have been defined. Besides, since the registers
|
|
|
|
really are multifunctional, these four are really all that are needed.
|
|
|
|
.NH 2
|
|
|
|
Registers
|
|
|
|
.PP
|
|
|
|
The table uses register variables: @ D sub 3 @ - @ D sub 7 @ are used as general register
|
|
|
|
variables, and address registers @ A sub 2 @ - @ A sub 5 @ are used as pointer register
|
|
|
|
variables. @ A sub 6 @ is reserved for the localbase.
|
|
|
|
.NH 2
|
|
|
|
Tokens
|
|
|
|
.PP
|
|
|
|
At first glance one might wonder about the number of tokens, especially
|
|
|
|
for the MC68020, considering the small number of different addressing modes.
|
|
|
|
However, the last three addressing modes mentioned for the MC68020 may
|
|
|
|
omit any of the addends, and this leads to a large number of different tokens.
|
|
|
|
I did consider the possibility of enlarging the number of tokens and sets
|
|
|
|
even further, because there might be assemblers that don't handle displacements
|
|
|
|
of zero optimally (they might generate a 2 byte extension word holding zero).
|
|
|
|
The small profit in bytes in the generated code
|
|
|
|
however does not justify the increase
|
|
|
|
in size of the token section, the set section and the patterns section,
|
|
|
|
so this idea was not developed any further.
|
|
|
|
.PP
|
|
|
|
The timing cost of the tokens may be incorrect for some MC68000 tokens.
|
|
|
|
This is because the MC68000 uses a 16-bit data bus, so fetching a 32-bit
operand takes two separate memory accesses.
|
|
|
|
.NH 3
|
|
|
|
Token names
|
|
|
|
.PP
|
|
|
|
The number of tokens and the limited capability of the author's imagination
|
|
|
|
may have caused the names of some tokens to be less than clear.
|
|
|
|
Some explanation of the names is in order here.
|
|
|
|
.PP
|
|
|
|
Whenever part of a token name is in capitals that part is memory indirected
|
|
|
|
(i.e. in square brackets). In token names
|
|
|
|
.I OFF
|
|
|
|
and
|
|
|
|
.I off
|
|
|
|
mean an offsetted address register, so an address register with a displacement
|
|
|
|
(either base displacement or outer displacement).
|
|
|
|
.I
|
|
|
|
IND, ind
|
|
|
|
.R
|
|
|
|
and
|
|
|
|
.I index
|
|
|
|
stand for indexed, or index register.
|
|
|
|
.I ABS
|
|
|
|
and
|
|
|
|
.I abs
|
|
|
|
stand for absolute, which actually is just a displacement (base or outer).
|
|
|
|
These `rules' only apply to names of tokens that represent actual operands.
|
|
|
|
There are also tokens that represent addresses of operands. These
|
|
|
|
(with a few exceptions) contain
|
|
|
|
.I
|
|
|
|
regA, regX
|
|
|
|
.R
|
|
|
|
and
|
|
|
|
.I con
|
|
|
|
as parts of their names, which stand for address register, index register and
|
|
|
|
displacement (always base displacement) respectively. If the address to which
|
|
|
|
the token refers uses memory indirection, that part of the name comes first
|
|
|
|
(in small letters), followed by an underscore. The memory indirection part
|
|
|
|
follows the `rules' for operand token names.
|
|
|
|
.PP
|
|
|
|
Of course there are exceptions to these `rules' but in those cases the names
|
|
|
|
are self explanatory.
|
|
|
|
.PP
|
|
|
|
Two special cases:
|
|
|
|
.I ext_regX
|
|
|
|
is the name of the token that represents the
|
|
|
|
address of an absolute indexed operand, syntax @ ( bd , X sub n .size*scale ) @;
|
|
|
|
.I regX
|
|
|
|
does not represent any real mode, but is used with EM array instructions and
|
|
|
|
pointer arithmetic.
|
|
|
|
.NH 3
|
|
|
|
Special tokens for the MC68000
|
|
|
|
.PP
|
|
|
|
The MC68000 requires two extra tokens, which are called
|
|
|
|
.I t_regAcon
|
|
|
|
and
|
|
|
|
.I
|
|
|
|
t_regAregXcon.
|
|
|
|
.R
|
|
|
|
They are necessary because
|
|
|
|
.I regAcon
|
|
|
|
can only have a 16-bit displacement on the MC68000, and
|
|
|
|
.I regAregXcon
|
|
|
|
uses only 8 bits for its displacement. To prevent these addressing modes from
being used with displacements that are too large, the extra tokens are needed.
|
|
|
|
Whenever the displacements become too large and they need
|
|
|
|
to be used in the generation
|
|
|
|
of assembly code, these tokens are transformed into other tokens.
|
|
|
|
To prevent the table from becoming too messy I defined
|
|
|
|
.I t_regAcon
|
|
|
|
and
|
|
|
|
.I t_regAregXcon
|
|
|
|
to be identical to
|
|
|
|
.I regAcon
|
|
|
|
and
|
|
|
|
.I regAregXcon
|
|
|
|
respectively for the MC68020.
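.PP
The size check behind these two tokens comes down to a signed range test.
The following Python sketch only illustrates that test; the names LIMITS,
fits and proper_token are made up here, and how the table actually rewrites
a token whose displacement is too large is not shown.
.DS
```python
# Width in bits of the displacement field on the MC68000.
LIMITS = {"t_regAcon": 16, "t_regAregXcon": 8}

def fits(value, bits):
    """True when value is representable as a signed number of `bits` bits."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))

def proper_token(token, displacement):
    """Name of the real token, or None when a transformation is needed."""
    if fits(displacement, LIMITS[token]):
        return token[2:]  # drop the leading "t_"
    return None
```
.DE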
|
|
|
|
.NH 2
|
|
|
|
Sets
|
|
|
|
.PP
|
|
|
|
Most set names used in the table are self explanatory, especially to the reader
|
|
|
|
who is familiar with the four addressing categories as mentioned in [5]:
|
|
|
|
.I
|
|
|
|
data, memory, alterable
|
|
|
|
.R
|
|
|
|
and
|
|
|
|
.I
|
|
|
|
control.
|
|
|
|
.R
|
|
|
|
In the sets definition part some sets are defined that are not used elsewhere in
|
|
|
|
the table, but are only used to be part of the definition of
|
|
|
|
some other set. This keeps the
|
|
|
|
set definition part from getting too unreadable.
|
|
|
|
.PP
|
|
|
|
The sets called
|
|
|
|
.I imm_cmp
|
|
|
|
consist of all tokens that can be used to compare with a constant.
|
|
|
|
.NH 2
|
|
|
|
Instructions
|
|
|
|
.PP
|
|
|
|
Only the instructions that are used in code generation are listed here.
|
|
|
|
The first few instructions are meant especially for the use with register
|
|
|
|
variables. The operand LOCAL used here refers to a register variable.
|
|
|
|
The reader may not conclude that these operations are also allowed on
|
|
|
|
ordinary locals. The space and timing costs of these instructions have been
|
|
|
|
adapted, but the use of the word LOCAL for register variables causes these costs
|
|
|
|
to be inaccurate anyway.
|
|
|
|
.PP
|
|
|
|
The
|
|
|
|
.I killreg
|
|
|
|
instruction, which generates a comment in the assembly language output and
|
|
|
|
which is meant to let
|
|
|
|
.I cg
|
|
|
|
know that the data register operand has its contents destroyed,
|
|
|
|
needs some explaining but this explanation is better in place
|
|
|
|
in the discussion of groups 3 and 4 of the section about patterns.
|
|
|
|
.PP
|
|
|
|
The timing costs of the instructions are probably not very accurate for the
|
|
|
|
MC68020 because the MC68020 uses an instruction cache and prefetch. The
|
|
|
|
costs used in the table are the `worst case cost' as mentioned in section 9
|
|
|
|
of [5].
|
|
|
|
.NH 2
|
|
|
|
Moves
|
|
|
|
.PP
|
|
|
|
These are all pretty straightforward, except perhaps when
|
|
|
|
.I t_regAcon
|
|
|
|
and
|
|
|
|
.I t_regAregXcon
|
|
|
|
are used. In these cases the size of the displacement has to be checked
|
|
|
|
before moving. This also applies to the stacking rules and the coercions.
|
|
|
|
.NH 2
|
|
|
|
Tests
|
|
|
|
.PP
|
|
|
|
These three tests (one for each operation size) could not be more
|
|
|
|
straightforward than they are now.
|
|
|
|
.NH 2
|
|
|
|
Stackingrules
|
|
|
|
.PP
|
|
|
|
The only peculiar stackingrule is the one for
|
|
|
|
.I
|
|
|
|
regX.
|
|
|
|
.R
|
|
|
|
This token is only used with EM array instructions and
|
|
|
|
with pointer arithmetic. Whenever it is put
|
|
|
|
on the fake stack, some EM instructions are left in the instruction stream
|
|
|
|
to remove this token. Consequently it should never have to be stacked. However
|
|
|
|
the
|
|
|
|
.I
|
|
|
|
code generator generator
|
|
|
|
.R
|
|
|
|
(or
|
|
|
|
.I cgg
|
|
|
|
for short)
|
|
|
|
complained about not having a stackingrule for this token, so it had to
|
|
|
|
be added nevertheless.
|
|
|
|
.NH 2
|
|
|
|
Coercions
|
|
|
|
.PP
|
|
|
|
These are all straightforward. There are no splitting coercions since
|
|
|
|
the fake stack never contains any tokens that can be split.
|
|
|
|
There are only two unstacking coercions.
|
|
|
|
The rest are all transforming coercions. Almost all coercions transform
|
|
|
|
tokens into either a data register or an address register, except in the
|
|
|
|
MC68000 part of the table the
|
|
|
|
.I t_regAcon
|
|
|
|
and
|
|
|
|
.I t_regAregXcon
|
|
|
|
tokens are transformed into real
|
|
|
|
.I regAcon
|
|
|
|
and
|
|
|
|
.I regAregXcon
|
|
|
|
tokens with displacements that are properly sized.
|
|
|
|
.NH 2
|
|
|
|
Patterns
|
|
|
|
.PP
|
|
|
|
This is the largest part of the table. It is subdivided into 17 groups.
|
|
|
|
We will take a closer look at the more interesting groups.
|
|
|
|
.NH 3
|
|
|
|
Group 0: rules for register variables
|
|
|
|
.PP
|
|
|
|
This group makes sure that EM instructions using register variables are
|
|
|
|
handled efficiently. This group includes: local loads and
|
|
|
|
stores; arithmetic, shifts and logical operations on locals and indirect locals
|
|
|
|
and pointer handling, where C expressions like
|
|
|
|
.I
|
|
|
|
*cp++
|
|
|
|
.R
|
|
|
|
are handled. For such an expression there are several EM instruction
|
|
|
|
sequences the front end might generate. For an integer pointer e.g.:
|
|
|
|
.DS
|
|
|
|
.B
|
|
|
|
lol lol adp stl loi $1==$2 && $1==$4 && $3==4 && $5==4
|
|
|
|
.R
|
|
|
|
.DE
|
|
|
|
or
|
|
|
|
.DS
|
|
|
|
.B
|
|
|
|
lol loi lol adp stl $1==$3 && $3==$5 && $2==4 && $4==4
|
|
|
|
.R
|
|
|
|
.DE
|
|
|
|
or perhaps even
|
|
|
|
.DS
|
|
|
|
.B
|
|
|
|
lil lol adp stl $1==$2 && $2==$4 && $3==4
|
|
|
|
.R
|
|
|
|
.DE
|
|
|
|
Each of these is included, since which one is generated is up to the front
|
|
|
|
end. If the front end is consistent this will mean that some of these patterns
|
|
|
|
will never be used in code generation. This might seem a waste, but anyone
|
|
|
|
who thinks that will certainly change his mind when his new C front end
|
|
|
|
generates a different EM instruction sequence.
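.PP
To make the meaning of such a pattern condition concrete, here is a small
Python sketch that tests the first of the three sequences above. It is an
illustration only and has nothing to do with how
.I cg
matches patterns internally; the function name and the representation of EM
instructions as (mnemonic, argument) pairs are assumptions.
.DS
```python
def matches_lol_lol_adp_stl_loi(instrs):
    """Check: lol lol adp stl loi $1==$2 && $1==$4 && $3==4 && $5==4."""
    if [m for m, _ in instrs] != ["lol", "lol", "adp", "stl", "loi"]:
        return False
    # Pad with None so that a[1] .. a[5] correspond to $1 .. $5.
    a = [None] + [arg for _, arg in instrs]
    return a[1] == a[2] and a[1] == a[4] and a[3] == 4 and a[5] == 4
```
.DE
Here $1, $2 and $4 are the offsets of the same local (the pointer), $3 is
the amount added to it and $5 is the size of the object loaded.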
|
|
|
|
.NH 3
|
|
|
|
Groups 1 and 2: load and store instructions
|
|
|
|
.PP
|
|
|
|
In these groups
|
|
|
|
.B lof
|
|
|
|
and
|
|
|
|
.B stf ,
|
|
|
|
.B loi
|
|
|
|
and
|
|
|
|
.B sti ,
|
|
|
|
.B ldf
|
|
|
|
and
|
|
|
|
.B sdf
|
|
|
|
are the important instructions.
|
|
|
|
These are the large parts in these groups, especially the
|
|
|
|
.B loi
|
|
|
|
and
|
|
|
|
.B sti
|
|
|
|
instructions, because they come in three basic sizes (byte, word and long).
|
|
|
|
Note that with these instructions in the MC68000 part the
|
|
|
|
.I exact
|
|
|
|
is omitted in front of
|
|
|
|
.I regAcon
|
|
|
|
and
|
|
|
|
.I
|
|
|
|
regAregXcon.
|
|
|
|
.R
|
|
|
|
This makes sure that
|
|
|
|
.I t_regAcon
|
|
|
|
and
|
|
|
|
.I t_regAregXcon
|
|
|
|
are transformed into proper tokens before they are used as addresses.
|
|
|
|
.PP
|
|
|
|
Also note that the
|
|
|
|
.I regAregXcon
|
|
|
|
token is completely left out from the
|
|
|
|
\fBlof\fR, \fBstf\fR, \fBldf\fR and \fBsdf\fR
|
|
|
|
instruction handling. This is because the sum of the token displacement
|
|
|
|
and the offset provided in the instruction cannot be checked and is likely
|
|
|
|
to exceed 8 bits. Unfortunately
|
|
|
|
.I cgg
|
|
|
|
does not allow the inspection of subregisters of tokens that are on the
|
|
|
|
fake stack. This same problem might also occur with the
|
|
|
|
.I regAcon
|
|
|
|
token, but this is less likely because it
|
|
|
|
uses 16-bit displacements. Besides, if it had been left out, the
|
|
|
|
\fBlof\fR, \fBstf\fR, \fBldf\fR and \fBsdf\fR
|
|
|
|
instructions would have been handled considerably less efficiently.
|
|
|
|
.NH 3
|
|
|
|
Groups 3 and 4: integer and unsigned arithmetic
|
|
|
|
.PP
|
|
|
|
EM instruction
|
|
|
|
.B sbi
|
|
|
|
also works with address registers, because the
|
|
|
|
.B cmp
|
|
|
|
instruction in group 12 is replaced by \fBsbi 4\fR.
|
|
|
|
.PP
|
|
|
|
For the MC68000 \fBmli\fR, \fBmlu\fR, \fBdvi\fR, \fBdvu\fR, \fBrmi\fR
|
|
|
|
and \fBrmu\fR are handled
|
|
|
|
by library routines. This is because the MC68000 has only 16-bit multiplications
|
|
|
|
and divisions.
|
|
|
|
.PP
|
|
|
|
The MC68020 does have 32-bit multiplications and divisions, but for the
|
|
|
|
.B rmi
|
|
|
|
and
|
|
|
|
.B rmu
|
|
|
|
EM instructions peculiar things happen anyway: they generate the
|
|
|
|
.I killreg
|
|
|
|
instruction. This is necessary because the data register that
|
|
|
|
first held the dividend now holds the quotient; the original contents are
|
|
|
|
destroyed without
|
|
|
|
.I cg
|
|
|
|
knowing about it (the destruction of the two registers that make up the
|
|
|
|
.I DREG_pair
|
|
|
|
token couldn't be noted in the instructions part of the table).
|
|
|
|
To let
|
|
|
|
.I cg
|
|
|
|
know that these contents are destroyed, we have to use this `pseudo instruction'
|
|
|
|
for lack of a better solution.
|
|
|
|
.NH 3
|
|
|
|
Group 5: floating point arithmetic
|
|
|
|
.PP
|
|
|
|
Since floating point arithmetic is not implemented, traps will be generated here.
|
|
|
|
.NH 3
|
|
|
|
Group 6: pointer arithmetic
|
|
|
|
.PP
|
|
|
|
This also is a very important group, along with groups 1 and 2. The MC68020
|
|
|
|
has many different addressing modes and if possible they should be used in
|
|
|
|
the generation of assembly language.
|
|
|
|
.PP
|
|
|
|
The
|
|
|
|
.I regX
|
|
|
|
token is generated here too. It is meant to make efficient use of the
|
|
|
|
MC68020 possibility of scaling index registers.
|
|
|
|
.PP
|
|
|
|
Note that I would have liked one extra pattern to handle C-statements
|
|
|
|
like
|
|
|
|
.DS
|
|
|
|
.I
|
|
|
|
pointer += expr ? constant1 : constant2;
|
|
|
|
.R
|
|
|
|
.DE
|
|
|
|
efficiently. This pattern would have looked like:
|
|
|
|
.DS
|
|
|
|
pat ads
|
|
|
|
with const
|
|
|
|
leaving adp %1.num
|
|
|
|
.DE
|
|
|
|
but when
|
|
|
|
.I cg
|
|
|
|
is coming to the EM replacement part, the constant has already been removed
|
|
|
|
from the fake stack, causing
|
|
|
|
.I %1.num
|
|
|
|
to have a wrong value.
|
|
|
|
.NH 3
|
|
|
|
Group 9: logical instructions
|
|
|
|
.PP
|
|
|
|
The EM instructions \fBand\fR,
|
|
|
|
.B ior
|
|
|
|
and
|
|
|
|
.B xor
|
|
|
|
are so much alike that procedures can be used here, except for the
|
|
|
|
.B
|
|
|
|
xor $1==4
|
|
|
|
.R
|
|
|
|
instruction, because the MC68000
|
|
|
|
.I eor
|
|
|
|
instruction does not allow as many kinds of operands as
|
|
|
|
.I and
|
|
|
|
and
|
|
|
|
.I
|
|
|
|
or.
|
|
|
|
.R
|
|
|
|
.NH 3
|
|
|
|
Group 11: arrays
|
|
|
|
.PP
|
|
|
|
This group also tries to make efficient use of the available addressing modes,
|
|
|
|
but it leaves the actual work to group 6 mentioned above.
|
|
|
|
.PP
|
|
|
|
The
|
|
|
|
.I regX
|
|
|
|
token is also generated here. In this group this token is very useful for
|
|
|
|
handling array instructions for arrays with one, two, four or eight byte
|
|
|
|
elements; the array index goes into the index register, which can then
|
|
|
|
be scaled appropriately. An offset is used when the
|
|
|
|
first array element has an index other than zero.
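.PP
The address calculation that the scaled index register makes possible can
be sketched as follows (Python, illustration only). The names SCALE_BITS and
element_address are hypothetical, and the lower bound of the array is
assumed to be folded into a constant offset.
.DS
```python
# Scale factor (number of bits to shift) for each supported element size.
SCALE_BITS = {1: 0, 2: 1, 4: 2, 8: 3}

def element_address(base, index, elem_size, lower_bound=0):
    """Address of element `index` of an array starting at `base`."""
    shift = SCALE_BITS[elem_size]
    # The offset compensates for a first element whose index is not zero.
    offset = -lower_bound * elem_size
    return base + offset + (index << shift)
```
.DE
For other element sizes no scale factor exists, so the index has to be
multiplied explicitly.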
|
|
|
|
.PP
|
|
|
|
I would have liked some extra patterns here too but they won't work
|
|
|
|
for the same reasons as explained in the discussion of group 6.
|
|
|
|
.NH 3
|
|
|
|
Group 14: procedure call instructions
|
|
|
|
.PP
|
|
|
|
The function return area consists of registers @ D sub 0 @ and @ D sub 1 @.
|
|
|
|
.NH 3
|
|
|
|
Group 15: miscellaneous instructions
|
|
|
|
.PP
|
|
|
|
In many cases here library routines are called. These will be discussed
|
|
|
|
later.
|
|
|
|
.PP
|
|
|
|
Two special EM instructions are included here: \fBdch\fR, and \fBlpb\fR.
|
|
|
|
I don't know when they are generated by a front end, but these
|
|
|
|
instructions were also in the back end table for the PDP. In the PDP table
|
|
|
|
these instructions were replaced by
|
|
|
|
.B
|
|
|
|
loi 4
|
|
|
|
.R
|
|
|
|
and
|
|
|
|
.B
|
|
|
|
adp 8
|
|
|
|
.R
|
|
|
|
respectively. I included them both, since they couldn't do any harm.
|
|
|
|
.NH 3
|
|
|
|
Extra group: optimization
|
|
|
|
.PP
|
|
|
|
This group handles EM patterns with more than one instruction. This group
|
|
|
|
is not absolutely necessary but it makes the generation of code
|
|
|
|
more efficient. Among the things that are handled here are: arithmetic and
|
|
|
|
logical operations on locals, externals and indirect locals; shifting
|
|
|
|
of locals, externals and indirect locals by one; some pointer arithmetic; tests
|
|
|
|
in combination with logical and's and or's or with branches. Finally
|
|
|
|
there are sixteen patterns about divisions that could be handled more
|
|
|
|
efficiently by right shifts and which I think should be handled by the
|
|
|
|
peephole optimizer (since it also handles
|
|
|
|
the same patterns with multiplication).
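.PP
The equivalence that such patterns rely on can be sketched in C as follows.
The function name is hypothetical, and the sketch is restricted to unsigned
operands; whether the patterns in the table also cover signed operands is not
stated here, and for negative signed values a plain arithmetic shift rounds
differently from a truncating division.

```c
#include <assert.h>

/* Hypothetical helper: unsigned division by 2^k done with a right shift.
 * For unsigned x this gives the same result as x / (1UL << k). */
unsigned long div_pow2(unsigned long x, unsigned k)
{
    assert(k < 32);   /* unsigned long has at least 32 bits */
    return x >> k;
}
```

For example, dividing by 8 corresponds to a right shift by 3.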
|
|
|
|
.NH
|
|
|
|
The library routines
|
|
|
|
.PP
|
|
|
|
The table is supplied with two separate libraries: one for the MC68000 and one
|
|
|
|
for the MC68020. The MC68000 uses a couple more routines than the MC68020
|
|
|
|
because it doesn't have 32-bit division and multiplication.
|
|
|
|
.PP
|
|
|
|
The routines that need to pop their operands first store their return address.
|
|
|
|
Routines that need other registers besides @ D sub 0 @-@ D sub 2 @ and @ A sub 0 @-@ A sub 1 @ first store
|
|
|
|
the original contents of those registers. @ D sub 0 @-@ D sub 2 @ and @ A sub 0 @-@ A sub 1 @ do not have
|
|
|
|
to be saved because if they contain anything useful, their contents
|
|
|
|
are pushed on the stack before the routine is called.
|
|
|
|
.PP
|
|
|
|
The
|
|
|
|
.I .trp
|
|
|
|
routine just prints a message stating the trap number and exits (except
|
|
|
|
of course when that particular trap number is masked). Usually higher
|
|
|
|
level languages use their own trap handling routines.
|
|
|
|
.PP
|
|
|
|
The
|
|
|
|
.I .mon
|
|
|
|
routine doesn't do anything useful at all. It just prints a message stating that
|
|
|
|
the specified system call is not implemented and then exits. Front ends
|
|
|
|
usually generate calls to special routines rather than the EM
|
|
|
|
instruction \fBmon\fR.
|
|
|
|
These routines have to be supplied in another library. They
|
|
|
|
may be system dependent (e.g. the MC68000 machine this table was tested on
|
|
|
|
first moves the parameters to registers, then moves the system call number
|
|
|
|
to @ D sub 0 @ and then executes
|
|
|
|
.I
|
|
|
|
trap #0,
|
|
|
|
.R
|
|
|
|
whereas the MC68020 machine this table was tested on required the parameters
|
|
|
|
to be on the stack rather than in registers). Therefore this library is not
|
|
|
|
discussed here.
|
|
|
|
.PP
|
|
|
|
The
|
|
|
|
.I .printf
|
|
|
|
routine is included for EM diagnostic messages. It can print strings using %s,
|
|
|
|
16-bit decimal numbers using %d and 32-bit hexadecimal numbers using %x.
|
|
|
|
.PP
|
|
|
|
The
|
|
|
|
.I .strhp
|
|
|
|
routine stores a new EM heap pointer, and sometimes it needs to allocate more
|
|
|
|
heap space. This is done by calling the system call routine \fI_brk\fR.
|
|
|
|
Chunks of 1K bytes are allocated, but this can easily be changed into
|
|
|
|
larger or smaller chunks.
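.PP
A minimal sketch of the chunked allocation, assuming that the new break is
simply rounded up to a multiple of the chunk size. The actual routine is
written in assembly and may compute the break differently; the names
CHUNK and new_break are hypothetical.

```c
#include <assert.h>

#define CHUNK 1024UL   /* allocation granularity; change to taste */

/* Hypothetical helper: given the current break and the heap pointer that
 * is about to be stored, return the break to request from the system. */
unsigned long new_break(unsigned long cur_break, unsigned long wanted)
{
    assert(CHUNK != 0);
    if (wanted <= cur_break)
        return cur_break;                        /* enough space already */
    return (wanted + CHUNK - 1) / CHUNK * CHUNK; /* round up to a chunk */
}
```

Only when the returned value differs from the current break would the system
call routine have to be called.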
|
|
|
|
.PP
|
|
|
|
The MC68000 library also contains a routine to handle the EM instruction \fBrck\fR.
|
|
|
|
The MC68020 has an instruction
|
|
|
|
.I cmp2
|
|
|
|
that is specially meant for range checking so the MC68020 library can do without
|
|
|
|
that routine.
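.PP
What a range check has to decide can be sketched in C as below. The
descriptor layout and the names are assumptions made for illustration; the
library routine itself is written in assembly and traps when the check fails.

```c
#include <assert.h>

/* Hypothetical range descriptor; both bounds are inclusive. */
struct range {
    long lo;
    long hi;
};

/* Returns nonzero when value lies within the range, zero otherwise.
 * A real rck implementation would trap instead of returning zero. */
int in_range(const struct range *r, long value)
{
    assert(r->lo <= r->hi);
    return r->lo <= value && value <= r->hi;
}
```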
|
|
|
|
.PP
|
|
|
|
The MC68000 library has two multiplication routines, one for unsigned and the other
|
|
|
|
for signed multiplication. The one for unsigned multiplication
|
|
|
|
first tests the sizes of the operands, to see if it can perform
|
|
|
|
the 16 bit machine instruction instead of the routine. If not, it considers
|
|
|
|
its two operands to be two-digit numbers in a radix-65536 system. It
|
|
|
|
uses the 16-bit unsigned multiply instruction
|
|
|
|
.I mulu
|
|
|
|
three times (it does not calculate the high order result),
|
|
|
|
and adds up the intermediary results the proper way. The signed
|
|
|
|
multiplication routine calculates the sign of the result, calculates
|
|
|
|
the result as if it were an unsigned multiplication, and
|
|
|
|
adjusts the sign of the result. Here testing
|
|
|
|
the operands for their sizes would be less simple, because the operands
|
|
|
|
are signed, so that is not done here.
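.PP
The scheme can be sketched in C as follows. This is an illustration under
stated assumptions, not a transcription of the assembly routines: the names
mulu16, mul32 and mul32_signed are hypothetical, and mulu16 merely stands
in for the 16-bit unsigned multiply instruction. Each operand is split into
two 16-bit digits; the product of the two high digits is skipped because it
only contributes to the high order result.

```c
#include <assert.h>
#include <stdint.h>

/* 16 x 16 -> 32 bit unsigned multiply, standing in for mulu. */
uint32_t mulu16(uint16_t x, uint16_t y)
{
    return (uint32_t)x * (uint32_t)y;
}

/* Low 32 bits of a 32 x 32 bit unsigned product. */
uint32_t mul32(uint32_t a, uint32_t b)
{
    uint16_t al = (uint16_t)(a & 0xFFFF), ah = (uint16_t)(a >> 16);
    uint16_t bl = (uint16_t)(b & 0xFFFF), bh = (uint16_t)(b >> 16);
    uint32_t cross;

    if (ah == 0 && bh == 0)
        return mulu16(al, bl);      /* both operands fit in 16 bits */

    /* Cross products; only their low 16 bits survive the shift. */
    cross = mulu16(ah, bl) + mulu16(al, bh);
    return mulu16(al, bl) + (cross << 16);
}

/* Signed multiply: work out the sign, multiply the magnitudes, fix the sign. */
int32_t mul32_signed(int32_t a, int32_t b)
{
    int negative = (a < 0) != (b < 0);
    uint32_t ua = a < 0 ? 0u - (uint32_t)a : (uint32_t)a;
    uint32_t ub = b < 0 ? 0u - (uint32_t)b : (uint32_t)b;
    uint32_t p = mul32(ua, ub);

    /* Conversion back to int32_t assumes two's complement, as on the MC68000. */
    return (int32_t)(negative ? 0u - p : p);
}
```

In the general case mul32 calls mulu16 exactly three times, matching the
three multiply instructions mentioned above.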
|
|
|
|
.PP
|
|
|
|
The MC68000 library also has two division routines. The routine for unsigned
|
|
|
|
division uses the popular algorithm, where the dividend is shifted out and
|
|
|
|
the quotient shifted in. The signed division routine calculates the sign of
|
|
|
|
both the quotient and the remainder, calls the unsigned division routine
|
|
|
|
and adjusts the signs for the quotient and the remainder.
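.PP
A minimal C sketch of this kind of division, assuming the usual
shift-and-subtract scheme in which the dividend is shifted out one bit at a
time while the quotient bits are shifted in. The names and result structures
are hypothetical, and the signed variant assumes truncation towards zero, so
that the remainder takes the sign of the dividend.

```c
#include <assert.h>
#include <stdint.h>

struct udiv_result { uint32_t quot, rem; };
struct sdiv_result { int32_t quot, rem; };

/* Unsigned 32 / 32 bit division by shifting and subtracting. */
struct udiv_result udiv32(uint32_t dividend, uint32_t divisor)
{
    struct udiv_result r = { 0, 0 };
    int i;

    assert(divisor != 0);
    for (i = 0; i < 32; i++) {
        /* Shift the top bit of the dividend into the remainder.
         * The remainder stays below 2^31 before each shift, so no bit is lost. */
        r.rem = (r.rem << 1) | (dividend >> 31);
        dividend <<= 1;
        r.quot <<= 1;
        if (r.rem >= divisor) {
            r.rem -= divisor;
            r.quot |= 1;            /* shift a 1 into the quotient */
        }
    }
    return r;
}

/* Signed division: determine the signs, divide the magnitudes, fix the signs. */
struct sdiv_result sdiv32(int32_t a, int32_t b)
{
    uint32_t ua = a < 0 ? 0u - (uint32_t)a : (uint32_t)a;
    uint32_t ub = b < 0 ? 0u - (uint32_t)b : (uint32_t)b;
    struct udiv_result u = udiv32(ua, ub);
    struct sdiv_result s;

    /* Conversions back to int32_t assume two's complement. */
    s.quot = (int32_t)(((a < 0) != (b < 0)) ? 0u - u.quot : u.quot);
    s.rem = (int32_t)(a < 0 ? 0u - u.rem : u.rem);
    return s;
}
```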
|
|
|
|
.PP
|
|
|
|
The
|
|
|
|
.I .nop
|
|
|
|
routine is included for testing purposes. This routine prints the line
|
|
|
|
number and the value in the stack pointer. Calls to this routine
|
|
|
|
are generated by the EM instruction \fBnop\fR, which is ordinarily
|
|
|
|
left out by the peephole optimizer.
|
|
|
|
.NH
|
|
|
|
Testing the table
|
|
|
|
.PP
|
|
|
|
There are special test programs available for testing back end tables.
|
|
|
|
First there is the EM test set, which tests most EM instructions, making
|
|
|
|
good use of the
|
|
|
|
.B nop
|
|
|
|
instruction. Then there are the Pascal and C test programs. The Pascal
|
|
|
|
test programs report errors, which makes it relatively easy
|
|
|
|
to find out what was wrong in the table. The C test programs just
|
|
|
|
generate some output, which then has to be compared to the expected
|
|
|
|
output. Differences are
|
|
|
|
not only caused by errors but also e.g. by the use of four
|
|
|
|
byte integers and unsigneds (which this table does),
|
|
|
|
the use of signed characters
|
|
|
|
instead of unsigned characters (the C front end I used generated signed
|
|
|
|
characters) or because the back end
|
|
|
|
does not support floating point.
|
|
|
|
These differences have to be `filtered out' to reveal
|
|
|
|
the differences caused by actual errors in the back end table.
|
|
|
|
These errors then have to be tracked down by examining the assembly code, for
|
|
|
|
no proper diagnostic messages are generated.
|
|
|
|
.PP
|
|
|
|
After these three basic tests there still remain a number of patterns that
|
|
|
|
haven't been tested yet. Fortunately
|
|
|
|
.I cgg
|
|
|
|
offers the possibility of generating a special
|
|
|
|
.I cg
|
|
|
|
that can print a list of patterns that haven't been used in
|
|
|
|
code generation yet.
|
|
|
|
For these patterns the table writer has to write his own test programs.
|
|
|
|
This may complicate things a bit because errors may now be caused by
|
|
|
|
errors in the back end table as well as errors in the test programs.
|
|
|
|
The latter happened quite often to me, because I found EM
|
|
|
|
to be an uncomfortable programming language (of course it isn't meant to
|
|
|
|
be a programming language, but an intermediate language).
|
|
|
|
.PP
|
|
|
|
There still remain a couple of patterns in this table that haven't been tested
|
|
|
|
yet. However these patterns all have very similar cases that have been
|
|
|
|
tested (an example of this is mentioned in the section on group 0
|
|
|
|
of the patterns section of the table). Some patterns have to
|
|
|
|
do with floating point numbers. These EM instructions all generate
|
|
|
|
traps, so they didn't all have to be tested. The two instructions
|
|
|
|
.B dch
|
|
|
|
and
|
|
|
|
.B lpb
|
|
|
|
haven't been tested in this table, but since they only use EM replacement
|
|
|
|
and they have been tested in the PDP back end table, these two should
|
|
|
|
be all right.
|
|
|
|
.NH
|
|
|
|
Performance of the back end
|
|
|
|
.PP
|
|
|
|
To test the performance of the back end I gathered a couple of
|
|
|
|
C programs and compiled them on the machines I used to test the back ends on.
|
|
|
|
I compiled them using the C compiler that was available there and
|
|
|
|
I also compiled them using the back end. I then compared the sizes
|
|
|
|
of the text segments in the object files.
|
|
|
|
The final results of these comparisons are in fig. 1 and fig. 2.
|
|
|
|
.KF
|
|
|
|
.TS
|
|
|
|
center box;
|
|
|
|
cfI s s s s s
|
|
|
|
c s s s s s
|
|
|
|
c c | c s | c s
|
|
|
|
c c | c s | c s
|
|
|
|
c | c | c c | c c
|
|
|
|
l | n | n n | n n.
|
|
|
|
Differences in text segment sizes for the MC68000
|
|
|
|
parts of the back end compiled by itself
|
|
|
|
_
|
|
|
|
original old m68k4 new MC68000
|
|
|
|
compiler (100%) back end back end
|
|
|
|
_
|
|
|
|
name size size perc. size perc.
|
|
|
|
_
|
|
|
|
codegen.c 13892 16224 116.7% 12860 92.5%
|
|
|
|
compute.c 4340 4502 103.7% 4530 104.3%
|
|
|
|
equiv.c 680 662 97.3% 598 87.9%
|
|
|
|
fillem.c 8016 7304 91.1% 6880 85.8%
|
|
|
|
gencode.c 1356 1194 88.0% 1130 83.3%
|
|
|
|
glosym.c 224 202 90.1% 190 84.8%
|
|
|
|
main.c 732 672 91.8% 634 86.6%
|
|
|
|
move.c 1876 1526 81.3% 1410 75.1%
|
|
|
|
nextem.c 1288 1594 123.7% 1192 92.5%
|
|
|
|
reg.c 1076 1014 94.2% 916 85.1%
|
|
|
|
regvar.c 1352 1188 87.8% 1150 85.0%
|
|
|
|
salloc.c 1240 1100 88.7% 1024 82.5%
|
|
|
|
state.c 628 600 95.5% 532 84.7%
|
|
|
|
subr.c 6948 6382 91.8% 5680 81.7%
|
|
|
|
=
|
|
|
|
averages 2939 3155 95.8% 2766 86.6%
|
|
|
|
.TE
|
|
|
|
.DS C
|
|
|
|
Fig. 1
|
|
|
|
.DE
|
|
|
|
.KE
|
|
|
|
.KF
|
|
|
|
.TS
|
|
|
|
center box;
|
|
|
|
cfI s s s
|
|
|
|
cfI s s s
|
|
|
|
c s s s
|
|
|
|
c s s s
|
|
|
|
c c | c s
|
|
|
|
c c | c s
|
|
|
|
c | c | c c
|
|
|
|
l | n | n n.
|
|
|
|
Differences in text segment sizes
|
|
|
|
for the MC68020
|
|
|
|
parts of the back end
|
|
|
|
compiled by itself
|
|
|
|
_
|
|
|
|
original MC68020
|
|
|
|
compiler (100%) back end
|
|
|
|
_
|
|
|
|
name size size perc.
|
|
|
|
_
|
|
|
|
codegen.c 12608 12134 96.2%
|
|
|
|
compute.c 4624 4416 95.5%
|
|
|
|
equiv.c 572 504 88.1%
|
|
|
|
fillem.c 7780 6976 89.6%
|
|
|
|
gencode.c 1320 1086 82.2%
|
|
|
|
glosym.c 228 182 79.8%
|
|
|
|
main.c 736 596 80.9%
|
|
|
|
move.c 1392 1280 91.9%
|
|
|
|
nextem.c 1176 1066 90.6%
|
|
|
|
reg.c 1052 836 79.4%
|
|
|
|
regvar.c 1196 968 80.9%
|
|
|
|
salloc.c 1200 932 77.6%
|
|
|
|
state.c 580 528 91.0%
|
|
|
|
subr.c 6136 5268 85.8%
|
|
|
|
=
|
|
|
|
averages 2900 2627 86.4%
|
|
|
|
.TE
|
|
|
|
.DS C
|
|
|
|
Fig. 2
|
|
|
|
.DE
|
|
|
|
.KE
|
|
|
|
Fig. 1 also includes results of an old m68k4 back end (a back end
|
|
|
|
for the MC68000 with four-byte word and pointer size). The table for
|
|
|
|
this back end was given to me as an example, but I thought it didn't make
|
|
|
|
good use of the MC68000's addressing capabilities, it hardly did any
|
|
|
|
optimization, and it sometimes even
|
|
|
|
generated code that the assembler would not swallow.
|
|
|
|
This was sufficient reason for me to write a completely new table.
|
|
|
|
.PP
|
|
|
|
The results in the tables should not be taken too seriously. The sizes measured
|
|
|
|
are the sizes of the text segments of the user programs, i.e. without the
|
|
|
|
inclusion of library routines. Of course these segments do contain calls
|
|
|
|
to these routines. Another thing is that the
|
|
|
|
.I rom
|
|
|
|
segment may be included in the text segment (this is why the
|
|
|
|
results for the MC68000 for
|
|
|
|
.I compute.c
|
|
|
|
look so bad).
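.PP
The percentage columns in the two size tables appear to be the size produced
by the back end relative to the size produced by the original compiler,
truncated (not rounded) to one decimal. The following C sketch reproduces
them under that assumption; the function name is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: size relative to original, in tenths of a percent,
 * truncated. A result of 925 corresponds to 92.5%. */
long tenths_of_percent(long size, long original)
{
    assert(original > 0);
    return size * 1000 / original;
}
```

For codegen.c on the MC68000 this gives 925 for the new back end (12860
against 13892 bytes) and 1167 for the old m68k4 back end (16224 bytes).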
|
|
|
|
.PP
|
|
|
|
Some other things must be said about these results.
|
|
|
|
The quality of EM code
|
|
|
|
generated by the C front end is certainly not optimal. The front end
|
|
|
|
uses temporary locals (extra locals that are used to evaluate expressions)
|
|
|
|
far too readily: for a simple C expression like
|
|
|
|
.DS
|
|
|
|
.I
|
|
|
|
*(pointer) += constant
|
|
|
|
.R
|
|
|
|
.DE
|
|
|
|
where
|
|
|
|
.I pointer
|
|
|
|
is a register variable, the C front end generates (for obscure reasons)
|
|
|
|
a temporary local that holds the contents of \fIpointer\fR. This way
|
|
|
|
the pattern for
|
|
|
|
.DS
|
|
|
|
.B
|
|
|
|
loc lil adi sil $2==$4 && $3==4
|
|
|
|
.R
|
|
|
|
.DE
|
|
|
|
for register variables is not used and longer, less efficient
|
|
|
|
code is generated. But even in spite of this, the back end seems to
|
|
|
|
generate rather compact code.
|
|
|
|
.NH
|
|
|
|
Some timing results
|
|
|
|
.PP
|
|
|
|
In order to measure the performance of the code generated by the back end
|
|
|
|
some timing tests were done. The reason I chose these particular tests is
|
|
|
|
that they were also done for many other back ends; the reader can compare
|
|
|
|
the results if he so wishes (of course comparing the results only
|
|
|
|
shows a global difference in speed of the various machines; it doesn't
|
|
|
|
show whether some back end generates relatively better code than another).
|
|
|
|
.PP
|
|
|
|
On the MC68000 machine the statements were executed one million times.
|
|
|
|
On the MC68020 machine the statements had to be executed four million times
|
|
|
|
because this machine was so fast that timing results would be very
|
|
|
|
unreliable if the statements were executed only one million times.
|
|
|
|
.PP
|
|
|
|
For testing I used the following C test program:
|
|
|
|
.DS
|
|
|
|
.I
|
|
|
|
main()
|
|
|
|
{
|
|
|
|
int i, j, ...
|
|
|
|
...
|
|
|
|
for (i=0; i<1000; i++)
|
|
|
|
for (j=0; j<1000; j++)
|
|
|
|
STATEMENT;
|
|
|
|
}
|
|
|
|
.R
|
|
|
|
.DE
|
|
|
|
where
|
|
|
|
.I STATEMENT
|
|
|
|
is any of the test statements or the empty statement. For the MC68020
|
|
|
|
tests I used 2000 instead of 1000.
|
|
|
|
The results of the test with the empty statement were used to calculate
|
|
|
|
the execution times of the other test statements.
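.PP
Presumably the time for the empty statement was subtracted from the time for
each test statement and the difference divided by the number of iterations.
A minimal C sketch of that calculation, with a hypothetical function name and
with run times given in seconds:

```c
#include <assert.h>

/* Hypothetical helper: time per statement in microseconds.
 * total_s:    run time of the loops with the test statement, in seconds
 * empty_s:    run time of the loops with the empty statement, in seconds
 * iterations: number of times the statement was executed */
double statement_time_us(double total_s, double empty_s, long iterations)
{
    assert(iterations > 0);
    return (total_s - empty_s) * 1e6 / (double)iterations;
}
```

With one million iterations, a run of 3 seconds against 1 second for the
empty loops corresponds to 2 microseconds per statement.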
|
|
|
|
.PP
|
|
|
|
Figures 3 and 4 show the results. For each machine two tests were actually
|
|
|
|
done: one with register variables, and the other without them.
|
|
|
|
I noticed that the original C compilers on both machines did not use
register variables unless specifically requested. The
|
|
|
|
back end uses register variables when and where they are profitable, even
|
|
|
|
if the user did not ask for them.
|
|
|
|
.KF
|
|
|
|
.TS
|
|
|
|
center box;
|
|
|
|
cfI s s s s
|
|
|
|
c s s s s
|
|
|
|
c | c s | c s
|
|
|
|
cw(1.5i) | c c | c c
|
|
|
|
c | c c | c c
|
|
|
|
lp-2fI | n n | n n.
|
|
|
|
timing results for the MC68000
|
|
|
|
times in @ mu @seconds
|
|
|
|
_
|
|
|
|
test statement without register variables with register variables
|
|
|
|
_
|
|
|
|
original new MC68000 original new MC68000
|
|
|
|
C compiler back end C compiler back end
|
|
|
|
_
|
|
|
|
int1=0; 2.8 2.7 0.5 0.5
|
|
|
|
int1=int2-1; 4.1 4.1 1.3 1.3
|
|
|
|
int1=int1+1; 4.1 4.1 1.3 1.3
|
|
|
|
int1=int2*int3; 40.0 40.5 36.2 36.8
|
|
|
|
T{
|
|
|
|
int1=(int2<0);
|
|
|
|
\/*true*/
|
|
|
|
T} 5.5 7.3 2.0 4.5
|
|
|
|
T{
|
|
|
|
int1=(int2<0);
|
|
|
|
\/*false*/
|
|
|
|
T} 4.7 8.5 2.8 5.6
|
|
|
|
T{
|
|
|
|
int1=(int2<3);
|
|
|
|
\/*true*/
|
|
|
|
T} 6.2 7.7 2.6 5.4
|
|
|
|
T{
|
|
|
|
int1=(int2<3);
|
|
|
|
\/*false*/
|
|
|
|
T} 5.4 8.9 3.6 6.5
|
|
|
|
T{
|
|
|
|
.na
|
|
|
|
int1=((int2>3)||(int2<3));
|
|
|
|
\/* true || false */
|
|
|
|
T} 6.0 7.8 3.4 5.4
|
|
|
|
T{
|
|
|
|
.na
|
|
|
|
int1=((int2>3)||(int2<3));
|
|
|
|
\/* false || true */
|
|
|
|
T} 9.1 10.2 5.7 7.1
|
|
|
|
T{
|
|
|
|
.na
|
|
|
|
switch (int1) {
|
|
|
|
case 1: int1=0; break;
|
|
|
|
case 2: int1=1; break;
|
|
|
|
}
|
|
|
|
T} 6.3 17.8 5.3 14.0
|
|
|
|
T{
|
|
|
|
.na
|
|
|
|
if (int1=0) int2=3;
|
|
|
|
\/*true*/
|
|
|
|
T} 5.1 4.7 1.3 1.3
|
|
|
|
T{
|
|
|
|
.na
|
|
|
|
if (int1=0) int2=3;
|
|
|
|
\/*false*/
|
|
|
|
T} 2.2 2.1 1.9 1.1
|
|
|
|
while (int1>0) int1=int1-1; 2.2 2.1 1.1 1.1
|
|
|
|
int1=a[int2]; 6.8 6.7 4.0 3.1
|
|
|
|
p3(int1); 14.3 11.1 13.4 10.0
|
|
|
|
int1=f(int2); 17.7 14.5 14.8 11.7
|
|
|
|
s.overhead=5400; 2.8 2.7 2.9 2.7
|
|
|
|
.TE
|
|
|
|
.DS C
|
|
|
|
Fig. 3
|
|
|
|
.DE
|
|
|
|
.KE
|
|
|
|
.KF
|
|
|
|
.TS
|
|
|
|
center box;
|
|
|
|
cfI s s s s
|
|
|
|
c s s s s
|
|
|
|
c | c s | c s
|
|
|
|
cw(1.5i) | c c | c c
|
|
|
|
c | c c | c c
|
|
|
|
lp-2fI | n n | n n.
|
|
|
|
timing results for the MC68020
|
|
|
|
times in @ mu @seconds
|
|
|
|
_
|
|
|
|
test statement without register variables with register variables
|
|
|
|
_
|
|
|
|
original new MC68020 original new MC68020
|
|
|
|
C compiler back end C compiler back end
|
|
|
|
_
|
|
|
|
int1=0; .25 .25 .15 .15
|
|
|
|
int1=int2-1; 1.3 1.3 .38 .38
|
|
|
|
int1=int1+1; 1.2 .90 .38 .15
|
|
|
|
int1=int2*int3; 4.4 4.2 3.0 3.1
|
|
|
|
T{
|
|
|
|
int1=(int2<0);
|
|
|
|
\/*true*/
|
|
|
|
T} 1.6 2.7 1.1 2.3
|
|
|
|
T{
|
|
|
|
int1=(int2<0);
|
|
|
|
\/*false*/
|
|
|
|
T} 1.9 2.9 .80 2.1
|
|
|
|
T{
|
|
|
|
int1=(int2<3);
|
|
|
|
\/*true*/
|
|
|
|
T} 1.7 2.8 1.2 2.6
|
|
|
|
T{
|
|
|
|
int1=(int2<3);
|
|
|
|
\/*false*/
|
|
|
|
T} 2.1 3.0 .85 2.3
|
|
|
|
T{
|
|
|
|
.na
|
|
|
|
int1=((int2>3)||(int2<3));
|
|
|
|
\/* true || false */
|
|
|
|
T} 2.1 3.1 1.2 2.5
|
|
|
|
T{
|
|
|
|
.na
|
|
|
|
int1=((int2>3)||(int2<3));
|
|
|
|
\/* false || true */
|
|
|
|
T} 3.4 4.2 1.8 3.2
|
|
|
|
T{
|
|
|
|
.na
|
|
|
|
switch (int1) {
|
|
|
|
case 1: int1=0; break;
|
|
|
|
case 2: int1=1; break;
|
|
|
|
}
|
|
|
|
T} 2.7 8.0 2.0 6.9
|
|
|
|
T{
|
|
|
|
.na
|
|
|
|
if (int1=0) int2=3;
|
|
|
|
\/*true*/
|
|
|
|
T} 1.2 1.3 .63 .63
|
|
|
|
T{
|
|
|
|
.na
|
|
|
|
if (int1=0) int2=3;
|
|
|
|
\/*false*/
|
|
|
|
T} 1.7 1.6 .50 .53
|
|
|
|
while (int1>0) int1=int1-1; 1.2 1.3 .55 .53
|
|
|
|
int1=a[int2]; 1.8 1.8 1.0 1.0
|
|
|
|
p3(int1); 14.8 5.5 14.1 5.0
|
|
|
|
int1=f(int2); 16.3 6.6 15.2 5.9
|
|
|
|
s.overhead=5400; .48 .48 .50 .50
|
|
|
|
.TE
|
|
|
|
.DS C
|
|
|
|
Fig. 4
|
|
|
|
.DE
|
|
|
|
.KE
|
|
|
|
.PP
|
|
|
|
The reader may have noticed that on both machines the back end seems
|
|
|
|
to generate considerably slower code for tests where a `condition' is
|
|
|
|
used in the rhs of an assignment statement. This is in fact not true: it is
|
|
|
|
the front end that generates bad code. Two examples: for the C statement
|
|
|
|
.DS
|
|
|
|
.I
|
|
|
|
int1 = (int2 < 0);
|
|
|
|
.R
|
|
|
|
.DE
|
|
|
|
the front end generates the following code for the rhs (I
|
|
|
|
used arbitrary labels):
|
|
|
|
.DS
|
|
|
|
.B
|
|
|
|
lol -16
|
|
|
|
zlt *10
|
|
|
|
loc 0
|
|
|
|
bra *11
|
|
|
|
10
|
|
|
|
loc 1
|
|
|
|
11
|
|
|
|
.R
|
|
|
|
.DE
|
|
|
|
while in this case (in my opinion) it should have generated
|
|
|
|
.DS
|
|
|
|
.B
|
|
|
|
lol -16
|
|
|
|
tlt
|
|
|
|
.R
|
|
|
|
.DE
|
|
|
|
which is much shorter. Another example: for the C statement
|
|
|
|
.DS
|
|
|
|
.I
|
|
|
|
int1 = (int2 < 3);
|
|
|
|
.R
|
|
|
|
.DE
|
|
|
|
the front end generates for the rhs
|
|
|
|
.DS
|
|
|
|
.B
|
|
|
|
lol -16
|
|
|
|
loc 3
|
|
|
|
blt *10
|
|
|
|
loc 0
|
|
|
|
bra *11
|
|
|
|
10
|
|
|
|
loc 1
|
|
|
|
11
|
|
|
|
.R
|
|
|
|
.DE
|
|
|
|
while a much better translation would be
|
|
|
|
.DS
|
|
|
|
.B
|
|
|
|
lol -16
|
|
|
|
loc 3
|
|
|
|
cmi 4
|
|
|
|
tlt
|
|
|
|
.R
|
|
|
|
.DE
|
|
|
|
.PP
|
|
|
|
Another statement that the back end seems to generate slower code for is
|
|
|
|
the C switch statement. This is true, but it is also caused by
|
|
|
|
the way these things are done in EM. EM uses the
|
|
|
|
.B csa
|
|
|
|
or
|
|
|
|
.B csb
|
|
|
|
instruction, and for these two I had to use library routines. On larger
|
|
|
|
switch statements the
|
|
|
|
.I .csa
|
|
|
|
routine will perform relatively better.
|
|
|
|
.PP
|
|
|
|
The back end generates considerably faster code for procedure and function
|
|
|
|
calls, especially in the MC68020 case, and also for the C statement
|
|
|
|
.DS
|
|
|
|
.I
|
|
|
|
int1 = int1 + 1;
|
|
|
|
.R
|
|
|
|
.DE
|
|
|
|
The original C compilers use the same method for this statement
|
|
|
|
as for
|
|
|
|
.DS
|
|
|
|
.I
|
|
|
|
int1 = int2 - 1;
|
|
|
|
.R
|
|
|
|
.DE
|
|
|
|
they perform the addition in a scratch register, and then store the
|
|
|
|
result. For the former C statement this is not necessary, because
|
|
|
|
the MC68000 and MC68020 have an instruction that can add constants
|
|
|
|
to almost anything (in this case: to locals). The MC68000 and MC68020
|
|
|
|
back ends do use this instruction.
|
|
|
|
.NH
|
|
|
|
Some final remarks
|
|
|
|
.PP
|
|
|
|
As mentioned a few times before, the C front end compiler does not
|
|
|
|
generate optimal code and as a consequence of this the
|
|
|
|
back end does not always generate optimal code. This is especially
|
|
|
|
the case with temporary locals, which the front end generates much
|
|
|
|
too readily, and also with conditional expressions that are
|
|
|
|
used in the rhs of an assignment statement (fortunately this is not
|
|
|
|
needed so much).
|
|
|
|
.PP
|
|
|
|
If
|
|
|
|
.I cgg
|
|
|
|
had been able to accept operands separated by any character
|
|
|
|
instead of just by commas (in the instruction definitions part),
|
|
|
|
I would not have needed the
|
|
|
|
.I killreg
|
|
|
|
pseudo instruction. It would also be handy to have
|
|
|
|
.I cgg
|
|
|
|
accept all normal C operators. At the moment
|
|
|
|
.I cgg
|
|
|
|
does not accept the bitwise and, or and exclusive-or operators, even though in [4]
|
|
|
|
it is stated that
|
|
|
|
.I cgg
|
|
|
|
does accept all normal C operators. As it happens I did not need the
|
|
|
|
bitwise operators, but at some time in developing the table I thought
|
|
|
|
I did.
|
|
|
|
.PP
|
|
|
|
I would also like
|
|
|
|
.I cg
|
|
|
|
to do more with the condition codes information that is supplied with
|
|
|
|
each instruction in the instruction definitions section of the table.
|
|
|
|
Sometimes
|
|
|
|
.I cg
|
|
|
|
generates test instructions which actually were not necessary. This
|
|
|
|
of course causes the generated
|
|
|
|
programs to be slightly larger and slightly slower.
|
|
|
|
.PP
|
|
|
|
In spite of the few minor shortcomings mentioned above I found
|
|
|
|
.I cgg
|
|
|
|
a very comfortable tool to use.
|
|
|
|
.SH
|
|
|
|
References
|
|
|
|
.PP
|
|
|
|
.IP [1]
|
|
|
|
T. B. Steel Jr.,
|
|
|
|
.I
|
|
|
|
UNCOL: The Myth and the Fact,
|
|
|
|
.R
|
|
|
|
in Ann. Rev. Auto. Prog.,
|
|
|
|
R. Goodman (ed.), Vol. 2 (1969), pp. 325-344
|
|
|
|
.IP [2]
|
|
|
|
A. S. Tanenbaum, H. van Staveren, E. G. Keizer, J. W. Stevenson,
|
|
|
|
.I
|
|
|
|
A practical toolkit for making portable compilers,
|
|
|
|
.R
|
|
|
|
Informatica Report 74, Vrije Universiteit, Amsterdam, 1983
|
|
|
|
.IP [3]
|
|
|
|
A. S. Tanenbaum, H. van Staveren, E. G. Keizer, J. W. Stevenson,
|
|
|
|
.I
|
|
|
|
Description of an experimental machine architecture for use with
|
|
|
|
block structured languages,
|
|
|
|
.R
|
|
|
|
Informatica Report 81, Vrije Universiteit, Amsterdam, 1983
|
|
|
|
.IP [4]
|
|
|
|
H. van Staveren,
|
|
|
|
.I
|
|
|
|
The table driven code generator from the Amsterdam Compiler Kit,
|
|
|
|
Second Revised Edition,
|
|
|
|
.R
|
|
|
|
Vrije Universiteit, Amsterdam
|
|
|
|
.IP [5]
|
|
|
|
.I
|
|
|
|
MC68020 32-bit Microprocessor User's Manual,
|
|
|
|
.R
|
|
|
|
Second Edition,
|
|
|
|
Motorola Inc., 1985, 1984
|
|
|
|
.IP [6]
|
|
|
|
.I
|
|
|
|
MC68000 16-bit Microprocessor User's Manual,
|
|
|
|
Preliminary,
|
|
|
|
.R
|
|
|
|
Motorola Inc., 1979
|