.lg 0
|
||
.ta 8 16 24 32 40 48 56 64 72 80
|
||
.hw iden-ti-fi-er
|
||
.nr a 0 1
|
||
.nr f 1 1
|
||
.de x1
|
||
'sp 2
|
||
'tl '''%'
|
||
'sp 2
|
||
.ns
|
||
..
|
||
.wh 0 x1
|
||
.de fo
|
||
'bp
|
||
..
|
||
.wh 60 fo
|
||
.ll 79
|
||
.lt 79
|
||
.de HT
|
||
.ti -4
|
||
..
|
||
.de PP
|
||
.sp
|
||
.ne 2
|
||
.ti +5
|
||
..
|
||
.de SE
|
||
.bp
|
||
\fB\\n+a. \\$1\fR
|
||
.nr b 0 1
|
||
..
|
||
.de SB
|
||
.br
|
||
.ne 10
|
||
.sp 5
|
||
\fB\\na.\\n+b. \\$1\fR
|
||
..
|
||
.de DC
|
||
.ti -14
|
||
DECISION~\\$1:
|
||
..
|
||
.de IN
|
||
.in +6
|
||
..
|
||
.de OU
|
||
.in -6
|
||
..
|
||
.tr ~
|
||
.sp 5
|
||
.rs
|
||
.sp 10
|
||
.ce 3
|
||
Changes in EM-1
|
||
|
||
Addendum to Informatica Rapport IR-54
|
||
.sp 5
|
||
.PP
|
||
This document describes a revision of EM-1.
|
||
A list of differences is presented roughly in the order IR-54
|
||
describes the original architecture.
|
||
A complete list of EM-1 pseudo's and instructions is also included.
|
||
.SE Introduction
|
||
.PP
|
||
EM is a family of intermediate languages, resembling assembly
|
||
language for a stack machine.
|
||
EM defines the layout of data memory and a partitioning
|
||
of instruction memory.
|
||
EM can do operations on five basic types:
|
||
pointers, signed integers, unsigned integers, floating point numbers
|
||
and sets of bits.
|
||
The size of pointers is fixed in each member,
|
||
in contrast to the sizes of the other types.
|
||
Each member has one more fixed size: the word size.
|
||
This is the minimum size of any object on the stack.
The sizes of all objects on the stack are assumed to be
multiples of the word size.
|
||
We assume that pointer and word-sizes are both powers of two.
|
||
.PP
|
||
It is possible to load objects smaller than the word size from memory.
|
||
These objects are converted to objects of the word size by
|
||
clearing the most significant bytes.
|
||
(A separate conversion instruction can do sign extension).
|
||
When objects smaller than the word size are stored in memory,
|
||
the most significant bytes are ignored.
|
||
The size of such objects has to be a divisor of the word size.
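.PP
The load and store rules above can be sketched in Python.
This is an illustration only and not part of EM:
the names load_small and store_small are hypothetical,
memory is modelled as a bytearray,
and a little-endian byte order is assumed although EM leaves the byte order open.
.nf
```python
def load_small(memory, addr, size, word_size):
    """Load a size-byte object and widen it to a word by zero extension."""
    if word_size % size != 0:
        raise ValueError("object size must be a divisor of the word size")
    # Assumption: little-endian byte order; EM itself does not define one.
    return int.from_bytes(memory[addr:addr + size], "little")


def store_small(memory, addr, size, word_size, word):
    """Store the low-order size bytes of a word; upper bytes are ignored."""
    if word_size % size != 0:
        raise ValueError("object size must be a divisor of the word size")
    low = word & ((1 << (8 * size)) - 1)
    memory[addr:addr + size] = low.to_bytes(size, "little")
```
.fi
.PP
Sign extension is deliberately absent from this sketch,
since the text assigns it to a separate conversion instruction.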
|
||
.PP
|
||
Put in other terms, instructions such as LOC, LOL, LOE, STF, etc.
|
||
manipulate WORDS. Up until now, a word was defined as 16 bits.
|
||
It is now possible to define a word size other than 16 bits. For
|
||
example, MES 2,1,2 defines a word to be 8 bits and a pointer to be
|
||
16 bits. As another example, MES 2,4,4 defines a word to be 32 bits
|
||
and a pointer to be 32 bits.
|
||
.PP
|
||
If a compiler receives flags telling it to use 32 bit integers, it now
|
||
has a choice of setting the word length to 16 bits and using LDL etc
|
||
for dealing with integers, or setting the word length to 32 bits and using
|
||
LOL etc for integers.
|
||
For example, x:=a+b for 32-bit integers would become:

        MES 2,2,4                       MES 2,4,4
        LDL a                           LOL a
        LDL b                           LOL b
        ADI 4                           ADI 4
        SDL x                           STL x

In many cases, the target machine code that is finally produced from either
of the above sequences will not show any traces of the stack machine. For some
instructions, however, actual pushes and pops at run time will be necessary.
|
||
Choosing a wider EM word will usually produce fewer stack operations than
|
||
a narrower word, but it eliminates the possibility of doing arithmetic on
|
||
quantities smaller than a word. If, for example, a compiler chooses a 32-bit
|
||
EM word, it will be difficult to add two 16 bit integers with ADI, since
|
||
the argument must be a multiple of the word size.
|
||
(The operation can be done by converting the operands to 32 bits using CII,
|
||
adding the 32-bit numbers, and reconverting the result.)
|
||
On the other hand, choosing a 16-bit EM word makes it possible to do both
|
||
16-bit adds (ADI 2) and 32-bit adds (ADI 4),
|
||
but the price paid is that 32-bit operations will be viewed as double
|
||
precision, and may be slightly less efficient on target machines with a
|
||
32-bit word, i.e. the EM to target translator may not take full advantage
|
||
of the 32 bit facilities.
|
||
.PP
|
||
Note that since LOC pushes a WORD on the stack, the argument of LOC
|
||
must fit in a word. LOC 256 on an EM machine with a 1-byte word length
|
||
is not allowed. LDC 256 is allowed, however.
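.PP
The choice between LOC and LDC can be sketched as follows.
The helper names fits and choose_load are hypothetical,
and the sketch assumes that a constant is acceptable when it is representable
in the given number of bytes as either a signed or an unsigned value;
the text above does not say which interpretation applies.
.nf
```python
def fits(value, size):
    """True if value is representable in size bytes, signed or unsigned."""
    bits = 8 * size
    return -(1 << (bits - 1)) <= value < (1 << bits)


def choose_load(value, word_size):
    """Pick LOC for one-word constants and LDC for two-word constants."""
    if fits(value, word_size):
        return "LOC"
    if fits(value, 2 * word_size):
        return "LDC"
    raise ValueError("constant does not fit in a double word")
```
.fi
.PP
With a 1-byte word, choose_load(256, 1) selects LDC, matching the example above.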
|
||
.PP
|
||
A general rule of thumb is that the compiler should choose an EM word
|
||
length equal to the width of a single precision integer.
|
||
Obviously, compilers should be well parameterized to allow the integer
|
||
size(s) and word size(s) to be changed by just changing a few constants.
|
||
.PP
|
||
The size of an instruction space pointer is the same
|
||
as the size of a data space pointer.
|
||
.PP
|
||
EM assumes two's complement arithmetic on signed integers,
|
||
but does not define an ordering of the bytes in an integer.
|
||
The lowest numbered byte of a two-byte object can contain
|
||
either the most or the least significant part.
|
||
.SE Memory
|
||
.PP
|
||
EM has two separate addressing spaces, instruction and data.
|
||
The sizes of these spaces are not specified.
|
||
The layout of instruction space is not defined.
|
||
Any interpreter or translator may assume a layout fitting his/her needs.
|
||
The layout of data memory is specified by EM.
|
||
EM data memory consists of a sequence of 8-bit bytes each separately
|
||
addressable.
|
||
Certain alignment restrictions exist for objects consisting of multiple bytes.
Objects smaller than the word size can only be addressed
|
||
at multiples of the object size.
|
||
For example: in a member with a four-byte word size, two-byte integers
|
||
can only be accessed from even addresses.
|
||
Objects larger than the word size can only be placed at multiples
|
||
of the word size.
|
||
For example: in a member with a four-byte word size,
|
||
eight-byte floating point numbers can be fetched at addresses
|
||
0, 4, 8, 12, etc.
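.PP
The two alignment rules can be combined into one check.
The function name is_aligned is hypothetical and the sketch is only meant
to restate the rules above in executable form.
.nf
```python
def is_aligned(addr, size, word_size):
    """Check an address against the EM alignment rules for data memory."""
    # Small objects align on their own size, larger ones on the word size.
    unit = size if size < word_size else word_size
    return addr % unit == 0
```
.fi
.PP
For a four-byte word, is_aligned(6, 2, 4) holds, while is_aligned(6, 8, 4) does not.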
|
||
.SB "Procedure identifiers"
|
||
.PP
|
||
Procedure identifiers in EM have the same size
|
||
as pointers.
|
||
Any implementation of EM is free to use any method of identifying procedures.
|
||
Common methods are indices into tables containing further information
|
||
and addresses of the first instructions of procedures.
|
||
.SB "Heap and Stack in global data"
|
||
.PP
|
||
The stack grows downward, the heap grows upward.
|
||
The stack pointer points to the lowest occupied word on the stack.
|
||
The heap pointer marks the first free word in the heap area.
|
||
.br
|
||
.ne 39
|
||
.sp 1
|
||
.nf
    65534 -> |-------------------------------|
             |///////////////////////////////|
             |//// unimplemented memory /////|
             |///////////////////////////////|
       SB -> |-------------------------------|
             |                               |
             |     stack and local area      | <- LB
             |                               |
             |                               |
             |-------------------------------| <- SP
             |///////////////////////////////|
             |// implementation dependent  //|
             |///////////////////////////////|
             |-------------------------------| <- HP
             |                               |
             |           heap area           |
             |                               |
             |                               |
             |-------------------------------|
             |                               |
             |          global area          |
             |                               |
       EB -> |-------------------------------|
             |                               |
             |                               |
             |         program text          | <- PC
             |                               |
             |                               |
       PB -> |-------------------------------|
             |///////////////////////////////|
             |////////// undefined //////////|
             |///////////////////////////////|
        0 -> |-------------------------------|

Fig. \nf. Example of memory layout showing typical register
positions during execution of an EM program.
.fi
|
||
.SB "Data addresses as arguments"
|
||
.PP
|
||
Anywhere previous versions of the EM assembly language
|
||
allowed identifiers of objects in
|
||
data space,
|
||
it is also possible to use 'identifier+constant' or 'identifier-constant'.
|
||
For example, both "CON LABEL+4" and "LAE SAVED+3" are allowed.
|
||
More complicated expressions are illegal.
|
||
.SB "Local data area"
|
||
.PP
|
||
The mark block has been banished.
|
||
When calling a procedure,
|
||
the calling routine first has to push the actual parameters.
|
||
All language implementations currently push their arguments
|
||
in reverse order, to be compatible with C.
|
||
Then the procedure is called using a CAL or CAI instruction.
|
||
Either the call or the procedure prolog somehow has to save
|
||
the return address and dynamic link.
|
||
The prolog allocates the space needed for locals and is free to
|
||
surround this space with saved registers and other information it
|
||
deems necessary.
|
||
.PP
|
||
The locals are now accessed using negative offsets in LOL, LDL, SDL, LAL,
|
||
LIL, SIL and STL instructions.
|
||
The parameters are accessed using positive offsets in LOL, LDL, SDL, LAL,
LIL, SIL and STL instructions.
|
||
The prolog might have stored information in the area between parameters and
|
||
locals.
|
||
As a consequence there are two bases, AB(virtual) and LB.
|
||
AB stands for Argument Base and LB stands for Local Base.
|
||
Positive arguments to LOL etc ... are interpreted as offsets from AB,
|
||
negative arguments as offsets from LB.
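.PP
The interpretation of the offset can be sketched as below.
The function name resolve_local is hypothetical,
LB and AB are passed in as plain integers,
and an offset of zero is treated as a parameter offset,
in line with the "parameter (l>=0)" wording of the instruction list.
.nf
```python
def resolve_local(offset, lb, ab):
    """Turn the argument of LOL, STL etc. into a data address."""
    # Non-negative offsets address parameters, negative ones locals.
    base = ab if offset >= 0 else lb
    return base + offset
```
.fi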
|
||
.PP
|
||
The BEG instruction is not needed to allocate the locals because
|
||
storage for locals is set aside in the prolog.
|
||
The instruction still exists under the name ASP (Adjust Stack Pointer).
|
||
.PP
|
||
Procedures return using the RET instruction.
|
||
The RET pops the function result from the stack and
|
||
brings the stack pointer and other relevant registers to the state
|
||
they had just before the procedure was called.
|
||
The RET instruction expects that - aside from possible function results -
|
||
the stack pointer has the value it had after execution of the prolog.
|
||
RET finally returns control to the calling routine.
|
||
The actual parameters have to be removed from the stack by the calling routine,
|
||
and not by the called procedure.
|
||
.sp 1
|
||
.ne 38
|
||
.nf

             |===============================|
             |       actual argument n       |
             |-------------------------------|
             |               .               |
             |               .               |
             |               .               |
             |-------------------------------|
             |       actual argument 1       | ( <- AB )
             |===============================|
             |///////////////////////////////|
             |// implementation dependent  //|
             |///////////////////////////////| <- LB
             |===============================|
             |                               |
             |        local variables        |
             |                               |
             |-------------------------------|
             |                               |
             |     compiler temporaries      |
             |                               |
             |===============================|
             |///////////////////////////////|
             |// implementation dependent  //|
             |///////////////////////////////|
             |===============================|
             |                               |
             |   dynamic local generators    |
             |                               |
             |===============================|
             |            operand            |
             |-------------------------------|
             |            operand            | <- SP
             |===============================|

A sample procedure frame.

.fi
|
||
.sp 1
|
||
This scheme allows procedures to be called with a variable number
|
||
of parameters.
|
||
The parameters have to be pushed in reverse order,
|
||
because the called procedure has to be able to locate the first one.
|
||
.PP
|
||
.PP
|
||
Since the mark block has disappeared, a new mechanism for static
|
||
links had to be created.
|
||
All compilers use the convention that EM procedures needing
|
||
a static link will find a link in their zero'th parameter,
|
||
i.e. the last one pushed on the stack.
|
||
This parameter should be invisible to users of the compiler.
|
||
The link needs to be in a fixed place because the lexical instructions
|
||
have to locate it.
|
||
The LEX instruction is replaced by two instructions: LXL and LXA.
|
||
\&"LXL~n" finds the LB of a procedure n static levels removed.
|
||
\&"LXA~n" finds the (virtual) AB.
|
||
The value used for static link is LB.
|
||
.PP
|
||
When a procedure needing a static link is called, first the actual
|
||
parameters are pushed, then the static link is pushed using LXL
|
||
and finally the procedure is called with a CAL with the procedure's
|
||
name as argument.
|
||
.br
|
||
.ne 40
|
||
.nf

             |===============================|
             |       actual argument n       |
             |-------------------------------|
             |               .               |
             |               .               |
             |               .               |
             |-------------------------------|
             |       actual argument 1       |
             |-------------------------------|
             |          static link          | ( <- AB )
             |===============================|
             |///////////////////////////////|
             |// implementation dependent  //|
             |///////////////////////////////| <- LB
             |===============================|
             |                               |
             |        local variables        |
             |                               |
             |-------------------------------|
             |                               |
             |     compiler temporaries      |
             |                               |
             |===============================|
             |///////////////////////////////|
             |// implementation dependent  //|
             |///////////////////////////////|
             |===============================|
             |                               |
             |   dynamic local generators    |
             |                               |
             |===============================|
             |            operand            |
             |-------------------------------|
             |            operand            | <- SP
             |===============================|

A procedure frame with static link.

.fi
|
||
.sp 1
|
||
.sp 1
|
||
.PP
|
||
Pascal and other languages have to use procedure
|
||
instance identifiers containing
|
||
the procedure identifier
|
||
'ul
|
||
and
|
||
the static link the procedure has to be called with.
|
||
A static link having a value of zero signals
|
||
that the called procedure does not need a static link.
|
||
C uses the same convention for pointers to C-routines.
|
||
In pointers to C-routines the static link is set to zero.
|
||
.PP
|
||
Note: The distance from LB to AB must be known for each procedure, otherwise
|
||
LXA can not be implemented.
|
||
Most implementations will have a fixed size area between
|
||
the parameter and local storage.
|
||
The zone between the compiler temporaries and the dynamic
|
||
local generators can be used
|
||
to save a variable number of registers.
|
||
.PP
|
||
.ne 11
|
||
Prolog examples:
|
||
.sp 2
|
||
.nf

        proc1                           proc2

        mov lb,-(sp)                    mov lb,-(sp)
        mov sp,lb                       mov sp,lb
        sub $loc_size,sp                sub $loc_size,sp
        mov r2,-(sp) ; save r2          mov r2,-(sp)
        mov r4,-(sp) ; save r4

.fi
|
||
.SB "Return values"
|
||
.PP
|
||
The return value popped by RET is stored in an unnamed 'function return area'.
|
||
This area can be different for different sized objects returned,
|
||
e.g. one register for two byte objects,
|
||
two registers for four byte objects,
|
||
memory for larger objects.
|
||
The area is available for 'READ-ONCE' access using the LFR instruction.
|
||
The result of a LFR is only defined if the sizes used to store and
|
||
fetch are identical.
|
||
The only instructions guaranteed not to destroy the contents of
|
||
any 'function return area' are ASP and BRA.
|
||
Thus parameters can be popped before fetching the function result.
|
||
The maximum size of all function return areas is
|
||
implementation dependent,
|
||
but allows procedure instance identifiers and all
|
||
implemented objects of type integer, unsigned, float
|
||
and pointer to be returned.
|
||
|
||
.SE "EM Assembly Language"
|
||
.nr b 0 1
|
||
.SB "Object types and instructions"
|
||
.PP
|
||
EM knows five basic object types:
|
||
pointers,
|
||
signed integers,
|
||
unsigned integers,
|
||
floating point numbers and
|
||
sets of bits.
|
||
Operations on objects of the last four types do not assume
|
||
a specific size.
|
||
Pointers (including procedure identifiers) have a fixed size in each
|
||
implementation.
|
||
Instructions acting on one or more objects of the last four types need
|
||
explicit size information.
|
||
This information can be given either as the argument of the
|
||
instruction or on top of the stack.
|
||
.sp 1
|
||
For example:
|
||
.nf
|
||
addition of integers LOL a, LOL b, ADI 2
|
||
subtraction of two floats LDL a, LDL b, SBF 4
|
||
integer to float LOL a, LOC 2, LOC 4, CIF, SDL b
|
||
.fi
|
||
.sp
|
||
Note that conversion instructions always expect size
|
||
before and size after conversion on the stack.
|
||
.sp
|
||
No obligation exists to implement all operations on all possible sizes.
|
||
.PP
|
||
The EM assembly language
|
||
allows constants as instruction arguments up to a size of four bytes.
|
||
In all EM's it is possible to initialize any type and size object.
|
||
BSS, HOL, CON and ROM allow type and size indication in initializers.
|
||
.SB "Conversion instructions"
|
||
.PP
|
||
The conversion operators can convert from any type and size to any
|
||
type and size.
|
||
The types are specified by the instruction,
|
||
the sizes should be in words on top of the stack.
|
||
Normally the sizes are multiples of the word size.
There is one exception: the CII instruction sign-extends if the
|
||
size of the source is a divisor of the word size.
|
||
.SB "CSA and CSB"
|
||
.PP
|
||
The tables used by these instructions do not contain the procedure
|
||
identifier any more.
|
||
See also "Descriptors".
|
||
.SB EXG
|
||
.PP
|
||
The EXG instruction is deleted from the EM instruction set.
|
||
If future applications show any need for this instruction,
|
||
it will be added again.
|
||
.SB "FIL"
|
||
.PP
|
||
A FIL instruction has been introduced.
|
||
When using separate compilation,
|
||
the LIN feature of EM was insufficient.
|
||
FIL expects as argument an address in global data.
|
||
This address is stored in a fixed place in memory,
|
||
where it can be used by any implementation for diagnostics etc.
|
||
Like LIN, it provides access to the ABS fragment at the start
|
||
of external data.
|
||
.SB "LAI and SAI"
|
||
.PP
|
||
LAI and SAI have been dropped because they thwarted register optimization.
|
||
.SB LNC
|
||
.PP
|
||
The LNC instruction is deleted from the instruction set.
|
||
LOC -n will do what it is supposed to.
|
||
.SB "Branch instructions"
|
||
.PP
|
||
The branch instructions are allowed to branch both forward and backward.
|
||
Consequently BRF and BRB are deleted and a BRA instruction is added.
|
||
BRA branches unconditionally in any direction.
|
||
.SB LDC
|
||
.PP
|
||
Loads a double word constant on the stack.
|
||
.SB LEX
|
||
.PP
|
||
LXA and LXL replace LEX.
|
||
.SB LFR
|
||
.PP
|
||
LFR loads the function result stored by RET.
|
||
.SB "LIL and SIL"
|
||
.PP
|
||
They replace LOP and STP. (Name change only)
|
||
.SB "Traps and Interrupts"
|
||
.PP
|
||
The numbers used for distinguishing the various types
|
||
of traps and interrupts have been reassigned.
|
||
The new instructions LIM and SIM
|
||
allow setting and clearing of bits in a mask.
|
||
The bits in the mask control the action taken upon encountering certain
|
||
errors at runtime.
|
||
A 1 bit causes the corresponding error to be ignored,
|
||
a 0 bit causes the run-time system to trap.
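.PP
The effect of the mask can be sketched as follows.
The function names must_trap and ignore_error are hypothetical,
and the sketch assumes that bit n of the mask corresponds to error number n,
which the text above does not state.
.nf
```python
def must_trap(mask, error_number):
    """Return True if the run-time system should trap on this error."""
    # Assumption: bit n of the mask belongs to error number n.
    return ((mask >> error_number) & 1) == 0


def ignore_error(mask, error_number):
    """Return a new mask in which the given error is ignored."""
    return mask | (1 << error_number)
```
.fi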
|
||
.SB LPI
|
||
.PP
|
||
Loads a procedure identifier on the stack.
|
||
LOC cannot be used to do this anymore.
|
||
.SB "ZER and ZRF"
|
||
.PP
|
||
ZER loads S zero bytes on the stack.
|
||
ZRF loads a floating point zero of size S.
|
||
.SB "Descriptors"
|
||
.PP
|
||
All instructions using descriptors have the size of the integer used
|
||
in the descriptor as argument.
|
||
The descriptors are: case descriptors (CSA and CSB),
|
||
range check descriptors (RCK) and
|
||
array descriptors (LAR, SAR, AAR).
|
||
.SB "Case descriptors"
|
||
.PP
|
||
The value used in a case descriptor to indicate the absence of a label
|
||
is zero instead of -1.
|
||
.SE "EM assembly language"
|
||
.SB "Instruction arguments"
|
||
.PP
|
||
The previous EM had different instructions for distinguishing
|
||
between operand on the stack and explicit argument in the instruction.
|
||
For example, LOI and LOS.
|
||
This distinction has been removed.
|
||
Several instructions have two possible forms:
|
||
with explicit argument and with implicit argument on top of the stack.
|
||
The size of the implicit argument is the word size.
|
||
The implicit argument is always popped before all other operands.
|
||
Appendix 1 shows what is allowed for each instruction.
|
||
.SB Notation
|
||
.PP
|
||
First the notation used for the arguments of
|
||
instructions and pseudo instructions.
|
||
.in +12
|
||
.ti -11
|
||
<num>~~=~~an integer number in the range -32768..32767
|
||
.ti -11
|
||
<off>~~=~~an offset -2**31..2**31~-~1
|
||
.ti -11
|
||
<sym>~~=~~an identifier
|
||
.ti -11
|
||
<arg>~~=~~<off> or <sym> or <sym>+<off> or <sym>-<off>
|
||
.ti -11
|
||
<con>~~=~~integer constant,
|
||
unsigned constant,
|
||
floating point constant
|
||
.ti -11
|
||
<str>~~=~~string constant (surrounded by double quotes),
|
||
.ti -11
|
||
<lab>~~=~~instruction label ('*' followed by an integer in the range
|
||
0..32767).
|
||
.ti -11
|
||
<pro>~~=~~procedure number ('$' followed by a procedure name)
|
||
.ti -11
|
||
<val>~~=~~<arg>,
|
||
<con>,
|
||
<pro> or
|
||
<lab>.
|
||
.ti -11
|
||
<...>*~=~~zero or more of <...>
|
||
.ti -11
|
||
<...>+~=~~one or more of <...>
|
||
.ti -11
|
||
[...]~~=~~optional ...
|
||
.in -12
|
||
.SB Labels
|
||
.PP
|
||
No label, instruction or data, can have a (pseudo) instruction
|
||
on the same line.
|
||
.SB Constants
|
||
.PP
|
||
All constants in EM are interpreted in the decimal base.
|
||
.PP
|
||
In BSS, HOL, CON and ROM pseudo-instructions
|
||
numbers must be followed by I, U or F
|
||
indicating Integer, Unsigned or Float.
|
||
If no character is present I is assumed.
|
||
This character can be followed by an even positive number or a 1.
|
||
The number indicates the size in bytes of the object to be initialized,
|
||
up to 32766.
|
||
Double precision integers can no longer be indicated by a trailing L.
|
||
As said before CON and ROM also allow expressions of the form:
|
||
\&"LABEL+offset" and "LABEL-offset".
|
||
The offset must be an unsigned decimal number.
|
||
The 'IUF' indicators cannot be used with the offsets.
|
||
.PP
|
||
Areas reserved in the global data area by HOL or BSS can be
|
||
initialized.
|
||
BSS and HOL have a third parameter indicating whether the initialization
|
||
is mandatory or optional.
|
||
.PP
|
||
Since EM needs alignment of objects, this alignment is enforced by the
|
||
pseudo instructions.
|
||
All objects are aligned on a multiple of their size or the word size
|
||
whichever is smaller.
|
||
Switching to another type of fragment or placing a label forces word-alignment.
|
||
There are three types of fragments in global data space: CON, ROM and BSS-HOL.
|
||
.sp
|
||
.SB "Pseudo instructions"
|
||
.PP
|
||
The LET, IMC and FWC pseudo's have disappeared.
|
||
The only application of these pseudo's was in postponing the
|
||
specification of the size of the local storage to just before
|
||
the END of the procedure.
|
||
A new mechanism has been introduced to handle this problem.
|
||
.ti +5
|
||
The pseudos involved in separate compilation and linking have
|
||
been reorganized.
|
||
.ti +5
|
||
PRO and END are altered and reflect the new calling sequence.
|
||
EOF has disappeared.
|
||
.ti +5
|
||
BSS and HOL allow initialization of the requested data areas.
|
||
.sp 2
|
||
Four pseudo instructions request global data:
|
||
.sp 2
|
||
BSS <off>,<val>,<num>
|
||
.IN
|
||
Reserve <off> bytes.
|
||
<val> is the value used to initialize the area.
|
||
<off> must be a multiple of the size of <val>.
|
||
<num> is 0 if the initialization is not strictly necessary,
|
||
1 otherwise.
|
||
.OU
|
||
.sp
|
||
HOL <off>,<val>,<num>
|
||
.IN
|
||
Idem, but all following absolute global data references will
|
||
refer to this block.
|
||
Only one HOL is allowed per procedure;
|
||
it has to be placed before the first instruction.
|
||
.OU
|
||
.sp
|
||
CON <val>+
|
||
.IN
|
||
Assemble global data words initialized with the <val> constants.
|
||
.OU
|
||
.sp
|
||
ROM <val>+
|
||
.IN
|
||
Idem, but the initialized data will never be changed by the program.
|
||
.OU
|
||
.sp 2
|
||
Two pseudo instructions partition the input into procedures:
|
||
.sp 2
|
||
PRO <sym>[,<off>]
|
||
.IN
|
||
Start of procedure.
|
||
<sym> is the procedure name.
|
||
<off> is the number of bytes for locals.
|
||
The number of bytes for locals must be specified in the PRO or
|
||
END pseudo-instruction.
|
||
When specified in both, they must be identical.
|
||
.OU
|
||
.sp
|
||
END [<off>]
|
||
.IN
|
||
End of Procedure.
|
||
<off> is the number of bytes for locals.
|
||
The number of bytes for locals must be specified in either the PRO or
|
||
END pseudo-instruction or both.
|
||
.OU
|
||
.PP
|
||
Names of data and procedures in an EM module can either be
|
||
internal or external.
|
||
External names are known outside the module and are used to link
|
||
several pieces of a program.
|
||
Internal names are not known outside the modules they are used in.
|
||
Other modules will not 'see' an internal name.
|
||
.ti +5
|
||
In order to reduce the number of passes needed,
|
||
it must be known at the first occurrence whether
|
||
a name is internal or external.
|
||
If the first occurrence of a name is in a definition,
|
||
the name is considered to be internal.
|
||
If the first occurrence of a name is a reference,
|
||
the name is considered to be external.
|
||
If the first occurrence is in one of the following pseudo instructions,
|
||
the effect of the pseudo has precedence.
|
||
.sp 2
|
||
EXA <sym>
|
||
.IN
|
||
External name.
|
||
<sym> is external to this module.
|
||
Note that <sym> may be defined in the same module.
|
||
.OU
|
||
.sp
|
||
EXP <pro>
|
||
.IN
|
||
External procedure identifier.
|
||
Note that <pro> may be defined in the same module.
|
||
.OU
|
||
.sp
|
||
INA <sym>
|
||
.IN
|
||
Internal name.
|
||
<sym> is internal to this module and must be defined in this module.
|
||
.OU
|
||
.sp
|
||
INP <pro>
|
||
.IN
|
||
Internal procedure.
|
||
<pro> is internal to this module and must be defined in this module.
|
||
.OU
|
||
.sp 2
|
||
Two other pseudo instructions provide miscellaneous features:
|
||
.sp 2
|
||
EXC <num1>,<num2>
|
||
.IN
|
||
Two blocks of instructions preceding this one are
|
||
interchanged before being processed.
|
||
<num1> gives the number of lines of the first block.
|
||
<num2> gives the number of lines of the second one.
|
||
Blank and pure comment lines do not count.
|
||
.OU
|
||
.sp
|
||
MES <num>,<val>*
|
||
.IN
|
||
A special type of comment. Used by compilers to communicate with the
|
||
optimizer, assembler, etc. as follows:
|
||
.br
|
||
MES 0 -
|
||
.IN
|
||
An error has occurred, stop further processing.
|
||
.OU
|
||
.br
|
||
MES 1 -
|
||
.IN
|
||
Suppress optimization
|
||
.OU
|
||
.br
|
||
MES 2,<num1>,<num2>
|
||
.IN
|
||
Use word-size <num1> and pointer size <num2>.
|
||
.OU
|
||
.br
|
||
MES 3,<off>,<num1>,<num2> -
|
||
.IN
|
||
Indicates that a local variable is never referenced indirectly.
|
||
<off> is offset in bytes from AB if positive
and offset from LB if negative.
|
||
<num1> gives the size of the variable.
|
||
<num2> indicates the class of the variable.
|
||
.OU
|
||
.br
|
||
MES 4,<num>,<str>
|
||
.IN
|
||
Number of source lines in file <str> (for profiler).
|
||
.OU
|
||
.br
|
||
MES 5 -
|
||
.IN
|
||
Floating point used.
|
||
.OU
|
||
.br
|
||
MES 6,<val>* -
|
||
.IN
|
||
Comment. Used to provide comments in compact assembly language (see below).
|
||
.OU
|
||
.sp 1
|
||
Each back end is free to skip irrelevant MES pseudos.
|
||
.OU
|
||
.SB "The Compact Assembly Language"
|
||
.PP
|
||
The assembler accepts input in a highly encoded form. This
|
||
form is intended to reduce the amount of file transport between the compiler
|
||
and assembler, and also reduce the amount of storage required for storing
|
||
libraries.
|
||
Libraries are stored as archived compact assembly language, not machine language.
|
||
.PP
|
||
When beginning to read the input, the assembler is in neutral state, and
|
||
expects either a label or an instruction (including the pseudoinstructions).
|
||
The meaning of the next byte(s) when in neutral state is as follows, where b1, b2
|
||
etc. represent the succeeding bytes.
|
||
.sp
  0        Reserved for future use
  1-129    Machine instructions, see Appendix 2, alphabetical list
  130-149  Reserved for future use
  150-161  BSS,CON,END,EXC,EXA,EXP,HOL,INA,INP,MES,PRO,ROM
  162-179  Reserved for future pseudoinstructions
  180-239  Instruction labels 0 - 59  (180 is local label 0 etc.)
  240-244  See the Common Table below
  245-255  Not used

After a label, the assembler is back in neutral state; it can immediately
|
||
accept another label or an instruction in the very next byte. There are
|
||
no linefeeds used to separate lines.
|
||
.PP
|
||
If an opcode expects no arguments,
|
||
the assembler is back in neutral state after
|
||
reading the one byte containing the instruction number. If it has one or
|
||
more arguments (only pseudos have more than 1), the arguments follow directly,
|
||
encoded as follows:
|
||
.sp
|
||
0-239 Offsets from -120 to 119
|
||
.br
|
||
240-255 See the Common Table below
|
||
.sp 2
|
||
If an opcode has one optional argument,
|
||
a special byte is used to announce that the argument is not present.
|
||
.ce 1
|
||
Common Table for Neutral State and Arguments
|
||
.sp
|
||
.nf
<lab>  240 b1            Instruction label b1  (Not used for branches)
<lab>  241 b1 b2         16 bit instruction label  (256*b2 + b1)
<sym>  242 b1            Global label .0-.255, with b1 being the label
<sym>  243 b1 b2         Global label .0-.32767
                         with 256*b2+b1 being the label
<sym>  244 <string>      Global symbol not of the form .nnn
. \" Only the previous can occur in neutral state.
<num>  245 b1 b2         (16 bit constant) 256*b2+b1
<off>  246 b1 b2 b3 b4   (32 bit constant) 256*(256*(256*b4+b3)+b2)+b1
<arg>  247 <sym><off>    Global label + (possibly negative) constant
<pro>  248 <string>      Procedure name  (not including $)
<str>  249 <string>      String used in CON or ROM  (no quotes)
<con>  250 <num><string> Integer constant, size <num> bytes
<con>  251 <num><string> Unsigned constant, size <num> bytes
<con>  252 <num><string> Floating constant, size <num> bytes
<end>  255               Delimiter for argument lists or
                         indicates absence of optional argument

.fi
|
||
.PP
|
||
The notation <string> consists first of a length field, and then an
|
||
arbitrary string of bytes.
|
||
The length is specified by a <num>.
|
||
.PP
|
||
.ne 8
|
||
The pseudoinstructions fall into several categories, depending on their
|
||
arguments:
|
||
.sp
|
||
Group 1 -- EXC, BSS, HOL have a known number of arguments
|
||
Group 2 -- EXA, EXP, INA, INP start with a string
|
||
Group 3 -- CON, MES, ROM have a variable number of various things
|
||
Group 4 -- END, PRO have a trailing optional argument.
|
||
|
||
Groups 1 and 2
|
||
use the encoding described above.
|
||
Group 3 also uses the encoding listed above, with a <end> byte after the
|
||
last argument to indicate the end of the list.
|
||
Group 4 uses
|
||
a <end> byte if the trailing argument is not present.
|
||
|
||
.ad
|
||
.fi
|
||
.sp 2
|
||
.ne 12
|
||
.nf
Example ASCII           Example compact
(LOC = 66, BRA = 18 here):

2                       182
1                       181
 LOC 10                 66 130
 LOC -10                66 110
 LOC 300                66 245 44 1
 BRA 19                 18 139
300                     241 44 1
.3                      242 3
 CON 4,9,*2,$foo        151 124 129 240 2 248 3 102 111 111 255
 LOC .35                66 242 35
.fi
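.PP
The numeric encodings used in the examples above can be reproduced with
a small Python sketch.
The function names encode_arg and encode_label_definition are hypothetical,
and negative 16 and 32 bit constants are assumed to be stored in
two's complement, which the tables do not state explicitly.
.nf
```python
def encode_arg(value):
    """Encode a numeric argument as a list of bytes (compact form)."""
    if -120 <= value <= 119:
        return [value + 120]          # single byte in the range 0-239
    if -32768 <= value <= 32767:
        v = value & 0xFFFF            # assumption: two's complement
        return [245, v & 0xFF, v >> 8]
    if -2**31 <= value <= 2**31 - 1:
        v = value & 0xFFFFFFFF        # assumption: two's complement
        return [246] + [(v >> shift) & 0xFF for shift in (0, 8, 16, 24)]
    raise ValueError("argument out of range")


def encode_label_definition(number):
    """Encode an instruction label met in neutral state."""
    if not 0 <= number <= 32767:
        raise ValueError("label out of range")
    if number <= 59:
        return [180 + number]         # one-byte short form
    if number <= 255:
        return [240, number]
    return [241, number & 0xFF, number >> 8]
```
.fi
.PP
With LOC = 66, the list [66] + encode_arg(300) gives 66 245 44 1,
and encode_label_definition(300) gives 241 44 1, as in the example.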
|
||
.nr a 0 1
.SE "ASSEMBLY LANGUAGE INSTRUCTION LIST"
.PP
For each instruction in the list, the range of operand values
in the assembly language is given.
All constants, offsets and sizes are in the range -2**31~..~2**31-1.
The column headed \fIassem\fP contains the mnemonics defined
in 4.1.
The following column indicates restrictions on the range of the operand.
Addresses have to obey the restrictions mentioned in chapter 2 (Memory).
The size parameter of most instructions has to be a multiple
of the word size.
The classes of operands
are indicated by letters:
.ds b \fBb\fP
.ds c \fBc\fP
.ds d \fBd\fP
.ds g \fBg\fP
.ds f \fBf\fP
.ds l \fBl\fP
.ds n \fBn\fP
.ds i \fBi\fP
.ds p \fBp\fP
.ds r \fBr\fP
.ds s \fBs\fP
.ds z \fBz\fP
.ds - \fB-\fP
.nf

        \fIassem\fP   constraints     rationale

\&\*c       off                     1-word constant
\&\*d       off                     2-word constant
\&\*l       off                     local offset
\&\*g       arg     >= 0            global offset
\&\*f       off                     fragment offset
\&\*n       num     >= 0            counter
\&\*s       off     > 0             object size
\&\*z       off     >= 0            object size
\&\*i       off     > 0             object size *
\&\*p       pro                     pro identifier
\&\*b       lab     >= 0            label number
\&\*r       num     0,1,2           register number
\&\*-                               no operand

.fi
.PP
The * in the rationale for \*i indicates that the operand
can be given either as an argument or on top of the stack.
If the operand has to be fetched from the stack,
it is assumed to be a word-sized unsigned integer.
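.PP
The constraints in the table can be written down as a small check.
The sketch below is only an illustration: the function names are
hypothetical, only numeric operands are covered (so the classes
for procedure identifiers and for "no operand" are left out),
and the word size is passed in by the caller because it differs
between members of the family.
.nf

```python
def operand_ok(cls, value):
    """Check a numeric operand against the class table."""
    if not -2**31 <= value <= 2**31 - 1:
        return False                  # outside the assembly language range
    if cls in ("c", "d", "l", "f"):
        return True                   # no extra constraint listed
    if cls in ("g", "n", "z", "b"):
        return value >= 0
    if cls in ("s", "i"):
        return value > 0
    if cls == "r":
        return value in (0, 1, 2)
    raise ValueError("class takes no numeric operand: " + cls)


def size_ok(cls, value, word_size):
    """Most size operands must also be a multiple of the word size."""
    return operand_ok(cls, value) and value % word_size == 0


print(operand_ok("s", 0), operand_ok("z", 0), operand_ok("r", 3))
print(size_ok("s", 4, 2), size_ok("s", 3, 2))
```
.fi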
.PP
Instructions that check for undefined operands and underflow or overflow
are indicated by (*).
.nf

GROUP 1 - LOAD

LOC \*c : Load constant (i.e. push one word onto the stack)
LDC \*d : Load double constant (push two words)
LOL \*l : Load word at \*l-th local (l<0) or parameter (l>=0)
LOE \*g : Load external word \*g
LIL \*l : Load word pointed to by \*l-th local or parameter
LOF \*f : Load offsetted. (top of stack + \*f yields address)
LAL \*l : Load address of local or parameter
LAE \*g : Load address of external
LXL \*n : Load lexical. (address of LB \*n static levels back)
LXA \*n : Load lexical. (address of AB \*n static levels back)
LOI \*s : Load indirect \*s bytes (address is popped from the stack)
LOS \*i : Load indirect. \*i-byte integer on top of stack gives object size
LDL \*l : Load double local or parameter (two consecutive words are stacked)
LDE \*g : Load double external (two consecutive externals are stacked)
LDF \*f : Load double offsetted (top of stack + \*f yields address)
LPI \*p : Load procedure identifier

GROUP 2 - STORE

STL \*l : Store local or parameter
STE \*g : Store external
SIL \*l : Store into word pointed to by \*l-th local or parameter
STF \*f : Store offsetted
STI \*s : Store indirect \*s bytes (pop address, then data)
STS \*i : Store indirect. \*i-byte integer on top of stack gives object size
SDL \*l : Store double local or parameter
SDE \*g : Store double external
SDF \*f : Store double offsetted

GROUP 3 - INTEGER ARITHMETIC

ADI \*i : Addition (*)
SBI \*i : Subtraction (*)
MLI \*i : Multiplication (*)
DVI \*i : Division (*)
RMI \*i : Remainder (*)
NGI \*i : Negate (two's complement) (*)
SLI \*i : Shift left (*)
SRI \*i : Shift right (*)

GROUP 4 - UNSIGNED ARITHMETIC

ADU \*i : Addition
SBU \*i : Subtraction
MLU \*i : Multiplication
DVU \*i : Division
RMU \*i : Remainder
SLU \*i : Shift left
SRU \*i : Shift right

GROUP 5 - FLOATING POINT ARITHMETIC (Format not defined)

ADF \*i : Floating add (*)
SBF \*i : Floating subtract (*)
MLF \*i : Floating multiply (*)
DVF \*i : Floating divide (*)
NGF \*i : Floating negate (*)
FIF \*i : Floating multiply and split integer and fraction part (*)
FEF \*i : Split floating number in exponent and fraction part (*)

GROUP 6 - POINTER ARITHMETIC

ADP \*f : Add \*f to pointer on top of stack
ADS \*i : Add \*i-byte value and pointer
SBS \*i : Subtract pointers in same fragment and push diff as size \*i integer

GROUP 7 - INCREMENT/DECREMENT/ZERO

INC \*- : Increment top of stack by 1 (*)
INL \*l : Increment local or parameter (*)
INE \*g : Increment external (*)
DEC \*- : Decrement top of stack by 1 (*)
DEL \*l : Decrement local or parameter (*)
DEE \*g : Decrement external (*)
ZRL \*l : Zero local or parameter
ZRE \*g : Zero external
ZRF \*i : Load a floating zero of size \*i
ZER \*i : Load \*i zero bytes

GROUP 8 - CONVERT ( stack: source, source size, dest. size (top) )

CII \*- : Convert integer to integer (*)
CUI \*- : Convert unsigned to integer (*)
CFI \*- : Convert floating to integer (*)
CIF \*- : Convert integer to floating (*)
CUF \*- : Convert unsigned to floating (*)
CFF \*- : Convert floating to floating (*)
CIU \*- : Convert integer to unsigned
CUU \*- : Convert unsigned to unsigned
CFU \*- : Convert floating to unsigned

GROUP 9 - LOGICAL

AND \*i : Boolean and on two groups of \*i bytes
IOR \*i : Boolean inclusive or on two groups of \*i bytes
XOR \*i : Boolean exclusive or on two groups of \*i bytes
COM \*i : Complement (one's complement of top \*i bytes)
ROL \*i : Rotate left a group of \*i bytes
ROR \*i : Rotate right a group of \*i bytes

GROUP 10 - SETS

INN \*i : Bit test on \*i byte set (bit number on top of stack)
SET \*i : Create singleton \*i byte set with bit n on (n is top of stack)

GROUP 11 - ARRAY

LAR \*i : Load array element, descriptor contains integers of size \*i
SAR \*i : Store array element
AAR \*i : Load address of array element

GROUP 12 - COMPARE

CMI \*i : Compare \*i byte integers. Push negative, zero, positive for <, = or >
CMF \*i : Compare \*i byte reals
CMU \*i : Compare \*i byte unsigneds
CMS \*i : Compare \*i byte sets. Can only be used for equality tests.
CMP \*- : Compare pointers

TLT \*- : True if less, i.e. iff top of stack < 0
TLE \*- : True if less or equal, i.e. iff top of stack <= 0
TEQ \*- : True if equal, i.e. iff top of stack = 0
TNE \*- : True if not equal, i.e. iff top of stack non zero
TGE \*- : True if greater or equal, i.e. iff top of stack >= 0
TGT \*- : True if greater, i.e. iff top of stack > 0

GROUP 13 - BRANCH

BRA \*b : Branch unconditionally to label \*b

BLT \*b : Branch less (pop 2 words, branch if top > second)
BLE \*b : Branch less or equal
BEQ \*b : Branch equal
BNE \*b : Branch not equal
BGE \*b : Branch greater or equal
BGT \*b : Branch greater

ZLT \*b : Branch less than zero (pop 1 word, branch negative)
ZLE \*b : Branch less or equal to zero
ZEQ \*b : Branch equal zero
ZNE \*b : Branch not zero
ZGE \*b : Branch greater or equal zero
ZGT \*b : Branch greater than zero

GROUP 14 - PROCEDURE CALL

CAI \*- : Call procedure (procedure instance identifier on stack)
CAL \*p : Call procedure (with name \*p)
LFR \*s : Load function result
RET \*z : Return (function result consists of top \*z bytes)

GROUP 15 - MISCELLANEOUS

ASP \*f : Adjust the stack pointer by \*f
ASS \*i : Adjust the stack pointer by \*i-byte integer
BLM \*z : Block move \*z bytes; first pop destination addr, then source addr
BLS \*i : Block move, size is in \*i-byte integer on top of stack
CSA \*i : Case jump; address of jump table at top of stack
CSB \*i : Table lookup jump; address of jump table at top of stack
DUP \*s : Duplicate top \*s bytes
DUS \*i : Duplicate top \*i bytes
FIL \*g : File name (external 4 := \*g)
LIM \*- : Load 16 bit ignore mask
LIN \*n : Line number (external 0 := \*n)
LNI \*- : Line number increment
LOR \*r : Load register (0=LB, 1=SP, 2=HP)
MON \*- : Monitor call
NOP \*- : No operation
RCK \*i : Range check; trap on error
RTT \*- : Return from trap
SIG \*- : Trap errors to proc nr on top of stack (-2 resets default). Static
link of procedure is below procedure number. Old values returned
SIM \*- : Store 16 bit ignore mask
STR \*r : Store register (0=LB, 1=SP, 2=HP)
TRP \*- : Cause trap to occur (Error number on stack)
.fi
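.PP
To make the stack behaviour of a few of the listed instructions concrete,
the sketch below models LOC, ADI, SBI, INC, DEC, CMI and the six
T-instructions on a stack of word-sized integers.
It is an illustration under assumptions, not the definition of EM-1:
the size operand is taken to equal the word size and is otherwise ignored,
the second-from-top item is treated as the left operand (which agrees with
the description of BLT), true and false are represented as 1 and 0,
and the overflow check of the instructions marked (*) is modelled as
a Python exception.
.nf

```python
TESTS = {
    "TLT": lambda v: v < 0,
    "TLE": lambda v: v <= 0,
    "TEQ": lambda v: v == 0,
    "TNE": lambda v: v != 0,
    "TGE": lambda v: v >= 0,
    "TGT": lambda v: v > 0,
}


def run(program, word_size=2):
    """Execute (mnemonic, operand) pairs and return the final stack."""
    bits = 8 * word_size
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    stack = []

    def push_checked(value):
        # Used for instructions marked (*): overflow becomes an exception.
        if not low <= value <= high:
            raise OverflowError("integer overflow trap")
        stack.append(value)

    for op, operand in program:
        if op == "LOC":
            stack.append(operand)
        elif op in ("ADI", "SBI"):
            top = stack.pop()
            second = stack.pop()
            push_checked(second + top if op == "ADI" else second - top)
        elif op in ("INC", "DEC"):
            push_checked(stack.pop() + (1 if op == "INC" else -1))
        elif op == "CMI":
            top = stack.pop()
            second = stack.pop()
            # negative, zero or positive for second <, = or > top
            stack.append((second > top) - (second < top))
        elif op in TESTS:
            stack.append(1 if TESTS[op](stack.pop()) else 0)
        else:
            raise ValueError("not modelled: " + op)
    return stack


# Is 3 less than 5?  LOC 3, LOC 5, CMI 2, TLT
print(run([("LOC", 3), ("LOC", 5), ("CMI", 2), ("TLT", None)]))
```
.fi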