.\" 1984-06-29 14:46:39 +00:00
.lg 0
.ta 8 16 24 32 40 48 56 64 72 80
.hw iden-ti-fi-er
.nr a 0 1
.nr f 1 1
.de x1
'sp 2
'tl '''%'
'sp 2
.ns
..
.wh 0 x1
.de fo
'bp
..
.wh 60 fo
.ll 79
.lt 79
.de HT
.ti -4
..
.de PP
.sp
.ne 2
.ti +5
..
.de SE
.bp
\fB\\n+a. \\$1\fR
.nr b 0 1
..
.de SB
.br
.ne 10
.sp 5
\fB\\na.\\n+b. \\$1\fR
..
.de DC
.ti -14
DECISION~\\$1:
..
.de IN
.in +6
..
.de OU
.in -6
..
.tr ~
.sp 5
.rs
.sp 10
.ce 3
Changes in EM-1
Addendum to Informatica Rapport IR-54
.sp 5
.PP
This document describes a revision of EM-1.
A list of differences is presented roughly in the order IR-54
describes the original architecture.
A complete list of EM-1 pseudos and instructions is also included.
.SE Introduction
.PP
EM is a family of intermediate languages, resembling assembly
language for a stack machine.
EM defines the layout of data memory and a partitioning
of instruction memory.
EM can operate on five basic types:
pointers, signed integers, unsigned integers, floating point numbers
and sets of bits.
The size of pointers is fixed in each member,
in contrast to the sizes of the other types.
Each member has one more fixed size: the word size.
This is the minimum size of any object on the stack.
The sizes of all objects on the stack are assumed to be
multiples of the word size.
We assume that pointer and word-sizes are both powers of two.
.PP
It is possible to load objects smaller than the word size from memory.
These objects are converted to objects of the word size by
clearing the most significant bytes.
(A separate conversion instruction can do sign extension.)
When objects smaller than the word size are stored in memory,
the most significant bytes are ignored.
The size of such objects has to be a divisor of the word size.
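.PP
The load and store rules above can be sketched in Python as follows.
This is an illustration only, not part of EM:
the function names, the bytearray used as data memory and the
byteorder parameter are assumptions made for the example
(EM does not define a byte order).
.nf
```python
def load_small(memory, address, size, word_size, byteorder="little"):
    """Load a sub-word object and widen it to a word, zero-filled."""
    if size >= word_size or word_size % size != 0:
        raise ValueError("size must be a proper divisor of the word size")
    # Reading the bytes as an unsigned number leaves the most
    # significant bytes of the resulting word clear (no sign extension).
    return int.from_bytes(memory[address:address + size], byteorder)


def store_small(memory, address, size, word, byteorder="little"):
    """Store the low-order bytes of a word; the high-order bytes are ignored."""
    low = word & ((1 << (8 * size)) - 1)
    memory[address:address + size] = low.to_bytes(size, byteorder)
```
.fi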
.PP
Put in other terms, instructions such as LOC, LOL, LOE, STF, etc.
manipulate WORDS. Up until now, a word was defined as 16 bits.
It is now possible to define a word size other than 16 bits. For
example, MES 2,1,2 defines a word to be 8 bits and a pointer to be
16 bits. As another example, MES 2,4,4 defines a word to be 32 bits
and a pointer to be 32 bits.
.PP
If a compiler receives flags telling it to use 32 bit integers, it now
has a choice of setting the word length to 16 bits and using LDL etc
for dealing with integers, or setting the word length to 32 bits and using
LOL etc for integers.
For example, x:=a+b for 32-bit integers would become:
.sp
.nf
    MES 2,2,4               MES 2,4,4
    LDL a                   LOL a
    LDL b                   LOL b
    ADI 4                   ADI 4
    SDL x                   STL x
.fi
.sp
In many cases, the target machine code that is finally produced from either
of the above sequences will not show any traces of the stack machine.
For some instructions, however, actual pushes and pops at run time will be necessary.
Choosing a wider EM word will usually produce fewer stack operations than
a narrower word, but it eliminates the possibility of doing arithmetic on
quantities smaller than a word. If, for example, a compiler chooses a 32-bit
EM word, it will be difficult to add two 16-bit integers with ADI, since
the argument must be a multiple of the word size.
(The operation can be done by converting the operands to 32 bits using CII,
adding the 32-bit numbers, and reconverting the result.)
On the other hand, choosing a 16-bit EM word makes it possible to do both
16-bit adds (ADI 2) and 32-bit adds (ADI 4),
but the price paid is that 32-bit operations will be viewed as double
precision, and may be slightly less efficient on target machines with a
32-bit word, i.e. the EM to target translator may not take full advantage
of the 32 bit facilities.
.PP
Note that since LOC pushes a WORD on the stack, the argument of LOC
must fit in a word. LOC 256 on an EM machine with a 1-byte word length
is not allowed. LDC 256 is allowed, however.
.PP
A general rule of thumb is that the compiler should choose an EM word
length equal to the width of a single precision integer.
Obviously, compilers should be well parameterized to allow the integer
size(s) and word size(s) to be changed by just changing a few constants.
.PP
The size of an instruction space pointer is the same
as the size of a data space pointer.
.PP
EM assumes two's complement arithmetic on signed integers,
but does not define an ordering of the bytes in an integer.
The lowest numbered byte of a two-byte object can contain
either the most or the least significant part.
.SE Memory
.PP
EM has two separate addressing spaces, instruction and data.
The sizes of these spaces are not specified.
The layout of instruction space is not defined.
Any interpreter or translator may assume a layout fitting his/her needs.
The layout of data memory is specified by EM.
EM data memory consists of a sequence of 8-bit bytes each separately
addressable.
Certain alignment restrictions exist for objects consisting of multiple bytes.
Objects smaller than the word size can only be addressed
at multiples of the object size.
For example: in a member with a four-byte word size, two-byte integers
can only be accessed from even addresses.
Objects larger than the word size can only be placed at multiples
of the word size.
For example: in a member with a four-byte word size,
eight-byte floating point numbers can be fetched at addresses
0, 4, 8, 12, etc.
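.PP
The alignment restrictions can be checked mechanically.
The following Python sketch is only an illustration;
the function names are hypothetical and do not occur in any EM tool.
.nf
```python
def required_alignment(object_size, word_size):
    """Return the multiple at which an object of this size must be placed."""
    if object_size < word_size:
        if word_size % object_size != 0:
            raise ValueError("sub-word sizes must divide the word size")
        return object_size      # small objects: multiple of their own size
    return word_size            # word-sized and larger: multiple of the word size


def is_valid_address(address, object_size, word_size):
    """True if an object of object_size bytes may live at address."""
    return address % required_alignment(object_size, word_size) == 0
```
.fi
With a four-byte word, is_valid_address(6, 2, 4) holds for a two-byte
integer, while is_valid_address(10, 8, 4) fails for an eight-byte float.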
.SB "Procedure identifiers"
.PP
Procedure identifiers in EM have the same size
as pointers.
Any implementation of EM is free to use any method of identifying procedures.
Common methods are indices into tables containing further information
and addresses of the first instructions of procedures.
.SB "Heap and Stack in global data"
.PP
The stack grows downward, the heap grows upward.
The stack pointer points to the lowest occupied word on the stack.
The heap pointer marks the first free word in the heap area.
.br
.ne 39
.sp 1
.nf
65534 -> |-------------------------------|
         |///////////////////////////////|
         |//// unimplemented memory /////|
         |///////////////////////////////|
   SB -> |-------------------------------|
         |                               |
         |     stack and local area      | <- LB
         |                               |
         |                               |
         |-------------------------------| <- SP
         |///////////////////////////////|
         |// implementation dependent  //|
         |///////////////////////////////|
         |-------------------------------| <- HP
         |                               |
         |           heap area           |
         |                               |
         |                               |
         |-------------------------------|
         |                               |
         |          global area          |
         |                               |
   EB -> |-------------------------------|
         |                               |
         |                               |
         |         program text          | <- PC
         |                               |
         |                               |
   PB -> |-------------------------------|
         |///////////////////////////////|
         |////////// undefined //////////|
         |///////////////////////////////|
    0 -> |-------------------------------|

Fig. \nf. Example of memory layout showing typical register
positions during execution of an EM program.
.fi
.SB "Data addresses as arguments"
.PP
Anywhere previous versions of the EM assembly language
allowed identifiers of objects in
data space,
it is also possible to use 'identifier+constant' or 'identifier-constant'.
For example, both "CON LABEL+4" and "LAE SAVED+3" are allowed.
More complicated expressions are illegal.
.SB "Local data area"
.PP
The mark block has been banished.
When calling a procedure,
the calling routine first has to push the actual parameters.
All language implementations currently push their arguments
in reverse order, to be compatible with C.
Then the procedure is called using a CAL or CAI instruction.
Either the call or the procedure prolog somehow has to save
the return address and dynamic link.
The prolog allocates the space needed for locals and is free to
surround this space with saved registers and other information it
deems necessary.
.PP
The locals are now accessed using negative offsets in LOL, LDL, SDL, LAL,
LIL, SIL and STL instructions.
The parameters are accessed using non-negative offsets in LOL, LDL, SDL, LAL,
LIL, SIL and STL instructions.
The prolog might have stored information in the area between parameters and
locals.
As a consequence there are two bases, AB (virtual) and LB.
AB stands for Argument Base and LB stands for Local Base.
Non-negative arguments to LOL etc. are interpreted as offsets from AB,
negative arguments as offsets from LB.
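.PP
The choice of base can be written as a one-line rule.
The Python sketch below is an illustration under assumed names;
effective_address and the sample frame values are hypothetical.
.nf
```python
def effective_address(offset, ab, lb):
    """Resolve the argument of LOL, STL, LAL etc. to a data address."""
    # Offsets >= 0 name parameters (relative to AB),
    # negative offsets name locals (relative to LB).
    return ab + offset if offset >= 0 else lb + offset


# Hypothetical frame: the prolog stored 4 bytes between locals and parameters.
LB = 1000
AB = LB + 4
first_parameter = effective_address(0, AB, LB)
first_local = effective_address(-2, AB, LB)
```
.fi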
.PP
The BEG instruction is not needed to allocate the locals because
storage for locals is set aside in the prolog.
The instruction still exists under the name ASP (Adjust Stack Pointer).
.PP
Procedures return using the RET instruction.
The RET pops the function result from the stack and
brings the stack pointer and other relevant registers to the state
they had just before the procedure was called.
The RET instruction expects that - aside from possible function results -
the stack pointer has the value it had after execution of the prolog.
RET finally returns control to the calling routine.
The actual parameters have to be removed from the stack by the calling routine,
and not by the called procedure.
.sp 1
.ne 38
.nf
         |===============================|
         |       actual argument n       |
         |-------------------------------|
         |               .               |
         |               .               |
         |               .               |
         |-------------------------------|
         |       actual argument 1       | ( <- AB )
         |===============================|
         |///////////////////////////////|
         |// implementation dependent  //|
         |///////////////////////////////| <- LB
         |===============================|
         |                               |
         |        local variables        |
         |                               |
         |-------------------------------|
         |                               |
         |     compiler temporaries      |
         |                               |
         |===============================|
         |///////////////////////////////|
         |// implementation dependent  //|
         |///////////////////////////////|
         |===============================|
         |                               |
         |   dynamic local generators    |
         |                               |
         |===============================|
         |            operand            |
         |-------------------------------|
         |            operand            | <- SP
         |===============================|

         A sample procedure frame.
.fi
.sp 1
This scheme allows procedures to be called with a variable number
of parameters.
The parameters have to be pushed in reverse order,
because the called procedure has to be able to locate the first one.
.PP
Since the mark block has disappeared, a new mechanism for static
links had to be created.
All compilers use the convention that EM procedures needing
a static link will find a link in their zeroth parameter,
i.e. the last one pushed on the stack.
This parameter should be invisible to users of the compiler.
The link needs to be in a fixed place because the lexical instructions
have to locate it.
The LEX instruction is replaced by two instructions: LXL and LXA.
\&"LXL~n" finds the LB of a procedure n static levels removed.
\&"LXA~n" finds the (virtual) AB.
The value used for the static link is LB.
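.PP
The effect of LXL and LXA can be sketched by following the chain of
static links.
The Python code below is an illustration only.
It assumes that every procedure has the same fixed distance
lb_to_ab between LB and AB, and it models data memory as a
dictionary from addresses to pointer values; both are assumptions
of the example, not requirements of EM.
.nf
```python
def lxl(n, lb, memory, lb_to_ab):
    """Return the LB of the procedure n static levels removed."""
    for _ in range(n):
        # The static link is the zeroth parameter, so it is found at AB.
        lb = memory[lb + lb_to_ab]
    return lb


def lxa(n, lb, memory, lb_to_ab):
    """Return the (virtual) AB of the procedure n static levels removed."""
    return lxl(n, lb, memory, lb_to_ab) + lb_to_ab
```
.fi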
.PP
When a procedure needing a static link is called, first the actual
parameters are pushed, then the static link is pushed using LXL
and finally the procedure is called with a CAL with the procedure's
name as argument.
.br
.ne 40
.nf
         |===============================|
         |       actual argument n       |
         |-------------------------------|
         |               .               |
         |               .               |
         |               .               |
         |-------------------------------|
         |       actual argument 1       |
         |-------------------------------|
         |          static link          | ( <- AB )
         |===============================|
         |///////////////////////////////|
         |// implementation dependent  //|
         |///////////////////////////////| <- LB
         |===============================|
         |                               |
         |        local variables        |
         |                               |
         |-------------------------------|
         |                               |
         |     compiler temporaries      |
         |                               |
         |===============================|
         |///////////////////////////////|
         |// implementation dependent  //|
         |///////////////////////////////|
         |===============================|
         |                               |
         |   dynamic local generators    |
         |                               |
         |===============================|
         |            operand            |
         |-------------------------------|
         |            operand            | <- SP
         |===============================|

         A procedure frame with static link.
.fi
.sp 1
.sp 1
.PP
Pascal and other languages have to use procedure
instance identifiers containing
the procedure identifier
'ul
and
the static link the procedure has to be called with.
A static link having a value of zero signals
that the called procedure does not need a static link.
C uses the same convention for pointers to C-routines.
In pointers to C-routines the static link is set to zero.
.PP
Note: The distance from LB to AB must be known for each procedure, otherwise
LXA cannot be implemented.
Most implementations will have a fixed size area between
the parameter and local storage.
The zone between the compiler temporaries and the dynamic
local generators can be used
to save a variable number of registers.
.PP
.ne 11
Prolog examples:
.sp 2
.nf
        proc1                           proc2
        mov lb,-(sp)                    mov lb,-(sp)
        mov sp,lb                       mov sp,lb
        sub $loc_size,sp                sub $loc_size,sp
        mov r2,-(sp) ; save r2          mov r2,-(sp)
        mov r4,-(sp) ; save r4
.fi
.SB "Return values"
.PP
The return value popped by RET is stored in an unnamed 'function return area'.
This area can be different for different sized objects returned,
e.g. one register for two byte objects,
two registers for four byte objects,
memory for larger objects.
The area is available for 'READ-ONCE' access using the LFR instruction.
The result of an LFR is only defined if the sizes used to store and
fetch are identical.
The only instructions guaranteed not to destroy the contents of
any 'function return area' are ASP and BRA.
Thus parameters can be popped before fetching the function result.
The maximum size of all function return areas is
implementation dependent,
but allows procedure instance identifiers and all
implemented objects of type integer, unsigned, float
and pointer to be returned.
.SE "EM Assembly Language"
.nr b 0 1
.SB "Object types and instructions"
.PP
EM knows five basic object types:
pointers,
signed integers,
unsigned integers,
floating point numbers and
sets of bits.
Operations on objects of the last four types do not assume
a specific size.
Pointers (including procedure identifiers) have a fixed size in each
implementation.
Instructions acting on one or more objects of the last four types need
explicit size information.
This information can be given either as the argument of the
instruction or on top of the stack.
.sp 1
For example:
.nf
    addition of integers        LOL a, LOL b, ADI 2
    subtraction of two floats   LDL a, LDL b, SBF 4
    integer to float            LOL a, LOC 2, LOC 4, CIF, SDL b
.fi
.sp
Note that conversion instructions always expect size
before and size after conversion on the stack.
.sp
No obligation exists to implement all operations on all possible sizes.
.PP
The EM assembly language
allows constants as instruction arguments up to a size of four bytes.
In all EM's it is possible to initialize any type and size object.
BSS, HOL, CON and ROM allow type and size indication in initializers.
.SB "Conversion instructions"
.PP
The conversion operators can convert from any type and size to any
type and size.
The types are specified by the instruction,
the sizes should be in words on top of the stack.
Normally the sizes are multiples of the word size.
There is one exception: the CII instruction sign-extends if the
size of the source is a divisor of the word size.
.SB "CSA and CSB"
.PP
The tables used by these instructions do not contain the procedure
identifier any more.
See also "Descriptors".
.SB EXG
.PP
The EXG instruction is deleted from the EM instruction set.
If future applications show any need for this instruction,
it will be added again.
.SB "FIL"
.PP
A FIL instruction has been introduced.
When using separate compilation,
the LIN feature of EM was insufficient.
FIL expects as argument an address in global data.
This address is stored in a fixed place in memory,
where it can be used by any implementation for diagnostics etc.
Like LIN, it provides access to the ABS fragment at the start
of external data.
.SB "LAI and SAI"
.PP
LAI and SAI have been dropped because they thwarted register optimization.
.SB LNC
.PP
The LNC instruction is deleted from the instruction set.
LOC -n will do what it is supposed to.
.SB "Branch instructions"
.PP
The branch instructions are allowed to branch both forward and backward.
Consequently BRF and BRB are deleted and a BRA instruction is added.
BRA branches unconditionally in any direction.
.SB LDC
.PP
Loads a double word constant on the stack.
.SB LEX
.PP
LXA and LXL replace LEX.
.SB LFR
.PP
LFR loads the function result stored by RET.
.SB "LIL and SIL"
.PP
They replace LOP and STP. (Name change only)
.SB "Traps and Interrupts"
.PP
The numbers used for distinguishing the various types
of traps and interrupts have been reassigned.
The new instructions LIM and SIM
allow setting and clearing of bits in a mask.
The bits in the mask control the action taken upon encountering certain
errors at runtime.
A 1 bit causes the corresponding error to be ignored,
a 0 bit causes the run-time system to trap.
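.PP
The meaning of the mask bits can be sketched as follows.
This Python fragment is an illustration only:
it assumes that bit i of the 16-bit mask corresponds to error number i
and that errors without a mask bit always trap.
Neither the function name nor this numbering is defined here.
.nf
```python
def should_trap(error_number, ignore_mask):
    """Decide whether the run-time system traps for this error."""
    if not 0 <= error_number < 16:
        return True                     # assumption: no mask bit, always trap
    ignored = (ignore_mask >> error_number) & 1
    return ignored == 0                 # 1 bit: ignore, 0 bit: trap
```
.fi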
.SB LPI
.PP
Loads a procedure identifier on the stack.
LOC cannot be used to do this anymore.
.SB "ZER and ZRF"
.PP
ZER loads S zero bytes on the stack.
ZRF loads a floating point zero of size S.
.SB "Descriptors"
.PP
All instructions using descriptors have the size of the integer used
in the descriptor as argument.
The descriptors are: case descriptors (CSA and CSB),
range check descriptors (RCK) and
array descriptors (LAR, SAR, AAR).
.SB "Case descriptors"
.PP
The value used in a case descriptor to indicate the absence of a label
is zero instead of -1.
.SE "EM assembly language"
.SB "Instruction arguments"
.PP
The previous EM had different instructions for distinguishing
between operand on the stack and explicit argument in the instruction.
For example, LOI and LOS.
This distinction has been removed.
Several instructions have two possible forms:
with explicit argument and with implicit argument on top of the stack.
The size of the implicit argument is the word size.
The implicit argument is always popped before all other operands.
Appendix 1 shows what is allowed for each instruction.
.SB Notation
.PP
The notation used for the arguments of
instructions and pseudo instructions is as follows.
.in +12
.ti -11
<num>~~=~~an integer number in the range -32768..32767
.ti -11
<off>~~=~~an offset -2**31..2**31~-~1
.ti -11
<sym>~~=~~an identifier
.ti -11
<arg>~~=~~<off> or <sym> or <sym>+<off> or <sym>-<off>
.ti -11
<con>~~=~~integer constant,
unsigned constant,
floating point constant
.ti -11
<str>~~=~~string constant (surrounded by double quotes)
.ti -11
<lab>~~=~~instruction label ('*' followed by an integer in the range
0..32767).
.ti -11
<pro>~~=~~procedure number ('$' followed by a procedure name)
.ti -11
<val>~~=~~<arg>,
<con>,
<pro> or
<lab>.
.ti -11
<...>*~=~~zero or more of <...>
.ti -11
<...>+~=~~one or more of <...>
.ti -11
[...]~~=~~optional ...
.in -12
.SB Labels
.PP
No label, instruction or data, can have a (pseudo) instruction
on the same line.
.SB Constants
.PP
All constants in EM are interpreted in the decimal base.
.PP
In BSS, HOL, CON and ROM pseudo-instructions
numbers must be followed by I, U or F
indicating Integer, Unsigned or Float.
If no character is present I is assumed.
This character can be followed by an even positive number or a 1.
The number indicates the size in bytes of the object to be initialized,
up to 32766.
Double precision integers can no longer be indicated by a trailing L.
As said before, CON and ROM also allow expressions of the form:
\&"LABEL+offset" and "LABEL-offset".
The offset must be an unsigned decimal number.
The 'IUF' indicators cannot be used with the offsets.
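.PP
The rules for typed numbers can be sketched as a small parser.
The Python code below is an illustration only;
the function name is hypothetical, and the accepted number syntax
(an optional sign, digits and an optional fraction) is a simplifying
assumption of the example.
.nf
```python
import re

_INITIALIZER = re.compile(
    r"^(?P<num>[+-]?[0-9]+(?:[.][0-9]+)?)"
    r"(?:(?P<kind>[IUF])(?P<size>[0-9]+)?)?$"
)


def parse_initializer(text):
    """Split e.g. 10I2 into (number text, type letter, size in bytes)."""
    match = _INITIALIZER.match(text)
    if match is None:
        raise ValueError("not a typed number: " + text)
    kind = match.group("kind") or "I"       # no letter means Integer
    size = match.group("size")
    if size is not None:
        size = int(size)
        # The size must be 1 or an even positive number, up to 32766.
        if size != 1 and (size <= 0 or size % 2 != 0 or size > 32766):
            raise ValueError("illegal size: " + str(size))
    return match.group("num"), kind, size
```
.fi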
.PP
Areas reserved in the global data area by HOL or BSS can be
initialized.
BSS and HOL have a third parameter indicating whether the initialization
is mandatory or optional.
.PP
Since EM needs alignment of objects, this alignment is enforced by the
pseudo instructions.
All objects are aligned on a multiple of their size or the word size,
whichever is smaller.
Switching to another type of fragment or placing a label forces word-alignment.
There are three types of fragments in global data space: CON, ROM and BSS-HOL.
.sp
.SB "Pseudo instructions"
.PP
The LET, IMC and FWC pseudos have disappeared.
The only application of these pseudos was in postponing the
specification of the size of the local storage to just before
the END of the procedure.
A new mechanism has been introduced to handle this problem.
.ti +5
The pseudos involved in separate compilation and linking have
been reorganized.
.ti +5
PRO and END are altered and reflect the new calling sequence.
EOF has disappeared.
.ti +5
BSS and HOL allow initialization of the requested data areas.
.sp 2
Four pseudo instructions request global data:
.sp 2
BSS <off>,<val>,<num>
.IN
Reserve <off> bytes.
<val> is the value used to initialize the area.
<off> must be a multiple of the size of <val>.
<num> is 0 if the initialization is not strictly necessary,
1 otherwise.
.OU
.sp
HOL <off>,<val>,<num>
.IN
Idem, but all following absolute global data references will
refer to this block.
Only one HOL is allowed per procedure;
it has to be placed before the first instruction.
.OU
.sp
CON <val>+
.IN
Assemble global data words initialized with the <val> constants.
.OU
.sp
ROM <val>+
.IN
Idem, but the initialized data will never be changed by the program.
.OU
.sp 2
Two pseudo instructions partition the input into procedures:
.sp 2
PRO <sym>[,<off>]
.IN
Start of procedure.
<sym> is the procedure name.
<off> is the number of bytes for locals.
The number of bytes for locals must be specified in the PRO or
END pseudo-instruction.
When specified in both, they must be identical.
.OU
.sp
END [<off>]
.IN
End of Procedure.
<off> is the number of bytes for locals.
The number of bytes for locals must be specified in either the PRO or
END pseudo-instruction or both.
.OU
.PP
Names of data and procedures in an EM module can either be
internal or external.
External names are known outside the module and are used to link
several pieces of a program.
Internal names are not known outside the modules they are used in.
Other modules will not 'see' an internal name.
.ti +5
In order to reduce the number of passes needed,
it must be known at the first occurrence whether
a name is internal or external.
If the first occurrence of a name is in a definition,
the name is considered to be internal.
If the first occurrence of a name is a reference,
the name is considered to be external.
If the first occurrence is in one of the following pseudo instructions,
the effect of the pseudo has precedence.
.sp 2
EXA <sym>
.IN
External name.
<sym> is external to this module.
Note that <sym> may be defined in the same module.
.OU
.sp
EXP <pro>
.IN
External procedure identifier.
Note that <pro> may be defined in the same module.
.OU
.sp
INA <sym>
.IN
Internal name.
<sym> is internal to this module and must be defined in this module.
.OU
.sp
INP <pro>
.IN
Internal procedure.
<pro> is internal to this module and must be defined in this module.
.OU
.sp 2
Two other pseudo instructions provide miscellaneous features:
.sp 2
EXC <num1>,<num2>
.IN
Two blocks of instructions preceding this one are
interchanged before being processed.
<num1> gives the number of lines of the first block.
<num2> gives the number of lines of the second one.
Blank and pure comment lines do not count.
.OU
.sp
MES <num>,<val>*
.IN
A special type of comment. Used by compilers to communicate with the
optimizer, assembler, etc. as follows:
.br
MES 0 -
.IN
An error has occurred, stop further processing.
.OU
.br
MES 1 -
.IN
Suppress optimization
.OU
.br
MES 2,<num1>,<num2>
.IN
Use word-size <num1> and pointer size <num2>.
.OU
.br
MES 3,<off>,<num1>,<num2> -
.IN
Indicates that a local variable is never referenced indirectly.
<off> is the offset in bytes from AB if positive
and the offset from LB if negative.
<num1> gives the size of the variable.
<num2> indicates the class of the variable.
.OU
.br
MES 4,<num>,<str>
.IN
Number of source lines in file <str> (for profiler).
.OU
.br
MES 5 -
.IN
Floating point used.
.OU
.br
MES 6,<val>* -
.IN
Comment. Used to provide comments in compact assembly language (see below).
.OU
.sp 1
Each back end is free to skip irrelevant MES pseudos.
.OU
.SB "The Compact Assembly Language"
.PP
The assembler accepts input in a highly encoded form. This
form is intended to reduce the amount of file transport between the compiler
and assembler, and also reduce the amount of storage required for storing
libraries.
Libraries are stored as archived compact assembly language, not machine language.
.PP
When beginning to read the input, the assembler is in neutral state, and
expects either a label or an instruction (including the pseudoinstructions).
The meaning of the next byte(s) when in neutral state is as follows, where b1, b2
etc. represent the succeeding bytes.
.sp
.nf
    0         Reserved for future use
    1-129     Machine instructions, see Appendix 2, alphabetical list
    130-149   Reserved for future use
    150-161   BSS,CON,END,EXC,EXA,EXP,HOL,INA,INP,MES,PRO,ROM
    162-179   Reserved for future pseudoinstructions
    180-239   Instruction labels 0 - 59 (180 is local label 0 etc.)
    240-244   See the Common Table below
    245-255   Not used
.fi
After a label, the assembler is back in neutral state; it can immediately
accept another label or an instruction in the very next byte. There are
no linefeeds used to separate lines.
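.PP
The neutral-state table can be turned into a small classifier.
The Python sketch below is an illustration only;
the function name and the returned tags are hypothetical,
the byte ranges are those listed above.
.nf
```python
PSEUDOS = "BSS,CON,END,EXC,EXA,EXP,HOL,INA,INP,MES,PRO,ROM".split(",")


def classify_neutral_byte(b):
    """Classify a byte read while the assembler is in neutral state."""
    if not 0 <= b <= 255:
        raise ValueError("not a byte")
    if b == 0 or 130 <= b <= 149 or 162 <= b <= 179:
        return ("reserved", b)
    if b <= 129:
        return ("instruction", b)
    if b <= 161:
        return ("pseudo", PSEUDOS[b - 150])
    if b <= 239:
        return ("label", b - 180)       # 180 is instruction label 0
    if b <= 244:
        return ("common", b)            # see the Common Table
    return ("unused", b)
```
.fi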
.PP
If an opcode expects no arguments,
the assembler is back in neutral state after
reading the one byte containing the instruction number. If it has one or
more arguments (only pseudos have more than 1), the arguments follow directly,
encoded as follows:
.sp
0-239 Offsets from -120 to 119
.br
240-255 See the Common Table below
.sp 2
If an opcode has one optional argument,
a special byte is used to announce that the argument is not present.
.ce 1
Common Table for Neutral State and Arguments
.sp
.nf
<lab>  240 b1              Instruction label b1 (Not used for branches)
<lab>  241 b1 b2           16 bit instruction label (256*b2 + b1)
<sym>  242 b1              Global label .0-.255, with b1 being the label
<sym>  243 b1 b2           Global label .0-.32767
                           with 256*b2+b1 being the label
<sym>  244 <string>        Global symbol not of the form .nnn
. \" Only the previous can occur in neutral state.
<num>  245 b1 b2           (16 bit constant) 256*b2+b1
<off>  246 b1 b2 b3 b4     (32 bit constant) 256*(256*(256*b4+b3)+b2)+b1
<arg>  247 <sym><off>      Global label + (possibly negative) constant
<pro>  248 <string>        Procedure name (not including $)
<str>  249 <string>        String used in CON or ROM (no quotes)
<con>  250 <num><string>   Integer constant, size <num> bytes
<con>  251 <num><string>   Unsigned constant, size <num> bytes
<con>  252 <num><string>   Floating constant, size <num> bytes
<end>  255                 Delimiter for argument lists or
                           indicates absence of optional argument
.fi
.PP
The notation <string> consists first of a length field, and then an
arbitrary string of bytes.
The length is specified by a <num>.
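.PP
The encoding of numeric arguments can be sketched as follows.
The Python code is an illustration only.
The function names are hypothetical, and representing negative
16 and 32 bit constants in two's complement is an assumption
that the tables above do not state explicitly.
.nf
```python
def encode_offset(value):
    """Encode a numeric argument as a list of compact-assembly bytes."""
    if -120 <= value <= 119:
        return [value + 120]            # single byte 0-239
    if -32768 <= value <= 32767:
        v = value & 0xFFFF              # assumed two's complement
        return [245, v & 0xFF, v >> 8]  # low byte b1 first, then b2
    if -2**31 <= value <= 2**31 - 1:
        v = value & 0xFFFFFFFF
        return [246, v & 0xFF, (v >> 8) & 0xFF, (v >> 16) & 0xFF, v >> 24]
    raise ValueError("argument does not fit in 32 bits")


def encode_instruction(opcode, value):
    """Encode a one-argument instruction: opcode byte, then the argument."""
    return [opcode] + encode_offset(value)
```
.fi
With LOC taken to be 66, encode_instruction(66, 300) yields
the bytes 66 245 44 1.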
.PP
.ne 8
The pseudoinstructions fall into several categories, depending on their
arguments:
.sp
.nf
    Group 1 -- EXC, BSS, HOL have a known number of arguments
    Group 2 -- EXA, EXP, INA, INP start with a string
    Group 3 -- CON, MES, ROM have a variable number of various things
    Group 4 -- END, PRO have a trailing optional argument.
.fi
Groups 1 and 2
use the encoding described above.
Group 3 also uses the encoding listed above, with a <end> byte after the
last argument to indicate the end of the list.
Group 4 uses
a <end> byte if the trailing argument is not present.
.ad
.fi
.sp 2
.ne 12
.nf
  Example ASCII               Example compact
                              (LOC = 66, BRA = 18 here):
  2                           182
  1                           181
       LOC 10                 66 130
       LOC -10                66 110
       LOC 300                66 245 44 1
       BRA 19                 18 139
  300                         241 44 1
  .3                          242 3
       CON 4,9,*2,$foo        151 124 129 240 2 248 3 102 111 111 255
       LOC .35                66 242 35
.fi
.nr a 0 1
.SE "ASSEMBLY LANGUAGE INSTRUCTION LIST"
.PP
For each instruction in the list the range of operand values
in the assembly language is given.
All constants, offsets and sizes are in the range -2**31~..~2**31-1.
The column headed \fIassem\fP contains the mnemonics defined
in 4.1.
The following column indicates restrictions in the range of the operand.
Addresses have to obey the restrictions mentioned in chapter 2 - Memory -.
The size parameter of most instructions has to be a multiple
of the word size.
The classes of operands
are indicated by letters:
.ds b \fBb\fP
.ds c \fBc\fP
.ds d \fBd\fP
.ds g \fBg\fP
.ds f \fBf\fP
.ds l \fBl\fP
.ds n \fBn\fP
.ds i \fBi\fP
.ds p \fBp\fP
.ds r \fBr\fP
.ds s \fBs\fP
.ds z \fBz\fP
.ds - \fB-\fP
.nf
\fIassem\fP    constraints    rationale
\&\*c        off            1-word constant
\&\*d        off            2-word constant
\&\*l        off            local offset
\&\*g        arg >= 0       global offset
\&\*f        off            fragment offset
\&\*n        num >= 0       counter
\&\*s        off > 0        object size
\&\*z        off >= 0       object size
\&\*i        off > 0        object size *
\&\*p        pro            pro identifier
\&\*b        lab >= 0       label number
\&\*r        num 0,1,2      register number
\&\*-                       no operand
.fi
.PP
The * at the rationale for \*i indicates that the operand
can either be given as argument or on top of the stack.
If the operand has to be fetched from the stack,
it is assumed to be a word-sized unsigned integer.
.PP
Instructions that check for undefined operands and underflow or overflow
are indicated by (*).
.nf
GROUP 1 - LOAD
LOC \*c : Load constant (i.e. push one word onto the stack)
LDC \*d : Load double constant ( push two words )
LOL \*l : Load word at \*l-th local (l<0) or parameter (l>=0)
LOE \*g : Load external word \*g
LIL \*l : Load word pointed to by \*l-th local or parameter
LOF \*f : Load offsetted. (top of stack + \*f yield address)
LAL \*l : Load address of local or parameter
LAE \*g : Load address of external
LXL \*n : Load lexical. (address of LB \*n static levels back)
LXA \*n : Load lexical. (address of AB \*n static levels back)
LOI \*s : Load indirect \*s bytes (address is popped from the stack)
LOS \*i : Load indirect. \*i-byte integer on top of stack gives object size
LDL \*l : Load double local or parameter (two consecutive words are stacked)
LDE \*g : Load double external (two consecutive externals are stacked)
LDF \*f : Load double offsetted (top of stack + \*f yield address)
LPI \*p : Load procedure identifier
GROUP 2 - STORE
STL \*l : Store local or parameter
STE \*g : Store external
SIL \*l : Store into word pointed to by \*l-th local or parameter
STF \*f : Store offsetted
STI \*s : Store indirect \*s bytes (pop address, then data)
STS \*i : Store indirect. \*i-byte integer on top of stack gives object size
SDL \*l : Store double local or parameter
SDE \*g : Store double external
SDF \*f : Store double offsetted
GROUP 3 - INTEGER ARITHMETIC
ADI \*i : Addition (*)
SBI \*i : Subtraction (*)
MLI \*i : Multiplication (*)
DVI \*i : Division (*)
RMI \*i : Remainder (*)
NGI \*i : Negate (two's complement) (*)
SLI \*i : Shift left (*)
SRI \*i : Shift right (*)
GROUP 4 - UNSIGNED ARITHMETIC
ADU \*i : Addition
SBU \*i : Subtraction
MLU \*i : Multiplication
DVU \*i : Division
RMU \*i : Remainder
SLU \*i : Shift left
SRU \*i : Shift right
GROUP 5 - FLOATING POINT ARITHMETIC (Format not defined)
ADF \*i : Floating add (*)
SBF \*i : Floating subtract (*)
MLF \*i : Floating multiply (*)
DVF \*i : Floating divide (*)
NGF \*i : Floating negate (*)
FIF \*i : Floating multiply and split integer and fraction part (*)
FEF \*i : Split floating number in exponent and fraction part (*)
GROUP 6 - POINTER ARITHMETIC
ADP \*f : Add \*f to pointer on top of stack
ADS \*i : Add \*i-byte value and pointer
SBS \*i : Subtract pointers in same fragment and push diff as size \*i integer
GROUP 7 - INCREMENT/DECREMENT/ZERO
INC \*- : Increment top of stack by 1 (*)
INL \*l : Increment local or parameter (*)
INE \*g : Increment external (*)
DEC \*- : Decrement top of stack by 1 (*)
DEL \*l : Decrement local or parameter (*)
DEE \*g : Decrement external (*)
ZRL \*l : Zero local or parameter
ZRE \*g : Zero external
ZRF \*i : Load a floating zero of size \*i
ZER \*i : Load \*i zero bytes
GROUP 8 - CONVERT ( stack: source, source size, dest. size (top) )
CII \*- : Convert integer to integer (*)
CUI \*- : Convert unsigned to integer (*)
CFI \*- : Convert floating to integer (*)
CIF \*- : Convert integer to floating (*)
CUF \*- : Convert unsigned to floating (*)
CFF \*- : Convert floating to floating (*)
CIU \*- : Convert integer to unsigned
CUU \*- : Convert unsigned to unsigned
CFU \*- : Convert floating to unsigned
GROUP 9 - LOGICAL
AND \*i : Boolean and on two groups of \*i bytes
IOR \*i : Boolean inclusive or on two groups of \*i bytes
XOR \*i : Boolean exclusive or on two groups of \*i bytes
COM \*i : Complement (one's complement of top \*i bytes)
ROL \*i : Rotate left a group of \*i bytes
ROR \*i : Rotate right a group of \*i bytes
GROUP 10 - SETS
INN \*i : Bit test on \*i byte set (bit number on top of stack)
SET \*i : Create singleton \*i byte set with bit n on (n is top of stack)
GROUP 11 - ARRAY
LAR \*i : Load array element, descriptor contains integers of size \*i
SAR \*i : Store array element
AAR \*i : Load address of array element
GROUP 12 - COMPARE
CMI \*i : Compare \*i byte integers. Push negative, zero, positive for <, = or >
CMF \*i : Compare \*i byte reals
CMU \*i : Compare \*i byte unsigneds
CMS \*i : Compare \*i byte sets. Can only be used for equality test.
CMP \*- : Compare pointers
TLT \*- : True if less, i.e. iff top of stack < 0
TLE \*- : True if less or equal, i.e. iff top of stack <= 0
TEQ \*- : True if equal, i.e. iff top of stack = 0
TNE \*- : True if not equal, i.e. iff top of stack non zero
TGE \*- : True if greater or equal, i.e. iff top of stack >= 0
TGT \*- : True if greater, i.e. iff top of stack > 0
GROUP 13 - BRANCH
BRA \*b : Branch unconditionally to label \*b
BLT \*b : Branch less (pop 2 words, branch if top > second)
BLE \*b : Branch less or equal
BEQ \*b : Branch equal
BNE \*b : Branch not equal
BGE \*b : Branch greater or equal
BGT \*b : Branch greater
ZLT \*b : Branch less than zero (pop 1 word, branch negative)
ZLE \*b : Branch less or equal to zero
ZEQ \*b : Branch equal zero
ZNE \*b : Branch not zero
ZGE \*b : Branch greater or equal zero
ZGT \*b : Branch greater than zero
GROUP 14 - PROCEDURE CALL
CAI \*- : Call procedure (procedure instance identifier on stack)
CAL \*p : Call procedure (with name \*p)
LFR \*s : Load function result
RET \*z : Return (function result consists of top \*z bytes)
GROUP 15 - MISCELLANEOUS
ASP \*f : Adjust the stack pointer by \*f
ASS \*i : Adjust the stack pointer by \*i-byte integer
BLM \*z : Block move \*z bytes; first pop destination addr, then source addr
BLS \*i : Block move, size is in \*i-byte integer on top of stack
CSA \*i : Case jump; address of jump table at top of stack
CSB \*i : Table lookup jump; address of jump table at top of stack
DUP \*s : Duplicate top \*s bytes
DUS \*i : Duplicate top \*i bytes
FIL \*g : File name (external 4 := \*g)
LIM \*- : Load 16 bit ignore mask
LIN \*n : Line number (external 0 := \*n)
LNI \*- : Line number increment
LOR \*r : Load register (0=LB, 1=SP, 2=HP)
MON \*- : Monitor call
NOP \*- : No operation
RCK \*i : Range check; trap on error
RTT \*- : Return from trap
SIG \*- : Trap errors to proc nr on top of stack (-2 resets default). Static
link of procedure is below procedure number. Old values returned
SIM \*- : Store 16 bit ignore mask
STR \*r : Store register (0=LB, 1=SP, 2=HP)
TRP \*- : Cause trap to occur (Error number on stack)
.fi