
.\"	Implementation details
.\"
.\"	$Header$
.bp
.NH
IMPLEMENTATION DETAILS.
.PP
The pertinent issues are addressed below, in arbitrary order.
.NH 2
Stack manipulation and start-up
.PP
It is not at all easy to start the EM machine with the stack in a reasonable
and consistent state.  One reason is the anomalous value of the ML register
and another is the absence of a proper RSB.  It may be argued that the initial
stack does not have to be in a consistent state, since the first instruction
proper is only executed after \fIargc\fP, \fIargv\fP and \fIenviron\fP
have been stacked (which takes care of the empty stack) and the initial
procedure has been called (which creates a RSB).  We would, however, like to
perform the stacking of these values and the calling of the initial procedure
using the normal stack and call routines, which again require the stack to be
in an acceptable state.
.NH 3
The anomalous value of the ML register
.PP
All registers in the EM machine point to word boundaries, and all of them,
except ML, address the even-numbered byte at the boundary.
The exception has a good reason: the even-numbered byte at the ML boundary does
not exist.
This problem is not particular to EM but is inherent in the number system: the
number of N-digit numbers can itself not be expressed in an N-digit number, and
the number of addresses in an N-bit machine will itself not fit in an N-bit
address.  The problem is solved in the interpreter by having ML point to the
highest word boundary that has bytes on either side; this makes ML+1
expressible.
.NH 3
The absence of an initial Return Status Block
.PP
When the stack is empty, there is no legal value for AB, since there are no
actuals; LB can be set naturally to ML+1.  This is all right when the
interpreter starts with a call of the initial routine which stores the value
of LB in the first RSB, but causes problems when finally this call returns.  We
want this call to return completely before stopping the interpreter, to check
the integrity of the last RSB; restoring information from it will, however,
cause illegal values to be stored in LB and AB (ML+1 and ML+1+rsbsize, resp.).
On top of this, the initial (illegal) Procedure Identifier of the running
procedure will be restored; restoring the likewise illegal PC will then
cause a check to see if it is still inside the running procedure.  After a few
attempts at writing special cases, we have decided that it is possible, but not
worth the effort; the final (= initial) RSB will not be unstacked.
.NH 2
Floating point numbers.
.PP
The interpreter is capable of working with 4- and 8-byte floating point (FP)
numbers.
In C-terms, this corresponds to objects of type float and double respectively.
Both types fit in a C-double so the obvious way to manipulate these entities
internally is in doubles.
When an 8-byte FP is pushed, all bytes of the C-double are pushed.
Pushing a 4-byte FP causes the 4 bytes representing the smallest fraction
to be discarded.
.PP
In EM, floats can be obtained in two different ways: via conversion
of another type, or via initialization in the loadfile.
Initialized floats are represented in the loadfile by an ASCII string in
the syntax of a Pascal real (signed \fIUnsignedReal\fP).
I.e. a float looks like:
.DS
[ \fISign\fP ] \fIDigit\fP+ [ . \fIDigit\fP+ ] [ \fIExp\fP [ \fISign\fP ] \fIDigit\fP+ ]                                (G1)
.DE
followed by a null byte.
Here \fISign\fP = {+, \-}; \fIDigit\fP = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
\fIExp\fP = {e, E}; [ \fIAnything\fP ] means that \fIAnything\fP is optional;
and a + means one or more times.
To accommodate some loose code generators, the actual grammar accepted is:
.DS
[ \fISign\fP ] \fIDigit\fP\(** [ . \fIDigit\fP\(** ] [ \fIExp\fP [ \fISign\fP ] \fIDigit\fP+ ]                                (G2)
.DE
followed by a null byte. Here \(** means zero or more times.  A floating
denotation which is in G2 but not in G1 draws a warning; one that is not even
in G2 causes a fatal error.
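.LP
The distinction between the two grammars can be sketched in C as follows.
This is an illustration only, not the interpreter's own code; the names
\fIclassify_float\fP, \fIskip_digits\fP and the \fIfl_class\fP values are
hypothetical.
The routine accepts G2 and remembers whether the stricter G1 was violated
along the way.
.DS
```c
#include <ctype.h>

enum fl_class { FL_G1, FL_G2_ONLY, FL_BAD };

/* Skip a run of digits starting at *pp; return how many were skipped. */
static int skip_digits(const char **pp)
{
    int n = 0;

    while (isdigit((unsigned char)**pp)) {
        (*pp)++;
        n++;
    }
    return n;
}

/* Classify a null-terminated float denotation against G1 and G2. */
enum fl_class classify_float(const char *s)
{
    int strict = 1;             /* does the string still conform to G1? */

    if (*s == '+' || *s == '-')
        s++;
    if (skip_digits(&s) == 0)
        strict = 0;             /* G1 needs at least one leading digit */
    if (*s == '.') {
        s++;
        if (skip_digits(&s) == 0)
            strict = 0;         /* G1 needs a digit after the point */
    }
    if (*s == 'e' || *s == 'E') {
        s++;
        if (*s == '+' || *s == '-')
            s++;
        if (skip_digits(&s) == 0)
            return FL_BAD;      /* both grammars need exponent digits */
    }
    if (*s != 0)
        return FL_BAD;          /* something is left before the null byte */
    return strict ? FL_G1 : FL_G2_ONLY;
}
```
.DE
.LP
In this sketch FL_G2_ONLY corresponds to the warning and FL_BAD to the fatal
error.
Note that G2 as written also admits the empty string, so the sketch reports
it as FL_G2_ONLY.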
.LP
A string representing a float that does not fit in a double causes a
warning to be given.
In that case, the returned value will be the double 0.0.
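.LP
A minimal sketch of this behavior, assuming the conversion is done with the
standard \fIstrtod()\fP routine (the text does not say which routine the
interpreter uses) and that the string has already been checked against G2.
The name \fIstr2double\fP and the \fIwarned\fP flag are hypothetical.
.DS
```c
#include <errno.h>
#include <stdlib.h>

/* Convert str; on a range error set *warned to 1 and return 0.0. */
double str2double(const char *str, int *warned)
{
    double d;

    errno = 0;
    d = strtod(str, (char **)0);
    *warned = (errno == ERANGE);    /* the caller issues the warning */
    return *warned ? 0.0 : d;
}
```
.DE
.LP
Whether \fIstrtod()\fP reports a range error on underflow as well as on
overflow depends on the C library.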
.LP
Floating point arithmetic is handled by some simple routines, checking for
over/underflow, and returning appropriate values in case of an ignored error.
.PP
Since not all C compilers provide floating point operations, there is a
compile time flag NOFLOAT, which, if defined, suppresses the use of all
fp operations in the interpreter.  The resulting interpreter will still load
EM files with floats in the global data area (and ignore them) but will give a
fatal error upon attempt to execute a floating point instruction; consequently
code involving floating point operations can be run as long as the actual
instructions are avoided.
.NH 2
Pointers.
.PP
The following sub-sections both deal with problems concerning pointers.
First, something is said about pointer arithmetic in general.
Then, the null-pointer problem is dealt with.
.NH 3
Pointer arithmetic.
.PP
Strictly speaking, pointer arithmetic is defined only within a \fBfragment\fP.
From the explanation of the term fragment however (as given in [1], page 3),
it is not quite clear what a fragment should look like
from an interpreter's point of view.
For this reason we introduced the term \fBsegment\fP,
bordering the various areas within which pointer arithmetic is allowed.
Every stack-frame is a segment, and so are the global data area (GDA) and
the heap area.
Thus, the number of segments varies over time, and at some point in time is
given by the number of currently active stack-frames
(#CAL + #CAI \- #RET \- #RTT) plus 2 (GDA, heap).
Pointers in the area between heap and stack (which is inaccessible by
definition), are assumed to be in the heap segment.
.PP
The interpreter, while building a new stack-frame (i.e. segment), stores the
value of the last ActualBase in a pointer-array  (\fIAB_list[\ ]\fP).
When a pointer (say \fIP\fP) is available for arithmetic, the number
of the segment where it points (say \fIS\d\s-2P\s+2\u\fP),
is determined first.
Next, the arithmetic is performed, followed by a check on the number
of the segment where the resulting pointer \fIR\fP points
(say \fIS\d\s-2R\s+2\u\fP).
Now, if \fIS\d\s-2P\s+2\u != S\d\s-2R\s+2\u\fP, a warning is given:
\fBPointer arithmetic yields pointer to bad segment\fP.
.br
It may also be clear now why the illegal area between heap and stack
was joined with the heap segment.
When calculating a new heap pointer (\fIHP\fP), one will obtain intermediate
results being pointers in this area just before it is made legal.
We do not want error messages all of the time, just because someone is
allocating space in the heap.
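.LP
The segment check can be sketched as follows.
The memory layout is an assumption made for the sake of the example: the GDA
lies below the heap, the stack grows downward, and frame \fIk\fP is taken to
span the addresses from the boundary recorded for frame \fIk\fP+1 up to the
boundary recorded for frame \fIk\fP.
All names (\fImem_layout\fP, \fIsegment_of\fP, \fIadd_checked\fP) are
hypothetical.
.DS
```c
#include <stddef.h>

typedef unsigned long ptr;

#define SEG_GDA   0
#define SEG_HEAP  1             /* stack frame k is segment 2 + k */

struct mem_layout {
    ptr hb;                     /* first heap address; the GDA lies below */
    ptr sp;                     /* lowest stack address in use */
    const ptr *ab_list;         /* boundary per frame, oldest first,
                                   strictly decreasing */
    size_t n_frames;
};

/* Return the number of the segment that p points into. */
int segment_of(const struct mem_layout *m, ptr p)
{
    size_t k;

    if (p < m->hb)
        return SEG_GDA;
    if (p < m->sp)
        return SEG_HEAP;        /* heap plus the gap below the stack */
    for (k = m->n_frames; k > 0; k--) {
        if (p < m->ab_list[k - 1])
            return 2 + (int)(k - 1);
    }
    return 2;                   /* above the oldest boundary */
}

/* Add off to p; return 1 if the result stays in p's segment, else 0. */
int add_checked(const struct mem_layout *m, ptr p, long off, ptr *result)
{
    ptr r = p + (ptr)off;       /* unsigned arithmetic, wraps silently */

    *result = r;
    return segment_of(m, p) == segment_of(m, r);
}
```
.DE
.LP
A return value of 0 from \fIadd_checked\fP corresponds to the warning quoted
above.
Because the gap between heap and stack counts as heap, advancing a heap
pointer into it does not change the segment number.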
.LP
A similar treatment is given to the pointers in the SBS instruction; they have
to point into the same fragment for subtraction to be meaningful.
.LP
The length of the \fIAB_list[\ ]\fP is initially 100,
and it is reallocated in the same way the dynamically growing partitions
are (see 1.1).
.NH 3
Null pointer.
.PP
Because the EM language lacks an instruction for loading a null pointer,
most programs solve this problem by loading a pointer-sized integer of
value zero, and using this as a null pointer (this is also proposed in [1]).
\fBInt\fP allows this, and will not complain.
A warning is given however, when an attempt is made to add something to a
null pointer (i.e. the pointer-sized integer zero).
.LP
Since many programming languages use a pointer to location 0 as an illegal
value, it is desirable to detect its use.
The big problem, though, is that 0 is a perfectly legal EM address;
address 0 holds the current line number in the source file.  It may be freely
read but is written only by means of the LIN instruction.  This allows us to
declare the area consisting of the line number and the file name pointer to be
read-only memory.  Thus a store will be caught (and result in a warning) but a
read will succeed (and yield the EM information stored there).
.NH 2
Function Return Area (FRA).
.PP
The Function Return Area (\fIFRA[\ ]\fP) has a default size of 8 bytes;
this default can
be overridden through the use of the \fB\-r\fP-option, but cannot be
made smaller than the size of two pointers, in accordance with the
remark on page 5 of [1].
The global variable \fIFRASize\fP keeps track of how many bytes were
stored in the FRA, the last time a RET instruction was executed.
The LFR instruction only works when its argument is equal to this size.
If not, the FRA contents are loaded anyhow, but one of the following warnings
is given:
\fBReturned function result too large\fP (\fIFRASize\fP > LFR size) or
\fBReturned function result too small\fP (\fIFRASize\fP < LFR size).
.LP
Note that a C-program, falling through the end of its code without doing
a proper \fIreturn\fP or \fIexit()\fP, will generate this warning.
.PP
The only instructions that do not disturb the contents of the FRA are
GTO, BRA, ASP and RET.
This is expressed in the program by setting \fIFRA_def\fP to "undefined"
in any instruction except these four.
We realize this is a useless action most of the time, but a more
efficient solution does not seem to be at hand.
If a result is loaded when \fIFRA_def\fP is "undefined", the warning:
\fBReturned function result may be garbled\fP is generated.
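.LP
The checks made by LFR can be summarized in a small sketch.
The order in which the checks are made, and the fact that only one warning
is reported at a time, are assumptions; the names \fIlfr_warn\fP and
\fIlfr_check\fP are hypothetical.
.DS
```c
#include <stddef.h>

enum lfr_warn { LFR_OK, LFR_GARBLED, LFR_TOO_LARGE, LFR_TOO_SMALL };

/* Warning texts, indexed by enum lfr_warn. */
static const char *const lfr_text[] = {
    "",
    "Returned function result may be garbled",
    "Returned function result too large",
    "Returned function result too small"
};

/* Decide which warning, if any, an LFR of lfr_size bytes deserves. */
enum lfr_warn lfr_check(size_t fra_size, int fra_def, size_t lfr_size)
{
    if (!fra_def)
        return LFR_GARBLED;     /* FRA disturbed since the last RET */
    if (fra_size > lfr_size)
        return LFR_TOO_LARGE;
    if (fra_size < lfr_size)
        return LFR_TOO_SMALL;
    return LFR_OK;
}

/* Return the message text that belongs to a warning code. */
const char *lfr_message(enum lfr_warn w)
{
    return lfr_text[w];
}
```
.DE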
.LP
Note that the FRA needs a shadow-FRA in order to store the shadow
information when performing a LFR instruction.
.NH 2
Environment interaction.
.PP
The EM machine represented by \fBint\fP can communicate with
the environment in three different ways.
A first possibility is by means of (UNIX) interrupts;
the second by executing (relatively) high level system calls (called
monitor calls).
A third means of interaction, especially interesting for the debugging
programmer, is via internal variables set on the command line.
The first two techniques, and the way they are implemented, will be described
in this section.
The third has been allotted a separate section (3).
.NH 3
Traps and interrupts.
.PP
Simple user programs will generally not mess around with UNIX-signals.
In interpreting these programs, the default actions will be taken
when a signal is received by the program: it gives a message and
stops running.
.LP
There are programs however, which try to handle certain signals
themselves.
In C, this is achieved by the system call \fIsignal(\ sig_no,\ catch\ )\fP,
which calls the handling routine \fIcatch()\fP, as soon as signal
\fBsig_no\fP occurs.
EM does not provide this call; instead, the \fIsigtrp()\fP monitor call
is available for mapping UNIX signals onto EM traps.
This implies that a \fIsignal()\fP call in a C-program
must be translated by the EM library routine to a \fIsigtrp()\fP call in EM.
.PP
The interpreter keeps an administration of the mapping of UNIX-signals
onto EM traps in the array \fIsig_map[NSIG]\fP.
Initially, the signals all have their default values.
Now assume a \fIsigtrp()\fP occurs, telling to map signal \fBsig_no\fP onto
trap \fBtrap_no\fP.
This results in:
.IP 1.
setting the relevant array element
\fIsig_map[sig_no]\fP to \fBtrap_no\fP (after saving the old value),
.IP 2.
catching the next to come \fBsig_no\fP signal with the handling routine
\fIHndlEMSig\fP (by a plain UNIX \fIsignal()\fP of course), and
.IP 3.
returning the saved map-value on the stack so the user can know the previous
trap value onto which \fBsig_no\fP was mapped.
.LP
On an incoming signal,
the handling routine for signal \fBsig_no\fP arms the
correct EM trap by calling the routine \fIarm_trap()\fP with argument
\fIsig_map[sig_no]\fP.
At the end of the EM instruction the proper call of \fItrap()\fP is done.
\fITrap()\fP in its turn examines the value of the \fIHaltOnTrap\fP variable;
if it is set, the interpreter will stop with a message. In the normal case of
controlled trap handling this bit is not on and the interpreter examines
the value of the \fITrapPI\fP variable,
which contains the procedure identifier of the EM trap handling routine.
It then initiates a call to this routine and performs a \fIlongjmp()\fP
to the main
loop to bypass all further processing of the instruction that caused the trap.
\fITrapPI\fP should be set properly by the library routines, through the
SIG instruction.
.LP
In short:
.IP 1.
A UNIX interrupt is caught by the interpreter.
.IP 2.
A handling routine is called which generates the corresponding EM trap
(according to the mapping).
.IP 3.
The trap handler calls the corresponding EM routine which emulates a UNIX
interrupt for the benefit of the interpreted program.
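.LP
The mapping mechanism can be sketched in a few lines of C.
This is not the interpreter's code: the bound SKETCH_NSIG, the marker
NO_TRAP and the routines \fIsketch_sigtrp\fP and \fItake_armed_trap\fP are
hypothetical, and the handler merely records the trap number instead of
calling \fIarm_trap()\fP.
.DS
```c
#include <signal.h>

#define SKETCH_NSIG 65          /* assumed bound on signal numbers */
#define NO_TRAP (-1)            /* assumed marker for "not mapped" */

static volatile sig_atomic_t sig_map[SKETCH_NSIG];
static volatile sig_atomic_t armed_trap = NO_TRAP;

/* Handler: arm the EM trap onto which the signal was mapped. */
static void hndl_em_sig(int sig_no)
{
    armed_trap = sig_map[sig_no];
}

/* Give all signals their default (unmapped) value. */
void init_sig_map(void)
{
    int i;

    for (i = 0; i < SKETCH_NSIG; i++)
        sig_map[i] = NO_TRAP;
}

/* Map sig_no onto trap_no; return the previous mapping. */
int sketch_sigtrp(int sig_no, int trap_no)
{
    int old;

    if (sig_no <= 0 || sig_no >= SKETCH_NSIG)
        return NO_TRAP;
    old = sig_map[sig_no];
    sig_map[sig_no] = trap_no;
    if (signal(sig_no, hndl_em_sig) == SIG_ERR) {
        sig_map[sig_no] = old;  /* could not catch it: undo */
        return NO_TRAP;
    }
    return old;
}

/* Called at the end of an instruction: fetch and clear the armed trap. */
int take_armed_trap(void)
{
    int t = armed_trap;

    armed_trap = NO_TRAP;
    return t;
}
```
.DE
.LP
In the sketch the main loop would call \fItake_armed_trap\fP after each
instruction and pass any value other than NO_TRAP on to the trap routine.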
.PP
When considering UNIX signals, it is important to notice that some of them
are real signals, i.e., messages coming from outside the program, like DEL
and QUIT, but some are actually program-caused synchronous traps, like Illegal
Instruction.  The latter, if they happen, are incurred by the interpreter
itself and consequently are of no concern to the interpreted program: it
cannot catch them.  The present code assumes that the UNIX signals between
SIGILL (4) and SIGSYS (12) are really traps; \fIdo_sigtrp()\fP
will fail on them.
.LP
To avoid losing the last line(s) of output files, the interpreter should
always do a proper close-down, even in the presence of signals.  To this end,
all non-ignored genuine signals are initially caught by the interpreter,
through the routine \fIHndlIntSig\fP, which gives a message and performs a
proper close-down.
Synchronous traps can only be caused by the interpreter itself; they are never
caught, and consequently the UNIX default action prevails.  Generally they
cause a core dump.
Signals requested by the interpreted program are caught by the routine
\fIHndlEMSig\fP, as explained above.
.NH 3
Monitor calls.
.PP
For the convenience of the programmer, as many monitor calls as possible
have been implemented.
The list of monitor calls given in [1] pages 20/21, has been implemented
completely, except for \fIptrace()\fP, \fIprofil()\fP and \fImpxcall()\fP.
The semantics of \fIptrace()\fP and \fIprofil()\fP from an interpreted program
is unclear; the data structure passed to \fImpxcall()\fP is non-trivial
and the system call has low portability and applicability.
For these calls, on invocation a warning is generated, and the arguments which
were meant for the call are popped properly, so the program can continue
without the stack being messed up.
The errorcode 5 (IOERROR) is pushed onto the stack (twice), in order to
fake an unsuccessful monitor call.
No other \- more meaningful \- errorcode is available in the errno-list.
.LP
Now for the implemented monitor calls.
The returned value is zero for a successful call.
When something goes wrong, the value of the external \fIerrno\fP variable
is pushed, thus enabling the user to find out what the reason of failure was.
The implementation of the majority of the monitor calls is straightforward.
Those working with a special format buffer, (e.g. \fIioctl()\fP,
\fItime()\fP and \fIstat()\fP variants), need some extra attention.
This is due to the fact that working with varying word/pointer size
combinations may cause alignment problems.
.LP
The data structure returned by the UNIX system call results from
C code that has been translated with the regular C compiler, which,
on the VAX, happens to be a 4-4 compiler.
The data structure expected by the interpreted program conforms
to the translation by \fBack\fP of the pertinent include file.
Depending on the exact call of \fBack\fP, sizes and alignment may differ.
.LP
An example is in order. The EM MON 18 instruction in the interpreted program
leads to a UNIX \fIstat()\fP system call by the interpreter.
This call fills the given struct with stat information, the contents
and alignments of which are determined by the version of UNIX and the
used C compiler, resp.
The interpreter, like any program wishing to do system calls that fill
structs, has to be translated by a C compiler that uses the
appropriate struct definition and alignments, so that it can use, e.g.,
\fIstab.st_mtime\fP and expect to obtain the right field.
This struct cannot be copied directly to the EM memory to fulfill the
MON instruction.
First, the struct may contain extraneous, system-dependent fields,
pertaining, e.g., to symbolic links, sockets, etc.
Second, it may contain holes, due to alignment requirements.
The EM program runs on an EM machine, knows nothing about these
requirements and expects UNIX Version 7 fields, with offsets as
determined by the em22, em24 or em44 compiler, resp.
To do the conversion, the interpreter has a built-in table of the
offsets of all the fields in the structs that are filled by the MON
instruction.
The appropriate fields from the result of the UNIX \fIstat()\fP are copied
one by one to the appropriate positions in the EM memory to be filled
by MON 18.
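.LP
The field-by-field copy can be sketched as follows.
The layout table is purely illustrative (three 2-byte fields followed by a
4-byte field) and does not reproduce the interpreter's real tables; the
least-significant-byte-first order in EM memory is likewise an assumption of
the sketch.
The names \fIem_field\fP, \fIem_store\fP and \fIem_fill_struct\fP are
hypothetical.
.DS
```c
#include <stddef.h>

struct em_field {
    size_t offset;              /* offset of the field in EM memory */
    size_t size;                /* size of the field in bytes */
};

/* Illustrative layout only, not taken from the interpreter. */
static const struct em_field em_layout[] = {
    { 0, 2 }, { 2, 2 }, { 4, 2 }, { 6, 4 }
};

#define EM_NFIELDS (sizeof em_layout / sizeof em_layout[0])
#define EM_STRUCT_SIZE 10

/* Store one value at its EM offset, least significant byte first. */
static void em_store(unsigned char *em, const struct em_field *f,
                     unsigned long value)
{
    size_t i;

    for (i = 0; i < f->size; i++) {
        em[f->offset + i] = (unsigned char)(value & 0xFF);
        value >>= 8;
    }
}

/* Copy the host values one by one to their positions in EM memory. */
void em_fill_struct(unsigned char *em, const unsigned long values[])
{
    size_t i;

    for (i = 0; i < EM_NFIELDS; i++)
        em_store(em, &em_layout[i], values[i]);
}
```
.DE
.LP
In such a scheme the caller reads each value from the named field of the
host struct, so holes and extraneous fields in the host struct never reach
EM memory.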
.PP
The \fIioctl()\fP call (MON 54) poses additional problems. Not only does it
have a second argument which is a pointer to a struct, the type of
which is dynamically determined, but its first argument is an opcode
that varies considerably between the versions of UNIX.
To solve the first problem, the interpreter examines the opcode (request) and
treats the second argument accordingly.  The second problem can be solved by
translating the UNIX Version 7 \fIioctl()\fP request codes to their proper
values on the various systems.  This is, however, not always useful, since
some EM run-time systems use the local request codes.  There is a compile-time
flag, V7IOCTL, which, if defined, will restrict the \fIioctl()\fP call to the
version 7 request codes and emulate them on the local system; otherwise the
request codes of the local system will be used (as far as implemented).
.PP
Minor problems also showed up with the implementation of \fIexecve()\fP
and \fIfork()\fP.
\fIExecve()\fP expects three pointers on the stack.
The first points to the name of the program to be executed,
the second and third are the beginnings of the \fBargv\fP and \fBenvp\fP
pointer arrays respectively.
We cannot pass these pointers to the system call however, because
the EM addresses to which they point do not correspond with UNIX
addresses.
Moreover, (it is not very likely to happen but) what if someone constructs
a program holding the contents for one of these pointers in the stack?
The stack is implemented upside down, so passing the pointer to
\fIexecve()\fP causes trouble for this reason too.
The only solution was to copy the pointer contents completely
to fresh UNIX memory, constructing vectors which can be passed to the
system call.
Any impending memory fault while making these copies results in failure of the
system call, with \fIerrno\fP set to EFAULT.
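.LP
Copying one string out of EM memory might look as follows.
The sketch treats EM memory as a plain byte array indexed by EM address,
which ignores the upside-down stack mentioned above; the name
\fIem_copy_string\fP is hypothetical.
.DS
```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Copy the null-terminated string at EM address addr to fresh memory.
   Return NULL with errno set to EFAULT if it runs off EM memory. */
char *em_copy_string(const unsigned char *em, size_t em_size, size_t addr)
{
    const unsigned char *end;
    size_t len;
    char *copy;

    if (addr >= em_size) {
        errno = EFAULT;
        return NULL;
    }
    end = memchr(em + addr, 0, em_size - addr);
    if (end == NULL) {
        errno = EFAULT;         /* no terminator inside EM memory */
        return NULL;
    }
    len = (size_t)(end - (em + addr));
    copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;            /* errno as left by malloc */
    memcpy(copy, em + addr, len + 1);
    return copy;
}
```
.DE
.LP
The \fBargv\fP and \fBenvp\fP vectors would be built by applying the same
routine to each element in turn and releasing the copies if any of them
fails.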
.PP
The implementation of the \fIfork()\fP call confronted us with problems
concerning IO-channels.
Checking messages (as well as logging) must be divided over different files.
Otherwise, these messages will coincide.
This problem was solved by post-fixing the default message file
\fBint.mess\fP (as well as the logging file \fBint.log\fP) with an
automatically leveled number for every new forked process.
Children of the original process do their diagnostics
in files with postfix 1,2,3 etc.
Second generation processes are assigned files numbered 11, 12, 21 etc.
When 6 generations of processes exist at one moment, the seventh will
get the same message file as the sixth, because the filename would
otherwise become too long.
.PP
Some of the monitor calls receive pointers (addresses) from the program, to be
passed to the kernel; examples are the struct stat for \fIstat()\fP, the area
to be filled for \fIread()\fP, etc. If the address is wrong, the kernel does
not generate a trap, but rather the system call returns with failure, while
\fIerrno\fP is set to EFAULT.  This is implemented by consistent checking of
all pointers in the MON instruction.
.NH 2
Internal arithmetic.
.PP
Doing arithmetic on signed integers, the smallest negative integer
(\fIminsint\fP) is considered a legal value.
This is in contradiction with the EM Manual [1], page 14, which proposes using
\fIminsint\fP for uninitialized integers.
The shadow bytes already check for uninitialized integers however,
so we do not need this special illegal value.
Although the EM Manual provides two traps, for undefined integers and floats,
undefined objects occur so frequently (e.g. in block copying partially
initialized areas) that the interpreter just gives a warning.
.LP
Except for arithmetic on unsigneds, all arithmetic checks for overflow.
The value that is pushed on the stack after an overflow occurs depends
on the UNIX behavior with regard to that particular calculation.
If UNIX would not accept the calculation (e.g. division by zero), a zero
is pushed as a convention.
Illegal computations which UNIX does accept in silence (e.g. one's
complement of \fIminsint\fP), simply push the UNIX-result after giving a
trap message.
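.LP
A sketch of these conventions for 4-byte signed integers follows.
It assumes a two's complement host on which conversion from unsigned to
signed wraps around, as on the usual UNIX machines; the names \fIadd_i4\fP
and \fIdiv_i4\fP are hypothetical.
Each routine returns 1 if a trap message is due and 0 otherwise, and always
delivers the value to be pushed.
.DS
```c
#include <stdint.h>

/* Add two 4-byte signed integers; report overflow. */
int add_i4(int32_t a, int32_t b, int32_t *res)
{
    int overflow = (b > 0 && a > INT32_MAX - b)
                || (b < 0 && a < INT32_MIN - b);

    /* Unsigned addition wraps; converting back to signed is
       implementation-defined and wraps on two's complement hosts. */
    *res = (int32_t)((uint32_t)a + (uint32_t)b);
    return overflow;
}

/* Divide two 4-byte signed integers; report illegal cases. */
int div_i4(int32_t a, int32_t b, int32_t *res)
{
    if (b == 0) {
        *res = 0;               /* not accepted by the host: push zero */
        return 1;
    }
    if (a == INT32_MIN && b == -1) {
        *res = INT32_MIN;       /* the wrapped result of the overflow */
        return 1;
    }
    *res = a / b;
    return 0;
}
```
.DE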
.NH 2
Shadow bytes implementation.
.PP
A great deal of run-time checking is performed by the interpreter (except if
used in the fast version).
This section gives all details about the shadow bytes.
In order to keep track of information about the contents of D-space (stack
and global data area), there is one shadow-byte for each byte in these spaces.
Each bit in a shadow-byte represents some piece
of information about the contents of its corresponding 'sun-byte'.
All bits off indicates an undefined sun-byte.
One or more bits on always guarantees a well-defined sun-byte.
The bits have the following meaning:
.IP "\(bu bit 0:" 8
indicates that the sun-byte is (a part of) an integer.
.IP "\(bu bit 1:" 8
the sun-byte is a part of a floating point number.
.IP "\(bu bit 2:" 8
the sun-byte is a part of a pointer in dataspace.
.IP "\(bu bit 3:" 8
the sun-byte is a part of a pointer in the instruction space.
According to [1] (paragraph 6.4), there are two types of pointers which
must be distinguishable.
Conversion between these two types is impossible.
The shadow-bytes make the distinction here.
.IP "\(bu bit 4:" 8
protection bit.
Indicates that the sun-byte is part of a protected piece of memory.
There is a protected area in the stack, the Return Status Block.
The EM machine language has no possibility to declare protected
memory, as is possible in EM assembly (the ROM instruction).  The protection
bit is, however, set for the line number and filename pointer area near
location 0, to aid in catching references to location 0.
.IP "\(bu bit 5/6/7:" 8
free for later use.
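.LP
The bit assignment above translates directly into masks.
The macro and routine names below are chosen for the example and are not
necessarily those declared in \fIshadow.h\fP.
.DS
```c
#define SH_INT    0x01          /* bit 0: part of an integer */
#define SH_FLOAT  0x02          /* bit 1: part of a floating point number */
#define SH_DATAP  0x04          /* bit 2: part of a data space pointer */
#define SH_INSP   0x08          /* bit 3: part of an instruction pointer */
#define SH_PROT   0x10          /* bit 4: protected memory */

/* All bits off means that the sun-byte is undefined. */
int sh_is_defined(unsigned char sh)
{
    return sh != 0;
}

/* Is the sun-byte part of a pointer of either type? */
int sh_is_pointer(unsigned char sh)
{
    return (sh & (SH_DATAP | SH_INSP)) != 0;
}

/* A store is refused if the sun-byte lies in protected memory. */
int sh_may_store(unsigned char sh)
{
    return (sh & SH_PROT) == 0;
}
```
.DE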
.LP
The shadow bytes are managed by the routines declared in \fIshadow.h\fP.
The warnings originating from checking these shadow-bytes during
run-time are various.
A list of them is given in appendix A, together with suggestions
(primarily for the C-programmer) where to look for the trouble maker(s).
.LP
A point to notice is that, once a warning is generated, it may be repeated
thousands of times.
Since repetitive warnings carry little information, but consume much
file space, the interpreter keeps track of the number of times a given warning
has been produced from a given line in a given file.
The warning message will
be printed only if the corresponding counter is a power of four (starting at
1).  In this way, a logarithmic back-off in warning generation is established.
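.LP
The test on the counter can be sketched as follows; the name
\fIshould_print\fP is hypothetical.
.DS
```c
/* Return 1 if count is a power of four (1, 4, 16, 64, ...), else 0. */
int should_print(unsigned long count)
{
    if (count == 0)
        return 0;
    while (count % 4 == 0)
        count /= 4;
    return count == 1;
}
```
.DE
.LP
With this test the first, 4th, 16th and 64th occurrences of a warning are
printed, so a warning that occurs a million times yields only ten lines of
output.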
.LP
It might be argued that the counter should be kept for each (warning, PC
value) pair rather than for each (warning, file position) pair.  Suppose,
however, that two instructions in a given line would cause the same message
regularly; this would produce two intertwined streams of identical messages,
with their counters jumping up and down.  This does not seem desirable.
.NH 2
Return Status Block (RSB)
.PP
According to the description in [1], at least the return address and the
base address of the previous RSB have to be pushed when performing a call.
Besides these two pointers, other information can be stored in the RSB
also.
The interpreter pushes the following items:
.IP \-
a pointer to the current filename,
.IP \-
the current line number (always four bytes),
.IP \-
the Local Base,
.IP \-
the return address (Program Counter),
.IP \-
the current procedure identifier,
.IP \-
the RSB code, which distinguishes between initial start-up, normal call,
returnable trap and non-returnable trap (a word-size integer).
.LP
Consequently, the size of the RSB varies, depending on
word size and pointer size; its value is available as \fIrsbsize\fP.
When the RSB is removed from the stack (by a RET or RTT) the RSB code is under
the Stack Pointer for immediate checking.  It is not clear what should be done
if RSB code and return instruction do not match; at present we give a message
and continue, for what it is worth.
.PP
The reason for pushing filename and line number is that some front-ends tend
to forget the LIN and FIL instructions after returning from a function.
This may result in error messages in wrong source files and/or line numbers.
.PP
The procedure identifier is kept and restored to check that the PC will not
move out of the running procedure.  The PI is an index in the proctab, which
tells the limits in the text segment of the running procedure.
.PP
If the Return Status Block is generated as a result of a trap, more is
stacked.  Before stacking the normal RSB, the trap function pushes the
following items:
.IP \-
the contents of the entire Function Return Area,
.IP \-
the number of bytes significant in the above (a word-size integer),
.IP \-
a word-size flag indicating if the contents of the FRA are valid,
.IP \-
the trap number (a word-size integer).
.LP
The latter is followed directly by the RSB, and consequently acts as the only
parameter to the trap handler.
.NH 2
Operand access.
.PP
The EM Manual mentions two ways to access the operands of an instruction.  It
should be noticed that the operand in EM is often not the direct operand of the
operation; the operand of the ADI instruction, e.g., is the width of the
integers to be added, not one of the integers themselves.  The various operand
types are described in [1].  Each opcode in the text segment identifies an
instruction with a particular operand type; these relations are described in
computer-readable format in a file in the EM tree, \fIip_spec.t\fP.
.PP
The interpreter uses a variant of the second method.  Several other approaches
can be designed, with increasing efficiency and equally increasing complexity.
They are briefly treated below.
.NH 3
The Dispatch Table, Method 1.
.PP
When the interpreter starts, it reads the ip_spec.t file and constructs from it
a dispatch table.  This table (of which there are actually three,
for primary, secondary
and tertiary opcodes) has 256 entries, each describing an instruction with
indications on how to decode the operand.  For each instruction executed, the
interpreter finds the entry in the dispatch table, finds information there on
how to access the operand, constructs the operand and calls the appropriate
routine with the operand as calculated.  There is one routine for each
instruction, which is called with the ready-made operand.  Method 1 is easy to
program but requires constant interpretation of the dispatch table.
.NH 3
Intelligent Routines, Method 2.
.PP
For each opcode there is a separate routine, and since an opcode uniquely
defines the instruction and the operand format, the routine knows how to get
the operand; this knowledge is built into the routine.  Preferably the heading
of the routine is generated automatically from the ip_spec.t file.  Operand
decoding is immediate, and no dispatch table is needed.  Generation of the
469 required routines is, however, far from simple.  Either a generated array
of routine names or a generated switch statement is used to map the opcode onto
the correct routine.  The switch approach has the advantage that parameters can
be passed to the routines.
.LP
The interpreter uses a variant of the switch statement scheme.  Numerical
information that can be deduced from the opcode is passed as parameters to the
routine; this includes the argument of minis, the high order byte of shorties,
and the fact that the result is to be multiplied by the word size.  This
reduces the number of required routines to 338.
.NH 3
Intelligent Calls.
.PP
The call in the switch statement does full operand construction, and the
resulting operand is passed to the routine.  This reduces the number of
routines to 133, the number of EM instructions.  Generation of the switch
statement from ip_spec.t will be complicated, but the routine space will be
much cleaner.  This will not give any speed-up since the same actions are still
required; they are just performed in a different place.
.NH 3
Static Evaluation.
.PP
It can be observed that the evaluation of the operand of a given instruction in
the text segment will always give the same result.  It is therefore possible to
preprocess the text segment, decomposing the instructions into structs which
contain the address, the instruction code and the operand.  No operand decoding
will be necessary at run-time: all operands have been precalculated.  This will
probably give a considerable speed-up.  Jumps, especially GTO jumps, will,
however, require more attention.
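.LP
The preprocessing step can be illustrated on a toy encoding.
The encoding below is NOT that of EM: it is assumed, for the example only,
that opcodes below 0x80 carry a one-byte operand and all others a two-byte
operand, least significant byte first.
The names \fIdecoded\fP and \fIpredecode\fP are hypothetical.
.DS
```c
#include <stddef.h>

struct decoded {
    size_t addr;                /* address in the text segment */
    unsigned char op;           /* instruction code */
    long operand;               /* precalculated operand */
};

/* Decode up to cap instructions from text; return the number decoded.
   Decoding stops at the first incomplete instruction. */
size_t predecode(const unsigned char *text, size_t len,
                 struct decoded *out, size_t cap)
{
    size_t pc = 0, n = 0;

    while (pc < len && n < cap) {
        unsigned char op = text[pc];
        size_t need = (op < 0x80) ? 2 : 3;

        if (len - pc < need)
            break;
        out[n].addr = pc;
        out[n].op = op;
        if (op < 0x80)
            out[n].operand = text[pc + 1];
        else
            out[n].operand = (long)(text[pc + 1] | (text[pc + 2] << 8));
        pc += need;
        n++;
    }
    return n;
}
```
.DE
.LP
Keeping the original address in each struct is one way to let a jump find
its target among the decoded instructions; the sketch itself does not
resolve jumps.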
.NH 2
Disassembly.
.PP
A disassembly facility is available, which gives a readable but not
letter-perfect disassembly of the EM object.  The procedure structure is
indicated by placing the indication  \fBP[n]\fP  at the entry point of each
procedure, where \fBn\fP is the procedure identifier.  The number of locals is
given in a comment.
.LP
The disassembler was generated by the software in the directory \fIswitch\fP
and then further processed by hand.