.\" Introduction
.\"
.\" $Id$
.NH
INTRODUCTION.
.PP
This document describes an EM interpreter which does extensive checking.
The interpreter exists in two versions: the normal version with full checking
and debugging facilities, and a fast stripped version that does interpretation
only.
This document assumes that the full version is used.
.LP
First the virtual EM machine embodied by the interpreter (called \fBint\fP) is
described, followed by some remarks on performance.
The second section gives some specific implementation decisions.
Section three explains the usage of the built-in debugging tool.
.LP
Appendix A gives an overview of the various warnings \fBint\fP gives,
with possible causes and solutions.
Appendix B is a simple tutorial on the use of \fBint\fP.
A separate manual page exists.
.PP
The document assumes a good understanding of what EM is and what
the assembly code looks like [1].
Notions like 'procedure descriptor', 'mini', 'shortie' etc. are not
explained.
In what follows, any word in \fIthis font\fP is the name of a
variable, constant, function or other identifier that is used under
the same name in the source code.
.LP
To avoid confusion: \fBint\fP interprets EM machine language (e.out files),
\fInot\fP the assembly language (.e files) and \fInot\fP the compact
code (.k files).
.NH 2
The virtual EM machine.
.PP
The memory layout of the virtual EM machine represented by the interpreter
differs in details from the description in [1].
Virtual memory is split up into two separate spaces:
one space containing the instructions (I-space),
the other all the data, including stack and heap (D-space).
The procedure descriptors are preprocessed and stored in a separate array,
\fIproctab[]\fP.
Both spaces start off at address 0.
This is possible because pointers in the two different spaces are
distinguishable by context (and by shadow-bytes: see 2.6).
.NH 3
Instruction Space
.PP
Figure 1 shows the I-space, together with the position of some important
EM registers.
.Dr 12
    NEXT -->  |________________|  <-- DB    \e
              |                |            |
              |                |            |  T
              |                |  <-- PC    |
              |    Program     |            |  e
              |                |            |
              |      Text      |            |  x
              |                |            |
              |                |            |  t
       0 -->  |________________|  <--(PB)   /
.Df
\fI Fig 1. Virtual instruction space (I-space).\fP
.De
.PP
The I-space is just big enough to contain all the instructions.
The size needed for the program text (\fINTEXT\fP) is found from the
header-bytes of the loadfile.
Legal values for the program counter (\fIPC\fP) consist of all
addresses in the range from 0 through \fINTEXT\fP \- 1.
If the \fIPC\fP is made to point to an illegal address, a trap will occur.
.NH 3
The Procedure Table
.PP
The \fINProc\fP constant indicates how many procedure descriptors there
are in the \fIproctab\fP array.
Each element of this array describes one procedure: it holds the number of
locals, the entry point, and the entry point of the textually following
procedure.
The two entry points are used to enforce the restriction that the program
counter may not wander from one procedure into another.
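.LP
As an illustration only, the two program counter checks described so far
(the range check against \fINTEXT\fP and the check against the bounds of the
procedure being executed) could be sketched in C as follows.
The struct layout, the type "ptr" and all field and function names are
assumptions made for this sketch; they are not taken from the \fBint\fP
sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned long ptr;      /* assumed type for an I-space address */

/* Hypothetical descriptor; the field names are invented for this sketch. */
struct proc {
    unsigned long nlocals;      /* number of locals */
    ptr entry;                  /* entry point of this procedure */
    ptr next_entry;             /* entry point of the textually following one */
};

/* Legal PC values are 0 through NTEXT - 1. */
bool pc_in_text(ptr pc, ptr ntext)
{
    return pc < ntext;
}

/* The PC may not wander out of the procedure it is executing in. */
bool pc_in_proc(const struct proc *p, ptr pc)
{
    assert(p != NULL);
    return pc >= p->entry && pc < p->next_entry;
}
```

A real interpreter would raise a trap where this sketch merely returns false.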
.NH 3
The Data Space
.PP
Figure 2 shows the layout of the data space, which closely conforms to the EM
Manual.
.Dr 36
                    __________________
maxaddr(psize) -->  |                |  <-- ML    \e
                    |                |            |  S
                    |     Locals     |            |  t
                    |       &        |            |  a
                    |      RSBs      |            |  c
                    |                |            |  k
                    |________________|  <-- SP    /
                    .                .
                    .                .
                    .     Unused     .
                    .                .
                    .                .
                    .                .
                    .                .
                    .                .
                    .     Unused     .
                    .                .
                    .                .
                    |________________|  <-- HP
                    |                |            \e
                    |      Heap      |            |
                    |________________|  <-- HB    |
                    |                |            |  D
                    |   Arguments    |            |
                    |    Environ     |            |  a
                    |    _ _ _ _     |            |
                    |                |            |  t
                    |                |            |
                    |                |            |  a
                    |  Global data   |            |
                    |                |            |
                    |                |            |
             0 -->  |________________|  <--(EB)   /
.Df
\fI Fig 2. Virtual dataspace (D-space).\fP
.De
.PP
D-space begins at address 0 and ends at the largest address
representable by the pointer size (\fIpsize\fP) being used.
For a 2-byte pointer size this maximum address is
.DS
((2 ^ 16 \- 1) / word size * word size) \- 1
.DE
and for a 4-byte pointer size it is
.DS
((2 ^ 31 \- 1) / word size * word size) \- 1
.DE
(not 2 ^ 32, to allow illegal pointers to be implemented in the future).
The division is an integer division, so the rounding construction makes
ML+1 a multiple of the word size that still fits in a pointer;
this is required to make ML+1 expressible as the
initialisation value of LB and SP.
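.LP
The two formulas can be checked with a few lines of C.
This is a sketch under the assumption that pointer and word sizes are given
in bytes; the function name "max_addr" is made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Largest D-space address for the given pointer and word size (in bytes). */
uint32_t max_addr(unsigned psize, uint32_t wsize)
{
    assert(psize == 2 || psize == 4);
    assert(wsize > 0);
    /* 2^16 - 1 for 2-byte pointers, 2^31 - 1 for 4-byte pointers */
    uint32_t top = (psize == 2) ? UINT32_C(0xFFFF) : UINT32_C(0x7FFFFFFF);
    return top / wsize * wsize - 1;   /* integer division rounds down */
}
```

With a 2-byte pointer and a 2-byte word this yields 65533, so ML+1 is 65534,
which is word-aligned and still fits in 16 bits.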
.PP
D-space is split into two partitions: Data and Stack (indicated by the
brackets).
The Data partition holds the global data area (GDA) and the heap.
Its initial size is given by the loadfile constant SZDATA.
Some space is added to it, because arguments and environment are
also stored here.
This total is fixed for the duration of the interpretation.
The heap, however, may grow during execution (e.g. through dynamic
allocation), so the Data partition as a whole is of variable size.
Initially, the size of the Data partition is the sum of the space needed
by the GDA (including the space needed for arguments and environment) and
the initial heap space.
The lowest legal Data address is 0; the highest is \fIHP\fP \- 1.
.LP
The Stack partition holds the stack.
It begins at the highest available D-space address, and grows
towards the low addresses, so the Stack partition is of variable size too.
The lowest legal Stack address is the stack pointer (\fISP\fP),
the highest is the memory limit (\fIML\fP).
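.LP
The legal address ranges of the two partitions can be summarised in a small
C sketch.
The enum, the type "ptr" and the function name are hypothetical and serve
only to restate the ranges given above; the sketch additionally assumes that
the heap never grows past the stack pointer.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t ptr;               /* assumed type for a D-space address */

enum region { IN_DATA, IN_STACK, IN_NEITHER };

/* Classify D-space address a, given the current HP, SP and ML. */
enum region classify(ptr a, ptr hp, ptr sp, ptr ml)
{
    assert(hp <= sp);               /* assumption: partitions do not overlap */
    if (a < hp)
        return IN_DATA;             /* Data:  0 through HP - 1 */
    if (a >= sp && a <= ml)
        return IN_STACK;            /* Stack: SP through ML */
    return IN_NEITHER;              /* the unused gap in between */
}
```

Note that with SP at its initial value ML+1 the stack range is empty, so no
address is classified as a stack address.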
.NH 2
Physical lay-out
.PP
Each partition is mapped onto a piece of physical memory with the
same name: \fItext\fP (fig. 1), \fIstack\fP and \fIdata\fP (fig. 2).
These are the storage structures which \fBint\fP uses to physically
store the contents of the virtual EM spaces.
Figure 2 thus shows the mapping of D-space onto two
different physical parts: \fIstack\fP and \fIdata\fP.
The I-space is represented by one physical part: \fItext\fP.
.LP
Each time more space is needed, the partition concerned is reallocated,
with the new size being computed with the formula:
.DS
\fInew size\fP = 1.5 \(mu (\fIold size\fP + \fIextra\fP)
.DE
Here \fIextra\fP is the number of bytes by which the space needed exceeds
the \fIold size\fP.
One can prove that using this method, there is a
linear relationship between allocation time and needed partition size.
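.LP
The growth rule and its linear cost can be illustrated with the following C
sketch.
It is not the interpreter's own code: the function names are made up, the
factor 1.5 is computed in integer arithmetic (rounding down), and the cost
model assumes that every reallocation copies the complete old contents.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* new size = 1.5 * (old size + extra), rounded down */
size_t grown_size(size_t old_size, size_t extra)
{
    assert(extra <= SIZE_MAX - old_size);   /* sketch: no overflow handling */
    size_t needed = old_size + extra;
    return needed + needed / 2;
}

/*
 * Worst case: grow from 1 byte to at least target bytes while asking for
 * only one extra byte each time; returns the total number of bytes copied.
 */
size_t bytes_copied_growing_to(size_t target)
{
    size_t size = 1, copied = 0;
    while (size < target) {
        copied += size;                     /* old contents are copied */
        size = grown_size(size, 1);
    }
    return copied;
}
```

Because every step multiplies the size by at least 1.5, the sizes copied form
a geometric series; in this sketch the total stays below three times the
target size, which is the linear relationship mentioned above.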
.PP
A virtual D-space starting at address 0 is in accordance with
the definition in [1], p. 3\-6.
The main reason for having D-space start at address 0 is that it induces
a one-to-one correspondence between the heap and GDA
addresses on the virtual machine (and hence the definition) on one hand,
and the offset within the \fIdata\fP partition on the other.
This implies that no extra calculation is needed to perform load and
store operations.
.LP
Some calculation however cannot be avoided, because the stack part of
the D-space grows downwards by EM definition.
The first address of the virtual stack (\fIML\fP, the maximum address for
the given \fIpsize\fP) is mapped onto the
beginning of the \fIstack\fP partition.
When the stack grows (i.e. EM addresses get lower), the offset within the
\fIstack\fP partition gets higher.
By taking offset \fIML \- A\fP in the stack partition, one obtains the
physical address corresponding to some virtual EM (stack) address \fIA\fP.
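.LP
Both address translations fit in a few lines of C.
The sketch below only restates the two mappings described above; the type
"ptr" and the function names are assumptions, not identifiers from the
\fBint\fP sources.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t ptr;               /* assumed type for a D-space address */

/* Heap and GDA addresses are used directly as offsets in the data part. */
ptr data_offset(ptr a)
{
    return a;
}

/* Stack address a lies at offset ML - a in the stack part. */
ptr stack_offset(ptr ml, ptr a)
{
    assert(a <= ml);                /* a must not exceed the memory limit */
    return ml - a;
}
```

For example, with ML equal to 65533 the address ML itself maps to offset 0,
and the address four bytes below it maps to offset 4.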
.NH 2
Speed.
.PP
From several test results with both versions of the interpreter, the
following may be concluded.
The speed of the interpreter depends strongly on the type of
program being interpreted.
If plain CPU arithmetic is performed, the interpreter is
relatively slow (about 1000 \(mu the execution time of the cc version).
When the program mainly manipulates the stack, the interpreter is
quite fast (about 100 \(mu the execution time of the cc version).
.LP
Most programs however will not be this extreme, so an interpretation
time of somewhere between 300 and 500 times that of direct execution
is to be expected for a normal program.
.LP
The fast version runs in about 60% of the time of the full version, at the
expense of considerably reduced functionality.
Tallying costs about 10%.