.\"	Introduction
.\"
.\"	$Id$
.NH
INTRODUCTION.
.PP
This document describes an EM interpreter that performs extensive checking.
The interpreter exists in two versions: the normal version with full checking
and debugging facilities, and a fast stripped version that does interpretation
only.
This document assumes that the full version is used.
.LP
First the virtual EM machine embodied by the interpreter (called \fBint\fP) is
described, followed by some remarks on performance.
The second section discusses some specific implementation decisions.
Section three explains the use of the built-in debugging tool.
.LP
Appendix A gives an overview of the various warnings that \fBint\fP issues,
with possible causes and solutions.
Appendix B is a simple tutorial on the use of \fBint\fP.
A separate manual page exists.
.PP
The document assumes a good understanding of what EM is and what
the assembly code looks like [1].
Notions like 'procedure descriptor', 'mini', 'shortie' etc. are not
explained.
From here on, any word in \fIthis font\fP refers to the name of a
variable, constant, function or other identifier that is used in the
source code under the same name.
.LP
To avoid confusion: \fBint\fP interprets EM machine language (e.out files),
\fInot\fP the assembly language (.e files) and \fInot\fP the compact
code (.k files).
.NH 2
The virtual EM machine.
.PP
The memory layout of the virtual EM machine represented by the interpreter
differs in some details from the description in [1].
Virtual memory is split up into two separate spaces:
one space containing the instructions (I-space),
the other containing all the data, including stack and heap (D-space).
The procedure descriptors are preprocessed and stored in a separate array,
\fIproctab[]\fP.
Both spaces start off at address 0.
This is possible because pointers in the two different spaces are
distinguishable by context (and shadow-bytes: see 2.6).
.NH 3
Instruction Space
.PP
Figure 1 shows the I-space, together with the position of some important
EM registers.
.Dr 12
                      NEXT -->  |________________|  <-- DB    \e
                                |                |            |
                                |                |            |  T
                                |                |  <-- PC    |
                                |     Program    |            |  e
                                |                |            |
                                |      Text      |            |  x
                                |                |            |
                                |                |            |  t
                         0 -->  |________________|  <--(PB)   /
.Df
\fI Fig 1. Virtual instruction space (I-space).\fP
.De
.PP
The I-space is just big enough to contain all the instructions.
The size needed for the program text (\fINTEXT\fP) is found from the
header-bytes of the loadfile.
Legal values for the program counter (\fIPC\fP) consist of all
addresses in the range from 0 through \fINTEXT\fP \- 1.
If the \fIPC\fP is made to point to an illegal address, a trap will occur.
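.LP
The range rule above can be sketched as a single comparison.
The helper name \fIpc_is_legal\fP and its parameter types are hypothetical
and are not taken from the \fBint\fP sources; how the trap itself is raised
is not shown.
.DS
```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of int: true if pc is a legal
 * program counter value, i.e. lies in the range 0 .. ntext-1. */
static bool pc_is_legal(unsigned long pc, unsigned long ntext)
{
    /* pc is unsigned, so the lower bound 0 always holds. */
    return pc < ntext;
}
```
.DE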
.NH 3
The Procedure Table
.PP
The \fINProc\fP constant indicates how many procedure descriptors there
are in the \fIproctab\fP array.
For each procedure, the corresponding element of this array contains the
number of locals, the entry point and the entry point of the textually
following procedure.
This information is used to test the restriction that the program counter
may not wander from procedure to procedure.
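.LP
As an illustration only, a descriptor holding the three items mentioned
above, and a test that a program counter value stays inside one procedure,
might look as follows.
The structure name, the field names and the helper \fIpc_in_proc\fP are
assumptions made for this sketch; the actual declarations in \fBint\fP
may differ.
.DS
```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical descriptor layout; names are illustrative only. */
struct proc_desc {
    long nlocals;         /* number of locals */
    unsigned long entry;  /* entry point in I-space */
    unsigned long next;   /* entry point of the textually following procedure */
};

/* True if pc lies within the text of procedure p: entry <= pc < next. */
static bool pc_in_proc(const struct proc_desc *p, unsigned long pc)
{
    return pc >= p->entry && pc < p->next;
}
```
.DE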
.NH 3
The Data Space
.PP
Figure 2 shows the layout of the data space, which closely conforms to the EM
Manual.
.Dr 36
                                __________________
            maxaddr(psize) -->  |                |  <-- ML    \e
                                |                |            |  S
                                |     Locals     |            |  t
                                |       &        |            |  a
                                |      RSBs      |            |  c
                                |                |            |  k
                                |________________|  <-- SP    /
                                .                .
                                .                .
                                .     Unused     .
                                .                .
                                .                .
                                .                .
                                .                .
                                .                .
                                .     Unused     .
                                .                .
                                .                .
                                |________________|  <-- HP
                                |                |            \e
                                |      Heap      |            |
                                |________________|  <-- HB    |
                                |                |            |  D
                                |    Arguments   |            |
                                |     Environ    |            |  a
                                |  _   _   _   _ |            |
                                |                |            |  t
                                |                |            |
                                |                |            |  a
                                |   Global data  |            |
                                |                |            |
                                |                |            |
                         0 -->  |________________|  <--(EB)   /
.Df
\fI Fig 2. Virtual dataspace (D-space).\fP
.De
.PP
D-space begins at address 0, and ends at the largest address
representable by the pointer size (\fIpsize\fP) being used;
for a 2-byte pointer size this maximum address is
.DS
((2 ^ 16 \- 1) / word size * word size) \- 1
.DE
for a 4-byte pointer size it is
.DS
((2 ^ 31 \- 1) / word size * word size) \- 1
.DE
(not 2 ^ 32, to allow illegal pointers to be implemented in the future).
The rounding construction, an integer division by the word size followed by
a multiplication by it, is required to make ML+1 expressible as the
initialisation value of LB and SP.
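.LP
The two formulas can be checked with a few lines of C.
The function name \fImax_addr\fP is hypothetical and the sketch assumes
that the pointer size is 2 or 4 and that the word size is positive.
With a pointer size of 2 and a word size of 2, for example, it yields
65533, so that ML+1 is 65534, a multiple of the word size.
.DS
```c
#include <assert.h>

/* Hypothetical helper: highest D-space address (ML) for a given
 * pointer size and word size. Assumes psize is 2 or 4 and wsize > 0.
 * Integer division truncates, which gives the rounding down. */
static unsigned long max_addr(int psize, unsigned long wsize)
{
    /* 2^16 - 1 for 2-byte pointers, 2^31 - 1 for 4-byte pointers. */
    unsigned long top = (psize == 2) ? 0xFFFFUL : 0x7FFFFFFFUL;

    return top / wsize * wsize - 1;
}
```
.DE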
.PP
D-space is split into two partitions: Data and Stack (indicated by the
brackets).
The Data partition holds the global data area (GDA) and the heap.
Its initial size is given by the loadfile constant SZDATA.
Some space is added to it, because arguments and environment are
also stored here.
This total size is fixed during interpretation.
However, the heap may grow during execution (e.g. through dynamic
allocation), which makes the size of the Data partition variable.
Initially, the size of the Data partition is the sum of the space needed
by the GDA (including the space needed for arguments and environment) and
the initial heapspace.
The lowest legal Data address is 0; the highest is \fIHP\fP \- 1.
.LP
The Stack partition holds the stack.
It begins at the highest available D-space address, and grows
towards the low addresses, so the Stack partition is of variable size too.
The lowest legal Stack address is the stackpointer (\fISP\fP),
the highest is the memory limit (\fIML\fP).
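.LP
The address ranges of the two partitions can be summarised in a small
classification routine.
This is a sketch under the stated bounds only: the names \fIclassify\fP,
\fIIN_DATA\fP, \fIIN_STACK\fP and \fIIN_GAP\fP are invented here, and
the checks that \fBint\fP actually performs may be organised differently.
.DS
```c
#include <assert.h>

/* Hypothetical classification of a D-space address. */
enum region { IN_DATA, IN_STACK, IN_GAP };

static enum region classify(unsigned long a, unsigned long hp,
                            unsigned long sp, unsigned long ml)
{
    if (a < hp)
        return IN_DATA;   /* legal Data addresses: 0 .. HP-1 */
    if (a >= sp && a <= ml)
        return IN_STACK;  /* legal Stack addresses: SP .. ML */
    return IN_GAP;        /* unused area, or beyond ML */
}
```
.DE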
.NH 2
Physical lay-out
.PP
Each partition is mapped onto a piece of physical memory with the
same name: \fItext\fP (fig. 1), \fIstack\fP and \fIdata\fP (fig. 2).
These are the storage structures which \fBint\fP uses to physically
store the contents of the virtual EM spaces.
Figure 2 thus shows the mapping of D-space onto two
different physical parts: \fIstack\fP and \fIdata\fP.
The I-space is represented by one physical part: \fItext\fP.
.LP
Each time more space is needed, the partition concerned is reallocated,
with the new size being computed with the formula:
.DS
\fInew size\fP = 1.5 \(mu (\fIold size\fP + \fIextra\fP)
.DE
where \fIextra\fP is the number of bytes by which the request exceeds the
\fIold size\fP.
One can prove that using this method, there is a
linear relationship between allocation time and needed partition size.
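.LP
A minimal sketch of the size computation is given here.
The name \fIgrown_size\fP is hypothetical, and truncating the result to
a whole number of bytes is an assumption of this sketch; the text does
not say how \fBint\fP rounds.
Because every reallocation enlarges the partition by at least a factor
of 1.5, the sizes form a geometric series, which is the usual argument
behind a linear bound of this kind.
.DS
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: new size = 1.5 * (old size + extra),
 * truncated to an integer. Written as n + n/2 so that no
 * floating point is needed. */
static size_t grown_size(size_t old_size, size_t extra)
{
    size_t n = old_size + extra;

    return n + n / 2;
}
```
.DE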
.PP
A virtual D-space starting at address 0 corresponds to
the definition in [1], p. 3\-6.
The main reason for having D-space start at address 0 is that it induces
a one-to-one correspondence between the heap and GDA
addresses on the virtual machine (and hence the definition) on the one hand,
and the offset within the \fIdata\fP partition on the other.
This implies that no extra calculation is needed to perform load and
store operations.
.LP
Some calculation however cannot be avoided, because the stack part of
the D-space grows downwards by EM definition.
The first address of the virtual stack (\fIML\fP, the maximum address for
the given \fIpsize\fP) is mapped onto the
beginning of the \fIstack\fP partition.
When the stack grows (i.e. EM addresses get lower), the offset within the
\fIstack\fP partition gets higher.
By taking offset \fIML \- A\fP in the stack partition, one obtains the
physical address corresponding to some virtual EM (stack) address \fIA\fP.
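.LP
The two mappings can be put side by side as follows.
The function names are hypothetical; the sketch only restates the
arithmetic given above and leaves out all checking that the address is
legal for the partition concerned.
.DS
```c
#include <assert.h>

/* Hypothetical helper: a Data address is used directly as the
 * offset within the data partition. */
static unsigned long data_offset(unsigned long a)
{
    return a;
}

/* Hypothetical helper: a Stack address A maps to offset ML - A,
 * so lower EM addresses give higher offsets. Requires a <= ml. */
static unsigned long stack_offset(unsigned long ml, unsigned long a)
{
    assert(a <= ml);
    return ml - a;
}
```
.DE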
.NH 2
Speed.
.PP
From several test results with both versions of the interpreter, the
following may be concluded.
The speed of the interpreter depends strongly on the type of
program being interpreted.
If plain CPU arithmetic is performed, the interpreter is
relatively slow (1000 \(mu the run time of the cc version).
When the program mainly manipulates the stack, the interpreter is
quite fast (100 \(mu the run time of the cc version).
.LP
Most programs, however, will not be this extreme, so for a normal program
an interpretation time of somewhere between 300 and 500 times that of
direct execution is to be expected.
.LP
The fast version runs in about 60% of the time of the full version, at the
expense of considerably reduced functionality.
Tallying costs about 10%.