| .\"	Introduction
 | |
| .\"
 | |
| .\"	$Id$
 | |
| .NH
 | |
| INTRODUCTION.
 | |
| .PP
 | |
| This document describes an EM interpreter which does extensive checking.
 | |
| The interpreter exists in two versions: the normal version with full checking
 | |
| and debugging facilities, and a fast stripped version that does interpretation
 | |
| only.
 | |
| This document assumes that the full version is used.
 | |
| .LP
 | |
| First the virtual EM machine embodied by the interpreter (called \fBint\fP) is
 | |
| described, followed by some remarks on performance.
 | |
| The second section gives some specific implementation decisions.
 | |
| Section three explains the usage of the built-in debugging tool.
 | |
| .LP
 | |
| Appendix A gives an overview of the various warnings \fBint\fP gives,
 | |
| with possible causes and solutions.
 | |
| Appendix B is a simple tutorial on the use of \fBint\fP.
 | |
| A separate manual page exists.
 | |
| .PP
 | |
| The document assumes a good understanding of what EM is and what
 | |
| the assembly code looks like [1].
 | |
| Notions like 'procedure descriptor', 'mini', 'shortie' etc. are not
 | |
| explained.
 | |
| In what follows, any word in \fIthis font\fP is the name of a
 | |
| variable, constant, function or similar entity that appears under
 | |
| the same name in the source code.
 | |
| .LP
 | |
| To avoid confusion: \fBint\fP interprets EM machine language (e.out files),
 | |
| \fInot\fP the assembly language (.e files) and \fInot\fP the compact
 | |
| code (.k files).
 | |
| .NH 2
 | |
| The virtual EM machine.
 | |
| .PP
 | |
| The memory layout of the virtual EM machine represented by the interpreter
 | |
| differs in details from the description in [1].
 | |
| Virtual memory is split up into two separate spaces:
 | |
| one space containing the instructions,
 | |
| the other all the data, including stack and heap (D-space).
 | |
| The procedure descriptors are preprocessed and stored in a separate array,
 | |
| \fIproctab[]\fP.
 | |
| Both spaces start off at address 0.
 | |
| This is possible because pointers in the two different spaces are
 | |
| distinguishable by context (and shadow-bytes: see 2.6).
 | |
| .NH 3
 | |
| Instruction Space
 | |
| .PP
 | |
| Figure 1 shows the I-space, together with the position of some important
 | |
| EM registers.
 | |
| .Dr 12
 | |
|                       NEXT -->  |________________|  <-- DB    \e
 | |
|                                 |                |            |
 | |
|                                 |                |            |  T
 | |
|                                 |                |  <-- PC    |
 | |
|                                 |     Program    |            |  e
 | |
|                                 |                |            |
 | |
|                                 |      Text      |            |  x
 | |
|                                 |                |            |
 | |
|                                 |                |            |  t
 | |
|                          0 -->  |________________|  <--(PB)   /
 | |
| .Df
 | |
| \fI Fig 1. Virtual instruction space (I-space).\fP
 | |
| .De
 | |
| .PP
 | |
| The I-space is just big enough to contain all the instructions.
 | |
| The size needed for the program text (\fINTEXT\fP) is found from the
 | |
| header-bytes of the loadfile.
 | |
| Legal values for the program counter (\fIPC\fP) consist of all
 | |
| addresses in the range from 0 through \fINTEXT\fP \- 1.
 | |
| If the \fIPC\fP is made to point to an illegal address, a trap will occur.
 | |
| .NH 3
 | |
| The Procedure Table
 | |
| .PP
 | |
| The \fINProc\fP constant indicates how many procedure descriptors there
 | |
| are in the \fIproctab[]\fP array.
 | |
| Each element of this array holds, for one procedure: the number of locals, the
 | |
| entry point and the entry point of the textually following procedure.
 | |
| The two entry points are used to check the restriction that the program counter
 | |
| may not wander from procedure to procedure.
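The check described above can be sketched in C as follows. This is only an illustration: the struct layout, the field names and the function pc_in_proc() are assumptions made for this example and are not taken from the source of the interpreter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical layout of one preprocessed procedure descriptor. */
struct proc_entry {
    size_t nlocals;    /* number of locals */
    size_t entry;      /* entry point in I-space */
    size_t next_entry; /* entry point of the textually following procedure */
};

/* True when pc lies inside the text of procedure p, that is,
 * entry <= pc < next_entry. */
bool pc_in_proc(const struct proc_entry *p, size_t pc)
{
    assert(p != NULL);
    return pc >= p->entry && pc < p->next_entry;
}
```

A jump whose target fails this test for the current procedure has wandered into another procedure and can be reported.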
 | |
| .NH 3
 | |
| The Data Space
 | |
| .PP
 | |
| Figure 2 shows the layout of the data space, which closely conforms to the EM
 | |
| Manual.
 | |
| .Dr 36
 | |
|                                 __________________
 | |
|             maxaddr(psize) -->  |                |  <-- ML    \e
 | |
|                                 |                |            |  S
 | |
|                                 |     Locals     |            |  t
 | |
|                                 |       &        |            |  a
 | |
|                                 |      RSBs      |            |  c
 | |
|                                 |                |            |  k
 | |
|                                 |________________|  <-- SP    /
 | |
|                                 .                .
 | |
|                                 .                .
 | |
|                                 .     Unused     .
 | |
|                                 .                .
 | |
|                                 .                .
 | |
|                                 .                .
 | |
|                                 .                .
 | |
|                                 .                .
 | |
|                                 .     Unused     .
 | |
|                                 .                .
 | |
|                                 .                .
 | |
|                                 |________________|  <-- HP
 | |
|                                 |                |            \e
 | |
|                                 |      Heap      |            |
 | |
|                                 |________________|  <-- HB    |
 | |
|                                 |                |            |  D
 | |
|                                 |    Arguments   |            |
 | |
|                                 |     Environ    |            |  a
 | |
|                                 |  _   _   _   _ |            |
 | |
|                                 |                |            |  t
 | |
|                                 |                |            |
 | |
|                                 |                |            |  a
 | |
|                                 |   Global data  |            |
 | |
|                                 |                |            |
 | |
|                                 |                |            |
 | |
|                          0 -->  |________________|  <--(EB)   /
 | |
| .Df
 | |
| \fI Fig 2. Virtual dataspace (D-space).\fP
 | |
| .De
 | |
| .PP
 | |
| D-space begins at address 0, and ends at the largest address
 | |
| representable by the pointer size (\fIpsize\fP) being used;
 | |
| for a 2-byte pointer size this maximum address is
 | |
| .DS
 | |
| ((2 ^ 16 \- 1) / word size * word size) \- 1
 | |
| .DE
 | |
| for a 4-byte pointer size it is
 | |
| .DS
 | |
| ((2 ^ 31 \- 1) / word size * word size) \- 1
 | |
| .DE
 | |
| (not 2 ^ 32, so that illegal pointers can be implemented in the future).
 | |
| The division is an integer division: rounding down to a multiple of the word
 | |
| size is required to make ML+1 expressible as the initialisation value of
 | |
| LB and SP.
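As a worked example, the two formulas can be evaluated with C integer arithmetic. The function name max_addr() and its parameters are chosen for this sketch only; the interpreter itself may compute the value differently.

```c
#include <assert.h>
#include <stdint.h>

/* Largest D-space address (ML) for a pointer size of 2 or 4 bytes and a
 * given word size.  Integer division truncates, which rounds the limit
 * down to a multiple of the word size before 1 is subtracted. */
uint32_t max_addr(unsigned psize, uint32_t wsize)
{
    assert(psize == 2 || psize == 4);
    assert(wsize > 0);
    uint32_t limit = (psize == 2) ? 0xFFFFu : 0x7FFFFFFFu;
    return limit / wsize * wsize - 1;
}
```

For a 2-byte pointer size and a 2-byte word size this gives 65533, so ML+1 = 65534 is a multiple of the word size that still fits in a pointer.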
 | |
| .PP
 | |
| D-space is split into two partitions: Data and Stack (indicated by the
 | |
| brackets).
 | |
| The Data partition holds the global data area (GDA) and the heap.
 | |
| Its initial size is given by the loadfile constant SZDATA.
 | |
| Some space is added to it, because arguments and environment are
 | |
| stored here also.
 | |
| The size of this part (GDA plus arguments and environment) is static while
 | |
| interpreting.
 | |
| However, as the heap may grow during execution (e.g. through dynamic
 | |
| allocation), the size of the Data partition as a whole is variable.
 | |
| Initially, the size for the Data partition is the sum of the space needed
 | |
| by the GDA (including the space needed for arguments and environment) and
 | |
| the initial heapspace.
 | |
| The lowest legal Data address is 0; the highest is \fIHP\fP \- 1.
 | |
| .LP
 | |
| The Stack partition holds the stack.
 | |
| It begins at the highest available D-space address, and grows
 | |
| towards the low addresses, so the Stack partition is of variable size too.
 | |
| The lowest legal Stack address is the stackpointer (\fISP\fP),
 | |
| the highest is the memory limit (\fIML\fP).
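The bounds of the two partitions can be summarised in a small C sketch. The struct, the enum and the function classify() are hypothetical names introduced for this example; only the bounds themselves (0 through HP \- 1 for Data, SP through ML for Stack) come from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum region { IN_DATA, IN_STACK, IN_UNUSED };

/* Hypothetical snapshot of the registers that bound the two partitions. */
struct dspace {
    uint32_t hp; /* heap pointer: first address above the Data partition */
    uint32_t sp; /* stack pointer: lowest legal Stack address */
    uint32_t ml; /* memory limit: highest legal Stack address */
};

/* Decide to which part of D-space the virtual address addr belongs. */
enum region classify(const struct dspace *d, uint32_t addr)
{
    assert(d != NULL);
    if (addr < d->hp)
        return IN_DATA;
    if (addr >= d->sp && addr <= d->ml)
        return IN_STACK;
    return IN_UNUSED;
}
```

With SP initialised to ML+1 the stack is empty, and no address is classified as a Stack address.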
 | |
| .NH 2
 | |
| Physical lay-out
 | |
| .PP
 | |
| Each partition is mapped onto a piece of physical memory with the
 | |
| same name: \fItext\fP (fig. 1), \fIstack\fP and \fIdata\fP (fig. 2).
 | |
| These are the storage structures which \fBint\fP uses to physically
 | |
| store the contents of the virtual EM spaces.
 | |
| Figure 2 thus shows the mapping of D-space onto two
 | |
| different physical parts: \fIstack\fP and \fIdata\fP.
 | |
| The I-space is represented by one physical part: \fItext\fP.
 | |
| .LP
 | |
| Each time more space is needed, the partition concerned is reallocated,
 | |
| with the new size computed by the formula:
 | |
| .DS
 | |
| \fInew size\fP = 1.5 \(mu (\fIold size\fP + \fIextra\fP)
 | |
| .DE
 | |
| Here \fIextra\fP is the number of bytes by which the request exceeds the \fIold size\fP.
 | |
| One can prove that using this method, there is a
 | |
| linear relationship between allocation time and needed partition size.
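A minimal sketch of this growth policy in C, assuming the partition is an ordinary heap block managed with realloc(). The names grown_size() and grow_partition() are invented for this example, and the factor 1.5 is computed in integer arithmetic, so the result is rounded down.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* new size = 1.5 * (old size + extra), rounded down. */
size_t grown_size(size_t old_size, size_t extra)
{
    assert(extra <= SIZE_MAX - old_size); /* the sum must not overflow */
    size_t needed = old_size + extra;
    return needed + needed / 2;
}

/* Hypothetical wrapper: enlarge the block *buf of *size bytes so that it
 * can hold extra more bytes.  Returns 0 on success and -1 on failure, in
 * which case the old block is left untouched. */
int grow_partition(char **buf, size_t *size, size_t extra)
{
    size_t new_size = grown_size(*size, extra);
    if (new_size == 0)
        return 0; /* nothing to allocate */
    char *p = realloc(*buf, new_size);
    if (p == NULL)
        return -1;
    *buf = p;
    *size = new_size;
    return 0;
}
```

Because each reallocation enlarges the partition by a constant factor, the number of reallocations grows only logarithmically with the final size, and the total number of bytes copied stays proportional to that size.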
 | |
| .PP
 | |
| A virtual D-space starting at address 0 conforms to
 | |
| the definition in [1], p. 3\-6.
 | |
| The main reason for having D-space start at address 0 is that it induces
 | |
| a one-to-one correspondence between the heap and GDA
 | |
| addresses on the virtual machine (and hence the definition) on the one hand,
 | |
| and the offset within the \fIdata\fP partition on the other.
 | |
| This implies that no extra calculation is needed to perform load and
 | |
| store operations.
 | |
| .LP
 | |
| Some calculation however cannot be avoided, because the stack part of
 | |
| the D-space grows downwards by EM definition.
 | |
| The first address of the virtual stack (\fIML\fP, the maximum address for
 | |
| the given \fIpsize\fP) is mapped onto the
 | |
| beginning of the \fIstack\fP partition.
 | |
| When the stack grows (i.e. EM addresses get lower), the offset within the
 | |
| \fIstack\fP partition gets higher.
 | |
| By taking offset \fIML \- A\fP in the stack partition, one obtains the
 | |
| physical address corresponding to some virtual EM (stack) address \fIA\fP.
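The two mappings can be written down as a short C sketch. The function names are assumptions for this example; the formulas are the ones given in the text: a Data address is its own offset in the \fIdata\fP partition, and a Stack address A has offset ML \- A in the \fIstack\fP partition.

```c
#include <assert.h>
#include <stdint.h>

/* Offset within the data partition: the virtual address itself. */
uint32_t data_offset(uint32_t a)
{
    return a;
}

/* Offset within the stack partition for virtual stack address a.
 * ML maps onto offset 0; lower EM addresses map onto higher offsets. */
uint32_t stack_offset(uint32_t ml, uint32_t a)
{
    assert(a <= ml);
    return ml - a;
}
```

For example, with ML = 65533 the virtual stack address 65000 is found at offset 533 in the stack partition.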
 | |
| .NH 2
 | |
| Speed.
 | |
| .PP
 | |
| Tests with both versions of the interpreter lead to the following
 | |
| conclusions.
 | |
| The speed of the interpreter depends strongly on the type of
 | |
| program being interpreted.
 | |
| If the program mainly performs plain CPU arithmetic, the interpreter is
 | |
| relatively slow (1000 \(mu the running time of the cc version).
 | |
| If it mainly manipulates the stack, the interpreter is
 | |
| quite fast (100 \(mu the running time of the cc version).
 | |
| .LP
 | |
| Most programs, however, will not be this extreme, so for a normal program an
 | |
| interpretation time of somewhere between 300 and 500 times that of direct
 | |
| execution is to be expected.
 | |
| .LP
 | |
| The fast version runs in about 60% of the time of the full version, at the
 | |
| expense of considerably reduced functionality.
 | |
| Tallying costs about 10%.
 |