.SN 5
.BP
.S1 "MAPPING OF EM DATA MEMORY ONTO TARGET MACHINE MEMORY"
The EM architecture is designed to be implemented
on many existing and future machines.
EM memory is highly fragmented to make
adaptation to various memory architectures possible.
The format and encoding of pointers are explicitly undefined.
.P
This chapter gives solutions to some of the
anticipated problems.
First, we describe a possible memory layout for machines
with 64K bytes of address space.
Here we use a member of the EM family with 2-byte word and pointer
size.
The most straightforward layout is shown in figure 2.
.Dr 40
       65534 \-> |-------------------------------|
                |///////////////////////////////|
                |//// unimplemented memory /////|
                |///////////////////////////////|
          ML \-> |-------------------------------|
                |                               |
                |                               | <\- LB
                |     stack and local area      |
                |                               |
                |-------------------------------| <\- SP
                |///////////////////////////////|
                |//////// inaccessible /////////|
                |///////////////////////////////|
                |-------------------------------| <\- HP
                |                               |
                |           heap area           |
                |                               |
                |                               |
          HB \-> |-------------------------------|
                |                               |
                |       global data area        |
                |                               |
          EB \-> |-------------------------------|
                |                               |
                |         program text          | <\- PC
                |                               |
                |        ( and tables )         |
                |                               |
                |                               |
          PB \-> |-------------------------------|
                |///////////////////////////////|
                |////////// undefined //////////|
                |///////////////////////////////|
           0 \-> |-------------------------------|
.Df
Figure 2.  Memory layout showing typical register
positions during execution of an EM program.
.De
.N 1
The base registers for the various memory pieces can be stored
in target machine registers or memory.
.IS
.N 1
.TS
tab(;);
l 1 l l l.
PB;:;program base;points to the base of the instruction address space.
EB;:;external base;points to the base of the data address space.
HB;:;heap base;points to the base of the heap area.
ML;:;memory limit;marks the high end of the addressable data space.
.TE 1
.IE
The stack grows from high
EM addresses to low EM addresses, and the heap the
other way.
The memory between SP and HP is not accessible,
but may be allocated later to the stack or the heap if needed.
The local data area is allocated starting at the high end of
memory.
.P
Because EM address 0 is not mapped onto target
address 0, a problem arises when pointers are used.
If a program pushed a constant, say 6, onto the stack,
and then tried to indirect through it,
the wrong word would be fetched,
because EM address 6 is mapped onto target address EB+6
and not target address 6 itself.
This particular problem is solved by explicitly declaring
the format of a pointer to be undefined,
so that using a constant as a pointer is completely illegal.
However, the general problem of mapping pointers still exists.
.P
There are two possible solutions.
In the first solution, EM pointers are represented
in the target machine as true EM addresses.
For example, a pointer to EM address 6 really is
stored as a 6 in the target machine.
This solution implies that every time a pointer is fetched
EB must be added before referencing
the target machine's memory.
If the target machine has powerful indexing
facilities, EB can be kept in a target machine register,
and the relocation can indeed be done on
every reference to the data address space
at a modest cost in speed.
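.P
The first solution can be sketched as follows.
This is only an illustration, not part of any EM implementation:
the toy target memory, the value chosen for EB and the helper names
are all assumptions.
Pointers hold plain EM addresses, and EB is added on every reference.
.DS
```python
# Toy model of the first solution (all names and values are assumptions).
EB = 60                     # assumed base of the EM data space
target = bytearray(256)     # toy target machine memory


def store_byte(em_addr, value):
    # The pointer is an EM address; relocate it on every reference.
    target[EB + em_addr] = value


def load_byte(em_addr):
    # Same relocation on the way back: target address = EB + EM address.
    return target[EB + em_addr]


store_byte(6, 42)           # EM address 6 lives at target address EB+6
```
.DE
Here load_byte(6) reads target address 66, not target address 6.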
.P
The other solution consists of having EM pointers
refer to the true target machine address.
Thus the instruction LAE 6 (Load Address of External 6)
would push the value of EB+6 onto the stack.
When this approach is chosen, back ends must know
how to offset from EB, to translate all
instructions that manipulate EM addresses.
However, the problem is not completely solved,
because a front end may have to initialize a pointer
in CON or ROM data to point to a global address.
This pointer must also be relocated by the back end or the interpreter.
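.P
A minimal sketch of the second solution, including the relocation of
pointers in initialized data, is given below.
It assumes a value for EB, two-byte little-endian pointers, and a
hypothetical list of pointer slots supplied by the front end;
none of these details come from the EM definition.
.DS
```python
# Toy model of the second solution (names and values are assumptions).
EB = 60  # assumed base of the EM data space


def lae(offset):
    # LAE offset pushes the true target address EB + offset.
    return EB + offset


def relocate(image, pointer_slots, size=2):
    # image holds initialized global data in which each pointer slot
    # still contains an EM address; add EB to every such slot.
    out = bytearray(image)
    for slot in pointer_slots:
        em_addr = int.from_bytes(out[slot:slot + size], "little")
        out[slot:slot + size] = (EB + em_addr).to_bytes(size, "little")
    return out


image = bytearray(8)
image[4:6] = (6).to_bytes(2, "little")   # pointer to EM address 6
relocated = relocate(image, [4])
```
.DE
After relocation the slot holds the same value that lae(6) would push,
so both kinds of pointer can be dereferenced without further adjustment.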
.P
Although the EM stack grows from high to low EM addresses,
some machines have hardware PUSH and POP
instructions that require the stack to grow upwards.
If reasons of efficiency demand the use of these
instructions, then EM
can be implemented with the memory layout
upside down, as shown in figure 3.
This is possible because the pointer format is explicitly undefined.
With the layout upside down, the first element of a word array has a
higher physical address than the second element.
.Dr 18
          |                 |                    |                 |
          |      EB=60      |                    |        ^        |
          |                 |                    |        |        |
          |-----------------|                    |-----------------|
      105 |   45   |   44   | 104            214 |   41   |   40   | 215
          |-----------------|                    |-----------------|
      103 |   43   |   42   | 102            212 |   43   |   42   | 213
          |-----------------|                    |-----------------|
      101 |   41   |   40   | 100            210 |   45   |   44   | 211
          |-----------------|                    |-----------------|
          |        |        |                    |                 |
          |        v        |                    |      EB=255     |
          |                 |                    |                 |

                Type A                                 Type B
.Df
Figure 3. Two possible memory implementations.
Numbers within the boxes are EM addresses.
The other numbers are physical addresses.
.De
.A 1 0
So, we have two different EM memory implementations:
.IS
.PS - 4
.PT A~\-
stack downwards
.PT B~\-
stack upwards
.PE
.IE
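.P
The two mappings of figure 3 can be written down as a short sketch.
The function names are hypothetical; the EB values are those shown
in the figure.
.DS
```python
# Physical address of an EM address for the two layouts of figure 3.
def phys_a(em_addr, eb=60):
    # Type A: physical addresses grow with EM addresses.
    return eb + em_addr


def phys_b(em_addr, eb=255):
    # Type B: the layout is upside down.
    return eb - em_addr


def word_base_b(em_addr, eb=255):
    # Lowest physical address of the two-byte word at em_addr in type B.
    return eb - (em_addr + 1)
```
.DE
For EM address 40 this gives physical address 100 for type A and 215
for type B, and the word at EM address 40 starts at physical
address 214 (EB-41) in type B.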
.P
For each of these two possibilities we give the translation of
the EM instructions that push the byte at offset 3 of a global data
block starting at EM address 40 onto the stack, and of the
instruction that loads the word at address 40.
All translations assume a word and pointer size of two bytes.
The target machine used is a PDP-11 augmented with push and pop instructions.
Registers 'r0' and 'r1' are used.
Because byte transfers into a register suffer from sign extension,
the byte is fetched by clearing the register and setting bits with bisb.
Push $40 means push the constant 40, not word 40.
.P
The translation of the EM instructions depends on the pointer representation
used.
For each of the two solutions explained above the translation is given.
.P
First, the translation for the two implementations using EM addresses as
pointer representation:
.DS
.TS
tab(:), center;
l s l s l s
_ s _ s _ s
l 2 l 6 l 2 l 6 l 2 l.
EM:type A:type B


LAE:40:push:$40:push:$40

ADP:3:pop:r0:pop:r0
::add:$3,r0:add:$3,r0
::push:r0:push:r0

LOI:1:pop:r0:pop:r0
::\-::neg:r0
::clr:r1:clr:r1
::bisb:eb(r0),r1:bisb:eb(r0),r1
::push:r1:push:r1

LOE:40:push:eb+40:push:eb-41
.TE
.DE
.P
The translation for the two implementations, if the target machine address is
used as pointer representation, is:
.N 1
.DS
.TS
tab(:), center;
l s l s l s
_ s _ s _ s
l 2 l 6 l 2 l 6 l 2 l.
EM:type A:type B


LAE:40:push:$eb+40:push:$eb-40

ADP:3:pop:r0:pop:r0
::add:$3,r0:sub:$3,r0
::push:r0:push:r0

LOI:1:pop:r0:pop:r0
::clr:r1:clr:r1
::bisb:(r0),r1:bisb:(r0),r1
::push:r1:push:r1

LOE:40:push:eb+40:push:eb-41
.TE
.DE
.P
The translations presented above are not intended to be optimal.
Most machines can handle these simple cases in one or two instructions.
They demonstrate, however, the flexibility of the EM design.
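.P
As a check on the two tables, the sketch below replays the sequence
LAE 40, ADP 3, LOI 1 for both pointer representations and both
memory types.
It is a toy simulation with hypothetical helper names, using the EB
values of figure 3 and a memory in which EM address a holds the value a.
.DS
```python
def make_memory(kind, eb):
    # Toy target memory: the byte at EM address a contains a.
    mem = bytearray(256)
    for a in range(40, 46):
        mem[eb + a if kind == "A" else eb - a] = a
    return mem


def fetch_em_pointers(kind, eb, mem):
    # First table: pointers are EM addresses.
    r0 = 40                  # LAE 40: push $40
    r0 += 3                  # ADP 3:  add $3,r0
    if kind == "B":
        r0 = -r0             # LOI 1:  neg r0 (type B only)
    return mem[eb + r0]      #         bisb eb(r0),r1


def fetch_target_pointers(kind, eb, mem):
    # Second table: pointers are target machine addresses.
    r0 = eb + 40 if kind == "A" else eb - 40   # LAE 40
    r0 = r0 + 3 if kind == "A" else r0 - 3     # ADP 3: add or sub
    return mem[r0]                             # LOI 1: bisb (r0),r1


results = []
for kind, eb in (("A", 60), ("B", 255)):
    mem = make_memory(kind, eb)
    results.append(fetch_em_pointers(kind, eb, mem))
    results.append(fetch_target_pointers(kind, eb, mem))
```
.DE
All four variants fetch the byte stored at EM address 43.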
.P
There are several possibilities to implement EM on machines with
address spaces larger than 64K bytes.
For EM with two-byte pointers one could allocate instruction and
data space each in a separate 64K piece of memory.
EM pointers still have to fit in two bytes,
but the base registers PB and EB may be loaded in hardware registers
wider than 16 bits, if available.
EM implementations can also make efficient use of a machine
with separate instruction and data space.
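.P
The idea of two-byte EM pointers combined with wider base registers
can be sketched as follows.
The base values and function names are assumptions chosen only for
illustration; they place instruction space and data space in
different 64K pieces of a larger target address space.
.DS
```python
PB = 0x10000   # assumed: instruction space in one 64K piece
EB = 0x20000   # assumed: data space in another 64K piece


def check16(em_ptr):
    # EM pointers still have to fit in two bytes.
    if not 0 <= em_ptr <= 0xFFFF:
        raise ValueError("EM pointer does not fit in two bytes")
    return em_ptr


def data_address(em_ptr):
    # Wide base register plus 16-bit EM pointer.
    return EB + check16(em_ptr)


def text_address(em_ptr):
    return PB + check16(em_ptr)
```
.DE
The same 16-bit value thus names different target addresses depending
on whether it is used in the instruction or the data address space.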
.P
EM with 32-bit pointers allows one to make use of machines
with large address spaces.
In a virtual, segmented memory system one could use a separate
segment for each fragment.