Initial revision
parent
253118db19
commit
e0872423d9
21 changed files with 7189 additions and 0 deletions
1121
doc/em/addend.n
Normal file
File diff suppressed because it is too large
488
doc/em/app.nr
Normal file
|
|
@ -0,0 +1,488 @@
|
||||||
|
.BP
|
||||||
|
.AP "EM INTERPRETER"
|
||||||
|
.nf
|
||||||
|
.ta 8 16 24 32 40 48 56 64 72 80
|
||||||
|
.so em.i
|
||||||
|
.fi
|
||||||
|
.BP
|
||||||
|
.AP "EM CODE TABLES"
|
||||||
|
The following table is used by the assembler for EM machine
|
||||||
|
language.
|
||||||
|
It specifies the opcodes used for each instruction and
|
||||||
|
how arguments are mapped to machine language arguments.
|
||||||
|
The table is presented in three columns,
|
||||||
|
each line in each column contains three or four fields.
|
||||||
|
Each line describes a range of interpreter opcodes by
|
||||||
|
specifying for which instruction the range is used, the type of the
|
||||||
|
opcodes (mini, shortie, etc.) and the range for the instruction
|
||||||
|
argument.
|
||||||
|
.A
|
||||||
|
The first field on each line gives the EM instruction mnemonic,
|
||||||
|
the second field gives some flags.
|
||||||
|
If the opcodes are minis or shorties the third field specifies
|
||||||
|
how many minis/shorties are used.
|
||||||
|
The last field gives the number of the (first) interpreter
|
||||||
|
opcode.
|
||||||
|
.N 1
|
||||||
|
Flags:
|
||||||
|
.IS 3
|
||||||
|
.N 1
|
||||||
|
Opcode type, only one of the following may be specified.
|
||||||
|
.PS - 5 " "
|
||||||
|
.PT -
|
||||||
|
opcode without argument
|
||||||
|
.PT m
|
||||||
|
mini
|
||||||
|
.PT s
|
||||||
|
shortie
|
||||||
|
.PT 2
|
||||||
|
opcode with 2-byte signed argument
|
||||||
|
.PT 4
|
||||||
|
opcode with 4-byte signed argument
|
||||||
|
.PT 8
|
||||||
|
opcode with 8-byte signed argument
|
||||||
|
.PE
|
||||||
|
Secondary (escaped) opcodes.
|
||||||
|
.PS - 5 " "
|
||||||
|
.PT e
|
||||||
|
The opcode thus marked is in the secondary opcode group instead
|
||||||
|
of the primary
|
||||||
|
.PE
|
||||||
|
Restrictions on arguments.
|
||||||
|
.PS - 5 " "
|
||||||
|
.PT N
|
||||||
|
Negative arguments only
|
||||||
|
.PT P
|
||||||
|
Positive and zero arguments only
|
||||||
|
.PE
|
||||||
|
Mapping of arguments.
|
||||||
|
.PS - 5 " "
|
||||||
|
.PT w
|
||||||
|
argument must be divisible by the wordsize and is divided by the
|
||||||
|
wordsize before use as opcode argument.
|
||||||
|
.PT o
|
||||||
|
argument (possibly after division) must be >= 1 and is
|
||||||
|
decremented before use as opcode argument
|
||||||
|
.PE
|
||||||
|
.IE
|
||||||
|
If the opcode type is 2, 4 or 8 the resulting argument is used as
|
||||||
|
opcode argument (least significant byte first).
|
||||||
|
.N
|
||||||
|
If the opcode type is mini, the argument is added
|
||||||
|
to the first opcode, provided it is in range.
|
||||||
|
If the argument is negative, the absolute value minus one is
|
||||||
|
used in the algorithm above.
|
||||||
|
.N
|
||||||
|
For shorties with positive arguments the first opcode is used
|
||||||
|
for arguments in the range 0..255, the second for the range
|
||||||
|
256..511, etc.
|
||||||
|
For shorties with negative arguments the first opcode is used
|
||||||
|
for arguments in the range -1..-256, the second for the range
|
||||||
|
-257..-512, etc.
|
||||||
|
The byte following the opcode contains the least significant
|
||||||
|
byte of the argument.
|
||||||
|
First some examples of these specifications.
|
||||||
|
.PS - 5
|
||||||
|
.PT "aar mwPo 1 34"
|
||||||
|
Indicates that opcode 34 is used as a mini for Positive
|
||||||
|
instruction arguments only.
|
||||||
|
The w and o indicate division and decrementing of the
|
||||||
|
instruction argument.
|
||||||
|
Because the resulting argument must be zero (only opcode 34 may be used),
this mini can only be used for instruction argument 2.
|
||||||
|
Conclusion: opcode 34 is for "AAR 2".
|
||||||
|
.PT "adp sP 1 41"
|
||||||
|
Opcode 41 is used as shortie for ADP with arguments in the range
|
||||||
|
0..255.
|
||||||
|
.PT "bra sN 2 60"
|
||||||
|
Opcode 60 is used as shortie for BRA with arguments -1..-256,
|
||||||
|
61 is used for arguments -257..-512.
|
||||||
|
.PT "zer e- 145"
|
||||||
|
Escaped opcode 145 is used for ZER.
|
||||||
|
.PE
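The mapping rules above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions, not the assembler's actual code:
the function name encode, the representation of the flags as a string and the
(group, bytes) result are all hypothetical.
The sketch reports whether an opcode belongs to the primary or the secondary group,
but deliberately does not show how an escaped opcode is prefixed in the byte stream.

```python
def encode(flags, count, first_opcode, arg=None, wordsize=2):
    """Try to encode `arg` with one table entry (hypothetical helper).

    Returns (group, [bytes...]) or None when the entry does not apply.
    `count` is the number of minis/shorties; it is ignored for other types.
    """
    group = "secondary" if "e" in flags else "primary"
    if "-" in flags:                       # opcode without argument
        return (group, [first_opcode]) if arg is None else None
    if arg is None:
        return None
    if "N" in flags and arg >= 0:          # negative arguments only
        return None
    if "P" in flags and arg < 0:           # positive and zero arguments only
        return None
    if "w" in flags:                       # must be divisible by the wordsize
        if arg % wordsize != 0:
            return None
        arg //= wordsize
    if "o" in flags:                       # must be >= 1, then decremented
        if arg < 1:
            return None
        arg -= 1
    for size in (2, 4, 8):                 # sized signed argument, LSB first
        if str(size) in flags:
            limit = 1 << (8 * size - 1)
            if not -limit <= arg < limit:
                return None
            return (group, [first_opcode] + list(arg.to_bytes(size, "little", signed=True)))
    # For negative arguments the absolute value minus one selects the opcode.
    index = arg if arg >= 0 else -arg - 1
    if "m" in flags:                       # mini: argument folded into the opcode
        return (group, [first_opcode + index]) if index < count else None
    if "s" in flags:                       # shortie: one opcode per block of 256
        block = index >> 8
        if block >= count:
            return None
        return (group, [first_opcode + block, arg & 0xFF])
    return None
```

With a wordsize of 2, encode("mwPo", 1, 34, 2) selects opcode 34, in agreement with
the "AAR 2" example, while an argument of 4 is rejected because only one mini exists.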
|
||||||
|
The interpreter opcode table:
|
||||||
|
.N 1
|
||||||
|
.IS 3
|
||||||
|
.DS B
|
||||||
|
.so itables
|
||||||
|
.DE 0
|
||||||
|
.IE
|
||||||
|
.P
|
||||||
|
The table above results in the following dispatch tables.
|
||||||
|
Dispatch tables are used by interpreters to jump to the
|
||||||
|
routines implementing the EM instructions, indexed by the next opcode.
|
||||||
|
Each line of the dispatch tables gives the routine names
|
||||||
|
of eight consecutive opcodes, preceded by the first opcode number
|
||||||
|
on that line.
|
||||||
|
Routine names consist of an EM mnemonic followed by a suffix.
|
||||||
|
The suffixes show the encoding used for each opcode.
|
||||||
|
.N
|
||||||
|
The following suffixes exist:
|
||||||
|
.N 1
|
||||||
|
.VS 1 0
|
||||||
|
.IS 4
|
||||||
|
.PS - 11
|
||||||
|
.PT .z
|
||||||
|
no arguments
|
||||||
|
.PT .l
|
||||||
|
16-bit argument
|
||||||
|
.PT .lw
|
||||||
|
16-bit argument divided by the wordsize
|
||||||
|
.PT .p
|
||||||
|
positive 16-bit argument
|
||||||
|
.PT .pw
|
||||||
|
positive 16-bit argument divided by the wordsize
|
||||||
|
.PT .n
|
||||||
|
negative 16-bit argument
|
||||||
|
.PT .nw
|
||||||
|
negative 16-bit argument divided by the wordsize
|
||||||
|
.PT .s<num>
|
||||||
|
shortie with <num> as high order argument byte
|
||||||
|
.PT .sw<num>
|
||||||
|
shortie with argument divided by the wordsize
|
||||||
|
.PT .<num>
|
||||||
|
mini with <num> as argument
|
||||||
|
.PT .<num>W
|
||||||
|
mini with <num>*wordsize as argument
|
||||||
|
.PE 3
|
||||||
|
<num> is a possibly negative integer.
|
||||||
|
.VS 1 1
|
||||||
|
.IE
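As an illustration of the layout just described, a dispatch table line can be turned
into an opcode-to-routine mapping with a few lines of Python.
The helper names parse_dispatch_line and split_routine are hypothetical,
and the sketch assumes that the fields of a line are separated by white space.

```python
def parse_dispatch_line(line):
    """Map each opcode on one dispatch table line to its routine name."""
    fields = line.split()
    first = int(fields[0])                 # number of the first opcode on the line
    return {first + i: name for i, name in enumerate(fields[1:])}


def split_routine(name):
    """Split a routine name into (mnemonic, suffix); the suffix may be empty."""
    mnemonic, _, suffix = name.partition(".")
    return mnemonic, suffix


table = parse_dispatch_line("32 loc.32 loc.33 aar.1W adf.s0 adi.1W adi.2W adp.l adp.1")
```

Here table[34] is "aar.1W", the mini for AAR with one wordsize as argument.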
|
||||||
|
The dispatch table for the 256 primary opcodes:
|
||||||
|
.DS B
|
||||||
|
0 loc.0 loc.1 loc.2 loc.3 loc.4 loc.5 loc.6 loc.7
|
||||||
|
8 loc.8 loc.9 loc.10 loc.11 loc.12 loc.13 loc.14 loc.15
|
||||||
|
16 loc.16 loc.17 loc.18 loc.19 loc.20 loc.21 loc.22 loc.23
|
||||||
|
24 loc.24 loc.25 loc.26 loc.27 loc.28 loc.29 loc.30 loc.31
|
||||||
|
32 loc.32 loc.33 aar.1W adf.s0 adi.1W adi.2W adp.l adp.1
|
||||||
|
40 adp.2 adp.s0 adp.s-1 ads.1W and.1W asp.1W asp.2W asp.3W
|
||||||
|
48 asp.4W asp.5W asp.w0 beq.l beq.s0 bge.s0 bgt.s0 ble.s0
|
||||||
|
56 blm.s0 blt.s0 bne.s0 bra.l bra.s-1 bra.s-2 bra.s0 bra.s1
|
||||||
|
64 cal.1 cal.2 cal.3 cal.4 cal.5 cal.6 cal.7 cal.8
|
||||||
|
72 cal.9 cal.10 cal.11 cal.12 cal.13 cal.14 cal.15 cal.16
|
||||||
|
80 cal.17 cal.18 cal.19 cal.20 cal.21 cal.22 cal.23 cal.24
|
||||||
|
88 cal.25 cal.26 cal.27 cal.28 cal.s0 cff.z cif.z cii.z
|
||||||
|
96 cmf.s0 cmi.1W cmi.2W cmp.z cms.s0 csa.1W csb.1W dec.z
|
||||||
|
104 dee.w0 del.w-1 dup.1W dvf.s0 dvi.1W fil.l inc.z ine.lw
|
||||||
|
112 ine.w0 inl.-1W inl.-2W inl.-3W inl.w-1 inn.s0 ior.1W ior.s0
|
||||||
|
120 lae.l lae.w0 lae.w1 lae.w2 lae.w3 lae.w4 lae.w5 lae.w6
|
||||||
|
128 lal.p lal.n lal.0 lal.-1 lal.w0 lal.w-1 lal.w-2 lar.W
|
||||||
|
136 ldc.0 lde.lw lde.w0 ldl.0 ldl.w-1 lfr.1W lfr.2W lfr.s0
|
||||||
|
144 lil.w-1 lil.w0 lil.0 lil.1W lin.l lin.s0 lni.z loc.l
|
||||||
|
152 loc.-1 loc.s0 loc.s-1 loe.lw loe.w0 loe.w1 loe.w2 loe.w3
|
||||||
|
160 loe.w4 lof.l lof.1W lof.2W lof.3W lof.4W lof.s0 loi.l
|
||||||
|
168 loi.1 loi.1W loi.2W loi.3W loi.4W loi.s0 lol.pw lol.nw
|
||||||
|
176 lol.0 lol.1W lol.2W lol.3W lol.-1W lol.-2W lol.-3W lol.-4W
|
||||||
|
184 lol.-5W lol.-6W lol.-7W lol.-8W lol.w0 lol.w-1 lxa.1 lxl.1
|
||||||
|
192 lxl.2 mlf.s0 mli.1W mli.2W rck.1W ret.0 ret.1W ret.s0
|
||||||
|
200 rmi.1W sar.1W sbf.s0 sbi.1W sbi.2W sdl.w-1 set.s0 sil.w-1
|
||||||
|
208 sil.w0 sli.1W ste.lw ste.w0 ste.w1 ste.w2 stf.l stf.W
|
||||||
|
216 stf.2W stf.s0 sti.1 sti.1W sti.2W sti.3W sti.4W sti.s0
|
||||||
|
224 stl.pw stl.nw stl.0 stl.1W stl.-1W stl.-2W stl.-3W stl.-4W
|
||||||
|
232 stl.-5W stl.w-1 teq.z tgt.z tlt.z tne.z zeq.l zeq.s0
|
||||||
|
240 zeq.s1 zer.s0 zge.s0 zgt.s0 zle.s0 zlt.s0 zne.s0 zne.s-1
|
||||||
|
248 zre.lw zre.w0 zrl.-1W zrl.-2W zrl.w-1 zrl.nw escape1 escape2
|
||||||
|
.DE 2
|
||||||
|
The list of secondary opcodes (escape1):
|
||||||
|
.N 1
|
||||||
|
.DS B
|
||||||
|
0 aar.l aar.z adf.l adf.z adi.l adi.z ads.l ads.z
|
||||||
|
8 adu.l adu.z and.l and.z asp.lw ass.l ass.z bge.l
|
||||||
|
16 bgt.l ble.l blm.l bls.l bls.z blt.l bne.l cai.z
|
||||||
|
24 cal.l cfi.z cfu.z ciu.z cmf.l cmf.z cmi.l cmi.z
|
||||||
|
32 cms.l cms.z cmu.l cmu.z com.l com.z csa.l csa.z
|
||||||
|
40 csb.l csb.z cuf.z cui.z cuu.z dee.lw del.pw del.nw
|
||||||
|
48 dup.l dus.l dus.z dvf.l dvf.z dvi.l dvi.z dvu.l
|
||||||
|
56 dvu.z fef.l fef.z fif.l fif.z inl.pw inl.nw inn.l
|
||||||
|
64 inn.z ior.l ior.z lar.l lar.z ldc.l ldf.l ldl.pw
|
||||||
|
72 ldl.nw lfr.l lil.pw lil.nw lim.z los.l los.z lor.s0
|
||||||
|
80 lpi.l lxa.l lxl.l mlf.l mlf.z mli.l mli.z mlu.l
|
||||||
|
88 mlu.z mon.z ngf.l ngf.z ngi.l ngi.z nop.z rck.l
|
||||||
|
96 rck.z ret.l rmi.l rmi.z rmu.l rmu.z rol.l rol.z
|
||||||
|
104 ror.l ror.z rtt.z sar.l sar.z sbf.l sbf.z sbi.l
|
||||||
|
112 sbi.z sbs.l sbs.z sbu.l sbu.z sde.l sdf.l sdl.pw
|
||||||
|
120 sdl.nw set.l set.z sig.z sil.pw sil.nw sim.z sli.l
|
||||||
|
128 sli.z slu.l slu.z sri.l sri.z sru.l sru.z sti.l
|
||||||
|
136 sts.l sts.z str.s0 tge.z tle.z trp.z xor.l xor.z
|
||||||
|
144 zer.l zer.z zge.l zgt.l zle.l zlt.l zne.l zrf.l
|
||||||
|
152 zrf.z zrl.pw dch.z exg.s0 exg.l exg.z lpb.z gto.l
|
||||||
|
.DE 2
|
||||||
|
Finally, the list of opcodes with four byte arguments (escape2).
|
||||||
|
.DS
|
||||||
|
|
||||||
|
0 loc
|
||||||
|
.DE 0
|
||||||
|
.BP
|
||||||
|
.AP "AN EXAMPLE PROGRAM"
|
||||||
|
.DS B
|
||||||
|
1 program example(output);
|
||||||
|
2 {This program just demonstrates typical EM code.}
|
||||||
|
3 type rec = record r1: integer; r2:real; r3: boolean end;
|
||||||
|
4 var mi: integer; mx:real; r:rec;
|
||||||
|
5
|
||||||
|
6 function sum(a,b:integer):integer;
|
||||||
|
7 begin
|
||||||
|
8 sum := a + b
|
||||||
|
9 end;
|
||||||
|
10
|
||||||
|
11 procedure test(var r: rec);
|
||||||
|
12 label 1;
|
||||||
|
13 var i,j: integer;
|
||||||
|
14 x,y: real;
|
||||||
|
15 b: boolean;
|
||||||
|
16 c: char;
|
||||||
|
17 a: array[1..100] of integer;
|
||||||
|
18
|
||||||
|
19 begin
|
||||||
|
20 j := 1;
|
||||||
|
21 i := 3 * j + 6;
|
||||||
|
22 x := 4.8;
|
||||||
|
23 y := x/0.5;
|
||||||
|
24 b := true;
|
||||||
|
25 c := 'z';
|
||||||
|
26 for i:= 1 to 100 do a[i] := i * i;
|
||||||
|
27 r.r1 := j+27;
|
||||||
|
28 r.r3 := b;
|
||||||
|
29 r.r2 := x+y;
|
||||||
|
30 i := sum(r.r1, a[j]);
|
||||||
|
31 while i > 0 do begin j := j + r.r1; i := i - 1 end;
|
||||||
|
32 with r do begin r3 := b; r2 := x+y; r1 := 0 end;
|
||||||
|
33 goto 1;
|
||||||
|
34 1: writeln(j, i:6, x:9:3, b)
|
||||||
|
35 end; {test}
|
||||||
|
36 begin {main program}
|
||||||
|
37 mx := 15.96;
|
||||||
|
38 mi := 99;
|
||||||
|
39 test(r)
|
||||||
|
40 end.
|
||||||
|
.DE 0
|
||||||
|
.BP
|
||||||
|
The EM code as produced by the Pascal-VU compiler is given below. Comments
|
||||||
|
have been added manually. Note that this code has already been optimized.
|
||||||
|
.DS B
|
||||||
|
mes 2,2,2 ; wordsize 2, pointersize 2
|
||||||
|
.1
|
||||||
|
rom 't.p\e000' ; the name of the source file
|
||||||
|
hol 552,-32768,0 ; externals and buf occupy 552 bytes
|
||||||
|
exp $sum ; sum can be called from other modules
|
||||||
|
pro $sum,2 ; procedure sum; 2 bytes local storage
|
||||||
|
lin 8 ; code from source line 8
|
||||||
|
ldl 0 ; load two locals ( a and b )
|
||||||
|
adi 2 ; add them
|
||||||
|
ret 2 ; return the result
|
||||||
|
end 2 ; end of procedure ( still two bytes local storage )
|
||||||
|
.2
|
||||||
|
rom 1,99,2 ; descriptor of array a[]
|
||||||
|
exp $test ; the compiler exports all level 0 procedures
|
||||||
|
pro $test,226 ; procedure test, 226 bytes local storage
|
||||||
|
.3
|
||||||
|
rom 4.8F8 ; assemble Floating point 4.8 (8 bytes) in
|
||||||
|
.4 ; global storage
|
||||||
|
rom 0.5F8 ; same for 0.5
|
||||||
|
mes 3,-226,2,2 ; compiler temporary not referenced by address
|
||||||
|
mes 3,-24,2,0 ; the same is true for i, j, b and c in test
|
||||||
|
mes 3,-22,2,0
|
||||||
|
mes 3,-4,2,0
|
||||||
|
mes 3,-2,2,0
|
||||||
|
mes 3,-20,8,0 ; and for x and y
|
||||||
|
mes 3,-12,8,0
|
||||||
|
lin 20 ; maintain source line number
|
||||||
|
loc 1
|
||||||
|
stl -4 ; j := 1
|
||||||
|
lni ; lin 21 prior to optimization
|
||||||
|
lol -4
|
||||||
|
loc 3
|
||||||
|
mli 2
|
||||||
|
loc 6
|
||||||
|
adi 2
|
||||||
|
stl -2 ; i := 3 * j + 6
|
||||||
|
lni ; lin 22 prior to optimization
|
||||||
|
lae .3
|
||||||
|
loi 8
|
||||||
|
lal -12
|
||||||
|
sti 8 ; x := 4.8
|
||||||
|
lni ; lin 23 prior to optimization
|
||||||
|
lal -12
|
||||||
|
loi 8
|
||||||
|
lae .4
|
||||||
|
loi 8
|
||||||
|
dvf 8
|
||||||
|
lal -20
|
||||||
|
sti 8 ; y := x / 0.5
|
||||||
|
lni ; lin 24 prior to optimization
|
||||||
|
loc 1
|
||||||
|
stl -22 ; b := true
|
||||||
|
lni ; lin 25 prior to optimization
|
||||||
|
loc 122
|
||||||
|
stl -24 ; c := 'z'
|
||||||
|
lni ; lin 26 prior to optimization
|
||||||
|
loc 1
|
||||||
|
stl -2 ; for i:= 1
|
||||||
|
2
|
||||||
|
lol -2
|
||||||
|
dup 2
|
||||||
|
mli 2 ; i*i
|
||||||
|
lal -224
|
||||||
|
lol -2
|
||||||
|
lae .2
|
||||||
|
sar 2 ; a[i] :=
|
||||||
|
lol -2
|
||||||
|
loc 100
|
||||||
|
beq *3 ; to 100 do
|
||||||
|
inl -2 ; increment i and loop
|
||||||
|
bra *2
|
||||||
|
3
|
||||||
|
lin 27
|
||||||
|
lol -4
|
||||||
|
loc 27
|
||||||
|
adi 2 ; j + 27
|
||||||
|
sil 0 ; r.r1 :=
|
||||||
|
lni ; lin 28 prior to optimization
|
||||||
|
lol -22 ; b
|
||||||
|
lol 0
|
||||||
|
stf 10 ; r.r3 :=
|
||||||
|
lni ; lin 29 prior to optimization
|
||||||
|
lal -20
|
||||||
|
loi 16
|
||||||
|
adf 8 ; x + y
|
||||||
|
lol 0
|
||||||
|
adp 2
|
||||||
|
sti 8 ; r.r2 :=
|
||||||
|
lni ; lin 30 prior to optimization
|
||||||
|
lal -224
|
||||||
|
lol -4
|
||||||
|
lae .2
|
||||||
|
lar 2 ; a[j]
|
||||||
|
lil 0 ; r.r1
|
||||||
|
cal $sum ; call now
|
||||||
|
asp 4 ; remove parameters from stack
|
||||||
|
lfr 2 ; get function result
|
||||||
|
stl -2 ; i :=
|
||||||
|
4
|
||||||
|
lin 31
|
||||||
|
lol -2
|
||||||
|
zle *5 ; while i > 0 do
|
||||||
|
lol -4
|
||||||
|
lil 0
|
||||||
|
adi 2
|
||||||
|
stl -4 ; j := j + r.r1
|
||||||
|
del -2 ; i := i - 1
|
||||||
|
bra *4 ; loop
|
||||||
|
5
|
||||||
|
lin 32
|
||||||
|
lol 0
|
||||||
|
stl -226 ; make copy of address of r
|
||||||
|
lol -22
|
||||||
|
lol -226
|
||||||
|
stf 10 ; r3 := b
|
||||||
|
lal -20
|
||||||
|
loi 16
|
||||||
|
adf 8
|
||||||
|
lol -226
|
||||||
|
adp 2
|
||||||
|
sti 8 ; r2 := x + y
|
||||||
|
loc 0
|
||||||
|
sil -226 ; r1 := 0
|
||||||
|
lin 34 ; note the absence of the unnecessary jump
|
||||||
|
lae 22 ; address of output structure
|
||||||
|
lol -4
|
||||||
|
cal $_wri ; write integer with default width
|
||||||
|
asp 4 ; pop parameters
|
||||||
|
lae 22
|
||||||
|
lol -2
|
||||||
|
loc 6
|
||||||
|
cal $_wsi ; write integer width 6
|
||||||
|
asp 6
|
||||||
|
lae 22
|
||||||
|
lal -12
|
||||||
|
loi 8
|
||||||
|
loc 9
|
||||||
|
loc 3
|
||||||
|
cal $_wrf ; write fixed format real, width 9, precision 3
|
||||||
|
asp 14
|
||||||
|
lae 22
|
||||||
|
lol -22
|
||||||
|
cal $_wrb ; write boolean, default width
|
||||||
|
asp 4
|
||||||
|
lae 22
|
||||||
|
cal $_wln ; writeln
|
||||||
|
asp 2
|
||||||
|
ret 0 ; return, no result
|
||||||
|
end 226
|
||||||
|
exp $_main
|
||||||
|
pro $_main,0 ; main program
|
||||||
|
.6
|
||||||
|
con 2,-1,22 ; description of external files
|
||||||
|
.5
|
||||||
|
rom 15.96F8
|
||||||
|
fil .1 ; maintain source file name
|
||||||
|
lae .6 ; description of external files
|
||||||
|
lae 0 ; base of hol area to relocate buffer addresses
|
||||||
|
cal $_ini ; initialize files, etc...
|
||||||
|
asp 4
|
||||||
|
lin 37
|
||||||
|
lae .5
|
||||||
|
loi 8
|
||||||
|
lae 2
|
||||||
|
sti 8 ; mx := 15.96
|
||||||
|
lni ; lin 38 prior to optimization
|
||||||
|
loc 99
|
||||||
|
ste 0 ; mi := 99
|
||||||
|
lni ; lin 39 prior to optimization
|
||||||
|
lae 10 ; address of r
|
||||||
|
cal $test
|
||||||
|
asp 2
|
||||||
|
loc 0 ; normal exit
|
||||||
|
cal $_hlt ; cleanup and finish
|
||||||
|
asp 2
|
||||||
|
end 0
|
||||||
|
mes 5 ; reals were used
|
||||||
|
.DE 0
|
||||||
|
The compact code corresponding to the above program is listed below.
|
||||||
|
Read it horizontally, line by line, not column by column.
|
||||||
|
Each number represents a byte of compact code, printed in decimal.
|
||||||
|
The first two bytes form the magic word.
|
||||||
|
.N 1
|
||||||
|
.IS 3
|
||||||
|
.DS B
|
||||||
|
173 0 159 122 122 122 255 242 1 161 250 124 116 46 112 0
|
||||||
|
255 156 245 40 2 245 0 128 120 155 249 123 115 117 109 160
|
||||||
|
249 123 115 117 109 122 67 128 63 120 3 122 88 122 152 122
|
||||||
|
242 2 161 121 219 122 255 155 249 124 116 101 115 116 160 249
|
||||||
|
124 116 101 115 116 245 226 0 242 3 161 253 128 123 52 46
|
||||||
|
56 255 242 4 161 253 128 123 48 46 53 255 159 123 245 30
|
||||||
|
255 122 122 255 159 123 96 122 120 255 159 123 98 122 120 255
|
||||||
|
159 123 116 122 120 255 159 123 118 122 120 255 159 123 100 128
|
||||||
|
120 255 159 123 108 128 120 255 67 140 69 121 113 116 68 73
|
||||||
|
116 69 123 81 122 69 126 3 122 113 118 68 57 242 3 72
|
||||||
|
128 58 108 112 128 68 58 108 72 128 57 242 4 72 128 44
|
||||||
|
128 58 100 112 128 68 69 121 113 98 68 69 245 122 0 113
|
||||||
|
96 68 69 121 113 118 182 73 118 42 122 81 122 58 245 32
|
||||||
|
255 73 118 57 242 2 94 122 73 118 69 220 10 123 54 118
|
||||||
|
18 122 183 67 147 73 116 69 147 3 122 104 120 68 73 98
|
||||||
|
73 120 111 130 68 58 100 72 136 2 128 73 120 4 122 112
|
||||||
|
128 68 58 245 32 255 73 116 57 242 2 59 122 65 120 20
|
||||||
|
249 123 115 117 109 8 124 64 122 113 118 184 67 151 73 118
|
||||||
|
128 125 73 116 65 120 3 122 113 116 41 118 18 124 185 67
|
||||||
|
152 73 120 113 245 30 255 73 98 73 245 30 255 111 130 58
|
||||||
|
100 72 136 2 128 73 245 30 255 4 122 112 128 69 120 104
|
||||||
|
245 30 255 67 154 57 142 73 116 20 249 124 95 119 114 105
|
||||||
|
8 124 57 142 73 118 69 126 20 249 124 95 119 115 105 8
|
||||||
|
126 57 142 58 108 72 128 69 129 69 123 20 249 124 95 119
|
||||||
|
114 102 8 134 57 142 73 98 20 249 124 95 119 114 98 8
|
||||||
|
124 57 142 20 249 124 95 119 108 110 8 122 88 120 152 245
|
||||||
|
226 0 155 249 125 95 109 97 105 110 160 249 125 95 109 97
|
||||||
|
105 110 120 242 6 151 122 119 142 255 242 5 161 253 128 125
|
||||||
|
49 53 46 57 54 255 50 242 1 57 242 6 57 120 20 249
|
||||||
|
124 95 105 110 105 8 124 67 157 57 242 5 72 128 57 122
|
||||||
|
112 128 68 69 219 110 120 68 57 130 20 249 124 116 101 115
|
||||||
|
116 8 122 69 120 20 249 124 95 104 108 116 8 122 152 120
|
||||||
|
159 124 160 255 159 125 255
|
||||||
|
.DE 0
|
||||||
|
.IE
|
||||||
|
.MS T A 0
|
||||||
|
.ME
|
||||||
|
.BP
|
||||||
|
.MS B A 0
|
||||||
|
.ME
|
||||||
|
.CT
|
||||||
756
doc/em/assem.nr
Normal file
|
|
@ -0,0 +1,756 @@
|
||||||
|
.BP
|
||||||
|
.SN 11
|
||||||
|
.S1 "EM ASSEMBLY LANGUAGE"
|
||||||
|
We use two representations for assembly language programs:
|
||||||
|
one is in ASCII and the other is the compact assembly language.
|
||||||
|
The latter needs less space than the first for the same program
|
||||||
|
and therefore allows faster processing.
|
||||||
|
Our only program accepting ASCII assembly
|
||||||
|
language converts it to the compact form.
|
||||||
|
All other programs expect compact assembly input.
|
||||||
|
The first part of the chapter describes the ASCII assembly
|
||||||
|
language and its semantics.
|
||||||
|
The second part describes the syntax of the compact assembly
|
||||||
|
language.
|
||||||
|
The last part lists the EM instructions with the type of
|
||||||
|
arguments allowed and an indication of the function.
|
||||||
|
Appendix A gives a detailed description of the effect of all
|
||||||
|
instructions in the form of a Pascal program.
|
||||||
|
.S2 "ASCII assembly language"
|
||||||
|
An assembly language program consists of a series of lines; each
|
||||||
|
line may be blank, contain one (pseudo)instruction or contain one
|
||||||
|
label.
|
||||||
|
Input to the assembler is in lower case.
|
||||||
|
Upper case is used in this
|
||||||
|
document merely to distinguish keywords from the surrounding prose.
|
||||||
|
A comment is allowed at the end of each line and starts with a semicolon ";".
|
||||||
|
This kind of comment does not exist in the compact form.
|
||||||
|
.A
|
||||||
|
Labels must be placed all by themselves on a line and start in
|
||||||
|
column 1.
|
||||||
|
There are two kinds of labels, instruction and data labels.
|
||||||
|
Instruction labels are unsigned positive integers.
|
||||||
|
The scope of an instruction label is its procedure.
|
||||||
|
.A
|
||||||
|
The pseudoinstructions CON, ROM and BSS may be preceded by a
|
||||||
|
line containing a
|
||||||
|
1-8 character data label, the first character of which is a
|
||||||
|
letter, period or underscore.
|
||||||
|
The period may only be followed by
|
||||||
|
digits, the others may be followed by letters, digits and underscores.
|
||||||
|
The use of the character "." followed by a constant,
|
||||||
|
which must be in the range 1 to 32767 (e.g. ".40") is recommended
|
||||||
|
for compiler
|
||||||
|
generated programs.
|
||||||
|
These labels are considered as a special case and handled
|
||||||
|
more efficiently in compact assembly language (see below).
|
||||||
|
Note that a data label on its own or two consecutive labels are not
|
||||||
|
allowed.
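The rules for data labels given above can be checked mechanically.
The following Python sketch is one possible reading of them;
the name is_data_label is hypothetical, and accepting upper case letters
as well as lower case ones is an assumption.

```python
import re

# A letter or underscore followed by letters, digits and underscores,
# 1-8 characters in total.
_NAMED = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,7}\Z")
# A period followed only by digits, again at most 8 characters in total.
_NUMBERED = re.compile(r"\.([0-9]{1,7})\Z")


def is_data_label(text):
    """Return True if `text` looks like a valid data label."""
    if _NAMED.match(text):
        return True
    match = _NUMBERED.match(text)
    # The constant after the period must be in the range 1 to 32767.
    return bool(match) and 1 <= int(match.group(1)) <= 32767
```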
|
||||||
|
.P
|
||||||
|
Each statement may contain an instruction mnemonic or pseudoinstruction.
|
||||||
|
These must begin in column 2 or later (not column 1) and must be followed
|
||||||
|
by a space, tab, semicolon or LF.
|
||||||
|
Everything on the line following a semicolon is
|
||||||
|
taken as a comment.
|
||||||
|
.P
|
||||||
|
Each input file contains one module.
|
||||||
|
A module may contain many procedures,
|
||||||
|
which may be nested.
|
||||||
|
A procedure consists of
|
||||||
|
a PRO statement, a (possibly empty)
|
||||||
|
collection of instructions and pseudoinstructions and finally an END
|
||||||
|
statement.
|
||||||
|
Pseudoinstructions are also allowed between procedures.
|
||||||
|
They do not belong to a specific procedure.
|
||||||
|
.P
|
||||||
|
All constants in EM are interpreted in the decimal base.
|
||||||
|
The ASCII assembly language accepts constant expressions
|
||||||
|
wherever constants are allowed.
|
||||||
|
The operators recognized are: +, -, *, % and / with the usual
|
||||||
|
precedence order.
|
||||||
|
Use of the parentheses ( and ) to alter the precedence order is allowed.
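A constant expression of this kind can be evaluated by a small recursive descent
parser, sketched below in Python.
The function names are hypothetical.
The text above does not say how / and % treat negative operands, nor whether a
unary sign is accepted; truncation toward zero and unary + and - are assumptions
of this sketch.

```python
import re

_TOKEN = re.compile(r"\s*(\d+|[-+*/%()])")


def tokenize(text):
    """Split an expression into decimal constants, operators and parentheses."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(text):
    """Evaluate +, -, *, % and / with the usual precedence and parentheses."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        pos += 1
        return tokens[pos - 1]

    def primary():
        token = take()
        if token == "(":
            value = additive()
            if take() != ")":
                raise ValueError("missing closing parenthesis")
            return value
        if token == "-":                   # unary minus (assumption)
            return -primary()
        if token == "+":                   # unary plus (assumption)
            return primary()
        if token.isdigit():
            return int(token)              # constants are decimal
        raise ValueError(f"unexpected token {token!r}")

    def multiplicative():
        value = primary()
        while peek() in ("*", "/", "%"):
            op, rhs = take(), primary()
            if op == "*":
                value *= rhs
                continue
            quotient = abs(value) // abs(rhs)
            if (value < 0) != (rhs < 0):   # truncate toward zero (assumption)
                quotient = -quotient
            value = quotient if op == "/" else value - quotient * rhs
        return value

    def additive():
        value = multiplicative()
        while peek() in ("+", "-"):
            value = value + multiplicative() if take() == "+" else value - multiplicative()
        return value

    result = additive()
    if peek() is not None:
        raise ValueError(f"unexpected token {peek()!r}")
    return result
```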
|
||||||
|
.S3 "Instruction arguments"
|
||||||
|
Unlike many other assembly languages, the EM assembly
|
||||||
|
language requires all arguments of normal and pseudoinstructions
|
||||||
|
to be either a constant or an identifier, but not a combination
|
||||||
|
of these two.
|
||||||
|
There is one exception to this rule: when a data label is used
|
||||||
|
for initialization or as an instruction argument,
|
||||||
|
expressions of the form 'label+constant' and 'label-constant'
|
||||||
|
are allowed.
|
||||||
|
This makes it possible to address, for example, the
|
||||||
|
third word of a ten word BSS block
|
||||||
|
directly.
|
||||||
|
Thus LOE LABEL+4 is permitted and so is CON LABEL+3.
|
||||||
|
The resulting address must be in the same fragment as the label.
|
||||||
|
It is not allowed to add or subtract from instruction labels or procedure
|
||||||
|
identifiers,
|
||||||
|
which certainly is not a severe restriction and greatly aids
|
||||||
|
optimization.
|
||||||
|
.P
|
||||||
|
Instruction arguments can be constants,
|
||||||
|
data labels, data labels offset by a constant, instruction
|
||||||
|
labels and procedure identifiers.
|
||||||
|
The range of integers allowed depends on the instruction.
|
||||||
|
Most instructions allow only integers
|
||||||
|
(signed or unsigned)
|
||||||
|
that fit in a word.
|
||||||
|
Arguments used as offsets to pointers should fit in a
|
||||||
|
pointer-sized integer.
|
||||||
|
Finally, arguments to LDC should fit in a double-word integer.
|
||||||
|
.P
|
||||||
|
Several instructions have two possible forms:
|
||||||
|
with an explicit argument and with an implicit argument on top of the stack.
|
||||||
|
The size of the implicit argument is the wordsize.
|
||||||
|
The implicit argument is always popped before all other operands.
|
||||||
|
For example: 'CMI 4' specifies that two four-byte signed
|
||||||
|
integers on top of the stack are to be compared.
|
||||||
|
\&'CMI' without an argument expects a wordsized integer
|
||||||
|
on top of the stack that specifies the size of the integers to
|
||||||
|
be compared.
|
||||||
|
Thus the following two sequences are equivalent:
|
||||||
|
.N 2
|
||||||
|
.TS
|
||||||
|
center, tab(:) ;
|
||||||
|
l r 30 l r.
|
||||||
|
LDL:-10:LDL:-10
|
||||||
|
LDL:-14:LDL:-14
|
||||||
|
::LOC:4
|
||||||
|
CMI:4:CMI:
|
||||||
|
ZEQ:*1:ZEQ:*1
|
||||||
|
.TE 2
|
||||||
|
Section 11.1.6 shows the arguments allowed for each instruction.
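The equivalence of the explicit and implicit forms can be illustrated with a toy
stack model in Python.
This is a sketch only: the function cmi is hypothetical, each stack entry stands
for one integer regardless of its size, and the result pushed (-1, 0 or 1) as well
as the operand order are assumptions that the text above does not specify.

```python
def cmi(stack, size=None):
    """Compare the two integers on top of `stack` and push the outcome.

    With size=None the size is taken from the top of the stack, modelling
    the implicit form; it is popped before all other operands.
    """
    if size is None:
        size = stack.pop()                 # implicit argument comes off first
    right = stack.pop()
    left = stack.pop()
    # Assumed result convention: negative, zero or positive.
    stack.append((left > right) - (left < right))
    return size


explicit = [10, 14]                        # two operands, then 'CMI 4'
cmi(explicit, 4)

implicit = [10, 14, 4]                     # two operands, 'LOC 4', then 'CMI'
cmi(implicit)
```

Both stacks end up holding the same single result, which a following ZEQ would test.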
|
||||||
|
.S3 "Pseudoinstruction arguments"
|
||||||
|
Pseudoinstruction arguments can be divided into two classes:
|
||||||
|
initializers and others.
|
||||||
|
The following initializers are allowed: signed integer constants,
|
||||||
|
unsigned integer constants, floating-point constants, strings,
|
||||||
|
data labels, data labels offset by a constant, instruction
|
||||||
|
labels and procedure identifiers.
|
||||||
|
.P
|
||||||
|
Constant initializers in BSS, HOL, CON and ROM pseudoinstructions
|
||||||
|
can be followed by a letter I, U or F.
|
||||||
|
This indicator
|
||||||
|
specifies the type of the initializer: Integer, Unsigned or Float.
|
||||||
|
If no indicator is present I is assumed.
|
||||||
|
The size of the object is the wordsize unless
|
||||||
|
the indicator is followed by an integer specifying the
|
||||||
|
object's size.
|
||||||
|
This integer is governed by the same restrictions as for
|
||||||
|
transfer of objects to/from memory.
|
||||||
|
As in instruction arguments, initializers include expressions of the form:
|
||||||
|
\&"LABEL+offset" and "LABEL-offset".
|
||||||
|
The offset must be an unsigned decimal constant.
|
||||||
|
The 'IUF' indicators cannot be used in the offsets.
|
||||||
|
.P
|
||||||
|
Data labels are referred to by their name.
|
||||||
|
.P
|
||||||
|
Strings are surrounded by double quotes (").
|
||||||
|
Semicolons in a string do not indicate the start of a comment.
|
||||||
|
In the ASCII representation the escape character \e (backslash)
|
||||||
|
alters the meaning of subsequent character(s).
|
||||||
|
This feature allows inclusion of zeroes, graphic characters and
|
||||||
|
the double quote in the string.
|
||||||
|
The following escape sequences exist:
|
||||||
|
.DS
|
||||||
|
.TS
|
||||||
|
center, tab(:);
|
||||||
|
l l l.
|
||||||
|
newline:NL\|(LF):\en
|
||||||
|
horizontal tab:HT:\et
|
||||||
|
backspace:BS:\eb
|
||||||
|
carriage return:CR:\er
|
||||||
|
form feed:FF:\ef
|
||||||
|
backslash:\e:\e\e
|
||||||
|
double quote:":\e"
|
||||||
|
bit pattern:\fBddd\fP:\e\fBddd\fP
|
||||||
|
.TE
|
||||||
|
.DE
|
||||||
|
The escape \fBddd\fP consists of the backslash followed by 1,
|
||||||
|
2, or 3 octal digits specifying the value of
|
||||||
|
the desired character.
|
||||||
|
If the character following a backslash is not one of those
|
||||||
|
specified,
|
||||||
|
the backslash is ignored.
|
||||||
|
Example: CON "hello\e012\e0".
|
||||||
|
Each string element initializes a single byte.
|
||||||
|
The ASCII character set is used to map characters onto values.
|
||||||
|
Strings are padded with zeroes up to a multiple of the wordsize.
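The escape and padding rules for strings can be sketched in Python as follows.
The function name decode_string is hypothetical; it takes the text between the
double quotes and returns the initialized bytes.
Dropping a backslash at the very end of the string and masking octal values
to one byte are assumptions of this sketch.

```python
_SIMPLE = {"n": 10, "t": 9, "b": 8, "r": 13, "f": 12, "\\": 92, '"': 34}
_OCTAL = "01234567"


def decode_string(body, wordsize=2):
    """Decode the escapes in `body` and pad with zeroes to a multiple of wordsize."""
    out = bytearray()
    i = 0
    while i < len(body):
        ch = body[i]
        i += 1
        if ch != "\\":
            out.append(ord(ch))            # ASCII maps characters onto values
            continue
        if i == len(body):
            break                          # trailing backslash dropped (assumption)
        nxt = body[i]
        if nxt in _OCTAL:
            j = i
            while j < len(body) and j - i < 3 and body[j] in _OCTAL:
                j += 1                     # 1, 2 or 3 octal digits
            out.append(int(body[i:j], 8) & 0xFF)
            i = j
        elif nxt in _SIMPLE:
            out.append(_SIMPLE[nxt])
            i += 1
        # Otherwise the backslash is ignored and the next character is
        # handled as an ordinary character on the following iteration.
    out.extend(b"\0" * (-len(out) % wordsize))
    return bytes(out)
```

For the example above, decode_string applied to hello\012\0 gives seven bytes of
content, padded with one zero byte to eight bytes for a wordsize of 2.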
|
||||||
|
.P
|
||||||
|
Instruction labels are referred to as *1, *2, etc., both in branch
|
||||||
|
instructions and as initializers.
|
||||||
|
.P
|
||||||
|
The notation $procname means the identifier for the procedure
|
||||||
|
with the specified name.
|
||||||
|
This identifier has the size of a pointer.
|
||||||
|
.S3 Notation
|
||||||
|
First, the notation used for the arguments, classes of
|
||||||
|
instructions and pseudoinstructions.
|
||||||
|
.IS 2
|
||||||
|
.TS
|
||||||
|
tab(:);
|
||||||
|
l l l.
|
||||||
|
<cst>:\&=:integer constant (current range -2**31..2**31-1)
|
||||||
|
<dlb>:\&=:data label
|
||||||
|
<arg>:\&=:<cst> or <dlb> or <dlb>+<cst> or <dlb>-<cst>
|
||||||
|
<con>:\&=:integer constant, unsigned constant, floating-point constant
|
||||||
|
<str>:\&=:string constant (surrounded by double quotes)
|
||||||
|
<ilb>:\&=:instruction label
|
||||||
|
::'*' followed by an integer in the range 0..32767.
|
||||||
|
<pro>:\&=:procedure number ('$' followed by a procedure name)
|
||||||
|
<val>:\&=:<arg>, <con>, <pro> or <ilb>.
|
||||||
|
<par>:\&=:<val> or <str>
|
||||||
|
<...>*:\&=:zero or more of <...>
|
||||||
|
<...>+:\&=:one or more of <...>
|
||||||
|
[...]:\&=:optional ...
|
||||||
|
.TE
|
||||||
|
.IE
|
||||||
|
.S3 "Pseudoinstructions"
|
||||||
|
.S4 "Storage declaration"
|
||||||
|
Initialized global data is allocated by the pseudoinstruction CON,
|
||||||
|
which needs at least one argument.
|
||||||
|
For each argument, an integral number of words,
|
||||||
|
determined by the argument type, is allocated and initialized.
|
||||||
|
.P
|
||||||
|
The pseudoinstruction ROM is the same as CON,
|
||||||
|
except that it guarantees that the initialized words
|
||||||
|
will not change during the execution of the program.
|
||||||
|
This information allows optimizers to do
|
||||||
|
certain calculations such as array indexing and
|
||||||
|
subrange checking at compile time instead
|
||||||
|
of at run time.
|
||||||
|
.P
|
||||||
|
The pseudoinstruction BSS allocates
|
||||||
|
uninitialized global data or large blocks of data initialized
|
||||||
|
by the same value.
|
||||||
|
The first argument to this pseudo is the number
|
||||||
|
of bytes required, which must be a multiple of the wordsize.
|
||||||
|
The other arguments specify the value used for initialization and
|
||||||
|
whether the initialization is only for convenience or a strict necessity.
|
||||||
|
The pseudoinstruction HOL is similar to BSS in that it requests an
|
||||||
|
(un)initialized global data block.
|
||||||
|
Addressing of a HOL block, however, is quasi absolute.
|
||||||
|
The first byte is addressed by 0,
|
||||||
|
the second byte by 1 etc. in assembly language.
|
||||||
|
The assembler/loader adds the base address of
|
||||||
|
the HOL block to these numbers to obtain the
|
||||||
|
absolute address in the machine language.
|
||||||
|
.P
|
||||||
|
The scope of a HOL block starts at the HOL pseudo and
|
||||||
|
ends at the next HOL pseudo or at the end of a module
|
||||||
|
whichever comes first.
|
||||||
|
Each instruction falls in the scope of at most one
|
||||||
|
HOL block, the current HOL block.
|
||||||
|
It is not allowed to have more than one HOL block per procedure.
|
||||||
|
.P
|
||||||
|
The alignment restrictions are enforced by the
|
||||||
|
pseudoinstructions.
|
||||||
|
All objects are aligned on a multiple of their size or the wordsize
|
||||||
|
whichever is smaller.
|
||||||
|
Switching to another type of fragment or placing a label forces
|
||||||
|
word-alignment.
|
||||||
|
There are three types of fragments in global data space: CON, ROM and
|
||||||
|
BSS/HOL.
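The alignment rule for objects amounts to rounding an offset up to a multiple of
the smaller of the object size and the wordsize.
A minimal Python sketch, with the hypothetical name align and the assumption that
offsets are byte offsets within a fragment:

```python
def align(offset, size, wordsize=2):
    """Round `offset` up to a multiple of min(size, wordsize)."""
    boundary = min(size, wordsize)
    return -(-offset // boundary) * boundary   # ceiling division, then scale


def align_label(offset, wordsize=2):
    """A label (or a switch of fragment type) forces word alignment."""
    return align(offset, wordsize, wordsize)
```

With a wordsize of 2, an 8-byte object following a single byte at offset 3 is
placed at offset 4, whereas a 1-byte object would stay at offset 3.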
|
||||||
|
.N 2
|
||||||
|
.IS 2
|
||||||
|
.PS - 4
|
||||||
|
.PT "BSS <cst1>,<val>,<cst2>"
|
||||||
|
Reserve <cst1> bytes.
|
||||||
|
<val> is the value used to initialize the area.
|
||||||
|
<cst1> must be a multiple of the size of <val>.
|
||||||
|
<cst2> is 0 if the initialization is not strictly necessary,
|
||||||
|
1 if it is.
|
||||||
|
.PT "HOL <cst1>,<val>,<cst2>"
|
||||||
|
Idem, but all following absolute global data references will
|
||||||
|
refer to this block.
|
||||||
|
Only one HOL is allowed per procedure,
|
||||||
|
it has to be placed before the first instruction.
|
||||||
|
.PT "CON <val>+"
|
||||||
|
Assemble global data words initialized with the <val> constants.
|
||||||
|
.PT "ROM <val>+"
|
||||||
|
Idem, but the initialized data will never be changed by the program.
|
||||||
|
.PE
|
||||||
|
.IE
|
||||||
|
.S4 Partitioning
|
||||||
|
Two pseudoinstructions partition the input into procedures:
|
||||||
|
.IS 2
|
||||||
|
.PS - 4
|
||||||
|
.PT "PRO <pro>[,<cst>]"
|
||||||
|
Start of procedure.
|
||||||
|
<pro> is the procedure name.
|
||||||
|
<cst> is the number of bytes for locals.
|
||||||
|
The number of bytes for locals must be specified in the PRO or
|
||||||
|
END pseudoinstruction.
|
||||||
|
When specified in both, they must be identical.
|
||||||
|
.PT "END [<cst>]"
|
||||||
|
End of procedure.
|
||||||
|
<cst> is the number of bytes for locals.
|
||||||
|
The number of bytes for locals must be specified in either the PRO or
|
||||||
|
END pseudoinstruction or both.
|
||||||
|
.PE
|
||||||
|
.IE
|
||||||
|
.S4 Visibility
|
||||||
|
Names of data and procedures in an EM module can either be
|
||||||
|
internal or external.
|
||||||
|
External names are known outside the module and are used to link
|
||||||
|
several pieces of a program.
|
||||||
|
Internal names are not known outside the modules they are used in.
|
||||||
|
Other modules will not 'see' an internal name.
|
||||||
|
.A
|
||||||
|
To reduce the number of passes needed,
|
||||||
|
it must be known at the first occurrence whether
|
||||||
|
a name is internal or external.
|
||||||
|
If the first occurrence of a name is in a definition,
|
||||||
|
the name is considered to be internal.
|
||||||
|
If the first occurrence of a name is a reference,
|
||||||
|
the name is considered to be external.
|
||||||
|
If the first occurrence is in one of the following pseudoinstructions,
|
||||||
|
the effect of the pseudo has precedence.
|
||||||
|
.IS 2
|
||||||
|
.PS - 4
|
||||||
|
.PT "EXA <dlb>"
|
||||||
|
External name.
|
||||||
|
<dlb> is known, possibly defined, outside this module.
|
||||||
|
Note that <dlb> may be defined in the same module.
|
||||||
|
.PT "EXP <pro>"
|
||||||
|
External procedure identifier.
|
||||||
|
Note that <pro> may be defined in the same module.
|
||||||
|
.PT "INA <dlb>"
|
||||||
|
Internal name.
|
||||||
|
<dlb> is internal to this module and must be defined in this module.
|
||||||
|
.PT "INP <pro>"
|
||||||
|
Internal procedure.
|
||||||
|
<pro> is internal to this module and must be defined in this module.
|
||||||
|
.PE
|
||||||
|
.IE
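The first-occurrence rule for visibility can be sketched as below. This is an illustration only: the function name `classify` and the tagging of occurrences as 'def', 'ref' or a pseudo mnemonic are assumptions made for the example, and names are treated as opaque strings.

```python
def classify(occurrences):
    """Decide internal/external per name from its first occurrence.

    `occurrences` is a list of (kind, name) pairs in input order, where kind is
    'def' (definition), 'ref' (reference) or one of EXA, EXP, INA, INP.
    """
    scope = {}
    for kind, name in occurrences:
        if name in scope:
            continue  # only the first occurrence decides
        if kind in ("EXA", "EXP"):
            scope[name] = "external"  # the pseudo has precedence
        elif kind in ("INA", "INP"):
            scope[name] = "internal"  # the pseudo has precedence
        elif kind == "def":
            scope[name] = "internal"
        elif kind == "ref":
            scope[name] = "external"
        else:
            raise ValueError("unknown occurrence kind: %r" % (kind,))
    return scope
```

A name that is first seen in EXA and defined later in the same module stays external, which is the case the notes on EXA and EXP allow for.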
|
||||||
|
.S4 Miscellaneous
|
||||||
|
Two other pseudoinstructions provide miscellaneous features:
|
||||||
|
.IS 2
|
||||||
|
.PS - 4
|
||||||
|
.PT "EXC <cst1>,<cst2>"
|
||||||
|
Two blocks of instructions preceding this one are
|
||||||
|
interchanged before being processed.
|
||||||
|
<cst1> gives the number of lines of the first block.
|
||||||
|
<cst2> gives the number of lines of the second one.
|
||||||
|
Blank and pure comment lines do not count.
|
||||||
|
.PT "MES <cst>[,<par>]*"
|
||||||
|
A special type of comment.
|
||||||
|
Used by compilers to communicate with the
|
||||||
|
optimizer, assembler, etc. as follows:
|
||||||
|
.VS 1 0
|
||||||
|
.PS - 4
|
||||||
|
.PT "MES 0"
|
||||||
|
An error has occurred, stop further processing.
|
||||||
|
.PT "MES 1"
|
||||||
|
Suppress optimization.
|
||||||
|
.PT "MES 2,<cst1>,<cst2>"
|
||||||
|
Use wordsize <cst1> and pointer size <cst2>.
|
||||||
|
.PT "MES 3,<cst1>,<cst2>,<cst3>,<cst4>"
|
||||||
|
Indicates that a local variable is never referenced indirectly.
|
||||||
|
Used to indicate that a register may be used for a specific
|
||||||
|
variable.
|
||||||
|
<cst1> is offset in bytes from AB if positive
|
||||||
|
and offset from LB if negative.
|
||||||
|
<cst2> gives the size of the variable.
|
||||||
|
<cst3> indicates the class of the variable.
|
||||||
|
The following values are currently recognized:
|
||||||
|
.PS
|
||||||
|
.PT 0
|
||||||
|
The variable can be used for anything.
|
||||||
|
.PT 1
|
||||||
|
The variable is used as a loopindex.
|
||||||
|
.PT 2
|
||||||
|
The variable is used as a pointer.
|
||||||
|
.PT 3
|
||||||
|
The variable is used as a floating point number.
|
||||||
|
.PE 0
|
||||||
|
<cst4> gives the priority of the variable,
|
||||||
|
higher numbers indicate better candidates.
|
||||||
|
.PT "MES 4,<cst>,<str>"
|
||||||
|
Number of source lines in file <str> (for profiler).
|
||||||
|
.PT "MES 5"
|
||||||
|
Floating point used.
|
||||||
|
.PT "MES 6,<val>*"
|
||||||
|
Comment. Used to provide comments in compact assembly language.
|
||||||
|
.PT "MES 7,....."
|
||||||
|
Reserved.
|
||||||
|
.PT "MES 8,<pro>[,<dlb>]..."
|
||||||
|
Library module. Indicates that the module may only be loaded
|
||||||
|
if it is useful, that is, if it can satisfy any unresolved
|
||||||
|
references during the loading process.
|
||||||
|
May not be preceded by any other pseudo, except MES's.
|
||||||
|
.PT "MES 9,<cst>"
|
||||||
|
Guarantees that no more than <cst> bytes of parameters are
|
||||||
|
accessed, either directly or indirectly.
|
||||||
|
.PE 1
|
||||||
|
.VS 1 1
|
||||||
|
Each backend is free to skip irrelevant MES pseudos.
|
||||||
|
.PE
|
||||||
|
.IE
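The effect of the EXC pseudoinstruction described above can be sketched as follows. The sketch assumes that the second block is the one immediately preceding EXC and that the first block immediately precedes the second; the function name `apply_exc` is hypothetical, and blank and pure comment lines are assumed to have been removed already.

```python
def apply_exc(lines, cst1, cst2):
    """Interchange the last two blocks of `lines`, which precede an EXC cst1,cst2."""
    total = cst1 + cst2
    if cst1 < 0 or cst2 < 0 or total > len(lines):
        raise ValueError("EXC block sizes do not fit the preceding lines")
    split = len(lines) - total
    head = lines[:split]                   # lines not affected by the exchange
    first = lines[split:split + cst1]      # block of cst1 lines
    second = lines[split + cst1:]          # block of cst2 lines
    return head + second + first
```

For example, with the four preceding lines `LOC 1`, `LOL 0`, `LOL 2`, `ADI 2` and `EXC 1,2`, the one-line block `LOL 0` moves behind the two-line block `LOL 2`, `ADI 2`.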
|
||||||
|
.S2 "The Compact Assembly Language"
|
||||||
|
The assembler accepts input in a highly encoded form.
|
||||||
|
This
|
||||||
|
form is intended to reduce the amount of file transport between the
|
||||||
|
front ends, optimizers
|
||||||
|
and back ends, and also reduces the amount of storage required for storing
|
||||||
|
libraries.
|
||||||
|
Libraries are stored as archived compact assembly language, not machine
|
||||||
|
language.
|
||||||
|
.P
|
||||||
|
When beginning to read the input, the assembler is in neutral state, and
|
||||||
|
expects either a label or an instruction (including the pseudoinstructions).
|
||||||
|
The meaning of the next byte(s) when in neutral state is as follows, where
|
||||||
|
b1, b2
|
||||||
|
etc. represent the succeeding bytes.
|
||||||
|
.N 1
|
||||||
|
.DS
|
||||||
|
.TS
|
||||||
|
tab(:) ;
|
||||||
|
rw17 4 l.
|
||||||
|
0:Reserved for future use
|
||||||
|
1-129:Machine instructions, see Appendix A, alphabetical list
|
||||||
|
130-149:Reserved for future use
|
||||||
|
150-161:BSS,CON,END,EXA,EXC,EXP,HOL,INA,INP,MES,PRO,ROM
|
||||||
|
162-179:Reserved for future pseudoinstructions
|
||||||
|
180-239:Instruction labels 0 - 59 (180 is local label 0 etc.)
|
||||||
|
240-244:See the Common Table below
|
||||||
|
245-255:Not used
|
||||||
|
.TE 1
|
||||||
|
.DE 0
|
||||||
|
After a label, the assembler is back in neutral state; it can immediately
|
||||||
|
accept another label or an instruction in the next byte.
|
||||||
|
No linefeeds are used to separate lines.
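The neutral-state table above can be read as a simple dispatch on the next byte. The sketch below is illustrative only; the function name `classify_neutral` is hypothetical, and it assumes that the twelve pseudoinstructions are numbered 150 to 161 in the order in which the table lists them.

```python
# Assumed numbering: BSS = 150, CON = 151, ..., ROM = 161 (order of the table).
PSEUDOS = ["BSS", "CON", "END", "EXA", "EXC", "EXP",
           "HOL", "INA", "INP", "MES", "PRO", "ROM"]


def classify_neutral(byte):
    """Return (kind, detail) for a byte read while the assembler is in neutral state."""
    if not 0 <= byte <= 255:
        raise ValueError("not a byte")
    if byte == 0:
        return ("reserved", None)
    if byte <= 129:
        return ("instruction", byte)        # machine instruction number
    if byte <= 149:
        return ("reserved", None)
    if byte <= 161:
        return ("pseudo", PSEUDOS[byte - 150])
    if byte <= 179:
        return ("reserved", None)
    if byte <= 239:
        return ("label", byte - 180)        # instruction labels 0 - 59
    if byte <= 244:
        return ("common", byte)             # continue with the Common Table
    return ("unused", None)
```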
|
||||||
|
.P
|
||||||
|
If an opcode expects no arguments,
|
||||||
|
the assembler is back in neutral state after
|
||||||
|
reading the one byte containing the instruction number.
|
||||||
|
If it has one or
|
||||||
|
more arguments (only pseudos have more than 1), the arguments follow directly,
|
||||||
|
encoded as follows:
|
||||||
|
.N 1
|
||||||
|
.IS 2
|
||||||
|
.TS
|
||||||
|
tab(:);
|
||||||
|
r l.
|
||||||
|
0-239:Offsets from -120 to 119
|
||||||
|
|
||||||
|
240-255:See the Common Table below
|
||||||
|
.TE 1
|
||||||
|
Absence of an optional argument is indicated by a special
|
||||||
|
byte.
|
||||||
|
.IE 2
|
||||||
|
.CS
|
||||||
|
Common Table for Neutral State and Arguments
|
||||||
|
.CE
|
||||||
|
.TS
|
||||||
|
tab(:);
|
||||||
|
c c s c
|
||||||
|
l8 l l8 l.
|
||||||
|
class:bytes:description
|
||||||
|
|
||||||
|
<ilb>:240:b1:Instruction label b1 (Not used for branches)
|
||||||
|
<ilb>:241:b1 b2:16 bit instruction label (256*b2 + b1)
|
||||||
|
<dlb>:242:b1:Global label .0-.255, with b1 being the label
|
||||||
|
<dlb>:243:b1 b2:Global label .0-.32767
|
||||||
|
:::with 256*b2+b1 being the label
|
||||||
|
<dlb>:244:<string>:Global symbol not of the form .nnn
|
||||||
|
<cst>:245:b1 b2:16 bit constant
|
||||||
|
<cst>:246:b1 b2 b3 b4:32 bit constant
|
||||||
|
<cst>:247:b1 .. b8:64 bit constant
|
||||||
|
<arg>:248:<dlb><cst>:Global label + (possibly negative) constant
|
||||||
|
<pro>:249:<string>:Procedure name (not including $)
|
||||||
|
<str>:250:<string>:String used in CON or ROM (no quotes-no escapes)
|
||||||
|
<con>:251:<cst><string>:Integer constant, size <cst> bytes
|
||||||
|
<con>:252:<cst><string>:Unsigned constant, size <cst> bytes
|
||||||
|
<con>:253:<cst><string>:Floating constant, size <cst> bytes
|
||||||
|
:254::unused
|
||||||
|
<end>:255::Delimiter for argument lists or
|
||||||
|
:::indicates absence of optional argument
|
||||||
|
.TE 1
|
||||||
|
.P
|
||||||
|
The bytes specifying the value of a 16, 32 or 64 bit constant
|
||||||
|
are presented in two's complement notation, with the least
|
||||||
|
significant byte first. For example: the value of a 32 bit
|
||||||
|
constant is ((s4*256+b3)*256+b2)*256+b1, where s4 is b4-256 if
|
||||||
|
b4 is greater than or equal to 128, else s4 takes the value of b4.
|
||||||
|
A <string> consists of a <cst> immediately followed by
|
||||||
|
a sequence of bytes with length <cst>.
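The decoding of a 16, 32 or 64 bit constant can be sketched as below. The function name `decode_constant` is hypothetical; the most significant byte is taken as negative when its high bit is set (values 128 to 255), which is what two's complement notation requires.

```python
def decode_constant(data):
    """Decode a two's complement constant stored least significant byte first."""
    if len(data) not in (2, 4, 8):
        raise ValueError("constant must be 2, 4 or 8 bytes long")
    top = data[-1]
    value = top - 256 if top >= 128 else top   # signed most significant byte
    for b in reversed(data[:-1]):
        value = value * 256 + b                # ((s4*256+b3)*256+b2)*256+b1
    return value
```

The result agrees with Python's built-in `int.from_bytes(data, "little", signed=True)`, which can serve as a cross-check.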
|
||||||
|
.P
|
||||||
|
.ne 8
|
||||||
|
The pseudoinstructions fall into several categories, depending on their
|
||||||
|
arguments:
|
||||||
|
.N 1
|
||||||
|
.DS
|
||||||
|
Group 1 -- EXC, BSS, HOL have a known number of arguments
|
||||||
|
Group 2 -- EXA, EXP, INA, INP have a string as argument
|
||||||
|
Group 3 -- CON, MES, ROM have a variable number of various things
|
||||||
|
Group 4 -- END, PRO have a trailing optional argument.
|
||||||
|
.DE 1
|
||||||
|
Groups 1 and 2
|
||||||
|
use the encoding described above.
|
||||||
|
Group 3 also uses the encoding listed above, with an <end> byte after the
|
||||||
|
last argument to indicate the end of the list.
|
||||||
|
Group 4 uses
|
||||||
|
an <end> byte if the trailing argument is not present.
|
||||||
|
.N 2
|
||||||
|
.IS 2
|
||||||
|
.TS
|
||||||
|
tab(|);
|
||||||
|
l s l
|
||||||
|
l s s
|
||||||
|
l 2 lw(46) l.
|
||||||
|
Example ASCII|Example compact
|
||||||
|
(LOC = 69, BRA = 18 here):
|
||||||
|
|
||||||
|
2||182
|
||||||
|
1||181
|
||||||
|
LOC|10|69 130
|
||||||
|
LOC|-10|69 110
|
||||||
|
LOC|300|69 245 44 1
|
||||||
|
BRA|*19|18 139
|
||||||
|
300||241 44 1
|
||||||
|
.3||242 3
|
||||||
|
CON|4,9,*2,$foo|151 124 129 240 2 249 123 102 111 111 255
|
||||||
|
CON|.35|151 242 35 255
|
||||||
|
.TE 0
|
||||||
|
.IE 0
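The examples in the table above can be reproduced with a small encoder. This is a sketch under stated assumptions, not the real assembler: all function names are hypothetical, the opcode numbers are the ones quoted in the tables (LOC = 69, BRA = 18, CON = 151), and the use of `240 b1` for label definitions 60 to 255 is a guess based on the Common Table.

```python
LOC, BRA, CON = 69, 18, 151   # numbers quoted in the tables above
END_OF_LIST = 255             # <end> delimiter


def enc_cst(n):
    """Constant argument: one byte for -120..119, else marker 245, 246 or 247."""
    if -120 <= n <= 119:
        return [n + 120]
    for marker, size in ((245, 2), (246, 4), (247, 8)):
        limit = 1 << (8 * size - 1)
        if -limit <= n < limit:
            return [marker] + list(n.to_bytes(size, "little", signed=True))
    raise ValueError("constant does not fit in 64 bits")


def enc_string(text):
    """<string>: a <cst> length followed by that many bytes."""
    raw = text.encode("ascii")
    return enc_cst(len(raw)) + list(raw)


def enc_label_definition(n):
    """Instruction label met in neutral state (assumed forms, see lead-in)."""
    if n < 0:
        raise ValueError("label numbers are not negative")
    if n <= 59:
        return [180 + n]
    if n <= 255:
        return [240, n]
    return [241, n % 256, n // 256]


def enc_ilb_argument(n):
    """Instruction label used as a non-branch argument, such as *2 in CON."""
    if n < 0:
        raise ValueError("label numbers are not negative")
    if n <= 255:
        return [240, n]
    return [241, n % 256, n // 256]


def enc_dlb(n):
    """Numeric global label .n"""
    if n < 0:
        raise ValueError("label numbers are not negative")
    if n <= 255:
        return [242, n]
    return [243, n % 256, n // 256]


def enc_pro(name):
    """Procedure name, given without the leading $."""
    return [249] + enc_string(name)


# CON 4,9,*2,$foo
con_example = ([CON] + enc_cst(4) + enc_cst(9) + enc_ilb_argument(2)
               + enc_pro("foo") + [END_OF_LIST])
```

Branch targets, as in `BRA *19`, are encoded like small constants in the example table, so `[BRA] + enc_cst(19)` gives the bytes 18 139.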
|
||||||
|
.BP
|
||||||
|
.S2 "Assembly language instruction list"
|
||||||
|
.P
|
||||||
|
For each instruction in the list the range of argument values
|
||||||
|
in the assembly language is given.
|
||||||
|
The column headed \fIassem\fP contains the mnemonics defined
|
||||||
|
in 11.1.3.
|
||||||
|
The following column specifies restrictions of the argument
|
||||||
|
value.
|
||||||
|
Addresses have to obey the restrictions mentioned in chapter 2.
|
||||||
|
The classes of arguments
|
||||||
|
are indicated by letters:
|
||||||
|
.ds b \fBb\fP
|
||||||
|
.ds c \fBc\fP
|
||||||
|
.ds d \fBd\fP
|
||||||
|
.ds g \fBg\fP
|
||||||
|
.ds f \fBf\fP
|
||||||
|
.ds l \fBl\fP
|
||||||
|
.ds n \fBn\fP
|
||||||
|
.ds w \fBw\fP
|
||||||
|
.ds p \fBp\fP
|
||||||
|
.ds r \fBr\fP
|
||||||
|
.ds s \fBs\fP
|
||||||
|
.ds z \fBz\fP
|
||||||
|
.ds o \fBo\fP
|
||||||
|
.ds - \fB-\fP
|
||||||
|
.N 1
|
||||||
|
.TS
|
||||||
|
tab(:);
|
||||||
|
c s l l
|
||||||
|
l l 15 l l.
|
||||||
|
\fIassem\fP:constraints:rationale
|
||||||
|
|
||||||
|
\&\*c:cst:fits word:constant
|
||||||
|
\&\*d:cst:fits double word:constant
|
||||||
|
\&\*l:cst::local offset
|
||||||
|
\&\*g:arg:>= 0:global offset
|
||||||
|
\&\*f:cst::fragment offset
|
||||||
|
\&\*n:cst:>= 0:counter
|
||||||
|
\&\*s:cst:>0 , word multiple:object size
|
||||||
|
\&\*z:cst:>= 0 , zero or word multiple:object size
|
||||||
|
\&\*o:cst:>= 0 , word multiple or fraction:object size
|
||||||
|
\&\*w:cst:> 0 , word multiple:object size *
|
||||||
|
\&\*p:pro::pro identifier
|
||||||
|
\&\*b:ilb:>= 0:label number
|
||||||
|
\&\*r:cst:0,1,2:register number
|
||||||
|
\&\*-:::no argument
|
||||||
|
.TE 1
|
||||||
|
.P
|
||||||
|
The * at the rationale for \*w indicates that the argument
|
||||||
|
can either be given as argument or on top of the stack.
|
||||||
|
If the argument is omitted, the argument is fetched from the
|
||||||
|
stack;
|
||||||
|
it is assumed to be a wordsized unsigned integer.
|
||||||
|
Instructions that check for undefined integer or floating-point
|
||||||
|
values and underflow or overflow
|
||||||
|
are indicated below by (*).
|
||||||
|
.N 1
|
||||||
|
.DS B
|
||||||
|
GROUP 1 - LOAD
|
||||||
|
|
||||||
|
LOC \*c : Load constant (i.e. push one word onto the stack)
|
||||||
|
LDC \*d : Load double constant ( push two words )
|
||||||
|
LOL \*l : Load word at \*l-th local (\*l<0) or parameter (\*l>=0)
|
||||||
|
LOE \*g : Load external word \*g
|
||||||
|
LIL \*l : Load word pointed to by \*l-th local or parameter
|
||||||
|
LOF \*f : Load offsetted (top of stack + \*f yields address)
|
||||||
|
LAL \*l : Load address of local or parameter
|
||||||
|
LAE \*g : Load address of external
|
||||||
|
LXL \*n : Load lexical (address of LB \*n static levels back)
|
||||||
|
LXA \*n : Load lexical (address of AB \*n static levels back)
|
||||||
|
LOI \*o : Load indirect \*o bytes (address is popped from the stack)
|
||||||
|
LOS \*w : Load indirect, \*w-byte integer on top of stack gives object size
|
||||||
|
LDL \*l : Load double local or parameter (two consecutive words are stacked)
|
||||||
|
LDE \*g : Load double external (two consecutive externals are stacked)
|
||||||
|
LDF \*f : Load double offsetted (top of stack + \*f yields address)
|
||||||
|
LPI \*p : Load procedure identifier
|
||||||
|
|
||||||
|
GROUP 2 - STORE
|
||||||
|
|
||||||
|
STL \*l : Store local or parameter
|
||||||
|
STE \*g : Store external
|
||||||
|
SIL \*l : Store into word pointed to by \*l-th local or parameter
|
||||||
|
STF \*f : Store offsetted
|
||||||
|
STI \*o : Store indirect \*o bytes (pop address, then data)
|
||||||
|
STS \*w : Store indirect, \*w-byte integer on top of stack gives object size
|
||||||
|
SDL \*l : Store double local or parameter
|
||||||
|
SDE \*g : Store double external
|
||||||
|
SDF \*f : Store double offsetted
|
||||||
|
|
||||||
|
GROUP 3 - INTEGER ARITHMETIC
|
||||||
|
|
||||||
|
ADI \*w : Addition (*)
|
||||||
|
SBI \*w : Subtraction (*)
|
||||||
|
MLI \*w : Multiplication (*)
|
||||||
|
DVI \*w : Division (*)
|
||||||
|
RMI \*w : Remainder (*)
|
||||||
|
NGI \*w : Negate (two's complement) (*)
|
||||||
|
SLI \*w : Shift left (*)
|
||||||
|
SRI \*w : Shift right (*)
|
||||||
|
|
||||||
|
GROUP 4 - UNSIGNED ARITHMETIC
|
||||||
|
|
||||||
|
ADU \*w : Addition
|
||||||
|
SBU \*w : Subtraction
|
||||||
|
MLU \*w : Multiplication
|
||||||
|
DVU \*w : Division
|
||||||
|
RMU \*w : Remainder
|
||||||
|
SLU \*w : Shift left
|
||||||
|
SRU \*w : Shift right
|
||||||
|
|
||||||
|
GROUP 5 - FLOATING POINT ARITHMETIC
|
||||||
|
|
||||||
|
ADF \*w : Floating add (*)
|
||||||
|
SBF \*w : Floating subtract (*)
|
||||||
|
MLF \*w : Floating multiply (*)
|
||||||
|
DVF \*w : Floating divide (*)
|
||||||
|
NGF \*w : Floating negate (*)
|
||||||
|
FIF \*w : Floating multiply and split integer and fraction part (*)
|
||||||
|
FEF \*w : Split floating number in exponent and fraction part (*)
|
||||||
|
|
||||||
|
GROUP 6 - POINTER ARITHMETIC
|
||||||
|
|
||||||
|
ADP \*f : Add \*f to pointer on top of stack
|
||||||
|
ADS \*w : Add \*w-byte value and pointer
|
||||||
|
SBS \*w : Subtract pointers in same fragment and push diff as size \*w integer
|
||||||
|
|
||||||
|
GROUP 7 - INCREMENT/DECREMENT/ZERO
|
||||||
|
|
||||||
|
INC \*- : Increment word on top of stack by 1 (*)
|
||||||
|
INL \*l : Increment local or parameter (*)
|
||||||
|
INE \*g : Increment external (*)
|
||||||
|
DEC \*- : Decrement word on top of stack by 1 (*)
|
||||||
|
DEL \*l : Decrement local or parameter (*)
|
||||||
|
DEE \*g : Decrement external (*)
|
||||||
|
ZRL \*l : Zero local or parameter
|
||||||
|
ZRE \*g : Zero external
|
||||||
|
ZRF \*w : Load a floating zero of size \*w
|
||||||
|
ZER \*w : Load \*w zero bytes
|
||||||
|
|
||||||
|
GROUP 8 - CONVERT (stack: source, source size, dest. size (top))
|
||||||
|
|
||||||
|
CII \*- : Convert integer to integer (*)
|
||||||
|
CUI \*- : Convert unsigned to integer (*)
|
||||||
|
CFI \*- : Convert floating to integer (*)
|
||||||
|
CIF \*- : Convert integer to floating (*)
|
||||||
|
CUF \*- : Convert unsigned to floating (*)
|
||||||
|
CFF \*- : Convert floating to floating (*)
|
||||||
|
CIU \*- : Convert integer to unsigned
|
||||||
|
CUU \*- : Convert unsigned to unsigned
|
||||||
|
CFU \*- : Convert floating to unsigned
|
||||||
|
|
||||||
|
GROUP 9 - LOGICAL
|
||||||
|
|
||||||
|
AND \*w : Boolean and on two groups of \*w bytes
|
||||||
|
IOR \*w : Boolean inclusive or on two groups of \*w bytes
|
||||||
|
XOR \*w : Boolean exclusive or on two groups of \*w bytes
|
||||||
|
COM \*w : Complement (one's complement of top \*w bytes)
|
||||||
|
ROL \*w : Rotate left a group of \*w bytes
|
||||||
|
ROR \*w : Rotate right a group of \*w bytes
|
||||||
|
|
||||||
|
GROUP 10 - SETS
|
||||||
|
|
||||||
|
INN \*w : Bit test on \*w byte set (bit number on top of stack)
|
||||||
|
SET \*w : Create singleton \*w byte set with bit n on (n is top of stack)
|
||||||
|
|
||||||
|
GROUP 11 - ARRAY
|
||||||
|
|
||||||
|
LAR \*w : Load array element, descriptor contains integers of size \*w
|
||||||
|
SAR \*w : Store array element
|
||||||
|
AAR \*w : Load address of array element
|
||||||
|
|
||||||
|
GROUP 12 - COMPARE
|
||||||
|
|
||||||
|
CMI \*w : Compare \*w byte integers, Push negative, zero, positive for <, = or >
|
||||||
|
CMF \*w : Compare \*w byte reals
|
||||||
|
CMU \*w : Compare \*w byte unsigneds
|
||||||
|
CMS \*w : Compare \*w byte values, can only be used for bit for bit equality test
|
||||||
|
CMP \*- : Compare pointers
|
||||||
|
|
||||||
|
TLT \*- : True if less, i.e. iff top of stack < 0
|
||||||
|
TLE \*- : True if less or equal, i.e. iff top of stack <= 0
|
||||||
|
TEQ \*- : True if equal, i.e. iff top of stack = 0
|
||||||
|
TNE \*- : True if not equal, i.e. iff top of stack non zero
|
||||||
|
TGE \*- : True if greater or equal, i.e. iff top of stack >= 0
|
||||||
|
TGT \*- : True if greater, i.e. iff top of stack > 0
|
||||||
|
|
||||||
|
GROUP 13 - BRANCH
|
||||||
|
|
||||||
|
BRA \*b : Branch unconditionally to label \*b
|
||||||
|
|
||||||
|
BLT \*b : Branch less (pop 2 words, branch if top > second)
|
||||||
|
BLE \*b : Branch less or equal
|
||||||
|
BEQ \*b : Branch equal
|
||||||
|
BNE \*b : Branch not equal
|
||||||
|
BGE \*b : Branch greater or equal
|
||||||
|
BGT \*b : Branch greater
|
||||||
|
|
||||||
|
ZLT \*b : Branch less than zero (pop 1 word, branch negative)
|
||||||
|
ZLE \*b : Branch less or equal to zero
|
||||||
|
ZEQ \*b : Branch equal zero
|
||||||
|
ZNE \*b : Branch not zero
|
||||||
|
ZGE \*b : Branch greater or equal zero
|
||||||
|
ZGT \*b : Branch greater than zero
|
||||||
|
|
||||||
|
GROUP 14 - PROCEDURE CALL
|
||||||
|
|
||||||
|
CAI \*- : Call procedure (procedure identifier on stack)
|
||||||
|
CAL \*p : Call procedure (with identifier \*p)
|
||||||
|
LFR \*s : Load function result
|
||||||
|
RET \*z : Return (function result consists of top \*z bytes)
|
||||||
|
|
||||||
|
GROUP 15 - MISCELLANEOUS
|
||||||
|
|
||||||
|
ASP \*f : Adjust the stack pointer by \*f
|
||||||
|
ASS \*w : Adjust the stack pointer by \*w-byte integer
|
||||||
|
BLM \*z : Block move \*z bytes; first pop destination addr, then source addr
|
||||||
|
BLS \*w : Block move, size is in \*w-byte integer on top of stack
|
||||||
|
CSA \*w : Case jump; address of jump table at top of stack
|
||||||
|
CSB \*w : Table lookup jump; address of jump table at top of stack
|
||||||
|
DCH \*- : Follow dynamic chain, convert LB to LB of caller
|
||||||
|
DUP \*s : Duplicate top \*s bytes
|
||||||
|
DUS \*w : Duplicate top \*w bytes
|
||||||
|
EXG \*w : Exchange top \*w bytes
|
||||||
|
FIL \*g : File name (external 4 := \*g)
|
||||||
|
GTO \*g : Non-local goto, descriptor at \*g
|
||||||
|
LIM \*- : Load 16 bit ignore mask
|
||||||
|
LIN \*n : Line number (external 0 := \*n)
|
||||||
|
LNI \*- : Line number increment
|
||||||
|
LOR \*r : Load register (0=LB, 1=SP, 2=HP)
|
||||||
|
LPB \*- : Convert local base to argument base
|
||||||
|
MON \*- : Monitor call
|
||||||
|
NOP \*- : No operation
|
||||||
|
RCK \*w : Range check; trap on error
|
||||||
|
RTT \*- : Return from trap
|
||||||
|
SIG \*- : Trap errors to proc identifier on top of stack, -2 resets default
|
||||||
|
SIM \*- : Store 16 bit ignore mask
|
||||||
|
STR \*r : Store register (0=LB, 1=SP, 2=HP)
|
||||||
|
TRP \*- : Cause trap to occur (Error number on stack)
|
||||||
|
.DE 0
|
||||||
164
doc/em/descr.nr
Normal file
|
|
@ -0,0 +1,164 @@
|
||||||
|
.SN 7
|
||||||
|
.BP
|
||||||
|
.S1 "DESCRIPTORS"
|
||||||
|
Several instructions use descriptors, notably the range check instruction,
|
||||||
|
the array instructions, the goto instruction and the case jump instructions.
|
||||||
|
Descriptors reside in data space.
|
||||||
|
They may be constructed at run time, but
|
||||||
|
more often they are fixed and allocated in ROM data.
|
||||||
|
.P
|
||||||
|
All instructions using descriptors, except GTO, have as argument
|
||||||
|
the size of the integers in the descriptor.
|
||||||
|
All implementations have to allow integers of the size of a
|
||||||
|
word in descriptors.
|
||||||
|
All integers popped from the stack and used for indexing or comparing
|
||||||
|
must have the same size as the integers in the descriptor.
|
||||||
|
.S2 "Range check descriptors"
|
||||||
|
Range check descriptors consist of two integers:
|
||||||
|
.IS 2
|
||||||
|
.PS 1 4 "" .
|
||||||
|
.PT
|
||||||
|
lower bound~~~~~~~signed
|
||||||
|
.PT
|
||||||
|
upper bound~~~~~~~signed
|
||||||
|
.PE
|
||||||
|
.IE
|
||||||
|
The range check instruction checks an integer on the stack against
|
||||||
|
these bounds and causes a trap if the value is outside the interval.
|
||||||
|
The value itself is neither changed nor removed from the stack.
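The behaviour of the range check can be sketched as follows. The names `RangeTrap` and `range_check` are hypothetical, the descriptor is modelled as a pair of Python integers, and a trap is modelled as an exception.

```python
class RangeTrap(Exception):
    """Stands in for the trap caused by a failed range check."""


def range_check(descriptor, value):
    """Check `value` against a (lower bound, upper bound) descriptor.

    The value is returned unchanged, mirroring the fact that it stays on the stack.
    """
    lower, upper = descriptor
    if not lower <= value <= upper:
        raise RangeTrap("%d outside [%d, %d]" % (value, lower, upper))
    return value
```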
|
||||||
|
.S2 "Array descriptors"
|
||||||
|
Each array descriptor describes a single dimension.
|
||||||
|
For multi-dimensional arrays, several array instructions are
|
||||||
|
needed to access a single element.
|
||||||
|
Array descriptors contain the following three integers:
|
||||||
|
.IS 2
|
||||||
|
.PS 1 4 "" .
|
||||||
|
.PT
|
||||||
|
lower bound~~~~~~~~~~~~~~~~~~~~~signed
|
||||||
|
.PT
|
||||||
|
upper bound - lower bound~~~~~~~unsigned
|
||||||
|
.PT
|
||||||
|
number of bytes per element~~~~~unsigned
|
||||||
|
.PE
|
||||||
|
.IE
|
||||||
|
The array instructions LAR, SAR and AAR have the pointer to the start
|
||||||
|
of the descriptor as operand on the stack.
|
||||||
|
.sp
|
||||||
|
The element A[I] is fetched as follows:
|
||||||
|
.IS 2
|
||||||
|
.PS 1 4 "" .
|
||||||
|
.PT
|
||||||
|
Stack the address of A (e.g., using LAE or LAL)
|
||||||
|
.PT
|
||||||
|
Stack the value of I (n-byte integer)
|
||||||
|
.PT
|
||||||
|
Stack the pointer to the descriptor (e.g., using LAE)
|
||||||
|
.PT
|
||||||
|
LAR n (n is the size of the integers in the descriptor and I)
|
||||||
|
.PE
|
||||||
|
.IE
|
||||||
|
All array instructions first pop the address of the descriptor
|
||||||
|
and the index.
|
||||||
|
If the index is not within the bounds specified, a trap occurs.
|
||||||
|
Otherwise, (I~-~lower bound) is multiplied
|
||||||
|
by the number of bytes per element (the third word). The result is added
|
||||||
|
to the address of A and replaces A on the stack.
|
||||||
|
.A
|
||||||
|
At this point LAR, SAR and AAR diverge.
|
||||||
|
AAR is finished. LAR pops the address and fetches the data
|
||||||
|
item,
|
||||||
|
the size being specified by the descriptor.
|
||||||
|
The usual restrictions for memory access must be obeyed.
|
||||||
|
SAR pops the address and stores the
|
||||||
|
data item now exposed.
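The address computation shared by LAR, SAR and AAR can be sketched as below. The names `ArrayTrap` and `element_address` are hypothetical; the descriptor is modelled as a tuple of the three integers listed above, and a trap is modelled as an exception.

```python
class ArrayTrap(Exception):
    """Stands in for the trap caused by an index outside the bounds."""


def element_address(base, index, descriptor):
    """Address of element `index` of the array starting at `base`.

    `descriptor` is (lower bound, upper bound - lower bound, bytes per element).
    """
    lower, span, element_size = descriptor
    if not lower <= index <= lower + span:
        raise ArrayTrap("index %d out of bounds" % index)
    return base + (index - lower) * element_size
```

For an array with bounds 1..10 and 4-byte elements starting at address 1000, the descriptor is (1, 9, 4) and element 3 lies at address 1008.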
|
||||||
|
.S2 "Non-local goto descriptors"
|
||||||
|
The GTO instruction provides a way of returning directly to any
|
||||||
|
active procedure invocation.
|
||||||
|
The argument of the instruction is the address of a descriptor
|
||||||
|
containing three pointers:
|
||||||
|
.IS 2
|
||||||
|
.PS 1 4 "" .
|
||||||
|
.PT
|
||||||
|
value of PC after the jump
|
||||||
|
.PT
|
||||||
|
value of SP after the jump
|
||||||
|
.PT
|
||||||
|
value of LB after the jump
|
||||||
|
.PE
|
||||||
|
.IE
|
||||||
|
GTO loads PC, SP and LB from the descriptor,
|
||||||
|
thereby jumping to a procedure
|
||||||
|
and removing zero or more frames from the stack.
|
||||||
|
The LB, SP and PC in the descriptor must belong to a
|
||||||
|
dynamically enclosing procedure,
|
||||||
|
because some EM implementations will need to backtrack through
|
||||||
|
the dynamic chain and use the implementation dependent data
|
||||||
|
in frames to restore registers etc.
|
||||||
|
.S2 "Case descriptors"
|
||||||
|
The case jump instructions CSA and CSB both
|
||||||
|
provide multiway branches selected by a case index.
|
||||||
|
Both fetch two operands from the stack:
|
||||||
|
first a pointer to the low address of the case descriptor
|
||||||
|
and then the case index.
|
||||||
|
CSA uses the case index as index in the descriptor table, but CSB searches
|
||||||
|
the table for an occurrence of the case index.
|
||||||
|
Therefore, the descriptors for CSA and CSB,
|
||||||
|
as shown in figure 4, are different.
|
||||||
|
All pointers in the table must be addresses of instructions in the
|
||||||
|
procedure executing the case instruction.
|
||||||
|
.P
|
||||||
|
CSA selects the new PC by indexing.
|
||||||
|
If the index, a signed integer, is greater than or equal to
|
||||||
|
the lower bound and less than or equal to the upper bound,
|
||||||
|
then fetch the new PC from the list of instruction pointers by indexing with
|
||||||
|
index-lower.
|
||||||
|
The table does not contain the value of the upper bound,
|
||||||
|
but the value of upper-lower as an unsigned integer.
|
||||||
|
If the index is out of bounds or if the fetched pointer is 0,
|
||||||
|
then fetch the default instruction pointer.
|
||||||
|
If the resulting PC is 0, then trap.
|
||||||
|
.P
|
||||||
|
CSB selects the new PC by searching.
|
||||||
|
The table is searched for an entry with index value equal to the case index.
|
||||||
|
That entry or, if none is found, the default entry contains the
|
||||||
|
new PC.
|
||||||
|
When the resulting PC is 0, a trap is performed.
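The two selection methods can be sketched as follows. This is an illustration only: the names `CaseTrap`, `csa_target` and `csb_target` are hypothetical, the descriptors are passed as separate Python values rather than read from memory, and a trap is modelled as an exception.

```python
class CaseTrap(Exception):
    """Stands in for the trap performed when the resulting PC is 0."""


def csa_target(default, lower, span, pointers, index):
    """CSA: select by indexing; `span` is upper - lower, `pointers` has span + 1 entries."""
    pc = default
    if lower <= index <= lower + span:
        pc = pointers[index - lower]
        if pc == 0:
            pc = default          # a zero entry falls back to the default
    if pc == 0:
        raise CaseTrap("no target for case index %d" % index)
    return pc


def csb_target(default, entries, index):
    """CSB: select by searching a list of (index value, pointer) pairs."""
    pc = default
    for value, pointer in entries:
        if value == index:
            pc = pointer
            break
    if pc == 0:
        raise CaseTrap("no target for case index %d" % index)
    return pc
```

In both functions a default pointer of 0 means that an unmatched case index leads to a trap.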
|
||||||
|
.P
|
||||||
|
The choice of which case instruction to use for
|
||||||
|
each source language case statement
|
||||||
|
is up to the front end.
|
||||||
|
If the range of the index value is dense, i.e. if
|
||||||
|
.DS
|
||||||
|
(highest value - lowest value) / number of cases
|
||||||
|
.DE 1
|
||||||
|
is less than some threshold, then CSA is the obvious choice.
|
||||||
|
If the range is sparse, CSB is better.
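A front end might apply the density test along these lines. The function name `choose_case_instruction` is hypothetical, and the default threshold of 3.0 is an arbitrary value chosen for the example; the text leaves the threshold to the front end.

```python
def choose_case_instruction(case_values, threshold=3.0):
    """Pick CSA for a dense range of case values, CSB for a sparse one."""
    values = set(case_values)
    if not values:
        raise ValueError("a case statement needs at least one case")
    # (highest value - lowest value) / number of cases
    density = (max(values) - min(values)) / len(values)
    return "CSA" if density < threshold else "CSB"
```

The cases 1, 2, 3, 4 give a ratio of 0.75 and select CSA, while the cases 1 and 1000 give a ratio of 499.5 and select CSB.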
|
||||||
|
.N 2
|
||||||
|
.DS
|
||||||
|
   |--------------------|           |--------------------|   high address
   |  pointer for upb   |           |    pointer n-1     |
   |--------------------|           |-  -  -  -  -  -  - |
   |         .          |           |     index n-1      |
   |         .          |           |--------------------|
   |         .          |           |         .          |
   |         .          |           |         .          |
   |         .          |           |         .          |
   |         .          |           |--------------------|
   |         .          |           |     pointer 1      |
   |--------------------|           |-  -  -  -  -  -  - |
   | pointer for lwb+1  |           |      index 1       |
   |--------------------|           |--------------------|
   |  pointer for lwb   |           |     pointer 0      |
   |--------------------|           |-  -  -  -  -  -  - |
   |   upper - lower    |           |      index 0       |
   |--------------------|           |--------------------|
   |    lower bound     |           | number of entries  |
   |--------------------|           |--------------------|
   |  default pointer   |           |  default pointer   |   low address
   |--------------------|           |--------------------|

       CSA descriptor                   CSB descriptor


          Figure 4. Descriptor layout for CSA and CSB
|
||||||
|
.DE
|
||||||
377
doc/em/dspace.nr
Normal file
|
|
@ -0,0 +1,377 @@
|
||||||
|
.BP
|
||||||
|
.SN 4
|
||||||
|
.S1 "DATA ADDRESS SPACE"
|
||||||
|
The data address space is divided into three parts, called 'areas',
|
||||||
|
each with its own addressing method:
|
||||||
|
global data area,
|
||||||
|
local data area (including the stack),
|
||||||
|
and heap data area.
|
||||||
|
These data areas must be part of the same
|
||||||
|
address space because all data is accessed by
|
||||||
|
the same type of pointers.
|
||||||
|
.P
|
||||||
|
Space for global data is reserved using several pseudoinstructions in the
|
||||||
|
assembly language, as described in
|
||||||
|
the next paragraph and chapter 11.
|
||||||
|
The size of the global data area is fixed per program.
|
||||||
|
.A
|
||||||
|
Global data is addressed absolutely in the machine language.
|
||||||
|
Many instructions are available to address global data.
|
||||||
|
They all have an absolute address as argument.
|
||||||
|
Examples are LOE, LAE and STE.
|
||||||
|
.P
|
||||||
|
Part of the global data area is initialized by the
|
||||||
|
compiler, the
|
||||||
|
rest is not initialized at all or is initialized
|
||||||
|
with a value, typically -32768 or 0.
|
||||||
|
Part of the initialized global data may be made read-only
|
||||||
|
if the implementation supports protection.
|
||||||
|
.P
|
||||||
|
The local data area is used as a stack,
|
||||||
|
which grows from high to low addresses
|
||||||
|
and contains some data for each active procedure
|
||||||
|
invocation, called a 'frame'.
|
||||||
|
The size of the local data area varies dynamically during
|
||||||
|
execution.
|
||||||
|
Below the current procedure frame resides the operand stack.
|
||||||
|
The stack pointer SP always points to the bottom of
|
||||||
|
the local data area.
|
||||||
|
Local data is addressed by offsetting from the local base pointer LB.
|
||||||
|
LB always points to the frame of the current procedure.
|
||||||
|
Only the words of the current frame and the parameters
|
||||||
|
can be addressed directly.
|
||||||
|
Variables in other active procedures are addressed by following
|
||||||
|
the chain of statically enclosing procedures using the LXL or LXA instruction.
|
||||||
|
The variables in dynamically enclosing procedures can be
|
||||||
|
addressed with the use of the DCH instruction.
|
||||||
|
.A
|
||||||
|
Many instructions have offsets to LB as argument,
|
||||||
|
for instance LOL, LAL and STL.
|
||||||
|
The arguments of these instructions range from -1 to some
|
||||||
|
(negative) minimum
|
||||||
|
for the access of local storage and from 0 to some (positive)
|
||||||
|
maximum for parameter access.
|
||||||
|
.P
|
||||||
|
The procedure call instructions CAL and CAI each create a new frame
|
||||||
|
on the stack.
|
||||||
|
Each procedure has an assembly-time parameter specifying
|
||||||
|
the number of bytes needed for local storage.
|
||||||
|
This storage is allocated each time the procedure is called and
|
||||||
|
must be a multiple of the wordsize.
|
||||||
|
Each procedure, therefore, starts with a stack with the local variables
|
||||||
|
already allocated.
|
||||||
|
The return instructions RET and RTT remove a frame.
|
||||||
|
The actual parameters must be removed by the calling procedure.
|
||||||
|
.P
|
||||||
|
RET may copy some words from the stack of
|
||||||
|
the returning procedure to an unnamed 'function return area'.
|
||||||
|
This area is available for 'READ-ONCE' access using the LFR instruction.
|
||||||
|
The result of a LFR is only defined if the size used to fetch
|
||||||
|
is identical to the size used in the last return.
|
||||||
|
The instruction ASP, used to remove the parameters from the
|
||||||
|
stack, the branch instruction BRA and the non-local goto
|
||||||
|
instruction GTO are the only ones that leave the contents of
|
||||||
|
the 'function return area' intact.
|
||||||
|
All other instructions are allowed to destroy the function
|
||||||
|
return area.
|
||||||
|
Thus parameters can be popped before fetching the function result.
|
||||||
|
The maximum size of all function return areas is
|
||||||
|
implementation dependent,
|
||||||
|
but should allow procedure instance identifiers and all
|
||||||
|
implemented objects of type integer, unsigned, float
|
||||||
|
and pointer to be returned.
|
||||||
|
In most implementations
|
||||||
|
the maximum size of the function return
|
||||||
|
area is twice the pointer size,
|
||||||
|
because we want to be able to handle 'procedure instance
|
||||||
|
identifiers' which consist of a procedure identifier and the LB
|
||||||
|
of a frame belonging to that procedure.
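The 'READ-ONCE' behaviour of the function return area can be modelled roughly as follows. This is a sketch, not the interpreter's code: the structure `em_fra`, the functions `em_fra_ret` and `em_fra_lfr`, and the 8-byte maximum (twice an assumed 4-byte pointer) are all illustrative assumptions. Returning -1 stands in for the case where the result of LFR is undefined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EM_FRA_MAX 8    /* assumed maximum: twice a 4-byte pointer */

struct em_fra {
    unsigned char bytes[EM_FRA_MAX];
    size_t size;        /* size used by the most recent RET */
    int valid;          /* cleared once the result has been fetched */
};

/* RET: copy `size` bytes of result into the area; -1 if it does not fit. */
int em_fra_ret(struct em_fra *f, const void *result, size_t size)
{
    assert(f != NULL);
    if (size > EM_FRA_MAX)
        return -1;
    if (size > 0)
        memcpy(f->bytes, result, size);
    f->size = size;
    f->valid = 1;
    return 0;
}

/* LFR: fetch the result; only defined when `size` matches the last RET
 * and the area has not been fetched already. */
int em_fra_lfr(struct em_fra *f, void *out, size_t size)
{
    assert(f != NULL);
    if (!f->valid || size != f->size)
        return -1;
    if (size > 0)
        memcpy(out, f->bytes, size);
    f->valid = 0;       /* read once */
    return 0;
}
```

In this model a fetch with the wrong size fails without consuming the value, while a second fetch with the right size fails because the area has already been read.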
|
||||||
|
.P
|
||||||
|
The heap data area grows upwards, to higher numbered
|
||||||
|
addresses.
|
||||||
|
It is initially empty.
|
||||||
|
The initial value of the heap pointer HP
|
||||||
|
marks the low end.
|
||||||
|
The heap pointer may be manipulated
|
||||||
|
by the LOR and STR instructions.
|
||||||
|
The heap can only be addressed indirectly,
|
||||||
|
by pointers derived from previous values of HP.
|
||||||
|
.S2 "Global data area"
|
||||||
|
The initial size of the global data area is determined at assembly time.
|
||||||
|
Global data is allocated by several
|
||||||
|
pseudoinstructions in the EM assembly
|
||||||
|
language.
|
||||||
|
Each pseudoinstruction allocates one or more bytes.
|
||||||
|
The bytes allocated for a single pseudo form
|
||||||
|
a 'block'.
|
||||||
|
A block differs from a fragment, because,
|
||||||
|
under certain conditions, several blocks are allocated
|
||||||
|
in a single fragment.
|
||||||
|
This guarantees that the bytes of these blocks
|
||||||
|
are consecutive.
|
||||||
|
.P
|
||||||
|
Global data is addressed absolutely in binary
|
||||||
|
machine language.
|
||||||
|
Most compilers, however,
|
||||||
|
cannot assign absolute addresses to their global variables,
|
||||||
|
especially not if the language
|
||||||
|
allows programs to be composed of several separately compiled modules.
|
||||||
|
The assembly language therefore allows the compiler to name
|
||||||
|
the first address of a global data block with an alphanumeric label.
|
||||||
|
Moreover, the only way to address such a named global data block
|
||||||
|
in the assembly language is by using its name.
|
||||||
|
It is the task of the assembler/loader to
|
||||||
|
translate these labels into absolute addresses.
|
||||||
|
These labels may also be used
|
||||||
|
in CON and ROM pseudoinstructions to initialize pointers.
|
||||||
|
.P
|
||||||
|
The pseudoinstruction CON allocates initialized data.
|
||||||
|
ROM acts like CON but indicates that the initialized data will
|
||||||
|
not change during execution of the program.
|
||||||
|
The pseudoinstruction BSS allocates a block of uninitialized
|
||||||
|
or identically initialized
|
||||||
|
data.
|
||||||
|
The pseudoinstruction HOL is similar to BSS,
|
||||||
|
but it alters the meaning of subsequent absolute addressing in
|
||||||
|
the assembly language.
|
||||||
|
.P
|
||||||
|
Another type of global data is a small block,
|
||||||
|
called the ABS block, with an implementation defined size.
|
||||||
|
Storage in this type of block can only be addressed
|
||||||
|
absolutely in assembly language.
|
||||||
|
The first word has address 0 and is used to maintain the
|
||||||
|
source line number.
|
||||||
|
Special instructions LIN and LNI are provided to
|
||||||
|
update this counter.
|
||||||
|
A pointer at location 4 points to a string containing the
|
||||||
|
current source file name.
|
||||||
|
The instruction FIL can be used to update the pointer.
|
||||||
|
.P
|
||||||
|
All numeric arguments of the instructions that address
|
||||||
|
the global data area refer to locations in the
|
||||||
|
ABS block unless
|
||||||
|
they are preceded by at least one HOL pseudo in the same
|
||||||
|
module,
|
||||||
|
in which case they refer to the storage area allocated by the
|
||||||
|
last HOL pseudoinstruction.
|
||||||
|
Thus LOE 0 loads the zeroth word of the most recent HOL, unless no HOL has
|
||||||
|
appeared in the current file so
|
||||||
|
far, in which case it loads the zeroth word of the
|
||||||
|
ABS block.
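The rule for numeric global addresses can be sketched as a small C function. The function name `em_global_address` and its parameters are hypothetical; `last_hol_base` is assumed to be NULL as long as no HOL pseudo has appeared in the current module.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Resolve a numeric argument of an instruction addressing global data.
 *   abs_base      - address of the ABS block (0 in the text)
 *   last_hol_base - base of the area allocated by the most recent HOL in
 *                   this module, or NULL if no HOL has appeared so far
 *   offset        - the numeric instruction argument
 */
unsigned long em_global_address(unsigned long abs_base,
                                const unsigned long *last_hol_base,
                                unsigned long offset)
{
    unsigned long base = (last_hol_base != NULL) ? *last_hol_base : abs_base;
    return base + offset;
}
```

So an argument of 0 resolves to the ABS block before any HOL has been seen, and to the start of the HOL area afterwards.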
|
||||||
|
.P
|
||||||
|
The global data area is highly fragmented.
|
||||||
|
The ABS block and each HOL and BSS block are separate fragments.
|
||||||
|
The way fragments are formed from CON and ROM blocks is more complex.
|
||||||
|
The assemblers group several blocks into a single fragment.
|
||||||
|
A fragment only contains blocks of the same type: CON or ROM.
|
||||||
|
It is guaranteed that the bytes allocated for two consecutive CON pseudos are
|
||||||
|
allocated consecutively in a single fragment, unless
|
||||||
|
these CON pseudos are separated in the assembly language program
|
||||||
|
by a data label definition or one or more of the following pseudos:
|
||||||
|
.DS
|
||||||
|
|
||||||
|
ROM, BSS, HOL and END
|
||||||
|
|
||||||
|
.DE
|
||||||
|
An analogous rule holds for ROM pseudos.
|
||||||
|
.S2 "Local data area"
|
||||||
|
The local data area consists of a sequence of frames, one for
|
||||||
|
each active procedure.
|
||||||
|
Below the frame of the current procedure resides the
|
||||||
|
expression stack.
|
||||||
|
Frames are generated by procedure calls and are
|
||||||
|
removed by procedure returns.
|
||||||
|
A procedure frame consists of six 'zones':
|
||||||
|
.DS
|
||||||
|
|
||||||
|
1. The return status block
|
||||||
|
2. The local variables and compiler temporaries
|
||||||
|
3. The register save block
|
||||||
|
4. The dynamic local generators
|
||||||
|
5. The operand stack.
|
||||||
|
6. The parameters of a procedure one level deeper
|
||||||
|
|
||||||
|
.DE
|
||||||
|
A sample frame is shown in Figure 1.
|
||||||
|
.P
|
||||||
|
Before a procedure call is performed the actual
|
||||||
|
parameters are pushed onto the stack of the calling procedure.
|
||||||
|
The exact details are compiler dependent.
|
||||||
|
EM allows procedures to be called with a variable number of
|
||||||
|
parameters.
|
||||||
|
The implementation of the C-language almost forces its runtime
|
||||||
|
system to push the parameters in reverse order, that is,
|
||||||
|
the first positional parameter last.
|
||||||
|
Most compilers use the C calling convention to be compatible.
|
||||||
|
The parameters of a procedure belong to the frame of the
|
||||||
|
calling procedure.
|
||||||
|
Note that the evaluation of the actual parameters may imply
|
||||||
|
the calling of procedures.
|
||||||
|
The parameters can be accessed with certain instructions using
|
||||||
|
offsets of 0 and greater.
|
||||||
|
The first byte of the last parameter pushed has offset 0.
|
||||||
|
Note that the parameter at offset 0 has a special use in the
|
||||||
|
instructions following the static chain (LXL and LXA).
|
||||||
|
These instructions assume that this parameter contains the LB of
|
||||||
|
the statically enclosing procedure.
|
||||||
|
Procedures that do not have a statically enclosing procedure
|
||||||
|
do not need a static link at offset 0.
|
||||||
|
.P
|
||||||
|
Two instructions are available to perform procedure calls, CAL
|
||||||
|
and CAI.
|
||||||
|
Several tasks are performed by these call instructions.
|
||||||
|
.A
|
||||||
|
First, a part of the status of the calling procedure is
|
||||||
|
saved on the stack in the return status block.
|
||||||
|
This block should contain the return address of the calling
|
||||||
|
procedure, its LB and other implementation dependent data.
|
||||||
|
The size of this block is fixed for any given implementation
|
||||||
|
because the lexical instructions LPB, LXL and LXA must be able to
|
||||||
|
obtain the base addresses of the procedure parameters \fBand\fP local
|
||||||
|
variables.
|
||||||
|
An alternative solution can be used on machines with a highly
|
||||||
|
segmented address space.
|
||||||
|
The stack frames need not be contiguous then and the first
|
||||||
|
status save area can contain the parameter base AB,
|
||||||
|
which has the value of SP just after the last parameter has
|
||||||
|
been pushed.
|
||||||
|
.A
|
||||||
|
Second, the LB is changed to point to the
|
||||||
|
first word above the local variables.
|
||||||
|
The new LB is a copy of the SP after the return status
|
||||||
|
block has been pushed.
|
||||||
|
.A
|
||||||
|
Third, the amount of local storage needed by the procedure is
|
||||||
|
reserved.
|
||||||
|
The parameters and local storage are accessed by the same instructions.
|
||||||
|
Negative offsets are used for access to local variables.
|
||||||
|
The highest byte, that is the byte nearest
|
||||||
|
to LB, has to be accessed with offset -1.
|
||||||
|
The pseudoinstruction specifying the entry point of a
|
||||||
|
procedure has an argument that specifies the amount of local
|
||||||
|
storage needed.
|
||||||
|
The local variables allocated by the CAI or CAL instructions
|
||||||
|
are the only ones that can be accessed with a fixed negative offset.
|
||||||
|
The initial value of the allocated words is
|
||||||
|
not defined, but implementations that check for undefined
|
||||||
|
values will probably initialize them with a
|
||||||
|
special 'undefined' pattern, typically -32768.
|
||||||
|
.A
|
||||||
|
Fourth, any EM implementation is allowed to reserve a variable size
|
||||||
|
block beneath the local variables.
|
||||||
|
This block could, for example, be used to save a variable number
|
||||||
|
of registers.
|
||||||
|
.A
|
||||||
|
Finally, the address of the entry point of the called procedure
|
||||||
|
is loaded into the Program Counter.
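The steps performed by a call instruction can be sketched on a toy machine in C. Everything here is an assumption made for illustration: a 2-byte word, a 256-byte contiguous stack indexed from 0, a return status block holding only the return address and the caller's LB, and no register save block. The names `em_machine`, `push_word` and `em_call` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EM_WORD 2            /* assumed word size in bytes */
#define EM_STACK_BYTES 256   /* assumed size of the simulated stack */

struct em_machine {
    unsigned char stack[EM_STACK_BYTES];
    size_t sp;   /* index of the lowest occupied byte; stack grows downward */
    size_t lb;   /* local base of the current frame */
    size_t pc;   /* program counter */
};

/* Push one word, low byte first. */
static void push_word(struct em_machine *m, unsigned value)
{
    assert(m->sp >= EM_WORD);
    m->sp -= EM_WORD;
    m->stack[m->sp] = (unsigned char)(value & 0xff);
    m->stack[m->sp + 1] = (unsigned char)((value >> 8) & 0xff);
}

/* Model of CAL/CAI for a procedure needing `locals` bytes of local storage. */
void em_call(struct em_machine *m, size_t entry, size_t locals)
{
    assert(locals % EM_WORD == 0);   /* must be a multiple of the wordsize */

    /* First: save part of the caller's status (return status block). */
    push_word(m, (unsigned)m->pc);
    push_word(m, (unsigned)m->lb);

    /* Second: the new LB is a copy of SP after the status was pushed. */
    m->lb = m->sp;

    /* Third: reserve the local storage; its contents are undefined. */
    assert(m->sp >= locals);
    m->sp -= locals;

    /* Fourth: an optional register save block is omitted in this sketch. */

    /* Finally: continue at the entry point of the called procedure. */
    m->pc = entry;
}
```

Starting with an empty stack (SP and LB at 256), a call to a procedure with 4 bytes of locals leaves LB at 252 and SP at 248 in this model.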
|
||||||
|
.P
|
||||||
|
The ASP instruction can be used to allocate further (dynamic)
|
||||||
|
local storage.
|
||||||
|
The base address of such storage must be obtained with a LOR~SP
|
||||||
|
instruction.
|
||||||
|
This same instruction ASP may also be used
|
||||||
|
to remove some words from the stack.
|
||||||
|
.P
|
||||||
|
There is a version of ASP, called ASS, which fetches the number
|
||||||
|
of bytes to allocate from the stack.
|
||||||
|
It can be used to allocate space for local
|
||||||
|
objects whose size is unknown at compile time,
|
||||||
|
so called 'dynamic local generators'.
|
||||||
|
.P
|
||||||
|
Control is returned to the calling procedure with a RET instruction.
|
||||||
|
Any return value is then copied to the 'function return area'.
|
||||||
|
The frame created by the call is deallocated and the status of
|
||||||
|
the calling procedure is restored.
|
||||||
|
The value of SP just after the return value has been popped must
|
||||||
|
be the same as the
|
||||||
|
value of SP just before executing the first instruction of this
|
||||||
|
invocation.
|
||||||
|
This means that when a RET is executed the operand stack can
|
||||||
|
only contain the return value and all dynamically generated locals must be
|
||||||
|
deallocated.
|
||||||
|
Violating this restriction might result in hard to detect
|
||||||
|
errors.
|
||||||
|
The calling procedure has to remove the parameters from the stack.
|
||||||
|
This can be done with the aforementioned ASP instruction.
|
||||||
|
.P
|
||||||
|
Each procedure frame is a separate fragment.
|
||||||
|
Because any fragment may be placed anywhere in memory,
|
||||||
|
procedure frames need not be contiguous.
|
||||||
|
.DS
     |===============================|
     |     actual parameter  n-1     |
     |-------------------------------|
     |              .                |
     |              .                |
     |              .                |
     |-------------------------------|
     |     actual parameter  0       | ( <- AB )
     |===============================|

     |===============================|
     |///////////////////////////////|
     |///// return status block /////|
     |///////////////////////////////| <- LB
     |===============================|
     |                               |
     |       local variables         |
     |                               |
     |-------------------------------|
     |                               |
     |      compiler temporaries     |
     |                               |
     |===============================|
     |///////////////////////////////|
     |///// register save block /////|
     |///////////////////////////////|
     |===============================|
     |                               |
     |   dynamic local generators    |
     |                               |
     |===============================|
     |           operand             |
     |-------------------------------|
     |           operand             |
     |===============================|
     |         parameter  m-1        |
     |-------------------------------|
     |              .                |
     |              .                |
     |              .                |
     |-------------------------------|
     |         parameter  0          | <- SP
     |===============================|

        Figure 1. A sample procedure frame and parameters.
.DE
|
||||||
|
.S2 "Heap data area"
|
||||||
|
The heap area starts empty, with HP
|
||||||
|
pointing to the low end of it.
|
||||||
|
HP always contains a word address.
|
||||||
|
A copy of HP can always be obtained with the LOR instruction.
|
||||||
|
A new value may be stored in the heap pointer using the STR instruction.
|
||||||
|
If the new value is greater than the old one,
|
||||||
|
then the heap grows.
|
||||||
|
If it is smaller, then the heap shrinks.
|
||||||
|
HP may never point below its original value.
|
||||||
|
All words between the current HP and the original HP
|
||||||
|
are allocated to the heap.
|
||||||
|
The heap may not grow into a part of memory that is already allocated
|
||||||
|
for the stack.
|
||||||
|
When this is attempted, the STR instruction will cause a trap to occur.
|
||||||
|
.P
|
||||||
|
The only way to address the heap is indirectly.
|
||||||
|
Whenever an object is allocated by increasing HP,
|
||||||
|
the old HP value must be saved and can be used later to address
|
||||||
|
the allocated object.
|
||||||
|
If, in the meantime, HP is decreased so that the object
|
||||||
|
is no longer part of the heap, then an attempt to access
|
||||||
|
the object is not allowed.
|
||||||
|
Furthermore, if the heap pointer is increased again to above
|
||||||
|
the object address, then access to the old object gives undefined results.
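Allocation by moving HP can be sketched as follows. This is a model only: `em_heap`, `em_set_hp` and `em_alloc` are hypothetical names, addresses are plain integers, and a return value of -1 stands in for the trap. Treating a move below the original HP as an error is an assumption of the sketch; the text only says that HP may never point there. The caller is assumed to pass sizes that keep HP on a word boundary.

```c
#include <assert.h>
#include <stddef.h>

struct em_heap {
    size_t base;    /* original value of HP */
    size_t hp;      /* current value of HP */
    size_t limit;   /* lowest address already allocated for the stack */
};

/* Model of STR on HP: 0 on success, -1 where a trap would occur. */
int em_set_hp(struct em_heap *h, size_t new_hp)
{
    assert(h != NULL);
    if (new_hp < h->base || new_hp > h->limit)
        return -1;
    h->hp = new_hp;
    return 0;
}

/* Allocate `nbytes` by raising HP; the old HP becomes the object address. */
int em_alloc(struct em_heap *h, size_t nbytes, size_t *object)
{
    size_t old_hp;

    assert(h != NULL && object != NULL);
    old_hp = h->hp;                         /* LOR: obtain a copy of HP */
    if (nbytes > h->limit - old_hp)
        return -1;                          /* would grow into the stack */
    if (em_set_hp(h, old_hp + nbytes) != 0) /* STR: store the new HP */
        return -1;
    *object = old_hp;                       /* save it to address the object */
    return 0;
}
```

The saved old HP is the only handle on the object, which matches the statement that the heap can only be addressed indirectly.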
|
||||||
|
.P
|
||||||
|
The heap is a single fragment.
|
||||||
|
All bytes have consecutive addresses.
|
||||||
|
No limits are imposed on the size of the heap as long as it fits
|
||||||
|
in the available data address space.
|
||||||
9
doc/em/even.c
Normal file
|
|
@ -0,0 +1,9 @@
|
||||||
|
#include <stdio.h>

/* Dump standard input as decimal byte values, 16 per line. */
int main(void) {
	register int l,j ;

	for ( j=0 ; (l=getchar()) != EOF ; j++ ) {
		if ( j%16 == 15 ) printf("%3d\n",l&0377 ) ;
		else printf("%3d ",l&0377 ) ;
	}
	printf("\n") ;
	return 0 ;
}
|
||||||
178
doc/em/exam.e
Normal file
|
|
@ -0,0 +1,178 @@
|
||||||
|
mes 2,2,2 ; wordsize 2, pointersize 2
|
||||||
|
.1
|
||||||
|
rom 't.p\000' ; the name of the source file
|
||||||
|
hol 552,-32768,0 ; externals and buf occupy 552 bytes
|
||||||
|
exp $sum ; sum can be called from other modules
|
||||||
|
pro $sum,2 ; procedure sum; 2 bytes local storage
|
||||||
|
lin 8 ; code from source line 8
|
||||||
|
 ldl 0 ; load two parameters ( a and b )
|
||||||
|
adi 2 ; add them
|
||||||
|
ret 2 ; return the result
|
||||||
|
end 2 ; end of procedure ( still two bytes local storage )
|
||||||
|
.2
|
||||||
|
rom 1,99,2 ; descriptor of array a[]
|
||||||
|
exp $test ; the compiler exports all level 0 procedures
|
||||||
|
pro $test,226 ; procedure test, 226 bytes local storage
|
||||||
|
.3
|
||||||
|
rom 4.8F8 ; assemble Floating point 4.8 (8 bytes) in
|
||||||
|
.4 ; global storage
|
||||||
|
rom 0.5F8 ; same for 0.5
|
||||||
|
mes 3,-226,2,2 ; compiler temporary not referenced indirect
|
||||||
|
mes 3,-24,2,0 ; the same is true for i, j, b and c in test
|
||||||
|
mes 3,-22,2,0
|
||||||
|
mes 3,-4,2,0
|
||||||
|
mes 3,-2,2,0
|
||||||
|
mes 3,-20,8,0 ; and for x and y
|
||||||
|
mes 3,-12,8,0
|
||||||
|
lin 20 ; maintain source line number
|
||||||
|
loc 1
|
||||||
|
stl -4 ; j := 1
|
||||||
|
lni ; was lin 21 prior to optimization
|
||||||
|
lol -4
|
||||||
|
loc 3
|
||||||
|
mli 2
|
||||||
|
loc 6
|
||||||
|
adi 2
|
||||||
|
stl -2 ; i := 3 * j + 6
|
||||||
|
lni ; was lin 22 prior to optimization
|
||||||
|
lae .3
|
||||||
|
loi 8
|
||||||
|
lal -12
|
||||||
|
sti 8 ; x := 4.8
|
||||||
|
lni ; was lin 23 prior to optimization
|
||||||
|
lal -12
|
||||||
|
loi 8
|
||||||
|
lae .4
|
||||||
|
loi 8
|
||||||
|
dvf 8
|
||||||
|
lal -20
|
||||||
|
sti 8 ; y := x / 0.5
|
||||||
|
lni ; was lin 24 prior to optimization
|
||||||
|
loc 1
|
||||||
|
stl -22 ; b := true
|
||||||
|
lni ; was lin 25 prior to optimization
|
||||||
|
loc 122
|
||||||
|
stl -24 ; c := 'z'
|
||||||
|
lni ; was lin 26 prior to optimization
|
||||||
|
loc 1
|
||||||
|
stl -2 ; for i:= 1
|
||||||
|
2
|
||||||
|
lol -2
|
||||||
|
dup 2
|
||||||
|
mli 2 ; i*i
|
||||||
|
lal -224
|
||||||
|
lol -2
|
||||||
|
lae .2
|
||||||
|
sar 2 ; a[i] :=
|
||||||
|
lol -2
|
||||||
|
loc 100
|
||||||
|
beq *3 ; to 100 do
|
||||||
|
inl -2 ; increment i and loop
|
||||||
|
bra *2
|
||||||
|
3
|
||||||
|
lin 27
|
||||||
|
lol -4
|
||||||
|
loc 27
|
||||||
|
adi 2 ; j + 27
|
||||||
|
sil 0 ; r.r1 :=
|
||||||
|
lni ; was lin 28 prior to optimization
|
||||||
|
lol -22 ; b
|
||||||
|
lol 0
|
||||||
|
stf 10 ; r.r3 :=
|
||||||
|
lni ; was lin 29 prior to optimization
|
||||||
|
lal -20
|
||||||
|
loi 16
|
||||||
|
adf 8 ; x + y
|
||||||
|
lol 0
|
||||||
|
adp 2
|
||||||
|
sti 8 ; r.r2 :=
|
||||||
|
 lni ; was lin 30 prior to optimization
|
||||||
|
lal -224
|
||||||
|
lol -4
|
||||||
|
lae .2
|
||||||
|
lar 2 ; a[j]
|
||||||
|
lil 0 ; r.r1
|
||||||
|
cal $sum ; call now
|
||||||
|
asp 4 ; remove parameters from stack
|
||||||
|
lfr 2 ; get function result
|
||||||
|
stl -2 ; i :=
|
||||||
|
4
|
||||||
|
lin 31
|
||||||
|
lol -2
|
||||||
|
zle *5 ; while i > 0 do
|
||||||
|
lol -4
|
||||||
|
lil 0
|
||||||
|
adi 2
|
||||||
|
stl -4 ; j := j + r.r1
|
||||||
|
del -2 ; i := i - 1
|
||||||
|
bra *4 ; loop
|
||||||
|
5
|
||||||
|
lin 32
|
||||||
|
lol 0
|
||||||
|
stl -226 ; make copy of address of r
|
||||||
|
lol -22
|
||||||
|
lol -226
|
||||||
|
stf 10 ; r3 := b
|
||||||
|
lal -20
|
||||||
|
loi 16
|
||||||
|
adf 8
|
||||||
|
lol -226
|
||||||
|
adp 2
|
||||||
|
sti 8 ; r2 := x + y
|
||||||
|
loc 0
|
||||||
|
sil -226 ; r1 := 0
|
||||||
|
 lin 34 ; note the absence of the unnecessary jump
|
||||||
|
lae 22 ; address of output structure
|
||||||
|
lol -4
|
||||||
|
cal $_wri ; write integer with default width
|
||||||
|
asp 4 ; pop parameters
|
||||||
|
lae 22
|
||||||
|
lol -2
|
||||||
|
loc 6
|
||||||
|
cal $_wsi ; write integer width 6
|
||||||
|
asp 6
|
||||||
|
lae 22
|
||||||
|
lal -12
|
||||||
|
loi 8
|
||||||
|
loc 9
|
||||||
|
loc 3
|
||||||
|
cal $_wrf ; write fixed format real, width 9, precision 3
|
||||||
|
asp 14
|
||||||
|
lae 22
|
||||||
|
lol -22
|
||||||
|
cal $_wrb ; write boolean, default width
|
||||||
|
asp 4
|
||||||
|
lae 22
|
||||||
|
cal $_wln ; writeln
|
||||||
|
asp 2
|
||||||
|
ret 0 ; return, no result
|
||||||
|
end 226
|
||||||
|
exp $_main
|
||||||
|
pro $_main,0 ; main program
|
||||||
|
.6
|
||||||
|
con 2,-1,22 ; description of external files
|
||||||
|
.5
|
||||||
|
rom 15.96F8
|
||||||
|
fil .1 ; maintain source file name
|
||||||
|
lae .6 ; description of external files
|
||||||
|
lae 0 ; base of hol area to relocate buffer addresses
|
||||||
|
cal $_ini ; initialize files, etc...
|
||||||
|
asp 4
|
||||||
|
lin 37
|
||||||
|
lae .5
|
||||||
|
loi 8
|
||||||
|
lae 2
|
||||||
|
 sti 8 ; mx := 15.96
|
||||||
|
lni ; was lin 38 prior to optimization
|
||||||
|
loc 99
|
||||||
|
ste 0 ; mi := 99
|
||||||
|
lni ; was lin 39 prior to optimization
|
||||||
|
lae 10 ; address of r
|
||||||
|
cal $test
|
||||||
|
asp 2
|
||||||
|
loc 0 ; normal exit
|
||||||
|
cal $_hlt ; cleanup and finish
|
||||||
|
asp 2
|
||||||
|
end 0
|
||||||
|
mes 4,40 ; length of source file is 40 lines
|
||||||
|
mes 5 ; reals were used
|
||||||
40
doc/em/exam.p
Normal file
|
|
@ -0,0 +1,40 @@
|
||||||
|
program example(output);
|
||||||
|
{This program just demonstrates typical EM code.}
|
||||||
|
type rec = record r1: integer; r2:real; r3: boolean end;
|
||||||
|
var mi: integer; mx:real; r:rec;
|
||||||
|
|
||||||
|
function sum(a,b:integer):integer;
|
||||||
|
begin
|
||||||
|
sum := a + b
|
||||||
|
end;
|
||||||
|
|
||||||
|
procedure test(var r: rec);
|
||||||
|
label 1;
|
||||||
|
var i,j: integer;
|
||||||
|
x,y: real;
|
||||||
|
b: boolean;
|
||||||
|
c: char;
|
||||||
|
a: array[1..100] of integer;
|
||||||
|
|
||||||
|
begin
|
||||||
|
j := 1;
|
||||||
|
i := 3 * j + 6;
|
||||||
|
x := 4.8;
|
||||||
|
y := x/0.5;
|
||||||
|
b := true;
|
||||||
|
c := 'z';
|
||||||
|
for i:= 1 to 100 do a[i] := i * i;
|
||||||
|
r.r1 := j+27;
|
||||||
|
r.r3 := b;
|
||||||
|
r.r2 := x+y;
|
||||||
|
i := sum(r.r1, a[j]);
|
||||||
|
while i > 0 do begin j := j + r.r1; i := i - 1 end;
|
||||||
|
with r do begin r3 := b; r2 := x+y; r1 := 0 end;
|
||||||
|
goto 1;
|
||||||
|
1: writeln(j, i:6, x:9:3, b)
|
||||||
|
end; {test}
|
||||||
|
begin {main program}
|
||||||
|
mx := 15.96;
|
||||||
|
mi := 99;
|
||||||
|
test(r)
|
||||||
|
end.
|
||||||
180
doc/em/intro.nr
Normal file
|
|
@ -0,0 +1,180 @@
|
||||||
|
.BP
|
||||||
|
.S1 "INTRODUCTION"
|
||||||
|
EM is a family of intermediate languages designed for producing
|
||||||
|
portable compilers.
|
||||||
|
The general strategy is for a program called
|
||||||
|
.B front end
|
||||||
|
to translate the source program to EM.
|
||||||
|
Another program,
|
||||||
|
.B back
|
||||||
|
.BW end
|
||||||
|
translates EM to target assembly language.
|
||||||
|
Alternatively, the EM code can be assembled to a binary form
|
||||||
|
and interpreted.
|
||||||
|
These considerations led to the following goals:
|
||||||
|
.IS 2 10
|
||||||
|
.PS 1 4
|
||||||
|
.PT
|
||||||
|
The design should allow translation to,
|
||||||
|
or interpretation on, a wide range of existing machines.
|
||||||
|
Design decisions should be delayed as far as possible
|
||||||
|
and the implications of these decisions should
|
||||||
|
be localized as much as possible.
|
||||||
|
.N
|
||||||
|
The current microcomputer technology offers 8, 16 and 32 bit machines
|
||||||
|
with various sizes of address space.
|
||||||
|
EM should be flexible enough to be useful on most of these
|
||||||
|
machines.
|
||||||
|
The differences between the members of the EM family should only
|
||||||
|
concern the wordsize and address space size.
|
||||||
|
.PT
|
||||||
|
The architecture should ease the task of code generation for
|
||||||
|
high level languages such as Pascal, C, Ada, Algol 68, BCPL.
|
||||||
|
.PT
|
||||||
|
The instruction set used by the interpreter should be compact,
|
||||||
|
to reduce the amount of memory needed
|
||||||
|
for program storage, and to reduce the time needed to transmit
|
||||||
|
programs over communication lines.
|
||||||
|
.PT
|
||||||
|
It should be designed with microprogrammed implementations in
|
||||||
|
mind; in particular, the use of many short fields within
|
||||||
|
instruction opcodes should be avoided, because their extraction by the
|
||||||
|
microprogram or conversion to other instruction formats is inefficient.
|
||||||
|
.PE
|
||||||
|
.IE
|
||||||
|
.A
|
||||||
|
The basic architecture is based on the concept of a stack. The stack
|
||||||
|
is used for procedure return addresses, actual parameters, local variables,
|
||||||
|
and arithmetic operations.
|
||||||
|
There are several built-in object types,
|
||||||
|
for example, signed and unsigned integers,
|
||||||
|
floating point numbers, pointers and sets of bits.
|
||||||
|
There are instructions to push and pop objects
|
||||||
|
to and from the stack.
|
||||||
|
The push and pop instructions are not typed.
|
||||||
|
They only care about the size of the objects.
|
||||||
|
For each built-in type there are
|
||||||
|
reverse Polish type instructions that pop one or more
|
||||||
|
objects from the top of
|
||||||
|
the stack, perform an operation, and push the result back onto the
|
||||||
|
stack.
|
||||||
|
For all types except pointers,
|
||||||
|
these instructions have the object size
|
||||||
|
as argument.
|
||||||
|
.P
|
||||||
|
There are no visible general registers used for arithmetic operands
|
||||||
|
etc. This is in contrast to most third generation computers, which usually
|
||||||
|
have 8 or 16 general registers. The decision not to have a group of
|
||||||
|
general registers was fully intentional, and follows W.L. van der
|
||||||
|
Poel's dictum that a machine should have 0, 1, or an infinite
|
||||||
|
number of any feature. General registers have two primary uses: to hold
|
||||||
|
intermediate results of complicated expressions, e.g.
|
||||||
|
.IS 5 0 1
|
||||||
|
((a*b + c*d)/e + f*g/h) * i
|
||||||
|
.IE 1
|
||||||
|
and to hold local variables.
|
||||||
|
.P
|
||||||
|
Various studies
|
||||||
|
have shown that the average expression has fewer than two operands,
|
||||||
|
making the former use of registers of doubtful value. The present trend
|
||||||
|
toward structured programs consisting of many small
|
||||||
|
procedures greatly reduces the value of registers to hold local variables
|
||||||
|
because the large number of procedure calls implies a large overhead in
|
||||||
|
saving and restoring the registers at every call.
|
||||||
|
.BP
|
||||||
|
.P
|
||||||
|
Although there are no general purpose registers, there are a
|
||||||
|
few internal registers with specific functions as follows:
|
||||||
|
.IS 2
|
||||||
|
.N 1
|
||||||
|
.TS
|
||||||
|
tab(:);
|
||||||
|
l 1 l l.
|
||||||
|
PC:-:Program Counter:Pointer to next instruction
|
||||||
|
LB:-:Local Base:Points to base of the local variables \
|
||||||
|
in the current procedure.
|
||||||
|
SP:-:Stack Pointer:Points to the top (most recently pushed) word on the stack.
|
||||||
|
HP:-:Heap Pointer:Points to the top of the heap area.
|
||||||
|
.TE 1
|
||||||
|
.IE
|
||||||
|
.A
|
||||||
|
Furthermore, reverse Polish code is much easier to generate than
|
||||||
|
multi-register machine code, especially if highly efficient code is
|
||||||
|
desired.
|
||||||
|
When translating to assembly language the back end can make
|
||||||
|
good use of the target machine's registers.
|
||||||
|
An EM machine can
|
||||||
|
achieve high performance by keeping part of the stack
|
||||||
|
in high speed storage (a cache or microprogram scratchpad memory) rather
|
||||||
|
than in primary memory.
|
||||||
|
.P
|
||||||
|
Again according to van der Poel's dictum,
|
||||||
|
all EM instructions have zero or one argument.
|
||||||
|
We believe that instructions needing two arguments
|
||||||
|
can be split into two simpler ones.
|
||||||
|
The simpler ones can probably be used in other
|
||||||
|
circumstances as well.
|
||||||
|
Moreover, these two instructions together often
|
||||||
|
have a shorter encoding than the single
|
||||||
|
instruction before.
|
||||||
|
.P
|
||||||
|
This document describes EM at three different levels:
|
||||||
|
the abstract level, the assembly language level and
|
||||||
|
the machine language level.
|
||||||
|
.A
|
||||||
|
The most important level is that of the abstract EM architecture.
|
||||||
|
This level deals with the basic design issues.
|
||||||
|
Only the functional capabilities of instructions are relevant, not their
|
||||||
|
format or encoding.
|
||||||
|
Most chapters of this document refer to the abstract level
|
||||||
|
and it is explicitly stated whenever
|
||||||
|
another level is described.
|
||||||
|
.A
|
||||||
|
The assembly language is intended for the compiler writer.
|
||||||
|
It presents a more or less orthogonal instruction
|
||||||
|
set and provides symbolic names for data.
|
||||||
|
Moreover, it facilitates the linking of
|
||||||
|
separately compiled 'modules' into a single program
|
||||||
|
by providing several pseudoinstructions.
|
||||||
|
.A
|
||||||
|
The machine language is designed for interpretation with a compact
|
||||||
|
program text and easy decoding.
|
||||||
|
The binary representation of the machine language instruction set is
|
||||||
|
far from orthogonal.
|
||||||
|
Frequent instructions have a short opcode.
|
||||||
|
The encoding is fully byte oriented.
|
||||||
|
These bytes do not contain small bit fields, because
|
||||||
|
bit fields would slow down decoding considerably.
|
||||||
|
.P
|
||||||
|
A common use for EM is for producing portable (cross) compilers.
|
||||||
|
When used this way, the compilers produce
|
||||||
|
EM assembly language as their output.
|
||||||
|
To run the compiled program on the target machine,
|
||||||
|
the back end translates the EM assembly language to
|
||||||
|
the target machine's assembly language.
|
||||||
|
When this approach is used, the format of the EM
|
||||||
|
machine language instructions is irrelevant.
|
||||||
|
On the other hand, when writing an interpreter for EM machine language
|
||||||
|
programs, the interpreter must deal with the machine language
|
||||||
|
and not with the symbolic assembly language.
|
||||||
|
.P
|
||||||
|
As mentioned above, the
|
||||||
|
current microcomputer technology offers 8, 16 and 32 bit
|
||||||
|
machines with address spaces ranging from 2\v'-0.5m'16\v'0.5m'
|
||||||
|
to 2\v'-0.5m'32\v'0.5m' bytes.
|
||||||
|
Having one size of pointers and integers restricts
|
||||||
|
the usefulness of the language.
|
||||||
|
We decided to have a different language for each combination of
|
||||||
|
word and pointer size.
|
||||||
|
All languages offer the same instruction set and differ only in
|
||||||
|
memory alignment restrictions and the implicit size assumed in
|
||||||
|
several instructions.
|
||||||
|
The languages therefore differ slightly for the different size combinations,
for example in the size of any object on the stack and in the alignment restrictions.
|
||||||
|
The wordsize is restricted to powers of 2 and
|
||||||
|
the pointer size must be a multiple of the wordsize.
|
||||||
|
Almost all programs handling EM will be parametrized with word
|
||||||
|
and pointer size.
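The restriction on the two size parameters is easy to check. The following C function is a sketch of such a check; the name `em_sizes_valid` is hypothetical and does not come from any EM tool.

```c
#include <assert.h>
#include <stdbool.h>

/* Check the restrictions stated above: the wordsize is a power of 2 and
 * the pointer size is a (non-zero) multiple of the wordsize. */
bool em_sizes_valid(unsigned wordsize, unsigned pointersize)
{
    bool word_is_power_of_2 =
        wordsize != 0 && (wordsize & (wordsize - 1)) == 0;

    /* The && short-circuits, so the modulo is never taken with wordsize 0. */
    return word_is_power_of_2
        && pointersize != 0
        && pointersize % wordsize == 0;
}
```

Under this check a wordsize of 2 with a pointer size of 2 or 4 is accepted, while a wordsize of 4 with a pointer size of 2 is rejected.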
|
||||||
376
doc/em/iotrap.nr
Normal file
|
|
@ -0,0 +1,376 @@
|
||||||
|
.SN 8
|
||||||
|
.VS 1 0
|
||||||
|
.BP
|
||||||
|
.S1 "ENVIRONMENT INTERACTIONS"
|
||||||
|
EM programs can interact with their environment in three ways.
|
||||||
|
Two, starting/stopping and monitor calls, are dealt with in this chapter.
|
||||||
|
The remaining way to interact, interrupts, will be treated
|
||||||
|
together with traps in chapter 9.
|
||||||
|
.S2 "Program starting and stopping"
|
||||||
|
EM user programs start with a call to a procedure called
|
||||||
|
m_a_i_n.
|
||||||
|
The assembler and backends look for the definition of a procedure
|
||||||
|
with this name in their input.
|
||||||
|
The call passes three parameters to the procedure.
|
||||||
|
The parameters are similar to the parameters supplied by the
|
||||||
|
UNIX
|
||||||
|
.FS
|
||||||
|
UNIX is a Trademark of Bell Laboratories.
|
||||||
|
.FE
|
||||||
|
operating system to C programs.
|
||||||
|
These parameters are often called
|
||||||
|
.BW argc ,
|
||||||
|
.B argv
|
||||||
|
and
|
||||||
|
.BW envp .
|
||||||
|
Argc is the parameter nearest to LB and is a wordsized integer.
|
||||||
|
The other two are pointers to the first element of an array of
|
||||||
|
string pointers.
|
||||||
|
.N
|
||||||
|
The
|
||||||
|
.B argv
|
||||||
|
array contains
|
||||||
|
.B argc
|
||||||
|
strings, the first of which contains the program call name.
|
||||||
|
The other strings in the
|
||||||
|
.B argv
|
||||||
|
array are the program parameters.
|
||||||
|
.P
|
||||||
|
The
|
||||||
|
.B envp
|
||||||
|
array contains strings in the form "name=string", where 'name'
|
||||||
|
is the name of an environment variable and 'string' its value.
|
||||||
|
The
|
||||||
|
.B envp
|
||||||
|
array is terminated by a zero pointer.
|
||||||
|
.P
|
||||||
|
An EM user program stops if the program returns from the first
|
||||||
|
invocation of m_a_i_n.
|
||||||
|
The contents of the function return area are used to procure a
|
||||||
|
wordsized program return code.
|
||||||
|
EM programs also stop when traps and interrupts occur that are
|
||||||
|
not caught and when the exit monitor call is executed.
|
||||||
|
.S2 "Input/Output and other monitor calls"
|
||||||
|
EM differs from most conventional machines in that it has high level i/o
|
||||||
|
instructions.
|
||||||
|
Typical instructions are OPEN FILE and READ FROM FILE instead
|
||||||
|
of low level instructions such as setting and clearing
|
||||||
|
bits in device registers.
|
||||||
|
By providing such high level i/o primitives, the task of implementing
|
||||||
|
EM on various non EM machines is made considerably easier.
|
||||||
|
.P
|
||||||
|
I/O is initiated by the MON instruction, which expects an iocode on top
|
||||||
|
of the stack.
|
||||||
|
Often there are also parameters which are pushed on the
|
||||||
|
stack in reverse order, that is: last
|
||||||
|
parameter first.
|
||||||
|
Some i/o functions also provide results, which are returned on the stack.
|
||||||
|
In the list of monitor calls we use several types of parameters and results;
|
||||||
|
these types consist of integers and unsigneds of varying sizes, but never
|
||||||
|
smaller than the wordsize, and the two pointer types.
|
||||||
|
.N 1
|
||||||
|
The names of the types used are:
|
||||||
|
.IS 4
|
||||||
|
.PS - 10
|
||||||
|
.PT int
|
||||||
|
an integer of wordsize
|
||||||
|
.PT int2
|
||||||
|
an integer whose size is the maximum of the wordsize and 2
|
||||||
|
bytes
|
||||||
|
.PT int4
|
||||||
|
an integer whose size is the maximum of the wordsize and 4
|
||||||
|
bytes
|
||||||
|
.PT intp
|
||||||
|
an integer with the size of a pointer
|
||||||
|
.PT uns2
|
||||||
|
an unsigned integer whose size is the maximum of the wordsize and 2 bytes
|
||||||
|
.PT unsp
|
||||||
|
an unsigned integer with the size of a pointer
|
||||||
|
.PT ptr
|
||||||
|
a pointer into data space
|
||||||
|
.PE 1
|
||||||
|
.IE 0
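The type names listed above depend only on the word and pointer size. A small Python sketch of that dependency follows; the function name monitor_type_sizes is an assumption made for illustration.

```python
def monitor_type_sizes(wordsize: int, pointersize: int) -> dict:
    """Size in bytes of each parameter/result type, following the list above."""
    return {
        "int": wordsize,                # an integer of wordsize
        "int2": max(wordsize, 2),       # never smaller than the wordsize
        "int4": max(wordsize, 4),
        "intp": pointersize,            # an integer with the size of a pointer
        "uns2": max(wordsize, 2),
        "unsp": pointersize,
        "ptr": pointersize,             # a pointer into data space
    }
```

With 2-byte words and 4-byte pointers, int2 occupies 2 bytes and int4 occupies 4 bytes; with 4-byte words both occupy 4 bytes.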
|
||||||
|
The table below lists the i/o codes with their results and
|
||||||
|
parameters.
|
||||||
|
This list is similar to the system calls of the UNIX Version 7
|
||||||
|
operating system.
|
||||||
|
.BP
|
||||||
|
.A
|
||||||
|
To execute a monitor call, proceed as follows:
|
||||||
|
.IS 2
|
||||||
|
.N 1
|
||||||
|
.PS a 4 "" )
|
||||||
|
.PT
|
||||||
|
Stack the parameters, in reverse order, last parameter first.
|
||||||
|
.PT
|
||||||
|
Push the monitor call number (iocode) onto the stack.
|
||||||
|
.PT
|
||||||
|
Execute the MON instruction.
|
||||||
|
.PE 1
|
||||||
|
.IE
|
||||||
|
An error code is present on the top of the stack after
|
||||||
|
execution of most monitor calls.
|
||||||
|
If this error code is zero, the call performed the action
|
||||||
|
requested and the results are available on top of the stack.
|
||||||
|
Non-zero error codes indicate a failure; in this case no
|
||||||
|
results are available and the error code has been pushed twice.
|
||||||
|
This construction enables programs to test for failure with a
|
||||||
|
single instruction (~TEQ or TNE~) and still find out the cause of
|
||||||
|
the failure.
|
||||||
|
The result name 'e' is reserved for the error code.
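The calling convention above can be sketched in Python with a list standing in for the EM stack. Everything here is an illustration under stated assumptions: mon_call, fake_close and failed are hypothetical names, fake_close is a stand-in for monitor call 6 (Close) rather than a real implementation, and the error value 9 is an arbitrary choice for the example.

```python
def mon_call(stack: list, iocode: int, params: list, monitor) -> None:
    """Perform steps a) to c) on a Python list used as the EM stack."""
    # a) Stack the parameters in reverse order, last parameter first.
    for value in reversed(params):
        stack.append(value)
    # b) Push the monitor call number (iocode).
    stack.append(iocode)
    # c) "Execute MON": the monitor pops iocode and parameters and
    #    pushes the results and the error code.
    monitor(stack)


def fake_close(stack: list) -> None:
    """Hypothetical stand-in for monitor call 6 (Close)."""
    iocode = stack.pop()
    assert iocode == 6
    fildes = stack.pop()
    if 0 <= fildes <= 2:
        stack.append(0)   # success: e = 0, Close has no other results
    else:
        stack.append(9)   # failure: the error code is pushed twice
        stack.append(9)


def failed(stack: list) -> bool:
    """Pop the error code; after a failure a second copy remains on the stack."""
    return stack.pop() != 0
```

After a failing call, one test consumes the top copy of the error code and the cause is still available underneath it, which is the point of pushing the code twice.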
|
||||||
|
.N 1
|
||||||
|
List of monitor calls.
|
||||||
|
.DS B
|
||||||
|
number name parameters results function
|
||||||
|
|
||||||
|
1 Exit status:int Terminate this process
|
||||||
|
2 Fork e,flag,pid:int Spawn new process
|
||||||
|
3 Read fildes:int;buf:ptr;nbytes:unsp
|
||||||
|
e:int;rbytes:unsp Read from file
|
||||||
|
4 Write fildes:int;buf:ptr;nbytes:unsp
|
||||||
|
e:int;wbytes:unsp Write on a file
|
||||||
|
5 Open string:ptr;flag:int
|
||||||
|
e,fildes:int Open file for read and/or write
|
||||||
|
6 Close fildes:int e:int Close a file
|
||||||
|
7 Wait e:int;status,pid:int2
|
||||||
|
Wait for child
|
||||||
|
8 Creat string:ptr;mode:int
|
||||||
|
e,fildes:int Create a new file
|
||||||
|
9 Link string1,string2:ptr
|
||||||
|
e:int Link to a file
|
||||||
|
10 Unlink string:ptr e:int Remove directory entry
|
||||||
|
12 Chdir string:ptr e:int Change default directory
|
||||||
|
14 Mknod string:ptr;mode,addr:int2
|
||||||
|
e:int Make a special file
|
||||||
|
15 Chmod string:ptr;mode:int2
|
||||||
|
e:int Change mode of file
|
||||||
|
16 Chown string:ptr;owner,group:int2
|
||||||
|
e:int Change owner/group of a file
|
||||||
|
18 Stat string,statbuf:ptr
|
||||||
|
e:int Get file status
|
||||||
|
19 Lseek fildes:int;off:int4;whence:int
|
||||||
|
e:int;oldoff:int4 Move read/write pointer
|
||||||
|
20 Getpid pid:int2 Get process identification
|
||||||
|
21 Mount special,string:ptr;rwflag:int
|
||||||
|
e:int Mount file system
|
||||||
|
22 Umount special:ptr e:int Unmount file system
|
||||||
|
23 Setuid userid:int2 e:int Set user ID
|
||||||
|
24 Getuid e_uid,r_uid:int2 Get user ID
|
||||||
|
25 Stime time:int4 e:int Set time and date
|
||||||
|
26 Ptrace request:int;pid:int2;addr:ptr;data:int
|
||||||
|
e,value:int Process trace
|
||||||
|
27 Alarm seconds:uns2 previous:uns2 Schedule signal
|
||||||
|
28 Fstat fildes:int;statbuf:ptr
|
||||||
|
e:int Get file status
|
||||||
|
29 Pause Stop until signal
|
||||||
|
30 Utime string,timep:ptr
|
||||||
|
e:int Set file times
|
||||||
|
33 Access string:ptr;mode:int e:int Determine file accessibility
|
||||||
|
34 Nice incr:int Set program priority
|
||||||
|
35 Ftime bufp:ptr e:int Get date and time
|
||||||
|
36 Sync Update filesystem
|
||||||
|
37 Kill pid:int2;sig:int
|
||||||
|
e:int Send signal to a process
|
||||||
|
41 Dup fildes,newfildes:int
|
||||||
|
e,fildes:int Duplicate a file descriptor
|
||||||
|
42 Pipe e,w_des,r_des:int Create a pipe
|
||||||
|
43 Times buffer:ptr Get process times
|
||||||
|
44 Profil buff:ptr;bufsiz,offset,scale:intp Execution time profile
|
||||||
|
46 Setgid gid:int2 e:int Set group ID
|
||||||
|
47 Getgid e_gid,r_gid:int2 Get group ID
|
||||||
|
48 Sigtrp trapno,signo:int
|
||||||
|
e,prevtrap:int See below
|
||||||
|
51 Acct file:ptr e:int Turn accounting on or off
|
||||||
|
53 Lock flag:int e:int Lock a process
|
||||||
|
54 Ioctl fildes,request:int;argp:ptr
|
||||||
|
e:int Control device
|
||||||
|
56 Mpxcall cmd:int;vec:ptr e:int Multiplexed file handling
|
||||||
|
59 Exece name,argv,envp:ptr
|
||||||
|
e:int Execute a file
|
||||||
|
60 Umask complmode:int2 oldmask:int2 Set file creation mode mask
|
||||||
|
61 Chroot string:ptr e:int Change root directory
|
||||||
|
.DE 1
|
||||||
|
Codes 0, 11, 13, 17, 31, 32, 38, 39, 40, 45, 49, 50, 52,
|
||||||
|
55, 57, 58, 62, and 63 are
|
||||||
|
not used.
|
||||||
|
.P
|
||||||
|
All monitor calls, except fork and sigtrp
|
||||||
|
are the same as the UNIX Version 7 system calls.
|
||||||
|
.P
|
||||||
|
The sigtrp entry maps UNIX signals onto EM interrupts.
|
||||||
|
Normally, trapno is in the range 0 to 252.
|
||||||
|
In that case it requests that signal signo
|
||||||
|
will cause trap trapno to occur.
|
||||||
|
When given trap number -2, default signal handling is reset, and when given
|
||||||
|
trap number -3, the signal is ignored.
|
||||||
|
.P
|
||||||
|
The flag returned by fork is 1 in the child process and 0 in
|
||||||
|
the parent.
|
||||||
|
The pid returned is the process-id of the other process.
|
||||||
|
.BP
|
||||||
|
.S1 "TRAPS AND INTERRUPTS"
|
||||||
|
EM provides a means for the user program to catch all traps
|
||||||
|
generated by the program itself, the hardware, or external conditions.
|
||||||
|
This mechanism uses five instructions: LIM, SIM, SIG, TRP and RTT.
|
||||||
|
This section of the manual may be omitted on the first reading since it
|
||||||
|
presupposes knowledge of the EM instruction set.
|
||||||
|
.P
|
||||||
|
The action taken when a trap occurs is determined by the value
|
||||||
|
of an internal EM trap register.
|
||||||
|
This register contains a pointer to a procedure.
|
||||||
|
Initially the pointer used is zero and all traps halt the
|
||||||
|
program with, hopefully, a useful message to the outside world.
|
||||||
|
The SIG instruction can be used to alter the trap register;
|
||||||
|
it pops a procedure pointer from the
|
||||||
|
stack into the trap register.
|
||||||
|
When a trap occurs after storing a nonzero value in the trap
|
||||||
|
register, the procedure pointed to by the trap register
|
||||||
|
is called with the trap number
|
||||||
|
as the only parameter (see below).
|
||||||
|
SIG returns the previous value of the trap register on the
|
||||||
|
stack.
|
||||||
|
Two consecutive SIGs are a no-op.
|
||||||
|
When a trap occurs, the trap register is reset to its initial
|
||||||
|
condition, to prevent recursive traps from hanging the machine up,
|
||||||
|
e.g. stack overflow in the stack overflow handling procedure.
|
||||||
|
.P
|
||||||
|
The runtime systems for some languages need to ignore some EM
|
||||||
|
traps.
|
||||||
|
EM offers a feature called the ignore mask.
|
||||||
|
It contains one bit for each of the lowest 16 trap numbers.
|
||||||
|
The bits are numbered 0 to 15, with the least significant bit
|
||||||
|
having number 0.
|
||||||
|
If a certain bit is 1 the corresponding trap never
|
||||||
|
occurs and processing simply continues.
|
||||||
|
The actions performed by the offending instruction are
|
||||||
|
described by the Pascal program in appendix A.
|
||||||
|
.N
|
||||||
|
If the bit is 0, traps are not ignored.
|
||||||
|
The instructions LIM and SIM allow copying and replacement of
|
||||||
|
the ignore mask.~
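The bit numbering of the ignore mask can be made concrete with a short Python sketch. The helper names mask_bit and is_ignored are assumptions made for illustration; the trap number used, 4 for floating overflow (EFOVFL), is taken from the list of EM machine errors.

```python
EFOVFL = 4  # trap number of floating overflow


def mask_bit(trapno: int) -> int:
    """Mask value with only the bit for trapno set (bit 0 is least significant)."""
    if not 0 <= trapno <= 15:
        raise ValueError("only the lowest 16 trap numbers have a mask bit")
    return 1 << trapno


def is_ignored(mask: int, trapno: int) -> bool:
    """True if the ignore mask suppresses this trap."""
    return 0 <= trapno <= 15 and (mask & (1 << trapno)) != 0
```

Under this numbering the mask value that ignores only EFOVFL is mask_bit(4), that is 16, and traps numbered 16 and above can never be ignored whatever the mask contains.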
|
||||||
|
.P
|
||||||
|
The TRP instruction generates a trap, the trap number being found on the
|
||||||
|
stack.
|
||||||
|
This is, among other things,
|
||||||
|
useful for library procedures and runtime systems.
|
||||||
|
It can also be used by a low level trap procedure to pass the trap to a
|
||||||
|
higher level one (see example below).
|
||||||
|
.P
|
||||||
|
The RTT instruction returns from the trap procedure and continues after the
|
||||||
|
trap.
|
||||||
|
In the list below all traps marked with an asterisk ('*') are
|
||||||
|
considered to be fatal and it is explicitly undefined what happens if
|
||||||
|
you try to restart after the trap.
|
||||||
|
.P
|
||||||
|
The way a trap procedure is called is completely compatible
|
||||||
|
with normal calling conventions. The only way a trap procedure
|
||||||
|
differs from normal procedures is the return. It has to use RTT instead
|
||||||
|
of RET. This is necessary because the complete runtime status is saved on the
|
||||||
|
stack before calling the procedure and all this status has to be reloaded.
|
||||||
|
Trap numbers are in the range 0 to 252.
|
||||||
|
The trap numbers are divided into three categories:
|
||||||
|
.IS 4
|
||||||
|
.N 1
|
||||||
|
.PS - 10
|
||||||
|
.PT ~~0-~63
|
||||||
|
EM machine errors, e.g. illegal instruction.
|
||||||
|
.PS - 8
|
||||||
|
.PT ~0-15
|
||||||
|
maskable
|
||||||
|
.PT 16-63
|
||||||
|
not maskable
|
||||||
|
.PE
|
||||||
|
.PT ~64-127
|
||||||
|
Reserved for use by compilers, run time systems, etc.
|
||||||
|
.PT 128-252
|
||||||
|
Available for user programs.
|
||||||
|
.PE 1
|
||||||
|
.IE
|
||||||
|
EM machine errors are numbered as follows:
|
||||||
|
.DS I 5
|
||||||
|
.TS
|
||||||
|
tab(@);
|
||||||
|
n l l.
|
||||||
|
0@EARRAY@Array bound error
|
||||||
|
1@ERANGE@Range bound error
|
||||||
|
2@ESET@Set bound error
|
||||||
|
3@EIOVFL@Integer overflow
|
||||||
|
4@EFOVFL@Floating overflow
|
||||||
|
5@EFUNFL@Floating underflow
|
||||||
|
6@EIDIVZ@Divide by 0
|
||||||
|
7@EFDIVZ@Divide by 0.0
|
||||||
|
8@EIUND@Undefined integer
|
||||||
|
9@EFUND@Undefined float
|
||||||
|
10@ECONV@Conversion error
|
||||||
|
16*@ESTACK@Stack overflow
|
||||||
|
17*@EHEAP@Heap overflow
|
||||||
|
18*@EILLINS@Illegal instruction
|
||||||
|
19*@EODDZ@Illegal size argument
|
||||||
|
20*@ECASE@Case error
|
||||||
|
21*@EMEMFLT@Addressing non existent memory
|
||||||
|
22*@EBADPTR@Bad pointer used
|
||||||
|
23*@EBADPC@Program counter out of range
|
||||||
|
24@EBADLAE@Bad argument of LAE
|
||||||
|
25@EBADMON@Bad monitor call
|
||||||
|
26@EBADLIN@Argument of LIN too high
|
||||||
|
27@EBADGTO@GTO descriptor error
|
||||||
|
.TE
|
||||||
|
.DE 0
|
||||||
|
.P
|
||||||
|
As an example,
|
||||||
|
suppose a subprocedure has to be written to do a numeric
|
||||||
|
calculation.
|
||||||
|
When an overflow occurs the computation has to be stopped and
|
||||||
|
the higher level procedure must be resumed.
|
||||||
|
This can be programmed as follows using the mechanism described above:
|
||||||
|
.DS B
|
||||||
|
mes 2,2,2 ; set sizes
|
||||||
|
ersave
|
||||||
|
bss 2,0,0 ; Room to save previous value of trap procedure
|
||||||
|
msave
|
||||||
|
bss 2,0,0 ; Room to save previous value of trap mask
|
||||||
|
|
||||||
|
pro calcule,0 ; entry point
|
||||||
|
lxl 0 ; fill in non-local goto descriptor with LB
|
||||||
|
ste jmpbuf+4
|
||||||
|
lor 1 ; and SP
|
||||||
|
ste jmpbuf+2
|
||||||
|
lim ; get current ignore mask
|
||||||
|
ste msave ; save it
|
||||||
|
lim
|
||||||
|
loc 16 ; bit for EFOVFL (trap 4, mask bit 4)
|
||||||
|
ior 2 ; set in mask
|
||||||
|
sim ; ignore EFOVFL from now on
|
||||||
|
lpi $catch ; load procedure identifier
|
||||||
|
sig ; catch will get all traps now
|
||||||
|
ste ersave ; save previous trap procedure identifier
|
||||||
|
; perform calculation now, possibly generating overflow
|
||||||
|
1 ; label jumped to by catch procedure
|
||||||
|
loe ersave ; get old trap procedure
|
||||||
|
sig ; refer all following traps to old procedure
|
||||||
|
asp 2 ; remove result of sig
|
||||||
|
loe msave ; restore previous mask
|
||||||
|
sim ; done now
|
||||||
|
; load result of calculation
|
||||||
|
ret 2 ; return result
|
||||||
|
jmpbuf
|
||||||
|
con *1,0,0
|
||||||
|
end
|
||||||
|
.DE 0
|
||||||
|
.VS 1 1
|
||||||
|
.DS
|
||||||
|
Example of catch procedure
|
||||||
|
pro catch,0 ; Local procedure that must catch the overflow trap
|
||||||
|
lol 2 ; Load trap number
|
||||||
|
loc 4 ; check for overflow
|
||||||
|
bne *1 ; if other trap, call higher trap procedure
|
||||||
|
gto jmpbuf ; return to procedure calcule
|
||||||
|
1 ; other trap has occurred
|
||||||
|
loe ersave ; previous trap procedure
|
||||||
|
sig ; other procedure will get the traps now
|
||||||
|
asp 2 ; remove the result of sig
|
||||||
|
lol 2 ; stack trap number
|
||||||
|
trp ; call other trap procedure
|
||||||
|
rtt ; if other procedure returns, do the same
|
||||||
|
end
|
||||||
|
.DE
|
||||||
6
doc/em/ip.awk
Normal file
|
|
@ -0,0 +1,6 @@
|
||||||
|
BEGIN { printf ".TS\nlw(6) lw(8) rw(3) rw(6) 14 lw(6) lw(8) rw(3) rw(6) 14 lw(6) lw(8) rw(3) rw(6).\n" }
|
||||||
|
NF == 4 { printf "%s\t%s\t%d\t%d",$1,$2,$3,$4 }
|
||||||
|
NF == 3 { printf "%s\t%s\t\t%d",$1,$2,$3 }
|
||||||
|
{ if ( NR%3 == 0 ) printf("\n") ; else printf("\t"); }
|
||||||
|
END { if ( NR%3 != 0 ) printf("\n")
|
||||||
|
printf ".TE\n" }
|
||||||
61
doc/em/ispace.nr
Normal file
|
|
@ -0,0 +1,61 @@
|
||||||
|
.SN 3
|
||||||
|
.BP
|
||||||
|
.S1 "INSTRUCTION ADDRESS SPACE"
|
||||||
|
The instruction space of the EM machine contains
|
||||||
|
the code for procedures.
|
||||||
|
Tables necessary for the execution of this code, for example, procedure
|
||||||
|
descriptor tables, may also be present.
|
||||||
|
The instruction space does not change during
|
||||||
|
the execution of a program, so that it may be
|
||||||
|
protected.
|
||||||
|
No further restrictions to the instruction address space are
|
||||||
|
necessary for the abstract and assembly language level.
|
||||||
|
.P
|
||||||
|
Each procedure has a single entry point: the first instruction.
|
||||||
|
A special type of pointer identifies a procedure.
|
||||||
|
Pointers into the instruction
|
||||||
|
address space have the same size as pointers into data space and
|
||||||
|
can, for example, contain the address of the first instruction
|
||||||
|
or an index in a procedure descriptor table.
|
||||||
|
.A
|
||||||
|
There is a single EM program counter, PC, pointing
|
||||||
|
to the next instruction to be executed.
|
||||||
|
The procedure pointed to by PC is
|
||||||
|
called the 'current' procedure.
|
||||||
|
A procedure may call another procedure using the CAL or CAI
|
||||||
|
instruction.
|
||||||
|
The calling procedure remains 'active' and is resumed whenever the called
|
||||||
|
procedure returns.
|
||||||
|
Note that a procedure has several 'active' invocations when
|
||||||
|
called recursively.
|
||||||
|
.P
|
||||||
|
Each procedure must return properly.
|
||||||
|
It is not allowed to fall through to the
|
||||||
|
code of the next procedure.
|
||||||
|
There are several ways to exit from a procedure:
|
||||||
|
.IS 3
|
||||||
|
.PS
|
||||||
|
.PT
|
||||||
|
the RET instruction, which returns to the
|
||||||
|
calling procedure.
|
||||||
|
.PT
|
||||||
|
the RTT instruction, which exits a trap handling routine and resumes
|
||||||
|
the trapping instruction (see next chapter).
|
||||||
|
.PT
|
||||||
|
the GTO instruction, which is used for non-local goto's.
|
||||||
|
It can remove several frames from the stack and transfer
|
||||||
|
control to an active procedure.
|
||||||
|
.PE
|
||||||
|
.IE
|
||||||
|
.P
|
||||||
|
All branch instructions can transfer control
|
||||||
|
to any label within the same procedure.
|
||||||
|
Branch instructions can never jump out of a procedure.
|
||||||
|
.P
|
||||||
|
Several language implementations use a so called procedure
|
||||||
|
instance identifier, a combination of a procedure identifier and
|
||||||
|
the LB of a stack frame, also called static link.
|
||||||
|
.P
|
||||||
|
The program text for each procedure, as well as any tables,
|
||||||
|
are fragments and can be allocated anywhere
|
||||||
|
in the instruction address space.
|
||||||
2525
doc/em/itables
Normal file
File diff suppressed because it is too large
390
doc/em/mach.nr
Normal file
|
|
@ -0,0 +1,390 @@
|
||||||
|
.BP
|
||||||
|
.SN 10
|
||||||
|
.S1 "EM MACHINE LANGUAGE"
|
||||||
|
The EM machine language is designed to make program text compact
|
||||||
|
and to make decoding easy.
|
||||||
|
Compact program text has many advantages: programs execute faster,
|
||||||
|
programs occupy less primary and secondary storage and loading
|
||||||
|
programs into satellite processors is faster.
|
||||||
|
The decoding of EM machine language is so simple
|
||||||
|
that it is feasible to use interpreters as long as EM hardware
|
||||||
|
machines are not available.
|
||||||
|
This chapter is irrelevant when back ends are used to
|
||||||
|
produce executable target machine code.
|
||||||
|
.S2 "Instruction encoding"
|
||||||
|
A design goal of EM is to make the
|
||||||
|
program text as compact as possible.
|
||||||
|
Decoding must be easy, however.
|
||||||
|
The encoding is fully byte oriented, without any small bit fields.
|
||||||
|
There are 256 primary opcodes, two of which are an escape to
|
||||||
|
two groups of 256 secondary opcodes each.
|
||||||
|
.A
|
||||||
|
EM instructions without arguments have a single opcode assigned,
|
||||||
|
possibly escaped:
|
||||||
|
.DS
|
||||||
|
|
||||||
|
|--------------|
|
||||||
|
| opcode |
|
||||||
|
|--------------|
|
||||||
|
|
||||||
|
or
|
||||||
|
|
||||||
|
|--------------|--------------|
|
||||||
|
| escape | opcode |
|
||||||
|
|--------------|--------------|
|
||||||
|
|
||||||
|
.DE
|
||||||
|
The encoding for instructions with an argument is more complex.
|
||||||
|
Several instructions have an address from the global data area
|
||||||
|
as argument.
|
||||||
|
Other instructions have different opcodes for positive
|
||||||
|
and negative arguments.
|
||||||
|
.N 1
|
||||||
|
There is always an opcode that takes the next two bytes as argument,
|
||||||
|
high byte first:
|
||||||
|
.DS
|
||||||
|
|
||||||
|
|--------------|--------------|--------------|
|
||||||
|
| opcode | hibyte | lobyte |
|
||||||
|
|--------------|--------------|--------------|
|
||||||
|
|
||||||
|
or
|
||||||
|
|
||||||
|
|--------------|--------------|--------------|--------------|
|
||||||
|
| escape | opcode | hibyte | lobyte |
|
||||||
|
|--------------|--------------|--------------|--------------|
|
||||||
|
|
||||||
|
.DE
|
||||||
|
.DS
|
||||||
|
An extra escape is provided for instructions with four or eight byte arguments.
|
||||||
|
|
||||||
|
|--------------|--------------|--------------| |--------------|
|
||||||
|
| ESCAPE | opcode | hibyte |...| lobyte |
|
||||||
|
|--------------|--------------|--------------| |--------------|
|
||||||
|
|
||||||
|
.DE
|
||||||
|
For most instructions some argument values predominate.
|
||||||
|
The most frequent combinations of instruction and argument
|
||||||
|
will be encoded in a single byte, called a mini:
|
||||||
|
.DS
|
||||||
|
|
||||||
|
|---------------|
|
||||||
|
|opcode+argument| (mini)
|
||||||
|
|---------------|
|
||||||
|
|
||||||
|
.DE
|
||||||
|
The number of minis is restricted, because only
|
||||||
|
254 primary opcodes are available.
|
||||||
|
Many instructions have the bulk of their arguments
|
||||||
|
fall in the range 0 to 255.
|
||||||
|
Instructions that address global data have their arguments
|
||||||
|
distributed over a wider range,
|
||||||
|
but small values of the high byte are common.
|
||||||
|
For all these cases there is another encoding
|
||||||
|
that combines the instruction and the high byte of the argument
|
||||||
|
into a single opcode.
|
||||||
|
These opcodes are called shorties.
|
||||||
|
Shorties may be escaped.
|
||||||
|
.DS
|
||||||
|
|
||||||
|
|--------------|--------------|
|
||||||
|
| opcode+high | lobyte | (shortie)
|
||||||
|
|--------------|--------------|
|
||||||
|
|
||||||
|
or
|
||||||
|
|
||||||
|
|--------------|--------------|--------------|
|
||||||
|
| escape | opcode+high | lobyte |
|
||||||
|
|--------------|--------------|--------------|
|
||||||
|
|
||||||
|
.DE
|
||||||
|
Escaped shorties are useless if the normal encoding has a primary opcode.
|
||||||
|
Note that for some instruction-argument combinations
|
||||||
|
several different encodings are available.
|
||||||
|
It is the task of the assembler to select the shortest of these.
|
||||||
|
The savings by these mini and shortie
|
||||||
|
opcodes are considerable, about 55%.
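As a rough illustration of how an assembler might choose among the mini, shortie and two-byte encodings, consider the Python sketch below. It is a sketch under stated assumptions: the opcode numbers and the (first_opcode, count) table entries are hypothetical, escaped opcodes and negative minis are left out, and the real assignment is table driven.

```python
def encode(arg: int, long_opcode: int, mini=None, shortie=None) -> bytes:
    """Return the shortest encoding of one instruction with argument arg.

    mini and shortie are hypothetical (first_opcode, count) table entries.
    """
    candidates = []
    if mini is not None:
        first, count = mini
        if 0 <= arg < count:
            # mini: opcode and argument share a single byte
            candidates.append(bytes([first + arg]))
    if shortie is not None:
        first, count = shortie
        if 0 <= arg < 256 * count:
            # shortie: opcode combined with the high byte, then the low byte
            candidates.append(bytes([first + (arg >> 8), arg & 0xFF]))
    if -32768 <= arg <= 32767:
        # general form: opcode, then the argument with the high byte first
        candidates.append(bytes([long_opcode]) + (arg & 0xFFFF).to_bytes(2, "big"))
    if not candidates:
        raise ValueError("argument needs a four or eight byte encoding")
    return min(candidates, key=len)
```

With a mini range of 8 values and a shortie range of 2 high bytes, the argument 3 takes one byte, 300 takes two bytes and -1 falls back to the three-byte form.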
|
||||||
|
.P
|
||||||
|
Further improvements are possible:
|
||||||
|
the arguments of
|
||||||
|
many instructions are a multiple of the wordsize.
|
||||||
|
Some also do not allow zero as an argument.
|
||||||
|
If these arguments are divided by the wordsize and,
|
||||||
|
when zero is not allowed, then decremented by 1, more of them can
|
||||||
|
be encoded as shortie or mini.
|
||||||
|
The arguments of some other instructions
|
||||||
|
rarely or never assume the value 0, but start at 1.
|
||||||
|
The value 1 is then encoded as 0,
|
||||||
|
2 as 1 and so on.
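The scaling described above amounts to a division followed by an optional decrement. A minimal Python sketch, in which the function name scale_argument and its parameters are assumptions made for illustration:

```python
def scale_argument(arg: int, wordsize: int, zero_allowed: bool) -> int:
    """Scale a word-multiple argument before choosing its encoding."""
    if arg % wordsize != 0:
        raise ValueError("argument is not a multiple of the wordsize")
    scaled = arg // wordsize          # divide by the wordsize
    if not zero_allowed:
        if scaled == 0:
            raise ValueError("zero is not allowed for this instruction")
        scaled -= 1                   # 1 is encoded as 0, 2 as 1, and so on
    return scaled
```

With a wordsize of 2, an argument of 8 scales to 4, or to 3 when zero is not an allowed argument, which leaves more values within reach of a mini or shortie.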
|
||||||
|
.P
|
||||||
|
Assigning opcodes to instructions by the assembler is completely
|
||||||
|
table driven.
|
||||||
|
For details see appendix B.
|
||||||
|
.S2 "Procedure descriptors"
|
||||||
|
The procedure identifiers used in the interpreter are indices
|
||||||
|
into a table of procedure descriptors.
|
||||||
|
Each descriptor contains:
|
||||||
|
.IS 6
|
||||||
|
.PS - 4
|
||||||
|
.PT 1.
|
||||||
|
the number of bytes to be reserved for locals at each
|
||||||
|
invocation.
|
||||||
|
.N
|
||||||
|
This is a pointer-sized integer.
|
||||||
|
.PT 2.
|
||||||
|
the start address of the procedure
|
||||||
|
.PE
|
||||||
|
.IE
|
||||||
|
.S2 "Load format"
|
||||||
|
The EM machine language load format defines the interface between
|
||||||
|
the EM assembler/loader and the EM machine itself.
|
||||||
|
A load file consists of a header, the program text to be executed,
|
||||||
|
a description of the global data area and the procedure descriptor table,
|
||||||
|
in this order.
|
||||||
|
All integers in the load file are presented with the
|
||||||
|
least significant byte first.
|
||||||
|
.P
|
||||||
|
The header has two parts: the first half (eight 16-bit integers)
|
||||||
|
aids in selecting
|
||||||
|
the correct EM machine or interpreter.
|
||||||
|
Some EM machines, for instance, may have hardware floating point
|
||||||
|
instructions.
|
||||||
|
.N
|
||||||
|
The header entries are as follows (bit 0 is rightmost):
|
||||||
|
.IS 2
|
||||||
|
.VS 1 0
|
||||||
|
.PS 1 4 "" :
|
||||||
|
.PT
|
||||||
|
magic number (07255)
|
||||||
|
.PT
|
||||||
|
flag bits with the following meaning:
|
||||||
|
.PS - 7 "" :
|
||||||
|
.PT bit 0
|
||||||
|
TEST; test for integer overflow etc.
|
||||||
|
.PT bit 1
|
||||||
|
PROFILE; for each source line: count the number of memory
|
||||||
|
cycles executed.
|
||||||
|
.PT bit 2
|
||||||
|
FLOW; for each source line: set a bit in a bit map table if
|
||||||
|
instructions on that line are executed.
|
||||||
|
.PT bit 3
|
||||||
|
COUNT; for each source line: increment a counter if that line
|
||||||
|
is entered.
|
||||||
|
.PT bit 4
|
||||||
|
REALS; set if a program uses floating point instructions.
|
||||||
|
.PT bit 5
|
||||||
|
EXTRA; more tests during compiler debugging.
|
||||||
|
.PE
|
||||||
|
.PT
|
||||||
|
number of unresolved references.
|
||||||
|
.PT
|
||||||
|
version number; used to detect obsolete EM load files.
|
||||||
|
.PT
|
||||||
|
wordsize ; the number of bytes in each machine word.
|
||||||
|
.PT
|
||||||
|
pointer size ; the number of bytes available for addressing.
|
||||||
|
.PT
|
||||||
|
unused
|
||||||
|
.PT
|
||||||
|
unused
|
||||||
|
.PE
|
||||||
|
.IE
|
||||||
|
The second part of the header (eight entries, of pointer size bytes each)
|
||||||
|
describes the load file itself:
|
||||||
|
.IS 2
|
||||||
|
.PS 1 4 "" :
|
||||||
|
.PT
|
||||||
|
NTEXT; the program text size in bytes.
|
||||||
|
.PT
|
||||||
|
NDATA; the number of load-file descriptors (see below).
|
||||||
|
.PT
|
||||||
|
NPROC; the number of entries in the procedure descriptor table.
|
||||||
|
.PT
|
||||||
|
ENTRY; procedure number of the procedure to start with.
|
||||||
|
.PT
|
||||||
|
NLINE; the maximum source line number.
|
||||||
|
.PT
|
||||||
|
SZDATA; the address of the lowest uninitialized data byte.
|
||||||
|
.PT
|
||||||
|
unused
|
||||||
|
.PT
|
||||||
|
unused
|
||||||
|
.PE
|
||||||
|
.IE
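The two header parts can be decoded as in the following Python sketch. It assumes the whole load file is available as a bytes object; the function name parse_header and the keys of the returned dictionary are choices made for this illustration, while the field order, the magic number and the flag bits are taken from the lists above.

```python
MAGIC = 0o7255  # magic number from the header description
FLAG_NAMES = ["TEST", "PROFILE", "FLOW", "COUNT", "REALS", "EXTRA"]  # bits 0..5


def parse_header(data: bytes) -> dict:
    """Decode both header parts of a load file held in data."""
    if len(data) < 16:
        raise ValueError("load file too short for the first header part")
    # First part: eight 16-bit integers, least significant byte first.
    first = [int.from_bytes(data[2 * i:2 * i + 2], "little") for i in range(8)]
    magic, flags, unresolved, version, wordsize, pointersize = first[:6]
    if magic != MAGIC:
        raise ValueError("bad magic number")
    # Second part: eight entries of pointer size bytes each.
    end = 16 + 8 * pointersize
    if len(data) < end:
        raise ValueError("load file too short for the second header part")
    second = [
        int.from_bytes(data[16 + pointersize * i:16 + pointersize * (i + 1)], "little")
        for i in range(8)
    ]
    names = ["NTEXT", "NDATA", "NPROC", "ENTRY", "NLINE", "SZDATA"]
    header = dict(zip(names, second))  # the two unused entries are dropped
    header.update(
        flags=[name for bit, name in enumerate(FLAG_NAMES) if flags & (1 << bit)],
        unresolved=unresolved,
        version=version,
        wordsize=wordsize,
        pointersize=pointersize,
        text_offset=end,  # the program text follows the header
    )
    return header
```

Because the pointer size is itself a field of the first part, the second part can only be located after the first part has been read.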
|
||||||
|
.P
|
||||||
|
The program text consists of NTEXT bytes.
|
||||||
|
NTEXT is always a multiple of the wordsize.
|
||||||
|
The first byte of the program text is the
|
||||||
|
first byte of the instruction address
|
||||||
|
space, i.e. it has address 0.
|
||||||
|
Pointers into the program text are found in the procedure descriptor
|
||||||
|
table where relocation is simple and in the global data area.
|
||||||
|
The initialization of the global data area allows easy
|
||||||
|
relocation of pointers into both address spaces.
|
||||||
|
.P
|
||||||
|
The global data area is described by the NDATA descriptors.
|
||||||
|
Each descriptor describes a number of consecutive words (of~wordsize)
|
||||||
|
and consists of a sequence of bytes.
|
||||||
|
While reading the descriptors from the load file, one can
|
||||||
|
initialize the global data area from low to high addresses.
|
||||||
|
The size of the initialized data area is given by SZDATA;
|
||||||
|
this number can be used to check the initialization.
|
||||||
|
.N
|
||||||
|
The header of each descriptor consists of a byte, describing the type,
|
||||||
|
and a count.
|
||||||
|
The number of bytes used for this (unsigned) count depends on the
|
||||||
|
type of the descriptor and
|
||||||
|
is either a pointer-sized integer
|
||||||
|
or one byte.
|
||||||
|
The meaning of the count depends on the descriptor type.
|
||||||
|
At load time an interpreter can
|
||||||
|
perform any conversion deemed necessary, such as
|
||||||
|
reordering bytes in integers
|
||||||
|
and pointers and adding base addresses to pointers.
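Reading the header of one descriptor can be sketched in Python as follows. The function name read_descriptor_header is an assumption made for illustration. The sketch relies on two statements of this chapter: integers in the load file are stored least significant byte first, and only the type 0 descriptor carries a pointer-sized count.

```python
def read_descriptor_header(data: bytes, pos: int, pointersize: int):
    """Return (type, count, position of the first byte after the header)."""
    dtype = data[pos]
    # Type 0 has a pointer-sized count; all other types have a one-byte count.
    count_size = pointersize if dtype == 0 else 1
    start = pos + 1
    if start + count_size > len(data):
        raise ValueError("truncated descriptor header")
    count = int.from_bytes(data[start:start + count_size], "little")
    return dtype, count, start + count_size
```

The returned position is where the initializer bytes of the descriptor, if any, begin.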
|
||||||
|
.BP
|
||||||
|
.A
|
||||||
|
In the following pictures we show a graphical notation of the
|
||||||
|
initializers.
|
||||||
|
The leftmost rectangle represents the leading byte.
|
||||||
|
.N 1
|
||||||
|
.DS
|
||||||
|
.PS - 4 " "
|
||||||
|
Fields marked with
|
||||||
|
.N 1
|
||||||
|
.PT n
|
||||||
|
contain a pointer-sized integer used as a count
|
||||||
|
.PT m
|
||||||
|
contain a one-byte integer used as a count
|
||||||
|
.PT b
|
||||||
|
contain a one-byte integer
|
||||||
|
.PT w
|
||||||
|
contain a wordsized integer
|
||||||
|
.PT p
|
||||||
|
contain a data or instruction pointer
|
||||||
|
.PT s
|
||||||
|
contain a null terminated ASCII string
|
||||||
|
.PE 1
|
||||||
|
.DE 0
|
||||||
|
.VS 1 1
|
||||||
|
.DS
|
||||||
|
|
||||||
|
-------------------
|
||||||
|
| 0 | n | repeat last initialization n times
|
||||||
|
-------------------
|
||||||
|
.DE
|
||||||
|
.DS
|
||||||
|
---------
|
||||||
|
| 1 | m | m uninitialized words
|
||||||
|
---------
|
||||||
|
.DE
|
||||||
|
.DS
|
||||||
|
____________
|
||||||
|
/ bytes \e
|
||||||
|
----------------- -----
|
||||||
|
| 2 | m | b | b |...| b | m initialized bytes
|
||||||
|
----------------- -----
|
||||||
|
.DE
|
||||||
|
.DS
|
||||||
|
_________
|
||||||
|
/ word \e
|
||||||
|
-----------------------
|
||||||
|
| 3 | m | w |... m initialized wordsized integers
|
||||||
|
-----------------------
|
||||||
|
.DE
|
||||||
|
.DS
|
||||||
|
_________
|
||||||
|
/ pointer \e
|
||||||
|
-----------------------
|
||||||
|
| 4 | m | p |... m initialized data pointers
|
||||||
|
-----------------------
|
||||||
|
.DE
|
||||||
|
.DS
|
||||||
|
_________
|
||||||
|
/ pointer \e
|
||||||
|
-----------------------
|
||||||
|
| 5 | m | p |... m initialized instruction pointers
|
||||||
|
-----------------------
|
||||||
|
.DE
|
||||||
|
.DS
|
||||||
|
____________
|
||||||
|
/ bytes \e
|
||||||
|
-------------------------
|
||||||
|
| 6 | m | b | b |...| b | initialized integer of size m
|
||||||
|
-------------------------
|
||||||
|
.DE
|
||||||
|
.DS
|
||||||
|
____________
|
||||||
|
/ bytes \e
|
||||||
|
-------------------------
|
||||||
|
| 7 | m | b | b |...| b | initialized unsigned of size m
|
||||||
|
-------------------------
|
||||||
|
.DE
|
||||||
|
.DS
|
||||||
|
____________
|
||||||
|
/ string \e
|
||||||
|
-------------------------
|
||||||
|
| 8 | m | s | initialized float of size m
|
||||||
|
-------------------------
|
||||||
|
.DE 3
|
||||||
|
.PS - 8
|
||||||
|
.PT type~0:
|
||||||
|
If the last initialization initialized k bytes starting
|
||||||
|
at address \fIa\fP, do the same initialization again n times,
|
||||||
|
starting at \fIa\fP+k, \fIa\fP+2*k, ..., \fIa\fP+n*k.
|
||||||
|
This is the only descriptor whose starting byte
|
||||||
|
is followed by an integer with the
|
||||||
|
size of a
|
||||||
|
pointer;
|
||||||
|
in all other descriptors the first byte is followed by a one-byte count.
|
||||||
|
This descriptor must be preceded by a descriptor of
|
||||||
|
another type.
|
||||||
|
.PT type~1:
|
||||||
|
Reserve m words, not explicitly initialized (BSS and HOL).
|
||||||
|
.PT type~2:
|
||||||
|
The m bytes following the descriptor header are
|
||||||
|
initializers for the next m bytes of the
|
||||||
|
global data area.
|
||||||
|
m is divisible by the wordsize.
|
||||||
|
.PT type~3:
|
||||||
|
The m words following the header are initializers for the next m words of the
|
||||||
|
global data area.
|
||||||
|
.PT type~4:
|
||||||
|
The m data address space pointers following the header are
|
||||||
|
initializers for the next
|
||||||
|
m data pointers in the global data area.
|
||||||
|
Interpreters that represent EM pointers by
|
||||||
|
target machine addresses must relocate all data pointers.
|
||||||
|
.PT type~5:
|
||||||
|
The m instruction address space pointers following the header are
|
||||||
|
initializers for the next
|
||||||
|
m instruction pointers in the global data area.
|
||||||
|
Interpreters that represent EM instruction pointers by
|
||||||
|
target machine addresses must relocate these pointers.
|
||||||
|
.PT type~6:
|
||||||
|
The m bytes following the header form
|
||||||
|
a signed integer number with a size of m bytes,
|
||||||
|
which is an initializer for the next m bytes
|
||||||
|
of the global data area.
|
||||||
|
m is governed by the same restrictions as for
|
||||||
|
transfer of objects to/from memory.
|
||||||
|
.PT type~7:
|
||||||
|
The m bytes following the header form
|
||||||
|
an unsigned integer number with a size of m bytes,
|
||||||
|
which is an initializer for the next m bytes
|
||||||
|
of the global data area.
|
||||||
|
m is governed by the same restrictions as for
|
||||||
|
transfer of objects to/from memory.
|
||||||
|
.PT type~8:
|
||||||
|
The header is followed by an ASCII string, null terminated, to
|
||||||
|
initialize, in global data,
|
||||||
|
a floating point number with a size of m bytes.
|
||||||
|
m is governed by the same restrictions as for
|
||||||
|
transfer of objects to/from memory.
|
||||||
|
The ASCII string contains the notation of a real as used in the
|
||||||
|
Pascal language.
|
||||||
|
.PE
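.P
The repeat rule for type~0 can be illustrated with a small sketch.
It is not part of any EM loader; the function name and the use of a
Python bytearray for the global data area are assumptions made for
illustration only.

```python
def apply_repeat(memory: bytearray, a: int, k: int, n: int) -> None:
    """Copy the k bytes at address a another n times, to a+k, a+2*k, ..., a+n*k.

    Hypothetical helper: 'memory' stands in for the global data area.
    """
    block = bytes(memory[a:a + k])
    for i in range(1, n + 1):
        start = a + i * k
        memory[start:start + k] = block


# The "last initialization" wrote k = 3 bytes at address a = 0.
data = bytearray(12)
data[0:3] = b"\x01\x02\x03"
apply_repeat(data, a=0, k=3, n=3)
```

After the call the 3-byte pattern occupies all 12 bytes, that is,
the original initialization plus n = 3 copies.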
|
||||||
|
.P
|
||||||
|
The NPROC procedure descriptors on the load file consist of
|
||||||
|
an instruction space address (of~pointer~size) and
|
||||||
|
an integer (of~pointer~size) specifying the number of bytes for
|
||||||
|
locals.
|
||||||
16
doc/em/macr.nr
Normal file
|
|
@ -0,0 +1,16 @@
|
||||||
|
.so /usr/lib/tmac/tmac.kun
|
||||||
|
.SS 6
|
||||||
|
.RP
|
||||||
|
.PL 12i 11i
|
||||||
|
.LL 89
|
||||||
|
.MS T E
|
||||||
|
\!.TL '%'''
|
||||||
|
.ME
|
||||||
|
.MS T O
|
||||||
|
\!.TL '''%'
|
||||||
|
.ME
|
||||||
|
.MS B
|
||||||
|
.sp 1
|
||||||
|
.ME
|
||||||
|
.SM S1 B
|
||||||
|
.SM S2 B
|
||||||
245
doc/em/mapping.nr
Normal file
|
|
@ -0,0 +1,245 @@
|
||||||
|
.SN 5
|
||||||
|
.BP
|
||||||
|
.S1 "MAPPING OF EM DATA MEMORY ONTO TARGET MACHINE MEMORY"
|
||||||
|
The EM architecture is designed to be implemented
|
||||||
|
on many existing and future machines.
|
||||||
|
EM memory is highly fragmented to make
|
||||||
|
adaptation to various memory architectures possible.
|
||||||
|
Format and encoding of pointers is explicitly undefined.
|
||||||
|
.P
|
||||||
|
This chapter gives solutions to some of the
|
||||||
|
anticipated problems.
|
||||||
|
First, we describe a possible memory layout for machines
|
||||||
|
with 64K bytes of address space.
|
||||||
|
Here we use a member of the EM family with 2-byte word and pointer
|
||||||
|
size.
|
||||||
|
The most straightforward layout is shown in figure 2.
|
||||||
|
.N 1
|
||||||
|
.DS
 65534 -> |-------------------------------|
          |///////////////////////////////|
          |//// unimplemented memory /////|
          |///////////////////////////////|
    ML -> |-------------------------------|
          |                               |
          |                               | <- LB
          |     stack and local area      |
          |                               |
          |-------------------------------| <- SP
          |///////////////////////////////|
          |//////// inaccessible /////////|
          |///////////////////////////////|
          |-------------------------------| <- HP
          |                               |
          |           heap area           |
          |                               |
          |                               |
    HB -> |-------------------------------|
          |                               |
          |       global data area        |
          |                               |
    EB -> |-------------------------------|
          |                               |
          |         program text          | <- PC
          |                               |
          |        ( and tables )         |
          |                               |
          |                               |
    PB -> |-------------------------------|
          |///////////////////////////////|
          |////////// undefined //////////|
          |///////////////////////////////|
     0 -> |-------------------------------|

   Figure 2.  Memory layout showing typical register
              positions during execution of an EM program.
.DE 2
|
||||||
|
The base registers for the various memory pieces can be stored
|
||||||
|
in target machine registers or memory.
|
||||||
|
.IS
|
||||||
|
.N 1
|
||||||
|
.TS
|
||||||
|
tab(;);
|
||||||
|
l 1 l l l.
|
||||||
|
PB;:;program base;points to the base of the instruction address space.
|
||||||
|
EB;:;external base;points to the base of the data address space.
|
||||||
|
HB;:;heap base;points to the base of the heap area.
|
||||||
|
ML;:;memory limit;marks the high end of the addressable data space.
|
||||||
|
.TE 1
|
||||||
|
.IE
|
||||||
|
The stack grows from high
|
||||||
|
EM addresses to low EM addresses, and the heap the
|
||||||
|
other way.
|
||||||
|
The memory between SP and HP is not accessible,
|
||||||
|
but may be allocated later to the stack or the heap if needed.
|
||||||
|
The local data area is allocated starting at the high end of
|
||||||
|
memory.
|
||||||
|
.P
|
||||||
|
Because EM address 0 is not mapped onto target
|
||||||
|
address 0, a problem arises when pointers are used.
|
||||||
|
If a program pushed a constant, say 6, onto the stack,
|
||||||
|
and then tried to indirect through it,
|
||||||
|
the wrong word would be fetched,
|
||||||
|
because EM address 6 is mapped onto target address EB+6
|
||||||
|
and not target address 6 itself.
|
||||||
|
This particular problem is solved by explicitly declaring
|
||||||
|
the format of a pointer to be undefined,
|
||||||
|
so that using a constant as a pointer is completely illegal.
|
||||||
|
However, the general problem of mapping pointers still exists.
|
||||||
|
.P
|
||||||
|
There are two possible solutions.
|
||||||
|
In the first solution, EM pointers are represented
|
||||||
|
in the target machine as true EM addresses,
|
||||||
|
for example, a pointer to EM address 6 really is
|
||||||
|
stored as a 6 in the target machine.
|
||||||
|
This solution implies that every time a pointer is fetched
|
||||||
|
EB must be added before referencing
|
||||||
|
the target machine's memory.
|
||||||
|
If the target machine has powerful indexing
|
||||||
|
facilities, EB can be kept in a target machine register,
|
||||||
|
and the relocation can indeed be done on
|
||||||
|
every reference to the data address space
|
||||||
|
at a modest cost in speed.
|
||||||
|
.P
|
||||||
|
The other solution consists of having EM pointers
|
||||||
|
refer to the true target machine address.
|
||||||
|
Thus the instruction LAE 6 (Load Address of External 6)
|
||||||
|
would push the value of EB+6 onto the stack.
|
||||||
|
When this approach is chosen, back ends must know
|
||||||
|
how to offset from EB, to translate all
|
||||||
|
instructions that manipulate EM addresses.
|
||||||
|
However, the problem is not completely solved,
|
||||||
|
because a front end may have to initialize a pointer
|
||||||
|
in CON or ROM data to point to a global address.
|
||||||
|
This pointer must also be relocated by the back end or the interpreter.
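.P
The two pointer representations can be contrasted in a short sketch.
It only models the address arithmetic described above; the value of
EB, the array standing in for target memory and all function names
are assumptions chosen for illustration.

```python
EB = 60  # hypothetical target address onto which EM data address 0 is mapped

target_memory = bytearray(256)
target_memory[EB + 6] = 0x2A  # a byte stored at EM address 6


def load_byte_em_pointer(p: int) -> int:
    """First solution: p is a true EM address, so EB is added on every reference."""
    return target_memory[EB + p]


def lae_target_pointer(em_address: int) -> int:
    """Second solution: LAE pushes the target address EB + em_address."""
    return EB + em_address


def load_byte_target_pointer(p: int) -> int:
    """Second solution: p already is a target address, no relocation on use."""
    return target_memory[p]


def relocate_initialized_pointer(em_address: int) -> int:
    """Second solution: a pointer in CON or ROM data is relocated once, at load time."""
    return EB + em_address
```

Both representations reach the same byte; they differ only in
whether EB is added at each reference or once when the pointer is created.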
|
||||||
|
.P
|
||||||
|
Although the EM stack grows from high to low EM addresses,
|
||||||
|
some machines have hardware PUSH and POP
|
||||||
|
instructions that require the stack to grow upwards.
|
||||||
|
If reasons of efficiency urge you to use these
|
||||||
|
instructions, then EM
|
||||||
|
can be implemented with the memory layout
|
||||||
|
upside down, as shown in figure 3.
|
||||||
|
This is possible because the pointer format is explicitly undefined.
|
||||||
|
The first element of a word array will have a
|
||||||
|
lower physical address than the second element.
|
||||||
|
.N 2
|
||||||
|
.DS
    |                 |                   |                 |
    |      EB=60      |                   |        ^        |
    |                 |                   |        |        |
    |-----------------|                   |-----------------|
105 |   45   |   44   | 104           214 |   41   |   40   | 215
    |-----------------|                   |-----------------|
103 |   43   |   42   | 102           212 |   43   |   42   | 213
    |-----------------|                   |-----------------|
101 |   41   |   40   | 100           210 |   45   |   44   | 211
    |-----------------|                   |-----------------|
    |        |        |                   |                 |
    |        v        |                   |     EB=255      |
    |                 |                   |                 |

          Type A                                Type B
.sp 2
   Figure 3.  Two possible memory implementations.
              Numbers within the boxes are EM addresses.
              The other numbers are physical addresses.
.DE 2
|
||||||
|
.A 0 0
|
||||||
|
So, we have two different EM memory implementations:
|
||||||
|
.IS
|
||||||
|
.PS - 4
|
||||||
|
.PT A~-
|
||||||
|
stack downwards
|
||||||
|
.PT B~-
|
||||||
|
stack upwards
|
||||||
|
.PE
|
||||||
|
.IE
|
||||||
|
.P
|
||||||
|
For each of these two possibilities we give the translation of
|
||||||
|
the EM instructions to push the byte at offset 3 of a global data
|
||||||
|
block starting at EM address 40 onto the stack and to load the
|
||||||
|
word at address 40.
|
||||||
|
All translations assume a word and pointer size of two bytes.
|
||||||
|
The target machine used is a PDP-11 augmented with push and pop instructions.
|
||||||
|
Registers 'r0' and 'r1' are used and suffer from sign extension for byte
|
||||||
|
transfers.
|
||||||
|
Push $40 means push the constant 40, not word 40.
|
||||||
|
.P
|
||||||
|
The translation of the EM instructions depends on the pointer representation
|
||||||
|
used.
|
||||||
|
For each of the two solutions explained above the translation is given.
|
||||||
|
.P
|
||||||
|
First, the translation for the two implementations using EM addresses as
|
||||||
|
pointer representation:
|
||||||
|
.DS
.TS
tab(:), center;
l s l s l s
_ s _ s _ s
l 2 l 6 l 2 l 6 l 2 l.
EM:type A:type B


LAE:40:push:$40:push:$40

ADP:3:pop:r0:pop:r0
::add:$3,r0:add:$3,r0
::push:r0:push:r0

LOI:1:pop:r0:pop:r0
::-::neg:r0
::clr:r1:clr:r1
::bisb:eb(r0),r1:bisb:eb(r0),r1
::push:r1:push:r1

LOE:40:push:eb+40:push:eb-41
.TE
.DE
|
||||||
|
.BP
|
||||||
|
.P
|
||||||
|
The translation for the two implementations, if the target machine address is
|
||||||
|
used as pointer representation, is:
|
||||||
|
.N 1
|
||||||
|
.DS
.TS
tab(:), center;
l s l s l s
_ s _ s _ s
l 2 l 6 l 2 l 6 l 2 l.
EM:type A:type B


LAE:40:push:$eb+40:push:$eb-40

ADP:3:pop:r0:pop:r0
::add:$3,r0:sub:$3,r0
::push:r0:push:r0

LOI:1:pop:r0:pop:r0
::clr:r1:clr:r1
::bisb:(r0),r1:bisb:(r0),r1
::push:r1:push:r1

LOE:40:push:eb+40:push:eb-41
.TE
.DE
|
||||||
|
.P
|
||||||
|
The translation presented above is not intended to be optimal.
|
||||||
|
Most machines can handle these simple cases in one or two instructions.
|
||||||
|
It demonstrates, however, the flexibility of the EM design.
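.P
The address arithmetic behind both implementations can be checked
against the numbers in figure 3.
The sketch below is an illustration only; the function name and the
layout labels "A" and "B" passed as strings are assumptions.

```python
def lowest_physical(layout: str, eb: int, em_address: int, size: int = 1) -> int:
    """Lowest physical address of an object of 'size' bytes at EM address em_address.

    Type A maps EM address x to eb + x (stack grows downwards).
    Type B maps EM address x to eb - x (stack grows upwards), so the
    highest EM byte of the object has the lowest physical address.
    """
    if layout == "A":
        return eb + em_address
    if layout == "B":
        return eb - (em_address + size - 1)
    raise ValueError("layout must be 'A' or 'B'")


byte_a = lowest_physical("A", 60, 43)           # byte reached by LAE 40 ; ADP 3
byte_b = lowest_physical("B", 255, 43)
word_a = lowest_physical("A", 60, 40, size=2)   # word loaded by LOE 40
word_b = lowest_physical("B", 255, 40, size=2)
```

With EB=60 for type A and EB=255 for type B this gives 103 and 212
for the byte, and 100 and 214 for the word, matching eb+40 and eb-41
in the tables.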
|
||||||
|
.P
|
||||||
|
There are several ways to implement EM on machines with
|
||||||
|
address spaces larger than 64k bytes.
|
||||||
|
For EM with two byte pointers one could allocate instruction and
|
||||||
|
data space each in a separate 64k piece of memory.
|
||||||
|
EM pointers still have to fit in two bytes,
|
||||||
|
but the base registers PB and EB may be loaded in hardware registers
|
||||||
|
wider than 16 bits, if available.
|
||||||
|
EM implementations can also make efficient use of a machine
|
||||||
|
with separate instruction and data space.
|
||||||
|
.P
|
||||||
|
EM with 32 bit pointers allows one to make use of machines
|
||||||
|
with large address spaces.
|
||||||
|
In a virtual, segmented memory system one could use a separate
|
||||||
|
segment for each fragment.
|
||||||
80
doc/em/mem.nr
Normal file
|
|
@ -0,0 +1,80 @@
|
||||||
|
.BP
|
||||||
|
.SN 2
|
||||||
|
.S1 MEMORY
|
||||||
|
The EM machine has two distinct address spaces,
|
||||||
|
one for instructions and one for data.
|
||||||
|
The data space is divided up into 8-bit bytes.
|
||||||
|
The smallest addressable unit is a byte.
|
||||||
|
Bytes are numbered consecutively from 0 to some maximum.
|
||||||
|
All sizes in EM are expressed in bytes.
|
||||||
|
.P
|
||||||
|
Some EM instructions can transfer objects containing several bytes
|
||||||
|
to and/or from memory.
|
||||||
|
The size of all objects larger than a word must be a multiple of
|
||||||
|
the wordsize.
|
||||||
|
The size of all objects smaller than a word must be a divisor
|
||||||
|
of the wordsize.
|
||||||
|
For example, if the wordsize is 2 bytes, objects of sizes 1,
|
||||||
|
2, 4, 6, ... are allowed.
|
||||||
|
The address of such an object is the lowest address of all bytes it contains.
|
||||||
|
For objects smaller than the wordsize, the
|
||||||
|
address must be a multiple of the object size.
|
||||||
|
For all other objects the address must be a multiple of the
|
||||||
|
wordsize.
|
||||||
|
For example, if an instruction transfers a 4-byte object to memory at
|
||||||
|
location \fIm\fP and the wordsize is 2,
|
||||||
|
\fIm\fP must be a multiple of 2 and the bytes at
|
||||||
|
locations \fIm\fP, \fIm\fP\|+\|1, \fIm\fP\|+\|2 and
|
||||||
|
\fIm\fP\|+\|3 are overwritten.
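.P
The size and alignment rules above can be summarized in a small
predicate.
This is a sketch of the rules as stated here, not code taken from
any EM implementation; the function name is an assumption.

```python
def is_legal_object(size: int, address: int, wordsize: int) -> bool:
    """Check the EM size and alignment rules for an object in data memory."""
    if size <= 0 or address < 0:
        return False
    if size < wordsize:
        # Small objects: size divides the wordsize, address is a multiple of the size.
        return wordsize % size == 0 and address % size == 0
    # All other objects: size and address are multiples of the wordsize.
    return size % wordsize == 0 and address % wordsize == 0


legal_sizes = [s for s in range(1, 8) if is_legal_object(s, 0, wordsize=2)]
```

For a wordsize of 2, legal_sizes contains 1, 2, 4 and 6, and a
4-byte object is accepted at address 2 but rejected at address 3.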
|
||||||
|
.P
|
||||||
|
The size of almost all objects in EM
|
||||||
|
is an integral number of words.
|
||||||
|
Only two operations are allowed on
|
||||||
|
objects whose size is a divisor of the wordsize:
|
||||||
|
push it onto the stack and pop it from the stack.
|
||||||
|
The addressing of these objects in memory is always indirect.
|
||||||
|
If such a small object is pushed onto the stack
|
||||||
|
it is assumed to be a small integer and stored
|
||||||
|
in the least significant part of a word.
|
||||||
|
The rest of the word is cleared to zero,
|
||||||
|
although
|
||||||
|
EM provides a way to sign-extend a small integer.
|
||||||
|
Popping a small object from the stack removes a word
|
||||||
|
from the stack, stores the least significant byte(s)
|
||||||
|
of this word in memory and discards the rest of the word.
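.P
As an illustration of the push and pop rules for small objects,
the sketch below handles one-byte objects on a stack of words.
The list used as stack, the bytearray used as memory and the
function names are assumptions; a one-byte object is chosen so that
byte order plays no role.

```python
def push_small(stack: list, memory: bytearray, address: int) -> None:
    """Push a one-byte object: it lands in the least significant part of a
    word and the rest of the word is zero (no sign extension)."""
    stack.append(memory[address])


def pop_small(stack: list, memory: bytearray, address: int) -> None:
    """Pop a word, store its least significant byte, discard the rest."""
    word = stack.pop()
    memory[address] = word & 0xFF


memory = bytearray([0xFE, 0x00])
stack = []
push_small(stack, memory, 0)   # the byte 0xFE is pushed as the word 254, not -2
pushed = stack[-1]
stack.append(0x1234)
pop_small(stack, memory, 1)    # only the low byte 0x34 reaches memory
```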
|
||||||
|
.P
|
||||||
|
The format of pointers into both address spaces is explicitly undefined.
|
||||||
|
The size of a pointer, however, is fixed for a member of EM, so that
|
||||||
|
the compiler writer knows how much storage to allocate for a pointer.
|
||||||
|
.P
|
||||||
|
A minor problem is raised by the undefined pointer format.
|
||||||
|
Some languages, notably Pascal, require a special,
|
||||||
|
otherwise illegal, pointer value to represent the nil pointer.
|
||||||
|
The current Pascal-VU compiler uses the
|
||||||
|
integer value 0 as nil pointer.
|
||||||
|
This value is also used by many C programs as a normally impossible address.
|
||||||
|
A better solution would be to have a special
|
||||||
|
instruction loading an illegal pointer value,
|
||||||
|
but it is hard to imagine an implementation
|
||||||
|
for which the current solution is inadequate,
|
||||||
|
especially because the first word in the EM data space
|
||||||
|
is special and probably not the target of any pointer.
|
||||||
|
.P
|
||||||
|
The next two chapters describe the EM memory
|
||||||
|
in more detail.
|
||||||
|
One describes the instruction address space,
|
||||||
|
the other the data address space.
|
||||||
|
.P
|
||||||
|
A design goal of EM has been to allow
|
||||||
|
its implementation on a wide range of existing machines,
|
||||||
|
as well as allowing a new one to be built in hardware.
|
||||||
|
To this end we have tried to minimize the demands
|
||||||
|
of EM on the memory structure of the target machine.
|
||||||
|
Therefore, apart from the logical partitioning,
|
||||||
|
EM memory is divided into 'fragments'.
|
||||||
|
A fragment consists of consecutive machine
|
||||||
|
words and has a base address and a size.
|
||||||
|
Pointer arithmetic is only defined within a fragment.
|
||||||
|
The only exception to this rule is comparison with the null
|
||||||
|
pointer.
|
||||||
|
All fragments must be word aligned.
|
||||||
5
doc/em/print
Executable file
|
|
@ -0,0 +1,5 @@
|
||||||
|
|
||||||
|
case $# in
|
||||||
|
1) make "$1".t ; ntlp "$1".t^lpr ;;
|
||||||
|
*) echo $0 needs an argument ;;
|
||||||
|
esac
|
||||||
4
doc/em/show
Executable file
|
|
@ -0,0 +1,4 @@
|
||||||
|
case $# in
|
||||||
|
1) make $1.t ; ntout $1.t ;;
|
||||||
|
*) echo $0 needs an argument ;;
|
||||||
|
esac
|
||||||
38
doc/em/title.nr
Normal file
|
|
@ -0,0 +1,38 @@
|
||||||
|
.po 0
|
||||||
|
.TP 1
|
||||||
|
.ll 79
|
||||||
|
.sp 15
|
||||||
|
.ce 4
|
||||||
|
DESCRIPTION OF A MACHINE
|
||||||
|
ARCHITECTURE FOR USE WITH
|
||||||
|
BLOCK STRUCTURED LANGUAGES
|
||||||
|
.sp 6
|
||||||
|
.ce 4
|
||||||
|
Andrew S. Tanenbaum
|
||||||
|
Hans van Staveren
|
||||||
|
Ed G. Keizer
|
||||||
|
Johan W. Stevenson\v'-0.5m'*\v'0.5m'
|
||||||
|
.sp 2
|
||||||
|
.ce
|
||||||
|
August 1983
|
||||||
|
.sp 2
|
||||||
|
.ce
|
||||||
|
Informatica Rapport IR-81
|
||||||
|
.sp 13
|
||||||
|
Abstract
|
||||||
|
.sp 2
|
||||||
|
.ti +5
|
||||||
|
EM is a family of intermediate languages
|
||||||
|
designed for producing portable compilers.
|
||||||
|
A program called
|
||||||
|
.B front end
|
||||||
|
translates source programs to EM.
|
||||||
|
Another program,
|
||||||
|
.B back
|
||||||
|
.BW end ,
|
||||||
|
translates EM to the assembly language of the target machine.
|
||||||
|
Alternatively, the EM program can be assembled to a highly
|
||||||
|
efficient binary format for interpretation.
|
||||||
|
This document describes the EM languages in detail.
|
||||||
|
.sp 4
|
||||||
|
\v'-0.5m'*\v'0.5m' Present affiliation: NV Philips, Eindhoven
|
||||||
130
doc/em/types.nr
Normal file
|
|
@ -0,0 +1,130 @@
|
||||||
|
.SN 6
|
||||||
|
.BP
|
||||||
|
.S1 "TYPE REPRESENTATIONS"
|
||||||
|
The representations used for typed objects are not precisely
|
||||||
|
specified by EM.
|
||||||
|
Sometimes we only specify that a typed object occupies a
|
||||||
|
certain amount of space and state no further restrictions.
|
||||||
|
If one wants to have a different representation of the value of
|
||||||
|
an object on the stack one has to use a convert instruction
|
||||||
|
in most cases.
|
||||||
|
We do specify some relations between the representations of
|
||||||
|
types.
|
||||||
|
This allows some intermixed use of operators for different types
|
||||||
|
on the same object(s).
|
||||||
|
For example, the instruction ZER pushes signed and
|
||||||
|
unsigned integers with the value zero and empty sets.
|
||||||
|
ZER has the size of the object as its only argument.
|
||||||
|
.A
|
||||||
|
The representation of floating point numbers is a good example:
|
||||||
|
it allows widely varying implementations.
|
||||||
|
The only ways to create floating point numbers are via
|
||||||
|
initialization and via conversions from integer numbers.
|
||||||
|
Only by using conversions to integers and comparing
|
||||||
|
two floating point numbers with each other, can these numbers
|
||||||
|
be converted to human readable output.
|
||||||
|
Implementations may use base 10, base 2 or any other
|
||||||
|
base for exponents, and have freedom in choosing the range of
|
||||||
|
exponent and mantissa.
|
||||||
|
.A
|
||||||
|
Other types are more precisely described.
|
||||||
|
In the following paragraphs a description will be given of the
|
||||||
|
restrictions imposed on the representation of the types used.
|
||||||
|
A number \fBn\fP used in these paragraphs indicates the size of
|
||||||
|
the object in \fIbits\fP.
|
||||||
|
.S2 "Unsigned integers"
|
||||||
|
The range of unsigned integers is 0..2\v'-0.5m'\fBn\fP\v'0.5m'-1.
|
||||||
|
A binary representation is assumed.
|
||||||
|
The order of the bits within an object is knowingly left
|
||||||
|
unspecified.
|
||||||
|
Discussing bit order within each 8-bit byte is academic,
|
||||||
|
so the only real freedom of this specification lies in the byte
|
||||||
|
order.
|
||||||
|
We really do not care whether an implementation of a 4-byte
|
||||||
|
integer has its bytes in a particular order of significance.
|
||||||
|
This of course means that some sequences of instructions have
|
||||||
|
unpredictable effects.
|
||||||
|
For example:
|
||||||
|
.DS
|
||||||
|
LOC 258 ; STL 0 ; LAL 0 ; LOI 1 ( wordsize >=2 )
|
||||||
|
.DE
|
||||||
|
The value on the stack after executing this sequence
|
||||||
|
can be anything,
|
||||||
|
but will most likely be 1 or 2.
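.A
Why the result is most likely 1 or 2 can be seen by modelling the
sequence for both byte orders.
The sketch assumes a wordsize of 2 and one of the two usual byte
orders; the function name is an assumption, and EM itself allows
other outcomes.

```python
def loc_stl_lal_loi1(value: int, wordsize: int, byteorder: str) -> int:
    """Model LOC value ; STL 0 ; LAL 0 ; LOI 1 for a given byte order."""
    local = value.to_bytes(wordsize, byteorder)  # STL 0 stores the word in local 0
    return local[0]  # LAL 0 ; LOI 1 loads the byte at the lowest address


low_first = loc_stl_lal_loi1(258, 2, "little")
high_first = loc_stl_lal_loi1(258, 2, "big")
```

Since 258 is 1*256 + 2, low_first is 2 and high_first is 1.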
|
||||||
|
.A
|
||||||
|
Conversions between unsigned integers of different sizes have to
|
||||||
|
be done with explicit convert instructions.
|
||||||
|
One cannot simply pad an unsigned integer with zeros at either end
|
||||||
|
and expect a correct result.
|
||||||
|
.A
|
||||||
|
We assume the existence of at least single word unsigned arithmetic
|
||||||
|
in any implementation.
|
||||||
|
.S2 "Signed Integers"
|
||||||
|
The range of signed integers is -2\v'-0.5m'\fBn\fP-1\v'0.5m'~..~2\v'-0.5m'\fBn\fP-1\v'0.5m'-1,
|
||||||
|
in other words the range of signed integers of \fBn\fP bits
|
||||||
|
using two's complement arithmetic.
|
||||||
|
The representation is the same as for unsigned integers except
|
||||||
|
the range 2\v'-0.5m'\fBn\fP-1\v'0.5m'~..~2\v'-0.5m'\fBn\fP\v'0.5m'-1 is mapped on the
|
||||||
|
range -2\v'-0.5m'\fBn\fP-1\v'0.5m'~..~-1.
|
||||||
|
In other words, the most significant bit is used as sign bit.
|
||||||
|
The convert instructions between signed and unsigned integers
|
||||||
|
of the same size can be used to catch errors.
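.A
The mapping between the unsigned and the signed interpretation of
the same \fBn\fP bits can be written out as follows.
This is a sketch of the two's complement rule given above; the
function names are assumptions.

```python
def as_signed(u: int, n: int) -> int:
    """Reinterpret an n-bit unsigned value as a two's complement signed value."""
    if not 0 <= u < (1 << n):
        raise ValueError("value does not fit in n bits")
    # The upper half of the unsigned range maps onto the negative numbers.
    return u - (1 << n) if u >= (1 << (n - 1)) else u


def signed_range(n: int) -> tuple:
    """Smallest and largest n-bit signed value."""
    return -(1 << (n - 1)), (1 << (n - 1)) - 1


lowest, highest = signed_range(16)
```

For 16 bits the range is -32768 to 32767, and the unsigned value
33090 corresponds to the signed value -32446.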
|
||||||
|
.A
|
||||||
|
The value -2\v'-0.5m'\fBn\fP-1\v'0.5m' is used for undefined
|
||||||
|
signed integers.
|
||||||
|
EM implementations should trap when this value is used in an
|
||||||
|
operation on signed integers.
|
||||||
|
The ignore mask, accessed with SIM and LIM -~see chapter 9~- ,
|
||||||
|
can be used to disable such traps.
|
||||||
|
.A
|
||||||
|
We assume the existence of at least single word signed arithmetic
|
||||||
|
in any implementation.
|
||||||
|
.BP
|
||||||
|
.S2 "Floating point values"
|
||||||
|
Floating point values must have a signed mantissa and a signed
|
||||||
|
exponent.
|
||||||
|
Although no base is specified, base 2 is the normal choice,
|
||||||
|
because the FEF instruction pushes the exponent in base 2.
|
||||||
|
.A
|
||||||
|
The implementation of floating point arithmetic is optional.
|
||||||
|
The compilers currently in use have runtime parameters for the
|
||||||
|
size of the floating point values they should use.
|
||||||
|
Common choices are 4 and/or 8 bytes.
|
||||||
|
.S2 Pointers
|
||||||
|
EM has two kinds of pointers: for instruction and for data
|
||||||
|
space.
|
||||||
|
Each kind can only be used for its own space; conversion between
|
||||||
|
these two subtypes is impossible.
|
||||||
|
We assume that pointers have a range from 0 upwards.
|
||||||
|
Any implementation may have holes in the pointer range between
|
||||||
|
fragments.
|
||||||
|
One can of course not expect to be able to address two megabytes
|
||||||
|
of memory using a 2-byte pointer.
|
||||||
|
Normally, a 2-byte pointer allows up to 65536 bytes of
|
||||||
|
addressable memory.
|
||||||
|
.A
|
||||||
|
Pointer representation has one restriction.
|
||||||
|
The pointer with the same representation as the integer zero of
|
||||||
|
the same size should be invalid.
|
||||||
|
Some languages and/or runtime systems represent the nil
|
||||||
|
pointer as zero.
|
||||||
|
.S2 "Bit sets"
|
||||||
|
All bit sets of size \fBn\fP are subsets of the set
|
||||||
|
{~i~|~i>=0,~i<\fBn\fP~}.
|
||||||
|
A bit set contains a bit for each element showing its
|
||||||
|
presence or absence.
|
||||||
|
Bit sets are subdivided into words.
|
||||||
|
The word with the lowest EM address governs the subset
|
||||||
|
{~i~|~i>=0,~i<\fBm\fP~}, where \fBm\fP is the number of bits in
|
||||||
|
a word.
|
||||||
|
The next higher words each govern the next higher \fBm\fP set elements.
|
||||||
|
The relation between a set with size of
|
||||||
|
a word and an unsigned integer word is that
|
||||||
|
the value of the unsigned integer is the summation of the
|
||||||
|
2\v'-0.5m'i\v'0.5m' where i is in the set.
|
||||||
|
.A
|
||||||
|
Example: a 2-word bit set (wordsize 2) containing the
|
||||||
|
elements 1, 6, 8, 15, 18, 21, 27 and 28 is composed of two
|
||||||
|
integers, e.g. at addresses 40 and 42.
|
||||||
|
The word at 40 contains the value 33090 (or~-32446),
|
||||||
|
the word at 42 contains the value 6180.
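.A
The values in this example can be recomputed with a short sketch.
The function name and the representation of the set as a Python
list of words, lowest EM address first, are assumptions made for
illustration.

```python
def bitset_words(elements, word_bits: int, nwords: int) -> list:
    """Build the words of a bit set; word 0 is the word at the lowest EM address.

    Element i sets bit (i mod word_bits) of word (i div word_bits), so each
    word is the sum of 2**j over the elements it governs.
    """
    words = [0] * nwords
    for i in elements:
        if not 0 <= i < word_bits * nwords:
            raise ValueError("element outside the set")
        words[i // word_bits] |= 1 << (i % word_bits)
    return words


words = bitset_words([1, 6, 8, 15, 18, 21, 27, 28], word_bits=16, nwords=2)
```

The first word is 2 + 64 + 256 + 32768 = 33090 and the second is
4 + 32 + 2048 + 4096 = 6180, as stated above.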
|
||||||