| .BP
 | |
| .SN 10
 | |
| .S1 "EM MACHINE LANGUAGE"
 | |
| The EM machine language is designed to make program text compact
 | |
| and to make decoding easy.
 | |
| Compact program text has many advantages: programs execute faster,
 | |
| programs occupy less primary and secondary storage, and loading
 | |
| programs into satellite processors is faster.
 | |
| The decoding of EM machine language is so simple
 | |
| that it is feasible to use interpreters as long as EM hardware
 | |
| machines are not available.
 | |
| This chapter is irrelevant when back ends are used to
 | |
| produce executable target machine code.
 | |
| .S2 "Instruction encoding"
 | |
| A design goal of EM is to make the
 | |
| program text as compact as possible.
 | |
| Decoding must be easy, however.
 | |
| The encoding is fully byte oriented, without any small bit fields.
 | |
| There are 256 primary opcodes, two of which are an escape to
 | |
| two groups of 256 secondary opcodes each.
 | |
| .A
 | |
| EM instructions without arguments have a single opcode assigned,
 | |
| possibly escaped:
 | |
| .Dr 6
 | |
| 
 | |
|          |--------------|
 | |
|          |    opcode    |
 | |
|          |--------------|
 | |
| 
 | |
| .De
 | |
| 		or
 | |
| .Dr 6
 | |
| 
 | |
|          |--------------|--------------|
 | |
|          |    escape    |     opcode   |
 | |
|          |--------------|--------------|
 | |
| 
 | |
| .De
 | |
| The encoding for instructions with an argument is more complex.
 | |
| Several instructions have an address from the global data area
 | |
| as argument.
 | |
| Other instructions have different opcodes for positive
 | |
| and negative arguments.
 | |
| .N 1
 | |
| There is always an opcode that takes the next two bytes as argument,
 | |
| high byte first:
 | |
| .Dr 6
 | |
| 
 | |
|          |--------------|--------------|--------------|
 | |
|          |    opcode    |    hibyte    |    lobyte    |
 | |
|          |--------------|--------------|--------------|
 | |
| 
 | |
| .De
 | |
| 		or
 | |
| .Dr 6
 | |
| 
 | |
|          |--------------|--------------|--------------|--------------|
 | |
|          |    escape    |    opcode    |    hibyte    |    lobyte    |
 | |
|          |--------------|--------------|--------------|--------------|
 | |
| 
 | |
| .De
 | |
| An extra escape is provided for instructions with four or eight byte arguments.
 | |
| .Dr 6
 | |
| 
 | |
|   |--------------|--------------|--------------|   |--------------|
 | |
|   |    ESCAPE    |    opcode    |    hibyte    |...|    lobyte    |
 | |
|   |--------------|--------------|--------------|   |--------------|
 | |
| 
 | |
| .De
 | |
| For most instructions some argument values predominate.
 | |
| The most frequent combinations of instruction and argument
 | |
| will be encoded in a single byte, called a mini:
 | |
| .Dr 6
 | |
| 
 | |
|          |---------------|
 | |
|          |opcode+argument|  (mini)
 | |
|          |---------------|
 | |
| 
 | |
| .De
 | |
| The number of minis is restricted, because only
 | |
| 254 primary opcodes are available.
 | |
| Many instructions have the bulk of their arguments
 | |
| fall in the range 0 to 255.
 | |
| Instructions that address global data have their arguments
 | |
| distributed over a wider range,
 | |
| but small values of the high byte are common.
 | |
| For all these cases there is another encoding
 | |
| that combines the instruction and the high byte of the argument
 | |
| into a single opcode.
 | |
| These opcodes are called shorties.
 | |
| Shorties may be escaped.
 | |
| .Dr 6
 | |
| 
 | |
|          |--------------|--------------|
 | |
|          | opcode+high  |    lobyte    |  (shortie)
 | |
|          |--------------|--------------|
 | |
| 
 | |
| .De
 | |
| 		or
 | |
| .Dr 6
 | |
| 
 | |
|          |--------------|--------------|--------------|
 | |
|          |    escape    | opcode+high  |    lobyte    |
 | |
|          |--------------|--------------|--------------|
 | |
| 
 | |
| .De
 | |
| Escaped shorties are useless if the normal encoding has a primary opcode.
 | |
| Note that for some instruction-argument combinations
 | |
| several different encodings are available.
 | |
| It is the task of the assembler to select the shortest of these.
 | |
| The savings from these mini and shortie
 | |
| opcodes are considerable, about 55%.
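The three argument forms shown above (two-byte argument, shortie and mini) can be decoded with very little work, as the following sketch suggests. It is an illustration only: the function names are hypothetical, the two-byte argument is treated as unsigned, and it assumes that the minis and shorties of one instruction occupy a contiguous opcode range starting at a known base opcode. The real opcode assignment is table driven and is not reproduced here.

```python
def decode_two_byte_argument(hibyte, lobyte):
    """Combine the two argument bytes, high byte first (unsigned here)."""
    return (hibyte << 8) | lobyte


def decode_shortie(opcode, base_opcode, lobyte):
    """Shortie: the opcode carries the high byte, one byte follows.

    Assumes (hypothetically) that opcode == base_opcode + high byte.
    """
    high = opcode - base_opcode
    return (high << 8) | lobyte


def decode_mini(opcode, base_opcode):
    """Mini: the whole argument is folded into the opcode byte.

    Assumes (hypothetically) that opcode == base_opcode + argument.
    """
    return opcode - base_opcode
```

Under these assumptions a mini occupies one byte, a shortie two and the general form three, which is why the assembler prefers them in that order when several encodings apply.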
 | |
| .P
 | |
| Further improvements are possible:
 | |
| the arguments of
 | |
| many instructions are a multiple of the wordsize.
 | |
| Some also do not allow zero as an argument.
 | |
| If these arguments are divided by the wordsize and,
 | |
| when zero is not allowed, then decremented by 1, more of them can
 | |
| be encoded as shortie or mini.
 | |
| The arguments of some other instructions
 | |
| rarely or never assume the value 0, but start at 1.
 | |
| The value 1 is then encoded as 0,
 | |
| 2 as 1 and so on.
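The argument transformation described above can be sketched as a pair of inverse functions. This is a minimal illustration, not the assembler's actual code; the names `compact_argument` and `expand_argument` and their parameters are hypothetical.

```python
def compact_argument(value, wordsize=1, zero_allowed=True):
    """Scale an argument before it is encoded (hypothetical helper).

    wordsize > 1: the argument is known to be a multiple of the wordsize
    and is divided by it.  zero_allowed=False: the scaled argument is at
    least 1, so it is stored decremented by 1.
    """
    if value % wordsize != 0:
        raise ValueError("argument is not a multiple of the wordsize")
    scaled = value // wordsize
    if not zero_allowed:
        if scaled == 0:
            raise ValueError("zero is not a legal argument here")
        scaled -= 1
    return scaled


def expand_argument(encoded, wordsize=1, zero_allowed=True):
    """Inverse of compact_argument, as a decoder would apply it."""
    if not zero_allowed:
        encoded += 1
    return encoded * wordsize
```

With a wordsize of 2 and zero not allowed, for example, the argument 512 is stored as 255 and so still fits in a single byte.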
 | |
| .P
 | |
| Assigning opcodes to instructions by the assembler is completely
 | |
| table driven.
 | |
| For details see appendix B.
 | |
| .S2 "Procedure descriptors"
 | |
| The procedure identifiers used in the interpreter are indices
 | |
| into a table of procedure descriptors.
 | |
| Each descriptor contains:
 | |
| .IS 6
 | |
| .PS - 4
 | |
| .PT 1.
 | |
| the number of bytes to be reserved for locals at each
 | |
| invocation.
 | |
| .N
 | |
| This is a pointer-sized integer.
 | |
| .PT 2.
 | |
| the start address of the procedure
 | |
| .PE
 | |
| .IE
 | |
| .S2 "Load format"
 | |
| The EM machine language load format defines the interface between
 | |
| the EM assembler/loader and the EM machine itself.
 | |
| A load file consists of a header, the program text to be executed,
 | |
| a description of the global data area and the procedure descriptor table,
 | |
| in this order.
 | |
| All integers in the load file are presented with the
 | |
| least significant byte first.
 | |
| .P
 | |
| The header has two parts: the first half (eight 16-bit integers)
 | |
| aids in selecting
 | |
| the correct EM machine or interpreter.
 | |
| Some EM machines, for instance, may have hardware floating point
 | |
| instructions.
 | |
| .N
 | |
| The header entries are as follows (bit 0 is rightmost):
 | |
| .IS 2
 | |
| .VS 1 0
 | |
| .PS 1 4 "" :
 | |
| .PT
 | |
| magic number (07255)
 | |
| .PT
 | |
| flag bits with the following meaning:
 | |
| .PS - 7 "" :
 | |
| .PT bit 0
 | |
| TEST; test for integer overflow etc.
 | |
| .PT bit 1
 | |
| PROFILE; for each source line: count the number of memory
 | |
| cycles executed.
 | |
| .PT bit 2
 | |
| FLOW; for each source line: set a bit in a bit map table if
 | |
| instructions on that line are executed.
 | |
| .PT bit 3
 | |
| COUNT; for each source line: increment a counter if that line
 | |
| is entered.
 | |
| .PT bit 4
 | |
| REALS; set if a program uses floating point instructions.
 | |
| .PT bit 5
 | |
| EXTRA; more tests during compiler debugging.
 | |
| .PE
 | |
| .PT
 | |
| number of unresolved references.
 | |
| .PT
 | |
| version number; used to detect obsolete EM load files.
 | |
| .PT
 | |
| wordsize ; the number of bytes in each machine word.
 | |
| .PT
 | |
| pointer size ; the number of bytes available for addressing.
 | |
| .PT
 | |
| unused
 | |
| .PT
 | |
| unused
 | |
| .PE
 | |
| .IE
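A loader could read the first half of the header along the following lines. This is a sketch under stated assumptions: the function name and the dictionary keys are hypothetical, and the eight entries are taken to appear in the order listed above, each stored least significant byte first.

```python
import struct

MAGIC = 0o7255  # magic number from the header description (octal)

# Flag names in bit order, bit 0 first.
FLAG_NAMES = ("TEST", "PROFILE", "FLOW", "COUNT", "REALS", "EXTRA")


def parse_header_first_half(data):
    """Decode the eight 16-bit entries at the start of a load file."""
    if len(data) < 16:
        raise ValueError("load file too short for the header")
    fields = struct.unpack_from("<8H", data, 0)  # little-endian, unsigned
    magic, flags, unresolved, version, wordsize, pointer_size = fields[:6]
    if magic != MAGIC:
        raise ValueError("not an EM load file (bad magic number)")
    return {
        "flags": {name for bit, name in enumerate(FLAG_NAMES)
                  if flags & (1 << bit)},
        "unresolved": unresolved,
        "version": version,
        "wordsize": wordsize,
        "pointer_size": pointer_size,
    }
```

The wordsize and pointer size obtained here are needed to interpret everything that follows in the file, so this half has to be decoded first.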
 | |
| The second part of the header (eight entries, of pointer size bytes each)
 | |
| describes the load file itself:
 | |
| .IS 2
 | |
| .PS 1 4 "" :
 | |
| .PT
 | |
| NTEXT; the program text size in bytes.
 | |
| .PT
 | |
| NDATA; the number of load-file descriptors (see below).
 | |
| .PT
 | |
| NPROC; the number of entries in the procedure descriptor table.
 | |
| .PT
 | |
| ENTRY; procedure number of the procedure to start with.
 | |
| .PT
 | |
| NLINE; the maximum source line number.
 | |
| .PT
 | |
| SZDATA; the address of the lowest uninitialized data byte.
 | |
| .PT
 | |
| unused
 | |
| .PT
 | |
| unused
 | |
| .PE
 | |
| .IE
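The second half of the header can be read once the pointer size is known. The sketch below assumes that this half starts directly after the eight 16-bit entries, at byte offset 16, and that the entries appear in the order listed above; the function name and its `offset` parameter are hypothetical.

```python
SECOND_HALF_NAMES = ("NTEXT", "NDATA", "NPROC", "ENTRY", "NLINE", "SZDATA")


def parse_header_second_half(data, pointer_size, offset=16):
    """Decode the pointer-sized header entries, least significant byte first.

    The two trailing unused entries are checked for presence but not returned.
    """
    if len(data) < offset + 8 * pointer_size:
        raise ValueError("load file too short for the header")
    values = {}
    for index, name in enumerate(SECOND_HALF_NAMES):
        start = offset + index * pointer_size
        values[name] = int.from_bytes(data[start:start + pointer_size],
                                      "little")
    return values
```

Under these assumptions the program text would begin at byte offset 16 + 8 * pointer_size of the load file.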
 | |
| .P
 | |
| The program text consists of NTEXT bytes.
 | |
| NTEXT is always a multiple of the wordsize.
 | |
| The first byte of the program text is the
 | |
| first byte of the instruction address
 | |
| space, i.e. it has address 0.
 | |
| Pointers into the program text are found in the procedure descriptor
 | |
| table, where relocation is simple, and in the global data area.
 | |
| The initialization of the global data area allows easy
 | |
| relocation of pointers into both address spaces.
 | |
| .P
 | |
| The global data area is described by the NDATA descriptors.
 | |
| Each descriptor describes a number of consecutive words (of~wordsize)
 | |
| and consists of a sequence of bytes.
 | |
| While reading the descriptors from the load file, one can
 | |
| initialize the global data area from low to high addresses.
 | |
| The size of the initialized data area is given by SZDATA;
 | |
| this number can be used to check the initialization.
 | |
| .N
 | |
| The header of each descriptor consists of a byte, describing the type,
 | |
| and a count.
 | |
| The number of bytes used for this (unsigned) count depends on the
 | |
| type of the descriptor and
 | |
| is either a pointer-sized integer
 | |
| or one byte.
 | |
| The meaning of the count depends on the descriptor type.
 | |
| At load time an interpreter can
 | |
| perform any conversion deemed necessary, such as
 | |
| reordering bytes in integers
 | |
| and pointers and adding base addresses to pointers.
 | |
| .BP
 | |
| .A
 | |
| In the following pictures we show a graphical notation of the
 | |
| initializers.
 | |
| The leftmost rectangle represents the leading byte.
 | |
| .N 1
 | |
| .DS
 | |
| .PS - 4 " "
 | |
| Fields marked with
 | |
| .N 1
 | |
| .PT n
 | |
| contain a pointer-sized integer used as a count
 | |
| .PT m
 | |
| contain a one-byte integer used as a count
 | |
| .PT b
 | |
| contain a one-byte integer
 | |
| .PT w
 | |
| contain a wordsized integer
 | |
| .PT p
 | |
| contain a data or instruction pointer
 | |
| .PT s
 | |
| contain a null terminated ASCII string
 | |
| .PE 1
 | |
| .DE 0
 | |
| .VS 1 1
 | |
| .Dr 6
 | |
| 
 | |
|     -------------------
 | |
|     | 0 |      n      |           repeat last initialization n times
 | |
|     -------------------
 | |
| .De
 | |
| .Dr 4
 | |
|     ---------
 | |
|     | 1 | m |                     m uninitialized words
 | |
|     ---------
 | |
| .De
 | |
| .Dr 6
 | |
|                ____________
 | |
|               /    bytes   \e
 | |
|     -----------------   -----
 | |
|     | 2 | m | b | b |...| b |     m initialized bytes
 | |
|     -----------------   -----
 | |
| .De
 | |
| .Dr 6
 | |
|                _________
 | |
|               /  word   \e
 | |
|     -----------------------
 | |
|     | 3 | m |      w      |...    m initialized wordsized integers
 | |
|     -----------------------
 | |
| .De
 | |
| .Dr 6
 | |
|                _________
 | |
|               / pointer \e
 | |
|     -----------------------
 | |
|     | 4 | m |      p      |...    m initialized data pointers
 | |
|     -----------------------
 | |
| .De
 | |
| .Dr 6
 | |
|                _________
 | |
|               / pointer \e
 | |
|     -----------------------
 | |
|     | 5 | m |      p      |...    m initialized instruction pointers
 | |
|     -----------------------
 | |
| .De
 | |
| .Dr 6
 | |
|                ____________
 | |
|               /    bytes   \e
 | |
|     -------------------------
 | |
|     | 6 | m | b | b |...| b |     initialized integer of size m
 | |
|     -------------------------
 | |
| .De
 | |
| .Dr 6
 | |
|                ____________
 | |
|               /    bytes   \e
 | |
|     -------------------------
 | |
|     | 7 | m | b | b |...| b |     initialized unsigned of size m
 | |
|     -------------------------
 | |
| .De
 | |
| .Dr 6
 | |
|                ____________
 | |
|               /   string   \e
 | |
|     -------------------------
 | |
|     | 8 | m |        s      |     initialized float of size m
 | |
|     -------------------------
 | |
| .De 3
 | |
| .PS - 8
 | |
| .PT type~0:
 | |
| If the last initialization initialized k bytes starting
 | |
| at address \fIa\fP, do the same initialization again n times,
 | |
| starting at \fIa\fP+k, \fIa\fP+2*k, ..., \fIa\fP+n*k.
 | |
| This is the only descriptor whose starting byte
 | |
| is followed by an integer with the
 | |
| size of a
 | |
| pointer;
 | |
| in all other descriptors the first byte is followed by a one-byte count.
 | |
| This descriptor must be preceded by a descriptor of
 | |
| another type.
 | |
| .PT type~1:
 | |
| Reserve m words, not explicitly initialized (BSS and HOL).
 | |
| .PT type~2:
 | |
| The m bytes following the descriptor header are
 | |
| initializers for the next m bytes of the
 | |
| global data area.
 | |
| m is divisible by the wordsize.
 | |
| .PT type~3:
 | |
| The m words following the header are initializers for the next m words of the
 | |
| global data area.
 | |
| .PT type~4:
 | |
| The m data address space pointers following the header are
 | |
| initializers for the next
 | |
| m data pointers in the global data area.
 | |
| Interpreters that represent EM pointers by
 | |
| target machine addresses must relocate all data pointers.
 | |
| .PT type~5:
 | |
| The m instruction address space pointers following the header are
 | |
| initializers for the next
 | |
| m instruction pointers in the global data area.
 | |
| Interpreters that represent EM instruction pointers by
 | |
| target machine addresses must relocate these pointers.
 | |
| .PT type~6:
 | |
| The m bytes following the header form
 | |
| a signed integer number with a size of m bytes,
 | |
| which is an initializer for the next m bytes
 | |
| of the global data area.
 | |
| m is governed by the same restrictions as for
 | |
| transfer of objects to/from memory.
 | |
| .PT type~7:
 | |
| The m bytes following the header form
 | |
| an unsigned integer number with a size of m bytes,
 | |
| which is an initializer for the next m bytes
 | |
| of the global data area.
 | |
| m is governed by the same restrictions as for
 | |
| transfer of objects to/from memory.
 | |
| .PT type~8:
 | |
| The header is followed by an ASCII string, null terminated, to
 | |
| initialize, in global data,
 | |
| a floating point number with a size of m bytes.
 | |
| m is governed by the same restrictions as for
 | |
| transfer of objects to/from memory.
 | |
| The ASCII string contains the notation of a real as used in the
 | |
| Pascal language.
 | |
| .PE
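The descriptor types above can be processed in a single pass from low to high addresses. The following sketch builds the global data image as a byte string. It rests on several assumptions that the text does not fix: uninitialized words are filled with zero bytes, pointers are copied without relocation, integers keep the byte order of the load file, and floats of size 4 or 8 use the host's IEEE format. The function name and its parameters are hypothetical.

```python
import struct


def load_global_data(stream, ndata, wordsize, pointer_size):
    """Return (image, bytes_consumed) for NDATA descriptors in `stream`."""
    image = bytearray()
    pos = 0
    last = b""  # bytes produced by the previous descriptor, if repeatable

    def take(count):
        nonlocal pos
        chunk = stream[pos:pos + count]
        if len(chunk) != count:
            raise ValueError("truncated data descriptor")
        pos += count
        return chunk

    for _ in range(ndata):
        kind = take(1)[0]
        if kind == 0:  # repeat the last initialization n times
            n = int.from_bytes(take(pointer_size), "little")
            if not last:
                raise ValueError("type 0 must follow another descriptor type")
            image += last * n
            last = b""  # a type 0 may not directly follow a type 0
            continue
        m = take(1)[0]  # all other types carry a one-byte count
        if kind == 1:  # m uninitialized words (zero fill assumed)
            chunk = bytes(m * wordsize)
        elif kind in (2, 6, 7):  # m bytes, copied as they are
            chunk = take(m)
        elif kind == 3:  # m words
            chunk = take(m * wordsize)
        elif kind in (4, 5):  # m pointers, not relocated in this sketch
            chunk = take(m * pointer_size)
        elif kind == 8:  # float of size m, given as a text string
            end = stream.index(b"\0", pos)
            text = take(end - pos).decode("ascii")
            take(1)  # skip the terminating null byte
            fmt = {4: "<f", 8: "<d"}.get(m)
            if fmt is None:
                raise ValueError("unsupported float size")
            chunk = struct.pack(fmt, float(text))
        else:
            raise ValueError("unknown descriptor type %d" % kind)
        image += chunk
        last = chunk
    return bytes(image), pos
```

If the data area starts at address 0, the length of the returned image can be compared with SZDATA as a consistency check. An interpreter that represents EM pointers by target machine addresses would add its base addresses while handling types 4 and 5.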
 | |
| .P
 | |
| The NPROC procedure descriptors on the load file consist of
 | |
| an instruction space address (of~pointer~size) and
 | |
| an integer (of~pointer~size) specifying the number of bytes for
 | |
| locals.
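Reading the procedure descriptor table then reduces to a simple loop. The sketch assumes that each entry stores the instruction space address first and the size of the locals second, both of pointer size and least significant byte first; the function name and the `offset` parameter are hypothetical.

```python
def read_procedure_descriptors(data, nproc, pointer_size, offset=0):
    """Return a list of (start_address, locals_size) pairs."""
    entry_size = 2 * pointer_size
    descriptors = []
    for index in range(nproc):
        start = offset + index * entry_size
        entry = data[start:start + entry_size]
        if len(entry) != entry_size:
            raise ValueError("truncated procedure descriptor table")
        address = int.from_bytes(entry[:pointer_size], "little")
        locals_size = int.from_bytes(entry[pointer_size:], "little")
        descriptors.append((address, locals_size))
    return descriptors
```

A procedure identifier is then an index into the returned list, and ENTRY from the header selects the procedure to start with.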
 |