. \" $Header$"
 | 
						|
.RP
 | 
						|
.ND Dec 1984
 | 
						|
.TL
 | 
						|
.B
 | 
						|
A backend table for the 6500 microprocessor
 | 
						|
.R
 | 
						|
.AU
 | 
						|
Jan van Dalen
 | 
						|
.AB
 | 
						|
The backend table is part of the Amsterdam Compiler Kit (ACK).
It translates the intermediate language family EM into machine
code for the MCS6500 microprocessor family.
 | 
						|
.AE
 | 
						|
.bp
 | 
						|
.DS C
 | 
						|
.B
 | 
						|
THE MCS6500 MICROPROCESSOR.
 | 
						|
.R
 | 
						|
.DE
 | 
						|
.NH 0
 | 
						|
Introduction
 | 
						|
.PP
 | 
						|
Why write a back end table for the MCS6500 microprocessor family?
Although the MCS6500 microprocessor family has a simple
instruction set and internal structure, it is used in a
variety of microcomputers and home computers.
This is because of its low cost.
As an example, the Apple II, a well known and widespread
microcomputer, uses the MCS6502 CPU.
The BBC home computer, whose popularity is growing day
by day, is also based on the MCS6502 CPU, although
better and stronger microprocessors are available.
The designers at Acorn have probably
chosen the MCS6502 because of the amount of software
available for this CPU.
Because of its widespread use, a variety of software
will be needed for it.
One can think of games, administration programs,
teaching programs, BASIC interpreters and other application
programs.
Even though it will not be possible to run the complete compiler kit
on an MCS6500 based computer, it is possible to write application
programs in a high level language, such as Pascal or C, on a
minicomputer.
These application programs can be tested and compiled on that
minicomputer and put in a ROM (Read Only Memory), for example,
so that they can be executed by an MCS6500 CPU.
The strategy of writing test programs on a minicomputer,
compiling them and then executing them on an MCS6500 based
microcomputer was used in the development of the back end.
The minicomputer used is an M68000 based one, manufactured by
Bleasdale Computer Systems Ltd.
The micro- or home computer used is a BBC microcomputer,
manufactured by Acorn Computer Ltd.
 | 
						|
.NH
 | 
						|
The MOS Technology MCS6500
 | 
						|
.PP
 | 
						|
The MCS6500 is a family of CPU devices developed by MOS
Technology [1].
The members of the MCS6500 family are the same chip in
different packages.
The MCS6502, the big brother in the family, can address 64k
bytes of memory, while for example the MCS6504 can only address
8k bytes of memory.
This difference is due to the fact that the MCS6502 comes in a
40-pin package and the MCS6504 in a 28-pin package, so fewer
address lines are available.
 | 
						|
.bp
 | 
						|
.NH
 | 
						|
The MCS6500 CPU programmable registers
 | 
						|
.PP
 | 
						|
The MCS6500 series is based on the same chip so all have the
 | 
						|
same programmable registers.
 | 
						|
.sp 9
 | 
						|
.NH 2
 | 
						|
The accumulator A.
 | 
						|
.PP
 | 
						|
The accumulator A is the only register on which the arithmetic
 | 
						|
and logical instructions can be used.
 | 
						|
For example, the instruction ADC (add with carry) adds a byte
from memory or an immediate data byte, together with the carry flag,
to the contents of the accumulator A.
 | 
						|
.NH 2
 | 
						|
The index register X.
 | 
						|
.PP
 | 
						|
As the name suggests, this register can be used for some
indirect addressing modes.
The modes are explained below.
 | 
						|
.NH 2
 | 
						|
The index register Y.
 | 
						|
.PP
 | 
						|
This register is, just as the index register X, used for
 | 
						|
certain indirect addressing modes.
 | 
						|
These addressing modes are different from the modes which
 | 
						|
use index register X.
 | 
						|
.NH 2
 | 
						|
The program counter PC
 | 
						|
.PP 
 | 
						|
This is the only 16-bit register available.
 | 
						|
It is used to point to the next instruction to be
 | 
						|
carried out.
 | 
						|
.NH 2
 | 
						|
The stack pointer SP
 | 
						|
.PP
 | 
						|
The stack pointer is an 8-bit register, so the stack can contain
 | 
						|
at most 256 bytes.
 | 
						|
The CPU always uses 00000001 as the highbyte of any stack address,
 | 
						|
which means that memory locations
 | 
						|
.B
 | 
						|
0100
 | 
						|
.R
 | 
						|
through
 | 
						|
.B
 | 
						|
01FF
 | 
						|
.R
 | 
						|
are permanently assigned to the stack.
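.PP
The mapping from the 8-bit stack pointer to a memory address can be
sketched in Python as follows.
This is an illustration only and not part of the back end.
The names stack_address and push are hypothetical, and the downward
growth of the stack is standard MCS6500 behaviour that is not stated
in the text above.

```python
STACK_PAGE = 0x01  # highbyte the CPU supplies for every stack access


def stack_address(sp: int) -> int:
    """Return the 16-bit address selected by the 8-bit stack pointer."""
    if not 0 <= sp <= 0xFF:
        raise ValueError("SP is an 8-bit register")
    return (STACK_PAGE << 8) | sp


def push(memory: bytearray, sp: int, value: int) -> int:
    """Store one byte at the stack address and return the new SP.

    The stack grows downwards and SP wraps around within its 8 bits,
    so the stack can never leave the range 0100 through 01FF.
    """
    memory[stack_address(sp)] = value & 0xFF
    return (sp - 1) & 0xFF
```

Whatever value SP holds, the resulting address stays inside page one.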
 | 
						|
.sp 12
 | 
						|
.NH 2
 | 
						|
The status register
 | 
						|
.PP
 | 
						|
The status register maintains six status flags and a master
 | 
						|
interrupt control bit.
 | 
						|
.br
 | 
						|
These are the six status flags:
 | 
						|
    Carry        (c)
 | 
						|
    Zero         (z)
 | 
						|
    Overflow     (o)
 | 
						|
    Sign         (n)
 | 
						|
    Decimal mode (d)
 | 
						|
    Break        (b)
 | 
						|
 | 
						|
 | 
						|
 | 
						|
 | 
						|
 | 
						|
The bit (i) is the master interrupt control bit.
 | 
						|
.NH
 | 
						|
The MCS6500 memory layout.
 | 
						|
.PP
 | 
						|
In the MCS6500 memory space three areas have a special meaning.
These areas are:
 | 
						|
.IP 1)
 | 
						|
Top page.
 | 
						|
.IP 2)
 | 
						|
Zero page.
 | 
						|
.IP 3)
 | 
						|
The stack.
 | 
						|
.PP
 | 
						|
MCS6500 memory is divided up into pages.
Each page consists of 256 bytes.
So in a memory address the highbyte denotes the page number
and the lowbyte the offset within the page.
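.PP
The division of an address into page number and offset can be
sketched in Python.
The function names are hypothetical and serve only as an illustration.

```python
def split_address(address: int) -> tuple[int, int]:
    """Return (page, offset) for a 16-bit address."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    return address >> 8, address & 0xFF


def join_address(page: int, offset: int) -> int:
    """Combine a page number and an offset into a 16-bit address."""
    return ((page & 0xFF) << 8) | (offset & 0xFF)
```

For example, address 01FF lies in page 01 at offset FF, and every
address in zero page has 00 as its highbyte.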
 | 
						|
.NH 2
 | 
						|
Top page.
 | 
						|
.PP
 | 
						|
When an MCS6500 is restarted it jumps indirectly via memory address
.B
FFFC.
.R
At
.B
FFFC
.R
(lowbyte) and
.B
FFFD
.R
(highbyte) there must be the address of the bootstrap subroutine.
When a break instruction (BRK) occurs or an interrupt takes place,
the MCS6500 jumps indirectly through memory address
.B
FFFE.
.R
.B
FFFE
.R
and
.B
FFFF
.R
must thus contain the address of the interrupt routine.
This only holds for maskable interrupts.
There also exists a nonmaskable interrupt.
This causes the MCS6500 to jump indirectly through memory address
.B
FFFA.
.R
So the top six bytes of memory are used by the operating system
and are therefore not available for the back end.
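.PP
The indirect jump through one of these vectors can be sketched in
Python.
The constant and function names are hypothetical.
The sketch only shows how the two bytes of a vector form the target
address.

```python
NMI_VECTOR = 0xFFFA    # nonmaskable interrupt
RESET_VECTOR = 0xFFFC  # restart
IRQ_VECTOR = 0xFFFE    # BRK instruction and maskable interrupt


def read_vector(memory: bytes, vector: int) -> int:
    """Return the 16-bit address stored at `vector`, lowbyte first."""
    return memory[vector] | (memory[vector + 1] << 8)
```

With 00 at FFFC and 80 at FFFD, read_vector(memory, RESET_VECTOR)
yields 8000, the address at which execution starts after a restart.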
 | 
						|
.NH 2
 | 
						|
Zero page.
 | 
						|
.PP
 | 
						|
This page has a special meaning in the sense that addressing
this page uses special opcodes.
Since a page consists of 256 bytes, only one byte is needed
for addressing zero page.
So an instruction which uses zero page occupies two bytes.
It also uses fewer clock cycles while carrying out the instruction.
Zero page is also needed when indirect addressing is used.
This means that when indirect addressing is used, the address must
reside in zero page (two consecutive bytes).
In this case (the back end), zero page is used, for example,
to hold the local base, the second local base, the stack pointer
etc.
 | 
						|
.NH 2
 | 
						|
The stack.
 | 
						|
.PP
 | 
						|
The stack is described in paragraph 3.5 about the MCS6500
 | 
						|
programmable registers.
 | 
						|
.NH 
 | 
						|
The memory addressing modes
 | 
						|
.PP
 | 
						|
MCS6500 memory reference instructions use direct addressing,
 | 
						|
indexed addressing, and indirect addressing.
 | 
						|
.NH 2
 | 
						|
direct addressing.
 | 
						|
.PP
 | 
						|
Three-byte instructions use the second and third bytes of the
 | 
						|
object code to provide a direct 16-bit address:
 | 
						|
therefore, 65536 bytes of memory can be addressed directly.
 | 
						|
The commonly used memory reference instructions also have a two-byte
 | 
						|
object code variation, where the second byte directly addresses
 | 
						|
one of the first 256 bytes.
 | 
						|
.NH 2
 | 
						|
Base page, indexed addressing.
 | 
						|
.PP
 | 
						|
In this case, the instruction has two bytes of object code.
 | 
						|
The contents of either the X or Y index registers are added to the 
 | 
						|
second  object code byte in order to compute a memory address.
 | 
						|
This may be illustrated as follows:
 | 
						|
.sp 15
 | 
						|
Base page, indexed addressing, as illustrated above, is
wraparound, which means that there is no carry.
If the sum of the index register and second object code byte contents
is more than
.B
FF
.R
, the carry bit will be discarded.
 | 
						|
This may be illustrated as follows:
 | 
						|
.sp 9
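.PP
The wraparound rule can be sketched in Python.
The function name is hypothetical.
The sketch only computes the effective address and does not model
the instruction itself.

```python
def zero_page_indexed(operand: int, index: int) -> int:
    """Effective address for base page, indexed addressing.

    `operand` is the second object code byte and `index` the contents
    of the X or Y register. The sum is truncated to 8 bits, so the
    result always stays within the first 256 bytes of memory.
    """
    return (operand + index) & 0xFF
```

With operand F0 and index 20 the sum is 110, the carry is discarded
and location 10 is addressed.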
 | 
						|
.NH 2
 | 
						|
Absolute indexed addressing.
 | 
						|
.PP
 | 
						|
In this case, the contents of either the X or Y register are added
 | 
						|
to a 16-bit direct address provided by the second and third bytes
 | 
						|
of an instruction's object code.
 | 
						|
This may be illustrated as follows:
 | 
						|
.sp 10
 | 
						|
.NH 2
 | 
						|
Indirect addressing.
 | 
						|
.PP
 | 
						|
Instructions that use simple indirect addressing have three bytes of
 | 
						|
object code.
 | 
						|
The second and third object code bytes provide a 16-bit address;
 | 
						|
therefore, the indirect address can be located anywhere in
 | 
						|
memory.
 | 
						|
This is straightforward indirect addressing.
 | 
						|
.NH 3
 | 
						|
Pre-indexed indirect addressing.
 | 
						|
.PP
 | 
						|
In this case, the object code consists of two bytes and the 
 | 
						|
second object code byte provides an 8-bit address.
 | 
						|
Instructions that use pre-indexed indirect addressing add the contents
 | 
						|
of the X index register and the second object code byte to access
 | 
						|
a memory location in the first 256 bytes of memory, where the 
 | 
						|
indirect address will be found:
 | 
						|
.sp 18
 | 
						|
When using pre-indexed indirect addressing, once again wraparound
 | 
						|
addition is used, which means that when the X index register contents
 | 
						|
are added to the second object code byte, any carry will be discarded.
 | 
						|
Note that only the X index register can be used with pre-indexed
 | 
						|
addressing.
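.PP
Pre-indexed indirect addressing can be sketched in Python.
The function name is hypothetical.
The sketch assumes that the highbyte of the indirect address is
fetched from the next location in the first 256 bytes, again with
wraparound, which is known MCS6500 behaviour that is not stated in
the text above.

```python
def pre_indexed_indirect(memory: bytes, operand: int, x: int) -> int:
    """Effective address for pre-indexed indirect addressing.

    X is added to the second object code byte without carry. The
    result locates the indirect address, stored lowbyte first.
    """
    pointer = (operand + x) & 0xFF
    low = memory[pointer]
    high = memory[(pointer + 1) & 0xFF]
    return low | (high << 8)
```

With operand 20 and X equal to 04, the indirect address is read from
locations 24 and 25.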
 | 
						|
.NH 3
 | 
						|
Post-indexed indirect addressing.
 | 
						|
.PP
 | 
						|
In this case, the object code consists of two bytes and the
 | 
						|
second object code byte provides an 8-bit address.
 | 
						|
Now the second object code byte identifies a location
 | 
						|
in the first 256 bytes of memory where an indirect address
 | 
						|
will be found.
 | 
						|
The contents of the Y index register are added to this indirect
 | 
						|
address.
 | 
						|
This may be illustrated as follows:
 | 
						|
.sp 18
 | 
						|
Note that only the Y index register can be used with post-indexed
 | 
						|
indirect addressing.
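.PP
Post-indexed indirect addressing can be sketched in Python in the
same way.
The function name is hypothetical.
Here the index register is added after the indirect address has been
fetched, so the carry into the highbyte is kept.

```python
def post_indexed_indirect(memory: bytes, operand: int, y: int) -> int:
    """Effective address for post-indexed indirect addressing.

    The second object code byte locates an indirect address in the
    first 256 bytes of memory, stored lowbyte first. The contents of
    the Y register are then added to that 16-bit address.
    """
    low = memory[operand & 0xFF]
    high = memory[(operand + 1) & 0xFF]
    base = low | (high << 8)
    return (base + y) & 0xFFFF
```

Because Y is an unsigned 8-bit value, only offsets from 0 through 255
above the indirect address can be reached.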
 | 
						|
.bp
 | 
						|
.NH
 | 
						|
What the CPU has and does not have.
 | 
						|
.PP
 | 
						|
Although the designers of the MCS6500 CPU family state that
there is nothing very significant about the short stack (only
256 bytes), this stack caused problems for the back end.
The designers say that a 256-byte stack usually is sufficient
for any typical microcomputer, but this is only true if the stack
is used only for return addresses of the JSR (jump to
subroutine) instruction.
But since the EM machine is supposed to be a stack machine and
high level languages need parameters and
locals in their procedures and functions, this short stack
is insufficient.
So a software stack is implemented in this back end, requiring two
additional subroutines for stack handling.
These two stack handling subroutines slow down
a program, since the stack is used heavily.
 | 
						|
.PP
 | 
						|
Since parameters and locals of EM procedures are addressed
relative to the local base of that procedure, indirect addressing
is heavily used.
Offsets are positive (for parameters) and negative (for
local variables).
As explained in the section on addressing modes, the MCS6500 has a
post-indexed indirect addressing mode.
This addressing mode can only handle positive offsets.
This raises a problem for accessing the local variables.
I have chosen the following solution.
A second local base is introduced.
This second local base is the real local base minus
a constant BASE.
In the present situation of the back end the value of BASE
is 240.
This means that there are 240 bytes reserved for local
variables to be indirectly addressed and 14 bytes for
the parameters.
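.PP
The effect of the second local base can be sketched in Python.
The function names are hypothetical and the check below only tests
whether a single byte is reachable.
The exact bounds used by the table are not given here and are not
modelled.

```python
BASE = 240  # value given in the text


def second_local_base(local_base: int) -> int:
    """Return the second local base for a given real local base."""
    return (local_base - BASE) & 0xFFFF


def y_index(offset: int) -> int:
    """Return the Y value that reaches `offset` from the real local base.

    Indexing starts at the second local base, so negative offsets
    (local variables) become non-negative values of Y.
    """
    y = BASE + offset
    if not 0 <= y <= 0xFF:
        raise ValueError("offset not reachable through the second local base")
    return y
```

A local variable at offset -2 is reached with Y equal to 238, and
second_local_base(lb) + y_index(-2) is again lb - 2.
A two-byte word needs both Y and Y + 1 to lie in range.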
 | 
						|
.DS C
 | 
						|
.B
 | 
						|
THE CODE GENERATOR.
 | 
						|
.R
 | 
						|
.DE
 | 
						|
.NH 0
 | 
						|
Description of the machine table.
 | 
						|
.PP
 | 
						|
The machine description table consists of the following sections:
 | 
						|
.IP 1.
 | 
						|
The macro definitions.
 | 
						|
.IP 2.
 | 
						|
Constant definitions.
 | 
						|
.IP 3.
 | 
						|
Register definitions.
 | 
						|
.IP 4.
 | 
						|
Token definitions.
 | 
						|
.IP 5.
 | 
						|
Token expressions.
 | 
						|
.IP 6.
 | 
						|
Code rules.
 | 
						|
.IP 7.
 | 
						|
Move definitions.
 | 
						|
.IP 8.
 | 
						|
Test definitions.
 | 
						|
.IP 9.
 | 
						|
Stack definitions.
 | 
						|
.NH 2
 | 
						|
Macro definitions.
 | 
						|
.PP
 | 
						|
The macro definitions at the top of the table are expanded
by the preprocessor wherever they occur in the rest of the table.
 | 
						|
.NH 2
 | 
						|
Constant definitions.
 | 
						|
.PP
 | 
						|
There are three constants which must be defined first.
They are:
 | 
						|
.IP EM_WSIZE: 11
 | 
						|
Number of bytes in a machine word.
 | 
						|
This is the number of bytes a simple
 | 
						|
.B
 | 
						|
loc
 | 
						|
.R
 | 
						|
instruction will put on the stack.
 | 
						|
.IP EM_PSIZE:
 | 
						|
Number of bytes in a pointer.
 | 
						|
This is the number of bytes a
 | 
						|
.B
 | 
						|
lal
 | 
						|
.R
 | 
						|
instruction will put on the stack.
 | 
						|
.IP EM_BSIZE:
 | 
						|
Number of bytes in the hole between AB and LB.
 | 
						|
The calling sequence only saves LB on the stack so this
 | 
						|
constant is equal to the pointer size.
 | 
						|
.NH 2
 | 
						|
Register definitions.
 | 
						|
.PP
 | 
						|
The only important register definition is the definition of
 | 
						|
the registerpair AX.
 | 
						|
Since the rest of the machine's registers Y, PC, ST serve
 | 
						|
special purposes, the code generator cannot use them.
 | 
						|
.NH 2
 | 
						|
Token definitions
 | 
						|
.PP
 | 
						|
There is a fake token.
 | 
						|
This token is put in the table, since the code generator generator
 | 
						|
complains if it cannot find one.
 | 
						|
.NH 2
 | 
						|
Token expression definitions.
 | 
						|
.PP
 | 
						|
The token expression is also a fake one.
 | 
						|
This token expression is put in the table, since the code generator
 | 
						|
generator complains if it cannot find one.
 | 
						|
.NH 2
 | 
						|
Code rules.
 | 
						|
.PP
 | 
						|
The code rule section is the largest section in the table.
The code rules specify EM patterns, stack patterns, code to be generated,
 | 
						|
etc.
 | 
						|
The syntax is:
 | 
						|
.IP code rule:
 | 
						|
EM pattern '|' stack pattern '|' code '|'
 | 
						|
stack replacement '|' EM replacement '|'
 | 
						|
.PP
 | 
						|
All patterns are optional, however there must be at least one
 | 
						|
pattern present.
 | 
						|
If the EM pattern is missing the rule becomes a rewriting
 | 
						|
rule or a
 | 
						|
.B
 | 
						|
coercion
 | 
						|
.R
 | 
						|
to be used when code generation cannot continue because of an
 | 
						|
invalid stack pattern.
 | 
						|
The code rules are preceded by the word CODE:.
 | 
						|
.NH 3
 | 
						|
The EM pattern.
 | 
						|
.PP
 | 
						|
The EM pattern consists of a list of EM mnemonics followed by
 | 
						|
a boolean expression. Examples:
 | 
						|
.sp 1
 | 
						|
.br
 | 
						|
.B
 | 
						|
loe
 | 
						|
.R
 | 
						|
.sp 1
 | 
						|
will match a single
 | 
						|
.B
 | 
						|
loe
 | 
						|
.R
 | 
						|
instruction,
 | 
						|
.sp 1
 | 
						|
.br
 | 
						|
.B
 | 
						|
loc loc cif
 | 
						|
.R
 | 
						|
$1==2 && $2==8
 | 
						|
.sp 1
 | 
						|
is a pattern that will match
 | 
						|
.sp 1
 | 
						|
.br
 | 
						|
.B
 | 
						|
loc
 | 
						|
.R
 | 
						|
2
 | 
						|
.br
 | 
						|
.B
 | 
						|
loc
 | 
						|
.R
 | 
						|
8
 | 
						|
.br
 | 
						|
.B
 | 
						|
cif
 | 
						|
.R
 | 
						|
.sp 1
 | 
						|
and
 | 
						|
.sp 1
 | 
						|
.br
 | 
						|
.B
 | 
						|
lol
 | 
						|
inc
 | 
						|
stl
 | 
						|
.R
 | 
						|
$1==$3
 | 
						|
.sp 1
 | 
						|
will match for example
 | 
						|
.sp 1
 | 
						|
.br
 | 
						|
.B
 | 
						|
lol
 | 
						|
.R
 | 
						|
6
 | 
						|
.br
 | 
						|
.B
 | 
						|
inc
 | 
						|
.R
 | 
						|
.br
 | 
						|
.B
 | 
						|
stl
 | 
						|
.R
 | 
						|
6
 | 
						|
.sp 1
 | 
						|
A missing boolean expression evaluates to TRUE.
 | 
						|
.PP
 | 
						|
The code generator will match the longest EM pattern on every occasion.
If two patterns of the same length match, the first in the table
will be chosen; all patterns of length greater than or equal
to three are considered to be of the same length.
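.PP
The selection rule can be sketched in Python.
This is not the code of the code generator.
The data layout, with a rule as a pair of mnemonics and an optional
condition, and the names choose_rule and effective_length are
assumptions made for this illustration.

```python
def effective_length(pattern):
    """Patterns of three or more mnemonics all count as length three."""
    return min(len(pattern), 3)


def choose_rule(rules, instructions):
    """Return the index of the rule selected for the head of the stream.

    `rules` is a list of (pattern, condition) pairs in table order;
    `condition` is None or a function of the list of arguments.
    `instructions` is a list of (mnemonic, argument) pairs.
    """
    best, best_len = None, 0
    for i, (pattern, condition) in enumerate(rules):
        window = instructions[:len(pattern)]
        if [m for m, _ in window] != list(pattern):
            continue  # mnemonics differ or the stream is too short
        if condition is not None and not condition([a for _, a in window]):
            continue  # boolean expression is false
        # Only a strictly longer match replaces an earlier one,
        # so the first rule in the table wins a tie.
        if effective_length(pattern) > best_len:
            best, best_len = i, effective_length(pattern)
    return best
```

With the rules lol, then lol inc stl with $1==$3, then a four
instruction pattern, the input lol 6, inc, stl 6 followed by a
matching fourth instruction selects the second rule, because the
third and fourth patterns count as equally long.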
 | 
						|
.NH 3
 | 
						|
The stack pattern.
 | 
						|
.PP
 | 
						|
The only stack pattern that can occur is R16, which means that the
 | 
						|
registerpair AX contains the word on top of the stack.
 | 
						|
If this is not the case a coercion occurs.
This coercion generates a "jsr Pop", which means that the top
 | 
						|
of the stack is popped and stored in the registerpair AX.
 | 
						|
.NH 3
 | 
						|
The code part.
 | 
						|
.PP
 | 
						|
The code part consists of three parts, stack cleanup, register
 | 
						|
allocation, and code to be generated.
 | 
						|
All of these may be omitted.
 | 
						|
.NH 4
 | 
						|
Stack cleanup.
 | 
						|
.PP
 | 
						|
When generating something like a branch instruction it might be
 | 
						|
needed to empty the fake stack, that is, remove the AX registerpair.
 | 
						|
This is done by the instruction remove(ALL).
 | 
						|
.NH 4
 | 
						|
Register allocation.
 | 
						|
.PP
 | 
						|
If the machine code to be generated uses the registerpair AX,
 | 
						|
this is signaled to the code generator by the allocate(R16)
 | 
						|
instruction.
 | 
						|
If the registerpair AX resides on the fake stack, this will result
 | 
						|
in a "jsr Push", which means that the registerpair AX is pushed on
 | 
						|
the stack and will be free for further use.
 | 
						|
If registerpair AX is not on the fake stack nothing happens.
 | 
						|
.NH 4
 | 
						|
Code to be generated.
 | 
						|
.PP
 | 
						|
Code to be generated is specified as a list of items of the following
 | 
						|
kind:
 | 
						|
.IP 1)
 | 
						|
A string in double quotes("This is a string").
 | 
						|
This is copied to the codefile and a newline ('\n') is appended.
 | 
						|
Inside the string all normal C string conventions are allowed,
 | 
						|
and substitutions can be made of the following sorts.
 | 
						|
.RS
 | 
						|
.IP a)
 | 
						|
$1, $2 etc. These are the operands of the corresponding EM
instructions and are printed according to their type.
 | 
						|
To put a real '$' inside the string it must be doubled ('$$').
 | 
						|
.IP b)
 | 
						|
%[1], %[2.reg], %[b.1] etc. These have their obvious meaning.
If they describe a complete token (%[1]) the printformat for
the token is used.
If they stand for a basic term in an expression they will be
 | 
						|
printed according to their type.
 | 
						|
To put a real '%' inside the string it must be doubled ('%%').
 | 
						|
.IP c)
 | 
						|
%( arbitrary expression %). This allows inclusion of arbitrary
 | 
						|
expressions inside strings.
 | 
						|
This is not needed very often, so the awkward notation
is not too bad.
 | 
						|
Note that %(%[1]%) is equivalent to %[1].
 | 
						|
.RE
 | 
						|
.NH 3
 | 
						|
Stack replacement.
 | 
						|
.PP
 | 
						|
The stack replacement is a possibly empty list of items to be
 | 
						|
pushed on the fake stack.
 | 
						|
Three things can occur:
 | 
						|
.IP 1)
 | 
						|
%[1] is used if the registerpair AX was on the fake stack and is
 | 
						|
to be pushed back onto it.
 | 
						|
.IP 2)
 | 
						|
%[a] is used if the registerpair AX is allocated with allocate(R16)
 | 
						|
and is to be pushed onto the fake stack.
 | 
						|
.IP 3)
 | 
						|
It can also be empty.
 | 
						|
.NH 3
 | 
						|
EM replacement.
 | 
						|
.PP
 | 
						|
In exceptional cases it might be useful to leave part of an EM
pattern undone.
 | 
						|
For example, a
 | 
						|
.B
 | 
						|
sdl
 | 
						|
.R
 | 
						|
instruction might be split into two
 | 
						|
.B
 | 
						|
stl
 | 
						|
.R
 | 
						|
instructions when there is no 4-byte quantity on the stack.
 | 
						|
The EM replacement part allows one to express this.
 | 
						|
Example:
 | 
						|
.sp 1
 | 
						|
.br
 | 
						|
.B
 | 
						|
stl
 | 
						|
.R
 | 
						|
$1
 | 
						|
.B
 | 
						|
stl
 | 
						|
.R
 | 
						|
$1+2
 | 
						|
.sp 1
 | 
						|
The instructions are inserted in the stream so they can match
 | 
						|
the first part of a pattern in the next step.
 | 
						|
Note that since the code generator traverses the EM instructions
 | 
						|
in a strict linear fashion, it is impossible to let the EM
 | 
						|
replacement match later parts of a pattern.
 | 
						|
So if there is a pattern
 | 
						|
.sp 1
 | 
						|
.br
 | 
						|
.B
 | 
						|
loc
 | 
						|
stl
 | 
						|
.R
 | 
						|
$1==0
 | 
						|
.sp 1
 | 
						|
and the input is
 | 
						|
.sp 1
 | 
						|
.br
 | 
						|
.B
 | 
						|
loc
 | 
						|
.R
 | 
						|
0
 | 
						|
.B
 | 
						|
sdl
 | 
						|
.R
 | 
						|
4
 | 
						|
.sp 1
 | 
						|
the
 | 
						|
.B
 | 
						|
loc
 | 
						|
.R
 | 
						|
0
 | 
						|
will be processed first, then the
 | 
						|
.B
 | 
						|
sdl
 | 
						|
.R
 | 
						|
might be split into two
 | 
						|
.B
 | 
						|
stl
 | 
						|
.R
 | 
						|
's but the pattern cannot match now.
 | 
						|
.NH 3
 | 
						|
Move definitions.
 | 
						|
.PP
 | 
						|
This definition is a fake. This definition is put in the
 | 
						|
table, since the code generator generator complains if it
 | 
						|
cannot find one.
 | 
						|
.NH 3
 | 
						|
Test definitions.
 | 
						|
.PP
 | 
						|
Test definitions aren't used by the table.
 | 
						|
.NH 3
 | 
						|
Stack definitions.
 | 
						|
.PP
 | 
						|
When the generator has to push the registerpair AX, it must
 | 
						|
know how to do so.
 | 
						|
The machine code to be generated is defined here.
 | 
						|
.NH 1
 | 
						|
Some remarks.
 | 
						|
.PP
 | 
						|
The above description of the machine table is
 | 
						|
a description of the table for the MCS6500.
 | 
						|
It uses only a part of the possibilities which the code generator
 | 
						|
generator offers.
 | 
						|
For a more precise and detailed description see [2].
 | 
						|
.DS C
 | 
						|
.B
 | 
						|
THE BACK END TABLE.
 | 
						|
.R
 | 
						|
.DE
 | 
						|
.NH 0
 | 
						|
Introduction.
 | 
						|
.PP
 | 
						|
The code rules are divided into 15 groups.
 | 
						|
These groups are:
 | 
						|
.IP 1.
 | 
						|
Load instructions.
 | 
						|
.IP 2.
 | 
						|
Store instructions.
 | 
						|
.IP 3.
 | 
						|
Integer arithmetic instructions.
 | 
						|
.IP 4.
 | 
						|
Unsigned arithmetic instructions.
 | 
						|
.IP 5.
 | 
						|
Floating point arithmetic instructions.
 | 
						|
.IP 6.
 | 
						|
Pointer arithmetic instructions.
 | 
						|
.IP 7.
 | 
						|
Increment, decrement and zero instructions.
 | 
						|
.IP 8.
 | 
						|
Convert instructions.
 | 
						|
.IP 9.
 | 
						|
Logical instructions.
 | 
						|
.IP 10.
 | 
						|
Set manipulation instructions.
 | 
						|
.IP 11.
 | 
						|
Array instructions.
 | 
						|
.IP 12.
 | 
						|
Compare instructions.
 | 
						|
.IP 13.
 | 
						|
Branch instructions.
 | 
						|
.IP 14.
 | 
						|
Procedure call instructions.
 | 
						|
.IP 15.
 | 
						|
Miscellaneous instructions.
 | 
						|
.PP
 | 
						|
For each of these groups one or two typical EM patterns will be explained
in the following sections.
Comments are placed between /* and */ (/* This is a comment */).
 | 
						|
.NH
 | 
						|
The instructions.
 | 
						|
.NH 2
 | 
						|
The load instructions.
 | 
						|
.PP
 | 
						|
In this group a typical instruction is
 | 
						|
.B
 | 
						|
lol
 | 
						|
.R
 | 
						|
.
 | 
						|
A
 | 
						|
.B
 | 
						|
lol
 | 
						|
.R
 | 
						|
instruction pushes the word at local base + offset, where offset
is the instruction's argument, onto the stack.
Since the MCS6500 can only offset by 256 bytes, as explained in the
section on memory addressing modes, there is a need for two code rules
in the table:
one which can offset directly and one that must explicitly
calculate the address of the local.
 | 
						|
.NH 3
 | 
						|
The lol instruction with indirect offsetting.
 | 
						|
.PP
 | 
						|
In this case an indirect load with an offset from the second local base
is possible.
 | 
						|
The table content is:
 | 
						|
.sp 1
 | 
						|
.br
 | 
						|
.B
 | 
						|
lol
 | 
						|
.R
 | 
						|
IN($1) | |
 | 
						|
.br
 | 
						|
allocate(R16)	/* allocate registerpair AX */
 | 
						|
.br
 | 
						|
"ldy #BASE+$1"	/* load Y with the offset from the second
 | 
						|
.br
 | 
						|
					      local base */
 | 
						|
.br
 | 
						|
"lda (LBl),y"	/* load indirect the lowbyte of the word */
 | 
						|
.br
 | 
						|
"tax"		/* move register A to register X */
 | 
						|
.br
 | 
						|
"iny"		/* increment register Y (offset) */
 | 
						|
.br
 | 
						|
"lda (LBl),y"	/* load indirect the highbyte of the word */
 | 
						|
.br
 | 
						|
| %[a] | |	/* push the word onto the fake stack */
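.PP
The effect of the generated instructions can be traced with a small
Python sketch.
It assumes that the zero page pair starting at LBl holds the second
local base, and the zero page address chosen for LBl is hypothetical.
The function name is also hypothetical.

```python
BASE = 240   # value given in the text
LBL = 0x10   # hypothetical zero page address of LBl (lowbyte first)


def lol(memory: bytes, offset: int):
    """Trace the five instructions of the rule; return (A, X)."""
    pointer = memory[LBL] | (memory[LBL + 1] << 8)
    y = (BASE + offset) & 0xFF            # ldy #BASE+$1
    a = memory[(pointer + y) & 0xFFFF]    # lda (LBl),y   lowbyte
    x = a                                 # tax
    y = (y + 1) & 0xFF                    # iny
    a = memory[(pointer + y) & 0xFFFF]    # lda (LBl),y   highbyte
    return a, x
```

After the sequence register A holds the highbyte and register X the
lowbyte of the local, which is the layout of the registerpair AX used
by the rule.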
 | 
						|
.NH 3
 | 
						|
The lol instruction whose offset is too big.
 | 
						|
.PP
 | 
						|
In this case, the library subroutine "Lol" is used.
 | 
						|
This subroutine expects the offset in registerpair AX, then
 | 
						|
calculates the address of the local or parameter, and loads
 | 
						|
it into registerpair AX.
 | 
						|
The table content is:
 | 
						|
.sp 1
 | 
						|
.br
 | 
						|
.B
 | 
						|
lol
 | 
						|
.R
 | 
						|
| |
 | 
						|
.br
 | 
						|
allocate(R16)	/* allocate registerpair AX */
 | 
						|
.br
 | 
						|
"lda #[$1].h"	/* load highbyte of offset into register A */
 | 
						|
.br
 | 
						|
"ldx #[$1].l"	/* load lowbyte of offset into register X */
 | 
						|
.br
 | 
						|
"jsr Lol"	/* perform the subroutine */
 | 
						|
.br
 | 
						|
| %[a] | |	/* push word onto the fake stack */
 | 
						|
.NH 2
 | 
						|
The store instructions.
 | 
						|
.PP
 | 
						|
In this group a typical instruction is
 | 
						|
.B
 | 
						|
stl.
 | 
						|
.R
 | 
						|
A
 | 
						|
.B
 | 
						|
stl
 | 
						|
.R
 | 
						|
instruction pops a word from the stack and stores it in the word
at local base + offset, where offset is the instruction's argument.
Here too there is a need for two code rules in the table, as a result
of the offset limits.
 | 
						|
.NH 3
 | 
						|
The stl instruction with indirect offsetting.
 | 
						|
.PP
 | 
						|
In this case an indirect store with an offset from the second local
base is possible.
 | 
						|
The table content is:
 | 
						|
.sp 1
 | 
						|
.br
 | 
						|
.B
 | 
						|
stl
 | 
						|
.R
 | 
						|
IN($1) | R16 |	/* expect registerpair AX on top of the
 | 
						|
.br
 | 
						|
							fake stack */
 | 
						|
.br
 | 
						|
"ldy #BASE+1+$1"  /* load Y with the offset from the
 | 
						|
.br
 | 
						|
						second local base */
 | 
						|
.br
 | 
						|
"sta (LBl),y"	/* store the highbyte of the word from A */
 | 
						|
.br
 | 
						|
"txa"		/* move register X to register A */
 | 
						|
.br
 | 
						|
"dey"		/* decrement offset */
 | 
						|
.br
 | 
						|
"sta (LBl),y"	/* store the lowbyte of the word from A */
 | 
						|
.br
 | 
						|
| | |
 | 
						|
.NH 3
 | 
						|
The stl instruction whose offset is too big.
 | 
						|
.PP
 | 
						|
In this case the library subroutine 'Stl' is used.
 | 
						|
This subroutine expects the offset in registerpair AX, then
 | 
						|
calculates the address, pops the word and stores it in its place.
 | 
						|
The table content is:
 | 
						|
.sp 1
 | 
						|
.br
 | 
						|
.B
 | 
						|
stl
 | 
						|
.R
 | 
						|
| |
 | 
						|
.br
 | 
						|
allocate(R16)	/* allocate registerpair AX */
 | 
						|
.br
 | 
						|
"lda #[$1].h"	/* load highbyte of offset in register A */
 | 
						|
.br
 | 
						|
"ldx #[$1].l"	/* load lowbyte of offset in register X */
 | 
						|
.br
 | 
						|
"jsr Stl"	/* perform the subroutine */
 | 
						|
.br
 | 
						|
| | |
 | 
						|
.NH 2
 | 
						|
Integer arithmetic instructions.
 | 
						|
.PP
 | 
						|
In this group typical instructions are
 | 
						|
.B
 | 
						|
adi
 | 
						|
.R
 | 
						|
and
 | 
						|
.B
 | 
						|
mli.
 | 
						|
.R
 | 
						|
These instructions, in this table, are implemented for 2-byte
 | 
						|
and 4-byte integers.
 | 
						|
The only arithmetic instructions available on the MCS6500 are
 | 
						|
the ADC (add with carry), and SBC (subtract with not(carry)).
 | 
						|
Not(carry) here means that in a subtraction, the one's complement
 | 
						|
of the carry is taken.
 | 
						|
The absence of multiply and division instructions forces the
 | 
						|
use of subroutines to handle these cases.
 | 
						|
Because there are no registers left to perform on the multiply
 | 
						|
and division, zero page is used here.
 | 
						|
The 4-byte integer arithmetic is implemented, because in C there
 | 
						|
exists the integer type long.
 | 
						|
A user is free to use the type long, but will pay in performance.
 | 
						|
.NH 3
 | 
						|
The adi instruction.
 | 
						|
.PP
 | 
						|
In case of the
 | 
						|
.B
 | 
						|
adi
 | 
						|
.R
 | 
						|
2 (and
 | 
						|
.B
 | 
						|
sbi
 | 
						|
.R
 | 
						|
2) instruction there are many EM
 | 
						|
patterns, so that the instruction can be performed in line in
 | 
						|
most cases.
 | 
						|
For the worst case there exists a subroutine in the library
 | 
						|
which deals with the EM instruction.
 | 
						|
In case of a
 | 
						|
.B
 | 
						|
adi
 | 
						|
.R
 | 
						|
4 (or
 | 
						|
.B
 | 
						|
sbi
 | 
						|
.R
 | 
						|
4) there is only a subroutine to deal with it.
 | 
						|
A table content is:
 | 
						|
.sp 1
 | 
						|
.br
 | 
						|
.B
 | 
						|
lol lol adi
 | 
						|
.R
 | 
						|
(IN($1) && IN($2) && $3==2) | | /* is it in range */
 | 
						|
.br
 | 
						|
allocate(R16)	/* allocate registerpair AX */
 | 
						|
.br
 | 
						|
"ldy #BASE+$1+1" /* load Y with offset for first operand */
 | 
						|
.br
 | 
						|
"lda (LBl),y"	/* load indirect highbyte first operand */
 | 
						|
.br
 | 
						|
"pha"		/* save highbyte first operand on hard_stack */
 | 
						|
.br
 | 
						|
"dey"		/* decrement offset first operand */
 | 
						|
.br
 | 
						|
"lda (LBl),y"	/* load indirect lowbyte first operand */
 | 
						|
.br
 | 
						|
"ldy #BASE+$2"	/* load Y with offset for second operand */
 | 
						|
.br
 | 
						|
"clc"		/* clear carry for addition */
 | 
						|
.br
 | 
						|
"adc (LBl),y"	/* add the lowbytes of the operands */
 | 
						|
.br
 | 
						|
"tax"		/* store lowbyte of result in place */
 | 
						|
.br
 | 
						|
"iny"		/* increment offset second operand */
 | 
						|
.br
 | 
						|
"pla"		/* get highbyte first operand */
 | 
						|
.br
 | 
						|
"adc (LBl),y"	/* add the highbytes of the operands */
 | 
						|
.br
 | 
						|
| %[a] | |	/* push the result onto the fake stack */
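.PP
The byte-by-byte addition performed by this table entry can be modelled in a few lines of Python.
This is only an illustrative sketch: the names adc and add16 are hypothetical and appear nowhere in the table or the library.
It shows how the carry produced by adding the lowbytes is taken along when the highbytes are added.
.DS L
```python
def adc(a, operand, carry):
    """Model an 8-bit add with carry; returns (result byte, carry out)."""
    total = a + operand + carry
    return total & 0xFF, total >> 8


def add16(first, second):
    """Add two 16-bit words one byte at a time, lowbytes first."""
    # clc, then adc on the lowbytes
    low, carry = adc(first & 0xFF, second & 0xFF, 0)
    # adc on the highbytes takes the carry of the lowbytes along
    high, _ = adc(first >> 8, second >> 8, carry)
    return (high << 8) | low


print(hex(add16(0x12FF, 0x0001)))
```
.DE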
 | 
						|
.NH 3
 | 
						|
The mli instruction.
 | 
						|
.PP
 | 
						|
The
 | 
						|
.B
 | 
						|
mli
 | 
						|
.R
 | 
						|
2 instruction mostly uses the subroutine 'Mlinp'.
 | 
						|
This subroutine expects the multiplicand in zero page
 | 
						|
at locations ARTH, ARTH+1, while the multiplier is in zero
 | 
						|
page locations ARTH+2, ARTH+3.
 | 
						|
For a description of the algorithms used for multiplication and
 | 
						|
division, see [3].
 | 
						|
A table content is:
 | 
						|
.sp  1
 | 
						|
.br
 | 
						|
.B
 | 
						|
lol lol mli
 | 
						|
.R
 | 
						|
(IN($1) && IN($2) && $3==2) | |
 | 
						|
.br
 | 
						|
allocate(R16)	/* allocate registerpair AX */
 | 
						|
.br
 | 
						|
"ldy #BASE+$1"	/* load Y with offset of multiplicand */
 | 
						|
.br
 | 
						|
"lda (LBl),y"	/* load indirect lowbyte of multiplicand */
 | 
						|
.br
 | 
						|
"sta ARTH"	/* store lowbyte in zero page */
 | 
						|
.br
 | 
						|
"iny"		/* increment offset of multiplicand */
 | 
						|
.br
 | 
						|
"lda (LBl),y"	/* load indirect highbyte of multiplicand */
 | 
						|
.br
 | 
						|
"sta ARTH+1"	/* store highbyte in zero page */
 | 
						|
.br
 | 
						|
"ldy #BASE+$2"	/* load Y with offset of multiplier */
 | 
						|
.br
 | 
						|
"lda (LBl),y"	/* load indirect lowbyte of multiplier */
 | 
						|
.br
 | 
						|
"sta ARTH+2"	/* store lowbyte in zero page */
 | 
						|
.br
 | 
						|
"iny"		/* increment offset of multiplier */
 | 
						|
.br
 | 
						|
"lda (LBl),y"	/* load indirect highbyte of multiplier */
 | 
						|
.br
 | 
						|
"sta ARTH+3"	/* store highbyte in zero page */
 | 
						|
.br
 | 
						|
"jsr Mlinp"	/* perform the multiply */
 | 
						|
.br
 | 
						|
| %[a] | |	/* push result onto fake stack */
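.PP
The algorithm actually used by 'Mlinp' is described in [3] and is not reproduced here.
As a rough idea of what a multiply without a hardware instruction involves, the following Python sketch uses the generic shift-and-add method.
That choice, and the name mul16, are assumptions made for illustration only.
.DS L
```python
def mul16(multiplicand, multiplier):
    """Shift-and-add multiply; the result is truncated to 16 bits."""
    product = 0
    for _ in range(16):
        if multiplier & 1:
            # lowest multiplier bit is set: add the shifted multiplicand
            product = (product + multiplicand) & 0xFFFF
        multiplicand = (multiplicand << 1) & 0xFFFF
        multiplier >>= 1
    return product


print(mul16(300, 200))
```
.DE
.PP
The loop always runs sixteen times, which gives some feeling for why a multiplication is so much slower than an addition on this processor.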
 | 
						|
.NH 2
 | 
						|
The unsigned arithmetic instructions.
 | 
						|
.PP
 | 
						|
Since unsigned addition and subtraction are performed in the same way
 | 
						|
as signed addition and subtraction, these cases are dealt with by
 | 
						|
an EM replacement.
 | 
						|
For multiplication and division there are special subroutines.
 | 
						|
.NH 3
 | 
						|
Unsigned addition.
 | 
						|
.PP
 | 
						|
This is an example of the EM replacement strategy.
 | 
						|
.sp 1
 | 
						|
.br
 | 
						|
.B
 | 
						|
lol lol adu
 | 
						|
.R
 | 
						|
	| | | |
 | 
						|
.B
 | 
						|
lol
 | 
						|
.R
 | 
						|
$1
 | 
						|
.B
 | 
						|
lol
 | 
						|
.R
 | 
						|
$2
 | 
						|
.B
 | 
						|
adi
 | 
						|
.R
 | 
						|
$3 |
 | 
						|
.NH 2
 | 
						|
Floating point arithmetic.
 | 
						|
.PP
 | 
						|
Floating point arithmetic isn't implemented in this table.
 | 
						|
.NH 2
 | 
						|
Pointer arithmetic instructions.
 | 
						|
.PP
 | 
						|
A typical pointer arithmetic instruction is
 | 
						|
.B
 | 
						|
adp
 | 
						|
.R
 | 
						|
2.
 | 
						|
This instruction adds an offset and a pointer.
 | 
						|
A table content is:
 | 
						|
.sp 1
 | 
						|
.br
 | 
						|
.B
 | 
						|
adp
 | 
						|
.R
 | 
						|
	| | | |
 | 
						|
.B
 | 
						|
loc
 | 
						|
.R
 | 
						|
$1
 | 
						|
.B
 | 
						|
adi
 | 
						|
.R
 | 
						|
2 |
 | 
						|
.NH 2
 | 
						|
Increment, decrement and zero instructions.
 | 
						|
.PP
 | 
						|
In this group a typical instruction is
 | 
						|
.B
 | 
						|
inl
 | 
						|
.R
 | 
						|
, which increments a local or parameter.
 | 
						|
The MCS6500 doesn't have an instruction to increment the
 | 
						|
accumulator A, so the 'ADC' instruction must be used.
 | 
						|
A table content is:
 | 
						|
.sp 1
 | 
						|
.br
 | 
						|
.B
 | 
						|
inl
 | 
						|
.R
 | 
						|
IN($1) | |
 | 
						|
.br
 | 
						|
allocate(R16)	/* allocate registerpair AX */
 | 
						|
.br
 | 
						|
"ldy #BASE+$1"	/* load Y with offset of the local */
 | 
						|
.br
 | 
						|
"clc"		/* clear carry for addition */
 | 
						|
.br
 | 
						|
"lda (LBl),y"	/* load indirect lowbyte of local */
 | 
						|
.br
 | 
						|
"adc #1"	/* increment lowbyte */
 | 
						|
.br
 | 
						|
"sta (LBl),y"	/* restore indirect the incremented lowbyte */
 | 
						|
.br
 | 
						|
"bcc 1f"	/* if carry is clear then ready */
 | 
						|
.br 
 | 
						|
"iny"		/* increment offset of local */
 | 
						|
.br
 | 
						|
"lda (LBl),y"	/* load indirect highbyte of local */
 | 
						|
.br
 | 
						|
"adc #0"	/* add carry to highbyte */
 | 
						|
.br
 | 
						|
"sta (LBl),y\\n1:"  /* restore indirect the highbyte */
 | 
						|
.PP
 | 
						|
If the offset of the local or parameter is too big, the
local or parameter is first fetched, then incremented, and then
stored back.
 | 
						|
.NH 2
 | 
						|
Convert instructions.
 | 
						|
.PP
 | 
						|
In this case there are two convert instructions
 | 
						|
which really do something.
 | 
						|
One of them is in line code, and deals with the extension of
 | 
						|
a character (1-byte) to an integer.
 | 
						|
The other one is a subroutine which handles the conversion
 | 
						|
between 2-byte integers and 4-byte integers.
 | 
						|
.NH 3
 | 
						|
The in line conversion.
 | 
						|
.PP
 | 
						|
The table content is:
 | 
						|
.sp 1
 | 
						|
.br
 | 
						|
.B
 | 
						|
loc loc cii
 | 
						|
.R
 | 
						|
$1==1 && $2==2 | R16 |
 | 
						|
.br
 | 
						|
"txa"		/* see if sign extension is needed */
 | 
						|
.br
 | 
						|
"bpl 1f"	/* there is no need for sign extension */
 | 
						|
.br
 | 
						|
"lda #0FFh"	/* sign extension here */
 | 
						|
.br
 | 
						|
"bne 2f"	/* conversion ready */
 | 
						|
.br
 | 
						|
"1: lda #0\\n2:"	/* no sign extension here */
 | 
						|
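.PP
The effect of the in line conversion can be sketched in Python as follows.
The function name sign_extend is hypothetical; the sketch only mirrors the test on the sign bit of the lowbyte and the choice between 0FFh and 0 for the highbyte.
.DS L
```python
def sign_extend(lowbyte):
    """Extend a 1-byte value to a 2-byte word."""
    # bpl tests bit 7 of the lowbyte; the highbyte becomes 0FFh or 0
    highbyte = 0xFF if lowbyte & 0x80 else 0x00
    return (highbyte << 8) | lowbyte


print(hex(sign_extend(0x80)))
```
.DE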
.NH 2
 | 
						|
Logical instructions.
 | 
						|
.PP
 | 
						|
A typical instruction in this group is the logical
 | 
						|
.B
 | 
						|
and
 | 
						|
.R
 | 
						|
on two 2-byte words.
 | 
						|
The logical
 | 
						|
.B
 | 
						|
and
 | 
						|
.R
 | 
						|
on groups of more than two bytes (max 254)
 | 
						|
is also possible and uses a library subroutine.
 | 
						|
.NH 3
 | 
						|
The logical and on 2-byte groups.
 | 
						|
.PP
 | 
						|
The table content is:
 | 
						|
.sp 1
 | 
						|
.br
 | 
						|
.B
 | 
						|
and
 | 
						|
.R
 | 
						|
$1==2 | R16 |	/* one group must be on the fake stack */
 | 
						|
.br
 | 
						|
"sta ARTH+1"	/* temporary save of first group highbyte */
 | 
						|
.br
 | 
						|
"stx ARTH"	/* temporary save of first group lowbyte */
 | 
						|
.br
 | 
						|
"jsr Pop"	/* pop second group from the stack */
 | 
						|
.br
 | 
						|
"and ARTH+1"	/* logical and on highbytes */
 | 
						|
.br
 | 
						|
"pha"		/* temporary save the result's highbyte */
 | 
						|
.br
 | 
						|
"txa"		/* logical and can only be done in A */
 | 
						|
.br
 | 
						|
"and ARTH"	/* logical and on lowbytes */
 | 
						|
.br
 | 
						|
"tax"		/* restore results lowbyte */
 | 
						|
.br
 | 
						|
"pla"		/* restore results highbyte */
 | 
						|
.br
 | 
						|
| %[1] | |	/* push result onto fake stack */
 | 
						|
.NH 2
 | 
						|
Set manipulation instructions.
 | 
						|
.PP
 | 
						|
A typical EM pattern in this group is
 | 
						|
.B
 | 
						|
loc inn zeq
 | 
						|
.R
 | 
						|
$1>0 && $1<16 && $2==2.
 | 
						|
This EM pattern works on sets of 16 bits.
 | 
						|
Sets can be bigger (max 256 bytes = 2048 bits), but then a
 | 
						|
library routine is used instead of in line code.
 | 
						|
The table content of the above EM pattern is:
 | 
						|
.sp 1
 | 
						|
.br
 | 
						|
.B
 | 
						|
loc inn zeq
 | 
						|
.R
 | 
						|
$1>0 && $1<16 && $2==2 | R16 |
 | 
						|
.br
 | 
						|
"ldy #$1+1"	/* load Y with bit number */
 | 
						|
.br
 | 
						|
"stx ARTH"	/* cannot rotate X, so use zero page */
 | 
						|
.br
 | 
						|
"1: lsr a"	/* right shift A */
 | 
						|
.br
 | 
						|
"ror ARTH"	/* right rotate zero page location */
 | 
						|
.br
 | 
						|
"dey"		/* decrement Y */
 | 
						|
.br
 | 
"bne 1b"	/* shift $1+1 times */
.br
"bcc $3"	/* no carry, so bit is zero */
 | 
						|
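.PP
The shifting loop can be checked with a small Python model.
The name bit_is_set is hypothetical; the two variables high and low stand for register A and the zero page location ARTH.
After the word has been shifted right one time more than the bit number, the last bit shifted out, the carry, is the bit that was asked for.
.DS L
```python
def bit_is_set(word, bit):
    """Shift the 16-bit set right bit+1 times and return the final carry."""
    high, low = word >> 8, word & 0xFF
    carry = 0
    for _ in range(bit + 1):
        # lsr a: bit 0 of the highbyte moves into the carry
        carry_from_high = high & 1
        high >>= 1
        # ror: the old carry enters bit 7, bit 0 becomes the new carry
        carry = low & 1
        low = (low >> 1) | (carry_from_high << 7)
    return carry == 1


print(bit_is_set(0x0100, 8))
```
.DE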
.NH 2
 | 
						|
Array instructions.
 | 
						|
.PP
 | 
						|
In this group a typical EM pattern is
 | 
						|
.B
 | 
						|
lae lar
 | 
						|
.R
 | 
						|
defined(rom(1,3)) | | | |
 | 
						|
.B
 | 
						|
lae
 | 
						|
.R
 | 
						|
$1
 | 
						|
.B
 | 
						|
aar
 | 
						|
.R
 | 
						|
$2
 | 
						|
.B
 | 
						|
loi
 | 
						|
.R
 | 
						|
rom(1,3).
 | 
						|
This pattern uses the 
 | 
						|
.B
 | 
						|
aar
 | 
						|
.R
 | 
						|
instruction, which is part of a typical EM pattern:
 | 
						|
.sp 1
 | 
						|
.br
 | 
						|
.B
 | 
						|
lae aar
 | 
						|
.R
 | 
						|
$2==2 && rom(1,3)==2 && rom(1,1)==0 | R16 | /* registerpair AX contains
 | 
						|
the index in the array */
 | 
						|
.br
 | 
						|
"pha"		/* save highbyte of index */
 | 
						|
.br
 | 
						|
"txa"		/* move lowbyte of index to A */
 | 
						|
.br
 | 
						|
"asl a"		/* shift left lowbyte == 2 times lowbyte */
 | 
						|
.br
 | 
						|
"tax"		/* restore lowbyte */
 | 
						|
.br
 | 
						|
"pla"		/* restore highbyte */
 | 
						|
.br
 | 
						|
"rol a"		/* rotate left highbyte == 2 times highbyte */
 | 
						|
.br
 | 
						|
| %[1] | adi 2 | /* push new index, add to lowerbound array */
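.PP
Doubling the 16-bit index with an 8-bit shift followed by an 8-bit rotate can be modelled as below.
This is a sketch with a hypothetical function name; it only illustrates how the carry links the two byte operations.
.DS L
```python
def double_index(index):
    """Multiply a 16-bit index by two using a shift and a rotate."""
    low, high = index & 0xFF, index >> 8
    # asl a: bit 7 of the lowbyte moves into the carry
    carry = low >> 7
    low = (low << 1) & 0xFF
    # rol a: the carry enters bit 0 of the highbyte
    high = ((high << 1) | carry) & 0xFF
    return (high << 8) | low


print(hex(double_index(0x0080)))
```
.DE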
 | 
						|
.NH 2
 | 
						|
Compare instructions.
 | 
						|
.PP
 | 
						|
In this group all EM patterns are performed by calling
 | 
						|
a subroutine.
 | 
						|
Subroutines are used here because comparison is only
 | 
						|
possible byte by byte.
 | 
						|
This means a lot of code, and since compares are used frequently,
a lot of in line code would be generated, thus reducing
the space left for the software stack.
 | 
						|
These subroutines can be found in the library.
 | 
						|
.NH 2
 | 
						|
Branch instructions.
 | 
						|
.PP
 | 
						|
A typical branch instruction is
 | 
						|
.B
 | 
						|
beq.
 | 
						|
.R
 | 
						|
The table content for it is:
 | 
						|
.sp 1
 | 
						|
.br
 | 
						|
.B
 | 
						|
beq
 | 
						|
.R
 | 
						|
| R16 |
 | 
						|
.br
 | 
						|
"sta BRANCH+1"	/* save highbyte second operand in zero page */
 | 
						|
.br
 | 
						|
"stx BRANCH"	/* save lowbyte second operand in zero page */
 | 
						|
.br
 | 
						|
"jsr Pop"	/* pop the first operand */
 | 
						|
.br
 | 
						|
"cmp BRANCH+1" 	/* compare the highbytes */
 | 
						|
.br
 | 
"bne 1f"	/* they are not equal, so go on */
 | 
						|
.br
 | 
						|
"cpx BRANCH"	/* compare the lowbytes */
 | 
						|
.br
 | 
						|
"beq $1\\n1:"	/* lowbytes are also equal, so branch */
 | 
						|
.PP
 | 
						|
Another typical instruction in this group is
 | 
						|
.B
 | 
						|
zeq.
 | 
						|
.R
 | 
						|
The table content is:
 | 
						|
.sp 1
 | 
						|
.br
 | 
						|
.B
 | 
						|
zeq
 | 
						|
.R
 | 
						|
| R16 |
 | 
						|
.br
 | 
						|
"tay"		/* move A to Y for setting testbits */
 | 
						|
.br
 | 
"bne 1f"	/* highbyte is not zero, so do not branch */
 | 
						|
.br
 | 
						|
"txa"		/* move X to A for setting testbits */
 | 
						|
.br
 | 
						|
"beq $1\\n1:"	/* lowbyte also zero, thus branch */
 | 
						|
.NH 2
 | 
						|
Procedure call instructions.
 | 
						|
.PP
 | 
						|
In this group one code generation might seem a little
 | 
						|
awkward.
 | 
						|
It is the EM instruction
 | 
						|
.B
 | 
						|
cai
 | 
						|
.R
 | 
						|
which generates a 'jsr Indir'.
 | 
						|
This is because there is no indirect jump_subroutine in the
 | 
						|
MCS6500.
 | 
						|
The only solution is to store the address in zero page, and then
 | 
						|
do a 'jsr' to a known label.
 | 
						|
At this label there must be an indirect jump instruction, which
 | 
						|
performs a jump to the address stored in zero page.
 | 
						|
In this case the label is Indir, and the address is stored in
 | 
						|
zero page at the addresses ADDR, ADDR+1.
 | 
						|
The table content is:
 | 
						|
.sp 1
 | 
						|
.br
 | 
						|
.B
 | 
						|
cai
 | 
						|
.R
 | 
						|
| R16 |
 | 
						|
.br
 | 
						|
"stx ADDR"	/* store lowbyte of address in zero page */
 | 
						|
.br
 | 
						|
"sta ADDR+1"	/* store highbyte of address in zero page */
 | 
						|
.br
 | 
						|
"jsr Indir"	/* use the indirect jump */
 | 
						|
.br
 | 
						|
| | |
 | 
						|
.NH 2
 | 
						|
Miscellaneous instructions.
 | 
						|
.PP
 | 
						|
In this group, as the name suggests, there is no
 | 
						|
typical EM instruction or EM pattern.
 | 
						|
Most of the MCS6500 code to be generated uses a library subroutine
 | 
						|
or is straightforward.
 | 
						|
.DS C
 | 
						|
.B
 | 
						|
PERFORMANCE.
 | 
						|
.R
 | 
						|
.DE
 | 
						|
.NH 0
 | 
						|
Introduction.
 | 
						|
.PP
 | 
						|
To measure the performance of the back end table some timing
 | 
						|
tests are done.
 | 
						|
What to time?
 | 
						|
In this case, the execution times of several Pascal statements
are measured.
 | 
						|
C statements that have a Pascal equivalent are timed as well.
 | 
						|
The statements are timed as follows.
 | 
						|
A test program has been written, which executes two
nested for_loops from 1 to 1000.
 | 
						|
Within these for_loops the statement, which is to be tested, is placed,
 | 
						|
so the statement will be executed 1,000,000 times.
 | 
						|
Then the same program is executed without the test statement.
 | 
						|
The time difference between the two executions is the time
 | 
						|
necessary to execute the test statement 1,000,000 times.
 | 
						|
The time to execute the test statement once is thus the
time difference divided by 1,000,000.
 | 
						|
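.PP
The calculation described above is simple enough to write down as a Python sketch.
The function name and the two totals used in the example (50.0 and 30.0 seconds) are invented for illustration and are not measurements from this report.
.DS L
```python
def time_per_statement(with_statement, without_statement,
                       iterations=1000 * 1000):
    """Return the time of one execution, in the unit of the two totals."""
    return (with_statement - without_statement) / iterations


# Hypothetical totals in seconds, with and without the test statement.
print(time_per_statement(50.0, 30.0))
```
.DE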
.NH 0
 | 
						|
Testing Pascal statements.
 | 
						|
.PP
 | 
						|
The following statements are tested.
 | 
						|
.IP 1)
 | 
						|
int1 := 0;
 | 
						|
.IP 2)
 | 
						|
int1 := int2 - 1;
 | 
						|
.IP 3)
 | 
						|
int1 := int1 + 1;
 | 
						|
.IP 4)
 | 
						|
int1 := icon1 + icon2;
 | 
						|
.IP 5)
 | 
						|
int1 := icon2 div icon1;
 | 
						|
.IP 6)
 | 
						|
int1 := int2 * int3;
 | 
						|
.IP 7)
 | 
						|
bool := (int1 < 0);
 | 
						|
.IP 8)
 | 
						|
bool := (int1 < 3);
 | 
						|
.IP 9)
 | 
						|
bool := ((int1 > 3) or (int1 < 3))
 | 
						|
.IP 10)
 | 
						|
case int1 of 1: bool := false; 2: bool := true end;
 | 
						|
.IP 11)
 | 
						|
if int1 = 0 then int2 := 3;
 | 
						|
.IP 12)
 | 
						|
while int1 > 0 do int1 := int1 - 1;
 | 
						|
.IP 13)
 | 
						|
m := a[k];
 | 
						|
.IP 14)
 | 
						|
let2 := ['a'..'c'];
 | 
						|
.IP 15)
 | 
						|
P3(x);
 | 
						|
.IP 16)
 | 
						|
dum := F3(x);
 | 
						|
.IP 17)
 | 
						|
s.overhead := 5400;
 | 
						|
.IP 18)
 | 
						|
with s do overhead := 5400;
 | 
						|
.PP
 | 
						|
These statements were tested in a procedure test.
 | 
						|
.sp 1
 | 
						|
.br
 | 
						|
procedure test;
 | 
						|
.br
 | 
						|
var i, j, ... : integer;
 | 
						|
.br
 | 
						|
    bool : boolean;
 | 
						|
.br
 | 
						|
    let2 : set of char;
 | 
						|
.br
 | 
						|
begin
 | 
						|
.br
 | 
						|
    for i := 1 to 1000 do
 | 
						|
.br
 | 
						|
	for j := 1 to 1000 do
 | 
						|
.br
 | 
						|
	    STATEMENT
 | 
						|
.br
 | 
						|
end;
 | 
						|
.sp 1
 | 
						|
.PP
 | 
						|
STATEMENT is one of the statements as shown above, or it is
 | 
						|
the empty statement.
 | 
						|
The assignment of used variables, if necessary, is done before
 | 
						|
the first for_loop.
 | 
						|
In case of the statement which uses the procedure call, statement
 | 
						|
15, a dummy procedure is declared whose body is empty.
 | 
						|
In case of the statement which uses the function, statement 16,
 | 
						|
this function returns its argument.
 | 
						|
For the timing of C statements a similar test program was
 | 
						|
written.
 | 
						|
.sp 1
 | 
						|
.br
 | 
						|
main()
 | 
						|
.br
 | 
						|
{
 | 
						|
.br
 | 
						|
    int i, j, ...;
 | 
						|
.br
 | 
						|
    for (i = 1; i <= 1000; i++)
 | 
						|
.br
 | 
						|
	for (j = 1; j <= 1000; j++)
 | 
						|
.br
 | 
						|
	    STATEMENT
 | 
						|
.br
 | 
						|
}
 | 
						|
.sp 1
 | 
						|
.NH
 | 
						|
The results.
 | 
						|
.PP
 | 
						|
Here are tables with the results of the time measurements.
 | 
						|
Times are in microseconds (10^-6 seconds).
 | 
						|
Some statements appear twice in the tables.
 | 
						|
In the second case an array of 200 integers was declared
 | 
						|
before the variable to be tested, so this variable cannot
 | 
						|
be accessed by indirect addressing from the second local base.
 | 
						|
This results in a larger execution time of the statement to be
 | 
						|
tested.
 | 
						|
The column 68000 contains the times measured on a Bleasdale,
 | 
						|
M68000 based, computer.
 | 
						|
The times in column pdp are measured on a DEC pdp11/44, whereas
 | 
						|
the times from column 6500 come from a BBC microcomputer.
 | 
						|
.bp
 | 
						|
.TS
 | 
						|
expand;
 | 
						|
c s s s
 | 
						|
c c c c
 | 
						|
lw35 nw7 nw7 nw7.
 | 
						|
Pascal timing results
 | 
						|
statement	68000	pdp	6500
 | 
						|
_
 | 
						|
T{
 | 
						|
int1 := 0;
 | 
						|
T}	4.0	5.8	16.7
 | 
						|
 	4.0	4.2	97.8
 | 
						|
_
 | 
						|
T{
 | 
						|
int1 := int2 - 1;
 | 
						|
T}	7.2	7.1	27.2
 | 
						|
 	6.9	7.1	206.5
 | 
						|
_
 | 
						|
T{
 | 
						|
int1 := int1 + 1;
 | 
						|
T}	6.9	6.8	27.2
 | 
						|
 	6.4	6.7	106.5
 | 
						|
_
 | 
						|
T{
 | 
						|
int1 := icon1 + icon2;
 | 
						|
T}	6.2	6.2	25.6
 | 
						|
 	6.2	6.0	106.6
 | 
						|
_
 | 
						|
T{
 | 
						|
int1 := icon2 div icon1;
 | 
						|
T}	14.9	14.3	372.6
 | 
						|
 	14.9	14.7	453.7
 | 
						|
_
 | 
						|
T{
 | 
						|
int1 := int2 * int3;
 | 
						|
T}	11.5	12.0	558.1
 | 
						|
 	11.3	11.6	728.6
 | 
						|
_
 | 
						|
T{
 | 
						|
bool := (int1 < 0);
 | 
						|
T}	7.2	6.9	122.8
 | 
						|
 	7.8	8.1	453.2
 | 
						|
_
 | 
						|
T{
 | 
						|
bool := (int1 < 3);
 | 
						|
T}	7.3	7.6	126.0
 | 
						|
 	7.2	8.1	232.2
 | 
						|
_
 | 
						|
T{
 | 
						|
bool := ((int1 > 3) or (int1 < 3))
 | 
						|
T}	10.1	12.0	307.8
 | 
						|
 	10.2	11.9	440.1
 | 
						|
_
 | 
						|
T{
 | 
						|
case int1 of 1: bool := false; 2: bool := true end;
 | 
						|
T}	18.3	17.9	165.7
 | 
						|
_
 | 
						|
T{
 | 
						|
if int1 = 0 then int2 := 3;
 | 
						|
T}	9.5	8.5	133.8
 | 
						|
_
 | 
						|
T{
 | 
						|
while int1 > 0 do int1 := int1 - 1;
 | 
						|
T}	6.9	6.9	126.0
 | 
						|
_
 | 
						|
T{
 | 
						|
m := a[k];
 | 
						|
T}	7.2	6.8	134.3
 | 
						|
_
 | 
						|
T{
 | 
						|
let2 := ['a'..'c'];
 | 
						|
T}	38.4	38.8	447.4
 | 
						|
_
 | 
						|
T{
 | 
						|
P3(x);
 | 
						|
T}	18.9	18.8	180.3
 | 
						|
_
 | 
						|
T{
 | 
						|
dum := F3(x);
 | 
						|
T}	26.8	27.1	343.3
 | 
						|
_
 | 
						|
T{
 | 
						|
s.overhead := 5400;
 | 
						|
T}	4.6	4.1	16.7
 | 
						|
_
 | 
						|
T{
 | 
						|
with s do overhead := 5400;
 | 
						|
T}	4.2	4.3	16.7
 | 
						|
.TE
 | 
						|
.TS
 | 
						|
expand;
 | 
						|
c s s s
 | 
						|
c c c c
 | 
						|
lw35 nw7 nw7 nw7.
 | 
						|
C timing results
 | 
						|
statement	68000	pdp	6500
 | 
						|
_
 | 
						|
T{
 | 
						|
int1 = 0;
 | 
						|
T}	4.1	3.6	17.2
 | 
						|
 	4.1	4.1	97.7
 | 
						|
_
 | 
						|
T{
 | 
						|
int1 = int2 - 1;
 | 
						|
T}	6.6	6.9	27.2
 | 
						|
 	6.1	6.5	206.4
 | 
						|
_
 | 
						|
T{
 | 
						|
int1 = int1 + 1;
 | 
						|
T}	6.4	7.3	27.2
 | 
						|
 	6.3	6.2	206.4
 | 
						|
_
 | 
						|
T{
 | 
						|
int1 = int2 * int3;
 | 
						|
T}	11.4	12.3	522.6
 | 
						|
	9.6	10.1	721.2
 | 
						|
_
 | 
						|
T{
 | 
						|
int1 = (int2 < 0);
 | 
						|
T}	7.2	7.6	126.4
 | 
						|
 	7.4	7.7	232.5
 | 
						|
_
 | 
						|
T{
 | 
						|
int1 = (int2 < 3);
 | 
						|
T}	7.0	7.5	126.0
 | 
						|
 	7.8	7.8	232.6
 | 
						|
_
 | 
						|
T{
 | 
						|
int1 = ((int2 > 3) || (int2 < 3));
 | 
						|
T}	11.8	12.2	193.4
 | 
						|
 	11.5	13.2	245.6
 | 
						|
_
 | 
						|
T{
 | 
						|
switch (int1) { case 1: int1 = 0; break; case 2: int1 = 1; break; }
 | 
						|
T}	28.3	29.2	164.1
 | 
						|
_
 | 
						|
T{
 | 
						|
if (int1 == 0) int2 = 3;
 | 
						|
T}	4.8	4.8	19.4
 | 
						|
_
 | 
						|
T{
 | 
						|
while (int2 > 0) int2 = int2 - 1;
 | 
						|
T}	5.8	6.0	125.9
 | 
						|
_
 | 
						|
T{
 | 
						|
int2 = a[int2];
 | 
						|
T}	4.8	5.1	192.8
 | 
						|
_
 | 
						|
T{
 | 
						|
P3(int2);
 | 
						|
T}	18.8	18.4	180.3
 | 
						|
_
 | 
						|
T{
 | 
						|
int2 = F3(int2);
 | 
						|
T}	27.0	27.2	309.4
 | 
						|
_
 | 
						|
T{
 | 
						|
s.overhead = 5400;
 | 
						|
T}	5.0	4.1	16.7
 | 
						|
.TE
 | 
						|
.NH
 | 
						|
Pascal statements which don't have a C equivalent.
 | 
						|
.PP
 | 
						|
First, the two statements which perform an operation on constants
are left out.
These are left out because the C front end does constant folding,
while the Pascal front end doesn't.
 | 
						|
So in C the statements int1 = icon1 + icon2; and int1 = icon1 / icon2;
 | 
						|
will use the same amount of time since the expression is evaluated
 | 
						|
by the front end.
 | 
						|
The two other statements (let2 := ['a'..'c']; and
 | 
						|
.B
 | 
						|
with
 | 
						|
.R
 | 
						|
s
 | 
						|
.B
 | 
						|
do
 | 
						|
.R
 | 
						|
overhead := 5400;), aren't included in the C statement timing table,
 | 
						|
because these constructs do not exist in C.
 | 
						|
Although C allows direct bit manipulation, which can
be used to implement sets, I have not used it here.
 | 
						|
The
 | 
						|
.B
 | 
						|
with
 | 
						|
.R
 | 
						|
statement does not exist in C and there is nothing with the slightest
 | 
						|
resemblance to it.
 | 
						|
.PP
 | 
						|
At first sight, the tables suggest that there is not much difference
between the times for the M68000 and the pdp11/44, in comparison with the
times needed by the MCS6500.
 | 
						|
To verify this impression, I calculated the correlation coefficient
 | 
						|
between the times of the M68000 and pdp11/44.
 | 
						|
It turned out to be 0.997 for both the Pascal time tests and the C
 | 
						|
time tests.
 | 
						|
Since the correlation coefficient is near to one and the difference
 | 
						|
between the times is small, they can be considered to be the same
 | 
						|
when compared with the times of the MCS6500.
 | 
						|
Then I tried to plot the times of the M68000 against those of
the MCS6500.
No correlation could be seen when all the times are taken together.
 | 
						|
The only correlation one could see, with some effort, was in the
 | 
						|
times for the first three Pascal statements.
 | 
						|
The first two C statements also show a correlation, as two points
always do.
 | 
						|
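.PP
For readers who want to repeat the calculation, a correlation coefficient can be computed as in the Python sketch below.
It uses the usual Pearson formula, which is assumed here to be the coefficient meant in the text; the function name is hypothetical.
The sample data are the first three Pascal statements from the columns 68000 and 6500 of the timing table.
.DS L
```python
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# First three Pascal statements: columns 68000 and 6500.
m68000 = [4.0, 7.2, 6.9]
mcs6500 = [16.7, 27.2, 27.2]
print(round(correlation(m68000, mcs6500), 3))
```
.DE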
.PP
 | 
						|
Also the three Pascal statements
 | 
						|
.B
 | 
						|
case
 | 
						|
.R
 | 
						|
,
 | 
						|
.B
 | 
						|
if
 | 
						|
.R
 | 
						|
,
 | 
						|
and
 | 
						|
.B
 | 
						|
while
 | 
						|
.R
 | 
						|
have a correlation coefficient of 0.999.
 | 
						|
This is probably because the
 | 
						|
.B
 | 
						|
case
 | 
						|
.R
 | 
						|
statement uses a subroutine in both cases and the other two
 | 
						|
statements
 | 
						|
.B
 | 
						|
if
 | 
						|
.R
 | 
						|
and
 | 
						|
.B
 | 
						|
while
 | 
						|
.R
 | 
						|
generate in line code.
 | 
						|
The last two Pascal statements use the same time, since the front
 | 
						|
end will generate the same EM code for both.
 | 
						|
.PP
 | 
						|
The independence between the rest of the test times is because
 | 
						|
in these cases the object code for the MCS6500 uses library
 | 
						|
subroutines, while the other processors can handle the EM code
 | 
						|
with in line code.
 | 
						|
.PP
 | 
						|
It is clear that the MCS6500 is a slower device: it needs longer
execution times and more library subroutines, but
there is no constant factor between its execution times and those
of the other processors.
 | 
						|
.PP
 | 
						|
The slowdown of the MCS6500 caused by the need for a
library subroutine is illustrated by the multiplication
statement.
 | 
						|
The MCS6500 needs a library subroutine, while the other
 | 
						|
two processors have a machine instruction to perform the
 | 
						|
multiply.
 | 
						|
This results in a factor of 48.5, when the operands can be accessed
 | 
						|
indirectly by the MCS6500.
 | 
						|
When the MCS6500 cannot access the operands indirectly the situation
 | 
						|
is even worse.
 | 
						|
The slight differences between the MCS6500 execution times for
 | 
						|
Pascal statements and C statements are probably the result of the
 | 
						|
front end, and thus beyond the scope of this discussion.
 | 
						|
.PP
 | 
						|
Another timing test is done in C on the statement k = i + j + 1983.
 | 
						|
This statement is tested on many UNIX*
 | 
						|
.FS
 | 
						|
* UNIX is a Trademark of Bell Laboratories.
 | 
						|
.FE
 | 
						|
systems.
 | 
						|
For a complete list see appendix A.
 | 
						|
The slowest one is the IBM XT, which runs on an 8088 microprocessor.
 | 
						|
The fastest one is the Amdahl computer.
 | 
						|
Here is a short table to illustrate the performance of the
 | 
						|
MCS6500.
 | 
						|
.TS
 | 
						|
c c c
 | 
						|
c n n.
 | 
						|
machine	short	int
 | 
						|
IBM XT	53.4	53.4
 | 
						|
Amdahl	0.5	0.3
 | 
						|
MCS6500	150.2	150.2
 | 
						|
.TE
 | 
						|
The MCS6500 is about three times slower than the IBM XT, but three hundred
 | 
						|
times slower than the Amdahl.
 | 
						|
The reason why the times on the IBM XT and the MCS6500 are the
 | 
						|
same for shorts and ints is that most C compilers make the types
short and int the same size on 16-bit machines.
 | 
						|
In this project the MCS6500 is regarded as a 16-bit machine.
 | 
						|
.NH
 | 
						|
Length tests.
 | 
						|
.PP
 | 
						|
I have also compiled several programs written in Pascal and C to
 | 
						|
see if there is a resemblance between the number of bytes generated
 | 
						|
in the machine's language.
 | 
						|
In the tables:
 | 
						|
.IP length: 9
 | 
						|
The number of bytes of the source program.
 | 
						|
.IP 68000:
 | 
						|
The number of bytes of the a.out file for a M68000.
 | 
						|
.IP pdp:
 | 
						|
The number of bytes of the a.out file for a pdp11/44.
 | 
						|
.IP 6500:
 | 
						|
The number of bytes of the a.out file for a MCS6500.
 | 
						|
.LP
 | 
						|
These are the results:
 | 
						|
.TS
 | 
						|
c s s s
 | 
						|
c c c c
 | 
						|
n n n n.
 | 
						|
Pascal programs
 | 
						|
length	68000	pdp	6500
 | 
						|
_
 | 
						|
19946	14383	16090	26710
 | 
						|
19484	20169	20190	35416
 | 
						|
10849	10469	11464	18949
 | 
						|
273	4221	5106	7944
 | 
						|
1854	5807	6610	10301
 | 
						|
.TE
 | 
						|
.TS
 | 
						|
c s s s
 | 
						|
c c c c
 | 
						|
n n n n.
 | 
						|
C programs
 | 
						|
length	68000	pdp	6500
 | 
						|
_
 | 
						|
9444	6927	8234	11559
 | 
						|
7655	14353	18240	26251
 | 
						|
4775	11309	15934	19910
 | 
						|
639	6337	9660	12494
 | 
						|
.TE
 | 
						|
.PP
 | 
						|
In contrast to the execution times of the test statements, the
 | 
						|
object code file sizes show a constant factor between them.
 | 
						|
After calculating the correlation coefficient, I have calculated
 | 
						|
the line fitted between sizes.
 | 
						|
.FS
 | 
						|
* x is the number of bytes
 | 
						|
.FE
 | 
						|
.TS
 | 
						|
c s s
 | 
						|
c c c
 | 
						|
l c c.
 | 
						|
Pascal programs
 | 
						|
processor	corr. coef.	fitted line
 | 
						|
_
 | 
						|
68000-pdp	0.996	 
 | 
						|
68000-6500	0.999	1.76x + 502*
 | 
						|
pdp-6500	0.999	1.80x - 1577
 | 
						|
.TE
 | 
						|
.TS
 | 
						|
c s s
 | 
						|
c c c
 | 
						|
l c c.
 | 
						|
C programs
 | 
						|
processor	corr. coef.	fitted line
 | 
						|
_
 | 
						|
68000-pdp	0.974	 
 | 
						|
68000-6500	0.992	1.80x + 502*
 | 
						|
pdp-6500	0.980	1.40x - 1577
 | 
						|
.TE
 | 
						|
.PP
 | 
						|
As seen from the tables above, the correlation coefficients for
Pascal programs are better than those for C programs.
 | 
						|
Thus the line fits best for Pascal programs.
 | 
						|
With the formula of the best fitted line one can now estimate
 | 
						|
the size of the object code, which a program needs, for a MCS6500
 | 
						|
without having the compiler at hand.
 | 
						|
One can also see from these formulas that the object code
 | 
						|
generated for a MCS6500 is about 1.8 times larger than for the other
 | 
						|
processors.
 | 
						|
Since the number of bytes in the source file heavily depends on the
 | 
						|
programmer, how many spaces he or she uses, the size of the indenting
 | 
						|
in structured programs, etc., there is no correlation between the
 | 
						|
size of the source file and the size of the object file.
 | 
						|
Also the use of comments has its influence on the size.
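.PP
Applying a fitted line is a one-line calculation, sketched here in Python.
The function name is hypothetical; the default slope and intercept are those of the 68000-6500 line in the table for Pascal programs.
.DS L
```python
def estimate_6500_size(size_68000, slope=1.76, intercept=502):
    """Estimate the MCS6500 object size from the M68000 object size."""
    return slope * size_68000 + intercept


# Third Pascal program in the table: 10469 bytes on the M68000.
print(round(estimate_6500_size(10469)))
```
.DE
.PP
For this program the line gives about 18927 bytes, against the 18949 bytes listed in the column 6500.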
 | 
						|
.bp
.DS C
.B
SUMMARY.
.R
.DE
.NH 0
Summary
.PP
In this chapter some final conclusions are drawn.
.PP
In spite of its simplicity, the MCS6500 is powerful enough to
implement an EM machine.
A serious deficiency of the MCS6500 is the lack of 16-bit
general purpose registers, and especially the lack of a
16-bit stack pointer.
As pointed out before, one 16-bit register can be simulated
by a pair of 8-bit registers: the accumulator A holds
the high byte and the index register X holds the low byte
of the word.
For lack of a 16-bit stack pointer, zero page must be used to hold
a stack pointer, and two subroutines (Push and Pop) are needed
for manipulating the stack.
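.PP
The idea of a stack pointer kept in zero page can be modelled in C.
This is a sketch under stated assumptions, not the library code of
the back end: the zero page locations SP_LO and SP_HI, the downward
growth of the stack and the order of the two bytes on the stack
are all chosen here for illustration only.
.DS
```c
#include <stdint.h>

static uint8_t mem[65536];            /* model of the 64K address space */
enum { SP_LO = 0x00, SP_HI = 0x01 };  /* hypothetical zero page locations */

/* Read the 16-bit stack pointer from its two zero page bytes. */
static uint16_t get_sp(void)
{
    return (uint16_t)(mem[SP_LO] | (mem[SP_HI] << 8));
}

/* Store the 16-bit stack pointer into its two zero page bytes. */
static void set_sp(uint16_t sp)
{
    mem[SP_LO] = (uint8_t)(sp & 0xFF);
    mem[SP_HI] = (uint8_t)(sp >> 8);
}

/* Push a word whose high byte is in A and whose low byte is in X.
   Assumption: the stack grows towards lower addresses. */
static void push(uint8_t a, uint8_t x)
{
    uint16_t sp = (uint16_t)(get_sp() - 2);
    mem[sp] = x;                      /* low byte at the lower address */
    mem[(uint16_t)(sp + 1)] = a;      /* high byte above it */
    set_sp(sp);
}

/* Pop a word: high byte goes to *a, low byte to *x. */
static void pop(uint8_t *a, uint8_t *x)
{
    uint16_t sp = get_sp();
    *x = mem[sp];
    *a = mem[(uint16_t)(sp + 1)];
    set_sp((uint16_t)(sp + 2));
}
```
.DE
.PP
Every push or pop updates two separate bytes of the stack pointer,
which suggests why these operations are placed in subroutines
rather than expanded in line.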
.PP
As the timing tests showed, the simple instruction set of the
MCS6500 forces the use of library subroutines.
These library subroutines increase the execution time of the
programs.
.PP
The sizes of the object code files show a strong correlation,
in contrast to the execution times.
With this correlation one can estimate the size of a program
if it is to be used on an MCS6500.
.bp
.NH 0
.B
REFERENCES.
.R
.IP 1.
Osborne, A., Jacobson, S., and Kane, J. The MOS Technology MCS6500.
.B
An Introduction to Microcomputers,
.R
Volume II, Some Real Products (June 1977) chap. 9.
.RS
.PP
This book gives a hardware description of some existing CPUs,
such as the Zilog Z80, the MCS6500, etc.
.RE
.IP 2.
van Staveren, H.
The table driven code generator from the Amsterdam Compiler Kit.
Vrije Universiteit, Amsterdam, (July 11, 1983).
.RS
.PP
The defining document for writing a back end table.
.RE
.IP 3.
Tanenbaum, A.S. Structured Computer Organization.
Prentice Hall. (1976).
.RS
.PP
In this book computers are described as a hierarchy of levels,
with each one performing some well-defined function.
.RE