Initial entry

1988-06-22 21:48:19 +00:00 · 1988-06-22 21:48:19 +00:00 · 6214be89c8
commit 6214be89c8
parent b72f2848dd
10 changed files with 1837 additions and 0 deletions
--- a/doc/int/Makefile
+++ b/doc/int/Makefile
@ -0,0 +1,16 @@
 # $Header$
 TBL=/usr/ditroff/tbl
 DOC =		draw.mac cover txt1 txt2 txt3 appA appB bib
 int.doc:	$(DOC)
 		$(TBL) $(DOC) > $@
 FLS =		README .distr Makefile int.1 $(DOC)
 .distr:		Makefile
 		echo $(FLS) | tr ' ' '\012' >.distr
 clean:
 		rm -f int.doc
--- a/doc/int/README
+++ b/doc/int/README
@ -0,0 +1,4 @@
 # $Header$
 This directory contains the text of the documentation for the
 Production Quality Interpreter "int".
--- a/doc/int/appA
+++ b/doc/int/appA
@ -0,0 +1,280 @@
 .\"	List of all warnings; source of warn_msg and warn.h
 .\"
 .\"	$Header$
 .\"
 .\"	This file contains the warnings issued by the interpreter, together
 .\"	with their names and values in the code of the interpreter. Some of
 .\"	the source files of the interpreter are generated from the Wn
 .\"	macros in this file.
 .\"	When modifying this file, preserve the parameters of the Wn macros.
 .de Wn	\" <text> <define> <value>
 .IP \\$3. 7
 .B "\\$1"
 .br
 ..  Wn
 .bp
 .DS C
 APPENDIX A
 .DE
 .SH
 List of Warnings.
 .PP
 The shadow-byte administration makes it possible to check for a
 wide range of errors during run-time.
 We have tried to make the diagnostics self-explanatory and especially useful
 for the C-programmer.
 The warnings are printed in the message file, together with source file
 and line number.
 The complete list of warnings is presented here, followed by an
 explanation of what might be wrong.
 Often, these explanations implicitly assume that the program
 being interpreted was originally written in C (and not Pascal, Basic etc.).
 .LP
 .I "Reading the load file"
 .Wn "Floating point instructions flag in header ignored" WFLUSED 1
 .Wn "No float initialisation in this version" WFLINIT 2
 The interpreter was compiled with the NOFLOAT option; code involving
 floating point operations can be run as long as the actual
 instructions are avoided.
 .Wn "Extra-test flag in header ignored" WEXTRIGN 4
 The interpreter already tests anything conceivable.
 .Wn "Maximum line number in header was 0" WNLINEZR 5
 This number could be used to allocate tables for tallying; these tables are,
 however, expanded as needed, so the number is immaterial.
 .Wn "Bad float initialisation" WBADFLOAT 7
 The loadfile contains a floating point denotation which does not
 satisfy the syntax (see 2.6).
 Examining the loadfile (with \fBod \-c\fP) might show the syntax error.
 Probably there is a bug in the front-end, creating floats with
 a bad syntax.
 .LP
 .I "System calls"
 .Wn "IOCTL \- bad or unimplemented request" WBADIOCTL 11
 The second parameter to the ioctl() request (the operation code) is invalid or
 not implemented; since there are many different opcodes on the various UNIX
 systems, it is difficult to tell which.  The system call fails.
 .Wn "MPXCALL \- not (yet) implemented" WMPXIMP 14
 .Wn "PROFIL \- not (yet) implemented" WPROFILIMP 15
 .Wn "PTRACE \- not (yet) implemented" WPTRACEIMP 16
 The monitor calls \fImpxcall()\fP, \fIprofil()\fP and \fIptrace()\fP
 have not been implemented.  The monitor call fails.
 .Wn "Inaccessible memory in system call" WMONFLT 21
 Bad pointers passed to system calls do not cause a memory fault (which in UNIX
 would happen to the kernel), but cause the system call to fail with the UNIX
 variable errno set to 14 (EFAULT).  It seems likely that your program is at
 fault, but there is also a good possibility that a library routine made
 unwarranted assumptions about word size and pointer size.
 .Wn "READ \- buffer resides in unallocated memory" WRUMEM 23
 .Wn "READ \- buffer across global data area and heap" WRGDAH 24
 When the buffer passed to the read() system call is situated (completely
 or partially) in unallocated memory (beyond \fIHP\fP) or begins
 in the global data area and ends in the heap, the appropriate warning
 is given.
 The buffer is not written.
 .Wn "WRITE \- buffer resides in unallocated memory" WWUMEM 25
 .Wn "WRITE \- buffer across global data area and heap" WWGDAH 26
 .Wn "WRITE \- (part of) global buffer is undefined" WWGUNDEF 27
 .Wn "WRITE \- (part of) local buffer is undefined" WWLUNDEF 28
 The first two are equivalent to the READ-errors above.
 Writing out a buffer usually makes no sense when the contents are undefined,
 so one of the latter two warnings will be generated in this case.
 A global buffer resides in the data partition; a local buffer resides in
 the stack partition.
 This corresponds to global and local variables in a C-program.
 In the first two cases the WRITE is not performed, in the latter two cases
 it is.
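 .LP
 As an illustration only, the two address checks described for READ and
 WRITE can be sketched in C as follows.
 The function, the enum and the simplified layout (global data area from
 address 0 up to \fIgda_end\fP, heap from there up to \fIHP\fP) are
 assumptions made for this sketch, not code taken from the interpreter.
 .DS
```c
#include <stddef.h>

/* Hypothetical classification of a buffer passed to read() or write(). */
enum buf_class { BUF_OK, BUF_UNALLOCATED, BUF_ACROSS_GDA_HEAP };

/*
 * Assumed layout: global data area in [0, gda_end), heap in [gda_end, hp).
 * Addresses at or beyond hp count as unallocated; the stack is ignored.
 */
enum buf_class classify_buffer(size_t start, size_t len,
                               size_t gda_end, size_t hp)
{
    size_t end = start + len;           /* one past the last byte */

    if (end > hp)
        return BUF_UNALLOCATED;         /* completely or partially beyond HP */
    if (start < gda_end && end > gda_end)
        return BUF_ACROSS_GDA_HEAP;     /* begins in the GDA, ends in the heap */
    return BUF_OK;
}
```
 .DE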
 .LP
 .I "Traps and signals"
 .Wn "SIGTRP \- bad signo argument" WILLSN 31
 The \fIsigtrp()\fP monitor call allows \fIsig_no\fP arguments in the
 range [1..17] (UNIX Version 7 signals); the actual argument is out of range.
 .Wn "SIGTRP \- signo argument is a synchronous trap" WUNIXTR 32
 The signal is one that can only be caused synchronously by the running program
 on UNIX; it cannot occur to an interpreted program.
 .Wn "SIGTRP \- bad trapno argument" WILLTN 33
 The \fIsigtrp()\fP monitor call allows \fItrap_no\fP arguments between 0 and
 252, and the special values \-2 and \-3; the actual argument is not one of
 these.
 .Wn "Heap overflow due to command line limitation" WEHEAP 36
 .Wn "Stack overflow due to command line limitation" WESTACK 37
 The maximum sizes of the heap and the stack can be limited by options on the
 command line.  If overflow occurs due to such limitations, the corresponding
 trap is taken, preceded by one of the above warnings.  If the memory of the
 interpreter itself is exhausted, a fatal error follows.
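 .LP
 The argument ranges quoted for \fIsigtrp()\fP above amount to two simple
 range tests.
 The following C sketch spells them out; the function names are invented
 for the example, and the bounds 0 and 252 are assumed to be inclusive.
 .DS
```c
#include <stdbool.h>

/* sig_no must lie in [1..17], as quoted for warning 31. */
bool sig_no_in_range(int sig_no)
{
    return sig_no >= 1 && sig_no <= 17;
}

/*
 * trap_no must lie between 0 and 252 (assumed inclusive) or be one of the
 * special values -2 and -3, as quoted for warning 33.
 */
bool trap_no_acceptable(int trap_no)
{
    return (trap_no >= 0 && trap_no <= 252) || trap_no == -2 || trap_no == -3;
}
```
 .DE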
 .LP
 .I "Run-time type checking"
 .Wn "Local character expected" WLCEXP 41
 .Wn "Global character expected" WGCEXP 42
 .Wn "Local integer expected" WLIEXP 43
 .Wn "Global integer expected" WGIEXP 44
 .Wn "Local float expected" WLFEXP 45
 .Wn "Global float expected" WGFEXP 46
 .Wn "Local data pointer expected" WLDPEXP 47
 .Wn "Global data pointer expected" WGDPEXP 48
 .Wn "Local instruction pointer expected" WLIPEXP 49
 .Wn "Global instruction pointer expected" WGIPEXP 50
 In general, a type violation has taken place when one of
 these warnings is given.
 The \fBfloat\fP- and \fBinstruction pointer\fP warnings are rare and will
 usually be easy to trace.
 \fBInteger/character expected\fP will normally occur when unsigned arithmetic
 is performed on datapointers or when memory containing objects other than
 integers is copied bytewise.
 Often, this warning is followed by a warning \fBdatapointer expected\fP.
 This is due to our decision of transforming pointers to (unsigned) integers
 after doing unsigned arithmetic on them.
 When such a transformed integer is dereferenced (as if it were a pointer)
 or, in general, when it is treated as a pointer, this results in a warning.
 The present library implementation of malloc() causes such a
 sequence of errors.
 .LP
 These messages are always followed by a tentative description of what is found
 in memory at the offending place.
 .Wn "Actual memory is undefined" WWASUND 61
 .Wn "Actual memory contains an integer" WWASINT 62
 .Wn "Actual memory contains a float" WWASFLOAT 63
 .Wn "Actual memory contains a data pointer" WWASDATAP 64
 .Wn "Actual memory contains an instruction pointer" WWASINSP 65
 .Wn "Actual memory contains mixed information" WWASMISC 66
 If the contents of the area were undefined,
 check the source code for an uninitialized variable of the mentioned type.
 Officially, the use of an undefined value
 should result in an EIUND or EFUND trap but the occurrence is
 so common that a warning is more appropriate.
 The contents of memory are described as mixed if the data consists of pieces
 of different types.  This happens, e.g., when caller and callee do not agree on
 the types and lengths of the parameters.
 .LP
 .I "Protection"
 .br
 .Wn "Destroying contents of ROM (at or near loc 0)" WDESROM 71
 The program stores a value in Read-Only Memory; the only ROM in the present
 implementation is the area near location 0.  The warning probably results from
 storing under a NULL pointer.  This is only a warning; the store operation is
 executed normally.  Reads from location 0 are not detected.
 .Wn "Destroying contents of Return Status Block" WDESRSB 72
 The Return Status Block is the stack area containing the return address, the
 dynamic link, etc.
 This may or may not be an error.
 The current implementation of \fIsetjmp()\fP/\fIlongjmp()\fP
 may be responsible for it.
 If your program does not use setjmp(), there \fIis\fP something
 very wrong (e.g. argument for ASP too large).
 Note that there are some library routines (such as \fIalarm()\fP) which
 use \fIsetjmp()\fP.
 .Wn "Logical operation using undefined operand(s)" WUNLOG 81
 .Wn "Comparing undefined operand(s)" WUNCMP 82
 The logical operations AND, XOR, IOR, COM and the compare operation
 CMS do their jobs bytewise.
 If one of the bytes is found to be undefined, the corresponding warning
 is given, and the operation is stopped immediately.
 The stack is adjusted so interpretation may continue.
 .br
 It is hard to say what went wrong.
 Possibly, the argument of the instruction at hand (which indicates the
 size of the objects to be compared) was too large.
 .LP
 .I "Bad operands"
 .Wn "Shift over negative distance" WSHNEG 91
 .Wn "Shift over too large distance" WSHLARGE 92
 Shift instructions yield undefined results if the shift distance is negative
 or larger than the object size.
 .Wn "Pointer arithmetic yields pointer to bad segment" WSEGADP 93
 When doing pointer arithmetic (ADP, ADS), the operand and result pointer
 must be in the same \fIsegment\fP (see sec. 4).
 E.g. loading the address of the first local and adding 20 to it will
 certainly give this warning.
 .Wn "Subtracting pointers to different segments" WSEGSBS 94
 Pointers may be subtracted only if they point into the same segment.
 .Wn "Pointer arithmetic with NULL pointer" WNULLPA 96
 By definition it is illegal to do arithmetic with null pointers.
 Integers with the size of a pointer and the value zero are recognized
 as NULL pointers.
 A well-known C-trick to compute the offset of some field in a struct
 is converting the null-pointer to the type of the struct and simply
 taking the address of the field.
 This trick will \-when translated and interpreted\- generate this warning
 because it results in arithmetic with the NULL pointer.
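 .LP
 In C source the trick mentioned above usually looks like the following.
 This is a sketch: the struct and the macro name FIELD_OFFSET are invented
 for the example, and the standard \fIoffsetof\fP macro from <stddef.h> is
 shown only for comparison.
 Whether \fIoffsetof\fP itself leads to the same warning depends on how the
 compiler in use expands it; that has not been checked here.
 .DS
```c
#include <stddef.h>

struct record {
    char tag;
    long count;
    char name[8];
};

/*
 * The traditional idiom: pretend a struct lives at address 0 and take the
 * address of one of its fields.  The macro name is made up for this example.
 */
#define FIELD_OFFSET(type, field) ((size_t)&((type *)0)->field)

size_t count_offset_by_trick(void)
{
    return FIELD_OFFSET(struct record, count);
}

size_t count_offset_by_stddef(void)
{
    return offsetof(struct record, count);  /* standard macro from <stddef.h> */
}
```
 .DE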
 .LP
 .I "Return area"
 .Wn "Returned function result too large" WRFUNLAR 101
 .Wn "Returned function result too small" WRFUNSML 102
 This warning is generated when the size of the expected return value
 is not equal to the size actually returned.
 .br
 Your interpreted program may have fallen through the end of
 the code without explicitly doing an \fIexit()\fP or \fIreturn()\fP.
 The start-up routine (\fIcrt0()\fP) however always expects to get some
 value returned by the program proper.
 .br
 Another (less probable) possibility of course is that the code contains
 a subroutine or function call that does not return properly (e.g.
 it returns a short instead of a long).
 .Wn "Returned function result may be garbled" WRFUNGAR 103
 This warning will be generated, when the contents of the FRA are fetched
 after some instruction is executed which can mess up the area.
 Compiler-generated loadfiles should not generate this message.
 .LP
 .I "Return Status Block"
 .Wn "RET did not find a Return Status Block" WRETBAD 111
 .Wn "Used RET to return from a trap" WRETTRAP 112
 The RET instruction found a garbled Return Status Block, or one that resulted
 from a trap.
 .Wn "RTT did not find a Return Status Block" WRTTBAD 115
 .Wn "RTT on empty stack" WRTTEMPTY 116
 .Wn "Used RTT to return from a call" WRTTCALL 117
 .Wn "Used RTT to return from a non-returnable trap" WRTTNRTT 118
 The RTT (Return from Trap) instruction found a Return Status Block that was not
 created properly by a trap.
 .Wn "Stack Pointer too large in RET" WRETSTL 121
 .Wn "Stack Pointer too small in RET" WRETSTS 122
 .Wn "Stack Pointer too large in RTT" WRTTSTL 125
 .Wn "Stack Pointer too small in RTT" WRTTSTS 126
 According to the EM Manual (4.2), "the value of SP just after the return
 value has been popped must be the same as the
 value of SP just before executing the first instruction of the
 invocation."
 If the Stack Pointer is too large, some dynamically allocated item or some
 temporary result may have been left behind on the stack.
 If the Stack Pointer is too small, some locals have been unstacked.
 Since the interpreter has enough information in the Return Status Block, it
 recovers correctly from these errors.
 .LP
 .I "Traps"
 .LP
 Some traps have ambiguous or non-obvious causes.
 As far as possible, these are preceded by a warning, explaining the
 circumstances of the trap.
 .Wn "Trap ESTACK: DCH on bad LB" WDCHBADLB 131
 .Wn "Trap ESTACK: LPB on bad LB" WLPBBADLB 132
 .Wn "Trap ESTACK: SP retracted over Return Status Block" WSPGTLB 133
 .Wn "Trap ESTACK: SP moved into data area" WSPINHEAP 134
 .Wn "Trap ESTACK: SP set to non-word-boundary" WSPODD 135
 .Wn "Trap ESTACK: LB set out of stack" WLBOUT 136
 .Wn "Trap ESTACK: LB set to non-word-boundary" WLBODD 137
 .Wn "Trap ESTACK: LB set to position where there is no RSB" WLBRSB 138
 .Wn "Trap EHEAP: HP retracted into Global Data Area" WHPGDA 141
 .Wn "Trap EHEAP: HP pushed into stack" WHPSTACK 142
 .Wn "Trap EHEAP: HP set to non-word-boundary" WHPODD 143
 .Wn "Trap EILLINS: unknown opcode" WBADOPC 151
 .Wn "Trap EILLINS: conversion with unacceptable size for this machine" WILLCONV 152
 .Wn "Trap EILLINS: FIL with non-existing address" WILLFIL 153
 .Wn "Trap EILLINS: LFR with too large size" WILLLFR 154
 .Wn "Trap EILLINS: RET with too large size" WILLRET 155
 .Wn "Trap EILLINS: instruction argument of class c does not fit a word" WARGC 156
 .Wn "Trap EILLINS: instruction on double word on machine with word size 4" WARGD 157
 .Wn "Trap EILLINS: local offset too large" WARGL 158
 .Wn "Trap EILLINS: instruction argument of class g not in GDA" WARGG 159
 .Wn "Trap EILLINS: fragment offset too large" WARGF 160
 .Wn "Trap EILLINS: counter in lexical instruction out of range" WARGN 161
 .Wn "Trap EILLINS: non-existent procedure identifier" WARGP 162
 .Wn "Trap EILLINS: illegal register number" WARGR 163
 .Wn "Trap EBADPC: jump out of text segment" WPCOVFL 172
 .Wn "Trap EBADPC: jump out of procedure fragment" WPCPROC 173
 .Wn "Trap EBADGTO: GTO does not restore an existing RSB" WGTORSB 181
 .Wn "Trap EBADGTO: GTO descriptor on the stack" WGTOSTACK 182
 .Wn "Trap caused by TRP instruction" WTRP 191
 .ig
 .Wn "Last warning" WMSG 199
 !Leave these lines here!
 ..
--- a/doc/int/appB
+++ b/doc/int/appB
@ -0,0 +1,486 @@
 .\"	A simple tutorial
 .\"
 .\"	$Header$
 .\"
 .bp
 .DS
 APPENDIX B
 .DE
 .SH
 How to use the interpreter
 .PP
 The interpreter is not normally used for the debugging of programs under
 construction.  Its primary application is as a verification tool for almost
 completed programs.  Although the proper operation of the interpreter is
 obviously a black art, this chapter tries to provide some guidelines.
 .LP
 For the sake of the argument, the source language is assumed to be C, but most
 hints apply equally well to other languages supported by ACK.
 .sp
 .LP
 .I "Initial measures"
 .PP
 Start with a test case of trivial size; to be on the safe side, reckon with a
 time dilatation factor of about 500, i.e., a second grows into 10 minutes.
 (The interpreter takes 0.5 msec to do one EM instruction on a Sun 3/50).
 Fortunately many trivial test cases are much shorter than one second.
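 .LP
 The figures above can be checked with a little arithmetic: at 0.5 msec per
 instruction, 2000 EM instructions take about one second, and a factor of 500
 turns one second into 500 seconds, a little over 8 minutes, which is rounded
 up to 10 to stay on the safe side.
 The helper names in this C sketch are made up; the constants are the ones
 quoted in the text.
 .DS
```c
/* Quoted figure: about 0.5 msec (500 microseconds) per EM instruction. */
#define USEC_PER_EM_INSTR 500L

/* Quoted rule of thumb for the slowdown relative to native execution. */
#define DILATION_FACTOR 500L

/* Estimated interpretation time, in microseconds, for an instruction count. */
long estimated_usec(long em_instructions)
{
    return em_instructions * USEC_PER_EM_INSTR;
}

/* Estimated interpreted run time, in seconds, of a native run time. */
long dilated_seconds(long native_seconds)
{
    return native_seconds * DILATION_FACTOR;
}
```
 .DE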
 .PP
 Compile the program into an \fIe.out\fP, the EM machine version of a
 \fIa.out\fP, by calling \fIem22\fP (for 2-byte integers and 2-byte pointers),
 \fIem24\fP (for 2 and 4) or \fIem44\fP (for 4 and 4) as seems appropriate;
 if in doubt, use \fIem44\fP.  These compilers can be found in the ACK
 \fIbin\fP directory, and should be used instead of \fIacc\fP (or normal
 .UX
 \fIcc\fP).  Alternatively, you can use \fIacc \-memNN\fP instead of
 \fIemNN\fP.
 .LP
 If your C program consists of more than one file, as it usually does, there is
 a small problem.  The \fIacc\fP and \fIcc\fP compilers generate .o files,
 whereas the \fIemNN\fP compilers generate .m files as object files.
 A simple technique to avoid the problem is to call
 .DS
 em44 *.c
 .DE
 if you can.  If not, the following hack on the \fIMakefile\fP generally works.
 .IP \-
 Make sure the \fIMakefile\fP is reasonably clean and complete: all calls to
 the compiler are through \fI$(CC)\fP, \fICFLAGS\fP is used properly and all
 dependencies are specified.
 .IP \-
 Add the following lines to the \fIMakefile\fP (possibly permanently):
 .DS
 \&.SUFFIXES:	.o
 \&.c.o:
 \&	$(CC) \-c $(CFLAGS) $<
 .DE
 .IP \-
 Set CC to \fIem44 \-.c\fP (for example).  Make sure CFLAGS includes
 the \-O option; this yields a speed-up of about 15 %.
 .IP \-
 Change all .o to .m (or .k if you do not use the \-O option).
 .IP \-
 If necessary, change \fIa.out\fP to \fIe.out\fP.
 .PP
 With these changes, \fImake\fP will produce an EM object; you can use
 \fIesize\fP to verify that it is indeed an EM object and obtain some
 statistics.  Then call the interpreter:
 .DS
 int <EM-object-file> [ parameters ]
 .DE
 where the parameters are the normal parameters of your program.  This should
 work exactly like the original program, though slower.  It reads from the
 terminal if the original does, it opens and closes files like the original and
 it accepts interrupts.
 .sp
 .LP
 .I "Interpreting the results"
 .PP
 Now there are several possibilities.
 .PP
 It does all this.  Great!  This means the program
 does not do very uncouth things.  Now
 read the file \fIint.mess\fP to see if any messages were generated.  If there
 are none, the program did not really run (perhaps the original cc \fIa.out\fP
 got called instead?)  Normally there is at least a termination message like
 .DS
 (Message): program exits with status 0 at "awa.p", line 64, INR = 4124
 .DE
 This says that the program terminated through an exit(0) on line 64 of the
 file \fIawa.p\fP after 4124 EM instructions.
 If this is the only message it is time to move to a bigger test case.
 .PP
 On the other hand, the program may come to a grinding halt with an error
 message.
 All messages (errors and warnings) have a format in which the sequence
 .DS
 "<file name>", line <ln#>
 .DE
 occurs, which is the same sequence many compilers produce for their error
 messages. Consequently, the \fIint.mess\fP file can be processed as any
 compiler message output.
 .PP
 One such message can be
 .DS
 (Fatal error) a.em: trap "Addressing non existent memory" not caught at "a.c", line 2, INR = 16
 .DE
 produced by the abysmal program
 .DS
 main()	{
 	*(int*)200000 = 1;
 }
 .DE
 .LP
 Often the effects are more subtle, however.  The program
 .DS
 main()	{
 	int *a, b = 777;
 	b = *a;
 }
 .DE
 produces the following five warnings (in far less than a second):
 .DS
 (Warning 47, #1): Local data pointer expected at "t.c", line 4, INR = 17
 (Warning 61, cont.): Actual memory is undefined at "t.c", line 4, INR = 17
 (Warning 102, #1): Returned function result too small at "<unknown>", line 0, INR = 21
 (Warning 43, #1): Local integer expected at "exit.c", line 11, INR = 34
 (Warning 61, cont.): Actual memory is undefined at "exit.c", line 11, INR = 34
 .DE
 The one about the function result looks the most frightening,
 but is the most easily solved:
 \fImain\fP is a function returning an int, so the start-up routine expects a
 (four-byte) integer but gets an empty (zero-byte) return area.
 .LP
 \fINote\fP: The experts are divided about this. The traditional school holds
 that \fImain\fP is an int function and its result is the return code; this
 leaves them with two ways of supplying a return code: one as the parameter
 of \fIexit()\fP and one as the result
 of \fImain\fP.  The modern school (Berkeley 4.2 etc.) claims that
 return codes are supplied exclusively
 by \fIexit()\fP, and they have an \fIexit(0)\fP in
 the start-up routine, just after the call to \fImain()\fP; leaving \fImain()\fP
 through the bottom implies successful termination.
 .LP
 We shall satisfy both groups by
 .DS
 main()	{
 	int *a, b = 777;
 	b = *a;
 	exit(0);
 }
 .DE
 This results in
 .DS
 (Warning 47, #1): Local data pointer expected at "t.c", line 4, INR = 17
 (Warning 61, cont.): Actual memory is undefined at "t.c", line 4, INR = 17
 (Message): program exits with status 0 at "exit.c", line 11, INR = 33
 .DE
 which is pretty clear as it stands.
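 .LP
 Because every message carries the sequence "<file name>", line <ln#>,
 a small filter can pull the position out of a line of \fIint.mess\fP.
 The following C sketch is one way to do it; the function names and the
 fixed-size static buffer are choices made for the example, and only the
 first occurrence of the sequence in a line is examined.
 .DS
```c
#include <stdlib.h>
#include <string.h>

#define LINE_MARK ", line "

/* Points at the ", line " that follows a quoted file name, or NULL. */
static const char *find_mark(const char *msg)
{
    const char *mark = strstr(msg, LINE_MARK);

    return (mark != NULL && mark > msg && mark[-1] == '"') ? mark : NULL;
}

/* Returns <ln#>, or -1 when the sequence does not occur in msg. */
long position_line(const char *msg)
{
    const char *mark = find_mark(msg);

    return mark != NULL ? strtol(mark + strlen(LINE_MARK), NULL, 10) : -1;
}

/*
 * Copies <file name> into a static buffer and returns it, or NULL when the
 * sequence does not occur or the name does not fit.  Not reentrant.
 */
const char *position_file(const char *msg)
{
    static char name[256];
    const char *mark = find_mark(msg);
    const char *close, *open;
    size_t len;

    if (mark == NULL)
        return NULL;
    close = mark - 1;                   /* the closing quote */
    for (open = close; open > msg && open[-1] != '"'; open--)
        ;                               /* back up to just after the opening quote */
    len = (size_t)(close - open);
    if (open == msg || len >= sizeof name)
        return NULL;                    /* no opening quote, or name too long */
    memcpy(name, open, len);
    name[len] = 0;
    return name;
}
```
 .DE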
 .sp
 .LP
 .I "Using stack dumps"
 .PP
 Let's, for the sake of argument
 and to avoid the fierce realism of 10000-line programs, assume that the above
 still puzzles you.
 Since the error occurred in EM instruction number 17, we should like to see
 more information around that moment.  Call the interpreter again, now with the
 variable AT set to 17:
 .DS
 int AT=17 t.em
 .DE
 (The interpreter has a number of internal variables that can be set by
 assignments on the command line, like with \fImake\fP.)
 This gives you a file called \fIint.log\fP containing the
 stack dump of 150 lines presented at the end of this chapter.
 .PP
 Since dumping is a subfacility of logging in the interpreter, the formats of
 the lines are
 the same.  If a line starts with an @, it will contain a file-name/line-number
 indication; the next two characters are the subject and the log
 level. Then comes the information, preceded by a space.  The text contains
 three stack dumps, one before the offending instruction, one at it, and one
 after it; then the interpreter stops.  All kinds of other dumps can be
 obtained, but this is the default.
 .PP
 For each instruction we have, in order:
 .IP \-
 an @x9 line, giving the position in the program,
 .IP \-
 the messages, warnings and errors from the instruction as it is being executed,
 .IP \-
 dump(s), as requested.
 .PP
 The first two lines mean that at line 4 in file \fIt.c\fP the interpreter
 performed its 16-th instruction, with the Program Counter at 30 pointing at
 opcode 180 in the text segment; the instruction was an LOL (LOad Local)
 with the operand \-4 derived from the opcode.  It copies the local at offset
 \-4 to the top of the stack.  The effect can be seen from the subsequent stack
 dump, where the undefined word at addresses 2147483568 to ...571 (the variable
 \fIa\fP) has been copied to the top of the stack at 2147483560 (copying
 undefined values does not generate a warning).
 Since we used the \fIem44\fP compiler, all pointers and ints in our dump are
 4 bytes long.
 So a variable at address X in reality extends from address X to X+3.
 .br
 Note that this is not the offending instruction; this stack dump represents
 the situation just before the error.
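 .LP
 The ITEM VALUE column of the dump can be reproduced from the BYTE column.
 Judging from variable \fIb\fP, whose bytes 9, 3, 0, 0 at addresses
 2147483564 to ...567 are shown with the value 777 (9 + 3 * 256), the byte
 at the lowest address is the least significant one.
 The helper below is a C sketch under that assumption; its name is not taken
 from the interpreter.
 .DS
```c
#include <stddef.h>

/*
 * Combines the bytes of one item into its value.  bytes[0] is the byte at
 * the lowest address X and bytes[n-1] the one at X+n-1; the lowest address
 * is taken to hold the least significant byte.
 */
unsigned long item_value(const unsigned char *bytes, size_t n)
{
    unsigned long value = 0;
    size_t i;

    for (i = n; i > 0; i--)
        value = (value << 8) | bytes[i - 1];
    return value;
}
```
 .DE
 Applied to the bytes 64, 3, 0, 0 found at 2147483600 to ...603, the same
 rule gives 64 + 3 * 256 = 832, the value shown in the dump.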
 .PP
 The stack consists of a sequence of frames, each containing data followed by
 a Return Status Block resulting from a call; the last frame ends in
 top-of-stack.  The first frame represents the stack when the program starts,
 through a call to the start-up routine.  This routine prepares the second
 stack frame with the actual parameters to \fImain()\fP:
 \fIargc\fP at 2147483596, \fIargv\fP at 2147483600 and \fIenviron\fP at
 2147483604.
 .LP
 The RSB line shows that the call to \fImain()\fP was made from procedure 0
 which has 0 locals, with PC at
 16, an LB of 2147483608 and file name and line number still unknown.
 The \fIcode\fP in the RSB tells how this RSB was made; possible values are STP
 (start-up), CAL, RTT (returnable trap) and NRT (non-returnable trap).
 .PP
 The next frame shows the local variable(s) of \fImain()\fP; there are two of
 them, the pointer \fIa\fP at 2147483568, which is undefined, and variable
 \fIb\fP at 2147483564, which has the value 777.  Then comes a copy of \fIa\fP,
 just made by the LOL instruction, at 2147483560.  The following line shows that
 the Function Return Area (which does not reside at the end of the stack, but
 just happens to be printed here) has size 0 and is presently undefined.
 The stack dump ends
 by showing that the Actuals Base is at 2147483596 (pointing at \fIargc\fP), the
 Locals Base at 2147483572 (pointing just above the local \fIa\fP), the Stack
 Pointer at 2147483560 (pointing at the undefined pointer), the line count is 4
 and the file name is "t.c".
 .LP
 (Notice that there is one more stack frame than you would probably expect, the
 one above the start-up routine.)
 .LP
 The Function Return Area
 could have a size larger than 0 and still be undefined, for
 example when an instruction that does not preserve the contents of the FRA has
 just been executed; likewise the FRA could have size 0 and be defined
 nevertheless, for example just after a RET 0 instruction.
 .PP
 All this has set the scene for the disaster which is about to strike in the
 next instruction.  This is indeed a LOI (LOad Indirect) of size 4, opcode 169;
 it causes the message
 .DS
 warning: Local data pointer expected [stack.c: 242]
 .DE
 and its continuation
 .DS
 warning cont.: Actual memory is undefined
 .DE
 (detected in the interpreter file \fIstack.c\fP at line 242; this can be
 useful for sorting out dubious semantics).  We see that the effect, as shown in
 the third frame of this stack dump (at instruction number 17) is somewhat
 unexpected: the LOI has fetched the value 4 and stacked it.  The reason is
 that, unfortunately, undefinedness is not transitive in the interpreter.  When
 an undefined value is used in an operation (other than copying) a warning is
 given, but thereafter the value is treated as if it were zero.  So, after the
 warning a normal null pointer remains, which is then used to pick up the value
 at location 0.  This is the place where the EM machine stores its current line
 number, which is presently 4.
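 .LP
 The rule that undefinedness is not transitive can be modelled in a few
 lines of C.
 This is a toy model, not the shadow-byte code of the interpreter: the
 struct, the counter and the function names are all assumptions made for
 the example.
 .DS
```c
#include <stdbool.h>

/* Hypothetical model of one word together with its shadow information. */
struct shadowed_word {
    long value;
    bool defined;
};

static int warnings_given;      /* counts the warnings in this toy model */

/* Copying moves value and shadow together and never warns. */
struct shadowed_word copy_word(struct shadowed_word w)
{
    return w;
}

/*
 * Using a word as an operand: an undefined word causes one warning and is
 * then treated as a defined zero, so the undefinedness does not propagate.
 */
struct shadowed_word use_word(struct shadowed_word w)
{
    if (!w.defined) {
        warnings_given++;
        w.value = 0;
        w.defined = true;
    }
    return w;
}
```
 .DE
 In this model the LOL corresponds to \fIcopy_word\fP (no warning) and the
 LOI to \fIuse_word\fP (one warning, after which a zero remains).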
 .PP
 The third stack dump shows the final effect: the value 4 has been unstacked
 and copied to variable \fIb\fP at 2147483564 through an STL (STore Local)
 instruction.
 .PP
 Since this form of logging dumps the stack only, the log file is relatively
 small as dumps go.
 Nevertheless, a useful excerpt can be obtained with the command
 .DS
 grep 'd1' int.log
 .DE
 This extracts the Return Status Block lines from the log, thus producing three
 traces of calls, one for each instruction in the log:
 .DS
 d1 >> RSB: code = STP, PI = uninit, PC = 0, LB = 2147483644, LIN = 0, FIL = NULL
 d1 >> RSB: code = CAL, PI = (0,0), PC = 16, LB = 2147483608, LIN = 0, FIL = NULL
 d1 >> AB = 2147483596, LB = 2147483572, SP = 2147483560, HP = 848, LIN = 4, FIL = "t.c"
 d1 >> RSB: code = STP, PI = uninit, PC = 0, LB = 2147483644, LIN = 0, FIL = NULL
 d1 >> RSB: code = CAL, PI = (0,0), PC = 16, LB = 2147483608, LIN = 0, FIL = NULL
 d1 >> AB = 2147483596, LB = 2147483572, SP = 2147483560, HP = 848, LIN = 4, FIL = "t.c"
 d1 >> RSB: code = STP, PI = uninit, PC = 0, LB = 2147483644, LIN = 0, FIL = NULL
 d1 >> RSB: code = CAL, PI = (0,0), PC = 16, LB = 2147483608, LIN = 0, FIL = NULL
 d1 >> AB = 2147483596, LB = 2147483572, SP = 2147483564, HP = 848, LIN = 4, FIL = "t.c"
 .DE
 Theoretically, the pertinent trace is the middle one, but in practice all three
 are equal.  In the present case there isn't much to trace, but in real programs
 the trace can be useful.
 .sp
 .LP
 .I "Errors in libraries"
 .PP
 Since libraries are generally compiled with suppression of line number and
 file name information, the line number and file name in the interpreter will
 not be updated when it enters a library routine. Consequently, all messages
 generated by interpreting library routines will seem to originate from the
 line of the call.  This is especially true for the routine malloc(), which,
 from the nature of its business, often contains dubious code.
 .PP
 A usual message is:
 .DS
 (Warning 43, #1): Local integer expected at "buff.c", line 18, INR = 266
 (Warning 64, cont.): Actual memory contains a data pointer at "buff.c", line 18, INR = 266
 .DE
 and indeed at line 18 of the file buff.c we find:
 .DS
 	buff = malloc(buff_size = BFSIZE);
 .DE
 This problem can be avoided by using a specially compiled version of the
 library that contains the correct LIN and FIL instructions, or, less
 elegantly, by including the source code of the library routines in the
 program; in the latter case, make sure you have them all.
 .sp
 .LP
 .I "Unavoidable messages"
 .br
 Some messages produced by the logging are almost unavoidable; sometimes the
 writer of a library routine is forced to take liberties with the semantics of
 EM.
 .LP
 Examples from C include the memory allocation routines.
 For efficiency reasons, one bit of a pointer in the administration is used as
 a flag; setting, clearing and reading this bit requires bitwise operations on
 pointers, which gives the above messages.
 Realloc causes a problem in that it may have to copy the originally allocated
 area to a different place; this area may contain uninitialised bytes.
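 .LP
 The kind of pointer manipulation meant here can be sketched in C as
 follows.
 Which bit the allocation routines actually use is not stated above; the
 lowest bit and all names in the sketch are assumptions, and the code is not
 taken from the library.
 .DS
```c
#include <stdint.h>

/*
 * Hypothetical helpers that keep a one-bit flag in the lowest bit of a
 * pointer.  This only works if the pointed-to objects are aligned so that
 * the bit is otherwise always zero.
 */
void *set_flag(void *p)
{
    return (void *)((uintptr_t)p | (uintptr_t)1);
}

void *clear_flag(void *p)
{
    return (void *)((uintptr_t)p & ~(uintptr_t)1);
}

int flag_is_set(const void *p)
{
    return (int)((uintptr_t)p & (uintptr_t)1);
}

/* Round trip on an aligned object; returns 1 if the pointer survives. */
int round_trip_ok(void)
{
    static long block;
    void *tagged = set_flag(&block);

    return flag_is_set(tagged) && !flag_is_set(&block)
        && clear_flag(tagged) == (void *)&block;
}
```
 .DE
 Each of the three helpers converts the pointer to an integer, operates on
 it bitwise and converts it back, which is the pattern the run-time type
 checking reports.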
 .bp
 .DS
 .ft CW
@x9 "t.c", line 4, INR = 16, PC = 30 OPCODE = 180
@L6 "t.c", line 4, INR = 16, DoLOLm(-4)
 d2 
 d2 . . STACK_DUMP[4/4] . . INR = 16 . . STACK_DUMP . .
 d2 ----------------------------------------------------------------
 d2       ADDRESS     BYTE     ITEM VALUE   SHADOW
 d2    2147483643        0                  (Dp)
 d2    2147483642        0                  (Dp)
 d2    2147483641        0                  (Dp)
 d2    2147483640       40    [        40]  (Dp)
 d2    2147483639        0                  (Dp)
 d2    2147483638        0                  (Dp)
 d2    2147483637        3                  (Dp)
 d2    2147483636       64    [       832]  (Dp)
 d2    2147483635        0                  (In)
 d2    2147483634        0                  (In)
 d2    2147483633        0                  (In)
 d2    2147483632        1    [         1]  (In)
 d1 >> RSB: code = STP, PI = uninit, PC = 0, LB = 2147483644, LIN = 0, FIL = NULL
 d2 
 d2       ADDRESS     BYTE     ITEM VALUE   SHADOW
 d2    2147483607        0                  (Dp)
 d2    2147483606        0                  (Dp)
 d2    2147483605        0                  (Dp)
 d2    2147483604       40    [        40]  (Dp)
 d2    2147483603        0                  (Dp)
 d2    2147483602        0                  (Dp)
 d2    2147483601        3                  (Dp)
 d2    2147483600       64    [       832]  (Dp)
 d2    2147483599        0                  (In)
 d2    2147483598        0                  (In)
 d2    2147483597        0                  (In)
 d2    2147483596        1    [         1]  (In)
 d1 >> RSB: code = CAL, PI = (0,0), PC = 16, LB = 2147483608, LIN = 0, FIL = NULL
 d2 
 d2       ADDRESS     BYTE     ITEM VALUE   SHADOW
 d2    2147483571    undef
 d2         | | |    | | |
 d2    2147483568    undef (1 word)
 d2    2147483567        0                  (In)
 d2    2147483566        0                  (In)
 d2    2147483565        3                  (In)
 d2    2147483564        9    [       777]  (In)
 d2    2147483563    undef
 d2         | | |    | | |
 d2    2147483560    undef (1 word)
 d2        FRA: size = 0, undefined
 d1 >> AB = 2147483596, LB = 2147483572, SP = 2147483560, HP = 848, \e
 							LIN = 4, FIL = "t.c"
 d2 ----------------------------------------------------------------
 d2 
@x9 "t.c", line 4, INR = 17, PC = 31 OPCODE = 169
@w1 "t.c", line 4, INR = 17, warning: Local data pointer expected [stack.c: 242]
@w1 "t.c", line 4, INR = 17, warning cont.: Actual memory is undefined
@L6 "t.c", line 4, INR = 17, DoLOIm(4)
 d2 
 d2 . . STACK_DUMP[4/4] . . INR = 17 . . STACK_DUMP . .
 d2 ----------------------------------------------------------------
 d2       ADDRESS     BYTE     ITEM VALUE   SHADOW
 d2    2147483643        0                  (Dp)
 d2    2147483642        0                  (Dp)
 d2    2147483641        0                  (Dp)
 d2    2147483640       40    [        40]  (Dp)
 d2    2147483639        0                  (Dp)
 d2    2147483638        0                  (Dp)
 d2    2147483637        3                  (Dp)
 d2    2147483636       64    [       832]  (Dp)
 d2    2147483635        0                  (In)
 d2    2147483634        0                  (In)
 d2    2147483633        0                  (In)
 d2    2147483632        1    [         1]  (In)
 d1 >> RSB: code = STP, PI = uninit, PC = 0, LB = 2147483644, LIN = 0, FIL = NULL
 d2 
 d2       ADDRESS     BYTE     ITEM VALUE   SHADOW
 d2    2147483607        0                  (Dp)
 d2    2147483606        0                  (Dp)
 d2    2147483605        0                  (Dp)
 d2    2147483604       40    [        40]  (Dp)
 d2    2147483603        0                  (Dp)
 d2    2147483602        0                  (Dp)
 d2    2147483601        3                  (Dp)
 d2    2147483600       64    [       832]  (Dp)
 d2    2147483599        0                  (In)
 d2    2147483598        0                  (In)
 d2    2147483597        0                  (In)
 d2    2147483596        1    [         1]  (In)
 d1 >> RSB: code = CAL, PI = (0,0), PC = 16, LB = 2147483608, LIN = 0, FIL = NULL
 d2 
 d2       ADDRESS     BYTE     ITEM VALUE   SHADOW
 d2    2147483571    undef
 d2         | | |    | | |
 d2    2147483568    undef (1 word)
 d2    2147483567        0                  (In)
 d2    2147483566        0                  (In)
 d2    2147483565        3                  (In)
 d2    2147483564        9    [       777]  (In)
 d2    2147483563        0                  (In)
 d2    2147483562        0                  (In)
 d2    2147483561        0                  (In)
 d2    2147483560        4    [         4]  (In)
 d2        FRA: size = 0, undefined
 d1 >> AB = 2147483596, LB = 2147483572, SP = 2147483560, HP = 848, \e
 							LIN = 4, FIL = "t.c"
 d2 ----------------------------------------------------------------
 d2 
@x9 "t.c", line 4, INR = 18, PC = 32 OPCODE = 229
@S6 "t.c", line 4, INR = 18, DoSTLm(-8)
 d2 
 d2 . . STACK_DUMP[4/4] . . INR = 18 . . STACK_DUMP . .
 d2 ----------------------------------------------------------------
 d2       ADDRESS     BYTE     ITEM VALUE   SHADOW
 d2    2147483643        0                  (Dp)
 d2    2147483642        0                  (Dp)
 d2    2147483641        0                  (Dp)
 d2    2147483640       40    [        40]  (Dp)
 d2    2147483639        0                  (Dp)
 d2    2147483638        0                  (Dp)
 d2    2147483637        3                  (Dp)
 d2    2147483636       64    [       832]  (Dp)
 d2    2147483635        0                  (In)
 d2    2147483634        0                  (In)
 d2    2147483633        0                  (In)
 d2    2147483632        1    [         1]  (In)
 d1 >> RSB: code = STP, PI = uninit, PC = 0, LB = 2147483644, LIN = 0, FIL = NULL
 d2 
 d2       ADDRESS     BYTE     ITEM VALUE   SHADOW
 d2    2147483607        0                  (Dp)
 d2    2147483606        0                  (Dp)
 d2    2147483605        0                  (Dp)
 d2    2147483604       40    [        40]  (Dp)
 d2    2147483603        0                  (Dp)
 d2    2147483602        0                  (Dp)
 d2    2147483601        3                  (Dp)
 d2    2147483600       64    [       832]  (Dp)
 d2    2147483599        0                  (In)
 d2    2147483598        0                  (In)
 d2    2147483597        0                  (In)
 d2    2147483596        1    [         1]  (In)
 d1 >> RSB: code = CAL, PI = (0,0), PC = 16, LB = 2147483608, LIN = 0, FIL = NULL
 d2 
 d2       ADDRESS     BYTE     ITEM VALUE   SHADOW
 d2    2147483571    undef
 d2         | | |    | | |
 d2    2147483568    undef (1 word)
 d2    2147483567        0                  (In)
 d2    2147483566        0                  (In)
 d2    2147483565        0                  (In)
 d2    2147483564        4    [         4]  (In)
 d2        FRA: size = 0, undefined
 d1 >> AB = 2147483596, LB = 2147483572, SP = 2147483564, HP = 848, \e
 							LIN = 4, FIL = "t.c"
 d2 ----------------------------------------------------------------
 d2 
 .DE
--- a/doc/int/bib
+++ b/doc/int/bib
@ -0,0 +1,25 @@
 .\"	Bibliography
 .\"
 .\"	$Header$
 .bp
 .DS C
 BIBLIOGRAPHY
 .DE
 .LP
 [1] A.S. Tanenbaum, H. van Staveren, E.G. Keizer and J.W. Stevenson.
 \fIDescription of a Machine Architecture for use with Block Structured
 Languages\fP. VU Informatica Rapport IR-81, August 1983.
 .LP
 [2] E.G. Keizer. \fIAck description file reference manual.\fP
 .LP
 [3] K. Jensen and N. Wirth.
 \fIPASCAL, User Manual and Report\fP. Springer Verlag.
 .LP
 [4] B.W. Kernighan and D.M. Ritchie.
 \fIThe C Programming Language\fP. Prentice-Hall, 1978.
 .LP
 [5] D.M. Ritchie. \fIC Reference Manual\fP.
 .LP
 [6] \fIAmsterdam Compiler Kit, reference manual.\fP
 .LP
 [7] \fIUnix Programmer's Manual, 4.1BSD\fP. UCB, August 1983.
--- a/doc/int/cover
+++ b/doc/int/cover
@ -0,0 +1,26 @@
 .\"	Front page
 .\"
 .\"	$Header$
 .TL
 The EM Interpreter
 .AU
 Eddo de Groot
 Leo van den Berge
 Dick Grune
 .AI
 Faculteit Wiskunde en Informatica
 Vrije Universiteit, Amsterdam
 .AB
 This document describes the implementation
 and usage of a new interpreter for the EM machine language.
 This interpreter implements the full EM machine
 and can be helpful to people writing new front-ends.
 Moreover, it can be used as a thorough testing and debugging
 tool by anyone familiar with the EM language.
 .PP
 A list of all warnings is given in appendix A; appendix B is a simple
 tutorial.
 .AE
 .PP
 .pn 1
 .bp
--- a/doc/int/draw.mac
+++ b/doc/int/draw.mac
@ -0,0 +1,24 @@
 .\"	Macros for simple constant width drawings (uses font CW)
 .\"
 .\"	$Header$
 .de Dr		\" Drawing $1 (size)
 .sp 1
 .ne \\$1
 .na
 .nf
 .ft CW				\" constant width font
 .lg 0				\" no ligatures
 ..
 .de Df		\" Drawing Footer
 .sp 1
 .ft R
 .ce 1000
 .lg 1
 ..
 .de De		\" Drawing End $1 (lines)
 .Df				\" if it has not happened yet
 .ce
 .ad
 .fi
 .sp \\$1
 ..
--- a/doc/int/txt2
+++ b/doc/int/txt2
@ -0,0 +1,595 @@
 .\"	Implementation details
 .\"
 .\"	$Header$
 .bp
 .NH
 IMPLEMENTATION DETAILS.
 .PP
 The pertinent issues are addressed below, in arbitrary order.
 .NH 2
 Stack manipulation and start-up
 .PP
 It is not at all easy to start the EM machine with the stack in a reasonable
 and consistent state.  One reason is the anomalous value of the ML register
 and another is the absence of a proper RSB.  It may be argued that the initial
 stack does not have to be in a consistent state, since the first instruction
 proper is only executed after \fIargc\fP, \fIargv\fP and \fIenviron\fP
 have been stacked (which takes care of the empty stack) and the initial
 procedure has been called (which creates a RSB).  We would, however, like to
 perform the stacking of these values and the calling of the initial procedure
 using the normal stack and call routines, which again require the stack to be
 in an acceptable state.
 .NH 3
 The anomalous value of the ML register
 .PP
 All registers in the EM machine point to word boundaries, and all of them,
 except ML, address the even-numbered byte at the boundary.
 The exception has a good reason: the even numbered byte at the ML boundary does
 not exist.
 This problem is not particular to EM but is inherent in the number system: the
 number of N-digit numbers can itself not be expressed in an N-digit number, and
 the number of addresses in an N-bit machine will itself not fit in an N-bit
 address.  The problem is solved in the interpreter by having ML point to the
 highest word boundary that has bytes on either side; this makes ML+1
 expressible.
 .NH 3
 The absence of an initial Return Status Block
 .PP
 When the stack is empty, there is no legal value for AB, since there are no
 actuals; LB can be set naturally to ML+1.  This is all right when the
 interpreter starts with a call of the initial routine which stores the value
 of LB in the first RSB, but causes problems when finally this call returns.  We
 want this call to return completely before stopping the interpreter, to check
 the integrity of the last RSB; restoring information from it will, however,
 cause illegal values to be stored in LB and AB (ML+1 and ML+1+rsbsize, resp.).
 On top of this, the initial (illegal) Procedure Identifier of the running
 procedure will be restored; then, restoring the likewise illegal PC will
 cause a check to see if it still is inside the running procedure.  After a few
 attempts at writing special cases, we have decided that it is possible, but not
 worth the effort; the final (= initial) RSB will not be unstacked.
 .NH 2
 Floating point numbers.
 .PP
 The interpreter is capable of working with 4- and 8-byte floating point (FP)
 numbers.
 In C-terms, this corresponds to objects of type float and double respectively.
 Both types fit in a C-double so the obvious way to manipulate these entities
 internally is in doubles.
 When an 8-byte FP is pushed, all bytes of the C-double are pushed.
 Pushing a 4-byte FP causes the 4 bytes representing the smallest fraction
 to be discarded.
 .PP
 In EM, floats can be obtained in two different ways: via conversion
 of another type, or via initialization in the loadfile.
 Initialized floats are represented in the loadfile by an ASCII string in
 the syntax of a Pascal real (signed \fIUnsignedReal\fP).
 I.e. a float looks like:
 .DS
 [ \fISign\fP ] \fIDigit\fP+ [ . \fIDigit\fP+ ] [ \fIExp\fP [ \fISign\fP ] \fIDigit\fP+ ]                                (G1)
 .DE
 followed by a null byte.
 Here \fISign\fP = {+, \-}; \fIDigit\fP = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
 \fIExp\fP = {e, E}; [ \fIAnything\fP ] means that \fIAnything\fP is optional;
 and a + means one or more times.
 To accommodate some loose code generators, the actual grammar accepted is:
 .DS
 [ \fISign\fP ] \fIDigit\fP\(** [ . \fIDigit\fP\(** ] [ \fIExp\fP [ \fISign\fP ] \fIDigit\fP+ ]                                (G2)
 .DE
 followed by a null byte. Here \(** means zero or more times.  A floating
 denotation which is in G2 but not in G1 draws a warning; one that is not even
 in G2 causes a fatal error.
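 .LP
 The two grammars can be told apart in a single pass over the string.
 The following is a minimal sketch only, not the interpreter's own code; the
 names \fIclassify_float\fP, \fIskip_digits\fP and the \fIfl_class\fP values
 are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical result codes: in G1; in G2 but not G1; not even in G2. */
enum fl_class { FL_OK, FL_WARN, FL_FATAL };

/* Skip a run of decimal digits at *pp; return how many were skipped. */
static size_t skip_digits(const char **pp)
{
    size_t n = 0;

    while (isdigit((unsigned char)**pp)) {
        (*pp)++;
        n++;
    }
    return n;
}

/* Classify a null-terminated floating denotation against G1 and G2. */
enum fl_class classify_float(const char *s)
{
    int strict = 1;                 /* still conforms to G1 */

    if (*s == '+' || *s == '-')
        s++;
    if (skip_digits(&s) == 0)
        strict = 0;                 /* G1 requires Digit+ here, G2 Digit* */
    if (*s == '.') {
        s++;
        if (skip_digits(&s) == 0)
            strict = 0;             /* same relaxation after the point */
    }
    if (*s == 'e' || *s == 'E') {
        s++;
        if (*s == '+' || *s == '-')
            s++;
        if (skip_digits(&s) == 0)
            return FL_FATAL;        /* both grammars require exponent digits */
    }
    if (*s != '\0')
        return FL_FATAL;            /* trailing characters fit neither grammar */
    return strict ? FL_OK : FL_WARN;
}
```

 Note that under G2 as written every part is optional, so this sketch
 classifies the empty string as a warning rather than as a fatal error.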
 .LP
 A string representing a float which does not fit in a double causes a
 warning to be given.
 In that case, the returned value will be the double 0.0.
 .LP
 Floating point arithmetic is handled by some simple routines, checking for
 over/underflow, and returning appropriate values in case of an ignored error.
 .PP
 Since not all C compilers provide floating point operations, there is a
 compile time flag NOFLOAT, which, if defined, suppresses the use of all
 fp operations in the interpreter.  The resulting interpreter will still load
 EM files with floats in the global data area (and ignore them) but will give a
 fatal error upon attempt to execute a floating point instruction; consequently
 code involving floating point operations can be run as long as the actual
 instructions are avoided.
 .NH 2
 Pointers.
 .PP
 The following sub-sections both deal with problems concerning pointers.
 First, something is said about pointer arithmetic in general.
 Then, the null-pointer problem is dealt with.
 .NH 3
 Pointer arithmetic.
 .PP
 Strictly speaking, pointer arithmetic is defined only within a \fBfragment\fP.
 From the explanation of the term fragment however (as given in [1], page 3),
 it is not quite clear what a fragment should look like
 from an interpreter's point of view.
 For this reason we introduced the term \fBsegment\fP,
 bordering the various areas within which pointer arithmetic is allowed.
 Every stack-frame is a segment, and so are the global data area (GDA) and
 the heap area.
 Thus, the number of segments varies over time, and at some point in time is
 given by the number of currently active stack-frames
 (#CAL + #CAI \- #RET \- #RTT) plus 2 (gda, heap).
 Pointers in the area between heap and stack (which is inaccessible by
 definition), are assumed to be in the heap segment.
 .PP
 The interpreter, while building a new stack-frame (i.e. segment), stores the
 value of the last ActualBase in a pointer-array  (\fIAB_list[\ ]\fP).
 When a pointer (say \fIP\fP) is available for arithmetic, the number
 of the segment where it points (say \fIS\d\s-2P\s+2\u\fP),
 is determined first.
 Next, the arithmetic is performed, followed by a check on the number
 of the segment where the resulting pointer \fIR\fP points
 (say \fIS\d\s-2R\s+2\u\fP).
 Now, if \fIS\d\s-2P\s+2\u != S\d\s-2R\s+2\u\fP, a warning is given:
 \fBPointer arithmetic yields pointer to bad segment\fP.
 .br
 It may also be clear now why the illegal area between heap and stack
 was joined with the heap segment.
 When calculating a new heap pointer (\fIHP\fP), one will obtain intermediate
 results being pointers in this area just before it is made legal.
 We do not want error messages all of the time, just because someone is
 allocating space in the heap.
 .LP
 A similar treatment is given to the pointers in the SBS instruction; they have
 to point into the same fragment for subtraction to be meaningful.
 .LP
 The length of the \fIAB_list[\ ]\fP is initially 100,
 and it is reallocated in the same way the dynamically growing partitions
 are (see 1.1).
 .NH 3
 Null pointer.
 .PP
 Because the EM language lacks an instruction for loading a null pointer,
 most programs solve this problem by loading a pointer-sized integer of
 value zero, and using this as a null pointer (this is also proposed in [1]).
 \fBInt\fP allows this, and will not complain.
 A warning is given however, when an attempt is made to add something to a
 null pointer (i.e. the pointer-sized integer zero).
 .LP
 Since many programming languages use a pointer to location 0 as an illegal
 value, it is desirable to detect its use.
 The big problem, though, is that 0 is a perfectly legal EM address;
 address 0 holds the current line number in the source file.  It may be freely
 read but is written only by means of the LIN instruction.  This allows us to
 declare the area consisting of the line number and the file name pointer to be
 read-only memory.  Thus a store will be caught (and result in a warning) but a
 read will succeed (and yield the EM information stored there).
 .NH 2
 Function Return Area (FRA).
 .PP
 The Function Return Area (\fIFRA[\ ]\fP) has a default size of 8 bytes;
 this default can
 be overridden through the use of the \fB\-r\fP-option, but cannot be
 made smaller than the size of two pointers, in accordance with the
 remark on page 5 of [1].
 The global variable \fIFRASize\fP keeps track of how many bytes were
 stored in the FRA, the last time a RET instruction was executed.
 The LFR instruction only works when its argument is equal to this size.
 If not, the FRA contents are loaded anyhow, but one of the following warnings
 is given:
 \fBReturned function result too large\fP (\fIFRASize\fP > LFR size) or
 \fBReturned function result too small\fP (\fIFRASize\fP < LFR size).
 .LP
 Note that a C-program, falling through the end of its code without doing
 a proper \fIreturn\fP or \fIexit()\fP, will generate this warning.
 .PP
 The only instructions that do not disturb the contents of the FRA are
 GTO, BRA, ASP and RET.
 This is expressed in the program by setting \fIFRA_def\fP to "undefined"
 in any instruction except these four.
 We realize this is a useless action most of the time, but a more
 efficient solution does not seem to be at hand.
 If a result is loaded when \fIFRA_def\fP is "undefined", the warning:
 \fBReturned function result may be garbled\fP is generated.
 .LP
 Note that the FRA needs a shadow-FRA in order to store the shadow
 information when performing a LFR instruction.
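 .LP
 The checks made by the LFR instruction, as described above, can be summarized
 in a small decision function.
 This is a sketch under assumptions: the name \fIcheck_lfr\fP and the
 \fIlfr_status\fP values are hypothetical, and the order in which the
 conditions are tested is a guess.

```c
/* Hypothetical outcome codes for an LFR instruction. */
enum lfr_status { LFR_OK, LFR_TOO_LARGE, LFR_TOO_SMALL, LFR_GARBLED };

/*
 * fra_defined: non-zero if no FRA-disturbing instruction ran since the RET
 * fra_size:    bytes stored in the FRA by the last RET
 * lfr_size:    the argument of the LFR instruction
 */
enum lfr_status check_lfr(int fra_defined, long fra_size, long lfr_size)
{
    if (!fra_defined)
        return LFR_GARBLED;     /* "Returned function result may be garbled" */
    if (fra_size > lfr_size)
        return LFR_TOO_LARGE;   /* "Returned function result too large" */
    if (fra_size < lfr_size)
        return LFR_TOO_SMALL;   /* "Returned function result too small" */
    return LFR_OK;
}
```

 In every case other than LFR_OK the FRA contents would still be loaded; only
 the warning differs.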
 .NH 2
 Environment interaction.
 .PP
 The EM machine represented by \fBint\fP can communicate with
 the environment in three different ways.
 A first possibility is by means of (UNIX) interrupts;
 the second by executing (relatively) high level system calls (called
 monitor calls).
 A third means of interaction, especially interesting for the debugging
 programmer, is via internal variables set on the command line.
 The former two techniques, and the way they are implemented will be described
 in this section.
 The latter has been allotted a separate section (3).
 .NH 3
 Traps and interrupts.
 .PP
 Simple user programs will generally not mess around with UNIX-signals.
 In interpreting these programs, the default actions will be taken
 when a signal is received by the program: it gives a message and
 stops running.
 .LP
 There are programs however, which try to handle certain signals
 themselves.
 In C, this is achieved by the system call \fIsignal(\ sig_no,\ catch\ )\fP,
 which calls the handling routine \fIcatch()\fP, as soon as signal
 \fBsig_no\fP occurs.
 EM does not provide this call; instead, the \fIsigtrp()\fP monitor call
 is available for mapping UNIX signals onto EM traps.
 This implies that a \fIsignal()\fP call in a C-program
 must be translated by the EM library routine to a \fIsigtrp()\fP call in EM.
 .PP
 The interpreter keeps an administration of the mapping of UNIX-signals
 onto EM traps in the array \fIsig_map[NSIG]\fP.
 Initially, the signals all have their default values.
 Now assume a \fIsigtrp()\fP occurs, requesting that signal \fBsig_no\fP be
 mapped onto trap \fBtrap_no\fP.
 This results in:
 .IP 1.
 setting the relevant array element
 \fIsig_map[sig_no]\fP to \fBtrap_no\fP (after saving the old value),
 .IP 2.
 catching the next to come \fBsig_no\fP signal with the handling routine
 \fIHndlEMSig\fP (by a plain UNIX \fIsignal()\fP of course), and
 .IP 3.
 returning the saved map-value on the stack so the user can know the previous
 trap value onto which \fBsig_no\fP was mapped.
 .LP
 On an incoming signal,
 the handling routine for signal \fBsig_no\fP arms the
 correct EM trap by calling the routine \fIarm_trap()\fP with argument
 \fIsig_map[sig_no]\fP.
 At the end of the EM instruction the proper call of \fItrap()\fP is done.
 \fITrap()\fP in its turn examines the value of the \fIHaltOnTrap\fP variable;
 if it is set, the interpreter will stop with a message. In the normal case of
 controlled trap handling this bit is not on and the interpreter examines
 the value of the \fITrapPI\fP variable,
 which contains the procedure identifier of the EM trap handling routine.
 It then initiates a call to this routine and performs a \fIlongjmp()\fP
 to the main
 loop to bypass all further processing of the instruction that caused the trap.
 \fITrapPI\fP should be set properly by the library routines, through the
 SIG instruction.
 .LP
 In short:
 .IP 1.
 A UNIX interrupt is caught by the interpreter.
 .IP 2.
 A handling routine is called which generates the corresponding EM trap
 (according to the mapping).
 .IP 3.
 The trap handler calls the corresponding EM routine which emulates a UNIX
 interrupt for the benefit of the interpreted program.
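 .LP
 The mapping administration can be sketched in a few lines of C.
 This is an illustration only, not the interpreter's source: MAX_SIG,
 TRAP_DEFAULT, SIGTRP_FAIL, \fIarmed_trap\fP and \fIinit_sig_map\fP are
 hypothetical names, and the handler merely records the trap number where
 the interpreter would call \fIarm_trap()\fP.

```c
#include <signal.h>

#define MAX_SIG      32     /* hypothetical stand-in for NSIG */
#define TRAP_DEFAULT (-1)   /* hypothetical: signal keeps its default action */
#define SIGTRP_FAIL  (-2)   /* hypothetical failure code */

static volatile sig_atomic_t sig_map[MAX_SIG];
static volatile sig_atomic_t armed_trap = TRAP_DEFAULT;

/* Initially, all signals have their default values. */
void init_sig_map(void)
{
    int i;

    for (i = 0; i < MAX_SIG; i++)
        sig_map[i] = TRAP_DEFAULT;
}

/* Handler for signals requested by the interpreted program: arm the trap. */
static void hndl_em_sig(int sig_no)
{
    armed_trap = sig_map[sig_no];
}

/* Map sig_no onto trap_no; return the previous mapping. */
int do_sigtrp(int sig_no, int trap_no)
{
    int old;

    if (sig_no <= 0 || sig_no >= MAX_SIG)
        return SIGTRP_FAIL;
    old = sig_map[sig_no];              /* 1. save old value, set new one */
    sig_map[sig_no] = trap_no;
    signal(sig_no, hndl_em_sig);        /* 2. catch the next such signal */
    return old;                         /* 3. hand back the old mapping */
}
```

 The main loop would inspect \fIarmed_trap\fP at the end of each EM
 instruction and start the trap handling from there.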
 .PP
 When considering UNIX signals, it is important to notice that some of them
 are real signals, i.e., messages coming from outside the program, like DEL
 and QUIT, but some are actually program-caused synchronous traps, like Illegal
 Instruction.  The latter, if they happen, are incurred by the interpreter
 itself and consequently are of no concern to the interpreted program: it
 cannot catch them.  The present code assumes that the UNIX signals between
 SIGILL (4) and SIGSYS (12) are really traps; \fIdo_sigtrp()\fP
 will fail on them.
 .LP
 To avoid losing the last line(s) of output files, the interpreter should
 always do a proper close-down, even in the presence of signals.  To this end,
 all non-ignored genuine signals are initially caught by the interpreter,
 through the routine \fIHndlIntSig\fP, which gives a message and performs a
 proper close-down.
 Synchronous traps can only be caused by the interpreter itself; they are never
 caught, and consequently the UNIX default action prevails.  Generally they
 cause a core dump.
 Signals requested by the interpreted program are caught by the routine
 \fIHndlEMSig\fP, as explained above.
 .NH 3
 Monitor calls.
 .PP
 For the convenience of the programmer, as many monitor calls as possible
 have been implemented.
 The list of monitor calls given in [1] pages 20/21, has been implemented
 completely, except for \fIptrace()\fP, \fIprofil()\fP and \fImpxcall()\fP.
 The semantics of \fIptrace()\fP and \fIprofil()\fP from an interpreted program
 is unclear; the data structure passed to \fImpxcall()\fP is non-trivial
 and the system call has low portability and applicability.
 For these calls, on invocation a warning is generated, and the arguments which
 were meant for the call are popped properly, so the program can continue
 without the stack being messed up.
 The errorcode 5 (IOERROR) is pushed onto the stack (twice), in order to
 fake an unsuccessful monitor call.
 No other \- more meaningful \- errorcode is available in the errno-list.
 .LP
 Now for the implemented monitor calls.
 The returned value is zero for a successful call.
 When something goes wrong, the value of the external \fIerrno\fP variable
 is pushed, thus enabling the user to find out what the reason of failure was.
 The implementation of the majority of the monitor calls is straightforward.
 Those working with a special format buffer, (e.g. \fIioctl()\fP,
 \fItime()\fP and \fIstat()\fP variants), need some extra attention.
 This is due to the fact that working with varying word/pointer size
 combinations may cause alignment problems.
 .LP
 The data structure returned by the UNIX system call results from
 C code that has been translated with the regular C compiler, which,
 on the VAX, happens to be a 4-4 compiler.
 The data structure expected by the interpreted program conforms
 to the translation by \fBack\fP of the pertinent include file.
 Depending on the exact call of \fBack\fP, sizes and alignment may differ.
 .LP
 An example is in order. The EM MON 18 instruction in the interpreted program
 leads to a UNIX \fIstat()\fP system call by the interpreter.
 This call fills the given struct with stat information, the contents
 and alignments of which are determined by the version of UNIX and the
 used C compiler, resp.
 The interpreter, like any program wishing to do system calls that fill
 structs, has to be translated by a C compiler that uses the
 appropriate struct definition and alignments, so that it can use, e.g.,
 \fIstab.st_mtime\fP and expect to obtain the right field.
 This struct cannot be copied directly to the EM memory to fulfill the
 MON instruction.
 First, the struct may contain extraneous, system-dependent fields,
 pertaining, e.g., to symbolic links, sockets, etc.
 Second, it may contain holes, due to alignment requirements.
 The EM program runs on an EM machine, knows nothing about these
 requirements and expects UNIX Version 7 fields, with offsets as
 determined by the em22, em24 or em44 compiler, resp.
 To do the conversion, the interpreter has a built-in table of the
 offsets of all the fields in the structs that are filled by the MON
 instruction.
 The appropriate fields from the result of the UNIX \fIstat()\fP are copied
 one by one to the appropriate positions in the EM memory to be filled
 by MON 18.
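 .LP
 The field-by-field copy can be pictured as follows.
 This is a sketch under stated assumptions: \fIstruct host_stat\fP is a
 made-up stand-in for the host's stat structure, the offsets in
 \fIem_layout\fP are invented for illustration, and EM memory is taken to
 hold the low-order byte at the lowest address.

```c
#include <stddef.h>

/* Hypothetical host structure, with an extra field unknown to EM. */
struct host_stat {
    unsigned short st_mode;
    long           st_extra;    /* system-dependent, not copied */
    long           st_size;
    long           st_mtime;
};

/* Offset and size of one field in the EM program's view of the struct. */
struct em_field {
    size_t off;
    size_t size;
};

/* Assumed layout for illustration: mode, size, mtime, without holes. */
static const struct em_field em_layout[3] = {
    { 0, 2 }, { 2, 4 }, { 6, 4 }
};

/* Store the low 'size' bytes of val at mem, low-order byte first. */
static void put_bytes(unsigned char *mem, size_t size, unsigned long val)
{
    size_t i;

    for (i = 0; i < size; i++) {
        mem[i] = (unsigned char)(val & 0xFF);
        val >>= 8;
    }
}

/* Copy the relevant fields one by one to their EM positions. */
void stat_to_em(const struct host_stat *hs, unsigned char *em_mem)
{
    put_bytes(em_mem + em_layout[0].off, em_layout[0].size,
              (unsigned long)hs->st_mode);
    put_bytes(em_mem + em_layout[1].off, em_layout[1].size,
              (unsigned long)hs->st_size);
    put_bytes(em_mem + em_layout[2].off, em_layout[2].size,
              (unsigned long)hs->st_mtime);
}
```

 A different word/pointer size combination would then only require a
 different offset table, not different copying code.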
 .PP
 The \fIioctl()\fP call (MON 54) poses additional problems. Not only does it
 have a second argument which is a pointer to a struct, the type of
 which is dynamically determined, but its first argument is an opcode
 that varies considerably between the versions of UNIX.
 To solve the first problem, the interpreter examines the opcode (request) and
 treats the second argument accordingly.  The second problem can be solved by
 translating the UNIX Version 7 \fIioctl()\fP request codes to their proper
 values on the various systems.  This is, however, not always useful, since
 some EM run-time systems use the local request codes.  There is a compile-time
 flag, V7IOCTL, which, if defined, will restrict the \fIioctl()\fP call to the
 version 7 request codes and emulate them on the local system; otherwise the
 request codes of the local system will be used (as far as implemented).
 .PP
 Minor problems also showed up with the implementation of \fIexecve()\fP
 and \fIfork()\fP.
 \fIExecve()\fP expects three pointers on the stack.
 The first points to the name of the program to be executed,
 the second and third are the beginnings of the \fBargv\fP and \fBenvp\fP
 pointer arrays respectively.
 We cannot pass these pointers to the system call however, because
 the EM addresses to which they point do not correspond with UNIX
 addresses.
 Moreover, (it is not very likely to happen but) what if someone constructs
 a program holding the contents for one of these pointers in the stack?
 The stack is implemented upside down, so passing the pointer to
 \fIexecve()\fP causes trouble for this reason too.
 The only solution was to copy the pointer contents completely
 to fresh UNIX memory, constructing vectors which can be passed to the
 system call.
 Any impending memory fault while making these copies results in failure of the
 system call, with \fIerrno\fP set to EFAULT.
 .PP
 The implementation of the \fIfork()\fP call confronted us with problems
 concerning IO-channels.
 Checking messages (as well as logging) must be divided over different files.
 Otherwise, the messages of the different processes would be intermingled.
 This problem was solved by post-fixing the default message file
 \fBint.mess\fP (as well as the logging file \fBint.log\fP) with an
 automatically leveled number for every new forked process.
 Children of the original process do their diagnostics
 in files with postfix 1,2,3 etc.
 Second generation processes are assigned files numbered 11, 12, 21 etc.
 When 6 generations of processes exist at one moment, the seventh will
 get the same message file as the sixth, since the filename would
 otherwise become too long.
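 .LP
 One way to derive such postfixes is to append the child's sequence number to
 the postfix of its parent.
 The sketch below assumes sequence numbers 1 to 9 and a cap of six digits;
 both, like the name \fIchild_postfix\fP, are assumptions made for the
 illustration and are not taken from the interpreter.

```c
#include <stdio.h>
#include <string.h>

#define MAX_POSTFIX 6   /* assumed cap on the number of postfix digits */

/*
 * Build the postfix of a forked child from the postfix of its parent
 * ("" for the original process) and the child's sequence number (1..9).
 * Once the cap is reached, the child reuses the postfix of its parent.
 */
void child_postfix(const char *parent, int child_no,
                   char *out, size_t out_size)
{
    if (strlen(parent) >= MAX_POSTFIX)
        snprintf(out, out_size, "%s", parent);
    else
        snprintf(out, out_size, "%s%d", parent, child_no);
}
```

 The resulting string would be appended to \fBint.mess\fP and \fBint.log\fP
 to form the file names of the new process.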
 .PP
 Some of the monitor calls receive pointers (addresses) from the program, to be
 passed to the kernel; examples are the struct stat for \fIstat()\fP, the area
 to be filled for \fIread()\fP, etc. If the address is wrong, the kernel does
 not generate a trap, but rather the system call returns with failure, while
 \fIerrno\fP is set to EFAULT.  This is implemented by consistent checking of
 all pointers in the MON instruction.
 .NH 2
 Internal arithmetic.
 .PP
 Doing arithmetic on signed integers, the smallest negative integer
 (\fIminsint\fP) is considered a legal value.
 This is in contradiction with the EM Manual [1], page 14, which proposes using
 \fIminsint\fP for uninitialized integers.
 The shadow bytes already check for uninitialized integers however,
 so we do not need this special illegal value.
 Although the EM Manual provides two traps, for undefined integers and floats,
 undefined objects occur so frequently (e.g. in block copying partially
 initialized areas) that the interpreter just gives a warning.
 .LP
 Except for arithmetic on unsigneds, all arithmetic checks for overflow.
 The value that is pushed on the stack after an overflow occurs depends
 on the UNIX behavior with regard to that particular calculation.
 If UNIX would not accept the calculation (e.g. division by zero), a zero
 is pushed as a convention.
 Illegal computations which UNIX does accept in silence (e.g. one's
 complement of \fIminsint\fP), simply push the UNIX-result after giving a
 trap message.
 .NH 2
 Shadow bytes implementation.
 .PP
 A great deal of run-time checking is performed by the interpreter (except if
 used in the fast version).
 This section gives all details about the shadow bytes.
 In order to keep track of information about the contents of D-space (stack
 and global data area), there is one shadow-byte for each byte in these spaces.
 Each bit in a shadow-byte represents some piece
 of information about the contents of its corresponding 'sun-byte'.
 All bits off indicates an undefined sun-byte.
 One or more bits on always guarantees a well-defined sun-byte.
 The bits have the following meaning:
 .IP "\(bu bit 0:" 8
 indicates that the sun-byte is (a part of) an integer.
 .IP "\(bu bit 1:" 8
 the sun-byte is a part of a floating point number.
 .IP "\(bu bit 2:" 8
 the sun-byte is a part of a pointer in dataspace.
 .IP "\(bu bit 3:" 8
 the sun-byte is a part of a pointer in the instruction space.
 According to [1] (paragraph 6.4), there are two types of pointers which
 must be distinguishable.
 Conversion between these two types is impossible.
 The shadow-bytes make the distinction here.
 .IP "\(bu bit 4:" 8
 protection bit.
 Indicates that the sun-byte is part of a protected piece of memory.
 There is a protected area in the stack, the Return Status Block.
 The EM machine language has no possibility to declare protected
 memory, as is possible in EM assembly (the ROM instruction).  The protection
 bit is, however, set for the line number and filename pointer area near
 location 0, to aid in catching references to location 0.
 .IP "\(bu bit 5/6/7:" 8
 free for later use.
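 .LP
 In C, the bit assignment listed above could be written down as a set of
 masks.
 The macro and function names below are hypothetical and need not match those
 in \fIshadow.h\fP; only the bit positions follow the list.

```c
typedef unsigned char shadow_t;

#define SH_INT    0x01  /* bit 0: (part of) an integer */
#define SH_FLOAT  0x02  /* bit 1: part of a floating point number */
#define SH_DATAP  0x04  /* bit 2: part of a pointer in dataspace */
#define SH_INSP   0x08  /* bit 3: part of a pointer in instruction space */
#define SH_PROT   0x10  /* bit 4: part of a protected piece of memory */

/* All bits off means the sun-byte is undefined. */
int sh_is_defined(shadow_t sh)
{
    return sh != 0;
}

/* A store into a protected sun-byte should draw a warning. */
int sh_is_protected(shadow_t sh)
{
    return (sh & SH_PROT) != 0;
}

/* The two pointer types are kept apart by separate bits. */
int sh_is_data_pointer(shadow_t sh)
{
    return (sh & SH_DATAP) != 0;
}
```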
 .LP
 The shadow bytes are managed by the routines declared in \fIshadow.h\fP.
 The warnings originating from checking these shadow-bytes during
 run-time are various.
 A list of them is given in appendix A, together with suggestions
 (primarily for the C-programmer) where to look for the trouble maker(s).
 .LP
 Note that once a warning is generated, it may be repeated
 thousands of times.
 Since repetitive warnings carry little information, but consume much
 file space, the interpreter keeps track of the number of times a given warning
 has been produced from a given line in a given file.
 The warning message will
 be printed only if the corresponding counter is a power of four (starting at
 1).  In this way, a logarithmic back-off in warning generation is established.
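 .LP
 The power-of-four test is cheap.
 A minimal sketch, in which the name \fIshould_print\fP is hypothetical and
 the counter is assumed to have been incremented before the test:

```c
/*
 * Return non-zero if a warning that has now occurred 'count' times
 * should be printed, i.e. if count is 1, 4, 16, 64, ...
 */
int should_print(unsigned long count)
{
    if (count == 0)
        return 0;
    while (count % 4 == 0)
        count /= 4;         /* strip factors of four */
    return count == 1;      /* nothing else may remain */
}
```

 A warning that occurs a million times from one source line is then printed
 only ten times.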
 .LP
 It might be argued that the counter should be kept for each (warning, PC
 value) pair rather than for each (warning, file position) pair.  Suppose,
 however, that two instructions in a given line would cause the same message
 regularly; this would produce two intertwined streams of identical messages,
 with their counters jumping up and down.  This does not seem desirable.
 .NH 2
 Return Status Block (RSB)
 .PP
 According to the description in [1], at least the return address and the
 base address of the previous RSB have to be pushed when performing a call.
 Besides these two pointers, other information can be stored in the RSB
 also.
 The interpreter pushes the following items:
 .IP \-
 a pointer to the current filename,
 .IP \-
 the current line number (always four bytes),
 .IP \-
 the Local Base,
 .IP \-
 the return address (Program Counter),
 .IP \-
 the current procedure identifier,
 .IP \-
 the RSB code, which distinguishes between initial start-up, normal call,
 returnable trap and non-returnable trap (a word-size integer).
 .LP
 Consequently, the size of the RSB varies, depending on
 word size and pointer size; its value is available as \fIrsbsize\fP.
 When the RSB is removed from the stack (by a RET or RTT) the RSB code is under
 the Stack Pointer for immediate checking.  It is not clear what should be done
 if RSB code and return instruction do not match; at present we give a message
 and continue, for what it is worth.
 .PP
 The reason for pushing filename and line number is that some front-ends tend
 to forget the LIN and FIL instructions after returning from a function.
 This may result in error messages in wrong source files and/or line numbers.
 .PP
 The procedure identifier is kept and restored to check that the PC will not
 move out of the running procedure.  The PI is an index in the proctab, which
 tells the limits in the text segment of the running procedure.
 .PP
 If the Return Status Block is generated as a result of a trap, more is
 stacked.  Before stacking the normal RSB, the trap function pushes the
 following items:
 .IP \-
 the contents of the entire Function Return Area,
 .IP \-
 the number of bytes significant in the above (a word-size integer),
 .IP \-
 a word-size flag indicating if the contents of the FRA are valid,
 .IP \-
 the trap number (a word-size integer).
 .LP
 The latter is followed directly by the RSB, and consequently acts as the only
 parameter to the trap handler.
 .NH 2
 Operand access.
 .PP
 The EM Manual mentions two ways to access the operands of an instruction.  It
 should be noted that the operand in EM is often not the direct operand of the
 operation; the operand of the ADI instruction, e.g., is the width of the
 integers to be added, not one of the integers themselves.  The various operand
 types are described in [1].  Each opcode in the text segment identifies an
 instruction with a particular operand type; these relations are described in
 computer-readable format in a file in the EM tree, \fIip_spec.t\fP.
 .PP
 The interpreter uses a variant of the second method.  Several other approaches
 can be designed, with increasing efficiency and equally increasing complexity.
 They are briefly treated below.
 .NH 3
 The Dispatch Table, Method 1.
 .PP
 When the interpreter starts, it reads the ip_spec.t file and constructs from it
 a dispatch table.  This table (of which there are actually three,
 for primary, secondary
 and tertiary opcodes) has 256 entries, each describing an instruction with
 indications on how to decode the operand.  For each instruction executed, the
 interpreter finds the entry in the dispatch table, finds information there on
 how to access the operand, constructs the operand and calls the appropriate
 routine with the operand as calculated.  There is one routine for each
 instruction, which is called with the ready-made operand.  Method 1 is easy to
 program but requires constant interpretation of the dispatch table.
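 .LP
 A minimal sketch of Method 1 in C is given here for illustration.
 The operand kinds, the opcode numbers, the two toy routines and the
 little-endian operand layout are all invented for the example; in the scheme
 described above the table would be built from ip_spec.t at start-up rather
 than written out by hand.
 .DS
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operand encodings; the real ones are listed in ip_spec.t. */
enum opkind { OP_NONE, OP_MINI, OP_BYTE, OP_WORD2 };

struct dispatch {
    enum opkind kind;            /* how to decode the operand */
    long mini;                   /* operand implied by the opcode (OP_MINI) */
    void (*routine)(long arg);   /* one routine per instruction */
};

static long acc;                 /* toy machine state: a single accumulator */

static void do_loc(long arg) { acc = arg; }    /* toy "load constant" */
static void do_adc(long arg) { acc += arg; }   /* toy "add constant" */

/* Primary table only; opcodes 0..2 are made up, all others mean "halt". */
static const struct dispatch table[256] = {
    [0] = { OP_MINI,  7, do_loc },
    [1] = { OP_BYTE,  0, do_loc },
    [2] = { OP_WORD2, 0, do_adc },
};

/* Runs the toy program in text until a halt opcode; returns the accumulator. */
long run(const unsigned char *text)
{
    size_t pc = 0;

    assert(text != NULL);
    acc = 0;
    for (;;) {
        const struct dispatch *d = &table[text[pc++]];
        long arg = 0;

        if (d->routine == NULL)
            return acc;
        switch (d->kind) {       /* the table is interpreted on every step */
        case OP_NONE:
            break;
        case OP_MINI:
            arg = d->mini;
            break;
        case OP_BYTE:
            arg = text[pc];
            pc += 1;
            break;
        case OP_WORD2:           /* assumed little-endian for this sketch */
            arg = text[pc] | (text[pc + 1] << 8);
            pc += 2;
            break;
        }
        d->routine(arg);         /* routine receives the ready-made operand */
    }
}
```
 .DE
 .LP
 The switch on \fIkind\fP inside the loop is the "constant interpretation of
 the dispatch table" that makes this method simple but slow.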
 .NH 3
 Intelligent Routines, Method 2.
 .PP
 For each opcode there is a separate routine, and since an opcode uniquely
 defines the instruction and the operand format, the routine knows how to get
 the operand; this knowledge is built into the routine.  Preferably the heading
 of the routine is generated automatically from the ip_spec.t file.  Operand
 decoding is immediate, and no dispatch table is needed.  Generation of the
 469 required routines is, however, far from simple.  Either a generated array
 of routine names or a generated switch statement is used to map the opcode onto
 the correct routine.  The switch approach has the advantage that parameters can
 be passed to the routines.
 .LP
 The interpreter uses a variant of the switch statement scheme.  Numerical
 information that can be deduced from the opcode is passed as parameters to the
 routine; this includes the argument of minis, the high order byte of shorties,
 and the fact that the result is to be multiplied by the word size.  This
 reduces the number of required routines to 338.
 .NH 3
 Intelligent Calls.
 .PP
 The call in the switch statement does full operand construction, and the
 resulting operand is passed to the routine.  This reduces the number of
 routines to 133, the number of EM instructions.  Generation of the switch
 statement from ip_spec.t will be complicated, but the routine space will be
 much cleaner.  This will not give any speed-up since the same actions are still
 required; they are just performed in a different place.
 .NH 3
 Static Evaluation.
 .PP
 It can be observed that the evaluation of the operand of a given instruction in
 the text segment will always give the same result.  It is therefore possible to
 preprocess the text segment, decomposing the instructions into structs which
 contain the address, the instruction code and the operand.  No operand decoding
 will be necessary at run-time: all operands have been precalculated.  This will
 probably give a considerable speed-up.  Jumps, especially GTO jumps, will,
 however, require more attention.
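 .LP
 The preprocessing step could look roughly like the sketch below.
 The encoding is a toy one invented for the example (opcodes below 0x80 have
 no operand, the others are followed by one operand byte), and the names
 \fIstruct insn\fP and \fIpredecode()\fP are hypothetical.
 .DS
```c
#include <assert.h>
#include <stddef.h>

struct insn {
    size_t addr;        /* address in the original text segment */
    unsigned char op;   /* instruction code */
    long operand;       /* precalculated operand */
};

/*
 * Decomposes text[0..len) into at most max structs and returns the number
 * of complete instructions found.
 * Toy encoding (assumption): op < 0x80 has no operand bytes,
 * op >= 0x80 is followed by a single operand byte.
 */
size_t predecode(const unsigned char *text, size_t len,
                 struct insn *out, size_t max)
{
    size_t pc = 0, n = 0;

    assert(text != NULL && out != NULL);
    while (pc < len && n < max) {
        struct insn *i = &out[n];

        i->addr = pc;
        i->op = text[pc++];
        i->operand = 0;
        if (i->op >= 0x80) {
            if (pc >= len)
                break;          /* truncated instruction: do not count it */
            i->operand = text[pc++];
        }
        n++;
    }
    return n;
}
```
 .DE
 .LP
 The \fIaddr\fP field is kept because jump targets are text segment addresses,
 which would have to be mapped onto indexes in the array of structs.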
 .NH 2
 Disassembly.
 .PP
 A disassembly facility is available, which gives a readable but not
 letter-perfect disassembly of the EM object.  The procedure structure is
 indicated by placing the indication  \fBP[n]\fP  at the entry point of each
 procedure, where \fBn\fP is the procedure identifier.  The number of locals is
 given in a comment.
 .LP
 The disassembler was generated by the software in the directory \fIswitch\fP
 and then further processed by hand.
--- a/doc/int/txt3
+++ b/doc/int/txt3
@ -0,0 +1,181 @@
 .\"	Logging
 .\"
 .\"	$Header$
 .bp
 .NH
 THE LOGGING MACHINE.
 .PP
 Since messages and warnings provided by \fBint\fP include source code file
 names and line numbers, they alone often suffice to identify the error.
 If, however, the necessity arises, much more extensive debugging information
 can be obtained by activating the Logging Machine.
 This Logging Machine, which monitors all actions of the EM machine, is the
 subject of this chapter.
 .NH 2
 Implementation.
 .PP
 When inspecting the source code of \fBint\fP, many lines in the
 following format will show up:
 .DS
 LOG(("@<\fIletter\fP><\fIdigit\fP> message", args));
 .DE
 or
 .DS
 LOG(("\ <\fIletter\fP><\fIdigit\fP> message", args));
 .DE
 The double parentheses are needed, because \fILOG()\fP is
 declared as a define, and has a printf-like argument structure.
 .PP
 The <\fIletter\fP> classifies the log message and corresponds to an entry in
 the \fIlogmask\fP, which holds a threshold for each class of messages.
 The following classes exist:
 .TS
 tab(@);
 l l l.
 \(bu  A\-Z@the flow of instructions:
@A:    array
@B:    branch
@C:    convert
@F:    floating point arithmetic
@I:    integer arithmetic
@L:    load
@M:    miscellaneous
@P:    procedure call
@R:    pointer arithmetic
@S:    store
@T:    compare
@U:    unsigned arithmetic
@X:    logical
@Y:    sets
@Z:    increment/decrement/zero
 \(bu  d@stack dumping.
 \(bu  g@gda & heap manipulation.
 \(bu  s@stack manipulation.
 \(bu  r@reading the loadfile.
 \(bu  q@floating point calculations during reading the loadfile.
 \(bu  x@the instruction count, contents and file position.
 \(bu  m@monitor calls.
 \(bu  p@procedure calls and returns.
 \(bu  t@traps.
 \(bu  w@warnings.
 .TE
 .LP
 When the interpreter reaches a LOG(()) statement it scans its first argument;
 if \fIletter\fP
 occurs in the logmask, and if \fIdigit\fP is lower than or equal to the
 threshold in the logmask, the message is given.
 Depending on the first character, the message will be preceded by a
 position indication (with the @) or will be printed as is (with the
 space).
 The \fIletter\fP determines the message class
 and the \fIdigit\fP is used to distinguish various levels
 of logging, with a lower digit indicating a more important message.
 We will call the <\fIletter\fP><\fIdigit\fP> combination the \fBid\fP of
 the logging.
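 .LP
 The double parentheses and the threshold test can be sketched as follows.
 This is an illustration under assumptions, not the code of \fBint\fP: the
 names \fIdo_log()\fP, \fIlog_set()\fP and \fIlast_line\fP are hypothetical,
 the logmask is represented here as an array holding threshold + 1 per letter,
 and the position indication requested by the @ is left out.
 .DS
```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

int logging = 1;        /* global guard, tested before anything else */
int logmask[256];       /* threshold + 1 per class letter; 0 = class off */
char last_line[256];    /* sketch only: last message that passed the mask */

/* Enables class letter for all digits up to and including threshold. */
void log_set(int letter, int threshold)
{
    assert(threshold >= 0 && threshold <= 9);
    logmask[(unsigned char)letter] = threshold + 1;
}

/* Hypothetical back end of LOG(()); fmt starts with '@' or ' ', then the id. */
void do_log(const char *fmt, ...)
{
    va_list ap;
    int letter, digit;

    assert(fmt != NULL && strlen(fmt) >= 3);
    letter = (unsigned char)fmt[1];
    digit = fmt[2] - '0';
    if (digit >= logmask[letter])
        return;         /* class absent, or digit above the threshold */
    va_start(ap, fmt);
    /* The position indication requested by '@' is omitted in this sketch. */
    vsnprintf(last_line, sizeof last_line, fmt + 1, ap);
    va_end(ap);
}

/* The inner parentheses of LOG((...)) become the argument list of do_log. */
#define LOG(a) do { if (logging) do_log a; } while (0)
```
 .DE
 .LP
 With the class m enabled up to level 5, a call such as
 LOG(("@m5 failed, errno %d", 2)) passes the test, whereas
 LOG((" m9 succeeded")) is discarded.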
 .LP
 In general, the lower the \fIdigit\fP following the \fIletter\fP,
 the more important the message.
 E.g. m5 reports about unsuccessful monitor calls only, m9 also reports
 about successful monitor calls (which are obviously less interesting).
 New logging messages can be added to the source code in places you
 consider relevant.
 .LP
 Reasonable settings for the logmask are:
 .TS
 tab(@);
 l l l.
  @A\-Z9d4twx9@advised setting when troubleshooting (default).
  @A\-Zx9@shows the flow of instructions & global information.
  @pm9@shows the procedure & monitor calls.
  @tw9@shows warning & trap information.
 .TE
 .PP
 An EM interpreter without a Logging Machine can be obtained by undefining the
 macro \fICHECKING\fP in the file \fIchecking.h\fP.
 .NH 2
 Controlling the Logging machine.
 .PP
 The actions of the Logging Machine are controlled by a set of internal
 variables (one of which is the log mask).
 These variables can be set through assignments on the command line, as
 explained in the manual page \fIint.1\fP, q.v.
 Since there are a great many logging statements in the program, of which only a
 few will be executed in any call of the interpreter, it is important to be able
 to decide quickly if a given \fIid\fP has to be checked at all.
 To this end all logging statements are guarded (in the #define) by a test for
 the boolean variable \fIlogging\fP.
 This variable will only be set if the command line assignments show the
 potential need for logging (\fImust_log\fP) and the instruction count
 (\fIinr\fP) is at least equal to \fIlog_start\fP (which derives from the
 parameter \fBLOG\fP).
 .LP
 The log mask can be set by the assignment
 .DS
 "LOGMASK=\fIlogstring\fP"
 .DE
 which sets the current logmask to \fIlogstring\fP.
 A logstring has the following form:
 .DS
 [ [ \fIletter\fP | \fIletter\fP \- \fIletter\fP ]+ \fIdigit\fP ]+
 .DE
 E.g. LOGMASK=A\-D8x9R7c0hi4 will print all messages belonging to loggings
 with \fBid\fPs:
 \fIA0..A8,B0..B8,C0..C8,D0..D8,x0..x9,R0..R7,c0,h0..h4,i0..i4\fP.
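 .LP
 The grammar of the logstring is small enough to be parsed in one pass.
 The sketch below is one possible reading of it, not the parser used by
 \fBint\fP; the function name \fIparse_logmask()\fP, the value NO_LOG and the
 representation of the logmask as an array of 256 thresholds are assumptions
 made for the example.
 .DS
```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define NO_LOG (-1)     /* class does not occur in the logmask */

/*
 * Parses a logstring of the form [ [ letter | letter-letter ]+ digit ]+
 * into thr[], one threshold per class character.
 * Returns 0 on success and -1 on a syntax error.
 */
int parse_logmask(const char *s, int thr[256])
{
    int c;

    assert(s != NULL && thr != NULL);
    for (c = 0; c < 256; c++)
        thr[c] = NO_LOG;
    if (*s == '\0')
        return -1;                      /* at least one group is required */

    while (*s != '\0') {
        unsigned char in_group[256];    /* classes waiting for their digit */
        int nletters = 0;

        memset(in_group, 0, sizeof in_group);
        while (isalpha((unsigned char)*s)) {
            int lo = (unsigned char)*s++;
            int hi = lo;

            if (s[0] == '-' && isalpha((unsigned char)s[1])) {
                hi = (unsigned char)s[1];   /* letter-letter range */
                s += 2;
            }
            if (hi < lo)
                return -1;
            for (c = lo; c <= hi; c++)
                in_group[c] = 1;
            nletters++;
        }
        if (nletters == 0 || !isdigit((unsigned char)*s))
            return -1;
        for (c = 0; c < 256; c++)
            if (in_group[c])
                thr[c] = *s - '0';      /* the digit closes the group */
        s++;
    }
    return 0;
}
```
 .DE
 .LP
 Applied to the example above, this gives a threshold of 8 for A through D,
 9 for x, 7 for R, 0 for c and 4 for both h and i.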
 .PP
 The logging variable STOP can be used to prevent run-away logging
 past the point where the user expects an error to occur.
 STOP=\fInr\fP will stop the interpreter after instruction number \fInr\fP.
 .PP
 To simplify the use of the logging machine, a number of abbreviations have been
 defined.
 E.g., AT=\fInr\fP can be thought of as an abbreviation of LOG=\fInr\-1\fP
 STOP=\fInr+1\fP; this causes three stack dumps, one before the suspect
 instruction, one on it and one after it; then the interpreter stops.
 .PP
 Logging results will appear in a special logging file (default: \fIint.log\fP).
 .NH 2
 Dumps.
 .PP
 There are three routines available to examine the memory contents:
 .TS
 tab(@);
 l l l.
  @\fIstd_all()\fP@dumps the contents of the stack (\fId1\fP or \fId2\fP must be in the logmask).
  @\fIgdad_all()\fP@dumps the contents of the gda (\fI+1\fP must be in the logmask).
  @\fIhpd_all()\fP@dumps the contents of the heap (\fI*1\fP must be in the logmask).
 .TE
 .LP
 These routines can be used everywhere in the program to examine the
 contents of memory.
 The gda and the heap are each dumped only once, after the instruction
 given by the corresponding internal variable.
 The stack is dumped after each
 instruction if the log mask contains d1 or d2; d2 gives a full formatted
 dump, d1 produces a listing of the Return Status Blocks only.
 An attempt is made to format the stack correctly, based on the shadow
 bytes, which identify the Return Status Block.
 .LP
 Remember to set the correct \fBid\fP in the LOGMASK, and to give
 LOG the correct value.
 If dumping is needed before the first instruction, then LOG must  be
 set to 0.
 .LP
 The dumps of the global data area and the heap are controlled internally by
 the id-s +1 and *1 resp.; the corresponding logmask entries are set
 automatically by setting the GDA and HEAP variables.
 .NH 2
 Forking.
 .PP
 As mentioned earlier, a call to \fIfork()\fP causes an image of the current
 program to start running.
 To prevent a messy logfile, the child process gets its own logfile
 (and message file, tally file, etc.).
 These logfiles are distinguished from the parent logfile by a
 postfix, e.g.,
 \fIlogfile_1\fP for the first child, \fIlogfile_2\fP for the second child,
 \fIlogfile_1_2\fP for the second child of the first child, etc.
 .br
 \fINote\fP: the implementation of this feature is shaky; it works for the log
 file but should also work for other files and for the names of the logging
 variables.
--- a/util/int/int.1
+++ b/util/int/int.1
@ -0,0 +1,200 @@
 .\"	Manual page
 .\"
 .\"	$Header$
 .TH INT I
 .ad
 .SH NAME
 int \- Interpreter for EM Machine Language
 .SH SYNOPSIS
 \fBint\fP [ intargs ] [ emfile [ emargs ] ]
 .SH DESCRIPTION
 This program interprets the EM machine-language, and replaces
 the EM interpreter written in Pascal that is described in [1].
 The program interprets load files in \fIe.out\fP format (see [1], sec. 10.3).
 .LP
 \fIEmfile\fP is the name of the load file; if no name is
 specified, the default name \fIe.out\fP is used.
 The program can handle several word size / pointer size combinations.
 The combinations presently supported are 2/2, 2/4 and 4/4.
 .LP
 \fIEmargs\fP are the arguments for the program being interpreted.
 If any arguments are given, then \fIemfile\fP must be present.
 .PP
 The interpreter can generate diagnostic messages (warnings) about the
 interpreted program.
 Some of these warnings are given very frequently,
 which may result in a large, non-functional message file.
 To avoid this behavior, counters keep track of the number of times
 a given warning occurs in a given file at a given line number.
 The warning is actually given only when this counter is a power of 4.
 `Logarithmic warning generation' is established in this way.
 .PP
 \fIInt\fP preempts the highest two file descriptors available, for
 diagnostic purposes.
 Interpreted programs can use the other file descriptors without
 clash problems.
 .PP
 .I "Interpreter parameters"
 .br
 \fIInt\fP itself accepts the following options, all given as separate flags:
 .IP \fB\-d\fP
 The program will not be run; a disassembly listing of the program will
 be written to standard output instead.
 The original names are lost, but the procedure structure is recovered.
 .IP \fB\-h\fP\fIN\fP
 The maximum size of the heap will be limited to \fIN\fP bytes.  This can be
 used to force a heap overflow trap.
 .IP \fB\-I\fP\fIN\fP
 It is possible to tell \fIint\fP to ignore traps in the range 0-15.
 If a trap is ignored, every time the trap would have happened
 a warning is generated instead.
 The argument \fIN\fP is the trap number, as described in [1], sec. 9.
 For ignoring more than one trap, several \fB\-I\fP flags are needed.
 .IP \fB\-m\fP\fIfile\fP
 The argument \fIfile\fP is the name of a file on which the messages will
 appear.
 The default file name is \fIint.mess\fP.
 .IP \fB\-r\fP\fIN\fP
 Determines the size of the Function Return Area.
 Default: 2 \(mu pointer size.
 .IP \fB\-s\fP\fIN\fP
 The maximum size of the stack will be limited to \fIN\fP bytes.  This can be
 used to force a stack overflow trap.
 .IP \fB\-t\fP
 If given, a file \fIint.tally\fP will be produced upon program termination.
 For each source file, it contains a list of line numbers visited,
 with the number of times the line was visited and
 the number of EM instructions executed on the line.
 .IP \fB\-W\fP\fIN\fP
 This option can be used to disable warnings.
 The argument \fIN\fP is the number of the warning to be suppressed,
 as found in the \fIint\fP documentation [3].
 For disabling more than one warning, several \fB\-W\fP flags are needed.
 .PP
 .I "The Logging Machine"
 .br
 The EM machine is monitored continually by a Logging Machine. This logging
 machine keeps an instruction count and
 can produce a trace of the actions of the EM machine, make readable
 dumps of the stack, heap and global data area, and stop the EM machine after a
 given instruction number.
 The actions of the logging machine are controlled by
 its internal variables, the values of which can be set by assignments on the
 command line, much like setting macro names in a call of \fImake\fP.
 These assignments can be interspersed with the options for the EM machine.
 .PP
 The logging machine has the following internal variables:
 .IP \fBLOG\fP=\fIN\fP
 Logging will start when the instruction count has reached \fIN\fP.
 .IP \fBLOGMASK\fP=\fIstring\fP
 The tracing actions are controlled by a log mask; the log mask consists of a
 list of pairs of action classes and logging levels.
 E.g. \fBLOGMASK\fP=\fIm9\fP means: trace all monitor calls.
 The action classes are described fully in [3].
 The default log mask is reasonably suitable.
 .IP \fBLOGFILE\fP=\fIstring\fP
 The \fIstring\fP is the name of a file on which all logging information is
 written.
 The default file name is \fIint.log\fP.
 .IP \fBSTOP\fP=\fIN\fP
 The logging machine stops the EM machine after instruction \fIN\fP.
 .PP
 Stack dumps can be made after each instruction; they are controlled by the pair
 \fBd4\fP in the log mask; gda and heap dumps can only be made after a specific
 instruction.
 The following internal variables pertain to memory dumps:
 .IP \fBGDA\fP=\fIN\fP
 The contents of the Global Data Area are dumped after instruction \fIN\fP.  The
 extent can be adjusted by setting \fBGMIN\fP=\fINmin\fP (default 0) and
 \fBGMAX\fP=\fINmax\fP (default HB).
 .IP \fBHEAP\fP=\fIN\fP
 The contents of the heap are dumped after instruction \fIN\fP.
 .IP \fBSTDSIZE\fP=\fIN\fP
 The stack dump is restricted to the \fIN\fP topmost bytes.
 .IP \fBRAWSTACK\fP=\fIN\fP
 Normally the stack dump produced is divided into activation records
 separated by formatted dumps of the Return Status Blocks.
 If \fIN\fP is non-zero, this dividing and formatting is suppressed, and the
 stack is dumped raw.
 .PP
 Some combinations of variable settings are generally useful and can be
 abbreviated:
 .IP \fBAT\fP=\fIN\fP
 Is an abbreviation of \fBLOG\fP=\fIN\-1\fP \fBSTOP\fP=\fIN+1\fP.
 The default log mask applies.
 .IP \fBL\fP=\fIstring\fP
 Is an abbreviation of \fBLOG\fP=\fI0\fP \fBLOGMASK\fP=\fIstring\fP.
 E.g., \fBL\fP=\fIm9\fP will log all monitor calls
 and \fBL\fP=\fIA\-Z9\fP will log all instructions (give a full trace).
 .PP
 When the interpreter forks, the child continues logging on a new file named
 \fIint.log_1\fP, etc.
 In principle it reevaluates the interpreter arguments, now looking for
 \fBLOG_1\fP, \fBLOGMASK_1\fP, etc., but this feature has not been fully
 implemented.
 .PP
 .I "Diagnostics"
 .br
 All diagnostics are written to the message file.
 Diagnostics come in three flavors:
 .IP \-
 (messages): These inform you about NOP instructions, give more information
 about incoming signals and display the exit status of the program.
 .IP \-
 (warnings): These are generated as a result of the checking.
 In most cases the diagnostic is self-explanatory.
 A complete description of the warnings can be found in the \fIint\fP
 documentation [3].
 .IP \-
 (fatal errors): This diagnostic is the result of an irrecoverable
 error, generally before the program has started: incorrect call of the
 interpreter, cannot access file, incorrect format of load file.  A few follow
 during interpretation: out of memory, uncaught traps, floating point operation
 on a version without floating point;
 execution stops immediately after the diagnostic is generated.
 .PP
 Further diagnostics are generated (on \fIstderr\fP) if files cannot
 be opened or found.
 .SH "SEE ALSO"
 e.out(5), ack(1), em22(1), em24(1), em44(1).
 .IP [1]
 Andrew S. Tanenbaum, Hans van Staveren, Ed G. Keizer and Johan W. Stevenson,
 \fIDescription of a Machine Architecture for use with Block
 Structured Languages\fP, Informatica rapport IR-81.
 .IP [2]
 Amsterdam Compiler Kit, reference manual and UNIX manual pages.
 .IP [3]
 Eddo de Groot, Leo van den Berge, Dick Grune,
 \fIThe EM Interpreter\fP.
 .SH "FILES"
 .ta 20n
 int.mess	contains messages
 .br
 int.log	contains logging info, if requested
 .br
 int.tally	contains tally results, if requested
 .br
 int.core	produced upon fatal error; format provisional
 .SH "BUGS"
 The monitor calls
 .IR mpxcall ,
 .I ptrace
 and
 .I profile
 have not been implemented.
 .br
 The maximum number of bytes for rotation is 4.
 .br
 The UNIX V7 struct tchars is not emulated under System V.
 .br
 The P and N restrictions on operands are not checked.
 .br
 The start-up has a quadratic component in the number of procedures in the EM
 program.
 .SH "AUTHORS"
 L.J.A. van den Berge.
 .br
 E.J. de Groot.
 .br
 D. Grune