Initial entry

dick 1988-06-22 21:48:19 +00:00
parent b72f2848dd
commit 6214be89c8
10 changed files with 1837 additions and 0 deletions

16
doc/int/Makefile Executable file

@ -0,0 +1,16 @@
# $Header$
TBL=/usr/ditroff/tbl
DOC = draw.mac cover txt1 txt2 txt3 appA appB bib
int.doc: $(DOC)
	$(TBL) $(DOC) > $@
FLS = README .distr Makefile int.1 $(DOC)
.distr: Makefile
	echo $(FLS) | tr ' ' '\012' >.distr
clean:
	rm -f int.doc

4
doc/int/README Normal file

@ -0,0 +1,4 @@
# $Header$
This directory contains the text of the documentation for the
Production Quality Interpreter "int".

280
doc/int/appA Normal file

@ -0,0 +1,280 @@
.\" List of all warnings; source of warn_msg and warn.h
.\"
.\" $Header$
.\"
.\" This file contains the warnings issued by the interpreter, together
.\" with their names and values in the code of the interpreter. Some of
.\" the source files of the interpreter are generated from the Wn
.\" macros in this file.
.\" When modifying this file, preserve the parameters of the Wn macros.
.de Wn \" <text> <define> <value>
.IP \\$3. 7
.B "\\$1"
.br
.. Wn
.bp
.DS C
APPENDIX A
.DE
.SH
List of Warnings.
.PP
The shadow-byte administration makes it possible to check for a
wide range of errors during run-time.
We have tried to make the diagnostics self-explanatory and especially useful
for the C-programmer.
The warnings are printed in the message file, together with source file
and line number.
The complete list of warnings is presented here, followed by an
explanation of what might be wrong.
Often, these explanations implicitly assume that the program
being interpreted was originally written in C (and not Pascal, Basic etc.).
.LP
.I "Reading the load file"
.Wn "Floating point instructions flag in header ignored" WFLUSED 1
.Wn "No float initialisation in this version" WFLINIT 2
The interpreter was compiled with the NOFLOAT option; code involving
floating point operations can be run as long as the actual
instructions are avoided.
.Wn "Extra-test flag in header ignored" WEXTRIGN 4
The interpreter already tests anything conceivable.
.Wn "Maximum line number in header was 0" WNLINEZR 5
This number could be used to allocate tables for tallying; these tables are,
however, expanded as needed, so the number is immaterial.
.Wn "Bad float initialisation" WBADFLOAT 7
The loadfile contains a floating point denotation which does not
satisfy the syntax (see 2.6).
Examining the loadfile (with \fBod \-c\fP) might show the syntax error.
Probably there is a bug in the front-end, creating floats with
a bad syntax.
.LP
.I "System calls"
.Wn "IOCTL \- bad or unimplemented request" WBADIOCTL 11
The second parameter to the ioctl() request (the operation code) is invalid or
not implemented; since there are many different opcodes on the various UNIX
systems, it is difficult to tell which. The system call fails.
.Wn "MPXCALL \- not (yet) implemented" WMPXIMP 14
.Wn "PROFIL \- not (yet) implemented" WPROFILIMP 15
.Wn "PTRACE \- not (yet) implemented" WPTRACEIMP 16
The monitor calls \fImpxcall()\fP, \fIprofil()\fP and \fIptrace()\fP
have not been implemented. The monitor call fails.
.Wn "Inaccessible memory in system call" WMONFLT 21
Bad pointers passed to system calls do not cause a memory fault (which in UNIX
would happen to the kernel), but cause the system call to fail with the UNIX
variable errno set to 14 (EFAULT). It seems likely that your program is at
fault, but there is also a good possibility that a library routine made
unwarranted assumptions about word size and pointer size.
.Wn "READ \- buffer resides in unallocated memory" WRUMEM 23
.Wn "READ \- buffer across global data area and heap" WRGDAH 24
When the buffer passed to the read() system call is situated (completely
or partially) in unallocated memory (beyond \fIHP\fP) or begins
in the global data area and ends in the heap, the appropriate warning
is given.
The buffer is not written.
.Wn "WRITE \- buffer resides in unallocated memory" WWUMEM 25
.Wn "WRITE \- buffer across global data area and heap" WWGDAH 26
.Wn "WRITE \- (part of) global buffer is undefined" WWGUNDEF 27
.Wn "WRITE \- (part of) local buffer is undefined" WWLUNDEF 28
The first two are equivalent to the READ-errors above.
Writing out a buffer usually makes no sense when the contents are undefined,
so one of the latter two warnings will be generated in this case.
A global buffer resides in the data partition; a local buffer resides in
the stack partition.
This corresponds to global and local variables in a C-program.
In the first two cases the WRITE is not performed, in the latter two cases
it is.
.LP
.I "Traps and signals"
.Wn "SIGTRP \- bad signo argument" WILLSN 31
The \fIsigtrp()\fP monitor call allows \fIsig_no\fP arguments in the
range [1..17] (UNIX Version 7 signals); the actual argument is out of range.
.Wn "SIGTRP \- signo argument is a synchronous trap" WUNIXTR 32
The signal is one that can only be caused synchronously by the running program
on UNIX; it cannot occur to an interpreted program.
.Wn "SIGTRP \- bad trapno argument" WILLTN 33
The \fIsigtrp()\fP monitor call allows \fItrap_no\fP arguments between 0 and
252, and the special values \-2 and \-3; the actual argument is not one of
these.
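.br
The two argument checks described above can be sketched in C as follows.
This is only an illustration: the function names are hypothetical, and
treating the bounds 0 and 252 as inclusive is an assumption.

```c
/* Hypothetical helpers mirroring the documented argument ranges. */

/* sig_no must lie in the range [1..17] (UNIX Version 7 signals). */
int valid_sig_no(int sig_no)
{
    return sig_no >= 1 && sig_no <= 17;
}

/* trap_no must lie between 0 and 252 (assumed inclusive),
   or be one of the special values -2 and -3. */
int valid_trap_no(int trap_no)
{
    return (trap_no >= 0 && trap_no <= 252)
        || trap_no == -2
        || trap_no == -3;
}
```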
.Wn "Heap overflow due to command line limitation" WEHEAP 36
.Wn "Stack overflow due to command line limitation" WESTACK 37
The maximum sizes of the heap and the stack can be limited by options on the
command line. If overflow occurs due to such limitations, the corresponding
trap is taken, preceded by one of the above warnings. If the memory of the
interpreter itself is exhausted, a fatal error follows.
.LP
.I "Run-time type checking"
.Wn "Local character expected" WLCEXP 41
.Wn "Global character expected" WGCEXP 42
.Wn "Local integer expected" WLIEXP 43
.Wn "Global integer expected" WGIEXP 44
.Wn "Local float expected" WLFEXP 45
.Wn "Global float expected" WGFEXP 46
.Wn "Local data pointer expected" WLDPEXP 47
.Wn "Global data pointer expected" WGDPEXP 48
.Wn "Local instruction pointer expected" WLIPEXP 49
.Wn "Global instruction pointer expected" WGIPEXP 50
In general, a type violation has taken place when one of
these warnings is given.
The \fBfloat\fP- and \fBinstruction pointer\fP warnings are rare and will
usually be easily traceable.
\fBInteger/character expected\fP will normally occur when unsigned arithmetic
is performed on datapointers or when memory containing objects other than
integers is copied bytewise.
Often, this warning is followed by a warning \fBdatapointer expected\fP.
This is due to our decision to transform pointers to (unsigned) integers
after doing unsigned arithmetic on them.
When such a transformed integer is dereferenced (as if it were a pointer)
or, in general, when it is treated as a pointer, this results in a warning.
The present library implementation of malloc() causes such a
sequence of errors.
.LP
These messages are always followed by a tentative description of what is found
in memory at the offending place.
.Wn "Actual memory is undefined" WWASUND 61
.Wn "Actual memory contains an integer" WWASINT 62
.Wn "Actual memory contains a float" WWASFLOAT 63
.Wn "Actual memory contains a data pointer" WWASDATAP 64
.Wn "Actual memory contains an instruction pointer" WWASINSP 65
.Wn "Actual memory contains mixed information" WWASMISC 66
If the contents of the area were undefined,
check the source code for an uninitialized variable of the mentioned type.
Officially, the use of an undefined value
should result in an EIUND or EFUND trap but the occurrence is
so common that a warning is more appropriate.
The contents of memory are described as mixed if the data consists of pieces
of different types. This happens, e.g., when caller and callee do not agree on
the types and lengths of the parameters.
.LP
.I "Protection"
.br
.Wn "Destroying contents of ROM (at or near loc 0)" WDESROM 71
The program stores a value in Read-Only Memory; the only ROM in the present
implementation is the area near location 0. The warning probably results from
storing through a NULL pointer. This is only a warning; the store operation is
executed normally. Reads from location 0 are not detected.
.Wn "Destroying contents of Return Status Block" WDESRSB 72
The Return Status Block is the stack area containing the return address, the
dynamic link, etc.
This may or may not be an error.
The current implementation of \fIsetjmp()\fP/\fIlongjmp()\fP
may be responsible for it.
If your program does not use setjmp(), there \fIis\fP something
very wrong (e.g. argument for ASP too large).
Note that there are some library routines (such as \fIalarm()\fP) which
use \fIsetjmp()\fP.
.Wn "Logical operation using undefined operand(s)" WUNLOG 81
.Wn "Comparing undefined operand(s)" WUNCMP 82
The logical operations AND, XOR, IOR, COM and the compare operation
CMS do their jobs bytewise.
If one of the bytes is found to be undefined, the corresponding warning
is given, and the operation is stopped immediately.
The stack is adjusted so interpretation may continue.
.br
It is hard to say what went wrong.
Possibly, the argument of the instruction at hand (which indicates the
size of the objects to be compared), was too large.
.LP
.I "Bad operands"
.Wn "Shift over negative distance" WSHNEG 91
.Wn "Shift over too large distance" WSHLARGE 92
Shift instructions yield undefined results if the shift distance is negative
or larger than the object size.
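.br
A program can guard its own shifts with a test of the same shape.
The sketch below is not taken from the interpreter; the names are
hypothetical, and expressing the object size in bits is an assumption.

```c
/* Hypothetical classification of a shift distance. */
enum shift_check { SHIFT_OK, SHIFT_NEGATIVE, SHIFT_TOO_LARGE };

/* object_bits is the size of the shifted object, assumed to be in bits. */
enum shift_check check_shift(long distance, unsigned long object_bits)
{
    if (distance < 0)
        return SHIFT_NEGATIVE;      /* cf. "Shift over negative distance" */
    if ((unsigned long)distance > object_bits)
        return SHIFT_TOO_LARGE;     /* cf. "Shift over too large distance" */
    return SHIFT_OK;
}
```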
.Wn "Pointer arithmetic yields pointer to bad segment" WSEGADP 93
When doing pointer arithmetic (ADP, ADS), the operand and result pointer
must be in the same \fIsegment\fP (see sec. 4).
E.g. loading the address of the first local and adding 20 to it will
certainly give this warning.
.Wn "Subtracting pointers to different segments" WSEGSBS 94
Pointers may be subtracted only if they point into the same segment.
.Wn "Pointer arithmetic with NULL pointer" WNULLPA 96
By definition it is illegal to do arithmetic with null pointers.
Integers with the size of a pointer and the value zero are recognized
as NULL pointers.
A well-known C-trick to compute the offset of some field in a struct
is converting the null-pointer to the type of the struct and simply
taking the address of the field.
This trick will \-when translated and interpreted\- generate this warning
because it results in arithmetic with the NULL pointer.
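.br
One way to obtain the same offset without arithmetic on a null pointer is
to measure it on a real object. The sketch below assumes an ANSI C compiler
with <stddef.h>; the struct and the function names are made up for the
example. Whether the standard offsetof macro itself avoids the warning
depends on how the compiler's header defines it.

```c
#include <stddef.h>

/* Example record; the layout is chosen only for illustration. */
struct rec {
    char tag;
    long count;
    char name[8];
};

/* Offset measured on a real object: both pointers lie inside r,
   so no null pointer is involved. */
size_t count_offset_from_object(void)
{
    struct rec r;
    return (size_t)((char *)&r.count - (char *)&r);
}

/* The same offset as reported by the standard macro. */
size_t count_offset_from_macro(void)
{
    return offsetof(struct rec, count);
}
```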
.LP
.I "Return area"
.Wn "Returned function result too large" WRFUNLAR 101
.Wn "Returned function result too small" WRFUNSML 102
This warning is generated when the size of the expected return value
is not equal to the size actually returned.
.br
Your interpreted program may have fallen through the end of
the code without explicitly doing an \fIexit()\fP or \fIreturn()\fP.
The start-up routine (\fIcrt0()\fP) however always expects to get some
value returned by the program proper.
.br
Another (less probable) possibility of course is that the code contains
a subroutine or function call that does not return properly (e.g.
it returns a short instead of a long).
.Wn "Returned function result may be garbled" WRFUNGAR 103
This warning will be generated, when the contents of the FRA are fetched
after some instruction is executed which can mess up the area.
Compiler-generated loadfiles should not generate this message.
.LP
.I "Return Status Block"
.Wn "RET did not find a Return Status Block" WRETBAD 111
.Wn "Used RET to return from a trap" WRETTRAP 112
The RET instruction found a garbled Return Status Block, or one that resulted
from a trap.
.Wn "RTT did not find a Return Status Block" WRTTBAD 115
.Wn "RTT on empty stack" WRTTEMPTY 116
.Wn "Used RTT to return from a call" WRTTCALL 117
.Wn "Used RTT to return from a non-returnable trap" WRTTNRTT 118
The RTT (Return from Trap) instruction found a Return Status Block that was not
created properly by a trap.
.Wn "Stack Pointer too large in RET" WRETSTL 121
.Wn "Stack Pointer too small in RET" WRETSTS 122
.Wn "Stack Pointer too large in RTT" WRTTSTL 125
.Wn "Stack Pointer too small in RTT" WRTTSTS 126
According to the EM Manual (4.2), "the value of SP just after the return
value has been popped must be the same as the
value of SP just before executing the first instruction of the
invocation."
If the Stack Pointer is too large, some dynamically allocated item or some
temporary result may have been left behind on the stack.
If the Stack Pointer is too small, some locals have been unstacked.
Since the interpreter has enough information in the Return Status Block, it
recovers correctly from these errors.
.LP
.I "Traps"
.LP
Some traps have ambiguous or non-obvious causes.
As far as possible, these are preceded by a warning, explaining the
circumstances of the trap.
.Wn "Trap ESTACK: DCH on bad LB" WDCHBADLB 131
.Wn "Trap ESTACK: LPB on bad LB" WLPBBADLB 132
.Wn "Trap ESTACK: SP retracted over Return Status Block" WSPGTLB 133
.Wn "Trap ESTACK: SP moved into data area" WSPINHEAP 134
.Wn "Trap ESTACK: SP set to non-word-boundary" WSPODD 135
.Wn "Trap ESTACK: LB set out of stack" WLBOUT 136
.Wn "Trap ESTACK: LB set to non-word-boundary" WLBODD 137
.Wn "Trap ESTACK: LB set to position where there is no RSB" WLBRSB 138
.Wn "Trap EHEAP: HP retracted into Global Data Area" WHPGDA 141
.Wn "Trap EHEAP: HP pushed into stack" WHPSTACK 142
.Wn "Trap EHEAP: HP set to non-word-boundary" WHPODD 143
.Wn "Trap EILLINS: unknown opcode" WBADOPC 151
.Wn "Trap EILLINS: conversion with unacceptable size for this machine" WILLCONV 152
.Wn "Trap EILLINS: FIL with non-existing address" WILLFIL 153
.Wn "Trap EILLINS: LFR with too large size" WILLLFR 154
.Wn "Trap EILLINS: RET with too large size" WILLRET 155
.Wn "Trap EILLINS: instruction argument of class c does not fit a word" WARGC 156
.Wn "Trap EILLINS: instruction on double word on machine with word size 4" WARGD 157
.Wn "Trap EILLINS: local offset too large" WARGL 158
.Wn "Trap EILLINS: instruction argument of class g not in GDA" WARGG 159
.Wn "Trap EILLINS: fragment offset too large" WARGF 160
.Wn "Trap EILLINS: counter in lexical instruction out of range" WARGN 161
.Wn "Trap EILLINS: non-existent procedure identifier" WARGP 162
.Wn "Trap EILLINS: illegal register number" WARGR 163
.Wn "Trap EBADPC: jump out of text segment" WPCOVFL 172
.Wn "Trap EBADPC: jump out of procedure fragment" WPCPROC 173
.Wn "Trap EBADGTO: GTO does not restore an existing RSB" WGTORSB 181
.Wn "Trap EBADGTO: GTO descriptor on the stack" WGTOSTACK 182
.Wn "Trap caused by TRP instruction" WTRP 191
.ig
.Wn "Last warning" WMSG 199
!Leave these lines here!
..

486
doc/int/appB Normal file

@ -0,0 +1,486 @@
.\" A simple tutorial
.\"
.\" $Header$
.\"
.bp
.DS
APPENDIX B
.DE
.SH
How to use the interpreter
.PP
The interpreter is not normally used for the debugging of programs under
construction. Its primary application is as a verification tool for almost
completed programs. Although the proper operation of the interpreter is
obviously a black art, this chapter tries to provide some guidelines.
.LP
For the sake of the argument, the source language is assumed to be C, but most
hints apply equally well to other languages supported by ACK.
.sp
.LP
.I "Initial measures"
.PP
Start with a test case of trivial size; to be on the safe side, reckon with a
time dilation factor of about 500, i.e., a second grows into roughly 8 minutes.
(The interpreter takes 0.5 msec to do one EM instruction on a Sun 3/50.)
Fortunately many trivial test cases are much shorter than one second.
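.LP
As a rough aid for sizing test cases, the figures above can be turned into
a small estimate. The following C sketch is illustrative only: the names
are not part of the interpreter, and the constants are the approximate
values quoted above for a Sun 3/50.

```c
/* Approximate figures quoted in the text; both are rough estimates. */
#define SLOWDOWN_FACTOR  500.0    /* interpreted time / native time */
#define SEC_PER_EM_INSTR 0.0005   /* 0.5 msec per EM instruction */

/* Estimated interpreted run time, given the native run time in seconds. */
double interpreted_seconds(double native_seconds)
{
    return native_seconds * SLOWDOWN_FACTOR;
}

/* Estimated run time for a given number of EM instructions. */
double seconds_for_instructions(unsigned long em_instructions)
{
    return (double)em_instructions * SEC_PER_EM_INSTR;
}
```

Under these assumptions a run of 4124 EM instructions would be estimated
at about 2 seconds.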
.PP
Compile the program into an \fIe.out\fP, the EM machine version of a
\fIa.out\fP, by calling \fIem22\fP (for 2-byte integers and 2-byte pointers),
\fIem24\fP (for 2 and 4) or \fIem44\fP (for 4 and 4) as seems appropriate;
if in doubt, use \fIem44\fP. These compilers can be found in the ACK
\fIbin\fP directory, and should be used instead of \fIacc\fP (or normal
.UX
\fIcc\fP). Alternatively, you can use \fIacc \-memNN\fP instead of
\fIemNN\fP.
.LP
If your C program consists of more than one file, as it usually does, there is
a small problem. The \fIacc\fP and \fIcc\fP compilers generate .o files,
whereas the \fIemNN\fP compilers generate .m files as object files.
A simple technique to avoid the problem is to call
.DS
em44 *.c
.DE
if you can. If not, the following hack on the \fIMakefile\fP generally works.
.IP \-
Make sure the \fIMakefile\fP is reasonably clean and complete: all calls to
the compiler are through \fI$(CC)\fP, \fICFLAGS\fP is used properly and all
dependencies are specified.
.IP \-
Add the following lines to the \fIMakefile\fP (possibly permanently):
.DS
\&.SUFFIXES: .o
\&.c.o:
\& $(CC) \-c $(CFLAGS) $<
.DE
.IP \-
Set CC to \fIem44 \-.c\fP (for example). Make sure CFLAGS includes
the \-O option; this yields a speed-up of about 15 %.
.IP \-
Change all .o to .m (or .k if you do not use the \-O option).
.IP \-
If necessary, change \fIa.out\fP to \fIe.out\fP.
.PP
With these changes, \fImake\fP will produce an EM object; you can use
\fIesize\fP to verify that it is indeed an EM object and obtain some
statistics. Then call the interpreter:
.DS
int <EM-object-file> [ parameters ]
.DE
where the parameters are the normal parameters of your program. This should
work exactly like the original program, though slower. It reads from the
terminal if the original does, it opens and closes files like the original and
it accepts interrupts.
.sp
.LP
.I "Interpreting the results"
.PP
Now there are several possibilities.
.PP
It does all this. Great! This means the program
does not do very uncouth things. Now
read the file \fIint.mess\fP to see if any messages were generated. If there
are none, the program did not really run (perhaps the original cc \fIa.out\fP
got called instead?). Normally there is at least a termination message like
.DS
(Message): program exits with status 0 at "awa.p", line 64, INR = 4124
.DE
This says that the program terminated through an exit(0) on line 64 of the
file \fIawa.p\fP after 4124 EM instructions.
If this is the only message it is time to move to a bigger test case.
.PP
On the other hand, the program may come to a grinding halt with an error
message.
All messages (errors and warnings) have a format in which the sequence
.DS
"<file name>", line <ln#>
.DE
occurs, which is the same sequence many compilers produce for their error
messages. Consequently, the \fIint.mess\fP file can be processed as any
compiler message output.
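.LP
A tool that wants to pick the position out of such a message could proceed
as in the following C sketch. The function name and its interface are
hypothetical; the sketch only looks at the first occurrence of the text
", line " and assumes that the file name itself contains no double quote.

```c
#include <stdio.h>
#include <string.h>

/* Extract <file name> and <ln#> from a message containing the sequence
   "<file name>", line <ln#>
   Returns 1 on success and 0 if the sequence is not found. */
int parse_position(const char *msg, char *file, size_t file_size, long *line)
{
    const char *mark = strstr(msg, ", line ");
    const char *start;
    size_t len;

    if (mark == NULL || mark == msg || mark[-1] != '"')
        return 0;
    mark--;                     /* now at the closing quote */
    start = mark;
    while (start > msg && start[-1] != '"')
        start--;                /* walk back to just after the opening quote */
    if (start == msg)
        return 0;               /* no opening quote found */
    len = (size_t)(mark - start);
    if (len + 1 > file_size)
        return 0;               /* file name does not fit in the buffer */
    memcpy(file, start, len);
    file[len] = 0;
    /* Skip the closing quote and ", line " (8 characters in all). */
    return sscanf(mark + 8, "%ld", line) == 1;
}
```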
.PP
One such message can be
.DS
(Fatal error) a.em: trap "Addressing non existent memory" not caught at "a.c", line 2, INR = 16
.DE
produced by the abysmal program
.DS
main() {
	*(int*)200000 = 1;
}
.DE
.LP
Often the effects are more subtle, however. The program
.DS
main() {
	int *a, b = 777;
	b = *a;
}
.DE
produces the following five warnings (in far less than a second):
.DS
(Warning 47, #1): Local data pointer expected at "t.c", line 4, INR = 17
(Warning 61, cont.): Actual memory is undefined at "t.c", line 4, INR = 17
(Warning 102, #1): Returned function result too small at "<unknown>", line 0, INR = 21
(Warning 43, #1): Local integer expected at "exit.c", line 11, INR = 34
(Warning 61, cont.): Actual memory is undefined at "exit.c", line 11, INR = 34
.DE
The one about the function result looks the most frightening,
but is the most easily solved:
\fImain\fP is a function returning an int, so the start-up routine expects a
(four-byte) integer but gets an empty (zero-byte) return area.
.LP
\fINote\fP: The experts are divided about this. The traditional school holds
that \fImain\fP is an int function and its result is the return code; this
leaves them with two ways of supplying a return code: one as the parameter
of \fIexit()\fP and one as the result
of \fImain\fP. The modern school (Berkeley 4.2 etc.) claims that
return codes are supplied exclusively
by \fIexit()\fP, and they have an \fIexit(0)\fP in
the start-up routine, just after the call to \fImain()\fP; leaving \fImain()\fP
through the bottom implies successful termination.
.LP
We shall satisfy both groups by
.DS
main() {
	int *a, b = 777;
	b = *a;
	exit(0);
}
.DE
This results in
.DS
(Warning 47, #1): Local data pointer expected at "t.c", line 4, INR = 17
(Warning 61, cont.): Actual memory is undefined at "t.c", line 4, INR = 17
(Message): program exits with status 0 at "exit.c", line 11, INR = 33
.DE
which is pretty clear as it stands.
.sp
.LP
.I "Using stack dumps"
.PP
Let's, for the sake of argument
and to avoid the fierce realism of 10000-line programs, assume that the above
still puzzles you.
Since the error occurred in EM instruction number 17, we should like to see
more information around that moment. Call the interpreter again, now with the
interpreter variable AT set to 17:
.DS
int AT=17 t.em
.DE
(The interpreter has a number of internal variables that can be set by
assignments on the command line, as with \fImake\fP.)
This gives you a file called \fIint.log\fP containing the
stack dump of 150 lines presented at the end of this chapter.
.PP
Since dumping is a subfacility of logging in the interpreter, the formats of
the lines are
the same. If a line starts with an @, it will contain a file-name/line-number
indication; the next two characters are the subject and the log
level. Then comes the information, preceded by a space. The text contains
three stack dumps, one before the offending instruction, one at it, and one
after it; then the interpreter stops. All kinds of other dumps can be
obtained, but this is default.
.PP
For each instruction we have, in order:
.IP \-
an @x9 line, giving the position in the program,
.IP \-
the messages, warnings and errors from the instruction as it is being executed,
.IP \-
dump(s), as requested.
.PP
The first two lines mean that at line 4 in file \fIt.c\fP the interpreter
performed its 16-th instruction, with the Program Counter at 30 pointing at
opcode 180 in the text segment; the instruction was an LOL (LOad Local)
with the operand \-4 derived from the opcode. It copies the local at offset
\-4 to the top of the stack. The effect can be seen from the subsequent stack
dump, where the undefined word at addresses 2147483568 to ...571 (the variable
\fIa\fP) has been copied to the top of the stack at 2147483560 (copying
undefined values does not generate a warning).
Since we used the \fIem44\fP compiler, all pointers and ints in our dump are
4 bytes long.
So a variable at address X in reality extends from address X to X+3.
.br
Note that this is not the offending instruction; this stack dump represents
the situation just before the error.
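.LP
The ITEM VALUE column of the dump can be recomputed from the BYTE column.
In the dump shown at the end of this chapter the lowest address holds the
least significant byte, so the bytes 9, 3, 0, 0 at 2147483564 to ...567
combine to 777. The C sketch below shows the computation; the function
name is made up for the example.

```c
/* Combine size bytes into one value; bytes[0] is the byte at the lowest
   address, which is the least significant one in the dump. */
unsigned long word_from_bytes(const unsigned char *bytes, unsigned size)
{
    unsigned long value = 0;
    unsigned i;

    for (i = size; i > 0; i--)
        value = (value << 8) | bytes[i - 1];
    return value;
}
```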
.PP
The stack consists of a sequence of frames, each containing data followed by
a Return Status Block resulting from a call; the last frame ends in
top-of-stack. The first frame represents the stack when the program starts,
through a call to the start-up routine. This routine prepares the second
stack frame with the actual parameters to \fImain()\fP:
\fIargc\fP at 2147483596, \fIargv\fP at 2147483600 and \fIenviron\fP at
2147483604.
.LP
The RSB line shows that the call to \fImain()\fP was made from procedure 0
which has 0 locals, with PC at
16, an LB of 2147483608 and file name and line number still unknown.
The \fIcode\fP in the RSB tells how this RSB was made; possible values are STP
(start-up), CAL, RTT (returnable trap) and NRT (non-returnable trap).
.PP
The next frame shows the local variable(s) of \fImain()\fP; there are two of
them, the pointer \fIa\fP at 2147483568, which is undefined, and variable
\fIb\fP at 2147483564, which has the value 777. Then comes a copy of \fIa\fP,
just made by the LOL instruction, at 2147483560. The following line shows that
the Function Return Area (which does not reside at the end of the stack, but
just happens to be printed here) has size 0 and is presently undefined.
The stack dump ends
by showing that the Actuals Base is at 2147483596 (pointing at \fIargc\fP), the
Locals Base at 2147483572 (pointing just above the local \fIa\fP), the Stack
Pointer at 2147483560 (pointing at the undefined pointer), the line count is 4
and the file name is "t.c".
.LP
(Notice that there is one more stack frame than you would probably expect, the
one above the start-up routine.)
.LP
The Function Return Area
could have a size larger than 0 and still be undefined, for
example when an instruction that does not preserve the contents of the FRA has
just been executed; likewise the FRA could have size 0 and be defined
nevertheless, for example just after a RET 0 instruction.
.PP
All this has set the scene for the disaster which is about to strike in the
next instruction. This is indeed an LOI (LOad Indirect) of size 4, opcode 169;
it causes the message
.DS
warning: Local data pointer expected [stack.c: 242]
.DE
and its continuation
.DS
warning cont.: Actual memory is undefined
.DE
(detected in the interpreter file \fIstack.c\fP at line 242; this can be
useful for sorting out dubious semantics). We see that the effect, as shown in
the third frame of this stack dump (at instruction number 17) is somewhat
unexpected: the LOI has fetched the value 4 and stacked it. The reason is
that, unfortunately, undefinedness is not transitive in the interpreter. When
an undefined value is used in an operation (other than copying) a warning is
given, but thereafter the value is treated as if it were zero. So, after the
warning a normal null pointer remains, which is then used to pick up the value
at location 0. This is the place where the EM machine stores its current line
number, which is presently 4.
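.LP
The behaviour just described can be pictured with a toy model in C.
This is not the interpreter's code: the structure, the counter and the
function names are invented for the illustration, and a single flag stands
in for the shadow bytes.

```c
/* Toy model of a word together with its shadow information. */
struct cell {
    long value;
    int defined;            /* 0 = undefined, 1 = defined */
};

static int warning_count;   /* number of warnings issued so far */

/* Copying carries value and undefinedness along without a warning. */
struct cell copy_cell(struct cell src)
{
    return src;
}

/* Using an undefined value gives a warning; after that the value is
   treated as if it were zero. */
long use_cell(struct cell c)
{
    if (!c.defined) {
        warning_count++;
        return 0;
    }
    return c.value;
}
```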
.PP
The third stack dump shows the final effect: the value 4 has been unstacked
and copied to variable \fIb\fP at 2147483564 through an STL (STore Local)
instruction.
.PP
Since this form of logging dumps the stack only, the log file is relatively
small as dumps go.
Nevertheless, a useful excerpt can be obtained with the command
.DS
grep 'd1' int.log
.DE
This extracts the Return Status Block lines from the log, thus producing three
traces of calls, one for each instruction in the log:
.DS
d1 >> RSB: code = STP, PI = uninit, PC = 0, LB = 2147483644, LIN = 0, FIL = NULL
d1 >> RSB: code = CAL, PI = (0,0), PC = 16, LB = 2147483608, LIN = 0, FIL = NULL
d1 >> AB = 2147483596, LB = 2147483572, SP = 2147483560, HP = 848, LIN = 4, FIL = "t.c"
d1 >> RSB: code = STP, PI = uninit, PC = 0, LB = 2147483644, LIN = 0, FIL = NULL
d1 >> RSB: code = CAL, PI = (0,0), PC = 16, LB = 2147483608, LIN = 0, FIL = NULL
d1 >> AB = 2147483596, LB = 2147483572, SP = 2147483560, HP = 848, LIN = 4, FIL = "t.c"
d1 >> RSB: code = STP, PI = uninit, PC = 0, LB = 2147483644, LIN = 0, FIL = NULL
d1 >> RSB: code = CAL, PI = (0,0), PC = 16, LB = 2147483608, LIN = 0, FIL = NULL
d1 >> AB = 2147483596, LB = 2147483572, SP = 2147483564, HP = 848, LIN = 4, FIL = "t.c"
.DE
Theoretically, the pertinent trace is the middle one, but in practice all three
show the same calls. In the present case there isn't much to trace, but in real
programs the trace can be useful.
.sp
.LP
.I "Errors in libraries"
.PP
Since libraries are generally compiled with suppression of line number and
file name information, the line number and file name in the interpreter will
not be updated when it enters a library routine. Consequently, all messages
generated by interpreting library routines will seem to originate from the
line of the call. This is especially true for the routine malloc(), which,
by the nature of its business, often contains dubious code.
.PP
A typical message is:
.DS
(Warning 43, #1): Local integer expected at "buff.c", line 18, INR = 266
(Warning 64, cont.): Actual memory contains a data pointer at "buff.c", line 18, INR = 266
.DE
and indeed at line 18 of the file buff.c we find:
.DS
buff = malloc(buff_size = BFSIZE);
.DE
This problem can be avoided by using a specially compiled version of the
library that contains the correct LIN and FIL instructions, or, less
elegantly, by including the source code of the library routines in the
program; in the latter case, make sure you have them all.
.sp
.LP
.I "Unavoidable messages"
.br
Some messages produced by the logging are almost unavoidable; sometimes the
writer of a library routine is forced to take liberties with the semantics of
EM.
.LP
Examples from C include the memory allocation routines.
For efficiency reasons, one bit of a pointer in the administration is used as
a flag; setting, clearing and reading this bit requires bitwise operations on
pointers, which gives the above messages.
Realloc causes a problem in that it may have to copy the originally allocated
area to a different place; this area may contain uninitialised bytes.
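.LP
The flag-in-a-pointer technique looks roughly like the C sketch below.
It is not the library's actual code: the names are hypothetical, the use
of the lowest bit is an assumption, and the type uintptr_t requires a
compiler that provides <stdint.h>.

```c
#include <stdint.h>

/* Hypothetical flag kept in the lowest bit of an aligned pointer. */
#define BUSY_BIT ((uintptr_t)1)

/* Set the flag; the result must not be dereferenced directly. */
void *mark_busy(void *p)
{
    return (void *)((uintptr_t)p | BUSY_BIT);
}

/* Read the flag. */
int is_busy(const void *p)
{
    return ((uintptr_t)p & BUSY_BIT) != 0;
}

/* Clear the flag, giving back the original pointer. */
void *strip_flag(void *p)
{
    return (void *)((uintptr_t)p & ~BUSY_BIT);
}
```

Each of these functions performs a bitwise operation on a pointer value,
which is the kind of operation the run-time type checking reports.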
.bp
.DS
.ft CW
@x9 "t.c", line 4, INR = 16, PC = 30 OPCODE = 180
@L6 "t.c", line 4, INR = 16, DoLOLm(-4)
d2
d2 . . STACK_DUMP[4/4] . . INR = 16 . . STACK_DUMP . .
d2 ----------------------------------------------------------------
d2 ADDRESS BYTE ITEM VALUE SHADOW
d2 2147483643 0 (Dp)
d2 2147483642 0 (Dp)
d2 2147483641 0 (Dp)
d2 2147483640 40 [ 40] (Dp)
d2 2147483639 0 (Dp)
d2 2147483638 0 (Dp)
d2 2147483637 3 (Dp)
d2 2147483636 64 [ 832] (Dp)
d2 2147483635 0 (In)
d2 2147483634 0 (In)
d2 2147483633 0 (In)
d2 2147483632 1 [ 1] (In)
d1 >> RSB: code = STP, PI = uninit, PC = 0, LB = 2147483644, LIN = 0, FIL = NULL
d2
d2 ADDRESS BYTE ITEM VALUE SHADOW
d2 2147483607 0 (Dp)
d2 2147483606 0 (Dp)
d2 2147483605 0 (Dp)
d2 2147483604 40 [ 40] (Dp)
d2 2147483603 0 (Dp)
d2 2147483602 0 (Dp)
d2 2147483601 3 (Dp)
d2 2147483600 64 [ 832] (Dp)
d2 2147483599 0 (In)
d2 2147483598 0 (In)
d2 2147483597 0 (In)
d2 2147483596 1 [ 1] (In)
d1 >> RSB: code = CAL, PI = (0,0), PC = 16, LB = 2147483608, LIN = 0, FIL = NULL
d2
d2 ADDRESS BYTE ITEM VALUE SHADOW
d2 2147483571 undef
d2 | | | | | |
d2 2147483568 undef (1 word)
d2 2147483567 0 (In)
d2 2147483566 0 (In)
d2 2147483565 3 (In)
d2 2147483564 9 [ 777] (In)
d2 2147483563 undef
d2 | | | | | |
d2 2147483560 undef (1 word)
d2 FRA: size = 0, undefined
d1 >> AB = 2147483596, LB = 2147483572, SP = 2147483560, HP = 848, \e
LIN = 4, FIL = "t.c"
d2 ----------------------------------------------------------------
d2
@x9 "t.c", line 4, INR = 17, PC = 31 OPCODE = 169
@w1 "t.c", line 4, INR = 17, warning: Local data pointer expected [stack.c: 242]
@w1 "t.c", line 4, INR = 17, warning cont.: Actual memory is undefined
@L6 "t.c", line 4, INR = 17, DoLOIm(4)
d2
d2 . . STACK_DUMP[4/4] . . INR = 17 . . STACK_DUMP . .
d2 ----------------------------------------------------------------
d2 ADDRESS BYTE ITEM VALUE SHADOW
d2 2147483643 0 (Dp)
d2 2147483642 0 (Dp)
d2 2147483641 0 (Dp)
d2 2147483640 40 [ 40] (Dp)
d2 2147483639 0 (Dp)
d2 2147483638 0 (Dp)
d2 2147483637 3 (Dp)
d2 2147483636 64 [ 832] (Dp)
d2 2147483635 0 (In)
d2 2147483634 0 (In)
d2 2147483633 0 (In)
d2 2147483632 1 [ 1] (In)
d1 >> RSB: code = STP, PI = uninit, PC = 0, LB = 2147483644, LIN = 0, FIL = NULL
d2
d2 ADDRESS BYTE ITEM VALUE SHADOW
d2 2147483607 0 (Dp)
d2 2147483606 0 (Dp)
d2 2147483605 0 (Dp)
d2 2147483604 40 [ 40] (Dp)
d2 2147483603 0 (Dp)
d2 2147483602 0 (Dp)
d2 2147483601 3 (Dp)
d2 2147483600 64 [ 832] (Dp)
d2 2147483599 0 (In)
d2 2147483598 0 (In)
d2 2147483597 0 (In)
d2 2147483596 1 [ 1] (In)
d1 >> RSB: code = CAL, PI = (0,0), PC = 16, LB = 2147483608, LIN = 0, FIL = NULL
d2
d2 ADDRESS BYTE ITEM VALUE SHADOW
d2 2147483571 undef
d2 | | | | | |
d2 2147483568 undef (1 word)
d2 2147483567 0 (In)
d2 2147483566 0 (In)
d2 2147483565 3 (In)
d2 2147483564 9 [ 777] (In)
d2 2147483563 0 (In)
d2 2147483562 0 (In)
d2 2147483561 0 (In)
d2 2147483560 4 [ 4] (In)
d2 FRA: size = 0, undefined
d1 >> AB = 2147483596, LB = 2147483572, SP = 2147483560, HP = 848, \e
LIN = 4, FIL = "t.c"
d2 ----------------------------------------------------------------
d2
@x9 "t.c", line 4, INR = 18, PC = 32 OPCODE = 229
@S6 "t.c", line 4, INR = 18, DoSTLm(-8)
d2
d2 . . STACK_DUMP[4/4] . . INR = 18 . . STACK_DUMP . .
d2 ----------------------------------------------------------------
d2 ADDRESS BYTE ITEM VALUE SHADOW
d2 2147483643 0 (Dp)
d2 2147483642 0 (Dp)
d2 2147483641 0 (Dp)
d2 2147483640 40 [ 40] (Dp)
d2 2147483639 0 (Dp)
d2 2147483638 0 (Dp)
d2 2147483637 3 (Dp)
d2 2147483636 64 [ 832] (Dp)
d2 2147483635 0 (In)
d2 2147483634 0 (In)
d2 2147483633 0 (In)
d2 2147483632 1 [ 1] (In)
d1 >> RSB: code = STP, PI = uninit, PC = 0, LB = 2147483644, LIN = 0, FIL = NULL
d2
d2 ADDRESS BYTE ITEM VALUE SHADOW
d2 2147483607 0 (Dp)
d2 2147483606 0 (Dp)
d2 2147483605 0 (Dp)
d2 2147483604 40 [ 40] (Dp)
d2 2147483603 0 (Dp)
d2 2147483602 0 (Dp)
d2 2147483601 3 (Dp)
d2 2147483600 64 [ 832] (Dp)
d2 2147483599 0 (In)
d2 2147483598 0 (In)
d2 2147483597 0 (In)
d2 2147483596 1 [ 1] (In)
d1 >> RSB: code = CAL, PI = (0,0), PC = 16, LB = 2147483608, LIN = 0, FIL = NULL
d2
d2 ADDRESS BYTE ITEM VALUE SHADOW
d2 2147483571 undef
d2 | | | | | |
d2 2147483568 undef (1 word)
d2 2147483567 0 (In)
d2 2147483566 0 (In)
d2 2147483565 0 (In)
d2 2147483564 4 [ 4] (In)
d2 FRA: size = 0, undefined
d1 >> AB = 2147483596, LB = 2147483572, SP = 2147483564, HP = 848, \e
LIN = 4, FIL = "t.c"
d2 ----------------------------------------------------------------
d2
.DE

25
doc/int/bib Normal file

@ -0,0 +1,25 @@
.\" Bibliography
.\"
.\" $Header$
.bp
.DS C
BIBLIOGRAPHY
.DE
.LP
[1] A.S. Tanenbaum, H. van Staveren, E.G. Keizer and J.W. Stevenson.
\fIDescription of a Machine Architecture for use with Block Structured
Languages\fP. VU Informatica Rapport IR-81, August 1983.
.LP
[2] E.G. Keizer. \fIAck description file reference manual.\fP
.LP
[3] K. Jensen and N. Wirth.
\fIPASCAL, User Manual and Report\fP. Springer Verlag.
.LP
[4] B.W. Kernighan and D.M. Ritchie.
\fIThe C Programming Language\fP. Prentice-Hall, 1978.
.LP
[5] D.M. Ritchie. \fIC Reference Manual\fP.
.LP
[6] \fIAmsterdam Compiler Kit, reference manual.\fP
.LP
[7] \fIUnix Programmer's Manual, 4.1BSD\fP. UCB, August 1983.

26
doc/int/cover Normal file

@ -0,0 +1,26 @@
.\" Front page
.\"
.\" $Header$
.TL
The EM Interpreter
.AU
Eddo de Groot
Leo van den Berge
Dick Grune
.AI
Faculteit Wiskunde en Informatica
Vrije Universiteit, Amsterdam
.AB
This document describes the implementation
and usage of a new interpreter for the EM machine language.
This interpreter implements the full EM machine
and can be helpful to people writing new front-ends.
Moreover, it can be used as a thorough testing and debugging
tool by anyone familiar with the EM language.
.PP
A list of all warnings is given in appendix A; appendix B is a simple
tutorial.
.AE
.PP
.pn 1
.bp

24
doc/int/draw.mac Normal file

@ -0,0 +1,24 @@
.\" Macros for simple constant width drawings (uses font CW)
.\"
.\" $Header$
.de Dr \" Drawing $1 (size)
.sp 1
.ne \\$1
.na
.nf
.ft CW \" constant width font
.lg 0 \" no ligatures
..
.de Df \" Drawing Footer
.sp 1
.ft R
.ce 1000
.lg 1
..
.de De \" Drawing End $1 (lines)
.Df \" if it has not happened yet
.ce
.ad
.fi
.sp \\$1
..

595
doc/int/txt2 Normal file

@ -0,0 +1,595 @@
.\" Implementation details
.\"
.\" $Header$
.bp
.NH
IMPLEMENTATION DETAILS.
.PP
The pertinent issues are addressed below, in arbitrary order.
.NH 2
Stack manipulation and start-up
.PP
It is not at all easy to start the EM machine with the stack in a reasonable
and consistent state. One reason is the anomalous value of the ML register
and another is the absence of a proper RSB. It may be argued that the initial
stack does not have to be in a consistent state, since the first instruction
proper is only executed after \fIargc\fP, \fIargv\fP and \fIenviron\fP
have been stacked (which takes care of the empty stack) and the initial
procedure has been called (which creates a RSB). We would, however, like to
perform the stacking of these values and the calling of the initial procedure
using the normal stack and call routines, which again require the stack to be
in an acceptable state.
.NH 3
The anomalous value of the ML register
.PP
All registers in the EM machine point to word boundaries, and all of them,
except ML, address the even-numbered byte at the boundary.
The exception has a good reason: the even numbered byte at the ML boundary does
not exist.
This problem is not particular to EM but is inherent in the number system: the
number of N-digit numbers can itself not be expressed in an N-digit number, and
the number of addresses in an N-bit machine will itself not fit in an N-bit
address. The problem is solved in the interpreter by having ML point to the
highest word boundary that has bytes on either side; this makes ML+1
expressible.
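.LP
The arithmetic can be illustrated with a small C sketch.
It is not taken from the interpreter: the name \fIml_for\fP and its
parameters are hypothetical, and we assume that ML addresses the byte just
below the boundary, so that ML+1 is the boundary itself.
.DS
```c
#include <stdint.h>

/* Hypothetical helper, not part of the interpreter.
 * max_addr is the highest address that exists (assumed >= wsize),
 * wsize is the word size in bytes.
 * The result is the byte just below the highest word boundary that
 * still has bytes on either side, so that result + 1 is expressible. */
static uint32_t ml_for(uint32_t max_addr, uint32_t wsize)
{
    uint32_t boundary = max_addr - (max_addr % wsize);

    return boundary - 1;
}
```
.DE
.LP
On a machine with 32-bit addresses the number of addresses does not fit
in a \fIuint32_t\fP, but the value delivered by this sketch, plus one, does.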
.NH 3
The absence of an initial Return Status Block
.PP
When the stack is empty, there is no legal value for AB, since there are no
actuals; LB can be set naturally to ML+1. This is all right when the
interpreter starts with a call of the initial routine which stores the value
of LB in the first RSB, but causes problems when finally this call returns. We
want this call to return completely before stopping the interpreter, to check
the integrity of the last RSB; restoring information from it will, however,
cause illegal values to be stored in LB and AB (ML+1 and ML+1+rsbsize, resp.).
On top of this, the initial (illegal) Procedure Identifier of the running
procedure will be restored; then, restoring the likewise illegal PC will
cause a check to see if it still is inside the running procedure. After a few
attempts at writing special cases, we have decided that it is possible, but not
worth the effort; the final (= initial) RSB will not be unstacked.
.NH 2
Floating point numbers.
.PP
The interpreter is capable of working with 4- and 8-byte floating point (FP)
numbers.
In C-terms, this corresponds to objects of type float and double respectively.
Both types fit in a C-double so the obvious way to manipulate these entities
internally is in doubles.
When an 8-byte FP is pushed, all bytes of the C-double are pushed.
Pushing a 4-byte FP causes the 4 bytes representing the smallest fraction
to be discarded.
.PP
In EM, floats can be obtained in two different ways: via conversion
of another type, or via initialization in the loadfile.
Initialized floats are represented in the loadfile by an ASCII string in
the syntax of a Pascal real (signed \fIUnsignedReal\fP).
I.e. a float looks like:
.DS
[ \fISign\fP ] \fIDigit\fP+ [ . \fIDigit\fP+ ] [ \fIExp\fP [ \fISign\fP ] \fIDigit\fP+ ] (G1)
.DE
followed by a null byte.
Here \fISign\fP = {+, \-}; \fIDigit\fP = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
\fIExp\fP = {e, E}; [ \fIAnything\fP ] means that \fIAnything\fP is optional;
and a + means one or more times.
To accommodate some loose code generators, the actual grammar accepted is:
.DS
[ \fISign\fP ] \fIDigit\fP\(** [ . \fIDigit\fP\(** ] [ \fIExp\fP [ \fISign\fP ] \fIDigit\fP+ ] (G2)
.DE
followed by a null byte. Here \(** means zero or more times. A floating
denotation which is in G2 but not in G1 draws a warning; one that is not even
in G2 causes a fatal error.
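.LP
A recognizer that separates the three cases might look as follows.
This is a minimal sketch, not the code of the interpreter; the names
\fIclassify_float\fP, \fIFL_STRICT\fP, \fIFL_LOOSE\fP and \fIFL_BAD\fP
are hypothetical.
.DS
```c
#include <ctype.h>

enum fl_class {
    FL_STRICT,    /* in G1 */
    FL_LOOSE,     /* in G2 but not in G1 */
    FL_BAD        /* not even in G2 */
};

/* Classify the null-terminated string s against G1 and G2. */
static enum fl_class classify_float(const char *s)
{
    int strict = 1;    /* cleared as soon as G1 is violated */
    int n;

    if (*s == '+' || *s == '-')
        s++;
    for (n = 0; isdigit((unsigned char)*s); s++)
        n++;
    if (n == 0)
        strict = 0;    /* G1 wants Digit+ before the point */
    if (*s == '.') {
        s++;
        for (n = 0; isdigit((unsigned char)*s); s++)
            n++;
        if (n == 0)
            strict = 0;    /* G1 wants Digit+ after the point */
    }
    if (*s == 'e' || *s == 'E') {
        s++;
        if (*s == '+' || *s == '-')
            s++;
        for (n = 0; isdigit((unsigned char)*s); s++)
            n++;
        if (n == 0)
            return FL_BAD;    /* both grammars want Digit+ here */
    }
    if (*s != 0)
        return FL_BAD;    /* trailing characters before the null byte */
    return strict ? FL_STRICT : FL_LOOSE;
}
```
.DE
.LP
In this sketch FL_LOOSE corresponds to the warning and FL_BAD to the fatal
error; note that G2, read literally, also admits the empty string.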
.LP
A string representing a float that does not fit in a double causes a
warning to be given.
In that case, the returned value will be the double 0.0.
.LP
Floating point arithmetic is handled by some simple routines, checking for
over/underflow, and returning appropriate values in case of an ignored error.
.PP
Since not all C compilers provide floating point operations, there is a
compile time flag NOFLOAT, which, if defined, suppresses the use of all
fp operations in the interpreter. The resulting interpreter will still load
EM files with floats in the global data area (and ignore them) but will give a
fatal error upon an attempt to execute a floating point instruction; consequently
code involving floating point operations can be run as long as the actual
instructions are avoided.
.NH 2
Pointers.
.PP
The following sub-sections both deal with problems concerning pointers.
First, something is said about pointer arithmetic in general.
Then, the null-pointer problem is dealt with.
.NH 3
Pointer arithmetic.
.PP
Strictly speaking, pointer arithmetic is defined only within a \fBfragment\fP.
From the explanation of the term fragment however (as given in [1], page 3),
it is not quite clear what a fragment should look like
from an interpreter's point of view.
For this reason we introduced the term \fBsegment\fP,
bordering the various areas within which pointer arithmetic is allowed.
Every stack-frame is a segment, and so are the global data area (GDA) and
the heap area.
Thus, the number of segments varies over time, and at some point in time is
given by the number of currently active stack-frames
(#CAL + #CAI \- #RET \- #RTT) plus 2 (gda, heap).
Pointers in the area between heap and stack (which is inaccessible by
definition), are assumed to be in the heap segment.
.PP
The interpreter, while building a new stack-frame (i.e. segment), stores the
value of the last ActualBase in a pointer-array (\fIAB_list[\ ]\fP).
When a pointer (say \fIP\fP) is available for arithmetic, the number
of the segment where it points (say \fIS\d\s-2P\s+2\u\fP),
is determined first.
Next, the arithmetic is performed, followed by a check on the number
of the segment where the resulting pointer \fIR\fP points
(say \fIS\d\s-2R\s+2\u\fP).
Now, if \fIS\d\s-2P\s+2\u != S\d\s-2R\s+2\u\fP, a warning is given:
\fBPointer arithmetic yields pointer to bad segment\fP.
.br
It should also be clear now why the illegal area between heap and stack
was joined with the heap segment.
When calculating a new heap pointer (\fIHP\fP), one will obtain intermediate
results being pointers in this area just before it is made legal.
We do not want error messages all of the time, just because someone is
allocating space in the heap.
.LP
A similar treatment is given to the pointers in the SBS instruction; they have
to point into the same fragment for subtraction to be meaningful.
.LP
The length of the \fIAB_list[\ ]\fP is initially 100,
and it is reallocated in the same way the dynamically growing partitions
are (see 1.1).
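.LP
The segment check can be sketched in C as follows.
Only \fIAB_list[\ ]\fP is a name from the interpreter; \fIHB\fP, \fISP\fP,
\fIn_frames\fP, \fIptr2seg\fP and \fIsame_segment_after_add\fP are
hypothetical, and treating each recorded value as the upper border of its
frame is a simplification made for the example.
.DS
```c
#include <stddef.h>

typedef unsigned long ptr;

enum { SEG_GDA = 0, SEG_HEAP = 1, SEG_FRAME0 = 2 };

/* Hypothetical machine state for the example. */
static ptr HB;              /* heap base: first address above the GDA */
static ptr SP;              /* stack pointer: lowest stack address in use */
static ptr AB_list[100];    /* border recorded for each active frame */
static size_t n_frames;     /* number of currently active stack-frames */

/* Return the number of the segment that p points into. */
static int ptr2seg(ptr p)
{
    size_t i, above = 0;

    if (p < HB)
        return SEG_GDA;
    if (p < SP)
        return SEG_HEAP;    /* includes the area between heap and stack */
    /* The stack grows downward, so later frames have lower borders:
     * count the recorded borders that lie above p. */
    for (i = 0; i < n_frames; i++)
        if (AB_list[i] > p)
            above++;
    return SEG_FRAME0 + (above > 0 ? (int)above - 1 : 0);
}

/* Return 1 if p + delta stays inside the segment of p; 0 means the
 * warning about a pointer to a bad segment would be due. */
static int same_segment_after_add(ptr p, long delta)
{
    ptr r = p + (ptr)delta;

    return ptr2seg(p) == ptr2seg(r);
}
```
.DE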
.NH 3
Null pointer.
.PP
Because the EM language lacks an instruction for loading a null pointer,
most programs solve this problem by loading a pointer-sized integer of
value zero, and using this as a null pointer (this is also proposed in [1]).
\fBInt\fP allows this, and will not complain.
A warning is given however, when an attempt is made to add something to a
null pointer (i.e. the pointer-sized integer zero).
.LP
Since many programming languages use a pointer to location 0 as an illegal
value, it is desirable to detect its use.
The big problem is though that 0 is a perfectly legal EM address;
address 0 holds the current line number in the source file. It may be freely
read but is written only by means of the LIN instruction. This allows us to
declare the area consisting of the line number and the file name pointer to be
read-only memory. Thus a store will be caught (and result in a warning) but a
read will succeed (and yield the EM information stored there).
.NH 2
Function Return Area (FRA).
.PP
The Function Return Area (\fIFRA[\ ]\fP) has a default size of 8 bytes;
this default can
be overridden through the use of the \fB\-r\fP-option, but cannot be
made smaller than the size of two pointers, in accordance with the
remark on page 5 of [1].
The global variable \fIFRASize\fP keeps track of how many bytes were
stored in the FRA, the last time a RET instruction was executed.
The LFR instruction only works when its argument is equal to this size.
If not, the FRA contents are loaded anyhow, but one of the following warnings
is given:
\fBReturned function result too large\fP (\fIFRASize\fP > LFR size) or
\fBReturned function result too small\fP (\fIFRASize\fP < LFR size).
.LP
Note that a C-program, falling through the end of its code without doing
a proper \fIreturn\fP or \fIexit()\fP, will generate this warning.
.PP
The only instructions that do not disturb the contents of the FRA are
GTO, BRA, ASP and RET.
This is expressed in the program by setting \fIFRA_def\fP to "undefined"
in any instruction except these four.
We realize this is a useless action most of the time, but a more
efficient solution does not seem to be at hand.
If a result is loaded when \fIFRA_def\fP is "undefined", the warning:
\fBReturned function result may be garbled\fP is generated.
.LP
Note that the FRA needs a shadow-FRA in order to store the shadow
information when performing a LFR instruction.
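.LP
The checks made for an LFR instruction can be summarized in a sketch.
\fIFRASize\fP and \fIFRA_def\fP are the names used above; the function
\fIlfr_warning\fP is hypothetical, and the order in which the conditions
are tested is an assumption.
.DS
```c
#include <stddef.h>

static size_t FRASize;    /* bytes stored in the FRA by the last RET */
static int FRA_def;       /* 1 = defined, 0 = "undefined" */

/* Return the text of the warning that an LFR of lfr_size bytes would
 * draw, or NULL if it draws none.  The FRA contents are loaded in
 * either case; only the diagnostic differs. */
static const char *lfr_warning(size_t lfr_size)
{
    if (!FRA_def)
        return "Returned function result may be garbled";
    if (FRASize > lfr_size)
        return "Returned function result too large";
    if (FRASize < lfr_size)
        return "Returned function result too small";
    return NULL;
}
```
.DE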
.NH 2
Environment interaction.
.PP
The EM machine represented by \fBint\fP can communicate with
the environment in three different ways.
A first possibility is by means of (UNIX) interrupts;
the second by executing (relatively) high level system calls (called
monitor calls).
A third means of interaction, especially interesting for the debugging
programmer, is via internal variables set on the command line.
The first two techniques, and the way they are implemented, are described
in this section.
The third has been allotted a separate section (3).
.NH 3
Traps and interrupts.
.PP
Simple user programs will generally not mess around with UNIX-signals.
In interpreting these programs, the default actions will be taken
when a signal is received by the program: it gives a message and
stops running.
.LP
There are programs however, which try to handle certain signals
themselves.
In C, this is achieved by the system call \fIsignal(\ sig_no,\ catch\ )\fP,
which calls the handling routine \fIcatch()\fP, as soon as signal
\fBsig_no\fP occurs.
EM does not provide this call; instead, the \fIsigtrp()\fP monitor call
is available for mapping UNIX signals onto EM traps.
This implies that a \fIsignal()\fP call in a C-program
must be translated by the EM library routine to a \fIsigtrp()\fP call in EM.
.PP
The interpreter keeps an administration of the mapping of UNIX-signals
onto EM traps in the array \fIsig_map[NSIG]\fP.
Initially, the signals all have their default values.
Now assume a \fIsigtrp()\fP occurs, requesting that signal \fBsig_no\fP be
mapped onto trap \fBtrap_no\fP.
This results in:
.IP 1.
setting the relevant array element
\fIsig_map[sig_no]\fP to \fBtrap_no\fP (after saving the old value),
.IP 2.
catching the next incoming \fBsig_no\fP signal with the handling routine
\fIHndlEMSig\fP (by a plain UNIX \fIsignal()\fP of course), and
.IP 3.
returning the saved map-value on the stack so the user can know the previous
trap value onto which \fBsig_no\fP was mapped.
.LP
On an incoming signal,
the handling routine for signal \fBsig_no\fP arms the
correct EM trap by calling the routine \fIarm_trap()\fP with argument
\fIsig_map[sig_no]\fP.
At the end of the EM instruction the proper call of \fItrap()\fP is done.
\fITrap()\fP in its turn examines the value of the \fIHaltOnTrap\fP variable;
if it is set, the interpreter will stop with a message. In the normal case of
controlled trap handling this bit is not on and the interpreter examines
the value of the \fITrapPI\fP variable,
which contains the procedure identifier of the EM trap handling routine.
It then initiates a call to this routine and performs a \fIlongjmp()\fP
to the main
loop to bypass all further processing of the instruction that caused the trap.
\fITrapPI\fP should be set properly by the library routines, through the
SIG instruction.
.LP
In short:
.IP 1.
A UNIX interrupt is caught by the interpreter.
.IP 2.
A handling routine is called which generates the corresponding EM trap
(according to the mapping).
.IP 3.
The trap handler calls the corresponding EM routine which emulates a UNIX
interrupt for the benefit of the interpreted program.
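.LP
The bookkeeping described above can be sketched as follows.
The names \fIsig_map\fP, \fIHndlEMSig\fP and \fIdo_sigtrp\fP occur in the
interpreter, but the bodies shown here are ours and greatly simplified;
\fIMAX_SIG\fP, \fIUNMAPPED\fP, \fIarmed_trap\fP and \fIinit_sig_map\fP are
hypothetical, and the variable \fIarmed_trap\fP merely stands in for the
call of \fIarm_trap()\fP.
.DS
```c
#include <signal.h>

#define MAX_SIG   32      /* assumed table size; the text uses NSIG */
#define UNMAPPED  (-1)    /* assumed marker for "no trap mapped yet" */

static int sig_map[MAX_SIG];
static volatile sig_atomic_t armed_trap = UNMAPPED;

static void init_sig_map(void)
{
    int i;

    for (i = 0; i < MAX_SIG; i++)
        sig_map[i] = UNMAPPED;
}

/* Handler for signals requested by the interpreted program. */
static void HndlEMSig(int sig_no)
{
    armed_trap = sig_map[sig_no];    /* stands in for arm_trap() */
    signal(sig_no, HndlEMSig);       /* catch the next one as well */
}

/* Map sig_no onto trap_no and return the previous mapping. */
static int do_sigtrp(int sig_no, int trap_no)
{
    int old;

    if (sig_no <= 0 || sig_no >= MAX_SIG)
        return UNMAPPED;
    old = sig_map[sig_no];           /* step 1: save, then set */
    sig_map[sig_no] = trap_no;
    signal(sig_no, HndlEMSig);       /* step 2: plain UNIX signal() */
    return old;                      /* step 3: hand back the old value */
}
```
.DE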
.PP
When considering UNIX signals, it is important to notice that some of them
are real signals, i.e., messages coming from outside the program, like DEL
and QUIT, but some are actually program-caused synchronous traps, like Illegal
Instruction. The latter, if they happen, are incurred by the interpreter
itself and consequently are of no concern to the interpreted program: it
cannot catch them. The present code assumes that the UNIX signals between
SIGILL (4) and SIGSYS (12) are really traps; \fIdo_sigtrp()\fP
will fail on them.
.LP
To avoid losing the last line(s) of output files, the interpreter should
always do a proper close-down, even in the presence of signals. To this end,
all non-ignored genuine signals are initially caught by the interpreter,
through the routine \fIHndlIntSig\fP, which gives a message and performs a
proper close-down.
Synchronous traps can only be caused by the interpreter itself; they are never
caught, and consequently the UNIX default action prevails. Generally they
cause a core dump.
Signals requested by the interpreted program are caught by the routine
\fIHndlEMSig\fP, as explained above.
.NH 3
Monitor calls.
.PP
For the convenience of the programmer, as many monitor calls as possible
have been implemented.
The list of monitor calls given in [1] pages 20/21, has been implemented
completely, except for \fIptrace()\fP, \fIprofil()\fP and \fImpxcall()\fP.
The semantics of \fIptrace()\fP and \fIprofil()\fP from an interpreted program
is unclear; the data structure passed to \fImpxcall()\fP is non-trivial
and the system call has low portability and applicability.
For these calls, on invocation a warning is generated, and the arguments which
were meant for the call are popped properly, so the program can continue
without the stack being messed up.
The errorcode 5 (IOERROR) is pushed onto the stack (twice), in order to
fake an unsuccessful monitor call.
No other \- more meaningful \- errorcode is available in the errno-list.
.LP
Now for the implemented monitor calls.
The returned value is zero for a successful call.
When something goes wrong, the value of the external \fIerrno\fP variable
is pushed, thus enabling the user to find out what the reason of failure was.
The implementation of the majority of the monitor calls is straightforward.
Those working with a special format buffer, (e.g. \fIioctl()\fP,
\fItime()\fP and \fIstat()\fP variants), need some extra attention.
This is due to the fact that working with varying word/pointer size
combinations may cause alignment problems.
.LP
The data structure returned by the UNIX system call results from
C code that has been translated with the regular C compiler, which,
on the VAX, happens to be a 4-4 compiler.
The data structure expected by the interpreted program conforms
to the translation by \fBack\fP of the pertinent include file.
Depending on the exact call of \fBack\fP, sizes and alignment may differ.
.LP
An example is in order. The EM MON 18 instruction in the interpreted program
leads to a UNIX \fIstat()\fP system call by the interpreter.
This call fills the given struct with stat information, the contents
and alignments of which are determined by the version of UNIX and the
used C compiler, resp.
The interpreter, like any program wishing to do system calls that fill
structs, has to be translated by a C compiler that uses the
appropriate struct definition and alignments, so that it can use, e.g.,
\fIstab.st_mtime\fP and expect to obtain the right field.
This struct cannot be copied directly to the EM memory to fulfill the
MON instruction.
First, the struct may contain extraneous, system-dependent fields,
pertaining, e.g., to symbolic links, sockets, etc.
Second, it may contain holes, due to alignment requirements.
The EM program runs on an EM machine, knows nothing about these
requirements and expects UNIX Version 7 fields, with offsets as
determined by the em22, em24 or em44 compiler, resp.
To do the conversion, the interpreter has a built-in table of the
offsets of all the fields in the structs that are filled by the MON
instruction.
The appropriate fields from the result of the UNIX \fIstat()\fP are copied
one by one to the appropriate positions in the EM memory to be filled
by MON 18.
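.LP
The field-by-field copy can be sketched as follows.
Everything in this example is hypothetical: \fIstruct host_stat\fP is a
stand-in for the struct filled by UNIX, the offsets are invented rather
than taken from the table in the interpreter, and bytes are stored low
byte first, as the stack dumps in appendix B suggest.
.DS
```c
#include <stddef.h>

/* Stand-in for the struct filled by the host system call; a real one
 * has more fields and system-dependent alignment. */
struct host_stat {
    unsigned long h_size;
    unsigned long h_mtime;
    unsigned int  h_mode;
};

/* Invented offsets and sizes of the same fields in EM memory. */
enum {
    EM_MODE_OFF  = 0, EM_MODE_SIZE  = 2,
    EM_SIZE_OFF  = 2, EM_SIZE_SIZE  = 4,
    EM_MTIME_OFF = 6, EM_MTIME_SIZE = 4,
    EM_STAT_SIZE = 10
};

/* Store the low-order size bytes of v at em[off], low byte first. */
static void put_field(unsigned char *em, size_t off, size_t size,
                      unsigned long v)
{
    size_t i;

    for (i = 0; i < size; i++) {
        em[off + i] = (unsigned char)(v & 0xFF);
        v >>= 8;
    }
}

/* Copy the fields one by one; holes and extra host fields are skipped. */
static void stat_to_em(const struct host_stat *st, unsigned char *em)
{
    put_field(em, EM_MODE_OFF, EM_MODE_SIZE, st->h_mode);
    put_field(em, EM_SIZE_OFF, EM_SIZE_SIZE, st->h_size);
    put_field(em, EM_MTIME_OFF, EM_MTIME_SIZE, st->h_mtime);
}
```
.DE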
.PP
The \fIioctl()\fP call (MON 54) poses additional problems. Not only does it
have a second argument which is a pointer to a struct, the type of
which is dynamically determined, but its first argument is an opcode
that varies considerably between the versions of UNIX.
To solve the first problem, the interpreter examines the opcode (request) and
treats the second argument accordingly. The second problem can be solved by
translating the UNIX Version 7 \fIioctl()\fP request codes to their proper
values on the various systems. This is, however, not always useful, since
some EM run-time systems use the local request codes. There is a compile-time
flag, V7IOCTL, which, if defined, will restrict the \fIioctl()\fP call to the
version 7 request codes and emulate them on the local system; otherwise the
request codes of the local system will be used (as far as implemented).
.PP
Minor problems also showed up with the implementation of \fIexecve()\fP
and \fIfork()\fP.
\fIExecve()\fP expects three pointers on the stack.
The first points to the name of the program to be executed,
the second and third are the beginnings of the \fBargv\fP and \fBenvp\fP
pointer arrays respectively.
We cannot pass these pointers to the system call however, because
the EM addresses to which they point do not correspond with UNIX
addresses.
Moreover, (it is not very likely to happen but) what if someone constructs
a program holding the contents for one of these pointers in the stack?
The stack is implemented upside down, so passing the pointer to
\fIexecve()\fP causes trouble for this reason too.
The only solution was to copy the pointer contents completely
to fresh UNIX memory, constructing vectors which can be passed to the
system call.
Any impending memory fault while making these copies results in failure of the
system call, with \fIerrno\fP set to EFAULT.
.PP
The implementation of the \fIfork()\fP call faced us with problems
concerning IO-channels.
Checking messages (as well as logging) must be divided over different files.
Otherwise, these messages will coincide.
This problem was solved by post-fixing the default message file
\fBint.mess\fP (as well as the logging file \fBint.log\fP) with an
automatically leveled number for every new forked process.
Children of the original process do their diagnostics
in files with postfix 1,2,3 etc.
Second generation processes are assigned files numbered 11, 12, 21 etc.
When 6 generations of processes exist at one moment, the seventh will
get the same message file as the sixth, since the filename would
otherwise become too long.
.PP
Some of the monitor calls receive pointers (addresses) from the program, to be
passed to the kernel; examples are the struct stat for \fIstat()\fP, the area
to be filled for \fIread()\fP, etc. If the address is wrong, the kernel does
not generate a trap, but rather the system call returns with failure, while
\fIerrno\fP is set to EFAULT. This is implemented by consistent checking of
all pointers in the MON instruction.
.NH 2
Internal arithmetic.
.PP
Doing arithmetic on signed integers, the smallest negative integer
(\fIminsint\fP) is considered a legal value.
This is in contradiction with the EM Manual [1], page 14, which proposes using
\fIminsint\fP for uninitialized integers.
The shadow bytes already check for uninitialized integers however,
so we do not need this special illegal value.
Although the EM Manual provides two traps, for undefined integers and floats,
undefined objects occur so frequently (e.g. in block copying partially
initialized areas) that the interpreter just gives a warning.
.LP
Except for arithmetic on unsigneds, all arithmetic checks for overflow.
The value that is pushed on the stack after an overflow occurs depends
on the UNIX behavior with regard to that particular calculation.
If UNIX would not accept the calculation (e.g. division by zero), a zero
is pushed as a convention.
Illegal computations which UNIX does accept in silence (e.g. one's
complement of \fIminsint\fP), simply push the UNIX-result after giving a
trap message.
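.LP
The two conventions can be sketched for signed addition and division.
The names \fIchecked_add\fP and \fIchecked_div\fP are hypothetical, and
so are two assumptions: that the wrapped-around sum is what UNIX would
deliver in silence, and that dividing the smallest negative integer by
\-1 is refused in the same way as division by zero.
.DS
```c
#include <limits.h>

/* Add a and b.  *trap is set to 1 if the true sum does not fit; the
 * test itself never overflows.  The value returned is the wrapped
 * result (the conversion back to long is implementation-defined, but
 * wraps on the usual two's complement machines). */
static long checked_add(long a, long b, int *trap)
{
    *trap = (b > 0 && a > LONG_MAX - b) || (b < 0 && a < LONG_MIN - b);
    return (long)((unsigned long)a + (unsigned long)b);
}

/* Divide a by b.  If the calculation would not be accepted, *trap is
 * set to 1 and zero is returned as a convention. */
static long checked_div(long a, long b, int *trap)
{
    *trap = (b == 0) || (a == LONG_MIN && b == -1);
    if (*trap)
        return 0;
    return a / b;
}
```
.DE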
.NH 2
Shadow bytes implementation.
.PP
A great deal of run-time checking is performed by the interpreter (except if
used in the fast version).
This section gives all details about the shadow bytes.
In order to keep track of information about the contents of D-space (stack
and global data area), there is one shadow-byte for each byte in these spaces.
Each bit in a shadow-byte represents some piece
of information about the contents of its corresponding 'sun-byte'.
All bits off indicates an undefined sun-byte.
One or more bits on always guarantees a well-defined sun-byte.
The bits have the following meaning:
.IP "\(bu bit 0:" 8
indicates that the sun-byte is (a part of) an integer.
.IP "\(bu bit 1:" 8
the sun-byte is a part of a floating point number.
.IP "\(bu bit 2:" 8
the sun-byte is a part of a pointer in dataspace.
.IP "\(bu bit 3:" 8
the sun-byte is a part of a pointer in the instruction space.
According to [1] (paragraph 6.4), there are two types of pointers which
must be distinguishable.
Conversion between these two types is impossible.
The shadow-bytes make the distinction here.
.IP "\(bu bit 4:" 8
protection bit.
Indicates that the sun-byte is part of a protected piece of memory.
There is a protected area in the stack, the Return Status Block.
The EM machine language has no possibility to declare protected
memory, as is possible in EM assembly (the ROM instruction). The protection
bit is, however, set for the line number and filename pointer area near
location 0, to aid in catching references to location 0.
.IP "\(bu bit 5/6/7:" 8
free for later use.
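.LP
The bit assignment can be written down as a few C definitions.
This is a sketch only: the names below are hypothetical and need not
agree with those declared in the interpreter.
.DS
```c
#include <stddef.h>

#define SH_INT    0x01    /* bit 0: part of an integer */
#define SH_FLOAT  0x02    /* bit 1: part of a floating point number */
#define SH_DATAP  0x04    /* bit 2: part of a pointer in dataspace */
#define SH_INSP   0x08    /* bit 3: part of a pointer in instruction space */
#define SH_PROT   0x10    /* bit 4: protected memory */

#define MEMSIZE 64
static unsigned char shadow[MEMSIZE];    /* one shadow-byte per sun-byte */

/* Mark size bytes from addr on as being of the given kind; an already
 * present protection bit is preserved. */
static void set_kind(size_t addr, size_t size, unsigned char kind)
{
    size_t i;

    for (i = 0; i < size; i++)
        shadow[addr + i] =
            (unsigned char)((shadow[addr + i] & SH_PROT) | kind);
}

/* All bits off means that the sun-byte is undefined. */
static int is_defined(size_t addr)
{
    return shadow[addr] != 0;
}

/* Return 1 if every one of the size bytes from addr on has the kind bit. */
static int is_kind(size_t addr, size_t size, unsigned char kind)
{
    size_t i;

    for (i = 0; i < size; i++)
        if (!(shadow[addr + i] & kind))
            return 0;
    return 1;
}

static int is_protected(size_t addr)
{
    return (shadow[addr] & SH_PROT) != 0;
}
```
.DE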
.LP
The shadow bytes are managed by the routines declared in \fIshadow.h\fP.
The warnings originating from checking these shadow-bytes during
run-time are various.
A list of them is given in appendix A, together with suggestions
(primarily for the C-programmer) where to look for the trouble maker(s).
.LP
Note that once a warning is generated, it may be repeated
thousands of times.
Since repetitive warnings carry little information, but consume much
file space, the interpreter keeps track of the number of times a given warning
has been produced from a given line in a given file.
The warning message will
be printed only if the corresponding counter is a power of four (starting at
1). In this way, a logarithmic back-off in warning generation is established.
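.LP
The back-off test is small enough to show in full.
This is a sketch; the names \fIis_power_of_four\fP and \fIshould_print\fP
are hypothetical.
.DS
```c
/* Return 1 if n is a power of four (1, 4, 16, 64, ...), 0 otherwise. */
static int is_power_of_four(unsigned long n)
{
    if (n == 0)
        return 0;
    while (n % 4 == 0)
        n /= 4;
    return n == 1;
}

/* Count one more occurrence of a warning at a given file position and
 * return 1 if this occurrence should actually be printed. */
static int should_print(unsigned long *counter)
{
    *counter += 1;
    return is_power_of_four(*counter);
}
```
.DE
.LP
With this test, a warning that occurs 100 times at the same file position
is printed four times: at occurrences 1, 4, 16 and 64.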
.LP
It might be argued that the counter should be kept for each (warning, PC
value) pair rather than for each (warning, file position) pair. Suppose,
however, that two instructions in a given line would cause the same message
regularly; this would produce two intertwined streams of identical messages,
with their counters jumping up and down. This does not seem desirable.
.NH 2
Return Status Block (RSB)
.PP
According to the description in [1], at least the return address and the
base address of the previous RSB have to be pushed when performing a call.
Besides these two pointers, other information can be stored in the RSB
also.
The interpreter pushes the following items:
.IP \-
a pointer to the current filename,
.IP \-
the current line number (always four bytes),
.IP \-
the Local Base,
.IP \-
the return address (Program Counter),
.IP \-
the current procedure identifier
.IP \-
the RSB code, which distinguishes between initial start-up, normal call,
returnable trap and non-returnable trap (a word-size integer).
.LP
Consequently, the size of the RSB varies, depending on
word size and pointer size; its value is available as \fIrsbsize\fP.
When the RSB is removed from the stack (by a RET or RTT) the RSB code is under
the Stack Pointer for immediate checking. It is not clear what should be done
if RSB code and return instruction do not match; at present we give a message
and continue, for what it is worth.
.PP
The reason for pushing filename and line number is that some front-ends tend
to forget the LIN and FIL instructions after returning from a function.
This may result in error messages in wrong source files and/or line numbers.
.PP
The procedure identifier is kept and restored to check that the PC will not
move out of the running procedure. The PI is an index in the proctab, which
tells the limits in the text segment of the running procedure.
.PP
If the Return Status Block is generated as a result of a trap, more is
stacked. Before stacking the normal RSB, the trap function pushes the
following items:
.IP \-
the contents of the entire Function Return Area,
.IP \-
the number of bytes significant in the above (a word-size integer),
.IP \-
a word-size flag indicating if the contents of the FRA are valid,
.IP \-
the trap number (a word-size integer).
.LP
The latter is followed directly by the RSB, and consequently acts as the only
parameter to the trap handler.
.NH 2
Operand access.
.PP
The EM Manual mentions two ways to access the operands of an instruction. It
should be noticed that the operand in EM is often not the direct operand of the
operation; the operand of the ADI instruction, e.g., is the width of the
integers to be added, not one of the integers themselves. The various operand
types are described in [1]. Each opcode in the text segment identifies an
instruction with a particular operand type; these relations are described in
computer-readable format in a file in the EM tree, \fIip_spec.t\fP.
.PP
The interpreter uses a variant of the second method. Several other approaches
can be designed, with increasing efficiency and equally increasing complexity.
They are briefly treated below.
.NH 3
The Dispatch Table, Method 1.
.PP
When the interpreter starts, it reads the ip_spec.t file and constructs from it
a dispatch table. This table (of which there are actually three,
for primary, secondary
and tertiary opcodes) has 256 entries, each describing an instruction with
indications on how to decode the operand. For each instruction executed, the
interpreter finds the entry in the dispatch table, finds information there on
how to access the operand, constructs the operand and calls the appropriate
routine with the operand as calculated. There is one routine for each
instruction, which is called with the ready-made operand. Method 1 is easy to
program but requires constant interpretation of the dispatch table.
.NH 3
Intelligent Routines, Method 2.
.PP
For each opcode there is a separate routine, and since an opcode uniquely
defines the instruction and the operand format, the routine knows how to get
the operand; this knowledge is built into the routine. Preferably the heading
of the routine is generated automatically from the ip_spec.t file. Operand
decoding is immediate, and no dispatch table is needed. Generation of the
469 required routines is, however, far from simple. Either a generated array
of routine names or a generated switch statement is used to map the opcode onto
the correct routine. The switch approach has the advantage that parameters can
be passed to the routines.
.LP
The interpreter uses a variant of the switch statement scheme. Numerical
information that can be deduced from the opcode is passed as parameters to the
routine; this includes the argument of minis, the high order byte of shorties,
and the fact that the result is to be multiplied by the word size. This
reduces the number of required routines to 338.
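.LP
The switch statement scheme can be illustrated with a toy instruction set.
The opcode values and the machine state below are invented for the
example; the routine names merely imitate the style of those that appear
in the logging output, and their bodies are ours.
.DS
```c
#include <stddef.h>

static const unsigned char *text;    /* text segment */
static size_t PC;                    /* index of the next byte in text */
static long stack[16];
static int sp;                       /* number of items on the stack */

static void push(long v)
{
    stack[sp++] = v;
}

/* Mini: the whole operand is implied by the opcode. */
static void DoLOCm(long arg)
{
    push(arg);
}

/* Shortie: the high order byte comes with the opcode, the low order
 * byte is fetched from the text segment. */
static void DoLOCs(long hob)
{
    push((hob << 8) | text[PC++]);
}

/* Execute one instruction; return 0 on an opcode unknown to the toy. */
static int step(void)
{
    unsigned opcode = text[PC++];

    switch (opcode) {
    case 0: DoLOCm(0); break;
    case 1: DoLOCm(1); break;
    case 2: DoLOCs(0); break;
    case 3: DoLOCs(1); break;
    default: return 0;
    }
    return 1;
}
```
.DE
.LP
Four opcodes are served by two routines here, because the numerical
information deduced from the opcode travels as a parameter.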
.NH 3
Intelligent Calls.
.PP
The call in the switch statement does full operand construction, and the
resulting operand is passed to the routine. This reduces the number of
routines to 133, the number of EM instructions. Generation of the switch
statement from ip_spec.t will be complicated, but the routine space will be
much cleaner. This will not give any speed-up since the same actions are still
required; they are just performed in a different place.
.NH 3
Static Evaluation.
.PP
It can be observed that the evaluation of the operand of a given instruction in
the text segment will always give the same result. It is therefore possible to
preprocess the text segment, decomposing the instructions into structs which
contain the address, the instruction code and the operand. No operand decoding
will be necessary at run-time: all operands have been precalculated. This will
probably give a considerable speed-up. Jumps, especially GTO jumps, will,
however, require more attention.
.NH 2
Disassembly.
.PP
A disassembly facility is available, which gives a readable but not
letter-perfect disassembly of the EM object. The procedure structure is
indicated by placing the indication \fBP[n]\fP at the entry point of each
procedure, where \fBn\fP is the procedure identifier. The number of locals is
given in a comment.
.LP
The disassembler was generated by the software in the directory \fIswitch\fP
and then further processed by hand.

181
doc/int/txt3 Normal file

@ -0,0 +1,181 @@
.\" Logging
.\"
.\" $Header$
.bp
.NH
THE LOGGING MACHINE.
.PP
Since messages and warnings provided by \fBint\fP include source code file
names and line numbers, they alone often suffice to identify the error.
If, however, the necessity arises, much more extensive debugging information
can be obtained by activating the Logging Machine.
This Logging Machine, which monitors all actions of the EM machine, is the
subject of this chapter.
.NH 2
Implementation.
.PP
When inspecting the source code of \fBint\fP, many lines in the
following format will show up:
.DS
LOG(("@<\fIletter\fP><\fIdigit\fP> message", args));
.DE
or
.DS
LOG(("\ <\fIletter\fP><\fIdigit\fP> message", args));
.DE
The double parentheses are needed, because \fILOG()\fP is
declared as a define, and has a printf-like argument structure.
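.LP
The role of the double parentheses can be illustrated by the following sketch.
It is not the definition used in \fBint\fP; the function \fIdo_log()\fP and
the buffer \fIlog_line\fP are hypothetical.
A macro with a single parameter cannot accept a variable number of arguments,
so the complete argument list, including its own parentheses, is passed as
that one parameter and pasted behind the function name.

```c
#include <stdarg.h>
#include <stdio.h>

static char log_line[128];      /* hypothetical: holds the last message */

/* printf-like routine that does the actual work. */
static void do_log(const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    vsnprintf(log_line, sizeof log_line, fmt, ap);
    va_end(ap);
}

/* The single macro argument a is the parenthesized argument list,
 * so LOG(("x %d", 1)) expands to do_log ("x %d", 1). */
#define LOG(a) do_log a
```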
.PP
The <\fIletter\fP> classifies the log message and corresponds to an entry in
the \fIlogmask\fP, which holds a threshold for each class of messages.
The following classes exist:
.TS
tab(@);
l l l.
\(bu A\-Z@the flow of instructions:
@A: array
@B: branch
@C: convert
@F: floating point arithmetic
@I: integer arithmetic
@L: load
@M: miscellaneous
@P: procedure call
@R: pointer arithmetic
@S: store
@T: compare
@U: unsigned arithmetic
@X: logical
@Y: sets
@Z: increment/decrement/zero
\(bu d@stack dumping.
\(bu g@gda & heap manipulation.
\(bu s@stack manipulation.
\(bu r@reading the loadfile.
\(bu q@floating point calculations during reading the loadfile.
\(bu x@the instruction count, contents and file position.
\(bu m@monitor calls.
\(bu p@procedure calls and returns.
\(bu t@traps.
\(bu w@warnings.
.TE
.LP
When the interpreter reaches a LOG(()) statement it scans its first argument;
if \fIletter\fP
occurs in the logmask, and if \fIdigit\fP is lower than or equal to the
threshold in the logmask, the message is given.
Depending on the first character, the message will be preceded by a
position indication (with the @) or will be printed as is (with the
space).
The \fIletter\fP determines the message class
and the \fIdigit\fP is used to distinguish various levels
of logging, with a lower digit indicating a more important message.
We will call the <\fIletter\fP><\fIdigit\fP> combination the \fBid\fP of
the logging.
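.LP
The test described above might look as follows in C.
This is a sketch under assumptions: the array \fIthreshold\fP stands in for
the logmask, the value \-1 marks a class that does not occur in it, and the
names \fIclear_logmask()\fP and \fImust_print()\fP are invented for the
example.

```c
#include <limits.h>

/* One threshold per message class; -1 means: class not in the logmask. */
static int threshold[UCHAR_MAX + 1];

static void clear_logmask(void)
{
    int i;

    for (i = 0; i <= UCHAR_MAX; i++)
        threshold[i] = -1;
}

/* fmt starts with '@' or ' ', followed by the letter and the digit.
 * Returns 1 if the message must be given, 0 otherwise. */
static int must_print(const char *fmt)
{
    int letter, digit;

    if (fmt[0] != '@' && fmt[0] != ' ')
        return 0;
    letter = (unsigned char)fmt[1];
    if (letter == 0 || fmt[2] < '0' || fmt[2] > '9')
        return 0;
    digit = fmt[2] - '0';
    /* An absent class has threshold -1, so no digit passes the test. */
    return digit <= threshold[letter];
}
```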
.LP
In general, the lower the \fIdigit\fP following the \fIletter\fP,
the more important the message.
E.g., m5 reports unsuccessful monitor calls only; m9 also reports
successful monitor calls (which are obviously less interesting).
New logging messages can be added to the source code wherever you
think them relevant.
.LP
Reasonable settings for the logmask are:
.TS
tab(@);
l l l.
@A\-Z9d4twx9@advised setting when troubleshooting (default).
@A\-Zx9@shows the flow of instructions & global information.
@pm9@shows the procedure & monitor calls.
@tw9@shows warning & trap information.
.TE
.PP
An EM interpreter without a Logging Machine can be obtained by undefining the
macro \fICHECKING\fP in the file \fIchecking.h\fP.
.NH 2
Controlling the Logging Machine.
.PP
The actions of the Logging Machine are controlled by a set of internal
variables (one of which is the log mask).
These variables can be set through assignments on the command line, as
explained in the manual page \fIint.1\fP, q.v.
Since there are a great many logging statements in the program, of which only a
few will be executed in any call of the interpreter, it is important to be able
to decide quickly if a given \fIid\fP has to be checked at all.
To this end all logging statements are guarded (in the #define) by a test for
the boolean variable \fIlogging\fP.
This variable will only be set if the command line assignments show the
potential need for logging (\fImust_log\fP) and the instruction count
(\fIinr\fP) is at least equal to \fIlog_start\fP (which derives from the
parameter \fBLOG\fP).
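.LP
In C the condition amounts to something like the sketch below.
The variable names follow the text; the function \fIupdate_logging()\fP and
the moment at which it is called are assumptions.
Because the result is kept in a single variable, each logging statement costs
only one test when logging is off.

```c
static int must_log;     /* command line assignments ask for logging */
static long inr;         /* instruction count */
static long log_start;   /* derived from the parameter LOG */
static int logging;      /* tested by every logging statement */

/* Recomputes the guard, e.g. after the instruction count changes. */
static void update_logging(void)
{
    logging = must_log && inr >= log_start;
}
```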
.LP
The log mask can be set by the assignment
.DS
"LOGMASK=\fIlogstring\fP"
.DE
which sets the current logmask to \fIlogstring\fP.
A logstring has the following form:
.DS
[ [ \fIletter\fP | \fIletter\fP \- \fIletter\fP ]+ \fIdigit\fP ]+
.DE
E.g. LOGMASK=A\-D8x9R7c0hi4 will print all messages belonging to loggings
with \fBid\fPs:
\fIA0..A8,B0..B8,C0..C8,D0..D8,x0..x9,R0..R7,c0,h0..h4,i0..i4\fP.
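.LP
A parser for this form can be sketched as follows.
The array \fImask\fP and the routine \fIset_logmask()\fP are hypothetical
names, and the sketch accepts any non-digit character as a class, which is
more liberal than the form given above.
On a malformed logstring the mask may be left partially filled.

```c
#include <ctype.h>
#include <limits.h>

static int mask[UCHAR_MAX + 1];   /* threshold per class, -1 if absent */

/* Parses [ [ letter | letter - letter ]+ digit ]+ into mask[].
 * Returns 1 on success, 0 on a malformed logstring. */
static int set_logmask(const char *s)
{
    int i;

    for (i = 0; i <= UCHAR_MAX; i++)
        mask[i] = -1;
    if (*s == '\0')
        return 0;                 /* at least one group is required */
    while (*s != '\0') {
        const char *start = s;
        const char *p;
        int digit;

        /* Find the digit that closes this group of classes. */
        while (*s != '\0' && !isdigit((unsigned char)*s))
            s++;
        if (*s == '\0' || s == start)
            return 0;             /* no digit, or digit without class */
        digit = *s - '0';
        for (p = start; p < s; p++) {
            unsigned char lo = (unsigned char)*p, hi = lo;

            if (lo == '-')
                return 0;         /* range without a start */
            if (p + 2 < s && p[1] == '-') {
                hi = (unsigned char)p[2];
                p += 2;
            } else if (p + 1 < s && p[1] == '-') {
                return 0;         /* range without an end */
            }
            if (hi < lo)
                return 0;
            for (i = lo; i <= hi; i++)
                mask[i] = digit;
        }
        s++;                      /* skip the digit */
    }
    return 1;
}
```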
.PP
The logging variable STOP can be used to prevent run-away logging
past the point where the user expects an error to occur.
STOP=\fInr\fP will stop the interpreter after instruction number \fInr\fP.
.PP
To simplify the use of the logging machine, a number of abbreviations have been
defined.
E.g., AT=\fInr\fP can be thought of as an abbreviation of LOG=\fInr\-1\fP
STOP=\fInr+1\fP; this causes three stack dumps, one before the suspect
instruction, one on it and one after it; then the interpreter stops.
.PP
Logging results will appear in a special logging file (default: \fIint.log\fP).
.NH 2
Dumps.
.PP
There are three routines available to examine the memory contents:
.TS
tab(@);
l l l.
@\fIstd_all()\fP@dumps the contents of the stack (\fId1\fP or \fId2\fP must be in the logmask).
@\fIgdad_all()\fP@dumps the contents of the gda (\fI+1\fP must be in the logmask).
@\fIhpd_all()\fP@dumps the contents of the heap (\fI*1\fP must be in the logmask).
.TE
.LP
These routines can be used everywhere in the program to examine the
contents of memory.
The gda and the heap are each dumped only once, after the instruction
number given by the corresponding internal variable.
The stack is dumped after each
instruction if the log mask contains d1 or d2; d2 gives a full formatted
dump, d1 produces a listing of the Return Status Blocks only.
An attempt is made to format the stack correctly, based on the shadow
bytes, which identify the Return Status Block.
.LP
Remember to set the correct \fBid\fP in the LOGMASK, and to give
LOG the correct value.
If dumping is needed before the first instruction, then LOG must be
set to 0.
.LP
The dumps of the global data area and the heap are controlled internally by
the \fBid\fPs +1 and *1 respectively; the corresponding logmask entries are set
automatically by setting the GDA and HEAP variables.
.NH 2
Forking.
.PP
As mentioned earlier, a call to \fIfork()\fP causes an image of the current
program to start running.
To prevent a messy logfile, the child process gets its own logfile
(and message file, tally file, etc.).
These logfiles are distinguished from the parent logfile by a
postfix, e.g.,
\fIlogfile_1\fP for the first child, \fIlogfile_2\fP for the second child,
\fIlogfile_1_2\fP for the second child of the first child, etc.
.br
\fINote\fP: the implementation of this feature is shaky; it works for the log
file but should also work for other files and for the names of the logging
variables.
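.LP
The naming scheme itself is simple; a sketch, with the hypothetical routine
\fIchild_logname()\fP, could read as follows.
Each child appends its own sequence number to the name used by its parent,
which yields names like \fIlogfile_1_2\fP for nested forks.

```c
#include <stdio.h>

/* Builds the log file name for child number n (counting from 1) of the
 * process whose log file is called parent.
 * Returns 1 on success, 0 if the name does not fit in out. */
static int child_logname(const char *parent, int n, char *out, size_t size)
{
    int len = snprintf(out, size, "%s_%d", parent, n);

    return len >= 0 && (size_t)len < size;
}
```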

200
util/int/int.1 Normal file

@ -0,0 +1,200 @@
.\" Manual page
.\"
.\" $Header$
.TH INT I
.ad
.SH NAME
int \- Interpreter for EM Machine Language
.SH SYNOPSIS
\fBint\fP [ intargs ] [ emfile [ emargs ] ]
.SH DESCRIPTION
This program interprets the EM machine language, and replaces
the EM interpreter written in Pascal that is described in [1].
The program interprets load files in \fIe.out\fP format (see [1], sec. 10.3).
.LP
\fIEmfile\fP is the name of the load file; if no name is
specified, the default name \fIe.out\fP is used.
The program can handle several word size / pointer size combinations.
The combinations presently supported are 2/2, 2/4 and 4/4.
.LP
\fIEmargs\fP are the arguments for the program being interpreted.
If any arguments are given, then \fIemfile\fP must be present.
.PP
The interpreter can generate diagnostic messages (warnings) about the
interpreted program.
Some of these warnings are given very frequently,
which may result in a large, non-functional message file.
To avoid this behavior, counters keep track of the number of times
a given warning occurs in a given file at a given line number.
The warning is actually given only when this counter is a power of 4.
`Logarithmic warning generation' is established in this way.
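.PP
The effect can be sketched in C as shown below.
The routine names are hypothetical and the way \fIint\fP stores its counters
is not shown; the sketch assumes one counter per combination of warning, file
and line number.
With this rule, 100 occurrences of the same warning yield 4 lines in the
message file (occurrences 1, 4, 16 and 64).

```c
/* Returns 1 if n is a power of 4 (1, 4, 16, 64, ...), 0 otherwise. */
static int is_power_of_4(unsigned long n)
{
    if (n == 0)
        return 0;
    while (n % 4 == 0)
        n /= 4;
    return n == 1;
}

/* Counts one more occurrence of a warning at a given file and line.
 * Returns 1 if this occurrence must actually be reported. */
static int count_warning(unsigned long *counter)
{
    return is_power_of_4(++*counter);
}
```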
.PP
\fIInt\fP preempts the highest two file descriptors available, for
diagnostic purposes.
Interpreted programs can use the other file descriptors without
clash problems.
.PP
.I "Interpreter parameters"
.br
\fIInt\fP itself accepts the following options, all given as separate flags:
.IP \fB\-d\fP
The program will not be run; a disassembly listing of the program will
be written to standard output instead.
The original names are lost, but the procedure structure is recovered.
.IP \fB\-h\fP\fIN\fP
The maximum size of the heap will be limited to \fIN\fP bytes. This can be
used to force a heap overflow trap.
.IP \fB\-I\fP\fIN\fP
It is possible to tell \fIint\fP to ignore traps in the range 0-15.
If a trap is ignored, every time the trap would have happened
a warning is generated instead.
The argument \fIN\fP is the trap number, as described in [1], sec. 9.
For ignoring more than one trap, several \fB\-I\fP flags are needed.
.IP \fB\-m\fP\fIfile\fP
The argument \fIfile\fP is the name of a file on which the messages will
appear.
The default file name is \fIint.mess\fP.
.IP \fB\-r\fP\fIN\fP
Determines the size of the Function Return Area.
Default: 2 \(mu pointer size.
.IP \fB\-s\fP\fIN\fP
The maximum size of the stack will be limited to \fIN\fP bytes. This can be
used to force a stack overflow trap.
.IP \fB\-t\fP
If given, a file \fIint.tally\fP will be produced upon program termination.
For each source file, it contains a list of line numbers visited,
with the number of times the line was visited and
the number of EM instructions executed on the line.
.IP \fB\-W\fP\fIN\fP
This option can be used to disable warnings.
The argument \fIN\fP is the number of the warning to be suppressed,
as found in the \fIint\fP documentation [3].
For disabling more than one warning, several \fB\-W\fP flags are needed.
.PP
.I "The Logging Machine"
.br
The EM machine is monitored continually by a Logging Machine. This logging
machine keeps an instruction count and
can produce a trace of the actions of the EM machine, make readable
dumps of the stack, heap and global data area, and stop the EM machine after a
given instruction number.
The actions of the logging machine are controlled by
its internal variables, the values of which can be set by assignments on the
command line, much like setting macro names in a call of \fImake\fP.
These assignments can be interspersed with the options for the EM machine.
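.PP
Recognizing such an assignment among the other arguments could be done along
the lines of the following sketch.
The routine \fIsplit_assignment()\fP is hypothetical, and the rule it applies
(no leading minus sign, a non-empty name before the first equals sign) is an
assumption, not a description of what \fIint\fP does.

```c
#include <string.h>

/* Splits an argument of the form NAME=value.
 * On success copies NAME into name, points *value at the text after
 * the '=' and returns 1; returns 0 if arg is not an assignment. */
static int split_assignment(const char *arg, char *name, size_t size,
                            const char **value)
{
    const char *eq = strchr(arg, '=');
    size_t len;

    if (arg[0] == '-' || eq == NULL || eq == arg)
        return 0;
    len = (size_t)(eq - arg);
    if (len >= size)
        return 0;                 /* name does not fit */
    memcpy(name, arg, len);
    name[len] = '\0';
    *value = eq + 1;
    return 1;
}
```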
.PP
The logging machine has the following internal variables:
.IP \fBLOG\fP=\fIN\fP
Logging will start when the instruction count has reached \fIN\fP.
.IP \fBLOGMASK\fP=\fIstring\fP
The tracing actions are controlled by a log mask; the log mask consists of a
list of pairs of action classes and logging levels.
E.g. \fBLOGMASK\fP=\fIm9\fP means: trace all monitor calls.
The action classes are described fully in [3].
The default log mask is reasonably suitable.
.IP \fBLOGFILE\fP=\fIstring\fP
The \fIstring\fP is the name of a file on which all logging information is
written.
The default file name is \fIint.log\fP.
.IP \fBSTOP\fP=\fIN\fP
The logging machine stops the EM machine after instruction \fIN\fP.
.PP
Stack dumps can be made after each instruction; they are controlled by the pair
\fBd4\fP in the log mask; gda and heap dumps can only be made after a specific
instruction.
The following internal variables pertain to memory dumps:
.IP \fBGDA\fP=\fIN\fP
The contents of the Global Data Area are dumped after instruction \fIN\fP. The
extent can be adjusted by setting \fBGMIN\fP=\fINmin\fP (default 0) and
\fBGMAX\fP=\fINmax\fP (default HB).
.IP \fBHEAP\fP=\fIN\fP
The contents of the heap are dumped after instruction \fIN\fP.
.IP \fBSTDSIZE\fP=\fIN\fP
The stack dump is restricted to the \fIN\fP topmost bytes.
.IP \fBRAWSTACK\fP=\fIN\fP
Normally the stack dump produced is divided into activation records
separated by formatted dumps of the Return Status Blocks.
If \fIN\fP is non-zero, this dividing and formatting is suppressed, and the
stack is dumped raw.
.PP
Some combinations of variable settings are generally useful and can be
abbreviated:
.IP \fBAT\fP=\fIN\fP
Is an abbreviation of \fBLOG\fP=\fIN\-1\fP \fBSTOP\fP=\fIN+1\fP.
The default log mask applies.
.IP \fBL\fP=\fIstring\fP
Is an abbreviation of \fBLOG\fP=\fI0\fP \fBLOGMASK\fP=\fIstring\fP.
E.g., \fBL\fP=\fIm9\fP will log all monitor calls
and \fBL\fP=\fIA\-Z9\fP will log all instructions (give a full trace).
.PP
When the interpreter forks, the child continues logging on a new file named
\fIint.log_1\fP, etc.
In principle it reevaluates the interpreter arguments, now looking for
\fBLOG_1\fP, \fBLOGMASK_1\fP, etc., but this feature has not been fully
implemented.
.PP
.I "Diagnostics"
.br
All diagnostics are written to the message file.
Diagnostics come in three flavors:
.IP \-
(messages): These inform you about NOP instructions, give more information
about incoming signals and display the exit status of the program.
.IP \-
(warnings): These are generated as a result of the checking.
In most cases the diagnostic is self-explanatory.
A complete description of the warnings can be found in the \fIint\fP
documentation [3].
.IP \-
(fatal errors): This diagnostic is the result of an irrecoverable
error, generally before the program has started: incorrect call of the
interpreter, cannot access file, incorrect format of load file. A few occur
during interpretation: out of memory, uncaught traps, floating point operation
on a version without floating point;
execution stops immediately after the diagnostic is generated.
.PP
Further diagnostics are generated (on \fIstderr\fP) if files cannot
be opened or found.
.SH "SEE ALSO"
e.out(5), ack(1), em22(1), em24(1), em44(1).
.IP [1]
Andrew S. Tanenbaum, Hans van Staveren, Ed G. Keizer and Johan W. Stevenson,
\fIDescription of a Machine Architecture for use with Block
Structured Languages\fP, Informatica rapport IR-81.
.IP [2]
Amsterdam Compiler Kit, reference manual and UNIX manual pages.
.IP [3]
Eddo de Groot, Leo van den Berge, Dick Grune,
\fIThe EM Interpreter\fP.
.SH "FILES"
.ta 20n
int.mess contains messages
.br
int.log contains logging info, if requested
.br
int.tally contains tally results, if requested
.br
int.core produced upon fatal error; format provisional
.SH "BUGS"
The monitor calls
.IR mpxcall ,
.I ptrace
and
.I profile
have not been implemented.
.br
The maximum number of bytes for rotation is 4.
.br
The UNIX V7 struct tchars is not emulated under System V.
.br
The P and N restrictions on operands are not checked.
.br
The start-up has a quadratic component in the number of procedures in the EM
program.
.SH "AUTHORS"
L.J.A. van den Berge.
.br
E.J. de Groot.
.br
D. Grune