Initial entry
parent b72f2848dd
commit 6214be89c8
10 changed files with 1837 additions and 0 deletions
16
doc/int/Makefile
Executable file
@@ -0,0 +1,16 @@
# $Header$

TBL=/usr/ditroff/tbl

DOC = draw.mac cover txt1 txt2 txt3 appA appB bib
int.doc: $(DOC)
	$(TBL) $(DOC) > $@

FLS = README .distr Makefile int.1 $(DOC)

.distr: Makefile
	echo $(FLS) | tr ' ' '\012' >.distr

clean:
	rm -f int.doc
4
doc/int/README
Normal file
@@ -0,0 +1,4 @@
# $Header$

This directory contains the text of the documentation for the
Production Quality Interpreter "int".
280
doc/int/appA
Normal file
@@ -0,0 +1,280 @@
.\" List of all warnings; source of warn_msg and warn.h
.\"
.\" $Header$
.\"
.\" This file contains the warnings issued by the interpreter, together
.\" with their names and values in the code of the interpreter. Some of
.\" the source files of the interpreter are generated from the Wn
.\" macros in this file.
.\" When modifying this file, preserve the parameters of the Wn macros.
.de Wn \" <text> <define> <value>
.IP \\$3. 7
.B "\\$1"
.br
.. Wn
.bp
.DS C
APPENDIX A
.DE
.SH
List of Warnings.
.PP
The shadow-byte administration makes it possible to check for a
wide range of errors during run-time.
We have tried to make the diagnostics self-explanatory and especially useful
for the C-programmer.
The warnings are printed in the message file, together with source file
and line number.
The complete list of warnings is presented here, followed by an
explanation of what might be wrong.
Often, these explanations implicitly assume that the program
being interpreted was originally written in C (and not Pascal, Basic, etc.).
.LP
.I "Reading the load file"
.Wn "Floating point instructions flag in header ignored" WFLUSED 1
.Wn "No float initialisation in this version" WFLINIT 2
The interpreter was compiled with the NOFLOAT option; code involving
floating point operations can be run as long as the actual
instructions are avoided.
.Wn "Extra-test flag in header ignored" WEXTRIGN 4
The interpreter already tests anything conceivable.
.Wn "Maximum line number in header was 0" WNLINEZR 5
This number could be used to allocate tables for tallying; these tables are,
however, expanded as needed, so the number is immaterial.
.Wn "Bad float initialisation" WBADFLOAT 7
The loadfile contains a floating point denotation which does not
satisfy the syntax (see 2.6).
Examining the loadfile (with \fBod \-c\fP) might show the syntax error.
Probably there is a bug in the front-end, creating floats with
a bad syntax.
.LP
.I "System calls"
.Wn "IOCTL \- bad or unimplemented request" WBADIOCTL 11
The second parameter to the ioctl() request (the operation code) is invalid or
not implemented; since there are many different opcodes on the various UNIX
systems, it is difficult to tell which. The system call fails.
.Wn "MPXCALL \- not (yet) implemented" WMPXIMP 14
.Wn "PROFIL \- not (yet) implemented" WPROFILIMP 15
.Wn "PTRACE \- not (yet) implemented" WPTRACEIMP 16
The monitor calls \fImpxcall()\fP, \fIprofil()\fP and \fIptrace()\fP
have not been implemented. The monitor call fails.
.Wn "Inaccessible memory in system call" WMONFLT 21
Bad pointers passed to system calls do not cause a memory fault (which in UNIX
would happen to the kernel), but cause the system call to fail with the UNIX
variable errno set to 14 (EFAULT). It seems likely that your program is at
fault, but there is also a good possibility that a library routine made
unwarranted assumptions about word size and pointer size.
.Wn "READ \- buffer resides in unallocated memory" WRUMEM 23
.Wn "READ \- buffer across global data area and heap" WRGDAH 24
When the buffer passed to the read() system call is situated (completely
or partially) in unallocated memory (beyond \fIHP\fP) or begins
in the global data area and ends in the heap, the appropriate warning
is given.
The buffer is not written.
.Wn "WRITE \- buffer resides in unallocated memory" WWUMEM 25
.Wn "WRITE \- buffer across global data area and heap" WWGDAH 26
.Wn "WRITE \- (part of) global buffer is undefined" WWGUNDEF 27
.Wn "WRITE \- (part of) local buffer is undefined" WWLUNDEF 28
The first two are equivalent to the READ-errors above.
Writing out a buffer usually makes no sense when the contents are undefined,
so one of the latter two warnings will be generated in this case.
A global buffer resides in the data partition; a local buffer resides in
the stack partition.
This corresponds to global and local variables in a C-program.
In the first two cases the WRITE is not performed; in the latter two cases
it is.
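.LP
A common way to provoke warnings 27 and 28 from C is to pass the full size of
a buffer to write() when only its first part has been filled.
The sketch below illustrates one way around that; it is not code from the
interpreter or its library, and the helper name fill_message is hypothetical.
It reports how many bytes it actually initialised, so the caller can write
exactly that many.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies msg into buf (without the terminating
 * NUL) and returns the number of bytes that were actually initialised,
 * which is at most cap. */
size_t fill_message(char *buf, size_t cap, const char *msg)
{
    size_t len = strlen(msg);

    if (len > cap)
        len = cap;          /* truncate rather than overrun the buffer */
    memcpy(buf, msg, len);
    return len;
}
```

A caller would then write the returned number of bytes instead of
sizeof buf, so that no undefined bytes are handed to write().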
.LP
.I "Traps and signals"
.Wn "SIGTRP \- bad signo argument" WILLSN 31
The \fIsigtrp()\fP monitor call allows \fIsig_no\fP arguments in the
range [1..17] (UNIX Version 7 signals); the actual argument is out of range.
.Wn "SIGTRP \- signo argument is a synchronous trap" WUNIXTR 32
The signal is one that can only be caused synchronously by the running program
on UNIX; it cannot occur to an interpreted program.
.Wn "SIGTRP \- bad trapno argument" WILLTN 33
The \fIsigtrp()\fP monitor call allows \fItrap_no\fP arguments between 0 and
252, and the special values \-2 and \-3; the actual argument is not one of
these.
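.LP
The two argument ranges above can be written down as predicates.
This is a sketch of the stated ranges only; the function names are
illustrative and do not come from the interpreter source, and the check for
synchronous signals (warning 32) is not modelled.

```c
/* Returns 1 if sig_no lies in the range [1..17] accepted by sigtrp(). */
int signo_in_range(int sig_no)
{
    return sig_no >= 1 && sig_no <= 17;
}

/* Returns 1 if trap_no is between 0 and 252, or one of the special
 * values -2 and -3. */
int trapno_acceptable(int trap_no)
{
    return (trap_no >= 0 && trap_no <= 252)
        || trap_no == -2
        || trap_no == -3;
}
```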
.Wn "Heap overflow due to command line limitation" WEHEAP 36
.Wn "Stack overflow due to command line limitation" WESTACK 37
The maximum sizes of the heap and the stack can be limited by options on the
command line. If overflow occurs due to such limitations, the corresponding
trap is taken, preceded by one of the above warnings. If the memory of the
interpreter itself is exhausted, a fatal error follows.
.LP
.I "Run-time type checking"
.Wn "Local character expected" WLCEXP 41
.Wn "Global character expected" WGCEXP 42
.Wn "Local integer expected" WLIEXP 43
.Wn "Global integer expected" WGIEXP 44
.Wn "Local float expected" WLFEXP 45
.Wn "Global float expected" WGFEXP 46
.Wn "Local data pointer expected" WLDPEXP 47
.Wn "Global data pointer expected" WGDPEXP 48
.Wn "Local instruction pointer expected" WLIPEXP 49
.Wn "Global instruction pointer expected" WGIPEXP 50
In general, a type violation has taken place when one of
these warnings is given.
The \fBfloat\fP- and \fBinstruction pointer\fP warnings are rare and will
usually be easily traceable.
\fBInteger/character expected\fP will normally occur when unsigned arithmetic
is performed on datapointers or when memory containing objects other than
integers is copied bytewise.
Often, this warning is followed by a warning \fBdatapointer expected\fP.
This is due to our decision to transform pointers to (unsigned) integers
after doing unsigned arithmetic on them.
When such a transformed integer is dereferenced (as if it were a pointer)
or, in general, when it is treated as a pointer, this results in a warning.
The present library implementation of malloc() causes such a
sequence of errors.
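.LP
As an illustration of the kind of source code that leads to this sequence,
the sketch below rounds a pointer up to an alignment boundary by way of
unsigned integer arithmetic, much as an allocator might.
It is not taken from the library malloc(); the name align_up is hypothetical
and the type uintptr_t is a modern C spelling used here for clarity.

```c
#include <stddef.h>
#include <stdint.h>

/* Rounds p up to the next multiple of align, which must be a power of
 * two.  The pointer is converted to an unsigned integer, adjusted with
 * unsigned arithmetic, and converted back to a pointer. */
void *align_up(void *p, size_t align)
{
    uintptr_t bits = (uintptr_t)p;

    bits = (bits + (align - 1)) & ~(uintptr_t)(align - 1);
    return (void *)bits;
}
```

Under the rule described above, the value would count as an integer once the
unsigned arithmetic has been done, so a later dereference of the result
would be expected to draw a data pointer warning.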
.LP
These messages are always followed by a tentative description of what is found
in memory at the offending place.
.Wn "Actual memory is undefined" WWASUND 61
.Wn "Actual memory contains an integer" WWASINT 62
.Wn "Actual memory contains a float" WWASFLOAT 63
.Wn "Actual memory contains a data pointer" WWASDATAP 64
.Wn "Actual memory contains an instruction pointer" WWASINSP 65
.Wn "Actual memory contains mixed information" WWASMISC 66
If the contents of the area were undefined,
check the source code for an uninitialized variable of the mentioned type.
Officially, the use of an undefined value
should result in an EIUND or EFUND trap but the occurrence is
so common that a warning is more appropriate.
The contents of memory are described as mixed if the data consists of pieces
of different types. This happens, e.g., when caller and callee do not agree on
the types and lengths of the parameters.
.LP
.I "Protection"
.br
.Wn "Destroying contents of ROM (at or near loc 0)" WDESROM 71
The program stores a value in Read-Only Memory; the only ROM in the present
implementation is the area near location 0. The warning probably results from
storing through a NULL pointer. This is only a warning; the store operation is
executed normally. Reads from location 0 are not detected.
.Wn "Destroying contents of Return Status Block" WDESRSB 72
The Return Status Block is the stack area containing the return address, the
dynamic link, etc.
This may or may not be an error.
The current implementation of \fIsetjmp()\fP/\fIlongjmp()\fP
may be responsible for it.
If your program does not use setjmp(), there \fIis\fP something
very wrong (e.g. argument for ASP too large).
Note that there are some library routines (such as \fIalarm()\fP) which
use \fIsetjmp()\fP.
.Wn "Logical operation using undefined operand(s)" WUNLOG 81
.Wn "Comparing undefined operand(s)" WUNCMP 82
The logical operations AND, XOR, IOR, COM and the compare operation
CMS do their jobs bytewise.
If one of the bytes is found to be undefined, the corresponding warning
is given, and the operation is stopped immediately.
The stack is adjusted so interpretation may continue.
.br
It is hard to say what went wrong.
Possibly, the argument of the instruction at hand (which indicates the
size of the objects to be compared) was too large.
.LP
.I "Bad operands"
.Wn "Shift over negative distance" WSHNEG 91
.Wn "Shift over too large distance" WSHLARGE 92
Shift instructions yield undefined results if the shift distance is negative
or larger than the object size.
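.LP
At the C level, both warnings can be avoided by validating the distance
before shifting.
The following is a minimal sketch; the function name is hypothetical.
It follows the C rule that the distance must be smaller than the width of
the operand in bits, which is an assumption on our part and may be slightly
stricter than the limit the interpreter applies.

```c
#include <limits.h>

/* Stores value << distance in *result and returns 1 when the distance
 * is valid; returns 0 and leaves *result untouched otherwise. */
int checked_shift_left(unsigned int value, int distance,
                       unsigned int *result)
{
    int width = (int)(sizeof value * CHAR_BIT);

    if (distance < 0 || distance >= width)
        return 0;           /* negative or too large a distance */
    *result = value << distance;
    return 1;
}
```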
.Wn "Pointer arithmetic yields pointer to bad segment" WSEGADP 93
When doing pointer arithmetic (ADP, ADS), the operand and result pointer
must be in the same \fIsegment\fP (see sec. 4).
E.g. loading the address of the first local and adding 20 to it will
certainly give this warning.
.Wn "Subtracting pointers to different segments" WSEGSBS 94
Pointers may be subtracted only if they point into the same segment.
.Wn "Pointer arithmetic with NULL pointer" WNULLPA 96
By definition it is illegal to do arithmetic with null pointers.
Integers with the size of a pointer and the value zero are recognized
as NULL pointers.
A well-known C-trick to compute the offset of some field in a struct
is converting the null pointer to a pointer to the struct type and simply
taking the address of the field.
This trick will \-when translated and interpreted\- generate this warning
because it results in arithmetic with the NULL pointer.
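.LP
For comparison, the sketch below shows the trick in a comment and the
standard offsetof macro from <stddef.h> that expresses the same intent.
The struct and the function name are made up for the example.
Whether a given compiler's offsetof avoids the warning is an open question
here, since some libraries define the macro with the very same null-pointer
expression.

```c
#include <stddef.h>

/* Illustrative struct; the field names are arbitrary. */
struct record {
    char tag[8];
    long count;
};

/* The trick described above would read:
 *     (size_t)&((struct record *)0)->count
 * The standard macro states the same thing without spelling out
 * arithmetic on a null pointer in the source. */
size_t count_offset(void)
{
    return offsetof(struct record, count);
}
```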
.LP
.I "Return area"
.Wn "Returned function result too large" WRFUNLAR 101
.Wn "Returned function result too small" WRFUNSML 102
This warning is generated when the size of the expected return value
is not equal to the size actually returned.
.br
Your interpreted program may have fallen through the end of
the code without explicitly doing an \fIexit()\fP or \fIreturn()\fP.
The start-up routine (\fIcrt0()\fP) however always expects to get some
value returned by the program proper.
.br
Another (less probable) possibility of course is that the code contains
a subroutine or function call that does not return properly (e.g.
it returns a short instead of a long).
.Wn "Returned function result may be garbled" WRFUNGAR 103
This warning will be generated when the contents of the FRA are fetched
after some instruction is executed which can mess up the area.
Compiler-generated loadfiles should not generate this message.
.LP
.I "Return Status Block"
.Wn "RET did not find a Return Status Block" WRETBAD 111
.Wn "Used RET to return from a trap" WRETTRAP 112
The RET instruction found a garbled Return Status Block, or one that resulted
from a trap.
.Wn "RTT did not find a Return Status Block" WRTTBAD 115
.Wn "RTT on empty stack" WRTTEMPTY 116
.Wn "Used RTT to return from a call" WRTTCALL 117
.Wn "Used RTT to return from a non-returnable trap" WRTTNRTT 118
The RTT (Return from Trap) instruction found a Return Status Block that was not
created properly by a trap.
.Wn "Stack Pointer too large in RET" WRETSTL 121
.Wn "Stack Pointer too small in RET" WRETSTS 122
.Wn "Stack Pointer too large in RTT" WRTTSTL 125
.Wn "Stack Pointer too small in RTT" WRTTSTS 126
According to the EM Manual (4.2), "the value of SP just after the return
value has been popped must be the same as the
value of SP just before executing the first instruction of the
invocation."
If the Stack Pointer is too large, some dynamically allocated item or some
temporary result may have been left behind on the stack.
If the Stack Pointer is too small, some locals have been unstacked.
Since the interpreter has enough information in the Return Status Block, it
recovers correctly from these errors.
.LP
.I "Traps"
.LP
Some traps have ambiguous or non-obvious causes.
As far as possible, these are preceded by a warning, explaining the
circumstances of the trap.
.Wn "Trap ESTACK: DCH on bad LB" WDCHBADLB 131
.Wn "Trap ESTACK: LPB on bad LB" WLPBBADLB 132
.Wn "Trap ESTACK: SP retracted over Return Status Block" WSPGTLB 133
.Wn "Trap ESTACK: SP moved into data area" WSPINHEAP 134
.Wn "Trap ESTACK: SP set to non-word-boundary" WSPODD 135
.Wn "Trap ESTACK: LB set out of stack" WLBOUT 136
.Wn "Trap ESTACK: LB set to non-word-boundary" WLBODD 137
.Wn "Trap ESTACK: LB set to position where there is no RSB" WLBRSB 138
.Wn "Trap EHEAP: HP retracted into Global Data Area" WHPGDA 141
.Wn "Trap EHEAP: HP pushed into stack" WHPSTACK 142
.Wn "Trap EHEAP: HP set to non-word-boundary" WHPODD 143
.Wn "Trap EILLINS: unknown opcode" WBADOPC 151
.Wn "Trap EILLINS: conversion with unacceptable size for this machine" WILLCONV 152
.Wn "Trap EILLINS: FIL with non-existing address" WILLFIL 153
.Wn "Trap EILLINS: LFR with too large size" WILLLFR 154
.Wn "Trap EILLINS: RET with too large size" WILLRET 155
.Wn "Trap EILLINS: instruction argument of class c does not fit a word" WARGC 156
.Wn "Trap EILLINS: instruction on double word on machine with word size 4" WARGD 157
.Wn "Trap EILLINS: local offset too large" WARGL 158
.Wn "Trap EILLINS: instruction argument of class g not in GDA" WARGG 159
.Wn "Trap EILLINS: fragment offset too large" WARGF 160
.Wn "Trap EILLINS: counter in lexical instruction out of range" WARGN 161
.Wn "Trap EILLINS: non-existent procedure identifier" WARGP 162
.Wn "Trap EILLINS: illegal register number" WARGR 163
.Wn "Trap EBADPC: jump out of text segment" WPCOVFL 172
.Wn "Trap EBADPC: jump out of procedure fragment" WPCPROC 173
.Wn "Trap EBADGTO: GTO does not restore an existing RSB" WGTORSB 181
.Wn "Trap EBADGTO: GTO descriptor on the stack" WGTOSTACK 182
.Wn "Trap caused by TRP instruction" WTRP 191
.ig
.Wn "Last warning" WMSG 199
!Leave these lines here!
..
486
doc/int/appB
Normal file
@@ -0,0 +1,486 @@
.\" A simple tutorial
.\"
.\" $Header$
.\"
.bp
.DS
APPENDIX B
.DE
.SH
How to use the interpreter
.PP
The interpreter is not normally used for the debugging of programs under
construction. Its primary application is as a verification tool for almost
completed programs. Although the proper operation of the interpreter is
obviously a black art, this chapter tries to provide some guidelines.
.LP
For the sake of the argument, the source language is assumed to be C, but most
hints apply equally well to other languages supported by ACK.
.sp
.LP
.I "Initial measures"
.PP
Start with a test case of trivial size; to be on the safe side, reckon with a
time dilation factor of about 500, i.e., a second grows into about 10 minutes.
(The interpreter takes 0.5 msec to do one EM instruction on a Sun 3/50.)
Fortunately many trivial test cases are much shorter than one second.
.PP
Compile the program into an \fIe.out\fP, the EM machine version of an
\fIa.out\fP, by calling \fIem22\fP (for 2-byte integers and 2-byte pointers),
\fIem24\fP (for 2 and 4) or \fIem44\fP (for 4 and 4) as seems appropriate;
if in doubt, use \fIem44\fP. These compilers can be found in the ACK
\fIbin\fP directory, and should be used instead of \fIacc\fP (or normal
.UX
\fIcc\fP). Alternatively, you can use \fIacc \-memNN\fP instead of
\fIemNN\fP.
.LP
If your C program consists of more than one file, as it usually does, there is
a small problem. The \fIacc\fP and \fIcc\fP compilers generate .o files,
whereas the \fIemNN\fP compilers generate .m files as object files.
A simple technique to avoid the problem is to call
.DS
em44 *.c
.DE
if you can. If not, the following hack on the \fIMakefile\fP generally works.
.IP \-
Make sure the \fIMakefile\fP is reasonably clean and complete: all calls to
the compiler are through \fI$(CC)\fP, \fICFLAGS\fP is used properly and all
dependencies are specified.
.IP \-
Add the following lines to the \fIMakefile\fP (possibly permanently):
.DS
\&.SUFFIXES: .o
\&.c.o:
\&	$(CC) \-c $(CFLAGS) $<
.DE
.IP \-
Set CC to \fIem44 \-.c\fP (for example). Make sure CFLAGS includes
the \-O option; this yields a speed-up of about 15%.
.IP \-
Change all .o to .m (or .k if you do not use the \-O option).
.IP \-
If necessary, change \fIa.out\fP to \fIe.out\fP.
.PP
With these changes, \fImake\fP will produce an EM object; you can use
\fIesize\fP to verify that it is indeed an EM object and obtain some
statistics. Then call the interpreter:
.DS
int <EM-object-file> [ parameters ]
.DE
where the parameters are the normal parameters of your program. This should
work exactly like the original program, though slower. It reads from the
terminal if the original does, it opens and closes files like the original and
it accepts interrupts.
.sp
.LP
.I "Interpreting the results"
.PP
Now there are several possibilities.
.PP
It does all this. Great! This means the program
does not do very uncouth things. Now
read the file \fIint.mess\fP to see if any messages were generated. If there
are none, the program did not really run (perhaps the original cc \fIa.out\fP
got called instead?). Normally there is at least a termination message like
.DS
(Message): program exits with status 0 at "awa.p", line 64, INR = 4124
.DE
This says that the program terminated through an exit(0) on line 64 of the
file \fIawa.p\fP after 4124 EM instructions.
If this is the only message it is time to move to a bigger test case.
.PP
On the other hand, the program may come to a grinding halt with an error
message.
All messages (errors and warnings) have a format in which the sequence
.DS
"<file name>", line <ln#>
.DE
occurs, which is the same sequence many compilers produce for their error
messages. Consequently, the \fIint.mess\fP file can be processed as any
compiler message output.
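.LP
To illustrate such processing, the sketch below extracts the file name and
line number from a single message line.
It assumes only the sequence shown above and nothing else about the message
format; the function name parse_location and its parameters are hypothetical.
It looks for the last occurrence of the sequence, since a message may contain
other quoted text (such as a trap name) before the location.

```c
#include <stdlib.h>
#include <string.h>

/* Finds the last  "<file name>", line <ln#>  sequence in msg, copies
 * the file name into file (capacity cap) and stores the line number
 * in *line.  Returns 1 on success and 0 if no location is found. */
int parse_location(const char *msg, char *file, size_t cap, long *line)
{
    static const char marker[] = "\", line ";
    const char *quote_end = NULL, *scan = msg, *name, *digits;
    char *end;
    size_t len;

    while ((scan = strstr(scan, marker)) != NULL) {
        quote_end = scan;       /* remember the last occurrence */
        scan++;
    }
    if (quote_end == NULL)
        return 0;

    name = quote_end;
    while (name > msg && name[-1] != '"')
        name--;                 /* walk back to the opening quote */
    if (name == msg)
        return 0;               /* no opening quote was found */

    len = (size_t)(quote_end - name);
    if (len >= cap)
        return 0;               /* file name does not fit */
    memcpy(file, name, len);
    file[len] = '\0';

    digits = quote_end + strlen(marker);
    *line = strtol(digits, &end, 10);
    return end != digits;       /* at least one digit was read */
}
```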
.PP
One such message can be
.DS
(Fatal error) a.em: trap "Addressing non existent memory" not caught at "a.c", line 2, INR = 16
.DE
produced by the abysmal program
.DS
main() {
	*(int*)200000 = 1;
}
.DE
.LP
Often the effects are more subtle, however. The program
.DS
main() {
	int *a, b = 777;

	b = *a;
}
.DE
produces the following five warnings (in far less than a second):
.DS
(Warning 47, #1): Local data pointer expected at "t.c", line 4, INR = 17
(Warning 61, cont.): Actual memory is undefined at "t.c", line 4, INR = 17
(Warning 102, #1): Returned function result too small at "<unknown>", line 0, INR = 21
(Warning 43, #1): Local integer expected at "exit.c", line 11, INR = 34
(Warning 61, cont.): Actual memory is undefined at "exit.c", line 11, INR = 34
.DE
The one about the function result looks the most frightening,
but is the most easily solved:
\fImain\fP is a function returning an int, so the start-up routine expects a
(four-byte) integer but gets an empty (zero-byte) return area.
.LP
\fINote\fP: The experts are divided about this. The traditional school holds
that \fImain\fP is an int function and its result is the return code; this
leaves them with two ways of supplying a return code: one as the parameter
of \fIexit()\fP and one as the result
of \fImain\fP. The modern school (Berkeley 4.2 etc.) claims that
return codes are supplied exclusively
by \fIexit()\fP, and they have an \fIexit(0)\fP in
the start-up routine, just after the call to \fImain()\fP; leaving \fImain()\fP
through the bottom implies successful termination.
.LP
We shall satisfy both groups by
.DS
main() {
	int *a, b = 777;

	b = *a;
	exit(0);
}
.DE
This results in
.DS
(Warning 47, #1): Local data pointer expected at "t.c", line 4, INR = 17
(Warning 61, cont.): Actual memory is undefined at "t.c", line 4, INR = 17
(Message): program exits with status 0 at "exit.c", line 11, INR = 33
.DE
which is pretty clear as it stands.
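.LP
Both remaining warnings point at line 4, where the pointer a is dereferenced
without ever having been given a value.
What the program was meant to do is not stated, so the following is only one
conceivable repair, written as a function so that it can be tried in
isolation; the function name and the variable value are made up.

```c
/* One possible repair: give the pointer a defined target before it is
 * dereferenced. */
int load_through_pointer(void)
{
    int value = 777;
    int *a = &value;    /* a now holds a defined local data pointer */
    int b;

    b = *a;             /* loads 777 from a defined location */
    return b;
}
```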
.sp
.LP
.I "Using stack dumps"
.PP
Let's, for the sake of argument
and to avoid the fierce realism of 10000-line programs, assume that the above
still puzzles you.
Since the error occurred in EM instruction number 17, we should like to see
more information around that moment. Call the interpreter again, now with the
variable AT set to 17:
.DS
int AT=17 t.em
.DE
(The interpreter has a number of internal variables that can be set by
assignments on the command line, as with \fImake\fP.)
This gives you a file called \fIint.log\fP containing the
stack dump of 150 lines presented at the end of this chapter.
.PP
Since dumping is a subfacility of logging in the interpreter, the formats of
the lines are
the same. If a line starts with an @, it will contain a file-name/line-number
indication; the next two characters are the subject and the log
level. Then comes the information, preceded by a space. The text contains
three stack dumps, one before the offending instruction, one at it, and one
after it; then the interpreter stops. All kinds of other dumps can be
obtained, but this is the default.
.PP
For each instruction we have, in order:
.IP \-
an @x9 line, giving the position in the program,
.IP \-
the messages, warnings and errors from the instruction as it is being executed,
.IP \-
dump(s), as requested.
.PP
The first two lines mean that at line 4 in file \fIt.c\fP the interpreter
performed its 16th instruction, with the Program Counter at 30 pointing at
opcode 180 in the text segment; the instruction was an LOL (LOad Local)
with the operand \-4 derived from the opcode. It copies the local at offset
\-4 to the top of the stack. The effect can be seen from the subsequent stack
dump, where the undefined word at addresses 2147483568 to ...571 (the variable
\fIa\fP) has been copied to the top of the stack at 2147483560 (copying
undefined values does not generate a warning).
Since we used the \fIem44\fP compiler, all pointers and ints in our dump are
4 bytes long.
So a variable at address X in reality extends from address X to X+3.
.br
Note that this is not the offending instruction; this stack dump represents
the situation just before the error.
.PP
The stack consists of a sequence of frames, each containing data followed by
a Return Status Block resulting from a call; the last frame ends in
top-of-stack. The first frame represents the stack when the program starts,
through a call to the start-up routine. This routine prepares the second
stack frame with the actual parameters to \fImain()\fP:
\fIargc\fP at 2147483596, \fIargv\fP at 2147483600 and \fIenviron\fP at
2147483604.
.LP
The RSB line shows that the call to \fImain()\fP was made from procedure 0
which has 0 locals, with PC at
16, an LB of 2147483608 and file name and line number still unknown.
The \fIcode\fP in the RSB tells how this RSB was made; possible values are STP
(start-up), CAL, RTT (returnable trap) and NRT (non-returnable trap).
.PP
The next frame shows the local variable(s) of \fImain()\fP; there are two of
them, the pointer \fIa\fP at 2147483568, which is undefined, and variable
\fIb\fP at 2147483564, which has the value 777. Then comes a copy of \fIa\fP,
just made by the LOL instruction, at 2147483560. The following line shows that
the Function Return Area (which does not reside at the end of the stack, but
just happens to be printed here) has size 0 and is presently undefined.
The stack dump ends
by showing that the Actuals Base is at 2147483596 (pointing at \fIargc\fP), the
Locals Base at 2147483572 (pointing just above the local \fIa\fP), the Stack
Pointer at 2147483560 (pointing at the undefined pointer), the line count is 4
and the file name is "t.c".
.LP
(Notice that there is one more stack frame than you would probably expect, the
one above the start-up routine.)
.LP
The Function Return Area
could have a size larger than 0 and still be undefined, for
example when an instruction that does not preserve the contents of the FRA has
just been executed; likewise the FRA could have size 0 and be defined
nevertheless, for example just after a RET 0 instruction.
.PP
All this has set the scene for the disaster which is about to strike in the
next instruction. This is indeed an LOI (LOad Indirect) of size 4, opcode 169;
it causes the message
.DS
warning: Local data pointer expected [stack.c: 242]
.DE
and its continuation
.DS
warning cont.: Actual memory is undefined
.DE
(detected in the interpreter file \fIstack.c\fP at line 242; this can be
useful for sorting out dubious semantics). We see that the effect, as shown in
the third frame of this stack dump (at instruction number 17) is somewhat
unexpected: the LOI has fetched the value 4 and stacked it. The reason is
that, unfortunately, undefinedness is not transitive in the interpreter. When
an undefined value is used in an operation (other than copying) a warning is
given, but thereafter the value is treated as if it were zero. So, after the
warning a normal null pointer remains, which is then used to pick up the value
at location 0. This is the place where the EM machine stores its current line
number, which is presently 4.
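.LP
The rule just described can be captured in a toy model.
The sketch below is not the interpreter's code: it uses one flag per word
where the interpreter keeps shadow bytes, and all names in it are
hypothetical.
It only shows that copying preserves undefinedness silently, whereas any
other use gives one warning and then proceeds with the value zero.

```c
/* Toy model of a memory word with a single "defined" flag. */
struct word {
    long value;
    int defined;
};

/* Copying keeps the undefined state and gives no warning. */
struct word copy_word(struct word w)
{
    return w;
}

/* Using a word in an operation: an undefined word counts one warning
 * and is then treated as zero; a defined word yields its value. */
long use_word(struct word w, int *warnings)
{
    if (!w.defined) {
        (*warnings)++;
        return 0;
    }
    return w.value;
}
```

In terms of the example, the LOL corresponds to copy_word and the LOI to
use_word: the latter yields zero, and it is that null pointer which is then
followed to location 0.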
|
||||||
|
.PP
|
||||||
|
The third stack dump shows the final effect: the value 4 has been unstacked
|
||||||
|
and copied to variable \fIb\fP at 2147483564 through an STL (STore Local)
|
||||||
|
instruction.
|
||||||
|
.PP
|
||||||
|
Since this form of logging dumps the stack only, the log file is relatively
|
||||||
|
small as dumps go.
|
||||||
|
Nevertheless, a useful excerpt can be obtained with the command
|
||||||
|
.DS
|
||||||
|
grep 'd1' int.log
|
||||||
|
.DE
|
||||||
|
This extracts the Return Status Block lines from the log, thus producing three
|
||||||
|
traces of calls, one for each instruction in the log:
|
||||||
|
.DS
|
||||||
|
d1 >> RSB: code = STP, PI = uninit, PC = 0, LB = 2147483644, LIN = 0, FIL = NULL
|
||||||
|
d1 >> RSB: code = CAL, PI = (0,0), PC = 16, LB = 2147483608, LIN = 0, FIL = NULL
|
||||||
|
d1 >> AB = 2147483596, LB = 2147483572, SP = 2147483560, HP = 848, LIN = 4, FIL = "t.c"
|
||||||
|
d1 >> RSB: code = STP, PI = uninit, PC = 0, LB = 2147483644, LIN = 0, FIL = NULL
|
||||||
|
d1 >> RSB: code = CAL, PI = (0,0), PC = 16, LB = 2147483608, LIN = 0, FIL = NULL
|
||||||
|
d1 >> AB = 2147483596, LB = 2147483572, SP = 2147483560, HP = 848, LIN = 4, FIL = "t.c"
|
||||||
|
d1 >> RSB: code = STP, PI = uninit, PC = 0, LB = 2147483644, LIN = 0, FIL = NULL
|
||||||
|
d1 >> RSB: code = CAL, PI = (0,0), PC = 16, LB = 2147483608, LIN = 0, FIL = NULL
|
||||||
|
d1 >> AB = 2147483596, LB = 2147483572, SP = 2147483564, HP = 848, LIN = 4, FIL = "t.c"
|
||||||
|
.DE
|
||||||
|
Theoretically, the pertinent trace is the middle one, but in practice all three
|
||||||
|
are equal. In the present case there isn't much to trace, but in real programs
|
||||||
|
the trace can be useful.
|
||||||
|
.sp
|
||||||
|
.LP
|
||||||
|
.I "Errors in libraries"
|
||||||
|
.PP
|
||||||
|
Since libraries are generally compiled with suppression of line number and
|
||||||
|
file name information, the line number and file name in the interpreter will
|
||||||
|
not be updated when it enters a library routine. Consequently, all messages
|
||||||
|
generated by interpreting library routines will seem to originate from the
|
||||||
|
line of the call. This is especially true for the routine malloc(), which,
|
||||||
|
from the nature of its business, often contains dubious code.
|
||||||
|
.PP
|
||||||
|
A typical message is:
|
||||||
|
.DS
|
||||||
|
(Warning 43, #1): Local integer expected at "buff.c", line 18, INR = 266
|
||||||
|
(Warning 64, cont.): Actual memory contains a data pointer at "buff.c", line 18, INR = 266
|
||||||
|
.DE
|
||||||
|
and indeed at line 18 of the file buff.c we find:
|
||||||
|
.DS
|
||||||
|
buff = malloc(buff_size = BFSIZE);
|
||||||
|
.DE
|
||||||
|
This problem can be avoided by using a specially compiled version of the
|
||||||
|
library that contains the correct LIN and FIL instructions, or, less
|
||||||
|
elegantly, by including the source code of the library routines in the
|
||||||
|
program; in the latter case, make sure you have them all.
|
||||||
|
.sp
|
||||||
|
.LP
|
||||||
|
.I "Unavoidable messages"
|
||||||
|
.br
|
||||||
|
Some messages produced by the logging are almost unavoidable; sometimes the
|
||||||
|
writer of a library routine is forced to take liberties with the semantics of
|
||||||
|
EM.
|
||||||
|
.LP
|
||||||
|
Examples from C include the memory allocation routines.
|
||||||
|
For efficiency reasons, one bit of a pointer in the administration is used as
|
||||||
|
a flag; setting, clearing and reading this bit requires bitwise operations on
|
||||||
|
pointers, which gives the above messages.
|
||||||
|
Realloc causes a problem in that it may have to copy the originally allocated
|
||||||
|
area to a different place; this area may contain uninitialised bytes.
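.LP
The kind of bit manipulation meant here can be sketched as follows.
The sketch does not reproduce the actual allocator: the type \fIem_ptr\fP,
the choice of the lowest bit and the three routines are assumptions made
for the illustration.
.DS
```c
/* An EM data pointer, seen as a plain number; hypothetical type. */
typedef unsigned long em_ptr;

/* Assumption: blocks are word aligned, so the lowest bit is free. */
#define BUSY_BIT 1UL

/* Each routine does integer arithmetic on what is really a pointer. */
em_ptr set_busy(em_ptr p)
{
    return p | BUSY_BIT;
}

em_ptr clear_busy(em_ptr p)
{
    return p & ~BUSY_BIT;
}

int is_busy(em_ptr p)
{
    return (int)(p & BUSY_BIT);
}
```
.DE
.LP
Each of these operations uses a data pointer as an operand of an integer
instruction, which is exactly what the shadow bytes are designed to detect.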
|
||||||
|
.bp
|
||||||
|
.DS
|
||||||
|
.ft CW
|
||||||
|
@x9 "t.c", line 4, INR = 16, PC = 30 OPCODE = 180
|
||||||
|
@L6 "t.c", line 4, INR = 16, DoLOLm(-4)
|
||||||
|
d2
|
||||||
|
d2 . . STACK_DUMP[4/4] . . INR = 16 . . STACK_DUMP . .
|
||||||
|
d2 ----------------------------------------------------------------
|
||||||
|
d2 ADDRESS BYTE ITEM VALUE SHADOW
|
||||||
|
d2 2147483643 0 (Dp)
|
||||||
|
d2 2147483642 0 (Dp)
|
||||||
|
d2 2147483641 0 (Dp)
|
||||||
|
d2 2147483640 40 [ 40] (Dp)
|
||||||
|
d2 2147483639 0 (Dp)
|
||||||
|
d2 2147483638 0 (Dp)
|
||||||
|
d2 2147483637 3 (Dp)
|
||||||
|
d2 2147483636 64 [ 832] (Dp)
|
||||||
|
d2 2147483635 0 (In)
|
||||||
|
d2 2147483634 0 (In)
|
||||||
|
d2 2147483633 0 (In)
|
||||||
|
d2 2147483632 1 [ 1] (In)
|
||||||
|
d1 >> RSB: code = STP, PI = uninit, PC = 0, LB = 2147483644, LIN = 0, FIL = NULL
|
||||||
|
d2
|
||||||
|
d2 ADDRESS BYTE ITEM VALUE SHADOW
|
||||||
|
d2 2147483607 0 (Dp)
|
||||||
|
d2 2147483606 0 (Dp)
|
||||||
|
d2 2147483605 0 (Dp)
|
||||||
|
d2 2147483604 40 [ 40] (Dp)
|
||||||
|
d2 2147483603 0 (Dp)
|
||||||
|
d2 2147483602 0 (Dp)
|
||||||
|
d2 2147483601 3 (Dp)
|
||||||
|
d2 2147483600 64 [ 832] (Dp)
|
||||||
|
d2 2147483599 0 (In)
|
||||||
|
d2 2147483598 0 (In)
|
||||||
|
d2 2147483597 0 (In)
|
||||||
|
d2 2147483596 1 [ 1] (In)
|
||||||
|
d1 >> RSB: code = CAL, PI = (0,0), PC = 16, LB = 2147483608, LIN = 0, FIL = NULL
|
||||||
|
d2
|
||||||
|
d2 ADDRESS BYTE ITEM VALUE SHADOW
|
||||||
|
d2 2147483571 undef
|
||||||
|
d2 | | | | | |
|
||||||
|
d2 2147483568 undef (1 word)
|
||||||
|
d2 2147483567 0 (In)
|
||||||
|
d2 2147483566 0 (In)
|
||||||
|
d2 2147483565 3 (In)
|
||||||
|
d2 2147483564 9 [ 777] (In)
|
||||||
|
d2 2147483563 undef
|
||||||
|
d2 | | | | | |
|
||||||
|
d2 2147483560 undef (1 word)
|
||||||
|
d2 FRA: size = 0, undefined
|
||||||
|
d1 >> AB = 2147483596, LB = 2147483572, SP = 2147483560, HP = 848, \e
|
||||||
|
LIN = 4, FIL = "t.c"
|
||||||
|
d2 ----------------------------------------------------------------
|
||||||
|
d2
|
||||||
|
@x9 "t.c", line 4, INR = 17, PC = 31 OPCODE = 169
|
||||||
|
@w1 "t.c", line 4, INR = 17, warning: Local data pointer expected [stack.c: 242]
|
||||||
|
@w1 "t.c", line 4, INR = 17, warning cont.: Actual memory is undefined
|
||||||
|
@L6 "t.c", line 4, INR = 17, DoLOIm(4)
|
||||||
|
d2
|
||||||
|
d2 . . STACK_DUMP[4/4] . . INR = 17 . . STACK_DUMP . .
|
||||||
|
d2 ----------------------------------------------------------------
|
||||||
|
d2 ADDRESS BYTE ITEM VALUE SHADOW
|
||||||
|
d2 2147483643 0 (Dp)
|
||||||
|
d2 2147483642 0 (Dp)
|
||||||
|
d2 2147483641 0 (Dp)
|
||||||
|
d2 2147483640 40 [ 40] (Dp)
|
||||||
|
d2 2147483639 0 (Dp)
|
||||||
|
d2 2147483638 0 (Dp)
|
||||||
|
d2 2147483637 3 (Dp)
|
||||||
|
d2 2147483636 64 [ 832] (Dp)
|
||||||
|
d2 2147483635 0 (In)
|
||||||
|
d2 2147483634 0 (In)
|
||||||
|
d2 2147483633 0 (In)
|
||||||
|
d2 2147483632 1 [ 1] (In)
|
||||||
|
d1 >> RSB: code = STP, PI = uninit, PC = 0, LB = 2147483644, LIN = 0, FIL = NULL
|
||||||
|
d2
|
||||||
|
d2 ADDRESS BYTE ITEM VALUE SHADOW
|
||||||
|
d2 2147483607 0 (Dp)
|
||||||
|
d2 2147483606 0 (Dp)
|
||||||
|
d2 2147483605 0 (Dp)
|
||||||
|
d2 2147483604 40 [ 40] (Dp)
|
||||||
|
d2 2147483603 0 (Dp)
|
||||||
|
d2 2147483602 0 (Dp)
|
||||||
|
d2 2147483601 3 (Dp)
|
||||||
|
d2 2147483600 64 [ 832] (Dp)
|
||||||
|
d2 2147483599 0 (In)
|
||||||
|
d2 2147483598 0 (In)
|
||||||
|
d2 2147483597 0 (In)
|
||||||
|
d2 2147483596 1 [ 1] (In)
|
||||||
|
d1 >> RSB: code = CAL, PI = (0,0), PC = 16, LB = 2147483608, LIN = 0, FIL = NULL
|
||||||
|
d2
|
||||||
|
d2 ADDRESS BYTE ITEM VALUE SHADOW
|
||||||
|
d2 2147483571 undef
|
||||||
|
d2 | | | | | |
|
||||||
|
d2 2147483568 undef (1 word)
|
||||||
|
d2 2147483567 0 (In)
|
||||||
|
d2 2147483566 0 (In)
|
||||||
|
d2 2147483565 3 (In)
|
||||||
|
d2 2147483564 9 [ 777] (In)
|
||||||
|
d2 2147483563 0 (In)
|
||||||
|
d2 2147483562 0 (In)
|
||||||
|
d2 2147483561 0 (In)
|
||||||
|
d2 2147483560 4 [ 4] (In)
|
||||||
|
d2 FRA: size = 0, undefined
|
||||||
|
d1 >> AB = 2147483596, LB = 2147483572, SP = 2147483560, HP = 848, \e
|
||||||
|
LIN = 4, FIL = "t.c"
|
||||||
|
d2 ----------------------------------------------------------------
|
||||||
|
d2
|
||||||
|
@x9 "t.c", line 4, INR = 18, PC = 32 OPCODE = 229
|
||||||
|
@S6 "t.c", line 4, INR = 18, DoSTLm(-8)
|
||||||
|
d2
|
||||||
|
d2 . . STACK_DUMP[4/4] . . INR = 18 . . STACK_DUMP . .
|
||||||
|
d2 ----------------------------------------------------------------
|
||||||
|
d2 ADDRESS BYTE ITEM VALUE SHADOW
|
||||||
|
d2 2147483643 0 (Dp)
|
||||||
|
d2 2147483642 0 (Dp)
|
||||||
|
d2 2147483641 0 (Dp)
|
||||||
|
d2 2147483640 40 [ 40] (Dp)
|
||||||
|
d2 2147483639 0 (Dp)
|
||||||
|
d2 2147483638 0 (Dp)
|
||||||
|
d2 2147483637 3 (Dp)
|
||||||
|
d2 2147483636 64 [ 832] (Dp)
|
||||||
|
d2 2147483635 0 (In)
|
||||||
|
d2 2147483634 0 (In)
|
||||||
|
d2 2147483633 0 (In)
|
||||||
|
d2 2147483632 1 [ 1] (In)
|
||||||
|
d1 >> RSB: code = STP, PI = uninit, PC = 0, LB = 2147483644, LIN = 0, FIL = NULL
|
||||||
|
d2
|
||||||
|
d2 ADDRESS BYTE ITEM VALUE SHADOW
|
||||||
|
d2 2147483607 0 (Dp)
|
||||||
|
d2 2147483606 0 (Dp)
|
||||||
|
d2 2147483605 0 (Dp)
|
||||||
|
d2 2147483604 40 [ 40] (Dp)
|
||||||
|
d2 2147483603 0 (Dp)
|
||||||
|
d2 2147483602 0 (Dp)
|
||||||
|
d2 2147483601 3 (Dp)
|
||||||
|
d2 2147483600 64 [ 832] (Dp)
|
||||||
|
d2 2147483599 0 (In)
|
||||||
|
d2 2147483598 0 (In)
|
||||||
|
d2 2147483597 0 (In)
|
||||||
|
d2 2147483596 1 [ 1] (In)
|
||||||
|
d1 >> RSB: code = CAL, PI = (0,0), PC = 16, LB = 2147483608, LIN = 0, FIL = NULL
|
||||||
|
d2
|
||||||
|
d2 ADDRESS BYTE ITEM VALUE SHADOW
|
||||||
|
d2 2147483571 undef
|
||||||
|
d2 | | | | | |
|
||||||
|
d2 2147483568 undef (1 word)
|
||||||
|
d2 2147483567 0 (In)
|
||||||
|
d2 2147483566 0 (In)
|
||||||
|
d2 2147483565 0 (In)
|
||||||
|
d2 2147483564 4 [ 4] (In)
|
||||||
|
d2 FRA: size = 0, undefined
|
||||||
|
d1 >> AB = 2147483596, LB = 2147483572, SP = 2147483564, HP = 848, \e
|
||||||
|
LIN = 4, FIL = "t.c"
|
||||||
|
d2 ----------------------------------------------------------------
|
||||||
|
d2
|
||||||
|
.DE
|
25
doc/int/bib
Normal file
|
@ -0,0 +1,25 @@
|
||||||
|
.\" Bibliography
|
||||||
|
.\"
|
||||||
|
.\" $Header$
|
||||||
|
.bp
|
||||||
|
.DS C
|
||||||
|
BIBLIOGRAPHY
|
||||||
|
.DE
|
||||||
|
.LP
|
||||||
|
[1] A.S. Tanenbaum, H. van Staveren, E.G. Keizer and J.W. Stevenson.
|
||||||
|
\fIDescription of a Machine Architecture for use with Block Structured
|
||||||
|
Languages\fP. VU Informatica Rapport IR-81, August 1983.
|
||||||
|
.LP
|
||||||
|
[2] E.G. Keizer. \fIAck description file reference manual.\fP
|
||||||
|
.LP
|
||||||
|
[3] K. Jensen and N. Wirth.
|
||||||
|
\fIPASCAL, User Manual and Report\fP. Springer Verlag.
|
||||||
|
.LP
|
||||||
|
[4] B.W. Kernighan and D.M. Ritchie.
|
||||||
|
\fIThe C Programming Language\fP. Prentice-Hall, 1978.
|
||||||
|
.LP
|
||||||
|
[5] D.M. Ritchie. \fIC Reference Manual\fP.
|
||||||
|
.LP
|
||||||
|
[6] \fIAmsterdam Compiler Kit, reference manual.\fP
|
||||||
|
.LP
|
||||||
|
[7] \fIUnix Programmer's Manual, 4.1BSD\fP. UCB, August 1983.
|
26
doc/int/cover
Normal file
|
@ -0,0 +1,26 @@
|
||||||
|
.\" Front page
|
||||||
|
.\"
|
||||||
|
.\" $Header$
|
||||||
|
.TL
|
||||||
|
The EM Interpreter
|
||||||
|
.AU
|
||||||
|
Eddo de Groot
|
||||||
|
Leo van den Berge
|
||||||
|
Dick Grune
|
||||||
|
.AI
|
||||||
|
Faculteit Wiskunde en Informatica
|
||||||
|
Vrije Universiteit, Amsterdam
|
||||||
|
.AB
|
||||||
|
This document describes the implementation
|
||||||
|
and usage of a new interpreter for the EM machine language.
|
||||||
|
This interpreter implements the full EM machine
|
||||||
|
and can be helpful to people writing new front-ends.
|
||||||
|
Moreover, it can be used as a thorough testing and debugging
|
||||||
|
tool by anyone familiar with the EM language.
|
||||||
|
.PP
|
||||||
|
A list of all warnings is given in appendix A; appendix B is a simple
|
||||||
|
tutorial.
|
||||||
|
.AE
|
||||||
|
.PP
|
||||||
|
.pn 1
|
||||||
|
.bp
|
24
doc/int/draw.mac
Normal file
|
@ -0,0 +1,24 @@
|
||||||
|
.\" Macros for simple constant width drawings (uses font CW)
|
||||||
|
.\"
|
||||||
|
.\" $Header$
|
||||||
|
.de Dr \" Drawing $1 (size)
|
||||||
|
.sp 1
|
||||||
|
.ne \\$1
|
||||||
|
.na
|
||||||
|
.nf
|
||||||
|
.ft CW \" constant width font
|
||||||
|
.lg 0 \" no ligatures
|
||||||
|
..
|
||||||
|
.de Df \" Drawing Footer
|
||||||
|
.sp 1
|
||||||
|
.ft R
|
||||||
|
.ce 1000
|
||||||
|
.lg 1
|
||||||
|
..
|
||||||
|
.de De \" Drawing End $1 (lines)
|
||||||
|
.Df \" if it has not happened yet
|
||||||
|
.ce
|
||||||
|
.ad
|
||||||
|
.fi
|
||||||
|
.sp \\$1
|
||||||
|
..
|
595
doc/int/txt2
Normal file
|
@ -0,0 +1,595 @@
|
||||||
|
.\" Implementation details
|
||||||
|
.\"
|
||||||
|
.\" $Header$
|
||||||
|
.bp
|
||||||
|
.NH
|
||||||
|
IMPLEMENTATION DETAILS.
|
||||||
|
.PP
|
||||||
|
The pertinent issues are addressed below, in arbitrary order.
|
||||||
|
.NH 2
|
||||||
|
Stack manipulation and start-up
|
||||||
|
.PP
|
||||||
|
It is not at all easy to start the EM machine with the stack in a reasonable
|
||||||
|
and consistent state. One reason is the anomalous value of the ML register
|
||||||
|
and another is the absence of a proper RSB. It may be argued that the initial
|
||||||
|
stack does not have to be in a consistent state, since the first instruction
|
||||||
|
proper is only executed after \fIargc\fP, \fIargv\fP and \fIenviron\fP
|
||||||
|
have been stacked (which takes care of the empty stack) and the initial
|
||||||
|
procedure has been called (which creates a RSB). We would, however, like to
|
||||||
|
perform the stacking of these values and the calling of the initial procedure
|
||||||
|
using the normal stack and call routines, which again require the stack to be
|
||||||
|
in an acceptable state.
|
||||||
|
.NH 3
|
||||||
|
The anomalous value of the ML register
|
||||||
|
.PP
|
||||||
|
All registers in the EM machine point to word boundaries, and all of them,
|
||||||
|
except ML, address the even-numbered byte at the boundary.
|
||||||
|
The exception has a good reason: the even numbered byte at the ML boundary does
|
||||||
|
not exist.
|
||||||
|
This problem is not particular to EM but is inherent in the number system: the
|
||||||
|
number of N-digit numbers can itself not be expressed in an N-digit number, and
|
||||||
|
the number of addresses in an N-bit machine will itself not fit in an N-bit
|
||||||
|
address. The problem is solved in the interpreter by having ML point to the
|
||||||
|
highest word boundary that has bytes on either side; this makes ML+1
|
||||||
|
expressible.
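.LP
As an illustration, the value of ML might be computed as below.
This is a sketch under our reading of the rule above, in which ML addresses
the byte just below the chosen boundary; the routine \fIml_value()\fP and
its parameters are invented names, and the highest byte address is passed
in rather than taken from the interpreter.
.DS
```c
/* Hypothetical helper: ML for a memory whose highest byte address is
   max_addr and whose words are wsize bytes long. */
unsigned long ml_value(unsigned long max_addr, unsigned long wsize)
{
    /* Highest word boundary that still has a byte above it. */
    unsigned long boundary = (max_addr / wsize) * wsize;

    /* ML addresses the byte just below that boundary. */
    return boundary - 1;
}
```
.DE
.LP
For a highest byte address of 2147483647 and 4-byte words this gives
2147483643, so that ML+1 is 2147483644, a number that can still be
expressed in an address of the same width.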
|
||||||
|
.NH 3
|
||||||
|
The absence of an initial Return Status Block
|
||||||
|
.PP
|
||||||
|
When the stack is empty, there is no legal value for AB, since there are no
|
||||||
|
actuals; LB can be set naturally to ML+1. This is all right when the
|
||||||
|
interpreter starts with a call of the initial routine which stores the value
|
||||||
|
of LB in the first RSB, but causes problems when finally this call returns. We
|
||||||
|
want this call to return completely before stopping the interpreter, to check
|
||||||
|
the integrity of the last RSB; restoring information from it will, however,
|
||||||
|
cause illegal values to be stored in LB and AB (ML+1 and ML+1+rsbsize, resp.).
|
||||||
|
On top of this, the initial (illegal) Procedure Identifier of the running
|
||||||
|
procedure will be restored; then, restoring the likewise illegal PC will
|
||||||
|
cause a check to see if it still is inside the running procedure. After a few
|
||||||
|
attempts at writing special cases, we have decided that it is possible, but not
|
||||||
|
worth the effort; the final (= initial) RSB will not be unstacked.
|
||||||
|
.NH 2
|
||||||
|
Floating point numbers.
|
||||||
|
.PP
|
||||||
|
The interpreter is capable of working with 4- and 8-byte floating point (FP)
|
||||||
|
numbers.
|
||||||
|
In C-terms, this corresponds to objects of type float and double respectively.
|
||||||
|
Both types fit in a C-double so the obvious way to manipulate these entities
|
||||||
|
internally is in doubles.
|
||||||
|
When an 8-byte FP is pushed, all bytes of the C-double are pushed.
|
||||||
|
Pushing a 4-byte FP causes the 4 bytes representing the smallest fraction
|
||||||
|
to be discarded.
|
||||||
|
.PP
|
||||||
|
In EM, floats can be obtained in two different ways: via conversion
|
||||||
|
of another type, or via initialization in the loadfile.
|
||||||
|
Initialized floats are represented in the loadfile by an ASCII string in
|
||||||
|
the syntax of a Pascal real (signed \fIUnsignedReal\fP).
|
||||||
|
I.e. a float looks like:
|
||||||
|
.DS
|
||||||
|
[ \fISign\fP ] \fIDigit\fP+ [ . \fIDigit\fP+ ] [ \fIExp\fP [ \fISign\fP ] \fIDigit\fP+ ] (G1)
|
||||||
|
.DE
|
||||||
|
followed by a null byte.
|
||||||
|
Here \fISign\fP = {+, \-}; \fIDigit\fP = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
|
||||||
|
\fIExp\fP = {e, E}; [ \fIAnything\fP ] means that \fIAnything\fP is optional;
|
||||||
|
and a + means one or more times.
|
||||||
|
To accommodate some loose code generators, the actual grammar accepted is:
|
||||||
|
.DS
|
||||||
|
[ \fISign\fP ] \fIDigit\fP\(** [ . \fIDigit\fP\(** ] [ \fIExp\fP [ \fISign\fP ] \fIDigit\fP+ ] (G2)
|
||||||
|
.DE
|
||||||
|
followed by a null byte. Here \(** means zero or more times. A floating
|
||||||
|
denotation which is in G2 but not in G1 draws a warning, one that is not even
|
||||||
|
in G2 causes a fatal error.
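.LP
The difference between G1 and G2 can be made concrete with a small
recognizer.
The sketch below follows the two grammars literally; it is not the routine
used by the interpreter, and the names \fIclassify_float()\fP, FL_G1,
FL_G2_ONLY and FL_BAD are invented for the occasion.
.DS
```c
#include <ctype.h>

/* Results of classify_float(); hypothetical names. */
enum fl_class { FL_G1, FL_G2_ONLY, FL_BAD };

/* Skip a run of digits and return how many were seen. */
static int digits(const char **s)
{
    int n = 0;

    while (isdigit((unsigned char)**s)) {
        (*s)++;
        n++;
    }
    return n;
}

enum fl_class classify_float(const char *s)
{
    int int_digits, frac_digits = 0, has_dot = 0;

    if (*s == '+' || *s == '-') s++;      /* [ Sign ] */
    int_digits = digits(&s);              /* Digit* */
    if (*s == '.') {                      /* [ . Digit* ] */
        has_dot = 1;
        s++;
        frac_digits = digits(&s);
    }
    if (*s == 'e' || *s == 'E') {         /* [ Exp [ Sign ] Digit+ ] */
        s++;
        if (*s == '+' || *s == '-') s++;
        if (digits(&s) == 0) return FL_BAD;
    }
    if (*s != 0) return FL_BAD;           /* must end at the null byte */
    if (int_digits == 0 || (has_dot && frac_digits == 0))
        return FL_G2_ONLY;                /* accepted, but with a warning */
    return FL_G1;
}
```
.DE
.LP
A caller would give the warning for FL_G2_ONLY and the fatal error for
FL_BAD.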
|
||||||
|
.LP
|
||||||
|
A string representing a float that does not fit in a double causes a
|
||||||
|
warning to be given.
|
||||||
|
In that case, the returned value will be the double 0.0.
|
||||||
|
.LP
|
||||||
|
Floating point arithmetic is handled by some simple routines, checking for
|
||||||
|
over/underflow, and returning appropriate values in case of an ignored error.
|
||||||
|
.PP
|
||||||
|
Since not all C compilers provide floating point operations, there is a
|
||||||
|
compile time flag NOFLOAT, which, if defined, suppresses the use of all
|
||||||
|
fp operations in the interpreter. The resulting interpreter will still load
|
||||||
|
EM files with floats in the global data area (and ignore them) but will give a
|
||||||
|
fatal error upon attempt to execute a floating point instruction; consequently
|
||||||
|
code involving floating point operations can be run as long as the actual
|
||||||
|
instructions are avoided.
|
||||||
|
.NH 2
|
||||||
|
Pointers.
|
||||||
|
.PP
|
||||||
|
The following sub-sections both deal with problems concerning pointers.
|
||||||
|
First, something is said about pointer arithmetic in general.
|
||||||
|
Then, the null-pointer problem is dealt with.
|
||||||
|
.NH 3
|
||||||
|
Pointer arithmetic.
|
||||||
|
.PP
|
||||||
|
Strictly speaking, pointer arithmetic is defined only within a \fBfragment\fP.
|
||||||
|
From the explanation of the term fragment however (as given in [1], page 3),
|
||||||
|
it is not quite clear what a fragment should look like
|
||||||
|
from an interpreter's point of view.
|
||||||
|
For this reason we introduced the term \fBsegment\fP,
|
||||||
|
bordering the various areas within which pointer arithmetic is allowed.
|
||||||
|
Every stack-frame is a segment, and so are the global data area (GDA) and
|
||||||
|
the heap area.
|
||||||
|
Thus, the number of segments varies over time, and at some point in time is
|
||||||
|
given by the number of currently active stack-frames
|
||||||
|
(#CAL + #CAI \- #RET \- #RTT) plus 2 (gda, heap).
|
||||||
|
Pointers in the area between heap and stack (which is inaccessible by
|
||||||
|
definition), are assumed to be in the heap segment.
|
||||||
|
.PP
|
||||||
|
The interpreter, while building a new stack-frame (i.e. segment), stores the
|
||||||
|
value of the last ActualBase in a pointer-array (\fIAB_list[\ ]\fP).
|
||||||
|
When a pointer (say \fIP\fP) is available for arithmetic, the number
|
||||||
|
of the segment where it points (say \fIS\d\s-2P\s+2\u\fP),
|
||||||
|
is determined first.
|
||||||
|
Next, the arithmetic is performed, followed by a check on the number
|
||||||
|
of the segment where the resulting pointer \fIR\fP points
|
||||||
|
(say \fIS\d\s-2R\s+2\u\fP).
|
||||||
|
Now, if \fIS\d\s-2P\s+2\u != S\d\s-2R\s+2\u\fP, a warning is given:
|
||||||
|
\fBPointer arithmetic yields pointer to bad segment\fP.
|
||||||
|
.br
|
||||||
|
It may also be clear now, why the illegal area between heap and stack
|
||||||
|
was joined with the heap segment.
|
||||||
|
When calculating a new heap pointer (\fIHP\fP), one will obtain intermediate
|
||||||
|
results being pointers in this area just before it is made legal.
|
||||||
|
We do not want error messages all of the time, just because someone is
|
||||||
|
allocating space in the heap.
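.LP
The segment check can be sketched as follows.
The sketch is a simplification under stated assumptions: instead of the
real \fIAB_list[\ ]\fP it keeps a made-up array \fIframe_top[\ ]\fP with
the highest address of each frame, and all addresses and names
(\fIsegment_of()\fP, \fIchecked_add()\fP, the SEG_ constants) are
illustrative only.
.DS
```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical segment numbers. */
#define SEG_GDA   0
#define SEG_HEAP  1     /* also covers the gap between heap and stack */
#define SEG_STACK 2     /* frame k is segment SEG_STACK + k */

/* Illustrative memory layout; not taken from the interpreter. */
static unsigned long gda_end = 832;            /* first heap address */
static unsigned long sp = 2147483560UL;        /* lowest stack address */
/* Highest address of each active frame, oldest frame first. */
static unsigned long frame_top[] = {
    2147483643UL, 2147483607UL, 2147483571UL
};
static size_t n_frames = 3;
static int warnings_given = 0;

int segment_of(unsigned long p)
{
    size_t k;

    if (p < gda_end) return SEG_GDA;
    if (p < sp) return SEG_HEAP;               /* heap, or the gap above it */
    for (k = n_frames; k-- > 0; )              /* newest frame first */
        if (p <= frame_top[k]) return SEG_STACK + (int)k;
    return SEG_STACK;                          /* above all frames: oldest */
}

/* Add off to p; warn if the result lies in another segment. */
unsigned long checked_add(unsigned long p, long off)
{
    unsigned long r = p + (unsigned long)off;

    if (segment_of(p) != segment_of(r)) {
        puts("warning: Pointer arithmetic yields pointer to bad segment");
        warnings_given++;
    }
    return r;
}

int warning_count(void)
{
    return warnings_given;
}
```
.DE
.LP
Because the gap is counted as part of the heap segment, a heap pointer
that is moved into the gap while a new \fIHP\fP is being computed draws no
warning.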
|
||||||
|
.LP
|
||||||
|
A similar treatment is given to the pointers in the SBS instruction; they have
|
||||||
|
to point into the same fragment for subtraction to be meaningful.
|
||||||
|
.LP
|
||||||
|
The length of the \fIAB_list[\ ]\fP is initially 100,
|
||||||
|
and it is reallocated in the same way the dynamically growing partitions
|
||||||
|
are (see 1.1).
|
||||||
|
.NH 3
|
||||||
|
Null pointer.
|
||||||
|
.PP
|
||||||
|
Because the EM language lacks an instruction for loading a null pointer,
|
||||||
|
most programs solve this problem by loading a pointer-sized integer of
|
||||||
|
value zero, and using this as a null pointer (this is also proposed in [1]).
|
||||||
|
\fBInt\fP allows this, and will not complain.
|
||||||
|
A warning is given however, when an attempt is made to add something to a
|
||||||
|
null pointer (i.e. the pointer-sized integer zero).
|
||||||
|
.LP
|
||||||
|
Since many programming languages use a pointer to location 0 as an illegal
|
||||||
|
value, it is desirable to detect its use.
|
||||||
|
The big problem is though that 0 is a perfectly legal EM address;
|
||||||
|
address 0 holds the current line number in the source file. It may be freely
|
||||||
|
read but is written only by means of the LIN instruction. This allows us to
|
||||||
|
declare the area consisting of the line number and the file name pointer to be
|
||||||
|
read-only memory. Thus a store will be caught (and result in a warning) but a
|
||||||
|
read will succeed (and yield the EM information stored there).
|
||||||
|
.NH 2
|
||||||
|
Function Return Area (FRA).
|
||||||
|
.PP
|
||||||
|
The Function Return Area (\fIFRA[\ ]\fP) has a default size of 8 bytes;
|
||||||
|
this default can
|
||||||
|
be overridden through the use of the \fB\-r\fP-option, but cannot be
|
||||||
|
made smaller than the size of two pointers, in accordance with the
|
||||||
|
remark on page 5 of [1].
|
||||||
|
The global variable \fIFRASize\fP keeps track of how many bytes were
|
||||||
|
stored in the FRA, the last time a RET instruction was executed.
|
||||||
|
The LFR instruction only works when its argument is equal to this size.
|
||||||
|
If not, the FRA contents are loaded anyhow, but one of the following warnings
|
||||||
|
is given:
|
||||||
|
\fBReturned function result too large\fP (\fIFRASize\fP > LFR size) or
|
||||||
|
\fBReturned function result too small\fP (\fIFRASize\fP < LFR size).
|
||||||
|
.LP
|
||||||
|
Note that a C-program, falling through the end of its code without doing
|
||||||
|
a proper \fIreturn\fP or \fIexit()\fP, will generate this warning.
|
||||||
|
.PP
|
||||||
|
The only instructions that do not disturb the contents of the FRA are
|
||||||
|
GTO, BRA, ASP and RET.
|
||||||
|
This is expressed in the program by setting \fIFRA_def\fP to "undefined"
|
||||||
|
in any instruction except these four.
|
||||||
|
We realize this is a useless action most of the time, but a more
|
||||||
|
efficient solution does not seem to be at hand.
|
||||||
|
If a result is loaded when \fIFRA_def\fP is "undefined", the warning:
|
||||||
|
\fBReturned function result may be garbled\fP is generated.
|
||||||
|
.LP
|
||||||
|
Note that the FRA needs a shadow-FRA in order to store the shadow
|
||||||
|
information when performing a LFR instruction.
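.LP
The bookkeeping described in this section can be summarized in a sketch.
\fIFRASize\fP and \fIFRA_def\fP are the variables named above; the
routines \fIdo_ret()\fP, \fIdo_other()\fP and \fIdo_lfr()\fP, and the
order in which the warnings are given, are assumptions made for the
illustration.
.DS
```c
#include <stdio.h>

static int FRASize = 0;       /* bytes stored by the last RET */
static int FRA_def = 0;       /* 1 = defined, 0 = "undefined" */
static int warnings_given = 0;

static void warning(const char *msg)
{
    puts(msg);
    warnings_given++;
}

/* RET n: n bytes of result are stored; the FRA is now defined. */
void do_ret(int n)
{
    FRASize = n;
    FRA_def = 1;
}

/* Any instruction other than GTO, BRA, ASP and RET. */
void do_other(void)
{
    FRA_def = 0;
}

/* LFR n: the contents are loaded anyhow, but problems draw a warning. */
void do_lfr(int n)
{
    if (!FRA_def)
        warning("Returned function result may be garbled");
    if (FRASize > n)
        warning("Returned function result too large");
    else if (FRASize < n)
        warning("Returned function result too small");
    /* Here n bytes of the FRA and of its shadow would be stacked. */
    FRA_def = 0;              /* LFR is not one of the four exceptions */
}

int warning_count(void)
{
    return warnings_given;
}
```
.DE
.LP
Note that a RET 0 followed immediately by an LFR 0 passes without comment:
the FRA then has size 0 and is defined nevertheless.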
|
||||||
|
.NH 2
|
||||||
|
Environment interaction.
|
||||||
|
.PP
|
||||||
|
The EM machine represented by \fBint\fP can communicate with
|
||||||
|
the environment in three different ways.
|
||||||
|
A first possibility is by means of (UNIX) interrupts;
|
||||||
|
the second by executing (relatively) high level system calls (called
|
||||||
|
monitor calls).
|
||||||
|
A third means of interaction, especially interesting for the debugging
|
||||||
|
programmer, is via internal variables set on the command line.
|
||||||
|
The former two techniques, and the way they are implemented will be described
|
||||||
|
in this section.
|
||||||
|
The latter has been allotted a separate section (3).
|
||||||
|
.NH 3
|
||||||
|
Traps and interrupts.
|
||||||
|
.PP
|
||||||
|
Simple user programs will generally not mess around with UNIX-signals.
|
||||||
|
In interpreting these programs, the default actions will be taken
|
||||||
|
when a signal is received by the program: it gives a message and
|
||||||
|
stops running.
|
||||||
|
.LP
|
||||||
|
There are programs however, which try to handle certain signals
|
||||||
|
themselves.
|
||||||
|
In C, this is achieved by the system call \fIsignal(\ sig_no,\ catch\ )\fP,
|
||||||
|
which calls the handling routine \fIcatch()\fP, as soon as signal
|
||||||
|
\fBsig_no\fP occurs.
|
||||||
|
EM does not provide this call; instead, the \fIsigtrp()\fP monitor call
|
||||||
|
is available for mapping UNIX signals onto EM traps.
|
||||||
|
This implies that a \fIsignal()\fP call in a C-program
|
||||||
|
must be translated by the EM library routine to a \fIsigtrp()\fP call in EM.
|
||||||
|
.PP
|
||||||
|
The interpreter keeps an administration of the mapping of UNIX-signals
|
||||||
|
onto EM traps in the array \fIsig_map[NSIG]\fP.
|
||||||
|
Initially, the signals all have their default values.
|
||||||
|
Now assume a \fIsigtrp()\fP occurs, asking to map signal \fBsig_no\fP onto
|
||||||
|
trap \fBtrap_no\fP.
|
||||||
|
This results in:
|
||||||
|
.IP 1.
|
||||||
|
setting the relevant array element
|
||||||
|
\fIsig_map[sig_no]\fP to \fBtrap_no\fP (after saving the old value),
|
||||||
|
.IP 2.
|
||||||
|
catching the next to come \fBsig_no\fP signal with the handling routine
|
||||||
|
\fIHndlEMSig\fP (by a plain UNIX \fIsignal()\fP of course), and
|
||||||
|
.IP 3.
|
||||||
|
returning the saved map-value on the stack so the user can know the previous
|
||||||
|
trap value onto which \fBsig_no\fP was mapped.
|
||||||
|
.LP
|
||||||
|
On an incoming signal,
|
||||||
|
the handling routine for signal \fBsig_no\fP arms the
|
||||||
|
correct EM trap by calling the routine \fIarm_trap()\fP with argument
|
||||||
|
\fIsig_map[sig_no]\fP.
|
||||||
|
At the end of the EM instruction the proper call of \fItrap()\fP is done.
|
||||||
|
\fITrap()\fP in its turn examines the value of the \fIHaltOnTrap\fP variable;
|
||||||
|
if it is set, the interpreter will stop with a message. In the normal case of
|
||||||
|
controlled trap handling this bit is not on and the interpreter examines
|
||||||
|
the value of the \fITrapPI\fP variable,
|
||||||
|
which contains the procedure identifier of the EM trap handling routine.
|
||||||
|
It then initiates a call to this routine and performs a \fIlongjmp()\fP
|
||||||
|
to the main
|
||||||
|
loop to bypass all further processing of the instruction that caused the trap.
|
||||||
|
\fITrapPI\fP should be set properly by the library routines, through the
|
||||||
|
SIG instruction.
|
||||||
|
.LP
|
||||||
|
In short:
|
||||||
|
.IP 1.
|
||||||
|
A UNIX interrupt is caught by the interpreter.
|
||||||
|
.IP 2.
|
||||||
|
A handling routine is called which generates the corresponding EM trap
|
||||||
|
(according to the mapping).
|
||||||
|
.IP 3.
|
||||||
|
The trap handler calls the corresponding EM routine which emulates a UNIX
|
||||||
|
interrupt for the benefit of the interpreted program.
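.LP
The three steps above can be sketched with the standard C routines
\fIsignal()\fP and \fIraise()\fP.
Only \fIsig_map[\ ]\fP is a name from the text; the bound MAX_SIG, the
value NO_TRAP, the argument order and the routines \fImap_sigtrp()\fP and
\fIpending_trap()\fP are assumptions, and the handler below merely plays
the part of \fIHndlEMSig\fP.
.DS
```c
#include <signal.h>

#define MAX_SIG 64            /* assumed upper bound on signal numbers */
#define NO_TRAP (-1)          /* hypothetical: no trap is armed */

static volatile sig_atomic_t sig_map[MAX_SIG];
static volatile sig_atomic_t armed_trap = NO_TRAP;

/* Plays the part of HndlEMSig: it only arms the trap. */
static void hndl_em_sig(int sig_no)
{
    armed_trap = sig_map[sig_no];
}

/* Map sig_no onto trap_no and return the previous mapping. */
int map_sigtrp(int trap_no, int sig_no)
{
    int old;

    if (sig_no <= 0 || sig_no >= MAX_SIG) return NO_TRAP;
    old = sig_map[sig_no];                /* save the old value */
    sig_map[sig_no] = trap_no;
    signal(sig_no, hndl_em_sig);          /* catch the next sig_no */
    return old;
}

/* To be called at the end of each EM instruction. */
int pending_trap(void)
{
    int t = armed_trap;

    armed_trap = NO_TRAP;
    return t;
}
```
.DE
.LP
Deferring the trap to the end of the instruction keeps the handler itself
trivial; the main loop would call \fIpending_trap()\fP and act on any
value other than NO_TRAP.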
|
||||||
|
.PP
|
||||||
|
When considering UNIX signals, it is important to notice that some of them
|
||||||
|
are real signals, i.e., messages coming from outside the program, like DEL
|
||||||
|
and QUIT, but some are actually program-caused synchronous traps, like Illegal
|
||||||
|
Instruction. The latter, if they happen, are incurred by the interpreter
|
||||||
|
itself and consequently are of no concern to the interpreted program: it
|
||||||
|
cannot catch them. The present code assumes that the UNIX signals between
|
||||||
|
SIGILL (4) and SIGSYS (12) are really traps; \fIdo_sigtrp()\fP
|
||||||
|
will fail on them.
|
||||||
|
.LP
|
||||||
|
To avoid losing the last line(s) of output files, the interpreter should
|
||||||
|
always do a proper close-down, even in the presence of signals. To this end,
|
||||||
|
all non-ignored genuine signals are initially caught by the interpreter,
|
||||||
|
through the routine \fIHndlIntSig\fP, which gives a message and performs a
|
||||||
|
proper close-down.
|
||||||
|
Synchronous traps can only be caused by the interpreter itself; they are never
|
||||||
|
caught, and consequently the UNIX default action prevails. Generally they
|
||||||
|
cause a core dump.
|
||||||
|
Signals requested by the interpreted program are caught by the routine
|
||||||
|
\fIHndlEMSig\fP, as explained above.
|
||||||
|
.NH 3
|
||||||
|
Monitor calls.
|
||||||
|
.PP
|
||||||
|
For the convenience of the programmer, as many monitor calls as possible
|
||||||
|
have been implemented.
|
||||||
|
The list of monitor calls given in [1] pages 20/21, has been implemented
|
||||||
|
completely, except for \fIptrace()\fP, \fIprofil()\fP and \fImpxcall()\fP.
|
||||||
|
The semantics of \fIptrace()\fP and \fIprofil()\fP from an interpreted program
|
||||||
|
is unclear; the data structure passed to \fImpxcall()\fP is non-trivial
|
||||||
|
and the system call has low portability and applicability.
|
||||||
|
For these calls, on invocation a warning is generated, and the arguments which
|
||||||
|
were meant for the call are popped properly, so the program can continue
|
||||||
|
without the stack being messed up.
|
||||||
|
The errorcode 5 (IOERROR) is pushed onto the stack (twice), in order to
|
||||||
|
fake an unsuccessful monitor call.
|
||||||
|
No other \- more meaningful \- errorcode is available in the errno-list.
|
||||||
|
.LP
|
||||||
|
Now for the implemented monitor calls.
|
||||||
|
The returned value is zero for a successful call.
|
||||||
|
When something goes wrong, the value of the external \fIerrno\fP variable
|
||||||
|
is pushed, thus enabling the user to find out what the reason of failure was.
|
||||||
|
The implementation of the majority of the monitor calls is straightforward.
|
||||||
|
Those working with a special format buffer, (e.g. \fIioctl()\fP,
|
||||||
|
\fItime()\fP and \fIstat()\fP variants), need some extra attention.
|
||||||
|
This is due to the fact that working with varying word/pointer size
|
||||||
|
combinations may cause alignment problems.
|
||||||
|
.LP
|
||||||
|
The data structure returned by the UNIX system call results from
|
||||||
|
C code that has been translated with the regular C compiler, which,
|
||||||
|
on the VAX, happens to be a 4-4 compiler.
|
||||||
|
The data structure expected by the interpreted program conforms
|
||||||
|
to the translation by \fBack\fP of the pertinent include file.
|
||||||
|
Depending on the exact call of \fBack\fP, sizes and alignment may differ.
|
||||||
|
.LP
|
||||||
|
An example is in order. The EM MON 18 instruction in the interpreted program
|
||||||
|
leads to a UNIX \fIstat()\fP system call by the interpreter.
|
||||||
|
This call fills the given struct with stat information, the contents
|
||||||
|
and alignments of which are determined by the version of UNIX and the
|
||||||
|
used C compiler, resp.
|
||||||
|
The interpreter, like any program wishing to do system calls that fill
|
||||||
|
structs, has to be translated by a C compiler that uses the
|
||||||
|
appropriate struct definition and alignments, so that it can use, e.g.,
|
||||||
|
\fIstab.st_mtime\fP and expect to obtain the right field.
|
||||||
|
This struct cannot be copied directly to the EM memory to fulfill the
|
||||||
|
MON instruction.
|
||||||
|
First, the struct may contain extraneous, system-dependent fields,
|
||||||
|
pertaining, e.g., to symbolic links, sockets, etc.
|
||||||
|
Second, it may contain holes, due to alignment requirements.
|
||||||
|
The EM program runs on an EM machine, knows nothing about these
|
||||||
|
requirements and expects UNIX Version 7 fields, with offsets as
|
||||||
|
determined by the em22, em24 or em44 compiler, resp.
|
||||||
|
To do the conversion, the interpreter has a built-in table of the
|
||||||
|
offsets of all the fields in the structs that are filled by the MON
|
||||||
|
instruction.
|
||||||
|
The appropriate fields from the result of the UNIX \fIstat()\fP are copied
|
||||||
|
one by one to the appropriate positions in the EM memory to be filled
|
||||||
|
by MON 18.
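.LP
The field-by-field copy described above can be sketched in C as follows.
This is only an illustration under stated assumptions: the names
\fIfield_map\fP, \fIem_store()\fP and \fIcopy_fields()\fP are hypothetical,
the byte order (least significant byte first) is assumed, and the real
offset table in the interpreter may be organized differently.
.DS
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor: where one field lives in the EM struct. */
struct field_map {
    size_t em_offset;   /* offset of the field in EM memory */
    size_t em_size;     /* size of the field as the EM program sees it */
};

/* Store the low 'size' bytes of 'value' at 'dst',
 * least significant byte first (an assumption). */
void em_store(unsigned char *dst, size_t size, uint32_t value)
{
    assert(size <= 4);
    for (size_t i = 0; i < size; i++)
        dst[i] = (unsigned char)(value >> (8 * i));
}

/* Copy 'n' host values into the EM struct at 'em_base', one field at a
 * time; bytes between fields (holes) are left untouched. */
void copy_fields(unsigned char *em_base, const struct field_map *map,
                 const uint32_t *values, size_t n)
{
    for (size_t i = 0; i < n; i++)
        em_store(em_base + map[i].em_offset, map[i].em_size, values[i]);
}
```
.DE
Because every field is placed by its own table entry, neither extra fields
nor alignment holes in the host struct can leak into the EM memory.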
|
||||||
|
.PP
|
||||||
|
The \fIioctl()\fP call (MON 54) poses additional problems. Not only does it
|
||||||
|
have a second argument which is a pointer to a struct, the type of
|
||||||
|
which is dynamically determined, but its first argument is an opcode
|
||||||
|
that varies considerably between the versions of UNIX.
|
||||||
|
To solve the first problem, the interpreter examines the opcode (request) and
|
||||||
|
treats the second argument accordingly. The second problem can be solved by
|
||||||
|
translating the UNIX Version 7 \fIioctl()\fP request codes to their proper
|
||||||
|
values on the various systems. This is, however, not always useful, since
|
||||||
|
some EM run-time systems use the local request codes. There is a compile-time
|
||||||
|
flag, V7IOCTL, which, if defined, will restrict the \fIioctl()\fP call to the
|
||||||
|
version 7 request codes and emulate them on the local system; otherwise the
|
||||||
|
request codes of the local system will be used (as far as implemented).
|
||||||
|
.PP
|
||||||
|
Minor problems also showed up with the implementation of \fIexecve()\fP
|
||||||
|
and \fIfork()\fP.
|
||||||
|
\fIExecve()\fP expects three pointers on the stack.
|
||||||
|
The first points to the name of the program to be executed,
|
||||||
|
the second and third are the beginnings of the \fBargv\fP and \fBenvp\fP
|
||||||
|
pointer arrays respectively.
|
||||||
|
We cannot pass these pointers to the system call, however, because
|
||||||
|
the EM addresses to which they point do not correspond with UNIX
|
||||||
|
addresses.
|
||||||
|
Moreover (although it is not very likely to happen), what if someone constructs
|
||||||
|
a program holding the contents for one of these pointers in the stack?
|
||||||
|
The stack is implemented upside down, so passing the pointer to
|
||||||
|
\fIexecve()\fP causes trouble for this reason too.
|
||||||
|
The only solution was to copy the pointer contents completely
|
||||||
|
to fresh UNIX memory, constructing vectors which can be passed to the
|
||||||
|
system call.
|
||||||
|
Any impending memory fault while making these copies results in failure of the
|
||||||
|
system call, with \fIerrno\fP set to EFAULT.
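.LP
Copying one string out of EM memory, as described above, might look like the
sketch below.
It is not the interpreter's code: the function name \fIcopy_em_string()\fP
and the representation of EM memory as a flat byte array are assumptions
made for the example.
.DS
```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Copy the NUL-terminated string at EM address 'addr' into fresh host
 * memory. Returns 0 and sets *out on success, EFAULT if the string runs
 * off the end of EM memory, ENOMEM if no host memory is available. */
int copy_em_string(const unsigned char *em_mem, size_t em_size,
                   size_t addr, char **out)
{
    assert(em_mem != NULL && out != NULL);

    size_t end = addr;
    while (end < em_size && em_mem[end] != 0)
        end++;
    if (end >= em_size)
        return EFAULT;              /* no terminator inside EM memory */

    size_t len = end - addr;
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return ENOMEM;
    memcpy(copy, em_mem + addr, len + 1);   /* includes the terminator */
    *out = copy;
    return 0;
}
```
.DE
The \fBargv\fP and \fBenvp\fP vectors would be built by applying such a
routine to each pointer in turn.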
|
||||||
|
.PP
|
||||||
|
The implementation of the \fIfork()\fP call confronted us with problems
|
||||||
|
concerning IO-channels.
|
||||||
|
Checking messages (as well as logging) must be divided over different files.
|
||||||
|
Otherwise, the messages of different processes would be intermingled in one file.
|
||||||
|
This problem was solved by post-fixing the default message file
|
||||||
|
\fBint.mess\fP (as well as the logging file \fBint.log\fP) with an
|
||||||
|
automatically leveled number for every new forked process.
|
||||||
|
Children of the original process do their diagnostics
|
||||||
|
in files with postfix 1,2,3 etc.
|
||||||
|
Second generation processes are assigned files numbered 11, 12, 21 etc.
|
||||||
|
When 6 generations of processes exist at one moment, the seventh will
|
||||||
|
get the same message file as the sixth, since otherwise the filename
|
||||||
|
will become too long.
|
||||||
|
.PP
|
||||||
|
Some of the monitor calls receive pointers (addresses) from the program, to be
|
||||||
|
passed to the kernel; examples are the struct stat for \fIstat()\fP, the area
|
||||||
|
to be filled for \fIread()\fP, etc. If the address is wrong, the kernel does
|
||||||
|
not generate a trap, but rather the system call returns with failure, while
|
||||||
|
\fIerrno\fP is set to EFAULT. This is implemented by consistent checking of
|
||||||
|
all pointers in the MON instruction.
|
||||||
|
.NH 2
|
||||||
|
Internal arithmetic.
|
||||||
|
.PP
|
||||||
|
In arithmetic on signed integers, the smallest negative integer
|
||||||
|
(\fIminsint\fP) is considered a legal value.
|
||||||
|
This is in contradiction with the EM Manual [1], page 14, which proposes using
|
||||||
|
\fIminsint\fP for uninitialized integers.
|
||||||
|
The shadow bytes already check for uninitialized integers, however,
|
||||||
|
so we do not need this special illegal value.
|
||||||
|
Although the EM Manual provides two traps, for undefined integers and floats,
|
||||||
|
undefined objects occur so frequently (e.g. in block copying partially
|
||||||
|
initialized areas) that the interpreter just gives a warning.
|
||||||
|
.LP
|
||||||
|
Except for arithmetic on unsigned integers, all arithmetic checks for overflow.
|
||||||
|
The value that is pushed on the stack after an overflow occurs depends
|
||||||
|
on the UNIX behavior with regard to that particular calculation.
|
||||||
|
If UNIX would not accept the calculation (e.g. division by zero), a zero
|
||||||
|
is pushed as a convention.
|
||||||
|
Illegal computations which UNIX does accept in silence (e.g. one's
|
||||||
|
complement of \fIminsint\fP), simply push the UNIX-result after giving a
|
||||||
|
trap message.
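.LP
The overflow checks can be sketched as follows.
This is a minimal illustration, not the interpreter's code: the names
\fIchecked_add()\fP and \fIchecked_div()\fP are hypothetical, the C type
\fIlong\fP stands in for an EM integer, and the value returned for
\fIminsint\fP divided by \-1 is an assumption.
.DS
```c
#include <assert.h>
#include <limits.h>

/* Add two signed values; set *overflow if the mathematical result does
 * not fit. The test is done before adding, because an overflowing signed
 * addition is undefined behavior in C. */
long checked_add(long a, long b, int *overflow)
{
    assert(overflow != 0);
    *overflow = (b > 0 && a > LONG_MAX - b) || (b < 0 && a < LONG_MIN - b);
    if (*overflow)      /* wrapped value on two's complement hosts */
        return (long)((unsigned long)a + (unsigned long)b);
    return a + b;
}

/* Divide; a zero divisor is flagged and yields 0, following the
 * convention in the text. */
long checked_div(long a, long b, int *trap)
{
    assert(trap != 0);
    if (b == 0) {
        *trap = 1;
        return 0;
    }
    if (a == LONG_MIN && b == -1) {     /* quotient does not fit */
        *trap = 1;
        return a;                       /* assumed result */
    }
    *trap = 0;
    return a / b;
}
```
.DE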
|
||||||
|
.NH 2
|
||||||
|
Shadow bytes implementation.
|
||||||
|
.PP
|
||||||
|
A great deal of run-time checking is performed by the interpreter (except if
|
||||||
|
used in the fast version).
|
||||||
|
This section gives all details about the shadow bytes.
|
||||||
|
In order to keep track of information about the contents of D-space (stack
|
||||||
|
and global data area), there is one shadow-byte for each byte in these spaces.
|
||||||
|
Each bit in a shadow-byte represents some piece
|
||||||
|
of information about the contents of its corresponding 'sun-byte'.
|
||||||
|
All bits off indicates an undefined sun-byte.
|
||||||
|
One or more bits on always guarantees a well-defined sun-byte.
|
||||||
|
The bits have the following meaning:
|
||||||
|
.IP "\(bu bit 0:" 8
|
||||||
|
indicates that the sun-byte is (a part of) an integer.
|
||||||
|
.IP "\(bu bit 1:" 8
|
||||||
|
the sun-byte is a part of a floating point number.
|
||||||
|
.IP "\(bu bit 2:" 8
|
||||||
|
the sun-byte is a part of a pointer in dataspace.
|
||||||
|
.IP "\(bu bit 3:" 8
|
||||||
|
the sun-byte is a part of a pointer in the instruction space.
|
||||||
|
According to [1] (paragraph 6.4), there are two types of pointers which
|
||||||
|
must be distinguishable.
|
||||||
|
Conversion between these two types is impossible.
|
||||||
|
The shadow-bytes make the distinction here.
|
||||||
|
.IP "\(bu bit 4:" 8
|
||||||
|
protection bit.
|
||||||
|
Indicates that the sun-byte is part of a protected piece of memory.
|
||||||
|
There is a protected area in the stack, the Return Status Block.
|
||||||
|
The EM machine language has no possibility to declare protected
|
||||||
|
memory, as is possible in EM assembly (the ROM instruction). The protection
|
||||||
|
bit is, however, set for the line number and filename pointer area near
|
||||||
|
location 0, to aid in catching references to location 0.
|
||||||
|
.IP "\(bu bit 5/6/7:" 8
|
||||||
|
free for later use.
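.LP
The bit assignment listed above can be written down in C as in the sketch
below.
The constant and function names are hypothetical; the routines actually
declared in \fIshadow.h\fP are not reproduced here.
.DS
```c
#include <assert.h>

/* Hypothetical names for the shadow-byte bits listed above. */
enum {
    SH_INT   = 1 << 0,  /* part of an integer */
    SH_FLOAT = 1 << 1,  /* part of a floating point number */
    SH_DATAP = 1 << 2,  /* part of a pointer into data space */
    SH_INSP  = 1 << 3,  /* part of a pointer into instruction space */
    SH_PROT  = 1 << 4   /* part of a protected area */
};

/* A sun-byte is undefined exactly when all shadow bits are off. */
int is_defined(unsigned char shadow)
{
    return shadow != 0;
}

/* Non-zero if every byte of the object of 'size' bytes at 'addr'
 * carries the bit 'flag' in its shadow byte. */
int all_marked(const unsigned char *shadow, unsigned long addr,
               unsigned long size, int flag)
{
    assert(shadow != 0);
    for (unsigned long i = 0; i < size; i++)
        if ((shadow[addr + i] & flag) == 0)
            return 0;
    return 1;
}
```
.DE
A load of a data pointer, for instance, would test all of its bytes for the
data-pointer bit and give a warning if the test fails.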
|
||||||
|
.LP
|
||||||
|
The shadow bytes are managed by the routines declared in \fIshadow.h\fP.
|
||||||
|
The warnings originating from checking these shadow-bytes during
|
||||||
|
run-time are various.
|
||||||
|
A list of them is given in appendix A, together with suggestions
|
||||||
|
(primarily for the C-programmer) where to look for the trouble maker(s).
|
||||||
|
.LP
|
||||||
|
Note that once a warning is generated, it may be repeated
|
||||||
|
thousands of times.
|
||||||
|
Since repetitive warnings carry little information, but consume much
|
||||||
|
file space, the interpreter keeps track of the number of times a given warning
|
||||||
|
has been produced from a given line in a given file.
|
||||||
|
The warning message will
|
||||||
|
be printed only if the corresponding counter is a power of four (starting at
|
||||||
|
1). In this way, a logarithmic back-off in warning generation is established.
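.LP
The back-off rule can be sketched in a few lines of C.
The function names are hypothetical, and the sketch assumes that one counter
is kept per (warning, file, line) triple, as described above.
.DS
```c
#include <assert.h>

/* Non-zero if n is 1, 4, 16, 64, ... */
int is_power_of_four(unsigned long n)
{
    if (n == 0)
        return 0;
    while (n % 4 == 0)
        n /= 4;
    return n == 1;
}

/* Bump the counter of one (warning, file, line) triple and report
 * whether this occurrence should be printed. */
int should_print(unsigned long *count)
{
    assert(count != 0);
    *count += 1;
    return is_power_of_four(*count);
}
```
.DE
With this rule a warning that occurs 1000 times on the same line is printed
5 times, at occurrences 1, 4, 16, 64 and 256.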
|
||||||
|
.LP
|
||||||
|
It might be argued that the counter should be kept for each (warning, PC
|
||||||
|
value) pair rather than for each (warning, file position) pair. Suppose,
|
||||||
|
however, that two instructions in a given line would cause the same message
|
||||||
|
regularly; this would produce two intertwined streams of identical messages,
|
||||||
|
with their counters jumping up and down. This does not seem desirable.
|
||||||
|
.NH 2
|
||||||
|
Return Status Block (RSB)
|
||||||
|
.PP
|
||||||
|
According to the description in [1], at least the return address and the
|
||||||
|
base address of the previous RSB have to be pushed when performing a call.
|
||||||
|
Besides these two pointers, other information can be stored in the RSB as well.
|
||||||
|
The interpreter pushes the following items:
|
||||||
|
.IP \-
|
||||||
|
a pointer to the current filename,
|
||||||
|
.IP \-
|
||||||
|
the current line number (always four bytes),
|
||||||
|
.IP \-
|
||||||
|
the Local Base,
|
||||||
|
.IP \-
|
||||||
|
the return address (Program Counter),
|
||||||
|
.IP \-
|
||||||
|
the current procedure identifier,
|
||||||
|
.IP \-
|
||||||
|
the RSB code, which distinguishes between initial start-up, normal call,
|
||||||
|
returnable trap and non-returnable trap (a word-size integer).
|
||||||
|
.LP
|
||||||
|
Consequently, the size of the RSB varies, depending on
|
||||||
|
word size and pointer size; its value is available as \fIrsbsize\fP.
|
||||||
|
When the RSB is removed from the stack (by a RET or RTT) the RSB code is under
|
||||||
|
the Stack Pointer for immediate checking. It is not clear what should be done
|
||||||
|
if RSB code and return instruction do not match; at present we give a message
|
||||||
|
and continue, for what it is worth.
|
||||||
|
.PP
|
||||||
|
The reason for pushing filename and line number is that some front-ends tend
|
||||||
|
to forget the LIN and FIL instructions after returning from a function.
|
||||||
|
This may result in error messages in wrong source files and/or line numbers.
|
||||||
|
.PP
|
||||||
|
The procedure identifier is kept and restored to check that the PC will not
|
||||||
|
move out of the running procedure. The PI is an index in the proctab, which
|
||||||
|
tells the limits in the text segment of the running procedure.
|
||||||
|
.PP
|
||||||
|
If the Return Status Block is generated as a result of a trap, more is
|
||||||
|
stacked. Before stacking the normal RSB, the trap function pushes the
|
||||||
|
following items:
|
||||||
|
.IP \-
|
||||||
|
the contents of the entire Function Return Area,
|
||||||
|
.IP \-
|
||||||
|
the number of bytes significant in the above (a word-size integer),
|
||||||
|
.IP \-
|
||||||
|
a word-size flag indicating if the contents of the FRA are valid,
|
||||||
|
.IP \-
|
||||||
|
the trap number (a word-size integer).
|
||||||
|
.LP
|
||||||
|
The latter is followed directly by the RSB, and consequently acts as the only
|
||||||
|
parameter to the trap handler.
|
||||||
|
.NH 2
|
||||||
|
Operand access.
|
||||||
|
.PP
|
||||||
|
The EM Manual mentions two ways to access the operands of an instruction. It
|
||||||
|
should be noted that the operand in EM is often not the direct operand of the
|
||||||
|
operation; the operand of the ADI instruction, e.g., is the width of the
|
||||||
|
integers to be added, not one of the integers themselves. The various operand
|
||||||
|
types are described in [1]. Each opcode in the text segment identifies an
|
||||||
|
instruction with a particular operand type; these relations are described in
|
||||||
|
computer-readable format in a file in the EM tree, \fIip_spec.t\fP.
|
||||||
|
.PP
|
||||||
|
The interpreter uses a variant of the second method. Several other approaches
|
||||||
|
can be designed, with increasing efficiency and equally increasing complexity.
|
||||||
|
They are briefly treated below.
|
||||||
|
.NH 3
|
||||||
|
The Dispatch Table, Method 1.
|
||||||
|
.PP
|
||||||
|
When the interpreter starts, it reads the ip_spec.t file and constructs from it
|
||||||
|
a dispatch table. This table (of which there are actually three,
|
||||||
|
for primary, secondary
|
||||||
|
and tertiary opcodes) has 256 entries, each describing an instruction with
|
||||||
|
indications on how to decode the operand. For each instruction executed, the
|
||||||
|
interpreter finds the entry in the dispatch table, finds information there on
|
||||||
|
how to access the operand, constructs the operand and calls the appropriate
|
||||||
|
routine with the operand as calculated. There is one routine for each
|
||||||
|
instruction, which is called with the ready-made operand. Method 1 is easy to
|
||||||
|
program but requires constant interpretation of the dispatch table.
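.LP
Method 1 can be illustrated by the sketch below.
It is a toy model under stated assumptions: the operand formats, the handler
routines and the single accumulator are invented for the example (the real
formats come from ip_spec.t and the real routines work on the EM stack), and
two-byte operands are assumed to be stored least significant byte first.
.DS
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operand formats. */
enum operand_kind { OP_NONE, OP_BYTE, OP_TWO_BYTES };

typedef void (*handler_fn)(long operand, long *acc);

struct dispatch_entry {
    enum operand_kind kind;   /* how to decode the operand */
    handler_fn handler;       /* routine called with the ready-made operand */
};

void do_load(long operand, long *acc) { *acc = operand; }
void do_add(long operand, long *acc)  { *acc += operand; }

/* Execute the instruction at text[pc]; return the new pc. */
size_t step(const struct dispatch_entry table[256],
            const unsigned char *text, size_t pc, long *acc)
{
    const struct dispatch_entry *e = &table[text[pc++]];
    long operand = 0;

    assert(e->handler != NULL);
    switch (e->kind) {        /* interpret the table entry */
    case OP_NONE:
        break;
    case OP_BYTE:
        operand = text[pc];
        pc += 1;
        break;
    case OP_TWO_BYTES:
        operand = text[pc] | (text[pc + 1] << 8);
        pc += 2;
        break;
    }
    e->handler(operand, acc);
    return pc;
}
```
.DE
The switch on \fIkind\fP is the "constant interpretation of the dispatch
table" that is paid for on every instruction executed.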
|
||||||
|
.NH 3
|
||||||
|
Intelligent Routines, Method 2.
|
||||||
|
.PP
|
||||||
|
For each opcode there is a separate routine, and since an opcode uniquely
|
||||||
|
defines the instruction and the operand format, the routine knows how to get
|
||||||
|
the operand; this knowledge is built into the routine. Preferably the heading
|
||||||
|
of the routine is generated automatically from the ip_spec.t file. Operand
|
||||||
|
decoding is immediate, and no dispatch table is needed. Generation of the
|
||||||
|
469 required routines is, however, far from simple. Either a generated array
|
||||||
|
of routine names or a generated switch statement is used to map the opcode onto
|
||||||
|
the correct routine. The switch approach has the advantage that parameters can
|
||||||
|
be passed to the routines.
|
||||||
|
.LP
|
||||||
|
The interpreter uses a variant of the switch statement scheme. Numerical
|
||||||
|
information that can be deduced from the opcode is passed as parameters to the
|
||||||
|
routine; this includes the argument of minis, the high order byte of shorties,
|
||||||
|
and the fact that the result is to be multiplied by the word size. This
|
||||||
|
reduces the number of required routines to 338.
|
||||||
|
.NH 3
|
||||||
|
Intelligent Calls.
|
||||||
|
.PP
|
||||||
|
The call in the switch statement does full operand construction, and the
|
||||||
|
resulting operand is passed to the routine. This reduces the number of
|
||||||
|
routines to 133, the number of EM instructions. Generation of the switch
|
||||||
|
statement from ip_spec.t will be complicated, but the routine space will be
|
||||||
|
much cleaner. This will not give any speed-up since the same actions are still
|
||||||
|
required; they are just performed in a different place.
|
||||||
|
.NH 3
|
||||||
|
Static Evaluation.
|
||||||
|
.PP
|
||||||
|
It can be observed that the evaluation of the operand of a given instruction in
|
||||||
|
the text segment will always give the same result. It is therefore possible to
|
||||||
|
preprocess the text segment, decomposing the instructions into structs which
|
||||||
|
contain the address, the instruction code and the operand. No operand decoding
|
||||||
|
will be necessary at run-time: all operands have been precalculated. This will
|
||||||
|
probably give a considerable speed-up. Jumps, especially GTO jumps, will,
|
||||||
|
however, require more attention.
|
||||||
|
.NH 2
|
||||||
|
Disassembly.
|
||||||
|
.PP
|
||||||
|
A disassembly facility is available, which gives a readable but not
|
||||||
|
letter-perfect disassembly of the EM object. The procedure structure is
|
||||||
|
marked by placing the indication \fBP[n]\fP at the entry point of each
|
||||||
|
procedure, where \fBn\fP is the procedure identifier. The number of locals is
|
||||||
|
given in a comment.
|
||||||
|
.LP
|
||||||
|
The disassembler was generated by the software in the directory \fIswitch\fP
|
||||||
|
and then further processed by hand.
|
181
doc/int/txt3
Normal file
|
@ -0,0 +1,181 @@
|
||||||
|
.\" Logging
|
||||||
|
.\"
|
||||||
|
.\" $Header$
|
||||||
|
.bp
|
||||||
|
.NH
|
||||||
|
THE LOGGING MACHINE.
|
||||||
|
.PP
|
||||||
|
Since messages and warnings provided by \fBint\fP include source code file
|
||||||
|
names and line numbers, they alone often suffice to identify the error.
|
||||||
|
If, however, the necessity arises, much more extensive debugging information
|
||||||
|
can be obtained by activating the Logging Machine.
|
||||||
|
This Logging Machine, which monitors all actions of the EM machine, is the
|
||||||
|
subject of this chapter.
|
||||||
|
.NH 2
|
||||||
|
Implementation.
|
||||||
|
.PP
|
||||||
|
When inspecting the source code of \fBint\fP, many lines in the
|
||||||
|
following format will show up:
|
||||||
|
.DS
|
||||||
|
LOG(("@<\fIletter\fP><\fIdigit\fP> message", args));
|
||||||
|
.DE
|
||||||
|
or
|
||||||
|
.DS
|
||||||
|
LOG(("\ <\fIletter\fP><\fIdigit\fP> message", args));
|
||||||
|
.DE
|
||||||
|
The double parentheses are needed, because \fILOG()\fP is
|
||||||
|
declared as a define, and has a printf-like argument structure.
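.LP
The double-parentheses idiom can be illustrated as follows.
A macro with a fixed number of parameters cannot take a variable-length
argument list directly, so the whole list is passed as one parenthesized
macro argument.
The sketch is not the definition used in \fBint\fP: the routine name
\fIdo_log()\fP and the buffer \fIlast_log\fP (standing in for the log file)
are assumptions, and the guard variable \fIlogging\fP is the one described
in the section on controlling the Logging Machine.
.DS
```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

int logging = 0;        /* guard: is any logging wanted at all? */
char last_log[128];     /* hypothetical stand-in for the log file */

/* printf-like routine that formats one log message. */
void do_log(const char *fmt, ...)
{
    va_list ap;

    assert(fmt != NULL);
    va_start(ap, fmt);
    vsnprintf(last_log, sizeof last_log, fmt, ap);
    va_end(ap);
}

/* The inner parentheses of LOG((...)) become the argument list of
 * do_log; the test on 'logging' keeps inactive statements cheap. */
#define LOG(args) do { if (logging) do_log args; } while (0)
```
.DE
A call such as LOG(("@m5 errno=%d", 2)); then expands to a guarded call of
\fIdo_log()\fP with two arguments.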
|
||||||
|
.PP
|
||||||
|
The <\fIletter\fP> classifies the log message and corresponds to an entry in
|
||||||
|
the \fIlogmask\fP, which holds a threshold for each class of messages.
|
||||||
|
The following classes exist:
|
||||||
|
.TS
|
||||||
|
tab(@);
|
||||||
|
l l l.
|
||||||
|
\(bu A\-Z@the flow of instructions:
|
||||||
|
@A: array
|
||||||
|
@B: branch
|
||||||
|
@C: convert
|
||||||
|
@F: floating point arithmetic
|
||||||
|
@I: integer arithmetic
|
||||||
|
@L: load
|
||||||
|
@M: miscellaneous
|
||||||
|
@P: procedure call
|
||||||
|
@R: pointer arithmetic
|
||||||
|
@S: store
|
||||||
|
@T: compare
|
||||||
|
@U: unsigned arithmetic
|
||||||
|
@X: logical
|
||||||
|
@Y: sets
|
||||||
|
@Z: increment/decrement/zero
|
||||||
|
\(bu d@stack dumping.
|
||||||
|
\(bu g@gda & heap manipulation.
|
||||||
|
\(bu s@stack manipulation.
|
||||||
|
\(bu r@reading the loadfile.
|
||||||
|
\(bu q@floating point calculations during reading the loadfile.
|
||||||
|
\(bu x@the instruction count, contents and file position.
|
||||||
|
\(bu m@monitor calls.
|
||||||
|
\(bu p@procedure calls and returns.
|
||||||
|
\(bu t@traps.
|
||||||
|
\(bu w@warnings.
|
||||||
|
.TE
|
||||||
|
.LP
|
||||||
|
When the interpreter reaches a LOG(()) statement it scans its first argument;
|
||||||
|
if \fIletter\fP
|
||||||
|
occurs in the logmask, and if \fIdigit\fP is lower than or equal to the
|
||||||
|
threshold in the logmask, the message is given.
|
||||||
|
Depending on the first character, the message will be preceded by a
|
||||||
|
position indication (with the @) or will be printed as is (with the
|
||||||
|
space).
|
||||||
|
The \fIletter\fP determines the message class
|
||||||
|
and the \fIdigit\fP is used to distinguish various levels
|
||||||
|
of logging, with a lower digit indicating a more important message.
|
||||||
|
We will call the <\fIletter\fP><\fIdigit\fP> combination the \fBid\fP of
|
||||||
|
the logging.
|
||||||
|
.LP
|
||||||
|
In general, the lower the \fIdigit\fP following the \fIletter\fP,
|
||||||
|
the more important the message.
|
||||||
|
E.g. m5 reports about unsuccessful monitor calls only, m9 also reports
|
||||||
|
about successful monitor calls (which are obviously less interesting).
|
||||||
|
New logging messages can be added to the source code in places you
|
||||||
|
think relevant.
|
||||||
|
.LP
|
||||||
|
Reasonable settings for the logmask are:
|
||||||
|
.TS
|
||||||
|
tab(@);
|
||||||
|
l l l.
|
||||||
|
@A\-Z9d4twx9@advised setting when troubleshooting (default).
|
||||||
|
@A\-Zx9@shows the flow of instructions & global information.
|
||||||
|
@pm9@shows the procedure & monitor calls.
|
||||||
|
@tw9@shows warning & trap information.
|
||||||
|
.TE
|
||||||
|
.PP
|
||||||
|
An EM interpreter without a Logging Machine can be obtained by undefining the
|
||||||
|
macro \fICHECKING\fP in the file \fIchecking.h\fP.
|
||||||
|
.NH 2
|
||||||
|
Controlling the Logging machine.
|
||||||
|
.PP
|
||||||
|
The actions of the Logging Machine are controlled by a set of internal
|
||||||
|
variables (one of which is the log mask).
|
||||||
|
These variables can be set through assignments on the command line, as
|
||||||
|
explained in the manual page \fIint.1\fP, q.v.
|
||||||
|
Since there are a great many logging statements in the program, of which only a
|
||||||
|
few will be executed in any call of the interpreter, it is important to be able
|
||||||
|
to decide quickly if a given \fIid\fP has to be checked at all.
|
||||||
|
To this end all logging statements are guarded (in the #define) by a test for
|
||||||
|
the boolean variable \fIlogging\fP.
|
||||||
|
This variable will only be set if the command line assignments show the
|
||||||
|
potential need for logging (\fImust_log\fP) and the instruction count
|
||||||
|
(\fIinr\fP) is at least equal to \fIlog_start\fP (which derives from the
|
||||||
|
parameter \fBLOG\fP).
|
||||||
|
.LP
|
||||||
|
The log mask can be set by the assignment
|
||||||
|
.DS
|
||||||
|
"LOGMASK=\fIlogstring\fP"
|
||||||
|
.DE
|
||||||
|
which sets the current logmask to \fIlogstring\fP.
|
||||||
|
A logstring has the following form:
|
||||||
|
.DS
|
||||||
|
[ [ \fIletter\fP | \fIletter\fP \- \fIletter\fP ]+ \fIdigit\fP ]+
|
||||||
|
.DE
|
||||||
|
E.g. LOGMASK=A\-D8x9R7c0hi4 will print all messages belonging to loggings
|
||||||
|
with \fBid\fPs:
|
||||||
|
\fIA0..A8,B0..B8,C0..C8,D0..D8,x0..x9,R0..R7,c0,h0..h4,i0..i4\fP.
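.LP
A parser for this logstring syntax, together with the threshold test, can be
sketched as follows.
The names \fIthreshold\fP, \fIset_logmask()\fP and \fImust_log()\fP are
hypothetical, and indexing the mask by character code is an assumption; the
representation used inside \fBint\fP is not described here.
.DS
```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* threshold[c] is the highest digit logged for class c,
 * or -1 if class c is not in the mask. */
int threshold[256];

/* Parse a logstring such as "A-D8x9R7c0hi4".
 * Returns 0 on success, -1 on a syntax error. */
int set_logmask(const char *s)
{
    assert(s != NULL);
    for (int i = 0; i < 256; i++)
        threshold[i] = -1;

    while (*s != 0) {
        unsigned char pending[256] = {0};  /* classes awaiting a digit */
        int seen = 0;

        while (isalpha((unsigned char)*s)) {
            unsigned char lo = (unsigned char)*s++;
            unsigned char hi = lo;

            if (s[0] == '-' && isalpha((unsigned char)s[1])) {
                hi = (unsigned char)s[1];  /* a range letter-letter */
                s += 2;
            }
            if (hi < lo)
                return -1;
            for (int c = lo; c <= hi; c++)
                pending[c] = 1;
            seen = 1;
        }
        if (!seen || !isdigit((unsigned char)*s))
            return -1;
        for (int c = 0; c < 256; c++)
            if (pending[c])
                threshold[c] = *s - '0';
        s++;
    }
    return 0;
}

/* A message with id <letter><digit> is given if digit <= threshold. */
int must_log(char letter, int digit)
{
    return digit <= threshold[(unsigned char)letter];
}
```
.DE
Applied to the example above, the sketch accepts D8 and i4 but rejects D9,
c1 and any class that is not mentioned in the string.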
|
||||||
|
.PP
|
||||||
|
The logging variable STOP can be used to prevent run-away logging
|
||||||
|
past the point where the user expects an error to occur.
|
||||||
|
STOP=\fInr\fP will stop the interpreter after instruction number \fInr\fP.
|
||||||
|
.PP
|
||||||
|
To simplify the use of the logging machine, a number of abbreviations have been
|
||||||
|
defined.
|
||||||
|
E.g., AT=\fInr\fP can be thought of as an abbreviation of LOG=\fInr\-1\fP
|
||||||
|
STOP=\fInr+1\fP; this causes three stack dumps, one before the suspect
|
||||||
|
instruction, one on it and one after it; then the interpreter stops.
|
||||||
|
.PP
|
||||||
|
Logging results will appear in a special logging file (default: \fIint.log\fP).
|
||||||
|
.NH 2
|
||||||
|
Dumps.
|
||||||
|
.PP
|
||||||
|
There are three routines available to examine the memory contents:
|
||||||
|
.TS
|
||||||
|
tab(@);
|
||||||
|
l l l.
|
||||||
|
@\fIstd_all()\fP@dumps the contents of the stack (\fId1\fP or \fId2\fP must be in the logmask).
|
||||||
|
@\fIgdad_all()\fP@dumps the contents of the gda (\fI+1\fP must be in the logmask).
|
||||||
|
@\fIhpd_all()\fP@dumps the contents of the heap (\fI*1\fP must be in the logmask).
|
||||||
|
.TE
|
||||||
|
.LP
|
||||||
|
These routines can be used everywhere in the program to examine the
|
||||||
|
contents of memory.
|
||||||
|
The gda and the heap are each dumped only once, at the point selected by the corresponding internal variable.
|
||||||
|
The stack is dumped after each
|
||||||
|
instruction if the log mask contains d1 or d2; d2 gives a full formatted
|
||||||
|
dump, d1 produces a listing of the Return Status Blocks only.
|
||||||
|
An attempt is made to format the stack correctly, based on the shadow
|
||||||
|
bytes, which identify the Return Status Block.
|
||||||
|
.LP
|
||||||
|
Remember to set the correct \fBid\fP in the LOGMASK, and to give
|
||||||
|
LOG the correct value.
|
||||||
|
If dumping is needed before the first instruction, then LOG must be
|
||||||
|
set to 0.
|
||||||
|
.LP
|
||||||
|
The dumps of the global data area and the heap are controlled internally by
|
||||||
|
the id-s +1 and *1 resp.; the corresponding logmask entries are set
|
||||||
|
automatically by setting the GDA and HEAP variables.
|
||||||
|
.NH 2
|
||||||
|
Forking.
|
||||||
|
.PP
|
||||||
|
As mentioned earlier, a call to \fIfork()\fP causes an image of the current
|
||||||
|
program to start running.
|
||||||
|
To prevent a messy logfile, the child process gets its own logfile
|
||||||
|
(and message file, tally file, etc.).
|
||||||
|
These logfiles are distinguished from the parent logfile by a
|
||||||
|
postfix, e.g.,
|
||||||
|
\fIlogfile_1\fP for the first child, \fIlogfile_2\fP for the second child,
|
||||||
|
\fIlogfile_1_2\fP for the second child of the first child, etc.
|
||||||
|
.br
|
||||||
|
\fINote\fP: the implementation of this feature is shaky; it works for the log
|
||||||
|
file but should also work for other files and for the names of the logging
|
||||||
|
variables.
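.LP
The postfix scheme for child logfiles described above can be sketched as
follows.
The function name \fIchild_logname()\fP is hypothetical and the sketch
covers only the construction of the name, not the reopening of the file
after the \fIfork()\fP.
.DS
```c
#include <assert.h>
#include <stdio.h>

/* Build the log file name for the n-th child of the process that logs
 * to 'parent'. Returns 0 on success, -1 if the name does not fit. */
int child_logname(char *out, size_t outsize, const char *parent, int n)
{
    assert(out != NULL && parent != NULL && n > 0);

    int len = snprintf(out, outsize, "%s_%d", parent, n);
    return (len < 0 || (size_t)len >= outsize) ? -1 : 0;
}
```
.DE
Applying the routine to its own result gives the name for a grandchild,
e.g. \fIint.log_1_2\fP for the second child of the first child.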
|
200
util/int/int.1
Normal file
|
@ -0,0 +1,200 @@
|
||||||
|
.\" Manual page
|
||||||
|
.\"
|
||||||
|
.\" $Header$
|
||||||
|
.TH INT I
|
||||||
|
.ad
|
||||||
|
.SH NAME
|
||||||
|
int \- Interpreter for EM Machine Language
|
||||||
|
.SH SYNOPSIS
|
||||||
|
\fBint\fP [ intargs ] [ emfile [ emargs ] ]
|
||||||
|
.SH DESCRIPTION
|
||||||
|
This program interprets the EM machine-language, and replaces
|
||||||
|
the EM interpreter written in Pascal that is described in [1].
|
||||||
|
The program interprets load files in \fIe.out\fP format (see [1], sec. 10.3).
|
||||||
|
.LP
|
||||||
|
\fIEmfile\fP is the name of the load file; if no name is
|
||||||
|
specified, the default name \fIe.out\fP is used.
|
||||||
|
The program can handle several word size / pointer size combinations.
|
||||||
|
The combinations presently supported are 2/2, 2/4 and 4/4.
|
||||||
|
.LP
|
||||||
|
\fIEmargs\fP are the arguments for the program being interpreted.
|
||||||
|
If any arguments are given, then \fIemfile\fP must be present.
|
||||||
|
.PP
|
||||||
|
The interpreter can generate diagnostic messages (warnings) about the
|
||||||
|
interpreted program.
|
||||||
|
Some of these warnings are given very frequently,
|
||||||
|
which may result in a large, non-functional message file.
|
||||||
|
To avoid this behavior, counters keep track of the number of times
|
||||||
|
a given warning occurs in a given file at a given line number.
|
||||||
|
The warning is actually given only when this counter is a power of 4.
|
||||||
|
`Logarithmic warning generation' is established in this way.
|
||||||
|
.PP
|
||||||
|
\fIInt\fP preempts the highest two file descriptors available, for
|
||||||
|
diagnostic purposes.
|
||||||
|
Interpreted programs can use the other file descriptors without
|
||||||
|
clash problems.
|
||||||
|
.PP
|
||||||
|
.I "Interpreter parameters"
|
||||||
|
.br
|
||||||
|
\fIInt\fP itself accepts the following options, all given as separate flags:
|
||||||
|
.IP \fB\-d\fP
|
||||||
|
The program will not be run; a disassembly listing of the program will
|
||||||
|
be written to standard output instead.
|
||||||
|
The original names are lost, but the procedure structure is recovered.
|
||||||
|
.IP \fB\-h\fP\fIN\fP
|
||||||
|
The maximum size of the heap will be limited to \fIN\fP bytes. This can be
|
||||||
|
used to force a heap overflow trap.
|
||||||
|
.IP \fB\-I\fP\fIN\fP
|
||||||
|
It is possible to tell \fIint\fP to ignore traps in the range 0-15.
|
||||||
|
If a trap is ignored, every time the trap would have happened
|
||||||
|
a warning is generated instead.
|
||||||
|
The argument \fIN\fP is the trap number, as described in [1], sec. 9.
|
||||||
|
For ignoring more than one trap, several \fB\-I\fP flags are needed.
|
||||||
|
.IP \fB\-m\fP\fIfile\fP
|
||||||
|
The argument \fIfile\fP is the name of a file on which the messages will
|
||||||
|
appear.
|
||||||
|
The default file name is \fIint.mess\fP.
|
||||||
|
.IP \fB\-r\fP\fIN\fP
|
||||||
|
Determines the size of the Function Return Area.
|
||||||
|
Default: 2 \(mu pointer size.
|
||||||
|
.IP \fB\-s\fP\fIN\fP
|
||||||
|
The maximum size of the stack will be limited to \fIN\fP bytes. This can be
|
||||||
|
used to force a stack overflow trap.
|
||||||
|
.IP \fB\-t\fP
|
||||||
|
If given, a file \fIint.tally\fP will be produced upon program termination.
|
||||||
|
For each source file, it contains a list of line numbers visited,
|
||||||
|
with the number of times the line was visited and
|
||||||
|
the number of EM instructions executed on the line.
|
||||||
|
.IP \fB\-W\fP\fIN\fP
|
||||||
|
This option can be used to disable warnings.
|
||||||
|
The argument \fIN\fP is the number of the warning to be suppressed,
|
||||||
|
as found in the \fIint\fP documentation [3].
|
||||||
|
For disabling more than one warning, several \fB\-W\fP flags are needed.
|
||||||
|
.PP
|
||||||
|
.I "The Logging Machine"
|
||||||
|
.br
|
||||||
|
The EM machine is monitored continually by a Logging Machine. This logging
|
||||||
|
machine keeps an instruction count and
|
||||||
|
can produce a trace of the actions of the EM machine, make readable
|
||||||
|
dumps of the stack, heap and global data area, and stop the EM machine after a
|
||||||
|
given instruction number.
|
||||||
|
The actions of the logging machine are controlled by
|
||||||
|
its internal variables, the values of which can be set by assignments on the
|
||||||
|
command line, much like setting macro names in a call of \fImake\fP.
|
||||||
|
These assignments can be interspersed with the options for the EM machine.
|
||||||
|
.PP
|
||||||
|
The logging machine has the following internal variables:
|
||||||
|
.IP \fBLOG\fP=\fIN\fP
|
||||||
|
Logging will start when the instruction count has reached \fIN\fP.
|
||||||
|
.IP \fBLOGMASK\fP=\fIstring\fP
|
||||||
|
The tracing actions are controlled by a log mask; the log mask consists of a
|
||||||
|
list of pairs of action classes and logging levels.
|
||||||
|
E.g. \fBLOGMASK\fP=\fIm9\fP means: trace all monitor calls.
|
||||||
|
The action classes are described fully in [3].
|
||||||
|
The default log mask is reasonably suitable.
|
||||||
|
.IP \fBLOGFILE\fP=\fIstring\fP
|
||||||
|
The \fIstring\fP is the name of a file on which all logging information is
|
||||||
|
written.
|
||||||
|
The default file name is \fIint.log\fP.
|
||||||
|
.IP \fBSTOP\fP=\fIN\fP
|
||||||
|
The logging machine stops the EM machine after instruction \fIN\fP.
|
||||||
|
.PP
|
||||||
|
Stack dumps can be made after each instruction; they are controlled by the pair
|
||||||
|
\fBd4\fP in the log mask; gda and heap dumps can only be made after a specific
|
||||||
|
instruction.
|
||||||
|
The following internal variables pertain to memory dumps:
|
||||||
|
.IP \fBGDA\fP=\fIN\fP
|
||||||
|
The contents of the Global Data Area are dumped after instruction \fIN\fP. The
|
||||||
|
extent can be adjusted by setting \fBGMIN\fP=\fINmin\fP (default 0) and
|
||||||
|
\fBGMAX\fP=\fINmax\fP (default HB).
|
||||||
|
.IP \fBHEAP\fP=\fIN\fP
|
||||||
|
The contents of the heap are dumped after instruction \fIN\fP.
|
||||||
|
.IP \fBSTDSIZE\fP=\fIN\fP
|
||||||
|
The stack dump is restricted to the \fIN\fP topmost bytes.
|
||||||
|
.IP \fBRAWSTACK\fP=\fIN\fP
|
||||||
|
Normally the stack dump produced is divided into activation records
|
||||||
|
separated by formatted dumps of the Return Status Blocks.
|
||||||
|
If \fIN\fP is non-zero, this dividing and formatting is suppressed, and the
|
||||||
|
stack is dumped raw.
|
||||||
|
.PP
Some combinations of variable settings are generally useful and can be
abbreviated:
.IP \fBAT\fP=\fIN\fP
Is an abbreviation of \fBLOG\fP=\fIN\-1\fP \fBSTOP\fP=\fIN+1\fP.
The default log mask applies.
.IP \fBL\fP=\fIstring\fP
Is an abbreviation of \fBLOG\fP=\fI0\fP \fBLOGMASK\fP=\fIstring\fP.
E.g., \fBL\fP=\fIm9\fP will log all monitor calls
and \fBL\fP=\fIA\-Z9\fP will log all instructions (give a full trace).
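.PP
The two abbreviations can be read as a simple rewriting of settings.
The Python sketch below shows that rewriting on a dictionary of
name/value strings; the function name \fIexpand_abbreviations\fP and the
dictionary representation are assumptions made for the example, and it does
not show how the interpreter itself processes its arguments.
.nf
```python
def expand_abbreviations(settings):
    """Rewrite AT=N and L=string into the settings they abbreviate."""
    expanded = {}
    for name, value in settings.items():
        if name == "AT":
            n = int(value)
            expanded["LOG"] = str(n - 1)    # AT=N gives LOG=N-1 ...
            expanded["STOP"] = str(n + 1)   # ... and STOP=N+1
        elif name == "L":
            expanded["LOG"] = "0"           # L=string gives LOG=0 ...
            expanded["LOGMASK"] = value     # ... and LOGMASK=string
        else:
            expanded[name] = value          # other settings pass through
    return expanded
```
.fi
.PP
For instance, \fBAT\fP=\fI100\fP expands to \fBLOG\fP=\fI99\fP
\fBSTOP\fP=\fI101\fP.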
.PP
When the interpreter forks, the child continues logging on a new file named
\fIint.log_1\fP, etc.
In principle it reevaluates the interpreter arguments, now looking for
\fBLOG_1\fP, \fBLOGMASK_1\fP, etc., but this feature has not been fully
implemented.
.PP
.I "Diagnostics"
.br
All diagnostics are written to the message file.
Diagnostics come in three flavors:
.IP \-
(messages): These inform you about NOP instructions, give more information
about incoming signals and display the exit status of the program.
.IP \-
(warnings): These are generated as a result of the checking.
In most cases the diagnostic is self-explanatory.
A complete description of the warnings can be found in the \fIint\fP
documentation [3].
.IP \-
(fatal errors): These result from an irrecoverable
error, generally before the program has started: incorrect call of the
interpreter, cannot access file, incorrect format of load file. A few occur
during interpretation: out of memory, uncaught traps, floating point operation
on a version without floating point.
Execution stops immediately after the diagnostic is generated.
.PP
Further diagnostics are generated (on \fIstderr\fP) if files cannot
be opened or found.
.SH "SEE ALSO"
e.out(5), ack(1), em22(1), em24(1), em44(1).
.IP [1]
Andrew S. Tanenbaum, Hans van Staveren, Ed G. Keizer and Johan W. Stevenson,
\fIDescription of a Machine Architecture for use with Block
Structured Languages\fP, Informatica rapport IR-81.
.IP [2]
Amsterdam Compiler Kit, reference manual and UNIX manual pages.
.IP [3]
Eddo de Groot, Leo van den Berge, Dick Grune,
\fIThe EM Interpreter\fP.
.SH "FILES"
.ta 20n
int.mess	contains messages
.br
int.log	contains logging info, if requested
.br
int.tally	contains tally results, if requested
.br
int.core	produced upon fatal error; format provisional
.SH "BUGS"
The monitor calls
.IR mpxcall ,
.I ptrace
and
.I profile
have not been implemented.
.br
The maximum number of bytes for rotation is 4.
.br
The UNIX V7 struct tchars is not emulated under System V.
.br
The P and N restrictions on operands are not checked.
.br
The start-up has a quadratic component in the number of procedures in the EM
program.
.SH "AUTHORS"
L.J.A. van den Berge.
.br
E.J. de Groot.
.br
D. Grune.