ack/doc/ack.doc

.\" $Header$
.nr PD 1v
.tr ~
.TL
Ack Description File
.br
Reference Manual
.AU
Ed Keizer
.AI
Vakgroep Informatica
Vrije Universiteit
Amsterdam
.NH
Introduction
.PP
The program \fIack\fP(I) internally maintains a table of
possible transformations and a table of string variables.
The transformation table contains one entry for each possible
transformation of a file.
Which transformations are used depends on the suffix of the
source file.
Each transformation table entry tells which input suffixes are
allowed and what suffix/name the output file has.
When the output file does not already satisfy the request of the
user, with the flag \fB\-c.suffix\fP, the table is scanned
starting with the next transformation in the table for another
transformation that has as input suffix the output suffix of
the previous transformation.
A few special transformations are recognized, among them is the
combiner, which is
a program combining several files into one.
When no stop suffix was specified (flag \fB\-c.suffix\fP) \fIack\fP
stops after executing the combiner with as arguments the \-
possibly transformed \- input files and libraries.
\fIAck\fP will only perform the transformations in the order in
which they are presented in the table.
.LP
The string variables are used while creating the argument list
and program call name for
a particular transformation.
.NH
Which descriptions are used
.PP
\fIAck\fP always uses two description files: one to define the
front-end transformations and one for the machine dependent
back-end transformations.
Each description has a name.
First the way of determining
the name of the descriptions needed is described.
.PP
When the shell environment variable ACKFE is set \fIack\fP uses
that to determine the front-end table name, otherwise it uses
\fBfe\fP.
.PP
The way the backend table name is determined is more
convoluted.
.br
First, when the last filename in the program call name is not
one of \fIack\fP, \fIcc\fP, \fIacc\fP, \fIpc\fP or \fIapc\fP,
this filename is used as the backend description name.
Second, when the \fB\-m\fP is present the \fB\-m\fP is chopped of this
flag and the rest is used as the backend description name.
Third, when both failed the shell environment variable ACKM is
used.
Last, when also ACKM was not present the default backend is
used, determined by the definition of ACKM in h/local.h.
The presence and value of the definition of ACKM is
determined at compile time of \fIack\fP.
.PP
Now, we have the names, but that is only the first step.
\fIAck\fP stores a few descriptions at compile time.
This descriptions are simply files read in at compile time.
At the moment of writing this document, the descriptions
included are: pdp, fe, i86, m68k2, vax2 and int.
The name of a description is first searched for internally,
then in lib/descr/\fIname\fP, then in
lib/\fIname\fP/descr, band finally in the current
directory of the user.
.NH
Using the description file
.PP
Before starting on a narrative of the description file,
the introduction of a few terms is necessary.
All these terms are used to describe the scanning of zero
terminated strings, thereby producing another string or
sequence of strings.
.IP Backslashing 5
.br
All characters preceded by \e are modified to prevent
recognition at further scanning.
This modification is undone before a string is passed to the
outside world as argument or message.
When reading the description files the
sequences \e\e, \e# and \e<newline> have a special meaning.
\e\e translates to a single \e, \e# translates to a single #
that is not
recognized as the start of comment, but can be used in
recognition and finally, \e<newline> translates to nothing at
all, thereby allowing continuation lines.
.nr PD 0
.IP "Variable replacement"
.br
The scan recognizes the sequences {{, {NAME} and {NAME?text}
Where NAME can be any combination if characters excluding ? and
} and text may be anything excluding }.
(~\e} is allowed of course~)
The first sequence produces an unescaped single {.
The second produces the contents of the NAME, definitions are
done by \fIack\fP and in description files.
When the NAME is not defined an error message is produced on
the diagnostic output.
The last sequence produces the contents of NAME if it is
defined and text otherwise.
.PP
.IP "Expression replacement"
.br
Syntax:  (\fIsuffix sequence\fP:\fIsuffix sequence\fP=\fItext\fP)
.br
Example: (.c.p.e:.e=tail_em)
.br
If the two suffix sequences have a common member \-~\&.e in this
case~\- the text is produced.
When no common member is present the empty string is produced.
Thus the example given is a constant expression.
Normally, one of the suffix sequences is produced by variable
replacement.
\fIAck\fP sets three variables while performing the diverse
transformations: HEAD, TAIL and RTS.
All three variables depend on the properties \fIrts\fP and
\fIneed\fP from the transformations used.
Whenever a transformation is used for the first time,
the text following the \fIneed\fP is appended to both the HEAD and
TAIL variable.
The value of the variable RTS is determined by the first
transformation used with a \fIrts\fP property.
.IP
Two runtime flags have effect on the value of one or more of
these variables.
The flag \fB\-.suffix\fP has the same effect on these three variables
as if a file with that \fBsuffix\fP was included in the argument list
and had to be translated.
The flag \fB\-r.suffix\fP only has that effect on the TAIL
variable.
The program call names \fIacc\fP and \fIcc\fP have the effect
of an automatic \fB\-.c\fP flag.
\fIApc\fP and \fIpc\fP have the effect of an automatic \fB\-.p\fP flag.
.IP "Line splitting"
.br
The string is transformed into a sequence of strings by replacing
the blank space by string separators (nulls).
.IP "IO replacement"
.br
The > in the string is replaced by the output file name.
The < in the string is replaced by the input file name.
When multiple input files are present the string is duplicated
for each input file name.
.nr PD 1v
.LP
Each description is a sequence of variable definitions followed
by a sequence of transformation definitions.
Variable definitions use a line each, transformations
definitions consist of a sequence of lines.
Empty lines are discarded, as are lines with nothing but
comment.
Comment is started by a # character, and continues to the end
of the line.
Three special two-characters sequences exist: \e#, \e\e and
\e<newline>.
Their effect is described under 'backslashing' above.
Each \- nonempty \- line starts with a keyword, possibly
preceded by blank space.
The keyword can be followed by a further specification.
The two are separated by blank space.
.PP
Variable definitions use the keyword \fIvar\fP and look like this:
.DS X
   var NAME=text
.DE
The name can be any identifier, the text may contain any
character.
Blank space before the equal sign is not part of the NAME.
Blank space after the equal is considered as part of the text.
The text is scanned for variable replacement before it is
associated with the variable name.
.br
.sp 2
The start of a transformation definition is indicated by the
keyword \fIname\fP.
The last line of such a definition contains the keyword
\fIend\fP.
The lines in between associate properties to a transformation
and may be presented in any order.
The identifier after the \fIname\fP keyword determines the name
of the transformation.
This name is used for debugging and by the \fB\-R\fP flag.
The keywords are used to specify which input suffices are
recognized by that transformation,
the program to run, the arguments to be handed to that program
and the name or suffix of the resulting output file.
Two keywords are used to indicate which run-time startoffs and
libraries are needed.
The possible keywords are:
.IP \fIfrom\fP
.br
followed by a sequence of suffices.
Each file with one of these suffices is allowed as input file.
Preprocessor transformations do not need the \fIfrom\fP
keyword. All other transformations do.
.nr PD 0
.IP \fIto\fP
.br
followed by the suffix of the output file name or in the case of a
linker
the output file name.
.IP \fIprogram\fP
.br
followed by name of the load file of the program, a pathname most likely
starts with either a / or {EM}.
This keyword must be
present, the remainder of the line
is subject to backslashing and variable replacement.
.IP \fImapflag\fP
.br
The mapflags are used to grab flags given to \fIack\fP and
pass them on to a specific transformation.
This feature uses a few simple pattern matching and replacement
facilities.
Multiple occurences of this keyword are allowed.
This text following the keyword is
subjected to backslashing.
The keyword is followed by a match expression and a variable
assignment separated by blank space.
As soon as both description files are read, \fIack\fP looks
at all transformations in these files to find a match for the
flags given to \fIack\fP.
The flags \fB\-m\fP, \fB\-o\fP,
\fB\-O\fP, \fB\-r\fP, \fB\-v\fP, \fB\-g\fP, \-\fB\-c\fP, \fB\-t\fP,
\fB\-k\fP, \fB\-R\fP and \-\fB\-.\fP are specific to \fIack\fP and
not handed down to any transformation.
The matching is performed in the order in which the entries
appear in the definition.
The scanning stops after first match is found.
When a match is found, the variable assignment is executed.
A * in the match expression matches any sequence of characters,
a * in the right hand part of the assignment is
replaced by the characters matched by
the * in the expression.
The right hand part is also subject to variable replacement.
The variable will probably be used in the program arguments.
The \fB\-l\fP flags are special,
the order in which they are presented to \fIack\fP must be
preserved.
The identifier LNAME is used in conjunction with the scanning of
\fB\-l\fP flags.
The value assigned to LNAME is used to replace the flag.
The example further on shows the use all this.
.IP \fIargs\fP
.br
The keyword is followed by the program call arguments.
It is subject to backslashing, variable replacement, expression
replacement, line splitting and IO replacement.
The variables assigned to by \fImapflags\fP will probably be
used here.
The flags not recognized by \fIack\fP or any of the transformations
are passed to the linker and inserted before all other arguments.
.IP \fIstdin\fP
.br
This keyword indicates that the transformation reads from standard input.
.IP \fIstdout\fP
.br
This keyword indicates that the transformation writes on standard output.
.IP \fIoptimizer\fP
.br
This keyword indicates that this transformation is an optimizer.
.IP \fIlinker\fP
.br
This keyword indicates that this transformation is the linker.
.IP \fIcombiner\fP
.br
This keyword indicates that this transformation is a combiner. A combiner
is a program combining several files into one, but is not a linker.
An example of a combiner is the global optimizer.
.IP \fIprep\fP
.br
This \-~optional~\- keyword is followed an option indicating its relation
to the preprocessor.
The possible options are:
.DS X
  always	the input files must be preprocessed
  cond	the input files must be preprocessed when starting with #
  is	this transformation is the preprocessor
.DE
.IP \fIrts\fP
.br
This \-~optional~\- keyword indicates that the rest of the line must be
used to set the variable RTS, if it was not already set.
Thus the variable RTS is set by the first transformation
executed which such a property or as a result from \fIack\fP's program
call name (acc, cc, apc or pc) or by the \fB\-.suffix\fP flag.
.IP \fIneed\fP
.br
This \-~optional~\- keyword indicates that the rest of the line must be
concatenated to the NEEDS variable.
This is done once for every transformation used or indicated
by one of the program call names mentioned above or indicated
by the \fB\-.suffix\fP flag.
.br
.nr PD 1v
.NH
Conventions used in description files
.PP
\fIAck\fP reads two description files.
A few of the variables defined in the machine specific file
are used by the descriptions of the front-ends.
Other variables, set by \fIack\fP, are of use to all
transformations.
.PP
\fIAck\fP sets the variable EM to the home directory of the
Amsterdam Compiler Kit.
The variable SOURCE is set to the name of the argument that is currently
being massaged, this is usefull for debugging.
The variable SUFFIX is set to the suffix of the argument that is
currently being massaged.
.br
The variable M indicates the
directory in lib/{M}/tail_..... and NAME is the string to
be defined by the preprocessor with \-D{NAME}.
The definitions of {w}, {s}, {l}, {d}, {f} and {p} indicate
EM_WSIZE, EM_SSIZE, EM_LSIZE, EM_DSIZE, EM_FSIZE and EM_PSIZE
respectively.
.br
The variable INCLUDES is used as the last argument to \fIcpp\fP.
It is used to add directories to
the list of directories containing #include files.
.PP
The variables HEAD, TAIL and RTS are set by \fIack\fP and used
to compose the arguments for the linker.
.NH
Example
.PP
Description for front-end
.DS X
.ta 4n 40n
name cpp	# the C-preprocessor
		# no from, it's governed by the P property
	to .i	# result files have suffix i
	program {EM}/lib/cpp	# pathname of loadfile
	mapflag \-I* CPP_F={CPP_F?} \-I*	# grab \-I.. \-U.. and
	mapflag \-U* CPP_F={CPP_F?} \-U*	# \-D.. to use as arguments
	mapflag \-D* CPP_F={CPP_F?} \-D*	# in the variable CPP_F
	args {CPP_F?} {INCLUDES?} \-D{NAME} \-DEM_WSIZE={w} \-DEM_PSIZE={p} \e
	    \-DEM_SSIZE={s} \-DEM_LSIZE={l} \-DEM_FSIZE={f} \-DEM_DSIZE={d} <
		# The arguments are: first the \-[IUD]...
		#  then the include dir's for this machine
		#  then the NAME and size valeus finally
		#  followed by the input file name
	stdout	# Output on stdout
	prep is	# Is preprocessor
end
name cem	# the C-compiler proper
	from .c	# used for files with suffix .c
	to .k	# produces compact code files
	program {EM}/lib/em_cem	# pathname of loadfile
	mapflag \-p CEM_F={CEM_F?} \-Xp	# pass \-p as \-Xp to cem
	mapflag \-L CEM_F={CEM_F?} \-l	# pass \-L as \-l to cem
	args \-Vw{w}i{w}p{p}f{f}s{s}l{l}d{d} {CEM_F?}
		# the arguments are the object sizes in
		# the \-V... flag and possibly \-l and \-Xp
	stdin	# input from stdin
	stdout	# output on stdout
	prep always	# use cpp
	rts .c	# use the C run-time system
	need .c	# use the C libraries
end
name decode	# make human readable files from compact code
	from .k.m	# accept files with suffix .k or .m
	to .e	# produce .e files
	program {EM}/lib/em_decode	# pathname of loadfile
	args <	# the input file name is the only argument
	stdout	# the output comes on stdout
end
.DE

.DS X
.ta 4n 40n
Example of a backend, in this case the EM assembler/loader.

var w=2	# wordsize 2
var p=2	# pointersize 2
var s=2	# short size 2
var l=4	# long size 4
var f=4	# float size 4
var d=8	# double size 8
var M=int	# Unused in this example
var NAME=int22	# for cpp (NAME=int results in #define int 1)
var LIB=mach/int/lib/tail_	# part of file name for libraries
var RT=mach/int/lib/head_	# part of file name for run-time startoff
var SIZE_FLAG=\-sm	# default internal table size flag
var INCLUDES=\-I{EM}/include	# use {EM}/include for #include files
name asld	# Assembler/loader
	from .k.m.a	# accepts compact code and archives
	to e.out	# output file name
	program {EM}/lib/em_ass	# load file pathname
	mapflag \-l* LNAME={EM}/{LIB}*	# e.g. \-ly becomes
		#	{EM}/mach/int/lib/tail_y
	mapflag \-+* ASS_F={ASS_F?} \-+*  # recognize \-+ and \-\-
	mapflag \-\-* ASS_F={ASS_F?} \-\-*
	mapflag \-s* SIZE_FLAG=\-s*	# overwrite old value of SIZE_FLAG
	args {SIZE_FLAG} \e
	    ({RTS}:.c={EM}/{RT}cc) ({RTS}:.p={EM}/{RT}pc) \-o > < \e
	    (.p:{TAIL}={EM}/{LIB}pc) \e
	    (.c:{TAIL}={EM}/{LIB}cc.1s {EM}/{LIB}cc.2g) \e
	    (.c.p:{TAIL}={EM}/{LIB}mon)
		# \-s[sml] must be first argument
		# the next line contains the choice for head_cc or head_pc
		# and the specification of in- and output.
		# the last three args lines choose libraries
	linker
end
.DE

The command \fIack \-mint \-v \-v \-I../h \-L \-ly prog.c\fP
would result in the following
calls (with exec(II)):
.DS X
.ta 4n
1)	/lib/cpp \-I../h \-I/usr/em/include \-Dint22 \-DEM_WSIZE=2 \-DEM_PSIZE=2 \e
	    \-DEM_SSIZE=2 \-DEM_LSIZE=4 \-DEM_FSIZE=4 \-DEM_DSIZE=8 prog.c
2)	/usr/em/lib/em_cem \-Vw2i2p2f4s2l4d8 \-l
3)	/usr/em/lib/em_ass \-sm /usr/em/mach/int/lib/head_cc \-o e.out prog.k
	/usr/em/mach/int/lib/tail_y /usr/em/mach/int/lib/tail_cc.1s
	/usr/em/mach/int/lib/tail_cc.2g /usr/em/mach/int/lib/tail_mon
.DE