ack/util/cmisc/tabgen.1
1989-12-06 12:38:18 +00:00

116 lines
3.6 KiB
Groff

.TH TABGEN 1ACK
.ad
.SH NAME
tabgen \- table generator for C-programs
.SH SYNOPSYS
.B tabgen \fIarguments\fP
.SH DESCRIPTION
.I Tabgen
is a handy tool for generating tables for C-programs from a compact
description. The current version is only suitable for generating character
tables. The output is produced on standard output.
It works by maintaining an internal table of values, printing this table
when this is requested.
.PP
Each argument given to
.I tabgen
is either a command or a description. Descriptions are discussed first.
.PP
A description consists of a value (a string), directly followed by a semicolon,
directly followed by a list of indices for which the table to be generated
has this value. This list of indices must be in a certain \fBinputformat\fP,
characterized by a charactet.
Currently, there is only one inputformat, "c". In this format, the indices
are characters. There are two special characters: '\e' and '-'. The '\e'
behaves like in a C-string, and the '-' describes a range, unless
it starts the list of indices.
.PP
Some examples of descriptions:
.nf
STIDF:a-zA-Z_
STSKIP:\er \et\e013\ef
.fi
.PP
These descriptions have the effect that the internal table values for
'a' through 'z', 'A' through 'Z', and '_' are set to STIDF, and that the
internal table values for carriage-return, space, tab, vertical-tab, and
form-feed are set to STSKIP.
.PP
A command is introduced by a special character. On the command line,
a command is introduced by a '-'. The following commands are
recognized:
.IP I\fIchar\fP
switch to a different input format. This command is only there for future
extensions.
.IP f\fIfile\fP
read input from the file \fIfile\fP. In a file, each line is an argument
to \fItabgen\fP. Each line is either a command or a description. In a file,
commands are introduced by a '%'.
.IP F\fIformat\fP
Values are printed in a printf format. The default value for this format
is \fB"%s,\en"\fP. This can be changed with this command.
.IP T\fItext\fP
Print \fItext\fP literally at this point.
.IP p
Print the table as it is built at this point.
.IP C
Clear the table. This sets all internal table values to 0.
.IP i\fIstr\fP
Initialize all internal table values to \fIstr\fP. if \fIstr\fP is not
given, this command is equivalent to the C command.
.IP S\fInum\fP
Set the table size to \fInum\fP entries. The default size is 128.
.IP H\fIfilename\fP
Create tables which can be indexed by the full range of characters,
rather than 0..127. As this depends on the implementation of 'char'
in C (signed or unsigned), this also generates a file \fIfilename\fP,
with a #define for the constant "CharOffset". The generated tables can
be indexed by first adding "CharOffset" to the base of the table.
If \fIfilename\fP is not given, "charoffset.h" is used.
.SH "AN EXAMPLE"
.PP
The next example is a part of the \fItabgen\fP description of the
character tables used by the lexical analyser of the ACK Modula-2 compiler.
This description resides in a file called char.tab.
.I
Tabgen
is called as follows:
.nf
tabgen -fchar.tab > char.c
.fi
.PP
The description as given here generates 2 tables: one indicating a class
according to which token a character can be a start of, and one indicating
whether a character may occur in an identifier.
.nf
% Comments are introduced with space or tab after the %
%S129
%F %s,
% CHARACTER CLASSES
%iSTGARB
STSKIP: \et\e013\e014\e015
STNL:\e012
STSIMP:-#&()*+,/;=[]^{|}~
STCOMP:.:<>
STIDF:a-zA-Z
STSTR:"'
STNUM:0-9
STEOI:\e200
%T#include "class.h"
% class.h contains #defines for STSKIP, STNL, ...
%Tchar tkclass[] = {
%p
%T};
% INIDF
%C
1:a-zA-Z0-9
%Tchar inidf[] = {
%F %s,
%p
%T};
.fi
.SH BUGS
.PP
.I Tabgen
assumes that characters are 8 bits wide.