.TH FLEX 1 "26 May 1990" "Version 2.3"
.SH NAME
flex - fast lexical analyzer generator
.SH SYNOPSIS
.B flex
.B [-bcdfinpstvFILT8 -C[efmF] -Sskeleton]
.I [filename ...]
.SH DESCRIPTION
.I flex
is a tool for generating
.I scanners:
programs which recognize lexical patterns in text.
.I flex
reads
the given input files, or its standard input if no file names are given,
for a description of a scanner to generate.  The description is in
the form of pairs
of regular expressions and C code, called
.I rules.  flex
generates as output a C source file,
.B lex.yy.c,
which defines a routine
.B yylex().
This file is compiled and linked with the
.B -lfl
library to produce an executable.  When the executable is run,
it analyzes its input for occurrences
of the regular expressions.  Whenever it finds one, it executes
the corresponding C code.
.LP
For full documentation, see
.B flexdoc(1).
This manual entry is intended for use as a quick reference.
.SH OPTIONS
.I flex
has the following options:
.TP
.B -b
Generate backtracking information to
.I lex.backtrack.
This is a list of scanner states which require backtracking
and the input characters on which they do so.  By adding rules one
can remove backtracking states.  If all backtracking states
are eliminated and
.B -f
or
.B -F
is used, the generated scanner will run faster.
.TP
.B -c
is a do-nothing, deprecated option included for POSIX compliance.
.IP
.B NOTE:
in previous releases of
.I flex
.B -c
specified table-compression options.  This functionality is
now given by the
.B -C
flag.  To ease the impact of this change, when
.I flex
encounters
.B -c,
it currently issues a warning message and assumes that
.B -C
was desired instead.  In the future this "promotion" of
.B -c
to
.B -C
will go away in the name of full POSIX compliance (unless
the POSIX meaning is removed first).
.TP
.B -d
makes the generated scanner run in
.I debug
mode.  Whenever a pattern is recognized and the global
.B yy_flex_debug
is non-zero (which is the default), the scanner will
write to
.I stderr
a line of the form:
.nf

    --accepting rule at line 53 ("the matched text")

.fi
The line number refers to the location of the rule in the file
defining the scanner (i.e., the file that was fed to flex).  Messages
are also generated when the scanner backtracks, accepts the
default rule, reaches the end of its input buffer (or encounters
a NUL; the two look the same as far as the scanner's concerned),
or reaches an end-of-file.
.TP
.B -f
specifies (take your pick)
.I full table
or
.I fast scanner.
No table compression is done.  The result is large but fast.
This option is equivalent to
.B -Cf
(see below).
.TP
.B -i
instructs
.I flex
to generate a
.I case-insensitive
scanner.  The case of letters given in the
.I flex
input patterns will
be ignored, and tokens in the input will be matched regardless of case.  The
matched text given in
.I yytext
will have its case preserved (i.e., it will not be folded).
.TP
.B -n
is another do-nothing, deprecated option included only for
POSIX compliance.
.TP
.B -p
generates a performance report to stderr.  The report
consists of comments regarding features of the
.I flex
input file which will cause a loss of performance in the resulting scanner.
.TP
.B -s
causes the
.I default rule
(that unmatched scanner input is echoed to
.I stdout)
to be suppressed.  If the scanner encounters input that does not
match any of its rules, it aborts with an error.
.TP
.B -t
instructs
.I flex
to write the scanner it generates to standard output instead
of
.B lex.yy.c.
.TP
.B -v
specifies that
.I flex
should write to
.I stderr
a summary of statistics regarding the scanner it generates.
.TP
.B -F
specifies that the
.ul
fast
scanner table representation should be used.  This representation is
about as fast as the full table representation
.ul
(-f),
and for some sets of patterns will be considerably smaller (and for
others, larger).  See
.B flexdoc(1)
for details.
.IP
This option is equivalent to
.B -CF
(see below).
.TP
.B -I
instructs
.I flex
to generate an
.I interactive
scanner, that is, a scanner which stops immediately rather than
looking ahead if it knows
that the currently scanned text cannot be part of a longer rule's match.
Again, see
.B flexdoc(1)
for details.
.IP
Note,
.B -I
cannot be used in conjunction with
.I full
or
.I fast tables,
i.e., the
.B -f, -F, -Cf,
or
.B -CF
flags.
.TP
.B -L
instructs
.I flex
not to generate
.B #line
directives in
.B lex.yy.c.
The default is to generate such directives so error
messages in the actions will be correctly
located with respect to the original
.I flex
input file, and not to
the fairly meaningless line numbers of
.B lex.yy.c.
.TP
.B -T
makes
.I flex
run in
.I trace
mode.  It will generate a lot of messages to
.I stdout
concerning
the form of the input and the resultant non-deterministic and deterministic
finite automata.  This option is mostly for use in maintaining
.I flex.
.TP
.B -8
instructs
.I flex
to generate an 8-bit scanner.
On some sites, this is the default.  On others, the default
is 7-bit characters.  To see which is the case, check the verbose
.B (-v)
output for "equivalence classes created".  If the denominator of
the number shown is 128, then by default
.I flex
is generating 7-bit characters.  If it is 256, then the default is
8-bit characters.
.TP
.B -C[efmF]
controls the degree of table compression.
.IP
.B -Ce
directs
.I flex
to construct
.I equivalence classes,
i.e., sets of characters
which have identical lexical properties.
Equivalence classes usually give
dramatic reductions in the final table/object file sizes (typically
a factor of 2-5) and are pretty cheap performance-wise (one array
look-up per character scanned).
.IP
.B -Cf
specifies that the
.I full
scanner tables should be generated -
.I flex
should not compress the
tables by taking advantage of similar transition functions for
different states.
.IP
.B -CF
specifies that the alternate fast scanner representation (described in
.B flexdoc(1))
should be used.
.IP
.B -Cm
directs
.I flex
to construct
.I meta-equivalence classes,
which are sets of equivalence classes (or characters, if equivalence
classes are not being used) that are commonly used together.  Meta-equivalence
classes are often a big win when using compressed tables, but they
have a moderate performance impact (one or two "if" tests and one
array look-up per character scanned).
.IP
A lone
.B -C
specifies that the scanner tables should be compressed but neither
equivalence classes nor meta-equivalence classes should be used.
.IP
The options
.B -Cf
or
.B -CF
and
.B -Cm
do not make sense together - there is no opportunity for meta-equivalence
classes if the table is not being compressed.  Otherwise the options
may be freely mixed.
.IP
The default setting is
.B -Cem,
which specifies that
.I flex
should generate equivalence classes
and meta-equivalence classes.  This setting provides the highest
degree of table compression.  You can trade
larger tables for faster-executing scanners, with
the following generally being true:
.nf

    slowest & smallest
          -Cem
          -Cm
          -Ce
          -C
          -C{f,F}e
          -C{f,F}
    fastest & largest

.fi
.IP
.B -C
options are not cumulative; whenever the flag is encountered, the
previous -C settings are forgotten.
.TP
.B -Sskeleton_file
overrides the default skeleton file from which
.I flex
constructs its scanners.  You'll never need this option unless you are doing
.I flex
maintenance or development.
.SH SUMMARY OF FLEX REGULAR EXPRESSIONS
The patterns in the input are written using an extended set of regular
expressions.  These are:
.nf

    x          match the character 'x'
    .          any character except newline
    [xyz]      a "character class"; in this case, the pattern
                 matches either an 'x', a 'y', or a 'z'
    [abj-oZ]   a "character class" with a range in it; matches
                 an 'a', a 'b', any letter from 'j' through 'o',
                 or a 'Z'
    [^A-Z]     a "negated character class", i.e., any character
                 but those in the class.  In this case, any
                 character EXCEPT an uppercase letter.
    [^A-Z\\n]   any character EXCEPT an uppercase letter or
                 a newline
    r*         zero or more r's, where r is any regular expression
    r+         one or more r's
    r?         zero or one r's (that is, "an optional r")
    r{2,5}     anywhere from two to five r's
    r{2,}      two or more r's
    r{4}       exactly 4 r's
    {name}     the expansion of the "name" definition
               (see flexdoc(1))
    "[xyz]\\"foo"
               the literal string: [xyz]"foo
    \\X         if X is an 'a', 'b', 'f', 'n', 'r', 't', or 'v',
                 then the ANSI-C interpretation of \\x.
                 Otherwise, a literal 'X' (used to escape
                 operators such as '*')
    \\123       the character with octal value 123
    \\x2a       the character with hexadecimal value 2a
    (r)        match an r; parentheses are used to override
                 precedence (see below)


    rs         the regular expression r followed by the
                 regular expression s; called "concatenation"


    r|s        either an r or an s


    r/s        an r but only if it is followed by an s.  The
                 s is not part of the matched text.  This type
                 of pattern is called "trailing context".
    ^r         an r, but only at the beginning of a line
    r$         an r, but only at the end of a line.  Equivalent
                 to "r/\\n".


    <s>r       an r, but only in start condition s (see
               flexdoc(1) for discussion of start conditions)
    <s1,s2,s3>r
               same, but in any of start conditions s1,
               s2, or s3


    <<EOF>>    an end-of-file
    <s1,s2><<EOF>>
               an end-of-file when in start condition s1 or s2

.fi
The regular expressions listed above are grouped according to
precedence, from highest precedence at the top to lowest at the bottom.
Those grouped together have equal precedence.
.LP
Some notes on patterns:
.IP -
Negated character classes
.I match newlines
unless "\\n" (or an equivalent escape sequence) is one of the
characters explicitly present in the negated character class
(e.g., "[^A-Z\\n]").
.IP -
A rule can have at most one instance of trailing context (the '/' operator
or the '$' operator).  The start condition, '^', and "<<EOF>>" patterns
can only occur at the beginning of a pattern and, like '/' and '$',
cannot be grouped inside parentheses.  The following are all illegal:
.nf

    foo/bar$
    foo|(bar$)
    foo|^bar
    <sc1>foo<sc2>bar

.fi
.SH SUMMARY OF SPECIAL ACTIONS
In addition to arbitrary C code, the following can appear in actions:
.IP -
.B ECHO
copies yytext to the scanner's output.
.IP -
.B BEGIN
followed by the name of a start condition places the scanner in the
corresponding start condition.
.IP -
.B REJECT
directs the scanner to proceed on to the "second best" rule which matched the
input (or a prefix of the input).
.B yytext
and
.B yyleng
are set up appropriately.  Note that
.B REJECT
is a particularly expensive feature in terms of scanner performance;
if it is used in
.I any
of the scanner's actions it will slow down
.I all
of the scanner's matching.  Furthermore,
.B REJECT
cannot be used with the
.I -f
or
.I -F
options.
.IP
Note also that unlike the other special actions,
.B REJECT
is a
.I branch;
code immediately following it in the action will
.I not
be executed.
.IP -
.B yymore()
tells the scanner that the next time it matches a rule, the corresponding
token should be
.I appended
onto the current value of
.B yytext
rather than replacing it.
.IP -
.B yyless(n)
returns all but the first
.I n
characters of the current token back to the input stream, where they
will be rescanned when the scanner looks for the next match.
.B yytext
and
.B yyleng
are adjusted appropriately (e.g.,
.B yyleng
will now be equal to
.I n
).
.IP -
.B unput(c)
puts the character
.I c
back onto the input stream.  It will be the next character scanned.
.IP -
.B input()
reads the next character from the input stream (this routine is called
.B yyinput()
if the scanner is compiled using
.B C++).
.IP -
.B yyterminate()
can be used in lieu of a return statement in an action.  It terminates
the scanner and returns a 0 to the scanner's caller, indicating "all done".
.IP
By default,
.B yyterminate()
is also called when an end-of-file is encountered.  It is a macro and
may be redefined.
.IP -
.B YY_NEW_FILE
is an action available only in <<EOF>> rules.  It means "Okay, I've
set up a new input file, continue scanning".
.IP -
.B yy_create_buffer( file, size )
takes a
.I FILE
pointer and an integer
.I size.
It returns a YY_BUFFER_STATE
handle to a new input buffer large enough to accommodate
.I size
characters and associated with the given file.  When in doubt, use
.B YY_BUF_SIZE
for the size.
.IP -
.B yy_switch_to_buffer( new_buffer )
switches the scanner's processing to scan for tokens from
the given buffer, which must be a YY_BUFFER_STATE.
.IP -
.B yy_delete_buffer( buffer )
deletes the given buffer.
.SH VALUES AVAILABLE TO THE USER
.IP -
.B char *yytext
holds the text of the current token.  It may not be modified.
.IP -
.B int yyleng
holds the length of the current token.  It may not be modified.
.IP -
.B FILE *yyin
is the file which by default
.I flex
reads from.  It may be redefined but doing so only makes sense before
scanning begins.  Changing it in the middle of scanning will have
unexpected results since
.I flex
buffers its input.  Once scanning terminates because an end-of-file
has been seen,
.B
void yyrestart( FILE *new_file )
may be called to point
.I yyin
at the new input file.
.IP -
.B FILE *yyout
is the file to which
.B ECHO
actions are done.  It can be reassigned by the user.
.IP -
.B YY_CURRENT_BUFFER
returns a
.B YY_BUFFER_STATE
handle to the current buffer.
.SH MACROS THE USER CAN REDEFINE
.IP -
.B YY_DECL
controls how the scanning routine is declared.
By default, it is "int yylex()", or, if prototypes are being
used, "int yylex(void)".  This definition may be changed by redefining
the "YY_DECL" macro.  Note that
if you give arguments to the scanning routine using a
K&R-style/non-prototyped function declaration, you must terminate
the definition with a semi-colon (;).
.IP -
How the scanner
gets its input can be controlled by redefining the
.B YY_INPUT
macro.
YY_INPUT's calling sequence is "YY_INPUT(buf,result,max_size)".  Its
action is to place up to
.I max_size
characters in the character array
.I buf
and return in the integer variable
.I result
either the
number of characters read or the constant YY_NULL (0 on Unix systems)
to indicate EOF.  The default YY_INPUT reads from the
global file-pointer "yyin".
A sample redefinition of YY_INPUT (in the definitions
section of the input file):
.nf

    %{
    #undef YY_INPUT
    #define YY_INPUT(buf,result,max_size) \\
        { \\
        int c = getchar(); \\
        result = (c == EOF) ? YY_NULL : (buf[0] = c, 1); \\
        }
    %}

.fi
.IP -
When the scanner receives an end-of-file indication from YY_INPUT,
it then checks the
.B yywrap()
function.  If
.B yywrap()
returns false (zero), then it is assumed that the
function has gone ahead and set up
.I yyin
to point to another input file, and scanning continues.  If it returns
true (non-zero), then the scanner terminates, returning 0 to its
caller.
.IP
The default
.B yywrap()
always returns 1.  Presently, to redefine it you must first
"#undef yywrap", as it is currently implemented as a macro.  It is
likely that
.B yywrap()
will soon be defined to be a function rather than a macro.
.IP -
.B YY_USER_ACTION
can be redefined to provide an action
which is always executed prior to the matched rule's action.
.IP -
The macro
.B YY_USER_INIT
may be redefined to provide an action which is always executed before
the first scan.
.IP -
In the generated scanner, the actions are all gathered in one large
switch statement and separated using
.B YY_BREAK,
which may be redefined.  By default, it is simply a "break", to separate
each rule's action from the following rule's.
.SH FILES
.TP
.I flex.skel
skeleton scanner.
.TP
.I lex.yy.c
generated scanner (called
.I lexyy.c
on some systems).
.TP
.I lex.backtrack
backtracking information for
.B -b
flag (called
.I lex.bck
on some systems).
.TP
.B -lfl
library with which to link the scanners.
.SH "SEE ALSO"
.LP
flexdoc(1), lex(1), yacc(1), sed(1), awk(1).
.LP
M. E. Lesk and E. Schmidt,
.I LEX - Lexical Analyzer Generator
.SH DIAGNOSTICS
.I reject_used_but_not_detected undefined
or
.LP
.I yymore_used_but_not_detected undefined -
These errors can occur at compile time.  They indicate that the
scanner uses
.B REJECT
or
.B yymore()
but that
.I flex
failed to notice the fact, meaning that
.I flex
scanned the first two sections looking for occurrences of these actions
and failed to find any, but somehow you snuck some in (via a #include
file, for example).  Make an explicit reference to the action in your
.I flex
input file.  (Note that previously
.I flex
supported a
.B %used/%unused
mechanism for dealing with this problem; this feature is still supported
but now deprecated, and will go away soon unless the author hears from
people who can argue compellingly that they need it.)
.LP
.I flex scanner jammed -
a scanner compiled with
.B -s
has encountered an input string which wasn't matched by
any of its rules.
.LP
.I flex input buffer overflowed -
a scanner rule matched a string long enough to overflow the
scanner's internal input buffer (16K bytes - controlled by
.B YY_BUF_MAX
in "flex.skel").
.LP
.I scanner requires -8 flag -
Your scanner specification includes recognizing 8-bit characters and
you did not specify the -8 flag (and your site has not installed flex
with -8 as the default).
.LP
.I
fatal flex scanner internal error--end of buffer missed -
This can occur in a scanner which is reentered after a long-jump
has jumped out of (or over) the scanner's activation frame.  Before
reentering the scanner, use:
.nf

    yyrestart( yyin );

.fi
.LP
.I too many %t classes! -
You managed to put every single character into its own %t class.
.I flex
requires that at least one of the classes share characters.
.SH AUTHOR
Vern Paxson, with the help of many ideas and much inspiration from
Van Jacobson.  Original version by Jef Poskanzer.
.LP
See flexdoc(1) for additional credits and the address to send comments to.
.SH DEFICIENCIES / BUGS
.LP
Some trailing context
patterns cannot be properly matched and generate
warning messages ("Dangerous trailing context").  These are
patterns where the ending of the
first part of the rule matches the beginning of the second
part, such as "zx*/xy*", where the 'x*' matches the 'x' at
the beginning of the trailing context.  (Note that the POSIX draft
states that the text matched by such patterns is undefined.)
.LP
For some trailing context rules, parts which are actually fixed-length are
not recognized as such, leading to a loss of performance.
In particular, parts using '|' or {n} (such as "foo{3}") are always
considered variable-length.
.LP
Combining trailing context with the special '|' action can result in
.I fixed
trailing context being turned into the more expensive
.I variable
trailing context.  This happens, for example, in the following:
.nf

    %%
    abc      |
    xyz/def

.fi
.LP
Use of unput() invalidates yytext and yyleng.
.LP
Use of unput() to push back more text than was matched can
result in the pushed-back text matching a beginning-of-line ('^')
rule even though it didn't come at the beginning of the line
(though this is rare!).
.LP
Pattern-matching of NUL's is substantially slower than matching other
characters.
.LP
.I flex
does not generate correct #line directives for code internal
to the scanner; thus, bugs in
.I flex.skel
yield bogus line numbers.
.LP
Due to both buffering of input and read-ahead, you cannot intermix
calls to <stdio.h> routines, such as
.B getchar(),
with
.I flex
rules and expect it to work.  Call
.B input()
instead.
.LP
The total table entries listed by the
.B -v
flag excludes the number of table entries needed to determine
what rule has been matched.  The number of entries is equal
to the number of DFA states if the scanner does not use
.B REJECT,
and somewhat greater than the number of states if it does.
.LP
.B REJECT
cannot be used with the
.I -f
or
.I -F
options.
.LP
Some of the macros, such as
.B yywrap(),
may in the future become functions which live in the
.B -lfl
library.  This will doubtless break a lot of code, but may be
required for POSIX-compliance.
.LP
The
.I flex
internal algorithms need documentation.