This commit was manufactured by cvs2svn to create branch 'VENDOR-flex'.

author: cvs2svn <cvs2svn@FreeBSD.org> 1994-08-27 09:52:33 +0000
committer: cvs2svn <cvs2svn@FreeBSD.org> 1994-08-27 09:52:33 +0000
commit: 44852ef08668a7c6fb78323d12f3234aadfff3db (patch)
tree: 01580496c9d675be8be31d9608fe1cbd27c5e28f /usr.bin/lex
parent: b5e4d7a92edc625822f742dc8af8721069184091 (diff)
download: FreeBSD-src-44852ef08668a7c6fb78323d12f3234aadfff3db.zip
FreeBSD-src-44852ef08668a7c6fb78323d12f3234aadfff3db.tar.gz
1 files changed, 1001 insertions, 0 deletions
diff --git a/usr.bin/lex/lex.1 b/usr.bin/lex/lex.1
new file mode 100644
index 0000000..6aba4d6
--- /dev/null
+++ b/usr.bin/lex/lex.1
@@ -0,0 +1,1001 @@
+.TH FLEX 1 "November 1993" "Version 2.4"
+.SH NAME
+flex \- fast lexical analyzer generator
+.SH SYNOPSIS
+.B flex
+.B [\-bcdfhilnpstvwBFILTV78+ \-C[aefFmr] \-Pprefix \-Sskeleton]
+.I [filename ...]
+.SH DESCRIPTION
+.I flex
+is a tool for generating
+.I scanners:
+programs which recognized lexical patterns in text.
+.I flex
+reads
+the given input files, or its standard input if no file names are given,
+for a description of a scanner to generate.  The description is in
+the form of pairs
+of regular expressions and C code, called
+.I rules.  flex
+generates as output a C source file,
+.B lex.yy.c,
+which defines a routine
+.B yylex().
+This file is compiled and linked with the
+.B \-lfl
+library to produce an executable.  When the executable is run,
+it analyzes its input for occurrences
+of the regular expressions.  Whenever it finds one, it executes
+the corresponding C code.
+.PP
+For full documentation, see
+.B flexdoc(1).
+This manual entry is intended for use as a quick reference.
+.SH OPTIONS
+.I flex
+has the following options:
+.TP
+.B \-b
+generate backing-up information to
+.I lex.backup.
+This is a list of scanner states which require backing up and the input
+characters on which they do so.  By adding rules one can remove
+backing-up states.  If all backing-up states are eliminated and
+.B \-Cf
+or
+.B \-CF
+is used, the generated scanner will run faster.
+.TP
+.B \-c
+is a do-nothing, deprecated option included for POSIX compliance.
+.IP
+.B NOTE:
+in previous releases of
+.I flex
+.B \-c
+specified table-compression options.  This functionality is
+now given by the
+.B \-C
+flag.  To ease the the impact of this change, when
+.I flex
+encounters
+.B \-c,
+it currently issues a warning message and assumes that
+.B \-C
+was desired instead.  In the future this "promotion" of
+.B \-c
+to
+.B \-C
+will go away in the name of full POSIX compliance (unless
+the POSIX meaning is removed first).
+.TP
+.B \-d
+makes the generated scanner run in
+.I debug
+mode.  Whenever a pattern is recognized and the global
+.B yy_flex_debug
+is non-zero (which is the default), the scanner will
+write to
+.I stderr
+a line of the form:
+.nf
+
+    --accepting rule at line 53 ("the matched text")
+
+.fi
+The line number refers to the location of the rule in the file
+defining the scanner (i.e., the file that was fed to flex).  Messages
+are also generated when the scanner backs up, accepts the
+default rule, reaches the end of its input buffer (or encounters
+a NUL; the two look the same as far as the scanner's concerned),
+or reaches an end-of-file.
+.TP
+.B \-f
+specifies
+.I fast scanner.
+No table compression is done and stdio is bypassed.
+The result is large but fast.  This option is equivalent to
+.B \-Cfr
+(see below).
+.TP
+.B \-h
+generates a "help" summary of
+.I flex's
+options to
+.I stderr 
+and then exits.
+.TP
+.B \-i
+instructs
+.I flex
+to generate a
+.I case-insensitive
+scanner.  The case of letters given in the
+.I flex
+input patterns will
+be ignored, and tokens in the input will be matched regardless of case.  The
+matched text given in
+.I yytext
+will have the preserved case (i.e., it will not be folded).
+.TP
+.B \-l
+turns on maximum compatibility with the original AT&T lex implementation,
+at a considerable performance cost.  This option is incompatible with
+.B \-+, \-f, \-F, \-Cf,
+or
+.B \-CF.
+See
+.I flexdoc(1)
+for details.
+.TP
+.B \-n
+is another do-nothing, deprecated option included only for
+POSIX compliance.
+.TP
+.B \-p
+generates a performance report to stderr.  The report
+consists of comments regarding features of the
+.I flex
+input file which will cause a loss of performance in the resulting scanner.
+If you give the flag twice, you will also get comments regarding
+features that lead to minor performance losses.
+.TP
+.B \-s
+causes the
+.I default rule
+(that unmatched scanner input is echoed to
+.I stdout)
+to be suppressed.  If the scanner encounters input that does not
+match any of its rules, it aborts with an error.
+.TP
+.B \-t
+instructs
+.I flex
+to write the scanner it generates to standard output instead
+of
+.B lex.yy.c.
+.TP
+.B \-v
+specifies that
+.I flex
+should write to
+.I stderr
+a summary of statistics regarding the scanner it generates.
+.TP
+.B \-w
+suppresses warning messages.
+.TP
+.B \-B
+instructs
+.I flex
+to generate a
+.I batch
+scanner instead of an
+.I interactive
+scanner (see
+.B \-I
+below).  See
+.I flexdoc(1)
+for details.  Scanners using
+.B \-Cf
+or
+.B \-CF
+compression options automatically specify this option, too.
+.TP
+.B \-F
+specifies that the
+.ul
+fast
+scanner table representation should be used (and stdio bypassed).
+This representation is about as fast as the full table representation
+.B (-f),
+and for some sets of patterns will be considerably smaller (and for
+others, larger).  It cannot be used with the
+.B \-+
+option.  See
+.B flexdoc(1)
+for more details.
+.IP
+This option is equivalent to
+.B \-CFr
+(see below).
+.TP
+.B \-I
+instructs
+.I flex
+to generate an
+.I interactive
+scanner, that is, a scanner which stops immediately rather than
+looking ahead if it knows
+that the currently scanned text cannot be part of a longer rule's match.
+This is the opposite of
+.I batch
+scanners (see
+.B \-B
+above).  See
+.B flexdoc(1)
+for details.
+.IP
+Note,
+.B \-I
+cannot be used in conjunction with
+.I full
+or
+.I fast tables,
+i.e., the
+.B \-f, \-F, \-Cf,
+or
+.B \-CF
+flags.  For other table compression options,
+.B \-I
+is the default.
+.TP
+.B \-L
+instructs
+.I flex
+not to generate
+.B #line
+directives in
+.B lex.yy.c.
+The default is to generate such directives so error
+messages in the actions will be correctly
+located with respect to the original
+.I flex
+input file, and not to
+the fairly meaningless line numbers of
+.B lex.yy.c.
+.TP
+.B \-T
+makes
+.I flex
+run in
+.I trace
+mode.  It will generate a lot of messages to
+.I stderr
+concerning
+the form of the input and the resultant non-deterministic and deterministic
+finite automata.  This option is mostly for use in maintaining
+.I flex.
+.TP
+.B \-V
+prints the version number to
+.I stderr
+and exits.
+.TP
+.B \-7
+instructs
+.I flex
+to generate a 7-bit scanner, which can save considerable table space,
+especially when using
+.B \-Cf
+or
+.B \-CF
+(and, at most sites,
+.B \-7
+is on by default for these options.  To see if this is the case, use the
+.B -v
+verbose flag and check the flag summary it reports).
+.TP
+.B \-8
+instructs
+.I flex
+to generate an 8-bit scanner.  This is the default except for the
+.B \-Cf
+and
+.B \-CF
+compression options, for which the default is site-dependent, and
+can be checked by inspecting the flag summary generated by the
+.B \-v
+option.
+.TP
+.B \-+
+specifies that you want flex to generate a C++
+scanner class.  See the section on Generating C++ Scanners in
+.I flexdoc(1)
+for details.
+.TP 
+.B \-C[aefFmr]
+controls the degree of table compression and scanner optimization.
+.IP
+.B \-Ca
+trade off larger tables in the generated scanner for faster performance
+because the elements of the tables are better aligned for memory access
+and computation.  This option can double the size of the tables used by
+your scanner.
+.IP
+.B \-Ce
+directs
+.I flex
+to construct
+.I equivalence classes,
+i.e., sets of characters
+which have identical lexical properties.
+Equivalence classes usually give
+dramatic reductions in the final table/object file sizes (typically
+a factor of 2-5) and are pretty cheap performance-wise (one array
+look-up per character scanned).
+.IP
+.B \-Cf
+specifies that the
+.I full
+scanner tables should be generated -
+.I flex
+should not compress the
+tables by taking advantages of similar transition functions for
+different states.
+.IP
+.B \-CF
+specifies that the alternate fast scanner representation (described in
+.B flexdoc(1))
+should be used.  This option cannot be used with
+.B \-+.
+.IP
+.B \-Cm
+directs
+.I flex
+to construct
+.I meta-equivalence classes,
+which are sets of equivalence classes (or characters, if equivalence
+classes are not being used) that are commonly used together.  Meta-equivalence
+classes are often a big win when using compressed tables, but they
+have a moderate performance impact (one or two "if" tests and one
+array look-up per character scanned).
+.IP
+.B \-Cr
+causes the generated scanner to
+.I bypass
+using stdio for input.  In general this option results in a minor
+performance gain only worthwhile if used in conjunction with
+.B \-Cf
+or
+.B \-CF.
+It can cause surprising behavior if you use stdio yourself to
+read from
+.I yyin
+prior to calling the scanner.
+.IP
+A lone
+.B \-C
+specifies that the scanner tables should be compressed but neither
+equivalence classes nor meta-equivalence classes should be used.
+.IP
+The options
+.B \-Cf
+or
+.B \-CF
+and
+.B \-Cm
+do not make sense together - there is no opportunity for meta-equivalence
+classes if the table is not being compressed.  Otherwise the options
+may be freely mixed.
+.IP
+The default setting is
+.B \-Cem,
+which specifies that
+.I flex
+should generate equivalence classes
+and meta-equivalence classes.  This setting provides the highest
+degree of table compression.  You can trade off
+faster-executing scanners at the cost of larger tables with
+the following generally being true:
+.nf
+
+    slowest & smallest
+          -Cem
+          -Cm
+          -Ce
+          -C
+          -C{f,F}e
+          -C{f,F}
+          -C{f,F}a
+    fastest & largest
+
+.fi
+.IP
+.B \-C
+options are cumulative.
+.TP
+.B \-Pprefix
+changes the default
+.I "yy"
+prefix used by
+.I flex
+to be
+.I prefix
+instead.  See
+.I flexdoc(1)
+for a description of all the global variables and file names that
+this affects.
+.TP
+.B \-Sskeleton_file
+overrides the default skeleton file from which
+.I flex
+constructs its scanners.  You'll never need this option unless you are doing
+.I flex
+maintenance or development.
+.SH SUMMARY OF FLEX REGULAR EXPRESSIONS
+The patterns in the input are written using an extended set of regular
+expressions.  These are:
+.nf
+
+    x          match the character 'x'
+    .          any character except newline
+    [xyz]      a "character class"; in this case, the pattern
+                 matches either an 'x', a 'y', or a 'z'
+    [abj-oZ]   a "character class" with a range in it; matches
+                 an 'a', a 'b', any letter from 'j' through 'o',
+                 or a 'Z'
+    [^A-Z]     a "negated character class", i.e., any character
+                 but those in the class.  In this case, any
+                 character EXCEPT an uppercase letter.
+    [^A-Z\\n]   any character EXCEPT an uppercase letter or
+                 a newline
+    r*         zero or more r's, where r is any regular expression
+    r+         one or more r's
+    r?         zero or one r's (that is, "an optional r")
+    r{2,5}     anywhere from two to five r's
+    r{2,}      two or more r's
+    r{4}       exactly 4 r's
+    {name}     the expansion of the "name" definition
+               (see above)
+    "[xyz]\\"foo"
+               the literal string: [xyz]"foo
+    \\X         if X is an 'a', 'b', 'f', 'n', 'r', 't', or 'v',
+                 then the ANSI-C interpretation of \\x.
+                 Otherwise, a literal 'X' (used to escape
+                 operators such as '*')
+    \\123       the character with octal value 123
+    \\x2a       the character with hexadecimal value 2a
+    (r)        match an r; parentheses are used to override
+                 precedence (see below)
+
+
+    rs         the regular expression r followed by the
+                 regular expression s; called "concatenation"
+
+
+    r|s        either an r or an s
+
+
+    r/s        an r but only if it is followed by an s.  The
+                 s is not part of the matched text.  This type
+                 of pattern is called as "trailing context".
+    ^r         an r, but only at the beginning of a line
+    r$         an r, but only at the end of a line.  Equivalent
+                 to "r/\\n".
+
+
+    <s>r       an r, but only in start condition s (see
+               below for discussion of start conditions)
+    <s1,s2,s3>r
+               same, but in any of start conditions s1,
+               s2, or s3
+    <*>r       an r in any start condition, even an exclusive one.
+
+
+    <<EOF>>    an end-of-file
+    <s1,s2><<EOF>>
+               an end-of-file when in start condition s1 or s2
+
+.fi
+The regular expressions listed above are grouped according to
+precedence, from highest precedence at the top to lowest at the bottom.
+Those grouped together have equal precedence.
+.PP
+Some notes on patterns:
+.IP -
+Negated character classes
+.I match newlines
+unless "\\n" (or an equivalent escape sequence) is one of the
+characters explicitly present in the negated character class
+(e.g., "[^A-Z\\n]").
+.IP -
+A rule can have at most one instance of trailing context (the '/' operator
+or the '$' operator).  The start condition, '^', and "<<EOF>>" patterns
+can only occur at the beginning of a pattern, and, as well as with '/' and '$',
+cannot be grouped inside parentheses.  The following are all illegal:
+.nf
+
+    foo/bar$
+    foo|(bar$)
+    foo|^bar
+    <sc1>foo<sc2>bar
+
+.fi
+.SH SUMMARY OF SPECIAL ACTIONS
+In addition to arbitrary C code, the following can appear in actions:
+.IP -
+.B ECHO
+copies yytext to the scanner's output.
+.IP -
+.B BEGIN
+followed by the name of a start condition places the scanner in the
+corresponding start condition.
+.IP -
+.B REJECT
+directs the scanner to proceed on to the "second best" rule which matched the
+input (or a prefix of the input).
+.B yytext
+and
+.B yyleng
+are set up appropriately.  Note that
+.B REJECT
+is a particularly expensive feature in terms scanner performance;
+if it is used in
+.I any
+of the scanner's actions it will slow down
+.I all
+of the scanner's matching.  Furthermore,
+.B REJECT
+cannot be used with the
+.B \-f
+or
+.B \-F
+options.
+.IP
+Note also that unlike the other special actions,
+.B REJECT
+is a
+.I branch;
+code immediately following it in the action will
+.I not
+be executed.
+.IP -
+.B yymore()
+tells the scanner that the next time it matches a rule, the corresponding
+token should be
+.I appended
+onto the current value of
+.B yytext
+rather than replacing it.
+.IP -
+.B yyless(n)
+returns all but the first
+.I n
+characters of the current token back to the input stream, where they
+will be rescanned when the scanner looks for the next match.
+.B yytext
+and
+.B yyleng
+are adjusted appropriately (e.g.,
+.B yyleng
+will now be equal to
+.I n
+).
+.IP -
+.B unput(c)
+puts the character
+.I c
+back onto the input stream.  It will be the next character scanned.
+.IP -
+.B input()
+reads the next character from the input stream (this routine is called
+.B yyinput()
+if the scanner is compiled using
+.B C++).
+.IP -
+.B yyterminate()
+can be used in lieu of a return statement in an action.  It terminates
+the scanner and returns a 0 to the scanner's caller, indicating "all done".
+.IP
+By default,
+.B yyterminate()
+is also called when an end-of-file is encountered.  It is a macro and
+may be redefined.
+.IP -
+.B YY_NEW_FILE
+is an action available only in <<EOF>> rules.  It means "Okay, I've
+set up a new input file, continue scanning".  It is no longer required;
+you can just assign
+.I yyin
+to point to a new file in the <<EOF>> action.
+.IP -
+.B yy_create_buffer( file, size )
+takes a
+.I FILE
+pointer and an integer
+.I size.
+It returns a YY_BUFFER_STATE
+handle to a new input buffer large enough to accomodate
+.I size
+characters and associated with the given file.  When in doubt, use
+.B YY_BUF_SIZE
+for the size.
+.IP -
+.B yy_switch_to_buffer( new_buffer )
+switches the scanner's processing to scan for tokens from
+the given buffer, which must be a YY_BUFFER_STATE.
+.IP -
+.B yy_delete_buffer( buffer )
+deletes the given buffer.
+.SH VALUES AVAILABLE TO THE USER
+.IP -
+.B char *yytext
+holds the text of the current token.  It may be modified but not lengthened
+(you cannot append characters to the end).  Modifying the last character
+may affect the activity of rules anchored using '^' during the next scan;
+see
+.B flexdoc(1)
+for details.
+.IP
+If the special directive
+.B %array
+appears in the first section of the scanner description, then
+.B yytext
+is instead declared
+.B char yytext[YYLMAX],
+where
+.B YYLMAX
+is a macro definition that you can redefine in the first section
+if you don't like the default value (generally 8KB).  Using
+.B %array
+results in somewhat slower scanners, but the value of
+.B yytext
+becomes immune to calls to
+.I input()
+and
+.I unput(),
+which potentially destroy its value when
+.B yytext
+is a character pointer.  The opposite of
+.B %array
+is
+.B %pointer,
+which is the default.
+.IP
+You cannot use
+.B %array
+when generating C++ scanner classes
+(the
+.B \-+
+flag).
+.IP -
+.B int yyleng
+holds the length of the current token.
+.IP -
+.B FILE *yyin
+is the file which by default
+.I flex
+reads from.  It may be redefined but doing so only makes sense before
+scanning begins or after an EOF has been encountered.  Changing it in
+the midst of scanning will have unexpected results since
+.I flex
+buffers its input; use
+.B yyrestart()
+instead.
+Once scanning terminates because an end-of-file
+has been seen,
+.B
+you can assign
+.I yyin
+at the new input file and then call the scanner again to continue scanning.
+.IP -
+.B void yyrestart( FILE *new_file )
+may be called to point
+.I yyin
+at the new input file.  The switch-over to the new file is immediate
+(any previously buffered-up input is lost).  Note that calling
+.B yyrestart()
+with
+.I yyin
+as an argument thus throws away the current input buffer and continues
+scanning the same input file.
+.IP -
+.B FILE *yyout
+is the file to which
+.B ECHO
+actions are done.  It can be reassigned by the user.
+.IP -
+.B YY_CURRENT_BUFFER
+returns a
+.B YY_BUFFER_STATE
+handle to the current buffer.
+.IP -
+.B YY_START
+returns an integer value corresponding to the current start
+condition.  You can subsequently use this value with
+.B BEGIN
+to return to that start condition.
+.SH MACROS AND FUNCTIONS YOU CAN REDEFINE
+.IP -
+.B YY_DECL
+controls how the scanning routine is declared.
+By default, it is "int yylex()", or, if prototypes are being
+used, "int yylex(void)".  This definition may be changed by redefining
+the "YY_DECL" macro.  Note that
+if you give arguments to the scanning routine using a
+K&R-style/non-prototyped function declaration, you must terminate
+the definition with a semi-colon (;).
+.IP -
+The nature of how the scanner
+gets its input can be controlled by redefining the
+.B YY_INPUT
+macro.
+YY_INPUT's calling sequence is "YY_INPUT(buf,result,max_size)".  Its
+action is to place up to
+.I max_size
+characters in the character array
+.I buf
+and return in the integer variable
+.I result
+either the
+number of characters read or the constant YY_NULL (0 on Unix systems)
+to indicate EOF.  The default YY_INPUT reads from the
+global file-pointer "yyin".
+A sample redefinition of YY_INPUT (in the definitions
+section of the input file):
+.nf
+
+    %{
+    #undef YY_INPUT
+    #define YY_INPUT(buf,result,max_size) \\
+        { \\
+        int c = getchar(); \\
+        result = (c == EOF) ? YY_NULL : (buf[0] = c, 1); \\
+        }
+    %}
+
+.fi
+.IP -
+When the scanner receives an end-of-file indication from YY_INPUT,
+it then checks the function
+.B yywrap()
+function.  If
+.B yywrap()
+returns false (zero), then it is assumed that the
+function has gone ahead and set up
+.I yyin
+to point to another input file, and scanning continues.  If it returns
+true (non-zero), then the scanner terminates, returning 0 to its
+caller.
+.IP
+The default
+.B yywrap()
+always returns 1.
+.IP -
+YY_USER_ACTION
+can be redefined to provide an action
+which is always executed prior to the matched rule's action.
+.IP -
+The macro
+.B YY_USER_INIT
+may be redefined to provide an action which is always executed before
+the first scan.
+.IP -
+In the generated scanner, the actions are all gathered in one large
+switch statement and separated using
+.B YY_BREAK,
+which may be redefined.  By default, it is simply a "break", to separate
+each rule's action from the following rule's.
+.SH FILES
+.TP
+.B \-lfl
+library with which to link scanners to obtain the default versions
+of
+.I yywrap()
+and/or
+.I main().
+.TP
+.I lex.yy.c
+generated scanner (called
+.I lexyy.c
+on some systems).
+.TP
+.I lex.yy.cc
+generated C++ scanner class, when using
+.B -+.
+.TP
+.I <FlexLexer.h>
+header file defining the C++ scanner base class,
+.B FlexLexer,
+and its derived class,
+.B yyFlexLexer.
+.TP
+.I flex.skl
+skeleton scanner.  This file is only used when building flex, not when
+flex executes.
+.TP
+.I lex.backup
+backing-up information for
+.B \-b
+flag (called
+.I lex.bck
+on some systems).
+.SH "SEE ALSO"
+.PP
+flexdoc(1), lex(1), yacc(1), sed(1), awk(1).
+.PP
+M. E. Lesk and E. Schmidt,
+.I LEX \- Lexical Analyzer Generator
+.SH DIAGNOSTICS
+.PP
+.I reject_used_but_not_detected undefined
+or
+.PP
+.I yymore_used_but_not_detected undefined -
+These errors can occur at compile time.  They indicate that the
+scanner uses
+.B REJECT
+or
+.B yymore()
+but that
+.I flex
+failed to notice the fact, meaning that
+.I flex
+scanned the first two sections looking for occurrences of these actions
+and failed to find any, but somehow you snuck some in (via a #include
+file, for example).  Make an explicit reference to the action in your
+.I flex
+input file.  (Note that previously
+.I flex
+supported a
+.B %used/%unused
+mechanism for dealing with this problem; this feature is still supported
+but now deprecated, and will go away soon unless the author hears from
+people who can argue compellingly that they need it.)
+.PP
+.I flex scanner jammed -
+a scanner compiled with
+.B \-s
+has encountered an input string which wasn't matched by
+any of its rules.
+.PP
+.I warning, rule cannot be matched
+indicates that the given rule
+cannot be matched because it follows other rules that will
+always match the same text as it.  See
+.I flexdoc(1)
+for an example.
+.PP
+.I warning,
+.B \-s
+.I
+option given but default rule can be matched
+means that it is possible (perhaps only in a particular start condition)
+that the default rule (match any single character) is the only one
+that will match a particular input.  Since
+.PP
+.I scanner input buffer overflowed -
+a scanner rule matched more text than the available dynamic memory.
+.PP
+.I token too large, exceeds YYLMAX -
+your scanner uses
+.B %array
+and one of its rules matched a string longer than the
+.B YYLMAX
+constant (8K bytes by default).  You can increase the value by
+#define'ing
+.B YYLMAX
+in the definitions section of your
+.I flex
+input.
+.PP
+.I scanner requires \-8 flag to
+.I use the character 'x' -
+Your scanner specification includes recognizing the 8-bit character
+.I 'x'
+and you did not specify the \-8 flag, and your scanner defaulted to 7-bit
+because you used the
+.B \-Cf
+or
+.B \-CF
+table compression options.
+.PP
+.I flex scanner push-back overflow -
+you used
+.B unput()
+to push back so much text that the scanner's buffer could not hold
+both the pushed-back text and the current token in
+.B yytext.
+Ideally the scanner should dynamically resize the buffer in this case, but at
+present it does not.
+.PP
+.I
+input buffer overflow, can't enlarge buffer because scanner uses REJECT -
+the scanner was working on matching an extremely large token and needed
+to expand the input buffer.  This doesn't work with scanners that use
+.B
+REJECT.
+.PP
+.I
+fatal flex scanner internal error--end of buffer missed -
+This can occur in an scanner which is reentered after a long-jump
+has jumped out (or over) the scanner's activation frame.  Before
+reentering the scanner, use:
+.nf
+
+    yyrestart( yyin );
+
+.fi
+or use C++ scanner classes (the
+.B \-+
+option), which are fully reentrant.
+.SH AUTHOR
+Vern Paxson, with the help of many ideas and much inspiration from
+Van Jacobson.  Original version by Jef Poskanzer.
+.PP
+See flexdoc(1) for additional credits and the address to send comments to.
+.SH DEFICIENCIES / BUGS
+.PP
+Some trailing context
+patterns cannot be properly matched and generate
+warning messages ("dangerous trailing context").  These are
+patterns where the ending of the
+first part of the rule matches the beginning of the second
+part, such as "zx*/xy*", where the 'x*' matches the 'x' at
+the beginning of the trailing context.  (Note that the POSIX draft
+states that the text matched by such patterns is undefined.)
+.PP
+For some trailing context rules, parts which are actually fixed-length are
+not recognized as such, leading to the abovementioned performance loss.
+In particular, parts using '|' or {n} (such as "foo{3}") are always
+considered variable-length.
+.PP
+Combining trailing context with the special '|' action can result in
+.I fixed
+trailing context being turned into the more expensive
+.I variable
+trailing context.  For example, in the following:
+.nf
+
+    %%
+    abc      |
+    xyz/def
+
+.fi
+.PP
+Use of
+.B unput()
+or
+.B input()
+invalidates yytext and yyleng, unless the
+.B %array
+directive
+or the
+.B \-l
+option has been used.
+.PP
+Use of unput() to push back more text than was matched can
+result in the pushed-back text matching a beginning-of-line ('^')
+rule even though it didn't come at the beginning of the line
+(though this is rare!).
+.PP
+Pattern-matching of NUL's is substantially slower than matching other
+characters.
+.PP
+Dynamic resizing of the input buffer is slow, as it entails rescanning
+all the text matched so far by the current (generally huge) token.
+.PP
+.I flex
+does not generate correct #line directives for code internal
+to the scanner; thus, bugs in
+.I flex.skl
+yield bogus line numbers.
+.PP
+Due to both buffering of input and read-ahead, you cannot intermix
+calls to <stdio.h> routines, such as, for example,
+.B getchar(),
+with
+.I flex
+rules and expect it to work.  Call
+.B input()
+instead.
+.PP
+The total table entries listed by the
+.B \-v
+flag excludes the number of table entries needed to determine
+what rule has been matched.  The number of entries is equal
+to the number of DFA states if the scanner does not use
+.B REJECT,
+and somewhat greater than the number of states if it does.
+.PP
+.B REJECT
+cannot be used with the
+.B \-f
+or
+.B \-F
+options.
+.PP
+The
+.I flex
+internal algorithms need documentation.
author	cvs2svn <cvs2svn@FreeBSD.org>	1994-08-27 09:52:33 +0000
committer	cvs2svn <cvs2svn@FreeBSD.org>	1994-08-27 09:52:33 +0000
commit	44852ef08668a7c6fb78323d12f3234aadfff3db (patch)
tree	01580496c9d675be8be31d9608fe1cbd27c5e28f /usr.bin/lex
parent	b5e4d7a92edc625822f742dc8af8721069184091 (diff)
download	FreeBSD-src-44852ef08668a7c6fb78323d12f3234aadfff3db.zip FreeBSD-src-44852ef08668a7c6fb78323d12f3234aadfff3db.tar.gz