summaryrefslogtreecommitdiffstats
path: root/share/doc
diff options
context:
space:
mode:
authorgrog <grog@FreeBSD.org>2002-05-19 04:37:39 +0000
committergrog <grog@FreeBSD.org>2002-05-19 04:37:39 +0000
commit1d9bed166c9114d035925645959f30f03030bc21 (patch)
treeeb82caa81d3e10a69e8f0cf8d0bccc531d75ee0c /share/doc
parentd62432c0c994f10cfaefb8fd1be8218b2e4d13ac (diff)
downloadFreeBSD-src-1d9bed166c9114d035925645959f30f03030bc21.zip
FreeBSD-src-1d9bed166c9114d035925645959f30f03030bc21.tar.gz
Initial checkin: 4.4BSD version. These files need to be updated with
current license information and adapted to the FreeBSD build environment before they will build. Approved by: David Taylor <davidt@caldera.com>
Diffstat (limited to 'share/doc')
-rw-r--r--share/doc/psd/15.yacc/Makefile12
-rw-r--r--share/doc/psd/15.yacc/ss..61
-rw-r--r--share/doc/psd/15.yacc/ss0206
-rw-r--r--share/doc/psd/15.yacc/ss1143
-rw-r--r--share/doc/psd/15.yacc/ss2156
-rw-r--r--share/doc/psd/15.yacc/ss3109
-rw-r--r--share/doc/psd/15.yacc/ss4335
-rw-r--r--share/doc/psd/15.yacc/ss5307
-rw-r--r--share/doc/psd/15.yacc/ss6151
-rw-r--r--share/doc/psd/15.yacc/ss7129
-rw-r--r--share/doc/psd/15.yacc/ss898
-rw-r--r--share/doc/psd/15.yacc/ss9174
-rw-r--r--share/doc/psd/15.yacc/ssA189
-rw-r--r--share/doc/psd/15.yacc/ssB31
-rw-r--r--share/doc/psd/15.yacc/ssa118
-rw-r--r--share/doc/psd/15.yacc/ssb115
-rw-r--r--share/doc/psd/15.yacc/ssc316
-rw-r--r--share/doc/psd/15.yacc/ssd45
18 files changed, 2695 insertions, 0 deletions
diff --git a/share/doc/psd/15.yacc/Makefile b/share/doc/psd/15.yacc/Makefile
new file mode 100644
index 0000000..8fa7406
--- /dev/null
+++ b/share/doc/psd/15.yacc/Makefile
@@ -0,0 +1,12 @@
+# @(#)Makefile 8.1 (Berkeley) 8/14/93
+# $FreeBSD$
+
+DIR= psd/15.yacc
+SRCS= ss.. ss0 ss1 ss2 ss3 ss4 ss5 ss6 ss7 ss8 ss9 ssA ssB ssa ssb ssc ssd
+MACROS= -msU
+REFER= refer -e -p /usr/old/dict/papers/Ind
+
+paper.ps: ${SRCS}
+ ${REFER} ${SRCS} | ${ROFF} > ${.TARGET}
+
+.include <bsd.doc.mk>
diff --git a/share/doc/psd/15.yacc/ss.. b/share/doc/psd/15.yacc/ss..
new file mode 100644
index 0000000..d9786b2
--- /dev/null
+++ b/share/doc/psd/15.yacc/ss..
@@ -0,0 +1,61 @@
+.\" This module is believed to contain source code proprietary to AT&T.
+.\" Use and redistribution is subject to the Berkeley Software License
+.\" Agreement and your Software Agreement with AT&T (Western Electric).
+.\"
+.\" @(#)ss.. 8.1 (Berkeley) 6/8/93
+.\"
+.\" $FreeBSD$
+.EH 'PSD:15-%''Yacc: Yet Another Compiler-Compiler'
+.OH 'Yacc: Yet Another Compiler-Compiler''PSD:15-%'
+.\".RP
+.ND "July 31, 1978"
+.TL
+Yacc:
+Yet Another Compiler-Compiler
+.AU "MH 2C-559" 3968
+Stephen C. Johnson
+.AI
+.MH
+.AB
+.PP
+Computer program input generally has some structure;
+in fact, every computer program that does input can be thought of as defining
+an ``input language'' which it accepts.
+An input language may be as complex as a programming language, or as simple as
+a sequence of numbers.
+Unfortunately, usual input facilities
+are limited, difficult to use,
+and often are lax about checking their inputs for validity.
+.PP
+Yacc provides a general tool for describing
+the input to a computer program.
+The Yacc user specifies the structures
+of his input, together with code to be invoked as
+each such structure is recognized.
+Yacc turns such a specification into a subroutine that
+handles the input process;
+frequently, it is convenient and appropriate to have most
+of the flow of control in the user's application
+handled by this subroutine.
+.PP
+The input subroutine produced by Yacc calls a user-supplied routine to
+return the next basic input item.
+Thus, the user can specify his input in terms of individual input characters, or
+in terms of higher level constructs such as names and numbers.
+The user-supplied routine may also handle idiomatic features such as
+comment and continuation conventions, which typically defy easy grammatical specification.
+.PP
+Yacc is written in portable C.
+The class of specifications accepted is a very general one: LALR(1)
+grammars with disambiguating rules.
+.PP
+In addition to compilers for C, APL, Pascal, RATFOR, etc., Yacc
+has also been used for less conventional languages,
+including a phototypesetter language, several desk calculator languages, a document retrieval system,
+and a Fortran debugging system.
+.AE
+.OK
+.\"Computer Languages
+.\"Compilers
+.\"Formal Language Theory
+.CS 23 11 34 0 0 8
diff --git a/share/doc/psd/15.yacc/ss0 b/share/doc/psd/15.yacc/ss0
new file mode 100644
index 0000000..aed127a
--- /dev/null
+++ b/share/doc/psd/15.yacc/ss0
@@ -0,0 +1,206 @@
+.\" This module is believed to contain source code proprietary to AT&T.
+.\" Use and redistribution is subject to the Berkeley Software License
+.\" Agreement and your Software Agreement with AT&T (Western Electric).
+.\"
+.\" @(#)ss0 8.1 (Berkeley) 6/8/93
+.\"
+.\" $FreeBSD$
+.SH
+0: Introduction
+.PP
+Yacc provides a general tool for imposing structure on the input to a computer program.
+The Yacc user prepares a
+specification of the input process; this includes rules
+describing the input structure, code to be invoked when these
+rules are recognized, and a low-level routine to do the
+basic input.
+Yacc then generates a function to control the input process.
+This function, called a
+.I parser ,
+calls the user-supplied low-level input routine
+(the
+.I "lexical analyzer" )
+to pick up the basic items
+(called
+.I tokens )
+from the input stream.
+These tokens are organized according to the input structure rules,
+called
+.I "grammar rules" \|;
+when one of these rules has been recognized,
+then user code supplied for this rule, an
+.I action ,
+is invoked; actions have the ability to return values and
+make use of the values of other actions.
+.PP
+Yacc is written in a portable dialect of C
+.[
+Ritchie Kernighan Language Prentice
+.]
+and the actions, and output subroutine, are in C as well.
+Moreover, many of the syntactic conventions of Yacc follow C.
+.PP
+The heart of the input specification is a collection of grammar rules.
+Each rule describes an allowable structure and gives it a name.
+For example, one grammar rule might be
+.DS
+date : month\_name day \',\' year ;
+.DE
+Here,
+.I date ,
+.I month\_name ,
+.I day ,
+and
+.I year
+represent structures of interest in the input process;
+presumably,
+.I month\_name ,
+.I day ,
+and
+.I year
+are defined elsewhere.
+The comma ``,'' is enclosed in single quotes; this implies that the
+comma is to appear literally in the input.
+The colon and semicolon merely serve as punctuation in the rule, and have
+no significance in controlling the input.
+Thus, with proper definitions, the input
+.DS
+July 4, 1776
+.DE
+might be matched by the above rule.
+.PP
+An important part of the input process is carried out by the
+lexical analyzer.
+This user routine reads the input stream, recognizing the lower level structures,
+and communicates these tokens
+to the parser.
+For historical reasons, a structure recognized by the lexical analyzer is called a
+.I "terminal symbol" ,
+while the structure recognized by the parser is called a
+.I "nonterminal symbol" .
+To avoid confusion, terminal symbols will usually be referred to as
+.I tokens .
+.PP
+There is considerable leeway in deciding whether to recognize structures using the lexical
+analyzer or grammar rules.
+For example, the rules
+.DS
+month\_name : \'J\' \'a\' \'n\' ;
+month\_name : \'F\' \'e\' \'b\' ;
+
+ . . .
+
+month\_name : \'D\' \'e\' \'c\' ;
+.DE
+might be used in the above example.
+The lexical analyzer would only need to recognize individual letters, and
+.I month\_name
+would be a nonterminal symbol.
+Such low-level rules tend to waste time and space, and may
+complicate the specification beyond Yacc's ability to deal with it.
+Usually, the lexical analyzer would
+recognize the month names,
+and return an indication that a
+.I month\_name
+was seen; in this case,
+.I month\_name
+would be a token.
+.PP
+Literal characters such as ``,'' must also be passed through the lexical
+analyzer, and are also considered tokens.
+.PP
+Specification files are very flexible.
+It is realively easy to add to the above example the rule
+.DS
+date : month \'/\' day \'/\' year ;
+.DE
+allowing
+.DS
+7 / 4 / 1776
+.DE
+as a synonym for
+.DS
+July 4, 1776
+.DE
+In most cases, this new rule could be ``slipped in'' to a working system with minimal effort,
+and little danger of disrupting existing input.
+.PP
+The input being read may not conform to the
+specifications.
+These input errors are detected as early as is theoretically possible with a
+left-to-right scan;
+thus, not only is the chance of reading and computing with bad
+input data substantially reduced, but the bad data can usually be quickly found.
+Error handling,
+provided as part of the input specifications,
+permits the reentry of bad data,
+or the continuation of the input process after skipping over the bad data.
+.PP
+In some cases, Yacc fails to produce a parser when given a set of
+specifications.
+For example, the specifications may be self contradictory, or they may
+require a more powerful recognition mechanism than that available to Yacc.
+The former cases represent design errors;
+the latter cases
+can often be corrected
+by making
+the lexical analyzer
+more powerful, or by rewriting some of the grammar rules.
+While Yacc cannot handle all possible specifications, its power
+compares favorably with similar systems;
+moreover, the
+constructions which are difficult for Yacc to handle are
+also frequently difficult for human beings to handle.
+Some users have reported that the discipline of formulating valid
+Yacc specifications for their input revealed errors of
+conception or design early in the program development.
+.PP
+The theory underlying Yacc has been described elsewhere.
+.[
+Aho Johnson Surveys LR Parsing
+.]
+.[
+Aho Johnson Ullman Ambiguous Grammars
+.]
+.[
+Aho Ullman Principles Compiler Design
+.]
+Yacc has been extensively used in numerous practical applications,
+including
+.I lint ,
+.[
+Johnson Lint
+.]
+the Portable C Compiler,
+.[
+Johnson Portable Compiler Theory
+.]
+and a system for typesetting mathematics.
+.[
+Kernighan Cherry typesetting system CACM
+.]
+.PP
+The next several sections describe the
+basic process of preparing a Yacc specification;
+Section 1 describes the preparation of grammar rules,
+Section 2 the preparation of the user supplied actions associated with these rules,
+and Section 3 the preparation of lexical analyzers.
+Section 4 describes the operation of the parser.
+Section 5 discusses various reasons why Yacc may be unable to produce a
+parser from a specification, and what to do about it.
+Section 6 describes a simple mechanism for
+handling operator precedences in arithmetic expressions.
+Section 7 discusses error detection and recovery.
+Section 8 discusses the operating environment and special features
+of the parsers Yacc produces.
+Section 9 gives some suggestions which should improve the
+style and efficiency of the specifications.
+Section 10 discusses some advanced topics, and Section 11 gives
+acknowledgements.
+Appendix A has a brief example, and Appendix B gives a
+summary of the Yacc input syntax.
+Appendix C gives an example using some of the more advanced
+features of Yacc, and, finally,
+Appendix D describes mechanisms and syntax
+no longer actively supported, but
+provided for historical continuity with older versions of Yacc.
diff --git a/share/doc/psd/15.yacc/ss1 b/share/doc/psd/15.yacc/ss1
new file mode 100644
index 0000000..d1c9832
--- /dev/null
+++ b/share/doc/psd/15.yacc/ss1
@@ -0,0 +1,143 @@
+.\" This module is believed to contain source code proprietary to AT&T.
+.\" Use and redistribution is subject to the Berkeley Software License
+.\" Agreement and your Software Agreement with AT&T (Western Electric).
+.\"
+.\" @(#)ss1 8.1 (Berkeley) 6/8/93
+.\"
+.\" $FreeBSD$
+.tr *\(**
+.tr |\(or
+.SH
+1: Basic Specifications
+.PP
+Names refer to either tokens or nonterminal symbols.
+Yacc requires
+token names to be declared as such.
+In addition, for reasons discussed in Section 3, it is often desirable
+to include the lexical analyzer as part of the specification file;
+it may be useful to include other programs as well.
+Thus, every specification file consists of three sections:
+the
+.I declarations ,
+.I "(grammar) rules" ,
+and
+.I programs .
+The sections are separated by double percent ``%%'' marks.
+(The percent ``%'' is generally used in Yacc specifications as an escape character.)
+.PP
+In other words, a full specification file looks like
+.DS
+declarations
+%%
+rules
+%%
+programs
+.DE
+.PP
+The declaration section may be empty.
+Moreover, if the programs section is omitted, the second %% mark may be omitted also;
+thus, the smallest legal Yacc specification is
+.DS
+%%
+rules
+.DE
+.PP
+Blanks, tabs, and newlines are ignored except
+that they may not appear in names or multi-character reserved symbols.
+Comments may appear wherever a name is legal; they are enclosed
+in /* . . . */, as in C and PL/I.
+.PP
+The rules section is made up of one or more grammar rules.
+A grammar rule has the form:
+.DS
+A : BODY ;
+.DE
+A represents a nonterminal name, and BODY represents a sequence of zero or more names and literals.
+The colon and the semicolon are Yacc punctuation.
+.PP
+Names may be of arbitrary length, and may be made up of letters, dot ``.'', underscore ``\_'', and
+non-initial digits.
+Upper and lower case letters are distinct.
+The names used in the body of a grammar rule may represent tokens or nonterminal symbols.
+.PP
+A literal consists of a character enclosed in single quotes ``\'''.
+As in C, the backslash ``\e'' is an escape character within literals, and all the C escapes
+are recognized.
+Thus
+.DS
+\'\en\' newline
+\'\er\' return
+\'\e\'\' single quote ``\'''
+\'\e\e\' backslash ``\e''
+\'\et\' tab
+\'\eb\' backspace
+\'\ef\' form feed
+\'\exxx\' ``xxx'' in octal
+.DE
+For a number of technical reasons, the
+\s-2NUL\s0
+character (\'\e0\' or 0) should never
+be used in grammar rules.
+.PP
+If there are several grammar rules with the same left hand side, the vertical bar ``|''
+can be used to avoid rewriting the left hand side.
+In addition,
+the semicolon at the end of a rule can be dropped before a vertical bar.
+Thus the grammar rules
+.DS
+A : B C D ;
+A : E F ;
+A : G ;
+.DE
+can be given to Yacc as
+.DS
+A : B C D
+ | E F
+ | G
+ ;
+.DE
+It is not necessary that all grammar rules with the same left side appear together in the grammar rules section,
+although it makes the input much more readable, and easier to change.
+.PP
+If a nonterminal symbol matches the empty string, this can be indicated in the obvious way:
+.DS
+empty : ;
+.DE
+.PP
+Names representing tokens must be declared; this is most simply done by writing
+.DS
+%token name1 name2 . . .
+.DE
+in the declarations section.
+(See Sections 3 , 5, and 6 for much more discussion).
+Every name not defined in the declarations section is assumed to represent a nonterminal symbol.
+Every nonterminal symbol must appear on the left side of at least one rule.
+.PP
+Of all the nonterminal symbols, one, called the
+.I "start symbol" ,
+has particular importance.
+The parser is designed to recognize the start symbol; thus,
+this symbol represents the largest,
+most general structure described by the grammar rules.
+By default,
+the start symbol is taken to be the left hand side of the first
+grammar rule in the rules section.
+It is possible, and in fact desirable, to declare the start
+symbol explicitly in the declarations section using the %start keyword:
+.DS
+%start symbol
+.DE
+.PP
+The end of the input to the parser is signaled by a special token, called the
+.I endmarker .
+If the tokens up to, but not including, the endmarker form a structure
+which matches the start symbol, the parser function returns to its caller
+after the endmarker is seen; it
+.I accepts
+the input.
+If the endmarker is seen in any other context, it is an error.
+.PP
+It is the job of the user-supplied lexical analyzer
+to return the endmarker when appropriate; see section 3, below.
+Usually the endmarker represents some reasonably obvious
+I/O status, such as ``end-of-file'' or ``end-of-record''.
diff --git a/share/doc/psd/15.yacc/ss2 b/share/doc/psd/15.yacc/ss2
new file mode 100644
index 0000000..929e4a6
--- /dev/null
+++ b/share/doc/psd/15.yacc/ss2
@@ -0,0 +1,156 @@
+.\" This module is believed to contain source code proprietary to AT&T.
+.\" Use and redistribution is subject to the Berkeley Software License
+.\" Agreement and your Software Agreement with AT&T (Western Electric).
+.\"
+.\" @(#)ss2 8.1 (Berkeley) 6/8/93
+.\"
+.\" $FreeBSD$
+.SH
+2: Actions
+.PP
+With each grammar rule, the user may associate actions to be performed each time
+the rule is recognized in the input process.
+These actions may return values, and may obtain the values returned by previous
+actions.
+Moreover, the lexical analyzer can return values
+for tokens, if desired.
+.PP
+An action is an arbitrary C statement, and as such can do
+input and output, call subprograms, and alter
+external vectors and variables.
+An action is specified by
+one or more statements, enclosed in curly braces ``{'' and ``}''.
+For example,
+.DS
+A : \'(\' B \')\'
+ { hello( 1, "abc" ); }
+.DE
+and
+.DS
+XXX : YYY ZZZ
+ { printf("a message\en");
+ flag = 25; }
+.DE
+are grammar rules with actions.
+.PP
+To facilitate easy communication between the actions and the parser, the action statements are altered
+slightly.
+The symbol ``dollar sign'' ``$'' is used as a signal to Yacc in this context.
+.PP
+To return a value, the action normally sets the
+pseudo-variable ``$$'' to some value.
+For example, an action that does nothing but return the value 1 is
+.DS
+ { $$ = 1; }
+.DE
+.PP
+To obtain the values returned by previous actions and the lexical analyzer, the
+action may use the pseudo-variables $1, $2, . . .,
+which refer to the values returned by the
+components of the right side of a rule, reading from left to right.
+Thus, if the rule is
+.DS
+A : B C D ;
+.DE
+for example, then $2 has the value returned by C, and $3 the value returned by D.
+.PP
+As a more concrete example, consider the rule
+.DS
+expr : \'(\' expr \')\' ;
+.DE
+The value returned by this rule is usually the value of the
+.I expr
+in parentheses.
+This can be indicated by
+.DS
+expr : \'(\' expr \')\' { $$ = $2 ; }
+.DE
+.PP
+By default, the value of a rule is the value of the first element in it ($1).
+Thus, grammar rules of the form
+.DS
+A : B ;
+.DE
+frequently need not have an explicit action.
+.PP
+In the examples above, all the actions came at the end of their rules.
+Sometimes, it is desirable to get control before a rule is fully parsed.
+Yacc permits an action to be written in the middle of a rule as well
+as at the end.
+This rule is assumed to return a value, accessible
+through the usual \$ mechanism by the actions to
+the right of it.
+In turn, it may access the values
+returned by the symbols to its left.
+Thus, in the rule
+.DS
+A : B
+ { $$ = 1; }
+ C
+ { x = $2; y = $3; }
+ ;
+.DE
+the effect is to set
+.I x
+to 1, and
+.I y
+to the value returned by C.
+.PP
+Actions that do not terminate a rule are actually
+handled by Yacc by manufacturing a new nonterminal
+symbol name, and a new rule matching this
+name to the empty string.
+The interior action is the action triggered off by recognizing
+this added rule.
+Yacc actually treats the above example as if
+it had been written:
+.DS
+$ACT : /* empty */
+ { $$ = 1; }
+ ;
+
+A : B $ACT C
+ { x = $2; y = $3; }
+ ;
+.DE
+.PP
+In many applications, output is not done directly by the actions;
+rather, a data structure, such as a parse tree, is constructed in memory,
+and transformations are applied to it before output is generated.
+Parse trees are particularly easy to
+construct, given routines to build and maintain the tree
+structure desired.
+For example, suppose there is a C function
+.I node ,
+written so that the call
+.DS
+node( L, n1, n2 )
+.DE
+creates a node with label L, and descendants n1 and n2, and returns the index of
+the newly created node.
+Then parse tree can be built by supplying actions such as:
+.DS
+expr : expr \'+\' expr
+ { $$ = node( \'+\', $1, $3 ); }
+.DE
+in the specification.
+.PP
+The user may define other variables to be used by the actions.
+Declarations and definitions can appear in
+the declarations section,
+enclosed in the marks ``%{'' and ``%}''.
+These declarations and definitions have global scope,
+so they are known to the action statements and the lexical analyzer.
+For example,
+.DS
+%{ int variable = 0; %}
+.DE
+could be placed in the declarations section,
+making
+.I variable
+accessible to all of the actions.
+The Yacc parser uses only names beginning in ``yy'';
+the user should avoid such names.
+.PP
+In these examples, all the values are integers: a discussion of
+values of other types will be found in Section 10.
diff --git a/share/doc/psd/15.yacc/ss3 b/share/doc/psd/15.yacc/ss3
new file mode 100644
index 0000000..d520a34
--- /dev/null
+++ b/share/doc/psd/15.yacc/ss3
@@ -0,0 +1,109 @@
+.\" This module is believed to contain source code proprietary to AT&T.
+.\" Use and redistribution is subject to the Berkeley Software License
+.\" Agreement and your Software Agreement with AT&T (Western Electric).
+.\"
+.\" @(#)ss3 8.1 (Berkeley) 6/8/93
+.\"
+.\" $FreeBSD$
+.SH
+3: Lexical Analysis
+.PP
+The user must supply a lexical analyzer to read the input stream and communicate tokens
+(with values, if desired) to the parser.
+The lexical analyzer is an integer-valued function called
+.I yylex .
+The function returns an integer, the
+.I "token number" ,
+representing the kind of token read.
+If there is a value associated with that token, it should be assigned
+to the external variable
+.I yylval .
+.PP
+The parser and the lexical analyzer must agree on these token numbers in order for
+communication between them to take place.
+The numbers may be chosen by Yacc, or chosen by the user.
+In either case, the ``# define'' mechanism of C is used to allow the lexical analyzer
+to return these numbers symbolically.
+For example, suppose that the token name DIGIT has been defined in the declarations section of the
+Yacc specification file.
+The relevant portion of the lexical analyzer might look like:
+.DS
+yylex(){
+ extern int yylval;
+ int c;
+ . . .
+ c = getchar();
+ . . .
+ switch( c ) {
+ . . .
+ case \'0\':
+ case \'1\':
+ . . .
+ case \'9\':
+ yylval = c\-\'0\';
+ return( DIGIT );
+ . . .
+ }
+ . . .
+.DE
+.PP
+The intent is to return a token number of DIGIT, and a value equal to the numerical value of the
+digit.
+Provided that the lexical analyzer code is placed in the programs section of the specification file,
+the identifier DIGIT will be defined as the token number associated
+with the token DIGIT.
+.PP
+This mechanism leads to clear,
+easily modified lexical analyzers; the only pitfall is the need
+to avoid using any token names in the grammar that are reserved
+or significant in C or the parser; for example, the use of
+token names
+.I if
+or
+.I while
+will almost certainly cause severe
+difficulties when the lexical analyzer is compiled.
+The token name
+.I error
+is reserved for error handling, and should not be used naively
+(see Section 7).
+.PP
+As mentioned above, the token numbers may be chosen by Yacc or by the user.
+In the default situation, the numbers are chosen by Yacc.
+The default token number for a literal
+character is the numerical value of the character in the local character set.
+Other names are assigned token numbers
+starting at 257.
+.PP
+To assign a token number to a token (including literals),
+the first appearance of the token name or literal
+.I
+in the declarations section
+.R
+can be immediately followed by
+a nonnegative integer.
+This integer is taken to be the token number of the name or literal.
+Names and literals not defined by this mechanism retain their default definition.
+It is important that all token numbers be distinct.
+.PP
+For historical reasons, the endmarker must have token
+number 0 or negative.
+This token number cannot be redefined by the user; thus, all
+lexical analyzers should be prepared to return 0 or negative as a token number
+upon reaching the end of their input.
+.PP
+A very useful tool for constructing lexical analyzers is
+the
+.I Lex
+program developed by Mike Lesk.
+.[
+Lesk Lex
+.]
+These lexical analyzers are designed to work in close
+harmony with Yacc parsers.
+The specifications for these lexical analyzers
+use regular expressions instead of grammar rules.
+Lex can be easily used to produce quite complicated lexical analyzers,
+but there remain some languages (such as FORTRAN) which do not
+fit any theoretical framework, and whose lexical analyzers
+must be crafted by hand.
diff --git a/share/doc/psd/15.yacc/ss4 b/share/doc/psd/15.yacc/ss4
new file mode 100644
index 0000000..3d6066e
--- /dev/null
+++ b/share/doc/psd/15.yacc/ss4
@@ -0,0 +1,335 @@
+.\" This module is believed to contain source code proprietary to AT&T.
+.\" Use and redistribution is subject to the Berkeley Software License
+.\" Agreement and your Software Agreement with AT&T (Western Electric).
+.\"
+.\" @(#)ss4 8.1 (Berkeley) 6/8/93
+.\"
+.\" $FreeBSD$
+.SH
+4: How the Parser Works
+.PP
+Yacc turns the specification file into a C program, which
+parses the input according to the specification given.
+The algorithm used to go from the
+specification to the parser is complex, and will not be discussed
+here (see
+the references for more information).
+The parser itself, however, is relatively simple,
+and understanding how it works, while
+not strictly necessary, will nevertheless make
+treatment of error recovery and ambiguities much more
+comprehensible.
+.PP
+The parser produced by Yacc consists
+of a finite state machine with a stack.
+The parser is also capable of reading and remembering the next
+input token (called the
+.I lookahead
+token).
+The
+.I "current state"
+is always the one on the top of the stack.
+The states of the finite state machine are given
+small integer labels; initially, the machine is in state 0,
+the stack contains only state 0, and no lookahead token has been read.
+.PP
+The machine has only four actions available to it, called
+.I shift ,
+.I reduce ,
+.I accept ,
+and
+.I error .
+A move of the parser is done as follows:
+.IP 1.
+Based on its current state, the parser decides
+whether it needs a lookahead token to decide
+what action should be done; if it needs one, and does
+not have one, it calls
+.I yylex
+to obtain the next token.
+.IP 2.
+Using the current state, and the lookahead token
+if needed, the parser decides on its next action, and
+carries it out.
+This may result in states being pushed onto the stack, or popped off of
+the stack, and in the lookahead token being processed
+or left alone.
+.PP
+The
+.I shift
+action is the most common action the parser takes.
+Whenever a shift action is taken, there is always
+a lookahead token.
+For example, in state 56 there may be
+an action:
+.DS
+ IF shift 34
+.DE
+which says, in state 56, if the lookahead token is IF,
+the current state (56) is pushed down on the stack,
+and state 34 becomes the current state (on the
+top of the stack).
+The lookahead token is cleared.
+.PP
+The
+.I reduce
+action keeps the stack from growing without
+bounds.
+Reduce actions are appropriate when the parser has seen
+the right hand side of a grammar rule,
+and is prepared to announce that it has seen
+an instance of the rule, replacing the right hand side
+by the left hand side.
+It may be necessary to consult the lookahead token
+to decide whether to reduce, but
+usually it is not; in fact, the default
+action (represented by a ``.'') is often a reduce action.
+.PP
+Reduce actions are associated with individual grammar rules.
+Grammar rules are also given small integer
+numbers, leading to some confusion.
+The action
+.DS
+ \fB.\fR reduce 18
+.DE
+refers to
+.I "grammar rule"
+18, while the action
+.DS
+ IF shift 34
+.DE
+refers to
+.I state
+34.
+.PP
+Suppose the rule being reduced is
+.DS
+A \fB:\fR x y z ;
+.DE
+The reduce action depends on the
+left hand symbol (A in this case), and the number of
+symbols on the right hand side (three in this case).
+To reduce, first pop off the top three states
+from the stack
+(In general, the number of states popped equals the number of symbols on the
+right side of the rule).
+In effect, these states were the ones
+put on the stack while recognizing
+.I x ,
+.I y ,
+and
+.I z ,
+and no longer serve any useful purpose.
+After popping these states, a state is uncovered
+which was the state the parser was in before beginning to
+process the rule.
+Using this uncovered state, and the symbol
+on the left side of the rule, perform what is in
+effect a shift of A.
+A new state is obtained, pushed onto the stack, and parsing continues.
+There are significant differences between the processing of
+the left hand symbol and an ordinary shift of a token,
+however, so this action is called a
+.I goto
+action.
+In particular, the lookahead token is cleared by a shift, and
+is not affected by a goto.
+In any case, the uncovered state contains an entry such as:
+.DS
+ A goto 20
+.DE
+causing state 20 to be pushed
+onto the stack, and become the current state.
+.PP
+In effect, the reduce action ``turns back the clock'' in the parse,
+popping the states off the stack to go back to the
+state where the right hand side of the rule was first seen.
+The parser then behaves as if it had seen the left side at that time.
+If the right hand side of the rule is empty,
+no states are popped off of the stack: the uncovered state
+is in fact the current state.
+.PP
+The reduce action is also important in the treatment of user-supplied
+actions and values.
+When a rule is reduced, the code supplied with the rule is executed
+before the stack is adjusted.
+In addition to the stack holding the states, another stack,
+running in parallel with it, holds the values returned
+from the lexical analyzer and the actions.
+When a shift takes place, the external variable
+.I yylval
+is copied onto the value stack.
+After the return from the user code, the reduction is carried out.
+When the
+.I goto
+action is done, the external variable
+.I yyval
+is copied onto the value stack.
+The pseudo-variables $1, $2, etc., refer to the value stack.
+.PP
+The other two parser actions are conceptually much simpler.
+The
+.I accept
+action indicates that the entire input has been seen and
+that it matches the specification.
+This action appears only when the lookahead token is
+the endmarker, and indicates that the parser has successfully
+done its job.
+The
+.I error
+action, on the other hand, represents a place where the parser
+can no longer continue parsing according to the specification.
+The input tokens it has seen, together with the lookahead token,
+cannot be followed by anything that would result
+in a legal input.
+The parser reports an error, and attempts to recover the situation and
+resume parsing: the error recovery (as opposed to the detection of error)
+will be covered in Section 7.
+.PP
+It is time for an example!
+Consider the specification
+.DS
+%token DING DONG DELL
+%%
+rhyme : sound place
+ ;
+sound : DING DONG
+ ;
+place : DELL
+ ;
+.DE
+.PP
+When Yacc is invoked with the
+.B \-v
+option, a file called
+.I y.output
+is produced, with a human-readable description of the parser.
+The
+.I y.output
+file corresponding to the above grammar (with some statistics
+stripped off the end) is:
+.DS
+state 0
+ $accept : \_rhyme $end
+
+ DING shift 3
+ . error
+
+ rhyme goto 1
+ sound goto 2
+
+state 1
+ $accept : rhyme\_$end
+
+ $end accept
+ . error
+
+state 2
+ rhyme : sound\_place
+
+ DELL shift 5
+ . error
+
+ place goto 4
+
+state 3
+ sound : DING\_DONG
+
+ DONG shift 6
+ . error
+
+state 4
+ rhyme : sound place\_ (1)
+
+ . reduce 1
+
+state 5
+ place : DELL\_ (3)
+
+ . reduce 3
+
+state 6
+ sound : DING DONG\_ (2)
+
+ . reduce 2
+.DE
+Notice that, in addition to the actions for each state, there is a
+description of the parsing rules being processed in each
+state. The \_ character is used to indicate
+what has been seen, and what is yet to come, in each rule.
+Suppose the input is
+.DS
+DING DONG DELL
+.DE
+It is instructive to follow the steps of the parser while
+processing this input.
+.PP
+Initially, the current state is state 0.
+The parser needs to refer to the input in order to decide
+between the actions available in state 0, so
+the first token,
+.I DING ,
+is read, becoming the lookahead token.
+The action in state 0 on
+.I DING
+is
+is ``shift 3'', so state 3 is pushed onto the stack,
+and the lookahead token is cleared.
+State 3 becomes the current state.
+The next token,
+.I DONG ,
+is read, becoming the lookahead token.
+The action in state 3 on the token
+.I DONG
+is ``shift 6'',
+so state 6 is pushed onto the stack, and the lookahead is cleared.
+The stack now contains 0, 3, and 6.
+In state 6, without even consulting the lookahead,
+the parser reduces by rule 2.
+.DS
+ sound : DING DONG
+.DE
+This rule has two symbols on the right hand side, so
+two states, 6 and 3, are popped off of the stack, uncovering state 0.
+Consulting the description of state 0, looking for a goto on
+.I sound ,
+.DS
+ sound goto 2
+.DE
+is obtained; thus state 2 is pushed onto the stack,
+becoming the current state.
+.PP
+In state 2, the next token,
+.I DELL ,
+must be read.
+The action is ``shift 5'', so state 5 is pushed onto the stack, which now has
+0, 2, and 5 on it, and the lookahead token is cleared.
+In state 5, the only action is to reduce by rule 3.
+This has one symbol on the right hand side, so one state, 5,
+is popped off, and state 2 is uncovered.
+The goto in state 2 on
+.I place ,
+the left side of rule 3,
+is state 4.
+Now, the stack contains 0, 2, and 4.
+In state 4, the only action is to reduce by rule 1.
+There are two symbols on the right, so the top two states are popped off,
+uncovering state 0 again.
+In state 0, there is a goto on
+.I rhyme
+causing the parser to enter state 1.
+In state 1, the input is read; the endmarker is obtained,
+indicated by ``$end'' in the
+.I y.output
+file.
+The action in state 1 when the endmarker is seen is to accept,
+successfully ending the parse.
+.PP
+The reader is urged to consider how the parser works
+when confronted with such incorrect strings as
+.I "DING DONG DONG" ,
+.I "DING DONG" ,
+.I "DING DONG DELL DELL" ,
+etc.
+A few minutes spend with this and other simple examples will
+probably be repaid when problems arise in more complicated contexts.
diff --git a/share/doc/psd/15.yacc/ss5 b/share/doc/psd/15.yacc/ss5
new file mode 100644
index 0000000..df9cae0
--- /dev/null
+++ b/share/doc/psd/15.yacc/ss5
@@ -0,0 +1,307 @@
+.\" This module is believed to contain source code proprietary to AT&T.
+.\" Use and redistribution is subject to the Berkeley Software License
+.\" Agreement and your Software Agreement with AT&T (Western Electric).
+.\"
+.\" @(#)ss5 8.1 (Berkeley) 6/8/93
+.\"
+.\" $FreeBSD$
+.SH
+5: Ambiguity and Conflicts
+.PP
+A set of grammar rules is
+.I ambiguous
+if there is some input string that can be structured in two or more different ways.
+For example, the grammar rule
+.DS
+expr : expr \'\-\' expr
+.DE
+is a natural way of expressing the fact that one way of forming an arithmetic expression
+is to put two other expressions together with a minus sign between them.
+Unfortunately, this grammar rule does not
+completely specify the way that all complex inputs
+should be structured.
+For example, if the input is
+.DS
+expr \- expr \- expr
+.DE
+the rule allows this input to be structured as either
+.DS
+( expr \- expr ) \- expr
+.DE
+or as
+.DS
+expr \- ( expr \- expr )
+.DE
+(The first is called
+.I "left association" ,
+the second
+.I "right association" ).
+.PP
+Yacc detects such ambiguities when it is attempting to build the parser.
+It is instructive to consider the problem that confronts the parser when it is
+given an input such as
+.DS
+expr \- expr \- expr
+.DE
+When the parser has read the second expr, the input that it has seen:
+.DS
+expr \- expr
+.DE
+matches the right side of the grammar rule above.
+The parser could
+.I reduce
+the input by applying this rule;
+after applying the rule;
+the input is reduced to
+.I expr (the
+left side of the rule).
+The parser would then read the final part of the input:
+.DS
+\- expr
+.DE
+and again reduce.
+The effect of this is to take the left associative interpretation.
+.PP
+Alternatively, when the parser has seen
+.DS
+expr \- expr
+.DE
+it could defer the immediate application of the rule, and continue reading
+the input until it had seen
+.DS
+expr \- expr \- expr
+.DE
+It could then apply the rule to the rightmost three symbols, reducing them to
+.I expr
+and leaving
+.DS
+expr \- expr
+.DE
+Now the rule can be reduced once more; the effect is to
+take the right associative interpretation.
+Thus, having read
+.DS
+expr \- expr
+.DE
+the parser can do two legal things, a shift or a reduction, and has no way of
+deciding between them.
+This is called a
+.I "shift / reduce conflict" .
+It may also happen that the parser has a choice of two legal reductions;
+this is called a
+.I "reduce / reduce conflict" .
+Note that there are never any ``Shift/shift'' conflicts.
+.PP
+When there are shift/reduce or reduce/reduce conflicts, Yacc still produces a parser.
+It does this by selecting one of the valid steps wherever it has a choice.
+A rule describing which choice to make in a given situation is called
+a
+.I "disambiguating rule" .
+.PP
+Yacc invokes two disambiguating rules by default:
+.IP 1.
+In a shift/reduce conflict, the default is to do the shift.
+.IP 2.
+In a reduce/reduce conflict, the default is to reduce by the
+.I earlier
+grammar rule (in the input sequence).
+.PP
+Rule 1 implies that reductions are deferred whenever there is a choice,
+in favor of shifts.
+Rule 2 gives the user rather crude control over the behavior of the parser
+in this situation, but reduce/reduce conflicts should be avoided whenever possible.
+.PP
+Conflicts may arise because of mistakes in input or logic, or because the grammar rules, while consistent,
+require a more complex parser than Yacc can construct.
+The use of actions within rules can also cause conflicts, if the action must
+be done before the parser can be sure which rule is being recognized.
+In these cases, the application of disambiguating rules is inappropriate,
+and leads to an incorrect parser.
+For this reason, Yacc
+always reports the number of shift/reduce and reduce/reduce conflicts resolved by Rule 1 and Rule 2.
+.PP
+In general, whenever it is possible to apply disambiguating rules to produce a correct parser, it is also
+possible to rewrite the grammar rules so that the same inputs are read but there are no
+conflicts.
+For this reason, most previous parser generators
+have considered conflicts to be fatal errors.
+Our experience has suggested that this rewriting is somewhat unnatural,
+and produces slower parsers; thus, Yacc will produce parsers even in the presence of conflicts.
+.PP
+As an example of the power of disambiguating rules, consider a fragment from a programming
+language involving an ``if-then-else'' construction:
+.DS
+stat : IF \'(\' cond \')\' stat
+ | IF \'(\' cond \')\' stat ELSE stat
+ ;
+.DE
+In these rules,
+.I IF
+and
+.I ELSE
+are tokens,
+.I cond
+is a nonterminal symbol describing
+conditional (logical) expressions, and
+.I stat
+is a nonterminal symbol describing statements.
+The first rule will be called the
+.ul
+simple-if
+rule, and the
+second the
+.ul
+if-else
+rule.
+.PP
+These two rules form an ambiguous construction, since input of the form
+.DS
+IF ( C1 ) IF ( C2 ) S1 ELSE S2
+.DE
+can be structured according to these rules in two ways:
+.DS
+IF ( C1 ) {
+ IF ( C2 ) S1
+ }
+ELSE S2
+.DE
+or
+.DS
+IF ( C1 ) {
+ IF ( C2 ) S1
+ ELSE S2
+ }
+.DE
+The second interpretation is the one given in most programming languages
+having this construct.
+Each
+.I ELSE
+is associated with the last preceding
+``un-\fIELSE'\fRd''
+.I IF .
+In this example, consider the situation where the parser has seen
+.DS
+IF ( C1 ) IF ( C2 ) S1
+.DE
+and is looking at the
+.I ELSE .
+It can immediately
+reduce
+by the simple-if rule to get
+.DS
+IF ( C1 ) stat
+.DE
+and then read the remaining input,
+.DS
+ELSE S2
+.DE
+and reduce
+.DS
+IF ( C1 ) stat ELSE S2
+.DE
+by the if-else rule.
+This leads to the first of the above groupings of the input.
+.PP
+On the other hand, the
+.I ELSE
+may be shifted,
+.I S2
+read, and then the right hand portion of
+.DS
+IF ( C1 ) IF ( C2 ) S1 ELSE S2
+.DE
+can be reduced by the if-else rule to get
+.DS
+IF ( C1 ) stat
+.DE
+which can be reduced by the simple-if rule.
+This leads to the second of the above groupings of the input, which
+is usually desired.
+.PP
+Once again the parser can do two valid things \- there is a shift/reduce conflict.
+The application of disambiguating rule 1 tells the parser to shift in this case,
+which leads to the desired grouping.
+.PP
+This shift/reduce conflict arises only when there is a particular current input symbol,
+.I ELSE ,
+and particular inputs already seen, such as
+.DS
+IF ( C1 ) IF ( C2 ) S1
+.DE
+In general, there may be many conflicts, and each one
+will be associated with an input symbol and
+a set of previously read inputs.
+The previously read inputs are characterized by the
+state
+of the parser.
+.PP
+The conflict messages of Yacc are best understood
+by examining the verbose (\fB\-v\fR) option output file.
+For example, the output corresponding to the above
+conflict state might be:
+.DS L
+23: shift/reduce conflict (shift 45, reduce 18) on ELSE
+
+state 23
+
+ stat : IF ( cond ) stat\_ (18)
+ stat : IF ( cond ) stat\_ELSE stat
+
+ ELSE shift 45
+ . reduce 18
+
+.DE
+The first line describes the conflict, giving the state and the input symbol.
+The ordinary state description follows, giving
+the grammar rules active in the state, and the parser actions.
+Recall that the underline marks the
+portion of the grammar rules which has been seen.
+Thus in the example, in state 23 the parser has seen input corresponding
+to
+.DS
+IF ( cond ) stat
+.DE
+and the two grammar rules shown are active at this time.
+The parser can do two possible things.
+If the input symbol is
+.I ELSE ,
+it is possible to shift into state
+45.
+State 45 will have, as part of its description, the line
+.DS
+stat : IF ( cond ) stat ELSE\_stat
+.DE
+since the
+.I ELSE
+will have been shifted in this state.
+Back in state 23, the alternative action, described by ``\fB.\fR'',
+is to be done if the input symbol is not mentioned explicitly in the above actions; thus,
+in this case, if the input symbol is not
+.I ELSE ,
+the parser reduces by grammar rule 18:
+.DS
+stat : IF \'(\' cond \')\' stat
+.DE
+Once again, notice that the numbers following ``shift'' commands refer to other states,
+while the numbers following ``reduce'' commands refer to grammar
+rule numbers.
+In the
+.I y.output
+file, the rule numbers are printed after those rules which can be reduced.
+In most one states, there will be at most reduce action possible in the
+state, and this will be the default command.
+The user who encounters unexpected shift/reduce conflicts will probably want to
+look at the verbose output to decide whether the default actions are appropriate.
+In really tough cases, the user might need to know more about
+the behavior and construction of the parser than can be covered here.
+In this case, one of the theoretical references
+.[
+Aho Johnson Surveys Parsing
+.]
+.[
+Aho Johnson Ullman Deterministic Ambiguous
+.]
+.[
+Aho Ullman Principles Design
+.]
+might be consulted; the services of a local guru might also be appropriate.
diff --git a/share/doc/psd/15.yacc/ss6 b/share/doc/psd/15.yacc/ss6
new file mode 100644
index 0000000..1000034
--- /dev/null
+++ b/share/doc/psd/15.yacc/ss6
@@ -0,0 +1,151 @@
+.\" This module is believed to contain source code proprietary to AT&T.
+.\" Use and redistribution is subject to the Berkeley Software License
+.\" Agreement and your Software Agreement with AT&T (Western Electric).
+.\"
+.\" @(#)ss6 8.1 (Berkeley) 6/8/93
+.\"
+.\" $FreeBSD$
+.SH
+6: Precedence
+.PP
+There is one common situation
+where the rules given above for resolving conflicts are not sufficient;
+this is in the parsing of arithmetic expressions.
+Most of the commonly used constructions for arithmetic expressions can be naturally
+described by the notion of
+.I precedence
+levels for operators, together with information about left
+or right associativity.
+It turns out that ambiguous grammars with appropriate disambiguating rules
+can be used to create parsers that are faster and easier to
+write than parsers constructed from unambiguous grammars.
+The basic notion is to write grammar rules
+of the form
+.DS
+expr : expr OP expr
+.DE
+and
+.DS
+expr : UNARY expr
+.DE
+for all binary and unary operators desired.
+This creates a very ambiguous grammar, with many parsing conflicts.
+As disambiguating rules, the user specifies the precedence, or binding
+strength, of all the operators, and the associativity
+of the binary operators.
+This information is sufficient to allow Yacc to resolve the parsing conflicts
+in accordance with these rules, and construct a parser that realizes the desired
+precedences and associativities.
+.PP
+The precedences and associativities are attached to tokens in the declarations section.
+This is done by a series of lines beginning with a Yacc keyword: %left, %right,
+or %nonassoc, followed by a list of tokens.
+All of the tokens on the same line are assumed to have the same precedence level
+and associativity; the lines are listed in
+order of increasing precedence or binding strength.
+Thus,
+.DS
+%left \'+\' \'\-\'
+%left \'*\' \'/\'
+.DE
+describes the precedence and associativity of the four arithmetic operators.
+Plus and minus are left associative, and have lower precedence than
+star and slash, which are also left associative.
+The keyword %right is used to describe right associative operators,
+and the keyword %nonassoc is used to describe operators, like
+the operator .LT. in Fortran, that may not associate with themselves; thus,
+.DS
+A .LT. B .LT. C
+.DE
+is illegal in Fortran, and such an operator would be described with the keyword
+%nonassoc in Yacc.
+As an example of the behavior of these declarations, the description
+.DS
+%right \'=\'
+%left \'+\' \'\-\'
+%left \'*\' \'/\'
+
+%%
+
+expr : expr \'=\' expr
+ | expr \'+\' expr
+ | expr \'\-\' expr
+ | expr \'*\' expr
+ | expr \'/\' expr
+ | NAME
+ ;
+.DE
+might be used to structure the input
+.DS
+a = b = c*d \- e \- f*g
+.DE
+as follows:
+.DS
+a = ( b = ( ((c*d)\-e) \- (f*g) ) )
+.DE
+When this mechanism is used,
+unary operators must, in general, be given a precedence.
+Sometimes a unary operator and a binary operator
+have the same symbolic representation, but different precedences.
+An example is unary and binary \'\-\'; unary minus may be given the same
+strength as multiplication, or even higher, while binary minus has a lower strength than
+multiplication.
+The keyword, %prec, changes the precedence level associated with a particular grammar rule.
+%prec appears immediately after the body of the grammar rule, before the action or closing semicolon,
+and is followed by a token name or literal.
+It
+causes the precedence of the grammar rule to become that of the following token name or literal.
+For example, to make unary minus have the same precedence as multiplication the rules might resemble:
+.DS
+%left \'+\' \'\-\'
+%left \'*\' \'/\'
+
+%%
+
+expr : expr \'+\' expr
+ | expr \'\-\' expr
+ | expr \'*\' expr
+ | expr \'/\' expr
+ | \'\-\' expr %prec \'*\'
+ | NAME
+ ;
+.DE
+.PP
+A token declared
+by %left, %right, and %nonassoc need not be, but may be, declared by %token as well.
+.PP
+The precedences and associativities are used by Yacc to
+resolve parsing conflicts; they give rise to disambiguating rules.
+Formally, the rules work as follows:
+.IP 1.
+The precedences and associativities are recorded for those tokens and literals
+that have them.
+.IP 2.
+A precedence and associativity is associated with each grammar rule; it is the precedence
+and associativity of the last token or literal in the body of the rule.
+If the %prec construction is used, it overrides this default.
+Some grammar rules may have no precedence and associativity associated with them.
+.IP 3.
+When there is a reduce/reduce conflict, or there is a shift/reduce conflict
+and either the input symbol or the grammar rule has no precedence and associativity,
+then the two disambiguating rules given at the beginning of the section are used,
+and the conflicts are reported.
+.IP 4.
+If there is a shift/reduce conflict, and both the grammar rule and the input character
+have precedence and associativity associated with them, then the conflict is resolved
+in favor of the action (shift or reduce) associated with the higher precedence.
+If the precedences are the same, then the associativity is used; left
+associative implies reduce, right associative implies shift, and nonassociating
+implies error.
+.PP
+Conflicts resolved by precedence are not counted in the number of shift/reduce and reduce/reduce
+conflicts reported by Yacc.
+This means that mistakes in the specification of precedences may
+disguise errors in the input grammar; it is a good idea to be sparing
+with precedences, and use them in an essentially ``cookbook'' fashion,
+until some experience has been gained.
+The
+.I y.output
+file
+is very useful in deciding whether the parser is actually doing
+what was intended.
diff --git a/share/doc/psd/15.yacc/ss7 b/share/doc/psd/15.yacc/ss7
new file mode 100644
index 0000000..8876ece
--- /dev/null
+++ b/share/doc/psd/15.yacc/ss7
@@ -0,0 +1,129 @@
+.\" This module is believed to contain source code proprietary to AT&T.
+.\" Use and redistribution is subject to the Berkeley Software License
+.\" Agreement and your Software Agreement with AT&T (Western Electric).
+.\"
+.\" @(#)ss7 8.1 (Berkeley) 6/8/93
+.\"
+.\" $FreeBSD$
+.SH
+7: Error Handling
+.PP
+Error handling is an extremely difficult area, and many of the problems are semantic ones.
+When an error is found, for example, it may be necessary to reclaim parse tree storage,
+delete or alter symbol table entries, and, typically, set switches to avoid generating any further output.
+.PP
+It is seldom acceptable to stop all processing when an error is found; it is more useful to continue
+scanning the input to find further syntax errors.
+This leads to the problem of getting the parser ``restarted'' after an error.
+A general class of algorithms to do this involves discarding a number of tokens
+from the input string, and attempting to adjust the parser so that input can continue.
+.PP
+To allow the user some control over this process,
+Yacc provides a simple, but reasonably general, feature.
+The token name ``error'' is reserved for error handling.
+This name can be used in grammar rules;
+in effect, it suggests places where errors are expected, and recovery might take place.
+The parser pops its stack until it enters a state where the token ``error'' is legal.
+It then behaves as if the token ``error'' were the current lookahead token,
+and performs the action encountered.
+The lookahead token is then reset to the token that caused the error.
+If no special error rules have been specified, the processing halts when an error is detected.
+.PP
+In order to prevent a cascade of error messages, the parser, after
+detecting an error, remains in error state until three tokens have been successfully
+read and shifted.
+If an error is detected when the parser is already in error state,
+no message is given, and the input token is quietly deleted.
+.PP
+As an example, a rule of the form
+.DS
+stat : error
+.DE
+would, in effect, mean that on a syntax error the parser would attempt to skip over the statement
+in which the error was seen.
+More precisely, the parser will
+scan ahead, looking for three tokens that might legally follow
+a statement, and start processing at the first of these; if
+the beginnings of statements are not sufficiently distinctive, it may make a
+false start in the middle of a statement, and end up reporting a
+second error where there is in fact no error.
+.PP
+Actions may be used with these special error rules.
+These actions might attempt to reinitialize tables, reclaim symbol table space, etc.
+.PP
+Error rules such as the above are very general, but difficult to control.
+Somewhat easier are rules such as
+.DS
+stat : error \';\'
+.DE
+Here, when there is an error, the parser attempts to skip over the statement, but
+will do so by skipping to the next \';\'.
+All tokens after the error and before the next \';\' cannot be shifted, and are discarded.
+When the \';\' is seen, this rule will be reduced, and any ``cleanup''
+action associated with it performed.
+.PP
+Another form of error rule arises in interactive applications, where
+it may be desirable to permit a line to be reentered after an error.
+A possible error rule might be
+.DS
+input : error \'\en\' { printf( "Reenter last line: " ); } input
+ { $$ = $4; }
+.DE
+There is one potential difficulty with this approach;
+the parser must correctly process three input tokens before it
+admits that it has correctly resynchronized after the error.
+If the reentered line contains an error
+in the first two tokens, the parser deletes the offending tokens,
+and gives no message; this is clearly unacceptable.
+For this reason, there is a mechanism that
+can be used to force the parser
+to believe that an error has been fully recovered from.
+The statement
+.DS
+yyerrok ;
+.DE
+in an action
+resets the parser to its normal mode.
+The last example is better written
+.DS
+input : error \'\en\'
+ { yyerrok;
+ printf( "Reenter last line: " ); }
+ input
+ { $$ = $4; }
+ ;
+.DE
+.PP
+As mentioned above, the token seen immediately
+after the ``error'' symbol is the input token at which the
+error was discovered.
+Sometimes, this is inappropriate; for example, an
+error recovery action might
+take upon itself the job of finding the correct place to resume input.
+In this case,
+the previous lookahead token must be cleared.
+The statement
+.DS
+yyclearin ;
+.DE
+in an action will have this effect.
+For example, suppose the action after error
+were to call some sophisticated resynchronization routine,
+supplied by the user, that attempted to advance the input to the
+beginning of the next valid statement.
+After this routine was called, the next token returned by yylex would presumably
+be the first token in a legal statement;
+the old, illegal token must be discarded, and the error state reset.
+This could be done by a rule like
+.DS
+stat : error
+ { resynch();
+ yyerrok ;
+ yyclearin ; }
+ ;
+.DE
+.PP
+These mechanisms are admittedly crude, but do allow for a simple, fairly effective recovery of the parser
+from many errors;
+moreover, the user can get control to deal with
+the error actions required by other portions of the program.
diff --git a/share/doc/psd/15.yacc/ss8 b/share/doc/psd/15.yacc/ss8
new file mode 100644
index 0000000..cf1b892
--- /dev/null
+++ b/share/doc/psd/15.yacc/ss8
@@ -0,0 +1,98 @@
+.\" This module is believed to contain source code proprietary to AT&T.
+.\" Use and redistribution is subject to the Berkeley Software License
+.\" Agreement and your Software Agreement with AT&T (Western Electric).
+.\"
+.\" @(#)ss8 8.1 (Berkeley) 6/8/93
+.\"
+.\" $FreeBSD$
+.SH
+8: The Yacc Environment
+.PP
+When the user inputs a specification
+to Yacc, the output is a file of C programs, called
+.I y.tab.c
+on most
+systems
+(due to local file system conventions, the names may differ from
+installation to installation).
+The function produced by Yacc is called
+.I yyparse \|;
+it is an integer valued function.
+When it is called, it in turn repeatedly calls
+.I yylex ,
+the lexical analyzer
+supplied by the user (see Section 3)
+to obtain input tokens.
+Eventually, either an error is detected, in which case
+(if no error recovery is possible)
+.I yyparse
+returns the value 1,
+or the lexical analyzer returns the endmarker token
+and the parser accepts.
+In this case,
+.I yyparse
+returns the value 0.
+.PP
+The user must provide a certain amount of environment for this
+parser in order to obtain a working program.
+For example, as with every C program, a program called
+.I main
+must be defined, that eventually calls
+.I yyparse .
+In addition, a routine called
+.I yyerror
+prints a message
+when a syntax error is detected.
+.PP
+These two routines must be supplied in one form or another by the
+user.
+To ease the initial effort of using Yacc, a library has been
+provided with default versions of
+.I main
+and
+.I yyerror .
+The name of this library is system dependent;
+on many systems the library is accessed by a
+.B \-ly
+argument to the loader.
+To show the triviality of these default programs, the source is
+given below:
+.DS
+main(){
+ return( yyparse() );
+ }
+.DE
+and
+.DS
+# include <stdio.h>
+
+yyerror(s) char *s; {
+ fprintf( stderr, "%s\en", s );
+ }
+.DE
+The argument to
+.I yyerror
+is a string containing an error message, usually
+the string ``syntax error''.
+The average application will want to do better than this.
+Ordinarily, the program should keep track of the input line number, and print it
+along with the message when a syntax error is detected.
+The external integer variable
+.I yychar
+contains the lookahead token number at the time the error was detected;
+this may be of some interest in giving better diagnostics.
+Since the
+.I main
+program is probably supplied by the user (to read arguments, etc.)
+the Yacc library is useful only in small
+projects, or in the earliest stages of larger ones.
+.PP
+The external integer variable
+.I yydebug
+is normally set to 0.
+If it is set to a nonzero value, the parser will output a
+verbose description of its actions, including
+a discussion of which input symbols have been read, and
+what the parser actions are.
+Depending on the operating environment,
+it may be possible to set this variable by using a debugging system.
diff --git a/share/doc/psd/15.yacc/ss9 b/share/doc/psd/15.yacc/ss9
new file mode 100644
index 0000000..7fc5b68
--- /dev/null
+++ b/share/doc/psd/15.yacc/ss9
@@ -0,0 +1,174 @@
+.\" This module is believed to contain source code proprietary to AT&T.
+.\" Use and redistribution is subject to the Berkeley Software License
+.\" Agreement and your Software Agreement with AT&T (Western Electric).
+.\"
+.\" @(#)ss9 8.1 (Berkeley) 6/8/93
+.\"
+.\" $FreeBSD$
+.SH
+9: Hints for Preparing Specifications
+.PP
+This section contains miscellaneous hints on preparing efficient, easy to change,
+and clear specifications.
+The individual subsections are more or less
+independent.
+.SH
+Input Style
+.PP
+It is difficult to
+provide rules with substantial actions
+and still have a readable specification file.
+The following style hints owe much to Brian Kernighan.
+.IP a.
+Use all capital letters for token names, all lower case letters for
+nonterminal names.
+This rule comes under the heading of ``knowing who to blame when
+things go wrong.''
+.IP b.
+Put grammar rules and actions on separate lines.
+This allows either to be changed without
+an automatic need to change the other.
+.IP c.
+Put all rules with the same left hand side together.
+Put the left hand side in only once, and let all
+following rules begin with a vertical bar.
+.IP d.
+Put a semicolon only after the last rule with a given left hand side,
+and put the semicolon on a separate line.
+This allows new rules to be easily added.
+.IP e.
+Indent rule bodies by two tab stops, and action bodies by three
+tab stops.
+.PP
+The example in Appendix A is written following this style, as are
+the examples in the text of this paper (where space permits).
+The user must make up his own mind about these stylistic questions;
+the central problem, however, is to make the rules visible through
+the morass of action code.
+.SH
+Left Recursion
+.PP
+The algorithm used by the Yacc parser encourages so called ``left recursive''
+grammar rules: rules of the form
+.DS
+name : name rest_of_rule ;
+.DE
+These rules frequently arise when
+writing specifications of sequences and lists:
+.DS
+list : item
+ | list \',\' item
+ ;
+.DE
+and
+.DS
+seq : item
+ | seq item
+ ;
+.DE
+In each of these cases, the first rule
+will be reduced for the first item only, and the second rule
+will be reduced for the second and all succeeding items.
+.PP
+With right recursive rules, such as
+.DS
+seq : item
+ | item seq
+ ;
+.DE
+the parser would be a bit bigger, and the items would be seen, and reduced,
+from right to left.
+More seriously, an internal stack in the parser
+would be in danger of overflowing if a very long sequence were read.
+Thus, the user should use left recursion wherever reasonable.
+.PP
+It is worth considering whether a sequence with zero
+elements has any meaning, and if so, consider writing
+the sequence specification with an empty rule:
+.DS
+seq : /* empty */
+ | seq item
+ ;
+.DE
+Once again, the first rule would always be reduced exactly once, before the
+first item was read,
+and then the second rule would be reduced once for each item read.
+Permitting empty sequences
+often leads to increased generality.
+However, conflicts might arise if Yacc is asked to decide
+which empty sequence it has seen, when it hasn't seen enough to
+know!
+.SH
+Lexical Tie-ins
+.PP
+Some lexical decisions depend on context.
+For example, the lexical analyzer might want to
+delete blanks normally, but not within quoted strings.
+Or names might be entered into a symbol table in declarations,
+but not in expressions.
+.PP
+One way of handling this situation is
+to create a global flag that is
+examined by the lexical analyzer, and set by actions.
+For example, suppose a program
+consists of 0 or more declarations, followed by 0 or more statements.
+Consider:
+.DS
+%{
+ int dflag;
+%}
+ ... other declarations ...
+
+%%
+
+prog : decls stats
+ ;
+
+decls : /* empty */
+ { dflag = 1; }
+ | decls declaration
+ ;
+
+stats : /* empty */
+ { dflag = 0; }
+ | stats statement
+ ;
+
+ ... other rules ...
+.DE
+The flag
+.I dflag
+is now 0 when reading statements, and 1 when reading declarations,
+.ul
+except for the first token in the first statement.
+This token must be seen by the parser before it can tell that
+the declaration section has ended and the statements have
+begun.
+In many cases, this single token exception does not
+affect the lexical scan.
+.PP
+This kind of ``backdoor'' approach can be elaborated
+to a noxious degree.
+Nevertheless, it represents a way of doing some things
+that are difficult, if not impossible, to
+do otherwise.
+.SH
+Reserved Words
+.PP
+Some programming languages
+permit the user to
+use words like ``if'', which are normally reserved,
+as label or variable names, provided that such use does not
+conflict with the legal use of these names in the programming language.
+This is extremely hard to do in the framework of Yacc;
+it is difficult to pass information to the lexical analyzer
+telling it ``this instance of `if' is a keyword, and that instance is a variable''.
+The user can make a stab at it, using the
+mechanism described in the last subsection,
+but it is difficult.
+.PP
+A number of ways of making this easier are under advisement.
+Until then, it is better that the keywords be
+.I reserved \|;
+that is, be forbidden for use as variable names.
+There are powerful stylistic reasons for preferring this, anyway.
diff --git a/share/doc/psd/15.yacc/ssA b/share/doc/psd/15.yacc/ssA
new file mode 100644
index 0000000..027a06a
--- /dev/null
+++ b/share/doc/psd/15.yacc/ssA
@@ -0,0 +1,189 @@
+.\" This module is believed to contain source code proprietary to AT&T.
+.\" Use and redistribution is subject to the Berkeley Software License
+.\" Agreement and your Software Agreement with AT&T (Western Electric).
+.\"
+.\" @(#)ssA 8.1 (Berkeley) 6/8/93
+.\"
+.\" $FreeBSD$
+.SH
+10: Advanced Topics
+.PP
+This section discusses a number of advanced features
+of Yacc.
+.SH
+Simulating Error and Accept in Actions
+.PP
+The parsing actions of error and accept can be simulated
+in an action by use of macros YYACCEPT and YYERROR.
+YYACCEPT causes
+.I yyparse
+to return the value 0;
+YYERROR causes
+the parser to behave as if the current input symbol
+had been a syntax error;
+.I yyerror
+is called, and error recovery takes place.
+These mechanisms can be used to simulate parsers
+with multiple endmarkers or context-sensitive syntax checking.
+.SH
+Accessing Values in Enclosing Rules.
+.PP
+An action may refer to values
+returned by actions to the left of the current rule.
+The mechanism is simply the same as with ordinary actions,
+a dollar sign followed by a digit, but in this case the
+digit may be 0 or negative.
+Consider
+.DS
+sent : adj noun verb adj noun
+ { \fIlook at the sentence\fR . . . }
+ ;
+
+adj : THE { $$ = THE; }
+ | YOUNG { $$ = YOUNG; }
+ . . .
+ ;
+
+noun : DOG
+ { $$ = DOG; }
+ | CRONE
+ { if( $0 == YOUNG ){
+ printf( "what?\en" );
+ }
+ $$ = CRONE;
+ }
+ ;
+ . . .
+.DE
+In the action following the word CRONE, a check is made that the
+preceding token shifted was not YOUNG.
+Obviously, this is only possible when a great deal is known about
+what might precede the symbol
+.I noun
+in the input.
+There is also a distinctly unstructured flavor about this.
+Nevertheless, at times this mechanism will save a great
+deal of trouble, especially when a few combinations are to
+be excluded from an otherwise regular structure.
+.SH
+Support for Arbitrary Value Types
+.PP
+By default, the values returned by actions and the lexical analyzer are integers.
+Yacc can also support
+values of other types, including structures.
+In addition, Yacc keeps track of the types, and inserts
+appropriate union member names so that the resulting parser will
+be strictly type checked.
+The Yacc value stack (see Section 4)
+is declared to be a
+.I union
+of the various types of values desired.
+The user declares the union, and associates union member names
+to each token and nonterminal symbol having a value.
+When the value is referenced through a $$ or $n construction,
+Yacc will automatically insert the appropriate union name, so that
+no unwanted conversions will take place.
+In addition, type checking commands such as
+.I Lint\|
+.[
+Johnson Lint Checker 1273
+.]
+will be far more silent.
+.PP
+There are three mechanisms used to provide for this typing.
+First, there is a way of defining the union; this must be
+done by the user since other programs, notably the lexical analyzer,
+must know about the union member names.
+Second, there is a way of associating a union member name with tokens
+and nonterminals.
+Finally, there is a mechanism for describing the type of those
+few values where Yacc can not easily determine the type.
+.PP
+To declare the union, the user includes in the declaration section:
+.DS
+%union {
+ body of union ...
+ }
+.DE
+This declares the Yacc value stack,
+and the external variables
+.I yylval
+and
+.I yyval ,
+to have type equal to this union.
+If Yacc was invoked with the
+.B \-d
+option, the union declaration
+is copied onto the
+.I y.tab.h
+file.
+Alternatively,
+the union may be declared in a header file, and a typedef
+used to define the variable YYSTYPE to represent
+this union.
+Thus, the header file might also have said:
+.DS
+typedef union {
+ body of union ...
+ } YYSTYPE;
+.DE
+The header file must be included in the declarations
+section, by use of %{ and %}.
+.PP
+Once YYSTYPE is defined,
+the union member names must be associated
+with the various terminal and nonterminal names.
+The construction
+.DS
+< name >
+.DE
+is used to indicate a union member name.
+If this follows
+one of the
+keywords %token,
+%left, %right, and %nonassoc,
+the union member name is associated with the tokens listed.
+Thus, saying
+.DS
+%left <optype> \'+\' \'\-\'
+.DE
+will cause any reference to values returned by these two tokens to be
+tagged with
+the union member name
+.I optype .
+Another keyword, %type, is
+used similarly to associate
+union member names with nonterminals.
+Thus, one might say
+.DS
+%type <nodetype> expr stat
+.DE
+.PP
+There remain a couple of cases where these mechanisms are insufficient.
+If there is an action within a rule, the value returned
+by this action has no
+.I "a priori"
+type.
+Similarly, reference to left context values (such as $0 \- see the
+previous subsection ) leaves Yacc with no easy way of knowing the type.
+In this case, a type can be imposed on the reference by inserting
+a union member name, between < and >, immediately after
+the first $.
+An example of this usage is
+.DS
+rule : aaa { $<intval>$ = 3; } bbb
+ { fun( $<intval>2, $<other>0 ); }
+ ;
+.DE
+This syntax has little to recommend it, but the situation arises rarely.
+.PP
+A sample specification is given in Appendix C.
+The facilities in this subsection are not triggered until they are used:
+in particular, the use of %type will turn on these mechanisms.
+When they are used, there is a fairly strict level of checking.
+For example, use of $n or $$ to refer to something with no defined type
+is diagnosed.
+If these facilities are not triggered, the Yacc value stack is used to
+hold
+.I int' s,
+as was true historically.
diff --git a/share/doc/psd/15.yacc/ssB b/share/doc/psd/15.yacc/ssB
new file mode 100644
index 0000000..82d32ae
--- /dev/null
+++ b/share/doc/psd/15.yacc/ssB
@@ -0,0 +1,31 @@
+.\" This module is believed to contain source code proprietary to AT&T.
+.\" Use and redistribution is subject to the Berkeley Software License
+.\" Agreement and your Software Agreement with AT&T (Western Electric).
+.\"
+.\" @(#)ssB 8.1 (Berkeley) 6/8/93
+.\"
+.\" $FreeBSD$
+.SH
+11: Acknowledgements
+.PP
+Yacc owes much to a
+most stimulating collection of users, who have goaded
+me beyond my inclination, and frequently beyond my
+ability, in their endless search for ``one more feature''.
+Their irritating unwillingness to learn how to
+do things my way has usually led to my doing things their way;
+most of the time, they have been right.
+B. W. Kernighan, P. J. Plauger, S. I. Feldman, C. Imagna,
+M. E. Lesk,
+and A. Snyder will recognize some of their ideas in the current version
+of Yacc.
+C. B. Haley contributed to the error recovery algorithm.
+D. M. Ritchie, B. W. Kernighan, and M. O. Harris helped translate this document into English.
+Al Aho also deserves special credit for bringing
+the mountain to Mohammed, and other favors.
+.SG "MH-1273-SCJ-unix"
+.bp
+.[
+$LIST$
+.]
+.bp
diff --git a/share/doc/psd/15.yacc/ssa b/share/doc/psd/15.yacc/ssa
new file mode 100644
index 0000000..fb604f9
--- /dev/null
+++ b/share/doc/psd/15.yacc/ssa
@@ -0,0 +1,118 @@
+.\" This module is believed to contain source code proprietary to AT&T.
+.\" Use and redistribution is subject to the Berkeley Software License
+.\" Agreement and your Software Agreement with AT&T (Western Electric).
+.\"
+.\" @(#)ssa 8.1 (Berkeley) 6/8/93
+.\"
+.\" $FreeBSD$
+.SH
+Appendix A: A Simple Example
+.PP
+This example gives the complete Yacc specification for a small desk calculator;
+the desk calculator has 26 registers, labeled ``a'' through ``z'', and accepts
+arithmetic expressions made up of the operators +, \-, *, /,
+% (mod operator), & (bitwise and), | (bitwise or), and assignment.
+If an expression at the top level is an assignment, the value is not
+printed; otherwise it is.
+As in C, an integer that begins with 0 (zero) is assumed to be octal;
+otherwise, it is assumed to be decimal.
+.PP
+As an example of a Yacc specification, the desk calculator
+does a reasonable job of showing how precedences and ambiguities
+are used, and demonstrating simple error recovery.
+The major oversimplifications are that the
+lexical analysis phase is much simpler than for most applications, and the
+output is produced immediately, line by line.
+Note the way that decimal and octal integers are read in by the grammar rules;
+This job is probably better done by the lexical analyzer.
+.sp
+.nf
+.ta .5i 1i 1.5i 2i 2.5i
+
+%{
+# include <stdio.h>
+# include <ctype.h>
+
+int regs[26];
+int base;
+
+%}
+
+%start list
+
+%token DIGIT LETTER
+
+%left \'|\'
+%left \'&\'
+%left \'+\' \'\-\'
+%left \'*\' \'/\' \'%\'
+%left UMINUS /* supplies precedence for unary minus */
+
+%% /* beginning of rules section */
+
+list : /* empty */
+ | list stat \'\en\'
+ | list error \'\en\'
+ { yyerrok; }
+ ;
+
+stat : expr
+ { printf( "%d\en", $1 ); }
+ | LETTER \'=\' expr
+ { regs[$1] = $3; }
+ ;
+
+expr : \'(\' expr \')\'
+ { $$ = $2; }
+ | expr \'+\' expr
+ { $$ = $1 + $3; }
+ | expr \'\-\' expr
+ { $$ = $1 \- $3; }
+ | expr \'*\' expr
+ { $$ = $1 * $3; }
+ | expr \'/\' expr
+ { $$ = $1 / $3; }
+ | expr \'%\' expr
+ { $$ = $1 % $3; }
+ | expr \'&\' expr
+ { $$ = $1 & $3; }
+ | expr \'|\' expr
+ { $$ = $1 | $3; }
+ | \'\-\' expr %prec UMINUS
+ { $$ = \- $2; }
+ | LETTER
+ { $$ = regs[$1]; }
+ | number
+ ;
+
+number : DIGIT
+ { $$ = $1; base = ($1==0) ? 8 : 10; }
+ | number DIGIT
+ { $$ = base * $1 + $2; }
+ ;
+
+%% /* start of programs */
+
+yylex() { /* lexical analysis routine */
+ /* returns LETTER for a lower case letter, yylval = 0 through 25 */
+ /* return DIGIT for a digit, yylval = 0 through 9 */
+ /* all other characters are returned immediately */
+
+ int c;
+
+ while( (c=getchar()) == \' \' ) { /* skip blanks */ }
+
+ /* c is now nonblank */
+
+ if( islower( c ) ) {
+ yylval = c \- \'a\';
+ return ( LETTER );
+ }
+ if( isdigit( c ) ) {
+ yylval = c \- \'0\';
+ return( DIGIT );
+ }
+ return( c );
+ }
+.fi
+.bp
diff --git a/share/doc/psd/15.yacc/ssb b/share/doc/psd/15.yacc/ssb
new file mode 100644
index 0000000..4b4df03
--- /dev/null
+++ b/share/doc/psd/15.yacc/ssb
@@ -0,0 +1,115 @@
+.\" This module is believed to contain source code proprietary to AT&T.
+.\" Use and redistribution is subject to the Berkeley Software License
+.\" Agreement and your Software Agreement with AT&T (Western Electric).
+.\"
+.\" @(#)ssb 8.1 (Berkeley) 6/8/93
+.\"
+.\" $FreeBSD$
+.SH
+Appendix B: Yacc Input Syntax
+.PP
+This Appendix has a description of the Yacc input syntax, as a Yacc specification.
+Context dependencies, etc., are not considered.
+Ironically, the Yacc input specification language
+is most naturally specified as an LR(2) grammar; the sticky
+part comes when an identifier is seen in a rule, immediately
+following an action.
+If this identifier is followed by a colon, it is the start of the
+next rule; otherwise
+it is a continuation of the current rule, which just happens to have
+an action embedded in it.
+As implemented, the lexical analyzer looks
+ahead after seeing an identifier, and
+decide whether the next token (skipping blanks, newlines, comments, etc.)
+is a colon.
+If so, it returns the token C_IDENTIFIER.
+Otherwise, it returns IDENTIFIER.
+Literals (quoted strings) are also returned as IDENTIFIERS,
+but never as part of C_IDENTIFIERs.
+.sp
+.nf
+.ta .6i 1.2i 1.8i 2.4i 3i 3.6i
+
+ /* grammar for the input to Yacc */
+
+ /* basic entities */
+%token IDENTIFIER /* includes identifiers and literals */
+%token C_IDENTIFIER /* identifier (but not literal) followed by colon */
+%token NUMBER /* [0-9]+ */
+
+ /* reserved words: %type => TYPE, %left => LEFT, etc. */
+
+%token LEFT RIGHT NONASSOC TOKEN PREC TYPE START UNION
+
+%token MARK /* the %% mark */
+%token LCURL /* the %{ mark */
+%token RCURL /* the %} mark */
+
+ /* ascii character literals stand for themselves */
+
+%start spec
+
+%%
+
+spec : defs MARK rules tail
+ ;
+
+tail : MARK { \fIIn this action, eat up the rest of the file\fR }
+ | /* empty: the second MARK is optional */
+ ;
+
+defs : /* empty */
+ | defs def
+ ;
+
+def : START IDENTIFIER
+ | UNION { \fICopy union definition to output\fR }
+ | LCURL { \fICopy C code to output file\fR } RCURL
+ | ndefs rword tag nlist
+ ;
+
+rword : TOKEN
+ | LEFT
+ | RIGHT
+ | NONASSOC
+ | TYPE
+ ;
+
+tag : /* empty: union tag is optional */
+ | \'<\' IDENTIFIER \'>\'
+ ;
+
+nlist : nmno
+ | nlist nmno
+ | nlist \',\' nmno
+ ;
+
+nmno : IDENTIFIER /* NOTE: literal illegal with %type */
+ | IDENTIFIER NUMBER /* NOTE: illegal with %type */
+ ;
+
+ /* rules section */
+
+rules : C_IDENTIFIER rbody prec
+ | rules rule
+ ;
+
+rule : C_IDENTIFIER rbody prec
+ | '|' rbody prec
+ ;
+
+rbody : /* empty */
+ | rbody IDENTIFIER
+ | rbody act
+ ;
+
+act : \'{\' { \fICopy action, translate $$, etc.\fR } \'}\'
+ ;
+
+prec : /* empty */
+ | PREC IDENTIFIER
+ | PREC IDENTIFIER act
+ | prec \';\'
+ ;
+.fi
+.bp
diff --git a/share/doc/psd/15.yacc/ssc b/share/doc/psd/15.yacc/ssc
new file mode 100644
index 0000000..29b38be
--- /dev/null
+++ b/share/doc/psd/15.yacc/ssc
@@ -0,0 +1,316 @@
+.\" This module is believed to contain source code proprietary to AT&T.
+.\" Use and redistribution is subject to the Berkeley Software License
+.\" Agreement and your Software Agreement with AT&T (Western Electric).
+.\"
+.\" @(#)ssc 8.1 (Berkeley) 8/14/93
+.\"
+.\" $FreeBSD$
+.SH
+Appendix C: An Advanced Example
+.PP
+This Appendix gives an example of a grammar using some
+of the advanced features discussed in Section 10.
+The desk calculator example in Appendix A is
+modified to provide a desk calculator that
+does floating point interval arithmetic.
+The calculator understands floating point
+constants, the arithmetic operations +, \-, *, /,
+unary \-, and = (assignment), and has 26 floating
+point variables, ``a'' through ``z''.
+Moreover, it also understands
+.I intervals ,
+written
+.DS
+ ( x , y )
+.DE
+where
+.I x
+is less than or equal to
+.I y .
+There are 26 interval valued variables ``A'' through ``Z''
+that may also be used.
+The usage is similar to that in Appendix A; assignments
+return no value, and print nothing, while expressions print
+the (floating or interval) value.
+.PP
+This example explores a number of interesting features
+of Yacc and C.
+Intervals are represented by a structure, consisting of the
+left and right endpoint values, stored as
+.I double 's.
+This structure is given a type name, INTERVAL, by using
+.I typedef .
+The Yacc value stack can also contain floating point scalars, and
+integers (used to index into the arrays holding the variable values).
+Notice that this entire strategy depends strongly on being able to
+assign structures and unions in C.
+In fact, many of the actions call functions that return structures
+as well.
+.PP
+It is also worth noting the use of YYERROR to handle error conditions:
+division by an interval containing 0, and an interval presented in
+the wrong order.
+In effect, the error recovery mechanism of Yacc is used to throw away the
+rest of the offending line.
+.PP
+In addition to the mixing of types on the value stack,
+this grammar also demonstrates an interesting use of syntax to
+keep track of the type (e.g. scalar or interval) of intermediate
+expressions.
+Note that a scalar can be automatically promoted to an interval if
+the context demands an interval value.
+This causes a large number of conflicts when the grammar is run through
+Yacc: 18 Shift/Reduce and 26 Reduce/Reduce.
+The problem can be seen by looking at the two input lines:
+.DS
+ 2.5 + ( 3.5 \- 4. )
+.DE
+and
+.DS
+ 2.5 + ( 3.5 , 4. )
+.DE
+Notice that the 2.5 is to be used in an interval valued expression
+in the second example, but this fact is not known until
+the ``,'' is read; by this time, 2.5 is finished, and the parser cannot go back
+and change its mind.
+More generally, it might be necessary to look ahead an arbitrary number of
+tokens to decide whether to convert a scalar to an interval.
+This problem is evaded by having two rules for each binary interval
+valued operator: one when the left operand is a scalar, and one when
+the left operand is an interval.
+In the second case, the right operand must be an interval,
+so the conversion will be applied automatically.
+Despite this evasion, there are still many cases where the
+conversion may be applied or not, leading to the above
+conflicts.
+They are resolved by listing the rules that yield scalars first
+in the specification file; in this way, the conflicts will
+be resolved in the direction of keeping scalar
+valued expressions scalar valued until they are forced to become
+intervals.
+.PP
+This way of handling multiple types is very instructive, but not very general.
+If there were many kinds of expression types, instead of just two,
+the number of rules needed would increase dramatically, and the conflicts
+even more dramatically.
+Thus, while this example is instructive, it is better practice in a
+more normal programming language environment to
+keep the type information as part of the value, and not as part
+of the grammar.
+.PP
+Finally, a word about the lexical analysis.
+The only unusual feature is the treatment of floating point constants.
+The C library routine
+.I atof
+is used to do the actual conversion from a character string
+to a double precision value.
+If the lexical analyzer detects an error,
+it responds by returning a token that
+is illegal in the grammar, provoking a syntax error
+in the parser, and thence error recovery.
+.LD
+
+%{
+
+# include <stdio.h>
+# include <ctype.h>
+
+typedef struct interval {
+ double lo, hi;
+ } INTERVAL;
+
+INTERVAL vmul(), vdiv();
+
+double atof();
+
+double dreg[ 26 ];
+INTERVAL vreg[ 26 ];
+
+%}
+
+%start lines
+
+%union {
+ int ival;
+ double dval;
+ INTERVAL vval;
+ }
+
+%token <ival> DREG VREG /* indices into dreg, vreg arrays */
+
+%token <dval> CONST /* floating point constant */
+
+%type <dval> dexp /* expression */
+
+%type <vval> vexp /* interval expression */
+
+ /* precedence information about the operators */
+
+%left \'+\' \'\-\'
+%left \'*\' \'/\'
+%left UMINUS /* precedence for unary minus */
+
+%%
+
+lines : /* empty */
+ | lines line
+ ;
+
+line : dexp \'\en\'
+ { printf( "%15.8f\en", $1 ); }
+ | vexp \'\en\'
+ { printf( "(%15.8f , %15.8f )\en", $1.lo, $1.hi ); }
+ | DREG \'=\' dexp \'\en\'
+ { dreg[$1] = $3; }
+ | VREG \'=\' vexp \'\en\'
+ { vreg[$1] = $3; }
+ | error \'\en\'
+ { yyerrok; }
+ ;
+
+dexp : CONST
+ | DREG
+ { $$ = dreg[$1]; }
+ | dexp \'+\' dexp
+ { $$ = $1 + $3; }
+ | dexp \'\-\' dexp
+ { $$ = $1 \- $3; }
+ | dexp \'*\' dexp
+ { $$ = $1 * $3; }
+ | dexp \'/\' dexp
+ { $$ = $1 / $3; }
+ | \'\-\' dexp %prec UMINUS
+ { $$ = \- $2; }
+ | \'(\' dexp \')\'
+ { $$ = $2; }
+ ;
+
+vexp : dexp
+ { $$.hi = $$.lo = $1; }
+ | \'(\' dexp \',\' dexp \')\'
+ {
+ $$.lo = $2;
+ $$.hi = $4;
+ if( $$.lo > $$.hi ){
+ printf( "interval out of order\en" );
+ YYERROR;
+ }
+ }
+ | VREG
+ { $$ = vreg[$1]; }
+ | vexp \'+\' vexp
+ { $$.hi = $1.hi + $3.hi;
+ $$.lo = $1.lo + $3.lo; }
+ | dexp \'+\' vexp
+ { $$.hi = $1 + $3.hi;
+ $$.lo = $1 + $3.lo; }
+ | vexp \'\-\' vexp
+ { $$.hi = $1.hi \- $3.lo;
+ $$.lo = $1.lo \- $3.hi; }
+ | dexp \'\-\' vexp
+ { $$.hi = $1 \- $3.lo;
+ $$.lo = $1 \- $3.hi; }
+ | vexp \'*\' vexp
+ { $$ = vmul( $1.lo, $1.hi, $3 ); }
+ | dexp \'*\' vexp
+ { $$ = vmul( $1, $1, $3 ); }
+ | vexp \'/\' vexp
+ { if( dcheck( $3 ) ) YYERROR;
+ $$ = vdiv( $1.lo, $1.hi, $3 ); }
+ | dexp \'/\' vexp
+ { if( dcheck( $3 ) ) YYERROR;
+ $$ = vdiv( $1, $1, $3 ); }
+ | \'\-\' vexp %prec UMINUS
+ { $$.hi = \-$2.lo; $$.lo = \-$2.hi; }
+ | \'(\' vexp \')\'
+ { $$ = $2; }
+ ;
+
+%%
+
+# define BSZ 50 /* buffer size for floating point numbers */
+
+ /* lexical analysis */
+
+yylex(){
+ register c;
+
+ while( (c=getchar()) == \' \' ){ /* skip over blanks */ }
+
+ if( isupper( c ) ){
+ yylval.ival = c \- \'A\';
+ return( VREG );
+ }
+ if( islower( c ) ){
+ yylval.ival = c \- \'a\';
+ return( DREG );
+ }
+
+ if( isdigit( c ) || c==\'.\' ){
+ /* gobble up digits, points, exponents */
+
+ char buf[BSZ+1], *cp = buf;
+ int dot = 0, exp = 0;
+
+ for( ; (cp\-buf)<BSZ ; ++cp,c=getchar() ){
+
+ *cp = c;
+ if( isdigit( c ) ) continue;
+ if( c == \'.\' ){
+ if( dot++ || exp ) return( \'.\' ); /* will cause syntax error */
+ continue;
+ }
+
+ if( c == \'e\' ){
+ if( exp++ ) return( \'e\' ); /* will cause syntax error */
+ continue;
+ }
+
+ /* end of number */
+ break;
+ }
+ *cp = \'\e0\';
+ if( (cp\-buf) >= BSZ ) printf( "constant too long: truncated\en" );
+ else ungetc( c, stdin ); /* push back last char read */
+ yylval.dval = atof( buf );
+ return( CONST );
+ }
+ return( c );
+ }
+
+INTERVAL hilo( a, b, c, d ) double a, b, c, d; {
+ /* returns the smallest interval containing a, b, c, and d */
+ /* used by *, / routines */
+ INTERVAL v;
+
+ if( a>b ) { v.hi = a; v.lo = b; }
+ else { v.hi = b; v.lo = a; }
+
+ if( c>d ) {
+ if( c>v.hi ) v.hi = c;
+ if( d<v.lo ) v.lo = d;
+ }
+ else {
+ if( d>v.hi ) v.hi = d;
+ if( c<v.lo ) v.lo = c;
+ }
+ return( v );
+ }
+
+INTERVAL vmul( a, b, v ) double a, b; INTERVAL v; {
+ return( hilo( a*v.hi, a*v.lo, b*v.hi, b*v.lo ) );
+ }
+
+dcheck( v ) INTERVAL v; {
+ if( v.hi >= 0. && v.lo <= 0. ){
+ printf( "divisor interval contains 0.\en" );
+ return( 1 );
+ }
+ return( 0 );
+ }
+
+INTERVAL vdiv( a, b, v ) double a, b; INTERVAL v; {
+ return( hilo( a/v.hi, a/v.lo, b/v.hi, b/v.lo ) );
+ }
+.DE
+.bp
diff --git a/share/doc/psd/15.yacc/ssd b/share/doc/psd/15.yacc/ssd
new file mode 100644
index 0000000..872152f
--- /dev/null
+++ b/share/doc/psd/15.yacc/ssd
@@ -0,0 +1,45 @@
+.\" This module is believed to contain source code proprietary to AT&T.
+.\" Use and redistribution is subject to the Berkeley Software License
+.\" Agreement and your Software Agreement with AT&T (Western Electric).
+.\"
+.\" @(#)ssd 8.1 (Berkeley) 6/8/93
+.\"
+.\" $FreeBSD$
+.SH
+Appendix D: Old Features Supported but not Encouraged
+.PP
+This Appendix mentions synonyms and features which are supported for historical
+continuity, but, for various reasons, are not encouraged.
+.IP 1.
+Literals may also be delimited by double quotes ``"''.
+.IP 2.
+Literals may be more than one character long.
+If all the characters are alphabetic, numeric, or \_, the type number of the literal is defined,
+just as if the literal did not have the quotes around it.
+Otherwise, it is difficult to find the value for such literals.
+.IP
+The use of multi-character literals is likely to mislead those unfamiliar with
+Yacc, since it suggests that Yacc is doing a job which must be actually done by the lexical analyzer.
+.IP 3.
+Most places where % is legal, backslash ``\e'' may be used.
+In particular, \e\e is the same as %%, \eleft the same as %left, etc.
+.IP 4.
+There are a number of other synonyms:
+.DS
+%< is the same as %left
+%> is the same as %right
+%binary and %2 are the same as %nonassoc
+%0 and %term are the same as %token
+%= is the same as %prec
+.DE
+.IP 5.
+Actions may also have the form
+.DS
+={ . . . }
+.DE
+and the curly braces can be dropped if the action is a
+single C statement.
+.IP 6.
+C code between %{ and %} used to be permitted at the
+head of the rules section, as well as in the
+declaration section.
OpenPOWER on IntegriCloud