summaryrefslogtreecommitdiffstats
path: root/contrib/awk/doc/awk.1
diff options
context:
space:
mode:
Diffstat (limited to 'contrib/awk/doc/awk.1')
-rw-r--r--contrib/awk/doc/awk.12628
1 files changed, 0 insertions, 2628 deletions
diff --git a/contrib/awk/doc/awk.1 b/contrib/awk/doc/awk.1
deleted file mode 100644
index 807f79f..0000000
--- a/contrib/awk/doc/awk.1
+++ /dev/null
@@ -1,2628 +0,0 @@
-.\" $FreeBSD$
-.ds PX \s-1POSIX\s+1
-.ds UX \s-1UNIX\s+1
-.ds AN \s-1ANSI\s+1
-.TH GAWK 1 "Apr 28 1999" "Free Software Foundation" "Utility Commands"
-.SH NAME
-gawk \- pattern scanning and processing language
-.SH SYNOPSIS
-.B gawk
-[ POSIX or GNU style options ]
-.B \-f
-.I program-file
-[
-.B \-\^\-
-] file .\^.\^.
-.br
-.B gawk
-[ POSIX or GNU style options ]
-[
-.B \-\^\-
-]
-.I program-text
-file .\^.\^.
-.SH DESCRIPTION
-.I Gawk
-is the GNU Project's implementation of the AWK programming language.
-It conforms to the definition of the language in
-the \*(PX 1003.2 Command Language And Utilities Standard.
-This version in turn is based on the description in
-.IR "The AWK Programming Language" ,
-by Aho, Kernighan, and Weinberger,
-with the additional features found in the System V Release 4 version
-of \*(UX
-.IR awk .
-.I Gawk
-also provides more recent Bell Labs
-.I awk
-extensions, and some GNU-specific extensions.
-.PP
-The command line consists of options to
-.I gawk
-itself, the AWK program text (if not supplied via the
-.B \-f
-or
-.B \-\^\-file
-options), and values to be made
-available in the
-.B ARGC
-and
-.B ARGV
-pre-defined AWK variables.
-.SH OPTION FORMAT
-.PP
-.I Gawk
-options may be either the traditional \*(PX one letter options,
-or the GNU style long options. \*(PX options start with a single ``\-'',
-while long options start with ``\-\^\-''.
-Long options are provided for both GNU-specific features and
-for \*(PX mandated features.
-.PP
-Following the \*(PX standard,
-.IR gawk -specific
-options are supplied via arguments to the
-.B \-W
-option. Multiple
-.B \-W
-options may be supplied
-Each
-.B \-W
-option has a corresponding long option, as detailed below.
-Arguments to long options are either joined with the option
-by an
-.B =
-sign, with no intervening spaces, or they may be provided in the
-next command line argument.
-Long options may be abbreviated, as long as the abbreviation
-remains unique.
-.SH OPTIONS
-.PP
-.I Gawk
-accepts the following options.
-.TP
-.PD 0
-.BI \-F " fs"
-.TP
-.PD
-.BI \-\^\-field-separator " fs"
-Use
-.I fs
-for the input field separator (the value of the
-.B FS
-predefined
-variable).
-.TP
-.PD 0
-\fB\-v\fI var\fB\^=\^\fIval\fR
-.TP
-.PD
-\fB\-\^\-assign \fIvar\fB\^=\^\fIval\fR
-Assign the value
-.IR val ,
-to the variable
-.IR var ,
-before execution of the program begins.
-Such variable values are available to the
-.B BEGIN
-block of an AWK program.
-.TP
-.PD 0
-.BI \-f " program-file"
-.TP
-.PD
-.BI \-\^\-file " program-file"
-Read the AWK program source from the file
-.IR program-file ,
-instead of from the first command line argument.
-Multiple
-.B \-f
-(or
-.BR \-\^\-file )
-options may be used.
-.TP
-.PD 0
-.BI \-mf " NNN"
-.TP
-.PD
-.BI \-mr " NNN"
-Set various memory limits to the value
-.IR NNN .
-The
-.B f
-flag sets the maximum number of fields, and the
-.B r
-flag sets the maximum record size. These two flags and the
-.B \-m
-option are from the Bell Labs research version of \*(UX
-.IR awk .
-They are ignored by
-.IR gawk ,
-since
-.I gawk
-has no pre-defined limits.
-.TP
-.PD 0
-.B "\-W traditional"
-.TP
-.PD 0
-.B "\-W compat"
-.TP
-.PD 0
-.B \-\^\-traditional
-.TP
-.PD
-.B \-\^\-compat
-Run in
-.I compatibility
-mode. In compatibility mode,
-.I gawk
-behaves identically to \*(UX
-.IR awk ;
-none of the GNU-specific extensions are recognized.
-The use of
-.B \-\^\-traditional
-is preferred over the other forms of this option.
-See
-.BR "GNU EXTENSIONS" ,
-below, for more information.
-.TP
-.PD 0
-.B "\-W copyleft"
-.TP
-.PD 0
-.B "\-W copyright"
-.TP
-.PD 0
-.B \-\^\-copyleft
-.TP
-.PD
-.B \-\^\-copyright
-Print the short version of the GNU copyright information message on
-the standard output, and exits successfully.
-.TP
-.PD 0
-.B "\-W help"
-.TP
-.PD 0
-.B "\-W usage"
-.TP
-.PD 0
-.B \-\^\-help
-.TP
-.PD
-.B \-\^\-usage
-Print a relatively short summary of the available options on
-the standard output.
-(Per the
-.IR "GNU Coding Standards" ,
-these options cause an immediate, successful exit.)
-.TP
-.PD 0
-.B "\-W lint"
-.TP
-.PD
-.B \-\^\-lint
-Provide warnings about constructs that are
-dubious or non-portable to other AWK implementations.
-.TP
-.PD 0
-.B "\-W lint\-old"
-.TP
-.PD
-.B \-\^\-lint\-old
-Provide warnings about constructs that are
-not portable to the original version of Unix
-.IR awk .
-.ig
-.\" This option is left undocumented, on purpose.
-.TP
-.PD 0
-.B "\-W nostalgia"
-.TP
-.PD
-.B \-\^\-nostalgia
-Provide a moment of nostalgia for long time
-.I awk
-users.
-..
-.TP
-.PD 0
-.B "\-W posix"
-.TP
-.PD
-.B \-\^\-posix
-This turns on
-.I compatibility
-mode, with the following additional restrictions:
-.RS
-.TP \w'\(bu'u+1n
-\(bu
-.B \ex
-escape sequences are not recognized.
-.TP
-\(bu
-Only space and tab act as field separators when
-.B FS
-is set to a single space, newline does not.
-.TP
-\(bu
-The synonym
-.B func
-for the keyword
-.B function
-is not recognized.
-.TP
-\(bu
-The operators
-.B **
-and
-.B **=
-cannot be used in place of
-.B ^
-and
-.BR ^= .
-.TP
-\(bu
-The
-.B fflush()
-function is not available.
-.RE
-.TP
-.PD 0
-.B "\-W re\-interval"
-.TP
-.PD
-.B \-\^\-re\-interval
-Enable the use of
-.I "interval expressions"
-in regular expression matching
-(see
-.BR "Regular Expressions" ,
-below).
-Interval expressions were not traditionally available in the
-AWK language. The POSIX standard added them, to make
-.I awk
-and
-.I egrep
-consistent with each other.
-However, their use is likely
-to break old AWK programs, so
-.I gawk
-only provides them if they are requested with this option, or when
-.B \-\^\-posix
-is specified.
-.TP
-.PD 0
-.BI "\-W source " program-text
-.TP
-.PD
-.BI \-\^\-source " program-text"
-Use
-.I program-text
-as AWK program source code.
-This option allows the easy intermixing of library functions (used via the
-.B \-f
-and
-.B \-\^\-file
-options) with source code entered on the command line.
-It is intended primarily for medium to large AWK programs used
-in shell scripts.
-.TP
-.PD 0
-.B "\-W version"
-.TP
-.PD
-.B \-\^\-version
-Print version information for this particular copy of
-.I gawk
-on the standard output.
-This is useful mainly for knowing if the current copy of
-.I gawk
-on your system
-is up to date with respect to whatever the Free Software Foundation
-is distributing.
-This is also useful when reporting bugs.
-(Per the
-.IR "GNU Coding Standards" ,
-these options cause an immediate, successful exit.)
-.TP
-.B \-\^\-
-Signal the end of options. This is useful to allow further arguments to the
-AWK program itself to start with a ``\-''.
-This is mainly for consistency with the argument parsing convention used
-by most other \*(PX programs.
-.PP
-In compatibility mode,
-any other options are flagged as illegal, but are otherwise ignored.
-In normal operation, as long as program text has been supplied, unknown
-options are passed on to the AWK program in the
-.B ARGV
-array for processing. This is particularly useful for running AWK
-programs via the ``#!'' executable interpreter mechanism.
-.SH AWK PROGRAM EXECUTION
-.PP
-An AWK program consists of a sequence of pattern-action statements
-and optional function definitions.
-.RS
-.PP
-\fIpattern\fB { \fIaction statements\fB }\fR
-.br
-\fBfunction \fIname\fB(\fIparameter list\fB) { \fIstatements\fB }\fR
-.RE
-.PP
-.I Gawk
-first reads the program source from the
-.IR program-file (s)
-if specified,
-from arguments to
-.BR \-\^\-source ,
-or from the first non-option argument on the command line.
-The
-.B \-f
-and
-.B \-\^\-source
-options may be used multiple times on the command line.
-.I Gawk
-will read the program text as if all the
-.IR program-file s
-and command line source texts
-had been concatenated together. This is useful for building libraries
-of AWK functions, without having to include them in each new AWK
-program that uses them. It also provides the ability to mix library
-functions with command line programs.
-.PP
-The environment variable
-.B AWKPATH
-specifies a search path to use when finding source files named with
-the
-.B \-f
-option. If this variable does not exist, the default path is
-\fB".:/usr/local/share/awk"\fR.
-(The actual directory may vary, depending upon how
-.I gawk
-was built and installed.)
-If a file name given to the
-.B \-f
-option contains a ``/'' character, no path search is performed.
-.PP
-.I Gawk
-executes AWK programs in the following order.
-First,
-all variable assignments specified via the
-.B \-v
-option are performed.
-Next,
-.I gawk
-compiles the program into an internal form.
-Then,
-.I gawk
-executes the code in the
-.B BEGIN
-block(s) (if any),
-and then proceeds to read
-each file named in the
-.B ARGV
-array.
-If there are no files named on the command line,
-.I gawk
-reads the standard input.
-.PP
-If a filename on the command line has the form
-.IB var = val
-it is treated as a variable assignment. The variable
-.I var
-will be assigned the value
-.IR val .
-(This happens after any
-.B BEGIN
-block(s) have been run.)
-Command line variable assignment
-is most useful for dynamically assigning values to the variables
-AWK uses to control how input is broken into fields and records. It
-is also useful for controlling state if multiple passes are needed over
-a single data file.
-.PP
-If the value of a particular element of
-.B ARGV
-is empty (\fB""\fR),
-.I gawk
-skips over it.
-.PP
-For each record in the input,
-.I gawk
-tests to see if it matches any
-.I pattern
-in the AWK program.
-For each pattern that the record matches, the associated
-.I action
-is executed.
-The patterns are tested in the order they occur in the program.
-.PP
-Finally, after all the input is exhausted,
-.I gawk
-executes the code in the
-.B END
-block(s) (if any).
-.SH VARIABLES, RECORDS AND FIELDS
-AWK variables are dynamic; they come into existence when they are
-first used. Their values are either floating-point numbers or strings,
-or both,
-depending upon how they are used. AWK also has one dimensional
-arrays; arrays with multiple dimensions may be simulated.
-Several pre-defined variables are set as a program
-runs; these will be described as needed and summarized below.
-.SS Records
-Normally, records are separated by newline characters. You can control how
-records are separated by assigning values to the built-in variable
-.BR RS .
-If
-.B RS
-is any single character, that character separates records.
-Otherwise,
-.B RS
-is a regular expression. Text in the input that matches this
-regular expression will separate the record.
-However, in compatibility mode,
-only the first character of its string
-value is used for separating records.
-If
-.B RS
-is set to the null string, then records are separated by
-blank lines.
-When
-.B RS
-is set to the null string, the newline character always acts as
-a field separator, in addition to whatever value
-.B FS
-may have.
-.SS Fields
-.PP
-As each input record is read,
-.I gawk
-splits the record into
-.IR fields ,
-using the value of the
-.B FS
-variable as the field separator.
-If
-.B FS
-is a single character, fields are separated by that character.
-If
-.B FS
-is the null string, then each individual character becomes a
-separate field.
-Otherwise,
-.B FS
-is expected to be a full regular expression.
-In the special case that
-.B FS
-is a single space, fields are separated
-by runs of spaces and/or tabs and/or newlines.
-(But see the discussion of
-.BR \-\-posix ,
-below).
-Note that the value of
-.B IGNORECASE
-(see below) will also affect how fields are split when
-.B FS
-is a regular expression, and how records are separated when
-.B RS
-is a regular expression.
-.PP
-If the
-.B FIELDWIDTHS
-variable is set to a space separated list of numbers, each field is
-expected to have fixed width, and
-.I gawk
-will split up the record using the specified widths. The value of
-.B FS
-is ignored.
-Assigning a new value to
-.B FS
-overrides the use of
-.BR FIELDWIDTHS ,
-and restores the default behavior.
-.PP
-Each field in the input record may be referenced by its position,
-.BR $1 ,
-.BR $2 ,
-and so on.
-.B $0
-is the whole record. The value of a field may be assigned to as well.
-Fields need not be referenced by constants:
-.RS
-.PP
-.ft B
-n = 5
-.br
-print $n
-.ft R
-.RE
-.PP
-prints the fifth field in the input record.
-The variable
-.B NF
-is set to the total number of fields in the input record.
-.PP
-References to non-existent fields (i.e. fields after
-.BR $NF )
-produce the null-string. However, assigning to a non-existent field
-(e.g.,
-.BR "$(NF+2) = 5" )
-will increase the value of
-.BR NF ,
-create any intervening fields with the null string as their value, and
-cause the value of
-.B $0
-to be recomputed, with the fields being separated by the value of
-.BR OFS .
-References to negative numbered fields cause a fatal error.
-Decrementing
-.B NF
-causes the values of fields past the new value to be lost, and the value of
-.B $0
-to be recomputed, with the fields being separated by the value of
-.BR OFS .
-.SS Built-in Variables
-.PP
-.IR Gawk 's
-built-in variables are:
-.PP
-.TP \w'\fBFIELDWIDTHS\fR'u+1n
-.B ARGC
-The number of command line arguments (does not include options to
-.IR gawk ,
-or the program source).
-.TP
-.B ARGIND
-The index in
-.B ARGV
-of the current file being processed.
-.TP
-.B ARGV
-Array of command line arguments. The array is indexed from
-0 to
-.B ARGC
-\- 1.
-Dynamically changing the contents of
-.B ARGV
-can control the files used for data.
-.TP
-.B CONVFMT
-The conversion format for numbers, \fB"%.6g"\fR, by default.
-.TP
-.B ENVIRON
-An array containing the values of the current environment.
-The array is indexed by the environment variables, each element being
-the value of that variable (e.g., \fBENVIRON["HOME"]\fP might be
-.BR /home/arnold ).
-Changing this array does not affect the environment seen by programs which
-.I gawk
-spawns via redirection or the
-.B system()
-function.
-(This may change in a future version of
-.IR gawk .)
-.\" but don't hold your breath...
-.TP
-.B ERRNO
-If a system error occurs either doing a redirection for
-.BR getline ,
-during a read for
-.BR getline ,
-or during a
-.BR close() ,
-then
-.B ERRNO
-will contain
-a string describing the error.
-.TP
-.B FIELDWIDTHS
-A white-space separated list of fieldwidths. When set,
-.I gawk
-parses the input into fields of fixed width, instead of using the
-value of the
-.B FS
-variable as the field separator.
-The fixed field width facility is still experimental; the
-semantics may change as
-.I gawk
-evolves over time.
-.TP
-.B FILENAME
-The name of the current input file.
-If no files are specified on the command line, the value of
-.B FILENAME
-is ``\-''.
-However,
-.B FILENAME
-is undefined inside the
-.B BEGIN
-block.
-.TP
-.B FNR
-The input record number in the current input file.
-.TP
-.B FS
-The input field separator, a space by default. See
-.BR Fields ,
-above.
-.TP
-.B IGNORECASE
-Controls the case-sensitivity of all regular expression
-and string operations. If
-.B IGNORECASE
-has a non-zero value, then string comparisons and
-pattern matching in rules,
-field splitting with
-.BR FS ,
-record separating with
-.BR RS ,
-regular expression
-matching with
-.B ~
-and
-.BR !~ ,
-and the
-.BR gensub() ,
-.BR gsub() ,
-.BR index() ,
-.BR match() ,
-.BR split() ,
-and
-.B sub()
-pre-defined functions will all ignore case when doing regular expression
-operations. Thus, if
-.B IGNORECASE
-is not equal to zero,
-.B /aB/
-matches all of the strings \fB"ab"\fP, \fB"aB"\fP, \fB"Ab"\fP,
-and \fB"AB"\fP.
-As with all AWK variables, the initial value of
-.B IGNORECASE
-is zero, so all regular expression and string
-operations are normally case-sensitive.
-Under Unix, the full ISO 8859-1 Latin-1 character set is used
-when ignoring case.
-.B NOTE:
-In versions of
-.I gawk
-prior to 3.0,
-.B IGNORECASE
-only affected regular expression operations. It now affects string
-comparisons as well.
-.TP
-.B NF
-The number of fields in the current input record.
-.TP
-.B NR
-The total number of input records seen so far.
-.TP
-.B OFMT
-The output format for numbers, \fB"%.6g"\fR, by default.
-.TP
-.B OFS
-The output field separator, a space by default.
-.TP
-.B ORS
-The output record separator, by default a newline.
-.TP
-.B RS
-The input record separator, by default a newline.
-.TP
-.B RT
-The record terminator.
-.I Gawk
-sets
-.B RT
-to the input text that matched the character or regular expression
-specified by
-.BR RS .
-.TP
-.B RSTART
-The index of the first character matched by
-.BR match() ;
-0 if no match.
-.TP
-.B RLENGTH
-The length of the string matched by
-.BR match() ;
-\-1 if no match.
-.TP
-.B SUBSEP
-The character used to separate multiple subscripts in array
-elements, by default \fB"\e034"\fR.
-.SS Arrays
-.PP
-Arrays are subscripted with an expression between square brackets
-.RB ( [ " and " ] ).
-If the expression is an expression list
-.RI ( expr ", " expr " ...)"
-then the array subscript is a string consisting of the
-concatenation of the (string) value of each expression,
-separated by the value of the
-.B SUBSEP
-variable.
-This facility is used to simulate multiply dimensioned
-arrays. For example:
-.PP
-.RS
-.ft B
-i = "A";\^ j = "B";\^ k = "C"
-.br
-x[i, j, k] = "hello, world\en"
-.ft R
-.RE
-.PP
-assigns the string \fB"hello, world\en"\fR to the element of the array
-.B x
-which is indexed by the string \fB"A\e034B\e034C"\fR. All arrays in AWK
-are associative, i.e. indexed by string values.
-.PP
-The special operator
-.B in
-may be used in an
-.B if
-or
-.B while
-statement to see if an array has an index consisting of a particular
-value.
-.PP
-.RS
-.ft B
-.nf
-if (val in array)
- print array[val]
-.fi
-.ft
-.RE
-.PP
-If the array has multiple subscripts, use
-.BR "(i, j) in array" .
-.PP
-The
-.B in
-construct may also be used in a
-.B for
-loop to iterate over all the elements of an array.
-.PP
-An element may be deleted from an array using the
-.B delete
-statement.
-The
-.B delete
-statement may also be used to delete the entire contents of an array,
-just by specifying the array name without a subscript.
-.SS Variable Typing And Conversion
-.PP
-Variables and fields
-may be (floating point) numbers, or strings, or both. How the
-value of a variable is interpreted depends upon its context. If used in
-a numeric expression, it will be treated as a number, if used as a string
-it will be treated as a string.
-.PP
-To force a variable to be treated as a number, add 0 to it; to force it
-to be treated as a string, concatenate it with the null string.
-.PP
-When a string must be converted to a number, the conversion is accomplished
-using
-.IR atof (3).
-A number is converted to a string by using the value of
-.B CONVFMT
-as a format string for
-.IR sprintf (3),
-with the numeric value of the variable as the argument.
-However, even though all numbers in AWK are floating-point,
-integral values are
-.I always
-converted as integers. Thus, given
-.PP
-.RS
-.ft B
-.nf
-CONVFMT = "%2.2f"
-a = 12
-b = a ""
-.fi
-.ft R
-.RE
-.PP
-the variable
-.B b
-has a string value of \fB"12"\fR and not \fB"12.00"\fR.
-.PP
-.I Gawk
-performs comparisons as follows:
-If two variables are numeric, they are compared numerically.
-If one value is numeric and the other has a string value that is a
-``numeric string,'' then comparisons are also done numerically.
-Otherwise, the numeric value is converted to a string and a string
-comparison is performed.
-Two strings are compared, of course, as strings.
-According to the \*(PX standard, even if two strings are
-numeric strings, a numeric comparison is performed. However, this is
-clearly incorrect, and
-.I gawk
-does not do this.
-.PP
-Note that string constants, such as \fB"57"\fP, are
-.I not
-numeric strings, they are string constants. The idea of ``numeric string''
-only applies to fields,
-.B getline
-input,
-.BR FILENAME ,
-.B ARGV
-elements,
-.B ENVIRON
-elements and the elements of an array created by
-.B split()
-that are numeric strings.
-The basic idea is that
-.IR "user input" ,
-and only user input, that looks numeric,
-should be treated that way.
-.PP
-Uninitialized variables have the numeric value 0 and the string value ""
-(the null, or empty, string).
-.SH PATTERNS AND ACTIONS
-AWK is a line oriented language. The pattern comes first, and then the
-action. Action statements are enclosed in
-.B {
-and
-.BR } .
-Either the pattern may be missing, or the action may be missing, but,
-of course, not both. If the pattern is missing, the action will be
-executed for every single record of input.
-A missing action is equivalent to
-.RS
-.PP
-.B "{ print }"
-.RE
-.PP
-which prints the entire record.
-.PP
-Comments begin with the ``#'' character, and continue until the
-end of the line.
-Blank lines may be used to separate statements.
-Normally, a statement ends with a newline, however, this is not the
-case for lines ending in
-a ``,'',
-.BR { ,
-.BR ? ,
-.BR : ,
-.BR && ,
-or
-.BR || .
-Lines ending in
-.B do
-or
-.B else
-also have their statements automatically continued on the following line.
-In other cases, a line can be continued by ending it with a ``\e'',
-in which case the newline will be ignored.
-.PP
-Multiple statements may
-be put on one line by separating them with a ``;''.
-This applies to both the statements within the action part of a
-pattern-action pair (the usual case),
-and to the pattern-action statements themselves.
-.SS Patterns
-AWK patterns may be one of the following:
-.PP
-.RS
-.nf
-.B BEGIN
-.B END
-.BI / "regular expression" /
-.I "relational expression"
-.IB pattern " && " pattern
-.IB pattern " || " pattern
-.IB pattern " ? " pattern " : " pattern
-.BI ( pattern )
-.BI ! " pattern"
-.IB pattern1 ", " pattern2
-.fi
-.RE
-.PP
-.B BEGIN
-and
-.B END
-are two special kinds of patterns which are not tested against
-the input.
-The action parts of all
-.B BEGIN
-patterns are merged as if all the statements had
-been written in a single
-.B BEGIN
-block. They are executed before any
-of the input is read. Similarly, all the
-.B END
-blocks are merged,
-and executed when all the input is exhausted (or when an
-.B exit
-statement is executed).
-.B BEGIN
-and
-.B END
-patterns cannot be combined with other patterns in pattern expressions.
-.B BEGIN
-and
-.B END
-patterns cannot have missing action parts.
-.PP
-For
-.BI / "regular expression" /
-patterns, the associated statement is executed for each input record that matches
-the regular expression.
-Regular expressions are the same as those in
-.IR egrep (1),
-and are summarized below.
-.PP
-A
-.I "relational expression"
-may use any of the operators defined below in the section on actions.
-These generally test whether certain fields match certain regular expressions.
-.PP
-The
-.BR && ,
-.BR || ,
-and
-.B !
-operators are logical AND, logical OR, and logical NOT, respectively, as in C.
-They do short-circuit evaluation, also as in C, and are used for combining
-more primitive pattern expressions. As in most languages, parentheses
-may be used to change the order of evaluation.
-.PP
-The
-.B ?\^:
-operator is like the same operator in C. If the first pattern is true
-then the pattern used for testing is the second pattern, otherwise it is
-the third. Only one of the second and third patterns is evaluated.
-.PP
-The
-.IB pattern1 ", " pattern2
-form of an expression is called a
-.IR "range pattern" .
-It matches all input records starting with a record that matches
-.IR pattern1 ,
-and continuing until a record that matches
-.IR pattern2 ,
-inclusive. It does not combine with any other sort of pattern expression.
-.SS Regular Expressions
-Regular expressions are the extended kind found in
-.IR egrep .
-They are composed of characters as follows:
-.TP \w'\fB[^\fIabc...\fB]\fR'u+2n
-.I c
-matches the non-metacharacter
-.IR c .
-.TP
-.I \ec
-matches the literal character
-.IR c .
-.TP
-.B .
-matches any character
-.I including
-newline.
-.TP
-.B ^
-matches the beginning of a string.
-.TP
-.B $
-matches the end of a string.
-.TP
-.BI [ abc... ]
-character list, matches any of the characters
-.IR abc... .
-.TP
-.BI [^ abc... ]
-negated character list, matches any character except
-.IR abc... .
-.TP
-.IB r1 | r2
-alternation: matches either
-.I r1
-or
-.IR r2 .
-.TP
-.I r1r2
-concatenation: matches
-.IR r1 ,
-and then
-.IR r2 .
-.TP
-.IB r +
-matches one or more
-.IR r 's.
-.TP
-.IB r *
-matches zero or more
-.IR r 's.
-.TP
-.IB r ?
-matches zero or one
-.IR r 's.
-.TP
-.BI ( r )
-grouping: matches
-.IR r .
-.TP
-.PD 0
-.IB r { n }
-.TP
-.PD 0
-.IB r { n ,}
-.TP
-.PD
-.IB r { n , m }
-One or two numbers inside braces denote an
-.IR "interval expression" .
-If there is one number in the braces, the preceding regexp
-.I r
-is repeated
-.I n
-times. If there are two numbers separated by a comma,
-.I r
-is repeated
-.I n
-to
-.I m
-times.
-If there is one number followed by a comma, then
-.I r
-is repeated at least
-.I n
-times.
-.sp .5
-Interval expressions are only available if either
-.B \-\^\-posix
-or
-.B \-\^\-re\-interval
-is specified on the command line.
-.TP
-.B \ey
-matches the empty string at either the beginning or the
-end of a word.
-.TP
-.B \eB
-matches the empty string within a word.
-.TP
-.B \e<
-matches the empty string at the beginning of a word.
-.TP
-.B \e>
-matches the empty string at the end of a word.
-.TP
-.B \ew
-matches any word-constituent character (letter, digit, or underscore).
-.TP
-.B \eW
-matches any character that is not word-constituent.
-.TP
-.B \e`
-matches the empty string at the beginning of a buffer (string).
-.TP
-.B \e'
-matches the empty string at the end of a buffer.
-.PP
-The escape sequences that are valid in string constants (see below)
-are also legal in regular expressions.
-.PP
-.I "Character classes"
-are a new feature introduced in the POSIX standard.
-A character class is a special notation for describing
-lists of characters that have a specific attribute, but where the
-actual characters themselves can vary from country to country and/or
-from character set to character set. For example, the notion of what
-is an alphabetic character differs in the USA and in France.
-.PP
-A character class is only valid in a regexp
-.I inside
-the brackets of a character list. Character classes consist of
-.BR [: ,
-a keyword denoting the class, and
-.BR :] .
-Here are the character
-classes defined by the POSIX standard.
-.TP
-.B [:alnum:]
-Alphanumeric characters.
-.TP
-.B [:alpha:]
-Alphabetic characters.
-.TP
-.B [:blank:]
-Space or tab characters.
-.TP
-.B [:cntrl:]
-Control characters.
-.TP
-.B [:digit:]
-Numeric characters.
-.TP
-.B [:graph:]
-Characters that are both printable and visible.
-(A space is printable, but not visible, while an
-.B a
-is both.)
-.TP
-.B [:lower:]
-Lower-case alphabetic characters.
-.TP
-.B [:print:]
-Printable characters (characters that are not control characters.)
-.TP
-.B [:punct:]
-Punctuation characters (characters that are not letter, digits,
-control characters, or space characters).
-.TP
-.B [:space:]
-Space characters (such as space, tab, and formfeed, to name a few).
-.TP
-.B [:upper:]
-Upper-case alphabetic characters.
-.TP
-.B [:xdigit:]
-Characters that are hexadecimal digits.
-.PP
-For example, before the POSIX standard, to match alphanumeric
-characters, you would have had to write
-.BR /[A\-Za\-z0\-9]/ .
-If your character set had other alphabetic characters in it, this would not
-match them. With the POSIX character classes, you can write
-.BR /[[:alnum:]]/ ,
-and this will match
-.I all
-the alphabetic and numeric characters in your character set.
-.PP
-Two additional special sequences can appear in character lists.
-These apply to non-ASCII character sets, which can have single symbols
-(called
-.IR "collating elements" )
-that are represented with more than one
-character, as well as several characters that are equivalent for
-.IR collating ,
-or sorting, purposes. (E.g., in French, a plain ``e''
-and a grave-accented e\` are equivalent.)
-.TP
-Collating Symbols
-A collating symbols is a multi-character collating element enclosed in
-.B [.
-and
-.BR .] .
-For example, if
-.B ch
-is a collating element, then
-.B [[.ch.]]
-is a regexp that matches this collating element, while
-.B [ch]
-is a regexp that matches either
-.B c
-or
-.BR h .
-.TP
-Equivalence Classes
-An equivalence class is a locale-specific name for a list of
-characters that are equivalent. The name is enclosed in
-.B [=
-and
-.BR =] .
-For example, the name
-.B e
-might be used to represent all of
-``e,'' ``e\`,'' and ``e\`.''
-In this case,
-.B [[=e=]]
-is a regexp
-that matches any of
-.BR e ,
-.BR e\' ,
-or
-.BR e\` .
-.PP
-These features are very valuable in non-English speaking locales.
-The library functions that
-.I gawk
-uses for regular expression matching
-currently only recognize POSIX character classes; they do not recognize
-collating symbols or equivalence classes.
-.PP
-The
-.BR \ey ,
-.BR \eB ,
-.BR \e< ,
-.BR \e> ,
-.BR \ew ,
-.BR \eW ,
-.BR \e` ,
-and
-.B \e'
-operators are specific to
-.IR gawk ;
-they are extensions based on facilities in the GNU regexp libraries.
-.PP
-The various command line options
-control how
-.I gawk
-interprets characters in regexps.
-.TP
-No options
-In the default case,
-.I gawk
-provide all the facilities of
-POSIX regexps and the GNU regexp operators described above.
-However, interval expressions are not supported.
-.TP
-.B \-\^\-posix
-Only POSIX regexps are supported, the GNU operators are not special.
-(E.g.,
-.B \ew
-matches a literal
-.BR w ).
-Interval expressions are allowed.
-.TP
-.B \-\^\-traditional
-Traditional Unix
-.I awk
-regexps are matched. The GNU operators
-are not special, interval expressions are not available, and neither
-are the POSIX character classes
-.RB ( [[:alnum:]]
-and so on).
-Characters described by octal and hexadecimal escape sequences are
-treated literally, even if they represent regexp metacharacters.
-.TP
-.B \-\^\-re\-interval
-Allow interval expressions in regexps, even if
-.B \-\^\-traditional
-has been provided.
-.SS Actions
-Action statements are enclosed in braces,
-.B {
-and
-.BR } .
-Action statements consist of the usual assignment, conditional, and looping
-statements found in most languages. The operators, control statements,
-and input/output statements
-available are patterned after those in C.
-.SS Operators
-.PP
-The operators in AWK, in order of decreasing precedence, are
-.PP
-.TP "\w'\fB*= /= %= ^=\fR'u+1n"
-.BR ( \&... )
-Grouping
-.TP
-.B $
-Field reference.
-.TP
-.B "++ \-\^\-"
-Increment and decrement, both prefix and postfix.
-.TP
-.B ^
-Exponentiation (\fB**\fR may also be used, and \fB**=\fR for
-the assignment operator).
-.TP
-.B "+ \- !"
-Unary plus, unary minus, and logical negation.
-.TP
-.B "* / %"
-Multiplication, division, and modulus.
-.TP
-.B "+ \-"
-Addition and subtraction.
-.TP
-.I space
-String concatenation.
-.TP
-.PD 0
-.B "< >"
-.TP
-.PD 0
-.B "<= >="
-.TP
-.PD
-.B "!= =="
-The regular relational operators.
-.TP
-.B "~ !~"
-Regular expression match, negated match.
-.B NOTE:
-Do not use a constant regular expression
-.RB ( /foo/ )
-on the left-hand side of a
-.B ~
-or
-.BR !~ .
-Only use one on the right-hand side. The expression
-.BI "/foo/ ~ " exp
-has the same meaning as \fB(($0 ~ /foo/) ~ \fIexp\fB)\fR.
-This is usually
-.I not
-what was intended.
-.TP
-.B in
-Array membership.
-.TP
-.B &&
-Logical AND.
-.TP
-.B ||
-Logical OR.
-.TP
-.B ?:
-The C conditional expression. This has the form
-.IB expr1 " ? " expr2 " : " expr3\c
-\&. If
-.I expr1
-is true, the value of the expression is
-.IR expr2 ,
-otherwise it is
-.IR expr3 .
-Only one of
-.I expr2
-and
-.I expr3
-is evaluated.
-.TP
-.PD 0
-.B "= += \-="
-.TP
-.PD
-.B "*= /= %= ^="
-Assignment. Both absolute assignment
-.BI ( var " = " value )
-and operator-assignment (the other forms) are supported.
-.SS Control Statements
-.PP
-The control statements are
-as follows:
-.PP
-.RS
-.nf
-\fBif (\fIcondition\fB) \fIstatement\fR [ \fBelse\fI statement \fR]
-\fBwhile (\fIcondition\fB) \fIstatement \fR
-\fBdo \fIstatement \fBwhile (\fIcondition\fB)\fR
-\fBfor (\fIexpr1\fB; \fIexpr2\fB; \fIexpr3\fB) \fIstatement\fR
-\fBfor (\fIvar \fBin\fI array\fB) \fIstatement\fR
-\fBbreak\fR
-\fBcontinue\fR
-\fBdelete \fIarray\^\fB[\^\fIindex\^\fB]\fR
-\fBdelete \fIarray\^\fR
-\fBexit\fR [ \fIexpression\fR ]
-\fB{ \fIstatements \fB}
-.fi
-.RE
-.SS "I/O Statements"
-.PP
-The input/output statements are as follows:
-.PP
-.TP "\w'\fBprintf \fIfmt, expr-list\fR'u+1n"
-.BI close( file )
-Close file (or pipe, see below).
-.TP
-.B getline
-Set
-.B $0
-from next input record; set
-.BR NF ,
-.BR NR ,
-.BR FNR .
-.TP
-.BI "getline <" file
-Set
-.B $0
-from next record of
-.IR file ;
-set
-.BR NF .
-.TP
-.BI getline " var"
-Set
-.I var
-from next input record; set
-.BR NR ,
-.BR FNR .
-.TP
-.BI getline " var" " <" file
-Set
-.I var
-from next record of
-.IR file .
-.TP
-.B next
-Stop processing the current input record. The next input record
-is read and processing starts over with the first pattern in the
-AWK program. If the end of the input data is reached, the
-.B END
-block(s), if any, are executed.
-.TP
-.B "nextfile"
-Stop processing the current input file. The next input record read
-comes from the next input file.
-.B FILENAME
-and
-.B ARGIND
-are updated,
-.B FNR
-is reset to 1, and processing starts over with the first pattern in the
-AWK program. If the end of the input data is reached, the
-.B END
-block(s), if any, are executed.
-.B NOTE:
-Earlier versions of gawk used
-.BR "next file" ,
-as two words. While this usage is still recognized, it generates a
-warning message and will eventually be removed.
-.TP
-.B print
-Prints the current record.
-The output record is terminated with the value of the
-.B ORS
-variable.
-.TP
-.BI print " expr-list"
-Prints expressions.
-Each expression is separated by the value of the
-.B OFS
-variable.
-The output record is terminated with the value of the
-.B ORS
-variable.
-.TP
-.BI print " expr-list" " >" file
-Prints expressions on
-.IR file .
-Each expression is separated by the value of the
-.B OFS
-variable. The output record is terminated with the value of the
-.B ORS
-variable.
-.TP
-.BI printf " fmt, expr-list"
-Format and print.
-.TP
-.BI printf " fmt, expr-list" " >" file
-Format and print on
-.IR file .
-.TP
-.BI system( cmd-line )
-Execute the command
-.IR cmd-line ,
-and return the exit status.
-(This may not be available on non-\*(PX systems.)
-.TP
-\&\fBfflush(\fR[\fIfile\^\fR]\fB)\fR
-Flush any buffers associated with the open output file or pipe
-.IR file .
-If
-.I file
-is missing, then standard output is flushed.
-If
-.I file
-is the null string,
-then all open output files and pipes
-have their buffers flushed.
-.PP
-Other input/output redirections are also allowed. For
-.B print
-and
-.BR printf ,
-.BI >> file
-appends output to the
-.IR file ,
-while
-.BI | " command"
-writes on a pipe.
-In a similar fashion,
-.IB command " | getline"
-pipes into
-.BR getline .
-The
-.BR getline
-command will return 0 on end of file, and \-1 on an error.
-.SS The \fIprintf\fP\^ Statement
-.PP
-The AWK versions of the
-.B printf
-statement and
-.B sprintf()
-function
-(see below)
-accept the following conversion specification formats:
-.TP
-.B %c
-An \s-1ASCII\s+1 character.
-If the argument used for
-.B %c
-is numeric, it is treated as a character and printed.
-Otherwise, the argument is assumed to be a string, and the only first
-character of that string is printed.
-.TP
-.PD 0
-.B %d
-.TP
-.PD
-.B %i
-A decimal number (the integer part).
-.TP
-.PD 0
-.B %e
-.TP
-.PD
-.B %E
-A floating point number of the form
-.BR [\-]d.dddddde[+\^\-]dd .
-The
-.B %E
-format uses
-.B E
-instead of
-.BR e .
-.TP
-.B %f
-A floating point number of the form
-.BR [\-]ddd.dddddd .
-.TP
-.PD 0
-.B %g
-.TP
-.PD
-.B %G
-Use
-.B %e
-or
-.B %f
-conversion, whichever is shorter, with nonsignificant zeros suppressed.
-The
-.B %G
-format uses
-.B %E
-instead of
-.BR %e .
-.TP
-.B %o
-An unsigned octal number (again, an integer).
-.TP
-.B %s
-A character string.
-.TP
-.PD 0
-.B %x
-.TP
-.PD
-.B %X
-An unsigned hexadecimal number (an integer).
-.The
-.B %X
-format uses
-.B ABCDEF
-instead of
-.BR abcdef .
-.TP
-.B %%
-A single
-.B %
-character; no argument is converted.
-.PP
-There are optional, additional parameters that may lie between the
-.B %
-and the control letter:
-.TP
-.B \-
-The expression should be left-justified within its field.
-.TP
-.I space
-For numeric conversions, prefix positive values with a space, and
-negative values with a minus sign.
-.TP
-.B +
-The plus sign, used before the width modifier (see below),
-says to always supply a sign for numeric conversions, even if the data
-to be formatted is positive. The
-.B +
-overrides the space modifier.
-.TP
-.B #
-Use an ``alternate form'' for certain control letters.
-For
-.BR %o ,
-supply a leading zero.
-For
-.BR %x ,
-and
-.BR %X ,
-supply a leading
-.BR 0x
-or
-.BR 0X
-for
-a nonzero result.
-For
-.BR %e ,
-.BR %E ,
-and
-.BR %f ,
-the result will always contain a
-decimal point.
-For
-.BR %g ,
-and
-.BR %G ,
-trailing zeros are not removed from the result.
-.TP
-.B 0
-A leading
-.B 0
-(zero) acts as a flag, that indicates output should be
-padded with zeroes instead of spaces.
-This applies even to non-numeric output formats.
-This flag only has an effect when the field width is wider than the
-value to be printed.
-.TP
-.I width
-The field should be padded to this width. The field is normally padded
-with spaces. If the
-.B 0
-flag has been used, it is padded with zeroes.
-.TP
-.BI \&. prec
-A number that specifies the precision to use when printing.
-For the
-.BR %e ,
-.BR %E ,
-and
-.BR %f
-formats, this specifies the
-number of digits you want printed to the right of the decimal point.
-For the
-.BR %g ,
-and
-.B %G
-formats, it specifies the maximum number
-of significant digits. For the
-.BR %d ,
-.BR %o ,
-.BR %i ,
-.BR %u ,
-.BR %x ,
-and
-.B %X
-formats, it specifies the minimum number of
-digits to print. For a string, it specifies the maximum number of
-characters from the string that should be printed.
-.PP
-The dynamic
-.I width
-and
-.I prec
-capabilities of the \*(AN C
-.B printf()
-routines are supported.
-A
-.B *
-in place of either the
-.B width
-or
-.B prec
-specifications will cause their values to be taken from
-the argument list to
-.B printf
-or
-.BR sprintf() .
-.SS Special File Names
-.PP
-When doing I/O redirection from either
-.B print
-or
-.B printf
-into a file,
-or via
-.B getline
-from a file,
-.I gawk
-recognizes certain special filenames internally. These filenames
-allow access to open file descriptors inherited from
-.IR gawk 's
-parent process (usually the shell).
-Other special filenames provide access to information about the running
-.B gawk
-process.
-The filenames are:
-.TP \w'\fB/dev/stdout\fR'u+1n
-.B /dev/pid
-Reading this file returns the process ID of the current process,
-in decimal, terminated with a newline.
-.TP
-.B /dev/ppid
-Reading this file returns the parent process ID of the current process,
-in decimal, terminated with a newline.
-.TP
-.B /dev/pgrpid
-Reading this file returns the process group ID of the current process,
-in decimal, terminated with a newline.
-.TP
-.B /dev/user
-Reading this file returns a single record terminated with a newline.
-The fields are separated with spaces.
-.B $1
-is the value of the
-.IR getuid (2)
-system call,
-.B $2
-is the value of the
-.IR geteuid (2)
-system call,
-.B $3
-is the value of the
-.IR getgid (2)
-system call, and
-.B $4
-is the value of the
-.IR getegid (2)
-system call.
-If there are any additional fields, they are the group IDs returned by
-.IR getgroups (2).
-Multiple groups may not be supported on all systems.
-.TP
-.B /dev/stdin
-The standard input.
-.TP
-.B /dev/stdout
-The standard output.
-.TP
-.B /dev/stderr
-The standard error output.
-.TP
-.BI /dev/fd/\^ n
-The file associated with the open file descriptor
-.IR n .
-.PP
-These are particularly useful for error messages. For example:
-.PP
-.RS
-.ft B
-print "You blew it!" > "/dev/stderr"
-.ft R
-.RE
-.PP
-whereas you would otherwise have to use
-.PP
-.RS
-.ft B
-print "You blew it!" | "cat 1>&2"
-.ft R
-.RE
-.PP
-These file names may also be used on the command line to name data files.
-.SS Numeric Functions
-.PP
-AWK has the following pre-defined arithmetic functions:
-.PP
-.TP \w'\fBsrand(\fR[\fIexpr\^\fR]\fB)\fR'u+1n
-.BI atan2( y , " x" )
-returns the arctangent of
-.I y/x
-in radians.
-.TP
-.BI cos( expr )
-returns the cosine of
-.IR expr ,
-which is in radians.
-.TP
-.BI exp( expr )
-the exponential function.
-.TP
-.BI int( expr )
-truncates to integer.
-.TP
-.BI log( expr )
-the natural logarithm function.
-.TP
-.B rand()
-returns a random number between 0 and 1.
-.TP
-.BI sin( expr )
-returns the sine of
-.IR expr ,
-which is in radians.
-.TP
-.BI sqrt( expr )
-the square root function.
-.TP
-\&\fBsrand(\fR[\fIexpr\^\fR]\fB)\fR
-uses
-.I expr
-as a new seed for the random number generator. If no
-.I expr
-is provided, the time of day will be used.
-The return value is the previous seed for the random
-number generator.
-.SS String Functions
-.PP
-.I Gawk
-has the following pre-defined string functions:
-.PP
-.TP "\w'\fBsprintf(\^\fIfmt\fB\^, \fIexpr-list\^\fB)\fR'u+1n"
-\fBgensub(\fIr\fB, \fIs\fB, \fIh \fR[\fB, \fIt\fR]\fB)\fR
-search the target string
-.I t
-for matches of the regular expression
-.IR r .
-If
-.I h
-is a string beginning with
-.B g
-or
-.BR G ,
-then replace all matches of
-.I r
-with
-.IR s .
-Otherwise,
-.I h
-is a number indicating which match of
-.I r
-to replace.
-If no
-.I t
-is supplied,
-.B $0
-is used instead.
-Within the replacement text
-.IR s ,
-the sequence
-.BI \e n\fR,
-where
-.I n
-is a digit from 1 to 9, may be used to indicate just the text that
-matched the
-.IR n 'th
-parenthesized subexpression. The sequence
-.B \e0
-represents the entire matched text, as does the character
-.BR & .
-Unlike
-.B sub()
-and
-.BR gsub() ,
-the modified string is returned as the result of the function,
-and the original target string is
-.I not
-changed.
-.TP "\w'\fBsprintf(\^\fIfmt\fB\^, \fIexpr-list\^\fB)\fR'u+1n"
-\fBgsub(\fIr\fB, \fIs \fR[\fB, \fIt\fR]\fB)\fR
-for each substring matching the regular expression
-.I r
-in the string
-.IR t ,
-substitute the string
-.IR s ,
-and return the number of substitutions.
-If
-.I t
-is not supplied, use
-.BR $0 .
-An
-.B &
-in the replacement text is replaced with the text that was actually matched.
-Use
-.B \e&
-to get a literal
-.BR & .
-See
-.I "AWK Language Programming"
-for a fuller discussion of the rules for
-.BR &'s
-and backslashes in the replacement text of
-.BR sub() ,
-.BR gsub() ,
-and
-.BR gensub() .
-.TP
-.BI index( s , " t" )
-returns the index of the string
-.I t
-in the string
-.IR s ,
-or 0 if
-.I t
-is not present.
-.TP
-\fBlength(\fR[\fIs\fR]\fB)
-returns the length of the string
-.IR s ,
-or the length of
-.B $0
-if
-.I s
-is not supplied.
-.TP
-.BI match( s , " r" )
-returns the position in
-.I s
-where the regular expression
-.I r
-occurs, or 0 if
-.I r
-is not present, and sets the values of
-.B RSTART
-and
-.BR RLENGTH .
-.TP
-\fBsplit(\fIs\fB, \fIa \fR[\fB, \fIr\fR]\fB)\fR
-splits the string
-.I s
-into the array
-.I a
-on the regular expression
-.IR r ,
-and returns the number of fields. If
-.I r
-is omitted,
-.B FS
-is used instead.
-The array
-.I a
-is cleared first.
-Splitting behaves identically to field splitting, described above.
-.TP
-.BI sprintf( fmt , " expr-list" )
-prints
-.I expr-list
-according to
-.IR fmt ,
-and returns the resulting string.
-.TP
-\fBsub(\fIr\fB, \fIs \fR[\fB, \fIt\fR]\fB)\fR
-just like
-.BR gsub() ,
-but only the first matching substring is replaced.
-.TP
-\fBsubstr(\fIs\fB, \fIi \fR[\fB, \fIn\fR]\fB)\fR
-returns the at most
-.IR n -character
-substring of
-.I s
-starting at
-.IR i .
-If
-.I n
-is omitted, the rest of
-.I s
-is used.
-.TP
-.BI tolower( str )
-returns a copy of the string
-.IR str ,
-with all the upper-case characters in
-.I str
-translated to their corresponding lower-case counterparts.
-Non-alphabetic characters are left unchanged.
-.TP
-.BI toupper( str )
-returns a copy of the string
-.IR str ,
-with all the lower-case characters in
-.I str
-translated to their corresponding upper-case counterparts.
-Non-alphabetic characters are left unchanged.
-.SS Time Functions
-.PP
-Since one of the primary uses of AWK programs is processing log files
-that contain time stamp information,
-.I gawk
-provides the following two functions for obtaining time stamps and
-formatting them.
-.PP
-.TP "\w'\fBsystime()\fR'u+1n"
-.B systime()
-returns the current time of day as the number of seconds since the Epoch
-(Midnight UTC, January 1, 1970 on \*(PX systems).
-.TP
-\fBstrftime(\fR[\fIformat \fR[\fB, \fItimestamp\fR]]\fB)\fR
-formats
-.I timestamp
-according to the specification in
-.IR format.
-The
-.I timestamp
-should be of the same form as returned by
-.BR systime() .
-If
-.I timestamp
-is missing, the current time of day is used.
-If
-.I format
-is missing, a default format equivalent to the output of
-.IR date (1)
-will be used.
-See the specification for the
-.B strftime()
-function in \*(AN C for the format conversions that are
-guaranteed to be available.
-A public-domain version of
-.IR strftime (3)
-and a man page for it come with
-.IR gawk ;
-if that version was used to build
-.IR gawk ,
-then all of the conversions described in that man page are available to
-.IR gawk.
-.SS String Constants
-.PP
-String constants in AWK are sequences of characters enclosed
-between double quotes (\fB"\fR). Within strings, certain
-.I "escape sequences"
-are recognized, as in C. These are:
-.PP
-.TP \w'\fB\e\^\fIddd\fR'u+1n
-.B \e\e
-A literal backslash.
-.TP
-.B \ea
-The ``alert'' character; usually the \s-1ASCII\s+1 \s-1BEL\s+1 character.
-.TP
-.B \eb
-backspace.
-.TP
-.B \ef
-form-feed.
-.TP
-.B \en
-newline.
-.TP
-.B \er
-carriage return.
-.TP
-.B \et
-horizontal tab.
-.TP
-.B \ev
-vertical tab.
-.TP
-.BI \ex "\^hex digits"
-The character represented by the string of hexadecimal digits following
-the
-.BR \ex .
-As in \*(AN C, all following hexadecimal digits are considered part of
-the escape sequence.
-(This feature should tell us something about language design by committee.)
-E.g., \fB"\ex1B"\fR is the \s-1ASCII\s+1 \s-1ESC\s+1 (escape) character.
-.TP
-.BI \e ddd
-The character represented by the 1-, 2-, or 3-digit sequence of octal
-digits. E.g. \fB"\e033"\fR is the \s-1ASCII\s+1 \s-1ESC\s+1 (escape) character.
-.TP
-.BI \e c
-The literal character
-.IR c\^ .
-.PP
-The escape sequences may also be used inside constant regular expressions
-(e.g.,
-.B "/[\ \et\ef\en\er\ev]/"
-matches whitespace characters).
-.PP
-In compatibility mode, the characters represented by octal and
-hexadecimal escape sequences are treated literally when used in
-regexp constants. Thus,
-.B /a\e52b/
-is equivalent to
-.BR /a\e*b/ .
-.SH FUNCTIONS
-Functions in AWK are defined as follows:
-.PP
-.RS
-\fBfunction \fIname\fB(\fIparameter list\fB) { \fIstatements \fB}\fR
-.RE
-.PP
-Functions are executed when they are called from within expressions
-in either patterns or actions. Actual parameters supplied in the function
-call are used to instantiate the formal parameters declared in the function.
-Arrays are passed by reference, other variables are passed by value.
-.PP
-Since functions were not originally part of the AWK language, the provision
-for local variables is rather clumsy: They are declared as extra parameters
-in the parameter list. The convention is to separate local variables from
-real parameters by extra spaces in the parameter list. For example:
-.PP
-.RS
-.ft B
-.nf
-function f(p, q, a, b) # a & b are local
-{
- \&.....
-}
-
-/abc/ { ... ; f(1, 2) ; ... }
-.fi
-.ft R
-.RE
-.PP
-The left parenthesis in a function call is required
-to immediately follow the function name,
-without any intervening white space.
-This is to avoid a syntactic ambiguity with the concatenation operator.
-This restriction does not apply to the built-in functions listed above.
-.PP
-Functions may call each other and may be recursive.
-Function parameters used as local variables are initialized
-to the null string and the number zero upon function invocation.
-.PP
-Use
-.BI return " expr"
-to return a value from a function. The return value is undefined if no
-value is provided, or if the function returns by ``falling off'' the
-end.
-.PP
-If
-.B \-\^\-lint
-has been provided,
-.I gawk
-will warn about calls to undefined functions at parse time,
-instead of at run time.
-Calling an undefined function at run time is a fatal error.
-.PP
-The word
-.B func
-may be used in place of
-.BR function .
-.SH EXAMPLES
-.nf
-Print and sort the login names of all users:
-
-.ft B
- BEGIN { FS = ":" }
- { print $1 | "sort" }
-
-.ft R
-Count lines in a file:
-
-.ft B
- { nlines++ }
- END { print nlines }
-
-.ft R
-Precede each line by its number in the file:
-
-.ft B
- { print FNR, $0 }
-
-.ft R
-Concatenate and line number (a variation on a theme):
-
-.ft B
- { print NR, $0 }
-.ft R
-.fi
-.SH SEE ALSO
-.IR egrep (1),
-.IR getpid (2),
-.IR getppid (2),
-.IR getpgrp (2),
-.IR getuid (2),
-.IR geteuid (2),
-.IR getgid (2),
-.IR getegid (2),
-.IR getgroups (2)
-.PP
-.IR "The AWK Programming Language" ,
-Alfred V. Aho, Brian W. Kernighan, Peter J. Weinberger,
-Addison-Wesley, 1988. ISBN 0-201-07981-X.
-.PP
-.IR "AWK Language Programming" ,
-Edition 1.0, published by the Free Software Foundation, 1995.
-.SH POSIX COMPATIBILITY
-A primary goal for
-.I gawk
-is compatibility with the \*(PX standard, as well as with the
-latest version of \*(UX
-.IR awk .
-To this end,
-.I gawk
-incorporates the following user visible
-features which are not described in the AWK book,
-but are part of the Bell Labs version of
-.IR awk ,
-and are in the \*(PX standard.
-.PP
-The
-.B \-v
-option for assigning variables before program execution starts is new.
-The book indicates that command line variable assignment happens when
-.I awk
-would otherwise open the argument as a file, which is after the
-.B BEGIN
-block is executed. However, in earlier implementations, when such an
-assignment appeared before any file names, the assignment would happen
-.I before
-the
-.B BEGIN
-block was run. Applications came to depend on this ``feature.''
-When
-.I awk
-was changed to match its documentation, this option was added to
-accommodate applications that depended upon the old behavior.
-(This feature was agreed upon by both the AT&T and GNU developers.)
-.PP
-The
-.B \-W
-option for implementation specific features is from the \*(PX standard.
-.PP
-When processing arguments,
-.I gawk
-uses the special option ``\fB\-\^\-\fP'' to signal the end of
-arguments.
-In compatibility mode, it will warn about, but otherwise ignore,
-undefined options.
-In normal operation, such arguments are passed on to the AWK program for
-it to process.
-.PP
-The AWK book does not define the return value of
-.BR srand() .
-The \*(PX standard
-has it return the seed it was using, to allow keeping track
-of random number sequences. Therefore
-.B srand()
-in
-.I gawk
-also returns its current seed.
-.PP
-Other new features are:
-The use of multiple
-.B \-f
-options (from MKS
-.IR awk );
-the
-.B ENVIRON
-array; the
-.BR \ea ,
-and
-.BR \ev
-escape sequences (done originally in
-.I gawk
-and fed back into AT&T's); the
-.B tolower()
-and
-.B toupper()
-built-in functions (from AT&T); and the \*(AN C conversion specifications in
-.B printf
-(done first in AT&T's version).
-.SH GNU EXTENSIONS
-.I Gawk
-has a number of extensions to \*(PX
-.IR awk .
-They are described in this section. All the extensions described here
-can be disabled by
-invoking
-.I gawk
-with the
-.B \-\^\-traditional
-option.
-.PP
-The following features of
-.I gawk
-are not available in
-\*(PX
-.IR awk .
-.RS
-.TP \w'\(bu'u+1n
-\(bu
-The
-.B \ex
-escape sequence.
-(Disabled with
-.BR \-\^\-posix .)
-.TP \w'\(bu'u+1n
-\(bu
-The
-.B fflush()
-function.
-(Disabled with
-.BR \-\^\-posix .)
-.TP
-\(bu
-The
-.BR systime(),
-.BR strftime(),
-and
-.B gensub()
-functions.
-.TP
-\(bu
-The special file names available for I/O redirection are not recognized.
-.TP
-\(bu
-The
-.BR ARGIND ,
-.BR ERRNO ,
-and
-.B RT
-variables are not special.
-.TP
-\(bu
-The
-.B IGNORECASE
-variable and its side-effects are not available.
-.TP
-\(bu
-The
-.B FIELDWIDTHS
-variable and fixed-width field splitting.
-.TP
-\(bu
-The use of
-.B RS
-as a regular expression.
-.TP
-\(bu
-The ability to split out individual characters using the null string
-as the value of
-.BR FS ,
-and as the third argument to
-.BR split() .
-.TP
-\(bu
-No path search is performed for files named via the
-.B \-f
-option. Therefore the
-.B AWKPATH
-environment variable is not special.
-.TP
-\(bu
-The use of
-.B "nextfile"
-to abandon processing of the current input file.
-.TP
-\(bu
-The use of
-.BI delete " array"
-to delete the entire contents of an array.
-.RE
-.PP
-The AWK book does not define the return value of the
-.B close()
-function.
-.IR Gawk\^ 's
-.B close()
-returns the value from
-.IR fclose (3),
-or
-.IR pclose (3),
-when closing a file or pipe, respectively.
-.PP
-When
-.I gawk
-is invoked with the
-.B \-\^\-traditional
-option,
-if the
-.I fs
-argument to the
-.B \-F
-option is ``t'', then
-.B FS
-will be set to the tab character.
-Note that typing
-.B "gawk \-F\et \&..."
-simply causes the shell to quote the ``t,'', and does not pass
-``\et'' to the
-.B \-F
-option.
-Since this is a rather ugly special case, it is not the default behavior.
-This behavior also does not occur if
-.B \-\^\-posix
-has been specified.
-To really get a tab character as the field separator, it is best to use
-quotes:
-.BR "gawk \-F'\et' \&..." .
-.ig
-.PP
-If
-.I gawk
-was compiled for debugging, it will
-accept the following additional options:
-.TP
-.PD 0
-.B \-Wparsedebug
-.TP
-.PD
-.B \-\^\-parsedebug
-Turn on
-.IR yacc (1)
-or
-.IR bison (1)
-debugging output during program parsing.
-This option should only be of interest to the
-.I gawk
-maintainers, and may not even be compiled into
-.IR gawk .
-..
-.SH HISTORICAL FEATURES
-There are two features of historical AWK implementations that
-.I gawk
-supports.
-First, it is possible to call the
-.B length()
-built-in function not only with no argument, but even without parentheses!
-Thus,
-.RS
-.PP
-.ft B
-a = length # Holy Algol 60, Batman!
-.ft R
-.RE
-.PP
-is the same as either of
-.RS
-.PP
-.ft B
-a = length()
-.br
-a = length($0)
-.ft R
-.RE
-.PP
-This feature is marked as ``deprecated'' in the \*(PX standard, and
-.I gawk
-will issue a warning about its use if
-.B \-\^\-lint
-is specified on the command line.
-.PP
-The other feature is the use of either the
-.B continue
-or the
-.B break
-statements outside the body of a
-.BR while ,
-.BR for ,
-or
-.B do
-loop. Traditional AWK implementations have treated such usage as
-equivalent to the
-.B next
-statement.
-.I Gawk
-will support this usage if
-.B \-\^\-traditional
-has been specified.
-.SH ENVIRONMENT
-If
-.B POSIXLY_CORRECT
-exists in the environment, then
-.I gawk
-behaves exactly as if
-.B \-\^\-posix
-had been specified on the command line.
-If
-.B \-\^\-lint
-has been specified,
-.I gawk
-will issue a warning message to this effect.
-.PP
-The
-.B AWKPATH
-environment variable can be used to provide a list of directories that
-.I gawk
-will search when looking for files named via the
-.B \-f
-and
-.B \-\^\-file
-options.
-.SH BUGS
-The
-.B \-F
-option is not necessary given the command line variable assignment feature;
-it remains only for backwards compatibility.
-.PP
-If your system actually has support for
-.B /dev/fd
-and the associated
-.BR /dev/stdin ,
-.BR /dev/stdout ,
-and
-.B /dev/stderr
-files, you may get different output from
-.I gawk
-than you would get on a system without those files. When
-.I gawk
-interprets these files internally, it synchronizes output to the standard
-output with output to
-.BR /dev/stdout ,
-while on a system with those files, the output is actually to different
-open files.
-Caveat Emptor.
-.PP
-Syntactically invalid single character programs tend to overflow
-the parse stack, generating a rather unhelpful message. Such programs
-are surprisingly difficult to diagnose in the completely general case,
-and the effort to do so really is not worth it.
-.SH VERSION INFORMATION
-This man page documents
-.IR gawk ,
-version 3.0.4.
-.SH AUTHORS
-The original version of \*(UX
-.I awk
-was designed and implemented by Alfred Aho,
-Peter Weinberger, and Brian Kernighan of AT&T Bell Labs. Brian Kernighan
-continues to maintain and enhance it.
-.PP
-Paul Rubin and Jay Fenlason,
-of the Free Software Foundation, wrote
-.IR gawk ,
-to be compatible with the original version of
-.I awk
-distributed in Seventh Edition \*(UX.
-John Woods contributed a number of bug fixes.
-David Trueman, with contributions
-from Arnold Robbins, made
-.I gawk
-compatible with the new version of \*(UX
-.IR awk .
-Arnold Robbins is the current maintainer.
-.PP
-The initial DOS port was done by Conrad Kwok and Scott Garfinkle.
-Scott Deifik is the current DOS maintainer. Pat Rankin did the
-port to VMS, and Michal Jaegermann did the port to the Atari ST.
-The port to OS/2 was done by Kai Uwe Rommel, with contributions and
-help from Darrel Hankerson. Fred Fish supplied support for the Amiga.
-.SH BUG REPORTS
-If you find a bug in
-.IR gawk ,
-please send electronic mail to
-.BR bug-gnu-utils@gnu.org ,
-.I with
-a carbon copy to
-.BR arnold@gnu.org .
-Please include your operating system and its revision, the version of
-.IR gawk ,
-what C compiler you used to compile it, and a test program
-and data that are as small as possible for reproducing the problem.
-.PP
-Before sending a bug report, please do two things. First, verify that
-you have the latest version of
-.IR gawk .
-Many bugs (usually subtle ones) are fixed at each release, and if
-yours is out of date, the problem may already have been solved.
-Second, please read this man page and the reference manual carefully to
-be sure that what you think is a bug really is, instead of just a quirk
-in the language.
-.PP
-Whatever you do, do
-.B NOT
-post a bug report in
-.BR comp.lang.awk .
-While the
-.I gawk
-developers occasionally read this newsgroup, posting bug reports there
-is an unreliable way to report bugs. Instead, please use the electronic mail
-addresses given above.
-.SH ACKNOWLEDGEMENTS
-Brian Kernighan of Bell Labs
-provided valuable assistance during testing and debugging.
-We thank him.
-.SH COPYING PERMISSIONS
-Copyright \(co) 1996,97,98,99 Free Software Foundation, Inc.
-.PP
-Permission is granted to make and distribute verbatim copies of
-this manual page provided the copyright notice and this permission
-notice are preserved on all copies.
-.ig
-Permission is granted to process this file through troff and print the
-results, provided the printed document carries copying permission
-notice identical to this one except for the removal of this paragraph
-(this paragraph not being relevant to the printed manual page).
-..
-.PP
-Permission is granted to copy and distribute modified versions of this
-manual page under the conditions for verbatim copying, provided that
-the entire resulting derived work is distributed under the terms of a
-permission notice identical to this one.
-.PP
-Permission is granted to copy and distribute translations of this
-manual page into another language, under the above conditions for
-modified versions, except that this permission notice may be stated in
-a translation approved by the Foundation.
OpenPOWER on IntegriCloud