Diffstat (limited to 'contrib/awk/doc/gawk.1')
-rw-r--r-- contrib/awk/doc/gawk.1 3322
1 files changed, 0 insertions, 3322 deletions
diff --git a/contrib/awk/doc/gawk.1 b/contrib/awk/doc/gawk.1
deleted file mode 100644
index 3e3c62b..0000000
--- a/contrib/awk/doc/gawk.1
+++ /dev/null
@@ -1,3322 +0,0 @@
-.ds PX \s-1POSIX\s+1
-.ds UX \s-1UNIX\s+1
-.ds AN \s-1ANSI\s+1
-.ds GN \s-1GNU\s+1
-.ds AK \s-1AWK\s+1
-.ds EP \fIGAWK: Effective AWK Programming\fP
-.if !\n(.g \{\
-. if !\w|\*(lq| \{\
-. ds lq ``
-. if \w'\(lq' .ds lq "\(lq
-. \}
-. if !\w|\*(rq| \{\
-. ds rq ''
-. if \w'\(rq' .ds rq "\(rq
-. \}
-.\}
-.TH GAWK 1 "May 29 2001" "Free Software Foundation" "Utility Commands"
-.SH NAME
-gawk \- pattern scanning and processing language
-.SH SYNOPSIS
-.B gawk
-[ \*(PX or \*(GN style options ]
-.B \-f
-.I program-file
-[
-.B \-\^\-
-] file .\|.\|.
-.br
-.B gawk
-[ \*(PX or \*(GN style options ]
-[
-.B \-\^\-
-]
-.I program-text
-file .\|.\|.
-.sp
-.B pgawk
-[ \*(PX or \*(GN style options ]
-.B \-f
-.I program-file
-[
-.B \-\^\-
-] file .\|.\|.
-.br
-.B pgawk
-[ \*(PX or \*(GN style options ]
-[
-.B \-\^\-
-]
-.I program-text
-file .\|.\|.
-.SH DESCRIPTION
-.I Gawk
-is the \*(GN Project's implementation of the \*(AK programming language.
-It conforms to the definition of the language in
-the \*(PX 1003.2 Command Language And Utilities Standard.
-This version in turn is based on the description in
-.IR "The AWK Programming Language" ,
-by Aho, Kernighan, and Weinberger,
-with the additional features found in the System V Release 4 version
-of \*(UX
-.IR awk .
-.I Gawk
-also provides more recent Bell Laboratories
-.I awk
-extensions, and a number of \*(GN-specific extensions.
-.PP
-.I Pgawk
-is the profiling version of
-.IR gawk .
-It is identical in every way to
-.IR gawk ,
-except that programs run more slowly,
-and it automatically produces an execution profile in the file
-.B awkprof.out
-when done.
-See the
-.B \-\^\-profile
-option, below.
-.PP
-The command line consists of options to
-.I gawk
-itself, the \*(AK program text (if not supplied via the
-.B \-f
-or
-.B \-\^\-file
-options), and values to be made
-available in the
-.B ARGC
-and
-.B ARGV
-pre-defined \*(AK variables.
-.SH OPTION FORMAT
-.PP
-.I Gawk
-options may be either traditional \*(PX one letter options,
-or \*(GN style long options. \*(PX options start with a single \*(lq\-\*(rq,
-while long options start with \*(lq\-\^\-\*(rq.
-Long options are provided for both \*(GN-specific features and
-for \*(PX-mandated features.
-.PP
-Following the \*(PX standard,
-.IR gawk -specific
-options are supplied via arguments to the
-.B \-W
-option. Multiple
-.B \-W
-options may be supplied.
-Each
-.B \-W
-option has a corresponding long option, as detailed below.
-Arguments to long options are either joined with the option
-by an
-.B =
-sign, with no intervening spaces, or they may be provided in the
-next command line argument.
-Long options may be abbreviated, as long as the abbreviation
-remains unique.
-.SH OPTIONS
-.PP
-.I Gawk
-accepts the following options, listed alphabetically.
-.TP
-.PD 0
-.BI \-F " fs"
-.TP
-.PD
-.BI \-\^\-field-separator " fs"
-Use
-.I fs
-for the input field separator (the value of the
-.B FS
-predefined
-variable).
-.TP
-.PD 0
-\fB\-v\fI var\fB\^=\^\fIval\fR
-.TP
-.PD
-\fB\-\^\-assign \fIvar\fB\^=\^\fIval\fR
-Assign the value
-.I val
-to the variable
-.IR var ,
-before execution of the program begins.
-Such variable values are available to the
-.B BEGIN
-block of an \*(AK program.
-.TP
-.PD 0
-.BI \-f " program-file"
-.TP
-.PD
-.BI \-\^\-file " program-file"
-Read the \*(AK program source from the file
-.IR program-file ,
-instead of from the first command line argument.
-Multiple
-.B \-f
-(or
-.BR \-\^\-file )
-options may be used.
-.TP
-.PD 0
-.BI \-mf " NNN"
-.TP
-.PD
-.BI \-mr " NNN"
-Set various memory limits to the value
-.IR NNN .
-The
-.B f
-flag sets the maximum number of fields, and the
-.B r
-flag sets the maximum record size. These two flags and the
-.B \-m
-option are from the Bell Laboratories research version of \*(UX
-.IR awk .
-They are ignored by
-.IR gawk ,
-since
-.I gawk
-has no pre-defined limits.
-.TP
-.PD 0
-.B "\-W compat"
-.TP
-.PD 0
-.B "\-W traditional"
-.TP
-.PD 0
-.B \-\^\-compat
-.TP
-.PD
-.B \-\^\-traditional
-Run in
-.I compatibility
-mode. In compatibility mode,
-.I gawk
-behaves identically to \*(UX
-.IR awk ;
-none of the \*(GN-specific extensions are recognized.
-The use of
-.B \-\^\-traditional
-is preferred over the other forms of this option.
-See
-.BR "GNU EXTENSIONS" ,
-below, for more information.
-.TP
-.PD 0
-.B "\-W copyleft"
-.TP
-.PD 0
-.B "\-W copyright"
-.TP
-.PD 0
-.B \-\^\-copyleft
-.TP
-.PD
-.B \-\^\-copyright
-Print the short version of the \*(GN copyright information message on
-the standard output and exit successfully.
-.TP
-.PD 0
-\fB\-W dump-variables\fR[\fB=\fIfile\fR]
-.TP
-.PD
-\fB\-\^\-dump-variables\fR[\fB=\fIfile\fR]
-Print a sorted list of global variables, their types and final values to
-.IR file .
-If no
-.I file
-is provided,
-.I gawk
-uses a file named
-.I awkvars.out
-in the current directory.
-.sp .5
-Having a list of all the global variables is a good way to look for
-typographical errors in your programs.
-You would also use this option if you have a large program with a lot of
-functions, and you want to be sure that your functions don't
-inadvertently use global variables that you meant to be local.
-(This is a particularly easy mistake to make with simple variable
-names like
-.BR i ,
-.BR j ,
-and so on.)
-.TP
-.PD 0
-.B "\-W help"
-.TP
-.PD 0
-.B "\-W usage"
-.TP
-.PD 0
-.B \-\^\-help
-.TP
-.PD
-.B \-\^\-usage
-Print a relatively short summary of the available options on
-the standard output.
-(Per the
-.IR "GNU Coding Standards" ,
-these options cause an immediate, successful exit.)
-.TP
-.PD 0
-.BR "\-W lint" [ =fatal ]
-.TP
-.PD
-.BR \-\^\-lint [ =fatal ]
-Provide warnings about constructs that are
-dubious or non-portable to other \*(AK implementations.
-With an optional argument of
-.BR fatal ,
-lint warnings become fatal errors.
-This may be drastic, but its use will certainly encourage the
-development of cleaner \*(AK programs.
-.TP
-.PD 0
-.B "\-W lint\-old"
-.TP
-.PD
-.B \-\^\-lint\-old
-Provide warnings about constructs that are
-not portable to the original version of \*(UX
-.IR awk .
-.TP
-.PD 0
-.B "\-W gen\-po"
-.TP
-.PD
-.B \-\^\-gen\-po
-Scan and parse the \*(AK program, and generate a \*(GN
-.B \&.po
-format file on standard output with entries for all localizable
-strings in the program. The program itself is not executed.
-See the \*(GN
-.I gettext
-distribution for more information on
-.B \&.po
-files.
-.TP
-.PD 0
-.B "\-W non\-decimal\-data"
-.TP
-.PD
-.B "\-\^\-non\-decimal\-data"
-Recognize octal and hexadecimal values in input data.
-.I "Use this option with great caution!"
-.ig
-.\" This option is left undocumented, on purpose.
-.TP
-.PD 0
-.B "\-W nostalgia"
-.TP
-.PD
-.B \-\^\-nostalgia
-Provide a moment of nostalgia for long time
-.I awk
-users.
-..
-.TP
-.PD 0
-.B "\-W posix"
-.TP
-.PD
-.B \-\^\-posix
-This turns on
-.I compatibility
-mode, with the following additional restrictions:
-.RS
-.TP "\w'\(bu'u+1n"
-\(bu
-.B \ex
-escape sequences are not recognized.
-.TP
-\(bu
-Only space and tab act as field separators when
-.B FS
-is set to a single space; newline does not.
-.TP
-\(bu
-You cannot continue lines after
-.B ?
-and
-.BR : .
-.TP
-\(bu
-The synonym
-.B func
-for the keyword
-.B function
-is not recognized.
-.TP
-\(bu
-The operators
-.B **
-and
-.B **=
-cannot be used in place of
-.B ^
-and
-.BR ^= .
-.TP
-\(bu
-The
-.B fflush()
-function is not available.
-.RE
-.TP
-.PD 0
-\fB\-W profile\fR[\fB=\fIprof_file\fR]
-.TP
-.PD
-\fB\-\^\-profile\fR[\fB=\fIprof_file\fR]
-Send profiling data to
-.IR prof_file .
-The default is
-.BR awkprof.out .
-When run with
-.IR gawk ,
-the profile is just a \*(lqpretty printed\*(rq version of the program.
-When run with
-.IR pgawk ,
-the profile contains execution counts of each statement in the program
-in the left margin and function call counts for each user-defined function.
-.TP
-.PD 0
-.B "\-W re\-interval"
-.TP
-.PD
-.B \-\^\-re\-interval
-Enable the use of
-.I "interval expressions"
-in regular expression matching
-(see
-.BR "Regular Expressions" ,
-below).
-Interval expressions were not traditionally available in the
-\*(AK language. The \*(PX standard added them, to make
-.I awk
-and
-.I egrep
-consistent with each other.
-However, their use is likely
-to break old \*(AK programs, so
-.I gawk
-only provides them if they are requested with this option, or when
-.B \-\^\-posix
-is specified.
-.TP
-.PD 0
-.BI "\-W source " program-text
-.TP
-.PD
-.BI \-\^\-source " program-text"
-Use
-.I program-text
-as \*(AK program source code.
-This option allows the easy intermixing of library functions (used via the
-.B \-f
-and
-.B \-\^\-file
-options) with source code entered on the command line.
-It is intended primarily for medium to large \*(AK programs used
-in shell scripts.
-.TP
-.PD 0
-.B "\-W version"
-.TP
-.PD
-.B \-\^\-version
-Print version information for this particular copy of
-.I gawk
-on the standard output.
-This is useful mainly for knowing if the current copy of
-.I gawk
-on your system
-is up to date with respect to whatever the Free Software Foundation
-is distributing.
-This is also useful when reporting bugs.
-(Per the
-.IR "GNU Coding Standards" ,
-these options cause an immediate, successful exit.)
-.TP
-.PD 0
-.B \-\^\-
-Signal the end of options. This is useful to allow further arguments to the
-\*(AK program itself to start with a \*(lq\-\*(rq.
-This is mainly for consistency with the argument parsing convention used
-by most other \*(PX programs.
-.PP
-In compatibility mode,
-any other options are flagged as invalid, but are otherwise ignored.
-In normal operation, as long as program text has been supplied, unknown
-options are passed on to the \*(AK program in the
-.B ARGV
-array for processing. This is particularly useful for running \*(AK
-programs via the \*(lq#!\*(rq executable interpreter mechanism.
-.SH AWK PROGRAM EXECUTION
-.PP
-An \*(AK program consists of a sequence of pattern-action statements
-and optional function definitions.
-.RS
-.PP
-\fIpattern\fB { \fIaction statements\fB }\fR
-.br
-\fBfunction \fIname\fB(\fIparameter list\fB) { \fIstatements\fB }\fR
-.RE
-.PP
-.I Gawk
-first reads the program source from the
-.IR program-file (s)
-if specified,
-from arguments to
-.BR \-\^\-source ,
-or from the first non-option argument on the command line.
-The
-.B \-f
-and
-.B \-\^\-source
-options may be used multiple times on the command line.
-.I Gawk
-reads the program text as if all the
-.IR program-file s
-and command line source texts
-had been concatenated together. This is useful for building libraries
-of \*(AK functions, without having to include them in each new \*(AK
-program that uses them. It also provides the ability to mix library
-functions with command line programs.
-.PP
-The environment variable
-.B AWKPATH
-specifies a search path to use when finding source files named with
-the
-.B \-f
-option. If this variable does not exist, the default path is
-\fB".:/usr/local/share/awk"\fR.
-(The actual directory may vary, depending upon how
-.I gawk
-was built and installed.)
-If a file name given to the
-.B \-f
-option contains a \*(lq/\*(rq character, no path search is performed.
-.PP
-.I Gawk
-executes \*(AK programs in the following order.
-First,
-all variable assignments specified via the
-.B \-v
-option are performed.
-Next,
-.I gawk
-compiles the program into an internal form.
-Then,
-.I gawk
-executes the code in the
-.B BEGIN
-block(s) (if any),
-and then proceeds to read
-each file named in the
-.B ARGV
-array.
-If there are no files named on the command line,
-.I gawk
-reads the standard input.
-.PP
-If a filename on the command line has the form
-.IB var = val
-it is treated as a variable assignment. The variable
-.I var
-will be assigned the value
-.IR val .
-(This happens after any
-.B BEGIN
-block(s) have been run.)
-Command line variable assignment
-is most useful for dynamically assigning values to the variables
-\*(AK uses to control how input is broken into fields and records.
-It is also useful for controlling state if multiple passes are needed over
-a single data file.
-.PP
-If the value of a particular element of
-.B ARGV
-is empty (\fB""\fR),
-.I gawk
-skips over it.
-.PP
-For each record in the input,
-.I gawk
-tests to see if it matches any
-.I pattern
-in the \*(AK program.
-For each pattern that the record matches, the associated
-.I action
-is executed.
-The patterns are tested in the order they occur in the program.
-.PP
-Finally, after all the input is exhausted,
-.I gawk
-executes the code in the
-.B END
-block(s) (if any).
-.SH VARIABLES, RECORDS AND FIELDS
-\*(AK variables are dynamic; they come into existence when they are
-first used. Their values are either floating-point numbers or strings,
-or both,
-depending upon how they are used. \*(AK also has one dimensional
-arrays; arrays with multiple dimensions may be simulated.
-Several pre-defined variables are set as a program
-runs; these will be described as needed and summarized below.
-.SS Records
-Normally, records are separated by newline characters. You can control how
-records are separated by assigning values to the built-in variable
-.BR RS .
-If
-.B RS
-is any single character, that character separates records.
-Otherwise,
-.B RS
-is a regular expression. Text in the input that matches this
-regular expression separates records.
-However, in compatibility mode,
-only the first character of its string
-value is used for separating records.
-If
-.B RS
-is set to the null string, then records are separated by
-blank lines.
-When
-.B RS
-is set to the null string, the newline character always acts as
-a field separator, in addition to whatever value
-.B FS
-may have.
-.SS Fields
-.PP
-As each input record is read,
-.I gawk
-splits the record into
-.IR fields ,
-using the value of the
-.B FS
-variable as the field separator.
-If
-.B FS
-is a single character, fields are separated by that character.
-If
-.B FS
-is the null string, then each individual character becomes a
-separate field.
-Otherwise,
-.B FS
-is expected to be a full regular expression.
-In the special case that
-.B FS
-is a single space, fields are separated
-by runs of spaces and/or tabs and/or newlines.
-(But see the discussion of
-.BR \-\^\-posix ,
-above).
-.B NOTE:
-The value of
-.B IGNORECASE
-(see below) also affects how fields are split when
-.B FS
-is a regular expression, and how records are separated when
-.B RS
-is a regular expression.
-.PP
-If the
-.B FIELDWIDTHS
-variable is set to a space separated list of numbers, each field is
-expected to have fixed width, and
-.I gawk
-splits up the record using the specified widths. The value of
-.B FS
-is ignored.
-Assigning a new value to
-.B FS
-overrides the use of
-.BR FIELDWIDTHS ,
-and restores the default behavior.
-.PP
-Each field in the input record may be referenced by its position,
-.BR $1 ,
-.BR $2 ,
-and so on.
-.B $0
-is the whole record.
-Fields need not be referenced by constants:
-.RS
-.PP
-.ft B
-n = 5
-.br
-print $n
-.ft R
-.RE
-.PP
-prints the fifth field in the input record.
-.PP
-The variable
-.B NF
-is set to the total number of fields in the input record.
-.PP
-References to non-existent fields (i.e. fields after
-.BR $NF )
-produce the null-string. However, assigning to a non-existent field
-(e.g.,
-.BR "$(NF+2) = 5" )
-increases the value of
-.BR NF ,
-creates any intervening fields with the null string as their value, and
-causes the value of
-.B $0
-to be recomputed, with the fields being separated by the value of
-.BR OFS .
-References to negative numbered fields cause a fatal error.
-Decrementing
-.B NF
-causes the values of fields past the new value to be lost, and the value of
-.B $0
-to be recomputed, with the fields being separated by the value of
-.BR OFS .
-.PP
-Assigning a value to an existing field
-causes the whole record to be rebuilt when
-.B $0
-is referenced.
-Similarly, assigning a value to
-.B $0
-causes the record to be resplit, creating new
-values for the fields.
-.SS Built-in Variables
-.PP
-.IR Gawk\^ "'s"
-built-in variables are:
-.PP
-.TP "\w'\fBFIELDWIDTHS\fR'u+1n"
-.B ARGC
-The number of command line arguments (does not include options to
-.IR gawk ,
-or the program source).
-.TP
-.B ARGIND
-The index in
-.B ARGV
-of the current file being processed.
-.TP
-.B ARGV
-Array of command line arguments. The array is indexed from
-0 to
-.B ARGC
-\- 1.
-Dynamically changing the contents of
-.B ARGV
-can control the files used for data.
-.TP
-.B BINMODE
-On non-\*(PX systems, specifies use of \*(lqbinary\*(rq mode for all file I/O.
-Numeric values of 1, 2, or 3, specify that input files, output files, or
-all files, respectively, should use binary I/O.
-String values of \fB"r"\fR, or \fB"w"\fR specify that input files, or output files,
-respectively, should use binary I/O.
-String values of \fB"rw"\fR or \fB"wr"\fR specify that all files
-should use binary I/O.
-Any other string value is treated as \fB"rw"\fR, but generates a warning message.
-.TP
-.B CONVFMT
-The conversion format for numbers, \fB"%.6g"\fR, by default.
-.TP
-.B ENVIRON
-An array containing the values of the current environment.
-The array is indexed by the environment variables, each element being
-the value of that variable (e.g., \fBENVIRON["HOME"]\fP might be
-.BR /home/arnold ).
-Changing this array does not affect the environment seen by programs which
-.I gawk
-spawns via redirection or the
-.B system()
-function.
-.TP
-.B ERRNO
-If a system error occurs either doing a redirection for
-.BR getline ,
-during a read for
-.BR getline ,
-or during a
-.BR close() ,
-then
-.B ERRNO
-will contain
-a string describing the error.
-The value is subject to translation in non-English locales.
-.TP
-.B FIELDWIDTHS
-A white-space separated list of field widths. When set,
-.I gawk
-parses the input into fields of fixed width, instead of using the
-value of the
-.B FS
-variable as the field separator.
-.TP
-.B FILENAME
-The name of the current input file.
-If no files are specified on the command line, the value of
-.B FILENAME
-is \*(lq\-\*(rq.
-However,
-.B FILENAME
-is undefined inside the
-.B BEGIN
-block
-(unless set by
-.BR getline ).
-.TP
-.B FNR
-The input record number in the current input file.
-.TP
-.B FS
-The input field separator, a space by default. See
-.BR Fields ,
-above.
-.TP
-.B IGNORECASE
-Controls the case-sensitivity of all regular expression
-and string operations. If
-.B IGNORECASE
-has a non-zero value, then string comparisons and
-pattern matching in rules,
-field splitting with
-.BR FS ,
-record separating with
-.BR RS ,
-regular expression
-matching with
-.B ~
-and
-.BR !~ ,
-and the
-.BR gensub() ,
-.BR gsub() ,
-.BR index() ,
-.BR match() ,
-.BR split() ,
-and
-.B sub()
-built-in functions all ignore case when doing regular expression
-operations.
-.B NOTE:
-Array subscripting is
-.I not
-affected, nor is the
-.B asort()
-function.
-.sp .5
-Thus, if
-.B IGNORECASE
-is not equal to zero,
-.B /aB/
-matches all of the strings \fB"ab"\fP, \fB"aB"\fP, \fB"Ab"\fP,
-and \fB"AB"\fP.
-As with all \*(AK variables, the initial value of
-.B IGNORECASE
-is zero, so all regular expression and string
-operations are normally case-sensitive.
-Under Unix, the full ISO 8859-1 Latin-1 character set is used
-when ignoring case.
-.TP
-.B LINT
-Provides dynamic control of the
-.B \-\^\-lint
-option from within an \*(AK program.
-When true,
-.I gawk
-prints lint warnings. When false, it does not.
-When assigned the string value \fB"fatal"\fP,
-lint warnings become fatal errors, exactly like
-.BR \-\^\-lint=fatal .
-Any other true value just prints warnings.
-.TP
-.B NF
-The number of fields in the current input record.
-.TP
-.B NR
-The total number of input records seen so far.
-.TP
-.B OFMT
-The output format for numbers, \fB"%.6g"\fR, by default.
-.TP
-.B OFS
-The output field separator, a space by default.
-.TP
-.B ORS
-The output record separator, by default a newline.
-.TP
-.B PROCINFO
-The elements of this array provide access to information about the
-running \*(AK program.
-On some systems,
-there may be elements in the array, \fB"group1"\fP through
-\fB"group\fIn\fB"\fR for some
-.IR n ,
-which is the number of supplementary groups that the process has.
-Use the
-.B in
-operator to test for these elements.
-The following elements are guaranteed to be available:
-.RS
-.TP \w'\fBPROCINFO["pgrpid"]\fR'u+1n
-\fBPROCINFO["egid"]\fP
-the value of the
-.IR getegid (2)
-system call.
-.TP
-\fBPROCINFO["euid"]\fP
-the value of the
-.IR geteuid (2)
-system call.
-.TP
-\fBPROCINFO["FS"]\fP
-\fB"FS"\fP if field splitting with
-.B FS
-is in effect, or \fB"FIELDWIDTHS"\fP if field splitting with
-.B FIELDWIDTHS
-is in effect.
-.TP
-\fBPROCINFO["gid"]\fP
-the value of the
-.IR getgid (2)
-system call.
-.TP
-\fBPROCINFO["pgrpid"]\fP
-the process group ID of the current process.
-.TP
-\fBPROCINFO["pid"]\fP
-the process ID of the current process.
-.TP
-\fBPROCINFO["ppid"]\fP
-the parent process ID of the current process.
-.TP
-\fBPROCINFO["uid"]\fP
-the value of the
-.IR getuid (2)
-system call.
-.RE
-.TP
-.B RS
-The input record separator, by default a newline.
-.TP
-.B RT
-The record terminator.
-.I Gawk
-sets
-.B RT
-to the input text that matched the character or regular expression
-specified by
-.BR RS .
-.TP
-.B RSTART
-The index of the first character matched by
-.BR match() ;
-0 if no match.
-.TP
-.B RLENGTH
-The length of the string matched by
-.BR match() ;
-\-1 if no match.
-.TP
-.B SUBSEP
-The character used to separate multiple subscripts in array
-elements, by default \fB"\e034"\fR.
-.TP
-.B TEXTDOMAIN
-The text domain of the \*(AK program; used to find the localized
-translations for the program's strings.
-.SS Arrays
-.PP
-Arrays are subscripted with an expression between square brackets
-.RB ( [ " and " ] ).
-If the expression is an expression list
-.RI ( expr ", " expr " .\|.\|.)"
-then the array subscript is a string consisting of the
-concatenation of the (string) value of each expression,
-separated by the value of the
-.B SUBSEP
-variable.
-This facility is used to simulate multiply dimensioned
-arrays. For example:
-.PP
-.RS
-.ft B
-i = "A";\^ j = "B";\^ k = "C"
-.br
-x[i, j, k] = "hello, world\en"
-.ft R
-.RE
-.PP
-assigns the string \fB"hello, world\en"\fR to the element of the array
-.B x
-which is indexed by the string \fB"A\e034B\e034C"\fR. All arrays in \*(AK
-are associative, i.e. indexed by string values.
-.PP
-The special operator
-.B in
-may be used in an
-.B if
-or
-.B while
-statement to see if an array has an index consisting of a particular
-value.
-.PP
-.RS
-.ft B
-.nf
-if (val in array)
- print array[val]
-.fi
-.ft
-.RE
-.PP
-If the array has multiple subscripts, use
-.BR "(i, j) in array" .
-.PP
-The
-.B in
-construct may also be used in a
-.B for
-loop to iterate over all the elements of an array.
-.PP
-An element may be deleted from an array using the
-.B delete
-statement.
-The
-.B delete
-statement may also be used to delete the entire contents of an array,
-just by specifying the array name without a subscript.
-.SS Variable Typing And Conversion
-.PP
-Variables and fields
-may be (floating point) numbers, or strings, or both. How the
-value of a variable is interpreted depends upon its context. If used in
-a numeric expression, it will be treated as a number; if used as a string,
-it will be treated as a string.
-.PP
-To force a variable to be treated as a number, add 0 to it; to force it
-to be treated as a string, concatenate it with the null string.
-.PP
-When a string must be converted to a number, the conversion is accomplished
-using
-.IR strtod (3).
-A number is converted to a string by using the value of
-.B CONVFMT
-as a format string for
-.IR sprintf (3),
-with the numeric value of the variable as the argument.
-However, even though all numbers in \*(AK are floating-point,
-integral values are
-.I always
-converted as integers. Thus, given
-.PP
-.RS
-.ft B
-.nf
-CONVFMT = "%2.2f"
-a = 12
-b = a ""
-.fi
-.ft R
-.RE
-.PP
-the variable
-.B b
-has a string value of \fB"12"\fR and not \fB"12.00"\fR.
-.PP
-.I Gawk
-performs comparisons as follows:
-If two variables are numeric, they are compared numerically.
-If one value is numeric and the other has a string value that is a
-\*(lqnumeric string,\*(rq then comparisons are also done numerically.
-Otherwise, the numeric value is converted to a string and a string
-comparison is performed.
-Two strings are compared, of course, as strings.
-Note that the \*(PX standard applies the concept of
-\*(lqnumeric string\*(rq everywhere, even to string constants.
-However, this is
-clearly incorrect, and
-.I gawk
-does not do this.
-(Fortunately, this is fixed in the next version of the standard.)
-.PP
-Note that string constants, such as \fB"57"\fP, are
-.I not
-numeric strings; they are string constants.
-The idea of \*(lqnumeric string\*(rq
-only applies to fields,
-.B getline
-input,
-.BR FILENAME ,
-.B ARGV
-elements,
-.B ENVIRON
-elements and the elements of an array created by
-.B split()
-that are numeric strings.
-The basic idea is that
-.IR "user input" ,
-and only user input, that looks numeric,
-should be treated that way.
-.PP
-Uninitialized variables have the numeric value 0 and the string value ""
-(the null, or empty, string).
-.SS Octal and Hexadecimal Constants
-Starting with version 3.1 of
-.IR gawk ,
-you may use C-style octal and hexadecimal constants in your \*(AK
-program source code.
-For example, the octal value
-.B 011
-is equal to decimal
-.BR 9 ,
-and the hexadecimal value
-.B 0x11
-is equal to decimal 17.
-.SS String Constants
-.PP
-String constants in \*(AK are sequences of characters enclosed
-between double quotes (\fB"\fR). Within strings, certain
-.I "escape sequences"
-are recognized, as in C. These are:
-.PP
-.TP "\w'\fB\e\^\fIddd\fR'u+1n"
-.B \e\e
-A literal backslash.
-.TP
-.B \ea
-The \*(lqalert\*(rq character; usually the \s-1ASCII\s+1 \s-1BEL\s+1 character.
-.TP
-.B \eb
-backspace.
-.TP
-.B \ef
-form-feed.
-.TP
-.B \en
-newline.
-.TP
-.B \er
-carriage return.
-.TP
-.B \et
-horizontal tab.
-.TP
-.B \ev
-vertical tab.
-.TP
-.BI \ex "\^hex digits"
-The character represented by the string of hexadecimal digits following
-the
-.BR \ex .
-As in \*(AN C, all following hexadecimal digits are considered part of
-the escape sequence.
-(This feature should tell us something about language design by committee.)
-E.g., \fB"\ex1B"\fR is the \s-1ASCII\s+1 \s-1ESC\s+1 (escape) character.
-.TP
-.BI \e ddd
-The character represented by the 1-, 2-, or 3-digit sequence of octal
-digits.
-E.g., \fB"\e033"\fR is the \s-1ASCII\s+1 \s-1ESC\s+1 (escape) character.
-.TP
-.BI \e c
-The literal character
-.IR c\^ .
-.PP
-The escape sequences may also be used inside constant regular expressions
-(e.g.,
-.B "/[\ \et\ef\en\er\ev]/"
-matches whitespace characters).
-.PP
-In compatibility mode, the characters represented by octal and
-hexadecimal escape sequences are treated literally when used in
-regular expression constants. Thus,
-.B /a\e52b/
-is equivalent to
-.BR /a\e*b/ .
-.SH PATTERNS AND ACTIONS
-\*(AK is a line-oriented language. The pattern comes first, and then the
-action. Action statements are enclosed in
-.B {
-and
-.BR } .
-Either the pattern may be missing, or the action may be missing, but,
-of course, not both. If the pattern is missing, the action is
-executed for every single record of input.
-A missing action is equivalent to
-.RS
-.PP
-.B "{ print }"
-.RE
-.PP
-which prints the entire record.
-.PP
-Comments begin with the \*(lq#\*(rq character, and continue until the
-end of the line.
-Blank lines may be used to separate statements.
-Normally, a statement ends with a newline; however, this is not the
-case for lines ending in
-a \*(lq,\*(rq,
-.BR { ,
-.BR ? ,
-.BR : ,
-.BR && ,
-or
-.BR || .
-Lines ending in
-.B do
-or
-.B else
-also have their statements automatically continued on the following line.
-In other cases, a line can be continued by ending it with a \*(lq\e\*(rq,
-in which case the newline will be ignored.
-.PP
-Multiple statements may
-be put on one line by separating them with a \*(lq;\*(rq.
-This applies to both the statements within the action part of a
-pattern-action pair (the usual case),
-and to the pattern-action statements themselves.
-.SS Patterns
-\*(AK patterns may be one of the following:
-.PP
-.RS
-.nf
-.B BEGIN
-.B END
-.BI / "regular expression" /
-.I "relational expression"
-.IB pattern " && " pattern
-.IB pattern " || " pattern
-.IB pattern " ? " pattern " : " pattern
-.BI ( pattern )
-.BI ! " pattern"
-.IB pattern1 ", " pattern2
-.fi
-.RE
-.PP
-.B BEGIN
-and
-.B END
-are two special kinds of patterns which are not tested against
-the input.
-The action parts of all
-.B BEGIN
-patterns are merged as if all the statements had
-been written in a single
-.B BEGIN
-block. They are executed before any
-of the input is read. Similarly, all the
-.B END
-blocks are merged,
-and executed when all the input is exhausted (or when an
-.B exit
-statement is executed).
-.B BEGIN
-and
-.B END
-patterns cannot be combined with other patterns in pattern expressions.
-.B BEGIN
-and
-.B END
-patterns cannot have missing action parts.
-.PP
-For
-.BI / "regular expression" /
-patterns, the associated statement is executed for each input record that matches
-the regular expression.
-Regular expressions are the same as those in
-.IR egrep (1),
-and are summarized below.
-.PP
-A
-.I "relational expression"
-may use any of the operators defined below in the section on actions.
-These generally test whether certain fields match certain regular expressions.
-.PP
-The
-.BR && ,
-.BR || ,
-and
-.B !
-operators are logical AND, logical OR, and logical NOT, respectively, as in C.
-They do short-circuit evaluation, also as in C, and are used for combining
-more primitive pattern expressions. As in most languages, parentheses
-may be used to change the order of evaluation.
-.PP
-The
-.B ?\^:
-operator is like the same operator in C. If the first pattern is true
-then the pattern used for testing is the second pattern, otherwise it is
-the third. Only one of the second and third patterns is evaluated.
-.PP
-The
-.IB pattern1 ", " pattern2
-form of an expression is called a
-.IR "range pattern" .
-It matches all input records starting with a record that matches
-.IR pattern1 ,
-and continuing until a record that matches
-.IR pattern2 ,
-inclusive. It does not combine with any other sort of pattern expression.
-.SS Regular Expressions
-Regular expressions are the extended kind found in
-.IR egrep .
-They are composed of characters as follows:
-.TP "\w'\fB[^\fIabc.\|.\|.\fB]\fR'u+2n"
-.I c
-matches the non-metacharacter
-.IR c .
-.TP
-.I \ec
-matches the literal character
-.IR c .
-.TP
-.B .
-matches any character
-.I including
-newline.
-.TP
-.B ^
-matches the beginning of a string.
-.TP
-.B $
-matches the end of a string.
-.TP
-.BI [ abc.\|.\|. ]
-character list, matches any of the characters
-.IR abc.\|.\|. .
-.TP
-.BI [^ abc.\|.\|. ]
-negated character list, matches any character except
-.IR abc.\|.\|. .
-.TP
-.IB r1 | r2
-alternation: matches either
-.I r1
-or
-.IR r2 .
-.TP
-.I r1r2
-concatenation: matches
-.IR r1 ,
-and then
-.IR r2 .
-.TP
-.IB r\^ +
-matches one or more
-.IR r\^ "'s."
-.TP
-.IB r *
-matches zero or more
-.IR r\^ "'s."
-.TP
-.IB r\^ ?
-matches zero or one
-.IR r\^ "'s."
-.TP
-.BI ( r )
-grouping: matches
-.IR r .
-.TP
-.PD 0
-.IB r { n }
-.TP
-.PD 0
-.IB r { n ,}
-.TP
-.PD
-.IB r { n , m }
-One or two numbers inside braces denote an
-.IR "interval expression" .
-If there is one number in the braces, the preceding regular expression
-.I r
-is repeated
-.I n
-times. If there are two numbers separated by a comma,
-.I r
-is repeated
-.I n
-to
-.I m
-times.
-If there is one number followed by a comma, then
-.I r
-is repeated at least
-.I n
-times.
-.sp .5
-Interval expressions are only available if either
-.B \-\^\-posix
-or
-.B \-\^\-re\-interval
-is specified on the command line.
-.TP
-.B \ey
-matches the empty string at either the beginning or the
-end of a word.
-.TP
-.B \eB
-matches the empty string within a word.
-.TP
-.B \e<
-matches the empty string at the beginning of a word.
-.TP
-.B \e>
-matches the empty string at the end of a word.
-.TP
-.B \ew
-matches any word-constituent character (letter, digit, or underscore).
-.TP
-.B \eW
-matches any character that is not word-constituent.
-.TP
-.B \e`
-matches the empty string at the beginning of a buffer (string).
-.TP
-.B \e'
-matches the empty string at the end of a buffer.
-.PP
-The escape sequences that are valid in string constants (see below)
-are also valid in regular expressions.
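-.PP
-The operators listed above can be tried out with a short sketch.
-The pattern and the input words are invented for illustration; only
-anchors, grouping, and the + and ? operators are used, so no command
-line option should be needed.

```shell
# Keep only records made of one or more "ab" groups,
# optionally followed by a single "c", and nothing else.
regex_out=$(printf 'ab\nababc\nabx\nc\n' | awk '/^(ab)+c?$/')
printf '%s\n' "$regex_out"
```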
-.PP
-.I "Character classes"
-are a new feature introduced in the \*(PX standard.
-A character class is a special notation for describing
-lists of characters that have a specific attribute, but where the
-actual characters themselves can vary from country to country and/or
-from character set to character set. For example, the notion of what
-is an alphabetic character differs in the USA and in France.
-.PP
-A character class is only valid in a regular expression
-.I inside
-the brackets of a character list. Character classes consist of
-.BR [: ,
-a keyword denoting the class, and
-.BR :] .
-The character
-classes defined by the \*(PX standard are:
-.TP "\w'\fB[:alnum:]\fR'u+2n"
-.B [:alnum:]
-Alphanumeric characters.
-.TP
-.B [:alpha:]
-Alphabetic characters.
-.TP
-.B [:blank:]
-Space or tab characters.
-.TP
-.B [:cntrl:]
-Control characters.
-.TP
-.B [:digit:]
-Numeric characters.
-.TP
-.B [:graph:]
-Characters that are both printable and visible.
-(A space is printable, but not visible, while an
-.B a
-is both.)
-.TP
-.B [:lower:]
-Lower-case alphabetic characters.
-.TP
-.B [:print:]
-Printable characters (characters that are not control characters).
-.TP
-.B [:punct:]
-Punctuation characters (characters that are not letters, digits,
-control characters, or space characters).
-.TP
-.B [:space:]
-Space characters (such as space, tab, and formfeed, to name a few).
-.TP
-.B [:upper:]
-Upper-case alphabetic characters.
-.TP
-.B [:xdigit:]
-Characters that are hexadecimal digits.
-.PP
-For example, before the \*(PX standard, to match alphanumeric
-characters, you would have had to write
-.BR /[A\-Za\-z0\-9]/ .
-If your character set had other alphabetic characters in it, this would not
-match them, and if your character set collated differently from
-\s-1ASCII\s+1, this might not even match the
-\s-1ASCII\s+1 alphanumeric characters.
-With the \*(PX character classes, you can write
-.BR /[[:alnum:]]/ ,
-and this matches
-the alphabetic and numeric characters in your character set.
-.PP
-Two additional special sequences can appear in character lists.
-These apply to non-\s-1ASCII\s+1 character sets, which can have single symbols
-(called
-.IR "collating elements" )
-that are represented with more than one
-character, as well as several characters that are equivalent for
-.IR collating ,
-or sorting, purposes. (E.g., in French, a plain \*(lqe\*(rq
-and a grave-accented e\` are equivalent.)
-.TP
-Collating Symbols
-A collating symbol is a multi-character collating element enclosed in
-.B [.
-and
-.BR .] .
-For example, if
-.B ch
-is a collating element, then
-.B [[.ch.]]
-is a regular expression that matches this collating element, while
-.B [ch]
-is a regular expression that matches either
-.B c
-or
-.BR h .
-.TP
-Equivalence Classes
-An equivalence class is a locale-specific name for a list of
-characters that are equivalent. The name is enclosed in
-.B [=
-and
-.BR =] .
-For example, the name
-.B e
-might be used to represent all of
-\*(lqe,\*(rq \*(lqe\h'-\w:e:u'\',\*(rq and \*(lqe\h'-\w:e:u'\`.\*(rq
-In this case,
-.B [[=e=]]
-is a regular expression
-that matches any of
-.BR e ,
-....BR "e\'" ,
-.BR "e\h'-\w:e:u'\'" ,
-or
-....BR "e\`" .
-.BR "e\h'-\w:e:u'\`" .
-.PP
-These features are very valuable in non-English speaking locales.
-The library functions that
-.I gawk
-uses for regular expression matching
-currently only recognize \*(PX character classes; they do not recognize
-collating symbols or equivalence classes.
-.PP
-The
-.BR \ey ,
-.BR \eB ,
-.BR \e< ,
-.BR \e> ,
-.BR \ew ,
-.BR \eW ,
-.BR \e` ,
-and
-.B \e'
-operators are specific to
-.IR gawk ;
-they are extensions based on facilities in the \*(GN regular expression libraries.
-.PP
-The various command line options
-control how
-.I gawk
-interprets characters in regular expressions.
-.TP
-No options
-In the default case,
-.I gawk
-provides all the facilities of
-\*(PX regular expressions and the \*(GN regular expression operators described above.
-However, interval expressions are not supported.
-.TP
-.B \-\^\-posix
-Only \*(PX regular expressions are supported; the \*(GN operators are not special.
-(E.g.,
-.B \ew
-matches a literal
-.BR w ).
-Interval expressions are allowed.
-.TP
-.B \-\^\-traditional
-Traditional Unix
-.I awk
-regular expressions are matched. The \*(GN operators
-are not special, interval expressions are not available, and neither
-are the \*(PX character classes
-.RB ( [[:alnum:]]
-and so on).
-Characters described by octal and hexadecimal escape sequences are
-treated literally, even if they represent regular expression metacharacters.
-.TP
-.B \-\^\-re\-interval
-Allow interval expressions in regular expressions, even if
-.B \-\^\-traditional
-has been provided.
-.SS Actions
-Action statements are enclosed in braces,
-.B {
-and
-.BR } .
-Action statements consist of the usual assignment, conditional, and looping
-statements found in most languages. The operators, control statements,
-and input/output statements
-available are patterned after those in C.
-.SS Operators
-.PP
-The operators in \*(AK, in order of decreasing precedence, are
-.PP
-.TP "\w'\fB*= /= %= ^=\fR'u+1n"
-.BR ( \&.\|.\|. )
-Grouping
-.TP
-.B $
-Field reference.
-.TP
-.B "++ \-\^\-"
-Increment and decrement, both prefix and postfix.
-.TP
-.B ^
-Exponentiation (\fB**\fR may also be used, and \fB**=\fR for
-the assignment operator).
-.TP
-.B "+ \- !"
-Unary plus, unary minus, and logical negation.
-.TP
-.B "* / %"
-Multiplication, division, and modulus.
-.TP
-.B "+ \-"
-Addition and subtraction.
-.TP
-.I space
-String concatenation.
-.TP
-.PD 0
-.B "< >"
-.TP
-.PD 0
-.B "<= >="
-.TP
-.PD
-.B "!= =="
-The regular relational operators.
-.TP
-.B "~ !~"
-Regular expression match, negated match.
-.B NOTE:
-Do not use a constant regular expression
-.RB ( /foo/ )
-on the left-hand side of a
-.B ~
-or
-.BR !~ .
-Only use one on the right-hand side. The expression
-.BI "/foo/ ~ " exp
-has the same meaning as \fB(($0 ~ /foo/) ~ \fIexp\fB)\fR.
-This is usually
-.I not
-what was intended.
-.TP
-.B in
-Array membership.
-.TP
-.B &&
-Logical AND.
-.TP
-.B ||
-Logical OR.
-.TP
-.B ?:
-The C conditional expression. This has the form
-.IB expr1 " ? " expr2 " : " expr3\c
-\&.
-If
-.I expr1
-is true, the value of the expression is
-.IR expr2 ,
-otherwise it is
-.IR expr3 .
-Only one of
-.I expr2
-and
-.I expr3
-is evaluated.
-.TP
-.PD 0
-.B "= += \-="
-.TP
-.PD
-.B "*= /= %= ^="
-Assignment. Both absolute assignment
-.BI ( var " = " value )
-and operator-assignment (the other forms) are supported.
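-.PP
-A few consequences of this precedence order can be checked with a
-hedged sketch; the literal values and the variable x are arbitrary
-choices for illustration.

```shell
# Each print shows one consequence of the precedence table:
#   1 " " 2 + 3      addition binds tighter than concatenation
#   2 + 3 * 4        multiplication binds tighter than addition
#   x > 3 ? .. : ..  the conditional yields exactly one of its branches
# The conditional is parenthesized so that ">" is not taken as a redirection.
prec_out=$(awk 'BEGIN {
    print 1 " " 2 + 3
    print 2 + 3 * 4
    x = 5
    print (x > 3 ? "big" : "small")
}')
printf '%s\n' "$prec_out"
```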
-.SS Control Statements
-.PP
-The control statements are
-as follows:
-.PP
-.RS
-.nf
-\fBif (\fIcondition\fB) \fIstatement\fR [ \fBelse\fI statement \fR]
-\fBwhile (\fIcondition\fB) \fIstatement \fR
-\fBdo \fIstatement \fBwhile (\fIcondition\fB)\fR
-\fBfor (\fIexpr1\fB; \fIexpr2\fB; \fIexpr3\fB) \fIstatement\fR
-\fBfor (\fIvar \fBin\fI array\fB) \fIstatement\fR
-\fBbreak\fR
-\fBcontinue\fR
-\fBdelete \fIarray\^\fB[\^\fIindex\^\fB]\fR
-\fBdelete \fIarray\^\fR
-\fBexit\fR [ \fIexpression\fR ]
-\fB{ \fIstatements \fB}
-.fi
-.RE
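-.PP
-Several of these statements can be combined in one small sketch.
-The array name, its contents, and the loop bounds are made up for
-illustration.

```shell
# split() fills arr[1..4]; delete removes one element;
# for (k in arr) then visits the three that remain (in no particular order);
# the do-while body always runs at least once.
ctl_out=$(awk 'BEGIN {
    n = split("a b c d", arr, " ")
    delete arr[2]
    count = 0
    for (k in arr)
        count++
    i = 0
    do {
        i++
    } while (i < 3)
    print n, count, i
}')
printf '%s\n' "$ctl_out"
```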
-.SS "I/O Statements"
-.PP
-The input/output statements are as follows:
-.PP
-.TP "\w'\fBprintf \fIfmt, expr-list\fR'u+1n"
-\fBclose(\fIfile \fR[\fB, \fIhow\fR]\fB)\fR
-Close file, pipe or co-process.
-The optional
-.I how
-should only be used when closing one end of a
-two-way pipe to a co-process.
-It must be a string value, either
-\fB"to"\fR or \fB"from"\fR.
-.TP
-.B getline
-Set
-.B $0
-from next input record; set
-.BR NF ,
-.BR NR ,
-.BR FNR .
-.TP
-.BI "getline <" file
-Set
-.B $0
-from next record of
-.IR file ;
-set
-.BR NF .
-.TP
-.BI getline " var"
-Set
-.I var
-from next input record; set
-.BR NR ,
-.BR FNR .
-.TP
-.BI getline " var" " <" file
-Set
-.I var
-from next record of
-.IR file .
-.TP
-\fIcommand\fB | getline \fR[\fIvar\fR]
-Run
-.I command
-piping the output either into
-.B $0
-or
-.IR var ,
-as above.
-.TP
-\fIcommand\fB |& getline \fR[\fIvar\fR]
-Run
-.I command
-as a co-process
-piping the output either into
-.B $0
-or
-.IR var ,
-as above.
-Co-processes are a
-.I gawk
-extension.
-.TP
-.B next
-Stop processing the current input record. The next input record
-is read and processing starts over with the first pattern in the
-\*(AK program. If the end of the input data is reached, the
-.B END
-block(s), if any, are executed.
-.TP
-.B "nextfile"
-Stop processing the current input file. The next input record read
-comes from the next input file.
-.B FILENAME
-and
-.B ARGIND
-are updated,
-.B FNR
-is reset to 1, and processing starts over with the first pattern in the
-\*(AK program. If the end of the input data is reached, the
-.B END
-block(s), if any, are executed.
-.TP
-.B print
-Prints the current record.
-The output record is terminated with the value of the
-.B ORS
-variable.
-.TP
-.BI print " expr-list"
-Prints expressions.
-Each expression is separated by the value of the
-.B OFS
-variable.
-The output record is terminated with the value of the
-.B ORS
-variable.
-.TP
-.BI print " expr-list" " >" file
-Prints expressions on
-.IR file .
-Each expression is separated by the value of the
-.B OFS
-variable. The output record is terminated with the value of the
-.B ORS
-variable.
-.TP
-.BI printf " fmt, expr-list"
-Format and print.
-.TP
-.BI printf " fmt, expr-list" " >" file
-Format and print on
-.IR file .
-.TP
-.BI system( cmd-line )
-Execute the command
-.IR cmd-line ,
-and return the exit status.
-(This may not be available on non-\*(PX systems.)
-.TP
-\&\fBfflush(\fR[\fIfile\^\fR]\fB)\fR
-Flush any buffers associated with the open output file or pipe
-.IR file .
-If
-.I file
-is missing, then standard output is flushed.
-If
-.I file
-is the null string,
-then all open output files and pipes
-have their buffers flushed.
-.PP
-Additional output redirections are allowed for
-.B print
-and
-.BR printf .
-.TP
-.BI "print .\|.\|. >>" " file"
-appends output to the
-.IR file .
-.TP
-.BI "print .\|.\|. |" " command"
-writes on a pipe.
-.TP
-.BI "print .\|.\|. |&" " command"
-sends data to a co-process.
-.PP
-The
-.BR getline
-command returns 1 on success, 0 on end of file, and \-1 on an error.
-Upon an error,
-.B ERRNO
-contains a string describing the problem.
-.PP
-.B NOTE:
-If using a pipe or co-process to
-.BR getline ,
-or from
-.B print
-or
-.B printf
-within a loop, you
-.I must
-use
-.B close()
-to create new instances of the command.
-\*(AK does not automatically close pipes or co-processes when
-they return EOF.
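-.PP
-The need for close() can be seen in a small sketch that runs the same
-command twice in a loop.
-The echo command and the variable names are assumptions chosen for
-illustration.

```shell
# Read all output of cmd, then close it so the next loop
# iteration starts a fresh instance of the command.
# Without the close(), the second pass would read nothing.
pipe_out=$(awk 'BEGIN {
    cmd = "echo hello"
    for (i = 1; i <= 2; i++) {
        while ((cmd | getline line) > 0)
            print i, line
        close(cmd)
    }
}')
printf '%s\n' "$pipe_out"
```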
-.SS The \fIprintf\fP\^ Statement
-.PP
-The \*(AK versions of the
-.B printf
-statement and
-.B sprintf()
-function
-(see below)
-accept the following conversion specification formats:
-.TP "\w'\fB%g\fR, \fB%G\fR'u+2n"
-.B %c
-An \s-1ASCII\s+1 character.
-If the argument used for
-.B %c
-is numeric, it is treated as a character and printed.
-Otherwise, the argument is assumed to be a string, and only the first
-character of that string is printed.
-.TP
-.BR "%d" "," " %i"
-A decimal number (the integer part).
-.TP
-.BR "%e" "," " %E"
-A floating point number of the form
-.BR [\-]d.dddddde[+\^\-]dd .
-The
-.B %E
-format uses
-.B E
-instead of
-.BR e .
-.TP
-.B %f
-A floating point number of the form
-.BR [\-]ddd.dddddd .
-.TP
-.BR "%g" "," " %G"
-Use
-.B %e
-or
-.B %f
-conversion, whichever is shorter, with nonsignificant zeros suppressed.
-The
-.B %G
-format uses
-.B %E
-instead of
-.BR %e .
-.TP
-.B %o
-An unsigned octal number (also an integer).
-.TP
-.PD
-.B %u
-An unsigned decimal number (again, an integer).
-.TP
-.B %s
-A character string.
-.TP
-.BR "%x" "," " %X"
-An unsigned hexadecimal number (an integer).
-The
-.B %X
-format uses
-.B ABCDEF
-instead of
-.BR abcdef .
-.TP
-.B %%
-A single
-.B %
-character; no argument is converted.
-.PP
-Optional, additional parameters may lie between the
-.B %
-and the control letter:
-.TP
-.IB count $
-Use the
-.IR count "'th"
-argument at this point in the formatting.
-This is called a
-.I "positional specifier"
-and
-is intended primarily for use in translated versions of
-format strings, not in the original text of an AWK program.
-It is a
-.I gawk
-extension.
-.TP
-.B \-
-The expression should be left-justified within its field.
-.TP
-.I space
-For numeric conversions, prefix positive values with a space, and
-negative values with a minus sign.
-.TP
-.B +
-The plus sign, used before the width modifier (see below),
-says to always supply a sign for numeric conversions, even if the data
-to be formatted is positive. The
-.B +
-overrides the space modifier.
-.TP
-.B #
-Use an \*(lqalternate form\*(rq for certain control letters.
-For
-.BR %o ,
-supply a leading zero.
-For
-.BR %x ,
-and
-.BR %X ,
-supply a leading
-.BR 0x
-or
-.BR 0X
-for
-a nonzero result.
-For
-.BR %e ,
-.BR %E ,
-and
-.BR %f ,
-the result always contains a
-decimal point.
-For
-.BR %g ,
-and
-.BR %G ,
-trailing zeros are not removed from the result.
-.TP
-.B 0
-A leading
-.B 0
-(zero) acts as a flag that indicates output should be
-padded with zeroes instead of spaces.
-This applies even to non-numeric output formats.
-This flag only has an effect when the field width is wider than the
-value to be printed.
-.TP
-.I width
-The field should be padded to this width. The field is normally padded
-with spaces. If the
-.B 0
-flag has been used, it is padded with zeroes.
-.TP
-.BI \&. prec
-A number that specifies the precision to use when printing.
-For the
-.BR %e ,
-.BR %E ,
-and
-.BR %f
-formats, this specifies the
-number of digits you want printed to the right of the decimal point.
-For the
-.BR %g ,
-and
-.B %G
-formats, it specifies the maximum number
-of significant digits. For the
-.BR %d ,
-.BR %o ,
-.BR %i ,
-.BR %u ,
-.BR %x ,
-and
-.B %X
-formats, it specifies the minimum number of
-digits to print. For
-.BR %s ,
-it specifies the maximum number of
-characters from the string that should be printed.
-.PP
-The dynamic
-.I width
-and
-.I prec
-capabilities of the \*(AN C
-.B printf()
-routines are supported.
-A
-.B *
-in place of either the
-.B width
-or
-.B prec
-specifications causes their values to be taken from
-the argument list to
-.B printf
-or
-.BR sprintf() .
-To use a positional specifier with a dynamic width or precision,
-supply the
-.IB count $
-after the
-.B *
-in the format string.
-For example, \fB"%3$*2$.*1$s"\fP.
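-.PP
-As a hedged sketch of the width, precision, and flag modifiers, the
-command below formats a handful of arbitrary values.
-Positional specifiers are left out because they are a
-.I gawk
-extension, and the C locale is forced so that the decimal point is a
-period.

```shell
# %5.2f  width 5, two digits after the decimal point
# %-5s   left-justified in a five-character field
# %05d   zero-padded to width 5
# %x     unsigned hexadecimal
# %c     a numeric argument is printed as a character
# %.3s   at most three characters of the string
fmt_out=$(LC_ALL=C awk 'BEGIN {
    printf "%5.2f|%-5s|%05d|%x|%c|%.3s\n", 3.14159, "ab", 42, 255, 65, "abcdef"
}')
printf '%s\n' "$fmt_out"
```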
-.SS Special File Names
-.PP
-When doing I/O redirection from either
-.B print
-or
-.B printf
-into a file,
-or via
-.B getline
-from a file,
-.I gawk
-recognizes certain special filenames internally. These filenames
-allow access to open file descriptors inherited from
-.IR gawk\^ "'s"
-parent process (usually the shell).
-These file names may also be used on the command line to name data files.
-The filenames are:
-.TP "\w'\fB/dev/stdout\fR'u+1n"
-.B /dev/stdin
-The standard input.
-.TP
-.B /dev/stdout
-The standard output.
-.TP
-.B /dev/stderr
-The standard error output.
-.TP
-.BI /dev/fd/\^ n
-The file associated with the open file descriptor
-.IR n .
-.PP
-These are particularly useful for error messages. For example:
-.PP
-.RS
-.ft B
-print "You blew it!" > "/dev/stderr"
-.ft R
-.RE
-.PP
-whereas you would otherwise have to use
-.PP
-.RS
-.ft B
-print "You blew it!" | "cat 1>&2"
-.ft R
-.RE
-.PP
-The following special filenames may be used with the
-.B |&
-co-process operator for creating TCP/IP network connections.
-.TP "\w'\fB/inet/tcp/\fIlport\fB/\fIrhost\fB/\fIrport\fR'u+2n"
-.BI /inet/tcp/ lport / rhost / rport
-File for TCP/IP connection on local port
-.I lport
-to
-remote host
-.I rhost
-on remote port
-.IR rport .
-Use a port of
-.B 0
-to have the system pick a port.
-.TP
-.BI /inet/udp/ lport / rhost / rport
-Similar, but use UDP/IP instead of TCP/IP.
-.TP
-.BI /inet/raw/ lport / rhost / rport
-.\" Similar, but use raw IP sockets.
-Reserved for future use.
-.PP
-Other special filenames provide access to information about the running
-.I gawk
-process.
-.B "These filenames are now obsolete."
-Use the
-.B PROCINFO
-array to obtain the information they provide.
-The filenames are:
-.TP "\w'\fB/dev/stdout\fR'u+1n"
-.B /dev/pid
-Reading this file returns the process ID of the current process,
-in decimal, terminated with a newline.
-.TP
-.B /dev/ppid
-Reading this file returns the parent process ID of the current process,
-in decimal, terminated with a newline.
-.TP
-.B /dev/pgrpid
-Reading this file returns the process group ID of the current process,
-in decimal, terminated with a newline.
-.TP
-.B /dev/user
-Reading this file returns a single record terminated with a newline.
-The fields are separated with spaces.
-.B $1
-is the value of the
-.IR getuid (2)
-system call,
-.B $2
-is the value of the
-.IR geteuid (2)
-system call,
-.B $3
-is the value of the
-.IR getgid (2)
-system call, and
-.B $4
-is the value of the
-.IR getegid (2)
-system call.
-If there are any additional fields, they are the group IDs returned by
-.IR getgroups (2).
-Multiple groups may not be supported on all systems.
-.SS Numeric Functions
-.PP
-\*(AK has the following built-in arithmetic functions:
-.PP
-.TP "\w'\fBsrand(\fR[\fIexpr\^\fR]\fB)\fR'u+1n"
-.BI atan2( y , " x" )
-Returns the arctangent of
-.I y/x
-in radians.
-.TP
-.BI cos( expr )
-Returns the cosine of
-.IR expr ,
-which is in radians.
-.TP
-.BI exp( expr )
-The exponential function.
-.TP
-.BI int( expr )
-Truncates to integer.
-.TP
-.BI log( expr )
-The natural logarithm function.
-.TP
-.B rand()
-Returns a random number between 0 and 1.
-.TP
-.BI sin( expr )
-Returns the sine of
-.IR expr ,
-which is in radians.
-.TP
-.BI sqrt( expr )
-The square root function.
-.TP
-\&\fBsrand(\fR[\fIexpr\^\fR]\fB)\fR
-Uses
-.I expr
-as a new seed for the random number generator. If no
-.I expr
-is provided, the time of day is used.
-The return value is the previous seed for the random
-number generator.
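-.PP
-A short sketch of truncation and seeding follows.
-The seed value 42 is an arbitrary assumption; the point is only that
-reseeding with the same value repeats the sequence.

```shell
# int() truncates toward zero, so int(-3.9) is -3, not -4.
# Seeding twice with the same value makes rand() return the same number.
num_out=$(awk 'BEGIN {
    srand(42); a = rand()
    srand(42); b = rand()
    print int(3.9), int(-3.9), sqrt(16), (a == b)
}')
printf '%s\n' "$num_out"
```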
-.SS String Functions
-.PP
-.I Gawk
-has the following built-in string functions:
-.PP
-.TP "\w'\fBsprintf(\^\fIfmt\fB\^, \fIexpr-list\^\fB)\fR'u+1n"
-\fBasort(\fIs \fR[\fB, \fId\fR]\fB)\fR
-Returns the number of elements in the source
-array
-.IR s .
-The contents of
-.I s
-are sorted using
-.IR gawk\^ "'s"
-normal rules for
-comparing values, and the indexes of the
-sorted values of
-.I s
-are replaced with sequential
-integers starting with 1. If the optional
-destination array
-.I d
-is specified, then
-.I s
-is first duplicated into
-.IR d ,
-and then
-.I d
-is sorted, leaving the indexes of the
-source array
-.I s
-unchanged.
-.TP
-\fBgensub(\fIr\fB, \fIs\fB, \fIh \fR[\fB, \fIt\fR]\fB)\fR
-Search the target string
-.I t
-for matches of the regular expression
-.IR r .
-If
-.I h
-is a string beginning with
-.B g
-or
-.BR G ,
-then replace all matches of
-.I r
-with
-.IR s .
-Otherwise,
-.I h
-is a number indicating which match of
-.I r
-to replace.
-If
-.I t
-is not supplied,
-.B $0
-is used instead.
-Within the replacement text
-.IR s ,
-the sequence
-.BI \e n\fR,
-where
-.I n
-is a digit from 1 to 9, may be used to indicate just the text that
-matched the
-.IR n 'th
-parenthesized subexpression. The sequence
-.B \e0
-represents the entire matched text, as does the character
-.BR & .
-Unlike
-.B sub()
-and
-.BR gsub() ,
-the modified string is returned as the result of the function,
-and the original target string is
-.I not
-changed.
-.TP "\w'\fBsprintf(\^\fIfmt\fB\^, \fIexpr-list\^\fB)\fR'u+1n"
-\fBgsub(\fIr\fB, \fIs \fR[\fB, \fIt\fR]\fB)\fR
-For each substring matching the regular expression
-.I r
-in the string
-.IR t ,
-substitute the string
-.IR s ,
-and return the number of substitutions.
-If
-.I t
-is not supplied, use
-.BR $0 .
-An
-.B &
-in the replacement text is replaced with the text that was actually matched.
-Use
-.B \e&
-to get a literal
-.BR & .
-(This must be typed as \fB"\e\e&"\fP;
-see \*(EP
-for a fuller discussion of the rules for
-.BR &'s
-and backslashes in the replacement text of
-.BR sub() ,
-.BR gsub() ,
-and
-.BR gensub() .)
-.TP
-.BI index( s , " t" )
-Returns the index of the string
-.I t
-in the string
-.IR s ,
-or 0 if
-.I t
-is not present.
-.TP
-\fBlength(\fR[\fIs\fR]\fB)
-Returns the length of the string
-.IR s ,
-or the length of
-.B $0
-if
-.I s
-is not supplied.
-.TP
-\fBmatch(\fIs\fB, \fIr \fR[\fB, \fIa\fR]\fB)\fR
-Returns the position in
-.I s
-where the regular expression
-.I r
-occurs, or 0 if
-.I r
-is not present, and sets the values of
-.B RSTART
-and
-.BR RLENGTH .
-Note that the argument order is the same as for the
-.B ~
-operator:
-.IB str " ~"
-.IR re .
-.ft R
-If array
-.I a
-is provided,
-.I a
-is cleared and then elements 1 through
-.I n
-are filled with the portions of
-.I s
-that match the corresponding parenthesized
-subexpression in
-.IR r .
-The 0'th element of
-.I a
-contains the portion
-of
-.I s
-matched by the entire regular expression
-.IR r .
-.TP
-\fBsplit(\fIs\fB, \fIa \fR[\fB, \fIr\fR]\fB)\fR
-Splits the string
-.I s
-into the array
-.I a
-on the regular expression
-.IR r ,
-and returns the number of fields. If
-.I r
-is omitted,
-.B FS
-is used instead.
-The array
-.I a
-is cleared first.
-Splitting behaves identically to field splitting, described above.
-.TP
-.BI sprintf( fmt , " expr-list" )
-Prints
-.I expr-list
-according to
-.IR fmt ,
-and returns the resulting string.
-.TP
-.BI strtonum( str )
-Examines
-.IR str ,
-and returns its numeric value.
-If
-.I str
-begins
-with a leading
-.BR 0 ,
-.B strtonum()
-assumes that
-.I str
-is an octal number.
-If
-.I str
-begins
-with a leading
-.B 0x
-or
-.BR 0X ,
-.B strtonum()
-assumes that
-.I str
-is a hexadecimal number.
-.TP
-\fBsub(\fIr\fB, \fIs \fR[\fB, \fIt\fR]\fB)\fR
-Just like
-.BR gsub() ,
-but only the first matching substring is replaced.
-.TP
-\fBsubstr(\fIs\fB, \fIi \fR[\fB, \fIn\fR]\fB)\fR
-Returns the at most
-.IR n -character
-substring of
-.I s
-starting at
-.IR i .
-If
-.I n
-is omitted, the rest of
-.I s
-is used.
-.TP
-.BI tolower( str )
-Returns a copy of the string
-.IR str ,
-with all the upper-case characters in
-.I str
-translated to their corresponding lower-case counterparts.
-Non-alphabetic characters are left unchanged.
-.TP
-.BI toupper( str )
-Returns a copy of the string
-.IR str ,
-with all the lower-case characters in
-.I str
-translated to their corresponding upper-case counterparts.
-Non-alphabetic characters are left unchanged.
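-.PP
-The sketch below exercises a few of these functions on made-up strings.
-It is restricted to functions that are also in \*(PX
-.IR awk ,
-so
-.BR gensub() ,
-.BR asort() ,
-and
-.B strtonum()
-are not shown.

```shell
# gsub() edits s in place and returns the number of replacements;
# "&" in the replacement stands for the matched text.
# match() sets RSTART and RLENGTH as a side effect.
str_out=$(awk 'BEGIN {
    s = "hello world"
    n = gsub(/o/, "[&]", s)
    print n, s
    print index("hello world", "wor"), length("abc"), substr("abcdef", 2, 3), toupper("abc")
    k = split("a:b:c", parts, ":")
    print k, parts[3]
    if (match("foobar", /o+/))
        print RSTART, RLENGTH
}')
printf '%s\n' "$str_out"
```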
-.SS Time Functions
-Since one of the primary uses of \*(AK programs is processing log files
-that contain time stamp information,
-.I gawk
-provides the following functions for obtaining time stamps and
-formatting them.
-.PP
-.TP "\w'\fBsystime()\fR'u+1n"
-\fBmktime(\fIdatespec\fB)\fR
-Turns
-.I datespec
-into a time stamp of the same form as returned by
-.BR systime() .
-The
-.I datespec
-is a string of the form
-.IR "YYYY MM DD HH MM SS[ DST]" .
-The contents of the string are six or seven numbers representing respectively
-the full year including century,
-the month from 1 to 12,
-the day of the month from 1 to 31,
-the hour of the day from 0 to 23,
-the minute from 0 to 59,
-and the second from 0 to 60,
-and an optional daylight saving flag.
-The values of these numbers need not be within the ranges specified;
-for example, an hour of \-1 means 1 hour before midnight.
-The origin-zero Gregorian calendar is assumed,
-with year 0 preceding year 1 and year \-1 preceding year 0.
-The time is assumed to be in the local timezone.
-If the daylight saving flag is positive,
-the time is assumed to be daylight saving time;
-if zero, the time is assumed to be standard time;
-and if negative (the default),
-.B mktime()
-attempts to determine whether daylight saving time is in effect
-for the specified time.
-If
-.I datespec
-does not contain enough elements or if the resulting time
-is out of range,
-.B mktime()
-returns \-1.
-.TP
-\fBstrftime(\fR[\fIformat \fR[\fB, \fItimestamp\fR]]\fB)\fR
-Formats
-.I timestamp
-according to the specification in
-.IR format .
-The
-.I timestamp
-should be of the same form as returned by
-.BR systime() .
-If
-.I timestamp
-is missing, the current time of day is used.
-If
-.I format
-is missing, a default format equivalent to the output of
-.IR date (1)
-is used.
-See the specification for the
-.B strftime()
-function in \*(AN C for the format conversions that are
-guaranteed to be available.
-A public-domain version of
-.IR strftime (3)
-and a man page for it come with
-.IR gawk ;
-if that version was used to build
-.IR gawk ,
-then all of the conversions described in that man page are available to
-.IR gawk .
-.TP
-.B systime()
-Returns the current time of day as the number of seconds since the Epoch
-(1970-01-01 00:00:00 UTC on \*(PX systems).
-.SS Bit Manipulation Functions
-Starting with version 3.1 of
-.IR gawk ,
-the following bit manipulation functions are available.
-They work by converting double-precision floating point
-values to
-.B "unsigned long"
-integers, doing the operation, and then converting the
-result back to floating point.
-The functions are:
-.TP "\w'\fBrshift(\fIval\fB, \fIcount\fB)\fR'u+2n"
-\fBand(\fIv1\fB, \fIv2\fB)\fR
-Return the bitwise AND of the values provided by
-.I v1
-and
-.IR v2 .
-.TP
-\fBcompl(\fIval\fB)\fR
-Return the bitwise complement of
-.IR val .
-.TP
-\fBlshift(\fIval\fB, \fIcount\fB)\fR
-Return the value of
-.IR val ,
-shifted left by
-.I count
-bits.
-.TP
-\fBor(\fIv1\fB, \fIv2\fB)\fR
-Return the bitwise OR of the values provided by
-.I v1
-and
-.IR v2 .
-.TP
-\fBrshift(\fIval\fB, \fIcount\fB)\fR
-Return the value of
-.IR val ,
-shifted right by
-.I count
-bits.
-.TP
-\fBxor(\fIv1\fB, \fIv2\fB)\fR
-Return the bitwise XOR of the values provided by
-.I v1
-and
-.IR v2 .
-.PP
-.SS Internationalization Functions
-Starting with version 3.1 of
-.IR gawk ,
-the following functions may be used from within your AWK program for
-translating strings at run-time.
-For full details, see \*(EP.
-.TP
-\fBbindtextdomain(\fIdirectory \fR[\fB, \fIdomain\fR]\fB)\fR
-Specifies the directory where
-.I gawk
-looks for the
-.B \&.mo
-files, in case they
-will not or cannot be placed in the ``standard'' locations
-(e.g., during testing).
-It returns the directory where
-.I domain
-is ``bound.''
-.sp .5
-The default
-.I domain
-is the value of
-.BR TEXTDOMAIN .
-If
-.I directory
-is the null string (\fB""\fR), then
-.B bindtextdomain()
-returns the current binding for the
-given
-.IR domain .
-.TP
-\fBdcgettext(\fIstring \fR[\fB, \fIdomain \fR[\fB, \fIcategory\fR]]\fB)\fR
-Returns the translation of
-.I string
-in
-text domain
-.I domain
-for locale category
-.IR category .
-The default value for
-.I domain
-is the current value of
-.BR TEXTDOMAIN .
-The default value for
-.I category
-is \fB"LC_MESSAGES"\fR.
-.sp .5
-If you supply a value for
-.IR category ,
-it must be a string equal to
-one of the known locale categories described
-in \*(EP.
-You must also supply a text domain. Use
-.B TEXTDOMAIN
-if you want to use the current domain.
-.SH USER-DEFINED FUNCTIONS
-Functions in \*(AK are defined as follows:
-.PP
-.RS
-\fBfunction \fIname\fB(\fIparameter list\fB) { \fIstatements \fB}\fR
-.RE
-.PP
-Functions are executed when they are called from within expressions
-in either patterns or actions. Actual parameters supplied in the function
-call are used to instantiate the formal parameters declared in the function.
-Arrays are passed by reference; other variables are passed by value.
-.PP
-Since functions were not originally part of the \*(AK language, the provision
-for local variables is rather clumsy: They are declared as extra parameters
-in the parameter list. The convention is to separate local variables from
-real parameters by extra spaces in the parameter list. For example:
-.PP
-.RS
-.ft B
-.nf
-function f(p, q, a, b) # a and b are local
-{
- \&.\|.\|.
-}
-
-/abc/ { .\|.\|. ; f(1, 2) ; .\|.\|. }
-.fi
-.ft R
-.RE
-.PP
-The left parenthesis in a function call is required
-to immediately follow the function name,
-without any intervening white space.
-This is to avoid a syntactic ambiguity with the concatenation operator.
-This restriction does not apply to the built-in functions listed above.
-.PP
-Functions may call each other and may be recursive.
-Function parameters used as local variables are initialized
-to the null string and the number zero upon function invocation.
-.PP
-Use
-.BI return " expr"
-to return a value from a function. The return value is undefined if no
-value is provided, or if the function returns by \*(lqfalling off\*(rq the
-end.
-.PP
-If
-.B \-\^\-lint
-has been provided,
-.I gawk
-warns about calls to undefined functions at parse time,
-instead of at run time.
-Calling an undefined function at run time is a fatal error.
-.PP
-The word
-.B func
-may be used in place of
-.BR function .
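-.PP
-The rules above (extra parameters as locals, arrays passed by
-reference, recursion, and
-.BR return )
-can be illustrated with a small sketch.
-The function names fact and fill and the array name squares are
-hypothetical.

```shell
# fact() is recursive and returns a value.
# fill() modifies the caller array (arrays are passed by reference);
# its extra parameter i is a local, so the global i stays at 99.
func_out=$(awk '
function fact(n) {
    return (n <= 1) ? 1 : n * fact(n - 1)
}
function fill(arr, n,    i) {
    for (i = 1; i <= n; i++)
        arr[i] = i * i
}
BEGIN {
    i = 99
    fill(squares, 3)
    print fact(5), squares[3], i
}')
printf '%s\n' "$func_out"
```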
-.SH DYNAMICALLY LOADING NEW FUNCTIONS
-Beginning with version 3.1 of
-.IR gawk ,
-you can dynamically add new built-in functions to the running
-.I gawk
-interpreter.
-The full details are beyond the scope of this manual page;
-see \*(EP for the details.
-.PP
-.TP 8
-\fBextension(\fIobject\fB, \fIfunction\fB)\fR
-Dynamically link the shared object file named by
-.IR object ,
-and invoke
-.I function
-in that object, to perform initialization.
-These should both be provided as strings.
-Returns the value returned by
-.IR function .
-.PP
-.ft B
-This function is provided and documented in \*(EP,
-but everything about this feature is likely to change
-in the next release.
-We STRONGLY recommend that you do not use this feature
-for anything that you aren't willing to redo.
-.ft R
-.SH SIGNALS
-.I pgawk
-accepts two signals.
-.B SIGUSR1
-causes it to dump a profile and function call stack to the
-profile file, which is either
-.BR awkprof.out ,
-or whatever file was named with the
-.B \-\^\-profile
-option. It then continues to run.
-.B SIGHUP
-causes it to dump the profile and function call stack and then exit.
-.SH EXAMPLES
-.nf
-Print and sort the login names of all users:
-
-.ft B
- BEGIN { FS = ":" }
- { print $1 | "sort" }
-
-.ft R
-Count lines in a file:
-
-.ft B
- { nlines++ }
- END { print nlines }
-
-.ft R
-Precede each line by its number in the file:
-
-.ft B
- { print FNR, $0 }
-
-.ft R
-Concatenate and line number (a variation on a theme):
-
-.ft B
- { print NR, $0 }
-.ft R
-.fi
-.SH INTERNATIONALIZATION
-.PP
-String constants are sequences of characters enclosed in double
-quotes. In non-English speaking environments, it is possible to mark
-strings in the \*(AK program as requiring translation to the native
-natural language. Such strings are marked in the \*(AK program with
-a leading underscore (\*(lq_\*(rq). For example,
-.sp
-.RS
-.ft B
-gawk 'BEGIN { print "hello, world" }'
-.RE
-.sp
-.ft R
-always prints
-.BR "hello, world" .
-But,
-.sp
-.RS
-.ft B
-gawk 'BEGIN { print _"hello, world" }'
-.RE
-.sp
-.ft R
-might print
-.B "bonjour, monde"
-in France.
-.PP
-There are several steps involved in producing and running a localizable
-\*(AK program.
-.TP "\w'4.'u+2n"
-1.
-Add a
-.B BEGIN
-action to assign a value to the
-.B TEXTDOMAIN
-variable to set the text domain to a name associated with your program.
-.sp
-.ti +5n
-.ft B
-BEGIN { TEXTDOMAIN = "myprog" }
-.ft R
-.sp
-This allows
-.I gawk
-to find the
-.B \&.mo
-file associated with your program.
-Without this step,
-.I gawk
-uses the
-.B messages
-text domain,
-which likely does not contain translations for your program.
-.TP
-2.
-Mark all strings that should be translated with leading underscores.
-.TP
-3.
-If necessary, use the
-.B dcgettext()
-and/or
-.B bindtextdomain()
-functions in your program, as appropriate.
-.TP
-4.
-Run
-.B "gawk \-\^\-gen\-po \-f myprog.awk > myprog.po"
-to generate a
-.B \&.po
-file for your program.
-.TP
-5.
-Provide appropriate translations, and build and install a corresponding
-.B \&.mo
-file.
-.PP
-The internationalization features are described in full detail in \*(EP.
-.SH POSIX COMPATIBILITY
-A primary goal for
-.I gawk
-is compatibility with the \*(PX standard, as well as with the
-latest version of \*(UX
-.IR awk .
-To this end,
-.I gawk
-incorporates the following user-visible
-features which are not described in the \*(AK book,
-but are part of the Bell Laboratories version of
-.IR awk ,
-and are in the \*(PX standard.
-.PP
-The book indicates that command line variable assignment happens when
-.I awk
-would otherwise open the argument as a file, which is after the
-.B BEGIN
-block is executed. However, in earlier implementations, when such an
-assignment appeared before any file names, the assignment would happen
-.I before
-the
-.B BEGIN
-block was run. Applications came to depend on this \*(lqfeature.\*(rq
-When
-.I awk
-was changed to match its documentation, the
-.B \-v
-option for assigning variables before program execution was added to
-accommodate applications that depended upon the old behavior.
-(This feature was agreed upon by both the Bell Laboratories and the \*(GN developers.)
-.PP
-The
-.B \-W
-option for implementation-specific features is from the \*(PX standard.
-.PP
-When processing arguments,
-.I gawk
-uses the special option \*(lq\-\^\-\*(rq to signal the end of
-arguments.
-In compatibility mode, it warns about but otherwise ignores
-undefined options.
-In normal operation, such arguments are passed on to the \*(AK program for
-it to process.
-.PP
-The \*(AK book does not define the return value of
-.BR srand() .
-The \*(PX standard
-has it return the seed it was using, to allow keeping track
-of random number sequences. Therefore
-.B srand()
-in
-.I gawk
-also returns its current seed.
-.PP
-Other new features are:
-The use of multiple
-.B \-f
-options (from MKS
-.IR awk );
-the
-.B ENVIRON
-array; the
-.B \ea
-and
-.BR \ev
-escape sequences (done originally in
-.I gawk
-and fed back into the Bell Laboratories version); the
-.B tolower()
-and
-.B toupper()
-built-in functions (from the Bell Laboratories version); and the \*(AN C conversion specifications in
-.B printf
-(done first in the Bell Laboratories version).
-.SH HISTORICAL FEATURES
-There are two features of historical \*(AK implementations that
-.I gawk
-supports.
-First, it is possible to call the
-.B length()
-built-in function not only with no argument, but even without parentheses!
-Thus,
-.RS
-.PP
-.ft B
-a = length # Holy Algol 60, Batman!
-.ft R
-.RE
-.PP
-is the same as either of
-.RS
-.PP
-.ft B
-a = length()
-.br
-a = length($0)
-.ft R
-.RE
-.PP
-This feature is marked as \*(lqdeprecated\*(rq in the \*(PX standard, and
-.I gawk
-issues a warning about its use if
-.B \-\^\-lint
-is specified on the command line.
-.PP
-The other feature is the use of either the
-.B continue
-or the
-.B break
-statements outside the body of a
-.BR while ,
-.BR for ,
-or
-.B do
-loop. Traditional \*(AK implementations have treated such usage as
-equivalent to the
-.B next
-statement.
-.I Gawk
-supports this usage if
-.B \-\^\-traditional
-has been specified.
-.SH GNU EXTENSIONS
-.I Gawk
-has a number of extensions to \*(PX
-.IR awk .
-They are described in this section. All the extensions described here
-can be disabled by
-invoking
-.I gawk
-with the
-.B \-\^\-traditional
-option.
-.PP
-The following features of
-.I gawk
-are not available in
-\*(PX
-.IR awk .
-.\" Environment vars and startup stuff
-.TP "\w'\(bu'u+1n"
-\(bu
-A path search is performed for files named via the
-.B \-f
-option, using the directories listed in the
-.B AWKPATH
-environment variable.
-.\" POSIX and language recognition issues
-.TP
-\(bu
-The
-.B \ex
-escape sequence.
-(Disabled with
-.BR \-\^\-posix .)
-.TP
-\(bu
-The
-.B fflush()
-function.
-(Disabled with
-.BR \-\^\-posix .)
-.TP
-\(bu
-The ability to continue lines after
-.B ?
-and
-.BR : .
-(Disabled with
-.BR \-\^\-posix .)
-.TP
-\(bu
-Octal and hexadecimal constants in AWK programs.
-.\" Special variables
-.TP
-\(bu
-The
-.BR ARGIND ,
-.BR BINMODE ,
-.BR ERRNO ,
-.BR LINT ,
-.B RT
-and
-.B TEXTDOMAIN
-variables.
-.TP
-\(bu
-The
-.B IGNORECASE
-variable and its side-effects.
-.TP
-\(bu
-The
-.B FIELDWIDTHS
-variable and fixed-width field splitting.
-.TP
-\(bu
-The
-.B PROCINFO
-array.
-.\" I/O stuff
-.TP
-\(bu
-The use of
-.B RS
-as a regular expression.
-.TP
-\(bu
-The special file names available for I/O redirection.
-.TP
-\(bu
-The
-.B |&
-operator for creating co-processes.
-.\" Changes to standard awk functions
-.TP
-\(bu
-The ability to split out individual characters using the null string
-as the value of
-.BR FS ,
-and as the third argument to
-.BR split() .
-.TP
-\(bu
-The optional second argument to the
-.B close()
-function.
-.TP
-\(bu
-The optional third argument to the
-.B match()
-function.
-.TP
-\(bu
-The ability to use positional specifiers with
-.B printf
-and
-.BR sprintf() .
-.\" New keywords or changes to keywords
-.TP
-\(bu
-The use of
-.BI delete " array"
-to delete the entire contents of an array.
-.TP
-\(bu
-The use of
-.B "nextfile"
-to abandon processing of the current input file.
-.\" New functions
-.TP
-\(bu
-The
-.BR and() ,
-.BR asort() ,
-.BR bindtextdomain() ,
-.BR compl() ,
-.BR dcgettext() ,
-.BR gensub() ,
-.BR lshift() ,
-.BR mktime() ,
-.BR or() ,
-.BR rshift() ,
-.BR strftime() ,
-.BR strtonum() ,
-.B systime()
-and
-.B xor()
-functions.
-.\" I18N stuff
-.TP
-\(bu
-Localizable strings.
-.\" Extending gawk
-.TP
-\(bu
-Adding new built-in functions dynamically with the
-.B extension()
-function.
-.PP
-The \*(AK book does not define the return value of the
-.B close()
-function.
-.IR Gawk\^ "'s"
-.B close()
-returns the value from
-.IR fclose (3),
-or
-.IR pclose (3),
-when closing an output file or pipe, respectively.
-It returns the process's exit status when closing an input pipe.
-The return value is \-1 if the named file, pipe
-or co-process was not opened with a redirection.
-.PP
-When
-.I gawk
-is invoked with the
-.B \-\^\-traditional
-option,
-if the
-.I fs
-argument to the
-.B \-F
-option is \*(lqt\*(rq, then
-.B FS
-is set to the tab character.
-Note that typing
-.B "gawk \-F\et \&.\|.\|."
-simply causes the shell to quote the \*(lqt\*(rq, and does not pass
-\*(lq\et\*(rq to the
-.B \-F
-option.
-Since this is a rather ugly special case, it is not the default behavior.
-This behavior also does not occur if
-.B \-\^\-posix
-has been specified.
-To really get a tab character as the field separator, it is best to use
-single quotes:
-.BR "gawk \-F'\et' \&.\|.\|." .
-.ig
-.PP
-If
-.I gawk
-was compiled for debugging, it
-accepts the following additional options:
-.TP
-.PD 0
-.B \-Wparsedebug
-.TP
-.PD
-.B \-\^\-parsedebug
-Turn on
-.IR yacc (1)
-or
-.IR bison (1)
-debugging output during program parsing.
-This option should only be of interest to the
-.I gawk
-maintainers, and may not even be compiled into
-.IR gawk .
-..
-.SH ENVIRONMENT VARIABLES
-The
-.B AWKPATH
-environment variable can be used to provide a list of directories that
-.I gawk
-searches when looking for files named via the
-.B \-f
-and
-.B \-\^\-file
-options.
-.PP
-If
-.B POSIXLY_CORRECT
-exists in the environment, then
-.I gawk
-behaves exactly as if
-.B \-\^\-posix
-had been specified on the command line.
-If
-.B \-\^\-lint
-has been specified,
-.I gawk
-issues a warning message to this effect.
-.SH SEE ALSO
-.IR egrep (1),
-.IR getpid (2),
-.IR getppid (2),
-.IR getpgrp (2),
-.IR getuid (2),
-.IR geteuid (2),
-.IR getgid (2),
-.IR getegid (2),
-.IR getgroups (2)
-.PP
-.IR "The AWK Programming Language" ,
-Alfred V. Aho, Brian W. Kernighan, Peter J. Weinberger,
-Addison-Wesley, 1988. ISBN 0-201-07981-X.
-.PP
-\*(EP,
-Edition 3.0, published by the Free Software Foundation, 2001.
-.SH BUGS
-The
-.B \-F
-option is not necessary given the command line variable assignment feature;
-it remains only for backwards compatibility.
-.PP
-Syntactically invalid single-character programs tend to overflow
-the parse stack, generating a rather unhelpful message. Such programs
-are surprisingly difficult to diagnose in the completely general case,
-and the effort to do so really is not worth it.
-.ig
-.PP
-.I Gawk
-suffers from ``feeping creaturism.''
-It's too bad
-.I perl
-is so inelegant.
-..
-.SH AUTHORS
-The original version of \*(UX
-.I awk
-was designed and implemented by Alfred Aho,
-Peter Weinberger, and Brian Kernighan of Bell Laboratories. Brian Kernighan
-continues to maintain and enhance it.
-.PP
-Paul Rubin and Jay Fenlason,
-of the Free Software Foundation, wrote
-.IR gawk ,
-to be compatible with the original version of
-.I awk
-distributed in Seventh Edition \*(UX.
-John Woods contributed a number of bug fixes.
-David Trueman, with contributions
-from Arnold Robbins, made
-.I gawk
-compatible with the new version of \*(UX
-.IR awk .
-Arnold Robbins is the current maintainer.
-.PP
-The initial DOS port was done by Conrad Kwok and Scott Garfinkle.
-Scott Deifik is the current DOS maintainer. Pat Rankin did the
-port to VMS, and Michal Jaegermann did the port to the Atari ST.
-The port to OS/2 was done by Kai Uwe Rommel, with contributions and
-help from Darrel Hankerson. Fred Fish supplied support for the Amiga,
-Stephen Davies provided the Tandem port,
-and Martin Brown provided the BeOS port.
-.SH VERSION INFORMATION
-This man page documents
-.IR gawk ,
-version 3.1.0.
-.SH BUG REPORTS
-If you find a bug in
-.IR gawk ,
-please send electronic mail to
-.BR bug-gawk@gnu.org .
-Please include your operating system and its revision, the version of
-.I gawk
-(from
-.BR "gawk \-\^\-version" ),
-what C compiler you used to compile it, and a test program
-and data that are as small as possible for reproducing the problem.
-.PP
-Before sending a bug report, please do two things. First, verify that
-you have the latest version of
-.IR gawk .
-Many bugs (usually subtle ones) are fixed at each release, and if
-yours is out of date, the problem may already have been solved.
-Second, please read this man page and the reference manual carefully to
-be sure that what you think is a bug really is, instead of just a quirk
-in the language.
-.PP
-Whatever you do, do
-.B NOT
-post a bug report in
-.BR comp.lang.awk .
-While the
-.I gawk
-developers occasionally read this newsgroup, posting bug reports there
-is an unreliable way to report bugs. Instead, please use the electronic mail
-address given above.
-.SH ACKNOWLEDGEMENTS
-Brian Kernighan of Bell Laboratories
-provided valuable assistance during testing and debugging.
-We thank him.
-.SH COPYING PERMISSIONS
-Copyright \(co 1989, 1991\-2001 Free Software Foundation, Inc.
-.PP
-Permission is granted to make and distribute verbatim copies of
-this manual page provided the copyright notice and this permission
-notice are preserved on all copies.
-.ig
-Permission is granted to process this file through troff and print the
-results, provided the printed document carries copying permission
-notice identical to this one except for the removal of this paragraph
-(this paragraph not being relevant to the printed manual page).
-..
-.PP
-Permission is granted to copy and distribute modified versions of this
-manual page under the conditions for verbatim copying, provided that
-the entire resulting derived work is distributed under the terms of a
-permission notice identical to this one.
-.PP
-Permission is granted to copy and distribute translations of this
-manual page into another language, under the above conditions for
-modified versions, except that this permission notice may be stated in
-a translation approved by the Foundation.