Diffstat (limited to 'gnu/usr.bin/awk/awk.1')
-rw-r--r-- | gnu/usr.bin/awk/awk.1 | 1969 |
1 files changed, 0 insertions, 1969 deletions
diff --git a/gnu/usr.bin/awk/awk.1 b/gnu/usr.bin/awk/awk.1 deleted file mode 100644 index 1b58bec..0000000 --- a/gnu/usr.bin/awk/awk.1 +++ /dev/null @@ -1,1969 +0,0 @@ -.ds PX \s-1POSIX\s+1 -.ds UX \s-1UNIX\s+1 -.ds AN \s-1ANSI\s+1 -.TH AWK 1 "Apr 18 1994" "Free Software Foundation" "Utility Commands" -.SH NAME -awk \- GNU awk pattern scanning and processing language -.SH SYNOPSIS -.B awk -[ POSIX or GNU style options ] -.B \-f -.I program-file -[ -.B \-\^\- -] file .\^.\^. -.br -.B awk -[ POSIX or GNU style options ] -[ -.B \-\^\- -] -.I program-text -file .\^.\^. -.SH DESCRIPTION -.I Gawk -is the GNU Project's implementation of the AWK programming language. -It conforms to the definition of the language in -the \*(PX 1003.2 Command Language And Utilities Standard. -This version in turn is based on the description in -.IR "The AWK Programming Language" , -by Aho, Kernighan, and Weinberger, -with the additional features defined in the System V Release 4 version -of \*(UX -.IR awk . -.I Gawk -also provides some GNU-specific extensions. -.PP -The command line consists of options to -.I awk -itself, the AWK program text (if not supplied via the -.B \-f -or -.B \-\^\-file -options), and values to be made -available in the -.B ARGC -and -.B ARGV -pre-defined AWK variables. -.SH OPTIONS -.PP -.I Gawk -options may be either the traditional \*(PX one letter options, -or the GNU style long options. \*(PX style options start with a single ``\-'', -while GNU long options start with ``\-\^\-''. -GNU style long options are provided for both GNU-specific features and -for \*(PX mandated features. Other implementations of the AWK language -are likely to only accept the traditional one letter options. -.PP -Following the \*(PX standard, -.IR awk -specific -options are supplied via arguments to the -.B \-W -option. 
Multiple -.B \-W -options may be supplied, or multiple arguments may be supplied together -if they are separated by commas, or enclosed in quotes and separated -by white space. -Case is ignored in arguments to the -.B \-W -option. -Each -.B \-W -option has a corresponding GNU style long option, as detailed below. -Arguments to GNU style long options are either joined with the option -by an -.B = -sign, with no intervening spaces, or they may be provided in the -next command line argument. -.PP -.I Gawk -accepts the following options. -.TP -.PD 0 -.BI \-F " fs" -.TP -.PD -.BI \-\^\-field-separator= fs -Use -.I fs -for the input field separator (the value of the -.B FS -predefined -variable). -.TP -.PD 0 -\fB\-v\fI var\fB\^=\^\fIval\fR -.TP -.PD -\fB\-\^\-assign=\fIvar\fB\^=\^\fIval\fR -Assign the value -.IR val , -to the variable -.IR var , -before execution of the program begins. -Such variable values are available to the -.B BEGIN -block of an AWK program. -.TP -.PD 0 -.BI \-f " program-file" -.TP -.PD -.BI \-\^\-file= program-file -Read the AWK program source from the file -.IR program-file , -instead of from the first command line argument. -Multiple -.B \-f -(or -.BR \-\^\-file ) -options may be used. -.TP -.PD 0 -.BI \-mf= NNN -.TP -.BI \-mr= NNN -Set various memory limits to the value -.IR NNN . -The -.B f -flag sets the maximum number of fields, and the -.B r -flag sets the maximum record size. These two flags and the -.B \-m -option are from the AT&T Bell Labs research version of \*(UX -.IR awk . -They are ignored by -.IR awk , -since -.I awk -has no pre-defined limits. -.TP \w'\fB\-\^\-copyright\fR'u+1n -.PD 0 -.B "\-W compat" -.TP -.PD -.B \-\^\-compat -Run in -.I compatibility -mode. In compatibility mode, -.I awk -behaves identically to \*(UX -.IR awk ; -none of the GNU-specific extensions are recognized. -See -.BR "GNU EXTENSIONS" , -below, for more information. 
-.TP -.PD 0 -.B "\-W copyleft" -.TP -.PD 0 -.B "\-W copyright" -.TP -.PD 0 -.B \-\^\-copyleft -.TP -.PD -.B \-\^\-copyright -Print the short version of the GNU copyright information message on -the error output. -.TP -.PD 0 -.B "\-W help" -.TP -.PD 0 -.B "\-W usage" -.TP -.PD 0 -.B \-\^\-help -.TP -.PD -.B \-\^\-usage -Print a relatively short summary of the available options on -the error output. -Per the GNU Coding Standards, these options cause an immediate, -successful exit. -.TP -.PD 0 -.B "\-W lint" -.TP -.PD 0 -.B \-\^\-lint -Provide warnings about constructs that are -dubious or non-portable to other AWK implementations. -.ig -.\" This option is left undocumented, on purpose. -.TP -.PD 0 -.B "\-W nostalgia" -.TP -.PD -.B \-\^\-nostalgia -Provide a moment of nostalgia for long time -.I awk -users. -.. -.TP -.PD 0 -.B "\-W posix" -.TP -.PD -.B \-\^\-posix -This turns on -.I compatibility -mode, with the following additional restrictions: -.RS -.TP \w'\(bu'u+1n -\(bu -.B \ex -escape sequences are not recognized. -.TP -\(bu -The synonym -.B func -for the keyword -.B function -is not recognized. -.TP -\(bu -The operators -.B ** -and -.B **= -cannot be used in place of -.B ^ -and -.BR ^= . -.RE -.TP -.PD 0 -.BI "\-W source=" program-text -.TP -.PD -.BI \-\^\-source= program-text -Use -.I program-text -as AWK program source code. -This option allows the easy intermixing of library functions (used via the -.B \-f -and -.B \-\^\-file -options) with source code entered on the command line. -It is intended primarily for medium to large size AWK programs used -in shell scripts. -.sp .5 -The -.B "\-W source=" -form of this option uses the rest of the command line argument for -.IR program-text ; -no other options to -.B \-W -will be recognized in the same argument. -.TP -.PD 0 -.B "\-W version" -.TP -.PD -.B \-\^\-version -Print version information for this particular copy of -.I awk -on the error output. 
-This is useful mainly for knowing if the current copy of -.I awk -on your system -is up to date with respect to whatever the Free Software Foundation -is distributing. -Per the GNU Coding Standards, these options cause an immediate, -successful exit. -.TP -.B \-\^\- -Signal the end of options. This is useful to allow further arguments to the -AWK program itself to start with a ``\-''. -This is mainly for consistency with the argument parsing convention used -by most other \*(PX programs. -.PP -In compatibility mode, -any other options are flagged as illegal, but are otherwise ignored. -In normal operation, as long as program text has been supplied, unknown -options are passed on to the AWK program in the -.B ARGV -array for processing. This is particularly useful for running AWK -programs via the ``#!'' executable interpreter mechanism. -.SH AWK PROGRAM EXECUTION -.PP -An AWK program consists of a sequence of pattern-action statements -and optional function definitions. -.RS -.PP -\fIpattern\fB { \fIaction statements\fB }\fR -.br -\fBfunction \fIname\fB(\fIparameter list\fB) { \fIstatements\fB }\fR -.RE -.PP -.I Gawk -first reads the program source from the -.IR program-file (s) -if specified, -from arguments to -.BR "\-W source=" , -or from the first non-option argument on the command line. -The -.B \-f -and -.B "\-W source=" -options may be used multiple times on the command line. -.I Gawk -will read the program text as if all the -.IR program-file s -and command line source texts -had been concatenated together. This is useful for building libraries -of AWK functions, without having to include them in each new AWK -program that uses them. It also provides the ability to mix library -functions with command line programs. -.PP -The environment variable -.B AWKPATH -specifies a search path to use when finding source files named with -the -.B \-f -option. If this variable does not exist, the default path is -\fB".:/usr/lib/awk:/usr/local/lib/awk"\fR. 
-If a file name given to the -.B \-f -option contains a ``/'' character, no path search is performed. -.PP -.I Gawk -executes AWK programs in the following order. -First, -all variable assignments specified via the -.B \-v -option are performed. -Next, -.I awk -compiles the program into an internal form. -Then, -.I awk -executes the code in the -.B BEGIN -block(s) (if any), -and then proceeds to read -each file named in the -.B ARGV -array. -If there are no files named on the command line, -.I awk -reads the standard input. -.PP -If a filename on the command line has the form -.IB var = val -it is treated as a variable assignment. The variable -.I var -will be assigned the value -.IR val . -(This happens after any -.B BEGIN -block(s) have been run.) -Command line variable assignment -is most useful for dynamically assigning values to the variables -AWK uses to control how input is broken into fields and records. It -is also useful for controlling state if multiple passes are needed over -a single data file. -.PP -If the value of a particular element of -.B ARGV -is empty (\fB""\fR), -.I awk -skips over it. -.PP -For each line in the input, -.I awk -tests to see if it matches any -.I pattern -in the AWK program. -For each pattern that the line matches, the associated -.I action -is executed. -The patterns are tested in the order they occur in the program. -.PP -Finally, after all the input is exhausted, -.I awk -executes the code in the -.B END -block(s) (if any). -.SH VARIABLES AND FIELDS -AWK variables are dynamic; they come into existence when they are -first used. Their values are either floating-point numbers or strings, -or both, -depending upon how they are used. AWK also has one dimensional -arrays; arrays with multiple dimensions may be simulated. -Several pre-defined variables are set as a program -runs; these will be described as needed and summarized below. 
-.SS Fields -.PP -As each input line is read, -.I awk -splits the line into -.IR fields , -using the value of the -.B FS -variable as the field separator. -If -.B FS -is a single character, fields are separated by that character. -Otherwise, -.B FS -is expected to be a full regular expression. -In the special case that -.B FS -is a single blank, fields are separated -by runs of blanks and/or tabs. -Note that the value of -.B IGNORECASE -(see below) will also affect how fields are split when -.B FS -is a regular expression. -.PP -If the -.B FIELDWIDTHS -variable is set to a space separated list of numbers, each field is -expected to have fixed width, and -.I awk -will split up the record using the specified widths. The value of -.B FS -is ignored. -Assigning a new value to -.B FS -overrides the use of -.BR FIELDWIDTHS , -and restores the default behavior. -.PP -Each field in the input line may be referenced by its position, -.BR $1 , -.BR $2 , -and so on. -.B $0 -is the whole line. The value of a field may be assigned to as well. -Fields need not be referenced by constants: -.RS -.PP -.ft B -n = 5 -.br -print $n -.ft R -.RE -.PP -prints the fifth field in the input line. -The variable -.B NF -is set to the total number of fields in the input line. -.PP -References to non-existent fields (i.e. fields after -.BR $NF ) -produce the null-string. However, assigning to a non-existent field -(e.g., -.BR "$(NF+2) = 5" ) -will increase the value of -.BR NF , -create any intervening fields with the null string as their value, and -cause the value of -.B $0 -to be recomputed, with the fields being separated by the value of -.BR OFS . -References to negative numbered fields cause a fatal error. -.SS Built-in Variables -.PP -AWK's built-in variables are: -.PP -.TP \w'\fBFIELDWIDTHS\fR'u+1n -.B ARGC -The number of command line arguments (does not include options to -.IR awk , -or the program source). -.TP -.B ARGIND -The index in -.B ARGV -of the current file being processed. 
-.TP -.B ARGV -Array of command line arguments. The array is indexed from -0 to -.B ARGC -\- 1. -Dynamically changing the contents of -.B ARGV -can control the files used for data. -.TP -.B CONVFMT -The conversion format for numbers, \fB"%.6g"\fR, by default. -.TP -.B ENVIRON -An array containing the values of the current environment. -The array is indexed by the environment variables, each element being -the value of that variable (e.g., \fBENVIRON["HOME"]\fP might be -.BR /u/arnold ). -Changing this array does not affect the environment seen by programs which -.I awk -spawns via redirection or the -.B system() -function. -(This may change in a future version of -.IR awk .) -.\" but don't hold your breath... -.TP -.B ERRNO -If a system error occurs either doing a redirection for -.BR getline , -during a read for -.BR getline , -or during a -.BR close() , -then -.B ERRNO -will contain -a string describing the error. -.TP -.B FIELDWIDTHS -A white-space separated list of fieldwidths. When set, -.I awk -parses the input into fields of fixed width, instead of using the -value of the -.B FS -variable as the field separator. -The fixed field width facility is still experimental; expect the -semantics to change as -.I awk -evolves over time. -.TP -.B FILENAME -The name of the current input file. -If no files are specified on the command line, the value of -.B FILENAME -is ``\-''. -However, -.B FILENAME -is undefined inside the -.B BEGIN -block. -.TP -.B FNR -The input record number in the current input file. -.TP -.B FS -The input field separator, a blank by default. -.TP -.B IGNORECASE -Controls the case-sensitivity of all regular expression operations. 
If -.B IGNORECASE -has a non-zero value, then pattern matching in rules, -field splitting with -.BR FS , -regular expression -matching with -.B ~ -and -.BR !~ , -and the -.BR gsub() , -.BR index() , -.BR match() , -.BR split() , -and -.B sub() -pre-defined functions will all ignore case when doing regular expression -operations. Thus, if -.B IGNORECASE -is not equal to zero, -.B /aB/ -matches all of the strings \fB"ab"\fP, \fB"aB"\fP, \fB"Ab"\fP, -and \fB"AB"\fP. -As with all AWK variables, the initial value of -.B IGNORECASE -is zero, so all regular expression operations are normally case-sensitive. -.TP -.B NF -The number of fields in the current input record. -.TP -.B NR -The total number of input records seen so far. -.TP -.B OFMT -The output format for numbers, \fB"%.6g"\fR, by default. -.TP -.B OFS -The output field separator, a blank by default. -.TP -.B ORS -The output record separator, by default a newline. -.TP -.B RS -The input record separator, by default a newline. -.B RS -is exceptional in that only the first character of its string -value is used for separating records. -(This will probably change in a future release of -.IR awk .) -If -.B RS -is set to the null string, then records are separated by -blank lines. -When -.B RS -is set to the null string, then the newline character always acts as -a field separator, in addition to whatever value -.B FS -may have. -.TP -.B RSTART -The index of the first character matched by -.BR match() ; -0 if no match. -.TP -.B RLENGTH -The length of the string matched by -.BR match() ; -\-1 if no match. -.TP -.B SUBSEP -The character used to separate multiple subscripts in array -elements, by default \fB"\e034"\fR. -.SS Arrays -.PP -Arrays are subscripted with an expression between square brackets -.RB ( [ " and " ] ). 
-If the expression is an expression list -.RI ( expr ", " expr " ...)" -then the array subscript is a string consisting of the -concatenation of the (string) value of each expression, -separated by the value of the -.B SUBSEP -variable. -This facility is used to simulate multiply dimensioned -arrays. For example: -.PP -.RS -.ft B -i = "A" ;\^ j = "B" ;\^ k = "C" -.br -x[i, j, k] = "hello, world\en" -.ft R -.RE -.PP -assigns the string \fB"hello, world\en"\fR to the element of the array -.B x -which is indexed by the string \fB"A\e034B\e034C"\fR. All arrays in AWK -are associative, i.e. indexed by string values. -.PP -The special operator -.B in -may be used in an -.B if -or -.B while -statement to see if an array has an index consisting of a particular -value. -.PP -.RS -.ft B -.nf -if (val in array) - print array[val] -.fi -.ft -.RE -.PP -If the array has multiple subscripts, use -.BR "(i, j) in array" . -.PP -The -.B in -construct may also be used in a -.B for -loop to iterate over all the elements of an array. -.PP -An element may be deleted from an array using the -.B delete -statement. -The -.B delete -statement may also be used to delete the entire contents of an array. -.SS Variable Typing And Conversion -.PP -Variables and fields -may be (floating point) numbers, or strings, or both. How the -value of a variable is interpreted depends upon its context. If used in -a numeric expression, it will be treated as a number, if used as a string -it will be treated as a string. -.PP -To force a variable to be treated as a number, add 0 to it; to force it -to be treated as a string, concatenate it with the null string. -.PP -When a string must be converted to a number, the conversion is accomplished -using -.IR atof (3). -A number is converted to a string by using the value of -.B CONVFMT -as a format string for -.IR sprintf (3), -with the numeric value of the variable as the argument. 
-However, even though all numbers in AWK are floating-point, -integral values are -.I always -converted as integers. Thus, given -.PP -.RS -.ft B -.nf -CONVFMT = "%2.2f" -a = 12 -b = a "" -.fi -.ft R -.RE -.PP -the variable -.B b -has a string value of \fB"12"\fR and not \fB"12.00"\fR. -.PP -.I Gawk -performs comparisons as follows: -If two variables are numeric, they are compared numerically. -If one value is numeric and the other has a string value that is a -``numeric string,'' then comparisons are also done numerically. -Otherwise, the numeric value is converted to a string and a string -comparison is performed. -Two strings are compared, of course, as strings. -According to the \*(PX standard, even if two strings are -numeric strings, a numeric comparison is performed. However, this is -clearly incorrect, and -.I awk -does not do this. -.PP -Uninitialized variables have the numeric value 0 and the string value "" -(the null, or empty, string). -.SH PATTERNS AND ACTIONS -AWK is a line oriented language. The pattern comes first, and then the -action. Action statements are enclosed in -.B { -and -.BR } . -Either the pattern may be missing, or the action may be missing, but, -of course, not both. If the pattern is missing, the action will be -executed for every single line of input. -A missing action is equivalent to -.RS -.PP -.B "{ print }" -.RE -.PP -which prints the entire line. -.PP -Comments begin with the ``#'' character, and continue until the -end of the line. -Blank lines may be used to separate statements. -Normally, a statement ends with a newline, however, this is not the -case for lines ending in -a ``,'', ``{'', ``?'', ``:'', ``&&'', or ``||''. -Lines ending in -.B do -or -.B else -also have their statements automatically continued on the following line. -In other cases, a line can be continued by ending it with a ``\e'', -in which case the newline will be ignored. -.PP -Multiple statements may -be put on one line by separating them with a ``;''. 
-This applies to both the statements within the action part of a -pattern-action pair (the usual case), -and to the pattern-action statements themselves. -.SS Patterns -AWK patterns may be one of the following: -.PP -.RS -.nf -.B BEGIN -.B END -.BI / "regular expression" / -.I "relational expression" -.IB pattern " && " pattern -.IB pattern " || " pattern -.IB pattern " ? " pattern " : " pattern -.BI ( pattern ) -.BI ! " pattern" -.IB pattern1 ", " pattern2 -.fi -.RE -.PP -.B BEGIN -and -.B END -are two special kinds of patterns which are not tested against -the input. -The action parts of all -.B BEGIN -patterns are merged as if all the statements had -been written in a single -.B BEGIN -block. They are executed before any -of the input is read. Similarly, all the -.B END -blocks are merged, -and executed when all the input is exhausted (or when an -.B exit -statement is executed). -.B BEGIN -and -.B END -patterns cannot be combined with other patterns in pattern expressions. -.B BEGIN -and -.B END -patterns cannot have missing action parts. -.PP -For -.BI / "regular expression" / -patterns, the associated statement is executed for each input line that matches -the regular expression. -Regular expressions are the same as those in -.IR egrep (1), -and are summarized below. -.PP -A -.I "relational expression" -may use any of the operators defined below in the section on actions. -These generally test whether certain fields match certain regular expressions. -.PP -The -.BR && , -.BR || , -and -.B ! -operators are logical AND, logical OR, and logical NOT, respectively, as in C. -They do short-circuit evaluation, also as in C, and are used for combining -more primitive pattern expressions. As in most languages, parentheses -may be used to change the order of evaluation. -.PP -The -.B ?\^: -operator is like the same operator in C. If the first pattern is true -then the pattern used for testing is the second pattern, otherwise it is -the third. 
Only one of the second and third patterns is evaluated. -.PP -The -.IB pattern1 ", " pattern2 -form of an expression is called a -.IR "range pattern" . -It matches all input records starting with a line that matches -.IR pattern1 , -and continuing until a record that matches -.IR pattern2 , -inclusive. It does not combine with any other sort of pattern expression. -.SS Regular Expressions -Regular expressions are the extended kind found in -.IR egrep . -They are composed of characters as follows: -.TP \w'\fB[^\fIabc...\fB]\fR'u+2n -.I c -matches the non-metacharacter -.IR c . -.TP -.I \ec -matches the literal character -.IR c . -.TP -.B . -matches any character except newline. -.TP -.B ^ -matches the beginning of a line or a string. -.TP -.B $ -matches the end of a line or a string. -.TP -.BI [ abc... ] -character class, matches any of the characters -.IR abc... . -.TP -.BI [^ abc... ] -negated character class, matches any character except -.I abc... -and newline. -.TP -.IB r1 | r2 -alternation: matches either -.I r1 -or -.IR r2 . -.TP -.I r1r2 -concatenation: matches -.IR r1 , -and then -.IR r2 . -.TP -.IB r + -matches one or more -.IR r 's. -.TP -.IB r * -matches zero or more -.IR r 's. -.TP -.IB r ? -matches zero or one -.IR r 's. -.TP -.BI ( r ) -grouping: matches -.IR r . -.PP -The escape sequences that are valid in string constants (see below) -are also legal in regular expressions. -.SS Actions -Action statements are enclosed in braces, -.B { -and -.BR } . -Action statements consist of the usual assignment, conditional, and looping -statements found in most languages. The operators, control statements, -and input/output statements -available are patterned after those in C. -.SS Operators -.PP -The operators in AWK, in order of increasing precedence, are -.PP -.TP "\w'\fB*= /= %= ^=\fR'u+1n" -.PD 0 -.B "= += \-=" -.TP -.PD -.B "*= /= %= ^=" -Assignment. Both absolute assignment -.BI ( var " = " value ) -and operator-assignment (the other forms) are supported. 
-.TP -.B ?: -The C conditional expression. This has the form -.IB expr1 " ? " expr2 " : " expr3\c -\&. If -.I expr1 -is true, the value of the expression is -.IR expr2 , -otherwise it is -.IR expr3 . -Only one of -.I expr2 -and -.I expr3 -is evaluated. -.TP -.B || -Logical OR. -.TP -.B && -Logical AND. -.TP -.B "~ !~" -Regular expression match, negated match. -.B NOTE: -Do not use a constant regular expression -.RB ( /foo/ ) -on the left-hand side of a -.B ~ -or -.BR !~ . -Only use one on the right-hand side. The expression -.BI "/foo/ ~ " exp -has the same meaning as \fB(($0 ~ /foo/) ~ \fIexp\fB)\fR. -This is usually -.I not -what was intended. -.TP -.PD 0 -.B "< >" -.TP -.PD 0 -.B "<= >=" -.TP -.PD -.B "!= ==" -The regular relational operators. -.TP -.I blank -String concatenation. -.TP -.B "+ \-" -Addition and subtraction. -.TP -.B "* / %" -Multiplication, division, and modulus. -.TP -.B "+ \- !" -Unary plus, unary minus, and logical negation. -.TP -.B ^ -Exponentiation (\fB**\fR may also be used, and \fB**=\fR for -the assignment operator). -.TP -.B "++ \-\^\-" -Increment and decrement, both prefix and postfix. -.TP -.B $ -Field reference. -.SS Control Statements -.PP -The control statements are -as follows: -.PP -.RS -.nf -\fBif (\fIcondition\fB) \fIstatement\fR [ \fBelse\fI statement \fR] -\fBwhile (\fIcondition\fB) \fIstatement \fR -\fBdo \fIstatement \fBwhile (\fIcondition\fB)\fR -\fBfor (\fIexpr1\fB; \fIexpr2\fB; \fIexpr3\fB) \fIstatement\fR -\fBfor (\fIvar \fBin\fI array\fB) \fIstatement\fR -\fBbreak\fR -\fBcontinue\fR -\fBdelete \fIarray\^\fB[\^\fIindex\^\fB]\fR -\fBdelete \fIarray\^\fR -\fBexit\fR [ \fIexpression\fR ] -\fB{ \fIstatements \fB} -.fi -.RE -.SS "I/O Statements" -.PP -The input/output statements are as follows: -.PP -.TP "\w'\fBprintf \fIfmt, expr-list\fR'u+1n" -.BI close( filename ) -Close file (or pipe, see below). -.TP -.B getline -Set -.B $0 -from next input record; set -.BR NF , -.BR NR , -.BR FNR . 
-.TP -.BI "getline <" file -Set -.B $0 -from next record of -.IR file ; -set -.BR NF . -.TP -.BI getline " var" -Set -.I var -from next input record; set -.BR NF , -.BR FNR . -.TP -.BI getline " var" " <" file -Set -.I var -from next record of -.IR file . -.TP -.B next -Stop processing the current input record. The next input record -is read and processing starts over with the first pattern in the -AWK program. If the end of the input data is reached, the -.B END -block(s), if any, are executed. -.TP -.B "next file" -Stop processing the current input file. The next input record read -comes from the next input file. -.B FILENAME -is updated, -.B FNR -is reset to 1, and processing starts over with the first pattern in the -AWK program. If the end of the input data is reached, the -.B END -block(s), if any, are executed. -.TP -.B print -Prints the current record. -.TP -.BI print " expr-list" -Prints expressions. -Each expression is separated by the value of the -.B OFS -variable. The output record is terminated with the value of the -.B ORS -variable. -.TP -.BI print " expr-list" " >" file -Prints expressions on -.IR file . -Each expression is separated by the value of the -.B OFS -variable. The output record is terminated with the value of the -.B ORS -variable. -.TP -.BI printf " fmt, expr-list" -Format and print. -.TP -.BI printf " fmt, expr-list" " >" file -Format and print on -.IR file . -.TP -.BI system( cmd-line ) -Execute the command -.IR cmd-line , -and return the exit status. -(This may not be available on non-\*(PX systems.) -.PP -Other input/output redirections are also allowed. For -.B print -and -.BR printf , -.BI >> file -appends output to the -.IR file , -while -.BI | " command" -writes on a pipe. -In a similar fashion, -.IB command " | getline" -pipes into -.BR getline . -The -.BR getline -command will return 0 on end of file, and \-1 on an error. 
-.SS The \fIprintf\fP\^ Statement -.PP -The AWK versions of the -.B printf -statement and -.B sprintf() -function -(see below) -accept the following conversion specification formats: -.TP -.B %c -An \s-1ASCII\s+1 character. -If the argument used for -.B %c -is numeric, it is treated as a character and printed. -Otherwise, the argument is assumed to be a string, and the only first -character of that string is printed. -.TP -.B %d -A decimal number (the integer part). -.TP -.B %i -Just like -.BR %d . -.TP -.B %e -A floating point number of the form -.BR [\-]d.ddddddE[+\^\-]dd . -.TP -.B %f -A floating point number of the form -.BR [\-]ddd.dddddd . -.TP -.B %g -Use -.B e -or -.B f -conversion, whichever is shorter, with nonsignificant zeros suppressed. -.TP -.B %o -An unsigned octal number (again, an integer). -.TP -.B %s -A character string. -.TP -.B %x -An unsigned hexadecimal number (an integer). -.TP -.B %X -Like -.BR %x , -but using -.B ABCDEF -instead of -.BR abcdef . -.TP -.B %% -A single -.B % -character; no argument is converted. -.PP -There are optional, additional parameters that may lie between the -.B % -and the control letter: -.TP -.B \- -The expression should be left-justified within its field. -.TP -.I width -The field should be padded to this width. If the number has a leading -zero, then the field will be padded with zeros. -Otherwise it is padded with blanks. -This applies even to the non-numeric output formats. -.TP -.BI . prec -A number indicating the maximum width of strings or digits to the right -of the decimal point. -.PP -The dynamic -.I width -and -.I prec -capabilities of the \*(AN C -.B printf() -routines are supported. -A -.B * -in place of either the -.B width -or -.B prec -specifications will cause their values to be taken from -the argument list to -.B printf -or -.BR sprintf() . 
-.SS Special File Names -.PP -When doing I/O redirection from either -.B print -or -.B printf -into a file, -or via -.B getline -from a file, -.I awk -recognizes certain special filenames internally. These filenames -allow access to open file descriptors inherited from -.IR awk 's -parent process (usually the shell). -Other special filenames provide access information about the running -.B awk -process. -The filenames are: -.TP \w'\fB/dev/stdout\fR'u+1n -.B /dev/pid -Reading this file returns the process ID of the current process, -in decimal, terminated with a newline. -.TP -.B /dev/ppid -Reading this file returns the parent process ID of the current process, -in decimal, terminated with a newline. -.TP -.B /dev/pgrpid -Reading this file returns the process group ID of the current process, -in decimal, terminated with a newline. -.TP -.B /dev/user -Reading this file returns a single record terminated with a newline. -The fields are separated with blanks. -.B $1 -is the value of the -.IR getuid (2) -system call, -.B $2 -is the value of the -.IR geteuid (2) -system call, -.B $3 -is the value of the -.IR getgid (2) -system call, and -.B $4 -is the value of the -.IR getegid (2) -system call. -If there are any additional fields, they are the group IDs returned by -.IR getgroups (2). -Multiple groups may not be supported on all systems. -.TP -.B /dev/stdin -The standard input. -.TP -.B /dev/stdout -The standard output. -.TP -.B /dev/stderr -The standard error output. -.TP -.BI /dev/fd/\^ n -The file associated with the open file descriptor -.IR n . -.PP -These are particularly useful for error messages. For example: -.PP -.RS -.ft B -print "You blew it!" > "/dev/stderr" -.ft R -.RE -.PP -whereas you would otherwise have to use -.PP -.RS -.ft B -print "You blew it!" | "cat 1>&2" -.ft R -.RE -.PP -These file names may also be used on the command line to name data files. 
-.SS Numeric Functions -.PP -AWK has the following pre-defined arithmetic functions: -.PP -.TP \w'\fBsrand(\^\fIexpr\^\fB)\fR'u+1n -.BI atan2( y , " x" ) -returns the arctangent of -.I y/x -in radians. -.TP -.BI cos( expr ) -returns the cosine in radians. -.TP -.BI exp( expr ) -the exponential function. -.TP -.BI int( expr ) -truncates to integer. -.TP -.BI log( expr ) -the natural logarithm function. -.TP -.B rand() -returns a random number between 0 and 1. -.TP -.BI sin( expr ) -returns the sine in radians. -.TP -.BI sqrt( expr ) -the square root function. -.TP -.BI srand( expr ) -use -.I expr -as a new seed for the random number generator. If no -.I expr -is provided, the time of day will be used. -The return value is the previous seed for the random -number generator. -.SS String Functions -.PP -AWK has the following pre-defined string functions: -.PP -.TP "\w'\fBsprintf(\^\fIfmt\fB\^, \fIexpr-list\^\fB)\fR'u+1n" -\fBgsub(\fIr\fB, \fIs\fB, \fIt\fB)\fR -for each substring matching the regular expression -.I r -in the string -.IR t , -substitute the string -.IR s , -and return the number of substitutions. -If -.I t -is not supplied, use -.BR $0 . -.TP -.BI index( s , " t" ) -returns the index of the string -.I t -in the string -.IR s , -or 0 if -.I t -is not present. -.TP -.BI length( s ) -returns the length of the string -.IR s , -or the length of -.B $0 -if -.I s -is not supplied. -.TP -.BI match( s , " r" ) -returns the position in -.I s -where the regular expression -.I r -occurs, or 0 if -.I r -is not present, and sets the values of -.B RSTART -and -.BR RLENGTH . -.TP -\fBsplit(\fIs\fB, \fIa\fB, \fIr\fB)\fR -splits the string -.I s -into the array -.I a -on the regular expression -.IR r , -and returns the number of fields. If -.I r -is omitted, -.B FS -is used instead. -The array -.I a -is cleared first. -.TP -.BI sprintf( fmt , " expr-list" ) -prints -.I expr-list -according to -.IR fmt , -and returns the resulting string. 
-.TP -\fBsub(\fIr\fB, \fIs\fB, \fIt\fB)\fR -just like -.BR gsub() , -but only the first matching substring is replaced. -.TP -\fBsubstr(\fIs\fB, \fIi\fB, \fIn\fB)\fR -returns the -.IR n -character -substring of -.I s -starting at -.IR i . -If -.I n -is omitted, the rest of -.I s -is used. -.TP -.BI tolower( str ) -returns a copy of the string -.IR str , -with all the upper-case characters in -.I str -translated to their corresponding lower-case counterparts. -Non-alphabetic characters are left unchanged. -.TP -.BI toupper( str ) -returns a copy of the string -.IR str , -with all the lower-case characters in -.I str -translated to their corresponding upper-case counterparts. -Non-alphabetic characters are left unchanged. -.SS Time Functions -.PP -Since one of the primary uses of AWK programs is processing log files -that contain time stamp information, -.I awk -provides the following two functions for obtaining time stamps and -formatting them. -.PP -.TP "\w'\fBsystime()\fR'u+1n" -.B systime() -returns the current time of day as the number of seconds since the Epoch -(Midnight UTC, January 1, 1970 on \*(PX systems). -.TP -\fBstrftime(\fIformat\fR, \fItimestamp\fB)\fR -formats -.I timestamp -according to the specification in -.IR format. -The -.I timestamp -should be of the same form as returned by -.BR systime() . -If -.I timestamp -is missing, the current time of day is used. -See the specification for the -.B strftime() -function in \*(AN C for the format conversions that are -guaranteed to be available. -A public-domain version of -.IR strftime (3) -and a man page for it are shipped with -.IR awk ; -if that version was used to build -.IR awk , -then all of the conversions described in that man page are available to -.IR awk. -.SS String Constants -.PP -String constants in AWK are sequences of characters enclosed -between double quotes (\fB"\fR). Within strings, certain -.I "escape sequences" -are recognized, as in C. 
These are: -.PP -.TP \w'\fB\e\^\fIddd\fR'u+1n -.B \e\e -A literal backslash. -.TP -.B \ea -The ``alert'' character; usually the \s-1ASCII\s+1 \s-1BEL\s+1 character. -.TP -.B \eb -backspace. -.TP -.B \ef -form-feed. -.TP -.B \en -new line. -.TP -.B \er -carriage return. -.TP -.B \et -horizontal tab. -.TP -.B \ev -vertical tab. -.TP -.BI \ex "\^hex digits" -The character represented by the string of hexadecimal digits following -the -.BR \ex . -As in \*(AN C, all following hexadecimal digits are considered part of -the escape sequence. -(This feature should tell us something about language design by committee.) -E.g., \fB"\ex1B"\fR is the \s-1ASCII\s+1 \s-1ESC\s+1 (escape) character. -.TP -.BI \e ddd -The character represented by the 1-, 2-, or 3-digit sequence of octal -digits. E.g. \fB"\e033"\fR is the \s-1ASCII\s+1 \s-1ESC\s+1 (escape) character. -.TP -.BI \e c -The literal character -.IR c\^ . -.PP -The escape sequences may also be used inside constant regular expressions -(e.g., -.B "/[\ \et\ef\en\er\ev]/" -matches whitespace characters). -.SH FUNCTIONS -Functions in AWK are defined as follows: -.PP -.RS -\fBfunction \fIname\fB(\fIparameter list\fB) { \fIstatements \fB}\fR -.RE -.PP -Functions are executed when called from within the action parts of regular -pattern-action statements. Actual parameters supplied in the function -call are used to instantiate the formal parameters declared in the function. -Arrays are passed by reference, other variables are passed by value. -.PP -Since functions were not originally part of the AWK language, the provision -for local variables is rather clumsy: They are declared as extra parameters -in the parameter list. The convention is to separate local variables from -real parameters by extra spaces in the parameter list. For example: -.PP -.RS -.ft B -.nf -function f(p, q, a, b) { # a & b are local - ..... } - -/abc/ { ... ; f(1, 2) ; ... 
} -.fi -.ft R -.RE -.PP -The left parenthesis in a function call is required -to immediately follow the function name, -without any intervening white space. -This is to avoid a syntactic ambiguity with the concatenation operator. -This restriction does not apply to the built-in functions listed above. -.PP -Functions may call each other and may be recursive. -Function parameters used as local variables are initialized -to the null string and the number zero upon function invocation. -.PP -The word -.B func -may be used in place of -.BR function . -.SH EXAMPLES -.nf -Print and sort the login names of all users: - -.ft B - BEGIN { FS = ":" } - { print $1 | "sort" } - -.ft R -Count lines in a file: - -.ft B - { nlines++ } - END { print nlines } - -.ft R -Precede each line by its number in the file: - -.ft B - { print FNR, $0 } - -.ft R -Concatenate and line number (a variation on a theme): - -.ft B - { print NR, $0 } -.ft R -.fi -.SH SEE ALSO -.IR egrep (1), -.IR getpid (2), -.IR getppid (2), -.IR getpgrp (2), -.IR getuid (2), -.IR geteuid (2), -.IR getgid (2), -.IR getegid (2), -.IR getgroups (2) -.PP -.IR "The AWK Programming Language" , -Alfred V. Aho, Brian W. Kernighan, Peter J. Weinberger, -Addison-Wesley, 1988. ISBN 0-201-07981-X. -.PP -.IR "The GAWK Manual" , -Edition 0.15, published by the Free Software Foundation, 1993. -.SH POSIX COMPATIBILITY -A primary goal for -.I awk -is compatibility with the \*(PX standard, as well as with the -latest version of \*(UX -.IR awk . -To this end, -.I awk -incorporates the following user visible -features which are not described in the AWK book, -but are part of -.I awk -in System V Release 4, and are in the \*(PX standard. -.PP -The -.B \-v -option for assigning variables before program execution starts is new. -The book indicates that command line variable assignment happens when -.I awk -would otherwise open the argument as a file, which is after the -.B BEGIN -block is executed. 
However, in earlier implementations, when such an -assignment appeared before any file names, the assignment would happen -.I before -the -.B BEGIN -block was run. Applications came to depend on this ``feature.'' -When -.I awk -was changed to match its documentation, this option was added to -accommodate applications that depended upon the old behavior. -(This feature was agreed upon by both the AT&T and GNU developers.) -.PP -The -.B \-W -option for implementation specific features is from the \*(PX standard. -.PP -When processing arguments, -.I awk -uses the special option ``\fB\-\^\-\fP'' to signal the end of -arguments. -In compatibility mode, it will warn about, but otherwise ignore, -undefined options. -In normal operation, such arguments are passed on to the AWK program for -it to process. -.PP -The AWK book does not define the return value of -.BR srand() . -The System V Release 4 version of \*(UX -.I awk -(and the \*(PX standard) -has it return the seed it was using, to allow keeping track -of random number sequences. Therefore -.B srand() -in -.I awk -also returns its current seed. -.PP -Other new features are: -The use of multiple -.B \-f -options (from MKS -.IR awk ); -the -.B ENVIRON -array; the -.BR \ea , -and -.BR \ev -escape sequences (done originally in -.I awk -and fed back into AT&T's); the -.B tolower() -and -.B toupper() -built-in functions (from AT&T); and the \*(AN C conversion specifications in -.B printf -(done first in AT&T's version). -.SH GNU EXTENSIONS -.I Gawk -has some extensions to \*(PX -.IR awk . -They are described in this section. All the extensions described here -can be disabled by -invoking -.I awk -with the -.B "\-W compat" -option. -.PP -The following features of -.I awk -are not available in -\*(PX -.IR awk . -.RS -.TP \w'\(bu'u+1n -\(bu -The -.B \ex -escape sequence. -.TP -\(bu -The -.B systime() -and -.B strftime() -functions. -.TP -\(bu -The special file names available for I/O redirection are not recognized. 
-.TP -\(bu -The -.B ARGIND -and -.B ERRNO -variables are not special. -.TP -\(bu -The -.B IGNORECASE -variable and its side-effects are not available. -.TP -\(bu -The -.B FIELDWIDTHS -variable and fixed width field splitting. -.TP -\(bu -No path search is performed for files named via the -.B \-f -option. Therefore the -.B AWKPATH -environment variable is not special. -.TP -\(bu -The use of -.B "next file" -to abandon processing of the current input file. -.TP -\(bu -The use of -.BI delete " array" -to delete the entire contents of an array. -.RE -.PP -The AWK book does not define the return value of the -.B close() -function. -.IR Gawk\^ 's -.B close() -returns the value from -.IR fclose (3), -or -.IR pclose (3), -when closing a file or pipe, respectively. -.PP -When -.I awk -is invoked with the -.B "\-W compat" -option, -if the -.I fs -argument to the -.B \-F -option is ``t'', then -.B FS -will be set to the tab character. -Since this is a rather ugly special case, it is not the default behavior. -This behavior also does not occur if -.B "\-W posix" -has been specified. -.ig -.PP -If -.I awk -was compiled for debugging, it will -accept the following additional options: -.TP -.PD 0 -.B \-Wparsedebug -.TP -.PD -.B \-\^\-parsedebug -Turn on -.IR yacc (1) -or -.IR bison (1) -debugging output during program parsing. -This option should only be of interest to the -.I awk -maintainers, and may not even be compiled into -.IR awk . -.. -.SH HISTORICAL FEATURES -There are two features of historical AWK implementations that -.I awk -supports. -First, it is possible to call the -.B length() -built-in function not only with no argument, but even without parentheses! -Thus, -.RS -.PP -.ft B -a = length -.ft R -.RE -.PP -is the same as either of -.RS -.PP -.ft B -a = length() -.br -a = length($0) -.ft R -.RE -.PP -This feature is marked as ``deprecated'' in the \*(PX standard, and -.I awk -will issue a warning about its use if -.B "\-W lint" -is specified on the command line. 
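A small sketch of the equivalence described above, run from the shell. The wrapper name record_lengths and the input word are arbitrary choices for this illustration, and the example assumes an awk that still accepts the bare form of length.

```shell
# record_lengths is a hypothetical wrapper used only for this illustration.
# It evaluates the three spellings of length on each input record
# and prints the three results side by side.
record_lengths() {
    awk '{ a = length; b = length(); c = length($0); print a, b, c }'
}

# For a five-character record, all three values should agree.
echo hello | record_lengths
```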
-.PP -The other feature is the use of the -.B continue -statement outside the body of a -.BR while , -.BR for , -or -.B do -loop. Traditional AWK implementations have treated such usage as -equivalent to the -.B next -statement. -.I Gawk -will support this usage if -.B "\-W posix" -has not been specified. -.SH ENVIRONMENT VARIABLES -If -.B POSIXLY_CORRECT -exists in the environment, then -.I awk -behaves exactly as if -.B \-\-posix -had been specified on the command line. -If -.B \-\-lint -has been specified, -.I awk -will issue a warning message to this effect. -.SH BUGS -The -.B \-F -option is not necessary given the command line variable assignment feature; -it remains only for backwards compatibility. -.PP -If your system actually has support for -.B /dev/fd -and the associated -.BR /dev/stdin , -.BR /dev/stdout , -and -.B /dev/stderr -files, you may get different output from -.I awk -than you would get on a system without those files. When -.I awk -interprets these files internally, it synchronizes output to the standard -output with output to -.BR /dev/stdout , -while on a system with those files, the output is actually to different -open files. -Caveat Emptor. -.SH VERSION INFORMATION -This man page documents -.IR awk , -version 2.15. -.PP -Starting with the 2.15 version of -.IR awk , -the -.BR \-c , -.BR \-V , -.BR \-C , -.ig -.BR \-D , -.. -.BR \-a , -and -.B \-e -options of the 2.11 version are no longer recognized. -This fact will not even be documented in the manual page for version 2.16. -.SH AUTHORS -The original version of \*(UX -.I awk -was designed and implemented by Alfred Aho, -Peter Weinberger, and Brian Kernighan of AT&T Bell Labs. Brian Kernighan -continues to maintain and enhance it. -.PP -Paul Rubin and Jay Fenlason, -of the Free Software Foundation, wrote -.IR gawk , -to be compatible with the original version of -.I awk -distributed in Seventh Edition \*(UX. -John Woods contributed a number of bug fixes. 
-David Trueman, with contributions -from Arnold Robbins, made -.I gawk -compatible with the new version of \*(UX -.IR awk . -.PP -The initial DOS port was done by Conrad Kwok and Scott Garfinkle. -Scott Deifik is the current DOS maintainer. Pat Rankin did the -port to VMS, and Michal Jaegermann did the port to the Atari ST. -The port to OS/2 was done by Kai Uwe Rommel, with contributions and -help from Darrel Hankerson. -.SH ACKNOWLEDGEMENTS -Brian Kernighan of Bell Labs -provided valuable assistance during testing and debugging. -We thank him.