diff options
Diffstat (limited to 'gnu/usr.bin/grep/doc/grep.texi')
-rw-r--r-- | gnu/usr.bin/grep/doc/grep.texi | 607 |
1 files changed, 445 insertions, 162 deletions
diff --git a/gnu/usr.bin/grep/doc/grep.texi b/gnu/usr.bin/grep/doc/grep.texi index 23b0553..50a6938 100644 --- a/gnu/usr.bin/grep/doc/grep.texi +++ b/gnu/usr.bin/grep/doc/grep.texi @@ -22,19 +22,20 @@ @defcodeindex op @syncodeindex op fn +@syncodeindex vr fn @ifinfo @direntry * grep: (grep). print lines matching a pattern. @end direntry -This file documents @sc{grep}, a pattern matching engine. +This file documents @command{grep}, a pattern matching engine. Published by the Free Software Foundation, 59 Temple Place - Suite 330 Boston, MA 02111-1307, USA -Copyright (C) 1998 Free Software Foundation, Inc. +Copyright 1999 Free Software Foundation, Inc. Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice @@ -67,7 +68,7 @@ by the Foundation. @page @vskip 0pt plus 1filll -Copyright @copyright{} 1998 Free Software Foundation, Inc. +Copyright @copyright{} 1999 Free Software Foundation, Inc. @sp 2 Published by the Free Software Foundation, @* @@ -92,43 +93,48 @@ by the Foundation. @page -@node Top, Introduction, (dir), (dir) -@comment node-name, next, previous, up +@ifnottex +@node Top +@top Grep -@ifinfo -This document was produced for version @value{VERSION} of @sc{GNU} @sc{grep}. -@end ifinfo +@command{grep} searches for lines matching a pattern. + +This document was produced for version @value{VERSION} of @sc{gnu} +@command{grep}. +@end ifnottex @menu * Introduction:: Introduction. -* Invoking:: Invoking @sc{grep}; description of options. -* Diagnostics:: Exit status returned by @sc{grep}. -* Grep Programs:: @sc{grep} programs. +* Invoking:: Invoking @command{grep}; description of options. +* Diagnostics:: Exit status returned by @command{grep}. +* Grep Programs:: @command{grep} programs. * Regular Expressions:: Regular Expressions. +* Usage:: Examples. * Reporting Bugs:: Reporting Bugs. * Concept Index:: A menu with all the topics in this manual. -* Index:: A menu with all @sc{grep} commands +* Index:: A menu with all @command{grep} commands and command-line options. @end menu -@node Introduction, Invoking, Top, Top -@comment node-name, next, previous, up +@node Introduction @chapter Introduction @cindex Searching for a pattern. -@sc{grep} searches the input files for lines containing a match to a given + +@command{grep} searches the input files +for lines containing a match to a given pattern list. When it finds a match in a line, it copies the line to standard output (by default), or does whatever other sort of output you have requested -with options. @sc{grep} expects to do the matching on text. +with options. @command{grep} expects to do the matching on text. Since newline is also a separator for the list of patterns, there is no way to match newline characters in a text. -@node Invoking, Diagnostics, Introduction, Top -@comment node-name, next, previous, up -@chapter Invoking @sc{grep} +@node Invoking +@chapter Invoking @command{grep} -@sc{grep} comes with a rich set of options from POSIX.2 and GNU extensions. +@command{grep} comes with a rich set of options from @sc{posix.2} and @sc{gnu} +extensions. @table @samp @@ -138,7 +144,7 @@ is no way to match newline characters in a text. @opindex -count @cindex counting lines Suppress normal output; instead print a count of matching -lines for each input file. With the @samp{-v}, @samp{--revert-match} option, +lines for each input file. With the @samp{-v}, @samp{--invert-match} option, count non-matching lines. @item -e @var{pattern} @@ -146,15 +152,15 @@ count non-matching lines. @opindex -e @opindex --regexp=@var{pattern} @cindex pattern list -Use @var{pattern} as the pattern; useful to protect patterns +Use @var{pattern} as the pattern; useful to protect patterns beginning with a @samp{-}. -@item -f @var{file} +@item -f @var{file} @itemx --file=@var{file} -@opindex -f -@opindex --file +@opindex -f +@opindex --file @cindex pattern from file -Obtain patterns from @var{file}, one per line. The empty +Obtain patterns from @var{file}, one per line. The empty file contains zero patterns, and therefore matches nothing. @item -i @@ -162,15 +168,15 @@ file contains zero patterns, and therefore matches nothing. @opindex -i @opindex --ignore-case @cindex case insensitive search -Ignore case distinctions in both the pattern and the input files. +Ignore case distinctions in both the pattern and the input files. @item -l @itemx --files-with-matches @opindex -l @opindex --files-with-matches @cindex names of matching files -Suppress normal output; instead print the name of each input -file from which output would normally have been printed. +Suppress normal output; instead print the name of each input +file from which output would normally have been printed. The scanning of every file will stop on the first match. @item -n @@ -178,7 +184,7 @@ The scanning of every file will stop on the first match. @opindex -n @opindex --line-number @cindex line numbering -Prefix each line of output with the line number within its input file. +Prefix each line of output with the line number within its input file. @item -q @itemx --quiet @@ -187,7 +193,7 @@ Prefix each line of output with the line number within its input file. @opindex --quiet @opindex --silent @cindex quiet, silent -Quiet; suppress normal output. The scanning of every file will stop on +Quiet; suppress normal output. The scanning of every file will stop on the first match. Also see the @samp{-s} or @samp{--no-messages} option. @item -s @@ -196,31 +202,32 @@ the first match. Also see the @samp{-s} or @samp{--no-messages} option. @opindex --no-messages @cindex suppress error messages Suppress error messages about nonexistent or unreadable files. -Portability note: unlike GNU @sc{grep}, BSD @sc{grep} does not comply -with POSIX.2, because BSD @sc{grep} lacks a @samp{-q} option and its -@samp{-s} option behaves like GNU @sc{grep}'s @samp{-q} option. Shell -scripts intended to be portable to BSD @sc{grep} should avoid both +Portability note: unlike @sc{gnu} @command{grep}, traditional +@command{grep} did not conform to @sc{posix.2}, because traditional +@command{grep} lacked a @samp{-q} option and its @samp{-s} option behaved +like @sc{gnu} @command{grep}'s @samp{-q} option. Shell scripts intended +to be portable to traditional @command{grep} should avoid both @samp{-q} and @samp{-s} and should redirect output to @file{/dev/null} instead. @item -v -@itemx --revert-match +@itemx --invert-match @opindex -v -@opindex --revert-match -@cindex revert matching +@opindex --invert-match +@cindex invert matching @cindex print non-matching lines -Invert the sense of matching, to select non-matching lines. +Invert the sense of matching, to select non-matching lines. @item -x @itemx --line-regexp @opindex -x @opindex --line-regexp @cindex match the whole line -Select only those matches that exactly match the whole line. +Select only those matches that exactly match the whole line. @end table -@section GNU Extensions +@section @sc{gnu} Extensions @table @samp @@ -240,17 +247,17 @@ Print @var{num} lines of trailing context after matching lines. @cindex context lines, before match Print @var{num} lines of leading context before matching lines. -@item -C -@itemx --context@var{[=num]} +@item -C @var{num} +@itemx --context=[@var{num}] @opindex -C @opindex --context @cindex context Print @var{num} lines (default 2) of output context. -@item -NUM +@item -@var{num} @opindex -NUM -Same as @samp{--context=@var{num}} lines of leading and trailing +Same as @samp{--context=@var{num}} lines of leading and trailing context. However, grep will never print any given line more than once. @@ -259,8 +266,8 @@ context. However, grep will never print any given line more than once. @opindex -V @opindex --version @cindex Version, printing -Print the version number of @sc{grep} to the standard output stream. -This version number should be included in all bug reports. +Print the version number of @command{grep} to the standard output stream. +This version number should be included in all bug reports. @item --help @opindex --help @@ -274,24 +281,32 @@ and the bug-reporting address, then exit. @opindex --byte-offset @cindex byte offset Print the byte offset within the input file before each line of output. -When @sc{grep} runs on MS-DOS or MS-Windows, the printed byte offsets +When @command{grep} runs on @sc{ms-dos} or MS-Windows, the printed +byte offsets depend on whether the @samp{-u} (@samp{--unix-byte-offsets}) option is used; see below. @item -d @var{action} @itemx --directories=@var{action} -@opindex -d +@opindex -d @opindex --directories @cindex directory search -If an input file is a directory, use @var{action} to process it. -By default, @var{action} is @samp{read}, which means that directories are -read just as if they were ordinary files (some operating systems -and filesystems disallow this, and will cause @sc{grep} to print error +If an input file is a directory, use @var{action} to process it. +By default, @var{action} is @samp{read}, which means that directories are +read just as if they were ordinary files (some operating systems +and filesystems disallow this, and will cause @command{grep} to print error messages for every directory). If @var{action} is @samp{skip}, directories are silently skipped. If @var{action} is @samp{recurse}, -@sc{grep} reads all files under each directory, recursively; this is +@command{grep} reads all files under each directory, recursively; this is equivalent to the @samp{-r} option. +@item -H +@itemx --with-filename +@opindex -H +@opindex --With-filename +@cindex with filename prefix +Print the filename for each match. + @item -h @itemx --no-filename @opindex -h @@ -304,9 +319,9 @@ Suppress the prefixing of filenames on output when multiple files are searched. @opindex -L @opindex --files-without-match @cindex files which don't match -Suppress normal output; instead print the name of each input -file from which no output would normally have been printed. -The scanning of every file will stop on the first match. +Suppress normal output; instead print the name of each input +file from which no output would normally have been printed. +The scanning of every file will stop on the first match. @item -a @itemx --text @@ -314,14 +329,14 @@ The scanning of every file will stop on the first match. @opindex --text @cindex suppress binary data @cindex binary files -Do not suppress output lines that contain binary data. -Normally, if the first few bytes of a file indicate +Do not suppress output lines that contain binary data. +Normally, if the first few bytes of a file indicate that the file contains binary data, grep outputs only a message saying that the file matches the pattern. This -option causes grep to act as if the file is a text +option causes grep to act as if the file is a text file, even if it would otherwise be treated as binary. -@emph{Warning:} the result might be binary garbage -printed to the terminal, which can have nasty +@emph{Warning:} the result might be binary garbage +printed to the terminal, which can have nasty side-effects if the terminal driver interprets some of it as commands. @@ -330,12 +345,12 @@ it as commands. @opindex -w @opindex --word-regexp @cindex matching whole words -Select only those lines containing matches that form -whole words. The test is that the matching substring -must either be at the beginning of the line, or preceded +Select only those lines containing matches that form +whole words. The test is that the matching substring +must either be at the beginning of the line, or preceded by a non-word constituent character. Similarly, it must be either at the end of the line or followed by -a non-word constituent character. Word-constituent +a non-word constituent character. Word-constituent characters are letters, digits, and the underscore. @item -r @@ -359,18 +374,18 @@ Obsolete synonym for @samp{-i}. @opindex --binary @cindex DOS/Windows binary files @cindex binary files, DOS/Windows -Treat the file(s) as binary. By default, under MS-DOS -and MS-Windows, @sc{grep} guesses the file type by looking -at the contents of the first 32KB read from the file. -If @sc{grep} decides the file is a text file, it strips the -CR characters from the original file contents (to make -regular expressions with @code{^} and @code{$} work correctly). +Treat the file(s) as binary. By default, under @sc{ms-dos} +and MS-Windows, @command{grep} guesses the file type by looking +at the contents of the first 32kB read from the file. +If @command{grep} decides the file is a text file, it strips the +@code{CR} characters from the original file contents (to make +regular expressions with @code{^} and @code{$} work correctly). Specifying @samp{-U} overrules this guesswork, causing all -files to be read and passed to the matching mechanism -verbatim; if the file is a text file with CR/LF pairs -at the end of each line, this will cause some regular -expressions to fail. This option is only supported on -MS-DOS and MS-Windows. +files to be read and passed to the matching mechanism +verbatim; if the file is a text file with @code{CR/LF} pairs +at the end of each line, this will cause some regular +expressions to fail. This option has no effect on platforms other than +@sc{ms-dos} and MS-Windows. @item -u @itemx --unix-byte-offsets @@ -378,38 +393,146 @@ MS-DOS and MS-Windows. @opindex --unix-byte-offsets @cindex DOS byte offsets @cindex byte offsets, on DOS/Windows -Report Unix-style byte offsets. This switch causes -@sc{grep} to report byte offsets as if the file were Unix style -text file, i.e. the byte offsets ignore the CR characters which were -stripped off. This will produce results identical to running @sc{grep} on -a Unix machine. This option has no effect unless @samp{-b} -option is also used; it is only supported on MS-DOS and +Report Unix-style byte offsets. This switch causes +@command{grep} to report byte offsets as if the file were Unix style +text file, i.e., the byte offsets ignore the @code{CR} characters which were +stripped. This will produce results identical to running @command{grep} on +a Unix machine. This option has no effect unless @samp{-b} +option is also used; it has no effect on platforms other than @sc{ms-dos} and MS-Windows. +@item --mmap +@opindex --mmap +@cindex memory mapped input +If possible, use the @code{mmap} system call to read input, instead of +the default @code{read} system call. In some situations, @samp{--mmap} +yields better performance. However, @samp{--mmap} can cause undefined +behavior (including core dumps) if an input file shrinks while +@command{grep} is operating, or if an I/O error occurs. + +@item -Z +@itemx --null +@opindex -Z +@opindex --null +@cindex zero-terminated file names +Output a zero byte (the @sc{ascii} @code{NUL} character) instead of the +character that normally follows a file name. For example, @samp{grep +-lZ} outputs a zero byte after each file name instead of the usual +newline. This option makes the output unambiguous, even in the presence +of file names containing unusual characters like newlines. This option +can be used with commands like @samp{find -print0}, @samp{perl -0}, +@samp{sort -z}, and @samp{xargs -0} to process arbitrary file names, +even those that contain newline characters. + +@item -z +@itemx --null-data +@opindex -z +@opindex --null-data +@cindex zero-terminated lines +Treat the input as a set of lines, each terminated by a zero byte (the +@sc{ascii} @code{NUL} character) instead of a newline. Like the @samp{-Z} +or @samp{--null} option, this option can be used with commands like +@samp{sort -z} to process arbitrary file names. + @end table -Several additional options control which variant of the @sc{grep} +Several additional options control which variant of the @command{grep} matching engine is used. @xref{Grep Programs}. -@sc{grep} uses the environment variable @var{LANG} to -provide internationalization support, if compiled with this feature. +@section Environment Variables + +Grep's behavior is affected by the following environment variables. +@cindex environment variables + +@table @code + +@item GREP_OPTIONS +@vindex GREP_OPTIONS +@cindex default options environment variable +This variable specifies default options to be placed in front of any +explicit options. For example, if @code{GREP_OPTIONS} is @samp{--text +--directories=skip}, @command{grep} behaves as if the two options +@samp{--text} and @samp{--directories=skip} had been specified before +any explicit options. Option specifications are separated by +whitespace. A backslash escapes the next character, so it can be used to +specify an option containing whitespace or a backslash. + +@item LC_ALL +@itemx LC_MESSAGES +@itemx LANG +@vindex LC_ALL +@vindex LC_MESSAGES +@vindex LANG +@cindex language of messages +@cindex message language +@cindex national language support +@cindex NLS +@cindex translation of message language +These variables specify the @code{LC_MESSAGES} locale, which determines +the language that @command{grep} uses for messages. The locale is determined +by the first of these variables that is set. American English is used +if none of these environment variables are set, or if the message +catalog is not installed, or if @command{grep} was not compiled with national +language support (@sc{nls}). + +@item LC_ALL +@itemx LC_CTYPE +@itemx LANG +@vindex LC_ALL +@vindex LC_CTYPE +@vindex LANG +@cindex character type +@cindex national language support +@cindex NLS +These variables specify the @code{LC_CTYPE} locale, which determines the +type of characters, e.g., which characters are whitespace. The locale is +determined by the first of these variables that is set. The @sc{posix} +locale is used if none of these environment variables are set, or if the +locale catalog is not installed, or if @command{grep} was not compiled with +national language support (@sc{nls}). + +@item POSIXLY_CORRECT +@vindex POSIXLY_CORRECT +If set, @command{grep} behaves as @sc{posix.2} requires; otherwise, +@command{grep} behaves more like other @sc{gnu} programs. @sc{posix.2} +requires that options that +follow file names must be treated as file names; by default, such +options are permuted to the front of the operand list and are treated as +options. Also, @sc{posix.2} requires that unrecognized options be +diagnosed as +``illegal'', but since they are not really against the law the default +is to diagnose them as ``invalid''. @code{POSIXLY_CORRECT} also +disables @code{_@var{N}_GNU_nonoption_argv_flags_}, described below. + +@item _@var{N}_GNU_nonoption_argv_flags_ +@vindex _@var{N}_GNU_nonoption_argv_flags_ +(Here @code{@var{N}} is @command{grep}'s numeric process ID.) If the +@var{i}th character of this environment variable's value is @samp{1}, do +not consider the @var{i}th operand of @command{grep} to be an option, even if +it appears to be one. A shell can put this variable in the environment +for each command it runs, specifying which operands are the results of +file name wildcard expansion and therefore should not be treated as +options. This behavior is available only with the @sc{gnu} C library, and +only when @code{POSIXLY_CORRECT} is not set. -@node Diagnostics, Grep Programs, Invoking, Top -@comment node-name, next, previous, up +@end table + +@node Diagnostics @chapter Diagnostics + Normally, exit status is 0 if matches were found, and 1 if no matches were found (the @samp{-v} option inverts the sense of the exit status). -Exit status is 2 if there were syntax errors in the pattern, +Exit status is 2 if there were syntax errors in the pattern, inaccessible input files, or other system errors. -@node Grep Programs, Regular Expressions, Diagnostics, Top -@comment node-name, next, previous, up -@chapter @sc{grep} programs +@node Grep Programs +@chapter @command{grep} programs -@sc{grep} searches the named input files (or standard input if no +@command{grep} searches the named input files (or standard input if no files are named, or the file name @file{-} is given) for lines containing -a match to the given pattern. By default, @sc{grep} prints the matching lines. -There are three major variants of @sc{grep}, controlled by the following options. +a match to the given pattern. By default, @command{grep} prints the +matching lines. There are three major variants of @command{grep}, +controlled by the following options. @table @samp @@ -418,14 +541,14 @@ There are three major variants of @sc{grep}, controlled by the following options @opindex -G @opindex --basic-regexp @cindex matching basic regular expressions -Interpret pattern as a basic regular expression. This is the default. +Interpret pattern as a basic regular expression. This is the default. @item -E -@item --extended-regexp +@itemx --extended-regexp @opindex -E @opindex --extended-regexp @cindex matching extended regular expressions -Interpret pattern as an extended regular expression. +Interpret pattern as an extended regular expression. @item -F @@ -439,60 +562,66 @@ by newlines, any of which is to be matched. @end table In addition, two variant programs @sc{egrep} and @sc{fgrep} are available. -@sc{egrep} is similar (but not identical) to @samp{grep -E}, and -is compatible with the historical Unix @sc{egrep}. @sc{fgrep} is the +@sc{egrep} is the same as @samp{grep -E}. @sc{fgrep} is the same as @samp{grep -F}. -@node Regular Expressions, Reporting Bugs, Grep Programs, Top -@comment node-name, next, previous, up +@node Regular Expressions @chapter Regular Expressions @cindex regular expressions -A @dfn{regular expression} is a pattern that describes a set of strings. +A @dfn{regular expression} is a pattern that describes a set of strings. Regular expressions are constructed analogously to arithmetic expressions, -by using various operators to combine smaller expressions. -@sc{grep} understands two different versions of regular expression -syntax: ``basic'' and ``extended''. In GNU @sc{grep}, there is no -difference in available functionality using either syntax. -In other implementations, basic regular expressions are less powerful. -The following description applies to extended regular expressions; +by using various operators to combine smaller expressions. +@command{grep} understands two different versions of regular expression +syntax: ``basic'' and ``extended''. In @sc{gnu} @command{grep}, there is no +difference in available functionality using either syntax. +In other implementations, basic regular expressions are less powerful. +The following description applies to extended regular expressions; differences for basic regular expressions are summarized afterwards. -The fundamental building blocks are the regular expressions that match +The fundamental building blocks are the regular expressions that match a single character. Most characters, including all letters and digits, -are regular expressions that match themselves. Any metacharacter +are regular expressions that match themselves. Any metacharacter with special meaning may be quoted by preceding it with a backslash. A list of characters enclosed by @samp{[} and @samp{]} matches any single character in that list; if the first character of the list is the -caret @samp{^}, then it +caret @samp{^}, then it matches any character @strong{not} in the list. For example, the regular expression @samp{[0123456789]} matches any single digit. -A range of @sc{ascii} characters may be specified by giving the first -and last characters, separated by a hyphen. Finally, certain named -classes of characters are predefined. Their names are self explanatory, -and they are : +A range of @sc{ascii} characters may be specified by giving the first +and last characters, separated by a hyphen. + +Finally, certain named classes of characters are predefined, as follows. +Their interpretation depends on the @code{LC_CTYPE} locale; the +interpretation below is that of the @sc{posix} locale, which is the default +if no @code{LC_CTYPE} locale is specified. @cindex classes of characters @cindex character classes @table @samp @item [:alnum:] -@opindex alnum -@cindex alphanumeric characters -Any of [:digit:] or [:alpha:] +@opindex alnum +@cindex alphanumeric characters +Any of @samp{[:digit:]} or @samp{[:alpha:]} @item [:alpha:] @opindex alpha @cindex alphabetic characters -Any local-specific or one of the @sc{ascii} letters:@* +Any letter:@* @code{a b c d e f g h i j k l m n o p q r s t u v w x y z},@* @code{A B C D E F G H I J K L M N O P Q R S T U V W X Y Z}. +@item [:blank:] +@opindex blank +@cindex blank characters +Space or tab. + @item [:cntrl:] @opindex cntrl @cindex control characters -Any of @code{BEL}, @code{BS}, @code{CR}, @code{FF}, @code{HT}, -@code{NL}, or @code{VT}. +Any character with octal codes 000 through 037, or @code{DEL} (octal +code 177). @item [:digit:] @opindex digit @@ -503,7 +632,7 @@ Any one of @code{0 1 2 3 4 5 6 7 8 9}. @item [:graph:] @opindex graph @cindex graphic characters -Anything that is not a @samp{[:alphanum:]} or @samp{[:punct:]}. +Anything that is not a @samp{[:alnum:]} or @samp{[:punct:]}. @item [:lower:] @opindex lower @@ -514,13 +643,12 @@ Any one of @code{a b c d e f g h i j k l m n o p q r s t u v w x y z}. @opindex print @cindex printable characters Any character from the @samp{[:space:]} class, and any character that is -@strong{not} in the @samp{[:isgraph:]} class. +@strong{not} in the @samp{[:graph:]} class. @item [:punct:] @opindex punct @cindex punctuation characters -Any one of @code{!@: " #% & ' ( ) ; < = > ?@: [ \ ] * + , - .@: / : ^ _ @{ | @}}. - +Any one of @code{!@: " # $ % & ' ( ) * + , - .@: / : ; < = > ?@: @@ [ \ ] ^ _ ` @{ | @} ~}. @item [:space:] @opindex space @@ -541,13 +669,13 @@ Any one of @code{a b c d e f A B C D E F 0 1 2 3 4 5 6 7 8 9}. @end table For example, @samp{[[:alnum:]]} means @samp{[0-9A-Za-z]}, except the latter -form is dependent upon the @sc{ascii} character encoding, whereas the -former is portable. (Note that the brackets in these class names are -part of the symbolic names, and must be included in addition to -the brackets delimiting the bracket list). Most metacharacters lose +form is dependent upon the @sc{ascii} character encoding, whereas the +former is portable. (Note that the brackets in these class names are +part of the symbolic names, and must be included in addition to +the brackets delimiting the bracket list.) Most metacharacters lose their special meaning inside lists. To include a literal @samp{]}, place it first in the list. Similarly, to include a literal @samp{^}, place it anywhere -but first. Finally, to include a literal @samp{-}, place it last. +but first. Finally, to include a literal @samp{-}, place it last. The period @samp{.} matches any single character. The symbol @samp{\w} is a synonym for @samp{[[:alnum:]]} and @samp{\W} is a synonym for @@ -555,12 +683,12 @@ is a synonym for @samp{[[:alnum:]]} and @samp{\W} is a synonym for The caret @samp{^} and the dollar sign @samp{$} are metacharacters that respectively match the empty string at the beginning and end -of a line. The symbols @samp{\<} and @samp{\>} respectively match the +of a line. The symbols @samp{\<} and @samp{\>} respectively match the empty string at the beginning and end of a word. The symbol -@samp{\b} matches the empty string at the edge of a word, and @samp{\B} -matches the empty string provided it's not at the edge of a word. +@samp{\b} matches the empty string at the edge of a word, and @samp{\B} +matches the empty string provided it's not at the edge of a word. -A regular expression may be followed by one of several +A regular expression may be followed by one of several repetition operators: @@ -580,7 +708,7 @@ The preceding item will be matched zero or more times. @item + @opindex + -@cindex plus sign +@cindex plus sign The preceding item will be matched one or more times. @item @{@var{n}@} @@ -595,12 +723,6 @@ The preceding item is matched exactly @var{n} times. @cindex match sub-expression n or more times The preceding item is matched n or more times. -@item @{,@var{m}@} -@opindex @{,m@} -@cindex braces, first argument omitted -@cindex match sub-expression at most m times -The preceding item is optional and is matched at most @var{m} times. - @item @{@var{n},@var{m}@} @opindex @{n,m@} @cindex braces, two arguments @@ -609,17 +731,17 @@ The preceding item is matched at least @var{n} times, but not more than @end table -Two regular expressions may be concatenated; the resulting regular +Two regular expressions may be concatenated; the resulting regular expression matches any string formed by concatenating two substrings -that respectively match the concatenated subexpressions. +that respectively match the concatenated subexpressions. -Two regular expressions may be joined by the infix operator @samp{|}; the -resulting regular expression matches any string matching either +Two regular expressions may be joined by the infix operator @samp{|}; the +resulting regular expression matches any string matching either subexpression. -Repetition takes precedence over concatenation, which in turn +Repetition takes precedence over concatenation, which in turn takes precedence over alternation. A whole subexpression may be -enclosed in parentheses to override these precedence rules. +enclosed in parentheses to override these precedence rules. The backreference @samp{\@var{n}}, where @var{n} is a single digit, matches the substring previously matched by the @var{n}th parenthesized subexpression @@ -631,40 +753,201 @@ In basic regular expressions the metacharacters @samp{?}, @samp{+}, instead use the backslashed versions @samp{\?}, @samp{\+}, @samp{\@{}, @samp{\|}, @samp{\(}, and @samp{\)}. -In @sc{egrep} the metacharacter @samp{@{} loses its special meaning; -instead use @samp{\@{}. This not true for @samp{grep -E}. +@cindex interval specifications +Traditional @command{egrep} did not support the @samp{@{} metacharacter, +and some @command{egrep} implementations support @samp{\@{} instead, so +portable scripts should avoid @samp{@{} in @samp{egrep} patterns and +should use @samp{[@{]} to match a literal @samp{@{}. + +@sc{gnu} @command{egrep} attempts to support traditional usage by +assuming that @samp{@{} is not special if it would be the start of an +invalid interval specification. For example, the shell command +@samp{egrep '@{1'} searches for the two-character string @samp{@{1} +instead of reporting a syntax error in the regular expression. +@sc{posix.2} allows this behavior as an extension, but portable scripts +should avoid it. + +@node Usage +@chapter Usage + +@cindex Usage, examples +Here is an example shell command that invokes @sc{gnu} @command{grep}: + +@example +grep -i 'hello.*world' menu.h main.c +@end example + +@noindent +This lists all lines in the files @file{menu.h} and @file{main.c} that +contain the string @samp{hello} followed by the string @samp{world}; +this is because @samp{.*} matches zero or more characters within a line. +@xref{Regular Expressions}. The @samp{-i} option causes @command{grep} +to ignore case, causing it to match the line @samp{Hello, world!}, which +it would not otherwise match. @xref{Invoking}, for more details about +how to invoke @command{grep}. + +@cindex Using @command{grep}, Q&A +@cindex FAQ about @command{grep} usage +Here are some common questions and answers about @command{grep} usage. + +@enumerate + +@item +How can I list just the names of matching files? + +@example +grep -l 'main' *.c +@end example + +@noindent +lists the names of all C files in the current directory whose contents +mention @samp{main}. + +@item +How do I search directories recursively? + +@example +grep -r 'hello' /home/gigi +@end example + +@noindent +searches for @samp{hello} in all files under the directory +@file{/home/gigi}. For more control of which files are searched, use +@command{find}, @command{grep} and @command{xargs}. For example, +the following command searches only C files: + +@smallexample +find /home/gigi -name '*.c' -print | xargs grep 'hello' /dev/null +@end smallexample + +@item +What if a pattern has a leading @samp{-}? + +@example +grep -e '--cut here--' * +@end example + +@noindent +searches for all lines matching @samp{--cut here--}. Without @samp{-e}, +@command{grep} would attempt to parse @samp{--cut here--} as a list of +options. + +@item +Suppose I want to search for a whole word, not a part of a word? + +@example +grep -w 'hello' * +@end example + +@noindent +searches only for instances of @samp{hello} that are entire words; it +does not match @samp{Othello}. For more control, use @samp{\<} and +@samp{\>} to match the start and end of words. For example: + +@example +grep 'hello\>' * +@end example + +@noindent +searches only for words ending in @samp{hello}, so it matches the word +@samp{Othello}. +@item +How do I output context around the matching lines? -@node Reporting Bugs, Concept Index, Regular Expressions, Top -@comment node-name, next, previous, up +@example +grep -C 2 'hello' * +@end example + +@noindent +prints two lines of context around each matching line. + +@item +How do I force grep to print the name of the file? + +Append @file{/dev/null}: + +@example +grep 'eli' /etc/passwd /dev/null +@end example + +@item +Why do people use strange regular expressions on @command{ps} output? + +@example +ps -ef | grep '[c]ron' +@end example + +If the pattern had been written without the square brackets, it would +have matched not only the @command{ps} output line for @command{cron}, +but also the @command{ps} output line for @command{grep}. + +@item +Why does @command{grep} report ``Binary file matches''? + +If @command{grep} listed all matching ``lines'' from a binary file, it +would probably generate output that is not useful, and it might even +muck up your display. So @sc{gnu} @command{grep} suppresses output from +files that appear to be binary files. To force @sc{gnu} @command{grep} +to output lines even from files that appear to be binary, use the +@samp{-a} or @samp{--text} option. + +@item +Why doesn't @samp{grep -lv} print nonmatching file names? + +@samp{grep -lv} lists the names of all files containing one or more +lines that do not match. To list the names of all files that contain no +matching lines, use the @samp{-L} or @samp{--files-without-match} +option. + +@item +I can do @sc{or} with @samp{|}, but what about @sc{and}? + +@example +grep 'paul' /etc/motd | grep 'franc,ois' +@end example + +@noindent +finds all lines that contain both @samp{paul} and @samp{franc,ois}. + +@item +How can I search in both standard input and in files? + +Use the special file name @samp{-}: + +@example +cat /etc/passwd | grep 'alain' - /etc/motd +@end example +@end enumerate + +@node Reporting Bugs @chapter Reporting bugs @cindex Bugs, reporting Email bug reports to @email{bug-gnu-utils@@gnu.org}. Be sure to include the word ``grep'' somewhere in the ``Subject:'' field. -Large repetition counts in the @samp{@{m,n@}} construct may cause -@sc{grep} to use lots of memory. In addition, certain other -obscure regular expressions require exponential time and +Large repetition counts in the @samp{@{m,n@}} construct may cause +@command{grep} to use lots of memory. In addition, certain other +obscure regular expressions require exponential time and space, and may cause grep to run out of memory. -Backreferences are very slow, and may require exponential time. +Backreferences are very slow, and may require exponential time. @page -@node Concept Index , Index, Reporting Bugs, Top -@comment node-name, next, previous, up +@node Concept Index @unnumbered Concept Index This is a general index of all issues discussed in this manual, with the -exception of the @sc{grep} commands and command-line options. +exception of the @command{grep} commands and command-line options. @printindex cp @page -@node Index, , Concept Index, Top +@node Index @unnumbered Index -This is an alphabetical list of all @sc{grep} commands and command-line -options. +This is an alphabetical list of all @command{grep} commands, command-line +options, and environment variables. @printindex fn |