.TH SORT 1 "GNU Text Utilities" "FSF" \" -*- nroff -*-
.SH NAME
sort \- sort lines of text files
.SH SYNOPSIS
.B sort
[\-cmus] [\-t separator] [\-o output-file] [\-T tempdir] [\-bdfiMnr]
[+POS1 [\-POS2]] [\-k POS1[,POS2]] [file...]
.br
.B sort
{\-\-help,\-\-version}
.SH DESCRIPTION
This manual page
documents the GNU version of
.BR sort .
.B sort
sorts, merges, or compares all the lines from the given files, or the standard
input if no files are given.  A file name of `-' means standard input.
By default,
.B sort
writes the results to the standard output.
.PP
.B sort
has three modes of operation: sort (the default), merge, and check for
sortedness.  The following options change the operation mode:
.TP
.I \-c
Check whether the given files are already sorted: if they are not all
sorted, print an error message and exit with a status of 1.
.TP
.I \-m
Merge the given files by sorting them as a group.  Each input file
should already be individually sorted.  Sorting always works in place
of merging; merging is provided because it is faster when the inputs
are already sorted.
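.PP
As a sketch of the two non-default modes (the file names
.IR list ,
.IR a.sorted ,
and
.I b.sorted
are hypothetical), the first command below reports whether
.I list
is already sorted, and the second merges two inputs that are assumed
to be individually sorted already:
.PP
.RS
.nf
sort \-c list || echo "list is not sorted"
sort \-m a.sorted b.sorted
.fi
.RE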
.PP
A pair of lines is compared as follows: 
if any key fields have been specified,
.B sort
compares each pair of fields, in the order specified on the command
line, according to the associated ordering options, until a difference
is found or no fields are left.
.PP 
If any of the global options
.I Mbdfinr
are given but no key fields are 
specified,
.B sort
compares the entire lines according to the global options.
.PP 
Finally, as a last resort when all keys compare equal
(or if no ordering options were specified at all),
.B sort
compares the lines byte by byte in machine collating sequence.
The last-resort comparison honors the
.I \-r
global option.
The
.I \-s
(stable) option disables this last-resort comparison so that
lines in which all fields compare equal are left in their original
relative order.  If no fields or global options are specified,
.I \-s
has no effect.
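.PP
For instance, assuming a hypothetical colon-separated file
.IR data ,
the following command would sort on the first field only and leave
lines whose first fields compare equal in their original input order:
.PP
.RS
.nf
sort \-s \-t : +0 \-1 data
.fi
.RE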
.PP
GNU
.B sort
has no limits on input line length or restrictions on bytes allowed
within lines.  In addition, if the final byte of an input file is not
a newline, GNU
.B sort
silently supplies one.
.PP
If the environment variable
.B TMPDIR
is set,
.B sort
uses it as the directory in which to put temporary files instead of
the default, /tmp.  The
.I "\-T tempdir"
option is another way to select the directory for temporary files; it
overrides the environment variable.
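.PP
A short sketch of the two ways to choose the temporary directory,
assuming a Bourne-style shell, a hypothetical input file
.IR bigfile ,
and that /var/tmp exists and is writable:
.PP
.RS
.nf
TMPDIR=/var/tmp sort bigfile
sort \-T /var/tmp bigfile
.fi
.RE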
.PP
The following options affect the ordering of output lines.  They may
be specified globally or as part of a specific key field.  If no key
fields are specified, global options apply to comparison of entire
lines; otherwise the global options are inherited by key fields that
do not specify any special options of their own.
.TP
.I \-b
Ignore leading blanks when finding sort keys in each line.
.TP
.I \-d
Sort in `phone directory' order: ignore all characters except letters,
digits and blanks when sorting.
.TP
.I \-f
Fold lower case characters into the equivalent upper case characters
when sorting so that, for example, `b' is sorted the same way `B' is.
.TP
.I \-i
Ignore characters outside the ASCII range 040-0176 octal (inclusive)
when sorting.
.TP
.I \-M
An initial string, consisting of any amount of white space followed
by three letters abbreviating a month name, is folded to upper case
and compared in the order `JAN' < `FEB' < ... < `DEC'.  Invalid names
compare low to valid names.
.TP
.I \-n
Compare according to arithmetic value an initial numeric string
consisting of optional white space, an optional \- sign, and zero or
more digits, optionally followed by a decimal point and zero or more
digits.
.TP
.I \-r
Reverse the result of comparison, so that lines with greater key
values appear earlier in the output instead of later.
.PP
Other options are:
.TP
.I "\-o output-file"
Write output to
.I output-file
instead of to the standard output.  If
.I output-file
is one of the input files,
.B sort
copies it to a temporary file before sorting and writing the output to
.IR output-file .
.TP
.I "\-t separator"
Use character
.I separator
as the field separator when finding the sort keys in each line.  By
default, fields are separated by the empty string between a
non-whitespace character and a whitespace character.  That is to say,
given the input line ` foo bar',
.B sort
breaks it into fields ` foo' and ` bar'.  The field separator is not
considered to be part of either the field preceding or the field
following it.
.TP
.I \-u
For the default case or the
.I \-m
option, only output the first of a sequence of lines that compare
equal.  For the
.I \-c
option, check that no pair of consecutive lines compares equal.
.TP
.I "+POS1 [\-POS2]"
Specify a field within each line to use as a sorting key.  The field
consists of the portion of the line starting at POS1 and up to (but
not including) POS2 (or to the end of the line if POS2 is not given).
The fields and character positions are numbered starting with 0.
.TP
.I "\-k POS1[,POS2]"
An alternate syntax for specifying sorting keys.
The fields and character positions are numbered starting with 1.
.PP
A position has the form \fIf\fP.\fIc\fP, where \fIf\fP is the number
of the field to use and \fIc\fP is the number of the first character
from the beginning of the field (for \fI+pos\fP) or from the end of
the previous field (for \fI\-pos\fP).  The .\fIc\fP part of a position
may be omitted, in which case it is taken to be the first character in
the field.  If the
.I \-b
option has been given, the .\fIc\fP part of a field specification is
counted from the first nonblank character of the field (for
\fI+pos\fP) or from the first nonblank character following the
previous field (for \fI\-pos\fP).
.PP
A \fI+pos\fP or \fI\-pos\fP argument may also have any of the option
letters
.I Mbdfinr
appended to it, in which case the global ordering options are not used
for that particular field.  The
.I \-b
option may be independently attached to either or both of the
\fI+pos\fP and \fI\-pos\fP parts of a field specification, and if it
is inherited from the global options it will be attached to both.
If a
.I \-n
or
.I \-M
option is used, thus implying a
.I \-b
option, the
.I \-b
option is taken to apply to both the \fI+pos\fP and the \fI\-pos\fP
parts of a key specification.  Keys may span multiple fields.
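.PP
As an illustration of the two key syntaxes (a sketch: the file name
.I data
is hypothetical, and the
.I \-k
form assumes the POSIX convention that its end position is inclusive),
each of the following commands sorts on the third colon-separated
field alone, compared numerically:
.PP
.RS
.nf
sort \-t : +2n \-3 data
sort \-t : \-n \-k 3,3 data
.fi
.RE
.PP
In the first form the
.I n
is appended to the position; in the second it is given globally and
inherited by the key.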
.PP
In addition, when GNU
.B sort
is invoked with exactly one argument, the following options are recognized:
.TP
.I "\-\-help"
Print a usage message on standard output and exit successfully.
.TP
.I "\-\-version"
Print version information on standard output and exit successfully.
.SH COMPATIBILITY
.PP
Historical (BSD and System V) implementations of
.B sort
have differed in their interpretation of some options,
particularly
.IR \-b ,
.IR \-f ,
and
.IR \-n .
GNU sort follows the POSIX behavior, which is
usually (but not always!) like the System V behavior.
According to POSIX,
.I \-n
no longer implies
.IR \-b .
For consistency,
.I \-M
has been changed in the same way.
This may affect the meaning of character positions in field
specifications in obscure cases.
If this bites you, the fix is to add an explicit
.IR \-b .
.SH BUGS
The different meaning of field numbers depending
on whether
.I \-k
is used is confusing.
It's all POSIX's fault!