author | obrien <obrien@FreeBSD.org> | 2009-01-02 03:10:55 +0000 |
---|---|---|
committer | obrien <obrien@FreeBSD.org> | 2009-01-02 03:10:55 +0000 |
commit | 729acffa050ba99227c27884a5760fdd3d6959b2 (patch) | |
tree | d5289d633f1b84fbbf947f98b34ff58109fccca7 /contrib/file/magic.man | |
parent | 964611a3050b11c026667dbb4ae98380fe18252a (diff) | |
parent | 05dd1f1bd993ec12015e6782ea5c46b6a0b69ecb (diff) | |
Record that base/vendor/file/dist@186675 was merged.
Merge base/vendor/file/dist@186675@186690, bringing FILE 4.26 to 8-CURRENT.
Diffstat (limited to 'contrib/file/magic.man')
-rw-r--r-- | contrib/file/magic.man | 199 |
1 files changed, 119 insertions, 80 deletions
diff --git a/contrib/file/magic.man b/contrib/file/magic.man
index 3842b64..314a014 100644
--- a/contrib/file/magic.man
+++ b/contrib/file/magic.man
@@ -1,11 +1,11 @@
-.\" $File: magic.man,v 1.39 2007/11/08 00:31:37 christos Exp $
-.Dd January 10, 2007
+.\" $File: magic.man,v 1.57 2008/08/30 09:50:20 christos Exp $
+.Dd August 30, 2008
 .Dt MAGIC __FSECTION__
 .Os
-.\" install as magic.4 on USG, magic.5 on V7 or Berkeley systems.
+.\" install as magic.4 on USG, magic.5 on V7, Berkeley and Linux systems.
 .Sh NAME
 .Nm magic
-.Nd file command's magic number file
+.Nd file command's magic pattern file
 .Sh DESCRIPTION
 This manual page documents the format of the magic file as
 used by the
@@ -15,18 +15,17 @@ The
 .Xr file __CSECTION__
 command identifies the type of a file using,
 among other tests,
-a test for whether the file begins with a certain
-.Dq "magic number" .
+a test for whether the file contains certain
+.Dq "magic patterns" .
 The file
 .Pa __MAGIC__
-specifies what magic numbers are to be tested for,
-what message to print if a particular magic number is found,
+specifies what patterns are to be tested for, what message or
+MIME type to print if a particular pattern is found,
 and additional information to extract from the file.
 .Pp
 Each line of the file specifies a test to be performed.
 A test compares the data starting at a particular offset
-in the file with a 1-byte, 2-byte, or 4-byte numeric value or
-a string.
+in the file with a byte value, a string or a numeric value.
 If the test succeeds, a message is printed.
 The line consists of the following fields:
 .Bl -tag -width ".Dv message"
@@ -40,15 +39,15 @@ The possible values are:
 .It Dv byte
 A one-byte value.
 .It Dv short
-A two-byte value (on most systems) in this machine's native byte order.
+A two-byte value in this machine's native byte order.
 .It Dv long
-A four-byte value (on most systems) in this machine's native byte order.
+A four-byte value in this machine's native byte order.
 .It Dv quad
-An eight-byte value (on most systems) in this machine's native byte order.
+An eight-byte value in this machine's native byte order.
 .It Dv float
-A 32-bit (on most systems) single precision IEEE floating point number in this machine's native byte order.
+A 32-bit single precision IEEE floating point number in this machine's native byte order.
 .It Dv double
-A 64-bit (on most systems) double precision IEEE floating point number in this machine's native byte order.
+A 64-bit double precision IEEE floating point number in this machine's native byte order.
 .It Dv string
 A string of bytes.
 The string type specification can be optionally followed
@@ -69,10 +68,10 @@ Finally the
 .Dq c
 flag, specifies case insensitive matching: lowercase characters
 in the magic match both lower and upper case characters in the
-targer, whereas upper case characters in the magic, only much uppercase
+target, whereas upper case characters in the magic only match uppercase
 characters in the target.
 .It Dv pstring
-A pascal style string where the first byte is interpreted as the an
+A Pascal-style string where the first byte is interpreted as the an
 unsigned length.
 The string is not NUL terminated.
 .It Dv date
@@ -86,106 +85,119 @@ local time rather than UTC.
 An eight-byte value interpreted as a UNIX-style date, but interpreted as
 local time rather than UTC.
 .It Dv beshort
-A two-byte value (on most systems) in big-endian byte order.
+A two-byte value in big-endian byte order.
 .It Dv belong
-A four-byte value (on most systems) in big-endian byte order.
+A four-byte value in big-endian byte order.
 .It Dv bequad
-An eight-byte value (on most systems) in big-endian byte order.
+An eight-byte value in big-endian byte order.
 .It Dv befloat
-A 32-bit (on most systems) single precision IEEE floating point number in big-endian byte order.
+A 32-bit single precision IEEE floating point number in big-endian byte order.
 .It Dv bedouble
-A 64-bit (on most systems) double precision IEEE floating point number in big-endian byte order.
+A 64-bit double precision IEEE floating point number in big-endian byte order.
 .It Dv bedate
-A four-byte value (on most systems) in big-endian byte order,
+A four-byte value in big-endian byte order,
 interpreted as a Unix date.
 .It Dv beqdate
-An eight-byte value (on most systems) in big-endian byte order,
+An eight-byte value in big-endian byte order,
 interpreted as a Unix date.
 .It Dv beldate
-A four-byte value (on most systems) in big-endian byte order,
+A four-byte value in big-endian byte order,
 interpreted as a UNIX-style date, but interpreted as local time rather
 than UTC.
 .It Dv beqldate
-An eight-byte value (on most systems) in big-endian byte order,
+An eight-byte value in big-endian byte order,
 interpreted as a UNIX-style date, but interpreted as local time rather
 than UTC.
 .It Dv bestring16
 A two-byte unicode (UCS16) string in big-endian byte order.
 .It Dv leshort
-A two-byte value (on most systems) in little-endian byte order.
+A two-byte value in little-endian byte order.
 .It Dv lelong
-A four-byte value (on most systems) in little-endian byte order.
+A four-byte value in little-endian byte order.
 .It Dv lequad
-An eight-byte value (on most systems) in little-endian byte order.
+An eight-byte value in little-endian byte order.
 .It Dv lefloat
-A 32-bit (on most systems) single precision IEEE floating point number in little-endian byte order.
+A 32-bit single precision IEEE floating point number in little-endian byte order.
 .It Dv ledouble
-A 64-bit (on most systems) double precision IEEE floating point number in little-endian byte order.
+A 64-bit double precision IEEE floating point number in little-endian byte order.
 .It Dv ledate
-A four-byte value (on most systems) in little-endian byte order,
+A four-byte value in little-endian byte order,
 interpreted as a UNIX date.
 .It Dv leqdate
-An eight-byte value (on most systems) in little-endian byte order,
+An eight-byte value in little-endian byte order,
 interpreted as a UNIX date.
 .It Dv leldate
-A four-byte value (on most systems) in little-endian byte order,
+A four-byte value in little-endian byte order,
 interpreted as a UNIX-style date, but interpreted as local time rather
 than UTC.
 .It Dv leqldate
-An eight-byte value (on most systems) in little-endian byte order,
+An eight-byte value in little-endian byte order,
 interpreted as a UNIX-style date, but interpreted as local time rather
 than UTC.
 .It Dv lestring16
 A two-byte unicode (UCS16) string in little-endian byte order.
 .It Dv melong
-A four-byte value (on most systems) in middle-endian (PDP-11) byte order.
+A four-byte value in middle-endian (PDP-11) byte order.
 .It Dv medate
-A four-byte value (on most systems) in middle-endian (PDP-11) byte order,
+A four-byte value in middle-endian (PDP-11) byte order,
 interpreted as a UNIX date.
 .It Dv meldate
-A four-byte value (on most systems) in middle-endian (PDP-11) byte order,
+A four-byte value in middle-endian (PDP-11) byte order,
 interpreted as a UNIX-style date, but interpreted as local time rather
 than UTC.
 .It Dv regex
 A regular expression match in extended POSIX regular expression syntax
-(much like egrep).
-The type specification can be optionally followed by /[cse]*.
+(like egrep). Regular expressions can take exponential time to
+process, and their performance is hard to predict, so their use is
+discouraged. When used in production environments, their performance
+should be carefully checked. The type specification can be optionally
+followed by
+.Dv /[c][s] .
 The
 .Dq c
 flag makes the match case insensitive, while the
 .Dq s
-or
-.Dq e
-flags update the offset to the starting or ending offsets of the
-match (only one should be used).
-By default, regex does not update the offset.
-The regular expression is always tested against the first
-.Dv N
-lines, where
+flag update the offset to the start offset of the match, rather than the end.
+The regular expression is tested against line
+.Dv N + 1
+onwards, where
 .Dv N
-is the given offset, thus it
-is only useful for (single-byte encoded) text.
+is the given offset.
+Line endings are assumed to be in the machine's native format.
 .Dv ^
 and
 .Dv $
-will match the beginning and end of individual lines, respectively,
+match the beginning and end of individual lines, respectively,
 not beginning and end of file.
 .It Dv search
-A literal string search starting at the given offset.
-It must be followed by
-.Dv \*[Lt]number\*[Gt]
-which specifies how many matches shall be attempted (the range).
-This is suitable for searching larger binary expressions with variable
-offsets, using
+A literal string search starting at the given offset. The same
+modifier flags can be used as for string patterns. The modifier flags
+(if any) must be followed by
+.Dv /number
+the range, that is, the number of positions at which the match will be
+attempted, starting from the start offset. This is suitable for
+searching larger binary expressions with variable offsets, using
 .Dv \e
-escapes for special characters.
+escapes for special characters. The offset works as for regex.
 .It Dv default
-This is intended to be used with the text
-.Dv x
+This is intended to be used with the test
+.Em x
 (which is always true) and a message that is to be used if there are
 no other matches.
 .El
-.El
+.Pp
+Each top-level magic pattern (see below for an explanation of levels)
+is classified as text or binary according to the types used. Types
+.Dq regex
+and
+.Dq search
+are classified as text tests, unless non-printable characters are used
+in the pattern. All other tests are classified as binary. A top-level
+pattern is considered to be a test text when all its patterns are text
+patterns; otherwise, it is considered to be a binary pattern. When
+matching a file, binary patterns are tried first; if no match is
+found, and the file looks like text, then its encoding is determined
+and the text patterns are tried.
 .Pp
 The numeric types may optionally be followed by
 .Dv \*[Am]
@@ -195,7 +207,6 @@ numeric value before any comparisons are done.
 Prepending a
 .Dv u
 to the type indicates that ordered comparisons should be unsigned.
-.Bl -tag -width ".Dv message"
 .It Dv test
 The value to be compared with the value from the file.
 If the type is
@@ -232,12 +243,8 @@ Operators
 and
 .Dv ~
 don't work with floats and doubles.
-For all tests except
-.Em string
-and
-.Em regex ,
-operation
-.Dv !
+The operator
+.Dv !\&
 specifies that the line matches if the test does
 .Em not
 succeed.
@@ -250,8 +257,8 @@ is octal, and
 .Dv 0x13
 is hexadecimal.
 .Pp
-For string values, the byte string from the
-file must match the specified byte string.
+For string values, the string from the
+file must match the specified string.
 The operators
 .Dv = ,
 .Dv \*[Lt]
@@ -262,10 +269,10 @@ and
 can be applied to strings.
 The length used for matching is that of the string argument
 in the magic file.
-This means that a line can match any string, and
-then presumably print that string, by doing
+This means that a line can match any non-empty string (usually used to
+then print the string), with
 .Em \*[Gt]\e0
-(because all strings are greater than the null string).
+(because all non-empty strings are greater than the empty string).
 .Pp
 The special test
 .Em x
@@ -276,11 +283,44 @@ If the string contains a
 .Xr printf 3
 format specification, the value from the file (with any specified masking
 performed) is printed using the message as the format string.
-If the string begins with ``\\b'', the message printed is the
-remainder of the string with no whitespace added before it: multiple
-matches are normally separated by a single space.
+If the string begins with
+.Dq \eb ,
+the message printed is the remainder of the string with no whitespace
+added before it: multiple matches are normally separated by a single
+space.
 .El
 .Pp
+A MIME type is given on a separate line, which must be the next
+non-blank or comment line after the magic line that identifies the
+file type, and has the following format:
+.Bd -literal -offset indent
+!:mime MIMETYPE
+.Ed
+.Pp
+i.e. the literal string
+.Dq !:mime
+followed by the MIME type.
+.Pp
+An optional strength can be supplied on a separate line which refers to
+the current magic description using the following format:
+.Bd -literal -offset indent
+!:strength OP VALUE
+.Ed
+.Pp
+The operand
+.Dv OP
+can be:
+.Dv + ,
+.Dv - ,
+.Dv * ,
+or
+.Dv /
+and
+.Dv VALUE
+is a constant between 0 and 255.
+This constant is applied using the specified operand
+to the currently computed default magic strength.
+.Pp
 Some file formats contain additional information which is to be printed
 along with the file type or need additional tests to determine the true
 file type.
@@ -350,13 +390,13 @@ That way variable length structures can be examined:
 \*[Gt]\*[Gt](0x3c.l) string LX\e0\e0 LX executable (OS/2)
 .Ed
 .Pp
-This strategy of examining has one drawback: You must make sure that
+This strategy of examining has a drawback: You must make sure that
 you eventually print something, or users may get empty output (like, when
 there is neither PE\e0\e0 nor LE\e0\e0 in the above example)
 .Pp
-If this indirect offset cannot be used as-is, there are simple calculations
+If this indirect offset cannot be used directly, simple calculations are
 possible: appending
-.Em [+-*/%\*[Am]|^]\*[Lt]number\*[Gt]
+.Em [+-*/%\*[Am]|^]number
 inside parentheses allows one to modify
 the value read from the file before it is used as an offset:
 .Bd -literal -offset indent
@@ -468,4 +508,3 @@ a system on which the lengths are invariant.
 .\" the changes I posted to the S5R2 version.
 .\"
 .\" Modified for Ian Darwin's version of the file command.
-.\" @(#)$Id: magic.man,v 1.39 2007/11/08 00:31:37 christos Exp $
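
Illustration (not part of the commit): the hunk at -276/+283 documents two new annotation lines, `!:mime MIMETYPE` and `!:strength OP VALUE`. The Python sketch below shows one way such lines could be read and a strength adjustment applied. It is an assumption-laden illustration, not the parser used by file(1): the function names `parse_annotation` and `apply_strength` are hypothetical, fields are assumed to be whitespace-separated, and `/` is assumed to mean integer division.

```python
import operator

# Operators the man page lists for "!:strength OP VALUE".
_STRENGTH_OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.floordiv,  # assumption: integer division
}


def parse_annotation(line):
    """Parse a '!:mime' or '!:strength' line; return None for anything else.

    Hypothetical helper: assumes fields are separated by whitespace.
    """
    fields = line.split()
    if len(fields) == 2 and fields[0] == "!:mime":
        return ("mime", fields[1])
    if len(fields) == 3 and fields[0] == "!:strength":
        op, value = fields[1], int(fields[2])
        if op not in _STRENGTH_OPS:
            raise ValueError(f"unknown strength operator: {op!r}")
        # The man page text says VALUE is a constant between 0 and 255.
        if not 0 <= value <= 255:
            raise ValueError("strength value must be between 0 and 255")
        return ("strength", op, value)
    return None


def apply_strength(default_strength, op, value):
    """Apply OP VALUE to the computed default strength (hypothetical helper)."""
    if op == "/" and value == 0:
        raise ValueError("division by zero in strength adjustment")
    return _STRENGTH_OPS[op](default_strength, value)
```

For instance, `parse_annotation("!:strength * 2")` yields `("strength", "*", 2)`, which `apply_strength(50, "*", 2)` would turn into 100 under the assumptions above. How the default strength itself is computed is not described in this diff, so the sketch takes it as an input.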