Add a lengthy discussion of why "tr a-z A-Z" and "tr A-Z a-z" are not the
right way to perform case-conversion.
author: tjr <tjr@FreeBSD.org> 2004-07-23 05:44:04 +0000
committer: tjr <tjr@FreeBSD.org> 2004-07-23 05:44:04 +0000
commit: 2322892e0bbf7bdfdc334c74442cd2fd8b435e9c (patch)
tree: 8e513a35fca945014d750a9a4eb629826fb1c5b8 /usr.bin/tr
parent: aa7eec1b492ac8e2f8e78f641829f125fb5b4531 (diff)
1 files changed, 41 insertions, 1 deletions
diff --git a/usr.bin/tr/tr.1 b/usr.bin/tr/tr.1
index 80f4516..ef39711 100644
--- a/usr.bin/tr/tr.1
+++ b/usr.bin/tr/tr.1
@@ -35,7 +35,7 @@
 .\"     @(#)tr.1	8.1 (Berkeley) 6/6/93
 .\" $FreeBSD$
 .\"
-.Dd July 9, 2004
+.Dd July 23, 2004
 .Dt TR 1
 .Os
 .Sh NAME
@@ -169,6 +169,13 @@ as defined by the collation sequence.
 If either or both of the range endpoints are octal sequences, it
 represents the range of specific coded values between the
 range endpoints, inclusive.
+.Pp
+.Bf Em
+See the COMPATIBILITY section below for an important note regarding
+the way the current
+implementation interprets range expressions differently from
+previous implementations.
+.Ef
 .It [:class:]
 Represents all characters belonging to the defined character class.
 Class names are:
@@ -274,6 +281,12 @@ Translate the contents of file1 to upper-case.
 .Pp
 .D1 Li "tr \*q[:lower:]\*q \*q[:upper:]\*q < file1"
 .Pp
+(This should be preferred over the traditional
+.Ux
+idiom of
+.Ql "tr a-z A-Z" ,
+since it works correctly in all locales.)
+.Pp
 Strip out non-printable characters from file1.
 .Pp
 .D1 Li "tr -cd \*q[:print:]\*q < file1"
@@ -285,6 +298,33 @@ Remove diacritical marks from all accented variants of the letter
 .Sh DIAGNOSTICS
 .Ex -std
 .Sh COMPATIBILITY
+Previous
+.Fx
+implementations of
+.Nm
+did not order characters in range expressions according to the current
+locale's collation order, making it possible to convert unaccented Latin
+characters (esp. as found in English text) from upper to lower case using
+the traditional
+.Ux
+idiom of
+.Ql "tr A-Z a-z" .
+Since
+.Nm
+now obeys the locale's collation order, this idiom may not produce
+correct results when there is not a 1:1 mapping between lower and
+upper case, or when the order of characters within the two cases differs.
+As noted in the
+.Sx EXAMPLES
+section above, the character class expressions
+.Ql "[:lower:]"
+and
+.Ql "[:upper:]"
+should be used instead of explicit character ranges like
+.Ql "a-z"
+and
+.Ql "A-Z" .
+.Pp
 System V has historically implemented character ranges using the syntax
 ``[c-c]'' instead of the ``c-c'' used by historic
 .Bx
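The recommendation in the patch can be illustrated with a short shell sketch. The helper names to_upper and to_lower are hypothetical and are not part of tr(1) or of this commit; the sketch assumes only a POSIX shell and a tr that supports the [:lower:] and [:upper:] character classes described in the manual page.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from the commit).
# They use character classes rather than explicit ranges such as a-z,
# so the mapping does not depend on the locale's collation order.
to_upper() {
    tr '[:lower:]' '[:upper:]'
}

to_lower() {
    tr '[:upper:]' '[:lower:]'
}

# Usage: each helper reads standard input and writes standard output.
printf 'Hello, World\n' | to_upper
printf 'Hello, World\n' | to_lower
```

The traditional "tr a-z A-Z" form is deliberately left out of the sketch, since its result depends on the locale and on the tr implementation in use.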