Diffstat (limited to 'lib/libc/locale/gb18030.5')
-rw-r--r-- | lib/libc/locale/gb18030.5 | 78
1 files changed, 78 insertions, 0 deletions
diff --git a/lib/libc/locale/gb18030.5 b/lib/libc/locale/gb18030.5
new file mode 100644
index 0000000..3a296c0
--- /dev/null
+++ b/lib/libc/locale/gb18030.5
@@ -0,0 +1,78 @@
+.\" Copyright (c) 2002, 2003 Tim J. Robbins
+.\" All rights reserved.
+.\"
+.\" Redistribution and use in source and binary forms, with or without
+.\" modification, are permitted provided that the following conditions
+.\" are met:
+.\" 1. Redistributions of source code must retain the above copyright
+.\" notice, this list of conditions and the following disclaimer.
+.\" 2. Redistributions in binary form must reproduce the above copyright
+.\" notice, this list of conditions and the following disclaimer in the
+.\" documentation and/or other materials provided with the distribution.
+.\"
+.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+.\" SUCH DAMAGE.
+.\"
+.\" $FreeBSD$
+.\"
+.Dd August 10, 2003
+.Dt GB18030 5
+.Os
+.Sh NAME
+.Nm gb18030
+.Nd "GB 18030 encoding method for Chinese text"
+.Sh SYNOPSIS
+.Nm ENCODING
+.Qq GB18030
+.Sh DESCRIPTION
+The
+.Nm GB18030
+encoding implements GB 18030-2000, a PRC national standard for the encoding of
+Chinese characters.
+It is a superset of the older GB\ 2312-1980 and GBK encodings,
+and incorporates Unicode's Unihan Extension A completely.
+It also provides code space for all Unicode 3.0 code points.
+.Pp
+Multibyte characters in the
+.Nm GB18030
+encoding can be one byte, two bytes, or
+four bytes long.
+There are a total of over 1.5 million code positions.
+.Pp
+.No GB\ 11383-1981 Pq Tn ASCII
+characters are represented by single bytes in the range 0x00 to 0x7F.
+.Pp
+Chinese characters are represented as either two bytes or four bytes.
+Characters that are represented by two bytes begin with a byte in the range
+0x81-0xFE and end with a byte either in the range 0x40-0x7E or 0x80-0xFE.
+.Pp
+Characters that are represented by four bytes begin with a byte in the range
+0x81-0xFE, have a second byte in the range 0x30-0x39, a third byte in the range
+0x81-0xFE and a fourth byte in the range 0x30-0x39.
+.Sh SEE ALSO
+.Xr euc 5 ,
+.Xr gb2312 5 ,
+.Xr gbk 5 ,
+.Xr utf8 5
+.Rs
+.%T "Chinese National Standard GB 18030-2000: Information Technology -- Chinese ideograms coded character set for information interchange -- Extension for the basic set"
+.%D "March 2000"
+.Re
+.Rs
+.%Q "The Unicode Consortium"
+.%T "The Unicode Standard, Version 3.0"
+.%D "2000"
+.Re
+.Sh STANDARDS
+The
+.Nm GB18030
+encoding is believed to be compatible with GB 18030-2000.
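The byte ranges in the DESCRIPTION section of the new man page are enough to classify a sequence by length. What follows is a minimal Python sketch of that classification, not the libc implementation added alongside this file. The names `gb18030_char_len` and `count_code_positions` are hypothetical and exist only for illustration. The second function multiplies out the same ranges to check the "over 1.5 million code positions" figure.

```python
def _in(b: int, lo: int, hi: int) -> bool:
    """True if byte value b lies in the inclusive range lo..hi."""
    return lo <= b <= hi


def gb18030_char_len(buf: bytes) -> int:
    """Return 1, 2, or 4: the length of the character starting at buf[0].

    Hypothetical helper; follows only the byte ranges stated in the
    man page text. Raises ValueError on empty, truncated, or
    ill-formed input.
    """
    if not buf:
        raise ValueError("empty input")
    b0 = buf[0]
    if _in(b0, 0x00, 0x7F):            # single-byte (ASCII) range
        return 1
    if not _in(b0, 0x81, 0xFE):        # 0x80 and 0xFF are not lead bytes
        raise ValueError("invalid lead byte")
    if len(buf) < 2:
        raise ValueError("truncated sequence")
    b1 = buf[1]
    if _in(b1, 0x40, 0x7E) or _in(b1, 0x80, 0xFE):
        return 2                       # two-byte form
    if _in(b1, 0x30, 0x39):            # second byte is a digit: four-byte form
        if len(buf) < 4:
            raise ValueError("truncated sequence")
        if _in(buf[2], 0x81, 0xFE) and _in(buf[3], 0x30, 0x39):
            return 4
    raise ValueError("ill-formed sequence")


def count_code_positions() -> int:
    """Count code positions implied by the documented byte ranges."""
    lead = 0xFE - 0x81 + 1                              # 126 lead bytes
    digit = 0x39 - 0x30 + 1                             # 10 digit bytes
    trail2 = (0x7E - 0x40 + 1) + (0xFE - 0x80 + 1)      # 190 second bytes
    one_byte = 0x7F - 0x00 + 1                          # 128
    two_byte = lead * trail2
    four_byte = lead * digit * lead * digit
    return one_byte + two_byte + four_byte
```

Because a four-byte sequence is recognised by a digit (0x30-0x39) in the second position, which can never be the second byte of a two-byte character, the length is always decided after looking at no more than two bytes.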