1 files changed, 226 insertions, 0 deletions
diff --git a/contrib/perl5/lib/unicode/NamesList.html b/contrib/perl5/lib/unicode/NamesList.html
new file mode 100644
index 0000000..0bfc5db
--- /dev/null
+++ b/contrib/perl5/lib/unicode/NamesList.html
@@ -0,0 +1,226 @@
+<html>
+
+<head>
+<meta name="GENERATOR" content="Microsoft FrontPage 3.0">
+<title>Unicode 3.0 NamesList File Structure</title>
+</head>
+
+<body>
+
+<h3>Unicode NamesList File Format</h3>
+
+<p>Last updated: 1999-07-06</p>
+
+<h3>1.0 Introduction</h3>
+
+<p>The Unicode name list file NamesList.txt (also NamesList.lst) is a plain text file used
+to drive the layout of the character code charts in the Unicode Standard. The information
+in this file is a combination of several fields from the UnicodeData.txt and Blocks.txt files,
+together with additional annotations for many characters. This document describes the
+syntax rules for the file format, but also gives brief information on how each construct
+is rendered when laid out for the book. Some of the syntax elements were used in
+preparation of the drafts of the book and may not be present in the final, released form
+of the NamesList.txt file.</p>
+
+<p>The same input file can be used to do the draft preparation for ISO/IEC 10646 (referred
+below as ISO-style). This necessitates the presence of some information in the name list
+file that is not needed (and in fact removed during parsing) for the Unicode book.</p>
+
+<p>With access to the layout program (unibook.exe) it is a simple matter of creating
+name lists for the purpose of formatting working drafts containing proposed characters.</p>
+
+<h3>1.1 NamesList File Overview</h3>
+
+<p>The *.lst files are plain text files which in their most simple form look like this</p>
+
+<p>@@&lt;tab&gt;0020&lt;tab&gt;BASIC LATIN&lt;tab&gt;007F<br>
+; this is a file comment (ignored)<br>
+0020&lt;tab&gt;SPACE<br>
+0021&lt;tab&gt;EXCLAMATION MARK<br>
+0022&lt;tab&gt;QUOTATION MARK<br>
+. . . <br>
+007F&lt;tab&gt;DELETE</p>
+
+<p>The semicolon (as first character), @ and &lt;tab&gt; characters are used by the file
+syntax and must be provided as shown. Hexadecimal digits must be in UPPER CASE). A double
+@@ introduces a block header, with the title, and start and ending code of the block
+provided as shown.</p>
+
+<p>For an ISO-style, minimal name list, only the NAME_LINE and BLOCKHEADER and their
+constituent syntax elements are needed.</p>
+
+<p>The full syntax with all the options is provided in the following sections.</p>
+
+<h3>1.2 NamesList File Structure</h3>
+
+<p>This section gives defines the overall file structure</p>
+
+<pre><strong>NAMELIST:     TITLE_PAGE* BLOCK* 
+</strong>
+<strong>TITLE_PAGE:   TITLE 
+		| TITLE_PAGE SUBTITLE 
+		| TITLE_PAGE SUBHEADER 
+		| TITLE_PAGE IGNORED_LINE 
+		| TITLE_PAGE EMPTY_LINE
+		| TITLE_PAGE COMMENTLINE
+		| TITLE_PAGE NOTICE
+		| TITLE_PAGE PAGEBREAK 
+</strong>
+<strong>BLOCK:	      BLOCKHEADER 
+		| BLOCK CHAR_ENTRY 
+		| BLOCK SUBHEADER 
+		| BLOCK NOTICE 
+		| BLOCK EMPTY_LINE 
+		| BLOCK IGNORED_LINE 
+		| BLOCK PAGEBREAK
+
+CHAR_ENTRY:   NAME_LINE | RESERVED_LINE
+		| CHAR_ENTRY ALIAS_LINE
+		| CHAR_ENTRY COMMENT_LINE
+		| CHAR_ENTRY CROSS_REF
+		| CHAR_ENTRY DECOMPOSITION
+		| CHAR_ENTRY COMPAT_MAPPING
+		| CHAR_ENTRY IGNORED_LINE
+		| CHAR_ENTRY EMPTY_LINE
+		| CHAR_ENTRY NOTICE
+</strong></pre>
+
+<p>In other words:<br>
+<br>
+Neither TITLE nor&nbsp; SUBTITLE may occur after the first BLOCKHEADER. </p>
+
+<p>Only TITLE, SUBTITLE, SUBHEADER, PAGEBREAK, COMMENT_LINE,&nbsp; and IGNORED_LINE may
+occur before the first BLOCKHEADER.</p>
+
+<p>Directly following either a NAME_LINE or a RESERVED_LINE an uninterrupted sequence of
+the following lines may occur (in any order and repeated as often as needed): ALIAS_LINE,
+CROSS_REF, DECOMPOSITION, COMPAT_MAPPING, NOTICE, EMPTY_LINE and IGNORED_LINE.</p>
+
+<p>Except for EMPTY_LINE, NOTICE and IGNORED_LINE, none of these lines may occur in any other
+place. </p>
+
+<p>Note: A NOTICE displays differently depending on whether it follows a header or title
+or is part of a CHAR_ENTRY.</p>
+
+<h3>1.3 NamesList File Elements</h3>
+
+<p>This section provides the details of the syntax for the individual elements.</p>
+
+<pre><small><strong>ELEMENT		SYNTAX</strong>	// How rendered</small></pre>
+
+<pre><small><strong>NAME_LINE:	CHAR &lt;tab&gt; LINE
+</strong>			// the CHAR and the corresponding image are echoed, 
+			// followed by the name as given in LINE
+
+<strong>		CHAR TAB NAME COMMENT LF
+</strong>			// Names may have a comment, which is stripped off
+			// unless the file is parsed for an ISO style list
+										
+<strong>RESERVED_LINE:	CHAR TAB &lt;reserved&gt;		
+</strong>			// the CHAR is echoed followed by an icon for the
+			// reserved character and a fixed string e.g. &lt;reserved&gt;
+	
+<strong>COMMMENT_LINE:	&lt;tab&gt; &quot;*&quot; SP EXPAND_LINE
+</strong>			// * is replaced by BULLET, output line as comment
+		<strong>&lt;tab&gt; EXPAND_LINE</strong>	
+			// output line as comment
+
+<strong>ALIAS_LINE:	&lt;tab&gt; &quot;=&quot; SP LINE	
+</strong>			// replace = by itself, output line as alias
+
+<strong>CROSS_REF:	&lt;tab&gt; &quot;X&quot; SP EXPAND_LINE	
+</strong>			// X is replaced by a right arrow
+<strong>		&lt;tab&gt; &quot;X&quot; SP &quot;(&quot; STRING SP &quot;-&quot; SP CHAR &quot;)&quot;	
+</strong>			// X is replaced by a right arrow
+			// the &quot;(&quot;, &quot;-&quot;, &quot;)&quot; are removed, the
+			// order of CHAR and STRING is reversed
+			// i.e. both inputs result in the same output
+
+<strong>IGNORED_LINE:	&lt;tab&gt; &quot;;&quot; EXPAND_LINE	
+EMPTY_LINE:	LF			
+</strong>			// empty lines and file comments are ignored
+
+<strong>DECOMPOSITION:	&lt;tab&gt; &quot;:&quot; EXPAND_LINE	
+</strong>			// replace ':' by EQUIV, expand line into 
+			// decomposition 
+
+<strong>COMPAT_MAPPING:	&lt;tab&gt; &quot;#&quot; SP EXPAND_LINE	
+</strong>			// replace '#' by APPROX, output line as mapping 
+
+<strong>NOTICE:		&quot;@+&quot; &lt;tab&gt; LINE		
+</strong>			// skip '@+', output text as notice
+<strong>		&quot;@+&quot; TAB * SP LINE	
+</strong>			// skip '@', output text as notice
+			// &quot;*&quot; expands to a bullet character
+			// Notices following a character code apply to the
+			// character and are indented. Notices not following
+			// a character code apply to the page/block/column 
+			// and are italicized, but not indented
+
+<strong>SUBTITLE:	&quot;@@@+&quot; &lt;tab&gt; LINE	
+</strong>			// skip &quot;@@@+&quot;, output text as subtitle
+
+<strong>SUBHEADER:	&quot;@&quot; &lt;tab&gt; LINE	
+</strong>			// skip '@', output line as text as column header
+
+<strong>BLOCKHEADER:	&quot;@@&quot; &lt;tab&gt; BLOCKSTART &lt;tab&gt; BLOCKNAME &lt;tab&gt; BLOCKEND
+</strong>			// skip &quot;@@&quot;, cause a page break and optional
+			// blank page, then output one or more charts
+			// followed by the list of character names. 
+			// use BLOCKSTART and BLOCKEND to define the 
+			// what characters belong to a block
+			// use blockname in page and table headers
+	<strong>	&quot;@@&quot; &lt;tab&gt; BLOCKSTART &lt;tab&gt; BLOCKNAME COMMENT &lt;tab&gt; BLOCKEND
+			</strong>// if a comment is present it replaces the blockname
+			// when an ISO-style namelist is laid out
+
+<strong>BLOCKSTART:	CHAR</strong>	// first character position in block
+<strong>BLOCKEND:	CHAR</strong>	// last character position in block
+<strong>PAGE_BREAK:	&quot;@@&quot;</strong>	// insert a (column) break
+
+<strong>TITLE:		&quot;@@@&quot; &lt;tab&gt; LINE</strong>	
+			// skip &quot;@@@&quot;, output line as text
+			// Title is used in page headers
+
+<strong>EXPAND_LINE:	{CHAR | STRING}+ LF	</strong>
+			// all instances of CHAR *) are replaced by 
+			// CHAR NBSP x NBSP where x is the single Unicode
+			// character corresponding to char
+			// If character is combining, it is replaced with
+			// CHAR NBSP &lt;circ&gt; x NBSP where &lt;circ&gt; is the 
+			// dotted circle</small>
+</pre>
+
+<h3><strong>1.4 NamesList File Primitives</strong></h3>
+
+<p>The following are the primitives and terminals for the NamesList syntax.</p>
+
+<pre><small><strong>LINE:		STRING LF
+COMMENT:	&quot;(&quot; NAME &quot;)&quot;
+		&quot;(&quot; NAME &quot;)&quot; &quot;*&quot;
+</strong>
+<strong>NAME</strong>:	  	&lt;sequence of ASCII characters, except &quot;(&quot; or &quot;)&quot; &gt; 
+<strong>STRING</strong>:	  	&lt;sequence of Latin-1 characters&gt; 
+<strong>CHAR</strong>:		<strong>X X X X</strong>
+		<strong>| X X X X X X X X X</strong></small>
+<small><strong>X:	  	&quot;0&quot;|&quot;1&quot;|&quot;2&quot;|&quot;3&quot;|&quot;4&quot;|&quot;5&quot;|&quot;6&quot;|&quot;7&quot;|&quot;8&quot;|&quot;9&quot;|&quot;A&quot;|&quot;B&quot;|&quot;C&quot;|&quot;D&quot;|&quot;E&quot;|&quot;F&quot; 
+&lt;tab&gt;:</strong>	  	&lt;sequence of one or more ASCII tab characters 0x09&gt;	
+<strong>SP</strong>:	  	&lt;ASCII 0x20&gt;
+<strong>LF</strong>:	  	&lt;any sequence of ASCII 0x0A and 0x0D&gt;
+</small></pre>
+
+<p><strong>Notes:</strong> 
+
+<ul>
+  <li>Special lookahead logic prevents a mention of a 4 digit standard, such as ISO 9999 from
+    being misinterpreted as ISO CHAR.</li>
+  <li>Use of Latin-1 is supported in unibook.exe, but not portably, unless the file is encoded as
+    UTF-16LE.</li>
+  <li>The final LF in the file must be present</li>
+  <li>A CHAR inside ' or &quot; is expanded, but only its glyph image is printed,&nbsp; the
+    code value is not echoed</li>
+  <li>Straight quotes in an EXPAND_LINE are replaced by curly quotes using English rules.
+    Apostrophes are supported, but nested quotes are not.</li>
+</ul>
+</body>
+</html>