diff options
Diffstat (limited to 'contrib/perl5/ext/SDBM_File/sdbm/readme.ms')
-rw-r--r-- | contrib/perl5/ext/SDBM_File/sdbm/readme.ms | 353 |
1 files changed, 0 insertions, 353 deletions
diff --git a/contrib/perl5/ext/SDBM_File/sdbm/readme.ms b/contrib/perl5/ext/SDBM_File/sdbm/readme.ms deleted file mode 100644 index 01ca17c..0000000 --- a/contrib/perl5/ext/SDBM_File/sdbm/readme.ms +++ /dev/null @@ -1,353 +0,0 @@ -.\" tbl | readme.ms | [tn]roff -ms | ... -.\" note the "C" (courier) and "CB" fonts: you will probably have to -.\" change these. -.\" $Id: readme.ms,v 1.1 90/12/13 13:09:15 oz Exp Locker: oz $ - -.de P1 -.br -.nr dT 4 -.nf -.ft C -.sp .5 -.nr t \\n(dT*\\w'x'u -.ta 1u*\\ntu 2u*\\ntu 3u*\\ntu 4u*\\ntu 5u*\\ntu 6u*\\ntu 7u*\\ntu 8u*\\ntu 9u*\\ntu 10u*\\ntu 11u*\\ntu 12u*\\ntu 13u*\\ntu 14u*\\ntu -.. -.de P2 -.br -.ft 1 -.br -.sp .5 -.br -.fi -.. -.\" CW uses the typewriter/courier font. -.de CW -\fC\\$1\\fP\\$2 -.. - -.\" Footnote numbering [by Henry Spencer] -.\" <text>\*f for a footnote number.. -.\" .FS -.\" \*F <footnote text> -.\" .FE -.\" -.ds f \\u\\s-2\\n+f\\s+2\\d -.nr f 0 1 -.ds F \\n+F. -.nr F 0 1 - -.ND -.LP -.TL -\fIsdbm\fP \(em Substitute DBM -.br -or -.br -Berkeley \fIndbm\fP for Every UN*X\** Made Simple -.AU -Ozan (oz) Yigit -.AI -The Guild of PD Software Toolmakers -Toronto - Canada -.sp -oz@nexus.yorku.ca -.LP -.FS -UN*X is not a trademark of any (dis)organization. -.FE -.sp 2 -\fIImplementation is the sincerest form of flattery. \(em L. Peter Deutsch\fP -.SH -A The Clone of the \fIndbm\fP library -.PP -The sources accompanying this notice \(em \fIsdbm\fP \(em constitute -the first public release (Dec. 1990) of a complete clone of -the Berkeley UN*X \fIndbm\fP library. The \fIsdbm\fP library is meant to -clone the proven functionality of \fIndbm\fP as closely as possible, -including a few improvements. It is practical, easy to understand, and -compatible. -The \fIsdbm\fP library is not derived from any licensed, proprietary or -copyrighted software. -.PP -The \fIsdbm\fP implementation is based on a 1978 algorithm -[Lar78] by P.-A. (Paul) Larson known as ``Dynamic Hashing''. -In the course of searching for a substitute for \fIndbm\fP, I -prototyped three different external-hashing algorithms [Lar78, Fag79, Lit80] -and ultimately chose Larson's algorithm as a basis of the \fIsdbm\fP -implementation. The Bell Labs -\fIdbm\fP (and therefore \fIndbm\fP) is based on an algorithm invented by -Ken Thompson, [Tho90, Tor87] and predates Larson's work. -.PP -The \fIsdbm\fR programming interface is totally compatible -with \fIndbm\fP and includes a slight improvement in database initialization. -It is also expected to be binary-compatible under most UN*X versions that -support the \fIndbm\fP library. -.PP -The \fIsdbm\fP implementation shares the shortcomings of the \fIndbm\fP -library, as a side effect of various simplifications to the original Larson -algorithm. It does produce \fIholes\fP in the page file as it writes -pages past the end of file. (Larson's paper include a clever solution to -this problem that is a result of using the hash value directly as a block -address.) On the other hand, extensive tests seem to indicate that \fIsdbm\fP -creates fewer holes in general, and the resulting pagefiles are -smaller. The \fIsdbm\fP implementation is also faster than \fIndbm\fP -in database creation. -Unlike the \fIndbm\fP, the \fIsdbm\fP -.CW store -operation will not ``wander away'' trying to split its -data pages to insert a datum that \fIcannot\fP (due to elaborate worst-case -situations) be inserted. (It will fail after a pre-defined number of attempts.) -.SH -Important Compatibility Warning -.PP -The \fIsdbm\fP and \fIndbm\fP -libraries \fIcannot\fP share databases: one cannot read the (dir/pag) -database created by the other. This is due to the differences -between the \fIndbm\fP and \fIsdbm\fP algorithms\**, -.FS -Torek's discussion [Tor87] -indicates that \fIdbm/ndbm\fP implementations use the hash -value to traverse the radix trie differently than \fIsdbm\fP -and as a result, the page indexes are generated in \fIdifferent\fP order. -For more information, send e-mail to the author. -.FE -and the hash functions -used. -It is easy to convert between the \fIdbm/ndbm\fP databases and \fIsdbm\fP -by ignoring the index completely: see -.CW dbd , -.CW dbu -etc. -.R -.LP -.SH -Notice of Intellectual Property -.LP -\fIThe entire\fP sdbm \fIlibrary package, as authored by me,\fP Ozan S. Yigit, -\fIis hereby placed in the public domain.\fP As such, the author is not -responsible for the consequences of use of this software, no matter how -awful, even if they arise from defects in it. There is no expressed or -implied warranty for the \fIsdbm\fP library. -.PP -Since the \fIsdbm\fP -library package is in the public domain, this \fIoriginal\fP -release or any additional public-domain releases of the modified original -cannot possibly (by definition) be withheld from you. Also by definition, -You (singular) have all the rights to this code (including the right to -sell without permission, the right to hoard\** -.FS -You cannot really hoard something that is available to the public at -large, but try if it makes you feel any better. -.FE -and the right to do other icky things as -you see fit) but those rights are also granted to everyone else. -.PP -Please note that all previous distributions of this software contained -a copyright (which is now dropped) to protect its -origins and its current public domain status against any possible claims -and/or challenges. -.SH -Acknowledgments -.PP -Many people have been very helpful and supportive. A partial list would -necessarily include Rayan Zacherissen (who contributed the man page, -and also hacked a MMAP version of \fIsdbm\fP), -Arnold Robbins, Chris Lewis, -Bill Davidsen, Henry Spencer, Geoff Collyer, Rich Salz (who got me started -in the first place), Johannes Ruschein -(who did the minix port) and David Tilbrook. I thank you all. -.SH -Distribution Manifest and Notes -.LP -This distribution of \fIsdbm\fP includes (at least) the following: -.P1 - CHANGES change log - README this file. - biblio a small bibliography on external hashing - dba.c a crude (n/s)dbm page file analyzer - dbd.c a crude (n/s)dbm page file dumper (for conversion) - dbe.1 man page for dbe.c - dbe.c Janick's database editor - dbm.c a dbm library emulation wrapper for ndbm/sdbm - dbm.h header file for the above - dbu.c a crude db management utility - hash.c hashing function - makefile guess. - pair.c page-level routines (posted earlier) - pair.h header file for the above - readme.ms troff source for the README file - sdbm.3 man page - sdbm.c the real thing - sdbm.h header file for the above - tune.h place for tuning & portability thingies - util.c miscellaneous -.P2 -.PP -.CW dbu -is a simple database manipulation program\** that tries to look -.FS -The -.CW dbd , -.CW dba , -.CW dbu -utilities are quick hacks and are not fit for production use. They were -developed late one night, just to test out \fIsdbm\fP, and convert some -databases. -.FE -like Bell Labs' -.CW cbt -utility. It is currently incomplete in functionality. -I use -.CW dbu -to test out the routines: it takes (from stdin) tab separated -key/value pairs for commands like -.CW build -or -.CW insert -or takes keys for -commands like -.CW delete -or -.CW look . -.P1 - dbu <build|creat|look|insert|cat|delete> dbmfile -.P2 -.PP -.CW dba -is a crude analyzer of \fIdbm/sdbm/ndbm\fP -page files. It scans the entire -page file, reporting page level statistics, and totals at the end. -.PP -.CW dbd -is a crude dump program for \fIdbm/ndbm/sdbm\fP -databases. It ignores the -bitmap, and dumps the data pages in sequence. It can be used to create -input for the -.CW dbu -utility. -Note that -.CW dbd -will skip any NULLs in the key and data -fields, thus is unsuitable to convert some peculiar databases that -insist in including the terminating null. -.PP -I have also included a copy of the -.CW dbe -(\fIndbm\fP DataBase Editor) by Janick Bergeron [janick@bnr.ca] for -your pleasure. You may find it more useful than the little -.CW dbu -utility. -.PP -.CW dbm.[ch] -is a \fIdbm\fP library emulation on top of \fIndbm\fP -(and hence suitable for \fIsdbm\fP). Written by Robert Elz. -.PP -The \fIsdbm\fP -library has been around in beta test for quite a long time, and from whatever -little feedback I received (maybe no news is good news), I believe it has been -functioning without any significant problems. I would, of course, appreciate -all fixes and/or improvements. Portability enhancements would especially be -useful. -.SH -Implementation Issues -.PP -Hash functions: -The algorithm behind \fIsdbm\fP implementation needs a good bit-scrambling -hash function to be effective. I ran into a set of constants for a simple -hash function that seem to help \fIsdbm\fP perform better than \fIndbm\fP -for various inputs: -.P1 - /* - * polynomial conversion ignoring overflows - * 65599 nice. 65587 even better. - */ - long - dbm_hash(char *str, int len) { - register unsigned long n = 0; - - while (len--) - n = n * 65599 + *str++; - return n; - } -.P2 -.PP -There may be better hash functions for the purposes of dynamic hashing. -Try your favorite, and check the pagefile. If it contains too many pages -with too many holes, (in relation to this one for example) or if -\fIsdbm\fP -simply stops working (fails after -.CW SPLTMAX -attempts to split) when you feed your -NEWS -.CW history -file to it, you probably do not have a good hashing function. -If you do better (for different types of input), I would like to know -about the function you use. -.PP -Block sizes: It seems (from various tests on a few machines) that a page -file block size -.CW PBLKSIZ -of 1024 is by far the best for performance, but -this also happens to limit the size of a key/value pair. Depending on your -needs, you may wish to increase the page size, and also adjust -.CW PAIRMAX -(the maximum size of a key/value pair allowed: should always be at least -three words smaller than -.CW PBLKSIZ .) -accordingly. The system-wide version of the library -should probably be -configured with 1024 (distribution default), as this appears to be sufficient -for most common uses of \fIsdbm\fP. -.SH -Portability -.PP -This package has been tested in many different UN*Xes even including minix, -and appears to be reasonably portable. This does not mean it will port -easily to non-UN*X systems. -.SH -Notes and Miscellaneous -.PP -The \fIsdbm\fP is not a very complicated package, at least not after you -familiarize yourself with the literature on external hashing. There are -other interesting algorithms in existence that ensure (approximately) -single-read access to a data value associated with any key. These are -directory-less schemes such as \fIlinear hashing\fP [Lit80] (+ Larson -variations), \fIspiral storage\fP [Mar79] or directory schemes such as -\fIextensible hashing\fP [Fag79] by Fagin et al. I do hope these sources -provide a reasonable playground for experimentation with other algorithms. -See the June 1988 issue of ACM Computing Surveys [Enb88] for an -excellent overview of the field. -.PG -.SH -References -.LP -.IP [Lar78] 4m -P.-A. Larson, -``Dynamic Hashing'', \fIBIT\fP, vol. 18, pp. 184-201, 1978. -.IP [Tho90] 4m -Ken Thompson, \fIprivate communication\fP, Nov. 1990 -.IP [Lit80] 4m -W. Litwin, -`` Linear Hashing: A new tool for file and table addressing'', -\fIProceedings of the 6th Conference on Very Large Dabatases (Montreal)\fP, -pp. 212-223, Very Large Database Foundation, Saratoga, Calif., 1980. -.IP [Fag79] 4m -R. Fagin, J. Nievergelt, N. Pippinger, and H. R. Strong, -``Extendible Hashing - A Fast Access Method for Dynamic Files'', -\fIACM Trans. Database Syst.\fP, vol. 4, no.3, pp. 315-344, Sept. 1979. -.IP [Wal84] 4m -Rich Wales, -``Discussion of "dbm" data base system'', \fIUSENET newsgroup unix.wizards\fP, -Jan. 1984. -.IP [Tor87] 4m -Chris Torek, -``Re: dbm.a and ndbm.a archives'', \fIUSENET newsgroup comp.unix\fP, -1987. -.IP [Mar79] 4m -G. N. Martin, -``Spiral Storage: Incrementally Augmentable Hash Addressed Storage'', -\fITechnical Report #27\fP, University of Varwick, Coventry, U.K., 1979. -.IP [Enb88] 4m -R. J. Enbody and H. C. Du, -``Dynamic Hashing Schemes'',\fIACM Computing Surveys\fP, -vol. 20, no. 2, pp. 85-113, June 1988. |