summaryrefslogtreecommitdiffstats
path: root/contrib/perl5/ext/SDBM_File/sdbm/README
diff options
context:
space:
mode:
Diffstat (limited to 'contrib/perl5/ext/SDBM_File/sdbm/README')
-rw-r--r--contrib/perl5/ext/SDBM_File/sdbm/README396
1 files changed, 0 insertions, 396 deletions
diff --git a/contrib/perl5/ext/SDBM_File/sdbm/README b/contrib/perl5/ext/SDBM_File/sdbm/README
deleted file mode 100644
index cd7312c..0000000
--- a/contrib/perl5/ext/SDBM_File/sdbm/README
+++ /dev/null
@@ -1,396 +0,0 @@
-
-
-
-
-
-
- sdbm - Substitute DBM
- or
- Berkeley ndbm for Every UN*X[1] Made Simple
-
- Ozan (oz) Yigit
-
- The Guild of PD Software Toolmakers
- Toronto - Canada
-
- oz@nexus.yorku.ca
-
-
-
-Implementation is the sincerest form of flattery. - L. Peter
-Deutsch
-
-A The Clone of the ndbm library
-
- The sources accompanying this notice - sdbm - consti-
-tute the first public release (Dec. 1990) of a complete
-clone of the Berkeley UN*X ndbm library. The sdbm library is
-meant to clone the proven functionality of ndbm as closely
-as possible, including a few improvements. It is practical,
-easy to understand, and compatible. The sdbm library is not
-derived from any licensed, proprietary or copyrighted
-software.
-
- The sdbm implementation is based on a 1978 algorithm
-[Lar78] by P.-A. (Paul) Larson known as ``Dynamic Hashing''.
-In the course of searching for a substitute for ndbm, I pro-
-totyped three different external-hashing algorithms [Lar78,
-Fag79, Lit80] and ultimately chose Larson's algorithm as a
-basis of the sdbm implementation. The Bell Labs dbm (and
-therefore ndbm) is based on an algorithm invented by Ken
-Thompson, [Tho90, Tor87] and predates Larson's work.
-
- The sdbm programming interface is totally compatible
-with ndbm and includes a slight improvement in database ini-
-tialization. It is also expected to be binary-compatible
-under most UN*X versions that support the ndbm library.
-
- The sdbm implementation shares the shortcomings of the
-ndbm library, as a side effect of various simplifications to
-the original Larson algorithm. It does produce holes in the
-page file as it writes pages past the end of file. (Larson's
-paper include a clever solution to this problem that is a
-result of using the hash value directly as a block address.)
-On the other hand, extensive tests seem to indicate that
-sdbm creates fewer holes in general, and the resulting page-
-files are smaller. The sdbm implementation is also faster
-than ndbm in database creation. Unlike the ndbm, the sdbm
-_________________________
-
- [1] UN*X is not a trademark of any (dis)organization.
-
-
-
-
-
-
-
-
-
- - 2 -
-
-
-store operation will not ``wander away'' trying to split its
-data pages to insert a datum that cannot (due to elaborate
-worst-case situations) be inserted. (It will fail after a
-pre-defined number of attempts.)
-
-Important Compatibility Warning
-
- The sdbm and ndbm libraries cannot share databases: one
-cannot read the (dir/pag) database created by the other.
-This is due to the differences between the ndbm and sdbm
-algorithms[2], and the hash functions used. It is easy to
-convert between the dbm/ndbm databases and sdbm by ignoring
-the index completely: see dbd, dbu etc.
-
-
-Notice of Intellectual Property
-
-The entire sdbm library package, as authored by me, Ozan S.
-Yigit, is hereby placed in the public domain. As such, the
-author is not responsible for the consequences of use of
-this software, no matter how awful, even if they arise from
-defects in it. There is no expressed or implied warranty for
-the sdbm library.
-
- Since the sdbm library package is in the public domain,
-this original release or any additional public-domain
-releases of the modified original cannot possibly (by defin-
-ition) be withheld from you. Also by definition, You (singu-
-lar) have all the rights to this code (including the right
-to sell without permission, the right to hoard[3] and the
-right to do other icky things as you see fit) but those
-rights are also granted to everyone else.
-
- Please note that all previous distributions of this
-software contained a copyright (which is now dropped) to
-protect its origins and its current public domain status
-against any possible claims and/or challenges.
-
-Acknowledgments
-
- Many people have been very helpful and supportive. A
-partial list would necessarily include Rayan Zacherissen
-(who contributed the man page, and also hacked a MMAP
-_________________________
-
- [2] Torek's discussion [Tor87] indicates that
-dbm/ndbm implementations use the hash value to traverse
-the radix trie differently than sdbm and as a result,
-the page indexes are generated in different order. For
-more information, send e-mail to the author.
- [3] You cannot really hoard something that is avail-
-able to the public at large, but try if it makes you
-feel any better.
-
-
-
-
-
-
-
-
-
-
- - 3 -
-
-
-version of sdbm), Arnold Robbins, Chris Lewis, Bill David-
-sen, Henry Spencer, Geoff Collyer, Rich Salz (who got me
-started in the first place), Johannes Ruschein (who did the
-minix port) and David Tilbrook. I thank you all.
-
-Distribution Manifest and Notes
-
-This distribution of sdbm includes (at least) the following:
-
- CHANGES change log
- README this file.
- biblio a small bibliography on external hashing
- dba.c a crude (n/s)dbm page file analyzer
- dbd.c a crude (n/s)dbm page file dumper (for conversion)
- dbe.1 man page for dbe.c
- dbe.c Janick's database editor
- dbm.c a dbm library emulation wrapper for ndbm/sdbm
- dbm.h header file for the above
- dbu.c a crude db management utility
- hash.c hashing function
- makefile guess.
- pair.c page-level routines (posted earlier)
- pair.h header file for the above
- readme.ms troff source for the README file
- sdbm.3 man page
- sdbm.c the real thing
- sdbm.h header file for the above
- tune.h place for tuning & portability thingies
- util.c miscellaneous
-
- dbu is a simple database manipulation program[4] that
-tries to look like Bell Labs' cbt utility. It is currently
-incomplete in functionality. I use dbu to test out the rou-
-tines: it takes (from stdin) tab separated key/value pairs
-for commands like build or insert or takes keys for commands
-like delete or look.
-
- dbu <build|creat|look|insert|cat|delete> dbmfile
-
- dba is a crude analyzer of dbm/sdbm/ndbm page files. It
-scans the entire page file, reporting page level statistics,
-and totals at the end.
-
- dbd is a crude dump program for dbm/ndbm/sdbm data-
-bases. It ignores the bitmap, and dumps the data pages in
-sequence. It can be used to create input for the dbu util-
-ity. Note that dbd will skip any NULLs in the key and data
-fields, thus is unsuitable to convert some peculiar
-_________________________
-
- [4] The dbd, dba, dbu utilities are quick hacks and
-are not fit for production use. They were developed
-late one night, just to test out sdbm, and convert some
-databases.
-
-
-
-
-
-
-
-
-
- - 4 -
-
-
-databases that insist in including the terminating null.
-
- I have also included a copy of the dbe (ndbm DataBase
-Editor) by Janick Bergeron [janick@bnr.ca] for your pleas-
-ure. You may find it more useful than the little dbu util-
-ity.
-
- dbm.[ch] is a dbm library emulation on top of ndbm (and
-hence suitable for sdbm). Written by Robert Elz.
-
- The sdbm library has been around in beta test for quite
-a long time, and from whatever little feedback I received
-(maybe no news is good news), I believe it has been func-
-tioning without any significant problems. I would, of
-course, appreciate all fixes and/or improvements. Portabil-
-ity enhancements would especially be useful.
-
-Implementation Issues
-
- Hash functions: The algorithm behind sdbm implementa-
-tion needs a good bit-scrambling hash function to be effec-
-tive. I ran into a set of constants for a simple hash func-
-tion that seem to help sdbm perform better than ndbm for
-various inputs:
-
- /*
- * polynomial conversion ignoring overflows
- * 65599 nice. 65587 even better.
- */
- long
- dbm_hash(char *str, int len) {
- register unsigned long n = 0;
-
- while (len--)
- n = n * 65599 + *str++;
- return n;
- }
-
- There may be better hash functions for the purposes of
-dynamic hashing. Try your favorite, and check the pagefile.
-If it contains too many pages with too many holes, (in rela-
-tion to this one for example) or if sdbm simply stops work-
-ing (fails after SPLTMAX attempts to split) when you feed
-your NEWS history file to it, you probably do not have a
-good hashing function. If you do better (for different
-types of input), I would like to know about the function you
-use.
-
- Block sizes: It seems (from various tests on a few
-machines) that a page file block size PBLKSIZ of 1024 is by
-far the best for performance, but this also happens to limit
-the size of a key/value pair. Depending on your needs, you
-may wish to increase the page size, and also adjust PAIRMAX
-(the maximum size of a key/value pair allowed: should always
-
-
-
-
-
-
-
-
-
- - 5 -
-
-
-be at least three words smaller than PBLKSIZ.) accordingly.
-The system-wide version of the library should probably be
-configured with 1024 (distribution default), as this appears
-to be sufficient for most common uses of sdbm.
-
-Portability
-
- This package has been tested in many different UN*Xes
-even including minix, and appears to be reasonably portable.
-This does not mean it will port easily to non-UN*X systems.
-
-Notes and Miscellaneous
-
- The sdbm is not a very complicated package, at least
-not after you familiarize yourself with the literature on
-external hashing. There are other interesting algorithms in
-existence that ensure (approximately) single-read access to
-a data value associated with any key. These are directory-
-less schemes such as linear hashing [Lit80] (+ Larson varia-
-tions), spiral storage [Mar79] or directory schemes such as
-extensible hashing [Fag79] by Fagin et al. I do hope these
-sources provide a reasonable playground for experimentation
-with other algorithms. See the June 1988 issue of ACM Com-
-puting Surveys [Enb88] for an excellent overview of the
-field.
-
-References
-
-
-[Lar78]
- P.-A. Larson, ``Dynamic Hashing'', BIT, vol. 18, pp.
- 184-201, 1978.
-
-[Tho90]
- Ken Thompson, private communication, Nov. 1990
-
-[Lit80]
- W. Litwin, `` Linear Hashing: A new tool for file and
- table addressing'', Proceedings of the 6th Conference on
- Very Large Dabatases (Montreal), pp. 212-223, Very
- Large Database Foundation, Saratoga, Calif., 1980.
-
-[Fag79]
- R. Fagin, J. Nievergelt, N. Pippinger, and H. R.
- Strong, ``Extendible Hashing - A Fast Access Method for
- Dynamic Files'', ACM Trans. Database Syst., vol. 4,
- no.3, pp. 315-344, Sept. 1979.
-
-[Wal84]
- Rich Wales, ``Discussion of "dbm" data base system'',
- USENET newsgroup unix.wizards, Jan. 1984.
-
-[Tor87]
- Chris Torek, ``Re: dbm.a and ndbm.a archives'',
-
-
-
-
-
-
-
-
-
- - 6 -
-
-
- USENET newsgroup comp.unix, 1987.
-
-[Mar79]
- G. N. Martin, ``Spiral Storage: Incrementally Augment-
- able Hash Addressed Storage'', Technical Report #27,
- University of Varwick, Coventry, U.K., 1979.
-
-[Enb88]
- R. J. Enbody and H. C. Du, ``Dynamic Hashing
- Schemes'',ACM Computing Surveys, vol. 20, no. 2, pp.
- 85-113, June 1988.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
OpenPOWER on IntegriCloud