diff options
Diffstat (limited to 'subversion/libsvn_fs_fs/structure')
-rw-r--r-- | subversion/libsvn_fs_fs/structure | 621 |
1 files changed, 621 insertions, 0 deletions
diff --git a/subversion/libsvn_fs_fs/structure b/subversion/libsvn_fs_fs/structure new file mode 100644 index 0000000..41caf1d --- /dev/null +++ b/subversion/libsvn_fs_fs/structure @@ -0,0 +1,621 @@ +This file describes the design, layouts, and file formats of a +libsvn_fs_fs repository. + +Design +------ + +In FSFS, each committed revision is represented as an immutable file +containing the new node-revisions, contents, and changed-path +information for the revision, plus a second, changeable file +containing the revision properties. + +In contrast to the BDB back end, the contents of recent revision of +files are stored as deltas against earlier revisions, instead of the +other way around. This is less efficient for common-case checkouts, +but brings greater simplicity and robustness, as well as the +flexibility to make commits work without write access to existing +revisions. Skip-deltas and delta combination mitigate the checkout +cost. + +In-progress transactions are represented with a prototype rev file +containing only the new text representations of files (appended to as +changed file contents come in), along with a separate file for each +node-revision, directory representation, or property representation +which has been changed or added in the transaction. During the final +stage of the commit, these separate files are marshalled onto the end +of the prototype rev file to form the immutable revision file. + +Layout of the FS directory +-------------------------- + +The layout of the FS directory (the "db" subdirectory of the +repository) is: + + revs/ Subdirectory containing revs + <shard>/ Shard directory, if sharding is in use (see below) + <revnum> File containing rev <revnum> + <shard>.pack/ Pack directory, if the repo has been packed (see below) + pack Pack file, if the repository has been packed (see below) + manifest Pack manifest file, if a pack file exists (see below) + revprops/ Subdirectory containing rev-props + <shard>/ Shard directory, if sharding is in use (see below) + <revnum> File containing rev-props for <revnum> + <shard>.pack/ Pack directory, if the repo has been packed (see below) + <rev>.<count> Pack file, if the repository has been packed (see below) + manifest Pack manifest file, if a pack file exists (see below) + revprops.db SQLite database of the packed revision properties + transactions/ Subdirectory containing transactions + <txnid>.txn/ Directory containing transaction <txnid> + txn-protorevs/ Subdirectory containing transaction proto-revision files + <txnid>.rev Proto-revision file for transaction <txnid> + <txnid>.rev-lock Write lock for proto-rev file + txn-current File containing the next transaction key + locks/ Subdirectory containing locks + <partial-digest>/ Subdirectory named for first 3 letters of an MD5 digest + <digest> File containing locks/children for path with <digest> + node-origins/ Lazy cache of origin noderevs for nodes + <partial-nodeid> File containing noderev ID of origins of nodes + current File specifying current revision and next node/copy id + fs-type File identifying this filesystem as an FSFS filesystem + write-lock Empty file, locked to serialise writers + txn-current-lock Empty file, locked to serialise 'txn-current' + uuid File containing the UUID of the repository + format File containing the format number of this filesystem + fsfs.conf Configuration file + min-unpacked-rev File containing the oldest revision not in a pack file + min-unpacked-revprop File containing the oldest revision of unpacked revprop + rep-cache.db SQLite database mapping rep checksums to locations + +Files in the revprops directory are in the hash dump format used by +svn_hash_write. + +The format of the "current" file is: + + * Format 3 and above: a single line of the form + "<youngest-revision>\n" giving the youngest revision for the + repository. + + * Format 2 and below: a single line of the form "<youngest-revision> + <next-node-id> <next-copy-id>\n" giving the youngest revision, the + next unique node-ID, and the next unique copy-ID for the + repository. + +The "write-lock" file is an empty file which is locked before the +final stage of a commit and unlocked after the new "current" file has +been moved into place to indicate that a new revision is present. It +is also locked during a revprop propchange while the revprop file is +read in, mutated, and written out again. Note that readers are never +blocked by any operation - writers must ensure that the filesystem is +always in a consistent state. + +The "txn-current" file is a file with a single line of text that +contains only a base-36 number. The current value will be used in the +next transaction name, along with the revision number the transaction +is based on. This sequence number ensures that transaction names are +not reused, even if the transaction is aborted and a new transaction +based on the same revision is begun. The only operation that FSFS +performs on this file is "get and increment"; the "txn-current-lock" +file is locked during this operation. + +"fsfs.conf" is a configuration file in the standard Subversion/Python +config format. It is automatically generated when you create a new +repository; read the generated file for details on what it controls. + +When representation sharing is enabled, the filesystem tracks +representation checksum and location mappings using a SQLite database in +"rep-cache.db". The database has a single table, which stores the sha1 +hash text as the primary key, mapped to the representation revision, offset, +size and expanded size. This file is only consulted during writes and never +during reads. Consequently, it is not required, and may be removed at an +abritrary time, with the subsequent loss of rep-sharing capabilities for +revisions written thereafter. + +Filesystem formats +------------------ + +The "format" file defines what features are permitted within the +filesystem, and indicates changes that are not backward-compatible. +It serves the same purpose as the repository file of the same name. + +The filesystem format file was introduced in Subversion 1.2, and so +will not be present if the repository was created with an older +version of Subversion. An absent format file should be interpreted as +indicating a format 1 filesystem. + +The format file is a single line of the form "<format number>\n", +followed by any number of lines specifying 'format options' - +additional information about the filesystem's format. Each format +option line is of the form "<option>\n" or "<option> <parameters>\n". + +Clients should raise an error if they encounter an option not +permitted by the format number in use. + +The formats are: + + Format 1, understood by Subversion 1.1+ + Format 2, understood by Subversion 1.4+ + Format 3, understood by Subversion 1.5+ + Format 4, understood by Subversion 1.6+ + Format 5, understood by Subversion 1.7-dev, never released + Format 6, understood by Subversion 1.8 + +The differences between the formats are: + +Delta representation in revision files + Format 1: svndiff0 only + Formats 2+: svndiff0 or svndiff1 + +Format options + Formats 1-2: none permitted + Format 3+: "layout" option + +Transaction name reuse + Formats 1-2: transaction names may be reused + Format 3+: transaction names generated using txn-current file + +Location of proto-rev file and its lock + Formats 1-2: transactions/<txnid>/rev and + transactions/<txnid>/rev-lock. + Format 3+: txn-protorevs/<txnid>.rev and + txn-protorevs/<txnid>.rev-lock. + +Node-ID and copy-ID generation + Formats 1-2: Node-IDs and copy-IDs are guaranteed to form a + monotonically increasing base36 sequence using the "current" + file. + Format 3+: Node-IDs and copy-IDs use the new revision number to + ensure uniqueness and the "current" file just contains the + youngest revision. + +Mergeinfo metadata: + Format 1-2: minfo-here and minfo-count node-revision fields are not + stored. svn_fs_get_mergeinfo returns an error. + Format 3+: minfo-here and minfo-count node-revision fields are + maintained. svn_fs_get_mergeinfo works. + +Revision changed paths list: + Format 1-3: Does not contain the node's kind. + Format 4+: Contains the node's kind. + +Shard packing: + Format 4: Applied to revision data only. + Format 5: Revprops would be packed independently of revision data. + Format 6+: Applied equally to revision data and revprop data + (i.e. same min packed revision) + +# Incomplete list. See SVN_FS_FS__MIN_*_FORMAT + + +Filesystem format options +------------------------- + +Currently, the only recognised format option is "layout", which +specifies the paths that will be used to store the revision files and +revision property files. + +The "layout" option is followed by the name of the filesystem layout +and any required parameters. The default layout, if no "layout" +keyword is specified, is the 'linear' layout. + +The known layouts, and the parameters they require, are as follows: + +"linear" + Revision files and rev-prop files are named after the revision they + represent, and are placed directly in the revs/ and revprops/ + directories. r1234 will be represented by the revision file + revs/1234 and the rev-prop file revprops/1234. + +"sharded <max-files-per-directory>" + Revision files and rev-prop files are named after the revision they + represent, and are placed in a subdirectory of the revs/ and + revprops/ directories named according to the 'shard' they belong to. + + Shards are numbered from zero and contain between one and the + maximum number of files per directory specified in the layout's + parameters. + + For the "sharded 1000" layout, r1234 will be represented by the + revision file revs/1/1234 and rev-prop file revprops/1/1234. The + revs/0/ directory will contain revisions 0-999, revs/1/ will contain + 1000-1999, and so on. + +Packing revisions +----------------- + +A filesystem can optionally be "packed" to conserve space on disk. The +packing process concatenates all the revision files in each full shard to +create pack files. A manifest file is also created for each shard which +records the indexes of the corresponding revision files in the pack file. +In addition, the original shard is removed, and reads are redirected to the +pack file. + +The manifest file consists of a list of offsets, one for each revision in the +pack file. The offsets are stored as ASCII decimal, and separated by a newline +character. + +Packing revision properties (format 5: SQLite) +--------------------------- + +This was supported by 1.7-dev builds but never included in a blessed release. + +See r1143829 of this file: +http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_fs_fs/structure?view=markup&pathrev=1143829 + + +Packing revision properties (format 6+) +--------------------------- + +Similarly to the revision data, packing will concatenate multiple +revprops into a single file. Since they are mutable data, we put an +upper limit to the size of these files: We will concatenate the data +up to the limit and then use a new file for the following revisions. + +The limit can be set and changed at will in the configuration file. +It is 64kB by default. Because a pack file must contain at least one +complete property list, files containing just one revision may exceed +that limit. + +Furthermore, pack files can be compressed which saves about 75% of +disk space. A configuration file flag enables the compression; it is +off by default and may be switched on and off at will. The pack size +limit is always applied to the uncompressed data. For this reason, +the default is 256kB while compression has been enabled. + +Files are named after their start revision as "<rev>.<counter>" where +counter will be increased whenever we rewrite a pack file due to a +revprop change. The manifest file contains the list of pack file +names, one line for each revision. + +Many tools track repository global data in revision properties at +revision 0. To minimize I/O overhead for those applications, we +will never pack that revision, i.e. its data is always being kept +in revprops/0/0. + +Pack file format + + Top level: <packed container> + + We always apply data compression to the pack file - using the + SVN_DELTA_COMPRESSION_LEVEL_NONE level if compression is disabled. + (Note that compression at SVN_DELTA_COMPRESSION_LEVEL_NONE is not + a no-op stream transformation although most of the data will remain + human readable.) + + container := header '\n' (revprops)+ + header := start_rev '\n' rev_count '\n' (size '\n')+ + + All numbers in the header are given as ASCII decimals. rev_count + is the number of revisions packed into this container. There must + be exactly as many "size" and serialized "revprops". The "size" + values in the list are the length in bytes of the serialized + revprops of the respective revision. + +Writing to packed revprops + + The old pack file is being read and the new revprops serialized. + If they fit into the same pack file, a temp file with the new + content gets written and moved into place just like an non-packed + revprop file would. No name change or manifest update required. + + If they don't fit into the same pack file, i.e. exceed the pack + size limit, the pack will be split into 2 or 3 new packs just + before and / or after the modified revision. + + In the current implementation, they will never be merged again. + To minimize fragmentation, the initial packing process will only + use about 90% of the limit, i.e. leave some room for growth. + + When a pack file gets split, its counter is being increased + creating a new file and leaving the old content in place and + available for concurrent readers. Only after the new manifest + file got moved into place, will the old pack files be deleted. + + Write access to revprops is being serialized by the global + filesystem write lock. We only need to build a few retries into + the reader code to gracefully handle manifest changes and pack + file deletions. + + +Node-revision IDs +----------------- + +A node-rev ID consists of the following three fields: + + node_revision_id ::= node_id '.' copy_id '.' txn_id + +At this level, the form of the ID is the same as for BDB - see the +section called "ID's" in <../libsvn_fs_base/notes/structure>. + +In order to support efficient lookup of node-revisions by their IDs +and to simplify the allocation of fresh node-IDs during a transaction, +we treat the fields of a node-rev ID in new and interesting ways. + +Within a new transaction: + + New node-revision IDs assigned within a transaction have a txn-id + field of the form "t<txnid>". + + When a new node-id or copy-id is assigned in a transaction, the ID + used is a "_" followed by a base36 number unique to the transaction. + +Within a revision: + + Within a revision file, node-revs have a txn-id field of the form + "r<rev>/<offset>", to support easy lookup. The <offset> is the (ASCII + decimal) number of bytes from the start of the revision file to the + start of the node-rev. + + During the final phase of a commit, node-revision IDs are rewritten + to have repository-wide unique node-ID and copy-ID fields, and to have + "r<rev>/<offset>" txn-id fields. + + In Format 3 and above, this uniqueness is done by changing a temporary + id of "_<base36>" to "<base36>-<rev>". Note that this means that the + originating revision of a line of history or a copy can be determined + by looking at the node ID. + + In Format 2 and below, the "current" file contains global base36 + node-ID and copy-ID counters; during the commit, the counter value is + added to the transaction-specific base36 ID, and the value in + "current" is adjusted. + + (It is legal for Format 3 repositories to contain Format 2-style IDs; + this just prevents I/O-less node-origin-rev lookup for those nodes.) + +The temporary assignment of node-ID and copy-ID fields has +implications for svn_fs_compare_ids and svn_fs_check_related. The ID +_1.0.t1 is not related to the ID _1.0.t2 even though they have the +same node-ID, because temporary node-IDs are restricted in scope to +the transactions they belong to. + +There is a lazily created cache mapping from node-IDs to the full +node-revision ID where they are created. This is in the node-origins +directory; the file name is the node-ID without its last character (or +"0" for single-character node IDs) and the contents is a serialized +hash mapping from node-ID to node-revision ID. This cache is only +used for node-IDs of the pre-Format 3 style. + +Copy-IDs and copy roots +----------------------- + +Copy-IDs are assigned in the same manner as they are in the BDB +implementation: + + * A node-rev resulting from a creation operation (with no copy + history) receives the copy-ID of its parent directory. + + * A node-rev resulting from a copy operation receives a fresh + copy-ID, as one would expect. + + * A node-rev resulting from a modification operation receives a + copy-ID depending on whether its predecessor derives from a + copy operation or whether it derives from a creation operation + with no intervening copies: + + - If the predecessor does not derive from a copy, the new + node-rev receives the copy-ID of its parent directory. If the + node-rev is being modified through its created-path, this will + be the same copy-ID as the predecessor node-rev has; however, + if the node-rev is being modified through a copied ancestor + directory (i.e. we are performing a "lazy copy"), this will be + a different copy-ID. + + - If the predecessor derives from a copy and the node-rev is + being modified through its created-path, the new node-rev + receives the copy-ID of the predecessor. + + - If the predecessor derives from a copy and the node-rev is not + being modified through its created path, the new node-rev + receives a fresh copy-ID. This is called a "soft copy" + operation, as distinct from a "true copy" operation which was + actually requested through the svn_fs interface. Soft copies + exist to ensure that the same <node-ID,copy-ID> pair is not + used twice within a transaction. + +Unlike the BDB implementation, we do not have a "copies" table. +Instead, each node-revision record contains a "copyroot" field +identifying the node-rev resulting from the true copy operation most +proximal to the node-rev. If the node-rev does not itself derive from +a copy operation, then the copyroot field identifies the copy of an +ancestor directory; if no ancestor directories derive from a copy +operation, then the copyroot field identifies the root directory of +rev 0. + +Revision file format +-------------------- + +A revision file contains a concatenation of various kinds of data: + + * Text and property representations + * Node-revisions + * The changed-path data + * Two offsets at the very end + +A representation begins with a line containing either "PLAIN\n" or +"DELTA\n" or "DELTA <rev> <offset> <length>\n", where <rev>, <offset>, +and <length> give the location of the delta base of the representation +and the amount of data it contains (not counting the header or +trailer). If no base location is given for a delta, the base is the +empty stream. After the initial line comes raw svndiff data, followed +by a cosmetic trailer "ENDREP\n". + +If the representation is for the text contents of a directory node, +the expanded contents are in hash dump format mapping entry names to +"<type> <id>" pairs, where <type> is "file" or "dir" and <id> gives +the ID of the child node-rev. + +If a representation is for a property list, the expanded contents are +in the form of a dumped hash map mapping property names to property +values. + +The marshalling syntax for node-revs is a series of fields terminated +by a blank line. Fields have the syntax "<name>: <value>\n", where +<name> is a symbolic field name (each symbolic name is used only once +in a given node-rev) and <value> is the value data. Unrecognized +fields are ignored, for extensibility. The following fields are +defined: + + id The ID of the node-rev + type "file" or "dir" + pred The ID of the predecessor node-rev + count Count of node-revs since the base of the node + text "<rev> <offset> <length> <size> <digest>" for text rep + props "<rev> <offset> <length> <size> <digest>" for props rep + <rev> and <offset> give location of rep + <length> gives length of rep, sans header and trailer + <size> gives size of expanded rep; may be 0 if equal + to the length + <digest> gives hex MD5 digest of expanded rep + ### in formats >=4, also present: + <sha1-digest> gives hex SHA1 digest of expanded rep + <uniquifier> see representation_t->uniquifier in fs.h + cpath FS pathname node was created at + copyfrom "<rev> <path>" of copyfrom data + copyroot "<rev> <created-path>" of the root of this copy + minfo-cnt The number of nodes under (and including) this node + which have svn:mergeinfo. + minfo-here Exists if this node itself has svn:mergeinfo. + +The predecessor of a node-rev crosses both soft and true copies; +together with the count field, it allows efficient determination of +the base for skip-deltas. The first node-rev of a node contains no +"pred" field. A node-revision with no properties may omit the "props" +field. A node-revision with no contents (a zero-length file or an +empty directory) may omit the "text" field. In a node-revision +resulting from a true copy operation, the "copyfrom" field gives the +copyfrom data. The "copyroot" field identifies the root node-revision +of the copy; it may be omitted if the node-rev is its own copy root +(as is the case for node-revs with copy history, and for the root node +of revision 0). Copy roots are identified by revision and +created-path, not by node-rev ID, because a copy root may be a +node-rev which exists later on within the same revision file, meaning +its offset is not yet known. + +The changed-path data is represented as a series of changed-path +items, each consisting of two lines. The first line has the format +"<id> <action> <text-mod> <prop-mod> <path>\n", where <id> is the +node-rev ID of the new node-rev, <action> is "add", "delete", +"replace", or "modify", <text-mod> and <prop-mod> are "true" or +"false" indicating whether the text and/or properties changed, and +<path> is the changed pathname. For deletes, <id> is the node-rev ID +of the deleted node-rev, and <text-mod> and <prop-mod> are always +"false". The second line has the format "<rev> <path>\n" containing +the node-rev's copyfrom information if it has any; if it does not, the +second line is blank. + +Starting with FS format 4, <action> may contain the kind ("file" or +"dir") of the node, after a hyphen; for example, an added directory +may be represented as "add-dir". + +At the very end of a rev file is a pair of lines containing +"\n<root-offset> <cp-offset>\n", where <root-offset> is the offset of +the root directory node revision and <cp-offset> is the offset of the +changed-path data. + +All numbers in the rev file format are unsigned and are represented as +ASCII decimal. + +Transaction layout +------------------ + +A transaction directory has the following layout: + + props Transaction props + next-ids Next temporary node-ID and copy-ID + changes Changed-path information so far + node.<nid>.<cid> New node-rev data for node + node.<nid>.<cid>.props Props for new node-rev, if changed + node.<nid>.<cid>.children Directory contents for node-rev + <sha1> Text representation of that sha1 + +In FS formats 1 and 2, it also contains: + + rev Prototype rev file with new text reps + rev-lock Lockfile for writing to the above + +In newer formats, these files are in the txn-protorevs/ directory. + +The prototype rev file is used to store the text representations as +they are received from the client. To ensure that only one client is +writing to the file at a given time, the "rev-lock" file is locked for +the duration of each write. + +The two kinds of props files are all in hash dump format. The "props" +file will always be present. The "node.<nid>.<cid>.props" file will +only be present if the node-rev properties have been changed. + +The <sha1> files have been introduced in FS format 6. Their content +is that of text rep references: "<rev> <offset> <length> <size> <digest>" +They will be written for text reps in the current transaction and be +used to eliminate duplicate reps within that transaction. + +The "next-ids" file contains a single line "<next-temp-node-id> +<next-temp-copy-id>\n" giving the next temporary node-ID and copy-ID +assignments (without the leading underscores). The next node-ID is +also used as a uniquifier for representations which may share the same +underlying rep. + +The "children" file for a node-revision begins with a copy of the hash +dump representation of the directory entries from the old node-rev (or +a dump of the empty hash for new directories), and then an incremental +hash dump entry for each change made to the directory. + +The "changes" file contains changed-path entries in the same form as +the changed-path entries in a rev file, except that <id> and <action> +may both be "reset" (in which case <text-mod> and <prop-mod> are both +always "false") to indicate that all changes to a path should be +considered undone. Reset entries are only used during the final merge +phase of a transaction. Actions in the "changes" file always contain +a node kind, even if the FS format is older than format 4. + +The node-rev files have the same format as node-revs in a revision +file, except that the "text" and "props" fields are augmented as +follows: + + * The "props" field may have the value "-1" if properties have + been changed and are contained in a "props" file within the + node-rev subdirectory. + + * For directory node-revs, the "text" field may have the value + "-1" if entries have been changed and are contained in a + "contents" file in the node-rev subdirectory. + + * For the directory node-rev representing the root of the + transaction, the "is-fresh-txn-root" field indicates that it has + not been made mutable yet (see Issue #2608). + + * For file node-revs, the "text" field may have the value "-1 + <offset> <length> <size> <digest>" if the text representation is + within the prototype rev file. + + * The "copyroot" field may have the value "-1 <created-path>" if the + copy root of the node-rev is part of the transaction in process. + +Locks layout +------------ + +Locks in FSFS are stored in serialized hash format in files whose +names are MD5 digests of the FS path which the lock is associated +with. For the purposes of keeping directory inode usage down, these +digest files live in subdirectories of the main lock directory whose +names are the first 3 characters of the digest filename. + +Also stored in the digest file for a given FS path are pointers to +other digest files which contain information associated with other FS +paths that are beneath our path (an immediate child thereof, or a +grandchild, or a great-grandchild, ...). + +To answer the question, "Does path FOO have a lock associated with +it?", one need only generate the MD5 digest of FOO's +absolute-in-the-FS path (say, 3b1b011fed614a263986b5c4869604e8), look +for a file located like so: + + /path/to/repos/locks/3b1/3b1b011fed614a263986b5c4869604e8 + +And then see if that file contains lock information. + +To inquire about locks on children of the path FOO, you would +reference the same path as above, but look for a list of children in +that file (instead of lock information). Children are listed as MD5 +digests, too, so you would simply iterate over those digests and +consult the files they reference for lock information. |