diff options
Diffstat (limited to 'contrib/bind9/doc/rfc/rfc2168.txt')
-rw-r--r-- | contrib/bind9/doc/rfc/rfc2168.txt | 1123 |
1 files changed, 0 insertions, 1123 deletions
diff --git a/contrib/bind9/doc/rfc/rfc2168.txt b/contrib/bind9/doc/rfc/rfc2168.txt deleted file mode 100644 index 3eed1bd..0000000 --- a/contrib/bind9/doc/rfc/rfc2168.txt +++ /dev/null @@ -1,1123 +0,0 @@ - - - - - - -Network Working Group R. Daniel -Request for Comments: 2168 Los Alamos National Laboratory -Category: Experimental M. Mealling - Network Solutions, Inc. - June 1997 - - - Resolution of Uniform Resource Identifiers - using the Domain Name System - -Status of this Memo -=================== - - This memo defines an Experimental Protocol for the Internet - community. This memo does not specify an Internet standard of any - kind. Discussion and suggestions for improvement are requested. - Distribution of this memo is unlimited. - -Abstract: -========= - - Uniform Resource Locators (URLs) are the foundation of the World Wide - Web, and are a vital Internet technology. However, they have proven - to be brittle in practice. The basic problem is that URLs typically - identify a particular path to a file on a particular host. There is - no graceful way of changing the path or host once the URL has been - assigned. Neither is there a graceful way of replicating the resource - located by the URL to achieve better network utilization and/or fault - tolerance. Uniform Resource Names (URNs) have been hypothesized as a - adjunct to URLs that would overcome such problems. URNs and URLs are - both instances of a broader class of identifiers known as Uniform - Resource Identifiers (URIs). - - The requirements document for URN resolution systems[15] defines the - concept of a "resolver discovery service". This document describes - the first, experimental, RDS. It is implemented by a new DNS Resource - Record, NAPTR (Naming Authority PoinTeR), that provides rules for - mapping parts of URIs to domain names. By changing the mapping - rules, we can change the host that is contacted to resolve a URI. - This will allow a more graceful handling of URLs over long time - periods, and forms the foundation for a new proposal for Uniform - Resource Names. - - - - - - - - - -Daniel & Mealling Experimental [Page 1] - -RFC 2168 Resolution of URIs Using the DNS June 1997 - - - In addition to locating resolvers, the NAPTR provides for other - naming systems to be grandfathered into the URN world, provides - independence between the name assignment system and the resolution - protocol system, and allows multiple services (Name to Location, Name - to Description, Name to Resource, ...) to be offered. In conjunction - with the SRV RR, the NAPTR record allows those services to be - replicated for the purposes of fault tolerance and load balancing. - -Introduction: -============= - - Uniform Resource Locators have been a significant advance in - retrieving Internet-accessible resources. However, their brittle - nature over time has been recognized for several years. The Uniform - Resource Identifier working group proposed the development of Uniform - Resource Names to serve as persistent, location-independent - identifiers for Internet resources in order to overcome most of the - problems with URLs. RFC-1737 [1] sets forth requirements on URNs. - - During the lifetime of the URI-WG, a number of URN proposals were - generated. The developers of several of those proposals met in a - series of meetings, resulting in a compromise known as the Knoxville - framework. The major principle behind the Knoxville framework is - that the resolution system must be separate from the way names are - assigned. This is in marked contrast to most URLs, which identify the - host to contact and the protocol to use. Readers are referred to [2] - for background on the Knoxville framework and for additional - information on the context and purpose of this proposal. - - Separating the way names are resolved from the way they are - constructed provides several benefits. It allows multiple naming - approaches and resolution approaches to compete, as it allows - different protocols and resolvers to be used. There is just one - problem with such a separation - how do we resolve a name when it - can't give us directions to its resolver? - - For the short term, DNS is the obvious candidate for the resolution - framework, since it is widely deployed and understood. However, it is - not appropriate to use DNS to maintain information on a per-resource - basis. First of all, DNS was never intended to handle that many - records. Second, the limited record size is inappropriate for catalog - information. Third, domain names are not appropriate as URNs. - - Therefore our approach is to use DNS to locate "resolvers" that can - provide information on individual resources, potentially including - the resource itself. To accomplish this, we "rewrite" the URI into a - domain name following the rules provided in NAPTR records. Rewrite - rules provide considerable power, which is important when trying to - - - -Daniel & Mealling Experimental [Page 2] - -RFC 2168 Resolution of URIs Using the DNS June 1997 - - - meet the goals listed above. However, collections of rules can become - difficult to understand. To lessen this problem, the NAPTR rules are - *always* applied to the original URI, *never* to the output of - previous rules. - - Locating a resolver through the rewrite procedure may take multiple - steps, but the beginning is always the same. The start of the URI is - scanned to extract its colon-delimited prefix. (For URNs, the prefix - is always "urn:" and we extract the following colon-delimited - namespace identifier [3]). NAPTR resolution begins by taking the - extracted string, appending the well-known suffix ".urn.net", and - querying the DNS for NAPTR records at that domain name. Based on the - results of this query, zero or more additional DNS queries may be - needed to locate resolvers for the URI. The details of the - conversation between the client and the resolver thus located are - outside the bounds of this draft. Three brief examples of this - procedure are given in the next section. - - The NAPTR RR provides the level of indirection needed to keep the - naming system independent of the resolution system, its protocols, - and services. Coupled with the new SRV resource record proposal[4] - there is also the potential for replicating the resolver on multiple - hosts, overcoming some of the most significant problems of URLs. This - is an important and subtle point. Not only do the NAPTR and SRV - records allow us to replicate the resource, we can replicate the - resolvers that know about the replicated resource. Preventing a - single point of failure at the resolver level is a significant - benefit. Separating the resolution procedure from the way names are - constructed has additional benefits. Different resolution procedures - can be used over time, and resolution procedures that are determined - to be useful can be extended to deal with additional namespaces. - -Caveats -======= - - The NAPTR proposal is the first resolution procedure to be considered - by the URN-WG. There are several concerns about the proposal which - have motivated the group to recommend it for publication as an - Experimental rather than a standards-track RFC. - - First, URN resolution is new to the IETF and we wish to gain - operational experience before recommending any procedure for the - standards track. Second, the NAPTR proposal is based on DNS and - consequently inherits concerns about security and administration. The - recent advancement of the DNSSEC and secure update drafts to Proposed - Standard reduce these concerns, but we wish to experiment with those - new capabilities in the context of URN administration. A third area - of concern is the potential for a noticeable impact on the DNS. We - - - -Daniel & Mealling Experimental [Page 3] - -RFC 2168 Resolution of URIs Using the DNS June 1997 - - - believe that the proposal makes appropriate use of caching and - additional information, but it is best to go slow where the potential - for impact on a core system like the DNS is concerned. Fourth, the - rewrite rules in the NAPTR proposal are based on regular expressions. - Since regular expressions are difficult for humans to construct - correctly, concerns exist about the usability and maintainability of - the rules. This is especially true where international character sets - are concerned. Finally, the URN-WG is developing a requirements - document for URN Resolution Services[15], but that document is not - complete. That document needs to precede any resolution service - proposals on the standards track. - -Terminology -=========== - - "Must" or "Shall" - Software that does not behave in the manner that - this document says it must is not conformant to this - document. - "Should" - Software that does not follow the behavior that this - document says it should may still be conformant, but is - probably broken in some fundamental way. - "May" - Implementations may or may not provide the described - behavior, while still remaining conformant to this - document. - -Brief overview and examples of the NAPTR RR: -============================================ - - A detailed description of the NAPTR RR will be given later, but to - give a flavor for the proposal we first give a simple description of - the record and three examples of its use. - - The key fields in the NAPTR RR are order, preference, service, flags, - regexp, and replacement: - - * The order field specifies the order in which records MUST be - processed when multiple NAPTR records are returned in response to a - single query. A naming authority may have delegated a portion of - its namespace to another agency. Evaluating the NAPTR records in - the correct order is necessary for delegation to work properly. - - * The preference field specifies the order in which records SHOULD be - processed when multiple NAPTR records have the same value of - "order". This field lets a service provider specify the order in - which resolvers are contacted, so that more capable machines are - contacted in preference to less capable ones. - - - - - -Daniel & Mealling Experimental [Page 4] - -RFC 2168 Resolution of URIs Using the DNS June 1997 - - - * The service field specifies the resolution protocol and resolution - service(s) that will be available if the rewrite specified by the - regexp or replacement fields is applied. Resolution protocols are - the protocols used to talk with a resolver. They will be specified - in other documents, such as [5]. Resolution services are operations - such as N2R (URN to Resource), N2L (URN to URL), N2C (URN to URC), - etc. These will be discussed in the URN Resolution Services - document[6], and their behavior in a particular resolution protocol - will be given in the specification for that protocol (see [5] for a - concrete example). - - * The flags field contains modifiers that affect what happens in the - next DNS lookup, typically for optimizing the process. Flags may - also affect the interpretation of the other fields in the record, - therefore, clients MUST skip NAPTR records which contain an unknown - flag value. - - * The regexp field is one of two fields used for the rewrite rules, - and is the core concept of the NAPTR record. The regexp field is a - String containing a sed-like substitution expression. (The actual - grammar for the substitution expressions is given later in this - draft). The substitution expression is applied to the original URN - to determine the next domain name to be queried. The regexp field - should be used when the domain name to be generated is conditional - on information in the URI. If the next domain name is always known, - which is anticipated to be a common occurrence, the replacement - field should be used instead. - - * The replacement field is the other field that may be used for the - rewrite rule. It is an optimization of the rewrite process for the - case where the next domain name is fixed instead of being - conditional on the content of the URI. The replacement field is a - domain name (subject to compression if a DNS sender knows that a - given recipient is able to decompress names in this RR type's RDATA - field). If the rewrite is more complex than a simple substitution - of a domain name, the replacement field should be set to . and the - regexp field used. - - - - - - - - - - - - - - -Daniel & Mealling Experimental [Page 5] - -RFC 2168 Resolution of URIs Using the DNS June 1997 - - - Note that the client applies all the substitutions and performs all - lookups, they are not performed in the DNS servers. Note also that it - is the belief of the developers of this document that regexps should - rarely be used. The replacement field seems adequate for the vast - majority of situations. Regexps are only necessary when portions of a - namespace are to be delegated to different resolvers. Finally, note - that the regexp and replacement fields are, at present, mutually - exclusive. However, developers of client software should be aware - that a new flag might be defined which requires values in both - fields. - -Example 1 ---------- - - Consider a URN that uses the hypothetical DUNS namespace. DUNS - numbers are identifiers for approximately 30 million registered - businesses around the world, assigned and maintained by Dunn and - Bradstreet. The URN might look like: - - urn:duns:002372413:annual-report-1997 - - The first step in the resolution process is to find out about the - DUNS namespace. The namespace identifier, "duns", is extracted from - the URN, prepended to urn.net, and the NAPTRs for duns.urn.net looked - up. It might return records of the form: - -duns.urn.net -;; order pref flags service regexp replacement - IN NAPTR 100 10 "s" "dunslink+N2L+N2C" "" dunslink.udp.isi.dandb.com - IN NAPTR 100 20 "s" "rcds+N2C" "" rcds.udp.isi.dandb.com - IN NAPTR 100 30 "s" "http+N2L+N2C+N2R" "" http.tcp.isi.dandb.com - - The order field contains equal values, indicating that no name - delegation order has to be followed. The preference field indicates - that the provider would like clients to use the special dunslink - protocol, followed by the RCDS protocol, and that HTTP is offered as - a last resort. All the records specify the "s" flag, which will be - explained momentarily. The service fields say that if we speak - dunslink, we will be able to issue either the N2L or N2C requests to - obtain a URL or a URC (description) of the resource. The Resource - Cataloging and Distribution Service (RCDS)[7] could be used to get a - URC for the resource, while HTTP could be used to get a URL, URC, or - the resource itself. All the records supply the next domain name to - query, none of them need to be rewritten with the aid of regular - expressions. - - - - - - -Daniel & Mealling Experimental [Page 6] - -RFC 2168 Resolution of URIs Using the DNS June 1997 - - - The general case might require multiple NAPTR rewrites to locate a - resolver, but eventually we will come to the "terminal NAPTR". Once - we have the terminal NAPTR, our next probe into the DNS will be for a - SRV or A record instead of another NAPTR. Rather than probing for a - non-existent NAPTR record to terminate the loop, the flags field is - used to indicate a terminal lookup. If it has a value of "s", the - next lookup should be for SRV RRs, "a" denotes that A records should - sought. A "p" flag is also provided to indicate that the next action - is Protocol-specific, but that looking up another NAPTR will not be - part of it. - - Since our example RR specified the "s" flag, it was terminal. - Assuming our client does not know the dunslink protocol, our next - action is to lookup SRV RRs for rcds.udp.isi.dandb.com, which will - tell us hosts that can provide the necessary resolution service. That - lookup might return: - - ;; Pref Weight Port Target - rcds.udp.isi.dandb.com IN SRV 0 0 1000 defduns.isi.dandb.com - IN SRV 0 0 1000 dbmirror.com.au - IN SRV 0 0 1000 ukmirror.com.uk - - telling us three hosts that could actually do the resolution, and - giving us the port we should use to talk to their RCDS server. (The - reader is referred to the SRV proposal [4] for the interpretation of - the fields above). - - There is opportunity for significant optimization here. We can return - the SRV records as additional information for terminal NAPTRs (and - the A records as additional information for those SRVs). While this - recursive provision of additional information is not explicitly - blessed in the DNS specifications, it is not forbidden, and BIND does - take advantage of it [8]. This is a significant optimization. In - conjunction with a long TTL for *.urn.net records, the average number - of probes to DNS for resolving DUNS URNs would approach one. - Therefore, DNS server implementors SHOULD provide additional - information with NAPTR responses. The additional information will be - either SRV or A records. If SRV records are available, their A - records should be provided as recursive additional information. - - Note that the example NAPTR records above are intended to represent - the reply the client will see. They are not quite identical to what - the domain administrator would put into the zone files. For one - thing, the administrator should supply the trailing '.' character on - any FQDNs. - - - - - - -Daniel & Mealling Experimental [Page 7] - -RFC 2168 Resolution of URIs Using the DNS June 1997 - - -Example 2 ---------- - - Consider a URN namespace based on MIME Content-Ids. The URN might - look like this: - - urn:cid:199606121851.1@mordred.gatech.edu - - (Note that this example is chosen for pedagogical purposes, and does - not conform to the recently-approved CID URL scheme.) - - The first step in the resolution process is to find out about the CID - namespace. The namespace identifier, cid, is extracted from the URN, - prepended to urn.net, and the NAPTR for cid.urn.net looked up. It - might return records of the form: - - cid.urn.net - ;; order pref flags service regexp replacement - IN NAPTR 100 10 "" "" "/urn:cid:.+@([^\.]+\.)(.*)$/\2/i" . - - We have only one NAPTR response, so ordering the responses is not a - problem. The replacement field is empty, so we check the regexp - field and use the pattern provided there. We apply that regexp to the - entire URN to see if it matches, which it does. The \2 part of the - substitution expression returns the string "gatech.edu". Since the - flags field does not contain "s" or "a", the lookup is not terminal - and our next probe to DNS is for more NAPTR records: - lookup(query=NAPTR, "gatech.edu"). - - Note that the rule does not extract the full domain name from the - CID, instead it assumes the CID comes from a host and extracts its - domain. While all hosts, such as mordred, could have their very own - NAPTR, maintaining those records for all the machines at a site as - large as Georgia Tech would be an intolerable burden. Wildcards are - not appropriate here since they only return results when there is no - exactly matching names already in the system. - - The record returned from the query on "gatech.edu" might look like: - -gatech.edu IN NAPTR -;; order pref flags service regexp replacement - IN NAPTR 100 50 "s" "z3950+N2L+N2C" "" z3950.tcp.gatech.edu - IN NAPTR 100 50 "s" "rcds+N2C" "" rcds.udp.gatech.edu - IN NAPTR 100 50 "s" "http+N2L+N2C+N2R" "" http.tcp.gatech.edu - - - - - - - -Daniel & Mealling Experimental [Page 8] - -RFC 2168 Resolution of URIs Using the DNS June 1997 - - - Continuing with our example, we note that the values of the order and - preference fields are equal in all records, so the client is free to - pick any record. The flags field tells us that these are the last - NAPTR patterns we should see, and after the rewrite (a simple - replacement in this case) we should look up SRV records to get - information on the hosts that can provide the necessary service. - - Assuming we prefer the Z39.50 protocol, our lookup might return: - - ;; Pref Weight Port Target - z3950.tcp.gatech.edu IN SRV 0 0 1000 z3950.gatech.edu - IN SRV 0 0 1000 z3950.cc.gatech.edu - IN SRV 0 0 1000 z3950.uga.edu - - telling us three hosts that could actually do the resolution, and - giving us the port we should use to talk to their Z39.50 server. - - Recall that the regular expression used \2 to extract a domain name - from the CID, and \. for matching the literal '.' characters - seperating the domain name components. Since '\' is the escape - character, literal occurances of a backslash must be escaped by - another backslash. For the case of the cid.urn.net record above, the - regular expression entered into the zone file should be - "/urn:cid:.+@([^\\.]+\\.)(.*)$/\\2/i". When the client code actually - receives the record, the pattern will have been converted to - "/urn:cid:.+@([^.]+\.)(.*)$/\2/i". - -Example 3 ---------- - - Even if URN systems were in place now, there would still be a - tremendous number of URLs. It should be possible to develop a URN - resolution system that can also provide location independence for - those URLs. This is related to the requirement in [1] to be able to - grandfather in names from other naming systems, such as ISO Formal - Public Identifiers, Library of Congress Call Numbers, ISBNs, ISSNs, - etc. - - The NAPTR RR could also be used for URLs that have already been - assigned. Assume we have the URL for a very popular piece of - software that the publisher wishes to mirror at multiple sites around - the world: - - http://www.foo.com/software/latest-beta.exe - - - - - - - -Daniel & Mealling Experimental [Page 9] - -RFC 2168 Resolution of URIs Using the DNS June 1997 - - - We extract the prefix, "http", and lookup NAPTR records for - http.urn.net. This might return a record of the form - - http.urn.net IN NAPTR - ;; order pref flags service regexp replacement - 100 90 "" "" "!http://([^/:]+)!\1!i" . - - This expression returns everything after the first double slash and - before the next slash or colon. (We use the '!' character to delimit - the parts of the substitution expression. Otherwise we would have to - use backslashes to escape the forward slashes, and would have a - regexp in the zone file that looked like - "/http:\\/\\/([^\\/:]+)/\\1/i".). - - Applying this pattern to the URL extracts "www.foo.com". Looking up - NAPTR records for that might return: - - www.foo.com - ;; order pref flags service regexp replacement - IN NAPTR 100 100 "s" "http+L2R" "" http.tcp.foo.com - IN NAPTR 100 100 "s" "ftp+L2R" "" ftp.tcp.foo.com - - Looking up SRV records for http.tcp.foo.com would return information - on the hosts that foo.com has designated to be its mirror sites. The - client can then pick one for the user. - -NAPTR RR Format -=============== - - The format of the NAPTR RR is given below. The DNS type code for - NAPTR is 35. - - Domain TTL Class Order Preference Flags Service Regexp - Replacement - - where: - - Domain - The domain name this resource record refers to. - TTL - Standard DNS Time To Live field - Class - Standard DNS meaning - - - - - - - - -Daniel & Mealling Experimental [Page 10] - -RFC 2168 Resolution of URIs Using the DNS June 1997 - - - Order - A 16-bit integer specifying the order in which the NAPTR - records MUST be processed to ensure correct delegation of - portions of the namespace over time. Low numbers are processed - before high numbers, and once a NAPTR is found that "matches" - a URN, the client MUST NOT consider any NAPTRs with a higher - value for order. - - Preference - A 16-bit integer which specifies the order in which NAPTR - records with equal "order" values SHOULD be processed, low - numbers being processed before high numbers. This is similar - to the preference field in an MX record, and is used so domain - administrators can direct clients towards more capable hosts - or lighter weight protocols. - - Flags - A String giving flags to control aspects of the rewriting and - interpretation of the fields in the record. Flags are single - characters from the set [A-Z0-9]. The case of the alphabetic - characters is not significant. - - At this time only three flags, "S", "A", and "P", are defined. - "S" means that the next lookup should be for SRV records - instead of NAPTR records. "A" means that the next lookup - should be for A records. The "P" flag says that the remainder - of the resolution shall be carried out in a Protocol-specific - fashion, and we should not do any more DNS queries. - - The remaining alphabetic flags are reserved. The numeric flags - may be used for local experimentation. The S, A, and P flags - are all mutually exclusive, and resolution libraries MAY - signal an error if more than one is given. (Experimental code - and code for assisting in the creation of NAPTRs would be more - likely to signal such an error than a client such as a - browser). We anticipate that multiple flags will be allowed in - the future, so implementers MUST NOT assume that the flags - field can only contain 0 or 1 characters. Finally, if a client - encounters a record with an unknown flag, it MUST ignore it - and move to the next record. This test takes precedence even - over the "order" field. Since flags can control the - interpretation placed on fields, a novel flag might change the - interpretation of the regexp and/or replacement fields such - that it is impossible to determine if a record matched a URN. - - - - - - - -Daniel & Mealling Experimental [Page 11] - -RFC 2168 Resolution of URIs Using the DNS June 1997 - - - Service - Specifies the resolution service(s) available down this - rewrite path. It may also specify the particular protocol that - is used to talk with a resolver. A protocol MUST be specified - if the flags field states that the NAPTR is terminal. If a - protocol is specified, but the flags field does not state that - the NAPTR is terminal, the next lookup MUST be for a NAPTR. - The client MAY choose not to perform the next lookup if the - protocol is unknown, but that behavior MUST NOT be relied - upon. - - The service field may take any of the values below (using the - Augmented BNF of RFC 822[9]): - - service_field = [ [protocol] *("+" rs)] - protocol = ALPHA *31ALPHANUM - rs = ALPHA *31ALPHANUM - // The protocol and rs fields are limited to 32 - // characters and must start with an alphabetic. - // The current set of "known" strings are: - // protocol = "rcds" / "thttp" / "hdl" / "rwhois" / "z3950" - // rs = "N2L" / "N2Ls" / "N2R" / "N2Rs" / "N2C" - // / "N2Ns" / "L2R" / "L2Ns" / "L2Ls" / "L2C" - - i.e. an optional protocol specification followed by 0 or more - resolution services. Each resolution service is indicated by - an initial '+' character. - - Note that the empty string is also a valid service field. This - will typically be seen at the top levels of a namespace, when - it is impossible to know what services and protocols will be - offered by a particular publisher within that name space. - - At this time the known protocols are rcds[7], hdl[10] (binary, - UDP-based protocols), thttp[5] (a textual, TCP-based - protocol), rwhois[11] (textual, UDP or TCP based), and - Z39.50[12] (binary, TCP-based). More will be allowed later. - The names of the protocols must be formed from the characters - [a-Z0-9]. Case of the characters is not significant. - - The service requests currently allowed will be described in - more detail in [6], but in brief they are: - N2L - Given a URN, return a URL - N2Ls - Given a URN, return a set of URLs - N2R - Given a URN, return an instance of the resource. - N2Rs - Given a URN, return multiple instances of the - resource, typically encoded using - multipart/alternative. - - - -Daniel & Mealling Experimental [Page 12] - -RFC 2168 Resolution of URIs Using the DNS June 1997 - - - N2C - Given a URN, return a collection of meta- - information on the named resource. The format of - this response is the subject of another document. - N2Ns - Given a URN, return all URNs that are also - identifers for the resource. - L2R - Given a URL, return the resource. - L2Ns - Given a URL, return all the URNs that are - identifiers for the resource. - L2Ls - Given a URL, return all the URLs for instances of - of the same resource. - L2C - Given a URL, return a description of the - resource. - - The actual format of the service request and response will be - determined by the resolution protocol, and is the subject for - other documents (e.g. [5]). Protocols need not offer all - services. The labels for service requests shall be formed from - the set of characters [A-Z0-9]. The case of the alphabetic - characters is not significant. - - Regexp - A STRING containing a substitution expression that is applied - to the original URI in order to construct the next domain name - to lookup. The grammar of the substitution expression is given - in the next section. - - Replacement - The next NAME to query for NAPTR, SRV, or A records depending - on the value of the flags field. As mentioned above, this may - be compressed. - -Substitution Expression Grammar: -================================ - - The content of the regexp field is a substitution expression. True - sed(1) substitution expressions are not appropriate for use in this - application for a variety of reasons, therefore the contents of the - regexp field MUST follow the grammar below: - -subst_expr = delim-char ere delim-char repl delim-char *flags -delim-char = "/" / "!" / ... (Any non-digit or non-flag character other - than backslash '\'. All occurances of a delim_char in a - subst_expr must be the same character.) -ere = POSIX Extended Regular Expression (see [13], section - 2.8.4) -repl = dns_str / backref / repl dns_str / repl backref -dns_str = 1*DNS_CHAR -backref = "\" 1POS_DIGIT - - - -Daniel & Mealling Experimental [Page 13] - -RFC 2168 Resolution of URIs Using the DNS June 1997 - - -flags = "i" -DNS_CHAR = "-" / "0" / ... / "9" / "a" / ... / "z" / "A" / ... / "Z" -POS_DIGIT = "1" / "2" / ... / "9" ; 0 is not an allowed backref -value domain name (see RFC-1123 [14]). - - The result of applying the substitution expression to the original - URI MUST result in a string that obeys the syntax for DNS host names - [14]. Since it is possible for the regexp field to be improperly - specified, such that a non-conforming host name can be constructed, - client software SHOULD verify that the result is a legal host name - before making queries on it. - - Backref expressions in the repl portion of the substitution - expression are replaced by the (possibly empty) string of characters - enclosed by '(' and ')' in the ERE portion of the substitution - expression. N is a single digit from 1 through 9, inclusive. It - specifies the N'th backref expression, the one that begins with the - N'th '(' and continues to the matching ')'. For example, the ERE - (A(B(C)DE)(F)G) - has backref expressions: - \1 = ABCDEFG - \2 = BCDE - \3 = C - \4 = F - \5..\9 = error - no matching subexpression - - The "i" flag indicates that the ERE matching SHALL be performed in a - case-insensitive fashion. Furthermore, any backref replacements MAY - be normalized to lower case when the "i" flag is given. - - The first character in the substitution expression shall be used as - the character that delimits the components of the substitution - expression. There must be exactly three non-escaped occurrences of - the delimiter character in a substitution expression. Since escaped - occurrences of the delimiter character will be interpreted as - occurrences of that character, digits MUST NOT be used as delimiters. - Backrefs would be confused with literal digits were this allowed. - Similarly, if flags are specified in the substitution expression, the - delimiter character must not also be a flag character. - - - - - - - - - - - - -Daniel & Mealling Experimental [Page 14] - -RFC 2168 Resolution of URIs Using the DNS June 1997 - - -Advice to domain administrators: -================================ - - Beware of regular expressions. Not only are they a pain to get - correct on their own, but there is the previously mentioned - interaction with DNS. Any backslashes in a regexp must be entered - twice in a zone file in order to appear once in a query response. - More seriously, the need for double backslashes has probably not been - tested by all implementors of DNS servers. We anticipate that urn.net - will be the heaviest user of regexps. Only when delegating portions - of namespaces should the typical domain administrator need to use - regexps. - - On a related note, beware of interactions with the shell when - manipulating regexps from the command line. Since '\' is a common - escape character in shells, there is a good chance that when you - think you are saying "\\" you are actually saying "\". Similar - caveats apply to characters such as - - The "a" flag allows the next lookup to be for A records rather than - SRV records. Since there is no place for a port specification in the - NAPTR record, when the "A" flag is used the specified protocol must - be running on its default port. - - The URN Sytnax draft defines a canonical form for each URN, which - requires %encoding characters outside a limited repertoire. The - regular expressions MUST be written to operate on that canonical - form. Since international character sets will end up with extensive - use of %encoded characters, regular expressions operating on them - will be essentially impossible to read or write by hand. - -Usage -===== - - For the edification of implementers, pseudocode for a client routine - using NAPTRs is given below. This code is provided merely as a - convience, it does not have any weight as a standard way to process - NAPTR records. Also, as is the case with pseudocode, it has never - been executed and may contain logical errors. You have been warned. - - // - // findResolver(URN) - // Given a URN, find a host that can resolve it. - // - findResolver(string URN) { - // prepend prefix to urn.net - sprintf(key, "%s.urn.net", extractNS(URN)); - do { - - - -Daniel & Mealling Experimental [Page 15] - -RFC 2168 Resolution of URIs Using the DNS June 1997 - - - rewrite_flag = false; - terminal = false; - if (key has been seen) { - quit with a loop detected error - } - add key to list of "seens" - records = lookup(type=NAPTR, key); // get all NAPTR RRs for 'key' - - discard any records with an unknown value in the "flags" field. - sort NAPTR records by "order" field and "preference" field - (with "order" being more significant than "preference"). - n_naptrs = number of NAPTR records in response. - curr_order = records[0].order; - max_order = records[n_naptrs-1].order; - - // Process current batch of NAPTRs according to "order" field. - for (j=0; j < n_naptrs && records[j].order <= max_order; j++) { - if (unknown_flag) // skip this record and go to next one - continue; - newkey = rewrite(URN, naptr[j].replacement, naptr[j].regexp); - if (!newkey) // Skip to next record if the rewrite didn't - match continue; - // We did do a rewrite, shrink max_order to current value - // so that delegation works properly - max_order = naptr[j].order; - // Will we know what to do with the protocol and services - // specified in the NAPTR? If not, try next record. - if(!isKnownProto(naptr[j].services)) { - continue; - } - if(!isKnownService(naptr[j].services)) { - continue; - } - - // At this point we have a successful rewrite and we will - // know how to speak the protocol and request a known - // resolution service. Before we do the next lookup, check - // some optimization possibilities. - - if (strcasecmp(flags, "S") - || strcasecmp(flags, "P")) - || strcasecmp(flags, "A")) { - terminal = true; - services = naptr[j].services; - addnl = any SRV and/or A records returned as additional - info for naptr[j]. - } - key = newkey; - - - -Daniel & Mealling Experimental [Page 16] - -RFC 2168 Resolution of URIs Using the DNS June 1997 - - - rewriteflag = true; - break; - } - } while (rewriteflag && !terminal); - - // Did we not find our way to a resolver? - if (!rewrite_flag) { - report an error - return NULL; - } - - - // Leave rest to another protocol? - if (strcasecmp(flags, "P")) { - return key as host to talk to; - } - - // If not, keep plugging - if (!addnl) { // No SRVs came in as additional info, look them up - srvs = lookup(type=SRV, key); - } - - sort SRV records by preference, weight, ... - foreach (SRV record) { // in order of preference - try contacting srv[j].target using the protocol and one of the - resolution service requests from the "services" field of the - last NAPTR record. - if (successful) - return (target, protocol, service); - // Actually we would probably return a result, but this - // code was supposed to just tell us a good host to talk to. - } - die with an "unable to find a host" error; - } - -Notes: -====== - - - A client MUST process multiple NAPTR records in the order - specified by the "order" field, it MUST NOT simply use the first - record that provides a known protocol and service combination. - - - - - - - - - - -Daniel & Mealling Experimental [Page 17] - -RFC 2168 Resolution of URIs Using the DNS June 1997 - - - - If a record at a particular order matches the URI, but the - client doesn't know the specified protocol and service, the - client SHOULD continue to examine records that have the same - order. The client MUST NOT consider records with a higher value - of order. This is necessary to make delegation of portions of - the namespace work. The order field is what lets site - administrators say "all requests for URIs matching pattern x go - to server 1, all others go to server 2". - (A match is defined as: - 1) The NAPTR provides a replacement domain name - or - 2) The regular expression matches the URN - ) - - - When multiple RRs have the same "order", the client should use - the value of the preference field to select the next NAPTR to - consider. However, because of preferred protocols or services, - estimates of network distance and bandwidth, etc. clients may - use different criteria to sort the records. - - If the lookup after a rewrite fails, clients are strongly - encouraged to report a failure, rather than backing up to pursue - other rewrite paths. - - When a namespace is to be delegated among a set of resolvers, - regexps must be used. Each regexp appears in a separate NAPTR - RR. Administrators should do as little delegation as possible, - because of limitations on the size of DNS responses. - - Note that SRV RRs impose additional requirements on clients. - -Acknowledgments: -================= - - The editors would like to thank Keith Moore for all his consultations - during the development of this draft. We would also like to thank - Paul Vixie for his assistance in debugging our implementation, and - his answers on our questions. Finally, we would like to acknowledge - our enormous intellectual debt to the participants in the Knoxville - series of meetings, as well as to the participants in the URI and URN - working groups. - -References: -=========== - - [1] Sollins, Karen and Larry Masinter, "Functional Requirements - for Uniform Resource Names", RFC-1737, Dec. 1994. - - [2] The URN Implementors, Uniform Resource Names: A Progress Report, - http://www.dlib.org/dlib/february96/02arms.html, D-Lib Magazine, - February 1996. - - - -Daniel & Mealling Experimental [Page 18] - -RFC 2168 Resolution of URIs Using the DNS June 1997 - - - [3] Moats, Ryan, "URN Syntax", RFC-2141, May 1997. - - [4] Gulbrandsen, A. and P. Vixie, "A DNS RR for specifying - the location of services (DNS SRV)", RFC-2052, October 1996. - - [5] Daniel, Jr., Ron, "A Trivial Convention for using HTTP in URN - Resolution", RFC-2169, June 1997. - - [6] URN-WG, "URN Resolution Services", Work in Progress. - - [7] Moore, Keith, Shirley Browne, Jason Cox, and Jonathan Gettler, - Resource Cataloging and Distribution System, Technical Report - CS-97-346, University of Tennessee, Knoxville, December 1996 - - [8] Paul Vixie, personal communication. - - [9] Crocker, Dave H. "Standard for the Format of ARPA Internet Text - Messages", RFC-822, August 1982. - - [10] Orth, Charles and Bill Arms; Handle Resolution Protocol - Specification, http://www.handle.net/docs/client_spec.html - - [11] Williamson, S., M. Kosters, D. Blacka, J. Singh, K. Zeilstra, - "Referral Whois Protocol (RWhois)", RFC-2167, June 1997. - - [12] Information Retrieval (Z39.50): Application Service Definition - and Protocol Specification, ANSI/NISO Z39.50-1995, July 1995. - - [13] IEEE Standard for Information Technology - Portable Operating - System Interface (POSIX) - Part 2: Shell and Utilities (Vol. 1); - IEEE Std 1003.2-1992; The Institute of Electrical and - Electronics Engineers; New York; 1993. ISBN:1-55937-255-9 - - [14] Braden, R., "Requirements for Internet Hosts - Application and - and Support", RFC-1123, Oct. 1989. - - [15] Sollins, Karen, "Requirements and a Framework for URN Resolution - Systems", November 1996, Work in Progress. - - - - - - - - - - - - - -Daniel & Mealling Experimental [Page 19] - -RFC 2168 Resolution of URIs Using the DNS June 1997 - - -Security Considerations -======================= - - The use of "urn.net" as the registry for URN namespaces is subject to - denial of service attacks, as well as other DNS spoofing attacks. The - interactions with DNSSEC are currently being studied. It is expected - that NAPTR records will be signed with SIG records once the DNSSEC - work is deployed. - - The rewrite rules make identifiers from other namespaces subject to - the same attacks as normal domain names. Since they have not been - easily resolvable before, this may or may not be considered a - problem. - - Regular expressions should be checked for sanity, not blindly passed - to something like PERL. - - This document has discussed a way of locating a resolver, but has not - discussed any detail of how the communication with the resolver takes - place. There are significant security considerations attached to the - communication with a resolver. Those considerations are outside the - scope of this document, and must be addressed by the specifications - for particular resolver communication protocols. - -Author Contact Information: -=========================== - - Ron Daniel - Los Alamos National Laboratory - MS B287 - Los Alamos, NM, USA, 87545 - voice: +1 505 665 0597 - fax: +1 505 665 4939 - email: rdaniel@lanl.gov - - - Michael Mealling - Network Solutions - 505 Huntmar Park Drive - Herndon, VA 22070 - voice: (703) 742-0400 - fax: (703) 742-9552 - email: michaelm@internic.net - URL: http://www.netsol.com/ - - - - - - - -Daniel & Mealling Experimental [Page 20] - |