Diffstat (limited to 'share/doc')
-rw-r--r-- | share/doc/IPv6/IMPLEMENTATION | 659 |
1 files changed, 498 insertions, 161 deletions
diff --git a/share/doc/IPv6/IMPLEMENTATION b/share/doc/IPv6/IMPLEMENTATION
index 98150d5..ea1715f 100644
--- a/share/doc/IPv6/IMPLEMENTATION
+++ b/share/doc/IPv6/IMPLEMENTATION
@@ -2,11 +2,12 @@
 # Some portion of this document is not applicable to the code merged into
 # FreeBSD-current (for example, section 5).
 
-                        Implementation Note
+                        Implementation Note
 
-                        KAME Project
-                        http://www.kame.net/
-                        $FreeBSD$
+                        KAME Project
+                        http://www.kame.net/
+        $KAME: IMPLEMENTATION,v 1.216 2001/05/25 07:43:01 jinmei Exp $
+                        $FreeBSD$
 
 1. IPv6
 
@@ -27,12 +28,7 @@ RFC1639: FTP Operation Over Big Address Records (FOOBAR)
     * RFC2428 is preferred over RFC1639. ftp clients will first try RFC2428,
       then RFC1639 if failed.
 RFC1886: DNS Extensions to support IPv6
-RFC1933: Transition Mechanisms for IPv6 Hosts and Routers
-    * IPv4 compatible address is not supported.
-    * automatic tunneling (4.3) is not supported.
-    * "gif" interface implements IPv[46]-over-IPv[46] tunnel in a generic way,
-      and it covers "configured tunnel" described in the spec.
-      See 1.5 in this document for details.
+RFC1933: (see RFC2893)
 RFC1981: Path MTU Discovery for IPv6
 RFC2080: RIPng for IPv6
     * KAME-supplied route6d, bgpd and hroute6d support this.
@@ -42,8 +38,7 @@ RFC2283: Multiprotocol Extensions for BGP-4
 RFC2292: Advanced Sockets API for IPv6
     * For supported library functions/kernel APIs, see sys/netinet6/ADVAPI.
 RFC2362: Protocol Independent Multicast-Sparse Mode (PIM-SM)
-    * RFC2362 defines packet formats for PIM-SM. draft-ietf-pim-ipv6-01.txt
-      is written based on this.
+    * RFC2362 defines the packet formats and the protcol of PIM-SM.
 RFC2373: IPv6 Addressing Architecture
     * KAME supports node required addresses, and conforms to the scope
       requirement.
@@ -82,6 +77,13 @@ RFC2553: Basic Socket Interface Extensions for IPv6
       - supported but turned off by default on KAME/NetBSD,
       - not supported on KAME/FreeBSD228, KAME/OpenBSD and KAME/BSDI3.
       see 1.12 in this document for details.
+RFC2671: Extension Mechanisms for DNS (EDNS0)
+    * see USAGE for how to use it.
+    * not supported on kame/freebsd4 and kame/bsdi4.
+RFC2673: Binary Labels in the Domain Name System
+    * KAME/bsdi4 supports A6, DNAME and binary label to some extent.
+    * KAME apps/bind8 repository has resolver library with partial A6, DNAME
+      and binary label support.
 RFC2675: IPv6 Jumbograms
     * See 1.7 in this document for details.
 RFC2710: Multicast Listener Discovery for IPv6
@@ -89,35 +91,74 @@ RFC2711: IPv6 router alert option
 RFC2732: Format for Literal IPv6 Addresses in URL's
     * The spec is implemented in programs that handle URLs
       (like freebsd ftpio(3) and fetch(1), or netbsd ftp(1))
-draft-ietf-ipngwg-router-renum-10: Router renumbering for IPv6
-draft-ietf-ipngwg-icmp-name-lookups-05: IPv6 Name Lookups Through ICMP
-draft-ietf-pim-ipv6-03.txt: PIM for IPv6
-    * pim6dd implements dense mode. pim6sd implements sparse mode.
+RFC2766: Network Address Translation - Protocol Translation (NAT-PT)
+    * Section 4.2 is implemented by totd (see ports/totd, or pkgsrc/net/totd).
+RFC2874: DNS Extensions to Support IPv6 Address Aggregation and Renumbering
+    * KAME/bsdi4 supports A6, DNAME and binary label to some extent.
+    * KAME apps/bind8 repository has resolver library with partial A6, DNAME
+      and binary label support.
+RFC2893: Transition Mechanisms for IPv6 Hosts and Routers
+    * IPv4 compatible address is not supported.
+    * automatic tunneling (4.3) is not supported.
+    * "gif" interface implements IPv[46]-over-IPv[46] tunnel in a generic way,
+      and it covers "configured tunnel" described in the spec.
+      See 1.5 in this document for details.
+RFC2894: Router renumbering for IPv6
+RFC3041: Privacy Extensions for Stateless Address Autoconfiguration in IPv6
+RFC3056: Connection of IPv6 Domains via IPv4 Clouds
+    * So-called "6to4".
+    * "stf" interface implements it. Be sure to read
+      draft-itojun-ipv6-transition-abuse-01.txt
+      below before configuring it, there can be security issues.
+draft-ietf-ipngwg-icmp-name-lookups-07: IPv6 Name Lookups Through ICMP
 draft-ietf-dhc-dhcpv6-15.txt: DHCPv6
 draft-ietf-dhc-dhcpv6exts-12.txt: Extensions for DHCPv6
     * kame/dhcp6 has test implementation, which will not be compiled in
       default compilation.
+    * 15/12 drafts are not explicit about padding and string termination.
+      at IETF48, the author confirmed that there's no padding/termination
+      (and extensions can appear unaligned). our code follows the comment.
 draft-itojun-ipv6-tcp-to-anycast-00.txt:
     Disconnecting TCP connection toward IPv6 anycast address
-draft-ietf-ipngwg-scopedaddr-format-02.txt:
-    An Extension of Format for IPv6 Scoped Addresses
-draft-ietf-ngtrans-tcpudp-relay-01.txt:
+draft-ietf-ipngwg-rfc2553bis-03.txt:
+    Basic Socket Interface Extensions for IPv6 (revised)
+draft-ietf-ipngwg-rfc2292bis-02.txt:
+    Advanced Sockets API for IPv6 (revised)
+    * Some of the updates in the draft are not implemented yet. See
+      TODO.2292bis for more details.
+draft-ietf-mobileip-ipv6-13.txt: Mobility Support in IPv6
+    * See section 6.
+draft-ietf-ngtrans-tcpudp-relay-04.txt:
     An IPv6-to-IPv4 transport relay translator
     * FAITH tcp relay translator (faithd) implements this. See 3.1 for more
       details.
-draft-ietf-ngtrans-6to4-06.txt:
-    Connection of IPv6 Domains via IPv4 Clouds without Explicit Tunnels
-    * "stf" interface implements it. Be sure to read the next item before
-      configuring it, there are security issues.
-http://playground.iijlab.net/i-d/draft-itojun-ipv6-transition-abuse-00.txt:
-    Possible abuse against IPv6 transition technologies
-    * KAME does not implement RFC1933 automatic tunnel.
+draft-ietf-ipngwg-router-selection-01.txt:
+    Default Router Preferences and More-Specific Routes
+    * router-side only.
+draft-ietf-ipngwg-scoping-arch-02.txt:
+    The architecture, text representation, and usage of IPv6
+    scoped addresses.
+    * some part of the documentation (especially about the routing
+      model) is not supported yet.
+draft-ietf-pim-sm-v2-new-02.txt
+    A revised version of RFC2362, which includes the IPv6 specific
+    packet format and protocol descriptions.
+draft-ietf-dnsext-mdns-00.txt: Multicast DNS
+    * kame/mdnsd has test implementation, which will not be built in
+      default compilation. The draft will experience a major change in the
+      near future, so don't rely upon it.
+draft-itojun-ipv6-transition-abuse-02.txt:
+    Possible abuse against IPv6 transition technologies (expired)
+    * KAME does not implement RFC1933/2893 automatic tunnel.
     * "stf" interface implements some address filters. Refer to stf(4)
       for details. Since there's no way to make 6to4 interface 100% secure,
       we do not include "stf" interface into GENERIC.v6 compilation.
     * kame/openbsd completely disables IPv4 mapped address support.
     * kame/netbsd makes IPv4 mapped address support off by default.
-    * See section 12.6 and 14 for more details.
+    * See section 1.12.6 and 1.14 for more details.
+draft-itojun-ipv6-tclass-api-02.txt: Socket API for IPv6 traffic class field
+draft-itojun-ipv6-flowlabel-api-01.txt: Socket API for IPv6 flow label field
+    * no consideration is made against the use of routing headers and such.
 
 1.2 Neighbor Discovery
 
@@ -145,7 +186,10 @@ Some of network drivers loop multicast packets back to themselves,
 even if instructed not to do so (especially in promiscuous mode).
 In such cases DAD may fail, because DAD engine sees inbound NS packet
 (actually from the node itself) and considers it as a sign of duplicate.
-You may want to look at #if condition marked "heuristics" in
+In this case, drivers should be corrected to honor IFF_SIMPLEX behavior.
+For example, you may need to check source MAC address on a inbound packet,
+and reject it if it is from the node itself.
+You may also want to look at #if condition marked "heuristics" in
 sys/netinet6/nd6_nbr.c:nd6_dad_timer() as workaround (note that the code
 fragment in "heuristics" section is not spec conformant).
 
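Reader's note on the next hunk (section 1.3.1): the patched text says the kernel keeps the link index of a link-local address in the 2nd 16bit-word (the 3rd and 4th bytes) of the address, giving forms such as fe80:1::200:f8ff:fe01:6317. The following is only a minimal userland sketch of that byte layout in Python, not KAME kernel code. The function names embed_link_index and split_link_index are hypothetical and do not appear in the patch.

```python
import ipaddress
from typing import Tuple


def embed_link_index(addr: str, index: int) -> str:
    """Store `index` in bytes 3 and 4 of a link-local address.

    Hypothetical helper: it mirrors the kernel-internal form described in
    section 1.3.1, but it is not an API of any real stack.
    """
    ip = ipaddress.IPv6Address(addr)
    if not ip.is_link_local:
        raise ValueError("this sketch only handles link-local addresses")
    if not 0 <= index <= 0xFFFF:
        raise ValueError("link index must fit in 16 bits")
    raw = bytearray(ip.packed)
    raw[2:4] = index.to_bytes(2, "big")  # the 2nd 16-bit word
    return str(ipaddress.IPv6Address(bytes(raw)))


def split_link_index(embedded: str) -> Tuple[str, int]:
    """Undo embed_link_index: return (plain address, link index)."""
    raw = bytearray(ipaddress.IPv6Address(embedded).packed)
    index = int.from_bytes(raw[2:4], "big")
    raw[2:4] = b"\x00\x00"  # clear the embedded word again
    return str(ipaddress.IPv6Address(bytes(raw))), index


if __name__ == "__main__":
    print(embed_link_index("fe80::200:f8ff:fe01:6317", 1))
    print(split_link_index("fe80:1::200:f8ff:fe01:6317"))
```

Programs that read addresses through routing sockets or ioctls would see the embedded form, which is why the patched text warns that such programs must be prepared for kernel differences.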
@@ -218,84 +262,157 @@ upper-layer hints to be accepted.
  non-root process - after local discussion, it looks that hints are not
  that trustworthy even if they are from privileged processes)
 
-1.3 Scope Index
+If inbound ND packets carry invalid values, the KAME kernel will
+drop these packet and increment statistics variable. See
+"netstat -sn", icmp6 section. For detailed debugging session, you can
+turn on syslog output from the kernel on errors, by turning on sysctl MIB
+net.inet6.icmp6.nd6_debug. nd6_debug can be turned on at bootstrap
+time, by defining ND6_DEBUG kernel compilation option (so you can
+debug behavior during bootstrap). nd6_debug configuration should
+only be used for test/debug purposes - for production environment,
+nd6_debug must be set to 0. If you leave it to 1, malicious parties
+can inject broken packet and fill up /var/log partition.
+
+1.3 Scope Zone Index
 
 IPv6 uses scoped addresses. It is therefore very important to
-specify scope index (interface index for link-local address, or
-site index for site-local address) with an IPv6 address. Without
-scope index, a scoped IPv6 address is ambiguous to the kernel, and
-the kernel will not be able to determine the outbound interface for a
-packet. KAME code tries to address the issue in several ways.
-
-Site-local address is very vaguely defined in the specs, and both specification
-and KAME code need tons of improvements to enable its actual use.
-For example, it is still very unclear how we define a site, or how we resolve
-hostnames in a site. There are work underway to define behavior of routers
-at site border, however, we have almost no code for site boundary node support
-(both forwarding nor routing) and we bet almost noone has.
-We recommend, at this moment, you to use global addresses for experiments -
-there are way too many pitfalls if you use site-local addresses.
+specify the scope zone index (link index for a link-local address, or
+site index for a site-local address) with an IPv6 address. Without a
+zone index, a scoped IPv6 address is ambiguous to the kernel, and
+the kernel would not be able to determine the outbound link for a
+packet to the scoped address. KAME code tries to address the issue in
+several ways.
+
+The entire architecture of scoped addresses is documented in
+draft-ietf-ipngwg-scoping-arch-xx.txt. One non-trivial point of the
+architecture is that the link scope is (theoretically) larger than the
+interface scope. That is, two different interfaces can belong to a
+same single link. However, in a normal operation, we can assume that
+there is 1-to-1 relationship between links and interfaces. In
+other words, we can usually put links and interfaces in the same scope
+type. The current KAME implementation assumes the 1-to-1
+relationship. In particular, we use interface names such as "ne1" as
+unique link identifiers. This would be much more human-readable and
+intuitive than numeric identifiers, but please keep your mind on the
+theoretical difference between links and interfaces.
+
+Site-local addresses are very vaguely defined in the specs, and both
+the specification and the KAME code need tons of improvements to
+enable its actual use. For example, it is still very unclear how we
+define a site, or how we resolve host names in a site. There is work
+underway to define behavior of routers at site border, but, we have
+almost no code for site boundary node support (both forwarding nor
+routing) and we bet almost noone has. We recommend, at this moment,
+you to use global addresses for experiments - there are way too many
+pitfalls if you use site-local addresses.
 
 1.3.1 Kernel internal
 
-In the kernel, the interface index for a link-local scope address is
+In the kernel, the link index for a link-local scope address is
 embedded into the 2nd 16bit-word (the 3rd and 4th bytes) in the IPv6
 address.
 For example, you may see something like:
 	fe80:1::200:f8ff:fe01:6317
-in the routing table and interface address structure (struct
-in6_ifaddr). The address above is a link-local unicast address
-which belongs to a network interface whose interface identifier is 1.
-The embedded index enables us to identify IPv6 link local
-addresses over multiple interfaces effectively and with only a
+in the routing table and the interface address structure (struct
+in6_ifaddr). The address above is a link-local unicast address which
+belongs to a network link whose link identifier is 1 (note that it
+eqauls to the interface index by the assumption of our
+implementation). The embedded index enables us to identify IPv6
+link-local addresses over multiple links effectively and with only a
 little code change.
 
 1.3.2 Interaction with API
 
-Ordinary userland applications should use the advanced API (RFC2292)
-to specify scope index, or interface index. For the similar purpose,
-the sin6_scope_id member in the sockaddr_in6 structure is defined in
-RFC2553. However, the semantics for sin6_scope_id is rather vague.
-If you care about portability of your application, we suggest you to
-use the advanced API rather than sin6_scope_id.
-
-Routing daemons and configuration programs, like route6d and
-ifconfig, will need to manipulate the "embedded" scope index.
-These programs use routing sockets and ioctls (like SIOCGIFADDR_IN6)
-and the kernel API will return IPv6 addresses with 2nd 16bit-word
-filled in. The APIs are for manipulating kernel internal structure.
-Programs that use these APIs have to be prepared about differences
-in kernels anyway.
-
-getaddrinfo(3) and getnameinfo(3) are modified to support extended numeric
-IPv6 syntax, as documented in draft-ietf-ipngwg-scopedaddr-format-xx.txt.
-You can specify outgoing link, by using name of the outgoing interface
-like "fe80::1%ne0". This way you will be able to specify link-local scoped
-address without much trouble.
-To use this extension in your program, you'll need to use getaddrinfo(3),
-and getnameinfo(3) with NI_WITHSCOPEID.
-The implementation currently assumes 1-to-1 relationship between a link and an
-interface, which is stronger than what IPv6 specs say.
-Other APIs like inet_pton(3) or getipnodebyname(3) are inherently unfriendly
-with scoped addresses, since they are unable to annotate addresses with
-scope identifier.
+There are several candidates of API to deal with scoped addresses
+without ambiguity.
+
+The IPV6_PKTINFO ancillary data type or socket option defined in the
+advanced API (RFC2292 or draft-ietf-ipngwg-rfc2292bis-xx) can specify
+the outgoing interface of a packet. Similarly, the IPV6_PKTINFO or
+IPV6_RECVPKTINFO socket options tell kernel to pass the incoming
+interface to user applications.
+
+These options are enough to disambiguate scoped addresses of an
+incoming packet, because we can uniquely identify the corresponding
+zone of the scoped address(es) by the incoming interface. However,
+they are too strong for outgoing packets. For example, consider a
+multi-sited node and suppose that more than one interface of the node
+belongs to a same site. When we want to send a packet to the site,
+we can only specify one of the interfaces for the outgoing packet with
+these options; we cannot just say "send the packet to (one of the
+interfaces of) the site."
+
+Another kind of candidates is to use the sin6_scope_id member in the
+sockaddr_in6 structure, defined in RFC2553 and
+draft-ietf-ipngwg-rfc2553bis-xx.txt. The KAME kernel interprets the
+sin6_scope_id field properly in order to disambiguate scoped
+addresses. For example, if an application passes a sockaddr_in6
+structure that has a non-zero sin6_scope_id value to the sendto(2)
+system call, the kernel should send the packet to the appropriate zone
+according to the sin6_scope_id field. Similarly, when the source or
+the destination address of an incoming packet is a scoped one, the
+kernel should detect the correct zone identifier based on the address
+and the receiving interface, fill the identifier in the sin6_scope_id
+field of a sockaddr_in6 structure, and then pass the packet to an
+application via the recvfrom(2) system call, etc.
+
+However, the semantics of the sin6_scope_id is still vague and on the
+way to standardization. Additionally, not so many operating systems
+support the behavior above at this moment.
+
+In summary,
+- If your target system is limited to KAME based ones (i.e. BSD
+  variants and KAME snaps), use the sin6_scope_id field assuming the
+  kernel behavior described above.
+- Otherwise, (i.e. if your program should be portable on other systems
+  than BSDs)
+  + Use the advanced API to disambiguate scoped addresses of incoming
+    packets.
+  + To disambiguate scoped addresses of outgoing packets,
+    * if it is okay to just specify the outgoing interface, use the
+      advanced API. This would be the case, for example, when you
+      should only consider link-local addresses and your system
+      assumes 1-to-1 relationship between links and interfaces.
+    * otherwise, sorry but you lose. Please rush the IETF IPv6
+      community into standardizing the semantics of the sin6_scope_id
+      field.
+
+Routing daemons and configuration programs, like route6d and ifconfig,
+will need to manipulate the "embedded" zone index. These programs use
+routing sockets and ioctls (like SIOCGIFADDR_IN6) and the kernel API
+will return IPv6 addresses with the 2nd 16bit-word filled in. The
+APIs are for manipulating kernel internal structure. Programs that
+use these APIs have to be prepared about differences in kernels
+anyway.
+
+getaddrinfo(3) and getnameinfo(3) support an extended numeric IPv6
+syntax, as documented in draft-ietf-ipngwg-rfc2553bis-xx.txt. You can
+specify the outgoing link, by using the name of the outgoing interface
+as the link, like "fe80::1%ne0" (again, note that we assume there is
+1-to-1 relationship between links and interfaces.) This way you will
+be able to specify a link-local scoped address without much trouble.
+
+Other APIs like inet_pton(3) and inet_ntop(3) are inherently
+unfriendly with scoped addresses, since they are unable to annotate
+addresses with zone identifier.
 
 1.3.3 Interaction with users (command line)
 
-Most of user applications now support an extended numeric IPv6 syntax,
-as documented in draft-ietf-ipngwg-scopedaddr-format-xx.txt. In this
-case, you can specify outgoing link, by using the name of the outgoing
-interface like "fe80::1%ne0". This is the case for some management
-tools such as route(8) or ndp(8). For example, to install the IPv6
-default route by hand, you can type like
+Most of user applications now support the extended numeric IPv6
+syntax. In this case, you can specify outgoing link, by using the name
+of the outgoing interface like "fe80::1%ne0" (sorry for the duplicated
+notice, but please recall again that we assume 1-to-1 relationship
+between links and interfaces). This is even the case for some
+management tools such as route(8) or ndp(8). For example, to install
+the IPv6 default route by hand, you can type like
 	# route add -inet6 default fe80::9876:5432:1234:abcd%ne0
 (Although we suggest you to run dynamic routing instead of static
 routes, in order to avoid configuration mistakes.)
 
 Some applications have command line options for specifying an
 appropriate zone of a scoped address (like "ping6 -I ne0 ff02::1" to
-specify the outgoing interface). However, you can't always expect such
-options. Thus, we recommend you to use the extended format described
+specify the outgoing interface). However, you can't always expect such
+options. Thus, we recommend you to use the extended format described
 above.
 
 In any case, when you specify a scoped address to the command line,
@@ -401,12 +518,6 @@ To summarize the sysctl knob:
 	1		1		invalid, or experimental
 					(out-of-scope of spec)
 
-RFC2462 has validation rules against incoming RA prefix information option,
-in 5.5.3 (e). This is to protect hosts from malicious (or misconfigured)
-routers that advertise very short prefix lifetime.
-There was an update from Jim Bound to ipngwg mailing list (look
-for "(ipng 6712)" in the archive) and KAME implements Jim's update.
-
 See 1.2 in the document for relationship between DAD and autoconfiguration.
 
 1.4.3 DHCPv6
@@ -450,22 +561,25 @@ automatically assigned to the gif interface.
 KAME's source address selection takes care of the following
 conditions:
 - address scope
-- prefix matching against the destination
 - outgoing interface
 - whether an address is deprecated
+- whether an address is temporary (in terms of RFC 3041)
+- prefix matching against the destination
 
 Roughly speaking, the selection policy is as follows:
 - always use an address that belongs to the same scope zone as the
   destination.
 - addresses that have equal or larger scope than the scope of the
   destination are preferred.
-- if multiple addresses have the equal scope, one which is longest
-  prefix matching against the destination is preferred.
 - a deprecated address is not used in new communications if an
   alternate (non-deprecated) address is available and has sufficient
   scope.
+- a temporary address (in terms of RFC 3041 privacy extension) are
+  preferred to a public address.
 - if none of above conditions tie-breaks, addresses assigned on the
   outgoing interface are preferred.
+- if none of above conditions tie-breaks, one which is longest prefix
+  matching against the destination is preferred as the last resort.
 
 For instance, ::1 is selected for ff01::1,
 fe80::200:f8ff:fe01:6317%ne0 for fe80::2a0:24ff:feab:839b%ne0.
@@ -484,47 +598,68 @@ The precise desripction of the algorithm is quite complicated.
 To describe the algorithm, we introduce the following notation:
 
 For a given destination D,
-  samescope(D): A set of addresses that have the same scope as D.
-  largerscope(D): A set of addresses that have a larger scope than D.
-  smallerscope(D): A set of addresses that have a smaller scope than D.
+  samescope(D): The set of addresses that have the same scope as D.
+  largerscope(D): The set of addresses that have a larger scope than D.
+  smallerscope(D): The set of addresses that have a smaller scope than D.
 
 For a given set of addresses A,
-  DEP(A): a set of deprecated addresses in A.
+  DEP(A): the set of deprecated addresses in A.
   nonDEP(A): A - DEP(A).
 
+For a given set of addresses A,
+  tmp(A): the set of preferred temporary-autoconfigured or
+          manually-configure addresses in A.
+
 Also, the algorithm assumes that the outgoing interface for the
 destination D is determined. We call the interface "I".
 
 The algorithm is as follows. Selection proceeds step by step as
-described; For example, if an address is selected by item 1, item 2 or
+described; For example, if an address is selected by item 1, item 2 and
 later are not considered at all.
 
   0. If there is no address in the same scope zone as D, just give up;
      the packet will not be sent.
-  1. If nonDEP(samescope(D)) is not empty,
-     choose a longest matching address against D. If more than one
-     address is longest matching, choose arbitrary one provided that
-     an address on I is always preferred.
-  2. If nonDEP(largerscope(D)) is not empty,
-     choose an address that has the smallest scope. If more than one
-     address has the smallest scope, choose arbitrary one provided
+  1. If we do not prefer temporary addresses, go to 3.
+     Otherwise, and if tmp(samescope(D)) is not empty,
+     then choose an address that is on the interface I. If every
+     address is on I, or every address is on a different interface
+     from I, choose an arbitrary one provided that an address longest
+     matching against D is always preferred.
+  2. If tmp(largerscope(D)) is not empty,
+     then choose an address that has the smallest scope. If more than one
+     address has the smallest scope, choose an arbitrary one provided
+     that addresses on I are always preferred.
+  3. If nonDEP(samescope(D)) is not empty,
+     then apply the same logic as of 1.
+  4. If nonDEP(largerscope(D)) is not empty,
+     then apply the same logic as of 2.
+  5. If we do not prefer temporary addresses, go to 7.
+     Otherwise, and if tmp(DEP(samescope(D))) is not empty,
+     then choose an address that is on the interface I. If every
+     address is on I, or every address is on a different interface
+     from I, choose an arbitrary one provided that an address longest
+     matching against D is always preferred.
+  6. If tmp(DEP(largerscope(D))) is not empty,
+     then choose an address that has the smallest scope. If more than
+     one address has the smallest scope, choose an arbitrary one provided
      that an address on I is always preferred.
-  3. If DEP(samescope(D)) is not empty,
-     choose a longest matching address against D. If more than one
-     address is longest matching, choose arbitrary one provided that
-     an address on I is always preferred.
-  4. If DEP(largerscope(D)) is not empty,
-     choose an address that has the smallest scope. If more than one
-     address has the smallest scope, choose arbitrary one provided
+  7. If DEP(samescope(D)) is not empty,
+     then apply the same logic as of 5.
+  8. If DEP(largerscope(D)) is not empty,
+     then apply the same logic as of 6.
+  9. If we do not prefer temporary addresses, go to 11.
+     Otherwise, and if tmp(nonDEP(smallerscope(D))) is not empty,
+     then choose an address that has the largest scope. If more than
+     one address has the largest scope, choose an arbitrary one provided
      that an address on I is always preferred.
-  5. if nonDEP(smallerscope(D)) is not empty,
-     choose an address that has the largest scope. If more than one
-     address has the largest scope, choose arbitrary one provided
-     that an address on I is always preferred.
-  6. if DEP(smallerscope(D)) is not empty,
-     choose an address that has the largest scope. If more than one
-     address has the largest scope, choose arbitrary one provided
+ 10. If tmp(DEP(smallerscope(D))) is not empty,
+     then choose an address that has the largest scope. If more than
+     one address has the largest scope, choose an arbitrary one provided
      that an address on I is always preferred.
+ 11. If nonDEP(smallerscope(D)) is not empty,
+     then apply the same logic as of 9.
+ 12. If DEP(smallerscope(D)) is not empty,
+     then apply the same logic as of 10.
 
 There exists a document about source address selection
 (draft-ietf-ipngwg-default-addr-select-xx.txt). KAME's algorithm
@@ -622,6 +757,15 @@ overflow due to long function call chain. KAME sys/netinet6 code
 is carefully designed to avoid kernel stack overflow. Because of this, KAME
 sys/netinet6 code defines its own protocol switch structure, as
 "struct ip6protosw" (see netinet6/ip6protosw.h).
+
+In addition to this, we restrict the number of extension headers
+(including the IPv6 header) in each incoming packet, in order to
+prevent a DoS attack that tries to send packets with a massive number
+of extension headers. The upper limit can be configured by the sysctl
+value net.inet6.ip6.hdrnestlimit. In particular, if the value is 0,
+the node will allow an arbitrary number of headers. As of writing this
+document, the default value is 50.
+
 IPv4 part (sys/netinet) remains untouched for compatibility.
 Because of this, if you receive IPsec-over-IPv4 packet with massive
 number of IPsec headers, kernel stack may blow up. IPsec-over-IPv6 is okay.
@@ -823,6 +967,9 @@ and initiating side).
 AF_INET6 and AF_INET sockets are totally separated.
 Port number space is totally separate between AF_INET and AF_INET6 sockets.
 
+It should be noted that KAME/BSDI3 and KAME/FreeBSD228 are not conformant
+to RFC2553 section 3.7 and 3.8. It is due to code sharing reasons.
+
 1.12.2 KAME/FreeBSD[34]x
 
 KAME/FreeBSD3x and KAME/FreeBSD4x use shared tcp4/6 code (from
@@ -840,7 +987,7 @@ Wildcard AF_INET6 socket grabs IPv4 connection if and only if the following
 conditions are satisfied:
 - there's no AF_INET socket that matches the IPv4 connection
 - the AF_INET6 socket is configured to accept IPv4 traffic, i.e.
-  getsockopt(IPV6_BINDV6ONLY) returns 0.
+  getsockopt(IPV6_V6ONLY) returns 0.
 
 (XXX need checking)
 
@@ -859,15 +1006,19 @@ udp4/6 code (from sys/netinet/udp*). The implementation is made differently
 from KAME/FreeBSD[34]x. KAME/NetBSD uses separate inpcb/in6pcb structures,
 while KAME/FreeBSD[34]x uses merged inpcb structure.
 
+It should be noted that the default configuration of KAME/NetBSD is not
+conformant to RFC2553 section 3.8. It is intentionally turned off by default
+for security reasons.
+
 1.12.3.1 KAME/NetBSD, listening side
 
 The platform can be configured to support IPv4 mapped address/special
 AF_INET6 wildcard bind (disabled by default). Kernel behavior can be
 summarized as follows:
 - default: special support code will be compiled in, but is disabled by
-  default. It can be controlled by sysctl (net.inet6.ip6.bindv6only),
-  or setsockopt(IPV6_BINDV6ONLY).
-- add "INET6_BINDV6ONLY": No special support code for AF_INET6 wildcard socket
+  default. It can be controlled by sysctl (net.inet6.ip6.v6only),
+  or setsockopt(IPV6_V6ONLY).
+- add "INET6_V6ONLY": No special support code for AF_INET6 wildcard socket
   will be compiled in. AF_INET6 sockets and AF_INET sockets are totally
   separate. The behavior is similar to what described in 1.12.1.
 
@@ -881,7 +1032,7 @@ Wildcard AF_INET6 socket grabs IPv4 connection if and only if the following
 conditions are satisfied:
 - there's no AF_INET socket that matches the IPv4 connection
 - the AF_INET6 socket is configured to accept IPv4 traffic, i.e.
-  getsockopt(IPV6_BINDV6ONLY) returns 0.
+  getsockopt(IPV6_V6ONLY) returns 0.
 
 You cannot bind(2) with IPv4 mapped address. This is a workaround for port
 number duplicate and other twists.
@@ -919,6 +1070,9 @@ KAME/BSDi4 supports connection initiation to IPv4 mapped address
 KAME/OpenBSD uses NRL-based TCP/UDP stack and inpcb source code,
 which was derived from NRL IPv6/IPsec stack.
 
+It should be noted that KAME/OpenBSD is not conformant to RFC2553 section 3.7
+and 3.8. It is intentionally omitted for security reasons.
+
 1.12.5.1 KAME/OpenBSD, listening side
 
 KAME/OpenBSD disables special behavior on AF_INET6 wildcard bind for
@@ -955,7 +1109,7 @@ mapped address or not. This adds many twists:
 		use EPSV/EPRT or LPSV/LPRT;	/*IPv6*/
 	else
 		error;
-  Under SIIT environment, the correct code would be:
+  The correct code, with consideration to IPv4 mapped address, would be:
 	if (sa_family == AF_INET)
 		use EPSV/EPRT or PASV/PORT;	/*IPv4*/
 	else if (sa_family == AF_INET6 && IPv4 mapped address)
@@ -970,17 +1124,40 @@ mapped address or not. This adds many twists:
 - By enabling kernel support for IPv4 mapped address (outgoing direction),
   servers on the kernel can be hosed by IPv6 native packet that has IPv4
   mapped address in IPv6 header source, and can generate unwanted IPv4 packets.
-  http://playground.iijlab.net/i-d/draft-itojun-ipv6-transition-abuse-00.txt
-  talks more about this scenario.
+  draft-itojun-ipv6-transition-abuse-01.txt talks more about this scenario.
 
 Due to the above twists, some of KAME userland programs has restrictions on
 the use of IPv4 mapped addresses:
 - rshd/rlogind do not accept connections from IPv4 mapped address.
   This is to avoid malicious use of IPv4 mapped address in IPv6 native
   packet, to bypass source-address based authentication.
-- ftp/ftpd does not support SIIT environment. IPv4 mapped address will be
-  decoded in userland, and will be passed to AF_INET sockets
-  (SIIT client should pass IPv4 mapped address as is, to AF_INET6 sockets).
+- ftp/ftpd assume that you are on dual stack network. IPv4 mapped address
+  will be decoded in userland, and will be passed to AF_INET sockets
+  (in other words, ftp/ftpd do not support SIIT environment).
+
+1.12.7 Interaction with SIIT translator
+
+SIIT translator is specified in RFC2765. KAME node cannot become a SIIT
+translator box, nor SIIT end node (a node in SIIT cloud).
+
+To become a SIIT translator box, we need to put additional code for that.
+We do not have the code in our tree at this moment.
+
+There are multiple reasons that we are unable to become SIIT end node.
+(1) SIIT translators require end nodes in the SIIT cloud to be IPv6-only.
+Since we are unable to compile INET-less kernel, we are unable to become
+SIIT end node. (2) As presented in 1.12.6, some of our userland code assumes
+dual stack network. (3) KAME stack filters out IPv6 packets with IPv4
+mapped address in the header, to secure non-SIIT case (which is much more
+common). Effectively KAME node will reject any packets via SIIT translator
+box. See section 1.14 for more detail about the last item.
+
+There are documentation issues too - SIIT document requires very strange
+things. For example, SIIT document asks IPv6-only (meaning no IPv4 code)
+node to be able to construct IPv4 IPsec headers. If a node knows how to
+construct IPv4 IPsec headers, that is not an IPv6-only node, it is a dual-stack
+node. The requirements imposed in SIIT document contradict with the other
+part of the document itself.
 
1.13 sockaddr_storage @@ -1036,22 +1213,45 @@ or bypass security controls: IPv4 address (if they are in IPv6 native packet header, they are malicious) ::ffff:0.0.0.0/104 ::ffff:127.0.0.0/104 ::ffff:224.0.0.0/100 ::ffff:255.0.0.0/104 -- 6to4 prefix generated from unspecified/multicast/loopback/broadcast/private - IPv4 address +- 6to4 (RFC3056) prefix generated from unspecified/multicast/loopback/ + broadcast/private IPv4 address 2002:0000::/24 2002:7f00::/24 2002:e000::/24 2002:ff00::/24 2002:0a00::/24 2002:ac10::/28 2002:c0a8::/32 - -Also, since KAME does not support RFC1933 auto tunnels, seeing IPv4 compatible -is very rare. You should take caution if you see those on the wire. +- IPv4 compatible address that embeds unspecified/multicast/loopback/broadcast + IPv4 address (if they are in IPv6 native packet header, they are malicious). + Note that, since KAME does not support RFC1933/2893 auto tunnels, KAME nodes + are not vulnerable to these packets. + ::0.0.0.0/104 ::127.0.0.0/104 ::224.0.0.0/100 ::255.0.0.0/104 + +Also, since KAME does not support RFC1933/2893 auto tunnels, seeing IPv4 +compatible is very rare. You should take caution if you see those on the wire. + +If we see IPv6 packets with IPv4 mapped address (::ffff:0.0.0.0/96) in the +header in dual-stack environment (not in SIIT environment), they indicate +that someone is trying to impersonate IPv4 peer. The packet should be dropped. + +IPv6 specifications do not talk very much about IPv6 unspecified address (::) +in the IPv6 source address field. Clarification is in progress. +Here are a couple of comments: +- IPv6 unspecified address can be used in IPv6 source address field, if and + only if we have no legal source address for the node. The legal situations + include, but may not be limited to, (1) MLD while no IPv6 address is assigned + to the node and (2) DAD. +- If IPv6 TCP packet has IPv6 unspecified address, it is an attack attempt. + The form can be used as a trigger for TCP DoS attack. 
KAME code already + filters them out. +- The following examples are seemingly illegal. It seems that there's general + consensus among ipngwg for those. (1) mobile-ip6 home address option, + (2) offlink packets (so routers should not forward them). + KAME implements (2) already. KAME code is carefully written to avoid such incidents. More specifically, KAME kernel will reject packets with certain source/destination address in IPv6 base header, or IPv6 routing header. Also, KAME default configuration file is written carefully, to avoid those attacks. -http://playground.iijlab.net/i-d/draft-itojun-ipv6-transition-abuse-00.txt -talks about more about this. +draft-itojun-ipv6-transition-abuse-01.txt talks more about this. 1.15 Node's required addresses @@ -1098,6 +1298,38 @@ like ff02::9 for RIPng. Users can join groups by using appropriate system calls like setsockopt(2). +1.16 Advanced API + +Current KAME kernel implements 2292bis API, documented in +draft-ietf-ipngwg-rfc2292bis-xx.txt. It also implements RFC2292 API, +for backward compatibility purposes with *BSD-integrated codebase. +KAME tree ships with 2292bis headers. +*BSD-integrated codebase implements either RFC2292, or 2292bis, API. +See "COVERAGE" document for detailed implementation status. + +Here are a couple of issues to mention: +- *BSD-integrated binaries, compiled for RFC2292, will work on KAME kernel. + For example, OpenBSD 2.7 /sbin/rtsol will work on KAME/openbsd kernel. +- KAME binaries, compiled using 2292bis, will not work on *BSD-integrated + kernel. For example, KAME /usr/local/v6/sbin/rtsol will not work on + OpenBSD 2.7 kernel. +- 2292bis API is not compatible with RFC2292 API. 2292bis #define symbols + conflict with RFC2292 symbols. Therefore, if you compile programs that + assume RFC2292 API, the compilation itself goes fine, however, the compiled + binary will not work correctly. The problem is not KAME issue, but API + issue. For example, Solaris 8 implements 2292bis API. 
If you compile + RFC2292-based code on Solaris 8, the binary can behave strange. + +There are few (or couple of) incompatible behavior in RFC2292 binary backward +compatibility support in KAME tree. To enumerate: +- Type 0 routing header lacks support for strict/loose bitmap. + Even if we see packets with "strict" bit set, those bits will not be made + visible to the userland. + Background: RFC2292 document is based on RFC1883 IPv6, and it uses + strict/loose bitmap. 2292bis document is based on RFC2460 IPv6, and it has + no strict/loose bitmap (it was removed from RFC2460). KAME tree obeys + RFC2460 IPv6, and lacks support for strict/loose bitmap. + 2. Network Drivers KAME requires three items to be added into the standard drivers: @@ -1228,7 +1460,16 @@ Here is a list of FreeBSD 3.x-RELEASE drivers and its conditions: More drivers will just simply work on KAME FreeBSD 3.x-RELEASE but have not been checked yet. -2.5 OpenBSD 2.x +2.5 FreeBSD 4.x-RELEASE + +Here is a list of FreeBSD 4.x-RELEASE drivers and its conditions: + + driver multicast + --- --- + (Ethernet) + lnc/vmware ok + +2.6 OpenBSD 2.x Here is a list of OpenBSD 2.x drivers and its conditions: @@ -1246,7 +1487,7 @@ Here is a list of OpenBSD 2.x drivers and its conditions: configuration. This happens with certain revision of chipset on the card. Should be fixed by now by workaround in sys/net/if.c, but still not sure. -2.6 BSD/OS 4.x +2.7 BSD/OS 4.x The following lists BSD/OS 4.x device drivers and its conditions: @@ -1306,11 +1547,11 @@ the connection will be relayed toward IPv4 destination 163.221.202.12. faithd must be invoked on FAITH-relay dual stack node. For more details, consult kame/kame/faithd/README and -draft-ietf-ngtrans-tcpudp-relay-01.txt. +draft-ietf-ngtrans-tcpudp-relay-04.txt. 3.2 IPv6-to-IPv4 header translator -# removed since it is not imported to FreeBSD-current +(to be written) 4. 
IPsec @@ -1324,6 +1565,9 @@ Note that KAME/OpenBSD does NOT include support for KAME IPsec code, as OpenBSD team has their home-brew IPsec stack and they have no plan to replace it. IPv6 support for IPsec is, therefore, lacking on KAME/OpenBSD. +http://www.netbsd.org/Documentation/network/ipsec/ has more information +including usage examples. + 4.1 Policy Management The kernel implements experimental policy management code. There are two way @@ -1383,10 +1627,8 @@ Tunnel mode works basically fine, but comes with the following restrictions: IPsec tunnel as pseudo interfaces. - Authentication model for AH tunnel must be revisited. We'll need to improve the policy management engine, eventually. -- Tunnelling for IPv6 IPsec is still incomplete. This is disabled by default. - If you need to perform experiments, add "options IPSEC_IPV6FWD" into - the kernel configuration file. Note that path MTU discovery does not work - across IPv6 IPsec tunnel gateway due to insufficient code. +- Path MTU discovery does not work across IPv6 IPsec tunnel gateway due to + insufficient code. AH specification does not talk much about "multiple AH on a packet" case. We incrementally compute AH checksum, from inside to outside. Also, we @@ -1401,6 +1643,16 @@ we do it incrementally. As a result, we get crypto checksums like below: Also note that AH3 has the smallest sequence number, and AH1 has the largest sequence number. +To avoid traffic analysis on shorter packets, ESP output logic supports +random length padding. By setting net.inet.ipsec.esp_randpad (or +net.inet6.ipsec6.esp_randpad) to positive value N, you can ask the kernel +to randomly pad packets shorter than N bytes, to random length smaller than +or equal to N. Note that N does not include ESP authentication data length. +Also note that the random padding is not included in TCP segment +size computation. Negative value will turn off the functionality. +Recommended value for N is 128 or 256. 
If you use too large a number +as N, you may experience inefficiency due to fragmented packets. + 4.4 IPComp handling IPComp stands for IP payload compression protocol. This is aimed for @@ -1448,13 +1700,15 @@ Here are some points to be noted: The IPsec code in the kernel conforms (or, tries to conform) to the following standards: "old IPsec" specification documented in rfc182[5-9].txt - "new IPsec" specification documented in rfc240[1-6].txt, rfc241[01].txt, - rfc2451.txt and draft-mcdonald-simple-ipsec-api-01.txt (draft expired, - but you can take from ftp://ftp.kame.net/pub/internet-drafts/). - (NOTE: IKE specifications, rfc240[7-9].txt are implemented in userland, - as "racoon" IKE daemon) + "new IPsec" specification documented in: + rfc240[1-6].txt rfc241[01].txt rfc2451.txt + draft-mcdonald-simple-ipsec-api-01.txt + (expired, available in ftp://ftp.kame.net/pub/internet-drafts/) + draft-ietf-ipsec-ciph-aes-cbc-00.txt IPComp: RFC2393: IP Payload Compression Protocol (IPComp) +IKE specifications (rfc240[7-9].txt) are implemented in userland +as "racoon" IKE daemon. 
Currently supported algorithms are: old IPsec AH @@ -1472,6 +1726,9 @@ Currently supported algorithms are: keyed SHA1 with 96bit crypto checksum (no document) HMAC MD5 with 96bit crypto checksum (rfc2403.txt HMAC SHA1 with 96bit crypto checksum (rfc2404.txt) + HMAC SHA2-256 with 96bit crypto checksum (no document) + HMAC SHA2-384 with 96bit crypto checksum (no document) + HMAC SHA2-512 with 96bit crypto checksum (no document) new IPsec ESP null encryption (rfc2410.txt) DES-CBC with derived IV @@ -1480,7 +1737,9 @@ Currently supported algorithms are: 3DES-CBC with explicit IV (rfc2451.txt) BLOWFISH CBC (rfc2451.txt) CAST128 CBC (rfc2451.txt) - RC5 CBC (rfc2451.txt) + RIJNDAEL/AES CBC (draft-ietf-ipsec-ciph-aes-cbc-00.txt, + uses IANA-assigned protocol number) + TWOFISH CBC (draft-ietf-ipsec-ciph-aes-cbc-00.txt) each of the above can be combined with: ESP authentication with HMAC-MD5(96bit) ESP authentication with HMAC-SHA1(96bit) @@ -1560,26 +1819,104 @@ coverage for IPsec crypto algorithms documented in RFC (we do not cover algorithms with intellectual property issues, though). Here are (some of) platforms we have tested IPsec/IKE interoperability -in the past, in no particular order. Note that both ends (KAME and +in the past, no particular order. Note that both ends (KAME and others) may have modified their implementation, so use the following list just for reference purposes. 
- Altiga, Ashley-laurent (vpcom.com), Data Fellows (F-Secure), - BlueSteel, CISCO, Ericsson, ACC, Fitel, FreeS/WAN, HITACHI, IBM - AIX, IIJ, Intel, Microsoft WinNT, NAI PGPnet, - NIST (linux IPsec + plutoplus), Netscreen, OpenBSD isakmpd, Radguard, - RedCreek, Routerware, SSH, Secure Computing, Soliton, Toshiba, - TIS/NAI Gauntret, VPNet, Yamaha RT100i + ACC, allied-telesis, Altiga, Ashley-laurent (vpcom.com), BlueSteel, + CISCO IOS, Cryptek, Checkpoint FW-1, Data Fellows (F-Secure), + Ericsson, Fitel, FreeS/WAN, HiFn, HITACHI, IBM AIX, IIJ, Intel Canada, + Intel Packet Protect, MEW NetCocoon, MGCS, Microsoft WinNT/2000, + NAI PGPnet, NetLock, NIST (linux IPsec + plutoplus), NEC IX5000, + Netscreen, NxNetworks, OpenBSD isakmpd, Pivotal, Radguard, RapidStream, + RedCreek, Routerware, RSA, SSH (both IPv4/IPv6), Secure Computing, + Soliton, Sun Solaris8, TIS/NAI Gauntret, Toshiba, VPNet, + Yamaha RT series Here are (some of) platforms we have tested IPComp/IKE interoperability in the past, in no particular order. - IRE + IRE, SSH (both IPv4/IPv6), NetLock + +VPNC (vpnc.org) provides IPsec conformance tests, using KAME and OpenBSD +IPsec/IKE implementations. Their test results are available at +http://www.vpnc.org/conformance.html, and it may give you more idea +about which implementation interoperates with KAME IPsec/IKE implementation. 5. ALTQ -# removed since it is not imported to FreeBSD-current +KAME kit includes ALTQ 2.1 code, which supports FreeBSD2, FreeBSD3, +NetBSD and OpenBSD. For BSD/OS, ALTQ does not work. +ALTQ in KAME supports (or tries to support) IPv6. +(actually, ALTQ is developed on KAME repository since ALTQ 2.1 - Jan 2000) + +ALTQ occupies single character device number. For FreeBSD, it is officially +allocated. For OpenBSD and NetBSD, we use the number which is not +currently allocated (will eventually get an official number). +The character device is enabled for i386 architecture only. 
To enable and +compile ALTQ-ready kernel for other architectures, take the following steps: +- assume that your architecture is FOOBAA. +- modify sys/arch/FOOBAA/FOOBAA/conf.c (or somewhere that defines cdevsw), + to include a line for ALTQ. Look at sys/arch/i386/i386/conf.c for + example. The major number must be the same as in the i386 case. +- copy kernel configuration file (like ALTQ.v6 or GENERIC.v6) from i386, + and modify accordingly. +- build a kernel. +- before building userland, change netbsd/{lib,usr.sbin,usr.bin}/Makefile + (or openbsd/foobaa) so that it will visit altq-related sub directories. 6. mobile-ip6 -# removed since it is not imported to FreeBSD-current +6.1 KAME node as correspondent node + +Default installation recognizes home address option (in destination +options header). No sub-options are supported. Interaction with +IPsec, and/or 2292bis API, needs further study. + +6.2 KAME node as home agent/mobile node + +KAME kit includes Ericsson mobile-ip6 code. The integration has just started +(in Feb 2000), and we will need some more time to integrate it better. + +See kame/mip6config/{QUICKSTART,README_MIP6.txt} for more details. + +The Ericsson code implements revision 09 of the mobile-ip6 draft. There +are other implementations available: + NEC: http://www.6bone.nec.co.jp/mipv6/internal-dist/ (-13 draft) + SFC: http://neo.sfc.wide.ad.jp/~mip6/ (-13 draft) + +7. Coding style + +The KAME developers basically do not bother much about coding +style. However, there is still some agreement on the style, in order +to make the distributed development smooth. + +- the tab character should be 8 columns wide (tabstops are at 8, 16, 24, ... + column). With vi, use ":set ts=8 sw=8". +- each line should be within 80 characters. +- keep a single open/close bracket in a comment such as in the following + line: + putchar('('); /* ) */ + without this, some vi users would have a hard time matching a pair of + brackets. 
Although this type of bracket seems clumsy and is even + harmful for some other type of vi users and Emacs users, the + agreement among the KAME developers is to allow it. +- add the following line to the head of every KAME-derived file: + /* (dollar)KAME(dollar) */ + where "(dollar)" is the dollar character ($), and around "$" are tabs. + (this is for C. For other languages, you should use their own comment + line.) + Once committed to the CVS repository, this line will contain its + version number (see, for example, at the top of this file). This + would make it easy to report a bug. +- when creating a new file with the WIDE copyright, type "make copyright.c" at + the top-level, and use copyright.c as a template. KAME RCS tag will be + included automatically. +- when editing a third-party package, keep its own coding style as + much as possible, even if the style does not follow the items above. + +When you want to contribute something to the KAME project, and if *you +do not mind* the agreement, it would be helpful for the project to +keep these rules. Note, however, that we would never intend to force +you to adopt our rules. We would rather respect your own style, +especially when you have a policy about the style. <end of IMPLEMENTATION>