summaryrefslogtreecommitdiffstats
path: root/sys/netinet/raw_ip.c
Commit message (Collapse)AuthorAgeFilesLines
* Slight white space tweak.rwatson2005-06-011-0/+1
| | | | MFC after: 7 days
* If we are going tocperciva2005-05-061-0/+1
| | | | | | | | | | 1. Copy a NULL-terminated string into a fixed-length buffer, and 2. copyout that buffer to userland, we really ought to 0. Zero the entire buffer first. Security: FreeBSD-SA-05:08.kmem
* eliminate extraneous null ptr checkssam2005-03-291-2/+2
| | | | Noticed by: Coverity Prevent analysis tool
* /* -> /*- for license, minor formatting changesimp2005-01-071-1/+1
|
* Initialize struct pr_userreqs in new/sparse style and fill in commonphk2004-11-081-5/+12
| | | | | | default elements in net_init_domain(). This makes it possible to grep these structures and see any bogosities.
* When the access control on creating raw sockets was modified so thatrwatson2004-10-121-20/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | processes in jail could create raw sockets, additional access control checks were added to raw IP sockets to limit the ways in which those sockets could be used. Specifically, only the socket option IP_HDRINCL was permitted in rip_ctloutput(). Other socket options were protected by a call to suser(). This change was required to prevent processes in a Jail from modifying system properties such as multicast routing and firewall rule sets. However, it also introduced a regression: processes that create a raw socket with root privilege, but then downgraded credential (i.e., a daemon giving up root, or a setuid process switching back to the real uid) could no longer issue other unprivileged generic IP socket option operations, such as IP_TOS, IP_TTL, and the multicast group membership options, which prevented multicast routing daemons (and some other tools) from operating correctly. This change pushes the access control decision down to the granularity of individual socket options, rather than all socket options, on raw IP sockets. When rip_ctloutput() doesn't implement an option, it will now pass the request directly to in_control() without an access control check. This should restore the functionality of the generic IP socket options for raw sockets in the above-described scenarios, which may be confirmed with the ipsockopt regression test. RELENG_5 candidate. Reviewed by: csjp
* fix up socket/ip layer violation... don't assume/know thatjmg2004-09-051-1/+2
| | | | SO_DONTROUTE == IP_ROUTETOIF and SO_BROADCAST == IP_ALLOWBROADCAST...
* When a prison is given the ability to create raw sockets (when thecsjp2004-08-211-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | security.jail.allow_raw_sockets sysctl MIB is set to 1) where privileged access to jails is given out, it is possible for prison root to manipulate various network parameters which effect the host environment. This commit plugs a number of security holes associated with the use of raw sockets and prisons. This commit makes the following changes: - Add a comment to rtioctl warning developers that if they add any ioctl commands, they should use super-user checks where necessary, as it is possible for PRISON root to make it this far in execution. - Add super-user checks for the execution of the SIOCGETVIFCNT and SIOCGETSGCNT IP multicast ioctl commands. - Add a super-user check to rip_ctloutput(). If the calling cred is PRISON root, make sure the socket option name is IP_HDRINCL, otherwise deny the request. Although this patch corrects a number of security problems associated with raw sockets and prisons, the warning in jail(8) should still apply, and by default we should keep the default value of security.jail.allow_raw_sockets MIB to 0 (or disabled) until we are certain that we have tracked down all the problems. Looking forward, we will probably want to eliminate the references to curthread. This may be a MFC candidate for RELENG_5. Reviewed by: rwatson Approved by: bmilekic (mentor)
* Convert ipfw to use PFIL_HOOKS. This is change is transparent to userlandandre2004-08-171-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | and preserves the ipfw ABI. The ipfw core packet inspection and filtering functions have not been changed, only how ipfw is invoked is different. However there are many changes how ipfw is and its add-on's are handled: In general ipfw is now called through the PFIL_HOOKS and most associated magic, that was in ip_input() or ip_output() previously, is now done in ipfw_check_[in|out]() in the ipfw PFIL handler. IPDIVERT is entirely handled within the ipfw PFIL handlers. A packet to be diverted is checked if it is fragmented, if yes, ip_reass() gets in for reassembly. If not, or all fragments arrived and the packet is complete, divert_packet is called directly. For 'tee' no reassembly attempt is made and a copy of the packet is sent to the divert socket unmodified. The original packet continues its way through ip_input/output(). ipfw 'forward' is done via m_tag's. The ipfw PFIL handlers tag the packet with the new destination sockaddr_in. A check if the new destination is a local IP address is made and the m_flags are set appropriately. ip_input() and ip_output() have some more work to do here. For ip_input() the m_flags are checked and a packet for us is directly sent to the 'ours' section for further processing. Destination changes on the input path are only tagged and the 'srcrt' flag to ip_forward() is set to disable destination checks and ICMP replies at this stage. The tag is going to be handled on output. ip_output() again checks for m_flags and the 'ours' tag. If found, the packet will be dropped back to the IP netisr where it is going to be picked up by ip_input() again and the directly sent to the 'ours' section. When only the destination changes, the route's 'dst' is overwritten with the new destination from the forward m_tag. Then it jumps back at the route lookup again and skips the firewall check because it has been marked with M_SKIP_FIREWALL. ipfw 'forward' has to be compiled into the kernel with 'option IPFIREWALL_FORWARD' to enable it. DUMMYNET is entirely handled within the ipfw PFIL handlers. A packet for a dummynet pipe or queue is directly sent to dummynet_io(). Dummynet will then inject it back into ip_input/ip_output() after it has served its time. Dummynet packets are tagged and will continue from the next rule when they hit the ipfw PFIL handlers again after re-injection. BRIDGING and IPFW_ETHER are not changed yet and use ipfw_chk() directly as they did before. Later this will be changed to dedicated ETHER PFIL_HOOKS. More detailed changes to the code: conf/files Add netinet/ip_fw_pfil.c. conf/options Add IPFIREWALL_FORWARD option. modules/ipfw/Makefile Add ip_fw_pfil.c. net/bridge.c Disable PFIL_HOOKS if ipfw for bridging is active. Bridging ipfw is still directly invoked to handle layer2 headers and packets would get a double ipfw when run through PFIL_HOOKS as well. netinet/ip_divert.c Removed divert_clone() function. It is no longer used. netinet/ip_dummynet.[ch] Neither the route 'ro' nor the destination 'dst' need to be stored while in dummynet transit. Structure members and associated macros are removed. netinet/ip_fastfwd.c Removed all direct ipfw handling code and replace it with the new 'ipfw forward' handling code. netinet/ip_fw.h Removed 'ro' and 'dst' from struct ip_fw_args. netinet/ip_fw2.c (Re)moved some global variables and the module handling. netinet/ip_fw_pfil.c New file containing the ipfw PFIL handlers and module initialization. netinet/ip_input.c Removed all direct ipfw handling code and replace it with the new 'ipfw forward' handling code. ip_forward() does not longer require the 'next_hop' struct sockaddr_in argument. Disable early checks if 'srcrt' is set. netinet/ip_output.c Removed all direct ipfw handling code and replace it with the new 'ipfw forward' handling code. netinet/ip_var.h Add ip_reass() as general function. (Used from ipfw PFIL handlers for IPDIVERT.) netinet/raw_ip.c Directly check if ipfw and dummynet control pointers are active. netinet/tcp_input.c Rework the 'ipfw forward' to local code to work with the new way of forward tags. netinet/tcp_sack.c Remove include 'opt_ipfw.h' which is not needed here. sys/mbuf.h Remove m_claim_next() macro which was exclusively for ipfw 'forward' and is no longer needed. Approved by: re (scottl)
* White space cleanup for netinet before branch:rwatson2004-08-161-3/+3
| | | | | | | | | | | - Trailing tab/space cleanup - Remove spurious spaces between or before tabs This change avoids touching files that Andre likely has in his working set for PFIL hooks changes for IPFW/DUMMYNET. Approved by: re (scottl) Submitted by: Xin LI <delphij@frontfree.net>
* Get rid of the RANDOM_IP_ID option and make it a sysctl. NetBSDdwmalone2004-08-141-6/+1
| | | | | | | | | | | | | | | | | | | | | have already done this, so I have styled the patch on their work: 1) introduce a ip_newid() static inline function that checks the sysctl and then decides if it should return a sequential or random IP ID. 2) named the sysctl net.inet.ip.random_id 3) IPv6 flow IDs and fragment IDs are now always random. Flow IDs and frag IDs are significantly less common in the IPv6 world (ie. rarely generated per-packet), so there should be smaller performance concerns. The sysctl defaults to 0 (sequential IP IDs). Reviewed by: andre, silby, mlaier, ume Based on: NetBSD MFC after: 2 months
* Backout removal of UMA_ZONE_NOFREE flag for all zones which are establishedandre2004-08-111-1/+1
| | | | | | | | | for structures with timers in them. It might be that a timer might fire even when the associated structure has already been free'd. Having type- stable storage in this case is beneficial for graceful failure handling and debugging. Discussed with: bosko, tegge, rwatson
* Remove the UMA_ZONE_NOFREE flag to all uma_zcreate() calls in the IP andandre2004-08-111-1/+1
| | | | | TCP code. This flag would have prevented giving back excessive free slabs to the global pool after a transient peak usage.
* Rename suser_cred()'s PRISON_ROOT flag to SUSER_ALLOWJAIL. This iscperciva2004-07-261-1/+1
| | | | | | | | | | | somewhat clearer, but more importantly allows for a consistent naming scheme for suser_cred flags. The old name is still defined, but will be removed in a few days (unless I hear any complaints...) Discussed with: rwatson, scottl Requested by: jhb
* M_PREPEND() the IP header on to the front of an outgoing raw IP packetrwatson2004-07-201-1/+1
| | | | | using M_DONTWAIT rather than M_WAITOK to avoid sleeping on memory while holding a mutex.
* Reduce the number of unnecessary unlock-relocks on socket buffer mutexesrwatson2004-06-261-3/+7
| | | | | | | | | | | | | | | | | | | | associated with performing a wakeup on the socket buffer: - When performing an sbappend*() followed by a so[rw]wakeup(), explicitly acquire the socket buffer lock and use the _locked() variants of both calls. Note that the _locked() sowakeup() versions unlock the mutex on return. This is done in uipc_send(), divert_packet(), mroute socket_send(), raw_append(), tcp_reass(), tcp_input(), and udp_append(). - When the socket buffer lock is dropped before a sowakeup(), remove the explicit unlock and use the _locked() sowakeup() variant. This is done in soisdisconnecting(), soisdisconnected() when setting the can't send/ receive flags and dropping data, and in uipc_rcvd() which adjusting back-pressure on the sockets. For UNIX domain sockets running mpsafe with a contention-intensive SMP mysql benchmark, this results in a 1.6% query rate improvement due to reduce mutex costs.
* Introduce a new feature to IPFW2: lookup tables. These are usefulru2004-06-091-0/+5
| | | | | | | for handling large sparse address sets. Initial implementation by Vsevolod Lobko <seva@ip.net.ua>, refined by me. MFC after: 1 week
* Move the locking of the pcb into raw_output(). Organize code sobmilekic2004-06-031-10/+14
| | | | | | | | that m_prepend() is not called with possibility to wait while the pcb lock is held. What still needs revisiting is whether the ripcbinfo lock is really required here. Discussed with: rwatson
* Switch to using the inpcb MAC label instead of socket MAC label whenrwatson2004-05-041-1/+3
| | | | | | | | | | | | | | | | | | | | labeling new mbufs created from sockets/inpcbs in IPv4. This helps avoid the need for socket layer locking in the lower level network paths where inpcb locks are already frequently held where needed. In particular: - Use the inpcb for label instead of socket in raw_append(). - Use the inpcb for label instead of socket in tcp_output(). - Use the inpcb for label instead of socket in tcp_respond(). - Use the inpcb for label instead of socket in tcp_twrespond(). - Use the inpcb for label instead of socket in syncache_respond(). While here, modify tcp_respond() to avoid assigning NULL to a stack variable and centralize assertions about the inpcb when inp is assigned. Obtained from: TrustedBSD Project Sponsored by: DARPA, McAfee Research
* Assert the inpcb lock on 'last' in udp_append(), since it's alwaysrwatson2004-05-041-0/+2
| | | | | | | called with it, and also requires it. Obtained from: TrustedBSD Project Sponsored by: DARPA, McAfee Research
* o Fix misindentation in the previous commit.maxim2004-05-031-8/+7
|
* Give jail(8) the feature to allow raw sockets from within abmilekic2004-04-261-2/+31
| | | | | | | | | | | | | | | | | | | | | jail, which is less restrictive but allows for more flexible jail usage (for those who are willing to make the sacrifice). The default is off, but allowing raw sockets within jails can now be accomplished by tuning security.jail.allow_raw_sockets to 1. Turning this on will allow you to use things like ping(8) or traceroute(8) from within a jail. The patch being committed is not identical to the patch in the PR. The committed version is more friendly to APIs which pjd is working on, so it should integrate into his work quite nicely. This change has also been presented and addressed on the freebsd-hackers mailing list. Submitted by: Christian S.J. Peron <maneo@bsdpro.com> PR: kern/65800
* Remove advertising clause from University of California Regent'simp2004-04-071-4/+0
| | | | | | | license, per letter dated July 22, 1999 and email from Peter Wemm, Alan Cox and Robert Watson. Approved by: core, peter, alc, rwatson
* Remove unused argument.pjd2004-03-271-1/+1
| | | | Reviewed by: ume
* IPSEC and FAST_IPSEC have the same internal API now;ume2004-02-171-8/+3
| | | | | | so merge these (IPSEC has an extra ipsecstat) Submitted by: "Bjoern A. Zeeb" <bzeeb+freebsd@zabbadoz.net>
* pass pcb rather than so. it is expected that per socket policyume2004-02-031-1/+1
| | | | works again.
* Correct the descriptions of the net.inet.{udp,raw}.recvspace sysctls.ru2004-01-271-1/+1
|
* Split the "inp" mutex class into separate classes for each of divert,sam2003-11-261-1/+1
| | | | | | | | raw, tcp, udp, raw6, and udp6 sockets to avoid spurious witness complaints. Reviewed by: rwatson Approved by: re (rwatson)
* Introduce tcp_hostcache and remove the tcp specific metrics fromandre2003-11-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | the routing table. Move all usage and references in the tcp stack from the routing table metrics to the tcp hostcache. It caches measured parameters of past tcp sessions to provide better initial start values for following connections from or to the same source or destination. Depending on the network parameters to/from the remote host this can lead to significant speedups for new tcp connections after the first one because they inherit and shortcut the learning curve. tcp_hostcache is designed for multiple concurrent access in SMP environments with high contention and is hash indexed by remote ip address. It removes significant locking requirements from the tcp stack with regard to the routing table. Reviewed by: sam (mentor), bms Reviewed by: -net, -current, core@kame.net (IPv6 parts) Approved by: re (scottl)
* Introduce a MAC label reference in 'struct inpcb', which cachesrwatson2003-11-181-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | the MAC label referenced from 'struct socket' in the IPv4 and IPv6-based protocols. This permits MAC labels to be checked during network delivery operations without dereferencing inp->inp_socket to get to so->so_label, which will eventually avoid our having to grab the socket lock during delivery at the network layer. This change introduces 'struct inpcb' as a labeled object to the MAC Framework, along with the normal circus of entry points: initialization, creation from socket, destruction, as well as a delivery access control check. For most policies, the inpcb label will simply be a cache of the socket label, so a new protocol switch method is introduced, pr_sosetlabel() to notify protocols that the socket layer label has been updated so that the cache can be updated while holding appropriate locks. Most protocols implement this using pru_sosetlabel_null(), but IPv4/IPv6 protocols using inpcbs use the the worker function in_pcbsosetlabel(), which calls into the MAC Framework to perform a cache update. Biba, LOMAC, and MLS implement these entry points, as do the stub policy, and test policy. Reviewed by: sam, bms Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories
* In rip_abort(), unlock the inpcb if we didn't detach it, or we maycognet2003-11-171-0/+2
| | | | | | recurse on the lock before destroying the mutex. Submitted by: sam
* add some missing lockingsam2003-11-081-17/+75
| | | | Supported by: FreeBSD Foundation
* shuffle code so we don't "continue" and miss a needed unlock operationsam2003-09-171-4/+2
| | | | Observed by: Wiktor Niesiobedzki <w@evip.pl>
* remove warning about use of old divert sockets; this was markedsam2003-09-011-9/+0
| | | | | | for removal before 5.2 Reviewed by: silence on -net and -arch
* add lockingsam2003-09-011-125/+117
| | | | Sponsored by: FreeBSD Foundation
* M_PREPEND() with an argument of M_TRYWAIT can fail, meaning therwatson2003-08-261-0/+2
| | | | | | | | | | returned mbuf can be NULL. Check for NULL in rip_output() when prepending an IP header. This prevents mbuf exhaustion from causing a local kernel panic when sending raw IP packets. PR: kern/55886 Reported by: Pawel Malachowski <pawmal-posting@freebsd.lublin.pl> MFC after: 3 days
* Add the IP_ONESBCAST option, to enable undirected IP broadcasts to be sent onbms2003-08-201-0/+3
| | | | | | | | | | specific interfaces. This is required by aodvd, and may in future help us in getting rid of the requirement for BPF from our import of isc-dhcp. Suggested by: fenestro Obtained from: BSD/OS Reviewed by: mini, sam Approved by: jake (mentor)
* 1. Basic PIM kernel supporthsu2003-08-071-0/+8
| | | | | | | | | | | | | | | | | | Disabled by default. To enable it, the new "options PIM" must be added to the kernel configuration file (in addition to MROUTING): options MROUTING # Multicast routing options PIM # Protocol Independent Multicast 2. Add support for advanced multicast API setup/configuration and extensibility. 3. Add support for kernel-level PIM Register encapsulation. Disabled by default. Can be enabled by the advanced multicast API. 4. Implement a mechanism for "multicast bandwidth monitoring and upcalls". Submitted by: Pavlin Radoslavov <pavlin@icir.org>
* Add a comment above rip_ctloutput() documenting that the privilegerwatson2003-07-181-0/+10
| | | | | | | | | | | check for raw IP system management operations is often (although not always) implicit due to the namespacing of raw IP sockets. I.e., you have to have privilege to get a raw IP socket, so much of the management code sitting on raw IP sockets assumes that any requests on the socket should be granted privilege. Obtained from: TrustedBSD Project Product of: France
* Back out M_* changes, per decision of the TRB.imp2003-02-191-2/+2
| | | | Approved by: trb
* s/IPSSEC/IPSEC/tanimura2003-02-111-1/+1
|
* Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.alfred2003-01-211-2/+2
| | | | Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
* Fix long-standing bug predating FreeBSD where calling connect() twicehsu2003-01-181-1/+3
| | | | on a raw ip socket will crash the system with a null-dereference.
* Back out some style changes. They are not urgent,luigi2002-11-201-27/+29
| | | | | | | I will put them back in after 5.0 is out. Requested by: sam Approved by: re
* Cleanup some of the comments, and reformat long lines.luigi2002-11-171-29/+27
| | | | | | | | | | | | Replace m_copy() with m_copypacket() where applicable. Replace "if (a.s_addr ...)" with "if (a.s_addr != INADDR_ANY ...)" to make it clear what the code means. While at it, fix some function headers and remove 'register' from variable declarations. MFC after: 3 days
* Massive cleanup of the ip_mroute code.luigi2002-11-151-9/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | No functional changes, but: + the mrouting module now should behave the same as the compiled-in version (it did not before, some of the rsvp code was not loaded properly); + netinet/ip_mroute.c is now truly optional; + removed some redundant/unused code; + changed many instances of '0' to NULL and INADDR_ANY as appropriate; + removed several static variables to make the code more SMP-friendly; + fixed some minor bugs in the mrouting code (mostly, incorrect return values from functions). This commit is also a prerequisite to the addition of support for PIM, which i would like to put in before DP2 (it does not change any of the existing APIs, anyways). Note, in the process we found out that some device drivers fail to properly handle changes in IFF_ALLMULTI, leading to interesting behaviour when a multicast router is started. This bug is not corrected by this commit, and will be fixed with a separate commit. Detailed changes: -------------------- netinet/ip_mroute.c all the above. conf/files make ip_mroute.c optional net/route.c fix mrt_ioctl hook netinet/ip_input.c fix ip_mforward hook, move rsvp_input() here together with other rsvp code, and a couple of indentation fixes. netinet/ip_output.c fix ip_mforward and ip_mcast_src hooks netinet/ip_var.h rsvp function hooks netinet/raw_ip.c hooks for mrouting and rsvp functions, plus interface cleanup. netinet/ip_mroute.h remove an unused and optional field from a struct Most of the code is from Pavlin Radoslavov and the XORP project Reviewed by: sam MFC after: 1 week
* Renumber IPPROTO_DIVERT out of the range of valid IP protocol numbers.fenner2002-10-291-0/+12
| | | | | | | | | | | This allows socket() to return an error when the kernel is not built with IPDIVERT, and doesn't prevent future applications from using the "borrowed" IP protocol number. The sysctl net.inet.raw.olddiverterror controls whether opening a socket with the "borrowed" IP protocol fails with an accompanying kernel printf; this code should last only a couple of releases. Approved by: re
* Fix two instances of variant struct definitions in sys/netinet:phk2002-10-201-3/+2
| | | | | | | | | | | | | | Remove the never completed _IP_VHL version, it has not caught on anywhere and it would make us incompatible with other BSD netstacks to retain this version. Add a CTASSERT protecting sizeof(struct ip) == 20. Don't let the size of struct ipq depend on the IPDIVERT option. This is a functional no-op commit. Approved by: re
* Tie new "Fast IPsec" code into the build. This involves the usualsam2002-10-161-0/+20
| | | | | | | | | | | | configuration stuff as well as conditional code in the IPv4 and IPv6 areas. Everything is conditional on FAST_IPSEC which is mutually exclusive with IPSEC (KAME IPsec implmentation). As noted previously, don't use FAST_IPSEC with INET6 at the moment. Reviewed by: KAME, rwatson Approved by: silence Supported by: Vernier Networks
* Replace aux mbufs with packet tags:sam2002-10-161-8/+1
| | | | | | | | | | | | | | | | | | | o instead of a list of mbufs use a list of m_tag structures a la openbsd o for netgraph et. al. extend the stock openbsd m_tag to include a 32-bit ABI/module number cookie o for openbsd compatibility define a well-known cookie MTAG_ABI_COMPAT and use this in defining openbsd-compatible m_tag_find and m_tag_get routines o rewrite KAME use of aux mbufs in terms of packet tags o eliminate the most heavily used aux mbufs by adding an additional struct inpcb parameter to ip_output and ip6_output to allow the IPsec code to locate the security policy to apply to outbound packets o bump __FreeBSD_version so code can be conditionalized o fixup ipfilter's call to ip_output based on __FreeBSD_version Reviewed by: julian, luigi (silent), -arch, -net, darren Approved by: julian, silence from everyone else Obtained from: openbsd (mostly) MFC after: 1 month
OpenPOWER on IntegriCloud