summaryrefslogtreecommitdiffstats
path: root/sys/net/rtsock.c
Commit message (Collapse)AuthorAgeFilesLines
* Commit step 1 of the vimage project, (network stack)bz2008-08-171-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | virtualization work done by Marko Zec (zec@). This is the first in a series of commits over the course of the next few weeks. Mark all uses of global variables to be virtualized with a V_ prefix. Use macros to map them back to their global names for now, so this is a NOP change only. We hope to have caught at least 85-90% of what is needed so we do not invalidate a lot of outstanding patches again. Obtained from: //depot/projects/vimage-commit2/... Reviewed by: brooks, des, ed, mav, julian, jamie, kris, rwatson, zec, ... (various people I forgot, different versions) md5 (with a bit of help) Sponsored by: NLnet Foundation, The FreeBSD Foundation X-MFC after: never V_Commit_Message_Reviewed_By: more people than the patch
* Remove unused support for local and foreign addresses in generic rawrwatson2008-07-091-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | socket support. These utility routines are used only for routing and pfkey sockets, neither of which have a notion of address, so were required to mock up fake socket addresses to avoid connection requirements for applications that did not specify their own fake addresses (most of them). Quite a bit of the removed code is #ifdef notdef, since raw sockets don't support bind() or connect() in practice. Removing this simplifies the raw socket implementation, and removes two (commented out) uses of dtom(9). Fake addresses passed to sendto(2) by applications are ignored for compatibility reasons, but this is now done in a more consistent way (and with a comment). Possibly, EINVAL could be returned here in the future if it is determined that no applications depend on the semantic inconsistency of specifying a destination address for a protocol without address support, but this will require some amount of careful surveying. NB: This does not affect netinet, netinet6, or other wire protocol raw sockets, which provide their own independent infrastructure with control block address support specific to the protocol. MFC after: 3 weeks Reviewed by: bz
* Remove NETISR_MPSAFE, which allows specific netisr handlers to be directlyrwatson2008-07-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | dispatched without Giant, and add NETISR_FORCEQUEUE, which allows specific netisr handlers to always be dispatched via a queue (deferred). Mark the usb and if_ppp netisr handlers as NETISR_FORCEQUEUE, and explicitly acquire Giant in those handlers. Previously, any netisr handler not marked NETISR_MPSAFE would necessarily run deferred and with Giant acquired. This change removes Giant scaffolding from the netisr infrastructure, but NETISR_FORCEQUEUE allows non-MPSAFE handlers to continue to force deferred dispatch so as to avoid lock order reversals between their acqusition of Giant and any calling context. It is likely we will be able to remove NETISR_FORCEQUEUE once IFF_NEEDSGIANT is removed, as non-MPSAFE usb and if_ppp drivers will no longer be supported. Reviewed by: bz MFC after: 1 month X-MFC note: We can't remove NETISR_MPSAFE from stable/7 for KPI reasons, but the rest can go back.
* Add code to allow the system to handle multiple routing tables.julian2008-05-091-5/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This particular implementation is designed to be fully backwards compatible and to be MFC-able to 7.x (and 6.x) Currently the only protocol that can make use of the multiple tables is IPv4 Similar functionality exists in OpenBSD and Linux. From my notes: ----- One thing where FreeBSD has been falling behind, and which by chance I have some time to work on is "policy based routing", which allows different packet streams to be routed by more than just the destination address. Constraints: ------------ I want to make some form of this available in the 6.x tree (and by extension 7.x) , but FreeBSD in general needs it so I might as well do it in -current and back port the portions I need. One of the ways that this can be done is to have the ability to instantiate multiple kernel routing tables (which I will now refer to as "Forwarding Information Bases" or "FIBs" for political correctness reasons). Which FIB a particular packet uses to make the next hop decision can be decided by a number of mechanisms. The policies these mechanisms implement are the "Policies" referred to in "Policy based routing". One of the constraints I have if I try to back port this work to 6.x is that it must be implemented as a EXTENSION to the existing ABIs in 6.x so that third party applications do not need to be recompiled in timespan of the branch. This first version will not have some of the bells and whistles that will come with later versions. It will, for example, be limited to 16 tables in the first commit. Implementation method, Compatible version. (part 1) ------------------------------- For this reason I have implemented a "sufficient subset" of a multiple routing table solution in Perforce, and back-ported it to 6.x. (also in Perforce though not always caught up with what I have done in -current/P4). The subset allows a number of FIBs to be defined at compile time (8 is sufficient for my purposes in 6.x) and implements the changes needed to allow IPV4 to use them. I have not done the changes for ipv6 simply because I do not need it, and I do not have enough knowledge of ipv6 (e.g. neighbor discovery) needed to do it. Other protocol families are left untouched and should there be users with proprietary protocol families, they should continue to work and be oblivious to the existence of the extra FIBs. To understand how this is done, one must know that the current FIB code starts everything off with a single dimensional array of pointers to FIB head structures (One per protocol family), each of which in turn points to the trie of routes available to that family. The basic change in the ABI compatible version of the change is to extent that array to be a 2 dimensional array, so that instead of protocol family X looking at rt_tables[X] for the table it needs, it looks at rt_tables[Y][X] when for all protocol families except ipv4 Y is always 0. Code that is unaware of the change always just sees the first row of the table, which of course looks just like the one dimensional array that existed before. The entry points rtrequest(), rtalloc(), rtalloc1(), rtalloc_ign() are all maintained, but refer only to the first row of the array, so that existing callers in proprietary protocols can continue to do the "right thing". Some new entry points are added, for the exclusive use of ipv4 code called in_rtrequest(), in_rtalloc(), in_rtalloc1() and in_rtalloc_ign(), which have an extra argument which refers the code to the correct row. In addition, there are some new entry points (currently called rtalloc_fib() and friends) that check the Address family being looked up and call either rtalloc() (and friends) if the protocol is not IPv4 forcing the action to row 0 or to the appropriate row if it IS IPv4 (and that info is available). These are for calling from code that is not specific to any particular protocol. The way these are implemented would change in the non ABI preserving code to be added later. One feature of the first version of the code is that for ipv4, the interface routes show up automatically on all the FIBs, so that no matter what FIB you select you always have the basic direct attached hosts available to you. (rtinit() does this automatically). You CAN delete an interface route from one FIB should you want to but by default it's there. ARP information is also available in each FIB. It's assumed that the same machine would have the same MAC address, regardless of which FIB you are using to get to it. This brings us as to how the correct FIB is selected for an outgoing IPV4 packet. Firstly, all packets have a FIB associated with them. if nothing has been done to change it, it will be FIB 0. The FIB is changed in the following ways. Packets fall into one of a number of classes. 1/ locally generated packets, coming from a socket/PCB. Such packets select a FIB from a number associated with the socket/PCB. This in turn is inherited from the process, but can be changed by a socket option. The process in turn inherits it on fork. I have written a utility call setfib that acts a bit like nice.. setfib -3 ping target.example.com # will use fib 3 for ping. It is an obvious extension to make it a property of a jail but I have not done so. It can be achieved by combining the setfib and jail commands. 2/ packets received on an interface for forwarding. By default these packets would use table 0, (or possibly a number settable in a sysctl(not yet)). but prior to routing the firewall can inspect them (see below). (possibly in the future you may be able to associate a FIB with packets received on an interface.. An ifconfig arg, but not yet.) 3/ packets inspected by a packet classifier, which can arbitrarily associate a fib with it on a packet by packet basis. A fib assigned to a packet by a packet classifier (such as ipfw) would over-ride a fib associated by a more default source. (such as cases 1 or 2). 4/ a tcp listen socket associated with a fib will generate accept sockets that are associated with that same fib. 5/ Packets generated in response to some other packet (e.g. reset or icmp packets). These should use the FIB associated with the packet being reponded to. 6/ Packets generated during encapsulation. gif, tun and other tunnel interfaces will encapsulate using the FIB that was in effect withthe proces that set up the tunnel. thus setfib 1 ifconfig gif0 [tunnel instructions] will set the fib for the tunnel to use to be fib 1. Routing messages would be associated with their process, and thus select one FIB or another. messages from the kernel would be associated with the fib they refer to and would only be received by a routing socket associated with that fib. (not yet implemented) In addition Netstat has been edited to be able to cope with the fact that the array is now 2 dimensional. (It looks in system memory using libkvm (!)). Old versions of netstat see only the first FIB. In addition two sysctls are added to give: a) the number of FIBs compiled in (active) b) the default FIB of the calling process. Early testing experience: ------------------------- Basically our (IronPort's) appliance does this functionality already using ipfw fwd but that method has some drawbacks. For example, It can't fully simulate a routing table because it can't influence the socket's choice of local address when a connect() is done. Testing during the generating of these changes has been remarkably smooth so far. Multiple tables have co-existed with no notable side effects, and packets have been routes accordingly. ipfw has grown 2 new keywords: setfib N ip from anay to any count ip from any to any fib N In pf there seems to be a requirement to be able to give symbolic names to the fibs but I do not have that capacity. I am not sure if it is required. SCTP has interestingly enough built in support for this, called VRFs in Cisco parlance. it will be interesting to see how that handles it when it suddenly actually does something. Where to next: -------------------- After committing the ABI compatible version and MFCing it, I'd like to proceed in a forward direction in -current. this will result in some roto-tilling in the routing code. Firstly: the current code's idea of having a separate tree per protocol family, all of the same format, and pointed to by the 1 dimensional array is a bit silly. Especially when one considers that there is code that makes assumptions about every protocol having the same internal structures there. Some protocols don't WANT that sort of structure. (for example the whole idea of a netmask is foreign to appletalk). This needs to be made opaque to the external code. My suggested first change is to add routing method pointers to the 'domain' structure, along with information pointing the data. instead of having an array of pointers to uniform structures, there would be an array pointing to the 'domain' structures for each protocol address domain (protocol family), and the methods this reached would be called. The methods would have an argument that gives FIB number, but the protocol would be free to ignore it. When the ABI can be changed it raises the possibilty of the addition of a fib entry into the "struct route". Currently, the structure contains the sockaddr of the desination, and the resulting fib entry. To make this work fully, one could add a fib number so that given an address and a fib, one can find the third element, the fib entry. Interaction with the ARP layer/ LL layer would need to be revisited as well. Qing Li has been working on this already. This work was sponsored by Ironport Systems/Cisco Reviewed by: several including rwatson, bz and mlair (parts each) Obtained from: Ironport systems/Cisco
* This patch provides the back end support for equal-cost multi-pathqingli2008-04-131-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (ECMP) for both IPv4 and IPv6. Previously, multipath route insertion is disallowed. For example, route add -net 192.103.54.0/24 10.9.44.1 route add -net 192.103.54.0/24 10.9.44.2 The second route insertion will trigger an error message of "add net 192.103.54.0/24: gateway 10.2.5.2: route already in table" Multiple default routes can also be inserted. Here is the netstat output: default 10.2.5.1 UGS 0 3074 bge0 => default 10.2.5.2 UGS 0 0 bge0 When multipath routes exist, the "route delete" command requires a specific gateway to be specified or else an error message would be displayed. For example, route delete default would fail and trigger the following error message: "route: writing to routing socket: No such process" "delete net default: not in table" On the other hand, route delete default 10.2.5.2 would be successful: "delete net default: gateway 10.2.5.2" One does not have to specify a gateway if there is only a single route for a particular destination. I need to perform more testings on address aliases and multiple interfaces that have the same IP prefixes. This patch as it stands today is not yet ready for prime time. Therefore, the ECMP code fragments are fully guarded by the RADIX_MPATH macro. Include the "options RADIX_MPATH" in the kernel configuration to enable this feature. Reviewed by: robert, sam, gnn, julian, kmacy
* In keeping with style(9)'s recommendations on macros, use a ';'rwatson2008-03-161-1/+1
| | | | | | | | | after each SYSINIT() macro invocation. This makes a number of lightweight C parsers much happier with the FreeBSD kernel source, including cflow's prcc and lxr. MFC after: 1 month Discussed with: imp, rink
* Do not set the RTF_GATEWAY flag if RTF_LLINFO is set, it doesn't make muchcognet2007-09-081-1/+2
| | | | | | | | | sense in that context, and leads to unusable routes. This should unbreak bootpd. Discussed with: glebius Submitted by: bms Approved by: re (bmah)
* Fix regression in rev. 1.140.glebius2007-03-271-6/+7
| | | | Reported by: Yuriy Tsibizov <Yuriy.Tsibizov gfk.ru>, bsam
* Fix a case where hardware removal of an interface caused an attempt tobms2007-03-271-0/+2
| | | | | | announce an ll_ifma which has gone away. Add a KASSERT to catch regressions. Bug found by: Tom Uffner
* When working on an RTM_CHANGE do the route editing in the followingglebius2007-03-221-18/+17
| | | | | | | | | | | | | | | | | sequence. First, if rt_ifa is going to be changed, then call ifa_rtrequest(RTM_DELETE). Second, if gateway is going to be changed, then call rt_setgate(). Third, change rt_ifa. With this change we are able to change a link level route to a gateway one, that wasn't possible before: # ifconfig em0 192.168.22.1/24 # arp -s 192.168.22.99 00:11:22:33:44:55 # route change 192.168.22.99 192.168.22.199 # ping 192.168.22.99 db> Reported by: avatar
* Sweep kernel replacing suser(9) calls with priv(9) calls, assigningrwatson2006-11-061-2/+6
| | | | | | | | | | | | | specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>
* Ok, here it is, we finally add SCTP to current. Note that thisrrs2006-11-031-2/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | work is not just mine, but it is also the works of Peter Lei and Michael Tuexen. They both are my two key other developers working on the project.. and they need ata-boy's too: **** peterlei@cisco.com tuexen@fh-muenster.de **** I did do a make sysent which updated the syscall's and sysproto.. I hope that is correct... without it you don't build since we have new syscalls for SCTP :-0 So go out and look at the NOTES, add option SCTP (make sure inet and inet6 are present too) and play with SCTP. I will see about comitting some test tools I have after I figure out where I should place them. I also have a lib (libsctp.a) that adds some of the missing socketapi functions that I need to put into lib's.. I will talk to George about this :-) There may still be some 64 bit issues in here, none of us have a 64 bit processor to test with yet.. Michael may have a MAC but thats another beast too.. If you have a mac and want to use SCTP contact Michael he maintains a web site with a loadable module with this code :-) Reviewed by: gnn Approved by: gnn
* Change semantics of socket close and detach. Add a new protocol switchrwatson2006-07-211-0/+8
| | | | | | | | | | | | | | | | | | | function, pru_close, to notify protocols that the file descriptor or other consumer of a socket is closing the socket. pru_abort is now a notification of close also, and no longer detaches. pru_detach is no longer used to notify of close, and will be called during socket tear-down by sofree() when all references to a socket evaporate after an earlier call to abort or close the socket. This means detach is now an unconditional teardown of a socket, whereas previously sockets could persist after detach of the protocol retained a reference. This faciliates sharing mutexes between layers of the network stack as the mutex is required during the checking and removal of references at the head of sofree(). With this change, pru_detach can now assume that the mutex will no longer be required by the socket layer after completion, whereas before this was not necessarily true. Reviewed by: gnn
* Adjust rt_(set|get)metrics() to do kernel <-> userland timebase conversion.oleg2006-07-061-2/+7
| | | | | | We need it since kernel timebase has changed (time_second -> time_uptime). Approved by: glebius (mentor)
* Chance protocol switch method pru_detach() so that it returns voidrwatson2006-04-011-23/+19
| | | | | | | | | | | | | | | | | | | | | | | | | rather than an error. Detaches do not "fail", they other occur or the protocol flags SS_PROTOREF to take ownership of the socket. soclose() no longer looks at so_pcb to see if it's NULL, relying entirely on the protocol to decide whether it's time to free the socket or not using SS_PROTOREF. so_pcb is now entirely owned and managed by the protocol code. Likewise, no longer test so_pcb in other socket functions, such as soreceive(), which have no business digging into protocol internals. Protocol detach routines no longer try to free the socket on detach, this is performed in the socket code if the protocol permits it. In rts_detach(), no longer test for rp != NULL in detach, and likewise in other protocols that don't permit a NULL so_pcb, reduce the incidence of testing for it during detach. netinet and netinet6 are not fully updated to this change, which will be in an upcoming commit. In their current state they may leak memory or panic. MFC after: 3 months
* Change protocol switch pru_abort() API so that it returns void ratherrwatson2006-04-011-2/+2
| | | | | | | | | | | | | | than an int, as an error here is not meaningful. Modify soabort() to unconditionally free the socket on the return of pru_abort(), and modify most protocols to no longer conditionally free the socket, since the caller will do this. This commit likely leaves parts of netinet and netinet6 in a situation where they may panic or leak memory, as they have not are not fully updated by this commit. This will be corrected shortly in followup commits to these components. MFC after: 3 months
* - Fill in the correct rtm_index for RTM_ADD and RTM_CHANGE messages.andre2006-03-151-0/+8
| | | | | | | | | | | | | | | | | | | - Allow RTM_CHANGE to change a number of route flags as specified by RTF_FMASK. - The unused rtm_use field in struct rt_msghdr is redesignated as rtm_fmask field to communicate route flag changes in RTM_CHANGE messages from userland. The use count of a route was moved to rtm_rmx a long time ago. For source code compatibility reasons a define of rtm_use to rtm_fmask is provided. These changes faciliate running of multiple cooperating routing daemons at the same time without causing undesired interference. Open[BGP|OSPF]D make use of these features to have IGP routes override EGP ones. Obtained from: OpenBSD (claudio@) MFC after: 3 days
* - Store pointer to the link-level address right in "struct ifnet"ru2005-11-111-9/+6
| | | | | | | | | | rather than in ifindex_table[]; all (except one) accesses are through ifp anyway. IF_LLADDR() works faster, and all (except one) ifaddr_byindex() users were converted to use ifp->if_addr. - Stop storing a (pointer to) Ethernet address in "struct arpcom", and drop the IFP2ENADDR() macro; all users have been converted to use IF_LLADDR() instead.
* Use sparse initializers for "struct domain" and "struct protosw",ru2005-11-091-8/+14
| | | | so they are easier to follow for the human being.
* Drop current rtentry lock before calling rt_getifa(). This fixes a LORglebius2005-09-191-3/+3
| | | | | | | and a possible recursive use of rtentry mutex. PR: kern/69356 Reviewed by: sam
* Protect interface and address lists using the appropriate mutex. Thesecsjp2005-09-101-16/+16
| | | | | | | | | | | | | | | | | | locks were not aquired because the user buffers were not wired, thus it was possible that that SYSCTL_OUT could sleep, causing a number of different problems such as lock ordering issues and dead locks. -Wire user supplied buffer to ensure SYSCTL_OUT will not sleep. -Pickup ifnet locks to protect the list. -Where applicable pickup address locks. -Pickup radix node head locks. -Remove splnet stubs -Remove various comments about locking here, because they are no longer needed. It is the hope that these changes will make sysctl_rtsock MP safe. MFC after: 3 weeks
* Forward declaring static variables as extern is invalid ISO-C. Now thatobrien2005-09-071-1/+1
| | | | GCC can properly handle forward static declarations, do this properly.
* De-spl parts of the routing socket code now generally protectedrwatson2005-08-251-40/+20
| | | | | | | | through locking; leave some spl references around code where there are open questions about global variable references. Also, add an XXX regarding locking in sysctl. MFC after: 3 days
* o To prevent a race between RTM_DELETE message andglebius2005-08-111-2/+4
| | | | | | | arptimer() deleting stale entry, we need to lock rtentry before unlocking radix head. Reviewed by: sam
* Rename IFF_RUNNING to IFF_DRV_RUNNING, IFF_OACTIVE to IFF_DRV_OACTIVE,rwatson2005-08-091-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | and move both flags from ifnet.if_flags to ifnet.if_drv_flags, making and documenting the locking of these flags the responsibility of the device driver, not the network stack. The flags for these two fields will be mutually exclusive so that they can be exposed to user space as though they were stored in the same variable. Provide #defines to provide the old names #ifndef _KERNEL, so that user applications (such as ifconfig) can use the old flag names. Using the old names in a device driver will result in a compile error in order to help device driver writers adopt the new model. When exposing the interface flags to user space, via interface ioctls or routing sockets, or the two fields together. Since the driver flags cannot currently be set for user space, no new logic is currently required to handle this case. Add some assertions that general purpose network stack routines, such as if_setflags(), are not improperly used on driver-owned flags. With this change, a large number of very minor network stack races are closed, subject to correct device driver locking. Most were likely never triggered. Driver sweep to follow; many thanks to pjd and bz for the line-by-line review they gave this patch. Reviewed by: pjd, bz MFC after: 7 days
* Fix for PR 82974. We were not checking that the route looked up ingnn2005-07-151-0/+19
| | | | | | | | | | | | | | the case of an RTM_CHANGE was specific, i.e. that it matched completely. This led to a route change of a non-existent route changing the default route as the radix code would simply back track to that point and hand that route back to the routing socket code. PR: 82974 Reviewed by: Tai-hwa Liang <avatar@mmlab.cse.yzu.edu.tw> Ben Kaduk <minimarmot@gmail.com> Bjoern A. Zeeb <bzeeb-lists@lists.zabbadoz.net> Obtained from: OpenBSD with modifications. MFC after: 2 weeks
* When returing an RTM_GET message through the routing socket fillharti2005-06-091-0/+2
| | | | | in the rtm_index field whenever we have an interface pointer. This is consistent with the RTM_GET messages returned by sysctl().
* rt_newaddrmsg will blow up if given something other than RTM_ADDsam2005-03-261-0/+3
| | | | | | | | or RTM_DELETE; add an assertion, may want to do something more heavyhanded in the future Noticed by: Coverity Prevent analysis tool Reviewed by: mdodd
* eliminate dead code and collapse the remaindersam2005-02-231-3/+1
| | | | | Noticed by: Coverity Prevent analysis tool Reviewed by: rwatson
* /* -> /*- for license, minor formatting changesimp2005-01-071-1/+1
|
* Initialize struct pr_userreqs in new/sparse style and fill in commonphk2004-11-081-5/+10
| | | | | | default elements in net_init_domain(). This makes it possible to grep these structures and see any bogosities.
* Add 802.11-specific events that are dispatched through the routing socket.sam2004-10-051-13/+66
| | | | | | | This really doesn't belong here but is preferred (for the moment) over adding yet another mechanism for sending msgs from the kernel to user apps. Reviewed by: imp
* Apply error and success logic consistently to the function netisr_queue() andandre2004-08-271-1/+1
| | | | | | | | | | | | | | | | | | its users. netisr_queue() now returns (0) on success and ERRNO on failure. At the moment ENXIO (netisr queue not functional) and ENOBUFS (netisr queue full) are supported. Previously it would return (1) on success but the return value of IF_HANDOFF() was interpreted wrongly and (0) was actually returned on success. Due to this schednetisr() was never called to kick the scheduling of the isr. However this was masked by other normal packets coming through netisr_dispatch() causing the dequeueing of waiting packets. PR: kern/70988 Found by: MOROHOSHI Akihiko <moro@remus.dti.ne.jp> MFC after: 3 days
* Fix a typo (attacked -> attached).roam2004-08-241-1/+1
| | | | Approved by: sam
* If a tunable for the routing socket netisr queue max is defined, allow itrwatson2004-08-211-1/+3
| | | | | to override the default value, rather than the default value overriding the tunable.
* Allow the size of the routing socket netisr queue to be configured usingrwatson2004-08-211-1/+6
| | | | | | | | | | the tunable or sysctl 'net.route.netisr_maxqlen'. Default the maximum depth to 256 rather than IFQ_MAXLEN due to the downsides of dropping routing messages. MT5 candidate. Discussed with: mdodd, mlaier, Vincent Jardin <jardin at 6wind.com>
* Use IFQ_SET_MAXLEN() to set the maximum queue depth of the routingrwatson2004-08-131-1/+1
| | | | | | socket netisr queue. Pointed out by: winter
* Be consistent and use bzero() instead of memset().bms2004-07-061-1/+1
|
* Introduce a netisr to deliver kernel-generated routing, avoidingrwatson2004-06-091-4/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | recursive entering of the socket code from the routing code: - Modify rt_dispatch() to bundle up the sockaddr family, if any, associated with a pending mbuf to dispatch to routing sockets, in an m_tag on the mbuf. - Allocate NETISR_ROUTE for use by routing sockets. - Introduce rtsintrq, an ifqueue to be used by the netisr, and introduce rts_input(), a function to unbundle the tagged sockaddr and inject the mbuf and address into raw_input(), which previously occurred in rt_dispatch(). - Introduce rts_init() to initialize rtsintrq, its mutex, and register the netisr. Perform this at the same point in system initialization as setup of the domains. This change introduces asynchrony between the generation of a pending routing socket message and delivery to sockets for use by userspace. It avoids socket->routing->rtsock->socket use and helps to avoid lock order reversals between the routing code and socket code (in particular, raw socket control blocks), as route locks are held over calls to rt_dispatch(). Reviewed by: "George V.Neville-Neil" <gnn@neville-neil.com> Conceptual head nod by: sam
* Zero the un-used portions of the struct sockaddr data before sendingcsjp2004-05-101-0/+1
| | | | | | | | | | | | | it back to userspace, so it does not break bind(2) on raw sockets in jails. Currently some processes, like traceroute(8) construct a routing request to determine its source address based on the destination. This sockaddr data is fed directly to bind(2). When bind calls ifa_ifwithaddr(9) to make sure the address exists on the interface, the comparison will fail causing bind(2) to return EADDRNOTAVAIL if the data wasnt zero'ed before initialization. Approved by: bmilekic (mentor)
* o Fix misindentation in the previous commit.maxim2004-05-031-4/+4
|
* Give jail(8) the feature to allow raw sockets from within abmilekic2004-04-261-2/+13
| | | | | | | | | | | | | | | | | | | | | jail, which is less restrictive but allows for more flexible jail usage (for those who are willing to make the sacrifice). The default is off, but allowing raw sockets within jails can now be accomplished by tuning security.jail.allow_raw_sockets to 1. Turning this on will allow you to use things like ping(8) or traceroute(8) from within a jail. The patch being committed is not identical to the patch in the PR. The committed version is more friendly to APIs which pjd is working on, so it should integrate into his work quite nicely. This change has also been presented and addressed on the freebsd-hackers mailing list. Submitted by: Christian S.J. Peron <maneo@bsdpro.com> PR: kern/65800
* More style and deobfuscation fixes.ru2004-04-191-4/+4
| | | | Submitted by: bde
* Style and code unobfuscation.ru2004-04-181-4/+4
|
* Fixed a bug from rev. 1.42: cast to a correct type.ru2004-04-181-2/+2
| | | | Submitted by: luigi
* + replace Bcmp/Bzero with 'the real thing' as in the rest of the file.luigi2004-04-181-3/+4
| | | | + remember to check and fix or explain a strange cast in route_output()
* Minor changes to improve code readability (no actual code changes):luigi2004-04-181-60/+63
| | | | | | | | + replace 0 with NULL where appropriate (not complete) + remove register declaration while there + add argument names to function prototypes to have a better idea of what they are used for + add 'const' qualifiers in 3 places
* misc cleanup in sysctl_ifmalist():luigi2004-04-171-27/+10
| | | | | | | | | | + remove a partly incorrect comment that i introduced in the last commit; + deal with the correct part of the above comment by cleaning up the updates of 'info' -- rti_addrs needd not to be updated, rti_info[RTAX_IFP] can be set once outside the loop. While at it, correct a few misspelling of NULL as 0, but there are way too many in this file, and i did not want to clutter the important part of this commit.
* Consistently use ifaddr_byindex() to access the link-level addressluigi2004-04-161-17/+21
| | | | | | | of an interface. No functional change. On passing, comment a likely bug in net/rtsock.c:sysctl_ifmalist() which, if confirmed, would deserve to be fixed and MFC'ed
* route.h: introduce a macro, SA_SIZE(struct sockaddr *) which returnsluigi2004-04-131-10/+5
| | | | | | | | | | the space occupied by a struct sockaddr when passed through a routing socket. Use it to replace the macro ROUNDUP(int), that does the same but is redefined by every file which uses it, courtesy of the School of Cut'n'Paste Programming(TM). (partial) userland changes to follow.
OpenPOWER on IntegriCloud