path: root/sys/net
Commit message | Author | Age | Files | Lines
* Complete the UDP tunneling of ICMP msgs to those protocols | rrs | 2016-04-28 | 1 | -1/+1
| interested in having tunneled UDP and finding out about the ICMP (tested by Michael Tuexen with SCTP.. soon to be using this feature).
| Differential Revision: http://reviews.freebsd.org/D5875
* radix_mpath: Don't dereference a NULL pointer in for loop iteration | cem | 2016-04-26 | 1 | -1/+1
| It seems rn_dupedkey may be NULL, because of the NULL check inside the loop. (Also, the rt gets assigned from rn_dupedkey and NULL checked at top of loop.)
| However, the for-loop update expression runs before the top-of-loop check and dereferences 'rt' unconditionally. Instead, NULL-check before dereferencing.
| If rn_dupedkey cannot in fact be NULL, or something else protects this, feel free to revert this and add an ASSERT of some kind instead.
| This was introduced in r191080 (2009) and moved around slightly in r293657.
| Reported by: Coverity
| CID: 1348482
| Sponsored by: EMC / Isilon Storage Division
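The hazard described in this commit can be sketched in isolation. The following is a minimal illustration, not the kernel code: `struct node`, its fields, and `pick()` are hypothetical stand-ins for the radix node chain and the weighted selection loop. The point is only that a for-loop update expression runs before the loop condition, so it needs its own NULL guard when the body may advance the pointer to NULL.

```c
#include <stddef.h>

/* Hypothetical stand-in for a chain of nodes sharing one key. */
struct node {
	struct node *dupedkey;	/* next node in the chain, or NULL */
	int weight;
};

/*
 * Walk the chain while the remaining weight covers the current node.
 * The loop body may set n to NULL; the update expression executes
 * before the condition is re-evaluated, so it must not dereference n
 * without checking it first.
 */
static const struct node *
pick(const struct node *head, int weight)
{
	const struct node *n = head;

	for (; n != NULL && weight >= n->weight;
	    weight -= (n == NULL) ? 0 : n->weight) {
		n = n->dupedkey;	/* may become NULL here */
	}
	return (n);
}
```

Without the `(n == NULL) ? 0 : ...` guard, running off the end of the chain would dereference a NULL pointer in the update expression even though the condition tests `n != NULL`.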
* sys: extend use of the howmany() macro when available. | pfg | 2016-04-26 | 1 | -1/+1
| We have a howmany() macro in the <sys/param.h> header that is convenient to re-use as it makes things easier to read.
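For readers unfamiliar with the macro, here is a small sketch of what howmany() computes: the number of y-sized units needed to hold x, rounding up. The fallback definition matches the usual <sys/param.h> form; `blocks_needed()` is a hypothetical helper added only for illustration.

```c
#include <stddef.h>

/* Fallback for systems where <sys/param.h> has not supplied it. */
#ifndef howmany
#define	howmany(x, y)	(((x) + ((y) - 1)) / (y))
#endif

/* Number of blksize-byte blocks needed to store nbytes bytes. */
static size_t
blocks_needed(size_t nbytes, size_t blksize)
{
	return (howmany(nbytes, blksize));
}
```

The macro replaces open-coded expressions such as `(n + size - 1) / size`, which is the readability gain the commit refers to.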
* sys: use our roundup2/rounddown2() macros when param.h is available. | pfg | 2016-04-21 | 1 | -1/+1
| rounddown2 tends to produce longer lines than the original code and when the code has a high indentation level it was not really advantageous to do the replacement.
| This tries to strike a balance between readability using the macros and flexibility of having the expressions, so not everything is converted.
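As a rough illustration of the two macros, the sketch below rounds a value to a power-of-two boundary with bit masks. The fallback definitions follow the common <sys/param.h> form and are only valid when the second argument is a power of two; `align_up()` and `align_down()` are hypothetical wrappers.

```c
#include <stdint.h>

/* Fallbacks; both require y to be a power of two. */
#ifndef rounddown2
#define	rounddown2(x, y)	((x) & ~((y) - 1))
#endif
#ifndef roundup2
#define	roundup2(x, y)		(((x) + ((y) - 1)) & ~((y) - 1))
#endif

/* Round x up to the next multiple of align (a power of two). */
static uintptr_t
align_up(uintptr_t x, uintptr_t align)
{
	return (roundup2(x, align));
}

/* Round x down to the previous multiple of align (a power of two). */
static uintptr_t
align_down(uintptr_t x, uintptr_t align)
{
	return (rounddown2(x, align));
}
```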
* Remove slightly used const values that can be replaced with nitems(). | pfg | 2016-04-21 | 1 | -4/+2
| Suggested by: jhb
* Add more fields from struct ifnet needed while debugging a kernel panic. | bz | 2016-04-20 | 1 | -1/+3
| Move if_fib into the right place.
| MFC after: 2 weeks
| Sponsored by: The FreeBSD Foundation
* radix rn_inithead: Fix minor leak in low memory conditions | cem | 2016-04-20 | 1 | -0/+2
| R_Zalloc is essentially a malloc(M_NOWAIT) wrapper. It is possible that 'rnh' fails to allocate but 'rmh' succeeds. In that case, we bail out of rn_inithead() but previously did not free 'rmh'.
| Introduced in r287073 (projects/routing) / MFP r294706.
| Reported by: Coverity
| CID: 1350258
| Sponsored by: EMC / Isilon Storage Division
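The leak pattern fixed here is generic: two non-blocking allocations are made, and the error path must release whichever one succeeded. A minimal userland sketch follows, assuming a hypothetical `limited_alloc()` that stands in for a malloc(M_NOWAIT)-style allocator which can fail; `struct heads` and `init_heads()` are likewise illustrative names, not the kernel's.

```c
#include <stdlib.h>

struct heads {
	void *rnh;
	void *rmh;
};

/* Test aid: succeeds for the first `budget` calls, then returns NULL. */
static int budget = 2;

static void *
limited_alloc(size_t size)
{
	if (budget <= 0)
		return (NULL);
	budget--;
	return (calloc(1, size));
}

/* Returns 1 on success, 0 on failure; never leaks a partial result. */
static int
init_heads(struct heads *h)
{
	void *rnh = limited_alloc(64);
	void *rmh = limited_alloc(64);

	if (rnh == NULL || rmh == NULL) {
		/* Either one may have succeeded; free(NULL) is a no-op. */
		free(rnh);
		free(rmh);
		return (0);
	}
	h->rnh = rnh;
	h->rmh = rmh;
	return (1);
}
```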
* bpf_getdltlist: Don't overrun 'lst' | cem | 2016-04-20 | 1 | -1/+1
| 'lst' is allocated with 'n1' members. 'n' indexes 'lst'. So 'n == n1' is an invalid 'lst' index.
| This is a follow-up to r296009.
| Reported by: Coverity
| CID: 1352743
| Sponsored by: EMC / Isilon Storage Division
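The off-by-one can be shown with a small standalone sketch. `collect()` below is a hypothetical helper, not the bpf code: it fills an array of `n1` slots and must reject `n == n1`, since valid indexes run from 0 to `n1 - 1`.

```c
#include <stddef.h>

/*
 * Copy nsrc values from src into lst, which has room for n1 entries.
 * Returns the number of entries stored, or -1 if src does not fit.
 */
static int
collect(const unsigned *src, size_t nsrc, unsigned *lst, size_t n1)
{
	size_t i, n = 0;

	for (i = 0; i < nsrc; i++) {
		/* A test of 'n > n1' would let n == n1 write past the end. */
		if (n >= n1)
			return (-1);
		lst[n++] = src[i];
	}
	return ((int)n);
}
```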
* kernel: use our nitems() macro when it is available through param.h. | pfg | 2016-04-19 | 1 | -1/+1
| No functional change, only trivial cases are done in this sweep.
| Discussed in: freebsd-current
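A short sketch of what nitems() replaces: the open-coded `sizeof(a) / sizeof(a[0])` idiom for counting the elements of a true array. The fallback definition mirrors the usual <sys/param.h> form; `dlt_table` and `table_len()` are hypothetical names used only for this example.

```c
#include <stddef.h>

/* Fallback for systems where <sys/param.h> has not supplied it. */
#ifndef nitems
#define	nitems(x)	(sizeof((x)) / sizeof((x)[0]))
#endif

/* Hypothetical table; only its length matters here. */
static const int dlt_table[] = { 1, 9, 12 };

/* Works only on real arrays, not on pointers to their first element. */
static size_t
table_len(void)
{
	return (nitems(dlt_table));
}
```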
* sys/net* : for pointers replace 0 with NULL. | pfg | 2016-04-15 | 14 | -50/+50
| Mostly cosmetic, no functional change.
| Found with devel/coccinelle.
* During if_vmove() we call if_detach_internal() which in turn calls the event | bz | 2016-04-11 | 3 | -0/+76
| handler notifying about interface departure and one of the consumers will detach if_bpf.
| There is no way for us to re-attach this easily as the DLT and hdrlen are only given on interface creation.
| Add a function to allow us to query the DLT and hdrlen from a current BPF attachment and after if_attach_internal() manually re-add the if_bpf attachment using these values.
| Found by panics triggered by nd6 packets running past BPF_MTAP() with no proper if_bpf pointer on the interface.
| Also add a basic DDB show function to investigate the if_bpf attachment of an interface.
| Reviewed by: gnn
| MFC after: 2 weeks
| Sponsored by: The FreeBSD Foundation
| Differential Revision: https://reviews.freebsd.org/D5896
* Cleanup unnecessary semicolons from the kernel. | pfg | 2016-04-10 | 2 | -3/+3
| Found with devel/coccinelle.
* Revert accidental submit of WIP as part of r297609 | rpokala | 2016-04-06 | 2 | -278/+39
| Pointyhat to: rpokala
* Storage Controller Interface driver - typo in unimplemented macro in | rpokala | 2016-04-06 | 2 | -39/+278
| scic_sds_controller_registers.h
| s/contoller/controller/
| PR: 207336
| Submitted by: Tony Narlock <tony @ git-pull.com>
* Remove an unneeded check. | jhb | 2016-04-05 | 1 | -3/+0
| CPUs with valid per-CPU data are not absent.
| Sponsored by: Netflix
* Catch up with some more fields. I needed the bpf one lately. | bz | 2016-03-31 | 1 | -0/+2
| Sponsored by: The FreeBSD Foundation
* Remove some NULL checks for M_WAITOK allocations. | trasz | 2016-03-29 | 1 | -2/+0
| MFC after: 1 month
| Sponsored by: The FreeBSD Foundation
* Add ethertype reserved for network testing | gnn | 2016-03-28 | 1 | -0/+1
| MFC after: 2 weeks
* Fix compile errors after r297225: | bz | 2016-03-24 | 2 | -2/+2
| - properly V_irtualise variable access unbreaking VIMAGE kernels.
| - remove the volatile from the function return type to make architectures using gcc happy [-Wreturn-type] "type qualifiers ignored on function return type"
| I am not entirely happy with this solution putting the u_int there but it will do for now.
* FreeBSD previously provided route caching for TCP (and UDP). Re-add | gnn | 2016-03-24 | 3 | -1/+35
| route caching for TCP, with some improvements. In particular, invalidate the route cache if a new route is added, which might be a better match. The cache is automatically invalidated if the old route is deleted.
| Submitted by: Mike Karels
| Reviewed by: gnn
| Differential Revision: https://reviews.freebsd.org/D4306
* buf_ring/drbr: Add buf_ring_peek_clear_sc and use it in drbr_peek | sephe | 2016-02-29 | 1 | -1/+1
| Unlike buf_ring_peek, it only supports single consumer mode, and it clears the cons_head if DEBUG_BUFRING/INVARIANTS is defined.
| The normal use case of drbr_peek for network drivers is:
|     m = drbr_peek(br);
|     err = hw_spec_encap(&m); /* could m_defrag/m_collapse */ (*)
|     if (err) {
|         if (m == NULL)
|             drbr_advance(br);
|         else
|             drbr_putback(br, m);
|         /* break the loop */
|     }
|     drbr_advance(br);
| The race is: if hw_spec_encap() calls m_defrag or m_collapse on the mbuf (i.e. the old mbuf is freed), or if, as in Hyper-V's network driver, transmission-done does not even require the TX lock, then on another CPU at the (*) time the freed mbuf could be recycled and passed to drbr_enqueue before the current CPU has had the chance to call drbr_{advance,putback}. This triggers a panic in the drbr_enqueue duplicated element check, if DEBUG_BUFRING/INVARIANTS is defined.
| Use buf_ring_peek_clear_sc() in drbr_peek() to fix the above race.
| This change is a NO-OP if neither DEBUG_BUFRING nor INVARIANTS is defined.
| MFC after: 1 week
| Sponsored by: Microsoft OSTC
| Differential Revision: https://reviews.freebsd.org/D5416
* In bpf_getdltlist(), do not call copyout(9) while holding bpf lock. | kib | 2016-02-24 | 1 | -7/+25
| Copy the data into a temporary malloced buffer and drop the lock for copyout.
| Reported, reviewed and tested by: cem
| Sponsored by: The FreeBSD Foundation
| MFC after: 1 week
* Fix regression introduced in r272446. | araujo | 2016-02-19 | 1 | -1/+1
| lagg(4) supports the protocol none, where it disables any traffic without disabling the lagg(4) interface itself.
| PR: 206921
| Submitted by: Pushkar Kothavade <pushkarbk@gmail.com>
| Reviewed by: rpokala
| Approved by: bapt (mentor)
| MFC after: 3 weeks
| Sponsored by: gandi.net
| Differential Revision: https://reviews.freebsd.org/D5076
* Merge SVN r295220 (bz) from projects/vnet/ | dteske | 2016-02-11 | 1 | -0/+14
| Fix a panic that occurs when a vnet interface is unavailable at the time the vnet jail referencing said interface is stopped.
| Sponsored by: FIS Global, Inc.
* These files were getting sys/malloc.h and vm/uma.h with header pollution | glebius | 2016-02-01 | 5 | -0/+6
| via sys/mbuf.h
* Provide TCPSTAT_DEC() and TCPSTAT_FETCH() macros. | glebius | 2016-01-27 | 1 | -0/+3
* Prune a definition which is / was never used. | zec | 2016-01-25 | 1 | -1/+0
* Fix flowtable part missed in r294706. | melifaro | 2016-01-25 | 1 | -1/+1
* MFP r287070,r287073: split radix implementation and route table structure. | melifaro | 2016-01-25 | 8 | -200/+332
| There are a number of radix consumers in kernel land (pf, ipfw, nfs, route) with different requirements. In fact, the first 3 don't have _any_ requirements and the first 2 do not use radix locking. On the other hand, the routing structure does have these requirements (rnh_gen, multipath, custom to-be-added control plane functions, different locking).
| Additionally, radix should not know anything about its consumers' internals.
| So, radix code now uses a tiny 'struct radix_head' structure along with an internal 'struct radix_mask_head' instead of 'struct radix_node_head'. Existing consumers still use the same 'struct radix_node_head' with slight modifications: they need to pass a pointer to the (embedded) 'struct radix_head' to all radix callbacks.
| Routing code now uses the new 'struct rib_head' with different locking macros: the RADIX_NODE_HEAD prefix was renamed to RIB_ (which stands for routing information base).
| New net/route_var.h header was added to hold routing subsystem internal data. 'struct rib_head' was placed there. 'struct rtentry' will also be moved there soon.
* Remove unused radix_mpath definitions. | melifaro | 2016-01-25 | 1 | -3/+0
* Add an IOCTL rr_limit to let users fine-tune the number of packets to be | araujo | 2016-01-23 | 2 | -2/+26
| sent using roundrobin protocol and set a better granularity and distribution among the interfaces.
| Tuning the number of packets sent by interface can increase throughput and reduce unordered packets as well as reduce SACK.
| Example of usage:
|     # ifconfig bge0 up
|     # ifconfig bge1 up
|     # ifconfig lagg0 create
|     # ifconfig lagg0 laggproto roundrobin laggport bge0 laggport bge1 \
|         192.168.1.1 netmask 255.255.255.0
|     # ifconfig lagg0 rr_limit 500
| Reviewed by: thompsa, glebius, adrian (old patch)
| Approved by: bapt (mentor)
| Relnotes: Yes
| Differential Revision: https://reviews.freebsd.org/D540
* Clean up original route path selection logic a bit. | melifaro | 2016-01-15 | 1 | -5/+6
| NULL pointer dereference claimed by Coverity was possible if one (or several) next-hops had their weights set to 0.
| CID: 1348482
* Fix panic in IP redirect. Panic was introduced in r293466. | melifaro | 2016-01-14 | 1 | -2/+2
| Found by: Yamagi Burmeister <lists at yamagi.org>
* Remove now-unused wrappers for various routing functions. | melifaro | 2016-01-14 | 2 | -72/+0
* Remove RTF_RNH_LOCKED support from rtalloc1_fib(). | melifaro | 2016-01-13 | 2 | -17/+6
| Last caller using it was eliminated in r293471.
| Sponsored by: Yandex LLC
* Bring RADIX_MPATH support to new routing KPI to ease migration. | melifaro | 2016-01-11 | 2 | -21/+41
| Move actual rte selection process from rtalloc_mpath_fib() to the rt_path_selectrte() function. Add public rt_mpath_select() to use in fibX_lookup_ functions.
* Do not rewrite all ro_flags. | melifaro | 2016-01-11 | 1 | -1/+1
* Fix userland build broken by r293470. | melifaro | 2016-01-09 | 1 | -0/+2
| Pointy hat to: melifaro
* Finish r275196: do not dereference rtentry in if_output() routines. | melifaro | 2016-01-09 | 7 | -31/+31
| The only piece of information that is required is a subset of rt_flags.
| In particular, if_loop() requires the RTF_REJECT and RTF_BLACKHOLE flags to check if this particular mbuf needs to be dropped (and what error should be returned). Note that if_loop() will always return EHOSTUNREACH for "reject" routes regardless of RTF_HOST flag existence. This is due to upcoming routing changes where the RTF_HOST value won't be available as a lookup result.
| All other functions require the RTF_GATEWAY flag to check if they need to return EHOSTUNREACH instead of the EHOSTDOWN error.
| There are 11 places where a non-zero 'struct route' is passed to if_output(). Most of the callers (forwarding, bpf, arp) do not care about the exact error value. In fact, the only place where this result is propagated is ip_output(). (ip6_output() passes a NULL route to nd6_output_ifp()).
| Given that, add 3 new 'struct route' flags (RT_REJECT, RT_BLACKHOLE and RT_IS_GW) and an inline function (rt_update_ro_flags()) to copy the necessary rte flags to ro_flags. Call this function in ip_output() after looking up/verifying the rte.
| Reviewed by: ae
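The flag-copying idea in this commit can be sketched roughly as follows. This is not the kernel's rt_update_ro_flags(): the function name `update_ro_flags()`, the `X_` prefixed constants, and their bit values are all illustrative assumptions (the real definitions live in net/route.h). The sketch only shows translating three route-entry flags into three cached route flags while leaving unrelated cached bits alone.

```c
/* Illustrative bit values only; not taken from net/route.h. */
#define	X_RTF_GATEWAY	0x0002u
#define	X_RTF_REJECT	0x0008u
#define	X_RTF_BLACKHOLE	0x1000u

#define	X_RT_REJECT	0x0020u
#define	X_RT_BLACKHOLE	0x0040u
#define	X_RT_IS_GW	0x0080u

/*
 * Return ro_flags with the three mirrored bits recomputed from
 * rt_flags; every other bit of ro_flags is preserved.
 */
static unsigned
update_ro_flags(unsigned ro_flags, unsigned rt_flags)
{
	ro_flags &= ~(X_RT_REJECT | X_RT_BLACKHOLE | X_RT_IS_GW);
	if (rt_flags & X_RTF_REJECT)
		ro_flags |= X_RT_REJECT;
	if (rt_flags & X_RTF_BLACKHOLE)
		ro_flags |= X_RT_BLACKHOLE;
	if (rt_flags & X_RTF_GATEWAY)
		ro_flags |= X_RT_IS_GW;
	return (ro_flags);
}
```

With the flags cached this way, an output routine can decide between dropping, EHOSTUNREACH and EHOSTDOWN without touching the route entry itself.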
* Remove sys/eventhandler.h from net/route.h | melifaro | 2016-01-09 | 1 | -1/+0
| Reviewed by: ae
* (Temporarily) remove route_redirect_event eventhandler. | melifaro | 2016-01-09 | 2 | -15/+2
| Such a handler should pass a different set of variables, instead of directly providing 2 locked route entries. Given that it hasn't been really used since at least 2012, remove the current code.
| Will re-add it after finishing most major routing-related changes.
| Discussed with: np
* Please Coverity by removing unnecessary check (rt_key() is always set). | melifaro | 2016-01-09 | 1 | -1/+1
| Coverity CID: 1347797
* Do more fine-grained locking in rtrequest1_fib(). | melifaro | 2016-01-08 | 1 | -30/+22
| Last consumer using RTF_RNH_LOCKED flag was eliminated in r291643.
| Restrict passing RTF_RNH_LOCKED to rtrequest1_fib() and do better locking for RTM_ADD / RTM_DELETE cases.
* Add rib_lookup_info() to provide API for retrieving individual route | melifaro | 2016-01-04 | 3 | -11/+165
| entries data in unified format.
| There are control plane functions that require information other than just next-hop data (e.g. individual rtentry fields like flags or prefix/mask). Given that the goal is to avoid rte reference/refcounting, re-use the rt_addrinfo structure to store most rte fields. If the caller wants to retrieve key/mask or gateway (which are sockaddrs and are allocated separately), it needs to provide sufficiently sized sockaddr structures with their pointers saved in the passed rt_addrinfo.
| Convert:
| * lltable new records checks (in_lltable_rtcheck(), nd6_is_new_addr_neighbor()).
| * rtsock pre-add/change route check.
| * IPv6 NS ND-proxy check (RADIX_MPATH code was eliminated because 1) we don't support RTF_ANNOUNCE ND-proxy for networks and there should not be multiple host routes for such hosts 2) if we have multiple routes we should inspect them (which is not done). 3) the entire idea of abusing KRT as storage for ND proxy seems odd. Userland programs should be used for that purpose).
* Handle IPV6_PATHMTU option by splitting ip6_getpmtu_ctl() from ip6_getpmtu(). | melifaro | 2016-01-03 | 1 | -4/+5
| Add ro_mtu field to 'struct route' to be able to pass lookup MTU back to the caller.
| Currently, ip6_getpmtu() has 2 totally different use cases:
| 1) control plane (IPV6_PATHMTU req), where we just need to calculate MTU and return it, w/o any reusability.
| 2) Actual ip6_output() data path where we (nearly) always use the provided route lookup data. If this data is not 'valid' we need to perform another lookup and save the result (which cannot be re-used by ip6_output()).
| Given that, handle 1) by calling a separate function doing the rte lookup itself. The resulting MTU is calculated by the (newly-added) ip6_calcmtu() used by both ip6_getpmtu_ctl() and ip6_getpmtu().
| For 2) instead of storing a ref'ed rte, store the mtu (the only needed data from the lookup result) inside the newly-added ro_mtu field.
| 'struct route' was shrunk by 8 (or 4) bytes in r292978. Grow it again by 4 bytes. The new ro_mtu field will be used in other places like ip/tcp_output (EMSGSIZE handling from output routines).
| Reviewed by: ae
* Remove second EVENTHANDLER_REGISTER slipped in r292978. | melifaro | 2016-01-01 | 1 | -0/+4
| Describe the reason of doing unconditional M_PREPEND in ether_output().
* Clean up unused-but-set-variable spotted by gcc4.9. | araujo | 2015-12-31 | 1 | -2/+0
| Reviewed by: ngie
| Approved by: rodrigc (mentor)
| Differential Revision: https://reviews.freebsd.org/D4719
* Implement interface link header precomputation API. | melifaro | 2015-12-31 | 8 | -141/+402
| Add if_requestencap() interface method which is capable of calculating various link headers for a given interface. Right now there is support for INET/INET6/ARP llheader calculation (IFENCAP_LL type request). Other types are planned to support more complex calculation (L2 multipath lagg nexthops, tunnel encap nexthops, etc..).
| Reshape 'struct route' to be able to pass additional data (with its length) to prepend to the mbuf.
| These two changes permit routing code to pass pre-calculated nexthop data (like L2 header for route w/gateway) down to the stack, eliminating the need for other lookups. It also brings us closer to more complex scenarios like transparently handling MPLS nexthops and tunnel interfaces. Last, but not least, it removes the layering violation introduced by flowtable code (ro_lle) and simplifies handling of existing if_output consumers.
| ARP/ND changes:
| Make arp/ndp stack pre-calculate link header upon installing/updating lle record. Interface link address changes are handled by re-calculating headers for all lles based on if_lladdr event. After these changes, arpresolve()/nd6_resolve() returns the full pre-calculated header for supported interfaces, thus simplifying if_output(). Move these lookups to a separate ether_resolve_addr() function which either returns an error or a fully-prepared link header. Add <arp|nd6_>resolve_addr() compat versions to return link addresses instead of pre-calculated data.
| BPF changes:
| Raw bpf writes occupied _two_ cases: AF_UNSPEC and pseudo_AF_HDRCMPLT. Despite the naming, both of them have the header "complete". The only difference is that the interface source mac has to be filled by the OS for AF_UNSPEC (controlled via BIOCGHDRCMPLT). This logic has to stay inside BPF and not pollute if_output() routines. Convert BPF to pass prepend data via the new 'struct route' mechanism.
| Note that it does not change non-optimized if_output(): ro_prepend handling is purely optional.
| Side note: hackish pseudo_AF_HDRCMPLT is supported for ethernet and FDDI. It is not needed for ethernet anymore. The only remaining FDDI user is dev/pdq, mostly untouched since 2007. FDDI support was eliminated from OpenBSD in 2013 (sys/net/if_fddisubr.c rev 1.65).
| Flowtable changes:
| Flowtable violates layering by saving (and not correctly managing) rtes/lles. Instead of passing the lle pointer, pass a pointer to pre-calculated header data from that lle.
| Differential Revision: https://reviews.freebsd.org/D4102
* Wrap using #ifdef 'notyet' those variables and statements not yet | araujo | 2015-12-31 | 1 | -4/+18
| implemented to lower the compiler warnings. It fixes the case of unused-but-set-variable spotted by gcc4.9.
| Reviewed by: ngie, ae
| Approved by: bapt (mentor)
| Differential Revision: https://reviews.freebsd.org/D4720
* Add SFF-8024 Extended Specification Compliance | melifaro | 2015-12-28 | 1 | -1/+1
| Submitted by: markb_mellanox.com
| MFC after: 2 weeks
| Differential Revision: https://reviews.freebsd.org/D4666