path: root/sys/net/if_bridge.c
Commit message (Author, Age, Files, Lines)
* Merge remote-tracking branch 'origin/stable/11' into devel-11 (Luiz Souza, 2018-04-30, 1 file, -7/+8)
|\
| * MFC r331436: (kp, 2018-04-15, 1 file, -7/+8)
| |     netpfil: Introduce PFIL_FWD flag
| |
| |     Forwarded packets passed through PFIL_OUT, which made it difficult for firewalls to figure out if they were forwarding or producing packets. This in turn is an issue for pf for IPv6 fragment handling: it needs to call ip6_output() or ip6_forward() to handle the fragments. Figuring out which was difficult (and until now, incorrect). Having pfil distinguish the two removes an ugly piece of code from pf.
| |
| |     Introduce a new variant of the netpfil callbacks with a flags variable, which has PFIL_FWD set for forwarded packets. This allows pf to reliably work out if a packet is forwarded.
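The idea of a flags variable that marks forwarded packets can be sketched in plain C. This is only an illustration under stated assumptions: the `SK_` names, the flag values, and `sk_choose_path()` are hypothetical and are not taken from the pfil(9) headers or from pf.

```c
/* Illustrative flag values for this sketch only; they are assumptions,
 * not copied from the kernel headers. */
#define SK_PFIL_IN   0x1
#define SK_PFIL_OUT  0x2
#define SK_PFIL_FWD  0x8

/* Which path a hook would use to re-emit fragments (hypothetical). */
enum sk_path { SK_PATH_NONE, SK_PATH_OUTPUT, SK_PATH_FORWARD };

/*
 * Pick the re-emit path for a packet seen by an outbound hook.
 * Without the forward flag, locally produced and forwarded packets
 * both arrive with the "out" direction and cannot be told apart.
 */
enum sk_path
sk_choose_path(int dir, int flags)
{
    if (dir != SK_PFIL_OUT)
        return (SK_PATH_NONE);
    if (flags & SK_PFIL_FWD)
        return (SK_PATH_FORWARD);   /* analogous to ip6_forward() */
    return (SK_PATH_OUTPUT);        /* analogous to ip6_output() */
}
```

The point of the design is that the decision becomes a single flag test instead of a guess based on packet contents.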
| * MFC r323864 (kp, 2017-09-30, 1 file, -0/+1)
| |     bridge: Set module version
| |
| |     This ensures that the loader will not load the module if it's also built in to the kernel.
| |
| |     PR: 220860
| |     Submitted by: Eugene Grosbein <eugen@freebsd.org>
* | Convert if_bridge.c back to if_start() to re-add the ALTQ support. (Luiz Souza, 2017-10-30, 1 file, -32/+24)
| |     Ticket #7936
| |     (cherry picked from commit 50319f09eeed57085eb665ce6f82de3c12ba18ee)
* | Remove excess CTLFLAG_VNET (bdrewery, 2017-10-27, 1 file, -1/+1)
| |     Sponsored by: Dell EMC Isilon
| |     (cherry picked from commit 271abc089d73da5713a474e89b4150bf6f14326c)
* | bridge: Set module version (kp, 2017-10-27, 1 file, -0/+1)
| |     This ensures that the loader will not load the module if it's also built in to the kernel.
| |
| |     PR: 220860
| |     Submitted by: Eugene Grosbein <eugen@freebsd.org>
| |     Reported by: Marie Helene Kvello-Aune <marieheleneka@gmail.com>
| |     (cherry picked from commit deed61f436ecad351fabada7d9d0a80d9cd37b25)
* | Make if_bridge complain if it can't disable some capabilities. (mav, 2017-10-27, 1 file, -2/+6)
| |     MFC after: 2 weeks
| |     Sponsored by: iXsystems, Inc.
| |     (cherry picked from commit 08f5e30a7cf6b3be2f5b82b2780940b8299cd1ea)
* | Merge remote-tracking branch 'origin/stable/11' into devel-11 (Luiz Otavio O Souza, 2017-02-09, 1 file, -0/+6)
|\ \
| |/
| * MFC 312782 (kp, 2017-02-01, 1 file, -0/+6)
| |     bridge: Release the bridge lock when calling bridge_set_ifcap()
| |
| |     This calls ioctl() handlers for the different interfaces in the bridge. These handlers expect to get called in an ioctl context where it's safe for them to sleep. We may not sleep with the bridge lock held.
| |
| |     However, we still need to protect the interface list, to ensure it doesn't get changed while we iterate over it. Use BRIDGE_XLOCK(), which prevents bridge members from being removed. Adding bridge members is safe, because it uses LIST_INSERT_HEAD().
| |
| |     This caused panics when adding xen interfaces to a bridge.
| |
| |     PR: 216304
| |     Reviewed by: ae
| |     Sponsored by: RootBSD
| |     Differential Revision: https://reviews.freebsd.org/D9290
* | Merge remote-tracking branch 'origin/stable/11' into devel-11 (Renato Botelho, 2016-10-06, 1 file, -31/+55)
|\ \
| |/
| * MFC r306289: (kp, 2016-10-02, 1 file, -31/+55)
| |     bridge: Fix fragment handling and memory leak
| |
| |     Fragmented UDP and ICMP packets were corrupted if a firewall with a reassembling feature (like pf's scrub) was enabled on the bridge. This patch fixes the packet corruption and the panic (triggered easily with low RAM) as explained in PR 185633.
| |
| |     bridge_pfil and bridge_fragment relationship:
| |
| |     bridge_pfil() receives (IN direction) packets and sends them to the firewall. The firewall can be configured to reassemble fragmented packets (like pf's scrubbing) into one mbuf chain. When bridge_pfil() needs to send this reassembled packet to the outgoing interface, it needs to re-fragment it by using bridge_fragment().
| |
| |     bridge_fragment() had to split this mbuf (using ip_fragment) first, then had to M_PREPEND each packet in the mbuf chain to add the Ethernet header. But M_PREPEND can sometimes create a new mbuf at the beginning of the mbuf chain, in which case the "main" pointer of this mbuf chain should be updated, and this case was totally forgotten.
| |
| |     The original bridge_fragment code (Revision 158140, 2006 April 29) came from OpenBSD, and the call to bridge_enqueue was embedded. But on FreeBSD, bridge_enqueue() is done after bridge_fragment(), so the original OpenBSD code can't work as-is on FreeBSD.
| |
| |     PR: 185633
| |     Submitted by: Olivier Cochard-Labbé
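The forgotten head-pointer update described above can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the kernel code: `struct sk_mbuf`, `sk_prepend()` and `sk_add_headers()` are hypothetical stand-ins for mbufs, M_PREPEND and the fragment loop.

```c
#include <stdlib.h>

/* Toy stand-in for an mbuf; all names here are hypothetical. */
struct sk_mbuf {
    struct sk_mbuf *next;     /* next buffer of the same packet */
    struct sk_mbuf *nextpkt;  /* next packet (valid on a packet's first buffer) */
    int headroom;             /* free bytes in front of the data */
    int len;                  /* bytes of data in this buffer */
};

/*
 * Make room for hdrlen bytes in front of packet m. If there is no
 * headroom, a new first buffer is allocated and returned, so the
 * caller must not keep using its old pointer. Returns NULL on
 * allocation failure.
 */
static struct sk_mbuf *
sk_prepend(struct sk_mbuf *m, int hdrlen)
{
    struct sk_mbuf *n;

    if (m->headroom >= hdrlen) {
        m->headroom -= hdrlen;
        m->len += hdrlen;
        return (m);
    }
    n = calloc(1, sizeof(*n));
    if (n == NULL)
        return (NULL);
    n->len = hdrlen;
    n->next = m;
    n->nextpkt = m->nextpkt;  /* the new buffer now leads the packet */
    m->nextpkt = NULL;
    return (n);
}

/*
 * Add a header to every packet in the list. *headp and each nextpkt
 * link are rewritten whenever sk_prepend() hands back a new first
 * buffer. Returns 0 on success, -1 on allocation failure.
 */
int
sk_add_headers(struct sk_mbuf **headp, int hdrlen)
{
    struct sk_mbuf **linkp = headp;

    while (*linkp != NULL) {
        struct sk_mbuf *m = sk_prepend(*linkp, hdrlen);

        if (m == NULL)
            return (-1);
        *linkp = m;           /* the update the commit says was missing */
        linkp = &m->nextpkt;
    }
    return (0);
}
```

Walking the list through a pointer-to-link means the head and the interior links are updated by the same assignment, which is one way to avoid treating the first packet as a special case.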
* | Revert "bridge: Fix fragment handling and memory leak" (Renato Botelho, 2016-10-06, 1 file, -55/+31)
| |     This reverts commit 7a332daf8ec44bea09dd38a4026773c41caf9758.
* | bridge: Fix fragment handling and memory leak (kp, 2016-09-26, 1 file, -31/+55)
| |     Fragmented UDP and ICMP packets were corrupted if a firewall with a reassembling feature (like pf's scrub) was enabled on the bridge. This patch fixes the packet corruption and the panic (triggered easily with low RAM) as explained in PR 185633.
| |
| |     bridge_pfil and bridge_fragment relationship:
| |
| |     bridge_pfil() receives (IN direction) packets and sends them to the firewall. The firewall can be configured to reassemble fragmented packets (like pf's scrubbing) into one mbuf chain. When bridge_pfil() needs to send this reassembled packet to the outgoing interface, it needs to re-fragment it by using bridge_fragment().
| |
| |     bridge_fragment() had to split this mbuf (using ip_fragment) first, then had to M_PREPEND each packet in the mbuf chain to add the Ethernet header. But M_PREPEND can sometimes create a new mbuf at the beginning of the mbuf chain, in which case the "main" pointer of this mbuf chain should be updated, and this case was totally forgotten.
| |
| |     The original bridge_fragment code (Revision 158140, 2006 April 29) came from OpenBSD, and the call to bridge_enqueue was embedded. But on FreeBSD, bridge_enqueue() is done after bridge_fragment(), so the original OpenBSD code can't work as-is on FreeBSD.
| |
| |     PR: 185633
| |     Submitted by: Olivier Cochard-Labbé
| |     Differential Revision: https://reviews.freebsd.org/D7780
| |     (cherry picked from commit a8a1202774e288fb88de8422397f7ff398f7e3fb)
* | Merge remote-tracking branch 'origin/stable/11' into devel-11 (Renato Botelho, 2016-08-25, 1 file, -1/+2)
|\ \
| |/
| * MFC r303009: Negotiate/disable TXCSUM_IPV6 same as TXCSUM. (mav, 2016-08-18, 1 file, -1/+2)
| |
* | Merge remote-tracking branch 'origin/master' into devel-11 (Luiz Otavio O Souza, 2016-07-02, 1 file, -1/+1)
|\ \
| |/
| * Get closer to a VIMAGE network stack teardown from top to bottom rather (bz, 2016-06-21, 1 file, -1/+1)
| |     than removing the network interfaces first. This change is rather larger and convoluted as the ordering requirements cannot be separated.
| |
| |     Move the pfil(9) framework to SI_SUB_PROTO_PFIL, move Firewalls and related modules to their own SI_SUB_PROTO_FIREWALL. Move initialization of "physical" interfaces to SI_SUB_DRIVERS, move virtual (cloned) interfaces to SI_SUB_PSEUDO. Move Multicast to SI_SUB_PROTO_MC.
| |
| |     Re-work parts of multicast initialisation and teardown, not taking the huge amount of memory into account if used as a module yet.
| |
| |     For interface teardown we try to do as many of them as we can on SI_SUB_INIT_IF, but for some this makes no sense, e.g., when tunnelling over a higher layer protocol such as IP. In that case the interface has to go along with (or before) the higher layer protocol being shut down.
| |
| |     Kernel hhooks need to go last on teardown as they may be used at various higher layers and we cannot remove them before we have cleaned up the higher layers.
| |
| |     For interface teardown there are multiple paths: (a) a cloned interface is destroyed (inside a VIMAGE or in the base system), (b) any interface is moved from a virtual network stack to a different network stack ("vmove"), or (c) a virtual network stack is being shut down. All code paths go through if_detach_internal() where we, depending on the vmove flag or the vnet state, make a decision on how much to shut down; in case we are destroying a VNET the individual protocol layers will clean up their own parts, thus we cannot do so again for each interface as we end up with, e.g., double-frees, destroying locks twice or acquiring already destroyed locks. When calling into protocol cleanups we equally have to tell them whether they need to detach upper layer protocols ("ulp") or not (e.g., in6_ifdetach()).
| |
| |     Provide or enhance helper functions to do proper cleanup at a protocol rather than at an interface level.
| |
| |     Approved by: re (hrs)
| |     Obtained from: projects/vnet
| |     Reviewed by: gnn, jhb
| |     Sponsored by: The FreeBSD Foundation
| |     MFC after: 2 weeks
| |     Differential Revision: https://reviews.freebsd.org/D6747
* | Merge remote-tracking branch 'origin/master' into devel-11 (Renato Botelho, 2016-05-09, 1 file, -1/+1)
|\ \
| |/
| * sys/net*: minor spelling fixes. (pfg, 2016-05-03, 1 file, -1/+1)
| |     No functional change.
* | Merge remote-tracking branch 'origin/master' into devel-11 (Renato Botelho, 2016-04-18, 1 file, -2/+2)
|\ \
| |/
| * sys/net*: for pointers replace 0 with NULL. (pfg, 2016-04-15, 1 file, -2/+2)
| |     Mostly cosmetic, no functional change.
| |
| |     Found with devel/coccinelle.
* | Importing pfSense patch if_bridge_gif_mtu.diff (Luiz Otavio O Souza, 2016-04-15, 1 file, -6/+10)
|/
* Fix panic when adding vtnet interfaces to a bridge (kp, 2015-06-13, 1 file, -5/+6)
|     vtnet interfaces are always in promiscuous mode (at least if the VIRTIO_NET_F_CTRL_RX feature is not negotiated with the host). if_promisc() on a vtnet interface returned ENOTSUP although it has IFF_PROMISC set. This confused the bridge code. Instead we now accept all enable/disable promiscuous commands (and always keep IFF_PROMISC set).
|
|     There are also two issues with the if_bridge error handling.
|
|     If if_promisc() fails it uses bridge_delete_member() to clean up. This tries to disable promiscuous mode on the interface. That runs into an assert, because promiscuous mode was never set in the first place. (That's the panic reported in PR 200210.) We can only unset promiscuous mode if the interface actually is promiscuous. This goes against the reference counting done by if_promisc(), but only the first/last if_promisc() calls can actually fail, so this is safe.
|
|     A second issue is a double free of bif. It's already freed by bridge_delete_member().
|
|     PR: 200210
|     Differential Revision: https://reviews.freebsd.org/D2804
|     Reviewed by: philip (mentor)
* Fix a panic when VIMAGE is enabled. (hrs, 2015-05-12, 1 file, -0/+2)
|     Spotted by: Nikos Vassiliadis
* Fix a panic when tearing down a vnet on a VIMAGE-enabled kernel. (hrs, 2015-02-14, 1 file, -2/+9)
|     There was a race in which bridge_ifdetach() could be called via the ifnet_departure event handler after vnet_bridge_uninit().
|
|     PR: 195859
|     Reported by: Danilo Egea Gondolfo
* Remove struct arpcom. It is unused by most interface types, that allocate (glebius, 2014-11-07, 1 file, -2/+2)
|     it, except Ethernet, where it carried ng_ether(4) pointer. For now carry the pointer in if_l2com directly.
|
|     Sponsored by: Netflix
|     Sponsored by: Nginx, Inc.
* Remove SYSCTL_VNET_* macros, and simply put CTLFLAG_VNET where needed. (glebius, 2014-11-07, 1 file, -2/+2)
|     Sponsored by: Nginx, Inc.
* Virtualize if_bridge(4) cloner. (hrs, 2014-10-05, 1 file, -72/+126)
|
* Mechanically convert to if_inc_counter(). (glebius, 2014-09-19, 1 file, -16/+16)
|
* Pull in r267961 and r267973 again. Fix for issues reported will follow. (hselasky, 2014-06-28, 1 file, -14/+7)
|
* Revert r267961, r267973: (gjb, 2014-06-27, 1 file, -7/+14)
|     These changes prevent sysctl(8) from returning proper output, such as:
|
|     1) no output from sysctl(8)
|     2) erroneously returning ENOMEM with tools like truss(1) or uname(1)
|        truss: can not get etype: Cannot allocate memory
* Extend the meaning of the CTLFLAG_TUN flag to automatically check if (hselasky, 2014-06-27, 1 file, -14/+7)
|     there is an environment variable which shall initialize the SYSCTL during early boot. This works for all SYSCTL types, both statically and dynamically created ones, except for the SYSCTL NODE type and SYSCTLs which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to be used in the case a tunable sysctl has a custom initialisation function, allowing the sysctl to still be marked as a tunable. The kernel SYSCTL API is mostly the same, with a few exceptions for some special operations like iterating children of a static/extern SYSCTL node. This operation should probably be made into a factored out common macro, since some device drivers use this. The reason for changing the SYSCTL API was the need for a SYSCTL parent OID pointer and not only the SYSCTL parent OID list pointer in order to quickly generate the sysctl path. The motivation behind this patch is to avoid parameter loading kludges inside the OFED driver subsystem. Instead of adding special code to the OFED driver subsystem to post-load tunables into dynamically created sysctls, we generalize this in the kernel.
|
|     Other changes:
|     - Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask" to "hw.pcic.intr_mask".
|     - Removed redundant TUNABLE statements throughout the kernel.
|     - Some minor code rewrites in connection to removing not needed TUNABLE statements.
|     - Added a missing SYSCTL_DECL().
|     - Wrapped two very long lines.
|     - Avoid malloc()/free() inside sysctl string handling, in case it is called to initialize a sysctl from a tunable, since malloc()/free() is not ready when sysctls from the sysctl dataset are registered.
|     - Bumped FreeBSD version to indicate SYSCTL API change.
|
|     MFC after: 2 weeks
|     Sponsored by: Mellanox Technologies
* Since 32-bit if_baudrate isn't enough to describe a baud rate of a 10 Gbit (glebius, 2014-03-13, 1 file, -2/+2)
|     interface, in r241616 a crutch was provided. It didn't work well, and finally we decided that it is time to break ABI and simply make if_baudrate a 64-bit value. Meanwhile, the entire struct if_data was reviewed.
|
|     o Remove the if_baudrate_pf crutch.
|     o Make all fields of struct if_data fixed machine independent size. The notion of data (packet counters, etc) is by no means MD. And it is a bug that on amd64 we've got 64-bit counters, while on i386 32-bit, which at modern speeds overflow within a second. This also removes quite a lot of COMPAT_FREEBSD32 code.
|     o Give 16 bits for the ifi_datalen field. This field was provided to make future changes to if_data less ABI breaking. Unfortunately the 8 bit size of it had effectively limited sizeof if_data to 256 bytes.
|     o Give 32 bits to ifi_mtu and ifi_metric.
|     o Give 64 bits to the rest of fields, since they are counters.
|
|     __FreeBSD_version bumped.
|
|     Discussed with: emax
|     Sponsored by: Netflix
|     Sponsored by: Nginx, Inc.
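The width argument above can be checked with a little arithmetic. The helpers below are a hypothetical user-space sketch (the `sk_` names are not kernel functions): one tests whether a line rate fits in an unsigned 32-bit field, the other estimates how long a 32-bit byte counter lasts at full line rate.

```c
#include <stdint.h>

/* Nonzero if a rate in bit/s can be stored in an unsigned 32-bit field. */
int
sk_fits_u32(uint64_t bits_per_sec)
{
    return (bits_per_sec <= UINT32_MAX);
}

/*
 * Seconds until a 32-bit byte counter wraps when the link runs at
 * the given rate continuously: 2^32 bytes divided by bytes per second.
 */
double
sk_wrap_seconds_u32(uint64_t bits_per_sec)
{
    double bytes_per_sec = (double)bits_per_sec / 8.0;

    return (4294967296.0 / bytes_per_sec);
}
```

Under these assumptions a 10 Gbit/s rate (10,000,000,000) does not fit in 32 bits, and a 32-bit byte counter at that rate wraps after roughly 3.4 seconds; at 40 Gbit/s the same counter wraps in under a second.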
* Include necessary headers that now are available due to pollution (glebius, 2013-10-28, 1 file, -0/+1)
|     via if_var.h.
|
|     Sponsored by: Netflix
|     Sponsored by: Nginx, Inc.
* - Relax the restriction on the member interfaces with LLAs. Two or more (hrs, 2013-07-28, 1 file, -27/+8)
|       LLAs on the member interfaces are actually harmless when the parent interface does not have a LLA.
|     - Add net.link.bridge.allow_llz_overlap. This is a knob to allow LLAs on a bridge and the member interfaces at the same time. The default is 0.
|
|     Pointed out by: ume
|     MFC after: 3 days
* Fix a compiler warning. (hrs, 2013-07-03, 1 file, -0/+1)
|     MFC after: 1 week
* - Allow ND6_IFF_AUTO_LINKLOCAL for IFT_BRIDGE. An interface with IFT_BRIDGE (hrs, 2013-07-02, 1 file, -18/+83)
|       is initialized with !ND6_IFF_AUTO_LINKLOCAL && !ND6_IFF_ACCEPT_RTADV regardless of net.inet6.ip6.accept_rtadv and net.inet6.ip6.auto_linklocal. To configure an autoconfigured link-local address (RFC 4862), the following rc.conf(5) configuration can be used:
|
|       ifconfig_bridge0_ipv6="inet6 auto_linklocal"
|
|     - if_bridge(4) now removes IPv6 addresses on a member interface to be added when the parent interface or one of the existing member interfaces has an IPv6 address. if_bridge(4) merges each link-local scope zone which the member interfaces form respectively, so it causes address scope violation. Removal of the IPv6 addresses prevents it.
|     - if_lagg(4) now removes IPv6 addresses on member interfaces unconditionally.
|     - Set reasonable flags to non-IPv6-capable interfaces. [*]
|
|     Submitted by: rpaulo [*]
|     MFC after: 1 week
* Use IP6STAT_INC/IP6STAT_DEC macros to update ip6 stats. (ae, 2013-04-09, 1 file, -3/+3)
|     MFC after: 1 week
* Ignore interface renames instead of removing the interface from the bridge (markj, 2013-03-28, 1 file, -0/+3)
|     group.
|
|     Reviewed by: rstone
|     Approved by: rstone (co-mentor)
|     Sponsored by: Sandvine Incorporated
|     MFC after: 1 week
* Reinitialize eh after pfil(9) processing. (glebius, 2013-03-11, 1 file, -0/+1)
|     PR: 176764
|     Submitted by: adri
* Fix typo in comment. (kevlo, 2012-12-18, 1 file, -1/+1)
|     Reviewed by: thompsa
* Mechanically substitute flags from historic mbuf allocator with (glebius, 2012-12-05, 1 file, -9/+9)
|     malloc(9) flags within sys.
|
|     Exceptions:
|     - sys/contrib not touched
|     - sys/mbuf.h edited manually
* - Use more appropriate loop (do { } while()) when generating ethernet address (pjd, 2012-11-29, 1 file, -3/+5)
|       for bridge interface.
|     - If we found a collision we can break the loop: only one collision is possible and one is exactly enough to need to regenerate.
|
|     Obtained from: WHEEL Systems
|     MFC after: 1 week
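The loop shape described above can be sketched in user-space C. This is an illustration under stated assumptions rather than the driver code: `sk_generate_addr()` and the `SK_` constant are hypothetical, and `rand()` stands in for whatever random source the kernel uses.

```c
#include <stdlib.h>
#include <string.h>

#define SK_ETHER_ADDR_LEN 6

/*
 * Fill addr with a random, locally administered unicast address that
 * does not match any of the count entries in existing. The address is
 * generated at least once, then again for as long as a collision is
 * found, which is what a do { } while () loop expresses directly.
 */
void
sk_generate_addr(unsigned char addr[SK_ETHER_ADDR_LEN],
    unsigned char (*existing)[SK_ETHER_ADDR_LEN], size_t count)
{
    size_t i;
    int retry;

    do {
        for (i = 0; i < SK_ETHER_ADDR_LEN; i++)
            addr[i] = (unsigned char)(rand() & 0xff);
        addr[0] &= 0xfe;    /* clear the multicast bit */
        addr[0] |= 0x02;    /* set the locally administered bit */

        retry = 0;
        for (i = 0; i < count; i++) {
            if (memcmp(addr, existing[i], SK_ETHER_ADDR_LEN) == 0) {
                retry = 1;
                break;      /* one collision is enough to regenerate */
            }
        }
    } while (retry);
}
```

Breaking out of the inner scan at the first match avoids comparing against the remaining entries when the address is going to be discarded anyway.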
* o Remove last argument to ip_fragment(), and obtain all needed information (glebius, 2012-10-26, 1 file, -2/+2)
|       on checksums directly from mbuf flags. This simplifies code.
|     o Clear CSUM_IP from the mbuf in ip_fragment() if we did checksums in hardware. Some drivers may not announce CSUM_IP in their if_hwassist, yet try to do checksums if CSUM_IP is set on the mbuf. Example is em(4).
|     o While here, consistently use CSUM_IP instead of its alias CSUM_DELAY_IP. After this change CSUM_DELAY_IP vanishes from the stack.
|
|     Submitted by: Sebastian Kuzminsky <seb lineratesystems.com>
* Fix fallout from r240071. If destination interface lookup fails, (glebius, 2012-10-24, 1 file, -6/+5)
|     we should broadcast a packet, not try to deliver it to NULL.
|
|     Reported by: rpaulo
* Make the "struct if_clone" opaque to users of the cloning API. Users (glebius, 2012-10-16, 1 file, -4/+6)
|     now use function calls:
|
|       if_clone_simple()
|       if_clone_advanced()
|
|     to initialize a cloner, instead of macros that initialize if_clone structure.
|
|     Discussed with: brooks, bz, 1 year ago
* Revert previous commit... (kevlo, 2012-10-10, 1 file, -1/+1)
|     Pointyhat to: kevlo (myself)
* Prefer NULL over 0 for pointers (kevlo, 2012-10-09, 1 file, -1/+1)
|
* A step in resolving mess with byte ordering for AF_INET. After this change: (glebius, 2012-10-06, 1 file, -12/+1)
|     - All packets in NETISR_IP queue are in net byte order.
|     - ip_input() is entered in net byte order and converts packet to host byte order right _after_ processing pfil(9) hooks.
|     - ip_output() is entered in host byte order and converts packet to net byte order right _before_ processing pfil(9) hooks.
|     - ip_fragment() accepts and emits packet in net byte order.
|     - ip_forward(), ip_mloopback() use host byte order (untouched actually).
|     - ip_fastforward() no longer modifies packet at all (except ip_ttl).
|     - Swapping of byte order there and back removed from the following modules: pf(4), ipfw(4), enc(4), if_bridge(4).
|     - Swapping of byte order added to ipfilter(4), based on __FreeBSD_version
|     - __FreeBSD_version bumped.
|     - pfil(9) manual page updated.
|
|     Reviewed by: ray, luigi, eri, melifaro
|     Tested by: glebius (LE), ray (BE)
* Remove the M_NOWAIT from bridge_rtable_init as it isn't needed. The function (thompsa, 2012-10-04, 1 file, -8/+3)
|     return value is not even checked and could lead to a panic on a null sc_rthash.
|
|     MFC after: 2 weeks