path: root/sys/net
Commit message | Author | Age | Files | Lines
* Fix comment to better reflect how we are | rrs | 2012-06-12 | 1 | -6/+11
| cheating and using the csum_data. Also fix style issues with the comments.
* Note to self. Have morning coffee *before* committing things. | rrs | 2012-06-12 | 1 | -4/+6
| There is no mac_addr in the mbuf for BSD; cheat like we are supposed to and use the csum field, since our friend the gif tunnel itself will never use offload.
* Oops, forgot to commit the flag. | rrs | 2012-06-12 | 1 | -1/+1
* Allow a gif tunnel to be used with ALTQ. | rrs | 2012-06-12 | 1 | -46/+102
| Reviewed by: gnn
* Fix a panic I introduced in r234487, the bridge softc pointer is set to null | thompsa | 2012-06-11 | 1 | -14/+22
| early in the detach so rearrange things not to explode.
| Reported by: David Roffiaen, Gustau Perez Querol
| Tested by: David Roffiaen
| MFC after: 3 days
* Fix typo introduced in r236559. | melifaro | 2012-06-09 | 1 | -1/+1
| Pointed by: bcr
| Approved by: kib (mentor)
* Sort includes. | trociny | 2012-06-07 | 1 | -1/+1
| Submitted by: Daan Vreeken <pa4dan Bliksem.VEHosting.nl>
| MFC after: 3 days
* Add VIMAGE support to if_tap. | trociny | 2012-06-07 | 1 | -0/+11
| PR: kern/152047, kern/158686
| Submitted by: Daan Vreeken <pa4dan Bliksem.VEHosting.nl>
| MFC after: 1 week
* Fix panic introduced by r235745. Panic occurs after the first packet traverses a renamed interface. | melifaro | 2012-06-04 | 1 | -2/+22
| Add several comments on locking.
| Found by: avg
| Approved by: ae (mentor)
| Tested by: avg
| MFC after: 1 week
* Separate SCTP checksum offloading for IPv4 and IPv6. | tuexen | 2012-05-30 | 1 | -1/+1
| While there: remove some trailing whitespace.
| MFC after: 3 days
| X-MFC with: 236170
* Fix style(9) nits, reduce unnecessary type castings, etc., for bpf_setf(). | jkim | 2012-05-29 | 1 | -19/+20
* - Save the previous filter right before we set the new one. | jkim | 2012-05-29 | 1 | -63/+26
| - Reduce duplicate code and make it a little easier to read.
| MFC after: 2 weeks
* Fix 32-bit shim for BIOCSETF to drop all packets buffered on the descriptor | jkim | 2012-05-29 | 1 | -2/+12
| and reset statistics as it should.
| MFC after: 3 days
* Fix BPF_JITTER code broken by r235746. | melifaro | 2012-05-29 | 1 | -46/+48
| Pointed by: jkim
| Reviewed by: jkim (except locking changes)
| Approved by: (mentor)
| MFC after: 2 weeks
* if_lagg: allow SIOCSLAGGPORT to be invoked multiple times in a row | rea | 2012-05-28 | 1 | -1/+6
| Currently, 'ifconfig laggX down' does not remove members from this lagg(4) interface. So, 'service netif stop laggX' followed by 'service netif start laggX' will choke, because "stop" will leave interfaces attached to the laggX and ifconfig from the "start" will refuse to add already-existing interfaces.
| The real-world case is when I am bundling together my Ethernet and WiFi interfaces and using multiple profiles for accessing the network in different places: the system is booted up with one profile, but later this profile is exchanged for another one. A following 'service netif restart' will not add the WiFi interface back to the lagg: the "stop" action from 'service netif restart' will shut down my main WiFi interface, so the wlan0 that exists in lagg0 will be destroyed and purged from lagg0; the "start" action will try to re-add both interfaces, but since the Ethernet one is already in lagg0, ifconfig will refuse to add the wlan0 from the WiFi interface.
| Adding an interface to the lagg(4) when it is already there should be an idempotent action: we are really not changing anything, so this fix does not change the semantics of interface addition.
| Approved by: thompsa
| Reviewed by: emaste
| MFC after: 1 week
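The idempotent-add behaviour described in the if_lagg entry above can be sketched in userland C. This is only an illustration under stated assumptions: `struct toy_lagg`, `toy_lagg_has_port()` and `toy_lagg_add_port()` are hypothetical names, not the kernel's lagg(4) code, and the real ioctl path involves locking and port setup that is left out here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_PORTS	8
#define TOY_IFNAME_LEN	16

/* Hypothetical stand-in for a lagg softc: just a list of member names. */
struct toy_lagg {
	char	ports[TOY_MAX_PORTS][TOY_IFNAME_LEN];
	size_t	nports;
};

/* Return true if ifname is already a member of the aggregation. */
static bool
toy_lagg_has_port(const struct toy_lagg *lg, const char *ifname)
{
	for (size_t i = 0; i < lg->nports; i++)
		if (strncmp(lg->ports[i], ifname, TOY_IFNAME_LEN) == 0)
			return (true);
	return (false);
}

/*
 * Add a port. Re-adding an existing member is treated as success and
 * changes nothing, which is the idempotent behaviour the commit message
 * argues for. Returns 0 on success, -1 if the table is full.
 */
static int
toy_lagg_add_port(struct toy_lagg *lg, const char *ifname)
{
	if (toy_lagg_has_port(lg, ifname))
		return (0);
	if (lg->nports == TOY_MAX_PORTS)
		return (-1);
	strncpy(lg->ports[lg->nports], ifname, TOY_IFNAME_LEN - 1);
	lg->ports[lg->nports][TOY_IFNAME_LEN - 1] = '\0';
	lg->nports++;
	return (0);
}
```

With this shape, a "start" that re-adds em0 after a "stop" that never removed it succeeds without creating a duplicate member, so the subsequent add of wlan0 is not blocked.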
* It turns out that too many drivers are not only parsing the L2/3/4 | bz | 2012-05-28 | 2 | -6/+25
| headers for TSO but also for generic checksum offloading. Ideally we would only have one common function shared amongst all drivers, and perhaps when updating them for IPv6 we should introduce that. Eventually we should provide the meta information along with mbufs to avoid (re-)parsing entirely.
| To not break IPv6 (checksums and offload) and to be able to MFC the changes without risking to hurt 3rd party drivers, duplicate the v4 framework, as other OSes have done as well.
| Introduce interface capability flags for TX/RX checksum offload with IPv6, to allow independent toggling (where possible). Add CSUM_*_IPV6 flags for UDP/TCP over IPv6, and reserve further for SCTP, and IPv6 fragmentation. Define CSUM_DELAY_DATA_IPV6 as we do for legacy IP and add an alias for CSUM_DATA_VALID_IPV6.
| This pretty much brings IPv6 handling in line with IPv4. TSO is still handled in a different way and not via if_hwassist.
| Update ifconfig to allow (un)setting of the new capability flags. Update loopback to announce the new capabilities and if_hwassist flags. Individual driver updates will have to follow, as will SCTP.
| Reported by: gallatin, dim, ..
| Reviewed by: gallatin (glanced at?)
| MFC after: 3 days
| X-MFC with: r235961,235959,235958
* Turn LACP debugging from a compile time option to a sysctl, it is very handy to | thompsa | 2012-05-26 | 1 | -43/+37
| be able to turn it on when negotiation to a switch misbehaves.
| Submitted by: Andrew Boyer
| MFC after: 3 days
* MFp4 bz_ipv6_fast: | bz | 2012-05-25 | 1 | -1/+1
| Simple yet effective change enabling checksum "offload" on loopback for IPv6 to avoid expensive computations.
| Sponsored by: The FreeBSD Foundation
| Sponsored by: iXsystems
| Reviewed by: gnn (as part of the whole)
| MFC After: 3 days
* Make most BPF ioctls() SMP-safe. | melifaro | 2012-05-21 | 1 | -6/+47
| Approved by: kib (mentor)
| MFC in: 4 weeks
* Call bpf_jitter() before acquiring BPF global lock due to malloc() being used inside bpf_jitter. | melifaro | 2012-05-21 | 3 | -29/+43
| Eliminate bpf_buffer_alloc() and allocate BPF buffers on descriptor creation and BIOCSBLEN ioctl. This permits us not to allocate buffers inside bpf_attachd(), which is protected by the global lock.
| Approved by: kib (mentor)
| MFC in: 4 weeks
* Fix old panic when BPF consumer attaches to destroying interface. | melifaro | 2012-05-21 | 5 | -99/+137
| 'flags' field is added to the end of bpf_if structure. Currently the only flag is BPFIF_FLAG_DYING, which is set on bpf detach and checked by bpf_attachd(). Problem can be easily triggered on SMP stable/[89] by the following command (sort of): 'while true; do ifconfig vlan222 create vlan 222 vlandev em0 up ; tcpdump -pi vlan222 & ; ifconfig vlan222 destroy ; done'
| Fix possible use-after-free when BPF detaches itself from interface, freeing bpf_bif memory, while interface is still UP and there can be routes via this interface. Freeing is now delayed till ifnet_departure_event is received via eventhandler(9) api.
| Convert bpfd rwlock back to mutex due to lack of performance gain (currently checking if packet matches filter is done without holding bpfd lock and we have to acquire write lock if packet matches).
| Approved by: kib (mentor)
| MFC in: 4 weeks
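The dying-flag check described in the entry above can be sketched as follows. This is a simplified userland model, not the kernel code: the `toy_` names, the flag value, and the choice of ENXIO as the error are assumptions made for illustration, and the locking that protects the flag in the real bpf.c is omitted.

```c
#include <errno.h>

/* Modeled on BPFIF_FLAG_DYING from the commit message; the value is assumed. */
#define TOY_BPFIF_FLAG_DYING	0x1

/* Hypothetical, heavily reduced stand-in for struct bpf_if. */
struct toy_bpf_if {
	int	flags;		/* TOY_BPFIF_FLAG_* bits */
	int	nlisteners;	/* attached descriptors */
};

/* Interface is going away: mark it so that new attachments are refused. */
static void
toy_bpf_detach(struct toy_bpf_if *bp)
{
	bp->flags |= TOY_BPFIF_FLAG_DYING;
}

/*
 * Attach a consumer. Refuse when the interface is already being
 * destroyed instead of touching state that is about to be freed.
 * Returns 0 on success or an errno value (ENXIO is an assumption here).
 */
static int
toy_bpf_attachd(struct toy_bpf_if *bp)
{
	if (bp->flags & TOY_BPFIF_FLAG_DYING)
		return (ENXIO);
	bp->nlisteners++;
	return (0);
}
```

In the race from the commit message, a tcpdump that arrives after the interface destroy has started would take the early-return path here rather than attach to a structure that is on its way out.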
* Fix panic on attaching to non-existent interface (introduced by r233937, pointed by hrs@) | melifaro | 2012-05-21 | 1 | -42/+136
| Fix panic on tcpdump being attached to interface being removed (introduced by r233937, pointed by hrs@ and adrian@)
| Protect most of bpf_setf() by BPF global lock
| Add several forgotten assertions (thanks to adrian@)
| Document current locking model inside bpf.c
| Document EVENTHANDLER(9) usage inside BPF.
| Approved by: kib (mentor)
| Tested by: gnn
| MFC in: 4 weeks
* Use the LLINDEX macro to access the link-level I/F index. This makes | marcel | 2012-05-19 | 1 | -0/+1
| it possible to work with a different type for the sdl_index field -- it only requires a recompile.
| Obtained from: Juniper Networks, Inc.
* Sync DLTs with the latest pcap version. | delphij | 2012-05-14 | 1 | -2/+122
| MFC after: 2 weeks
* Revert r234834 per luigi@ request. | melifaro | 2012-05-03 | 2 | -0/+2
| Cleaner solution (e.g. adding another header) should be done here.
| Original log:
| Move several enums and structures required for L2 filtering from ip_fw_private.h to ip_fw.h. Remove ipfw/ip_fw_private.h header from non-ipfw code.
| Requested by: luigi
| Approved by: kib (mentor)
* Relax restriction on direct tx to child ports | emaste | 2012-05-03 | 1 | -13/+3
| Lagg(4) restricts the type of packet that may be sent directly to a child port, to avoid undesired output from accidental misconfiguration. Previously only ETHERTYPE_PAE was permitted.
| BPF writes to a lagg(4) child port are presumably intentional, so just allow them, while still blocking other packets that should take the aggregation path.
| PR: kern/138620
| Approved by: thompsa@
* Move several enums and structures required for L2 filtering from ip_fw_private.h to ip_fw.h. | melifaro | 2012-04-30 | 2 | -2/+0
| Remove ipfw/ip_fw_private.h header from non-ipfw code.
| Approved by: ae (mentor)
| MFC after: 2 weeks
* Do not require radix write lock to be held while dumping route table | melifaro | 2012-04-22 | 1 | -2/+2
| via sysctl(4) interface. This permits the router not to stop forwarding packets while the route table is being written to a user-supplied buffer.
| Reported by: Pawel Tyll <ptyll@nitronet.pl>
| Approved by: kib (mentor)
| MFC after: 1 week
* Move the interface media check to a taskqueue, some interfaces (usb) sleep | thompsa | 2012-04-20 | 2 | -21/+44
| during SIOCGIFMEDIA and we were holding locks.
* Add linkstate to bridge(4), set the link to up when at least one underlying | thompsa | 2012-04-20 | 4 | -35/+60
| interface is up, otherwise the link is down. This, among other things, allows carp to work on a bridge.
| Prodded by: glebius
| Tested by: Alexander Lunev
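The link-state rule stated in the bridge(4) entry above (up when at least one member is up, otherwise down) reduces to a simple "any" test. The sketch below is a hypothetical userland model: `toy_bridge_link_up()` and the boolean array are illustrative stand-ins, not the if_bridge data structures.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Derive the bridge link state from its members: up if any member
 * link is up, down otherwise (including a bridge with no members).
 */
static bool
toy_bridge_link_up(const bool *member_up, size_t nmembers)
{
	for (size_t i = 0; i < nmembers; i++)
		if (member_up[i])
			return (true);
	return (false);
}
```

A bridge with members {down, up, down} would report its link as up under this rule, and an empty bridge would report down.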
* Remove KASSERTs, they do not add any value here since the pointer is about to | thompsa | 2012-04-18 | 1 | -6/+2
| be dereferenced anyway.
* A bit of cleanup in the names of fields of netmap-related structures. | luigi | 2012-04-13 | 2 | -6/+6
| Use the name 'ring' instead of 'queue' in all fields. Bump NETMAP_API.
* remove an unnecessary #define | luigi | 2012-04-12 | 1 | -4/+0
* Set the proto to LAGG_PROTO_NONE before calling the detach routine so packets | thompsa | 2012-04-12 | 1 | -6/+10
| are discarded, this is an issue because lacp drops the lock which may allow network threads to access freed memory. Expand the lock coverage so the detach/attach happen atomically.
| Submitted by: Andrew Boyer (earlier version)
* Add media types for 40G media that might be used with FreeBSD. | jhb | 2012-04-10 | 1 | -0/+9
| Reviewed by: bz
| MFC after: 2 weeks
* Fix build broken by r233938. | melifaro | 2012-04-06 | 1 | -1/+2
| Pointed by: David Wolfskill <david@catwhisker.org>
| Approved by: kib (mentor)
| Pointy hat to: melifaro
* - Improve performance for writer-only BPF users. | melifaro | 2012-04-06 | 3 | -6/+93
| Linux and Solaris (at least OpenSolaris) have PF_PACKET socket families to send raw ethernet frames. The only FreeBSD interface that can be used to send raw frames is BPF. As a result, many programs like cdpd, lldpd, and various dhcp stuff use BPF only to send data. This leads us to the situation where software like cdpd, being run on a high-traffic-volume interface, significantly reduces overall performance since we have to acquire additional locks for every packet.
| Here we add a sysctl that changes BPF behavior in the following way: if a program opens a BPF socket without explicitly specifying a read filter, we assume it to be write-only and add it to a special writer-only per-interface list. This makes bpf_peers_present() return 0, so no additional overhead is introduced. After a filter is supplied, the descriptor is added to the original per-interface list, permitting packets to be captured.
| Unfortunately, pcap_open_live() sets a catch-all filter itself for the purpose of setting snap length.
| Fortunately, most programs explicitly set an (even catch-all) filter after that. tcpdump(1) is a good example.
| So a bit hackish approach is taken: we upgrade the descriptor only after the second BIOCSETF is received.
| Sysctl is named net.bpf.optimize_writers and is turned off by default.
| - While here, document all sysctl variables in bpf.4
| Sponsored by: Yandex LLC
| Reviewed by: glebius (previous version)
| Reviewed by: silence on -net@
| Approved by: (mentor)
| MFC after: 4 weeks
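The writer-only heuristic described in the entry above can be modeled as a small state machine. This is a sketch under stated assumptions, not the bpf.c implementation: the `toy_` types and functions are hypothetical, the "second BIOCSETF" rule is taken from the commit message, and the real code keeps per-interface lists under locks rather than a single enum.

```c
#include <stdbool.h>

/* Which hypothetical per-interface list a descriptor currently sits on. */
enum toy_bpf_list {
	TOY_LIST_WRITERS,	/* not counted by the peers-present check */
	TOY_LIST_READERS	/* packets are tapped for this descriptor */
};

struct toy_bpfd {
	enum toy_bpf_list	list;
	int			setf_count;	/* BIOCSETF calls seen so far */
};

/*
 * Open a descriptor. With the optimization enabled (modeled on the
 * net.bpf.optimize_writers sysctl), it starts on the writer-only list.
 */
static void
toy_bpf_open(struct toy_bpfd *d, bool optimize_writers)
{
	d->setf_count = 0;
	d->list = optimize_writers ? TOY_LIST_WRITERS : TOY_LIST_READERS;
}

/*
 * Handle a filter being set. The first call is assumed to be the
 * catch-all filter installed by pcap_open_live(), so the descriptor is
 * only promoted to the reader list on the second call.
 */
static void
toy_bpf_setf(struct toy_bpfd *d)
{
	d->setf_count++;
	if (d->list == TOY_LIST_WRITERS && d->setf_count >= 2)
		d->list = TOY_LIST_READERS;
}

/* Would this descriptor make the interface look like it has listeners? */
static bool
toy_bpf_counts_as_peer(const struct toy_bpfd *d)
{
	return (d->list == TOY_LIST_READERS);
}
```

A sender that never sets a filter, or sets only one, stays on the writer list in this model and adds no per-packet cost on the receive path; a tcpdump-style consumer that sets a second filter is promoted and starts capturing.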
* - Improve BPF locking model. | melifaro | 2012-04-06 | 5 | -121/+176
| Interface locks and descriptor locks are converted from mutex(9) to rwlock(9). This greatly improves performance: in the most common case we need to acquire 1 reader lock instead of 2 mutexes.
| - Remove filter(descriptor) (reader) lock in bpf_mtap[2]. This was suggested by glebius@. We protect the filter by requesting the interface writer lock on filter change.
| - Cover struct bpf_if under BPF_INTERNAL define. This permits including bpf.h without including rwlock stuff. However, this is a temporary solution; struct bpf_if should be made opaque for any external caller.
| Found by: Dmitrij Tejblum <tejblum@yandex-team.ru>
| Sponsored by: Yandex LLC
| Reviewed by: glebius (previous version)
| Reviewed by: silence on -net@
| Approved by: (mentor)
| MFC after: 3 weeks
* Retire the IF_ADDR_LOCK() and IF_ADDR_UNLOCK() compat macros from HEAD. | jhb | 2012-03-19 | 1 | -3/+0
| The new [RW]LOCK macros are merged back to 8.x so should be suitable for new code in HEAD even if it is to be MFC'd.
* Hide kernel option ROUTETABLES evaluations in the implementation | bz | 2012-03-18 | 2 | -21/+18
| rather than the header file. With this also move RT_MAXFIBS and RT_NUMFIBS into the implementation to avoid further usage in other code. rt_numfibs is all that should be needed.
| This allows users to change the number of FIBs from 1..RT_MAXFIBS(16) dynamically using the tunable without the need to change the kernel config for the maximum anymore.
| This means that the multi-FIB feature is now fully available with GENERIC kernels.
| The kernel option ROUTETABLES can still be used to set the default number of FIBs in absence of the tunable.
| Ok.ed by: julian, hrs, melifaro
| MFC after: 2 weeks
* - remove an extra parenthesis in a closing brace; | luigi | 2012-03-11 | 1 | -1/+6
| - add the macro NETMAP_RING_FIRST_RESERVED() which returns the index of the first non-released buffer in the ring (this is useful for code that retains buffers for some time instead of processing them immediately)
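What "the index of the first non-released buffer" means on a circular ring can be sketched as below. This is an illustration, not the netmap header: `struct toy_ring` and `toy_ring_first_reserved()` are hypothetical, and the field names `cur`, `reserved` and `num_slots` are assumptions about how such a ring tracks its position. The idea is that the first retained slot lies `reserved` slots behind `cur`, wrapping around the ring.

```c
#include <stdint.h>

/* Hypothetical, reduced ring: only the fields this sketch needs. */
struct toy_ring {
	uint32_t	num_slots;	/* total slots in the ring */
	uint32_t	cur;		/* next slot to be processed */
	uint32_t	reserved;	/* slots before cur not yet released */
};

/*
 * Index of the first slot that has been processed but not released:
 * step back 'reserved' slots from 'cur', wrapping at the ring start.
 * Written without the modulo operator to avoid unsigned underflow.
 */
static uint32_t
toy_ring_first_reserved(const struct toy_ring *r)
{
	return (r->cur < r->reserved ?
	    r->cur + r->num_slots - r->reserved :
	    r->cur - r->reserved);
}
```

For an 8-slot ring with cur = 1 and reserved = 3, the retained slots are 6, 7 and 0, so the function yields 6.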
* Move the vlan buffer space into the union which also fixes an unused variable | thompsa | 2012-03-07 | 1 | -2/+2
| warning with !INET & !INET6.
| Spotted by: pluknet
* Add the ability to set which packet layers are used for the load balance hash | thompsa | 2012-03-06 | 3 | -15/+82
| calculation.
* Properly restore curvnet context when returning early from | zec | 2012-03-04 | 1 | -1/+4
| ether_input_internal(). This change only affects options VIMAGE kernel builds.
| PR: kern/165643
| Submitted by: Vijay Singh
| MFC after: 3 days
* o) Add COMPAT_FREEBSD32 support for MIPS kernels using the n64 ABI with userlands using the o32 ABI. | jmallett | 2012-03-03 | 1 | -7/+7
| This mostly follows nwhitehorn's lead in implementing COMPAT_FREEBSD32 on powerpc64.
| o) Add a new type to the freebsd32 compat layer, time32_t, which is time_t in the 32-bit ABI being used. Since the MIPS port is relatively-new, even the 32-bit ABIs use a 64-bit time_t.
| o) Because time{spec,val}32 has the same size and layout as time{spec,val} on MIPS with 32-bit compatibility, then, disable some code which assumes otherwise wrongly when built for MIPS. A more general macro to check in this case would seem like a good idea eventually. If someone adds support for using n32 userland with n64 kernels on MIPS, then they will have to add a variety of flags related to each piece of the ABI that can vary. That's probably the right time to generalize further.
| o) Add MIPS to the list of architectures which use PAD64_REQUIRED in the freebsd32 compat code. Probably this should be generalized at some point.
| Reviewed by: gonzo
* Use a more appropriate default for the maximum number of addresses in the | thompsa | 2012-02-29 | 1 | -2/+2
| bridge forwarding table.
| PR: docs/164564
| Discussed with: brueffer
* A bunch of netmap fixes: | luigi | 2012-02-27 | 2 | -69/+92
| USERSPACE:
| 1. add support for devices with different number of rx and tx queues;
| 2. add better support for zero-copy operation, adding an extra field to the netmap ring to indicate how many buffers we have already processed but not yet released (with help from Eddie Kohler);
| 3. The two changes above unfortunately require an API change, so while at it add a version field and some spares to the ioctl() argument to help detect mismatches.
| 4. update the manual page for the two changes above;
| 5. update sample applications in tools/tools/netmap
| KERNEL:
| 1. simplify the internal structures moving the global wait queues to the 'struct netmap_adapter';
| 2. simplify the functions that map kring<->nic ring indexes
| 3. normalize device-specific code, helps maintenance;
| 4. start exploring the impact of micro-optimizations (prefetch etc.) in the ixgbe driver. Using 'legacy' descriptors on the tx ring and prefetching slots gives about 20% speedup at 900 MHz. Another 7-10% would come from removing the explicit calls to bus_dmamap* in the core (they are effectively NOPs in this case, but it takes expensive load of the per-buffer dma maps to figure out that they are all NULL). Rx performance not investigated.
| I am postponing the MFC so I can import a few more improvements before merging.
* Only look for a usable MAC address for the bridge ID from ports within our | thompsa | 2012-02-24 | 1 | -20/+30
| bridge, this allows us to have more than one independent bridge in the same STP domain.
| PR: kern/164369
| Submitted by: Nikos Vassiliadis (earlier version)
| MFC after: 2 weeks
* Add a sysctl/tunable default value for the use_flowid sysctl in r232008. | thompsa | 2012-02-23 | 1 | -1/+6
* Indicate this function decrements the timer as well as testing for expiry. | thompsa | 2012-02-23 | 1 | -11/+11