summaryrefslogtreecommitdiffstats
path: root/include/linux
Commit message (Collapse)AuthorAgeFilesLines
* bnx2x: Take the distribution range definition out of skb_tx_hash()Vladislav Zolotarov2010-12-162-2/+13
| | | | | | | | | | | Move the calcualation of the Tx hash for a given hash range into a separate function and define the skb_tx_hash(), which calculates a Tx hash for a [0; dev->real_num_tx_queues - 1] hash values range, using this function (__skb_tx_hash()). Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵John W. Linville2010-12-133-3/+22
|\ | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem
| * cfg80211: add some element IDs in enum ieee80211_eidAmitkumar Karwar2010-12-081-0/+3
| | | | | | | | | | | | | | | | | | 1)WLAN_EID_BSS_COEX_2040 2)WLAN_EID_OVERLAP_BSS_SCAN_PARAM 3)WLAN_EID_EXT_CAPABILITY Signed-off-by: Amitkumar Karwar <akarwar@marvell.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| * cfg80211: Add new BSS attribute ht_opmodeHelmut Schaa2010-12-081-0/+4
| | | | | | | | | | | | | | | | | | | | Add a new BSS attribute to allow hostapd to set the current HT opmode. Otherwise drivers won't be able to set up protection for HT rates in AP mode. Cc: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| * nl80211/mac80211: Report signal averageBruno Randolf2010-12-071-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Extend nl80211 to report an exponential weighted moving average (EWMA) of the signal value. Since the signal value usually fluctuates between different packets, an average can be more useful than the value of the last packet. This uses the recently added generic EWMA library function. -- v2: fix ABI breakage and change factor to be a power of 2. Signed-off-by: Bruno Randolf <br1@einfach.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| * cfg80211/mac80211: add mesh join/leave commandsJohannes Berg2010-12-061-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of tying mesh activity to interface up, add join and leave commands for mesh. Since we must be backward compatible, let cfg80211 handle joining a mesh if a mesh ID was pre-configured when the device goes up. Note that this therefore must modify mac80211 as well since mac80211 needs to lose the logic to start the mesh on interface up. We now allow querying mesh parameters before the mesh is connected, which simply returns defaults. Setting them (internally renamed to "update") is only allowed while connected. Specify them with the new mesh join command instead where needed. In mac80211, beaconing must now also follow the mesh enabled/not enabled state, which is done by testing the mesh ID. Signed-off-by: Javier Cardona <javier@cozybit.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| * nl80211/mac80211: define and allow configuring mesh element TTLJavier Cardona2010-12-061-0/+4
| | | | | | | | | | | | | | | | | | The TTL in path selection information elements is different from the mesh ttl used in mesh data frames. Version 7.03 of the 11s draft calls this ttl 'Element TTL'. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| * lib: Improve EWMA efficiency by using bitshiftsBruno Randolf2010-12-061-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using bitshifts instead of division and multiplication should improve performance. That requires weight and factor to be powers of two, but i think this is something we can live with. Thanks to Peter Zijlstra for the improved formula! Signed-off-by: Bruno Randolf <br1@einfach.org> -- v2: use log2.h functions Signed-off-by: John W. Linville <linville@tuxdriver.com>
* | ethtool: Report link-down while interface is downBen Hutchings2010-12-101-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While an interface is down, many implementations of ethtool_ops::get_link, including the default, ethtool_op_get_link(), will report the last link state seen while the interface was up. In general the current physical link state is not available if the interface is down. Define ETHTOOL_GLINK to reflect whether the interface *and* any physical port have a working link, and consistently return 0 when the interface is down. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | xfrm: Add Traffic Flow Confidentiality padding XFRM attributeMartin Willi2010-12-101-0/+1
| | | | | | | | | | | | | | | | | | | | The XFRMA_TFCPAD attribute for XFRM state installation configures Traffic Flow Confidentiality by padding ESP packets to a specified length. Signed-off-by: Martin Willi <martin@strongswan.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'for-davem' of ↵David S. Miller2010-12-103-5/+64
|\ \ | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 Conflicts: drivers/net/wireless/ath/ath9k/ar9003_eeprom.c
| * \ Merge branch 'master' of ↵John W. Linville2010-12-063-5/+64
| |\ \ | | |/ | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem
| | * ssb: extract indexes for power tablesRafał Miłecki2010-12-022-0/+44
| | | | | | | | | | | | | | | | | | Signed-off-by: Rafał Miłecki <zajec5@gmail.com> Acked-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| | * nl80211/cfg80211: extend mgmt-tx API for off-channelJohannes Berg2010-11-291-5/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With p2p, it is sometimes necessary to transmit a frame (typically an action frame) on another channel than the current channel. Enable this through the CMD_FRAME API, and allow it to wait for a response. A new command allows that wait to be aborted. However, allow userspace to specify whether or not it wants to allow off-channel TX, it may actually want to use the same channel only. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
* | | The new jhash implementationJozsef Kadlecsik2010-12-091-78/+105
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current jhash.h implements the lookup2() hash function by Bob Jenkins. However, lookup2() is outdated as Bob wrote a new hash function called lookup3(). The patch replaces the lookup2() implementation of the 'jhash*' functions with that of lookup3(). You can read a longer comparison of the two and other hash functions at http://burtleburtle.net/bob/hash/doobs.html. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | Merge branch 'master' of ↵David S. Miller2010-12-082-0/+2
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/ath/ath9k/ar9003_eeprom.c net/llc/af_llc.c
| * | | tcp: Replace time wait bucket msg by counterTom Herbert2010-12-081-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than printing the message to the log, use a mib counter to keep track of the count of occurences of time wait bucket overflow. Reduces spam in logs. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | Merge branch 'master' of ↵David S. Miller2010-11-241-0/+1
| |\ \ \ | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
| * | | | phylib: Add support for Marvell 88E1149R devices.David Daney2010-11-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The 88E1149R is 10/100/1000 quad-gigabit Ethernet PHY. The .config_aneg function can be shared with 88E1118, but it needs its own .config_init. Signed-off-by: David Daney <ddaney@caviumnetworks.com> Cc: Cyril Chemparathy <cyril@ti.com> Cc: Arnaud Patard <arnaud.patard@rtp-net.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Wolfram Sang <w.sang@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | filter: constify sk_run_filter()Eric Dumazet2010-12-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sk_run_filter() doesnt write on skb, change its prototype to reflect this. Fix two af_packet comments. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | net: RCU conversion of dev_getbyhwaddr() and arp_ioctl()Eric Dumazet2010-12-081-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Le dimanche 05 décembre 2010 à 09:19 +0100, Eric Dumazet a écrit : > Hmm.. > > If somebody can explain why RTNL is held in arp_ioctl() (and therefore > in arp_req_delete()), we might first remove RTNL use in arp_ioctl() so > that your patch can be applied. > > Right now it is not good, because RTNL wont be necessarly held when you > are going to call arp_invalidate() ? While doing this analysis, I found a refcount bug in llc, I'll send a patch for net-2.6 Meanwhile, here is the patch for net-next-2.6 Your patch then can be applied after mine. Thanks [PATCH] net: RCU conversion of dev_getbyhwaddr() and arp_ioctl() dev_getbyhwaddr() was called under RTNL. Rename it to dev_getbyhwaddr_rcu() and change all its caller to now use RCU locking instead of RTNL. Change arp_ioctl() to use RCU instead of RTNL locking. Note: this fix a dev refcount bug in llc Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | dccp: Policy-based packet dequeueing infrastructureTomasz Grobelny2010-12-071-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds a generic infrastructure for policy-based dequeueing of TX packets and provides two policies: * a simple FIFO policy (which is the default) and * a priority based policy (set via socket options). Both policies honour the tx_qlen sysctl for the maximum size of the write queue (can be overridden via socket options). The priority policy uses skb->priority internally to assign an u32 priority identifier, using the same ranking as SO_PRIORITY. The skb->priority field is set to 0 when the packet leaves DCCP. The priority is supplied as ancillary data using cmsg(3), the patch also provides the requisite parsing routines. Signed-off-by: Tomasz Grobelny <tomasz@grobelny.oswiecenia.net> Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
* | | | | __in_dev_get_rtnl() can use rtnl_dereference()Eric Dumazet2010-12-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If caller holds RTNL, we dont need a memory barrier (smp_read_barrier_depends) included in rcu_dereference(). Just use rtnl_dereference() to properly document the assertions. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | filter: add SKF_AD_RXHASH and SKF_AD_CPUEric Dumazet2010-12-061-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add SKF_AD_RXHASH and SKF_AD_CPU to filter ancillary mechanism, to be able to build advanced filters. This can help spreading packets on several sockets with a fast selection, after RPS dispatch to N cpus for example, or to catch a percentage of flows in one queue. tcpdump -s 500 "cpu = 1" : [0] ld CPU [1] jeq #1 jt 2 jf 3 [2] ret #500 [3] ret #0 # take 12.5 % of flows (average) tcpdump -s 1000 "rxhash & 7 = 2" : [0] ld RXHASH [1] and #7 [2] jeq #2 jt 3 jf 4 [3] ret #1000 [4] ret #0 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Rui <wirelesser@gmail.com> Acked-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | usbnet: changes for upcoming cdc_ncm driverAlexey Orishko2010-12-061-0/+6
| |_|/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Changes: include/linux/usb/usbnet.h: - a new flag to indicate driver's capability to accumulate IP packets in Tx direction and extract several packets from single skb in Rx direction. drivers/net/usb/usbnet.c: - the procedure of counting packets in usbnet was updated due to the accumulating of IP packets in the driver - no short packets are sent if indicated by the flag in driver_info structure Signed-off-by: Alexey Orishko <alexey.orishko@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | tg3: Move EEE definitions into mdio.hMatt Carlson2010-12-061-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In commit 52b02d04c801fff51ca49ad033210846d1713253 entitled "tg3: Add EEE support", Ben Hutchings had commented that the EEE advertisement register will be in a standard location. This patch moves that definition into mdio.h and changes the code to use it. Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Benjamin Li <benli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | net sched: use xps information for qdisc NUMA affinityEric Dumazet2010-12-011-1/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allocate qdisc memory according to NUMA properties of cpus included in xps map. To be effective, qdisc should be (re)setup after changes of /sys/class/net/eth<n>/queues/tx-<n>/xps_cpus I added a numa_node field in struct netdev_queue, containing NUMA node if all cpus included in xps_cpus share same node, else -1. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | Merge branch 'for-davem' of ↵David S. Miller2010-11-293-0/+65
|\ \ \ \ | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6
| * \ \ \ Merge branch 'master' of ↵John W. Linville2010-11-243-0/+65
| |\ \ \ \ | | | |_|/ | | |/| | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem
| | * | | cfg80211: allow using CQM event to notify packet lossJohannes Berg2010-11-241-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds the ability for drivers to use CQM events to notify about packet loss for specific stations (which could be the AP for the managed mode case). Since the threshold might be determined by the driver (it isn't passed in right now) it will be passed out of the driver to userspace in the event. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| | * | | Merge branch 'master' of ↵John W. Linville2010-11-241-0/+1
| | |\ \ \ | | | | |/ | | | |/| | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
| | | * | ssb: b43-pci-bridge: Add new vendor for BCM4318Daniel Klaffenbach2010-11-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add new vendor for Broadcom 4318. Signed-off-by: Daniel Klaffenbach <danielklaffenbach@gmail.com> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: Stable <stable@kernel.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| | * | | Revert "nl80211/mac80211: Report signal average"John W. Linville2010-11-241-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 86107fd170bc379869250eb7e1bd393a3a70e8ae. This patch inadvertantly changed the userland ABI. Signed-off-by: John W. Linville <linville@tuxdriver.com>
| | * | | nl80211/mac80211: Report signal averageBruno Randolf2010-11-181-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Extend nl80211 to report an exponential weighted moving average (EWMA) of the signal value. Since the signal value usually fluctuates between different packets, an average can be more useful than the value of the last packet. This uses the recently added generic EWMA library function. Signed-off-by: Bruno Randolf <br1@einfach.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| | * | | lib: Add generic exponentially weighted moving average (EWMA) functionBruno Randolf2010-11-181-0/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds generic functions for calculating Exponentially Weighted Moving Averages (EWMA). This implementation makes use of a structure which keeps the EWMA parameters and a scaled up internal representation to reduce rounding errors. The original idea for this implementation came from the rt2x00 driver (rt2x00link.c). I would like to use it in several places in the mac80211 and ath5k code and I hope it can be useful in many other places in the kernel code. Signed-off-by: Bruno Randolf <br1@einfach.org> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| | * | | cfg80211: add support for setting the ad-hoc multicast rateFelix Fietkau2010-11-161-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| | * | | cfg80211: Add nl80211 antenna configurationBruno Randolf2010-11-161-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow setting of TX and RX antennas configuration via nl80211. The antenna configuration is defined as a bitmap of allowed antennas to use. This API can be used to mask out antennas which are not attached or should not be used for other reasons like regulatory concerns or special setups. Separate bitmaps are used for RX and TX to allow configuring different antennas for receiving and transmitting. Each bitmap is 32 bit long, each bit representing one antenna, starting with antenna 1 at the first bit. If an antenna bit is set, this means the driver is allowed to use this antenna for RX or TX respectively; if the bit is not set the hardware is not allowed to use this antenna. Using bitmaps has the benefit of allowing for a flexible configuration interface which can support many different configurations and which can be used for 802.11n as well as non-802.11n devices. Instead of relying on some hardware specific assumptions, drivers can use this information to know which antennas are actually attached to the system and derive their capabilities based on that. 802.11n devices should enable or disable chains, based on which antennas are present (If all antennas belonging to a particular chain are disabled, the entire chain should be disabled). HT capabilities (like STBC, TX Beamforming, Antenna selection) should be calculated based on the available chains after applying the antenna masks. Should a 802.11n device have diversity antennas attached to one of their chains, diversity can be enabled or disabled based on the antenna information. Non-802.11n drivers can use the antenna masks to select RX and TX antennas and to enable or disable antenna diversity. While covering chainmasks for 802.11n and the standard "legacy" modes "fixed antenna 1", "fixed antenna 2" and "diversity" this API also allows more rare, but useful configurations as follows: 1) Send on antenna 1, receive on antenna 2 (or vice versa). This can be used to have a low gain antenna for TX in order to keep within the regulatory constraints and a high gain antenna for RX in order to receive weaker signals ("speak softly, but listen harder"). This can be useful for building long-shot outdoor links. Another usage of this setup is having a low-noise pre-amplifier on antenna 1 and a power amplifier on the other antenna. This way transmit noise is mostly kept out of the low noise receive channel. (This would be bitmaps: tx 1 rx 2). 2) Another similar setup is: Use RX diversity on both antennas, but always send on antenna 1. Again that would allow us to benefit from a higher gain RX antenna, while staying within the legal limits. (This would be: tx 0 rx 3). 3) And finally there can be special experimental setups in research and development even with pre 802.11n hardware where more than 2 antennas are available. It's good to keep the API simple, yet flexible. Signed-off-by: Bruno Randolf <br1@einfach.org> -- v7: Made bitmasks 32 bit wide and rebased to latest wireless-testing. Signed-off-by: John W. Linville <linville@tuxdriver.com>
* | | | | xps: add __rcu annotationsEric Dumazet2010-11-291-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Avoid sparse warnings : add __rcu annotations and use rcu_dereference_protected() where necessary. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | xps: Add CONFIG_XPSTom Herbert2010-11-281-24/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds XPS_CONFIG option to enable and disable XPS. This is done in the same manner as RPS_CONFIG. This is also fixes build failure in XPS code when SMP is not enabled. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | net: add netif_tx_queue_frozen_or_stoppedEric Dumazet2010-11-281-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When testing struct netdev_queue state against FROZEN bit, we also test XOFF bit. We can test both bits at once and save some cycles. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | xps: Transmit Packet SteeringTom Herbert2010-11-241-0/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements transmit packet steering (XPS) for multiqueue devices. XPS selects a transmit queue during packet transmission based on configuration. This is done by mapping the CPU transmitting the packet to a queue. This is the transmit side analogue to RPS-- where RPS is selecting a CPU based on receive queue, XPS selects a queue based on the CPU (previously there was an XPS patch from Eric Dumazet, but that might more appropriately be called transmit completion steering). Each transmit queue can be associated with a number of CPUs which will use the queue to send packets. This is configured as a CPU mask on a per queue basis in: /sys/class/net/eth<n>/queues/tx-<n>/xps_cpus The mappings are stored per device in an inverted data structure that maps CPUs to queues. In the netdevice structure this is an array of num_possible_cpu structures where each structure holds and array of queue_indexes for queues which that CPU can use. The benefits of XPS are improved locality in the per queue data structures. Also, transmit completions are more likely to be done nearer to the sending thread, so this should promote locality back to the socket on free (e.g. UDP). The benefits of XPS are dependent on cache hierarchy, application load, and other factors. XPS would nominally be configured so that a queue would only be shared by CPUs which are sharing a cache, the degenerative configuration woud be that each CPU has it's own queue. Below are some benchmark results which show the potential benfit of this patch. The netperf test has 500 instances of netperf TCP_RR test with 1 byte req. and resp. bnx2x on 16 core AMD XPS (16 queues, 1 TX queue per CPU) 1234K at 100% CPU No XPS (16 queues) 996K at 100% CPU Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | xps: Improvements in TX queue selectionTom Herbert2010-11-241-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In dev_pick_tx, don't do work in calculating queue index or setting the index in the sock unless the device has more than one queue. This allows the sock to be set only with a queue index of a multi-queue device which is desirable if device are stacked like in a tunnel. We also allow the mapping of a socket to queue to be changed. To maintain in order packet transmission a flag (ooo_okay) has been added to the sk_buff structure. If a transport layer sets this flag on a packet, the transmit queue can be changed for the socket. Presumably, the transport would set this if there was no possbility of creating OOO packets (for instance, there are no packets in flight for the socket). This patch includes the modification in TCP output for setting this flag. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | ipv6: mcast: RCU conversionEric Dumazet2010-11-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ipv6_sk_mc_lock rwlock becomes a spinlock. readers (inet6_mc_check()) now takes rcu_read_lock() instead of read lock. Writers dont need to disable BH anymore. struct ipv6_mc_socklist objects are reclaimed after one RCU grace period. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | stmmac: add init/exit callback in plat_stmmacenet_data structGiuseppe CAVALLARO2010-11-241-3/+3
|/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds in the plat_stmmacenet_data the init and exit callbacks that can be used for invoking specific platform functions. For example, on ST targets, these call the PAD manager functions to set PIO lines and syscfg registers. The patch removes the stmmac_claim_resource only used on STM Kernels as well. Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | macvlan: Introduce 'passthru' mode to takeover the underlying deviceSridhar Samudrala2010-11-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the current default 'vepa' mode, a KVM guest using virtio with macvtap backend has the following limitations. - cannot change/add a mac address on the guest virtio-net - cannot create a vlan device on the guest virtio-net - cannot enable promiscuous mode on guest virtio-net To address these limitations, this patch introduces a new mode called 'passthru' when creating a macvlan device which allows takeover of the underlying device and passing it to a guest using virtio with macvtap backend. Only one macvlan device is allowed in passthru mode and it inherits the mac address from the underlying device and sets it in promiscuous mode to receive and forward all the packets. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> ------------------------------------------------------------------------- Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | Merge branch 'master' of ↵David S. Miller2010-11-191-1/+1
|\ \ \ \ | | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/bonding/bond_main.c net/core/net-sysfs.c net/ipv6/addrconf.c
| * | | net: rtnetlink.h -- only include linux/netdevice.h when used by the kernelAndy Whitcroft2010-11-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The commit below added a new helper dev_ingress_queue to cleanly obtain the ingress queue pointer. This necessitated including 'linux/netdevice.h': commit 24824a09e35402b8d58dcc5be803a5ad3937bdba Author: Eric Dumazet <eric.dumazet@gmail.com> Date: Sat Oct 2 06:11:55 2010 +0000 net: dynamic ingress_queue allocation However this include triggers issues for applications in userspace which use the rtnetlink interfaces. Commonly this requires they include 'net/if.h' and 'linux/rtnetlink.h' leading to a compiler error as below: In file included from /usr/include/linux/netdevice.h:28:0, from /usr/include/linux/rtnetlink.h:9, from t.c:2: /usr/include/linux/if.h:135:8: error: redefinition of ‘struct ifmap’ /usr/include/net/if.h:112:8: note: originally defined here /usr/include/linux/if.h:169:8: error: redefinition of ‘struct ifreq’ /usr/include/net/if.h:127:8: note: originally defined here /usr/include/linux/if.h:218:8: error: redefinition of ‘struct ifconf’ /usr/include/net/if.h:177:8: note: originally defined here The new helper is only defined for the kernel and protected by __KERNEL__ therefore we can simply pull the include down into the same protected section. Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | filter: optimize sk_run_filterEric Dumazet2010-11-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove pc variable to avoid arithmetic to compute fentry at each filter instruction. Jumps directly manipulate fentry pointer. As the last instruction of filter[] is guaranteed to be a RETURN, and all jumps are before the last instruction, we dont need to check filter bounds (number of instructions in filter array) at each iteration, so we remove it from sk_run_filter() params. On x86_32 remove f_k var introduced in commit 57fe93b374a6b871 (filter: make sure filters dont read uninitialized memory) Note : We could use a CONFIG_ARCH_HAS_{FEW|MANY}_REGISTERS in order to avoid too many ifdefs in this code. This helps compiler to use cpu registers to hold fentry and A accumulator. On x86_32, this saves 401 bytes, and more important, sk_run_filter() runs much faster because less register pressure (One less conditional branch per BPF instruction) # size net/core/filter.o net/core/filter_pre.o text data bss dec hex filename 2948 0 0 2948 b84 net/core/filter.o 3349 0 0 3349 d15 net/core/filter_pre.o on x86_64 : # size net/core/filter.o net/core/filter_pre.o text data bss dec hex filename 5173 0 0 5173 1435 net/core/filter.o 5224 0 0 5224 1468 net/core/filter_pre.o Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | net: move definitions of BPF_S_* to net/core/filter.cChangli Gao2010-11-181-48/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BPF_S_* are used internally, should not be exposed to the others. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | bonding: IGMP handling cleanupEric Dumazet2010-11-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of iterating in_dev->mc_list from bonding driver, its better to call a helper function provided by igmp.c Details of implementation (locking) are private to igmp code. ip_mc_rejoin_group(struct ip_mc_list *im) becomes ip_mc_rejoin_groups(struct in_device *in_dev); Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
OpenPOWER on IntegriCloud