summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* net: dsa: reduce number of protocol hooksFlorian Fainelli2014-08-2710-105/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | DSA is currently registering one packet_type function per EtherType it needs to intercept in the receive path of a DSA-enabled Ethernet device. Right now we have three of them: trailer, DSA and eDSA, and there might be more in the future, this will not scale to the addition of new protocols. This patch proceeds with adding a new layer of abstraction and two new functions: dsa_switch_rcv() which will dispatch into the tag-protocol specific receive function implemented by net/dsa/tag_*.c dsa_slave_xmit() which will dispatch into the tag-protocol specific transmit function implemented by net/dsa/tag_*.c When we do create the per-port slave network devices, we iterate over the switch protocol to assign the DSA-specific receive and transmit operations. A new fake ethertype value is used: ETH_P_XDSA to illustrate the fact that this is no longer going to look like ETH_P_DSA or ETH_P_TRAILER like it used to be. This allows us to greatly simplify the check in eth_type_trans() and always override the skb->protocol with ETH_P_XDSA for Ethernet switches tagged protocol, while also reducing the number repetitive slave netdevice_ops assignments. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sungem: Fix global namespace pollution of phy accessors.David S. Miller2014-08-271-17/+17
| | | | | | | | The sungem driver has "phy_read()" and "phy_write()" functions, which we need to rename because the generic phy layer is about to export generic interfaces with the same name. Signed-off-by: David S. Miller <davem@davemloft.net>
* tulip: dmfe: Fix global namespace pollution of phy accessors.David S. Miller2014-08-271-76/+76
| | | | | | | | The dmfe driver has "phy_read()" and "phy_write()" functions, which we need to rename because the generic phy layer is about to export generic interfaces with the same name. Signed-off-by: David S. Miller <davem@davemloft.net>
* f_ncm: Don't use netdev_start_xmit().David S. Miller2014-08-271-1/+9
| | | | | | | | | | | | | | | | Unfortunately, the USB gadget layer has this weird things where NULL skbs are passed into ops->ndo_start_xmit() in order to trigger the dev->wrap() calls to build packets. This is completely outside of the allowable range of sane arguments for the ndo_start_xmit method. All invocations of ndo_start_xmit() should be with non-NULL SKB arguments. Put back the direct call, but with a comment explaining how this is not acceptable in the long term. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethernet: arc: Add support for specific SoC layer device tree bindingsRomain Perier2014-08-275-61/+129
| | | | | | | | | | | | | | | | Some platforms have special bank registers which might be used to select the correct clock or the right mode for Media Indepent Interface controllers. Sometimes, it is also required to activate vcc regulators in the right order to supply the ethernet controller at the right time. This patch is an architecture refactoring of the arc-emac device driver. It adds a new software design which allows to add specific platform glue layer. Each platform has now its own module which performs custom initialization and remove for the target and then calls to the core driver. Signed-off-by: Romain Perier <romain.perier@gmail.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethernet: arc: mdio changes for future SoC glue layer devtree supportRomain Perier2014-08-273-6/+5
| | | | | | | | | | This is an api changes for the emac_mdio.c module. It will be required later when arc_emac_probe/arc_emac_remove will no longer use 'struct platform_device'. Signed-off-by: Romain Perier <romain.perier@gmail.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethernet: arc: remove use of 'struct platform_device'Romain Perier2014-08-271-31/+33
| | | | | | | | | This is a preparation of an api changes for the emac_main.c module. The involved functions are arc_emac_probe and arc_emac_remove. Signed-off-by: Romain Perier <romain.perier@gmail.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: syncookies: mark cookie_secret read_mostlyFlorian Westphal2014-08-272-2/+2
| | | | | | | only written once. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* bnx2x: Fix static checker warning regarding `txdata_ptr'Yuval Mintz2014-08-271-2/+2
| | | | | | | | | Incorrect checking of array instead of array contents in panic_dump flow - results of commit e261199872a2 ("bnx2x: Safe bnx2x_panic_dump()"). Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* r8152: replace strncpy with strlcpyhayeswang2014-08-271-2/+2
| | | | | | | | Replace the strncpy with strlcpy, and use sizeof to determine the length. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Update sk_buff flag bit availability comment.David S. Miller2014-08-271-1/+1
| | | | | | We lost one when xmit_more was added. Signed-off-by: David S. Miller <davem@davemloft.net>
* neigh: document gc_thresh2stephen hemminger2014-08-251-0/+6
| | | | | | | Missing documentation for gc_thresh2 sysctl. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: bnx2x: fix build error with ptpRandy Dunlap2014-08-251-0/+1
| | | | | | | | | | | | | | | | | | | bnx2x uses ptp functions, so it should select the provider of those functions (PTP_1588_CLOCK). Fixes these build errors: drivers/built-in.o: In function `__bnx2x_remove': /home/jim/linux/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c:13409: undefined reference to `ptp_clock_unregister' drivers/built-in.o: In function `bnx2x_register_phc': /home/jim/linux/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c:13202: undefined reference to `ptp_clock_register' drivers/built-in.o: In function `bnx2x_get_ts_info': /home/jim/linux/drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c:3498: undefined reference to `ptp_clock_index' Reported-by: Jim Davis <jim.epost@gmail.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* team: set IFF_TEAM_PORT priv_flag after rx_handler is registeredJiri Pirko2014-08-251-14/+30
| | | | | | | | | | | | | | | | | When one tries to add eth as a port into team and that eth is already in use by other rx_handler device (macvlan, bond, bridge, ...) a bug in team_port_add() causes that IFF_TEAM_PORT flag is set before rx_handler is registered. In between, netdev nofifier is called and team_device_event() sees IFF_TEAM_PORT and thinks that rx_handler_data pointer is set to team_port. But it isn't. Fix this by reordering rx_handler register and IFF_TEAM_PORT priv flag set so it is very similar to how bonding does this. Reported-by: Erik Hugne <erik.hugne@ericsson.com> Fixes: 3d249d4ca7 "net: introduce ethernet teaming device" Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
* bpf: x86: add missing 'shift by register' instructions to x64 eBPF JITAlexei Starovoitov2014-08-252-0/+80
| | | | | | | | | | 'shift by register' operations are supported by eBPF interpreter, but were accidently left out of x64 JIT compiler. Fix it and add a testcase. Reported-by: Brendan Gregg <brendan.d.gregg@gmail.com> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Fixes: 622582786c9e ("net: filter: x86: internal BPF JIT") Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'bnx2x-next'David S. Miller2014-08-255-18/+44
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Yuval Mintz says: ==================== bnx2x: `fixes' patch-series This series contains mostly bug fixes, but never the less is intended for `net-next' and not `net', as: - Some of the fixes are quite insignificant [`VF clean statistics', `ethtool -d might cause timeout in log']. - Some only recently were submitted to `net-next' [`Fix timesync endianity']. - Some are not usually compiled as part of the kernel [`Fix stop-on-error']. Dave - please consider applying this series to `net-next'; If you prefer, I can break this series into 2 parts [one for `net' and the other for `net-next'] - but personally I don't see much benefit in it. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * bnx2x: Fix timesync endianityMichal Kalderon2014-08-251-2/+4
| | | | | | | | | | | | | | | | | | | | | | Commit eeed018cbfa30 ("bnx2x: Add timestamping and PTP hardware clock support") has a missing conversion to LE32, which will prevent the feature from working on big endian machines. Signed-off-by: Michal Kalderon <Michal.Kalderon@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * bnx2x: Be more forgiving toward SW GRODmitry Kravkov2014-08-251-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | This introduces 2 new relaxations in the bnx2x driver regarding GRO: 1. Don't prevent SW GRO if HW GRO is disabled. 2. If all aggregations are disabled, when GRO configuration changes there's no need to perform an inner-reload [since it will have no actual effect]. Signed-off-by: Dmitry Kravkov <Dmitry.Kravkov@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * bnx2x: VF clean statisticsYuval Mintz2014-08-251-0/+5
| | | | | | | | | | | | | | | | During statistics initialization of a VF we need to clean its statistics. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * bnx2x: Fix stop-on-errorYuval Mintz2014-08-252-6/+21
| | | | | | | | | | | | | | | | | | | | When STOP_ON_ERROR is set driver will not compile. Even if it did, traffic will not pass without this patch as several fields which are verified by FW/HW on the Tx path are not properly set. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * bnx2x: ethtool -d might cause timeout in logDmitry Kravkov2014-08-251-9/+5
|/ | | | | | | | | | | This changes slightly the set of registers read during `ethtool -d'. Without this change, it's possible the HW will generate a grc Attention which will be logged into system logs as `grc timeout'. Signed-off-by: Dmitry Kravkov <Dmitry.Kravkov@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: make skb an optional parameter for__skb_flow_dissect()WANG Cong2014-08-252-5/+17
| | | | | | | Fixes: commit 690e36e726d00d2 (net: Allow raw buffers to be passed into the flow dissector) Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: fix comments for __skb_flow_get_ports()WANG Cong2014-08-251-2/+4
| | | | | | | Fixes: commit 690e36e726d00d2 (net: Allow raw buffers to be passed into the flow dissector) Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ixgbe: support skb->xmit_more in netdev_ops->ndo_start_xmit()Daniel Borkmann2014-08-251-3/+4
| | | | | | | | This implements the deferred tail pointer flush API for the ixgbe driver. Similar version also proposed longer time ago by Alexander Duyck. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Remove ndo_xmit_flush netdev operation, use signalling instead.David S. Miller2014-08-254-56/+19
| | | | | | | | | | | | | | | | | | | | | | As reported by Jesper Dangaard Brouer, for high packet rates the overhead of having another indirect call in the TX path is non-trivial. There is the indirect call itself, and then there is all of the reloading of the state to refetch the tail pointer value and then write the device register. Move to a more passive scheme, which requires very light modifications to the device drivers. The signal is a new skb->xmit_more value, if it is non-zero it means that more SKBs are pending to be transmitted on the same queue as the current SKB. And therefore, the driver may elide the tail pointer update. Right now skb->xmit_more is always zero. Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'is_kdump_kernel'David S. Miller2014-08-254-3/+7
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Amir Vadai says: ==================== Make is_kdump_kernel() accessible from modules I'm re-spinning this patchset. At the begining it was suggested to use a different name for the parameter, but at the end [3] the resolution was to leave it as it is in this patch. Drivers need to know if running from kdump kernel in order to change their memory profile - since kdump environment is limited by available memory. Currently there are drivers that are using reset_devices as suggested in [2]. In [2] it was suggested to use reset_devices, but the context was, to enable driver to know when the hardware device is needed to be reset, and not if this is a kdump environment. We think that is_kdump_kernel() is better suited to select between different memory profiles. The first patch in this patchset exports a needed symbol in order to make is_kdump_kernel() accessible from the drivers. The rest of the patches change from reset_devices to is_kdump_kernel() in 2 networking drivers. The idea of this patchset was suggested by Vivek Goyal. Tested (only build) and applied on top of commit 8fc54f6: ("net: use reciprocal_scale() helper") [1] - ea1c1af: ("net/mlx4_en: Reduce memory consumption on kdump kernel") [2] - https://lkml.org/lkml/2011/1/27/341 [3] - http://www.spinics.net/lists/netdev/msg291492.html ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/bnx2x: Use is_kdump_kernel() to detect kdump kernelAmir Vadai2014-08-252-2/+4
| | | | | | | | | | | | | | | | | | | | Use is_kdump_kernel() to detect kdump kernel, instead of reset_devices. CC: Ariel Elior <ariel.elior@qlogic.com> CC: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx4: Use is_kdump_kernel() to detect kdump kernelAmir Vadai2014-08-251-1/+2
| | | | | | | | | | | | | | Use is_kdump_kernel() to detect kdump kernel, instead of reset_devices. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * crash_dump: Make is_kdump_kernel() accessible from modulesAmir Vadai2014-08-251-0/+1
|/ | | | | | | | | | | | | | | In order to make is_kdump_kernel() accessible from modules, need to make elfcorehdr_addr exported. This was rejected in the past [1] because reset_devices was prefered in that context (reseting the device in kdump kernel), but now there are some network drivers that need to reduce memory usage when loaded from a kdump kernel. And in that context, is_kdump_kernel() suits better. [1] - https://lkml.org/lkml/2011/1/27/341 CC: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* stmmac: simple cleanupsPavel Machek2014-08-253-12/+10
| | | | | | | | This adds simple cleanups for stmmac, removing test we know is always true, fixing whitespace, and moving code out of if(). Signed-off-by: Pavel Machek <pavel@denx.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* r8152: check code with checkpatch.plhayeswang2014-08-251-31/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | 626: CHECK: Alignment should match open parenthesis 646: CHECK: Alignment should match open parenthesis 655: CHECK: Alignment should match open parenthesis 695: CHECK: Alignment should match open parenthesis 729: CHECK: Alignment should match open parenthesis 739: CHECK: Alignment should match open parenthesis 976: WARNING: externs should be avoided in .c files 1314: CHECK: Alignment should match open parenthesis 1358: WARNING: networking block comments don't use an empty /* line, use /* Comment... 1402: WARNING: networking block comments don't use an empty /* line, use /* Comment... 1521: CHECK: multiple assignments should be avoided 1775: CHECK: Alignment should match open parenthesis 1838: CHECK: multiple assignments should be avoided 1843: CHECK: multiple assignments should be avoided 1847: CHECK: multiple assignments should be avoided 1850: WARNING: Missing a blank line after declarations 1864: CHECK: Alignment should match open parenthesis 1872: CHECK: braces {} should be used on all arms of this statement 1906: CHECK: usleep_range is preferred over udelay 2865: WARNING: networking block comments don't use an empty /* line, use /* Comment... 3088: CHECK: Alignment should match open parenthesis total: 0 errors, 5 warnings, 16 checks, 3567 lines checked Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'ndo_xmit_flush'David S. Miller2014-08-2411-27/+77
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Basic deferred TX queue flushing infrastructure. Over time, and specifically and more recently at the Networking Workshop during Kernel SUmmit in Chicago, we have discussed the idea of having some way to optimize transmits of multiple TX packets at a time. There are several areas of overhead that could be amortized with such schemes. One has to do with locking and transactional overhead, the other has to do with device specific costs. This patch set here is more aimed at device specific costs. Typically a device queues up a packet in the TX queue and then has to do something to have the device start processing that new entry. Sometimes this is composed of doing an MMIO write to a "tail" register, and in other cases it can involve something as expensive as a hypervisor call. The basic setup defined here is that when the driver supports deferred TX queue flushing, ndo_start_xmit should no longer perform that operation. Instead a new operation, ndo_xmit_flush, should do it. I have converted IGB and virtio_net as example initial users. The IGB conversion is tested, virtio_net is not but it does compile :-) All ndo_start_xmit call sites have been abstracted behind a new helper called netdev_start_xmit(). This just adds the infrastructure, it does not actually add any instances of actually doing multiple ndo_start_xmit calls per ndo_xmit_flush invocation. Signed-off-by: David S. Miller <davem@davemloft.net>
| * virtio_net: Support netdev_ops->ndo_xmit_flush()David S. Miller2014-08-241-1/+9
| | | | | | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * igb: Support netdev_ops->ndo_xmit_flush()David S. Miller2014-08-241-11/+24
| | | | | | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: Add ops->ndo_xmit_flush()David S. Miller2014-08-249-15/+44
|/ | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: White-space cleansing : gaps between function and symbol exportIan Morris2014-08-2411-23/+0
| | | | | | | | | | | | | | | This patch makes no changes to the logic of the code but simply addresses coding style issues as detected by checkpatch. Both objdump and diff -w show no differences. This patch removes some blank lines between the end of a function definition and the EXPORT_SYMBOL_GPL macro in order to prevent checkpatch warning that EXPORT_SYMBOL must immediately follow a function. Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: White-space cleansing : Structure layoutsIan Morris2014-08-245-21/+13
| | | | | | | | | | | | | This patch makes no changes to the logic of the code but simply addresses coding style issues as detected by checkpatch. Both objdump and diff -w show no differences. This patch addresses structure definitions, specifically it cleanses the brace placement and replaces spaces with tabs in a few places. Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: White-space cleansing : Line LayoutsIan Morris2014-08-2431-203/+203
| | | | | | | | | | | | | | | | | This patch makes no changes to the logic of the code but simply addresses coding style issues as detected by checkpatch. Both objdump and diff -w show no differences. A number of items are addressed in this patch: * Multiple spaces converted to tabs * Spaces before tabs removed. * Spaces in pointer typing cleansed (char *)foo etc. * Remove space after sizeof * Ensure spacing around comparators such as if statements. Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: ec_bhf: remove excessive debug messagesDarek Marcinkiewicz2014-08-241-91/+10
| | | | | | | | This cuts down the number of debug information spit out by the driver. Signed-off-by: Dariusz Marcinkiewicz <reksio@newterm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
* random32: improvements to prandom_bytesDaniel Borkmann2014-08-242-23/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch addresses a couple of minor items, mostly addesssing prandom_bytes(): 1) prandom_bytes{,_state}() should use size_t for length arguments, 2) We can use put_unaligned() when filling the array instead of open coding it [ perhaps some archs will further benefit from their own arch specific implementation when GCC cannot make up for it ], 3) Fix a typo, 4) Better use unsigned int as type for getting the arch seed, 5) Make use of prandom_u32_max() for timer slack. Regarding the change to put_unaligned(), callers of prandom_bytes() which internally invoke prandom_bytes_state(), don't bother as they expect the array to be filled randomly and don't have any control of the internal state what-so-ever (that's also why we have periodic reseeding there, etc), so they really don't care. Now for the direct callers of prandom_bytes_state(), which are solely located in test cases for MTD devices, that is, drivers/mtd/tests/{oobtest.c,pagetest.c,subpagetest.c}: These tests basically fill a test write-vector through prandom_bytes_state() with an a-priori defined seed each time and write that to a MTD device. Later on, they set up a read-vector and read back that blocks from the device. So in the verification phase, the write-vector is being re-setup [ so same seed and prandom_bytes_state() called ], and then memcmp()'ed against the read-vector to check if the data is the same. Akinobu, Lothar and I also tested this patch and it runs through the 3 relevant MTD test cases w/o any errors on the nandsim device (simulator for MTD devs) for x86_64, ppc64, ARM (i.MX28, i.MX53 and i.MX6): # modprobe nandsim first_id_byte=0x20 second_id_byte=0xac \ third_id_byte=0x00 fourth_id_byte=0x15 # modprobe mtd_oobtest dev=0 # modprobe mtd_pagetest dev=0 # modprobe mtd_subpagetest dev=0 We also don't have any users depending directly on a particular result of the PRNG (except the PRNG self-test itself), and that's just fine as it e.g. allowed us easily to do things like upgrading from taus88 to taus113. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Tested-by: Akinobu Mita <akinobu.mita@gmail.com> Tested-by: Lothar Waßmann <LW@KARO-electronics.de> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'csums-next'David S. Miller2014-08-2412-103/+233
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Tom Herbert says: ==================== net: Checksum offload changes - Part V I am working on overhauling RX checksum offload. Goals of this effort are: - Specify what exactly it means when driver returns CHECKSUM_UNNECESSARY - Preserve CHECKSUM_COMPLETE through encapsulation layers - Don't do skb_checksum more than once per packet - Unify GRO and non-GRO csum verification as much as possible - Unify the checksum functions (checksum_init) - Simplify code What is in this fifth patch set: - Added GRO checksum validation functions - Call the GRO validations functions from TCP and GRE gro_receive - Perform checksum verification in the UDP gro_receive path using GRO functions and add support for gro_receive in UDP6 Changes in V2: - Change ip_summed to CHECKSUM_UNNECESSARY instead of moving it to CHECKSUM_COMPLETE from GRO checksum validation. This avoids performance penalty in checksumming bytes which are before the header GRO is at. Please review carefully and test if possible, mucking with basic checksum functions is always a little precarious :-) ---- Test results with this patch set are below. I did not notice any performace regression. Tests run: TCP_STREAM: super_netperf with 200 streams TCP_RR: super_netperf with 200 streams and -r 1,1 Device bnx2x (10Gbps): No GRE RSS hash (RX interrupts occur on one core) UDP RSS port hashing enabled. * GRE with checksum with IPv4 encapsulated packets With fix: TCP_STREAM 9.91% CPU utilization 5163.78 Mbps TCP_RR 50.64% CPU utilization 219/347/502 90/95/99% latencies 834103 tps Without fix: TCP_STREAM 10.05% CPU utilization 5186.22 tps TCP_RR 49.70% CPU utilization 227/338/486 90/95/99% latencies 813450 tps * GRE without checksum with IPv4 encapsulated packets With fix: TCP_STREAM 10.18% CPU utilization 5159 Mbps TCP_RR 51.86% CPU utilization 214/325/471 90/95/99% latencies 865943 tps Without fix: TCP_STREAM 10.26% CPU utilization 5307.87 Mbps TCP_RR 50.59% CPU utilization 224/325/476 90/95/99% latencies 846429 tps *** Simulate device returns CHECKSUM_COMPLETE * VXLAN with checksum With fix: TCP_STREAM 13.03% CPU utilization 9093.9 Mbps TCP_RR 95.96% CPU utilization 161/259/474 90/95/99% latencies 1.14806e+06 tps Without fix: TCP_STREAM 13.59% CPU utilization 9093.97 Mbps TCP_RR 93.95% CPU utilization 160/259/484 90/95/99% latencies 1.10262e+06 tps * VXLAN without checksum With fix: TCP_STREAM 13.28% CPU utilization 9093.87 Mbps TCP_RR 95.04% CPU utilization 155/246/439 90/95/99% latencies 1.15e+06 tps Without fix: TCP_STREAM 13.37% CPU utilization 9178.45 Mbps TCP_RR 93.74% CPU utilization 161/257/469 90/95/99% latencies 1.1068e+06 Mbps ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * gre: When GRE csum is present count as encap layer wrt csumTom Herbert2014-08-241-0/+1
| | | | | | | | | | | | | | | | In GRE demux if the GRE checksum pop rcv encapsulation so that any encapsulated checksums are treated as tunnel checksums. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * udp: additional GRO supportTom Herbert2014-08-244-17/+96
| | | | | | | | | | | | | | | | Implement GRO for UDPv6. Add UDP checksum verification in gro_receive for both UDP4 and UDP6 calling skb_gro_checksum_validate_zero_check. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * tcp: Call skb_gro_checksum_validateTom Herbert2014-08-242-47/+6
| | | | | | | | | | | | | | | | In tcp[64]_gro_receive call skb_gro_checksum_validate to validate TCP checksum in the gro context. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * gre: call skb_gro_checksum_simple_validateTom Herbert2014-08-241-36/+7
| | | | | | | | | | Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: add gro_compute_pseudo functionsTom Herbert2014-08-242-0/+16
| | | | | | | | | | | | | | | | | | Add inet_gro_compute_pseudo and ip6_gro_compute_pseudo. These are the logical equivalents of inet_compute_pseudo and ip6_compute_pseudo for GRO path. The IP header is taken from skb_gro_network_header. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: skb_gro_checksum_* functionsTom Herbert2014-08-242-3/+107
|/ | | | | | | | | | Add skb_gro_checksum_validate, skb_gro_checksum_validate_zero_check, and skb_gro_checksum_simple_validate, and __skb_gro_checksum_complete. These are the cognates of the normal checksum functions but are used in the gro_receive path and operate on GRO related fields in sk_buffs. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: use reciprocal_scale() helperDaniel Borkmann2014-08-2314-22/+24
| | | | | | | | Replace open codings of (((u64) <x> * <y>) >> 32) with reciprocal_scale(). Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Allow raw buffers to be passed into the flow dissector.David S. Miller2014-08-233-22/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Drivers, and perhaps other entities we have not yet considered, sometimes want to know how deep the protocol headers go before deciding how large of an SKB to allocate and how much of the packet to place into the linear SKB area. For example, consider a driver which has a device which DMAs into pools of pages and then tells the driver where the data went in the DMA descriptor(s). The driver can then build an SKB and reference most of the data via SKB fragments (which are page/offset/length triplets). However at least some of the front of the packet should be placed into the linear SKB area, which comes before the fragments, so that packet processing can get at the headers efficiently. The first thing each protocol layer is going to do is a "pskb_may_pull()" so we might as well aggregate as much of this as possible while we're building the SKB in the driver. Part of supporting this is that we don't have an SKB yet, so we want to be able to let the flow dissector operate on a raw buffer in order to compute the offset of the end of the headers. So now we have a __skb_flow_dissect() which takes an explicit data pointer and length. Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'bcm7xxx_apd_eee'David S. Miller2014-08-236-127/+229
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Florian Fainelli says: ==================== net: phy: bcm7xxx: APD and EEE support This patch series enables Auto-power down and EEE for the BCM7xxx integrated Gigabit PHYs. I also put a fix for the fixed PHY that would allow clause 45 over clause 22 reads/writes but would return bogus data by using e.g: ethtool --show-eee ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
OpenPOWER on IntegriCloud