Commit message | Author | Age | Files | Lines
* Linux 3.13-rc4v3.13-rc4Linus Torvalds2013-12-151-1/+1
|
* null_blk: mem garbage on NUMA systems during initMatias Bjorling2013-12-151-4/+4
| On NUMA systems, when initializing the blk-mq layer and using per-node
| hctx, we initialize submit queues to 1, while blk-mq nr_hw_queues is
| initialized to the number of NUMA nodes. This makes the null_init_hctx
| function overwrite memory outside of what it allocated. In my case it led
| to writing garbage into struct request_queue's mq_map.
|
| Signed-off-by: Matias Bjorling <m@bjorling.me>
| Cc: Jens Axboe <axboe@kernel.dk>
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* radeon_pm: fix oops in hwmon_attributes_visible() and ↵Sergey Senozhatsky2013-12-151-4/+2
| radeon_hwmon_show_temp_thresh()
|
| Since commit ec39f64bba34 ("drm/radeon/dpm: Convert to use
| devm_hwmon_register_with_groups") radeon_hwmon_init() is using
| hwmon_device_register_with_groups(), which sets `rdev' as a device
| private driver_data, while hwmon_attributes_visible() and
| radeon_hwmon_show_temp_thresh() are still expecting `drm_device'.
|
| Fix them by using dev_get_drvdata(), in order to avoid this oops:
|
|   BUG: unable to handle kernel paging request at 0000000000001e28
|   IP: [<ffffffffa02ae8b4>] hwmon_attributes_visible+0x18/0x3d [radeon]
|   PGD 15057e067 PUD 151a8e067 PMD 0
|   Oops: 0000 [#1] PREEMPT SMP
|   Call Trace:
|     internal_create_group+0x114/0x1d9
|     sysfs_create_group+0xe/0x10
|     sysfs_create_groups+0x22/0x5f
|     device_add+0x34f/0x501
|     device_register+0x15/0x18
|     hwmon_device_register_with_groups+0xb5/0xed
|     radeon_hwmon_init+0x56/0x7c [radeon]
|     radeon_pm_init+0x134/0x7e5 [radeon]
|     radeon_modeset_init+0x75f/0x8ed [radeon]
|     radeon_driver_load_kms+0xc6/0x187 [radeon]
|     drm_dev_register+0xf9/0x1b4 [drm]
|     drm_get_pci_dev+0x98/0x129 [drm]
|     radeon_pci_probe+0xa3/0xac [radeon]
|     pci_device_probe+0x6e/0xcf
|     driver_probe_device+0x98/0x1c4
|     __driver_attach+0x5c/0x7e
|     bus_for_each_dev+0x7b/0x85
|     driver_attach+0x19/0x1b
|     bus_add_driver+0x104/0x1ce
|     driver_register+0x89/0xc5
|     __pci_register_driver+0x58/0x5b
|     drm_pci_init+0x86/0xea [drm]
|     radeon_init+0x97/0x1000 [radeon]
|     do_one_initcall+0x7f/0x117
|     load_module+0x1583/0x1bb4
|     SyS_init_module+0xa0/0xaf
|
| Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
| Cc: Borislav Petkov <bp@alien8.de>
| Cc: Alexander Deucher <Alexander.Deucher@amd.com>
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2013-12-15141-916/+1551
|\
| | Pull networking fixes from David Miller:
| |
| |  1) Revert CHECKSUM_COMPLETE optimization in pskb_trim_rcsum(), I can't
| |     figure out why it breaks things.
| |
| |  2) Fix comparison in netfilter ipset's hash_netnet4_data_equal(), it
| |     was basically doing "x == x", from Dave Jones.
| |
| |  3) Freescale FEC driver was DMA mapping the wrong number of bytes,
| |     from Sebastian Siewior.
| |
| |  4) Blackhole and prohibit routes in ipv6 were not doing the right
| |     thing because their ->input and ->output methods were not being
| |     assigned correctly. Now they behave properly like their ipv4
| |     counterparts. From Kamala R.
| |
| |  5) Several drivers advertise the NETIF_F_FRAGLIST capability, but
| |     really do not support this feature and will send garbage packets
| |     if fed fraglist SKBs. From Eric Dumazet.
| |
| |  6) Fix long standing user triggerable BUG_ON over loopback in RDS
| |     protocol stack, from Venkat Venkatsubra.
| |
| |  7) Several not so common code paths can potentially try to invoke
| |     packet scheduler actions that might be NULL without checking.
| |     Shore things up by either 1) defining a method as mandatory and
| |     erroring on registration if that method is NULL, or 2) defining a
| |     method as optional and having the registration function hook up a
| |     default implementation when NULL is seen. From Jamal Hadi Salim.
| |
| |  8) Fix fragment detection in xen-netback driver, from Paul Durrant.
| |
| |  9) Kill dangling enter_memory_pressure method in cg_proto ops, from
| |     Eric W Biederman.
| |
| | 10) SKBs that traverse namespaces should have their local_df cleared,
| |     from Hannes Frederic Sowa.
| |
| | 11) IOCB file position is not being updated by macvtap_aio_read() and
| |     tun_chr_aio_read(). From Zhi Yong Wu.
| |
| | 12) Don't free virtio_net netdev before releasing all of the NAPI
| |     instances. From Andrey Vagin.
| |
| | 13) Procfs entry leak in xt_hashlimit, from Sergey Popovich.
| |
| | 14) IPv6 routes that are not cached routes should not count against
| |     the garbage collection limits. We had this almost right, but were
| |     missing handling addrconf generated routes properly. From Hannes
| |     Frederic Sowa.
| |
| | 15) fib{4,6}_rule_suppress() have to consider potentially seeing NULL
| |     route info when they are called, from Stefan Tomanek.
| |
| | 16) TUN and MACVTAP have had truncated packet signalling broken for
| |     some time, fix from Jason Wang.
| |
| | 17) Fix use after free in __udp4_lib_rcv(), from Eric Dumazet.
| |
| | 18) xen-netback does not interpret the NAPI budget properly for TX
| |     work, fix from Paul Durrant.
| |
| | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (132 commits)
| |   igb: Fix for issue where values could be too high for udelay function.
| |   i40e: fix null dereference
| |   xen-netback: fix gso_prefix check
| |   net: make neigh_priv_len in struct net_device 16bit instead of 8bit
| |   drivers: net: cpsw: fix for cpsw crash when build as modules
| |   xen-netback: napi: don't prematurely request a tx event
| |   xen-netback: napi: fix abuse of budget
| |   sch_tbf: use do_div() for 64-bit divide
| |   udp: ipv4: must add synchronization in udp_sk_rx_dst_set()
| |   net:fec: remove duplicate lines in comment about errata ERR006358
| |   Revert "8390 : Replace ei_debug with msg_enable/NETIF_MSG_* feature"
| |   8390 : Replace ei_debug with msg_enable/NETIF_MSG_* feature
| |   xen-netback: make sure skb linear area covers checksum field
| |   net: smc91x: Fix device tree based configuration so it's usable
| |   udp: ipv4: fix potential use after free in udp_v4_early_demux()
| |   macvtap: signal truncated packets
| |   tun: unbreak truncated packet signalling
| |   net: sched: htb: fix the calculation of quantum
| |   net: sched: tbf: fix the calculation of max_size
| |   micrel: add support for KSZ8041RNLI
| |   ...
| * igb: Fix for issue where values could be too high for udelay function.Carolyn Wyborny2013-12-141-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch changes the igb_phy_has_link function to check the value of the parameter before deciding to use udelay or mdelay in order to be sure that the value is not too high for udelay function. CC: stable kernel <stable@vger.kernel.org> # 3.9+ Signed-off-by: Sunil K Pandey <sunil.k.pandey@intel.com> Signed-off-by: Kevin B Smith <kevin.b.smith@intel.com> Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
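The selection described in the igb message above can be modelled in a few lines. This is a user-space Python sketch, not the driver code: the function name `choose_delay` and the 1000 microsecond threshold are assumptions made for illustration, and the log does not state the exact cut-off the patch uses.

```python
def choose_delay(usec_interval, threshold_us=1000):
    """Pick a delay primitive for a requested interval in microseconds.

    Models the idea from the commit message: long intervals go to a
    millisecond-granularity delay, short ones to a microsecond delay.
    threshold_us is a hypothetical cut-off, not taken from the patch.
    """
    if usec_interval >= threshold_us:
        # Too large for a microsecond busy-wait: convert to milliseconds.
        return ("mdelay", usec_interval // 1000)
    return ("udelay", usec_interval)
```

For example, `choose_delay(100)` selects the microsecond primitive, while `choose_delay(100000)` selects the millisecond one with an argument of 100.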
| * i40e: fix null dereferenceJesse Brandeburg2013-12-141-0/+3
| | | | | | | | | | | | | | | | | | | | If the vsi->tx_rings structure is NULL we don't want to panic. Change-Id: Ic694f043701738c434e8ebe0caf0673f4410dc10 Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * xen-netback: fix gso_prefix checkPaul Durrant2013-12-121-1/+1
| | There is a mistake in checking the gso_prefix mask when passing large
| | packets to a guest. The wrong shift is applied to the bit - the raw skb
| | gso type is used rather than the translated one. This leads to large
| | packets being handed to the guest without the GSO metadata. This patch
| | fixes the check.
| |
| | The mistake manifested as errors whilst running Microsoft HCK large
| | packet offload tests between a pair of Windows 8 VMs. I have verified
| | this patch fixes those errors.
| |
| | Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
| | Cc: Wei Liu <wei.liu2@citrix.com>
| | Cc: Ian Campbell <ian.campbell@citrix.com>
| | Cc: David Vrabel <david.vrabel@citrix.com>
| | Acked-by: Ian Campbell <ian.campbell@citrix.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: make neigh_priv_len in struct net_device 16bit instead of 8bitSebastian Siewior2013-12-121-1/+1
| | neigh_priv_len is defined as u8. With all debug enabled struct
| | ipoib_neigh has 200 bytes. The largest part is sk_buff_head with 96
| | bytes and here the spinlock with 72 bytes. The size value still fits in
| | this u8, leaving some room for more.
| |
| | On -RT struct ipoib_neigh put on weight and has 392 bytes. The main
| | reason is sk_buff_head with 288 and the fatty here is spinlock with 192
| | bytes. This no longer fits into neigh_priv_len and gcc complains.
| |
| | This patch changes neigh_priv_len from being 8bit to 16bit. Since the
| | following element (dev_id) is 16bit followed by a spinlock which is
| | aligned, the struct remains with a total size of 3200 (allmodconfig) /
| | 2048 (with as much debug off as possible) bytes on x86-64.
| | On x86-32 the struct is 1856 (allmodconfig) / 1216 (with as much debug
| | off as possible) bytes long. The numbers were obtained with and without
| | the patch to prove that this change does not increase the size of the
| | struct.
| |
| | Cc: Steven Rostedt <rostedt@goodmis.org>
| | Cc: Thomas Gleixner <tglx@linutronix.de>
| | Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
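The width problem in the neigh_priv_len message can be checked with plain arithmetic. The sketch below is a Python model, not kernel code; the helper names `fits` and `truncate` are made up for illustration, and the sizes 200 and 392 are the ones quoted in the commit message.

```python
U8_MAX = 0xFF      # largest value an 8-bit unsigned field can hold
U16_MAX = 0xFFFF   # largest value a 16-bit unsigned field can hold


def fits(size, max_value):
    """Return True if size can be stored without loss in the field."""
    return 0 <= size <= max_value


def truncate(size, max_value):
    """Value that an unsigned field of this width would actually keep."""
    return size & max_value
```

With these helpers, 200 fits in 8 bits, 392 does not, and an 8-bit field would keep only the low byte of 392, which is why the field had to be widened to 16 bits.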
| * drivers: net: cpsw: fix for cpsw crash when build as modulesMugunthan V N2013-12-121-3/+14
| | When CPSW and Davinci MDIO are built as modules, CPSW crashes when
| | accessing CPSW registers in the CPSW probe. The same works when built
| | in, as the CPSW clocks are enabled in the Davinci MDIO probe. So enable
| | the clocks before accessing the version register and move the other
| | register accesses out to cpsw device open.
| |
| | Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
| | Signed-off-by: Felipe Balbi <balbi@ti.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * xen-netback: napi: don't prematurely request a tx eventPaul Durrant2013-12-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch changes the RING_FINAL_CHECK_FOR_REQUESTS in xenvif_build_tx_gops to a check for RING_HAS_UNCONSUMED_REQUESTS as the former call has the side effect of advancing the ring event pointer and therefore inviting another interrupt from the frontend before the napi poll has actually finished, thereby defeating the point of napi. The event pointer is updated by RING_FINAL_CHECK_FOR_REQUESTS in xenvif_poll, the napi poll function, if the work done is less than the budget i.e. when actually transitioning back to interrupt mode. Reported-by: Malcolm Crossley <malcolm.crossley@citrix.com> Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * xen-netback: napi: fix abuse of budgetPaul Durrant2013-12-121-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | netback seems to be somewhat confused about the napi budget parameter. The parameter is supposed to limit the number of skbs processed in each poll, but netback has this confused with grant operations. This patch fixes that, properly limiting the work done in each poll. Note that this limit makes sure we do not process any more data from the shared ring than we intend to pass back from the poll. This is important to prevent tx_queue potentially growing without bound. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * sch_tbf: use do_div() for 64-bit divideYang Yingliang2013-12-111-2/+4
| | psched_ns_t2l() is doing a 64-bit divide, which is not supported on
| | 32-bit architectures. The correct way to do this is to use do_div().
| |
| | The problem was introduced by commit cc106e441a63 ("net: sched: tbf:
| | fix the calculation of max_size")
| |
| | Reported-by: kbuild test robot <fengguang.wu@intel.com>
| | Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
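As a rough illustration of the arithmetic involved, the following Python model mimics a 64-bit by 32-bit division and uses it to convert a time in nanoseconds to a length in bytes. It is a sketch under assumptions: the kernel's do_div() is a macro that updates its first argument in place and returns the remainder, whereas this model returns a (quotient, remainder) pair, and `ns_to_len` is a hypothetical stand-in whose formula is assumed rather than copied from psched_ns_t2l().

```python
NSEC_PER_SEC = 1_000_000_000
U32_MAX = 0xFFFFFFFF
U64_MAX = 0xFFFFFFFFFFFFFFFF


def do_div(n, base):
    """Model of a 64-bit dividend divided by a 32-bit divisor.

    Returns (quotient, remainder) instead of modifying n in place.
    """
    if not 0 <= n <= U64_MAX:
        raise ValueError("dividend must fit in 64 bits")
    if not 0 < base <= U32_MAX:
        raise ValueError("divisor must be a non-zero 32-bit value")
    return n // base, n % base


def ns_to_len(time_in_ns, rate_bytes_ps):
    """Bytes that can be sent in time_in_ns at rate_bytes_ps (assumed formula)."""
    # A u64 multiplication wraps on overflow; the mask models that.
    product = (time_in_ns * rate_bytes_ps) & U64_MAX
    length, _remainder = do_div(product, NSEC_PER_SEC)
    return length
```

At 30 mbit/s (3,750,000 bytes per second), one millisecond of tokens corresponds to 3750 bytes in this model.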
| * udp: ipv4: must add synchronization in udp_sk_rx_dst_set()Eric Dumazet2013-12-111-6/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Unlike TCP, UDP input path does not hold the socket lock. Before messing with sk->sk_rx_dst, we must use a spinlock, otherwise multiple cpus could leak a refcount. This patch also takes care of renewing a stale dst entry. (When the sk->sk_rx_dst would not be used by IP early demux) Fixes: 421b3885bf6d ("udp: ipv4: Add udp early demux") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Shawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net:fec: remove duplicate lines in comment about errata ERR006358Philippe De Muyter2013-12-111-4/+0
| | | | | | | | | | | | | | | | | | commit 031916568a1aa2ef1809f86d26f0bcfa215ff5c0 worked around errata ERR006358, but comment contains duplicated lines, impairing the readability. Remove them. Signed-off-by: Philippe De Muyter <phdm@macqel.be> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Revert "8390 : Replace ei_debug with msg_enable/NETIF_MSG_* feature"David S. Miller2013-12-1116-426/+302
| | This reverts commit 99023e90fe5c147ea0665bda86764ea44f08a622.
| |
| | Accidentally checked this into 'net' instead of 'net-next'.
| |
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * 8390 : Replace ei_debug with msg_enable/NETIF_MSG_* featureMatthew Whitehead2013-12-1116-302/+426
| | Removed the shared ei_debug variable. Replaced it by adding u32
| | msg_enable to the private struct ei_device. Now each 8390 ethernet
| | instance has a per-device logging variable.
| |
| | Changed older style printk() calls to more canonical forms.
| |
| | Tested on: ne, ne2k-pci, smc-ultra, and wd hardware.
| |
| | V4.0
| | - Substituted pr_info() and pr_debug() for printk() KERN_INFO and
| |   KERN_DEBUG
| |
| | V3.0
| | - Checked for cases where pr_cont() was most appropriate choice.
| | - Changed module parameter from 'debug' to 'msg_enable' because debug
| |   was no longer the best description.
| |
| | V2.0
| | - Changed netif_msg_(drv|probe|ifdown|rx_err|tx_err|tx_queued|intr|rx_status|hw)
| |   to netif_(dbg|info|warn|err) where possible.
| |
| | Signed-off-by: Matthew Whitehead <tedheadster@gmail.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * xen-netback: make sure skb linear area covers checksum fieldPaul Durrant2013-12-111-32/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | skb_partial_csum_set requires that the linear area of the skb covers the checksum field. The checksum setup code in netback was only doing that pullup in the case when the pseudo header checksum was being recalculated though. This patch makes that pullup unconditional. (I pullup the whole transport header just for simplicity; the requirement is only for the check field but in the case of UDP this is the last field in the header and in the case of TCP it's the last but one). The lack of pullup manifested as failures running Microsoft HCK network tests on a pair of Windows 8 VMs and it has been verified that this patch fixes the problem. Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: smc91x: Fix device tree based configuration so it's usableTony Lindgren2013-12-112-10/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 89ce376c6bdc (drivers/net: Use of_match_ptr() macro in smc91x.c) added minimal device tree support to smc91x, but it's not working on many platforms because of the lack of some key configuration bits. Fix the issue by parsing the necessary configuration like the smc911x driver is doing. As most smc91x users seem to use 16-bit access, let's default to that if no reg-io-width is specified. Cc: Nicolas Pitre <nico@fluxnic.net> Cc: Mark Rutland <mark.rutland@arm.com> Cc: netdev@vger.kernel.org Cc: devicetree@vger.kernel.org Acked-by: Nishanth Menon <nm@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * udp: ipv4: fix potential use after free in udp_v4_early_demux()Eric Dumazet2013-12-111-3/+6
| | | | | | | | | | | | | | | | | | | | pskb_may_pull() can reallocate skb->head, we need to move the initialization of iph and uh pointers after its call. Fixes: 421b3885bf6d ("udp: ipv4: Add udp early demux") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Shawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by: David S. Miller <davem@davemloft.net>
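The ordering problem described in the udp_v4_early_demux() message can be shown with a toy model. The Python sketch below is not kernel code: `Skb`, `parse_buggy` and `parse_fixed` are hypothetical names, and `may_pull` simply assumes the worst case in which the head buffer is always reallocated.

```python
class Skb:
    """Toy stand-in for struct sk_buff; only models a reallocatable head."""

    def __init__(self, data):
        self.head = bytearray(data)

    def may_pull(self, length):
        """Model of pskb_may_pull(): may replace self.head with a new buffer."""
        if length > len(self.head):
            return False
        # Worst case for callers: the buffer is always reallocated.
        self.head = bytearray(self.head)
        return True


def parse_buggy(skb, hdr_len):
    hdr = skb.head                  # header reference taken too early
    if not skb.may_pull(hdr_len):
        return None
    return hdr                      # refers to the old buffer


def parse_fixed(skb, hdr_len):
    if not skb.may_pull(hdr_len):
        return None
    return skb.head                 # reference taken after the possible realloc
```

In the buggy variant the returned object is no longer the buffer the skb owns, which is the Python analogue of a dangling pointer; the fixed variant takes the reference only after the call that may reallocate.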
| * macvtap: signal truncated packetsJason Wang2013-12-111-5/+6
| | macvtap_put_user() never returns a value greater than the iov length,
| | which in fact bypasses the truncation check in macvtap_recvmsg(). Fix
| | this by always returning the size of the packet plus the possible vlan
| | header, so that the truncation check works.
| |
| | Cc: Vlad Yasevich <vyasevich@gmail.com>
| | Cc: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
| | Cc: Michael S. Tsirkin <mst@redhat.com>
| | Signed-off-by: Jason Wang <jasowang@redhat.com>
| | Acked-by: Vlad Yasevich <vyasevich@gmail.com>
| | Acked-by: Michael S. Tsirkin <mst@redhat.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * tun: unbreak truncated packet signallingJason Wang2013-12-111-7/+9
| | Commit 6680ec68eff47d36f67b4351bc9836fd6cba9532
| | (tuntap: hardware vlan tx support) breaks the truncated packet signal by
| | never returning a length greater than the iov length in tun_put_user().
| | This patch fixes this by always returning the length of the packet plus
| | the possible vlan header. The caller can detect a truncated packet by
| | comparing the return value with the iov length.
| |
| | Cc: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
| | Cc: Michael S. Tsirkin <mst@redhat.com>
| | Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
| | Signed-off-by: Jason Wang <jasowang@redhat.com>
| | Acked-by: Michael S. Tsirkin <mst@redhat.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
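The macvtap and tun fixes rely on the same convention: the copy helper reports the full packet length, and the caller compares it with the buffer size. A minimal Python model of that convention follows. The functions `put_user` and `recvmsg` are simplified, hypothetical stand-ins, the vlan header is only counted and not copied, and `MSG_TRUNC` is defined locally with its Linux value.

```python
MSG_TRUNC = 0x20  # value of MSG_TRUNC on Linux


def put_user(packet, iov_len, vlan_hlen=0):
    """Copy at most iov_len bytes, but report the full length needed."""
    total = len(packet) + vlan_hlen
    copied = packet[:iov_len]
    return total, copied


def recvmsg(packet, iov_len, vlan_hlen=0):
    """Return (bytes delivered, flags, data) as a caller would see them."""
    total, copied = put_user(packet, iov_len, vlan_hlen)
    # Truncation is detectable only because total may exceed iov_len.
    flags = MSG_TRUNC if total > iov_len else 0
    return min(total, iov_len), flags, copied
```

If `put_user` instead returned `min(total, iov_len)`, the comparison in `recvmsg` could never be true, which is the behaviour the two commits describe as broken.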
| * net: sched: htb: fix the calculation of quantumYang Yingliang2013-12-111-8/+12
| | Now, the 32bit rate may not be the true rate. So use rate_bytes_ps,
| | which is taken from max(rate32, rate64), to calculate the quantum.
| |
| | Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
| | Acked-by: Eric Dumazet <edumazet@google.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: sched: tbf: fix the calculation of max_sizeYang Yingliang2013-12-111-45/+70
| | Currently max_size is calculated from the rate table. Now that the rate
| | table has been replaced, it is wrong to calculate max_size based on it.
| | Doing so can lead to a wrong max_size.
| |
| | The burst in the kernel may be lower than what the user asked for,
| | because the burst may lose some precision when it is transformed to
| | buffer (e.g. "burst 40kb rate 30mbit/s"), and it seems we cannot avoid
| | this loss. The burst value (max_size) based on the rate table may be
| | equal to what the user asked for. If a packet's length is max_size, this
| | packet will be stalled in tbf_dequeue() because its length is above the
| | burst in the kernel, so that it cannot get enough tokens. The max_size
| | guards against enqueuing packet sizes above q->buffer "time" in
| | tbf_enqueue().
| |
| | To be consistent with the calculation of tokens, this patch adds a
| | helper psched_ns_t2l() to calculate burst (max_size) directly to fix
| | this problem.
| |
| | After this fix, 64bit rates can be used to calculate the burst as well.
| |
| | Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
| | Acked-by: Eric Dumazet <edumazet@google.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * micrel: add support for KSZ8041RNLISergei Shtylyov2013-12-112-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | Renesas R-Car development boards use KSZ8041RNLI PHY which for some reason has ID of 0x00221537 that is not documented for KSZ8041-family PHYs and does not match the documented ID of 0x0022151x (where 'x' is the revision). We have to add the new #define PHY_ID_* and new ksphy_driver[] entry, almost the same as KSZ8041 one, differing only in the 'phy_id' and 'name' fields. Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge branch 'for-davem' of ↵David S. Miller2013-12-111-0/+4
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== Just one patch this time -- a fix from Felix Fietkau to fix the duration calculation for non-aggregated packets in ath9k. This is a small change and it is obviously specific to ath9k. Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * Merge branch 'master' of ↵John W. Linville2013-12-111-0/+4
| | |\ | |/ / | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
| | * ath9k: fix duration calculation for non-aggregated packetsFelix Fietkau2013-12-091-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When not aggregating packets, fi->framelen should be passed in as length to calculate the duration. Before the tx path rework, ath_tx_fill_desc was called for either one aggregate, or one single frame, with the length of the packet or the aggregate as a parameter. After the rework, ath_tx_sched_aggr can pass a burst of single frames to ath_tx_fill_desc and sets len=0. Fix broken duration calculation by overriding the length in ath_tx_fill_desc before passing it to ath_buf_set_rate. Cc: stable@vger.kernel.org Reported-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| * | udp: ipv4: fix an use after free in __udp4_lib_rcv()Eric Dumazet2013-12-101-10/+6
| | | Dave Jones reported a use after free in UDP stack :
| | |
| | | [ 5059.434216] =========================
| | | [ 5059.434314] [ BUG: held lock freed! ]
| | | [ 5059.434420] 3.13.0-rc3+ #9 Not tainted
| | | [ 5059.434520] -------------------------
| | | [ 5059.434620] named/863 is freeing memory ffff88005e960000-ffff88005e96061f, with a lock still held there!
| | | [ 5059.434815] (slock-AF_INET){+.-...}, at: [<ffffffff8149bd21>] udp_queue_rcv_skb+0xd1/0x4b0
| | | [ 5059.435012] 3 locks held by named/863:
| | | [ 5059.435086] #0: (rcu_read_lock){.+.+..}, at: [<ffffffff8143054d>] __netif_receive_skb_core+0x11d/0x940
| | | [ 5059.435295] #1: (rcu_read_lock){.+.+..}, at: [<ffffffff81467a5e>] ip_local_deliver_finish+0x3e/0x410
| | | [ 5059.435500] #2: (slock-AF_INET){+.-...}, at: [<ffffffff8149bd21>] udp_queue_rcv_skb+0xd1/0x4b0
| | | [ 5059.435734] stack backtrace:
| | | [ 5059.435858] CPU: 0 PID: 863 Comm: named Not tainted 3.13.0-rc3+ #9 [loadavg: 0.21 0.06 0.06 1/115 1365]
| | | [ 5059.436052] Hardware name: /D510MO, BIOS MOPNV10J.86A.0175.2010.0308.0620 03/08/2010
| | | [ 5059.436223] 0000000000000002 ffff88007e203ad8 ffffffff8153a372 ffff8800677130e0
| | | [ 5059.436390] ffff88007e203b10 ffffffff8108cafa ffff88005e960000 ffff88007b00cfc0
| | | [ 5059.436554] ffffea00017a5800 ffffffff8141c490 0000000000000246 ffff88007e203b48
| | | [ 5059.436718] Call Trace:
| | | [ 5059.436769] <IRQ> [<ffffffff8153a372>] dump_stack+0x4d/0x66
| | | [ 5059.436904] [<ffffffff8108cafa>] debug_check_no_locks_freed+0x15a/0x160
| | | [ 5059.437037] [<ffffffff8141c490>] ? __sk_free+0x110/0x230
| | | [ 5059.437147] [<ffffffff8112da2a>] kmem_cache_free+0x6a/0x150
| | | [ 5059.437260] [<ffffffff8141c490>] __sk_free+0x110/0x230
| | | [ 5059.437364] [<ffffffff8141c5c9>] sk_free+0x19/0x20
| | | [ 5059.437463] [<ffffffff8141cb25>] sock_edemux+0x25/0x40
| | | [ 5059.437567] [<ffffffff8141c181>] sock_queue_rcv_skb+0x81/0x280
| | | [ 5059.437685] [<ffffffff8149bd21>] ? udp_queue_rcv_skb+0xd1/0x4b0
| | | [ 5059.437805] [<ffffffff81499c82>] __udp_queue_rcv_skb+0x42/0x240
| | | [ 5059.437925] [<ffffffff81541d25>] ? _raw_spin_lock+0x65/0x70
| | | [ 5059.438038] [<ffffffff8149bebb>] udp_queue_rcv_skb+0x26b/0x4b0
| | | [ 5059.438155] [<ffffffff8149c712>] __udp4_lib_rcv+0x152/0xb00
| | | [ 5059.438269] [<ffffffff8149d7f5>] udp_rcv+0x15/0x20
| | | [ 5059.438367] [<ffffffff81467b2f>] ip_local_deliver_finish+0x10f/0x410
| | | [ 5059.438492] [<ffffffff81467a5e>] ? ip_local_deliver_finish+0x3e/0x410
| | | [ 5059.438621] [<ffffffff81468653>] ip_local_deliver+0x43/0x80
| | | [ 5059.438733] [<ffffffff81467f70>] ip_rcv_finish+0x140/0x5a0
| | | [ 5059.438843] [<ffffffff81468926>] ip_rcv+0x296/0x3f0
| | | [ 5059.438945] [<ffffffff81430b72>] __netif_receive_skb_core+0x742/0x940
| | | [ 5059.439074] [<ffffffff8143054d>] ? __netif_receive_skb_core+0x11d/0x940
| | | [ 5059.442231] [<ffffffff8108c81d>] ? trace_hardirqs_on+0xd/0x10
| | | [ 5059.442231] [<ffffffff81430d83>] __netif_receive_skb+0x13/0x60
| | | [ 5059.442231] [<ffffffff81431c1e>] netif_receive_skb+0x1e/0x1f0
| | | [ 5059.442231] [<ffffffff814334e0>] napi_gro_receive+0x70/0xa0
| | | [ 5059.442231] [<ffffffffa01de426>] rtl8169_poll+0x166/0x700 [r8169]
| | | [ 5059.442231] [<ffffffff81432bc9>] net_rx_action+0x129/0x1e0
| | | [ 5059.442231] [<ffffffff810478cd>] __do_softirq+0xed/0x240
| | | [ 5059.442231] [<ffffffff81047e25>] irq_exit+0x125/0x140
| | | [ 5059.442231] [<ffffffff81004241>] do_IRQ+0x51/0xc0
| | | [ 5059.442231] [<ffffffff81542bef>] common_interrupt+0x6f/0x6f
| | |
| | | We need to keep a reference on the socket, by using skb_steal_sock()
| | | at the right place.
| | |
| | | Note that another patch is needed to fix a race in
| | | udp_sk_rx_dst_set(), as we hold no lock protecting the dst.
| | |
| | | Fixes: 421b3885bf6d ("udp: ipv4: Add udp early demux")
| | | Reported-by: Dave Jones <davej@redhat.com>
| | | Signed-off-by: Eric Dumazet <edumazet@google.com>
| | | Cc: Shawn Bohrer <sbohrer@rgmadvisors.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | Merge branch 'sctp'David S. Miller2013-12-102-19/+89
| |\ \
| | | | Wang Weidong says:
| | | |
| | | | ====================
| | | | sctp: check the rto_min and rto_max
| | | |
| | | | v6 -> v7:
| | | |  - patch2: fix the whitespace issues pointed out by Daniel
| | | |
| | | | v5 -> v6:
| | | |  split the first patch of v5 into patch1 and patch2, and remove the
| | | |  macro in constants.h
| | | |  - patch1: do rto_min/max socket option handling in its own patch, and
| | | |    fix the check of rto_min/max.
| | | |  - patch2: do rto_min/max sysctl handling in its own patch.
| | | |  - patch3: add Suggested-by Daniel.
| | | |
| | | | v4 -> v5:
| | | |  - patch1: add macro in constants.h and fix up spacing as suggested by
| | | |    Daniel
| | | |  - patch2: add a patch to fix up do_hmac_alg to match do_rto_min[max]
| | | |
| | | | v3 -> v4:
| | | |  - patch1: fix the direct use of init_net, as suggested by Vlad.
| | | |
| | | | v2 -> v3:
| | | |  - patch1: add proc_handler to check rto_min and rto_max, as suggested
| | | |    by Vlad
| | | |
| | | | v1 -> v2:
| | | |  - patch1: fix the From name as pointed out by David, and add the ACK
| | | |    by Neil
| | | | ====================
| | | |
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | sctp: fix up a spacingwangweidong2013-12-101-5/+2
| | | | Fix up the spacing of proc_sctp_do_hmac_alg to match
| | | | proc_sctp_do_rto_min[max] in sysctl.c
| | | |
| | | | Suggested-by: Daniel Borkmann <dborkman@redhat.com>
| | | | Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
| | | | Acked-by: Vlad Yasevich <vyasevich@gmail.com>
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | sctp: add check rto_min and rto_max in sysctlwangweidong2013-12-101-4/+65
| | | | rto_min should be smaller than rto_max while rto_max should be larger
| | | | than rto_min. Add two proc_handlers to check this.
| | | |
| | | | Suggested-by: Vlad Yasevich <vyasevich@gmail.com>
| | | | Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
| | | | Acked-by: Vlad Yasevich <vyasevich@gmail.com>
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | sctp: check the rto_min and rto_max in setsockoptwangweidong2013-12-101-10/+22
| |/ /
| | | When rto_min or rto_max is set to 0, leave the value unchanged. Also
| | | check for rto_min > rto_max.
| | |
| | | Suggested-by: Vlad Yasevich <vyasevich@gmail.com>
| | | Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
| | | Acked-by: Vlad Yasevich <vyasevich@gmail.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
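The validation rule from the sctp setsockopt message is small enough to model directly. The Python sketch below assumes a hypothetical helper `set_rto` and uses `ValueError` where the kernel would return an error code; it is not the sctp implementation.

```python
def set_rto(cur_min, cur_max, new_min, new_max):
    """Apply new RTO bounds; a value of 0 means "keep the current one".

    Raises ValueError if the resulting minimum exceeds the maximum.
    """
    rto_min = new_min if new_min != 0 else cur_min
    rto_max = new_max if new_max != 0 else cur_max
    if rto_min > rto_max:
        raise ValueError("rto_min must not be larger than rto_max")
    return rto_min, rto_max
```

Note that the comparison is made on the values that would result after the update, so a new minimum is checked against the existing maximum when the maximum is left at 0.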
| * | ipv6: do not erase dst address with flow label destinationFlorent Fourcot2013-12-106-6/+0
| | | This patch follows b579035ff766c9412e2b92abf5cab794bff102b6
| | | "ipv6: remove old conditions on flow label sharing"
| | |
| | | Since there is no reason to restrict a label to a destination, we
| | | should not erase the destination value of a socket with the value
| | | contained in the flow label storage.
| | |
| | | This patch really allows the same flow label to be used for more than
| | | one destination.
| | |
| | | Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr>
| | | Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | sctp: properly latch and use autoclose value from sock to associationNeil Horman2013-12-105-17/+13
| | | Currently, sctp associations latch a socket's autoclose value to an
| | | association at association init time, subject to capping constraints
| | | from the max_autoclose sysctl value. This leads to an odd situation
| | | where an application may set a socket level autoclose timeout, but
| | | sctp will silently limit the autoclose timeout to something less than
| | | that.
| | |
| | | Fix this by modifying the autoclose setsockopt function to check the
| | | limit, cap it and warn the user via syslog that the timeout is capped.
| | | This will allow getsockopt to return valid autoclose timeout values
| | | that reflect what subsequent associations actually use.
| | |
| | | While we're at it, also eliminate the assoc->autoclose variable. It
| | | duplicates what's in the timeout array, which leads to multiple
| | | sources for the same information that may differ (as the former isn't
| | | subject to any capping). This gives us the timeout information in a
| | | canonical place and saves some space in the association structure as
| | | well.
| | |
| | | Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
| | | Acked-by: Vlad Yasevich <vyasevich@gmail.com>
| | | CC: Wang Weidong <wangweidong1@huawei.com>
| | | CC: David Miller <davem@davemloft.net>
| | | CC: Vlad Yasevich <vyasevich@gmail.com>
| | | CC: netdev@vger.kernel.org
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
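The "check the limit, cap it and warn" behaviour described in the autoclose message can be sketched as follows. This is a Python model with a hypothetical function `set_autoclose`; the warning callback stands in for the syslog message, whose wording here is invented.

```python
def set_autoclose(requested, max_autoclose, warn=print):
    """Return the autoclose timeout that will actually be stored.

    Values above max_autoclose are capped at set time, and the caller is
    told about it, so a later read returns the value really in use.
    """
    if requested > max_autoclose:
        warn(f"autoclose {requested} capped to {max_autoclose}")
        return max_autoclose
    return requested
```

Capping at set time rather than at association init is what keeps the stored value and the value used by later associations identical.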
| * | Merge branch 'tipc'  David S. Miller  2013-12-10  2  -6/+12
| |\ \
| | | | Jon Maloy says:
| | | |
| | | | ====================
| | | | tipc: corrections related to tasklet job mechanism
| | | |
| | | | These commits correct two bugs related to tipc's service for launching functions for asynchronous execution in a separate tasklet.
| | | | ====================
| | | |
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | tipc: protect handler_enabled variable with qitem_lock spin lock  Ying Xue  2013-12-10  1  -3/+8
| | | |
| | | | 'handler_enabled' is a global flag indicating whether the TIPC signal handling service is enabled or not. The lack of lock protection for this flag incurs a risk of contention, so that a tipc_k_signal() call might queue a signal handler to a destroyed signal queue, with unpredictable results. To correct this, we let the already existing 'qitem_lock' protect the flag, as it already does with the queue itself. This way, we ensure that the flag is always consistent across all cores.
| | | |
| | | | Signed-off-by: Ying Xue <ying.xue@windriver.com>
| | | | Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
| | | | Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | tipc: correct the order of stopping services at rmmod  Jon Paul Maloy  2013-12-10  1  -3/+4
| |/ /
| | | The 'signal handler' service in TIPC is a mechanism that makes it possible to postpone execution of functions, by launching them into a job queue for execution in a separate tasklet, independent of the launching execution thread.
| | |
| | | When we do rmmod on the tipc module, this service is stopped after the network service. At the same time, the stopping of the network service may itself launch jobs for execution, with the risk that these functions may be scheduled for execution after the data structures meant to be accessed by the job have already been deleted. We have seen this happen, most often resulting in an oops.
| | |
| | | This commit ensures that the signal handler is the very first to be stopped when TIPC is shut down, so there are no surprises during the cleanup of the other services.
| | |
| | | Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
| | | Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | tg3: Initialize REG_BASE_ADDR at PCI config offset 120 to 0  Nat Gurumoorthy  2013-12-10  1  -0/+3
| | |
| | | The new tg3 driver leaves REG_BASE_ADDR (PCI config offset 120) uninitialized. After a power-on reset this register may have garbage in it. The Register Base Address register defines the device local address of a register. The data pointed to by this location is read or written using the Register Data register (PCI config offset 128). When REG_BASE_ADDR has garbage, any read or write of the Register Data register (PCI 128) will cause the PCI bus to lock up. The TCO watchdog will fire and bring down the system.
| | |
| | | Signed-off-by: Nat Gurumoorthy <natg@google.com>
| | | Acked-by: Michael Chan <mchan@broadcom.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: Revert macvtap/tun truncation signalling changes.  David S. Miller  2013-12-10  2  -26/+24
| | |
| | | Jason Wang and Michael S. Tsirkin are still discussing how to properly fix this.
| | |
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | macvtap: signal truncated packets  Jason Wang  2013-12-10  1  -13/+14
| | |
| | | macvtap_put_user() never returns a value greater than the iov length, which in fact bypasses the truncation check in macvtap_recvmsg(). Fix this by always returning the size of the packet plus the possible vlan header so that the truncation check works.
| | |
| | | Cc: Vlad Yasevich <vyasevich@gmail.com>
| | | Cc: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
| | | Signed-off-by: Jason Wang <jasowang@redhat.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | tun: unbreak truncated packet signalling  Jason Wang  2013-12-10  1  -11/+12
| | |
| | | Commit 6680ec68eff47d36f67b4351bc9836fd6cba9532 (tuntap: hardware vlan tx support) breaks the truncated packet signal by never returning a length greater than the iov length in tun_put_user(). This patch fixes this by always returning the length of the packet plus the possible vlan header. The caller can detect a truncated packet by comparing the return value with the iov length.
| | |
| | | Reported-by: Vlad Yasevich <vyasevich@gmail.com>
| | | Cc: Vlad Yasevich <vyasevich@gmail.com>
| | | Cc: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
| | | Signed-off-by: Jason Wang <jasowang@redhat.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
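
The convention described in the tun commit message above (return the full packet length and let the caller compare it with the buffer size) might look roughly like this in isolation. This is a user-space sketch, not the tun or macvtap code; the names `put_packet` and `is_truncated` are hypothetical, and the vlan header is left out for brevity.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy at most buf_len bytes of the packet into buf,
 * but return the full packet length so the caller can detect truncation. */
static size_t put_packet(void *buf, size_t buf_len,
                         const void *pkt, size_t pkt_len)
{
    size_t copied = pkt_len < buf_len ? pkt_len : buf_len;

    memcpy(buf, pkt, copied);
    return pkt_len; /* full length, not the number of bytes copied */
}

/* The caller compares the returned length with the size of its buffer. */
static int is_truncated(size_t returned_len, size_t buf_len)
{
    return returned_len > buf_len;
}
```

If the helper returned only the number of bytes copied, the result could never exceed the buffer size and the caller's check would never fire, which is the failure mode the commit message describes.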
| * | vxlan: release rt when found circular route  Fan Du  2013-12-10  1  -1/+1
| | |
| | | Otherwise the dst entry is leaked.
| | |
| | | The transmit implementations of all other tunnel device types have been checked, and no such problem occurs there.
| | |
| | | Signed-off-by: Fan Du <fan.du@windriver.com>
| | | Acked-by: Eric Dumazet <edumazet@google.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: unix: allow set_peek_off to fail  Sasha Levin  2013-12-10  3  -4/+8
| | |
| | | unix_dgram_recvmsg() will hold the readlock of the socket until recv is complete.
| | |
| | | At the same time, we may try to setsockopt(SO_PEEK_OFF), which will hang until unix_dgram_recvmsg() completes (which can take a while) without allowing us to break out of it, triggering a hung task spew.
| | |
| | | Instead, allow set_peek_off to fail so that userspace does not hang.
| | |
| | | Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
| | | Acked-by: Pavel Emelyanov <xemul@parallels.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | Merge branch 'sfc-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc  David S. Miller  2013-12-10  6  -23/+101
| |\ \
| | | | Ben Hutchings says:
| | | |
| | | | ====================
| | | | Several fixes for the PTP hardware support added in 3.7:
| | | |
| | | | 1. Fix filtering of PTP packets on the TX path to be robust against bad header lengths.
| | | | 2. Limit logging on the RX path in case of a PTP packet flood, partly from Laurence Evans.
| | | | 3. Disable PTP hardware when the interface is down so that we don't receive RX timestamp events, from Alexandre Rames.
| | | | 4. Maintain clock frequency adjustment when a time offset is applied.
| | | |
| | | | Also fixes for the SFC9100 family support added in 3.12:
| | | |
| | | | 5. Take the RX prefix length into account when applying NET_IP_ALIGN, from Andrew Rybchenko.
| | | | 6. Work around a bug that breaks communication between the driver and firmware, from Robert Stonehouse.
| | | |
| | | | Please also queue these up for the appropriate stable branches.
| | | | ====================
| | | |
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | sfc: Poll for MCDI completion once before timeout occurs  Robert Stonehouse  2013-12-06  1  -4/+12
| | | |
| | | | There is an as-yet unexplained bug that sometimes prevents (or delays) the driver seeing the completion event for a completed MCDI request on the SFC9120. The requested configuration change will have happened but the driver assumes it to have failed, and this can result in further failures. We can mitigate this by polling for completion after unsuccessfully waiting for an event.
| | | |
| | | | Fixes: 8127d661e77f ('sfc: Add support for Solarflare SFC9100 family')
| | | | Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
| | * | sfc: Refactor efx_mcdi_poll() by introducing efx_mcdi_poll_once()  Robert Stonehouse  2013-12-06  1  -6/+17
| | | |
| | | | Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
| | * | sfc: RX buffer allocation takes prefix size into account in IP header alignment  Andrew Rybchenko  2013-12-06  3  -4/+9
| | | |
| | | | rx_prefix_size is 4-byte aligned on Falcon/Siena (16 bytes), but it is equal to 14 on EF10. So, it should be taken into account if the architecture requires the IP header to be 4-byte aligned (via NET_IP_ALIGN).
| | | |
| | | | Fixes: 8127d661e77f ('sfc: Add support for Solarflare SFC9100 family')
| | | | Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
| | * | sfc: Maintain current frequency adjustment when applying a time offset  Ben Hutchings  2013-12-06  1  -2/+2
| | | |
| | | | There is a single MCDI PTP operation for setting the frequency adjustment and applying a time offset to the hardware clock. When applying a time offset we should not change the frequency adjustment.
| | | |
| | | | These two operations can now be requested separately but this requires a flash firmware update. Keep using the single operation, but remember and repeat the previous frequency adjustment.
| | | |
| | | | Fixes: 7c236c43b838 ('sfc: Add support for IEEE-1588 PTP')
| | | | Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
| | * | sfc: Stop/re-start PTP when stopping/starting the datapath.  Alexandre Rames  2013-12-06  3  -3/+33
| | | |
| | | | This disables PTP when we bring the interface down to avoid getting unmatched RX timestamp events, and tries to re-enable it when bringing the interface up.
| | | |
| | | | [bwh: Make efx_ptp_stop() safe on Falcon. Introduce efx_ptp_{start,stop}_datapath() functions; we'll expand them later.]
| | | |
| | | | Fixes: 7c236c43b838 ('sfc: Add support for IEEE-1588 PTP')
| | | | Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
| | * | sfc: Rate-limit log message for PTP packets without a matching timestamp event  Ben Hutchings  2013-12-06  1  -2/+3
| | | |
| | | | In case of a flood of PTP packets, the timestamp peripheral and MC firmware on the SFN[56]322F boards may not be able to provide timestamp events for all packets. Don't complain too much about this.
| | | |
| | | | Fixes: 7c236c43b838 ('sfc: Add support for IEEE-1588 PTP')
| | | | Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>