summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* ipv4: make sure RTO_ONLINK is saved in routing cacheJulian Anastasov2011-12-031-6/+6
| | | | | | | | | | __mkroute_output fails to work with the original tos and uses value with stripped RTO_ONLINK bit. Make sure we put the original TOS bits into rt_key_tos because it used to match cached route. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2011-12-0177-280/+589
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (73 commits) netfilter: Remove ADVANCED dependency from NF_CONNTRACK_NETBIOS_NS ipv4: flush route cache after change accept_local sch_red: fix red_change Revert "udp: remove redundant variable" bridge: master device stuck in no-carrier state forever when in user-stp mode ipv4: Perform peer validation on cached route lookup. net/core: fix rollback handler in register_netdevice_notifier sch_red: fix red_calc_qavg_from_idle_time bonding: only use primary address for ARP ipv4: fix lockdep splat in rt_cache_seq_show sch_teql: fix lockdep splat net: fec: Select the FEC driver by default for i.MX SoCs isdn: avoid copying too long drvid isdn: make sure strings are null terminated netlabel: Fix build problems when IPv6 is not enabled sctp: better integer overflow check in sctp_auth_create_key() sctp: integer overflow in sctp_auth_create_key() ipv6: Set mcast_hops to IPV6_DEFAULT_MCASTHOPS when -1 was given. net: Fix corruption in /proc/*/net/dev_mcast mac80211: fix race between the AGG SM and the Tx data path ...
| * netfilter: Remove ADVANCED dependency from NF_CONNTRACK_NETBIOS_NSDavid S. Miller2011-12-011-1/+0
| | | | | | | | | | | | firewalld in Fedora 16 needs this. Signed-off-by: David S. Miller <davem@davemloft.net>
| * ipv4: flush route cache after change accept_localPeter Pan(潘卫平)2011-12-011-0/+5
| | | | | | | | | | | | | | | | | | After reset ipv4_devconf->data[IPV4_DEVCONF_ACCEPT_LOCAL] to 0, we should flush route cache, or it will continue receive packets with local source address, which should be dropped. Signed-off-by: Weiping Pan <panweiping3@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * sch_red: fix red_changeEric Dumazet2011-12-011-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Le mercredi 30 novembre 2011 à 14:36 -0800, Stephen Hemminger a écrit : > (Almost) nobody uses RED because they can't figure it out. > According to Wikipedia, VJ says that: > "there are not one, but two bugs in classic RED." RED is useful for high throughput routers, I doubt many linux machines act as such devices. I was considering adding Adaptative RED (Sally Floyd, Ramakrishna Gummadi, Scott Shender), August 2001 In this version, maxp is dynamic (from 1% to 50%), and user only have to setup min_th (target average queue size) (max_th and wq (burst in linux RED) are automatically setup) By the way it seems we have a small bug in red_change() if (skb_queue_empty(&sch->q)) red_end_of_idle_period(&q->parms); First, if queue is empty, we should call red_start_of_idle_period(&q->parms); Second, since we dont use anymore sch->q, but q->qdisc, the test is meaningless. Oh well... [PATCH] sch_red: fix red_change() Now RED is classful, we must check q->qdisc->q.qlen, and if queue is empty, we start an idle period, not end it. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Revert "udp: remove redundant variable"David S. Miller2011-12-012-14/+16
| | | | | | | | | | | | | | | | | | | | | | This reverts commit 81d54ec8479a2c695760da81f05b5a9fb2dbe40a. If we take the "try_again" goto, due to a checksum error, the 'len' has already been truncated. So we won't compute the same values as the original code did. Reported-by: paul bilke <fsmail@conspiracy.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * bridge: master device stuck in no-carrier state forever when in user-stp modeVitalii Demianets2011-12-012-15/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When in user-stp mode, bridge master do not follow state of its slaves, so after the following sequence of events it can stuck forever in no-carrier state: 1) turn stp off 2) put all slaves down - master device will follow their state and also go in no-carrier state 3) turn stp on with bridge-stp script returning 0 (go to the user-stp mode) Now bridge master won't follow slaves' state and will never reach running state. This patch solves the problem by making user-stp and kernel-stp behavior similar regarding master following slaves' states. Signed-off-by: Vitalii Demianets <vitas@nppfactor.kiev.ua> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ipv4: Perform peer validation on cached route lookup.David S. Miller2011-12-011-7/+19
| | | | | | | | | | | | | | Otherwise we won't notice the peer GENID change. Reported-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/core: fix rollback handler in register_netdevice_notifierRongQing.Li2011-11-301-1/+2
| | | | | | | | | | | | | | | | | | Within nested statements, the break statement terminates only the do, for, switch, or while statement that immediately encloses it, So replace the break with goto. Signed-off-by: RongQing.Li <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * sch_red: fix red_calc_qavg_from_idle_timeEric Dumazet2011-11-301-9/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit a4a710c4a7490587 (pkt_sched: Change PSCHED_SHIFT from 10 to 6) it seems RED/GRED are broken. red_calc_qavg_from_idle_time() computes a delay in us units, but this delay is now 16 times bigger than real delay, so the final qavg result smaller than expected. Use standard kernel time services since there is no need to obfuscate them. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * bonding: only use primary address for ARPHenrik Saavedra Persson2011-11-301-27/+6
| | | | | | | | | | | | | | | | | | | | | | Only use the primary address of the bond device for master_ip. This will prevent changing the ARP source address in Active-Backup mode whenever a secondry address is added to the bond device. Signed-off-by: Henrik Saavedra Persson <henrik.e.persson@ericsson.com> Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@drr.davemloft.net>
| * ipv4: fix lockdep splat in rt_cache_seq_showEric Dumazet2011-11-301-2/+6
| | | | | | | | | | | | | | | | | | | | After commit f2c31e32b378 (fix NULL dereferences in check_peer_redir()), dst_get_neighbour() should be guarded by rcu_read_lock() / rcu_read_unlock() section. Reported-by: Miles Lane <miles.lane@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * sch_teql: fix lockdep splatEric Dumazet2011-11-301-11/+20
| | | | | | | | | | | | | | | | | | | | | | We need rcu_read_lock() protection before using dst_get_neighbour(), and we must cache its value (pass it to __teql_resolve()) teql_master_xmit() is called under rcu_read_lock_bh() protection, its not enough. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: fec: Select the FEC driver by default for i.MX SoCsFabio Estevam2011-11-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Since commit 230dec6 (net/fec: add imx6q enet support) the FEC driver is no longer built by default for i.MX SoCs. Let the FEC driver be built by default again. Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Suggested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de Acked-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge branch 'master' of ↵John W. Linville2011-11-304-14/+52
| |\ | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
| | * mac80211: fix race between the AGG SM and the Tx data pathEmmanuel Grumbach2011-11-281-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a packet is supposed to sent be as an a-MPDU, mac80211 sets IEEE80211_TX_CTL_AMPDU to let the driver know. On the other hand, mac80211 configures the driver for aggregration with the ampdu_action callback. There is race between these two mechanisms since the following scenario can occur when the BA agreement is torn down: Tx softIRQ drv configuration ========== ================= check OPERATIONAL bit Set the TX_CTL_AMPDU bit in the packet clear OPERATIONAL bit stop Tx AGG Pass Tx packet to the driver. In that case the driver would get a packet with TX_CTL_AMPDU set although it has already been notified that the BA session has been torn down. To fix this, we need to synchronize all the Qdisc activity after we cleared the OPERATIONAL bit. After that step, all the following packets will be buffered until the driver reports it is ready to get new packets for this RA / TID. This buffering allows not to run into another race that would send packets with TX_CTL_AMPDU unset while the driver hasn't been requested to tear down the BA session yet. This race occurs in practice and iwlwifi complains with a WARN_ON when it happens. Cc: stable@kernel.org Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| | * mac80211: fix race condition caused by late addBA responseNikolay Martynov2011-11-281-1/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If addBA responses comes in just after addba_resp_timer has expired mac80211 will still accept it and try to open the aggregation session. This causes drivers to be confused and in some cases even crash. This patch fixes the race condition and makes sure that if addba_resp_timer has expired addBA response is not longer accepted and we do not try to open half-closed session. Cc: stable@vger.kernel.org Signed-off-by: Nikolay Martynov <mar.kolya@gmail.com> [some adjustments] Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| | * ath9k: Revert change that broke AR928X on Acer Ferrari OneRafael J. Wysocki2011-11-281-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Revert a hunk in drivers/net/wireless/ath/ath9k/hw.c introduced by commit 2577c6e8f2320f1d2f09be122efef5b9118efee4 (ath9k_hw: Add support for AR946/8x chipsets) that caused a nasty regression to appear on my Acer Ferrari One (the box locks up entirely at random times after the wireless has been started without any way to get debug information out of it). Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| | * rtlwifi: fix lps_lock deadlockStanislaw Gruszka2011-11-281-8/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | rtl_lps_leave can be called from interrupt context, so we have to disable interrupts when taking lps_lock. Below is full lockdep info about deadlock: [ 93.815269] ================================= [ 93.815390] [ INFO: inconsistent lock state ] [ 93.815472] 2.6.41.1-3.offch.fc15.x86_64.debug #1 [ 93.815556] --------------------------------- [ 93.815635] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. [ 93.815743] swapper/0 [HC0[0]:SC1[1]:HE1:SE0] takes: [ 93.815832] (&(&rtlpriv->locks.lps_lock)->rlock){+.?...}, at: [<ffffffffa025dad6>] rtl_lps_leave+0x26/0x103 [rtlwifi] [ 93.815947] {SOFTIRQ-ON-W} state was registered at: [ 93.815947] [<ffffffff8108e10d>] __lock_acquire+0x369/0xd0c [ 93.815947] [<ffffffff8108efb3>] lock_acquire+0xf3/0x13e [ 93.815947] [<ffffffff814e981d>] _raw_spin_lock+0x45/0x79 [ 93.815947] [<ffffffffa025de34>] rtl_swlps_rf_awake+0x5a/0x76 [rtlwifi] [ 93.815947] [<ffffffffa025aec0>] rtl_op_config+0x12a/0x32a [rtlwifi] [ 93.815947] [<ffffffffa01d614b>] ieee80211_hw_config+0x124/0x129 [mac80211] [ 93.815947] [<ffffffffa01e0af3>] ieee80211_dynamic_ps_disable_work+0x32/0x47 [mac80211] [ 93.815947] [<ffffffff81075aa5>] process_one_work+0x205/0x3e7 [ 93.815947] [<ffffffff81076753>] worker_thread+0xda/0x15d [ 93.815947] [<ffffffff8107a119>] kthread+0xa8/0xb0 [ 93.815947] [<ffffffff814f3184>] kernel_thread_helper+0x4/0x10 [ 93.815947] irq event stamp: 547822 [ 93.815947] hardirqs last enabled at (547822): [<ffffffff814ea1a7>] _raw_spin_unlock_irqrestore+0x45/0x61 [ 93.815947] hardirqs last disabled at (547821): [<ffffffff814e9987>] _raw_spin_lock_irqsave+0x22/0x8e [ 93.815947] softirqs last enabled at (547790): [<ffffffff810623ed>] _local_bh_enable+0x13/0x15 [ 93.815947] softirqs last disabled at (547791): [<ffffffff814f327c>] call_softirq+0x1c/0x30 [ 93.815947] [ 93.815947] other info that might help us debug this: [ 93.815947] Possible unsafe locking scenario: [ 93.815947] [ 93.815947] CPU0 [ 93.815947] ---- [ 93.815947] lock(&(&rtlpriv->locks.lps_lock)->rlock); [ 93.815947] <Interrupt> [ 93.815947] lock(&(&rtlpriv->locks.lps_lock)->rlock); [ 93.815947] [ 93.815947] *** DEADLOCK *** [ 93.815947] [ 93.815947] no locks held by swapper/0. [ 93.815947] [ 93.815947] stack backtrace: [ 93.815947] Pid: 0, comm: swapper Not tainted 2.6.41.1-3.offch.fc15.x86_64.debug #1 [ 93.815947] Call Trace: [ 93.815947] <IRQ> [<ffffffff814dfd00>] print_usage_bug+0x1e7/0x1f8 [ 93.815947] [<ffffffff8101a849>] ? save_stack_trace+0x2c/0x49 [ 93.815947] [<ffffffff8108d55c>] ? print_irq_inversion_bug.part.18+0x1a0/0x1a0 [ 93.815947] [<ffffffff8108dc8a>] mark_lock+0x106/0x220 [ 93.815947] [<ffffffff8108e099>] __lock_acquire+0x2f5/0xd0c [ 93.815947] [<ffffffff810152af>] ? native_sched_clock+0x34/0x36 [ 93.830125] [<ffffffff810152ba>] ? sched_clock+0x9/0xd [ 93.830125] [<ffffffff81080181>] ? sched_clock_local+0x12/0x75 [ 93.830125] [<ffffffffa025dad6>] ? rtl_lps_leave+0x26/0x103 [rtlwifi] [ 93.830125] [<ffffffff8108efb3>] lock_acquire+0xf3/0x13e [ 93.830125] [<ffffffffa025dad6>] ? rtl_lps_leave+0x26/0x103 [rtlwifi] [ 93.830125] [<ffffffff814e981d>] _raw_spin_lock+0x45/0x79 [ 93.830125] [<ffffffffa025dad6>] ? rtl_lps_leave+0x26/0x103 [rtlwifi] [ 93.830125] [<ffffffff81422467>] ? skb_dequeue+0x62/0x6d [ 93.830125] [<ffffffffa025dad6>] rtl_lps_leave+0x26/0x103 [rtlwifi] [ 93.830125] [<ffffffffa025f677>] _rtl_pci_ips_leave_tasklet+0xe/0x10 [rtlwifi] [ 93.830125] [<ffffffff8106281f>] tasklet_action+0x8d/0xee [ 93.830125] [<ffffffff810629ce>] __do_softirq+0x112/0x25a [ 93.830125] [<ffffffff814f327c>] call_softirq+0x1c/0x30 [ 93.830125] [<ffffffff81010bf6>] do_softirq+0x4b/0xa1 [ 93.830125] [<ffffffff81062d7d>] irq_exit+0x5d/0xcf [ 93.830125] [<ffffffff814f3b7e>] do_IRQ+0x8e/0xa5 [ 93.830125] [<ffffffff814ea533>] common_interrupt+0x73/0x73 [ 93.830125] <EOI> [<ffffffff8108b825>] ? trace_hardirqs_off+0xd/0xf [ 93.830125] [<ffffffff812bb6d5>] ? intel_idle+0xe5/0x10c [ 93.830125] [<ffffffff812bb6d1>] ? intel_idle+0xe1/0x10c [ 93.830125] [<ffffffff813f8d5e>] cpuidle_idle_call+0x11c/0x1fe [ 93.830125] [<ffffffff8100e2ef>] cpu_idle+0xab/0x101 [ 93.830125] [<ffffffff814c6373>] rest_init+0xd7/0xde [ 93.830125] [<ffffffff814c629c>] ? csum_partial_copy_generic+0x16c/0x16c [ 93.830125] [<ffffffff81d4bbb0>] start_kernel+0x3dd/0x3ea [ 93.830125] [<ffffffff81d4b2c4>] x86_64_start_reservations+0xaf/0xb3 [ 93.830125] [<ffffffff81d4b140>] ? early_idt_handlers+0x140/0x140 [ 93.830125] [<ffffffff81d4b3ca>] x86_64_start_kernel+0x102/0x111 Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=755154 Reported-by: vjain02@students.poly.edu Reported-and-tested-by: Oliver Paukstadt <pstadt@sourcentral.org> Cc: stable@vger.kernel.org Acked-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| | * mac80211: don't stop a single aggregation session twiceJohannes Berg2011-11-281-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Nikolay noticed (by code review) that mac80211 can attempt to stop an aggregation session while it is already being stopped. So to fix it, check whether stop is already being done and bail out if so. Also move setting the STOPPING state into the lock so things are properly atomic. Cc: stable@vger.kernel.org Reported-by: Nikolay Martynov <mar.kolya@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| | * nl80211: fix MAC address validationEliad Peller2011-11-281-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | MAC addresses have a fixed length. The current policy allows passing < ETH_ALEN bytes, which might result in reading beyond the buffer. Cc: stable@vger.kernel.org Signed-off-by: Eliad Peller <eliad@wizery.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| * | isdn: avoid copying too long drvidDan Carpenter2011-11-291-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | "cfg->drvid" comes from the user so there is a possibility they didn't NUL terminate it properly. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | isdn: make sure strings are null terminatedDan Carpenter2011-11-291-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | These strings come from the user. We strcpy() them inside cf_command() so we should check that they are NULL terminated and return an error if not. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | netlabel: Fix build problems when IPv6 is not enabledPaul Moore2011-11-291-8/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A recent fix to the the NetLabel code caused build problem with configurations that did not have IPv6 enabled; see below: netlabel_kapi.c: In function 'netlbl_cfg_unlbl_map_add': netlabel_kapi.c:165:4: error: implicit declaration of function 'netlbl_af6list_add' This patch fixes this problem by making the IPv6 specific code conditional on the IPv6 configuration flags as we done in the rest of NetLabel and the network stack as a whole. We have to move some variable declarations around as a result so things may not be quite as pretty, but at least it builds cleanly now. Some additional IPv6 conditionals were added to the NetLabel code as well for the sake of consistency. Reported-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Paul Moore <pmoore@redhat.com> Acked-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | sctp: better integer overflow check in sctp_auth_create_key()Xi Wang2011-11-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The check from commit 30c2235c is incomplete and cannot prevent cases like key_len = 0x80000000 (INT_MAX + 1). In that case, the left-hand side of the check (INT_MAX - key_len), which is unsigned, becomes 0xffffffff (UINT_MAX) and bypasses the check. However this shouldn't be a security issue. The function is called from the following two code paths: 1) setsockopt() 2) sctp_auth_asoc_set_secret() In case (1), sca_keylength is never going to exceed 65535 since it's bounded by a u16 from the user API. As such, the key length will never overflow. In case (2), sca_keylength is computed based on the user key (1 short) and 2 * key_vector (3 shorts) for a total of 7 * USHRT_MAX, which still will not overflow. In other words, this overflow check is not really necessary. Just make it more correct. Signed-off-by: Xi Wang <xi.wang@gmail.com> Cc: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | Merge branch 'nf' of git://1984.lsi.us.es/netDavid S. Miller2011-11-298-53/+87
| |\ \
| | * | netfilter: nf_conntrack: make event callback registration per-netnsPablo Neira Ayuso2011-11-224-49/+82
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes an oops that can be triggered following this recipe: 0) make sure nf_conntrack_netlink and nf_conntrack_ipv4 are loaded. 1) container is started. 2) connect to it via lxc-console. 3) generate some traffic with the container to create some conntrack entries in its table. 4) stop the container: you hit one oops because the conntrack table cleanup tries to report the destroy event to user-space but the per-netns nfnetlink socket has already gone (as the nfnetlink socket is per-netns but event callback registration is global). To fix this situation, we make the ctnl_notifier per-netns so the callback is registered/unregistered if the container is created/destroyed. Alex Bligh and Alexey Dobriyan originally proposed one small patch to check if the nfnetlink socket is gone in nfnetlink_has_listeners, but this is a very visited path for events, thus, it may reduce performance and it looks a bit hackish to check for the nfnetlink socket only to workaround this situation. As a result, I decided to follow the bigger path choice, which seems to look nicer to me. Cc: Alexey Dobriyan <adobriyan@gmail.com> Reported-by: Alex Bligh <alex@alex.org.uk> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | netfilter: possible unaligned packet header in ip_route_me_harderPaul Guo2011-11-211-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch tries to fix the following issue in netfilter: In ip_route_me_harder(), we invoke pskb_expand_head() that rellocates new header with additional head room which can break the alignment of the original packet header. In one of my NAT test case, the NIC port for internal hosts is configured with vlan and the port for external hosts is with general configuration. If we ping an external "unknown" hosts from an internal host, an icmp packet will be sent. We find that in icmp_send()->...->ip_route_me_harder()->pskb_expand_head(), hh_len=18 and current headroom (skb_headroom(skb)) of the packet is 16. After calling pskb_expand_head() the packet header becomes to be unaligned and then our system (arch/tile) panics immediately. Signed-off-by: Paul Guo <ggang@tilera.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | netfilter: ipset: suppress compile-time warnings in ip_set_hash_ipport*.cJozsef Kadlecsik2011-11-213-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | warning: 'ip_to' may be used uninitialized in this function Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | | sctp: integer overflow in sctp_auth_create_key()Xi Wang2011-11-290-0/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The previous commit 30c2235c is incomplete and cannot prevent integer overflows. For example, when key_len is 0x80000000 (INT_MAX + 1), the left-hand side of the check, (INT_MAX - key_len), which is unsigned, becomes 0xffffffff (UINT_MAX) and bypasses the check. Signed-off-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | ipv6: Set mcast_hops to IPV6_DEFAULT_MCASTHOPS when -1 was given.Li Wei2011-11-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need to set np->mcast_hops to it's default value at this moment otherwise when we use it and found it's value is -1, the logic to get default hop limit doesn't take multicast into account and will return wrong hop limit(IPV6_DEFAULT_HOPLIMIT) which is for unicast. Signed-off-by: Li Wei <lw@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: Fix corruption in /proc/*/net/dev_mcastAnton Blanchard2011-11-283-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I just hit this during my testing. Isn't there another bug lurking? BUG kmalloc-8: Redzone overwritten INFO: 0xc0000000de9dec48-0xc0000000de9dec4b. First byte 0x0 instead of 0xcc INFO: Allocated in .__seq_open_private+0x30/0xa0 age=0 cpu=5 pid=3896 .__kmalloc+0x1e0/0x2d0 .__seq_open_private+0x30/0xa0 .seq_open_net+0x60/0xe0 .dev_mc_seq_open+0x4c/0x70 .proc_reg_open+0xd8/0x260 .__dentry_open.clone.11+0x2b8/0x400 .do_last+0xf4/0x950 .path_openat+0xf8/0x480 .do_filp_open+0x48/0xc0 .do_sys_open+0x140/0x250 syscall_exit+0x0/0x40 dev_mc_seq_ops uses dev_seq_start/next/stop but only allocates sizeof(struct seq_net_private) of private data, whereas it expects sizeof(struct dev_iter_state): struct dev_iter_state { struct seq_net_private p; unsigned int pos; /* bucket << BUCKET_SPACE + offset */ }; Create dev_seq_open_ops and use it so we don't have to expose struct dev_iter_state. [ Problem added by commit f04565ddf52e4 (dev: use name hash for dev_seq_ops) -Eric ] Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | jme: PHY configuration for compatible issueAries Lee2011-11-272-3/+129
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To perform PHY calibration and set a different EA value by chip ID, Whenever the NIC chip power on, ie booting or resuming, we need to force HW to calibrate PHY parameter again, and also set a proper EA value which gather from experiment. Those procedures help to reduce compatible issues(NIC is unable to link up in some special case) in giga speed. Signed-off-by: AriesLee <AriesLee@jmicron.com> Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | inet: add a redirect generation id in inetpeerEric Dumazet2011-11-262-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now inetpeer is the place where we cache redirect information for ipv4 destinations, we must be able to invalidate informations when a route is added/removed on host. As inetpeer is not yet namespace aware, this patch adds a shared redirect_genid, and a per inetpeer redirect_genid. This might be changed later if inetpeer becomes ns aware. Cache information for one inerpeer is valid as long as its redirect_genid has the same value than global redirect_genid. Reported-by: Arkadiusz Miśkiewicz <a.miskiewicz@gmail.com> Tested-by: Arkadiusz Miśkiewicz <a.miskiewicz@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | AF_UNIX: Fix poll blocking problem when reading from a stream socketAlexey Moiseytsev2011-11-261-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | poll() call may be blocked by concurrent reading from the same stream socket. Signed-off-by: Alexey Moiseytsev <himeraster@gmail.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | dm9000: Fix check for disabled wake on LANMark Brown2011-11-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We're trying to check if any options are defined which isn't wha the existing code does due to confusing & and &&. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | l2tp: ensure sk->dst is still validFlorian Westphal2011-11-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When using l2tp over ipsec, the tunnel will hang when rekeying occurs. Reason is that the transformer bundle attached to the dst entry is now in STATE_DEAD and thus xfrm_output_one() drops all packets (XfrmOutStateExpired increases). Fix this by calling __sk_dst_check (which drops the stale dst if xfrm dst->check callback finds that the bundle is no longer valid). Cc: James Chapman <jchapman@katalix.com> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | decnet: proper socket refcountingEric Dumazet2011-11-261-12/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Better use sk_reset_timer() / sk_stop_timer() helpers to make sure we dont access already freed/reused memory later. Reported-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: Revert ARCNET and PHYLIB to tristate optionsBen Hutchings2011-11-262-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 88491d8103498a6166f70d5999902fec70924314 ("drivers/net: Kconfig & Makefile cleanup") changed the type of these options to bool, but they select code that could (and still can) be built as modules. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | ipv4: Don't use the cached pmtu informations for input routesSteffen Klassert2011-11-261-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The pmtu informations on the inetpeer are visible for output and input routes. On packet forwarding, we might propagate a learned pmtu to the sender. As we update the pmtu informations of the inetpeer on demand, the original sender of the forwarded packets might never notice when the pmtu to that inetpeer increases. So use the mtu of the outgoing device on packet forwarding instead of the pmtu to the final destination. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | route: struct rtable can be const in rt_is_input_route and rt_is_output_routeSteffen Klassert2011-11-261-2/+2
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: Move mtu handling down to the protocol depended handlersSteffen Klassert2011-11-265-12/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We move all mtu handling from dst_mtu() down to the protocol layer. So each protocol can implement the mtu handling in a different manner. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: Rename the dst_opt default_mtu method to mtuSteffen Klassert2011-11-266-18/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We plan to invoke the dst_opt->default_mtu() method unconditioally from dst_mtu(). So rename the method to dst_opt->mtu() to match the name with the new meaning. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | route: Use the device mtu as the default for blackhole routesSteffen Klassert2011-11-262-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As it is, we return null as the default mtu of blackhole routes. This may lead to a propagation of a bogus pmtu if the default_mtu method of a blackhole route is invoked. So return dst->dev->mtu as the default mtu instead. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | netns: fix proxy ARP entries listing on a netnsJorge Boncompte [DTI2]2011-11-251-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Skip entries from foreign network namespaces. Signed-off-by: Jorge Boncompte [DTI2] <jorge@dti2.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net/netlabel: copy and paste bug in netlbl_cfg_unlbl_map_add()Dan Carpenter2011-11-251-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was copy and pasted from the IPv4 code. We're calling the ip4 version of that function and map4 is NULL. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | ipv4: Save nexthop address of LSRR/SSRR option to IPCB.Li Wei2011-11-233-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can not update iph->daddr in ip_options_rcv_srr(), It is too early. When some exception ocurred later (eg. in ip_forward() when goto sr_failed) we need the ip header be identical to the original one as ICMP need it. Add a field 'nexthop' in struct ip_options to save nexthop of LSRR or SSRR option. Signed-off-by: Li Wei <lw@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | ehea: Use round_jiffies_relative to align workqueueAnton Blanchard2011-11-231-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use round_jiffies_relative to align the ehea workqueue and avoid extra wakeups. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | ehea: Reduce memory usage in buffer poolsAnton Blanchard2011-11-231-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we enable multiqueue by default the ehea driver is using quite a lot of memory for its buffer pools. With 4 queues we consume 64MB in the jumbo packet ring, 16MB in the medium packet ring and 16MB in the tiny packet ring. We should only fill the jumbo ring once the MTU is increased but for now halve it's size so it consumes 32MB. Also reduce the tiny packet ring, with 4 queues we had 16k entries which is overkill. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | qlge: fix size of external list for TX address descriptorsThadeu Lima de Souza Cascardo2011-11-231-5/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When transmiting a fragmented skb, qlge fills a descriptor with the fragment addresses, after DMA-mapping them. If there are more than eight fragments, it will use the eighth descriptor as a pointer to an external list. After mapping this external list, called OAL to a structure containing more descriptors, it fills it with the extra fragments. However, considering that systems with pages larger than 8KiB would have less than 8 fragments, which was true before commit a715dea3c8e, it defined a macro for the OAL size as 0 in those cases. Now, if a skb with more than 8 fragments (counting skb->data as one fragment), this would start overwriting the list of addresses already mapped and would make the driver fail to properly unmap the right addresses on architectures with pages larger than 8KiB. Besides that, the list of mappings was one size too small, since it must have a mapping for the maxinum number of skb fragments plus one for skb->data and another for the OAL. So, even on architectures with page sizes 4KiB and 8KiB, a skb with the maximum number of fragments would make the driver overwrite its counter for the number of mappings, which, again, would make it fail to unmap the mapped DMA addresses. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
OpenPOWER on IntegriCloud