op-kernel-dev - Development kernel branch for OpenPOWER systems

	Commit message (Collapse)	Author	Age	Files	Lines
*	ipv6: missed namespace context in ipv6_rthdr_rcv	Denis V. Lunev	2008-07-10	1	-1/+1
\| \| \| \| \|	Signed-off-by: Denis V. Lunev <den@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	netlabel: netlink_unicast calls kfree_skb on error path by itself	Denis V. Lunev	2008-07-10	3	-21/+4
\| \| \| \| \| \| \| \| \|	So, no need to kfree_skb here on the error path. In this case we can simply return. Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Paul Moore <paul.moore@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv4: fib_trie: Fix lookup error return	Ben Hutchings	2008-07-10	1	-11/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In commit a07f5f508a4d9728c8e57d7f66294bf5b254ff7f "[IPV4] fib_trie: style cleanup", the changes to check_leaf() and fn_trie_lookup() were wrong - where fn_trie_lookup() would previously return a negative error value from check_leaf(), it now returns 0. Now fn_trie_lookup() doesn't appear to care about plen, so we can revert check_leaf() to returning the error value. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Tested-by: William Boughton <bill@boughton.de> Acked-by: Stephen Heminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tcp: correct kcalloc usage	Milton Miller	2008-07-10	1	-1/+1
\| \| \| \| \| \| \| \|	kcalloc is supposed to be called with the count as its first argument and the element size as the second. Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'master' of ↵	David S. Miller	2008-07-09	2	-23/+13
\|\ \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
\| *	rc80211_pid: Fix fast_start parameter handling	Mattias Nissler	2008-07-09	2	-23/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This removes the fast_start parameter from the rc_pid parameters information and instead uses the parameter macro when initializing the rc_pid state. Since the parameter is only used on initialization, there is no point of making exporting it via debugfs. This also fixes uninitialized memory references to the fast_start and norm_offset parameters detected by the kmemcheck utility. Thanks to Vegard Nossum for reporting the bug. Signed-off-by: Mattias Nissler <mattias.nissler@gmx.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
* \|	netfilter: nf_nat_snmp_basic: fix a range check in NAT for SNMP	David Howells	2008-07-09	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix a range check in netfilter IP NAT for SNMP to always use a big enough size variable that the compiler won't moan about comparing it to ULONG_MAX/8 on a 64-bit platform. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	netfilter: nf_conntrack_tcp: fix endless loop	Patrick McHardy	2008-07-09	1	-2/+8
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a conntrack entry is destroyed in process context and destruction is interrupted by packet processing and the packet is an attempt to reopen a closed connection, TCP conntrack tries to kill the old entry itself and returns NF_REPEAT to pass the packet through the hook again. This may lead to an endless loop: TCP conntrack repeatedly finds the old entry, but can not kill it itself since destruction is already in progress, but destruction in process context can not complete since TCP conntrack is keeping the CPU busy. Drop the packet in TCP conntrack if we can't kill the connection ourselves to avoid this. Reported by: hemao77@gmail.com [ Kernel bugzilla #11058 ] Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv6: fix race between ipv6_del_addr and DAD timer	Andrey Vagin	2008-07-08	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Consider the following scenario: ipv6_del_addr(ifp) ipv6_ifa_notify(RTM_DELADDR, ifp) ip6_del_rt(ifp->rt) after returning from the ipv6_ifa_notify and enabling BH-s back, but before calling the addrconf_del_timer the ifp->timer fires and: addrconf_dad_timer(ifp) addrconf_dad_completed(ifp) ipv6_ifa_notify(RTM_NEWADDR, ifp) ip6_ins_rt(ifp->rt) then return back to the ipv6_del_addr and: in6_ifa_put(ifp) inet6_ifa_finish_destroy(ifp) dst_release(&ifp->rt->u.dst) After this we have an ifp->rt inserted into fib6 lists, but queued for gc, which in turn can result in oopses in the fib6_run_gc. Maybe some other nasty things, but we caught only the oops in gc so far. The solution is to disarm the ifp->timer before flushing the rt from it. Signed-off-by: Andrey Vagin <avagin@parallels.com> Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	irda: Fix netlink error path return value	Julius Volz	2008-07-08	1	-2/+2
\| \| \| \| \| \| \| \| \| \|	Fix an incorrect return value check of genlmsg_put() in irda_nl_get_mode(). genlmsg_put() does not use ERR_PTR() to encode return values, it just returns NULL on error. Signed-off-by: Julius Volz <juliusv@google.com> Signed-off-by: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	sctp: Mark the tsn as received after all allocations finish	Vlad Yasevich	2008-07-08	2	-6/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we don't have the buffer space or memory allocations fail, the data chunk is dropped, but TSN is still reported as received. This introduced a data loss that can't be recovered. We should only mark TSNs are received after memory allocations finish. The one exception is the invalid stream identifier, but that's due to user error and is reported back to the user. This was noticed by Michael Tuexen. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	mac80211: don't report selected IBSS when not found	Vladimir Koutny	2008-07-07	1	-2/+4
\| \| \| \| \| \| \|	Don't report a 'selected' IBSS in sta_find_ibss when none was found. Signed-off-by: Vladimir Koutny <vlado@ksp.sk> Signed-off-by: John W. Linville <linville@tuxdriver.com>
*	mac80211: Only flush workqueue when last interface was removed	Ivo van Doorn	2008-07-07	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently the ieee80211_hw->workqueue is flushed each time an interface is being removed. However most scheduled work is not interface specific but device specific, for example things like periodic work for link tuners. This patch will move the flush_workqueue() call to directly behind the call to ops->stop() to make sure the workqueue is only flushed when all interfaces are gone and there really shouldn't be any scheduled work in the drivers left. Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
*	mac80211: move netif_carrier_on to after ieee80211_bss_info_change_notify	Guy Cohen	2008-07-07	1	-2/+5
\| \| \| \| \| \| \| \| \| \|	Putting netif_carrier_on before configuring the driver/device with the new association state may cause a race (tx frames may be sent before configuration is done) Signed-off-by: Guy Cohen <guy.cohen@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
*	can: add sanity checks	Oliver Hartkopp	2008-07-05	3	-4/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Even though the CAN netlayer only deals with CAN netdevices, the netlayer interface to the userspace and to the device layer should perform some sanity checks. This patch adds several sanity checks that mainly prevent userspace apps to send broken content into the system that may be misinterpreted by some other userspace application. Signed-off-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de> Signed-off-by: Urs Thuermann <urs.thuermann@volkswagen.de> Acked-by: Andre Naujoks <nautsch@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	bridge: fix use-after-free in br_cleanup_bridges()	Patrick McHardy	2008-07-03	1	-3/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Unregistering a bridge device may cause virtual devices stacked on the bridge, like vlan or macvlan devices, to be unregistered as well. br_cleanup_bridges() uses for_each_netdev_safe() to iterate over all devices during cleanup. This is not enough however, if one of the additionally unregistered devices is next in the list to the bridge device, it will get freed as well and the iteration continues on the freed element. Restart iteration after each bridge device removal from the beginning to fix this, similar to what rtnl_link_unregister() does. Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tcp: fix a size_t < 0 comparison in tcp_read_sock	Octavian Purdila	2008-07-03	1	-1/+2
\| \| \| \| \| \| \| \|	<used> should be of type int (not size_t) since recv_actor can return negative values and it is also used in a < 0 comparison. Signed-off-by: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tcp: net/ipv4/tcp.c needs linux/scatterlist.h	Andrew Morton	2008-07-03	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \|	alpha: net/ipv4/tcp.c: In function 'tcp_calc_md5_hash': net/ipv4/tcp.c:2479: error: implicit declaration of function 'sg_init_table' net/ipv4/tcp.c:2482: error: implicit declaration of function 'sg_set_buf' net/ipv4/tcp.c:2507: error: implicit declaration of function 'sg_mark_end' Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: fib_rules: fix error code for unsupported families	Patrick McHardy	2008-07-01	1	-2/+2
\| \| \| \| \| \| \| \| \|	The errno code returned must be negative. Fixes "RTNETLINK answers: Unknown error 18446744073709551519". Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	netdevice: Fix wrong string handle in kernel command line parsing	Wang Chen	2008-07-01	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	v1->v2: Use strlcpy() to ensure s[i].name be null-termination. 1. In netdev_boot_setup_add(), a long name will leak. ex. : dev=21,0x1234,0x1234,0x2345,eth123456789verylongname......... 2. In netdev_boot_setup_check(), mismatch will happen if s[i].name is a substring of dev->name. ex. : dev=...eth1 dev=...eth11 [ With feedback from Ben Hutchings. ] Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: Tyop of sk_filter() comment	Wang Chen	2008-07-01	1	-1/+0
\| \| \| \| \| \| \|	Parameter "needlock" no long exists. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	netlink: Unneeded local variable	Wang Chen	2008-07-01	1	-1/+1
\| \| \| \| \| \| \|	We already have a variable, which has the same capability. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net-sched: fix filter destruction in atm/hfsc qdisc destruction	Patrick McHardy	2008-07-01	2	-1/+7
\| \| \| \| \| \| \| \| \|	Filters need to be destroyed before beginning to destroy classes since the destination class needs to still be alive to unbind the filter. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net-sched: change tcf_destroy_chain() to clear start of filter list	Patrick McHardy	2008-07-01	10	-20/+16
\| \| \| \| \| \| \| \|	Pass double tcf_proto pointers to tcf_destroy_chain() to make it clear the start of the filter list for more consistency. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'master' of ↵	David S. Miller	2008-06-30	1	-0/+7
\|\ \| \| \| \| \| \|	master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6
\| *	mac80211: don't accept WEP keys other than WEP40 and WEP104	Emmanuel Grumbach	2008-06-30	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch makes mac80211 refuse a WEP key whose length is not WEP40 nor WEP104. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
* \|	netfilter: nf_conntrack_tcp: fixing to check the lower bound of valid ACK	Jozsef Kadlecsik	2008-06-30	1	-6/+7
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Lost connections was reported by Thomas Bätzler (running 2.6.25 kernel) on the netfilter mailing list (see the thread "Weird nat/conntrack Problem with PASV FTP upload"). He provided tcpdump recordings which helped to find a long lingering bug in conntrack. In TCP connection tracking, checking the lower bound of valid ACK could lead to mark valid packets as INVALID because: - We have got a "higher or equal" inequality, but the test checked the "higher" condition only; fixed. - If the packet contains a SACK option, it could occur that the ACK value was before the left edge of our (S)ACK "window": if a previous packet from the other party intersected the right edge of the window of the receiver, we could move forward the window parameters beyond accepting a valid ack. Therefore in this patch we check the rightmost SACK edge instead of the ACK value in the lower bound of valid (S)ACK test. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv6 route: Convert rt6_device_match() to use RT6_LOOKUP_F_xxx flags.	YOSHIFUJI Hideaki	2008-06-27	1	-3/+3
\| \| \| \| \| \| \| \| \| \|	The commit 77d16f450ae0452d7d4b009f78debb1294fb435c ("[IPV6] ROUTE: Unify RT6_F_xxx and RT6_SELECT_F_xxx flags") intended to pass various routing lookup hints around RT6_LOOKUP_F_xxx flags, but conversion was missing for rt6_device_match(). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	netlabel: Fix a problem when dumping the default IPv6 static labels	Paul Moore	2008-06-27	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is a missing "!" in a conditional statement which is causing entries to be skipped when dumping the default IPv6 static label entries. This can be demonstrated by running the following: # netlabelctl unlbl add default address:::1 \ label:system_u:object_r:unlabeled_t:s0 # netlabelctl -p unlbl list ... you will notice that the entry for the IPv6 localhost address is not displayed but does exist (works correctly, causes collisions when attempting to add duplicate entries, etc.). Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net/inet_lro: remove setting skb->ip_summed when not LRO-able	Eli Cohen	2008-06-27	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	When an SKB cannot be chained to a session, the current code attempts to "restore" its ip_summed field from lro_mgr->ip_summed. However, lro_mgr->ip_summed does not hold the original value; in fact, we'd better not touch skb->ip_summed since it is not modified by the code in the path leading to a failure to chain it. Also use a cleaer comment to the describe the ip_summed field of struct net_lro_mgr. Issue raised by Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
*	inet fragments: fix race between inet_frag_find and inet_frag_secret_rebuild	Pavel Emelyanov	2008-06-27	4	-6/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The problem is that while we work w/o the inet_frags.lock even read-locked the secret rebuild timer may occur (on another CPU, since BHs are still disabled in the inet_frag_find) and change the rnd seed for ipv4/6 fragments. It was caused by my patch fd9e63544cac30a34c951f0ec958038f0529e244 ([INET]: Omit double hash calculations in xxx_frag_intern) late in the 2.6.24 kernel, so this should probably be queued to -stable. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	netlink: Fix some doc comments in net/netlink/attr.c	Julius Volz	2008-06-27	1	-3/+4
\| \| \| \| \| \| \| \|	Fix some doc comments to match function and attribute names in net/netlink/attr.c. Signed-off-by: Julius Volz <juliusv@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tcp: /proc/net/tcp rto,ato values not scaled properly (v2)	Stephen Hemminger	2008-06-27	2	-6/+6
\| \| \| \| \| \| \| \| \| \| \|	I found another case where we are sending information to userspace in the wrong HZ scale. This should have been fixed back in 2.5 :-( This means an ABI change but as it stands there is no way for an application like ss to get the right value. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	pkt_sched: Remove CONFIG_NET_SCH_RR	Adrian Bunk	2008-06-27	1	-11/+0
\| \| \| \| \| \| \| \| \| \| \|	Commit d62733c8e437fdb58325617c4b3331769ba82d70 ([SCHED]: Qdisc changes and sch_rr added for multiqueue) added a NET_SCH_RR option that was unused since the code went unconditionally into sch_prio. Reported-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	pkt_sched: ERR_PTR() ususally encodes an negative errno, not positive.	WANG Cong	2008-06-27	1	-1/+1
\| \| \| \| \| \| \| \| \|	Note, in the following patch, 'err' is initialized as: int err = -ENOBUFS; Signed-off-by: WANG Cong <wcong@critical-links.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	netdevice: Fix typo of dev_unicast_add() comment	Wang Chen	2008-06-27	1	-1/+1
\| \| \| \| \|	Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	af_unix: fix 'poll for write'/connected DGRAM sockets	Rainer Weikusat	2008-06-27	1	-28/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For n:1 'datagram connections' (eg /dev/log), the unix_dgram_sendmsg routine implements a form of receiver-imposed flow control by comparing the length of the receive queue of the 'peer socket' with the max_ack_backlog value stored in the corresponding sock structure, either blocking the thread which caused the send-routine to be called or returning EAGAIN. This routine is used by both SOCK_DGRAM and SOCK_SEQPACKET sockets. The poll-implementation for these socket types is datagram_poll from core/datagram.c. A socket is deemed to be writeable by this routine when the memory presently consumed by datagrams owned by it is less than the configured socket send buffer size. This is always wrong for PF_UNIX non-stream sockets connected to server sockets dealing with (potentially) multiple clients if the abovementioned receive queue is currently considered to be full. 'poll' will then return, indicating that the socket is writeable, but a subsequent write result in EAGAIN, effectively causing an (usual) application to 'poll for writeability by repeated send request with O_NONBLOCK set' until it has consumed its time quantum. The change below uses a suitably modified variant of the datagram_poll routines for both type of PF_UNIX sockets, which tests if the recv-queue of the peer a socket is connected to is presently considered to be 'full' as part of the 'is this socket writeable'-checking code. The socket being polled is additionally put onto the peer_wait wait queue associated with its peer, because the unix_dgram_recvmsg routine does a wake up on this queue after a datagram was received and the 'other wakeup call' is done implicitly as part of skb destruction, meaning, a process blocked in poll because of a full peer receive queue could otherwise sleep forever if no datagram owned by its socket was already sitting on this queue. Among this change is a small (inline) helper routine named 'unix_recvq_full', which consolidates the actual testing code (in three different places) into a single location. Signed-off-by: Rainer Weikusat <rweikusat@mssgmbh.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tcp: fix for splice receive when used with software LRO	Octavian Purdila	2008-06-27	1	-5/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If an skb has nr_frags set to zero but its frag_list is not empty (as it can happen if software LRO is enabled), and a previous tcp_read_sock has consumed the linear part of the skb, then __skb_splice_bits: (a) incorrectly reports an error and (b) forgets to update the offset to account for the linear part Any of the two problems will cause the subsequent __skb_splice_bits call (the one that handles the frag_list skbs) to either skip data, or, if the unadjusted offset is greater then the size of the next skb in the frag_list, make tcp_splice_read loop forever. Signed-off-by: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tcp: calculate tcp_mem based on low memory instead of all memory	Miquel van Smoorenburg	2008-06-27	1	-3/+6
\| \| \| \| \| \| \| \| \|	The tcp_mem array which contains limits on the total amount of memory used by TCP sockets is calculated based on nr_all_pages. On a 32 bits x86 system, we should base this on the number of lowmem pages. Signed-off-by: Miquel van Smoorenburg <miquels@cistron.nl> Signed-off-by: David S. Miller <davem@davemloft.net>
*	mac80211: fix an oops in several failure paths in key allocation	Emmanuel Grumbach	2008-06-27	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \|	This patch fixes an oops in several failure paths in key allocation. This Oops occurs when freeing a key that has not been linked yet, so the key->sdata is not set. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
*	mac80211: implement EU regulatory domain	Tony Vroon	2008-06-25	1	-0/+18
\| \| \| \| \| \| \| \| \| \| \|	Implement missing EU regulatory domain for mac80211. Based on the information in IEEE 802.11-2007 (specifically pages 1142, 1143 & 1148) and ETSI 301 893 (V1.4.1). With thanks to Johannes Berg. Signed-off-by: Tony Vroon <tony@linx.net> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
*	netfilter: ip6table_mangle: don't reroute in LOCAL_IN	Patrick McHardy	2008-06-24	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	Rerouting should only happen in LOCAL_OUT, in INPUT its useless since the packet has already chosen its final destination. Noticed by Alexey Dobriyan <adobriyan@gmail.com>. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	netns: Don't receive new packets in a dead network namespace.	Eric W. Biederman	2008-06-20	2	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Alexey Dobriyan <adobriyan@gmail.com> writes: > Subject: ICMP sockets destruction vs ICMP packets oops > After icmp_sk_exit() nuked ICMP sockets, we get an interrupt. > icmp_reply() wants ICMP socket. > > Steps to reproduce: > > launch shell in new netns > move real NIC to netns > setup routing > ping -i 0 > exit from shell > > BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 > IP: [<ffffffff803fce17>] icmp_sk+0x17/0x30 > PGD 17f3cd067 PUD 17f3ce067 PMD 0 > Oops: 0000 [1] PREEMPT SMP DEBUG_PAGEALLOC > CPU 0 > Modules linked in: usblp usbcore > Pid: 0, comm: swapper Not tainted 2.6.26-rc6-netns-ct #4 > RIP: 0010:[<ffffffff803fce17>] [<ffffffff803fce17>] icmp_sk+0x17/0x30 > RSP: 0018:ffffffff8057fc30 EFLAGS: 00010286 > RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff81017c7db900 > RDX: 0000000000000034 RSI: ffff81017c7db900 RDI: ffff81017dc41800 > RBP: ffffffff8057fc40 R08: 0000000000000001 R09: 000000000000a815 > R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff8057fd28 > R13: ffffffff8057fd00 R14: ffff81017c7db938 R15: ffff81017dc41800 > FS: 0000000000000000(0000) GS:ffffffff80525000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b > CR2: 0000000000000000 CR3: 000000017fcda000 CR4: 00000000000006e0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > Process swapper (pid: 0, threadinfo ffffffff8053a000, task ffffffff804fa4a0) > Stack: 0000000000000000 ffff81017c7db900 ffffffff8057fcf0 ffffffff803fcfe4 > ffffffff804faa38 0000000000000246 0000000000005a40 0000000000000246 > 000000000001ffff ffff81017dd68dc0 0000000000005a40 0000000055342436 > Call Trace: > <IRQ> [<ffffffff803fcfe4>] icmp_reply+0x44/0x1e0 > [<ffffffff803d3a0a>] ? ip_route_input+0x23a/0x1360 > [<ffffffff803fd645>] icmp_echo+0x65/0x70 > [<ffffffff803fd300>] icmp_rcv+0x180/0x1b0 > [<ffffffff803d6d84>] ip_local_deliver+0xf4/0x1f0 > [<ffffffff803d71bb>] ip_rcv+0x33b/0x650 > [<ffffffff803bb16a>] netif_receive_skb+0x27a/0x340 > [<ffffffff803be57d>] process_backlog+0x9d/0x100 > [<ffffffff803bdd4d>] net_rx_action+0x18d/0x250 > [<ffffffff80237be5>] __do_softirq+0x75/0x100 > [<ffffffff8020c97c>] call_softirq+0x1c/0x30 > [<ffffffff8020f085>] do_softirq+0x65/0xa0 > [<ffffffff80237af7>] irq_exit+0x97/0xa0 > [<ffffffff8020f198>] do_IRQ+0xa8/0x130 > [<ffffffff80212ee0>] ? mwait_idle+0x0/0x60 > [<ffffffff8020bc46>] ret_from_intr+0x0/0xf > <EOI> [<ffffffff80212f2c>] ? mwait_idle+0x4c/0x60 > [<ffffffff80212f23>] ? mwait_idle+0x43/0x60 > [<ffffffff8020a217>] ? cpu_idle+0x57/0xa0 > [<ffffffff8040f380>] ? rest_init+0x70/0x80 > Code: 10 5b 41 5c 41 5d 41 5e c9 c3 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 53 > 48 83 ec 08 48 8b 9f 78 01 00 00 e8 2b c7 f1 ff 89 c0 <48> 8b 04 c3 48 83 c4 08 > 5b c9 c3 66 66 66 66 66 2e 0f 1f 84 00 > RIP [<ffffffff803fce17>] icmp_sk+0x17/0x30 > RSP <ffffffff8057fc30> > CR2: 0000000000000000 > ---[ end trace ea161157b76b33e8 ]--- > Kernel panic - not syncing: Aiee, killing interrupt handler! Receiving packets while we are cleaning up a network namespace is a racy proposition. It is possible when the packet arrives that we have removed some but not all of the state we need to fully process it. We have the choice of either playing wack-a-mole with the cleanup routines or simply dropping packets when we don't have a network namespace to handle them. Since the check looks inexpensive in netif_receive_skb let's just drop the incoming packets. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	sctp: Make sure N * sizeof(union sctp_addr) does not overflow.	David S. Miller	2008-06-20	1	-1/+3
\| \| \| \| \| \| \| \| \| \|	As noticed by Gabriel Campana, the kmalloc() length arg passed in by sctp_getsockopt_local_addrs_old() can overflow if ->addr_num is large enough. Therefore, enforce an appropriate limit. Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv6: Drop packets for loopback address from outside of the box.	YOSHIFUJI Hideaki	2008-06-19	1	-0/+9
\| \| \| \| \| \| \| \| \|	[ Based upon original report and patch by Karsten Keil. Karsten has verified that this fixes the TAHI test case "ICMPv6 test v6LC.5.1.2 Part F". -DaveM ] Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv6: Remove options header when setsockopt's optlen is 0	Shan Wei	2008-06-19	1	-4/+7
\| \| \| \| \| \| \| \| \| \| \| \|	Remove the sticky Hop-by-Hop options header by calling setsockopt() for IPV6_HOPOPTS with a zero option length, per RFC3542. Routing header and Destination options header does the same as Hop-by-Hop options header. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	mac80211: detect driver tx bugs	Johannes Berg	2008-06-18	1	-1/+8
\| \| \| \| \| \| \| \| \| \|	When a driver rejects a frame in it's ->tx() callback, it must also stop queues, otherwise mac80211 can go into a loop here. Detect this situation and abort the loop after five retries, warning about the driver bug. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	netlink: genl: fix circular locking	Patrick McHardy	2008-06-18	1	-9/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	genetlink has a circular locking dependency when dumping the registered families: - dump start: genl_rcv() : take genl_mutex genl_rcv_msg() : call netlink_dump_start() while holding genl_mutex netlink_dump_start(), netlink_dump() : take nlk->cb_mutex ctrl_dumpfamily() : try to detect this case and not take genl_mutex a second time - dump continuance: netlink_rcv() : call netlink_dump netlink_dump : take nlk->cb_mutex ctrl_dumpfamily() : take genl_mutex Register genl_lock as callback mutex with netlink to fix this. This slightly widens an already existing module unload race, the genl ops used during the dump might go away when the module is unloaded. Thomas Graf is working on a seperate fix for this. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Revert "mac80211: Use skb_header_cloned() on TX path."	David S. Miller	2008-06-18	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 608961a5eca8d3c6bd07172febc27b5559408c5d. The problem is that the mac80211 stack not only needs to be able to muck with the link-level headers, it also might need to mangle all of the packet data if doing sw wireless encryption. This fixes kernel bugzilla #10903. Thanks to Didier Raboud (for the bugzilla report), Andrew Prince (for bisecting), Johannes Berg (for bringing this bisection analysis to my attention), and Ilpo (for trying to analyze this purely from the TCP side). In 2.6.27 we can take another stab at this, by using something like skb_cow_data() when the TX path of mac80211 ends up with a non-NULL tx->key. The ESP protocol code in the IPSEC stack can be used as a model for implementation. Signed-off-by: David S. Miller <davem@davemloft.net>
*	af_unix: fix 'poll for write'/ connected DGRAM sockets	Rainer Weikusat	2008-06-17	1	-9/+70
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The unix_dgram_sendmsg routine implements a (somewhat crude) form of receiver-imposed flow control by comparing the length of the receive queue of the 'peer socket' with the max_ack_backlog value stored in the corresponding sock structure, either blocking the thread which caused the send-routine to be called or returning EAGAIN. This routine is used by both SOCK_DGRAM and SOCK_SEQPACKET sockets. The poll-implementation for these socket types is datagram_poll from core/datagram.c. A socket is deemed to be writeable by this routine when the memory presently consumed by datagrams owned by it is less than the configured socket send buffer size. This is always wrong for connected PF_UNIX non-stream sockets when the abovementioned receive queue is currently considered to be full. 'poll' will then return, indicating that the socket is writeable, but a subsequent write result in EAGAIN, effectively causing an (usual) application to 'poll for writeability by repeated send request with O_NONBLOCK set' until it has consumed its time quantum. The change below uses a suitably modified variant of the datagram_poll routines for both type of PF_UNIX sockets, which tests if the recv-queue of the peer a socket is connected to is presently considered to be 'full' as part of the 'is this socket writeable'-checking code. The socket being polled is additionally put onto the peer_wait wait queue associated with its peer, because the unix_dgram_sendmsg routine does a wake up on this queue after a datagram was received and the 'other wakeup call' is done implicitly as part of skb destruction, meaning, a process blocked in poll because of a full peer receive queue could otherwise sleep forever if no datagram owned by its socket was already sitting on this queue. Among this change is a small (inline) helper routine named 'unix_recvq_full', which consolidates the actual testing code (in three different places) into a single location. Signed-off-by: Rainer Weikusat <rweikusat@mssgmbh.com> Signed-off-by: David S. Miller <davem@davemloft.net>