op-kernel-dev - Development kernel branch for OpenPOWER systems

	Commit message (Collapse)	Author	Age	Files	Lines
*	Merge branch 'master' of ↵	Linus Torvalds	2007-09-16	22	-313/+307
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [VLAN]: Fix net_device leak. [PPP] generic: Fix receive path data clobbering & non-linear handling [PPP] generic: Call skb_cow_head before scribbling over skb [NET] skbuff: Add skb_cow_head [BRIDGE]: Kill clone argument to br_flood_* [PPP] pppoe: Fill in header directly in __pppoe_xmit [PPP] pppoe: Fix data clobbering in __pppoe_xmit and return value [PPP] pppoe: Fix skb_unshare_check call position [SCTP]: Convert bind_addr_list locking to RCU [SCTP]: Add RCU synchronization around sctp_localaddr_list [PKT_SCHED]: sch_cbq.c: Shut up uninitialized variable warning [PKTGEN]: srcmac fix [IPV6]: Fix source address selection. [IPV4]: Just increment OutDatagrams once per a datagram. [IPV6]: Just increment OutDatagrams once per a datagram. [IPV6]: Fix unbalanced socket reference with MSG_CONFIRM. [NET_SCHED] protect action config/dump from irqs [NET]: Fix two issues wrt. SO_BINDTODEVICE.
\| *	[VLAN]: Fix net_device leak.	Al Viro	2007-09-16	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In "[VLAN]: Move device registation to seperate function" (commit e89fe42cd03c8fd3686df82d8390a235717a66de), a pile of code got moved to register_vlan_dev(), including grabbing a reference to underlying device. However, original dev_hold() had been left behind, so we leak a reference to net_device now... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	[NET] skbuff: Add skb_cow_head	Herbert Xu	2007-09-16	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds an optimised version of skb_cow that avoids the copy if the header can be modified even if the rest of the payload is cloned. This can be used in encapsulating paths where we only need to modify the header. As it is, this can be used in PPPOE and bridging. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	[BRIDGE]: Kill clone argument to br_flood_*	Herbert Xu	2007-09-16	4	-50/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The clone argument is only used by one caller and that caller can clone the packet itself. This patch moves the clone call into the caller and kills the clone argument. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	[SCTP]: Convert bind_addr_list locking to RCU	Vlad Yasevich	2007-09-16	7	-159/+103
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since the sctp_sockaddr_entry is now RCU enabled as part of the patch to synchronize sctp_localaddr_list, it makes sense to change all handling of these entries to RCU. This includes the sctp_bind_addrs structure and it's list of bound addresses. This list is currently protected by an external rw_lock and that looks like an overkill. There are only 2 writers to the list: bind()/bindx() calls, and BH processing of ASCONF-ACK chunks. These are already seriealized via the socket lock, so they will not step on each other. These are also relatively rare, so we should be good with RCU. The readers are varied and they are easily converted to RCU. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Sridhar Samdurala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	[SCTP]: Add RCU synchronization around sctp_localaddr_list	Vlad Yasevich	2007-09-16	4	-38/+90
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	sctp_localaddr_list is modified dynamically via NETDEV_UP and NETDEV_DOWN events, but there is not synchronization between writer (even handler) and readers. As a result, the readers can access an entry that has been freed and crash the sytem. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Sridhar Samdurala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	[PKT_SCHED]: sch_cbq.c: Shut up uninitialized variable warning	Satyam Sharma	2007-09-16	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	net/sched/sch_cbq.c: In function 'cbq_enqueue': net/sched/sch_cbq.c:383: warning: 'ret' may be used uninitialized in this function has been verified to be a bogus case. So let's shut it up. Signed-off-by: Satyam Sharma <satyam@infradead.org> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	[PKTGEN]: srcmac fix	Adit Ranadive	2007-09-16	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	From: Adit Ranadive <adit.262@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	[IPV6]: Fix source address selection.	Jiri Kosina	2007-09-16	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The commit 95c385 broke proper source address selection for cases in which there is a address which is makred 'deprecated'. The commit mistakenly changed ifa->flags to ifa_result->flags (probably copy/paste error from a few lines above) in the 'Rule 3' address selection code. The patch restores the previous RFC-compliant behavior. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	[IPV4]: Just increment OutDatagrams once per a datagram.	YOSHIFUJI Hideaki	2007-09-14	1	-3/+3
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	[IPV6]: Just increment OutDatagrams once per a datagram.	YOSHIFUJI Hideaki	2007-09-14	1	-3/+3
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	[IPV6]: Fix unbalanced socket reference with MSG_CONFIRM.	YOSHIFUJI Hideaki	2007-09-14	1	-2/+1
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	[NET_SCHED] protect action config/dump from irqs	Jamal Hadi Salim	2007-09-14	2	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(with no apologies to C Heston) On Mon, 2007-10-09 at 21:00 +0800, Herbert Xu wrote: On Sun, Sep 02, 2007 at 01:11:29PM +0000, Christian Kujau wrote: > > > > after upgrading to 2.6.23-rc5 (and applying davem's fix [0]), lockdep > > was quite noisy when I tried to shape my external (wireless) interface: > > > > [ 6400.534545] FahCore_78.exe/3552 just changed the state of lock: > > [ 6400.534713] (&dev->ingress_lock){-+..}, at: [<c038d595>] > > netif_receive_skb+0x2d5/0x3c0 > > [ 6400.534941] but this lock took another, soft-read-irq-unsafe lock in the > > past: > > [ 6400.535145] (police_lock){-.--} > > This is a genuine dead-lock. The police lock can be taken > for reading with softirqs on. If a second CPU tries to take > the police lock for writing, while holding the ingress lock, > then a softirq on the first CPU can dead-lock when it tries > to get the ingress lock. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	[NET]: Fix two issues wrt. SO_BINDTODEVICE.	David S. Miller	2007-09-14	1	-48/+58
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	1) Comments suggest that setting optlen to zero will unbind the socket from whatever device it might be attached to. This hasn't been the case since at least 2.2.x because the first thing this function does is return -EINVAL if 'optlen' is less than sizeof(int). This check also means that passing in a two byte string doesn't work so well. It's almost as if this code was testing with "eth?" patterned strings and nothing else :-) Fix this by breaking the logic of this facility out into a seperate function which validates optlen more appropriately. The optlen==0 and small string cases now work properly. 2) We should reset the cached route of the socket after we have made the device binding changes, not before. Reported by Ben Greear. Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	Correctly close old nfsd/lockd sockets.	Neil Brown	2007-09-14	1	-1/+1
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit aaf68cfbf2241d24d46583423f6bff5c47e088b3 added a bias to sk_inuse, so this test for an unused socket now fails. So no sockets get closed because they are old (they might get closed if the client closed them). This bug has existed since 2.6.21-rc1. Thanks to Wolfgang Walter for finding and reporting the bug. Cc: Wolfgang Walter <wolfgang.walter@studentenwerk.mhn.de> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[BLUETOOTH]: Fix non-COMPAT build of hci_sock.c	David S. Miller	2007-09-12	1	-3/+4
\| \| \| \|	Signed-off-by: David S. Miller <davem@davemloft.net>
*	[INET_DIAG]: Fix oops in netlink_rcv_skb	Patrick McHardy	2007-09-11	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	netlink_run_queue() doesn't handle multiple processes processing the queue concurrently. Serialize queue processing in inet_diag to fix a oops in netlink_rcv_skb caused by netlink_run_queue passing a NULL for the skb. BUG: unable to handle kernel NULL pointer dereference at virtual address 00000054 [349587.500454] printing eip: [349587.500457] c03318ae [349587.500459] *pde = 00000000 [349587.500464] Oops: 0000 [#1] [349587.500466] PREEMPT SMP [349587.500474] Modules linked in: w83627hf hwmon_vid i2c_isa [349587.500483] CPU: 0 [349587.500485] EIP: 0060:[<c03318ae>] Not tainted VLI [349587.500487] EFLAGS: 00010246 (2.6.22.3 #1) [349587.500499] EIP is at netlink_rcv_skb+0xa/0x7e [349587.500506] eax: 00000000 ebx: 00000000 ecx: c148d2a0 edx: c0398819 [349587.500510] esi: 00000000 edi: c0398819 ebp: c7a21c8c esp: c7a21c80 [349587.500517] ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068 [349587.500521] Process oidentd (pid: 17943, ti=c7a20000 task=cee231c0 task.ti=c7a20000) [349587.500527] Stack: 00000000 c7a21cac f7c8ba78 c7a21ca4 c0331962 c0398819 f7c8ba00 0000004c [349587.500542] f736f000 c7a21cb4 c03988e3 00000001 f7c8ba00 c7a21cc4 c03312a5 0000004c [349587.500558] f7c8ba00 c7a21cd4 c0330681 f7c8ba00 e4695280 c7a21d00 c03307c6 7fffffff [349587.500578] Call Trace: [349587.500581] [<c010361a>] show_trace_log_lvl+0x1c/0x33 [349587.500591] [<c01036d4>] show_stack_log_lvl+0x8d/0xaa [349587.500595] [<c010390e>] show_registers+0x1cb/0x321 [349587.500604] [<c0103bff>] die+0x112/0x1e1 [349587.500607] [<c01132d2>] do_page_fault+0x229/0x565 [349587.500618] [<c03c8d3a>] error_code+0x72/0x78 [349587.500625] [<c0331962>] netlink_run_queue+0x40/0x76 [349587.500632] [<c03988e3>] inet_diag_rcv+0x1f/0x2c [349587.500639] [<c03312a5>] netlink_data_ready+0x57/0x59 [349587.500643] [<c0330681>] netlink_sendskb+0x24/0x45 [349587.500651] [<c03307c6>] netlink_unicast+0x100/0x116 [349587.500656] [<c0330f83>] netlink_sendmsg+0x1c2/0x280 [349587.500664] [<c02fcce9>] sock_sendmsg+0xba/0xd5 [349587.500671] [<c02fe4d1>] sys_sendmsg+0x17b/0x1e8 [349587.500676] [<c02fe92d>] sys_socketcall+0x230/0x24d [349587.500684] [<c01028d2>] syscall_call+0x7/0xb [349587.500691] ======================= [349587.500693] Code: f0 ff 4e 18 0f 94 c0 84 c0 0f 84 66 ff ff ff 89 f0 e8 86 e2 fc ff e9 5a ff ff ff f0 ff 40 10 eb be 55 89 e5 57 89 d7 56 89 c6 53 <8b> 50 54 83 fa 10 72 55 8b 9e 9c 00 00 00 31 c9 8b 03 83 f8 0f Reported by Athanasius <link@miggy.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[IPv6]: Fix NULL pointer dereference in ip6_flush_pending_frames	YOSHIFUJI Hideaki	2007-09-11	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some of skbs in sk->write_queue do not have skb->dst because we do not fill skb->dst when we allocate new skb in append_data(). BTW, I think we may not need to (or we should not) increment some stats when using corking; if 100 sendmsg() (with MSG_MORE) result in 2 packets, how many should we increment? If 100, we should set skb->dst for every queued skbs. If 1 (or 2 ()), we increment the stats for the first queued skb and we should just skip incrementing OutDiscards for the rest of queued skbs, adn we should also impelement this semantics in other places; e.g., we should increment other stats just once, not 100 times. : depends on the place we are discarding the datagram. I guess should just increment by 1 (or 2). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NETFILTER]: Fix/improve deadlock condition on module removal netfilter	Neil Horman	2007-09-11	7	-25/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	So I've had a deadlock reported to me. I've found that the sequence of events goes like this: 1) process A (modprobe) runs to remove ip_tables.ko 2) process B (iptables-restore) runs and calls setsockopt on a netfilter socket, increasing the ip_tables socket_ops use count 3) process A acquires a file lock on the file ip_tables.ko, calls remove_module in the kernel, which in turn executes the ip_tables module cleanup routine, which calls nf_unregister_sockopt 4) nf_unregister_sockopt, seeing that the use count is non-zero, puts the calling process into uninterruptible sleep, expecting the process using the socket option code to wake it up when it exits the kernel 4) the user of the socket option code (process B) in do_ipt_get_ctl, calls ipt_find_table_lock, which in this case calls request_module to load ip_tables_nat.ko 5) request_module forks a copy of modprobe (process C) to load the module and blocks until modprobe exits. 6) Process C. forked by request_module process the dependencies of ip_tables_nat.ko, of which ip_tables.ko is one. 7) Process C attempts to lock the request module and all its dependencies, it blocks when it attempts to lock ip_tables.ko (which was previously locked in step 3) Theres not really any great permanent solution to this that I can see, but I've developed a two part solution that corrects the problem Part 1) Modifies the nf_sockopt registration code so that, instead of using a use counter internal to the nf_sockopt_ops structure, we instead use a pointer to the registering modules owner to do module reference counting when nf_sockopt calls a modules set/get routine. This prevents the deadlock by preventing set 4 from happening. Part 2) Enhances the modprobe utilty so that by default it preforms non-blocking remove operations (the same way rmmod does), and add an option to explicity request blocking operation. So if you select blocking operation in modprobe you can still cause the above deadlock, but only if you explicity try (and since root can do any old stupid thing it would like.... :) ). Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NETFILTER]: nf_conntrack_ipv4: fix "Frag of proto ..." messages	Patrick McHardy	2007-09-11	1	-7/+3
\| \| \| \| \| \| \| \| \| \|	Since we're now using a generic tuple decoding function in ICMP connection tracking, ipv4_get_l4proto() might get called with a fragmented packet from within an ICMP error. Remove the error message we used to print when this happens. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge master.kernel.org:/pub/scm/linux/kernel/git/holtmann/bluetooth-2.6	David S. Miller	2007-09-11	2	-11/+24
\|\
\| *	[Bluetooth] Fix parameter list for event filter command	Marcel Holtmann	2007-09-09	1	-5/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	On device initialization the event filters are cleared. In case of clearing the filters the extra condition type shall be omitted. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
\| *	[Bluetooth] Update security filter for Bluetooth 2.1	Marcel Holtmann	2007-09-09	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch updates the HCI security filter with support for the Bluetooth 2.1 commands and events. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
\| *	[Bluetooth] Add compat handling for timestamp structure	Marcel Holtmann	2007-09-09	1	-1/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The timestamp structure needs special handling in case of compat programs. Use the same wrapping method the network core uses. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
* \|	[IPV6]: Freeing alive inet6 address	Denis V. Lunev	2007-09-11	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	From: Denis V. Lunev <den@openvz.org> addrconf_dad_failure calls addrconf_dad_stop which takes referenced address and drops the count. So, in6_ifa_put perrformed at out: is extra. This results in message: "Freeing alive inet6 address" and not released dst entries. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	[DECNET]: Fix interface address listing regression.	Patrick McHardy	2007-09-11	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Not all are listed, same as the IPV4 devinet bug. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	[IPV4] devinet: show all addresses assigned to interface	Stephen Hemminger	2007-09-11	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Bug: http://bugzilla.kernel.org/show_bug.cgi?id=8876 Not all ips are shown by "ip addr show" command when IPs number assigned to an interface is more than 60-80 (in fact it depends on broadcast/label etc presence on each address). Steps to reproduce: It's terribly simple to reproduce: # for i in $(seq 1 100); do ip ad add 10.0.$i.1/24 dev eth10 ; done # ip addr show this will _not_ show all IPs. Looks like the problem is in netlink/ipv4 message processing. This is fix from bug submitter, it looks correct. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	[NET]: Do not dereference iov if length is zero	Herbert Xu	2007-09-11	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When msg_iovlen is zero we shouldn't try to dereference msg_iov. Right now the only thing that tries to do so is skb_copy_and_csum_datagram_iovec. Since the total length should also be zero if msg_iovlen is zero, it's sufficient to check the total length there and simply return if it's zero. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	[TCP]: 'dst' can be NULL in tcp_rto_min()	David S. Miller	2007-08-31	1	-1/+1
\|/ \| \| \| \| \|	Reported by Rick Jones. Signed-off-by: David S. Miller <davem@davemloft.net>
*	[PKTGEN]: Remove write-only variable.	Pavel Emelyanov	2007-08-30	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \|	The pktgen_thread.pid is set to current->pid and is never used after this. So remove this at all. Found during isolating the explicit pid/tgid usage. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NETFILTER]: xt_tcpudp: fix wrong struct in udp_checkentry	Jesper Bengtsson	2007-08-30	1	-1/+1
\| \| \| \| \| \| \| \| \|	It doesn't seem to have any effect on the x86 architecture but it does have effect on the Axis CRIS architecture. Signed-off-by: Jesper Bengtsson <jesper.bengtsson@axis.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NET_SCHED] sch_prio.c: remove duplicate call of tc_classify()	Lucas Nussbaum	2007-08-30	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When CONFIG_NET_CLS_ACT is enabled, tc_classify() is called twice in prio_classify(). This causes "interesting" behaviour: with the setup below, packets are duplicated, sent twice to ifb0, and then loop in and out of ifb0. The patch uses the previously calculated return value in the switch, which is probably what Patrick had in mind in commit bdba91ec70fb5ccbdeb1c7068319adc6ea9e1a7d -- maybe Patrick can double-check this? -- example setup -- ifconfig ifb0 up tc qdisc add dev ifb0 root netem delay 2s tc qdisc add dev $ETH root handle 1: prio tc filter add dev $ETH parent 1: protocol ip prio 10 u32 \ match ip dst 172.24.110.6/32 flowid 1:1 \ action mirred egress redirect dev ifb0 ping -c1 172.24.110.6 Signed-off-by: Lucas Nussbaum <lucas.nussbaum@imag.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[BRIDGE]: Fix OOPS when bridging device without ethtool.	Stephen Hemminger	2007-08-30	1	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Bridge code calls ethtool to get speed. The conversion to using only ethtool_ops broke the case of devices without ethtool_ops. This is a new regression in 2.6.23. Rearranged the switch to a logical order, and use gcc initializer. Ps: speed should have been part of the network device structure from the start rather than burying it in ethtool. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Acked-by: Matthew Wilcox <matthew@wil.cx> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[BRIDGE]: Packets leaking out of disabled/blocked ports.	Stephen Hemminger	2007-08-30	2	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes some packet leakage in bridge. The bridging code was allowing forward table entries to be generated even if a device was being blocked. The fix is to not add forwarding database entries unless the port is active. The bug arose as part of the conversion to processing STP frames through normal receive path (in 2.6.17). Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Acked-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'master' of ↵	David S. Miller	2007-08-30	7	-76/+187
\|\ \| \| \| \| \| \|	master.kernel.org:/pub/scm/linux/kernel/git/vxy/lksctp-dev
\| *	SCTP: Fix to handle invalid parameter length correctly	Wei Yongjun	2007-08-30	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If an INIT with invalid parameter length look like this: Parameter Type : 1 Parameter Length: 800 and not contain any payload, SCTP will ignore this parameter and send back a INIT-ACK. This patch is fix to handle this invalid parameter length correctly. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
\| *	SCTP: Abort on COOKIE-ECHO if backlog is exceeded.	Vlad Yasevich	2007-08-30	1	-11/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently we abort on the INIT chunk we our backlog is currenlty exceeded. Delay this about untill COOKIE-ECHO to give the user time to accept the socket. Also, make sure that we treat sk_max_backlog of 0 as no connections allowed. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
\| *	SCTP: Correctly disable listening when backlog is 0.	Vlad Yasevich	2007-08-30	1	-0/+2
\| \| \| \| \| \| \| \|	Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
\| *	SCTP: Do not retransmit chunks that are newer then rtt.	Vlad Yasevich	2007-08-30	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When performing a retransmit, do not include the chunk if it was sent less then 1 rtt ago. The reason is that we may receive the SACK very soon and wouldn't retransmit. Suggested by Randy Stewart. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
\| *	SCTP: Uncomfirmed transports can't become Inactive	Vlad Yasevich	2007-08-30	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Do not set Unconfirmed transports to Inactive state. This may result in an inactive association being destroyed since we start counting errors on "inactive" transports against the association. This was found at the SCTP interop event. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
\| *	SCTP: Pick the correct port when binding to 0.	Vlad Yasevich	2007-08-30	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	sctp_bindx() allows the use of unspecified port. The problem is that every address we bind to ends up selecting a new port if the user specified port 0. This patch allows re-use of the already selected port when the port from bindx was 0. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
\| *	SCTP: Use net_ratelimit to suppress error messages print too fast	Wei Yongjun	2007-08-30	2	-14/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When multi bundling SHUTDOWN-ACK message is received in ESTAB state, this will cause "sctp protocol violation state" message print many times. If SHUTDOWN-ACK is bundled 300 times in one packet, message will be print 300 times. The same problem also exists when received unexpected HEARTBEAT-ACK message which is bundled message times. This patch used net_ratelimit() to suppress error messages print too fast. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
\| *	SCTP: Fix to encode PROTOCOL VIOLATION error cause correctly	Wei Yongjun	2007-08-30	2	-23/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	PROTOCOL VIOLATION error cause in ABORT is bad encode when make abort chunk. When SCTP encode ABORT chunk with PROTOCOL VIOLATION error cause, it just add the error messages to PROTOCOL VIOLATION error cause, the rest four bytes(struct sctp_paramhdr) is just add to the chunk, not change the length of error cause. This cause the ABORT chunk to be a bad format. The chunk is like this: ABORT chunk Chunk type: ABORT (6) Chunk flags: 0x00 Chunk length: 72 (1) Protocol violation cause Cause code: Protocol violation (0x000d) Cause length: 62 (2) Cause information: 5468652063756D756C61746976652074736E2061636B2062... Cause padding: 0000 [Needless] 00030010 Chunk Length(1) = 72 but Cause length(2) only 62, not include the extend 4 bytes. ((72 - sizeof(chunk_hdr)) = 68) != (62 +3) / 4 * 4 Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
\| *	SCTP: Fix sctp_addto_chunk() to add pad with correct length	Wei Yongjun	2007-08-30	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	At function sctp_addto_chunk(), it do pad before add payload to chunk if chunk length is not 4-byte alignment. But it do pad with a bad length. This patch fixed this probleam. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
\| *	SCTP: Assign stream sequence numbers to the entire message	Vlad Yasevich	2007-08-29	1	-12/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently we only assign the sequence number to a packet that we are about to transmit. This however breaks the Partial Reliability extensions, because it's possible for us to never transmit a packet, i.e. it expires before we get to send it. In such cases, if the message contained multiple SCTP fragments, and we did manage to send the first part of the message, the Stream sequence numbers would get into invalid state and cause receiver to stall. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
\| *	SCTP: properly clean up fragment and ordering queues during FWD-TSN.	Vlad Yasevich	2007-08-29	2	-13/+65
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When we recieve a FWD-TSN (meaning the peer has abandoned the data), we need to clean up any partially received messages that may be hanging out on the re-assembly or re-ordering queues. This is a MUST requirement that was not properly done before. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com.>
* \|	[TCP]: Allow minimum RTO to be configurable via routing metrics.	David S. Miller	2007-08-30	1	-2/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Cell phone networks do link layer retransmissions and other things that cause unnecessary timeout retransmits. So allow the minimum RTO to be inflated per-route to deal with this. Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	[PKTGEN]: Fix multiqueue oops.	Robert Olsson	2007-08-28	1	-2/+3
\|/ \| \| \| \| \| \| \| \|	Initially pkt_dev can be NULL this causes netif_subqueue_stopped to oops. The patch below should cure it. But maybe the pktgen TX logic should be reworked to better support the new multiqueue support. Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[VLAN/BRIDGE]: Fix "skb_pull_rcsum - Fatal exception in interrupt"	Evgeniy Polyakov	2007-08-26	2	-6/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I tried to preserve bridging code as it was before, but logic is quite strange - I think we should free skb on error, since it is already unshared and thus will just leak. Herbert Xu states: > + if ((skb = skb_share_check(skb, GFP_ATOMIC)) == NULL) > + goto out; If this happens it'll be a double-free on skb since we'll return NF_DROP which makes the caller free it too. We could return NF_STOLEN to prevent that but I'm not sure whether that's correct netfilter semantics. Patrick, could you please make a call on this? Patrick McHardy states: NF_STOLEN should work fine here. Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NET]: Fix crash in dev_mc_sync()/dev_mc_unsync()	Benjamin Thery	2007-08-26	1	-4/+10
\| \| \| \| \| \| \| \| \| \| \|	This patch fixes a crash that may occur when the routine dev_mc_sync() deletes an address from the list it is currently going through. It saves the pointer to the next element before deleting the current one. The problem may also exist in dev_mc_unsync(). Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>