op-kernel-dev - Development kernel branch for OpenPOWER systems

	Commit message (Collapse)	Author	Age	Files	Lines
*	ipv6: Never schedule DAD timer on dead address	Herbert Xu	2010-05-18	1	-2/+8
\| \| \| \| \| \| \| \| \| \|	This patch ensures that all places that schedule the DAD timer look at the address state in a safe manner before scheduling the timer. This ensures that we don't end up with pending timers after deleting an address. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv6: Use POSTDAD state	Herbert Xu	2010-05-18	1	-5/+24
\| \| \| \| \| \| \| \|	This patch makes use of the new POSTDAD state. This prevents a race between DAD completion and failure. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv6: Use state_lock to protect ifa state	Herbert Xu	2010-05-18	1	-4/+23
\| \| \| \| \| \| \| \| \| \| \|	This patch makes use of the new state_lock to synchronise between updates to the ifa state. This fixes the issue where a remotely triggered address deletion (through DAD failure) coincides with a local administrative address deletion, causing certain actions to be performed twice incorrectly. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv6: Replace inet6_ifaddr->dead with state	Herbert Xu	2010-05-18	1	-7/+9
\| \| \| \| \| \| \| \| \| \| \|	This patch replaces the boolean dead flag on inet6_ifaddr with a state enum. This allows us to roll back changes when deleting an address according to whether DAD has completed or not. This patch only adds the state field and does not change the logic. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: Remove unnecessary returns from void function()s	Joe Perches	2010-05-17	3	-5/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch removes from net/ (but not any netfilter files) all the unnecessary return; statements that precede the last closing brace of void functions. It does not remove the returns that are immediately preceded by a label as gcc doesn't like that. Done via: $ grep -rP --include=*.[ch] -l "return;\n}" net/ \| \ xargs perl -i -e 'local $/ ; while (<>) { s/\n[ \t\n]+return;\n}/\n}/g; print; }' Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: Introduce skb_tunnel_rx() helper	Eric Dumazet	2010-05-17	3	-16/+8
\| \| \| \| \| \| \| \| \| \| \|	skb rxhash should be cleared when a skb is handled by a tunnel before being delivered again, so that correct packet steering can take place. There are other cleanups and accounting that we can factorize in a new helper, skb_tunnel_rx() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv6: fix the bug of address check	Stephen Hemminger	2010-05-17	1	-7/+8
\| \| \| \| \| \| \| \| \| \| \| \| \|	The duplicate address check code got broken in the conversion to hlist (2.6.35). The earlier patch did not fix the case where two addresses match same hash value. Use two exit paths, rather than depending on state of loop variables (from macro). Based on earlier fix by Shan Wei. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Reviewed-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv6 addrlabel: permit deletion of labels assigned to removed dev	Florian Westphal	2010-05-17	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \|	as addrlabels with an interface index are left alone when the interface gets removed this results in addrlabels that can no longer be removed. Restrict validation of index to adding new addrlabels. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: Introduce sk_route_nocaps	Eric Dumazet	2010-05-16	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	TCP-MD5 sessions have intermittent failures, when route cache is invalidated. ip_queue_xmit() has to find a new route, calls sk_setup_caps(sk, &rt->u.dst), destroying the sk->sk_route_caps &= ~NETIF_F_GSO_MASK that MD5 desperately try to make all over its way (from tcp_transmit_skb() for example) So we send few bad packets, and everything is fine when tcp_transmit_skb() is called again for this socket. Since ip_queue_xmit() is at a lower level than TCP-MD5, I chose to use a socket field, sk_route_nocaps, containing bits to mask on sk_route_caps. Reported-by: Bhaskar Dutta <bhaskie@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'master' of ↵	David S. Miller	2010-05-13	14	-105/+81
\|\ \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6
\| *	netfilter: remove unnecessary returns from void function()s	Joe Perches	2010-05-13	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch removes from net/ netfilter files all the unnecessary return; statements that precede the last closing brace of void functions. It does not remove the returns that are immediately preceded by a label as gcc doesn't like that. Done via: $ grep -rP --include=*.[ch] -l "return;\n}" net/ \| \ xargs perl -i -e 'local $/ ; while (<>) { s/\n[ \t\n]+return;\n}/\n}/g; print; }' Signed-off-by: Joe Perches <joe@perches.com> [Patrick: changed to keep return statements in otherwise empty function bodies] Signed-off-by: Patrick McHardy <kaber@trash.net>
\| *	netfilter: cleanup printk messages	Stephen Hemminger	2010-05-13	4	-10/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make sure all printk messages have a severity level. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
\| *	netfilter: change NF_ASSERT to WARN_ON	Stephen Hemminger	2010-05-13	1	-6/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Change netfilter asserts to standard WARN_ON. This has the benefit of backtrace info and also causes netfilter errors to show up on kerneloops.org. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
\| *	Merge branch 'master' of git://dev.medozas.de/linux	Patrick McHardy	2010-05-11	10	-88/+70
\| \|\
\| \| *	netfilter: xtables: combine built-in extension structs	Jan Engelhardt	2010-05-11	1	-34/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Prepare the arrays for use with the multiregister function. The future layer-3 xt matches can then be easily added to it without needing more (un)register code. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
\| \| *	netfilter: xtables: change hotdrop pointer to direct modification	Jan Engelhardt	2010-05-11	7	-17/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since xt_action_param is writable, let's use it. The pointer to 'bool hotdrop' always worried (8 bytes (64-bit) to write 1 byte!). Surprisingly results in a reduction in size: text data bss filename 5457066 692730 357892 vmlinux.o-prev 5456554 692730 357892 vmlinux.o Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
\| \| *	netfilter: xtables: deconstify struct xt_action_param for matches	Jan Engelhardt	2010-05-11	8	-11/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In future, layer-3 matches will be an xt module of their own, and need to set the fragoff and thoff fields. Adding more pointers would needlessy increase memory requirements (esp. so for 64-bit, where pointers are wider). Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
\| \| *	netfilter: xtables: substitute temporary defines by final name	Jan Engelhardt	2010-05-11	10	-11/+14
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
\| \| *	netfilter: xtables: combine struct xt_match_param and xt_target_param	Jan Engelhardt	2010-05-11	1	-14/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The structures carried - besides match/target - almost the same data. It is possible to combine them, as extensions are evaluated serially, and so, the callers end up a little smaller. text data bss filename -15318 740 104 net/ipv4/netfilter/ip_tables.o +15286 740 104 net/ipv4/netfilter/ip_tables.o -15333 540 152 net/ipv6/netfilter/ip6_tables.o +15269 540 152 net/ipv6/netfilter/ip6_tables.o Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
\| \| *	netfilter: xtables: dissolve do_match function	Jan Engelhardt	2010-05-02	1	-17/+5
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
* \| \|	Merge branch 'master' of ↵	David S. Miller	2010-05-12	3	-265/+683
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/kaber/ipmr-2.6
\| * \| \|	ipv6: ip6mr: add support for dumping routing tables over netlink	Patrick McHardy	2010-05-11	1	-7/+89
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The ip6mr /proc interface (ip6_mr_cache) can't be extended to dump routes from any tables but the main table in a backwards compatible fashion since the output format ends in a variable amount of output interfaces. Introduce a new netlink interface to dump multicast routes from all tables, similar to the netlink interface for regular routes. Signed-off-by: Patrick McHardy <kaber@trash.net>
\| * \| \|	ipv6: ip6mr: support multiple tables	Patrick McHardy	2010-05-11	3	-67/+377
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds support for multiple independant multicast routing instances, named "tables". Userspace multicast routing daemons can bind to a specific table instance by issuing a setsockopt call using a new option MRT6_TABLE. The table number is stored in the raw socket data and affects all following ip6mr setsockopt(), getsockopt() and ioctl() calls. By default, a single table (RT6_TABLE_DFLT) is created with a default routing rule pointing to it. Newly created pim6reg devices have the table number appended ("pim6regX"), with the exception of devices created in the default table, which are named just "pim6reg" for compatibility reasons. Packets are directed to a specific table instance using routing rules, similar to how regular routing rules work. Currently iif, oif and mark are supported as keys, source and destination addresses could be supported additionally. Example usage: - bind pimd/xorp/... to a specific table: uint32_t table = 123; setsockopt(fd, SOL_IPV6, MRT6_TABLE, &table, sizeof(table)); - create routing rules directing packets to the new table: # ip -6 mrule add iif eth0 lookup 123 # ip -6 mrule add oif eth0 lookup 123 Signed-off-by: Patrick McHardy <kaber@trash.net>
\| * \| \|	ipv6: ip6mr: move mroute data into seperate structure	Patrick McHardy	2010-05-11	1	-176/+214
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Patrick McHardy <kaber@trash.net>
\| * \| \|	ipv6: ip6mr: convert struct mfc_cache to struct list_head	Patrick McHardy	2010-05-11	1	-65/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Patrick McHardy <kaber@trash.net>
\| * \| \|	ipv6: ip6mr: remove net pointer from struct mfc6_cache	Patrick McHardy	2010-05-11	1	-32/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that cache entries in unres_queue don't need to be distinguished by their network namespace pointer anymore, we can remove it from struct mfc6_cache add pass the namespace as function argument to the functions that need it. Signed-off-by: Patrick McHardy <kaber@trash.net>
\| * \| \|	ipv6: ip6mr: move unres_queue and timer to per-namespace data	Patrick McHardy	2010-05-11	1	-41/+33
\| \|/ / \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The unres_queue is currently shared between all namespaces. Following patches will additionally allow to create multiple multicast routing tables in each namespace. Having a single shared queue for all these users seems to excessive, move the queue and the cleanup timer to the per-namespace data to unshare it. As a side-effect, this fixes a bug in the seq file iteration functions: the first entry returned is always from the current namespace, entries returned after that may belong to any namespace. Signed-off-by: Patrick McHardy <kaber@trash.net>
* \| \|	Merge branch 'master' of ↵	David S. Miller	2010-05-12	2	-3/+7
\|\ \ \ \| \|/ / \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: Documentation/feature-removal-schedule.txt drivers/net/wireless/ath/ar9170/usb.c drivers/scsi/iscsi_tcp.c net/ipv4/ipmr.c
\| * \|	IPv6: fix IPV6_RECVERR handling of locally-generated errors	Brian Haley	2010-05-05	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I noticed when I added support for IPV6_DONTFRAG that if you set IPV6_RECVERR and tried to send a UDP packet larger than 64K to an IPv6 destination, you'd correctly get an EMSGSIZE, but reading from MSG_ERRQUEUE returned the incorrect address in the cmsg: struct msghdr: msg_name 0x7fff8f3c96d0 msg_namelen 28 struct sockaddr_in6: sin6_family 10 sin6_port 7639 sin6_flowinfo 0 sin6_addr ::ffff:38.32.0.0 sin6_scope_id 0 ((null)) It should have returned this in my case: struct msghdr: msg_name 0x7fffd866b510 msg_namelen 28 struct sockaddr_in6: sin6_family 10 sin6_port 7639 sin6_flowinfo 0 sin6_addr 2620:0:a09:e000:21f:29ff:fe57:f88b sin6_scope_id 0 ((null)) The problem is that ipv6_recv_error() assumes that if the error wasn't generated by ICMPv6, it's an IPv4 address sitting there, and proceeds to create a v4-mapped address from it. Change ipv6_icmp_error() and ipv6_local_error() to set skb->protocol to htons(ETH_P_IPV6) so that ipv6_recv_error() knows the address sitting right after the extended error is IPv6, else it will incorrectly map the first octet into an IPv4-mapped IPv6 address in the cmsg structure returned in a recvmsg() call to obtain the error. Signed-off-by: Brian Haley <brian.haley@hp.com> -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	ipv6: Fix default multicast hops setting.	David S. Miller	2010-05-03	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As per RFC 3493 the default multicast hops setting for a socket should be "1" just like ipv4. Ironically we have a IPV6_DEFAULT_MCASTHOPS macro it just wasn't being used. Reported-by: Elliot Hughes <enh@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	Merge branch 'master' of /repos/git/net-next-2.6	Patrick McHardy	2010-05-10	15	-146/+312
\|\ \ \ \| \|_\|/ \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Conflicts: net/bridge/br_device.c net/bridge/br_forward.c Signed-off-by: Patrick McHardy <kaber@trash.net>
\| * \|	ipv6: udp: make short packet logging consistent with ipv4	Bjørn Mork	2010-05-06	1	-3/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Adding addresses and ports to the short packet log message, like ipv4/udp.c does it, makes these messages a lot more useful: [ 822.182450] UDPv6: short packet: From [2001:db8:ffb4:3::1]:47839 23715/178 to [2001:db8:ffb4:3:5054:ff:feff:200]:1234 This requires us to drop logging in case pskb_may_pull() fails, which also is consistent with ipv4/udp.c Signed-off-by: Bjørn Mork <bjorn@mork.no> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	net: rcu fixes	Eric Dumazet	2010-05-03	1	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add hlist_for_each_entry_rcu_bh() and hlist_for_each_entry_continue_rcu_bh() macros, and use them in ipv6_get_ifaddr(), if6_get_first() and if6_get_next() to fix lockdeps warnings. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Reviewed-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	Merge branch 'master' of ↵	David S. Miller	2010-05-02	1	-10/+5
\| \|\ \ \| \| \|/ \| \| \| \| \| \|	master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
\| \| *	Revert "tcp: bind() fix when many ports are bound"	David S. Miller	2010-04-28	1	-10/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts two commits: fda48a0d7a8412cedacda46a9c0bf8ef9cd13559 tcp: bind() fix when many ports are bound and a follow-on fix for it: 6443bb1fc2050ca2b6585a3fa77f7833b55329ed ipv6: Fix inet6_csk_bind_conflict() It causes problems with binding listening sockets when time-wait sockets from a previous instance still are alive. It's too late to keep fiddling with this so late in the -rc series, and we'll deal with it in net-next-2.6 instead. Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	ipv6: cleanup: remove unneeded null check	Dan Carpenter	2010-04-30	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We dereference "sk" unconditionally elsewhere in the function. This was left over from: b30bd282 "ip6_xmit: remove unnecessary NULL ptr check". According to that commit message, "the sk argument to ip6_xmit is never NULL nowadays since the skb->priority assigment expects a valid socket." Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	net: ip_queue_rcv_skb() helper	Eric Dumazet	2010-04-28	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When queueing a skb to socket, we can immediately release its dst if target socket do not use IP_CMSG_PKTINFO. tcp_data_queue() can drop dst too. This to benefit from a hot cache line and avoid the receiver, possibly on another cpu, to dirty this cache line himself. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	net: speedup udp receive path	Eric Dumazet	2010-04-28	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since commit 95766fff ([UDP]: Add memory accounting.), each received packet needs one extra sock_lock()/sock_release() pair. This added latency because of possible backlog handling. Then later, ticket spinlocks added yet another latency source in case of DDOS. This patch introduces lock_sock_bh() and unlock_sock_bh() synchronization primitives, avoiding one atomic operation and backlog processing. skb_free_datagram_locked() uses them instead of full blown lock_sock()/release_sock(). skb is orphaned inside locked section for proper socket memory reclaim, and finally freed outside of it. UDP receive path now take the socket spinlock only once. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	net: sk_add_backlog() take rmem_alloc into account	Eric Dumazet	2010-04-27	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Current socket backlog limit is not enough to really stop DDOS attacks, because user thread spend many time to process a full backlog each round, and user might crazy spin on socket lock. We should add backlog size and receive_queue size (aka rmem_alloc) to pace writers, and let user run without being slow down too much. Introduce a sk_rcvqueues_full() helper, to avoid taking socket lock in stress situations. Under huge stress from a multiqueue/RPS enabled NIC, a single flow udp receiver can now process ~200.000 pps (instead of ~100 pps before the patch) on a 8 core machine. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	Merge branch 'master' of ↵	David S. Miller	2010-04-27	1	-2/+2
\| \|\ \ \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/kaber/ipmr-2.6
\| \| * \|	net: rtnetlink: decouple rtnetlink address families from real address families	Patrick McHardy	2010-04-26	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Decouple rtnetlink address families from real address families in socket.h to be able to add rtnetlink interfaces to code that is not a real address family without increasing AF_MAX/NPROTO. This will be used to add support for multicast route dumping from all tables as the proc interface can't be extended to support anything but the main table without breaking compatibility. This partialy undoes the patch to introduce independant families for routing rules and converts ipmr routing rules to a new rtnetlink family. Similar to that patch, values up to 127 are reserved for real address families, values above that may be used arbitrarily. Signed-off-by: Patrick McHardy <kaber@trash.net>
\| \| * \|	net: fib_rules: mark arguments to fib_rules_register const and __net_initdata	Patrick McHardy	2010-04-26	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	fib_rules_register() duplicates the template passed to it without modification, mark the argument as const. Additionally the templates are only needed when instantiating a new namespace, so mark them as __net_initdata, which means they can be discarded when CONFIG_NET_NS=n. Signed-off-by: Patrick McHardy <kaber@trash.net>
\| * \| \|	Merge branch 'master' of ↵	David S. Miller	2010-04-27	4	-8/+13
\| \|\ \ \ \| \| \|/ / \| \|/\| / \| \| \|/ \| \| \| \| \| \| \| \| \| \| \| \|	master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/e100.c drivers/net/e1000e/netdev.c
\| \| *	ipv6: Fix inet6_csk_bind_conflict()	Eric Dumazet	2010-04-25	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit fda48a0d7a84 (tcp: bind() fix when many ports are bound) introduced a bug on IPV6 part. We should not call ipv6_addr_any(inet6_rcv_saddr(sk2)) but ipv6_addr_any(inet6_rcv_saddr(sk)) because sk2 can be IPV4, while sk is IPV6. Reported-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| *	tcp: bind() fix when many ports are bound	Eric Dumazet	2010-04-22	1	-5/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Port autoselection done by kernel only works when number of bound sockets is under a threshold (typically 30000). When this threshold is over, we must check if there is a conflict before exiting first loop in inet_csk_get_port() Change inet_csk_bind_conflict() to forbid two reuse-enabled sockets to bind on same (address,port) tuple (with a non ANY address) Same change for inet6_csk_bind_conflict() Reported-by: Gaspar Chilingarov <gasparch@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| *	net: ipv6 bind to device issue	Jiri Olsa	2010-04-21	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The issue raises when having 2 NICs both assigned the same IPv6 global address. If a sender binds to a particular NIC (SO_BINDTODEVICE), the outgoing traffic is being sent via the first found. The bonded device is thus not taken into an account during the routing. From the ip6_route_output function: If the binding address is multicast, linklocal or loopback, the RT6_LOOKUP_F_IFACE bit is set, but not for global address. So binding global address will neglect SO_BINDTODEVICE-binded device, because the fib6_rule_lookup function path won't check for the flowi::oif field and take first route that fits. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Scott Otto <scott.otto@alcatel-lucent.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| *	ipv6: allow to send packet after receiving ICMPv6 Too Big message with MTU ↵	Shan Wei	2010-04-21	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	field less than IPV6_MIN_MTU According to RFC2460, PMTU is set to the IPv6 Minimum Link MTU (1280) and a fragment header should always be included after a node receiving Too Big message reporting PMTU is less than the IPv6 Minimum Link MTU. After receiving a ICMPv6 Too Big message reporting PMTU is less than the IPv6 Minimum Link MTU, sctp can't send any data/control chunk that total length including IPv6 head and IPv6 extend head is less than IPV6_MIN_MTU(1280 bytes). The failure occured in p6_fragment(), about reason see following(take SHUTDOWN chunk for example): sctp_packet_transmit (SHUTDOWN chunk, len=16 byte) \|------sctp_v6_xmit (local_df=0) \|------ip6_xmit \|------ip6_output (dst_allfrag is ture) \|------ip6_fragment In ip6_fragment(), for local_df=0, drops the the packet and returns EMSGSIZE. The patch fixes it with adding check length of skb->len. In this case, Ipv6 not to fragment upper protocol data, just only add a fragment header before it. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| *	xfrm6: ensure to use the same dev when building a bundle	Nicolas Dichtel	2010-04-21	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When building a bundle, we set dst.dev and rt6.rt6i_idev. We must ensure to set the same device for both fields. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	Merge branch 'net-next-2.6_20100423a/br/br_multicast_v3' of ↵	David S. Miller	2010-04-23	1	-95/+40
\| \|\ \ \| \| \| \| \| \| \| \| \| \| \| \|	git://git.linux-ipv6.org/gitroot/yoshfuji/linux-2.6-next
\| \| * \|	ipv6 mcast: Introduce include/net/mld.h for MLD definitions.	YOSHIFUJI Hideaki	2010-04-23	1	-95/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>