op-kernel-dev - Development kernel branch for OpenPOWER systems

	Commit message (Collapse)	Author	Age	Files	Lines
*	[NET]: Adding SO_TIMESTAMPNS / SCM_TIMESTAMPNS support	Eric Dumazet	2007-04-25	22	-10/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that network timestamps use ktime_t infrastructure, we can add a new SOL_SOCKET sockopt SO_TIMESTAMPNS. This command is similar to SO_TIMESTAMP, but permits transmission of a 'timespec struct' instead of a 'timeval struct' control message. (nanosecond resolution instead of microsecond) Control message is labelled SCM_TIMESTAMPNS instead of SCM_TIMESTAMP A socket cannot mix SO_TIMESTAMP and SO_TIMESTAMPNS : the two modes are mutually exclusive. sock_recv_timestamp() became too big to be fully inlined so I added a __sock_recv_timestamp() helper function. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> CC: linux-arch@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NET]: Replace CONFIG_NET_DEBUG with sysctl.	Stephen Hemminger	2007-04-25	2	-7/+6
\| \| \| \| \| \| \| \|	Covert network warning messages from a compile time to runtime choice. Removes kernel config option and replaces it with new /proc/sys/net/core/warnings. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NET]: Introduce SIOCGSTAMPNS ioctl to get timestamps with nanosec resolution	Eric Dumazet	2007-04-25	24	-22/+46
\| \| \| \| \| \| \| \| \| \|	Now network timestamps use ktime_t infrastructure, we can add a new ioctl() SIOCGSTAMPNS command to get timestamps in 'struct timespec'. User programs can thus access to nanosecond resolution. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> CC: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[TCP]: Abstract out all write queue operations.	David S. Miller	2007-04-25	2	-21/+114
\| \| \| \| \| \| \|	This allows the write queue implementation to be changed, for example, to one which allows fast interval searching. Signed-off-by: David S. Miller <davem@davemloft.net>
*	[UDP]: Clean up UDP-Lite receive checksum	Herbert Xu	2007-04-25	3	-37/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch eliminates some duplicate code for the verification of receive checksums between UDP-Lite and UDP. It does this by introducing __skb_checksum_complete_head which is identical to __skb_checksum_complete_head apart from the fact that it takes a length parameter rather than computing the first skb->len bytes. As a result UDP-Lite will be able to use hardware checksum offload for packets which do not use partial coverage checksums. It also means that UDP-Lite loopback no longer does unnecessary checksum verification. If any NICs start support UDP-Lite this would also start working automatically. This patch removes the assumption that msg_flags has MSG_TRUNC clear upon entry in recvmsg. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NETLINK]: Limit NLMSG_GOODSIZE to 8K.	David S. Miller	2007-04-25	2	-5/+14
\| \| \| \|	Signed-off-by: David S. Miller <davem@davemloft.net>
*	[IPV6] ADDRCONF: Optimistic Duplicate Address Detection (RFC 4429) Support.	Neil Horman	2007-04-25	3	-1/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Nominally an autoconfigured IPv6 address is added to an interface in the Tentative state (as per RFC 2462). Addresses in this state remain in this state while the Duplicate Address Detection process operates on them to determine their uniqueness on the network. During this period, these tentative addresses may not be used for communication, increasing the time before a node may be able to communicate on a network. Using Optimistic Duplicate Address Detection, autoconfigured addresses may be used immediately for communication on the network, as long as certain rules are followed to avoid conflicts with other nodes during the Duplicate Address Detection process. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NET]: convert network timestamps to ktime_t	Eric Dumazet	2007-04-25	2	-30/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We currently use a special structure (struct skb_timeval) and plain 'struct timeval' to store packet timestamps in sk_buffs and struct sock. This has some drawbacks : - Fixed resolution of micro second. - Waste of space on 64bit platforms where sizeof(struct timeval)=16 I suggest using ktime_t that is a nice abstraction of high resolution time services, currently capable of nanosecond resolution. As sizeof(ktime_t) is 8 bytes, using ktime_t in 'struct sock' permits a 8 byte shrink of this structure on 64bit architectures. Some other structures also benefit from this size reduction (struct ipq in ipv4/ip_fragment.c, struct frag_queue in ipv6/reassembly.c, ...) Once this ktime infrastructure adopted, we can more easily provide nanosecond resolution on top of it. (ioctl SIOCGSTAMPNS and/or SO_TIMESTAMPNS/SCM_TIMESTAMPNS) Note : this patch includes a bug correction in compat_sock_get_timestamp() where a "err = 0;" was missing (so this syscall returned -ENOENT instead of 0) Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> CC: Stephen Hemminger <shemminger@linux-foundation.org> CC: John find <linux.kernel@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NET]: div64_64 consolidate (rev3)	Stephen Hemminger	2007-04-25	7	-1/+34
\| \| \| \| \| \| \|	Here is the current version of the 64 bit divide common code. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NET]: Convert xtime.tv_sec to get_seconds()	James Morris	2007-04-25	1	-2/+2
\| \| \| \| \| \| \| \|	Where appropriate, convert references to xtime.tv_sec to the get_seconds() helper function. Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[NET]: Keep sk_backlog near sk_lock	Eric Dumazet	2007-04-25	1	-9/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	sk_backlog is a critical field of struct sock. (known famous words) It is (ab)used in hot paths, in particular in release_sock(), tcp_recvmsg(), tcp_v4_rcv(), sk_receive_skb(). It really makes sense to place it next to sk_lock, because sk_backlog is only used after sk_lock locked (and thus memory cache line in L1 cache). This should reduce cache misses and sk_lock acquisition time. (In theory, we could only move the head pointer near sk_lock, and leaving tail far away, because 'tail' is normally not so hot, but keep it simple :) ) Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[TCP]: Add two new spurious RTO responses to FRTO	Ilpo Järvinen	2007-04-25	2	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	New sysctl tcp_frto_response is added to select amongst these responses: - Rate halving based; reuses CA_CWR state (default) - Very conservative; used to be the only one available (=1) - Undo cwr; undoes ssthresh and cwnd reductions (=2) The response with rate halving requires a new parameter to tcp_enter_cwr because FRTO has already reduced ssthresh and doing a second reduction there has to be prevented. In addition, to keep things nice on 80 cols screen, a local variable was added. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[TCP]: Make snd_cwnd_clamp a u32.	David S. Miller	2007-04-25	1	-1/+1
\| \| \| \|	Signed-off-by: David S. Miller <davem@davemloft.net>
*	[TCP]: Keep copied_seq, rcv_wup and rcv_next together.	Eric Dumazet	2007-04-25	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I noticed in oprofile study a cache miss in tcp_rcv_established() to read copied_seq. ffffffff80400a80 <tcp_rcv_established>: /* tcp_rcv_established total: 4034293 2.0400 */ 55493 0.0281 :ffffffff80400bc9: mov 0x4c8(%r12),%eax copied_seq 543103 0.2746 :ffffffff80400bd1: cmp 0x3e0(%r12),%eax rcv_nxt if (tp->copied_seq == tp->rcv_nxt && len - tcp_header_len <= tp->ucopy.len) { In this function, the cache line 0x4c0 -> 0x500 is used only for this reading 'copied_seq' field. rcv_wup and copied_seq should be next to rcv_nxt field, to lower number of active cache lines in hot paths. (tcp_rcv_established(), tcp_poll(), ...) As you suggested, I changed tcp_create_openreq_child() so that these fields are changed together, to avoid adding a new store buffer stall. Patch is 64bit friendly (no new hole because of alignment constraints) Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[TCP]: Add RFC3742 Limited Slow-Start, controlled by variable ↵	John Heffner	2007-04-25	2	-0/+2
\| \| \| \| \| \| \|	sysctl_tcp_max_ssthresh. Signed-off-by: John Heffner <jheffner@psc.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[TCP] FRTO: Entry is allowed only during (New)Reno like recovery	Ilpo Järvinen	2007-04-25	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This interpretation comes from RFC4138: "If the sender implements some loss recovery algorithm other than Reno or NewReno [FHG04], the F-RTO algorithm SHOULD NOT be entered when earlier fast recovery is underway." I think the RFC means to say (especially in the light of Appendix B) that ...recovery is underway (not just fast recovery) or was underway when it was interrupted by an earlier (F-)RTO that hasn't yet been resolved (snd_una has not advanced enough). Thus, my interpretation is that whenever TCP has ever retransmitted other than head, basic version cannot be used because then the order assumptions which are used as FRTO basis do not hold. NewReno has only the head segment retransmitted at a time. Therefore, walk up to the segment that has not been SACKed, if that segment is not retransmitted nor anything before it, we know for sure, that nothing after the non-SACKed segment should be either. This assumption is valid because TCPCB_EVER_RETRANS does not leave holes but each non-SACKed segment is rexmitted in-order. Check for retrans_out > 1 avoids more expensive walk through the skb list, as we can know the result beforehand: F-RTO will not be allowed. SACKed skb can turn into non-SACked only in the extremely rare case of SACK reneging, in this case we might fail to detect retransmissions if there were them for any other than head. To get rid of that feature, whole rexmit queue would have to be walked (always) or FRTO should be prevented when SACK reneging happens. Of course RTO should still trigger after reneging which makes this issue even less likely to show up. And as long as the response is as conservative as it's now, nothing bad happens even then. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[TCP] FRTO: Moved tcp_use_frto from tcp.h to tcp_input.c	Ilpo Järvinen	2007-04-25	1	-13/+1
\| \| \| \| \| \| \|	In addition, removed inline. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6	Linus Torvalds	2007-04-24	2	-0/+4
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [BNX2]: Fix occasional NETDEV WATCHDOG on 5709. [IPV6]: Disallow RH0 by default. [XFRM]: beet: fix pseudo header length value [TCP]: Congestion control initialization.
\| *	[IPV6]: Disallow RH0 by default.	YOSHIFUJI Hideaki	2007-04-24	2	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A security issue is emerging. Disallow Routing Header Type 0 by default as we have been doing for IPv4. Note: We allow RH2 by default because it is harmless. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	Taskstats fix the structure members alignment issue	Balbir Singh	2007-04-24	1	-5/+8
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We broke the the alignment of members of taskstats to the 8 byte boundary with the CSA patches. In the current kernel, the taskstats structure is not suitable for use by 32 bit applications in a 64 bit kernel. On x86_64 Offsets of taskstats' members (64 bit kernel, 64 bit application) @taskstats'offsetof[@taskstats'indices] = ( 0, # version 4, # ac_exitcode 8, # ac_flag 9, # ac_nice 16, # cpu_count 24, # cpu_delay_total 32, # blkio_count 40, # blkio_delay_total 48, # swapin_count 56, # swapin_delay_total 64, # cpu_run_real_total 72, # cpu_run_virtual_total 80, # ac_comm 112, # ac_sched 113, # ac_pad 116, # ac_uid 120, # ac_gid 124, # ac_pid 128, # ac_ppid 132, # ac_btime 136, # ac_etime 144, # ac_utime 152, # ac_stime 160, # ac_minflt 168, # ac_majflt 176, # coremem 184, # virtmem 192, # hiwater_rss 200, # hiwater_vm 208, # read_char 216, # write_char 224, # read_syscalls 232, # write_syscalls 240, # read_bytes 248, # write_bytes 256, # cancelled_write_bytes ); Offsets of taskstats' members (64 bit kernel, 32 bit application) @taskstats'offsetof[@taskstats'indices] = ( 0, # version 4, # ac_exitcode 8, # ac_flag 9, # ac_nice 12, # cpu_count 20, # cpu_delay_total 28, # blkio_count 36, # blkio_delay_total 44, # swapin_count 52, # swapin_delay_total 60, # cpu_run_real_total 68, # cpu_run_virtual_total 76, # ac_comm 108, # ac_sched 109, # ac_pad 112, # ac_uid 116, # ac_gid 120, # ac_pid 124, # ac_ppid 128, # ac_btime 132, # ac_etime 140, # ac_utime 148, # ac_stime 156, # ac_minflt 164, # ac_majflt 172, # coremem 180, # virtmem 188, # hiwater_rss 196, # hiwater_vm 204, # read_char 212, # write_char 220, # read_syscalls 228, # write_syscalls 236, # read_bytes 244, # write_bytes 252, # cancelled_write_bytes ); This is one way to solve the problem without re-arranging structure members is to pack the structure. The patch adds an __attribute__((aligned(8))) to the taskstats structure members so that 32 bit applications using taskstats can work with a 64 bit kernel. Using __attribute__((packed)) would break the 64 bit alignment of members. The fix was tested on x86_64. After the fix, we got Offsets of taskstats' members (64 bit kernel, 64 bit application) @taskstats'offsetof[@taskstats'indices] = ( 0, # version 4, # ac_exitcode 8, # ac_flag 9, # ac_nice 16, # cpu_count 24, # cpu_delay_total 32, # blkio_count 40, # blkio_delay_total 48, # swapin_count 56, # swapin_delay_total 64, # cpu_run_real_total 72, # cpu_run_virtual_total 80, # ac_comm 112, # ac_sched 113, # ac_pad 120, # ac_uid 124, # ac_gid 128, # ac_pid 132, # ac_ppid 136, # ac_btime 144, # ac_etime 152, # ac_utime 160, # ac_stime 168, # ac_minflt 176, # ac_majflt 184, # coremem 192, # virtmem 200, # hiwater_rss 208, # hiwater_vm 216, # read_char 224, # write_char 232, # read_syscalls 240, # write_syscalls 248, # read_bytes 256, # write_bytes 264, # cancelled_write_bytes ); Offsets of taskstats' members (64 bit kernel, 32 bit application) @taskstats'offsetof[@taskstats'indices] = ( 0, # version 4, # ac_exitcode 8, # ac_flag 9, # ac_nice 16, # cpu_count 24, # cpu_delay_total 32, # blkio_count 40, # blkio_delay_total 48, # swapin_count 56, # swapin_delay_total 64, # cpu_run_real_total 72, # cpu_run_virtual_total 80, # ac_comm 112, # ac_sched 113, # ac_pad 120, # ac_uid 124, # ac_gid 128, # ac_pid 132, # ac_ppid 136, # ac_btime 144, # ac_etime 152, # ac_utime 160, # ac_stime 168, # ac_minflt 176, # ac_majflt 184, # coremem 192, # virtmem 200, # hiwater_rss 208, # hiwater_vm 216, # read_char 224, # write_char 232, # read_syscalls 240, # write_syscalls 248, # read_bytes 256, # write_bytes 264, # cancelled_write_bytes ); Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Jay Lan <jlan@engr.sgi.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus	Linus Torvalds	2007-04-20	5	-21/+11
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: [MIPS] Fix wrong checksum for split TCP packets on 64-bit MIPS [MIPS] Fix BUG(), BUG_ON() handling [MIPS] Retry {save,restore}_fp_context if failed in atomic context. [MIPS] Disallow CpU exception in kernel again. [MIPS] Add missing silicon revisions for BCM112x
\| *	[MIPS] Fix wrong checksum for split TCP packets on 64-bit MIPS	Dave Johnson	2007-04-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I've traced down an off-by-one TCP checksum calculation error under the following conditions: 1) The TCP code needs to split a full-sized packet due to a reduced MSS (typically due to the addition of TCP options mid-stream like SACK). _AND_ 2) The checksum of the 2nd fragment is larger than the checksum of the original packet. After subtraction this results in a checksum for the 1st fragment with bits 16..31 set to 1. (this is ok) _AND_ 3) The checksum of the 1st fragment's TCP header plus the previously 32bit checksum of the 1st fragment DOES NOT cause a 32bit overflow when added together. This results in a checksum of the TCP header plus TCP data that still has the upper 16 bits as 1's. _THEN_ 4) The TCP+data checksum is added to the checksum of the pseudo IP header with csum_tcpudp_nofold() incorrectly (the bug). The problem is the checksum of the TCP+data is passed to csum_tcpudp_nofold() as an 32bit unsigned value, however the assembly code acts on it as if it is a 64bit unsigned value. This causes an incorrect 32->64bit extension if the sum has bit 31 set. The resulting checksum is off by one. This problems is data and TCP header dependent due to #2 and #3 above so it doesn't occur on every TCP packet split. Signed-off-by: Dave Johnson <djohnson+linux-mips@sw.starentnetworks.com> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
\| *	[MIPS] Fix BUG(), BUG_ON() handling	Atsushi Nemoto	2007-04-20	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With commit 63dc68a8cf60cb110b147dab1704d990808b39e2, kernel can not handle BUG() and BUG_ON() properly since get_user() returns false for kernel code. Use __get_user() to skip unnecessary access_ok(). This patch also make BRK_BUG code encoded in the TNE instruction. Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
\| *	[MIPS] Retry {save,restore}_fp_context if failed in atomic context.	Atsushi Nemoto	2007-04-20	1	-2/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The save_fp_context()/restore_fp_context() might sleep on accessing user stack and therefore might lose FPU ownership in middle of them. If these function failed due to "in_atomic" test in do_page_fault, touch the sigcontext area in non-atomic context and retry these save/restore operation. This is a replacement of a (broken) fix which was titled "Allow CpU exception in kernel partially". Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
\| *	[MIPS] Disallow CpU exception in kernel again.	Atsushi Nemoto	2007-04-20	2	-17/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The commit 4d40bff7110e9e1a97ff8c01bdd6350e9867cc10 ("Allow CpU exception in kernel partially") was broken. The commit was to fix theoretical problem but broke usual case. Revert it for now. Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
\| *	[MIPS] Add missing silicon revisions for BCM112x	Mark Mason	2007-04-20	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Recent versions of the BCM112X processors aren't recognized by Linux (preventing Linux from booting on those processors). This patch adds support for those that are missing. Signed-off-by: Mark Mason <mason@broadcom.com> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
* \|	NFS: clean up the unstable write code	Trond Myklebust	2007-04-20	1	-30/+0
\|/ \| \| \| \| \| \|	Get rid of the inlined #ifdefs. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6	Linus Torvalds	2007-04-17	1	-3/+7
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [BRIDGE]: Unaligned access when comparing ethernet addresses [SCTP]: Unmap v4mapped addresses during SCTP_BINDX_REM_ADDR operation. [SCTP]: Fix assertion (!atomic_read(&sk->sk_rmem_alloc)) failed message [NET]: Set a separate lockdep class for neighbour table's proxy_queue [NET]: Fix UDP checksum issue in net poll mode. [KEY]: Fix conversion between IPSEC_MODE_xxx and XFRM_MODE_xxx. [NET]: Get rid of alloc_skb_from_cache
\| *	[NET]: Set a separate lockdep class for neighbour table's proxy_queue	Pavel Emelianov	2007-04-17	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Otherwise the following calltrace will lead to a wrong lockdep warning: neigh_proxy_process() `- lock(neigh_table->proxy_queue.lock); arp_redo /* via tbl->proxy_redo */ arp_process neigh_event_ns neigh_update skb_queue_purge `- lock(neighbor->arp_queue.lock); This is not a deadlock actually, as neighbor table's proxy_queue and the neighbor's arp_queue are different queues. Lockdep thinks there is a deadlock as both queues are initialized with skb_queue_head_init() and thus have a common class. Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	[NET]: Get rid of alloc_skb_from_cache	Herbert Xu	2007-04-17	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since this was added originally for Xen, and Xen has recently (~2.6.18) stopped using this function, we can safely get rid of it. Good timing too since this function has started to bit rot. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	Provide dummy devm_ioport_* if !HAS_IOPORT	Russell King	2007-04-17	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Provide an dummy implementation of devm_ioport_map() and devm_ioport_unmap() to allow drivers (eg, pata_platform) to build for platforms where CONFIG_NO_IOPORT is selected. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Jeff Garzik <jeff@garzik.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \|	alpha: build fixes - force architecture	Ivan Kokshaysky	2007-04-17	1	-12/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Override compiler .arch directive for generic kernel build. Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \|	alpha: fixes for specific machine types	Ivan Kokshaysky	2007-04-17	2	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Files: arch/alpha/kernel/core_mcpcia.c arch/alpha/kernel/sys_rawhide.c include/asm-alpha/core_mcpcia.h Determine correct hose configuration; RAWHIDE family can have 2 or 4 hoses, so make sure non-existent hoses are ignored. arch/alpha/kernel/err_titan.c Supply a needed #include <asm/irq_regs.h> arch/alpha/kernel/module.c Add some useful output to the relocation overflow messages. arch/alpha/kernel/sys_noritake.c Supply necessary noritake_end_irq() to correct interrupt handling. This fixes a problem first noted by hangs during boot probing with a DE500-BA TULIP NIC present. arch/alpha/kernel/sys_sio.c Correct saving of original PIRQ register (PCI IRQ routing); change default PIRQ setting to leave PCI IRQs 9 and 14 free to be used for sound (Multia) and IDE (any), respectively. include/asm-alpha/io.h Supply the "isa_virt_to_bus" routine. Signed-off-by: Jay Estabrook <jay.estabrook@hp.com> Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \|	kernel-doc: fix plist.h comments	Randy Dunlap	2007-04-17	1	-31/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make kernel-doc comments match macro names. Correct parameter names in a few places. Remove '#' from beginning of kernel-doc comment macro names. Remove extra (erroneous) blank lines in kernel-doc. Warning(plist.h:100): Cannot understand * #PLIST_HEAD_INIT - static struct plist_head initializer on line 100 - I thought it was a doc line Warning(plist.h:112): Cannot understand * #PLIST_NODE_INIT - static struct plist_node initializer on line 112 - I thought it was a doc line Warning(plist.h:103): No description found for parameter '_lock' Warning(plist.h:129): No description found for parameter 'lock' Warning(plist.h:158): No description found for parameter 'pos' Warning(plist.h:169): No description found for parameter 'pos' Warning(plist.h:169): No description found for parameter 'n' Warning(plist.h:179): No description found for parameter 'mem' This still leaves one warning & one error that need attention: Error(plist.h:219): cannot understand prototype: '(' Warning(plist.h): no structured comments found Acked-by: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com> Cc: Daniel Walker <dwalker@mvista.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \|	allow vmsplice to work in 32-bit mode on ppc64	Don Zickus	2007-04-17	1	-1/+1
\|/ \| \| \| \| \| \| \| \| \| \| \|	Trivial change to pass vmsplice arguments through the compat layer on pp64. Signed-off-by: Don Zickus <dzickus@redhat.com> Acked-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	NFS: Ensure PG_writeback is cleared when writeback fails	Trond Myklebust	2007-04-14	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the writebacks are cancelled via nfs_cancel_dirty_list, or due to the memory allocation failing in nfs_flush_one/nfs_flush_multi, then we must ensure that the PG_writeback flag is cleared. Also ensure that we actually own the PG_writeback flag whenever we schedule a new writeback by making nfs_set_page_writeback() return the value of test_set_page_writeback(). The PG_writeback page flag ends up replacing the functionality of the PG_FLUSHING nfs_page flag, so we rip that out too. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	Merge master.kernel.org:/pub/scm/linux/kernel/git/bart/ide-2.6	Linus Torvalds	2007-04-10	1	-0/+2
\|\ \| \| \| \| \| \| \| \| \| \| \| \|	* master.kernel.org:/pub/scm/linux/kernel/git/bart/ide-2.6: ide: add "optical" to sysfs "media" attribute ide: ugly messages trying to open CD drive with no media present ide: correctly prevent IDE timer expiry function to run if request was already handled
\| *	ide: correctly prevent IDE timer expiry function to run if request was ↵	Suleiman Souhlal	2007-04-10	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	already handled It is possible for the timer expiry function to run even though the request has already been handled: ide_timer_expiry() only checks that the handler is not NULL, but it is possible that we have handled a request (thus clearing the handler) and then started a new request (thus starting the timer again, and setting a handler). A simple way to exhibit this is to set the DMA timeout to 1 jiffy and run dd: The kernel will panic after a few minutes because ide_timer_expiry() tries to add a timer when it's already active. To fix this, we simply add a request generation count that gets incremented at every interrupt, and check in ide_timer_expiry() that we have not already handled a new interrupt before running the expiry function. Signed-off-by: Suleiman Souhlal <suleiman@google.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* \|	Merge branch 'release' of ↵	Linus Torvalds	2007-04-10	1	-2/+3
\|\ \ \| \|/ \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] SGI Altix : fix pcibr_dmamap_ate32() bug [IA64] Fix CPU freq displayed in /proc/cpuinfo [IA64] Fix wrong assumption about irq and vector in msi_ia64.c [IA64] BTE error timer fix
\| *	[IA64] SGI Altix : fix pcibr_dmamap_ate32() bug	Mike Habeck	2007-04-06	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On a SGI Altix TIOCP based PCI bus we need to include the ATE_PIO attribute bit if we're mapping a 32bit MSI address. Signed-off-by: Mike Habeck <habeck@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
* \|	[PATCH] Proper fix for highmem kmap_atomic functions for VMI for 2.6.21	Zachary Amsden	2007-04-08	2	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since lazy MMU batching mode still allows interrupts to enter, it is possible for interrupt handlers to try to use kmap_atomic, which fails when lazy mode is active, since the PTE update to highmem will be delayed. The best workaround is to issue an explicit flush in kmap_atomic_functions case; this is the only way nested PTE updates can happen in the interrupt handler. Thanks to Jeremy Fitzhardinge for noting the bug and suggestions on a fix. This patch gets reverted again when we start 2.6.22 and the bug gets fixed differently. Signed-off-by: Zachary Amsden <zach@vmware.com> Cc: Andi Kleen <ak@muc.de> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \|	[PATCH] high-res timers: resume fix	Ingo Molnar	2007-04-07	1	-0/+3
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Soeren Sonnenburg reported that upon resume he is getting this backtrace: [<c0119637>] smp_apic_timer_interrupt+0x57/0x90 [<c0142d30>] retrigger_next_event+0x0/0xb0 [<c0104d30>] apic_timer_interrupt+0x28/0x30 [<c0142d30>] retrigger_next_event+0x0/0xb0 [<c0140068>] __kfifo_put+0x8/0x90 [<c0130fe5>] on_each_cpu+0x35/0x60 [<c0143538>] clock_was_set+0x18/0x20 [<c0135cdc>] timekeeping_resume+0x7c/0xa0 [<c02aabe1>] __sysdev_resume+0x11/0x80 [<c02ab0c7>] sysdev_resume+0x47/0x80 [<c02b0b05>] device_power_up+0x5/0x10 it turns out that on resume we mistakenly re-enable interrupts too early. Do the timer retrigger only on the current CPU. Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Soeren Sonnenburg <kernel@nn7.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[PATCH] ia64: desc_empty thinko/typo fix	Maciej Zenczykowski	2007-04-04	1	-1/+1
\| \| \| \| \| \| \| \|	Just a one-byter for an ia64 thinko/typo - already fixed for i386 and x86_64. Acked-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[PATCH] remove protection of LANANA-reserved majors	Andrew Morton	2007-04-04	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \|	Revert all this. It can cause device-mapper to receive a different major from earlier kernels and it turns out that the Amanda backup program (via GNU tar, apparently) checks major numbers on files when performing incremental backups. Which is a bit broken of Amanda (or tar), but this feature isn't important enough to justify the churn. Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[PATCH] md: avoid a deadlock when removing a device from an md array via sysfs	NeilBrown	2007-04-04	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A device can be removed from an md array via e.g. echo remove > /sys/block/md3/md/dev-sde/state This will try to remove the 'dev-sde' subtree which will deadlock since commit e7b0d26a86943370c04d6833c6edba2a72a6e240 With this patch we run the kobject_del via schedule_work so as to avoid the deadlock. Cc: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	libata: Limit ATAPI DMA to R/W commands only for TORiSAN DVD drives (take 3)	Albert Lee	2007-04-04	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	patch 4/4: Limit ATAPI DMA to R/W commands only for TORiSAN DRD-N216 DVD-ROM drives (http://bugzilla.kernel.org/show_bug.cgi?id=6710) Signed-off-by: Albert Lee <albertcc@tw.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
*	libata: Limit max sector to 128 for TORiSAN DVD drives (take 3)	Albert Lee	2007-04-04	2	-0/+2
\| \| \| \| \| \| \| \| \| \|	patch 3/4: The TORiSAN drive locks up when max sector == 256. Limit max sector to 128 for the TORiSAN DRD-N216 drives. (http://bugzilla.kernel.org/show_bug.cgi?id=6710) Signed-off-by: Albert Lee <albertcc@tw.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
*	libata: reorder HSM_ST_FIRST for easier decoding (take 3)	Albert Lee	2007-04-04	1	-2/+2
\| \| \| \| \| \| \| \|	patch 1/4: Reorder HSM_ST_FIRST, such that the task state transition is easier decoded with human eyes. Signed-off-by: Albert Lee <albertcc@tw.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
*	[SPARC]: Add unsigned to unused bit field in a.out.h	Robert Reif	2007-04-02	2	-2/+2
\| \| \| \| \| \| \| \| \|	Add unsigned to unused bit field in a.out.h to make sparse happy. [ I took care of the sparc64 side as well -DaveM ] Signed-off-by: Robert Reif <reif@earthlink.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6	Linus Torvalds	2007-04-02	2	-0/+3
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: [PATCH] x86: Don't probe for DDC on VBE1.2 [PATCH] x86-64: Increase NMI watchdog probing timeout [PATCH] x86-64: Let oprofile reserve MSR on all CPUs [PATCH] x86-64: Disable local APIC timer use on AMD systems with C1E