op-kernel-dev - Development kernel branch for OpenPOWER systems

	Commit message (Collapse)	Author	Age	Files	Lines
*	cputime: Generic on-demand virtual cputime accounting	Frederic Weisbecker	2013-01-27	1	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we want to stop the tick further idle, we need to be able to account the cputime without using the tick. Virtual based cputime accounting solves that problem by hooking into kernel/user boundaries. However implementing CONFIG_VIRT_CPU_ACCOUNTING require low level hooks and involves more overhead. But we already have a generic context tracking subsystem that is required for RCU needs by archs which plan to shut down the tick outside idle. This patch implements a generic virtual based cputime accounting that relies on these generic kernel/user hooks. There are some upsides of doing this: - This requires no arch code to implement CONFIG_VIRT_CPU_ACCOUNTING if context tracking is already built (already necessary for RCU in full tickless mode). - We can rely on the generic context tracking subsystem to dynamically (de)activate the hooks, so that we can switch anytime between virtual and tick based accounting. This way we don't have the overhead of the virtual accounting when the tick is running periodically. And one downside: - There is probably more overhead than a native virtual based cputime accounting. But this relies on hooks that are already set anyway. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de>
*	Wire up finit_module syscall	Luck, Tony	2013-01-03	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	Linux was granted a new system call to load modules by file descriptor in commit 34e1169d996a ("module: add syscall to load module from fd"). Wire it up for ia64 (ready for the Chrome port :-) Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	flagday: kill pt_regs argument of do_fork()	Al Viro	2012-11-29	1	-8/+6
\| \| \| \|	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	ia64: switch to generic sys_execve()	Al Viro	2012-10-19	1	-3/+1
\| \| \| \| \| \|	Acked-by: Tony Luck <tony.luck@intel.com> Tested-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	ia64: switch to generic kernel_thread()/kernel_execve()	Al Viro	2012-10-19	1	-7/+22
\| \| \| \| \| \|	Acked-by: Tony Luck <tony.luck@intel.com> Tested-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	ia64: clone() had been unused since 2004	Al Viro	2012-10-14	1	-6/+0
\| \| \| \| \| \| \| \|	Used to be used by kernel_thread(); dead code for 8 years... Note that it's not sys_clone/sys_clone2 - those are used just fine. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	ia64: Add accept4() syscall	Émeric Maschino	2012-01-09	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \|	While debugging udev > 170 failure on Debian Wheezy (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=648325), it appears that the issue was in fact due to missing accept4() in ia64. This patch simply adds accept4() to ia64. Signed-off-by: Émeric Maschino <emeric.maschino@gmail.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64] Wire up cross memory attach syscalls	Tony Luck	2011-11-02	1	-0/+2
\| \| \| \| \| \| \|	Add sys_process_vm_readv and sys_process_vm_writev to ia64 syscall table. Passes tests at http://ozlabs.org/~cyeoh/cma/cma-test-20110718.tgz Signed-off-by: Tony Luck <tony.luck@intel.com>
*	All Arch: remove linkage for sys_nfsservctl system call	NeilBrown	2011-08-26	1	-1/+1
\| \| \| \| \| \| \| \| \|	The nfsservctl system call is now gone, so we should remove all linkage for it. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[IA64] wire up sendmmsg() syscall for Itanium	Tony Luck	2011-05-31	1	-0/+1
\| \| \| \| \| \|	Add entries in unistd.h and entry.S to make this new syscall visible. Signed-off-by: Tony Luck <tony.luck@intel.com>
*	ns: Wire up the setns system call	Eric W. Biederman	2011-05-28	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	32bit and 64bit on x86 are tested and working. The rest I have looked at closely and I can't find any problems. setns is an easy system call to wire up. It just takes two ints so I don't expect any weird architecture porting problems. While doing this I have noticed that we have some architectures that are very slow to get new system calls. cris seems to be the slowest where the last system calls wired up were preadv and pwritev. avr32 is weird in that recvmmsg was wired up but never declared in unistd.h. frv is behind with perf_event_open being the last syscall wired up. On h8300 the last system call wired up was epoll_wait. On m32r the last system call wired up was fallocate. mn10300 has recvmmsg as the last system call wired up. The rest seem to at least have syncfs wired up which was new in the 2.6.39. v2: Most of the architecture support added by Daniel Lezcano <dlezcano@fr.ibm.com> v3: ported to v2.6.36-rc4 by: Eric W. Biederman <ebiederm@xmission.com> v4: Moved wiring up of the system call to another patch v5: ported to v2.6.39-rc6 v6: rebased onto parisc-next and net-next to avoid syscall conflicts. v7: ported to Linus's latest post 2.6.39 tree. > arch/blackfin/include/asm/unistd.h \| 3 ++- > arch/blackfin/mach-common/entry.S \| 1 + Acked-by: Mike Frysinger <vapier@gentoo.org> Oh - ia64 wiring looks good. Acked-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[IA64] New syscalls for 2.6.39	Tony Luck	2011-03-22	1	-0/+4
\| \| \| \| \| \| \| \| \| \|	Four new syscalls: sys_name_to_handle_at sys_open_by_handle_at sys_clock_adjtime sys_syncfs Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64] Add latest crop of syscalls	Tony Luck	2010-08-13	1	-0/+3
\| \| \| \| \| \| \|	Three new syscalls for 2.6.36: prlimit64, fanotify_init and fanotify_mark. Wire up the ia64 syscall table for them. Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64] Remove COMPAT_IA32 support	Tony Luck	2010-02-08	1	-39/+0
\| \| \| \| \| \| \| \| \|	This has been broken since May 2008 when Al Viro killed altroot support. Since nobody has complained, it would appear that there are no users of this code (A plausible theory since the main OSVs that support ia64 prefer to use the IA32-EL software emulation). Signed-off-by: Tony Luck <tony.luck@intel.com>
*	net: Introduce recvmmsg socket syscall	Arnaldo Carvalho de Melo	2009-10-12	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Meaning receive multiple messages, reducing the number of syscalls and net stack entry/exit operations. Next patches will introduce mechanisms where protocols that want to optimize this operation will provide an unlocked_recvmsg operation. This takes into account comments made by: . Paul Moore: sock_recvmsg is called only for the first datagram, sock_recvmsg_nosec is used for the rest. . Caitlin Bestler: recvmmsg now has a struct timespec timeout, that works in the same fashion as the ppoll one. If the underlying protocol returns a datagram with MSG_OOB set, this will make recvmmsg return right away with as many datagrams (+ the OOB one) it has received so far. . Rémi Denis-Courmont & Steven Whitehouse: If we receive N < vlen datagrams and then recvmsg returns an error, recvmmsg will return the successfully received datagrams, store the error and return it in the next call. This paves the way for a subsequent optimization, sk_prot->unlocked_recvmsg, where we will be able to acquire the lock only at batch start and end, not at every underlying recvmsg call. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[IA64] hook up new rt_tgsigqueueinfo syscall	Tony Luck	2009-06-16	1	-0/+1
\| \| \| \| \| \|	Assign syscall #1321 for rt_tgsigqueueinfo. Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64] wire up preadv/pwritev system calls	Tony Luck	2009-04-08	1	-0/+2
\| \| \| \| \| \|	Gerd Hoffmann added these to Linux. Let ia64 use them. Signed-off-by: Tony Luck <tony.luck@intel.com>
*	Merge branch 'tracing-for-linus' of ↵	Linus Torvalds	2009-04-05	1	-0/+100
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (413 commits) tracing, net: fix net tree and tracing tree merge interaction tracing, powerpc: fix powerpc tree and tracing tree interaction ring-buffer: do not remove reader page from list on ring buffer free function-graph: allow unregistering twice trace: make argument 'mem' of trace_seq_putmem() const tracing: add missing 'extern' keywords to trace_output.h tracing: provide trace_seq_reserve() blktrace: print out BLK_TN_MESSAGE properly blktrace: extract duplidate code blktrace: fix memory leak when freeing struct blk_io_trace blktrace: fix blk_probes_ref chaos blktrace: make classic output more classic blktrace: fix off-by-one bug blktrace: fix the original blktrace blktrace: fix a race when creating blk_tree_root in debugfs blktrace: fix timestamp in binary output tracing, Text Edit Lock: cleanup tracing: filter fix for TRACE_EVENT_FORMAT events ftrace: Using FTRACE_WARN_ON() to check "freed record" in ftrace_release() x86: kretprobe-booster interrupt emulation code fix ... Fix up trivial conflicts in arch/parisc/include/asm/ftrace.h include/linux/memory.h kernel/extable.c kernel/module.c
\| *	Merge branch 'tracing/ftrace'; commit 'v2.6.29-rc2' into tracing/core	Ingo Molnar	2009-01-18	1	-1/+1
\| \|\
\| * \|	ftrace, ia64: IA64 dynamic ftrace support	Shaohua Li	2009-01-14	1	-0/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	IA64 dynamic ftrace support. The original _mcount stub for each function is like: alloc r40=ar.pfs,12,8,0 mov r43=r0;; mov r42=b0 mov r41=r1 nop.i 0x0 br.call.sptk.many b0 = _mcount;; The patch convert it to below for nop: [MII] nop.m 0x0 mov r3=ip nop.i 0x0 [MLX] nop.m 0x0 nop.x 0x0;; This isn't completely nop, as there is one instuction 'mov r3=ip', but it should be light and harmless for code follow it. And below is for call [MII] nop.m 0x0 mov r3=ip nop.i 0x0 [MLX] nop.m 0x0 brl.many .;; In this way, only one instruction is changed to convert code between nop and call. This should meet dyn-ftrace's requirement. But this requires CPU support brl instruction, so dyn-ftrace isn't supported for old Itanium system. Assume there are quite few such old system running. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
\| * \|	ftrace, ia64: IA64 static ftrace support	Shaohua Li	2009-01-14	1	-0/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	IA64 ftrace suppport. In IA64, below code will be added in each function if -pg is enabled. alloc r40=ar.pfs,12,8,0 mov r43=r0;; mov r42=b0 mov r41=r1 nop.i 0x0 br.call.sptk.many b0 = _mcount;; Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* \| \|	ia64/pv_ops: paravirtualize mov = ar.itc.	Isaku Yamahata	2009-03-26	1	-2/+2
\| \|/ \|/\| \| \| \| \| \| \| \| \| \| \|	paravirtualize mov reg = ar.itc. Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Tony Luck <tony.luck@intel.com>
* \|	[CVE-2009-0029] Remove __attribute__((weak)) from sys_pipe/sys_pipe2	Heiko Carstens	2009-01-14	1	-1/+1
\|/ \| \| \| \| \| \| \| \| \| \| \| \|	Remove __attribute__((weak)) from common code sys_pipe implemantation. IA64, ALPHA, SUPERH (32bit) and SPARC (32bit) have own implemantations with the same name. Just rename them. For sys_pipe2 there is no architecture specific implementation. Cc: Richard Henderson <rth@twiddle.net> Cc: David S. Miller <davem@davemloft.net> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
*	[IA64] Rationalize kernel mode alignment checking	Tony Luck	2008-11-20	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Itanium processors can handle some misaligned data accesses. They also provide a mode where all such accesses are forced to trap. The kernel was schizophrenic about use of this mode: * Base kernel code ran in permissive mode where the only traps generated were from those cases that the h/w could not handle. * Interrupt, syscall and trap code ran in strict mode where all unaligned accesses caused traps to the 0x5a00 unaligned reference vector. Use strict alignment checking throughout the kernel, but make sure that we continue to let user mode use more relaxed mode as the default. Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64] utrace use generic trace hook	Shaohua Li	2008-10-06	1	-0/+5
\| \| \| \| \| \| \|	Make IA64 use generic trace hook in some paths. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64] Wire up new system calls	Tony Luck	2008-07-25	1	-0/+6
\| \| \| \| \| \| \|	Six new system calls: signalfd4, eventfd2, epoll_create1, dup3, pipe2 and inotify_init1. Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64] pvops: paravirtualize entry.S	Isaku Yamahata	2008-05-27	1	-43/+72
\| \| \| \| \| \| \| \| \| \| \| \| \|	paravirtualize ia64_swtich_to, ia64_leave_syscall and ia64_leave_kernel. They include sensitive or performance critical privileged instructions so that they need paravirtualization. To paravirtualize them by single source and multi compile they are converted into indirect jump. And define each pv instances. Cc: Keith Owens <kaos@ocs.com.au> Cc: "Dong, Eddie" <eddie.dong@intel.com> Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64] trivial cleanup for entry.S	Hidetoshi Seto	2008-05-14	1	-6/+6
\| \| \| \| \| \| \| \| \|	This patch does: - make comment at next to resched check more robust - move "re-check" comments to next to where change predicate regs Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64] fix interrupt masking for pending works on kernel leave	Hidetoshi Seto	2008-05-14	1	-4/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	[Bug-fix for "[BUG?][2.6.25-mm1] sleeping during IRQ disabled"] This patch does: - enable interrupts before calling schedule() as same as others, ex. x86 - enable interrupts during ia64_do_signal() and ia64_sync_krbs() - do_notify_resume_user() is still called with interrupts disabled, since we can take short path of fsys_mode if-statement quickly. - pfm_handle_work() is also called with interrupts disabled, since it can deal interrupt mask within itself. - fix/add some comments/notes Reported-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64] disable interrupts on exit of ia64_trace_syscall	Hidetoshi Seto	2008-04-22	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While testing with CONFIG_VIRT_CPU_ACCOUNTING=y, I found that I occasionally get very huge system time in some threads. So I dug the issue and finally noticed that it was caused because of an interrupt which interrupt in the following window: > [arch/ia64/kernel/entry.S: (!CONFIG_PREEMPT && CONFIG_VIRT_CPU_ACCOUNTING)] > > ENTRY(ia64_leave_syscall) > : > (pUStk) rsm psr.i > cmp.eq pLvSys,p0=r0,r0 // pLvSys=1: leave from syscall > (pUStk) cmp.eq.unc p6,p0=r0,r0 // p6 <- pUStk > .work_processed_syscall: > adds r2=PT(LOADRS)+16,r12 > (pUStk) mov.m r22=ar.itc // fetch time at leave > adds r18=TI_FLAGS+IA64_TASK_SIZE,r13 > ;; > <<< window: from here >>> > (p6) ld4 r31=[r18] // load current_thread_info()->flags > ld8 r19=[r2],PT(B6)-PT(LOADRS) > adds r3=PT(AR_BSPSTORE)+16,r12 > ;; > mov r16=ar.bsp > ld8 r18=[r2],PT(R9)-PT(B6) > (p6) and r15=TIF_WORK_MASK,r31 // any work other than TIF_SYSCALL_TRACE? > ;; > ld8 r23=[r3],PT(R11)-PT(AR_BSPSTORE) > (p6) cmp4.ne.unc p6,p0=r15, r0 // any special work pending? > (p6) br.cond.spnt .work_pending_syscall > ;; > ld8 r9=[r2],PT(CR_IPSR)-PT(R9) > ld8 r11=[r3],PT(CR_IIP)-PT(R11) > (pNonSys) break 0 // bug check: we shouldn't be here if pNonSys is TRUE! > ;; > invala > <<< window: to here >>> > rsm psr.i \| psr.ic // turn off interrupts and interruption collection If pUStk is true, it means we are going to return user mode, hence we fetch ar.itc to get time at leave from system. It seems that it is not possible to interrupt the window if pUStk is true, because interrupts are disabled early. And also disabling interrupt makes sense because it is safe for referring current_thread_info()->flags. However interrupting the window while pUStk is true was possible. The route was: ia64_trace_syscall -> .work_pending_syscall_end -> .work_processed_syscall Only in case entering the window from this route, interrupts are enabled during in the window even if pUStk is true. I suppose interrupts must be disabled here anyway if pUStk is true. I'm not sure but afraid that what kind of bad effect were there, other than crazy system time which I found. FYI, there was a commit 6f6d75825dc49b082906b84537b4df28293c2977 that points out a bug at same point(exit of ia64_trace_syscall) in 2006. It can be said that there was an another bug. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64] VIRT_CPU_ACCOUNTING (accurate cpu time accounting)	Hidetoshi Seto	2008-02-20	1	-0/+65
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch implements VIRT_CPU_ACCOUNTING for ia64, which enable us to use more accurate cpu time accounting. The VIRT_CPU_ACCOUNTING is an item of kernel config, which s390 and powerpc arch have. By turning this config on, these archs change the mechanism of cpu time accounting from tick-sampling based one to state-transition based one. The state-transition based accounting is done by checking time (cycle counter in processor) at every state-transition point, such as entrance/exit of kernel, interrupt, softirq etc. The difference between point to point is the actual time consumed during in the state. There is no doubt about that this value is more accurate than that of tick-sampling based accounting. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64] Wire up timerfd_{create,settime,gettime} syscalls	Tony Luck	2008-02-08	1	-1/+4
\| \| \| \| \| \| \|	Add ia64 hooks for the new syscalls that were added in commit 4d672e7ac79b5ec5cdc90e450823441e20464691 Signed-off-by: Tony Luck <tony.luck@intel.com>
*	timerfd: new timerfd API	Davide Libenzi	2008-02-05	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is the new timerfd API as it is implemented by the following patch: int timerfd_create(int clockid, int flags); int timerfd_settime(int ufd, int flags, const struct itimerspec utmr, struct itimerspec otmr); int timerfd_gettime(int ufd, struct itimerspec *otmr); The timerfd_create() API creates an un-programmed timerfd fd. The "clockid" parameter can be either CLOCK_MONOTONIC or CLOCK_REALTIME. The timerfd_settime() API give new settings by the timerfd fd, by optionally retrieving the previous expiration time (in case the "otmr" parameter is not NULL). The time value specified in "utmr" is absolute, if the TFD_TIMER_ABSTIME bit is set in the "flags" parameter. Otherwise it's a relative time. The timerfd_gettime() API returns the next expiration time of the timer, or {0, 0} if the timerfd has not been set yet. Like the previous timerfd API implementation, read(2) and poll(2) are supported (with the same interface). Here's a simple test program I used to exercise the new timerfd APIs: http://www.xmailserver.org/timerfd-test2.c [akpm@linux-foundation.org: coding-style cleanups] [akpm@linux-foundation.org: fix ia64 build] [akpm@linux-foundation.org: fix m68k build] [akpm@linux-foundation.org: fix mips build] [akpm@linux-foundation.org: fix alpha, arm, blackfin, cris, m68k, s390, sparc and sparc64 builds] [heiko.carstens@de.ibm.com: fix s390] [akpm@linux-foundation.org: fix powerpc build] [akpm@linux-foundation.org: fix sparc64 more] Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[IA64] fallocate system call	David Chinner	2007-07-19	1	-1/+1
\| \| \| \| \| \| \| \| \|	sys_fallocate for ia64. This uses an empty slot #1303 erroneously marked as reserved for move_pages (which had already been allocated as syscall #1276) Signed-Off-By: Dave Chinner <dgc@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64] wire up {signal,timer,event}fd syscalls	Tony Luck	2007-05-14	1	-0/+3
\| \| \| \|	Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64] Wire up epoll_pwait and utimensat	Tony Luck	2007-05-10	1	-0/+2
\| \| \| \| \| \|	Another day, another pair of new system calls. Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64] wire up pselect, ppoll	Alexey Kuznetsov	2007-05-08	1	-2/+2
\| \| \| \| \| \|	Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64] Add TIF_RESTORE_SIGMASK	Alexey Dobriyan	2007-05-08	1	-26/+0
\| \| \| \| \| \| \| \| \|	Preparation for pselect and ppoll. ia32 compat code not tested. :-( Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
*	Pull percpu-dtc into release branch	Tony Luck	2007-04-30	1	-5/+2
\|\
\| *	[IA64] remove per-cpu ia64_phys_stacked_size_p8	Chen, Kenneth W	2007-02-06	1	-5/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It's not efficient to use a per-cpu variable just to store how many physical stack register a cpu has. Ever since the incarnation of ia64 up till upcoming Montecito processor, that variable has "glued" to 96. Having a variable in memory means that the kernel is burning an extra cacheline access on every syscall and kernel exit path. Such "static" value is better served with the instruction patching utility exists today. Convert ia64_phys_stacked_size_p8 into dynamic insn patching. This also has a pleasant side effect of eliminating access to per-cpu area while psr.ic=0 in the kernel exit path. (fixable for per-cpu DTC work, but why bother?) There are some concerns with the default value that the instruc- tion encoded in the kernel image. It shouldn't be concerned. The reasons are: (1) cpu_init() is called at CPU initialization. In there, we find out physical stack register size from PAL and patch two instructions in kernel exit code. The code in question can not be executed before the patching is done. (2) current implementation stores zero in ia64_phys_stacked_size_p8, and that's what the current kernel exit path loads the value with. With the new code, it is equivalent that we store reg size 96 in ia64_phys_stacked_size_p8, thus creating a better safety net. Given (1) above can never fail, having (2) is just a bonus. All in all, this patch allow one less memory reference in the kernel exit path, thus reducing syscall and interrupt return latency; and avoid polluting potential useful data in the CPU cache. Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
* \|	[IA64] Hook up getcpu system call for IA64	Fenghua Yu	2007-02-05	1	-0/+2
\|/ \| \| \| \| \| \| \|	getcpu system call returns cpu# and node# on which this system call and its caller are running. This patch hooks up its implementation on IA64. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64] IA64 Kexec/kdump	Zou Nan hai	2006-12-07	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Changes and updates. 1. Remove fake rendz path and related code according to discuss with Khalid Aziz. 2. fc.i offset fix in relocate_kernel.S. 3. iospic shutdown code eoi and mask race fix from Fujitsu. 4. Warm boot hook in machine_kexec to SN SAL code from Jack Steiner. 5. Send slave to SAL slave loop patch from Jay Lan. 6. Kdump on non-recoverable MCA event patch from Jay Lan 7. Use CTL_UNNUMBERED in kdump_on_init sysctl. Signed-off-by: Zou Nan hai <nanhai.zou@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
*	fix file specification in comments	Uwe Zeisberger	2006-10-03	1	-1/+1
\| \| \| \| \| \| \|	Many files include the filename at the beginning, serveral used a wrong one. Signed-off-by: Uwe Zeisberger <Uwe_Zeisberger@digi.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
*	[PATCH] rename the provided execve functions to kernel_execve	Arnd Bergmann	2006-10-02	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some architectures provide an execve function that does not set errno, but instead returns the result code directly. Rename these to kernel_execve to get the right semantics there. Moreover, there is no reasone for any of these architectures to still provide __KERNEL_SYSCALLS__ or _syscallN macros, so remove these right away. [akpm@osdl.org: build fix] [bunk@stusta.de: build fix] Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Andi Kleen <ak@muc.de> Acked-by: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ian Molton <spyro@f2s.com> Cc: Mikael Starvik <starvik@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Hirokazu Takata <takata.hirokazu@renesas.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Richard Curnow <rc@rc0.org.uk> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp> Cc: Chris Zankel <chris@zankel.net> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	Revert "[IA64] Unwire set/get_robust_list"	Tony Luck	2006-09-26	1	-2/+2
\| \| \| \| \| \| \| \| \|	This reverts commit 2636255488484e04d6d54303d2b0ec30f7ef7e02. Jakub Jelinek provided the missing futex_atomic_cmpxchg_inatomic() function, so now it should be safe to re-enable these syscalls. Signed-off-by: Tony Luck <tony.luck@intel.com>
*	[IA64] Unwire set/get_robust_list	Andreas Schwab	2006-09-08	1	-2/+2
\| \| \| \| \| \| \| \| \|	The syscalls set/get_robust_list must not be wired up until futex_atomic_cmpxchg_inatomic is implemented. Otherwise the kernel will hang in handle_futex_death. Signed-off-by: Andreas Schwab <schwab@suse.de> Signed-off-by: Tony Luck <tony.luck@intel.com>
*	Remove obsolete #include <linux/config.h>	Jörn Engel	2006-06-30	1	-1/+0
\| \| \| \| \|	Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
*	[PATCH] page migration: sys_move_pages(): support moving of individual pages	Christoph Lameter	2006-06-23	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	move_pages() is used to move individual pages of a process. The function can be used to determine the location of pages and to move them onto the desired node. move_pages() returns status information for each page. long move_pages(pid, number_of_pages_to_move, addresses_of_pages[], nodes[] or NULL, status[], flags); The addresses of pages is an array of void * pointing to the pages to be moved. The nodes array contains the node numbers that the pages should be moved to. If a NULL is passed instead of an array then no pages are moved but the status array is updated. The status request may be used to determine the page state before issuing another move_pages() to move pages. The status array will contain the state of all individual page migration attempts when the function terminates. The status array is only valid if move_pages() completed successfullly. Possible page states in status[]: 0..MAX_NUMNODES The page is now on the indicated node. -ENOENT Page is not present -EACCES Page is mapped by multiple processes and can only be moved if MPOL_MF_MOVE_ALL is specified. -EPERM The page has been mlocked by a process/driver and cannot be moved. -EBUSY Page is busy and cannot be moved. Try again later. -EFAULT Invalid address (no VMA or zero page). -ENOMEM Unable to allocate memory on target node. -EIO Unable to write back page. The page must be written back in order to move it since the page is dirty and the filesystem does not provide a migration function that would allow the moving of dirty pages. -EINVAL A dirty page cannot be moved. The filesystem does not provide a migration function and has no ability to write back pages. The flags parameter indicates what types of pages to move: MPOL_MF_MOVE Move pages that are only mapped by the process. MPOL_MF_MOVE_ALL Also move pages that are mapped by multiple processes. Requires sufficient capabilities. Possible return codes from move_pages() -ENOENT No pages found that would require moving. All pages are either already on the target node, not present, had an invalid address or could not be moved because they were mapped by multiple processes. -EINVAL Flags other than MPOL_MF_MOVE(_ALL) specified or an attempt to migrate pages in a kernel thread. -EPERM MPOL_MF_MOVE_ALL specified without sufficient priviledges. or an attempt to move a process belonging to another user. -EACCES One of the target nodes is not allowed by the current cpuset. -ENODEV One of the target nodes is not online. -ESRCH Process does not exist. -E2BIG Too many pages to move. -ENOMEM Not enough memory to allocate control array. -EFAULT Parameters could not be accessed. A test program for move_pages() may be found with the patches on ftp.kernel.org:/pub/linux/kernel/people/christoph/pmig/patches-2.6.17-rc4-mm3 From: Christoph Lameter <clameter@sgi.com> Detailed results for sys_move_pages() Pass a pointer to an integer to get_new_page() that may be used to indicate where the completion status of a migration operation should be placed. This allows sys_move_pags() to report back exactly what happened to each page. Wish there would be a better way to do this. Looks a bit hacky. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Jes Sorensen <jes@trained-monkey.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Andi Kleen <ak@muc.de> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] Add support for the sys_vmsplice syscall	Jens Axboe	2006-04-26	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \|	sys_splice() moves data to/from pipes with a file input/output. sys_vmsplice() moves data to a pipe, with the input being a user address range instead. This uses an approach suggested by Linus, where we can hold partial ranges inside the pages[] map. Hopefully this will be useful for network receive support as well. Signed-off-by: Jens Axboe <axboe@suse.de>
*	[PATCH] splice: add support for sys_tee()	Jens Axboe	2006-04-11	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Basically an in-kernel implementation of tee, which uses splice and the pipe buffers as an intelligent way to pass data around by reference. Where the user space tee consumes the input and produces a stdout and file output, this syscall merely duplicates the data inside a pipe to another pipe. No data is copied, the output just grabs a reference to the input pipe data. Signed-off-by: Jens Axboe <axboe@suse.de>