summaryrefslogtreecommitdiffstats
path: root/arch/i386/kernel
Commit message (Collapse)AuthorAgeFilesLines
...
| * kbuild: full dependency check on asm-offsets.hSam Ravnborg2005-09-093-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Building asm-offsets.h has been moved to a seperate Kbuild file located in the top-level directory. This allow us to share the functionality across the architectures. The old rules in architecture specific Makefiles will die in subsequent patches. Furhtermore the usual kbuild dependency tracking is now used when deciding to rebuild asm-offsets.s. So we no longer risk to fail a rebuild caused by asm-offsets.c dependencies being touched. With this common rule-set we now force the same name across all architectures. Following patches will fix the rest. Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
* | [PATCH] Update PCI IOMEM allocation startDaniel Ritz2005-09-091-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes the problem with "Averatec 6240 pcmcia_socket0: unable to apply power", which was due to the CardBus IOMEM register region being allocated at an address that was actually inside the RAM window that had been reserved for video frame-buffers in an UMA setup. The BIOS _should_ have marked that region reserved in the e820 memory descriptor tables, but did not. It is fixed by rounding up the default starting address of PCI memory allocations, so that we leave a bigger gap after the final known memory location. The amount of rounding depends on how big the unused memory gap is that we can allocate IOMEM from. Based on example code by Linus. Acked-by: Greg KH <greg@kroah.com> Acked-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] timer initialization cleanup: DEFINE_TIMERIngo Molnar2005-09-091-2/+1
| | | | | | | | | | | | | | | | | | | | Clean up timer initialization by introducing DEFINE_TIMER a'la DEFINE_SPINLOCK. Build and boot-tested on x86. A similar patch has been been in the -RT tree for some time. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] fbdev: Resurrect hooks to get EDID from firmwareAntonino A. Daplas2005-09-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | For the i386, code is already present in video.S that gets the EDID from the video BIOS. Make this visible so drivers can also use this data as fallback when i2c does not work. To ensure that the EDID block is returned for the primary graphics adapter only, by check if the IORESOURCE_ROM_SHADOW flag is set. Signed-off-by: Antonino Daplas <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] i386: seccomp fix for auditing/ptraceAndrea Arcangeli2005-09-091-8/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the same issue as ppc64 before, when returning to userland we shouldn't re-compute the seccomp check or the task could be killed during sigreturn when orig_eax is overwritten by the sigreturn syscall. This was found by Roland. This was harmless from a security standpoint, but some i686 users reported failures with auditing enabled system wide (some distro surprisingly makes it the default) and I reproduced it too by keeping the whole workload under strace -f. Patch is tested and works for me under strace -f. nobody@athlon:~/cpushare> strace -o /tmp/o -f python seccomp_test.py make: Nothing to be done for `seccomp_test'. Starting computing some malicious bytecode init load start stop receive_data failure kill exit_code 0 signal 9 The malicious bytecode has been killed successfully by seccomp Starting computing some safe bytecode init load start stop 174 counts kill exit_code 0 signal 0 The seccomp_test.py completed successfully, thank you for testing. (akpm: collaterally cleaned up a bit of do_syscall_trace() too) Signed-off-by: Andrea Arcangeli <andrea@cpushare.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] x86: MP_processor_info fixAndrew Morton2005-09-091-14/+10
| | | | | | | | | | | | | | | | | | | | | | Remove the weird and apparently unnecessary logic in MP_processor_info() which assumes that the BSP is the first one to run MP_processor_info(). On one of my boxes that isn't true and cpu_possible_map gets the wrong value. Cc: Zwane Mwaikambo <zwane@arm.linux.org.uk> Cc: Alexander Nyberg <alexn@telia.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] Fix misspelled i8259 typo in io_apic.cKarsten Wiese2005-09-091-2/+2
| | | | | | | | | | | | | | | | The legacy PIC's name is "i8259". Signed-off-by: Karsten Wiese <annabellesgarden@yahoo.de> Signed-off-by: Vojtech Pavlik <vojtech@suse.cz> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] __user annotations for pointers in i386 sigframeviro@ZenIV.linux.org.uk2005-09-091-4/+4
| | | | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq Linus Torvalds2005-09-083-13/+13
|\ \ | |/ |/|
| * [CPUFREQ] Remove trailing whitespace before \n's in printks.Dave Jones2005-09-012-3/+3
| | | | | | | | | | From: Denis Vlasenko <vda@ilport.com.ua> Signed-off-by: Dave Jones <davej@redhat.com>
| * [CPUFREQ] Remove extra arg from dprintk in cpufreq/speedstep-smi.cMika Kukkonen2005-08-311-1/+1
| | | | | | | | | | | | | | | | | | | | Minor fallout from my upcoming __attribute__((format(printf,x,y))) patches. The variable 'result' is untouched, so this patch just removes it. Signed-off-by: Mika Kukkonen <mikukkon@gmail.com> Signed-off-by: Dave Jones <davej@redhat.com>
| * [CPUFREQ] dprintf format fixes in cpufreq/speedstep-centrino.cMika Kukkonen2005-08-311-2/+2
| | | | | | | | | | | | | | | | | | Ho-hum, did not notice there was more printf fixes for cpufreq (you should see the amount I have for isdn and reiser ...). Sorry for noise. Signed-off-by: Mika Kukkonen <mikukkon@gmail.com> Signed-off-by: Dave Jones <davej@redhat.com>
| * [CPUFREQ] speedstep-centrino: skip extract_clock logic for acpi based centrinoVenkatesh Pallipadi2005-08-311-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | speedstep_centrino.c:extract_clock() assumes the bus speed of 100MHz, which is not true with latest laptops. Due to this assumption and due to the encoded frequency check during initialization, speedstep-centrino driver fails even on systems that has proper ACPI information to do the P-state transition. The change below moves the centrino-speedstep detection to be used only when table based P-state transition is done. For ACPI based P-state transition, we skip the centrino_cpu identification, and as a result we don't use the bus speed assumption in extract_clock. This change makes speedstep-centrino work on Pentium-M based systems, which have more than 100MHz bus speed. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>
* | Merge linux-2.6 with linux-acpi-2.6Len Brown2005-09-0841-563/+630
|\ \
| * | [PATCH] kprobes: fix bug when probed on task and isr functionsKeshavamurthy Anil S2005-09-071-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a race condition where in system used to hang or sometime crash within minutes when kprobes are inserted on ISR routine and a task routine. The fix has been stress tested on i386, ia64, pp64 and on x86_64. To reproduce the problem insert kprobes on schedule() and do_IRQ() functions and you should see hang or system crash. Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Acked-by: Prasanna S Panchamukhi <prasanna@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] kprobes: fix handling of simultaneous probe hit/unregisterJim Keniston2005-09-071-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a bug in kprobes's handling of a corner case on i386 and x86_64. On an SMP system, if one CPU unregisters a kprobe just after another CPU hits that probepoint, kprobe_handler() on the latter CPU sees that the kprobe has been unregistered, and attempts to let the CPU continue as if the probepoint hadn't been hit. The bug is that on i386 and x86_64, we were neglecting to set the IP back to the beginning of the probed instruction. This could cause an oops or crash. This bug doesn't exist on ppc64 and ia64, where a breakpoint instruction leaves the IP pointing to the beginning of the instruction. I don't know about sparc64. (Dave, could you please advise?) This fix has been tested on i386 and x86_64 SMP systems. To reproduce the problem, set one CPU to work registering and unregistering a kprobe repeatedly, and another CPU pounding the probepoint in a tight loop. Acked-by: Prasanna S Panchamukhi <prasanna@in.ibm.com> Signed-off-by: Jim Keniston <jkenisto@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] kprobes: prevent possible race conditions i386 changesPrasanna S Panchamukhi2005-09-074-24/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch contains the i386 architecture specific changes to prevent the possible race conditions. Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] fix: dmi_check_systemRobert Love2005-09-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Background: 1) dmi_check_system() returns the count of the number of matches. Zero thus means no matches. 2) A match callback can return nonzero to stop the match checking. Bug: The count is incremented after we check for the nonzero return value, so it does not reflect the actual count. We could say this is intended, for some dumb reason, except that it means that a match on the first check returns zero--no matches--if the callback returns nonzero. Attached patch implements the count before calling the callback and thus before potentially short-circuiting. Signed-off-by: Robert Love <rml@novell.com> Cc: Andrey Panin <pazke@donpac.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] dmi: add onboard devices discoveryAndrey Panin2005-09-071-12/+90
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds onboard devices and IPMI BMC discovery into DMI scan code. Drivers can use dmi_find_device() function to search for devices by type and name. Signed-off-by: Andrey Panin <pazke@donpac.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] dmi: make dmi_string() behave like strdup()Andrey Panin2005-09-071-16/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch changes dmi_string() function to allocate string copy by itself, to avoid code duplication in the next patch. Signed-off-by: Andrey Panin <pazke@donpac.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] dmi: remove old debugging codeAndrey Panin2005-09-071-21/+0
| | | | | | | | | | | | | | | | | | | | | | | | DMI debugging code is unused for ages. This patch removes it. Signed-off-by: Andrey Panin <pazke@donpac.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] dmi: remove uneeded functionAndrey Panin2005-09-071-45/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After elimination of central DMI blacklist dmi_scan_machine() function became a wrapper for dmi_iterate(). This patch moves some code around to kill unneeded function. Signed-off-by: Andrey Panin <pazke@donpac.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] NTP: ntp-helper functionsjohn stultz2005-09-071-5/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch cleans up a commonly repeated set of changes to the NTP state variables by adding two helper inline functions: ntp_clear(): Clears the ntp state variables ntp_synced(): Returns 1 if the system is synced with a time server. This was compile tested for alpha, arm, i386, x86-64, ppc64, s390, sparc, sparc64. Signed-off-by: John Stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] Additions to .data.read_mostly sectionRavikiran G Thirumalai2005-09-072-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mark variables which are usually accessed for reads with __readmostly. Signed-off-by: Alok N Kataria <alokk@calsoftinc.com> Signed-off-by: Shai Fultheim <shai@scalex86.org> Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] remove the second arg of do_timer_interrupt()Adrian Bunk2005-09-071-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The second arg of do_timer_interrupt() is not used in the functions, and all callers pass NULL. Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Paul Mundt <lethal@Linux-SH.ORG> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] NMI: Update NMI users of RCU to use new APIPaul E. McKenney2005-09-071-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Uses of RCU for dynamically changeable NMI handlers need to use the new rcu_dereference() and rcu_assign_pointer() facilities. This change makes it clear that these uses are safe from a memory-barrier viewpoint, but the main purpose is to document exactly what operations are being protected by RCU. This has been tested on x86 and x86-64, which are the only architectures affected by this change. Signed-off-by: <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] More __read_mostly variablesChristoph Lameter2005-09-072-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move some more frequently read variables that showed up during some of our performance tests as sometimes ending up in hot cachelines to the read_mostly section. Fix: Move the __read_mostly from before hpet_usec_quotient to follow the variable like the other uses of __read_mostly. Signed-off-by: Alok N Kataria <alokk@calsoftinc.com> Signed-off-by: Christoph Lameter <christoph@scalex86.org> Signed-off-by: Shai Fultheim <shai@scalex86.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] detect soft lockupsIngo Molnar2005-09-072-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds a new kernel debug feature: CONFIG_DETECT_SOFTLOCKUP. When enabled then per-CPU watchdog threads are started, which try to run once per second. If they get delayed for more than 10 seconds then a callback from the timer interrupt detects this condition and prints out a warning message and a stack dump (once per lockup incident). The feature is otherwise non-intrusive, it doesnt try to unlock the box in any way, it only gets the debug info out, automatically, and on all CPUs affected by the lockup. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-Off-By: Matthias Urlichs <smurf@smurf.noris.de> Signed-off-by: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] x86/x86_64: deferred handling of writes to /proc/irqxx/smp_affinityAshok Raj2005-09-071-26/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When handling writes to /proc/irq, current code is re-programming rte entries directly. This is not recommended and could potentially cause chipset's to lockup, or cause missing interrupts. CONFIG_IRQ_BALANCE does this correctly, where it re-programs only when the interrupt is pending. The same needs to be done for /proc/irq handling as well. Otherwise user space irq balancers are really not doing the right thing. - Changed pending_irq_balance_cpumask to pending_irq_migrate_cpumask for lack of a generic name. - added move_irq out of IRQ_BALANCE, and added this same to X86_64 - Added new proc handler for write, so we can do deferred write at irq handling time. - Display of /proc/irq/XX/smp_affinity used to display CPU_MASKALL, instead it now shows only active cpu masks, or exactly what was set. - Provided a common move_irq implementation, instead of duplicating when using generic irq framework. Tested on i386/x86_64 and ia64 with CONFIG_PCI_MSI turned on and off. Tested UP builds as well. MSI testing: tbd: I have cards, need to look for a x-over cable, although I did test an earlier version of this patch. Will test in a couple days. Signed-off-by: Ashok Raj <ashok.raj@intel.com> Acked-by: Zwane Mwaikambo <zwane@holomorphy.com> Grudgingly-acked-by: Andi Kleen <ak@muc.de> Signed-off-by: Coywolf Qi Hunt <coywolf@lovecn.org> Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] uml: SYSEMU: slight cleanup and speedupPaolo 'Blaisorblade' Giarrusso2005-09-051-7/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As a follow-up to "UML Support - Ptrace: adds the host SYSEMU support, for UML and general usage" (i.e. uml-support-* in current mm). Avoid unconditionally jumping to work_pending and code copying, just reuse the already existing resume_userspace path. One interesting note, from Charles P. Wright, suggested that the API is improvable with no downsides for UML (except that it will have to support yet another host API, since dropping support for the current API, for UML, is not reasonable from users' point of view). Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> CC: Charles P. Wright <cwright@cs.sunysb.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] SYSEMU: fix sysaudit / singlestep interactionBodo Stroesser2005-09-051-3/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> This is simply an adjustment for "Ptrace - i386: fix Syscall Audit interaction with singlestep" to work on top of SYSEMU patches, too. On this patch, I have some doubts: I wonder why we need to alter that way ptrace_disable(). I left the patch this way because it has been extensively tested, but I don't understand the reason. The current PTRACE_DETACH handling simply clears child->ptrace; actually this is not enough because entry.S just looks at the thread_flags; actually, do_syscall_trace checks current->ptrace but I don't think depending on that is good, at least for performance, so I think the clearing is done elsewhere. For instance, on PTRACE_CONT it's done, but doing PTRACE_DETACH without PTRACE_CONT is possible (and happens when gdb crashes and one kills it manually). Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> CC: Roland McGrath <roland@redhat.com> Cc: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] Uml support: add PTRACE_SYSEMU_SINGLESTEP option to i386Bodo Stroesser2005-09-051-4/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements the new ptrace option PTRACE_SYSEMU_SINGLESTEP, which can be used by UML to singlestep a process: it will receive SINGLESTEP interceptions for normal instructions and syscalls, but syscall execution will be skipped just like with PTRACE_SYSEMU. Signed-off-by: Bodo Stroesser <bstroesser@fujitsu-siemens.com> Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] Uml support: reorganize PTRACE_SYSEMU supportBodo Stroesser2005-09-052-30/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With this patch, we change the way we handle switching from PTRACE_SYSEMU to PTRACE_{SINGLESTEP,SYSCALL}, to free TIF_SYSCALL_EMU from double use as a preparation for PTRACE_SYSEMU_SINGLESTEP extension, without changing the behavior of the host kernel. Signed-off-by: Bodo Stroesser <bstroesser@fujitsu-siemens.com> Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] UML Support - Ptrace: adds the host SYSEMU support, for UML and ↵Laurent Vivier2005-09-052-21/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | general usage Jeff Dike <jdike@addtoit.com>, Paolo 'Blaisorblade' Giarrusso <blaisorblade_spam@yahoo.it>, Bodo Stroesser <bstroesser@fujitsu-siemens.com> Adds a new ptrace(2) mode, called PTRACE_SYSEMU, resembling PTRACE_SYSCALL except that the kernel does not execute the requested syscall; this is useful to improve performance for virtual environments, like UML, which want to run the syscall on their own. In fact, using PTRACE_SYSCALL means stopping child execution twice, on entry and on exit, and each time you also have two context switches; with SYSEMU you avoid the 2nd stop and so save two context switches per syscall. Also, some architectures don't have support in the host for changing the syscall number via ptrace(), which is currently needed to skip syscall execution (UML turns any syscall into getpid() to avoid it being executed on the host). Fixing that is hard, while SYSEMU is easier to implement. * This version of the patch includes some suggestions of Jeff Dike to avoid adding any instructions to the syscall fast path, plus some other little changes, by myself, to make it work even when the syscall is executed with SYSENTER (but I'm unsure about them). It has been widely tested for quite a lot of time. * Various fixed were included to handle the various switches between various states, i.e. when for instance a syscall entry is traced with one of PT_SYSCALL / _SYSEMU / _SINGLESTEP and another one is used on exit. Basically, this is done by remembering which one of them was used even after the call to ptrace_notify(). * We're combining TIF_SYSCALL_EMU with TIF_SYSCALL_TRACE or TIF_SINGLESTEP to make do_syscall_trace() notice that the current syscall was started with SYSEMU on entry, so that no notification ought to be done in the exit path; this is a bit of a hack, so this problem is solved in another way in next patches. * Also, the effects of the patch: "Ptrace - i386: fix Syscall Audit interaction with singlestep" are cancelled; they are restored back in the last patch of this series. Detailed descriptions of the patches doing this kind of processing follow (but I've already summed everything up). * Fix behaviour when changing interception kind #1. In do_syscall_trace(), we check the status of the TIF_SYSCALL_EMU flag only after doing the debugger notification; but the debugger might have changed the status of this flag because he continued execution with PTRACE_SYSCALL, so this is wrong. This patch fixes it by saving the flag status before calling ptrace_notify(). * Fix behaviour when changing interception kind #2: avoid intercepting syscall on return when using SYSCALL again. A guest process switching from using PTRACE_SYSEMU to PTRACE_SYSCALL crashes. The problem is in arch/i386/kernel/entry.S. The current SYSEMU patch inhibits the syscall-handler to be called, but does not prevent do_syscall_trace() to be called after this for syscall completion interception. The appended patch fixes this. It reuses the flag TIF_SYSCALL_EMU to remember "we come from PTRACE_SYSEMU and now are in PTRACE_SYSCALL", since the flag is unused in the depicted situation. * Fix behaviour when changing interception kind #3: avoid intercepting syscall on return when using SINGLESTEP. When testing 2.6.9 and the skas3.v6 patch, with my latest patch and had problems with singlestepping on UML in SKAS with SYSEMU. It looped receiving SIGTRAPs without moving forward. EIP of the traced process was the same for all SIGTRAPs. What's missing is to handle switching from PTRACE_SYSCALL_EMU to PTRACE_SINGLESTEP in a way very similar to what is done for the change from PTRACE_SYSCALL_EMU to PTRACE_SYSCALL_TRACE. I.e., after calling ptrace(PTRACE_SYSEMU), on the return path, the debugger is notified and then wake ups the process; the syscall is executed (or skipped, when do_syscall_trace() returns 0, i.e. when using PTRACE_SYSEMU), and do_syscall_trace() is called again. Since we are on the return path of a SYSEMU'd syscall, if the wake up is performed through ptrace(PTRACE_SYSCALL), we must still avoid notifying the parent of the syscall exit. Now, this behaviour is extended even to resuming with PTRACE_SINGLESTEP. Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] Ptrace/i386: fix "syscall audit" interaction with singlestepBodo Stroesser2005-09-051-2/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Avoid giving two traps for singlestep instead of one, when syscall auditing is enabled. In fact no singlestep trap is sent on syscall entry, only on syscall exit, as can be seen in entry.S: # Note that in this mask _TIF_SINGLESTEP is not tested !!! <<<<<<<<<<<<<< testb $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP),TI_flags(%ebp) jnz syscall_trace_entry ... syscall_trace_entry: ... call do_syscall_trace But auditing a SINGLESTEP'ed process causes do_syscall_trace to be called, so the tracer will get one more trap on the syscall entry path, which it shouldn't. Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> CC: Roland McGrath <roland@redhat.com> Cc: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] add suspend/resume for timerShaohua Li2005-09-055-27/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The timers lack .suspend/.resume methods. Because of this, jiffies got a big compensation after a S3 resume. And then softlockup watchdog reports an oops. This occured with HPET enabled, but it's also possible for other timers. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] swsusp: fix remaining u32 vs. pm_message_t confusionPavel Machek2005-09-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix remaining bits of u32 vs. pm_message confusion. Should not break anything. Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] ISA DMA suspend for i386Pierre Ossman2005-09-052-1/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | Reset the ISA DMA controller into a known state after a suspend. Primary concern was reenabling the cascading DMA channel (4). Signed-off-by: Pierre Ossman <drzeus@drzeus.cx> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] unify x86/x86-64 semaphore codeBenjamin LaHaise2005-09-051-162/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch moves the common code in x86 and x86-64's semaphore.c into a single file in lib/semaphore-sleepers.c. The arch specific asm stubs are left in the arch tree (in semaphore.c for i386 and in the asm for x86-64). There should be no changes in code/functionality with this patch. Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] i386 boottime for_each_cpu brokenZwane Mwaikambo2005-09-052-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | for_each_cpu walks through all processors in cpu_possible_map, which is defined as cpu_callout_map on i386 and isn't initialised until all processors have been booted. This breaks things which do for_each_cpu iterations early during boot. So, define cpu_possible_map as a bitmap with NR_CPUS bits populated. This was triggered by a patch i'm working on which does alloc_percpu before bringing up secondary processors. From: Alexander Nyberg <alexn@telia.com> i386-boottime-for_each_cpu-broken.patch i386-boottime-for_each_cpu-broken-fix.patch The SMP version of __alloc_percpu checks the cpu_possible_map before allocating memory for a certain cpu. With the above patches the BSP cpuid is never set in cpu_possible_map which breaks CONFIG_SMP on uniprocessor machines (as soon as someone tries to dereference something allocated via __alloc_percpu, which in fact is never allocated since the cpu is not set in cpu_possible_map). Signed-off-by: Zwane Mwaikambo <zwane@arm.linux.org.uk> Signed-off-by: Alexander Nyberg <alexn@telia.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] i386: encapsulate copying of pgd entriesZachary Amsden2005-09-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a clone operation for pgd updates. This helps complete the encapsulation of updates to page tables (or pages about to become page tables) into accessor functions rather than using memcpy() to duplicate them. This is both generally good for consistency and also necessary for running in a hypervisor which requires explicit updates to page table entries. The new function is: clone_pgd_range(pgd_t *dst, pgd_t *src, int count); dst - pointer to pgd range anwhere on a pgd page src - "" count - the number of pgds to copy. dst and src can be on the same page, but the range must not overlap and must not cross a page boundary. Note that I ommitted using this call to copy pgd entries into the software suspend page root, since this is not technically a live paging structure, rather it is used on resume from suspend. CC'ing Pavel in case he has any feedback on this. Thanks to Chris Wright for noticing that this could be more optimal in PAE compiles by eliminating the memset. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] x86 NMI: better support for debuggersGeorge Anzinger2005-09-052-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds a notify to the die_nmi notify that the system is about to be taken down. If the notify is handled with a NOTIFY_STOP return, the system is given a new lease on life. We also change the nmi watchdog to carry on if die_nmi returns. This give debug code a chance to a) catch watchdog timeouts and b) possibly allow the system to continue, realizing that the time out may be due to debugger activities such as single stepping which is usually done with "other" cpus held. Signed-off-by: George Anzinger<george@mvista.com> Cc: Keith Owens <kaos@ocs.com.au> Signed-off-by: George Anzinger <george@mvista.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] x86: introduce a write acessor for updating the current LDTZachary Amsden2005-09-051-5/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a write acessor for updating the current LDT. This is required for hypervisors like Xen that do not allow LDT pages to be directly written. Testing - here's a fun little LDT test that can be trivially modified to test limits as well. /* * Copyright (c) 2005, Zachary Amsden (zach@vmware.com) * This is licensed under the GPL. */ #include <stdio.h> #include <signal.h> #include <asm/ldt.h> #include <asm/segment.h> #include <sys/types.h> #include <unistd.h> #include <sys/mman.h> #define __KERNEL__ #include <asm/page.h> void main(void) { struct user_desc desc; char *code; unsigned long long tsc; code = (char *)mmap(0, 8192, PROT_EXEC|PROT_READ|PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); desc.entry_number = 0; desc.base_addr = code; desc.limit = 1; desc.seg_32bit = 1; desc.contents = MODIFY_LDT_CONTENTS_CODE; desc.read_exec_only = 0; desc.limit_in_pages = 1; desc.seg_not_present = 0; desc.useable = 1; if (modify_ldt(1, &desc, sizeof(desc)) != 0) { perror("modify_ldt"); } printf("code base is 0x%08x\n", (unsigned)code); code[0x0ffe] = 0x0f; /* rdtsc */ code[0x0fff] = 0x31; code[0x1000] = 0xcb; /* lret */ __asm__ __volatile("lcall $7,$0xffe" : "=A" (tsc)); printf("TSC is 0x%016llx\n", tsc); } Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] x86: make IOPL explicitZachary Amsden2005-09-052-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The pushf/popf in switch_to are ONLY used to switch IOPL. Making this explicit in C code is more clear. This pushf/popf pair was added as a bugfix for leaking IOPL to unprivileged processes when using sysenter/sysexit based system calls (sysexit does not restore flags). When requesting an IOPL change in sys_iopl(), it is just as easy to change the current flags and the flags in the stack image (in case an IRET is required), but there is no reason to force an IRET if we came in from the SYSENTER path. This change is the minimal solution for supporting a paravirtualized Linux kernel that allows user processes to run with I/O privilege. Other solutions require radical rewrites of part of the low level fault / system call handling code, or do not fully support sysenter based system calls. Unfortunately, this added one field to the thread_struct. But as a bonus, on P4, the fastest time measured for switch_to() went from 312 to 260 cycles, a win of about 17% in the fast case through this performance critical path. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] x86: privilege cleanupZachary Amsden2005-09-052-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Privilege checking cleanup. Originally, these diffs were much greater, but recent cleanups in Linux have already done much of the cleanup. I added some explanatory comments in places where the reasoning behind certain tests is rather subtle. Also, in traps.c, we can skip the user_mode check in handle_BUG(). The reason is, there are only two call chains - one via die_if_kernel() and one via do_page_fault(), both entering from die(). Both of these paths already ensure that a kernel mode failure has happened. Also, the original check here, if (user_mode(regs)) was insufficient anyways, since it would not rule out BUG faults from V8086 mode execution. Saving the %ss segment in show_regs() rather than assuming a fixed value also gives better information about the current kernel state in the register dump. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] x86: more asm cleanupsZachary Amsden2005-09-055-41/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | Some more assembler cleanups I noticed along the way. Signed-off-by: Zachary Amsden <zach@vmware.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] i386: load_tls() fixZachary Amsden2005-09-051-7/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Subtle fix: load_TLS has been moved after saving %fs and %gs segments to avoid creating non-reversible segments. This could conceivably cause a bug if the kernel ever needed to save and restore fs/gs from the NMI handler. It currently does not, but this is the safest approach to avoiding fs/gs corruption. SMIs are safe, since SMI saves the descriptor hidden state. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] i386: inline assembler: cleanup and encapsulate descriptor and task ↵Zachary Amsden2005-09-057-15/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | register management i386 inline assembler cleanup. This change encapsulates descriptor and task register management. Also, it is possible to improve assembler generation in two cases; savesegment may store the value in a register instead of a memory location, which allows GCC to optimize stack variables into registers, and MOV MEM, SEG is always a 16-bit write to memory, making the casting in math-emu unnecessary. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] i386: cleanup serialize msrZachary Amsden2005-09-051-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | i386 arch cleanup. Introduce the serialize macro to serialize processor state. Why the microcode update needs it I am not quite sure, since wrmsr() is already a serializing instruction, but it is a microcode update, so I will keep the semantic the same, since this could be a timing workaround. As far as I can tell, this has always been there since the original microcode update source. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
| * | [PATCH] i386: inline asm cleanupZachary Amsden2005-09-057-40/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | i386 Inline asm cleanup. Use cr/dr accessor functions. Also, a potential bugfix. Also, some CR accessors really should be volatile. Reads from CR0 (numeric state may change in an exception handler), writes to CR4 (flipping CR4.TSD) and reads from CR2 (page fault) prevent instruction re-ordering. I did not add memory clobber to CR3 / CR4 / CR0 updates, as it was not there to begin with, and in no case should kernel memory be clobbered, except when doing a TLB flush, which already has memory clobber. I noticed that page invalidation does not have a memory clobber. I can't find a bug as a result, but there is definitely a potential for a bug here: #define __flush_tlb_single(addr) \ __asm__ __volatile__("invlpg %0": :"m" (*(char *) addr)) Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
OpenPOWER on IntegriCloud