summaryrefslogtreecommitdiffstats
path: root/sys/amd64/amd64/vm_machdep.c
Commit message (Collapse)AuthorAgeFilesLines
* - Overhaul the software interrupt code to use interrupt threads for eachjhb2000-10-251-1/+1
| | | | | | | | | | | | | | | | | | | type of software interrupt. Roughly, what used to be a bit in spending now maps to a swi thread. Each thread can have multiple handlers, just like a hardware interrupt thread. - Instead of using a bitmask of pending interrupts, we schedule the specific software interrupt thread to run, so spending, NSWI, and the shandlers array are no longer needed. We can now have an arbitrary number of software interrupt threads. When you register a software interrupt thread via sinthand_add(), you get back a struct intrhand that you pass to sched_swi() when you wish to schedule your swi thread to run. - Convert the name of 'struct intrec' to 'struct intrhand' as it is a bit more intuitive. Also, prefix all the members of struct intrhand with 'ih_'. - Make swi_net() a MI function since there is now no point in it being MD. Submitted by: cp
* Don't dink with interrupts in vm_page_zero_idle(). This code assumed itjhb2000-10-231-10/+1
| | | | | | | was being called with interrupts disabled, when it was actually being called with them enabled. Pointed out by: tegge
* - machine/mutex.h -> sys/mutex.hjhb2000-10-201-2/+2
| | | | | - Use cpu_throw() instead of cpu_switch() during cpu_exit() since we don't need to save our previous state.
* Remove unneeded #include <machine/clock.h>phk2000-10-151-1/+0
|
* Major update to the way synchronization is done in the kernel. Highlightsjasone2000-09-071-33/+18
| | | | | | | | | | | | | | | include: * Mutual exclusion is used instead of spl*(). See mutex(9). (Note: The alpha port is still in transition and currently uses both.) * Per-CPU idle processes. * Interrupts are run in their own separate kernel threads and can be preempted (i386 only). Partially contributed by: BSDi (BSD/OS) Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh
* Clean up some low level bootstrap code:peter2000-08-111-7/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - stop using the evil 'struct trapframe' argument for mi_startup() (formerly main()). There are much better ways of doing it. - do not use prepare_usermode() - setregs() in execve() will do it all for us as long as the p_md.md_regs pointer is set. (which is now done in machdep.c rather than init_main.c. The Alpha port did it this way all along and is much cleaner). - collect all the magic %cr0 etc register settings into one place and have the AP's call that instead of using magic numbers (!!) that keep changing over and over again. - Make it safe to call kthread_create() earlier, including during the device probe sequence. It doesn't need the callback mechanism that NetBSD's version uses. - kthreads created this way are root-less as they exist before the root filesystem is mounted. init(1) is set up so that it aquires the root pointers prior to running. If other kthreads want filesystem acccess we can make this code more generic. - set all threads start times once we have decided what time it is. - init uses a trampoline rather than the evil prepare_usermode() hack. - kern_descrip.c has a couple of tweaks to deal with forking when there is no rootdir or cwd etc. - adjust the early SYSINIT() sequence so that a few prereqisites are in place. eg: make sure the run queue is initialized before doing forks. With this, the USB code can easily create a kthread to do the device tree discovery. (I have tested it, it works nicely). There are still some open issues before this is truely useful. - tsleep() does not like working before the clock is running. It sort-of tries to spin wait, but it can do more useful things now. - stopping a kthread in kld code at unload time is "interesting" but we have a solution for that. The Alpha code needs no changes for this. It already uses pretty much the same strategies, but a little cleaner.
* Add option BROKEN_KEYBOARD_RESET to an opt_*.h filepeter2000-06-101-0/+1
|
* Separate the struct bio related stuff out of <sys/buf.h> intophk2000-05-051-0/+1
| | | | | | | | | | | | | | | <sys/bio.h>. <sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall not be made a nested include according to bdes teachings on the subject of nested includes. Diskdrivers and similar stuff below specfs::strategy() should no longer need to include <sys/buf.> unless they need caching of data. Still a few bogus uses of struct buf to track down. Repocopy by: peter
* Commit major SMP cleanups and move the BGL (big giant lock) in thedillon2000-03-281-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | syscall path inward. A system call may select whether it needs the MP lock or not (the default being that it does need it). A great deal of conditional SMP code for various deadended experiments has been removed. 'cil' and 'cml' have been removed entirely, and the locking around the cpl has been removed. The conditional separately-locked fast-interrupt code has been removed, meaning that interrupts must hold the CPL now (but they pretty much had to anyway). Another reason for doing this is that the original separate-lock for interrupts just doesn't apply to the interrupt thread mechanism being contemplated. Modifications to the cpl may now ONLY occur while holding the MP lock. For example, if an otherwise MP safe syscall needs to mess with the cpl, it must hold the MP lock for the duration and must (as usual) save/restore the cpl in a nested fashion. This is precursor work for the real meat coming later: avoiding having to hold the MP lock for common syscalls and I/O's and interrupt threads. It is expected that the spl mechanisms and new interrupt threading mechanisms will be able to run in tandem, allowing a slow piecemeal transition to occur. This patch should result in a moderate performance improvement due to the considerable amount of code that has been removed from the critical path, especially the simplification of the spl*() calls. The real performance gains will come later. Approved by: jkh Reviewed by: current, bde (exception.s) Some work taken from: luoqi's patch
* Remove B_READ, B_WRITE and B_FREEBUF and replace them with a newphk2000-03-201-1/+1
| | | | | | | | | | | | | | | | | | | | | field in struct buf: b_iocmd. The b_iocmd is enforced to have exactly one bit set. B_WRITE was bogusly defined as zero giving rise to obvious coding mistakes. Also eliminate the redundant struct buf flag B_CALL, it can just as efficiently be done by comparing b_iodone to NULL. Should you get a panic or drop into the debugger, complaining about "b_iocmd", don't continue. It is likely to write on your disk where it should have been reading. This change is a step in the direction towards a stackable BIO capability. A lot of this patch were machine generated (Thanks to style(9) compliance!) Vinum users: Greg has not had time to test this yet, be careful.
* Don't forget to reset the hardware debug registers when a process thatbsd2000-02-201-0/+7
| | | | | | | | | | was using them exits. Don't allow a user process to cause the kernel to take a TRCTRAP on a user space address. Reviewed by: jlemon, sef Approved by: jkh
* User ldt sharing.luoqi1999-12-061-19/+33
|
* The core of this patch is to vm/vm_page.h. The effects are two-fold: (1) toalc1999-10-301-2/+2
| | | | | | | | | eliminate an extra (useless) level of indirection in half of the page queue accesses and (2) to use a single name for each queue throughout, instead of, e.g., "vm_page_queue_active" in some places and "vm_page_queues[PQ_ACTIVE]" in others. Reviewed by: dillon
* useracc() the prequel:phk1999-10-291-1/+0
| | | | | | | | | | | Merge the contents (less some trivial bordering the silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs. This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE} as argument.
* Zap unneeded #includespeter1999-10-111-2/+0
| | | | Submitted by: phk
* Fix bug in pipe code relating to writes of mmap'd but illegal addressdillon1999-09-201-3/+6
| | | | | | | | | spaces which cross a segment boundry in the page table. pmap_kextract() is not designed for access to the user space portion of the page table and cannot handle the null-page-directory-entry case. The fix is to have vm_fault_quick() return a success or failure which is then used to avoid calling pmap_kextract().
* $Id$ -> $FreeBSD$peter1999-08-281-1/+1
|
* Change the type of vpgqueues::lcnt from "int *" to "int". The indirectionalc1999-07-311-3/+3
| | | | served no purpose.
* Reduce the number of "magic constants" used for page coloringalc1999-07-221-2/+2
| | | | | by one: PQ_PRIME2 and PQ_PRIME3 are used to accomplish the same thing at different places in the kernel. Drop PQ_PRIME3.
* Slight reorganization of kernel thread/process creation. Instead of usingpeter1999-07-011-3/+3
| | | | | | | | | | | | | | | SYSINIT_KT() etc (which is a static, compile-time procedure), use a NetBSD-style kthread_create() interface. kproc_start is still available as a SYSINIT() hook. This allowed simplification of chunks of the sysinit code in the process. This kthread_create() is our old kproc_start internals, with the SYSINIT_KT fork hooks grafted in and tweaked to work the same as the NetBSD one. One thing I'd like to do shortly is get rid of nfsiod as a user initiated process. It makes sense for the nfs client code to create them on the fly as needed up to a user settable limit. This means that nfsiod doesn't need to be in /sbin and is always "available". This is a fair bit easier to do outside of the SYSINIT_KT() framework.
* Unifdef VM86.jlemon1999-06-011-14/+1
| | | | Reviewed by: silence on on -current
* unifdef -DVM_STACK - it's been on for a while for x86 and was checkedpeter1999-04-191-56/+1
| | | | and appeared to be working for the Alpha some time ago.
* Hide access to vmspace:vm_pmap with inline function vmspace_pmap(). Thisluoqi1999-02-191-2/+2
| | | | | | | is the preparation step for moving pmap storage out of vmspace proper. Reviewed by: Alan Cox <alc@cs.rice.edu> Matthew Dillion <dillon@apollo.backplane.com>
* * Change sysctl from using linker_set to construct its tree using SLISTs.dfr1999-02-161-1/+2
| | | | | | | | | | This makes it possible to change the sysctl tree at runtime. * Change KLD to find and register any sysctl nodes contained in the loaded file and to unregister them when the file is unloaded. Reviewed by: Archie Cobbs <archie@whistle.com>, Peter Wemm <peter@netplex.com.au> (well they looked at it anyway)
* Adjust idle zero-page fill hysteresis based on tests. Use 2/3 and 4/5dillon1999-02-081-4/+8
| | | | | | | zero-fill levels. Adjust comment for ozfod in vmmeter.h - this counter represents non-optimal ( on the fly ) zero fills, not prefills.
* Rip out PQ_ZERO queue. PQ_ZERO functionality is now combined in withdillon1999-02-081-19/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | PQ_FREE. There is little operational difference other then the kernel being a few kilobytes smaller and the code being more readable. * vm_page_select_free() has been *greatly* simplified. * The PQ_ZERO page queue and supporting structures have been removed * vm_page_zero_idle() revamped (see below) PG_ZERO setting and clearing has been migrated from vm_page_alloc() to vm_page_free[_zero]() and will eventually be guarenteed to remain tracked throughout a page's life ( if it isn't already ). When a page is freed, PG_ZERO pages are appended to the appropriate tailq in the PQ_FREE queue while non-PG_ZERO pages are prepended. When locating a new free page, PG_ZERO selection operates from within vm_page_list_find() ( get page from end of queue instead of beginning of queue ) and then only occurs in the nominal critical path case. If the nominal case misses, both normal and zero-page allocation devolves into the same _vm_page_list_find() select code without any specific zero-page optimizations. Additionally, vm_page_zero_idle() has been revamped. Hysteresis has been added and zero-page tracking adjusted to conform with the other changes. Currently hysteresis is set at 1/3 (lo) and 1/2 (hi) the number of free pages. We may wish to increase both parameters as time permits. The hysteresis is designed to avoid silly zeroing in borderline allocation/free situations.
* More -Wall / -Wcast-qual cleanup. Also, EXEC_SET can't usedillon1999-01-291-3/+3
| | | | | C_DECLARE_MODULE due to the linker_file_sysinit() function making modifications to the data.
* Add (but don't activate) code for a special VM option to makejulian1999-01-061-1/+18
| | | | | | | | | | | | | downward growing stacks more general. Add (but don't activate) code to use the new stack facility when running threads, (specifically the linux threads support). This allows people to use both linux compiled linuxthreads, and also the native FreeBSD linux-threads port. The code is conditional on VM_STACK. Not using this will produce the old heavily tested system. Submitted by: Richard Seaman <dick@tar.com>
* Removed bogus casts of USRSTACK and/or the other operand in binarybde1998-12-161-4/+4
| | | | expressions involving USRSTACK.
* Add John Dyson's SYSCTL descriptions, and an export of more stats topeter1998-10-311-2/+3
| | | | | a sysctl hierarchy (vm.stats.*). SYSCTL descriptions are only present in source, they do not get compiled into the binaries taking up memory.
* Fixed two potentially serious classes of bugs:dg1998-10-131-3/+3
| | | | | | | | | | | | | | | | 1) The vnode pager wasn't properly tracking the file size due to "size" being page rounded in some cases and not in others. This sometimes resulted in corrupted files. First noticed by Terry Lambert. Fixed by changing the "size" pager_alloc parameter to be a 64bit byte value (as opposed to a 32bit page index) and changing the pagers and their callers to deal with this properly. 2) Fixed a bogus type cast in round_page() and trunc_page() that caused some 64bit offsets and sizes to be scrambled. Removing the cast required adding casts at a few dozen callers. There may be problems with other bogus casts in close-by macros. A quick check seemed to indicate that those were okay, however.
* Initialize pcb_mpnest to 1 in the child process in cpu_fork(). This shouldtegge1998-09-281-1/+4
| | | | | | | fix the 50% idle problem that the ELF /sbin/init triggered. The problem appeared when the last context switch before a fork() call was due to the kernel faulting in user pages via normal page faults (e.g. copyin). Reviewed by: Peter Wemm <peter@netplex.com.au>
* Goodbye BOUNCE_BUFFERS, for a hack it has served us well.peter1998-09-251-478/+1
| | | | | | | | The last consumer of this code (the old SCSI system) has left us and the CAM code does it's own bouncing. The isa dma system has been doing it's own bouncing for a while too. Reviewed by: core
* Presently there is only one `currentldt' variable for all cpusmsmith1998-08-181-3/+5
| | | | | | | | | | | | in a SMP system. Unexpected things could happen if each cpu has a different ldt setting and one cpu tries to use value of currentldt set by another cpu. The fix is to move currentldt to the per-cpu area. It includes patches I filed in PR i386/6219 which are also user ldt related. PR: i386/7591, i386/6219 Submitted by: Luoqi Chen <luoqi@watermarkgroup.com>
* Disallow reading the current kernel stack. Only the user structure andtegge1998-05-191-5/+22
| | | | | the current registers should be accessible. Reviewed by: David Greenman <dg@root.com>
* Add forwarding of roundrobin to other cpus. This gives a more regulartegge1998-05-171-1/+88
| | | | | | | | | | | update of cpu usage as shown by top when one process is cpu bound (no system calls) while the system is otherwise idle (except for top). Don't attempt to switch to the BSP in boot(). If the system was idle when an interrupt caused a panic, this won't work. Instead, switch to the BSP in cpu_reset. Remove some spurious forward_statclock/forward_hardclock warnings.
* Some of newer PC-98 may cause "Windows Protection Fault" when bootingkato1998-05-161-3/+5
| | | | | | | | Windows 95 after rebooting FreeBSD without power off. In PC-98 system, reboot mode is set via I/O port 0x37 in cpu_reset(), and accessing of this port is the reason of the problem. To avnoid the fault, current status of reboot mode should be checked before accessing the I/O port.
* Add the ability to make real-mode BIOS calls from the kernel. Currently,jlemon1998-03-231-1/+9
| | | | | | | | | | | everything is contained inside #ifdef VM86, so this option must be present in the config file to use this functionality. Thanks to Tor Egge, these changes should work on SMP machines. However, it may not be throughly SMP-safe. Currently, the only BIOS calls made are memory-sizing routines at bootup, these replace reading the RTC values.
* Make EPSON_BOUNCEDMA a new-style option.kato1998-03-171-1/+4
|
* Don't use the standard macros for disabling/enabling interrupt.tegge1998-03-141-3/+3
| | | | | On SMP systems, this left the mpintr_lock simplelock locked, causing further calls to disable_intr to deadlock or panic.
* Fixed breakage of the !SMP case in vm_page_zero_idle() in thebde1998-03-121-5/+7
| | | | | | | | | previous commit. Opportunities to clean pages were often missed, and leaving of the idle state was sometimes delayed until the next interrupt (after any that occurred while cleaning). Fixed an unstaticization, a syntax error and a style bug in the previous commit.
* Fix page prezeroing for SMP, and fix some potential paging-in-progressdyson1998-02-251-29/+35
| | | | | | hangs. The paging-in-progress diagnosis was a result of Tor Egge's excellent detective work. Submitted by: Partially from Tor Egge.
* Ifdefed some npx code. npx should be optional again.bde1998-02-131-1/+3
|
* Back out DIAGNOSTIC changes.eivind1998-02-061-2/+1
|
* Turn DIAGNOSTIC into a new-style option.eivind1998-02-041-1/+2
|
* Make the bounce buffer code a little more robust when space isn'tdyson1998-01-301-3/+10
| | | | | | available. If there isn't bounce space available, the bounce code is disabled. This will allow most large systems to run properly when the bounce space is mistakenly allocated above 16MB.
* VM level code cleanups.dyson1998-01-221-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1) Start using TSM. Struct procs continue to point to upages structure, after being freed. Struct vmspace continues to point to pte object and kva space for kstack. u_map is now superfluous. 2) vm_map's don't need to be reference counted. They always exist either in the kernel or in a vmspace. The vmspaces are managed by reference counts. 3) Remove the "wired" vm_map nonsense. 4) No need to keep a cache of kernel stack kva's. 5) Get rid of strange looking ++var, and change to var++. 6) Change more data structures to use our "zone" allocator. Added struct proc, struct vmspace and struct vnode. This saves a significant amount of kva space and physical memory. Additionally, this enables TSM for the zone managed memory. 7) Keep ioopt disabled for now. 8) Remove the now bogus "single use" map concept. 9) Use generation counts or id's for data structures residing in TSM, where it allows us to avoid unneeded restart overhead during traversals, where blocking might occur. 10) Account better for memory deficits, so the pageout daemon will be able to make enough memory available (experimental.) 11) Fix some vnode locking problems. (From Tor, I think.) 12) Add a check in ufs_lookup, to avoid lots of unneeded calls to bcmp. (experimental.) 13) Significantly shrink, cleanup, and make slightly faster the vm_fault.c code. Use generation counts, get rid of unneded collpase operations, and clean up the cluster code. 14) Make vm_zone more suitable for TSM. This commit is partially as a result of discussions and contributions from other people, including DG, Tor Egge, PHK, and probably others that I have forgotten to attribute (so let me know, if I forgot.) This is not the infamous, final cleanup of the vnode stuff, but a necessary step. Vnode mgmt should be correct, but things might still change, and there is still some missing stuff (like ioopt, and physical backing of non-merged cache files, debugging of layering concepts.)
* The removal of a page from the free queue in vm_page_zero_idle wastegge1998-01-191-1/+2
| | | | | imcomplete. Also set m->queue, in order to prevent vm_page_select_free from selecting the page being zeroed.
* Implementation of Bus DMA for FreeBSD-x86. This is sufficient to dogibbs1998-01-151-1/+11
| | | | | page level bounce buffering, but there are still some issues left to address.
* #include "opt_user_ldt.h" so that the #ifdef USER_LDT checks can work, aspeter1997-12-271-1/+2
| | | | | | commented about at length in the PR audit trail. PR: 2412
OpenPOWER on IntegriCloud