path: root/sys/kern/subr_param.c
Columns: commit message, author, age, files, lines
* Teach the kernel to recognize that it is executing inside a bhyve virtualneel2013-01-051-0/+1
| | | | | | machine. Obtained from: NetApp
* Prevent long type overflow of realmem calculation on ILP32 by forcingandre2012-12-101-2/+2
| | | | | | | | calculation to be in quad_t space. Fix style issue with second parameter to qmin(). Reported by: alc Reviewed by: bde, alc
* Using a long is the wrong type to represent the realmem and maxmbufmemandre2012-11-291-4/+4
| | | | | | | | | variable as they may overflow on i386/PAE and i386 with > 2GB RAM. Use 64bit quad_t instead. It has broader kernel infrastructure support with TUNABLE_QUAD_FETCH() and qmin/qmax() than other available types. Pointed out by: alc, bde
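The overflow described in the two commits above can be illustrated with a small sketch. It is not the code from subr_param.c: the names sketch_qmin, sketch_realmem and sketch_fits_ilp32_long, and the 4096-byte page size, are assumptions made for illustration, and int64_t stands in for quad_t.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096 /* assumed page size, for illustration only */

/* 64-bit minimum, in the spirit of the kernel's qmin(). */
static int64_t
sketch_qmin(int64_t a, int64_t b)
{
	return (a < b ? a : b);
}

/*
 * Memory available for sizing decisions: the lower of physical RAM and
 * kernel virtual address space, in bytes.  The page count is widened to
 * 64 bits before the multiplication so the product cannot wrap.
 */
static int64_t
sketch_realmem(uint32_t physpages, int64_t kva_bytes)
{
	assert(kva_bytes >= 0);
	return (sketch_qmin((int64_t)physpages * SKETCH_PAGE_SIZE, kva_bytes));
}

/* Would this value fit in the 32-bit signed long of an ILP32 platform? */
static int
sketch_fits_ilp32_long(int64_t v)
{
	return (v >= INT32_MIN && v <= INT32_MAX);
}
```

With 1,048,576 pages (4GB of RAM) the product is 4,294,967,296 bytes, which sketch_fits_ilp32_long() rejects. That is the case the commits describe for i386/PAE and i386 with more than 2GB of RAM.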
* Base the mbuf related limits on the available physical memory orandre2012-11-271-8/+31
| kernel memory, whichever is lower.
|
| The overall mbuf related memory limit must be set so that mbufs (and clusters of various sizes) can't exhaust physical RAM or KVM.
|
| The limit is set to half of the physical RAM or KVM (whichever is lower) as the baseline. In any normal scenario we want to leave at least half of the physmem/kvm for other kernel functions and userspace to prevent it from swapping too easily. Via a tunable kern.maxmbufmem the limit can be upped to at most 3/4 of physmem/kvm.
|
| At the same time divorce maxfiles from maxusers and set maxfiles to physpages / 8 with a floor based on maxusers. This way busy servers can make use of the significantly increased mbuf limits with a much larger number of open sockets.
|
| Tidy up ordering in init_param2() and check up on some users of those values calculated here.
|
| Out of the overall mbuf memory limit, 2K clusters and 4K (page size) clusters get 1/4 each because these are the most heavily used mbuf sizes. 2K clusters are used for MTU 1500 ethernet inbound packets. 4K clusters are used whenever possible for sends on sockets and thus outbound packets.
|
| The larger cluster sizes of 9K and 16K are limited to 1/6 of the overall mbuf memory limit. When jumbo MTUs are used these large clusters will end up only on the inbound path. They are not used on outbound, there it's still 4K. Yes, that will stay that way because otherwise we run into lots of complications in the stack. And it really isn't a problem, so don't make a scene.
|
| Normal mbufs (256B) weren't limited at all previously. This was problematic as there are certain places in the kernel that on allocation failure of clusters try to piece together their packet from smaller mbufs.
|
| The mbuf limit is the number of all other mbuf sizes together plus some more to allow for standalone mbufs (ACK for example) and to send off a copy of a cluster. Unfortunately there isn't a way to set an overall limit for all mbuf memory together as UMA doesn't support such a limit.
|
| NB: Every cluster also has an mbuf associated with it.
|
| Two examples on the revised mbuf sizing limits:
|
| 1GB KVM: 512MB limit for mbufs
|   419,430 mbufs
|   65,536 2K mbuf clusters
|   32,768 4K mbuf clusters
|   9,709 9K mbuf clusters
|   5,461 16K mbuf clusters
|
| 16GB RAM: 8GB limit for mbufs
|   33,554,432 mbufs
|   1,048,576 2K mbuf clusters
|   524,288 4K mbuf clusters
|   155,344 9K mbuf clusters
|   87,381 16K mbuf clusters
|
| These defaults should be sufficient for even the most demanding network loads.
|
| MFC after: 1 month
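The cluster counts in the two examples above follow from the stated fractions. The sketch below reproduces only that arithmetic. It is not the kernel's code: the struct and function names are hypothetical, and it assumes that the 1/6 share applies to each of the 9K and 16K sizes and that 9K and 16K mean 9 * 1024 and 16 * 1024 bytes. Both assumptions match the figures quoted in the commit message. The count of plain mbufs is left out because the message does not give its exact formula.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the per-size cluster limits. */
struct sketch_mbuf_limits {
	int64_t clusters_2k;
	int64_t clusters_4k;
	int64_t clusters_9k;
	int64_t clusters_16k;
};

/*
 * Split an overall mbuf memory limit (in bytes) into cluster counts:
 * 1/4 of the limit each for 2K and 4K clusters, 1/6 each for 9K and 16K.
 */
static struct sketch_mbuf_limits
sketch_cluster_limits(int64_t maxmbufmem)
{
	struct sketch_mbuf_limits l;

	assert(maxmbufmem >= 0);
	l.clusters_2k = maxmbufmem / 4 / 2048;
	l.clusters_4k = maxmbufmem / 4 / 4096;
	l.clusters_9k = maxmbufmem / 6 / (9 * 1024);
	l.clusters_16k = maxmbufmem / 6 / (16 * 1024);
	return (l);
}
```

Calling sketch_cluster_limits() with 512MB (536,870,912 bytes) gives the four cluster counts of the 1GB KVM example, and 8GB gives those of the 16GB RAM example.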
* Allow maxusers to scale on machines with large address space.alfred2012-11-101-11/+11
| | | | | | | | | | | | | | | | | | Some hooks are added to clamp down maxusers and nmbclusters for small address space systems. VM_MAX_AUTOTUNE_MAXUSERS - the max maxusers that will be autotuned based on physical memory. VM_MAX_AUTOTUNE_NMBCLUSTERS - max nmbclusters based on physical memory. These are set to the old values on i386 to preserve the clamping that was being done to all arches. Another macro VM_AUTOTUNE_NMBCLUSTERS is provided to allow an override for the calculation on a MD basis. Currently no arch defines this. Reviewed by: peter MFC after: 2 weeks
* Allow autotune maxusers > 384 on 64 bit machinesalfred2012-10-251-2/+10
| | | | | | | | | A default install on large memory machines with multiple 10gigE interfaces was not being given enough mbufs to do full bandwidth TCP or NFS traffic. To keep the value somewhat reasonable, we scale back the number of maxusers by 1/6 past the 384 point. This gives us enough mbufs for most of our pretty basic 10gigE line-speed tests to complete.
* - Mark some sysctls with CTLFLAG_TUN flag instead of CTLFLAG_RDTUN.zont2012-09-031-7/+7
| | | | | | Pointed out by: avg Approved by: kib (mentor) MFC after: 1 week
* - Make kern.maxtsiz, kern.dfldsiz, kern.maxdsiz, kern.dflssiz, kern.maxssizzont2012-09-021-7/+7
| | | | | | and kern.sgrowsiz sysctls writable. Approved by: kib (mentor)
* As a safety measure, disable lowering pid_max too much.kib2012-08-161-0/+3
| | | | | Requested by: Peter Jeremy <peter@rulingia.com> MFC after: 1 week
* Add a sysctl kern.pid_max, which limits the maximum pid the system iskib2012-08-151-2/+11
| | | | | | | allowed to allocate, and corresponding tunable with the same name. Note that existing processes with higher pids are left intact. MFC after: 1 week
* Modestly increase the maximum allowed size of the kmem map on i386.alc2011-03-231-11/+8
| | | | | | | | | | | | | | | | | | | | | | | | | Also, express this new maximum as a fraction of the kernel's address space size rather than a constant so that increasing KVA_PAGES will automatically increase this maximum. As a side-effect of this change, kern.maxvnodes will automatically increase by a proportional amount. While I'm here ensure that this change doesn't result in an unintended increase in maxpipekva on i386. Calculate maxpipekva based upon the size of the kernel address space and the amount of physical memory instead of the size of the kmem map. The memory backing pipes is not allocated from the kmem map. It is allocated from its own submap of the kernel map. In short, it has no real connection to the kmem map. (In fact, the commit messages for the maxpipekva auto-sizing talk about using the kernel map size, cf. r117325 and r117391, even though the implementation actually used the kmem map size.) Although the calculation is now done differently, the resulting value for maxpipekva should remain almost the same on i386. However, on amd64, the value will be reduced by 2/3. This is intentional. The recent change to VM_KMEM_SIZE_SCALE on amd64 for the benefit of ZFS also had the unnecessary side-effect of increasing maxpipekva. This change is effectively restoring maxpipekva on amd64 to its prior value. Eliminate init_param3() since it is no longer used.
* Make MSGBUF_SIZE kernel option a loader tunable kern.msgbufsize.pluknet2011-01-211-0/+7
| | | | | | | Submitted by: perryh pluto.rain.com (previous version) Reviewed by: jhb Approved by: kib (mentor) Tested by: universe
* Add Xen to the list of virtual vendors. In the non PV (HVM) case this fixescsjp2010-08-061-0/+1
| | | | | | | | the virtualization detection successfully disabling the clflush instruction. This fixes insta-panics for XEN hvm users when the hw.clflush_disable tunable is -1 or 0 (-1 by default). Discussed with: jhb
* Reverse the logic of the if statement that sets the default value ofnwhitehorn2010-06-241-3/+3
| | | | | | HZ; the list of 1000 Hz platforms was getting unwieldy. Suggested by: marcel
* Move default HZ from 100 to 1000 on powerpc.nwhitehorn2010-06-231-1/+1
| | | | | Reviewed by: marcel MFC after: 2 weeks
* Document the VM detection type and sysctl a bit better.ivoras2010-03-021-1/+1
|
* When running as a guest operating system, the FreeBSD kernel must assumealc2010-02-271-4/+4
| | | | | | | | | | | | that the virtual machine monitor has enabled machine check exceptions. Unfortunately, on AMD Family 10h processors the machine check hardware has a bug (Erratum 383) that can result in a false machine check exception when a superpage promotion occurs. Thus, I am disabling superpage promotion when the FreeBSD kernel is running as a guest operating system on an AMD Family 10h processor. Reviewed by: jhb, kib MFC after: 3 days
* Don't enforce an upper bound on kern.ngroups. The INT_MAX-1 limit wasbrooks2010-02-241-2/+0
| | | | | | | | | too high due to several overflows. The actual limit is somewhere in the neighborhood of INT_MAX/4 on 64-bit machines, but most systems could not support such a limit due to a lack of memory and the cost of duplicate credentials. Reported by: bde
* Replace the static NGROUPS=NGROUPS_MAX+1=1024 with a dynamicbrooks2010-01-121-0/+14
| | | | | | | | kern.ngroups+1. kern.ngroups can range from NGROUPS_MAX=1023 to INT_MAX-1. Given that the Windows group limit is 1024, this range should be sufficient for most applications. MFC after: 1 month
* Increase HZ_VM from 10 to 100. While 10 hz saves cpu timesilby2009-07-081-1/+1
| | | | | | | | under VM environments, it's too slow for FreeBSD to work properly. For example, ping at 10hz pings about every 600ms instead of about every second. Approved by: re (kib)
* Improve the description of a few sysctls.jhb2009-03-231-10/+11
| | | | | Submitted by: bde (partially) MFC after: 3 days
* Change the sysctls for maxbcache and maxswzone from int to long. I missedjhb2009-03-121-2/+2
| | | | this earlier since these sysctls don't exist in 7.x yet.
* Export the current values of nbuf, ncallout, and nswbuf via read-onlyjhb2009-03-121-0/+6
| | | | | | sysctls that match the tunable names. MFC after: 3 days
* - Make maxpipekva a signed long rather than an unsigned long as overflowjhb2009-03-101-2/+2
| | | | | | | is more likely to be noticed with signed types. - Make amountpipekva a long as well to match maxpipekva. Discussed with: bde
* Adjust some variables (mostly related to the buffer cache) that holdjhb2009-03-091-6/+6
| | | | | | | | | | | | | | | | | | | address space sizes to be longs instead of ints. Specifically, the following values are now longs: runningbufspace, bufspace, maxbufspace, bufmallocspace, maxbufmallocspace, lobufspace, hibufspace, lorunningspace, hirunningspace, maxswzone, maxbcache, and maxpipekva. Previously, a relatively small number (~ 44000) of buffers set in kern.nbuf would result in integer overflows resulting either in hangs or bogus values of hidirtybuffers and lodirtybuffers. Now one has to overflow a long to see such problems. There was a check for a nbuf setting that would cause overflows in the auto-tuning of nbuf. I've changed it to always check and cap nbuf but warn if a user-supplied tunable would cause overflow. Note that this changes the ABI of several sysctls that are used by things like top(1), etc., so any MFC would probably require some gross shims to allow for that. MFC after: 1 month
* Document the relationship between enum VM_GUEST and the vm_guest_sysctl_namesivoras2008-12-301-1/+3
| | | | | | array. Approved by: gnn (original version)
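The relationship this commit documents, an enum whose values index a parallel array of names, can be sketched as follows. The identifiers and the strings "none", "generic" and "xen" are assumptions made for illustration and are not copied from subr_param.c. The compile-time check is an addition of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for enum VM_GUEST; values double as array indices. */
enum sketch_vm_guest {
	SKETCH_VM_GUEST_NO = 0,
	SKETCH_VM_GUEST_VM,
	SKETCH_VM_GUEST_XEN,
	SKETCH_VM_GUEST_COUNT	/* number of entries, not a real guest type */
};

/* Must stay in the same order as the enum above. */
static const char *const sketch_vm_guest_names[] = {
	"none",		/* assumed name for "not a guest" */
	"generic",	/* assumed name for an unspecified hypervisor */
	"xen",		/* assumed name for Xen */
};

/* Fail the build if the enum and the name array drift apart. */
_Static_assert(sizeof(sketch_vm_guest_names) /
    sizeof(sketch_vm_guest_names[0]) == SKETCH_VM_GUEST_COUNT,
    "name array must have one entry per enum value");

/* Map a guest type to the string a stringified sysctl could report. */
static const char *
sketch_vm_guest_name(enum sketch_vm_guest g)
{
	assert((int)g >= 0 && (int)g < SKETCH_VM_GUEST_COUNT);
	return (sketch_vm_guest_names[g]);
}
```

Because the lookup is a plain array index, adding an enum value without adding a name silently breaks the mapping unless something like the static assertion catches it, which is presumably why the relationship was worth documenting.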
* Hide detect_virtual() along with the accompanying stringbz2008-12-271-7/+9
| | | | | | | | arrays under #ifndef XEN to make XEN config compile again. In case of Xen vm_guest is hard coded. Move the list for the vm_guest sysctl out of the restrictive bounds as the sysctl is there in either case.
* By popular request, stringify kern.vm_guest sysctl. Now it returns aivoras2008-12-181-3/+27
| | | | | | | short, self-documenting string describing the detected virtual environment. Approved by: gnn (mentor) (earlier version)
* Introduce a sysctl kern.vm_guest that reflects what the kernel knows aboutivoras2008-12-171-6/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | it running under a virtual environment. This also introduces a globally accessible variable vm_guest that can be used where appropriate in the kernel to inspect this environment. To make it easier for the long run, an enum VM_GUEST is also introduced, which could possibly be factored out in a header somewhere (but the question is where - vm/vm_param.h? sys/param.h?) so it eventually becomes a part of the standard KPI. In any case, it's a start. The purpose of all this isn't to absolutely detect that the OS is running under a virtual environment (cf. "redpill") but to allow the parts of the kernel and the userland that care about this particular aspect and can do something useful depending on it to have a standardised interface. Reducing kern.hz is one example but there are other things that could be done like avoiding context switches, not using CPU instructions that are known to be slow in emulation, possibly different strategies in VM (memory) allocation, CPU scheduling, etc. It isn't clear if the JAILS/VIMAGE functionality should also be exposed by this particular mechanism (probably not since they're not "full" virtual hardware environments). Sometime in the future another sysctl and a variable could be introduced to reflect if the kernel supports any kind of virtual hosting (e.g. VMWare VMI, Xen dom0). Reviewed by: silence from src-commiters@, virtualization@, kmacy@ Approved by: gnn (mentor) Security: Obscurity doesn't help.
* - Detect Bochs BIOS variants and use HZ_VM as well.jkim2008-12-081-12/+25
| | | | | - Free kernel environment variable after its use. - Fix style(9) nits.
* vm_pnames should be "const char *const[]".sobomax2008-10-271-1/+1
| | | | Submitted by: Christoph Mallon
* vm_pnames has no reason to be global.sobomax2008-10-271-1/+1
| | | | MFC after: 2 weeks
* Default HZ value (1,000) on i386/amd64 is not very virtual machine friendly.sobomax2008-10-271-1/+39
| | | | | | | | | | | | | | | | Due to the nature of the beast it causes a lot of unproductive overhead. This is especially bad when running an SMP kernel on VMWare with several virtual processors - an idle FreeBSD guest with an SMP kernel takes 150% host CPU time on my dual-core MacBook Pro when I am enabling two virtual CPUs, making even the host not very usable. Detect when we are running in the sandbox and reduce HZ to 10 (can be adjusted via VM_HZ in the kernel config) in such cases. This brings host CPU usage of idle FreeBSD/SMP on two virtual processors down to 10%. Detect most popular VM platforms out there - VMWare, Parallels, VirtualBox and VirtualPC. MFC after: 2 weeks
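The selection described above can be sketched as a small function. This is an illustration under assumptions, not the kernel's code: sketch_pick_hz and its parameters are hypothetical, the defaults of 1000 and 10 are the values named in this commit message, and treating a non-positive tunable as "not set" is a convention of the sketch.

```c
#include <assert.h>

#define SKETCH_HZ	1000	/* default tick rate named in the message */
#define SKETCH_HZ_VM	10	/* reduced rate for virtual machines */

/*
 * Choose the clock tick rate.  An explicit setting (for example from a
 * loader tunable) wins; otherwise a detected virtual machine gets the
 * reduced rate and bare metal gets the normal default.
 */
static int
sketch_pick_hz(int tunable_hz, int in_vm)
{
	assert(in_vm == 0 || in_vm == 1);
	if (tunable_hz > 0)
		return (tunable_hz);
	return (in_vm ? SKETCH_HZ_VM : SKETCH_HZ);
}
```

Letting the explicit setting take precedence means the detection only changes the default, so an administrator who needs a particular rate inside a guest can still request it.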
* Correct an error in the comments for init_param3().alc2008-07-041-2/+2
| | | | Discussed with: silby
* - Export HZ value via kern.hz sysctl (this is the same name as for thepjd2008-05-091-8/+17
| | | | | | | | loader tunable). - Document other sysctls in this file and also mark them as loader tunable via CTLFLAG_RDTUN flag. Reviewed by: roberto
* Export maxswzone, maxbcache, maxtsiz, dfldsiz, maxdsiz, dflssiz, maxssiz,alfred2007-10-161-0/+10
| | | | | | and sgrowsiz via sysctl. MFC after: 1 week
* Partially revert revision 1.66, which contained a change that did notkris2005-10-141-4/+4
| | | | | | | | | | | | | | | | | | correspond to the commit log. It changed the maxswzone and maxbcache parameters from int to long, without changing the extern definitions in <sys/buf.h>. In fact it's a good thing it did not, because other parts of the system are not yet ready for this, and on large-memory sparc machines it causes severe filesystem damage if you try. The worst effect of the change was that the tunables controlling the above variables stopped working. These were necessary to allow such large sparc64 machines (with >12GB RAM) to boot, since sparc64 did not set a hard-coded upper limit on these parameters and they ended up overflowing an int, causing an infinite loop at boot in bufinit(). Reviewed by: mlaier
* Increase default HZ for sparc64 to 1000.marius2005-04-161-1/+1
|
* /* -> /*- for copyright notices, minor format tweaks as necessaryimp2005-01-061-1/+1
|
* Fix the build.bms2004-11-301-2/+2
|
* Switch from 1024hz to 1000hz on amd64 to match i386. 1024 is a badpeter2004-11-301-3/+1
| | | | choice because it is so in sync with stathz (128hz or 4096hz etc).
* #include <vm/vm_param.h> instead of <machine/vmparam.h> (the formerdes2004-11-081-17/+17
| | | | | | | | | | | | includes the latter, but also declares variables which are defined in kern/subr_param.c). Change some VM parameters from quad_t to unsigned long. They refer to quantities (size limits for text, heap and stack segments) which must necessarily be smaller than the size of the address space, so long is adequate on all platforms. MFC after: 1 week
* Increase default HZ for ia64 to 1000.marcel2004-11-081-1/+1
|
* Increase default HZ for i386 to 1000phk2004-11-061-4/+6
|
* Remove advertising clause from University of California Regent's license,imp2004-04-051-4/+0
| | | | | | per letter dated July 22, 1999. Approved by: core
* White space and wording changes to init_param3().alc2004-03-301-5/+3
| | | | Mostly submitted by: bde
* Revise the direct or optimized case to use uiomove_fromphys() by the readeralc2004-03-271-6/+1
| | | | | | | | | | | | | | | | | instead of ephemeral mappings using pmap_qenter() by the writer. The writer is still, however, responsible for wiring the pages, just not mapping them. Consequently, the allocation of KVA for the direct case is unnecessary. Remove it and the sysctls limiting it, i.e., kern.ipc.maxpipekvawired and kern.ipc.amountpipekvawired. The number of temporarily wired pages is still, however, limited by kern.ipc.maxpipekva. Note: On platforms lacking a direct virtual-to-physical mapping, uiomove_fromphys() uses sf_bufs to cache ephemeral mappings. Thus, the number of available sf_bufs can influence the performance of pipes on platforms such as i386. Surprisingly, I saw the greatest gain from this change on such a machine: lmbench's pipe bandwidth result increased from ~1050MB/s to ~1850MB/s on my 2.4GHz, 400MHz FSB P4 Xeon.
* Set default HZ to 1024 for amd64. The comment in kern/tty.c doesn'tpeter2004-03-141-0/+4
| | | | | apply here because we have 64 bit longs and don't suffer the hz > 169 overflows.
* More pipe changes:silby2003-08-111-10/+8
| | | | | | | | | | | | | | From alc: Move pageable pipe memory to a separate kernel submap to avoid awkward vm map interlocking issues. (Bad explanation provided by me.) From me: Rework pipespace accounting code to handle this new layout, and adjust our default values to account for the fact that we now have a solid limit on allocations. Also, remove the "maxpipes" limit, as it no longer has a purpose. (The limit on kva usage solves the problem of having too many pipes.)
* Add init_param3() to subr_param. This function is calledsilby2003-07-111-26/+16
| | | | | | | | immediately after the kernel map has been sized, and is the optimal place for the autosizing of memory allocations which occur within the kernel map to occur. Suggested by: bde