summaryrefslogtreecommitdiffstats
path: root/Documentation
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/dontdiff4
-rw-r--r--Documentation/i386/boot.txt23
-rw-r--r--Documentation/kernel-parameters.txt21
-rw-r--r--Documentation/x86_64/boot-options.txt14
-rw-r--r--Documentation/x86_64/fake-numa-for-cpusets66
-rw-r--r--Documentation/x86_64/machinecheck7
6 files changed, 123 insertions, 12 deletions
diff --git a/Documentation/dontdiff b/Documentation/dontdiff
index 63c2d0c..64e9f6c 100644
--- a/Documentation/dontdiff
+++ b/Documentation/dontdiff
@@ -55,8 +55,8 @@ aic7*seq.h*
aicasm
aicdb.h*
asm
-asm-offsets.*
-asm_offsets.*
+asm-offsets.h
+asm_offsets.h
autoconf.h*
bbootsect
bin2c
diff --git a/Documentation/i386/boot.txt b/Documentation/i386/boot.txt
index 38fe1f0..6498666 100644
--- a/Documentation/i386/boot.txt
+++ b/Documentation/i386/boot.txt
@@ -2,7 +2,7 @@
----------------------------
H. Peter Anvin <hpa@zytor.com>
- Last update 2007-01-26
+ Last update 2007-03-06
On the i386 platform, the Linux kernel uses a rather complicated boot
convention. This has evolved partially due to historical aspects, as
@@ -35,9 +35,13 @@ Protocol 2.03: (Kernel 2.4.18-pre1) Explicitly makes the highest possible
initrd address available to the bootloader.
Protocol 2.04: (Kernel 2.6.14) Extend the syssize field to four bytes.
+
Protocol 2.05: (Kernel 2.6.20) Make protected mode kernel relocatable.
Introduce relocatable_kernel and kernel_alignment fields.
+Protocol 2.06: (Kernel 2.6.22) Added a field that contains the size of
+ the boot command line
+
**** MEMORY LAYOUT
@@ -133,6 +137,8 @@ Offset Proto Name Meaning
022C/4 2.03+ initrd_addr_max Highest legal initrd address
0230/4 2.05+ kernel_alignment Physical addr alignment required for kernel
0234/1 2.05+ relocatable_kernel Whether kernel is relocatable or not
+0235/3 N/A pad2 Unused
+0238/4 2.06+ cmdline_size Maximum size of the kernel command line
(1) For backwards compatibility, if the setup_sects field contains 0, the
real value is 4.
@@ -233,6 +239,12 @@ filled out, however:
if your ramdisk is exactly 131072 bytes long and this field is
0x37FFFFFF, you can start your ramdisk at 0x37FE0000.)
+ cmdline_size:
+ The maximum size of the command line without the terminating
+ zero. This means that the command line can contain at most
+ cmdline_size characters. With protocol version 2.05 and
+ earlier, the maximum size was 255.
+
**** THE KERNEL COMMAND LINE
@@ -241,11 +253,10 @@ loader to communicate with the kernel. Some of its options are also
relevant to the boot loader itself, see "special command line options"
below.
-The kernel command line is a null-terminated string currently up to
-255 characters long, plus the final null. A string that is too long
-will be automatically truncated by the kernel, a boot loader may allow
-a longer command line to be passed to permit future kernels to extend
-this limit.
+The kernel command line is a null-terminated string. The maximum
+length can be retrieved from the field cmdline_size. Before protocol
+version 2.06, the maximum was 255 characters. A string that is too
+long will be automatically truncated by the kernel.
If the boot protocol version is 2.02 or later, the address of the
kernel command line is given by the header field cmd_line_ptr (see
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 84c3bd0..38d7db3 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -64,6 +64,7 @@ parameter is applicable:
GENERIC_TIME The generic timeofday code is enabled.
NFS Appropriate NFS support is enabled.
OSS OSS sound support is enabled.
+ PV_OPS A paravirtualized kernel
PARIDE The ParIDE subsystem is enabled.
PARISC The PA-RISC architecture is enabled.
PCI PCI bus support is enabled.
@@ -695,8 +696,15 @@ and is between 256 and 4096 characters. It is defined in the file
idebus= [HW] (E)IDE subsystem - VLB/PCI bus speed
See Documentation/ide.txt.
- idle= [HW]
- Format: idle=poll or idle=halt
+ idle= [X86]
+ Format: idle=poll or idle=mwait
+ Poll forces a polling idle loop that can slightly improves the performance
+ of waking up a idle CPU, but will use a lot of power and make the system
+ run hot. Not recommended.
+ idle=mwait. On systems which support MONITOR/MWAIT but the kernel chose
+ to not use it because it doesn't save as much power as a normal idle
+ loop use the MONITOR/MWAIT idle loop anyways. Performance should be the same
+ as idle=poll.
ignore_loglevel [KNL]
Ignore loglevel setting - this will print /all/
@@ -1157,6 +1165,11 @@ and is between 256 and 4096 characters. It is defined in the file
nomce [IA-32] Machine Check Exception
+ noreplace-paravirt [IA-32,PV_OPS] Don't patch paravirt_ops
+
+ noreplace-smp [IA-32,SMP] Don't replace SMP instructions
+ with UP alternatives
+
noresidual [PPC] Don't use residual data on PReP machines.
noresume [SWSUSP] Disables resume and restores original swap
@@ -1562,6 +1575,9 @@ and is between 256 and 4096 characters. It is defined in the file
smart2= [HW]
Format: <io1>[,<io2>[,...,<io8>]]
+ smp-alt-once [IA-32,SMP] On a hotplug CPU system, only
+ attempt to substitute SMP alternatives once at boot.
+
snd-ad1816a= [HW,ALSA]
snd-ad1848= [HW,ALSA]
@@ -1820,6 +1836,7 @@ and is between 256 and 4096 characters. It is defined in the file
[USBHID] The interval which mice are to be polled at.
vdso= [IA-32,SH]
+ vdso=2: enable compat VDSO (default with COMPAT_VDSO)
vdso=1: enable VDSO (default)
vdso=0: disable VDSO mapping
diff --git a/Documentation/x86_64/boot-options.txt b/Documentation/x86_64/boot-options.txt
index 85f51e5..6177d88 100644
--- a/Documentation/x86_64/boot-options.txt
+++ b/Documentation/x86_64/boot-options.txt
@@ -149,7 +149,19 @@ NUMA
numa=noacpi Don't parse the SRAT table for NUMA setup
- numa=fake=X Fake X nodes and ignore NUMA setup of the actual machine.
+ numa=fake=CMDLINE
+ If a number, fakes CMDLINE nodes and ignores NUMA setup of the
+ actual machine. Otherwise, system memory is configured
+ depending on the sizes and coefficients listed. For example:
+ numa=fake=2*512,1024,4*256,*128
+ gives two 512M nodes, a 1024M node, four 256M nodes, and the
+ rest split into 128M chunks. If the last character of CMDLINE
+ is a *, the remaining memory is divided up equally among its
+ coefficient:
+ numa=fake=2*512,2*
+ gives two 512M nodes and the rest split into two nodes.
+ Otherwise, the remaining system RAM is allocated to an
+ additional node.
numa=hotadd=percent
Only allow hotadd memory to preallocate page structures upto
diff --git a/Documentation/x86_64/fake-numa-for-cpusets b/Documentation/x86_64/fake-numa-for-cpusets
new file mode 100644
index 0000000..d1a985c
--- /dev/null
+++ b/Documentation/x86_64/fake-numa-for-cpusets
@@ -0,0 +1,66 @@
+Using numa=fake and CPUSets for Resource Management
+Written by David Rientjes <rientjes@cs.washington.edu>
+
+This document describes how the numa=fake x86_64 command-line option can be used
+in conjunction with cpusets for coarse memory management. Using this feature,
+you can create fake NUMA nodes that represent contiguous chunks of memory and
+assign them to cpusets and their attached tasks. This is a way of limiting the
+amount of system memory that are available to a certain class of tasks.
+
+For more information on the features of cpusets, see Documentation/cpusets.txt.
+There are a number of different configurations you can use for your needs. For
+more information on the numa=fake command line option and its various ways of
+configuring fake nodes, see Documentation/x86_64/boot-options.txt.
+
+For the purposes of this introduction, we'll assume a very primitive NUMA
+emulation setup of "numa=fake=4*512,". This will split our system memory into
+four equal chunks of 512M each that we can now use to assign to cpusets. As
+you become more familiar with using this combination for resource control,
+you'll determine a better setup to minimize the number of nodes you have to deal
+with.
+
+A machine may be split as follows with "numa=fake=4*512," as reported by dmesg:
+
+ Faking node 0 at 0000000000000000-0000000020000000 (512MB)
+ Faking node 1 at 0000000020000000-0000000040000000 (512MB)
+ Faking node 2 at 0000000040000000-0000000060000000 (512MB)
+ Faking node 3 at 0000000060000000-0000000080000000 (512MB)
+ ...
+ On node 0 totalpages: 130975
+ On node 1 totalpages: 131072
+ On node 2 totalpages: 131072
+ On node 3 totalpages: 131072
+
+Now following the instructions for mounting the cpusets filesystem from
+Documentation/cpusets.txt, you can assign fake nodes (i.e. contiguous memory
+address spaces) to individual cpusets:
+
+ [root@xroads /]# mkdir exampleset
+ [root@xroads /]# mount -t cpuset none exampleset
+ [root@xroads /]# mkdir exampleset/ddset
+ [root@xroads /]# cd exampleset/ddset
+ [root@xroads /exampleset/ddset]# echo 0-1 > cpus
+ [root@xroads /exampleset/ddset]# echo 0-1 > mems
+
+Now this cpuset, 'ddset', will only allowed access to fake nodes 0 and 1 for
+memory allocations (1G).
+
+You can now assign tasks to these cpusets to limit the memory resources
+available to them according to the fake nodes assigned as mems:
+
+ [root@xroads /exampleset/ddset]# echo $$ > tasks
+ [root@xroads /exampleset/ddset]# dd if=/dev/zero of=tmp bs=1024 count=1G
+ [1] 13425
+
+Notice the difference between the system memory usage as reported by
+/proc/meminfo between the restricted cpuset case above and the unrestricted
+case (i.e. running the same 'dd' command without assigning it to a fake NUMA
+cpuset):
+ Unrestricted Restricted
+ MemTotal: 3091900 kB 3091900 kB
+ MemFree: 42113 kB 1513236 kB
+
+This allows for coarse memory management for the tasks you assign to particular
+cpusets. Since cpusets can form a hierarchy, you can create some pretty
+interesting combinations of use-cases for various classes of tasks for your
+memory management needs.
diff --git a/Documentation/x86_64/machinecheck b/Documentation/x86_64/machinecheck
index 068a6d9..feaeaf6 100644
--- a/Documentation/x86_64/machinecheck
+++ b/Documentation/x86_64/machinecheck
@@ -36,7 +36,12 @@ between all CPUs.
check_interval
How often to poll for corrected machine check errors, in seconds
- (Note output is hexademical). Default 5 minutes.
+ (Note output is hexademical). Default 5 minutes. When the poller
+ finds MCEs it triggers an exponential speedup (poll more often) on
+ the polling interval. When the poller stops finding MCEs, it
+ triggers an exponential backoff (poll less often) on the polling
+ interval. The check_interval variable is both the initial and
+ maximum polling interval.
tolerant
Tolerance level. When a machine check exception occurs for a non
OpenPOWER on IntegriCloud