diff options
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/DocBook/libata.tmpl | 49 | ||||
-rw-r--r-- | Documentation/RCU/stallwarn.txt | 94 | ||||
-rw-r--r-- | Documentation/RCU/torture.txt | 10 | ||||
-rw-r--r-- | Documentation/RCU/trace.txt | 35 | ||||
-rw-r--r-- | Documentation/feature-removal-schedule.txt | 23 | ||||
-rw-r--r-- | Documentation/filesystems/proc.txt | 3 | ||||
-rw-r--r-- | Documentation/i2c/writing-clients | 5 | ||||
-rw-r--r-- | Documentation/input/elantech.txt | 8 | ||||
-rw-r--r-- | Documentation/kernel-parameters.txt | 8 | ||||
-rw-r--r-- | Documentation/kprobes.txt | 10 | ||||
-rw-r--r-- | Documentation/scheduler/sched-design-CFS.txt | 54 | ||||
-rw-r--r-- | Documentation/scheduler/sched-rt-group.txt | 20 | ||||
-rw-r--r-- | Documentation/trace/events.txt | 3 | ||||
-rw-r--r-- | Documentation/trace/ftrace.txt | 50 | ||||
-rw-r--r-- | Documentation/trace/kprobetrace.txt | 4 |
15 files changed, 208 insertions, 168 deletions
diff --git a/Documentation/DocBook/libata.tmpl b/Documentation/DocBook/libata.tmpl index ba99757..ff3e5be 100644 --- a/Documentation/DocBook/libata.tmpl +++ b/Documentation/DocBook/libata.tmpl @@ -107,10 +107,6 @@ void (*dev_config) (struct ata_port *, struct ata_device *); issue of SET FEATURES - XFER MODE, and prior to operation. </para> <para> - Called by ata_device_add() after ata_dev_identify() determines - a device is present. - </para> - <para> This entry may be specified as NULL in ata_port_operations. </para> @@ -154,8 +150,8 @@ unsigned int (*mode_filter) (struct ata_port *, struct ata_device *, unsigned in <sect2><title>Taskfile read/write</title> <programlisting> -void (*tf_load) (struct ata_port *ap, struct ata_taskfile *tf); -void (*tf_read) (struct ata_port *ap, struct ata_taskfile *tf); +void (*sff_tf_load) (struct ata_port *ap, struct ata_taskfile *tf); +void (*sff_tf_read) (struct ata_port *ap, struct ata_taskfile *tf); </programlisting> <para> @@ -164,36 +160,35 @@ void (*tf_read) (struct ata_port *ap, struct ata_taskfile *tf); hardware registers / DMA buffers, to obtain the current set of taskfile register values. Most drivers for taskfile-based hardware (PIO or MMIO) use - ata_tf_load() and ata_tf_read() for these hooks. + ata_sff_tf_load() and ata_sff_tf_read() for these hooks. </para> </sect2> <sect2><title>PIO data read/write</title> <programlisting> -void (*data_xfer) (struct ata_device *, unsigned char *, unsigned int, int); +void (*sff_data_xfer) (struct ata_device *, unsigned char *, unsigned int, int); </programlisting> <para> All bmdma-style drivers must implement this hook. This is the low-level operation that actually copies the data bytes during a PIO data transfer. -Typically the driver -will choose one of ata_pio_data_xfer_noirq(), ata_pio_data_xfer(), or -ata_mmio_data_xfer(). +Typically the driver will choose one of ata_sff_data_xfer_noirq(), +ata_sff_data_xfer(), or ata_sff_data_xfer32(). </para> </sect2> <sect2><title>ATA command execute</title> <programlisting> -void (*exec_command)(struct ata_port *ap, struct ata_taskfile *tf); +void (*sff_exec_command)(struct ata_port *ap, struct ata_taskfile *tf); </programlisting> <para> causes an ATA command, previously loaded with ->tf_load(), to be initiated in hardware. - Most drivers for taskfile-based hardware use ata_exec_command() + Most drivers for taskfile-based hardware use ata_sff_exec_command() for this hook. </para> @@ -218,8 +213,8 @@ command. <sect2><title>Read specific ATA shadow registers</title> <programlisting> -u8 (*check_status)(struct ata_port *ap); -u8 (*check_altstatus)(struct ata_port *ap); +u8 (*sff_check_status)(struct ata_port *ap); +u8 (*sff_check_altstatus)(struct ata_port *ap); </programlisting> <para> @@ -227,20 +222,14 @@ u8 (*check_altstatus)(struct ata_port *ap); hardware. On some hardware, reading the Status register has the side effect of clearing the interrupt condition. Most drivers for taskfile-based hardware use - ata_check_status() for this hook. - </para> - <para> - Note that because this is called from ata_device_add(), at - least a dummy function that clears device interrupts must be - provided for all drivers, even if the controller doesn't - actually have a taskfile status register. + ata_sff_check_status() for this hook. </para> </sect2> <sect2><title>Select ATA device on bus</title> <programlisting> -void (*dev_select)(struct ata_port *ap, unsigned int device); +void (*sff_dev_select)(struct ata_port *ap, unsigned int device); </programlisting> <para> @@ -251,9 +240,7 @@ void (*dev_select)(struct ata_port *ap, unsigned int device); </para> <para> Most drivers for taskfile-based hardware use - ata_std_dev_select() for this hook. Controllers which do not - support second drives on a port (such as SATA contollers) will - use ata_noop_dev_select(). + ata_sff_dev_select() for this hook. </para> </sect2> @@ -441,13 +428,13 @@ void (*irq_clear) (struct ata_port *); to struct ata_host_set. </para> <para> - Most legacy IDE drivers use ata_interrupt() for the + Most legacy IDE drivers use ata_sff_interrupt() for the irq_handler hook, which scans all ports in the host_set, determines which queued command was active (if any), and calls - ata_host_intr(ap,qc). + ata_sff_host_intr(ap,qc). </para> <para> - Most legacy IDE drivers use ata_bmdma_irq_clear() for the + Most legacy IDE drivers use ata_sff_irq_clear() for the irq_clear() hook, which simply clears the interrupt and error flags in the DMA status register. </para> @@ -496,10 +483,6 @@ void (*host_stop) (struct ata_host_set *host_set); data from port at this time. </para> <para> - Many drivers use ata_port_stop() as this hook, which frees the - PRD table. - </para> - <para> ->host_stop() is called after all ->port_stop() calls have completed. The hook must finalize hardware shutdown, release DMA and other resources, etc. diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt index 1423d25..44c6dcc9 100644 --- a/Documentation/RCU/stallwarn.txt +++ b/Documentation/RCU/stallwarn.txt @@ -3,35 +3,79 @@ Using RCU's CPU Stall Detector The CONFIG_RCU_CPU_STALL_DETECTOR kernel config parameter enables RCU's CPU stall detector, which detects conditions that unduly delay RCU grace periods. The stall detector's idea of what constitutes -"unduly delayed" is controlled by a pair of C preprocessor macros: +"unduly delayed" is controlled by a set of C preprocessor macros: RCU_SECONDS_TILL_STALL_CHECK This macro defines the period of time that RCU will wait from the beginning of a grace period until it issues an RCU CPU - stall warning. It is normally ten seconds. + stall warning. This time period is normally ten seconds. RCU_SECONDS_TILL_STALL_RECHECK This macro defines the period of time that RCU will wait after - issuing a stall warning until it issues another stall warning. - It is normally set to thirty seconds. + issuing a stall warning until it issues another stall warning + for the same stall. This time period is normally set to thirty + seconds. RCU_STALL_RAT_DELAY - The CPU stall detector tries to make the offending CPU rat on itself, - as this often gives better-quality stack traces. However, if - the offending CPU does not detect its own stall in the number - of jiffies specified by RCU_STALL_RAT_DELAY, then other CPUs will - complain. This is normally set to two jiffies. + The CPU stall detector tries to make the offending CPU print its + own warnings, as this often gives better-quality stack traces. + However, if the offending CPU does not detect its own stall in + the number of jiffies specified by RCU_STALL_RAT_DELAY, then + some other CPU will complain. This delay is normally set to + two jiffies. -The following problems can result in an RCU CPU stall warning: +When a CPU detects that it is stalling, it will print a message similar +to the following: + +INFO: rcu_sched_state detected stall on CPU 5 (t=2500 jiffies) + +This message indicates that CPU 5 detected that it was causing a stall, +and that the stall was affecting RCU-sched. This message will normally be +followed by a stack dump of the offending CPU. On TREE_RCU kernel builds, +RCU and RCU-sched are implemented by the same underlying mechanism, +while on TREE_PREEMPT_RCU kernel builds, RCU is instead implemented +by rcu_preempt_state. + +On the other hand, if the offending CPU fails to print out a stall-warning +message quickly enough, some other CPU will print a message similar to +the following: + +INFO: rcu_bh_state detected stalls on CPUs/tasks: { 3 5 } (detected by 2, 2502 jiffies) + +This message indicates that CPU 2 detected that CPUs 3 and 5 were both +causing stalls, and that the stall was affecting RCU-bh. This message +will normally be followed by stack dumps for each CPU. Please note that +TREE_PREEMPT_RCU builds can be stalled by tasks as well as by CPUs, +and that the tasks will be indicated by PID, for example, "P3421". +It is even possible for a rcu_preempt_state stall to be caused by both +CPUs -and- tasks, in which case the offending CPUs and tasks will all +be called out in the list. + +Finally, if the grace period ends just as the stall warning starts +printing, there will be a spurious stall-warning message: + +INFO: rcu_bh_state detected stalls on CPUs/tasks: { } (detected by 4, 2502 jiffies) + +This is rare, but does happen from time to time in real life. + +So your kernel printed an RCU CPU stall warning. The next question is +"What caused it?" The following problems can result in RCU CPU stall +warnings: o A CPU looping in an RCU read-side critical section. -o A CPU looping with interrupts disabled. +o A CPU looping with interrupts disabled. This condition can + result in RCU-sched and RCU-bh stalls. -o A CPU looping with preemption disabled. +o A CPU looping with preemption disabled. This condition can + result in RCU-sched stalls and, if ksoftirqd is in use, RCU-bh + stalls. + +o A CPU looping with bottom halves disabled. This condition can + result in RCU-sched and RCU-bh stalls. o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel without invoking schedule(). @@ -39,20 +83,24 @@ o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel o A bug in the RCU implementation. o A hardware failure. This is quite unlikely, but has occurred - at least once in a former life. A CPU failed in a running system, + at least once in real life. A CPU failed in a running system, becoming unresponsive, but not causing an immediate crash. This resulted in a series of RCU CPU stall warnings, eventually leading the realization that the CPU had failed. -The RCU, RCU-sched, and RCU-bh implementations have CPU stall warning. -SRCU does not do so directly, but its calls to synchronize_sched() will -result in RCU-sched detecting any CPU stalls that might be occurring. - -To diagnose the cause of the stall, inspect the stack traces. The offending -function will usually be near the top of the stack. If you have a series -of stall warnings from a single extended stall, comparing the stack traces -can often help determine where the stall is occurring, which will usually -be in the function nearest the top of the stack that stays the same from -trace to trace. +The RCU, RCU-sched, and RCU-bh implementations have CPU stall +warning. SRCU does not have its own CPU stall warnings, but its +calls to synchronize_sched() will result in RCU-sched detecting +RCU-sched-related CPU stalls. Please note that RCU only detects +CPU stalls when there is a grace period in progress. No grace period, +no CPU stall warnings. + +To diagnose the cause of the stall, inspect the stack traces. +The offending function will usually be near the top of the stack. +If you have a series of stall warnings from a single extended stall, +comparing the stack traces can often help determine where the stall +is occurring, which will usually be in the function nearest the top of +that portion of the stack which remains the same from trace to trace. +If you can reliably trigger the stall, ftrace can be quite helpful. RCU bugs can often be debugged with the help of CONFIG_RCU_TRACE. diff --git a/Documentation/RCU/torture.txt b/Documentation/RCU/torture.txt index 0e50bc2..5d90167 100644 --- a/Documentation/RCU/torture.txt +++ b/Documentation/RCU/torture.txt @@ -182,16 +182,6 @@ Similarly, sched_expedited RCU provides the following: sched_expedited-torture: Reader Pipe: 12660320201 95875 0 0 0 0 0 0 0 0 0 sched_expedited-torture: Reader Batch: 12660424885 0 0 0 0 0 0 0 0 0 0 sched_expedited-torture: Free-Block Circulation: 1090795 1090795 1090794 1090793 1090792 1090791 1090790 1090789 1090788 1090787 0 - state: -1 / 0:0 3:0 4:0 - -As before, the first four lines are similar to those for RCU. -The last line shows the task-migration state. The first number is --1 if synchronize_sched_expedited() is idle, -2 if in the process of -posting wakeups to the migration kthreads, and N when waiting on CPU N. -Each of the colon-separated fields following the "/" is a CPU:state pair. -Valid states are "0" for idle, "1" for waiting for quiescent state, -"2" for passed through quiescent state, and "3" when a race with a -CPU-hotplug event forces use of the synchronize_sched() primitive. USAGE diff --git a/Documentation/RCU/trace.txt b/Documentation/RCU/trace.txt index 8608fd8..efd8cc9 100644 --- a/Documentation/RCU/trace.txt +++ b/Documentation/RCU/trace.txt @@ -256,23 +256,23 @@ o Each element of the form "1/1 0:127 ^0" represents one struct The output of "cat rcu/rcu_pending" looks as follows: rcu_sched: - 0 np=255892 qsp=53936 cbr=0 cng=14417 gpc=10033 gps=24320 nf=6445 nn=146741 - 1 np=261224 qsp=54638 cbr=0 cng=25723 gpc=16310 gps=2849 nf=5912 nn=155792 - 2 np=237496 qsp=49664 cbr=0 cng=2762 gpc=45478 gps=1762 nf=1201 nn=136629 - 3 np=236249 qsp=48766 cbr=0 cng=286 gpc=48049 gps=1218 nf=207 nn=137723 - 4 np=221310 qsp=46850 cbr=0 cng=26 gpc=43161 gps=4634 nf=3529 nn=123110 - 5 np=237332 qsp=48449 cbr=0 cng=54 gpc=47920 gps=3252 nf=201 nn=137456 - 6 np=219995 qsp=46718 cbr=0 cng=50 gpc=42098 gps=6093 nf=4202 nn=120834 - 7 np=249893 qsp=49390 cbr=0 cng=72 gpc=38400 gps=17102 nf=41 nn=144888 + 0 np=255892 qsp=53936 rpq=85 cbr=0 cng=14417 gpc=10033 gps=24320 nf=6445 nn=146741 + 1 np=261224 qsp=54638 rpq=33 cbr=0 cng=25723 gpc=16310 gps=2849 nf=5912 nn=155792 + 2 np=237496 qsp=49664 rpq=23 cbr=0 cng=2762 gpc=45478 gps=1762 nf=1201 nn=136629 + 3 np=236249 qsp=48766 rpq=98 cbr=0 cng=286 gpc=48049 gps=1218 nf=207 nn=137723 + 4 np=221310 qsp=46850 rpq=7 cbr=0 cng=26 gpc=43161 gps=4634 nf=3529 nn=123110 + 5 np=237332 qsp=48449 rpq=9 cbr=0 cng=54 gpc=47920 gps=3252 nf=201 nn=137456 + 6 np=219995 qsp=46718 rpq=12 cbr=0 cng=50 gpc=42098 gps=6093 nf=4202 nn=120834 + 7 np=249893 qsp=49390 rpq=42 cbr=0 cng=72 gpc=38400 gps=17102 nf=41 nn=144888 rcu_bh: - 0 np=146741 qsp=1419 cbr=0 cng=6 gpc=0 gps=0 nf=2 nn=145314 - 1 np=155792 qsp=12597 cbr=0 cng=0 gpc=4 gps=8 nf=3 nn=143180 - 2 np=136629 qsp=18680 cbr=0 cng=0 gpc=7 gps=6 nf=0 nn=117936 - 3 np=137723 qsp=2843 cbr=0 cng=0 gpc=10 gps=7 nf=0 nn=134863 - 4 np=123110 qsp=12433 cbr=0 cng=0 gpc=4 gps=2 nf=0 nn=110671 - 5 np=137456 qsp=4210 cbr=0 cng=0 gpc=6 gps=5 nf=0 nn=133235 - 6 np=120834 qsp=9902 cbr=0 cng=0 gpc=6 gps=3 nf=2 nn=110921 - 7 np=144888 qsp=26336 cbr=0 cng=0 gpc=8 gps=2 nf=0 nn=118542 + 0 np=146741 qsp=1419 rpq=6 cbr=0 cng=6 gpc=0 gps=0 nf=2 nn=145314 + 1 np=155792 qsp=12597 rpq=3 cbr=0 cng=0 gpc=4 gps=8 nf=3 nn=143180 + 2 np=136629 qsp=18680 rpq=1 cbr=0 cng=0 gpc=7 gps=6 nf=0 nn=117936 + 3 np=137723 qsp=2843 rpq=0 cbr=0 cng=0 gpc=10 gps=7 nf=0 nn=134863 + 4 np=123110 qsp=12433 rpq=0 cbr=0 cng=0 gpc=4 gps=2 nf=0 nn=110671 + 5 np=137456 qsp=4210 rpq=1 cbr=0 cng=0 gpc=6 gps=5 nf=0 nn=133235 + 6 np=120834 qsp=9902 rpq=2 cbr=0 cng=0 gpc=6 gps=3 nf=2 nn=110921 + 7 np=144888 qsp=26336 rpq=0 cbr=0 cng=0 gpc=8 gps=2 nf=0 nn=118542 As always, this is once again split into "rcu_sched" and "rcu_bh" portions, with CONFIG_TREE_PREEMPT_RCU kernels having an additional @@ -284,6 +284,9 @@ o "np" is the number of times that __rcu_pending() has been invoked o "qsp" is the number of times that the RCU was waiting for a quiescent state from this CPU. +o "rpq" is the number of times that the CPU had passed through + a quiescent state, but not yet reported it to RCU. + o "cbr" is the number of times that this CPU had RCU callbacks that had passed through a grace period, and were thus ready to be invoked. diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt index ed511af..05df0b7 100644 --- a/Documentation/feature-removal-schedule.txt +++ b/Documentation/feature-removal-schedule.txt @@ -589,3 +589,26 @@ Why: Useful in 2003, implementation is a hack. Generally invoked by accident today. Seen as doing more harm than good. Who: Len Brown <len.brown@intel.com> + +---------------------------- + +What: video4linux /dev/vtx teletext API support +When: 2.6.35 +Files: drivers/media/video/saa5246a.c drivers/media/video/saa5249.c + include/linux/videotext.h +Why: The vtx device nodes have been superseded by vbi device nodes + for many years. No applications exist that use the vtx support. + Of the two i2c drivers that actually support this API the saa5249 + has been impossible to use for a year now and no known hardware + that supports this device exists. The saa5246a is theoretically + supported by the old mxb boards, but it never actually worked. + + In summary: there is no hardware that can use this API and there + are no applications actually implementing this API. + + The vtx support still reserves minors 192-223 and we would really + like to reuse those for upcoming new functionality. In the unlikely + event that new hardware appears that wants to use the functionality + provided by the vtx API, then that functionality should be build + around the sliced VBI API instead. +Who: Hans Verkuil <hverkuil@xs4all.nl> diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt index a4f30fa..1e359b6 100644 --- a/Documentation/filesystems/proc.txt +++ b/Documentation/filesystems/proc.txt @@ -316,7 +316,7 @@ address perms offset dev inode pathname 08049000-0804a000 rw-p 00001000 03:00 8312 /opt/test 0804a000-0806b000 rw-p 00000000 00:00 0 [heap] a7cb1000-a7cb2000 ---p 00000000 00:00 0 -a7cb2000-a7eb2000 rw-p 00000000 00:00 0 [threadstack:001ff4b4] +a7cb2000-a7eb2000 rw-p 00000000 00:00 0 a7eb2000-a7eb3000 ---p 00000000 00:00 0 a7eb3000-a7ed5000 rw-p 00000000 00:00 0 a7ed5000-a8008000 r-xp 00000000 03:00 4222 /lib/libc.so.6 @@ -352,7 +352,6 @@ is not associated with a file: [stack] = the stack of the main process [vdso] = the "virtual dynamic shared object", the kernel system call handler - [threadstack:xxxxxxxx] = the stack of the thread, xxxxxxxx is the stack size or if empty, the mapping is anonymous. diff --git a/Documentation/i2c/writing-clients b/Documentation/i2c/writing-clients index 3219ee0..5ebf5af 100644 --- a/Documentation/i2c/writing-clients +++ b/Documentation/i2c/writing-clients @@ -74,6 +74,11 @@ structure at all. You should use this to keep device-specific data. /* retrieve the value */ void *i2c_get_clientdata(const struct i2c_client *client); +Note that starting with kernel 2.6.34, you don't have to set the `data' field +to NULL in remove() or if probe() failed anymore. The i2c-core does this +automatically on these occasions. Those are also the only times the core will +touch this field. + Accessing the client ==================== diff --git a/Documentation/input/elantech.txt b/Documentation/input/elantech.txt index a10c3b6..56941ae 100644 --- a/Documentation/input/elantech.txt +++ b/Documentation/input/elantech.txt @@ -333,14 +333,14 @@ byte 0: byte 1: bit 7 6 5 4 3 2 1 0 - x15 x14 x13 x12 x11 x10 x9 x8 + . . . . . x10 x9 x8 byte 2: bit 7 6 5 4 3 2 1 0 x7 x6 x5 x4 x4 x2 x1 x0 - x15..x0 = absolute x value (horizontal) + x10..x0 = absolute x value (horizontal) byte 3: @@ -350,14 +350,14 @@ byte 3: byte 4: bit 7 6 5 4 3 2 1 0 - y15 y14 y13 y12 y11 y10 y8 y8 + . . . . . . y9 y8 byte 5: bit 7 6 5 4 3 2 1 0 y7 y6 y5 y4 y3 y2 y1 y0 - y15..y0 = absolute y value (vertical) + y9..y0 = absolute y value (vertical) 4.2.2 Two finger touch diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 839b21b..567b7a8 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -324,6 +324,8 @@ and is between 256 and 4096 characters. It is defined in the file they are unmapped. Otherwise they are flushed before they will be reused, which is a lot of faster + off - do not initialize any AMD IOMMU found in + the system amijoy.map= [HW,JOY] Amiga joystick support Map of devices attached to JOY0DAT and JOY1DAT @@ -784,8 +786,12 @@ and is between 256 and 4096 characters. It is defined in the file as early as possible in order to facilitate early boot debugging. - ftrace_dump_on_oops + ftrace_dump_on_oops[=orig_cpu] [FTRACE] will dump the trace buffers on oops. + If no parameter is passed, ftrace will dump + buffers of all CPUs, but if you pass orig_cpu, it will + dump only the buffer of the CPU that triggered the + oops. ftrace_filter=[function-list] [FTRACE] Limit the functions traced by the function diff --git a/Documentation/kprobes.txt b/Documentation/kprobes.txt index 2f9115c..61c291c 100644 --- a/Documentation/kprobes.txt +++ b/Documentation/kprobes.txt @@ -165,8 +165,8 @@ the user entry_handler invocation is also skipped. 1.4 How Does Jump Optimization Work? -If you configured your kernel with CONFIG_OPTPROBES=y (currently -this option is supported on x86/x86-64, non-preemptive kernel) and +If your kernel is built with CONFIG_OPTPROBES=y (currently this flag +is automatically set 'y' on x86/x86-64, non-preemptive kernel) and the "debug.kprobes_optimization" kernel parameter is set to 1 (see sysctl(8)), Kprobes tries to reduce probe-hit overhead by using a jump instruction instead of a breakpoint instruction at each probepoint. @@ -271,8 +271,6 @@ tweak the kernel's execution path, you need to suppress optimization, using one of the following techniques: - Specify an empty function for the kprobe's post_handler or break_handler. or -- Config CONFIG_OPTPROBES=n. - or - Execute 'sysctl -w debug.kprobes_optimization=n' 2. Architectures Supported @@ -307,10 +305,6 @@ it useful to "Compile the kernel with debug info" (CONFIG_DEBUG_INFO), so you can use "objdump -d -l vmlinux" to see the source-to-object code mapping. -If you want to reduce probing overhead, set "Kprobes jump optimization -support" (CONFIG_OPTPROBES) to "y". You can find this option under the -"Kprobes" line. - 4. API Reference The Kprobes API includes a "register" function and an "unregister" diff --git a/Documentation/scheduler/sched-design-CFS.txt b/Documentation/scheduler/sched-design-CFS.txt index 6f33593..8239ebb 100644 --- a/Documentation/scheduler/sched-design-CFS.txt +++ b/Documentation/scheduler/sched-design-CFS.txt @@ -211,7 +211,7 @@ provide fair CPU time to each such task group. For example, it may be desirable to first provide fair CPU time to each user on the system and then to each task belonging to a user. -CONFIG_GROUP_SCHED strives to achieve exactly that. It lets tasks to be +CONFIG_CGROUP_SCHED strives to achieve exactly that. It lets tasks to be grouped and divides CPU time fairly among such groups. CONFIG_RT_GROUP_SCHED permits to group real-time (i.e., SCHED_FIFO and @@ -220,38 +220,11 @@ SCHED_RR) tasks. CONFIG_FAIR_GROUP_SCHED permits to group CFS (i.e., SCHED_NORMAL and SCHED_BATCH) tasks. -At present, there are two (mutually exclusive) mechanisms to group tasks for -CPU bandwidth control purposes: - - - Based on user id (CONFIG_USER_SCHED) - - With this option, tasks are grouped according to their user id. - - - Based on "cgroup" pseudo filesystem (CONFIG_CGROUP_SCHED) - - This options needs CONFIG_CGROUPS to be defined, and lets the administrator + These options need CONFIG_CGROUPS to be defined, and let the administrator create arbitrary groups of tasks, using the "cgroup" pseudo filesystem. See Documentation/cgroups/cgroups.txt for more information about this filesystem. -Only one of these options to group tasks can be chosen and not both. - -When CONFIG_USER_SCHED is defined, a directory is created in sysfs for each new -user and a "cpu_share" file is added in that directory. - - # cd /sys/kernel/uids - # cat 512/cpu_share # Display user 512's CPU share - 1024 - # echo 2048 > 512/cpu_share # Modify user 512's CPU share - # cat 512/cpu_share # Display user 512's CPU share - 2048 - # - -CPU bandwidth between two users is divided in the ratio of their CPU shares. -For example: if you would like user "root" to get twice the bandwidth of user -"guest," then set the cpu_share for both the users such that "root"'s cpu_share -is twice "guest"'s cpu_share. - -When CONFIG_CGROUP_SCHED is defined, a "cpu.shares" file is created for each +When CONFIG_FAIR_GROUP_SCHED is defined, a "cpu.shares" file is created for each group created using the pseudo filesystem. See example steps below to create task groups and modify their CPU share using the "cgroups" pseudo filesystem. @@ -273,24 +246,3 @@ task groups and modify their CPU share using the "cgroups" pseudo filesystem. # #Launch gmplayer (or your favourite movie player) # echo <movie_player_pid> > multimedia/tasks - -8. Implementation note: user namespaces - -User namespaces are intended to be hierarchical. But they are currently -only partially implemented. Each of those has ramifications for CFS. - -First, since user namespaces are hierarchical, the /sys/kernel/uids -presentation is inadequate. Eventually we will likely want to use sysfs -tagging to provide private views of /sys/kernel/uids within each user -namespace. - -Second, the hierarchical nature is intended to support completely -unprivileged use of user namespaces. So if using user groups, then -we want the users in a user namespace to be children of the user -who created it. - -That is currently unimplemented. So instead, every user in a new -user namespace will receive 1024 shares just like any user in the -initial user namespace. Note that at the moment creation of a new -user namespace requires each of CAP_SYS_ADMIN, CAP_SETUID, and -CAP_SETGID. diff --git a/Documentation/scheduler/sched-rt-group.txt b/Documentation/scheduler/sched-rt-group.txt index 86eabe6..605b0d4 100644 --- a/Documentation/scheduler/sched-rt-group.txt +++ b/Documentation/scheduler/sched-rt-group.txt @@ -126,23 +126,12 @@ priority! 2.3 Basis for grouping tasks ---------------------------- -There are two compile-time settings for allocating CPU bandwidth. These are -configured using the "Basis for grouping tasks" multiple choice menu under -General setup > Group CPU Scheduler: - -a. CONFIG_USER_SCHED (aka "Basis for grouping tasks" = "user id") - -This lets you use the virtual files under -"/sys/kernel/uids/<uid>/cpu_rt_runtime_us" to control he CPU time reserved for -each user . - -The other option is: - -.o CONFIG_CGROUP_SCHED (aka "Basis for grouping tasks" = "Control groups") +Enabling CONFIG_RT_GROUP_SCHED lets you explicitly allocate real +CPU bandwidth to task groups. This uses the /cgroup virtual file system and "/cgroup/<cgroup>/cpu.rt_runtime_us" to control the CPU time reserved for each -control group instead. +control group. For more information on working with control groups, you should read Documentation/cgroups/cgroups.txt as well. @@ -161,8 +150,7 @@ For now, this can be simplified to just the following (but see Future plans): =============== There is work in progress to make the scheduling period for each group -("/sys/kernel/uids/<uid>/cpu_rt_period_us" or -"/cgroup/<cgroup>/cpu.rt_period_us" respectively) configurable as well. +("/cgroup/<cgroup>/cpu.rt_period_us") configurable as well. The constraint on the period is that a subgroup must have a smaller or equal period to its parent. But realistically its not very useful _yet_ diff --git a/Documentation/trace/events.txt b/Documentation/trace/events.txt index 02ac6ed..778ddf3 100644 --- a/Documentation/trace/events.txt +++ b/Documentation/trace/events.txt @@ -90,7 +90,8 @@ In order to facilitate early boot debugging, use boot option: trace_event=[event-list] -The format of this boot option is the same as described in section 2.1. +event-list is a comma separated list of events. See section 2.1 for event +format. 3. Defining an event-enabled tracepoint ======================================= diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt index 03485bf..557c1ed 100644 --- a/Documentation/trace/ftrace.txt +++ b/Documentation/trace/ftrace.txt @@ -155,6 +155,9 @@ of ftrace. Here is a list of some of the key files: to be traced. Echoing names of functions into this file will limit the trace to only those functions. + This interface also allows for commands to be used. See the + "Filter commands" section for more details. + set_ftrace_notrace: This has an effect opposite to that of @@ -1337,12 +1340,14 @@ ftrace_dump_on_oops must be set. To set ftrace_dump_on_oops, one can either use the sysctl function or set it via the proc system interface. - sysctl kernel.ftrace_dump_on_oops=1 + sysctl kernel.ftrace_dump_on_oops=n or - echo 1 > /proc/sys/kernel/ftrace_dump_on_oops + echo n > /proc/sys/kernel/ftrace_dump_on_oops +If n = 1, ftrace will dump buffers of all CPUs, if n = 2 ftrace will +only dump the buffer of the CPU that triggered the oops. Here's an example of such a dump after a null pointer dereference in a kernel module: @@ -1822,6 +1827,47 @@ this special filter via: echo > set_graph_function +Filter commands +--------------- + +A few commands are supported by the set_ftrace_filter interface. +Trace commands have the following format: + +<function>:<command>:<parameter> + +The following commands are supported: + +- mod + This command enables function filtering per module. The + parameter defines the module. For example, if only the write* + functions in the ext3 module are desired, run: + + echo 'write*:mod:ext3' > set_ftrace_filter + + This command interacts with the filter in the same way as + filtering based on function names. Thus, adding more functions + in a different module is accomplished by appending (>>) to the + filter file. Remove specific module functions by prepending + '!': + + echo '!writeback*:mod:ext3' >> set_ftrace_filter + +- traceon/traceoff + These commands turn tracing on and off when the specified + functions are hit. The parameter determines how many times the + tracing system is turned on and off. If unspecified, there is + no limit. For example, to disable tracing when a schedule bug + is hit the first 5 times, run: + + echo '__schedule_bug:traceoff:5' > set_ftrace_filter + + These commands are cumulative whether or not they are appended + to set_ftrace_filter. To remove a command, prepend it by '!' + and drop the parameter: + + echo '!__schedule_bug:traceoff' > set_ftrace_filter + + trace_pipe ---------- diff --git a/Documentation/trace/kprobetrace.txt b/Documentation/trace/kprobetrace.txt index a9100b2..ec94748 100644 --- a/Documentation/trace/kprobetrace.txt +++ b/Documentation/trace/kprobetrace.txt @@ -40,7 +40,9 @@ Synopsis of kprobe_events $stack : Fetch stack address. $retval : Fetch return value.(*) +|-offs(FETCHARG) : Fetch memory at FETCHARG +|- offs address.(**) - NAME=FETCHARG: Set NAME as the argument name of FETCHARG. + NAME=FETCHARG : Set NAME as the argument name of FETCHARG. + FETCHARG:TYPE : Set TYPE as the type of FETCHARG. Currently, basic types + (u8/u16/u32/u64/s8/s16/s32/s64) are supported. (*) only for return probe. (**) this is useful for fetching a field of data structures. |