diff options
551 files changed, 11364 insertions, 7755 deletions
diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt index 1423d25..44c6dcc9 100644 --- a/Documentation/RCU/stallwarn.txt +++ b/Documentation/RCU/stallwarn.txt @@ -3,35 +3,79 @@ Using RCU's CPU Stall Detector The CONFIG_RCU_CPU_STALL_DETECTOR kernel config parameter enables RCU's CPU stall detector, which detects conditions that unduly delay RCU grace periods. The stall detector's idea of what constitutes -"unduly delayed" is controlled by a pair of C preprocessor macros: +"unduly delayed" is controlled by a set of C preprocessor macros: RCU_SECONDS_TILL_STALL_CHECK This macro defines the period of time that RCU will wait from the beginning of a grace period until it issues an RCU CPU - stall warning. It is normally ten seconds. + stall warning. This time period is normally ten seconds. RCU_SECONDS_TILL_STALL_RECHECK This macro defines the period of time that RCU will wait after - issuing a stall warning until it issues another stall warning. - It is normally set to thirty seconds. + issuing a stall warning until it issues another stall warning + for the same stall. This time period is normally set to thirty + seconds. RCU_STALL_RAT_DELAY - The CPU stall detector tries to make the offending CPU rat on itself, - as this often gives better-quality stack traces. However, if - the offending CPU does not detect its own stall in the number - of jiffies specified by RCU_STALL_RAT_DELAY, then other CPUs will - complain. This is normally set to two jiffies. + The CPU stall detector tries to make the offending CPU print its + own warnings, as this often gives better-quality stack traces. + However, if the offending CPU does not detect its own stall in + the number of jiffies specified by RCU_STALL_RAT_DELAY, then + some other CPU will complain. This delay is normally set to + two jiffies. -The following problems can result in an RCU CPU stall warning: +When a CPU detects that it is stalling, it will print a message similar +to the following: + +INFO: rcu_sched_state detected stall on CPU 5 (t=2500 jiffies) + +This message indicates that CPU 5 detected that it was causing a stall, +and that the stall was affecting RCU-sched. This message will normally be +followed by a stack dump of the offending CPU. On TREE_RCU kernel builds, +RCU and RCU-sched are implemented by the same underlying mechanism, +while on TREE_PREEMPT_RCU kernel builds, RCU is instead implemented +by rcu_preempt_state. + +On the other hand, if the offending CPU fails to print out a stall-warning +message quickly enough, some other CPU will print a message similar to +the following: + +INFO: rcu_bh_state detected stalls on CPUs/tasks: { 3 5 } (detected by 2, 2502 jiffies) + +This message indicates that CPU 2 detected that CPUs 3 and 5 were both +causing stalls, and that the stall was affecting RCU-bh. This message +will normally be followed by stack dumps for each CPU. Please note that +TREE_PREEMPT_RCU builds can be stalled by tasks as well as by CPUs, +and that the tasks will be indicated by PID, for example, "P3421". +It is even possible for a rcu_preempt_state stall to be caused by both +CPUs -and- tasks, in which case the offending CPUs and tasks will all +be called out in the list. + +Finally, if the grace period ends just as the stall warning starts +printing, there will be a spurious stall-warning message: + +INFO: rcu_bh_state detected stalls on CPUs/tasks: { } (detected by 4, 2502 jiffies) + +This is rare, but does happen from time to time in real life. + +So your kernel printed an RCU CPU stall warning. The next question is +"What caused it?" The following problems can result in RCU CPU stall +warnings: o A CPU looping in an RCU read-side critical section. -o A CPU looping with interrupts disabled. +o A CPU looping with interrupts disabled. This condition can + result in RCU-sched and RCU-bh stalls. -o A CPU looping with preemption disabled. +o A CPU looping with preemption disabled. This condition can + result in RCU-sched stalls and, if ksoftirqd is in use, RCU-bh + stalls. + +o A CPU looping with bottom halves disabled. This condition can + result in RCU-sched and RCU-bh stalls. o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel without invoking schedule(). @@ -39,20 +83,24 @@ o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel o A bug in the RCU implementation. o A hardware failure. This is quite unlikely, but has occurred - at least once in a former life. A CPU failed in a running system, + at least once in real life. A CPU failed in a running system, becoming unresponsive, but not causing an immediate crash. This resulted in a series of RCU CPU stall warnings, eventually leading the realization that the CPU had failed. -The RCU, RCU-sched, and RCU-bh implementations have CPU stall warning. -SRCU does not do so directly, but its calls to synchronize_sched() will -result in RCU-sched detecting any CPU stalls that might be occurring. - -To diagnose the cause of the stall, inspect the stack traces. The offending -function will usually be near the top of the stack. If you have a series -of stall warnings from a single extended stall, comparing the stack traces -can often help determine where the stall is occurring, which will usually -be in the function nearest the top of the stack that stays the same from -trace to trace. +The RCU, RCU-sched, and RCU-bh implementations have CPU stall +warning. SRCU does not have its own CPU stall warnings, but its +calls to synchronize_sched() will result in RCU-sched detecting +RCU-sched-related CPU stalls. Please note that RCU only detects +CPU stalls when there is a grace period in progress. No grace period, +no CPU stall warnings. + +To diagnose the cause of the stall, inspect the stack traces. +The offending function will usually be near the top of the stack. +If you have a series of stall warnings from a single extended stall, +comparing the stack traces can often help determine where the stall +is occurring, which will usually be in the function nearest the top of +that portion of the stack which remains the same from trace to trace. +If you can reliably trigger the stall, ftrace can be quite helpful. RCU bugs can often be debugged with the help of CONFIG_RCU_TRACE. diff --git a/Documentation/RCU/torture.txt b/Documentation/RCU/torture.txt index 0e50bc2..5d90167 100644 --- a/Documentation/RCU/torture.txt +++ b/Documentation/RCU/torture.txt @@ -182,16 +182,6 @@ Similarly, sched_expedited RCU provides the following: sched_expedited-torture: Reader Pipe: 12660320201 95875 0 0 0 0 0 0 0 0 0 sched_expedited-torture: Reader Batch: 12660424885 0 0 0 0 0 0 0 0 0 0 sched_expedited-torture: Free-Block Circulation: 1090795 1090795 1090794 1090793 1090792 1090791 1090790 1090789 1090788 1090787 0 - state: -1 / 0:0 3:0 4:0 - -As before, the first four lines are similar to those for RCU. -The last line shows the task-migration state. The first number is --1 if synchronize_sched_expedited() is idle, -2 if in the process of -posting wakeups to the migration kthreads, and N when waiting on CPU N. -Each of the colon-separated fields following the "/" is a CPU:state pair. -Valid states are "0" for idle, "1" for waiting for quiescent state, -"2" for passed through quiescent state, and "3" when a race with a -CPU-hotplug event forces use of the synchronize_sched() primitive. USAGE diff --git a/Documentation/RCU/trace.txt b/Documentation/RCU/trace.txt index 8608fd8..efd8cc9 100644 --- a/Documentation/RCU/trace.txt +++ b/Documentation/RCU/trace.txt @@ -256,23 +256,23 @@ o Each element of the form "1/1 0:127 ^0" represents one struct The output of "cat rcu/rcu_pending" looks as follows: rcu_sched: - 0 np=255892 qsp=53936 cbr=0 cng=14417 gpc=10033 gps=24320 nf=6445 nn=146741 - 1 np=261224 qsp=54638 cbr=0 cng=25723 gpc=16310 gps=2849 nf=5912 nn=155792 - 2 np=237496 qsp=49664 cbr=0 cng=2762 gpc=45478 gps=1762 nf=1201 nn=136629 - 3 np=236249 qsp=48766 cbr=0 cng=286 gpc=48049 gps=1218 nf=207 nn=137723 - 4 np=221310 qsp=46850 cbr=0 cng=26 gpc=43161 gps=4634 nf=3529 nn=123110 - 5 np=237332 qsp=48449 cbr=0 cng=54 gpc=47920 gps=3252 nf=201 nn=137456 - 6 np=219995 qsp=46718 cbr=0 cng=50 gpc=42098 gps=6093 nf=4202 nn=120834 - 7 np=249893 qsp=49390 cbr=0 cng=72 gpc=38400 gps=17102 nf=41 nn=144888 + 0 np=255892 qsp=53936 rpq=85 cbr=0 cng=14417 gpc=10033 gps=24320 nf=6445 nn=146741 + 1 np=261224 qsp=54638 rpq=33 cbr=0 cng=25723 gpc=16310 gps=2849 nf=5912 nn=155792 + 2 np=237496 qsp=49664 rpq=23 cbr=0 cng=2762 gpc=45478 gps=1762 nf=1201 nn=136629 + 3 np=236249 qsp=48766 rpq=98 cbr=0 cng=286 gpc=48049 gps=1218 nf=207 nn=137723 + 4 np=221310 qsp=46850 rpq=7 cbr=0 cng=26 gpc=43161 gps=4634 nf=3529 nn=123110 + 5 np=237332 qsp=48449 rpq=9 cbr=0 cng=54 gpc=47920 gps=3252 nf=201 nn=137456 + 6 np=219995 qsp=46718 rpq=12 cbr=0 cng=50 gpc=42098 gps=6093 nf=4202 nn=120834 + 7 np=249893 qsp=49390 rpq=42 cbr=0 cng=72 gpc=38400 gps=17102 nf=41 nn=144888 rcu_bh: - 0 np=146741 qsp=1419 cbr=0 cng=6 gpc=0 gps=0 nf=2 nn=145314 - 1 np=155792 qsp=12597 cbr=0 cng=0 gpc=4 gps=8 nf=3 nn=143180 - 2 np=136629 qsp=18680 cbr=0 cng=0 gpc=7 gps=6 nf=0 nn=117936 - 3 np=137723 qsp=2843 cbr=0 cng=0 gpc=10 gps=7 nf=0 nn=134863 - 4 np=123110 qsp=12433 cbr=0 cng=0 gpc=4 gps=2 nf=0 nn=110671 - 5 np=137456 qsp=4210 cbr=0 cng=0 gpc=6 gps=5 nf=0 nn=133235 - 6 np=120834 qsp=9902 cbr=0 cng=0 gpc=6 gps=3 nf=2 nn=110921 - 7 np=144888 qsp=26336 cbr=0 cng=0 gpc=8 gps=2 nf=0 nn=118542 + 0 np=146741 qsp=1419 rpq=6 cbr=0 cng=6 gpc=0 gps=0 nf=2 nn=145314 + 1 np=155792 qsp=12597 rpq=3 cbr=0 cng=0 gpc=4 gps=8 nf=3 nn=143180 + 2 np=136629 qsp=18680 rpq=1 cbr=0 cng=0 gpc=7 gps=6 nf=0 nn=117936 + 3 np=137723 qsp=2843 rpq=0 cbr=0 cng=0 gpc=10 gps=7 nf=0 nn=134863 + 4 np=123110 qsp=12433 rpq=0 cbr=0 cng=0 gpc=4 gps=2 nf=0 nn=110671 + 5 np=137456 qsp=4210 rpq=1 cbr=0 cng=0 gpc=6 gps=5 nf=0 nn=133235 + 6 np=120834 qsp=9902 rpq=2 cbr=0 cng=0 gpc=6 gps=3 nf=2 nn=110921 + 7 np=144888 qsp=26336 rpq=0 cbr=0 cng=0 gpc=8 gps=2 nf=0 nn=118542 As always, this is once again split into "rcu_sched" and "rcu_bh" portions, with CONFIG_TREE_PREEMPT_RCU kernels having an additional @@ -284,6 +284,9 @@ o "np" is the number of times that __rcu_pending() has been invoked o "qsp" is the number of times that the RCU was waiting for a quiescent state from this CPU. +o "rpq" is the number of times that the CPU had passed through + a quiescent state, but not yet reported it to RCU. + o "cbr" is the number of times that this CPU had RCU callbacks that had passed through a grace period, and were thus ready to be invoked. diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt index ed511af..05df0b7 100644 --- a/Documentation/feature-removal-schedule.txt +++ b/Documentation/feature-removal-schedule.txt @@ -589,3 +589,26 @@ Why: Useful in 2003, implementation is a hack. Generally invoked by accident today. Seen as doing more harm than good. Who: Len Brown <len.brown@intel.com> + +---------------------------- + +What: video4linux /dev/vtx teletext API support +When: 2.6.35 +Files: drivers/media/video/saa5246a.c drivers/media/video/saa5249.c + include/linux/videotext.h +Why: The vtx device nodes have been superseded by vbi device nodes + for many years. No applications exist that use the vtx support. + Of the two i2c drivers that actually support this API the saa5249 + has been impossible to use for a year now and no known hardware + that supports this device exists. The saa5246a is theoretically + supported by the old mxb boards, but it never actually worked. + + In summary: there is no hardware that can use this API and there + are no applications actually implementing this API. + + The vtx support still reserves minors 192-223 and we would really + like to reuse those for upcoming new functionality. In the unlikely + event that new hardware appears that wants to use the functionality + provided by the vtx API, then that functionality should be build + around the sliced VBI API instead. +Who: Hans Verkuil <hverkuil@xs4all.nl> diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt index a4f30fa..1e359b6 100644 --- a/Documentation/filesystems/proc.txt +++ b/Documentation/filesystems/proc.txt @@ -316,7 +316,7 @@ address perms offset dev inode pathname 08049000-0804a000 rw-p 00001000 03:00 8312 /opt/test 0804a000-0806b000 rw-p 00000000 00:00 0 [heap] a7cb1000-a7cb2000 ---p 00000000 00:00 0 -a7cb2000-a7eb2000 rw-p 00000000 00:00 0 [threadstack:001ff4b4] +a7cb2000-a7eb2000 rw-p 00000000 00:00 0 a7eb2000-a7eb3000 ---p 00000000 00:00 0 a7eb3000-a7ed5000 rw-p 00000000 00:00 0 a7ed5000-a8008000 r-xp 00000000 03:00 4222 /lib/libc.so.6 @@ -352,7 +352,6 @@ is not associated with a file: [stack] = the stack of the main process [vdso] = the "virtual dynamic shared object", the kernel system call handler - [threadstack:xxxxxxxx] = the stack of the thread, xxxxxxxx is the stack size or if empty, the mapping is anonymous. diff --git a/Documentation/intel_txt.txt b/Documentation/intel_txt.txt index f40a1f0..87c8990 100644 --- a/Documentation/intel_txt.txt +++ b/Documentation/intel_txt.txt @@ -161,13 +161,15 @@ o In order to put a system into any of the sleep states after a TXT has been restored, it will restore the TPM PCRs and then transfer control back to the kernel's S3 resume vector. In order to preserve system integrity across S3, the kernel - provides tboot with a set of memory ranges (kernel - code/data/bss, S3 resume code, and AP trampoline) that tboot - will calculate a MAC (message authentication code) over and then - seal with the TPM. On resume and once the measured environment - has been re-established, tboot will re-calculate the MAC and - verify it against the sealed value. Tboot's policy determines - what happens if the verification fails. + provides tboot with a set of memory ranges (RAM and RESERVED_KERN + in the e820 table, but not any memory that BIOS might alter over + the S3 transition) that tboot will calculate a MAC (message + authentication code) over and then seal with the TPM. On resume + and once the measured environment has been re-established, tboot + will re-calculate the MAC and verify it against the sealed value. + Tboot's policy determines what happens if the verification fails. + Note that the c/s 194 of tboot which has the new MAC code supports + this. That's pretty much it for TXT support. diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 839b21b..567b7a8 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -324,6 +324,8 @@ and is between 256 and 4096 characters. It is defined in the file they are unmapped. Otherwise they are flushed before they will be reused, which is a lot of faster + off - do not initialize any AMD IOMMU found in + the system amijoy.map= [HW,JOY] Amiga joystick support Map of devices attached to JOY0DAT and JOY1DAT @@ -784,8 +786,12 @@ and is between 256 and 4096 characters. It is defined in the file as early as possible in order to facilitate early boot debugging. - ftrace_dump_on_oops + ftrace_dump_on_oops[=orig_cpu] [FTRACE] will dump the trace buffers on oops. + If no parameter is passed, ftrace will dump + buffers of all CPUs, but if you pass orig_cpu, it will + dump only the buffer of the CPU that triggered the + oops. ftrace_filter=[function-list] [FTRACE] Limit the functions traced by the function diff --git a/Documentation/rbtree.txt b/Documentation/rbtree.txt index aae8355..221f38b 100644 --- a/Documentation/rbtree.txt +++ b/Documentation/rbtree.txt @@ -190,3 +190,61 @@ Example: for (node = rb_first(&mytree); node; node = rb_next(node)) printk("key=%s\n", rb_entry(node, struct mytype, node)->keystring); +Support for Augmented rbtrees +----------------------------- + +Augmented rbtree is an rbtree with "some" additional data stored in each node. +This data can be used to augment some new functionality to rbtree. +Augmented rbtree is an optional feature built on top of basic rbtree +infrastructure. rbtree user who wants this feature will have an augment +callback function in rb_root initialized. + +This callback function will be called from rbtree core routines whenever +a node has a change in one or both of its children. It is the responsibility +of the callback function to recalculate the additional data that is in the +rb node using new children information. Note that if this new additional +data affects the parent node's additional data, then callback function has +to handle it and do the recursive updates. + + +Interval tree is an example of augmented rb tree. Reference - +"Introduction to Algorithms" by Cormen, Leiserson, Rivest and Stein. +More details about interval trees: + +Classical rbtree has a single key and it cannot be directly used to store +interval ranges like [lo:hi] and do a quick lookup for any overlap with a new +lo:hi or to find whether there is an exact match for a new lo:hi. + +However, rbtree can be augmented to store such interval ranges in a structured +way making it possible to do efficient lookup and exact match. + +This "extra information" stored in each node is the maximum hi +(max_hi) value among all the nodes that are its descendents. This +information can be maintained at each node just be looking at the node +and its immediate children. And this will be used in O(log n) lookup +for lowest match (lowest start address among all possible matches) +with something like: + +find_lowest_match(lo, hi, node) +{ + lowest_match = NULL; + while (node) { + if (max_hi(node->left) > lo) { + // Lowest overlap if any must be on left side + node = node->left; + } else if (overlap(lo, hi, node)) { + lowest_match = node; + break; + } else if (lo > node->lo) { + // Lowest overlap if any must be on right side + node = node->right; + } else { + break; + } + } + return lowest_match; +} + +Finding exact match will be to first find lowest match and then to follow +successor nodes looking for exact match, until the start of a node is beyond +the hi value we are looking for. diff --git a/Documentation/scheduler/sched-design-CFS.txt b/Documentation/scheduler/sched-design-CFS.txt index 6f33593..8239ebb 100644 --- a/Documentation/scheduler/sched-design-CFS.txt +++ b/Documentation/scheduler/sched-design-CFS.txt @@ -211,7 +211,7 @@ provide fair CPU time to each such task group. For example, it may be desirable to first provide fair CPU time to each user on the system and then to each task belonging to a user. -CONFIG_GROUP_SCHED strives to achieve exactly that. It lets tasks to be +CONFIG_CGROUP_SCHED strives to achieve exactly that. It lets tasks to be grouped and divides CPU time fairly among such groups. CONFIG_RT_GROUP_SCHED permits to group real-time (i.e., SCHED_FIFO and @@ -220,38 +220,11 @@ SCHED_RR) tasks. CONFIG_FAIR_GROUP_SCHED permits to group CFS (i.e., SCHED_NORMAL and SCHED_BATCH) tasks. -At present, there are two (mutually exclusive) mechanisms to group tasks for -CPU bandwidth control purposes: - - - Based on user id (CONFIG_USER_SCHED) - - With this option, tasks are grouped according to their user id. - - - Based on "cgroup" pseudo filesystem (CONFIG_CGROUP_SCHED) - - This options needs CONFIG_CGROUPS to be defined, and lets the administrator + These options need CONFIG_CGROUPS to be defined, and let the administrator create arbitrary groups of tasks, using the "cgroup" pseudo filesystem. See Documentation/cgroups/cgroups.txt for more information about this filesystem. -Only one of these options to group tasks can be chosen and not both. - -When CONFIG_USER_SCHED is defined, a directory is created in sysfs for each new -user and a "cpu_share" file is added in that directory. - - # cd /sys/kernel/uids - # cat 512/cpu_share # Display user 512's CPU share - 1024 - # echo 2048 > 512/cpu_share # Modify user 512's CPU share - # cat 512/cpu_share # Display user 512's CPU share - 2048 - # - -CPU bandwidth between two users is divided in the ratio of their CPU shares. -For example: if you would like user "root" to get twice the bandwidth of user -"guest," then set the cpu_share for both the users such that "root"'s cpu_share -is twice "guest"'s cpu_share. - -When CONFIG_CGROUP_SCHED is defined, a "cpu.shares" file is created for each +When CONFIG_FAIR_GROUP_SCHED is defined, a "cpu.shares" file is created for each group created using the pseudo filesystem. See example steps below to create task groups and modify their CPU share using the "cgroups" pseudo filesystem. @@ -273,24 +246,3 @@ task groups and modify their CPU share using the "cgroups" pseudo filesystem. # #Launch gmplayer (or your favourite movie player) # echo <movie_player_pid> > multimedia/tasks - -8. Implementation note: user namespaces - -User namespaces are intended to be hierarchical. But they are currently -only partially implemented. Each of those has ramifications for CFS. - -First, since user namespaces are hierarchical, the /sys/kernel/uids -presentation is inadequate. Eventually we will likely want to use sysfs -tagging to provide private views of /sys/kernel/uids within each user -namespace. - -Second, the hierarchical nature is intended to support completely -unprivileged use of user namespaces. So if using user groups, then -we want the users in a user namespace to be children of the user -who created it. - -That is currently unimplemented. So instead, every user in a new -user namespace will receive 1024 shares just like any user in the -initial user namespace. Note that at the moment creation of a new -user namespace requires each of CAP_SYS_ADMIN, CAP_SETUID, and -CAP_SETGID. diff --git a/Documentation/scheduler/sched-rt-group.txt b/Documentation/scheduler/sched-rt-group.txt index 86eabe6..605b0d4 100644 --- a/Documentation/scheduler/sched-rt-group.txt +++ b/Documentation/scheduler/sched-rt-group.txt @@ -126,23 +126,12 @@ priority! 2.3 Basis for grouping tasks ---------------------------- -There are two compile-time settings for allocating CPU bandwidth. These are -configured using the "Basis for grouping tasks" multiple choice menu under -General setup > Group CPU Scheduler: - -a. CONFIG_USER_SCHED (aka "Basis for grouping tasks" = "user id") - -This lets you use the virtual files under -"/sys/kernel/uids/<uid>/cpu_rt_runtime_us" to control he CPU time reserved for -each user . - -The other option is: - -.o CONFIG_CGROUP_SCHED (aka "Basis for grouping tasks" = "Control groups") +Enabling CONFIG_RT_GROUP_SCHED lets you explicitly allocate real +CPU bandwidth to task groups. This uses the /cgroup virtual file system and "/cgroup/<cgroup>/cpu.rt_runtime_us" to control the CPU time reserved for each -control group instead. +control group. For more information on working with control groups, you should read Documentation/cgroups/cgroups.txt as well. @@ -161,8 +150,7 @@ For now, this can be simplified to just the following (but see Future plans): =============== There is work in progress to make the scheduling period for each group -("/sys/kernel/uids/<uid>/cpu_rt_period_us" or -"/cgroup/<cgroup>/cpu.rt_period_us" respectively) configurable as well. +("/cgroup/<cgroup>/cpu.rt_period_us") configurable as well. The constraint on the period is that a subgroup must have a smaller or equal period to its parent. But realistically its not very useful _yet_ diff --git a/Documentation/trace/events.txt b/Documentation/trace/events.txt index 02ac6ed..778ddf3 100644 --- a/Documentation/trace/events.txt +++ b/Documentation/trace/events.txt @@ -90,7 +90,8 @@ In order to facilitate early boot debugging, use boot option: trace_event=[event-list] -The format of this boot option is the same as described in section 2.1. +event-list is a comma separated list of events. See section 2.1 for event +format. 3. Defining an event-enabled tracepoint ======================================= diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt index 03485bf..557c1ed 100644 --- a/Documentation/trace/ftrace.txt +++ b/Documentation/trace/ftrace.txt @@ -155,6 +155,9 @@ of ftrace. Here is a list of some of the key files: to be traced. Echoing names of functions into this file will limit the trace to only those functions. + This interface also allows for commands to be used. See the + "Filter commands" section for more details. + set_ftrace_notrace: This has an effect opposite to that of @@ -1337,12 +1340,14 @@ ftrace_dump_on_oops must be set. To set ftrace_dump_on_oops, one can either use the sysctl function or set it via the proc system interface. - sysctl kernel.ftrace_dump_on_oops=1 + sysctl kernel.ftrace_dump_on_oops=n or - echo 1 > /proc/sys/kernel/ftrace_dump_on_oops + echo n > /proc/sys/kernel/ftrace_dump_on_oops +If n = 1, ftrace will dump buffers of all CPUs, if n = 2 ftrace will +only dump the buffer of the CPU that triggered the oops. Here's an example of such a dump after a null pointer dereference in a kernel module: @@ -1822,6 +1827,47 @@ this special filter via: echo > set_graph_function +Filter commands +--------------- + +A few commands are supported by the set_ftrace_filter interface. +Trace commands have the following format: + +<function>:<command>:<parameter> + +The following commands are supported: + +- mod + This command enables function filtering per module. The + parameter defines the module. For example, if only the write* + functions in the ext3 module are desired, run: + + echo 'write*:mod:ext3' > set_ftrace_filter + + This command interacts with the filter in the same way as + filtering based on function names. Thus, adding more functions + in a different module is accomplished by appending (>>) to the + filter file. Remove specific module functions by prepending + '!': + + echo '!writeback*:mod:ext3' >> set_ftrace_filter + +- traceon/traceoff + These commands turn tracing on and off when the specified + functions are hit. The parameter determines how many times the + tracing system is turned on and off. If unspecified, there is + no limit. For example, to disable tracing when a schedule bug + is hit the first 5 times, run: + + echo '__schedule_bug:traceoff:5' > set_ftrace_filter + + These commands are cumulative whether or not they are appended + to set_ftrace_filter. To remove a command, prepend it by '!' + and drop the parameter: + + echo '!__schedule_bug:traceoff' > set_ftrace_filter + + trace_pipe ---------- diff --git a/MAINTAINERS b/MAINTAINERS index 5085c90..3d2651b 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -2953,6 +2953,17 @@ S: Odd Fixes F: Documentation/networking/README.ipw2200 F: drivers/net/wireless/ipw2x00/ipw2200.* +INTEL(R) TRUSTED EXECUTION TECHNOLOGY (TXT) +M: Joseph Cihula <joseph.cihula@intel.com> +M: Shane Wang <shane.wang@intel.com> +L: tboot-devel@lists.sourceforge.net +W: http://tboot.sourceforge.net +T: Mercurial http://www.bughost.org/repos.hg/tboot.hg +S: Supported +F: Documentation/intel_txt.txt +F: include/linux/tboot.h +F: arch/x86/kernel/tboot.c + INTEL WIRELESS WIMAX CONNECTION 2400 M: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com> M: linux-wimax@intel.com @@ -4165,6 +4176,7 @@ OPROFILE M: Robert Richter <robert.richter@amd.com> L: oprofile-list@lists.sf.net S: Maintained +F: arch/*/include/asm/oprofile*.h F: arch/*/oprofile/ F: drivers/oprofile/ F: include/linux/oprofile.h @@ -5492,7 +5504,7 @@ S: Maintained F: drivers/mmc/host/tmio_mmc.* TMPFS (SHMEM FILESYSTEM) -M: Hugh Dickins <hugh.dickins@tiscali.co.uk> +M: Hugh Dickins <hughd@google.com> L: linux-mm@kvack.org S: Maintained F: include/linux/shmem_fs.h @@ -1,7 +1,7 @@ VERSION = 2 PATCHLEVEL = 6 SUBLEVEL = 34 -EXTRAVERSION = -rc6 +EXTRAVERSION = NAME = Sheep on Meth # *DOCUMENTATION* diff --git a/arch/alpha/include/asm/atomic.h b/arch/alpha/include/asm/atomic.h index 610dff4..e756d04 100644 --- a/arch/alpha/include/asm/atomic.h +++ b/arch/alpha/include/asm/atomic.h @@ -17,8 +17,8 @@ #define ATOMIC_INIT(i) ( (atomic_t) { (i) } ) #define ATOMIC64_INIT(i) ( (atomic64_t) { (i) } ) -#define atomic_read(v) ((v)->counter + 0) -#define atomic64_read(v) ((v)->counter + 0) +#define atomic_read(v) (*(volatile int *)&(v)->counter) +#define atomic64_read(v) (*(volatile long *)&(v)->counter) #define atomic_set(v,i) ((v)->counter = (i)) #define atomic64_set(v,i) ((v)->counter = (i)) diff --git a/arch/alpha/include/asm/bitops.h b/arch/alpha/include/asm/bitops.h index 15f3ae2..296da1d 100644 --- a/arch/alpha/include/asm/bitops.h +++ b/arch/alpha/include/asm/bitops.h @@ -405,29 +405,31 @@ static inline int fls(int x) #if defined(CONFIG_ALPHA_EV6) && defined(CONFIG_ALPHA_EV67) /* Whee. EV67 can calculate it directly. */ -static inline unsigned long hweight64(unsigned long w) +static inline unsigned long __arch_hweight64(unsigned long w) { return __kernel_ctpop(w); } -static inline unsigned int hweight32(unsigned int w) +static inline unsigned int __arch_weight32(unsigned int w) { - return hweight64(w); + return __arch_hweight64(w); } -static inline unsigned int hweight16(unsigned int w) +static inline unsigned int __arch_hweight16(unsigned int w) { - return hweight64(w & 0xffff); + return __arch_hweight64(w & 0xffff); } -static inline unsigned int hweight8(unsigned int w) +static inline unsigned int __arch_hweight8(unsigned int w) { - return hweight64(w & 0xff); + return __arch_hweight64(w & 0xff); } #else -#include <asm-generic/bitops/hweight.h> +#include <asm-generic/bitops/arch_hweight.h> #endif +#include <asm-generic/bitops/const_hweight.h> + #endif /* __KERNEL__ */ #include <asm-generic/bitops/find.h> diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S index 6ab6b33..c5191b1 100644 --- a/arch/arm/boot/compressed/head.S +++ b/arch/arm/boot/compressed/head.S @@ -685,8 +685,8 @@ proc_types: W(b) __armv4_mmu_cache_off W(b) __armv4_mmu_cache_flush - .word 0x56056930 - .word 0xff0ffff0 @ PXA935 + .word 0x56056900 + .word 0xffffff00 @ PXA9xx W(b) __armv4_mmu_cache_on W(b) __armv4_mmu_cache_off W(b) __armv4_mmu_cache_flush @@ -697,12 +697,6 @@ proc_types: W(b) __armv4_mmu_cache_off W(b) __armv5tej_mmu_cache_flush - .word 0x56056930 - .word 0xff0ffff0 @ PXA935 - W(b) __armv4_mmu_cache_on - W(b) __armv4_mmu_cache_off - W(b) __armv4_mmu_cache_flush - .word 0x56050000 @ Feroceon .word 0xff0f0000 W(b) __armv4_mmu_cache_on diff --git a/arch/arm/configs/imote2_defconfig b/arch/arm/configs/imote2_defconfig index 95d2bec..21f2bff 100644 --- a/arch/arm/configs/imote2_defconfig +++ b/arch/arm/configs/imote2_defconfig @@ -1,13 +1,14 @@ # # Automatically generated make config: don't edit -# Linux kernel version: 2.6.33-rc8 -# Sat Feb 13 21:48:53 2010 +# Linux kernel version: 2.6.34-rc2 +# Thu Apr 8 14:49:08 2010 # CONFIG_ARM=y CONFIG_SYS_SUPPORTS_APM_EMULATION=y CONFIG_GENERIC_GPIO=y CONFIG_GENERIC_TIME=y CONFIG_GENERIC_CLOCKEVENTS=y +CONFIG_HAVE_PROC_CPU=y CONFIG_GENERIC_HARDIRQS=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_HAVE_LATENCYTOP_SUPPORT=y @@ -19,6 +20,7 @@ CONFIG_RWSEM_GENERIC_SPINLOCK=y CONFIG_ARCH_HAS_CPUFREQ=y CONFIG_GENERIC_HWEIGHT=y CONFIG_GENERIC_CALIBRATE_DELAY=y +CONFIG_NEED_DMA_MAP_STATE=y CONFIG_ARCH_MTD_XIP=y CONFIG_GENERIC_HARDIRQS_NO__DO_IRQ=y CONFIG_VECTORS_BASE=0xffff0000 @@ -60,11 +62,6 @@ CONFIG_RCU_FANOUT=32 # CONFIG_TREE_RCU_TRACE is not set # CONFIG_IKCONFIG is not set CONFIG_LOG_BUF_SHIFT=14 -CONFIG_GROUP_SCHED=y -CONFIG_FAIR_GROUP_SCHED=y -# CONFIG_RT_GROUP_SCHED is not set -CONFIG_USER_SCHED=y -# CONFIG_CGROUP_SCHED is not set # CONFIG_CGROUPS is not set CONFIG_SYSFS_DEPRECATED=y CONFIG_SYSFS_DEPRECATED_V2=y @@ -97,10 +94,14 @@ CONFIG_TIMERFD=y CONFIG_EVENTFD=y CONFIG_SHMEM=y CONFIG_AIO=y +CONFIG_HAVE_PERF_EVENTS=y +CONFIG_PERF_USE_VMALLOC=y # # Kernel Performance Events And Counters # +# CONFIG_PERF_EVENTS is not set +# CONFIG_PERF_COUNTERS is not set CONFIG_VM_EVENT_COUNTERS=y # CONFIG_COMPAT_BRK is not set CONFIG_SLAB=y @@ -184,6 +185,7 @@ CONFIG_MMU=y # CONFIG_ARCH_REALVIEW is not set # CONFIG_ARCH_VERSATILE is not set # CONFIG_ARCH_AT91 is not set +# CONFIG_ARCH_BCMRING is not set # CONFIG_ARCH_CLPS711X is not set # CONFIG_ARCH_GEMINI is not set # CONFIG_ARCH_EBSA110 is not set @@ -193,7 +195,6 @@ CONFIG_MMU=y # CONFIG_ARCH_STMP3XXX is not set # CONFIG_ARCH_NETX is not set # CONFIG_ARCH_H720X is not set -# CONFIG_ARCH_NOMADIK is not set # CONFIG_ARCH_IOP13XX is not set # CONFIG_ARCH_IOP32X is not set # CONFIG_ARCH_IOP33X is not set @@ -210,21 +211,26 @@ CONFIG_MMU=y # CONFIG_ARCH_KS8695 is not set # CONFIG_ARCH_NS9XXX is not set # CONFIG_ARCH_W90X900 is not set +# CONFIG_ARCH_NUC93X is not set # CONFIG_ARCH_PNX4008 is not set CONFIG_ARCH_PXA=y # CONFIG_ARCH_MSM is not set +# CONFIG_ARCH_SHMOBILE is not set # CONFIG_ARCH_RPC is not set # CONFIG_ARCH_SA1100 is not set # CONFIG_ARCH_S3C2410 is not set # CONFIG_ARCH_S3C64XX is not set +# CONFIG_ARCH_S5P6440 is not set +# CONFIG_ARCH_S5P6442 is not set # CONFIG_ARCH_S5PC1XX is not set +# CONFIG_ARCH_S5PV210 is not set # CONFIG_ARCH_SHARK is not set # CONFIG_ARCH_LH7A40X is not set # CONFIG_ARCH_U300 is not set +# CONFIG_ARCH_U8500 is not set +# CONFIG_ARCH_NOMADIK is not set # CONFIG_ARCH_DAVINCI is not set # CONFIG_ARCH_OMAP is not set -# CONFIG_ARCH_BCMRING is not set -# CONFIG_ARCH_U8500 is not set # # Intel PXA2xx/PXA3xx Implementations @@ -253,6 +259,7 @@ CONFIG_ARCH_PXA=y # CONFIG_MACH_EM_X270 is not set # CONFIG_MACH_EXEDA is not set # CONFIG_MACH_CM_X300 is not set +# CONFIG_MACH_CAPC7117 is not set # CONFIG_ARCH_GUMSTIX is not set CONFIG_MACH_INTELMOTE2=y # CONFIG_MACH_STARGATE2 is not set @@ -275,7 +282,11 @@ CONFIG_MACH_INTELMOTE2=y # CONFIG_PXA_EZX is not set # CONFIG_MACH_MP900C is not set # CONFIG_ARCH_PXA_PALM is not set +# CONFIG_MACH_RAUMFELD_RC is not set +# CONFIG_MACH_RAUMFELD_CONNECTOR is not set +# CONFIG_MACH_RAUMFELD_SPEAKER is not set # CONFIG_PXA_SHARPSL is not set +# CONFIG_MACH_ICONTROL is not set # CONFIG_ARCH_PXA_ESERIES is not set CONFIG_PXA27x=y CONFIG_PXA_SSP=y @@ -302,6 +313,7 @@ CONFIG_ARM_THUMB=y CONFIG_ARM_L1_CACHE_SHIFT=5 CONFIG_IWMMXT=y CONFIG_XSCALE_PMU=y +CONFIG_CPU_HAS_PMU=y CONFIG_COMMON_CLKDEV=y # @@ -352,7 +364,7 @@ CONFIG_ALIGNMENT_TRAP=y # CONFIG_ZBOOT_ROM_TEXT=0x0 CONFIG_ZBOOT_ROM_BSS=0x0 -CONFIG_CMDLINE="console=tty1 root=/dev/mmcblk0p2 rootfstype=ext2 rootdelay=3 ip=192.168.0.202:192.168.0.200:192.168.0.200:255.255.255.0 debug" +CONFIG_CMDLINE="root=/dev/mtdblock2 rootfstype=jffs2 console=ttyS2,115200 mem=32M" # CONFIG_XIP_KERNEL is not set CONFIG_KEXEC=y CONFIG_ATAGS_PROC=y @@ -360,24 +372,8 @@ CONFIG_ATAGS_PROC=y # # CPU Power Management # -CONFIG_CPU_FREQ=y -CONFIG_CPU_FREQ_TABLE=y -CONFIG_CPU_FREQ_DEBUG=y -CONFIG_CPU_FREQ_STAT=y -# CONFIG_CPU_FREQ_STAT_DETAILS is not set -CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y -# CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set -# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set -# CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND is not set -# CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set -CONFIG_CPU_FREQ_GOV_PERFORMANCE=y -CONFIG_CPU_FREQ_GOV_POWERSAVE=m -CONFIG_CPU_FREQ_GOV_USERSPACE=m -CONFIG_CPU_FREQ_GOV_ONDEMAND=m -CONFIG_CPU_FREQ_GOV_CONSERVATIVE=m -CONFIG_CPU_IDLE=y -CONFIG_CPU_IDLE_GOV_LADDER=y -CONFIG_CPU_IDLE_GOV_MENU=y +# CONFIG_CPU_FREQ is not set +# CONFIG_CPU_IDLE is not set # # Floating point emulation @@ -409,6 +405,7 @@ CONFIG_SUSPEND=y CONFIG_SUSPEND_FREEZER=y CONFIG_APM_EMULATION=y CONFIG_PM_RUNTIME=y +CONFIG_PM_OPS=y CONFIG_ARCH_SUSPEND_POSSIBLE=y CONFIG_NET=y @@ -416,7 +413,6 @@ CONFIG_NET=y # Networking options # CONFIG_PACKET=y -CONFIG_PACKET_MMAP=y CONFIG_UNIX=y CONFIG_XFRM=y # CONFIG_XFRM_USER is not set @@ -506,6 +502,7 @@ CONFIG_NF_CT_NETLINK=m CONFIG_NETFILTER_XTABLES=m CONFIG_NETFILTER_XT_TARGET_CLASSIFY=m # CONFIG_NETFILTER_XT_TARGET_CONNMARK is not set +# CONFIG_NETFILTER_XT_TARGET_CT is not set # CONFIG_NETFILTER_XT_TARGET_DSCP is not set CONFIG_NETFILTER_XT_TARGET_HL=m CONFIG_NETFILTER_XT_TARGET_LED=m @@ -622,6 +619,7 @@ CONFIG_IP6_NF_RAW=m # CONFIG_ATM is not set CONFIG_STP=m CONFIG_BRIDGE=m +# CONFIG_BRIDGE_IGMP_SNOOPING is not set # CONFIG_NET_DSA is not set # CONFIG_VLAN_8021Q is not set # CONFIG_DECNET is not set @@ -646,32 +644,7 @@ CONFIG_NET_CLS_ROUTE=y # CONFIG_HAMRADIO is not set # CONFIG_CAN is not set # CONFIG_IRDA is not set -CONFIG_BT=y -CONFIG_BT_L2CAP=y -CONFIG_BT_SCO=y -CONFIG_BT_RFCOMM=y -CONFIG_BT_RFCOMM_TTY=y -CONFIG_BT_BNEP=y -CONFIG_BT_BNEP_MC_FILTER=y -CONFIG_BT_BNEP_PROTO_FILTER=y -CONFIG_BT_HIDP=y - -# -# Bluetooth device drivers -# -CONFIG_BT_HCIBTUSB=m -CONFIG_BT_HCIBTSDIO=m -CONFIG_BT_HCIUART=y -CONFIG_BT_HCIUART_H4=y -# CONFIG_BT_HCIUART_BCSP is not set -# CONFIG_BT_HCIUART_LL is not set -CONFIG_BT_HCIBCM203X=m -CONFIG_BT_HCIBPA10X=m -CONFIG_BT_HCIBFUSB=m -CONFIG_BT_HCIVHCI=m -CONFIG_BT_MRVL=m -CONFIG_BT_MRVL_SDIO=m -# CONFIG_BT_ATH3K is not set +# CONFIG_BT is not set # CONFIG_AF_RXRPC is not set CONFIG_FIB_RULES=y # CONFIG_WIRELESS is not set @@ -687,7 +660,8 @@ CONFIG_FIB_RULES=y # Generic Driver Options # CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug" -# CONFIG_DEVTMPFS is not set +CONFIG_DEVTMPFS=y +CONFIG_DEVTMPFS_MOUNT=y CONFIG_STANDALONE=y CONFIG_PREVENT_FIRMWARE_BUILD=y CONFIG_FW_LOADER=m @@ -703,9 +677,9 @@ CONFIG_MTD=y # CONFIG_MTD_CONCAT is not set CONFIG_MTD_PARTITIONS=y # CONFIG_MTD_REDBOOT_PARTS is not set -# CONFIG_MTD_CMDLINE_PARTS is not set -# CONFIG_MTD_AFS_PARTS is not set -# CONFIG_MTD_AR7_PARTS is not set +CONFIG_MTD_CMDLINE_PARTS=y +CONFIG_MTD_AFS_PARTS=y +CONFIG_MTD_AR7_PARTS=y # # User Modules And Translation Layers @@ -812,6 +786,7 @@ CONFIG_HAVE_IDE=y # # SCSI device support # +CONFIG_SCSI_MOD=y # CONFIG_RAID_ATTRS is not set # CONFIG_SCSI is not set # CONFIG_SCSI_DMA is not set @@ -965,6 +940,7 @@ CONFIG_SERIAL_PXA=y CONFIG_SERIAL_PXA_CONSOLE=y CONFIG_SERIAL_CORE=y CONFIG_SERIAL_CORE_CONSOLE=y +# CONFIG_SERIAL_TIMBERDALE is not set CONFIG_UNIX98_PTYS=y # CONFIG_DEVPTS_MULTIPLE_INSTANCES is not set CONFIG_LEGACY_PTYS=y @@ -993,6 +969,7 @@ CONFIG_I2C_HELPER_AUTO=y CONFIG_I2C_PXA=y # CONFIG_I2C_PXA_SLAVE is not set # CONFIG_I2C_SIMTEC is not set +# CONFIG_I2C_XILINX is not set # # External I2C/SMBus adapter drivers @@ -1006,15 +983,9 @@ CONFIG_I2C_PXA=y # # CONFIG_I2C_PCA_PLATFORM is not set # CONFIG_I2C_STUB is not set - -# -# Miscellaneous I2C Chip support -# -# CONFIG_SENSORS_TSL2550 is not set # CONFIG_I2C_DEBUG_CORE is not set # CONFIG_I2C_DEBUG_ALGO is not set # CONFIG_I2C_DEBUG_BUS is not set -# CONFIG_I2C_DEBUG_CHIP is not set CONFIG_SPI=y # CONFIG_SPI_DEBUG is not set CONFIG_SPI_MASTER=y @@ -1046,10 +1017,12 @@ CONFIG_GPIO_SYSFS=y # # Memory mapped GPIO expanders: # +# CONFIG_GPIO_IT8761E is not set # # I2C GPIO expanders: # +# CONFIG_GPIO_MAX7300 is not set # CONFIG_GPIO_MAX732X is not set # CONFIG_GPIO_PCA953X is not set # CONFIG_GPIO_PCF857X is not set @@ -1093,10 +1066,12 @@ CONFIG_SSB_POSSIBLE=y # Multifunction device drivers # # CONFIG_MFD_CORE is not set +# CONFIG_MFD_88PM860X is not set # CONFIG_MFD_SM501 is not set # CONFIG_MFD_ASIC3 is not set # CONFIG_HTC_EGPIO is not set # CONFIG_HTC_PASIC3 is not set +# CONFIG_HTC_I2CPLD is not set # CONFIG_TPS65010 is not set # CONFIG_TWL4030_CORE is not set # CONFIG_MFD_TMIO is not set @@ -1105,22 +1080,25 @@ CONFIG_SSB_POSSIBLE=y # CONFIG_MFD_TC6393XB is not set CONFIG_PMIC_DA903X=y # CONFIG_PMIC_ADP5520 is not set +# CONFIG_MFD_MAX8925 is not set # CONFIG_MFD_WM8400 is not set # CONFIG_MFD_WM831X is not set # CONFIG_MFD_WM8350_I2C is not set +# CONFIG_MFD_WM8994 is not set # CONFIG_MFD_PCF50633 is not set # CONFIG_MFD_MC13783 is not set # CONFIG_AB3100_CORE is not set # CONFIG_EZX_PCAP is not set -# CONFIG_MFD_88PM8607 is not set # CONFIG_AB4500_CORE is not set CONFIG_REGULATOR=y CONFIG_REGULATOR_DEBUG=y +# CONFIG_REGULATOR_DUMMY is not set # CONFIG_REGULATOR_FIXED_VOLTAGE is not set CONFIG_REGULATOR_VIRTUAL_CONSUMER=y CONFIG_REGULATOR_USERSPACE_CONSUMER=y # CONFIG_REGULATOR_BQ24022 is not set # CONFIG_REGULATOR_MAX1586 is not set +# CONFIG_REGULATOR_MAX8649 is not set # CONFIG_REGULATOR_MAX8660 is not set CONFIG_REGULATOR_DA903X=y # CONFIG_REGULATOR_LP3971 is not set @@ -1218,6 +1196,7 @@ CONFIG_VIDEO_IR_I2C=y # CONFIG_VIDEO_SAA7191 is not set # CONFIG_VIDEO_TVP514X is not set # CONFIG_VIDEO_TVP5150 is not set +# CONFIG_VIDEO_TVP7002 is not set # CONFIG_VIDEO_VPX3220 is not set # @@ -1264,15 +1243,7 @@ CONFIG_SOC_CAMERA_MT9M111=y CONFIG_VIDEO_PXA27x=y # CONFIG_VIDEO_SH_MOBILE_CEU is not set # CONFIG_V4L_USB_DRIVERS is not set -CONFIG_RADIO_ADAPTERS=y -# CONFIG_I2C_SI4713 is not set -# CONFIG_RADIO_SI4713 is not set -# CONFIG_USB_DSBR is not set -# CONFIG_RADIO_SI470X is not set -# CONFIG_USB_MR800 is not set -CONFIG_RADIO_TEA5764=y -CONFIG_RADIO_TEA5764_XTAL=y -# CONFIG_RADIO_TEF6862 is not set +# CONFIG_RADIO_ADAPTERS is not set # CONFIG_DAB is not set # @@ -1398,8 +1369,6 @@ CONFIG_HID=y # # Special HID drivers # -CONFIG_HID_APPLE=m -# CONFIG_HID_WACOM is not set CONFIG_USB_SUPPORT=y CONFIG_USB_ARCH_HAS_HCD=y CONFIG_USB_ARCH_HAS_OHCI=y @@ -1477,7 +1446,6 @@ CONFIG_USB_OHCI_LITTLE_ENDIAN=y # CONFIG_USB_RIO500 is not set # CONFIG_USB_LEGOTOWER is not set # CONFIG_USB_LCD is not set -# CONFIG_USB_BERRY_CHARGE is not set # CONFIG_USB_LED is not set # CONFIG_USB_CYPRESS_CY7C63 is not set # CONFIG_USB_CYTHERM is not set @@ -1489,7 +1457,6 @@ CONFIG_USB_OHCI_LITTLE_ENDIAN=y # CONFIG_USB_IOWARRIOR is not set # CONFIG_USB_TEST is not set # CONFIG_USB_ISIGHTFW is not set -# CONFIG_USB_VST is not set CONFIG_USB_GADGET=y # CONFIG_USB_GADGET_DEBUG is not set # CONFIG_USB_GADGET_DEBUG_FILES is not set @@ -1529,6 +1496,7 @@ CONFIG_USB_ETH=y # CONFIG_USB_MIDI_GADGET is not set # CONFIG_USB_G_PRINTER is not set # CONFIG_USB_CDC_COMPOSITE is not set +# CONFIG_USB_G_NOKIA is not set # CONFIG_USB_G_MULTI is not set # @@ -1555,8 +1523,6 @@ CONFIG_SDIO_UART=m # CONFIG_MMC_PXA=y # CONFIG_MMC_SDHCI is not set -# CONFIG_MMC_AT91 is not set -# CONFIG_MMC_ATMELMCI is not set CONFIG_MMC_SPI=y # CONFIG_MEMSTICK is not set CONFIG_NEW_LEDS=y @@ -1574,11 +1540,11 @@ CONFIG_LEDS_LP3944=y # CONFIG_LEDS_REGULATOR is not set # CONFIG_LEDS_BD2802 is not set # CONFIG_LEDS_LT3593 is not set +CONFIG_LEDS_TRIGGERS=y # # LED Triggers # -CONFIG_LEDS_TRIGGERS=y CONFIG_LEDS_TRIGGER_TIMER=y CONFIG_LEDS_TRIGGER_HEARTBEAT=y CONFIG_LEDS_TRIGGER_BACKLIGHT=y @@ -1656,7 +1622,7 @@ CONFIG_RTC_INTF_DEV=y # on-CPU RTC drivers # # CONFIG_RTC_DRV_SA1100 is not set -# CONFIG_RTC_DRV_PXA is not set +CONFIG_RTC_DRV_PXA=y # CONFIG_DMADEVICES is not set # CONFIG_AUXDISPLAY is not set # CONFIG_UIO is not set @@ -1681,19 +1647,10 @@ CONFIG_EXT3_FS_XATTR=y CONFIG_JBD=m # CONFIG_JBD_DEBUG is not set CONFIG_FS_MBCACHE=m -CONFIG_REISERFS_FS=m -# CONFIG_REISERFS_CHECK is not set -# CONFIG_REISERFS_PROC_INFO is not set -CONFIG_REISERFS_FS_XATTR=y -CONFIG_REISERFS_FS_POSIX_ACL=y -CONFIG_REISERFS_FS_SECURITY=y +# CONFIG_REISERFS_FS is not set # CONFIG_JFS_FS is not set CONFIG_FS_POSIX_ACL=y -CONFIG_XFS_FS=m -# CONFIG_XFS_QUOTA is not set -# CONFIG_XFS_POSIX_ACL is not set -# CONFIG_XFS_RT is not set -# CONFIG_XFS_DEBUG is not set +# CONFIG_XFS_FS is not set # CONFIG_OCFS2_FS is not set # CONFIG_BTRFS_FS is not set # CONFIG_NILFS2_FS is not set @@ -1716,9 +1673,7 @@ CONFIG_CUSE=m # # CD-ROM/DVD Filesystems # -CONFIG_ISO9660_FS=m -CONFIG_JOLIET=y -CONFIG_ZISOFS=y +# CONFIG_ISO9660_FS is not set # CONFIG_UDF_FS is not set # @@ -1750,12 +1705,14 @@ CONFIG_MISC_FILESYSTEMS=y # CONFIG_BEFS_FS is not set # CONFIG_BFS_FS is not set # CONFIG_EFS_FS is not set -CONFIG_JFFS2_FS=m +CONFIG_JFFS2_FS=y CONFIG_JFFS2_FS_DEBUG=0 CONFIG_JFFS2_FS_WRITEBUFFER=y -# CONFIG_JFFS2_FS_WBUF_VERIFY is not set -# CONFIG_JFFS2_SUMMARY is not set -# CONFIG_JFFS2_FS_XATTR is not set +CONFIG_JFFS2_FS_WBUF_VERIFY=y +CONFIG_JFFS2_SUMMARY=y +CONFIG_JFFS2_FS_XATTR=y +CONFIG_JFFS2_FS_POSIX_ACL=y +CONFIG_JFFS2_FS_SECURITY=y CONFIG_JFFS2_COMPRESSION_OPTIONS=y CONFIG_JFFS2_ZLIB=y CONFIG_JFFS2_LZO=y @@ -1765,6 +1722,7 @@ CONFIG_JFFS2_RUBIN=y CONFIG_JFFS2_CMODE_PRIORITY=y # CONFIG_JFFS2_CMODE_SIZE is not set # CONFIG_JFFS2_CMODE_FAVOURLZO is not set +# CONFIG_LOGFS is not set CONFIG_CRAMFS=m CONFIG_SQUASHFS=m # CONFIG_SQUASHFS_EMBEDDED is not set @@ -1802,6 +1760,7 @@ CONFIG_SUNRPC=y # CONFIG_RPCSEC_GSS_SPKM3 is not set CONFIG_SMB_FS=m # CONFIG_SMB_NLS_DEFAULT is not set +# CONFIG_CEPH_FS is not set CONFIG_CIFS=m CONFIG_CIFS_STATS=y # CONFIG_CIFS_STATS2 is not set @@ -1895,6 +1854,7 @@ CONFIG_DEBUG_SPINLOCK=y CONFIG_DEBUG_MUTEXES=y CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y +# CONFIG_PROVE_RCU is not set CONFIG_LOCKDEP=y # CONFIG_LOCK_STAT is not set # CONFIG_DEBUG_LOCKDEP is not set @@ -1918,6 +1878,7 @@ CONFIG_DEBUG_BUGVERBOSE=y # CONFIG_BACKTRACE_SELF_TEST is not set # CONFIG_DEBUG_BLOCK_EXT_DEVT is not set # CONFIG_DEBUG_FORCE_WEAK_PER_CPU is not set +# CONFIG_LKDTM is not set # CONFIG_FAULT_INJECTION is not set # CONFIG_LATENCYTOP is not set # CONFIG_SYSCTL_SYSCALL_CHECK is not set @@ -2061,9 +2022,9 @@ CONFIG_CRC32=y CONFIG_CRC7=y CONFIG_LIBCRC32C=m CONFIG_ZLIB_INFLATE=y -CONFIG_ZLIB_DEFLATE=m -CONFIG_LZO_COMPRESS=m -CONFIG_LZO_DECOMPRESS=m +CONFIG_ZLIB_DEFLATE=y +CONFIG_LZO_COMPRESS=y +CONFIG_LZO_DECOMPRESS=y CONFIG_DECOMPRESS_GZIP=y CONFIG_DECOMPRESS_BZIP2=y CONFIG_DECOMPRESS_LZMA=y @@ -2075,3 +2036,4 @@ CONFIG_HAS_IOMEM=y CONFIG_HAS_IOPORT=y CONFIG_HAS_DMA=y CONFIG_NLATTR=y +CONFIG_GENERIC_ATOMIC64=y diff --git a/arch/arm/include/asm/atomic.h b/arch/arm/include/asm/atomic.h index e8ddec2..a0162fa 100644 --- a/arch/arm/include/asm/atomic.h +++ b/arch/arm/include/asm/atomic.h @@ -24,7 +24,7 @@ * strex/ldrex monitor on some implementations. The reason we can use it for * atomic_set() is the clrex or dummy strex done on every exception return. */ -#define atomic_read(v) ((v)->counter) +#define atomic_read(v) (*(volatile int *)&(v)->counter) #define atomic_set(v,i) (((v)->counter) = (i)) #if __LINUX_ARM_ARCH__ >= 6 diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h index 0d08d41..4656a24 100644 --- a/arch/arm/include/asm/cacheflush.h +++ b/arch/arm/include/asm/cacheflush.h @@ -371,6 +371,10 @@ static inline void __flush_icache_all(void) #ifdef CONFIG_ARM_ERRATA_411920 extern void v6_icache_inval_all(void); v6_icache_inval_all(); +#elif defined(CONFIG_SMP) && __LINUX_ARM_ARCH__ >= 7 + asm("mcr p15, 0, %0, c7, c1, 0 @ invalidate I-cache inner shareable\n" + : + : "r" (0)); #else asm("mcr p15, 0, %0, c7, c5, 0 @ invalidate I-cache\n" : diff --git a/arch/arm/include/asm/elf.h b/arch/arm/include/asm/elf.h index bff0564..51662fe 100644 --- a/arch/arm/include/asm/elf.h +++ b/arch/arm/include/asm/elf.h @@ -9,6 +9,8 @@ #include <asm/ptrace.h> #include <asm/user.h> +struct task_struct; + typedef unsigned long elf_greg_t; typedef unsigned long elf_freg_t[3]; diff --git a/arch/arm/include/asm/smp_twd.h b/arch/arm/include/asm/smp_twd.h index 7be0978..634f357 100644 --- a/arch/arm/include/asm/smp_twd.h +++ b/arch/arm/include/asm/smp_twd.h @@ -1,6 +1,23 @@ #ifndef __ASMARM_SMP_TWD_H #define __ASMARM_SMP_TWD_H +#define TWD_TIMER_LOAD 0x00 +#define TWD_TIMER_COUNTER 0x04 +#define TWD_TIMER_CONTROL 0x08 +#define TWD_TIMER_INTSTAT 0x0C + +#define TWD_WDOG_LOAD 0x20 +#define TWD_WDOG_COUNTER 0x24 +#define TWD_WDOG_CONTROL 0x28 +#define TWD_WDOG_INTSTAT 0x2C +#define TWD_WDOG_RESETSTAT 0x30 +#define TWD_WDOG_DISABLE 0x34 + +#define TWD_TIMER_CONTROL_ENABLE (1 << 0) +#define TWD_TIMER_CONTROL_ONESHOT (0 << 1) +#define TWD_TIMER_CONTROL_PERIODIC (1 << 1) +#define TWD_TIMER_CONTROL_IT_ENABLE (1 << 2) + struct clock_event_device; extern void __iomem *twd_base; diff --git a/arch/arm/include/asm/tlbflush.h b/arch/arm/include/asm/tlbflush.h index e085e2c..bd863d8 100644 --- a/arch/arm/include/asm/tlbflush.h +++ b/arch/arm/include/asm/tlbflush.h @@ -46,6 +46,9 @@ #define TLB_V7_UIS_FULL (1 << 20) #define TLB_V7_UIS_ASID (1 << 21) +/* Inner Shareable BTB operation (ARMv7 MP extensions) */ +#define TLB_V7_IS_BTB (1 << 22) + #define TLB_L2CLEAN_FR (1 << 29) /* Feroceon */ #define TLB_DCLEAN (1 << 30) #define TLB_WB (1 << 31) @@ -183,7 +186,7 @@ #endif #ifdef CONFIG_SMP -#define v7wbi_tlb_flags (TLB_WB | TLB_DCLEAN | TLB_BTB | \ +#define v7wbi_tlb_flags (TLB_WB | TLB_DCLEAN | TLB_V7_IS_BTB | \ TLB_V7_UIS_FULL | TLB_V7_UIS_PAGE | TLB_V7_UIS_ASID) #else #define v7wbi_tlb_flags (TLB_WB | TLB_DCLEAN | TLB_BTB | \ @@ -339,6 +342,12 @@ static inline void local_flush_tlb_all(void) dsb(); isb(); } + if (tlb_flag(TLB_V7_IS_BTB)) { + /* flush the branch target cache */ + asm("mcr p15, 0, %0, c7, c1, 6" : : "r" (zero) : "cc"); + dsb(); + isb(); + } } static inline void local_flush_tlb_mm(struct mm_struct *mm) @@ -376,6 +385,12 @@ static inline void local_flush_tlb_mm(struct mm_struct *mm) asm("mcr p15, 0, %0, c7, c5, 6" : : "r" (zero) : "cc"); dsb(); } + if (tlb_flag(TLB_V7_IS_BTB)) { + /* flush the branch target cache */ + asm("mcr p15, 0, %0, c7, c1, 6" : : "r" (zero) : "cc"); + dsb(); + isb(); + } } static inline void @@ -416,6 +431,12 @@ local_flush_tlb_page(struct vm_area_struct *vma, unsigned long uaddr) asm("mcr p15, 0, %0, c7, c5, 6" : : "r" (zero) : "cc"); dsb(); } + if (tlb_flag(TLB_V7_IS_BTB)) { + /* flush the branch target cache */ + asm("mcr p15, 0, %0, c7, c1, 6" : : "r" (zero) : "cc"); + dsb(); + isb(); + } } static inline void local_flush_tlb_kernel_page(unsigned long kaddr) @@ -454,6 +475,12 @@ static inline void local_flush_tlb_kernel_page(unsigned long kaddr) dsb(); isb(); } + if (tlb_flag(TLB_V7_IS_BTB)) { + /* flush the branch target cache */ + asm("mcr p15, 0, %0, c7, c1, 6" : : "r" (zero) : "cc"); + dsb(); + isb(); + } } /* diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S index e6a0fb0..7ee48e7 100644 --- a/arch/arm/kernel/entry-armv.S +++ b/arch/arm/kernel/entry-armv.S @@ -676,10 +676,10 @@ do_fpe: * lr = unrecognised FP instruction return address */ - .data + .pushsection .data ENTRY(fp_enter) .word no_fp - .text + .popsection ENTRY(no_fp) mov pc, lr diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c index 577543f..a01194e 100644 --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -86,6 +86,12 @@ int __cpuinit __cpu_up(unsigned int cpu) return PTR_ERR(idle); } ci->idle = idle; + } else { + /* + * Since this idle thread is being re-used, call + * init_idle() to reinitialize the thread structure. + */ + init_idle(idle, cpu); } /* diff --git a/arch/arm/kernel/smp_twd.c b/arch/arm/kernel/smp_twd.c index ea02a7b..7c5f0c0 100644 --- a/arch/arm/kernel/smp_twd.c +++ b/arch/arm/kernel/smp_twd.c @@ -21,23 +21,6 @@ #include <asm/smp_twd.h> #include <asm/hardware/gic.h> -#define TWD_TIMER_LOAD 0x00 -#define TWD_TIMER_COUNTER 0x04 -#define TWD_TIMER_CONTROL 0x08 -#define TWD_TIMER_INTSTAT 0x0C - -#define TWD_WDOG_LOAD 0x20 -#define TWD_WDOG_COUNTER 0x24 -#define TWD_WDOG_CONTROL 0x28 -#define TWD_WDOG_INTSTAT 0x2C -#define TWD_WDOG_RESETSTAT 0x30 -#define TWD_WDOG_DISABLE 0x34 - -#define TWD_TIMER_CONTROL_ENABLE (1 << 0) -#define TWD_TIMER_CONTROL_ONESHOT (0 << 1) -#define TWD_TIMER_CONTROL_PERIODIC (1 << 1) -#define TWD_TIMER_CONTROL_IT_ENABLE (1 << 2) - /* set up by the platform code */ void __iomem *twd_base; diff --git a/arch/arm/lib/clear_user.S b/arch/arm/lib/clear_user.S index 5e3f996..14a0d98 100644 --- a/arch/arm/lib/clear_user.S +++ b/arch/arm/lib/clear_user.S @@ -45,6 +45,7 @@ USER( strnebt r2, [r0]) mov r0, #0 ldmfd sp!, {r1, pc} ENDPROC(__clear_user) +ENDPROC(__clear_user_std) .pushsection .fixup,"ax" .align 0 diff --git a/arch/arm/lib/copy_to_user.S b/arch/arm/lib/copy_to_user.S index 027b69b..d066df6 100644 --- a/arch/arm/lib/copy_to_user.S +++ b/arch/arm/lib/copy_to_user.S @@ -93,6 +93,7 @@ WEAK(__copy_to_user) #include "copy_template.S" ENDPROC(__copy_to_user) +ENDPROC(__copy_to_user_std) .pushsection .fixup,"ax" .align 0 diff --git a/arch/arm/mach-davinci/da830.c b/arch/arm/mach-davinci/da830.c index 122e61a..e8cb982 100644 --- a/arch/arm/mach-davinci/da830.c +++ b/arch/arm/mach-davinci/da830.c @@ -410,7 +410,7 @@ static struct clk_lookup da830_clks[] = { CLK("davinci-mcasp.0", NULL, &mcasp0_clk), CLK("davinci-mcasp.1", NULL, &mcasp1_clk), CLK("davinci-mcasp.2", NULL, &mcasp2_clk), - CLK("musb_hdrc", NULL, &usb20_clk), + CLK(NULL, "usb20", &usb20_clk), CLK(NULL, "aemif", &aemif_clk), CLK(NULL, "aintc", &aintc_clk), CLK(NULL, "secu_mgr", &secu_mgr_clk), diff --git a/arch/arm/mach-mx5/clock-mx51.c b/arch/arm/mach-mx5/clock-mx51.c index 8f85f73..1ee6ce4 100644 --- a/arch/arm/mach-mx5/clock-mx51.c +++ b/arch/arm/mach-mx5/clock-mx51.c @@ -16,6 +16,7 @@ #include <linux/io.h> #include <asm/clkdev.h> +#include <asm/div64.h> #include <mach/hardware.h> #include <mach/common.h> diff --git a/arch/arm/mach-pxa/include/mach/colibri.h b/arch/arm/mach-pxa/include/mach/colibri.h index 811743c..5f2ba8d 100644 --- a/arch/arm/mach-pxa/include/mach/colibri.h +++ b/arch/arm/mach-pxa/include/mach/colibri.h @@ -2,6 +2,7 @@ #define _COLIBRI_H_ #include <net/ax88796.h> +#include <mach/mfp.h> /* * common settings for all modules diff --git a/arch/arm/mach-pxa/include/mach/hardware.h b/arch/arm/mach-pxa/include/mach/hardware.h index 7515757..3d8d8cb 100644 --- a/arch/arm/mach-pxa/include/mach/hardware.h +++ b/arch/arm/mach-pxa/include/mach/hardware.h @@ -202,7 +202,7 @@ #define __cpu_is_pxa950(id) \ ({ \ unsigned int _id = (id) >> 4 & 0xfff; \ - id == 0x697; \ + _id == 0x697; \ }) #else #define __cpu_is_pxa950(id) (0) diff --git a/arch/arm/mach-pxa/include/mach/regs-u2d.h b/arch/arm/mach-pxa/include/mach/regs-u2d.h index 44b0b20..c15c0c5 100644 --- a/arch/arm/mach-pxa/include/mach/regs-u2d.h +++ b/arch/arm/mach-pxa/include/mach/regs-u2d.h @@ -166,7 +166,8 @@ #define U2DMACSR_BUSERRTYPE (7 << 10) /* PX Bus Error Type */ #define U2DMACSR_EORINTR (1 << 9) /* End Of Receive */ #define U2DMACSR_REQPEND (1 << 8) /* Request Pending */ -#define U2DMACSR_RASINTR (1 << 4) /* Request After Channel Stopped (read / write 1 clear) */#define U2DMACSR_STOPINTR (1 << 3) /* Stop Interrupt (read only) */ +#define U2DMACSR_RASINTR (1 << 4) /* Request After Channel Stopped (read / write 1 clear) */ +#define U2DMACSR_STOPINTR (1 << 3) /* Stop Interrupt (read only) */ #define U2DMACSR_ENDINTR (1 << 2) /* End Interrupt (read / write 1 clear) */ #define U2DMACSR_STARTINTR (1 << 1) /* Start Interrupt (read / write 1 clear) */ #define U2DMACSR_BUSERRINTR (1 << 0) /* Bus Error Interrupt (read / write 1 clear) */ diff --git a/arch/arm/mach-pxa/raumfeld.c b/arch/arm/mach-pxa/raumfeld.c index 44bb675..d12667b 100644 --- a/arch/arm/mach-pxa/raumfeld.c +++ b/arch/arm/mach-pxa/raumfeld.c @@ -983,7 +983,7 @@ static void __init raumfeld_common_init(void) int i; for (i = 0; i < ARRAY_SIZE(gpio_keys_button); i++) - if (!strcmp(gpio_keys_button[i].desc, "on/off button")) + if (!strcmp(gpio_keys_button[i].desc, "on_off button")) gpio_keys_button[i].active_low = 1; } @@ -1009,8 +1009,7 @@ static void __init raumfeld_common_init(void) gpio_direction_output(GPIO_W2W_PDN, 0); /* this can be used to switch off the device */ - ret = gpio_request(GPIO_SHUTDOWN_SUPPLY, - "supply shutdown"); + ret = gpio_request(GPIO_SHUTDOWN_SUPPLY, "supply shutdown"); if (ret < 0) pr_warning("Unable to request GPIO_SHUTDOWN_SUPPLY\n"); else diff --git a/arch/arm/mach-pxa/spitz.c b/arch/arm/mach-pxa/spitz.c index 19b5109..01bdd75 100644 --- a/arch/arm/mach-pxa/spitz.c +++ b/arch/arm/mach-pxa/spitz.c @@ -363,7 +363,7 @@ static struct gpio_keys_button spitz_gpio_keys[] = { .type = EV_PWR, .code = KEY_SUSPEND, .gpio = SPITZ_GPIO_ON_KEY, - .desc = "On/Off", + .desc = "On Off", .wakeup = 1, }, /* Two buttons detecting the lid state */ diff --git a/arch/arm/mach-pxa/viper.c b/arch/arm/mach-pxa/viper.c index 9e0c5c3..e90114a 100644 --- a/arch/arm/mach-pxa/viper.c +++ b/arch/arm/mach-pxa/viper.c @@ -34,6 +34,7 @@ #include <linux/pm.h> #include <linux/sched.h> #include <linux/gpio.h> +#include <linux/jiffies.h> #include <linux/i2c-gpio.h> #include <linux/serial_8250.h> #include <linux/smc91x.h> @@ -454,7 +455,7 @@ static struct i2c_gpio_platform_data i2c_bus_data = { .sda_pin = VIPER_RTC_I2C_SDA_GPIO, .scl_pin = VIPER_RTC_I2C_SCL_GPIO, .udelay = 10, - .timeout = 100, + .timeout = HZ, }; static struct platform_device i2c_bus_device = { @@ -779,7 +780,7 @@ static void __init viper_tpm_init(void) .sda_pin = VIPER_TPM_I2C_SDA_GPIO, .scl_pin = VIPER_TPM_I2C_SCL_GPIO, .udelay = 10, - .timeout = 100, + .timeout = HZ, }; char *errstr; diff --git a/arch/arm/mach-sa1100/Kconfig b/arch/arm/mach-sa1100/Kconfig index b17d52f..fd4c52b 100644 --- a/arch/arm/mach-sa1100/Kconfig +++ b/arch/arm/mach-sa1100/Kconfig @@ -57,7 +57,7 @@ config SA1100_COLLIE config SA1100_H3100 bool "Compaq iPAQ H3100" select HTC_EGPIO - select CPU_FREQ_SA1100 + select CPU_FREQ_SA1110 help Say Y here if you intend to run this kernel on the Compaq iPAQ H3100 handheld computer. Information about this machine and the @@ -68,7 +68,7 @@ config SA1100_H3100 config SA1100_H3600 bool "Compaq iPAQ H3600/H3700" select HTC_EGPIO - select CPU_FREQ_SA1100 + select CPU_FREQ_SA1110 help Say Y here if you intend to run this kernel on the Compaq iPAQ H3600 handheld computer. Information about this machine and the diff --git a/arch/arm/mach-sa1100/cpu-sa1110.c b/arch/arm/mach-sa1100/cpu-sa1110.c index 63b32b6..7252874 100644 --- a/arch/arm/mach-sa1100/cpu-sa1110.c +++ b/arch/arm/mach-sa1100/cpu-sa1110.c @@ -363,6 +363,9 @@ static int __init sa1110_clk_init(void) struct sdram_params *sdram; const char *name = sdram_name; + if (!cpu_is_sa1110()) + return -ENODEV; + if (!name[0]) { if (machine_is_assabet()) name = "TC59SM716-CL3"; diff --git a/arch/arm/mm/cache-v6.S b/arch/arm/mm/cache-v6.S index 9d89c67..e46ecd8 100644 --- a/arch/arm/mm/cache-v6.S +++ b/arch/arm/mm/cache-v6.S @@ -211,6 +211,9 @@ v6_dma_inv_range: mcrne p15, 0, r1, c7, c15, 1 @ clean & invalidate unified line #endif 1: +#ifdef CONFIG_SMP + str r0, [r0] @ write for ownership +#endif #ifdef HARVARD_CACHE mcr p15, 0, r0, c7, c6, 1 @ invalidate D line #else @@ -231,6 +234,9 @@ v6_dma_inv_range: v6_dma_clean_range: bic r0, r0, #D_CACHE_LINE_SIZE - 1 1: +#ifdef CONFIG_SMP + ldr r2, [r0] @ read for ownership +#endif #ifdef HARVARD_CACHE mcr p15, 0, r0, c7, c10, 1 @ clean D line #else @@ -251,6 +257,10 @@ v6_dma_clean_range: ENTRY(v6_dma_flush_range) bic r0, r0, #D_CACHE_LINE_SIZE - 1 1: +#ifdef CONFIG_SMP + ldr r2, [r0] @ read for ownership + str r2, [r0] @ write for ownership +#endif #ifdef HARVARD_CACHE mcr p15, 0, r0, c7, c14, 1 @ clean & invalidate D line #else @@ -273,7 +283,9 @@ ENTRY(v6_dma_map_area) add r1, r1, r0 teq r2, #DMA_FROM_DEVICE beq v6_dma_inv_range - b v6_dma_clean_range + teq r2, #DMA_TO_DEVICE + beq v6_dma_clean_range + b v6_dma_flush_range ENDPROC(v6_dma_map_area) /* @@ -283,9 +295,6 @@ ENDPROC(v6_dma_map_area) * - dir - DMA direction */ ENTRY(v6_dma_unmap_area) - add r1, r1, r0 - teq r2, #DMA_TO_DEVICE - bne v6_dma_inv_range mov pc, lr ENDPROC(v6_dma_unmap_area) diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S index bcd64f2..06a90dc 100644 --- a/arch/arm/mm/cache-v7.S +++ b/arch/arm/mm/cache-v7.S @@ -167,7 +167,11 @@ ENTRY(v7_coherent_user_range) cmp r0, r1 blo 1b mov r0, #0 +#ifdef CONFIG_SMP + mcr p15, 0, r0, c7, c1, 6 @ invalidate BTB Inner Shareable +#else mcr p15, 0, r0, c7, c5, 6 @ invalidate BTB +#endif dsb isb mov pc, lr diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c index 83db12a..0ed29bf 100644 --- a/arch/arm/mm/init.c +++ b/arch/arm/mm/init.c @@ -86,9 +86,6 @@ void show_mem(void) printk("Mem-info:\n"); show_free_areas(); for_each_online_node(node) { - pg_data_t *n = NODE_DATA(node); - struct page *map = pgdat_page_nr(n, 0) - n->node_start_pfn; - for_each_nodebank (i,mi,node) { struct membank *bank = &mi->bank[i]; unsigned int pfn1, pfn2; @@ -97,8 +94,8 @@ void show_mem(void) pfn1 = bank_pfn_start(bank); pfn2 = bank_pfn_end(bank); - page = map + pfn1; - end = map + pfn2; + page = pfn_to_page(pfn1); + end = pfn_to_page(pfn2 - 1) + 1; do { total++; @@ -603,9 +600,6 @@ void __init mem_init(void) reserved_pages = free_pages = 0; for_each_online_node(node) { - pg_data_t *n = NODE_DATA(node); - struct page *map = pgdat_page_nr(n, 0) - n->node_start_pfn; - for_each_nodebank(i, &meminfo, node) { struct membank *bank = &meminfo.bank[i]; unsigned int pfn1, pfn2; @@ -614,8 +608,8 @@ void __init mem_init(void) pfn1 = bank_pfn_start(bank); pfn2 = bank_pfn_end(bank); - page = map + pfn1; - end = map + pfn2; + page = pfn_to_page(pfn1); + end = pfn_to_page(pfn2 - 1) + 1; do { if (PageReserved(page)) diff --git a/arch/arm/mm/nommu.c b/arch/arm/mm/nommu.c index 9bfeb6b..33b3273 100644 --- a/arch/arm/mm/nommu.c +++ b/arch/arm/mm/nommu.c @@ -65,6 +65,15 @@ void flush_dcache_page(struct page *page) } EXPORT_SYMBOL(flush_dcache_page); +void copy_to_user_page(struct vm_area_struct *vma, struct page *page, + unsigned long uaddr, void *dst, const void *src, + unsigned long len) +{ + memcpy(dst, src, len); + if (vma->vm_flags & VM_EXEC) + __cpuc_coherent_user_range(uaddr, uaddr + len); +} + void __iomem *__arm_ioremap_pfn(unsigned long pfn, unsigned long offset, size_t size, unsigned int mtype) { @@ -87,8 +96,8 @@ void __iomem *__arm_ioremap(unsigned long phys_addr, size_t size, } EXPORT_SYMBOL(__arm_ioremap); -void __iomem *__arm_ioremap(unsigned long phys_addr, size_t size, - unsigned int mtype, void *caller) +void __iomem *__arm_ioremap_caller(unsigned long phys_addr, size_t size, + unsigned int mtype, void *caller) { return __arm_ioremap(phys_addr, size, mtype); } diff --git a/arch/arm/mm/tlb-v7.S b/arch/arm/mm/tlb-v7.S index 0cb1848..f3f288a 100644 --- a/arch/arm/mm/tlb-v7.S +++ b/arch/arm/mm/tlb-v7.S @@ -50,7 +50,11 @@ ENTRY(v7wbi_flush_user_tlb_range) cmp r0, r1 blo 1b mov ip, #0 +#ifdef CONFIG_SMP + mcr p15, 0, ip, c7, c1, 6 @ flush BTAC/BTB Inner Shareable +#else mcr p15, 0, ip, c7, c5, 6 @ flush BTAC/BTB +#endif dsb mov pc, lr ENDPROC(v7wbi_flush_user_tlb_range) @@ -79,7 +83,11 @@ ENTRY(v7wbi_flush_kern_tlb_range) cmp r0, r1 blo 1b mov r2, #0 +#ifdef CONFIG_SMP + mcr p15, 0, r2, c7, c1, 6 @ flush BTAC/BTB Inner Shareable +#else mcr p15, 0, r2, c7, c5, 6 @ flush BTAC/BTB +#endif dsb isb mov pc, lr diff --git a/arch/arm/plat-mxc/include/mach/dma-mx1-mx2.h b/arch/arm/plat-mxc/include/mach/dma-mx1-mx2.h index 07be8ad..7c4870b 100644 --- a/arch/arm/plat-mxc/include/mach/dma-mx1-mx2.h +++ b/arch/arm/plat-mxc/include/mach/dma-mx1-mx2.h @@ -31,7 +31,13 @@ #define DMA_MODE_WRITE 1 #define DMA_MODE_MASK 1 -#define DMA_BASE IO_ADDRESS(DMA_BASE_ADDR) +#define MX1_DMA_REG(offset) MX1_IO_ADDRESS(MX1_DMA_BASE_ADDR + (offset)) + +/* DMA Interrupt Mask Register */ +#define MX1_DMA_DIMR MX1_DMA_REG(0x08) + +/* Channel Control Register */ +#define MX1_DMA_CCR(x) MX1_DMA_REG(0x8c + ((x) << 6)) #define IMX_DMA_MEMSIZE_32 (0 << 4) #define IMX_DMA_MEMSIZE_8 (1 << 4) diff --git a/arch/arm/plat-pxa/dma.c b/arch/arm/plat-pxa/dma.c index 742350e..2d3c19d 100644 --- a/arch/arm/plat-pxa/dma.c +++ b/arch/arm/plat-pxa/dma.c @@ -245,7 +245,7 @@ static void pxa_dma_init_debugfs(void) dbgfs_chan = kmalloc(sizeof(*dbgfs_state) * num_dma_channels, GFP_KERNEL); - if (!dbgfs_state) + if (!dbgfs_chan) goto err_alloc; chandir = debugfs_create_dir("channels", dbgfs_root); diff --git a/arch/arm/tools/mach-types b/arch/arm/tools/mach-types index 1536f17..8f10d24 100644 --- a/arch/arm/tools/mach-types +++ b/arch/arm/tools/mach-types @@ -12,7 +12,7 @@ # # http://www.arm.linux.org.uk/developer/machines/?action=new # -# Last update: Sat Mar 20 15:35:41 2010 +# Last update: Sat May 1 10:36:42 2010 # # machine_is_xxx CONFIG_xxxx MACH_TYPE_xxx number # @@ -2749,3 +2749,58 @@ stamp9g45 MACH_STAMP9G45 STAMP9G45 2761 h6053 MACH_H6053 H6053 2762 smint01 MACH_SMINT01 SMINT01 2763 prtlvt2 MACH_PRTLVT2 PRTLVT2 2764 +ap420 MACH_AP420 AP420 2765 +htcshift MACH_HTCSHIFT HTCSHIFT 2766 +davinci_dm365_fc MACH_DAVINCI_DM365_FC DAVINCI_DM365_FC 2767 +msm8x55_surf MACH_MSM8X55_SURF MSM8X55_SURF 2768 +msm8x55_ffa MACH_MSM8X55_FFA MSM8X55_FFA 2769 +esl_vamana MACH_ESL_VAMANA ESL_VAMANA 2770 +sbc35 MACH_SBC35 SBC35 2771 +mpx6446 MACH_MPX6446 MPX6446 2772 +oreo_controller MACH_OREO_CONTROLLER OREO_CONTROLLER 2773 +kopin_models MACH_KOPIN_MODELS KOPIN_MODELS 2774 +ttc_vision2 MACH_TTC_VISION2 TTC_VISION2 2775 +cns3420vb MACH_CNS3420VB CNS3420VB 2776 +lpc2 MACH_LPC2 LPC2 2777 +olympus MACH_OLYMPUS OLYMPUS 2778 +vortex MACH_VORTEX VORTEX 2779 +s5pc200 MACH_S5PC200 S5PC200 2780 +ecucore_9263 MACH_ECUCORE_9263 ECUCORE_9263 2781 +smdkc200 MACH_SMDKC200 SMDKC200 2782 +emsiso_sx27 MACH_EMSISO_SX27 EMSISO_SX27 2783 +apx_som9g45_ek MACH_APX_SOM9G45_EK APX_SOM9G45_EK 2784 +songshan MACH_SONGSHAN SONGSHAN 2785 +tianshan MACH_TIANSHAN TIANSHAN 2786 +vpx500 MACH_VPX500 VPX500 2787 +am3517sam MACH_AM3517SAM AM3517SAM 2788 +skat91_sim508 MACH_SKAT91_SIM508 SKAT91_SIM508 2789 +skat91_s3e MACH_SKAT91_S3E SKAT91_S3E 2790 +omap4_panda MACH_OMAP4_PANDA OMAP4_PANDA 2791 +df7220 MACH_DF7220 DF7220 2792 +nemini MACH_NEMINI NEMINI 2793 +t8200 MACH_T8200 T8200 2794 +apf51 MACH_APF51 APF51 2795 +dr_rc_unit MACH_DR_RC_UNIT DR_RC_UNIT 2796 +bordeaux MACH_BORDEAUX BORDEAUX 2797 +catania_b MACH_CATANIA_B CATANIA_B 2798 +mx51_ocean MACH_MX51_OCEAN MX51_OCEAN 2799 +ti8168evm MACH_TI8168EVM TI8168EVM 2800 +neocoreomap MACH_NEOCOREOMAP NEOCOREOMAP 2801 +withings_wbp MACH_WITHINGS_WBP WITHINGS_WBP 2802 +dbps MACH_DBPS DBPS 2803 +sbc9261 MACH_SBC9261 SBC9261 2804 +pcbfp0001 MACH_PCBFP0001 PCBFP0001 2805 +speedy MACH_SPEEDY SPEEDY 2806 +chrysaor MACH_CHRYSAOR CHRYSAOR 2807 +tango MACH_TANGO TANGO 2808 +synology_dsx11 MACH_SYNOLOGY_DSX11 SYNOLOGY_DSX11 2809 +hanlin_v3ext MACH_HANLIN_V3EXT HANLIN_V3EXT 2810 +hanlin_v5 MACH_HANLIN_V5 HANLIN_V5 2811 +hanlin_v3plus MACH_HANLIN_V3PLUS HANLIN_V3PLUS 2812 +iriver_story MACH_IRIVER_STORY IRIVER_STORY 2813 +irex_iliad MACH_IREX_ILIAD IREX_ILIAD 2814 +irex_dr1000 MACH_IREX_DR1000 IREX_DR1000 2815 +teton_bga MACH_TETON_BGA TETON_BGA 2816 +snapper9g45 MACH_SNAPPER9G45 SNAPPER9G45 2817 +tam3517 MACH_TAM3517 TAM3517 2818 +pdc100 MACH_PDC100 PDC100 2819 diff --git a/arch/avr32/include/asm/atomic.h b/arch/avr32/include/asm/atomic.h index b131c27..bbce6a1c 100644 --- a/arch/avr32/include/asm/atomic.h +++ b/arch/avr32/include/asm/atomic.h @@ -19,7 +19,7 @@ #define ATOMIC_INIT(i) { (i) } -#define atomic_read(v) ((v)->counter) +#define atomic_read(v) (*(volatile int *)&(v)->counter) #define atomic_set(v, i) (((v)->counter) = i) /* diff --git a/arch/cris/include/asm/atomic.h b/arch/cris/include/asm/atomic.h index a6aca81..88dc9b9 100644 --- a/arch/cris/include/asm/atomic.h +++ b/arch/cris/include/asm/atomic.h @@ -15,7 +15,7 @@ #define ATOMIC_INIT(i) { (i) } -#define atomic_read(v) ((v)->counter) +#define atomic_read(v) (*(volatile int *)&(v)->counter) #define atomic_set(v,i) (((v)->counter) = (i)) /* These should be written in asm but we do it in C for now. */ diff --git a/arch/frv/include/asm/atomic.h b/arch/frv/include/asm/atomic.h index 00a57af..fae32c7f 100644 --- a/arch/frv/include/asm/atomic.h +++ b/arch/frv/include/asm/atomic.h @@ -36,7 +36,7 @@ #define smp_mb__after_atomic_inc() barrier() #define ATOMIC_INIT(i) { (i) } -#define atomic_read(v) ((v)->counter) +#define atomic_read(v) (*(volatile int *)&(v)->counter) #define atomic_set(v, i) (((v)->counter) = (i)) #ifndef CONFIG_FRV_OUTOFLINE_ATOMIC_OPS diff --git a/arch/h8300/include/asm/atomic.h b/arch/h8300/include/asm/atomic.h index 33c8c0f..e936804 100644 --- a/arch/h8300/include/asm/atomic.h +++ b/arch/h8300/include/asm/atomic.h @@ -10,7 +10,7 @@ #define ATOMIC_INIT(i) { (i) } -#define atomic_read(v) ((v)->counter) +#define atomic_read(v) (*(volatile int *)&(v)->counter) #define atomic_set(v, i) (((v)->counter) = i) #include <asm/system.h> diff --git a/arch/ia64/include/asm/atomic.h b/arch/ia64/include/asm/atomic.h index 88405cb..4e19484 100644 --- a/arch/ia64/include/asm/atomic.h +++ b/arch/ia64/include/asm/atomic.h @@ -21,8 +21,8 @@ #define ATOMIC_INIT(i) ((atomic_t) { (i) }) #define ATOMIC64_INIT(i) ((atomic64_t) { (i) }) -#define atomic_read(v) ((v)->counter) -#define atomic64_read(v) ((v)->counter) +#define atomic_read(v) (*(volatile int *)&(v)->counter) +#define atomic64_read(v) (*(volatile long *)&(v)->counter) #define atomic_set(v,i) (((v)->counter) = (i)) #define atomic64_set(v,i) (((v)->counter) = (i)) diff --git a/arch/ia64/include/asm/bitops.h b/arch/ia64/include/asm/bitops.h index 6ebc229..9da3df6 100644 --- a/arch/ia64/include/asm/bitops.h +++ b/arch/ia64/include/asm/bitops.h @@ -437,17 +437,18 @@ __fls (unsigned long x) * hweightN: returns the hamming weight (i.e. the number * of bits set) of a N-bit word */ -static __inline__ unsigned long -hweight64 (unsigned long x) +static __inline__ unsigned long __arch_hweight64(unsigned long x) { unsigned long result; result = ia64_popcnt(x); return result; } -#define hweight32(x) (unsigned int) hweight64((x) & 0xfffffffful) -#define hweight16(x) (unsigned int) hweight64((x) & 0xfffful) -#define hweight8(x) (unsigned int) hweight64((x) & 0xfful) +#define __arch_hweight32(x) ((unsigned int) __arch_hweight64((x) & 0xfffffffful)) +#define __arch_hweight16(x) ((unsigned int) __arch_hweight64((x) & 0xfffful)) +#define __arch_hweight8(x) ((unsigned int) __arch_hweight64((x) & 0xfful)) + +#include <asm-generic/bitops/const_hweight.h> #endif /* __KERNEL__ */ diff --git a/arch/ia64/kernel/acpi.c b/arch/ia64/kernel/acpi.c index 4d1a7e9..c6c90f3 100644 --- a/arch/ia64/kernel/acpi.c +++ b/arch/ia64/kernel/acpi.c @@ -785,6 +785,14 @@ int acpi_gsi_to_irq(u32 gsi, unsigned int *irq) return 0; } +int acpi_isa_irq_to_gsi(unsigned isa_irq, u32 *gsi) +{ + if (isa_irq >= 16) + return -1; + *gsi = isa_irq; + return 0; +} + /* * ACPI based hotplug CPU support */ diff --git a/arch/m32r/include/asm/atomic.h b/arch/m32r/include/asm/atomic.h index 63f0cf0..d44a51e 100644 --- a/arch/m32r/include/asm/atomic.h +++ b/arch/m32r/include/asm/atomic.h @@ -26,7 +26,7 @@ * * Atomically reads the value of @v. */ -#define atomic_read(v) ((v)->counter) +#define atomic_read(v) (*(volatile int *)&(v)->counter) /** * atomic_set - set atomic variable diff --git a/arch/m68k/amiga/Makefile b/arch/m68k/amiga/Makefile index 6a0d765..11dd30b 100644 --- a/arch/m68k/amiga/Makefile +++ b/arch/m68k/amiga/Makefile @@ -2,6 +2,6 @@ # Makefile for Linux arch/m68k/amiga source directory # -obj-y := config.o amiints.o cia.o chipram.o amisound.o +obj-y := config.o amiints.o cia.o chipram.o amisound.o platform.o obj-$(CONFIG_AMIGA_PCMCIA) += pcmcia.o diff --git a/arch/m68k/amiga/platform.c b/arch/m68k/amiga/platform.c new file mode 100644 index 0000000..38f18bf --- /dev/null +++ b/arch/m68k/amiga/platform.c @@ -0,0 +1,83 @@ +/* + * Copyright (C) 2007-2009 Geert Uytterhoeven + * + * This file is subject to the terms and conditions of the GNU General Public + * License. See the file COPYING in the main directory of this archive + * for more details. + */ + +#include <linux/init.h> +#include <linux/platform_device.h> +#include <linux/zorro.h> + +#include <asm/amigahw.h> + + +#ifdef CONFIG_ZORRO + +static const struct resource zorro_resources[] __initconst = { + /* Zorro II regions (on Zorro II/III) */ + { + .name = "Zorro II exp", + .start = 0x00e80000, + .end = 0x00efffff, + .flags = IORESOURCE_MEM, + }, { + .name = "Zorro II mem", + .start = 0x00200000, + .end = 0x009fffff, + .flags = IORESOURCE_MEM, + }, + /* Zorro III regions (on Zorro III only) */ + { + .name = "Zorro III exp", + .start = 0xff000000, + .end = 0xffffffff, + .flags = IORESOURCE_MEM, + }, { + .name = "Zorro III cfg", + .start = 0x40000000, + .end = 0x7fffffff, + .flags = IORESOURCE_MEM, + } +}; + + +static int __init amiga_init_bus(void) +{ + if (!MACH_IS_AMIGA || !AMIGAHW_PRESENT(ZORRO)) + return -ENODEV; + + platform_device_register_simple("amiga-zorro", -1, zorro_resources, + AMIGAHW_PRESENT(ZORRO3) ? 4 : 2); + return 0; +} + +subsys_initcall(amiga_init_bus); + +#endif /* CONFIG_ZORRO */ + + +static int __init amiga_init_devices(void) +{ + if (!MACH_IS_AMIGA) + return -ENODEV; + + /* video hardware */ + if (AMIGAHW_PRESENT(AMI_VIDEO)) + platform_device_register_simple("amiga-video", -1, NULL, 0); + + + /* sound hardware */ + if (AMIGAHW_PRESENT(AMI_AUDIO)) + platform_device_register_simple("amiga-audio", -1, NULL, 0); + + + /* storage interfaces */ + if (AMIGAHW_PRESENT(AMI_FLOPPY)) + platform_device_register_simple("amiga-floppy", -1, NULL, 0); + + return 0; +} + +device_initcall(amiga_init_devices); diff --git a/arch/m68k/bvme6000/rtc.c b/arch/m68k/bvme6000/rtc.c index b46ea17..cb8617b 100644 --- a/arch/m68k/bvme6000/rtc.c +++ b/arch/m68k/bvme6000/rtc.c @@ -9,7 +9,6 @@ #include <linux/types.h> #include <linux/errno.h> #include <linux/miscdevice.h> -#include <linux/smp_lock.h> #include <linux/ioport.h> #include <linux/capability.h> #include <linux/fcntl.h> @@ -35,10 +34,9 @@ static unsigned char days_in_mo[] = {0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31}; -static char rtc_status; +static atomic_t rtc_status = ATOMIC_INIT(1); -static int rtc_ioctl(struct inode *inode, struct file *file, unsigned int cmd, - unsigned long arg) +static long rtc_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { volatile RtcPtr_t rtc = (RtcPtr_t)BVME_RTC_BASE; unsigned char msr; @@ -132,29 +130,20 @@ static int rtc_ioctl(struct inode *inode, struct file *file, unsigned int cmd, } /* - * We enforce only one user at a time here with the open/close. - * Also clear the previous interrupt data on an open, and clean - * up things on a close. + * We enforce only one user at a time here with the open/close. */ - static int rtc_open(struct inode *inode, struct file *file) { - lock_kernel(); - if(rtc_status) { - unlock_kernel(); + if (!atomic_dec_and_test(&rtc_status)) { + atomic_inc(&rtc_status); return -EBUSY; } - - rtc_status = 1; - unlock_kernel(); return 0; } static int rtc_release(struct inode *inode, struct file *file) { - lock_kernel(); - rtc_status = 0; - unlock_kernel(); + atomic_inc(&rtc_status); return 0; } @@ -163,9 +152,9 @@ static int rtc_release(struct inode *inode, struct file *file) */ static const struct file_operations rtc_fops = { - .ioctl = rtc_ioctl, - .open = rtc_open, - .release = rtc_release, + .unlocked_ioctl = rtc_ioctl, + .open = rtc_open, + .release = rtc_release, }; static struct miscdevice rtc_dev = { diff --git a/arch/m68k/hp300/time.h b/arch/m68k/hp300/time.h index f5b3d09..7b98242 100644 --- a/arch/m68k/hp300/time.h +++ b/arch/m68k/hp300/time.h @@ -1,4 +1,2 @@ extern void hp300_sched_init(irq_handler_t vector); -extern unsigned long hp300_gettimeoffset (void); - - +extern unsigned long hp300_gettimeoffset(void); diff --git a/arch/m68k/include/asm/atomic_mm.h b/arch/m68k/include/asm/atomic_mm.h index d9d2ed6..6a223b3 100644 --- a/arch/m68k/include/asm/atomic_mm.h +++ b/arch/m68k/include/asm/atomic_mm.h @@ -15,7 +15,7 @@ #define ATOMIC_INIT(i) { (i) } -#define atomic_read(v) ((v)->counter) +#define atomic_read(v) (*(volatile int *)&(v)->counter) #define atomic_set(v, i) (((v)->counter) = i) static inline void atomic_add(int i, atomic_t *v) diff --git a/arch/m68k/include/asm/atomic_no.h b/arch/m68k/include/asm/atomic_no.h index 5674cb9..289310c 100644 --- a/arch/m68k/include/asm/atomic_no.h +++ b/arch/m68k/include/asm/atomic_no.h @@ -15,7 +15,7 @@ #define ATOMIC_INIT(i) { (i) } -#define atomic_read(v) ((v)->counter) +#define atomic_read(v) (*(volatile int *)&(v)->counter) #define atomic_set(v, i) (((v)->counter) = i) static __inline__ void atomic_add(int i, atomic_t *v) diff --git a/arch/m68k/include/asm/bitops_mm.h b/arch/m68k/include/asm/bitops_mm.h index 9bde784..b4ecdaa 100644 --- a/arch/m68k/include/asm/bitops_mm.h +++ b/arch/m68k/include/asm/bitops_mm.h @@ -365,6 +365,10 @@ static inline int minix_test_bit(int nr, const void *vaddr) #define ext2_set_bit_atomic(lock, nr, addr) test_and_set_bit((nr) ^ 24, (unsigned long *)(addr)) #define ext2_clear_bit(nr, addr) __test_and_clear_bit((nr) ^ 24, (unsigned long *)(addr)) #define ext2_clear_bit_atomic(lock, nr, addr) test_and_clear_bit((nr) ^ 24, (unsigned long *)(addr)) +#define ext2_find_next_zero_bit(addr, size, offset) \ + generic_find_next_zero_le_bit((unsigned long *)addr, size, offset) +#define ext2_find_next_bit(addr, size, offset) \ + generic_find_next_le_bit((unsigned long *)addr, size, offset) static inline int ext2_test_bit(int nr, const void *vaddr) { @@ -394,10 +398,9 @@ static inline int ext2_find_first_zero_bit(const void *vaddr, unsigned size) return (p - addr) * 32 + res; } -static inline int ext2_find_next_zero_bit(const void *vaddr, unsigned size, - unsigned offset) +static inline unsigned long generic_find_next_zero_le_bit(const unsigned long *addr, + unsigned long size, unsigned long offset) { - const unsigned long *addr = vaddr; const unsigned long *p = addr + (offset >> 5); int bit = offset & 31UL, res; @@ -437,10 +440,9 @@ static inline int ext2_find_first_bit(const void *vaddr, unsigned size) return (p - addr) * 32 + res; } -static inline int ext2_find_next_bit(const void *vaddr, unsigned size, - unsigned offset) +static inline unsigned long generic_find_next_le_bit(const unsigned long *addr, + unsigned long size, unsigned long offset) { - const unsigned long *addr = vaddr; const unsigned long *p = addr + (offset >> 5); int bit = offset & 31UL, res; diff --git a/arch/m68k/include/asm/param.h b/arch/m68k/include/asm/param.h index 85c41b7..36265cc 100644 --- a/arch/m68k/include/asm/param.h +++ b/arch/m68k/include/asm/param.h @@ -1,26 +1,12 @@ #ifndef _M68K_PARAM_H #define _M68K_PARAM_H -#ifdef __KERNEL__ -# define HZ CONFIG_HZ /* Internal kernel timer frequency */ -# define USER_HZ 100 /* .. some user interfaces are in "ticks" */ -# define CLOCKS_PER_SEC (USER_HZ) /* like times() */ -#endif - -#ifndef HZ -#define HZ 100 -#endif - #ifdef __uClinux__ #define EXEC_PAGESIZE 4096 #else #define EXEC_PAGESIZE 8192 #endif -#ifndef NOGROUP -#define NOGROUP (-1) -#endif - -#define MAXHOSTNAMELEN 64 /* max length of hostname */ +#include <asm-generic/param.h> #endif /* _M68K_PARAM_H */ diff --git a/arch/m68k/kernel/traps.c b/arch/m68k/kernel/traps.c index aacd6d1..ada4f4c 100644 --- a/arch/m68k/kernel/traps.c +++ b/arch/m68k/kernel/traps.c @@ -455,7 +455,7 @@ static inline void access_error040(struct frame *fp) if (do_page_fault(&fp->ptregs, addr, errorcode)) { #ifdef DEBUG - printk("do_page_fault() !=0 \n"); + printk("do_page_fault() !=0\n"); #endif if (user_mode(&fp->ptregs)){ /* delay writebacks after signal delivery */ diff --git a/arch/m68k/mac/config.c b/arch/m68k/mac/config.c index 0356da9..1c16b1b 100644 --- a/arch/m68k/mac/config.c +++ b/arch/m68k/mac/config.c @@ -148,7 +148,7 @@ static void mac_cache_card_flush(int writeback) void __init config_mac(void) { if (!MACH_IS_MAC) - printk(KERN_ERR "ERROR: no Mac, but config_mac() called!! \n"); + printk(KERN_ERR "ERROR: no Mac, but config_mac() called!!\n"); mach_sched_init = mac_sched_init; mach_init_IRQ = mac_init_IRQ; @@ -867,7 +867,7 @@ static void __init mac_identify(void) */ iop_preinit(); - printk(KERN_INFO "Detected Macintosh model: %d \n", model); + printk(KERN_INFO "Detected Macintosh model: %d\n", model); /* * Report booter data: @@ -878,12 +878,12 @@ static void __init mac_identify(void) mac_bi_data.videoaddr, mac_bi_data.videorow, mac_bi_data.videodepth, mac_bi_data.dimensions & 0xFFFF, mac_bi_data.dimensions >> 16); - printk(KERN_DEBUG " Videological 0x%lx phys. 0x%lx, SCC at 0x%lx \n", + printk(KERN_DEBUG " Videological 0x%lx phys. 0x%lx, SCC at 0x%lx\n", mac_bi_data.videological, mac_orig_videoaddr, mac_bi_data.sccbase); - printk(KERN_DEBUG " Boottime: 0x%lx GMTBias: 0x%lx \n", + printk(KERN_DEBUG " Boottime: 0x%lx GMTBias: 0x%lx\n", mac_bi_data.boottime, mac_bi_data.gmtbias); - printk(KERN_DEBUG " Machine ID: %ld CPUid: 0x%lx memory size: 0x%lx \n", + printk(KERN_DEBUG " Machine ID: %ld CPUid: 0x%lx memory size: 0x%lx\n", mac_bi_data.id, mac_bi_data.cpuid, mac_bi_data.memsize); iop_init(); diff --git a/arch/m68k/mm/fault.c b/arch/m68k/mm/fault.c index d0e35cf..a96394a 100644 --- a/arch/m68k/mm/fault.c +++ b/arch/m68k/mm/fault.c @@ -154,7 +154,6 @@ good_area: * the fault. */ - survive: fault = handle_mm_fault(mm, vma, address, write ? FAULT_FLAG_WRITE : 0); #ifdef DEBUG printk("handle_mm_fault returns %d\n",fault); @@ -180,15 +179,10 @@ good_area: */ out_of_memory: up_read(&mm->mmap_sem); - if (is_global_init(current)) { - yield(); - down_read(&mm->mmap_sem); - goto survive; - } - - printk("VM: killing process %s\n", current->comm); - if (user_mode(regs)) - do_group_exit(SIGKILL); + if (!user_mode(regs)) + goto no_context; + pagefault_out_of_memory(); + return 0; no_context: current->thread.signo = SIGBUS; diff --git a/arch/m68k/mvme16x/rtc.c b/arch/m68k/mvme16x/rtc.c index 8da9c25..11ac6f6 100644 --- a/arch/m68k/mvme16x/rtc.c +++ b/arch/m68k/mvme16x/rtc.c @@ -9,7 +9,6 @@ #include <linux/types.h> #include <linux/errno.h> #include <linux/miscdevice.h> -#include <linux/smp_lock.h> #include <linux/ioport.h> #include <linux/capability.h> #include <linux/fcntl.h> @@ -36,8 +35,7 @@ static const unsigned char days_in_mo[] = static atomic_t rtc_ready = ATOMIC_INIT(1); -static int rtc_ioctl(struct inode *inode, struct file *file, unsigned int cmd, - unsigned long arg) +static long rtc_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { volatile MK48T08ptr_t rtc = (MK48T08ptr_t)MVME_RTC_BASE; unsigned long flags; @@ -120,22 +118,15 @@ static int rtc_ioctl(struct inode *inode, struct file *file, unsigned int cmd, } /* - * We enforce only one user at a time here with the open/close. - * Also clear the previous interrupt data on an open, and clean - * up things on a close. + * We enforce only one user at a time here with the open/close. */ - static int rtc_open(struct inode *inode, struct file *file) { - lock_kernel(); if( !atomic_dec_and_test(&rtc_ready) ) { atomic_inc( &rtc_ready ); - unlock_kernel(); return -EBUSY; } - unlock_kernel(); - return 0; } @@ -150,9 +141,9 @@ static int rtc_release(struct inode *inode, struct file *file) */ static const struct file_operations rtc_fops = { - .ioctl = rtc_ioctl, - .open = rtc_open, - .release = rtc_release, + .unlocked_ioctl = rtc_ioctl, + .open = rtc_open, + .release = rtc_release, }; static struct miscdevice rtc_dev= diff --git a/arch/m68k/q40/config.c b/arch/m68k/q40/config.c index 31ab3f0..ad10fec 100644 --- a/arch/m68k/q40/config.c +++ b/arch/m68k/q40/config.c @@ -126,7 +126,7 @@ static void q40_reset(void) { halted = 1; printk("\n\n*******************************************\n" - "Called q40_reset : press the RESET button!! \n" + "Called q40_reset : press the RESET button!!\n" "*******************************************\n"); Q40_LED_ON(); while (1) diff --git a/arch/microblaze/configs/mmu_defconfig b/arch/microblaze/configs/mmu_defconfig index 6fced1f..3c91cf6 100644 --- a/arch/microblaze/configs/mmu_defconfig +++ b/arch/microblaze/configs/mmu_defconfig @@ -1,7 +1,7 @@ # # Automatically generated make config: don't edit -# Linux kernel version: 2.6.33-rc6 -# Wed Feb 3 10:02:59 2010 +# Linux kernel version: 2.6.34-rc6 +# Thu May 6 11:22:14 2010 # CONFIG_MICROBLAZE=y # CONFIG_SWAP is not set @@ -22,8 +22,6 @@ CONFIG_GENERIC_CSUM=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_LOCKDEP_SUPPORT=y CONFIG_HAVE_LATENCYTOP_SUPPORT=y -# CONFIG_PCI is not set -CONFIG_NO_DMA=y CONFIG_DTC=y CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config" CONFIG_CONSTRUCTORS=y @@ -56,7 +54,6 @@ CONFIG_RCU_FANOUT=32 CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_LOG_BUF_SHIFT=17 -# CONFIG_GROUP_SCHED is not set # CONFIG_CGROUPS is not set CONFIG_SYSFS_DEPRECATED=y CONFIG_SYSFS_DEPRECATED_V2=y @@ -106,6 +103,8 @@ CONFIG_SLAB=y # CONFIG_SLOB is not set # CONFIG_PROFILING is not set CONFIG_HAVE_OPROFILE=y +CONFIG_HAVE_DMA_ATTRS=y +CONFIG_HAVE_DMA_API_DEBUG=y # # GCOV-based kernel profiling @@ -245,13 +244,20 @@ CONFIG_BINFMT_ELF=y # CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set # CONFIG_HAVE_AOUT is not set # CONFIG_BINFMT_MISC is not set + +# +# Bus Options +# +# CONFIG_PCI is not set +# CONFIG_PCI_DOMAINS is not set +# CONFIG_PCI_SYSCALL is not set +# CONFIG_ARCH_SUPPORTS_MSI is not set CONFIG_NET=y # # Networking options # CONFIG_PACKET=y -# CONFIG_PACKET_MMAP is not set CONFIG_UNIX=y CONFIG_XFRM=y # CONFIG_XFRM_USER is not set @@ -341,7 +347,9 @@ CONFIG_PREVENT_FIRMWARE_BUILD=y # CONFIG_SYS_HYPERVISOR is not set # CONFIG_CONNECTOR is not set # CONFIG_MTD is not set +CONFIG_OF_FLATTREE=y CONFIG_OF_DEVICE=y +CONFIG_OF_MDIO=y # CONFIG_PARPORT is not set CONFIG_BLK_DEV=y # CONFIG_BLK_DEV_COW_COMMON is not set @@ -370,6 +378,7 @@ CONFIG_MISC_DEVICES=y # # SCSI device support # +CONFIG_SCSI_MOD=y # CONFIG_RAID_ATTRS is not set # CONFIG_SCSI is not set # CONFIG_SCSI_DMA is not set @@ -383,9 +392,30 @@ CONFIG_NETDEVICES=y # CONFIG_EQUALIZER is not set # CONFIG_TUN is not set # CONFIG_VETH is not set -# CONFIG_PHYLIB is not set +CONFIG_PHYLIB=y + +# +# MII PHY device drivers +# +# CONFIG_MARVELL_PHY is not set +# CONFIG_DAVICOM_PHY is not set +# CONFIG_QSEMI_PHY is not set +# CONFIG_LXT_PHY is not set +# CONFIG_CICADA_PHY is not set +# CONFIG_VITESSE_PHY is not set +# CONFIG_SMSC_PHY is not set +# CONFIG_BROADCOM_PHY is not set +# CONFIG_ICPLUS_PHY is not set +# CONFIG_REALTEK_PHY is not set +# CONFIG_NATIONAL_PHY is not set +# CONFIG_STE10XP is not set +# CONFIG_LSI_ET1011C_PHY is not set +# CONFIG_MICREL_PHY is not set +# CONFIG_FIXED_PHY is not set +# CONFIG_MDIO_BITBANG is not set CONFIG_NET_ETHERNET=y # CONFIG_MII is not set +# CONFIG_ETHOC is not set # CONFIG_DNET is not set # CONFIG_IBM_NEW_EMAC_ZMII is not set # CONFIG_IBM_NEW_EMAC_RGMII is not set @@ -394,6 +424,7 @@ CONFIG_NET_ETHERNET=y # CONFIG_IBM_NEW_EMAC_NO_FLOW_CTRL is not set # CONFIG_IBM_NEW_EMAC_MAL_CLR_ICINTSTAT is not set # CONFIG_IBM_NEW_EMAC_MAL_COMMON_ERR is not set +# CONFIG_B44 is not set # CONFIG_KS8842 is not set # CONFIG_KS8851_MLL is not set CONFIG_XILINX_EMACLITE=y @@ -444,6 +475,7 @@ CONFIG_SERIAL_UARTLITE=y CONFIG_SERIAL_UARTLITE_CONSOLE=y CONFIG_SERIAL_CORE=y CONFIG_SERIAL_CORE_CONSOLE=y +# CONFIG_SERIAL_TIMBERDALE is not set # CONFIG_SERIAL_GRLIB_GAISLER_APBUART is not set CONFIG_UNIX98_PTYS=y # CONFIG_DEVPTS_MULTIPLE_INSTANCES is not set @@ -471,6 +503,12 @@ CONFIG_ARCH_WANT_OPTIONAL_GPIOLIB=y # CONFIG_HWMON is not set # CONFIG_THERMAL is not set # CONFIG_WATCHDOG is not set +CONFIG_SSB_POSSIBLE=y + +# +# Sonics Silicon Backplane +# +# CONFIG_SSB is not set # # Multifunction device drivers @@ -502,6 +540,7 @@ CONFIG_USB_ARCH_HAS_EHCI=y # CONFIG_NEW_LEDS is not set # CONFIG_ACCESSIBILITY is not set # CONFIG_RTC_CLASS is not set +# CONFIG_DMADEVICES is not set # CONFIG_AUXDISPLAY is not set # CONFIG_UIO is not set @@ -572,6 +611,7 @@ CONFIG_MISC_FILESYSTEMS=y # CONFIG_BEFS_FS is not set # CONFIG_BFS_FS is not set # CONFIG_EFS_FS is not set +# CONFIG_LOGFS is not set # CONFIG_CRAMFS is not set # CONFIG_SQUASHFS is not set # CONFIG_VXFS_FS is not set @@ -595,6 +635,7 @@ CONFIG_SUNRPC=y # CONFIG_RPCSEC_GSS_KRB5 is not set # CONFIG_RPCSEC_GSS_SPKM3 is not set # CONFIG_SMB_FS is not set +# CONFIG_CEPH_FS is not set CONFIG_CIFS=y CONFIG_CIFS_STATS=y CONFIG_CIFS_STATS2=y @@ -696,6 +737,7 @@ CONFIG_SCHED_DEBUG=y # CONFIG_DEBUG_OBJECTS is not set CONFIG_DEBUG_SLAB=y # CONFIG_DEBUG_SLAB_LEAK is not set +# CONFIG_DEBUG_KMEMLEAK is not set CONFIG_DEBUG_SPINLOCK=y # CONFIG_DEBUG_MUTEXES is not set # CONFIG_DEBUG_LOCK_ALLOC is not set @@ -741,6 +783,7 @@ CONFIG_BRANCH_PROFILE_NONE=y # CONFIG_KMEMTRACE is not set # CONFIG_WORKQUEUE_TRACER is not set # CONFIG_BLK_DEV_IO_TRACE is not set +# CONFIG_DMA_API_DEBUG is not set # CONFIG_SAMPLES is not set CONFIG_EARLY_PRINTK=y # CONFIG_HEART_BEAT is not set @@ -862,5 +905,6 @@ CONFIG_ZLIB_INFLATE=y CONFIG_DECOMPRESS_GZIP=y CONFIG_HAS_IOMEM=y CONFIG_HAS_IOPORT=y +CONFIG_HAS_DMA=y CONFIG_HAVE_LMB=y CONFIG_NLATTR=y diff --git a/arch/microblaze/configs/nommu_defconfig b/arch/microblaze/configs/nommu_defconfig index ce2da53..dd3a494 100644 --- a/arch/microblaze/configs/nommu_defconfig +++ b/arch/microblaze/configs/nommu_defconfig @@ -1,7 +1,7 @@ # # Automatically generated make config: don't edit -# Linux kernel version: 2.6.33-rc6 -# Wed Feb 3 10:03:21 2010 +# Linux kernel version: 2.6.34-rc6 +# Thu May 6 11:25:12 2010 # CONFIG_MICROBLAZE=y # CONFIG_SWAP is not set @@ -22,8 +22,6 @@ CONFIG_GENERIC_CSUM=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_LOCKDEP_SUPPORT=y CONFIG_HAVE_LATENCYTOP_SUPPORT=y -# CONFIG_PCI is not set -CONFIG_NO_DMA=y CONFIG_DTC=y CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config" CONFIG_CONSTRUCTORS=y @@ -58,7 +56,6 @@ CONFIG_RCU_FANOUT=32 CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_LOG_BUF_SHIFT=17 -# CONFIG_GROUP_SCHED is not set # CONFIG_CGROUPS is not set CONFIG_SYSFS_DEPRECATED=y CONFIG_SYSFS_DEPRECATED_V2=y @@ -96,6 +93,8 @@ CONFIG_SLAB=y # CONFIG_MMAP_ALLOW_UNINITIALIZED is not set # CONFIG_PROFILING is not set CONFIG_HAVE_OPROFILE=y +CONFIG_HAVE_DMA_ATTRS=y +CONFIG_HAVE_DMA_API_DEBUG=y # # GCOV-based kernel profiling @@ -209,11 +208,14 @@ CONFIG_PROC_DEVICETREE=y # # Advanced setup # +# CONFIG_ADVANCED_OPTIONS is not set # # Default settings for advanced configuration options are used # +CONFIG_LOWMEM_SIZE=0x30000000 CONFIG_KERNEL_START=0x90000000 +CONFIG_TASK_SIZE=0x80000000 CONFIG_SELECT_MEMORY_MODEL=y CONFIG_FLATMEM_MANUAL=y # CONFIG_DISCONTIGMEM_MANUAL is not set @@ -235,13 +237,20 @@ CONFIG_BINFMT_FLAT=y # CONFIG_BINFMT_SHARED_FLAT is not set # CONFIG_HAVE_AOUT is not set # CONFIG_BINFMT_MISC is not set + +# +# Bus Options +# +# CONFIG_PCI is not set +# CONFIG_PCI_DOMAINS is not set +# CONFIG_PCI_SYSCALL is not set +# CONFIG_ARCH_SUPPORTS_MSI is not set CONFIG_NET=y # # Networking options # CONFIG_PACKET=y -# CONFIG_PACKET_MMAP is not set CONFIG_UNIX=y CONFIG_XFRM=y # CONFIG_XFRM_USER is not set @@ -413,6 +422,7 @@ CONFIG_MTD_UCLINUX=y # UBI - Unsorted block images # # CONFIG_MTD_UBI is not set +CONFIG_OF_FLATTREE=y CONFIG_OF_DEVICE=y # CONFIG_PARPORT is not set CONFIG_BLK_DEV=y @@ -442,6 +452,7 @@ CONFIG_MISC_DEVICES=y # # SCSI device support # +CONFIG_SCSI_MOD=y # CONFIG_RAID_ATTRS is not set # CONFIG_SCSI is not set # CONFIG_SCSI_DMA is not set @@ -458,6 +469,7 @@ CONFIG_NETDEVICES=y # CONFIG_PHYLIB is not set CONFIG_NET_ETHERNET=y # CONFIG_MII is not set +# CONFIG_ETHOC is not set # CONFIG_DNET is not set # CONFIG_IBM_NEW_EMAC_ZMII is not set # CONFIG_IBM_NEW_EMAC_RGMII is not set @@ -466,6 +478,7 @@ CONFIG_NET_ETHERNET=y # CONFIG_IBM_NEW_EMAC_NO_FLOW_CTRL is not set # CONFIG_IBM_NEW_EMAC_MAL_CLR_ICINTSTAT is not set # CONFIG_IBM_NEW_EMAC_MAL_COMMON_ERR is not set +# CONFIG_B44 is not set # CONFIG_KS8842 is not set # CONFIG_KS8851_MLL is not set # CONFIG_XILINX_EMACLITE is not set @@ -516,6 +529,7 @@ CONFIG_SERIAL_UARTLITE=y CONFIG_SERIAL_UARTLITE_CONSOLE=y CONFIG_SERIAL_CORE=y CONFIG_SERIAL_CORE_CONSOLE=y +# CONFIG_SERIAL_TIMBERDALE is not set # CONFIG_SERIAL_GRLIB_GAISLER_APBUART is not set CONFIG_UNIX98_PTYS=y # CONFIG_DEVPTS_MULTIPLE_INSTANCES is not set @@ -544,6 +558,12 @@ CONFIG_ARCH_WANT_OPTIONAL_GPIOLIB=y # CONFIG_HWMON is not set # CONFIG_THERMAL is not set # CONFIG_WATCHDOG is not set +CONFIG_SSB_POSSIBLE=y + +# +# Sonics Silicon Backplane +# +# CONFIG_SSB is not set # # Multifunction device drivers @@ -593,6 +613,7 @@ CONFIG_USB_ARCH_HAS_EHCI=y # CONFIG_NEW_LEDS is not set # CONFIG_ACCESSIBILITY is not set # CONFIG_RTC_CLASS is not set +# CONFIG_DMADEVICES is not set # CONFIG_AUXDISPLAY is not set # CONFIG_UIO is not set @@ -661,6 +682,7 @@ CONFIG_MISC_FILESYSTEMS=y # CONFIG_BFS_FS is not set # CONFIG_EFS_FS is not set # CONFIG_JFFS2_FS is not set +# CONFIG_LOGFS is not set CONFIG_CRAMFS=y # CONFIG_SQUASHFS is not set # CONFIG_VXFS_FS is not set @@ -689,6 +711,7 @@ CONFIG_SUNRPC=y # CONFIG_RPCSEC_GSS_KRB5 is not set # CONFIG_RPCSEC_GSS_SPKM3 is not set # CONFIG_SMB_FS is not set +# CONFIG_CEPH_FS is not set # CONFIG_CIFS is not set # CONFIG_NCP_FS is not set # CONFIG_CODA_FS is not set @@ -733,6 +756,7 @@ CONFIG_DEBUG_OBJECTS_TIMERS=y # CONFIG_DEBUG_OBJECTS_WORK is not set CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT=1 # CONFIG_DEBUG_SLAB is not set +# CONFIG_DEBUG_KMEMLEAK is not set # CONFIG_DEBUG_RT_MUTEXES is not set # CONFIG_RT_MUTEX_TESTER is not set # CONFIG_DEBUG_SPINLOCK is not set @@ -758,6 +782,7 @@ CONFIG_DEBUG_SG=y # CONFIG_BACKTRACE_SELF_TEST is not set # CONFIG_DEBUG_BLOCK_EXT_DEVT is not set # CONFIG_DEBUG_FORCE_WEAK_PER_CPU is not set +# CONFIG_LKDTM is not set # CONFIG_FAULT_INJECTION is not set # CONFIG_LATENCYTOP is not set CONFIG_SYSCTL_SYSCALL_CHECK=y @@ -782,6 +807,7 @@ CONFIG_BRANCH_PROFILE_NONE=y # CONFIG_WORKQUEUE_TRACER is not set # CONFIG_BLK_DEV_IO_TRACE is not set # CONFIG_DYNAMIC_DEBUG is not set +# CONFIG_DMA_API_DEBUG is not set # CONFIG_SAMPLES is not set CONFIG_EARLY_PRINTK=y # CONFIG_HEART_BEAT is not set @@ -901,5 +927,6 @@ CONFIG_GENERIC_FIND_LAST_BIT=y CONFIG_ZLIB_INFLATE=y CONFIG_HAS_IOMEM=y CONFIG_HAS_IOPORT=y +CONFIG_HAS_DMA=y CONFIG_HAVE_LMB=y CONFIG_NLATTR=y diff --git a/arch/microblaze/include/asm/cache.h b/arch/microblaze/include/asm/cache.h index e522108..4efe96a 100644 --- a/arch/microblaze/include/asm/cache.h +++ b/arch/microblaze/include/asm/cache.h @@ -15,7 +15,7 @@ #include <asm/registers.h> -#define L1_CACHE_SHIFT 2 +#define L1_CACHE_SHIFT 5 /* word-granular cache in microblaze */ #define L1_CACHE_BYTES (1 << L1_CACHE_SHIFT) diff --git a/arch/microblaze/include/asm/dma.h b/arch/microblaze/include/asm/dma.h index 08c073b..0d73d0c 100644 --- a/arch/microblaze/include/asm/dma.h +++ b/arch/microblaze/include/asm/dma.h @@ -18,4 +18,10 @@ #define MAX_DMA_ADDRESS (CONFIG_KERNEL_START + memory_size - 1) #endif +#ifdef CONFIG_PCI +extern int isa_dma_bridge_buggy; +#else +#define isa_dma_bridge_buggy (0) +#endif + #endif /* _ASM_MICROBLAZE_DMA_H */ diff --git a/arch/microblaze/include/asm/exceptions.h b/arch/microblaze/include/asm/exceptions.h index 90731df..4c7b5d0 100644 --- a/arch/microblaze/include/asm/exceptions.h +++ b/arch/microblaze/include/asm/exceptions.h @@ -64,12 +64,6 @@ asmlinkage void full_exception(struct pt_regs *regs, unsigned int type, void die(const char *str, struct pt_regs *fp, long err); void _exception(int signr, struct pt_regs *regs, int code, unsigned long addr); -#ifdef CONFIG_MMU -void __bug(const char *file, int line, void *data); -int bad_trap(int trap_num, struct pt_regs *regs); -int debug_trap(struct pt_regs *regs); -#endif /* CONFIG_MMU */ - #if defined(CONFIG_KGDB) void (*debugger)(struct pt_regs *regs); int (*debugger_bpt)(struct pt_regs *regs); diff --git a/arch/microblaze/include/asm/io.h b/arch/microblaze/include/asm/io.h index e45a6ee..00b5398 100644 --- a/arch/microblaze/include/asm/io.h +++ b/arch/microblaze/include/asm/io.h @@ -139,8 +139,6 @@ static inline void writel(unsigned int v, volatile void __iomem *addr) #ifdef CONFIG_MMU -#define mm_ptov(addr) ((void *)__phys_to_virt(addr)) -#define mm_vtop(addr) ((unsigned long)__virt_to_phys(addr)) #define phys_to_virt(addr) ((void *)__phys_to_virt(addr)) #define virt_to_phys(addr) ((unsigned long)__virt_to_phys(addr)) #define virt_to_bus(addr) ((unsigned long)__virt_to_phys(addr)) diff --git a/arch/microblaze/include/asm/page.h b/arch/microblaze/include/asm/page.h index 2dd1d04..de493f8 100644 --- a/arch/microblaze/include/asm/page.h +++ b/arch/microblaze/include/asm/page.h @@ -31,6 +31,9 @@ #ifndef __ASSEMBLY__ +/* MS be sure that SLAB allocates aligned objects */ +#define ARCH_KMALLOC_MINALIGN L1_CACHE_BYTES + #define PAGE_UP(addr) (((addr)+((PAGE_SIZE)-1))&(~((PAGE_SIZE)-1))) #define PAGE_DOWN(addr) ((addr)&(~((PAGE_SIZE)-1))) @@ -70,14 +73,7 @@ typedef unsigned long pte_basic_t; #endif /* CONFIG_MMU */ -# ifndef CONFIG_MMU -# define copy_page(to, from) memcpy((to), (from), PAGE_SIZE) -# define get_user_page(vaddr) __get_free_page(GFP_KERNEL) -# define free_user_page(page, addr) free_page(addr) -# else /* CONFIG_MMU */ -extern void copy_page(void *to, void *from); -# endif /* CONFIG_MMU */ - +# define copy_page(to, from) memcpy((to), (from), PAGE_SIZE) # define clear_page(pgaddr) memset((pgaddr), 0, PAGE_SIZE) # define clear_user_page(pgaddr, vaddr, page) memset((pgaddr), 0, PAGE_SIZE) diff --git a/arch/microblaze/include/asm/pci.h b/arch/microblaze/include/asm/pci.h index bdd65aa..5a388ee 100644 --- a/arch/microblaze/include/asm/pci.h +++ b/arch/microblaze/include/asm/pci.h @@ -94,14 +94,6 @@ extern int pci_mmap_legacy_page_range(struct pci_bus *bus, #define HAVE_PCI_LEGACY 1 -/* pci_unmap_{page,single} is a nop so... */ -#define DECLARE_PCI_UNMAP_ADDR(ADDR_NAME) -#define DECLARE_PCI_UNMAP_LEN(LEN_NAME) -#define pci_unmap_addr(PTR, ADDR_NAME) (0) -#define pci_unmap_addr_set(PTR, ADDR_NAME, VAL) do { } while (0) -#define pci_unmap_len(PTR, LEN_NAME) (0) -#define pci_unmap_len_set(PTR, LEN_NAME, VAL) do { } while (0) - /* The PCI address space does equal the physical memory * address space (no IOMMU). The IDE and SCSI device layers use * this boolean for bounce buffer decisions. diff --git a/arch/microblaze/include/asm/pgalloc.h b/arch/microblaze/include/asm/pgalloc.h index f44b0d6..c614a89 100644 --- a/arch/microblaze/include/asm/pgalloc.h +++ b/arch/microblaze/include/asm/pgalloc.h @@ -108,21 +108,7 @@ extern inline void free_pgd_slow(pgd_t *pgd) #define pmd_alloc_one_fast(mm, address) ({ BUG(); ((pmd_t *)1); }) #define pmd_alloc_one(mm, address) ({ BUG(); ((pmd_t *)2); }) -static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm, - unsigned long address) -{ - pte_t *pte; - extern void *early_get_page(void); - if (mem_init_done) { - pte = (pte_t *)__get_free_page(GFP_KERNEL | - __GFP_REPEAT | __GFP_ZERO); - } else { - pte = (pte_t *)early_get_page(); - if (pte) - clear_page(pte); - } - return pte; -} +extern pte_t *pte_alloc_one_kernel(struct mm_struct *mm, unsigned long addr); static inline struct page *pte_alloc_one(struct mm_struct *mm, unsigned long address) diff --git a/arch/microblaze/include/asm/pgtable.h b/arch/microblaze/include/asm/pgtable.h index dd2bb60..ca2d928 100644 --- a/arch/microblaze/include/asm/pgtable.h +++ b/arch/microblaze/include/asm/pgtable.h @@ -512,15 +512,6 @@ static inline pmd_t *pmd_offset(pgd_t *dir, unsigned long address) extern pgd_t swapper_pg_dir[PTRS_PER_PGD]; /* - * When flushing the tlb entry for a page, we also need to flush the hash - * table entry. flush_hash_page is assembler (for speed) in hashtable.S. - */ -extern int flush_hash_page(unsigned context, unsigned long va, pte_t *ptep); - -/* Add an HPTE to the hash table */ -extern void add_hash_page(unsigned context, unsigned long va, pte_t *ptep); - -/* * Encode and decode a swap entry. * Note that the bits we use in a PTE for representing a swap entry * must not include the _PAGE_PRESENT bit, or the _PAGE_HASHPTE bit @@ -533,15 +524,7 @@ extern void add_hash_page(unsigned context, unsigned long va, pte_t *ptep); #define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val(pte) >> 2 }) #define __swp_entry_to_pte(x) ((pte_t) { (x).val << 2 }) - -/* CONFIG_APUS */ -/* For virtual address to physical address conversion */ -extern void cache_clear(__u32 addr, int length); -extern void cache_push(__u32 addr, int length); -extern int mm_end_of_chunk(unsigned long addr, int len); extern unsigned long iopa(unsigned long addr); -/* extern unsigned long mm_ptov(unsigned long addr) \ - __attribute__ ((const)); TBD */ /* Values for nocacheflag and cmode */ /* These are not used by the APUS kernel_map, but prevents @@ -552,18 +535,6 @@ extern unsigned long iopa(unsigned long addr); #define IOMAP_NOCACHE_NONSER 2 #define IOMAP_NO_COPYBACK 3 -/* - * Map some physical address range into the kernel address space. - */ -extern unsigned long kernel_map(unsigned long paddr, unsigned long size, - int nocacheflag, unsigned long *memavailp); - -/* - * Set cache mode of (kernel space) address range. - */ -extern void kernel_set_cachemode(unsigned long address, unsigned long size, - unsigned int cmode); - /* Needs to be defined here and not in linux/mm.h, as it is arch dependent */ #define kern_addr_valid(addr) (1) @@ -577,10 +548,6 @@ extern void kernel_set_cachemode(unsigned long address, unsigned long size, void do_page_fault(struct pt_regs *regs, unsigned long address, unsigned long error_code); -void __init io_block_mapping(unsigned long virt, phys_addr_t phys, - unsigned int size, int flags); - -void __init adjust_total_lowmem(void); void mapin_ram(void); int map_page(unsigned long va, phys_addr_t pa, int flags); @@ -601,7 +568,7 @@ void __init *early_get_page(void); extern unsigned long ioremap_bot, ioremap_base; void *consistent_alloc(int gfp, size_t size, dma_addr_t *dma_handle); -void consistent_free(void *vaddr); +void consistent_free(size_t size, void *vaddr); void consistent_sync(void *vaddr, size_t size, int direction); void consistent_sync_page(struct page *page, unsigned long offset, size_t size, int direction); diff --git a/arch/microblaze/include/asm/uaccess.h b/arch/microblaze/include/asm/uaccess.h index 446bec2..26460d1 100644 --- a/arch/microblaze/include/asm/uaccess.h +++ b/arch/microblaze/include/asm/uaccess.h @@ -182,6 +182,39 @@ extern long __user_bad(void); * Returns zero on success, or -EFAULT on error. * On error, the variable @x is set to zero. */ +#define get_user(x, ptr) \ + __get_user_check((x), (ptr), sizeof(*(ptr))) + +#define __get_user_check(x, ptr, size) \ +({ \ + unsigned long __gu_val = 0; \ + const typeof(*(ptr)) __user *__gu_addr = (ptr); \ + int __gu_err = 0; \ + \ + if (access_ok(VERIFY_READ, __gu_addr, size)) { \ + switch (size) { \ + case 1: \ + __get_user_asm("lbu", __gu_addr, __gu_val, \ + __gu_err); \ + break; \ + case 2: \ + __get_user_asm("lhu", __gu_addr, __gu_val, \ + __gu_err); \ + break; \ + case 4: \ + __get_user_asm("lw", __gu_addr, __gu_val, \ + __gu_err); \ + break; \ + default: \ + __gu_err = __user_bad(); \ + break; \ + } \ + } else { \ + __gu_err = -EFAULT; \ + } \ + x = (typeof(*(ptr)))__gu_val; \ + __gu_err; \ +}) #define __get_user(x, ptr) \ ({ \ @@ -206,12 +239,6 @@ extern long __user_bad(void); }) -#define get_user(x, ptr) \ -({ \ - access_ok(VERIFY_READ, (ptr), sizeof(*(ptr))) \ - ? __get_user((x), (ptr)) : -EFAULT; \ -}) - #define __put_user_asm(insn, __gu_ptr, __gu_val, __gu_err) \ ({ \ __asm__ __volatile__ ( \ @@ -266,6 +293,42 @@ extern long __user_bad(void); * * Returns zero on success, or -EFAULT on error. */ +#define put_user(x, ptr) \ + __put_user_check((x), (ptr), sizeof(*(ptr))) + +#define __put_user_check(x, ptr, size) \ +({ \ + typeof(*(ptr)) __pu_val; \ + typeof(*(ptr)) __user *__pu_addr = (ptr); \ + int __pu_err = 0; \ + \ + __pu_val = (x); \ + if (access_ok(VERIFY_WRITE, __pu_addr, size)) { \ + switch (size) { \ + case 1: \ + __put_user_asm("sb", __pu_addr, __pu_val, \ + __pu_err); \ + break; \ + case 2: \ + __put_user_asm("sh", __pu_addr, __pu_val, \ + __pu_err); \ + break; \ + case 4: \ + __put_user_asm("sw", __pu_addr, __pu_val, \ + __pu_err); \ + break; \ + case 8: \ + __put_user_asm_8(__pu_addr, __pu_val, __pu_err);\ + break; \ + default: \ + __pu_err = __user_bad(); \ + break; \ + } \ + } else { \ + __pu_err = -EFAULT; \ + } \ + __pu_err; \ +}) #define __put_user(x, ptr) \ ({ \ @@ -290,18 +353,6 @@ extern long __user_bad(void); __gu_err; \ }) -#ifndef CONFIG_MMU - -#define put_user(x, ptr) __put_user((x), (ptr)) - -#else /* CONFIG_MMU */ - -#define put_user(x, ptr) \ -({ \ - access_ok(VERIFY_WRITE, (ptr), sizeof(*(ptr))) \ - ? __put_user((x), (ptr)) : -EFAULT; \ -}) -#endif /* CONFIG_MMU */ /* copy_to_from_user */ #define __copy_from_user(to, from, n) \ diff --git a/arch/microblaze/kernel/asm-offsets.c b/arch/microblaze/kernel/asm-offsets.c index 0071260..c1b459c 100644 --- a/arch/microblaze/kernel/asm-offsets.c +++ b/arch/microblaze/kernel/asm-offsets.c @@ -16,6 +16,7 @@ #include <linux/hardirq.h> #include <linux/thread_info.h> #include <linux/kbuild.h> +#include <asm/cpuinfo.h> int main(int argc, char *argv[]) { diff --git a/arch/microblaze/kernel/cpu/cache.c b/arch/microblaze/kernel/cpu/cache.c index f04d8a8..109876e 100644 --- a/arch/microblaze/kernel/cpu/cache.c +++ b/arch/microblaze/kernel/cpu/cache.c @@ -96,13 +96,16 @@ static inline void __disable_dcache_nomsr(void) } -/* Helper macro for computing the limits of cache range loops */ +/* Helper macro for computing the limits of cache range loops + * + * End address can be unaligned which is OK for C implementation. + * ASM implementation align it in ASM macros + */ #define CACHE_LOOP_LIMITS(start, end, cache_line_length, cache_size) \ do { \ int align = ~(cache_line_length - 1); \ end = min(start + cache_size, end); \ start &= align; \ - end = ((end & align) + cache_line_length); \ } while (0); /* @@ -111,9 +114,9 @@ do { \ */ #define CACHE_ALL_LOOP(cache_size, line_length, op) \ do { \ - unsigned int len = cache_size; \ + unsigned int len = cache_size - line_length; \ int step = -line_length; \ - BUG_ON(step >= 0); \ + WARN_ON(step >= 0); \ \ __asm__ __volatile__ (" 1: " #op " %0, r0; \ bgtid %0, 1b; \ @@ -122,26 +125,22 @@ do { \ : "memory"); \ } while (0); - -#define CACHE_ALL_LOOP2(cache_size, line_length, op) \ -do { \ - unsigned int len = cache_size; \ - int step = -line_length; \ - BUG_ON(step >= 0); \ - \ - __asm__ __volatile__ (" 1: " #op " r0, %0; \ - bgtid %0, 1b; \ - addk %0, %0, %1; \ - " : : "r" (len), "r" (step) \ - : "memory"); \ -} while (0); - -/* for wdc.flush/clear */ +/* Used for wdc.flush/clear which can use rB for offset which is not possible + * to use for simple wdc or wic. + * + * start address is cache aligned + * end address is not aligned, if end is aligned then I have to substract + * cacheline length because I can't flush/invalidate the next cacheline. + * If is not, I align it because I will flush/invalidate whole line. + */ #define CACHE_RANGE_LOOP_2(start, end, line_length, op) \ do { \ int step = -line_length; \ - int count = end - start; \ - BUG_ON(count <= 0); \ + int align = ~(line_length - 1); \ + int count; \ + end = ((end & align) == end) ? end - line_length : end & align; \ + count = end - start; \ + WARN_ON(count < 0); \ \ __asm__ __volatile__ (" 1: " #op " %0, %1; \ bgtid %1, 1b; \ @@ -154,7 +153,9 @@ do { \ #define CACHE_RANGE_LOOP_1(start, end, line_length, op) \ do { \ int volatile temp; \ - BUG_ON(end - start <= 0); \ + int align = ~(line_length - 1); \ + end = ((end & align) == end) ? end - line_length : end & align; \ + WARN_ON(end - start < 0); \ \ __asm__ __volatile__ (" 1: " #op " %1, r0; \ cmpu %0, %1, %2; \ @@ -360,8 +361,12 @@ static void __invalidate_dcache_all_noirq_wt(void) #endif } -/* FIXME this is weird - should be only wdc but not work - * MS: I am getting bus errors and other weird things */ +/* FIXME It is blindly invalidation as is expected + * but can't be called on noMMU in microblaze_cache_init below + * + * MS: noMMU kernel won't boot if simple wdc is used + * The reason should be that there are discared data which kernel needs + */ static void __invalidate_dcache_all_wb(void) { #ifndef ASM_LOOP @@ -369,12 +374,12 @@ static void __invalidate_dcache_all_wb(void) #endif pr_debug("%s\n", __func__); #ifdef ASM_LOOP - CACHE_ALL_LOOP2(cpuinfo.dcache_size, cpuinfo.dcache_line_length, - wdc.clear) + CACHE_ALL_LOOP(cpuinfo.dcache_size, cpuinfo.dcache_line_length, + wdc) #else for (i = 0; i < cpuinfo.dcache_size; i += cpuinfo.dcache_line_length) - __asm__ __volatile__ ("wdc.clear %0, r0;" \ + __asm__ __volatile__ ("wdc %0, r0;" \ : : "r" (i)); #endif } @@ -393,7 +398,7 @@ static void __invalidate_dcache_range_wb(unsigned long start, #ifdef ASM_LOOP CACHE_RANGE_LOOP_2(start, end, cpuinfo.dcache_line_length, wdc.clear); #else - for (i = start; i < end; i += cpuinfo.icache_line_length) + for (i = start; i < end; i += cpuinfo.dcache_line_length) __asm__ __volatile__ ("wdc.clear %0, r0;" \ : : "r" (i)); #endif @@ -413,7 +418,7 @@ static void __invalidate_dcache_range_nomsr_wt(unsigned long start, #ifdef ASM_LOOP CACHE_RANGE_LOOP_1(start, end, cpuinfo.dcache_line_length, wdc); #else - for (i = start; i < end; i += cpuinfo.icache_line_length) + for (i = start; i < end; i += cpuinfo.dcache_line_length) __asm__ __volatile__ ("wdc %0, r0;" \ : : "r" (i)); #endif @@ -437,7 +442,7 @@ static void __invalidate_dcache_range_msr_irq_wt(unsigned long start, #ifdef ASM_LOOP CACHE_RANGE_LOOP_1(start, end, cpuinfo.dcache_line_length, wdc); #else - for (i = start; i < end; i += cpuinfo.icache_line_length) + for (i = start; i < end; i += cpuinfo.dcache_line_length) __asm__ __volatile__ ("wdc %0, r0;" \ : : "r" (i)); #endif @@ -465,7 +470,7 @@ static void __invalidate_dcache_range_nomsr_irq(unsigned long start, #ifdef ASM_LOOP CACHE_RANGE_LOOP_1(start, end, cpuinfo.dcache_line_length, wdc); #else - for (i = start; i < end; i += cpuinfo.icache_line_length) + for (i = start; i < end; i += cpuinfo.dcache_line_length) __asm__ __volatile__ ("wdc %0, r0;" \ : : "r" (i)); #endif @@ -504,7 +509,7 @@ static void __flush_dcache_range_wb(unsigned long start, unsigned long end) #ifdef ASM_LOOP CACHE_RANGE_LOOP_2(start, end, cpuinfo.dcache_line_length, wdc.flush); #else - for (i = start; i < end; i += cpuinfo.icache_line_length) + for (i = start; i < end; i += cpuinfo.dcache_line_length) __asm__ __volatile__ ("wdc.flush %0, r0;" \ : : "r" (i)); #endif @@ -650,7 +655,11 @@ void microblaze_cache_init(void) } } } - invalidate_dcache(); +/* FIXME Invalidation is done in U-BOOT + * WT cache: Data is already written to main memory + * WB cache: Discard data on noMMU which caused that kernel doesn't boot + */ + /* invalidate_dcache(); */ enable_dcache(); invalidate_icache(); diff --git a/arch/microblaze/kernel/cpu/mb.c b/arch/microblaze/kernel/cpu/mb.c index 0c912b2..4216eb1 100644 --- a/arch/microblaze/kernel/cpu/mb.c +++ b/arch/microblaze/kernel/cpu/mb.c @@ -98,15 +98,17 @@ static int show_cpuinfo(struct seq_file *m, void *v) if (cpuinfo.use_icache) count += seq_printf(m, - "Icache:\t\t%ukB\n", - cpuinfo.icache_size >> 10); + "Icache:\t\t%ukB\tline length:\t%dB\n", + cpuinfo.icache_size >> 10, + cpuinfo.icache_line_length); else count += seq_printf(m, "Icache:\t\tno\n"); if (cpuinfo.use_dcache) { count += seq_printf(m, - "Dcache:\t\t%ukB\n", - cpuinfo.dcache_size >> 10); + "Dcache:\t\t%ukB\tline length:\t%dB\n", + cpuinfo.dcache_size >> 10, + cpuinfo.dcache_line_length); if (cpuinfo.dcache_wb) count += seq_printf(m, "\t\twrite-back\n"); else diff --git a/arch/microblaze/kernel/dma.c b/arch/microblaze/kernel/dma.c index ce72dd4..9dcd90b 100644 --- a/arch/microblaze/kernel/dma.c +++ b/arch/microblaze/kernel/dma.c @@ -74,7 +74,7 @@ static void dma_direct_free_coherent(struct device *dev, size_t size, void *vaddr, dma_addr_t dma_handle) { #ifdef NOT_COHERENT_CACHE - consistent_free(vaddr); + consistent_free(size, vaddr); #else free_pages((unsigned long)vaddr, get_order(size)); #endif diff --git a/arch/microblaze/kernel/entry-nommu.S b/arch/microblaze/kernel/entry-nommu.S index 391d619..8cc18cd 100644 --- a/arch/microblaze/kernel/entry-nommu.S +++ b/arch/microblaze/kernel/entry-nommu.S @@ -476,6 +476,8 @@ ENTRY(ret_from_fork) nop work_pending: + enable_irq + andi r11, r19, _TIF_NEED_RESCHED beqi r11, 1f bralid r15, schedule diff --git a/arch/microblaze/kernel/exceptions.c b/arch/microblaze/kernel/exceptions.c index d9f70f8..02cbdfe 100644 --- a/arch/microblaze/kernel/exceptions.c +++ b/arch/microblaze/kernel/exceptions.c @@ -121,7 +121,7 @@ asmlinkage void full_exception(struct pt_regs *regs, unsigned int type, } printk(KERN_WARNING "Divide by zero exception " \ "in kernel mode.\n"); - die("Divide by exception", regs, SIGBUS); + die("Divide by zero exception", regs, SIGBUS); break; case MICROBLAZE_FPU_EXCEPTION: pr_debug(KERN_WARNING "FPU exception\n"); diff --git a/arch/microblaze/kernel/head.S b/arch/microblaze/kernel/head.S index da6a5f5..1bf7398 100644 --- a/arch/microblaze/kernel/head.S +++ b/arch/microblaze/kernel/head.S @@ -28,6 +28,7 @@ * for more details. */ +#include <linux/init.h> #include <linux/linkage.h> #include <asm/thread_info.h> #include <asm/page.h> @@ -49,7 +50,7 @@ swapper_pg_dir: #endif /* CONFIG_MMU */ - .text + __HEAD ENTRY(_start) #if CONFIG_KERNEL_BASE_ADDR == 0 brai TOPHYS(real_start) diff --git a/arch/microblaze/kernel/irq.c b/arch/microblaze/kernel/irq.c index 6f39e2c..8f120ac 100644 --- a/arch/microblaze/kernel/irq.c +++ b/arch/microblaze/kernel/irq.c @@ -9,6 +9,7 @@ */ #include <linux/init.h> +#include <linux/ftrace.h> #include <linux/kernel.h> #include <linux/hardirq.h> #include <linux/interrupt.h> @@ -32,7 +33,7 @@ EXPORT_SYMBOL_GPL(irq_of_parse_and_map); static u32 concurrent_irq; -void do_IRQ(struct pt_regs *regs) +void __irq_entry do_IRQ(struct pt_regs *regs) { unsigned int irq; struct pt_regs *old_regs = set_irq_regs(regs); diff --git a/arch/microblaze/kernel/microblaze_ksyms.c b/arch/microblaze/kernel/microblaze_ksyms.c index bc4dcb7..ff85f77 100644 --- a/arch/microblaze/kernel/microblaze_ksyms.c +++ b/arch/microblaze/kernel/microblaze_ksyms.c @@ -52,3 +52,14 @@ EXPORT_SYMBOL_GPL(_ebss); extern void _mcount(void); EXPORT_SYMBOL(_mcount); #endif + +/* + * Assembly functions that may be used (directly or indirectly) by modules + */ +EXPORT_SYMBOL(__copy_tofrom_user); +EXPORT_SYMBOL(__strncpy_user); + +#ifdef CONFIG_OPT_LIB_ASM +EXPORT_SYMBOL(memcpy); +EXPORT_SYMBOL(memmove); +#endif diff --git a/arch/microblaze/kernel/misc.S b/arch/microblaze/kernel/misc.S index 7cf8649..0fb5fc6 100644 --- a/arch/microblaze/kernel/misc.S +++ b/arch/microblaze/kernel/misc.S @@ -93,39 +93,3 @@ early_console_reg_tlb_alloc: nop .size early_console_reg_tlb_alloc, . - early_console_reg_tlb_alloc - -/* - * Copy a whole page (4096 bytes). - */ -#define COPY_16_BYTES \ - lwi r7, r6, 0; \ - lwi r8, r6, 4; \ - lwi r9, r6, 8; \ - lwi r10, r6, 12; \ - swi r7, r5, 0; \ - swi r8, r5, 4; \ - swi r9, r5, 8; \ - swi r10, r5, 12 - - -/* FIXME DCACHE_LINE_BYTES (CONFIG_XILINX_MICROBLAZE0_DCACHE_LINE_LEN * 4)*/ -#define DCACHE_LINE_BYTES (4 * 4) - -.globl copy_page; -.type copy_page, @function -.align 4; -copy_page: - ori r11, r0, (PAGE_SIZE/DCACHE_LINE_BYTES) - 1 -_copy_page_loop: - COPY_16_BYTES -#if DCACHE_LINE_BYTES >= 32 - COPY_16_BYTES -#endif - addik r6, r6, DCACHE_LINE_BYTES - addik r5, r5, DCACHE_LINE_BYTES - bneid r11, _copy_page_loop - addik r11, r11, -1 - rtsd r15, 8 - nop - - .size copy_page, . - copy_page diff --git a/arch/microblaze/kernel/module.c b/arch/microblaze/kernel/module.c index cbecf11..0e73f66 100644 --- a/arch/microblaze/kernel/module.c +++ b/arch/microblaze/kernel/module.c @@ -16,6 +16,7 @@ #include <linux/string.h> #include <asm/pgtable.h> +#include <asm/cacheflush.h> void *module_alloc(unsigned long size) { @@ -151,6 +152,7 @@ int apply_relocate_add(Elf32_Shdr *sechdrs, const char *strtab, int module_finalize(const Elf32_Ehdr *hdr, const Elf_Shdr *sechdrs, struct module *module) { + flush_dcache(); return 0; } diff --git a/arch/microblaze/kernel/traps.c b/arch/microblaze/kernel/traps.c index 5e4570e..75e4920 100644 --- a/arch/microblaze/kernel/traps.c +++ b/arch/microblaze/kernel/traps.c @@ -95,37 +95,3 @@ void dump_stack(void) show_stack(NULL, NULL); } EXPORT_SYMBOL(dump_stack); - -#ifdef CONFIG_MMU -void __bug(const char *file, int line, void *data) -{ - if (data) - printk(KERN_CRIT "kernel BUG at %s:%d (data = %p)!\n", - file, line, data); - else - printk(KERN_CRIT "kernel BUG at %s:%d!\n", file, line); - - machine_halt(); -} - -int bad_trap(int trap_num, struct pt_regs *regs) -{ - printk(KERN_CRIT - "unimplemented trap %d called at 0x%08lx, pid %d!\n", - trap_num, regs->pc, current->pid); - return -ENOSYS; -} - -int debug_trap(struct pt_regs *regs) -{ - int i; - printk(KERN_CRIT "debug trap\n"); - for (i = 0; i < 32; i++) { - /* printk("r%i:%08X\t",i,regs->gpr[i]); */ - if ((i % 4) == 3) - printk(KERN_CRIT "\n"); - } - printk(KERN_CRIT "pc:%08lX\tmsr:%08lX\n", regs->pc, regs->msr); - return -ENOSYS; -} -#endif diff --git a/arch/microblaze/kernel/vmlinux.lds.S b/arch/microblaze/kernel/vmlinux.lds.S index 5ef619a..db72d71 100644 --- a/arch/microblaze/kernel/vmlinux.lds.S +++ b/arch/microblaze/kernel/vmlinux.lds.S @@ -24,7 +24,8 @@ SECTIONS { .text : AT(ADDR(.text) - LOAD_OFFSET) { _text = . ; _stext = . ; - *(.text .text.*) + HEAD_TEXT + TEXT_TEXT *(.fixup) EXIT_TEXT EXIT_CALL diff --git a/arch/microblaze/mm/consistent.c b/arch/microblaze/mm/consistent.c index f956e24..5a59dad 100644 --- a/arch/microblaze/mm/consistent.c +++ b/arch/microblaze/mm/consistent.c @@ -42,11 +42,12 @@ #include <linux/uaccess.h> #include <asm/pgtable.h> #include <asm/cpuinfo.h> +#include <asm/tlbflush.h> #ifndef CONFIG_MMU - /* I have to use dcache values because I can't relate on ram size */ -#define UNCACHED_SHADOW_MASK (cpuinfo.dcache_high - cpuinfo.dcache_base + 1) +# define UNCACHED_SHADOW_MASK (cpuinfo.dcache_high - cpuinfo.dcache_base + 1) +#endif /* * Consistent memory allocators. Used for DMA devices that want to @@ -60,71 +61,16 @@ */ void *consistent_alloc(int gfp, size_t size, dma_addr_t *dma_handle) { - struct page *page, *end, *free; - unsigned long order; - void *ret, *virt; - - if (in_interrupt()) - BUG(); - - size = PAGE_ALIGN(size); - order = get_order(size); - - page = alloc_pages(gfp, order); - if (!page) - goto no_page; - - /* We could do with a page_to_phys and page_to_bus here. */ - virt = page_address(page); - ret = ioremap(virt_to_phys(virt), size); - if (!ret) - goto no_remap; - - /* - * Here's the magic! Note if the uncached shadow is not implemented, - * it's up to the calling code to also test that condition and make - * other arranegments, such as manually flushing the cache and so on. - */ -#ifdef CONFIG_XILINX_UNCACHED_SHADOW - ret = (void *)((unsigned) ret | UNCACHED_SHADOW_MASK); -#endif - /* dma_handle is same as physical (shadowed) address */ - *dma_handle = (dma_addr_t)ret; - - /* - * free wasted pages. We skip the first page since we know - * that it will have count = 1 and won't require freeing. - * We also mark the pages in use as reserved so that - * remap_page_range works. - */ - page = virt_to_page(virt); - free = page + (size >> PAGE_SHIFT); - end = page + (1 << order); - - for (; page < end; page++) { - init_page_count(page); - if (page >= free) - __free_page(page); - else - SetPageReserved(page); - } - - return ret; -no_remap: - __free_pages(page, order); -no_page: - return NULL; -} - -#else + unsigned long order, vaddr; + void *ret; + unsigned int i, err = 0; + struct page *page, *end; -void *consistent_alloc(int gfp, size_t size, dma_addr_t *dma_handle) -{ - int order, err, i; - unsigned long page, va, flags; +#ifdef CONFIG_MMU phys_addr_t pa; struct vm_struct *area; - void *ret; + unsigned long va; +#endif if (in_interrupt()) BUG(); @@ -133,71 +79,133 @@ void *consistent_alloc(int gfp, size_t size, dma_addr_t *dma_handle) size = PAGE_ALIGN(size); order = get_order(size); - page = __get_free_pages(gfp, order); - if (!page) { - BUG(); + vaddr = __get_free_pages(gfp, order); + if (!vaddr) return NULL; - } /* * we need to ensure that there are no cachelines in use, * or worse dirty in this area. */ - flush_dcache_range(virt_to_phys(page), virt_to_phys(page) + size); + flush_dcache_range(virt_to_phys((void *)vaddr), + virt_to_phys((void *)vaddr) + size); +#ifndef CONFIG_MMU + ret = (void *)vaddr; + /* + * Here's the magic! Note if the uncached shadow is not implemented, + * it's up to the calling code to also test that condition and make + * other arranegments, such as manually flushing the cache and so on. + */ +# ifdef CONFIG_XILINX_UNCACHED_SHADOW + ret = (void *)((unsigned) ret | UNCACHED_SHADOW_MASK); +# endif + if ((unsigned int)ret > cpuinfo.dcache_base && + (unsigned int)ret < cpuinfo.dcache_high) + printk(KERN_WARNING + "ERROR: Your cache coherent area is CACHED!!!\n"); + + /* dma_handle is same as physical (shadowed) address */ + *dma_handle = (dma_addr_t)ret; +#else /* Allocate some common virtual space to map the new pages. */ area = get_vm_area(size, VM_ALLOC); - if (area == NULL) { - free_pages(page, order); + if (!area) { + free_pages(vaddr, order); return NULL; } va = (unsigned long) area->addr; ret = (void *)va; /* This gives us the real physical address of the first page. */ - *dma_handle = pa = virt_to_bus((void *)page); - - /* MS: This is the whole magic - use cache inhibit pages */ - flags = _PAGE_KERNEL | _PAGE_NO_CACHE; + *dma_handle = pa = virt_to_bus((void *)vaddr); +#endif /* - * Set refcount=1 on all pages in an order>0 - * allocation so that vfree() will actually - * free all pages that were allocated. + * free wasted pages. We skip the first page since we know + * that it will have count = 1 and won't require freeing. + * We also mark the pages in use as reserved so that + * remap_page_range works. */ - if (order > 0) { - struct page *rpage = virt_to_page(page); - for (i = 1; i < (1 << order); i++) - init_page_count(rpage+i); + page = virt_to_page(vaddr); + end = page + (1 << order); + + split_page(page, order); + + for (i = 0; i < size && err == 0; i += PAGE_SIZE) { +#ifdef CONFIG_MMU + /* MS: This is the whole magic - use cache inhibit pages */ + err = map_page(va + i, pa + i, _PAGE_KERNEL | _PAGE_NO_CACHE); +#endif + + SetPageReserved(page); + page++; } - err = 0; - for (i = 0; i < size && err == 0; i += PAGE_SIZE) - err = map_page(va+i, pa+i, flags); + /* Free the otherwise unused pages. */ + while (page < end) { + __free_page(page); + page++; + } if (err) { - vfree((void *)va); + free_pages(vaddr, order); return NULL; } return ret; } -#endif /* CONFIG_MMU */ EXPORT_SYMBOL(consistent_alloc); /* * free page(s) as defined by the above mapping. */ -void consistent_free(void *vaddr) +void consistent_free(size_t size, void *vaddr) { + struct page *page; + if (in_interrupt()) BUG(); + size = PAGE_ALIGN(size); + +#ifndef CONFIG_MMU /* Clear SHADOW_MASK bit in address, and free as per usual */ -#ifdef CONFIG_XILINX_UNCACHED_SHADOW +# ifdef CONFIG_XILINX_UNCACHED_SHADOW vaddr = (void *)((unsigned)vaddr & ~UNCACHED_SHADOW_MASK); +# endif + page = virt_to_page(vaddr); + + do { + ClearPageReserved(page); + __free_page(page); + page++; + } while (size -= PAGE_SIZE); +#else + do { + pte_t *ptep; + unsigned long pfn; + + ptep = pte_offset_kernel(pmd_offset(pgd_offset_k( + (unsigned int)vaddr), + (unsigned int)vaddr), + (unsigned int)vaddr); + if (!pte_none(*ptep) && pte_present(*ptep)) { + pfn = pte_pfn(*ptep); + pte_clear(&init_mm, (unsigned int)vaddr, ptep); + if (pfn_valid(pfn)) { + page = pfn_to_page(pfn); + + ClearPageReserved(page); + __free_page(page); + } + } + vaddr += PAGE_SIZE; + } while (size -= PAGE_SIZE); + + /* flush tlb */ + flush_tlb_all(); #endif - vfree(vaddr); } EXPORT_SYMBOL(consistent_free); @@ -221,7 +229,7 @@ void consistent_sync(void *vaddr, size_t size, int direction) case PCI_DMA_NONE: BUG(); case PCI_DMA_FROMDEVICE: /* invalidate only */ - flush_dcache_range(start, end); + invalidate_dcache_range(start, end); break; case PCI_DMA_TODEVICE: /* writeback only */ flush_dcache_range(start, end); diff --git a/arch/microblaze/mm/fault.c b/arch/microblaze/mm/fault.c index 7af87f4..bab9229 100644 --- a/arch/microblaze/mm/fault.c +++ b/arch/microblaze/mm/fault.c @@ -273,16 +273,11 @@ bad_area_nosemaphore: * us unable to handle the page fault gracefully. */ out_of_memory: - if (current->pid == 1) { - yield(); - down_read(&mm->mmap_sem); - goto survive; - } up_read(&mm->mmap_sem); - printk(KERN_WARNING "VM: killing process %s\n", current->comm); - if (user_mode(regs)) - do_exit(SIGKILL); - bad_page_fault(regs, address, SIGKILL); + if (!user_mode(regs)) + bad_page_fault(regs, address, SIGKILL); + else + pagefault_out_of_memory(); return; do_sigbus: diff --git a/arch/microblaze/mm/init.c b/arch/microblaze/mm/init.c index f42c2dd..cca3579 100644 --- a/arch/microblaze/mm/init.c +++ b/arch/microblaze/mm/init.c @@ -47,6 +47,7 @@ unsigned long memory_start; EXPORT_SYMBOL(memory_start); unsigned long memory_end; /* due to mm/nommu.c */ unsigned long memory_size; +EXPORT_SYMBOL(memory_size); /* * paging_init() sets up the page tables - in fact we've already done this. diff --git a/arch/microblaze/mm/pgtable.c b/arch/microblaze/mm/pgtable.c index d31312c..59bf233 100644 --- a/arch/microblaze/mm/pgtable.c +++ b/arch/microblaze/mm/pgtable.c @@ -42,6 +42,7 @@ unsigned long ioremap_base; unsigned long ioremap_bot; +EXPORT_SYMBOL(ioremap_bot); /* The maximum lowmem defaults to 768Mb, but this can be configured to * another value. @@ -161,24 +162,6 @@ int map_page(unsigned long va, phys_addr_t pa, int flags) return err; } -void __init adjust_total_lowmem(void) -{ -/* TBD */ -#if 0 - unsigned long max_low_mem = MAX_LOW_MEM; - - if (total_lowmem > max_low_mem) { - total_lowmem = max_low_mem; -#ifndef CONFIG_HIGHMEM - printk(KERN_INFO "Warning, memory limited to %ld Mb, use " - "CONFIG_HIGHMEM to reach %ld Mb\n", - max_low_mem >> 20, total_memory >> 20); - total_memory = total_lowmem; -#endif /* CONFIG_HIGHMEM */ - } -#endif -} - /* * Map in all of physical memory starting at CONFIG_KERNEL_START. */ @@ -206,24 +189,6 @@ void __init mapin_ram(void) /* is x a power of 2? */ #define is_power_of_2(x) ((x) != 0 && (((x) & ((x) - 1)) == 0)) -/* - * Set up a mapping for a block of I/O. - * virt, phys, size must all be page-aligned. - * This should only be called before ioremap is called. - */ -void __init io_block_mapping(unsigned long virt, phys_addr_t phys, - unsigned int size, int flags) -{ - int i; - - if (virt > CONFIG_KERNEL_START && virt < ioremap_bot) - ioremap_bot = ioremap_base = virt; - - /* Put it in the page tables. */ - for (i = 0; i < size; i += PAGE_SIZE) - map_page(virt + i, phys + i, flags); -} - /* Scan the real Linux page tables and return a PTE pointer for * a virtual address in a context. * Returns true (1) if PTE was found, zero otherwise. The pointer to @@ -274,3 +239,18 @@ unsigned long iopa(unsigned long addr) return pa; } + +__init_refok pte_t *pte_alloc_one_kernel(struct mm_struct *mm, + unsigned long address) +{ + pte_t *pte; + if (mem_init_done) { + pte = (pte_t *)__get_free_page(GFP_KERNEL | + __GFP_REPEAT | __GFP_ZERO); + } else { + pte = (pte_t *)early_get_page(); + if (pte) + clear_page(pte); + } + return pte; +} diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c index 740bb32..9cb782b 100644 --- a/arch/microblaze/pci/pci-common.c +++ b/arch/microblaze/pci/pci-common.c @@ -1025,7 +1025,7 @@ static void __devinit pcibios_fixup_bridge(struct pci_bus *bus) struct pci_dev *dev = bus->self; - for (i = 0; i < PCI_BUS_NUM_RESOURCES; ++i) { + pci_bus_for_each_resource(bus, res, i) { res = bus->resource[i]; if (!res) continue; @@ -1131,21 +1131,20 @@ static int skip_isa_ioresource_align(struct pci_dev *dev) * but we want to try to avoid allocating at 0x2900-0x2bff * which might have be mirrored at 0x0100-0x03ff.. */ -void pcibios_align_resource(void *data, struct resource *res, +resource_size_t pcibios_align_resource(void *data, const struct resource *res, resource_size_t size, resource_size_t align) { struct pci_dev *dev = data; + resource_size_t start = res->start; if (res->flags & IORESOURCE_IO) { - resource_size_t start = res->start; - if (skip_isa_ioresource_align(dev)) - return; - if (start & 0x300) { + return start; + if (start & 0x300) start = (start + 0x3ff) & ~0x3ff; - res->start = start; - } } + + return start; } EXPORT_SYMBOL(pcibios_align_resource); @@ -1228,7 +1227,7 @@ void pcibios_allocate_bus_resources(struct pci_bus *bus) pr_debug("PCI: Allocating bus resources for %04x:%02x...\n", pci_domain_nr(bus), bus->number); - for (i = 0; i < PCI_BUS_NUM_RESOURCES; ++i) { + pci_bus_for_each_resource(bus, res, i) { res = bus->resource[i]; if (!res || !res->flags || res->start > res->end || res->parent) @@ -1508,7 +1507,7 @@ void pcibios_finish_adding_to_bus(struct pci_bus *bus) pci_bus_add_devices(bus); /* Fixup EEH */ - eeh_add_device_tree_late(bus); + /* eeh_add_device_tree_late(bus); */ } EXPORT_SYMBOL_GPL(pcibios_finish_adding_to_bus); diff --git a/arch/mips/include/asm/atomic.h b/arch/mips/include/asm/atomic.h index 519197e..59dc0c7 100644 --- a/arch/mips/include/asm/atomic.h +++ b/arch/mips/include/asm/atomic.h @@ -29,7 +29,7 @@ * * Atomically reads the value of @v. */ -#define atomic_read(v) ((v)->counter) +#define atomic_read(v) (*(volatile int *)&(v)->counter) /* * atomic_set - set atomic variable @@ -410,7 +410,7 @@ static __inline__ int atomic_add_unless(atomic_t *v, int a, int u) * @v: pointer of type atomic64_t * */ -#define atomic64_read(v) ((v)->counter) +#define atomic64_read(v) (*(volatile long *)&(v)->counter) /* * atomic64_set - set atomic variable diff --git a/arch/mips/include/asm/i8253.h b/arch/mips/include/asm/i8253.h index 032ca73..48bb823 100644 --- a/arch/mips/include/asm/i8253.h +++ b/arch/mips/include/asm/i8253.h @@ -12,7 +12,7 @@ #define PIT_CH0 0x40 #define PIT_CH2 0x42 -extern spinlock_t i8253_lock; +extern raw_spinlock_t i8253_lock; extern void setup_pit_timer(void); diff --git a/arch/mips/include/asm/mipsregs.h b/arch/mips/include/asm/mipsregs.h index 49382d5..c6e3c93 100644 --- a/arch/mips/include/asm/mipsregs.h +++ b/arch/mips/include/asm/mipsregs.h @@ -135,6 +135,12 @@ #define FPU_CSR_COND7 0x80000000 /* $fcc7 */ /* + * Bits 18 - 20 of the FPU Status Register will be read as 0, + * and should be written as zero. + */ +#define FPU_CSR_RSVD 0x001c0000 + +/* * X the exception cause indicator * E the exception enable * S the sticky/flag bit @@ -161,7 +167,8 @@ #define FPU_CSR_UDF_S 0x00000008 #define FPU_CSR_INE_S 0x00000004 -/* rounding mode */ +/* Bits 0 and 1 of FPU Status Register specify the rounding mode */ +#define FPU_CSR_RM 0x00000003 #define FPU_CSR_RN 0x0 /* nearest */ #define FPU_CSR_RZ 0x1 /* towards zero */ #define FPU_CSR_RU 0x2 /* towards +Infinity */ diff --git a/arch/mips/kernel/i8253.c b/arch/mips/kernel/i8253.c index ed5c441..9479406 100644 --- a/arch/mips/kernel/i8253.c +++ b/arch/mips/kernel/i8253.c @@ -15,7 +15,7 @@ #include <asm/io.h> #include <asm/time.h> -DEFINE_SPINLOCK(i8253_lock); +DEFINE_RAW_SPINLOCK(i8253_lock); EXPORT_SYMBOL(i8253_lock); /* @@ -26,7 +26,7 @@ EXPORT_SYMBOL(i8253_lock); static void init_pit_timer(enum clock_event_mode mode, struct clock_event_device *evt) { - spin_lock(&i8253_lock); + raw_spin_lock(&i8253_lock); switch(mode) { case CLOCK_EVT_MODE_PERIODIC: @@ -55,7 +55,7 @@ static void init_pit_timer(enum clock_event_mode mode, /* Nothing to do here */ break; } - spin_unlock(&i8253_lock); + raw_spin_unlock(&i8253_lock); } /* @@ -65,10 +65,10 @@ static void init_pit_timer(enum clock_event_mode mode, */ static int pit_next_event(unsigned long delta, struct clock_event_device *evt) { - spin_lock(&i8253_lock); + raw_spin_lock(&i8253_lock); outb_p(delta & 0xff , PIT_CH0); /* LSB */ outb(delta >> 8 , PIT_CH0); /* MSB */ - spin_unlock(&i8253_lock); + raw_spin_unlock(&i8253_lock); return 0; } @@ -137,7 +137,7 @@ static cycle_t pit_read(struct clocksource *cs) static int old_count; static u32 old_jifs; - spin_lock_irqsave(&i8253_lock, flags); + raw_spin_lock_irqsave(&i8253_lock, flags); /* * Although our caller may have the read side of xtime_lock, * this is now a seqlock, and we are cheating in this routine @@ -183,7 +183,7 @@ static cycle_t pit_read(struct clocksource *cs) old_count = count; old_jifs = jifs; - spin_unlock_irqrestore(&i8253_lock, flags); + raw_spin_unlock_irqrestore(&i8253_lock, flags); count = (LATCH - 1) - count; diff --git a/arch/mips/kernel/scall64-n32.S b/arch/mips/kernel/scall64-n32.S index 44337ba..a5297e2 100644 --- a/arch/mips/kernel/scall64-n32.S +++ b/arch/mips/kernel/scall64-n32.S @@ -385,7 +385,7 @@ EXPORT(sysn32_call_table) PTR sys_fchmodat PTR sys_faccessat PTR compat_sys_pselect6 - PTR sys_ppoll /* 6265 */ + PTR compat_sys_ppoll /* 6265 */ PTR sys_unshare PTR sys_splice PTR sys_sync_file_range diff --git a/arch/mips/math-emu/cp1emu.c b/arch/mips/math-emu/cp1emu.c index 8f2f8e9..f2338d1 100644 --- a/arch/mips/math-emu/cp1emu.c +++ b/arch/mips/math-emu/cp1emu.c @@ -78,6 +78,9 @@ DEFINE_PER_CPU(struct mips_fpu_emulator_stats, fpuemustats); #define FPCREG_RID 0 /* $0 = revision id */ #define FPCREG_CSR 31 /* $31 = csr */ +/* Determine rounding mode from the RM bits of the FCSR */ +#define modeindex(v) ((v) & FPU_CSR_RM) + /* Convert Mips rounding mode (0..3) to IEEE library modes. */ static const unsigned char ieee_rm[4] = { [FPU_CSR_RN] = IEEE754_RN, @@ -384,10 +387,14 @@ static int cop1Emulate(struct pt_regs *xcp, struct mips_fpu_struct *ctx) (void *) (xcp->cp0_epc), MIPSInst_RT(ir), value); #endif - value &= (FPU_CSR_FLUSH | FPU_CSR_ALL_E | FPU_CSR_ALL_S | 0x03); - ctx->fcr31 &= ~(FPU_CSR_FLUSH | FPU_CSR_ALL_E | FPU_CSR_ALL_S | 0x03); - /* convert to ieee library modes */ - ctx->fcr31 |= (value & ~0x3) | ieee_rm[value & 0x3]; + + /* + * Don't write reserved bits, + * and convert to ieee library modes + */ + ctx->fcr31 = (value & + ~(FPU_CSR_RSVD | FPU_CSR_RM)) | + ieee_rm[modeindex(value)]; } if ((ctx->fcr31 >> 5) & ctx->fcr31 & FPU_CSR_ALL_E) { return SIGFPE; diff --git a/arch/mips/oprofile/op_model_loongson2.c b/arch/mips/oprofile/op_model_loongson2.c index 29e2326..fa3bf66 100644 --- a/arch/mips/oprofile/op_model_loongson2.c +++ b/arch/mips/oprofile/op_model_loongson2.c @@ -122,7 +122,7 @@ static irqreturn_t loongson2_perfcount_handler(int irq, void *dev_id) */ /* Check whether the irq belongs to me */ - enabled = read_c0_perfcnt() & LOONGSON2_PERFCNT_INT_EN; + enabled = read_c0_perfctrl() & LOONGSON2_PERFCNT_INT_EN; if (!enabled) return IRQ_NONE; enabled = reg.cnt1_enabled | reg.cnt2_enabled; diff --git a/arch/mn10300/include/asm/atomic.h b/arch/mn10300/include/asm/atomic.h index 5bf5be9..e41222d 100644 --- a/arch/mn10300/include/asm/atomic.h +++ b/arch/mn10300/include/asm/atomic.h @@ -31,7 +31,7 @@ * Atomically reads the value of @v. Note that the guaranteed * useful range of an atomic_t is only 24 bits. */ -#define atomic_read(v) ((v)->counter) +#define atomic_read(v) (*(volatile int *)&(v)->counter) /** * atomic_set - set atomic variable diff --git a/arch/parisc/include/asm/atomic.h b/arch/parisc/include/asm/atomic.h index 716634d..f819559 100644 --- a/arch/parisc/include/asm/atomic.h +++ b/arch/parisc/include/asm/atomic.h @@ -189,7 +189,7 @@ static __inline__ void atomic_set(atomic_t *v, int i) static __inline__ int atomic_read(const atomic_t *v) { - return v->counter; + return (*(volatile int *)&(v)->counter); } /* exported interface */ @@ -286,7 +286,7 @@ atomic64_set(atomic64_t *v, s64 i) static __inline__ s64 atomic64_read(const atomic64_t *v) { - return v->counter; + return (*(volatile long *)&(v)->counter); } #define atomic64_add(i,v) ((void)(__atomic64_add_return( ((s64)(i)),(v)))) diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h index 9f4c9d4f..bd100fc 100644 --- a/arch/powerpc/include/asm/hw_irq.h +++ b/arch/powerpc/include/asm/hw_irq.h @@ -130,43 +130,5 @@ static inline int irqs_disabled_flags(unsigned long flags) */ struct irq_chip; -#ifdef CONFIG_PERF_EVENTS - -#ifdef CONFIG_PPC64 -static inline unsigned long test_perf_event_pending(void) -{ - unsigned long x; - - asm volatile("lbz %0,%1(13)" - : "=r" (x) - : "i" (offsetof(struct paca_struct, perf_event_pending))); - return x; -} - -static inline void set_perf_event_pending(void) -{ - asm volatile("stb %0,%1(13)" : : - "r" (1), - "i" (offsetof(struct paca_struct, perf_event_pending))); -} - -static inline void clear_perf_event_pending(void) -{ - asm volatile("stb %0,%1(13)" : : - "r" (0), - "i" (offsetof(struct paca_struct, perf_event_pending))); -} -#endif /* CONFIG_PPC64 */ - -#else /* CONFIG_PERF_EVENTS */ - -static inline unsigned long test_perf_event_pending(void) -{ - return 0; -} - -static inline void clear_perf_event_pending(void) {} -#endif /* CONFIG_PERF_EVENTS */ - #endif /* __KERNEL__ */ #endif /* _ASM_POWERPC_HW_IRQ_H */ diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c index 957ceb7..c09138d 100644 --- a/arch/powerpc/kernel/asm-offsets.c +++ b/arch/powerpc/kernel/asm-offsets.c @@ -133,7 +133,6 @@ int main(void) DEFINE(PACAKMSR, offsetof(struct paca_struct, kernel_msr)); DEFINE(PACASOFTIRQEN, offsetof(struct paca_struct, soft_enabled)); DEFINE(PACAHARDIRQEN, offsetof(struct paca_struct, hard_enabled)); - DEFINE(PACAPERFPEND, offsetof(struct paca_struct, perf_event_pending)); DEFINE(PACACONTEXTID, offsetof(struct paca_struct, context.id)); #ifdef CONFIG_PPC_MM_SLICES DEFINE(PACALOWSLICESPSIZE, offsetof(struct paca_struct, diff --git a/arch/powerpc/kernel/dma-swiotlb.c b/arch/powerpc/kernel/dma-swiotlb.c index 59c9285..4ff4da2c 100644 --- a/arch/powerpc/kernel/dma-swiotlb.c +++ b/arch/powerpc/kernel/dma-swiotlb.c @@ -1,7 +1,8 @@ /* * Contains routines needed to support swiotlb for ppc. * - * Copyright (C) 2009 Becky Bruce, Freescale Semiconductor + * Copyright (C) 2009-2010 Freescale Semiconductor, Inc. + * Author: Becky Bruce * * This program is free software; you can redistribute it and/or modify it * under the terms of the GNU General Public License as published by the @@ -70,7 +71,7 @@ static int ppc_swiotlb_bus_notify(struct notifier_block *nb, sd->max_direct_dma_addr = 0; /* May need to bounce if the device can't address all of DRAM */ - if (dma_get_mask(dev) < lmb_end_of_DRAM()) + if ((dma_get_mask(dev) + 1) < lmb_end_of_DRAM()) set_dma_ops(dev, &swiotlb_dma_ops); return NOTIFY_DONE; diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S index 07109d8..42e9d90 100644 --- a/arch/powerpc/kernel/entry_64.S +++ b/arch/powerpc/kernel/entry_64.S @@ -556,15 +556,6 @@ ALT_FW_FTR_SECTION_END_IFCLR(FW_FEATURE_ISERIES) 2: TRACE_AND_RESTORE_IRQ(r5); -#ifdef CONFIG_PERF_EVENTS - /* check paca->perf_event_pending if we're enabling ints */ - lbz r3,PACAPERFPEND(r13) - and. r3,r3,r5 - beq 27f - bl .perf_event_do_pending -27: -#endif /* CONFIG_PERF_EVENTS */ - /* extract EE bit and use it to restore paca->hard_enabled */ ld r3,_MSR(r1) rldicl r4,r3,49,63 /* r0 = (r3 >> 15) & 1 */ diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c index 64f6f20..066bd31 100644 --- a/arch/powerpc/kernel/irq.c +++ b/arch/powerpc/kernel/irq.c @@ -53,7 +53,6 @@ #include <linux/bootmem.h> #include <linux/pci.h> #include <linux/debugfs.h> -#include <linux/perf_event.h> #include <asm/uaccess.h> #include <asm/system.h> @@ -145,11 +144,6 @@ notrace void raw_local_irq_restore(unsigned long en) } #endif /* CONFIG_PPC_STD_MMU_64 */ - if (test_perf_event_pending()) { - clear_perf_event_pending(); - perf_event_do_pending(); - } - /* * if (get_paca()->hard_enabled) return; * But again we need to take care that gcc gets hard_enabled directly diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c index 1b16b9a..0441bbd 100644 --- a/arch/powerpc/kernel/time.c +++ b/arch/powerpc/kernel/time.c @@ -532,25 +532,60 @@ void __init iSeries_time_init_early(void) } #endif /* CONFIG_PPC_ISERIES */ -#if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_PPC32) -DEFINE_PER_CPU(u8, perf_event_pending); +#ifdef CONFIG_PERF_EVENTS -void set_perf_event_pending(void) +/* + * 64-bit uses a byte in the PACA, 32-bit uses a per-cpu variable... + */ +#ifdef CONFIG_PPC64 +static inline unsigned long test_perf_event_pending(void) { - get_cpu_var(perf_event_pending) = 1; - set_dec(1); - put_cpu_var(perf_event_pending); + unsigned long x; + + asm volatile("lbz %0,%1(13)" + : "=r" (x) + : "i" (offsetof(struct paca_struct, perf_event_pending))); + return x; } +static inline void set_perf_event_pending_flag(void) +{ + asm volatile("stb %0,%1(13)" : : + "r" (1), + "i" (offsetof(struct paca_struct, perf_event_pending))); +} + +static inline void clear_perf_event_pending(void) +{ + asm volatile("stb %0,%1(13)" : : + "r" (0), + "i" (offsetof(struct paca_struct, perf_event_pending))); +} + +#else /* 32-bit */ + +DEFINE_PER_CPU(u8, perf_event_pending); + +#define set_perf_event_pending_flag() __get_cpu_var(perf_event_pending) = 1 #define test_perf_event_pending() __get_cpu_var(perf_event_pending) #define clear_perf_event_pending() __get_cpu_var(perf_event_pending) = 0 -#else /* CONFIG_PERF_EVENTS && CONFIG_PPC32 */ +#endif /* 32 vs 64 bit */ + +void set_perf_event_pending(void) +{ + preempt_disable(); + set_perf_event_pending_flag(); + set_dec(1); + preempt_enable(); +} + +#else /* CONFIG_PERF_EVENTS */ #define test_perf_event_pending() 0 #define clear_perf_event_pending() -#endif /* CONFIG_PERF_EVENTS && CONFIG_PPC32 */ +#endif /* CONFIG_PERF_EVENTS */ /* * For iSeries shared processors, we have to let the hypervisor @@ -582,10 +617,6 @@ void timer_interrupt(struct pt_regs * regs) set_dec(DECREMENTER_MAX); #ifdef CONFIG_PPC32 - if (test_perf_event_pending()) { - clear_perf_event_pending(); - perf_event_do_pending(); - } if (atomic_read(&ppc_n_lost_interrupts) != 0) do_IRQ(regs); #endif @@ -604,6 +635,11 @@ void timer_interrupt(struct pt_regs * regs) calculate_steal_time(); + if (test_perf_event_pending()) { + clear_perf_event_pending(); + perf_event_do_pending(); + } + #ifdef CONFIG_PPC_ISERIES if (firmware_has_feature(FW_FEATURE_ISERIES)) get_lppaca()->int_dword.fields.decr_int = 0; diff --git a/arch/powerpc/kvm/44x_tlb.c b/arch/powerpc/kvm/44x_tlb.c index 2570fcc..8123125 100644 --- a/arch/powerpc/kvm/44x_tlb.c +++ b/arch/powerpc/kvm/44x_tlb.c @@ -440,7 +440,7 @@ int kvmppc_44x_emul_tlbwe(struct kvm_vcpu *vcpu, u8 ra, u8 rs, u8 ws) unsigned int gtlb_index; gtlb_index = kvmppc_get_gpr(vcpu, ra); - if (gtlb_index > KVM44x_GUEST_TLB_SIZE) { + if (gtlb_index >= KVM44x_GUEST_TLB_SIZE) { printk("%s: index %d\n", __func__, gtlb_index); kvmppc_dump_vcpu(vcpu); return EMULATE_FAIL; diff --git a/arch/s390/kernel/head31.S b/arch/s390/kernel/head31.S index 1bbcc49..b8f8dc1 100644 --- a/arch/s390/kernel/head31.S +++ b/arch/s390/kernel/head31.S @@ -82,7 +82,7 @@ startup_continue: _ehead: #ifdef CONFIG_SHARED_KERNEL - .org 0x100000 + .org 0x100000 - 0x11000 # head.o ends at 0x11000 #endif # diff --git a/arch/s390/kernel/head64.S b/arch/s390/kernel/head64.S index 1f70970..cdef687 100644 --- a/arch/s390/kernel/head64.S +++ b/arch/s390/kernel/head64.S @@ -80,7 +80,7 @@ startup_continue: _ehead: #ifdef CONFIG_SHARED_KERNEL - .org 0x100000 + .org 0x100000 - 0x11000 # head.o ends at 0x11000 #endif # diff --git a/arch/s390/kernel/ptrace.c b/arch/s390/kernel/ptrace.c index 33fdc5a..9f654da 100644 --- a/arch/s390/kernel/ptrace.c +++ b/arch/s390/kernel/ptrace.c @@ -640,7 +640,7 @@ long compat_arch_ptrace(struct task_struct *child, compat_long_t request, asmlinkage long do_syscall_trace_enter(struct pt_regs *regs) { - long ret; + long ret = 0; /* Do the secure computing check first. */ secure_computing(regs->gprs[2]); @@ -649,7 +649,6 @@ asmlinkage long do_syscall_trace_enter(struct pt_regs *regs) * The sysc_tracesys code in entry.S stored the system * call number to gprs[2]. */ - ret = regs->gprs[2]; if (test_thread_flag(TIF_SYSCALL_TRACE) && (tracehook_report_syscall_entry(regs) || regs->gprs[2] >= NR_syscalls)) { @@ -671,7 +670,7 @@ asmlinkage long do_syscall_trace_enter(struct pt_regs *regs) regs->gprs[2], regs->orig_gpr2, regs->gprs[3], regs->gprs[4], regs->gprs[5]); - return ret; + return ret ?: regs->gprs[2]; } asmlinkage void do_syscall_trace_exit(struct pt_regs *regs) diff --git a/arch/s390/kernel/time.c b/arch/s390/kernel/time.c index d906bf1..a2163c9 100644 --- a/arch/s390/kernel/time.c +++ b/arch/s390/kernel/time.c @@ -391,7 +391,6 @@ static void __init time_init_wq(void) if (time_sync_wq) return; time_sync_wq = create_singlethread_workqueue("timesync"); - stop_machine_create(); } /* diff --git a/arch/sh/configs/rts7751r2d1_defconfig b/arch/sh/configs/rts7751r2d1_defconfig index fba1f62..dba024d72 100644 --- a/arch/sh/configs/rts7751r2d1_defconfig +++ b/arch/sh/configs/rts7751r2d1_defconfig @@ -877,7 +877,7 @@ CONFIG_SERIAL_8250_RUNTIME_UARTS=4 # # CONFIG_SERIAL_MAX3100 is not set CONFIG_SERIAL_SH_SCI=y -CONFIG_SERIAL_SH_SCI_NR_UARTS=1 +CONFIG_SERIAL_SH_SCI_NR_UARTS=2 CONFIG_SERIAL_SH_SCI_CONSOLE=y CONFIG_SERIAL_CORE=y CONFIG_SERIAL_CORE_CONSOLE=y diff --git a/arch/sh/configs/rts7751r2dplus_defconfig b/arch/sh/configs/rts7751r2dplus_defconfig index a8d538f..6d511d0 100644 --- a/arch/sh/configs/rts7751r2dplus_defconfig +++ b/arch/sh/configs/rts7751r2dplus_defconfig @@ -963,7 +963,7 @@ CONFIG_SERIAL_8250_RUNTIME_UARTS=4 # # CONFIG_SERIAL_MAX3100 is not set CONFIG_SERIAL_SH_SCI=y -CONFIG_SERIAL_SH_SCI_NR_UARTS=1 +CONFIG_SERIAL_SH_SCI_NR_UARTS=2 CONFIG_SERIAL_SH_SCI_CONSOLE=y CONFIG_SERIAL_CORE=y CONFIG_SERIAL_CORE_CONSOLE=y diff --git a/arch/sh/drivers/pci/pci-sh7751.c b/arch/sh/drivers/pci/pci-sh7751.c index 17811e5..f98141b 100644 --- a/arch/sh/drivers/pci/pci-sh7751.c +++ b/arch/sh/drivers/pci/pci-sh7751.c @@ -17,6 +17,7 @@ #include <linux/io.h> #include "pci-sh4.h" #include <asm/addrspace.h> +#include <asm/sizes.h> static int __init __area_sdram_check(struct pci_channel *chan, unsigned int area) @@ -47,8 +48,8 @@ static int __init __area_sdram_check(struct pci_channel *chan, static struct resource sh7751_pci_resources[] = { { .name = "SH7751_IO", - .start = SH7751_PCI_IO_BASE, - .end = SH7751_PCI_IO_BASE + SH7751_PCI_IO_SIZE - 1, + .start = 0x1000, + .end = SZ_4M - 1, .flags = IORESOURCE_IO }, { .name = "SH7751_mem", diff --git a/arch/sh/include/asm/atomic.h b/arch/sh/include/asm/atomic.h index 275a448..c798312 100644 --- a/arch/sh/include/asm/atomic.h +++ b/arch/sh/include/asm/atomic.h @@ -13,7 +13,7 @@ #define ATOMIC_INIT(i) ( (atomic_t) { (i) } ) -#define atomic_read(v) ((v)->counter) +#define atomic_read(v) (*(volatile int *)&(v)->counter) #define atomic_set(v,i) ((v)->counter = (i)) #if defined(CONFIG_GUSA_RB) diff --git a/arch/sh/include/cpu-sh4/cpu/dma-register.h b/arch/sh/include/cpu-sh4/cpu/dma-register.h index 55f9fec..de23595 100644 --- a/arch/sh/include/cpu-sh4/cpu/dma-register.h +++ b/arch/sh/include/cpu-sh4/cpu/dma-register.h @@ -76,7 +76,7 @@ enum { } #define TS_INDEX2VAL(i) ((((i) & 3) << CHCR_TS_LOW_SHIFT) | \ - ((((i) >> 2) & 3) << CHCR_TS_HIGH_SHIFT)) + (((i) & 0xc) << CHCR_TS_HIGH_SHIFT)) #else /* CONFIG_CPU_SH4A */ diff --git a/arch/sparc/include/asm/atomic_32.h b/arch/sparc/include/asm/atomic_32.h index f0d343c..7ae128b 100644 --- a/arch/sparc/include/asm/atomic_32.h +++ b/arch/sparc/include/asm/atomic_32.h @@ -25,7 +25,7 @@ extern int atomic_cmpxchg(atomic_t *, int, int); extern int atomic_add_unless(atomic_t *, int, int); extern void atomic_set(atomic_t *, int); -#define atomic_read(v) ((v)->counter) +#define atomic_read(v) (*(volatile int *)&(v)->counter) #define atomic_add(i, v) ((void)__atomic_add_return( (int)(i), (v))) #define atomic_sub(i, v) ((void)__atomic_add_return(-(int)(i), (v))) diff --git a/arch/sparc/include/asm/atomic_64.h b/arch/sparc/include/asm/atomic_64.h index f2e4800..2050ca0 100644 --- a/arch/sparc/include/asm/atomic_64.h +++ b/arch/sparc/include/asm/atomic_64.h @@ -13,8 +13,8 @@ #define ATOMIC_INIT(i) { (i) } #define ATOMIC64_INIT(i) { (i) } -#define atomic_read(v) ((v)->counter) -#define atomic64_read(v) ((v)->counter) +#define atomic_read(v) (*(volatile int *)&(v)->counter) +#define atomic64_read(v) (*(volatile long *)&(v)->counter) #define atomic_set(v, i) (((v)->counter) = i) #define atomic64_set(v, i) (((v)->counter) = i) diff --git a/arch/sparc/include/asm/bitops_64.h b/arch/sparc/include/asm/bitops_64.h index e72ac9c..766121a 100644 --- a/arch/sparc/include/asm/bitops_64.h +++ b/arch/sparc/include/asm/bitops_64.h @@ -44,7 +44,7 @@ extern void change_bit(unsigned long nr, volatile unsigned long *addr); #ifdef ULTRA_HAS_POPULATION_COUNT -static inline unsigned int hweight64(unsigned long w) +static inline unsigned int __arch_hweight64(unsigned long w) { unsigned int res; @@ -52,7 +52,7 @@ static inline unsigned int hweight64(unsigned long w) return res; } -static inline unsigned int hweight32(unsigned int w) +static inline unsigned int __arch_hweight32(unsigned int w) { unsigned int res; @@ -60,7 +60,7 @@ static inline unsigned int hweight32(unsigned int w) return res; } -static inline unsigned int hweight16(unsigned int w) +static inline unsigned int __arch_hweight16(unsigned int w) { unsigned int res; @@ -68,7 +68,7 @@ static inline unsigned int hweight16(unsigned int w) return res; } -static inline unsigned int hweight8(unsigned int w) +static inline unsigned int __arch_hweight8(unsigned int w) { unsigned int res; @@ -78,9 +78,10 @@ static inline unsigned int hweight8(unsigned int w) #else -#include <asm-generic/bitops/hweight.h> +#include <asm-generic/bitops/arch_hweight.h> #endif +#include <asm-generic/bitops/const_hweight.h> #include <asm-generic/bitops/lock.h> #endif /* __KERNEL__ */ diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 01177dc..a2d3a5f 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -201,20 +201,17 @@ config HAVE_INTEL_TXT # Use the generic interrupt handling code in kernel/irq/: config GENERIC_HARDIRQS - bool - default y + def_bool y config GENERIC_HARDIRQS_NO__DO_IRQ def_bool y config GENERIC_IRQ_PROBE - bool - default y + def_bool y config GENERIC_PENDING_IRQ - bool + def_bool y depends on GENERIC_HARDIRQS && SMP - default y config USE_GENERIC_SMP_HELPERS def_bool y @@ -229,19 +226,22 @@ config X86_64_SMP depends on X86_64 && SMP config X86_HT - bool + def_bool y depends on SMP - default y config X86_TRAMPOLINE - bool + def_bool y depends on SMP || (64BIT && ACPI_SLEEP) - default y config X86_32_LAZY_GS def_bool y depends on X86_32 && !CC_STACKPROTECTOR +config ARCH_HWEIGHT_CFLAGS + string + default "-fcall-saved-ecx -fcall-saved-edx" if X86_32 + default "-fcall-saved-rdi -fcall-saved-rsi -fcall-saved-rdx -fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9 -fcall-saved-r10 -fcall-saved-r11" if X86_64 + config KTIME_SCALAR def_bool X86_32 source "init/Kconfig" @@ -451,7 +451,7 @@ config X86_NUMAQ firmware with - send email to <Martin.Bligh@us.ibm.com>. config X86_SUPPORTS_MEMORY_FAILURE - bool + def_bool y # MCE code calls memory_failure(): depends on X86_MCE # On 32-bit this adds too big of NODES_SHIFT and we run out of page flags: @@ -459,7 +459,6 @@ config X86_SUPPORTS_MEMORY_FAILURE # On 32-bit SPARSEMEM adds too big of SECTIONS_WIDTH: depends on X86_64 || !SPARSEMEM select ARCH_SUPPORTS_MEMORY_FAILURE - default y config X86_VISWS bool "SGI 320/540 (Visual Workstation)" @@ -574,7 +573,6 @@ config PARAVIRT_SPINLOCKS config PARAVIRT_CLOCK bool - default n endif @@ -753,7 +751,6 @@ config MAXSMP bool "Configure Maximum number of SMP Processors and NUMA Nodes" depends on X86_64 && SMP && DEBUG_KERNEL && EXPERIMENTAL select CPUMASK_OFFSTACK - default n ---help--- Configure maximum number of CPUS and NUMA Nodes for this architecture. If unsure, say N. @@ -833,7 +830,6 @@ config X86_VISWS_APIC config X86_REROUTE_FOR_BROKEN_BOOT_IRQS bool "Reroute for broken boot IRQs" - default n depends on X86_IO_APIC ---help--- This option enables a workaround that fixes a source of @@ -880,9 +876,8 @@ config X86_MCE_AMD the DRAM Error Threshold. config X86_ANCIENT_MCE - def_bool n + bool "Support for old Pentium 5 / WinChip machine checks" depends on X86_32 && X86_MCE - prompt "Support for old Pentium 5 / WinChip machine checks" ---help--- Include support for machine check handling on old Pentium 5 or WinChip systems. These typically need to be enabled explicitely on the command @@ -890,8 +885,7 @@ config X86_ANCIENT_MCE config X86_MCE_THRESHOLD depends on X86_MCE_AMD || X86_MCE_INTEL - bool - default y + def_bool y config X86_MCE_INJECT depends on X86_MCE @@ -1030,8 +1024,8 @@ config X86_CPUID choice prompt "High Memory Support" - default HIGHMEM4G if !X86_NUMAQ default HIGHMEM64G if X86_NUMAQ + default HIGHMEM4G depends on X86_32 config NOHIGHMEM @@ -1289,7 +1283,7 @@ source "mm/Kconfig" config HIGHPTE bool "Allocate 3rd-level pagetables from highmem" - depends on X86_32 && (HIGHMEM4G || HIGHMEM64G) + depends on HIGHMEM ---help--- The VM uses one page table entry for each page of physical memory. For systems with a lot of RAM, this can be wasteful of precious @@ -1373,8 +1367,7 @@ config MATH_EMULATION kernel, it won't hurt. config MTRR - bool - default y + def_bool y prompt "MTRR (Memory Type Range Register) support" if EMBEDDED ---help--- On Intel P6 family processors (Pentium Pro, Pentium II and later) @@ -1440,8 +1433,7 @@ config MTRR_SANITIZER_SPARE_REG_NR_DEFAULT mtrr_spare_reg_nr=N on the kernel command line. config X86_PAT - bool - default y + def_bool y prompt "x86 PAT support" if EMBEDDED depends on MTRR ---help--- @@ -1609,8 +1601,7 @@ config X86_NEED_RELOCS depends on X86_32 && RELOCATABLE config PHYSICAL_ALIGN - hex - prompt "Alignment value to which kernel should be aligned" if X86_32 + hex "Alignment value to which kernel should be aligned" if X86_32 default "0x1000000" range 0x2000 0x1000000 ---help--- @@ -1657,7 +1648,6 @@ config COMPAT_VDSO config CMDLINE_BOOL bool "Built-in kernel command line" - default n ---help--- Allow for specifying boot arguments to the kernel at build time. On some systems (e.g. embedded ones), it is @@ -1691,7 +1681,6 @@ config CMDLINE config CMDLINE_OVERRIDE bool "Built-in command line overrides boot loader arguments" - default n depends on CMDLINE_BOOL ---help--- Set this option to 'Y' to have the kernel ignore the boot loader @@ -1727,8 +1716,7 @@ source "drivers/acpi/Kconfig" source "drivers/sfi/Kconfig" config X86_APM_BOOT - bool - default y + def_bool y depends on APM || APM_MODULE menuconfig APM @@ -1957,8 +1945,7 @@ config DMAR_DEFAULT_ON experimental. config DMAR_BROKEN_GFX_WA - def_bool n - prompt "Workaround broken graphics drivers (going away soon)" + bool "Workaround broken graphics drivers (going away soon)" depends on DMAR && BROKEN ---help--- Current Graphics drivers tend to use physical address @@ -2056,7 +2043,6 @@ config SCx200HR_TIMER config OLPC bool "One Laptop Per Child support" select GPIOLIB - default n ---help--- Add support for detecting the unique features of the OLPC XO hardware. diff --git a/arch/x86/Kconfig.cpu b/arch/x86/Kconfig.cpu index 918fbb1..2ac9069 100644 --- a/arch/x86/Kconfig.cpu +++ b/arch/x86/Kconfig.cpu @@ -338,6 +338,10 @@ config X86_F00F_BUG def_bool y depends on M586MMX || M586TSC || M586 || M486 || M386 +config X86_INVD_BUG + def_bool y + depends on M486 || M386 + config X86_WP_WORKS_OK def_bool y depends on !M386 diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug index bd58c8a..7508508 100644 --- a/arch/x86/Kconfig.debug +++ b/arch/x86/Kconfig.debug @@ -45,7 +45,6 @@ config EARLY_PRINTK config EARLY_PRINTK_DBGP bool "Early printk via EHCI debug port" - default n depends on EARLY_PRINTK && PCI ---help--- Write kernel log output directly into the EHCI debug port. @@ -76,7 +75,6 @@ config DEBUG_PER_CPU_MAPS bool "Debug access to per_cpu maps" depends on DEBUG_KERNEL depends on SMP - default n ---help--- Say Y to verify that the per_cpu map being accessed has been setup. Adds a fair amount of code to kernel memory diff --git a/arch/x86/Makefile b/arch/x86/Makefile index 0a43dc5..8aa1b59 100644 --- a/arch/x86/Makefile +++ b/arch/x86/Makefile @@ -95,8 +95,9 @@ sp-$(CONFIG_X86_64) := rsp cfi := $(call as-instr,.cfi_startproc\n.cfi_rel_offset $(sp-y)$(comma)0\n.cfi_endproc,-DCONFIG_AS_CFI=1) # is .cfi_signal_frame supported too? cfi-sigframe := $(call as-instr,.cfi_startproc\n.cfi_signal_frame\n.cfi_endproc,-DCONFIG_AS_CFI_SIGNAL_FRAME=1) -KBUILD_AFLAGS += $(cfi) $(cfi-sigframe) -KBUILD_CFLAGS += $(cfi) $(cfi-sigframe) +cfi-sections := $(call as-instr,.cfi_sections .debug_frame,-DCONFIG_AS_CFI_SECTIONS=1) +KBUILD_AFLAGS += $(cfi) $(cfi-sigframe) $(cfi-sections) +KBUILD_CFLAGS += $(cfi) $(cfi-sigframe) $(cfi-sections) LDFLAGS := -m elf_$(UTS_MACHINE) diff --git a/arch/x86/include/asm/alternative-asm.h b/arch/x86/include/asm/alternative-asm.h index b97f786..a63a68b 100644 --- a/arch/x86/include/asm/alternative-asm.h +++ b/arch/x86/include/asm/alternative-asm.h @@ -6,8 +6,8 @@ .macro LOCK_PREFIX 1: lock .section .smp_locks,"a" - _ASM_ALIGN - _ASM_PTR 1b + .balign 4 + .long 1b - . .previous .endm #else diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h index b09ec55..03b6bb53 100644 --- a/arch/x86/include/asm/alternative.h +++ b/arch/x86/include/asm/alternative.h @@ -28,20 +28,20 @@ */ #ifdef CONFIG_SMP -#define LOCK_PREFIX \ +#define LOCK_PREFIX_HERE \ ".section .smp_locks,\"a\"\n" \ - _ASM_ALIGN "\n" \ - _ASM_PTR "661f\n" /* address */ \ + ".balign 4\n" \ + ".long 671f - .\n" /* offset */ \ ".previous\n" \ - "661:\n\tlock; " + "671:" + +#define LOCK_PREFIX LOCK_PREFIX_HERE "\n\tlock; " #else /* ! CONFIG_SMP */ +#define LOCK_PREFIX_HERE "" #define LOCK_PREFIX "" #endif -/* This must be included *after* the definition of LOCK_PREFIX */ -#include <asm/cpufeature.h> - struct alt_instr { u8 *instr; /* original instruction */ u8 *replacement; @@ -96,6 +96,12 @@ static inline int alternatives_text_reserved(void *start, void *end) ".previous" /* + * This must be included *after* the definition of ALTERNATIVE due to + * <asm/arch_hweight.h> + */ +#include <asm/cpufeature.h> + +/* * Alternative instructions for different CPU types or capabilities. * * This allows to use optimized instructions even on generic binary diff --git a/arch/x86/include/asm/amd_iommu_types.h b/arch/x86/include/asm/amd_iommu_types.h index 86a0ff0..7014e88 100644 --- a/arch/x86/include/asm/amd_iommu_types.h +++ b/arch/x86/include/asm/amd_iommu_types.h @@ -174,6 +174,40 @@ (~((1ULL << (12 + ((lvl) * 9))) - 1))) #define PM_ALIGNED(lvl, addr) ((PM_MAP_MASK(lvl) & (addr)) == (addr)) +/* + * Returns the page table level to use for a given page size + * Pagesize is expected to be a power-of-two + */ +#define PAGE_SIZE_LEVEL(pagesize) \ + ((__ffs(pagesize) - 12) / 9) +/* + * Returns the number of ptes to use for a given page size + * Pagesize is expected to be a power-of-two + */ +#define PAGE_SIZE_PTE_COUNT(pagesize) \ + (1ULL << ((__ffs(pagesize) - 12) % 9)) + +/* + * Aligns a given io-virtual address to a given page size + * Pagesize is expected to be a power-of-two + */ +#define PAGE_SIZE_ALIGN(address, pagesize) \ + ((address) & ~((pagesize) - 1)) +/* + * Creates an IOMMU PTE for an address an a given pagesize + * The PTE has no permission bits set + * Pagesize is expected to be a power-of-two larger than 4096 + */ +#define PAGE_SIZE_PTE(address, pagesize) \ + (((address) | ((pagesize) - 1)) & \ + (~(pagesize >> 1)) & PM_ADDR_MASK) + +/* + * Takes a PTE value with mode=0x07 and returns the page size it maps + */ +#define PTE_PAGE_SIZE(pte) \ + (1ULL << (1 + ffz(((pte) | 0xfffULL)))) + #define IOMMU_PTE_P (1ULL << 0) #define IOMMU_PTE_TV (1ULL << 1) #define IOMMU_PTE_U (1ULL << 59) diff --git a/arch/x86/include/asm/arch_hweight.h b/arch/x86/include/asm/arch_hweight.h new file mode 100644 index 0000000..9686c3d --- /dev/null +++ b/arch/x86/include/asm/arch_hweight.h @@ -0,0 +1,61 @@ +#ifndef _ASM_X86_HWEIGHT_H +#define _ASM_X86_HWEIGHT_H + +#ifdef CONFIG_64BIT +/* popcnt %edi, %eax -- redundant REX prefix for alignment */ +#define POPCNT32 ".byte 0xf3,0x40,0x0f,0xb8,0xc7" +/* popcnt %rdi, %rax */ +#define POPCNT64 ".byte 0xf3,0x48,0x0f,0xb8,0xc7" +#define REG_IN "D" +#define REG_OUT "a" +#else +/* popcnt %eax, %eax */ +#define POPCNT32 ".byte 0xf3,0x0f,0xb8,0xc0" +#define REG_IN "a" +#define REG_OUT "a" +#endif + +/* + * __sw_hweightXX are called from within the alternatives below + * and callee-clobbered registers need to be taken care of. See + * ARCH_HWEIGHT_CFLAGS in <arch/x86/Kconfig> for the respective + * compiler switches. + */ +static inline unsigned int __arch_hweight32(unsigned int w) +{ + unsigned int res = 0; + + asm (ALTERNATIVE("call __sw_hweight32", POPCNT32, X86_FEATURE_POPCNT) + : "="REG_OUT (res) + : REG_IN (w)); + + return res; +} + +static inline unsigned int __arch_hweight16(unsigned int w) +{ + return __arch_hweight32(w & 0xffff); +} + +static inline unsigned int __arch_hweight8(unsigned int w) +{ + return __arch_hweight32(w & 0xff); +} + +static inline unsigned long __arch_hweight64(__u64 w) +{ + unsigned long res = 0; + +#ifdef CONFIG_X86_32 + return __arch_hweight32((u32)w) + + __arch_hweight32((u32)(w >> 32)); +#else + asm (ALTERNATIVE("call __sw_hweight64", POPCNT64, X86_FEATURE_POPCNT) + : "="REG_OUT (res) + : REG_IN (w)); +#endif /* CONFIG_X86_32 */ + + return res; +} + +#endif diff --git a/arch/x86/include/asm/atomic.h b/arch/x86/include/asm/atomic.h index 8f8217b..952a826 100644 --- a/arch/x86/include/asm/atomic.h +++ b/arch/x86/include/asm/atomic.h @@ -22,7 +22,7 @@ */ static inline int atomic_read(const atomic_t *v) { - return v->counter; + return (*(volatile int *)&(v)->counter); } /** @@ -246,6 +246,29 @@ static inline int atomic_add_unless(atomic_t *v, int a, int u) #define atomic_inc_not_zero(v) atomic_add_unless((v), 1, 0) +/* + * atomic_dec_if_positive - decrement by 1 if old value positive + * @v: pointer of type atomic_t + * + * The function returns the old value of *v minus 1, even if + * the atomic variable, v, was not decremented. + */ +static inline int atomic_dec_if_positive(atomic_t *v) +{ + int c, old, dec; + c = atomic_read(v); + for (;;) { + dec = c - 1; + if (unlikely(dec < 0)) + break; + old = atomic_cmpxchg((v), c, dec); + if (likely(old == c)) + break; + c = old; + } + return dec; +} + /** * atomic_inc_short - increment of a short integer * @v: pointer to type int diff --git a/arch/x86/include/asm/atomic64_32.h b/arch/x86/include/asm/atomic64_32.h index 03027bf..2a934aa 100644 --- a/arch/x86/include/asm/atomic64_32.h +++ b/arch/x86/include/asm/atomic64_32.h @@ -14,109 +14,193 @@ typedef struct { #define ATOMIC64_INIT(val) { (val) } -extern u64 atomic64_cmpxchg(atomic64_t *ptr, u64 old_val, u64 new_val); +#ifdef CONFIG_X86_CMPXCHG64 +#define ATOMIC64_ALTERNATIVE_(f, g) "call atomic64_" #g "_cx8" +#else +#define ATOMIC64_ALTERNATIVE_(f, g) ALTERNATIVE("call atomic64_" #f "_386", "call atomic64_" #g "_cx8", X86_FEATURE_CX8) +#endif + +#define ATOMIC64_ALTERNATIVE(f) ATOMIC64_ALTERNATIVE_(f, f) + +/** + * atomic64_cmpxchg - cmpxchg atomic64 variable + * @p: pointer to type atomic64_t + * @o: expected value + * @n: new value + * + * Atomically sets @v to @n if it was equal to @o and returns + * the old value. + */ + +static inline long long atomic64_cmpxchg(atomic64_t *v, long long o, long long n) +{ + return cmpxchg64(&v->counter, o, n); +} /** * atomic64_xchg - xchg atomic64 variable - * @ptr: pointer to type atomic64_t - * @new_val: value to assign + * @v: pointer to type atomic64_t + * @n: value to assign * - * Atomically xchgs the value of @ptr to @new_val and returns + * Atomically xchgs the value of @v to @n and returns * the old value. */ -extern u64 atomic64_xchg(atomic64_t *ptr, u64 new_val); +static inline long long atomic64_xchg(atomic64_t *v, long long n) +{ + long long o; + unsigned high = (unsigned)(n >> 32); + unsigned low = (unsigned)n; + asm volatile(ATOMIC64_ALTERNATIVE(xchg) + : "=A" (o), "+b" (low), "+c" (high) + : "S" (v) + : "memory" + ); + return o; +} /** * atomic64_set - set atomic64 variable - * @ptr: pointer to type atomic64_t - * @new_val: value to assign + * @v: pointer to type atomic64_t + * @n: value to assign * - * Atomically sets the value of @ptr to @new_val. + * Atomically sets the value of @v to @n. */ -extern void atomic64_set(atomic64_t *ptr, u64 new_val); +static inline void atomic64_set(atomic64_t *v, long long i) +{ + unsigned high = (unsigned)(i >> 32); + unsigned low = (unsigned)i; + asm volatile(ATOMIC64_ALTERNATIVE(set) + : "+b" (low), "+c" (high) + : "S" (v) + : "eax", "edx", "memory" + ); +} /** * atomic64_read - read atomic64 variable - * @ptr: pointer to type atomic64_t + * @v: pointer to type atomic64_t * - * Atomically reads the value of @ptr and returns it. + * Atomically reads the value of @v and returns it. */ -static inline u64 atomic64_read(atomic64_t *ptr) +static inline long long atomic64_read(atomic64_t *v) { - u64 res; - - /* - * Note, we inline this atomic64_t primitive because - * it only clobbers EAX/EDX and leaves the others - * untouched. We also (somewhat subtly) rely on the - * fact that cmpxchg8b returns the current 64-bit value - * of the memory location we are touching: - */ - asm volatile( - "mov %%ebx, %%eax\n\t" - "mov %%ecx, %%edx\n\t" - LOCK_PREFIX "cmpxchg8b %1\n" - : "=&A" (res) - : "m" (*ptr) - ); - - return res; -} - -extern u64 atomic64_read(atomic64_t *ptr); + long long r; + asm volatile(ATOMIC64_ALTERNATIVE(read) + : "=A" (r), "+c" (v) + : : "memory" + ); + return r; + } /** * atomic64_add_return - add and return - * @delta: integer value to add - * @ptr: pointer to type atomic64_t + * @i: integer value to add + * @v: pointer to type atomic64_t * - * Atomically adds @delta to @ptr and returns @delta + *@ptr + * Atomically adds @i to @v and returns @i + *@v */ -extern u64 atomic64_add_return(u64 delta, atomic64_t *ptr); +static inline long long atomic64_add_return(long long i, atomic64_t *v) +{ + asm volatile(ATOMIC64_ALTERNATIVE(add_return) + : "+A" (i), "+c" (v) + : : "memory" + ); + return i; +} /* * Other variants with different arithmetic operators: */ -extern u64 atomic64_sub_return(u64 delta, atomic64_t *ptr); -extern u64 atomic64_inc_return(atomic64_t *ptr); -extern u64 atomic64_dec_return(atomic64_t *ptr); +static inline long long atomic64_sub_return(long long i, atomic64_t *v) +{ + asm volatile(ATOMIC64_ALTERNATIVE(sub_return) + : "+A" (i), "+c" (v) + : : "memory" + ); + return i; +} + +static inline long long atomic64_inc_return(atomic64_t *v) +{ + long long a; + asm volatile(ATOMIC64_ALTERNATIVE(inc_return) + : "=A" (a) + : "S" (v) + : "memory", "ecx" + ); + return a; +} + +static inline long long atomic64_dec_return(atomic64_t *v) +{ + long long a; + asm volatile(ATOMIC64_ALTERNATIVE(dec_return) + : "=A" (a) + : "S" (v) + : "memory", "ecx" + ); + return a; +} /** * atomic64_add - add integer to atomic64 variable - * @delta: integer value to add - * @ptr: pointer to type atomic64_t + * @i: integer value to add + * @v: pointer to type atomic64_t * - * Atomically adds @delta to @ptr. + * Atomically adds @i to @v. */ -extern void atomic64_add(u64 delta, atomic64_t *ptr); +static inline long long atomic64_add(long long i, atomic64_t *v) +{ + asm volatile(ATOMIC64_ALTERNATIVE_(add, add_return) + : "+A" (i), "+c" (v) + : : "memory" + ); + return i; +} /** * atomic64_sub - subtract the atomic64 variable - * @delta: integer value to subtract - * @ptr: pointer to type atomic64_t + * @i: integer value to subtract + * @v: pointer to type atomic64_t * - * Atomically subtracts @delta from @ptr. + * Atomically subtracts @i from @v. */ -extern void atomic64_sub(u64 delta, atomic64_t *ptr); +static inline long long atomic64_sub(long long i, atomic64_t *v) +{ + asm volatile(ATOMIC64_ALTERNATIVE_(sub, sub_return) + : "+A" (i), "+c" (v) + : : "memory" + ); + return i; +} /** * atomic64_sub_and_test - subtract value from variable and test result - * @delta: integer value to subtract - * @ptr: pointer to type atomic64_t - * - * Atomically subtracts @delta from @ptr and returns + * @i: integer value to subtract + * @v: pointer to type atomic64_t + * + * Atomically subtracts @i from @v and returns * true if the result is zero, or false for all * other cases. */ -extern int atomic64_sub_and_test(u64 delta, atomic64_t *ptr); +static inline int atomic64_sub_and_test(long long i, atomic64_t *v) +{ + return atomic64_sub_return(i, v) == 0; +} /** * atomic64_inc - increment atomic64 variable - * @ptr: pointer to type atomic64_t + * @v: pointer to type atomic64_t * - * Atomically increments @ptr by 1. + * Atomically increments @v by 1. */ -extern void atomic64_inc(atomic64_t *ptr); +static inline void atomic64_inc(atomic64_t *v) +{ + asm volatile(ATOMIC64_ALTERNATIVE_(inc, inc_return) + : : "S" (v) + : "memory", "eax", "ecx", "edx" + ); +} /** * atomic64_dec - decrement atomic64 variable @@ -124,37 +208,97 @@ extern void atomic64_inc(atomic64_t *ptr); * * Atomically decrements @ptr by 1. */ -extern void atomic64_dec(atomic64_t *ptr); +static inline void atomic64_dec(atomic64_t *v) +{ + asm volatile(ATOMIC64_ALTERNATIVE_(dec, dec_return) + : : "S" (v) + : "memory", "eax", "ecx", "edx" + ); +} /** * atomic64_dec_and_test - decrement and test - * @ptr: pointer to type atomic64_t + * @v: pointer to type atomic64_t * - * Atomically decrements @ptr by 1 and + * Atomically decrements @v by 1 and * returns true if the result is 0, or false for all other * cases. */ -extern int atomic64_dec_and_test(atomic64_t *ptr); +static inline int atomic64_dec_and_test(atomic64_t *v) +{ + return atomic64_dec_return(v) == 0; +} /** * atomic64_inc_and_test - increment and test - * @ptr: pointer to type atomic64_t + * @v: pointer to type atomic64_t * - * Atomically increments @ptr by 1 + * Atomically increments @v by 1 * and returns true if the result is zero, or false for all * other cases. */ -extern int atomic64_inc_and_test(atomic64_t *ptr); +static inline int atomic64_inc_and_test(atomic64_t *v) +{ + return atomic64_inc_return(v) == 0; +} /** * atomic64_add_negative - add and test if negative - * @delta: integer value to add - * @ptr: pointer to type atomic64_t + * @i: integer value to add + * @v: pointer to type atomic64_t * - * Atomically adds @delta to @ptr and returns true + * Atomically adds @i to @v and returns true * if the result is negative, or false when * result is greater than or equal to zero. */ -extern int atomic64_add_negative(u64 delta, atomic64_t *ptr); +static inline int atomic64_add_negative(long long i, atomic64_t *v) +{ + return atomic64_add_return(i, v) < 0; +} + +/** + * atomic64_add_unless - add unless the number is a given value + * @v: pointer of type atomic64_t + * @a: the amount to add to v... + * @u: ...unless v is equal to u. + * + * Atomically adds @a to @v, so long as it was not @u. + * Returns non-zero if @v was not @u, and zero otherwise. + */ +static inline int atomic64_add_unless(atomic64_t *v, long long a, long long u) +{ + unsigned low = (unsigned)u; + unsigned high = (unsigned)(u >> 32); + asm volatile(ATOMIC64_ALTERNATIVE(add_unless) "\n\t" + : "+A" (a), "+c" (v), "+S" (low), "+D" (high) + : : "memory"); + return (int)a; +} + + +static inline int atomic64_inc_not_zero(atomic64_t *v) +{ + int r; + asm volatile(ATOMIC64_ALTERNATIVE(inc_not_zero) + : "=a" (r) + : "S" (v) + : "ecx", "edx", "memory" + ); + return r; +} + +static inline long long atomic64_dec_if_positive(atomic64_t *v) +{ + long long r; + asm volatile(ATOMIC64_ALTERNATIVE(dec_if_positive) + : "=A" (r) + : "S" (v) + : "ecx", "memory" + ); + return r; +} + +#undef ATOMIC64_ALTERNATIVE +#undef ATOMIC64_ALTERNATIVE_ #endif /* _ASM_X86_ATOMIC64_32_H */ diff --git a/arch/x86/include/asm/atomic64_64.h b/arch/x86/include/asm/atomic64_64.h index 51c5b40..49fd1ea 100644 --- a/arch/x86/include/asm/atomic64_64.h +++ b/arch/x86/include/asm/atomic64_64.h @@ -18,7 +18,7 @@ */ static inline long atomic64_read(const atomic64_t *v) { - return v->counter; + return (*(volatile long *)&(v)->counter); } /** @@ -221,4 +221,27 @@ static inline int atomic64_add_unless(atomic64_t *v, long a, long u) #define atomic64_inc_not_zero(v) atomic64_add_unless((v), 1, 0) +/* + * atomic64_dec_if_positive - decrement by 1 if old value positive + * @v: pointer of type atomic_t + * + * The function returns the old value of *v minus 1, even if + * the atomic variable, v, was not decremented. + */ +static inline long atomic64_dec_if_positive(atomic64_t *v) +{ + long c, old, dec; + c = atomic64_read(v); + for (;;) { + dec = c - 1; + if (unlikely(dec < 0)) + break; + old = atomic64_cmpxchg((v), c, dec); + if (likely(old == c)) + break; + c = old; + } + return dec; +} + #endif /* _ASM_X86_ATOMIC64_64_H */ diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h index 02b47a6..545776e 100644 --- a/arch/x86/include/asm/bitops.h +++ b/arch/x86/include/asm/bitops.h @@ -444,7 +444,9 @@ static inline int fls(int x) #define ARCH_HAS_FAST_MULTIPLIER 1 -#include <asm-generic/bitops/hweight.h> +#include <asm/arch_hweight.h> + +#include <asm-generic/bitops/const_hweight.h> #endif /* __KERNEL__ */ diff --git a/arch/x86/include/asm/boot.h b/arch/x86/include/asm/boot.h index 7a10659..3b62ab5 100644 --- a/arch/x86/include/asm/boot.h +++ b/arch/x86/include/asm/boot.h @@ -24,7 +24,7 @@ #define MIN_KERNEL_ALIGN (_AC(1, UL) << MIN_KERNEL_ALIGN_LG2) #if (CONFIG_PHYSICAL_ALIGN & (CONFIG_PHYSICAL_ALIGN-1)) || \ - (CONFIG_PHYSICAL_ALIGN < (_AC(1, UL) << MIN_KERNEL_ALIGN_LG2)) + (CONFIG_PHYSICAL_ALIGN < MIN_KERNEL_ALIGN) #error "Invalid value for CONFIG_PHYSICAL_ALIGN" #endif diff --git a/arch/x86/include/asm/cacheflush.h b/arch/x86/include/asm/cacheflush.h index 634c40a..c70068d 100644 --- a/arch/x86/include/asm/cacheflush.h +++ b/arch/x86/include/asm/cacheflush.h @@ -44,9 +44,6 @@ static inline void copy_from_user_page(struct vm_area_struct *vma, memcpy(dst, src, len); } -#define PG_WC PG_arch_1 -PAGEFLAG(WC, WC) - #ifdef CONFIG_X86_PAT /* * X86 PAT uses page flags WC and Uncached together to keep track of @@ -55,16 +52,24 @@ PAGEFLAG(WC, WC) * _PAGE_CACHE_UC_MINUS and fourth state where page's memory type has not * been changed from its default (value of -1 used to denote this). * Note we do not support _PAGE_CACHE_UC here. - * - * Caller must hold memtype_lock for atomicity. */ + +#define _PGMT_DEFAULT 0 +#define _PGMT_WC (1UL << PG_arch_1) +#define _PGMT_UC_MINUS (1UL << PG_uncached) +#define _PGMT_WB (1UL << PG_uncached | 1UL << PG_arch_1) +#define _PGMT_MASK (1UL << PG_uncached | 1UL << PG_arch_1) +#define _PGMT_CLEAR_MASK (~_PGMT_MASK) + static inline unsigned long get_page_memtype(struct page *pg) { - if (!PageUncached(pg) && !PageWC(pg)) + unsigned long pg_flags = pg->flags & _PGMT_MASK; + + if (pg_flags == _PGMT_DEFAULT) return -1; - else if (!PageUncached(pg) && PageWC(pg)) + else if (pg_flags == _PGMT_WC) return _PAGE_CACHE_WC; - else if (PageUncached(pg) && !PageWC(pg)) + else if (pg_flags == _PGMT_UC_MINUS) return _PAGE_CACHE_UC_MINUS; else return _PAGE_CACHE_WB; @@ -72,25 +77,26 @@ static inline unsigned long get_page_memtype(struct page *pg) static inline void set_page_memtype(struct page *pg, unsigned long memtype) { + unsigned long memtype_flags = _PGMT_DEFAULT; + unsigned long old_flags; + unsigned long new_flags; + switch (memtype) { case _PAGE_CACHE_WC: - ClearPageUncached(pg); - SetPageWC(pg); + memtype_flags = _PGMT_WC; break; case _PAGE_CACHE_UC_MINUS: - SetPageUncached(pg); - ClearPageWC(pg); + memtype_flags = _PGMT_UC_MINUS; break; case _PAGE_CACHE_WB: - SetPageUncached(pg); - SetPageWC(pg); - break; - default: - case -1: - ClearPageUncached(pg); - ClearPageWC(pg); + memtype_flags = _PGMT_WB; break; } + + do { + old_flags = pg->flags; + new_flags = (old_flags & _PGMT_CLEAR_MASK) | memtype_flags; + } while (cmpxchg(&pg->flags, old_flags, new_flags) != old_flags); } #else static inline unsigned long get_page_memtype(struct page *pg) { return -1; } diff --git a/arch/x86/include/asm/cmpxchg_32.h b/arch/x86/include/asm/cmpxchg_32.h index ffb9bb6..8859e12 100644 --- a/arch/x86/include/asm/cmpxchg_32.h +++ b/arch/x86/include/asm/cmpxchg_32.h @@ -271,7 +271,8 @@ extern unsigned long long cmpxchg_486_u64(volatile void *, u64, u64); __typeof__(*(ptr)) __ret; \ __typeof__(*(ptr)) __old = (o); \ __typeof__(*(ptr)) __new = (n); \ - alternative_io("call cmpxchg8b_emu", \ + alternative_io(LOCK_PREFIX_HERE \ + "call cmpxchg8b_emu", \ "lock; cmpxchg8b (%%esi)" , \ X86_FEATURE_CX8, \ "=A" (__ret), \ diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h index 0cd82d0..dca9c54 100644 --- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -161,6 +161,7 @@ */ #define X86_FEATURE_IDA (7*32+ 0) /* Intel Dynamic Acceleration */ #define X86_FEATURE_ARAT (7*32+ 1) /* Always Running APIC Timer */ +#define X86_FEATURE_CPB (7*32+ 2) /* AMD Core Performance Boost */ /* Virtualization flags: Linux defined */ #define X86_FEATURE_TPR_SHADOW (8*32+ 0) /* Intel TPR Shadow */ @@ -175,6 +176,7 @@ #if defined(__KERNEL__) && !defined(__ASSEMBLY__) +#include <asm/asm.h> #include <linux/bitops.h> extern const char * const x86_cap_flags[NCAPINTS*32]; @@ -283,6 +285,62 @@ extern const char * const x86_power_flags[32]; #endif /* CONFIG_X86_64 */ +/* + * Static testing of CPU features. Used the same as boot_cpu_has(). + * These are only valid after alternatives have run, but will statically + * patch the target code for additional performance. + * + */ +static __always_inline __pure bool __static_cpu_has(u8 bit) +{ +#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 5) + asm goto("1: jmp %l[t_no]\n" + "2:\n" + ".section .altinstructions,\"a\"\n" + _ASM_ALIGN "\n" + _ASM_PTR "1b\n" + _ASM_PTR "0\n" /* no replacement */ + " .byte %P0\n" /* feature bit */ + " .byte 2b - 1b\n" /* source len */ + " .byte 0\n" /* replacement len */ + " .byte 0xff + 0 - (2b-1b)\n" /* padding */ + ".previous\n" + : : "i" (bit) : : t_no); + return true; + t_no: + return false; +#else + u8 flag; + /* Open-coded due to __stringify() in ALTERNATIVE() */ + asm volatile("1: movb $0,%0\n" + "2:\n" + ".section .altinstructions,\"a\"\n" + _ASM_ALIGN "\n" + _ASM_PTR "1b\n" + _ASM_PTR "3f\n" + " .byte %P1\n" /* feature bit */ + " .byte 2b - 1b\n" /* source len */ + " .byte 4f - 3f\n" /* replacement len */ + " .byte 0xff + (4f-3f) - (2b-1b)\n" /* padding */ + ".previous\n" + ".section .altinstr_replacement,\"ax\"\n" + "3: movb $1,%0\n" + "4:\n" + ".previous\n" + : "=qm" (flag) : "i" (bit)); + return flag; +#endif +} + +#define static_cpu_has(bit) \ +( \ + __builtin_constant_p(boot_cpu_has(bit)) ? \ + boot_cpu_has(bit) : \ + (__builtin_constant_p(bit) && !((bit) & ~0xff)) ? \ + __static_cpu_has(bit) : \ + boot_cpu_has(bit) \ +) + #endif /* defined(__KERNEL__) && !defined(__ASSEMBLY__) */ #endif /* _ASM_X86_CPUFEATURE_H */ diff --git a/arch/x86/include/asm/dwarf2.h b/arch/x86/include/asm/dwarf2.h index ae6253a..733f7e9 100644 --- a/arch/x86/include/asm/dwarf2.h +++ b/arch/x86/include/asm/dwarf2.h @@ -34,6 +34,18 @@ #define CFI_SIGNAL_FRAME #endif +#if defined(CONFIG_AS_CFI_SECTIONS) && defined(__ASSEMBLY__) + /* + * Emit CFI data in .debug_frame sections, not .eh_frame sections. + * The latter we currently just discard since we don't do DWARF + * unwinding at runtime. So only the offline DWARF information is + * useful to anyone. Note we should not use this directive if this + * file is used in the vDSO assembly, or if vmlinux.lds.S gets + * changed so it doesn't discard .eh_frame. + */ + .cfi_sections .debug_frame +#endif + #else /* diff --git a/arch/x86/include/asm/e820.h b/arch/x86/include/asm/e820.h index 0e22296..ec8a52d 100644 --- a/arch/x86/include/asm/e820.h +++ b/arch/x86/include/asm/e820.h @@ -45,7 +45,12 @@ #define E820_NVS 4 #define E820_UNUSABLE 5 -/* reserved RAM used by kernel itself */ +/* + * reserved RAM used by kernel itself + * if CONFIG_INTEL_TXT is enabled, memory of this type will be + * included in the S3 integrity calculation and so should not include + * any memory that BIOS might alter over the S3 transition + */ #define E820_RESERVED_KERN 128 #ifndef __ASSEMBLY__ diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h index 0f85764..aeab29a 100644 --- a/arch/x86/include/asm/hardirq.h +++ b/arch/x86/include/asm/hardirq.h @@ -35,7 +35,7 @@ DECLARE_PER_CPU_SHARED_ALIGNED(irq_cpustat_t, irq_stat); #define __ARCH_IRQ_STAT -#define inc_irq_stat(member) percpu_add(irq_stat.member, 1) +#define inc_irq_stat(member) percpu_inc(irq_stat.member) #define local_softirq_pending() percpu_read(irq_stat.__softirq_pending) diff --git a/arch/x86/include/asm/hyperv.h b/arch/x86/include/asm/hyperv.h index e153a2b..5df477a 100644 --- a/arch/x86/include/asm/hyperv.h +++ b/arch/x86/include/asm/hyperv.h @@ -1,5 +1,5 @@ -#ifndef _ASM_X86_KVM_HYPERV_H -#define _ASM_X86_KVM_HYPERV_H +#ifndef _ASM_X86_HYPERV_H +#define _ASM_X86_HYPERV_H #include <linux/types.h> @@ -14,6 +14,10 @@ #define HYPERV_CPUID_ENLIGHTMENT_INFO 0x40000004 #define HYPERV_CPUID_IMPLEMENT_LIMITS 0x40000005 +#define HYPERV_HYPERVISOR_PRESENT_BIT 0x80000000 +#define HYPERV_CPUID_MIN 0x40000005 +#define HYPERV_CPUID_MAX 0x4000ffff + /* * Feature identification. EAX indicates which features are available * to the partition based upon the current partition privileges. @@ -129,6 +133,9 @@ /* MSR used to provide vcpu index */ #define HV_X64_MSR_VP_INDEX 0x40000002 +/* MSR used to read the per-partition time reference counter */ +#define HV_X64_MSR_TIME_REF_COUNT 0x40000020 + /* Define the virtual APIC registers */ #define HV_X64_MSR_EOI 0x40000070 #define HV_X64_MSR_ICR 0x40000071 diff --git a/arch/x86/include/asm/hypervisor.h b/arch/x86/include/asm/hypervisor.h index b78c094..70abda7 100644 --- a/arch/x86/include/asm/hypervisor.h +++ b/arch/x86/include/asm/hypervisor.h @@ -17,10 +17,33 @@ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. * */ -#ifndef ASM_X86__HYPERVISOR_H -#define ASM_X86__HYPERVISOR_H +#ifndef _ASM_X86_HYPERVISOR_H +#define _ASM_X86_HYPERVISOR_H extern void init_hypervisor(struct cpuinfo_x86 *c); extern void init_hypervisor_platform(void); +/* + * x86 hypervisor information + */ +struct hypervisor_x86 { + /* Hypervisor name */ + const char *name; + + /* Detection routine */ + bool (*detect)(void); + + /* Adjust CPU feature bits (run once per CPU) */ + void (*set_cpu_features)(struct cpuinfo_x86 *); + + /* Platform setup (run once per boot) */ + void (*init_platform)(void); +}; + +extern const struct hypervisor_x86 *x86_hyper; + +/* Recognized hypervisors */ +extern const struct hypervisor_x86 x86_hyper_vmware; +extern const struct hypervisor_x86 x86_hyper_ms_hyperv; + #endif diff --git a/arch/x86/include/asm/i387.h b/arch/x86/include/asm/i387.h index da29309..c991b3a 100644 --- a/arch/x86/include/asm/i387.h +++ b/arch/x86/include/asm/i387.h @@ -16,7 +16,9 @@ #include <linux/kernel_stat.h> #include <linux/regset.h> #include <linux/hardirq.h> +#include <linux/slab.h> #include <asm/asm.h> +#include <asm/cpufeature.h> #include <asm/processor.h> #include <asm/sigcontext.h> #include <asm/user.h> @@ -56,6 +58,11 @@ extern int restore_i387_xstate_ia32(void __user *buf); #define X87_FSW_ES (1 << 7) /* Exception Summary */ +static __always_inline __pure bool use_xsave(void) +{ + return static_cpu_has(X86_FEATURE_XSAVE); +} + #ifdef CONFIG_X86_64 /* Ignore delayed exceptions from user space */ @@ -91,15 +98,15 @@ static inline int fxrstor_checking(struct i387_fxsave_struct *fx) values. The kernel data segment can be sometimes 0 and sometimes new user value. Both should be ok. Use the PDA as safe address because it should be already in L1. */ -static inline void clear_fpu_state(struct task_struct *tsk) +static inline void fpu_clear(struct fpu *fpu) { - struct xsave_struct *xstate = &tsk->thread.xstate->xsave; - struct i387_fxsave_struct *fx = &tsk->thread.xstate->fxsave; + struct xsave_struct *xstate = &fpu->state->xsave; + struct i387_fxsave_struct *fx = &fpu->state->fxsave; /* * xsave header may indicate the init state of the FP. */ - if ((task_thread_info(tsk)->status & TS_XSAVE) && + if (use_xsave() && !(xstate->xsave_hdr.xstate_bv & XSTATE_FP)) return; @@ -111,6 +118,11 @@ static inline void clear_fpu_state(struct task_struct *tsk) X86_FEATURE_FXSAVE_LEAK); } +static inline void clear_fpu_state(struct task_struct *tsk) +{ + fpu_clear(&tsk->thread.fpu); +} + static inline int fxsave_user(struct i387_fxsave_struct __user *fx) { int err; @@ -135,7 +147,7 @@ static inline int fxsave_user(struct i387_fxsave_struct __user *fx) return err; } -static inline void fxsave(struct task_struct *tsk) +static inline void fpu_fxsave(struct fpu *fpu) { /* Using "rex64; fxsave %0" is broken because, if the memory operand uses any extended registers for addressing, a second REX prefix @@ -145,42 +157,45 @@ static inline void fxsave(struct task_struct *tsk) /* Using "fxsaveq %0" would be the ideal choice, but is only supported starting with gas 2.16. */ __asm__ __volatile__("fxsaveq %0" - : "=m" (tsk->thread.xstate->fxsave)); + : "=m" (fpu->state->fxsave)); #elif 0 /* Using, as a workaround, the properly prefixed form below isn't accepted by any binutils version so far released, complaining that the same type of prefix is used twice if an extended register is needed for addressing (fix submitted to mainline 2005-11-21). */ __asm__ __volatile__("rex64/fxsave %0" - : "=m" (tsk->thread.xstate->fxsave)); + : "=m" (fpu->state->fxsave)); #else /* This, however, we can work around by forcing the compiler to select an addressing mode that doesn't require extended registers. */ __asm__ __volatile__("rex64/fxsave (%1)" - : "=m" (tsk->thread.xstate->fxsave) - : "cdaSDb" (&tsk->thread.xstate->fxsave)); + : "=m" (fpu->state->fxsave) + : "cdaSDb" (&fpu->state->fxsave)); #endif } -static inline void __save_init_fpu(struct task_struct *tsk) +static inline void fpu_save_init(struct fpu *fpu) { - if (task_thread_info(tsk)->status & TS_XSAVE) - xsave(tsk); + if (use_xsave()) + fpu_xsave(fpu); else - fxsave(tsk); + fpu_fxsave(fpu); + + fpu_clear(fpu); +} - clear_fpu_state(tsk); +static inline void __save_init_fpu(struct task_struct *tsk) +{ + fpu_save_init(&tsk->thread.fpu); task_thread_info(tsk)->status &= ~TS_USEDFPU; } #else /* CONFIG_X86_32 */ #ifdef CONFIG_MATH_EMULATION -extern void finit_task(struct task_struct *tsk); +extern void finit_soft_fpu(struct i387_soft_struct *soft); #else -static inline void finit_task(struct task_struct *tsk) -{ -} +static inline void finit_soft_fpu(struct i387_soft_struct *soft) {} #endif static inline void tolerant_fwait(void) @@ -216,13 +231,13 @@ static inline int fxrstor_checking(struct i387_fxsave_struct *fx) /* * These must be called with preempt disabled */ -static inline void __save_init_fpu(struct task_struct *tsk) +static inline void fpu_save_init(struct fpu *fpu) { - if (task_thread_info(tsk)->status & TS_XSAVE) { - struct xsave_struct *xstate = &tsk->thread.xstate->xsave; - struct i387_fxsave_struct *fx = &tsk->thread.xstate->fxsave; + if (use_xsave()) { + struct xsave_struct *xstate = &fpu->state->xsave; + struct i387_fxsave_struct *fx = &fpu->state->fxsave; - xsave(tsk); + fpu_xsave(fpu); /* * xsave header may indicate the init state of the FP. @@ -246,8 +261,8 @@ static inline void __save_init_fpu(struct task_struct *tsk) "fxsave %[fx]\n" "bt $7,%[fsw] ; jnc 1f ; fnclex\n1:", X86_FEATURE_FXSR, - [fx] "m" (tsk->thread.xstate->fxsave), - [fsw] "m" (tsk->thread.xstate->fxsave.swd) : "memory"); + [fx] "m" (fpu->state->fxsave), + [fsw] "m" (fpu->state->fxsave.swd) : "memory"); clear_state: /* AMD K7/K8 CPUs don't save/restore FDP/FIP/FOP unless an exception is pending. Clear the x87 state here by setting it to fixed @@ -259,17 +274,34 @@ clear_state: X86_FEATURE_FXSAVE_LEAK, [addr] "m" (safe_address)); end: + ; +} + +static inline void __save_init_fpu(struct task_struct *tsk) +{ + fpu_save_init(&tsk->thread.fpu); task_thread_info(tsk)->status &= ~TS_USEDFPU; } + #endif /* CONFIG_X86_64 */ -static inline int restore_fpu_checking(struct task_struct *tsk) +static inline int fpu_fxrstor_checking(struct fpu *fpu) { - if (task_thread_info(tsk)->status & TS_XSAVE) - return xrstor_checking(&tsk->thread.xstate->xsave); + return fxrstor_checking(&fpu->state->fxsave); +} + +static inline int fpu_restore_checking(struct fpu *fpu) +{ + if (use_xsave()) + return fpu_xrstor_checking(fpu); else - return fxrstor_checking(&tsk->thread.xstate->fxsave); + return fpu_fxrstor_checking(fpu); +} + +static inline int restore_fpu_checking(struct task_struct *tsk) +{ + return fpu_restore_checking(&tsk->thread.fpu); } /* @@ -397,30 +429,59 @@ static inline void clear_fpu(struct task_struct *tsk) static inline unsigned short get_fpu_cwd(struct task_struct *tsk) { if (cpu_has_fxsr) { - return tsk->thread.xstate->fxsave.cwd; + return tsk->thread.fpu.state->fxsave.cwd; } else { - return (unsigned short)tsk->thread.xstate->fsave.cwd; + return (unsigned short)tsk->thread.fpu.state->fsave.cwd; } } static inline unsigned short get_fpu_swd(struct task_struct *tsk) { if (cpu_has_fxsr) { - return tsk->thread.xstate->fxsave.swd; + return tsk->thread.fpu.state->fxsave.swd; } else { - return (unsigned short)tsk->thread.xstate->fsave.swd; + return (unsigned short)tsk->thread.fpu.state->fsave.swd; } } static inline unsigned short get_fpu_mxcsr(struct task_struct *tsk) { if (cpu_has_xmm) { - return tsk->thread.xstate->fxsave.mxcsr; + return tsk->thread.fpu.state->fxsave.mxcsr; } else { return MXCSR_DEFAULT; } } +static bool fpu_allocated(struct fpu *fpu) +{ + return fpu->state != NULL; +} + +static inline int fpu_alloc(struct fpu *fpu) +{ + if (fpu_allocated(fpu)) + return 0; + fpu->state = kmem_cache_alloc(task_xstate_cachep, GFP_KERNEL); + if (!fpu->state) + return -ENOMEM; + WARN_ON((unsigned long)fpu->state & 15); + return 0; +} + +static inline void fpu_free(struct fpu *fpu) +{ + if (fpu->state) { + kmem_cache_free(task_xstate_cachep, fpu->state); + fpu->state = NULL; + } +} + +static inline void fpu_copy(struct fpu *dst, struct fpu *src) +{ + memcpy(dst->state, src->state, xstate_size); +} + #endif /* __ASSEMBLY__ */ #define PSHUFB_XMM5_XMM0 .byte 0x66, 0x0f, 0x38, 0x00, 0xc5 diff --git a/arch/x86/include/asm/i8253.h b/arch/x86/include/asm/i8253.h index 1edbf89..fc1f579 100644 --- a/arch/x86/include/asm/i8253.h +++ b/arch/x86/include/asm/i8253.h @@ -6,7 +6,7 @@ #define PIT_CH0 0x40 #define PIT_CH2 0x42 -extern spinlock_t i8253_lock; +extern raw_spinlock_t i8253_lock; extern struct clock_event_device *global_clock_event; diff --git a/arch/x86/include/asm/io_apic.h b/arch/x86/include/asm/io_apic.h index 35832a0..63cb409 100644 --- a/arch/x86/include/asm/io_apic.h +++ b/arch/x86/include/asm/io_apic.h @@ -159,7 +159,6 @@ struct io_apic_irq_attr; extern int io_apic_set_pci_routing(struct device *dev, int irq, struct io_apic_irq_attr *irq_attr); void setup_IO_APIC_irq_extra(u32 gsi); -extern int (*ioapic_renumber_irq)(int ioapic, int irq); extern void ioapic_init_mappings(void); extern void ioapic_insert_resources(void); @@ -180,12 +179,13 @@ extern void ioapic_write_entry(int apic, int pin, extern void setup_ioapic_ids_from_mpc(void); struct mp_ioapic_gsi{ - int gsi_base; - int gsi_end; + u32 gsi_base; + u32 gsi_end; }; extern struct mp_ioapic_gsi mp_gsi_routing[]; -int mp_find_ioapic(int gsi); -int mp_find_ioapic_pin(int ioapic, int gsi); +extern u32 gsi_end; +int mp_find_ioapic(u32 gsi); +int mp_find_ioapic_pin(int ioapic, u32 gsi); void __init mp_register_ioapic(int id, u32 address, u32 gsi_base); extern void __init pre_init_apic_IRQ0(void); @@ -197,7 +197,8 @@ static const int timer_through_8259 = 0; static inline void ioapic_init_mappings(void) { } static inline void ioapic_insert_resources(void) { } static inline void probe_nr_irqs_gsi(void) { } -static inline int mp_find_ioapic(int gsi) { return 0; } +#define gsi_end (NR_IRQS_LEGACY - 1) +static inline int mp_find_ioapic(u32 gsi) { return 0; } struct io_apic_irq_attr; static inline int io_apic_set_pci_routing(struct device *dev, int irq, diff --git a/arch/x86/include/asm/k8.h b/arch/x86/include/asm/k8.h index f70e600..af00bd1 100644 --- a/arch/x86/include/asm/k8.h +++ b/arch/x86/include/asm/k8.h @@ -16,11 +16,16 @@ extern int k8_numa_init(unsigned long start_pfn, unsigned long end_pfn); extern int k8_scan_nodes(void); #ifdef CONFIG_K8_NB +extern int num_k8_northbridges; + static inline struct pci_dev *node_to_k8_nb_misc(int node) { return (node < num_k8_northbridges) ? k8_northbridges[node] : NULL; } + #else +#define num_k8_northbridges 0 + static inline struct pci_dev *node_to_k8_nb_misc(int node) { return NULL; diff --git a/arch/x86/include/asm/mpspec.h b/arch/x86/include/asm/mpspec.h index d8bf23a..c82868e 100644 --- a/arch/x86/include/asm/mpspec.h +++ b/arch/x86/include/asm/mpspec.h @@ -105,16 +105,6 @@ extern void mp_config_acpi_legacy_irqs(void); struct device; extern int mp_register_gsi(struct device *dev, u32 gsi, int edge_level, int active_high_low); -extern int acpi_probe_gsi(void); -#ifdef CONFIG_X86_IO_APIC -extern int mp_find_ioapic(int gsi); -extern int mp_find_ioapic_pin(int ioapic, int gsi); -#endif -#else /* !CONFIG_ACPI: */ -static inline int acpi_probe_gsi(void) -{ - return 0; -} #endif /* CONFIG_ACPI */ #define PHYSID_ARRAY_SIZE BITS_TO_LONGS(MAX_APICS) diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h new file mode 100644 index 0000000..79ce568 --- /dev/null +++ b/arch/x86/include/asm/mshyperv.h @@ -0,0 +1,14 @@ +#ifndef _ASM_X86_MSHYPER_H +#define _ASM_X86_MSHYPER_H + +#include <linux/types.h> +#include <asm/hyperv.h> + +struct ms_hyperv_info { + u32 features; + u32 hints; +}; + +extern struct ms_hyperv_info ms_hyperv; + +#endif diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h index 66a272d..0ec6d12 100644 --- a/arch/x86/include/asm/percpu.h +++ b/arch/x86/include/asm/percpu.h @@ -190,6 +190,29 @@ do { \ pfo_ret__; \ }) +#define percpu_unary_op(op, var) \ +({ \ + switch (sizeof(var)) { \ + case 1: \ + asm(op "b "__percpu_arg(0) \ + : "+m" (var)); \ + break; \ + case 2: \ + asm(op "w "__percpu_arg(0) \ + : "+m" (var)); \ + break; \ + case 4: \ + asm(op "l "__percpu_arg(0) \ + : "+m" (var)); \ + break; \ + case 8: \ + asm(op "q "__percpu_arg(0) \ + : "+m" (var)); \ + break; \ + default: __bad_percpu_size(); \ + } \ +}) + /* * percpu_read() makes gcc load the percpu variable every time it is * accessed while percpu_read_stable() allows the value to be cached. @@ -207,6 +230,7 @@ do { \ #define percpu_and(var, val) percpu_to_op("and", var, val) #define percpu_or(var, val) percpu_to_op("or", var, val) #define percpu_xor(var, val) percpu_to_op("xor", var, val) +#define percpu_inc(var) percpu_unary_op("inc", var) #define __this_cpu_read_1(pcp) percpu_from_op("mov", (pcp), "m"(pcp)) #define __this_cpu_read_2(pcp) percpu_from_op("mov", (pcp), "m"(pcp)) diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 32428b4..5a51379 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -113,7 +113,6 @@ struct cpuinfo_x86 { /* Index into per_cpu list: */ u16 cpu_index; #endif - unsigned int x86_hyper_vendor; } __attribute__((__aligned__(SMP_CACHE_BYTES))); #define X86_VENDOR_INTEL 0 @@ -127,9 +126,6 @@ struct cpuinfo_x86 { #define X86_VENDOR_UNKNOWN 0xff -#define X86_HYPER_VENDOR_NONE 0 -#define X86_HYPER_VENDOR_VMWARE 1 - /* * capabilities of CPUs */ @@ -380,6 +376,10 @@ union thread_xstate { struct xsave_struct xsave; }; +struct fpu { + union thread_xstate *state; +}; + #ifdef CONFIG_X86_64 DECLARE_PER_CPU(struct orig_ist, orig_ist); @@ -457,7 +457,7 @@ struct thread_struct { unsigned long trap_no; unsigned long error_code; /* floating point and extended processor state */ - union thread_xstate *xstate; + struct fpu fpu; #ifdef CONFIG_X86_32 /* Virtual 86 mode info */ struct vm86_struct __user *vm86_info; diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h index d017ed55..d4092fac 100644 --- a/arch/x86/include/asm/thread_info.h +++ b/arch/x86/include/asm/thread_info.h @@ -242,7 +242,6 @@ static inline struct thread_info *current_thread_info(void) #define TS_POLLING 0x0004 /* true if in idle loop and not sleeping */ #define TS_RESTORE_SIGMASK 0x0008 /* restore signal mask in do_signal() */ -#define TS_XSAVE 0x0010 /* Use xsave/xrstor */ #define tsk_is_polling(t) (task_thread_info(t)->status & TS_POLLING) diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h index 4da91ad..f66cda5 100644 --- a/arch/x86/include/asm/traps.h +++ b/arch/x86/include/asm/traps.h @@ -79,7 +79,7 @@ static inline int get_si_code(unsigned long condition) extern int panic_on_unrecovered_nmi; -void math_error(void __user *); +void math_error(struct pt_regs *, int, int); void math_emulate(struct math_emu_info *); #ifndef CONFIG_X86_32 asmlinkage void smp_thermal_interrupt(void); diff --git a/arch/x86/include/asm/uv/uv_bau.h b/arch/x86/include/asm/uv/uv_bau.h index b414d2b..aa558ac 100644 --- a/arch/x86/include/asm/uv/uv_bau.h +++ b/arch/x86/include/asm/uv/uv_bau.h @@ -27,13 +27,14 @@ * set 2 is at BASE + 2*512, set 3 at BASE + 3*512, and so on. * * We will use 31 sets, one for sending BAU messages from each of the 32 - * cpu's on the node. + * cpu's on the uvhub. * * TLB shootdown will use the first of the 8 descriptors of each set. * Each of the descriptors is 64 bytes in size (8*64 = 512 bytes in a set). */ #define UV_ITEMS_PER_DESCRIPTOR 8 +#define MAX_BAU_CONCURRENT 3 #define UV_CPUS_PER_ACT_STATUS 32 #define UV_ACT_STATUS_MASK 0x3 #define UV_ACT_STATUS_SIZE 2 @@ -45,6 +46,9 @@ #define UV_PAYLOADQ_PNODE_SHIFT 49 #define UV_PTC_BASENAME "sgi_uv/ptc_statistics" #define uv_physnodeaddr(x) ((__pa((unsigned long)(x)) & uv_mmask)) +#define UV_ENABLE_INTD_SOFT_ACK_MODE_SHIFT 15 +#define UV_INTD_SOFT_ACK_TIMEOUT_PERIOD_SHIFT 16 +#define UV_INTD_SOFT_ACK_TIMEOUT_PERIOD 0x000000000bUL /* * bits in UVH_LB_BAU_SB_ACTIVATION_STATUS_0/1 @@ -55,15 +59,29 @@ #define DESC_STATUS_SOURCE_TIMEOUT 3 /* - * source side thresholds at which message retries print a warning + * source side threshholds at which message retries print a warning */ #define SOURCE_TIMEOUT_LIMIT 20 #define DESTINATION_TIMEOUT_LIMIT 20 /* + * misc. delays, in microseconds + */ +#define THROTTLE_DELAY 10 +#define TIMEOUT_DELAY 10 +#define BIOS_TO 1000 +/* BIOS is assumed to set the destination timeout to 1003520 nanoseconds */ + +/* + * threshholds at which to use IPI to free resources + */ +#define PLUGSB4RESET 100 +#define TIMEOUTSB4RESET 100 + +/* * number of entries in the destination side payload queue */ -#define DEST_Q_SIZE 17 +#define DEST_Q_SIZE 20 /* * number of destination side software ack resources */ @@ -72,9 +90,10 @@ /* * completion statuses for sending a TLB flush message */ -#define FLUSH_RETRY 1 -#define FLUSH_GIVEUP 2 -#define FLUSH_COMPLETE 3 +#define FLUSH_RETRY_PLUGGED 1 +#define FLUSH_RETRY_TIMEOUT 2 +#define FLUSH_GIVEUP 3 +#define FLUSH_COMPLETE 4 /* * Distribution: 32 bytes (256 bits) (bytes 0-0x1f of descriptor) @@ -86,14 +105,14 @@ * 'base_dest_nodeid' field of the header corresponds to the * destination nodeID associated with that specified bit. */ -struct bau_target_nodemask { - unsigned long bits[BITS_TO_LONGS(256)]; +struct bau_target_uvhubmask { + unsigned long bits[BITS_TO_LONGS(UV_DISTRIBUTION_SIZE)]; }; /* - * mask of cpu's on a node + * mask of cpu's on a uvhub * (during initialization we need to check that unsigned long has - * enough bits for max. cpu's per node) + * enough bits for max. cpu's per uvhub) */ struct bau_local_cpumask { unsigned long bits; @@ -135,8 +154,8 @@ struct bau_msg_payload { struct bau_msg_header { unsigned int dest_subnodeid:6; /* must be 0x10, for the LB */ /* bits 5:0 */ - unsigned int base_dest_nodeid:15; /* nasid>>1 (pnode) of */ - /* bits 20:6 */ /* first bit in node_map */ + unsigned int base_dest_nodeid:15; /* nasid (pnode<<1) of */ + /* bits 20:6 */ /* first bit in uvhub map */ unsigned int command:8; /* message type */ /* bits 28:21 */ /* 0x38: SN3net EndPoint Message */ @@ -146,26 +165,38 @@ struct bau_msg_header { unsigned int rsvd_2:9; /* must be zero */ /* bits 40:32 */ /* Suppl_A is 56-41 */ - unsigned int payload_2a:8;/* becomes byte 16 of msg */ - /* bits 48:41 */ /* not currently using */ - unsigned int payload_2b:8;/* becomes byte 17 of msg */ - /* bits 56:49 */ /* not currently using */ + unsigned int sequence:16;/* message sequence number */ + /* bits 56:41 */ /* becomes bytes 16-17 of msg */ /* Address field (96:57) is never used as an address (these are address bits 42:3) */ + unsigned int rsvd_3:1; /* must be zero */ /* bit 57 */ /* address bits 27:4 are payload */ - /* these 24 bits become bytes 12-14 of msg */ + /* these next 24 (58-81) bits become bytes 12-14 of msg */ + + /* bits 65:58 land in byte 12 */ unsigned int replied_to:1;/* sent as 0 by the source to byte 12 */ /* bit 58 */ - - unsigned int payload_1a:5;/* not currently used */ - /* bits 63:59 */ - unsigned int payload_1b:8;/* not currently used */ - /* bits 71:64 */ - unsigned int payload_1c:8;/* not currently used */ - /* bits 79:72 */ - unsigned int payload_1d:2;/* not currently used */ + unsigned int msg_type:3; /* software type of the message*/ + /* bits 61:59 */ + unsigned int canceled:1; /* message canceled, resource to be freed*/ + /* bit 62 */ + unsigned int payload_1a:1;/* not currently used */ + /* bit 63 */ + unsigned int payload_1b:2;/* not currently used */ + /* bits 65:64 */ + + /* bits 73:66 land in byte 13 */ + unsigned int payload_1ca:6;/* not currently used */ + /* bits 71:66 */ + unsigned int payload_1c:2;/* not currently used */ + /* bits 73:72 */ + + /* bits 81:74 land in byte 14 */ + unsigned int payload_1d:6;/* not currently used */ + /* bits 79:74 */ + unsigned int payload_1e:2;/* not currently used */ /* bits 81:80 */ unsigned int rsvd_4:7; /* must be zero */ @@ -178,7 +209,7 @@ struct bau_msg_header { /* bits 95:90 */ unsigned int rsvd_6:5; /* must be zero */ /* bits 100:96 */ - unsigned int int_both:1;/* if 1, interrupt both sockets on the blade */ + unsigned int int_both:1;/* if 1, interrupt both sockets on the uvhub */ /* bit 101*/ unsigned int fairness:3;/* usually zero */ /* bits 104:102 */ @@ -191,13 +222,18 @@ struct bau_msg_header { /* bits 127:107 */ }; +/* see msg_type: */ +#define MSG_NOOP 0 +#define MSG_REGULAR 1 +#define MSG_RETRY 2 + /* * The activation descriptor: * The format of the message to send, plus all accompanying control * Should be 64 bytes */ struct bau_desc { - struct bau_target_nodemask distribution; + struct bau_target_uvhubmask distribution; /* * message template, consisting of header and payload: */ @@ -237,19 +273,25 @@ struct bau_payload_queue_entry { unsigned short acknowledge_count; /* filled in by destination */ /* 16 bits, bytes 10-11 */ - unsigned short replied_to:1; /* sent as 0 by the source */ - /* 1 bit */ - unsigned short unused1:7; /* not currently using */ - /* 7 bits: byte 12) */ + /* these next 3 bytes come from bits 58-81 of the message header */ + unsigned short replied_to:1; /* sent as 0 by the source */ + unsigned short msg_type:3; /* software message type */ + unsigned short canceled:1; /* sent as 0 by the source */ + unsigned short unused1:3; /* not currently using */ + /* byte 12 */ - unsigned char unused2[2]; /* not currently using */ - /* bytes 13-14 */ + unsigned char unused2a; /* not currently using */ + /* byte 13 */ + unsigned char unused2; /* not currently using */ + /* byte 14 */ unsigned char sw_ack_vector; /* filled in by the hardware */ /* byte 15 (bits 127:120) */ - unsigned char unused4[3]; /* not currently using bytes 17-19 */ - /* bytes 17-19 */ + unsigned short sequence; /* message sequence number */ + /* bytes 16-17 */ + unsigned char unused4[2]; /* not currently using bytes 18-19 */ + /* bytes 18-19 */ int number_of_cpus; /* filled in at destination */ /* 32 bits, bytes 20-23 (aligned) */ @@ -259,63 +301,93 @@ struct bau_payload_queue_entry { }; /* - * one for every slot in the destination payload queue - */ -struct bau_msg_status { - struct bau_local_cpumask seen_by; /* map of cpu's */ -}; - -/* - * one for every slot in the destination software ack resources - */ -struct bau_sw_ack_status { - struct bau_payload_queue_entry *msg; /* associated message */ - int watcher; /* cpu monitoring, or -1 */ -}; - -/* - * one on every node and per-cpu; to locate the software tables + * one per-cpu; to locate the software tables */ struct bau_control { struct bau_desc *descriptor_base; - struct bau_payload_queue_entry *bau_msg_head; struct bau_payload_queue_entry *va_queue_first; struct bau_payload_queue_entry *va_queue_last; - struct bau_msg_status *msg_statuses; - int *watching; /* pointer to array */ + struct bau_payload_queue_entry *bau_msg_head; + struct bau_control *uvhub_master; + struct bau_control *socket_master; + unsigned long timeout_interval; + atomic_t active_descriptor_count; + int max_concurrent; + int max_concurrent_constant; + int retry_message_scans; + int plugged_tries; + int timeout_tries; + int ipi_attempts; + int conseccompletes; + short cpu; + short uvhub_cpu; + short uvhub; + short cpus_in_socket; + short cpus_in_uvhub; + unsigned short message_number; + unsigned short uvhub_quiesce; + short socket_acknowledge_count[DEST_Q_SIZE]; + cycles_t send_message; + spinlock_t masks_lock; + spinlock_t uvhub_lock; + spinlock_t queue_lock; }; /* * This structure is allocated per_cpu for UV TLB shootdown statistics. */ struct ptc_stats { - unsigned long ptc_i; /* number of IPI-style flushes */ - unsigned long requestor; /* number of nodes this cpu sent to */ - unsigned long requestee; /* times cpu was remotely requested */ - unsigned long alltlb; /* times all tlb's on this cpu were flushed */ - unsigned long onetlb; /* times just one tlb on this cpu was flushed */ - unsigned long s_retry; /* retries on source side timeouts */ - unsigned long d_retry; /* retries on destination side timeouts */ - unsigned long sflush; /* cycles spent in uv_flush_tlb_others */ - unsigned long dflush; /* cycles spent on destination side */ - unsigned long retriesok; /* successes on retries */ - unsigned long nomsg; /* interrupts with no message */ - unsigned long multmsg; /* interrupts with multiple messages */ - unsigned long ntargeted;/* nodes targeted */ + /* sender statistics */ + unsigned long s_giveup; /* number of fall backs to IPI-style flushes */ + unsigned long s_requestor; /* number of shootdown requests */ + unsigned long s_stimeout; /* source side timeouts */ + unsigned long s_dtimeout; /* destination side timeouts */ + unsigned long s_time; /* time spent in sending side */ + unsigned long s_retriesok; /* successful retries */ + unsigned long s_ntargcpu; /* number of cpus targeted */ + unsigned long s_ntarguvhub; /* number of uvhubs targeted */ + unsigned long s_ntarguvhub16; /* number of times >= 16 target hubs */ + unsigned long s_ntarguvhub8; /* number of times >= 8 target hubs */ + unsigned long s_ntarguvhub4; /* number of times >= 4 target hubs */ + unsigned long s_ntarguvhub2; /* number of times >= 2 target hubs */ + unsigned long s_ntarguvhub1; /* number of times == 1 target hub */ + unsigned long s_resets_plug; /* ipi-style resets from plug state */ + unsigned long s_resets_timeout; /* ipi-style resets from timeouts */ + unsigned long s_busy; /* status stayed busy past s/w timer */ + unsigned long s_throttles; /* waits in throttle */ + unsigned long s_retry_messages; /* retry broadcasts */ + /* destination statistics */ + unsigned long d_alltlb; /* times all tlb's on this cpu were flushed */ + unsigned long d_onetlb; /* times just one tlb on this cpu was flushed */ + unsigned long d_multmsg; /* interrupts with multiple messages */ + unsigned long d_nomsg; /* interrupts with no message */ + unsigned long d_time; /* time spent on destination side */ + unsigned long d_requestee; /* number of messages processed */ + unsigned long d_retries; /* number of retry messages processed */ + unsigned long d_canceled; /* number of messages canceled by retries */ + unsigned long d_nocanceled; /* retries that found nothing to cancel */ + unsigned long d_resets; /* number of ipi-style requests processed */ + unsigned long d_rcanceled; /* number of messages canceled by resets */ }; -static inline int bau_node_isset(int node, struct bau_target_nodemask *dstp) +static inline int bau_uvhub_isset(int uvhub, struct bau_target_uvhubmask *dstp) { - return constant_test_bit(node, &dstp->bits[0]); + return constant_test_bit(uvhub, &dstp->bits[0]); } -static inline void bau_node_set(int node, struct bau_target_nodemask *dstp) +static inline void bau_uvhub_set(int uvhub, struct bau_target_uvhubmask *dstp) { - __set_bit(node, &dstp->bits[0]); + __set_bit(uvhub, &dstp->bits[0]); } -static inline void bau_nodes_clear(struct bau_target_nodemask *dstp, int nbits) +static inline void bau_uvhubs_clear(struct bau_target_uvhubmask *dstp, + int nbits) { bitmap_zero(&dstp->bits[0], nbits); } +static inline int bau_uvhub_weight(struct bau_target_uvhubmask *dstp) +{ + return bitmap_weight((unsigned long *)&dstp->bits[0], + UV_DISTRIBUTION_SIZE); +} static inline void bau_cpubits_clear(struct bau_local_cpumask *dstp, int nbits) { @@ -328,4 +400,35 @@ static inline void bau_cpubits_clear(struct bau_local_cpumask *dstp, int nbits) extern void uv_bau_message_intr1(void); extern void uv_bau_timeout_intr1(void); +struct atomic_short { + short counter; +}; + +/** + * atomic_read_short - read a short atomic variable + * @v: pointer of type atomic_short + * + * Atomically reads the value of @v. + */ +static inline int atomic_read_short(const struct atomic_short *v) +{ + return v->counter; +} + +/** + * atomic_add_short_return - add and return a short int + * @i: short value to add + * @v: pointer of type atomic_short + * + * Atomically adds @i to @v and returns @i + @v + */ +static inline int atomic_add_short_return(short i, struct atomic_short *v) +{ + short __i = i; + asm volatile(LOCK_PREFIX "xaddw %0, %1" + : "+r" (i), "+m" (v->counter) + : : "memory"); + return i + __i; +} + #endif /* _ASM_X86_UV_UV_BAU_H */ diff --git a/arch/x86/include/asm/uv/uv_hub.h b/arch/x86/include/asm/uv/uv_hub.h index 14cc74b..bf6b88e 100644 --- a/arch/x86/include/asm/uv/uv_hub.h +++ b/arch/x86/include/asm/uv/uv_hub.h @@ -307,7 +307,7 @@ static inline unsigned long uv_read_global_mmr32(int pnode, unsigned long offset * Access Global MMR space using the MMR space located at the top of physical * memory. */ -static inline unsigned long *uv_global_mmr64_address(int pnode, unsigned long offset) +static inline volatile void __iomem *uv_global_mmr64_address(int pnode, unsigned long offset) { return __va(UV_GLOBAL_MMR64_BASE | UV_GLOBAL_MMR64_PNODE_BITS(pnode) | offset); diff --git a/arch/x86/include/asm/uv/uv_mmrs.h b/arch/x86/include/asm/uv/uv_mmrs.h index 2cae46c..b2f2d2e 100644 --- a/arch/x86/include/asm/uv/uv_mmrs.h +++ b/arch/x86/include/asm/uv/uv_mmrs.h @@ -1,4 +1,3 @@ - /* * This file is subject to the terms and conditions of the GNU General Public * License. See the file "COPYING" in the main directory of this archive @@ -15,13 +14,25 @@ #define UV_MMR_ENABLE (1UL << 63) /* ========================================================================= */ +/* UVH_BAU_DATA_BROADCAST */ +/* ========================================================================= */ +#define UVH_BAU_DATA_BROADCAST 0x61688UL +#define UVH_BAU_DATA_BROADCAST_32 0x0440 + +#define UVH_BAU_DATA_BROADCAST_ENABLE_SHFT 0 +#define UVH_BAU_DATA_BROADCAST_ENABLE_MASK 0x0000000000000001UL + +union uvh_bau_data_broadcast_u { + unsigned long v; + struct uvh_bau_data_broadcast_s { + unsigned long enable : 1; /* RW */ + unsigned long rsvd_1_63: 63; /* */ + } s; +}; + +/* ========================================================================= */ /* UVH_BAU_DATA_CONFIG */ /* ========================================================================= */ -#define UVH_LB_BAU_MISC_CONTROL 0x320170UL -#define UV_ENABLE_INTD_SOFT_ACK_MODE_SHIFT 15 -#define UV_INTD_SOFT_ACK_TIMEOUT_PERIOD_SHIFT 16 -#define UV_INTD_SOFT_ACK_TIMEOUT_PERIOD 0x000000000bUL -/* 1011 timebase 7 (168millisec) * 3 ticks -> 500ms */ #define UVH_BAU_DATA_CONFIG 0x61680UL #define UVH_BAU_DATA_CONFIG_32 0x0438 @@ -604,6 +615,68 @@ union uvh_lb_bau_intd_software_acknowledge_u { #define UVH_LB_BAU_INTD_SOFTWARE_ACKNOWLEDGE_ALIAS_32 0x0a70 /* ========================================================================= */ +/* UVH_LB_BAU_MISC_CONTROL */ +/* ========================================================================= */ +#define UVH_LB_BAU_MISC_CONTROL 0x320170UL +#define UVH_LB_BAU_MISC_CONTROL_32 0x00a10 + +#define UVH_LB_BAU_MISC_CONTROL_REJECTION_DELAY_SHFT 0 +#define UVH_LB_BAU_MISC_CONTROL_REJECTION_DELAY_MASK 0x00000000000000ffUL +#define UVH_LB_BAU_MISC_CONTROL_APIC_MODE_SHFT 8 +#define UVH_LB_BAU_MISC_CONTROL_APIC_MODE_MASK 0x0000000000000100UL +#define UVH_LB_BAU_MISC_CONTROL_FORCE_BROADCAST_SHFT 9 +#define UVH_LB_BAU_MISC_CONTROL_FORCE_BROADCAST_MASK 0x0000000000000200UL +#define UVH_LB_BAU_MISC_CONTROL_FORCE_LOCK_NOP_SHFT 10 +#define UVH_LB_BAU_MISC_CONTROL_FORCE_LOCK_NOP_MASK 0x0000000000000400UL +#define UVH_LB_BAU_MISC_CONTROL_CSI_AGENT_PRESENCE_VECTOR_SHFT 11 +#define UVH_LB_BAU_MISC_CONTROL_CSI_AGENT_PRESENCE_VECTOR_MASK 0x0000000000003800UL +#define UVH_LB_BAU_MISC_CONTROL_DESCRIPTOR_FETCH_MODE_SHFT 14 +#define UVH_LB_BAU_MISC_CONTROL_DESCRIPTOR_FETCH_MODE_MASK 0x0000000000004000UL +#define UVH_LB_BAU_MISC_CONTROL_ENABLE_INTD_SOFT_ACK_MODE_SHFT 15 +#define UVH_LB_BAU_MISC_CONTROL_ENABLE_INTD_SOFT_ACK_MODE_MASK 0x0000000000008000UL +#define UVH_LB_BAU_MISC_CONTROL_INTD_SOFT_ACK_TIMEOUT_PERIOD_SHFT 16 +#define UVH_LB_BAU_MISC_CONTROL_INTD_SOFT_ACK_TIMEOUT_PERIOD_MASK 0x00000000000f0000UL +#define UVH_LB_BAU_MISC_CONTROL_ENABLE_DUAL_MAPPING_MODE_SHFT 20 +#define UVH_LB_BAU_MISC_CONTROL_ENABLE_DUAL_MAPPING_MODE_MASK 0x0000000000100000UL +#define UVH_LB_BAU_MISC_CONTROL_VGA_IO_PORT_DECODE_ENABLE_SHFT 21 +#define UVH_LB_BAU_MISC_CONTROL_VGA_IO_PORT_DECODE_ENABLE_MASK 0x0000000000200000UL +#define UVH_LB_BAU_MISC_CONTROL_VGA_IO_PORT_16_BIT_DECODE_SHFT 22 +#define UVH_LB_BAU_MISC_CONTROL_VGA_IO_PORT_16_BIT_DECODE_MASK 0x0000000000400000UL +#define UVH_LB_BAU_MISC_CONTROL_SUPPRESS_DEST_REGISTRATION_SHFT 23 +#define UVH_LB_BAU_MISC_CONTROL_SUPPRESS_DEST_REGISTRATION_MASK 0x0000000000800000UL +#define UVH_LB_BAU_MISC_CONTROL_PROGRAMMED_INITIAL_PRIORITY_SHFT 24 +#define UVH_LB_BAU_MISC_CONTROL_PROGRAMMED_INITIAL_PRIORITY_MASK 0x0000000007000000UL +#define UVH_LB_BAU_MISC_CONTROL_USE_INCOMING_PRIORITY_SHFT 27 +#define UVH_LB_BAU_MISC_CONTROL_USE_INCOMING_PRIORITY_MASK 0x0000000008000000UL +#define UVH_LB_BAU_MISC_CONTROL_ENABLE_PROGRAMMED_INITIAL_PRIORITY_SHFT 28 +#define UVH_LB_BAU_MISC_CONTROL_ENABLE_PROGRAMMED_INITIAL_PRIORITY_MASK 0x0000000010000000UL +#define UVH_LB_BAU_MISC_CONTROL_FUN_SHFT 48 +#define UVH_LB_BAU_MISC_CONTROL_FUN_MASK 0xffff000000000000UL + +union uvh_lb_bau_misc_control_u { + unsigned long v; + struct uvh_lb_bau_misc_control_s { + unsigned long rejection_delay : 8; /* RW */ + unsigned long apic_mode : 1; /* RW */ + unsigned long force_broadcast : 1; /* RW */ + unsigned long force_lock_nop : 1; /* RW */ + unsigned long csi_agent_presence_vector : 3; /* RW */ + unsigned long descriptor_fetch_mode : 1; /* RW */ + unsigned long enable_intd_soft_ack_mode : 1; /* RW */ + unsigned long intd_soft_ack_timeout_period : 4; /* RW */ + unsigned long enable_dual_mapping_mode : 1; /* RW */ + unsigned long vga_io_port_decode_enable : 1; /* RW */ + unsigned long vga_io_port_16_bit_decode : 1; /* RW */ + unsigned long suppress_dest_registration : 1; /* RW */ + unsigned long programmed_initial_priority : 3; /* RW */ + unsigned long use_incoming_priority : 1; /* RW */ + unsigned long enable_programmed_initial_priority : 1; /* RW */ + unsigned long rsvd_29_47 : 19; /* */ + unsigned long fun : 16; /* RW */ + } s; +}; + +/* ========================================================================= */ /* UVH_LB_BAU_SB_ACTIVATION_CONTROL */ /* ========================================================================= */ #define UVH_LB_BAU_SB_ACTIVATION_CONTROL 0x320020UL @@ -681,334 +754,6 @@ union uvh_lb_bau_sb_descriptor_base_u { }; /* ========================================================================= */ -/* UVH_LB_MCAST_AOERR0_RPT_ENABLE */ -/* ========================================================================= */ -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE 0x50b20UL - -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MCAST_OBESE_MSG_SHFT 0 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MCAST_OBESE_MSG_MASK 0x0000000000000001UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MCAST_DATA_SB_ERR_SHFT 1 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MCAST_DATA_SB_ERR_MASK 0x0000000000000002UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MCAST_NACK_BUFF_PARITY_SHFT 2 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MCAST_NACK_BUFF_PARITY_MASK 0x0000000000000004UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MCAST_TIMEOUT_SHFT 3 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MCAST_TIMEOUT_MASK 0x0000000000000008UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MCAST_INACTIVE_REPLY_SHFT 4 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MCAST_INACTIVE_REPLY_MASK 0x0000000000000010UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MCAST_UPGRADE_ERROR_SHFT 5 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MCAST_UPGRADE_ERROR_MASK 0x0000000000000020UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MCAST_REG_COUNT_UNDERFLOW_SHFT 6 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MCAST_REG_COUNT_UNDERFLOW_MASK 0x0000000000000040UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MCAST_REP_OBESE_MSG_SHFT 7 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MCAST_REP_OBESE_MSG_MASK 0x0000000000000080UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_UCACHE_REQ_RUNT_MSG_SHFT 8 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_UCACHE_REQ_RUNT_MSG_MASK 0x0000000000000100UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_UCACHE_REQ_OBESE_MSG_SHFT 9 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_UCACHE_REQ_OBESE_MSG_MASK 0x0000000000000200UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_UCACHE_REQ_DATA_SB_ERR_SHFT 10 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_UCACHE_REQ_DATA_SB_ERR_MASK 0x0000000000000400UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_UCACHE_REP_RUNT_MSG_SHFT 11 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_UCACHE_REP_RUNT_MSG_MASK 0x0000000000000800UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_UCACHE_REP_OBESE_MSG_SHFT 12 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_UCACHE_REP_OBESE_MSG_MASK 0x0000000000001000UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_UCACHE_REP_DATA_SB_ERR_SHFT 13 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_UCACHE_REP_DATA_SB_ERR_MASK 0x0000000000002000UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_UCACHE_REP_COMMAND_ERR_SHFT 14 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_UCACHE_REP_COMMAND_ERR_MASK 0x0000000000004000UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_UCACHE_PEND_TIMEOUT_SHFT 15 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_UCACHE_PEND_TIMEOUT_MASK 0x0000000000008000UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MACC_REQ_RUNT_MSG_SHFT 16 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MACC_REQ_RUNT_MSG_MASK 0x0000000000010000UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MACC_REQ_OBESE_MSG_SHFT 17 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MACC_REQ_OBESE_MSG_MASK 0x0000000000020000UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MACC_REQ_DATA_SB_ERR_SHFT 18 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MACC_REQ_DATA_SB_ERR_MASK 0x0000000000040000UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MACC_REP_RUNT_MSG_SHFT 19 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MACC_REP_RUNT_MSG_MASK 0x0000000000080000UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MACC_REP_OBESE_MSG_SHFT 20 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MACC_REP_OBESE_MSG_MASK 0x0000000000100000UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MACC_REP_DATA_SB_ERR_SHFT 21 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MACC_REP_DATA_SB_ERR_MASK 0x0000000000200000UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MACC_AMO_TIMEOUT_SHFT 22 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MACC_AMO_TIMEOUT_MASK 0x0000000000400000UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MACC_PUT_TIMEOUT_SHFT 23 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MACC_PUT_TIMEOUT_MASK 0x0000000000800000UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MACC_SPURIOUS_EVENT_SHFT 24 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_MACC_SPURIOUS_EVENT_MASK 0x0000000001000000UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_IOH_DESTINATION_TABLE_PARITY_SHFT 25 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_IOH_DESTINATION_TABLE_PARITY_MASK 0x0000000002000000UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_GET_HAD_ERROR_REPLY_SHFT 26 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_GET_HAD_ERROR_REPLY_MASK 0x0000000004000000UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_GET_TIMEOUT_SHFT 27 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_GET_TIMEOUT_MASK 0x0000000008000000UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_LOCK_MANAGER_HAD_ERROR_REPLY_SHFT 28 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_LOCK_MANAGER_HAD_ERROR_REPLY_MASK 0x0000000010000000UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_PUT_HAD_ERROR_REPLY_SHFT 29 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_PUT_HAD_ERROR_REPLY_MASK 0x0000000020000000UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_PUT_TIMEOUT_SHFT 30 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_PUT_TIMEOUT_MASK 0x0000000040000000UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_SB_ACTIVATION_OVERRUN_SHFT 31 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_SB_ACTIVATION_OVERRUN_MASK 0x0000000080000000UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_COMPLETED_GB_ACTIVATION_HAD_ERROR_REPLY_SHFT 32 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_COMPLETED_GB_ACTIVATION_HAD_ERROR_REPLY_MASK 0x0000000100000000UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_COMPLETED_GB_ACTIVATION_TIMEOUT_SHFT 33 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_COMPLETED_GB_ACTIVATION_TIMEOUT_MASK 0x0000000200000000UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_DESCRIPTOR_BUFFER_0_PARITY_SHFT 34 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_DESCRIPTOR_BUFFER_0_PARITY_MASK 0x0000000400000000UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_DESCRIPTOR_BUFFER_1_PARITY_SHFT 35 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_DESCRIPTOR_BUFFER_1_PARITY_MASK 0x0000000800000000UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_SOCKET_DESTINATION_TABLE_PARITY_SHFT 36 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_SOCKET_DESTINATION_TABLE_PARITY_MASK 0x0000001000000000UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_BAU_REPLY_PAYLOAD_CORRUPTION_SHFT 37 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_BAU_REPLY_PAYLOAD_CORRUPTION_MASK 0x0000002000000000UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_IO_PORT_DESTINATION_TABLE_PARITY_SHFT 38 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_IO_PORT_DESTINATION_TABLE_PARITY_MASK 0x0000004000000000UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_INTD_SOFT_ACK_TIMEOUT_SHFT 39 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_INTD_SOFT_ACK_TIMEOUT_MASK 0x0000008000000000UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_INT_REP_OBESE_MSG_SHFT 40 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_INT_REP_OBESE_MSG_MASK 0x0000010000000000UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_INT_REP_COMMAND_ERR_SHFT 41 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_INT_REP_COMMAND_ERR_MASK 0x0000020000000000UL -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_INT_TIMEOUT_SHFT 42 -#define UVH_LB_MCAST_AOERR0_RPT_ENABLE_INT_TIMEOUT_MASK 0x0000040000000000UL - -union uvh_lb_mcast_aoerr0_rpt_enable_u { - unsigned long v; - struct uvh_lb_mcast_aoerr0_rpt_enable_s { - unsigned long mcast_obese_msg : 1; /* RW */ - unsigned long mcast_data_sb_err : 1; /* RW */ - unsigned long mcast_nack_buff_parity : 1; /* RW */ - unsigned long mcast_timeout : 1; /* RW */ - unsigned long mcast_inactive_reply : 1; /* RW */ - unsigned long mcast_upgrade_error : 1; /* RW */ - unsigned long mcast_reg_count_underflow : 1; /* RW */ - unsigned long mcast_rep_obese_msg : 1; /* RW */ - unsigned long ucache_req_runt_msg : 1; /* RW */ - unsigned long ucache_req_obese_msg : 1; /* RW */ - unsigned long ucache_req_data_sb_err : 1; /* RW */ - unsigned long ucache_rep_runt_msg : 1; /* RW */ - unsigned long ucache_rep_obese_msg : 1; /* RW */ - unsigned long ucache_rep_data_sb_err : 1; /* RW */ - unsigned long ucache_rep_command_err : 1; /* RW */ - unsigned long ucache_pend_timeout : 1; /* RW */ - unsigned long macc_req_runt_msg : 1; /* RW */ - unsigned long macc_req_obese_msg : 1; /* RW */ - unsigned long macc_req_data_sb_err : 1; /* RW */ - unsigned long macc_rep_runt_msg : 1; /* RW */ - unsigned long macc_rep_obese_msg : 1; /* RW */ - unsigned long macc_rep_data_sb_err : 1; /* RW */ - unsigned long macc_amo_timeout : 1; /* RW */ - unsigned long macc_put_timeout : 1; /* RW */ - unsigned long macc_spurious_event : 1; /* RW */ - unsigned long ioh_destination_table_parity : 1; /* RW */ - unsigned long get_had_error_reply : 1; /* RW */ - unsigned long get_timeout : 1; /* RW */ - unsigned long lock_manager_had_error_reply : 1; /* RW */ - unsigned long put_had_error_reply : 1; /* RW */ - unsigned long put_timeout : 1; /* RW */ - unsigned long sb_activation_overrun : 1; /* RW */ - unsigned long completed_gb_activation_had_error_reply : 1; /* RW */ - unsigned long completed_gb_activation_timeout : 1; /* RW */ - unsigned long descriptor_buffer_0_parity : 1; /* RW */ - unsigned long descriptor_buffer_1_parity : 1; /* RW */ - unsigned long socket_destination_table_parity : 1; /* RW */ - unsigned long bau_reply_payload_corruption : 1; /* RW */ - unsigned long io_port_destination_table_parity : 1; /* RW */ - unsigned long intd_soft_ack_timeout : 1; /* RW */ - unsigned long int_rep_obese_msg : 1; /* RW */ - unsigned long int_rep_command_err : 1; /* RW */ - unsigned long int_timeout : 1; /* RW */ - unsigned long rsvd_43_63 : 21; /* */ - } s; -}; - -/* ========================================================================= */ -/* UVH_LOCAL_INT0_CONFIG */ -/* ========================================================================= */ -#define UVH_LOCAL_INT0_CONFIG 0x61000UL - -#define UVH_LOCAL_INT0_CONFIG_VECTOR_SHFT 0 -#define UVH_LOCAL_INT0_CONFIG_VECTOR_MASK 0x00000000000000ffUL -#define UVH_LOCAL_INT0_CONFIG_DM_SHFT 8 -#define UVH_LOCAL_INT0_CONFIG_DM_MASK 0x0000000000000700UL -#define UVH_LOCAL_INT0_CONFIG_DESTMODE_SHFT 11 -#define UVH_LOCAL_INT0_CONFIG_DESTMODE_MASK 0x0000000000000800UL -#define UVH_LOCAL_INT0_CONFIG_STATUS_SHFT 12 -#define UVH_LOCAL_INT0_CONFIG_STATUS_MASK 0x0000000000001000UL -#define UVH_LOCAL_INT0_CONFIG_P_SHFT 13 -#define UVH_LOCAL_INT0_CONFIG_P_MASK 0x0000000000002000UL -#define UVH_LOCAL_INT0_CONFIG_T_SHFT 15 -#define UVH_LOCAL_INT0_CONFIG_T_MASK 0x0000000000008000UL -#define UVH_LOCAL_INT0_CONFIG_M_SHFT 16 -#define UVH_LOCAL_INT0_CONFIG_M_MASK 0x0000000000010000UL -#define UVH_LOCAL_INT0_CONFIG_APIC_ID_SHFT 32 -#define UVH_LOCAL_INT0_CONFIG_APIC_ID_MASK 0xffffffff00000000UL - -union uvh_local_int0_config_u { - unsigned long v; - struct uvh_local_int0_config_s { - unsigned long vector_ : 8; /* RW */ - unsigned long dm : 3; /* RW */ - unsigned long destmode : 1; /* RW */ - unsigned long status : 1; /* RO */ - unsigned long p : 1; /* RO */ - unsigned long rsvd_14 : 1; /* */ - unsigned long t : 1; /* RO */ - unsigned long m : 1; /* RW */ - unsigned long rsvd_17_31: 15; /* */ - unsigned long apic_id : 32; /* RW */ - } s; -}; - -/* ========================================================================= */ -/* UVH_LOCAL_INT0_ENABLE */ -/* ========================================================================= */ -#define UVH_LOCAL_INT0_ENABLE 0x65000UL - -#define UVH_LOCAL_INT0_ENABLE_LB_HCERR_SHFT 0 -#define UVH_LOCAL_INT0_ENABLE_LB_HCERR_MASK 0x0000000000000001UL -#define UVH_LOCAL_INT0_ENABLE_GR0_HCERR_SHFT 1 -#define UVH_LOCAL_INT0_ENABLE_GR0_HCERR_MASK 0x0000000000000002UL -#define UVH_LOCAL_INT0_ENABLE_GR1_HCERR_SHFT 2 -#define UVH_LOCAL_INT0_ENABLE_GR1_HCERR_MASK 0x0000000000000004UL -#define UVH_LOCAL_INT0_ENABLE_LH_HCERR_SHFT 3 -#define UVH_LOCAL_INT0_ENABLE_LH_HCERR_MASK 0x0000000000000008UL -#define UVH_LOCAL_INT0_ENABLE_RH_HCERR_SHFT 4 -#define UVH_LOCAL_INT0_ENABLE_RH_HCERR_MASK 0x0000000000000010UL -#define UVH_LOCAL_INT0_ENABLE_XN_HCERR_SHFT 5 -#define UVH_LOCAL_INT0_ENABLE_XN_HCERR_MASK 0x0000000000000020UL -#define UVH_LOCAL_INT0_ENABLE_SI_HCERR_SHFT 6 -#define UVH_LOCAL_INT0_ENABLE_SI_HCERR_MASK 0x0000000000000040UL -#define UVH_LOCAL_INT0_ENABLE_LB_AOERR0_SHFT 7 -#define UVH_LOCAL_INT0_ENABLE_LB_AOERR0_MASK 0x0000000000000080UL -#define UVH_LOCAL_INT0_ENABLE_GR0_AOERR0_SHFT 8 -#define UVH_LOCAL_INT0_ENABLE_GR0_AOERR0_MASK 0x0000000000000100UL -#define UVH_LOCAL_INT0_ENABLE_GR1_AOERR0_SHFT 9 -#define UVH_LOCAL_INT0_ENABLE_GR1_AOERR0_MASK 0x0000000000000200UL -#define UVH_LOCAL_INT0_ENABLE_LH_AOERR0_SHFT 10 -#define UVH_LOCAL_INT0_ENABLE_LH_AOERR0_MASK 0x0000000000000400UL -#define UVH_LOCAL_INT0_ENABLE_RH_AOERR0_SHFT 11 -#define UVH_LOCAL_INT0_ENABLE_RH_AOERR0_MASK 0x0000000000000800UL -#define UVH_LOCAL_INT0_ENABLE_XN_AOERR0_SHFT 12 -#define UVH_LOCAL_INT0_ENABLE_XN_AOERR0_MASK 0x0000000000001000UL -#define UVH_LOCAL_INT0_ENABLE_SI_AOERR0_SHFT 13 -#define UVH_LOCAL_INT0_ENABLE_SI_AOERR0_MASK 0x0000000000002000UL -#define UVH_LOCAL_INT0_ENABLE_LB_AOERR1_SHFT 14 -#define UVH_LOCAL_INT0_ENABLE_LB_AOERR1_MASK 0x0000000000004000UL -#define UVH_LOCAL_INT0_ENABLE_GR0_AOERR1_SHFT 15 -#define UVH_LOCAL_INT0_ENABLE_GR0_AOERR1_MASK 0x0000000000008000UL -#define UVH_LOCAL_INT0_ENABLE_GR1_AOERR1_SHFT 16 -#define UVH_LOCAL_INT0_ENABLE_GR1_AOERR1_MASK 0x0000000000010000UL -#define UVH_LOCAL_INT0_ENABLE_LH_AOERR1_SHFT 17 -#define UVH_LOCAL_INT0_ENABLE_LH_AOERR1_MASK 0x0000000000020000UL -#define UVH_LOCAL_INT0_ENABLE_RH_AOERR1_SHFT 18 -#define UVH_LOCAL_INT0_ENABLE_RH_AOERR1_MASK 0x0000000000040000UL -#define UVH_LOCAL_INT0_ENABLE_XN_AOERR1_SHFT 19 -#define UVH_LOCAL_INT0_ENABLE_XN_AOERR1_MASK 0x0000000000080000UL -#define UVH_LOCAL_INT0_ENABLE_SI_AOERR1_SHFT 20 -#define UVH_LOCAL_INT0_ENABLE_SI_AOERR1_MASK 0x0000000000100000UL -#define UVH_LOCAL_INT0_ENABLE_RH_VPI_INT_SHFT 21 -#define UVH_LOCAL_INT0_ENABLE_RH_VPI_INT_MASK 0x0000000000200000UL -#define UVH_LOCAL_INT0_ENABLE_SYSTEM_SHUTDOWN_INT_SHFT 22 -#define UVH_LOCAL_INT0_ENABLE_SYSTEM_SHUTDOWN_INT_MASK 0x0000000000400000UL -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_0_SHFT 23 -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_0_MASK 0x0000000000800000UL -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_1_SHFT 24 -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_1_MASK 0x0000000001000000UL -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_2_SHFT 25 -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_2_MASK 0x0000000002000000UL -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_3_SHFT 26 -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_3_MASK 0x0000000004000000UL -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_4_SHFT 27 -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_4_MASK 0x0000000008000000UL -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_5_SHFT 28 -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_5_MASK 0x0000000010000000UL -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_6_SHFT 29 -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_6_MASK 0x0000000020000000UL -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_7_SHFT 30 -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_7_MASK 0x0000000040000000UL -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_8_SHFT 31 -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_8_MASK 0x0000000080000000UL -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_9_SHFT 32 -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_9_MASK 0x0000000100000000UL -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_10_SHFT 33 -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_10_MASK 0x0000000200000000UL -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_11_SHFT 34 -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_11_MASK 0x0000000400000000UL -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_12_SHFT 35 -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_12_MASK 0x0000000800000000UL -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_13_SHFT 36 -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_13_MASK 0x0000001000000000UL -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_14_SHFT 37 -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_14_MASK 0x0000002000000000UL -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_15_SHFT 38 -#define UVH_LOCAL_INT0_ENABLE_LB_IRQ_INT_15_MASK 0x0000004000000000UL -#define UVH_LOCAL_INT0_ENABLE_L1_NMI_INT_SHFT 39 -#define UVH_LOCAL_INT0_ENABLE_L1_NMI_INT_MASK 0x0000008000000000UL -#define UVH_LOCAL_INT0_ENABLE_STOP_CLOCK_SHFT 40 -#define UVH_LOCAL_INT0_ENABLE_STOP_CLOCK_MASK 0x0000010000000000UL -#define UVH_LOCAL_INT0_ENABLE_ASIC_TO_L1_SHFT 41 -#define UVH_LOCAL_INT0_ENABLE_ASIC_TO_L1_MASK 0x0000020000000000UL -#define UVH_LOCAL_INT0_ENABLE_L1_TO_ASIC_SHFT 42 -#define UVH_LOCAL_INT0_ENABLE_L1_TO_ASIC_MASK 0x0000040000000000UL -#define UVH_LOCAL_INT0_ENABLE_LTC_INT_SHFT 43 -#define UVH_LOCAL_INT0_ENABLE_LTC_INT_MASK 0x0000080000000000UL -#define UVH_LOCAL_INT0_ENABLE_LA_SEQ_TRIGGER_SHFT 44 -#define UVH_LOCAL_INT0_ENABLE_LA_SEQ_TRIGGER_MASK 0x0000100000000000UL - -union uvh_local_int0_enable_u { - unsigned long v; - struct uvh_local_int0_enable_s { - unsigned long lb_hcerr : 1; /* RW */ - unsigned long gr0_hcerr : 1; /* RW */ - unsigned long gr1_hcerr : 1; /* RW */ - unsigned long lh_hcerr : 1; /* RW */ - unsigned long rh_hcerr : 1; /* RW */ - unsigned long xn_hcerr : 1; /* RW */ - unsigned long si_hcerr : 1; /* RW */ - unsigned long lb_aoerr0 : 1; /* RW */ - unsigned long gr0_aoerr0 : 1; /* RW */ - unsigned long gr1_aoerr0 : 1; /* RW */ - unsigned long lh_aoerr0 : 1; /* RW */ - unsigned long rh_aoerr0 : 1; /* RW */ - unsigned long xn_aoerr0 : 1; /* RW */ - unsigned long si_aoerr0 : 1; /* RW */ - unsigned long lb_aoerr1 : 1; /* RW */ - unsigned long gr0_aoerr1 : 1; /* RW */ - unsigned long gr1_aoerr1 : 1; /* RW */ - unsigned long lh_aoerr1 : 1; /* RW */ - unsigned long rh_aoerr1 : 1; /* RW */ - unsigned long xn_aoerr1 : 1; /* RW */ - unsigned long si_aoerr1 : 1; /* RW */ - unsigned long rh_vpi_int : 1; /* RW */ - unsigned long system_shutdown_int : 1; /* RW */ - unsigned long lb_irq_int_0 : 1; /* RW */ - unsigned long lb_irq_int_1 : 1; /* RW */ - unsigned long lb_irq_int_2 : 1; /* RW */ - unsigned long lb_irq_int_3 : 1; /* RW */ - unsigned long lb_irq_int_4 : 1; /* RW */ - unsigned long lb_irq_int_5 : 1; /* RW */ - unsigned long lb_irq_int_6 : 1; /* RW */ - unsigned long lb_irq_int_7 : 1; /* RW */ - unsigned long lb_irq_int_8 : 1; /* RW */ - unsigned long lb_irq_int_9 : 1; /* RW */ - unsigned long lb_irq_int_10 : 1; /* RW */ - unsigned long lb_irq_int_11 : 1; /* RW */ - unsigned long lb_irq_int_12 : 1; /* RW */ - unsigned long lb_irq_int_13 : 1; /* RW */ - unsigned long lb_irq_int_14 : 1; /* RW */ - unsigned long lb_irq_int_15 : 1; /* RW */ - unsigned long l1_nmi_int : 1; /* RW */ - unsigned long stop_clock : 1; /* RW */ - unsigned long asic_to_l1 : 1; /* RW */ - unsigned long l1_to_asic : 1; /* RW */ - unsigned long ltc_int : 1; /* RW */ - unsigned long la_seq_trigger : 1; /* RW */ - unsigned long rsvd_45_63 : 19; /* */ - } s; -}; - -/* ========================================================================= */ /* UVH_NODE_ID */ /* ========================================================================= */ #define UVH_NODE_ID 0x0UL @@ -1112,26 +857,6 @@ union uvh_rh_gam_alias210_redirect_config_2_mmr_u { }; /* ========================================================================= */ -/* UVH_RH_GAM_CFG_OVERLAY_CONFIG_MMR */ -/* ========================================================================= */ -#define UVH_RH_GAM_CFG_OVERLAY_CONFIG_MMR 0x1600020UL - -#define UVH_RH_GAM_CFG_OVERLAY_CONFIG_MMR_BASE_SHFT 26 -#define UVH_RH_GAM_CFG_OVERLAY_CONFIG_MMR_BASE_MASK 0x00003ffffc000000UL -#define UVH_RH_GAM_CFG_OVERLAY_CONFIG_MMR_ENABLE_SHFT 63 -#define UVH_RH_GAM_CFG_OVERLAY_CONFIG_MMR_ENABLE_MASK 0x8000000000000000UL - -union uvh_rh_gam_cfg_overlay_config_mmr_u { - unsigned long v; - struct uvh_rh_gam_cfg_overlay_config_mmr_s { - unsigned long rsvd_0_25: 26; /* */ - unsigned long base : 20; /* RW */ - unsigned long rsvd_46_62: 17; /* */ - unsigned long enable : 1; /* RW */ - } s; -}; - -/* ========================================================================= */ /* UVH_RH_GAM_GRU_OVERLAY_CONFIG_MMR */ /* ========================================================================= */ #define UVH_RH_GAM_GRU_OVERLAY_CONFIG_MMR 0x1600010UL @@ -1263,101 +988,6 @@ union uvh_rtc1_int_config_u { }; /* ========================================================================= */ -/* UVH_RTC2_INT_CONFIG */ -/* ========================================================================= */ -#define UVH_RTC2_INT_CONFIG 0x61600UL - -#define UVH_RTC2_INT_CONFIG_VECTOR_SHFT 0 -#define UVH_RTC2_INT_CONFIG_VECTOR_MASK 0x00000000000000ffUL -#define UVH_RTC2_INT_CONFIG_DM_SHFT 8 -#define UVH_RTC2_INT_CONFIG_DM_MASK 0x0000000000000700UL -#define UVH_RTC2_INT_CONFIG_DESTMODE_SHFT 11 -#define UVH_RTC2_INT_CONFIG_DESTMODE_MASK 0x0000000000000800UL -#define UVH_RTC2_INT_CONFIG_STATUS_SHFT 12 -#define UVH_RTC2_INT_CONFIG_STATUS_MASK 0x0000000000001000UL -#define UVH_RTC2_INT_CONFIG_P_SHFT 13 -#define UVH_RTC2_INT_CONFIG_P_MASK 0x0000000000002000UL -#define UVH_RTC2_INT_CONFIG_T_SHFT 15 -#define UVH_RTC2_INT_CONFIG_T_MASK 0x0000000000008000UL -#define UVH_RTC2_INT_CONFIG_M_SHFT 16 -#define UVH_RTC2_INT_CONFIG_M_MASK 0x0000000000010000UL -#define UVH_RTC2_INT_CONFIG_APIC_ID_SHFT 32 -#define UVH_RTC2_INT_CONFIG_APIC_ID_MASK 0xffffffff00000000UL - -union uvh_rtc2_int_config_u { - unsigned long v; - struct uvh_rtc2_int_config_s { - unsigned long vector_ : 8; /* RW */ - unsigned long dm : 3; /* RW */ - unsigned long destmode : 1; /* RW */ - unsigned long status : 1; /* RO */ - unsigned long p : 1; /* RO */ - unsigned long rsvd_14 : 1; /* */ - unsigned long t : 1; /* RO */ - unsigned long m : 1; /* RW */ - unsigned long rsvd_17_31: 15; /* */ - unsigned long apic_id : 32; /* RW */ - } s; -}; - -/* ========================================================================= */ -/* UVH_RTC3_INT_CONFIG */ -/* ========================================================================= */ -#define UVH_RTC3_INT_CONFIG 0x61640UL - -#define UVH_RTC3_INT_CONFIG_VECTOR_SHFT 0 -#define UVH_RTC3_INT_CONFIG_VECTOR_MASK 0x00000000000000ffUL -#define UVH_RTC3_INT_CONFIG_DM_SHFT 8 -#define UVH_RTC3_INT_CONFIG_DM_MASK 0x0000000000000700UL -#define UVH_RTC3_INT_CONFIG_DESTMODE_SHFT 11 -#define UVH_RTC3_INT_CONFIG_DESTMODE_MASK 0x0000000000000800UL -#define UVH_RTC3_INT_CONFIG_STATUS_SHFT 12 -#define UVH_RTC3_INT_CONFIG_STATUS_MASK 0x0000000000001000UL -#define UVH_RTC3_INT_CONFIG_P_SHFT 13 -#define UVH_RTC3_INT_CONFIG_P_MASK 0x0000000000002000UL -#define UVH_RTC3_INT_CONFIG_T_SHFT 15 -#define UVH_RTC3_INT_CONFIG_T_MASK 0x0000000000008000UL -#define UVH_RTC3_INT_CONFIG_M_SHFT 16 -#define UVH_RTC3_INT_CONFIG_M_MASK 0x0000000000010000UL -#define UVH_RTC3_INT_CONFIG_APIC_ID_SHFT 32 -#define UVH_RTC3_INT_CONFIG_APIC_ID_MASK 0xffffffff00000000UL - -union uvh_rtc3_int_config_u { - unsigned long v; - struct uvh_rtc3_int_config_s { - unsigned long vector_ : 8; /* RW */ - unsigned long dm : 3; /* RW */ - unsigned long destmode : 1; /* RW */ - unsigned long status : 1; /* RO */ - unsigned long p : 1; /* RO */ - unsigned long rsvd_14 : 1; /* */ - unsigned long t : 1; /* RO */ - unsigned long m : 1; /* RW */ - unsigned long rsvd_17_31: 15; /* */ - unsigned long apic_id : 32; /* RW */ - } s; -}; - -/* ========================================================================= */ -/* UVH_RTC_INC_RATIO */ -/* ========================================================================= */ -#define UVH_RTC_INC_RATIO 0x350000UL - -#define UVH_RTC_INC_RATIO_FRACTION_SHFT 0 -#define UVH_RTC_INC_RATIO_FRACTION_MASK 0x00000000000fffffUL -#define UVH_RTC_INC_RATIO_RATIO_SHFT 20 -#define UVH_RTC_INC_RATIO_RATIO_MASK 0x0000000000700000UL - -union uvh_rtc_inc_ratio_u { - unsigned long v; - struct uvh_rtc_inc_ratio_s { - unsigned long fraction : 20; /* RW */ - unsigned long ratio : 3; /* RW */ - unsigned long rsvd_23_63: 41; /* */ - } s; -}; - -/* ========================================================================= */ /* UVH_SI_ADDR_MAP_CONFIG */ /* ========================================================================= */ #define UVH_SI_ADDR_MAP_CONFIG 0xc80000UL diff --git a/arch/x86/include/asm/vmware.h b/arch/x86/include/asm/vmware.h deleted file mode 100644 index e49ed6d..0000000 --- a/arch/x86/include/asm/vmware.h +++ /dev/null @@ -1,27 +0,0 @@ -/* - * Copyright (C) 2008, VMware, Inc. - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; either version 2 of the License, or - * (at your option) any later version. - * - * This program is distributed in the hope that it will be useful, but - * WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or - * NON INFRINGEMENT. See the GNU General Public License for more - * details. - * - * You should have received a copy of the GNU General Public License - * along with this program; if not, write to the Free Software - * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. - * - */ -#ifndef ASM_X86__VMWARE_H -#define ASM_X86__VMWARE_H - -extern void vmware_platform_setup(void); -extern int vmware_platform(void); -extern void vmware_set_feature_bits(struct cpuinfo_x86 *c); - -#endif diff --git a/arch/x86/include/asm/xsave.h b/arch/x86/include/asm/xsave.h index ddc04cc..2c4390c 100644 --- a/arch/x86/include/asm/xsave.h +++ b/arch/x86/include/asm/xsave.h @@ -37,8 +37,9 @@ extern int check_for_xstate(struct i387_fxsave_struct __user *buf, void __user *fpstate, struct _fpx_sw_bytes *sw); -static inline int xrstor_checking(struct xsave_struct *fx) +static inline int fpu_xrstor_checking(struct fpu *fpu) { + struct xsave_struct *fx = &fpu->state->xsave; int err; asm volatile("1: .byte " REX_PREFIX "0x0f,0xae,0x2f\n\t" @@ -110,12 +111,12 @@ static inline void xrstor_state(struct xsave_struct *fx, u64 mask) : "memory"); } -static inline void xsave(struct task_struct *tsk) +static inline void fpu_xsave(struct fpu *fpu) { /* This, however, we can work around by forcing the compiler to select an addressing mode that doesn't require extended registers. */ __asm__ __volatile__(".byte " REX_PREFIX "0x0f,0xae,0x27" - : : "D" (&(tsk->thread.xstate->xsave)), + : : "D" (&(fpu->state->xsave)), "a" (-1), "d"(-1) : "memory"); } #endif diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c index cd40aba..9a5ed58 100644 --- a/arch/x86/kernel/acpi/boot.c +++ b/arch/x86/kernel/acpi/boot.c @@ -94,6 +94,53 @@ enum acpi_irq_model_id acpi_irq_model = ACPI_IRQ_MODEL_PIC; /* + * ISA irqs by default are the first 16 gsis but can be + * any gsi as specified by an interrupt source override. + */ +static u32 isa_irq_to_gsi[NR_IRQS_LEGACY] __read_mostly = { + 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 +}; + +static unsigned int gsi_to_irq(unsigned int gsi) +{ + unsigned int irq = gsi + NR_IRQS_LEGACY; + unsigned int i; + + for (i = 0; i < NR_IRQS_LEGACY; i++) { + if (isa_irq_to_gsi[i] == gsi) { + return i; + } + } + + /* Provide an identity mapping of gsi == irq + * except on truly weird platforms that have + * non isa irqs in the first 16 gsis. + */ + if (gsi >= NR_IRQS_LEGACY) + irq = gsi; + else + irq = gsi_end + 1 + gsi; + + return irq; +} + +static u32 irq_to_gsi(int irq) +{ + unsigned int gsi; + + if (irq < NR_IRQS_LEGACY) + gsi = isa_irq_to_gsi[irq]; + else if (irq <= gsi_end) + gsi = irq; + else if (irq <= (gsi_end + NR_IRQS_LEGACY)) + gsi = irq - gsi_end; + else + gsi = 0xffffffff; + + return gsi; +} + +/* * Temporarily use the virtual area starting from FIX_IO_APIC_BASE_END, * to map the target physical address. The problem is that set_fixmap() * provides a single page, and it is possible that the page is not @@ -313,7 +360,7 @@ acpi_parse_ioapic(struct acpi_subtable_header * header, const unsigned long end) /* * Parse Interrupt Source Override for the ACPI SCI */ -static void __init acpi_sci_ioapic_setup(u32 gsi, u16 polarity, u16 trigger) +static void __init acpi_sci_ioapic_setup(u8 bus_irq, u16 polarity, u16 trigger, u32 gsi) { if (trigger == 0) /* compatible SCI trigger is level */ trigger = 3; @@ -333,7 +380,7 @@ static void __init acpi_sci_ioapic_setup(u32 gsi, u16 polarity, u16 trigger) * If GSI is < 16, this will update its flags, * else it will create a new mp_irqs[] entry. */ - mp_override_legacy_irq(gsi, polarity, trigger, gsi); + mp_override_legacy_irq(bus_irq, polarity, trigger, gsi); /* * stash over-ride to indicate we've been here @@ -357,9 +404,10 @@ acpi_parse_int_src_ovr(struct acpi_subtable_header * header, acpi_table_print_madt_entry(header); if (intsrc->source_irq == acpi_gbl_FADT.sci_interrupt) { - acpi_sci_ioapic_setup(intsrc->global_irq, + acpi_sci_ioapic_setup(intsrc->source_irq, intsrc->inti_flags & ACPI_MADT_POLARITY_MASK, - (intsrc->inti_flags & ACPI_MADT_TRIGGER_MASK) >> 2); + (intsrc->inti_flags & ACPI_MADT_TRIGGER_MASK) >> 2, + intsrc->global_irq); return 0; } @@ -448,7 +496,7 @@ void __init acpi_pic_sci_set_trigger(unsigned int irq, u16 trigger) int acpi_gsi_to_irq(u32 gsi, unsigned int *irq) { - *irq = gsi; + *irq = gsi_to_irq(gsi); #ifdef CONFIG_X86_IO_APIC if (acpi_irq_model == ACPI_IRQ_MODEL_IOAPIC) @@ -458,6 +506,14 @@ int acpi_gsi_to_irq(u32 gsi, unsigned int *irq) return 0; } +int acpi_isa_irq_to_gsi(unsigned isa_irq, u32 *gsi) +{ + if (isa_irq >= 16) + return -1; + *gsi = irq_to_gsi(isa_irq); + return 0; +} + /* * success: return IRQ number (>=0) * failure: return < 0 @@ -482,7 +538,7 @@ int acpi_register_gsi(struct device *dev, u32 gsi, int trigger, int polarity) plat_gsi = mp_register_gsi(dev, gsi, trigger, polarity); } #endif - irq = plat_gsi; + irq = gsi_to_irq(plat_gsi); return irq; } @@ -867,29 +923,6 @@ static int __init acpi_parse_madt_lapic_entries(void) extern int es7000_plat; #endif -int __init acpi_probe_gsi(void) -{ - int idx; - int gsi; - int max_gsi = 0; - - if (acpi_disabled) - return 0; - - if (!acpi_ioapic) - return 0; - - max_gsi = 0; - for (idx = 0; idx < nr_ioapics; idx++) { - gsi = mp_gsi_routing[idx].gsi_end; - - if (gsi > max_gsi) - max_gsi = gsi; - } - - return max_gsi + 1; -} - static void assign_to_mp_irq(struct mpc_intsrc *m, struct mpc_intsrc *mp_irq) { @@ -947,13 +980,13 @@ void __init mp_override_legacy_irq(u8 bus_irq, u8 polarity, u8 trigger, u32 gsi) mp_irq.dstirq = pin; /* INTIN# */ save_mp_irq(&mp_irq); + + isa_irq_to_gsi[bus_irq] = gsi; } void __init mp_config_acpi_legacy_irqs(void) { int i; - int ioapic; - unsigned int dstapic; struct mpc_intsrc mp_irq; #if defined (CONFIG_MCA) || defined (CONFIG_EISA) @@ -974,19 +1007,27 @@ void __init mp_config_acpi_legacy_irqs(void) #endif /* - * Locate the IOAPIC that manages the ISA IRQs (0-15). - */ - ioapic = mp_find_ioapic(0); - if (ioapic < 0) - return; - dstapic = mp_ioapics[ioapic].apicid; - - /* * Use the default configuration for the IRQs 0-15. Unless * overridden by (MADT) interrupt source override entries. */ for (i = 0; i < 16; i++) { + int ioapic, pin; + unsigned int dstapic; int idx; + u32 gsi; + + /* Locate the gsi that irq i maps to. */ + if (acpi_isa_irq_to_gsi(i, &gsi)) + continue; + + /* + * Locate the IOAPIC that manages the ISA IRQ. + */ + ioapic = mp_find_ioapic(gsi); + if (ioapic < 0) + continue; + pin = mp_find_ioapic_pin(ioapic, gsi); + dstapic = mp_ioapics[ioapic].apicid; for (idx = 0; idx < mp_irq_entries; idx++) { struct mpc_intsrc *irq = mp_irqs + idx; @@ -996,7 +1037,7 @@ void __init mp_config_acpi_legacy_irqs(void) break; /* Do we already have a mapping for this IOAPIC pin */ - if (irq->dstapic == dstapic && irq->dstirq == i) + if (irq->dstapic == dstapic && irq->dstirq == pin) break; } @@ -1011,7 +1052,7 @@ void __init mp_config_acpi_legacy_irqs(void) mp_irq.dstapic = dstapic; mp_irq.irqtype = mp_INT; mp_irq.srcbusirq = i; /* Identity mapped */ - mp_irq.dstirq = i; + mp_irq.dstirq = pin; save_mp_irq(&mp_irq); } @@ -1076,11 +1117,6 @@ int mp_register_gsi(struct device *dev, u32 gsi, int trigger, int polarity) ioapic_pin = mp_find_ioapic_pin(ioapic, gsi); -#ifdef CONFIG_X86_32 - if (ioapic_renumber_irq) - gsi = ioapic_renumber_irq(ioapic, gsi); -#endif - if (ioapic_pin > MP_MAX_IOAPIC_PIN) { printk(KERN_ERR "Invalid reference to IOAPIC pin " "%d-%d\n", mp_ioapics[ioapic].apicid, @@ -1094,7 +1130,7 @@ int mp_register_gsi(struct device *dev, u32 gsi, int trigger, int polarity) set_io_apic_irq_attr(&irq_attr, ioapic, ioapic_pin, trigger == ACPI_EDGE_SENSITIVE ? 0 : 1, polarity == ACPI_ACTIVE_HIGH ? 0 : 1); - io_apic_set_pci_routing(dev, gsi, &irq_attr); + io_apic_set_pci_routing(dev, gsi_to_irq(gsi), &irq_attr); return gsi; } @@ -1154,7 +1190,8 @@ static int __init acpi_parse_madt_ioapic_entries(void) * pretend we got one so we can set the SCI flags. */ if (!acpi_sci_override_gsi) - acpi_sci_ioapic_setup(acpi_gbl_FADT.sci_interrupt, 0, 0); + acpi_sci_ioapic_setup(acpi_gbl_FADT.sci_interrupt, 0, 0, + acpi_gbl_FADT.sci_interrupt); /* Fill in identity legacy mappings where no override */ mp_config_acpi_legacy_irqs(); diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c index 1a160d5..7023773 100644 --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -194,7 +194,7 @@ static void __init_or_module add_nops(void *insns, unsigned int len) } extern struct alt_instr __alt_instructions[], __alt_instructions_end[]; -extern u8 *__smp_locks[], *__smp_locks_end[]; +extern s32 __smp_locks[], __smp_locks_end[]; static void *text_poke_early(void *addr, const void *opcode, size_t len); /* Replace instructions with better alternatives for this CPU type. @@ -235,37 +235,41 @@ void __init_or_module apply_alternatives(struct alt_instr *start, #ifdef CONFIG_SMP -static void alternatives_smp_lock(u8 **start, u8 **end, u8 *text, u8 *text_end) +static void alternatives_smp_lock(const s32 *start, const s32 *end, + u8 *text, u8 *text_end) { - u8 **ptr; + const s32 *poff; mutex_lock(&text_mutex); - for (ptr = start; ptr < end; ptr++) { - if (*ptr < text) - continue; - if (*ptr > text_end) + for (poff = start; poff < end; poff++) { + u8 *ptr = (u8 *)poff + *poff; + + if (!*poff || ptr < text || ptr >= text_end) continue; /* turn DS segment override prefix into lock prefix */ - text_poke(*ptr, ((unsigned char []){0xf0}), 1); + if (*ptr == 0x3e) + text_poke(ptr, ((unsigned char []){0xf0}), 1); }; mutex_unlock(&text_mutex); } -static void alternatives_smp_unlock(u8 **start, u8 **end, u8 *text, u8 *text_end) +static void alternatives_smp_unlock(const s32 *start, const s32 *end, + u8 *text, u8 *text_end) { - u8 **ptr; + const s32 *poff; if (noreplace_smp) return; mutex_lock(&text_mutex); - for (ptr = start; ptr < end; ptr++) { - if (*ptr < text) - continue; - if (*ptr > text_end) + for (poff = start; poff < end; poff++) { + u8 *ptr = (u8 *)poff + *poff; + + if (!*poff || ptr < text || ptr >= text_end) continue; /* turn lock prefix into DS segment override prefix */ - text_poke(*ptr, ((unsigned char []){0x3E}), 1); + if (*ptr == 0xf0) + text_poke(ptr, ((unsigned char []){0x3E}), 1); }; mutex_unlock(&text_mutex); } @@ -276,8 +280,8 @@ struct smp_alt_module { char *name; /* ptrs to lock prefixes */ - u8 **locks; - u8 **locks_end; + const s32 *locks; + const s32 *locks_end; /* .text segment, needed to avoid patching init code ;) */ u8 *text; @@ -398,16 +402,19 @@ void alternatives_smp_switch(int smp) int alternatives_text_reserved(void *start, void *end) { struct smp_alt_module *mod; - u8 **ptr; + const s32 *poff; u8 *text_start = start; u8 *text_end = end; list_for_each_entry(mod, &smp_alt_modules, next) { if (mod->text > text_end || mod->text_end < text_start) continue; - for (ptr = mod->locks; ptr < mod->locks_end; ptr++) - if (text_start <= *ptr && text_end >= *ptr) + for (poff = mod->locks; poff < mod->locks_end; poff++) { + const u8 *ptr = (const u8 *)poff + *poff; + + if (text_start <= ptr && text_end > ptr) return 1; + } } return 0; diff --git a/arch/x86/kernel/amd_iommu.c b/arch/x86/kernel/amd_iommu.c index f854d89b..fa5a1474 100644 --- a/arch/x86/kernel/amd_iommu.c +++ b/arch/x86/kernel/amd_iommu.c @@ -731,18 +731,22 @@ static bool increase_address_space(struct protection_domain *domain, static u64 *alloc_pte(struct protection_domain *domain, unsigned long address, - int end_lvl, + unsigned long page_size, u64 **pte_page, gfp_t gfp) { + int level, end_lvl; u64 *pte, *page; - int level; + + BUG_ON(!is_power_of_2(page_size)); while (address > PM_LEVEL_SIZE(domain->mode)) increase_address_space(domain, gfp); - level = domain->mode - 1; - pte = &domain->pt_root[PM_LEVEL_INDEX(level, address)]; + level = domain->mode - 1; + pte = &domain->pt_root[PM_LEVEL_INDEX(level, address)]; + address = PAGE_SIZE_ALIGN(address, page_size); + end_lvl = PAGE_SIZE_LEVEL(page_size); while (level > end_lvl) { if (!IOMMU_PTE_PRESENT(*pte)) { @@ -752,6 +756,10 @@ static u64 *alloc_pte(struct protection_domain *domain, *pte = PM_LEVEL_PDE(level, virt_to_phys(page)); } + /* No level skipping support yet */ + if (PM_PTE_LEVEL(*pte) != level) + return NULL; + level -= 1; pte = IOMMU_PTE_PAGE(*pte); @@ -769,28 +777,47 @@ static u64 *alloc_pte(struct protection_domain *domain, * This function checks if there is a PTE for a given dma address. If * there is one, it returns the pointer to it. */ -static u64 *fetch_pte(struct protection_domain *domain, - unsigned long address, int map_size) +static u64 *fetch_pte(struct protection_domain *domain, unsigned long address) { int level; u64 *pte; - level = domain->mode - 1; - pte = &domain->pt_root[PM_LEVEL_INDEX(level, address)]; + if (address > PM_LEVEL_SIZE(domain->mode)) + return NULL; + + level = domain->mode - 1; + pte = &domain->pt_root[PM_LEVEL_INDEX(level, address)]; - while (level > map_size) { + while (level > 0) { + + /* Not Present */ if (!IOMMU_PTE_PRESENT(*pte)) return NULL; + /* Large PTE */ + if (PM_PTE_LEVEL(*pte) == 0x07) { + unsigned long pte_mask, __pte; + + /* + * If we have a series of large PTEs, make + * sure to return a pointer to the first one. + */ + pte_mask = PTE_PAGE_SIZE(*pte); + pte_mask = ~((PAGE_SIZE_PTE_COUNT(pte_mask) << 3) - 1); + __pte = ((unsigned long)pte) & pte_mask; + + return (u64 *)__pte; + } + + /* No level skipping support yet */ + if (PM_PTE_LEVEL(*pte) != level) + return NULL; + level -= 1; + /* Walk to the next level */ pte = IOMMU_PTE_PAGE(*pte); pte = &pte[PM_LEVEL_INDEX(level, address)]; - - if ((PM_PTE_LEVEL(*pte) == 0) && level != map_size) { - pte = NULL; - break; - } } return pte; @@ -807,44 +834,84 @@ static int iommu_map_page(struct protection_domain *dom, unsigned long bus_addr, unsigned long phys_addr, int prot, - int map_size) + unsigned long page_size) { u64 __pte, *pte; - - bus_addr = PAGE_ALIGN(bus_addr); - phys_addr = PAGE_ALIGN(phys_addr); - - BUG_ON(!PM_ALIGNED(map_size, bus_addr)); - BUG_ON(!PM_ALIGNED(map_size, phys_addr)); + int i, count; if (!(prot & IOMMU_PROT_MASK)) return -EINVAL; - pte = alloc_pte(dom, bus_addr, map_size, NULL, GFP_KERNEL); + bus_addr = PAGE_ALIGN(bus_addr); + phys_addr = PAGE_ALIGN(phys_addr); + count = PAGE_SIZE_PTE_COUNT(page_size); + pte = alloc_pte(dom, bus_addr, page_size, NULL, GFP_KERNEL); + + for (i = 0; i < count; ++i) + if (IOMMU_PTE_PRESENT(pte[i])) + return -EBUSY; - if (IOMMU_PTE_PRESENT(*pte)) - return -EBUSY; + if (page_size > PAGE_SIZE) { + __pte = PAGE_SIZE_PTE(phys_addr, page_size); + __pte |= PM_LEVEL_ENC(7) | IOMMU_PTE_P | IOMMU_PTE_FC; + } else + __pte = phys_addr | IOMMU_PTE_P | IOMMU_PTE_FC; - __pte = phys_addr | IOMMU_PTE_P; if (prot & IOMMU_PROT_IR) __pte |= IOMMU_PTE_IR; if (prot & IOMMU_PROT_IW) __pte |= IOMMU_PTE_IW; - *pte = __pte; + for (i = 0; i < count; ++i) + pte[i] = __pte; update_domain(dom); return 0; } -static void iommu_unmap_page(struct protection_domain *dom, - unsigned long bus_addr, int map_size) +static unsigned long iommu_unmap_page(struct protection_domain *dom, + unsigned long bus_addr, + unsigned long page_size) { - u64 *pte = fetch_pte(dom, bus_addr, map_size); + unsigned long long unmap_size, unmapped; + u64 *pte; + + BUG_ON(!is_power_of_2(page_size)); + + unmapped = 0; - if (pte) - *pte = 0; + while (unmapped < page_size) { + + pte = fetch_pte(dom, bus_addr); + + if (!pte) { + /* + * No PTE for this address + * move forward in 4kb steps + */ + unmap_size = PAGE_SIZE; + } else if (PM_PTE_LEVEL(*pte) == 0) { + /* 4kb PTE found for this address */ + unmap_size = PAGE_SIZE; + *pte = 0ULL; + } else { + int count, i; + + /* Large PTE found which maps this address */ + unmap_size = PTE_PAGE_SIZE(*pte); + count = PAGE_SIZE_PTE_COUNT(unmap_size); + for (i = 0; i < count; i++) + pte[i] = 0ULL; + } + + bus_addr = (bus_addr & ~(unmap_size - 1)) + unmap_size; + unmapped += unmap_size; + } + + BUG_ON(!is_power_of_2(unmapped)); + + return unmapped; } /* @@ -878,7 +945,7 @@ static int dma_ops_unity_map(struct dma_ops_domain *dma_dom, for (addr = e->address_start; addr < e->address_end; addr += PAGE_SIZE) { ret = iommu_map_page(&dma_dom->domain, addr, addr, e->prot, - PM_MAP_4k); + PAGE_SIZE); if (ret) return ret; /* @@ -1006,7 +1073,7 @@ static int alloc_new_range(struct dma_ops_domain *dma_dom, u64 *pte, *pte_page; for (i = 0; i < num_ptes; ++i) { - pte = alloc_pte(&dma_dom->domain, address, PM_MAP_4k, + pte = alloc_pte(&dma_dom->domain, address, PAGE_SIZE, &pte_page, gfp); if (!pte) goto out_free; @@ -1042,7 +1109,7 @@ static int alloc_new_range(struct dma_ops_domain *dma_dom, for (i = dma_dom->aperture[index]->offset; i < dma_dom->aperture_size; i += PAGE_SIZE) { - u64 *pte = fetch_pte(&dma_dom->domain, i, PM_MAP_4k); + u64 *pte = fetch_pte(&dma_dom->domain, i); if (!pte || !IOMMU_PTE_PRESENT(*pte)) continue; @@ -1712,7 +1779,7 @@ static u64* dma_ops_get_pte(struct dma_ops_domain *dom, pte = aperture->pte_pages[APERTURE_PAGE_INDEX(address)]; if (!pte) { - pte = alloc_pte(&dom->domain, address, PM_MAP_4k, &pte_page, + pte = alloc_pte(&dom->domain, address, PAGE_SIZE, &pte_page, GFP_ATOMIC); aperture->pte_pages[APERTURE_PAGE_INDEX(address)] = pte_page; } else @@ -2439,12 +2506,11 @@ static int amd_iommu_attach_device(struct iommu_domain *dom, return ret; } -static int amd_iommu_map_range(struct iommu_domain *dom, - unsigned long iova, phys_addr_t paddr, - size_t size, int iommu_prot) +static int amd_iommu_map(struct iommu_domain *dom, unsigned long iova, + phys_addr_t paddr, int gfp_order, int iommu_prot) { + unsigned long page_size = 0x1000UL << gfp_order; struct protection_domain *domain = dom->priv; - unsigned long i, npages = iommu_num_pages(paddr, size, PAGE_SIZE); int prot = 0; int ret; @@ -2453,61 +2519,50 @@ static int amd_iommu_map_range(struct iommu_domain *dom, if (iommu_prot & IOMMU_WRITE) prot |= IOMMU_PROT_IW; - iova &= PAGE_MASK; - paddr &= PAGE_MASK; - mutex_lock(&domain->api_lock); - - for (i = 0; i < npages; ++i) { - ret = iommu_map_page(domain, iova, paddr, prot, PM_MAP_4k); - if (ret) - return ret; - - iova += PAGE_SIZE; - paddr += PAGE_SIZE; - } - + ret = iommu_map_page(domain, iova, paddr, prot, page_size); mutex_unlock(&domain->api_lock); - return 0; + return ret; } -static void amd_iommu_unmap_range(struct iommu_domain *dom, - unsigned long iova, size_t size) +static int amd_iommu_unmap(struct iommu_domain *dom, unsigned long iova, + int gfp_order) { - struct protection_domain *domain = dom->priv; - unsigned long i, npages = iommu_num_pages(iova, size, PAGE_SIZE); + unsigned long page_size, unmap_size; - iova &= PAGE_MASK; + page_size = 0x1000UL << gfp_order; mutex_lock(&domain->api_lock); - - for (i = 0; i < npages; ++i) { - iommu_unmap_page(domain, iova, PM_MAP_4k); - iova += PAGE_SIZE; - } + unmap_size = iommu_unmap_page(domain, iova, page_size); + mutex_unlock(&domain->api_lock); iommu_flush_tlb_pde(domain); - mutex_unlock(&domain->api_lock); + return get_order(unmap_size); } static phys_addr_t amd_iommu_iova_to_phys(struct iommu_domain *dom, unsigned long iova) { struct protection_domain *domain = dom->priv; - unsigned long offset = iova & ~PAGE_MASK; + unsigned long offset_mask; phys_addr_t paddr; - u64 *pte; + u64 *pte, __pte; - pte = fetch_pte(domain, iova, PM_MAP_4k); + pte = fetch_pte(domain, iova); if (!pte || !IOMMU_PTE_PRESENT(*pte)) return 0; - paddr = *pte & IOMMU_PAGE_MASK; - paddr |= offset; + if (PM_PTE_LEVEL(*pte) == 0) + offset_mask = PAGE_SIZE - 1; + else + offset_mask = PTE_PAGE_SIZE(*pte) - 1; + + __pte = *pte & PM_ADDR_MASK; + paddr = (__pte & ~offset_mask) | (iova & offset_mask); return paddr; } @@ -2523,8 +2578,8 @@ static struct iommu_ops amd_iommu_ops = { .domain_destroy = amd_iommu_domain_destroy, .attach_dev = amd_iommu_attach_device, .detach_dev = amd_iommu_detach_device, - .map = amd_iommu_map_range, - .unmap = amd_iommu_unmap_range, + .map = amd_iommu_map, + .unmap = amd_iommu_unmap, .iova_to_phys = amd_iommu_iova_to_phys, .domain_has_cap = amd_iommu_domain_has_cap, }; diff --git a/arch/x86/kernel/amd_iommu_init.c b/arch/x86/kernel/amd_iommu_init.c index 6360abf..3bacb4d 100644 --- a/arch/x86/kernel/amd_iommu_init.c +++ b/arch/x86/kernel/amd_iommu_init.c @@ -120,6 +120,7 @@ struct ivmd_header { bool amd_iommu_dump; static int __initdata amd_iommu_detected; +static bool __initdata amd_iommu_disabled; u16 amd_iommu_last_bdf; /* largest PCI device id we have to handle */ @@ -1372,6 +1373,9 @@ void __init amd_iommu_detect(void) if (no_iommu || (iommu_detected && !gart_iommu_aperture)) return; + if (amd_iommu_disabled) + return; + if (acpi_table_parse("IVRS", early_amd_iommu_detect) == 0) { iommu_detected = 1; amd_iommu_detected = 1; @@ -1401,6 +1405,8 @@ static int __init parse_amd_iommu_options(char *str) for (; *str; ++str) { if (strncmp(str, "fullflush", 9) == 0) amd_iommu_unmap_flush = true; + if (strncmp(str, "off", 3) == 0) + amd_iommu_disabled = true; } return 1; diff --git a/arch/x86/kernel/apic/es7000_32.c b/arch/x86/kernel/apic/es7000_32.c index 03ba1b8..425e53a 100644 --- a/arch/x86/kernel/apic/es7000_32.c +++ b/arch/x86/kernel/apic/es7000_32.c @@ -131,24 +131,6 @@ int es7000_plat; static unsigned int base; -static int -es7000_rename_gsi(int ioapic, int gsi) -{ - if (es7000_plat == ES7000_ZORRO) - return gsi; - - if (!base) { - int i; - for (i = 0; i < nr_ioapics; i++) - base += nr_ioapic_registers[i]; - } - - if (!ioapic && (gsi < 16)) - gsi += base; - - return gsi; -} - static int __cpuinit wakeup_secondary_cpu_via_mip(int cpu, unsigned long eip) { unsigned long vect = 0, psaival = 0; @@ -190,7 +172,6 @@ static void setup_unisys(void) es7000_plat = ES7000_ZORRO; else es7000_plat = ES7000_CLASSIC; - ioapic_renumber_irq = es7000_rename_gsi; } /* diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c index eb2789c..33f3563 100644 --- a/arch/x86/kernel/apic/io_apic.c +++ b/arch/x86/kernel/apic/io_apic.c @@ -89,6 +89,9 @@ int nr_ioapics; /* IO APIC gsi routing info */ struct mp_ioapic_gsi mp_gsi_routing[MAX_IO_APICS]; +/* The last gsi number used */ +u32 gsi_end; + /* MP IRQ source entries */ struct mpc_intsrc mp_irqs[MAX_IRQ_SOURCES]; @@ -1013,10 +1016,9 @@ static inline int irq_trigger(int idx) return MPBIOS_trigger(idx); } -int (*ioapic_renumber_irq)(int ioapic, int irq); static int pin_2_irq(int idx, int apic, int pin) { - int irq, i; + int irq; int bus = mp_irqs[idx].srcbus; /* @@ -1028,18 +1030,12 @@ static int pin_2_irq(int idx, int apic, int pin) if (test_bit(bus, mp_bus_not_pci)) { irq = mp_irqs[idx].srcbusirq; } else { - /* - * PCI IRQs are mapped in order - */ - i = irq = 0; - while (i < apic) - irq += nr_ioapic_registers[i++]; - irq += pin; - /* - * For MPS mode, so far only needed by ES7000 platform - */ - if (ioapic_renumber_irq) - irq = ioapic_renumber_irq(apic, irq); + u32 gsi = mp_gsi_routing[apic].gsi_base + pin; + + if (gsi >= NR_IRQS_LEGACY) + irq = gsi; + else + irq = gsi_end + 1 + gsi; } #ifdef CONFIG_X86_32 @@ -1950,20 +1946,8 @@ static struct { int pin, apic; } ioapic_i8259 = { -1, -1 }; void __init enable_IO_APIC(void) { - union IO_APIC_reg_01 reg_01; int i8259_apic, i8259_pin; int apic; - unsigned long flags; - - /* - * The number of IO-APIC IRQ registers (== #pins): - */ - for (apic = 0; apic < nr_ioapics; apic++) { - raw_spin_lock_irqsave(&ioapic_lock, flags); - reg_01.raw = io_apic_read(apic, 1); - raw_spin_unlock_irqrestore(&ioapic_lock, flags); - nr_ioapic_registers[apic] = reg_01.bits.entries+1; - } if (!legacy_pic->nr_legacy_irqs) return; @@ -3858,27 +3842,20 @@ int __init io_apic_get_redir_entries (int ioapic) reg_01.raw = io_apic_read(ioapic, 1); raw_spin_unlock_irqrestore(&ioapic_lock, flags); - return reg_01.bits.entries; + /* The register returns the maximum index redir index + * supported, which is one less than the total number of redir + * entries. + */ + return reg_01.bits.entries + 1; } void __init probe_nr_irqs_gsi(void) { - int nr = 0; + int nr; - nr = acpi_probe_gsi(); - if (nr > nr_irqs_gsi) { + nr = gsi_end + 1 + NR_IRQS_LEGACY; + if (nr > nr_irqs_gsi) nr_irqs_gsi = nr; - } else { - /* for acpi=off or acpi is not compiled in */ - int idx; - - nr = 0; - for (idx = 0; idx < nr_ioapics; idx++) - nr += io_apic_get_redir_entries(idx) + 1; - - if (nr > nr_irqs_gsi) - nr_irqs_gsi = nr; - } printk(KERN_DEBUG "nr_irqs_gsi: %d\n", nr_irqs_gsi); } @@ -4085,22 +4062,27 @@ int __init io_apic_get_version(int ioapic) return reg_01.bits.version; } -int acpi_get_override_irq(int bus_irq, int *trigger, int *polarity) +int acpi_get_override_irq(u32 gsi, int *trigger, int *polarity) { - int i; + int ioapic, pin, idx; if (skip_ioapic_setup) return -1; - for (i = 0; i < mp_irq_entries; i++) - if (mp_irqs[i].irqtype == mp_INT && - mp_irqs[i].srcbusirq == bus_irq) - break; - if (i >= mp_irq_entries) + ioapic = mp_find_ioapic(gsi); + if (ioapic < 0) return -1; - *trigger = irq_trigger(i); - *polarity = irq_polarity(i); + pin = mp_find_ioapic_pin(ioapic, gsi); + if (pin < 0) + return -1; + + idx = find_irq_entry(ioapic, pin, mp_INT); + if (idx < 0) + return -1; + + *trigger = irq_trigger(idx); + *polarity = irq_polarity(idx); return 0; } @@ -4241,7 +4223,7 @@ void __init ioapic_insert_resources(void) } } -int mp_find_ioapic(int gsi) +int mp_find_ioapic(u32 gsi) { int i = 0; @@ -4256,7 +4238,7 @@ int mp_find_ioapic(int gsi) return -1; } -int mp_find_ioapic_pin(int ioapic, int gsi) +int mp_find_ioapic_pin(int ioapic, u32 gsi) { if (WARN_ON(ioapic == -1)) return -1; @@ -4284,6 +4266,7 @@ static int bad_ioapic(unsigned long address) void __init mp_register_ioapic(int id, u32 address, u32 gsi_base) { int idx = 0; + int entries; if (bad_ioapic(address)) return; @@ -4302,9 +4285,17 @@ void __init mp_register_ioapic(int id, u32 address, u32 gsi_base) * Build basic GSI lookup table to facilitate gsi->io_apic lookups * and to prevent reprogramming of IOAPIC pins (PCI GSIs). */ + entries = io_apic_get_redir_entries(idx); mp_gsi_routing[idx].gsi_base = gsi_base; - mp_gsi_routing[idx].gsi_end = gsi_base + - io_apic_get_redir_entries(idx); + mp_gsi_routing[idx].gsi_end = gsi_base + entries - 1; + + /* + * The number of IO-APIC IRQ registers (== #pins): + */ + nr_ioapic_registers[idx] = entries; + + if (mp_gsi_routing[idx].gsi_end > gsi_end) + gsi_end = mp_gsi_routing[idx].gsi_end; printk(KERN_INFO "IOAPIC[%d]: apic_id %d, version %d, address 0x%x, " "GSI %d-%d\n", idx, mp_ioapics[idx].apicid, diff --git a/arch/x86/kernel/apic/x2apic_uv_x.c b/arch/x86/kernel/apic/x2apic_uv_x.c index c085d52..e46f98f 100644 --- a/arch/x86/kernel/apic/x2apic_uv_x.c +++ b/arch/x86/kernel/apic/x2apic_uv_x.c @@ -735,9 +735,6 @@ void __init uv_system_init(void) uv_node_to_blade[nid] = blade; uv_cpu_to_blade[cpu] = blade; max_pnode = max(pnode, max_pnode); - - printk(KERN_DEBUG "UV: cpu %d, apicid 0x%x, pnode %d, nid %d, lcpu %d, blade %d\n", - cpu, apicid, pnode, nid, lcpu, blade); } /* Add blade/pnode info for nodes without cpus */ diff --git a/arch/x86/kernel/apm_32.c b/arch/x86/kernel/apm_32.c index 031aa88..c4f9182 100644 --- a/arch/x86/kernel/apm_32.c +++ b/arch/x86/kernel/apm_32.c @@ -1224,7 +1224,7 @@ static void reinit_timer(void) #ifdef INIT_TIMER_AFTER_SUSPEND unsigned long flags; - spin_lock_irqsave(&i8253_lock, flags); + raw_spin_lock_irqsave(&i8253_lock, flags); /* set the clock to HZ */ outb_pit(0x34, PIT_MODE); /* binary, mode 2, LSB/MSB, ch 0 */ udelay(10); @@ -1232,7 +1232,7 @@ static void reinit_timer(void) udelay(10); outb_pit(LATCH >> 8, PIT_CH0); /* MSB */ udelay(10); - spin_unlock_irqrestore(&i8253_lock, flags); + raw_spin_unlock_irqrestore(&i8253_lock, flags); #endif } diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile index c202b62..3a785da 100644 --- a/arch/x86/kernel/cpu/Makefile +++ b/arch/x86/kernel/cpu/Makefile @@ -14,7 +14,7 @@ CFLAGS_common.o := $(nostackp) obj-y := intel_cacheinfo.o addon_cpuid_features.o obj-y += proc.o capflags.o powerflags.o common.o -obj-y += vmware.o hypervisor.o sched.o +obj-y += vmware.o hypervisor.o sched.o mshyperv.o obj-$(CONFIG_X86_32) += bugs.o cmpxchg.o obj-$(CONFIG_X86_64) += bugs_64.o diff --git a/arch/x86/kernel/cpu/addon_cpuid_features.c b/arch/x86/kernel/cpu/addon_cpuid_features.c index 97ad79c..10fa568 100644 --- a/arch/x86/kernel/cpu/addon_cpuid_features.c +++ b/arch/x86/kernel/cpu/addon_cpuid_features.c @@ -30,12 +30,14 @@ void __cpuinit init_scattered_cpuid_features(struct cpuinfo_x86 *c) const struct cpuid_bit *cb; static const struct cpuid_bit __cpuinitconst cpuid_bits[] = { - { X86_FEATURE_IDA, CR_EAX, 1, 0x00000006 }, - { X86_FEATURE_ARAT, CR_EAX, 2, 0x00000006 }, - { X86_FEATURE_NPT, CR_EDX, 0, 0x8000000a }, - { X86_FEATURE_LBRV, CR_EDX, 1, 0x8000000a }, - { X86_FEATURE_SVML, CR_EDX, 2, 0x8000000a }, - { X86_FEATURE_NRIPS, CR_EDX, 3, 0x8000000a }, + { X86_FEATURE_IDA, CR_EAX, 1, 0x00000006 }, + { X86_FEATURE_ARAT, CR_EAX, 2, 0x00000006 }, + { X86_FEATURE_APERFMPERF, CR_ECX, 0, 0x00000006 }, + { X86_FEATURE_CPB, CR_EDX, 9, 0x80000007 }, + { X86_FEATURE_NPT, CR_EDX, 0, 0x8000000a }, + { X86_FEATURE_LBRV, CR_EDX, 1, 0x8000000a }, + { X86_FEATURE_SVML, CR_EDX, 2, 0x8000000a }, + { X86_FEATURE_NRIPS, CR_EDX, 3, 0x8000000a }, { 0, 0, 0, 0 } }; diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 01a2652..c39576c 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -86,7 +86,7 @@ static void __init check_fpu(void) static void __init check_hlt(void) { - if (paravirt_enabled()) + if (boot_cpu_data.x86 >= 5 || paravirt_enabled()) return; printk(KERN_INFO "Checking 'hlt' instruction... "); diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 4868e4a..c1c00d0 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1243,10 +1243,7 @@ void __cpuinit cpu_init(void) /* * Force FPU initialization: */ - if (cpu_has_xsave) - current_thread_info()->status = TS_XSAVE; - else - current_thread_info()->status = 0; + current_thread_info()->status = 0; clear_used_math(); mxcsr_feature_mask_init(); diff --git a/arch/x86/kernel/cpu/cpufreq/Makefile b/arch/x86/kernel/cpu/cpufreq/Makefile index 1840c0a..bd54bf6 100644 --- a/arch/x86/kernel/cpu/cpufreq/Makefile +++ b/arch/x86/kernel/cpu/cpufreq/Makefile @@ -2,8 +2,8 @@ # K8 systems. ACPI is preferred to all other hardware-specific drivers. # speedstep-* is preferred over p4-clockmod. -obj-$(CONFIG_X86_POWERNOW_K8) += powernow-k8.o -obj-$(CONFIG_X86_ACPI_CPUFREQ) += acpi-cpufreq.o +obj-$(CONFIG_X86_POWERNOW_K8) += powernow-k8.o mperf.o +obj-$(CONFIG_X86_ACPI_CPUFREQ) += acpi-cpufreq.o mperf.o obj-$(CONFIG_X86_PCC_CPUFREQ) += pcc-cpufreq.o obj-$(CONFIG_X86_POWERNOW_K6) += powernow-k6.o obj-$(CONFIG_X86_POWERNOW_K7) += powernow-k7.o diff --git a/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c b/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c index 4591680..1d3cdda 100644 --- a/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c +++ b/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c @@ -46,6 +46,7 @@ #include <asm/msr.h> #include <asm/processor.h> #include <asm/cpufeature.h> +#include "mperf.h" #define dprintk(msg...) cpufreq_debug_printk(CPUFREQ_DEBUG_DRIVER, \ "acpi-cpufreq", msg) @@ -71,8 +72,6 @@ struct acpi_cpufreq_data { static DEFINE_PER_CPU(struct acpi_cpufreq_data *, acfreq_data); -static DEFINE_PER_CPU(struct aperfmperf, acfreq_old_perf); - /* acpi_perf_data is a pointer to percpu data. */ static struct acpi_processor_performance *acpi_perf_data; @@ -240,45 +239,6 @@ static u32 get_cur_val(const struct cpumask *mask) return cmd.val; } -/* Called via smp_call_function_single(), on the target CPU */ -static void read_measured_perf_ctrs(void *_cur) -{ - struct aperfmperf *am = _cur; - - get_aperfmperf(am); -} - -/* - * Return the measured active (C0) frequency on this CPU since last call - * to this function. - * Input: cpu number - * Return: Average CPU frequency in terms of max frequency (zero on error) - * - * We use IA32_MPERF and IA32_APERF MSRs to get the measured performance - * over a period of time, while CPU is in C0 state. - * IA32_MPERF counts at the rate of max advertised frequency - * IA32_APERF counts at the rate of actual CPU frequency - * Only IA32_APERF/IA32_MPERF ratio is architecturally defined and - * no meaning should be associated with absolute values of these MSRs. - */ -static unsigned int get_measured_perf(struct cpufreq_policy *policy, - unsigned int cpu) -{ - struct aperfmperf perf; - unsigned long ratio; - unsigned int retval; - - if (smp_call_function_single(cpu, read_measured_perf_ctrs, &perf, 1)) - return 0; - - ratio = calc_aperfmperf_ratio(&per_cpu(acfreq_old_perf, cpu), &perf); - per_cpu(acfreq_old_perf, cpu) = perf; - - retval = (policy->cpuinfo.max_freq * ratio) >> APERFMPERF_SHIFT; - - return retval; -} - static unsigned int get_cur_freq_on_cpu(unsigned int cpu) { struct acpi_cpufreq_data *data = per_cpu(acfreq_data, cpu); @@ -702,7 +662,7 @@ static int acpi_cpufreq_cpu_init(struct cpufreq_policy *policy) /* Check for APERF/MPERF support in hardware */ if (cpu_has(c, X86_FEATURE_APERFMPERF)) - acpi_cpufreq_driver.getavg = get_measured_perf; + acpi_cpufreq_driver.getavg = cpufreq_get_measured_perf; dprintk("CPU%u - ACPI performance management activated.\n", cpu); for (i = 0; i < perf->state_count; i++) diff --git a/arch/x86/kernel/cpu/cpufreq/mperf.c b/arch/x86/kernel/cpu/cpufreq/mperf.c new file mode 100644 index 0000000..911e193 --- /dev/null +++ b/arch/x86/kernel/cpu/cpufreq/mperf.c @@ -0,0 +1,51 @@ +#include <linux/kernel.h> +#include <linux/smp.h> +#include <linux/module.h> +#include <linux/init.h> +#include <linux/cpufreq.h> +#include <linux/slab.h> + +#include "mperf.h" + +static DEFINE_PER_CPU(struct aperfmperf, acfreq_old_perf); + +/* Called via smp_call_function_single(), on the target CPU */ +static void read_measured_perf_ctrs(void *_cur) +{ + struct aperfmperf *am = _cur; + + get_aperfmperf(am); +} + +/* + * Return the measured active (C0) frequency on this CPU since last call + * to this function. + * Input: cpu number + * Return: Average CPU frequency in terms of max frequency (zero on error) + * + * We use IA32_MPERF and IA32_APERF MSRs to get the measured performance + * over a period of time, while CPU is in C0 state. + * IA32_MPERF counts at the rate of max advertised frequency + * IA32_APERF counts at the rate of actual CPU frequency + * Only IA32_APERF/IA32_MPERF ratio is architecturally defined and + * no meaning should be associated with absolute values of these MSRs. + */ +unsigned int cpufreq_get_measured_perf(struct cpufreq_policy *policy, + unsigned int cpu) +{ + struct aperfmperf perf; + unsigned long ratio; + unsigned int retval; + + if (smp_call_function_single(cpu, read_measured_perf_ctrs, &perf, 1)) + return 0; + + ratio = calc_aperfmperf_ratio(&per_cpu(acfreq_old_perf, cpu), &perf); + per_cpu(acfreq_old_perf, cpu) = perf; + + retval = (policy->cpuinfo.max_freq * ratio) >> APERFMPERF_SHIFT; + + return retval; +} +EXPORT_SYMBOL_GPL(cpufreq_get_measured_perf); +MODULE_LICENSE("GPL"); diff --git a/arch/x86/kernel/cpu/cpufreq/mperf.h b/arch/x86/kernel/cpu/cpufreq/mperf.h new file mode 100644 index 0000000..5dbf295 --- /dev/null +++ b/arch/x86/kernel/cpu/cpufreq/mperf.h @@ -0,0 +1,9 @@ +/* + * (c) 2010 Advanced Micro Devices, Inc. + * Your use of this code is subject to the terms and conditions of the + * GNU general public license version 2. See "COPYING" or + * http://www.gnu.org/licenses/gpl.html + */ + +unsigned int cpufreq_get_measured_perf(struct cpufreq_policy *policy, + unsigned int cpu); diff --git a/arch/x86/kernel/cpu/cpufreq/powernow-k8.c b/arch/x86/kernel/cpu/cpufreq/powernow-k8.c index b6215b9..6f3dc8f 100644 --- a/arch/x86/kernel/cpu/cpufreq/powernow-k8.c +++ b/arch/x86/kernel/cpu/cpufreq/powernow-k8.c @@ -1,6 +1,5 @@ - /* - * (c) 2003-2006 Advanced Micro Devices, Inc. + * (c) 2003-2010 Advanced Micro Devices, Inc. * Your use of this code is subject to the terms and conditions of the * GNU general public license version 2. See "COPYING" or * http://www.gnu.org/licenses/gpl.html @@ -46,6 +45,7 @@ #define PFX "powernow-k8: " #define VERSION "version 2.20.00" #include "powernow-k8.h" +#include "mperf.h" /* serialize freq changes */ static DEFINE_MUTEX(fidvid_mutex); @@ -54,6 +54,12 @@ static DEFINE_PER_CPU(struct powernow_k8_data *, powernow_data); static int cpu_family = CPU_OPTERON; +/* core performance boost */ +static bool cpb_capable, cpb_enabled; +static struct msr __percpu *msrs; + +static struct cpufreq_driver cpufreq_amd64_driver; + #ifndef CONFIG_SMP static inline const struct cpumask *cpu_core_mask(int cpu) { @@ -1249,6 +1255,7 @@ static int __cpuinit powernowk8_cpu_init(struct cpufreq_policy *pol) struct powernow_k8_data *data; struct init_on_cpu init_on_cpu; int rc; + struct cpuinfo_x86 *c = &cpu_data(pol->cpu); if (!cpu_online(pol->cpu)) return -ENODEV; @@ -1323,6 +1330,10 @@ static int __cpuinit powernowk8_cpu_init(struct cpufreq_policy *pol) return -EINVAL; } + /* Check for APERF/MPERF support in hardware */ + if (cpu_has(c, X86_FEATURE_APERFMPERF)) + cpufreq_amd64_driver.getavg = cpufreq_get_measured_perf; + cpufreq_frequency_table_get_attr(data->powernow_table, pol->cpu); if (cpu_family == CPU_HW_PSTATE) @@ -1394,8 +1405,77 @@ out: return khz; } +static void _cpb_toggle_msrs(bool t) +{ + int cpu; + + get_online_cpus(); + + rdmsr_on_cpus(cpu_online_mask, MSR_K7_HWCR, msrs); + + for_each_cpu(cpu, cpu_online_mask) { + struct msr *reg = per_cpu_ptr(msrs, cpu); + if (t) + reg->l &= ~BIT(25); + else + reg->l |= BIT(25); + } + wrmsr_on_cpus(cpu_online_mask, MSR_K7_HWCR, msrs); + + put_online_cpus(); +} + +/* + * Switch on/off core performance boosting. + * + * 0=disable + * 1=enable. + */ +static void cpb_toggle(bool t) +{ + if (!cpb_capable) + return; + + if (t && !cpb_enabled) { + cpb_enabled = true; + _cpb_toggle_msrs(t); + printk(KERN_INFO PFX "Core Boosting enabled.\n"); + } else if (!t && cpb_enabled) { + cpb_enabled = false; + _cpb_toggle_msrs(t); + printk(KERN_INFO PFX "Core Boosting disabled.\n"); + } +} + +static ssize_t store_cpb(struct cpufreq_policy *policy, const char *buf, + size_t count) +{ + int ret = -EINVAL; + unsigned long val = 0; + + ret = strict_strtoul(buf, 10, &val); + if (!ret && (val == 0 || val == 1) && cpb_capable) + cpb_toggle(val); + else + return -EINVAL; + + return count; +} + +static ssize_t show_cpb(struct cpufreq_policy *policy, char *buf) +{ + return sprintf(buf, "%u\n", cpb_enabled); +} + +#define define_one_rw(_name) \ +static struct freq_attr _name = \ +__ATTR(_name, 0644, show_##_name, store_##_name) + +define_one_rw(cpb); + static struct freq_attr *powernow_k8_attr[] = { &cpufreq_freq_attr_scaling_available_freqs, + &cpb, NULL, }; @@ -1411,10 +1491,51 @@ static struct cpufreq_driver cpufreq_amd64_driver = { .attr = powernow_k8_attr, }; +/* + * Clear the boost-disable flag on the CPU_DOWN path so that this cpu + * cannot block the remaining ones from boosting. On the CPU_UP path we + * simply keep the boost-disable flag in sync with the current global + * state. + */ +static int __cpuinit cpb_notify(struct notifier_block *nb, unsigned long action, + void *hcpu) +{ + unsigned cpu = (long)hcpu; + u32 lo, hi; + + switch (action) { + case CPU_UP_PREPARE: + case CPU_UP_PREPARE_FROZEN: + + if (!cpb_enabled) { + rdmsr_on_cpu(cpu, MSR_K7_HWCR, &lo, &hi); + lo |= BIT(25); + wrmsr_on_cpu(cpu, MSR_K7_HWCR, lo, hi); + } + break; + + case CPU_DOWN_PREPARE: + case CPU_DOWN_PREPARE_FROZEN: + rdmsr_on_cpu(cpu, MSR_K7_HWCR, &lo, &hi); + lo &= ~BIT(25); + wrmsr_on_cpu(cpu, MSR_K7_HWCR, lo, hi); + break; + + default: + break; + } + + return NOTIFY_OK; +} + +static struct notifier_block __cpuinitdata cpb_nb = { + .notifier_call = cpb_notify, +}; + /* driver entry point for init */ static int __cpuinit powernowk8_init(void) { - unsigned int i, supported_cpus = 0; + unsigned int i, supported_cpus = 0, cpu; for_each_online_cpu(i) { int rc; @@ -1423,15 +1544,36 @@ static int __cpuinit powernowk8_init(void) supported_cpus++; } - if (supported_cpus == num_online_cpus()) { - printk(KERN_INFO PFX "Found %d %s " - "processors (%d cpu cores) (" VERSION ")\n", - num_online_nodes(), - boot_cpu_data.x86_model_id, supported_cpus); - return cpufreq_register_driver(&cpufreq_amd64_driver); + if (supported_cpus != num_online_cpus()) + return -ENODEV; + + printk(KERN_INFO PFX "Found %d %s (%d cpu cores) (" VERSION ")\n", + num_online_nodes(), boot_cpu_data.x86_model_id, supported_cpus); + + if (boot_cpu_has(X86_FEATURE_CPB)) { + + cpb_capable = true; + + register_cpu_notifier(&cpb_nb); + + msrs = msrs_alloc(); + if (!msrs) { + printk(KERN_ERR "%s: Error allocating msrs!\n", __func__); + return -ENOMEM; + } + + rdmsr_on_cpus(cpu_online_mask, MSR_K7_HWCR, msrs); + + for_each_cpu(cpu, cpu_online_mask) { + struct msr *reg = per_cpu_ptr(msrs, cpu); + cpb_enabled |= !(!!(reg->l & BIT(25))); + } + + printk(KERN_INFO PFX "Core Performance Boosting: %s.\n", + (cpb_enabled ? "on" : "off")); } - return -ENODEV; + return cpufreq_register_driver(&cpufreq_amd64_driver); } /* driver entry point for term */ @@ -1439,6 +1581,13 @@ static void __exit powernowk8_exit(void) { dprintk("exit\n"); + if (boot_cpu_has(X86_FEATURE_CPB)) { + msrs_free(msrs); + msrs = NULL; + + unregister_cpu_notifier(&cpb_nb); + } + cpufreq_unregister_driver(&cpufreq_amd64_driver); } diff --git a/arch/x86/kernel/cpu/cpufreq/powernow-k8.h b/arch/x86/kernel/cpu/cpufreq/powernow-k8.h index 02ce824..df3529b 100644 --- a/arch/x86/kernel/cpu/cpufreq/powernow-k8.h +++ b/arch/x86/kernel/cpu/cpufreq/powernow-k8.h @@ -5,7 +5,6 @@ * http://www.gnu.org/licenses/gpl.html */ - enum pstate { HW_PSTATE_INVALID = 0xff, HW_PSTATE_0 = 0, @@ -55,7 +54,6 @@ struct powernow_k8_data { struct cpumask *available_cores; }; - /* processor's cpuid instruction support */ #define CPUID_PROCESSOR_SIGNATURE 1 /* function 1 */ #define CPUID_XFAM 0x0ff00000 /* extended family */ diff --git a/arch/x86/kernel/cpu/hypervisor.c b/arch/x86/kernel/cpu/hypervisor.c index 08be922..dd531cc 100644 --- a/arch/x86/kernel/cpu/hypervisor.c +++ b/arch/x86/kernel/cpu/hypervisor.c @@ -21,37 +21,55 @@ * */ +#include <linux/module.h> #include <asm/processor.h> -#include <asm/vmware.h> #include <asm/hypervisor.h> -static inline void __cpuinit -detect_hypervisor_vendor(struct cpuinfo_x86 *c) +/* + * Hypervisor detect order. This is specified explicitly here because + * some hypervisors might implement compatibility modes for other + * hypervisors and therefore need to be detected in specific sequence. + */ +static const __initconst struct hypervisor_x86 * const hypervisors[] = { - if (vmware_platform()) - c->x86_hyper_vendor = X86_HYPER_VENDOR_VMWARE; - else - c->x86_hyper_vendor = X86_HYPER_VENDOR_NONE; -} + &x86_hyper_vmware, + &x86_hyper_ms_hyperv, +}; -static inline void __cpuinit -hypervisor_set_feature_bits(struct cpuinfo_x86 *c) +const struct hypervisor_x86 *x86_hyper; +EXPORT_SYMBOL(x86_hyper); + +static inline void __init +detect_hypervisor_vendor(void) { - if (boot_cpu_data.x86_hyper_vendor == X86_HYPER_VENDOR_VMWARE) { - vmware_set_feature_bits(c); - return; + const struct hypervisor_x86 *h, * const *p; + + for (p = hypervisors; p < hypervisors + ARRAY_SIZE(hypervisors); p++) { + h = *p; + if (h->detect()) { + x86_hyper = h; + printk(KERN_INFO "Hypervisor detected: %s\n", h->name); + break; + } } } void __cpuinit init_hypervisor(struct cpuinfo_x86 *c) { - detect_hypervisor_vendor(c); - hypervisor_set_feature_bits(c); + if (x86_hyper && x86_hyper->set_cpu_features) + x86_hyper->set_cpu_features(c); } void __init init_hypervisor_platform(void) { + + detect_hypervisor_vendor(); + + if (!x86_hyper) + return; + init_hypervisor(&boot_cpu_data); - if (boot_cpu_data.x86_hyper_vendor == X86_HYPER_VENDOR_VMWARE) - vmware_platform_setup(); + + if (x86_hyper->init_platform) + x86_hyper->init_platform(); } diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c index f5e5390..85f69cd 100644 --- a/arch/x86/kernel/cpu/intel.c +++ b/arch/x86/kernel/cpu/intel.c @@ -372,12 +372,6 @@ static void __cpuinit init_intel(struct cpuinfo_x86 *c) set_cpu_cap(c, X86_FEATURE_ARCH_PERFMON); } - if (c->cpuid_level > 6) { - unsigned ecx = cpuid_ecx(6); - if (ecx & 0x01) - set_cpu_cap(c, X86_FEATURE_APERFMPERF); - } - if (cpu_has_xmm2) set_cpu_cap(c, X86_FEATURE_LFENCE_RDTSC); if (cpu_has_ds) { diff --git a/arch/x86/kernel/cpu/intel_cacheinfo.c b/arch/x86/kernel/cpu/intel_cacheinfo.c index b3eeb66..33eae20 100644 --- a/arch/x86/kernel/cpu/intel_cacheinfo.c +++ b/arch/x86/kernel/cpu/intel_cacheinfo.c @@ -148,13 +148,19 @@ union _cpuid4_leaf_ecx { u32 full; }; +struct amd_l3_cache { + struct pci_dev *dev; + bool can_disable; + unsigned indices; + u8 subcaches[4]; +}; + struct _cpuid4_info { union _cpuid4_leaf_eax eax; union _cpuid4_leaf_ebx ebx; union _cpuid4_leaf_ecx ecx; unsigned long size; - bool can_disable; - unsigned int l3_indices; + struct amd_l3_cache *l3; DECLARE_BITMAP(shared_cpu_map, NR_CPUS); }; @@ -164,8 +170,7 @@ struct _cpuid4_info_regs { union _cpuid4_leaf_ebx ebx; union _cpuid4_leaf_ecx ecx; unsigned long size; - bool can_disable; - unsigned int l3_indices; + struct amd_l3_cache *l3; }; unsigned short num_cache_leaves; @@ -302,87 +307,163 @@ struct _cache_attr { }; #ifdef CONFIG_CPU_SUP_AMD -static unsigned int __cpuinit amd_calc_l3_indices(void) + +/* + * L3 cache descriptors + */ +static struct amd_l3_cache **__cpuinitdata l3_caches; + +static void __cpuinit amd_calc_l3_indices(struct amd_l3_cache *l3) { - /* - * We're called over smp_call_function_single() and therefore - * are on the correct cpu. - */ - int cpu = smp_processor_id(); - int node = cpu_to_node(cpu); - struct pci_dev *dev = node_to_k8_nb_misc(node); unsigned int sc0, sc1, sc2, sc3; u32 val = 0; - pci_read_config_dword(dev, 0x1C4, &val); + pci_read_config_dword(l3->dev, 0x1C4, &val); /* calculate subcache sizes */ - sc0 = !(val & BIT(0)); - sc1 = !(val & BIT(4)); - sc2 = !(val & BIT(8)) + !(val & BIT(9)); - sc3 = !(val & BIT(12)) + !(val & BIT(13)); + l3->subcaches[0] = sc0 = !(val & BIT(0)); + l3->subcaches[1] = sc1 = !(val & BIT(4)); + l3->subcaches[2] = sc2 = !(val & BIT(8)) + !(val & BIT(9)); + l3->subcaches[3] = sc3 = !(val & BIT(12)) + !(val & BIT(13)); - return (max(max(max(sc0, sc1), sc2), sc3) << 10) - 1; + l3->indices = (max(max(max(sc0, sc1), sc2), sc3) << 10) - 1; +} + +static struct amd_l3_cache * __cpuinit amd_init_l3_cache(int node) +{ + struct amd_l3_cache *l3; + struct pci_dev *dev = node_to_k8_nb_misc(node); + + l3 = kzalloc(sizeof(struct amd_l3_cache), GFP_ATOMIC); + if (!l3) { + printk(KERN_WARNING "Error allocating L3 struct\n"); + return NULL; + } + + l3->dev = dev; + + amd_calc_l3_indices(l3); + + return l3; } static void __cpuinit amd_check_l3_disable(int index, struct _cpuid4_info_regs *this_leaf) { - if (index < 3) + int node; + + if (boot_cpu_data.x86 != 0x10) return; - if (boot_cpu_data.x86 == 0x11) + if (index < 3) return; /* see errata #382 and #388 */ - if ((boot_cpu_data.x86 == 0x10) && - ((boot_cpu_data.x86_model < 0x8) || - (boot_cpu_data.x86_mask < 0x1))) + if (boot_cpu_data.x86_model < 0x8) + return; + + if ((boot_cpu_data.x86_model == 0x8 || + boot_cpu_data.x86_model == 0x9) + && + boot_cpu_data.x86_mask < 0x1) + return; + + /* not in virtualized environments */ + if (num_k8_northbridges == 0) return; - this_leaf->can_disable = true; - this_leaf->l3_indices = amd_calc_l3_indices(); + /* + * Strictly speaking, the amount in @size below is leaked since it is + * never freed but this is done only on shutdown so it doesn't matter. + */ + if (!l3_caches) { + int size = num_k8_northbridges * sizeof(struct amd_l3_cache *); + + l3_caches = kzalloc(size, GFP_ATOMIC); + if (!l3_caches) + return; + } + + node = amd_get_nb_id(smp_processor_id()); + + if (!l3_caches[node]) { + l3_caches[node] = amd_init_l3_cache(node); + l3_caches[node]->can_disable = true; + } + + WARN_ON(!l3_caches[node]); + + this_leaf->l3 = l3_caches[node]; } static ssize_t show_cache_disable(struct _cpuid4_info *this_leaf, char *buf, - unsigned int index) + unsigned int slot) { - int cpu = cpumask_first(to_cpumask(this_leaf->shared_cpu_map)); - int node = amd_get_nb_id(cpu); - struct pci_dev *dev = node_to_k8_nb_misc(node); + struct pci_dev *dev = this_leaf->l3->dev; unsigned int reg = 0; - if (!this_leaf->can_disable) + if (!this_leaf->l3 || !this_leaf->l3->can_disable) return -EINVAL; if (!dev) return -EINVAL; - pci_read_config_dword(dev, 0x1BC + index * 4, ®); + pci_read_config_dword(dev, 0x1BC + slot * 4, ®); return sprintf(buf, "0x%08x\n", reg); } -#define SHOW_CACHE_DISABLE(index) \ +#define SHOW_CACHE_DISABLE(slot) \ static ssize_t \ -show_cache_disable_##index(struct _cpuid4_info *this_leaf, char *buf) \ +show_cache_disable_##slot(struct _cpuid4_info *this_leaf, char *buf) \ { \ - return show_cache_disable(this_leaf, buf, index); \ + return show_cache_disable(this_leaf, buf, slot); \ } SHOW_CACHE_DISABLE(0) SHOW_CACHE_DISABLE(1) +static void amd_l3_disable_index(struct amd_l3_cache *l3, int cpu, + unsigned slot, unsigned long idx) +{ + int i; + + idx |= BIT(30); + + /* + * disable index in all 4 subcaches + */ + for (i = 0; i < 4; i++) { + u32 reg = idx | (i << 20); + + if (!l3->subcaches[i]) + continue; + + pci_write_config_dword(l3->dev, 0x1BC + slot * 4, reg); + + /* + * We need to WBINVD on a core on the node containing the L3 + * cache which indices we disable therefore a simple wbinvd() + * is not sufficient. + */ + wbinvd_on_cpu(cpu); + + reg |= BIT(31); + pci_write_config_dword(l3->dev, 0x1BC + slot * 4, reg); + } +} + + static ssize_t store_cache_disable(struct _cpuid4_info *this_leaf, - const char *buf, size_t count, unsigned int index) + const char *buf, size_t count, + unsigned int slot) { + struct pci_dev *dev = this_leaf->l3->dev; int cpu = cpumask_first(to_cpumask(this_leaf->shared_cpu_map)); - int node = amd_get_nb_id(cpu); - struct pci_dev *dev = node_to_k8_nb_misc(node); unsigned long val = 0; #define SUBCACHE_MASK (3UL << 20) #define SUBCACHE_INDEX 0xfff - if (!this_leaf->can_disable) + if (!this_leaf->l3 || !this_leaf->l3->can_disable) return -EINVAL; if (!capable(CAP_SYS_ADMIN)) @@ -396,26 +477,20 @@ static ssize_t store_cache_disable(struct _cpuid4_info *this_leaf, /* do not allow writes outside of allowed bits */ if ((val & ~(SUBCACHE_MASK | SUBCACHE_INDEX)) || - ((val & SUBCACHE_INDEX) > this_leaf->l3_indices)) + ((val & SUBCACHE_INDEX) > this_leaf->l3->indices)) return -EINVAL; - val |= BIT(30); - pci_write_config_dword(dev, 0x1BC + index * 4, val); - /* - * We need to WBINVD on a core on the node containing the L3 cache which - * indices we disable therefore a simple wbinvd() is not sufficient. - */ - wbinvd_on_cpu(cpu); - pci_write_config_dword(dev, 0x1BC + index * 4, val | BIT(31)); + amd_l3_disable_index(this_leaf->l3, cpu, slot, val); + return count; } -#define STORE_CACHE_DISABLE(index) \ +#define STORE_CACHE_DISABLE(slot) \ static ssize_t \ -store_cache_disable_##index(struct _cpuid4_info *this_leaf, \ +store_cache_disable_##slot(struct _cpuid4_info *this_leaf, \ const char *buf, size_t count) \ { \ - return store_cache_disable(this_leaf, buf, count, index); \ + return store_cache_disable(this_leaf, buf, count, slot); \ } STORE_CACHE_DISABLE(0) STORE_CACHE_DISABLE(1) @@ -443,8 +518,7 @@ __cpuinit cpuid4_cache_lookup_regs(int index, if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) { amd_cpuid4(index, &eax, &ebx, &ecx); - if (boot_cpu_data.x86 >= 0x10) - amd_check_l3_disable(index, this_leaf); + amd_check_l3_disable(index, this_leaf); } else { cpuid_count(4, index, &eax.full, &ebx.full, &ecx.full, &edx); } @@ -701,6 +775,7 @@ static void __cpuinit free_cache_attributes(unsigned int cpu) for (i = 0; i < num_cache_leaves; i++) cache_remove_shared_cpu_map(cpu, i); + kfree(per_cpu(ici_cpuid4_info, cpu)->l3); kfree(per_cpu(ici_cpuid4_info, cpu)); per_cpu(ici_cpuid4_info, cpu) = NULL; } @@ -985,7 +1060,7 @@ static int __cpuinit cache_add_dev(struct sys_device * sys_dev) this_leaf = CPUID4_INFO_IDX(cpu, i); - if (this_leaf->can_disable) + if (this_leaf->l3 && this_leaf->l3->can_disable) ktype_cache.default_attrs = default_l3_attrs; else ktype_cache.default_attrs = default_attrs; diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c index 8a6f0af..7a355dd 100644 --- a/arch/x86/kernel/cpu/mcheck/mce.c +++ b/arch/x86/kernel/cpu/mcheck/mce.c @@ -539,7 +539,7 @@ void machine_check_poll(enum mcp_flags flags, mce_banks_t *b) struct mce m; int i; - __get_cpu_var(mce_poll_count)++; + percpu_inc(mce_poll_count); mce_setup(&m); @@ -934,7 +934,7 @@ void do_machine_check(struct pt_regs *regs, long error_code) atomic_inc(&mce_entry); - __get_cpu_var(mce_exception_count)++; + percpu_inc(mce_exception_count); if (notify_die(DIE_NMI, "machine check", regs, error_code, 18, SIGKILL) == NOTIFY_STOP) diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c new file mode 100644 index 0000000..16f41bb --- /dev/null +++ b/arch/x86/kernel/cpu/mshyperv.c @@ -0,0 +1,55 @@ +/* + * HyperV Detection code. + * + * Copyright (C) 2010, Novell, Inc. + * Author : K. Y. Srinivasan <ksrinivasan@novell.com> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; version 2 of the License. + * + */ + +#include <linux/types.h> +#include <linux/module.h> +#include <asm/processor.h> +#include <asm/hypervisor.h> +#include <asm/hyperv.h> +#include <asm/mshyperv.h> + +struct ms_hyperv_info ms_hyperv; + +static bool __init ms_hyperv_platform(void) +{ + u32 eax; + u32 hyp_signature[3]; + + if (!boot_cpu_has(X86_FEATURE_HYPERVISOR)) + return false; + + cpuid(HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS, + &eax, &hyp_signature[0], &hyp_signature[1], &hyp_signature[2]); + + return eax >= HYPERV_CPUID_MIN && + eax <= HYPERV_CPUID_MAX && + !memcmp("Microsoft Hv", hyp_signature, 12); +} + +static void __init ms_hyperv_init_platform(void) +{ + /* + * Extract the features and hints + */ + ms_hyperv.features = cpuid_eax(HYPERV_CPUID_FEATURES); + ms_hyperv.hints = cpuid_eax(HYPERV_CPUID_ENLIGHTMENT_INFO); + + printk(KERN_INFO "HyperV: features 0x%x, hints 0x%x\n", + ms_hyperv.features, ms_hyperv.hints); +} + +const __refconst struct hypervisor_x86 x86_hyper_ms_hyperv = { + .name = "Microsoft HyperV", + .detect = ms_hyperv_platform, + .init_platform = ms_hyperv_init_platform, +}; +EXPORT_SYMBOL(x86_hyper_ms_hyperv); diff --git a/arch/x86/kernel/cpu/vmware.c b/arch/x86/kernel/cpu/vmware.c index dfdb4db..b9d1ff5 100644 --- a/arch/x86/kernel/cpu/vmware.c +++ b/arch/x86/kernel/cpu/vmware.c @@ -24,8 +24,8 @@ #include <linux/dmi.h> #include <linux/module.h> #include <asm/div64.h> -#include <asm/vmware.h> #include <asm/x86_init.h> +#include <asm/hypervisor.h> #define CPUID_VMWARE_INFO_LEAF 0x40000000 #define VMWARE_HYPERVISOR_MAGIC 0x564D5868 @@ -65,7 +65,7 @@ static unsigned long vmware_get_tsc_khz(void) return tsc_hz; } -void __init vmware_platform_setup(void) +static void __init vmware_platform_setup(void) { uint32_t eax, ebx, ecx, edx; @@ -83,26 +83,22 @@ void __init vmware_platform_setup(void) * serial key should be enough, as this will always have a VMware * specific string when running under VMware hypervisor. */ -int vmware_platform(void) +static bool __init vmware_platform(void) { if (cpu_has_hypervisor) { - unsigned int eax, ebx, ecx, edx; - char hyper_vendor_id[13]; - - cpuid(CPUID_VMWARE_INFO_LEAF, &eax, &ebx, &ecx, &edx); - memcpy(hyper_vendor_id + 0, &ebx, 4); - memcpy(hyper_vendor_id + 4, &ecx, 4); - memcpy(hyper_vendor_id + 8, &edx, 4); - hyper_vendor_id[12] = '\0'; - if (!strcmp(hyper_vendor_id, "VMwareVMware")) - return 1; + unsigned int eax; + unsigned int hyper_vendor_id[3]; + + cpuid(CPUID_VMWARE_INFO_LEAF, &eax, &hyper_vendor_id[0], + &hyper_vendor_id[1], &hyper_vendor_id[2]); + if (!memcmp(hyper_vendor_id, "VMwareVMware", 12)) + return true; } else if (dmi_available && dmi_name_in_serial("VMware") && __vmware_platform()) - return 1; + return true; - return 0; + return false; } -EXPORT_SYMBOL(vmware_platform); /* * VMware hypervisor takes care of exporting a reliable TSC to the guest. @@ -116,8 +112,16 @@ EXPORT_SYMBOL(vmware_platform); * so that the kernel could just trust the hypervisor with providing a * reliable virtual TSC that is suitable for timekeeping. */ -void __cpuinit vmware_set_feature_bits(struct cpuinfo_x86 *c) +static void __cpuinit vmware_set_cpu_features(struct cpuinfo_x86 *c) { set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC); set_cpu_cap(c, X86_FEATURE_TSC_RELIABLE); } + +const __refconst struct hypervisor_x86 x86_hyper_vmware = { + .name = "VMware", + .detect = vmware_platform, + .set_cpu_features = vmware_set_cpu_features, + .init_platform = vmware_platform_setup, +}; +EXPORT_SYMBOL(x86_hyper_vmware); diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S index 44a8e0d..cd49141 100644 --- a/arch/x86/kernel/entry_32.S +++ b/arch/x86/kernel/entry_32.S @@ -53,6 +53,7 @@ #include <asm/processor-flags.h> #include <asm/ftrace.h> #include <asm/irq_vectors.h> +#include <asm/cpufeature.h> /* Avoid __ASSEMBLER__'ifying <linux/audit.h> just for this. */ #include <linux/elf-em.h> @@ -905,7 +906,25 @@ ENTRY(simd_coprocessor_error) RING0_INT_FRAME pushl $0 CFI_ADJUST_CFA_OFFSET 4 +#ifdef CONFIG_X86_INVD_BUG + /* AMD 486 bug: invd from userspace calls exception 19 instead of #GP */ +661: pushl $do_general_protection +662: +.section .altinstructions,"a" + .balign 4 + .long 661b + .long 663f + .byte X86_FEATURE_XMM + .byte 662b-661b + .byte 664f-663f +.previous +.section .altinstr_replacement,"ax" +663: pushl $do_simd_coprocessor_error +664: +.previous +#else pushl $do_simd_coprocessor_error +#endif CFI_ADJUST_CFA_OFFSET 4 jmp error_code CFI_ENDPROC diff --git a/arch/x86/kernel/i387.c b/arch/x86/kernel/i387.c index 54c31c2..86cef6b 100644 --- a/arch/x86/kernel/i387.c +++ b/arch/x86/kernel/i387.c @@ -102,65 +102,62 @@ void __cpuinit fpu_init(void) mxcsr_feature_mask_init(); /* clean state in init */ - if (cpu_has_xsave) - current_thread_info()->status = TS_XSAVE; - else - current_thread_info()->status = 0; + current_thread_info()->status = 0; clear_used_math(); } #endif /* CONFIG_X86_64 */ -/* - * The _current_ task is using the FPU for the first time - * so initialize it and set the mxcsr to its default - * value at reset if we support XMM instructions and then - * remeber the current task has used the FPU. - */ -int init_fpu(struct task_struct *tsk) +static void fpu_finit(struct fpu *fpu) { - if (tsk_used_math(tsk)) { - if (HAVE_HWFP && tsk == current) - unlazy_fpu(tsk); - return 0; - } - - /* - * Memory allocation at the first usage of the FPU and other state. - */ - if (!tsk->thread.xstate) { - tsk->thread.xstate = kmem_cache_alloc(task_xstate_cachep, - GFP_KERNEL); - if (!tsk->thread.xstate) - return -ENOMEM; - } - #ifdef CONFIG_X86_32 if (!HAVE_HWFP) { - memset(tsk->thread.xstate, 0, xstate_size); - finit_task(tsk); - set_stopped_child_used_math(tsk); - return 0; + finit_soft_fpu(&fpu->state->soft); + return; } #endif if (cpu_has_fxsr) { - struct i387_fxsave_struct *fx = &tsk->thread.xstate->fxsave; + struct i387_fxsave_struct *fx = &fpu->state->fxsave; memset(fx, 0, xstate_size); fx->cwd = 0x37f; if (cpu_has_xmm) fx->mxcsr = MXCSR_DEFAULT; } else { - struct i387_fsave_struct *fp = &tsk->thread.xstate->fsave; + struct i387_fsave_struct *fp = &fpu->state->fsave; memset(fp, 0, xstate_size); fp->cwd = 0xffff037fu; fp->swd = 0xffff0000u; fp->twd = 0xffffffffu; fp->fos = 0xffff0000u; } +} + +/* + * The _current_ task is using the FPU for the first time + * so initialize it and set the mxcsr to its default + * value at reset if we support XMM instructions and then + * remeber the current task has used the FPU. + */ +int init_fpu(struct task_struct *tsk) +{ + int ret; + + if (tsk_used_math(tsk)) { + if (HAVE_HWFP && tsk == current) + unlazy_fpu(tsk); + return 0; + } + /* - * Only the device not available exception or ptrace can call init_fpu. + * Memory allocation at the first usage of the FPU and other state. */ + ret = fpu_alloc(&tsk->thread.fpu); + if (ret) + return ret; + + fpu_finit(&tsk->thread.fpu); + set_stopped_child_used_math(tsk); return 0; } @@ -194,7 +191,7 @@ int xfpregs_get(struct task_struct *target, const struct user_regset *regset, return ret; return user_regset_copyout(&pos, &count, &kbuf, &ubuf, - &target->thread.xstate->fxsave, 0, -1); + &target->thread.fpu.state->fxsave, 0, -1); } int xfpregs_set(struct task_struct *target, const struct user_regset *regset, @@ -211,19 +208,19 @@ int xfpregs_set(struct task_struct *target, const struct user_regset *regset, return ret; ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, - &target->thread.xstate->fxsave, 0, -1); + &target->thread.fpu.state->fxsave, 0, -1); /* * mxcsr reserved bits must be masked to zero for security reasons. */ - target->thread.xstate->fxsave.mxcsr &= mxcsr_feature_mask; + target->thread.fpu.state->fxsave.mxcsr &= mxcsr_feature_mask; /* * update the header bits in the xsave header, indicating the * presence of FP and SSE state. */ if (cpu_has_xsave) - target->thread.xstate->xsave.xsave_hdr.xstate_bv |= XSTATE_FPSSE; + target->thread.fpu.state->xsave.xsave_hdr.xstate_bv |= XSTATE_FPSSE; return ret; } @@ -246,14 +243,14 @@ int xstateregs_get(struct task_struct *target, const struct user_regset *regset, * memory layout in the thread struct, so that we can copy the entire * xstateregs to the user using one user_regset_copyout(). */ - memcpy(&target->thread.xstate->fxsave.sw_reserved, + memcpy(&target->thread.fpu.state->fxsave.sw_reserved, xstate_fx_sw_bytes, sizeof(xstate_fx_sw_bytes)); /* * Copy the xstate memory layout. */ ret = user_regset_copyout(&pos, &count, &kbuf, &ubuf, - &target->thread.xstate->xsave, 0, -1); + &target->thread.fpu.state->xsave, 0, -1); return ret; } @@ -272,14 +269,14 @@ int xstateregs_set(struct task_struct *target, const struct user_regset *regset, return ret; ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, - &target->thread.xstate->xsave, 0, -1); + &target->thread.fpu.state->xsave, 0, -1); /* * mxcsr reserved bits must be masked to zero for security reasons. */ - target->thread.xstate->fxsave.mxcsr &= mxcsr_feature_mask; + target->thread.fpu.state->fxsave.mxcsr &= mxcsr_feature_mask; - xsave_hdr = &target->thread.xstate->xsave.xsave_hdr; + xsave_hdr = &target->thread.fpu.state->xsave.xsave_hdr; xsave_hdr->xstate_bv &= pcntxt_mask; /* @@ -365,7 +362,7 @@ static inline u32 twd_fxsr_to_i387(struct i387_fxsave_struct *fxsave) static void convert_from_fxsr(struct user_i387_ia32_struct *env, struct task_struct *tsk) { - struct i387_fxsave_struct *fxsave = &tsk->thread.xstate->fxsave; + struct i387_fxsave_struct *fxsave = &tsk->thread.fpu.state->fxsave; struct _fpreg *to = (struct _fpreg *) &env->st_space[0]; struct _fpxreg *from = (struct _fpxreg *) &fxsave->st_space[0]; int i; @@ -405,7 +402,7 @@ static void convert_to_fxsr(struct task_struct *tsk, const struct user_i387_ia32_struct *env) { - struct i387_fxsave_struct *fxsave = &tsk->thread.xstate->fxsave; + struct i387_fxsave_struct *fxsave = &tsk->thread.fpu.state->fxsave; struct _fpreg *from = (struct _fpreg *) &env->st_space[0]; struct _fpxreg *to = (struct _fpxreg *) &fxsave->st_space[0]; int i; @@ -445,7 +442,7 @@ int fpregs_get(struct task_struct *target, const struct user_regset *regset, if (!cpu_has_fxsr) { return user_regset_copyout(&pos, &count, &kbuf, &ubuf, - &target->thread.xstate->fsave, 0, + &target->thread.fpu.state->fsave, 0, -1); } @@ -475,7 +472,7 @@ int fpregs_set(struct task_struct *target, const struct user_regset *regset, if (!cpu_has_fxsr) { return user_regset_copyin(&pos, &count, &kbuf, &ubuf, - &target->thread.xstate->fsave, 0, -1); + &target->thread.fpu.state->fsave, 0, -1); } if (pos > 0 || count < sizeof(env)) @@ -490,7 +487,7 @@ int fpregs_set(struct task_struct *target, const struct user_regset *regset, * presence of FP. */ if (cpu_has_xsave) - target->thread.xstate->xsave.xsave_hdr.xstate_bv |= XSTATE_FP; + target->thread.fpu.state->xsave.xsave_hdr.xstate_bv |= XSTATE_FP; return ret; } @@ -501,7 +498,7 @@ int fpregs_set(struct task_struct *target, const struct user_regset *regset, static inline int save_i387_fsave(struct _fpstate_ia32 __user *buf) { struct task_struct *tsk = current; - struct i387_fsave_struct *fp = &tsk->thread.xstate->fsave; + struct i387_fsave_struct *fp = &tsk->thread.fpu.state->fsave; fp->status = fp->swd; if (__copy_to_user(buf, fp, sizeof(struct i387_fsave_struct))) @@ -512,7 +509,7 @@ static inline int save_i387_fsave(struct _fpstate_ia32 __user *buf) static int save_i387_fxsave(struct _fpstate_ia32 __user *buf) { struct task_struct *tsk = current; - struct i387_fxsave_struct *fx = &tsk->thread.xstate->fxsave; + struct i387_fxsave_struct *fx = &tsk->thread.fpu.state->fxsave; struct user_i387_ia32_struct env; int err = 0; @@ -547,7 +544,7 @@ static int save_i387_xsave(void __user *buf) * header as well as change any contents in the memory layout. * xrestore as part of sigreturn will capture all the changes. */ - tsk->thread.xstate->xsave.xsave_hdr.xstate_bv |= XSTATE_FPSSE; + tsk->thread.fpu.state->xsave.xsave_hdr.xstate_bv |= XSTATE_FPSSE; if (save_i387_fxsave(fx) < 0) return -1; @@ -599,7 +596,7 @@ static inline int restore_i387_fsave(struct _fpstate_ia32 __user *buf) { struct task_struct *tsk = current; - return __copy_from_user(&tsk->thread.xstate->fsave, buf, + return __copy_from_user(&tsk->thread.fpu.state->fsave, buf, sizeof(struct i387_fsave_struct)); } @@ -610,10 +607,10 @@ static int restore_i387_fxsave(struct _fpstate_ia32 __user *buf, struct user_i387_ia32_struct env; int err; - err = __copy_from_user(&tsk->thread.xstate->fxsave, &buf->_fxsr_env[0], + err = __copy_from_user(&tsk->thread.fpu.state->fxsave, &buf->_fxsr_env[0], size); /* mxcsr reserved bits must be masked to zero for security reasons */ - tsk->thread.xstate->fxsave.mxcsr &= mxcsr_feature_mask; + tsk->thread.fpu.state->fxsave.mxcsr &= mxcsr_feature_mask; if (err || __copy_from_user(&env, buf, sizeof(env))) return 1; convert_to_fxsr(tsk, &env); @@ -629,7 +626,7 @@ static int restore_i387_xsave(void __user *buf) struct i387_fxsave_struct __user *fx = (struct i387_fxsave_struct __user *) &fx_user->_fxsr_env[0]; struct xsave_hdr_struct *xsave_hdr = - ¤t->thread.xstate->xsave.xsave_hdr; + ¤t->thread.fpu.state->xsave.xsave_hdr; u64 mask; int err; diff --git a/arch/x86/kernel/i8253.c b/arch/x86/kernel/i8253.c index 23c1679..2dfd315 100644 --- a/arch/x86/kernel/i8253.c +++ b/arch/x86/kernel/i8253.c @@ -16,7 +16,7 @@ #include <asm/hpet.h> #include <asm/smp.h> -DEFINE_SPINLOCK(i8253_lock); +DEFINE_RAW_SPINLOCK(i8253_lock); EXPORT_SYMBOL(i8253_lock); /* @@ -33,7 +33,7 @@ struct clock_event_device *global_clock_event; static void init_pit_timer(enum clock_event_mode mode, struct clock_event_device *evt) { - spin_lock(&i8253_lock); + raw_spin_lock(&i8253_lock); switch (mode) { case CLOCK_EVT_MODE_PERIODIC: @@ -62,7 +62,7 @@ static void init_pit_timer(enum clock_event_mode mode, /* Nothing to do here */ break; } - spin_unlock(&i8253_lock); + raw_spin_unlock(&i8253_lock); } /* @@ -72,10 +72,10 @@ static void init_pit_timer(enum clock_event_mode mode, */ static int pit_next_event(unsigned long delta, struct clock_event_device *evt) { - spin_lock(&i8253_lock); + raw_spin_lock(&i8253_lock); outb_pit(delta & 0xff , PIT_CH0); /* LSB */ outb_pit(delta >> 8 , PIT_CH0); /* MSB */ - spin_unlock(&i8253_lock); + raw_spin_unlock(&i8253_lock); return 0; } @@ -130,7 +130,7 @@ static cycle_t pit_read(struct clocksource *cs) int count; u32 jifs; - spin_lock_irqsave(&i8253_lock, flags); + raw_spin_lock_irqsave(&i8253_lock, flags); /* * Although our caller may have the read side of xtime_lock, * this is now a seqlock, and we are cheating in this routine @@ -176,7 +176,7 @@ static cycle_t pit_read(struct clocksource *cs) old_count = count; old_jifs = jifs; - spin_unlock_irqrestore(&i8253_lock, flags); + raw_spin_unlock_irqrestore(&i8253_lock, flags); count = (LATCH - 1) - count; diff --git a/arch/x86/kernel/irqinit.c b/arch/x86/kernel/irqinit.c index 0ed2d30..990ae7c 100644 --- a/arch/x86/kernel/irqinit.c +++ b/arch/x86/kernel/irqinit.c @@ -60,7 +60,7 @@ static irqreturn_t math_error_irq(int cpl, void *dev_id) outb(0, 0xF0); if (ignore_fpu_irq || !boot_cpu_data.hard_math) return IRQ_NONE; - math_error((void __user *)get_irq_regs()->ip); + math_error(get_irq_regs(), 0, 16); return IRQ_HANDLED; } diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c index f2f56c0..345a4b1 100644 --- a/arch/x86/kernel/kprobes.c +++ b/arch/x86/kernel/kprobes.c @@ -542,20 +542,6 @@ static int __kprobes kprobe_handler(struct pt_regs *regs) struct kprobe_ctlblk *kcb; addr = (kprobe_opcode_t *)(regs->ip - sizeof(kprobe_opcode_t)); - if (*addr != BREAKPOINT_INSTRUCTION) { - /* - * The breakpoint instruction was removed right - * after we hit it. Another cpu has removed - * either a probepoint or a debugger breakpoint - * at this address. In either case, no further - * handling of this interrupt is appropriate. - * Back up over the (now missing) int3 and run - * the original instruction. - */ - regs->ip = (unsigned long)addr; - return 1; - } - /* * We don't want to be preempted for the entire * duration of kprobe processing. We conditionally @@ -587,6 +573,19 @@ static int __kprobes kprobe_handler(struct pt_regs *regs) setup_singlestep(p, regs, kcb, 0); return 1; } + } else if (*addr != BREAKPOINT_INSTRUCTION) { + /* + * The breakpoint instruction was removed right + * after we hit it. Another cpu has removed + * either a probepoint or a debugger breakpoint + * at this address. In either case, no further + * handling of this interrupt is appropriate. + * Back up over the (now missing) int3 and run + * the original instruction. + */ + regs->ip = (unsigned long)addr; + preempt_enable_no_resched(); + return 1; } else if (kprobe_running()) { p = __get_cpu_var(current_kprobe); if (p->break_handler && p->break_handler(p, regs)) { diff --git a/arch/x86/kernel/microcode_core.c b/arch/x86/kernel/microcode_core.c index cceb5bc..2cd8c54 100644 --- a/arch/x86/kernel/microcode_core.c +++ b/arch/x86/kernel/microcode_core.c @@ -201,9 +201,9 @@ static int do_microcode_update(const void __user *buf, size_t size) return error; } -static int microcode_open(struct inode *unused1, struct file *unused2) +static int microcode_open(struct inode *inode, struct file *file) { - return capable(CAP_SYS_RAWIO) ? 0 : -EPERM; + return capable(CAP_SYS_RAWIO) ? nonseekable_open(inode, file) : -EPERM; } static ssize_t microcode_write(struct file *file, const char __user *buf, diff --git a/arch/x86/kernel/microcode_intel.c b/arch/x86/kernel/microcode_intel.c index 85a343e..3561702 100644 --- a/arch/x86/kernel/microcode_intel.c +++ b/arch/x86/kernel/microcode_intel.c @@ -343,10 +343,11 @@ static enum ucode_state generic_load_microcode(int cpu, void *data, size_t size, int (*get_ucode_data)(void *, const void *, size_t)) { struct ucode_cpu_info *uci = ucode_cpu_info + cpu; - u8 *ucode_ptr = data, *new_mc = NULL, *mc; + u8 *ucode_ptr = data, *new_mc = NULL, *mc = NULL; int new_rev = uci->cpu_sig.rev; unsigned int leftover = size; enum ucode_state state = UCODE_OK; + unsigned int curr_mc_size = 0; while (leftover) { struct microcode_header_intel mc_header; @@ -361,9 +362,15 @@ static enum ucode_state generic_load_microcode(int cpu, void *data, size_t size, break; } - mc = vmalloc(mc_size); - if (!mc) - break; + /* For performance reasons, reuse mc area when possible */ + if (!mc || mc_size > curr_mc_size) { + if (mc) + vfree(mc); + mc = vmalloc(mc_size); + if (!mc) + break; + curr_mc_size = mc_size; + } if (get_ucode_data(mc, ucode_ptr, mc_size) || microcode_sanity_check(mc) < 0) { @@ -376,13 +383,16 @@ static enum ucode_state generic_load_microcode(int cpu, void *data, size_t size, vfree(new_mc); new_rev = mc_header.rev; new_mc = mc; - } else - vfree(mc); + mc = NULL; /* trigger new vmalloc */ + } ucode_ptr += mc_size; leftover -= mc_size; } + if (mc) + vfree(mc); + if (leftover) { if (new_mc) vfree(new_mc); diff --git a/arch/x86/kernel/mpparse.c b/arch/x86/kernel/mpparse.c index e81030f..5ae5d24 100644 --- a/arch/x86/kernel/mpparse.c +++ b/arch/x86/kernel/mpparse.c @@ -115,21 +115,6 @@ static void __init MP_bus_info(struct mpc_bus *m) printk(KERN_WARNING "Unknown bustype %s - ignoring\n", str); } -static int bad_ioapic(unsigned long address) -{ - if (nr_ioapics >= MAX_IO_APICS) { - printk(KERN_ERR "ERROR: Max # of I/O APICs (%d) exceeded " - "(found %d)\n", MAX_IO_APICS, nr_ioapics); - panic("Recompile kernel with bigger MAX_IO_APICS!\n"); - } - if (!address) { - printk(KERN_ERR "WARNING: Bogus (zero) I/O APIC address" - " found in table, skipping!\n"); - return 1; - } - return 0; -} - static void __init MP_ioapic_info(struct mpc_ioapic *m) { if (!(m->flags & MPC_APIC_USABLE)) @@ -138,15 +123,7 @@ static void __init MP_ioapic_info(struct mpc_ioapic *m) printk(KERN_INFO "I/O APIC #%d Version %d at 0x%X.\n", m->apicid, m->apicver, m->apicaddr); - if (bad_ioapic(m->apicaddr)) - return; - - mp_ioapics[nr_ioapics].apicaddr = m->apicaddr; - mp_ioapics[nr_ioapics].apicid = m->apicid; - mp_ioapics[nr_ioapics].type = m->type; - mp_ioapics[nr_ioapics].apicver = m->apicver; - mp_ioapics[nr_ioapics].flags = m->flags; - nr_ioapics++; + mp_register_ioapic(m->apicid, m->apicaddr, gsi_end + 1); } static void print_MP_intsrc_info(struct mpc_intsrc *m) diff --git a/arch/x86/kernel/mrst.c b/arch/x86/kernel/mrst.c index 0aad867..e796448 100644 --- a/arch/x86/kernel/mrst.c +++ b/arch/x86/kernel/mrst.c @@ -237,4 +237,9 @@ void __init x86_mrst_early_setup(void) x86_init.pci.fixup_irqs = x86_init_noop; legacy_pic = &null_legacy_pic; + + /* Avoid searching for BIOS MP tables */ + x86_init.mpparse.find_smp_config = x86_init_noop; + x86_init.mpparse.get_smp_config = x86_init_uint_noop; + } diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index eccdb57..e7e3521 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -31,24 +31,22 @@ struct kmem_cache *task_xstate_cachep; int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src) { + int ret; + *dst = *src; - if (src->thread.xstate) { - dst->thread.xstate = kmem_cache_alloc(task_xstate_cachep, - GFP_KERNEL); - if (!dst->thread.xstate) - return -ENOMEM; - WARN_ON((unsigned long)dst->thread.xstate & 15); - memcpy(dst->thread.xstate, src->thread.xstate, xstate_size); + if (fpu_allocated(&src->thread.fpu)) { + memset(&dst->thread.fpu, 0, sizeof(dst->thread.fpu)); + ret = fpu_alloc(&dst->thread.fpu); + if (ret) + return ret; + fpu_copy(&dst->thread.fpu, &src->thread.fpu); } return 0; } void free_thread_xstate(struct task_struct *tsk) { - if (tsk->thread.xstate) { - kmem_cache_free(task_xstate_cachep, tsk->thread.xstate); - tsk->thread.xstate = NULL; - } + fpu_free(&tsk->thread.fpu); } void free_thread_info(struct thread_info *ti) @@ -548,11 +546,13 @@ static int __cpuinit check_c1e_idle(const struct cpuinfo_x86 *c) * check OSVW bit for CPUs that are not affected * by erratum #400 */ - rdmsrl(MSR_AMD64_OSVW_ID_LENGTH, val); - if (val >= 2) { - rdmsrl(MSR_AMD64_OSVW_STATUS, val); - if (!(val & BIT(1))) - goto no_c1e_idle; + if (cpu_has(c, X86_FEATURE_OSVW)) { + rdmsrl(MSR_AMD64_OSVW_ID_LENGTH, val); + if (val >= 2) { + rdmsrl(MSR_AMD64_OSVW_STATUS, val); + if (!(val & BIT(1))) + goto no_c1e_idle; + } } return 1; } diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c index 75090c5..8d12878 100644 --- a/arch/x86/kernel/process_32.c +++ b/arch/x86/kernel/process_32.c @@ -309,7 +309,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p) /* we're going to use this soon, after a few expensive things */ if (preload_fpu) - prefetch(next->xstate); + prefetch(next->fpu.state); /* * Reload esp0. diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c index 50cc84a..3c2422a 100644 --- a/arch/x86/kernel/process_64.c +++ b/arch/x86/kernel/process_64.c @@ -388,7 +388,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p) /* we're going to use this soon, after a few expensive things */ if (preload_fpu) - prefetch(next->xstate); + prefetch(next->fpu.state); /* * Reload esp0, LDT and the page table pointer: diff --git a/arch/x86/kernel/sfi.c b/arch/x86/kernel/sfi.c index 34e0993..7ded578 100644 --- a/arch/x86/kernel/sfi.c +++ b/arch/x86/kernel/sfi.c @@ -81,7 +81,6 @@ static int __init sfi_parse_cpus(struct sfi_table_header *table) #endif /* CONFIG_X86_LOCAL_APIC */ #ifdef CONFIG_X86_IO_APIC -static u32 gsi_base; static int __init sfi_parse_ioapic(struct sfi_table_header *table) { @@ -94,8 +93,7 @@ static int __init sfi_parse_ioapic(struct sfi_table_header *table) pentry = (struct sfi_apic_table_entry *)sb->pentry; for (i = 0; i < num; i++) { - mp_register_ioapic(i, pentry->phys_addr, gsi_base); - gsi_base += io_apic_get_redir_entries(i); + mp_register_ioapic(i, pentry->phys_addr, gsi_end + 1); pentry++; } diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c index 86c9f91..cc2c604 100644 --- a/arch/x86/kernel/tboot.c +++ b/arch/x86/kernel/tboot.c @@ -175,6 +175,9 @@ static void add_mac_region(phys_addr_t start, unsigned long size) struct tboot_mac_region *mr; phys_addr_t end = start + size; + if (tboot->num_mac_regions >= MAX_TB_MAC_REGIONS) + panic("tboot: Too many MAC regions\n"); + if (start && size) { mr = &tboot->mac_regions[tboot->num_mac_regions++]; mr->start = round_down(start, PAGE_SIZE); @@ -184,18 +187,17 @@ static void add_mac_region(phys_addr_t start, unsigned long size) static int tboot_setup_sleep(void) { + int i; + tboot->num_mac_regions = 0; - /* S3 resume code */ - add_mac_region(acpi_wakeup_address, WAKEUP_SIZE); + for (i = 0; i < e820.nr_map; i++) { + if ((e820.map[i].type != E820_RAM) + && (e820.map[i].type != E820_RESERVED_KERN)) + continue; -#ifdef CONFIG_X86_TRAMPOLINE - /* AP trampoline code */ - add_mac_region(virt_to_phys(trampoline_base), TRAMPOLINE_SIZE); -#endif - - /* kernel code + data + bss */ - add_mac_region(virt_to_phys(_text), _end - _text); + add_mac_region(e820.map[i].addr, e820.map[i].size); + } tboot->acpi_sinfo.kernel_s3_resume_vector = acpi_wakeup_address; diff --git a/arch/x86/kernel/tlb_uv.c b/arch/x86/kernel/tlb_uv.c index 17b03dd..7fea555 100644 --- a/arch/x86/kernel/tlb_uv.c +++ b/arch/x86/kernel/tlb_uv.c @@ -1,7 +1,7 @@ /* * SGI UltraViolet TLB flush routines. * - * (c) 2008 Cliff Wickman <cpw@sgi.com>, SGI. + * (c) 2008-2010 Cliff Wickman <cpw@sgi.com>, SGI. * * This code is released under the GNU General Public License version 2 or * later. @@ -20,42 +20,67 @@ #include <asm/idle.h> #include <asm/tsc.h> #include <asm/irq_vectors.h> +#include <asm/timer.h> -static struct bau_control **uv_bau_table_bases __read_mostly; -static int uv_bau_retry_limit __read_mostly; +struct msg_desc { + struct bau_payload_queue_entry *msg; + int msg_slot; + int sw_ack_slot; + struct bau_payload_queue_entry *va_queue_first; + struct bau_payload_queue_entry *va_queue_last; +}; -/* base pnode in this partition */ -static int uv_partition_base_pnode __read_mostly; +#define UV_INTD_SOFT_ACK_TIMEOUT_PERIOD 0x000000000bUL + +static int uv_bau_max_concurrent __read_mostly; + +static int nobau; +static int __init setup_nobau(char *arg) +{ + nobau = 1; + return 0; +} +early_param("nobau", setup_nobau); -static unsigned long uv_mmask __read_mostly; +/* base pnode in this partition */ +static int uv_partition_base_pnode __read_mostly; +/* position of pnode (which is nasid>>1): */ +static int uv_nshift __read_mostly; +static unsigned long uv_mmask __read_mostly; static DEFINE_PER_CPU(struct ptc_stats, ptcstats); static DEFINE_PER_CPU(struct bau_control, bau_control); +static DEFINE_PER_CPU(cpumask_var_t, uv_flush_tlb_mask); + +struct reset_args { + int sender; +}; /* - * Determine the first node on a blade. + * Determine the first node on a uvhub. 'Nodes' are used for kernel + * memory allocation. */ -static int __init blade_to_first_node(int blade) +static int __init uvhub_to_first_node(int uvhub) { int node, b; for_each_online_node(node) { b = uv_node_to_blade_id(node); - if (blade == b) + if (uvhub == b) return node; } - return -1; /* shouldn't happen */ + return -1; } /* - * Determine the apicid of the first cpu on a blade. + * Determine the apicid of the first cpu on a uvhub. */ -static int __init blade_to_first_apicid(int blade) +static int __init uvhub_to_first_apicid(int uvhub) { int cpu; for_each_present_cpu(cpu) - if (blade == uv_cpu_to_blade_id(cpu)) + if (uvhub == uv_cpu_to_blade_id(cpu)) return per_cpu(x86_cpu_to_apicid, cpu); return -1; } @@ -68,195 +93,459 @@ static int __init blade_to_first_apicid(int blade) * clear of the Timeout bit (as well) will free the resource. No reply will * be sent (the hardware will only do one reply per message). */ -static void uv_reply_to_message(int resource, - struct bau_payload_queue_entry *msg, - struct bau_msg_status *msp) +static inline void uv_reply_to_message(struct msg_desc *mdp, + struct bau_control *bcp) { unsigned long dw; + struct bau_payload_queue_entry *msg; - dw = (1 << (resource + UV_SW_ACK_NPENDING)) | (1 << resource); + msg = mdp->msg; + if (!msg->canceled) { + dw = (msg->sw_ack_vector << UV_SW_ACK_NPENDING) | + msg->sw_ack_vector; + uv_write_local_mmr( + UVH_LB_BAU_INTD_SOFTWARE_ACKNOWLEDGE_ALIAS, dw); + } msg->replied_to = 1; msg->sw_ack_vector = 0; - if (msp) - msp->seen_by.bits = 0; - uv_write_local_mmr(UVH_LB_BAU_INTD_SOFTWARE_ACKNOWLEDGE_ALIAS, dw); } /* - * Do all the things a cpu should do for a TLB shootdown message. - * Other cpu's may come here at the same time for this message. + * Process the receipt of a RETRY message */ -static void uv_bau_process_message(struct bau_payload_queue_entry *msg, - int msg_slot, int sw_ack_slot) +static inline void uv_bau_process_retry_msg(struct msg_desc *mdp, + struct bau_control *bcp) { - unsigned long this_cpu_mask; - struct bau_msg_status *msp; - int cpu; + int i; + int cancel_count = 0; + int slot2; + unsigned long msg_res; + unsigned long mmr = 0; + struct bau_payload_queue_entry *msg; + struct bau_payload_queue_entry *msg2; + struct ptc_stats *stat; - msp = __get_cpu_var(bau_control).msg_statuses + msg_slot; - cpu = uv_blade_processor_id(); - msg->number_of_cpus = - uv_blade_nr_online_cpus(uv_node_to_blade_id(numa_node_id())); - this_cpu_mask = 1UL << cpu; - if (msp->seen_by.bits & this_cpu_mask) - return; - atomic_or_long(&msp->seen_by.bits, this_cpu_mask); + msg = mdp->msg; + stat = &per_cpu(ptcstats, bcp->cpu); + stat->d_retries++; + /* + * cancel any message from msg+1 to the retry itself + */ + for (msg2 = msg+1, i = 0; i < DEST_Q_SIZE; msg2++, i++) { + if (msg2 > mdp->va_queue_last) + msg2 = mdp->va_queue_first; + if (msg2 == msg) + break; + + /* same conditions for cancellation as uv_do_reset */ + if ((msg2->replied_to == 0) && (msg2->canceled == 0) && + (msg2->sw_ack_vector) && ((msg2->sw_ack_vector & + msg->sw_ack_vector) == 0) && + (msg2->sending_cpu == msg->sending_cpu) && + (msg2->msg_type != MSG_NOOP)) { + slot2 = msg2 - mdp->va_queue_first; + mmr = uv_read_local_mmr + (UVH_LB_BAU_INTD_SOFTWARE_ACKNOWLEDGE); + msg_res = ((msg2->sw_ack_vector << 8) | + msg2->sw_ack_vector); + /* + * This is a message retry; clear the resources held + * by the previous message only if they timed out. + * If it has not timed out we have an unexpected + * situation to report. + */ + if (mmr & (msg_res << 8)) { + /* + * is the resource timed out? + * make everyone ignore the cancelled message. + */ + msg2->canceled = 1; + stat->d_canceled++; + cancel_count++; + uv_write_local_mmr( + UVH_LB_BAU_INTD_SOFTWARE_ACKNOWLEDGE_ALIAS, + (msg_res << 8) | msg_res); + } else + printk(KERN_INFO "note bau retry: no effect\n"); + } + } + if (!cancel_count) + stat->d_nocanceled++; +} - if (msg->replied_to == 1) - return; +/* + * Do all the things a cpu should do for a TLB shootdown message. + * Other cpu's may come here at the same time for this message. + */ +static void uv_bau_process_message(struct msg_desc *mdp, + struct bau_control *bcp) +{ + int msg_ack_count; + short socket_ack_count = 0; + struct ptc_stats *stat; + struct bau_payload_queue_entry *msg; + struct bau_control *smaster = bcp->socket_master; + /* + * This must be a normal message, or retry of a normal message + */ + msg = mdp->msg; + stat = &per_cpu(ptcstats, bcp->cpu); if (msg->address == TLB_FLUSH_ALL) { local_flush_tlb(); - __get_cpu_var(ptcstats).alltlb++; + stat->d_alltlb++; } else { __flush_tlb_one(msg->address); - __get_cpu_var(ptcstats).onetlb++; + stat->d_onetlb++; } + stat->d_requestee++; + + /* + * One cpu on each uvhub has the additional job on a RETRY + * of releasing the resource held by the message that is + * being retried. That message is identified by sending + * cpu number. + */ + if (msg->msg_type == MSG_RETRY && bcp == bcp->uvhub_master) + uv_bau_process_retry_msg(mdp, bcp); - __get_cpu_var(ptcstats).requestee++; + /* + * This is a sw_ack message, so we have to reply to it. + * Count each responding cpu on the socket. This avoids + * pinging the count's cache line back and forth between + * the sockets. + */ + socket_ack_count = atomic_add_short_return(1, (struct atomic_short *) + &smaster->socket_acknowledge_count[mdp->msg_slot]); + if (socket_ack_count == bcp->cpus_in_socket) { + /* + * Both sockets dump their completed count total into + * the message's count. + */ + smaster->socket_acknowledge_count[mdp->msg_slot] = 0; + msg_ack_count = atomic_add_short_return(socket_ack_count, + (struct atomic_short *)&msg->acknowledge_count); + + if (msg_ack_count == bcp->cpus_in_uvhub) { + /* + * All cpus in uvhub saw it; reply + */ + uv_reply_to_message(mdp, bcp); + } + } - atomic_inc_short(&msg->acknowledge_count); - if (msg->number_of_cpus == msg->acknowledge_count) - uv_reply_to_message(sw_ack_slot, msg, msp); + return; } /* - * Examine the payload queue on one distribution node to see - * which messages have not been seen, and which cpu(s) have not seen them. + * Determine the first cpu on a uvhub. + */ +static int uvhub_to_first_cpu(int uvhub) +{ + int cpu; + for_each_present_cpu(cpu) + if (uvhub == uv_cpu_to_blade_id(cpu)) + return cpu; + return -1; +} + +/* + * Last resort when we get a large number of destination timeouts is + * to clear resources held by a given cpu. + * Do this with IPI so that all messages in the BAU message queue + * can be identified by their nonzero sw_ack_vector field. * - * Returns the number of cpu's that have not responded. + * This is entered for a single cpu on the uvhub. + * The sender want's this uvhub to free a specific message's + * sw_ack resources. */ -static int uv_examine_destination(struct bau_control *bau_tablesp, int sender) +static void +uv_do_reset(void *ptr) { - struct bau_payload_queue_entry *msg; - struct bau_msg_status *msp; - int count = 0; int i; - int j; + int slot; + int count = 0; + unsigned long mmr; + unsigned long msg_res; + struct bau_control *bcp; + struct reset_args *rap; + struct bau_payload_queue_entry *msg; + struct ptc_stats *stat; - for (msg = bau_tablesp->va_queue_first, i = 0; i < DEST_Q_SIZE; - msg++, i++) { - if ((msg->sending_cpu == sender) && (!msg->replied_to)) { - msp = bau_tablesp->msg_statuses + i; - printk(KERN_DEBUG - "blade %d: address:%#lx %d of %d, not cpu(s): ", - i, msg->address, msg->acknowledge_count, - msg->number_of_cpus); - for (j = 0; j < msg->number_of_cpus; j++) { - if (!((1L << j) & msp->seen_by.bits)) { - count++; - printk("%d ", j); - } + bcp = &per_cpu(bau_control, smp_processor_id()); + rap = (struct reset_args *)ptr; + stat = &per_cpu(ptcstats, bcp->cpu); + stat->d_resets++; + + /* + * We're looking for the given sender, and + * will free its sw_ack resource. + * If all cpu's finally responded after the timeout, its + * message 'replied_to' was set. + */ + for (msg = bcp->va_queue_first, i = 0; i < DEST_Q_SIZE; msg++, i++) { + /* uv_do_reset: same conditions for cancellation as + uv_bau_process_retry_msg() */ + if ((msg->replied_to == 0) && + (msg->canceled == 0) && + (msg->sending_cpu == rap->sender) && + (msg->sw_ack_vector) && + (msg->msg_type != MSG_NOOP)) { + /* + * make everyone else ignore this message + */ + msg->canceled = 1; + slot = msg - bcp->va_queue_first; + count++; + /* + * only reset the resource if it is still pending + */ + mmr = uv_read_local_mmr + (UVH_LB_BAU_INTD_SOFTWARE_ACKNOWLEDGE); + msg_res = ((msg->sw_ack_vector << 8) | + msg->sw_ack_vector); + if (mmr & msg_res) { + stat->d_rcanceled++; + uv_write_local_mmr( + UVH_LB_BAU_INTD_SOFTWARE_ACKNOWLEDGE_ALIAS, + msg_res); } - printk("\n"); } } - return count; + return; } /* - * Examine the payload queue on all the distribution nodes to see - * which messages have not been seen, and which cpu(s) have not seen them. - * - * Returns the number of cpu's that have not responded. + * Use IPI to get all target uvhubs to release resources held by + * a given sending cpu number. */ -static int uv_examine_destinations(struct bau_target_nodemask *distribution) +static void uv_reset_with_ipi(struct bau_target_uvhubmask *distribution, + int sender) { - int sender; - int i; - int count = 0; + int uvhub; + int cpu; + cpumask_t mask; + struct reset_args reset_args; + + reset_args.sender = sender; - sender = smp_processor_id(); - for (i = 0; i < sizeof(struct bau_target_nodemask) * BITSPERBYTE; i++) { - if (!bau_node_isset(i, distribution)) + cpus_clear(mask); + /* find a single cpu for each uvhub in this distribution mask */ + for (uvhub = 0; + uvhub < sizeof(struct bau_target_uvhubmask) * BITSPERBYTE; + uvhub++) { + if (!bau_uvhub_isset(uvhub, distribution)) continue; - count += uv_examine_destination(uv_bau_table_bases[i], sender); + /* find a cpu for this uvhub */ + cpu = uvhub_to_first_cpu(uvhub); + cpu_set(cpu, mask); } - return count; + /* IPI all cpus; Preemption is already disabled */ + smp_call_function_many(&mask, uv_do_reset, (void *)&reset_args, 1); + return; +} + +static inline unsigned long +cycles_2_us(unsigned long long cyc) +{ + unsigned long long ns; + unsigned long us; + ns = (cyc * per_cpu(cyc2ns, smp_processor_id())) + >> CYC2NS_SCALE_FACTOR; + us = ns / 1000; + return us; } /* - * wait for completion of a broadcast message - * - * return COMPLETE, RETRY or GIVEUP + * wait for all cpus on this hub to finish their sends and go quiet + * leaves uvhub_quiesce set so that no new broadcasts are started by + * bau_flush_send_and_wait() + */ +static inline void +quiesce_local_uvhub(struct bau_control *hmaster) +{ + atomic_add_short_return(1, (struct atomic_short *) + &hmaster->uvhub_quiesce); +} + +/* + * mark this quiet-requestor as done + */ +static inline void +end_uvhub_quiesce(struct bau_control *hmaster) +{ + atomic_add_short_return(-1, (struct atomic_short *) + &hmaster->uvhub_quiesce); +} + +/* + * Wait for completion of a broadcast software ack message + * return COMPLETE, RETRY(PLUGGED or TIMEOUT) or GIVEUP */ static int uv_wait_completion(struct bau_desc *bau_desc, - unsigned long mmr_offset, int right_shift) + unsigned long mmr_offset, int right_shift, int this_cpu, + struct bau_control *bcp, struct bau_control *smaster, long try) { - int exams = 0; - long destination_timeouts = 0; - long source_timeouts = 0; + int relaxes = 0; unsigned long descriptor_status; + unsigned long mmr; + unsigned long mask; + cycles_t ttime; + cycles_t timeout_time; + struct ptc_stats *stat = &per_cpu(ptcstats, this_cpu); + struct bau_control *hmaster; + + hmaster = bcp->uvhub_master; + timeout_time = get_cycles() + bcp->timeout_interval; + /* spin on the status MMR, waiting for it to go idle */ while ((descriptor_status = (((unsigned long) uv_read_local_mmr(mmr_offset) >> right_shift) & UV_ACT_STATUS_MASK)) != DESC_STATUS_IDLE) { - if (descriptor_status == DESC_STATUS_SOURCE_TIMEOUT) { - source_timeouts++; - if (source_timeouts > SOURCE_TIMEOUT_LIMIT) - source_timeouts = 0; - __get_cpu_var(ptcstats).s_retry++; - return FLUSH_RETRY; - } /* - * spin here looking for progress at the destinations + * Our software ack messages may be blocked because there are + * no swack resources available. As long as none of them + * has timed out hardware will NACK our message and its + * state will stay IDLE. */ - if (descriptor_status == DESC_STATUS_DESTINATION_TIMEOUT) { - destination_timeouts++; - if (destination_timeouts > DESTINATION_TIMEOUT_LIMIT) { - /* - * returns number of cpus not responding - */ - if (uv_examine_destinations - (&bau_desc->distribution) == 0) { - __get_cpu_var(ptcstats).d_retry++; - return FLUSH_RETRY; - } - exams++; - if (exams >= uv_bau_retry_limit) { - printk(KERN_DEBUG - "uv_flush_tlb_others"); - printk("giving up on cpu %d\n", - smp_processor_id()); + if (descriptor_status == DESC_STATUS_SOURCE_TIMEOUT) { + stat->s_stimeout++; + return FLUSH_GIVEUP; + } else if (descriptor_status == + DESC_STATUS_DESTINATION_TIMEOUT) { + stat->s_dtimeout++; + ttime = get_cycles(); + + /* + * Our retries may be blocked by all destination + * swack resources being consumed, and a timeout + * pending. In that case hardware returns the + * ERROR that looks like a destination timeout. + */ + if (cycles_2_us(ttime - bcp->send_message) < BIOS_TO) { + bcp->conseccompletes = 0; + return FLUSH_RETRY_PLUGGED; + } + + bcp->conseccompletes = 0; + return FLUSH_RETRY_TIMEOUT; + } else { + /* + * descriptor_status is still BUSY + */ + cpu_relax(); + relaxes++; + if (relaxes >= 10000) { + relaxes = 0; + if (get_cycles() > timeout_time) { + quiesce_local_uvhub(hmaster); + + /* single-thread the register change */ + spin_lock(&hmaster->masks_lock); + mmr = uv_read_local_mmr(mmr_offset); + mask = 0UL; + mask |= (3UL < right_shift); + mask = ~mask; + mmr &= mask; + uv_write_local_mmr(mmr_offset, mmr); + spin_unlock(&hmaster->masks_lock); + end_uvhub_quiesce(hmaster); + stat->s_busy++; return FLUSH_GIVEUP; } - /* - * delays can hang the simulator - udelay(1000); - */ - destination_timeouts = 0; } } - cpu_relax(); } + bcp->conseccompletes++; return FLUSH_COMPLETE; } +static inline cycles_t +sec_2_cycles(unsigned long sec) +{ + unsigned long ns; + cycles_t cyc; + + ns = sec * 1000000000; + cyc = (ns << CYC2NS_SCALE_FACTOR)/(per_cpu(cyc2ns, smp_processor_id())); + return cyc; +} + +/* + * conditionally add 1 to *v, unless *v is >= u + * return 0 if we cannot add 1 to *v because it is >= u + * return 1 if we can add 1 to *v because it is < u + * the add is atomic + * + * This is close to atomic_add_unless(), but this allows the 'u' value + * to be lowered below the current 'v'. atomic_add_unless can only stop + * on equal. + */ +static inline int atomic_inc_unless_ge(spinlock_t *lock, atomic_t *v, int u) +{ + spin_lock(lock); + if (atomic_read(v) >= u) { + spin_unlock(lock); + return 0; + } + atomic_inc(v); + spin_unlock(lock); + return 1; +} + /** * uv_flush_send_and_wait * - * Send a broadcast and wait for a broadcast message to complete. + * Send a broadcast and wait for it to complete. * - * The flush_mask contains the cpus the broadcast was sent to. + * The flush_mask contains the cpus the broadcast is to be sent to, plus + * cpus that are on the local uvhub. * - * Returns NULL if all remote flushing was done. The mask is zeroed. + * Returns NULL if all flushing represented in the mask was done. The mask + * is zeroed. * Returns @flush_mask if some remote flushing remains to be done. The - * mask will have some bits still set. + * mask will have some bits still set, representing any cpus on the local + * uvhub (not current cpu) and any on remote uvhubs if the broadcast failed. */ -const struct cpumask *uv_flush_send_and_wait(int cpu, int this_pnode, - struct bau_desc *bau_desc, - struct cpumask *flush_mask) +const struct cpumask *uv_flush_send_and_wait(struct bau_desc *bau_desc, + struct cpumask *flush_mask, + struct bau_control *bcp) { - int completion_status = 0; int right_shift; - int tries = 0; - int pnode; + int uvhub; int bit; + int completion_status = 0; + int seq_number = 0; + long try = 0; + int cpu = bcp->uvhub_cpu; + int this_cpu = bcp->cpu; + int this_uvhub = bcp->uvhub; unsigned long mmr_offset; unsigned long index; cycles_t time1; cycles_t time2; + struct ptc_stats *stat = &per_cpu(ptcstats, bcp->cpu); + struct bau_control *smaster = bcp->socket_master; + struct bau_control *hmaster = bcp->uvhub_master; + + /* + * Spin here while there are hmaster->max_concurrent or more active + * descriptors. This is the per-uvhub 'throttle'. + */ + if (!atomic_inc_unless_ge(&hmaster->uvhub_lock, + &hmaster->active_descriptor_count, + hmaster->max_concurrent)) { + stat->s_throttles++; + do { + cpu_relax(); + } while (!atomic_inc_unless_ge(&hmaster->uvhub_lock, + &hmaster->active_descriptor_count, + hmaster->max_concurrent)); + } + + while (hmaster->uvhub_quiesce) + cpu_relax(); if (cpu < UV_CPUS_PER_ACT_STATUS) { mmr_offset = UVH_LB_BAU_SB_ACTIVATION_STATUS_0; @@ -268,24 +557,108 @@ const struct cpumask *uv_flush_send_and_wait(int cpu, int this_pnode, } time1 = get_cycles(); do { - tries++; + /* + * Every message from any given cpu gets a unique message + * sequence number. But retries use that same number. + * Our message may have timed out at the destination because + * all sw-ack resources are in use and there is a timeout + * pending there. In that case, our last send never got + * placed into the queue and we need to persist until it + * does. + * + * Make any retry a type MSG_RETRY so that the destination will + * free any resource held by a previous message from this cpu. + */ + if (try == 0) { + /* use message type set by the caller the first time */ + seq_number = bcp->message_number++; + } else { + /* use RETRY type on all the rest; same sequence */ + bau_desc->header.msg_type = MSG_RETRY; + stat->s_retry_messages++; + } + bau_desc->header.sequence = seq_number; index = (1UL << UVH_LB_BAU_SB_ACTIVATION_CONTROL_PUSH_SHFT) | - cpu; + bcp->uvhub_cpu; + bcp->send_message = get_cycles(); + uv_write_local_mmr(UVH_LB_BAU_SB_ACTIVATION_CONTROL, index); + + try++; completion_status = uv_wait_completion(bau_desc, mmr_offset, - right_shift); - } while (completion_status == FLUSH_RETRY); + right_shift, this_cpu, bcp, smaster, try); + + if (completion_status == FLUSH_RETRY_PLUGGED) { + /* + * Our retries may be blocked by all destination swack + * resources being consumed, and a timeout pending. In + * that case hardware immediately returns the ERROR + * that looks like a destination timeout. + */ + udelay(TIMEOUT_DELAY); + bcp->plugged_tries++; + if (bcp->plugged_tries >= PLUGSB4RESET) { + bcp->plugged_tries = 0; + quiesce_local_uvhub(hmaster); + spin_lock(&hmaster->queue_lock); + uv_reset_with_ipi(&bau_desc->distribution, + this_cpu); + spin_unlock(&hmaster->queue_lock); + end_uvhub_quiesce(hmaster); + bcp->ipi_attempts++; + stat->s_resets_plug++; + } + } else if (completion_status == FLUSH_RETRY_TIMEOUT) { + hmaster->max_concurrent = 1; + bcp->timeout_tries++; + udelay(TIMEOUT_DELAY); + if (bcp->timeout_tries >= TIMEOUTSB4RESET) { + bcp->timeout_tries = 0; + quiesce_local_uvhub(hmaster); + spin_lock(&hmaster->queue_lock); + uv_reset_with_ipi(&bau_desc->distribution, + this_cpu); + spin_unlock(&hmaster->queue_lock); + end_uvhub_quiesce(hmaster); + bcp->ipi_attempts++; + stat->s_resets_timeout++; + } + } + if (bcp->ipi_attempts >= 3) { + bcp->ipi_attempts = 0; + completion_status = FLUSH_GIVEUP; + break; + } + cpu_relax(); + } while ((completion_status == FLUSH_RETRY_PLUGGED) || + (completion_status == FLUSH_RETRY_TIMEOUT)); time2 = get_cycles(); - __get_cpu_var(ptcstats).sflush += (time2 - time1); - if (tries > 1) - __get_cpu_var(ptcstats).retriesok++; - if (completion_status == FLUSH_GIVEUP) { + if ((completion_status == FLUSH_COMPLETE) && (bcp->conseccompletes > 5) + && (hmaster->max_concurrent < hmaster->max_concurrent_constant)) + hmaster->max_concurrent++; + + /* + * hold any cpu not timing out here; no other cpu currently held by + * the 'throttle' should enter the activation code + */ + while (hmaster->uvhub_quiesce) + cpu_relax(); + atomic_dec(&hmaster->active_descriptor_count); + + /* guard against cycles wrap */ + if (time2 > time1) + stat->s_time += (time2 - time1); + else + stat->s_requestor--; /* don't count this one */ + if (completion_status == FLUSH_COMPLETE && try > 1) + stat->s_retriesok++; + else if (completion_status == FLUSH_GIVEUP) { /* * Cause the caller to do an IPI-style TLB shootdown on - * the cpu's, all of which are still in the mask. + * the target cpu's, all of which are still in the mask. */ - __get_cpu_var(ptcstats).ptc_i++; + stat->s_giveup++; return flush_mask; } @@ -294,18 +667,17 @@ const struct cpumask *uv_flush_send_and_wait(int cpu, int this_pnode, * use the IPI method of shootdown on them. */ for_each_cpu(bit, flush_mask) { - pnode = uv_cpu_to_pnode(bit); - if (pnode == this_pnode) + uvhub = uv_cpu_to_blade_id(bit); + if (uvhub == this_uvhub) continue; cpumask_clear_cpu(bit, flush_mask); } if (!cpumask_empty(flush_mask)) return flush_mask; + return NULL; } -static DEFINE_PER_CPU(cpumask_var_t, uv_flush_tlb_mask); - /** * uv_flush_tlb_others - globally purge translation cache of a virtual * address or all TLB's @@ -322,8 +694,8 @@ static DEFINE_PER_CPU(cpumask_var_t, uv_flush_tlb_mask); * The caller has derived the cpumask from the mm_struct. This function * is called only if there are bits set in the mask. (e.g. flush_tlb_page()) * - * The cpumask is converted into a nodemask of the nodes containing - * the cpus. + * The cpumask is converted into a uvhubmask of the uvhubs containing + * those cpus. * * Note that this function should be called with preemption disabled. * @@ -335,52 +707,82 @@ const struct cpumask *uv_flush_tlb_others(const struct cpumask *cpumask, struct mm_struct *mm, unsigned long va, unsigned int cpu) { - struct cpumask *flush_mask = __get_cpu_var(uv_flush_tlb_mask); - int i; - int bit; - int pnode; - int uv_cpu; - int this_pnode; + int remotes; + int tcpu; + int uvhub; int locals = 0; struct bau_desc *bau_desc; + struct cpumask *flush_mask; + struct ptc_stats *stat; + struct bau_control *bcp; - cpumask_andnot(flush_mask, cpumask, cpumask_of(cpu)); + if (nobau) + return cpumask; - uv_cpu = uv_blade_processor_id(); - this_pnode = uv_hub_info->pnode; - bau_desc = __get_cpu_var(bau_control).descriptor_base; - bau_desc += UV_ITEMS_PER_DESCRIPTOR * uv_cpu; + bcp = &per_cpu(bau_control, cpu); + /* + * Each sending cpu has a per-cpu mask which it fills from the caller's + * cpu mask. Only remote cpus are converted to uvhubs and copied. + */ + flush_mask = (struct cpumask *)per_cpu(uv_flush_tlb_mask, cpu); + /* + * copy cpumask to flush_mask, removing current cpu + * (current cpu should already have been flushed by the caller and + * should never be returned if we return flush_mask) + */ + cpumask_andnot(flush_mask, cpumask, cpumask_of(cpu)); + if (cpu_isset(cpu, *cpumask)) + locals++; /* current cpu was targeted */ - bau_nodes_clear(&bau_desc->distribution, UV_DISTRIBUTION_SIZE); + bau_desc = bcp->descriptor_base; + bau_desc += UV_ITEMS_PER_DESCRIPTOR * bcp->uvhub_cpu; - i = 0; - for_each_cpu(bit, flush_mask) { - pnode = uv_cpu_to_pnode(bit); - BUG_ON(pnode > (UV_DISTRIBUTION_SIZE - 1)); - if (pnode == this_pnode) { + bau_uvhubs_clear(&bau_desc->distribution, UV_DISTRIBUTION_SIZE); + remotes = 0; + for_each_cpu(tcpu, flush_mask) { + uvhub = uv_cpu_to_blade_id(tcpu); + if (uvhub == bcp->uvhub) { locals++; continue; } - bau_node_set(pnode - uv_partition_base_pnode, - &bau_desc->distribution); - i++; + bau_uvhub_set(uvhub, &bau_desc->distribution); + remotes++; } - if (i == 0) { + if (remotes == 0) { /* - * no off_node flushing; return status for local node + * No off_hub flushing; return status for local hub. + * Return the caller's mask if all were local (the current + * cpu may be in that mask). */ if (locals) - return flush_mask; + return cpumask; else return NULL; } - __get_cpu_var(ptcstats).requestor++; - __get_cpu_var(ptcstats).ntargeted += i; + stat = &per_cpu(ptcstats, cpu); + stat->s_requestor++; + stat->s_ntargcpu += remotes; + remotes = bau_uvhub_weight(&bau_desc->distribution); + stat->s_ntarguvhub += remotes; + if (remotes >= 16) + stat->s_ntarguvhub16++; + else if (remotes >= 8) + stat->s_ntarguvhub8++; + else if (remotes >= 4) + stat->s_ntarguvhub4++; + else if (remotes >= 2) + stat->s_ntarguvhub2++; + else + stat->s_ntarguvhub1++; bau_desc->payload.address = va; bau_desc->payload.sending_cpu = cpu; - return uv_flush_send_and_wait(uv_cpu, this_pnode, bau_desc, flush_mask); + /* + * uv_flush_send_and_wait returns null if all cpu's were messaged, or + * the adjusted flush_mask if any cpu's were not messaged. + */ + return uv_flush_send_and_wait(bau_desc, flush_mask, bcp); } /* @@ -389,87 +791,70 @@ const struct cpumask *uv_flush_tlb_others(const struct cpumask *cpumask, * * We received a broadcast assist message. * - * Interrupts may have been disabled; this interrupt could represent + * Interrupts are disabled; this interrupt could represent * the receipt of several messages. * - * All cores/threads on this node get this interrupt. - * The last one to see it does the s/w ack. + * All cores/threads on this hub get this interrupt. + * The last one to see it does the software ack. * (the resource will not be freed until noninterruptable cpus see this - * interrupt; hardware will timeout the s/w ack and reply ERROR) + * interrupt; hardware may timeout the s/w ack and reply ERROR) */ void uv_bau_message_interrupt(struct pt_regs *regs) { - struct bau_payload_queue_entry *va_queue_first; - struct bau_payload_queue_entry *va_queue_last; - struct bau_payload_queue_entry *msg; - struct pt_regs *old_regs = set_irq_regs(regs); - cycles_t time1; - cycles_t time2; - int msg_slot; - int sw_ack_slot; - int fw; int count = 0; - unsigned long local_pnode; - - ack_APIC_irq(); - exit_idle(); - irq_enter(); - - time1 = get_cycles(); - - local_pnode = uv_blade_to_pnode(uv_numa_blade_id()); - - va_queue_first = __get_cpu_var(bau_control).va_queue_first; - va_queue_last = __get_cpu_var(bau_control).va_queue_last; - - msg = __get_cpu_var(bau_control).bau_msg_head; + cycles_t time_start; + struct bau_payload_queue_entry *msg; + struct bau_control *bcp; + struct ptc_stats *stat; + struct msg_desc msgdesc; + + time_start = get_cycles(); + bcp = &per_cpu(bau_control, smp_processor_id()); + stat = &per_cpu(ptcstats, smp_processor_id()); + msgdesc.va_queue_first = bcp->va_queue_first; + msgdesc.va_queue_last = bcp->va_queue_last; + msg = bcp->bau_msg_head; while (msg->sw_ack_vector) { count++; - fw = msg->sw_ack_vector; - msg_slot = msg - va_queue_first; - sw_ack_slot = ffs(fw) - 1; - - uv_bau_process_message(msg, msg_slot, sw_ack_slot); - + msgdesc.msg_slot = msg - msgdesc.va_queue_first; + msgdesc.sw_ack_slot = ffs(msg->sw_ack_vector) - 1; + msgdesc.msg = msg; + uv_bau_process_message(&msgdesc, bcp); msg++; - if (msg > va_queue_last) - msg = va_queue_first; - __get_cpu_var(bau_control).bau_msg_head = msg; + if (msg > msgdesc.va_queue_last) + msg = msgdesc.va_queue_first; + bcp->bau_msg_head = msg; } + stat->d_time += (get_cycles() - time_start); if (!count) - __get_cpu_var(ptcstats).nomsg++; + stat->d_nomsg++; else if (count > 1) - __get_cpu_var(ptcstats).multmsg++; - - time2 = get_cycles(); - __get_cpu_var(ptcstats).dflush += (time2 - time1); - - irq_exit(); - set_irq_regs(old_regs); + stat->d_multmsg++; + ack_APIC_irq(); } /* * uv_enable_timeouts * - * Each target blade (i.e. blades that have cpu's) needs to have + * Each target uvhub (i.e. a uvhub that has no cpu's) needs to have * shootdown message timeouts enabled. The timeout does not cause * an interrupt, but causes an error message to be returned to * the sender. */ static void uv_enable_timeouts(void) { - int blade; - int nblades; + int uvhub; + int nuvhubs; int pnode; unsigned long mmr_image; - nblades = uv_num_possible_blades(); + nuvhubs = uv_num_possible_blades(); - for (blade = 0; blade < nblades; blade++) { - if (!uv_blade_nr_possible_cpus(blade)) + for (uvhub = 0; uvhub < nuvhubs; uvhub++) { + if (!uv_blade_nr_possible_cpus(uvhub)) continue; - pnode = uv_blade_to_pnode(blade); + pnode = uv_blade_to_pnode(uvhub); mmr_image = uv_read_global_mmr64(pnode, UVH_LB_BAU_MISC_CONTROL); /* @@ -479,16 +864,16 @@ static void uv_enable_timeouts(void) * To program the period, the SOFT_ACK_MODE must be off. */ mmr_image &= ~((unsigned long)1 << - UV_ENABLE_INTD_SOFT_ACK_MODE_SHIFT); + UVH_LB_BAU_MISC_CONTROL_ENABLE_INTD_SOFT_ACK_MODE_SHFT); uv_write_global_mmr64 (pnode, UVH_LB_BAU_MISC_CONTROL, mmr_image); /* * Set the 4-bit period. */ mmr_image &= ~((unsigned long)0xf << - UV_INTD_SOFT_ACK_TIMEOUT_PERIOD_SHIFT); + UVH_LB_BAU_MISC_CONTROL_INTD_SOFT_ACK_TIMEOUT_PERIOD_SHFT); mmr_image |= (UV_INTD_SOFT_ACK_TIMEOUT_PERIOD << - UV_INTD_SOFT_ACK_TIMEOUT_PERIOD_SHIFT); + UVH_LB_BAU_MISC_CONTROL_INTD_SOFT_ACK_TIMEOUT_PERIOD_SHFT); uv_write_global_mmr64 (pnode, UVH_LB_BAU_MISC_CONTROL, mmr_image); /* @@ -497,7 +882,7 @@ static void uv_enable_timeouts(void) * indicated in bits 2:0 (7 causes all of them to timeout). */ mmr_image |= ((unsigned long)1 << - UV_ENABLE_INTD_SOFT_ACK_MODE_SHIFT); + UVH_LB_BAU_MISC_CONTROL_ENABLE_INTD_SOFT_ACK_MODE_SHFT); uv_write_global_mmr64 (pnode, UVH_LB_BAU_MISC_CONTROL, mmr_image); } @@ -522,9 +907,20 @@ static void uv_ptc_seq_stop(struct seq_file *file, void *data) { } +static inline unsigned long long +millisec_2_cycles(unsigned long millisec) +{ + unsigned long ns; + unsigned long long cyc; + + ns = millisec * 1000; + cyc = (ns << CYC2NS_SCALE_FACTOR)/(per_cpu(cyc2ns, smp_processor_id())); + return cyc; +} + /* - * Display the statistics thru /proc - * data points to the cpu number + * Display the statistics thru /proc. + * 'data' points to the cpu number */ static int uv_ptc_seq_show(struct seq_file *file, void *data) { @@ -535,78 +931,155 @@ static int uv_ptc_seq_show(struct seq_file *file, void *data) if (!cpu) { seq_printf(file, - "# cpu requestor requestee one all sretry dretry ptc_i "); + "# cpu sent stime numuvhubs numuvhubs16 numuvhubs8 "); seq_printf(file, - "sw_ack sflush dflush sok dnomsg dmult starget\n"); + "numuvhubs4 numuvhubs2 numuvhubs1 numcpus dto "); + seq_printf(file, + "retries rok resetp resett giveup sto bz throt "); + seq_printf(file, + "sw_ack recv rtime all "); + seq_printf(file, + "one mult none retry canc nocan reset rcan\n"); } if (cpu < num_possible_cpus() && cpu_online(cpu)) { stat = &per_cpu(ptcstats, cpu); - seq_printf(file, "cpu %d %ld %ld %ld %ld %ld %ld %ld ", - cpu, stat->requestor, - stat->requestee, stat->onetlb, stat->alltlb, - stat->s_retry, stat->d_retry, stat->ptc_i); - seq_printf(file, "%lx %ld %ld %ld %ld %ld %ld\n", + /* source side statistics */ + seq_printf(file, + "cpu %d %ld %ld %ld %ld %ld %ld %ld %ld %ld %ld ", + cpu, stat->s_requestor, cycles_2_us(stat->s_time), + stat->s_ntarguvhub, stat->s_ntarguvhub16, + stat->s_ntarguvhub8, stat->s_ntarguvhub4, + stat->s_ntarguvhub2, stat->s_ntarguvhub1, + stat->s_ntargcpu, stat->s_dtimeout); + seq_printf(file, "%ld %ld %ld %ld %ld %ld %ld %ld ", + stat->s_retry_messages, stat->s_retriesok, + stat->s_resets_plug, stat->s_resets_timeout, + stat->s_giveup, stat->s_stimeout, + stat->s_busy, stat->s_throttles); + /* destination side statistics */ + seq_printf(file, + "%lx %ld %ld %ld %ld %ld %ld %ld %ld %ld %ld %ld\n", uv_read_global_mmr64(uv_cpu_to_pnode(cpu), UVH_LB_BAU_INTD_SOFTWARE_ACKNOWLEDGE), - stat->sflush, stat->dflush, - stat->retriesok, stat->nomsg, - stat->multmsg, stat->ntargeted); + stat->d_requestee, cycles_2_us(stat->d_time), + stat->d_alltlb, stat->d_onetlb, stat->d_multmsg, + stat->d_nomsg, stat->d_retries, stat->d_canceled, + stat->d_nocanceled, stat->d_resets, + stat->d_rcanceled); } return 0; } /* + * -1: resetf the statistics * 0: display meaning of the statistics - * >0: retry limit + * >0: maximum concurrent active descriptors per uvhub (throttle) */ static ssize_t uv_ptc_proc_write(struct file *file, const char __user *user, size_t count, loff_t *data) { - long newmode; + int cpu; + long input_arg; char optstr[64]; + struct ptc_stats *stat; + struct bau_control *bcp; if (count == 0 || count > sizeof(optstr)) return -EINVAL; if (copy_from_user(optstr, user, count)) return -EFAULT; optstr[count - 1] = '\0'; - if (strict_strtoul(optstr, 10, &newmode) < 0) { + if (strict_strtol(optstr, 10, &input_arg) < 0) { printk(KERN_DEBUG "%s is invalid\n", optstr); return -EINVAL; } - if (newmode == 0) { + if (input_arg == 0) { printk(KERN_DEBUG "# cpu: cpu number\n"); + printk(KERN_DEBUG "Sender statistics:\n"); + printk(KERN_DEBUG + "sent: number of shootdown messages sent\n"); + printk(KERN_DEBUG + "stime: time spent sending messages\n"); + printk(KERN_DEBUG + "numuvhubs: number of hubs targeted with shootdown\n"); + printk(KERN_DEBUG + "numuvhubs16: number times 16 or more hubs targeted\n"); + printk(KERN_DEBUG + "numuvhubs8: number times 8 or more hubs targeted\n"); + printk(KERN_DEBUG + "numuvhubs4: number times 4 or more hubs targeted\n"); + printk(KERN_DEBUG + "numuvhubs2: number times 2 or more hubs targeted\n"); + printk(KERN_DEBUG + "numuvhubs1: number times 1 hub targeted\n"); + printk(KERN_DEBUG + "numcpus: number of cpus targeted with shootdown\n"); + printk(KERN_DEBUG + "dto: number of destination timeouts\n"); + printk(KERN_DEBUG + "retries: destination timeout retries sent\n"); + printk(KERN_DEBUG + "rok: : destination timeouts successfully retried\n"); + printk(KERN_DEBUG + "resetp: ipi-style resource resets for plugs\n"); + printk(KERN_DEBUG + "resett: ipi-style resource resets for timeouts\n"); + printk(KERN_DEBUG + "giveup: fall-backs to ipi-style shootdowns\n"); + printk(KERN_DEBUG + "sto: number of source timeouts\n"); + printk(KERN_DEBUG + "bz: number of stay-busy's\n"); + printk(KERN_DEBUG + "throt: number times spun in throttle\n"); + printk(KERN_DEBUG "Destination side statistics:\n"); printk(KERN_DEBUG - "requestor: times this cpu was the flush requestor\n"); + "sw_ack: image of UVH_LB_BAU_INTD_SOFTWARE_ACKNOWLEDGE\n"); printk(KERN_DEBUG - "requestee: times this cpu was requested to flush its TLBs\n"); + "recv: shootdown messages received\n"); printk(KERN_DEBUG - "one: times requested to flush a single address\n"); + "rtime: time spent processing messages\n"); printk(KERN_DEBUG - "all: times requested to flush all TLB's\n"); + "all: shootdown all-tlb messages\n"); printk(KERN_DEBUG - "sretry: number of retries of source-side timeouts\n"); + "one: shootdown one-tlb messages\n"); printk(KERN_DEBUG - "dretry: number of retries of destination-side timeouts\n"); + "mult: interrupts that found multiple messages\n"); printk(KERN_DEBUG - "ptc_i: times UV fell through to IPI-style flushes\n"); + "none: interrupts that found no messages\n"); printk(KERN_DEBUG - "sw_ack: image of UVH_LB_BAU_INTD_SOFTWARE_ACKNOWLEDGE\n"); + "retry: number of retry messages processed\n"); printk(KERN_DEBUG - "sflush_us: cycles spent in uv_flush_tlb_others()\n"); + "canc: number messages canceled by retries\n"); printk(KERN_DEBUG - "dflush_us: cycles spent in handling flush requests\n"); - printk(KERN_DEBUG "sok: successes on retry\n"); - printk(KERN_DEBUG "dnomsg: interrupts with no message\n"); + "nocan: number retries that found nothing to cancel\n"); printk(KERN_DEBUG - "dmult: interrupts with multiple messages\n"); - printk(KERN_DEBUG "starget: nodes targeted\n"); + "reset: number of ipi-style reset requests processed\n"); + printk(KERN_DEBUG + "rcan: number messages canceled by reset requests\n"); + } else if (input_arg == -1) { + for_each_present_cpu(cpu) { + stat = &per_cpu(ptcstats, cpu); + memset(stat, 0, sizeof(struct ptc_stats)); + } } else { - uv_bau_retry_limit = newmode; - printk(KERN_DEBUG "timeout retry limit:%d\n", - uv_bau_retry_limit); + uv_bau_max_concurrent = input_arg; + bcp = &per_cpu(bau_control, smp_processor_id()); + if (uv_bau_max_concurrent < 1 || + uv_bau_max_concurrent > bcp->cpus_in_uvhub) { + printk(KERN_DEBUG + "Error: BAU max concurrent %d; %d is invalid\n", + bcp->max_concurrent, uv_bau_max_concurrent); + return -EINVAL; + } + printk(KERN_DEBUG "Set BAU max concurrent:%d\n", + uv_bau_max_concurrent); + for_each_present_cpu(cpu) { + bcp = &per_cpu(bau_control, cpu); + bcp->max_concurrent = uv_bau_max_concurrent; + } } return count; @@ -650,79 +1123,30 @@ static int __init uv_ptc_init(void) } /* - * begin the initialization of the per-blade control structures - */ -static struct bau_control * __init uv_table_bases_init(int blade, int node) -{ - int i; - struct bau_msg_status *msp; - struct bau_control *bau_tabp; - - bau_tabp = - kmalloc_node(sizeof(struct bau_control), GFP_KERNEL, node); - BUG_ON(!bau_tabp); - - bau_tabp->msg_statuses = - kmalloc_node(sizeof(struct bau_msg_status) * - DEST_Q_SIZE, GFP_KERNEL, node); - BUG_ON(!bau_tabp->msg_statuses); - - for (i = 0, msp = bau_tabp->msg_statuses; i < DEST_Q_SIZE; i++, msp++) - bau_cpubits_clear(&msp->seen_by, (int) - uv_blade_nr_possible_cpus(blade)); - - uv_bau_table_bases[blade] = bau_tabp; - - return bau_tabp; -} - -/* - * finish the initialization of the per-blade control structures - */ -static void __init -uv_table_bases_finish(int blade, - struct bau_control *bau_tablesp, - struct bau_desc *adp) -{ - struct bau_control *bcp; - int cpu; - - for_each_present_cpu(cpu) { - if (blade != uv_cpu_to_blade_id(cpu)) - continue; - - bcp = (struct bau_control *)&per_cpu(bau_control, cpu); - bcp->bau_msg_head = bau_tablesp->va_queue_first; - bcp->va_queue_first = bau_tablesp->va_queue_first; - bcp->va_queue_last = bau_tablesp->va_queue_last; - bcp->msg_statuses = bau_tablesp->msg_statuses; - bcp->descriptor_base = adp; - } -} - -/* * initialize the sending side's sending buffers */ -static struct bau_desc * __init +static void uv_activation_descriptor_init(int node, int pnode) { int i; + int cpu; unsigned long pa; unsigned long m; unsigned long n; - struct bau_desc *adp; - struct bau_desc *ad2; + struct bau_desc *bau_desc; + struct bau_desc *bd2; + struct bau_control *bcp; /* * each bau_desc is 64 bytes; there are 8 (UV_ITEMS_PER_DESCRIPTOR) - * per cpu; and up to 32 (UV_ADP_SIZE) cpu's per blade + * per cpu; and up to 32 (UV_ADP_SIZE) cpu's per uvhub */ - adp = (struct bau_desc *)kmalloc_node(sizeof(struct bau_desc)* + bau_desc = (struct bau_desc *)kmalloc_node(sizeof(struct bau_desc)* UV_ADP_SIZE*UV_ITEMS_PER_DESCRIPTOR, GFP_KERNEL, node); - BUG_ON(!adp); + BUG_ON(!bau_desc); - pa = uv_gpa(adp); /* need the real nasid*/ - n = uv_gpa_to_pnode(pa); + pa = uv_gpa(bau_desc); /* need the real nasid*/ + n = pa >> uv_nshift; m = pa & uv_mmask; uv_write_global_mmr64(pnode, UVH_LB_BAU_SB_DESCRIPTOR_BASE, @@ -731,96 +1155,188 @@ uv_activation_descriptor_init(int node, int pnode) /* * initializing all 8 (UV_ITEMS_PER_DESCRIPTOR) descriptors for each * cpu even though we only use the first one; one descriptor can - * describe a broadcast to 256 nodes. + * describe a broadcast to 256 uv hubs. */ - for (i = 0, ad2 = adp; i < (UV_ADP_SIZE*UV_ITEMS_PER_DESCRIPTOR); - i++, ad2++) { - memset(ad2, 0, sizeof(struct bau_desc)); - ad2->header.sw_ack_flag = 1; + for (i = 0, bd2 = bau_desc; i < (UV_ADP_SIZE*UV_ITEMS_PER_DESCRIPTOR); + i++, bd2++) { + memset(bd2, 0, sizeof(struct bau_desc)); + bd2->header.sw_ack_flag = 1; /* - * base_dest_nodeid is the first node in the partition, so - * the bit map will indicate partition-relative node numbers. - * note that base_dest_nodeid is actually a nasid. + * base_dest_nodeid is the nasid (pnode<<1) of the first uvhub + * in the partition. The bit map will indicate uvhub numbers, + * which are 0-N in a partition. Pnodes are unique system-wide. */ - ad2->header.base_dest_nodeid = uv_partition_base_pnode << 1; - ad2->header.dest_subnodeid = 0x10; /* the LB */ - ad2->header.command = UV_NET_ENDPOINT_INTD; - ad2->header.int_both = 1; + bd2->header.base_dest_nodeid = uv_partition_base_pnode << 1; + bd2->header.dest_subnodeid = 0x10; /* the LB */ + bd2->header.command = UV_NET_ENDPOINT_INTD; + bd2->header.int_both = 1; /* * all others need to be set to zero: * fairness chaining multilevel count replied_to */ } - return adp; + for_each_present_cpu(cpu) { + if (pnode != uv_blade_to_pnode(uv_cpu_to_blade_id(cpu))) + continue; + bcp = &per_cpu(bau_control, cpu); + bcp->descriptor_base = bau_desc; + } } /* * initialize the destination side's receiving buffers + * entered for each uvhub in the partition + * - node is first node (kernel memory notion) on the uvhub + * - pnode is the uvhub's physical identifier */ -static struct bau_payload_queue_entry * __init -uv_payload_queue_init(int node, int pnode, struct bau_control *bau_tablesp) +static void +uv_payload_queue_init(int node, int pnode) { - struct bau_payload_queue_entry *pqp; - unsigned long pa; int pn; + int cpu; char *cp; + unsigned long pa; + struct bau_payload_queue_entry *pqp; + struct bau_payload_queue_entry *pqp_malloc; + struct bau_control *bcp; pqp = (struct bau_payload_queue_entry *) kmalloc_node( (DEST_Q_SIZE + 1) * sizeof(struct bau_payload_queue_entry), GFP_KERNEL, node); BUG_ON(!pqp); + pqp_malloc = pqp; cp = (char *)pqp + 31; pqp = (struct bau_payload_queue_entry *)(((unsigned long)cp >> 5) << 5); - bau_tablesp->va_queue_first = pqp; + + for_each_present_cpu(cpu) { + if (pnode != uv_cpu_to_pnode(cpu)) + continue; + /* for every cpu on this pnode: */ + bcp = &per_cpu(bau_control, cpu); + bcp->va_queue_first = pqp; + bcp->bau_msg_head = pqp; + bcp->va_queue_last = pqp + (DEST_Q_SIZE - 1); + } /* * need the pnode of where the memory was really allocated */ pa = uv_gpa(pqp); - pn = uv_gpa_to_pnode(pa); + pn = pa >> uv_nshift; uv_write_global_mmr64(pnode, UVH_LB_BAU_INTD_PAYLOAD_QUEUE_FIRST, ((unsigned long)pn << UV_PAYLOADQ_PNODE_SHIFT) | uv_physnodeaddr(pqp)); uv_write_global_mmr64(pnode, UVH_LB_BAU_INTD_PAYLOAD_QUEUE_TAIL, uv_physnodeaddr(pqp)); - bau_tablesp->va_queue_last = pqp + (DEST_Q_SIZE - 1); uv_write_global_mmr64(pnode, UVH_LB_BAU_INTD_PAYLOAD_QUEUE_LAST, (unsigned long) - uv_physnodeaddr(bau_tablesp->va_queue_last)); + uv_physnodeaddr(pqp + (DEST_Q_SIZE - 1))); + /* in effect, all msg_type's are set to MSG_NOOP */ memset(pqp, 0, sizeof(struct bau_payload_queue_entry) * DEST_Q_SIZE); - - return pqp; } /* - * Initialization of each UV blade's structures + * Initialization of each UV hub's structures */ -static int __init uv_init_blade(int blade) +static void __init uv_init_uvhub(int uvhub, int vector) { int node; int pnode; - unsigned long pa; unsigned long apicid; - struct bau_desc *adp; - struct bau_payload_queue_entry *pqp; - struct bau_control *bau_tablesp; - - node = blade_to_first_node(blade); - bau_tablesp = uv_table_bases_init(blade, node); - pnode = uv_blade_to_pnode(blade); - adp = uv_activation_descriptor_init(node, pnode); - pqp = uv_payload_queue_init(node, pnode, bau_tablesp); - uv_table_bases_finish(blade, bau_tablesp, adp); + + node = uvhub_to_first_node(uvhub); + pnode = uv_blade_to_pnode(uvhub); + uv_activation_descriptor_init(node, pnode); + uv_payload_queue_init(node, pnode); /* * the below initialization can't be in firmware because the * messaging IRQ will be determined by the OS */ - apicid = blade_to_first_apicid(blade); - pa = uv_read_global_mmr64(pnode, UVH_BAU_DATA_CONFIG); + apicid = uvhub_to_first_apicid(uvhub); uv_write_global_mmr64(pnode, UVH_BAU_DATA_CONFIG, - ((apicid << 32) | UV_BAU_MESSAGE)); - return 0; + ((apicid << 32) | vector)); +} + +/* + * initialize the bau_control structure for each cpu + */ +static void uv_init_per_cpu(int nuvhubs) +{ + int i, j, k; + int cpu; + int pnode; + int uvhub; + short socket = 0; + struct bau_control *bcp; + struct uvhub_desc *bdp; + struct socket_desc *sdp; + struct bau_control *hmaster = NULL; + struct bau_control *smaster = NULL; + struct socket_desc { + short num_cpus; + short cpu_number[16]; + }; + struct uvhub_desc { + short num_sockets; + short num_cpus; + short uvhub; + short pnode; + struct socket_desc socket[2]; + }; + struct uvhub_desc *uvhub_descs; + + uvhub_descs = (struct uvhub_desc *) + kmalloc(nuvhubs * sizeof(struct uvhub_desc), GFP_KERNEL); + memset(uvhub_descs, 0, nuvhubs * sizeof(struct uvhub_desc)); + for_each_present_cpu(cpu) { + bcp = &per_cpu(bau_control, cpu); + memset(bcp, 0, sizeof(struct bau_control)); + spin_lock_init(&bcp->masks_lock); + bcp->max_concurrent = uv_bau_max_concurrent; + pnode = uv_cpu_hub_info(cpu)->pnode; + uvhub = uv_cpu_hub_info(cpu)->numa_blade_id; + bdp = &uvhub_descs[uvhub]; + bdp->num_cpus++; + bdp->uvhub = uvhub; + bdp->pnode = pnode; + /* time interval to catch a hardware stay-busy bug */ + bcp->timeout_interval = millisec_2_cycles(3); + /* kludge: assume uv_hub.h is constant */ + socket = (cpu_physical_id(cpu)>>5)&1; + if (socket >= bdp->num_sockets) + bdp->num_sockets = socket+1; + sdp = &bdp->socket[socket]; + sdp->cpu_number[sdp->num_cpus] = cpu; + sdp->num_cpus++; + } + socket = 0; + for_each_possible_blade(uvhub) { + bdp = &uvhub_descs[uvhub]; + for (i = 0; i < bdp->num_sockets; i++) { + sdp = &bdp->socket[i]; + for (j = 0; j < sdp->num_cpus; j++) { + cpu = sdp->cpu_number[j]; + bcp = &per_cpu(bau_control, cpu); + bcp->cpu = cpu; + if (j == 0) { + smaster = bcp; + if (i == 0) + hmaster = bcp; + } + bcp->cpus_in_uvhub = bdp->num_cpus; + bcp->cpus_in_socket = sdp->num_cpus; + bcp->socket_master = smaster; + bcp->uvhub_master = hmaster; + for (k = 0; k < DEST_Q_SIZE; k++) + bcp->socket_acknowledge_count[k] = 0; + bcp->uvhub_cpu = + uv_cpu_hub_info(cpu)->blade_processor_id; + } + socket++; + } + } + kfree(uvhub_descs); } /* @@ -828,38 +1344,54 @@ static int __init uv_init_blade(int blade) */ static int __init uv_bau_init(void) { - int blade; - int nblades; + int uvhub; + int pnode; + int nuvhubs; int cur_cpu; + int vector; + unsigned long mmr; if (!is_uv_system()) return 0; + if (nobau) + return 0; + for_each_possible_cpu(cur_cpu) zalloc_cpumask_var_node(&per_cpu(uv_flush_tlb_mask, cur_cpu), GFP_KERNEL, cpu_to_node(cur_cpu)); - uv_bau_retry_limit = 1; + uv_bau_max_concurrent = MAX_BAU_CONCURRENT; + uv_nshift = uv_hub_info->m_val; uv_mmask = (1UL << uv_hub_info->m_val) - 1; - nblades = uv_num_possible_blades(); + nuvhubs = uv_num_possible_blades(); - uv_bau_table_bases = (struct bau_control **) - kmalloc(nblades * sizeof(struct bau_control *), GFP_KERNEL); - BUG_ON(!uv_bau_table_bases); + uv_init_per_cpu(nuvhubs); uv_partition_base_pnode = 0x7fffffff; - for (blade = 0; blade < nblades; blade++) - if (uv_blade_nr_possible_cpus(blade) && - (uv_blade_to_pnode(blade) < uv_partition_base_pnode)) - uv_partition_base_pnode = uv_blade_to_pnode(blade); - for (blade = 0; blade < nblades; blade++) - if (uv_blade_nr_possible_cpus(blade)) - uv_init_blade(blade); - - alloc_intr_gate(UV_BAU_MESSAGE, uv_bau_message_intr1); + for (uvhub = 0; uvhub < nuvhubs; uvhub++) + if (uv_blade_nr_possible_cpus(uvhub) && + (uv_blade_to_pnode(uvhub) < uv_partition_base_pnode)) + uv_partition_base_pnode = uv_blade_to_pnode(uvhub); + + vector = UV_BAU_MESSAGE; + for_each_possible_blade(uvhub) + if (uv_blade_nr_possible_cpus(uvhub)) + uv_init_uvhub(uvhub, vector); + uv_enable_timeouts(); + alloc_intr_gate(vector, uv_bau_message_intr1); + + for_each_possible_blade(uvhub) { + pnode = uv_blade_to_pnode(uvhub); + /* INIT the bau */ + uv_write_global_mmr64(pnode, UVH_LB_BAU_SB_ACTIVATION_CONTROL, + ((unsigned long)1 << 63)); + mmr = 1; /* should be 1 to broadcast to both sockets */ + uv_write_global_mmr64(pnode, UVH_BAU_DATA_BROADCAST, mmr); + } return 0; } -__initcall(uv_bau_init); -__initcall(uv_ptc_init); +core_initcall(uv_bau_init); +core_initcall(uv_ptc_init); diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 36f1bd9..02cfb9b 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -108,15 +108,6 @@ static inline void preempt_conditional_cli(struct pt_regs *regs) dec_preempt_count(); } -#ifdef CONFIG_X86_32 -static inline void -die_if_kernel(const char *str, struct pt_regs *regs, long err) -{ - if (!user_mode_vm(regs)) - die(str, regs, err); -} -#endif - static void __kprobes do_trap(int trapnr, int signr, char *str, struct pt_regs *regs, long error_code, siginfo_t *info) @@ -585,55 +576,67 @@ dotraplinkage void __kprobes do_debug(struct pt_regs *regs, long error_code) return; } -#ifdef CONFIG_X86_64 -static int kernel_math_error(struct pt_regs *regs, const char *str, int trapnr) -{ - if (fixup_exception(regs)) - return 1; - - notify_die(DIE_GPF, str, regs, 0, trapnr, SIGFPE); - /* Illegal floating point operation in the kernel */ - current->thread.trap_no = trapnr; - die(str, regs, 0); - return 0; -} -#endif - /* * Note that we play around with the 'TS' bit in an attempt to get * the correct behaviour even in the presence of the asynchronous * IRQ13 behaviour */ -void math_error(void __user *ip) +void math_error(struct pt_regs *regs, int error_code, int trapnr) { - struct task_struct *task; + struct task_struct *task = current; siginfo_t info; - unsigned short cwd, swd, err; + unsigned short err; + char *str = (trapnr == 16) ? "fpu exception" : "simd exception"; + + if (notify_die(DIE_TRAP, str, regs, error_code, trapnr, SIGFPE) == NOTIFY_STOP) + return; + conditional_sti(regs); + + if (!user_mode_vm(regs)) + { + if (!fixup_exception(regs)) { + task->thread.error_code = error_code; + task->thread.trap_no = trapnr; + die(str, regs, error_code); + } + return; + } /* * Save the info for the exception handler and clear the error. */ - task = current; save_init_fpu(task); - task->thread.trap_no = 16; - task->thread.error_code = 0; + task->thread.trap_no = trapnr; + task->thread.error_code = error_code; info.si_signo = SIGFPE; info.si_errno = 0; - info.si_addr = ip; - /* - * (~cwd & swd) will mask out exceptions that are not set to unmasked - * status. 0x3f is the exception bits in these regs, 0x200 is the - * C1 reg you need in case of a stack fault, 0x040 is the stack - * fault bit. We should only be taking one exception at a time, - * so if this combination doesn't produce any single exception, - * then we have a bad program that isn't synchronizing its FPU usage - * and it will suffer the consequences since we won't be able to - * fully reproduce the context of the exception - */ - cwd = get_fpu_cwd(task); - swd = get_fpu_swd(task); + info.si_addr = (void __user *)regs->ip; + if (trapnr == 16) { + unsigned short cwd, swd; + /* + * (~cwd & swd) will mask out exceptions that are not set to unmasked + * status. 0x3f is the exception bits in these regs, 0x200 is the + * C1 reg you need in case of a stack fault, 0x040 is the stack + * fault bit. We should only be taking one exception at a time, + * so if this combination doesn't produce any single exception, + * then we have a bad program that isn't synchronizing its FPU usage + * and it will suffer the consequences since we won't be able to + * fully reproduce the context of the exception + */ + cwd = get_fpu_cwd(task); + swd = get_fpu_swd(task); - err = swd & ~cwd; + err = swd & ~cwd; + } else { + /* + * The SIMD FPU exceptions are handled a little differently, as there + * is only a single status/control register. Thus, to determine which + * unmasked exception was caught we must mask the exception mask bits + * at 0x1f80, and then use these to mask the exception bits at 0x3f. + */ + unsigned short mxcsr = get_fpu_mxcsr(task); + err = ~(mxcsr >> 7) & mxcsr; + } if (err & 0x001) { /* Invalid op */ /* @@ -662,97 +665,17 @@ void math_error(void __user *ip) dotraplinkage void do_coprocessor_error(struct pt_regs *regs, long error_code) { - conditional_sti(regs); - #ifdef CONFIG_X86_32 ignore_fpu_irq = 1; -#else - if (!user_mode(regs) && - kernel_math_error(regs, "kernel x87 math error", 16)) - return; #endif - math_error((void __user *)regs->ip); -} - -static void simd_math_error(void __user *ip) -{ - struct task_struct *task; - siginfo_t info; - unsigned short mxcsr; - - /* - * Save the info for the exception handler and clear the error. - */ - task = current; - save_init_fpu(task); - task->thread.trap_no = 19; - task->thread.error_code = 0; - info.si_signo = SIGFPE; - info.si_errno = 0; - info.si_code = __SI_FAULT; - info.si_addr = ip; - /* - * The SIMD FPU exceptions are handled a little differently, as there - * is only a single status/control register. Thus, to determine which - * unmasked exception was caught we must mask the exception mask bits - * at 0x1f80, and then use these to mask the exception bits at 0x3f. - */ - mxcsr = get_fpu_mxcsr(task); - switch (~((mxcsr & 0x1f80) >> 7) & (mxcsr & 0x3f)) { - case 0x000: - default: - break; - case 0x001: /* Invalid Op */ - info.si_code = FPE_FLTINV; - break; - case 0x002: /* Denormalize */ - case 0x010: /* Underflow */ - info.si_code = FPE_FLTUND; - break; - case 0x004: /* Zero Divide */ - info.si_code = FPE_FLTDIV; - break; - case 0x008: /* Overflow */ - info.si_code = FPE_FLTOVF; - break; - case 0x020: /* Precision */ - info.si_code = FPE_FLTRES; - break; - } - force_sig_info(SIGFPE, &info, task); + math_error(regs, error_code, 16); } dotraplinkage void do_simd_coprocessor_error(struct pt_regs *regs, long error_code) { - conditional_sti(regs); - -#ifdef CONFIG_X86_32 - if (cpu_has_xmm) { - /* Handle SIMD FPU exceptions on PIII+ processors. */ - ignore_fpu_irq = 1; - simd_math_error((void __user *)regs->ip); - return; - } - /* - * Handle strange cache flush from user space exception - * in all other cases. This is undocumented behaviour. - */ - if (regs->flags & X86_VM_MASK) { - handle_vm86_fault((struct kernel_vm86_regs *)regs, error_code); - return; - } - current->thread.trap_no = 19; - current->thread.error_code = error_code; - die_if_kernel("cache flush denied", regs, error_code); - force_sig(SIGSEGV, current); -#else - if (!user_mode(regs) && - kernel_math_error(regs, "kernel simd math error", 19)) - return; - simd_math_error((void __user *)regs->ip); -#endif + math_error(regs, error_code, 19); } dotraplinkage void diff --git a/arch/x86/kernel/uv_irq.c b/arch/x86/kernel/uv_irq.c index 1d40336..1132129 100644 --- a/arch/x86/kernel/uv_irq.c +++ b/arch/x86/kernel/uv_irq.c @@ -44,7 +44,7 @@ static void uv_ack_apic(unsigned int irq) ack_APIC_irq(); } -struct irq_chip uv_irq_chip = { +static struct irq_chip uv_irq_chip = { .name = "UV-CORE", .startup = uv_noop_ret, .shutdown = uv_noop, @@ -141,7 +141,7 @@ int uv_irq_2_mmr_info(int irq, unsigned long *offset, int *pnode) */ static int arch_enable_uv_irq(char *irq_name, unsigned int irq, int cpu, int mmr_blade, - unsigned long mmr_offset, int restrict) + unsigned long mmr_offset, int limit) { const struct cpumask *eligible_cpu = cpumask_of(cpu); struct irq_desc *desc = irq_to_desc(irq); @@ -160,7 +160,7 @@ arch_enable_uv_irq(char *irq_name, unsigned int irq, int cpu, int mmr_blade, if (err != 0) return err; - if (restrict == UV_AFFINITY_CPU) + if (limit == UV_AFFINITY_CPU) desc->status |= IRQ_NO_BALANCING; else desc->status |= IRQ_MOVE_PCNTXT; @@ -214,7 +214,7 @@ static int uv_set_irq_affinity(unsigned int irq, const struct cpumask *mask) unsigned long mmr_value; struct uv_IO_APIC_route_entry *entry; unsigned long mmr_offset; - unsigned mmr_pnode; + int mmr_pnode; if (set_desc_affinity(desc, mask, &dest)) return -1; @@ -248,7 +248,7 @@ static int uv_set_irq_affinity(unsigned int irq, const struct cpumask *mask) * interrupt is raised. */ int uv_setup_irq(char *irq_name, int cpu, int mmr_blade, - unsigned long mmr_offset, int restrict) + unsigned long mmr_offset, int limit) { int irq, ret; @@ -258,7 +258,7 @@ int uv_setup_irq(char *irq_name, int cpu, int mmr_blade, return -EBUSY; ret = arch_enable_uv_irq(irq_name, irq, cpu, mmr_blade, mmr_offset, - restrict); + limit); if (ret == irq) uv_set_irq_2_mmr_info(irq, mmr_offset, mmr_blade); else diff --git a/arch/x86/kernel/x8664_ksyms_64.c b/arch/x86/kernel/x8664_ksyms_64.c index 693920b..1b950d1 100644 --- a/arch/x86/kernel/x8664_ksyms_64.c +++ b/arch/x86/kernel/x8664_ksyms_64.c @@ -54,7 +54,6 @@ EXPORT_SYMBOL(memcpy); EXPORT_SYMBOL(__memcpy); EXPORT_SYMBOL(empty_zero_page); -EXPORT_SYMBOL(init_level4_pgt); #ifndef CONFIG_PARAVIRT EXPORT_SYMBOL(native_load_gs_index); #endif diff --git a/arch/x86/kernel/xsave.c b/arch/x86/kernel/xsave.c index 782c3a36..37e68fc 100644 --- a/arch/x86/kernel/xsave.c +++ b/arch/x86/kernel/xsave.c @@ -99,7 +99,7 @@ int save_i387_xstate(void __user *buf) if (err) return err; - if (task_thread_info(tsk)->status & TS_XSAVE) + if (use_xsave()) err = xsave_user(buf); else err = fxsave_user(buf); @@ -109,14 +109,14 @@ int save_i387_xstate(void __user *buf) task_thread_info(tsk)->status &= ~TS_USEDFPU; stts(); } else { - if (__copy_to_user(buf, &tsk->thread.xstate->fxsave, + if (__copy_to_user(buf, &tsk->thread.fpu.state->fxsave, xstate_size)) return -1; } clear_used_math(); /* trigger finit */ - if (task_thread_info(tsk)->status & TS_XSAVE) { + if (use_xsave()) { struct _fpstate __user *fx = buf; struct _xstate __user *x = buf; u64 xstate_bv; @@ -225,7 +225,7 @@ int restore_i387_xstate(void __user *buf) clts(); task_thread_info(current)->status |= TS_USEDFPU; } - if (task_thread_info(tsk)->status & TS_XSAVE) + if (use_xsave()) err = restore_user_xstate(buf); else err = fxrstor_checking((__force struct i387_fxsave_struct *) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 2ba5820..737361f 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -2067,7 +2067,7 @@ static int cpuid_interception(struct vcpu_svm *svm) static int iret_interception(struct vcpu_svm *svm) { ++svm->vcpu.stat.nmi_window_exits; - svm->vmcb->control.intercept &= ~(1UL << INTERCEPT_IRET); + svm->vmcb->control.intercept &= ~(1ULL << INTERCEPT_IRET); svm->vcpu.arch.hflags |= HF_IRET_MASK; return 1; } @@ -2479,7 +2479,7 @@ static void svm_inject_nmi(struct kvm_vcpu *vcpu) svm->vmcb->control.event_inj = SVM_EVTINJ_VALID | SVM_EVTINJ_TYPE_NMI; vcpu->arch.hflags |= HF_NMI_MASK; - svm->vmcb->control.intercept |= (1UL << INTERCEPT_IRET); + svm->vmcb->control.intercept |= (1ULL << INTERCEPT_IRET); ++vcpu->stat.nmi_injections; } @@ -2539,10 +2539,10 @@ static void svm_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked) if (masked) { svm->vcpu.arch.hflags |= HF_NMI_MASK; - svm->vmcb->control.intercept |= (1UL << INTERCEPT_IRET); + svm->vmcb->control.intercept |= (1ULL << INTERCEPT_IRET); } else { svm->vcpu.arch.hflags &= ~HF_NMI_MASK; - svm->vmcb->control.intercept &= ~(1UL << INTERCEPT_IRET); + svm->vmcb->control.intercept &= ~(1ULL << INTERCEPT_IRET); } } diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 32022a8..edca080 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -2703,8 +2703,7 @@ static int vmx_nmi_allowed(struct kvm_vcpu *vcpu) return 0; return !(vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) & - (GUEST_INTR_STATE_STI | GUEST_INTR_STATE_MOV_SS | - GUEST_INTR_STATE_NMI)); + (GUEST_INTR_STATE_MOV_SS | GUEST_INTR_STATE_NMI)); } static bool vmx_get_nmi_mask(struct kvm_vcpu *vcpu) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 73d854c..dd9bc8f 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1713,6 +1713,7 @@ static int kvm_vcpu_ioctl_set_cpuid(struct kvm_vcpu *vcpu, if (copy_from_user(cpuid_entries, entries, cpuid->nent * sizeof(struct kvm_cpuid_entry))) goto out_free; + vcpu_load(vcpu); for (i = 0; i < cpuid->nent; i++) { vcpu->arch.cpuid_entries[i].function = cpuid_entries[i].function; vcpu->arch.cpuid_entries[i].eax = cpuid_entries[i].eax; @@ -1730,6 +1731,7 @@ static int kvm_vcpu_ioctl_set_cpuid(struct kvm_vcpu *vcpu, r = 0; kvm_apic_set_version(vcpu); kvm_x86_ops->cpuid_update(vcpu); + vcpu_put(vcpu); out_free: vfree(cpuid_entries); @@ -1750,9 +1752,11 @@ static int kvm_vcpu_ioctl_set_cpuid2(struct kvm_vcpu *vcpu, if (copy_from_user(&vcpu->arch.cpuid_entries, entries, cpuid->nent * sizeof(struct kvm_cpuid_entry2))) goto out; + vcpu_load(vcpu); vcpu->arch.cpuid_nent = cpuid->nent; kvm_apic_set_version(vcpu); kvm_x86_ops->cpuid_update(vcpu); + vcpu_put(vcpu); return 0; out: diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile index cbaf8f2..f871e04 100644 --- a/arch/x86/lib/Makefile +++ b/arch/x86/lib/Makefile @@ -26,11 +26,12 @@ obj-y += msr.o msr-reg.o msr-reg-export.o ifeq ($(CONFIG_X86_32),y) obj-y += atomic64_32.o + lib-y += atomic64_cx8_32.o lib-y += checksum_32.o lib-y += strstr_32.o lib-y += semaphore_32.o string_32.o ifneq ($(CONFIG_X86_CMPXCHG64),y) - lib-y += cmpxchg8b_emu.o + lib-y += cmpxchg8b_emu.o atomic64_386_32.o endif lib-$(CONFIG_X86_USE_3DNOW) += mmx_32.o else diff --git a/arch/x86/lib/atomic64_32.c b/arch/x86/lib/atomic64_32.c index 824fa0b..540179e 100644 --- a/arch/x86/lib/atomic64_32.c +++ b/arch/x86/lib/atomic64_32.c @@ -6,225 +6,54 @@ #include <asm/cmpxchg.h> #include <asm/atomic.h> -static noinline u64 cmpxchg8b(u64 *ptr, u64 old, u64 new) -{ - u32 low = new; - u32 high = new >> 32; - - asm volatile( - LOCK_PREFIX "cmpxchg8b %1\n" - : "+A" (old), "+m" (*ptr) - : "b" (low), "c" (high) - ); - return old; -} - -u64 atomic64_cmpxchg(atomic64_t *ptr, u64 old_val, u64 new_val) -{ - return cmpxchg8b(&ptr->counter, old_val, new_val); -} -EXPORT_SYMBOL(atomic64_cmpxchg); - -/** - * atomic64_xchg - xchg atomic64 variable - * @ptr: pointer to type atomic64_t - * @new_val: value to assign - * - * Atomically xchgs the value of @ptr to @new_val and returns - * the old value. - */ -u64 atomic64_xchg(atomic64_t *ptr, u64 new_val) -{ - /* - * Try first with a (possibly incorrect) assumption about - * what we have there. We'll do two loops most likely, - * but we'll get an ownership MESI transaction straight away - * instead of a read transaction followed by a - * flush-for-ownership transaction: - */ - u64 old_val, real_val = 0; - - do { - old_val = real_val; - - real_val = atomic64_cmpxchg(ptr, old_val, new_val); - - } while (real_val != old_val); - - return old_val; -} -EXPORT_SYMBOL(atomic64_xchg); - -/** - * atomic64_set - set atomic64 variable - * @ptr: pointer to type atomic64_t - * @new_val: value to assign - * - * Atomically sets the value of @ptr to @new_val. - */ -void atomic64_set(atomic64_t *ptr, u64 new_val) -{ - atomic64_xchg(ptr, new_val); -} -EXPORT_SYMBOL(atomic64_set); - -/** -EXPORT_SYMBOL(atomic64_read); - * atomic64_add_return - add and return - * @delta: integer value to add - * @ptr: pointer to type atomic64_t - * - * Atomically adds @delta to @ptr and returns @delta + *@ptr - */ -noinline u64 atomic64_add_return(u64 delta, atomic64_t *ptr) -{ - /* - * Try first with a (possibly incorrect) assumption about - * what we have there. We'll do two loops most likely, - * but we'll get an ownership MESI transaction straight away - * instead of a read transaction followed by a - * flush-for-ownership transaction: - */ - u64 old_val, new_val, real_val = 0; - - do { - old_val = real_val; - new_val = old_val + delta; - - real_val = atomic64_cmpxchg(ptr, old_val, new_val); - - } while (real_val != old_val); - - return new_val; -} -EXPORT_SYMBOL(atomic64_add_return); - -u64 atomic64_sub_return(u64 delta, atomic64_t *ptr) -{ - return atomic64_add_return(-delta, ptr); -} -EXPORT_SYMBOL(atomic64_sub_return); - -u64 atomic64_inc_return(atomic64_t *ptr) -{ - return atomic64_add_return(1, ptr); -} -EXPORT_SYMBOL(atomic64_inc_return); - -u64 atomic64_dec_return(atomic64_t *ptr) -{ - return atomic64_sub_return(1, ptr); -} -EXPORT_SYMBOL(atomic64_dec_return); - -/** - * atomic64_add - add integer to atomic64 variable - * @delta: integer value to add - * @ptr: pointer to type atomic64_t - * - * Atomically adds @delta to @ptr. - */ -void atomic64_add(u64 delta, atomic64_t *ptr) -{ - atomic64_add_return(delta, ptr); -} -EXPORT_SYMBOL(atomic64_add); - -/** - * atomic64_sub - subtract the atomic64 variable - * @delta: integer value to subtract - * @ptr: pointer to type atomic64_t - * - * Atomically subtracts @delta from @ptr. - */ -void atomic64_sub(u64 delta, atomic64_t *ptr) -{ - atomic64_add(-delta, ptr); -} -EXPORT_SYMBOL(atomic64_sub); - -/** - * atomic64_sub_and_test - subtract value from variable and test result - * @delta: integer value to subtract - * @ptr: pointer to type atomic64_t - * - * Atomically subtracts @delta from @ptr and returns - * true if the result is zero, or false for all - * other cases. - */ -int atomic64_sub_and_test(u64 delta, atomic64_t *ptr) -{ - u64 new_val = atomic64_sub_return(delta, ptr); - - return new_val == 0; -} -EXPORT_SYMBOL(atomic64_sub_and_test); - -/** - * atomic64_inc - increment atomic64 variable - * @ptr: pointer to type atomic64_t - * - * Atomically increments @ptr by 1. - */ -void atomic64_inc(atomic64_t *ptr) -{ - atomic64_add(1, ptr); -} -EXPORT_SYMBOL(atomic64_inc); - -/** - * atomic64_dec - decrement atomic64 variable - * @ptr: pointer to type atomic64_t - * - * Atomically decrements @ptr by 1. - */ -void atomic64_dec(atomic64_t *ptr) -{ - atomic64_sub(1, ptr); -} -EXPORT_SYMBOL(atomic64_dec); - -/** - * atomic64_dec_and_test - decrement and test - * @ptr: pointer to type atomic64_t - * - * Atomically decrements @ptr by 1 and - * returns true if the result is 0, or false for all other - * cases. - */ -int atomic64_dec_and_test(atomic64_t *ptr) -{ - return atomic64_sub_and_test(1, ptr); -} -EXPORT_SYMBOL(atomic64_dec_and_test); - -/** - * atomic64_inc_and_test - increment and test - * @ptr: pointer to type atomic64_t - * - * Atomically increments @ptr by 1 - * and returns true if the result is zero, or false for all - * other cases. - */ -int atomic64_inc_and_test(atomic64_t *ptr) -{ - return atomic64_sub_and_test(-1, ptr); -} -EXPORT_SYMBOL(atomic64_inc_and_test); - -/** - * atomic64_add_negative - add and test if negative - * @delta: integer value to add - * @ptr: pointer to type atomic64_t - * - * Atomically adds @delta to @ptr and returns true - * if the result is negative, or false when - * result is greater than or equal to zero. - */ -int atomic64_add_negative(u64 delta, atomic64_t *ptr) -{ - s64 new_val = atomic64_add_return(delta, ptr); - - return new_val < 0; -} -EXPORT_SYMBOL(atomic64_add_negative); +long long atomic64_read_cx8(long long, const atomic64_t *v); +EXPORT_SYMBOL(atomic64_read_cx8); +long long atomic64_set_cx8(long long, const atomic64_t *v); +EXPORT_SYMBOL(atomic64_set_cx8); +long long atomic64_xchg_cx8(long long, unsigned high); +EXPORT_SYMBOL(atomic64_xchg_cx8); +long long atomic64_add_return_cx8(long long a, atomic64_t *v); +EXPORT_SYMBOL(atomic64_add_return_cx8); +long long atomic64_sub_return_cx8(long long a, atomic64_t *v); +EXPORT_SYMBOL(atomic64_sub_return_cx8); +long long atomic64_inc_return_cx8(long long a, atomic64_t *v); +EXPORT_SYMBOL(atomic64_inc_return_cx8); +long long atomic64_dec_return_cx8(long long a, atomic64_t *v); +EXPORT_SYMBOL(atomic64_dec_return_cx8); +long long atomic64_dec_if_positive_cx8(atomic64_t *v); +EXPORT_SYMBOL(atomic64_dec_if_positive_cx8); +int atomic64_inc_not_zero_cx8(atomic64_t *v); +EXPORT_SYMBOL(atomic64_inc_not_zero_cx8); +int atomic64_add_unless_cx8(atomic64_t *v, long long a, long long u); +EXPORT_SYMBOL(atomic64_add_unless_cx8); + +#ifndef CONFIG_X86_CMPXCHG64 +long long atomic64_read_386(long long, const atomic64_t *v); +EXPORT_SYMBOL(atomic64_read_386); +long long atomic64_set_386(long long, const atomic64_t *v); +EXPORT_SYMBOL(atomic64_set_386); +long long atomic64_xchg_386(long long, unsigned high); +EXPORT_SYMBOL(atomic64_xchg_386); +long long atomic64_add_return_386(long long a, atomic64_t *v); +EXPORT_SYMBOL(atomic64_add_return_386); +long long atomic64_sub_return_386(long long a, atomic64_t *v); +EXPORT_SYMBOL(atomic64_sub_return_386); +long long atomic64_inc_return_386(long long a, atomic64_t *v); +EXPORT_SYMBOL(atomic64_inc_return_386); +long long atomic64_dec_return_386(long long a, atomic64_t *v); +EXPORT_SYMBOL(atomic64_dec_return_386); +long long atomic64_add_386(long long a, atomic64_t *v); +EXPORT_SYMBOL(atomic64_add_386); +long long atomic64_sub_386(long long a, atomic64_t *v); +EXPORT_SYMBOL(atomic64_sub_386); +long long atomic64_inc_386(long long a, atomic64_t *v); +EXPORT_SYMBOL(atomic64_inc_386); +long long atomic64_dec_386(long long a, atomic64_t *v); +EXPORT_SYMBOL(atomic64_dec_386); +long long atomic64_dec_if_positive_386(atomic64_t *v); +EXPORT_SYMBOL(atomic64_dec_if_positive_386); +int atomic64_inc_not_zero_386(atomic64_t *v); +EXPORT_SYMBOL(atomic64_inc_not_zero_386); +int atomic64_add_unless_386(atomic64_t *v, long long a, long long u); +EXPORT_SYMBOL(atomic64_add_unless_386); +#endif diff --git a/arch/x86/lib/atomic64_386_32.S b/arch/x86/lib/atomic64_386_32.S new file mode 100644 index 0000000..4a5979a --- /dev/null +++ b/arch/x86/lib/atomic64_386_32.S @@ -0,0 +1,174 @@ +/* + * atomic64_t for 386/486 + * + * Copyright © 2010 Luca Barbieri + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + */ + +#include <linux/linkage.h> +#include <asm/alternative-asm.h> +#include <asm/dwarf2.h> + +/* if you want SMP support, implement these with real spinlocks */ +.macro LOCK reg + pushfl + CFI_ADJUST_CFA_OFFSET 4 + cli +.endm + +.macro UNLOCK reg + popfl + CFI_ADJUST_CFA_OFFSET -4 +.endm + +.macro BEGIN func reg +$v = \reg + +ENTRY(atomic64_\func\()_386) + CFI_STARTPROC + LOCK $v + +.macro RETURN + UNLOCK $v + ret +.endm + +.macro END_ + CFI_ENDPROC +ENDPROC(atomic64_\func\()_386) +.purgem RETURN +.purgem END_ +.purgem END +.endm + +.macro END +RETURN +END_ +.endm +.endm + +BEGIN read %ecx + movl ($v), %eax + movl 4($v), %edx +END + +BEGIN set %esi + movl %ebx, ($v) + movl %ecx, 4($v) +END + +BEGIN xchg %esi + movl ($v), %eax + movl 4($v), %edx + movl %ebx, ($v) + movl %ecx, 4($v) +END + +BEGIN add %ecx + addl %eax, ($v) + adcl %edx, 4($v) +END + +BEGIN add_return %ecx + addl ($v), %eax + adcl 4($v), %edx + movl %eax, ($v) + movl %edx, 4($v) +END + +BEGIN sub %ecx + subl %eax, ($v) + sbbl %edx, 4($v) +END + +BEGIN sub_return %ecx + negl %edx + negl %eax + sbbl $0, %edx + addl ($v), %eax + adcl 4($v), %edx + movl %eax, ($v) + movl %edx, 4($v) +END + +BEGIN inc %esi + addl $1, ($v) + adcl $0, 4($v) +END + +BEGIN inc_return %esi + movl ($v), %eax + movl 4($v), %edx + addl $1, %eax + adcl $0, %edx + movl %eax, ($v) + movl %edx, 4($v) +END + +BEGIN dec %esi + subl $1, ($v) + sbbl $0, 4($v) +END + +BEGIN dec_return %esi + movl ($v), %eax + movl 4($v), %edx + subl $1, %eax + sbbl $0, %edx + movl %eax, ($v) + movl %edx, 4($v) +END + +BEGIN add_unless %ecx + addl %eax, %esi + adcl %edx, %edi + addl ($v), %eax + adcl 4($v), %edx + cmpl %eax, %esi + je 3f +1: + movl %eax, ($v) + movl %edx, 4($v) + movl $1, %eax +2: +RETURN +3: + cmpl %edx, %edi + jne 1b + xorl %eax, %eax + jmp 2b +END_ + +BEGIN inc_not_zero %esi + movl ($v), %eax + movl 4($v), %edx + testl %eax, %eax + je 3f +1: + addl $1, %eax + adcl $0, %edx + movl %eax, ($v) + movl %edx, 4($v) + movl $1, %eax +2: +RETURN +3: + testl %edx, %edx + jne 1b + jmp 2b +END_ + +BEGIN dec_if_positive %esi + movl ($v), %eax + movl 4($v), %edx + subl $1, %eax + sbbl $0, %edx + js 1f + movl %eax, ($v) + movl %edx, 4($v) +1: +END diff --git a/arch/x86/lib/atomic64_cx8_32.S b/arch/x86/lib/atomic64_cx8_32.S new file mode 100644 index 0000000..71e080d --- /dev/null +++ b/arch/x86/lib/atomic64_cx8_32.S @@ -0,0 +1,224 @@ +/* + * atomic64_t for 586+ + * + * Copyright © 2010 Luca Barbieri + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + */ + +#include <linux/linkage.h> +#include <asm/alternative-asm.h> +#include <asm/dwarf2.h> + +.macro SAVE reg + pushl %\reg + CFI_ADJUST_CFA_OFFSET 4 + CFI_REL_OFFSET \reg, 0 +.endm + +.macro RESTORE reg + popl %\reg + CFI_ADJUST_CFA_OFFSET -4 + CFI_RESTORE \reg +.endm + +.macro read64 reg + movl %ebx, %eax + movl %ecx, %edx +/* we need LOCK_PREFIX since otherwise cmpxchg8b always does the write */ + LOCK_PREFIX + cmpxchg8b (\reg) +.endm + +ENTRY(atomic64_read_cx8) + CFI_STARTPROC + + read64 %ecx + ret + CFI_ENDPROC +ENDPROC(atomic64_read_cx8) + +ENTRY(atomic64_set_cx8) + CFI_STARTPROC + +1: +/* we don't need LOCK_PREFIX since aligned 64-bit writes + * are atomic on 586 and newer */ + cmpxchg8b (%esi) + jne 1b + + ret + CFI_ENDPROC +ENDPROC(atomic64_set_cx8) + +ENTRY(atomic64_xchg_cx8) + CFI_STARTPROC + + movl %ebx, %eax + movl %ecx, %edx +1: + LOCK_PREFIX + cmpxchg8b (%esi) + jne 1b + + ret + CFI_ENDPROC +ENDPROC(atomic64_xchg_cx8) + +.macro addsub_return func ins insc +ENTRY(atomic64_\func\()_return_cx8) + CFI_STARTPROC + SAVE ebp + SAVE ebx + SAVE esi + SAVE edi + + movl %eax, %esi + movl %edx, %edi + movl %ecx, %ebp + + read64 %ebp +1: + movl %eax, %ebx + movl %edx, %ecx + \ins\()l %esi, %ebx + \insc\()l %edi, %ecx + LOCK_PREFIX + cmpxchg8b (%ebp) + jne 1b + +10: + movl %ebx, %eax + movl %ecx, %edx + RESTORE edi + RESTORE esi + RESTORE ebx + RESTORE ebp + ret + CFI_ENDPROC +ENDPROC(atomic64_\func\()_return_cx8) +.endm + +addsub_return add add adc +addsub_return sub sub sbb + +.macro incdec_return func ins insc +ENTRY(atomic64_\func\()_return_cx8) + CFI_STARTPROC + SAVE ebx + + read64 %esi +1: + movl %eax, %ebx + movl %edx, %ecx + \ins\()l $1, %ebx + \insc\()l $0, %ecx + LOCK_PREFIX + cmpxchg8b (%esi) + jne 1b + +10: + movl %ebx, %eax + movl %ecx, %edx + RESTORE ebx + ret + CFI_ENDPROC +ENDPROC(atomic64_\func\()_return_cx8) +.endm + +incdec_return inc add adc +incdec_return dec sub sbb + +ENTRY(atomic64_dec_if_positive_cx8) + CFI_STARTPROC + SAVE ebx + + read64 %esi +1: + movl %eax, %ebx + movl %edx, %ecx + subl $1, %ebx + sbb $0, %ecx + js 2f + LOCK_PREFIX + cmpxchg8b (%esi) + jne 1b + +2: + movl %ebx, %eax + movl %ecx, %edx + RESTORE ebx + ret + CFI_ENDPROC +ENDPROC(atomic64_dec_if_positive_cx8) + +ENTRY(atomic64_add_unless_cx8) + CFI_STARTPROC + SAVE ebp + SAVE ebx +/* these just push these two parameters on the stack */ + SAVE edi + SAVE esi + + movl %ecx, %ebp + movl %eax, %esi + movl %edx, %edi + + read64 %ebp +1: + cmpl %eax, 0(%esp) + je 4f +2: + movl %eax, %ebx + movl %edx, %ecx + addl %esi, %ebx + adcl %edi, %ecx + LOCK_PREFIX + cmpxchg8b (%ebp) + jne 1b + + movl $1, %eax +3: + addl $8, %esp + CFI_ADJUST_CFA_OFFSET -8 + RESTORE ebx + RESTORE ebp + ret +4: + cmpl %edx, 4(%esp) + jne 2b + xorl %eax, %eax + jmp 3b + CFI_ENDPROC +ENDPROC(atomic64_add_unless_cx8) + +ENTRY(atomic64_inc_not_zero_cx8) + CFI_STARTPROC + SAVE ebx + + read64 %esi +1: + testl %eax, %eax + je 4f +2: + movl %eax, %ebx + movl %edx, %ecx + addl $1, %ebx + adcl $0, %ecx + LOCK_PREFIX + cmpxchg8b (%esi) + jne 1b + + movl $1, %eax +3: + RESTORE ebx + ret +4: + testl %edx, %edx + jne 2b + jmp 3b + CFI_ENDPROC +ENDPROC(atomic64_inc_not_zero_cx8) diff --git a/arch/x86/math-emu/fpu_aux.c b/arch/x86/math-emu/fpu_aux.c index aa09870..dc8adad 100644 --- a/arch/x86/math-emu/fpu_aux.c +++ b/arch/x86/math-emu/fpu_aux.c @@ -30,10 +30,10 @@ static void fclex(void) } /* Needs to be externally visible */ -void finit_task(struct task_struct *tsk) +void finit_soft_fpu(struct i387_soft_struct *soft) { - struct i387_soft_struct *soft = &tsk->thread.xstate->soft; struct address *oaddr, *iaddr; + memset(soft, 0, sizeof(*soft)); soft->cwd = 0x037f; soft->swd = 0; soft->ftop = 0; /* We don't keep top in the status word internally. */ @@ -52,7 +52,7 @@ void finit_task(struct task_struct *tsk) void finit(void) { - finit_task(current); + finit_soft_fpu(¤t->thread.fpu.state->soft); } /* diff --git a/arch/x86/math-emu/fpu_entry.c b/arch/x86/math-emu/fpu_entry.c index 5d87f58..7718541 100644 --- a/arch/x86/math-emu/fpu_entry.c +++ b/arch/x86/math-emu/fpu_entry.c @@ -681,7 +681,7 @@ int fpregs_soft_set(struct task_struct *target, unsigned int pos, unsigned int count, const void *kbuf, const void __user *ubuf) { - struct i387_soft_struct *s387 = &target->thread.xstate->soft; + struct i387_soft_struct *s387 = &target->thread.fpu.state->soft; void *space = s387->st_space; int ret; int offset, other, i, tags, regnr, tag, newtop; @@ -733,7 +733,7 @@ int fpregs_soft_get(struct task_struct *target, unsigned int pos, unsigned int count, void *kbuf, void __user *ubuf) { - struct i387_soft_struct *s387 = &target->thread.xstate->soft; + struct i387_soft_struct *s387 = &target->thread.fpu.state->soft; const void *space = s387->st_space; int ret; int offset = (S387->ftop & 7) * 10, other = 80 - offset; diff --git a/arch/x86/math-emu/fpu_system.h b/arch/x86/math-emu/fpu_system.h index 50fa0ec..2c61441 100644 --- a/arch/x86/math-emu/fpu_system.h +++ b/arch/x86/math-emu/fpu_system.h @@ -31,7 +31,7 @@ #define SEG_EXPAND_DOWN(s) (((s).b & ((1 << 11) | (1 << 10))) \ == (1 << 10)) -#define I387 (current->thread.xstate) +#define I387 (current->thread.fpu.state) #define FPU_info (I387->soft.info) #define FPU_CS (*(unsigned short *) &(FPU_info->regs->cs)) diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile index 06630d2..a4c7683 100644 --- a/arch/x86/mm/Makefile +++ b/arch/x86/mm/Makefile @@ -6,6 +6,7 @@ nostackp := $(call cc-option, -fno-stack-protector) CFLAGS_physaddr.o := $(nostackp) CFLAGS_setup_nx.o := $(nostackp) +obj-$(CONFIG_X86_PAT) += pat_rbtree.o obj-$(CONFIG_SMP) += tlb.o obj-$(CONFIG_X86_32) += pgtable_32.o iomap_32.o diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c index edc8b95..bbe5502 100644 --- a/arch/x86/mm/pat.c +++ b/arch/x86/mm/pat.c @@ -30,6 +30,8 @@ #include <asm/pat.h> #include <asm/io.h> +#include "pat_internal.h" + #ifdef CONFIG_X86_PAT int __read_mostly pat_enabled = 1; @@ -53,19 +55,15 @@ static inline void pat_disable(const char *reason) #endif -static int debug_enable; +int pat_debug_enable; static int __init pat_debug_setup(char *str) { - debug_enable = 1; + pat_debug_enable = 1; return 0; } __setup("debugpat", pat_debug_setup); -#define dprintk(fmt, arg...) \ - do { if (debug_enable) printk(KERN_INFO fmt, ##arg); } while (0) - - static u64 __read_mostly boot_pat_state; enum { @@ -132,84 +130,7 @@ void pat_init(void) #undef PAT -static char *cattr_name(unsigned long flags) -{ - switch (flags & _PAGE_CACHE_MASK) { - case _PAGE_CACHE_UC: return "uncached"; - case _PAGE_CACHE_UC_MINUS: return "uncached-minus"; - case _PAGE_CACHE_WB: return "write-back"; - case _PAGE_CACHE_WC: return "write-combining"; - default: return "broken"; - } -} - -/* - * The global memtype list keeps track of memory type for specific - * physical memory areas. Conflicting memory types in different - * mappings can cause CPU cache corruption. To avoid this we keep track. - * - * The list is sorted based on starting address and can contain multiple - * entries for each address (this allows reference counting for overlapping - * areas). All the aliases have the same cache attributes of course. - * Zero attributes are represented as holes. - * - * The data structure is a list that is also organized as an rbtree - * sorted on the start address of memtype range. - * - * memtype_lock protects both the linear list and rbtree. - */ - -struct memtype { - u64 start; - u64 end; - unsigned long type; - struct list_head nd; - struct rb_node rb; -}; - -static struct rb_root memtype_rbroot = RB_ROOT; -static LIST_HEAD(memtype_list); -static DEFINE_SPINLOCK(memtype_lock); /* protects memtype list */ - -static struct memtype *memtype_rb_search(struct rb_root *root, u64 start) -{ - struct rb_node *node = root->rb_node; - struct memtype *last_lower = NULL; - - while (node) { - struct memtype *data = container_of(node, struct memtype, rb); - - if (data->start < start) { - last_lower = data; - node = node->rb_right; - } else if (data->start > start) { - node = node->rb_left; - } else - return data; - } - - /* Will return NULL if there is no entry with its start <= start */ - return last_lower; -} - -static void memtype_rb_insert(struct rb_root *root, struct memtype *data) -{ - struct rb_node **new = &(root->rb_node); - struct rb_node *parent = NULL; - - while (*new) { - struct memtype *this = container_of(*new, struct memtype, rb); - - parent = *new; - if (data->start <= this->start) - new = &((*new)->rb_left); - else if (data->start > this->start) - new = &((*new)->rb_right); - } - - rb_link_node(&data->rb, parent, new); - rb_insert_color(&data->rb, root); -} +static DEFINE_SPINLOCK(memtype_lock); /* protects memtype accesses */ /* * Does intersection of PAT memory type and MTRR memory type and returns @@ -237,33 +158,6 @@ static unsigned long pat_x_mtrr_type(u64 start, u64 end, unsigned long req_type) return req_type; } -static int -chk_conflict(struct memtype *new, struct memtype *entry, unsigned long *type) -{ - if (new->type != entry->type) { - if (type) { - new->type = entry->type; - *type = entry->type; - } else - goto conflict; - } - - /* check overlaps with more than one entry in the list */ - list_for_each_entry_continue(entry, &memtype_list, nd) { - if (new->end <= entry->start) - break; - else if (new->type != entry->type) - goto conflict; - } - return 0; - - conflict: - printk(KERN_INFO "%s:%d conflicting memory types " - "%Lx-%Lx %s<->%s\n", current->comm, current->pid, new->start, - new->end, cattr_name(new->type), cattr_name(entry->type)); - return -EBUSY; -} - static int pat_pagerange_is_ram(unsigned long start, unsigned long end) { int ram_page = 0, not_rampage = 0; @@ -296,8 +190,6 @@ static int pat_pagerange_is_ram(unsigned long start, unsigned long end) * Here we do two pass: * - Find the memtype of all the pages in the range, look for any conflicts * - In case of no conflicts, set the new memtype for pages in the range - * - * Caller must hold memtype_lock for atomicity. */ static int reserve_ram_pages_type(u64 start, u64 end, unsigned long req_type, unsigned long *new_type) @@ -364,9 +256,8 @@ static int free_ram_pages_type(u64 start, u64 end) int reserve_memtype(u64 start, u64 end, unsigned long req_type, unsigned long *new_type) { - struct memtype *new, *entry; + struct memtype *new; unsigned long actual_type; - struct list_head *where; int is_range_ram; int err = 0; @@ -404,9 +295,7 @@ int reserve_memtype(u64 start, u64 end, unsigned long req_type, is_range_ram = pat_pagerange_is_ram(start, end); if (is_range_ram == 1) { - spin_lock(&memtype_lock); err = reserve_ram_pages_type(start, end, req_type, new_type); - spin_unlock(&memtype_lock); return err; } else if (is_range_ram < 0) { @@ -423,42 +312,7 @@ int reserve_memtype(u64 start, u64 end, unsigned long req_type, spin_lock(&memtype_lock); - /* Search for existing mapping that overlaps the current range */ - where = NULL; - list_for_each_entry(entry, &memtype_list, nd) { - if (end <= entry->start) { - where = entry->nd.prev; - break; - } else if (start <= entry->start) { /* end > entry->start */ - err = chk_conflict(new, entry, new_type); - if (!err) { - dprintk("Overlap at 0x%Lx-0x%Lx\n", - entry->start, entry->end); - where = entry->nd.prev; - } - break; - } else if (start < entry->end) { /* start > entry->start */ - err = chk_conflict(new, entry, new_type); - if (!err) { - dprintk("Overlap at 0x%Lx-0x%Lx\n", - entry->start, entry->end); - - /* - * Move to right position in the linked - * list to add this new entry - */ - list_for_each_entry_continue(entry, - &memtype_list, nd) { - if (start <= entry->start) { - where = entry->nd.prev; - break; - } - } - } - break; - } - } - + err = rbt_memtype_check_insert(new, new_type); if (err) { printk(KERN_INFO "reserve_memtype failed 0x%Lx-0x%Lx, " "track %s, req %s\n", @@ -469,13 +323,6 @@ int reserve_memtype(u64 start, u64 end, unsigned long req_type, return err; } - if (where) - list_add(&new->nd, where); - else - list_add_tail(&new->nd, &memtype_list); - - memtype_rb_insert(&memtype_rbroot, new); - spin_unlock(&memtype_lock); dprintk("reserve_memtype added 0x%Lx-0x%Lx, track %s, req %s, ret %s\n", @@ -487,7 +334,6 @@ int reserve_memtype(u64 start, u64 end, unsigned long req_type, int free_memtype(u64 start, u64 end) { - struct memtype *entry, *saved_entry; int err = -EINVAL; int is_range_ram; @@ -501,9 +347,7 @@ int free_memtype(u64 start, u64 end) is_range_ram = pat_pagerange_is_ram(start, end); if (is_range_ram == 1) { - spin_lock(&memtype_lock); err = free_ram_pages_type(start, end); - spin_unlock(&memtype_lock); return err; } else if (is_range_ram < 0) { @@ -511,46 +355,7 @@ int free_memtype(u64 start, u64 end) } spin_lock(&memtype_lock); - - entry = memtype_rb_search(&memtype_rbroot, start); - if (unlikely(entry == NULL)) - goto unlock_ret; - - /* - * Saved entry points to an entry with start same or less than what - * we searched for. Now go through the list in both directions to look - * for the entry that matches with both start and end, with list stored - * in sorted start address - */ - saved_entry = entry; - list_for_each_entry_from(entry, &memtype_list, nd) { - if (entry->start == start && entry->end == end) { - rb_erase(&entry->rb, &memtype_rbroot); - list_del(&entry->nd); - kfree(entry); - err = 0; - break; - } else if (entry->start > start) { - break; - } - } - - if (!err) - goto unlock_ret; - - entry = saved_entry; - list_for_each_entry_reverse(entry, &memtype_list, nd) { - if (entry->start == start && entry->end == end) { - rb_erase(&entry->rb, &memtype_rbroot); - list_del(&entry->nd); - kfree(entry); - err = 0; - break; - } else if (entry->start < start) { - break; - } - } -unlock_ret: + err = rbt_memtype_erase(start, end); spin_unlock(&memtype_lock); if (err) { @@ -583,10 +388,8 @@ static unsigned long lookup_memtype(u64 paddr) if (pat_pagerange_is_ram(paddr, paddr + PAGE_SIZE)) { struct page *page; - spin_lock(&memtype_lock); page = pfn_to_page(paddr >> PAGE_SHIFT); rettype = get_page_memtype(page); - spin_unlock(&memtype_lock); /* * -1 from get_page_memtype() implies RAM page is in its * default state and not reserved, and hence of type WB @@ -599,7 +402,7 @@ static unsigned long lookup_memtype(u64 paddr) spin_lock(&memtype_lock); - entry = memtype_rb_search(&memtype_rbroot, paddr); + entry = rbt_memtype_lookup(paddr); if (entry != NULL) rettype = entry->type; else @@ -936,29 +739,25 @@ EXPORT_SYMBOL_GPL(pgprot_writecombine); #if defined(CONFIG_DEBUG_FS) && defined(CONFIG_X86_PAT) -/* get Nth element of the linked list */ static struct memtype *memtype_get_idx(loff_t pos) { - struct memtype *list_node, *print_entry; - int i = 1; + struct memtype *print_entry; + int ret; - print_entry = kmalloc(sizeof(struct memtype), GFP_KERNEL); + print_entry = kzalloc(sizeof(struct memtype), GFP_KERNEL); if (!print_entry) return NULL; spin_lock(&memtype_lock); - list_for_each_entry(list_node, &memtype_list, nd) { - if (pos == i) { - *print_entry = *list_node; - spin_unlock(&memtype_lock); - return print_entry; - } - ++i; - } + ret = rbt_memtype_copy_nth_element(print_entry, pos); spin_unlock(&memtype_lock); - kfree(print_entry); - return NULL; + if (!ret) { + return print_entry; + } else { + kfree(print_entry); + return NULL; + } } static void *memtype_seq_start(struct seq_file *seq, loff_t *pos) diff --git a/arch/x86/mm/pat_internal.h b/arch/x86/mm/pat_internal.h new file mode 100644 index 0000000..4f39eef --- /dev/null +++ b/arch/x86/mm/pat_internal.h @@ -0,0 +1,46 @@ +#ifndef __PAT_INTERNAL_H_ +#define __PAT_INTERNAL_H_ + +extern int pat_debug_enable; + +#define dprintk(fmt, arg...) \ + do { if (pat_debug_enable) printk(KERN_INFO fmt, ##arg); } while (0) + +struct memtype { + u64 start; + u64 end; + u64 subtree_max_end; + unsigned long type; + struct rb_node rb; +}; + +static inline char *cattr_name(unsigned long flags) +{ + switch (flags & _PAGE_CACHE_MASK) { + case _PAGE_CACHE_UC: return "uncached"; + case _PAGE_CACHE_UC_MINUS: return "uncached-minus"; + case _PAGE_CACHE_WB: return "write-back"; + case _PAGE_CACHE_WC: return "write-combining"; + default: return "broken"; + } +} + +#ifdef CONFIG_X86_PAT +extern int rbt_memtype_check_insert(struct memtype *new, + unsigned long *new_type); +extern int rbt_memtype_erase(u64 start, u64 end); +extern struct memtype *rbt_memtype_lookup(u64 addr); +extern int rbt_memtype_copy_nth_element(struct memtype *out, loff_t pos); +#else +static inline int rbt_memtype_check_insert(struct memtype *new, + unsigned long *new_type) +{ return 0; } +static inline int rbt_memtype_erase(u64 start, u64 end) +{ return 0; } +static inline struct memtype *rbt_memtype_lookup(u64 addr) +{ return NULL; } +static inline int rbt_memtype_copy_nth_element(struct memtype *out, loff_t pos) +{ return 0; } +#endif + +#endif /* __PAT_INTERNAL_H_ */ diff --git a/arch/x86/mm/pat_rbtree.c b/arch/x86/mm/pat_rbtree.c new file mode 100644 index 0000000..07de4cb --- /dev/null +++ b/arch/x86/mm/pat_rbtree.c @@ -0,0 +1,273 @@ +/* + * Handle caching attributes in page tables (PAT) + * + * Authors: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> + * Suresh B Siddha <suresh.b.siddha@intel.com> + * + * Interval tree (augmented rbtree) used to store the PAT memory type + * reservations. + */ + +#include <linux/seq_file.h> +#include <linux/debugfs.h> +#include <linux/kernel.h> +#include <linux/module.h> +#include <linux/rbtree.h> +#include <linux/sched.h> +#include <linux/gfp.h> + +#include <asm/pgtable.h> +#include <asm/pat.h> + +#include "pat_internal.h" + +/* + * The memtype tree keeps track of memory type for specific + * physical memory areas. Without proper tracking, conflicting memory + * types in different mappings can cause CPU cache corruption. + * + * The tree is an interval tree (augmented rbtree) with tree ordered + * on starting address. Tree can contain multiple entries for + * different regions which overlap. All the aliases have the same + * cache attributes of course. + * + * memtype_lock protects the rbtree. + */ + +static void memtype_rb_augment_cb(struct rb_node *node); +static struct rb_root memtype_rbroot = RB_AUGMENT_ROOT(&memtype_rb_augment_cb); + +static int is_node_overlap(struct memtype *node, u64 start, u64 end) +{ + if (node->start >= end || node->end <= start) + return 0; + + return 1; +} + +static u64 get_subtree_max_end(struct rb_node *node) +{ + u64 ret = 0; + if (node) { + struct memtype *data = container_of(node, struct memtype, rb); + ret = data->subtree_max_end; + } + return ret; +} + +/* Update 'subtree_max_end' for a node, based on node and its children */ +static void update_node_max_end(struct rb_node *node) +{ + struct memtype *data; + u64 max_end, child_max_end; + + if (!node) + return; + + data = container_of(node, struct memtype, rb); + max_end = data->end; + + child_max_end = get_subtree_max_end(node->rb_right); + if (child_max_end > max_end) + max_end = child_max_end; + + child_max_end = get_subtree_max_end(node->rb_left); + if (child_max_end > max_end) + max_end = child_max_end; + + data->subtree_max_end = max_end; +} + +/* Update 'subtree_max_end' for a node and all its ancestors */ +static void update_path_max_end(struct rb_node *node) +{ + u64 old_max_end, new_max_end; + + while (node) { + struct memtype *data = container_of(node, struct memtype, rb); + + old_max_end = data->subtree_max_end; + update_node_max_end(node); + new_max_end = data->subtree_max_end; + + if (new_max_end == old_max_end) + break; + + node = rb_parent(node); + } +} + +/* Find the first (lowest start addr) overlapping range from rb tree */ +static struct memtype *memtype_rb_lowest_match(struct rb_root *root, + u64 start, u64 end) +{ + struct rb_node *node = root->rb_node; + struct memtype *last_lower = NULL; + + while (node) { + struct memtype *data = container_of(node, struct memtype, rb); + + if (get_subtree_max_end(node->rb_left) > start) { + /* Lowest overlap if any must be on left side */ + node = node->rb_left; + } else if (is_node_overlap(data, start, end)) { + last_lower = data; + break; + } else if (start >= data->start) { + /* Lowest overlap if any must be on right side */ + node = node->rb_right; + } else { + break; + } + } + return last_lower; /* Returns NULL if there is no overlap */ +} + +static struct memtype *memtype_rb_exact_match(struct rb_root *root, + u64 start, u64 end) +{ + struct memtype *match; + + match = memtype_rb_lowest_match(root, start, end); + while (match != NULL && match->start < end) { + struct rb_node *node; + + if (match->start == start && match->end == end) + return match; + + node = rb_next(&match->rb); + if (node) + match = container_of(node, struct memtype, rb); + else + match = NULL; + } + + return NULL; /* Returns NULL if there is no exact match */ +} + +static int memtype_rb_check_conflict(struct rb_root *root, + u64 start, u64 end, + unsigned long reqtype, unsigned long *newtype) +{ + struct rb_node *node; + struct memtype *match; + int found_type = reqtype; + + match = memtype_rb_lowest_match(&memtype_rbroot, start, end); + if (match == NULL) + goto success; + + if (match->type != found_type && newtype == NULL) + goto failure; + + dprintk("Overlap at 0x%Lx-0x%Lx\n", match->start, match->end); + found_type = match->type; + + node = rb_next(&match->rb); + while (node) { + match = container_of(node, struct memtype, rb); + + if (match->start >= end) /* Checked all possible matches */ + goto success; + + if (is_node_overlap(match, start, end) && + match->type != found_type) { + goto failure; + } + + node = rb_next(&match->rb); + } +success: + if (newtype) + *newtype = found_type; + + return 0; + +failure: + printk(KERN_INFO "%s:%d conflicting memory types " + "%Lx-%Lx %s<->%s\n", current->comm, current->pid, start, + end, cattr_name(found_type), cattr_name(match->type)); + return -EBUSY; +} + +static void memtype_rb_augment_cb(struct rb_node *node) +{ + if (node) + update_path_max_end(node); +} + +static void memtype_rb_insert(struct rb_root *root, struct memtype *newdata) +{ + struct rb_node **node = &(root->rb_node); + struct rb_node *parent = NULL; + + while (*node) { + struct memtype *data = container_of(*node, struct memtype, rb); + + parent = *node; + if (newdata->start <= data->start) + node = &((*node)->rb_left); + else if (newdata->start > data->start) + node = &((*node)->rb_right); + } + + rb_link_node(&newdata->rb, parent, node); + rb_insert_color(&newdata->rb, root); +} + +int rbt_memtype_check_insert(struct memtype *new, unsigned long *ret_type) +{ + int err = 0; + + err = memtype_rb_check_conflict(&memtype_rbroot, new->start, new->end, + new->type, ret_type); + + if (!err) { + if (ret_type) + new->type = *ret_type; + + memtype_rb_insert(&memtype_rbroot, new); + } + return err; +} + +int rbt_memtype_erase(u64 start, u64 end) +{ + struct memtype *data; + + data = memtype_rb_exact_match(&memtype_rbroot, start, end); + if (!data) + return -EINVAL; + + rb_erase(&data->rb, &memtype_rbroot); + return 0; +} + +struct memtype *rbt_memtype_lookup(u64 addr) +{ + struct memtype *data; + data = memtype_rb_lowest_match(&memtype_rbroot, addr, addr + PAGE_SIZE); + return data; +} + +#if defined(CONFIG_DEBUG_FS) +int rbt_memtype_copy_nth_element(struct memtype *out, loff_t pos) +{ + struct rb_node *node; + int i = 1; + + node = rb_first(&memtype_rbroot); + while (node && pos != i) { + node = rb_next(node); + i++; + } + + if (node) { /* pos == i */ + struct memtype *this = container_of(node, struct memtype, rb); + *out = *this; + return 0; + } else { + return 1; + } +} +#endif diff --git a/arch/x86/mm/srat_64.c b/arch/x86/mm/srat_64.c index 28c6876..f9897f7 100644 --- a/arch/x86/mm/srat_64.c +++ b/arch/x86/mm/srat_64.c @@ -363,6 +363,54 @@ int __init acpi_scan_nodes(unsigned long start, unsigned long end) for (i = 0; i < MAX_NUMNODES; i++) cutoff_node(i, start, end); + /* + * Join together blocks on the same node, holes between + * which don't overlap with memory on other nodes. + */ + for (i = 0; i < num_node_memblks; ++i) { + int j, k; + + for (j = i + 1; j < num_node_memblks; ++j) { + unsigned long start, end; + + if (memblk_nodeid[i] != memblk_nodeid[j]) + continue; + start = min(node_memblk_range[i].end, + node_memblk_range[j].end); + end = max(node_memblk_range[i].start, + node_memblk_range[j].start); + for (k = 0; k < num_node_memblks; ++k) { + if (memblk_nodeid[i] == memblk_nodeid[k]) + continue; + if (start < node_memblk_range[k].end && + end > node_memblk_range[k].start) + break; + } + if (k < num_node_memblks) + continue; + start = min(node_memblk_range[i].start, + node_memblk_range[j].start); + end = max(node_memblk_range[i].end, + node_memblk_range[j].end); + printk(KERN_INFO "SRAT: Node %d " + "[%Lx,%Lx) + [%Lx,%Lx) -> [%lx,%lx)\n", + memblk_nodeid[i], + node_memblk_range[i].start, + node_memblk_range[i].end, + node_memblk_range[j].start, + node_memblk_range[j].end, + start, end); + node_memblk_range[i].start = start; + node_memblk_range[i].end = end; + k = --num_node_memblks - j; + memmove(memblk_nodeid + j, memblk_nodeid + j+1, + k * sizeof(*memblk_nodeid)); + memmove(node_memblk_range + j, node_memblk_range + j+1, + k * sizeof(*node_memblk_range)); + --j; + } + } + memnode_shift = compute_hash_shift(node_memblk_range, num_node_memblks, memblk_nodeid); if (memnode_shift < 0) { @@ -461,7 +509,8 @@ void __init acpi_fake_nodes(const struct bootnode *fake_nodes, int num_nodes) * node, it must now point to the fake node ID. */ for (j = 0; j < MAX_LOCAL_APIC; j++) - if (apicid_to_node[j] == nid) + if (apicid_to_node[j] == nid && + fake_apicid_to_node[j] == NUMA_NO_NODE) fake_apicid_to_node[j] = i; } for (i = 0; i < num_nodes; i++) diff --git a/arch/x86/oprofile/nmi_int.c b/arch/x86/oprofile/nmi_int.c index 2c505ee..b28d2f1 100644 --- a/arch/x86/oprofile/nmi_int.c +++ b/arch/x86/oprofile/nmi_int.c @@ -31,8 +31,9 @@ static struct op_x86_model_spec *model; static DEFINE_PER_CPU(struct op_msrs, cpu_msrs); static DEFINE_PER_CPU(unsigned long, saved_lvtpc); -/* 0 == registered but off, 1 == registered and on */ -static int nmi_enabled = 0; +/* must be protected with get_online_cpus()/put_online_cpus(): */ +static int nmi_enabled; +static int ctr_running; struct op_counter_config counter_config[OP_MAX_COUNTER]; @@ -61,12 +62,16 @@ static int profile_exceptions_notify(struct notifier_block *self, { struct die_args *args = (struct die_args *)data; int ret = NOTIFY_DONE; - int cpu = smp_processor_id(); switch (val) { case DIE_NMI: case DIE_NMI_IPI: - model->check_ctrs(args->regs, &per_cpu(cpu_msrs, cpu)); + if (ctr_running) + model->check_ctrs(args->regs, &__get_cpu_var(cpu_msrs)); + else if (!nmi_enabled) + break; + else + model->stop(&__get_cpu_var(cpu_msrs)); ret = NOTIFY_STOP; break; default: @@ -95,24 +100,36 @@ static void nmi_cpu_save_registers(struct op_msrs *msrs) static void nmi_cpu_start(void *dummy) { struct op_msrs const *msrs = &__get_cpu_var(cpu_msrs); - model->start(msrs); + if (!msrs->controls) + WARN_ON_ONCE(1); + else + model->start(msrs); } static int nmi_start(void) { + get_online_cpus(); on_each_cpu(nmi_cpu_start, NULL, 1); + ctr_running = 1; + put_online_cpus(); return 0; } static void nmi_cpu_stop(void *dummy) { struct op_msrs const *msrs = &__get_cpu_var(cpu_msrs); - model->stop(msrs); + if (!msrs->controls) + WARN_ON_ONCE(1); + else + model->stop(msrs); } static void nmi_stop(void) { + get_online_cpus(); on_each_cpu(nmi_cpu_stop, NULL, 1); + ctr_running = 0; + put_online_cpus(); } #ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX @@ -252,7 +269,10 @@ static int nmi_switch_event(void) if (nmi_multiplex_on() < 0) return -EINVAL; /* not necessary */ - on_each_cpu(nmi_cpu_switch, NULL, 1); + get_online_cpus(); + if (ctr_running) + on_each_cpu(nmi_cpu_switch, NULL, 1); + put_online_cpus(); return 0; } @@ -295,6 +315,7 @@ static void free_msrs(void) kfree(per_cpu(cpu_msrs, i).controls); per_cpu(cpu_msrs, i).controls = NULL; } + nmi_shutdown_mux(); } static int allocate_msrs(void) @@ -307,14 +328,21 @@ static int allocate_msrs(void) per_cpu(cpu_msrs, i).counters = kzalloc(counters_size, GFP_KERNEL); if (!per_cpu(cpu_msrs, i).counters) - return 0; + goto fail; per_cpu(cpu_msrs, i).controls = kzalloc(controls_size, GFP_KERNEL); if (!per_cpu(cpu_msrs, i).controls) - return 0; + goto fail; } + if (!nmi_setup_mux()) + goto fail; + return 1; + +fail: + free_msrs(); + return 0; } static void nmi_cpu_setup(void *dummy) @@ -336,49 +364,6 @@ static struct notifier_block profile_exceptions_nb = { .priority = 2 }; -static int nmi_setup(void) -{ - int err = 0; - int cpu; - - if (!allocate_msrs()) - err = -ENOMEM; - else if (!nmi_setup_mux()) - err = -ENOMEM; - else - err = register_die_notifier(&profile_exceptions_nb); - - if (err) { - free_msrs(); - nmi_shutdown_mux(); - return err; - } - - /* We need to serialize save and setup for HT because the subset - * of msrs are distinct for save and setup operations - */ - - /* Assume saved/restored counters are the same on all CPUs */ - model->fill_in_addresses(&per_cpu(cpu_msrs, 0)); - for_each_possible_cpu(cpu) { - if (!cpu) - continue; - - memcpy(per_cpu(cpu_msrs, cpu).counters, - per_cpu(cpu_msrs, 0).counters, - sizeof(struct op_msr) * model->num_counters); - - memcpy(per_cpu(cpu_msrs, cpu).controls, - per_cpu(cpu_msrs, 0).controls, - sizeof(struct op_msr) * model->num_controls); - - mux_clone(cpu); - } - on_each_cpu(nmi_cpu_setup, NULL, 1); - nmi_enabled = 1; - return 0; -} - static void nmi_cpu_restore_registers(struct op_msrs *msrs) { struct op_msr *counters = msrs->counters; @@ -412,20 +397,24 @@ static void nmi_cpu_shutdown(void *dummy) apic_write(APIC_LVTPC, per_cpu(saved_lvtpc, cpu)); apic_write(APIC_LVTERR, v); nmi_cpu_restore_registers(msrs); + if (model->cpu_down) + model->cpu_down(); } -static void nmi_shutdown(void) +static void nmi_cpu_up(void *dummy) { - struct op_msrs *msrs; + if (nmi_enabled) + nmi_cpu_setup(dummy); + if (ctr_running) + nmi_cpu_start(dummy); +} - nmi_enabled = 0; - on_each_cpu(nmi_cpu_shutdown, NULL, 1); - unregister_die_notifier(&profile_exceptions_nb); - nmi_shutdown_mux(); - msrs = &get_cpu_var(cpu_msrs); - model->shutdown(msrs); - free_msrs(); - put_cpu_var(cpu_msrs); +static void nmi_cpu_down(void *dummy) +{ + if (ctr_running) + nmi_cpu_stop(dummy); + if (nmi_enabled) + nmi_cpu_shutdown(dummy); } static int nmi_create_files(struct super_block *sb, struct dentry *root) @@ -457,7 +446,6 @@ static int nmi_create_files(struct super_block *sb, struct dentry *root) return 0; } -#ifdef CONFIG_SMP static int oprofile_cpu_notifier(struct notifier_block *b, unsigned long action, void *data) { @@ -465,10 +453,10 @@ static int oprofile_cpu_notifier(struct notifier_block *b, unsigned long action, switch (action) { case CPU_DOWN_FAILED: case CPU_ONLINE: - smp_call_function_single(cpu, nmi_cpu_start, NULL, 0); + smp_call_function_single(cpu, nmi_cpu_up, NULL, 0); break; case CPU_DOWN_PREPARE: - smp_call_function_single(cpu, nmi_cpu_stop, NULL, 1); + smp_call_function_single(cpu, nmi_cpu_down, NULL, 1); break; } return NOTIFY_DONE; @@ -477,7 +465,75 @@ static int oprofile_cpu_notifier(struct notifier_block *b, unsigned long action, static struct notifier_block oprofile_cpu_nb = { .notifier_call = oprofile_cpu_notifier }; -#endif + +static int nmi_setup(void) +{ + int err = 0; + int cpu; + + if (!allocate_msrs()) + return -ENOMEM; + + /* We need to serialize save and setup for HT because the subset + * of msrs are distinct for save and setup operations + */ + + /* Assume saved/restored counters are the same on all CPUs */ + err = model->fill_in_addresses(&per_cpu(cpu_msrs, 0)); + if (err) + goto fail; + + for_each_possible_cpu(cpu) { + if (!cpu) + continue; + + memcpy(per_cpu(cpu_msrs, cpu).counters, + per_cpu(cpu_msrs, 0).counters, + sizeof(struct op_msr) * model->num_counters); + + memcpy(per_cpu(cpu_msrs, cpu).controls, + per_cpu(cpu_msrs, 0).controls, + sizeof(struct op_msr) * model->num_controls); + + mux_clone(cpu); + } + + nmi_enabled = 0; + ctr_running = 0; + barrier(); + err = register_die_notifier(&profile_exceptions_nb); + if (err) + goto fail; + + get_online_cpus(); + register_cpu_notifier(&oprofile_cpu_nb); + on_each_cpu(nmi_cpu_setup, NULL, 1); + nmi_enabled = 1; + put_online_cpus(); + + return 0; +fail: + free_msrs(); + return err; +} + +static void nmi_shutdown(void) +{ + struct op_msrs *msrs; + + get_online_cpus(); + unregister_cpu_notifier(&oprofile_cpu_nb); + on_each_cpu(nmi_cpu_shutdown, NULL, 1); + nmi_enabled = 0; + ctr_running = 0; + put_online_cpus(); + barrier(); + unregister_die_notifier(&profile_exceptions_nb); + msrs = &get_cpu_var(cpu_msrs); + model->shutdown(msrs); + free_msrs(); + put_cpu_var(cpu_msrs); +} #ifdef CONFIG_PM @@ -687,9 +743,6 @@ int __init op_nmi_init(struct oprofile_operations *ops) return -ENODEV; } -#ifdef CONFIG_SMP - register_cpu_notifier(&oprofile_cpu_nb); -#endif /* default values, can be overwritten by model */ ops->create_files = nmi_create_files; ops->setup = nmi_setup; @@ -716,12 +769,6 @@ int __init op_nmi_init(struct oprofile_operations *ops) void op_nmi_exit(void) { - if (using_nmi) { + if (using_nmi) exit_sysfs(); -#ifdef CONFIG_SMP - unregister_cpu_notifier(&oprofile_cpu_nb); -#endif - } - if (model->exit) - model->exit(); } diff --git a/arch/x86/oprofile/op_model_amd.c b/arch/x86/oprofile/op_model_amd.c index 090cbbe..b67a6b5 100644 --- a/arch/x86/oprofile/op_model_amd.c +++ b/arch/x86/oprofile/op_model_amd.c @@ -30,13 +30,10 @@ #include "op_counter.h" #define NUM_COUNTERS 4 -#define NUM_CONTROLS 4 #ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX #define NUM_VIRT_COUNTERS 32 -#define NUM_VIRT_CONTROLS 32 #else #define NUM_VIRT_COUNTERS NUM_COUNTERS -#define NUM_VIRT_CONTROLS NUM_CONTROLS #endif #define OP_EVENT_MASK 0x0FFF @@ -105,102 +102,6 @@ static u32 get_ibs_caps(void) return ibs_caps; } -#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX - -static void op_mux_switch_ctrl(struct op_x86_model_spec const *model, - struct op_msrs const * const msrs) -{ - u64 val; - int i; - - /* enable active counters */ - for (i = 0; i < NUM_COUNTERS; ++i) { - int virt = op_x86_phys_to_virt(i); - if (!reset_value[virt]) - continue; - rdmsrl(msrs->controls[i].addr, val); - val &= model->reserved; - val |= op_x86_get_ctrl(model, &counter_config[virt]); - wrmsrl(msrs->controls[i].addr, val); - } -} - -#endif - -/* functions for op_amd_spec */ - -static void op_amd_fill_in_addresses(struct op_msrs * const msrs) -{ - int i; - - for (i = 0; i < NUM_COUNTERS; i++) { - if (reserve_perfctr_nmi(MSR_K7_PERFCTR0 + i)) - msrs->counters[i].addr = MSR_K7_PERFCTR0 + i; - } - - for (i = 0; i < NUM_CONTROLS; i++) { - if (reserve_evntsel_nmi(MSR_K7_EVNTSEL0 + i)) - msrs->controls[i].addr = MSR_K7_EVNTSEL0 + i; - } -} - -static void op_amd_setup_ctrs(struct op_x86_model_spec const *model, - struct op_msrs const * const msrs) -{ - u64 val; - int i; - - /* setup reset_value */ - for (i = 0; i < NUM_VIRT_COUNTERS; ++i) { - if (counter_config[i].enabled - && msrs->counters[op_x86_virt_to_phys(i)].addr) - reset_value[i] = counter_config[i].count; - else - reset_value[i] = 0; - } - - /* clear all counters */ - for (i = 0; i < NUM_CONTROLS; ++i) { - if (unlikely(!msrs->controls[i].addr)) { - if (counter_config[i].enabled && !smp_processor_id()) - /* - * counter is reserved, this is on all - * cpus, so report only for cpu #0 - */ - op_x86_warn_reserved(i); - continue; - } - rdmsrl(msrs->controls[i].addr, val); - if (val & ARCH_PERFMON_EVENTSEL_ENABLE) - op_x86_warn_in_use(i); - val &= model->reserved; - wrmsrl(msrs->controls[i].addr, val); - } - - /* avoid a false detection of ctr overflows in NMI handler */ - for (i = 0; i < NUM_COUNTERS; ++i) { - if (unlikely(!msrs->counters[i].addr)) - continue; - wrmsrl(msrs->counters[i].addr, -1LL); - } - - /* enable active counters */ - for (i = 0; i < NUM_COUNTERS; ++i) { - int virt = op_x86_phys_to_virt(i); - if (!reset_value[virt]) - continue; - - /* setup counter registers */ - wrmsrl(msrs->counters[i].addr, -(u64)reset_value[virt]); - - /* setup control registers */ - rdmsrl(msrs->controls[i].addr, val); - val &= model->reserved; - val |= op_x86_get_ctrl(model, &counter_config[virt]); - wrmsrl(msrs->controls[i].addr, val); - } -} - /* * 16-bit Linear Feedback Shift Register (LFSR) * @@ -365,6 +266,125 @@ static void op_amd_stop_ibs(void) wrmsrl(MSR_AMD64_IBSOPCTL, 0); } +#ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX + +static void op_mux_switch_ctrl(struct op_x86_model_spec const *model, + struct op_msrs const * const msrs) +{ + u64 val; + int i; + + /* enable active counters */ + for (i = 0; i < NUM_COUNTERS; ++i) { + int virt = op_x86_phys_to_virt(i); + if (!reset_value[virt]) + continue; + rdmsrl(msrs->controls[i].addr, val); + val &= model->reserved; + val |= op_x86_get_ctrl(model, &counter_config[virt]); + wrmsrl(msrs->controls[i].addr, val); + } +} + +#endif + +/* functions for op_amd_spec */ + +static void op_amd_shutdown(struct op_msrs const * const msrs) +{ + int i; + + for (i = 0; i < NUM_COUNTERS; ++i) { + if (!msrs->counters[i].addr) + continue; + release_perfctr_nmi(MSR_K7_PERFCTR0 + i); + release_evntsel_nmi(MSR_K7_EVNTSEL0 + i); + } +} + +static int op_amd_fill_in_addresses(struct op_msrs * const msrs) +{ + int i; + + for (i = 0; i < NUM_COUNTERS; i++) { + if (!reserve_perfctr_nmi(MSR_K7_PERFCTR0 + i)) + goto fail; + if (!reserve_evntsel_nmi(MSR_K7_EVNTSEL0 + i)) { + release_perfctr_nmi(MSR_K7_PERFCTR0 + i); + goto fail; + } + /* both registers must be reserved */ + msrs->counters[i].addr = MSR_K7_PERFCTR0 + i; + msrs->controls[i].addr = MSR_K7_EVNTSEL0 + i; + continue; + fail: + if (!counter_config[i].enabled) + continue; + op_x86_warn_reserved(i); + op_amd_shutdown(msrs); + return -EBUSY; + } + + return 0; +} + +static void op_amd_setup_ctrs(struct op_x86_model_spec const *model, + struct op_msrs const * const msrs) +{ + u64 val; + int i; + + /* setup reset_value */ + for (i = 0; i < NUM_VIRT_COUNTERS; ++i) { + if (counter_config[i].enabled + && msrs->counters[op_x86_virt_to_phys(i)].addr) + reset_value[i] = counter_config[i].count; + else + reset_value[i] = 0; + } + + /* clear all counters */ + for (i = 0; i < NUM_COUNTERS; ++i) { + if (!msrs->controls[i].addr) + continue; + rdmsrl(msrs->controls[i].addr, val); + if (val & ARCH_PERFMON_EVENTSEL_ENABLE) + op_x86_warn_in_use(i); + val &= model->reserved; + wrmsrl(msrs->controls[i].addr, val); + /* + * avoid a false detection of ctr overflows in NMI + * handler + */ + wrmsrl(msrs->counters[i].addr, -1LL); + } + + /* enable active counters */ + for (i = 0; i < NUM_COUNTERS; ++i) { + int virt = op_x86_phys_to_virt(i); + if (!reset_value[virt]) + continue; + + /* setup counter registers */ + wrmsrl(msrs->counters[i].addr, -(u64)reset_value[virt]); + + /* setup control registers */ + rdmsrl(msrs->controls[i].addr, val); + val &= model->reserved; + val |= op_x86_get_ctrl(model, &counter_config[virt]); + wrmsrl(msrs->controls[i].addr, val); + } + + if (ibs_caps) + setup_APIC_eilvt_ibs(0, APIC_EILVT_MSG_NMI, 0); +} + +static void op_amd_cpu_shutdown(void) +{ + if (ibs_caps) + setup_APIC_eilvt_ibs(0, APIC_EILVT_MSG_FIX, 1); +} + static int op_amd_check_ctrs(struct pt_regs * const regs, struct op_msrs const * const msrs) { @@ -425,42 +445,16 @@ static void op_amd_stop(struct op_msrs const * const msrs) op_amd_stop_ibs(); } -static void op_amd_shutdown(struct op_msrs const * const msrs) -{ - int i; - - for (i = 0; i < NUM_COUNTERS; ++i) { - if (msrs->counters[i].addr) - release_perfctr_nmi(MSR_K7_PERFCTR0 + i); - } - for (i = 0; i < NUM_CONTROLS; ++i) { - if (msrs->controls[i].addr) - release_evntsel_nmi(MSR_K7_EVNTSEL0 + i); - } -} - -static u8 ibs_eilvt_off; - -static inline void apic_init_ibs_nmi_per_cpu(void *arg) -{ - ibs_eilvt_off = setup_APIC_eilvt_ibs(0, APIC_EILVT_MSG_NMI, 0); -} - -static inline void apic_clear_ibs_nmi_per_cpu(void *arg) -{ - setup_APIC_eilvt_ibs(0, APIC_EILVT_MSG_FIX, 1); -} - -static int init_ibs_nmi(void) +static int __init_ibs_nmi(void) { #define IBSCTL_LVTOFFSETVAL (1 << 8) #define IBSCTL 0x1cc struct pci_dev *cpu_cfg; int nodes; u32 value = 0; + u8 ibs_eilvt_off; - /* per CPU setup */ - on_each_cpu(apic_init_ibs_nmi_per_cpu, NULL, 1); + ibs_eilvt_off = setup_APIC_eilvt_ibs(0, APIC_EILVT_MSG_FIX, 1); nodes = 0; cpu_cfg = NULL; @@ -490,22 +484,15 @@ static int init_ibs_nmi(void) return 0; } -/* uninitialize the APIC for the IBS interrupts if needed */ -static void clear_ibs_nmi(void) -{ - if (ibs_caps) - on_each_cpu(apic_clear_ibs_nmi_per_cpu, NULL, 1); -} - /* initialize the APIC for the IBS interrupts if available */ -static void ibs_init(void) +static void init_ibs(void) { ibs_caps = get_ibs_caps(); if (!ibs_caps) return; - if (init_ibs_nmi()) { + if (__init_ibs_nmi()) { ibs_caps = 0; return; } @@ -514,14 +501,6 @@ static void ibs_init(void) (unsigned)ibs_caps); } -static void ibs_exit(void) -{ - if (!ibs_caps) - return; - - clear_ibs_nmi(); -} - static int (*create_arch_files)(struct super_block *sb, struct dentry *root); static int setup_ibs_files(struct super_block *sb, struct dentry *root) @@ -570,27 +549,22 @@ static int setup_ibs_files(struct super_block *sb, struct dentry *root) static int op_amd_init(struct oprofile_operations *ops) { - ibs_init(); + init_ibs(); create_arch_files = ops->create_files; ops->create_files = setup_ibs_files; return 0; } -static void op_amd_exit(void) -{ - ibs_exit(); -} - struct op_x86_model_spec op_amd_spec = { .num_counters = NUM_COUNTERS, - .num_controls = NUM_CONTROLS, + .num_controls = NUM_COUNTERS, .num_virt_counters = NUM_VIRT_COUNTERS, .reserved = MSR_AMD_EVENTSEL_RESERVED, .event_mask = OP_EVENT_MASK, .init = op_amd_init, - .exit = op_amd_exit, .fill_in_addresses = &op_amd_fill_in_addresses, .setup_ctrs = &op_amd_setup_ctrs, + .cpu_down = &op_amd_cpu_shutdown, .check_ctrs = &op_amd_check_ctrs, .start = &op_amd_start, .stop = &op_amd_stop, diff --git a/arch/x86/oprofile/op_model_p4.c b/arch/x86/oprofile/op_model_p4.c index e6a160a..182558d 100644 --- a/arch/x86/oprofile/op_model_p4.c +++ b/arch/x86/oprofile/op_model_p4.c @@ -385,8 +385,26 @@ static unsigned int get_stagger(void) static unsigned long reset_value[NUM_COUNTERS_NON_HT]; +static void p4_shutdown(struct op_msrs const * const msrs) +{ + int i; -static void p4_fill_in_addresses(struct op_msrs * const msrs) + for (i = 0; i < num_counters; ++i) { + if (msrs->counters[i].addr) + release_perfctr_nmi(msrs->counters[i].addr); + } + /* + * some of the control registers are specially reserved in + * conjunction with the counter registers (hence the starting offset). + * This saves a few bits. + */ + for (i = num_counters; i < num_controls; ++i) { + if (msrs->controls[i].addr) + release_evntsel_nmi(msrs->controls[i].addr); + } +} + +static int p4_fill_in_addresses(struct op_msrs * const msrs) { unsigned int i; unsigned int addr, cccraddr, stag; @@ -468,6 +486,18 @@ static void p4_fill_in_addresses(struct op_msrs * const msrs) msrs->controls[i++].addr = MSR_P4_CRU_ESCR5; } } + + for (i = 0; i < num_counters; ++i) { + if (!counter_config[i].enabled) + continue; + if (msrs->controls[i].addr) + continue; + op_x86_warn_reserved(i); + p4_shutdown(msrs); + return -EBUSY; + } + + return 0; } @@ -668,26 +698,6 @@ static void p4_stop(struct op_msrs const * const msrs) } } -static void p4_shutdown(struct op_msrs const * const msrs) -{ - int i; - - for (i = 0; i < num_counters; ++i) { - if (msrs->counters[i].addr) - release_perfctr_nmi(msrs->counters[i].addr); - } - /* - * some of the control registers are specially reserved in - * conjunction with the counter registers (hence the starting offset). - * This saves a few bits. - */ - for (i = num_counters; i < num_controls; ++i) { - if (msrs->controls[i].addr) - release_evntsel_nmi(msrs->controls[i].addr); - } -} - - #ifdef CONFIG_SMP struct op_x86_model_spec op_p4_ht2_spec = { .num_counters = NUM_COUNTERS_HT2, diff --git a/arch/x86/oprofile/op_model_ppro.c b/arch/x86/oprofile/op_model_ppro.c index c8abc4d..d769cda 100644 --- a/arch/x86/oprofile/op_model_ppro.c +++ b/arch/x86/oprofile/op_model_ppro.c @@ -30,19 +30,46 @@ static int counter_width = 32; static u64 *reset_value; -static void ppro_fill_in_addresses(struct op_msrs * const msrs) +static void ppro_shutdown(struct op_msrs const * const msrs) { int i; - for (i = 0; i < num_counters; i++) { - if (reserve_perfctr_nmi(MSR_P6_PERFCTR0 + i)) - msrs->counters[i].addr = MSR_P6_PERFCTR0 + i; + for (i = 0; i < num_counters; ++i) { + if (!msrs->counters[i].addr) + continue; + release_perfctr_nmi(MSR_P6_PERFCTR0 + i); + release_evntsel_nmi(MSR_P6_EVNTSEL0 + i); + } + if (reset_value) { + kfree(reset_value); + reset_value = NULL; } +} + +static int ppro_fill_in_addresses(struct op_msrs * const msrs) +{ + int i; for (i = 0; i < num_counters; i++) { - if (reserve_evntsel_nmi(MSR_P6_EVNTSEL0 + i)) - msrs->controls[i].addr = MSR_P6_EVNTSEL0 + i; + if (!reserve_perfctr_nmi(MSR_P6_PERFCTR0 + i)) + goto fail; + if (!reserve_evntsel_nmi(MSR_P6_EVNTSEL0 + i)) { + release_perfctr_nmi(MSR_P6_PERFCTR0 + i); + goto fail; + } + /* both registers must be reserved */ + msrs->counters[i].addr = MSR_P6_PERFCTR0 + i; + msrs->controls[i].addr = MSR_P6_EVNTSEL0 + i; + continue; + fail: + if (!counter_config[i].enabled) + continue; + op_x86_warn_reserved(i); + ppro_shutdown(msrs); + return -EBUSY; } + + return 0; } @@ -78,26 +105,17 @@ static void ppro_setup_ctrs(struct op_x86_model_spec const *model, /* clear all counters */ for (i = 0; i < num_counters; ++i) { - if (unlikely(!msrs->controls[i].addr)) { - if (counter_config[i].enabled && !smp_processor_id()) - /* - * counter is reserved, this is on all - * cpus, so report only for cpu #0 - */ - op_x86_warn_reserved(i); + if (!msrs->controls[i].addr) continue; - } rdmsrl(msrs->controls[i].addr, val); if (val & ARCH_PERFMON_EVENTSEL_ENABLE) op_x86_warn_in_use(i); val &= model->reserved; wrmsrl(msrs->controls[i].addr, val); - } - - /* avoid a false detection of ctr overflows in NMI handler */ - for (i = 0; i < num_counters; ++i) { - if (unlikely(!msrs->counters[i].addr)) - continue; + /* + * avoid a false detection of ctr overflows in NMI * + * handler + */ wrmsrl(msrs->counters[i].addr, -1LL); } @@ -189,25 +207,6 @@ static void ppro_stop(struct op_msrs const * const msrs) } } -static void ppro_shutdown(struct op_msrs const * const msrs) -{ - int i; - - for (i = 0; i < num_counters; ++i) { - if (msrs->counters[i].addr) - release_perfctr_nmi(MSR_P6_PERFCTR0 + i); - } - for (i = 0; i < num_counters; ++i) { - if (msrs->controls[i].addr) - release_evntsel_nmi(MSR_P6_EVNTSEL0 + i); - } - if (reset_value) { - kfree(reset_value); - reset_value = NULL; - } -} - - struct op_x86_model_spec op_ppro_spec = { .num_counters = 2, .num_controls = 2, diff --git a/arch/x86/oprofile/op_x86_model.h b/arch/x86/oprofile/op_x86_model.h index ff82a75..89017fa 100644 --- a/arch/x86/oprofile/op_x86_model.h +++ b/arch/x86/oprofile/op_x86_model.h @@ -40,10 +40,10 @@ struct op_x86_model_spec { u64 reserved; u16 event_mask; int (*init)(struct oprofile_operations *ops); - void (*exit)(void); - void (*fill_in_addresses)(struct op_msrs * const msrs); + int (*fill_in_addresses)(struct op_msrs * const msrs); void (*setup_ctrs)(struct op_x86_model_spec const *model, struct op_msrs const * const msrs); + void (*cpu_down)(void); int (*check_ctrs)(struct pt_regs * const regs, struct op_msrs const * const msrs); void (*start)(struct op_msrs const * const msrs); diff --git a/arch/x86/pci/mrst.c b/arch/x86/pci/mrst.c index 8bf2fcb..7ef3a27 100644 --- a/arch/x86/pci/mrst.c +++ b/arch/x86/pci/mrst.c @@ -109,7 +109,7 @@ static int pci_device_update_fixed(struct pci_bus *bus, unsigned int devfn, decode++; decode = ~(decode - 1); } else { - decode = ~0; + decode = 0; } /* @@ -247,6 +247,10 @@ static void __devinit pci_fixed_bar_fixup(struct pci_dev *dev) u32 size; int i; + /* Must have extended configuration space */ + if (dev->cfg_size < PCIE_CAP_OFFSET + 4) + return; + /* Fixup the BAR sizes for fixed BAR devices and make them unmoveable */ offset = fixed_bar_cap(dev->bus, dev->devfn); if (!offset || PCI_DEVFN(2, 0) == dev->devfn || diff --git a/arch/xtensa/include/asm/atomic.h b/arch/xtensa/include/asm/atomic.h index 22d6dde..a96a061 100644 --- a/arch/xtensa/include/asm/atomic.h +++ b/arch/xtensa/include/asm/atomic.h @@ -46,7 +46,7 @@ * * Atomically reads the value of @v. */ -#define atomic_read(v) ((v)->counter) +#define atomic_read(v) (*(volatile int *)&(v)->counter) /** * atomic_set - set atomic variable diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c index 5fe03de..2cc682b 100644 --- a/block/blk-cgroup.c +++ b/block/blk-cgroup.c @@ -286,16 +286,16 @@ done: static struct cgroup_subsys_state * blkiocg_create(struct cgroup_subsys *subsys, struct cgroup *cgroup) { - struct blkio_cgroup *blkcg, *parent_blkcg; + struct blkio_cgroup *blkcg; + struct cgroup *parent = cgroup->parent; - if (!cgroup->parent) { + if (!parent) { blkcg = &blkio_root_cgroup; goto done; } /* Currently we do not support hierarchy deeper than two level (0,1) */ - parent_blkcg = cgroup_to_blkio_cgroup(cgroup->parent); - if (css_depth(&parent_blkcg->css) > 0) + if (parent != cgroup->top_cgroup) return ERR_PTR(-EINVAL); blkcg = kzalloc(sizeof(*blkcg), GFP_KERNEL); diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c index 838834b..5f127cf 100644 --- a/block/cfq-iosched.c +++ b/block/cfq-iosched.c @@ -3694,8 +3694,10 @@ static void *cfq_init_queue(struct request_queue *q) * to make sure that cfq_put_cfqg() does not try to kfree root group */ atomic_set(&cfqg->ref, 1); + rcu_read_lock(); blkiocg_add_blkio_group(&blkio_root_cgroup, &cfqg->blkg, (void *)cfqd, 0); + rcu_read_unlock(); #endif /* * Not strictly needed (since RB_ROOT just clears the node and we diff --git a/drivers/Makefile b/drivers/Makefile index 34f1e10..f42a030 100644 --- a/drivers/Makefile +++ b/drivers/Makefile @@ -17,6 +17,7 @@ obj-$(CONFIG_SFI) += sfi/ obj-$(CONFIG_PNP) += pnp/ obj-$(CONFIG_ARM_AMBA) += amba/ +obj-$(CONFIG_VIRTIO) += virtio/ obj-$(CONFIG_XEN) += xen/ # regulators early, since some subsystems rely on them to initialize @@ -108,7 +109,6 @@ obj-$(CONFIG_PPC_PS3) += ps3/ obj-$(CONFIG_OF) += of/ obj-$(CONFIG_SSB) += ssb/ obj-$(CONFIG_VHOST_NET) += vhost/ -obj-$(CONFIG_VIRTIO) += virtio/ obj-$(CONFIG_VLYNQ) += vlynq/ obj-$(CONFIG_STAGING) += staging/ obj-y += platform/ diff --git a/drivers/acpi/acpi_pad.c b/drivers/acpi/acpi_pad.c index 19dacfd..6212213 100644 --- a/drivers/acpi/acpi_pad.c +++ b/drivers/acpi/acpi_pad.c @@ -31,7 +31,7 @@ #include <acpi/acpi_bus.h> #include <acpi/acpi_drivers.h> -#define ACPI_PROCESSOR_AGGREGATOR_CLASS "processor_aggregator" +#define ACPI_PROCESSOR_AGGREGATOR_CLASS "acpi_pad" #define ACPI_PROCESSOR_AGGREGATOR_DEVICE_NAME "Processor Aggregator" #define ACPI_PROCESSOR_AGGREGATOR_NOTIFY 0x80 static DEFINE_MUTEX(isolated_cpus_lock); diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c index 37132dc..743576b 100644 --- a/drivers/acpi/bus.c +++ b/drivers/acpi/bus.c @@ -527,7 +527,7 @@ int acpi_bus_generate_proc_event4(const char *device_class, const char *bus_id, if (!event_is_open) return 0; - event = kmalloc(sizeof(struct acpi_bus_event), GFP_ATOMIC); + event = kzalloc(sizeof(struct acpi_bus_event), GFP_ATOMIC); if (!event) return -ENOMEM; diff --git a/drivers/acpi/hest.c b/drivers/acpi/hest.c index 4bb18c9..1c527a1 100644 --- a/drivers/acpi/hest.c +++ b/drivers/acpi/hest.c @@ -123,6 +123,10 @@ int acpi_hest_firmware_first_pci(struct pci_dev *pci) { acpi_status status = AE_NOT_FOUND; struct acpi_table_header *hest = NULL; + + if (acpi_disabled) + return 0; + status = acpi_get_table(ACPI_SIG_HEST, 1, &hest); if (ACPI_SUCCESS(status)) { diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c index b0a71ec..e4804fb 100644 --- a/drivers/acpi/pci_irq.c +++ b/drivers/acpi/pci_irq.c @@ -401,11 +401,13 @@ int acpi_pci_irq_enable(struct pci_dev *dev) * driver reported one, then use it. Exit in any case. */ if (gsi < 0) { + u32 dev_gsi; dev_warn(&dev->dev, "PCI INT %c: no GSI", pin_name(pin)); /* Interrupt Line values above 0xF are forbidden */ - if (dev->irq > 0 && (dev->irq <= 0xF)) { - printk(" - using IRQ %d\n", dev->irq); - acpi_register_gsi(&dev->dev, dev->irq, + if (dev->irq > 0 && (dev->irq <= 0xF) && + (acpi_isa_irq_to_gsi(dev->irq, &dev_gsi) == 0)) { + printk(" - using ISA IRQ %d\n", dev->irq); + acpi_register_gsi(&dev->dev, dev_gsi, ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_LOW); return 0; diff --git a/drivers/acpi/power_meter.c b/drivers/acpi/power_meter.c index e8c32a4..66f6729 100644 --- a/drivers/acpi/power_meter.c +++ b/drivers/acpi/power_meter.c @@ -35,7 +35,7 @@ #define ACPI_POWER_METER_NAME "power_meter" ACPI_MODULE_NAME(ACPI_POWER_METER_NAME); #define ACPI_POWER_METER_DEVICE_NAME "Power Meter" -#define ACPI_POWER_METER_CLASS "power_meter_resource" +#define ACPI_POWER_METER_CLASS "pwr_meter_resource" #define NUM_SENSORS 17 diff --git a/drivers/acpi/sbshc.c b/drivers/acpi/sbshc.c index 36704b8..f8be23b 100644 --- a/drivers/acpi/sbshc.c +++ b/drivers/acpi/sbshc.c @@ -18,7 +18,7 @@ #define PREFIX "ACPI: " -#define ACPI_SMB_HC_CLASS "smbus_host_controller" +#define ACPI_SMB_HC_CLASS "smbus_host_ctl" #define ACPI_SMB_HC_DEVICE_NAME "ACPI SMBus HC" struct acpi_smb_hc { diff --git a/drivers/acpi/sleep.c b/drivers/acpi/sleep.c index f74834a..baa76bb 100644 --- a/drivers/acpi/sleep.c +++ b/drivers/acpi/sleep.c @@ -450,6 +450,38 @@ static struct dmi_system_id __initdata acpisleep_dmi_table[] = { }, }, { + .callback = init_set_sci_en_on_resume, + .ident = "Lenovo ThinkPad T410", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"), + DMI_MATCH(DMI_PRODUCT_VERSION, "ThinkPad T410"), + }, + }, + { + .callback = init_set_sci_en_on_resume, + .ident = "Lenovo ThinkPad T510", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"), + DMI_MATCH(DMI_PRODUCT_VERSION, "ThinkPad T510"), + }, + }, + { + .callback = init_set_sci_en_on_resume, + .ident = "Lenovo ThinkPad W510", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"), + DMI_MATCH(DMI_PRODUCT_VERSION, "ThinkPad W510"), + }, + }, + { + .callback = init_set_sci_en_on_resume, + .ident = "Lenovo ThinkPad X201[s]", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"), + DMI_MATCH(DMI_PRODUCT_VERSION, "ThinkPad X201"), + }, + }, + { .callback = init_old_suspend_ordering, .ident = "Panasonic CF51-2L", .matches = { @@ -458,6 +490,30 @@ static struct dmi_system_id __initdata acpisleep_dmi_table[] = { DMI_MATCH(DMI_BOARD_NAME, "CF51-2L"), }, }, + { + .callback = init_set_sci_en_on_resume, + .ident = "Dell Studio 1558", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."), + DMI_MATCH(DMI_PRODUCT_NAME, "Studio 1558"), + }, + }, + { + .callback = init_set_sci_en_on_resume, + .ident = "Dell Studio 1557", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."), + DMI_MATCH(DMI_PRODUCT_NAME, "Studio 1557"), + }, + }, + { + .callback = init_set_sci_en_on_resume, + .ident = "Dell Studio 1555", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."), + DMI_MATCH(DMI_PRODUCT_NAME, "Studio 1555"), + }, + }, {}, }; #endif /* CONFIG_SUSPEND */ diff --git a/drivers/base/iommu.c b/drivers/base/iommu.c index 8ad4ffe..6e6b6a1 100644 --- a/drivers/base/iommu.c +++ b/drivers/base/iommu.c @@ -80,20 +80,6 @@ void iommu_detach_device(struct iommu_domain *domain, struct device *dev) } EXPORT_SYMBOL_GPL(iommu_detach_device); -int iommu_map_range(struct iommu_domain *domain, unsigned long iova, - phys_addr_t paddr, size_t size, int prot) -{ - return iommu_ops->map(domain, iova, paddr, size, prot); -} -EXPORT_SYMBOL_GPL(iommu_map_range); - -void iommu_unmap_range(struct iommu_domain *domain, unsigned long iova, - size_t size) -{ - iommu_ops->unmap(domain, iova, size); -} -EXPORT_SYMBOL_GPL(iommu_unmap_range); - phys_addr_t iommu_iova_to_phys(struct iommu_domain *domain, unsigned long iova) { @@ -107,3 +93,32 @@ int iommu_domain_has_cap(struct iommu_domain *domain, return iommu_ops->domain_has_cap(domain, cap); } EXPORT_SYMBOL_GPL(iommu_domain_has_cap); + +int iommu_map(struct iommu_domain *domain, unsigned long iova, + phys_addr_t paddr, int gfp_order, int prot) +{ + unsigned long invalid_mask; + size_t size; + + size = 0x1000UL << gfp_order; + invalid_mask = size - 1; + + BUG_ON((iova | paddr) & invalid_mask); + + return iommu_ops->map(domain, iova, paddr, gfp_order, prot); +} +EXPORT_SYMBOL_GPL(iommu_map); + +int iommu_unmap(struct iommu_domain *domain, unsigned long iova, int gfp_order) +{ + unsigned long invalid_mask; + size_t size; + + size = 0x1000UL << gfp_order; + invalid_mask = size - 1; + + BUG_ON(iova & invalid_mask); + + return iommu_ops->unmap(domain, iova, gfp_order); +} +EXPORT_SYMBOL_GPL(iommu_unmap); diff --git a/drivers/base/platform.c b/drivers/base/platform.c index 4b4b565..c5fbe19 100644 --- a/drivers/base/platform.c +++ b/drivers/base/platform.c @@ -187,7 +187,7 @@ EXPORT_SYMBOL_GPL(platform_device_alloc); * released. */ int platform_device_add_resources(struct platform_device *pdev, - struct resource *res, unsigned int num) + const struct resource *res, unsigned int num) { struct resource *r; @@ -367,7 +367,7 @@ EXPORT_SYMBOL_GPL(platform_device_unregister); */ struct platform_device *platform_device_register_simple(const char *name, int id, - struct resource *res, + const struct resource *res, unsigned int num) { struct platform_device *pdev; diff --git a/drivers/block/amiflop.c b/drivers/block/amiflop.c index 0182a22..832798a 100644 --- a/drivers/block/amiflop.c +++ b/drivers/block/amiflop.c @@ -66,6 +66,7 @@ #include <linux/blkdev.h> #include <linux/elevator.h> #include <linux/interrupt.h> +#include <linux/platform_device.h> #include <asm/setup.h> #include <asm/uaccess.h> @@ -1696,34 +1697,18 @@ static struct kobject *floppy_find(dev_t dev, int *part, void *data) return get_disk(unit[drive].gendisk); } -static int __init amiga_floppy_init(void) +static int __init amiga_floppy_probe(struct platform_device *pdev) { int i, ret; - if (!MACH_IS_AMIGA) - return -ENODEV; - - if (!AMIGAHW_PRESENT(AMI_FLOPPY)) - return -ENODEV; - if (register_blkdev(FLOPPY_MAJOR,"fd")) return -EBUSY; - /* - * We request DSKPTR, DSKLEN and DSKDATA only, because the other - * floppy registers are too spreaded over the custom register space - */ - ret = -EBUSY; - if (!request_mem_region(CUSTOM_PHYSADDR+0x20, 8, "amiflop [Paula]")) { - printk("fd: cannot get floppy registers\n"); - goto out_blkdev; - } - ret = -ENOMEM; if ((raw_buf = (char *)amiga_chip_alloc (RAW_BUF_SIZE, "Floppy")) == NULL) { printk("fd: cannot get chip mem buffer\n"); - goto out_memregion; + goto out_blkdev; } ret = -EBUSY; @@ -1792,18 +1777,13 @@ out_irq2: free_irq(IRQ_AMIGA_DSKBLK, NULL); out_irq: amiga_chip_free(raw_buf); -out_memregion: - release_mem_region(CUSTOM_PHYSADDR+0x20, 8); out_blkdev: unregister_blkdev(FLOPPY_MAJOR,"fd"); return ret; } -module_init(amiga_floppy_init); -#ifdef MODULE - #if 0 /* not safe to unload */ -void cleanup_module(void) +static int __exit amiga_floppy_remove(struct platform_device *pdev) { int i; @@ -1820,12 +1800,25 @@ void cleanup_module(void) custom.dmacon = DMAF_DISK; /* disable DMA */ amiga_chip_free(raw_buf); blk_cleanup_queue(floppy_queue); - release_mem_region(CUSTOM_PHYSADDR+0x20, 8); unregister_blkdev(FLOPPY_MAJOR, "fd"); } #endif -#else +static struct platform_driver amiga_floppy_driver = { + .driver = { + .name = "amiga-floppy", + .owner = THIS_MODULE, + }, +}; + +static int __init amiga_floppy_init(void) +{ + return platform_driver_probe(&amiga_floppy_driver, amiga_floppy_probe); +} + +module_init(amiga_floppy_init); + +#ifndef MODULE static int __init amiga_floppy_setup (char *str) { int n; @@ -1840,3 +1833,5 @@ static int __init amiga_floppy_setup (char *str) __setup("floppy=", amiga_floppy_setup); #endif + +MODULE_ALIAS("platform:amiga-floppy"); diff --git a/drivers/block/drbd/drbd_worker.c b/drivers/block/drbd/drbd_worker.c index 44bf6d1..d48a1df 100644 --- a/drivers/block/drbd/drbd_worker.c +++ b/drivers/block/drbd/drbd_worker.c @@ -235,7 +235,7 @@ void drbd_endio_pri(struct bio *bio, int error) if (unlikely(error)) { what = (bio_data_dir(bio) == WRITE) ? write_completed_with_error - : (bio_rw(bio) == READA) + : (bio_rw(bio) == READ) ? read_completed_with_error : read_ahead_completed_with_error; } else diff --git a/drivers/block/hd.c b/drivers/block/hd.c index 034e6df..81c78b3 100644 --- a/drivers/block/hd.c +++ b/drivers/block/hd.c @@ -164,12 +164,12 @@ unsigned long read_timer(void) unsigned long t, flags; int i; - spin_lock_irqsave(&i8253_lock, flags); + raw_spin_lock_irqsave(&i8253_lock, flags); t = jiffies * 11932; outb_p(0, 0x43); i = inb_p(0x40); i |= inb(0x40) << 8; - spin_unlock_irqrestore(&i8253_lock, flags); + raw_spin_unlock_irqrestore(&i8253_lock, flags); return(t - i); } #endif diff --git a/drivers/char/serial167.c b/drivers/char/serial167.c index 8dfd247..78a62eb 100644 --- a/drivers/char/serial167.c +++ b/drivers/char/serial167.c @@ -627,7 +627,6 @@ static irqreturn_t cd2401_rx_interrupt(int irq, void *dev_id) char data; int char_count; int save_cnt; - int len; /* determine the channel and change to that context */ channel = (u_short) (base_addr[CyLICR] >> 2); @@ -1528,7 +1527,6 @@ static int cy_ioctl(struct tty_struct *tty, struct file *file, unsigned int cmd, unsigned long arg) { - unsigned long val; struct cyclades_port *info = tty->driver_data; int ret_val = 0; void __user *argp = (void __user *)arg; diff --git a/drivers/char/sysrq.c b/drivers/char/sysrq.c index 59de252..d4e8b21 100644 --- a/drivers/char/sysrq.c +++ b/drivers/char/sysrq.c @@ -289,7 +289,7 @@ static struct sysrq_key_op sysrq_showstate_blocked_op = { static void sysrq_ftrace_dump(int key, struct tty_struct *tty) { - ftrace_dump(); + ftrace_dump(DUMP_ALL); } static struct sysrq_key_op sysrq_ftrace_dump_op = { .handler = sysrq_ftrace_dump, diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c index 6da962c..d71f0fc 100644 --- a/drivers/char/tty_io.c +++ b/drivers/char/tty_io.c @@ -1875,6 +1875,7 @@ got_driver: */ if (filp->f_op == &hung_up_tty_fops) filp->f_op = &tty_fops; + unlock_kernel(); goto retry_open; } unlock_kernel(); diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 75d293e..063b218 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -662,32 +662,20 @@ static ssize_t show_bios_limit(struct cpufreq_policy *policy, char *buf) return sprintf(buf, "%u\n", policy->cpuinfo.max_freq); } -#define define_one_ro(_name) \ -static struct freq_attr _name = \ -__ATTR(_name, 0444, show_##_name, NULL) - -#define define_one_ro0400(_name) \ -static struct freq_attr _name = \ -__ATTR(_name, 0400, show_##_name, NULL) - -#define define_one_rw(_name) \ -static struct freq_attr _name = \ -__ATTR(_name, 0644, show_##_name, store_##_name) - -define_one_ro0400(cpuinfo_cur_freq); -define_one_ro(cpuinfo_min_freq); -define_one_ro(cpuinfo_max_freq); -define_one_ro(cpuinfo_transition_latency); -define_one_ro(scaling_available_governors); -define_one_ro(scaling_driver); -define_one_ro(scaling_cur_freq); -define_one_ro(bios_limit); -define_one_ro(related_cpus); -define_one_ro(affected_cpus); -define_one_rw(scaling_min_freq); -define_one_rw(scaling_max_freq); -define_one_rw(scaling_governor); -define_one_rw(scaling_setspeed); +cpufreq_freq_attr_ro_perm(cpuinfo_cur_freq, 0400); +cpufreq_freq_attr_ro(cpuinfo_min_freq); +cpufreq_freq_attr_ro(cpuinfo_max_freq); +cpufreq_freq_attr_ro(cpuinfo_transition_latency); +cpufreq_freq_attr_ro(scaling_available_governors); +cpufreq_freq_attr_ro(scaling_driver); +cpufreq_freq_attr_ro(scaling_cur_freq); +cpufreq_freq_attr_ro(bios_limit); +cpufreq_freq_attr_ro(related_cpus); +cpufreq_freq_attr_ro(affected_cpus); +cpufreq_freq_attr_rw(scaling_min_freq); +cpufreq_freq_attr_rw(scaling_max_freq); +cpufreq_freq_attr_rw(scaling_governor); +cpufreq_freq_attr_rw(scaling_setspeed); static struct attribute *default_attrs[] = { &cpuinfo_min_freq.attr, diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c index 3a14787..526bfbf 100644 --- a/drivers/cpufreq/cpufreq_conservative.c +++ b/drivers/cpufreq/cpufreq_conservative.c @@ -178,12 +178,8 @@ static ssize_t show_sampling_rate_min(struct kobject *kobj, return sprintf(buf, "%u\n", min_sampling_rate); } -#define define_one_ro(_name) \ -static struct global_attr _name = \ -__ATTR(_name, 0444, show_##_name, NULL) - -define_one_ro(sampling_rate_max); -define_one_ro(sampling_rate_min); +define_one_global_ro(sampling_rate_max); +define_one_global_ro(sampling_rate_min); /* cpufreq_conservative Governor Tunables */ #define show_one(file_name, object) \ @@ -221,12 +217,8 @@ show_one_old(freq_step); show_one_old(sampling_rate_min); show_one_old(sampling_rate_max); -#define define_one_ro_old(object, _name) \ -static struct freq_attr object = \ -__ATTR(_name, 0444, show_##_name##_old, NULL) - -define_one_ro_old(sampling_rate_min_old, sampling_rate_min); -define_one_ro_old(sampling_rate_max_old, sampling_rate_max); +cpufreq_freq_attr_ro_old(sampling_rate_min); +cpufreq_freq_attr_ro_old(sampling_rate_max); /*** delete after deprecation time ***/ @@ -364,16 +356,12 @@ static ssize_t store_freq_step(struct kobject *a, struct attribute *b, return count; } -#define define_one_rw(_name) \ -static struct global_attr _name = \ -__ATTR(_name, 0644, show_##_name, store_##_name) - -define_one_rw(sampling_rate); -define_one_rw(sampling_down_factor); -define_one_rw(up_threshold); -define_one_rw(down_threshold); -define_one_rw(ignore_nice_load); -define_one_rw(freq_step); +define_one_global_rw(sampling_rate); +define_one_global_rw(sampling_down_factor); +define_one_global_rw(up_threshold); +define_one_global_rw(down_threshold); +define_one_global_rw(ignore_nice_load); +define_one_global_rw(freq_step); static struct attribute *dbs_attributes[] = { &sampling_rate_max.attr, @@ -409,16 +397,12 @@ write_one_old(down_threshold); write_one_old(ignore_nice_load); write_one_old(freq_step); -#define define_one_rw_old(object, _name) \ -static struct freq_attr object = \ -__ATTR(_name, 0644, show_##_name##_old, store_##_name##_old) - -define_one_rw_old(sampling_rate_old, sampling_rate); -define_one_rw_old(sampling_down_factor_old, sampling_down_factor); -define_one_rw_old(up_threshold_old, up_threshold); -define_one_rw_old(down_threshold_old, down_threshold); -define_one_rw_old(ignore_nice_load_old, ignore_nice_load); -define_one_rw_old(freq_step_old, freq_step); +cpufreq_freq_attr_rw_old(sampling_rate); +cpufreq_freq_attr_rw_old(sampling_down_factor); +cpufreq_freq_attr_rw_old(up_threshold); +cpufreq_freq_attr_rw_old(down_threshold); +cpufreq_freq_attr_rw_old(ignore_nice_load); +cpufreq_freq_attr_rw_old(freq_step); static struct attribute *dbs_attributes_old[] = { &sampling_rate_max_old.attr, diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c index bd444dc..e131421 100644 --- a/drivers/cpufreq/cpufreq_ondemand.c +++ b/drivers/cpufreq/cpufreq_ondemand.c @@ -73,6 +73,7 @@ enum {DBS_NORMAL_SAMPLE, DBS_SUB_SAMPLE}; struct cpu_dbs_info_s { cputime64_t prev_cpu_idle; + cputime64_t prev_cpu_iowait; cputime64_t prev_cpu_wall; cputime64_t prev_cpu_nice; struct cpufreq_policy *cur_policy; @@ -108,6 +109,7 @@ static struct dbs_tuners { unsigned int down_differential; unsigned int ignore_nice; unsigned int powersave_bias; + unsigned int io_is_busy; } dbs_tuners_ins = { .up_threshold = DEF_FREQUENCY_UP_THRESHOLD, .down_differential = DEF_FREQUENCY_DOWN_DIFFERENTIAL, @@ -148,6 +150,16 @@ static inline cputime64_t get_cpu_idle_time(unsigned int cpu, cputime64_t *wall) return idle_time; } +static inline cputime64_t get_cpu_iowait_time(unsigned int cpu, cputime64_t *wall) +{ + u64 iowait_time = get_cpu_iowait_time_us(cpu, wall); + + if (iowait_time == -1ULL) + return 0; + + return iowait_time; +} + /* * Find right freq to be set now with powersave_bias on. * Returns the freq_hi to be used right now and will set freq_hi_jiffies, @@ -234,12 +246,8 @@ static ssize_t show_sampling_rate_min(struct kobject *kobj, return sprintf(buf, "%u\n", min_sampling_rate); } -#define define_one_ro(_name) \ -static struct global_attr _name = \ -__ATTR(_name, 0444, show_##_name, NULL) - -define_one_ro(sampling_rate_max); -define_one_ro(sampling_rate_min); +define_one_global_ro(sampling_rate_max); +define_one_global_ro(sampling_rate_min); /* cpufreq_ondemand Governor Tunables */ #define show_one(file_name, object) \ @@ -249,6 +257,7 @@ static ssize_t show_##file_name \ return sprintf(buf, "%u\n", dbs_tuners_ins.object); \ } show_one(sampling_rate, sampling_rate); +show_one(io_is_busy, io_is_busy); show_one(up_threshold, up_threshold); show_one(ignore_nice_load, ignore_nice); show_one(powersave_bias, powersave_bias); @@ -274,12 +283,8 @@ show_one_old(powersave_bias); show_one_old(sampling_rate_min); show_one_old(sampling_rate_max); -#define define_one_ro_old(object, _name) \ -static struct freq_attr object = \ -__ATTR(_name, 0444, show_##_name##_old, NULL) - -define_one_ro_old(sampling_rate_min_old, sampling_rate_min); -define_one_ro_old(sampling_rate_max_old, sampling_rate_max); +cpufreq_freq_attr_ro_old(sampling_rate_min); +cpufreq_freq_attr_ro_old(sampling_rate_max); /*** delete after deprecation time ***/ @@ -299,6 +304,23 @@ static ssize_t store_sampling_rate(struct kobject *a, struct attribute *b, return count; } +static ssize_t store_io_is_busy(struct kobject *a, struct attribute *b, + const char *buf, size_t count) +{ + unsigned int input; + int ret; + + ret = sscanf(buf, "%u", &input); + if (ret != 1) + return -EINVAL; + + mutex_lock(&dbs_mutex); + dbs_tuners_ins.io_is_busy = !!input; + mutex_unlock(&dbs_mutex); + + return count; +} + static ssize_t store_up_threshold(struct kobject *a, struct attribute *b, const char *buf, size_t count) { @@ -376,14 +398,11 @@ static ssize_t store_powersave_bias(struct kobject *a, struct attribute *b, return count; } -#define define_one_rw(_name) \ -static struct global_attr _name = \ -__ATTR(_name, 0644, show_##_name, store_##_name) - -define_one_rw(sampling_rate); -define_one_rw(up_threshold); -define_one_rw(ignore_nice_load); -define_one_rw(powersave_bias); +define_one_global_rw(sampling_rate); +define_one_global_rw(io_is_busy); +define_one_global_rw(up_threshold); +define_one_global_rw(ignore_nice_load); +define_one_global_rw(powersave_bias); static struct attribute *dbs_attributes[] = { &sampling_rate_max.attr, @@ -392,6 +411,7 @@ static struct attribute *dbs_attributes[] = { &up_threshold.attr, &ignore_nice_load.attr, &powersave_bias.attr, + &io_is_busy.attr, NULL }; @@ -415,14 +435,10 @@ write_one_old(up_threshold); write_one_old(ignore_nice_load); write_one_old(powersave_bias); -#define define_one_rw_old(object, _name) \ -static struct freq_attr object = \ -__ATTR(_name, 0644, show_##_name##_old, store_##_name##_old) - -define_one_rw_old(sampling_rate_old, sampling_rate); -define_one_rw_old(up_threshold_old, up_threshold); -define_one_rw_old(ignore_nice_load_old, ignore_nice_load); -define_one_rw_old(powersave_bias_old, powersave_bias); +cpufreq_freq_attr_rw_old(sampling_rate); +cpufreq_freq_attr_rw_old(up_threshold); +cpufreq_freq_attr_rw_old(ignore_nice_load); +cpufreq_freq_attr_rw_old(powersave_bias); static struct attribute *dbs_attributes_old[] = { &sampling_rate_max_old.attr, @@ -470,14 +486,15 @@ static void dbs_check_cpu(struct cpu_dbs_info_s *this_dbs_info) for_each_cpu(j, policy->cpus) { struct cpu_dbs_info_s *j_dbs_info; - cputime64_t cur_wall_time, cur_idle_time; - unsigned int idle_time, wall_time; + cputime64_t cur_wall_time, cur_idle_time, cur_iowait_time; + unsigned int idle_time, wall_time, iowait_time; unsigned int load, load_freq; int freq_avg; j_dbs_info = &per_cpu(od_cpu_dbs_info, j); cur_idle_time = get_cpu_idle_time(j, &cur_wall_time); + cur_iowait_time = get_cpu_iowait_time(j, &cur_wall_time); wall_time = (unsigned int) cputime64_sub(cur_wall_time, j_dbs_info->prev_cpu_wall); @@ -487,6 +504,10 @@ static void dbs_check_cpu(struct cpu_dbs_info_s *this_dbs_info) j_dbs_info->prev_cpu_idle); j_dbs_info->prev_cpu_idle = cur_idle_time; + iowait_time = (unsigned int) cputime64_sub(cur_iowait_time, + j_dbs_info->prev_cpu_iowait); + j_dbs_info->prev_cpu_iowait = cur_iowait_time; + if (dbs_tuners_ins.ignore_nice) { cputime64_t cur_nice; unsigned long cur_nice_jiffies; @@ -504,6 +525,16 @@ static void dbs_check_cpu(struct cpu_dbs_info_s *this_dbs_info) idle_time += jiffies_to_usecs(cur_nice_jiffies); } + /* + * For the purpose of ondemand, waiting for disk IO is an + * indication that you're performance critical, and not that + * the system is actually idle. So subtract the iowait time + * from the cpu idle time. + */ + + if (dbs_tuners_ins.io_is_busy && idle_time >= iowait_time) + idle_time -= iowait_time; + if (unlikely(!wall_time || wall_time < idle_time)) continue; @@ -617,6 +648,29 @@ static inline void dbs_timer_exit(struct cpu_dbs_info_s *dbs_info) cancel_delayed_work_sync(&dbs_info->work); } +/* + * Not all CPUs want IO time to be accounted as busy; this dependson how + * efficient idling at a higher frequency/voltage is. + * Pavel Machek says this is not so for various generations of AMD and old + * Intel systems. + * Mike Chan (androidlcom) calis this is also not true for ARM. + * Because of this, whitelist specific known (series) of CPUs by default, and + * leave all others up to the user. + */ +static int should_io_be_busy(void) +{ +#if defined(CONFIG_X86) + /* + * For Intel, Core 2 (model 15) andl later have an efficient idle. + */ + if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL && + boot_cpu_data.x86 == 6 && + boot_cpu_data.x86_model >= 15) + return 1; +#endif + return 0; +} + static int cpufreq_governor_dbs(struct cpufreq_policy *policy, unsigned int event) { @@ -679,6 +733,7 @@ static int cpufreq_governor_dbs(struct cpufreq_policy *policy, dbs_tuners_ins.sampling_rate = max(min_sampling_rate, latency * LATENCY_MULTIPLIER); + dbs_tuners_ins.io_is_busy = should_io_be_busy(); } mutex_unlock(&dbs_mutex); diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c index 1aea715..f8e57c6 100644 --- a/drivers/cpuidle/governors/menu.c +++ b/drivers/cpuidle/governors/menu.c @@ -100,7 +100,6 @@ struct menu_device { int needs_update; unsigned int expected_us; - unsigned int measured_us; u64 predicted_us; unsigned int exit_us; unsigned int bucket; @@ -187,14 +186,14 @@ static int menu_select(struct cpuidle_device *dev) int i; int multiplier; - data->last_state_idx = 0; - data->exit_us = 0; - if (data->needs_update) { menu_update(dev); data->needs_update = 0; } + data->last_state_idx = 0; + data->exit_us = 0; + /* Special case when user has set very strict latency requirement */ if (unlikely(latency_req == 0)) return 0; @@ -294,7 +293,7 @@ static void menu_update(struct cpuidle_device *dev) new_factor = data->correction_factor[data->bucket] * (DECAY - 1) / DECAY; - if (data->expected_us > 0 && data->measured_us < MAX_INTERESTING) + if (data->expected_us > 0 && measured_us < MAX_INTERESTING) new_factor += RESOLUTION * measured_us / data->expected_us; else /* diff --git a/drivers/dma/shdma.c b/drivers/dma/shdma.c index 7cc31b3..6f25a20 100644 --- a/drivers/dma/shdma.c +++ b/drivers/dma/shdma.c @@ -290,6 +290,7 @@ static int sh_dmae_alloc_chan_resources(struct dma_chan *chan) struct sh_dmae_chan *sh_chan = to_sh_chan(chan); struct sh_desc *desc; struct sh_dmae_slave *param = chan->private; + int ret; pm_runtime_get_sync(sh_chan->dev); @@ -301,11 +302,15 @@ static int sh_dmae_alloc_chan_resources(struct dma_chan *chan) struct sh_dmae_slave_config *cfg; cfg = sh_dmae_find_slave(sh_chan, param->slave_id); - if (!cfg) - return -EINVAL; + if (!cfg) { + ret = -EINVAL; + goto efindslave; + } - if (test_and_set_bit(param->slave_id, sh_dmae_slave_used)) - return -EBUSY; + if (test_and_set_bit(param->slave_id, sh_dmae_slave_used)) { + ret = -EBUSY; + goto etestused; + } param->config = cfg; @@ -334,10 +339,20 @@ static int sh_dmae_alloc_chan_resources(struct dma_chan *chan) } spin_unlock_bh(&sh_chan->desc_lock); - if (!sh_chan->descs_allocated) - pm_runtime_put(sh_chan->dev); + if (!sh_chan->descs_allocated) { + ret = -ENOMEM; + goto edescalloc; + } return sh_chan->descs_allocated; + +edescalloc: + if (param) + clear_bit(param->slave_id, sh_dmae_slave_used); +etestused: +efindslave: + pm_runtime_put(sh_chan->dev); + return ret; } /* diff --git a/drivers/gpio/it8761e_gpio.c b/drivers/gpio/it8761e_gpio.c index 753219c..41a9388 100644 --- a/drivers/gpio/it8761e_gpio.c +++ b/drivers/gpio/it8761e_gpio.c @@ -80,8 +80,8 @@ static int it8761e_gpio_get(struct gpio_chip *gc, unsigned gpio_num) u16 reg; u8 bit; - bit = gpio_num % 7; - reg = (gpio_num >= 7) ? gpio_ba + 1 : gpio_ba; + bit = gpio_num % 8; + reg = (gpio_num >= 8) ? gpio_ba + 1 : gpio_ba; return !!(inb(reg) & (1 << bit)); } @@ -91,8 +91,8 @@ static int it8761e_gpio_direction_in(struct gpio_chip *gc, unsigned gpio_num) u8 curr_dirs; u8 io_reg, bit; - bit = gpio_num % 7; - io_reg = (gpio_num >= 7) ? GPIO2X_IO : GPIO1X_IO; + bit = gpio_num % 8; + io_reg = (gpio_num >= 8) ? GPIO2X_IO : GPIO1X_IO; spin_lock(&sio_lock); @@ -116,8 +116,8 @@ static void it8761e_gpio_set(struct gpio_chip *gc, u8 curr_vals, bit; u16 reg; - bit = gpio_num % 7; - reg = (gpio_num >= 7) ? gpio_ba + 1 : gpio_ba; + bit = gpio_num % 8; + reg = (gpio_num >= 8) ? gpio_ba + 1 : gpio_ba; spin_lock(&sio_lock); @@ -135,8 +135,8 @@ static int it8761e_gpio_direction_out(struct gpio_chip *gc, { u8 curr_dirs, io_reg, bit; - bit = gpio_num % 7; - io_reg = (gpio_num >= 7) ? GPIO2X_IO : GPIO1X_IO; + bit = gpio_num % 8; + io_reg = (gpio_num >= 8) ? GPIO2X_IO : GPIO1X_IO; it8761e_gpio_set(gc, gpio_num, val); @@ -200,7 +200,7 @@ static int __init it8761e_gpio_init(void) return -EBUSY; it8761e_gpio_chip.base = -1; - it8761e_gpio_chip.ngpio = 14; + it8761e_gpio_chip.ngpio = 16; err = gpiochip_add(&it8761e_gpio_chip); if (err < 0) diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 2b8b969..df6a9cd 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -456,11 +456,15 @@ i915_error_object_create(struct drm_device *dev, for (page = 0; page < page_count; page++) { void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC); + unsigned long flags; + if (d == NULL) goto unwind; - s = kmap_atomic(src_priv->pages[page], KM_USER0); + local_irq_save(flags); + s = kmap_atomic(src_priv->pages[page], KM_IRQ0); memcpy(d, s, PAGE_SIZE); - kunmap_atomic(s, KM_USER0); + kunmap_atomic(s, KM_IRQ0); + local_irq_restore(flags); dst->pages[page] = d; } dst->page_count = page_count; diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c index 4b05563..b3749d4 100644 --- a/drivers/gpu/drm/radeon/radeon_drv.c +++ b/drivers/gpu/drm/radeon/radeon_drv.c @@ -216,6 +216,7 @@ static struct drm_driver driver_old = { .mmap = drm_mmap, .poll = drm_poll, .fasync = drm_fasync, + .read = drm_read, #ifdef CONFIG_COMPAT .compat_ioctl = radeon_compat_ioctl, #endif @@ -304,6 +305,7 @@ static struct drm_driver kms_driver = { .mmap = radeon_mmap, .poll = drm_poll, .fasync = drm_fasync, + .read = drm_read, #ifdef CONFIG_COMPAT .compat_ioctl = radeon_kms_compat_ioctl, #endif diff --git a/drivers/gpu/drm/radeon/radeon_state.c b/drivers/gpu/drm/radeon/radeon_state.c index 40ab6d9..cc5316d 100644 --- a/drivers/gpu/drm/radeon/radeon_state.c +++ b/drivers/gpu/drm/radeon/radeon_state.c @@ -424,7 +424,7 @@ static __inline__ int radeon_check_and_fixup_packet3(drm_radeon_private_t * if ((*cmd & RADEON_GMC_SRC_PITCH_OFFSET_CNTL) && (*cmd & RADEON_GMC_DST_PITCH_OFFSET_CNTL)) { u32 *cmd3 = drm_buffer_pointer_to_dword(cmdbuf->buffer, 3); - offset = *cmd << 10; + offset = *cmd3 << 10; if (radeon_check_and_fixup_offset (dev_priv, file_priv, &offset)) { DRM_ERROR("Invalid second packet offset\n"); @@ -2895,9 +2895,12 @@ static int radeon_cp_cmdbuf(struct drm_device *dev, void *data, return rv; rv = drm_buffer_copy_from_user(cmdbuf->buffer, buffer, cmdbuf->bufsz); - if (rv) + if (rv) { + drm_buffer_free(cmdbuf->buffer); return rv; - } + } + } else + goto done; orig_nbox = cmdbuf->nbox; @@ -2905,8 +2908,7 @@ static int radeon_cp_cmdbuf(struct drm_device *dev, void *data, int temp; temp = r300_do_cp_cmdbuf(dev, file_priv, cmdbuf); - if (cmdbuf->bufsz != 0) - drm_buffer_free(cmdbuf->buffer); + drm_buffer_free(cmdbuf->buffer); return temp; } @@ -3012,16 +3014,15 @@ static int radeon_cp_cmdbuf(struct drm_device *dev, void *data, } } - if (cmdbuf->bufsz != 0) - drm_buffer_free(cmdbuf->buffer); + drm_buffer_free(cmdbuf->buffer); + done: DRM_DEBUG("DONE\n"); COMMIT_RING(); return 0; err: - if (cmdbuf->bufsz != 0) - drm_buffer_free(cmdbuf->buffer); + drm_buffer_free(cmdbuf->buffer); return -EINVAL; } diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index dd47b2a..0e3754a 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -1716,40 +1716,12 @@ int ttm_bo_wait(struct ttm_buffer_object *bo, } EXPORT_SYMBOL(ttm_bo_wait); -void ttm_bo_unblock_reservation(struct ttm_buffer_object *bo) -{ - atomic_set(&bo->reserved, 0); - wake_up_all(&bo->event_queue); -} - -int ttm_bo_block_reservation(struct ttm_buffer_object *bo, bool interruptible, - bool no_wait) -{ - int ret; - - while (unlikely(atomic_cmpxchg(&bo->reserved, 0, 1) != 0)) { - if (no_wait) - return -EBUSY; - else if (interruptible) { - ret = wait_event_interruptible - (bo->event_queue, atomic_read(&bo->reserved) == 0); - if (unlikely(ret != 0)) - return ret; - } else { - wait_event(bo->event_queue, - atomic_read(&bo->reserved) == 0); - } - } - return 0; -} - int ttm_bo_synccpu_write_grab(struct ttm_buffer_object *bo, bool no_wait) { int ret = 0; /* - * Using ttm_bo_reserve instead of ttm_bo_block_reservation - * makes sure the lru lists are updated. + * Using ttm_bo_reserve makes sure the lru lists are updated. */ ret = ttm_bo_reserve(bo, true, no_wait, false, 0); diff --git a/drivers/gpu/drm/ttm/ttm_lock.c b/drivers/gpu/drm/ttm/ttm_lock.c index 3d172ef..de41e55 100644 --- a/drivers/gpu/drm/ttm/ttm_lock.c +++ b/drivers/gpu/drm/ttm/ttm_lock.c @@ -204,7 +204,6 @@ static int __ttm_vt_unlock(struct ttm_lock *lock) lock->flags &= ~TTM_VT_LOCK; wake_up_all(&lock->queue); spin_unlock(&lock->lock); - printk(KERN_INFO TTM_PFX "vt unlock.\n"); return ret; } @@ -265,10 +264,8 @@ int ttm_vt_lock(struct ttm_lock *lock, ttm_lock_type, &ttm_vt_lock_remove, NULL); if (ret) (void)__ttm_vt_unlock(lock); - else { + else lock->vt_holder = tfile; - printk(KERN_INFO TTM_PFX "vt lock.\n"); - } return ret; } diff --git a/drivers/hid/hid-cherry.c b/drivers/hid/hid-cherry.c index 7e597d7..24663a8 100644 --- a/drivers/hid/hid-cherry.c +++ b/drivers/hid/hid-cherry.c @@ -59,6 +59,7 @@ static int ch_input_mapping(struct hid_device *hdev, struct hid_input *hi, static const struct hid_device_id ch_devices[] = { { HID_USB_DEVICE(USB_VENDOR_ID_CHERRY, USB_DEVICE_ID_CHERRY_CYMOTION) }, + { HID_USB_DEVICE(USB_VENDOR_ID_CHERRY, USB_DEVICE_ID_CHERRY_CYMOTION_SOLAR) }, { } }; MODULE_DEVICE_TABLE(hid, ch_devices); diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c index 2e2aa75..143e788 100644 --- a/drivers/hid/hid-core.c +++ b/drivers/hid/hid-core.c @@ -1043,13 +1043,8 @@ void hid_report_raw_event(struct hid_device *hid, int type, u8 *data, int size, if ((hid->claimed & HID_CLAIMED_HIDDEV) && hid->hiddev_report_event) hid->hiddev_report_event(hid, report); - if (hid->claimed & HID_CLAIMED_HIDRAW) { - /* numbered reports need to be passed with the report num */ - if (report_enum->numbered) - hidraw_report_event(hid, data - 1, size + 1); - else - hidraw_report_event(hid, data, size); - } + if (hid->claimed & HID_CLAIMED_HIDRAW) + hidraw_report_event(hid, data, size); for (a = 0; a < report->maxfield; a++) hid_input_field(hid, report->field[a], cdata, interrupt); @@ -1296,6 +1291,7 @@ static const struct hid_device_id hid_blacklist[] = { { HID_USB_DEVICE(USB_VENDOR_ID_APPLE, USB_DEVICE_ID_APPLE_GEYSER1_TP_ONLY) }, { HID_USB_DEVICE(USB_VENDOR_ID_BELKIN, USB_DEVICE_ID_FLIP_KVM) }, { HID_USB_DEVICE(USB_VENDOR_ID_CHERRY, USB_DEVICE_ID_CHERRY_CYMOTION) }, + { HID_USB_DEVICE(USB_VENDOR_ID_CHERRY, USB_DEVICE_ID_CHERRY_CYMOTION_SOLAR) }, { HID_USB_DEVICE(USB_VENDOR_ID_CHICONY, USB_DEVICE_ID_CHICONY_TACTICAL_PAD) }, { HID_USB_DEVICE(USB_VENDOR_ID_CYPRESS, USB_DEVICE_ID_CYPRESS_BARCODE_1) }, { HID_USB_DEVICE(USB_VENDOR_ID_CYPRESS, USB_DEVICE_ID_CYPRESS_BARCODE_2) }, diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h index 797e064..09d2764 100644 --- a/drivers/hid/hid-ids.h +++ b/drivers/hid/hid-ids.h @@ -131,6 +131,7 @@ #define USB_VENDOR_ID_CHERRY 0x046a #define USB_DEVICE_ID_CHERRY_CYMOTION 0x0023 +#define USB_DEVICE_ID_CHERRY_CYMOTION_SOLAR 0x0027 #define USB_VENDOR_ID_CHIC 0x05fe #define USB_DEVICE_ID_CHIC_GAMEPAD 0x0014 diff --git a/drivers/hid/hid-ntrig.c b/drivers/hid/hid-ntrig.c index 9b24fc5..4777bbf 100644 --- a/drivers/hid/hid-ntrig.c +++ b/drivers/hid/hid-ntrig.c @@ -1,8 +1,8 @@ /* * HID driver for N-Trig touchscreens * - * Copyright (c) 2008 Rafi Rubin - * Copyright (c) 2009 Stephane Chatty + * Copyright (c) 2008-2010 Rafi Rubin + * Copyright (c) 2009-2010 Stephane Chatty * */ @@ -15,6 +15,8 @@ #include <linux/device.h> #include <linux/hid.h> +#include <linux/usb.h> +#include "usbhid/usbhid.h" #include <linux/module.h> #include <linux/slab.h> @@ -22,17 +24,16 @@ #define NTRIG_DUPLICATE_USAGES 0x001 -#define nt_map_key_clear(c) hid_map_usage_clear(hi, usage, bit, max, \ - EV_KEY, (c)) - struct ntrig_data { /* Incoming raw values for a single contact */ __u16 x, y, w, h; __u16 id; - __u8 confidence; + + bool tipswitch; + bool confidence; + bool first_contact_touch; bool reading_mt; - __u8 first_contact_confidence; __u8 mt_footer[4]; __u8 mt_foot_count; @@ -139,9 +140,10 @@ static int ntrig_event (struct hid_device *hid, struct hid_field *field, case 0xff000001: /* Tag indicating the start of a multitouch group */ nd->reading_mt = 1; - nd->first_contact_confidence = 0; + nd->first_contact_touch = 0; break; case HID_DG_TIPSWITCH: + nd->tipswitch = value; /* Prevent emission of touch until validated */ return 1; case HID_DG_CONFIDENCE: @@ -169,8 +171,14 @@ static int ntrig_event (struct hid_device *hid, struct hid_field *field, * to emit a normal (X, Y) position */ if (!nd->reading_mt) { + /* + * TipSwitch indicates the presence of a + * finger in single touch mode. + */ + input_report_key(input, BTN_TOUCH, + nd->tipswitch); input_report_key(input, BTN_TOOL_DOUBLETAP, - (nd->confidence != 0)); + nd->tipswitch); input_event(input, EV_ABS, ABS_X, nd->x); input_event(input, EV_ABS, ABS_Y, nd->y); } @@ -209,7 +217,13 @@ static int ntrig_event (struct hid_device *hid, struct hid_field *field, /* emit a normal (X, Y) for the first point only */ if (nd->id == 0) { - nd->first_contact_confidence = nd->confidence; + /* + * TipSwitch is superfluous in multitouch + * mode. The footer events tell us + * if there is a finger on the screen or + * not. + */ + nd->first_contact_touch = nd->confidence; input_event(input, EV_ABS, ABS_X, nd->x); input_event(input, EV_ABS, ABS_Y, nd->y); } @@ -239,30 +253,11 @@ static int ntrig_event (struct hid_device *hid, struct hid_field *field, nd->reading_mt = 0; - if (nd->first_contact_confidence) { - switch (value) { - case 0: /* for single touch devices */ - case 1: - input_report_key(input, - BTN_TOOL_DOUBLETAP, 1); - break; - case 2: - input_report_key(input, - BTN_TOOL_TRIPLETAP, 1); - break; - case 3: - default: - input_report_key(input, - BTN_TOOL_QUADTAP, 1); - } + if (nd->first_contact_touch) { + input_report_key(input, BTN_TOOL_DOUBLETAP, 1); input_report_key(input, BTN_TOUCH, 1); } else { - input_report_key(input, - BTN_TOOL_DOUBLETAP, 0); - input_report_key(input, - BTN_TOOL_TRIPLETAP, 0); - input_report_key(input, - BTN_TOOL_QUADTAP, 0); + input_report_key(input, BTN_TOOL_DOUBLETAP, 0); input_report_key(input, BTN_TOUCH, 0); } break; @@ -286,6 +281,7 @@ static int ntrig_probe(struct hid_device *hdev, const struct hid_device_id *id) struct ntrig_data *nd; struct hid_input *hidinput; struct input_dev *input; + struct hid_report *report; if (id->driver_data) hdev->quirks |= HID_QUIRK_MULTI_INPUT; @@ -327,13 +323,7 @@ static int ntrig_probe(struct hid_device *hdev, const struct hid_device_id *id) __clear_bit(BTN_TOOL_PEN, input->keybit); __clear_bit(BTN_TOOL_FINGER, input->keybit); __clear_bit(BTN_0, input->keybit); - /* - * A little something special to enable - * two and three finger taps. - */ __set_bit(BTN_TOOL_DOUBLETAP, input->keybit); - __set_bit(BTN_TOOL_TRIPLETAP, input->keybit); - __set_bit(BTN_TOOL_QUADTAP, input->keybit); /* * The physical touchscreen (single touch) * input has a value for physical, whereas @@ -349,6 +339,12 @@ static int ntrig_probe(struct hid_device *hdev, const struct hid_device_id *id) } } + /* This is needed for devices with more recent firmware versions */ + report = hdev->report_enum[HID_FEATURE_REPORT].report_id_hash[0x0a]; + if (report) + usbhid_submit_report(hdev, report, USB_DIR_OUT); + + return 0; err_free: kfree(nd); diff --git a/drivers/hid/hid-sony.c b/drivers/hid/hid-sony.c index 7502a4b..402d557 100644 --- a/drivers/hid/hid-sony.c +++ b/drivers/hid/hid-sony.c @@ -76,7 +76,7 @@ static int sony_set_operational_usb(struct hid_device *hdev) static int sony_set_operational_bt(struct hid_device *hdev) { - unsigned char buf[] = { 0x53, 0xf4, 0x42, 0x03, 0x00, 0x00 }; + unsigned char buf[] = { 0xf4, 0x42, 0x03, 0x00, 0x00 }; return hdev->hid_output_raw_report(hdev, buf, sizeof(buf), HID_FEATURE_REPORT); } diff --git a/drivers/hid/hid-wacom.c b/drivers/hid/hid-wacom.c index f7700cf..f947d83 100644 --- a/drivers/hid/hid-wacom.c +++ b/drivers/hid/hid-wacom.c @@ -277,7 +277,6 @@ static int __init wacom_init(void) ret = hid_register_driver(&wacom_driver); if (ret) printk(KERN_ERR "can't register wacom driver\n"); - printk(KERN_ERR "wacom driver registered\n"); return ret; } diff --git a/drivers/hid/usbhid/hid-core.c b/drivers/hid/usbhid/hid-core.c index 56d06cd..7b85b69 100644 --- a/drivers/hid/usbhid/hid-core.c +++ b/drivers/hid/usbhid/hid-core.c @@ -999,13 +999,6 @@ static int usbhid_start(struct hid_device *hid) } } - init_waitqueue_head(&usbhid->wait); - INIT_WORK(&usbhid->reset_work, hid_reset); - INIT_WORK(&usbhid->restart_work, __usbhid_restart_queues); - setup_timer(&usbhid->io_retry, hid_retry_timeout, (unsigned long) hid); - - spin_lock_init(&usbhid->lock); - usbhid->urbctrl = usb_alloc_urb(0, GFP_KERNEL); if (!usbhid->urbctrl) { ret = -ENOMEM; @@ -1179,6 +1172,12 @@ static int usbhid_probe(struct usb_interface *intf, const struct usb_device_id * usbhid->intf = intf; usbhid->ifnum = interface->desc.bInterfaceNumber; + init_waitqueue_head(&usbhid->wait); + INIT_WORK(&usbhid->reset_work, hid_reset); + INIT_WORK(&usbhid->restart_work, __usbhid_restart_queues); + setup_timer(&usbhid->io_retry, hid_retry_timeout, (unsigned long) hid); + spin_lock_init(&usbhid->lock); + ret = hid_add_device(hid); if (ret) { if (ret != -ENODEV) diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c index 0f28d91..f085c18 100644 --- a/drivers/hwmon/applesmc.c +++ b/drivers/hwmon/applesmc.c @@ -195,6 +195,9 @@ static unsigned int applesmc_accelerometer; /* Indicates whether this computer has light sensors and keyboard backlight. */ static unsigned int applesmc_light; +/* The number of fans handled by the driver */ +static unsigned int fans_handled; + /* Indicates which temperature sensors set to use. */ static unsigned int applesmc_temperature_set; @@ -1492,39 +1495,24 @@ static int __init applesmc_init(void) /* create fan files */ count = applesmc_get_fan_count(); - if (count < 0) { + if (count < 0) printk(KERN_ERR "applesmc: Cannot get the number of fans.\n"); - } else { + else printk(KERN_INFO "applesmc: %d fans found.\n", count); - switch (count) { - default: - printk(KERN_WARNING "applesmc: More than 4 fans found," - " but at most 4 fans are supported" - " by the driver.\n"); - case 4: - ret = sysfs_create_group(&pdev->dev.kobj, - &fan_attribute_groups[3]); - if (ret) - goto out_key_enumeration; - case 3: - ret = sysfs_create_group(&pdev->dev.kobj, - &fan_attribute_groups[2]); - if (ret) - goto out_key_enumeration; - case 2: - ret = sysfs_create_group(&pdev->dev.kobj, - &fan_attribute_groups[1]); - if (ret) - goto out_key_enumeration; - case 1: - ret = sysfs_create_group(&pdev->dev.kobj, - &fan_attribute_groups[0]); - if (ret) - goto out_fan_1; - case 0: - ; - } + if (count > 4) { + count = 4; + printk(KERN_WARNING "applesmc: More than 4 fans found," + " but at most 4 fans are supported" + " by the driver.\n"); + } + + while (fans_handled < count) { + ret = sysfs_create_group(&pdev->dev.kobj, + &fan_attribute_groups[fans_handled]); + if (ret) + goto out_fans; + fans_handled++; } for (i = 0; @@ -1593,10 +1581,10 @@ out_accelerometer: applesmc_release_accelerometer(); out_temperature: sysfs_remove_group(&pdev->dev.kobj, &temperature_attributes_group); - sysfs_remove_group(&pdev->dev.kobj, &fan_attribute_groups[0]); -out_fan_1: - sysfs_remove_group(&pdev->dev.kobj, &fan_attribute_groups[1]); -out_key_enumeration: +out_fans: + while (fans_handled) + sysfs_remove_group(&pdev->dev.kobj, + &fan_attribute_groups[--fans_handled]); sysfs_remove_group(&pdev->dev.kobj, &key_enumeration_group); out_name: sysfs_remove_file(&pdev->dev.kobj, &dev_attr_name.attr); @@ -1622,8 +1610,9 @@ static void __exit applesmc_exit(void) if (applesmc_accelerometer) applesmc_release_accelerometer(); sysfs_remove_group(&pdev->dev.kobj, &temperature_attributes_group); - sysfs_remove_group(&pdev->dev.kobj, &fan_attribute_groups[0]); - sysfs_remove_group(&pdev->dev.kobj, &fan_attribute_groups[1]); + while (fans_handled) + sysfs_remove_group(&pdev->dev.kobj, + &fan_attribute_groups[--fans_handled]); sysfs_remove_group(&pdev->dev.kobj, &key_enumeration_group); sysfs_remove_file(&pdev->dev.kobj, &dev_attr_name.attr); platform_device_unregister(pdev); diff --git a/drivers/hwmon/asc7621.c b/drivers/hwmon/asc7621.c index 7f94810..0f388ad 100644 --- a/drivers/hwmon/asc7621.c +++ b/drivers/hwmon/asc7621.c @@ -268,8 +268,11 @@ static ssize_t store_fan16(struct device *dev, if (strict_strtol(buf, 10, &reqval)) return -EINVAL; + /* If a minimum RPM of zero is requested, then we set the register to + 0xffff. This value allows the fan to be stopped completely without + generating an alarm. */ reqval = - (SENSORS_LIMIT((reqval) <= 0 ? 0 : 5400000 / (reqval), 0, 65534)); + (reqval <= 0 ? 0xffff : SENSORS_LIMIT(5400000 / reqval, 0, 0xfffe)); mutex_lock(&data->update_lock); data->reg[param->msb[0]] = (reqval >> 8) & 0xff; @@ -285,8 +288,9 @@ static ssize_t store_fan16(struct device *dev, * Voltages are scaled in the device so that the nominal voltage * is 3/4ths of the 0-255 range (i.e. 192). * If all voltages are 'normal' then all voltage registers will - * read 0xC0. This doesn't help us if we don't have a point of refernce. - * The data sheet however provides us with the full scale value for each + * read 0xC0. + * + * The data sheet provides us with the 3/4 scale value for each voltage * which is stored in in_scaling. The sda->index parameter value provides * the index into in_scaling. * @@ -295,7 +299,7 @@ static ssize_t store_fan16(struct device *dev, */ static int asc7621_in_scaling[] = { - 3320, 3000, 4380, 6640, 16000 + 2500, 2250, 3300, 5000, 12000 }; static ssize_t show_in10(struct device *dev, struct device_attribute *attr, @@ -306,19 +310,12 @@ static ssize_t show_in10(struct device *dev, struct device_attribute *attr, u8 nr = sda->index; mutex_lock(&data->update_lock); - regval = (data->reg[param->msb[0]] * asc7621_in_scaling[nr]) / 256; - - /* The LSB value is a 2-bit scaling of the MSB's LSbit value. - * I.E. If the maximim voltage for this input is 6640 millivolts then - * a MSB register value of 0 = 0mv and 255 = 6640mv. - * A 1 step change therefore represents 25.9mv (6640 / 256). - * The extra 2-bits therefore represent increments of 6.48mv. - */ - regval += ((asc7621_in_scaling[nr] / 256) / 4) * - (data->reg[param->lsb[0]] >> 6); - + regval = (data->reg[param->msb[0]] << 8) | (data->reg[param->lsb[0]]); mutex_unlock(&data->update_lock); + /* The LSB value is a 2-bit scaling of the MSB's LSbit value. */ + regval = (regval >> 6) * asc7621_in_scaling[nr] / (0xc0 << 2); + return sprintf(buf, "%u\n", regval); } @@ -331,7 +328,7 @@ static ssize_t show_in8(struct device *dev, struct device_attribute *attr, return sprintf(buf, "%u\n", ((data->reg[param->msb[0]] * - asc7621_in_scaling[nr]) / 256)); + asc7621_in_scaling[nr]) / 0xc0)); } static ssize_t store_in8(struct device *dev, struct device_attribute *attr, @@ -344,9 +341,11 @@ static ssize_t store_in8(struct device *dev, struct device_attribute *attr, if (strict_strtol(buf, 10, &reqval)) return -EINVAL; - reqval = SENSORS_LIMIT(reqval, 0, asc7621_in_scaling[nr]); + reqval = SENSORS_LIMIT(reqval, 0, 0xffff); + + reqval = reqval * 0xc0 / asc7621_in_scaling[nr]; - reqval = (reqval * 255 + 128) / asc7621_in_scaling[nr]; + reqval = SENSORS_LIMIT(reqval, 0, 0xff); mutex_lock(&data->update_lock); data->reg[param->msb[0]] = reqval; @@ -846,11 +845,11 @@ static struct asc7621_param asc7621_params[] = { PWRITE(in3_max, 3, PRI_LOW, 0x4b, 0, 0, 0, in8), PWRITE(in4_max, 4, PRI_LOW, 0x4d, 0, 0, 0, in8), - PREAD(in0_alarm, 0, PRI_LOW, 0x41, 0, 0x01, 0, bitmask), - PREAD(in1_alarm, 1, PRI_LOW, 0x41, 0, 0x01, 1, bitmask), - PREAD(in2_alarm, 2, PRI_LOW, 0x41, 0, 0x01, 2, bitmask), - PREAD(in3_alarm, 3, PRI_LOW, 0x41, 0, 0x01, 3, bitmask), - PREAD(in4_alarm, 4, PRI_LOW, 0x42, 0, 0x01, 0, bitmask), + PREAD(in0_alarm, 0, PRI_HIGH, 0x41, 0, 0x01, 0, bitmask), + PREAD(in1_alarm, 1, PRI_HIGH, 0x41, 0, 0x01, 1, bitmask), + PREAD(in2_alarm, 2, PRI_HIGH, 0x41, 0, 0x01, 2, bitmask), + PREAD(in3_alarm, 3, PRI_HIGH, 0x41, 0, 0x01, 3, bitmask), + PREAD(in4_alarm, 4, PRI_HIGH, 0x42, 0, 0x01, 0, bitmask), PREAD(fan1_input, 0, PRI_HIGH, 0x29, 0x28, 0, 0, fan16), PREAD(fan2_input, 1, PRI_HIGH, 0x2b, 0x2a, 0, 0, fan16), @@ -862,10 +861,10 @@ static struct asc7621_param asc7621_params[] = { PWRITE(fan3_min, 2, PRI_LOW, 0x59, 0x58, 0, 0, fan16), PWRITE(fan4_min, 3, PRI_LOW, 0x5b, 0x5a, 0, 0, fan16), - PREAD(fan1_alarm, 0, PRI_LOW, 0x42, 0, 0x01, 0, bitmask), - PREAD(fan2_alarm, 1, PRI_LOW, 0x42, 0, 0x01, 1, bitmask), - PREAD(fan3_alarm, 2, PRI_LOW, 0x42, 0, 0x01, 2, bitmask), - PREAD(fan4_alarm, 3, PRI_LOW, 0x42, 0, 0x01, 3, bitmask), + PREAD(fan1_alarm, 0, PRI_HIGH, 0x42, 0, 0x01, 2, bitmask), + PREAD(fan2_alarm, 1, PRI_HIGH, 0x42, 0, 0x01, 3, bitmask), + PREAD(fan3_alarm, 2, PRI_HIGH, 0x42, 0, 0x01, 4, bitmask), + PREAD(fan4_alarm, 3, PRI_HIGH, 0x42, 0, 0x01, 5, bitmask), PREAD(temp1_input, 0, PRI_HIGH, 0x25, 0x10, 0, 0, temp10), PREAD(temp2_input, 1, PRI_HIGH, 0x26, 0x15, 0, 0, temp10), @@ -886,10 +885,10 @@ static struct asc7621_param asc7621_params[] = { PWRITE(temp3_max, 2, PRI_LOW, 0x53, 0, 0, 0, temp8), PWRITE(temp4_max, 3, PRI_LOW, 0x35, 0, 0, 0, temp8), - PREAD(temp1_alarm, 0, PRI_LOW, 0x41, 0, 0x01, 4, bitmask), - PREAD(temp2_alarm, 1, PRI_LOW, 0x41, 0, 0x01, 5, bitmask), - PREAD(temp3_alarm, 2, PRI_LOW, 0x41, 0, 0x01, 6, bitmask), - PREAD(temp4_alarm, 3, PRI_LOW, 0x43, 0, 0x01, 0, bitmask), + PREAD(temp1_alarm, 0, PRI_HIGH, 0x41, 0, 0x01, 4, bitmask), + PREAD(temp2_alarm, 1, PRI_HIGH, 0x41, 0, 0x01, 5, bitmask), + PREAD(temp3_alarm, 2, PRI_HIGH, 0x41, 0, 0x01, 6, bitmask), + PREAD(temp4_alarm, 3, PRI_HIGH, 0x43, 0, 0x01, 0, bitmask), PWRITE(temp1_source, 0, PRI_LOW, 0x02, 0, 0x07, 4, bitmask), PWRITE(temp2_source, 1, PRI_LOW, 0x02, 0, 0x07, 0, bitmask), @@ -898,7 +897,7 @@ static struct asc7621_param asc7621_params[] = { PWRITE(temp1_smoothing_enable, 0, PRI_LOW, 0x62, 0, 0x01, 3, bitmask), PWRITE(temp2_smoothing_enable, 1, PRI_LOW, 0x63, 0, 0x01, 7, bitmask), - PWRITE(temp3_smoothing_enable, 2, PRI_LOW, 0x64, 0, 0x01, 3, bitmask), + PWRITE(temp3_smoothing_enable, 2, PRI_LOW, 0x63, 0, 0x01, 3, bitmask), PWRITE(temp4_smoothing_enable, 3, PRI_LOW, 0x3c, 0, 0x01, 3, bitmask), PWRITE(temp1_smoothing_time, 0, PRI_LOW, 0x62, 0, 0x07, 0, temp_st), diff --git a/drivers/hwmon/hp_accel.c b/drivers/hwmon/hp_accel.c index c8ab505..7580f55 100644 --- a/drivers/hwmon/hp_accel.c +++ b/drivers/hwmon/hp_accel.c @@ -328,8 +328,8 @@ static int lis3lv02d_remove(struct acpi_device *device, int type) lis3lv02d_joystick_disable(); lis3lv02d_poweroff(&lis3_dev); - flush_work(&hpled_led.work); led_classdev_unregister(&hpled_led.led_classdev); + flush_work(&hpled_led.work); return lis3lv02d_remove_fs(&lis3_dev); } diff --git a/drivers/input/gameport/gameport.c b/drivers/input/gameport/gameport.c index 7e18bcf..46239e4 100644 --- a/drivers/input/gameport/gameport.c +++ b/drivers/input/gameport/gameport.c @@ -59,11 +59,11 @@ static unsigned int get_time_pit(void) unsigned long flags; unsigned int count; - spin_lock_irqsave(&i8253_lock, flags); + raw_spin_lock_irqsave(&i8253_lock, flags); outb_p(0x00, 0x43); count = inb_p(0x40); count |= inb_p(0x40) << 8; - spin_unlock_irqrestore(&i8253_lock, flags); + raw_spin_unlock_irqrestore(&i8253_lock, flags); return count; } diff --git a/drivers/input/joystick/analog.c b/drivers/input/joystick/analog.c index 1c0b529..4afe0a3 100644 --- a/drivers/input/joystick/analog.c +++ b/drivers/input/joystick/analog.c @@ -146,11 +146,11 @@ static unsigned int get_time_pit(void) unsigned long flags; unsigned int count; - spin_lock_irqsave(&i8253_lock, flags); + raw_spin_lock_irqsave(&i8253_lock, flags); outb_p(0x00, 0x43); count = inb_p(0x40); count |= inb_p(0x40) << 8; - spin_unlock_irqrestore(&i8253_lock, flags); + raw_spin_unlock_irqrestore(&i8253_lock, flags); return count; } diff --git a/drivers/input/joystick/iforce/iforce-main.c b/drivers/input/joystick/iforce/iforce-main.c index b1edd77..405febd 100644 --- a/drivers/input/joystick/iforce/iforce-main.c +++ b/drivers/input/joystick/iforce/iforce-main.c @@ -54,6 +54,9 @@ static signed short btn_avb_wheel[] = static signed short abs_joystick[] = { ABS_X, ABS_Y, ABS_THROTTLE, ABS_HAT0X, ABS_HAT0Y, -1 }; +static signed short abs_joystick_rudder[] = +{ ABS_X, ABS_Y, ABS_THROTTLE, ABS_RUDDER, ABS_HAT0X, ABS_HAT0Y, -1 }; + static signed short abs_avb_pegasus[] = { ABS_X, ABS_Y, ABS_THROTTLE, ABS_RUDDER, ABS_HAT0X, ABS_HAT0Y, ABS_HAT1X, ABS_HAT1Y, -1 }; @@ -76,8 +79,9 @@ static struct iforce_device iforce_device[] = { { 0x061c, 0xc0a4, "ACT LABS Force RS", btn_wheel, abs_wheel, ff_iforce }, //? { 0x061c, 0xc084, "ACT LABS Force RS", btn_wheel, abs_wheel, ff_iforce }, { 0x06f8, 0x0001, "Guillemot Race Leader Force Feedback", btn_wheel, abs_wheel, ff_iforce }, //? + { 0x06f8, 0x0001, "Guillemot Jet Leader Force Feedback", btn_joystick, abs_joystick_rudder, ff_iforce }, { 0x06f8, 0x0004, "Guillemot Force Feedback Racing Wheel", btn_wheel, abs_wheel, ff_iforce }, //? - { 0x06f8, 0x0004, "Gullemot Jet Leader 3D", btn_joystick, abs_joystick, ff_iforce }, //? + { 0x06f8, 0xa302, "Guillemot Jet Leader 3D", btn_joystick, abs_joystick, ff_iforce }, //? { 0x06d6, 0x29bc, "Trust Force Feedback Race Master", btn_wheel, abs_wheel, ff_iforce }, { 0x0000, 0x0000, "Unknown I-Force Device [%04x:%04x]", btn_joystick, abs_joystick, ff_iforce } }; diff --git a/drivers/input/joystick/iforce/iforce-usb.c b/drivers/input/joystick/iforce/iforce-usb.c index b41303d..6c96631 100644 --- a/drivers/input/joystick/iforce/iforce-usb.c +++ b/drivers/input/joystick/iforce/iforce-usb.c @@ -212,6 +212,7 @@ static struct usb_device_id iforce_usb_ids [] = { { USB_DEVICE(0x061c, 0xc0a4) }, /* ACT LABS Force RS */ { USB_DEVICE(0x061c, 0xc084) }, /* ACT LABS Force RS */ { USB_DEVICE(0x06f8, 0x0001) }, /* Guillemot Race Leader Force Feedback */ + { USB_DEVICE(0x06f8, 0x0003) }, /* Guillemot Jet Leader Force Feedback */ { USB_DEVICE(0x06f8, 0x0004) }, /* Guillemot Force Feedback Racing Wheel */ { USB_DEVICE(0x06f8, 0xa302) }, /* Guillemot Jet Leader 3D */ { } /* Terminating entry */ diff --git a/drivers/input/misc/pcspkr.c b/drivers/input/misc/pcspkr.c index ea4e1fd..f080dd3 100644 --- a/drivers/input/misc/pcspkr.c +++ b/drivers/input/misc/pcspkr.c @@ -30,7 +30,7 @@ MODULE_ALIAS("platform:pcspkr"); #include <asm/i8253.h> #else #include <asm/8253pit.h> -static DEFINE_SPINLOCK(i8253_lock); +static DEFINE_RAW_SPINLOCK(i8253_lock); #endif static int pcspkr_event(struct input_dev *dev, unsigned int type, unsigned int code, int value) @@ -50,7 +50,7 @@ static int pcspkr_event(struct input_dev *dev, unsigned int type, unsigned int c if (value > 20 && value < 32767) count = PIT_TICK_RATE / value; - spin_lock_irqsave(&i8253_lock, flags); + raw_spin_lock_irqsave(&i8253_lock, flags); if (count) { /* set command for counter 2, 2 byte write */ @@ -65,7 +65,7 @@ static int pcspkr_event(struct input_dev *dev, unsigned int type, unsigned int c outb(inb_p(0x61) & 0xFC, 0x61); } - spin_unlock_irqrestore(&i8253_lock, flags); + raw_spin_unlock_irqrestore(&i8253_lock, flags); return 0; } diff --git a/drivers/input/mouse/elantech.c b/drivers/input/mouse/elantech.c index 0520c2e..112b4ee 100644 --- a/drivers/input/mouse/elantech.c +++ b/drivers/input/mouse/elantech.c @@ -185,7 +185,7 @@ static void elantech_report_absolute_v1(struct psmouse *psmouse) int fingers; static int old_fingers; - if (etd->fw_version_maj == 0x01) { + if (etd->fw_version < 0x020000) { /* * byte 0: D U p1 p2 1 p3 R L * byte 1: f 0 th tw x9 x8 y9 y8 @@ -227,7 +227,7 @@ static void elantech_report_absolute_v1(struct psmouse *psmouse) input_report_key(dev, BTN_LEFT, packet[0] & 0x01); input_report_key(dev, BTN_RIGHT, packet[0] & 0x02); - if ((etd->fw_version_maj == 0x01) && + if (etd->fw_version < 0x020000 && (etd->capabilities & ETP_CAP_HAS_ROCKER)) { /* rocker up */ input_report_key(dev, BTN_FORWARD, packet[0] & 0x40); @@ -321,7 +321,7 @@ static int elantech_check_parity_v1(struct psmouse *psmouse) unsigned char p1, p2, p3; /* Parity bits are placed differently */ - if (etd->fw_version_maj == 0x01) { + if (etd->fw_version < 0x020000) { /* byte 0: D U p1 p2 1 p3 R L */ p1 = (packet[0] & 0x20) >> 5; p2 = (packet[0] & 0x10) >> 4; @@ -457,7 +457,7 @@ static void elantech_set_input_params(struct psmouse *psmouse) switch (etd->hw_version) { case 1: /* Rocker button */ - if ((etd->fw_version_maj == 0x01) && + if (etd->fw_version < 0x020000 && (etd->capabilities & ETP_CAP_HAS_ROCKER)) { __set_bit(BTN_FORWARD, dev->keybit); __set_bit(BTN_BACK, dev->keybit); @@ -686,15 +686,14 @@ int elantech_init(struct psmouse *psmouse) pr_err("elantech.c: failed to query firmware version.\n"); goto init_fail; } - etd->fw_version_maj = param[0]; - etd->fw_version_min = param[2]; + + etd->fw_version = (param[0] << 16) | (param[1] << 8) | param[2]; /* * Assume every version greater than this is new EeePC style * hardware with 6 byte packets */ - if ((etd->fw_version_maj == 0x02 && etd->fw_version_min >= 0x30) || - etd->fw_version_maj > 0x02) { + if (etd->fw_version >= 0x020030) { etd->hw_version = 2; /* For now show extra debug information */ etd->debug = 1; @@ -704,8 +703,9 @@ int elantech_init(struct psmouse *psmouse) etd->hw_version = 1; etd->paritycheck = 1; } - pr_info("elantech.c: assuming hardware version %d, firmware version %d.%d\n", - etd->hw_version, etd->fw_version_maj, etd->fw_version_min); + + pr_info("elantech.c: assuming hardware version %d, firmware version %d.%d.%d\n", + etd->hw_version, param[0], param[1], param[2]); if (synaptics_send_cmd(psmouse, ETP_CAPABILITIES_QUERY, param)) { pr_err("elantech.c: failed to query capabilities.\n"); @@ -720,8 +720,8 @@ int elantech_init(struct psmouse *psmouse) * a touch action starts causing the mouse cursor or scrolled page * to jump. Enable a workaround. */ - if (etd->fw_version_maj == 0x02 && etd->fw_version_min == 0x22) { - pr_info("elantech.c: firmware version 2.34 detected, " + if (etd->fw_version == 0x020022) { + pr_info("elantech.c: firmware version 2.0.34 detected, " "enabling jumpy cursor workaround\n"); etd->jumpy_cursor = 1; } diff --git a/drivers/input/mouse/elantech.h b/drivers/input/mouse/elantech.h index feac5f7..ac57bde 100644 --- a/drivers/input/mouse/elantech.h +++ b/drivers/input/mouse/elantech.h @@ -100,11 +100,10 @@ struct elantech_data { unsigned char reg_26; unsigned char debug; unsigned char capabilities; - unsigned char fw_version_maj; - unsigned char fw_version_min; - unsigned char hw_version; unsigned char paritycheck; unsigned char jumpy_cursor; + unsigned char hw_version; + unsigned int fw_version; unsigned char parity[256]; }; diff --git a/drivers/input/mouse/psmouse-base.c b/drivers/input/mouse/psmouse-base.c index cbc8072..a3c9731 100644 --- a/drivers/input/mouse/psmouse-base.c +++ b/drivers/input/mouse/psmouse-base.c @@ -1394,6 +1394,7 @@ static int psmouse_reconnect(struct serio *serio) struct psmouse *psmouse = serio_get_drvdata(serio); struct psmouse *parent = NULL; struct serio_driver *drv = serio->drv; + unsigned char type; int rc = -1; if (!drv || !psmouse) { @@ -1413,10 +1414,15 @@ static int psmouse_reconnect(struct serio *serio) if (psmouse->reconnect) { if (psmouse->reconnect(psmouse)) goto out; - } else if (psmouse_probe(psmouse) < 0 || - psmouse->type != psmouse_extensions(psmouse, - psmouse_max_proto, false)) { - goto out; + } else { + psmouse_reset(psmouse); + + if (psmouse_probe(psmouse) < 0) + goto out; + + type = psmouse_extensions(psmouse, psmouse_max_proto, false); + if (psmouse->type != type) + goto out; } /* ok, the device type (and capabilities) match the old one, diff --git a/drivers/input/touchscreen/ad7877.c b/drivers/input/touchscreen/ad7877.c index e019d53d..0d2d7e5 100644 --- a/drivers/input/touchscreen/ad7877.c +++ b/drivers/input/touchscreen/ad7877.c @@ -156,9 +156,14 @@ struct ser_req { u16 reset; u16 ref_on; u16 command; - u16 sample; struct spi_message msg; struct spi_transfer xfer[6]; + + /* + * DMA (thus cache coherency maintenance) requires the + * transfer buffers to live in their own cache lines. + */ + u16 sample ____cacheline_aligned; }; struct ad7877 { @@ -182,8 +187,6 @@ struct ad7877 { u8 averaging; u8 pen_down_acc_interval; - u16 conversion_data[AD7877_NR_SENSE]; - struct spi_transfer xfer[AD7877_NR_SENSE + 2]; struct spi_message msg; @@ -195,6 +198,12 @@ struct ad7877 { spinlock_t lock; struct timer_list timer; /* P: lock */ unsigned pending:1; /* P: lock */ + + /* + * DMA (thus cache coherency maintenance) requires the + * transfer buffers to live in their own cache lines. + */ + u16 conversion_data[AD7877_NR_SENSE] ____cacheline_aligned; }; static int gpio3; diff --git a/drivers/md/md.c b/drivers/md/md.c index 9712b2e..cefd63d 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -2109,12 +2109,18 @@ repeat: if (!mddev->in_sync || mddev->recovery_cp != MaxSector) { /* not clean */ /* .. if the array isn't clean, an 'even' event must also go * to spares. */ - if ((mddev->events&1)==0) + if ((mddev->events&1)==0) { nospares = 0; + sync_req = 2; /* force a second update to get the + * even/odd in sync */ + } } else { /* otherwise an 'odd' event must go to spares */ - if ((mddev->events&1)) + if ((mddev->events&1)) { nospares = 0; + sync_req = 2; /* force a second update to get the + * even/odd in sync */ + } } } diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index 58ea0ec..15348c3 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -1527,7 +1527,7 @@ static void raid5_end_read_request(struct bio * bi, int error) clear_bit(R5_UPTODATE, &sh->dev[i].flags); atomic_inc(&rdev->read_errors); - if (conf->mddev->degraded) + if (conf->mddev->degraded >= conf->max_degraded) printk_rl(KERN_WARNING "raid5:%s: read error not correctable " "(sector %llu on %s).\n", diff --git a/drivers/media/common/saa7146_fops.c b/drivers/media/common/saa7146_fops.c index fd8e1f4..7364b964 100644 --- a/drivers/media/common/saa7146_fops.c +++ b/drivers/media/common/saa7146_fops.c @@ -423,15 +423,14 @@ static void vv_callback(struct saa7146_dev *dev, unsigned long status) } } -int saa7146_vv_devinit(struct saa7146_dev *dev) -{ - return v4l2_device_register(&dev->pci->dev, &dev->v4l2_dev); -} -EXPORT_SYMBOL_GPL(saa7146_vv_devinit); - int saa7146_vv_init(struct saa7146_dev* dev, struct saa7146_ext_vv *ext_vv) { struct saa7146_vv *vv; + int err; + + err = v4l2_device_register(&dev->pci->dev, &dev->v4l2_dev); + if (err) + return err; vv = kzalloc(sizeof(struct saa7146_vv), GFP_KERNEL); if (vv == NULL) { diff --git a/drivers/media/common/saa7146_video.c b/drivers/media/common/saa7146_video.c index 5ed7526..b8b2c55 100644 --- a/drivers/media/common/saa7146_video.c +++ b/drivers/media/common/saa7146_video.c @@ -558,9 +558,11 @@ static int vidioc_s_fbuf(struct file *file, void *fh, struct v4l2_framebuffer *f /* ok, accept it */ vv->ov_fb = *fb; vv->ov_fmt = fmt; - if (0 == vv->ov_fb.fmt.bytesperline) - vv->ov_fb.fmt.bytesperline = - vv->ov_fb.fmt.width * fmt->depth / 8; + + if (vv->ov_fb.fmt.bytesperline < vv->ov_fb.fmt.width) { + vv->ov_fb.fmt.bytesperline = vv->ov_fb.fmt.width * fmt->depth / 8; + DEB_D(("setting bytesperline to %d\n", vv->ov_fb.fmt.bytesperline)); + } mutex_unlock(&dev->lock); return 0; diff --git a/drivers/media/dvb/frontends/stv090x.c b/drivers/media/dvb/frontends/stv090x.c index a3c07fe..9697280 100644 --- a/drivers/media/dvb/frontends/stv090x.c +++ b/drivers/media/dvb/frontends/stv090x.c @@ -4470,6 +4470,10 @@ static int stv090x_setup(struct dvb_frontend *fe) if (stv090x_write_reg(state, STV090x_TSTRES0, 0x00) < 0) goto err; + /* workaround for stuck DiSEqC output */ + if (config->diseqc_envelope_mode) + stv090x_send_diseqc_burst(fe, SEC_MINI_A); + return 0; err: dprintk(FE_ERROR, 1, "I/O error"); diff --git a/drivers/media/dvb/ttpci/budget.c b/drivers/media/dvb/ttpci/budget.c index 9fdf26c..1500210 100644 --- a/drivers/media/dvb/ttpci/budget.c +++ b/drivers/media/dvb/ttpci/budget.c @@ -643,9 +643,6 @@ static void frontend_init(struct budget *budget) &budget->i2c_adap, &tt1600_isl6423_config); - } else { - dvb_frontend_detach(budget->dvb_frontend); - budget->dvb_frontend = NULL; } } break; diff --git a/drivers/media/video/Kconfig b/drivers/media/video/Kconfig index f8fc865..9644cf76 100644 --- a/drivers/media/video/Kconfig +++ b/drivers/media/video/Kconfig @@ -361,7 +361,7 @@ config VIDEO_SAA717X config VIDEO_SAA7191 tristate "Philips SAA7191 video decoder" - depends on VIDEO_V4L1 && I2C + depends on VIDEO_V4L2 && I2C ---help--- Support for the Philips SAA7191 video decoder. @@ -756,7 +756,7 @@ source "drivers/media/video/saa7134/Kconfig" config VIDEO_MXB tristate "Siemens-Nixdorf 'Multimedia eXtension Board'" - depends on PCI && VIDEO_V4L1 && I2C + depends on PCI && VIDEO_V4L2 && I2C select VIDEO_SAA7146_VV select VIDEO_TUNER select VIDEO_SAA711X if VIDEO_HELPER_CHIPS_AUTO diff --git a/drivers/media/video/Makefile b/drivers/media/video/Makefile index b88b617..c51c386 100644 --- a/drivers/media/video/Makefile +++ b/drivers/media/video/Makefile @@ -160,8 +160,6 @@ obj-$(CONFIG_VIDEO_MX3) += mx3_camera.o obj-$(CONFIG_VIDEO_PXA27x) += pxa_camera.o obj-$(CONFIG_VIDEO_SH_MOBILE_CEU) += sh_mobile_ceu_camera.o -obj-$(CONFIG_ARCH_DAVINCI) += davinci/ - obj-$(CONFIG_VIDEO_AU0828) += au0828/ obj-$(CONFIG_USB_VIDEO_CLASS) += uvc/ diff --git a/drivers/media/video/davinci/vpfe_capture.c b/drivers/media/video/davinci/vpfe_capture.c index 7cf042f..398dbe7 100644 --- a/drivers/media/video/davinci/vpfe_capture.c +++ b/drivers/media/video/davinci/vpfe_capture.c @@ -223,7 +223,6 @@ int vpfe_register_ccdc_device(struct ccdc_hw_device *dev) BUG_ON(!dev->hw_ops.get_frame_format); BUG_ON(!dev->hw_ops.get_pixel_format); BUG_ON(!dev->hw_ops.set_pixel_format); - BUG_ON(!dev->hw_ops.set_params); BUG_ON(!dev->hw_ops.set_image_window); BUG_ON(!dev->hw_ops.get_image_window); BUG_ON(!dev->hw_ops.get_line_length); @@ -1689,11 +1688,12 @@ static long vpfe_param_handler(struct file *file, void *priv, struct vpfe_device *vpfe_dev = video_drvdata(file); int ret = 0; - v4l2_dbg(1, debug, &vpfe_dev->v4l2_dev, "vpfe_param_handler\n"); + v4l2_dbg(2, debug, &vpfe_dev->v4l2_dev, "vpfe_param_handler\n"); if (vpfe_dev->started) { /* only allowed if streaming is not started */ - v4l2_err(&vpfe_dev->v4l2_dev, "device already started\n"); + v4l2_dbg(1, debug, &vpfe_dev->v4l2_dev, + "device already started\n"); return -EBUSY; } @@ -1705,16 +1705,23 @@ static long vpfe_param_handler(struct file *file, void *priv, case VPFE_CMD_S_CCDC_RAW_PARAMS: v4l2_warn(&vpfe_dev->v4l2_dev, "VPFE_CMD_S_CCDC_RAW_PARAMS: experimental ioctl\n"); - ret = ccdc_dev->hw_ops.set_params(param); - if (ret) { - v4l2_err(&vpfe_dev->v4l2_dev, - "Error in setting parameters in CCDC\n"); - goto unlock_out; - } - if (vpfe_get_ccdc_image_format(vpfe_dev, &vpfe_dev->fmt) < 0) { - v4l2_err(&vpfe_dev->v4l2_dev, - "Invalid image format at CCDC\n"); - goto unlock_out; + if (ccdc_dev->hw_ops.set_params) { + ret = ccdc_dev->hw_ops.set_params(param); + if (ret) { + v4l2_dbg(1, debug, &vpfe_dev->v4l2_dev, + "Error setting parameters in CCDC\n"); + goto unlock_out; + } + if (vpfe_get_ccdc_image_format(vpfe_dev, + &vpfe_dev->fmt) < 0) { + v4l2_dbg(1, debug, &vpfe_dev->v4l2_dev, + "Invalid image format at CCDC\n"); + goto unlock_out; + } + } else { + ret = -EINVAL; + v4l2_dbg(1, debug, &vpfe_dev->v4l2_dev, + "VPFE_CMD_S_CCDC_RAW_PARAMS not supported\n"); } break; default: @@ -1830,7 +1837,7 @@ static __init int vpfe_probe(struct platform_device *pdev) if (NULL == ccdc_cfg) { v4l2_err(pdev->dev.driver, "Memory allocation failed for ccdc_cfg\n"); - goto probe_free_dev_mem; + goto probe_free_lock; } strncpy(ccdc_cfg->name, vpfe_cfg->ccdc, 32); @@ -1982,8 +1989,9 @@ probe_out_video_release: probe_out_release_irq: free_irq(vpfe_dev->ccdc_irq0, vpfe_dev); probe_free_ccdc_cfg_mem: - mutex_unlock(&ccdc_lock); kfree(ccdc_cfg); +probe_free_lock: + mutex_unlock(&ccdc_lock); probe_free_dev_mem: kfree(vpfe_dev); return ret; diff --git a/drivers/media/video/gspca/sn9c20x.c b/drivers/media/video/gspca/sn9c20x.c index 38a6e15..3dee3e5 100644 --- a/drivers/media/video/gspca/sn9c20x.c +++ b/drivers/media/video/gspca/sn9c20x.c @@ -1427,7 +1427,7 @@ static int input_kthread(void *data) struct gspca_dev *gspca_dev = (struct gspca_dev *)data; struct sd *sd = (struct sd *) gspca_dev; - DECLARE_WAIT_QUEUE_HEAD(wait); + DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wait); set_freezable(); for (;;) { if (kthread_should_stop()) diff --git a/drivers/media/video/gspca/spca508.c b/drivers/media/video/gspca/spca508.c index 15b2eef..edf0fe1 100644 --- a/drivers/media/video/gspca/spca508.c +++ b/drivers/media/video/gspca/spca508.c @@ -1513,7 +1513,6 @@ static const struct sd_desc sd_desc = { static const __devinitdata struct usb_device_id device_table[] = { {USB_DEVICE(0x0130, 0x0130), .driver_info = HamaUSBSightcam}, {USB_DEVICE(0x041e, 0x4018), .driver_info = CreativeVista}, - {USB_DEVICE(0x0461, 0x0815), .driver_info = MicroInnovationIC200}, {USB_DEVICE(0x0733, 0x0110), .driver_info = ViewQuestVQ110}, {USB_DEVICE(0x0af9, 0x0010), .driver_info = HamaUSBSightcam}, {USB_DEVICE(0x0af9, 0x0011), .driver_info = HamaUSBSightcam2}, diff --git a/drivers/media/video/gspca/spca561.c b/drivers/media/video/gspca/spca561.c index dc7f2b0..b9c80e2 100644 --- a/drivers/media/video/gspca/spca561.c +++ b/drivers/media/video/gspca/spca561.c @@ -1053,6 +1053,7 @@ static const __devinitdata struct usb_device_id device_table[] = { {USB_DEVICE(0x041e, 0x401a), .driver_info = Rev072A}, {USB_DEVICE(0x041e, 0x403b), .driver_info = Rev012A}, {USB_DEVICE(0x0458, 0x7004), .driver_info = Rev072A}, + {USB_DEVICE(0x0461, 0x0815), .driver_info = Rev072A}, {USB_DEVICE(0x046d, 0x0928), .driver_info = Rev012A}, {USB_DEVICE(0x046d, 0x0929), .driver_info = Rev012A}, {USB_DEVICE(0x046d, 0x092a), .driver_info = Rev012A}, diff --git a/drivers/media/video/gspca/stv06xx/stv06xx.c b/drivers/media/video/gspca/stv06xx/stv06xx.c index af73da3..14f179a 100644 --- a/drivers/media/video/gspca/stv06xx/stv06xx.c +++ b/drivers/media/video/gspca/stv06xx/stv06xx.c @@ -524,8 +524,6 @@ static const __devinitdata struct usb_device_id device_table[] = { {USB_DEVICE(0x046D, 0x08F5), .driver_info = BRIDGE_ST6422 }, /* QuickCam Messenger (new) */ {USB_DEVICE(0x046D, 0x08F6), .driver_info = BRIDGE_ST6422 }, - /* QuickCam Messenger (new) */ - {USB_DEVICE(0x046D, 0x08DA), .driver_info = BRIDGE_ST6422 }, {} }; MODULE_DEVICE_TABLE(usb, device_table); diff --git a/drivers/media/video/hexium_gemini.c b/drivers/media/video/hexium_gemini.c index e620a3a..ad2c232 100644 --- a/drivers/media/video/hexium_gemini.c +++ b/drivers/media/video/hexium_gemini.c @@ -356,9 +356,6 @@ static int hexium_attach(struct saa7146_dev *dev, struct saa7146_pci_extension_d DEB_EE((".\n")); - ret = saa7146_vv_devinit(dev); - if (ret) - return ret; hexium = kzalloc(sizeof(struct hexium), GFP_KERNEL); if (NULL == hexium) { printk("hexium_gemini: not enough kernel memory in hexium_attach().\n"); diff --git a/drivers/media/video/hexium_orion.c b/drivers/media/video/hexium_orion.c index fe596a1..938a1f8 100644 --- a/drivers/media/video/hexium_orion.c +++ b/drivers/media/video/hexium_orion.c @@ -216,10 +216,6 @@ static int hexium_probe(struct saa7146_dev *dev) return -EFAULT; } - err = saa7146_vv_devinit(dev); - if (err) - return err; - hexium = kzalloc(sizeof(struct hexium), GFP_KERNEL); if (NULL == hexium) { printk("hexium_orion: hexium_probe: not enough kernel memory.\n"); diff --git a/drivers/media/video/mx1_camera.c b/drivers/media/video/mx1_camera.c index 3c8ebfc..34a6601 100644 --- a/drivers/media/video/mx1_camera.c +++ b/drivers/media/video/mx1_camera.c @@ -49,8 +49,6 @@ /* * CSI registers */ -#define DMA_CCR(x) (0x8c + ((x) << 6)) /* Control Registers */ -#define DMA_DIMR 0x08 /* Interrupt mask Register */ #define CSICR1 0x00 /* CSI Control Register 1 */ #define CSISR 0x08 /* CSI Status Register */ #define CSIRXR 0x10 /* CSI RxFIFO Register */ @@ -784,7 +782,7 @@ static int __init mx1_camera_probe(struct platform_device *pdev) pcdev); imx_dma_config_channel(pcdev->dma_chan, IMX_DMA_TYPE_FIFO, - IMX_DMA_MEMSIZE_32, DMA_REQ_CSI_R, 0); + IMX_DMA_MEMSIZE_32, MX1_DMA_REQ_CSI_R, 0); /* burst length : 16 words = 64 bytes */ imx_dma_config_burstlen(pcdev->dma_chan, 0); @@ -798,8 +796,8 @@ static int __init mx1_camera_probe(struct platform_device *pdev) set_fiq_handler(&mx1_camera_sof_fiq_start, &mx1_camera_sof_fiq_end - &mx1_camera_sof_fiq_start); - regs.ARM_r8 = DMA_BASE + DMA_DIMR; - regs.ARM_r9 = DMA_BASE + DMA_CCR(pcdev->dma_chan); + regs.ARM_r8 = (long)MX1_DMA_DIMR; + regs.ARM_r9 = (long)MX1_DMA_CCR(pcdev->dma_chan); regs.ARM_r10 = (long)pcdev->base + CSICR1; regs.ARM_fp = (long)pcdev->base + CSISR; regs.ARM_sp = 1 << pcdev->dma_chan; diff --git a/drivers/media/video/mxb.c b/drivers/media/video/mxb.c index 9f01f14..ef0c817 100644 --- a/drivers/media/video/mxb.c +++ b/drivers/media/video/mxb.c @@ -169,11 +169,7 @@ static struct saa7146_extension extension; static int mxb_probe(struct saa7146_dev *dev) { struct mxb *mxb = NULL; - int err; - err = saa7146_vv_devinit(dev); - if (err) - return err; mxb = kzalloc(sizeof(struct mxb), GFP_KERNEL); if (mxb == NULL) { DEB_D(("not enough kernel memory.\n")); @@ -699,14 +695,17 @@ static struct saa7146_ext_vv vv_data; /* this function only gets called when the probing was successful */ static int mxb_attach(struct saa7146_dev *dev, struct saa7146_pci_extension_data *info) { - struct mxb *mxb = (struct mxb *)dev->ext_priv; + struct mxb *mxb; DEB_EE(("dev:%p\n", dev)); - /* checking for i2c-devices can be omitted here, because we - already did this in "mxb_vl42_probe" */ - saa7146_vv_init(dev, &vv_data); + if (mxb_probe(dev)) { + saa7146_vv_release(dev); + return -1; + } + mxb = (struct mxb *)dev->ext_priv; + vv_data.ops.vidioc_queryctrl = vidioc_queryctrl; vv_data.ops.vidioc_g_ctrl = vidioc_g_ctrl; vv_data.ops.vidioc_s_ctrl = vidioc_s_ctrl; @@ -726,6 +725,7 @@ static int mxb_attach(struct saa7146_dev *dev, struct saa7146_pci_extension_data vv_data.ops.vidioc_default = vidioc_default; if (saa7146_register_device(&mxb->video_dev, dev, "mxb", VFL_TYPE_GRABBER)) { ERR(("cannot register capture v4l2 device. skipping.\n")); + saa7146_vv_release(dev); return -1; } @@ -846,7 +846,6 @@ static struct saa7146_extension extension = { .pci_tbl = &pci_tbl[0], .module = THIS_MODULE, - .probe = mxb_probe, .attach = mxb_attach, .detach = mxb_detach, diff --git a/drivers/media/video/omap24xxcam.c b/drivers/media/video/omap24xxcam.c index b189fe6..ce76d95 100644 --- a/drivers/media/video/omap24xxcam.c +++ b/drivers/media/video/omap24xxcam.c @@ -1405,7 +1405,7 @@ static int omap24xxcam_mmap_buffers(struct file *file, } size = 0; - for (i = first; i <= last; i++) { + for (i = first; i <= last && i < VIDEO_MAX_FRAME; i++) { struct videobuf_dmabuf *dma = videobuf_to_dma(vbq->bufs[i]); for (j = 0; j < dma->sglen; j++) { diff --git a/drivers/media/video/pxa_camera.c b/drivers/media/video/pxa_camera.c index 5ecc30d..04bf5c1 100644 --- a/drivers/media/video/pxa_camera.c +++ b/drivers/media/video/pxa_camera.c @@ -609,12 +609,9 @@ static void pxa_dma_add_tail_buf(struct pxa_camera_dev *pcdev, */ static void pxa_camera_start_capture(struct pxa_camera_dev *pcdev) { - unsigned long cicr0, cifr; + unsigned long cicr0; dev_dbg(pcdev->soc_host.v4l2_dev.dev, "%s\n", __func__); - /* Reset the FIFOs */ - cifr = __raw_readl(pcdev->base + CIFR) | CIFR_RESET_F; - __raw_writel(cifr, pcdev->base + CIFR); /* Enable End-Of-Frame Interrupt */ cicr0 = __raw_readl(pcdev->base + CICR0) | CICR0_ENB; cicr0 &= ~CICR0_EOFM; @@ -935,7 +932,7 @@ static void pxa_camera_deactivate(struct pxa_camera_dev *pcdev) static irqreturn_t pxa_camera_irq(int irq, void *data) { struct pxa_camera_dev *pcdev = data; - unsigned long status, cicr0; + unsigned long status, cifr, cicr0; struct pxa_buffer *buf; struct videobuf_buffer *vb; @@ -949,6 +946,10 @@ static irqreturn_t pxa_camera_irq(int irq, void *data) __raw_writel(status, pcdev->base + CISR); if (status & CISR_EOF) { + /* Reset the FIFOs */ + cifr = __raw_readl(pcdev->base + CIFR) | CIFR_RESET_F; + __raw_writel(cifr, pcdev->base + CIFR); + pcdev->active = list_first_entry(&pcdev->capture, struct pxa_buffer, vb.queue); vb = &pcdev->active->vb; diff --git a/drivers/media/video/sh_mobile_ceu_camera.c b/drivers/media/video/sh_mobile_ceu_camera.c index 6e16b39..1ad980f 100644 --- a/drivers/media/video/sh_mobile_ceu_camera.c +++ b/drivers/media/video/sh_mobile_ceu_camera.c @@ -1633,7 +1633,7 @@ static int sh_mobile_ceu_try_fmt(struct soc_camera_device *icd, height = pix->height; pix->bytesperline = soc_mbus_bytes_per_line(width, xlate->host_fmt); - if (pix->bytesperline < 0) + if ((int)pix->bytesperline < 0) return pix->bytesperline; pix->sizeimage = height * pix->bytesperline; diff --git a/drivers/mfd/wm831x-core.c b/drivers/mfd/wm831x-core.c index a3d5728..f2ab025 100644 --- a/drivers/mfd/wm831x-core.c +++ b/drivers/mfd/wm831x-core.c @@ -349,6 +349,9 @@ int wm831x_auxadc_read(struct wm831x *wm831x, enum wm831x_auxadc input) goto disable; } + /* If an interrupt arrived late clean up after it */ + try_wait_for_completion(&wm831x->auxadc_done); + /* Ignore the result to allow us to soldier on without IRQ hookup */ wait_for_completion_timeout(&wm831x->auxadc_done, msecs_to_jiffies(5)); diff --git a/drivers/mfd/wm8350-core.c b/drivers/mfd/wm8350-core.c index e400a3b..b580748 100644 --- a/drivers/mfd/wm8350-core.c +++ b/drivers/mfd/wm8350-core.c @@ -363,6 +363,10 @@ int wm8350_read_auxadc(struct wm8350 *wm8350, int channel, int scale, int vref) reg |= 1 << channel | WM8350_AUXADC_POLL; wm8350_reg_write(wm8350, WM8350_DIGITISER_CONTROL_1, reg); + /* If a late IRQ left the completion signalled then consume + * the completion. */ + try_wait_for_completion(&wm8350->auxadc_done); + /* We ignore the result of the completion and just check for a * conversion result, allowing us to soldier on if the IRQ * infrastructure is not set up for the chip. */ diff --git a/drivers/misc/vmware_balloon.c b/drivers/misc/vmware_balloon.c index e7161c4..db9cd02 100644 --- a/drivers/misc/vmware_balloon.c +++ b/drivers/misc/vmware_balloon.c @@ -41,7 +41,7 @@ #include <linux/workqueue.h> #include <linux/debugfs.h> #include <linux/seq_file.h> -#include <asm/vmware.h> +#include <asm/hypervisor.h> MODULE_AUTHOR("VMware, Inc."); MODULE_DESCRIPTION("VMware Memory Control (Balloon) Driver"); @@ -767,7 +767,7 @@ static int __init vmballoon_init(void) * Check if we are running on VMware's hypervisor and bail out * if we are not. */ - if (!vmware_platform()) + if (x86_hyper != &x86_hyper_vmware) return -ENODEV; vmballoon_wq = create_freezeable_workqueue("vmmemctl"); diff --git a/drivers/mmc/host/at91_mci.c b/drivers/mmc/host/at91_mci.c index a6dd7da..336d9f5 100644 --- a/drivers/mmc/host/at91_mci.c +++ b/drivers/mmc/host/at91_mci.c @@ -314,8 +314,8 @@ static void at91_mci_post_dma_read(struct at91mci_host *host) dmabuf = (unsigned *)tmpv; } + flush_kernel_dcache_page(sg_page(sg)); kunmap_atomic(sgbuffer, KM_BIO_SRC_IRQ); - dmac_flush_range((void *)sgbuffer, ((void *)sgbuffer) + amount); data->bytes_xfered += amount; if (size == 0) break; diff --git a/drivers/mmc/host/atmel-mci.c b/drivers/mmc/host/atmel-mci.c index 88be37d..fb279f4 100644 --- a/drivers/mmc/host/atmel-mci.c +++ b/drivers/mmc/host/atmel-mci.c @@ -266,7 +266,7 @@ static int atmci_req_show(struct seq_file *s, void *v) "CMD%u(0x%x) flg %x rsp %x %x %x %x err %d\n", cmd->opcode, cmd->arg, cmd->flags, cmd->resp[0], cmd->resp[1], cmd->resp[2], - cmd->resp[2], cmd->error); + cmd->resp[3], cmd->error); if (data) seq_printf(s, "DATA %u / %u * %u flg %x err %d\n", data->bytes_xfered, data->blocks, @@ -276,7 +276,7 @@ static int atmci_req_show(struct seq_file *s, void *v) "CMD%u(0x%x) flg %x rsp %x %x %x %x err %d\n", stop->opcode, stop->arg, stop->flags, stop->resp[0], stop->resp[1], stop->resp[2], - stop->resp[2], stop->error); + stop->resp[3], stop->error); } spin_unlock_bh(&slot->host->lock); @@ -569,9 +569,10 @@ static void atmci_dma_cleanup(struct atmel_mci *host) { struct mmc_data *data = host->data; - dma_unmap_sg(&host->pdev->dev, data->sg, data->sg_len, - ((data->flags & MMC_DATA_WRITE) - ? DMA_TO_DEVICE : DMA_FROM_DEVICE)); + if (data) + dma_unmap_sg(&host->pdev->dev, data->sg, data->sg_len, + ((data->flags & MMC_DATA_WRITE) + ? DMA_TO_DEVICE : DMA_FROM_DEVICE)); } static void atmci_stop_dma(struct atmel_mci *host) @@ -1099,8 +1100,8 @@ static void atmci_command_complete(struct atmel_mci *host, "command error: status=0x%08x\n", status); if (cmd->data) { - host->data = NULL; atmci_stop_dma(host); + host->data = NULL; mci_writel(host, IDR, MCI_NOTBUSY | MCI_TXRDY | MCI_RXRDY | ATMCI_DATA_ERROR_FLAGS); @@ -1293,6 +1294,7 @@ static void atmci_tasklet_func(unsigned long priv) } else { data->bytes_xfered = data->blocks * data->blksz; data->error = 0; + mci_writel(host, IDR, ATMCI_DATA_ERROR_FLAGS); } if (!data->stop) { @@ -1751,13 +1753,13 @@ static int __init atmci_probe(struct platform_device *pdev) ret = -ENODEV; if (pdata->slot[0].bus_width) { ret = atmci_init_slot(host, &pdata->slot[0], - MCI_SDCSEL_SLOT_A, 0); + 0, MCI_SDCSEL_SLOT_A); if (!ret) nr_slots++; } if (pdata->slot[1].bus_width) { ret = atmci_init_slot(host, &pdata->slot[1], - MCI_SDCSEL_SLOT_B, 1); + 1, MCI_SDCSEL_SLOT_B); if (!ret) nr_slots++; } diff --git a/drivers/net/a2065.c b/drivers/net/a2065.c index ed5e974..a8f0512 100644 --- a/drivers/net/a2065.c +++ b/drivers/net/a2065.c @@ -674,6 +674,7 @@ static struct zorro_device_id a2065_zorro_tbl[] __devinitdata = { { ZORRO_PROD_AMERISTAR_A2065 }, { 0 } }; +MODULE_DEVICE_TABLE(zorro, a2065_zorro_tbl); static struct zorro_driver a2065_driver = { .name = "a2065", diff --git a/drivers/net/ariadne.c b/drivers/net/ariadne.c index fa1a235..4b30a46 100644 --- a/drivers/net/ariadne.c +++ b/drivers/net/ariadne.c @@ -145,6 +145,7 @@ static struct zorro_device_id ariadne_zorro_tbl[] __devinitdata = { { ZORRO_PROD_VILLAGE_TRONIC_ARIADNE }, { 0 } }; +MODULE_DEVICE_TABLE(zorro, ariadne_zorro_tbl); static struct zorro_driver ariadne_driver = { .name = "ariadne", diff --git a/drivers/net/gianfar.c b/drivers/net/gianfar.c index 4e97ca1..5d3763f 100644 --- a/drivers/net/gianfar.c +++ b/drivers/net/gianfar.c @@ -1649,6 +1649,7 @@ static void free_skb_resources(struct gfar_private *priv) sizeof(struct rxbd8) * priv->total_rx_ring_size, priv->tx_queue[0]->tx_bd_base, priv->tx_queue[0]->tx_bd_dma_base); + skb_queue_purge(&priv->rx_recycle); } void gfar_start(struct net_device *dev) @@ -2088,7 +2089,6 @@ static int gfar_close(struct net_device *dev) disable_napi(priv); - skb_queue_purge(&priv->rx_recycle); cancel_work_sync(&priv->reset_task); stop_gfar(dev); diff --git a/drivers/net/hydra.c b/drivers/net/hydra.c index 24724b4..07d8e5b 100644 --- a/drivers/net/hydra.c +++ b/drivers/net/hydra.c @@ -71,6 +71,7 @@ static struct zorro_device_id hydra_zorro_tbl[] __devinitdata = { { ZORRO_PROD_HYDRA_SYSTEMS_AMIGANET }, { 0 } }; +MODULE_DEVICE_TABLE(zorro, hydra_zorro_tbl); static struct zorro_driver hydra_driver = { .name = "hydra", diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c index 0cd80e4..e67691d 100644 --- a/drivers/net/phy/micrel.c +++ b/drivers/net/phy/micrel.c @@ -32,6 +32,7 @@ static int kszphy_config_init(struct phy_device *phydev) static struct phy_driver ks8001_driver = { .phy_id = PHY_ID_KS8001, + .name = "Micrel KS8001", .phy_id_mask = 0x00fffff0, .features = PHY_BASIC_FEATURES, .flags = PHY_POLL, diff --git a/drivers/net/veth.c b/drivers/net/veth.c index f9f0730..5ec542d 100644 --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -187,7 +187,6 @@ tx_drop: return NETDEV_TX_OK; rx_drop: - kfree_skb(skb); rcv_stats->rx_dropped++; return NETDEV_TX_OK; } diff --git a/drivers/net/wireless/ath/ar9170/usb.c b/drivers/net/wireless/ath/ar9170/usb.c index 99a6da4..e1c2fca 100644 --- a/drivers/net/wireless/ath/ar9170/usb.c +++ b/drivers/net/wireless/ath/ar9170/usb.c @@ -727,12 +727,16 @@ static void ar9170_usb_firmware_failed(struct ar9170_usb *aru) { struct device *parent = aru->udev->dev.parent; + complete(&aru->firmware_loading_complete); + /* unbind anything failed */ if (parent) down(&parent->sem); device_release_driver(&aru->udev->dev); if (parent) up(&parent->sem); + + usb_put_dev(aru->udev); } static void ar9170_usb_firmware_finish(const struct firmware *fw, void *context) @@ -761,6 +765,8 @@ static void ar9170_usb_firmware_finish(const struct firmware *fw, void *context) if (err) goto err_unrx; + complete(&aru->firmware_loading_complete); + usb_put_dev(aru->udev); return; err_unrx: @@ -858,6 +864,7 @@ static int ar9170_usb_probe(struct usb_interface *intf, init_usb_anchor(&aru->tx_pending); init_usb_anchor(&aru->tx_submitted); init_completion(&aru->cmd_wait); + init_completion(&aru->firmware_loading_complete); spin_lock_init(&aru->tx_urb_lock); aru->tx_pending_urbs = 0; @@ -877,6 +884,7 @@ static int ar9170_usb_probe(struct usb_interface *intf, if (err) goto err_freehw; + usb_get_dev(aru->udev); return request_firmware_nowait(THIS_MODULE, 1, "ar9170.fw", &aru->udev->dev, GFP_KERNEL, aru, ar9170_usb_firmware_step2); @@ -896,6 +904,9 @@ static void ar9170_usb_disconnect(struct usb_interface *intf) return; aru->common.state = AR9170_IDLE; + + wait_for_completion(&aru->firmware_loading_complete); + ar9170_unregister(&aru->common); ar9170_usb_cancel_urbs(aru); diff --git a/drivers/net/wireless/ath/ar9170/usb.h b/drivers/net/wireless/ath/ar9170/usb.h index a2ce3b1..919b060 100644 --- a/drivers/net/wireless/ath/ar9170/usb.h +++ b/drivers/net/wireless/ath/ar9170/usb.h @@ -71,6 +71,7 @@ struct ar9170_usb { unsigned int tx_pending_urbs; struct completion cmd_wait; + struct completion firmware_loading_complete; int readlen; u8 *readbuf; diff --git a/drivers/net/wireless/iwlwifi/iwl-commands.h b/drivers/net/wireless/iwlwifi/iwl-commands.h index 6383d9f..f4e59ae 100644 --- a/drivers/net/wireless/iwlwifi/iwl-commands.h +++ b/drivers/net/wireless/iwlwifi/iwl-commands.h @@ -2621,7 +2621,9 @@ struct iwl_ssid_ie { #define PROBE_OPTION_MAX_3945 4 #define PROBE_OPTION_MAX 20 #define TX_CMD_LIFE_TIME_INFINITE cpu_to_le32(0xFFFFFFFF) -#define IWL_GOOD_CRC_TH cpu_to_le16(1) +#define IWL_GOOD_CRC_TH_DISABLED 0 +#define IWL_GOOD_CRC_TH_DEFAULT cpu_to_le16(1) +#define IWL_GOOD_CRC_TH_NEVER cpu_to_le16(0xffff) #define IWL_MAX_SCAN_SIZE 1024 #define IWL_MAX_CMD_SIZE 4096 #define IWL_MAX_PROBE_REQUEST 200 diff --git a/drivers/net/wireless/iwlwifi/iwl-scan.c b/drivers/net/wireless/iwlwifi/iwl-scan.c index 12e455a..741e65e 100644 --- a/drivers/net/wireless/iwlwifi/iwl-scan.c +++ b/drivers/net/wireless/iwlwifi/iwl-scan.c @@ -813,16 +813,29 @@ static void iwl_bg_request_scan(struct work_struct *data) rate = IWL_RATE_1M_PLCP; rate_flags = RATE_MCS_CCK_MSK; } - scan->good_CRC_th = 0; + scan->good_CRC_th = IWL_GOOD_CRC_TH_DISABLED; } else if (priv->scan_bands & BIT(IEEE80211_BAND_5GHZ)) { band = IEEE80211_BAND_5GHZ; rate = IWL_RATE_6M_PLCP; /* - * If active scaning is requested but a certain channel - * is marked passive, we can do active scanning if we - * detect transmissions. + * If active scanning is requested but a certain channel is + * marked passive, we can do active scanning if we detect + * transmissions. + * + * There is an issue with some firmware versions that triggers + * a sysassert on a "good CRC threshold" of zero (== disabled), + * on a radar channel even though this means that we should NOT + * send probes. + * + * The "good CRC threshold" is the number of frames that we + * need to receive during our dwell time on a channel before + * sending out probes -- setting this to a huge value will + * mean we never reach it, but at the same time work around + * the aforementioned issue. Thus use IWL_GOOD_CRC_TH_NEVER + * here instead of IWL_GOOD_CRC_TH_DISABLED. */ - scan->good_CRC_th = is_active ? IWL_GOOD_CRC_TH : 0; + scan->good_CRC_th = is_active ? IWL_GOOD_CRC_TH_DEFAULT : + IWL_GOOD_CRC_TH_NEVER; /* Force use of chains B and C (0x6) for scan Rx for 4965 * Avoid A (0x1) because of its off-channel reception on A-band. diff --git a/drivers/net/wireless/iwlwifi/iwl3945-base.c b/drivers/net/wireless/iwlwifi/iwl3945-base.c index b55e4f3..b74a56c 100644 --- a/drivers/net/wireless/iwlwifi/iwl3945-base.c +++ b/drivers/net/wireless/iwlwifi/iwl3945-base.c @@ -2967,7 +2967,8 @@ static void iwl3945_bg_request_scan(struct work_struct *data) * is marked passive, we can do active scanning if we * detect transmissions. */ - scan->good_CRC_th = is_active ? IWL_GOOD_CRC_TH : 0; + scan->good_CRC_th = is_active ? IWL_GOOD_CRC_TH_DEFAULT : + IWL_GOOD_CRC_TH_DISABLED; band = IEEE80211_BAND_5GHZ; } else { IWL_WARN(priv, "Invalid scan band count\n"); diff --git a/drivers/net/zorro8390.c b/drivers/net/zorro8390.c index 81c753a..9548cbb 100644 --- a/drivers/net/zorro8390.c +++ b/drivers/net/zorro8390.c @@ -102,6 +102,7 @@ static struct zorro_device_id zorro8390_zorro_tbl[] __devinitdata = { { ZORRO_PROD_INDIVIDUAL_COMPUTERS_X_SURF, }, { 0 } }; +MODULE_DEVICE_TABLE(zorro, zorro8390_zorro_tbl); static struct zorro_driver zorro8390_driver = { .name = "zorro8390", diff --git a/drivers/oprofile/cpu_buffer.c b/drivers/oprofile/cpu_buffer.c index 166b67e..219f79e 100644 --- a/drivers/oprofile/cpu_buffer.c +++ b/drivers/oprofile/cpu_buffer.c @@ -30,23 +30,7 @@ #define OP_BUFFER_FLAGS 0 -/* - * Read and write access is using spin locking. Thus, writing to the - * buffer by NMI handler (x86) could occur also during critical - * sections when reading the buffer. To avoid this, there are 2 - * buffers for independent read and write access. Read access is in - * process context only, write access only in the NMI handler. If the - * read buffer runs empty, both buffers are swapped atomically. There - * is potentially a small window during swapping where the buffers are - * disabled and samples could be lost. - * - * Using 2 buffers is a little bit overhead, but the solution is clear - * and does not require changes in the ring buffer implementation. It - * can be changed to a single buffer solution when the ring buffer - * access is implemented as non-locking atomic code. - */ -static struct ring_buffer *op_ring_buffer_read; -static struct ring_buffer *op_ring_buffer_write; +static struct ring_buffer *op_ring_buffer; DEFINE_PER_CPU(struct oprofile_cpu_buffer, op_cpu_buffer); static void wq_sync_buffer(struct work_struct *work); @@ -68,12 +52,9 @@ void oprofile_cpu_buffer_inc_smpl_lost(void) void free_cpu_buffers(void) { - if (op_ring_buffer_read) - ring_buffer_free(op_ring_buffer_read); - op_ring_buffer_read = NULL; - if (op_ring_buffer_write) - ring_buffer_free(op_ring_buffer_write); - op_ring_buffer_write = NULL; + if (op_ring_buffer) + ring_buffer_free(op_ring_buffer); + op_ring_buffer = NULL; } #define RB_EVENT_HDR_SIZE 4 @@ -86,11 +67,8 @@ int alloc_cpu_buffers(void) unsigned long byte_size = buffer_size * (sizeof(struct op_sample) + RB_EVENT_HDR_SIZE); - op_ring_buffer_read = ring_buffer_alloc(byte_size, OP_BUFFER_FLAGS); - if (!op_ring_buffer_read) - goto fail; - op_ring_buffer_write = ring_buffer_alloc(byte_size, OP_BUFFER_FLAGS); - if (!op_ring_buffer_write) + op_ring_buffer = ring_buffer_alloc(byte_size, OP_BUFFER_FLAGS); + if (!op_ring_buffer) goto fail; for_each_possible_cpu(i) { @@ -162,16 +140,11 @@ struct op_sample *op_cpu_buffer_write_reserve(struct op_entry *entry, unsigned long size) { entry->event = ring_buffer_lock_reserve - (op_ring_buffer_write, sizeof(struct op_sample) + + (op_ring_buffer, sizeof(struct op_sample) + size * sizeof(entry->sample->data[0])); - if (entry->event) - entry->sample = ring_buffer_event_data(entry->event); - else - entry->sample = NULL; - - if (!entry->sample) + if (!entry->event) return NULL; - + entry->sample = ring_buffer_event_data(entry->event); entry->size = size; entry->data = entry->sample->data; @@ -180,25 +153,16 @@ struct op_sample int op_cpu_buffer_write_commit(struct op_entry *entry) { - return ring_buffer_unlock_commit(op_ring_buffer_write, entry->event); + return ring_buffer_unlock_commit(op_ring_buffer, entry->event); } struct op_sample *op_cpu_buffer_read_entry(struct op_entry *entry, int cpu) { struct ring_buffer_event *e; - e = ring_buffer_consume(op_ring_buffer_read, cpu, NULL); - if (e) - goto event; - if (ring_buffer_swap_cpu(op_ring_buffer_read, - op_ring_buffer_write, - cpu)) + e = ring_buffer_consume(op_ring_buffer, cpu, NULL, NULL); + if (!e) return NULL; - e = ring_buffer_consume(op_ring_buffer_read, cpu, NULL); - if (e) - goto event; - return NULL; -event: entry->event = e; entry->sample = ring_buffer_event_data(e); entry->size = (ring_buffer_event_length(e) - sizeof(struct op_sample)) @@ -209,8 +173,7 @@ event: unsigned long op_cpu_buffer_entries(int cpu) { - return ring_buffer_entries_cpu(op_ring_buffer_read, cpu) - + ring_buffer_entries_cpu(op_ring_buffer_write, cpu); + return ring_buffer_entries_cpu(op_ring_buffer, cpu); } static int @@ -356,8 +319,16 @@ void oprofile_add_ext_sample(unsigned long pc, struct pt_regs * const regs, void oprofile_add_sample(struct pt_regs * const regs, unsigned long event) { - int is_kernel = !user_mode(regs); - unsigned long pc = profile_pc(regs); + int is_kernel; + unsigned long pc; + + if (likely(regs)) { + is_kernel = !user_mode(regs); + pc = profile_pc(regs); + } else { + is_kernel = 0; /* This value will not be used */ + pc = ESCAPE_CODE; /* as this causes an early return. */ + } __oprofile_add_ext_sample(pc, regs, event, is_kernel); } diff --git a/drivers/oprofile/oprof.c b/drivers/oprofile/oprof.c index dc8a042..b336cd9 100644 --- a/drivers/oprofile/oprof.c +++ b/drivers/oprofile/oprof.c @@ -253,22 +253,26 @@ static int __init oprofile_init(void) int err; err = oprofile_arch_init(&oprofile_ops); - if (err < 0 || timer) { printk(KERN_INFO "oprofile: using timer interrupt.\n"); - oprofile_timer_init(&oprofile_ops); + err = oprofile_timer_init(&oprofile_ops); + if (err) + goto out_arch; } - err = oprofilefs_register(); if (err) - oprofile_arch_exit(); + goto out_arch; + return 0; +out_arch: + oprofile_arch_exit(); return err; } static void __exit oprofile_exit(void) { + oprofile_timer_exit(); oprofilefs_unregister(); oprofile_arch_exit(); } diff --git a/drivers/oprofile/oprof.h b/drivers/oprofile/oprof.h index cb92f5c..47e12cb 100644 --- a/drivers/oprofile/oprof.h +++ b/drivers/oprofile/oprof.h @@ -34,7 +34,8 @@ struct super_block; struct dentry; void oprofile_create_files(struct super_block *sb, struct dentry *root); -void oprofile_timer_init(struct oprofile_operations *ops); +int oprofile_timer_init(struct oprofile_operations *ops); +void oprofile_timer_exit(void); int oprofile_set_backtrace(unsigned long depth); int oprofile_set_timeout(unsigned long time); diff --git a/drivers/oprofile/timer_int.c b/drivers/oprofile/timer_int.c index 333f915..dc0ae4d 100644 --- a/drivers/oprofile/timer_int.c +++ b/drivers/oprofile/timer_int.c @@ -13,34 +13,94 @@ #include <linux/oprofile.h> #include <linux/profile.h> #include <linux/init.h> +#include <linux/cpu.h> +#include <linux/hrtimer.h> +#include <asm/irq_regs.h> #include <asm/ptrace.h> #include "oprof.h" -static int timer_notify(struct pt_regs *regs) +static DEFINE_PER_CPU(struct hrtimer, oprofile_hrtimer); + +static enum hrtimer_restart oprofile_hrtimer_notify(struct hrtimer *hrtimer) +{ + oprofile_add_sample(get_irq_regs(), 0); + hrtimer_forward_now(hrtimer, ns_to_ktime(TICK_NSEC)); + return HRTIMER_RESTART; +} + +static void __oprofile_hrtimer_start(void *unused) +{ + struct hrtimer *hrtimer = &__get_cpu_var(oprofile_hrtimer); + + hrtimer_init(hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL); + hrtimer->function = oprofile_hrtimer_notify; + + hrtimer_start(hrtimer, ns_to_ktime(TICK_NSEC), + HRTIMER_MODE_REL_PINNED); +} + +static int oprofile_hrtimer_start(void) { - oprofile_add_sample(regs, 0); + on_each_cpu(__oprofile_hrtimer_start, NULL, 1); return 0; } -static int timer_start(void) +static void __oprofile_hrtimer_stop(int cpu) { - return register_timer_hook(timer_notify); + struct hrtimer *hrtimer = &per_cpu(oprofile_hrtimer, cpu); + + hrtimer_cancel(hrtimer); } +static void oprofile_hrtimer_stop(void) +{ + int cpu; + + for_each_online_cpu(cpu) + __oprofile_hrtimer_stop(cpu); +} -static void timer_stop(void) +static int __cpuinit oprofile_cpu_notify(struct notifier_block *self, + unsigned long action, void *hcpu) { - unregister_timer_hook(timer_notify); + long cpu = (long) hcpu; + + switch (action) { + case CPU_ONLINE: + case CPU_ONLINE_FROZEN: + smp_call_function_single(cpu, __oprofile_hrtimer_start, + NULL, 1); + break; + case CPU_DEAD: + case CPU_DEAD_FROZEN: + __oprofile_hrtimer_stop(cpu); + break; + } + return NOTIFY_OK; } +static struct notifier_block __refdata oprofile_cpu_notifier = { + .notifier_call = oprofile_cpu_notify, +}; -void __init oprofile_timer_init(struct oprofile_operations *ops) +int __init oprofile_timer_init(struct oprofile_operations *ops) { + int rc; + + rc = register_hotcpu_notifier(&oprofile_cpu_notifier); + if (rc) + return rc; ops->create_files = NULL; ops->setup = NULL; ops->shutdown = NULL; - ops->start = timer_start; - ops->stop = timer_stop; + ops->start = oprofile_hrtimer_start; + ops->stop = oprofile_hrtimer_stop; ops->cpu_type = "timer"; + return 0; +} + +void __exit oprofile_timer_exit(void) +{ + unregister_hotcpu_notifier(&oprofile_cpu_notifier); } diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c index 4173125..371dc56 100644 --- a/drivers/pci/intel-iommu.c +++ b/drivers/pci/intel-iommu.c @@ -3626,14 +3626,15 @@ static void intel_iommu_detach_device(struct iommu_domain *domain, domain_remove_one_dev_info(dmar_domain, pdev); } -static int intel_iommu_map_range(struct iommu_domain *domain, - unsigned long iova, phys_addr_t hpa, - size_t size, int iommu_prot) +static int intel_iommu_map(struct iommu_domain *domain, + unsigned long iova, phys_addr_t hpa, + int gfp_order, int iommu_prot) { struct dmar_domain *dmar_domain = domain->priv; u64 max_addr; int addr_width; int prot = 0; + size_t size; int ret; if (iommu_prot & IOMMU_READ) @@ -3643,6 +3644,7 @@ static int intel_iommu_map_range(struct iommu_domain *domain, if ((iommu_prot & IOMMU_CACHE) && dmar_domain->iommu_snooping) prot |= DMA_PTE_SNP; + size = PAGE_SIZE << gfp_order; max_addr = iova + size; if (dmar_domain->max_addr < max_addr) { int min_agaw; @@ -3669,19 +3671,19 @@ static int intel_iommu_map_range(struct iommu_domain *domain, return ret; } -static void intel_iommu_unmap_range(struct iommu_domain *domain, - unsigned long iova, size_t size) +static int intel_iommu_unmap(struct iommu_domain *domain, + unsigned long iova, int gfp_order) { struct dmar_domain *dmar_domain = domain->priv; - - if (!size) - return; + size_t size = PAGE_SIZE << gfp_order; dma_pte_clear_range(dmar_domain, iova >> VTD_PAGE_SHIFT, (iova + size - 1) >> VTD_PAGE_SHIFT); if (dmar_domain->max_addr == iova + size) dmar_domain->max_addr = iova; + + return gfp_order; } static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain, @@ -3714,8 +3716,8 @@ static struct iommu_ops intel_iommu_ops = { .domain_destroy = intel_iommu_domain_destroy, .attach_dev = intel_iommu_attach_device, .detach_dev = intel_iommu_detach_device, - .map = intel_iommu_map_range, - .unmap = intel_iommu_unmap_range, + .map = intel_iommu_map, + .unmap = intel_iommu_unmap, .iova_to_phys = intel_iommu_iova_to_phys, .domain_has_cap = intel_iommu_domain_has_cap, }; diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c index 4fe36d2..19b1113 100644 --- a/drivers/pci/setup-bus.c +++ b/drivers/pci/setup-bus.c @@ -838,65 +838,11 @@ static void pci_bus_dump_resources(struct pci_bus *bus) } } -static int __init pci_bus_get_depth(struct pci_bus *bus) -{ - int depth = 0; - struct pci_dev *dev; - - list_for_each_entry(dev, &bus->devices, bus_list) { - int ret; - struct pci_bus *b = dev->subordinate; - if (!b) - continue; - - ret = pci_bus_get_depth(b); - if (ret + 1 > depth) - depth = ret + 1; - } - - return depth; -} -static int __init pci_get_max_depth(void) -{ - int depth = 0; - struct pci_bus *bus; - - list_for_each_entry(bus, &pci_root_buses, node) { - int ret; - - ret = pci_bus_get_depth(bus); - if (ret > depth) - depth = ret; - } - - return depth; -} - -/* - * first try will not touch pci bridge res - * second and later try will clear small leaf bridge res - * will stop till to the max deepth if can not find good one - */ void __init pci_assign_unassigned_resources(void) { struct pci_bus *bus; - int tried_times = 0; - enum release_type rel_type = leaf_only; - struct resource_list_x head, *list; - unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM | - IORESOURCE_PREFETCH; - unsigned long failed_type; - int max_depth = pci_get_max_depth(); - int pci_try_num; - head.next = NULL; - - pci_try_num = max_depth + 1; - printk(KERN_DEBUG "PCI: max bus depth: %d pci_try_num: %d\n", - max_depth, pci_try_num); - -again: /* Depth first, calculate sizes and alignments of all subordinate buses. */ list_for_each_entry(bus, &pci_root_buses, node) { @@ -904,65 +850,9 @@ again: } /* Depth last, allocate resources and update the hardware. */ list_for_each_entry(bus, &pci_root_buses, node) { - __pci_bus_assign_resources(bus, &head); - } - tried_times++; - - /* any device complain? */ - if (!head.next) - goto enable_and_dump; - failed_type = 0; - for (list = head.next; list;) { - failed_type |= list->flags; - list = list->next; - } - /* - * io port are tight, don't try extra - * or if reach the limit, don't want to try more - */ - failed_type &= type_mask; - if ((failed_type == IORESOURCE_IO) || (tried_times >= pci_try_num)) { - free_failed_list(&head); - goto enable_and_dump; - } - - printk(KERN_DEBUG "PCI: No. %d try to assign unassigned res\n", - tried_times + 1); - - /* third times and later will not check if it is leaf */ - if ((tried_times + 1) > 2) - rel_type = whole_subtree; - - /* - * Try to release leaf bridge's resources that doesn't fit resource of - * child device under that bridge - */ - for (list = head.next; list;) { - bus = list->dev->bus; - pci_bus_release_bridge_resources(bus, list->flags & type_mask, - rel_type); - list = list->next; - } - /* restore size and flags */ - for (list = head.next; list;) { - struct resource *res = list->res; - - res->start = list->start; - res->end = list->end; - res->flags = list->flags; - if (list->dev->subordinate) - res->flags = 0; - - list = list->next; - } - free_failed_list(&head); - - goto again; - -enable_and_dump: - /* Depth last, update the hardware. */ - list_for_each_entry(bus, &pci_root_buses, node) + pci_bus_assign_resources(bus); pci_enable_bridges(bus); + } /* dump the resource on buses */ list_for_each_entry(bus, &pci_root_buses, node) { diff --git a/drivers/pcmcia/cs.c b/drivers/pcmcia/cs.c index 75ed866..c338375 100644 --- a/drivers/pcmcia/cs.c +++ b/drivers/pcmcia/cs.c @@ -671,20 +671,22 @@ static int pccardd(void *__skt) socket_remove(skt); if (sysfs_events & PCMCIA_UEVENT_INSERT) socket_insert(skt); - if ((sysfs_events & PCMCIA_UEVENT_RESUME) && - !(skt->state & SOCKET_CARDBUS)) { - ret = socket_resume(skt); - if (!ret && skt->callback) - skt->callback->resume(skt); - } if ((sysfs_events & PCMCIA_UEVENT_SUSPEND) && !(skt->state & SOCKET_CARDBUS)) { if (skt->callback) ret = skt->callback->suspend(skt); else ret = 0; - if (!ret) + if (!ret) { socket_suspend(skt); + msleep(100); + } + } + if ((sysfs_events & PCMCIA_UEVENT_RESUME) && + !(skt->state & SOCKET_CARDBUS)) { + ret = socket_resume(skt); + if (!ret && skt->callback) + skt->callback->resume(skt); } if ((sysfs_events & PCMCIA_UEVENT_REQUERY) && !(skt->state & SOCKET_CARDBUS)) { diff --git a/drivers/pcmcia/ds.c b/drivers/pcmcia/ds.c index 508f94a..041eee4 100644 --- a/drivers/pcmcia/ds.c +++ b/drivers/pcmcia/ds.c @@ -1283,6 +1283,7 @@ static int ds_event(struct pcmcia_socket *skt, event_t event, int priority) destroy_cis_cache(skt); kfree(skt->fake_cis); skt->fake_cis = NULL; + s->functions = 0; mutex_unlock(&s->ops_mutex); /* now, add the new card */ ds_event(skt, CS_EVENT_CARD_INSERTION, diff --git a/drivers/pcmcia/pcmcia_ioctl.c b/drivers/pcmcia/pcmcia_ioctl.c index 104e73d..7631faa 100644 --- a/drivers/pcmcia/pcmcia_ioctl.c +++ b/drivers/pcmcia/pcmcia_ioctl.c @@ -711,7 +711,7 @@ static int ds_open(struct inode *inode, struct file *file) warning_printed = 1; } - if (s->pcmcia_state.present) + if (atomic_read(&s->present)) queue_event(user, CS_EVENT_CARD_INSERTION); out: unlock_kernel(); @@ -770,9 +770,6 @@ static ssize_t ds_read(struct file *file, char __user *buf, return -EIO; s = user->socket; - if (s->pcmcia_state.dead) - return -EIO; - ret = wait_event_interruptible(s->queue, !queue_empty(user)); if (ret == 0) ret = put_user(get_queued_event(user), (int __user *)buf) ? -EFAULT : 4; @@ -838,8 +835,6 @@ static int ds_ioctl(struct inode *inode, struct file *file, return -EIO; s = user->socket; - if (s->pcmcia_state.dead) - return -EIO; size = (cmd & IOCSIZE_MASK) >> IOCSIZE_SHIFT; if (size > sizeof(ds_ioctl_arg_t)) diff --git a/drivers/pnp/pnpacpi/rsparser.c b/drivers/pnp/pnpacpi/rsparser.c index 35bb44a..100e4d9 100644 --- a/drivers/pnp/pnpacpi/rsparser.c +++ b/drivers/pnp/pnpacpi/rsparser.c @@ -274,26 +274,6 @@ static void pnpacpi_parse_allocated_busresource(struct pnp_dev *dev, pnp_add_bus_resource(dev, start, end); } -static u64 addr_space_length(struct pnp_dev *dev, u64 min, u64 max, u64 len) -{ - u64 max_len; - - max_len = max - min + 1; - if (len <= max_len) - return len; - - /* - * Per 6.4.3.5, _LEN cannot exceed _MAX - _MIN + 1, but some BIOSes - * don't do this correctly, e.g., - * https://bugzilla.kernel.org/show_bug.cgi?id=15480 - */ - dev_info(&dev->dev, - "resource length %#llx doesn't fit in %#llx-%#llx, trimming\n", - (unsigned long long) len, (unsigned long long) min, - (unsigned long long) max); - return max_len; -} - static void pnpacpi_parse_allocated_address_space(struct pnp_dev *dev, struct acpi_resource *res) { @@ -309,7 +289,8 @@ static void pnpacpi_parse_allocated_address_space(struct pnp_dev *dev, return; } - len = addr_space_length(dev, p->minimum, p->maximum, p->address_length); + /* Windows apparently computes length rather than using _LEN */ + len = p->maximum - p->minimum + 1; window = (p->producer_consumer == ACPI_PRODUCER) ? 1 : 0; if (p->resource_type == ACPI_MEMORY_RANGE) @@ -330,7 +311,8 @@ static void pnpacpi_parse_allocated_ext_address_space(struct pnp_dev *dev, int window; u64 len; - len = addr_space_length(dev, p->minimum, p->maximum, p->address_length); + /* Windows apparently computes length rather than using _LEN */ + len = p->maximum - p->minimum + 1; window = (p->producer_consumer == ACPI_PRODUCER) ? 1 : 0; if (p->resource_type == ACPI_MEMORY_RANGE) diff --git a/drivers/pnp/resource.c b/drivers/pnp/resource.c index 2e54e6a..e3446ab 100644 --- a/drivers/pnp/resource.c +++ b/drivers/pnp/resource.c @@ -211,6 +211,8 @@ int pnp_check_port(struct pnp_dev *dev, struct resource *res) if (tres->flags & IORESOURCE_IO) { if (cannot_compare(tres->flags)) continue; + if (tres->flags & IORESOURCE_WINDOW) + continue; tport = &tres->start; tend = &tres->end; if (ranged_conflict(port, end, tport, tend)) @@ -271,6 +273,8 @@ int pnp_check_mem(struct pnp_dev *dev, struct resource *res) if (tres->flags & IORESOURCE_MEM) { if (cannot_compare(tres->flags)) continue; + if (tres->flags & IORESOURCE_WINDOW) + continue; taddr = &tres->start; tend = &tres->end; if (ranged_conflict(addr, end, taddr, tend)) diff --git a/drivers/s390/block/dasd.c b/drivers/s390/block/dasd.c index acf222f..fa2339c 100644 --- a/drivers/s390/block/dasd.c +++ b/drivers/s390/block/dasd.c @@ -37,6 +37,9 @@ */ #define DASD_CHANQ_MAX_SIZE 4 +#define DASD_SLEEPON_START_TAG (void *) 1 +#define DASD_SLEEPON_END_TAG (void *) 2 + /* * SECTION: exported variables of dasd.c */ @@ -1472,7 +1475,10 @@ void dasd_add_request_tail(struct dasd_ccw_req *cqr) */ static void dasd_wakeup_cb(struct dasd_ccw_req *cqr, void *data) { - wake_up((wait_queue_head_t *) data); + spin_lock_irq(get_ccwdev_lock(cqr->startdev->cdev)); + cqr->callback_data = DASD_SLEEPON_END_TAG; + spin_unlock_irq(get_ccwdev_lock(cqr->startdev->cdev)); + wake_up(&generic_waitq); } static inline int _wait_for_wakeup(struct dasd_ccw_req *cqr) @@ -1482,10 +1488,7 @@ static inline int _wait_for_wakeup(struct dasd_ccw_req *cqr) device = cqr->startdev; spin_lock_irq(get_ccwdev_lock(device->cdev)); - rc = ((cqr->status == DASD_CQR_DONE || - cqr->status == DASD_CQR_NEED_ERP || - cqr->status == DASD_CQR_TERMINATED) && - list_empty(&cqr->devlist)); + rc = (cqr->callback_data == DASD_SLEEPON_END_TAG); spin_unlock_irq(get_ccwdev_lock(device->cdev)); return rc; } @@ -1573,7 +1576,7 @@ static int _dasd_sleep_on(struct dasd_ccw_req *maincqr, int interruptible) wait_event(generic_waitq, !(device->stopped)); cqr->callback = dasd_wakeup_cb; - cqr->callback_data = (void *) &generic_waitq; + cqr->callback_data = DASD_SLEEPON_START_TAG; dasd_add_request_tail(cqr); if (interruptible) { rc = wait_event_interruptible( @@ -1652,7 +1655,7 @@ int dasd_sleep_on_immediatly(struct dasd_ccw_req *cqr) } cqr->callback = dasd_wakeup_cb; - cqr->callback_data = (void *) &generic_waitq; + cqr->callback_data = DASD_SLEEPON_START_TAG; cqr->status = DASD_CQR_QUEUED; list_add(&cqr->devlist, &device->ccw_queue); diff --git a/drivers/scsi/advansys.c b/drivers/scsi/advansys.c index 9201afe6..7f87979 100644 --- a/drivers/scsi/advansys.c +++ b/drivers/scsi/advansys.c @@ -4724,6 +4724,10 @@ static ushort AscInitMicroCodeVar(ASC_DVC_VAR *asc_dvc) BUG_ON((unsigned long)asc_dvc->overrun_buf & 7); asc_dvc->overrun_dma = dma_map_single(board->dev, asc_dvc->overrun_buf, ASC_OVERRUN_BSIZE, DMA_FROM_DEVICE); + if (dma_mapping_error(board->dev, asc_dvc->overrun_dma)) { + warn_code = -ENOMEM; + goto err_dma_map; + } phy_addr = cpu_to_le32(asc_dvc->overrun_dma); AscMemDWordCopyPtrToLram(iop_base, ASCV_OVERRUN_PADDR_D, (uchar *)&phy_addr, 1); @@ -4739,14 +4743,23 @@ static ushort AscInitMicroCodeVar(ASC_DVC_VAR *asc_dvc) AscSetPCAddr(iop_base, ASC_MCODE_START_ADDR); if (AscGetPCAddr(iop_base) != ASC_MCODE_START_ADDR) { asc_dvc->err_code |= ASC_IERR_SET_PC_ADDR; - return warn_code; + warn_code = UW_ERR; + goto err_mcode_start; } if (AscStartChip(iop_base) != 1) { asc_dvc->err_code |= ASC_IERR_START_STOP_CHIP; - return warn_code; + warn_code = UW_ERR; + goto err_mcode_start; } return warn_code; + +err_mcode_start: + dma_unmap_single(board->dev, asc_dvc->overrun_dma, + ASC_OVERRUN_BSIZE, DMA_FROM_DEVICE); +err_dma_map: + asc_dvc->overrun_dma = 0; + return warn_code; } static ushort AscInitAsc1000Driver(ASC_DVC_VAR *asc_dvc) @@ -4802,6 +4815,8 @@ static ushort AscInitAsc1000Driver(ASC_DVC_VAR *asc_dvc) } release_firmware(fw); warn_code |= AscInitMicroCodeVar(asc_dvc); + if (!asc_dvc->overrun_dma) + return warn_code; asc_dvc->init_state |= ASC_INIT_STATE_END_LOAD_MC; AscEnableInterrupt(iop_base); return warn_code; @@ -7978,9 +7993,10 @@ static int advansys_reset(struct scsi_cmnd *scp) status = AscInitAsc1000Driver(asc_dvc); /* Refer to ASC_IERR_* definitions for meaning of 'err_code'. */ - if (asc_dvc->err_code) { + if (asc_dvc->err_code || !asc_dvc->overrun_dma) { scmd_printk(KERN_INFO, scp, "SCSI bus reset error: " - "0x%x\n", asc_dvc->err_code); + "0x%x, status: 0x%x\n", asc_dvc->err_code, + status); ret = FAILED; } else if (status) { scmd_printk(KERN_INFO, scp, "SCSI bus reset warning: " @@ -12311,7 +12327,7 @@ static int __devinit advansys_board_found(struct Scsi_Host *shost, asc_dvc_varp->overrun_buf = kzalloc(ASC_OVERRUN_BSIZE, GFP_KERNEL); if (!asc_dvc_varp->overrun_buf) { ret = -ENOMEM; - goto err_free_wide_mem; + goto err_free_irq; } warn_code = AscInitAsc1000Driver(asc_dvc_varp); @@ -12320,30 +12336,36 @@ static int __devinit advansys_board_found(struct Scsi_Host *shost, "warn 0x%x, error 0x%x\n", asc_dvc_varp->init_state, warn_code, asc_dvc_varp->err_code); - if (asc_dvc_varp->err_code) { + if (!asc_dvc_varp->overrun_dma) { ret = -ENODEV; - kfree(asc_dvc_varp->overrun_buf); + goto err_free_mem; } } } else { - if (advansys_wide_init_chip(shost)) + if (advansys_wide_init_chip(shost)) { ret = -ENODEV; + goto err_free_mem; + } } - if (ret) - goto err_free_wide_mem; - ASC_DBG_PRT_SCSI_HOST(2, shost); ret = scsi_add_host(shost, boardp->dev); if (ret) - goto err_free_wide_mem; + goto err_free_mem; scsi_scan_host(shost); return 0; - err_free_wide_mem: - advansys_wide_free_mem(boardp); + err_free_mem: + if (ASC_NARROW_BOARD(boardp)) { + if (asc_dvc_varp->overrun_dma) + dma_unmap_single(boardp->dev, asc_dvc_varp->overrun_dma, + ASC_OVERRUN_BSIZE, DMA_FROM_DEVICE); + kfree(asc_dvc_varp->overrun_buf); + } else + advansys_wide_free_mem(boardp); + err_free_irq: free_irq(boardp->irq, shost); err_free_dma: #ifdef CONFIG_ISA diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c index 6d5ae44..633e090 100644 --- a/drivers/scsi/libiscsi.c +++ b/drivers/scsi/libiscsi.c @@ -471,12 +471,12 @@ static int iscsi_prep_scsi_cmd_pdu(struct iscsi_task *task) WARN_ON(hdrlength >= 256); hdr->hlength = hdrlength & 0xFF; + hdr->cmdsn = task->cmdsn = cpu_to_be32(session->cmdsn); if (session->tt->init_task && session->tt->init_task(task)) return -EIO; task->state = ISCSI_TASK_RUNNING; - hdr->cmdsn = task->cmdsn = cpu_to_be32(session->cmdsn); session->cmdsn++; conn->scsicmd_pdus_cnt++; diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c index b00efd1..88f7446 100644 --- a/drivers/scsi/libsas/sas_ata.c +++ b/drivers/scsi/libsas/sas_ata.c @@ -395,11 +395,15 @@ int sas_ata_init_host_and_port(struct domain_device *found_dev, void sas_ata_task_abort(struct sas_task *task) { struct ata_queued_cmd *qc = task->uldd_task; + struct request_queue *q = qc->scsicmd->device->request_queue; struct completion *waiting; + unsigned long flags; /* Bounce SCSI-initiated commands to the SCSI EH */ if (qc->scsicmd) { + spin_lock_irqsave(q->queue_lock, flags); blk_abort_request(qc->scsicmd->request); + spin_unlock_irqrestore(q->queue_lock, flags); scsi_schedule_eh(qc->scsicmd->device->host); return; } diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c index 2660e1b..8228350 100644 --- a/drivers/scsi/libsas/sas_scsi_host.c +++ b/drivers/scsi/libsas/sas_scsi_host.c @@ -1030,6 +1030,8 @@ int __sas_task_abort(struct sas_task *task) void sas_task_abort(struct sas_task *task) { struct scsi_cmnd *sc = task->uldd_task; + struct request_queue *q = sc->device->request_queue; + unsigned long flags; /* Escape for libsas internal commands */ if (!sc) { @@ -1044,7 +1046,9 @@ void sas_task_abort(struct sas_task *task) return; } + spin_lock_irqsave(q->queue_lock, flags); blk_abort_request(sc->request); + spin_unlock_irqrestore(q->queue_lock, flags); scsi_schedule_eh(sc->device->host); } diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c index 3e10c30..3a5bfd1 100644 --- a/drivers/scsi/scsi_debug.c +++ b/drivers/scsi/scsi_debug.c @@ -957,7 +957,8 @@ static int resp_start_stop(struct scsi_cmnd * scp, static sector_t get_sdebug_capacity(void) { if (scsi_debug_virtual_gb > 0) - return 2048 * 1024 * (sector_t)scsi_debug_virtual_gb; + return (sector_t)scsi_debug_virtual_gb * + (1073741824 / scsi_debug_sector_size); else return sdebug_store_sectors; } diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index d45c69c..7ad53fa 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -302,7 +302,20 @@ static int scsi_check_sense(struct scsi_cmnd *scmd) if (scmd->device->allow_restart && (sshdr.asc == 0x04) && (sshdr.ascq == 0x02)) return FAILED; - return SUCCESS; + + if (blk_barrier_rq(scmd->request)) + /* + * barrier requests should always retry on UA + * otherwise block will get a spurious error + */ + return NEEDS_RETRY; + else + /* + * for normal (non barrier) commands, pass the + * UA upwards for a determination in the + * completion functions + */ + return SUCCESS; /* these three are not supported */ case COPY_ABORTED: diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index 8b827f3..de6c603 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -1040,6 +1040,7 @@ static void sd_prepare_flush(struct request_queue *q, struct request *rq) { rq->cmd_type = REQ_TYPE_BLOCK_PC; rq->timeout = SD_TIMEOUT; + rq->retries = SD_MAX_RETRIES; rq->cmd[0] = SYNCHRONIZE_CACHE; rq->cmd_len = 10; } diff --git a/drivers/scsi/zorro7xx.c b/drivers/scsi/zorro7xx.c index 105449c..e17764d 100644 --- a/drivers/scsi/zorro7xx.c +++ b/drivers/scsi/zorro7xx.c @@ -69,6 +69,7 @@ static struct zorro_device_id zorro7xx_zorro_tbl[] __devinitdata = { }, { 0 } }; +MODULE_DEVICE_TABLE(zorro, zorro7xx_zorro_tbl); static int __devinit zorro7xx_init_one(struct zorro_dev *z, const struct zorro_device_id *ent) diff --git a/drivers/serial/imx.c b/drivers/serial/imx.c index 4315b23..eacb588 100644 --- a/drivers/serial/imx.c +++ b/drivers/serial/imx.c @@ -120,7 +120,8 @@ #define MX2_UCR3_RXDMUXSEL (1<<2) /* RXD Muxed Input Select, on mx2/mx3 */ #define UCR3_INVT (1<<1) /* Inverted Infrared transmission */ #define UCR3_BPEN (1<<0) /* Preset registers enable */ -#define UCR4_CTSTL_32 (32<<10) /* CTS trigger level (32 chars) */ +#define UCR4_CTSTL_SHF 10 /* CTS trigger level shift */ +#define UCR4_CTSTL_MASK 0x3F /* CTS trigger is 6 bits wide */ #define UCR4_INVR (1<<9) /* Inverted infrared reception */ #define UCR4_ENIRI (1<<8) /* Serial infrared interrupt enable */ #define UCR4_WKEN (1<<7) /* Wake interrupt enable */ @@ -591,6 +592,9 @@ static int imx_setup_ufcr(struct imx_port *sport, unsigned int mode) return 0; } +/* half the RX buffer size */ +#define CTSTL 16 + static int imx_startup(struct uart_port *port) { struct imx_port *sport = (struct imx_port *)port; @@ -607,6 +611,10 @@ static int imx_startup(struct uart_port *port) if (USE_IRDA(sport)) temp |= UCR4_IRSC; + /* set the trigger level for CTS */ + temp &= ~(UCR4_CTSTL_MASK<< UCR4_CTSTL_SHF); + temp |= CTSTL<< UCR4_CTSTL_SHF; + writel(temp & ~UCR4_DREN, sport->port.membase + UCR4); if (USE_IRDA(sport)) { diff --git a/drivers/serial/mpc52xx_uart.c b/drivers/serial/mpc52xx_uart.c index a176ab4..02469c3 100644 --- a/drivers/serial/mpc52xx_uart.c +++ b/drivers/serial/mpc52xx_uart.c @@ -1467,7 +1467,7 @@ mpc52xx_uart_init(void) /* * Map the PSC FIFO Controller and init if on MPC512x. */ - if (psc_ops->fifoc_init) { + if (psc_ops && psc_ops->fifoc_init) { ret = psc_ops->fifoc_init(); if (ret) return ret; diff --git a/drivers/usb/core/inode.c b/drivers/usb/core/inode.c index 4a6366a..111a01a 100644 --- a/drivers/usb/core/inode.c +++ b/drivers/usb/core/inode.c @@ -380,6 +380,7 @@ static int usbfs_rmdir(struct inode *dir, struct dentry *dentry) mutex_lock(&inode->i_mutex); dentry_unhash(dentry); if (usbfs_empty(dentry)) { + dont_mount(dentry); drop_nlink(dentry->d_inode); drop_nlink(dentry->d_inode); dput(dentry); diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index e69d238..49fa953 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -1035,7 +1035,12 @@ int vhost_add_used(struct vhost_virtqueue *vq, unsigned int head, int len) /* This actually signals the guest, using eventfd. */ void vhost_signal(struct vhost_dev *dev, struct vhost_virtqueue *vq) { - __u16 flags = 0; + __u16 flags; + /* Flush out used index updates. This is paired + * with the barrier that the Guest executes when enabling + * interrupts. */ + smp_mb(); + if (get_user(flags, &vq->avail->flags)) { vq_err(vq, "Failed to get flags"); return; diff --git a/drivers/video/amifb.c b/drivers/video/amifb.c index dca48df..e5d6b56 100644 --- a/drivers/video/amifb.c +++ b/drivers/video/amifb.c @@ -50,8 +50,9 @@ #include <linux/fb.h> #include <linux/init.h> #include <linux/ioport.h> - +#include <linux/platform_device.h> #include <linux/uaccess.h> + #include <asm/system.h> #include <asm/irq.h> #include <asm/amigahw.h> @@ -1135,7 +1136,7 @@ static int amifb_ioctl(struct fb_info *info, unsigned int cmd, unsigned long arg * Interface to the low level console driver */ -static void amifb_deinit(void); +static void amifb_deinit(struct platform_device *pdev); /* * Internal routines @@ -2246,7 +2247,7 @@ static inline void chipfree(void) * Initialisation */ -static int __init amifb_init(void) +static int __init amifb_probe(struct platform_device *pdev) { int tag, i, err = 0; u_long chipptr; @@ -2261,16 +2262,6 @@ static int __init amifb_init(void) } amifb_setup(option); #endif - if (!MACH_IS_AMIGA || !AMIGAHW_PRESENT(AMI_VIDEO)) - return -ENODEV; - - /* - * We request all registers starting from bplpt[0] - */ - if (!request_mem_region(CUSTOM_PHYSADDR+0xe0, 0x120, - "amifb [Denise/Lisa]")) - return -EBUSY; - custom.dmacon = DMAF_ALL | DMAF_MASTER; switch (amiga_chipset) { @@ -2377,6 +2368,7 @@ default_chipset: fb_info.fbops = &amifb_ops; fb_info.par = ¤tpar; fb_info.flags = FBINFO_DEFAULT; + fb_info.device = &pdev->dev; if (!fb_find_mode(&fb_info.var, &fb_info, mode_option, ami_modedb, NUM_TOTAL_MODES, &ami_modedb[defmode], 4)) { @@ -2451,18 +2443,18 @@ default_chipset: return 0; amifb_error: - amifb_deinit(); + amifb_deinit(pdev); return err; } -static void amifb_deinit(void) +static void amifb_deinit(struct platform_device *pdev) { if (fb_info.cmap.len) fb_dealloc_cmap(&fb_info.cmap); + fb_dealloc_cmap(&fb_info.cmap); chipfree(); if (videomemory) iounmap((void*)videomemory); - release_mem_region(CUSTOM_PHYSADDR+0xe0, 0x120); custom.dmacon = DMAF_ALL | DMAF_MASTER; } @@ -3794,14 +3786,35 @@ static void ami_rebuild_copper(void) } } -static void __exit amifb_exit(void) +static int __exit amifb_remove(struct platform_device *pdev) { unregister_framebuffer(&fb_info); - amifb_deinit(); + amifb_deinit(pdev); amifb_video_off(); + return 0; +} + +static struct platform_driver amifb_driver = { + .remove = __exit_p(amifb_remove), + .driver = { + .name = "amiga-video", + .owner = THIS_MODULE, + }, +}; + +static int __init amifb_init(void) +{ + return platform_driver_probe(&amifb_driver, amifb_probe); } module_init(amifb_init); + +static void __exit amifb_exit(void) +{ + platform_driver_unregister(&amifb_driver); +} + module_exit(amifb_exit); MODULE_LICENSE("GPL"); +MODULE_ALIAS("platform:amiga-video"); diff --git a/drivers/video/bfin-t350mcqb-fb.c b/drivers/video/bfin-t350mcqb-fb.c index 44e49c2..c2ec3dcd 100644 --- a/drivers/video/bfin-t350mcqb-fb.c +++ b/drivers/video/bfin-t350mcqb-fb.c @@ -488,9 +488,9 @@ static int __devinit bfin_t350mcqb_probe(struct platform_device *pdev) fbinfo->fbops = &bfin_t350mcqb_fb_ops; fbinfo->flags = FBINFO_FLAG_DEFAULT; - info->fb_buffer = - dma_alloc_coherent(NULL, fbinfo->fix.smem_len, &info->dma_handle, - GFP_KERNEL); + info->fb_buffer = dma_alloc_coherent(NULL, fbinfo->fix.smem_len + + ACTIVE_VIDEO_MEM_OFFSET, + &info->dma_handle, GFP_KERNEL); if (NULL == info->fb_buffer) { printk(KERN_ERR DRIVER_NAME @@ -568,8 +568,8 @@ out7: out6: fb_dealloc_cmap(&fbinfo->cmap); out4: - dma_free_coherent(NULL, fbinfo->fix.smem_len, info->fb_buffer, - info->dma_handle); + dma_free_coherent(NULL, fbinfo->fix.smem_len + ACTIVE_VIDEO_MEM_OFFSET, + info->fb_buffer, info->dma_handle); out3: framebuffer_release(fbinfo); out2: @@ -592,8 +592,9 @@ static int __devexit bfin_t350mcqb_remove(struct platform_device *pdev) free_irq(info->irq, info); if (info->fb_buffer != NULL) - dma_free_coherent(NULL, fbinfo->fix.smem_len, info->fb_buffer, - info->dma_handle); + dma_free_coherent(NULL, fbinfo->fix.smem_len + + ACTIVE_VIDEO_MEM_OFFSET, info->fb_buffer, + info->dma_handle); fb_dealloc_cmap(&fbinfo->cmap); diff --git a/drivers/video/cirrusfb.c b/drivers/video/cirrusfb.c index 8d8dfda..6df7c54 100644 --- a/drivers/video/cirrusfb.c +++ b/drivers/video/cirrusfb.c @@ -299,6 +299,7 @@ static const struct zorro_device_id cirrusfb_zorro_table[] = { }, { 0 } }; +MODULE_DEVICE_TABLE(zorro, cirrusfb_zorro_table); static const struct { zorro_id id2; diff --git a/drivers/video/fm2fb.c b/drivers/video/fm2fb.c index 6c91c61..1b0feb8 100644 --- a/drivers/video/fm2fb.c +++ b/drivers/video/fm2fb.c @@ -219,6 +219,7 @@ static struct zorro_device_id fm2fb_devices[] __devinitdata = { { ZORRO_PROD_HELFRICH_RAINBOW_II }, { 0 } }; +MODULE_DEVICE_TABLE(zorro, fm2fb_devices); static struct zorro_driver fm2fb_driver = { .name = "fm2fb", diff --git a/drivers/video/sh_mobile_lcdcfb.c b/drivers/video/sh_mobile_lcdcfb.c index e14bd07..e8c7699 100644 --- a/drivers/video/sh_mobile_lcdcfb.c +++ b/drivers/video/sh_mobile_lcdcfb.c @@ -695,6 +695,7 @@ static int sh_mobile_lcdc_setup_clocks(struct platform_device *pdev, * 1) Enable Runtime PM * 2) Force Runtime PM Resume since hardware is accessed from probe() */ + priv->dev = &pdev->dev; pm_runtime_enable(priv->dev); pm_runtime_resume(priv->dev); return 0; @@ -957,25 +958,24 @@ static int __devinit sh_mobile_lcdc_probe(struct platform_device *pdev) if (!pdev->dev.platform_data) { dev_err(&pdev->dev, "no platform data defined\n"); - error = -EINVAL; - goto err0; + return -EINVAL; } res = platform_get_resource(pdev, IORESOURCE_MEM, 0); i = platform_get_irq(pdev, 0); if (!res || i < 0) { dev_err(&pdev->dev, "cannot get platform resources\n"); - error = -ENOENT; - goto err0; + return -ENOENT; } priv = kzalloc(sizeof(*priv), GFP_KERNEL); if (!priv) { dev_err(&pdev->dev, "cannot allocate device data\n"); - error = -ENOMEM; - goto err0; + return -ENOMEM; } + platform_set_drvdata(pdev, priv); + error = request_irq(i, sh_mobile_lcdc_irq, IRQF_DISABLED, dev_name(&pdev->dev), priv); if (error) { @@ -984,8 +984,6 @@ static int __devinit sh_mobile_lcdc_probe(struct platform_device *pdev) } priv->irq = i; - priv->dev = &pdev->dev; - platform_set_drvdata(pdev, priv); pdata = pdev->dev.platform_data; j = 0; @@ -1099,9 +1097,9 @@ static int __devinit sh_mobile_lcdc_probe(struct platform_device *pdev) info = ch->info; if (info->fbdefio) { - priv->ch->sglist = vmalloc(sizeof(struct scatterlist) * + ch->sglist = vmalloc(sizeof(struct scatterlist) * info->fix.smem_len >> PAGE_SHIFT); - if (!priv->ch->sglist) { + if (!ch->sglist) { dev_err(&pdev->dev, "cannot allocate sglist\n"); goto err1; } @@ -1126,9 +1124,9 @@ static int __devinit sh_mobile_lcdc_probe(struct platform_device *pdev) } return 0; - err1: +err1: sh_mobile_lcdc_remove(pdev); - err0: + return error; } @@ -1139,7 +1137,7 @@ static int sh_mobile_lcdc_remove(struct platform_device *pdev) int i; for (i = 0; i < ARRAY_SIZE(priv->ch); i++) - if (priv->ch[i].info->dev) + if (priv->ch[i].info && priv->ch[i].info->dev) unregister_framebuffer(priv->ch[i].info); sh_mobile_lcdc_stop(priv); @@ -1162,7 +1160,8 @@ static int sh_mobile_lcdc_remove(struct platform_device *pdev) if (priv->dot_clk) clk_put(priv->dot_clk); - pm_runtime_disable(priv->dev); + if (priv->dev) + pm_runtime_disable(priv->dev); if (priv->base) iounmap(priv->base); diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig index 0bf5020..b87ba23 100644 --- a/drivers/watchdog/Kconfig +++ b/drivers/watchdog/Kconfig @@ -175,7 +175,7 @@ config SA1100_WATCHDOG config MPCORE_WATCHDOG tristate "MPcore watchdog" - depends on ARM_MPCORE_PLATFORM && LOCAL_TIMERS + depends on HAVE_ARM_TWD help Watchdog timer embedded into the MPcore system. diff --git a/drivers/watchdog/mpcore_wdt.c b/drivers/watchdog/mpcore_wdt.c index 016c6a7..b8ec7ac 100644 --- a/drivers/watchdog/mpcore_wdt.c +++ b/drivers/watchdog/mpcore_wdt.c @@ -31,8 +31,9 @@ #include <linux/platform_device.h> #include <linux/uaccess.h> #include <linux/slab.h> +#include <linux/io.h> -#include <asm/hardware/arm_twd.h> +#include <asm/smp_twd.h> struct mpcore_wdt { unsigned long timer_alive; @@ -44,7 +45,7 @@ struct mpcore_wdt { }; static struct platform_device *mpcore_wdt_dev; -extern unsigned int mpcore_timer_rate; +static DEFINE_SPINLOCK(wdt_lock); #define TIMER_MARGIN 60 static int mpcore_margin = TIMER_MARGIN; @@ -94,13 +95,15 @@ static irqreturn_t mpcore_wdt_fire(int irq, void *arg) */ static void mpcore_wdt_keepalive(struct mpcore_wdt *wdt) { - unsigned int count; + unsigned long count; + spin_lock(&wdt_lock); /* Assume prescale is set to 256 */ - count = (mpcore_timer_rate / 256) * mpcore_margin; + count = __raw_readl(wdt->base + TWD_WDOG_COUNTER); + count = (0xFFFFFFFFU - count) * (HZ / 5); + count = (count / 256) * mpcore_margin; /* Reload the counter */ - spin_lock(&wdt_lock); writel(count + wdt->perturb, wdt->base + TWD_WDOG_LOAD); wdt->perturb = wdt->perturb ? 0 : 1; spin_unlock(&wdt_lock); @@ -119,7 +122,6 @@ static void mpcore_wdt_start(struct mpcore_wdt *wdt) { dev_printk(KERN_INFO, wdt->dev, "enabling watchdog.\n"); - spin_lock(&wdt_lock); /* This loads the count register but does NOT start the count yet */ mpcore_wdt_keepalive(wdt); @@ -130,7 +132,6 @@ static void mpcore_wdt_start(struct mpcore_wdt *wdt) /* Enable watchdog - prescale=256, watchdog mode=1, enable=1 */ writel(0x0000FF09, wdt->base + TWD_WDOG_CONTROL); } - spin_unlock(&wdt_lock); } static int mpcore_wdt_set_heartbeat(int t) @@ -360,7 +361,7 @@ static int __devinit mpcore_wdt_probe(struct platform_device *dev) mpcore_wdt_miscdev.parent = &dev->dev; ret = misc_register(&mpcore_wdt_miscdev); if (ret) { - dev_printk(KERN_ERR, _dev, + dev_printk(KERN_ERR, wdt->dev, "cannot register miscdev on minor=%d (err=%d)\n", WATCHDOG_MINOR, ret); goto err_misc; @@ -369,13 +370,13 @@ static int __devinit mpcore_wdt_probe(struct platform_device *dev) ret = request_irq(wdt->irq, mpcore_wdt_fire, IRQF_DISABLED, "mpcore_wdt", wdt); if (ret) { - dev_printk(KERN_ERR, _dev, + dev_printk(KERN_ERR, wdt->dev, "cannot register IRQ%d for watchdog\n", wdt->irq); goto err_irq; } mpcore_wdt_stop(wdt); - platform_set_drvdata(&dev->dev, wdt); + platform_set_drvdata(dev, wdt); mpcore_wdt_dev = dev; return 0; diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c index 2ac4440..8943b8c 100644 --- a/drivers/xen/manage.c +++ b/drivers/xen/manage.c @@ -80,12 +80,6 @@ static void do_suspend(void) shutting_down = SHUTDOWN_SUSPEND; - err = stop_machine_create(); - if (err) { - printk(KERN_ERR "xen suspend: failed to setup stop_machine %d\n", err); - goto out; - } - #ifdef CONFIG_PREEMPT /* If the kernel is preemptible, we need to freeze all the processes to prevent them from being in the middle of a pagetable update @@ -93,7 +87,7 @@ static void do_suspend(void) err = freeze_processes(); if (err) { printk(KERN_ERR "xen suspend: freeze failed %d\n", err); - goto out_destroy_sm; + goto out; } #endif @@ -136,12 +130,8 @@ out_resume: out_thaw: #ifdef CONFIG_PREEMPT thaw_processes(); - -out_destroy_sm: -#endif - stop_machine_destroy(); - out: +#endif shutting_down = SHUTDOWN_INVALID; } #endif /* CONFIG_PM_SLEEP */ diff --git a/drivers/zorro/proc.c b/drivers/zorro/proc.c index d47c47f..3c7046d 100644 --- a/drivers/zorro/proc.c +++ b/drivers/zorro/proc.c @@ -97,7 +97,7 @@ static void zorro_seq_stop(struct seq_file *m, void *v) static int zorro_seq_show(struct seq_file *m, void *v) { - u_int slot = *(loff_t *)v; + unsigned int slot = *(loff_t *)v; struct zorro_dev *z = &zorro_autocon[slot]; seq_printf(m, "%02x\t%08x\t%08lx\t%08lx\t%02x\n", slot, z->id, @@ -129,7 +129,7 @@ static const struct file_operations zorro_devices_proc_fops = { static struct proc_dir_entry *proc_bus_zorro_dir; -static int __init zorro_proc_attach_device(u_int slot) +static int __init zorro_proc_attach_device(unsigned int slot) { struct proc_dir_entry *entry; char name[4]; @@ -146,7 +146,7 @@ static int __init zorro_proc_attach_device(u_int slot) static int __init zorro_proc_init(void) { - u_int slot; + unsigned int slot; if (MACH_IS_AMIGA && AMIGAHW_PRESENT(ZORRO)) { proc_bus_zorro_dir = proc_mkdir("bus/zorro", NULL); diff --git a/drivers/zorro/zorro-driver.c b/drivers/zorro/zorro-driver.c index 53180a3..7ee2b6e 100644 --- a/drivers/zorro/zorro-driver.c +++ b/drivers/zorro/zorro-driver.c @@ -137,10 +137,34 @@ static int zorro_bus_match(struct device *dev, struct device_driver *drv) return 0; } +static int zorro_uevent(struct device *dev, struct kobj_uevent_env *env) +{ +#ifdef CONFIG_HOTPLUG + struct zorro_dev *z; + + if (!dev) + return -ENODEV; + + z = to_zorro_dev(dev); + if (!z) + return -ENODEV; + + if (add_uevent_var(env, "ZORRO_ID=%08X", z->id) || + add_uevent_var(env, "ZORRO_SLOT_NAME=%s", dev_name(dev)) || + add_uevent_var(env, "ZORRO_SLOT_ADDR=%04X", z->slotaddr) || + add_uevent_var(env, "MODALIAS=" ZORRO_DEVICE_MODALIAS_FMT, z->id)) + return -ENOMEM; + + return 0; +#else /* !CONFIG_HOTPLUG */ + return -ENODEV; +#endif /* !CONFIG_HOTPLUG */ +} struct bus_type zorro_bus_type = { .name = "zorro", .match = zorro_bus_match, + .uevent = zorro_uevent, .probe = zorro_device_probe, .remove = zorro_device_remove, }; diff --git a/drivers/zorro/zorro-sysfs.c b/drivers/zorro/zorro-sysfs.c index 1d2a772e..eb924e0 100644 --- a/drivers/zorro/zorro-sysfs.c +++ b/drivers/zorro/zorro-sysfs.c @@ -77,6 +77,16 @@ static struct bin_attribute zorro_config_attr = { .read = zorro_read_config, }; +static ssize_t modalias_show(struct device *dev, struct device_attribute *attr, + char *buf) +{ + struct zorro_dev *z = to_zorro_dev(dev); + + return sprintf(buf, ZORRO_DEVICE_MODALIAS_FMT "\n", z->id); +} + +static DEVICE_ATTR(modalias, S_IRUGO, modalias_show, NULL); + int zorro_create_sysfs_dev_files(struct zorro_dev *z) { struct device *dev = &z->dev; @@ -89,6 +99,7 @@ int zorro_create_sysfs_dev_files(struct zorro_dev *z) (error = device_create_file(dev, &dev_attr_slotaddr)) || (error = device_create_file(dev, &dev_attr_slotsize)) || (error = device_create_file(dev, &dev_attr_resource)) || + (error = device_create_file(dev, &dev_attr_modalias)) || (error = sysfs_create_bin_file(&dev->kobj, &zorro_config_attr))) return error; diff --git a/drivers/zorro/zorro.c b/drivers/zorro/zorro.c index d45fb34..6455f3a 100644 --- a/drivers/zorro/zorro.c +++ b/drivers/zorro/zorro.c @@ -15,6 +15,8 @@ #include <linux/zorro.h> #include <linux/bitops.h> #include <linux/string.h> +#include <linux/platform_device.h> +#include <linux/slab.h> #include <asm/setup.h> #include <asm/amigahw.h> @@ -26,24 +28,17 @@ * Zorro Expansion Devices */ -u_int zorro_num_autocon = 0; +unsigned int zorro_num_autocon; struct zorro_dev zorro_autocon[ZORRO_NUM_AUTO]; /* - * Single Zorro bus + * Zorro bus */ -struct zorro_bus zorro_bus = {\ - .resources = { - /* Zorro II regions (on Zorro II/III) */ - { .name = "Zorro II exp", .start = 0x00e80000, .end = 0x00efffff }, - { .name = "Zorro II mem", .start = 0x00200000, .end = 0x009fffff }, - /* Zorro III regions (on Zorro III only) */ - { .name = "Zorro III exp", .start = 0xff000000, .end = 0xffffffff }, - { .name = "Zorro III cfg", .start = 0x40000000, .end = 0x7fffffff } - }, - .name = "Zorro bus" +struct zorro_bus { + struct list_head devices; /* list of devices on this bus */ + struct device dev; }; @@ -53,18 +48,19 @@ struct zorro_bus zorro_bus = {\ struct zorro_dev *zorro_find_device(zorro_id id, struct zorro_dev *from) { - struct zorro_dev *z; + struct zorro_dev *z; - if (!MACH_IS_AMIGA || !AMIGAHW_PRESENT(ZORRO)) - return NULL; + if (!zorro_num_autocon) + return NULL; - for (z = from ? from+1 : &zorro_autocon[0]; - z < zorro_autocon+zorro_num_autocon; - z++) - if (id == ZORRO_WILDCARD || id == z->id) - return z; - return NULL; + for (z = from ? from+1 : &zorro_autocon[0]; + z < zorro_autocon+zorro_num_autocon; + z++) + if (id == ZORRO_WILDCARD || id == z->id) + return z; + return NULL; } +EXPORT_SYMBOL(zorro_find_device); /* @@ -83,121 +79,138 @@ struct zorro_dev *zorro_find_device(zorro_id id, struct zorro_dev *from) */ DECLARE_BITMAP(zorro_unused_z2ram, 128); +EXPORT_SYMBOL(zorro_unused_z2ram); static void __init mark_region(unsigned long start, unsigned long end, int flag) { - if (flag) - start += Z2RAM_CHUNKMASK; - else - end += Z2RAM_CHUNKMASK; - start &= ~Z2RAM_CHUNKMASK; - end &= ~Z2RAM_CHUNKMASK; - - if (end <= Z2RAM_START || start >= Z2RAM_END) - return; - start = start < Z2RAM_START ? 0x00000000 : start-Z2RAM_START; - end = end > Z2RAM_END ? Z2RAM_SIZE : end-Z2RAM_START; - while (start < end) { - u32 chunk = start>>Z2RAM_CHUNKSHIFT; if (flag) - set_bit(chunk, zorro_unused_z2ram); + start += Z2RAM_CHUNKMASK; else - clear_bit(chunk, zorro_unused_z2ram); - start += Z2RAM_CHUNKSIZE; - } + end += Z2RAM_CHUNKMASK; + start &= ~Z2RAM_CHUNKMASK; + end &= ~Z2RAM_CHUNKMASK; + + if (end <= Z2RAM_START || start >= Z2RAM_END) + return; + start = start < Z2RAM_START ? 0x00000000 : start-Z2RAM_START; + end = end > Z2RAM_END ? Z2RAM_SIZE : end-Z2RAM_START; + while (start < end) { + u32 chunk = start>>Z2RAM_CHUNKSHIFT; + if (flag) + set_bit(chunk, zorro_unused_z2ram); + else + clear_bit(chunk, zorro_unused_z2ram); + start += Z2RAM_CHUNKSIZE; + } } -static struct resource __init *zorro_find_parent_resource(struct zorro_dev *z) +static struct resource __init *zorro_find_parent_resource( + struct platform_device *bridge, struct zorro_dev *z) { - int i; + int i; - for (i = 0; i < zorro_bus.num_resources; i++) - if (zorro_resource_start(z) >= zorro_bus.resources[i].start && - zorro_resource_end(z) <= zorro_bus.resources[i].end) - return &zorro_bus.resources[i]; - return &iomem_resource; + for (i = 0; i < bridge->num_resources; i++) { + struct resource *r = &bridge->resource[i]; + if (zorro_resource_start(z) >= r->start && + zorro_resource_end(z) <= r->end) + return r; + } + return &iomem_resource; } - /* - * Initialization - */ -static int __init zorro_init(void) +static int __init amiga_zorro_probe(struct platform_device *pdev) { - struct zorro_dev *z; - unsigned int i; - int error; - - if (!MACH_IS_AMIGA || !AMIGAHW_PRESENT(ZORRO)) - return 0; - - pr_info("Zorro: Probing AutoConfig expansion devices: %d device%s\n", - zorro_num_autocon, zorro_num_autocon == 1 ? "" : "s"); - - /* Initialize the Zorro bus */ - INIT_LIST_HEAD(&zorro_bus.devices); - dev_set_name(&zorro_bus.dev, "zorro"); - error = device_register(&zorro_bus.dev); - if (error) { - pr_err("Zorro: Error registering zorro_bus\n"); - return error; - } - - /* Request the resources */ - zorro_bus.num_resources = AMIGAHW_PRESENT(ZORRO3) ? 4 : 2; - for (i = 0; i < zorro_bus.num_resources; i++) - request_resource(&iomem_resource, &zorro_bus.resources[i]); - - /* Register all devices */ - for (i = 0; i < zorro_num_autocon; i++) { - z = &zorro_autocon[i]; - z->id = (z->rom.er_Manufacturer<<16) | (z->rom.er_Product<<8); - if (z->id == ZORRO_PROD_GVP_EPC_BASE) { - /* GVP quirk */ - unsigned long magic = zorro_resource_start(z)+0x8000; - z->id |= *(u16 *)ZTWO_VADDR(magic) & GVP_PRODMASK; - } - sprintf(z->name, "Zorro device %08x", z->id); - zorro_name_device(z); - z->resource.name = z->name; - if (request_resource(zorro_find_parent_resource(z), &z->resource)) - pr_err("Zorro: Address space collision on device %s %pR\n", - z->name, &z->resource); - dev_set_name(&z->dev, "%02x", i); - z->dev.parent = &zorro_bus.dev; - z->dev.bus = &zorro_bus_type; - error = device_register(&z->dev); + struct zorro_bus *bus; + struct zorro_dev *z; + struct resource *r; + unsigned int i; + int error; + + /* Initialize the Zorro bus */ + bus = kzalloc(sizeof(*bus), GFP_KERNEL); + if (!bus) + return -ENOMEM; + + INIT_LIST_HEAD(&bus->devices); + bus->dev.parent = &pdev->dev; + dev_set_name(&bus->dev, "zorro"); + error = device_register(&bus->dev); if (error) { - pr_err("Zorro: Error registering device %s\n", z->name); - continue; + pr_err("Zorro: Error registering zorro_bus\n"); + kfree(bus); + return error; } - error = zorro_create_sysfs_dev_files(z); - if (error) - dev_err(&z->dev, "Error creating sysfs files\n"); - } - - /* Mark all available Zorro II memory */ - zorro_for_each_dev(z) { - if (z->rom.er_Type & ERTF_MEMLIST) - mark_region(zorro_resource_start(z), zorro_resource_end(z)+1, 1); - } - - /* Unmark all used Zorro II memory */ - for (i = 0; i < m68k_num_memory; i++) - if (m68k_memory[i].addr < 16*1024*1024) - mark_region(m68k_memory[i].addr, - m68k_memory[i].addr+m68k_memory[i].size, 0); - - return 0; + platform_set_drvdata(pdev, bus); + + /* Register all devices */ + pr_info("Zorro: Probing AutoConfig expansion devices: %u device%s\n", + zorro_num_autocon, zorro_num_autocon == 1 ? "" : "s"); + + for (i = 0; i < zorro_num_autocon; i++) { + z = &zorro_autocon[i]; + z->id = (z->rom.er_Manufacturer<<16) | (z->rom.er_Product<<8); + if (z->id == ZORRO_PROD_GVP_EPC_BASE) { + /* GVP quirk */ + unsigned long magic = zorro_resource_start(z)+0x8000; + z->id |= *(u16 *)ZTWO_VADDR(magic) & GVP_PRODMASK; + } + sprintf(z->name, "Zorro device %08x", z->id); + zorro_name_device(z); + z->resource.name = z->name; + r = zorro_find_parent_resource(pdev, z); + error = request_resource(r, &z->resource); + if (error) + dev_err(&bus->dev, + "Address space collision on device %s %pR\n", + z->name, &z->resource); + dev_set_name(&z->dev, "%02x", i); + z->dev.parent = &bus->dev; + z->dev.bus = &zorro_bus_type; + error = device_register(&z->dev); + if (error) { + dev_err(&bus->dev, "Error registering device %s\n", + z->name); + continue; + } + error = zorro_create_sysfs_dev_files(z); + if (error) + dev_err(&z->dev, "Error creating sysfs files\n"); + } + + /* Mark all available Zorro II memory */ + zorro_for_each_dev(z) { + if (z->rom.er_Type & ERTF_MEMLIST) + mark_region(zorro_resource_start(z), + zorro_resource_end(z)+1, 1); + } + + /* Unmark all used Zorro II memory */ + for (i = 0; i < m68k_num_memory; i++) + if (m68k_memory[i].addr < 16*1024*1024) + mark_region(m68k_memory[i].addr, + m68k_memory[i].addr+m68k_memory[i].size, + 0); + + return 0; } -subsys_initcall(zorro_init); +static struct platform_driver amiga_zorro_driver = { + .driver = { + .name = "amiga-zorro", + .owner = THIS_MODULE, + }, +}; -EXPORT_SYMBOL(zorro_find_device); -EXPORT_SYMBOL(zorro_unused_z2ram); +static int __init amiga_zorro_init(void) +{ + return platform_driver_probe(&amiga_zorro_driver, amiga_zorro_probe); +} + +module_init(amiga_zorro_init); MODULE_LICENSE("GPL"); diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c index 109a6c6..e8e5e63 100644 --- a/fs/autofs4/root.c +++ b/fs/autofs4/root.c @@ -177,8 +177,7 @@ static int try_to_fill_dentry(struct dentry *dentry, int flags) } /* Trigger mount for path component or follow link */ } else if (ino->flags & AUTOFS_INF_PENDING || - autofs4_need_mount(flags) || - current->link_count) { + autofs4_need_mount(flags)) { DPRINTK("waiting for mount name=%.*s", dentry->d_name.len, dentry->d_name.name); @@ -262,7 +261,7 @@ static void *autofs4_follow_link(struct dentry *dentry, struct nameidata *nd) spin_unlock(&dcache_lock); spin_unlock(&sbi->fs_lock); - status = try_to_fill_dentry(dentry, 0); + status = try_to_fill_dentry(dentry, nd->flags); if (status) goto out_error; diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index e84ef60..97a9783 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -1481,12 +1481,17 @@ static noinline long btrfs_ioctl_clone(struct file *file, unsigned long srcfd, ret = -EBADF; goto out_drop_write; } + src = src_file->f_dentry->d_inode; ret = -EINVAL; if (src == inode) goto out_fput; + /* the src must be open for reading */ + if (!(src_file->f_mode & FMODE_READ)) + goto out_fput; + ret = -EISDIR; if (S_ISDIR(src->i_mode) || S_ISDIR(inode->i_mode)) goto out_fput; diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h index f7c255f..a8cd821 100644 --- a/fs/cachefiles/internal.h +++ b/fs/cachefiles/internal.h @@ -34,6 +34,7 @@ struct cachefiles_object { loff_t i_size; /* object size */ unsigned long flags; #define CACHEFILES_OBJECT_ACTIVE 0 /* T if marked active */ +#define CACHEFILES_OBJECT_BURIED 1 /* T if preemptively buried */ atomic_t usage; /* object usage count */ uint8_t type; /* object type */ uint8_t new; /* T if object new */ diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c index d5db84a1e..f4a7840 100644 --- a/fs/cachefiles/namei.c +++ b/fs/cachefiles/namei.c @@ -93,6 +93,59 @@ static noinline void cachefiles_printk_object(struct cachefiles_object *object, } /* + * mark the owner of a dentry, if there is one, to indicate that that dentry + * has been preemptively deleted + * - the caller must hold the i_mutex on the dentry's parent as required to + * call vfs_unlink(), vfs_rmdir() or vfs_rename() + */ +static void cachefiles_mark_object_buried(struct cachefiles_cache *cache, + struct dentry *dentry) +{ + struct cachefiles_object *object; + struct rb_node *p; + + _enter(",'%*.*s'", + dentry->d_name.len, dentry->d_name.len, dentry->d_name.name); + + write_lock(&cache->active_lock); + + p = cache->active_nodes.rb_node; + while (p) { + object = rb_entry(p, struct cachefiles_object, active_node); + if (object->dentry > dentry) + p = p->rb_left; + else if (object->dentry < dentry) + p = p->rb_right; + else + goto found_dentry; + } + + write_unlock(&cache->active_lock); + _leave(" [no owner]"); + return; + + /* found the dentry for */ +found_dentry: + kdebug("preemptive burial: OBJ%x [%s] %p", + object->fscache.debug_id, + fscache_object_states[object->fscache.state], + dentry); + + if (object->fscache.state < FSCACHE_OBJECT_DYING) { + printk(KERN_ERR "\n"); + printk(KERN_ERR "CacheFiles: Error:" + " Can't preemptively bury live object\n"); + cachefiles_printk_object(object, NULL); + } else if (test_and_set_bit(CACHEFILES_OBJECT_BURIED, &object->flags)) { + printk(KERN_ERR "CacheFiles: Error:" + " Object already preemptively buried\n"); + } + + write_unlock(&cache->active_lock); + _leave(" [owner marked]"); +} + +/* * record the fact that an object is now active */ static int cachefiles_mark_object_active(struct cachefiles_cache *cache, @@ -219,7 +272,8 @@ requeue: */ static int cachefiles_bury_object(struct cachefiles_cache *cache, struct dentry *dir, - struct dentry *rep) + struct dentry *rep, + bool preemptive) { struct dentry *grave, *trap; char nbuffer[8 + 8 + 1]; @@ -229,11 +283,16 @@ static int cachefiles_bury_object(struct cachefiles_cache *cache, dir->d_name.len, dir->d_name.len, dir->d_name.name, rep->d_name.len, rep->d_name.len, rep->d_name.name); + _debug("remove %p from %p", rep, dir); + /* non-directories can just be unlinked */ if (!S_ISDIR(rep->d_inode->i_mode)) { _debug("unlink stale object"); ret = vfs_unlink(dir->d_inode, rep); + if (preemptive) + cachefiles_mark_object_buried(cache, rep); + mutex_unlock(&dir->d_inode->i_mutex); if (ret == -EIO) @@ -325,6 +384,9 @@ try_again: if (ret != 0 && ret != -ENOMEM) cachefiles_io_error(cache, "Rename failed with error %d", ret); + if (preemptive) + cachefiles_mark_object_buried(cache, rep); + unlock_rename(cache->graveyard, dir); dput(grave); _leave(" = 0"); @@ -340,7 +402,7 @@ int cachefiles_delete_object(struct cachefiles_cache *cache, struct dentry *dir; int ret; - _enter(",{%p}", object->dentry); + _enter(",OBJ%x{%p}", object->fscache.debug_id, object->dentry); ASSERT(object->dentry); ASSERT(object->dentry->d_inode); @@ -350,15 +412,25 @@ int cachefiles_delete_object(struct cachefiles_cache *cache, mutex_lock_nested(&dir->d_inode->i_mutex, I_MUTEX_PARENT); - /* we need to check that our parent is _still_ our parent - it may have - * been renamed */ - if (dir == object->dentry->d_parent) { - ret = cachefiles_bury_object(cache, dir, object->dentry); - } else { - /* it got moved, presumably by cachefilesd culling it, so it's - * no longer in the key path and we can ignore it */ + if (test_bit(CACHEFILES_OBJECT_BURIED, &object->flags)) { + /* object allocation for the same key preemptively deleted this + * object's file so that it could create its own file */ + _debug("object preemptively buried"); mutex_unlock(&dir->d_inode->i_mutex); ret = 0; + } else { + /* we need to check that our parent is _still_ our parent - it + * may have been renamed */ + if (dir == object->dentry->d_parent) { + ret = cachefiles_bury_object(cache, dir, + object->dentry, false); + } else { + /* it got moved, presumably by cachefilesd culling it, + * so it's no longer in the key path and we can ignore + * it */ + mutex_unlock(&dir->d_inode->i_mutex); + ret = 0; + } } dput(dir); @@ -381,7 +453,9 @@ int cachefiles_walk_to_object(struct cachefiles_object *parent, const char *name; int ret, nlen; - _enter("{%p},,%s,", parent->dentry, key); + _enter("OBJ%x{%p},OBJ%x,%s,", + parent->fscache.debug_id, parent->dentry, + object->fscache.debug_id, key); cache = container_of(parent->fscache.cache, struct cachefiles_cache, cache); @@ -509,7 +583,7 @@ lookup_again: * mutex) */ object->dentry = NULL; - ret = cachefiles_bury_object(cache, dir, next); + ret = cachefiles_bury_object(cache, dir, next, true); dput(next); next = NULL; @@ -828,7 +902,7 @@ int cachefiles_cull(struct cachefiles_cache *cache, struct dentry *dir, /* actually remove the victim (drops the dir mutex) */ _debug("bury"); - ret = cachefiles_bury_object(cache, dir, victim); + ret = cachefiles_bury_object(cache, dir, victim, false); if (ret < 0) goto error; diff --git a/fs/cachefiles/security.c b/fs/cachefiles/security.c index b5808cd..039b501 100644 --- a/fs/cachefiles/security.c +++ b/fs/cachefiles/security.c @@ -77,6 +77,8 @@ static int cachefiles_check_cache_dir(struct cachefiles_cache *cache, /* * check the security details of the on-disk cache * - must be called with security override in force + * - must return with a security override in force - even in the case of an + * error */ int cachefiles_determine_cache_security(struct cachefiles_cache *cache, struct dentry *root, @@ -99,6 +101,8 @@ int cachefiles_determine_cache_security(struct cachefiles_cache *cache, * which create files */ ret = set_create_files_as(new, root->d_inode); if (ret < 0) { + abort_creds(new); + cachefiles_begin_secure(cache, _saved_cred); _leave(" = %d [cfa]", ret); return ret; } diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c index 4b42c2b..a9005d8 100644 --- a/fs/ceph/addr.c +++ b/fs/ceph/addr.c @@ -504,7 +504,6 @@ static void writepages_finish(struct ceph_osd_request *req, int i; struct ceph_snap_context *snapc = req->r_snapc; struct address_space *mapping = inode->i_mapping; - struct writeback_control *wbc = req->r_wbc; __s32 rc = -EIO; u64 bytes = 0; struct ceph_client *client = ceph_inode_to_client(inode); @@ -546,10 +545,6 @@ static void writepages_finish(struct ceph_osd_request *req, clear_bdi_congested(&client->backing_dev_info, BLK_RW_ASYNC); - if (i >= wrote) { - dout("inode %p skipping page %p\n", inode, page); - wbc->pages_skipped++; - } ceph_put_snap_context((void *)page->private); page->private = 0; ClearPagePrivate(page); @@ -799,7 +794,6 @@ get_more_pages: alloc_page_vec(client, req); req->r_callback = writepages_finish; req->r_inode = inode; - req->r_wbc = wbc; } /* note position of first page in pvec */ diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index 0c16818..d940053 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -858,6 +858,8 @@ static int __ceph_is_any_caps(struct ceph_inode_info *ci) } /* + * Remove a cap. Take steps to deal with a racing iterate_session_caps. + * * caller should hold i_lock. * caller will not hold session s_mutex if called from destroy_inode. */ @@ -866,15 +868,10 @@ void __ceph_remove_cap(struct ceph_cap *cap) struct ceph_mds_session *session = cap->session; struct ceph_inode_info *ci = cap->ci; struct ceph_mds_client *mdsc = &ceph_client(ci->vfs_inode.i_sb)->mdsc; + int removed = 0; dout("__ceph_remove_cap %p from %p\n", cap, &ci->vfs_inode); - /* remove from inode list */ - rb_erase(&cap->ci_node, &ci->i_caps); - cap->ci = NULL; - if (ci->i_auth_cap == cap) - ci->i_auth_cap = NULL; - /* remove from session list */ spin_lock(&session->s_cap_lock); if (session->s_cap_iterator == cap) { @@ -885,10 +882,18 @@ void __ceph_remove_cap(struct ceph_cap *cap) list_del_init(&cap->session_caps); session->s_nr_caps--; cap->session = NULL; + removed = 1; } + /* protect backpointer with s_cap_lock: see iterate_session_caps */ + cap->ci = NULL; spin_unlock(&session->s_cap_lock); - if (cap->session == NULL) + /* remove from inode list */ + rb_erase(&cap->ci_node, &ci->i_caps); + if (ci->i_auth_cap == cap) + ci->i_auth_cap = NULL; + + if (removed) ceph_put_cap(cap); if (!__ceph_is_any_caps(ci) && ci->i_snap_realm) { diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index 261f3e6..85b4d2f 100644 --- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -733,6 +733,10 @@ no_change: __ceph_get_fmode(ci, cap_fmode); spin_unlock(&inode->i_lock); } + } else if (cap_fmode >= 0) { + pr_warning("mds issued no caps on %llx.%llx\n", + ceph_vinop(inode)); + __ceph_get_fmode(ci, cap_fmode); } /* update delegation info? */ diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 60a9a4a..24561a5 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -736,9 +736,10 @@ static void cleanup_cap_releases(struct ceph_mds_session *session) } /* - * Helper to safely iterate over all caps associated with a session. + * Helper to safely iterate over all caps associated with a session, with + * special care taken to handle a racing __ceph_remove_cap(). * - * caller must hold session s_mutex + * Caller must hold session s_mutex. */ static int iterate_session_caps(struct ceph_mds_session *session, int (*cb)(struct inode *, struct ceph_cap *, @@ -2136,7 +2137,7 @@ static void send_mds_reconnect(struct ceph_mds_client *mdsc, int mds) struct ceph_mds_session *session = NULL; struct ceph_msg *reply; struct rb_node *p; - int err; + int err = -ENOMEM; struct ceph_pagelist *pagelist; pr_info("reconnect to recovering mds%d\n", mds); @@ -2185,7 +2186,7 @@ static void send_mds_reconnect(struct ceph_mds_client *mdsc, int mds) goto fail; err = iterate_session_caps(session, encode_caps_cb, pagelist); if (err < 0) - goto out; + goto fail; /* * snaprealms. we provide mds with the ino, seq (version), and @@ -2213,28 +2214,31 @@ send: reply->nr_pages = calc_pages_for(0, pagelist->length); ceph_con_send(&session->s_con, reply); - if (session) { - session->s_state = CEPH_MDS_SESSION_OPEN; - __wake_requests(mdsc, &session->s_waiting); - } + session->s_state = CEPH_MDS_SESSION_OPEN; + mutex_unlock(&session->s_mutex); + + mutex_lock(&mdsc->mutex); + __wake_requests(mdsc, &session->s_waiting); + mutex_unlock(&mdsc->mutex); + + ceph_put_mds_session(session); -out: up_read(&mdsc->snap_rwsem); - if (session) { - mutex_unlock(&session->s_mutex); - ceph_put_mds_session(session); - } mutex_lock(&mdsc->mutex); return; fail: ceph_msg_put(reply); + up_read(&mdsc->snap_rwsem); + mutex_unlock(&session->s_mutex); + ceph_put_mds_session(session); fail_nomsg: ceph_pagelist_release(pagelist); kfree(pagelist); fail_nopagelist: - pr_err("ENOMEM preparing reconnect for mds%d\n", mds); - goto out; + pr_err("error %d preparing reconnect for mds%d\n", err, mds); + mutex_lock(&mdsc->mutex); + return; } diff --git a/fs/ceph/messenger.c b/fs/ceph/messenger.c index 509f57d..cd4fadb 100644 --- a/fs/ceph/messenger.c +++ b/fs/ceph/messenger.c @@ -492,7 +492,14 @@ static void prepare_write_message(struct ceph_connection *con) list_move_tail(&m->list_head, &con->out_sent); } - m->hdr.seq = cpu_to_le64(++con->out_seq); + /* + * only assign outgoing seq # if we haven't sent this message + * yet. if it is requeued, resend with it's original seq. + */ + if (m->needs_out_seq) { + m->hdr.seq = cpu_to_le64(++con->out_seq); + m->needs_out_seq = false; + } dout("prepare_write_message %p seq %lld type %d len %d+%d+%d %d pgs\n", m, con->out_seq, le16_to_cpu(m->hdr.type), @@ -1986,6 +1993,8 @@ void ceph_con_send(struct ceph_connection *con, struct ceph_msg *msg) BUG_ON(msg->front.iov_len != le32_to_cpu(msg->hdr.front_len)); + msg->needs_out_seq = true; + /* queue */ mutex_lock(&con->mutex); BUG_ON(!list_empty(&msg->list_head)); @@ -2085,15 +2094,19 @@ struct ceph_msg *ceph_msg_new(int type, int front_len, kref_init(&m->kref); INIT_LIST_HEAD(&m->list_head); + m->hdr.tid = 0; m->hdr.type = cpu_to_le16(type); + m->hdr.priority = cpu_to_le16(CEPH_MSG_PRIO_DEFAULT); + m->hdr.version = 0; m->hdr.front_len = cpu_to_le32(front_len); m->hdr.middle_len = 0; m->hdr.data_len = cpu_to_le32(page_len); m->hdr.data_off = cpu_to_le16(page_off); - m->hdr.priority = cpu_to_le16(CEPH_MSG_PRIO_DEFAULT); + m->hdr.reserved = 0; m->footer.front_crc = 0; m->footer.middle_crc = 0; m->footer.data_crc = 0; + m->footer.flags = 0; m->front_max = front_len; m->front_is_vmalloc = false; m->more_to_follow = false; diff --git a/fs/ceph/messenger.h b/fs/ceph/messenger.h index a343dae..a5caf91 100644 --- a/fs/ceph/messenger.h +++ b/fs/ceph/messenger.h @@ -86,6 +86,7 @@ struct ceph_msg { struct kref kref; bool front_is_vmalloc; bool more_to_follow; + bool needs_out_seq; int front_max; struct ceph_msgpool *pool; diff --git a/fs/ceph/osd_client.c b/fs/ceph/osd_client.c index c7b4ded..3514f71 100644 --- a/fs/ceph/osd_client.c +++ b/fs/ceph/osd_client.c @@ -565,7 +565,8 @@ static int __map_osds(struct ceph_osd_client *osdc, { struct ceph_osd_request_head *reqhead = req->r_request->front.iov_base; struct ceph_pg pgid; - int o = -1; + int acting[CEPH_PG_MAX_SIZE]; + int o = -1, num = 0; int err; dout("map_osds %p tid %lld\n", req, req->r_tid); @@ -576,10 +577,16 @@ static int __map_osds(struct ceph_osd_client *osdc, pgid = reqhead->layout.ol_pgid; req->r_pgid = pgid; - o = ceph_calc_pg_primary(osdc->osdmap, pgid); + err = ceph_calc_pg_acting(osdc->osdmap, pgid, acting); + if (err > 0) { + o = acting[0]; + num = err; + } if ((req->r_osd && req->r_osd->o_osd == o && - req->r_sent >= req->r_osd->o_incarnation) || + req->r_sent >= req->r_osd->o_incarnation && + req->r_num_pg_osds == num && + memcmp(req->r_pg_osds, acting, sizeof(acting[0])*num) == 0) || (req->r_osd == NULL && o == -1)) return 0; /* no change */ @@ -587,6 +594,10 @@ static int __map_osds(struct ceph_osd_client *osdc, req->r_tid, le32_to_cpu(pgid.pool), le16_to_cpu(pgid.ps), o, req->r_osd ? req->r_osd->o_osd : -1); + /* record full pg acting set */ + memcpy(req->r_pg_osds, acting, sizeof(acting[0]) * num); + req->r_num_pg_osds = num; + if (req->r_osd) { __cancel_request(req); list_del_init(&req->r_osd_item); @@ -612,7 +623,7 @@ static int __map_osds(struct ceph_osd_client *osdc, __remove_osd_from_lru(req->r_osd); list_add(&req->r_osd_item, &req->r_osd->o_requests); } - err = 1; /* osd changed */ + err = 1; /* osd or pg changed */ out: return err; @@ -779,16 +790,18 @@ static void handle_reply(struct ceph_osd_client *osdc, struct ceph_msg *msg, struct ceph_osd_request *req; u64 tid; int numops, object_len, flags; + s32 result; tid = le64_to_cpu(msg->hdr.tid); if (msg->front.iov_len < sizeof(*rhead)) goto bad; numops = le32_to_cpu(rhead->num_ops); object_len = le32_to_cpu(rhead->object_len); + result = le32_to_cpu(rhead->result); if (msg->front.iov_len != sizeof(*rhead) + object_len + numops * sizeof(struct ceph_osd_op)) goto bad; - dout("handle_reply %p tid %llu\n", msg, tid); + dout("handle_reply %p tid %llu result %d\n", msg, tid, (int)result); /* lookup */ mutex_lock(&osdc->request_mutex); @@ -834,7 +847,8 @@ static void handle_reply(struct ceph_osd_client *osdc, struct ceph_msg *msg, dout("handle_reply tid %llu flags %d\n", tid, flags); /* either this is a read, or we got the safe response */ - if ((flags & CEPH_OSD_FLAG_ONDISK) || + if (result < 0 || + (flags & CEPH_OSD_FLAG_ONDISK) || ((flags & CEPH_OSD_FLAG_WRITE) == 0)) __unregister_request(osdc, req); diff --git a/fs/ceph/osd_client.h b/fs/ceph/osd_client.h index b075991..ce77698 100644 --- a/fs/ceph/osd_client.h +++ b/fs/ceph/osd_client.h @@ -48,6 +48,8 @@ struct ceph_osd_request { struct list_head r_osd_item; struct ceph_osd *r_osd; struct ceph_pg r_pgid; + int r_pg_osds[CEPH_PG_MAX_SIZE]; + int r_num_pg_osds; struct ceph_connection *r_con_filling_msg; @@ -66,7 +68,6 @@ struct ceph_osd_request { struct list_head r_unsafe_item; struct inode *r_inode; /* for use by callbacks */ - struct writeback_control *r_wbc; /* ditto */ char r_oid[40]; /* object name */ int r_oid_len; diff --git a/fs/ceph/osdmap.c b/fs/ceph/osdmap.c index 2e2c15e..cfdd8f4 100644 --- a/fs/ceph/osdmap.c +++ b/fs/ceph/osdmap.c @@ -1041,12 +1041,33 @@ static int *calc_pg_raw(struct ceph_osdmap *osdmap, struct ceph_pg pgid, } /* + * Return acting set for given pgid. + */ +int ceph_calc_pg_acting(struct ceph_osdmap *osdmap, struct ceph_pg pgid, + int *acting) +{ + int rawosds[CEPH_PG_MAX_SIZE], *osds; + int i, o, num = CEPH_PG_MAX_SIZE; + + osds = calc_pg_raw(osdmap, pgid, rawosds, &num); + if (!osds) + return -1; + + /* primary is first up osd */ + o = 0; + for (i = 0; i < num; i++) + if (ceph_osd_is_up(osdmap, osds[i])) + acting[o++] = osds[i]; + return o; +} + +/* * Return primary osd for given pgid, or -1 if none. */ int ceph_calc_pg_primary(struct ceph_osdmap *osdmap, struct ceph_pg pgid) { - int rawosds[10], *osds; - int i, num = ARRAY_SIZE(rawosds); + int rawosds[CEPH_PG_MAX_SIZE], *osds; + int i, num = CEPH_PG_MAX_SIZE; osds = calc_pg_raw(osdmap, pgid, rawosds, &num); if (!osds) @@ -1054,9 +1075,7 @@ int ceph_calc_pg_primary(struct ceph_osdmap *osdmap, struct ceph_pg pgid) /* primary is first up osd */ for (i = 0; i < num; i++) - if (ceph_osd_is_up(osdmap, osds[i])) { + if (ceph_osd_is_up(osdmap, osds[i])) return osds[i]; - break; - } return -1; } diff --git a/fs/ceph/osdmap.h b/fs/ceph/osdmap.h index 8bc9f1e..970b547 100644 --- a/fs/ceph/osdmap.h +++ b/fs/ceph/osdmap.h @@ -120,6 +120,8 @@ extern int ceph_calc_object_layout(struct ceph_object_layout *ol, const char *oid, struct ceph_file_layout *fl, struct ceph_osdmap *osdmap); +extern int ceph_calc_pg_acting(struct ceph_osdmap *osdmap, struct ceph_pg pgid, + int *acting); extern int ceph_calc_pg_primary(struct ceph_osdmap *osdmap, struct ceph_pg pgid); diff --git a/fs/ceph/rados.h b/fs/ceph/rados.h index a1fc1d0..fd56451 100644 --- a/fs/ceph/rados.h +++ b/fs/ceph/rados.h @@ -58,6 +58,7 @@ struct ceph_timespec { #define CEPH_PG_LAYOUT_LINEAR 2 #define CEPH_PG_LAYOUT_HYBRID 3 +#define CEPH_PG_MAX_SIZE 16 /* max # osds in a single pg */ /* * placement group. diff --git a/fs/ceph/super.c b/fs/ceph/super.c index f888cf4..110857b 100644 --- a/fs/ceph/super.c +++ b/fs/ceph/super.c @@ -47,10 +47,20 @@ const char *ceph_file_part(const char *s, int len) */ static void ceph_put_super(struct super_block *s) { - struct ceph_client *cl = ceph_client(s); + struct ceph_client *client = ceph_sb_to_client(s); dout("put_super\n"); - ceph_mdsc_close_sessions(&cl->mdsc); + ceph_mdsc_close_sessions(&client->mdsc); + + /* + * ensure we release the bdi before put_anon_super releases + * the device name. + */ + if (s->s_bdi == &client->backing_dev_info) { + bdi_unregister(&client->backing_dev_info); + s->s_bdi = NULL; + } + return; } @@ -636,6 +646,8 @@ static void ceph_destroy_client(struct ceph_client *client) destroy_workqueue(client->pg_inv_wq); destroy_workqueue(client->trunc_wq); + bdi_destroy(&client->backing_dev_info); + if (client->msgr) ceph_messenger_destroy(client->msgr); mempool_destroy(client->wb_pagevec_pool); @@ -876,14 +888,14 @@ static int ceph_register_bdi(struct super_block *sb, struct ceph_client *client) { int err; - sb->s_bdi = &client->backing_dev_info; - /* set ra_pages based on rsize mount option? */ if (client->mount_args->rsize >= PAGE_CACHE_SIZE) client->backing_dev_info.ra_pages = (client->mount_args->rsize + PAGE_CACHE_SIZE - 1) >> PAGE_SHIFT; err = bdi_register_dev(&client->backing_dev_info, sb->s_dev); + if (!err) + sb->s_bdi = &client->backing_dev_info; return err; } @@ -957,9 +969,6 @@ static void ceph_kill_sb(struct super_block *s) dout("kill_sb %p\n", s); ceph_mdsc_pre_umount(&client->mdsc); kill_anon_super(s); /* will call put_super after sb is r/o */ - if (s->s_bdi == &client->backing_dev_info) - bdi_unregister(&client->backing_dev_info); - bdi_destroy(&client->backing_dev_info); ceph_destroy_client(client); } diff --git a/fs/cifs/asn1.c b/fs/cifs/asn1.c index a20bea5..cfd1ce3 100644 --- a/fs/cifs/asn1.c +++ b/fs/cifs/asn1.c @@ -492,17 +492,13 @@ compare_oid(unsigned long *oid1, unsigned int oid1len, int decode_negTokenInit(unsigned char *security_blob, int length, - enum securityEnum *secType) + struct TCP_Server_Info *server) { struct asn1_ctx ctx; unsigned char *end; unsigned char *sequence_end; unsigned long *oid = NULL; unsigned int cls, con, tag, oidlen, rc; - bool use_ntlmssp = false; - bool use_kerberos = false; - bool use_kerberosu2u = false; - bool use_mskerberos = false; /* cifs_dump_mem(" Received SecBlob ", security_blob, length); */ @@ -510,11 +506,11 @@ decode_negTokenInit(unsigned char *security_blob, int length, /* GSSAPI header */ if (asn1_header_decode(&ctx, &end, &cls, &con, &tag) == 0) { - cFYI(1, ("Error decoding negTokenInit header")); + cFYI(1, "Error decoding negTokenInit header"); return 0; } else if ((cls != ASN1_APL) || (con != ASN1_CON) || (tag != ASN1_EOC)) { - cFYI(1, ("cls = %d con = %d tag = %d", cls, con, tag)); + cFYI(1, "cls = %d con = %d tag = %d", cls, con, tag); return 0; } @@ -535,56 +531,52 @@ decode_negTokenInit(unsigned char *security_blob, int length, /* SPNEGO OID not present or garbled -- bail out */ if (!rc) { - cFYI(1, ("Error decoding negTokenInit header")); + cFYI(1, "Error decoding negTokenInit header"); return 0; } /* SPNEGO */ if (asn1_header_decode(&ctx, &end, &cls, &con, &tag) == 0) { - cFYI(1, ("Error decoding negTokenInit")); + cFYI(1, "Error decoding negTokenInit"); return 0; } else if ((cls != ASN1_CTX) || (con != ASN1_CON) || (tag != ASN1_EOC)) { - cFYI(1, - ("cls = %d con = %d tag = %d end = %p (%d) exit 0", - cls, con, tag, end, *end)); + cFYI(1, "cls = %d con = %d tag = %d end = %p (%d) exit 0", + cls, con, tag, end, *end); return 0; } /* negTokenInit */ if (asn1_header_decode(&ctx, &end, &cls, &con, &tag) == 0) { - cFYI(1, ("Error decoding negTokenInit")); + cFYI(1, "Error decoding negTokenInit"); return 0; } else if ((cls != ASN1_UNI) || (con != ASN1_CON) || (tag != ASN1_SEQ)) { - cFYI(1, - ("cls = %d con = %d tag = %d end = %p (%d) exit 1", - cls, con, tag, end, *end)); + cFYI(1, "cls = %d con = %d tag = %d end = %p (%d) exit 1", + cls, con, tag, end, *end); return 0; } /* sequence */ if (asn1_header_decode(&ctx, &end, &cls, &con, &tag) == 0) { - cFYI(1, ("Error decoding 2nd part of negTokenInit")); + cFYI(1, "Error decoding 2nd part of negTokenInit"); return 0; } else if ((cls != ASN1_CTX) || (con != ASN1_CON) || (tag != ASN1_EOC)) { - cFYI(1, - ("cls = %d con = %d tag = %d end = %p (%d) exit 0", - cls, con, tag, end, *end)); + cFYI(1, "cls = %d con = %d tag = %d end = %p (%d) exit 0", + cls, con, tag, end, *end); return 0; } /* sequence of */ if (asn1_header_decode (&ctx, &sequence_end, &cls, &con, &tag) == 0) { - cFYI(1, ("Error decoding 2nd part of negTokenInit")); + cFYI(1, "Error decoding 2nd part of negTokenInit"); return 0; } else if ((cls != ASN1_UNI) || (con != ASN1_CON) || (tag != ASN1_SEQ)) { - cFYI(1, - ("cls = %d con = %d tag = %d end = %p (%d) exit 1", - cls, con, tag, end, *end)); + cFYI(1, "cls = %d con = %d tag = %d end = %p (%d) exit 1", + cls, con, tag, end, *end); return 0; } @@ -592,37 +584,33 @@ decode_negTokenInit(unsigned char *security_blob, int length, while (!asn1_eoc_decode(&ctx, sequence_end)) { rc = asn1_header_decode(&ctx, &end, &cls, &con, &tag); if (!rc) { - cFYI(1, - ("Error decoding negTokenInit hdr exit2")); + cFYI(1, "Error decoding negTokenInit hdr exit2"); return 0; } if ((tag == ASN1_OJI) && (con == ASN1_PRI)) { if (asn1_oid_decode(&ctx, end, &oid, &oidlen)) { - cFYI(1, ("OID len = %d oid = 0x%lx 0x%lx " - "0x%lx 0x%lx", oidlen, *oid, - *(oid + 1), *(oid + 2), *(oid + 3))); + cFYI(1, "OID len = %d oid = 0x%lx 0x%lx " + "0x%lx 0x%lx", oidlen, *oid, + *(oid + 1), *(oid + 2), *(oid + 3)); if (compare_oid(oid, oidlen, MSKRB5_OID, - MSKRB5_OID_LEN) && - !use_mskerberos) - use_mskerberos = true; + MSKRB5_OID_LEN)) + server->sec_mskerberos = true; else if (compare_oid(oid, oidlen, KRB5U2U_OID, - KRB5U2U_OID_LEN) && - !use_kerberosu2u) - use_kerberosu2u = true; + KRB5U2U_OID_LEN)) + server->sec_kerberosu2u = true; else if (compare_oid(oid, oidlen, KRB5_OID, - KRB5_OID_LEN) && - !use_kerberos) - use_kerberos = true; + KRB5_OID_LEN)) + server->sec_kerberos = true; else if (compare_oid(oid, oidlen, NTLMSSP_OID, NTLMSSP_OID_LEN)) - use_ntlmssp = true; + server->sec_ntlmssp = true; kfree(oid); } } else { - cFYI(1, ("Should be an oid what is going on?")); + cFYI(1, "Should be an oid what is going on?"); } } @@ -632,54 +620,47 @@ decode_negTokenInit(unsigned char *security_blob, int length, no mechListMic (e.g. NTLMSSP instead of KRB5) */ if (ctx.error == ASN1_ERR_DEC_EMPTY) goto decode_negtoken_exit; - cFYI(1, ("Error decoding last part negTokenInit exit3")); + cFYI(1, "Error decoding last part negTokenInit exit3"); return 0; } else if ((cls != ASN1_CTX) || (con != ASN1_CON)) { /* tag = 3 indicating mechListMIC */ - cFYI(1, ("Exit 4 cls = %d con = %d tag = %d end = %p (%d)", - cls, con, tag, end, *end)); + cFYI(1, "Exit 4 cls = %d con = %d tag = %d end = %p (%d)", + cls, con, tag, end, *end); return 0; } /* sequence */ if (asn1_header_decode(&ctx, &end, &cls, &con, &tag) == 0) { - cFYI(1, ("Error decoding last part negTokenInit exit5")); + cFYI(1, "Error decoding last part negTokenInit exit5"); return 0; } else if ((cls != ASN1_UNI) || (con != ASN1_CON) || (tag != ASN1_SEQ)) { - cFYI(1, ("cls = %d con = %d tag = %d end = %p (%d)", - cls, con, tag, end, *end)); + cFYI(1, "cls = %d con = %d tag = %d end = %p (%d)", + cls, con, tag, end, *end); } /* sequence of */ if (asn1_header_decode(&ctx, &end, &cls, &con, &tag) == 0) { - cFYI(1, ("Error decoding last part negTokenInit exit 7")); + cFYI(1, "Error decoding last part negTokenInit exit 7"); return 0; } else if ((cls != ASN1_CTX) || (con != ASN1_CON)) { - cFYI(1, ("Exit 8 cls = %d con = %d tag = %d end = %p (%d)", - cls, con, tag, end, *end)); + cFYI(1, "Exit 8 cls = %d con = %d tag = %d end = %p (%d)", + cls, con, tag, end, *end); return 0; } /* general string */ if (asn1_header_decode(&ctx, &end, &cls, &con, &tag) == 0) { - cFYI(1, ("Error decoding last part negTokenInit exit9")); + cFYI(1, "Error decoding last part negTokenInit exit9"); return 0; } else if ((cls != ASN1_UNI) || (con != ASN1_PRI) || (tag != ASN1_GENSTR)) { - cFYI(1, ("Exit10 cls = %d con = %d tag = %d end = %p (%d)", - cls, con, tag, end, *end)); + cFYI(1, "Exit10 cls = %d con = %d tag = %d end = %p (%d)", + cls, con, tag, end, *end); return 0; } - cFYI(1, ("Need to call asn1_octets_decode() function for %s", - ctx.pointer)); /* is this UTF-8 or ASCII? */ + cFYI(1, "Need to call asn1_octets_decode() function for %s", + ctx.pointer); /* is this UTF-8 or ASCII? */ decode_negtoken_exit: - if (use_kerberos) - *secType = Kerberos; - else if (use_mskerberos) - *secType = MSKerberos; - else if (use_ntlmssp) - *secType = RawNTLMSSP; - return 1; } diff --git a/fs/cifs/cifs_debug.c b/fs/cifs/cifs_debug.c index 42cec2a..4fce6e6 100644 --- a/fs/cifs/cifs_debug.c +++ b/fs/cifs/cifs_debug.c @@ -60,10 +60,10 @@ cifs_dump_mem(char *label, void *data, int length) #ifdef CONFIG_CIFS_DEBUG2 void cifs_dump_detail(struct smb_hdr *smb) { - cERROR(1, ("Cmd: %d Err: 0x%x Flags: 0x%x Flgs2: 0x%x Mid: %d Pid: %d", + cERROR(1, "Cmd: %d Err: 0x%x Flags: 0x%x Flgs2: 0x%x Mid: %d Pid: %d", smb->Command, smb->Status.CifsError, - smb->Flags, smb->Flags2, smb->Mid, smb->Pid)); - cERROR(1, ("smb buf %p len %d", smb, smbCalcSize_LE(smb))); + smb->Flags, smb->Flags2, smb->Mid, smb->Pid); + cERROR(1, "smb buf %p len %d", smb, smbCalcSize_LE(smb)); } @@ -75,25 +75,25 @@ void cifs_dump_mids(struct TCP_Server_Info *server) if (server == NULL) return; - cERROR(1, ("Dump pending requests:")); + cERROR(1, "Dump pending requests:"); spin_lock(&GlobalMid_Lock); list_for_each(tmp, &server->pending_mid_q) { mid_entry = list_entry(tmp, struct mid_q_entry, qhead); - cERROR(1, ("State: %d Cmd: %d Pid: %d Tsk: %p Mid %d", + cERROR(1, "State: %d Cmd: %d Pid: %d Tsk: %p Mid %d", mid_entry->midState, (int)mid_entry->command, mid_entry->pid, mid_entry->tsk, - mid_entry->mid)); + mid_entry->mid); #ifdef CONFIG_CIFS_STATS2 - cERROR(1, ("IsLarge: %d buf: %p time rcv: %ld now: %ld", + cERROR(1, "IsLarge: %d buf: %p time rcv: %ld now: %ld", mid_entry->largeBuf, mid_entry->resp_buf, mid_entry->when_received, - jiffies)); + jiffies); #endif /* STATS2 */ - cERROR(1, ("IsMult: %d IsEnd: %d", mid_entry->multiRsp, - mid_entry->multiEnd)); + cERROR(1, "IsMult: %d IsEnd: %d", mid_entry->multiRsp, + mid_entry->multiEnd); if (mid_entry->resp_buf) { cifs_dump_detail(mid_entry->resp_buf); cifs_dump_mem("existing buf: ", @@ -716,7 +716,7 @@ static const struct file_operations cifs_multiuser_mount_proc_fops = { static int cifs_security_flags_proc_show(struct seq_file *m, void *v) { - seq_printf(m, "0x%x\n", extended_security); + seq_printf(m, "0x%x\n", global_secflags); return 0; } @@ -744,13 +744,13 @@ static ssize_t cifs_security_flags_proc_write(struct file *file, /* single char or single char followed by null */ c = flags_string[0]; if (c == '0' || c == 'n' || c == 'N') { - extended_security = CIFSSEC_DEF; /* default */ + global_secflags = CIFSSEC_DEF; /* default */ return count; } else if (c == '1' || c == 'y' || c == 'Y') { - extended_security = CIFSSEC_MAX; + global_secflags = CIFSSEC_MAX; return count; } else if (!isdigit(c)) { - cERROR(1, ("invalid flag %c", c)); + cERROR(1, "invalid flag %c", c); return -EINVAL; } } @@ -758,26 +758,26 @@ static ssize_t cifs_security_flags_proc_write(struct file *file, flags = simple_strtoul(flags_string, NULL, 0); - cFYI(1, ("sec flags 0x%x", flags)); + cFYI(1, "sec flags 0x%x", flags); if (flags <= 0) { - cERROR(1, ("invalid security flags %s", flags_string)); + cERROR(1, "invalid security flags %s", flags_string); return -EINVAL; } if (flags & ~CIFSSEC_MASK) { - cERROR(1, ("attempt to set unsupported security flags 0x%x", - flags & ~CIFSSEC_MASK)); + cERROR(1, "attempt to set unsupported security flags 0x%x", + flags & ~CIFSSEC_MASK); return -EINVAL; } /* flags look ok - update the global security flags for cifs module */ - extended_security = flags; - if (extended_security & CIFSSEC_MUST_SIGN) { + global_secflags = flags; + if (global_secflags & CIFSSEC_MUST_SIGN) { /* requiring signing implies signing is allowed */ - extended_security |= CIFSSEC_MAY_SIGN; - cFYI(1, ("packet signing now required")); - } else if ((extended_security & CIFSSEC_MAY_SIGN) == 0) { - cFYI(1, ("packet signing disabled")); + global_secflags |= CIFSSEC_MAY_SIGN; + cFYI(1, "packet signing now required"); + } else if ((global_secflags & CIFSSEC_MAY_SIGN) == 0) { + cFYI(1, "packet signing disabled"); } /* BB should we turn on MAY flags for other MUST options? */ return count; diff --git a/fs/cifs/cifs_debug.h b/fs/cifs/cifs_debug.h index 5eb3b83..aa31689 100644 --- a/fs/cifs/cifs_debug.h +++ b/fs/cifs/cifs_debug.h @@ -43,34 +43,54 @@ void dump_smb(struct smb_hdr *, int); */ #ifdef CIFS_DEBUG - /* information message: e.g., configuration, major event */ extern int cifsFYI; -#define cifsfyi(format,arg...) if (cifsFYI & CIFS_INFO) printk(KERN_DEBUG " " __FILE__ ": " format "\n" "" , ## arg) +#define cifsfyi(fmt, arg...) \ +do { \ + if (cifsFYI & CIFS_INFO) \ + printk(KERN_DEBUG "%s: " fmt "\n", __FILE__, ##arg); \ +} while (0) -#define cFYI(button,prspec) if (button) cifsfyi prspec +#define cFYI(set, fmt, arg...) \ +do { \ + if (set) \ + cifsfyi(fmt, ##arg); \ +} while (0) -#define cifswarn(format, arg...) printk(KERN_WARNING ": " format "\n" , ## arg) +#define cifswarn(fmt, arg...) \ + printk(KERN_WARNING fmt "\n", ##arg) /* debug event message: */ extern int cifsERROR; -#define cEVENT(format,arg...) if (cifsERROR) printk(KERN_EVENT __FILE__ ": " format "\n" , ## arg) +#define cEVENT(fmt, arg...) \ +do { \ + if (cifsERROR) \ + printk(KERN_EVENT "%s: " fmt "\n", __FILE__, ##arg); \ +} while (0) /* error event message: e.g., i/o error */ -#define cifserror(format,arg...) if (cifsERROR) printk(KERN_ERR " CIFS VFS: " format "\n" "" , ## arg) +#define cifserror(fmt, arg...) \ +do { \ + if (cifsERROR) \ + printk(KERN_ERR "CIFS VFS: " fmt "\n", ##arg); \ +} while (0) -#define cERROR(button, prspec) if (button) cifserror prspec +#define cERROR(set, fmt, arg...) \ +do { \ + if (set) \ + cifserror(fmt, ##arg); \ +} while (0) /* * debug OFF * --------- */ #else /* _CIFS_DEBUG */ -#define cERROR(button, prspec) -#define cEVENT(format, arg...) -#define cFYI(button, prspec) -#define cifserror(format, arg...) +#define cERROR(set, fmt, arg...) +#define cEVENT(fmt, arg...) +#define cFYI(set, fmt, arg...) +#define cifserror(fmt, arg...) #endif /* _CIFS_DEBUG */ #endif /* _H_CIFS_DEBUG */ diff --git a/fs/cifs/cifs_dfs_ref.c b/fs/cifs/cifs_dfs_ref.c index 78e4d2a..ac19a6f 100644 --- a/fs/cifs/cifs_dfs_ref.c +++ b/fs/cifs/cifs_dfs_ref.c @@ -85,8 +85,8 @@ static char *cifs_get_share_name(const char *node_name) /* find server name end */ pSep = memchr(UNC+2, '\\', len-2); if (!pSep) { - cERROR(1, ("%s: no server name end in node name: %s", - __func__, node_name)); + cERROR(1, "%s: no server name end in node name: %s", + __func__, node_name); kfree(UNC); return ERR_PTR(-EINVAL); } @@ -142,8 +142,8 @@ char *cifs_compose_mount_options(const char *sb_mountdata, rc = dns_resolve_server_name_to_ip(*devname, &srvIP); if (rc != 0) { - cERROR(1, ("%s: Failed to resolve server part of %s to IP: %d", - __func__, *devname, rc)); + cERROR(1, "%s: Failed to resolve server part of %s to IP: %d", + __func__, *devname, rc); goto compose_mount_options_err; } /* md_len = strlen(...) + 12 for 'sep+prefixpath=' @@ -217,8 +217,8 @@ char *cifs_compose_mount_options(const char *sb_mountdata, strcat(mountdata, fullpath + ref->path_consumed); } - /*cFYI(1,("%s: parent mountdata: %s", __func__,sb_mountdata));*/ - /*cFYI(1, ("%s: submount mountdata: %s", __func__, mountdata ));*/ + /*cFYI(1, "%s: parent mountdata: %s", __func__,sb_mountdata);*/ + /*cFYI(1, "%s: submount mountdata: %s", __func__, mountdata );*/ compose_mount_options_out: kfree(srvIP); @@ -294,11 +294,11 @@ static int add_mount_helper(struct vfsmount *newmnt, struct nameidata *nd, static void dump_referral(const struct dfs_info3_param *ref) { - cFYI(1, ("DFS: ref path: %s", ref->path_name)); - cFYI(1, ("DFS: node path: %s", ref->node_name)); - cFYI(1, ("DFS: fl: %hd, srv_type: %hd", ref->flags, ref->server_type)); - cFYI(1, ("DFS: ref_flags: %hd, path_consumed: %hd", ref->ref_flag, - ref->path_consumed)); + cFYI(1, "DFS: ref path: %s", ref->path_name); + cFYI(1, "DFS: node path: %s", ref->node_name); + cFYI(1, "DFS: fl: %hd, srv_type: %hd", ref->flags, ref->server_type); + cFYI(1, "DFS: ref_flags: %hd, path_consumed: %hd", ref->ref_flag, + ref->path_consumed); } @@ -314,7 +314,7 @@ cifs_dfs_follow_mountpoint(struct dentry *dentry, struct nameidata *nd) int rc = 0; struct vfsmount *mnt = ERR_PTR(-ENOENT); - cFYI(1, ("in %s", __func__)); + cFYI(1, "in %s", __func__); BUG_ON(IS_ROOT(dentry)); xid = GetXid(); @@ -352,15 +352,15 @@ cifs_dfs_follow_mountpoint(struct dentry *dentry, struct nameidata *nd) /* connect to a node */ len = strlen(referrals[i].node_name); if (len < 2) { - cERROR(1, ("%s: Net Address path too short: %s", - __func__, referrals[i].node_name)); + cERROR(1, "%s: Net Address path too short: %s", + __func__, referrals[i].node_name); rc = -EINVAL; goto out_err; } mnt = cifs_dfs_do_refmount(nd->path.mnt, nd->path.dentry, referrals + i); - cFYI(1, ("%s: cifs_dfs_do_refmount:%s , mnt:%p", __func__, - referrals[i].node_name, mnt)); + cFYI(1, "%s: cifs_dfs_do_refmount:%s , mnt:%p", __func__, + referrals[i].node_name, mnt); /* complete mount procedure if we accured submount */ if (!IS_ERR(mnt)) @@ -378,7 +378,7 @@ out: FreeXid(xid); free_dfs_info_array(referrals, num_referrals); kfree(full_path); - cFYI(1, ("leaving %s" , __func__)); + cFYI(1, "leaving %s" , __func__); return ERR_PTR(rc); out_err: path_put(&nd->path); diff --git a/fs/cifs/cifs_spnego.c b/fs/cifs/cifs_spnego.c index 310d12f..379bd7d 100644 --- a/fs/cifs/cifs_spnego.c +++ b/fs/cifs/cifs_spnego.c @@ -133,9 +133,9 @@ cifs_get_spnego_key(struct cifsSesInfo *sesInfo) dp = description + strlen(description); /* for now, only sec=krb5 and sec=mskrb5 are valid */ - if (server->secType == Kerberos) + if (server->sec_kerberos) sprintf(dp, ";sec=krb5"); - else if (server->secType == MSKerberos) + else if (server->sec_mskerberos) sprintf(dp, ";sec=mskrb5"); else goto out; @@ -149,7 +149,7 @@ cifs_get_spnego_key(struct cifsSesInfo *sesInfo) dp = description + strlen(description); sprintf(dp, ";pid=0x%x", current->pid); - cFYI(1, ("key description = %s", description)); + cFYI(1, "key description = %s", description); spnego_key = request_key(&cifs_spnego_key_type, description, ""); #ifdef CONFIG_CIFS_DEBUG2 diff --git a/fs/cifs/cifs_unicode.c b/fs/cifs/cifs_unicode.c index d07676b..430f510 100644 --- a/fs/cifs/cifs_unicode.c +++ b/fs/cifs/cifs_unicode.c @@ -200,9 +200,8 @@ cifs_strtoUCS(__le16 *to, const char *from, int len, /* works for 2.4.0 kernel or later */ charlen = codepage->char2uni(from, len, &wchar_to[i]); if (charlen < 1) { - cERROR(1, - ("strtoUCS: char2uni of %d returned %d", - (int)*from, charlen)); + cERROR(1, "strtoUCS: char2uni of %d returned %d", + (int)*from, charlen); /* A question mark */ to[i] = cpu_to_le16(0x003f); charlen = 1; diff --git a/fs/cifs/cifsacl.c b/fs/cifs/cifsacl.c index 9b716d0..85d7cf7 100644 --- a/fs/cifs/cifsacl.c +++ b/fs/cifs/cifsacl.c @@ -87,11 +87,11 @@ int match_sid(struct cifs_sid *ctsid) continue; /* all sub_auth values do not match */ } - cFYI(1, ("matching sid: %s\n", wksidarr[i].sidname)); + cFYI(1, "matching sid: %s\n", wksidarr[i].sidname); return 0; /* sids compare/match */ } - cFYI(1, ("No matching sid")); + cFYI(1, "No matching sid"); return -1; } @@ -208,14 +208,14 @@ static void access_flags_to_mode(__le32 ace_flags, int type, umode_t *pmode, *pbits_to_set &= ~S_IXUGO; return; } else if (type != ACCESS_ALLOWED) { - cERROR(1, ("unknown access control type %d", type)); + cERROR(1, "unknown access control type %d", type); return; } /* else ACCESS_ALLOWED type */ if (flags & GENERIC_ALL) { *pmode |= (S_IRWXUGO & (*pbits_to_set)); - cFYI(DBG2, ("all perms")); + cFYI(DBG2, "all perms"); return; } if ((flags & GENERIC_WRITE) || @@ -228,7 +228,7 @@ static void access_flags_to_mode(__le32 ace_flags, int type, umode_t *pmode, ((flags & FILE_EXEC_RIGHTS) == FILE_EXEC_RIGHTS)) *pmode |= (S_IXUGO & (*pbits_to_set)); - cFYI(DBG2, ("access flags 0x%x mode now 0x%x", flags, *pmode)); + cFYI(DBG2, "access flags 0x%x mode now 0x%x", flags, *pmode); return; } @@ -257,7 +257,7 @@ static void mode_to_access_flags(umode_t mode, umode_t bits_to_use, if (mode & S_IXUGO) *pace_flags |= SET_FILE_EXEC_RIGHTS; - cFYI(DBG2, ("mode: 0x%x, access flags now 0x%x", mode, *pace_flags)); + cFYI(DBG2, "mode: 0x%x, access flags now 0x%x", mode, *pace_flags); return; } @@ -297,24 +297,24 @@ static void dump_ace(struct cifs_ace *pace, char *end_of_acl) /* validate that we do not go past end of acl */ if (le16_to_cpu(pace->size) < 16) { - cERROR(1, ("ACE too small, %d", le16_to_cpu(pace->size))); + cERROR(1, "ACE too small %d", le16_to_cpu(pace->size)); return; } if (end_of_acl < (char *)pace + le16_to_cpu(pace->size)) { - cERROR(1, ("ACL too small to parse ACE")); + cERROR(1, "ACL too small to parse ACE"); return; } num_subauth = pace->sid.num_subauth; if (num_subauth) { int i; - cFYI(1, ("ACE revision %d num_auth %d type %d flags %d size %d", + cFYI(1, "ACE revision %d num_auth %d type %d flags %d size %d", pace->sid.revision, pace->sid.num_subauth, pace->type, - pace->flags, le16_to_cpu(pace->size))); + pace->flags, le16_to_cpu(pace->size)); for (i = 0; i < num_subauth; ++i) { - cFYI(1, ("ACE sub_auth[%d]: 0x%x", i, - le32_to_cpu(pace->sid.sub_auth[i]))); + cFYI(1, "ACE sub_auth[%d]: 0x%x", i, + le32_to_cpu(pace->sid.sub_auth[i])); } /* BB add length check to make sure that we do not have huge @@ -347,13 +347,13 @@ static void parse_dacl(struct cifs_acl *pdacl, char *end_of_acl, /* validate that we do not go past end of acl */ if (end_of_acl < (char *)pdacl + le16_to_cpu(pdacl->size)) { - cERROR(1, ("ACL too small to parse DACL")); + cERROR(1, "ACL too small to parse DACL"); return; } - cFYI(DBG2, ("DACL revision %d size %d num aces %d", + cFYI(DBG2, "DACL revision %d size %d num aces %d", le16_to_cpu(pdacl->revision), le16_to_cpu(pdacl->size), - le32_to_cpu(pdacl->num_aces))); + le32_to_cpu(pdacl->num_aces)); /* reset rwx permissions for user/group/other. Also, if num_aces is 0 i.e. DACL has no ACEs, @@ -437,25 +437,25 @@ static int parse_sid(struct cifs_sid *psid, char *end_of_acl) /* validate that we do not go past end of ACL - sid must be at least 8 bytes long (assuming no sub-auths - e.g. the null SID */ if (end_of_acl < (char *)psid + 8) { - cERROR(1, ("ACL too small to parse SID %p", psid)); + cERROR(1, "ACL too small to parse SID %p", psid); return -EINVAL; } if (psid->num_subauth) { #ifdef CONFIG_CIFS_DEBUG2 int i; - cFYI(1, ("SID revision %d num_auth %d", - psid->revision, psid->num_subauth)); + cFYI(1, "SID revision %d num_auth %d", + psid->revision, psid->num_subauth); for (i = 0; i < psid->num_subauth; i++) { - cFYI(1, ("SID sub_auth[%d]: 0x%x ", i, - le32_to_cpu(psid->sub_auth[i]))); + cFYI(1, "SID sub_auth[%d]: 0x%x ", i, + le32_to_cpu(psid->sub_auth[i])); } /* BB add length check to make sure that we do not have huge num auths and therefore go off the end */ - cFYI(1, ("RID 0x%x", - le32_to_cpu(psid->sub_auth[psid->num_subauth-1]))); + cFYI(1, "RID 0x%x", + le32_to_cpu(psid->sub_auth[psid->num_subauth-1])); #endif } @@ -482,11 +482,11 @@ static int parse_sec_desc(struct cifs_ntsd *pntsd, int acl_len, le32_to_cpu(pntsd->gsidoffset)); dacloffset = le32_to_cpu(pntsd->dacloffset); dacl_ptr = (struct cifs_acl *)((char *)pntsd + dacloffset); - cFYI(DBG2, ("revision %d type 0x%x ooffset 0x%x goffset 0x%x " + cFYI(DBG2, "revision %d type 0x%x ooffset 0x%x goffset 0x%x " "sacloffset 0x%x dacloffset 0x%x", pntsd->revision, pntsd->type, le32_to_cpu(pntsd->osidoffset), le32_to_cpu(pntsd->gsidoffset), - le32_to_cpu(pntsd->sacloffset), dacloffset)); + le32_to_cpu(pntsd->sacloffset), dacloffset); /* cifs_dump_mem("owner_sid: ", owner_sid_ptr, 64); */ rc = parse_sid(owner_sid_ptr, end_of_acl); if (rc) @@ -500,7 +500,7 @@ static int parse_sec_desc(struct cifs_ntsd *pntsd, int acl_len, parse_dacl(dacl_ptr, end_of_acl, owner_sid_ptr, group_sid_ptr, fattr); else - cFYI(1, ("no ACL")); /* BB grant all or default perms? */ + cFYI(1, "no ACL"); /* BB grant all or default perms? */ /* cifscred->uid = owner_sid_ptr->rid; cifscred->gid = group_sid_ptr->rid; @@ -563,7 +563,7 @@ static struct cifs_ntsd *get_cifs_acl_by_fid(struct cifs_sb_info *cifs_sb, FreeXid(xid); - cFYI(1, ("GetCIFSACL rc = %d ACL len %d", rc, *pacllen)); + cFYI(1, "GetCIFSACL rc = %d ACL len %d", rc, *pacllen); return pntsd; } @@ -581,12 +581,12 @@ static struct cifs_ntsd *get_cifs_acl_by_path(struct cifs_sb_info *cifs_sb, &fid, &oplock, NULL, cifs_sb->local_nls, cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR); if (rc) { - cERROR(1, ("Unable to open file to get ACL")); + cERROR(1, "Unable to open file to get ACL"); goto out; } rc = CIFSSMBGetCIFSACL(xid, cifs_sb->tcon, fid, &pntsd, pacllen); - cFYI(1, ("GetCIFSACL rc = %d ACL len %d", rc, *pacllen)); + cFYI(1, "GetCIFSACL rc = %d ACL len %d", rc, *pacllen); CIFSSMBClose(xid, cifs_sb->tcon, fid); out: @@ -621,7 +621,7 @@ static int set_cifs_acl_by_fid(struct cifs_sb_info *cifs_sb, __u16 fid, rc = CIFSSMBSetCIFSACL(xid, cifs_sb->tcon, fid, pnntsd, acllen); FreeXid(xid); - cFYI(DBG2, ("SetCIFSACL rc = %d", rc)); + cFYI(DBG2, "SetCIFSACL rc = %d", rc); return rc; } @@ -638,12 +638,12 @@ static int set_cifs_acl_by_path(struct cifs_sb_info *cifs_sb, const char *path, &fid, &oplock, NULL, cifs_sb->local_nls, cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR); if (rc) { - cERROR(1, ("Unable to open file to set ACL")); + cERROR(1, "Unable to open file to set ACL"); goto out; } rc = CIFSSMBSetCIFSACL(xid, cifs_sb->tcon, fid, pnntsd, acllen); - cFYI(DBG2, ("SetCIFSACL rc = %d", rc)); + cFYI(DBG2, "SetCIFSACL rc = %d", rc); CIFSSMBClose(xid, cifs_sb->tcon, fid); out: @@ -659,7 +659,7 @@ static int set_cifs_acl(struct cifs_ntsd *pnntsd, __u32 acllen, struct cifsFileInfo *open_file; int rc; - cFYI(DBG2, ("set ACL for %s from mode 0x%x", path, inode->i_mode)); + cFYI(DBG2, "set ACL for %s from mode 0x%x", path, inode->i_mode); open_file = find_readable_file(CIFS_I(inode)); if (!open_file) @@ -679,7 +679,7 @@ cifs_acl_to_fattr(struct cifs_sb_info *cifs_sb, struct cifs_fattr *fattr, u32 acllen = 0; int rc = 0; - cFYI(DBG2, ("converting ACL to mode for %s", path)); + cFYI(DBG2, "converting ACL to mode for %s", path); if (pfid) pntsd = get_cifs_acl_by_fid(cifs_sb, *pfid, &acllen); @@ -690,7 +690,7 @@ cifs_acl_to_fattr(struct cifs_sb_info *cifs_sb, struct cifs_fattr *fattr, if (pntsd) rc = parse_sec_desc(pntsd, acllen, fattr); if (rc) - cFYI(1, ("parse sec desc failed rc = %d", rc)); + cFYI(1, "parse sec desc failed rc = %d", rc); kfree(pntsd); return; @@ -704,7 +704,7 @@ int mode_to_acl(struct inode *inode, const char *path, __u64 nmode) struct cifs_ntsd *pntsd = NULL; /* acl obtained from server */ struct cifs_ntsd *pnntsd = NULL; /* modified acl to be sent to server */ - cFYI(DBG2, ("set ACL from mode for %s", path)); + cFYI(DBG2, "set ACL from mode for %s", path); /* Get the security descriptor */ pntsd = get_cifs_acl(CIFS_SB(inode->i_sb), inode, path, &secdesclen); @@ -721,19 +721,19 @@ int mode_to_acl(struct inode *inode, const char *path, __u64 nmode) DEFSECDESCLEN : secdesclen; pnntsd = kmalloc(secdesclen, GFP_KERNEL); if (!pnntsd) { - cERROR(1, ("Unable to allocate security descriptor")); + cERROR(1, "Unable to allocate security descriptor"); kfree(pntsd); return -ENOMEM; } rc = build_sec_desc(pntsd, pnntsd, inode, nmode); - cFYI(DBG2, ("build_sec_desc rc: %d", rc)); + cFYI(DBG2, "build_sec_desc rc: %d", rc); if (!rc) { /* Set the security descriptor */ rc = set_cifs_acl(pnntsd, secdesclen, inode, path); - cFYI(DBG2, ("set_cifs_acl rc: %d", rc)); + cFYI(DBG2, "set_cifs_acl rc: %d", rc); } kfree(pnntsd); diff --git a/fs/cifs/cifsencrypt.c b/fs/cifs/cifsencrypt.c index fbe9864..847628d 100644 --- a/fs/cifs/cifsencrypt.c +++ b/fs/cifs/cifsencrypt.c @@ -103,7 +103,7 @@ static int cifs_calc_signature2(const struct kvec *iov, int n_vec, if (iov[i].iov_len == 0) continue; if (iov[i].iov_base == NULL) { - cERROR(1, ("null iovec entry")); + cERROR(1, "null iovec entry"); return -EIO; } /* The first entry includes a length field (which does not get @@ -181,8 +181,8 @@ int cifs_verify_signature(struct smb_hdr *cifs_pdu, /* Do not need to verify session setups with signature "BSRSPYL " */ if (memcmp(cifs_pdu->Signature.SecuritySignature, "BSRSPYL ", 8) == 0) - cFYI(1, ("dummy signature received for smb command 0x%x", - cifs_pdu->Command)); + cFYI(1, "dummy signature received for smb command 0x%x", + cifs_pdu->Command); /* save off the origiginal signature so we can modify the smb and check its signature against what the server sent */ @@ -291,7 +291,7 @@ void calc_lanman_hash(const char *password, const char *cryptkey, bool encrypt, if (password) strncpy(password_with_pad, password, CIFS_ENCPWD_SIZE); - if (!encrypt && extended_security & CIFSSEC_MAY_PLNTXT) { + if (!encrypt && global_secflags & CIFSSEC_MAY_PLNTXT) { memset(lnm_session_key, 0, CIFS_SESS_KEY_SIZE); memcpy(lnm_session_key, password_with_pad, CIFS_ENCPWD_SIZE); @@ -398,7 +398,7 @@ void setup_ntlmv2_rsp(struct cifsSesInfo *ses, char *resp_buf, /* calculate buf->ntlmv2_hash */ rc = calc_ntlmv2_hash(ses, nls_cp); if (rc) - cERROR(1, ("could not get v2 hash rc %d", rc)); + cERROR(1, "could not get v2 hash rc %d", rc); CalcNTLMv2_response(ses, resp_buf); /* now calculate the MAC key for NTLMv2 */ diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c index ad235d6..78c02eb 100644 --- a/fs/cifs/cifsfs.c +++ b/fs/cifs/cifsfs.c @@ -49,10 +49,6 @@ #include "cifs_spnego.h" #define CIFS_MAGIC_NUMBER 0xFF534D42 /* the first four bytes of SMB PDUs */ -#ifdef CONFIG_CIFS_QUOTA -static const struct quotactl_ops cifs_quotactl_ops; -#endif /* QUOTA */ - int cifsFYI = 0; int cifsERROR = 1; int traceSMB = 0; @@ -61,7 +57,7 @@ unsigned int experimEnabled = 0; unsigned int linuxExtEnabled = 1; unsigned int lookupCacheEnabled = 1; unsigned int multiuser_mount = 0; -unsigned int extended_security = CIFSSEC_DEF; +unsigned int global_secflags = CIFSSEC_DEF; /* unsigned int ntlmv2_support = 0; */ unsigned int sign_CIFS_PDUs = 1; static const struct super_operations cifs_super_ops; @@ -86,8 +82,6 @@ extern mempool_t *cifs_sm_req_poolp; extern mempool_t *cifs_req_poolp; extern mempool_t *cifs_mid_poolp; -extern struct kmem_cache *cifs_oplock_cachep; - static int cifs_read_super(struct super_block *sb, void *data, const char *devname, int silent) @@ -135,8 +129,7 @@ cifs_read_super(struct super_block *sb, void *data, if (rc) { if (!silent) - cERROR(1, - ("cifs_mount failed w/return code = %d", rc)); + cERROR(1, "cifs_mount failed w/return code = %d", rc); goto out_mount_failed; } @@ -146,9 +139,6 @@ cifs_read_super(struct super_block *sb, void *data, /* if (cifs_sb->tcon->ses->server->maxBuf > MAX_CIFS_HDR_SIZE + 512) sb->s_blocksize = cifs_sb->tcon->ses->server->maxBuf - MAX_CIFS_HDR_SIZE; */ -#ifdef CONFIG_CIFS_QUOTA - sb->s_qcop = &cifs_quotactl_ops; -#endif sb->s_blocksize = CIFS_MAX_MSGSIZE; sb->s_blocksize_bits = 14; /* default 2**14 = CIFS_MAX_MSGSIZE */ inode = cifs_root_iget(sb, ROOT_I); @@ -168,7 +158,7 @@ cifs_read_super(struct super_block *sb, void *data, #ifdef CONFIG_CIFS_EXPERIMENTAL if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_SERVER_INUM) { - cFYI(1, ("export ops supported")); + cFYI(1, "export ops supported"); sb->s_export_op = &cifs_export_ops; } #endif /* EXPERIMENTAL */ @@ -176,7 +166,7 @@ cifs_read_super(struct super_block *sb, void *data, return 0; out_no_root: - cERROR(1, ("cifs_read_super: get root inode failed")); + cERROR(1, "cifs_read_super: get root inode failed"); if (inode) iput(inode); @@ -203,10 +193,10 @@ cifs_put_super(struct super_block *sb) int rc = 0; struct cifs_sb_info *cifs_sb; - cFYI(1, ("In cifs_put_super")); + cFYI(1, "In cifs_put_super"); cifs_sb = CIFS_SB(sb); if (cifs_sb == NULL) { - cFYI(1, ("Empty cifs superblock info passed to unmount")); + cFYI(1, "Empty cifs superblock info passed to unmount"); return; } @@ -214,7 +204,7 @@ cifs_put_super(struct super_block *sb) rc = cifs_umount(sb, cifs_sb); if (rc) - cERROR(1, ("cifs_umount failed with return code %d", rc)); + cERROR(1, "cifs_umount failed with return code %d", rc); #ifdef CONFIG_CIFS_DFS_UPCALL if (cifs_sb->mountdata) { kfree(cifs_sb->mountdata); @@ -300,7 +290,6 @@ static int cifs_permission(struct inode *inode, int mask) static struct kmem_cache *cifs_inode_cachep; static struct kmem_cache *cifs_req_cachep; static struct kmem_cache *cifs_mid_cachep; -struct kmem_cache *cifs_oplock_cachep; static struct kmem_cache *cifs_sm_req_cachep; mempool_t *cifs_sm_req_poolp; mempool_t *cifs_req_poolp; @@ -432,106 +421,6 @@ cifs_show_options(struct seq_file *s, struct vfsmount *m) return 0; } -#ifdef CONFIG_CIFS_QUOTA -int cifs_xquota_set(struct super_block *sb, int quota_type, qid_t qid, - struct fs_disk_quota *pdquota) -{ - int xid; - int rc = 0; - struct cifs_sb_info *cifs_sb = CIFS_SB(sb); - struct cifsTconInfo *pTcon; - - if (cifs_sb) - pTcon = cifs_sb->tcon; - else - return -EIO; - - - xid = GetXid(); - if (pTcon) { - cFYI(1, ("set type: 0x%x id: %d", quota_type, qid)); - } else - rc = -EIO; - - FreeXid(xid); - return rc; -} - -int cifs_xquota_get(struct super_block *sb, int quota_type, qid_t qid, - struct fs_disk_quota *pdquota) -{ - int xid; - int rc = 0; - struct cifs_sb_info *cifs_sb = CIFS_SB(sb); - struct cifsTconInfo *pTcon; - - if (cifs_sb) - pTcon = cifs_sb->tcon; - else - return -EIO; - - xid = GetXid(); - if (pTcon) { - cFYI(1, ("set type: 0x%x id: %d", quota_type, qid)); - } else - rc = -EIO; - - FreeXid(xid); - return rc; -} - -int cifs_xstate_set(struct super_block *sb, unsigned int flags, int operation) -{ - int xid; - int rc = 0; - struct cifs_sb_info *cifs_sb = CIFS_SB(sb); - struct cifsTconInfo *pTcon; - - if (cifs_sb) - pTcon = cifs_sb->tcon; - else - return -EIO; - - xid = GetXid(); - if (pTcon) { - cFYI(1, ("flags: 0x%x operation: 0x%x", flags, operation)); - } else - rc = -EIO; - - FreeXid(xid); - return rc; -} - -int cifs_xstate_get(struct super_block *sb, struct fs_quota_stat *qstats) -{ - int xid; - int rc = 0; - struct cifs_sb_info *cifs_sb = CIFS_SB(sb); - struct cifsTconInfo *pTcon; - - if (cifs_sb) - pTcon = cifs_sb->tcon; - else - return -EIO; - - xid = GetXid(); - if (pTcon) { - cFYI(1, ("pqstats %p", qstats)); - } else - rc = -EIO; - - FreeXid(xid); - return rc; -} - -static const struct quotactl_ops cifs_quotactl_ops = { - .set_xquota = cifs_xquota_set, - .get_xquota = cifs_xquota_get, - .set_xstate = cifs_xstate_set, - .get_xstate = cifs_xstate_get, -}; -#endif - static void cifs_umount_begin(struct super_block *sb) { struct cifs_sb_info *cifs_sb = CIFS_SB(sb); @@ -558,7 +447,7 @@ static void cifs_umount_begin(struct super_block *sb) /* cancel_brl_requests(tcon); */ /* BB mark all brl mids as exiting */ /* cancel_notify_requests(tcon); */ if (tcon->ses && tcon->ses->server) { - cFYI(1, ("wake up tasks now - umount begin not complete")); + cFYI(1, "wake up tasks now - umount begin not complete"); wake_up_all(&tcon->ses->server->request_q); wake_up_all(&tcon->ses->server->response_q); msleep(1); /* yield */ @@ -609,7 +498,7 @@ cifs_get_sb(struct file_system_type *fs_type, int rc; struct super_block *sb = sget(fs_type, NULL, set_anon_super, NULL); - cFYI(1, ("Devname: %s flags: %d ", dev_name, flags)); + cFYI(1, "Devname: %s flags: %d ", dev_name, flags); if (IS_ERR(sb)) return PTR_ERR(sb); @@ -656,7 +545,6 @@ static loff_t cifs_llseek(struct file *file, loff_t offset, int origin) return generic_file_llseek_unlocked(file, offset, origin); } -#ifdef CONFIG_CIFS_EXPERIMENTAL static int cifs_setlease(struct file *file, long arg, struct file_lock **lease) { /* note that this is called by vfs setlease with the BKL held @@ -685,7 +573,6 @@ static int cifs_setlease(struct file *file, long arg, struct file_lock **lease) else return -EAGAIN; } -#endif struct file_system_type cifs_fs_type = { .owner = THIS_MODULE, @@ -762,10 +649,7 @@ const struct file_operations cifs_file_ops = { #ifdef CONFIG_CIFS_POSIX .unlocked_ioctl = cifs_ioctl, #endif /* CONFIG_CIFS_POSIX */ - -#ifdef CONFIG_CIFS_EXPERIMENTAL .setlease = cifs_setlease, -#endif /* CONFIG_CIFS_EXPERIMENTAL */ }; const struct file_operations cifs_file_direct_ops = { @@ -784,9 +668,7 @@ const struct file_operations cifs_file_direct_ops = { .unlocked_ioctl = cifs_ioctl, #endif /* CONFIG_CIFS_POSIX */ .llseek = cifs_llseek, -#ifdef CONFIG_CIFS_EXPERIMENTAL .setlease = cifs_setlease, -#endif /* CONFIG_CIFS_EXPERIMENTAL */ }; const struct file_operations cifs_file_nobrl_ops = { .read = do_sync_read, @@ -803,10 +685,7 @@ const struct file_operations cifs_file_nobrl_ops = { #ifdef CONFIG_CIFS_POSIX .unlocked_ioctl = cifs_ioctl, #endif /* CONFIG_CIFS_POSIX */ - -#ifdef CONFIG_CIFS_EXPERIMENTAL .setlease = cifs_setlease, -#endif /* CONFIG_CIFS_EXPERIMENTAL */ }; const struct file_operations cifs_file_direct_nobrl_ops = { @@ -824,9 +703,7 @@ const struct file_operations cifs_file_direct_nobrl_ops = { .unlocked_ioctl = cifs_ioctl, #endif /* CONFIG_CIFS_POSIX */ .llseek = cifs_llseek, -#ifdef CONFIG_CIFS_EXPERIMENTAL .setlease = cifs_setlease, -#endif /* CONFIG_CIFS_EXPERIMENTAL */ }; const struct file_operations cifs_dir_ops = { @@ -878,7 +755,7 @@ cifs_init_request_bufs(void) } else { CIFSMaxBufSize &= 0x1FE00; /* Round size to even 512 byte mult*/ } -/* cERROR(1,("CIFSMaxBufSize %d 0x%x",CIFSMaxBufSize,CIFSMaxBufSize)); */ +/* cERROR(1, "CIFSMaxBufSize %d 0x%x",CIFSMaxBufSize,CIFSMaxBufSize); */ cifs_req_cachep = kmem_cache_create("cifs_request", CIFSMaxBufSize + MAX_CIFS_HDR_SIZE, 0, @@ -890,7 +767,7 @@ cifs_init_request_bufs(void) cifs_min_rcv = 1; else if (cifs_min_rcv > 64) { cifs_min_rcv = 64; - cERROR(1, ("cifs_min_rcv set to maximum (64)")); + cERROR(1, "cifs_min_rcv set to maximum (64)"); } cifs_req_poolp = mempool_create_slab_pool(cifs_min_rcv, @@ -921,7 +798,7 @@ cifs_init_request_bufs(void) cifs_min_small = 2; else if (cifs_min_small > 256) { cifs_min_small = 256; - cFYI(1, ("cifs_min_small set to maximum (256)")); + cFYI(1, "cifs_min_small set to maximum (256)"); } cifs_sm_req_poolp = mempool_create_slab_pool(cifs_min_small, @@ -962,15 +839,6 @@ cifs_init_mids(void) return -ENOMEM; } - cifs_oplock_cachep = kmem_cache_create("cifs_oplock_structs", - sizeof(struct oplock_q_entry), 0, - SLAB_HWCACHE_ALIGN, NULL); - if (cifs_oplock_cachep == NULL) { - mempool_destroy(cifs_mid_poolp); - kmem_cache_destroy(cifs_mid_cachep); - return -ENOMEM; - } - return 0; } @@ -979,7 +847,6 @@ cifs_destroy_mids(void) { mempool_destroy(cifs_mid_poolp); kmem_cache_destroy(cifs_mid_cachep); - kmem_cache_destroy(cifs_oplock_cachep); } static int __init @@ -1019,10 +886,10 @@ init_cifs(void) if (cifs_max_pending < 2) { cifs_max_pending = 2; - cFYI(1, ("cifs_max_pending set to min of 2")); + cFYI(1, "cifs_max_pending set to min of 2"); } else if (cifs_max_pending > 256) { cifs_max_pending = 256; - cFYI(1, ("cifs_max_pending set to max of 256")); + cFYI(1, "cifs_max_pending set to max of 256"); } rc = cifs_init_inodecache(); @@ -1080,7 +947,7 @@ init_cifs(void) static void __exit exit_cifs(void) { - cFYI(DBG2, ("exit_cifs")); + cFYI(DBG2, "exit_cifs"); cifs_proc_clean(); #ifdef CONFIG_CIFS_DFS_UPCALL cifs_dfs_release_automount_timer(); diff --git a/fs/cifs/cifsfs.h b/fs/cifs/cifsfs.h index 7aa57ec..0242ff9 100644 --- a/fs/cifs/cifsfs.h +++ b/fs/cifs/cifsfs.h @@ -114,5 +114,5 @@ extern long cifs_ioctl(struct file *filep, unsigned int cmd, unsigned long arg); extern const struct export_operations cifs_export_ops; #endif /* EXPERIMENTAL */ -#define CIFS_VERSION "1.62" +#define CIFS_VERSION "1.64" #endif /* _CIFSFS_H */ diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h index ecf0ffb..a88479c 100644 --- a/fs/cifs/cifsglob.h +++ b/fs/cifs/cifsglob.h @@ -87,7 +87,6 @@ enum securityEnum { RawNTLMSSP, /* NTLMSSP without SPNEGO, NTLMv2 hash */ /* NTLMSSP, */ /* can use rawNTLMSSP instead of NTLMSSP via SPNEGO */ Kerberos, /* Kerberos via SPNEGO */ - MSKerberos, /* MS Kerberos via SPNEGO */ }; enum protocolEnum { @@ -185,6 +184,12 @@ struct TCP_Server_Info { struct mac_key mac_signing_key; char ntlmv2_hash[16]; unsigned long lstrp; /* when we got last response from this server */ + u16 dialect; /* dialect index that server chose */ + /* extended security flavors that server supports */ + bool sec_kerberos; /* supports plain Kerberos */ + bool sec_mskerberos; /* supports legacy MS Kerberos */ + bool sec_kerberosu2u; /* supports U2U Kerberos */ + bool sec_ntlmssp; /* supports NTLMSSP */ }; /* @@ -502,6 +507,7 @@ struct dfs_info3_param { #define CIFS_FATTR_DFS_REFERRAL 0x1 #define CIFS_FATTR_DELETE_PENDING 0x2 #define CIFS_FATTR_NEED_REVAL 0x4 +#define CIFS_FATTR_INO_COLLISION 0x8 struct cifs_fattr { u32 cf_flags; @@ -717,7 +723,7 @@ GLOBAL_EXTERN unsigned int multiuser_mount; /* if enabled allows new sessions GLOBAL_EXTERN unsigned int oplockEnabled; GLOBAL_EXTERN unsigned int experimEnabled; GLOBAL_EXTERN unsigned int lookupCacheEnabled; -GLOBAL_EXTERN unsigned int extended_security; /* if on, session setup sent +GLOBAL_EXTERN unsigned int global_secflags; /* if on, session setup sent with more secure ntlmssp2 challenge/resp */ GLOBAL_EXTERN unsigned int sign_CIFS_PDUs; /* enable smb packet signing */ GLOBAL_EXTERN unsigned int linuxExtEnabled;/*enable Linux/Unix CIFS extensions*/ diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h index 39e47f4..fb1657e 100644 --- a/fs/cifs/cifsproto.h +++ b/fs/cifs/cifsproto.h @@ -39,8 +39,20 @@ extern int smb_send(struct TCP_Server_Info *, struct smb_hdr *, unsigned int /* length */); extern unsigned int _GetXid(void); extern void _FreeXid(unsigned int); -#define GetXid() (int)_GetXid(); cFYI(1,("CIFS VFS: in %s as Xid: %d with uid: %d",__func__, xid,current_fsuid())); -#define FreeXid(curr_xid) {_FreeXid(curr_xid); cFYI(1,("CIFS VFS: leaving %s (xid = %d) rc = %d",__func__,curr_xid,(int)rc));} +#define GetXid() \ +({ \ + int __xid = (int)_GetXid(); \ + cFYI(1, "CIFS VFS: in %s as Xid: %d with uid: %d", \ + __func__, __xid, current_fsuid()); \ + __xid; \ +}) + +#define FreeXid(curr_xid) \ +do { \ + _FreeXid(curr_xid); \ + cFYI(1, "CIFS VFS: leaving %s (xid = %d) rc = %d", \ + __func__, curr_xid, (int)rc); \ +} while (0) extern char *build_path_from_dentry(struct dentry *); extern char *cifs_build_path_to_root(struct cifs_sb_info *cifs_sb); extern char *build_wildcard_path_from_dentry(struct dentry *direntry); @@ -73,7 +85,7 @@ extern struct cifsFileInfo *find_readable_file(struct cifsInodeInfo *); extern unsigned int smbCalcSize(struct smb_hdr *ptr); extern unsigned int smbCalcSize_LE(struct smb_hdr *ptr); extern int decode_negTokenInit(unsigned char *security_blob, int length, - enum securityEnum *secType); + struct TCP_Server_Info *server); extern int cifs_convert_address(char *src, void *dst); extern int map_smb_to_linux_error(struct smb_hdr *smb, int logErr); extern void header_assemble(struct smb_hdr *, char /* command */ , @@ -83,7 +95,6 @@ extern int small_smb_init_no_tc(const int smb_cmd, const int wct, struct cifsSesInfo *ses, void **request_buf); extern int CIFS_SessSetup(unsigned int xid, struct cifsSesInfo *ses, - const int stage, const struct nls_table *nls_cp); extern __u16 GetNextMid(struct TCP_Server_Info *server); extern struct timespec cifs_NTtimeToUnix(__le64 utc_nanoseconds_since_1601); @@ -95,8 +106,11 @@ extern struct cifsFileInfo *cifs_new_fileinfo(struct inode *newinode, __u16 fileHandle, struct file *file, struct vfsmount *mnt, unsigned int oflags); extern int cifs_posix_open(char *full_path, struct inode **pinode, - struct vfsmount *mnt, int mode, int oflags, - __u32 *poplock, __u16 *pnetfid, int xid); + struct vfsmount *mnt, + struct super_block *sb, + int mode, int oflags, + __u32 *poplock, __u16 *pnetfid, int xid); +void cifs_fill_uniqueid(struct super_block *sb, struct cifs_fattr *fattr); extern void cifs_unix_basic_to_fattr(struct cifs_fattr *fattr, FILE_UNIX_BASIC_INFO *info, struct cifs_sb_info *cifs_sb); @@ -125,7 +139,9 @@ extern void cifs_dfs_release_automount_timer(void); void cifs_proc_init(void); void cifs_proc_clean(void); -extern int cifs_setup_session(unsigned int xid, struct cifsSesInfo *pSesInfo, +extern int cifs_negotiate_protocol(unsigned int xid, + struct cifsSesInfo *ses); +extern int cifs_setup_session(unsigned int xid, struct cifsSesInfo *ses, struct nls_table *nls_info); extern int CIFSSMBNegotiate(unsigned int xid, struct cifsSesInfo *ses); diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c index 5d3f29f..c65c341 100644 --- a/fs/cifs/cifssmb.c +++ b/fs/cifs/cifssmb.c @@ -1,7 +1,7 @@ /* * fs/cifs/cifssmb.c * - * Copyright (C) International Business Machines Corp., 2002,2009 + * Copyright (C) International Business Machines Corp., 2002,2010 * Author(s): Steve French (sfrench@us.ibm.com) * * Contains the routines for constructing the SMB PDUs themselves @@ -130,8 +130,8 @@ cifs_reconnect_tcon(struct cifsTconInfo *tcon, int smb_command) if (smb_command != SMB_COM_WRITE_ANDX && smb_command != SMB_COM_OPEN_ANDX && smb_command != SMB_COM_TREE_DISCONNECT) { - cFYI(1, ("can not send cmd %d while umounting", - smb_command)); + cFYI(1, "can not send cmd %d while umounting", + smb_command); return -ENODEV; } } @@ -157,7 +157,7 @@ cifs_reconnect_tcon(struct cifsTconInfo *tcon, int smb_command) * back on-line */ if (!tcon->retry || ses->status == CifsExiting) { - cFYI(1, ("gave up waiting on reconnect in smb_init")); + cFYI(1, "gave up waiting on reconnect in smb_init"); return -EHOSTDOWN; } } @@ -172,7 +172,8 @@ cifs_reconnect_tcon(struct cifsTconInfo *tcon, int smb_command) * reconnect the same SMB session */ mutex_lock(&ses->session_mutex); - if (ses->need_reconnect) + rc = cifs_negotiate_protocol(0, ses); + if (rc == 0 && ses->need_reconnect) rc = cifs_setup_session(0, ses, nls_codepage); /* do we need to reconnect tcon? */ @@ -184,7 +185,7 @@ cifs_reconnect_tcon(struct cifsTconInfo *tcon, int smb_command) mark_open_files_invalid(tcon); rc = CIFSTCon(0, ses, tcon->treeName, tcon, nls_codepage); mutex_unlock(&ses->session_mutex); - cFYI(1, ("reconnect tcon rc = %d", rc)); + cFYI(1, "reconnect tcon rc = %d", rc); if (rc) goto out; @@ -355,7 +356,6 @@ CIFSSMBNegotiate(unsigned int xid, struct cifsSesInfo *ses) struct TCP_Server_Info *server; u16 count; unsigned int secFlags; - u16 dialect; if (ses->server) server = ses->server; @@ -372,9 +372,9 @@ CIFSSMBNegotiate(unsigned int xid, struct cifsSesInfo *ses) if (ses->overrideSecFlg & (~(CIFSSEC_MUST_SIGN | CIFSSEC_MUST_SEAL))) secFlags = ses->overrideSecFlg; /* BB FIXME fix sign flags? */ else /* if override flags set only sign/seal OR them with global auth */ - secFlags = extended_security | ses->overrideSecFlg; + secFlags = global_secflags | ses->overrideSecFlg; - cFYI(1, ("secFlags 0x%x", secFlags)); + cFYI(1, "secFlags 0x%x", secFlags); pSMB->hdr.Mid = GetNextMid(server); pSMB->hdr.Flags2 |= (SMBFLG2_UNICODE | SMBFLG2_ERR_STATUS); @@ -382,14 +382,14 @@ CIFSSMBNegotiate(unsigned int xid, struct cifsSesInfo *ses) if ((secFlags & CIFSSEC_MUST_KRB5) == CIFSSEC_MUST_KRB5) pSMB->hdr.Flags2 |= SMBFLG2_EXT_SEC; else if ((secFlags & CIFSSEC_AUTH_MASK) == CIFSSEC_MAY_KRB5) { - cFYI(1, ("Kerberos only mechanism, enable extended security")); + cFYI(1, "Kerberos only mechanism, enable extended security"); pSMB->hdr.Flags2 |= SMBFLG2_EXT_SEC; } #ifdef CONFIG_CIFS_EXPERIMENTAL else if ((secFlags & CIFSSEC_MUST_NTLMSSP) == CIFSSEC_MUST_NTLMSSP) pSMB->hdr.Flags2 |= SMBFLG2_EXT_SEC; else if ((secFlags & CIFSSEC_AUTH_MASK) == CIFSSEC_MAY_NTLMSSP) { - cFYI(1, ("NTLMSSP only mechanism, enable extended security")); + cFYI(1, "NTLMSSP only mechanism, enable extended security"); pSMB->hdr.Flags2 |= SMBFLG2_EXT_SEC; } #endif @@ -408,10 +408,10 @@ CIFSSMBNegotiate(unsigned int xid, struct cifsSesInfo *ses) if (rc != 0) goto neg_err_exit; - dialect = le16_to_cpu(pSMBr->DialectIndex); - cFYI(1, ("Dialect: %d", dialect)); + server->dialect = le16_to_cpu(pSMBr->DialectIndex); + cFYI(1, "Dialect: %d", server->dialect); /* Check wct = 1 error case */ - if ((pSMBr->hdr.WordCount < 13) || (dialect == BAD_PROT)) { + if ((pSMBr->hdr.WordCount < 13) || (server->dialect == BAD_PROT)) { /* core returns wct = 1, but we do not ask for core - otherwise small wct just comes when dialect index is -1 indicating we could not negotiate a common dialect */ @@ -419,8 +419,8 @@ CIFSSMBNegotiate(unsigned int xid, struct cifsSesInfo *ses) goto neg_err_exit; #ifdef CONFIG_CIFS_WEAK_PW_HASH } else if ((pSMBr->hdr.WordCount == 13) - && ((dialect == LANMAN_PROT) - || (dialect == LANMAN2_PROT))) { + && ((server->dialect == LANMAN_PROT) + || (server->dialect == LANMAN2_PROT))) { __s16 tmp; struct lanman_neg_rsp *rsp = (struct lanman_neg_rsp *)pSMBr; @@ -428,8 +428,8 @@ CIFSSMBNegotiate(unsigned int xid, struct cifsSesInfo *ses) (secFlags & CIFSSEC_MAY_PLNTXT)) server->secType = LANMAN; else { - cERROR(1, ("mount failed weak security disabled" - " in /proc/fs/cifs/SecurityFlags")); + cERROR(1, "mount failed weak security disabled" + " in /proc/fs/cifs/SecurityFlags"); rc = -EOPNOTSUPP; goto neg_err_exit; } @@ -462,9 +462,9 @@ CIFSSMBNegotiate(unsigned int xid, struct cifsSesInfo *ses) utc = CURRENT_TIME; ts = cnvrtDosUnixTm(rsp->SrvTime.Date, rsp->SrvTime.Time, 0); - cFYI(1, ("SrvTime %d sec since 1970 (utc: %d) diff: %d", + cFYI(1, "SrvTime %d sec since 1970 (utc: %d) diff: %d", (int)ts.tv_sec, (int)utc.tv_sec, - (int)(utc.tv_sec - ts.tv_sec))); + (int)(utc.tv_sec - ts.tv_sec)); val = (int)(utc.tv_sec - ts.tv_sec); seconds = abs(val); result = (seconds / MIN_TZ_ADJ) * MIN_TZ_ADJ; @@ -478,7 +478,7 @@ CIFSSMBNegotiate(unsigned int xid, struct cifsSesInfo *ses) server->timeAdj = (int)tmp; server->timeAdj *= 60; /* also in seconds */ } - cFYI(1, ("server->timeAdj: %d seconds", server->timeAdj)); + cFYI(1, "server->timeAdj: %d seconds", server->timeAdj); /* BB get server time for time conversions and add @@ -493,14 +493,14 @@ CIFSSMBNegotiate(unsigned int xid, struct cifsSesInfo *ses) goto neg_err_exit; } - cFYI(1, ("LANMAN negotiated")); + cFYI(1, "LANMAN negotiated"); /* we will not end up setting signing flags - as no signing was in LANMAN and server did not return the flags on */ goto signing_check; #else /* weak security disabled */ } else if (pSMBr->hdr.WordCount == 13) { - cERROR(1, ("mount failed, cifs module not built " - "with CIFS_WEAK_PW_HASH support")); + cERROR(1, "mount failed, cifs module not built " + "with CIFS_WEAK_PW_HASH support"); rc = -EOPNOTSUPP; #endif /* WEAK_PW_HASH */ goto neg_err_exit; @@ -512,14 +512,14 @@ CIFSSMBNegotiate(unsigned int xid, struct cifsSesInfo *ses) /* else wct == 17 NTLM */ server->secMode = pSMBr->SecurityMode; if ((server->secMode & SECMODE_USER) == 0) - cFYI(1, ("share mode security")); + cFYI(1, "share mode security"); if ((server->secMode & SECMODE_PW_ENCRYPT) == 0) #ifdef CONFIG_CIFS_WEAK_PW_HASH if ((secFlags & CIFSSEC_MAY_PLNTXT) == 0) #endif /* CIFS_WEAK_PW_HASH */ - cERROR(1, ("Server requests plain text password" - " but client support disabled")); + cERROR(1, "Server requests plain text password" + " but client support disabled"); if ((secFlags & CIFSSEC_MUST_NTLMV2) == CIFSSEC_MUST_NTLMV2) server->secType = NTLMv2; @@ -539,7 +539,7 @@ CIFSSMBNegotiate(unsigned int xid, struct cifsSesInfo *ses) #endif */ else { rc = -EOPNOTSUPP; - cERROR(1, ("Invalid security type")); + cERROR(1, "Invalid security type"); goto neg_err_exit; } /* else ... any others ...? */ @@ -551,7 +551,7 @@ CIFSSMBNegotiate(unsigned int xid, struct cifsSesInfo *ses) server->maxBuf = min(le32_to_cpu(pSMBr->MaxBufferSize), (__u32) CIFSMaxBufSize + MAX_CIFS_HDR_SIZE); server->max_rw = le32_to_cpu(pSMBr->MaxRawSize); - cFYI(DBG2, ("Max buf = %d", ses->server->maxBuf)); + cFYI(DBG2, "Max buf = %d", ses->server->maxBuf); GETU32(ses->server->sessid) = le32_to_cpu(pSMBr->SessionKey); server->capabilities = le32_to_cpu(pSMBr->Capabilities); server->timeAdj = (int)(__s16)le16_to_cpu(pSMBr->ServerTimeZone); @@ -582,7 +582,7 @@ CIFSSMBNegotiate(unsigned int xid, struct cifsSesInfo *ses) if (memcmp(server->server_GUID, pSMBr->u.extended_response. GUID, 16) != 0) { - cFYI(1, ("server UID changed")); + cFYI(1, "server UID changed"); memcpy(server->server_GUID, pSMBr->u.extended_response.GUID, 16); @@ -597,13 +597,19 @@ CIFSSMBNegotiate(unsigned int xid, struct cifsSesInfo *ses) server->secType = RawNTLMSSP; } else { rc = decode_negTokenInit(pSMBr->u.extended_response. - SecurityBlob, - count - 16, - &server->secType); + SecurityBlob, count - 16, + server); if (rc == 1) rc = 0; else rc = -EINVAL; + + if (server->sec_kerberos || server->sec_mskerberos) + server->secType = Kerberos; + else if (server->sec_ntlmssp) + server->secType = RawNTLMSSP; + else + rc = -EOPNOTSUPP; } } else server->capabilities &= ~CAP_EXTENDED_SECURITY; @@ -614,22 +620,21 @@ signing_check: if ((secFlags & CIFSSEC_MAY_SIGN) == 0) { /* MUST_SIGN already includes the MAY_SIGN FLAG so if this is zero it means that signing is disabled */ - cFYI(1, ("Signing disabled")); + cFYI(1, "Signing disabled"); if (server->secMode & SECMODE_SIGN_REQUIRED) { - cERROR(1, ("Server requires " + cERROR(1, "Server requires " "packet signing to be enabled in " - "/proc/fs/cifs/SecurityFlags.")); + "/proc/fs/cifs/SecurityFlags."); rc = -EOPNOTSUPP; } server->secMode &= ~(SECMODE_SIGN_ENABLED | SECMODE_SIGN_REQUIRED); } else if ((secFlags & CIFSSEC_MUST_SIGN) == CIFSSEC_MUST_SIGN) { /* signing required */ - cFYI(1, ("Must sign - secFlags 0x%x", secFlags)); + cFYI(1, "Must sign - secFlags 0x%x", secFlags); if ((server->secMode & (SECMODE_SIGN_ENABLED | SECMODE_SIGN_REQUIRED)) == 0) { - cERROR(1, - ("signing required but server lacks support")); + cERROR(1, "signing required but server lacks support"); rc = -EOPNOTSUPP; } else server->secMode |= SECMODE_SIGN_REQUIRED; @@ -643,7 +648,7 @@ signing_check: neg_err_exit: cifs_buf_release(pSMB); - cFYI(1, ("negprot rc %d", rc)); + cFYI(1, "negprot rc %d", rc); return rc; } @@ -653,7 +658,7 @@ CIFSSMBTDis(const int xid, struct cifsTconInfo *tcon) struct smb_hdr *smb_buffer; int rc = 0; - cFYI(1, ("In tree disconnect")); + cFYI(1, "In tree disconnect"); /* BB: do we need to check this? These should never be NULL. */ if ((tcon->ses == NULL) || (tcon->ses->server == NULL)) @@ -675,7 +680,7 @@ CIFSSMBTDis(const int xid, struct cifsTconInfo *tcon) rc = SendReceiveNoRsp(xid, tcon->ses, smb_buffer, 0); if (rc) - cFYI(1, ("Tree disconnect failed %d", rc)); + cFYI(1, "Tree disconnect failed %d", rc); /* No need to return error on this operation if tid invalidated and closed on server already e.g. due to tcp session crashing */ @@ -691,7 +696,7 @@ CIFSSMBLogoff(const int xid, struct cifsSesInfo *ses) LOGOFF_ANDX_REQ *pSMB; int rc = 0; - cFYI(1, ("In SMBLogoff for session disconnect")); + cFYI(1, "In SMBLogoff for session disconnect"); /* * BB: do we need to check validity of ses and server? They should @@ -744,7 +749,7 @@ CIFSPOSIXDelFile(const int xid, struct cifsTconInfo *tcon, const char *fileName, int bytes_returned = 0; __u16 params, param_offset, offset, byte_count; - cFYI(1, ("In POSIX delete")); + cFYI(1, "In POSIX delete"); PsxDelete: rc = smb_init(SMB_COM_TRANSACTION2, 15, tcon, (void **) &pSMB, (void **) &pSMBr); @@ -796,7 +801,7 @@ PsxDelete: rc = SendReceive(xid, tcon->ses, (struct smb_hdr *) pSMB, (struct smb_hdr *) pSMBr, &bytes_returned, 0); if (rc) - cFYI(1, ("Posix delete returned %d", rc)); + cFYI(1, "Posix delete returned %d", rc); cifs_buf_release(pSMB); cifs_stats_inc(&tcon->num_deletes); @@ -843,7 +848,7 @@ DelFileRetry: (struct smb_hdr *) pSMBr, &bytes_returned, 0); cifs_stats_inc(&tcon->num_deletes); if (rc) - cFYI(1, ("Error in RMFile = %d", rc)); + cFYI(1, "Error in RMFile = %d", rc); cifs_buf_release(pSMB); if (rc == -EAGAIN) @@ -862,7 +867,7 @@ CIFSSMBRmDir(const int xid, struct cifsTconInfo *tcon, const char *dirName, int bytes_returned; int name_len; - cFYI(1, ("In CIFSSMBRmDir")); + cFYI(1, "In CIFSSMBRmDir"); RmDirRetry: rc = smb_init(SMB_COM_DELETE_DIRECTORY, 0, tcon, (void **) &pSMB, (void **) &pSMBr); @@ -887,7 +892,7 @@ RmDirRetry: (struct smb_hdr *) pSMBr, &bytes_returned, 0); cifs_stats_inc(&tcon->num_rmdirs); if (rc) - cFYI(1, ("Error in RMDir = %d", rc)); + cFYI(1, "Error in RMDir = %d", rc); cifs_buf_release(pSMB); if (rc == -EAGAIN) @@ -905,7 +910,7 @@ CIFSSMBMkDir(const int xid, struct cifsTconInfo *tcon, int bytes_returned; int name_len; - cFYI(1, ("In CIFSSMBMkDir")); + cFYI(1, "In CIFSSMBMkDir"); MkDirRetry: rc = smb_init(SMB_COM_CREATE_DIRECTORY, 0, tcon, (void **) &pSMB, (void **) &pSMBr); @@ -930,7 +935,7 @@ MkDirRetry: (struct smb_hdr *) pSMBr, &bytes_returned, 0); cifs_stats_inc(&tcon->num_mkdirs); if (rc) - cFYI(1, ("Error in Mkdir = %d", rc)); + cFYI(1, "Error in Mkdir = %d", rc); cifs_buf_release(pSMB); if (rc == -EAGAIN) @@ -953,7 +958,7 @@ CIFSPOSIXCreate(const int xid, struct cifsTconInfo *tcon, __u32 posix_flags, OPEN_PSX_REQ *pdata; OPEN_PSX_RSP *psx_rsp; - cFYI(1, ("In POSIX Create")); + cFYI(1, "In POSIX Create"); PsxCreat: rc = smb_init(SMB_COM_TRANSACTION2, 15, tcon, (void **) &pSMB, (void **) &pSMBr); @@ -1007,11 +1012,11 @@ PsxCreat: rc = SendReceive(xid, tcon->ses, (struct smb_hdr *) pSMB, (struct smb_hdr *) pSMBr, &bytes_returned, 0); if (rc) { - cFYI(1, ("Posix create returned %d", rc)); + cFYI(1, "Posix create returned %d", rc); goto psx_create_err; } - cFYI(1, ("copying inode info")); + cFYI(1, "copying inode info"); rc = validate_t2((struct smb_t2_rsp *)pSMBr); if (rc || (pSMBr->ByteCount < sizeof(OPEN_PSX_RSP))) { @@ -1033,11 +1038,11 @@ PsxCreat: /* check to make sure response data is there */ if (psx_rsp->ReturnedLevel != cpu_to_le16(SMB_QUERY_FILE_UNIX_BASIC)) { pRetData->Type = cpu_to_le32(-1); /* unknown */ - cFYI(DBG2, ("unknown type")); + cFYI(DBG2, "unknown type"); } else { if (pSMBr->ByteCount < sizeof(OPEN_PSX_RSP) + sizeof(FILE_UNIX_BASIC_INFO)) { - cERROR(1, ("Open response data too small")); + cERROR(1, "Open response data too small"); pRetData->Type = cpu_to_le32(-1); goto psx_create_err; } @@ -1084,7 +1089,7 @@ static __u16 convert_disposition(int disposition) ofun = SMBOPEN_OCREATE | SMBOPEN_OTRUNC; break; default: - cFYI(1, ("unknown disposition %d", disposition)); + cFYI(1, "unknown disposition %d", disposition); ofun = SMBOPEN_OAPPEND; /* regular open */ } return ofun; @@ -1175,7 +1180,7 @@ OldOpenRetry: (struct smb_hdr *)pSMBr, &bytes_returned, CIFS_LONG_OP); cifs_stats_inc(&tcon->num_opens); if (rc) { - cFYI(1, ("Error in Open = %d", rc)); + cFYI(1, "Error in Open = %d", rc); } else { /* BB verify if wct == 15 */ @@ -1288,7 +1293,7 @@ openRetry: (struct smb_hdr *)pSMBr, &bytes_returned, CIFS_LONG_OP); cifs_stats_inc(&tcon->num_opens); if (rc) { - cFYI(1, ("Error in Open = %d", rc)); + cFYI(1, "Error in Open = %d", rc); } else { *pOplock = pSMBr->OplockLevel; /* 1 byte no need to le_to_cpu */ *netfid = pSMBr->Fid; /* cifs fid stays in le */ @@ -1326,7 +1331,7 @@ CIFSSMBRead(const int xid, struct cifsTconInfo *tcon, const int netfid, int resp_buf_type = 0; struct kvec iov[1]; - cFYI(1, ("Reading %d bytes on fid %d", count, netfid)); + cFYI(1, "Reading %d bytes on fid %d", count, netfid); if (tcon->ses->capabilities & CAP_LARGE_FILES) wct = 12; else { @@ -1371,7 +1376,7 @@ CIFSSMBRead(const int xid, struct cifsTconInfo *tcon, const int netfid, cifs_stats_inc(&tcon->num_reads); pSMBr = (READ_RSP *)iov[0].iov_base; if (rc) { - cERROR(1, ("Send error in read = %d", rc)); + cERROR(1, "Send error in read = %d", rc); } else { int data_length = le16_to_cpu(pSMBr->DataLengthHigh); data_length = data_length << 16; @@ -1381,15 +1386,15 @@ CIFSSMBRead(const int xid, struct cifsTconInfo *tcon, const int netfid, /*check that DataLength would not go beyond end of SMB */ if ((data_length > CIFSMaxBufSize) || (data_length > count)) { - cFYI(1, ("bad length %d for count %d", - data_length, count)); + cFYI(1, "bad length %d for count %d", + data_length, count); rc = -EIO; *nbytes = 0; } else { pReadData = (char *) (&pSMBr->hdr.Protocol) + le16_to_cpu(pSMBr->DataOffset); /* if (rc = copy_to_user(buf, pReadData, data_length)) { - cERROR(1,("Faulting on read rc = %d",rc)); + cERROR(1, "Faulting on read rc = %d",rc); rc = -EFAULT; }*/ /* can not use copy_to_user when using page cache*/ if (*buf) @@ -1433,7 +1438,7 @@ CIFSSMBWrite(const int xid, struct cifsTconInfo *tcon, *nbytes = 0; - /* cFYI(1, ("write at %lld %d bytes", offset, count));*/ + /* cFYI(1, "write at %lld %d bytes", offset, count);*/ if (tcon->ses == NULL) return -ECONNABORTED; @@ -1514,7 +1519,7 @@ CIFSSMBWrite(const int xid, struct cifsTconInfo *tcon, (struct smb_hdr *) pSMBr, &bytes_returned, long_op); cifs_stats_inc(&tcon->num_writes); if (rc) { - cFYI(1, ("Send error in write = %d", rc)); + cFYI(1, "Send error in write = %d", rc); } else { *nbytes = le16_to_cpu(pSMBr->CountHigh); *nbytes = (*nbytes) << 16; @@ -1551,7 +1556,7 @@ CIFSSMBWrite2(const int xid, struct cifsTconInfo *tcon, *nbytes = 0; - cFYI(1, ("write2 at %lld %d bytes", (long long)offset, count)); + cFYI(1, "write2 at %lld %d bytes", (long long)offset, count); if (tcon->ses->capabilities & CAP_LARGE_FILES) { wct = 14; @@ -1606,7 +1611,7 @@ CIFSSMBWrite2(const int xid, struct cifsTconInfo *tcon, long_op); cifs_stats_inc(&tcon->num_writes); if (rc) { - cFYI(1, ("Send error Write2 = %d", rc)); + cFYI(1, "Send error Write2 = %d", rc); } else if (resp_buf_type == 0) { /* presumably this can not happen, but best to be safe */ rc = -EIO; @@ -1651,7 +1656,7 @@ CIFSSMBLock(const int xid, struct cifsTconInfo *tcon, int timeout = 0; __u16 count; - cFYI(1, ("CIFSSMBLock timeout %d numLock %d", (int)waitFlag, numLock)); + cFYI(1, "CIFSSMBLock timeout %d numLock %d", (int)waitFlag, numLock); rc = small_smb_init(SMB_COM_LOCKING_ANDX, 8, tcon, (void **) &pSMB); if (rc) @@ -1699,7 +1704,7 @@ CIFSSMBLock(const int xid, struct cifsTconInfo *tcon, } cifs_stats_inc(&tcon->num_locks); if (rc) - cFYI(1, ("Send error in Lock = %d", rc)); + cFYI(1, "Send error in Lock = %d", rc); /* Note: On -EAGAIN error only caller can retry on handle based calls since file handle passed in no longer valid */ @@ -1722,7 +1727,7 @@ CIFSSMBPosixLock(const int xid, struct cifsTconInfo *tcon, __u16 params, param_offset, offset, byte_count, count; struct kvec iov[1]; - cFYI(1, ("Posix Lock")); + cFYI(1, "Posix Lock"); if (pLockData == NULL) return -EINVAL; @@ -1792,7 +1797,7 @@ CIFSSMBPosixLock(const int xid, struct cifsTconInfo *tcon, } if (rc) { - cFYI(1, ("Send error in Posix Lock = %d", rc)); + cFYI(1, "Send error in Posix Lock = %d", rc); } else if (get_flag) { /* lock structure can be returned on get */ __u16 data_offset; @@ -1849,7 +1854,7 @@ CIFSSMBClose(const int xid, struct cifsTconInfo *tcon, int smb_file_id) { int rc = 0; CLOSE_REQ *pSMB = NULL; - cFYI(1, ("In CIFSSMBClose")); + cFYI(1, "In CIFSSMBClose"); /* do not retry on dead session on close */ rc = small_smb_init(SMB_COM_CLOSE, 3, tcon, (void **) &pSMB); @@ -1866,7 +1871,7 @@ CIFSSMBClose(const int xid, struct cifsTconInfo *tcon, int smb_file_id) if (rc) { if (rc != -EINTR) { /* EINTR is expected when user ctl-c to kill app */ - cERROR(1, ("Send error in Close = %d", rc)); + cERROR(1, "Send error in Close = %d", rc); } } @@ -1882,7 +1887,7 @@ CIFSSMBFlush(const int xid, struct cifsTconInfo *tcon, int smb_file_id) { int rc = 0; FLUSH_REQ *pSMB = NULL; - cFYI(1, ("In CIFSSMBFlush")); + cFYI(1, "In CIFSSMBFlush"); rc = small_smb_init(SMB_COM_FLUSH, 1, tcon, (void **) &pSMB); if (rc) @@ -1893,7 +1898,7 @@ CIFSSMBFlush(const int xid, struct cifsTconInfo *tcon, int smb_file_id) rc = SendReceiveNoRsp(xid, tcon->ses, (struct smb_hdr *) pSMB, 0); cifs_stats_inc(&tcon->num_flushes); if (rc) - cERROR(1, ("Send error in Flush = %d", rc)); + cERROR(1, "Send error in Flush = %d", rc); return rc; } @@ -1910,7 +1915,7 @@ CIFSSMBRename(const int xid, struct cifsTconInfo *tcon, int name_len, name_len2; __u16 count; - cFYI(1, ("In CIFSSMBRename")); + cFYI(1, "In CIFSSMBRename"); renameRetry: rc = smb_init(SMB_COM_RENAME, 1, tcon, (void **) &pSMB, (void **) &pSMBr); @@ -1956,7 +1961,7 @@ renameRetry: (struct smb_hdr *) pSMBr, &bytes_returned, 0); cifs_stats_inc(&tcon->num_renames); if (rc) - cFYI(1, ("Send error in rename = %d", rc)); + cFYI(1, "Send error in rename = %d", rc); cifs_buf_release(pSMB); @@ -1980,7 +1985,7 @@ int CIFSSMBRenameOpenFile(const int xid, struct cifsTconInfo *pTcon, int len_of_str; __u16 params, param_offset, offset, count, byte_count; - cFYI(1, ("Rename to File by handle")); + cFYI(1, "Rename to File by handle"); rc = smb_init(SMB_COM_TRANSACTION2, 15, pTcon, (void **) &pSMB, (void **) &pSMBr); if (rc) @@ -2035,7 +2040,7 @@ int CIFSSMBRenameOpenFile(const int xid, struct cifsTconInfo *pTcon, (struct smb_hdr *) pSMBr, &bytes_returned, 0); cifs_stats_inc(&pTcon->num_t2renames); if (rc) - cFYI(1, ("Send error in Rename (by file handle) = %d", rc)); + cFYI(1, "Send error in Rename (by file handle) = %d", rc); cifs_buf_release(pSMB); @@ -2057,7 +2062,7 @@ CIFSSMBCopy(const int xid, struct cifsTconInfo *tcon, const char *fromName, int name_len, name_len2; __u16 count; - cFYI(1, ("In CIFSSMBCopy")); + cFYI(1, "In CIFSSMBCopy"); copyRetry: rc = smb_init(SMB_COM_COPY, 1, tcon, (void **) &pSMB, (void **) &pSMBr); @@ -2102,8 +2107,8 @@ copyRetry: rc = SendReceive(xid, tcon->ses, (struct smb_hdr *) pSMB, (struct smb_hdr *) pSMBr, &bytes_returned, 0); if (rc) { - cFYI(1, ("Send error in copy = %d with %d files copied", - rc, le16_to_cpu(pSMBr->CopyCount))); + cFYI(1, "Send error in copy = %d with %d files copied", + rc, le16_to_cpu(pSMBr->CopyCount)); } cifs_buf_release(pSMB); @@ -2127,7 +2132,7 @@ CIFSUnixCreateSymLink(const int xid, struct cifsTconInfo *tcon, int bytes_returned = 0; __u16 params, param_offset, offset, byte_count; - cFYI(1, ("In Symlink Unix style")); + cFYI(1, "In Symlink Unix style"); createSymLinkRetry: rc = smb_init(SMB_COM_TRANSACTION2, 15, tcon, (void **) &pSMB, (void **) &pSMBr); @@ -2192,7 +2197,7 @@ createSymLinkRetry: (struct smb_hdr *) pSMBr, &bytes_returned, 0); cifs_stats_inc(&tcon->num_symlinks); if (rc) - cFYI(1, ("Send error in SetPathInfo create symlink = %d", rc)); + cFYI(1, "Send error in SetPathInfo create symlink = %d", rc); cifs_buf_release(pSMB); @@ -2216,7 +2221,7 @@ CIFSUnixCreateHardLink(const int xid, struct cifsTconInfo *tcon, int bytes_returned = 0; __u16 params, param_offset, offset, byte_count; - cFYI(1, ("In Create Hard link Unix style")); + cFYI(1, "In Create Hard link Unix style"); createHardLinkRetry: rc = smb_init(SMB_COM_TRANSACTION2, 15, tcon, (void **) &pSMB, (void **) &pSMBr); @@ -2278,7 +2283,7 @@ createHardLinkRetry: (struct smb_hdr *) pSMBr, &bytes_returned, 0); cifs_stats_inc(&tcon->num_hardlinks); if (rc) - cFYI(1, ("Send error in SetPathInfo (hard link) = %d", rc)); + cFYI(1, "Send error in SetPathInfo (hard link) = %d", rc); cifs_buf_release(pSMB); if (rc == -EAGAIN) @@ -2299,7 +2304,7 @@ CIFSCreateHardLink(const int xid, struct cifsTconInfo *tcon, int name_len, name_len2; __u16 count; - cFYI(1, ("In CIFSCreateHardLink")); + cFYI(1, "In CIFSCreateHardLink"); winCreateHardLinkRetry: rc = smb_init(SMB_COM_NT_RENAME, 4, tcon, (void **) &pSMB, @@ -2350,7 +2355,7 @@ winCreateHardLinkRetry: (struct smb_hdr *) pSMBr, &bytes_returned, 0); cifs_stats_inc(&tcon->num_hardlinks); if (rc) - cFYI(1, ("Send error in hard link (NT rename) = %d", rc)); + cFYI(1, "Send error in hard link (NT rename) = %d", rc); cifs_buf_release(pSMB); if (rc == -EAGAIN) @@ -2373,7 +2378,7 @@ CIFSSMBUnixQuerySymLink(const int xid, struct cifsTconInfo *tcon, __u16 params, byte_count; char *data_start; - cFYI(1, ("In QPathSymLinkInfo (Unix) for path %s", searchName)); + cFYI(1, "In QPathSymLinkInfo (Unix) for path %s", searchName); querySymLinkRetry: rc = smb_init(SMB_COM_TRANSACTION2, 15, tcon, (void **) &pSMB, @@ -2420,7 +2425,7 @@ querySymLinkRetry: rc = SendReceive(xid, tcon->ses, (struct smb_hdr *) pSMB, (struct smb_hdr *) pSMBr, &bytes_returned, 0); if (rc) { - cFYI(1, ("Send error in QuerySymLinkInfo = %d", rc)); + cFYI(1, "Send error in QuerySymLinkInfo = %d", rc); } else { /* decode response */ @@ -2521,21 +2526,21 @@ validate_ntransact(char *buf, char **ppparm, char **ppdata, /* should we also check that parm and data areas do not overlap? */ if (*ppparm > end_of_smb) { - cFYI(1, ("parms start after end of smb")); + cFYI(1, "parms start after end of smb"); return -EINVAL; } else if (parm_count + *ppparm > end_of_smb) { - cFYI(1, ("parm end after end of smb")); + cFYI(1, "parm end after end of smb"); return -EINVAL; } else if (*ppdata > end_of_smb) { - cFYI(1, ("data starts after end of smb")); + cFYI(1, "data starts after end of smb"); return -EINVAL; } else if (data_count + *ppdata > end_of_smb) { - cFYI(1, ("data %p + count %d (%p) ends after end of smb %p start %p", + cFYI(1, "data %p + count %d (%p) past smb end %p start %p", *ppdata, data_count, (data_count + *ppdata), - end_of_smb, pSMBr)); + end_of_smb, pSMBr); return -EINVAL; } else if (parm_count + data_count > pSMBr->ByteCount) { - cFYI(1, ("parm count and data count larger than SMB")); + cFYI(1, "parm count and data count larger than SMB"); return -EINVAL; } *pdatalen = data_count; @@ -2554,7 +2559,7 @@ CIFSSMBQueryReparseLinkInfo(const int xid, struct cifsTconInfo *tcon, struct smb_com_transaction_ioctl_req *pSMB; struct smb_com_transaction_ioctl_rsp *pSMBr; - cFYI(1, ("In Windows reparse style QueryLink for path %s", searchName)); + cFYI(1, "In Windows reparse style QueryLink for path %s", searchName); rc = smb_init(SMB_COM_NT_TRANSACT, 23, tcon, (void **) &pSMB, (void **) &pSMBr); if (rc) @@ -2583,7 +2588,7 @@ CIFSSMBQueryReparseLinkInfo(const int xid, struct cifsTconInfo *tcon, rc = SendReceive(xid, tcon->ses, (struct smb_hdr *) pSMB, (struct smb_hdr *) pSMBr, &bytes_returned, 0); if (rc) { - cFYI(1, ("Send error in QueryReparseLinkInfo = %d", rc)); + cFYI(1, "Send error in QueryReparseLinkInfo = %d", rc); } else { /* decode response */ __u32 data_offset = le32_to_cpu(pSMBr->DataOffset); __u32 data_count = le32_to_cpu(pSMBr->DataCount); @@ -2607,7 +2612,7 @@ CIFSSMBQueryReparseLinkInfo(const int xid, struct cifsTconInfo *tcon, if ((reparse_buf->LinkNamesBuf + reparse_buf->TargetNameOffset + reparse_buf->TargetNameLen) > end_of_smb) { - cFYI(1, ("reparse buf beyond SMB")); + cFYI(1, "reparse buf beyond SMB"); rc = -EIO; goto qreparse_out; } @@ -2628,12 +2633,12 @@ CIFSSMBQueryReparseLinkInfo(const int xid, struct cifsTconInfo *tcon, } } else { rc = -EIO; - cFYI(1, ("Invalid return data count on " - "get reparse info ioctl")); + cFYI(1, "Invalid return data count on " + "get reparse info ioctl"); } symlinkinfo[buflen] = 0; /* just in case so the caller does not go off the end of the buffer */ - cFYI(1, ("readlink result - %s", symlinkinfo)); + cFYI(1, "readlink result - %s", symlinkinfo); } qreparse_out: @@ -2656,7 +2661,7 @@ static void cifs_convert_ace(posix_acl_xattr_entry *ace, ace->e_perm = cpu_to_le16(cifs_ace->cifs_e_perm); ace->e_tag = cpu_to_le16(cifs_ace->cifs_e_tag); ace->e_id = cpu_to_le32(le64_to_cpu(cifs_ace->cifs_uid)); - /* cFYI(1,("perm %d tag %d id %d",ace->e_perm,ace->e_tag,ace->e_id)); */ + /* cFYI(1, "perm %d tag %d id %d",ace->e_perm,ace->e_tag,ace->e_id); */ return; } @@ -2682,8 +2687,8 @@ static int cifs_copy_posix_acl(char *trgt, char *src, const int buflen, size += sizeof(struct cifs_posix_ace) * count; /* check if we would go beyond end of SMB */ if (size_of_data_area < size) { - cFYI(1, ("bad CIFS POSIX ACL size %d vs. %d", - size_of_data_area, size)); + cFYI(1, "bad CIFS POSIX ACL size %d vs. %d", + size_of_data_area, size); return -EINVAL; } } else if (acl_type & ACL_TYPE_DEFAULT) { @@ -2730,7 +2735,7 @@ static __u16 convert_ace_to_cifs_ace(struct cifs_posix_ace *cifs_ace, cifs_ace->cifs_uid = cpu_to_le64(-1); } else cifs_ace->cifs_uid = cpu_to_le64(le32_to_cpu(local_ace->e_id)); - /*cFYI(1,("perm %d tag %d id %d",ace->e_perm,ace->e_tag,ace->e_id));*/ + /*cFYI(1, "perm %d tag %d id %d",ace->e_perm,ace->e_tag,ace->e_id);*/ return rc; } @@ -2748,12 +2753,12 @@ static __u16 ACL_to_cifs_posix(char *parm_data, const char *pACL, return 0; count = posix_acl_xattr_count((size_t)buflen); - cFYI(1, ("setting acl with %d entries from buf of length %d and " + cFYI(1, "setting acl with %d entries from buf of length %d and " "version of %d", - count, buflen, le32_to_cpu(local_acl->a_version))); + count, buflen, le32_to_cpu(local_acl->a_version)); if (le32_to_cpu(local_acl->a_version) != 2) { - cFYI(1, ("unknown POSIX ACL version %d", - le32_to_cpu(local_acl->a_version))); + cFYI(1, "unknown POSIX ACL version %d", + le32_to_cpu(local_acl->a_version)); return 0; } cifs_acl->version = cpu_to_le16(1); @@ -2762,7 +2767,7 @@ static __u16 ACL_to_cifs_posix(char *parm_data, const char *pACL, else if (acl_type == ACL_TYPE_DEFAULT) cifs_acl->default_entry_count = cpu_to_le16(count); else { - cFYI(1, ("unknown ACL type %d", acl_type)); + cFYI(1, "unknown ACL type %d", acl_type); return 0; } for (i = 0; i < count; i++) { @@ -2795,7 +2800,7 @@ CIFSSMBGetPosixACL(const int xid, struct cifsTconInfo *tcon, int name_len; __u16 params, byte_count; - cFYI(1, ("In GetPosixACL (Unix) for path %s", searchName)); + cFYI(1, "In GetPosixACL (Unix) for path %s", searchName); queryAclRetry: rc = smb_init(SMB_COM_TRANSACTION2, 15, tcon, (void **) &pSMB, @@ -2847,7 +2852,7 @@ queryAclRetry: (struct smb_hdr *) pSMBr, &bytes_returned, 0); cifs_stats_inc(&tcon->num_acl_get); if (rc) { - cFYI(1, ("Send error in Query POSIX ACL = %d", rc)); + cFYI(1, "Send error in Query POSIX ACL = %d", rc); } else { /* decode response */ @@ -2884,7 +2889,7 @@ CIFSSMBSetPosixACL(const int xid, struct cifsTconInfo *tcon, int bytes_returned = 0; __u16 params, byte_count, data_count, param_offset, offset; - cFYI(1, ("In SetPosixACL (Unix) for path %s", fileName)); + cFYI(1, "In SetPosixACL (Unix) for path %s", fileName); setAclRetry: rc = smb_init(SMB_COM_TRANSACTION2, 15, tcon, (void **) &pSMB, (void **) &pSMBr); @@ -2939,7 +2944,7 @@ setAclRetry: rc = SendReceive(xid, tcon->ses, (struct smb_hdr *) pSMB, (struct smb_hdr *) pSMBr, &bytes_returned, 0); if (rc) - cFYI(1, ("Set POSIX ACL returned %d", rc)); + cFYI(1, "Set POSIX ACL returned %d", rc); setACLerrorExit: cifs_buf_release(pSMB); @@ -2959,7 +2964,7 @@ CIFSGetExtAttr(const int xid, struct cifsTconInfo *tcon, int bytes_returned; __u16 params, byte_count; - cFYI(1, ("In GetExtAttr")); + cFYI(1, "In GetExtAttr"); if (tcon == NULL) return -ENODEV; @@ -2998,7 +3003,7 @@ GetExtAttrRetry: rc = SendReceive(xid, tcon->ses, (struct smb_hdr *) pSMB, (struct smb_hdr *) pSMBr, &bytes_returned, 0); if (rc) { - cFYI(1, ("error %d in GetExtAttr", rc)); + cFYI(1, "error %d in GetExtAttr", rc); } else { /* decode response */ rc = validate_t2((struct smb_t2_rsp *)pSMBr); @@ -3013,7 +3018,7 @@ GetExtAttrRetry: struct file_chattr_info *pfinfo; /* BB Do we need a cast or hash here ? */ if (count != 16) { - cFYI(1, ("Illegal size ret in GetExtAttr")); + cFYI(1, "Illegal size ret in GetExtAttr"); rc = -EIO; goto GetExtAttrOut; } @@ -3043,7 +3048,7 @@ CIFSSMBGetCIFSACL(const int xid, struct cifsTconInfo *tcon, __u16 fid, QUERY_SEC_DESC_REQ *pSMB; struct kvec iov[1]; - cFYI(1, ("GetCifsACL")); + cFYI(1, "GetCifsACL"); *pbuflen = 0; *acl_inf = NULL; @@ -3068,7 +3073,7 @@ CIFSSMBGetCIFSACL(const int xid, struct cifsTconInfo *tcon, __u16 fid, CIFS_STD_OP); cifs_stats_inc(&tcon->num_acl_get); if (rc) { - cFYI(1, ("Send error in QuerySecDesc = %d", rc)); + cFYI(1, "Send error in QuerySecDesc = %d", rc); } else { /* decode response */ __le32 *parm; __u32 parm_len; @@ -3083,7 +3088,7 @@ CIFSSMBGetCIFSACL(const int xid, struct cifsTconInfo *tcon, __u16 fid, goto qsec_out; pSMBr = (struct smb_com_ntransact_rsp *)iov[0].iov_base; - cFYI(1, ("smb %p parm %p data %p", pSMBr, parm, *acl_inf)); + cFYI(1, "smb %p parm %p data %p", pSMBr, parm, *acl_inf); if (le32_to_cpu(pSMBr->ParameterCount) != 4) { rc = -EIO; /* bad smb */ @@ -3095,8 +3100,8 @@ CIFSSMBGetCIFSACL(const int xid, struct cifsTconInfo *tcon, __u16 fid, acl_len = le32_to_cpu(*parm); if (acl_len != *pbuflen) { - cERROR(1, ("acl length %d does not match %d", - acl_len, *pbuflen)); + cERROR(1, "acl length %d does not match %d", + acl_len, *pbuflen); if (*pbuflen > acl_len) *pbuflen = acl_len; } @@ -3105,7 +3110,7 @@ CIFSSMBGetCIFSACL(const int xid, struct cifsTconInfo *tcon, __u16 fid, header followed by the smallest SID */ if ((*pbuflen < sizeof(struct cifs_ntsd) + 8) || (*pbuflen >= 64 * 1024)) { - cERROR(1, ("bad acl length %d", *pbuflen)); + cERROR(1, "bad acl length %d", *pbuflen); rc = -EINVAL; *pbuflen = 0; } else { @@ -3179,9 +3184,9 @@ setCifsAclRetry: rc = SendReceive(xid, tcon->ses, (struct smb_hdr *) pSMB, (struct smb_hdr *) pSMBr, &bytes_returned, 0); - cFYI(1, ("SetCIFSACL bytes_returned: %d, rc: %d", bytes_returned, rc)); + cFYI(1, "SetCIFSACL bytes_returned: %d, rc: %d", bytes_returned, rc); if (rc) - cFYI(1, ("Set CIFS ACL returned %d", rc)); + cFYI(1, "Set CIFS ACL returned %d", rc); cifs_buf_release(pSMB); if (rc == -EAGAIN) @@ -3205,7 +3210,7 @@ int SMBQueryInformation(const int xid, struct cifsTconInfo *tcon, int bytes_returned; int name_len; - cFYI(1, ("In SMBQPath path %s", searchName)); + cFYI(1, "In SMBQPath path %s", searchName); QInfRetry: rc = smb_init(SMB_COM_QUERY_INFORMATION, 0, tcon, (void **) &pSMB, (void **) &pSMBr); @@ -3231,7 +3236,7 @@ QInfRetry: rc = SendReceive(xid, tcon->ses, (struct smb_hdr *) pSMB, (struct smb_hdr *) pSMBr, &bytes_returned, 0); if (rc) { - cFYI(1, ("Send error in QueryInfo = %d", rc)); + cFYI(1, "Send error in QueryInfo = %d", rc); } else if (pFinfo) { struct timespec ts; __u32 time = le32_to_cpu(pSMBr->last_write_time); @@ -3305,7 +3310,7 @@ QFileInfoRetry: rc = SendReceive(xid, tcon->ses, (struct smb_hdr *) pSMB, (struct smb_hdr *) pSMBr, &bytes_returned, 0); if (rc) { - cFYI(1, ("Send error in QPathInfo = %d", rc)); + cFYI(1, "Send error in QPathInfo = %d", rc); } else { /* decode response */ rc = validate_t2((struct smb_t2_rsp *)pSMBr); @@ -3343,7 +3348,7 @@ CIFSSMBQPathInfo(const int xid, struct cifsTconInfo *tcon, int name_len; __u16 params, byte_count; -/* cFYI(1, ("In QPathInfo path %s", searchName)); */ +/* cFYI(1, "In QPathInfo path %s", searchName); */ QPathInfoRetry: rc = smb_init(SMB_COM_TRANSACTION2, 15, tcon, (void **) &pSMB, (void **) &pSMBr); @@ -3393,7 +3398,7 @@ QPathInfoRetry: rc = SendReceive(xid, tcon->ses, (struct smb_hdr *) pSMB, (struct smb_hdr *) pSMBr, &bytes_returned, 0); if (rc) { - cFYI(1, ("Send error in QPathInfo = %d", rc)); + cFYI(1, "Send error in QPathInfo = %d", rc); } else { /* decode response */ rc = validate_t2((struct smb_t2_rsp *)pSMBr); @@ -3473,14 +3478,14 @@ UnixQFileInfoRetry: rc = SendReceive(xid, tcon->ses, (struct smb_hdr *) pSMB, (struct smb_hdr *) pSMBr, &bytes_returned, 0); if (rc) { - cFYI(1, ("Send error in QPathInfo = %d", rc)); + cFYI(1, "Send error in QPathInfo = %d", rc); } else { /* decode response */ rc = validate_t2((struct smb_t2_rsp *)pSMBr); if (rc || (pSMBr->ByteCount < sizeof(FILE_UNIX_BASIC_INFO))) { - cERROR(1, ("Malformed FILE_UNIX_BASIC_INFO response.\n" + cERROR(1, "Malformed FILE_UNIX_BASIC_INFO response.\n" "Unix Extensions can be disabled on mount " - "by specifying the nosfu mount option.")); + "by specifying the nosfu mount option."); rc = -EIO; /* bad smb */ } else { __u16 data_offset = le16_to_cpu(pSMBr->t2.DataOffset); @@ -3512,7 +3517,7 @@ CIFSSMBUnixQPathInfo(const int xid, struct cifsTconInfo *tcon, int name_len; __u16 params, byte_count; - cFYI(1, ("In QPathInfo (Unix) the path %s", searchName)); + cFYI(1, "In QPathInfo (Unix) the path %s", searchName); UnixQPathInfoRetry: rc = smb_init(SMB_COM_TRANSACTION2, 15, tcon, (void **) &pSMB, (void **) &pSMBr); @@ -3559,14 +3564,14 @@ UnixQPathInfoRetry: rc = SendReceive(xid, tcon->ses, (struct smb_hdr *) pSMB, (struct smb_hdr *) pSMBr, &bytes_returned, 0); if (rc) { - cFYI(1, ("Send error in QPathInfo = %d", rc)); + cFYI(1, "Send error in QPathInfo = %d", rc); } else { /* decode response */ rc = validate_t2((struct smb_t2_rsp *)pSMBr); if (rc || (pSMBr->ByteCount < sizeof(FILE_UNIX_BASIC_INFO))) { - cERROR(1, ("Malformed FILE_UNIX_BASIC_INFO response.\n" + cERROR(1, "Malformed FILE_UNIX_BASIC_INFO response.\n" "Unix Extensions can be disabled on mount " - "by specifying the nosfu mount option.")); + "by specifying the nosfu mount option."); rc = -EIO; /* bad smb */ } else { __u16 data_offset = le16_to_cpu(pSMBr->t2.DataOffset); @@ -3600,7 +3605,7 @@ CIFSFindFirst(const int xid, struct cifsTconInfo *tcon, int name_len; __u16 params, byte_count; - cFYI(1, ("In FindFirst for %s", searchName)); + cFYI(1, "In FindFirst for %s", searchName); findFirstRetry: rc = smb_init(SMB_COM_TRANSACTION2, 15, tcon, (void **) &pSMB, @@ -3677,7 +3682,7 @@ findFirstRetry: if (rc) {/* BB add logic to retry regular search if Unix search rejected unexpectedly by server */ /* BB Add code to handle unsupported level rc */ - cFYI(1, ("Error in FindFirst = %d", rc)); + cFYI(1, "Error in FindFirst = %d", rc); cifs_buf_release(pSMB); @@ -3716,7 +3721,7 @@ findFirstRetry: lnoff = le16_to_cpu(parms->LastNameOffset); if (tcon->ses->server->maxBuf - MAX_CIFS_HDR_SIZE < lnoff) { - cERROR(1, ("ignoring corrupt resume name")); + cERROR(1, "ignoring corrupt resume name"); psrch_inf->last_entry = NULL; return rc; } @@ -3744,7 +3749,7 @@ int CIFSFindNext(const int xid, struct cifsTconInfo *tcon, int bytes_returned, name_len; __u16 params, byte_count; - cFYI(1, ("In FindNext")); + cFYI(1, "In FindNext"); if (psrch_inf->endOfSearch) return -ENOENT; @@ -3808,7 +3813,7 @@ int CIFSFindNext(const int xid, struct cifsTconInfo *tcon, cifs_buf_release(pSMB); rc = 0; /* search probably was closed at end of search*/ } else - cFYI(1, ("FindNext returned = %d", rc)); + cFYI(1, "FindNext returned = %d", rc); } else { /* decode response */ rc = validate_t2((struct smb_t2_rsp *)pSMBr); @@ -3844,15 +3849,15 @@ int CIFSFindNext(const int xid, struct cifsTconInfo *tcon, lnoff = le16_to_cpu(parms->LastNameOffset); if (tcon->ses->server->maxBuf - MAX_CIFS_HDR_SIZE < lnoff) { - cERROR(1, ("ignoring corrupt resume name")); + cERROR(1, "ignoring corrupt resume name"); psrch_inf->last_entry = NULL; return rc; } else psrch_inf->last_entry = psrch_inf->srch_entries_start + lnoff; -/* cFYI(1,("fnxt2 entries in buf %d index_of_last %d", - psrch_inf->entries_in_buffer, psrch_inf->index_of_last_entry)); */ +/* cFYI(1, "fnxt2 entries in buf %d index_of_last %d", + psrch_inf->entries_in_buffer, psrch_inf->index_of_last_entry); */ /* BB fixme add unlock here */ } @@ -3877,7 +3882,7 @@ CIFSFindClose(const int xid, struct cifsTconInfo *tcon, int rc = 0; FINDCLOSE_REQ *pSMB = NULL; - cFYI(1, ("In CIFSSMBFindClose")); + cFYI(1, "In CIFSSMBFindClose"); rc = small_smb_init(SMB_COM_FIND_CLOSE2, 1, tcon, (void **)&pSMB); /* no sense returning error if session restarted @@ -3891,7 +3896,7 @@ CIFSFindClose(const int xid, struct cifsTconInfo *tcon, pSMB->ByteCount = 0; rc = SendReceiveNoRsp(xid, tcon->ses, (struct smb_hdr *) pSMB, 0); if (rc) - cERROR(1, ("Send error in FindClose = %d", rc)); + cERROR(1, "Send error in FindClose = %d", rc); cifs_stats_inc(&tcon->num_fclose); @@ -3914,7 +3919,7 @@ CIFSGetSrvInodeNumber(const int xid, struct cifsTconInfo *tcon, int name_len, bytes_returned; __u16 params, byte_count; - cFYI(1, ("In GetSrvInodeNum for %s", searchName)); + cFYI(1, "In GetSrvInodeNum for %s", searchName); if (tcon == NULL) return -ENODEV; @@ -3964,7 +3969,7 @@ GetInodeNumberRetry: rc = SendReceive(xid, tcon->ses, (struct smb_hdr *) pSMB, (struct smb_hdr *) pSMBr, &bytes_returned, 0); if (rc) { - cFYI(1, ("error %d in QueryInternalInfo", rc)); + cFYI(1, "error %d in QueryInternalInfo", rc); } else { /* decode response */ rc = validate_t2((struct smb_t2_rsp *)pSMBr); @@ -3979,7 +3984,7 @@ GetInodeNumberRetry: struct file_internal_info *pfinfo; /* BB Do we need a cast or hash here ? */ if (count < 8) { - cFYI(1, ("Illegal size ret in QryIntrnlInf")); + cFYI(1, "Illegal size ret in QryIntrnlInf"); rc = -EIO; goto GetInodeNumOut; } @@ -4020,16 +4025,16 @@ parse_DFS_referrals(TRANSACTION2_GET_DFS_REFER_RSP *pSMBr, *num_of_nodes = le16_to_cpu(pSMBr->NumberOfReferrals); if (*num_of_nodes < 1) { - cERROR(1, ("num_referrals: must be at least > 0," - "but we get num_referrals = %d\n", *num_of_nodes)); + cERROR(1, "num_referrals: must be at least > 0," + "but we get num_referrals = %d\n", *num_of_nodes); rc = -EINVAL; goto parse_DFS_referrals_exit; } ref = (struct dfs_referral_level_3 *) &(pSMBr->referrals); if (ref->VersionNumber != cpu_to_le16(3)) { - cERROR(1, ("Referrals of V%d version are not supported," - "should be V3", le16_to_cpu(ref->VersionNumber))); + cERROR(1, "Referrals of V%d version are not supported," + "should be V3", le16_to_cpu(ref->VersionNumber)); rc = -EINVAL; goto parse_DFS_referrals_exit; } @@ -4038,14 +4043,14 @@ parse_DFS_referrals(TRANSACTION2_GET_DFS_REFER_RSP *pSMBr, data_end = (char *)(&(pSMBr->PathConsumed)) + le16_to_cpu(pSMBr->t2.DataCount); - cFYI(1, ("num_referrals: %d dfs flags: 0x%x ... \n", + cFYI(1, "num_referrals: %d dfs flags: 0x%x ...\n", *num_of_nodes, - le32_to_cpu(pSMBr->DFSFlags))); + le32_to_cpu(pSMBr->DFSFlags)); *target_nodes = kzalloc(sizeof(struct dfs_info3_param) * *num_of_nodes, GFP_KERNEL); if (*target_nodes == NULL) { - cERROR(1, ("Failed to allocate buffer for target_nodes\n")); + cERROR(1, "Failed to allocate buffer for target_nodes\n"); rc = -ENOMEM; goto parse_DFS_referrals_exit; } @@ -4121,7 +4126,7 @@ CIFSGetDFSRefer(const int xid, struct cifsSesInfo *ses, *num_of_nodes = 0; *target_nodes = NULL; - cFYI(1, ("In GetDFSRefer the path %s", searchName)); + cFYI(1, "In GetDFSRefer the path %s", searchName); if (ses == NULL) return -ENODEV; getDFSRetry: @@ -4188,7 +4193,7 @@ getDFSRetry: rc = SendReceive(xid, ses, (struct smb_hdr *) pSMB, (struct smb_hdr *) pSMBr, &bytes_returned, 0); if (rc) { - cFYI(1, ("Send error in GetDFSRefer = %d", rc)); + cFYI(1, "Send error in GetDFSRefer = %d", rc); goto GetDFSRefExit; } rc = validate_t2((struct smb_t2_rsp *)pSMBr); @@ -4199,9 +4204,9 @@ getDFSRetry: goto GetDFSRefExit; } - cFYI(1, ("Decoding GetDFSRefer response BCC: %d Offset %d", + cFYI(1, "Decoding GetDFSRefer response BCC: %d Offset %d", pSMBr->ByteCount, - le16_to_cpu(pSMBr->t2.DataOffset))); + le16_to_cpu(pSMBr->t2.DataOffset)); /* parse returned result into more usable form */ rc = parse_DFS_referrals(pSMBr, num_of_nodes, @@ -4229,7 +4234,7 @@ SMBOldQFSInfo(const int xid, struct cifsTconInfo *tcon, struct kstatfs *FSData) int bytes_returned = 0; __u16 params, byte_count; - cFYI(1, ("OldQFSInfo")); + cFYI(1, "OldQFSInfo"); oldQFSInfoRetry: rc = smb_init(SMB_COM_TRANSACTION2, 15, tcon, (void **) &pSMB, (void **) &pSMBr); @@ -4262,7 +4267,7 @@ oldQFSInfoRetry: rc = SendReceive(xid, tcon->ses, (struct smb_hdr *) pSMB, (struct smb_hdr *) pSMBr, &bytes_returned, 0); if (rc) { - cFYI(1, ("Send error in QFSInfo = %d", rc)); + cFYI(1, "Send error in QFSInfo = %d", rc); } else { /* decode response */ rc = validate_t2((struct smb_t2_rsp *)pSMBr); @@ -4270,8 +4275,8 @@ oldQFSInfoRetry: rc = -EIO; /* bad smb */ else { __u16 data_offset = le16_to_cpu(pSMBr->t2.DataOffset); - cFYI(1, ("qfsinf resp BCC: %d Offset %d", - pSMBr->ByteCount, data_offset)); + cFYI(1, "qfsinf resp BCC: %d Offset %d", + pSMBr->ByteCount, data_offset); response_data = (FILE_SYSTEM_ALLOC_INFO *) (((char *) &pSMBr->hdr.Protocol) + data_offset); @@ -4283,11 +4288,10 @@ oldQFSInfoRetry: le32_to_cpu(response_data->TotalAllocationUnits); FSData->f_bfree = FSData->f_bavail = le32_to_cpu(response_data->FreeAllocationUnits); - cFYI(1, - ("Blocks: %lld Free: %lld Block size %ld", - (unsigned long long)FSData->f_blocks, - (unsigned long long)FSData->f_bfree, - FSData->f_bsize)); + cFYI(1, "Blocks: %lld Free: %lld Block size %ld", + (unsigned long long)FSData->f_blocks, + (unsigned long long)FSData->f_bfree, + FSData->f_bsize); } } cifs_buf_release(pSMB); @@ -4309,7 +4313,7 @@ CIFSSMBQFSInfo(const int xid, struct cifsTconInfo *tcon, struct kstatfs *FSData) int bytes_returned = 0; __u16 params, byte_count; - cFYI(1, ("In QFSInfo")); + cFYI(1, "In QFSInfo"); QFSInfoRetry: rc = smb_init(SMB_COM_TRANSACTION2, 15, tcon, (void **) &pSMB, (void **) &pSMBr); @@ -4342,7 +4346,7 @@ QFSInfoRetry: rc = SendReceive(xid, tcon->ses, (struct smb_hdr *) pSMB, (struct smb_hdr *) pSMBr, &bytes_returned, 0); if (rc) { - cFYI(1, ("Send error in QFSInfo = %d", rc)); + cFYI(1, "Send error in QFSInfo = %d", rc); } else { /* decode response */ rc = validate_t2((struct smb_t2_rsp *)pSMBr); @@ -4363,11 +4367,10 @@ QFSInfoRetry: le64_to_cpu(response_data->TotalAllocationUnits); FSData->f_bfree = FSData->f_bavail = le64_to_cpu(response_data->FreeAllocationUnits); - cFYI(1, - ("Blocks: %lld Free: %lld Block size %ld", - (unsigned long long)FSData->f_blocks, - (unsigned long long)FSData->f_bfree, - FSData->f_bsize)); + cFYI(1, "Blocks: %lld Free: %lld Block size %ld", + (unsigned long long)FSData->f_blocks, + (unsigned long long)FSData->f_bfree, + FSData->f_bsize); } } cifs_buf_release(pSMB); @@ -4389,7 +4392,7 @@ CIFSSMBQFSAttributeInfo(const int xid, struct cifsTconInfo *tcon) int bytes_returned = 0; __u16 params, byte_count; - cFYI(1, ("In QFSAttributeInfo")); + cFYI(1, "In QFSAttributeInfo"); QFSAttributeRetry: rc = smb_init(SMB_COM_TRANSACTION2, 15, tcon, (void **) &pSMB, (void **) &pSMBr); @@ -4423,7 +4426,7 @@ QFSAttributeRetry: rc = SendReceive(xid, tcon->ses, (struct smb_hdr *) pSMB, (struct smb_hdr *) pSMBr, &bytes_returned, 0); if (rc) { - cERROR(1, ("Send error in QFSAttributeInfo = %d", rc)); + cERROR(1, "Send error in QFSAttributeInfo = %d", rc); } else { /* decode response */ rc = validate_t2((struct smb_t2_rsp *)pSMBr); @@ -4459,7 +4462,7 @@ CIFSSMBQFSDeviceInfo(const int xid, struct cifsTconInfo *tcon) int bytes_returned = 0; __u16 params, byte_count; - cFYI(1, ("In QFSDeviceInfo")); + cFYI(1, "In QFSDeviceInfo"); QFSDeviceRetry: rc = smb_init(SMB_COM_TRANSACTION2, 15, tcon, (void **) &pSMB, (void **) &pSMBr); @@ -4494,7 +4497,7 @@ QFSDeviceRetry: rc = SendReceive(xid, tcon->ses, (struct smb_hdr *) pSMB, (struct smb_hdr *) pSMBr, &bytes_returned, 0); if (rc) { - cFYI(1, ("Send error in QFSDeviceInfo = %d", rc)); + cFYI(1, "Send error in QFSDeviceInfo = %d", rc); } else { /* decode response */ rc = validate_t2((struct smb_t2_rsp *)pSMBr); @@ -4529,7 +4532,7 @@ CIFSSMBQFSUnixInfo(const int xid, struct cifsTconInfo *tcon) int bytes_returned = 0; __u16 params, byte_count; - cFYI(1, ("In QFSUnixInfo")); + cFYI(1, "In QFSUnixInfo"); QFSUnixRetry: rc = smb_init(SMB_COM_TRANSACTION2, 15, tcon, (void **) &pSMB, (void **) &pSMBr); @@ -4563,7 +4566,7 @@ QFSUnixRetry: rc = SendReceive(xid, tcon->ses, (struct smb_hdr *) pSMB, (struct smb_hdr *) pSMBr, &bytes_returned, 0); if (rc) { - cERROR(1, ("Send error in QFSUnixInfo = %d", rc)); + cERROR(1, "Send error in QFSUnixInfo = %d", rc); } else { /* decode response */ rc = validate_t2((struct smb_t2_rsp *)pSMBr); @@ -4598,7 +4601,7 @@ CIFSSMBSetFSUnixInfo(const int xid, struct cifsTconInfo *tcon, __u64 cap) int bytes_returned = 0; __u16 params, param_offset, offset, byte_count; - cFYI(1, ("In SETFSUnixInfo")); + cFYI(1, "In SETFSUnixInfo"); SETFSUnixRetry: /* BB switch to small buf init to save memory */ rc = smb_init(SMB_COM_TRANSACTION2, 15, tcon, (void **) &pSMB, @@ -4646,7 +4649,7 @@ SETFSUnixRetry: rc = SendReceive(xid, tcon->ses, (struct smb_hdr *) pSMB, (struct smb_hdr *) pSMBr, &bytes_returned, 0); if (rc) { - cERROR(1, ("Send error in SETFSUnixInfo = %d", rc)); + cERROR(1, "Send error in SETFSUnixInfo = %d", rc); } else { /* decode response */ rc = validate_t2((struct smb_t2_rsp *)pSMBr); if (rc) @@ -4674,7 +4677,7 @@ CIFSSMBQFSPosixInfo(const int xid, struct cifsTconInfo *tcon, int bytes_returned = 0; __u16 params, byte_count; - cFYI(1, ("In QFSPosixInfo")); + cFYI(1, "In QFSPosixInfo"); QFSPosixRetry: rc = smb_init(SMB_COM_TRANSACTION2, 15, tcon, (void **) &pSMB, (void **) &pSMBr); @@ -4708,7 +4711,7 @@ QFSPosixRetry: rc = SendReceive(xid, tcon->ses, (struct smb_hdr *) pSMB, (struct smb_hdr *) pSMBr, &bytes_returned, 0); if (rc) { - cFYI(1, ("Send error in QFSUnixInfo = %d", rc)); + cFYI(1, "Send error in QFSUnixInfo = %d", rc); } else { /* decode response */ rc = validate_t2((struct smb_t2_rsp *)pSMBr); @@ -4768,7 +4771,7 @@ CIFSSMBSetEOF(const int xid, struct cifsTconInfo *tcon, const char *fileName, int bytes_returned = 0; __u16 params, byte_count, data_count, param_offset, offset; - cFYI(1, ("In SetEOF")); + cFYI(1, "In SetEOF"); SetEOFRetry: rc = smb_init(SMB_COM_TRANSACTION2, 15, tcon, (void **) &pSMB, (void **) &pSMBr); @@ -4834,7 +4837,7 @@ SetEOFRetry: rc = SendReceive(xid, tcon->ses, (struct smb_hdr *) pSMB, (struct smb_hdr *) pSMBr, &bytes_returned, 0); if (rc) - cFYI(1, ("SetPathInfo (file size) returned %d", rc)); + cFYI(1, "SetPathInfo (file size) returned %d", rc); cifs_buf_release(pSMB); @@ -4854,8 +4857,8 @@ CIFSSMBSetFileSize(const int xid, struct cifsTconInfo *tcon, __u64 size, int rc = 0; __u16 params, param_offset, offset, byte_count, count; - cFYI(1, ("SetFileSize (via SetFileInfo) %lld", - (long long)size)); + cFYI(1, "SetFileSize (via SetFileInfo) %lld", + (long long)size); rc = small_smb_init(SMB_COM_TRANSACTION2, 15, tcon, (void **) &pSMB); if (rc) @@ -4914,9 +4917,7 @@ CIFSSMBSetFileSize(const int xid, struct cifsTconInfo *tcon, __u64 size, pSMB->ByteCount = cpu_to_le16(byte_count); rc = SendReceiveNoRsp(xid, tcon->ses, (struct smb_hdr *) pSMB, 0); if (rc) { - cFYI(1, - ("Send error in SetFileInfo (SetFileSize) = %d", - rc)); + cFYI(1, "Send error in SetFileInfo (SetFileSize) = %d", rc); } /* Note: On -EAGAIN error only caller can retry on handle based calls @@ -4940,7 +4941,7 @@ CIFSSMBSetFileInfo(const int xid, struct cifsTconInfo *tcon, int rc = 0; __u16 params, param_offset, offset, byte_count, count; - cFYI(1, ("Set Times (via SetFileInfo)")); + cFYI(1, "Set Times (via SetFileInfo)"); rc = small_smb_init(SMB_COM_TRANSACTION2, 15, tcon, (void **) &pSMB); if (rc) @@ -4985,7 +4986,7 @@ CIFSSMBSetFileInfo(const int xid, struct cifsTconInfo *tcon, memcpy(data_offset, data, sizeof(FILE_BASIC_INFO)); rc = SendReceiveNoRsp(xid, tcon->ses, (struct smb_hdr *) pSMB, 0); if (rc) - cFYI(1, ("Send error in Set Time (SetFileInfo) = %d", rc)); + cFYI(1, "Send error in Set Time (SetFileInfo) = %d", rc); /* Note: On -EAGAIN error only caller can retry on handle based calls since file handle passed in no longer valid */ @@ -5002,7 +5003,7 @@ CIFSSMBSetFileDisposition(const int xid, struct cifsTconInfo *tcon, int rc = 0; __u16 params, param_offset, offset, byte_count, count; - cFYI(1, ("Set File Disposition (via SetFileInfo)")); + cFYI(1, "Set File Disposition (via SetFileInfo)"); rc = small_smb_init(SMB_COM_TRANSACTION2, 15, tcon, (void **) &pSMB); if (rc) @@ -5044,7 +5045,7 @@ CIFSSMBSetFileDisposition(const int xid, struct cifsTconInfo *tcon, *data_offset = delete_file ? 1 : 0; rc = SendReceiveNoRsp(xid, tcon->ses, (struct smb_hdr *) pSMB, 0); if (rc) - cFYI(1, ("Send error in SetFileDisposition = %d", rc)); + cFYI(1, "Send error in SetFileDisposition = %d", rc); return rc; } @@ -5062,7 +5063,7 @@ CIFSSMBSetPathInfo(const int xid, struct cifsTconInfo *tcon, char *data_offset; __u16 params, param_offset, offset, byte_count, count; - cFYI(1, ("In SetTimes")); + cFYI(1, "In SetTimes"); SetTimesRetry: rc = smb_init(SMB_COM_TRANSACTION2, 15, tcon, (void **) &pSMB, @@ -5118,7 +5119,7 @@ SetTimesRetry: rc = SendReceive(xid, tcon->ses, (struct smb_hdr *) pSMB, (struct smb_hdr *) pSMBr, &bytes_returned, 0); if (rc) - cFYI(1, ("SetPathInfo (times) returned %d", rc)); + cFYI(1, "SetPathInfo (times) returned %d", rc); cifs_buf_release(pSMB); @@ -5143,7 +5144,7 @@ CIFSSMBSetAttrLegacy(int xid, struct cifsTconInfo *tcon, char *fileName, int bytes_returned; int name_len; - cFYI(1, ("In SetAttrLegacy")); + cFYI(1, "In SetAttrLegacy"); SetAttrLgcyRetry: rc = smb_init(SMB_COM_SETATTR, 8, tcon, (void **) &pSMB, @@ -5169,7 +5170,7 @@ SetAttrLgcyRetry: rc = SendReceive(xid, tcon->ses, (struct smb_hdr *) pSMB, (struct smb_hdr *) pSMBr, &bytes_returned, 0); if (rc) - cFYI(1, ("Error in LegacySetAttr = %d", rc)); + cFYI(1, "Error in LegacySetAttr = %d", rc); cifs_buf_release(pSMB); @@ -5231,7 +5232,7 @@ CIFSSMBUnixSetFileInfo(const int xid, struct cifsTconInfo *tcon, int rc = 0; u16 params, param_offset, offset, byte_count, count; - cFYI(1, ("Set Unix Info (via SetFileInfo)")); + cFYI(1, "Set Unix Info (via SetFileInfo)"); rc = small_smb_init(SMB_COM_TRANSACTION2, 15, tcon, (void **) &pSMB); if (rc) @@ -5276,7 +5277,7 @@ CIFSSMBUnixSetFileInfo(const int xid, struct cifsTconInfo *tcon, rc = SendReceiveNoRsp(xid, tcon->ses, (struct smb_hdr *) pSMB, 0); if (rc) - cFYI(1, ("Send error in Set Time (SetFileInfo) = %d", rc)); + cFYI(1, "Send error in Set Time (SetFileInfo) = %d", rc); /* Note: On -EAGAIN error only caller can retry on handle based calls since file handle passed in no longer valid */ @@ -5297,7 +5298,7 @@ CIFSSMBUnixSetPathInfo(const int xid, struct cifsTconInfo *tcon, char *fileName, FILE_UNIX_BASIC_INFO *data_offset; __u16 params, param_offset, offset, count, byte_count; - cFYI(1, ("In SetUID/GID/Mode")); + cFYI(1, "In SetUID/GID/Mode"); setPermsRetry: rc = smb_init(SMB_COM_TRANSACTION2, 15, tcon, (void **) &pSMB, (void **) &pSMBr); @@ -5353,7 +5354,7 @@ setPermsRetry: rc = SendReceive(xid, tcon->ses, (struct smb_hdr *) pSMB, (struct smb_hdr *) pSMBr, &bytes_returned, 0); if (rc) - cFYI(1, ("SetPathInfo (perms) returned %d", rc)); + cFYI(1, "SetPathInfo (perms) returned %d", rc); cifs_buf_release(pSMB); if (rc == -EAGAIN) @@ -5372,7 +5373,7 @@ int CIFSSMBNotify(const int xid, struct cifsTconInfo *tcon, struct dir_notify_req *dnotify_req; int bytes_returned; - cFYI(1, ("In CIFSSMBNotify for file handle %d", (int)netfid)); + cFYI(1, "In CIFSSMBNotify for file handle %d", (int)netfid); rc = smb_init(SMB_COM_NT_TRANSACT, 23, tcon, (void **) &pSMB, (void **) &pSMBr); if (rc) @@ -5406,7 +5407,7 @@ int CIFSSMBNotify(const int xid, struct cifsTconInfo *tcon, (struct smb_hdr *)pSMBr, &bytes_returned, CIFS_ASYNC_OP); if (rc) { - cFYI(1, ("Error in Notify = %d", rc)); + cFYI(1, "Error in Notify = %d", rc); } else { /* Add file to outstanding requests */ /* BB change to kmem cache alloc */ @@ -5462,7 +5463,7 @@ CIFSSMBQAllEAs(const int xid, struct cifsTconInfo *tcon, char *end_of_smb; __u16 params, byte_count, data_offset; - cFYI(1, ("In Query All EAs path %s", searchName)); + cFYI(1, "In Query All EAs path %s", searchName); QAllEAsRetry: rc = smb_init(SMB_COM_TRANSACTION2, 15, tcon, (void **) &pSMB, (void **) &pSMBr); @@ -5509,7 +5510,7 @@ QAllEAsRetry: rc = SendReceive(xid, tcon->ses, (struct smb_hdr *) pSMB, (struct smb_hdr *) pSMBr, &bytes_returned, 0); if (rc) { - cFYI(1, ("Send error in QueryAllEAs = %d", rc)); + cFYI(1, "Send error in QueryAllEAs = %d", rc); goto QAllEAsOut; } @@ -5537,16 +5538,16 @@ QAllEAsRetry: (((char *) &pSMBr->hdr.Protocol) + data_offset); list_len = le32_to_cpu(ea_response_data->list_len); - cFYI(1, ("ea length %d", list_len)); + cFYI(1, "ea length %d", list_len); if (list_len <= 8) { - cFYI(1, ("empty EA list returned from server")); + cFYI(1, "empty EA list returned from server"); goto QAllEAsOut; } /* make sure list_len doesn't go past end of SMB */ end_of_smb = (char *)pByteArea(&pSMBr->hdr) + BCC(&pSMBr->hdr); if ((char *)ea_response_data + list_len > end_of_smb) { - cFYI(1, ("EA list appears to go beyond SMB")); + cFYI(1, "EA list appears to go beyond SMB"); rc = -EIO; goto QAllEAsOut; } @@ -5563,7 +5564,7 @@ QAllEAsRetry: temp_ptr += 4; /* make sure we can read name_len and value_len */ if (list_len < 0) { - cFYI(1, ("EA entry goes beyond length of list")); + cFYI(1, "EA entry goes beyond length of list"); rc = -EIO; goto QAllEAsOut; } @@ -5572,7 +5573,7 @@ QAllEAsRetry: value_len = le16_to_cpu(temp_fea->value_len); list_len -= name_len + 1 + value_len; if (list_len < 0) { - cFYI(1, ("EA entry goes beyond length of list")); + cFYI(1, "EA entry goes beyond length of list"); rc = -EIO; goto QAllEAsOut; } @@ -5639,7 +5640,7 @@ CIFSSMBSetEA(const int xid, struct cifsTconInfo *tcon, const char *fileName, int bytes_returned = 0; __u16 params, param_offset, byte_count, offset, count; - cFYI(1, ("In SetEA")); + cFYI(1, "In SetEA"); SetEARetry: rc = smb_init(SMB_COM_TRANSACTION2, 15, tcon, (void **) &pSMB, (void **) &pSMBr); @@ -5721,7 +5722,7 @@ SetEARetry: rc = SendReceive(xid, tcon->ses, (struct smb_hdr *) pSMB, (struct smb_hdr *) pSMBr, &bytes_returned, 0); if (rc) - cFYI(1, ("SetPathInfo (EA) returned %d", rc)); + cFYI(1, "SetPathInfo (EA) returned %d", rc); cifs_buf_release(pSMB); diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c index d9566bf..2208f06 100644 --- a/fs/cifs/connect.c +++ b/fs/cifs/connect.c @@ -102,6 +102,7 @@ struct smb_vol { bool sockopt_tcp_nodelay:1; unsigned short int port; char *prepath; + struct nls_table *local_nls; }; static int ipv4_connect(struct TCP_Server_Info *server); @@ -135,7 +136,7 @@ cifs_reconnect(struct TCP_Server_Info *server) spin_unlock(&GlobalMid_Lock); server->maxBuf = 0; - cFYI(1, ("Reconnecting tcp session")); + cFYI(1, "Reconnecting tcp session"); /* before reconnecting the tcp session, mark the smb session (uid) and the tid bad so they are not used until reconnected */ @@ -153,12 +154,12 @@ cifs_reconnect(struct TCP_Server_Info *server) /* do not want to be sending data on a socket we are freeing */ mutex_lock(&server->srv_mutex); if (server->ssocket) { - cFYI(1, ("State: 0x%x Flags: 0x%lx", server->ssocket->state, - server->ssocket->flags)); + cFYI(1, "State: 0x%x Flags: 0x%lx", server->ssocket->state, + server->ssocket->flags); kernel_sock_shutdown(server->ssocket, SHUT_WR); - cFYI(1, ("Post shutdown state: 0x%x Flags: 0x%lx", + cFYI(1, "Post shutdown state: 0x%x Flags: 0x%lx", server->ssocket->state, - server->ssocket->flags)); + server->ssocket->flags); sock_release(server->ssocket); server->ssocket = NULL; } @@ -187,7 +188,7 @@ cifs_reconnect(struct TCP_Server_Info *server) else rc = ipv4_connect(server); if (rc) { - cFYI(1, ("reconnect error %d", rc)); + cFYI(1, "reconnect error %d", rc); msleep(3000); } else { atomic_inc(&tcpSesReconnectCount); @@ -223,7 +224,7 @@ static int check2ndT2(struct smb_hdr *pSMB, unsigned int maxBufSize) /* check for plausible wct, bcc and t2 data and parm sizes */ /* check for parm and data offset going beyond end of smb */ if (pSMB->WordCount != 10) { /* coalesce_t2 depends on this */ - cFYI(1, ("invalid transact2 word count")); + cFYI(1, "invalid transact2 word count"); return -EINVAL; } @@ -237,15 +238,15 @@ static int check2ndT2(struct smb_hdr *pSMB, unsigned int maxBufSize) if (remaining == 0) return 0; else if (remaining < 0) { - cFYI(1, ("total data %d smaller than data in frame %d", - total_data_size, data_in_this_rsp)); + cFYI(1, "total data %d smaller than data in frame %d", + total_data_size, data_in_this_rsp); return -EINVAL; } else { - cFYI(1, ("missing %d bytes from transact2, check next response", - remaining)); + cFYI(1, "missing %d bytes from transact2, check next response", + remaining); if (total_data_size > maxBufSize) { - cERROR(1, ("TotalDataSize %d is over maximum buffer %d", - total_data_size, maxBufSize)); + cERROR(1, "TotalDataSize %d is over maximum buffer %d", + total_data_size, maxBufSize); return -EINVAL; } return remaining; @@ -267,7 +268,7 @@ static int coalesce_t2(struct smb_hdr *psecond, struct smb_hdr *pTargetSMB) total_data_size = le16_to_cpu(pSMBt->t2_rsp.TotalDataCount); if (total_data_size != le16_to_cpu(pSMB2->t2_rsp.TotalDataCount)) { - cFYI(1, ("total data size of primary and secondary t2 differ")); + cFYI(1, "total data size of primary and secondary t2 differ"); } total_in_buf = le16_to_cpu(pSMBt->t2_rsp.DataCount); @@ -282,7 +283,7 @@ static int coalesce_t2(struct smb_hdr *psecond, struct smb_hdr *pTargetSMB) total_in_buf2 = le16_to_cpu(pSMB2->t2_rsp.DataCount); if (remaining < total_in_buf2) { - cFYI(1, ("transact2 2nd response contains too much data")); + cFYI(1, "transact2 2nd response contains too much data"); } /* find end of first SMB data area */ @@ -311,7 +312,7 @@ static int coalesce_t2(struct smb_hdr *psecond, struct smb_hdr *pTargetSMB) pTargetSMB->smb_buf_length = byte_count; if (remaining == total_in_buf2) { - cFYI(1, ("found the last secondary response")); + cFYI(1, "found the last secondary response"); return 0; /* we are done */ } else /* more responses to go */ return 1; @@ -339,7 +340,7 @@ cifs_demultiplex_thread(struct TCP_Server_Info *server) int reconnect; current->flags |= PF_MEMALLOC; - cFYI(1, ("Demultiplex PID: %d", task_pid_nr(current))); + cFYI(1, "Demultiplex PID: %d", task_pid_nr(current)); length = atomic_inc_return(&tcpSesAllocCount); if (length > 1) @@ -353,7 +354,7 @@ cifs_demultiplex_thread(struct TCP_Server_Info *server) if (bigbuf == NULL) { bigbuf = cifs_buf_get(); if (!bigbuf) { - cERROR(1, ("No memory for large SMB response")); + cERROR(1, "No memory for large SMB response"); msleep(3000); /* retry will check if exiting */ continue; @@ -366,7 +367,7 @@ cifs_demultiplex_thread(struct TCP_Server_Info *server) if (smallbuf == NULL) { smallbuf = cifs_small_buf_get(); if (!smallbuf) { - cERROR(1, ("No memory for SMB response")); + cERROR(1, "No memory for SMB response"); msleep(1000); /* retry will check if exiting */ continue; @@ -391,9 +392,9 @@ incomplete_rcv: if (server->tcpStatus == CifsExiting) { break; } else if (server->tcpStatus == CifsNeedReconnect) { - cFYI(1, ("Reconnect after server stopped responding")); + cFYI(1, "Reconnect after server stopped responding"); cifs_reconnect(server); - cFYI(1, ("call to reconnect done")); + cFYI(1, "call to reconnect done"); csocket = server->ssocket; continue; } else if ((length == -ERESTARTSYS) || (length == -EAGAIN)) { @@ -411,7 +412,7 @@ incomplete_rcv: continue; } else if (length <= 0) { if (server->tcpStatus == CifsNew) { - cFYI(1, ("tcp session abend after SMBnegprot")); + cFYI(1, "tcp session abend after SMBnegprot"); /* some servers kill the TCP session rather than returning an SMB negprot error, in which case reconnecting here is not going to help, @@ -419,18 +420,18 @@ incomplete_rcv: break; } if (!try_to_freeze() && (length == -EINTR)) { - cFYI(1, ("cifsd thread killed")); + cFYI(1, "cifsd thread killed"); break; } - cFYI(1, ("Reconnect after unexpected peek error %d", - length)); + cFYI(1, "Reconnect after unexpected peek error %d", + length); cifs_reconnect(server); csocket = server->ssocket; wake_up(&server->response_q); continue; } else if (length < pdu_length) { - cFYI(1, ("requested %d bytes but only got %d bytes", - pdu_length, length)); + cFYI(1, "requested %d bytes but only got %d bytes", + pdu_length, length); pdu_length -= length; msleep(1); goto incomplete_rcv; @@ -450,18 +451,18 @@ incomplete_rcv: pdu_length = be32_to_cpu((__force __be32)smb_buffer->smb_buf_length); smb_buffer->smb_buf_length = pdu_length; - cFYI(1, ("rfc1002 length 0x%x", pdu_length+4)); + cFYI(1, "rfc1002 length 0x%x", pdu_length+4); if (temp == (char) RFC1002_SESSION_KEEP_ALIVE) { continue; } else if (temp == (char)RFC1002_POSITIVE_SESSION_RESPONSE) { - cFYI(1, ("Good RFC 1002 session rsp")); + cFYI(1, "Good RFC 1002 session rsp"); continue; } else if (temp == (char)RFC1002_NEGATIVE_SESSION_RESPONSE) { /* we get this from Windows 98 instead of an error on SMB negprot response */ - cFYI(1, ("Negative RFC1002 Session Response Error 0x%x)", - pdu_length)); + cFYI(1, "Negative RFC1002 Session Response Error 0x%x)", + pdu_length); if (server->tcpStatus == CifsNew) { /* if nack on negprot (rather than ret of smb negprot error) reconnecting @@ -484,7 +485,7 @@ incomplete_rcv: continue; } } else if (temp != (char) 0) { - cERROR(1, ("Unknown RFC 1002 frame")); + cERROR(1, "Unknown RFC 1002 frame"); cifs_dump_mem(" Received Data: ", (char *)smb_buffer, length); cifs_reconnect(server); @@ -495,8 +496,8 @@ incomplete_rcv: /* else we have an SMB response */ if ((pdu_length > CIFSMaxBufSize + MAX_CIFS_HDR_SIZE - 4) || (pdu_length < sizeof(struct smb_hdr) - 1 - 4)) { - cERROR(1, ("Invalid size SMB length %d pdu_length %d", - length, pdu_length+4)); + cERROR(1, "Invalid size SMB length %d pdu_length %d", + length, pdu_length+4); cifs_reconnect(server); csocket = server->ssocket; wake_up(&server->response_q); @@ -539,8 +540,8 @@ incomplete_rcv: length = 0; continue; } else if (length <= 0) { - cERROR(1, ("Received no data, expecting %d", - pdu_length - total_read)); + cERROR(1, "Received no data, expecting %d", + pdu_length - total_read); cifs_reconnect(server); csocket = server->ssocket; reconnect = 1; @@ -588,7 +589,7 @@ incomplete_rcv: } } else { if (!isLargeBuf) { - cERROR(1,("1st trans2 resp needs bigbuf")); + cERROR(1, "1st trans2 resp needs bigbuf"); /* BB maybe we can fix this up, switch to already allocated large buffer? */ } else { @@ -630,8 +631,8 @@ multi_t2_fnd: wake_up_process(task_to_wake); } else if (!is_valid_oplock_break(smb_buffer, server) && !isMultiRsp) { - cERROR(1, ("No task to wake, unknown frame received! " - "NumMids %d", midCount.counter)); + cERROR(1, "No task to wake, unknown frame received! " + "NumMids %d", midCount.counter); cifs_dump_mem("Received Data is: ", (char *)smb_buffer, sizeof(struct smb_hdr)); #ifdef CONFIG_CIFS_DEBUG2 @@ -708,8 +709,8 @@ multi_t2_fnd: list_for_each(tmp, &server->pending_mid_q) { mid_entry = list_entry(tmp, struct mid_q_entry, qhead); if (mid_entry->midState == MID_REQUEST_SUBMITTED) { - cFYI(1, ("Clearing Mid 0x%x - waking up ", - mid_entry->mid)); + cFYI(1, "Clearing Mid 0x%x - waking up ", + mid_entry->mid); task_to_wake = mid_entry->tsk; if (task_to_wake) wake_up_process(task_to_wake); @@ -728,7 +729,7 @@ multi_t2_fnd: to wait at least 45 seconds before giving up on a request getting a response and going ahead and killing cifsd */ - cFYI(1, ("Wait for exit from demultiplex thread")); + cFYI(1, "Wait for exit from demultiplex thread"); msleep(46000); /* if threads still have not exited they are probably never coming home not much else we can do but free the memory */ @@ -849,7 +850,7 @@ cifs_parse_mount_options(char *options, const char *devname, separator[0] = options[4]; options += 5; } else { - cFYI(1, ("Null separator not allowed")); + cFYI(1, "Null separator not allowed"); } } @@ -974,7 +975,7 @@ cifs_parse_mount_options(char *options, const char *devname, } } else if (strnicmp(data, "sec", 3) == 0) { if (!value || !*value) { - cERROR(1, ("no security value specified")); + cERROR(1, "no security value specified"); continue; } else if (strnicmp(value, "krb5i", 5) == 0) { vol->secFlg |= CIFSSEC_MAY_KRB5 | @@ -982,7 +983,7 @@ cifs_parse_mount_options(char *options, const char *devname, } else if (strnicmp(value, "krb5p", 5) == 0) { /* vol->secFlg |= CIFSSEC_MUST_SEAL | CIFSSEC_MAY_KRB5; */ - cERROR(1, ("Krb5 cifs privacy not supported")); + cERROR(1, "Krb5 cifs privacy not supported"); return 1; } else if (strnicmp(value, "krb5", 4) == 0) { vol->secFlg |= CIFSSEC_MAY_KRB5; @@ -1014,7 +1015,7 @@ cifs_parse_mount_options(char *options, const char *devname, } else if (strnicmp(value, "none", 4) == 0) { vol->nullauth = 1; } else { - cERROR(1, ("bad security option: %s", value)); + cERROR(1, "bad security option: %s", value); return 1; } } else if ((strnicmp(data, "unc", 3) == 0) @@ -1053,7 +1054,7 @@ cifs_parse_mount_options(char *options, const char *devname, a domain name and need special handling? */ if (strnlen(value, 256) < 256) { vol->domainname = value; - cFYI(1, ("Domain name set")); + cFYI(1, "Domain name set"); } else { printk(KERN_WARNING "CIFS: domain name too " "long\n"); @@ -1076,7 +1077,7 @@ cifs_parse_mount_options(char *options, const char *devname, strcpy(vol->prepath+1, value); } else strcpy(vol->prepath, value); - cFYI(1, ("prefix path %s", vol->prepath)); + cFYI(1, "prefix path %s", vol->prepath); } else { printk(KERN_WARNING "CIFS: prefix too long\n"); return 1; @@ -1092,7 +1093,7 @@ cifs_parse_mount_options(char *options, const char *devname, vol->iocharset = value; /* if iocharset not set then load_nls_default is used by caller */ - cFYI(1, ("iocharset set to %s", value)); + cFYI(1, "iocharset set to %s", value); } else { printk(KERN_WARNING "CIFS: iocharset name " "too long.\n"); @@ -1144,14 +1145,14 @@ cifs_parse_mount_options(char *options, const char *devname, } } else if (strnicmp(data, "sockopt", 5) == 0) { if (!value || !*value) { - cERROR(1, ("no socket option specified")); + cERROR(1, "no socket option specified"); continue; } else if (strnicmp(value, "TCP_NODELAY", 11) == 0) { vol->sockopt_tcp_nodelay = 1; } } else if (strnicmp(data, "netbiosname", 4) == 0) { if (!value || !*value || (*value == ' ')) { - cFYI(1, ("invalid (empty) netbiosname")); + cFYI(1, "invalid (empty) netbiosname"); } else { memset(vol->source_rfc1001_name, 0x20, 15); for (i = 0; i < 15; i++) { @@ -1175,7 +1176,7 @@ cifs_parse_mount_options(char *options, const char *devname, } else if (strnicmp(data, "servern", 7) == 0) { /* servernetbiosname specified override *SMBSERVER */ if (!value || !*value || (*value == ' ')) { - cFYI(1, ("empty server netbiosname specified")); + cFYI(1, "empty server netbiosname specified"); } else { /* last byte, type, is 0x20 for servr type */ memset(vol->target_rfc1001_name, 0x20, 16); @@ -1434,7 +1435,7 @@ cifs_find_tcp_session(struct sockaddr_storage *addr, unsigned short int port) ++server->srv_count; write_unlock(&cifs_tcp_ses_lock); - cFYI(1, ("Existing tcp session with server found")); + cFYI(1, "Existing tcp session with server found"); return server; } write_unlock(&cifs_tcp_ses_lock); @@ -1475,7 +1476,7 @@ cifs_get_tcp_session(struct smb_vol *volume_info) memset(&addr, 0, sizeof(struct sockaddr_storage)); - cFYI(1, ("UNC: %s ip: %s", volume_info->UNC, volume_info->UNCip)); + cFYI(1, "UNC: %s ip: %s", volume_info->UNC, volume_info->UNCip); if (volume_info->UNCip && volume_info->UNC) { rc = cifs_convert_address(volume_info->UNCip, &addr); @@ -1487,13 +1488,12 @@ cifs_get_tcp_session(struct smb_vol *volume_info) } else if (volume_info->UNCip) { /* BB using ip addr as tcp_ses name to connect to the DFS root below */ - cERROR(1, ("Connecting to DFS root not implemented yet")); + cERROR(1, "Connecting to DFS root not implemented yet"); rc = -EINVAL; goto out_err; } else /* which tcp_sess DFS root would we conect to */ { - cERROR(1, - ("CIFS mount error: No UNC path (e.g. -o " - "unc=//192.168.1.100/public) specified")); + cERROR(1, "CIFS mount error: No UNC path (e.g. -o " + "unc=//192.168.1.100/public) specified"); rc = -EINVAL; goto out_err; } @@ -1540,7 +1540,7 @@ cifs_get_tcp_session(struct smb_vol *volume_info) ++tcp_ses->srv_count; if (addr.ss_family == AF_INET6) { - cFYI(1, ("attempting ipv6 connect")); + cFYI(1, "attempting ipv6 connect"); /* BB should we allow ipv6 on port 139? */ /* other OS never observed in Wild doing 139 with v6 */ sin_server6->sin6_port = htons(volume_info->port); @@ -1554,7 +1554,7 @@ cifs_get_tcp_session(struct smb_vol *volume_info) rc = ipv4_connect(tcp_ses); } if (rc < 0) { - cERROR(1, ("Error connecting to socket. Aborting operation")); + cERROR(1, "Error connecting to socket. Aborting operation"); goto out_err; } @@ -1567,7 +1567,7 @@ cifs_get_tcp_session(struct smb_vol *volume_info) tcp_ses, "cifsd"); if (IS_ERR(tcp_ses->tsk)) { rc = PTR_ERR(tcp_ses->tsk); - cERROR(1, ("error %d create cifsd thread", rc)); + cERROR(1, "error %d create cifsd thread", rc); module_put(THIS_MODULE); goto out_err; } @@ -1616,6 +1616,7 @@ cifs_put_smb_ses(struct cifsSesInfo *ses) int xid; struct TCP_Server_Info *server = ses->server; + cFYI(1, "%s: ses_count=%d\n", __func__, ses->ses_count); write_lock(&cifs_tcp_ses_lock); if (--ses->ses_count > 0) { write_unlock(&cifs_tcp_ses_lock); @@ -1634,6 +1635,102 @@ cifs_put_smb_ses(struct cifsSesInfo *ses) cifs_put_tcp_session(server); } +static struct cifsSesInfo * +cifs_get_smb_ses(struct TCP_Server_Info *server, struct smb_vol *volume_info) +{ + int rc = -ENOMEM, xid; + struct cifsSesInfo *ses; + + xid = GetXid(); + + ses = cifs_find_smb_ses(server, volume_info->username); + if (ses) { + cFYI(1, "Existing smb sess found (status=%d)", ses->status); + + /* existing SMB ses has a server reference already */ + cifs_put_tcp_session(server); + + mutex_lock(&ses->session_mutex); + rc = cifs_negotiate_protocol(xid, ses); + if (rc) { + mutex_unlock(&ses->session_mutex); + /* problem -- put our ses reference */ + cifs_put_smb_ses(ses); + FreeXid(xid); + return ERR_PTR(rc); + } + if (ses->need_reconnect) { + cFYI(1, "Session needs reconnect"); + rc = cifs_setup_session(xid, ses, + volume_info->local_nls); + if (rc) { + mutex_unlock(&ses->session_mutex); + /* problem -- put our reference */ + cifs_put_smb_ses(ses); + FreeXid(xid); + return ERR_PTR(rc); + } + } + mutex_unlock(&ses->session_mutex); + FreeXid(xid); + return ses; + } + + cFYI(1, "Existing smb sess not found"); + ses = sesInfoAlloc(); + if (ses == NULL) + goto get_ses_fail; + + /* new SMB session uses our server ref */ + ses->server = server; + if (server->addr.sockAddr6.sin6_family == AF_INET6) + sprintf(ses->serverName, "%pI6", + &server->addr.sockAddr6.sin6_addr); + else + sprintf(ses->serverName, "%pI4", + &server->addr.sockAddr.sin_addr.s_addr); + + if (volume_info->username) + strncpy(ses->userName, volume_info->username, + MAX_USERNAME_SIZE); + + /* volume_info->password freed at unmount */ + if (volume_info->password) { + ses->password = kstrdup(volume_info->password, GFP_KERNEL); + if (!ses->password) + goto get_ses_fail; + } + if (volume_info->domainname) { + int len = strlen(volume_info->domainname); + ses->domainName = kmalloc(len + 1, GFP_KERNEL); + if (ses->domainName) + strcpy(ses->domainName, volume_info->domainname); + } + ses->linux_uid = volume_info->linux_uid; + ses->overrideSecFlg = volume_info->secFlg; + + mutex_lock(&ses->session_mutex); + rc = cifs_negotiate_protocol(xid, ses); + if (!rc) + rc = cifs_setup_session(xid, ses, volume_info->local_nls); + mutex_unlock(&ses->session_mutex); + if (rc) + goto get_ses_fail; + + /* success, put it on the list */ + write_lock(&cifs_tcp_ses_lock); + list_add(&ses->smb_ses_list, &server->smb_ses_list); + write_unlock(&cifs_tcp_ses_lock); + + FreeXid(xid); + return ses; + +get_ses_fail: + sesInfoFree(ses); + FreeXid(xid); + return ERR_PTR(rc); +} + static struct cifsTconInfo * cifs_find_tcon(struct cifsSesInfo *ses, const char *unc) { @@ -1662,6 +1759,7 @@ cifs_put_tcon(struct cifsTconInfo *tcon) int xid; struct cifsSesInfo *ses = tcon->ses; + cFYI(1, "%s: tc_count=%d\n", __func__, tcon->tc_count); write_lock(&cifs_tcp_ses_lock); if (--tcon->tc_count > 0) { write_unlock(&cifs_tcp_ses_lock); @@ -1679,6 +1777,80 @@ cifs_put_tcon(struct cifsTconInfo *tcon) cifs_put_smb_ses(ses); } +static struct cifsTconInfo * +cifs_get_tcon(struct cifsSesInfo *ses, struct smb_vol *volume_info) +{ + int rc, xid; + struct cifsTconInfo *tcon; + + tcon = cifs_find_tcon(ses, volume_info->UNC); + if (tcon) { + cFYI(1, "Found match on UNC path"); + /* existing tcon already has a reference */ + cifs_put_smb_ses(ses); + if (tcon->seal != volume_info->seal) + cERROR(1, "transport encryption setting " + "conflicts with existing tid"); + return tcon; + } + + tcon = tconInfoAlloc(); + if (tcon == NULL) { + rc = -ENOMEM; + goto out_fail; + } + + tcon->ses = ses; + if (volume_info->password) { + tcon->password = kstrdup(volume_info->password, GFP_KERNEL); + if (!tcon->password) { + rc = -ENOMEM; + goto out_fail; + } + } + + if (strchr(volume_info->UNC + 3, '\\') == NULL + && strchr(volume_info->UNC + 3, '/') == NULL) { + cERROR(1, "Missing share name"); + rc = -ENODEV; + goto out_fail; + } + + /* BB Do we need to wrap session_mutex around + * this TCon call and Unix SetFS as + * we do on SessSetup and reconnect? */ + xid = GetXid(); + rc = CIFSTCon(xid, ses, volume_info->UNC, tcon, volume_info->local_nls); + FreeXid(xid); + cFYI(1, "CIFS Tcon rc = %d", rc); + if (rc) + goto out_fail; + + if (volume_info->nodfs) { + tcon->Flags &= ~SMB_SHARE_IS_IN_DFS; + cFYI(1, "DFS disabled (%d)", tcon->Flags); + } + tcon->seal = volume_info->seal; + /* we can have only one retry value for a connection + to a share so for resources mounted more than once + to the same server share the last value passed in + for the retry flag is used */ + tcon->retry = volume_info->retry; + tcon->nocase = volume_info->nocase; + tcon->local_lease = volume_info->local_lease; + + write_lock(&cifs_tcp_ses_lock); + list_add(&tcon->tcon_list, &ses->tcon_list); + write_unlock(&cifs_tcp_ses_lock); + + return tcon; + +out_fail: + tconInfoFree(tcon); + return ERR_PTR(rc); +} + + int get_dfs_path(int xid, struct cifsSesInfo *pSesInfo, const char *old_path, const struct nls_table *nls_codepage, unsigned int *pnum_referrals, @@ -1703,8 +1875,7 @@ get_dfs_path(int xid, struct cifsSesInfo *pSesInfo, const char *old_path, strcpy(temp_unc + 2, pSesInfo->serverName); strcpy(temp_unc + 2 + strlen(pSesInfo->serverName), "\\IPC$"); rc = CIFSTCon(xid, pSesInfo, temp_unc, NULL, nls_codepage); - cFYI(1, - ("CIFS Tcon rc = %d ipc_tid = %d", rc, pSesInfo->ipc_tid)); + cFYI(1, "CIFS Tcon rc = %d ipc_tid = %d", rc, pSesInfo->ipc_tid); kfree(temp_unc); } if (rc == 0) @@ -1777,12 +1948,12 @@ ipv4_connect(struct TCP_Server_Info *server) rc = sock_create_kern(PF_INET, SOCK_STREAM, IPPROTO_TCP, &socket); if (rc < 0) { - cERROR(1, ("Error %d creating socket", rc)); + cERROR(1, "Error %d creating socket", rc); return rc; } /* BB other socket options to set KEEPALIVE, NODELAY? */ - cFYI(1, ("Socket created")); + cFYI(1, "Socket created"); server->ssocket = socket; socket->sk->sk_allocation = GFP_NOFS; cifs_reclassify_socket4(socket); @@ -1827,7 +1998,7 @@ ipv4_connect(struct TCP_Server_Info *server) if (!connected) { if (orig_port) server->addr.sockAddr.sin_port = orig_port; - cFYI(1, ("Error %d connecting to server via ipv4", rc)); + cFYI(1, "Error %d connecting to server via ipv4", rc); sock_release(socket); server->ssocket = NULL; return rc; @@ -1855,12 +2026,12 @@ ipv4_connect(struct TCP_Server_Info *server) rc = kernel_setsockopt(socket, SOL_TCP, TCP_NODELAY, (char *)&val, sizeof(val)); if (rc) - cFYI(1, ("set TCP_NODELAY socket option error %d", rc)); + cFYI(1, "set TCP_NODELAY socket option error %d", rc); } - cFYI(1, ("sndbuf %d rcvbuf %d rcvtimeo 0x%lx", + cFYI(1, "sndbuf %d rcvbuf %d rcvtimeo 0x%lx", socket->sk->sk_sndbuf, - socket->sk->sk_rcvbuf, socket->sk->sk_rcvtimeo)); + socket->sk->sk_rcvbuf, socket->sk->sk_rcvtimeo); /* send RFC1001 sessinit */ if (server->addr.sockAddr.sin_port == htons(RFC1001_PORT)) { @@ -1938,13 +2109,13 @@ ipv6_connect(struct TCP_Server_Info *server) rc = sock_create_kern(PF_INET6, SOCK_STREAM, IPPROTO_TCP, &socket); if (rc < 0) { - cERROR(1, ("Error %d creating ipv6 socket", rc)); + cERROR(1, "Error %d creating ipv6 socket", rc); socket = NULL; return rc; } /* BB other socket options to set KEEPALIVE, NODELAY? */ - cFYI(1, ("ipv6 Socket created")); + cFYI(1, "ipv6 Socket created"); server->ssocket = socket; socket->sk->sk_allocation = GFP_NOFS; cifs_reclassify_socket6(socket); @@ -1988,7 +2159,7 @@ ipv6_connect(struct TCP_Server_Info *server) if (!connected) { if (orig_port) server->addr.sockAddr6.sin6_port = orig_port; - cFYI(1, ("Error %d connecting to server via ipv6", rc)); + cFYI(1, "Error %d connecting to server via ipv6", rc); sock_release(socket); server->ssocket = NULL; return rc; @@ -2007,7 +2178,7 @@ ipv6_connect(struct TCP_Server_Info *server) rc = kernel_setsockopt(socket, SOL_TCP, TCP_NODELAY, (char *)&val, sizeof(val)); if (rc) - cFYI(1, ("set TCP_NODELAY socket option error %d", rc)); + cFYI(1, "set TCP_NODELAY socket option error %d", rc); } server->ssocket = socket; @@ -2032,13 +2203,13 @@ void reset_cifs_unix_caps(int xid, struct cifsTconInfo *tcon, if (vol_info && vol_info->no_linux_ext) { tcon->fsUnixInfo.Capability = 0; tcon->unix_ext = 0; /* Unix Extensions disabled */ - cFYI(1, ("Linux protocol extensions disabled")); + cFYI(1, "Linux protocol extensions disabled"); return; } else if (vol_info) tcon->unix_ext = 1; /* Unix Extensions supported */ if (tcon->unix_ext == 0) { - cFYI(1, ("Unix extensions disabled so not set on reconnect")); + cFYI(1, "Unix extensions disabled so not set on reconnect"); return; } @@ -2054,12 +2225,11 @@ void reset_cifs_unix_caps(int xid, struct cifsTconInfo *tcon, cap &= ~CIFS_UNIX_POSIX_ACL_CAP; if ((saved_cap & CIFS_UNIX_POSIX_PATHNAMES_CAP) == 0) { if (cap & CIFS_UNIX_POSIX_PATHNAMES_CAP) - cERROR(1, ("POSIXPATH support change")); + cERROR(1, "POSIXPATH support change"); cap &= ~CIFS_UNIX_POSIX_PATHNAMES_CAP; } else if ((cap & CIFS_UNIX_POSIX_PATHNAMES_CAP) == 0) { - cERROR(1, ("possible reconnect error")); - cERROR(1, - ("server disabled POSIX path support")); + cERROR(1, "possible reconnect error"); + cERROR(1, "server disabled POSIX path support"); } } @@ -2067,7 +2237,7 @@ void reset_cifs_unix_caps(int xid, struct cifsTconInfo *tcon, if (vol_info && vol_info->no_psx_acl) cap &= ~CIFS_UNIX_POSIX_ACL_CAP; else if (CIFS_UNIX_POSIX_ACL_CAP & cap) { - cFYI(1, ("negotiated posix acl support")); + cFYI(1, "negotiated posix acl support"); if (sb) sb->s_flags |= MS_POSIXACL; } @@ -2075,7 +2245,7 @@ void reset_cifs_unix_caps(int xid, struct cifsTconInfo *tcon, if (vol_info && vol_info->posix_paths == 0) cap &= ~CIFS_UNIX_POSIX_PATHNAMES_CAP; else if (cap & CIFS_UNIX_POSIX_PATHNAMES_CAP) { - cFYI(1, ("negotiate posix pathnames")); + cFYI(1, "negotiate posix pathnames"); if (sb) CIFS_SB(sb)->mnt_cifs_flags |= CIFS_MOUNT_POSIX_PATHS; @@ -2090,39 +2260,38 @@ void reset_cifs_unix_caps(int xid, struct cifsTconInfo *tcon, if (sb && (CIFS_SB(sb)->rsize > 127 * 1024)) { if ((cap & CIFS_UNIX_LARGE_READ_CAP) == 0) { CIFS_SB(sb)->rsize = 127 * 1024; - cFYI(DBG2, - ("larger reads not supported by srv")); + cFYI(DBG2, "larger reads not supported by srv"); } } - cFYI(1, ("Negotiate caps 0x%x", (int)cap)); + cFYI(1, "Negotiate caps 0x%x", (int)cap); #ifdef CONFIG_CIFS_DEBUG2 if (cap & CIFS_UNIX_FCNTL_CAP) - cFYI(1, ("FCNTL cap")); + cFYI(1, "FCNTL cap"); if (cap & CIFS_UNIX_EXTATTR_CAP) - cFYI(1, ("EXTATTR cap")); + cFYI(1, "EXTATTR cap"); if (cap & CIFS_UNIX_POSIX_PATHNAMES_CAP) - cFYI(1, ("POSIX path cap")); + cFYI(1, "POSIX path cap"); if (cap & CIFS_UNIX_XATTR_CAP) - cFYI(1, ("XATTR cap")); + cFYI(1, "XATTR cap"); if (cap & CIFS_UNIX_POSIX_ACL_CAP) - cFYI(1, ("POSIX ACL cap")); + cFYI(1, "POSIX ACL cap"); if (cap & CIFS_UNIX_LARGE_READ_CAP) - cFYI(1, ("very large read cap")); + cFYI(1, "very large read cap"); if (cap & CIFS_UNIX_LARGE_WRITE_CAP) - cFYI(1, ("very large write cap")); + cFYI(1, "very large write cap"); #endif /* CIFS_DEBUG2 */ if (CIFSSMBSetFSUnixInfo(xid, tcon, cap)) { if (vol_info == NULL) { - cFYI(1, ("resetting capabilities failed")); + cFYI(1, "resetting capabilities failed"); } else - cERROR(1, ("Negotiating Unix capabilities " + cERROR(1, "Negotiating Unix capabilities " "with the server failed. Consider " "mounting with the Unix Extensions\n" "disabled, if problems are found, " "by specifying the nounix mount " - "option.")); + "option."); } } @@ -2152,8 +2321,8 @@ static void setup_cifs_sb(struct smb_vol *pvolume_info, struct cifs_sb_info *cifs_sb) { if (pvolume_info->rsize > CIFSMaxBufSize) { - cERROR(1, ("rsize %d too large, using MaxBufSize", - pvolume_info->rsize)); + cERROR(1, "rsize %d too large, using MaxBufSize", + pvolume_info->rsize); cifs_sb->rsize = CIFSMaxBufSize; } else if ((pvolume_info->rsize) && (pvolume_info->rsize <= CIFSMaxBufSize)) @@ -2162,8 +2331,8 @@ static void setup_cifs_sb(struct smb_vol *pvolume_info, cifs_sb->rsize = CIFSMaxBufSize; if (pvolume_info->wsize > PAGEVEC_SIZE * PAGE_CACHE_SIZE) { - cERROR(1, ("wsize %d too large, using 4096 instead", - pvolume_info->wsize)); + cERROR(1, "wsize %d too large, using 4096 instead", + pvolume_info->wsize); cifs_sb->wsize = 4096; } else if (pvolume_info->wsize) cifs_sb->wsize = pvolume_info->wsize; @@ -2181,7 +2350,7 @@ static void setup_cifs_sb(struct smb_vol *pvolume_info, if (cifs_sb->rsize < 2048) { cifs_sb->rsize = 2048; /* Windows ME may prefer this */ - cFYI(1, ("readsize set to minimum: 2048")); + cFYI(1, "readsize set to minimum: 2048"); } /* calculate prepath */ cifs_sb->prepath = pvolume_info->prepath; @@ -2199,8 +2368,8 @@ static void setup_cifs_sb(struct smb_vol *pvolume_info, cifs_sb->mnt_gid = pvolume_info->linux_gid; cifs_sb->mnt_file_mode = pvolume_info->file_mode; cifs_sb->mnt_dir_mode = pvolume_info->dir_mode; - cFYI(1, ("file mode: 0x%x dir mode: 0x%x", - cifs_sb->mnt_file_mode, cifs_sb->mnt_dir_mode)); + cFYI(1, "file mode: 0x%x dir mode: 0x%x", + cifs_sb->mnt_file_mode, cifs_sb->mnt_dir_mode); if (pvolume_info->noperm) cifs_sb->mnt_cifs_flags |= CIFS_MOUNT_NO_PERM; @@ -2229,13 +2398,13 @@ static void setup_cifs_sb(struct smb_vol *pvolume_info, if (pvolume_info->dynperm) cifs_sb->mnt_cifs_flags |= CIFS_MOUNT_DYNPERM; if (pvolume_info->direct_io) { - cFYI(1, ("mounting share using direct i/o")); + cFYI(1, "mounting share using direct i/o"); cifs_sb->mnt_cifs_flags |= CIFS_MOUNT_DIRECT_IO; } if ((pvolume_info->cifs_acl) && (pvolume_info->dynperm)) - cERROR(1, ("mount option dynperm ignored if cifsacl " - "mount option supported")); + cERROR(1, "mount option dynperm ignored if cifsacl " + "mount option supported"); } static int @@ -2262,7 +2431,7 @@ cleanup_volume_info(struct smb_vol **pvolume_info) { struct smb_vol *volume_info; - if (!pvolume_info && !*pvolume_info) + if (!pvolume_info || !*pvolume_info) return; volume_info = *pvolume_info; @@ -2344,11 +2513,11 @@ try_mount_again: } if (volume_info->nullauth) { - cFYI(1, ("null user")); + cFYI(1, "null user"); volume_info->username = ""; } else if (volume_info->username) { /* BB fixme parse for domain name here */ - cFYI(1, ("Username: %s", volume_info->username)); + cFYI(1, "Username: %s", volume_info->username); } else { cifserror("No username specified"); /* In userspace mount helper we can get user name from alternate @@ -2357,20 +2526,20 @@ try_mount_again: goto out; } - /* this is needed for ASCII cp to Unicode converts */ if (volume_info->iocharset == NULL) { - cifs_sb->local_nls = load_nls_default(); - /* load_nls_default can not return null */ + /* load_nls_default cannot return null */ + volume_info->local_nls = load_nls_default(); } else { - cifs_sb->local_nls = load_nls(volume_info->iocharset); - if (cifs_sb->local_nls == NULL) { - cERROR(1, ("CIFS mount error: iocharset %s not found", - volume_info->iocharset)); + volume_info->local_nls = load_nls(volume_info->iocharset); + if (volume_info->local_nls == NULL) { + cERROR(1, "CIFS mount error: iocharset %s not found", + volume_info->iocharset); rc = -ELIBACC; goto out; } } + cifs_sb->local_nls = volume_info->local_nls; /* get a reference to a tcp session */ srvTcp = cifs_get_tcp_session(volume_info); @@ -2379,148 +2548,30 @@ try_mount_again: goto out; } - pSesInfo = cifs_find_smb_ses(srvTcp, volume_info->username); - if (pSesInfo) { - cFYI(1, ("Existing smb sess found (status=%d)", - pSesInfo->status)); - /* - * The existing SMB session already has a reference to srvTcp, - * so we can put back the extra one we got before - */ - cifs_put_tcp_session(srvTcp); - - mutex_lock(&pSesInfo->session_mutex); - if (pSesInfo->need_reconnect) { - cFYI(1, ("Session needs reconnect")); - rc = cifs_setup_session(xid, pSesInfo, - cifs_sb->local_nls); - } - mutex_unlock(&pSesInfo->session_mutex); - } else if (!rc) { - cFYI(1, ("Existing smb sess not found")); - pSesInfo = sesInfoAlloc(); - if (pSesInfo == NULL) { - rc = -ENOMEM; - goto mount_fail_check; - } - - /* new SMB session uses our srvTcp ref */ - pSesInfo->server = srvTcp; - if (srvTcp->addr.sockAddr6.sin6_family == AF_INET6) - sprintf(pSesInfo->serverName, "%pI6", - &srvTcp->addr.sockAddr6.sin6_addr); - else - sprintf(pSesInfo->serverName, "%pI4", - &srvTcp->addr.sockAddr.sin_addr.s_addr); - - write_lock(&cifs_tcp_ses_lock); - list_add(&pSesInfo->smb_ses_list, &srvTcp->smb_ses_list); - write_unlock(&cifs_tcp_ses_lock); - - /* volume_info->password freed at unmount */ - if (volume_info->password) { - pSesInfo->password = kstrdup(volume_info->password, - GFP_KERNEL); - if (!pSesInfo->password) { - rc = -ENOMEM; - goto mount_fail_check; - } - } - if (volume_info->username) - strncpy(pSesInfo->userName, volume_info->username, - MAX_USERNAME_SIZE); - if (volume_info->domainname) { - int len = strlen(volume_info->domainname); - pSesInfo->domainName = kmalloc(len + 1, GFP_KERNEL); - if (pSesInfo->domainName) - strcpy(pSesInfo->domainName, - volume_info->domainname); - } - pSesInfo->linux_uid = volume_info->linux_uid; - pSesInfo->overrideSecFlg = volume_info->secFlg; - mutex_lock(&pSesInfo->session_mutex); - - /* BB FIXME need to pass vol->secFlgs BB */ - rc = cifs_setup_session(xid, pSesInfo, - cifs_sb->local_nls); - mutex_unlock(&pSesInfo->session_mutex); + /* get a reference to a SMB session */ + pSesInfo = cifs_get_smb_ses(srvTcp, volume_info); + if (IS_ERR(pSesInfo)) { + rc = PTR_ERR(pSesInfo); + pSesInfo = NULL; + goto mount_fail_check; } - /* search for existing tcon to this server share */ - if (!rc) { - setup_cifs_sb(volume_info, cifs_sb); - - tcon = cifs_find_tcon(pSesInfo, volume_info->UNC); - if (tcon) { - cFYI(1, ("Found match on UNC path")); - /* existing tcon already has a reference */ - cifs_put_smb_ses(pSesInfo); - if (tcon->seal != volume_info->seal) - cERROR(1, ("transport encryption setting " - "conflicts with existing tid")); - } else { - tcon = tconInfoAlloc(); - if (tcon == NULL) { - rc = -ENOMEM; - goto mount_fail_check; - } - - tcon->ses = pSesInfo; - if (volume_info->password) { - tcon->password = kstrdup(volume_info->password, - GFP_KERNEL); - if (!tcon->password) { - rc = -ENOMEM; - goto mount_fail_check; - } - } - - if ((strchr(volume_info->UNC + 3, '\\') == NULL) - && (strchr(volume_info->UNC + 3, '/') == NULL)) { - cERROR(1, ("Missing share name")); - rc = -ENODEV; - goto mount_fail_check; - } else { - /* BB Do we need to wrap sesSem around - * this TCon call and Unix SetFS as - * we do on SessSetup and reconnect? */ - rc = CIFSTCon(xid, pSesInfo, volume_info->UNC, - tcon, cifs_sb->local_nls); - cFYI(1, ("CIFS Tcon rc = %d", rc)); - if (volume_info->nodfs) { - tcon->Flags &= ~SMB_SHARE_IS_IN_DFS; - cFYI(1, ("DFS disabled (%d)", - tcon->Flags)); - } - } - if (rc) - goto remote_path_check; - tcon->seal = volume_info->seal; - write_lock(&cifs_tcp_ses_lock); - list_add(&tcon->tcon_list, &pSesInfo->tcon_list); - write_unlock(&cifs_tcp_ses_lock); - } - - /* we can have only one retry value for a connection - to a share so for resources mounted more than once - to the same server share the last value passed in - for the retry flag is used */ - tcon->retry = volume_info->retry; - tcon->nocase = volume_info->nocase; - tcon->local_lease = volume_info->local_lease; - } - if (pSesInfo) { - if (pSesInfo->capabilities & CAP_LARGE_FILES) - sb->s_maxbytes = MAX_LFS_FILESIZE; - else - sb->s_maxbytes = MAX_NON_LFS; - } + setup_cifs_sb(volume_info, cifs_sb); + if (pSesInfo->capabilities & CAP_LARGE_FILES) + sb->s_maxbytes = MAX_LFS_FILESIZE; + else + sb->s_maxbytes = MAX_NON_LFS; /* BB FIXME fix time_gran to be larger for LANMAN sessions */ sb->s_time_gran = 100; - if (rc) + /* search for existing tcon to this server share */ + tcon = cifs_get_tcon(pSesInfo, volume_info); + if (IS_ERR(tcon)) { + rc = PTR_ERR(tcon); + tcon = NULL; goto remote_path_check; + } cifs_sb->tcon = tcon; @@ -2544,7 +2595,7 @@ try_mount_again: if ((tcon->unix_ext == 0) && (cifs_sb->rsize > (1024 * 127))) { cifs_sb->rsize = 1024 * 127; - cFYI(DBG2, ("no very large read support, rsize now 127K")); + cFYI(DBG2, "no very large read support, rsize now 127K"); } if (!(tcon->ses->capabilities & CAP_LARGE_WRITE_X)) cifs_sb->wsize = min(cifs_sb->wsize, @@ -2593,7 +2644,7 @@ remote_path_check: goto mount_fail_check; } - cFYI(1, ("Getting referral for: %s", full_path)); + cFYI(1, "Getting referral for: %s", full_path); rc = get_dfs_path(xid, pSesInfo , full_path + 1, cifs_sb->local_nls, &num_referrals, &referrals, cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR); @@ -2707,7 +2758,7 @@ CIFSTCon(unsigned int xid, struct cifsSesInfo *ses, by Samba (not sure whether other servers allow NTLMv2 password here) */ #ifdef CONFIG_CIFS_WEAK_PW_HASH - if ((extended_security & CIFSSEC_MAY_LANMAN) && + if ((global_secflags & CIFSSEC_MAY_LANMAN) && (ses->server->secType == LANMAN)) calc_lanman_hash(tcon->password, ses->server->cryptKey, ses->server->secMode & @@ -2778,13 +2829,13 @@ CIFSTCon(unsigned int xid, struct cifsSesInfo *ses, if (length == 3) { if ((bcc_ptr[0] == 'I') && (bcc_ptr[1] == 'P') && (bcc_ptr[2] == 'C')) { - cFYI(1, ("IPC connection")); + cFYI(1, "IPC connection"); tcon->ipc = 1; } } else if (length == 2) { if ((bcc_ptr[0] == 'A') && (bcc_ptr[1] == ':')) { /* the most common case */ - cFYI(1, ("disk share connection")); + cFYI(1, "disk share connection"); } } bcc_ptr += length + 1; @@ -2797,7 +2848,7 @@ CIFSTCon(unsigned int xid, struct cifsSesInfo *ses, bytes_left, is_unicode, nls_codepage); - cFYI(1, ("nativeFileSystem=%s", tcon->nativeFileSystem)); + cFYI(1, "nativeFileSystem=%s", tcon->nativeFileSystem); if ((smb_buffer_response->WordCount == 3) || (smb_buffer_response->WordCount == 7)) @@ -2805,7 +2856,7 @@ CIFSTCon(unsigned int xid, struct cifsSesInfo *ses, tcon->Flags = le16_to_cpu(pSMBr->OptionalSupport); else tcon->Flags = 0; - cFYI(1, ("Tcon flags: 0x%x ", tcon->Flags)); + cFYI(1, "Tcon flags: 0x%x ", tcon->Flags); } else if ((rc == 0) && tcon == NULL) { /* all we need to save for IPC$ connection */ ses->ipc_tid = smb_buffer_response->Tid; @@ -2833,57 +2884,61 @@ cifs_umount(struct super_block *sb, struct cifs_sb_info *cifs_sb) return rc; } -int cifs_setup_session(unsigned int xid, struct cifsSesInfo *pSesInfo, - struct nls_table *nls_info) +int cifs_negotiate_protocol(unsigned int xid, struct cifsSesInfo *ses) { int rc = 0; - int first_time = 0; - struct TCP_Server_Info *server = pSesInfo->server; - - /* what if server changes its buffer size after dropping the session? */ - if (server->maxBuf == 0) /* no need to send on reconnect */ { - rc = CIFSSMBNegotiate(xid, pSesInfo); - if (rc == -EAGAIN) { - /* retry only once on 1st time connection */ - rc = CIFSSMBNegotiate(xid, pSesInfo); - if (rc == -EAGAIN) - rc = -EHOSTDOWN; - } - if (rc == 0) { - spin_lock(&GlobalMid_Lock); - if (server->tcpStatus != CifsExiting) - server->tcpStatus = CifsGood; - else - rc = -EHOSTDOWN; - spin_unlock(&GlobalMid_Lock); + struct TCP_Server_Info *server = ses->server; + + /* only send once per connect */ + if (server->maxBuf != 0) + return 0; + + rc = CIFSSMBNegotiate(xid, ses); + if (rc == -EAGAIN) { + /* retry only once on 1st time connection */ + rc = CIFSSMBNegotiate(xid, ses); + if (rc == -EAGAIN) + rc = -EHOSTDOWN; + } + if (rc == 0) { + spin_lock(&GlobalMid_Lock); + if (server->tcpStatus != CifsExiting) + server->tcpStatus = CifsGood; + else + rc = -EHOSTDOWN; + spin_unlock(&GlobalMid_Lock); - } - first_time = 1; } - if (rc) - goto ss_err_exit; + return rc; +} + + +int cifs_setup_session(unsigned int xid, struct cifsSesInfo *ses, + struct nls_table *nls_info) +{ + int rc = 0; + struct TCP_Server_Info *server = ses->server; - pSesInfo->flags = 0; - pSesInfo->capabilities = server->capabilities; + ses->flags = 0; + ses->capabilities = server->capabilities; if (linuxExtEnabled == 0) - pSesInfo->capabilities &= (~CAP_UNIX); + ses->capabilities &= (~CAP_UNIX); - cFYI(1, ("Security Mode: 0x%x Capabilities: 0x%x TimeAdjust: %d", - server->secMode, server->capabilities, server->timeAdj)); + cFYI(1, "Security Mode: 0x%x Capabilities: 0x%x TimeAdjust: %d", + server->secMode, server->capabilities, server->timeAdj); - rc = CIFS_SessSetup(xid, pSesInfo, first_time, nls_info); + rc = CIFS_SessSetup(xid, ses, nls_info); if (rc) { - cERROR(1, ("Send error in SessSetup = %d", rc)); + cERROR(1, "Send error in SessSetup = %d", rc); } else { - cFYI(1, ("CIFS Session Established successfully")); + cFYI(1, "CIFS Session Established successfully"); spin_lock(&GlobalMid_Lock); - pSesInfo->status = CifsGood; - pSesInfo->need_reconnect = false; + ses->status = CifsGood; + ses->need_reconnect = false; spin_unlock(&GlobalMid_Lock); } -ss_err_exit: return rc; } diff --git a/fs/cifs/dir.c b/fs/cifs/dir.c index e9f7ecc..391816b 100644 --- a/fs/cifs/dir.c +++ b/fs/cifs/dir.c @@ -73,7 +73,7 @@ cifs_bp_rename_retry: namelen += (1 + temp->d_name.len); temp = temp->d_parent; if (temp == NULL) { - cERROR(1, ("corrupt dentry")); + cERROR(1, "corrupt dentry"); return NULL; } } @@ -90,19 +90,18 @@ cifs_bp_rename_retry: full_path[namelen] = dirsep; strncpy(full_path + namelen + 1, temp->d_name.name, temp->d_name.len); - cFYI(0, ("name: %s", full_path + namelen)); + cFYI(0, "name: %s", full_path + namelen); } temp = temp->d_parent; if (temp == NULL) { - cERROR(1, ("corrupt dentry")); + cERROR(1, "corrupt dentry"); kfree(full_path); return NULL; } } if (namelen != pplen + dfsplen) { - cERROR(1, - ("did not end path lookup where expected namelen is %d", - namelen)); + cERROR(1, "did not end path lookup where expected namelen is %d", + namelen); /* presumably this is only possible if racing with a rename of one of the parent directories (we can not lock the dentries above us to prevent this, but retrying should be harmless) */ @@ -130,6 +129,12 @@ cifs_bp_rename_retry: return full_path; } +/* + * When called with struct file pointer set to NULL, there is no way we could + * update file->private_data, but getting it stuck on openFileList provides a + * way to access it from cifs_fill_filedata and thereby set file->private_data + * from cifs_open. + */ struct cifsFileInfo * cifs_new_fileinfo(struct inode *newinode, __u16 fileHandle, struct file *file, struct vfsmount *mnt, unsigned int oflags) @@ -173,7 +178,7 @@ cifs_new_fileinfo(struct inode *newinode, __u16 fileHandle, if ((oplock & 0xF) == OPLOCK_EXCLUSIVE) { pCifsInode->clientCanCacheAll = true; pCifsInode->clientCanCacheRead = true; - cFYI(1, ("Exclusive Oplock inode %p", newinode)); + cFYI(1, "Exclusive Oplock inode %p", newinode); } else if ((oplock & 0xF) == OPLOCK_READ) pCifsInode->clientCanCacheRead = true; } @@ -183,16 +188,17 @@ cifs_new_fileinfo(struct inode *newinode, __u16 fileHandle, } int cifs_posix_open(char *full_path, struct inode **pinode, - struct vfsmount *mnt, int mode, int oflags, - __u32 *poplock, __u16 *pnetfid, int xid) + struct vfsmount *mnt, struct super_block *sb, + int mode, int oflags, + __u32 *poplock, __u16 *pnetfid, int xid) { int rc; FILE_UNIX_BASIC_INFO *presp_data; __u32 posix_flags = 0; - struct cifs_sb_info *cifs_sb = CIFS_SB(mnt->mnt_sb); + struct cifs_sb_info *cifs_sb = CIFS_SB(sb); struct cifs_fattr fattr; - cFYI(1, ("posix open %s", full_path)); + cFYI(1, "posix open %s", full_path); presp_data = kzalloc(sizeof(FILE_UNIX_BASIC_INFO), GFP_KERNEL); if (presp_data == NULL) @@ -242,7 +248,8 @@ int cifs_posix_open(char *full_path, struct inode **pinode, /* get new inode and set it up */ if (*pinode == NULL) { - *pinode = cifs_iget(mnt->mnt_sb, &fattr); + cifs_fill_uniqueid(sb, &fattr); + *pinode = cifs_iget(sb, &fattr); if (!*pinode) { rc = -ENOMEM; goto posix_open_ret; @@ -251,7 +258,18 @@ int cifs_posix_open(char *full_path, struct inode **pinode, cifs_fattr_to_inode(*pinode, &fattr); } - cifs_new_fileinfo(*pinode, *pnetfid, NULL, mnt, oflags); + /* + * cifs_fill_filedata() takes care of setting cifsFileInfo pointer to + * file->private_data. + */ + if (mnt) { + struct cifsFileInfo *pfile_info; + + pfile_info = cifs_new_fileinfo(*pinode, *pnetfid, NULL, mnt, + oflags); + if (pfile_info == NULL) + rc = -ENOMEM; + } posix_open_ret: kfree(presp_data); @@ -315,13 +333,14 @@ cifs_create(struct inode *inode, struct dentry *direntry, int mode, if (nd && (nd->flags & LOOKUP_OPEN)) oflags = nd->intent.open.flags; else - oflags = FMODE_READ; + oflags = FMODE_READ | SMB_O_CREAT; if (tcon->unix_ext && (tcon->ses->capabilities & CAP_UNIX) && (CIFS_UNIX_POSIX_PATH_OPS_CAP & le64_to_cpu(tcon->fsUnixInfo.Capability))) { - rc = cifs_posix_open(full_path, &newinode, nd->path.mnt, - mode, oflags, &oplock, &fileHandle, xid); + rc = cifs_posix_open(full_path, &newinode, + nd ? nd->path.mnt : NULL, + inode->i_sb, mode, oflags, &oplock, &fileHandle, xid); /* EIO could indicate that (posix open) operation is not supported, despite what server claimed in capability negotation. EREMOTE indicates DFS junction, which is not @@ -358,7 +377,7 @@ cifs_create(struct inode *inode, struct dentry *direntry, int mode, else if ((oflags & O_CREAT) == O_CREAT) disposition = FILE_OPEN_IF; else - cFYI(1, ("Create flag not set in create function")); + cFYI(1, "Create flag not set in create function"); } /* BB add processing to set equivalent of mode - e.g. via CreateX with @@ -394,7 +413,7 @@ cifs_create(struct inode *inode, struct dentry *direntry, int mode, cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR); } if (rc) { - cFYI(1, ("cifs_create returned 0x%x", rc)); + cFYI(1, "cifs_create returned 0x%x", rc); goto cifs_create_out; } @@ -457,15 +476,22 @@ cifs_create_set_dentry: if (rc == 0) setup_cifs_dentry(tcon, direntry, newinode); else - cFYI(1, ("Create worked, get_inode_info failed rc = %d", rc)); + cFYI(1, "Create worked, get_inode_info failed rc = %d", rc); /* nfsd case - nfs srv does not set nd */ if ((nd == NULL) || (!(nd->flags & LOOKUP_OPEN))) { /* mknod case - do not leave file open */ CIFSSMBClose(xid, tcon, fileHandle); } else if (!(posix_create) && (newinode)) { - cifs_new_fileinfo(newinode, fileHandle, NULL, - nd->path.mnt, oflags); + struct cifsFileInfo *pfile_info; + /* + * cifs_fill_filedata() takes care of setting cifsFileInfo + * pointer to file->private_data. + */ + pfile_info = cifs_new_fileinfo(newinode, fileHandle, NULL, + nd->path.mnt, oflags); + if (pfile_info == NULL) + rc = -ENOMEM; } cifs_create_out: kfree(buf); @@ -531,7 +557,7 @@ int cifs_mknod(struct inode *inode, struct dentry *direntry, int mode, u16 fileHandle; FILE_ALL_INFO *buf; - cFYI(1, ("sfu compat create special file")); + cFYI(1, "sfu compat create special file"); buf = kmalloc(sizeof(FILE_ALL_INFO), GFP_KERNEL); if (buf == NULL) { @@ -616,8 +642,8 @@ cifs_lookup(struct inode *parent_dir_inode, struct dentry *direntry, xid = GetXid(); - cFYI(1, ("parent inode = 0x%p name is: %s and dentry = 0x%p", - parent_dir_inode, direntry->d_name.name, direntry)); + cFYI(1, "parent inode = 0x%p name is: %s and dentry = 0x%p", + parent_dir_inode, direntry->d_name.name, direntry); /* check whether path exists */ @@ -632,7 +658,7 @@ cifs_lookup(struct inode *parent_dir_inode, struct dentry *direntry, int i; for (i = 0; i < direntry->d_name.len; i++) if (direntry->d_name.name[i] == '\\') { - cFYI(1, ("Invalid file name")); + cFYI(1, "Invalid file name"); FreeXid(xid); return ERR_PTR(-EINVAL); } @@ -657,11 +683,11 @@ cifs_lookup(struct inode *parent_dir_inode, struct dentry *direntry, } if (direntry->d_inode != NULL) { - cFYI(1, ("non-NULL inode in lookup")); + cFYI(1, "non-NULL inode in lookup"); } else { - cFYI(1, ("NULL inode in lookup")); + cFYI(1, "NULL inode in lookup"); } - cFYI(1, ("Full path: %s inode = 0x%p", full_path, direntry->d_inode)); + cFYI(1, "Full path: %s inode = 0x%p", full_path, direntry->d_inode); /* Posix open is only called (at lookup time) for file create now. * For opens (rather than creates), because we do not know if it @@ -678,6 +704,7 @@ cifs_lookup(struct inode *parent_dir_inode, struct dentry *direntry, (nd->flags & LOOKUP_OPEN) && !pTcon->broken_posix_open && (nd->intent.open.flags & O_CREAT)) { rc = cifs_posix_open(full_path, &newInode, nd->path.mnt, + parent_dir_inode->i_sb, nd->intent.open.create_mode, nd->intent.open.flags, &oplock, &fileHandle, xid); @@ -723,7 +750,7 @@ cifs_lookup(struct inode *parent_dir_inode, struct dentry *direntry, /* if it was once a directory (but how can we tell?) we could do shrink_dcache_parent(direntry); */ } else if (rc != -EACCES) { - cERROR(1, ("Unexpected lookup error %d", rc)); + cERROR(1, "Unexpected lookup error %d", rc); /* We special case check for Access Denied - since that is a common return code */ } @@ -742,8 +769,8 @@ cifs_d_revalidate(struct dentry *direntry, struct nameidata *nd) if (cifs_revalidate_dentry(direntry)) return 0; } else { - cFYI(1, ("neg dentry 0x%p name = %s", - direntry, direntry->d_name.name)); + cFYI(1, "neg dentry 0x%p name = %s", + direntry, direntry->d_name.name); if (time_after(jiffies, direntry->d_time + HZ) || !lookupCacheEnabled) { d_drop(direntry); @@ -758,7 +785,7 @@ cifs_d_revalidate(struct dentry *direntry, struct nameidata *nd) { int rc = 0; - cFYI(1, ("In cifs d_delete, name = %s", direntry->d_name.name)); + cFYI(1, "In cifs d_delete, name = %s", direntry->d_name.name); return rc; } */ diff --git a/fs/cifs/dns_resolve.c b/fs/cifs/dns_resolve.c index 6f8a0e3..4db2c5e 100644 --- a/fs/cifs/dns_resolve.c +++ b/fs/cifs/dns_resolve.c @@ -106,14 +106,14 @@ dns_resolve_server_name_to_ip(const char *unc, char **ip_addr) /* search for server name delimiter */ len = strlen(unc); if (len < 3) { - cFYI(1, ("%s: unc is too short: %s", __func__, unc)); + cFYI(1, "%s: unc is too short: %s", __func__, unc); return -EINVAL; } len -= 2; name = memchr(unc+2, '\\', len); if (!name) { - cFYI(1, ("%s: probably server name is whole unc: %s", - __func__, unc)); + cFYI(1, "%s: probably server name is whole unc: %s", + __func__, unc); } else { len = (name - unc) - 2/* leading // */; } @@ -127,8 +127,8 @@ dns_resolve_server_name_to_ip(const char *unc, char **ip_addr) name[len] = 0; if (is_ip(name)) { - cFYI(1, ("%s: it is IP, skipping dns upcall: %s", - __func__, name)); + cFYI(1, "%s: it is IP, skipping dns upcall: %s", + __func__, name); data = name; goto skip_upcall; } @@ -138,7 +138,7 @@ dns_resolve_server_name_to_ip(const char *unc, char **ip_addr) len = rkey->type_data.x[0]; data = rkey->payload.data; } else { - cERROR(1, ("%s: unable to resolve: %s", __func__, name)); + cERROR(1, "%s: unable to resolve: %s", __func__, name); goto out; } @@ -148,10 +148,10 @@ skip_upcall: if (*ip_addr) { memcpy(*ip_addr, data, len + 1); if (!IS_ERR(rkey)) - cFYI(1, ("%s: resolved: %s to %s", __func__, + cFYI(1, "%s: resolved: %s to %s", __func__, name, *ip_addr - )); + ); rc = 0; } else { rc = -ENOMEM; diff --git a/fs/cifs/export.c b/fs/cifs/export.c index 6177f7c..993f820 100644 --- a/fs/cifs/export.c +++ b/fs/cifs/export.c @@ -49,7 +49,7 @@ static struct dentry *cifs_get_parent(struct dentry *dentry) { /* BB need to add code here eventually to enable export via NFSD */ - cFYI(1, ("get parent for %p", dentry)); + cFYI(1, "get parent for %p", dentry); return ERR_PTR(-EACCES); } diff --git a/fs/cifs/file.c b/fs/cifs/file.c index 9b11a8f..a83541e 100644 --- a/fs/cifs/file.c +++ b/fs/cifs/file.c @@ -3,7 +3,7 @@ * * vfs operations that deal with files * - * Copyright (C) International Business Machines Corp., 2002,2007 + * Copyright (C) International Business Machines Corp., 2002,2010 * Author(s): Steve French (sfrench@us.ibm.com) * Jeremy Allison (jra@samba.org) * @@ -108,8 +108,7 @@ static inline int cifs_get_disposition(unsigned int flags) /* all arguments to this function must be checked for validity in caller */ static inline int cifs_posix_open_inode_helper(struct inode *inode, struct file *file, - struct cifsInodeInfo *pCifsInode, - struct cifsFileInfo *pCifsFile, __u32 oplock, + struct cifsInodeInfo *pCifsInode, __u32 oplock, u16 netfid) { @@ -136,15 +135,15 @@ cifs_posix_open_inode_helper(struct inode *inode, struct file *file, if (timespec_equal(&file->f_path.dentry->d_inode->i_mtime, &temp) && (file->f_path.dentry->d_inode->i_size == (loff_t)le64_to_cpu(buf->EndOfFile))) { - cFYI(1, ("inode unchanged on server")); + cFYI(1, "inode unchanged on server"); } else { if (file->f_path.dentry->d_inode->i_mapping) { rc = filemap_write_and_wait(file->f_path.dentry->d_inode->i_mapping); if (rc != 0) CIFS_I(file->f_path.dentry->d_inode)->write_behind_rc = rc; } - cFYI(1, ("invalidating remote inode since open detected it " - "changed")); + cFYI(1, "invalidating remote inode since open detected it " + "changed"); invalidate_remote_inode(file->f_path.dentry->d_inode); } */ @@ -152,8 +151,8 @@ psx_client_can_cache: if ((oplock & 0xF) == OPLOCK_EXCLUSIVE) { pCifsInode->clientCanCacheAll = true; pCifsInode->clientCanCacheRead = true; - cFYI(1, ("Exclusive Oplock granted on inode %p", - file->f_path.dentry->d_inode)); + cFYI(1, "Exclusive Oplock granted on inode %p", + file->f_path.dentry->d_inode); } else if ((oplock & 0xF) == OPLOCK_READ) pCifsInode->clientCanCacheRead = true; @@ -190,8 +189,8 @@ cifs_fill_filedata(struct file *file) if (file->private_data != NULL) { return pCifsFile; } else if ((file->f_flags & O_CREAT) && (file->f_flags & O_EXCL)) - cERROR(1, ("could not find file instance for " - "new file %p", file)); + cERROR(1, "could not find file instance for " + "new file %p", file); return NULL; } @@ -217,7 +216,7 @@ static inline int cifs_open_inode_helper(struct inode *inode, struct file *file, if (timespec_equal(&file->f_path.dentry->d_inode->i_mtime, &temp) && (file->f_path.dentry->d_inode->i_size == (loff_t)le64_to_cpu(buf->EndOfFile))) { - cFYI(1, ("inode unchanged on server")); + cFYI(1, "inode unchanged on server"); } else { if (file->f_path.dentry->d_inode->i_mapping) { /* BB no need to lock inode until after invalidate @@ -226,8 +225,8 @@ static inline int cifs_open_inode_helper(struct inode *inode, struct file *file, if (rc != 0) CIFS_I(file->f_path.dentry->d_inode)->write_behind_rc = rc; } - cFYI(1, ("invalidating remote inode since open detected it " - "changed")); + cFYI(1, "invalidating remote inode since open detected it " + "changed"); invalidate_remote_inode(file->f_path.dentry->d_inode); } @@ -242,8 +241,8 @@ client_can_cache: if ((*oplock & 0xF) == OPLOCK_EXCLUSIVE) { pCifsInode->clientCanCacheAll = true; pCifsInode->clientCanCacheRead = true; - cFYI(1, ("Exclusive Oplock granted on inode %p", - file->f_path.dentry->d_inode)); + cFYI(1, "Exclusive Oplock granted on inode %p", + file->f_path.dentry->d_inode); } else if ((*oplock & 0xF) == OPLOCK_READ) pCifsInode->clientCanCacheRead = true; @@ -285,8 +284,8 @@ int cifs_open(struct inode *inode, struct file *file) return rc; } - cFYI(1, ("inode = 0x%p file flags are 0x%x for %s", - inode, file->f_flags, full_path)); + cFYI(1, "inode = 0x%p file flags are 0x%x for %s", + inode, file->f_flags, full_path); if (oplockEnabled) oplock = REQ_OPLOCK; @@ -298,27 +297,29 @@ int cifs_open(struct inode *inode, struct file *file) (CIFS_UNIX_POSIX_PATH_OPS_CAP & le64_to_cpu(tcon->fsUnixInfo.Capability))) { int oflags = (int) cifs_posix_convert_flags(file->f_flags); + oflags |= SMB_O_CREAT; /* can not refresh inode info since size could be stale */ rc = cifs_posix_open(full_path, &inode, file->f_path.mnt, - cifs_sb->mnt_file_mode /* ignored */, - oflags, &oplock, &netfid, xid); + inode->i_sb, + cifs_sb->mnt_file_mode /* ignored */, + oflags, &oplock, &netfid, xid); if (rc == 0) { - cFYI(1, ("posix open succeeded")); + cFYI(1, "posix open succeeded"); /* no need for special case handling of setting mode on read only files needed here */ pCifsFile = cifs_fill_filedata(file); cifs_posix_open_inode_helper(inode, file, pCifsInode, - pCifsFile, oplock, netfid); + oplock, netfid); goto out; } else if ((rc == -EINVAL) || (rc == -EOPNOTSUPP)) { if (tcon->ses->serverNOS) - cERROR(1, ("server %s of type %s returned" + cERROR(1, "server %s of type %s returned" " unexpected error on SMB posix open" ", disabling posix open support." " Check if server update available.", tcon->ses->serverName, - tcon->ses->serverNOS)); + tcon->ses->serverNOS); tcon->broken_posix_open = true; } else if ((rc != -EIO) && (rc != -EREMOTE) && (rc != -EOPNOTSUPP)) /* path not found or net err */ @@ -386,7 +387,7 @@ int cifs_open(struct inode *inode, struct file *file) & CIFS_MOUNT_MAP_SPECIAL_CHR); } if (rc) { - cFYI(1, ("cifs_open returned 0x%x", rc)); + cFYI(1, "cifs_open returned 0x%x", rc); goto out; } @@ -469,7 +470,7 @@ static int cifs_reopen_file(struct file *file, bool can_flush) } if (file->f_path.dentry == NULL) { - cERROR(1, ("no valid name if dentry freed")); + cERROR(1, "no valid name if dentry freed"); dump_stack(); rc = -EBADF; goto reopen_error_exit; @@ -477,7 +478,7 @@ static int cifs_reopen_file(struct file *file, bool can_flush) inode = file->f_path.dentry->d_inode; if (inode == NULL) { - cERROR(1, ("inode not valid")); + cERROR(1, "inode not valid"); dump_stack(); rc = -EBADF; goto reopen_error_exit; @@ -499,8 +500,8 @@ reopen_error_exit: return rc; } - cFYI(1, ("inode = 0x%p file flags 0x%x for %s", - inode, file->f_flags, full_path)); + cFYI(1, "inode = 0x%p file flags 0x%x for %s", + inode, file->f_flags, full_path); if (oplockEnabled) oplock = REQ_OPLOCK; @@ -513,10 +514,11 @@ reopen_error_exit: int oflags = (int) cifs_posix_convert_flags(file->f_flags); /* can not refresh inode info since size could be stale */ rc = cifs_posix_open(full_path, NULL, file->f_path.mnt, - cifs_sb->mnt_file_mode /* ignored */, - oflags, &oplock, &netfid, xid); + inode->i_sb, + cifs_sb->mnt_file_mode /* ignored */, + oflags, &oplock, &netfid, xid); if (rc == 0) { - cFYI(1, ("posix reopen succeeded")); + cFYI(1, "posix reopen succeeded"); goto reopen_success; } /* fallthrough to retry open the old way on errors, especially @@ -537,8 +539,8 @@ reopen_error_exit: CIFS_MOUNT_MAP_SPECIAL_CHR); if (rc) { mutex_unlock(&pCifsFile->fh_mutex); - cFYI(1, ("cifs_open returned 0x%x", rc)); - cFYI(1, ("oplock: %d", oplock)); + cFYI(1, "cifs_open returned 0x%x", rc); + cFYI(1, "oplock: %d", oplock); } else { reopen_success: pCifsFile->netfid = netfid; @@ -570,8 +572,8 @@ reopen_success: if ((oplock & 0xF) == OPLOCK_EXCLUSIVE) { pCifsInode->clientCanCacheAll = true; pCifsInode->clientCanCacheRead = true; - cFYI(1, ("Exclusive Oplock granted on inode %p", - file->f_path.dentry->d_inode)); + cFYI(1, "Exclusive Oplock granted on inode %p", + file->f_path.dentry->d_inode); } else if ((oplock & 0xF) == OPLOCK_READ) { pCifsInode->clientCanCacheRead = true; pCifsInode->clientCanCacheAll = false; @@ -619,8 +621,7 @@ int cifs_close(struct inode *inode, struct file *file) the struct would be in each open file, but this should give enough time to clear the socket */ - cFYI(DBG2, - ("close delay, write pending")); + cFYI(DBG2, "close delay, write pending"); msleep(timeout); timeout *= 4; } @@ -653,7 +654,7 @@ int cifs_close(struct inode *inode, struct file *file) read_lock(&GlobalSMBSeslock); if (list_empty(&(CIFS_I(inode)->openFileList))) { - cFYI(1, ("closing last open instance for inode %p", inode)); + cFYI(1, "closing last open instance for inode %p", inode); /* if the file is not open we do not know if we can cache info on this inode, much less write behind and read ahead */ CIFS_I(inode)->clientCanCacheRead = false; @@ -674,7 +675,7 @@ int cifs_closedir(struct inode *inode, struct file *file) (struct cifsFileInfo *)file->private_data; char *ptmp; - cFYI(1, ("Closedir inode = 0x%p", inode)); + cFYI(1, "Closedir inode = 0x%p", inode); xid = GetXid(); @@ -685,22 +686,22 @@ int cifs_closedir(struct inode *inode, struct file *file) pTcon = cifs_sb->tcon; - cFYI(1, ("Freeing private data in close dir")); + cFYI(1, "Freeing private data in close dir"); write_lock(&GlobalSMBSeslock); if (!pCFileStruct->srch_inf.endOfSearch && !pCFileStruct->invalidHandle) { pCFileStruct->invalidHandle = true; write_unlock(&GlobalSMBSeslock); rc = CIFSFindClose(xid, pTcon, pCFileStruct->netfid); - cFYI(1, ("Closing uncompleted readdir with rc %d", - rc)); + cFYI(1, "Closing uncompleted readdir with rc %d", + rc); /* not much we can do if it fails anyway, ignore rc */ rc = 0; } else write_unlock(&GlobalSMBSeslock); ptmp = pCFileStruct->srch_inf.ntwrk_buf_start; if (ptmp) { - cFYI(1, ("closedir free smb buf in srch struct")); + cFYI(1, "closedir free smb buf in srch struct"); pCFileStruct->srch_inf.ntwrk_buf_start = NULL; if (pCFileStruct->srch_inf.smallBuf) cifs_small_buf_release(ptmp); @@ -748,49 +749,49 @@ int cifs_lock(struct file *file, int cmd, struct file_lock *pfLock) rc = -EACCES; xid = GetXid(); - cFYI(1, ("Lock parm: 0x%x flockflags: " + cFYI(1, "Lock parm: 0x%x flockflags: " "0x%x flocktype: 0x%x start: %lld end: %lld", cmd, pfLock->fl_flags, pfLock->fl_type, pfLock->fl_start, - pfLock->fl_end)); + pfLock->fl_end); if (pfLock->fl_flags & FL_POSIX) - cFYI(1, ("Posix")); + cFYI(1, "Posix"); if (pfLock->fl_flags & FL_FLOCK) - cFYI(1, ("Flock")); + cFYI(1, "Flock"); if (pfLock->fl_flags & FL_SLEEP) { - cFYI(1, ("Blocking lock")); + cFYI(1, "Blocking lock"); wait_flag = true; } if (pfLock->fl_flags & FL_ACCESS) - cFYI(1, ("Process suspended by mandatory locking - " - "not implemented yet")); + cFYI(1, "Process suspended by mandatory locking - " + "not implemented yet"); if (pfLock->fl_flags & FL_LEASE) - cFYI(1, ("Lease on file - not implemented yet")); + cFYI(1, "Lease on file - not implemented yet"); if (pfLock->fl_flags & (~(FL_POSIX | FL_FLOCK | FL_SLEEP | FL_ACCESS | FL_LEASE))) - cFYI(1, ("Unknown lock flags 0x%x", pfLock->fl_flags)); + cFYI(1, "Unknown lock flags 0x%x", pfLock->fl_flags); if (pfLock->fl_type == F_WRLCK) { - cFYI(1, ("F_WRLCK ")); + cFYI(1, "F_WRLCK "); numLock = 1; } else if (pfLock->fl_type == F_UNLCK) { - cFYI(1, ("F_UNLCK")); + cFYI(1, "F_UNLCK"); numUnlock = 1; /* Check if unlock includes more than one lock range */ } else if (pfLock->fl_type == F_RDLCK) { - cFYI(1, ("F_RDLCK")); + cFYI(1, "F_RDLCK"); lockType |= LOCKING_ANDX_SHARED_LOCK; numLock = 1; } else if (pfLock->fl_type == F_EXLCK) { - cFYI(1, ("F_EXLCK")); + cFYI(1, "F_EXLCK"); numLock = 1; } else if (pfLock->fl_type == F_SHLCK) { - cFYI(1, ("F_SHLCK")); + cFYI(1, "F_SHLCK"); lockType |= LOCKING_ANDX_SHARED_LOCK; numLock = 1; } else - cFYI(1, ("Unknown type of lock")); + cFYI(1, "Unknown type of lock"); cifs_sb = CIFS_SB(file->f_path.dentry->d_sb); tcon = cifs_sb->tcon; @@ -833,8 +834,8 @@ int cifs_lock(struct file *file, int cmd, struct file_lock *pfLock) 0 /* wait flag */ ); pfLock->fl_type = F_UNLCK; if (rc != 0) - cERROR(1, ("Error unlocking previously locked " - "range %d during test of lock", rc)); + cERROR(1, "Error unlocking previously locked " + "range %d during test of lock", rc); rc = 0; } else { @@ -856,9 +857,9 @@ int cifs_lock(struct file *file, int cmd, struct file_lock *pfLock) 0 /* wait flag */); pfLock->fl_type = F_RDLCK; if (rc != 0) - cERROR(1, ("Error unlocking " + cERROR(1, "Error unlocking " "previously locked range %d " - "during test of lock", rc)); + "during test of lock", rc); rc = 0; } else { pfLock->fl_type = F_WRLCK; @@ -923,9 +924,10 @@ int cifs_lock(struct file *file, int cmd, struct file_lock *pfLock) 1, 0, li->type, false); if (stored_rc) rc = stored_rc; - - list_del(&li->llist); - kfree(li); + else { + list_del(&li->llist); + kfree(li); + } } } mutex_unlock(&fid->lock_mutex); @@ -988,9 +990,8 @@ ssize_t cifs_user_write(struct file *file, const char __user *write_data, pTcon = cifs_sb->tcon; - /* cFYI(1, - (" write %d bytes to offset %lld of %s", write_size, - *poffset, file->f_path.dentry->d_name.name)); */ + /* cFYI(1, " write %d bytes to offset %lld of %s", write_size, + *poffset, file->f_path.dentry->d_name.name); */ if (file->private_data == NULL) return -EBADF; @@ -1091,8 +1092,8 @@ static ssize_t cifs_write(struct file *file, const char *write_data, pTcon = cifs_sb->tcon; - cFYI(1, ("write %zd bytes to offset %lld of %s", write_size, - *poffset, file->f_path.dentry->d_name.name)); + cFYI(1, "write %zd bytes to offset %lld of %s", write_size, + *poffset, file->f_path.dentry->d_name.name); if (file->private_data == NULL) return -EBADF; @@ -1233,7 +1234,7 @@ struct cifsFileInfo *find_writable_file(struct cifsInodeInfo *cifs_inode) it being zero) during stress testcases so we need to check for it */ if (cifs_inode == NULL) { - cERROR(1, ("Null inode passed to cifs_writeable_file")); + cERROR(1, "Null inode passed to cifs_writeable_file"); dump_stack(); return NULL; } @@ -1277,7 +1278,7 @@ refind_writable: again. Note that it would be bad to hold up writepages here (rather than in caller) with continuous retries */ - cFYI(1, ("wp failed on reopen file")); + cFYI(1, "wp failed on reopen file"); read_lock(&GlobalSMBSeslock); /* can not use this handle, no write pending on this one after all */ @@ -1353,7 +1354,7 @@ static int cifs_partialpagewrite(struct page *page, unsigned from, unsigned to) else if (bytes_written < 0) rc = bytes_written; } else { - cFYI(1, ("No writeable filehandles for inode")); + cFYI(1, "No writeable filehandles for inode"); rc = -EIO; } @@ -1525,7 +1526,7 @@ retry: */ open_file = find_writable_file(CIFS_I(mapping->host)); if (!open_file) { - cERROR(1, ("No writable handles for inode")); + cERROR(1, "No writable handles for inode"); rc = -EBADF; } else { long_op = cifs_write_timeout(cifsi, offset); @@ -1538,8 +1539,8 @@ retry: cifs_update_eof(cifsi, offset, bytes_written); if (rc || bytes_written < bytes_to_write) { - cERROR(1, ("Write2 ret %d, wrote %d", - rc, bytes_written)); + cERROR(1, "Write2 ret %d, wrote %d", + rc, bytes_written); /* BB what if continued retry is requested via mount flags? */ if (rc == -ENOSPC) @@ -1600,7 +1601,7 @@ static int cifs_writepage(struct page *page, struct writeback_control *wbc) /* BB add check for wbc flags */ page_cache_get(page); if (!PageUptodate(page)) - cFYI(1, ("ppw - page not up to date")); + cFYI(1, "ppw - page not up to date"); /* * Set the "writeback" flag, and clear "dirty" in the radix tree. @@ -1629,8 +1630,8 @@ static int cifs_write_end(struct file *file, struct address_space *mapping, int rc; struct inode *inode = mapping->host; - cFYI(1, ("write_end for page %p from pos %lld with %d bytes", - page, pos, copied)); + cFYI(1, "write_end for page %p from pos %lld with %d bytes", + page, pos, copied); if (PageChecked(page)) { if (copied == len) @@ -1686,8 +1687,8 @@ int cifs_fsync(struct file *file, struct dentry *dentry, int datasync) xid = GetXid(); - cFYI(1, ("Sync file - name: %s datasync: 0x%x", - dentry->d_name.name, datasync)); + cFYI(1, "Sync file - name: %s datasync: 0x%x", + dentry->d_name.name, datasync); rc = filemap_write_and_wait(inode->i_mapping); if (rc == 0) { @@ -1711,7 +1712,7 @@ int cifs_fsync(struct file *file, struct dentry *dentry, int datasync) unsigned int rpages = 0; int rc = 0; - cFYI(1, ("sync page %p",page)); + cFYI(1, "sync page %p", page); mapping = page->mapping; if (!mapping) return 0; @@ -1722,7 +1723,7 @@ int cifs_fsync(struct file *file, struct dentry *dentry, int datasync) /* fill in rpages then result = cifs_pagein_inode(inode, index, rpages); */ /* BB finish */ -/* cFYI(1, ("rpages is %d for sync page of Index %ld", rpages, index)); +/* cFYI(1, "rpages is %d for sync page of Index %ld", rpages, index); #if 0 if (rc < 0) @@ -1756,7 +1757,7 @@ int cifs_flush(struct file *file, fl_owner_t id) CIFS_I(inode)->write_behind_rc = 0; } - cFYI(1, ("Flush inode %p file %p rc %d", inode, file, rc)); + cFYI(1, "Flush inode %p file %p rc %d", inode, file, rc); return rc; } @@ -1788,7 +1789,7 @@ ssize_t cifs_user_read(struct file *file, char __user *read_data, open_file = (struct cifsFileInfo *)file->private_data; if ((file->f_flags & O_ACCMODE) == O_WRONLY) - cFYI(1, ("attempting read on write only file instance")); + cFYI(1, "attempting read on write only file instance"); for (total_read = 0, current_offset = read_data; read_size > total_read; @@ -1869,7 +1870,7 @@ static ssize_t cifs_read(struct file *file, char *read_data, size_t read_size, open_file = (struct cifsFileInfo *)file->private_data; if ((file->f_flags & O_ACCMODE) == O_WRONLY) - cFYI(1, ("attempting read on write only file instance")); + cFYI(1, "attempting read on write only file instance"); for (total_read = 0, current_offset = read_data; read_size > total_read; @@ -1920,7 +1921,7 @@ int cifs_file_mmap(struct file *file, struct vm_area_struct *vma) xid = GetXid(); rc = cifs_revalidate_file(file); if (rc) { - cFYI(1, ("Validation prior to mmap failed, error=%d", rc)); + cFYI(1, "Validation prior to mmap failed, error=%d", rc); FreeXid(xid); return rc; } @@ -1931,8 +1932,7 @@ int cifs_file_mmap(struct file *file, struct vm_area_struct *vma) static void cifs_copy_cache_pages(struct address_space *mapping, - struct list_head *pages, int bytes_read, char *data, - struct pagevec *plru_pvec) + struct list_head *pages, int bytes_read, char *data) { struct page *page; char *target; @@ -1944,10 +1944,10 @@ static void cifs_copy_cache_pages(struct address_space *mapping, page = list_entry(pages->prev, struct page, lru); list_del(&page->lru); - if (add_to_page_cache(page, mapping, page->index, + if (add_to_page_cache_lru(page, mapping, page->index, GFP_KERNEL)) { page_cache_release(page); - cFYI(1, ("Add page cache failed")); + cFYI(1, "Add page cache failed"); data += PAGE_CACHE_SIZE; bytes_read -= PAGE_CACHE_SIZE; continue; @@ -1970,8 +1970,6 @@ static void cifs_copy_cache_pages(struct address_space *mapping, flush_dcache_page(page); SetPageUptodate(page); unlock_page(page); - if (!pagevec_add(plru_pvec, page)) - __pagevec_lru_add_file(plru_pvec); data += PAGE_CACHE_SIZE; } return; @@ -1990,7 +1988,6 @@ static int cifs_readpages(struct file *file, struct address_space *mapping, unsigned int read_size, i; char *smb_read_data = NULL; struct smb_com_read_rsp *pSMBr; - struct pagevec lru_pvec; struct cifsFileInfo *open_file; int buf_type = CIFS_NO_BUFFER; @@ -2004,8 +2001,7 @@ static int cifs_readpages(struct file *file, struct address_space *mapping, cifs_sb = CIFS_SB(file->f_path.dentry->d_sb); pTcon = cifs_sb->tcon; - pagevec_init(&lru_pvec, 0); - cFYI(DBG2, ("rpages: num pages %d", num_pages)); + cFYI(DBG2, "rpages: num pages %d", num_pages); for (i = 0; i < num_pages; ) { unsigned contig_pages; struct page *tmp_page; @@ -2038,8 +2034,8 @@ static int cifs_readpages(struct file *file, struct address_space *mapping, /* Read size needs to be in multiples of one page */ read_size = min_t(const unsigned int, read_size, cifs_sb->rsize & PAGE_CACHE_MASK); - cFYI(DBG2, ("rpages: read size 0x%x contiguous pages %d", - read_size, contig_pages)); + cFYI(DBG2, "rpages: read size 0x%x contiguous pages %d", + read_size, contig_pages); rc = -EAGAIN; while (rc == -EAGAIN) { if ((open_file->invalidHandle) && @@ -2066,14 +2062,14 @@ static int cifs_readpages(struct file *file, struct address_space *mapping, } } if ((rc < 0) || (smb_read_data == NULL)) { - cFYI(1, ("Read error in readpages: %d", rc)); + cFYI(1, "Read error in readpages: %d", rc); break; } else if (bytes_read > 0) { task_io_account_read(bytes_read); pSMBr = (struct smb_com_read_rsp *)smb_read_data; cifs_copy_cache_pages(mapping, page_list, bytes_read, smb_read_data + 4 /* RFC1001 hdr */ + - le16_to_cpu(pSMBr->DataOffset), &lru_pvec); + le16_to_cpu(pSMBr->DataOffset)); i += bytes_read >> PAGE_CACHE_SHIFT; cifs_stats_bytes_read(pTcon, bytes_read); @@ -2089,9 +2085,9 @@ static int cifs_readpages(struct file *file, struct address_space *mapping, /* break; */ } } else { - cFYI(1, ("No bytes read (%d) at offset %lld . " - "Cleaning remaining pages from readahead list", - bytes_read, offset)); + cFYI(1, "No bytes read (%d) at offset %lld . " + "Cleaning remaining pages from readahead list", + bytes_read, offset); /* BB turn off caching and do new lookup on file size at server? */ break; @@ -2106,8 +2102,6 @@ static int cifs_readpages(struct file *file, struct address_space *mapping, bytes_read = 0; } - pagevec_lru_add_file(&lru_pvec); - /* need to free smb_read_data buf before exit */ if (smb_read_data) { if (buf_type == CIFS_SMALL_BUFFER) @@ -2136,7 +2130,7 @@ static int cifs_readpage_worker(struct file *file, struct page *page, if (rc < 0) goto io_error; else - cFYI(1, ("Bytes read %d", rc)); + cFYI(1, "Bytes read %d", rc); file->f_path.dentry->d_inode->i_atime = current_fs_time(file->f_path.dentry->d_inode->i_sb); @@ -2168,8 +2162,8 @@ static int cifs_readpage(struct file *file, struct page *page) return rc; } - cFYI(1, ("readpage %p at offset %d 0x%x\n", - page, (int)offset, (int)offset)); + cFYI(1, "readpage %p at offset %d 0x%x\n", + page, (int)offset, (int)offset); rc = cifs_readpage_worker(file, page, &offset); @@ -2239,7 +2233,7 @@ static int cifs_write_begin(struct file *file, struct address_space *mapping, struct page *page; int rc = 0; - cFYI(1, ("write_begin from %lld len %d", (long long)pos, len)); + cFYI(1, "write_begin from %lld len %d", (long long)pos, len); page = grab_cache_page_write_begin(mapping, index, flags); if (!page) { @@ -2311,12 +2305,10 @@ cifs_oplock_break(struct slow_work *work) int rc, waitrc = 0; if (inode && S_ISREG(inode->i_mode)) { -#ifdef CONFIG_CIFS_EXPERIMENTAL - if (cinode->clientCanCacheAll == 0) + if (cinode->clientCanCacheRead) break_lease(inode, O_RDONLY); - else if (cinode->clientCanCacheRead == 0) + else break_lease(inode, O_WRONLY); -#endif rc = filemap_fdatawrite(inode->i_mapping); if (cinode->clientCanCacheRead == 0) { waitrc = filemap_fdatawait(inode->i_mapping); @@ -2326,7 +2318,7 @@ cifs_oplock_break(struct slow_work *work) rc = waitrc; if (rc) cinode->write_behind_rc = rc; - cFYI(1, ("Oplock flush inode %p rc %d", inode, rc)); + cFYI(1, "Oplock flush inode %p rc %d", inode, rc); } /* @@ -2338,7 +2330,7 @@ cifs_oplock_break(struct slow_work *work) if (!cfile->closePend && !cfile->oplock_break_cancelled) { rc = CIFSSMBLock(0, cifs_sb->tcon, cfile->netfid, 0, 0, 0, 0, LOCKING_ANDX_OPLOCK_RELEASE, false); - cFYI(1, ("Oplock release rc = %d", rc)); + cFYI(1, "Oplock release rc = %d", rc); } } diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c index 35ec117..62b324f 100644 --- a/fs/cifs/inode.c +++ b/fs/cifs/inode.c @@ -1,7 +1,7 @@ /* * fs/cifs/inode.c * - * Copyright (C) International Business Machines Corp., 2002,2008 + * Copyright (C) International Business Machines Corp., 2002,2010 * Author(s): Steve French (sfrench@us.ibm.com) * * This library is free software; you can redistribute it and/or modify @@ -86,30 +86,30 @@ cifs_revalidate_cache(struct inode *inode, struct cifs_fattr *fattr) { struct cifsInodeInfo *cifs_i = CIFS_I(inode); - cFYI(1, ("%s: revalidating inode %llu", __func__, cifs_i->uniqueid)); + cFYI(1, "%s: revalidating inode %llu", __func__, cifs_i->uniqueid); if (inode->i_state & I_NEW) { - cFYI(1, ("%s: inode %llu is new", __func__, cifs_i->uniqueid)); + cFYI(1, "%s: inode %llu is new", __func__, cifs_i->uniqueid); return; } /* don't bother with revalidation if we have an oplock */ if (cifs_i->clientCanCacheRead) { - cFYI(1, ("%s: inode %llu is oplocked", __func__, - cifs_i->uniqueid)); + cFYI(1, "%s: inode %llu is oplocked", __func__, + cifs_i->uniqueid); return; } /* revalidate if mtime or size have changed */ if (timespec_equal(&inode->i_mtime, &fattr->cf_mtime) && cifs_i->server_eof == fattr->cf_eof) { - cFYI(1, ("%s: inode %llu is unchanged", __func__, - cifs_i->uniqueid)); + cFYI(1, "%s: inode %llu is unchanged", __func__, + cifs_i->uniqueid); return; } - cFYI(1, ("%s: invalidating inode %llu mapping", __func__, - cifs_i->uniqueid)); + cFYI(1, "%s: invalidating inode %llu mapping", __func__, + cifs_i->uniqueid); cifs_i->invalid_mapping = true; } @@ -137,15 +137,14 @@ cifs_fattr_to_inode(struct inode *inode, struct cifs_fattr *fattr) inode->i_mode = fattr->cf_mode; cifs_i->cifsAttrs = fattr->cf_cifsattrs; - cifs_i->uniqueid = fattr->cf_uniqueid; if (fattr->cf_flags & CIFS_FATTR_NEED_REVAL) cifs_i->time = 0; else cifs_i->time = jiffies; - cFYI(1, ("inode 0x%p old_time=%ld new_time=%ld", inode, - oldtime, cifs_i->time)); + cFYI(1, "inode 0x%p old_time=%ld new_time=%ld", inode, + oldtime, cifs_i->time); cifs_i->delete_pending = fattr->cf_flags & CIFS_FATTR_DELETE_PENDING; @@ -170,6 +169,17 @@ cifs_fattr_to_inode(struct inode *inode, struct cifs_fattr *fattr) cifs_set_ops(inode, fattr->cf_flags & CIFS_FATTR_DFS_REFERRAL); } +void +cifs_fill_uniqueid(struct super_block *sb, struct cifs_fattr *fattr) +{ + struct cifs_sb_info *cifs_sb = CIFS_SB(sb); + + if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_SERVER_INUM) + return; + + fattr->cf_uniqueid = iunique(sb, ROOT_I); +} + /* Fill a cifs_fattr struct with info from FILE_UNIX_BASIC_INFO. */ void cifs_unix_basic_to_fattr(struct cifs_fattr *fattr, FILE_UNIX_BASIC_INFO *info, @@ -227,7 +237,7 @@ cifs_unix_basic_to_fattr(struct cifs_fattr *fattr, FILE_UNIX_BASIC_INFO *info, /* safest to call it a file if we do not know */ fattr->cf_mode |= S_IFREG; fattr->cf_dtype = DT_REG; - cFYI(1, ("unknown type %d", le32_to_cpu(info->Type))); + cFYI(1, "unknown type %d", le32_to_cpu(info->Type)); break; } @@ -256,7 +266,7 @@ cifs_create_dfs_fattr(struct cifs_fattr *fattr, struct super_block *sb) { struct cifs_sb_info *cifs_sb = CIFS_SB(sb); - cFYI(1, ("creating fake fattr for DFS referral")); + cFYI(1, "creating fake fattr for DFS referral"); memset(fattr, 0, sizeof(*fattr)); fattr->cf_mode = S_IFDIR | S_IXUGO | S_IRWXU; @@ -305,7 +315,7 @@ int cifs_get_inode_info_unix(struct inode **pinode, struct cifs_sb_info *cifs_sb = CIFS_SB(sb); tcon = cifs_sb->tcon; - cFYI(1, ("Getting info on %s", full_path)); + cFYI(1, "Getting info on %s", full_path); /* could have done a find first instead but this returns more info */ rc = CIFSSMBUnixQPathInfo(xid, tcon, full_path, &find_data, @@ -323,6 +333,7 @@ int cifs_get_inode_info_unix(struct inode **pinode, if (*pinode == NULL) { /* get new inode */ + cifs_fill_uniqueid(sb, &fattr); *pinode = cifs_iget(sb, &fattr); if (!*pinode) rc = -ENOMEM; @@ -373,7 +384,7 @@ cifs_sfu_type(struct cifs_fattr *fattr, const unsigned char *path, &bytes_read, &pbuf, &buf_type); if ((rc == 0) && (bytes_read >= 8)) { if (memcmp("IntxBLK", pbuf, 8) == 0) { - cFYI(1, ("Block device")); + cFYI(1, "Block device"); fattr->cf_mode |= S_IFBLK; fattr->cf_dtype = DT_BLK; if (bytes_read == 24) { @@ -385,7 +396,7 @@ cifs_sfu_type(struct cifs_fattr *fattr, const unsigned char *path, fattr->cf_rdev = MKDEV(mjr, mnr); } } else if (memcmp("IntxCHR", pbuf, 8) == 0) { - cFYI(1, ("Char device")); + cFYI(1, "Char device"); fattr->cf_mode |= S_IFCHR; fattr->cf_dtype = DT_CHR; if (bytes_read == 24) { @@ -397,7 +408,7 @@ cifs_sfu_type(struct cifs_fattr *fattr, const unsigned char *path, fattr->cf_rdev = MKDEV(mjr, mnr); } } else if (memcmp("IntxLNK", pbuf, 7) == 0) { - cFYI(1, ("Symlink")); + cFYI(1, "Symlink"); fattr->cf_mode |= S_IFLNK; fattr->cf_dtype = DT_LNK; } else { @@ -439,10 +450,10 @@ static int cifs_sfu_mode(struct cifs_fattr *fattr, const unsigned char *path, else if (rc > 3) { mode = le32_to_cpu(*((__le32 *)ea_value)); fattr->cf_mode &= ~SFBITS_MASK; - cFYI(1, ("special bits 0%o org mode 0%o", mode, - fattr->cf_mode)); + cFYI(1, "special bits 0%o org mode 0%o", mode, + fattr->cf_mode); fattr->cf_mode = (mode & SFBITS_MASK) | fattr->cf_mode; - cFYI(1, ("special mode bits 0%o", mode)); + cFYI(1, "special mode bits 0%o", mode); } return 0; @@ -548,11 +559,11 @@ int cifs_get_inode_info(struct inode **pinode, struct cifs_fattr fattr; pTcon = cifs_sb->tcon; - cFYI(1, ("Getting info on %s", full_path)); + cFYI(1, "Getting info on %s", full_path); if ((pfindData == NULL) && (*pinode != NULL)) { if (CIFS_I(*pinode)->clientCanCacheRead) { - cFYI(1, ("No need to revalidate cached inode sizes")); + cFYI(1, "No need to revalidate cached inode sizes"); return rc; } } @@ -618,7 +629,7 @@ int cifs_get_inode_info(struct inode **pinode, cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR); if (rc1 || !fattr.cf_uniqueid) { - cFYI(1, ("GetSrvInodeNum rc %d", rc1)); + cFYI(1, "GetSrvInodeNum rc %d", rc1); fattr.cf_uniqueid = iunique(sb, ROOT_I); cifs_autodisable_serverino(cifs_sb); } @@ -634,13 +645,13 @@ int cifs_get_inode_info(struct inode **pinode, cifs_sb->mnt_cifs_flags & CIFS_MOUNT_UNX_EMUL) { tmprc = cifs_sfu_type(&fattr, full_path, cifs_sb, xid); if (tmprc) - cFYI(1, ("cifs_sfu_type failed: %d", tmprc)); + cFYI(1, "cifs_sfu_type failed: %d", tmprc); } #ifdef CONFIG_CIFS_EXPERIMENTAL /* fill in 0777 bits from ACL */ if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_CIFS_ACL) { - cFYI(1, ("Getting mode bits from ACL")); + cFYI(1, "Getting mode bits from ACL"); cifs_acl_to_fattr(cifs_sb, &fattr, *pinode, full_path, pfid); } #endif @@ -715,6 +726,16 @@ cifs_find_inode(struct inode *inode, void *opaque) if (CIFS_I(inode)->uniqueid != fattr->cf_uniqueid) return 0; + /* + * uh oh -- it's a directory. We can't use it since hardlinked dirs are + * verboten. Disable serverino and return it as if it were found, the + * caller can discard it, generate a uniqueid and retry the find + */ + if (S_ISDIR(inode->i_mode) && !list_empty(&inode->i_dentry)) { + fattr->cf_flags |= CIFS_FATTR_INO_COLLISION; + cifs_autodisable_serverino(CIFS_SB(inode->i_sb)); + } + return 1; } @@ -734,15 +755,22 @@ cifs_iget(struct super_block *sb, struct cifs_fattr *fattr) unsigned long hash; struct inode *inode; - cFYI(1, ("looking for uniqueid=%llu", fattr->cf_uniqueid)); +retry_iget5_locked: + cFYI(1, "looking for uniqueid=%llu", fattr->cf_uniqueid); /* hash down to 32-bits on 32-bit arch */ hash = cifs_uniqueid_to_ino_t(fattr->cf_uniqueid); inode = iget5_locked(sb, hash, cifs_find_inode, cifs_init_inode, fattr); - - /* we have fattrs in hand, update the inode */ if (inode) { + /* was there a problematic inode number collision? */ + if (fattr->cf_flags & CIFS_FATTR_INO_COLLISION) { + iput(inode); + fattr->cf_uniqueid = iunique(sb, ROOT_I); + fattr->cf_flags &= ~CIFS_FATTR_INO_COLLISION; + goto retry_iget5_locked; + } + cifs_fattr_to_inode(inode, fattr); if (sb->s_flags & MS_NOATIME) inode->i_flags |= S_NOATIME | S_NOCMTIME; @@ -780,7 +808,7 @@ struct inode *cifs_root_iget(struct super_block *sb, unsigned long ino) return ERR_PTR(-ENOMEM); if (rc && cifs_sb->tcon->ipc) { - cFYI(1, ("ipc connection - fake read inode")); + cFYI(1, "ipc connection - fake read inode"); inode->i_mode |= S_IFDIR; inode->i_nlink = 2; inode->i_op = &cifs_ipc_inode_ops; @@ -842,7 +870,7 @@ cifs_set_file_info(struct inode *inode, struct iattr *attrs, int xid, * server times. */ if (set_time && (attrs->ia_valid & ATTR_CTIME)) { - cFYI(1, ("CIFS - CTIME changed")); + cFYI(1, "CIFS - CTIME changed"); info_buf.ChangeTime = cpu_to_le64(cifs_UnixTimeToNT(attrs->ia_ctime)); } else @@ -877,8 +905,8 @@ cifs_set_file_info(struct inode *inode, struct iattr *attrs, int xid, goto out; } - cFYI(1, ("calling SetFileInfo since SetPathInfo for " - "times not supported by this server")); + cFYI(1, "calling SetFileInfo since SetPathInfo for " + "times not supported by this server"); rc = CIFSSMBOpen(xid, pTcon, full_path, FILE_OPEN, SYNCHRONIZE | FILE_WRITE_ATTRIBUTES, CREATE_NOT_DIR, &netfid, &oplock, @@ -1036,7 +1064,7 @@ int cifs_unlink(struct inode *dir, struct dentry *dentry) struct iattr *attrs = NULL; __u32 dosattr = 0, origattr = 0; - cFYI(1, ("cifs_unlink, dir=0x%p, dentry=0x%p", dir, dentry)); + cFYI(1, "cifs_unlink, dir=0x%p, dentry=0x%p", dir, dentry); xid = GetXid(); @@ -1055,7 +1083,7 @@ int cifs_unlink(struct inode *dir, struct dentry *dentry) rc = CIFSPOSIXDelFile(xid, tcon, full_path, SMB_POSIX_UNLINK_FILE_TARGET, cifs_sb->local_nls, cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR); - cFYI(1, ("posix del rc %d", rc)); + cFYI(1, "posix del rc %d", rc); if ((rc == 0) || (rc == -ENOENT)) goto psx_del_no_retry; } @@ -1129,7 +1157,7 @@ int cifs_mkdir(struct inode *inode, struct dentry *direntry, int mode) struct inode *newinode = NULL; struct cifs_fattr fattr; - cFYI(1, ("In cifs_mkdir, mode = 0x%x inode = 0x%p", mode, inode)); + cFYI(1, "In cifs_mkdir, mode = 0x%x inode = 0x%p", mode, inode); xid = GetXid(); @@ -1164,7 +1192,7 @@ int cifs_mkdir(struct inode *inode, struct dentry *direntry, int mode) kfree(pInfo); goto mkdir_retry_old; } else if (rc) { - cFYI(1, ("posix mkdir returned 0x%x", rc)); + cFYI(1, "posix mkdir returned 0x%x", rc); d_drop(direntry); } else { if (pInfo->Type == cpu_to_le32(-1)) { @@ -1181,6 +1209,7 @@ int cifs_mkdir(struct inode *inode, struct dentry *direntry, int mode) direntry->d_op = &cifs_dentry_ops; cifs_unix_basic_to_fattr(&fattr, pInfo, cifs_sb); + cifs_fill_uniqueid(inode->i_sb, &fattr); newinode = cifs_iget(inode->i_sb, &fattr); if (!newinode) { kfree(pInfo); @@ -1190,12 +1219,12 @@ int cifs_mkdir(struct inode *inode, struct dentry *direntry, int mode) d_instantiate(direntry, newinode); #ifdef CONFIG_CIFS_DEBUG2 - cFYI(1, ("instantiated dentry %p %s to inode %p", - direntry, direntry->d_name.name, newinode)); + cFYI(1, "instantiated dentry %p %s to inode %p", + direntry, direntry->d_name.name, newinode); if (newinode->i_nlink != 2) - cFYI(1, ("unexpected number of links %d", - newinode->i_nlink)); + cFYI(1, "unexpected number of links %d", + newinode->i_nlink); #endif } kfree(pInfo); @@ -1206,7 +1235,7 @@ mkdir_retry_old: rc = CIFSSMBMkDir(xid, pTcon, full_path, cifs_sb->local_nls, cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR); if (rc) { - cFYI(1, ("cifs_mkdir returned 0x%x", rc)); + cFYI(1, "cifs_mkdir returned 0x%x", rc); d_drop(direntry); } else { mkdir_get_info: @@ -1309,7 +1338,7 @@ int cifs_rmdir(struct inode *inode, struct dentry *direntry) char *full_path = NULL; struct cifsInodeInfo *cifsInode; - cFYI(1, ("cifs_rmdir, inode = 0x%p", inode)); + cFYI(1, "cifs_rmdir, inode = 0x%p", inode); xid = GetXid(); @@ -1511,6 +1540,11 @@ cifs_inode_needs_reval(struct inode *inode) if (time_after_eq(jiffies, cifs_i->time + HZ)) return true; + /* hardlinked files w/ noserverino get "special" treatment */ + if (!(CIFS_SB(inode->i_sb)->mnt_cifs_flags & CIFS_MOUNT_SERVER_INUM) && + S_ISREG(inode->i_mode) && inode->i_nlink != 1) + return true; + return false; } @@ -1577,9 +1611,9 @@ int cifs_revalidate_dentry(struct dentry *dentry) goto check_inval; } - cFYI(1, ("Revalidate: %s inode 0x%p count %d dentry: 0x%p d_time %ld " + cFYI(1, "Revalidate: %s inode 0x%p count %d dentry: 0x%p d_time %ld " "jiffies %ld", full_path, inode, inode->i_count.counter, - dentry, dentry->d_time, jiffies)); + dentry, dentry->d_time, jiffies); if (CIFS_SB(sb)->tcon->unix_ext) rc = cifs_get_inode_info_unix(&inode, full_path, sb, xid); @@ -1673,12 +1707,12 @@ cifs_set_file_size(struct inode *inode, struct iattr *attrs, rc = CIFSSMBSetFileSize(xid, pTcon, attrs->ia_size, nfid, npid, false); cifsFileInfo_put(open_file); - cFYI(1, ("SetFSize for attrs rc = %d", rc)); + cFYI(1, "SetFSize for attrs rc = %d", rc); if ((rc == -EINVAL) || (rc == -EOPNOTSUPP)) { unsigned int bytes_written; rc = CIFSSMBWrite(xid, pTcon, nfid, 0, attrs->ia_size, &bytes_written, NULL, NULL, 1); - cFYI(1, ("Wrt seteof rc %d", rc)); + cFYI(1, "Wrt seteof rc %d", rc); } } else rc = -EINVAL; @@ -1692,7 +1726,7 @@ cifs_set_file_size(struct inode *inode, struct iattr *attrs, false, cifs_sb->local_nls, cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR); - cFYI(1, ("SetEOF by path (setattrs) rc = %d", rc)); + cFYI(1, "SetEOF by path (setattrs) rc = %d", rc); if ((rc == -EINVAL) || (rc == -EOPNOTSUPP)) { __u16 netfid; int oplock = 0; @@ -1709,7 +1743,7 @@ cifs_set_file_size(struct inode *inode, struct iattr *attrs, attrs->ia_size, &bytes_written, NULL, NULL, 1); - cFYI(1, ("wrt seteof rc %d", rc)); + cFYI(1, "wrt seteof rc %d", rc); CIFSSMBClose(xid, pTcon, netfid); } } @@ -1737,8 +1771,8 @@ cifs_setattr_unix(struct dentry *direntry, struct iattr *attrs) struct cifs_unix_set_info_args *args = NULL; struct cifsFileInfo *open_file; - cFYI(1, ("setattr_unix on file %s attrs->ia_valid=0x%x", - direntry->d_name.name, attrs->ia_valid)); + cFYI(1, "setattr_unix on file %s attrs->ia_valid=0x%x", + direntry->d_name.name, attrs->ia_valid); xid = GetXid(); @@ -1868,8 +1902,8 @@ cifs_setattr_nounix(struct dentry *direntry, struct iattr *attrs) xid = GetXid(); - cFYI(1, ("setattr on file %s attrs->iavalid 0x%x", - direntry->d_name.name, attrs->ia_valid)); + cFYI(1, "setattr on file %s attrs->iavalid 0x%x", + direntry->d_name.name, attrs->ia_valid); if ((cifs_sb->mnt_cifs_flags & CIFS_MOUNT_NO_PERM) == 0) { /* check if we have permission to change attrs */ @@ -1926,7 +1960,7 @@ cifs_setattr_nounix(struct dentry *direntry, struct iattr *attrs) attrs->ia_valid &= ~ATTR_MODE; if (attrs->ia_valid & ATTR_MODE) { - cFYI(1, ("Mode changed to 0%o", attrs->ia_mode)); + cFYI(1, "Mode changed to 0%o", attrs->ia_mode); mode = attrs->ia_mode; } @@ -2012,7 +2046,7 @@ cifs_setattr(struct dentry *direntry, struct iattr *attrs) #if 0 void cifs_delete_inode(struct inode *inode) { - cFYI(1, ("In cifs_delete_inode, inode = 0x%p", inode)); + cFYI(1, "In cifs_delete_inode, inode = 0x%p", inode); /* may have to add back in if and when safe distributed caching of directories added e.g. via FindNotify */ } diff --git a/fs/cifs/ioctl.c b/fs/cifs/ioctl.c index f946506..505926f 100644 --- a/fs/cifs/ioctl.c +++ b/fs/cifs/ioctl.c @@ -47,7 +47,7 @@ long cifs_ioctl(struct file *filep, unsigned int command, unsigned long arg) xid = GetXid(); - cFYI(1, ("ioctl file %p cmd %u arg %lu", filep, command, arg)); + cFYI(1, "ioctl file %p cmd %u arg %lu", filep, command, arg); cifs_sb = CIFS_SB(inode->i_sb); @@ -64,12 +64,12 @@ long cifs_ioctl(struct file *filep, unsigned int command, unsigned long arg) switch (command) { case CIFS_IOC_CHECKUMOUNT: - cFYI(1, ("User unmount attempted")); + cFYI(1, "User unmount attempted"); if (cifs_sb->mnt_uid == current_uid()) rc = 0; else { rc = -EACCES; - cFYI(1, ("uids do not match")); + cFYI(1, "uids do not match"); } break; #ifdef CONFIG_CIFS_POSIX @@ -97,11 +97,11 @@ long cifs_ioctl(struct file *filep, unsigned int command, unsigned long arg) /* rc= CIFSGetExtAttr(xid,tcon,pSMBFile->netfid, extAttrBits, &ExtAttrMask);*/ } - cFYI(1, ("set flags not implemented yet")); + cFYI(1, "set flags not implemented yet"); break; #endif /* CONFIG_CIFS_POSIX */ default: - cFYI(1, ("unsupported ioctl")); + cFYI(1, "unsupported ioctl"); break; } diff --git a/fs/cifs/link.c b/fs/cifs/link.c index c1a9d42..473ca80 100644 --- a/fs/cifs/link.c +++ b/fs/cifs/link.c @@ -139,7 +139,7 @@ cifs_follow_link(struct dentry *direntry, struct nameidata *nd) if (!full_path) goto out; - cFYI(1, ("Full path: %s inode = 0x%p", full_path, inode)); + cFYI(1, "Full path: %s inode = 0x%p", full_path, inode); rc = CIFSSMBUnixQuerySymLink(xid, tcon, full_path, &target_path, cifs_sb->local_nls); @@ -178,8 +178,8 @@ cifs_symlink(struct inode *inode, struct dentry *direntry, const char *symname) return rc; } - cFYI(1, ("Full path: %s", full_path)); - cFYI(1, ("symname is %s", symname)); + cFYI(1, "Full path: %s", full_path); + cFYI(1, "symname is %s", symname); /* BB what if DFS and this volume is on different share? BB */ if (pTcon->unix_ext) @@ -198,8 +198,8 @@ cifs_symlink(struct inode *inode, struct dentry *direntry, const char *symname) inode->i_sb, xid, NULL); if (rc != 0) { - cFYI(1, ("Create symlink ok, getinodeinfo fail rc = %d", - rc)); + cFYI(1, "Create symlink ok, getinodeinfo fail rc = %d", + rc); } else { if (pTcon->nocase) direntry->d_op = &cifs_ci_dentry_ops; diff --git a/fs/cifs/misc.c b/fs/cifs/misc.c index d147499..1394aa3 100644 --- a/fs/cifs/misc.c +++ b/fs/cifs/misc.c @@ -51,7 +51,7 @@ _GetXid(void) if (GlobalTotalActiveXid > GlobalMaxActiveXid) GlobalMaxActiveXid = GlobalTotalActiveXid; if (GlobalTotalActiveXid > 65000) - cFYI(1, ("warning: more than 65000 requests active")); + cFYI(1, "warning: more than 65000 requests active"); xid = GlobalCurrentXid++; spin_unlock(&GlobalMid_Lock); return xid; @@ -88,7 +88,7 @@ void sesInfoFree(struct cifsSesInfo *buf_to_free) { if (buf_to_free == NULL) { - cFYI(1, ("Null buffer passed to sesInfoFree")); + cFYI(1, "Null buffer passed to sesInfoFree"); return; } @@ -126,7 +126,7 @@ void tconInfoFree(struct cifsTconInfo *buf_to_free) { if (buf_to_free == NULL) { - cFYI(1, ("Null buffer passed to tconInfoFree")); + cFYI(1, "Null buffer passed to tconInfoFree"); return; } atomic_dec(&tconInfoAllocCount); @@ -166,7 +166,7 @@ void cifs_buf_release(void *buf_to_free) { if (buf_to_free == NULL) { - /* cFYI(1, ("Null buffer passed to cifs_buf_release"));*/ + /* cFYI(1, "Null buffer passed to cifs_buf_release");*/ return; } mempool_free(buf_to_free, cifs_req_poolp); @@ -202,7 +202,7 @@ cifs_small_buf_release(void *buf_to_free) { if (buf_to_free == NULL) { - cFYI(1, ("Null buffer passed to cifs_small_buf_release")); + cFYI(1, "Null buffer passed to cifs_small_buf_release"); return; } mempool_free(buf_to_free, cifs_sm_req_poolp); @@ -345,19 +345,19 @@ header_assemble(struct smb_hdr *buffer, char smb_command /* command */ , /* with userid/password pairs found on the smb session */ /* for other target tcp/ip addresses BB */ if (current_fsuid() != treeCon->ses->linux_uid) { - cFYI(1, ("Multiuser mode and UID " - "did not match tcon uid")); + cFYI(1, "Multiuser mode and UID " + "did not match tcon uid"); read_lock(&cifs_tcp_ses_lock); list_for_each(temp_item, &treeCon->ses->server->smb_ses_list) { ses = list_entry(temp_item, struct cifsSesInfo, smb_ses_list); if (ses->linux_uid == current_fsuid()) { if (ses->server == treeCon->ses->server) { - cFYI(1, ("found matching uid substitute right smb_uid")); + cFYI(1, "found matching uid substitute right smb_uid"); buffer->Uid = ses->Suid; break; } else { /* BB eventually call cifs_setup_session here */ - cFYI(1, ("local UID found but no smb sess with this server exists")); + cFYI(1, "local UID found but no smb sess with this server exists"); } } } @@ -394,17 +394,16 @@ checkSMBhdr(struct smb_hdr *smb, __u16 mid) if (smb->Command == SMB_COM_LOCKING_ANDX) return 0; else - cERROR(1, ("Received Request not response")); + cERROR(1, "Received Request not response"); } } else { /* bad signature or mid */ if (*(__le32 *) smb->Protocol != cpu_to_le32(0x424d53ff)) - cERROR(1, - ("Bad protocol string signature header %x", - *(unsigned int *) smb->Protocol)); + cERROR(1, "Bad protocol string signature header %x", + *(unsigned int *) smb->Protocol); if (mid != smb->Mid) - cERROR(1, ("Mids do not match")); + cERROR(1, "Mids do not match"); } - cERROR(1, ("bad smb detected. The Mid=%d", smb->Mid)); + cERROR(1, "bad smb detected. The Mid=%d", smb->Mid); return 1; } @@ -413,7 +412,7 @@ checkSMB(struct smb_hdr *smb, __u16 mid, unsigned int length) { __u32 len = smb->smb_buf_length; __u32 clc_len; /* calculated length */ - cFYI(0, ("checkSMB Length: 0x%x, smb_buf_length: 0x%x", length, len)); + cFYI(0, "checkSMB Length: 0x%x, smb_buf_length: 0x%x", length, len); if (length < 2 + sizeof(struct smb_hdr)) { if ((length >= sizeof(struct smb_hdr) - 1) @@ -437,15 +436,15 @@ checkSMB(struct smb_hdr *smb, __u16 mid, unsigned int length) tmp[sizeof(struct smb_hdr)+1] = 0; return 0; } - cERROR(1, ("rcvd invalid byte count (bcc)")); + cERROR(1, "rcvd invalid byte count (bcc)"); } else { - cERROR(1, ("Length less than smb header size")); + cERROR(1, "Length less than smb header size"); } return 1; } if (len > CIFSMaxBufSize + MAX_CIFS_HDR_SIZE - 4) { - cERROR(1, ("smb length greater than MaxBufSize, mid=%d", - smb->Mid)); + cERROR(1, "smb length greater than MaxBufSize, mid=%d", + smb->Mid); return 1; } @@ -454,8 +453,8 @@ checkSMB(struct smb_hdr *smb, __u16 mid, unsigned int length) clc_len = smbCalcSize_LE(smb); if (4 + len != length) { - cERROR(1, ("Length read does not match RFC1001 length %d", - len)); + cERROR(1, "Length read does not match RFC1001 length %d", + len); return 1; } @@ -466,8 +465,8 @@ checkSMB(struct smb_hdr *smb, __u16 mid, unsigned int length) if (((4 + len) & 0xFFFF) == (clc_len & 0xFFFF)) return 0; /* bcc wrapped */ } - cFYI(1, ("Calculated size %d vs length %d mismatch for mid %d", - clc_len, 4 + len, smb->Mid)); + cFYI(1, "Calculated size %d vs length %d mismatch for mid %d", + clc_len, 4 + len, smb->Mid); /* Windows XP can return a few bytes too much, presumably an illegal pad, at the end of byte range lock responses so we allow for that three byte pad, as long as actual @@ -482,8 +481,8 @@ checkSMB(struct smb_hdr *smb, __u16 mid, unsigned int length) if ((4+len > clc_len) && (len <= clc_len + 512)) return 0; else { - cERROR(1, ("RFC1001 size %d bigger than SMB for Mid=%d", - len, smb->Mid)); + cERROR(1, "RFC1001 size %d bigger than SMB for Mid=%d", + len, smb->Mid); return 1; } } @@ -501,7 +500,7 @@ is_valid_oplock_break(struct smb_hdr *buf, struct TCP_Server_Info *srv) struct cifsFileInfo *netfile; int rc; - cFYI(1, ("Checking for oplock break or dnotify response")); + cFYI(1, "Checking for oplock break or dnotify response"); if ((pSMB->hdr.Command == SMB_COM_NT_TRANSACT) && (pSMB->hdr.Flags & SMBFLG_RESPONSE)) { struct smb_com_transaction_change_notify_rsp *pSMBr = @@ -513,15 +512,15 @@ is_valid_oplock_break(struct smb_hdr *buf, struct TCP_Server_Info *srv) pnotify = (struct file_notify_information *) ((char *)&pSMBr->hdr.Protocol + data_offset); - cFYI(1, ("dnotify on %s Action: 0x%x", - pnotify->FileName, pnotify->Action)); + cFYI(1, "dnotify on %s Action: 0x%x", + pnotify->FileName, pnotify->Action); /* cifs_dump_mem("Rcvd notify Data: ",buf, sizeof(struct smb_hdr)+60); */ return true; } if (pSMBr->hdr.Status.CifsError) { - cFYI(1, ("notify err 0x%d", - pSMBr->hdr.Status.CifsError)); + cFYI(1, "notify err 0x%d", + pSMBr->hdr.Status.CifsError); return true; } return false; @@ -535,7 +534,7 @@ is_valid_oplock_break(struct smb_hdr *buf, struct TCP_Server_Info *srv) large dirty files cached on the client */ if ((NT_STATUS_INVALID_HANDLE) == le32_to_cpu(pSMB->hdr.Status.CifsError)) { - cFYI(1, ("invalid handle on oplock break")); + cFYI(1, "invalid handle on oplock break"); return true; } else if (ERRbadfid == le16_to_cpu(pSMB->hdr.Status.DosError.Error)) { @@ -547,8 +546,8 @@ is_valid_oplock_break(struct smb_hdr *buf, struct TCP_Server_Info *srv) if (pSMB->hdr.WordCount != 8) return false; - cFYI(1, ("oplock type 0x%d level 0x%d", - pSMB->LockType, pSMB->OplockLevel)); + cFYI(1, "oplock type 0x%d level 0x%d", + pSMB->LockType, pSMB->OplockLevel); if (!(pSMB->LockType & LOCKING_ANDX_OPLOCK_RELEASE)) return false; @@ -579,15 +578,15 @@ is_valid_oplock_break(struct smb_hdr *buf, struct TCP_Server_Info *srv) return true; } - cFYI(1, ("file id match, oplock break")); + cFYI(1, "file id match, oplock break"); pCifsInode = CIFS_I(netfile->pInode); pCifsInode->clientCanCacheAll = false; if (pSMB->OplockLevel == 0) pCifsInode->clientCanCacheRead = false; rc = slow_work_enqueue(&netfile->oplock_break); if (rc) { - cERROR(1, ("failed to enqueue oplock " - "break: %d\n", rc)); + cERROR(1, "failed to enqueue oplock " + "break: %d\n", rc); } else { netfile->oplock_break_cancelled = false; } @@ -597,12 +596,12 @@ is_valid_oplock_break(struct smb_hdr *buf, struct TCP_Server_Info *srv) } read_unlock(&GlobalSMBSeslock); read_unlock(&cifs_tcp_ses_lock); - cFYI(1, ("No matching file for oplock break")); + cFYI(1, "No matching file for oplock break"); return true; } } read_unlock(&cifs_tcp_ses_lock); - cFYI(1, ("Can not process oplock break for non-existent connection")); + cFYI(1, "Can not process oplock break for non-existent connection"); return true; } @@ -721,11 +720,11 @@ cifs_autodisable_serverino(struct cifs_sb_info *cifs_sb) { if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_SERVER_INUM) { cifs_sb->mnt_cifs_flags &= ~CIFS_MOUNT_SERVER_INUM; - cERROR(1, ("Autodisabling the use of server inode numbers on " + cERROR(1, "Autodisabling the use of server inode numbers on " "%s. This server doesn't seem to support them " "properly. Hardlinks will not be recognized on this " "mount. Consider mounting with the \"noserverino\" " "option to silence this message.", - cifs_sb->tcon->treeName)); + cifs_sb->tcon->treeName); } } diff --git a/fs/cifs/netmisc.c b/fs/cifs/netmisc.c index bd6d689..d35d528 100644 --- a/fs/cifs/netmisc.c +++ b/fs/cifs/netmisc.c @@ -149,7 +149,7 @@ cifs_inet_pton(const int address_family, const char *cp, void *dst) else if (address_family == AF_INET6) ret = in6_pton(cp, -1 /* len */, dst , '\\', NULL); - cFYI(DBG2, ("address conversion returned %d for %s", ret, cp)); + cFYI(DBG2, "address conversion returned %d for %s", ret, cp); if (ret > 0) ret = 1; return ret; @@ -870,8 +870,8 @@ map_smb_to_linux_error(struct smb_hdr *smb, int logErr) } /* else ERRHRD class errors or junk - return EIO */ - cFYI(1, ("Mapping smb error code %d to POSIX err %d", - smberrcode, rc)); + cFYI(1, "Mapping smb error code %d to POSIX err %d", + smberrcode, rc); /* generic corrective action e.g. reconnect SMB session on * ERRbaduid could be added */ @@ -940,20 +940,20 @@ struct timespec cnvrtDosUnixTm(__le16 le_date, __le16 le_time, int offset) SMB_TIME *st = (SMB_TIME *)&time; SMB_DATE *sd = (SMB_DATE *)&date; - cFYI(1, ("date %d time %d", date, time)); + cFYI(1, "date %d time %d", date, time); sec = 2 * st->TwoSeconds; min = st->Minutes; if ((sec > 59) || (min > 59)) - cERROR(1, ("illegal time min %d sec %d", min, sec)); + cERROR(1, "illegal time min %d sec %d", min, sec); sec += (min * 60); sec += 60 * 60 * st->Hours; if (st->Hours > 24) - cERROR(1, ("illegal hours %d", st->Hours)); + cERROR(1, "illegal hours %d", st->Hours); days = sd->Day; month = sd->Month; if ((days > 31) || (month > 12)) { - cERROR(1, ("illegal date, month %d day: %d", month, days)); + cERROR(1, "illegal date, month %d day: %d", month, days); if (month > 12) month = 12; } @@ -979,7 +979,7 @@ struct timespec cnvrtDosUnixTm(__le16 le_date, __le16 le_time, int offset) ts.tv_sec = sec + offset; - /* cFYI(1,("sec after cnvrt dos to unix time %d",sec)); */ + /* cFYI(1, "sec after cnvrt dos to unix time %d",sec); */ ts.tv_nsec = 0; return ts; diff --git a/fs/cifs/readdir.c b/fs/cifs/readdir.c index 18e0bc1..daf1753 100644 --- a/fs/cifs/readdir.c +++ b/fs/cifs/readdir.c @@ -47,15 +47,15 @@ static void dump_cifs_file_struct(struct file *file, char *label) if (file) { cf = file->private_data; if (cf == NULL) { - cFYI(1, ("empty cifs private file data")); + cFYI(1, "empty cifs private file data"); return; } if (cf->invalidHandle) - cFYI(1, ("invalid handle")); + cFYI(1, "invalid handle"); if (cf->srch_inf.endOfSearch) - cFYI(1, ("end of search")); + cFYI(1, "end of search"); if (cf->srch_inf.emptyDir) - cFYI(1, ("empty dir")); + cFYI(1, "empty dir"); } } #else @@ -76,7 +76,7 @@ cifs_readdir_lookup(struct dentry *parent, struct qstr *name, struct inode *inode; struct super_block *sb = parent->d_inode->i_sb; - cFYI(1, ("For %s", name->name)); + cFYI(1, "For %s", name->name); if (parent->d_op && parent->d_op->d_hash) parent->d_op->d_hash(parent, name); @@ -214,7 +214,7 @@ int get_symlink_reparse_path(char *full_path, struct cifs_sb_info *cifs_sb, fid, cifs_sb->local_nls); if (CIFSSMBClose(xid, ptcon, fid)) { - cFYI(1, ("Error closing temporary reparsepoint open)")); + cFYI(1, "Error closing temporary reparsepoint open"); } } } @@ -252,7 +252,7 @@ static int initiate_cifs_search(const int xid, struct file *file) if (full_path == NULL) return -ENOMEM; - cFYI(1, ("Full path: %s start at: %lld", full_path, file->f_pos)); + cFYI(1, "Full path: %s start at: %lld", full_path, file->f_pos); ffirst_retry: /* test for Unix extensions */ @@ -297,7 +297,7 @@ static int cifs_unicode_bytelen(char *str) if (ustr[len] == 0) return len << 1; } - cFYI(1, ("Unicode string longer than PATH_MAX found")); + cFYI(1, "Unicode string longer than PATH_MAX found"); return len << 1; } @@ -314,19 +314,18 @@ static char *nxt_dir_entry(char *old_entry, char *end_of_smb, int level) pfData->FileNameLength; } else new_entry = old_entry + le32_to_cpu(pDirInfo->NextEntryOffset); - cFYI(1, ("new entry %p old entry %p", new_entry, old_entry)); + cFYI(1, "new entry %p old entry %p", new_entry, old_entry); /* validate that new_entry is not past end of SMB */ if (new_entry >= end_of_smb) { - cERROR(1, - ("search entry %p began after end of SMB %p old entry %p", - new_entry, end_of_smb, old_entry)); + cERROR(1, "search entry %p began after end of SMB %p old entry %p", + new_entry, end_of_smb, old_entry); return NULL; } else if (((level == SMB_FIND_FILE_INFO_STANDARD) && (new_entry + sizeof(FIND_FILE_STANDARD_INFO) > end_of_smb)) || ((level != SMB_FIND_FILE_INFO_STANDARD) && (new_entry + sizeof(FILE_DIRECTORY_INFO) > end_of_smb))) { - cERROR(1, ("search entry %p extends after end of SMB %p", - new_entry, end_of_smb)); + cERROR(1, "search entry %p extends after end of SMB %p", + new_entry, end_of_smb); return NULL; } else return new_entry; @@ -380,8 +379,8 @@ static int cifs_entry_is_dot(char *current_entry, struct cifsFileInfo *cfile) filename = &pFindData->FileName[0]; len = pFindData->FileNameLength; } else { - cFYI(1, ("Unknown findfirst level %d", - cfile->srch_inf.info_level)); + cFYI(1, "Unknown findfirst level %d", + cfile->srch_inf.info_level); } if (filename) { @@ -481,7 +480,7 @@ static int cifs_save_resume_key(const char *current_entry, len = (unsigned int)pFindData->FileNameLength; cifsFile->srch_inf.resume_key = pFindData->ResumeKey; } else { - cFYI(1, ("Unknown findfirst level %d", level)); + cFYI(1, "Unknown findfirst level %d", level); return -EINVAL; } cifsFile->srch_inf.resume_name_len = len; @@ -525,7 +524,7 @@ static int find_cifs_entry(const int xid, struct cifsTconInfo *pTcon, is_dir_changed(file)) || (index_to_find < first_entry_in_buffer)) { /* close and restart search */ - cFYI(1, ("search backing up - close and restart search")); + cFYI(1, "search backing up - close and restart search"); write_lock(&GlobalSMBSeslock); if (!cifsFile->srch_inf.endOfSearch && !cifsFile->invalidHandle) { @@ -535,7 +534,7 @@ static int find_cifs_entry(const int xid, struct cifsTconInfo *pTcon, } else write_unlock(&GlobalSMBSeslock); if (cifsFile->srch_inf.ntwrk_buf_start) { - cFYI(1, ("freeing SMB ff cache buf on search rewind")); + cFYI(1, "freeing SMB ff cache buf on search rewind"); if (cifsFile->srch_inf.smallBuf) cifs_small_buf_release(cifsFile->srch_inf. ntwrk_buf_start); @@ -546,8 +545,8 @@ static int find_cifs_entry(const int xid, struct cifsTconInfo *pTcon, } rc = initiate_cifs_search(xid, file); if (rc) { - cFYI(1, ("error %d reinitiating a search on rewind", - rc)); + cFYI(1, "error %d reinitiating a search on rewind", + rc); return rc; } cifs_save_resume_key(cifsFile->srch_inf.last_entry, cifsFile); @@ -555,7 +554,7 @@ static int find_cifs_entry(const int xid, struct cifsTconInfo *pTcon, while ((index_to_find >= cifsFile->srch_inf.index_of_last_entry) && (rc == 0) && !cifsFile->srch_inf.endOfSearch) { - cFYI(1, ("calling findnext2")); + cFYI(1, "calling findnext2"); rc = CIFSFindNext(xid, pTcon, cifsFile->netfid, &cifsFile->srch_inf); cifs_save_resume_key(cifsFile->srch_inf.last_entry, cifsFile); @@ -575,7 +574,7 @@ static int find_cifs_entry(const int xid, struct cifsTconInfo *pTcon, first_entry_in_buffer = cifsFile->srch_inf.index_of_last_entry - cifsFile->srch_inf.entries_in_buffer; pos_in_buf = index_to_find - first_entry_in_buffer; - cFYI(1, ("found entry - pos_in_buf %d", pos_in_buf)); + cFYI(1, "found entry - pos_in_buf %d", pos_in_buf); for (i = 0; (i < (pos_in_buf)) && (current_entry != NULL); i++) { /* go entry by entry figuring out which is first */ @@ -584,19 +583,19 @@ static int find_cifs_entry(const int xid, struct cifsTconInfo *pTcon, } if ((current_entry == NULL) && (i < pos_in_buf)) { /* BB fixme - check if we should flag this error */ - cERROR(1, ("reached end of buf searching for pos in buf" + cERROR(1, "reached end of buf searching for pos in buf" " %d index to find %lld rc %d", - pos_in_buf, index_to_find, rc)); + pos_in_buf, index_to_find, rc); } rc = 0; *ppCurrentEntry = current_entry; } else { - cFYI(1, ("index not in buffer - could not findnext into it")); + cFYI(1, "index not in buffer - could not findnext into it"); return 0; } if (pos_in_buf >= cifsFile->srch_inf.entries_in_buffer) { - cFYI(1, ("can not return entries pos_in_buf beyond last")); + cFYI(1, "can not return entries pos_in_buf beyond last"); *num_to_ret = 0; } else *num_to_ret = cifsFile->srch_inf.entries_in_buffer - pos_in_buf; @@ -656,12 +655,12 @@ static int cifs_get_name_from_search_buf(struct qstr *pqst, /* one byte length, no name conversion */ len = (unsigned int)pFindData->FileNameLength; } else { - cFYI(1, ("Unknown findfirst level %d", level)); + cFYI(1, "Unknown findfirst level %d", level); return -EINVAL; } if (len > max_len) { - cERROR(1, ("bad search response length %d past smb end", len)); + cERROR(1, "bad search response length %d past smb end", len); return -EINVAL; } @@ -754,7 +753,7 @@ static int cifs_filldir(char *pfindEntry, struct file *file, filldir_t filldir, * case already. Why should we be clobbering other errors from it? */ if (rc) { - cFYI(1, ("filldir rc = %d", rc)); + cFYI(1, "filldir rc = %d", rc); rc = -EOVERFLOW; } dput(tmp_dentry); @@ -786,7 +785,7 @@ int cifs_readdir(struct file *file, void *direntry, filldir_t filldir) case 0: if (filldir(direntry, ".", 1, file->f_pos, file->f_path.dentry->d_inode->i_ino, DT_DIR) < 0) { - cERROR(1, ("Filldir for current dir failed")); + cERROR(1, "Filldir for current dir failed"); rc = -ENOMEM; break; } @@ -794,7 +793,7 @@ int cifs_readdir(struct file *file, void *direntry, filldir_t filldir) case 1: if (filldir(direntry, "..", 2, file->f_pos, file->f_path.dentry->d_parent->d_inode->i_ino, DT_DIR) < 0) { - cERROR(1, ("Filldir for parent dir failed")); + cERROR(1, "Filldir for parent dir failed"); rc = -ENOMEM; break; } @@ -807,7 +806,7 @@ int cifs_readdir(struct file *file, void *direntry, filldir_t filldir) if (file->private_data == NULL) { rc = initiate_cifs_search(xid, file); - cFYI(1, ("initiate cifs search rc %d", rc)); + cFYI(1, "initiate cifs search rc %d", rc); if (rc) { FreeXid(xid); return rc; @@ -821,7 +820,7 @@ int cifs_readdir(struct file *file, void *direntry, filldir_t filldir) cifsFile = file->private_data; if (cifsFile->srch_inf.endOfSearch) { if (cifsFile->srch_inf.emptyDir) { - cFYI(1, ("End of search, empty dir")); + cFYI(1, "End of search, empty dir"); rc = 0; break; } @@ -833,16 +832,16 @@ int cifs_readdir(struct file *file, void *direntry, filldir_t filldir) rc = find_cifs_entry(xid, pTcon, file, ¤t_entry, &num_to_fill); if (rc) { - cFYI(1, ("fce error %d", rc)); + cFYI(1, "fce error %d", rc); goto rddir2_exit; } else if (current_entry != NULL) { - cFYI(1, ("entry %lld found", file->f_pos)); + cFYI(1, "entry %lld found", file->f_pos); } else { - cFYI(1, ("could not find entry")); + cFYI(1, "could not find entry"); goto rddir2_exit; } - cFYI(1, ("loop through %d times filling dir for net buf %p", - num_to_fill, cifsFile->srch_inf.ntwrk_buf_start)); + cFYI(1, "loop through %d times filling dir for net buf %p", + num_to_fill, cifsFile->srch_inf.ntwrk_buf_start); max_len = smbCalcSize((struct smb_hdr *) cifsFile->srch_inf.ntwrk_buf_start); end_of_smb = cifsFile->srch_inf.ntwrk_buf_start + max_len; @@ -851,8 +850,8 @@ int cifs_readdir(struct file *file, void *direntry, filldir_t filldir) for (i = 0; (i < num_to_fill) && (rc == 0); i++) { if (current_entry == NULL) { /* evaluate whether this case is an error */ - cERROR(1, ("past SMB end, num to fill %d i %d", - num_to_fill, i)); + cERROR(1, "past SMB end, num to fill %d i %d", + num_to_fill, i); break; } /* if buggy server returns . and .. late do @@ -867,8 +866,8 @@ int cifs_readdir(struct file *file, void *direntry, filldir_t filldir) file->f_pos++; if (file->f_pos == cifsFile->srch_inf.index_of_last_entry) { - cFYI(1, ("last entry in buf at pos %lld %s", - file->f_pos, tmp_buf)); + cFYI(1, "last entry in buf at pos %lld %s", + file->f_pos, tmp_buf); cifs_save_resume_key(current_entry, cifsFile); break; } else diff --git a/fs/cifs/sess.c b/fs/cifs/sess.c index 7c3fd74..7707389 100644 --- a/fs/cifs/sess.c +++ b/fs/cifs/sess.c @@ -35,9 +35,11 @@ extern void SMBNTencrypt(unsigned char *passwd, unsigned char *c8, unsigned char *p24); -/* Checks if this is the first smb session to be reconnected after - the socket has been reestablished (so we know whether to use vc 0). - Called while holding the cifs_tcp_ses_lock, so do not block */ +/* + * Checks if this is the first smb session to be reconnected after + * the socket has been reestablished (so we know whether to use vc 0). + * Called while holding the cifs_tcp_ses_lock, so do not block + */ static bool is_first_ses_reconnect(struct cifsSesInfo *ses) { struct list_head *tmp; @@ -284,7 +286,7 @@ decode_unicode_ssetup(char **pbcc_area, int bleft, struct cifsSesInfo *ses, int len; char *data = *pbcc_area; - cFYI(1, ("bleft %d", bleft)); + cFYI(1, "bleft %d", bleft); /* * Windows servers do not always double null terminate their final @@ -301,7 +303,7 @@ decode_unicode_ssetup(char **pbcc_area, int bleft, struct cifsSesInfo *ses, kfree(ses->serverOS); ses->serverOS = cifs_strndup_from_ucs(data, bleft, true, nls_cp); - cFYI(1, ("serverOS=%s", ses->serverOS)); + cFYI(1, "serverOS=%s", ses->serverOS); len = (UniStrnlen((wchar_t *) data, bleft / 2) * 2) + 2; data += len; bleft -= len; @@ -310,7 +312,7 @@ decode_unicode_ssetup(char **pbcc_area, int bleft, struct cifsSesInfo *ses, kfree(ses->serverNOS); ses->serverNOS = cifs_strndup_from_ucs(data, bleft, true, nls_cp); - cFYI(1, ("serverNOS=%s", ses->serverNOS)); + cFYI(1, "serverNOS=%s", ses->serverNOS); len = (UniStrnlen((wchar_t *) data, bleft / 2) * 2) + 2; data += len; bleft -= len; @@ -319,7 +321,7 @@ decode_unicode_ssetup(char **pbcc_area, int bleft, struct cifsSesInfo *ses, kfree(ses->serverDomain); ses->serverDomain = cifs_strndup_from_ucs(data, bleft, true, nls_cp); - cFYI(1, ("serverDomain=%s", ses->serverDomain)); + cFYI(1, "serverDomain=%s", ses->serverDomain); return; } @@ -332,7 +334,7 @@ static int decode_ascii_ssetup(char **pbcc_area, int bleft, int len; char *bcc_ptr = *pbcc_area; - cFYI(1, ("decode sessetup ascii. bleft %d", bleft)); + cFYI(1, "decode sessetup ascii. bleft %d", bleft); len = strnlen(bcc_ptr, bleft); if (len >= bleft) @@ -344,7 +346,7 @@ static int decode_ascii_ssetup(char **pbcc_area, int bleft, if (ses->serverOS) strncpy(ses->serverOS, bcc_ptr, len); if (strncmp(ses->serverOS, "OS/2", 4) == 0) { - cFYI(1, ("OS/2 server")); + cFYI(1, "OS/2 server"); ses->flags |= CIFS_SES_OS2; } @@ -373,7 +375,7 @@ static int decode_ascii_ssetup(char **pbcc_area, int bleft, /* BB For newer servers which do not support Unicode, but thus do return domain here we could add parsing for it later, but it is not very important */ - cFYI(1, ("ascii: bytes left %d", bleft)); + cFYI(1, "ascii: bytes left %d", bleft); return rc; } @@ -384,16 +386,16 @@ static int decode_ntlmssp_challenge(char *bcc_ptr, int blob_len, CHALLENGE_MESSAGE *pblob = (CHALLENGE_MESSAGE *)bcc_ptr; if (blob_len < sizeof(CHALLENGE_MESSAGE)) { - cERROR(1, ("challenge blob len %d too small", blob_len)); + cERROR(1, "challenge blob len %d too small", blob_len); return -EINVAL; } if (memcmp(pblob->Signature, "NTLMSSP", 8)) { - cERROR(1, ("blob signature incorrect %s", pblob->Signature)); + cERROR(1, "blob signature incorrect %s", pblob->Signature); return -EINVAL; } if (pblob->MessageType != NtLmChallenge) { - cERROR(1, ("Incorrect message type %d", pblob->MessageType)); + cERROR(1, "Incorrect message type %d", pblob->MessageType); return -EINVAL; } @@ -447,7 +449,7 @@ static void build_ntlmssp_negotiate_blob(unsigned char *pbuffer, This function returns the length of the data in the blob */ static int build_ntlmssp_auth_blob(unsigned char *pbuffer, struct cifsSesInfo *ses, - const struct nls_table *nls_cp, int first) + const struct nls_table *nls_cp, bool first) { AUTHENTICATE_MESSAGE *sec_blob = (AUTHENTICATE_MESSAGE *)pbuffer; __u32 flags; @@ -546,7 +548,7 @@ static void setup_ntlmssp_neg_req(SESSION_SETUP_ANDX *pSMB, static int setup_ntlmssp_auth_req(SESSION_SETUP_ANDX *pSMB, struct cifsSesInfo *ses, - const struct nls_table *nls, int first_time) + const struct nls_table *nls, bool first_time) { int bloblen; @@ -559,8 +561,8 @@ static int setup_ntlmssp_auth_req(SESSION_SETUP_ANDX *pSMB, #endif int -CIFS_SessSetup(unsigned int xid, struct cifsSesInfo *ses, int first_time, - const struct nls_table *nls_cp) +CIFS_SessSetup(unsigned int xid, struct cifsSesInfo *ses, + const struct nls_table *nls_cp) { int rc = 0; int wct; @@ -577,13 +579,18 @@ CIFS_SessSetup(unsigned int xid, struct cifsSesInfo *ses, int first_time, int bytes_remaining; struct key *spnego_key = NULL; __le32 phase = NtLmNegotiate; /* NTLMSSP, if needed, is multistage */ + bool first_time; if (ses == NULL) return -EINVAL; + read_lock(&cifs_tcp_ses_lock); + first_time = is_first_ses_reconnect(ses); + read_unlock(&cifs_tcp_ses_lock); + type = ses->server->secType; - cFYI(1, ("sess setup type %d", type)); + cFYI(1, "sess setup type %d", type); ssetup_ntlmssp_authenticate: if (phase == NtLmChallenge) phase = NtLmAuthenticate; /* if ntlmssp, now final phase */ @@ -664,7 +671,7 @@ ssetup_ntlmssp_authenticate: changed to do higher than lanman dialect and we reconnected would we ever calc signing_key? */ - cFYI(1, ("Negotiating LANMAN setting up strings")); + cFYI(1, "Negotiating LANMAN setting up strings"); /* Unicode not allowed for LANMAN dialects */ ascii_ssetup_strings(&bcc_ptr, ses, nls_cp); #endif @@ -744,7 +751,7 @@ ssetup_ntlmssp_authenticate: unicode_ssetup_strings(&bcc_ptr, ses, nls_cp); } else ascii_ssetup_strings(&bcc_ptr, ses, nls_cp); - } else if (type == Kerberos || type == MSKerberos) { + } else if (type == Kerberos) { #ifdef CONFIG_CIFS_UPCALL struct cifs_spnego_msg *msg; spnego_key = cifs_get_spnego_key(ses); @@ -758,17 +765,17 @@ ssetup_ntlmssp_authenticate: /* check version field to make sure that cifs.upcall is sending us a response in an expected form */ if (msg->version != CIFS_SPNEGO_UPCALL_VERSION) { - cERROR(1, ("incorrect version of cifs.upcall (expected" + cERROR(1, "incorrect version of cifs.upcall (expected" " %d but got %d)", - CIFS_SPNEGO_UPCALL_VERSION, msg->version)); + CIFS_SPNEGO_UPCALL_VERSION, msg->version); rc = -EKEYREJECTED; goto ssetup_exit; } /* bail out if key is too long */ if (msg->sesskey_len > sizeof(ses->server->mac_signing_key.data.krb5)) { - cERROR(1, ("Kerberos signing key too long (%u bytes)", - msg->sesskey_len)); + cERROR(1, "Kerberos signing key too long (%u bytes)", + msg->sesskey_len); rc = -EOVERFLOW; goto ssetup_exit; } @@ -796,7 +803,7 @@ ssetup_ntlmssp_authenticate: /* BB: is this right? */ ascii_ssetup_strings(&bcc_ptr, ses, nls_cp); #else /* ! CONFIG_CIFS_UPCALL */ - cERROR(1, ("Kerberos negotiated but upcall support disabled!")); + cERROR(1, "Kerberos negotiated but upcall support disabled!"); rc = -ENOSYS; goto ssetup_exit; #endif /* CONFIG_CIFS_UPCALL */ @@ -804,12 +811,12 @@ ssetup_ntlmssp_authenticate: #ifdef CONFIG_CIFS_EXPERIMENTAL if (type == RawNTLMSSP) { if ((pSMB->req.hdr.Flags2 & SMBFLG2_UNICODE) == 0) { - cERROR(1, ("NTLMSSP requires Unicode support")); + cERROR(1, "NTLMSSP requires Unicode support"); rc = -ENOSYS; goto ssetup_exit; } - cFYI(1, ("ntlmssp session setup phase %d", phase)); + cFYI(1, "ntlmssp session setup phase %d", phase); pSMB->req.hdr.Flags2 |= SMBFLG2_EXT_SEC; capabilities |= CAP_EXTENDED_SECURITY; pSMB->req.Capabilities |= cpu_to_le32(capabilities); @@ -827,7 +834,7 @@ ssetup_ntlmssp_authenticate: on the response (challenge) */ smb_buf->Uid = ses->Suid; } else { - cERROR(1, ("invalid phase %d", phase)); + cERROR(1, "invalid phase %d", phase); rc = -ENOSYS; goto ssetup_exit; } @@ -839,12 +846,12 @@ ssetup_ntlmssp_authenticate: } unicode_oslm_strings(&bcc_ptr, nls_cp); } else { - cERROR(1, ("secType %d not supported!", type)); + cERROR(1, "secType %d not supported!", type); rc = -ENOSYS; goto ssetup_exit; } #else - cERROR(1, ("secType %d not supported!", type)); + cERROR(1, "secType %d not supported!", type); rc = -ENOSYS; goto ssetup_exit; #endif @@ -862,7 +869,7 @@ ssetup_ntlmssp_authenticate: CIFS_STD_OP /* not long */ | CIFS_LOG_ERROR); /* SMB request buf freed in SendReceive2 */ - cFYI(1, ("ssetup rc from sendrecv2 is %d", rc)); + cFYI(1, "ssetup rc from sendrecv2 is %d", rc); pSMB = (SESSION_SETUP_ANDX *)iov[0].iov_base; smb_buf = (struct smb_hdr *)iov[0].iov_base; @@ -870,7 +877,7 @@ ssetup_ntlmssp_authenticate: if ((type == RawNTLMSSP) && (smb_buf->Status.CifsError == cpu_to_le32(NT_STATUS_MORE_PROCESSING_REQUIRED))) { if (phase != NtLmNegotiate) { - cERROR(1, ("Unexpected more processing error")); + cERROR(1, "Unexpected more processing error"); goto ssetup_exit; } /* NTLMSSP Negotiate sent now processing challenge (response) */ @@ -882,14 +889,14 @@ ssetup_ntlmssp_authenticate: if ((smb_buf->WordCount != 3) && (smb_buf->WordCount != 4)) { rc = -EIO; - cERROR(1, ("bad word count %d", smb_buf->WordCount)); + cERROR(1, "bad word count %d", smb_buf->WordCount); goto ssetup_exit; } action = le16_to_cpu(pSMB->resp.Action); if (action & GUEST_LOGIN) - cFYI(1, ("Guest login")); /* BB mark SesInfo struct? */ + cFYI(1, "Guest login"); /* BB mark SesInfo struct? */ ses->Suid = smb_buf->Uid; /* UID left in wire format (le) */ - cFYI(1, ("UID = %d ", ses->Suid)); + cFYI(1, "UID = %d ", ses->Suid); /* response can have either 3 or 4 word count - Samba sends 3 */ /* and lanman response is 3 */ bytes_remaining = BCC(smb_buf); @@ -899,7 +906,7 @@ ssetup_ntlmssp_authenticate: __u16 blob_len; blob_len = le16_to_cpu(pSMB->resp.SecurityBlobLength); if (blob_len > bytes_remaining) { - cERROR(1, ("bad security blob length %d", blob_len)); + cERROR(1, "bad security blob length %d", blob_len); rc = -EINVAL; goto ssetup_exit; } @@ -933,7 +940,7 @@ ssetup_exit: } kfree(str_area); if (resp_buf_type == CIFS_SMALL_BUFFER) { - cFYI(1, ("ssetup freeing small buf %p", iov[0].iov_base)); + cFYI(1, "ssetup freeing small buf %p", iov[0].iov_base); cifs_small_buf_release(iov[0].iov_base); } else if (resp_buf_type == CIFS_LARGE_BUFFER) cifs_buf_release(iov[0].iov_base); diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c index ad081fe..82f78c4 100644 --- a/fs/cifs/transport.c +++ b/fs/cifs/transport.c @@ -35,7 +35,6 @@ #include "cifs_debug.h" extern mempool_t *cifs_mid_poolp; -extern struct kmem_cache *cifs_oplock_cachep; static struct mid_q_entry * AllocMidQEntry(const struct smb_hdr *smb_buffer, struct TCP_Server_Info *server) @@ -43,7 +42,7 @@ AllocMidQEntry(const struct smb_hdr *smb_buffer, struct TCP_Server_Info *server) struct mid_q_entry *temp; if (server == NULL) { - cERROR(1, ("Null TCP session in AllocMidQEntry")); + cERROR(1, "Null TCP session in AllocMidQEntry"); return NULL; } @@ -55,7 +54,7 @@ AllocMidQEntry(const struct smb_hdr *smb_buffer, struct TCP_Server_Info *server) temp->mid = smb_buffer->Mid; /* always LE */ temp->pid = current->pid; temp->command = smb_buffer->Command; - cFYI(1, ("For smb_command %d", temp->command)); + cFYI(1, "For smb_command %d", temp->command); /* do_gettimeofday(&temp->when_sent);*/ /* easier to use jiffies */ /* when mid allocated can be before when sent */ temp->when_alloc = jiffies; @@ -140,7 +139,7 @@ smb_sendv(struct TCP_Server_Info *server, struct kvec *iov, int n_vec) total_len += iov[i].iov_len; smb_buffer->smb_buf_length = cpu_to_be32(smb_buffer->smb_buf_length); - cFYI(1, ("Sending smb: total_len %d", total_len)); + cFYI(1, "Sending smb: total_len %d", total_len); dump_smb(smb_buffer, len); i = 0; @@ -168,9 +167,8 @@ smb_sendv(struct TCP_Server_Info *server, struct kvec *iov, int n_vec) reconnect which may clear the network problem. */ if ((i >= 14) || (!server->noblocksnd && (i > 2))) { - cERROR(1, - ("sends on sock %p stuck for 15 seconds", - ssocket)); + cERROR(1, "sends on sock %p stuck for 15 seconds", + ssocket); rc = -EAGAIN; break; } @@ -184,13 +182,13 @@ smb_sendv(struct TCP_Server_Info *server, struct kvec *iov, int n_vec) total_len = 0; break; } else if (rc > total_len) { - cERROR(1, ("sent %d requested %d", rc, total_len)); + cERROR(1, "sent %d requested %d", rc, total_len); break; } if (rc == 0) { /* should never happen, letting socket clear before retrying is our only obvious option here */ - cERROR(1, ("tcp sent no data")); + cERROR(1, "tcp sent no data"); msleep(500); continue; } @@ -213,8 +211,8 @@ smb_sendv(struct TCP_Server_Info *server, struct kvec *iov, int n_vec) } if ((total_len > 0) && (total_len != smb_buf_length + 4)) { - cFYI(1, ("partial send (%d remaining), terminating session", - total_len)); + cFYI(1, "partial send (%d remaining), terminating session", + total_len); /* If we have only sent part of an SMB then the next SMB could be taken as the remainder of this one. We need to kill the socket so the server throws away the partial @@ -223,7 +221,7 @@ smb_sendv(struct TCP_Server_Info *server, struct kvec *iov, int n_vec) } if (rc < 0) { - cERROR(1, ("Error %d sending data on socket to server", rc)); + cERROR(1, "Error %d sending data on socket to server", rc); } else rc = 0; @@ -296,7 +294,7 @@ static int allocate_mid(struct cifsSesInfo *ses, struct smb_hdr *in_buf, } if (ses->server->tcpStatus == CifsNeedReconnect) { - cFYI(1, ("tcp session dead - return to caller to retry")); + cFYI(1, "tcp session dead - return to caller to retry"); return -EAGAIN; } @@ -348,7 +346,7 @@ static int wait_for_response(struct cifsSesInfo *ses, lrt += time_to_wait; if (time_after(jiffies, lrt)) { /* No replies for time_to_wait. */ - cERROR(1, ("server not responding")); + cERROR(1, "server not responding"); return -1; } } else { @@ -379,7 +377,7 @@ SendReceiveNoRsp(const unsigned int xid, struct cifsSesInfo *ses, iov[0].iov_len = in_buf->smb_buf_length + 4; flags |= CIFS_NO_RESP; rc = SendReceive2(xid, ses, iov, 1, &resp_buf_type, flags); - cFYI(DBG2, ("SendRcvNoRsp flags %d rc %d", flags, rc)); + cFYI(DBG2, "SendRcvNoRsp flags %d rc %d", flags, rc); return rc; } @@ -402,7 +400,7 @@ SendReceive2(const unsigned int xid, struct cifsSesInfo *ses, if ((ses == NULL) || (ses->server == NULL)) { cifs_small_buf_release(in_buf); - cERROR(1, ("Null session")); + cERROR(1, "Null session"); return -EIO; } @@ -471,7 +469,7 @@ SendReceive2(const unsigned int xid, struct cifsSesInfo *ses, else if (long_op == CIFS_BLOCKING_OP) timeout = 0x7FFFFFFF; /* large, but not so large as to wrap */ else { - cERROR(1, ("unknown timeout flag %d", long_op)); + cERROR(1, "unknown timeout flag %d", long_op); rc = -EIO; goto out; } @@ -490,8 +488,8 @@ SendReceive2(const unsigned int xid, struct cifsSesInfo *ses, spin_lock(&GlobalMid_Lock); if (midQ->resp_buf == NULL) { - cERROR(1, ("No response to cmd %d mid %d", - midQ->command, midQ->mid)); + cERROR(1, "No response to cmd %d mid %d", + midQ->command, midQ->mid); if (midQ->midState == MID_REQUEST_SUBMITTED) { if (ses->server->tcpStatus == CifsExiting) rc = -EHOSTDOWN; @@ -504,7 +502,7 @@ SendReceive2(const unsigned int xid, struct cifsSesInfo *ses, if (rc != -EHOSTDOWN) { if (midQ->midState == MID_RETRY_NEEDED) { rc = -EAGAIN; - cFYI(1, ("marking request for retry")); + cFYI(1, "marking request for retry"); } else { rc = -EIO; } @@ -521,8 +519,8 @@ SendReceive2(const unsigned int xid, struct cifsSesInfo *ses, receive_len = midQ->resp_buf->smb_buf_length; if (receive_len > CIFSMaxBufSize + MAX_CIFS_HDR_SIZE) { - cERROR(1, ("Frame too large received. Length: %d Xid: %d", - receive_len, xid)); + cERROR(1, "Frame too large received. Length: %d Xid: %d", + receive_len, xid); rc = -EIO; goto out; } @@ -548,7 +546,7 @@ SendReceive2(const unsigned int xid, struct cifsSesInfo *ses, &ses->server->mac_signing_key, midQ->sequence_number+1); if (rc) { - cERROR(1, ("Unexpected SMB signature")); + cERROR(1, "Unexpected SMB signature"); /* BB FIXME add code to kill session */ } } @@ -569,7 +567,7 @@ SendReceive2(const unsigned int xid, struct cifsSesInfo *ses, DeleteMidQEntry */ } else { rc = -EIO; - cFYI(1, ("Bad MID state?")); + cFYI(1, "Bad MID state?"); } out: @@ -591,11 +589,11 @@ SendReceive(const unsigned int xid, struct cifsSesInfo *ses, struct mid_q_entry *midQ; if (ses == NULL) { - cERROR(1, ("Null smb session")); + cERROR(1, "Null smb session"); return -EIO; } if (ses->server == NULL) { - cERROR(1, ("Null tcp session")); + cERROR(1, "Null tcp session"); return -EIO; } @@ -607,8 +605,8 @@ SendReceive(const unsigned int xid, struct cifsSesInfo *ses, use ses->maxReq */ if (in_buf->smb_buf_length > CIFSMaxBufSize + MAX_CIFS_HDR_SIZE - 4) { - cERROR(1, ("Illegal length, greater than maximum frame, %d", - in_buf->smb_buf_length)); + cERROR(1, "Illegal length, greater than maximum frame, %d", + in_buf->smb_buf_length); return -EIO; } @@ -665,7 +663,7 @@ SendReceive(const unsigned int xid, struct cifsSesInfo *ses, else if (long_op == CIFS_BLOCKING_OP) timeout = 0x7FFFFFFF; /* large but no so large as to wrap */ else { - cERROR(1, ("unknown timeout flag %d", long_op)); + cERROR(1, "unknown timeout flag %d", long_op); rc = -EIO; goto out; } @@ -681,8 +679,8 @@ SendReceive(const unsigned int xid, struct cifsSesInfo *ses, spin_lock(&GlobalMid_Lock); if (midQ->resp_buf == NULL) { - cERROR(1, ("No response for cmd %d mid %d", - midQ->command, midQ->mid)); + cERROR(1, "No response for cmd %d mid %d", + midQ->command, midQ->mid); if (midQ->midState == MID_REQUEST_SUBMITTED) { if (ses->server->tcpStatus == CifsExiting) rc = -EHOSTDOWN; @@ -695,7 +693,7 @@ SendReceive(const unsigned int xid, struct cifsSesInfo *ses, if (rc != -EHOSTDOWN) { if (midQ->midState == MID_RETRY_NEEDED) { rc = -EAGAIN; - cFYI(1, ("marking request for retry")); + cFYI(1, "marking request for retry"); } else { rc = -EIO; } @@ -712,8 +710,8 @@ SendReceive(const unsigned int xid, struct cifsSesInfo *ses, receive_len = midQ->resp_buf->smb_buf_length; if (receive_len > CIFSMaxBufSize + MAX_CIFS_HDR_SIZE) { - cERROR(1, ("Frame too large received. Length: %d Xid: %d", - receive_len, xid)); + cERROR(1, "Frame too large received. Length: %d Xid: %d", + receive_len, xid); rc = -EIO; goto out; } @@ -736,7 +734,7 @@ SendReceive(const unsigned int xid, struct cifsSesInfo *ses, &ses->server->mac_signing_key, midQ->sequence_number+1); if (rc) { - cERROR(1, ("Unexpected SMB signature")); + cERROR(1, "Unexpected SMB signature"); /* BB FIXME add code to kill session */ } } @@ -753,7 +751,7 @@ SendReceive(const unsigned int xid, struct cifsSesInfo *ses, BCC(out_buf) = le16_to_cpu(BCC_LE(out_buf)); } else { rc = -EIO; - cERROR(1, ("Bad MID state?")); + cERROR(1, "Bad MID state?"); } out: @@ -824,13 +822,13 @@ SendReceiveBlockingLock(const unsigned int xid, struct cifsTconInfo *tcon, struct cifsSesInfo *ses; if (tcon == NULL || tcon->ses == NULL) { - cERROR(1, ("Null smb session")); + cERROR(1, "Null smb session"); return -EIO; } ses = tcon->ses; if (ses->server == NULL) { - cERROR(1, ("Null tcp session")); + cERROR(1, "Null tcp session"); return -EIO; } @@ -842,8 +840,8 @@ SendReceiveBlockingLock(const unsigned int xid, struct cifsTconInfo *tcon, use ses->maxReq */ if (in_buf->smb_buf_length > CIFSMaxBufSize + MAX_CIFS_HDR_SIZE - 4) { - cERROR(1, ("Illegal length, greater than maximum frame, %d", - in_buf->smb_buf_length)); + cERROR(1, "Illegal length, greater than maximum frame, %d", + in_buf->smb_buf_length); return -EIO; } @@ -933,8 +931,8 @@ SendReceiveBlockingLock(const unsigned int xid, struct cifsTconInfo *tcon, spin_unlock(&GlobalMid_Lock); receive_len = midQ->resp_buf->smb_buf_length; } else { - cERROR(1, ("No response for cmd %d mid %d", - midQ->command, midQ->mid)); + cERROR(1, "No response for cmd %d mid %d", + midQ->command, midQ->mid); if (midQ->midState == MID_REQUEST_SUBMITTED) { if (ses->server->tcpStatus == CifsExiting) rc = -EHOSTDOWN; @@ -947,7 +945,7 @@ SendReceiveBlockingLock(const unsigned int xid, struct cifsTconInfo *tcon, if (rc != -EHOSTDOWN) { if (midQ->midState == MID_RETRY_NEEDED) { rc = -EAGAIN; - cFYI(1, ("marking request for retry")); + cFYI(1, "marking request for retry"); } else { rc = -EIO; } @@ -958,8 +956,8 @@ SendReceiveBlockingLock(const unsigned int xid, struct cifsTconInfo *tcon, } if (receive_len > CIFSMaxBufSize + MAX_CIFS_HDR_SIZE) { - cERROR(1, ("Frame too large received. Length: %d Xid: %d", - receive_len, xid)); + cERROR(1, "Frame too large received. Length: %d Xid: %d", + receive_len, xid); rc = -EIO; goto out; } @@ -968,7 +966,7 @@ SendReceiveBlockingLock(const unsigned int xid, struct cifsTconInfo *tcon, if ((out_buf == NULL) || (midQ->midState != MID_RESPONSE_RECEIVED)) { rc = -EIO; - cERROR(1, ("Bad MID state?")); + cERROR(1, "Bad MID state?"); goto out; } @@ -986,7 +984,7 @@ SendReceiveBlockingLock(const unsigned int xid, struct cifsTconInfo *tcon, &ses->server->mac_signing_key, midQ->sequence_number+1); if (rc) { - cERROR(1, ("Unexpected SMB signature")); + cERROR(1, "Unexpected SMB signature"); /* BB FIXME add code to kill session */ } } diff --git a/fs/cifs/xattr.c b/fs/cifs/xattr.c index f555ce0..a150920 100644 --- a/fs/cifs/xattr.c +++ b/fs/cifs/xattr.c @@ -70,12 +70,12 @@ int cifs_removexattr(struct dentry *direntry, const char *ea_name) return rc; } if (ea_name == NULL) { - cFYI(1, ("Null xattr names not supported")); + cFYI(1, "Null xattr names not supported"); } else if (strncmp(ea_name, CIFS_XATTR_USER_PREFIX, 5) && (strncmp(ea_name, CIFS_XATTR_OS2_PREFIX, 4))) { cFYI(1, - ("illegal xattr request %s (only user namespace supported)", - ea_name)); + "illegal xattr request %s (only user namespace supported)", + ea_name); /* BB what if no namespace prefix? */ /* Should we just pass them to server, except for system and perhaps security prefixes? */ @@ -131,19 +131,19 @@ int cifs_setxattr(struct dentry *direntry, const char *ea_name, search server for EAs or streams to returns as xattrs */ if (value_size > MAX_EA_VALUE_SIZE) { - cFYI(1, ("size of EA value too large")); + cFYI(1, "size of EA value too large"); kfree(full_path); FreeXid(xid); return -EOPNOTSUPP; } if (ea_name == NULL) { - cFYI(1, ("Null xattr names not supported")); + cFYI(1, "Null xattr names not supported"); } else if (strncmp(ea_name, CIFS_XATTR_USER_PREFIX, 5) == 0) { if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_NO_XATTR) goto set_ea_exit; if (strncmp(ea_name, CIFS_XATTR_DOS_ATTRIB, 14) == 0) - cFYI(1, ("attempt to set cifs inode metadata")); + cFYI(1, "attempt to set cifs inode metadata"); ea_name += 5; /* skip past user. prefix */ rc = CIFSSMBSetEA(xid, pTcon, full_path, ea_name, ea_value, @@ -169,9 +169,9 @@ int cifs_setxattr(struct dentry *direntry, const char *ea_name, ACL_TYPE_ACCESS, cifs_sb->local_nls, cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR); - cFYI(1, ("set POSIX ACL rc %d", rc)); + cFYI(1, "set POSIX ACL rc %d", rc); #else - cFYI(1, ("set POSIX ACL not supported")); + cFYI(1, "set POSIX ACL not supported"); #endif } else if (strncmp(ea_name, POSIX_ACL_XATTR_DEFAULT, strlen(POSIX_ACL_XATTR_DEFAULT)) == 0) { @@ -182,13 +182,13 @@ int cifs_setxattr(struct dentry *direntry, const char *ea_name, ACL_TYPE_DEFAULT, cifs_sb->local_nls, cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR); - cFYI(1, ("set POSIX default ACL rc %d", rc)); + cFYI(1, "set POSIX default ACL rc %d", rc); #else - cFYI(1, ("set default POSIX ACL not supported")); + cFYI(1, "set default POSIX ACL not supported"); #endif } else { - cFYI(1, ("illegal xattr request %s (only user namespace" - " supported)", ea_name)); + cFYI(1, "illegal xattr request %s (only user namespace" + " supported)", ea_name); /* BB what if no namespace prefix? */ /* Should we just pass them to server, except for system and perhaps security prefixes? */ @@ -235,13 +235,13 @@ ssize_t cifs_getxattr(struct dentry *direntry, const char *ea_name, /* return dos attributes as pseudo xattr */ /* return alt name if available as pseudo attr */ if (ea_name == NULL) { - cFYI(1, ("Null xattr names not supported")); + cFYI(1, "Null xattr names not supported"); } else if (strncmp(ea_name, CIFS_XATTR_USER_PREFIX, 5) == 0) { if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_NO_XATTR) goto get_ea_exit; if (strncmp(ea_name, CIFS_XATTR_DOS_ATTRIB, 14) == 0) { - cFYI(1, ("attempt to query cifs inode metadata")); + cFYI(1, "attempt to query cifs inode metadata"); /* revalidate/getattr then populate from inode */ } /* BB add else when above is implemented */ ea_name += 5; /* skip past user. prefix */ @@ -287,7 +287,7 @@ ssize_t cifs_getxattr(struct dentry *direntry, const char *ea_name, } #endif /* EXPERIMENTAL */ #else - cFYI(1, ("query POSIX ACL not supported yet")); + cFYI(1, "query POSIX ACL not supported yet"); #endif /* CONFIG_CIFS_POSIX */ } else if (strncmp(ea_name, POSIX_ACL_XATTR_DEFAULT, strlen(POSIX_ACL_XATTR_DEFAULT)) == 0) { @@ -299,18 +299,18 @@ ssize_t cifs_getxattr(struct dentry *direntry, const char *ea_name, cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR); #else - cFYI(1, ("query POSIX default ACL not supported yet")); + cFYI(1, "query POSIX default ACL not supported yet"); #endif } else if (strncmp(ea_name, CIFS_XATTR_TRUSTED_PREFIX, XATTR_TRUSTED_PREFIX_LEN) == 0) { - cFYI(1, ("Trusted xattr namespace not supported yet")); + cFYI(1, "Trusted xattr namespace not supported yet"); } else if (strncmp(ea_name, CIFS_XATTR_SECURITY_PREFIX, XATTR_SECURITY_PREFIX_LEN) == 0) { - cFYI(1, ("Security xattr namespace not supported yet")); + cFYI(1, "Security xattr namespace not supported yet"); } else cFYI(1, - ("illegal xattr request %s (only user namespace supported)", - ea_name)); + "illegal xattr request %s (only user namespace supported)", + ea_name); /* We could add an additional check for streams ie if proc/fs/cifs/streamstoxattr is set then diff --git a/fs/compat.c b/fs/compat.c index 4b6ed03..0544873 100644 --- a/fs/compat.c +++ b/fs/compat.c @@ -1531,8 +1531,6 @@ int compat_do_execve(char * filename, if (retval < 0) goto out; - current->stack_start = current->mm->start_stack; - /* execve succeeded */ current->fs->in_exec = 0; current->in_execve = 0; diff --git a/fs/configfs/dir.c b/fs/configfs/dir.c index 8e48b52..0b502f8 100644 --- a/fs/configfs/dir.c +++ b/fs/configfs/dir.c @@ -645,6 +645,7 @@ static void detach_groups(struct config_group *group) configfs_detach_group(sd->s_element); child->d_inode->i_flags |= S_DEAD; + dont_mount(child); mutex_unlock(&child->d_inode->i_mutex); @@ -840,6 +841,7 @@ static int configfs_attach_item(struct config_item *parent_item, mutex_lock(&dentry->d_inode->i_mutex); configfs_remove_dir(item); dentry->d_inode->i_flags |= S_DEAD; + dont_mount(dentry); mutex_unlock(&dentry->d_inode->i_mutex); d_delete(dentry); } @@ -882,6 +884,7 @@ static int configfs_attach_group(struct config_item *parent_item, if (ret) { configfs_detach_item(item); dentry->d_inode->i_flags |= S_DEAD; + dont_mount(dentry); } configfs_adjust_dir_dirent_depth_after_populate(sd); mutex_unlock(&dentry->d_inode->i_mutex); @@ -1725,6 +1728,7 @@ void configfs_unregister_subsystem(struct configfs_subsystem *subsys) mutex_unlock(&configfs_symlink_mutex); configfs_detach_group(&group->cg_item); dentry->d_inode->i_flags |= S_DEAD; + dont_mount(dentry); mutex_unlock(&dentry->d_inode->i_mutex); d_delete(dentry); diff --git a/fs/eventpoll.c b/fs/eventpoll.c index bd056a5..3817149 100644 --- a/fs/eventpoll.c +++ b/fs/eventpoll.c @@ -1140,8 +1140,7 @@ retry: * ep_poll_callback() when events will become available. */ init_waitqueue_entry(&wait, current); - wait.flags |= WQ_FLAG_EXCLUSIVE; - __add_wait_queue(&ep->wq, &wait); + __add_wait_queue_exclusive(&ep->wq, &wait); for (;;) { /* @@ -1387,8 +1387,6 @@ int do_execve(char * filename, if (retval < 0) goto out; - current->stack_start = current->mm->start_stack; - /* execve succeeded */ current->fs->in_exec = 0; current->in_execve = 0; diff --git a/fs/jfs/super.c b/fs/jfs/super.c index 157382f..b66832a 100644 --- a/fs/jfs/super.c +++ b/fs/jfs/super.c @@ -446,10 +446,8 @@ static int jfs_fill_super(struct super_block *sb, void *data, int silent) /* initialize the mount flag and determine the default error handler */ flag = JFS_ERR_REMOUNT_RO; - if (!parse_options((char *) data, sb, &newLVSize, &flag)) { - kfree(sbi); - return -EINVAL; - } + if (!parse_options((char *) data, sb, &newLVSize, &flag)) + goto out_kfree; sbi->flag = flag; #ifdef CONFIG_JFS_POSIX_ACL @@ -458,7 +456,7 @@ static int jfs_fill_super(struct super_block *sb, void *data, int silent) if (newLVSize) { printk(KERN_ERR "resize option for remount only\n"); - return -EINVAL; + goto out_kfree; } /* @@ -478,7 +476,7 @@ static int jfs_fill_super(struct super_block *sb, void *data, int silent) inode = new_inode(sb); if (inode == NULL) { ret = -ENOMEM; - goto out_kfree; + goto out_unload; } inode->i_ino = 0; inode->i_nlink = 1; @@ -550,9 +548,10 @@ out_mount_failed: make_bad_inode(sbi->direct_inode); iput(sbi->direct_inode); sbi->direct_inode = NULL; -out_kfree: +out_unload: if (sbi->nls_tab) unload_nls(sbi->nls_tab); +out_kfree: kfree(sbi); return ret; } diff --git a/fs/logfs/dev_bdev.c b/fs/logfs/dev_bdev.c index 243c000..9bd2ce2 100644 --- a/fs/logfs/dev_bdev.c +++ b/fs/logfs/dev_bdev.c @@ -303,6 +303,11 @@ static void bdev_put_device(struct super_block *sb) close_bdev_exclusive(logfs_super(sb)->s_bdev, FMODE_READ|FMODE_WRITE); } +static int bdev_can_write_buf(struct super_block *sb, u64 ofs) +{ + return 0; +} + static const struct logfs_device_ops bd_devops = { .find_first_sb = bdev_find_first_sb, .find_last_sb = bdev_find_last_sb, @@ -310,6 +315,7 @@ static const struct logfs_device_ops bd_devops = { .readpage = bdev_readpage, .writeseg = bdev_writeseg, .erase = bdev_erase, + .can_write_buf = bdev_can_write_buf, .sync = bdev_sync, .put_device = bdev_put_device, }; diff --git a/fs/logfs/dev_mtd.c b/fs/logfs/dev_mtd.c index cafb6ef..a85d47d 100644 --- a/fs/logfs/dev_mtd.c +++ b/fs/logfs/dev_mtd.c @@ -9,6 +9,7 @@ #include <linux/completion.h> #include <linux/mount.h> #include <linux/sched.h> +#include <linux/slab.h> #define PAGE_OFS(ofs) ((ofs) & (PAGE_SIZE-1)) @@ -126,7 +127,8 @@ static int mtd_readpage(void *_sb, struct page *page) err = mtd_read(sb, page->index << PAGE_SHIFT, PAGE_SIZE, page_address(page)); - if (err == -EUCLEAN) { + if (err == -EUCLEAN || err == -EBADMSG) { + /* -EBADMSG happens regularly on power failures */ err = 0; /* FIXME: force GC this segment */ } @@ -233,12 +235,32 @@ static void mtd_put_device(struct super_block *sb) put_mtd_device(logfs_super(sb)->s_mtd); } +static int mtd_can_write_buf(struct super_block *sb, u64 ofs) +{ + struct logfs_super *super = logfs_super(sb); + void *buf; + int err; + + buf = kmalloc(super->s_writesize, GFP_KERNEL); + if (!buf) + return -ENOMEM; + err = mtd_read(sb, ofs, super->s_writesize, buf); + if (err) + goto out; + if (memchr_inv(buf, 0xff, super->s_writesize)) + err = -EIO; + kfree(buf); +out: + return err; +} + static const struct logfs_device_ops mtd_devops = { .find_first_sb = mtd_find_first_sb, .find_last_sb = mtd_find_last_sb, .readpage = mtd_readpage, .writeseg = mtd_writeseg, .erase = mtd_erase, + .can_write_buf = mtd_can_write_buf, .sync = mtd_sync, .put_device = mtd_put_device, }; @@ -250,5 +272,7 @@ int logfs_get_sb_mtd(struct file_system_type *type, int flags, const struct logfs_device_ops *devops = &mtd_devops; mtd = get_mtd_device(NULL, mtdnr); + if (IS_ERR(mtd)) + return PTR_ERR(mtd); return logfs_get_sb_device(type, flags, mtd, NULL, devops, mnt); } diff --git a/fs/logfs/file.c b/fs/logfs/file.c index 370f367..0de5240 100644 --- a/fs/logfs/file.c +++ b/fs/logfs/file.c @@ -161,7 +161,17 @@ static int logfs_writepage(struct page *page, struct writeback_control *wbc) static void logfs_invalidatepage(struct page *page, unsigned long offset) { - move_page_to_btree(page); + struct logfs_block *block = logfs_block(page); + + if (block->reserved_bytes) { + struct super_block *sb = page->mapping->host->i_sb; + struct logfs_super *super = logfs_super(sb); + + super->s_dirty_pages -= block->reserved_bytes; + block->ops->free_block(sb, block); + BUG_ON(bitmap_weight(block->alias_map, LOGFS_BLOCK_FACTOR)); + } else + move_page_to_btree(page); BUG_ON(PagePrivate(page) || page->private); } @@ -212,10 +222,8 @@ int logfs_ioctl(struct inode *inode, struct file *file, unsigned int cmd, int logfs_fsync(struct file *file, struct dentry *dentry, int datasync) { struct super_block *sb = dentry->d_inode->i_sb; - struct logfs_super *super = logfs_super(sb); - /* FIXME: write anchor */ - super->s_devops->sync(sb); + logfs_write_anchor(sb); return 0; } diff --git a/fs/logfs/gc.c b/fs/logfs/gc.c index 76c242f..caa4419 100644 --- a/fs/logfs/gc.c +++ b/fs/logfs/gc.c @@ -122,7 +122,7 @@ static void logfs_cleanse_block(struct super_block *sb, u64 ofs, u64 ino, logfs_safe_iput(inode, cookie); } -static u32 logfs_gc_segment(struct super_block *sb, u32 segno, u8 dist) +static u32 logfs_gc_segment(struct super_block *sb, u32 segno) { struct logfs_super *super = logfs_super(sb); struct logfs_segment_header sh; @@ -401,7 +401,7 @@ static int __logfs_gc_once(struct super_block *sb, struct gc_candidate *cand) segno, (u64)segno << super->s_segshift, dist, no_free_segments(sb), valid, super->s_free_bytes); - cleaned = logfs_gc_segment(sb, segno, dist); + cleaned = logfs_gc_segment(sb, segno); log_gc("GC segment #%02x complete - now %x valid\n", segno, valid - cleaned); BUG_ON(cleaned != valid); @@ -632,38 +632,31 @@ static int check_area(struct super_block *sb, int i) { struct logfs_super *super = logfs_super(sb); struct logfs_area *area = super->s_area[i]; - struct logfs_object_header oh; + gc_level_t gc_level; + u32 cleaned, valid, ec; u32 segno = area->a_segno; - u32 ofs = area->a_used_bytes; - __be32 crc; - int err; + u64 ofs = dev_ofs(sb, area->a_segno, area->a_written_bytes); if (!area->a_is_open) return 0; - for (ofs = area->a_used_bytes; - ofs <= super->s_segsize - sizeof(oh); - ofs += (u32)be16_to_cpu(oh.len) + sizeof(oh)) { - err = wbuf_read(sb, dev_ofs(sb, segno, ofs), sizeof(oh), &oh); - if (err) - return err; - - if (!memchr_inv(&oh, 0xff, sizeof(oh))) - break; + if (super->s_devops->can_write_buf(sb, ofs) == 0) + return 0; - crc = logfs_crc32(&oh, sizeof(oh) - 4, 4); - if (crc != oh.crc) { - printk(KERN_INFO "interrupted header at %llx\n", - dev_ofs(sb, segno, ofs)); - return 0; - } - } - if (ofs != area->a_used_bytes) { - printk(KERN_INFO "%x bytes unaccounted data found at %llx\n", - ofs - area->a_used_bytes, - dev_ofs(sb, segno, area->a_used_bytes)); - area->a_used_bytes = ofs; - } + printk(KERN_INFO"LogFS: Possibly incomplete write at %llx\n", ofs); + /* + * The device cannot write back the write buffer. Most likely the + * wbuf was already written out and the system crashed at some point + * before the journal commit happened. In that case we wouldn't have + * to do anything. But if the crash happened before the wbuf was + * written out correctly, we must GC this segment. So assume the + * worst and always do the GC run. + */ + area->a_is_open = 0; + valid = logfs_valid_bytes(sb, segno, &ec, &gc_level); + cleaned = logfs_gc_segment(sb, segno); + if (cleaned != valid) + return -EIO; return 0; } diff --git a/fs/logfs/inode.c b/fs/logfs/inode.c index 14ed272..755a92e 100644 --- a/fs/logfs/inode.c +++ b/fs/logfs/inode.c @@ -193,6 +193,7 @@ static void logfs_init_inode(struct super_block *sb, struct inode *inode) inode->i_ctime = CURRENT_TIME; inode->i_mtime = CURRENT_TIME; inode->i_nlink = 1; + li->li_refcount = 1; INIT_LIST_HEAD(&li->li_freeing_list); for (i = 0; i < LOGFS_EMBEDDED_FIELDS; i++) @@ -326,7 +327,7 @@ static void logfs_set_ino_generation(struct super_block *sb, u64 ino; mutex_lock(&super->s_journal_mutex); - ino = logfs_seek_hole(super->s_master_inode, super->s_last_ino); + ino = logfs_seek_hole(super->s_master_inode, super->s_last_ino + 1); super->s_last_ino = ino; super->s_inos_till_wrap--; if (super->s_inos_till_wrap < 0) { @@ -386,8 +387,7 @@ static void logfs_init_once(void *_li) static int logfs_sync_fs(struct super_block *sb, int wait) { - /* FIXME: write anchor */ - logfs_super(sb)->s_devops->sync(sb); + logfs_write_anchor(sb); return 0; } diff --git a/fs/logfs/journal.c b/fs/logfs/journal.c index fb0a613..4b0e061 100644 --- a/fs/logfs/journal.c +++ b/fs/logfs/journal.c @@ -132,10 +132,9 @@ static int read_area(struct super_block *sb, struct logfs_je_area *a) ofs = dev_ofs(sb, area->a_segno, area->a_written_bytes); if (super->s_writesize > 1) - logfs_buf_recover(area, ofs, a + 1, super->s_writesize); + return logfs_buf_recover(area, ofs, a + 1, super->s_writesize); else - logfs_buf_recover(area, ofs, NULL, 0); - return 0; + return logfs_buf_recover(area, ofs, NULL, 0); } static void *unpack(void *from, void *to) @@ -245,7 +244,7 @@ static int read_je(struct super_block *sb, u64 ofs) read_erasecount(sb, unpack(jh, scratch)); break; case JE_AREA: - read_area(sb, unpack(jh, scratch)); + err = read_area(sb, unpack(jh, scratch)); break; case JE_OBJ_ALIAS: err = logfs_load_object_aliases(sb, unpack(jh, scratch), diff --git a/fs/logfs/logfs.h b/fs/logfs/logfs.h index 0a3df1a..93b55f3 100644 --- a/fs/logfs/logfs.h +++ b/fs/logfs/logfs.h @@ -144,6 +144,7 @@ struct logfs_area_ops { * @erase: erase one segment * @read: read from the device * @erase: erase part of the device + * @can_write_buf: decide whether wbuf can be written to ofs */ struct logfs_device_ops { struct page *(*find_first_sb)(struct super_block *sb, u64 *ofs); @@ -153,6 +154,7 @@ struct logfs_device_ops { void (*writeseg)(struct super_block *sb, u64 ofs, size_t len); int (*erase)(struct super_block *sb, loff_t ofs, size_t len, int ensure_write); + int (*can_write_buf)(struct super_block *sb, u64 ofs); void (*sync)(struct super_block *sb); void (*put_device)(struct super_block *sb); }; @@ -394,6 +396,7 @@ struct logfs_super { int s_lock_count; mempool_t *s_block_pool; /* struct logfs_block pool */ mempool_t *s_shadow_pool; /* struct logfs_shadow pool */ + struct list_head s_writeback_list; /* writeback pages */ /* * Space accounting: * - s_used_bytes specifies space used to store valid data objects. @@ -598,19 +601,19 @@ void freeseg(struct super_block *sb, u32 segno); int logfs_init_areas(struct super_block *sb); void logfs_cleanup_areas(struct super_block *sb); int logfs_open_area(struct logfs_area *area, size_t bytes); -void __logfs_buf_write(struct logfs_area *area, u64 ofs, void *buf, size_t len, +int __logfs_buf_write(struct logfs_area *area, u64 ofs, void *buf, size_t len, int use_filler); -static inline void logfs_buf_write(struct logfs_area *area, u64 ofs, +static inline int logfs_buf_write(struct logfs_area *area, u64 ofs, void *buf, size_t len) { - __logfs_buf_write(area, ofs, buf, len, 0); + return __logfs_buf_write(area, ofs, buf, len, 0); } -static inline void logfs_buf_recover(struct logfs_area *area, u64 ofs, +static inline int logfs_buf_recover(struct logfs_area *area, u64 ofs, void *buf, size_t len) { - __logfs_buf_write(area, ofs, buf, len, 1); + return __logfs_buf_write(area, ofs, buf, len, 1); } /* super.c */ diff --git a/fs/logfs/readwrite.c b/fs/logfs/readwrite.c index 3159db6..0718d11 100644 --- a/fs/logfs/readwrite.c +++ b/fs/logfs/readwrite.c @@ -892,6 +892,8 @@ u64 logfs_seek_hole(struct inode *inode, u64 bix) return bix; else if (li->li_data[INDIRECT_INDEX] & LOGFS_FULLY_POPULATED) bix = maxbix(li->li_height); + else if (bix >= maxbix(li->li_height)) + return bix; else { bix = seek_holedata_loop(inode, bix, 0); if (bix < maxbix(li->li_height)) @@ -1093,17 +1095,25 @@ static int logfs_reserve_bytes(struct inode *inode, int bytes) int get_page_reserve(struct inode *inode, struct page *page) { struct logfs_super *super = logfs_super(inode->i_sb); + struct logfs_block *block = logfs_block(page); int ret; - if (logfs_block(page) && logfs_block(page)->reserved_bytes) + if (block && block->reserved_bytes) return 0; logfs_get_wblocks(inode->i_sb, page, WF_LOCK); - ret = logfs_reserve_bytes(inode, 6 * LOGFS_MAX_OBJECTSIZE); + while ((ret = logfs_reserve_bytes(inode, 6 * LOGFS_MAX_OBJECTSIZE)) && + !list_empty(&super->s_writeback_list)) { + block = list_entry(super->s_writeback_list.next, + struct logfs_block, alias_list); + block->ops->write_block(block); + } if (!ret) { alloc_data_block(inode, page); - logfs_block(page)->reserved_bytes += 6 * LOGFS_MAX_OBJECTSIZE; + block = logfs_block(page); + block->reserved_bytes += 6 * LOGFS_MAX_OBJECTSIZE; super->s_dirty_pages += 6 * LOGFS_MAX_OBJECTSIZE; + list_move_tail(&block->alias_list, &super->s_writeback_list); } logfs_put_wblocks(inode->i_sb, page, WF_LOCK); return ret; @@ -1861,7 +1871,7 @@ int logfs_truncate(struct inode *inode, u64 target) size = target; logfs_get_wblocks(sb, NULL, 1); - err = __logfs_truncate(inode, target); + err = __logfs_truncate(inode, size); if (!err) err = __logfs_write_inode(inode, 0); logfs_put_wblocks(sb, NULL, 1); @@ -2249,6 +2259,7 @@ int logfs_init_rw(struct super_block *sb) int min_fill = 3 * super->s_no_blocks; INIT_LIST_HEAD(&super->s_object_alias); + INIT_LIST_HEAD(&super->s_writeback_list); mutex_init(&super->s_write_mutex); super->s_block_pool = mempool_create_kmalloc_pool(min_fill, sizeof(struct logfs_block)); diff --git a/fs/logfs/segment.c b/fs/logfs/segment.c index f77ce2b..a9657af 100644 --- a/fs/logfs/segment.c +++ b/fs/logfs/segment.c @@ -67,7 +67,7 @@ static struct page *get_mapping_page(struct super_block *sb, pgoff_t index, return page; } -void __logfs_buf_write(struct logfs_area *area, u64 ofs, void *buf, size_t len, +int __logfs_buf_write(struct logfs_area *area, u64 ofs, void *buf, size_t len, int use_filler) { pgoff_t index = ofs >> PAGE_SHIFT; @@ -81,8 +81,10 @@ void __logfs_buf_write(struct logfs_area *area, u64 ofs, void *buf, size_t len, copylen = min((ulong)len, PAGE_SIZE - offset); page = get_mapping_page(area->a_sb, index, use_filler); - SetPageUptodate(page); + if (IS_ERR(page)) + return PTR_ERR(page); BUG_ON(!page); /* FIXME: reserve a pool */ + SetPageUptodate(page); memcpy(page_address(page) + offset, buf, copylen); SetPagePrivate(page); page_cache_release(page); @@ -92,6 +94,7 @@ void __logfs_buf_write(struct logfs_area *area, u64 ofs, void *buf, size_t len, offset = 0; index++; } while (len); + return 0; } static void pad_partial_page(struct logfs_area *area) diff --git a/fs/logfs/super.c b/fs/logfs/super.c index 5866ee6..d651e10 100644 --- a/fs/logfs/super.c +++ b/fs/logfs/super.c @@ -138,10 +138,14 @@ static int logfs_sb_set(struct super_block *sb, void *_super) sb->s_fs_info = super; sb->s_mtd = super->s_mtd; sb->s_bdev = super->s_bdev; +#ifdef CONFIG_BLOCK if (sb->s_bdev) sb->s_bdi = &bdev_get_queue(sb->s_bdev)->backing_dev_info; +#endif +#ifdef CONFIG_MTD if (sb->s_mtd) sb->s_bdi = sb->s_mtd->backing_dev_info; +#endif return 0; } @@ -333,27 +337,27 @@ static int logfs_get_sb_final(struct super_block *sb, struct vfsmount *mnt) goto fail; sb->s_root = d_alloc_root(rootdir); - if (!sb->s_root) - goto fail2; + if (!sb->s_root) { + iput(rootdir); + goto fail; + } super->s_erase_page = alloc_pages(GFP_KERNEL, 0); if (!super->s_erase_page) - goto fail2; + goto fail; memset(page_address(super->s_erase_page), 0xFF, PAGE_SIZE); /* FIXME: check for read-only mounts */ err = logfs_make_writeable(sb); if (err) - goto fail3; + goto fail1; log_super("LogFS: Finished mounting\n"); simple_set_mnt(mnt, sb); return 0; -fail3: +fail1: __free_page(super->s_erase_page); -fail2: - iput(rootdir); fail: iput(logfs_super(sb)->s_master_inode); return -EIO; @@ -382,7 +386,7 @@ static struct page *find_super_block(struct super_block *sb) if (!first || IS_ERR(first)) return NULL; last = super->s_devops->find_last_sb(sb, &super->s_sb_ofs[1]); - if (!last || IS_ERR(first)) { + if (!last || IS_ERR(last)) { page_cache_release(first); return NULL; } @@ -413,7 +417,7 @@ static int __logfs_read_sb(struct super_block *sb) page = find_super_block(sb); if (!page) - return -EIO; + return -EINVAL; ds = page_address(page); super->s_size = be64_to_cpu(ds->ds_filesystem_size); @@ -1641,7 +1641,7 @@ static struct file *do_last(struct nameidata *nd, struct path *path, if (nd->last.name[nd->last.len]) { if (open_flag & O_CREAT) goto exit; - nd->flags |= LOOKUP_DIRECTORY; + nd->flags |= LOOKUP_DIRECTORY | LOOKUP_FOLLOW; } /* just plain open? */ @@ -1830,6 +1830,8 @@ reval: } if (open_flag & O_DIRECTORY) nd.flags |= LOOKUP_DIRECTORY; + if (!(open_flag & O_NOFOLLOW)) + nd.flags |= LOOKUP_FOLLOW; filp = do_last(&nd, &path, open_flag, acc_mode, mode, pathname); while (unlikely(!filp)) { /* trailing symlink */ struct path holder; @@ -1837,7 +1839,7 @@ reval: void *cookie; error = -ELOOP; /* S_ISDIR part is a temporary automount kludge */ - if ((open_flag & O_NOFOLLOW) && !S_ISDIR(inode->i_mode)) + if (!(nd.flags & LOOKUP_FOLLOW) && !S_ISDIR(inode->i_mode)) goto exit_dput; if (count++ == 32) goto exit_dput; @@ -2174,8 +2176,10 @@ int vfs_rmdir(struct inode *dir, struct dentry *dentry) error = security_inode_rmdir(dir, dentry); if (!error) { error = dir->i_op->rmdir(dir, dentry); - if (!error) + if (!error) { dentry->d_inode->i_flags |= S_DEAD; + dont_mount(dentry); + } } } mutex_unlock(&dentry->d_inode->i_mutex); @@ -2259,7 +2263,7 @@ int vfs_unlink(struct inode *dir, struct dentry *dentry) if (!error) { error = dir->i_op->unlink(dir, dentry); if (!error) - dentry->d_inode->i_flags |= S_DEAD; + dont_mount(dentry); } } mutex_unlock(&dentry->d_inode->i_mutex); @@ -2570,17 +2574,20 @@ static int vfs_rename_dir(struct inode *old_dir, struct dentry *old_dentry, return error; target = new_dentry->d_inode; - if (target) { + if (target) mutex_lock(&target->i_mutex); - dentry_unhash(new_dentry); - } if (d_mountpoint(old_dentry)||d_mountpoint(new_dentry)) error = -EBUSY; - else + else { + if (target) + dentry_unhash(new_dentry); error = old_dir->i_op->rename(old_dir, old_dentry, new_dir, new_dentry); + } if (target) { - if (!error) + if (!error) { target->i_flags |= S_DEAD; + dont_mount(new_dentry); + } mutex_unlock(&target->i_mutex); if (d_unhashed(new_dentry)) d_rehash(new_dentry); @@ -2612,7 +2619,7 @@ static int vfs_rename_other(struct inode *old_dir, struct dentry *old_dentry, error = old_dir->i_op->rename(old_dir, old_dentry, new_dir, new_dentry); if (!error) { if (target) - target->i_flags |= S_DEAD; + dont_mount(new_dentry); if (!(old_dir->i_sb->s_type->fs_flags & FS_RENAME_DOES_D_MOVE)) d_move(old_dentry, new_dentry); } diff --git a/fs/namespace.c b/fs/namespace.c index 8174c8a..f20cb57 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -1432,7 +1432,7 @@ static int graft_tree(struct vfsmount *mnt, struct path *path) err = -ENOENT; mutex_lock(&path->dentry->d_inode->i_mutex); - if (IS_DEADDIR(path->dentry->d_inode)) + if (cant_mount(path->dentry)) goto out_unlock; err = security_sb_check_sb(mnt, path); @@ -1623,7 +1623,7 @@ static int do_move_mount(struct path *path, char *old_name) err = -ENOENT; mutex_lock(&path->dentry->d_inode->i_mutex); - if (IS_DEADDIR(path->dentry->d_inode)) + if (cant_mount(path->dentry)) goto out1; if (d_unlinked(path->dentry)) @@ -2234,7 +2234,7 @@ SYSCALL_DEFINE2(pivot_root, const char __user *, new_root, if (!check_mnt(root.mnt)) goto out2; error = -ENOENT; - if (IS_DEADDIR(new.dentry->d_inode)) + if (cant_mount(old.dentry)) goto out2; if (d_unlinked(new.dentry)) goto out2; diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c index 1567124..ea61d26 100644 --- a/fs/nfs/delegation.c +++ b/fs/nfs/delegation.c @@ -24,6 +24,8 @@ static void nfs_do_free_delegation(struct nfs_delegation *delegation) { + if (delegation->cred) + put_rpccred(delegation->cred); kfree(delegation); } @@ -36,13 +38,7 @@ static void nfs_free_delegation_callback(struct rcu_head *head) static void nfs_free_delegation(struct nfs_delegation *delegation) { - struct rpc_cred *cred; - - cred = rcu_dereference(delegation->cred); - rcu_assign_pointer(delegation->cred, NULL); call_rcu(&delegation->rcu, nfs_free_delegation_callback); - if (cred) - put_rpccred(cred); } void nfs_mark_delegation_referenced(struct nfs_delegation *delegation) @@ -129,21 +125,35 @@ again: */ void nfs_inode_reclaim_delegation(struct inode *inode, struct rpc_cred *cred, struct nfs_openres *res) { - struct nfs_delegation *delegation = NFS_I(inode)->delegation; - struct rpc_cred *oldcred; + struct nfs_delegation *delegation; + struct rpc_cred *oldcred = NULL; - if (delegation == NULL) - return; - memcpy(delegation->stateid.data, res->delegation.data, - sizeof(delegation->stateid.data)); - delegation->type = res->delegation_type; - delegation->maxsize = res->maxsize; - oldcred = delegation->cred; - delegation->cred = get_rpccred(cred); - clear_bit(NFS_DELEGATION_NEED_RECLAIM, &delegation->flags); - NFS_I(inode)->delegation_state = delegation->type; - smp_wmb(); - put_rpccred(oldcred); + rcu_read_lock(); + delegation = rcu_dereference(NFS_I(inode)->delegation); + if (delegation != NULL) { + spin_lock(&delegation->lock); + if (delegation->inode != NULL) { + memcpy(delegation->stateid.data, res->delegation.data, + sizeof(delegation->stateid.data)); + delegation->type = res->delegation_type; + delegation->maxsize = res->maxsize; + oldcred = delegation->cred; + delegation->cred = get_rpccred(cred); + clear_bit(NFS_DELEGATION_NEED_RECLAIM, + &delegation->flags); + NFS_I(inode)->delegation_state = delegation->type; + spin_unlock(&delegation->lock); + put_rpccred(oldcred); + rcu_read_unlock(); + } else { + /* We appear to have raced with a delegation return. */ + spin_unlock(&delegation->lock); + rcu_read_unlock(); + nfs_inode_set_delegation(inode, cred, res); + } + } else { + rcu_read_unlock(); + } } static int nfs_do_return_delegation(struct inode *inode, struct nfs_delegation *delegation, int issync) @@ -166,9 +176,13 @@ static struct inode *nfs_delegation_grab_inode(struct nfs_delegation *delegation return inode; } -static struct nfs_delegation *nfs_detach_delegation_locked(struct nfs_inode *nfsi, const nfs4_stateid *stateid) +static struct nfs_delegation *nfs_detach_delegation_locked(struct nfs_inode *nfsi, + const nfs4_stateid *stateid, + struct nfs_client *clp) { - struct nfs_delegation *delegation = rcu_dereference(nfsi->delegation); + struct nfs_delegation *delegation = + rcu_dereference_protected(nfsi->delegation, + lockdep_is_held(&clp->cl_lock)); if (delegation == NULL) goto nomatch; @@ -195,7 +209,7 @@ int nfs_inode_set_delegation(struct inode *inode, struct rpc_cred *cred, struct { struct nfs_client *clp = NFS_SERVER(inode)->nfs_client; struct nfs_inode *nfsi = NFS_I(inode); - struct nfs_delegation *delegation; + struct nfs_delegation *delegation, *old_delegation; struct nfs_delegation *freeme = NULL; int status = 0; @@ -213,10 +227,12 @@ int nfs_inode_set_delegation(struct inode *inode, struct rpc_cred *cred, struct spin_lock_init(&delegation->lock); spin_lock(&clp->cl_lock); - if (rcu_dereference(nfsi->delegation) != NULL) { - if (memcmp(&delegation->stateid, &nfsi->delegation->stateid, - sizeof(delegation->stateid)) == 0 && - delegation->type == nfsi->delegation->type) { + old_delegation = rcu_dereference_protected(nfsi->delegation, + lockdep_is_held(&clp->cl_lock)); + if (old_delegation != NULL) { + if (memcmp(&delegation->stateid, &old_delegation->stateid, + sizeof(old_delegation->stateid)) == 0 && + delegation->type == old_delegation->type) { goto out; } /* @@ -226,12 +242,12 @@ int nfs_inode_set_delegation(struct inode *inode, struct rpc_cred *cred, struct dfprintk(FILE, "%s: server %s handed out " "a duplicate delegation!\n", __func__, clp->cl_hostname); - if (delegation->type <= nfsi->delegation->type) { + if (delegation->type <= old_delegation->type) { freeme = delegation; delegation = NULL; goto out; } - freeme = nfs_detach_delegation_locked(nfsi, NULL); + freeme = nfs_detach_delegation_locked(nfsi, NULL, clp); } list_add_rcu(&delegation->super_list, &clp->cl_delegations); nfsi->delegation_state = delegation->type; @@ -301,7 +317,7 @@ restart: if (inode == NULL) continue; spin_lock(&clp->cl_lock); - delegation = nfs_detach_delegation_locked(NFS_I(inode), NULL); + delegation = nfs_detach_delegation_locked(NFS_I(inode), NULL, clp); spin_unlock(&clp->cl_lock); rcu_read_unlock(); if (delegation != NULL) { @@ -330,9 +346,9 @@ void nfs_inode_return_delegation_noreclaim(struct inode *inode) struct nfs_inode *nfsi = NFS_I(inode); struct nfs_delegation *delegation; - if (rcu_dereference(nfsi->delegation) != NULL) { + if (rcu_access_pointer(nfsi->delegation) != NULL) { spin_lock(&clp->cl_lock); - delegation = nfs_detach_delegation_locked(nfsi, NULL); + delegation = nfs_detach_delegation_locked(nfsi, NULL, clp); spin_unlock(&clp->cl_lock); if (delegation != NULL) nfs_do_return_delegation(inode, delegation, 0); @@ -346,9 +362,9 @@ int nfs_inode_return_delegation(struct inode *inode) struct nfs_delegation *delegation; int err = 0; - if (rcu_dereference(nfsi->delegation) != NULL) { + if (rcu_access_pointer(nfsi->delegation) != NULL) { spin_lock(&clp->cl_lock); - delegation = nfs_detach_delegation_locked(nfsi, NULL); + delegation = nfs_detach_delegation_locked(nfsi, NULL, clp); spin_unlock(&clp->cl_lock); if (delegation != NULL) { nfs_msync_inode(inode); @@ -526,7 +542,7 @@ restart: if (inode == NULL) continue; spin_lock(&clp->cl_lock); - delegation = nfs_detach_delegation_locked(NFS_I(inode), NULL); + delegation = nfs_detach_delegation_locked(NFS_I(inode), NULL, clp); spin_unlock(&clp->cl_lock); rcu_read_unlock(); if (delegation != NULL) diff --git a/fs/notify/inotify/inotify_fsnotify.c b/fs/notify/inotify/inotify_fsnotify.c index 1afb0a1..e27960c 100644 --- a/fs/notify/inotify/inotify_fsnotify.c +++ b/fs/notify/inotify/inotify_fsnotify.c @@ -28,6 +28,7 @@ #include <linux/path.h> /* struct path */ #include <linux/slab.h> /* kmem_* */ #include <linux/types.h> +#include <linux/sched.h> #include "inotify.h" @@ -146,6 +147,7 @@ static void inotify_free_group_priv(struct fsnotify_group *group) idr_for_each(&group->inotify_data.idr, idr_callback, group); idr_remove_all(&group->inotify_data.idr); idr_destroy(&group->inotify_data.idr); + free_uid(group->inotify_data.user); } void inotify_free_event_priv(struct fsnotify_event_private_data *fsn_event_priv) diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c index 472cdf2..e46ca68 100644 --- a/fs/notify/inotify/inotify_user.c +++ b/fs/notify/inotify/inotify_user.c @@ -546,21 +546,24 @@ retry: if (unlikely(!idr_pre_get(&group->inotify_data.idr, GFP_KERNEL))) goto out_err; + /* we are putting the mark on the idr, take a reference */ + fsnotify_get_mark(&tmp_ientry->fsn_entry); + spin_lock(&group->inotify_data.idr_lock); ret = idr_get_new_above(&group->inotify_data.idr, &tmp_ientry->fsn_entry, group->inotify_data.last_wd+1, &tmp_ientry->wd); spin_unlock(&group->inotify_data.idr_lock); if (ret) { + /* we didn't get on the idr, drop the idr reference */ + fsnotify_put_mark(&tmp_ientry->fsn_entry); + /* idr was out of memory allocate and try again */ if (ret == -EAGAIN) goto retry; goto out_err; } - /* we put the mark on the idr, take a reference */ - fsnotify_get_mark(&tmp_ientry->fsn_entry); - /* we are on the idr, now get on the inode */ ret = fsnotify_add_mark(&tmp_ientry->fsn_entry, group, inode); if (ret) { @@ -578,16 +581,13 @@ retry: /* return the watch descriptor for this new entry */ ret = tmp_ientry->wd; - /* match the ref from fsnotify_init_markentry() */ - fsnotify_put_mark(&tmp_ientry->fsn_entry); - /* if this mark added a new event update the group mask */ if (mask & ~group->mask) fsnotify_recalc_group_mask(group); out_err: - if (ret < 0) - kmem_cache_free(inotify_inode_mark_cachep, tmp_ientry); + /* match the ref from fsnotify_init_markentry() */ + fsnotify_put_mark(&tmp_ientry->fsn_entry); return ret; } diff --git a/fs/proc/array.c b/fs/proc/array.c index e51f2ec..885ab55 100644 --- a/fs/proc/array.c +++ b/fs/proc/array.c @@ -81,7 +81,6 @@ #include <linux/pid_namespace.h> #include <linux/ptrace.h> #include <linux/tracehook.h> -#include <linux/swapops.h> #include <asm/pgtable.h> #include <asm/processor.h> @@ -495,7 +494,7 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns, rsslim, mm ? mm->start_code : 0, mm ? mm->end_code : 0, - (permitted && mm) ? task->stack_start : 0, + (permitted && mm) ? mm->start_stack : 0, esp, eip, /* The signal information here is obsolete. diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index 0705534..47f5b14 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -247,25 +247,6 @@ static void show_map_vma(struct seq_file *m, struct vm_area_struct *vma) } else if (vma->vm_start <= mm->start_stack && vma->vm_end >= mm->start_stack) { name = "[stack]"; - } else { - unsigned long stack_start; - struct proc_maps_private *pmp; - - pmp = m->private; - stack_start = pmp->task->stack_start; - - if (vma->vm_start <= stack_start && - vma->vm_end >= stack_start) { - pad_len_spaces(m, len); - seq_printf(m, - "[threadstack:%08lx]", -#ifdef CONFIG_STACK_GROWSUP - vma->vm_end - stack_start -#else - stack_start - vma->vm_start -#endif - ); - } } } else { name = "[vdso]"; diff --git a/fs/sysv/dir.c b/fs/sysv/dir.c index 4e50286..1dabed2 100644 --- a/fs/sysv/dir.c +++ b/fs/sysv/dir.c @@ -164,8 +164,8 @@ struct sysv_dir_entry *sysv_find_entry(struct dentry *dentry, struct page **res_ name, de->name)) goto found; } + dir_put_page(page); } - dir_put_page(page); if (++n >= npages) n = 0; diff --git a/include/asm-generic/atomic.h b/include/asm-generic/atomic.h index c99c64d..c33749f 100644 --- a/include/asm-generic/atomic.h +++ b/include/asm-generic/atomic.h @@ -33,7 +33,7 @@ * Atomically reads the value of @v. Note that the guaranteed * useful range of an atomic_t is only 24 bits. */ -#define atomic_read(v) ((v)->counter) +#define atomic_read(v) (*(volatile int *)&(v)->counter) /** * atomic_set - set atomic variable diff --git a/include/asm-generic/bitops/arch_hweight.h b/include/asm-generic/bitops/arch_hweight.h new file mode 100644 index 0000000..6a211f4 --- /dev/null +++ b/include/asm-generic/bitops/arch_hweight.h @@ -0,0 +1,25 @@ +#ifndef _ASM_GENERIC_BITOPS_ARCH_HWEIGHT_H_ +#define _ASM_GENERIC_BITOPS_ARCH_HWEIGHT_H_ + +#include <asm/types.h> + +static inline unsigned int __arch_hweight32(unsigned int w) +{ + return __sw_hweight32(w); +} + +static inline unsigned int __arch_hweight16(unsigned int w) +{ + return __sw_hweight16(w); +} + +static inline unsigned int __arch_hweight8(unsigned int w) +{ + return __sw_hweight8(w); +} + +static inline unsigned long __arch_hweight64(__u64 w) +{ + return __sw_hweight64(w); +} +#endif /* _ASM_GENERIC_BITOPS_HWEIGHT_H_ */ diff --git a/include/asm-generic/bitops/const_hweight.h b/include/asm-generic/bitops/const_hweight.h new file mode 100644 index 0000000..fa2a50b --- /dev/null +++ b/include/asm-generic/bitops/const_hweight.h @@ -0,0 +1,42 @@ +#ifndef _ASM_GENERIC_BITOPS_CONST_HWEIGHT_H_ +#define _ASM_GENERIC_BITOPS_CONST_HWEIGHT_H_ + +/* + * Compile time versions of __arch_hweightN() + */ +#define __const_hweight8(w) \ + ( (!!((w) & (1ULL << 0))) + \ + (!!((w) & (1ULL << 1))) + \ + (!!((w) & (1ULL << 2))) + \ + (!!((w) & (1ULL << 3))) + \ + (!!((w) & (1ULL << 4))) + \ + (!!((w) & (1ULL << 5))) + \ + (!!((w) & (1ULL << 6))) + \ + (!!((w) & (1ULL << 7))) ) + +#define __const_hweight16(w) (__const_hweight8(w) + __const_hweight8((w) >> 8 )) +#define __const_hweight32(w) (__const_hweight16(w) + __const_hweight16((w) >> 16)) +#define __const_hweight64(w) (__const_hweight32(w) + __const_hweight32((w) >> 32)) + +/* + * Generic interface. + */ +#define hweight8(w) (__builtin_constant_p(w) ? __const_hweight8(w) : __arch_hweight8(w)) +#define hweight16(w) (__builtin_constant_p(w) ? __const_hweight16(w) : __arch_hweight16(w)) +#define hweight32(w) (__builtin_constant_p(w) ? __const_hweight32(w) : __arch_hweight32(w)) +#define hweight64(w) (__builtin_constant_p(w) ? __const_hweight64(w) : __arch_hweight64(w)) + +/* + * Interface for known constant arguments + */ +#define HWEIGHT8(w) (BUILD_BUG_ON_ZERO(!__builtin_constant_p(w)) + __const_hweight8(w)) +#define HWEIGHT16(w) (BUILD_BUG_ON_ZERO(!__builtin_constant_p(w)) + __const_hweight16(w)) +#define HWEIGHT32(w) (BUILD_BUG_ON_ZERO(!__builtin_constant_p(w)) + __const_hweight32(w)) +#define HWEIGHT64(w) (BUILD_BUG_ON_ZERO(!__builtin_constant_p(w)) + __const_hweight64(w)) + +/* + * Type invariant interface to the compile time constant hweight functions. + */ +#define HWEIGHT(w) HWEIGHT64((u64)w) + +#endif /* _ASM_GENERIC_BITOPS_CONST_HWEIGHT_H_ */ diff --git a/include/asm-generic/bitops/hweight.h b/include/asm-generic/bitops/hweight.h index fbbc383..a94d651 100644 --- a/include/asm-generic/bitops/hweight.h +++ b/include/asm-generic/bitops/hweight.h @@ -1,11 +1,7 @@ #ifndef _ASM_GENERIC_BITOPS_HWEIGHT_H_ #define _ASM_GENERIC_BITOPS_HWEIGHT_H_ -#include <asm/types.h> - -extern unsigned int hweight32(unsigned int w); -extern unsigned int hweight16(unsigned int w); -extern unsigned int hweight8(unsigned int w); -extern unsigned long hweight64(__u64 w); +#include <asm-generic/bitops/arch_hweight.h> +#include <asm-generic/bitops/const_hweight.h> #endif /* _ASM_GENERIC_BITOPS_HWEIGHT_H_ */ diff --git a/include/asm-generic/dma-mapping-common.h b/include/asm-generic/dma-mapping-common.h index e694263..6920695 100644 --- a/include/asm-generic/dma-mapping-common.h +++ b/include/asm-generic/dma-mapping-common.h @@ -131,7 +131,7 @@ static inline void dma_sync_single_range_for_cpu(struct device *dev, debug_dma_sync_single_range_for_cpu(dev, addr, offset, size, dir); } else - dma_sync_single_for_cpu(dev, addr, size, dir); + dma_sync_single_for_cpu(dev, addr + offset, size, dir); } static inline void dma_sync_single_range_for_device(struct device *dev, @@ -148,7 +148,7 @@ static inline void dma_sync_single_range_for_device(struct device *dev, debug_dma_sync_single_range_for_device(dev, addr, offset, size, dir); } else - dma_sync_single_for_device(dev, addr, size, dir); + dma_sync_single_for_device(dev, addr + offset, size, dir); } static inline void diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h index e929c27..6b9db91 100644 --- a/include/drm/ttm/ttm_bo_driver.h +++ b/include/drm/ttm/ttm_bo_driver.h @@ -789,34 +789,6 @@ extern void ttm_bo_unreserve(struct ttm_buffer_object *bo); extern int ttm_bo_wait_unreserved(struct ttm_buffer_object *bo, bool interruptible); -/** - * ttm_bo_block_reservation - * - * @bo: A pointer to a struct ttm_buffer_object. - * @interruptible: Use interruptible sleep when waiting. - * @no_wait: Don't sleep, but rather return -EBUSY. - * - * Block reservation for validation by simply reserving the buffer. - * This is intended for single buffer use only without eviction, - * and thus needs no deadlock protection. - * - * Returns: - * -EBUSY: If no_wait == 1 and the buffer is already reserved. - * -ERESTARTSYS: If interruptible == 1 and the process received a signal - * while sleeping. - */ -extern int ttm_bo_block_reservation(struct ttm_buffer_object *bo, - bool interruptible, bool no_wait); - -/** - * ttm_bo_unblock_reservation - * - * @bo: A pointer to a struct ttm_buffer_object. - * - * Unblocks reservation leaving lru lists untouched. - */ -extern void ttm_bo_unblock_reservation(struct ttm_buffer_object *bo); - /* * ttm_bo_util.c */ diff --git a/include/linux/acpi.h b/include/linux/acpi.h index b926afe..3da73f5 100644 --- a/include/linux/acpi.h +++ b/include/linux/acpi.h @@ -116,11 +116,12 @@ extern unsigned long acpi_realmode_flags; int acpi_register_gsi (struct device *dev, u32 gsi, int triggering, int polarity); int acpi_gsi_to_irq (u32 gsi, unsigned int *irq); +int acpi_isa_irq_to_gsi (unsigned isa_irq, u32 *gsi); #ifdef CONFIG_X86_IO_APIC -extern int acpi_get_override_irq(int bus_irq, int *trigger, int *polarity); +extern int acpi_get_override_irq(u32 gsi, int *trigger, int *polarity); #else -#define acpi_get_override_irq(bus, trigger, polarity) (-1) +#define acpi_get_override_irq(gsi, trigger, polarity) (-1) #endif /* * This function undoes the effect of one call to acpi_register_gsi(). diff --git a/include/linux/bitops.h b/include/linux/bitops.h index b796eab..fc68053 100644 --- a/include/linux/bitops.h +++ b/include/linux/bitops.h @@ -10,6 +10,11 @@ #define BITS_TO_LONGS(nr) DIV_ROUND_UP(nr, BITS_PER_BYTE * sizeof(long)) #endif +extern unsigned int __sw_hweight8(unsigned int w); +extern unsigned int __sw_hweight16(unsigned int w); +extern unsigned int __sw_hweight32(unsigned int w); +extern unsigned long __sw_hweight64(__u64 w); + /* * Include this here because some architectures need generic_ffs/fls in * scope @@ -44,31 +49,6 @@ static inline unsigned long hweight_long(unsigned long w) return sizeof(w) == 4 ? hweight32(w) : hweight64(w); } -/* - * Clearly slow versions of the hweightN() functions, their benefit is - * of course compile time evaluation of constant arguments. - */ -#define HWEIGHT8(w) \ - ( BUILD_BUG_ON_ZERO(!__builtin_constant_p(w)) + \ - (!!((w) & (1ULL << 0))) + \ - (!!((w) & (1ULL << 1))) + \ - (!!((w) & (1ULL << 2))) + \ - (!!((w) & (1ULL << 3))) + \ - (!!((w) & (1ULL << 4))) + \ - (!!((w) & (1ULL << 5))) + \ - (!!((w) & (1ULL << 6))) + \ - (!!((w) & (1ULL << 7))) ) - -#define HWEIGHT16(w) (HWEIGHT8(w) + HWEIGHT8((w) >> 8)) -#define HWEIGHT32(w) (HWEIGHT16(w) + HWEIGHT16((w) >> 16)) -#define HWEIGHT64(w) (HWEIGHT32(w) + HWEIGHT32((w) >> 32)) - -/* - * Type invariant version that simply casts things to the - * largest type. - */ -#define HWEIGHT(w) HWEIGHT64((u64)(w)) - /** * rol32 - rotate a 32-bit value left * @word: value to rotate diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h index b8ad1ea9..8f78073 100644 --- a/include/linux/cgroup.h +++ b/include/linux/cgroup.h @@ -530,6 +530,7 @@ static inline struct cgroup_subsys_state *task_subsys_state( { return rcu_dereference_check(task->cgroups->subsys[subsys_id], rcu_read_lock_held() || + lockdep_is_held(&task->alloc_lock) || cgroup_lock_is_held()); } diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h index 4de02b1..9f15150 100644 --- a/include/linux/cpufreq.h +++ b/include/linux/cpufreq.h @@ -278,6 +278,27 @@ struct freq_attr { ssize_t (*store)(struct cpufreq_policy *, const char *, size_t count); }; +#define cpufreq_freq_attr_ro(_name) \ +static struct freq_attr _name = \ +__ATTR(_name, 0444, show_##_name, NULL) + +#define cpufreq_freq_attr_ro_perm(_name, _perm) \ +static struct freq_attr _name = \ +__ATTR(_name, _perm, show_##_name, NULL) + +#define cpufreq_freq_attr_ro_old(_name) \ +static struct freq_attr _name##_old = \ +__ATTR(_name, 0444, show_##_name##_old, NULL) + +#define cpufreq_freq_attr_rw(_name) \ +static struct freq_attr _name = \ +__ATTR(_name, 0644, show_##_name, store_##_name) + +#define cpufreq_freq_attr_rw_old(_name) \ +static struct freq_attr _name##_old = \ +__ATTR(_name, 0644, show_##_name##_old, store_##_name##_old) + + struct global_attr { struct attribute attr; ssize_t (*show)(struct kobject *kobj, @@ -286,6 +307,15 @@ struct global_attr { const char *c, size_t count); }; +#define define_one_global_ro(_name) \ +static struct global_attr _name = \ +__ATTR(_name, 0444, show_##_name, NULL) + +#define define_one_global_rw(_name) \ +static struct global_attr _name = \ +__ATTR(_name, 0644, show_##_name, store_##_name) + + /********************************************************************* * CPUFREQ 2.6. INTERFACE * *********************************************************************/ diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h index a5740fc..a73454a 100644 --- a/include/linux/cpuset.h +++ b/include/linux/cpuset.h @@ -21,8 +21,7 @@ extern int number_of_cpusets; /* How many cpusets are defined in system? */ extern int cpuset_init(void); extern void cpuset_init_smp(void); extern void cpuset_cpus_allowed(struct task_struct *p, struct cpumask *mask); -extern void cpuset_cpus_allowed_locked(struct task_struct *p, - struct cpumask *mask); +extern int cpuset_cpus_allowed_fallback(struct task_struct *p); extern nodemask_t cpuset_mems_allowed(struct task_struct *p); #define cpuset_current_mems_allowed (current->mems_allowed) void cpuset_init_current_mems_allowed(void); @@ -69,9 +68,6 @@ struct seq_file; extern void cpuset_task_status_allowed(struct seq_file *m, struct task_struct *task); -extern void cpuset_lock(void); -extern void cpuset_unlock(void); - extern int cpuset_mem_spread_node(void); static inline int cpuset_do_page_mem_spread(void) @@ -105,10 +101,11 @@ static inline void cpuset_cpus_allowed(struct task_struct *p, { cpumask_copy(mask, cpu_possible_mask); } -static inline void cpuset_cpus_allowed_locked(struct task_struct *p, - struct cpumask *mask) + +static inline int cpuset_cpus_allowed_fallback(struct task_struct *p) { - cpumask_copy(mask, cpu_possible_mask); + cpumask_copy(&p->cpus_allowed, cpu_possible_mask); + return cpumask_any(cpu_active_mask); } static inline nodemask_t cpuset_mems_allowed(struct task_struct *p) @@ -157,9 +154,6 @@ static inline void cpuset_task_status_allowed(struct seq_file *m, { } -static inline void cpuset_lock(void) {} -static inline void cpuset_unlock(void) {} - static inline int cpuset_mem_spread_node(void) { return 0; diff --git a/include/linux/dcache.h b/include/linux/dcache.h index 30b93b2..eebb617 100644 --- a/include/linux/dcache.h +++ b/include/linux/dcache.h @@ -186,6 +186,8 @@ d_iput: no no no yes #define DCACHE_FSNOTIFY_PARENT_WATCHED 0x0080 /* Parent inode is watched by some fsnotify listener */ +#define DCACHE_CANT_MOUNT 0x0100 + extern spinlock_t dcache_lock; extern seqlock_t rename_lock; @@ -358,6 +360,18 @@ static inline int d_unlinked(struct dentry *dentry) return d_unhashed(dentry) && !IS_ROOT(dentry); } +static inline int cant_mount(struct dentry *dentry) +{ + return (dentry->d_flags & DCACHE_CANT_MOUNT); +} + +static inline void dont_mount(struct dentry *dentry) +{ + spin_lock(&dentry->d_lock); + dentry->d_flags |= DCACHE_CANT_MOUNT; + spin_unlock(&dentry->d_lock); +} + static inline struct dentry *dget_parent(struct dentry *dentry) { struct dentry *ret; diff --git a/include/linux/debugobjects.h b/include/linux/debugobjects.h index 8c243aa..597692f 100644 --- a/include/linux/debugobjects.h +++ b/include/linux/debugobjects.h @@ -20,12 +20,14 @@ struct debug_obj_descr; * struct debug_obj - representaion of an tracked object * @node: hlist node to link the object into the tracker list * @state: tracked object state + * @astate: current active state * @object: pointer to the real object * @descr: pointer to an object type specific debug description structure */ struct debug_obj { struct hlist_node node; enum debug_obj_state state; + unsigned int astate; void *object; struct debug_obj_descr *descr; }; @@ -60,6 +62,15 @@ extern void debug_object_deactivate(void *addr, struct debug_obj_descr *descr); extern void debug_object_destroy (void *addr, struct debug_obj_descr *descr); extern void debug_object_free (void *addr, struct debug_obj_descr *descr); +/* + * Active state: + * - Set at 0 upon initialization. + * - Must return to 0 before deactivation. + */ +extern void +debug_object_active_state(void *addr, struct debug_obj_descr *descr, + unsigned int expect, unsigned int next); + extern void debug_objects_early_init(void); extern void debug_objects_mem_init(void); #else diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h index cc12b3c..41e4633 100644 --- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -82,9 +82,13 @@ void clear_ftrace_function(void); extern void ftrace_stub(unsigned long a0, unsigned long a1); #else /* !CONFIG_FUNCTION_TRACER */ -# define register_ftrace_function(ops) do { } while (0) -# define unregister_ftrace_function(ops) do { } while (0) -# define clear_ftrace_function(ops) do { } while (0) +/* + * (un)register_ftrace_function must be a macro since the ops parameter + * must not be evaluated. + */ +#define register_ftrace_function(ops) ({ 0; }) +#define unregister_ftrace_function(ops) ({ 0; }) +static inline void clear_ftrace_function(void) { } static inline void ftrace_kill(void) { } static inline void ftrace_stop(void) { } static inline void ftrace_start(void) { } @@ -237,11 +241,13 @@ extern int skip_trace(unsigned long ip); extern void ftrace_disable_daemon(void); extern void ftrace_enable_daemon(void); #else -# define skip_trace(ip) ({ 0; }) -# define ftrace_force_update() ({ 0; }) -# define ftrace_set_filter(buf, len, reset) do { } while (0) -# define ftrace_disable_daemon() do { } while (0) -# define ftrace_enable_daemon() do { } while (0) +static inline int skip_trace(unsigned long ip) { return 0; } +static inline int ftrace_force_update(void) { return 0; } +static inline void ftrace_set_filter(unsigned char *buf, int len, int reset) +{ +} +static inline void ftrace_disable_daemon(void) { } +static inline void ftrace_enable_daemon(void) { } static inline void ftrace_release_mod(struct module *mod) {} static inline int register_ftrace_command(struct ftrace_func_command *cmd) { @@ -314,16 +320,16 @@ static inline void __ftrace_enabled_restore(int enabled) extern void time_hardirqs_on(unsigned long a0, unsigned long a1); extern void time_hardirqs_off(unsigned long a0, unsigned long a1); #else -# define time_hardirqs_on(a0, a1) do { } while (0) -# define time_hardirqs_off(a0, a1) do { } while (0) + static inline void time_hardirqs_on(unsigned long a0, unsigned long a1) { } + static inline void time_hardirqs_off(unsigned long a0, unsigned long a1) { } #endif #ifdef CONFIG_PREEMPT_TRACER extern void trace_preempt_on(unsigned long a0, unsigned long a1); extern void trace_preempt_off(unsigned long a0, unsigned long a1); #else -# define trace_preempt_on(a0, a1) do { } while (0) -# define trace_preempt_off(a0, a1) do { } while (0) + static inline void trace_preempt_on(unsigned long a0, unsigned long a1) { } + static inline void trace_preempt_off(unsigned long a0, unsigned long a1) { } #endif #ifdef CONFIG_FTRACE_MCOUNT_RECORD @@ -352,6 +358,10 @@ struct ftrace_graph_ret { int depth; }; +/* Type of the callback handlers for tracing function graph*/ +typedef void (*trace_func_graph_ret_t)(struct ftrace_graph_ret *); /* return */ +typedef int (*trace_func_graph_ent_t)(struct ftrace_graph_ent *); /* entry */ + #ifdef CONFIG_FUNCTION_GRAPH_TRACER /* for init task */ @@ -400,10 +410,6 @@ extern char __irqentry_text_end[]; #define FTRACE_RETFUNC_DEPTH 50 #define FTRACE_RETSTACK_ALLOC_SIZE 32 -/* Type of the callback handlers for tracing function graph*/ -typedef void (*trace_func_graph_ret_t)(struct ftrace_graph_ret *); /* return */ -typedef int (*trace_func_graph_ent_t)(struct ftrace_graph_ent *); /* entry */ - extern int register_ftrace_graph(trace_func_graph_ret_t retfunc, trace_func_graph_ent_t entryfunc); @@ -441,6 +447,13 @@ static inline void unpause_graph_tracing(void) static inline void ftrace_graph_init_task(struct task_struct *t) { } static inline void ftrace_graph_exit_task(struct task_struct *t) { } +static inline int register_ftrace_graph(trace_func_graph_ret_t retfunc, + trace_func_graph_ent_t entryfunc) +{ + return -1; +} +static inline void unregister_ftrace_graph(void) { } + static inline int task_curr_ret_stack(struct task_struct *tsk) { return -1; @@ -492,7 +505,9 @@ static inline int test_tsk_trace_graph(struct task_struct *tsk) return tsk->trace & TSK_TRACE_FL_GRAPH; } -extern int ftrace_dump_on_oops; +enum ftrace_dump_mode; + +extern enum ftrace_dump_mode ftrace_dump_on_oops; #ifdef CONFIG_PREEMPT #define INIT_TRACE_RECURSION .trace_recursion = 0, diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h index c809100..a9775dd 100644 --- a/include/linux/ftrace_event.h +++ b/include/linux/ftrace_event.h @@ -58,6 +58,7 @@ struct trace_iterator { /* The below is zeroed out in pipe_read */ struct trace_seq seq; struct trace_entry *ent; + unsigned long lost_events; int leftover; int cpu; u64 ts; diff --git a/include/linux/if_link.h b/include/linux/if_link.h index c9bf92c..d94963b 100644 --- a/include/linux/if_link.h +++ b/include/linux/if_link.h @@ -79,10 +79,7 @@ enum { IFLA_NET_NS_PID, IFLA_IFALIAS, IFLA_NUM_VF, /* Number of VFs if device is SR-IOV PF */ - IFLA_VF_MAC, /* Hardware queue specific attributes */ - IFLA_VF_VLAN, - IFLA_VF_TX_RATE, /* TX Bandwidth Allocation */ - IFLA_VFINFO, + IFLA_VFINFO_LIST, __IFLA_MAX }; @@ -203,6 +200,24 @@ enum macvlan_mode { /* SR-IOV virtual function managment section */ +enum { + IFLA_VF_INFO_UNSPEC, + IFLA_VF_INFO, + __IFLA_VF_INFO_MAX, +}; + +#define IFLA_VF_INFO_MAX (__IFLA_VF_INFO_MAX - 1) + +enum { + IFLA_VF_UNSPEC, + IFLA_VF_MAC, /* Hardware queue specific attributes */ + IFLA_VF_VLAN, + IFLA_VF_TX_RATE, /* TX Bandwidth Allocation */ + __IFLA_VF_MAX, +}; + +#define IFLA_VF_MAX (__IFLA_VF_MAX - 1) + struct ifla_vf_mac { __u32 vf; __u8 mac[32]; /* MAX_ADDR_LEN */ diff --git a/include/linux/init_task.h b/include/linux/init_task.h index b1ed1cd..7996fc2 100644 --- a/include/linux/init_task.h +++ b/include/linux/init_task.h @@ -49,7 +49,6 @@ extern struct group_info init_groups; { .first = &init_task.pids[PIDTYPE_PGID].node }, \ { .first = &init_task.pids[PIDTYPE_SID].node }, \ }, \ - .rcu = RCU_HEAD_INIT, \ .level = 0, \ .numbers = { { \ .nr = 0, \ diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 3af4ffd..be22ad8 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -37,9 +37,9 @@ struct iommu_ops { int (*attach_dev)(struct iommu_domain *domain, struct device *dev); void (*detach_dev)(struct iommu_domain *domain, struct device *dev); int (*map)(struct iommu_domain *domain, unsigned long iova, - phys_addr_t paddr, size_t size, int prot); - void (*unmap)(struct iommu_domain *domain, unsigned long iova, - size_t size); + phys_addr_t paddr, int gfp_order, int prot); + int (*unmap)(struct iommu_domain *domain, unsigned long iova, + int gfp_order); phys_addr_t (*iova_to_phys)(struct iommu_domain *domain, unsigned long iova); int (*domain_has_cap)(struct iommu_domain *domain, @@ -56,10 +56,10 @@ extern int iommu_attach_device(struct iommu_domain *domain, struct device *dev); extern void iommu_detach_device(struct iommu_domain *domain, struct device *dev); -extern int iommu_map_range(struct iommu_domain *domain, unsigned long iova, - phys_addr_t paddr, size_t size, int prot); -extern void iommu_unmap_range(struct iommu_domain *domain, unsigned long iova, - size_t size); +extern int iommu_map(struct iommu_domain *domain, unsigned long iova, + phys_addr_t paddr, int gfp_order, int prot); +extern int iommu_unmap(struct iommu_domain *domain, unsigned long iova, + int gfp_order); extern phys_addr_t iommu_iova_to_phys(struct iommu_domain *domain, unsigned long iova); extern int iommu_domain_has_cap(struct iommu_domain *domain, @@ -96,16 +96,16 @@ static inline void iommu_detach_device(struct iommu_domain *domain, { } -static inline int iommu_map_range(struct iommu_domain *domain, - unsigned long iova, phys_addr_t paddr, - size_t size, int prot) +static inline int iommu_map(struct iommu_domain *domain, unsigned long iova, + phys_addr_t paddr, int gfp_order, int prot) { return -ENODEV; } -static inline void iommu_unmap_range(struct iommu_domain *domain, - unsigned long iova, size_t size) +static inline int iommu_unmap(struct iommu_domain *domain, unsigned long iova, + int gfp_order) { + return -ENODEV; } static inline phys_addr_t iommu_iova_to_phys(struct iommu_domain *domain, diff --git a/include/linux/kernel.h b/include/linux/kernel.h index 9365227..9fb1c12 100644 --- a/include/linux/kernel.h +++ b/include/linux/kernel.h @@ -490,6 +490,13 @@ static inline void tracing_off(void) { } static inline void tracing_off_permanent(void) { } static inline int tracing_is_on(void) { return 0; } #endif + +enum ftrace_dump_mode { + DUMP_NONE, + DUMP_ALL, + DUMP_ORIG, +}; + #ifdef CONFIG_TRACING extern void tracing_start(void); extern void tracing_stop(void); @@ -571,7 +578,7 @@ __ftrace_vbprintk(unsigned long ip, const char *fmt, va_list ap); extern int __ftrace_vprintk(unsigned long ip, const char *fmt, va_list ap); -extern void ftrace_dump(void); +extern void ftrace_dump(enum ftrace_dump_mode oops_dump_mode); #else static inline void ftrace_special(unsigned long arg1, unsigned long arg2, unsigned long arg3) { } @@ -592,7 +599,7 @@ ftrace_vprintk(const char *fmt, va_list ap) { return 0; } -static inline void ftrace_dump(void) { } +static inline void ftrace_dump(enum ftrace_dump_mode oops_dump_mode) { } #endif /* CONFIG_TRACING */ /* diff --git a/include/linux/mod_devicetable.h b/include/linux/mod_devicetable.h index f58e9d83..56fde43 100644 --- a/include/linux/mod_devicetable.h +++ b/include/linux/mod_devicetable.h @@ -474,4 +474,13 @@ struct platform_device_id { __attribute__((aligned(sizeof(kernel_ulong_t)))); }; +struct zorro_device_id { + __u32 id; /* Device ID or ZORRO_WILDCARD */ + kernel_ulong_t driver_data; /* Data private to the driver */ +}; + +#define ZORRO_WILDCARD (0xffffffff) /* not official */ + +#define ZORRO_DEVICE_MODALIAS_FMT "zorro:i%08X" + #endif /* LINUX_MOD_DEVICETABLE_H */ diff --git a/include/linux/module.h b/include/linux/module.h index 515d53a..6914fca 100644 --- a/include/linux/module.h +++ b/include/linux/module.h @@ -465,8 +465,7 @@ static inline void __module_get(struct module *module) if (module) { preempt_disable(); __this_cpu_inc(module->refptr->incs); - trace_module_get(module, _THIS_IP_, - __this_cpu_read(module->refptr->incs)); + trace_module_get(module, _THIS_IP_); preempt_enable(); } } @@ -480,8 +479,7 @@ static inline int try_module_get(struct module *module) if (likely(module_is_live(module))) { __this_cpu_inc(module->refptr->incs); - trace_module_get(module, _THIS_IP_, - __this_cpu_read(module->refptr->incs)); + trace_module_get(module, _THIS_IP_); } else ret = 0; diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h index 212da17..5417944 100644 --- a/include/linux/platform_device.h +++ b/include/linux/platform_device.h @@ -44,12 +44,14 @@ extern int platform_get_irq_byname(struct platform_device *, const char *); extern int platform_add_devices(struct platform_device **, int); extern struct platform_device *platform_device_register_simple(const char *, int id, - struct resource *, unsigned int); + const struct resource *, unsigned int); extern struct platform_device *platform_device_register_data(struct device *, const char *, int, const void *, size_t); extern struct platform_device *platform_device_alloc(const char *name, int id); -extern int platform_device_add_resources(struct platform_device *pdev, struct resource *res, unsigned int num); +extern int platform_device_add_resources(struct platform_device *pdev, + const struct resource *res, + unsigned int num); extern int platform_device_add_data(struct platform_device *pdev, const void *data, size_t size); extern int platform_device_add(struct platform_device *pdev); extern void platform_device_del(struct platform_device *pdev); diff --git a/include/linux/rbtree.h b/include/linux/rbtree.h index 5210a5c..fe1872e 100644 --- a/include/linux/rbtree.h +++ b/include/linux/rbtree.h @@ -110,6 +110,7 @@ struct rb_node struct rb_root { struct rb_node *rb_node; + void (*augment_cb)(struct rb_node *node); }; @@ -129,7 +130,9 @@ static inline void rb_set_color(struct rb_node *rb, int color) rb->rb_parent_color = (rb->rb_parent_color & ~1) | color; } -#define RB_ROOT (struct rb_root) { NULL, } +#define RB_ROOT (struct rb_root) { NULL, NULL, } +#define RB_AUGMENT_ROOT(x) (struct rb_root) { NULL, x} + #define rb_entry(ptr, type, member) container_of(ptr, type, member) #define RB_EMPTY_ROOT(root) ((root)->rb_node == NULL) diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h index 07db2fe..b653b4a 100644 --- a/include/linux/rcupdate.h +++ b/include/linux/rcupdate.h @@ -56,8 +56,6 @@ struct rcu_head { }; /* Exported common interfaces */ -extern void synchronize_rcu_bh(void); -extern void synchronize_sched(void); extern void rcu_barrier(void); extern void rcu_barrier_bh(void); extern void rcu_barrier_sched(void); @@ -66,8 +64,6 @@ extern int sched_expedited_torture_stats(char *page); /* Internal to kernel */ extern void rcu_init(void); -extern int rcu_scheduler_active; -extern void rcu_scheduler_starting(void); #if defined(CONFIG_TREE_RCU) || defined(CONFIG_TREE_PREEMPT_RCU) #include <linux/rcutree.h> @@ -83,6 +79,14 @@ extern void rcu_scheduler_starting(void); (ptr)->next = NULL; (ptr)->func = NULL; \ } while (0) +static inline void init_rcu_head_on_stack(struct rcu_head *head) +{ +} + +static inline void destroy_rcu_head_on_stack(struct rcu_head *head) +{ +} + #ifdef CONFIG_DEBUG_LOCK_ALLOC extern struct lockdep_map rcu_lock_map; @@ -106,12 +110,13 @@ extern int debug_lockdep_rcu_enabled(void); /** * rcu_read_lock_held - might we be in RCU read-side critical section? * - * If CONFIG_PROVE_LOCKING is selected and enabled, returns nonzero iff in - * an RCU read-side critical section. In absence of CONFIG_PROVE_LOCKING, + * If CONFIG_DEBUG_LOCK_ALLOC is selected, returns nonzero iff in an RCU + * read-side critical section. In absence of CONFIG_DEBUG_LOCK_ALLOC, * this assumes we are in an RCU read-side critical section unless it can * prove otherwise. * - * Check rcu_scheduler_active to prevent false positives during boot. + * Check debug_lockdep_rcu_enabled() to prevent false positives during boot + * and while lockdep is disabled. */ static inline int rcu_read_lock_held(void) { @@ -129,13 +134,15 @@ extern int rcu_read_lock_bh_held(void); /** * rcu_read_lock_sched_held - might we be in RCU-sched read-side critical section? * - * If CONFIG_PROVE_LOCKING is selected and enabled, returns nonzero iff in an - * RCU-sched read-side critical section. In absence of CONFIG_PROVE_LOCKING, - * this assumes we are in an RCU-sched read-side critical section unless it - * can prove otherwise. Note that disabling of preemption (including - * disabling irqs) counts as an RCU-sched read-side critical section. + * If CONFIG_DEBUG_LOCK_ALLOC is selected, returns nonzero iff in an + * RCU-sched read-side critical section. In absence of + * CONFIG_DEBUG_LOCK_ALLOC, this assumes we are in an RCU-sched read-side + * critical section unless it can prove otherwise. Note that disabling + * of preemption (including disabling irqs) counts as an RCU-sched + * read-side critical section. * - * Check rcu_scheduler_active to prevent false positives during boot. + * Check debug_lockdep_rcu_enabled() to prevent false positives during boot + * and while lockdep is disabled. */ #ifdef CONFIG_PREEMPT static inline int rcu_read_lock_sched_held(void) @@ -177,7 +184,7 @@ static inline int rcu_read_lock_bh_held(void) #ifdef CONFIG_PREEMPT static inline int rcu_read_lock_sched_held(void) { - return !rcu_scheduler_active || preempt_count() != 0 || irqs_disabled(); + return preempt_count() != 0 || irqs_disabled(); } #else /* #ifdef CONFIG_PREEMPT */ static inline int rcu_read_lock_sched_held(void) @@ -190,6 +197,17 @@ static inline int rcu_read_lock_sched_held(void) #ifdef CONFIG_PROVE_RCU +extern int rcu_my_thread_group_empty(void); + +#define __do_rcu_dereference_check(c) \ + do { \ + static bool __warned; \ + if (debug_lockdep_rcu_enabled() && !__warned && !(c)) { \ + __warned = true; \ + lockdep_rcu_dereference(__FILE__, __LINE__); \ + } \ + } while (0) + /** * rcu_dereference_check - rcu_dereference with debug checking * @p: The pointer to read, prior to dereferencing @@ -219,8 +237,7 @@ static inline int rcu_read_lock_sched_held(void) */ #define rcu_dereference_check(p, c) \ ({ \ - if (debug_lockdep_rcu_enabled() && !(c)) \ - lockdep_rcu_dereference(__FILE__, __LINE__); \ + __do_rcu_dereference_check(c); \ rcu_dereference_raw(p); \ }) @@ -237,8 +254,7 @@ static inline int rcu_read_lock_sched_held(void) */ #define rcu_dereference_protected(p, c) \ ({ \ - if (debug_lockdep_rcu_enabled() && !(c)) \ - lockdep_rcu_dereference(__FILE__, __LINE__); \ + __do_rcu_dereference_check(c); \ (p); \ }) diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h index a519587..e2e8931 100644 --- a/include/linux/rcutiny.h +++ b/include/linux/rcutiny.h @@ -29,6 +29,10 @@ void rcu_sched_qs(int cpu); void rcu_bh_qs(int cpu); +static inline void rcu_note_context_switch(int cpu) +{ + rcu_sched_qs(cpu); +} #define __rcu_read_lock() preempt_disable() #define __rcu_read_unlock() preempt_enable() @@ -60,8 +64,6 @@ static inline long rcu_batches_completed_bh(void) return 0; } -extern int rcu_expedited_torture_stats(char *page); - static inline void rcu_force_quiescent_state(void) { } @@ -74,7 +76,17 @@ static inline void rcu_sched_force_quiescent_state(void) { } -#define synchronize_rcu synchronize_sched +extern void synchronize_sched(void); + +static inline void synchronize_rcu(void) +{ + synchronize_sched(); +} + +static inline void synchronize_rcu_bh(void) +{ + synchronize_sched(); +} static inline void synchronize_rcu_expedited(void) { @@ -114,4 +126,17 @@ static inline int rcu_preempt_depth(void) return 0; } +#ifdef CONFIG_DEBUG_LOCK_ALLOC + +extern int rcu_scheduler_active __read_mostly; +extern void rcu_scheduler_starting(void); + +#else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */ + +static inline void rcu_scheduler_starting(void) +{ +} + +#endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */ + #endif /* __LINUX_RCUTINY_H */ diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h index 42cc3a0..c0ed1c0 100644 --- a/include/linux/rcutree.h +++ b/include/linux/rcutree.h @@ -34,8 +34,8 @@ struct notifier_block; extern void rcu_sched_qs(int cpu); extern void rcu_bh_qs(int cpu); +extern void rcu_note_context_switch(int cpu); extern int rcu_needs_cpu(int cpu); -extern int rcu_expedited_torture_stats(char *page); #ifdef CONFIG_TREE_PREEMPT_RCU @@ -86,6 +86,8 @@ static inline void __rcu_read_unlock_bh(void) extern void call_rcu_sched(struct rcu_head *head, void (*func)(struct rcu_head *rcu)); +extern void synchronize_rcu_bh(void); +extern void synchronize_sched(void); extern void synchronize_rcu_expedited(void); static inline void synchronize_rcu_bh_expedited(void) @@ -120,4 +122,7 @@ static inline int rcu_blocking_is_gp(void) return num_online_cpus() == 1; } +extern void rcu_scheduler_starting(void); +extern int rcu_scheduler_active __read_mostly; + #endif /* __LINUX_RCUTREE_H */ diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h index 5fcc31e..25b4f68 100644 --- a/include/linux/ring_buffer.h +++ b/include/linux/ring_buffer.h @@ -120,12 +120,16 @@ int ring_buffer_write(struct ring_buffer *buffer, unsigned long length, void *data); struct ring_buffer_event * -ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts); +ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts, + unsigned long *lost_events); struct ring_buffer_event * -ring_buffer_consume(struct ring_buffer *buffer, int cpu, u64 *ts); +ring_buffer_consume(struct ring_buffer *buffer, int cpu, u64 *ts, + unsigned long *lost_events); struct ring_buffer_iter * -ring_buffer_read_start(struct ring_buffer *buffer, int cpu); +ring_buffer_read_prepare(struct ring_buffer *buffer, int cpu); +void ring_buffer_read_prepare_sync(void); +void ring_buffer_read_start(struct ring_buffer_iter *iter); void ring_buffer_read_finish(struct ring_buffer_iter *iter); struct ring_buffer_event * diff --git a/include/linux/sched.h b/include/linux/sched.h index e0447c6..b55e988 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -274,11 +274,17 @@ extern cpumask_var_t nohz_cpu_mask; #if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ) extern int select_nohz_load_balancer(int cpu); extern int get_nohz_load_balancer(void); +extern int nohz_ratelimit(int cpu); #else static inline int select_nohz_load_balancer(int cpu) { return 0; } + +static inline int nohz_ratelimit(int cpu) +{ + return 0; +} #endif /* @@ -953,6 +959,7 @@ struct sched_domain { char *name; #endif + unsigned int span_weight; /* * Span of all CPUs in this domain. * @@ -1025,12 +1032,17 @@ struct sched_domain; #define WF_SYNC 0x01 /* waker goes to sleep after wakup */ #define WF_FORK 0x02 /* child wakeup after fork */ +#define ENQUEUE_WAKEUP 1 +#define ENQUEUE_WAKING 2 +#define ENQUEUE_HEAD 4 + +#define DEQUEUE_SLEEP 1 + struct sched_class { const struct sched_class *next; - void (*enqueue_task) (struct rq *rq, struct task_struct *p, int wakeup, - bool head); - void (*dequeue_task) (struct rq *rq, struct task_struct *p, int sleep); + void (*enqueue_task) (struct rq *rq, struct task_struct *p, int flags); + void (*dequeue_task) (struct rq *rq, struct task_struct *p, int flags); void (*yield_task) (struct rq *rq); void (*check_preempt_curr) (struct rq *rq, struct task_struct *p, int flags); @@ -1039,7 +1051,8 @@ struct sched_class { void (*put_prev_task) (struct rq *rq, struct task_struct *p); #ifdef CONFIG_SMP - int (*select_task_rq)(struct task_struct *p, int sd_flag, int flags); + int (*select_task_rq)(struct rq *rq, struct task_struct *p, + int sd_flag, int flags); void (*pre_schedule) (struct rq *this_rq, struct task_struct *task); void (*post_schedule) (struct rq *this_rq); @@ -1076,36 +1089,8 @@ struct load_weight { unsigned long weight, inv_weight; }; -/* - * CFS stats for a schedulable entity (task, task-group etc) - * - * Current field usage histogram: - * - * 4 se->block_start - * 4 se->run_node - * 4 se->sleep_start - * 6 se->load.weight - */ -struct sched_entity { - struct load_weight load; /* for load-balancing */ - struct rb_node run_node; - struct list_head group_node; - unsigned int on_rq; - - u64 exec_start; - u64 sum_exec_runtime; - u64 vruntime; - u64 prev_sum_exec_runtime; - - u64 last_wakeup; - u64 avg_overlap; - - u64 nr_migrations; - - u64 start_runtime; - u64 avg_wakeup; - #ifdef CONFIG_SCHEDSTATS +struct sched_statistics { u64 wait_start; u64 wait_max; u64 wait_count; @@ -1137,6 +1122,24 @@ struct sched_entity { u64 nr_wakeups_affine_attempts; u64 nr_wakeups_passive; u64 nr_wakeups_idle; +}; +#endif + +struct sched_entity { + struct load_weight load; /* for load-balancing */ + struct rb_node run_node; + struct list_head group_node; + unsigned int on_rq; + + u64 exec_start; + u64 sum_exec_runtime; + u64 vruntime; + u64 prev_sum_exec_runtime; + + u64 nr_migrations; + +#ifdef CONFIG_SCHEDSTATS + struct sched_statistics statistics; #endif #ifdef CONFIG_FAIR_GROUP_SCHED @@ -1490,7 +1493,6 @@ struct task_struct { /* bitmask of trace recursion */ unsigned long trace_recursion; #endif /* CONFIG_TRACING */ - unsigned long stack_start; #ifdef CONFIG_CGROUP_MEM_RES_CTLR /* memcg uses this to do batch job */ struct memcg_batch_info { int do_batch; /* incremented when batch uncharge started */ @@ -1840,6 +1842,7 @@ extern void sched_clock_idle_sleep_event(void); extern void sched_clock_idle_wakeup_event(u64 delta_ns); #ifdef CONFIG_HOTPLUG_CPU +extern void move_task_off_dead_cpu(int dead_cpu, struct task_struct *p); extern void idle_task_exit(void); #else static inline void idle_task_exit(void) {} diff --git a/include/linux/srcu.h b/include/linux/srcu.h index 4d5ecb2..4d5d2f5 100644 --- a/include/linux/srcu.h +++ b/include/linux/srcu.h @@ -27,6 +27,8 @@ #ifndef _LINUX_SRCU_H #define _LINUX_SRCU_H +#include <linux/mutex.h> + struct srcu_struct_array { int c[2]; }; @@ -84,8 +86,8 @@ long srcu_batches_completed(struct srcu_struct *sp); /** * srcu_read_lock_held - might we be in SRCU read-side critical section? * - * If CONFIG_PROVE_LOCKING is selected and enabled, returns nonzero iff in - * an SRCU read-side critical section. In absence of CONFIG_PROVE_LOCKING, + * If CONFIG_DEBUG_LOCK_ALLOC is selected, returns nonzero iff in an SRCU + * read-side critical section. In absence of CONFIG_DEBUG_LOCK_ALLOC, * this assumes we are in an SRCU read-side critical section unless it can * prove otherwise. */ diff --git a/include/linux/stop_machine.h b/include/linux/stop_machine.h index baba3a2..6b524a0 100644 --- a/include/linux/stop_machine.h +++ b/include/linux/stop_machine.h @@ -1,13 +1,101 @@ #ifndef _LINUX_STOP_MACHINE #define _LINUX_STOP_MACHINE -/* "Bogolock": stop the entire machine, disable interrupts. This is a - very heavy lock, which is equivalent to grabbing every spinlock - (and more). So the "read" side to such a lock is anything which - disables preeempt. */ + #include <linux/cpu.h> #include <linux/cpumask.h> +#include <linux/list.h> #include <asm/system.h> +/* + * stop_cpu[s]() is simplistic per-cpu maximum priority cpu + * monopolization mechanism. The caller can specify a non-sleeping + * function to be executed on a single or multiple cpus preempting all + * other processes and monopolizing those cpus until it finishes. + * + * Resources for this mechanism are preallocated when a cpu is brought + * up and requests are guaranteed to be served as long as the target + * cpus are online. + */ +typedef int (*cpu_stop_fn_t)(void *arg); + +#ifdef CONFIG_SMP + +struct cpu_stop_work { + struct list_head list; /* cpu_stopper->works */ + cpu_stop_fn_t fn; + void *arg; + struct cpu_stop_done *done; +}; + +int stop_one_cpu(unsigned int cpu, cpu_stop_fn_t fn, void *arg); +void stop_one_cpu_nowait(unsigned int cpu, cpu_stop_fn_t fn, void *arg, + struct cpu_stop_work *work_buf); +int stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn, void *arg); +int try_stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn, void *arg); + +#else /* CONFIG_SMP */ + +#include <linux/workqueue.h> + +struct cpu_stop_work { + struct work_struct work; + cpu_stop_fn_t fn; + void *arg; +}; + +static inline int stop_one_cpu(unsigned int cpu, cpu_stop_fn_t fn, void *arg) +{ + int ret = -ENOENT; + preempt_disable(); + if (cpu == smp_processor_id()) + ret = fn(arg); + preempt_enable(); + return ret; +} + +static void stop_one_cpu_nowait_workfn(struct work_struct *work) +{ + struct cpu_stop_work *stwork = + container_of(work, struct cpu_stop_work, work); + preempt_disable(); + stwork->fn(stwork->arg); + preempt_enable(); +} + +static inline void stop_one_cpu_nowait(unsigned int cpu, + cpu_stop_fn_t fn, void *arg, + struct cpu_stop_work *work_buf) +{ + if (cpu == smp_processor_id()) { + INIT_WORK(&work_buf->work, stop_one_cpu_nowait_workfn); + work_buf->fn = fn; + work_buf->arg = arg; + schedule_work(&work_buf->work); + } +} + +static inline int stop_cpus(const struct cpumask *cpumask, + cpu_stop_fn_t fn, void *arg) +{ + if (cpumask_test_cpu(raw_smp_processor_id(), cpumask)) + return stop_one_cpu(raw_smp_processor_id(), fn, arg); + return -ENOENT; +} + +static inline int try_stop_cpus(const struct cpumask *cpumask, + cpu_stop_fn_t fn, void *arg) +{ + return stop_cpus(cpumask, fn, arg); +} + +#endif /* CONFIG_SMP */ + +/* + * stop_machine "Bogolock": stop the entire machine, disable + * interrupts. This is a very heavy lock, which is equivalent to + * grabbing every spinlock (and more). So the "read" side to such a + * lock is anything which disables preeempt. + */ #if defined(CONFIG_STOP_MACHINE) && defined(CONFIG_SMP) /** @@ -36,24 +124,7 @@ int stop_machine(int (*fn)(void *), void *data, const struct cpumask *cpus); */ int __stop_machine(int (*fn)(void *), void *data, const struct cpumask *cpus); -/** - * stop_machine_create: create all stop_machine threads - * - * Description: This causes all stop_machine threads to be created before - * stop_machine actually gets called. This can be used by subsystems that - * need a non failing stop_machine infrastructure. - */ -int stop_machine_create(void); - -/** - * stop_machine_destroy: destroy all stop_machine threads - * - * Description: This causes all stop_machine threads which were created with - * stop_machine_create to be destroyed again. - */ -void stop_machine_destroy(void); - -#else +#else /* CONFIG_STOP_MACHINE && CONFIG_SMP */ static inline int stop_machine(int (*fn)(void *), void *data, const struct cpumask *cpus) @@ -65,8 +136,5 @@ static inline int stop_machine(int (*fn)(void *), void *data, return ret; } -static inline int stop_machine_create(void) { return 0; } -static inline void stop_machine_destroy(void) { } - -#endif /* CONFIG_SMP */ -#endif /* _LINUX_STOP_MACHINE */ +#endif /* CONFIG_STOP_MACHINE && CONFIG_SMP */ +#endif /* _LINUX_STOP_MACHINE */ diff --git a/include/linux/tick.h b/include/linux/tick.h index d2ae79e..b232ccc 100644 --- a/include/linux/tick.h +++ b/include/linux/tick.h @@ -42,6 +42,7 @@ enum tick_nohz_mode { * @idle_waketime: Time when the idle was interrupted * @idle_exittime: Time when the idle state was left * @idle_sleeptime: Sum of the time slept in idle with sched tick stopped + * @iowait_sleeptime: Sum of the time slept in idle with sched tick stopped, with IO outstanding * @sleep_length: Duration of the current idle sleep * @do_timer_lst: CPU was the last one doing do_timer before going idle */ @@ -60,7 +61,7 @@ struct tick_sched { ktime_t idle_waketime; ktime_t idle_exittime; ktime_t idle_sleeptime; - ktime_t idle_lastupdate; + ktime_t iowait_sleeptime; ktime_t sleep_length; unsigned long last_jiffies; unsigned long next_jiffies; @@ -124,6 +125,7 @@ extern void tick_nohz_stop_sched_tick(int inidle); extern void tick_nohz_restart_sched_tick(void); extern ktime_t tick_nohz_get_sleep_length(void); extern u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time); +extern u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time); # else static inline void tick_nohz_stop_sched_tick(int inidle) { } static inline void tick_nohz_restart_sched_tick(void) { } @@ -134,6 +136,7 @@ static inline ktime_t tick_nohz_get_sleep_length(void) return len; } static inline u64 get_cpu_idle_time_us(int cpu, u64 *unused) { return -1; } +static inline u64 get_cpu_iowait_time_us(int cpu, u64 *unused) { return -1; } # endif /* !NO_HZ */ #endif diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h index 78b4bd3..1d85f9a 100644 --- a/include/linux/tracepoint.h +++ b/include/linux/tracepoint.h @@ -33,6 +33,65 @@ struct tracepoint { * Keep in sync with vmlinux.lds.h. */ +/* + * Connect a probe to a tracepoint. + * Internal API, should not be used directly. + */ +extern int tracepoint_probe_register(const char *name, void *probe); + +/* + * Disconnect a probe from a tracepoint. + * Internal API, should not be used directly. + */ +extern int tracepoint_probe_unregister(const char *name, void *probe); + +extern int tracepoint_probe_register_noupdate(const char *name, void *probe); +extern int tracepoint_probe_unregister_noupdate(const char *name, void *probe); +extern void tracepoint_probe_update_all(void); + +struct tracepoint_iter { + struct module *module; + struct tracepoint *tracepoint; +}; + +extern void tracepoint_iter_start(struct tracepoint_iter *iter); +extern void tracepoint_iter_next(struct tracepoint_iter *iter); +extern void tracepoint_iter_stop(struct tracepoint_iter *iter); +extern void tracepoint_iter_reset(struct tracepoint_iter *iter); +extern int tracepoint_get_iter_range(struct tracepoint **tracepoint, + struct tracepoint *begin, struct tracepoint *end); + +/* + * tracepoint_synchronize_unregister must be called between the last tracepoint + * probe unregistration and the end of module exit to make sure there is no + * caller executing a probe when it is freed. + */ +static inline void tracepoint_synchronize_unregister(void) +{ + synchronize_sched(); +} + +#define PARAMS(args...) args + +#ifdef CONFIG_TRACEPOINTS +extern void tracepoint_update_probe_range(struct tracepoint *begin, + struct tracepoint *end); +#else +static inline void tracepoint_update_probe_range(struct tracepoint *begin, + struct tracepoint *end) +{ } +#endif /* CONFIG_TRACEPOINTS */ + +#endif /* _LINUX_TRACEPOINT_H */ + +/* + * Note: we keep the TRACE_EVENT and DECLARE_TRACE outside the include + * file ifdef protection. + * This is due to the way trace events work. If a file includes two + * trace event headers under one "CREATE_TRACE_POINTS" the first include + * will override the TRACE_EVENT and break the second include. + */ + #ifndef DECLARE_TRACE #define TP_PROTO(args...) args @@ -96,9 +155,6 @@ struct tracepoint { #define EXPORT_TRACEPOINT_SYMBOL(name) \ EXPORT_SYMBOL(__tracepoint_##name) -extern void tracepoint_update_probe_range(struct tracepoint *begin, - struct tracepoint *end); - #else /* !CONFIG_TRACEPOINTS */ #define DECLARE_TRACE(name, proto, args) \ static inline void _do_trace_##name(struct tracepoint *tp, proto) \ @@ -119,61 +175,9 @@ extern void tracepoint_update_probe_range(struct tracepoint *begin, #define EXPORT_TRACEPOINT_SYMBOL_GPL(name) #define EXPORT_TRACEPOINT_SYMBOL(name) -static inline void tracepoint_update_probe_range(struct tracepoint *begin, - struct tracepoint *end) -{ } #endif /* CONFIG_TRACEPOINTS */ #endif /* DECLARE_TRACE */ -/* - * Connect a probe to a tracepoint. - * Internal API, should not be used directly. - */ -extern int tracepoint_probe_register(const char *name, void *probe); - -/* - * Disconnect a probe from a tracepoint. - * Internal API, should not be used directly. - */ -extern int tracepoint_probe_unregister(const char *name, void *probe); - -extern int tracepoint_probe_register_noupdate(const char *name, void *probe); -extern int tracepoint_probe_unregister_noupdate(const char *name, void *probe); -extern void tracepoint_probe_update_all(void); - -struct tracepoint_iter { - struct module *module; - struct tracepoint *tracepoint; -}; - -extern void tracepoint_iter_start(struct tracepoint_iter *iter); -extern void tracepoint_iter_next(struct tracepoint_iter *iter); -extern void tracepoint_iter_stop(struct tracepoint_iter *iter); -extern void tracepoint_iter_reset(struct tracepoint_iter *iter); -extern int tracepoint_get_iter_range(struct tracepoint **tracepoint, - struct tracepoint *begin, struct tracepoint *end); - -/* - * tracepoint_synchronize_unregister must be called between the last tracepoint - * probe unregistration and the end of module exit to make sure there is no - * caller executing a probe when it is freed. - */ -static inline void tracepoint_synchronize_unregister(void) -{ - synchronize_sched(); -} - -#define PARAMS(args...) args - -#endif /* _LINUX_TRACEPOINT_H */ - -/* - * Note: we keep the TRACE_EVENT outside the include file ifdef protection. - * This is due to the way trace events work. If a file includes two - * trace event headers under one "CREATE_TRACE_POINTS" the first include - * will override the TRACE_EVENT and break the second include. - */ - #ifndef TRACE_EVENT /* * For use with the TRACE_EVENT macro: diff --git a/include/linux/types.h b/include/linux/types.h index c42724f..23d237a 100644 --- a/include/linux/types.h +++ b/include/linux/types.h @@ -188,12 +188,12 @@ typedef u32 phys_addr_t; typedef phys_addr_t resource_size_t; typedef struct { - volatile int counter; + int counter; } atomic_t; #ifdef CONFIG_64BIT typedef struct { - volatile long counter; + long counter; } atomic64_t; #endif diff --git a/include/linux/wait.h b/include/linux/wait.h index a48e16b..76d96d0 100644 --- a/include/linux/wait.h +++ b/include/linux/wait.h @@ -127,12 +127,26 @@ static inline void __add_wait_queue(wait_queue_head_t *head, wait_queue_t *new) /* * Used for wake-one threads: */ +static inline void __add_wait_queue_exclusive(wait_queue_head_t *q, + wait_queue_t *wait) +{ + wait->flags |= WQ_FLAG_EXCLUSIVE; + __add_wait_queue(q, wait); +} + static inline void __add_wait_queue_tail(wait_queue_head_t *head, - wait_queue_t *new) + wait_queue_t *new) { list_add_tail(&new->task_list, &head->task_list); } +static inline void __add_wait_queue_tail_exclusive(wait_queue_head_t *q, + wait_queue_t *wait) +{ + wait->flags |= WQ_FLAG_EXCLUSIVE; + __add_wait_queue_tail(q, wait); +} + static inline void __remove_wait_queue(wait_queue_head_t *head, wait_queue_t *old) { @@ -404,25 +418,6 @@ do { \ }) /* - * Must be called with the spinlock in the wait_queue_head_t held. - */ -static inline void add_wait_queue_exclusive_locked(wait_queue_head_t *q, - wait_queue_t * wait) -{ - wait->flags |= WQ_FLAG_EXCLUSIVE; - __add_wait_queue_tail(q, wait); -} - -/* - * Must be called with the spinlock in the wait_queue_head_t held. - */ -static inline void remove_wait_queue_locked(wait_queue_head_t *q, - wait_queue_t * wait) -{ - __remove_wait_queue(q, wait); -} - -/* * These are the old interfaces to sleep waiting for an event. * They are racy. DO NOT use them, use the wait_event* interfaces above. * We plan to remove these interfaces. diff --git a/include/linux/zorro.h b/include/linux/zorro.h index 913bfc2..7bf9db52 100644 --- a/include/linux/zorro.h +++ b/include/linux/zorro.h @@ -38,8 +38,6 @@ typedef __u32 zorro_id; -#define ZORRO_WILDCARD (0xffffffff) /* not official */ - /* Include the ID list */ #include <linux/zorro_ids.h> @@ -116,6 +114,7 @@ struct ConfigDev { #include <linux/init.h> #include <linux/ioport.h> +#include <linux/mod_devicetable.h> #include <asm/zorro.h> @@ -142,29 +141,10 @@ struct zorro_dev { * Zorro bus */ -struct zorro_bus { - struct list_head devices; /* list of devices on this bus */ - unsigned int num_resources; /* number of resources */ - struct resource resources[4]; /* address space routed to this bus */ - struct device dev; - char name[10]; -}; - -extern struct zorro_bus zorro_bus; /* single Zorro bus */ extern struct bus_type zorro_bus_type; /* - * Zorro device IDs - */ - -struct zorro_device_id { - zorro_id id; /* Device ID or ZORRO_WILDCARD */ - unsigned long driver_data; /* Data private to the driver */ -}; - - - /* * Zorro device drivers */ diff --git a/include/media/saa7146_vv.h b/include/media/saa7146_vv.h index b9da1f5..4aeff96 100644 --- a/include/media/saa7146_vv.h +++ b/include/media/saa7146_vv.h @@ -188,7 +188,6 @@ void saa7146_buffer_timeout(unsigned long data); void saa7146_dma_free(struct saa7146_dev* dev,struct videobuf_queue *q, struct saa7146_buf *buf); -int saa7146_vv_devinit(struct saa7146_dev *dev); int saa7146_vv_init(struct saa7146_dev* dev, struct saa7146_ext_vv *ext_vv); int saa7146_vv_release(struct saa7146_dev* dev); diff --git a/include/net/sctp/sm.h b/include/net/sctp/sm.h index 851c813..61d73e3 100644 --- a/include/net/sctp/sm.h +++ b/include/net/sctp/sm.h @@ -279,6 +279,7 @@ int sctp_do_sm(sctp_event_t event_type, sctp_subtype_t subtype, /* 2nd level prototypes */ void sctp_generate_t3_rtx_event(unsigned long peer); void sctp_generate_heartbeat_event(unsigned long peer); +void sctp_generate_proto_unreach_event(unsigned long peer); void sctp_ootb_pkt_free(struct sctp_packet *); diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h index 597f8e2..219043a 100644 --- a/include/net/sctp/structs.h +++ b/include/net/sctp/structs.h @@ -1010,6 +1010,9 @@ struct sctp_transport { /* Heartbeat timer is per destination. */ struct timer_list hb_timer; + /* Timer to handle ICMP proto unreachable envets */ + struct timer_list proto_unreach_timer; + /* Since we're using per-destination retransmission timers * (see above), we're also using per-destination "transmitted" * queues. This probably ought to be a private struct diff --git a/include/net/tcp.h b/include/net/tcp.h index 75be5a2..aa04b9a 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -1197,30 +1197,15 @@ extern int tcp_v4_md5_do_del(struct sock *sk, extern struct tcp_md5sig_pool * __percpu *tcp_alloc_md5sig_pool(struct sock *); extern void tcp_free_md5sig_pool(void); -extern struct tcp_md5sig_pool *__tcp_get_md5sig_pool(int cpu); -extern void __tcp_put_md5sig_pool(void); +extern struct tcp_md5sig_pool *tcp_get_md5sig_pool(void); +extern void tcp_put_md5sig_pool(void); + extern int tcp_md5_hash_header(struct tcp_md5sig_pool *, struct tcphdr *); extern int tcp_md5_hash_skb_data(struct tcp_md5sig_pool *, struct sk_buff *, unsigned header_len); extern int tcp_md5_hash_key(struct tcp_md5sig_pool *hp, struct tcp_md5sig_key *key); -static inline -struct tcp_md5sig_pool *tcp_get_md5sig_pool(void) -{ - int cpu = get_cpu(); - struct tcp_md5sig_pool *ret = __tcp_get_md5sig_pool(cpu); - if (!ret) - put_cpu(); - return ret; -} - -static inline void tcp_put_md5sig_pool(void) -{ - __tcp_put_md5sig_pool(); - put_cpu(); -} - /* write queue abstraction */ static inline void tcp_write_queue_purge(struct sock *sk) { diff --git a/include/trace/define_trace.h b/include/trace/define_trace.h index 5acfb1e..1dfab54 100644 --- a/include/trace/define_trace.h +++ b/include/trace/define_trace.h @@ -65,6 +65,10 @@ #include TRACE_INCLUDE(TRACE_INCLUDE_FILE) +/* Make all open coded DECLARE_TRACE nops */ +#undef DECLARE_TRACE +#define DECLARE_TRACE(name, proto, args) + #ifdef CONFIG_EVENT_TRACING #include <trace/ftrace.h> #endif @@ -75,6 +79,7 @@ #undef DEFINE_EVENT #undef DEFINE_EVENT_PRINT #undef TRACE_HEADER_MULTI_READ +#undef DECLARE_TRACE /* Only undef what we defined in this file */ #ifdef UNDEF_TRACE_INCLUDE_FILE diff --git a/include/trace/events/module.h b/include/trace/events/module.h index 4b0f48b..c7bb2f0 100644 --- a/include/trace/events/module.h +++ b/include/trace/events/module.h @@ -51,11 +51,14 @@ TRACE_EVENT(module_free, TP_printk("%s", __get_str(name)) ); +#ifdef CONFIG_MODULE_UNLOAD +/* trace_module_get/put are only used if CONFIG_MODULE_UNLOAD is defined */ + DECLARE_EVENT_CLASS(module_refcnt, - TP_PROTO(struct module *mod, unsigned long ip, int refcnt), + TP_PROTO(struct module *mod, unsigned long ip), - TP_ARGS(mod, ip, refcnt), + TP_ARGS(mod, ip), TP_STRUCT__entry( __field( unsigned long, ip ) @@ -65,7 +68,7 @@ DECLARE_EVENT_CLASS(module_refcnt, TP_fast_assign( __entry->ip = ip; - __entry->refcnt = refcnt; + __entry->refcnt = __this_cpu_read(mod->refptr->incs) + __this_cpu_read(mod->refptr->decs); __assign_str(name, mod->name); ), @@ -75,17 +78,18 @@ DECLARE_EVENT_CLASS(module_refcnt, DEFINE_EVENT(module_refcnt, module_get, - TP_PROTO(struct module *mod, unsigned long ip, int refcnt), + TP_PROTO(struct module *mod, unsigned long ip), - TP_ARGS(mod, ip, refcnt) + TP_ARGS(mod, ip) ); DEFINE_EVENT(module_refcnt, module_put, - TP_PROTO(struct module *mod, unsigned long ip, int refcnt), + TP_PROTO(struct module *mod, unsigned long ip), - TP_ARGS(mod, ip, refcnt) + TP_ARGS(mod, ip) ); +#endif /* CONFIG_MODULE_UNLOAD */ TRACE_EVENT(module_request, diff --git a/include/trace/events/napi.h b/include/trace/events/napi.h index a8989c4..188deca 100644 --- a/include/trace/events/napi.h +++ b/include/trace/events/napi.h @@ -1,4 +1,7 @@ -#ifndef _TRACE_NAPI_H_ +#undef TRACE_SYSTEM +#define TRACE_SYSTEM napi + +#if !defined(_TRACE_NAPI_H) || defined(TRACE_HEADER_MULTI_READ) #define _TRACE_NAPI_H_ #include <linux/netdevice.h> @@ -8,4 +11,7 @@ DECLARE_TRACE(napi_poll, TP_PROTO(struct napi_struct *napi), TP_ARGS(napi)); -#endif +#endif /* _TRACE_NAPI_H_ */ + +/* This part must be outside protection */ +#include <trace/define_trace.h> diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h index cfceb0b..4f733ec 100644 --- a/include/trace/events/sched.h +++ b/include/trace/events/sched.h @@ -51,15 +51,12 @@ TRACE_EVENT(sched_kthread_stop_ret, /* * Tracepoint for waiting on task to unschedule: - * - * (NOTE: the 'rq' argument is not used by generic trace events, - * but used by the latency tracer plugin. ) */ TRACE_EVENT(sched_wait_task, - TP_PROTO(struct rq *rq, struct task_struct *p), + TP_PROTO(struct task_struct *p), - TP_ARGS(rq, p), + TP_ARGS(p), TP_STRUCT__entry( __array( char, comm, TASK_COMM_LEN ) @@ -79,15 +76,12 @@ TRACE_EVENT(sched_wait_task, /* * Tracepoint for waking up a task: - * - * (NOTE: the 'rq' argument is not used by generic trace events, - * but used by the latency tracer plugin. ) */ DECLARE_EVENT_CLASS(sched_wakeup_template, - TP_PROTO(struct rq *rq, struct task_struct *p, int success), + TP_PROTO(struct task_struct *p, int success), - TP_ARGS(rq, p, success), + TP_ARGS(p, success), TP_STRUCT__entry( __array( char, comm, TASK_COMM_LEN ) @@ -111,31 +105,25 @@ DECLARE_EVENT_CLASS(sched_wakeup_template, ); DEFINE_EVENT(sched_wakeup_template, sched_wakeup, - TP_PROTO(struct rq *rq, struct task_struct *p, int success), - TP_ARGS(rq, p, success)); + TP_PROTO(struct task_struct *p, int success), + TP_ARGS(p, success)); /* * Tracepoint for waking up a new task: - * - * (NOTE: the 'rq' argument is not used by generic trace events, - * but used by the latency tracer plugin. ) */ DEFINE_EVENT(sched_wakeup_template, sched_wakeup_new, - TP_PROTO(struct rq *rq, struct task_struct *p, int success), - TP_ARGS(rq, p, success)); + TP_PROTO(struct task_struct *p, int success), + TP_ARGS(p, success)); /* * Tracepoint for task switches, performed by the scheduler: - * - * (NOTE: the 'rq' argument is not used by generic trace events, - * but used by the latency tracer plugin. ) */ TRACE_EVENT(sched_switch, - TP_PROTO(struct rq *rq, struct task_struct *prev, + TP_PROTO(struct task_struct *prev, struct task_struct *next), - TP_ARGS(rq, prev, next), + TP_ARGS(prev, next), TP_STRUCT__entry( __array( char, prev_comm, TASK_COMM_LEN ) diff --git a/include/trace/events/signal.h b/include/trace/events/signal.h index a510b75..814566c 100644 --- a/include/trace/events/signal.h +++ b/include/trace/events/signal.h @@ -100,18 +100,7 @@ TRACE_EVENT(signal_deliver, __entry->sa_handler, __entry->sa_flags) ); -/** - * signal_overflow_fail - called when signal queue is overflow - * @sig: signal number - * @group: signal to process group or not (bool) - * @info: pointer to struct siginfo - * - * Kernel fails to generate 'sig' signal with 'info' siginfo, because - * siginfo queue is overflow, and the signal is dropped. - * 'group' is not 0 if the signal will be sent to a process group. - * 'sig' is always one of RT signals. - */ -TRACE_EVENT(signal_overflow_fail, +DECLARE_EVENT_CLASS(signal_queue_overflow, TP_PROTO(int sig, int group, struct siginfo *info), @@ -135,6 +124,24 @@ TRACE_EVENT(signal_overflow_fail, ); /** + * signal_overflow_fail - called when signal queue is overflow + * @sig: signal number + * @group: signal to process group or not (bool) + * @info: pointer to struct siginfo + * + * Kernel fails to generate 'sig' signal with 'info' siginfo, because + * siginfo queue is overflow, and the signal is dropped. + * 'group' is not 0 if the signal will be sent to a process group. + * 'sig' is always one of RT signals. + */ +DEFINE_EVENT(signal_queue_overflow, signal_overflow_fail, + + TP_PROTO(int sig, int group, struct siginfo *info), + + TP_ARGS(sig, group, info) +); + +/** * signal_lose_info - called when siginfo is lost * @sig: signal number * @group: signal to process group or not (bool) @@ -145,28 +152,13 @@ TRACE_EVENT(signal_overflow_fail, * 'group' is not 0 if the signal will be sent to a process group. * 'sig' is always one of non-RT signals. */ -TRACE_EVENT(signal_lose_info, +DEFINE_EVENT(signal_queue_overflow, signal_lose_info, TP_PROTO(int sig, int group, struct siginfo *info), - TP_ARGS(sig, group, info), - - TP_STRUCT__entry( - __field( int, sig ) - __field( int, group ) - __field( int, errno ) - __field( int, code ) - ), - - TP_fast_assign( - __entry->sig = sig; - __entry->group = group; - TP_STORE_SIGINFO(__entry, info); - ), - - TP_printk("sig=%d group=%d errno=%d code=%d", - __entry->sig, __entry->group, __entry->errno, __entry->code) + TP_ARGS(sig, group, info) ); + #endif /* _TRACE_SIGNAL_H */ /* This part must be outside protection */ diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h index 0a29df0..1016b21 100644 --- a/include/trace/ftrace.h +++ b/include/trace/ftrace.h @@ -154,9 +154,11 @@ * * field = (typeof(field))entry; * - * p = get_cpu_var(ftrace_event_seq); + * p = &get_cpu_var(ftrace_event_seq); * trace_seq_init(p); - * ret = trace_seq_printf(s, <TP_printk> "\n"); + * ret = trace_seq_printf(s, "%s: ", <call>); + * if (ret) + * ret = trace_seq_printf(s, <TP_printk> "\n"); * put_cpu(); * if (!ret) * return TRACE_TYPE_PARTIAL_LINE; @@ -450,38 +452,38 @@ perf_trace_disable_##name(struct ftrace_event_call *unused) \ * * static void ftrace_raw_event_<call>(proto) * { + * struct ftrace_data_offsets_<call> __maybe_unused __data_offsets; * struct ring_buffer_event *event; * struct ftrace_raw_<call> *entry; <-- defined in stage 1 * struct ring_buffer *buffer; * unsigned long irq_flags; + * int __data_size; * int pc; * * local_save_flags(irq_flags); * pc = preempt_count(); * + * __data_size = ftrace_get_offsets_<call>(&__data_offsets, args); + * * event = trace_current_buffer_lock_reserve(&buffer, * event_<call>.id, - * sizeof(struct ftrace_raw_<call>), + * sizeof(*entry) + __data_size, * irq_flags, pc); * if (!event) * return; * entry = ring_buffer_event_data(event); * - * <assign>; <-- Here we assign the entries by the __field and - * __array macros. + * { <assign>; } <-- Here we assign the entries by the __field and + * __array macros. * - * trace_current_buffer_unlock_commit(buffer, event, irq_flags, pc); + * if (!filter_current_check_discard(buffer, event_call, entry, event)) + * trace_current_buffer_unlock_commit(buffer, + * event, irq_flags, pc); * } * * static int ftrace_raw_reg_event_<call>(struct ftrace_event_call *unused) * { - * int ret; - * - * ret = register_trace_<call>(ftrace_raw_event_<call>); - * if (!ret) - * pr_info("event trace: Could not activate trace point " - * "probe to <call>"); - * return ret; + * return register_trace_<call>(ftrace_raw_event_<call>); * } * * static void ftrace_unreg_event_<call>(struct ftrace_event_call *unused) @@ -493,6 +495,8 @@ perf_trace_disable_##name(struct ftrace_event_call *unused) \ * .trace = ftrace_raw_output_<call>, <-- stage 2 * }; * + * static const char print_fmt_<call>[] = <TP_printk>; + * * static struct ftrace_event_call __used * __attribute__((__aligned__(4))) * __attribute__((section("_ftrace_events"))) event_<call> = { @@ -501,6 +505,8 @@ perf_trace_disable_##name(struct ftrace_event_call *unused) \ * .raw_init = trace_event_raw_init, * .regfunc = ftrace_reg_event_<call>, * .unregfunc = ftrace_unreg_event_<call>, + * .print_fmt = print_fmt_<call>, + * .define_fields = ftrace_define_fields_<call>, * } * */ @@ -569,7 +575,6 @@ ftrace_raw_event_id_##call(struct ftrace_event_call *event_call, \ return; \ entry = ring_buffer_event_data(event); \ \ - \ tstruct \ \ { assign; } \ diff --git a/init/Kconfig b/init/Kconfig index eb77e8c..5fe94b8 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -604,8 +604,7 @@ config RT_GROUP_SCHED default n help This feature lets you explicitly allocate real CPU bandwidth - to users or control groups (depending on the "Basis for grouping tasks" - setting below. If enabled, it will also make it impossible to + to task groups. If enabled, it will also make it impossible to schedule realtime tasks for non-root users until you allocate realtime bandwidth for them. See Documentation/scheduler/sched-rt-group.txt for more information. diff --git a/ipc/mqueue.c b/ipc/mqueue.c index 722b013..59a009d 100644 --- a/ipc/mqueue.c +++ b/ipc/mqueue.c @@ -158,7 +158,7 @@ static struct inode *mqueue_get_inode(struct super_block *sb, u->mq_bytes + mq_bytes > task_rlimit(p, RLIMIT_MSGQUEUE)) { spin_unlock(&mq_lock); - kfree(info->messages); + /* mqueue_delete_inode() releases info->messages */ goto out_inode; } u->mq_bytes += mq_bytes; diff --git a/kernel/Makefile b/kernel/Makefile index a987aa1..149e18e 100644 --- a/kernel/Makefile +++ b/kernel/Makefile @@ -68,7 +68,7 @@ obj-$(CONFIG_USER_NS) += user_namespace.o obj-$(CONFIG_PID_NS) += pid_namespace.o obj-$(CONFIG_IKCONFIG) += configs.o obj-$(CONFIG_RESOURCE_COUNTERS) += res_counter.o -obj-$(CONFIG_STOP_MACHINE) += stop_machine.o +obj-$(CONFIG_SMP) += stop_machine.o obj-$(CONFIG_KPROBES_SANITY_TEST) += test_kprobes.o obj-$(CONFIG_AUDIT) += audit.o auditfilter.o audit_watch.o obj-$(CONFIG_AUDITSYSCALL) += auditsc.o diff --git a/kernel/acct.c b/kernel/acct.c index 24f8c81..e4c0e1f 100644 --- a/kernel/acct.c +++ b/kernel/acct.c @@ -353,17 +353,18 @@ restart: void acct_exit_ns(struct pid_namespace *ns) { - struct bsd_acct_struct *acct; + struct bsd_acct_struct *acct = ns->bacct; - spin_lock(&acct_lock); - acct = ns->bacct; - if (acct != NULL) { - if (acct->file != NULL) - acct_file_reopen(acct, NULL, NULL); + if (acct == NULL) + return; - kfree(acct); - } + del_timer_sync(&acct->timer); + spin_lock(&acct_lock); + if (acct->file != NULL) + acct_file_reopen(acct, NULL, NULL); spin_unlock(&acct_lock); + + kfree(acct); } /* diff --git a/kernel/capability.c b/kernel/capability.c index 9e4697e..2f05303 100644 --- a/kernel/capability.c +++ b/kernel/capability.c @@ -15,7 +15,6 @@ #include <linux/syscalls.h> #include <linux/pid_namespace.h> #include <asm/uaccess.h> -#include "cred-internals.h" /* * Leveraged for setting/resetting capabilities diff --git a/kernel/cgroup.c b/kernel/cgroup.c index e2769e1..e9ec642 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -1646,7 +1646,9 @@ static inline struct cftype *__d_cft(struct dentry *dentry) int cgroup_path(const struct cgroup *cgrp, char *buf, int buflen) { char *start; - struct dentry *dentry = rcu_dereference(cgrp->dentry); + struct dentry *dentry = rcu_dereference_check(cgrp->dentry, + rcu_read_lock_held() || + cgroup_lock_is_held()); if (!dentry || cgrp == dummytop) { /* @@ -1662,13 +1664,17 @@ int cgroup_path(const struct cgroup *cgrp, char *buf, int buflen) *--start = '\0'; for (;;) { int len = dentry->d_name.len; + if ((start -= len) < buf) return -ENAMETOOLONG; - memcpy(start, cgrp->dentry->d_name.name, len); + memcpy(start, dentry->d_name.name, len); cgrp = cgrp->parent; if (!cgrp) break; - dentry = rcu_dereference(cgrp->dentry); + + dentry = rcu_dereference_check(cgrp->dentry, + rcu_read_lock_held() || + cgroup_lock_is_held()); if (!cgrp->parent) continue; if (--start < buf) @@ -3010,7 +3016,7 @@ static int cgroup_event_wake(wait_queue_t *wait, unsigned mode, unsigned long flags = (unsigned long)key; if (flags & POLLHUP) { - remove_wait_queue_locked(event->wqh, &event->wait); + __remove_wait_queue(event->wqh, &event->wait); spin_lock(&cgrp->event_list_lock); list_del(&event->list); spin_unlock(&cgrp->event_list_lock); @@ -4429,7 +4435,15 @@ __setup("cgroup_disable=", cgroup_disable); */ unsigned short css_id(struct cgroup_subsys_state *css) { - struct css_id *cssid = rcu_dereference(css->id); + struct css_id *cssid; + + /* + * This css_id() can return correct value when somone has refcnt + * on this or this is under rcu_read_lock(). Once css->id is allocated, + * it's unchanged until freed. + */ + cssid = rcu_dereference_check(css->id, + rcu_read_lock_held() || atomic_read(&css->refcnt)); if (cssid) return cssid->id; @@ -4439,7 +4453,10 @@ EXPORT_SYMBOL_GPL(css_id); unsigned short css_depth(struct cgroup_subsys_state *css) { - struct css_id *cssid = rcu_dereference(css->id); + struct css_id *cssid; + + cssid = rcu_dereference_check(css->id, + rcu_read_lock_held() || atomic_read(&css->refcnt)); if (cssid) return cssid->depth; @@ -4447,15 +4464,36 @@ unsigned short css_depth(struct cgroup_subsys_state *css) } EXPORT_SYMBOL_GPL(css_depth); +/** + * css_is_ancestor - test "root" css is an ancestor of "child" + * @child: the css to be tested. + * @root: the css supporsed to be an ancestor of the child. + * + * Returns true if "root" is an ancestor of "child" in its hierarchy. Because + * this function reads css->id, this use rcu_dereference() and rcu_read_lock(). + * But, considering usual usage, the csses should be valid objects after test. + * Assuming that the caller will do some action to the child if this returns + * returns true, the caller must take "child";s reference count. + * If "child" is valid object and this returns true, "root" is valid, too. + */ + bool css_is_ancestor(struct cgroup_subsys_state *child, const struct cgroup_subsys_state *root) { - struct css_id *child_id = rcu_dereference(child->id); - struct css_id *root_id = rcu_dereference(root->id); + struct css_id *child_id; + struct css_id *root_id; + bool ret = true; - if (!child_id || !root_id || (child_id->depth < root_id->depth)) - return false; - return child_id->stack[root_id->depth] == root_id->id; + rcu_read_lock(); + child_id = rcu_dereference(child->id); + root_id = rcu_dereference(root->id); + if (!child_id + || !root_id + || (child_id->depth < root_id->depth) + || (child_id->stack[root_id->depth] != root_id->id)) + ret = false; + rcu_read_unlock(); + return ret; } static void __free_css_id_cb(struct rcu_head *head) @@ -4555,13 +4593,13 @@ static int alloc_css_id(struct cgroup_subsys *ss, struct cgroup *parent, { int subsys_id, i, depth = 0; struct cgroup_subsys_state *parent_css, *child_css; - struct css_id *child_id, *parent_id = NULL; + struct css_id *child_id, *parent_id; subsys_id = ss->subsys_id; parent_css = parent->subsys[subsys_id]; child_css = child->subsys[subsys_id]; - depth = css_depth(parent_css) + 1; parent_id = parent_css->id; + depth = parent_id->depth; child_id = get_new_cssid(ss, depth); if (IS_ERR(child_id)) diff --git a/kernel/cpu.c b/kernel/cpu.c index 25bba73..5457775 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -164,6 +164,7 @@ static inline void check_for_tasks(int cpu) } struct take_cpu_down_param { + struct task_struct *caller; unsigned long mod; void *hcpu; }; @@ -172,6 +173,7 @@ struct take_cpu_down_param { static int __ref take_cpu_down(void *_param) { struct take_cpu_down_param *param = _param; + unsigned int cpu = (unsigned long)param->hcpu; int err; /* Ensure this CPU doesn't handle any more interrupts. */ @@ -182,6 +184,8 @@ static int __ref take_cpu_down(void *_param) raw_notifier_call_chain(&cpu_chain, CPU_DYING | param->mod, param->hcpu); + if (task_cpu(param->caller) == cpu) + move_task_off_dead_cpu(cpu, param->caller); /* Force idle task to run as soon as we yield: it should immediately notice cpu is offline and die quickly. */ sched_idle_next(); @@ -192,10 +196,10 @@ static int __ref take_cpu_down(void *_param) static int __ref _cpu_down(unsigned int cpu, int tasks_frozen) { int err, nr_calls = 0; - cpumask_var_t old_allowed; void *hcpu = (void *)(long)cpu; unsigned long mod = tasks_frozen ? CPU_TASKS_FROZEN : 0; struct take_cpu_down_param tcd_param = { + .caller = current, .mod = mod, .hcpu = hcpu, }; @@ -206,9 +210,6 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen) if (!cpu_online(cpu)) return -EINVAL; - if (!alloc_cpumask_var(&old_allowed, GFP_KERNEL)) - return -ENOMEM; - cpu_hotplug_begin(); set_cpu_active(cpu, false); err = __raw_notifier_call_chain(&cpu_chain, CPU_DOWN_PREPARE | mod, @@ -225,10 +226,6 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen) goto out_release; } - /* Ensure that we are not runnable on dying cpu */ - cpumask_copy(old_allowed, ¤t->cpus_allowed); - set_cpus_allowed_ptr(current, cpu_active_mask); - err = __stop_machine(take_cpu_down, &tcd_param, cpumask_of(cpu)); if (err) { set_cpu_active(cpu, true); @@ -237,7 +234,7 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen) hcpu) == NOTIFY_BAD) BUG(); - goto out_allowed; + goto out_release; } BUG_ON(cpu_online(cpu)); @@ -255,8 +252,6 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen) check_for_tasks(cpu); -out_allowed: - set_cpus_allowed_ptr(current, old_allowed); out_release: cpu_hotplug_done(); if (!err) { @@ -264,7 +259,6 @@ out_release: hcpu) == NOTIFY_BAD) BUG(); } - free_cpumask_var(old_allowed); return err; } @@ -272,9 +266,6 @@ int __ref cpu_down(unsigned int cpu) { int err; - err = stop_machine_create(); - if (err) - return err; cpu_maps_update_begin(); if (cpu_hotplug_disabled) { @@ -286,7 +277,6 @@ int __ref cpu_down(unsigned int cpu) out: cpu_maps_update_done(); - stop_machine_destroy(); return err; } EXPORT_SYMBOL(cpu_down); @@ -367,9 +357,6 @@ int disable_nonboot_cpus(void) { int cpu, first_cpu, error; - error = stop_machine_create(); - if (error) - return error; cpu_maps_update_begin(); first_cpu = cpumask_first(cpu_online_mask); /* @@ -400,7 +387,6 @@ int disable_nonboot_cpus(void) printk(KERN_ERR "Non-boot CPUs are not disabled\n"); } cpu_maps_update_done(); - stop_machine_destroy(); return error; } diff --git a/kernel/cpuset.c b/kernel/cpuset.c index d109467..9a50c5f 100644 --- a/kernel/cpuset.c +++ b/kernel/cpuset.c @@ -2182,19 +2182,52 @@ void __init cpuset_init_smp(void) void cpuset_cpus_allowed(struct task_struct *tsk, struct cpumask *pmask) { mutex_lock(&callback_mutex); - cpuset_cpus_allowed_locked(tsk, pmask); + task_lock(tsk); + guarantee_online_cpus(task_cs(tsk), pmask); + task_unlock(tsk); mutex_unlock(&callback_mutex); } -/** - * cpuset_cpus_allowed_locked - return cpus_allowed mask from a tasks cpuset. - * Must be called with callback_mutex held. - **/ -void cpuset_cpus_allowed_locked(struct task_struct *tsk, struct cpumask *pmask) +int cpuset_cpus_allowed_fallback(struct task_struct *tsk) { - task_lock(tsk); - guarantee_online_cpus(task_cs(tsk), pmask); - task_unlock(tsk); + const struct cpuset *cs; + int cpu; + + rcu_read_lock(); + cs = task_cs(tsk); + if (cs) + cpumask_copy(&tsk->cpus_allowed, cs->cpus_allowed); + rcu_read_unlock(); + + /* + * We own tsk->cpus_allowed, nobody can change it under us. + * + * But we used cs && cs->cpus_allowed lockless and thus can + * race with cgroup_attach_task() or update_cpumask() and get + * the wrong tsk->cpus_allowed. However, both cases imply the + * subsequent cpuset_change_cpumask()->set_cpus_allowed_ptr() + * which takes task_rq_lock(). + * + * If we are called after it dropped the lock we must see all + * changes in tsk_cs()->cpus_allowed. Otherwise we can temporary + * set any mask even if it is not right from task_cs() pov, + * the pending set_cpus_allowed_ptr() will fix things. + */ + + cpu = cpumask_any_and(&tsk->cpus_allowed, cpu_active_mask); + if (cpu >= nr_cpu_ids) { + /* + * Either tsk->cpus_allowed is wrong (see above) or it + * is actually empty. The latter case is only possible + * if we are racing with remove_tasks_in_empty_cpuset(). + * Like above we can temporary set any mask and rely on + * set_cpus_allowed_ptr() as synchronization point. + */ + cpumask_copy(&tsk->cpus_allowed, cpu_possible_mask); + cpu = cpumask_any(cpu_active_mask); + } + + return cpu; } void cpuset_init_current_mems_allowed(void) @@ -2383,22 +2416,6 @@ int __cpuset_node_allowed_hardwall(int node, gfp_t gfp_mask) } /** - * cpuset_lock - lock out any changes to cpuset structures - * - * The out of memory (oom) code needs to mutex_lock cpusets - * from being changed while it scans the tasklist looking for a - * task in an overlapping cpuset. Expose callback_mutex via this - * cpuset_lock() routine, so the oom code can lock it, before - * locking the task list. The tasklist_lock is a spinlock, so - * must be taken inside callback_mutex. - */ - -void cpuset_lock(void) -{ - mutex_lock(&callback_mutex); -} - -/** * cpuset_unlock - release lock on cpuset changes * * Undo the lock taken in a previous cpuset_lock() call. diff --git a/kernel/cred-internals.h b/kernel/cred-internals.h deleted file mode 100644 index 2dc4fc2..0000000 --- a/kernel/cred-internals.h +++ /dev/null @@ -1,21 +0,0 @@ -/* Internal credentials stuff - * - * Copyright (C) 2008 Red Hat, Inc. All Rights Reserved. - * Written by David Howells (dhowells@redhat.com) - * - * This program is free software; you can redistribute it and/or - * modify it under the terms of the GNU General Public Licence - * as published by the Free Software Foundation; either version - * 2 of the Licence, or (at your option) any later version. - */ - -/* - * user.c - */ -static inline void sched_switch_user(struct task_struct *p) -{ -#ifdef CONFIG_USER_SCHED - sched_move_task(p); -#endif /* CONFIG_USER_SCHED */ -} - diff --git a/kernel/cred.c b/kernel/cred.c index 62af181..8f3672a 100644 --- a/kernel/cred.c +++ b/kernel/cred.c @@ -17,7 +17,6 @@ #include <linux/init_task.h> #include <linux/security.h> #include <linux/cn_proc.h> -#include "cred-internals.h" #if 0 #define kdebug(FMT, ...) \ @@ -560,8 +559,6 @@ int commit_creds(struct cred *new) atomic_dec(&old->user->processes); alter_cred_subscribers(old, -2); - sched_switch_user(task); - /* send notifications */ if (new->uid != old->uid || new->euid != old->euid || diff --git a/kernel/exit.c b/kernel/exit.c index 7f2683a..eabca5a 100644 --- a/kernel/exit.c +++ b/kernel/exit.c @@ -55,7 +55,6 @@ #include <asm/unistd.h> #include <asm/pgtable.h> #include <asm/mmu_context.h> -#include "cred-internals.h" static void exit_mm(struct task_struct * tsk); diff --git a/kernel/fork.c b/kernel/fork.c index 5d3592de..4d57d9e 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1111,7 +1111,6 @@ static struct task_struct *copy_process(unsigned long clone_flags, p->memcg_batch.do_batch = 0; p->memcg_batch.memcg = NULL; #endif - p->stack_start = stack_start; /* Perform scheduler related setup. Assign this task to a CPU. */ sched_fork(p, clone_flags); diff --git a/kernel/kexec.c b/kernel/kexec.c index 87ebe8a..474a847 100644 --- a/kernel/kexec.c +++ b/kernel/kexec.c @@ -1134,11 +1134,9 @@ int crash_shrink_memory(unsigned long new_size) free_reserved_phys_range(end, crashk_res.end); - if (start == end) { - crashk_res.end = end; + if (start == end) release_resource(&crashk_res); - } else - crashk_res.end = end - 1; + crashk_res.end = end - 1; unlock: mutex_unlock(&kexec_mutex); diff --git a/kernel/lockdep.c b/kernel/lockdep.c index e9c759f..ec21304 100644 --- a/kernel/lockdep.c +++ b/kernel/lockdep.c @@ -431,20 +431,7 @@ static struct stack_trace lockdep_init_trace = { /* * Various lockdep statistics: */ -atomic_t chain_lookup_hits; -atomic_t chain_lookup_misses; -atomic_t hardirqs_on_events; -atomic_t hardirqs_off_events; -atomic_t redundant_hardirqs_on; -atomic_t redundant_hardirqs_off; -atomic_t softirqs_on_events; -atomic_t softirqs_off_events; -atomic_t redundant_softirqs_on; -atomic_t redundant_softirqs_off; -atomic_t nr_unused_locks; -atomic_t nr_cyclic_checks; -atomic_t nr_find_usage_forwards_checks; -atomic_t nr_find_usage_backwards_checks; +DEFINE_PER_CPU(struct lockdep_stats, lockdep_stats); #endif /* @@ -748,7 +735,7 @@ register_lock_class(struct lockdep_map *lock, unsigned int subclass, int force) return NULL; } class = lock_classes + nr_lock_classes++; - debug_atomic_inc(&nr_unused_locks); + debug_atomic_inc(nr_unused_locks); class->key = key; class->name = lock->name; class->subclass = subclass; @@ -818,7 +805,8 @@ static struct lock_list *alloc_list_entry(void) * Add a new dependency to the head of the list: */ static int add_lock_to_list(struct lock_class *class, struct lock_class *this, - struct list_head *head, unsigned long ip, int distance) + struct list_head *head, unsigned long ip, + int distance, struct stack_trace *trace) { struct lock_list *entry; /* @@ -829,11 +817,9 @@ static int add_lock_to_list(struct lock_class *class, struct lock_class *this, if (!entry) return 0; - if (!save_trace(&entry->trace)) - return 0; - entry->class = this; entry->distance = distance; + entry->trace = *trace; /* * Since we never remove from the dependency list, the list can * be walked lockless by other CPUs, it's only allocation @@ -1205,7 +1191,7 @@ check_noncircular(struct lock_list *root, struct lock_class *target, { int result; - debug_atomic_inc(&nr_cyclic_checks); + debug_atomic_inc(nr_cyclic_checks); result = __bfs_forwards(root, target, class_equal, target_entry); @@ -1242,7 +1228,7 @@ find_usage_forwards(struct lock_list *root, enum lock_usage_bit bit, { int result; - debug_atomic_inc(&nr_find_usage_forwards_checks); + debug_atomic_inc(nr_find_usage_forwards_checks); result = __bfs_forwards(root, (void *)bit, usage_match, target_entry); @@ -1265,7 +1251,7 @@ find_usage_backwards(struct lock_list *root, enum lock_usage_bit bit, { int result; - debug_atomic_inc(&nr_find_usage_backwards_checks); + debug_atomic_inc(nr_find_usage_backwards_checks); result = __bfs_backwards(root, (void *)bit, usage_match, target_entry); @@ -1635,12 +1621,20 @@ check_deadlock(struct task_struct *curr, struct held_lock *next, */ static int check_prev_add(struct task_struct *curr, struct held_lock *prev, - struct held_lock *next, int distance) + struct held_lock *next, int distance, int trylock_loop) { struct lock_list *entry; int ret; struct lock_list this; struct lock_list *uninitialized_var(target_entry); + /* + * Static variable, serialized by the graph_lock(). + * + * We use this static variable to save the stack trace in case + * we call into this function multiple times due to encountering + * trylocks in the held lock stack. + */ + static struct stack_trace trace; /* * Prove that the new <prev> -> <next> dependency would not @@ -1688,20 +1682,23 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev, } } + if (!trylock_loop && !save_trace(&trace)) + return 0; + /* * Ok, all validations passed, add the new lock * to the previous lock's dependency list: */ ret = add_lock_to_list(hlock_class(prev), hlock_class(next), &hlock_class(prev)->locks_after, - next->acquire_ip, distance); + next->acquire_ip, distance, &trace); if (!ret) return 0; ret = add_lock_to_list(hlock_class(next), hlock_class(prev), &hlock_class(next)->locks_before, - next->acquire_ip, distance); + next->acquire_ip, distance, &trace); if (!ret) return 0; @@ -1731,6 +1728,7 @@ static int check_prevs_add(struct task_struct *curr, struct held_lock *next) { int depth = curr->lockdep_depth; + int trylock_loop = 0; struct held_lock *hlock; /* @@ -1756,7 +1754,8 @@ check_prevs_add(struct task_struct *curr, struct held_lock *next) * added: */ if (hlock->read != 2) { - if (!check_prev_add(curr, hlock, next, distance)) + if (!check_prev_add(curr, hlock, next, + distance, trylock_loop)) return 0; /* * Stop after the first non-trylock entry, @@ -1779,6 +1778,7 @@ check_prevs_add(struct task_struct *curr, struct held_lock *next) if (curr->held_locks[depth].irq_context != curr->held_locks[depth-1].irq_context) break; + trylock_loop = 1; } return 1; out_bug: @@ -1825,7 +1825,7 @@ static inline int lookup_chain_cache(struct task_struct *curr, list_for_each_entry(chain, hash_head, entry) { if (chain->chain_key == chain_key) { cache_hit: - debug_atomic_inc(&chain_lookup_hits); + debug_atomic_inc(chain_lookup_hits); if (very_verbose(class)) printk("\nhash chain already cached, key: " "%016Lx tail class: [%p] %s\n", @@ -1890,7 +1890,7 @@ cache_hit: chain_hlocks[chain->base + j] = class - lock_classes; } list_add_tail_rcu(&chain->entry, hash_head); - debug_atomic_inc(&chain_lookup_misses); + debug_atomic_inc(chain_lookup_misses); inc_chains(); return 1; @@ -2311,7 +2311,12 @@ void trace_hardirqs_on_caller(unsigned long ip) return; if (unlikely(curr->hardirqs_enabled)) { - debug_atomic_inc(&redundant_hardirqs_on); + /* + * Neither irq nor preemption are disabled here + * so this is racy by nature but loosing one hit + * in a stat is not a big deal. + */ + __debug_atomic_inc(redundant_hardirqs_on); return; } /* we'll do an OFF -> ON transition: */ @@ -2338,7 +2343,7 @@ void trace_hardirqs_on_caller(unsigned long ip) curr->hardirq_enable_ip = ip; curr->hardirq_enable_event = ++curr->irq_events; - debug_atomic_inc(&hardirqs_on_events); + debug_atomic_inc(hardirqs_on_events); } EXPORT_SYMBOL(trace_hardirqs_on_caller); @@ -2370,9 +2375,9 @@ void trace_hardirqs_off_caller(unsigned long ip) curr->hardirqs_enabled = 0; curr->hardirq_disable_ip = ip; curr->hardirq_disable_event = ++curr->irq_events; - debug_atomic_inc(&hardirqs_off_events); + debug_atomic_inc(hardirqs_off_events); } else - debug_atomic_inc(&redundant_hardirqs_off); + debug_atomic_inc(redundant_hardirqs_off); } EXPORT_SYMBOL(trace_hardirqs_off_caller); @@ -2396,7 +2401,7 @@ void trace_softirqs_on(unsigned long ip) return; if (curr->softirqs_enabled) { - debug_atomic_inc(&redundant_softirqs_on); + debug_atomic_inc(redundant_softirqs_on); return; } @@ -2406,7 +2411,7 @@ void trace_softirqs_on(unsigned long ip) curr->softirqs_enabled = 1; curr->softirq_enable_ip = ip; curr->softirq_enable_event = ++curr->irq_events; - debug_atomic_inc(&softirqs_on_events); + debug_atomic_inc(softirqs_on_events); /* * We are going to turn softirqs on, so set the * usage bit for all held locks, if hardirqs are @@ -2436,10 +2441,10 @@ void trace_softirqs_off(unsigned long ip) curr->softirqs_enabled = 0; curr->softirq_disable_ip = ip; curr->softirq_disable_event = ++curr->irq_events; - debug_atomic_inc(&softirqs_off_events); + debug_atomic_inc(softirqs_off_events); DEBUG_LOCKS_WARN_ON(!softirq_count()); } else - debug_atomic_inc(&redundant_softirqs_off); + debug_atomic_inc(redundant_softirqs_off); } static void __lockdep_trace_alloc(gfp_t gfp_mask, unsigned long flags) @@ -2644,7 +2649,7 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this, return 0; break; case LOCK_USED: - debug_atomic_dec(&nr_unused_locks); + debug_atomic_dec(nr_unused_locks); break; default: if (!debug_locks_off_graph_unlock()) @@ -2750,7 +2755,7 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass, if (!class) return 0; } - debug_atomic_inc((atomic_t *)&class->ops); + atomic_inc((atomic_t *)&class->ops); if (very_verbose(class)) { printk("\nacquire class [%p] %s", class->key, class->name); if (class->name_version > 1) @@ -3801,8 +3806,11 @@ void lockdep_rcu_dereference(const char *file, const int line) { struct task_struct *curr = current; +#ifndef CONFIG_PROVE_RCU_REPEATEDLY if (!debug_locks_off()) return; +#endif /* #ifdef CONFIG_PROVE_RCU_REPEATEDLY */ + /* Note: the following can be executed concurrently, so be careful. */ printk("\n===================================================\n"); printk( "[ INFO: suspicious rcu_dereference_check() usage. ]\n"); printk( "---------------------------------------------------\n"); diff --git a/kernel/lockdep_internals.h b/kernel/lockdep_internals.h index a2ee95a..4f560cf 100644 --- a/kernel/lockdep_internals.h +++ b/kernel/lockdep_internals.h @@ -110,30 +110,60 @@ lockdep_count_backward_deps(struct lock_class *class) #endif #ifdef CONFIG_DEBUG_LOCKDEP + +#include <asm/local.h> /* - * Various lockdep statistics: + * Various lockdep statistics. + * We want them per cpu as they are often accessed in fast path + * and we want to avoid too much cache bouncing. */ -extern atomic_t chain_lookup_hits; -extern atomic_t chain_lookup_misses; -extern atomic_t hardirqs_on_events; -extern atomic_t hardirqs_off_events; -extern atomic_t redundant_hardirqs_on; -extern atomic_t redundant_hardirqs_off; -extern atomic_t softirqs_on_events; -extern atomic_t softirqs_off_events; -extern atomic_t redundant_softirqs_on; -extern atomic_t redundant_softirqs_off; -extern atomic_t nr_unused_locks; -extern atomic_t nr_cyclic_checks; -extern atomic_t nr_cyclic_check_recursions; -extern atomic_t nr_find_usage_forwards_checks; -extern atomic_t nr_find_usage_forwards_recursions; -extern atomic_t nr_find_usage_backwards_checks; -extern atomic_t nr_find_usage_backwards_recursions; -# define debug_atomic_inc(ptr) atomic_inc(ptr) -# define debug_atomic_dec(ptr) atomic_dec(ptr) -# define debug_atomic_read(ptr) atomic_read(ptr) +struct lockdep_stats { + int chain_lookup_hits; + int chain_lookup_misses; + int hardirqs_on_events; + int hardirqs_off_events; + int redundant_hardirqs_on; + int redundant_hardirqs_off; + int softirqs_on_events; + int softirqs_off_events; + int redundant_softirqs_on; + int redundant_softirqs_off; + int nr_unused_locks; + int nr_cyclic_checks; + int nr_cyclic_check_recursions; + int nr_find_usage_forwards_checks; + int nr_find_usage_forwards_recursions; + int nr_find_usage_backwards_checks; + int nr_find_usage_backwards_recursions; +}; + +DECLARE_PER_CPU(struct lockdep_stats, lockdep_stats); + +#define __debug_atomic_inc(ptr) \ + this_cpu_inc(lockdep_stats.ptr); + +#define debug_atomic_inc(ptr) { \ + WARN_ON_ONCE(!irqs_disabled()); \ + __this_cpu_inc(lockdep_stats.ptr); \ +} + +#define debug_atomic_dec(ptr) { \ + WARN_ON_ONCE(!irqs_disabled()); \ + __this_cpu_dec(lockdep_stats.ptr); \ +} + +#define debug_atomic_read(ptr) ({ \ + struct lockdep_stats *__cpu_lockdep_stats; \ + unsigned long long __total = 0; \ + int __cpu; \ + for_each_possible_cpu(__cpu) { \ + __cpu_lockdep_stats = &per_cpu(lockdep_stats, __cpu); \ + __total += __cpu_lockdep_stats->ptr; \ + } \ + __total; \ +}) #else +# define __debug_atomic_inc(ptr) do { } while (0) # define debug_atomic_inc(ptr) do { } while (0) # define debug_atomic_dec(ptr) do { } while (0) # define debug_atomic_read(ptr) 0 diff --git a/kernel/lockdep_proc.c b/kernel/lockdep_proc.c index d4aba4f..59b76c8 100644 --- a/kernel/lockdep_proc.c +++ b/kernel/lockdep_proc.c @@ -184,34 +184,34 @@ static const struct file_operations proc_lockdep_chains_operations = { static void lockdep_stats_debug_show(struct seq_file *m) { #ifdef CONFIG_DEBUG_LOCKDEP - unsigned int hi1 = debug_atomic_read(&hardirqs_on_events), - hi2 = debug_atomic_read(&hardirqs_off_events), - hr1 = debug_atomic_read(&redundant_hardirqs_on), - hr2 = debug_atomic_read(&redundant_hardirqs_off), - si1 = debug_atomic_read(&softirqs_on_events), - si2 = debug_atomic_read(&softirqs_off_events), - sr1 = debug_atomic_read(&redundant_softirqs_on), - sr2 = debug_atomic_read(&redundant_softirqs_off); - - seq_printf(m, " chain lookup misses: %11u\n", - debug_atomic_read(&chain_lookup_misses)); - seq_printf(m, " chain lookup hits: %11u\n", - debug_atomic_read(&chain_lookup_hits)); - seq_printf(m, " cyclic checks: %11u\n", - debug_atomic_read(&nr_cyclic_checks)); - seq_printf(m, " find-mask forwards checks: %11u\n", - debug_atomic_read(&nr_find_usage_forwards_checks)); - seq_printf(m, " find-mask backwards checks: %11u\n", - debug_atomic_read(&nr_find_usage_backwards_checks)); - - seq_printf(m, " hardirq on events: %11u\n", hi1); - seq_printf(m, " hardirq off events: %11u\n", hi2); - seq_printf(m, " redundant hardirq ons: %11u\n", hr1); - seq_printf(m, " redundant hardirq offs: %11u\n", hr2); - seq_printf(m, " softirq on events: %11u\n", si1); - seq_printf(m, " softirq off events: %11u\n", si2); - seq_printf(m, " redundant softirq ons: %11u\n", sr1); - seq_printf(m, " redundant softirq offs: %11u\n", sr2); + unsigned long long hi1 = debug_atomic_read(hardirqs_on_events), + hi2 = debug_atomic_read(hardirqs_off_events), + hr1 = debug_atomic_read(redundant_hardirqs_on), + hr2 = debug_atomic_read(redundant_hardirqs_off), + si1 = debug_atomic_read(softirqs_on_events), + si2 = debug_atomic_read(softirqs_off_events), + sr1 = debug_atomic_read(redundant_softirqs_on), + sr2 = debug_atomic_read(redundant_softirqs_off); + + seq_printf(m, " chain lookup misses: %11llu\n", + debug_atomic_read(chain_lookup_misses)); + seq_printf(m, " chain lookup hits: %11llu\n", + debug_atomic_read(chain_lookup_hits)); + seq_printf(m, " cyclic checks: %11llu\n", + debug_atomic_read(nr_cyclic_checks)); + seq_printf(m, " find-mask forwards checks: %11llu\n", + debug_atomic_read(nr_find_usage_forwards_checks)); + seq_printf(m, " find-mask backwards checks: %11llu\n", + debug_atomic_read(nr_find_usage_backwards_checks)); + + seq_printf(m, " hardirq on events: %11llu\n", hi1); + seq_printf(m, " hardirq off events: %11llu\n", hi2); + seq_printf(m, " redundant hardirq ons: %11llu\n", hr1); + seq_printf(m, " redundant hardirq offs: %11llu\n", hr2); + seq_printf(m, " softirq on events: %11llu\n", si1); + seq_printf(m, " softirq off events: %11llu\n", si2); + seq_printf(m, " redundant softirq ons: %11llu\n", sr1); + seq_printf(m, " redundant softirq offs: %11llu\n", sr2); #endif } @@ -263,7 +263,7 @@ static int lockdep_stats_show(struct seq_file *m, void *v) #endif } #ifdef CONFIG_DEBUG_LOCKDEP - DEBUG_LOCKS_WARN_ON(debug_atomic_read(&nr_unused_locks) != nr_unused); + DEBUG_LOCKS_WARN_ON(debug_atomic_read(nr_unused_locks) != nr_unused); #endif seq_printf(m, " lock-classes: %11lu [max: %lu]\n", nr_lock_classes, MAX_LOCKDEP_KEYS); diff --git a/kernel/module.c b/kernel/module.c index 1016b75..e256458 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -59,8 +59,6 @@ #define CREATE_TRACE_POINTS #include <trace/events/module.h> -EXPORT_TRACEPOINT_SYMBOL(module_get); - #if 0 #define DEBUGP printk #else @@ -515,6 +513,9 @@ MODINFO_ATTR(srcversion); static char last_unloaded_module[MODULE_NAME_LEN+1]; #ifdef CONFIG_MODULE_UNLOAD + +EXPORT_TRACEPOINT_SYMBOL(module_get); + /* Init the unload section of the module. */ static void module_unload_init(struct module *mod) { @@ -723,16 +724,8 @@ SYSCALL_DEFINE2(delete_module, const char __user *, name_user, return -EFAULT; name[MODULE_NAME_LEN-1] = '\0'; - /* Create stop_machine threads since free_module relies on - * a non-failing stop_machine call. */ - ret = stop_machine_create(); - if (ret) - return ret; - - if (mutex_lock_interruptible(&module_mutex) != 0) { - ret = -EINTR; - goto out_stop; - } + if (mutex_lock_interruptible(&module_mutex) != 0) + return -EINTR; mod = find_module(name); if (!mod) { @@ -792,8 +785,6 @@ SYSCALL_DEFINE2(delete_module, const char __user *, name_user, out: mutex_unlock(&module_mutex); -out_stop: - stop_machine_destroy(); return ret; } @@ -867,8 +858,7 @@ void module_put(struct module *module) smp_wmb(); /* see comment in module_refcount */ __this_cpu_inc(module->refptr->decs); - trace_module_put(module, _RET_IP_, - __this_cpu_read(module->refptr->decs)); + trace_module_put(module, _RET_IP_); /* Maybe they're waiting for us to drop reference? */ if (unlikely(!module_is_live(module))) wake_up_process(module->waiter); diff --git a/kernel/perf_event.c b/kernel/perf_event.c index 7e3bcf1..2a060be 100644 --- a/kernel/perf_event.c +++ b/kernel/perf_event.c @@ -4050,19 +4050,46 @@ static inline u64 swevent_hash(u64 type, u32 event_id) return hash_64(val, SWEVENT_HLIST_BITS); } -static struct hlist_head * -find_swevent_head(struct perf_cpu_context *ctx, u64 type, u32 event_id) +static inline struct hlist_head * +__find_swevent_head(struct swevent_hlist *hlist, u64 type, u32 event_id) { - u64 hash; - struct swevent_hlist *hlist; + u64 hash = swevent_hash(type, event_id); + + return &hlist->heads[hash]; +} - hash = swevent_hash(type, event_id); +/* For the read side: events when they trigger */ +static inline struct hlist_head * +find_swevent_head_rcu(struct perf_cpu_context *ctx, u64 type, u32 event_id) +{ + struct swevent_hlist *hlist; hlist = rcu_dereference(ctx->swevent_hlist); if (!hlist) return NULL; - return &hlist->heads[hash]; + return __find_swevent_head(hlist, type, event_id); +} + +/* For the event head insertion and removal in the hlist */ +static inline struct hlist_head * +find_swevent_head(struct perf_cpu_context *ctx, struct perf_event *event) +{ + struct swevent_hlist *hlist; + u32 event_id = event->attr.config; + u64 type = event->attr.type; + + /* + * Event scheduling is always serialized against hlist allocation + * and release. Which makes the protected version suitable here. + * The context lock guarantees that. + */ + hlist = rcu_dereference_protected(ctx->swevent_hlist, + lockdep_is_held(&event->ctx->lock)); + if (!hlist) + return NULL; + + return __find_swevent_head(hlist, type, event_id); } static void do_perf_sw_event(enum perf_type_id type, u32 event_id, @@ -4079,7 +4106,7 @@ static void do_perf_sw_event(enum perf_type_id type, u32 event_id, rcu_read_lock(); - head = find_swevent_head(cpuctx, type, event_id); + head = find_swevent_head_rcu(cpuctx, type, event_id); if (!head) goto end; @@ -4162,7 +4189,7 @@ static int perf_swevent_enable(struct perf_event *event) perf_swevent_set_period(event); } - head = find_swevent_head(cpuctx, event->attr.type, event->attr.config); + head = find_swevent_head(cpuctx, event); if (WARN_ON_ONCE(!head)) return -EINVAL; @@ -4350,6 +4377,14 @@ static const struct pmu perf_ops_task_clock = { .read = task_clock_perf_event_read, }; +/* Deref the hlist from the update side */ +static inline struct swevent_hlist * +swevent_hlist_deref(struct perf_cpu_context *cpuctx) +{ + return rcu_dereference_protected(cpuctx->swevent_hlist, + lockdep_is_held(&cpuctx->hlist_mutex)); +} + static void swevent_hlist_release_rcu(struct rcu_head *rcu_head) { struct swevent_hlist *hlist; @@ -4360,12 +4395,11 @@ static void swevent_hlist_release_rcu(struct rcu_head *rcu_head) static void swevent_hlist_release(struct perf_cpu_context *cpuctx) { - struct swevent_hlist *hlist; + struct swevent_hlist *hlist = swevent_hlist_deref(cpuctx); - if (!cpuctx->swevent_hlist) + if (!hlist) return; - hlist = cpuctx->swevent_hlist; rcu_assign_pointer(cpuctx->swevent_hlist, NULL); call_rcu(&hlist->rcu_head, swevent_hlist_release_rcu); } @@ -4402,7 +4436,7 @@ static int swevent_hlist_get_cpu(struct perf_event *event, int cpu) mutex_lock(&cpuctx->hlist_mutex); - if (!cpuctx->swevent_hlist && cpu_online(cpu)) { + if (!swevent_hlist_deref(cpuctx) && cpu_online(cpu)) { struct swevent_hlist *hlist; hlist = kzalloc(sizeof(*hlist), GFP_KERNEL); diff --git a/kernel/profile.c b/kernel/profile.c index a55d3a3..dfadc5b 100644 --- a/kernel/profile.c +++ b/kernel/profile.c @@ -127,8 +127,10 @@ int __ref profile_init(void) return 0; prof_buffer = vmalloc(buffer_bytes); - if (prof_buffer) + if (prof_buffer) { + memset(prof_buffer, 0, buffer_bytes); return 0; + } free_cpumask_var(prof_cpu_mask); return -ENOMEM; diff --git a/kernel/ptrace.c b/kernel/ptrace.c index 9fb5123..6af9cdd 100644 --- a/kernel/ptrace.c +++ b/kernel/ptrace.c @@ -14,7 +14,6 @@ #include <linux/mm.h> #include <linux/highmem.h> #include <linux/pagemap.h> -#include <linux/smp_lock.h> #include <linux/ptrace.h> #include <linux/security.h> #include <linux/signal.h> @@ -665,10 +664,6 @@ SYSCALL_DEFINE4(ptrace, long, request, long, pid, long, addr, long, data) struct task_struct *child; long ret; - /* - * This lock_kernel fixes a subtle race with suid exec - */ - lock_kernel(); if (request == PTRACE_TRACEME) { ret = ptrace_traceme(); if (!ret) @@ -702,7 +697,6 @@ SYSCALL_DEFINE4(ptrace, long, request, long, pid, long, addr, long, data) out_put_task_struct: put_task_struct(child); out: - unlock_kernel(); return ret; } @@ -812,10 +806,6 @@ asmlinkage long compat_sys_ptrace(compat_long_t request, compat_long_t pid, struct task_struct *child; long ret; - /* - * This lock_kernel fixes a subtle race with suid exec - */ - lock_kernel(); if (request == PTRACE_TRACEME) { ret = ptrace_traceme(); goto out; @@ -845,7 +835,6 @@ asmlinkage long compat_sys_ptrace(compat_long_t request, compat_long_t pid, out_put_task_struct: put_task_struct(child); out: - unlock_kernel(); return ret; } #endif /* CONFIG_COMPAT */ diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c index 03a7ea1..72a8dc9 100644 --- a/kernel/rcupdate.c +++ b/kernel/rcupdate.c @@ -44,7 +44,6 @@ #include <linux/cpu.h> #include <linux/mutex.h> #include <linux/module.h> -#include <linux/kernel_stat.h> #include <linux/hardirq.h> #ifdef CONFIG_DEBUG_LOCK_ALLOC @@ -64,9 +63,6 @@ struct lockdep_map rcu_sched_lock_map = EXPORT_SYMBOL_GPL(rcu_sched_lock_map); #endif -int rcu_scheduler_active __read_mostly; -EXPORT_SYMBOL_GPL(rcu_scheduler_active); - #ifdef CONFIG_DEBUG_LOCK_ALLOC int debug_lockdep_rcu_enabled(void) @@ -97,21 +93,6 @@ EXPORT_SYMBOL_GPL(rcu_read_lock_bh_held); #endif /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */ /* - * This function is invoked towards the end of the scheduler's initialization - * process. Before this is called, the idle task might contain - * RCU read-side critical sections (during which time, this idle - * task is booting the system). After this function is called, the - * idle tasks are prohibited from containing RCU read-side critical - * sections. - */ -void rcu_scheduler_starting(void) -{ - WARN_ON(num_online_cpus() != 1); - WARN_ON(nr_context_switches() > 0); - rcu_scheduler_active = 1; -} - -/* * Awaken the corresponding synchronize_rcu() instance now that a * grace period has elapsed. */ @@ -122,3 +103,14 @@ void wakeme_after_rcu(struct rcu_head *head) rcu = container_of(head, struct rcu_synchronize, head); complete(&rcu->completion); } + +#ifdef CONFIG_PROVE_RCU +/* + * wrapper function to avoid #include problems. + */ +int rcu_my_thread_group_empty(void) +{ + return thread_group_empty(current); +} +EXPORT_SYMBOL_GPL(rcu_my_thread_group_empty); +#endif /* #ifdef CONFIG_PROVE_RCU */ diff --git a/kernel/rcutiny.c b/kernel/rcutiny.c index 9f6d9ff..38729d3 100644 --- a/kernel/rcutiny.c +++ b/kernel/rcutiny.c @@ -44,9 +44,9 @@ struct rcu_ctrlblk { }; /* Definition for rcupdate control block. */ -static struct rcu_ctrlblk rcu_ctrlblk = { - .donetail = &rcu_ctrlblk.rcucblist, - .curtail = &rcu_ctrlblk.rcucblist, +static struct rcu_ctrlblk rcu_sched_ctrlblk = { + .donetail = &rcu_sched_ctrlblk.rcucblist, + .curtail = &rcu_sched_ctrlblk.rcucblist, }; static struct rcu_ctrlblk rcu_bh_ctrlblk = { @@ -54,6 +54,11 @@ static struct rcu_ctrlblk rcu_bh_ctrlblk = { .curtail = &rcu_bh_ctrlblk.rcucblist, }; +#ifdef CONFIG_DEBUG_LOCK_ALLOC +int rcu_scheduler_active __read_mostly; +EXPORT_SYMBOL_GPL(rcu_scheduler_active); +#endif /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */ + #ifdef CONFIG_NO_HZ static long rcu_dynticks_nesting = 1; @@ -108,7 +113,8 @@ static int rcu_qsctr_help(struct rcu_ctrlblk *rcp) */ void rcu_sched_qs(int cpu) { - if (rcu_qsctr_help(&rcu_ctrlblk) + rcu_qsctr_help(&rcu_bh_ctrlblk)) + if (rcu_qsctr_help(&rcu_sched_ctrlblk) + + rcu_qsctr_help(&rcu_bh_ctrlblk)) raise_softirq(RCU_SOFTIRQ); } @@ -173,7 +179,7 @@ static void __rcu_process_callbacks(struct rcu_ctrlblk *rcp) */ static void rcu_process_callbacks(struct softirq_action *unused) { - __rcu_process_callbacks(&rcu_ctrlblk); + __rcu_process_callbacks(&rcu_sched_ctrlblk); __rcu_process_callbacks(&rcu_bh_ctrlblk); } @@ -187,7 +193,8 @@ static void rcu_process_callbacks(struct softirq_action *unused) * * Cool, huh? (Due to Josh Triplett.) * - * But we want to make this a static inline later. + * But we want to make this a static inline later. The cond_resched() + * currently makes this problematic. */ void synchronize_sched(void) { @@ -195,12 +202,6 @@ void synchronize_sched(void) } EXPORT_SYMBOL_GPL(synchronize_sched); -void synchronize_rcu_bh(void) -{ - synchronize_sched(); -} -EXPORT_SYMBOL_GPL(synchronize_rcu_bh); - /* * Helper function for call_rcu() and call_rcu_bh(). */ @@ -226,7 +227,7 @@ static void __call_rcu(struct rcu_head *head, */ void call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *rcu)) { - __call_rcu(head, func, &rcu_ctrlblk); + __call_rcu(head, func, &rcu_sched_ctrlblk); } EXPORT_SYMBOL_GPL(call_rcu); @@ -244,11 +245,13 @@ void rcu_barrier(void) { struct rcu_synchronize rcu; + init_rcu_head_on_stack(&rcu.head); init_completion(&rcu.completion); /* Will wake me after RCU finished. */ call_rcu(&rcu.head, wakeme_after_rcu); /* Wait for it. */ wait_for_completion(&rcu.completion); + destroy_rcu_head_on_stack(&rcu.head); } EXPORT_SYMBOL_GPL(rcu_barrier); @@ -256,11 +259,13 @@ void rcu_barrier_bh(void) { struct rcu_synchronize rcu; + init_rcu_head_on_stack(&rcu.head); init_completion(&rcu.completion); /* Will wake me after RCU finished. */ call_rcu_bh(&rcu.head, wakeme_after_rcu); /* Wait for it. */ wait_for_completion(&rcu.completion); + destroy_rcu_head_on_stack(&rcu.head); } EXPORT_SYMBOL_GPL(rcu_barrier_bh); @@ -268,11 +273,13 @@ void rcu_barrier_sched(void) { struct rcu_synchronize rcu; + init_rcu_head_on_stack(&rcu.head); init_completion(&rcu.completion); /* Will wake me after RCU finished. */ call_rcu_sched(&rcu.head, wakeme_after_rcu); /* Wait for it. */ wait_for_completion(&rcu.completion); + destroy_rcu_head_on_stack(&rcu.head); } EXPORT_SYMBOL_GPL(rcu_barrier_sched); @@ -280,3 +287,5 @@ void __init rcu_init(void) { open_softirq(RCU_SOFTIRQ, rcu_process_callbacks); } + +#include "rcutiny_plugin.h" diff --git a/kernel/rcutiny_plugin.h b/kernel/rcutiny_plugin.h new file mode 100644 index 0000000..d223a92bc --- /dev/null +++ b/kernel/rcutiny_plugin.h @@ -0,0 +1,39 @@ +/* + * Read-Copy Update mechanism for mutual exclusion (tree-based version) + * Internal non-public definitions that provide either classic + * or preemptable semantics. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + * Copyright IBM Corporation, 2009 + * + * Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com> + */ + +#ifdef CONFIG_DEBUG_LOCK_ALLOC + +#include <linux/kernel_stat.h> + +/* + * During boot, we forgive RCU lockdep issues. After this function is + * invoked, we start taking RCU lockdep issues seriously. + */ +void rcu_scheduler_starting(void) +{ + WARN_ON(nr_context_switches() > 0); + rcu_scheduler_active = 1; +} + +#endif /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */ diff --git a/kernel/rcutorture.c b/kernel/rcutorture.c index 58df55b..6535ac8 100644 --- a/kernel/rcutorture.c +++ b/kernel/rcutorture.c @@ -464,9 +464,11 @@ static void rcu_bh_torture_synchronize(void) { struct rcu_bh_torture_synchronize rcu; + init_rcu_head_on_stack(&rcu.head); init_completion(&rcu.completion); call_rcu_bh(&rcu.head, rcu_bh_torture_wakeme_after_cb); wait_for_completion(&rcu.completion); + destroy_rcu_head_on_stack(&rcu.head); } static struct rcu_torture_ops rcu_bh_ops = { @@ -669,7 +671,7 @@ static struct rcu_torture_ops sched_expedited_ops = { .sync = synchronize_sched_expedited, .cb_barrier = NULL, .fqs = rcu_sched_force_quiescent_state, - .stats = rcu_expedited_torture_stats, + .stats = NULL, .irq_capable = 1, .name = "sched_expedited" }; diff --git a/kernel/rcutree.c b/kernel/rcutree.c index 3ec8160..d443734 100644 --- a/kernel/rcutree.c +++ b/kernel/rcutree.c @@ -46,6 +46,7 @@ #include <linux/cpu.h> #include <linux/mutex.h> #include <linux/time.h> +#include <linux/kernel_stat.h> #include "rcutree.h" @@ -53,8 +54,8 @@ static struct lock_class_key rcu_node_class[NUM_RCU_LVLS]; -#define RCU_STATE_INITIALIZER(name) { \ - .level = { &name.node[0] }, \ +#define RCU_STATE_INITIALIZER(structname) { \ + .level = { &structname.node[0] }, \ .levelcnt = { \ NUM_RCU_LVL_0, /* root of hierarchy. */ \ NUM_RCU_LVL_1, \ @@ -65,13 +66,14 @@ static struct lock_class_key rcu_node_class[NUM_RCU_LVLS]; .signaled = RCU_GP_IDLE, \ .gpnum = -300, \ .completed = -300, \ - .onofflock = __RAW_SPIN_LOCK_UNLOCKED(&name.onofflock), \ + .onofflock = __RAW_SPIN_LOCK_UNLOCKED(&structname.onofflock), \ .orphan_cbs_list = NULL, \ - .orphan_cbs_tail = &name.orphan_cbs_list, \ + .orphan_cbs_tail = &structname.orphan_cbs_list, \ .orphan_qlen = 0, \ - .fqslock = __RAW_SPIN_LOCK_UNLOCKED(&name.fqslock), \ + .fqslock = __RAW_SPIN_LOCK_UNLOCKED(&structname.fqslock), \ .n_force_qs = 0, \ .n_force_qs_ngp = 0, \ + .name = #structname, \ } struct rcu_state rcu_sched_state = RCU_STATE_INITIALIZER(rcu_sched_state); @@ -80,6 +82,9 @@ DEFINE_PER_CPU(struct rcu_data, rcu_sched_data); struct rcu_state rcu_bh_state = RCU_STATE_INITIALIZER(rcu_bh_state); DEFINE_PER_CPU(struct rcu_data, rcu_bh_data); +int rcu_scheduler_active __read_mostly; +EXPORT_SYMBOL_GPL(rcu_scheduler_active); + /* * Return true if an RCU grace period is in progress. The ACCESS_ONCE()s * permit this function to be invoked without holding the root rcu_node @@ -97,25 +102,32 @@ static int rcu_gp_in_progress(struct rcu_state *rsp) */ void rcu_sched_qs(int cpu) { - struct rcu_data *rdp; + struct rcu_data *rdp = &per_cpu(rcu_sched_data, cpu); - rdp = &per_cpu(rcu_sched_data, cpu); rdp->passed_quiesc_completed = rdp->gpnum - 1; barrier(); rdp->passed_quiesc = 1; - rcu_preempt_note_context_switch(cpu); } void rcu_bh_qs(int cpu) { - struct rcu_data *rdp; + struct rcu_data *rdp = &per_cpu(rcu_bh_data, cpu); - rdp = &per_cpu(rcu_bh_data, cpu); rdp->passed_quiesc_completed = rdp->gpnum - 1; barrier(); rdp->passed_quiesc = 1; } +/* + * Note a context switch. This is a quiescent state for RCU-sched, + * and requires special handling for preemptible RCU. + */ +void rcu_note_context_switch(int cpu) +{ + rcu_sched_qs(cpu); + rcu_preempt_note_context_switch(cpu); +} + #ifdef CONFIG_NO_HZ DEFINE_PER_CPU(struct rcu_dynticks, rcu_dynticks) = { .dynticks_nesting = 1, @@ -438,6 +450,8 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp) #ifdef CONFIG_RCU_CPU_STALL_DETECTOR +int rcu_cpu_stall_panicking __read_mostly; + static void record_gp_stall_check_time(struct rcu_state *rsp) { rsp->gp_start = jiffies; @@ -470,7 +484,8 @@ static void print_other_cpu_stall(struct rcu_state *rsp) /* OK, time to rat on our buddy... */ - printk(KERN_ERR "INFO: RCU detected CPU stalls:"); + printk(KERN_ERR "INFO: %s detected stalls on CPUs/tasks: {", + rsp->name); rcu_for_each_leaf_node(rsp, rnp) { raw_spin_lock_irqsave(&rnp->lock, flags); rcu_print_task_stall(rnp); @@ -481,7 +496,7 @@ static void print_other_cpu_stall(struct rcu_state *rsp) if (rnp->qsmask & (1UL << cpu)) printk(" %d", rnp->grplo + cpu); } - printk(" (detected by %d, t=%ld jiffies)\n", + printk("} (detected by %d, t=%ld jiffies)\n", smp_processor_id(), (long)(jiffies - rsp->gp_start)); trigger_all_cpu_backtrace(); @@ -497,8 +512,8 @@ static void print_cpu_stall(struct rcu_state *rsp) unsigned long flags; struct rcu_node *rnp = rcu_get_root(rsp); - printk(KERN_ERR "INFO: RCU detected CPU %d stall (t=%lu jiffies)\n", - smp_processor_id(), jiffies - rsp->gp_start); + printk(KERN_ERR "INFO: %s detected stall on CPU %d (t=%lu jiffies)\n", + rsp->name, smp_processor_id(), jiffies - rsp->gp_start); trigger_all_cpu_backtrace(); raw_spin_lock_irqsave(&rnp->lock, flags); @@ -515,6 +530,8 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp) long delta; struct rcu_node *rnp; + if (rcu_cpu_stall_panicking) + return; delta = jiffies - rsp->jiffies_stall; rnp = rdp->mynode; if ((rnp->qsmask & rdp->grpmask) && delta >= 0) { @@ -529,6 +546,21 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp) } } +static int rcu_panic(struct notifier_block *this, unsigned long ev, void *ptr) +{ + rcu_cpu_stall_panicking = 1; + return NOTIFY_DONE; +} + +static struct notifier_block rcu_panic_block = { + .notifier_call = rcu_panic, +}; + +static void __init check_cpu_stall_init(void) +{ + atomic_notifier_chain_register(&panic_notifier_list, &rcu_panic_block); +} + #else /* #ifdef CONFIG_RCU_CPU_STALL_DETECTOR */ static void record_gp_stall_check_time(struct rcu_state *rsp) @@ -539,6 +571,10 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp) { } +static void __init check_cpu_stall_init(void) +{ +} + #endif /* #else #ifdef CONFIG_RCU_CPU_STALL_DETECTOR */ /* @@ -1125,8 +1161,6 @@ static void rcu_do_batch(struct rcu_state *rsp, struct rcu_data *rdp) */ void rcu_check_callbacks(int cpu, int user) { - if (!rcu_pending(cpu)) - return; /* if nothing for RCU to do. */ if (user || (idle_cpu(cpu) && rcu_scheduler_active && !in_softirq() && hardirq_count() <= (1 << HARDIRQ_SHIFT))) { @@ -1158,7 +1192,8 @@ void rcu_check_callbacks(int cpu, int user) rcu_bh_qs(cpu); } rcu_preempt_check_callbacks(cpu); - raise_softirq(RCU_SOFTIRQ); + if (rcu_pending(cpu)) + raise_softirq(RCU_SOFTIRQ); } #ifdef CONFIG_SMP @@ -1236,11 +1271,11 @@ static void force_quiescent_state(struct rcu_state *rsp, int relaxed) break; /* grace period idle or initializing, ignore. */ case RCU_SAVE_DYNTICK: - - raw_spin_unlock(&rnp->lock); /* irqs remain disabled */ if (RCU_SIGNAL_INIT != RCU_SAVE_DYNTICK) break; /* So gcc recognizes the dead code. */ + raw_spin_unlock(&rnp->lock); /* irqs remain disabled */ + /* Record dyntick-idle state. */ force_qs_rnp(rsp, dyntick_save_progress_counter); raw_spin_lock(&rnp->lock); /* irqs already disabled */ @@ -1449,11 +1484,13 @@ void synchronize_sched(void) if (rcu_blocking_is_gp()) return; + init_rcu_head_on_stack(&rcu.head); init_completion(&rcu.completion); /* Will wake me after RCU finished. */ call_rcu_sched(&rcu.head, wakeme_after_rcu); /* Wait for it. */ wait_for_completion(&rcu.completion); + destroy_rcu_head_on_stack(&rcu.head); } EXPORT_SYMBOL_GPL(synchronize_sched); @@ -1473,11 +1510,13 @@ void synchronize_rcu_bh(void) if (rcu_blocking_is_gp()) return; + init_rcu_head_on_stack(&rcu.head); init_completion(&rcu.completion); /* Will wake me after RCU finished. */ call_rcu_bh(&rcu.head, wakeme_after_rcu); /* Wait for it. */ wait_for_completion(&rcu.completion); + destroy_rcu_head_on_stack(&rcu.head); } EXPORT_SYMBOL_GPL(synchronize_rcu_bh); @@ -1498,8 +1537,20 @@ static int __rcu_pending(struct rcu_state *rsp, struct rcu_data *rdp) check_cpu_stall(rsp, rdp); /* Is the RCU core waiting for a quiescent state from this CPU? */ - if (rdp->qs_pending) { + if (rdp->qs_pending && !rdp->passed_quiesc) { + + /* + * If force_quiescent_state() coming soon and this CPU + * needs a quiescent state, and this is either RCU-sched + * or RCU-bh, force a local reschedule. + */ rdp->n_rp_qs_pending++; + if (!rdp->preemptable && + ULONG_CMP_LT(ACCESS_ONCE(rsp->jiffies_force_qs) - 1, + jiffies)) + set_need_resched(); + } else if (rdp->qs_pending && rdp->passed_quiesc) { + rdp->n_rp_report_qs++; return 1; } @@ -1767,6 +1818,21 @@ static int __cpuinit rcu_cpu_notify(struct notifier_block *self, } /* + * This function is invoked towards the end of the scheduler's initialization + * process. Before this is called, the idle task might contain + * RCU read-side critical sections (during which time, this idle + * task is booting the system). After this function is called, the + * idle tasks are prohibited from containing RCU read-side critical + * sections. This function also enables RCU lockdep checking. + */ +void rcu_scheduler_starting(void) +{ + WARN_ON(num_online_cpus() != 1); + WARN_ON(nr_context_switches() > 0); + rcu_scheduler_active = 1; +} + +/* * Compute the per-level fanout, either using the exact fanout specified * or balancing the tree, depending on CONFIG_RCU_FANOUT_EXACT. */ @@ -1849,6 +1915,14 @@ static void __init rcu_init_one(struct rcu_state *rsp) INIT_LIST_HEAD(&rnp->blocked_tasks[3]); } } + + rnp = rsp->level[NUM_RCU_LVLS - 1]; + for_each_possible_cpu(i) { + while (i > rnp->grphi) + rnp++; + rsp->rda[i]->mynode = rnp; + rcu_boot_init_percpu_data(i, rsp); + } } /* @@ -1859,19 +1933,11 @@ static void __init rcu_init_one(struct rcu_state *rsp) #define RCU_INIT_FLAVOR(rsp, rcu_data) \ do { \ int i; \ - int j; \ - struct rcu_node *rnp; \ \ - rcu_init_one(rsp); \ - rnp = (rsp)->level[NUM_RCU_LVLS - 1]; \ - j = 0; \ for_each_possible_cpu(i) { \ - if (i > rnp[j].grphi) \ - j++; \ - per_cpu(rcu_data, i).mynode = &rnp[j]; \ (rsp)->rda[i] = &per_cpu(rcu_data, i); \ - rcu_boot_init_percpu_data(i, rsp); \ } \ + rcu_init_one(rsp); \ } while (0) void __init rcu_init(void) @@ -1879,12 +1945,6 @@ void __init rcu_init(void) int cpu; rcu_bootup_announce(); -#ifdef CONFIG_RCU_CPU_STALL_DETECTOR - printk(KERN_INFO "RCU-based detection of stalled CPUs is enabled.\n"); -#endif /* #ifdef CONFIG_RCU_CPU_STALL_DETECTOR */ -#if NUM_RCU_LVL_4 != 0 - printk(KERN_INFO "Experimental four-level hierarchy is enabled.\n"); -#endif /* #if NUM_RCU_LVL_4 != 0 */ RCU_INIT_FLAVOR(&rcu_sched_state, rcu_sched_data); RCU_INIT_FLAVOR(&rcu_bh_state, rcu_bh_data); __rcu_init_preempt(); @@ -1898,6 +1958,7 @@ void __init rcu_init(void) cpu_notifier(rcu_cpu_notify, 0); for_each_online_cpu(cpu) rcu_cpu_notify(NULL, CPU_UP_PREPARE, (void *)(long)cpu); + check_cpu_stall_init(); } #include "rcutree_plugin.h" diff --git a/kernel/rcutree.h b/kernel/rcutree.h index 4a525a3..14c040b 100644 --- a/kernel/rcutree.h +++ b/kernel/rcutree.h @@ -223,6 +223,7 @@ struct rcu_data { /* 5) __rcu_pending() statistics. */ unsigned long n_rcu_pending; /* rcu_pending() calls since boot. */ unsigned long n_rp_qs_pending; + unsigned long n_rp_report_qs; unsigned long n_rp_cb_ready; unsigned long n_rp_cpu_needs_gp; unsigned long n_rp_gp_completed; @@ -326,6 +327,7 @@ struct rcu_state { unsigned long jiffies_stall; /* Time at which to check */ /* for CPU stalls. */ #endif /* #ifdef CONFIG_RCU_CPU_STALL_DETECTOR */ + char *name; /* Name of structure. */ }; /* Return values for rcu_preempt_offline_tasks(). */ diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h index 79b53bd..0e4f420 100644 --- a/kernel/rcutree_plugin.h +++ b/kernel/rcutree_plugin.h @@ -26,6 +26,45 @@ #include <linux/delay.h> +/* + * Check the RCU kernel configuration parameters and print informative + * messages about anything out of the ordinary. If you like #ifdef, you + * will love this function. + */ +static void __init rcu_bootup_announce_oddness(void) +{ +#ifdef CONFIG_RCU_TRACE + printk(KERN_INFO "\tRCU debugfs-based tracing is enabled.\n"); +#endif +#if (defined(CONFIG_64BIT) && CONFIG_RCU_FANOUT != 64) || (!defined(CONFIG_64BIT) && CONFIG_RCU_FANOUT != 32) + printk(KERN_INFO "\tCONFIG_RCU_FANOUT set to non-default value of %d\n", + CONFIG_RCU_FANOUT); +#endif +#ifdef CONFIG_RCU_FANOUT_EXACT + printk(KERN_INFO "\tHierarchical RCU autobalancing is disabled.\n"); +#endif +#ifdef CONFIG_RCU_FAST_NO_HZ + printk(KERN_INFO + "\tRCU dyntick-idle grace-period acceleration is enabled.\n"); +#endif +#ifdef CONFIG_PROVE_RCU + printk(KERN_INFO "\tRCU lockdep checking is enabled.\n"); +#endif +#ifdef CONFIG_RCU_TORTURE_TEST_RUNNABLE + printk(KERN_INFO "\tRCU torture testing starts during boot.\n"); +#endif +#ifndef CONFIG_RCU_CPU_STALL_DETECTOR + printk(KERN_INFO + "\tRCU-based detection of stalled CPUs is disabled.\n"); +#endif +#ifndef CONFIG_RCU_CPU_STALL_VERBOSE + printk(KERN_INFO "\tVerbose stalled-CPUs detection is disabled.\n"); +#endif +#if NUM_RCU_LVL_4 != 0 + printk(KERN_INFO "\tExperimental four-level hierarchy is enabled.\n"); +#endif +} + #ifdef CONFIG_TREE_PREEMPT_RCU struct rcu_state rcu_preempt_state = RCU_STATE_INITIALIZER(rcu_preempt_state); @@ -38,8 +77,8 @@ static int rcu_preempted_readers_exp(struct rcu_node *rnp); */ static void __init rcu_bootup_announce(void) { - printk(KERN_INFO - "Experimental preemptable hierarchical RCU implementation.\n"); + printk(KERN_INFO "Preemptable hierarchical RCU implementation.\n"); + rcu_bootup_announce_oddness(); } /* @@ -75,13 +114,19 @@ EXPORT_SYMBOL_GPL(rcu_force_quiescent_state); * that this just means that the task currently running on the CPU is * not in a quiescent state. There might be any number of tasks blocked * while in an RCU read-side critical section. + * + * Unlike the other rcu_*_qs() functions, callers to this function + * must disable irqs in order to protect the assignment to + * ->rcu_read_unlock_special. */ static void rcu_preempt_qs(int cpu) { struct rcu_data *rdp = &per_cpu(rcu_preempt_data, cpu); + rdp->passed_quiesc_completed = rdp->gpnum - 1; barrier(); rdp->passed_quiesc = 1; + current->rcu_read_unlock_special &= ~RCU_READ_UNLOCK_NEED_QS; } /* @@ -144,9 +189,8 @@ static void rcu_preempt_note_context_switch(int cpu) * grace period, then the fact that the task has been enqueued * means that we continue to block the current grace period. */ - rcu_preempt_qs(cpu); local_irq_save(flags); - t->rcu_read_unlock_special &= ~RCU_READ_UNLOCK_NEED_QS; + rcu_preempt_qs(cpu); local_irq_restore(flags); } @@ -236,7 +280,6 @@ static void rcu_read_unlock_special(struct task_struct *t) */ special = t->rcu_read_unlock_special; if (special & RCU_READ_UNLOCK_NEED_QS) { - t->rcu_read_unlock_special &= ~RCU_READ_UNLOCK_NEED_QS; rcu_preempt_qs(smp_processor_id()); } @@ -473,7 +516,6 @@ static void rcu_preempt_check_callbacks(int cpu) struct task_struct *t = current; if (t->rcu_read_lock_nesting == 0) { - t->rcu_read_unlock_special &= ~RCU_READ_UNLOCK_NEED_QS; rcu_preempt_qs(cpu); return; } @@ -515,11 +557,13 @@ void synchronize_rcu(void) if (!rcu_scheduler_active) return; + init_rcu_head_on_stack(&rcu.head); init_completion(&rcu.completion); /* Will wake me after RCU finished. */ call_rcu(&rcu.head, wakeme_after_rcu); /* Wait for it. */ wait_for_completion(&rcu.completion); + destroy_rcu_head_on_stack(&rcu.head); } EXPORT_SYMBOL_GPL(synchronize_rcu); @@ -754,6 +798,7 @@ void exit_rcu(void) static void __init rcu_bootup_announce(void) { printk(KERN_INFO "Hierarchical RCU implementation.\n"); + rcu_bootup_announce_oddness(); } /* @@ -1008,6 +1053,8 @@ static DEFINE_PER_CPU(unsigned long, rcu_dyntick_holdoff); int rcu_needs_cpu(int cpu) { int c = 0; + int snap; + int snap_nmi; int thatcpu; /* Check for being in the holdoff period. */ @@ -1015,12 +1062,18 @@ int rcu_needs_cpu(int cpu) return rcu_needs_cpu_quick_check(cpu); /* Don't bother unless we are the last non-dyntick-idle CPU. */ - for_each_cpu_not(thatcpu, nohz_cpu_mask) - if (thatcpu != cpu) { + for_each_online_cpu(thatcpu) { + if (thatcpu == cpu) + continue; + snap = per_cpu(rcu_dynticks, thatcpu).dynticks; + snap_nmi = per_cpu(rcu_dynticks, thatcpu).dynticks_nmi; + smp_mb(); /* Order sampling of snap with end of grace period. */ + if (((snap & 0x1) != 0) || ((snap_nmi & 0x1) != 0)) { per_cpu(rcu_dyntick_drain, cpu) = 0; per_cpu(rcu_dyntick_holdoff, cpu) = jiffies - 1; return rcu_needs_cpu_quick_check(cpu); } + } /* Check and update the rcu_dyntick_drain sequencing. */ if (per_cpu(rcu_dyntick_drain, cpu) <= 0) { diff --git a/kernel/rcutree_trace.c b/kernel/rcutree_trace.c index d45db2e..36c95b4 100644 --- a/kernel/rcutree_trace.c +++ b/kernel/rcutree_trace.c @@ -241,11 +241,13 @@ static const struct file_operations rcugp_fops = { static void print_one_rcu_pending(struct seq_file *m, struct rcu_data *rdp) { seq_printf(m, "%3d%cnp=%ld " - "qsp=%ld cbr=%ld cng=%ld gpc=%ld gps=%ld nf=%ld nn=%ld\n", + "qsp=%ld rpq=%ld cbr=%ld cng=%ld " + "gpc=%ld gps=%ld nf=%ld nn=%ld\n", rdp->cpu, cpu_is_offline(rdp->cpu) ? '!' : ' ', rdp->n_rcu_pending, rdp->n_rp_qs_pending, + rdp->n_rp_report_qs, rdp->n_rp_cb_ready, rdp->n_rp_cpu_needs_gp, rdp->n_rp_gp_completed, diff --git a/kernel/sched.c b/kernel/sched.c index b11b80a..1d93cd0 100644 --- a/kernel/sched.c +++ b/kernel/sched.c @@ -55,9 +55,9 @@ #include <linux/cpu.h> #include <linux/cpuset.h> #include <linux/percpu.h> -#include <linux/kthread.h> #include <linux/proc_fs.h> #include <linux/seq_file.h> +#include <linux/stop_machine.h> #include <linux/sysctl.h> #include <linux/syscalls.h> #include <linux/times.h> @@ -503,8 +503,11 @@ struct rq { #define CPU_LOAD_IDX_MAX 5 unsigned long cpu_load[CPU_LOAD_IDX_MAX]; #ifdef CONFIG_NO_HZ + u64 nohz_stamp; unsigned char in_nohz_recently; #endif + unsigned int skip_clock_update; + /* capture load from *all* tasks on this cpu: */ struct load_weight load; unsigned long nr_load_updates; @@ -546,15 +549,13 @@ struct rq { int post_schedule; int active_balance; int push_cpu; + struct cpu_stop_work active_balance_work; /* cpu of this runqueue: */ int cpu; int online; unsigned long avg_load_per_task; - struct task_struct *migration_thread; - struct list_head migration_queue; - u64 rt_avg; u64 age_stamp; u64 idle_stamp; @@ -602,6 +603,13 @@ static inline void check_preempt_curr(struct rq *rq, struct task_struct *p, int flags) { rq->curr->sched_class->check_preempt_curr(rq, p, flags); + + /* + * A queue event has occurred, and we're going to schedule. In + * this case, we can save a useless back to back clock update. + */ + if (test_tsk_need_resched(p)) + rq->skip_clock_update = 1; } static inline int cpu_of(struct rq *rq) @@ -636,7 +644,8 @@ static inline int cpu_of(struct rq *rq) inline void update_rq_clock(struct rq *rq) { - rq->clock = sched_clock_cpu(cpu_of(rq)); + if (!rq->skip_clock_update) + rq->clock = sched_clock_cpu(cpu_of(rq)); } /* @@ -914,16 +923,12 @@ static inline void finish_lock_switch(struct rq *rq, struct task_struct *prev) #endif /* __ARCH_WANT_UNLOCKED_CTXSW */ /* - * Check whether the task is waking, we use this to synchronize against - * ttwu() so that task_cpu() reports a stable number. - * - * We need to make an exception for PF_STARTING tasks because the fork - * path might require task_rq_lock() to work, eg. it can call - * set_cpus_allowed_ptr() from the cpuset clone_ns code. + * Check whether the task is waking, we use this to synchronize ->cpus_allowed + * against ttwu(). */ static inline int task_is_waking(struct task_struct *p) { - return unlikely((p->state == TASK_WAKING) && !(p->flags & PF_STARTING)); + return unlikely(p->state == TASK_WAKING); } /* @@ -936,11 +941,9 @@ static inline struct rq *__task_rq_lock(struct task_struct *p) struct rq *rq; for (;;) { - while (task_is_waking(p)) - cpu_relax(); rq = task_rq(p); raw_spin_lock(&rq->lock); - if (likely(rq == task_rq(p) && !task_is_waking(p))) + if (likely(rq == task_rq(p))) return rq; raw_spin_unlock(&rq->lock); } @@ -957,12 +960,10 @@ static struct rq *task_rq_lock(struct task_struct *p, unsigned long *flags) struct rq *rq; for (;;) { - while (task_is_waking(p)) - cpu_relax(); local_irq_save(*flags); rq = task_rq(p); raw_spin_lock(&rq->lock); - if (likely(rq == task_rq(p) && !task_is_waking(p))) + if (likely(rq == task_rq(p))) return rq; raw_spin_unlock_irqrestore(&rq->lock, *flags); } @@ -1239,6 +1240,17 @@ void wake_up_idle_cpu(int cpu) if (!tsk_is_polling(rq->idle)) smp_send_reschedule(cpu); } + +int nohz_ratelimit(int cpu) +{ + struct rq *rq = cpu_rq(cpu); + u64 diff = rq->clock - rq->nohz_stamp; + + rq->nohz_stamp = rq->clock; + + return diff < (NSEC_PER_SEC / HZ) >> 1; +} + #endif /* CONFIG_NO_HZ */ static u64 sched_avg_period(void) @@ -1781,8 +1793,6 @@ static void double_rq_lock(struct rq *rq1, struct rq *rq2) raw_spin_lock_nested(&rq1->lock, SINGLE_DEPTH_NESTING); } } - update_rq_clock(rq1); - update_rq_clock(rq2); } /* @@ -1813,7 +1823,7 @@ static void cfs_rq_set_shares(struct cfs_rq *cfs_rq, unsigned long shares) } #endif -static void calc_load_account_active(struct rq *this_rq); +static void calc_load_account_idle(struct rq *this_rq); static void update_sysctl(void); static int get_update_sysctl_factor(void); @@ -1870,62 +1880,43 @@ static void set_load_weight(struct task_struct *p) p->se.load.inv_weight = prio_to_wmult[p->static_prio - MAX_RT_PRIO]; } -static void update_avg(u64 *avg, u64 sample) +static void enqueue_task(struct rq *rq, struct task_struct *p, int flags) { - s64 diff = sample - *avg; - *avg += diff >> 3; -} - -static void -enqueue_task(struct rq *rq, struct task_struct *p, int wakeup, bool head) -{ - if (wakeup) - p->se.start_runtime = p->se.sum_exec_runtime; - + update_rq_clock(rq); sched_info_queued(p); - p->sched_class->enqueue_task(rq, p, wakeup, head); + p->sched_class->enqueue_task(rq, p, flags); p->se.on_rq = 1; } -static void dequeue_task(struct rq *rq, struct task_struct *p, int sleep) +static void dequeue_task(struct rq *rq, struct task_struct *p, int flags) { - if (sleep) { - if (p->se.last_wakeup) { - update_avg(&p->se.avg_overlap, - p->se.sum_exec_runtime - p->se.last_wakeup); - p->se.last_wakeup = 0; - } else { - update_avg(&p->se.avg_wakeup, - sysctl_sched_wakeup_granularity); - } - } - + update_rq_clock(rq); sched_info_dequeued(p); - p->sched_class->dequeue_task(rq, p, sleep); + p->sched_class->dequeue_task(rq, p, flags); p->se.on_rq = 0; } /* * activate_task - move a task to the runqueue. */ -static void activate_task(struct rq *rq, struct task_struct *p, int wakeup) +static void activate_task(struct rq *rq, struct task_struct *p, int flags) { if (task_contributes_to_load(p)) rq->nr_uninterruptible--; - enqueue_task(rq, p, wakeup, false); + enqueue_task(rq, p, flags); inc_nr_running(rq); } /* * deactivate_task - remove a task from the runqueue. */ -static void deactivate_task(struct rq *rq, struct task_struct *p, int sleep) +static void deactivate_task(struct rq *rq, struct task_struct *p, int flags) { if (task_contributes_to_load(p)) rq->nr_uninterruptible++; - dequeue_task(rq, p, sleep); + dequeue_task(rq, p, flags); dec_nr_running(rq); } @@ -2054,21 +2045,18 @@ void set_task_cpu(struct task_struct *p, unsigned int new_cpu) __set_task_cpu(p, new_cpu); } -struct migration_req { - struct list_head list; - +struct migration_arg { struct task_struct *task; int dest_cpu; - - struct completion done; }; +static int migration_cpu_stop(void *data); + /* * The task's runqueue lock must be held. * Returns true if you have to wait for migration thread. */ -static int -migrate_task(struct task_struct *p, int dest_cpu, struct migration_req *req) +static bool migrate_task(struct task_struct *p, int dest_cpu) { struct rq *rq = task_rq(p); @@ -2076,15 +2064,7 @@ migrate_task(struct task_struct *p, int dest_cpu, struct migration_req *req) * If the task is not on a runqueue (and not running), then * the next wake-up will properly place the task. */ - if (!p->se.on_rq && !task_running(rq, p)) - return 0; - - init_completion(&req->done); - req->task = p; - req->dest_cpu = dest_cpu; - list_add(&req->list, &rq->migration_queue); - - return 1; + return p->se.on_rq || task_running(rq, p); } /* @@ -2142,7 +2122,7 @@ unsigned long wait_task_inactive(struct task_struct *p, long match_state) * just go back and repeat. */ rq = task_rq_lock(p, &flags); - trace_sched_wait_task(rq, p); + trace_sched_wait_task(p); running = task_running(rq, p); on_rq = p->se.on_rq; ncsw = 0; @@ -2240,6 +2220,9 @@ void task_oncpu_function_call(struct task_struct *p, } #ifdef CONFIG_SMP +/* + * ->cpus_allowed is protected by either TASK_WAKING or rq->lock held. + */ static int select_fallback_rq(int cpu, struct task_struct *p) { int dest_cpu; @@ -2256,12 +2239,8 @@ static int select_fallback_rq(int cpu, struct task_struct *p) return dest_cpu; /* No more Mr. Nice Guy. */ - if (dest_cpu >= nr_cpu_ids) { - rcu_read_lock(); - cpuset_cpus_allowed_locked(p, &p->cpus_allowed); - rcu_read_unlock(); - dest_cpu = cpumask_any_and(cpu_active_mask, &p->cpus_allowed); - + if (unlikely(dest_cpu >= nr_cpu_ids)) { + dest_cpu = cpuset_cpus_allowed_fallback(p); /* * Don't tell them about moving exiting tasks or * kernel threads (both mm NULL), since they never @@ -2278,17 +2257,12 @@ static int select_fallback_rq(int cpu, struct task_struct *p) } /* - * Gets called from 3 sites (exec, fork, wakeup), since it is called without - * holding rq->lock we need to ensure ->cpus_allowed is stable, this is done - * by: - * - * exec: is unstable, retry loop - * fork & wake-up: serialize ->cpus_allowed against TASK_WAKING + * The caller (fork, wakeup) owns TASK_WAKING, ->cpus_allowed is stable. */ static inline -int select_task_rq(struct task_struct *p, int sd_flags, int wake_flags) +int select_task_rq(struct rq *rq, struct task_struct *p, int sd_flags, int wake_flags) { - int cpu = p->sched_class->select_task_rq(p, sd_flags, wake_flags); + int cpu = p->sched_class->select_task_rq(rq, p, sd_flags, wake_flags); /* * In order not to call set_task_cpu() on a blocking task we need @@ -2306,6 +2280,12 @@ int select_task_rq(struct task_struct *p, int sd_flags, int wake_flags) return cpu; } + +static void update_avg(u64 *avg, u64 sample) +{ + s64 diff = sample - *avg; + *avg += diff >> 3; +} #endif /*** @@ -2327,16 +2307,13 @@ static int try_to_wake_up(struct task_struct *p, unsigned int state, { int cpu, orig_cpu, this_cpu, success = 0; unsigned long flags; + unsigned long en_flags = ENQUEUE_WAKEUP; struct rq *rq; - if (!sched_feat(SYNC_WAKEUPS)) - wake_flags &= ~WF_SYNC; - this_cpu = get_cpu(); smp_wmb(); rq = task_rq_lock(p, &flags); - update_rq_clock(rq); if (!(p->state & state)) goto out; @@ -2356,28 +2333,26 @@ static int try_to_wake_up(struct task_struct *p, unsigned int state, * * First fix up the nr_uninterruptible count: */ - if (task_contributes_to_load(p)) - rq->nr_uninterruptible--; + if (task_contributes_to_load(p)) { + if (likely(cpu_online(orig_cpu))) + rq->nr_uninterruptible--; + else + this_rq()->nr_uninterruptible--; + } p->state = TASK_WAKING; - if (p->sched_class->task_waking) + if (p->sched_class->task_waking) { p->sched_class->task_waking(rq, p); + en_flags |= ENQUEUE_WAKING; + } - __task_rq_unlock(rq); - - cpu = select_task_rq(p, SD_BALANCE_WAKE, wake_flags); - if (cpu != orig_cpu) { - /* - * Since we migrate the task without holding any rq->lock, - * we need to be careful with task_rq_lock(), since that - * might end up locking an invalid rq. - */ + cpu = select_task_rq(rq, p, SD_BALANCE_WAKE, wake_flags); + if (cpu != orig_cpu) set_task_cpu(p, cpu); - } + __task_rq_unlock(rq); rq = cpu_rq(cpu); raw_spin_lock(&rq->lock); - update_rq_clock(rq); /* * We migrated the task without holding either rq->lock, however @@ -2405,36 +2380,20 @@ static int try_to_wake_up(struct task_struct *p, unsigned int state, out_activate: #endif /* CONFIG_SMP */ - schedstat_inc(p, se.nr_wakeups); + schedstat_inc(p, se.statistics.nr_wakeups); if (wake_flags & WF_SYNC) - schedstat_inc(p, se.nr_wakeups_sync); + schedstat_inc(p, se.statistics.nr_wakeups_sync); if (orig_cpu != cpu) - schedstat_inc(p, se.nr_wakeups_migrate); + schedstat_inc(p, se.statistics.nr_wakeups_migrate); if (cpu == this_cpu) - schedstat_inc(p, se.nr_wakeups_local); + schedstat_inc(p, se.statistics.nr_wakeups_local); else - schedstat_inc(p, se.nr_wakeups_remote); - activate_task(rq, p, 1); + schedstat_inc(p, se.statistics.nr_wakeups_remote); + activate_task(rq, p, en_flags); success = 1; - /* - * Only attribute actual wakeups done by this task. - */ - if (!in_interrupt()) { - struct sched_entity *se = ¤t->se; - u64 sample = se->sum_exec_runtime; - - if (se->last_wakeup) - sample -= se->last_wakeup; - else - sample -= se->start_runtime; - update_avg(&se->avg_wakeup, sample); - - se->last_wakeup = se->sum_exec_runtime; - } - out_running: - trace_sched_wakeup(rq, p, success); + trace_sched_wakeup(p, success); check_preempt_curr(rq, p, wake_flags); p->state = TASK_RUNNING; @@ -2494,42 +2453,9 @@ static void __sched_fork(struct task_struct *p) p->se.sum_exec_runtime = 0; p->se.prev_sum_exec_runtime = 0; p->se.nr_migrations = 0; - p->se.last_wakeup = 0; - p->se.avg_overlap = 0; - p->se.start_runtime = 0; - p->se.avg_wakeup = sysctl_sched_wakeup_granularity; #ifdef CONFIG_SCHEDSTATS - p->se.wait_start = 0; - p->se.wait_max = 0; - p->se.wait_count = 0; - p->se.wait_sum = 0; - - p->se.sleep_start = 0; - p->se.sleep_max = 0; - p->se.sum_sleep_runtime = 0; - - p->se.block_start = 0; - p->se.block_max = 0; - p->se.exec_max = 0; - p->se.slice_max = 0; - - p->se.nr_migrations_cold = 0; - p->se.nr_failed_migrations_affine = 0; - p->se.nr_failed_migrations_running = 0; - p->se.nr_failed_migrations_hot = 0; - p->se.nr_forced_migrations = 0; - - p->se.nr_wakeups = 0; - p->se.nr_wakeups_sync = 0; - p->se.nr_wakeups_migrate = 0; - p->se.nr_wakeups_local = 0; - p->se.nr_wakeups_remote = 0; - p->se.nr_wakeups_affine = 0; - p->se.nr_wakeups_affine_attempts = 0; - p->se.nr_wakeups_passive = 0; - p->se.nr_wakeups_idle = 0; - + memset(&p->se.statistics, 0, sizeof(p->se.statistics)); #endif INIT_LIST_HEAD(&p->rt.run_list); @@ -2550,11 +2476,11 @@ void sched_fork(struct task_struct *p, int clone_flags) __sched_fork(p); /* - * We mark the process as waking here. This guarantees that + * We mark the process as running here. This guarantees that * nobody will actually run it, and a signal or other external * event cannot wake it up and insert it on the runqueue either. */ - p->state = TASK_WAKING; + p->state = TASK_RUNNING; /* * Revert to default priority/policy on fork if requested. @@ -2621,31 +2547,27 @@ void wake_up_new_task(struct task_struct *p, unsigned long clone_flags) int cpu __maybe_unused = get_cpu(); #ifdef CONFIG_SMP + rq = task_rq_lock(p, &flags); + p->state = TASK_WAKING; + /* * Fork balancing, do it here and not earlier because: * - cpus_allowed can change in the fork path * - any previously selected cpu might disappear through hotplug * - * We still have TASK_WAKING but PF_STARTING is gone now, meaning - * ->cpus_allowed is stable, we have preemption disabled, meaning - * cpu_online_mask is stable. + * We set TASK_WAKING so that select_task_rq() can drop rq->lock + * without people poking at ->cpus_allowed. */ - cpu = select_task_rq(p, SD_BALANCE_FORK, 0); + cpu = select_task_rq(rq, p, SD_BALANCE_FORK, 0); set_task_cpu(p, cpu); -#endif - /* - * Since the task is not on the rq and we still have TASK_WAKING set - * nobody else will migrate this task. - */ - rq = cpu_rq(cpu); - raw_spin_lock_irqsave(&rq->lock, flags); - - BUG_ON(p->state != TASK_WAKING); p->state = TASK_RUNNING; - update_rq_clock(rq); + task_rq_unlock(rq, &flags); +#endif + + rq = task_rq_lock(p, &flags); activate_task(rq, p, 0); - trace_sched_wakeup_new(rq, p, 1); + trace_sched_wakeup_new(p, 1); check_preempt_curr(rq, p, WF_FORK); #ifdef CONFIG_SMP if (p->sched_class->task_woken) @@ -2865,7 +2787,7 @@ context_switch(struct rq *rq, struct task_struct *prev, struct mm_struct *mm, *oldmm; prepare_task_switch(rq, prev, next); - trace_sched_switch(rq, prev, next); + trace_sched_switch(prev, next); mm = next->mm; oldmm = prev->active_mm; /* @@ -2982,6 +2904,61 @@ static unsigned long calc_load_update; unsigned long avenrun[3]; EXPORT_SYMBOL(avenrun); +static long calc_load_fold_active(struct rq *this_rq) +{ + long nr_active, delta = 0; + + nr_active = this_rq->nr_running; + nr_active += (long) this_rq->nr_uninterruptible; + + if (nr_active != this_rq->calc_load_active) { + delta = nr_active - this_rq->calc_load_active; + this_rq->calc_load_active = nr_active; + } + + return delta; +} + +#ifdef CONFIG_NO_HZ +/* + * For NO_HZ we delay the active fold to the next LOAD_FREQ update. + * + * When making the ILB scale, we should try to pull this in as well. + */ +static atomic_long_t calc_load_tasks_idle; + +static void calc_load_account_idle(struct rq *this_rq) +{ + long delta; + + delta = calc_load_fold_active(this_rq); + if (delta) + atomic_long_add(delta, &calc_load_tasks_idle); +} + +static long calc_load_fold_idle(void) +{ + long delta = 0; + + /* + * Its got a race, we don't care... + */ + if (atomic_long_read(&calc_load_tasks_idle)) + delta = atomic_long_xchg(&calc_load_tasks_idle, 0); + + return delta; +} +#else +static void calc_load_account_idle(struct rq *this_rq) +{ +} + +static inline long calc_load_fold_idle(void) +{ + return 0; +} +#endif + /** * get_avenrun - get the load average array * @loads: pointer to dest load array @@ -3028,20 +3005,22 @@ void calc_global_load(void) } /* - * Either called from update_cpu_load() or from a cpu going idle + * Called from update_cpu_load() to periodically update this CPU's + * active count. */ static void calc_load_account_active(struct rq *this_rq) { - long nr_active, delta; + long delta; - nr_active = this_rq->nr_running; - nr_active += (long) this_rq->nr_uninterruptible; + if (time_before(jiffies, this_rq->calc_load_update)) + return; - if (nr_active != this_rq->calc_load_active) { - delta = nr_active - this_rq->calc_load_active; - this_rq->calc_load_active = nr_active; + delta = calc_load_fold_active(this_rq); + delta += calc_load_fold_idle(); + if (delta) atomic_long_add(delta, &calc_load_tasks); - } + + this_rq->calc_load_update += LOAD_FREQ; } /* @@ -3073,10 +3052,7 @@ static void update_cpu_load(struct rq *this_rq) this_rq->cpu_load[i] = (old_load*(scale-1) + new_load) >> i; } - if (time_after_eq(jiffies, this_rq->calc_load_update)) { - this_rq->calc_load_update += LOAD_FREQ; - calc_load_account_active(this_rq); - } + calc_load_account_active(this_rq); } #ifdef CONFIG_SMP @@ -3088,44 +3064,27 @@ static void update_cpu_load(struct rq *this_rq) void sched_exec(void) { struct task_struct *p = current; - struct migration_req req; - int dest_cpu, this_cpu; unsigned long flags; struct rq *rq; - -again: - this_cpu = get_cpu(); - dest_cpu = select_task_rq(p, SD_BALANCE_EXEC, 0); - if (dest_cpu == this_cpu) { - put_cpu(); - return; - } + int dest_cpu; rq = task_rq_lock(p, &flags); - put_cpu(); + dest_cpu = p->sched_class->select_task_rq(rq, p, SD_BALANCE_EXEC, 0); + if (dest_cpu == smp_processor_id()) + goto unlock; /* * select_task_rq() can race against ->cpus_allowed */ - if (!cpumask_test_cpu(dest_cpu, &p->cpus_allowed) - || unlikely(!cpu_active(dest_cpu))) { - task_rq_unlock(rq, &flags); - goto again; - } + if (cpumask_test_cpu(dest_cpu, &p->cpus_allowed) && + likely(cpu_active(dest_cpu)) && migrate_task(p, dest_cpu)) { + struct migration_arg arg = { p, dest_cpu }; - /* force the process onto the specified CPU */ - if (migrate_task(p, dest_cpu, &req)) { - /* Need to wait for migration thread (might exit: take ref). */ - struct task_struct *mt = rq->migration_thread; - - get_task_struct(mt); task_rq_unlock(rq, &flags); - wake_up_process(mt); - put_task_struct(mt); - wait_for_completion(&req.done); - + stop_one_cpu(cpu_of(rq), migration_cpu_stop, &arg); return; } +unlock: task_rq_unlock(rq, &flags); } @@ -3597,23 +3556,9 @@ static inline void schedule_debug(struct task_struct *prev) static void put_prev_task(struct rq *rq, struct task_struct *prev) { - if (prev->state == TASK_RUNNING) { - u64 runtime = prev->se.sum_exec_runtime; - - runtime -= prev->se.prev_sum_exec_runtime; - runtime = min_t(u64, runtime, 2*sysctl_sched_migration_cost); - - /* - * In order to avoid avg_overlap growing stale when we are - * indeed overlapping and hence not getting put to sleep, grow - * the avg_overlap on preemption. - * - * We use the average preemption runtime because that - * correlates to the amount of cache footprint a task can - * build up. - */ - update_avg(&prev->se.avg_overlap, runtime); - } + if (prev->se.on_rq) + update_rq_clock(rq); + rq->skip_clock_update = 0; prev->sched_class->put_prev_task(rq, prev); } @@ -3663,7 +3608,7 @@ need_resched: preempt_disable(); cpu = smp_processor_id(); rq = cpu_rq(cpu); - rcu_sched_qs(cpu); + rcu_note_context_switch(cpu); prev = rq->curr; switch_count = &prev->nivcsw; @@ -3676,14 +3621,13 @@ need_resched_nonpreemptible: hrtick_clear(rq); raw_spin_lock_irq(&rq->lock); - update_rq_clock(rq); clear_tsk_need_resched(prev); if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) { if (unlikely(signal_pending_state(prev->state, prev))) prev->state = TASK_RUNNING; else - deactivate_task(rq, prev, 1); + deactivate_task(rq, prev, DEQUEUE_SLEEP); switch_count = &prev->nvcsw; } @@ -4006,8 +3950,7 @@ do_wait_for_common(struct completion *x, long timeout, int state) if (!x->done) { DECLARE_WAITQUEUE(wait, current); - wait.flags |= WQ_FLAG_EXCLUSIVE; - __add_wait_queue_tail(&x->wait, &wait); + __add_wait_queue_tail_exclusive(&x->wait, &wait); do { if (signal_pending_state(state, current)) { timeout = -ERESTARTSYS; @@ -4233,7 +4176,6 @@ void rt_mutex_setprio(struct task_struct *p, int prio) BUG_ON(prio < 0 || prio > MAX_PRIO); rq = task_rq_lock(p, &flags); - update_rq_clock(rq); oldprio = p->prio; prev_class = p->sched_class; @@ -4254,7 +4196,7 @@ void rt_mutex_setprio(struct task_struct *p, int prio) if (running) p->sched_class->set_curr_task(rq); if (on_rq) { - enqueue_task(rq, p, 0, oldprio < prio); + enqueue_task(rq, p, oldprio < prio ? ENQUEUE_HEAD : 0); check_class_changed(rq, p, prev_class, oldprio, running); } @@ -4276,7 +4218,6 @@ void set_user_nice(struct task_struct *p, long nice) * the task might be in the middle of scheduling on another CPU. */ rq = task_rq_lock(p, &flags); - update_rq_clock(rq); /* * The RT priorities are set via sched_setscheduler(), but we still * allow the 'normal' nice value to be set - but as expected @@ -4298,7 +4239,7 @@ void set_user_nice(struct task_struct *p, long nice) delta = p->prio - old_prio; if (on_rq) { - enqueue_task(rq, p, 0, false); + enqueue_task(rq, p, 0); /* * If the task increased its priority or is running and * lowered its priority, then reschedule its CPU: @@ -4559,7 +4500,6 @@ recheck: raw_spin_unlock_irqrestore(&p->pi_lock, flags); goto recheck; } - update_rq_clock(rq); on_rq = p->se.on_rq; running = task_current(rq, p); if (on_rq) @@ -5296,17 +5236,15 @@ static inline void sched_init_granularity(void) /* * This is how migration works: * - * 1) we queue a struct migration_req structure in the source CPU's - * runqueue and wake up that CPU's migration thread. - * 2) we down() the locked semaphore => thread blocks. - * 3) migration thread wakes up (implicitly it forces the migrated - * thread off the CPU) - * 4) it gets the migration request and checks whether the migrated - * task is still in the wrong runqueue. - * 5) if it's in the wrong runqueue then the migration thread removes + * 1) we invoke migration_cpu_stop() on the target CPU using + * stop_one_cpu(). + * 2) stopper starts to run (implicitly forcing the migrated thread + * off the CPU) + * 3) it checks whether the migrated task is still in the wrong runqueue. + * 4) if it's in the wrong runqueue then the migration thread removes * it and puts it into the right queue. - * 6) migration thread up()s the semaphore. - * 7) we wake up and the migration is done. + * 5) stopper completes and stop_one_cpu() returns and the migration + * is done. */ /* @@ -5320,12 +5258,23 @@ static inline void sched_init_granularity(void) */ int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask) { - struct migration_req req; unsigned long flags; struct rq *rq; + unsigned int dest_cpu; int ret = 0; + /* + * Serialize against TASK_WAKING so that ttwu() and wunt() can + * drop the rq->lock and still rely on ->cpus_allowed. + */ +again: + while (task_is_waking(p)) + cpu_relax(); rq = task_rq_lock(p, &flags); + if (task_is_waking(p)) { + task_rq_unlock(rq, &flags); + goto again; + } if (!cpumask_intersects(new_mask, cpu_active_mask)) { ret = -EINVAL; @@ -5349,15 +5298,12 @@ int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask) if (cpumask_test_cpu(task_cpu(p), new_mask)) goto out; - if (migrate_task(p, cpumask_any_and(cpu_active_mask, new_mask), &req)) { + dest_cpu = cpumask_any_and(cpu_active_mask, new_mask); + if (migrate_task(p, dest_cpu)) { + struct migration_arg arg = { p, dest_cpu }; /* Need help from migration thread: drop lock and wait. */ - struct task_struct *mt = rq->migration_thread; - - get_task_struct(mt); task_rq_unlock(rq, &flags); - wake_up_process(mt); - put_task_struct(mt); - wait_for_completion(&req.done); + stop_one_cpu(cpu_of(rq), migration_cpu_stop, &arg); tlb_migrate_finish(p->mm); return 0; } @@ -5415,98 +5361,49 @@ fail: return ret; } -#define RCU_MIGRATION_IDLE 0 -#define RCU_MIGRATION_NEED_QS 1 -#define RCU_MIGRATION_GOT_QS 2 -#define RCU_MIGRATION_MUST_SYNC 3 - /* - * migration_thread - this is a highprio system thread that performs - * thread migration by bumping thread off CPU then 'pushing' onto - * another runqueue. + * migration_cpu_stop - this will be executed by a highprio stopper thread + * and performs thread migration by bumping thread off CPU then + * 'pushing' onto another runqueue. */ -static int migration_thread(void *data) +static int migration_cpu_stop(void *data) { - int badcpu; - int cpu = (long)data; - struct rq *rq; - - rq = cpu_rq(cpu); - BUG_ON(rq->migration_thread != current); - - set_current_state(TASK_INTERRUPTIBLE); - while (!kthread_should_stop()) { - struct migration_req *req; - struct list_head *head; - - raw_spin_lock_irq(&rq->lock); - - if (cpu_is_offline(cpu)) { - raw_spin_unlock_irq(&rq->lock); - break; - } - - if (rq->active_balance) { - active_load_balance(rq, cpu); - rq->active_balance = 0; - } - - head = &rq->migration_queue; - - if (list_empty(head)) { - raw_spin_unlock_irq(&rq->lock); - schedule(); - set_current_state(TASK_INTERRUPTIBLE); - continue; - } - req = list_entry(head->next, struct migration_req, list); - list_del_init(head->next); - - if (req->task != NULL) { - raw_spin_unlock(&rq->lock); - __migrate_task(req->task, cpu, req->dest_cpu); - } else if (likely(cpu == (badcpu = smp_processor_id()))) { - req->dest_cpu = RCU_MIGRATION_GOT_QS; - raw_spin_unlock(&rq->lock); - } else { - req->dest_cpu = RCU_MIGRATION_MUST_SYNC; - raw_spin_unlock(&rq->lock); - WARN_ONCE(1, "migration_thread() on CPU %d, expected %d\n", badcpu, cpu); - } - local_irq_enable(); - - complete(&req->done); - } - __set_current_state(TASK_RUNNING); - - return 0; -} - -#ifdef CONFIG_HOTPLUG_CPU - -static int __migrate_task_irq(struct task_struct *p, int src_cpu, int dest_cpu) -{ - int ret; + struct migration_arg *arg = data; + /* + * The original target cpu might have gone down and we might + * be on another cpu but it doesn't matter. + */ local_irq_disable(); - ret = __migrate_task(p, src_cpu, dest_cpu); + __migrate_task(arg->task, raw_smp_processor_id(), arg->dest_cpu); local_irq_enable(); - return ret; + return 0; } +#ifdef CONFIG_HOTPLUG_CPU /* * Figure out where task on dead CPU should go, use force if necessary. */ -static void move_task_off_dead_cpu(int dead_cpu, struct task_struct *p) +void move_task_off_dead_cpu(int dead_cpu, struct task_struct *p) { - int dest_cpu; + struct rq *rq = cpu_rq(dead_cpu); + int needs_cpu, uninitialized_var(dest_cpu); + unsigned long flags; -again: - dest_cpu = select_fallback_rq(dead_cpu, p); + local_irq_save(flags); - /* It can have affinity changed while we were choosing. */ - if (unlikely(!__migrate_task_irq(p, dead_cpu, dest_cpu))) - goto again; + raw_spin_lock(&rq->lock); + needs_cpu = (task_cpu(p) == dead_cpu) && (p->state != TASK_WAKING); + if (needs_cpu) + dest_cpu = select_fallback_rq(dead_cpu, p); + raw_spin_unlock(&rq->lock); + /* + * It can only fail if we race with set_cpus_allowed(), + * in the racer should migrate the task anyway. + */ + if (needs_cpu) + __migrate_task(p, dead_cpu, dest_cpu); + local_irq_restore(flags); } /* @@ -5570,7 +5467,6 @@ void sched_idle_next(void) __setscheduler(rq, p, SCHED_FIFO, MAX_RT_PRIO-1); - update_rq_clock(rq); activate_task(rq, p, 0); raw_spin_unlock_irqrestore(&rq->lock, flags); @@ -5625,7 +5521,6 @@ static void migrate_dead_tasks(unsigned int dead_cpu) for ( ; ; ) { if (!rq->nr_running) break; - update_rq_clock(rq); next = pick_next_task(rq); if (!next) break; @@ -5848,35 +5743,20 @@ static void set_rq_offline(struct rq *rq) static int __cpuinit migration_call(struct notifier_block *nfb, unsigned long action, void *hcpu) { - struct task_struct *p; int cpu = (long)hcpu; unsigned long flags; - struct rq *rq; + struct rq *rq = cpu_rq(cpu); switch (action) { case CPU_UP_PREPARE: case CPU_UP_PREPARE_FROZEN: - p = kthread_create(migration_thread, hcpu, "migration/%d", cpu); - if (IS_ERR(p)) - return NOTIFY_BAD; - kthread_bind(p, cpu); - /* Must be high prio: stop_machine expects to yield to it. */ - rq = task_rq_lock(p, &flags); - __setscheduler(rq, p, SCHED_FIFO, MAX_RT_PRIO-1); - task_rq_unlock(rq, &flags); - get_task_struct(p); - cpu_rq(cpu)->migration_thread = p; rq->calc_load_update = calc_load_update; break; case CPU_ONLINE: case CPU_ONLINE_FROZEN: - /* Strictly unnecessary, as first user will wake it. */ - wake_up_process(cpu_rq(cpu)->migration_thread); - /* Update our root-domain */ - rq = cpu_rq(cpu); raw_spin_lock_irqsave(&rq->lock, flags); if (rq->rd) { BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span)); @@ -5887,61 +5767,24 @@ migration_call(struct notifier_block *nfb, unsigned long action, void *hcpu) break; #ifdef CONFIG_HOTPLUG_CPU - case CPU_UP_CANCELED: - case CPU_UP_CANCELED_FROZEN: - if (!cpu_rq(cpu)->migration_thread) - break; - /* Unbind it from offline cpu so it can run. Fall thru. */ - kthread_bind(cpu_rq(cpu)->migration_thread, - cpumask_any(cpu_online_mask)); - kthread_stop(cpu_rq(cpu)->migration_thread); - put_task_struct(cpu_rq(cpu)->migration_thread); - cpu_rq(cpu)->migration_thread = NULL; - break; - case CPU_DEAD: case CPU_DEAD_FROZEN: - cpuset_lock(); /* around calls to cpuset_cpus_allowed_lock() */ migrate_live_tasks(cpu); - rq = cpu_rq(cpu); - kthread_stop(rq->migration_thread); - put_task_struct(rq->migration_thread); - rq->migration_thread = NULL; /* Idle task back to normal (off runqueue, low prio) */ raw_spin_lock_irq(&rq->lock); - update_rq_clock(rq); deactivate_task(rq, rq->idle, 0); __setscheduler(rq, rq->idle, SCHED_NORMAL, 0); rq->idle->sched_class = &idle_sched_class; migrate_dead_tasks(cpu); raw_spin_unlock_irq(&rq->lock); - cpuset_unlock(); migrate_nr_uninterruptible(rq); BUG_ON(rq->nr_running != 0); calc_global_load_remove(rq); - /* - * No need to migrate the tasks: it was best-effort if - * they didn't take sched_hotcpu_mutex. Just wake up - * the requestors. - */ - raw_spin_lock_irq(&rq->lock); - while (!list_empty(&rq->migration_queue)) { - struct migration_req *req; - - req = list_entry(rq->migration_queue.next, - struct migration_req, list); - list_del_init(&req->list); - raw_spin_unlock_irq(&rq->lock); - complete(&req->done); - raw_spin_lock_irq(&rq->lock); - } - raw_spin_unlock_irq(&rq->lock); break; case CPU_DYING: case CPU_DYING_FROZEN: /* Update our root-domain */ - rq = cpu_rq(cpu); raw_spin_lock_irqsave(&rq->lock, flags); if (rq->rd) { BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span)); @@ -6272,6 +6115,9 @@ cpu_attach_domain(struct sched_domain *sd, struct root_domain *rd, int cpu) struct rq *rq = cpu_rq(cpu); struct sched_domain *tmp; + for (tmp = sd; tmp; tmp = tmp->parent) + tmp->span_weight = cpumask_weight(sched_domain_span(tmp)); + /* Remove the sched domains which do not contribute to scheduling. */ for (tmp = sd; tmp; ) { struct sched_domain *parent = tmp->parent; @@ -7755,10 +7601,8 @@ void __init sched_init(void) rq->push_cpu = 0; rq->cpu = i; rq->online = 0; - rq->migration_thread = NULL; rq->idle_stamp = 0; rq->avg_idle = 2*sysctl_sched_migration_cost; - INIT_LIST_HEAD(&rq->migration_queue); rq_attach_root(rq, &def_root_domain); #endif init_rq_hrtick(rq); @@ -7859,7 +7703,6 @@ static void normalize_task(struct rq *rq, struct task_struct *p) { int on_rq; - update_rq_clock(rq); on_rq = p->se.on_rq; if (on_rq) deactivate_task(rq, p, 0); @@ -7886,9 +7729,9 @@ void normalize_rt_tasks(void) p->se.exec_start = 0; #ifdef CONFIG_SCHEDSTATS - p->se.wait_start = 0; - p->se.sleep_start = 0; - p->se.block_start = 0; + p->se.statistics.wait_start = 0; + p->se.statistics.sleep_start = 0; + p->se.statistics.block_start = 0; #endif if (!rt_task(p)) { @@ -8221,8 +8064,6 @@ void sched_move_task(struct task_struct *tsk) rq = task_rq_lock(tsk, &flags); - update_rq_clock(rq); - running = task_current(rq, tsk); on_rq = tsk->se.on_rq; @@ -8241,7 +8082,7 @@ void sched_move_task(struct task_struct *tsk) if (unlikely(running)) tsk->sched_class->set_curr_task(rq); if (on_rq) - enqueue_task(rq, tsk, 0, false); + enqueue_task(rq, tsk, 0); task_rq_unlock(rq, &flags); } @@ -9055,43 +8896,32 @@ struct cgroup_subsys cpuacct_subsys = { #ifndef CONFIG_SMP -int rcu_expedited_torture_stats(char *page) -{ - return 0; -} -EXPORT_SYMBOL_GPL(rcu_expedited_torture_stats); - void synchronize_sched_expedited(void) { + barrier(); } EXPORT_SYMBOL_GPL(synchronize_sched_expedited); #else /* #ifndef CONFIG_SMP */ -static DEFINE_PER_CPU(struct migration_req, rcu_migration_req); -static DEFINE_MUTEX(rcu_sched_expedited_mutex); - -#define RCU_EXPEDITED_STATE_POST -2 -#define RCU_EXPEDITED_STATE_IDLE -1 - -static int rcu_expedited_state = RCU_EXPEDITED_STATE_IDLE; +static atomic_t synchronize_sched_expedited_count = ATOMIC_INIT(0); -int rcu_expedited_torture_stats(char *page) +static int synchronize_sched_expedited_cpu_stop(void *data) { - int cnt = 0; - int cpu; - - cnt += sprintf(&page[cnt], "state: %d /", rcu_expedited_state); - for_each_online_cpu(cpu) { - cnt += sprintf(&page[cnt], " %d:%d", - cpu, per_cpu(rcu_migration_req, cpu).dest_cpu); - } - cnt += sprintf(&page[cnt], "\n"); - return cnt; + /* + * There must be a full memory barrier on each affected CPU + * between the time that try_stop_cpus() is called and the + * time that it returns. + * + * In the current initial implementation of cpu_stop, the + * above condition is already met when the control reaches + * this point and the following smp_mb() is not strictly + * necessary. Do smp_mb() anyway for documentation and + * robustness against future implementation changes. + */ + smp_mb(); /* See above comment block. */ + return 0; } -EXPORT_SYMBOL_GPL(rcu_expedited_torture_stats); - -static long synchronize_sched_expedited_count; /* * Wait for an rcu-sched grace period to elapse, but use "big hammer" @@ -9105,18 +8935,14 @@ static long synchronize_sched_expedited_count; */ void synchronize_sched_expedited(void) { - int cpu; - unsigned long flags; - bool need_full_sync = 0; - struct rq *rq; - struct migration_req *req; - long snap; - int trycount = 0; + int snap, trycount = 0; smp_mb(); /* ensure prior mod happens before capturing snap. */ - snap = ACCESS_ONCE(synchronize_sched_expedited_count) + 1; + snap = atomic_read(&synchronize_sched_expedited_count) + 1; get_online_cpus(); - while (!mutex_trylock(&rcu_sched_expedited_mutex)) { + while (try_stop_cpus(cpu_online_mask, + synchronize_sched_expedited_cpu_stop, + NULL) == -EAGAIN) { put_online_cpus(); if (trycount++ < 10) udelay(trycount * num_online_cpus()); @@ -9124,41 +8950,15 @@ void synchronize_sched_expedited(void) synchronize_sched(); return; } - if (ACCESS_ONCE(synchronize_sched_expedited_count) - snap > 0) { + if (atomic_read(&synchronize_sched_expedited_count) - snap > 0) { smp_mb(); /* ensure test happens before caller kfree */ return; } get_online_cpus(); } - rcu_expedited_state = RCU_EXPEDITED_STATE_POST; - for_each_online_cpu(cpu) { - rq = cpu_rq(cpu); - req = &per_cpu(rcu_migration_req, cpu); - init_completion(&req->done); - req->task = NULL; - req->dest_cpu = RCU_MIGRATION_NEED_QS; - raw_spin_lock_irqsave(&rq->lock, flags); - list_add(&req->list, &rq->migration_queue); - raw_spin_unlock_irqrestore(&rq->lock, flags); - wake_up_process(rq->migration_thread); - } - for_each_online_cpu(cpu) { - rcu_expedited_state = cpu; - req = &per_cpu(rcu_migration_req, cpu); - rq = cpu_rq(cpu); - wait_for_completion(&req->done); - raw_spin_lock_irqsave(&rq->lock, flags); - if (unlikely(req->dest_cpu == RCU_MIGRATION_MUST_SYNC)) - need_full_sync = 1; - req->dest_cpu = RCU_MIGRATION_IDLE; - raw_spin_unlock_irqrestore(&rq->lock, flags); - } - rcu_expedited_state = RCU_EXPEDITED_STATE_IDLE; - synchronize_sched_expedited_count++; - mutex_unlock(&rcu_sched_expedited_mutex); + atomic_inc(&synchronize_sched_expedited_count); + smp_mb__after_atomic_inc(); /* ensure post-GP actions seen after GP. */ put_online_cpus(); - if (need_full_sync) - synchronize_sched(); } EXPORT_SYMBOL_GPL(synchronize_sched_expedited); diff --git a/kernel/sched_debug.c b/kernel/sched_debug.c index 9b49db1..87a330a 100644 --- a/kernel/sched_debug.c +++ b/kernel/sched_debug.c @@ -70,16 +70,16 @@ static void print_cfs_group_stats(struct seq_file *m, int cpu, PN(se->vruntime); PN(se->sum_exec_runtime); #ifdef CONFIG_SCHEDSTATS - PN(se->wait_start); - PN(se->sleep_start); - PN(se->block_start); - PN(se->sleep_max); - PN(se->block_max); - PN(se->exec_max); - PN(se->slice_max); - PN(se->wait_max); - PN(se->wait_sum); - P(se->wait_count); + PN(se->statistics.wait_start); + PN(se->statistics.sleep_start); + PN(se->statistics.block_start); + PN(se->statistics.sleep_max); + PN(se->statistics.block_max); + PN(se->statistics.exec_max); + PN(se->statistics.slice_max); + PN(se->statistics.wait_max); + PN(se->statistics.wait_sum); + P(se->statistics.wait_count); #endif P(se->load.weight); #undef PN @@ -104,7 +104,7 @@ print_task(struct seq_file *m, struct rq *rq, struct task_struct *p) SEQ_printf(m, "%9Ld.%06ld %9Ld.%06ld %9Ld.%06ld", SPLIT_NS(p->se.vruntime), SPLIT_NS(p->se.sum_exec_runtime), - SPLIT_NS(p->se.sum_sleep_runtime)); + SPLIT_NS(p->se.statistics.sum_sleep_runtime)); #else SEQ_printf(m, "%15Ld %15Ld %15Ld.%06ld %15Ld.%06ld %15Ld.%06ld", 0LL, 0LL, 0LL, 0L, 0LL, 0L, 0LL, 0L); @@ -114,7 +114,9 @@ print_task(struct seq_file *m, struct rq *rq, struct task_struct *p) { char path[64]; + rcu_read_lock(); cgroup_path(task_group(p)->css.cgroup, path, sizeof(path)); + rcu_read_unlock(); SEQ_printf(m, " %s", path); } #endif @@ -173,11 +175,6 @@ void print_cfs_rq(struct seq_file *m, int cpu, struct cfs_rq *cfs_rq) task_group_path(tg, path, sizeof(path)); SEQ_printf(m, "\ncfs_rq[%d]:%s\n", cpu, path); -#elif defined(CONFIG_USER_SCHED) && defined(CONFIG_FAIR_GROUP_SCHED) - { - uid_t uid = cfs_rq->tg->uid; - SEQ_printf(m, "\ncfs_rq[%d] for UID: %u\n", cpu, uid); - } #else SEQ_printf(m, "\ncfs_rq[%d]:\n", cpu); #endif @@ -407,40 +404,38 @@ void proc_sched_show_task(struct task_struct *p, struct seq_file *m) PN(se.exec_start); PN(se.vruntime); PN(se.sum_exec_runtime); - PN(se.avg_overlap); - PN(se.avg_wakeup); nr_switches = p->nvcsw + p->nivcsw; #ifdef CONFIG_SCHEDSTATS - PN(se.wait_start); - PN(se.sleep_start); - PN(se.block_start); - PN(se.sleep_max); - PN(se.block_max); - PN(se.exec_max); - PN(se.slice_max); - PN(se.wait_max); - PN(se.wait_sum); - P(se.wait_count); - PN(se.iowait_sum); - P(se.iowait_count); + PN(se.statistics.wait_start); + PN(se.statistics.sleep_start); + PN(se.statistics.block_start); + PN(se.statistics.sleep_max); + PN(se.statistics.block_max); + PN(se.statistics.exec_max); + PN(se.statistics.slice_max); + PN(se.statistics.wait_max); + PN(se.statistics.wait_sum); + P(se.statistics.wait_count); + PN(se.statistics.iowait_sum); + P(se.statistics.iowait_count); P(sched_info.bkl_count); P(se.nr_migrations); - P(se.nr_migrations_cold); - P(se.nr_failed_migrations_affine); - P(se.nr_failed_migrations_running); - P(se.nr_failed_migrations_hot); - P(se.nr_forced_migrations); - P(se.nr_wakeups); - P(se.nr_wakeups_sync); - P(se.nr_wakeups_migrate); - P(se.nr_wakeups_local); - P(se.nr_wakeups_remote); - P(se.nr_wakeups_affine); - P(se.nr_wakeups_affine_attempts); - P(se.nr_wakeups_passive); - P(se.nr_wakeups_idle); + P(se.statistics.nr_migrations_cold); + P(se.statistics.nr_failed_migrations_affine); + P(se.statistics.nr_failed_migrations_running); + P(se.statistics.nr_failed_migrations_hot); + P(se.statistics.nr_forced_migrations); + P(se.statistics.nr_wakeups); + P(se.statistics.nr_wakeups_sync); + P(se.statistics.nr_wakeups_migrate); + P(se.statistics.nr_wakeups_local); + P(se.statistics.nr_wakeups_remote); + P(se.statistics.nr_wakeups_affine); + P(se.statistics.nr_wakeups_affine_attempts); + P(se.statistics.nr_wakeups_passive); + P(se.statistics.nr_wakeups_idle); { u64 avg_atom, avg_per_cpu; @@ -491,31 +486,6 @@ void proc_sched_show_task(struct task_struct *p, struct seq_file *m) void proc_sched_set_task(struct task_struct *p) { #ifdef CONFIG_SCHEDSTATS - p->se.wait_max = 0; - p->se.wait_sum = 0; - p->se.wait_count = 0; - p->se.iowait_sum = 0; - p->se.iowait_count = 0; - p->se.sleep_max = 0; - p->se.sum_sleep_runtime = 0; - p->se.block_max = 0; - p->se.exec_max = 0; - p->se.slice_max = 0; - p->se.nr_migrations = 0; - p->se.nr_migrations_cold = 0; - p->se.nr_failed_migrations_affine = 0; - p->se.nr_failed_migrations_running = 0; - p->se.nr_failed_migrations_hot = 0; - p->se.nr_forced_migrations = 0; - p->se.nr_wakeups = 0; - p->se.nr_wakeups_sync = 0; - p->se.nr_wakeups_migrate = 0; - p->se.nr_wakeups_local = 0; - p->se.nr_wakeups_remote = 0; - p->se.nr_wakeups_affine = 0; - p->se.nr_wakeups_affine_attempts = 0; - p->se.nr_wakeups_passive = 0; - p->se.nr_wakeups_idle = 0; - p->sched_info.bkl_count = 0; + memset(&p->se.statistics, 0, sizeof(p->se.statistics)); #endif } diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c index 5a5ea2c..217e4a9 100644 --- a/kernel/sched_fair.c +++ b/kernel/sched_fair.c @@ -35,8 +35,8 @@ * (to see the precise effective timeslice length of your workload, * run vmstat and monitor the context-switches (cs) field) */ -unsigned int sysctl_sched_latency = 5000000ULL; -unsigned int normalized_sysctl_sched_latency = 5000000ULL; +unsigned int sysctl_sched_latency = 6000000ULL; +unsigned int normalized_sysctl_sched_latency = 6000000ULL; /* * The initial- and re-scaling of tunables is configurable @@ -52,15 +52,15 @@ enum sched_tunable_scaling sysctl_sched_tunable_scaling /* * Minimal preemption granularity for CPU-bound tasks: - * (default: 1 msec * (1 + ilog(ncpus)), units: nanoseconds) + * (default: 2 msec * (1 + ilog(ncpus)), units: nanoseconds) */ -unsigned int sysctl_sched_min_granularity = 1000000ULL; -unsigned int normalized_sysctl_sched_min_granularity = 1000000ULL; +unsigned int sysctl_sched_min_granularity = 2000000ULL; +unsigned int normalized_sysctl_sched_min_granularity = 2000000ULL; /* * is kept at sysctl_sched_latency / sysctl_sched_min_granularity */ -static unsigned int sched_nr_latency = 5; +static unsigned int sched_nr_latency = 3; /* * After fork, child runs first. If set to 0 (default) then @@ -505,7 +505,8 @@ __update_curr(struct cfs_rq *cfs_rq, struct sched_entity *curr, { unsigned long delta_exec_weighted; - schedstat_set(curr->exec_max, max((u64)delta_exec, curr->exec_max)); + schedstat_set(curr->statistics.exec_max, + max((u64)delta_exec, curr->statistics.exec_max)); curr->sum_exec_runtime += delta_exec; schedstat_add(cfs_rq, exec_clock, delta_exec); @@ -548,7 +549,7 @@ static void update_curr(struct cfs_rq *cfs_rq) static inline void update_stats_wait_start(struct cfs_rq *cfs_rq, struct sched_entity *se) { - schedstat_set(se->wait_start, rq_of(cfs_rq)->clock); + schedstat_set(se->statistics.wait_start, rq_of(cfs_rq)->clock); } /* @@ -567,18 +568,18 @@ static void update_stats_enqueue(struct cfs_rq *cfs_rq, struct sched_entity *se) static void update_stats_wait_end(struct cfs_rq *cfs_rq, struct sched_entity *se) { - schedstat_set(se->wait_max, max(se->wait_max, - rq_of(cfs_rq)->clock - se->wait_start)); - schedstat_set(se->wait_count, se->wait_count + 1); - schedstat_set(se->wait_sum, se->wait_sum + - rq_of(cfs_rq)->clock - se->wait_start); + schedstat_set(se->statistics.wait_max, max(se->statistics.wait_max, + rq_of(cfs_rq)->clock - se->statistics.wait_start)); + schedstat_set(se->statistics.wait_count, se->statistics.wait_count + 1); + schedstat_set(se->statistics.wait_sum, se->statistics.wait_sum + + rq_of(cfs_rq)->clock - se->statistics.wait_start); #ifdef CONFIG_SCHEDSTATS if (entity_is_task(se)) { trace_sched_stat_wait(task_of(se), - rq_of(cfs_rq)->clock - se->wait_start); + rq_of(cfs_rq)->clock - se->statistics.wait_start); } #endif - schedstat_set(se->wait_start, 0); + schedstat_set(se->statistics.wait_start, 0); } static inline void @@ -657,39 +658,39 @@ static void enqueue_sleeper(struct cfs_rq *cfs_rq, struct sched_entity *se) if (entity_is_task(se)) tsk = task_of(se); - if (se->sleep_start) { - u64 delta = rq_of(cfs_rq)->clock - se->sleep_start; + if (se->statistics.sleep_start) { + u64 delta = rq_of(cfs_rq)->clock - se->statistics.sleep_start; if ((s64)delta < 0) delta = 0; - if (unlikely(delta > se->sleep_max)) - se->sleep_max = delta; + if (unlikely(delta > se->statistics.sleep_max)) + se->statistics.sleep_max = delta; - se->sleep_start = 0; - se->sum_sleep_runtime += delta; + se->statistics.sleep_start = 0; + se->statistics.sum_sleep_runtime += delta; if (tsk) { account_scheduler_latency(tsk, delta >> 10, 1); trace_sched_stat_sleep(tsk, delta); } } - if (se->block_start) { - u64 delta = rq_of(cfs_rq)->clock - se->block_start; + if (se->statistics.block_start) { + u64 delta = rq_of(cfs_rq)->clock - se->statistics.block_start; if ((s64)delta < 0) delta = 0; - if (unlikely(delta > se->block_max)) - se->block_max = delta; + if (unlikely(delta > se->statistics.block_max)) + se->statistics.block_max = delta; - se->block_start = 0; - se->sum_sleep_runtime += delta; + se->statistics.block_start = 0; + se->statistics.sum_sleep_runtime += delta; if (tsk) { if (tsk->in_iowait) { - se->iowait_sum += delta; - se->iowait_count++; + se->statistics.iowait_sum += delta; + se->statistics.iowait_count++; trace_sched_stat_iowait(tsk, delta); } @@ -737,20 +738,10 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial) vruntime += sched_vslice(cfs_rq, se); /* sleeps up to a single latency don't count. */ - if (!initial && sched_feat(FAIR_SLEEPERS)) { + if (!initial) { unsigned long thresh = sysctl_sched_latency; /* - * Convert the sleeper threshold into virtual time. - * SCHED_IDLE is a special sub-class. We care about - * fairness only relative to other SCHED_IDLE tasks, - * all of which have the same weight. - */ - if (sched_feat(NORMALIZED_SLEEPER) && (!entity_is_task(se) || - task_of(se)->policy != SCHED_IDLE)) - thresh = calc_delta_fair(thresh, se); - - /* * Halve their sleep time's effect, to allow * for a gentler effect of sleepers: */ @@ -766,9 +757,6 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial) se->vruntime = vruntime; } -#define ENQUEUE_WAKEUP 1 -#define ENQUEUE_MIGRATE 2 - static void enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags) { @@ -776,7 +764,7 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags) * Update the normalized vruntime before updating min_vruntime * through callig update_curr(). */ - if (!(flags & ENQUEUE_WAKEUP) || (flags & ENQUEUE_MIGRATE)) + if (!(flags & ENQUEUE_WAKEUP) || (flags & ENQUEUE_WAKING)) se->vruntime += cfs_rq->min_vruntime; /* @@ -812,7 +800,7 @@ static void clear_buddies(struct cfs_rq *cfs_rq, struct sched_entity *se) } static void -dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int sleep) +dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags) { /* * Update run-time statistics of the 'current'. @@ -820,15 +808,15 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int sleep) update_curr(cfs_rq); update_stats_dequeue(cfs_rq, se); - if (sleep) { + if (flags & DEQUEUE_SLEEP) { #ifdef CONFIG_SCHEDSTATS if (entity_is_task(se)) { struct task_struct *tsk = task_of(se); if (tsk->state & TASK_INTERRUPTIBLE) - se->sleep_start = rq_of(cfs_rq)->clock; + se->statistics.sleep_start = rq_of(cfs_rq)->clock; if (tsk->state & TASK_UNINTERRUPTIBLE) - se->block_start = rq_of(cfs_rq)->clock; + se->statistics.block_start = rq_of(cfs_rq)->clock; } #endif } @@ -845,7 +833,7 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int sleep) * update can refer to the ->curr item and we need to reflect this * movement in our normalized position. */ - if (!sleep) + if (!(flags & DEQUEUE_SLEEP)) se->vruntime -= cfs_rq->min_vruntime; } @@ -912,7 +900,7 @@ set_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *se) * when there are only lesser-weight tasks around): */ if (rq_of(cfs_rq)->load.weight >= 2*se->load.weight) { - se->slice_max = max(se->slice_max, + se->statistics.slice_max = max(se->statistics.slice_max, se->sum_exec_runtime - se->prev_sum_exec_runtime); } #endif @@ -1054,16 +1042,10 @@ static inline void hrtick_update(struct rq *rq) * then put the task into the rbtree: */ static void -enqueue_task_fair(struct rq *rq, struct task_struct *p, int wakeup, bool head) +enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags) { struct cfs_rq *cfs_rq; struct sched_entity *se = &p->se; - int flags = 0; - - if (wakeup) - flags |= ENQUEUE_WAKEUP; - if (p->state == TASK_WAKING) - flags |= ENQUEUE_MIGRATE; for_each_sched_entity(se) { if (se->on_rq) @@ -1081,18 +1063,18 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int wakeup, bool head) * decreased. We remove the task from the rbtree and * update the fair scheduling stats: */ -static void dequeue_task_fair(struct rq *rq, struct task_struct *p, int sleep) +static void dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags) { struct cfs_rq *cfs_rq; struct sched_entity *se = &p->se; for_each_sched_entity(se) { cfs_rq = cfs_rq_of(se); - dequeue_entity(cfs_rq, se, sleep); + dequeue_entity(cfs_rq, se, flags); /* Don't dequeue parent if it has other entities besides us */ if (cfs_rq->load.weight) break; - sleep = 1; + flags |= DEQUEUE_SLEEP; } hrtick_update(rq); @@ -1240,7 +1222,6 @@ static inline unsigned long effective_load(struct task_group *tg, int cpu, static int wake_affine(struct sched_domain *sd, struct task_struct *p, int sync) { - struct task_struct *curr = current; unsigned long this_load, load; int idx, this_cpu, prev_cpu; unsigned long tl_per_task; @@ -1255,18 +1236,6 @@ static int wake_affine(struct sched_domain *sd, struct task_struct *p, int sync) load = source_load(prev_cpu, idx); this_load = target_load(this_cpu, idx); - if (sync) { - if (sched_feat(SYNC_LESS) && - (curr->se.avg_overlap > sysctl_sched_migration_cost || - p->se.avg_overlap > sysctl_sched_migration_cost)) - sync = 0; - } else { - if (sched_feat(SYNC_MORE) && - (curr->se.avg_overlap < sysctl_sched_migration_cost && - p->se.avg_overlap < sysctl_sched_migration_cost)) - sync = 1; - } - /* * If sync wakeup then subtract the (maximum possible) * effect of the currently running task from the load @@ -1306,7 +1275,7 @@ static int wake_affine(struct sched_domain *sd, struct task_struct *p, int sync) if (sync && balanced) return 1; - schedstat_inc(p, se.nr_wakeups_affine_attempts); + schedstat_inc(p, se.statistics.nr_wakeups_affine_attempts); tl_per_task = cpu_avg_load_per_task(this_cpu); if (balanced || @@ -1318,7 +1287,7 @@ static int wake_affine(struct sched_domain *sd, struct task_struct *p, int sync) * there is no bad imbalance. */ schedstat_inc(sd, ttwu_move_affine); - schedstat_inc(p, se.nr_wakeups_affine); + schedstat_inc(p, se.statistics.nr_wakeups_affine); return 1; } @@ -1406,29 +1375,48 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu) /* * Try and locate an idle CPU in the sched_domain. */ -static int -select_idle_sibling(struct task_struct *p, struct sched_domain *sd, int target) +static int select_idle_sibling(struct task_struct *p, int target) { int cpu = smp_processor_id(); int prev_cpu = task_cpu(p); + struct sched_domain *sd; int i; /* - * If this domain spans both cpu and prev_cpu (see the SD_WAKE_AFFINE - * test in select_task_rq_fair) and the prev_cpu is idle then that's - * always a better target than the current cpu. + * If the task is going to be woken-up on this cpu and if it is + * already idle, then it is the right target. */ - if (target == cpu && !cpu_rq(prev_cpu)->cfs.nr_running) + if (target == cpu && idle_cpu(cpu)) + return cpu; + + /* + * If the task is going to be woken-up on the cpu where it previously + * ran and if it is currently idle, then it the right target. + */ + if (target == prev_cpu && idle_cpu(prev_cpu)) return prev_cpu; /* - * Otherwise, iterate the domain and find an elegible idle cpu. + * Otherwise, iterate the domains and find an elegible idle cpu. */ - for_each_cpu_and(i, sched_domain_span(sd), &p->cpus_allowed) { - if (!cpu_rq(i)->cfs.nr_running) { - target = i; + for_each_domain(target, sd) { + if (!(sd->flags & SD_SHARE_PKG_RESOURCES)) break; + + for_each_cpu_and(i, sched_domain_span(sd), &p->cpus_allowed) { + if (idle_cpu(i)) { + target = i; + break; + } } + + /* + * Lets stop looking for an idle sibling when we reached + * the domain that spans the current cpu and prev_cpu. + */ + if (cpumask_test_cpu(cpu, sched_domain_span(sd)) && + cpumask_test_cpu(prev_cpu, sched_domain_span(sd))) + break; } return target; @@ -1445,7 +1433,8 @@ select_idle_sibling(struct task_struct *p, struct sched_domain *sd, int target) * * preempt must be disabled. */ -static int select_task_rq_fair(struct task_struct *p, int sd_flag, int wake_flags) +static int +select_task_rq_fair(struct rq *rq, struct task_struct *p, int sd_flag, int wake_flags) { struct sched_domain *tmp, *affine_sd = NULL, *sd = NULL; int cpu = smp_processor_id(); @@ -1456,8 +1445,7 @@ static int select_task_rq_fair(struct task_struct *p, int sd_flag, int wake_flag int sync = wake_flags & WF_SYNC; if (sd_flag & SD_BALANCE_WAKE) { - if (sched_feat(AFFINE_WAKEUPS) && - cpumask_test_cpu(cpu, &p->cpus_allowed)) + if (cpumask_test_cpu(cpu, &p->cpus_allowed)) want_affine = 1; new_cpu = prev_cpu; } @@ -1491,34 +1479,13 @@ static int select_task_rq_fair(struct task_struct *p, int sd_flag, int wake_flag } /* - * While iterating the domains looking for a spanning - * WAKE_AFFINE domain, adjust the affine target to any idle cpu - * in cache sharing domains along the way. + * If both cpu and prev_cpu are part of this domain, + * cpu is a valid SD_WAKE_AFFINE target. */ - if (want_affine) { - int target = -1; - - /* - * If both cpu and prev_cpu are part of this domain, - * cpu is a valid SD_WAKE_AFFINE target. - */ - if (cpumask_test_cpu(prev_cpu, sched_domain_span(tmp))) - target = cpu; - - /* - * If there's an idle sibling in this domain, make that - * the wake_affine target instead of the current cpu. - */ - if (tmp->flags & SD_SHARE_PKG_RESOURCES) - target = select_idle_sibling(p, tmp, target); - - if (target >= 0) { - if (tmp->flags & SD_WAKE_AFFINE) { - affine_sd = tmp; - want_affine = 0; - } - cpu = target; - } + if (want_affine && (tmp->flags & SD_WAKE_AFFINE) && + cpumask_test_cpu(prev_cpu, sched_domain_span(tmp))) { + affine_sd = tmp; + want_affine = 0; } if (!want_sd && !want_affine) @@ -1531,22 +1498,29 @@ static int select_task_rq_fair(struct task_struct *p, int sd_flag, int wake_flag sd = tmp; } +#ifdef CONFIG_FAIR_GROUP_SCHED if (sched_feat(LB_SHARES_UPDATE)) { /* * Pick the largest domain to update shares over */ tmp = sd; - if (affine_sd && (!tmp || - cpumask_weight(sched_domain_span(affine_sd)) > - cpumask_weight(sched_domain_span(sd)))) + if (affine_sd && (!tmp || affine_sd->span_weight > sd->span_weight)) tmp = affine_sd; - if (tmp) + if (tmp) { + raw_spin_unlock(&rq->lock); update_shares(tmp); + raw_spin_lock(&rq->lock); + } } +#endif - if (affine_sd && wake_affine(affine_sd, p, sync)) - return cpu; + if (affine_sd) { + if (cpu == prev_cpu || wake_affine(affine_sd, p, sync)) + return select_idle_sibling(p, cpu); + else + return select_idle_sibling(p, prev_cpu); + } while (sd) { int load_idx = sd->forkexec_idx; @@ -1576,10 +1550,10 @@ static int select_task_rq_fair(struct task_struct *p, int sd_flag, int wake_flag /* Now try balancing at a lower domain level of new_cpu */ cpu = new_cpu; - weight = cpumask_weight(sched_domain_span(sd)); + weight = sd->span_weight; sd = NULL; for_each_domain(cpu, tmp) { - if (weight <= cpumask_weight(sched_domain_span(tmp))) + if (weight <= tmp->span_weight) break; if (tmp->flags & sd_flag) sd = tmp; @@ -1591,63 +1565,26 @@ static int select_task_rq_fair(struct task_struct *p, int sd_flag, int wake_flag } #endif /* CONFIG_SMP */ -/* - * Adaptive granularity - * - * se->avg_wakeup gives the average time a task runs until it does a wakeup, - * with the limit of wakeup_gran -- when it never does a wakeup. - * - * So the smaller avg_wakeup is the faster we want this task to preempt, - * but we don't want to treat the preemptee unfairly and therefore allow it - * to run for at least the amount of time we'd like to run. - * - * NOTE: we use 2*avg_wakeup to increase the probability of actually doing one - * - * NOTE: we use *nr_running to scale with load, this nicely matches the - * degrading latency on load. - */ -static unsigned long -adaptive_gran(struct sched_entity *curr, struct sched_entity *se) -{ - u64 this_run = curr->sum_exec_runtime - curr->prev_sum_exec_runtime; - u64 expected_wakeup = 2*se->avg_wakeup * cfs_rq_of(se)->nr_running; - u64 gran = 0; - - if (this_run < expected_wakeup) - gran = expected_wakeup - this_run; - - return min_t(s64, gran, sysctl_sched_wakeup_granularity); -} - static unsigned long wakeup_gran(struct sched_entity *curr, struct sched_entity *se) { unsigned long gran = sysctl_sched_wakeup_granularity; - if (cfs_rq_of(curr)->curr && sched_feat(ADAPTIVE_GRAN)) - gran = adaptive_gran(curr, se); - /* * Since its curr running now, convert the gran from real-time * to virtual-time in his units. + * + * By using 'se' instead of 'curr' we penalize light tasks, so + * they get preempted easier. That is, if 'se' < 'curr' then + * the resulting gran will be larger, therefore penalizing the + * lighter, if otoh 'se' > 'curr' then the resulting gran will + * be smaller, again penalizing the lighter task. + * + * This is especially important for buddies when the leftmost + * task is higher priority than the buddy. */ - if (sched_feat(ASYM_GRAN)) { - /* - * By using 'se' instead of 'curr' we penalize light tasks, so - * they get preempted easier. That is, if 'se' < 'curr' then - * the resulting gran will be larger, therefore penalizing the - * lighter, if otoh 'se' > 'curr' then the resulting gran will - * be smaller, again penalizing the lighter task. - * - * This is especially important for buddies when the leftmost - * task is higher priority than the buddy. - */ - if (unlikely(se->load.weight != NICE_0_LOAD)) - gran = calc_delta_fair(gran, se); - } else { - if (unlikely(curr->load.weight != NICE_0_LOAD)) - gran = calc_delta_fair(gran, curr); - } + if (unlikely(se->load.weight != NICE_0_LOAD)) + gran = calc_delta_fair(gran, se); return gran; } @@ -1705,7 +1642,6 @@ static void check_preempt_wakeup(struct rq *rq, struct task_struct *p, int wake_ struct task_struct *curr = rq->curr; struct sched_entity *se = &curr->se, *pse = &p->se; struct cfs_rq *cfs_rq = task_cfs_rq(curr); - int sync = wake_flags & WF_SYNC; int scale = cfs_rq->nr_running >= sched_nr_latency; if (unlikely(rt_prio(p->prio))) @@ -1738,14 +1674,6 @@ static void check_preempt_wakeup(struct rq *rq, struct task_struct *p, int wake_ if (unlikely(curr->policy == SCHED_IDLE)) goto preempt; - if (sched_feat(WAKEUP_SYNC) && sync) - goto preempt; - - if (sched_feat(WAKEUP_OVERLAP) && - se->avg_overlap < sysctl_sched_migration_cost && - pse->avg_overlap < sysctl_sched_migration_cost) - goto preempt; - if (!sched_feat(WAKEUP_PREEMPT)) return; @@ -1844,13 +1772,13 @@ int can_migrate_task(struct task_struct *p, struct rq *rq, int this_cpu, * 3) are cache-hot on their current CPU. */ if (!cpumask_test_cpu(this_cpu, &p->cpus_allowed)) { - schedstat_inc(p, se.nr_failed_migrations_affine); + schedstat_inc(p, se.statistics.nr_failed_migrations_affine); return 0; } *all_pinned = 0; if (task_running(rq, p)) { - schedstat_inc(p, se.nr_failed_migrations_running); + schedstat_inc(p, se.statistics.nr_failed_migrations_running); return 0; } @@ -1866,14 +1794,14 @@ int can_migrate_task(struct task_struct *p, struct rq *rq, int this_cpu, #ifdef CONFIG_SCHEDSTATS if (tsk_cache_hot) { schedstat_inc(sd, lb_hot_gained[idle]); - schedstat_inc(p, se.nr_forced_migrations); + schedstat_inc(p, se.statistics.nr_forced_migrations); } #endif return 1; } if (tsk_cache_hot) { - schedstat_inc(p, se.nr_failed_migrations_hot); + schedstat_inc(p, se.statistics.nr_failed_migrations_hot); return 0; } return 1; @@ -2311,7 +2239,7 @@ unsigned long __weak arch_scale_freq_power(struct sched_domain *sd, int cpu) unsigned long default_scale_smt_power(struct sched_domain *sd, int cpu) { - unsigned long weight = cpumask_weight(sched_domain_span(sd)); + unsigned long weight = sd->span_weight; unsigned long smt_gain = sd->smt_gain; smt_gain /= weight; @@ -2344,7 +2272,7 @@ unsigned long scale_rt_power(int cpu) static void update_cpu_power(struct sched_domain *sd, int cpu) { - unsigned long weight = cpumask_weight(sched_domain_span(sd)); + unsigned long weight = sd->span_weight; unsigned long power = SCHED_LOAD_SCALE; struct sched_group *sdg = sd->groups; @@ -2870,6 +2798,8 @@ static int need_active_balance(struct sched_domain *sd, int sd_idle, int idle) return unlikely(sd->nr_balance_failed > sd->cache_nice_tries+2); } +static int active_load_balance_cpu_stop(void *data); + /* * Check this_cpu to ensure it is balanced within domain. Attempt to move * tasks if there is an imbalance. @@ -2959,8 +2889,9 @@ redo: if (need_active_balance(sd, sd_idle, idle)) { raw_spin_lock_irqsave(&busiest->lock, flags); - /* don't kick the migration_thread, if the curr - * task on busiest cpu can't be moved to this_cpu + /* don't kick the active_load_balance_cpu_stop, + * if the curr task on busiest cpu can't be + * moved to this_cpu */ if (!cpumask_test_cpu(this_cpu, &busiest->curr->cpus_allowed)) { @@ -2970,14 +2901,22 @@ redo: goto out_one_pinned; } + /* + * ->active_balance synchronizes accesses to + * ->active_balance_work. Once set, it's cleared + * only after active load balance is finished. + */ if (!busiest->active_balance) { busiest->active_balance = 1; busiest->push_cpu = this_cpu; active_balance = 1; } raw_spin_unlock_irqrestore(&busiest->lock, flags); + if (active_balance) - wake_up_process(busiest->migration_thread); + stop_one_cpu_nowait(cpu_of(busiest), + active_load_balance_cpu_stop, busiest, + &busiest->active_balance_work); /* * We've kicked active balancing, reset the failure @@ -3084,24 +3023,29 @@ static void idle_balance(int this_cpu, struct rq *this_rq) } /* - * active_load_balance is run by migration threads. It pushes running tasks - * off the busiest CPU onto idle CPUs. It requires at least 1 task to be - * running on each physical CPU where possible, and avoids physical / - * logical imbalances. - * - * Called with busiest_rq locked. + * active_load_balance_cpu_stop is run by cpu stopper. It pushes + * running tasks off the busiest CPU onto idle CPUs. It requires at + * least 1 task to be running on each physical CPU where possible, and + * avoids physical / logical imbalances. */ -static void active_load_balance(struct rq *busiest_rq, int busiest_cpu) +static int active_load_balance_cpu_stop(void *data) { + struct rq *busiest_rq = data; + int busiest_cpu = cpu_of(busiest_rq); int target_cpu = busiest_rq->push_cpu; + struct rq *target_rq = cpu_rq(target_cpu); struct sched_domain *sd; - struct rq *target_rq; + + raw_spin_lock_irq(&busiest_rq->lock); + + /* make sure the requested cpu hasn't gone down in the meantime */ + if (unlikely(busiest_cpu != smp_processor_id() || + !busiest_rq->active_balance)) + goto out_unlock; /* Is there any task to move? */ if (busiest_rq->nr_running <= 1) - return; - - target_rq = cpu_rq(target_cpu); + goto out_unlock; /* * This condition is "impossible", if it occurs @@ -3112,8 +3056,6 @@ static void active_load_balance(struct rq *busiest_rq, int busiest_cpu) /* move a task from busiest_rq to target_rq */ double_lock_balance(busiest_rq, target_rq); - update_rq_clock(busiest_rq); - update_rq_clock(target_rq); /* Search for an sd spanning us and the target CPU. */ for_each_domain(target_cpu, sd) { @@ -3132,6 +3074,10 @@ static void active_load_balance(struct rq *busiest_rq, int busiest_cpu) schedstat_inc(sd, alb_failed); } double_unlock_balance(busiest_rq, target_rq); +out_unlock: + busiest_rq->active_balance = 0; + raw_spin_unlock_irq(&busiest_rq->lock); + return 0; } #ifdef CONFIG_NO_HZ diff --git a/kernel/sched_features.h b/kernel/sched_features.h index d5059fd..83c66e8 100644 --- a/kernel/sched_features.h +++ b/kernel/sched_features.h @@ -1,11 +1,4 @@ /* - * Disregards a certain amount of sleep time (sched_latency_ns) and - * considers the task to be running during that period. This gives it - * a service deficit on wakeup, allowing it to run sooner. - */ -SCHED_FEAT(FAIR_SLEEPERS, 1) - -/* * Only give sleepers 50% of their service deficit. This allows * them to run sooner, but does not allow tons of sleepers to * rip the spread apart. @@ -13,13 +6,6 @@ SCHED_FEAT(FAIR_SLEEPERS, 1) SCHED_FEAT(GENTLE_FAIR_SLEEPERS, 1) /* - * By not normalizing the sleep time, heavy tasks get an effective - * longer period, and lighter task an effective shorter period they - * are considered running. - */ -SCHED_FEAT(NORMALIZED_SLEEPER, 0) - -/* * Place new tasks ahead so that they do not starve already running * tasks */ @@ -31,37 +17,6 @@ SCHED_FEAT(START_DEBIT, 1) SCHED_FEAT(WAKEUP_PREEMPT, 1) /* - * Compute wakeup_gran based on task behaviour, clipped to - * [0, sched_wakeup_gran_ns] - */ -SCHED_FEAT(ADAPTIVE_GRAN, 1) - -/* - * When converting the wakeup granularity to virtual time, do it such - * that heavier tasks preempting a lighter task have an edge. - */ -SCHED_FEAT(ASYM_GRAN, 1) - -/* - * Always wakeup-preempt SYNC wakeups, see SYNC_WAKEUPS. - */ -SCHED_FEAT(WAKEUP_SYNC, 0) - -/* - * Wakeup preempt based on task behaviour. Tasks that do not overlap - * don't get preempted. - */ -SCHED_FEAT(WAKEUP_OVERLAP, 0) - -/* - * Use the SYNC wakeup hint, pipes and the likes use this to indicate - * the remote end is likely to consume the data we just wrote, and - * therefore has cache benefit from being placed on the same cpu, see - * also AFFINE_WAKEUPS. - */ -SCHED_FEAT(SYNC_WAKEUPS, 1) - -/* * Based on load and program behaviour, see if it makes sense to place * a newly woken task on the same cpu as the task that woke it -- * improve cache locality. Typically used with SYNC wakeups as @@ -70,16 +25,6 @@ SCHED_FEAT(SYNC_WAKEUPS, 1) SCHED_FEAT(AFFINE_WAKEUPS, 1) /* - * Weaken SYNC hint based on overlap - */ -SCHED_FEAT(SYNC_LESS, 1) - -/* - * Add SYNC hint based on overlap - */ -SCHED_FEAT(SYNC_MORE, 0) - -/* * Prefer to schedule the task we woke last (assuming it failed * wakeup-preemption), since its likely going to consume data we * touched, increases cache locality. diff --git a/kernel/sched_idletask.c b/kernel/sched_idletask.c index a8a6d8a..9fa0f40 100644 --- a/kernel/sched_idletask.c +++ b/kernel/sched_idletask.c @@ -6,7 +6,8 @@ */ #ifdef CONFIG_SMP -static int select_task_rq_idle(struct task_struct *p, int sd_flag, int flags) +static int +select_task_rq_idle(struct rq *rq, struct task_struct *p, int sd_flag, int flags) { return task_cpu(p); /* IDLE tasks as never migrated */ } @@ -22,8 +23,7 @@ static void check_preempt_curr_idle(struct rq *rq, struct task_struct *p, int fl static struct task_struct *pick_next_task_idle(struct rq *rq) { schedstat_inc(rq, sched_goidle); - /* adjust the active tasks as we might go into a long sleep */ - calc_load_account_active(rq); + calc_load_account_idle(rq); return rq->idle; } @@ -32,7 +32,7 @@ static struct task_struct *pick_next_task_idle(struct rq *rq) * message if some code attempts to do it: */ static void -dequeue_task_idle(struct rq *rq, struct task_struct *p, int sleep) +dequeue_task_idle(struct rq *rq, struct task_struct *p, int flags) { raw_spin_unlock_irq(&rq->lock); printk(KERN_ERR "bad: scheduling from the idle thread!\n"); diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c index b5b920a..8afb953 100644 --- a/kernel/sched_rt.c +++ b/kernel/sched_rt.c @@ -613,7 +613,7 @@ static void update_curr_rt(struct rq *rq) if (unlikely((s64)delta_exec < 0)) delta_exec = 0; - schedstat_set(curr->se.exec_max, max(curr->se.exec_max, delta_exec)); + schedstat_set(curr->se.statistics.exec_max, max(curr->se.statistics.exec_max, delta_exec)); curr->se.sum_exec_runtime += delta_exec; account_group_exec_runtime(curr, delta_exec); @@ -888,20 +888,20 @@ static void dequeue_rt_entity(struct sched_rt_entity *rt_se) * Adding/removing a task to/from a priority array: */ static void -enqueue_task_rt(struct rq *rq, struct task_struct *p, int wakeup, bool head) +enqueue_task_rt(struct rq *rq, struct task_struct *p, int flags) { struct sched_rt_entity *rt_se = &p->rt; - if (wakeup) + if (flags & ENQUEUE_WAKEUP) rt_se->timeout = 0; - enqueue_rt_entity(rt_se, head); + enqueue_rt_entity(rt_se, flags & ENQUEUE_HEAD); if (!task_current(rq, p) && p->rt.nr_cpus_allowed > 1) enqueue_pushable_task(rq, p); } -static void dequeue_task_rt(struct rq *rq, struct task_struct *p, int sleep) +static void dequeue_task_rt(struct rq *rq, struct task_struct *p, int flags) { struct sched_rt_entity *rt_se = &p->rt; @@ -948,10 +948,9 @@ static void yield_task_rt(struct rq *rq) #ifdef CONFIG_SMP static int find_lowest_rq(struct task_struct *task); -static int select_task_rq_rt(struct task_struct *p, int sd_flag, int flags) +static int +select_task_rq_rt(struct rq *rq, struct task_struct *p, int sd_flag, int flags) { - struct rq *rq = task_rq(p); - if (sd_flag != SD_BALANCE_WAKE) return smp_processor_id(); diff --git a/kernel/softirq.c b/kernel/softirq.c index 7c1a67e..0db913a 100644 --- a/kernel/softirq.c +++ b/kernel/softirq.c @@ -716,7 +716,7 @@ static int run_ksoftirqd(void * __bind_cpu) preempt_enable_no_resched(); cond_resched(); preempt_disable(); - rcu_sched_qs((long)__bind_cpu); + rcu_note_context_switch((long)__bind_cpu); } preempt_enable(); set_current_state(TASK_INTERRUPTIBLE); diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c index 9bb9fb1..b4e7431 100644 --- a/kernel/stop_machine.c +++ b/kernel/stop_machine.c @@ -1,17 +1,384 @@ -/* Copyright 2008, 2005 Rusty Russell rusty@rustcorp.com.au IBM Corporation. - * GPL v2 and any later version. +/* + * kernel/stop_machine.c + * + * Copyright (C) 2008, 2005 IBM Corporation. + * Copyright (C) 2008, 2005 Rusty Russell rusty@rustcorp.com.au + * Copyright (C) 2010 SUSE Linux Products GmbH + * Copyright (C) 2010 Tejun Heo <tj@kernel.org> + * + * This file is released under the GPLv2 and any later version. */ +#include <linux/completion.h> #include <linux/cpu.h> -#include <linux/err.h> +#include <linux/init.h> #include <linux/kthread.h> #include <linux/module.h> +#include <linux/percpu.h> #include <linux/sched.h> #include <linux/stop_machine.h> -#include <linux/syscalls.h> #include <linux/interrupt.h> +#include <linux/kallsyms.h> #include <asm/atomic.h> -#include <asm/uaccess.h> + +/* + * Structure to determine completion condition and record errors. May + * be shared by works on different cpus. + */ +struct cpu_stop_done { + atomic_t nr_todo; /* nr left to execute */ + bool executed; /* actually executed? */ + int ret; /* collected return value */ + struct completion completion; /* fired if nr_todo reaches 0 */ +}; + +/* the actual stopper, one per every possible cpu, enabled on online cpus */ +struct cpu_stopper { + spinlock_t lock; + struct list_head works; /* list of pending works */ + struct task_struct *thread; /* stopper thread */ + bool enabled; /* is this stopper enabled? */ +}; + +static DEFINE_PER_CPU(struct cpu_stopper, cpu_stopper); + +static void cpu_stop_init_done(struct cpu_stop_done *done, unsigned int nr_todo) +{ + memset(done, 0, sizeof(*done)); + atomic_set(&done->nr_todo, nr_todo); + init_completion(&done->completion); +} + +/* signal completion unless @done is NULL */ +static void cpu_stop_signal_done(struct cpu_stop_done *done, bool executed) +{ + if (done) { + if (executed) + done->executed = true; + if (atomic_dec_and_test(&done->nr_todo)) + complete(&done->completion); + } +} + +/* queue @work to @stopper. if offline, @work is completed immediately */ +static void cpu_stop_queue_work(struct cpu_stopper *stopper, + struct cpu_stop_work *work) +{ + unsigned long flags; + + spin_lock_irqsave(&stopper->lock, flags); + + if (stopper->enabled) { + list_add_tail(&work->list, &stopper->works); + wake_up_process(stopper->thread); + } else + cpu_stop_signal_done(work->done, false); + + spin_unlock_irqrestore(&stopper->lock, flags); +} + +/** + * stop_one_cpu - stop a cpu + * @cpu: cpu to stop + * @fn: function to execute + * @arg: argument to @fn + * + * Execute @fn(@arg) on @cpu. @fn is run in a process context with + * the highest priority preempting any task on the cpu and + * monopolizing it. This function returns after the execution is + * complete. + * + * This function doesn't guarantee @cpu stays online till @fn + * completes. If @cpu goes down in the middle, execution may happen + * partially or fully on different cpus. @fn should either be ready + * for that or the caller should ensure that @cpu stays online until + * this function completes. + * + * CONTEXT: + * Might sleep. + * + * RETURNS: + * -ENOENT if @fn(@arg) was not executed because @cpu was offline; + * otherwise, the return value of @fn. + */ +int stop_one_cpu(unsigned int cpu, cpu_stop_fn_t fn, void *arg) +{ + struct cpu_stop_done done; + struct cpu_stop_work work = { .fn = fn, .arg = arg, .done = &done }; + + cpu_stop_init_done(&done, 1); + cpu_stop_queue_work(&per_cpu(cpu_stopper, cpu), &work); + wait_for_completion(&done.completion); + return done.executed ? done.ret : -ENOENT; +} + +/** + * stop_one_cpu_nowait - stop a cpu but don't wait for completion + * @cpu: cpu to stop + * @fn: function to execute + * @arg: argument to @fn + * + * Similar to stop_one_cpu() but doesn't wait for completion. The + * caller is responsible for ensuring @work_buf is currently unused + * and will remain untouched until stopper starts executing @fn. + * + * CONTEXT: + * Don't care. + */ +void stop_one_cpu_nowait(unsigned int cpu, cpu_stop_fn_t fn, void *arg, + struct cpu_stop_work *work_buf) +{ + *work_buf = (struct cpu_stop_work){ .fn = fn, .arg = arg, }; + cpu_stop_queue_work(&per_cpu(cpu_stopper, cpu), work_buf); +} + +/* static data for stop_cpus */ +static DEFINE_MUTEX(stop_cpus_mutex); +static DEFINE_PER_CPU(struct cpu_stop_work, stop_cpus_work); + +int __stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn, void *arg) +{ + struct cpu_stop_work *work; + struct cpu_stop_done done; + unsigned int cpu; + + /* initialize works and done */ + for_each_cpu(cpu, cpumask) { + work = &per_cpu(stop_cpus_work, cpu); + work->fn = fn; + work->arg = arg; + work->done = &done; + } + cpu_stop_init_done(&done, cpumask_weight(cpumask)); + + /* + * Disable preemption while queueing to avoid getting + * preempted by a stopper which might wait for other stoppers + * to enter @fn which can lead to deadlock. + */ + preempt_disable(); + for_each_cpu(cpu, cpumask) + cpu_stop_queue_work(&per_cpu(cpu_stopper, cpu), + &per_cpu(stop_cpus_work, cpu)); + preempt_enable(); + + wait_for_completion(&done.completion); + return done.executed ? done.ret : -ENOENT; +} + +/** + * stop_cpus - stop multiple cpus + * @cpumask: cpus to stop + * @fn: function to execute + * @arg: argument to @fn + * + * Execute @fn(@arg) on online cpus in @cpumask. On each target cpu, + * @fn is run in a process context with the highest priority + * preempting any task on the cpu and monopolizing it. This function + * returns after all executions are complete. + * + * This function doesn't guarantee the cpus in @cpumask stay online + * till @fn completes. If some cpus go down in the middle, execution + * on the cpu may happen partially or fully on different cpus. @fn + * should either be ready for that or the caller should ensure that + * the cpus stay online until this function completes. + * + * All stop_cpus() calls are serialized making it safe for @fn to wait + * for all cpus to start executing it. + * + * CONTEXT: + * Might sleep. + * + * RETURNS: + * -ENOENT if @fn(@arg) was not executed at all because all cpus in + * @cpumask were offline; otherwise, 0 if all executions of @fn + * returned 0, any non zero return value if any returned non zero. + */ +int stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn, void *arg) +{ + int ret; + + /* static works are used, process one request at a time */ + mutex_lock(&stop_cpus_mutex); + ret = __stop_cpus(cpumask, fn, arg); + mutex_unlock(&stop_cpus_mutex); + return ret; +} + +/** + * try_stop_cpus - try to stop multiple cpus + * @cpumask: cpus to stop + * @fn: function to execute + * @arg: argument to @fn + * + * Identical to stop_cpus() except that it fails with -EAGAIN if + * someone else is already using the facility. + * + * CONTEXT: + * Might sleep. + * + * RETURNS: + * -EAGAIN if someone else is already stopping cpus, -ENOENT if + * @fn(@arg) was not executed at all because all cpus in @cpumask were + * offline; otherwise, 0 if all executions of @fn returned 0, any non + * zero return value if any returned non zero. + */ +int try_stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn, void *arg) +{ + int ret; + + /* static works are used, process one request at a time */ + if (!mutex_trylock(&stop_cpus_mutex)) + return -EAGAIN; + ret = __stop_cpus(cpumask, fn, arg); + mutex_unlock(&stop_cpus_mutex); + return ret; +} + +static int cpu_stopper_thread(void *data) +{ + struct cpu_stopper *stopper = data; + struct cpu_stop_work *work; + int ret; + +repeat: + set_current_state(TASK_INTERRUPTIBLE); /* mb paired w/ kthread_stop */ + + if (kthread_should_stop()) { + __set_current_state(TASK_RUNNING); + return 0; + } + + work = NULL; + spin_lock_irq(&stopper->lock); + if (!list_empty(&stopper->works)) { + work = list_first_entry(&stopper->works, + struct cpu_stop_work, list); + list_del_init(&work->list); + } + spin_unlock_irq(&stopper->lock); + + if (work) { + cpu_stop_fn_t fn = work->fn; + void *arg = work->arg; + struct cpu_stop_done *done = work->done; + char ksym_buf[KSYM_NAME_LEN]; + + __set_current_state(TASK_RUNNING); + + /* cpu stop callbacks are not allowed to sleep */ + preempt_disable(); + + ret = fn(arg); + if (ret) + done->ret = ret; + + /* restore preemption and check it's still balanced */ + preempt_enable(); + WARN_ONCE(preempt_count(), + "cpu_stop: %s(%p) leaked preempt count\n", + kallsyms_lookup((unsigned long)fn, NULL, NULL, NULL, + ksym_buf), arg); + + cpu_stop_signal_done(done, true); + } else + schedule(); + + goto repeat; +} + +/* manage stopper for a cpu, mostly lifted from sched migration thread mgmt */ +static int __cpuinit cpu_stop_cpu_callback(struct notifier_block *nfb, + unsigned long action, void *hcpu) +{ + struct sched_param param = { .sched_priority = MAX_RT_PRIO - 1 }; + unsigned int cpu = (unsigned long)hcpu; + struct cpu_stopper *stopper = &per_cpu(cpu_stopper, cpu); + struct task_struct *p; + + switch (action & ~CPU_TASKS_FROZEN) { + case CPU_UP_PREPARE: + BUG_ON(stopper->thread || stopper->enabled || + !list_empty(&stopper->works)); + p = kthread_create(cpu_stopper_thread, stopper, "migration/%d", + cpu); + if (IS_ERR(p)) + return NOTIFY_BAD; + sched_setscheduler_nocheck(p, SCHED_FIFO, ¶m); + get_task_struct(p); + stopper->thread = p; + break; + + case CPU_ONLINE: + kthread_bind(stopper->thread, cpu); + /* strictly unnecessary, as first user will wake it */ + wake_up_process(stopper->thread); + /* mark enabled */ + spin_lock_irq(&stopper->lock); + stopper->enabled = true; + spin_unlock_irq(&stopper->lock); + break; + +#ifdef CONFIG_HOTPLUG_CPU + case CPU_UP_CANCELED: + case CPU_DEAD: + { + struct cpu_stop_work *work; + + /* kill the stopper */ + kthread_stop(stopper->thread); + /* drain remaining works */ + spin_lock_irq(&stopper->lock); + list_for_each_entry(work, &stopper->works, list) + cpu_stop_signal_done(work->done, false); + stopper->enabled = false; + spin_unlock_irq(&stopper->lock); + /* release the stopper */ + put_task_struct(stopper->thread); + stopper->thread = NULL; + break; + } +#endif + } + + return NOTIFY_OK; +} + +/* + * Give it a higher priority so that cpu stopper is available to other + * cpu notifiers. It currently shares the same priority as sched + * migration_notifier. + */ +static struct notifier_block __cpuinitdata cpu_stop_cpu_notifier = { + .notifier_call = cpu_stop_cpu_callback, + .priority = 10, +}; + +static int __init cpu_stop_init(void) +{ + void *bcpu = (void *)(long)smp_processor_id(); + unsigned int cpu; + int err; + + for_each_possible_cpu(cpu) { + struct cpu_stopper *stopper = &per_cpu(cpu_stopper, cpu); + + spin_lock_init(&stopper->lock); + INIT_LIST_HEAD(&stopper->works); + } + + /* start one for the boot cpu */ + err = cpu_stop_cpu_callback(&cpu_stop_cpu_notifier, CPU_UP_PREPARE, + bcpu); + BUG_ON(err == NOTIFY_BAD); + cpu_stop_cpu_callback(&cpu_stop_cpu_notifier, CPU_ONLINE, bcpu); + register_cpu_notifier(&cpu_stop_cpu_notifier); + + return 0; +} +early_initcall(cpu_stop_init); + +#ifdef CONFIG_STOP_MACHINE /* This controls the threads on each CPU. */ enum stopmachine_state { @@ -26,174 +393,94 @@ enum stopmachine_state { /* Exit */ STOPMACHINE_EXIT, }; -static enum stopmachine_state state; struct stop_machine_data { - int (*fn)(void *); - void *data; - int fnret; + int (*fn)(void *); + void *data; + /* Like num_online_cpus(), but hotplug cpu uses us, so we need this. */ + unsigned int num_threads; + const struct cpumask *active_cpus; + + enum stopmachine_state state; + atomic_t thread_ack; }; -/* Like num_online_cpus(), but hotplug cpu uses us, so we need this. */ -static unsigned int num_threads; -static atomic_t thread_ack; -static DEFINE_MUTEX(lock); -/* setup_lock protects refcount, stop_machine_wq and stop_machine_work. */ -static DEFINE_MUTEX(setup_lock); -/* Users of stop_machine. */ -static int refcount; -static struct workqueue_struct *stop_machine_wq; -static struct stop_machine_data active, idle; -static const struct cpumask *active_cpus; -static void __percpu *stop_machine_work; - -static void set_state(enum stopmachine_state newstate) +static void set_state(struct stop_machine_data *smdata, + enum stopmachine_state newstate) { /* Reset ack counter. */ - atomic_set(&thread_ack, num_threads); + atomic_set(&smdata->thread_ack, smdata->num_threads); smp_wmb(); - state = newstate; + smdata->state = newstate; } /* Last one to ack a state moves to the next state. */ -static void ack_state(void) +static void ack_state(struct stop_machine_data *smdata) { - if (atomic_dec_and_test(&thread_ack)) - set_state(state + 1); + if (atomic_dec_and_test(&smdata->thread_ack)) + set_state(smdata, smdata->state + 1); } -/* This is the actual function which stops the CPU. It runs - * in the context of a dedicated stopmachine workqueue. */ -static void stop_cpu(struct work_struct *unused) +/* This is the cpu_stop function which stops the CPU. */ +static int stop_machine_cpu_stop(void *data) { + struct stop_machine_data *smdata = data; enum stopmachine_state curstate = STOPMACHINE_NONE; - struct stop_machine_data *smdata = &idle; - int cpu = smp_processor_id(); - int err; + int cpu = smp_processor_id(), err = 0; + bool is_active; + + if (!smdata->active_cpus) + is_active = cpu == cpumask_first(cpu_online_mask); + else + is_active = cpumask_test_cpu(cpu, smdata->active_cpus); - if (!active_cpus) { - if (cpu == cpumask_first(cpu_online_mask)) - smdata = &active; - } else { - if (cpumask_test_cpu(cpu, active_cpus)) - smdata = &active; - } /* Simple state machine */ do { /* Chill out and ensure we re-read stopmachine_state. */ cpu_relax(); - if (state != curstate) { - curstate = state; + if (smdata->state != curstate) { + curstate = smdata->state; switch (curstate) { case STOPMACHINE_DISABLE_IRQ: local_irq_disable(); hard_irq_disable(); break; case STOPMACHINE_RUN: - /* On multiple CPUs only a single error code - * is needed to tell that something failed. */ - err = smdata->fn(smdata->data); - if (err) - smdata->fnret = err; + if (is_active) + err = smdata->fn(smdata->data); break; default: break; } - ack_state(); + ack_state(smdata); } } while (curstate != STOPMACHINE_EXIT); local_irq_enable(); + return err; } -/* Callback for CPUs which aren't supposed to do anything. */ -static int chill(void *unused) -{ - return 0; -} - -int stop_machine_create(void) -{ - mutex_lock(&setup_lock); - if (refcount) - goto done; - stop_machine_wq = create_rt_workqueue("kstop"); - if (!stop_machine_wq) - goto err_out; - stop_machine_work = alloc_percpu(struct work_struct); - if (!stop_machine_work) - goto err_out; -done: - refcount++; - mutex_unlock(&setup_lock); - return 0; - -err_out: - if (stop_machine_wq) - destroy_workqueue(stop_machine_wq); - mutex_unlock(&setup_lock); - return -ENOMEM; -} -EXPORT_SYMBOL_GPL(stop_machine_create); - -void stop_machine_destroy(void) -{ - mutex_lock(&setup_lock); - refcount--; - if (refcount) - goto done; - destroy_workqueue(stop_machine_wq); - free_percpu(stop_machine_work); -done: - mutex_unlock(&setup_lock); -} -EXPORT_SYMBOL_GPL(stop_machine_destroy); - int __stop_machine(int (*fn)(void *), void *data, const struct cpumask *cpus) { - struct work_struct *sm_work; - int i, ret; - - /* Set up initial state. */ - mutex_lock(&lock); - num_threads = num_online_cpus(); - active_cpus = cpus; - active.fn = fn; - active.data = data; - active.fnret = 0; - idle.fn = chill; - idle.data = NULL; - - set_state(STOPMACHINE_PREPARE); - - /* Schedule the stop_cpu work on all cpus: hold this CPU so one - * doesn't hit this CPU until we're ready. */ - get_cpu(); - for_each_online_cpu(i) { - sm_work = per_cpu_ptr(stop_machine_work, i); - INIT_WORK(sm_work, stop_cpu); - queue_work_on(i, stop_machine_wq, sm_work); - } - /* This will release the thread on our CPU. */ - put_cpu(); - flush_workqueue(stop_machine_wq); - ret = active.fnret; - mutex_unlock(&lock); - return ret; + struct stop_machine_data smdata = { .fn = fn, .data = data, + .num_threads = num_online_cpus(), + .active_cpus = cpus }; + + /* Set the initial state and stop all online cpus. */ + set_state(&smdata, STOPMACHINE_PREPARE); + return stop_cpus(cpu_online_mask, stop_machine_cpu_stop, &smdata); } int stop_machine(int (*fn)(void *), void *data, const struct cpumask *cpus) { int ret; - ret = stop_machine_create(); - if (ret) - return ret; /* No CPUs can come up or down during this. */ get_online_cpus(); ret = __stop_machine(fn, data, cpus); put_online_cpus(); - stop_machine_destroy(); return ret; } EXPORT_SYMBOL_GPL(stop_machine); + +#endif /* CONFIG_STOP_MACHINE */ diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c index f992762..1d7b9bc 100644 --- a/kernel/time/tick-sched.c +++ b/kernel/time/tick-sched.c @@ -150,14 +150,32 @@ static void tick_nohz_update_jiffies(ktime_t now) touch_softlockup_watchdog(); } +/* + * Updates the per cpu time idle statistics counters + */ +static void +update_ts_time_stats(struct tick_sched *ts, ktime_t now, u64 *last_update_time) +{ + ktime_t delta; + + if (ts->idle_active) { + delta = ktime_sub(now, ts->idle_entrytime); + ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta); + if (nr_iowait_cpu() > 0) + ts->iowait_sleeptime = ktime_add(ts->iowait_sleeptime, delta); + ts->idle_entrytime = now; + } + + if (last_update_time) + *last_update_time = ktime_to_us(now); + +} + static void tick_nohz_stop_idle(int cpu, ktime_t now) { struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu); - ktime_t delta; - delta = ktime_sub(now, ts->idle_entrytime); - ts->idle_lastupdate = now; - ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta); + update_ts_time_stats(ts, now, NULL); ts->idle_active = 0; sched_clock_idle_wakeup_event(0); @@ -165,20 +183,32 @@ static void tick_nohz_stop_idle(int cpu, ktime_t now) static ktime_t tick_nohz_start_idle(struct tick_sched *ts) { - ktime_t now, delta; + ktime_t now; now = ktime_get(); - if (ts->idle_active) { - delta = ktime_sub(now, ts->idle_entrytime); - ts->idle_lastupdate = now; - ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta); - } + + update_ts_time_stats(ts, now, NULL); + ts->idle_entrytime = now; ts->idle_active = 1; sched_clock_idle_sleep_event(); return now; } +/** + * get_cpu_idle_time_us - get the total idle time of a cpu + * @cpu: CPU number to query + * @last_update_time: variable to store update time in + * + * Return the cummulative idle time (since boot) for a given + * CPU, in microseconds. The idle time returned includes + * the iowait time (unlike what "top" and co report). + * + * This time is measured via accounting rather than sampling, + * and is as accurate as ktime_get() is. + * + * This function returns -1 if NOHZ is not enabled. + */ u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time) { struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu); @@ -186,15 +216,38 @@ u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time) if (!tick_nohz_enabled) return -1; - if (ts->idle_active) - *last_update_time = ktime_to_us(ts->idle_lastupdate); - else - *last_update_time = ktime_to_us(ktime_get()); + update_ts_time_stats(ts, ktime_get(), last_update_time); return ktime_to_us(ts->idle_sleeptime); } EXPORT_SYMBOL_GPL(get_cpu_idle_time_us); +/* + * get_cpu_iowait_time_us - get the total iowait time of a cpu + * @cpu: CPU number to query + * @last_update_time: variable to store update time in + * + * Return the cummulative iowait time (since boot) for a given + * CPU, in microseconds. + * + * This time is measured via accounting rather than sampling, + * and is as accurate as ktime_get() is. + * + * This function returns -1 if NOHZ is not enabled. + */ +u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time) +{ + struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu); + + if (!tick_nohz_enabled) + return -1; + + update_ts_time_stats(ts, ktime_get(), last_update_time); + + return ktime_to_us(ts->iowait_sleeptime); +} +EXPORT_SYMBOL_GPL(get_cpu_iowait_time_us); + /** * tick_nohz_stop_sched_tick - stop the idle tick from the idle task * @@ -262,6 +315,9 @@ void tick_nohz_stop_sched_tick(int inidle) goto end; } + if (nohz_ratelimit(cpu)) + goto end; + ts->idle_calls++; /* Read jiffies and the time when jiffies were updated last */ do { diff --git a/kernel/time/timer_list.c b/kernel/time/timer_list.c index 1a4a7dd..ab8f5e3 100644 --- a/kernel/time/timer_list.c +++ b/kernel/time/timer_list.c @@ -176,6 +176,7 @@ static void print_cpu(struct seq_file *m, int cpu, u64 now) P_ns(idle_waketime); P_ns(idle_exittime); P_ns(idle_sleeptime); + P_ns(iowait_sleeptime); P(last_jiffies); P(next_jiffies); P_ns(idle_expires); diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c index 2404b59..32837e1 100644 --- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -264,6 +264,7 @@ struct ftrace_profile { unsigned long counter; #ifdef CONFIG_FUNCTION_GRAPH_TRACER unsigned long long time; + unsigned long long time_squared; #endif }; @@ -366,9 +367,9 @@ static int function_stat_headers(struct seq_file *m) { #ifdef CONFIG_FUNCTION_GRAPH_TRACER seq_printf(m, " Function " - "Hit Time Avg\n" + "Hit Time Avg s^2\n" " -------- " - "--- ---- ---\n"); + "--- ---- --- ---\n"); #else seq_printf(m, " Function Hit\n" " -------- ---\n"); @@ -384,6 +385,7 @@ static int function_stat_show(struct seq_file *m, void *v) static DEFINE_MUTEX(mutex); static struct trace_seq s; unsigned long long avg; + unsigned long long stddev; #endif kallsyms_lookup(rec->ip, NULL, NULL, NULL, str); @@ -394,11 +396,25 @@ static int function_stat_show(struct seq_file *m, void *v) avg = rec->time; do_div(avg, rec->counter); + /* Sample standard deviation (s^2) */ + if (rec->counter <= 1) + stddev = 0; + else { + stddev = rec->time_squared - rec->counter * avg * avg; + /* + * Divide only 1000 for ns^2 -> us^2 conversion. + * trace_print_graph_duration will divide 1000 again. + */ + do_div(stddev, (rec->counter - 1) * 1000); + } + mutex_lock(&mutex); trace_seq_init(&s); trace_print_graph_duration(rec->time, &s); trace_seq_puts(&s, " "); trace_print_graph_duration(avg, &s); + trace_seq_puts(&s, " "); + trace_print_graph_duration(stddev, &s); trace_print_seq(m, &s); mutex_unlock(&mutex); #endif @@ -650,6 +666,10 @@ static void profile_graph_return(struct ftrace_graph_ret *trace) if (!stat->hash || !ftrace_profile_enabled) goto out; + /* If the calltime was zero'd ignore it */ + if (!trace->calltime) + goto out; + calltime = trace->rettime - trace->calltime; if (!(trace_flags & TRACE_ITER_GRAPH_TIME)) { @@ -668,8 +688,10 @@ static void profile_graph_return(struct ftrace_graph_ret *trace) } rec = ftrace_find_profiled_func(stat, trace->func); - if (rec) + if (rec) { rec->time += calltime; + rec->time_squared += calltime * calltime; + } out: local_irq_restore(flags); @@ -3212,8 +3234,7 @@ free: } static void -ftrace_graph_probe_sched_switch(struct rq *__rq, struct task_struct *prev, - struct task_struct *next) +ftrace_graph_probe_sched_switch(struct task_struct *prev, struct task_struct *next) { unsigned long long timestamp; int index; @@ -3339,11 +3360,11 @@ void unregister_ftrace_graph(void) goto out; ftrace_graph_active--; - unregister_trace_sched_switch(ftrace_graph_probe_sched_switch); ftrace_graph_return = (trace_func_graph_ret_t)ftrace_stub; ftrace_graph_entry = ftrace_graph_entry_stub; ftrace_shutdown(FTRACE_STOP_FUNC_RET); unregister_pm_notifier(&ftrace_suspend_notifier); + unregister_trace_sched_switch(ftrace_graph_probe_sched_switch); out: mutex_unlock(&ftrace_lock); diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index 41ca394..7f6059c 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -319,6 +319,11 @@ EXPORT_SYMBOL_GPL(ring_buffer_event_data); #define TS_MASK ((1ULL << TS_SHIFT) - 1) #define TS_DELTA_TEST (~TS_MASK) +/* Flag when events were overwritten */ +#define RB_MISSED_EVENTS (1 << 31) +/* Missed count stored at end */ +#define RB_MISSED_STORED (1 << 30) + struct buffer_data_page { u64 time_stamp; /* page time stamp */ local_t commit; /* write committed index */ @@ -338,6 +343,7 @@ struct buffer_page { local_t write; /* index for next write */ unsigned read; /* index for next read */ local_t entries; /* entries on this page */ + unsigned long real_end; /* real end of data */ struct buffer_data_page *page; /* Actual data page */ }; @@ -417,6 +423,12 @@ int ring_buffer_print_page_header(struct trace_seq *s) (unsigned int)sizeof(field.commit), (unsigned int)is_signed_type(long)); + ret = trace_seq_printf(s, "\tfield: int overwrite;\t" + "offset:%u;\tsize:%u;\tsigned:%u;\n", + (unsigned int)offsetof(typeof(field), commit), + 1, + (unsigned int)is_signed_type(long)); + ret = trace_seq_printf(s, "\tfield: char data;\t" "offset:%u;\tsize:%u;\tsigned:%u;\n", (unsigned int)offsetof(typeof(field), data), @@ -440,6 +452,8 @@ struct ring_buffer_per_cpu { struct buffer_page *tail_page; /* write to tail */ struct buffer_page *commit_page; /* committed pages */ struct buffer_page *reader_page; + unsigned long lost_events; + unsigned long last_overrun; local_t commit_overrun; local_t overrun; local_t entries; @@ -1762,6 +1776,13 @@ rb_reset_tail(struct ring_buffer_per_cpu *cpu_buffer, kmemcheck_annotate_bitfield(event, bitfield); /* + * Save the original length to the meta data. + * This will be used by the reader to add lost event + * counter. + */ + tail_page->real_end = tail; + + /* * If this event is bigger than the minimum size, then * we need to be careful that we don't subtract the * write counter enough to allow another writer to slip @@ -1979,17 +2000,13 @@ rb_add_time_stamp(struct ring_buffer_per_cpu *cpu_buffer, u64 *ts, u64 *delta) { struct ring_buffer_event *event; - static int once; int ret; - if (unlikely(*delta > (1ULL << 59) && !once++)) { - printk(KERN_WARNING "Delta way too big! %llu" - " ts=%llu write stamp = %llu\n", - (unsigned long long)*delta, - (unsigned long long)*ts, - (unsigned long long)cpu_buffer->write_stamp); - WARN_ON(1); - } + WARN_ONCE(*delta > (1ULL << 59), + KERN_WARNING "Delta way too big! %llu ts=%llu write stamp = %llu\n", + (unsigned long long)*delta, + (unsigned long long)*ts, + (unsigned long long)cpu_buffer->write_stamp); /* * The delta is too big, we to add a @@ -2838,6 +2855,7 @@ static struct buffer_page * rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer) { struct buffer_page *reader = NULL; + unsigned long overwrite; unsigned long flags; int nr_loops = 0; int ret; @@ -2879,6 +2897,7 @@ rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer) local_set(&cpu_buffer->reader_page->write, 0); local_set(&cpu_buffer->reader_page->entries, 0); local_set(&cpu_buffer->reader_page->page->commit, 0); + cpu_buffer->reader_page->real_end = 0; spin: /* @@ -2899,6 +2918,18 @@ rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer) rb_set_list_to_head(cpu_buffer, &cpu_buffer->reader_page->list); /* + * We want to make sure we read the overruns after we set up our + * pointers to the next object. The writer side does a + * cmpxchg to cross pages which acts as the mb on the writer + * side. Note, the reader will constantly fail the swap + * while the writer is updating the pointers, so this + * guarantees that the overwrite recorded here is the one we + * want to compare with the last_overrun. + */ + smp_mb(); + overwrite = local_read(&(cpu_buffer->overrun)); + + /* * Here's the tricky part. * * We need to move the pointer past the header page. @@ -2929,6 +2960,11 @@ rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer) cpu_buffer->reader_page = reader; rb_reset_reader_page(cpu_buffer); + if (overwrite != cpu_buffer->last_overrun) { + cpu_buffer->lost_events = overwrite - cpu_buffer->last_overrun; + cpu_buffer->last_overrun = overwrite; + } + goto again; out: @@ -3005,8 +3041,14 @@ static void rb_advance_iter(struct ring_buffer_iter *iter) rb_advance_iter(iter); } +static int rb_lost_events(struct ring_buffer_per_cpu *cpu_buffer) +{ + return cpu_buffer->lost_events; +} + static struct ring_buffer_event * -rb_buffer_peek(struct ring_buffer_per_cpu *cpu_buffer, u64 *ts) +rb_buffer_peek(struct ring_buffer_per_cpu *cpu_buffer, u64 *ts, + unsigned long *lost_events) { struct ring_buffer_event *event; struct buffer_page *reader; @@ -3058,6 +3100,8 @@ rb_buffer_peek(struct ring_buffer_per_cpu *cpu_buffer, u64 *ts) ring_buffer_normalize_time_stamp(cpu_buffer->buffer, cpu_buffer->cpu, ts); } + if (lost_events) + *lost_events = rb_lost_events(cpu_buffer); return event; default: @@ -3168,12 +3212,14 @@ static inline int rb_ok_to_lock(void) * @buffer: The ring buffer to read * @cpu: The cpu to peak at * @ts: The timestamp counter of this event. + * @lost_events: a variable to store if events were lost (may be NULL) * * This will return the event that will be read next, but does * not consume the data. */ struct ring_buffer_event * -ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts) +ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts, + unsigned long *lost_events) { struct ring_buffer_per_cpu *cpu_buffer = buffer->buffers[cpu]; struct ring_buffer_event *event; @@ -3188,7 +3234,7 @@ ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts) local_irq_save(flags); if (dolock) spin_lock(&cpu_buffer->reader_lock); - event = rb_buffer_peek(cpu_buffer, ts); + event = rb_buffer_peek(cpu_buffer, ts, lost_events); if (event && event->type_len == RINGBUF_TYPE_PADDING) rb_advance_reader(cpu_buffer); if (dolock) @@ -3230,13 +3276,17 @@ ring_buffer_iter_peek(struct ring_buffer_iter *iter, u64 *ts) /** * ring_buffer_consume - return an event and consume it * @buffer: The ring buffer to get the next event from + * @cpu: the cpu to read the buffer from + * @ts: a variable to store the timestamp (may be NULL) + * @lost_events: a variable to store if events were lost (may be NULL) * * Returns the next event in the ring buffer, and that event is consumed. * Meaning, that sequential reads will keep returning a different event, * and eventually empty the ring buffer if the producer is slower. */ struct ring_buffer_event * -ring_buffer_consume(struct ring_buffer *buffer, int cpu, u64 *ts) +ring_buffer_consume(struct ring_buffer *buffer, int cpu, u64 *ts, + unsigned long *lost_events) { struct ring_buffer_per_cpu *cpu_buffer; struct ring_buffer_event *event = NULL; @@ -3257,9 +3307,11 @@ ring_buffer_consume(struct ring_buffer *buffer, int cpu, u64 *ts) if (dolock) spin_lock(&cpu_buffer->reader_lock); - event = rb_buffer_peek(cpu_buffer, ts); - if (event) + event = rb_buffer_peek(cpu_buffer, ts, lost_events); + if (event) { + cpu_buffer->lost_events = 0; rb_advance_reader(cpu_buffer); + } if (dolock) spin_unlock(&cpu_buffer->reader_lock); @@ -3276,23 +3328,30 @@ ring_buffer_consume(struct ring_buffer *buffer, int cpu, u64 *ts) EXPORT_SYMBOL_GPL(ring_buffer_consume); /** - * ring_buffer_read_start - start a non consuming read of the buffer + * ring_buffer_read_prepare - Prepare for a non consuming read of the buffer * @buffer: The ring buffer to read from * @cpu: The cpu buffer to iterate over * - * This starts up an iteration through the buffer. It also disables - * the recording to the buffer until the reading is finished. - * This prevents the reading from being corrupted. This is not - * a consuming read, so a producer is not expected. + * This performs the initial preparations necessary to iterate + * through the buffer. Memory is allocated, buffer recording + * is disabled, and the iterator pointer is returned to the caller. * - * Must be paired with ring_buffer_finish. + * Disabling buffer recordng prevents the reading from being + * corrupted. This is not a consuming read, so a producer is not + * expected. + * + * After a sequence of ring_buffer_read_prepare calls, the user is + * expected to make at least one call to ring_buffer_prepare_sync. + * Afterwards, ring_buffer_read_start is invoked to get things going + * for real. + * + * This overall must be paired with ring_buffer_finish. */ struct ring_buffer_iter * -ring_buffer_read_start(struct ring_buffer *buffer, int cpu) +ring_buffer_read_prepare(struct ring_buffer *buffer, int cpu) { struct ring_buffer_per_cpu *cpu_buffer; struct ring_buffer_iter *iter; - unsigned long flags; if (!cpumask_test_cpu(cpu, buffer->cpumask)) return NULL; @@ -3306,15 +3365,52 @@ ring_buffer_read_start(struct ring_buffer *buffer, int cpu) iter->cpu_buffer = cpu_buffer; atomic_inc(&cpu_buffer->record_disabled); + + return iter; +} +EXPORT_SYMBOL_GPL(ring_buffer_read_prepare); + +/** + * ring_buffer_read_prepare_sync - Synchronize a set of prepare calls + * + * All previously invoked ring_buffer_read_prepare calls to prepare + * iterators will be synchronized. Afterwards, read_buffer_read_start + * calls on those iterators are allowed. + */ +void +ring_buffer_read_prepare_sync(void) +{ synchronize_sched(); +} +EXPORT_SYMBOL_GPL(ring_buffer_read_prepare_sync); + +/** + * ring_buffer_read_start - start a non consuming read of the buffer + * @iter: The iterator returned by ring_buffer_read_prepare + * + * This finalizes the startup of an iteration through the buffer. + * The iterator comes from a call to ring_buffer_read_prepare and + * an intervening ring_buffer_read_prepare_sync must have been + * performed. + * + * Must be paired with ring_buffer_finish. + */ +void +ring_buffer_read_start(struct ring_buffer_iter *iter) +{ + struct ring_buffer_per_cpu *cpu_buffer; + unsigned long flags; + + if (!iter) + return; + + cpu_buffer = iter->cpu_buffer; spin_lock_irqsave(&cpu_buffer->reader_lock, flags); arch_spin_lock(&cpu_buffer->lock); rb_iter_reset(iter); arch_spin_unlock(&cpu_buffer->lock); spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags); - - return iter; } EXPORT_SYMBOL_GPL(ring_buffer_read_start); @@ -3408,6 +3504,9 @@ rb_reset_cpu(struct ring_buffer_per_cpu *cpu_buffer) cpu_buffer->write_stamp = 0; cpu_buffer->read_stamp = 0; + cpu_buffer->lost_events = 0; + cpu_buffer->last_overrun = 0; + rb_head_page_activate(cpu_buffer); } @@ -3683,6 +3782,7 @@ int ring_buffer_read_page(struct ring_buffer *buffer, struct ring_buffer_event *event; struct buffer_data_page *bpage; struct buffer_page *reader; + unsigned long missed_events; unsigned long flags; unsigned int commit; unsigned int read; @@ -3719,6 +3819,9 @@ int ring_buffer_read_page(struct ring_buffer *buffer, read = reader->read; commit = rb_page_commit(reader); + /* Check if any events were dropped */ + missed_events = cpu_buffer->lost_events; + /* * If this page has been partially read or * if len is not big enough to read the rest of the page or @@ -3779,9 +3882,35 @@ int ring_buffer_read_page(struct ring_buffer *buffer, local_set(&reader->entries, 0); reader->read = 0; *data_page = bpage; + + /* + * Use the real_end for the data size, + * This gives us a chance to store the lost events + * on the page. + */ + if (reader->real_end) + local_set(&bpage->commit, reader->real_end); } ret = read; + cpu_buffer->lost_events = 0; + /* + * Set a flag in the commit field if we lost events + */ + if (missed_events) { + commit = local_read(&bpage->commit); + + /* If there is room at the end of the page to save the + * missed events, then record it there. + */ + if (BUF_PAGE_SIZE - commit >= sizeof(missed_events)) { + memcpy(&bpage->data[commit], &missed_events, + sizeof(missed_events)); + local_add(RB_MISSED_STORED, &bpage->commit); + } + local_add(RB_MISSED_EVENTS, &bpage->commit); + } + out_unlock: spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags); diff --git a/kernel/trace/ring_buffer_benchmark.c b/kernel/trace/ring_buffer_benchmark.c index df74c79..302f8a6 100644 --- a/kernel/trace/ring_buffer_benchmark.c +++ b/kernel/trace/ring_buffer_benchmark.c @@ -81,7 +81,7 @@ static enum event_status read_event(int cpu) int *entry; u64 ts; - event = ring_buffer_consume(buffer, cpu, &ts); + event = ring_buffer_consume(buffer, cpu, &ts, NULL); if (!event) return EVENT_DROPPED; @@ -113,7 +113,8 @@ static enum event_status read_page(int cpu) ret = ring_buffer_read_page(buffer, &bpage, PAGE_SIZE, cpu, 1); if (ret >= 0) { rpage = bpage; - commit = local_read(&rpage->commit); + /* The commit may have missed event flags set, clear them */ + commit = local_read(&rpage->commit) & 0xfffff; for (i = 0; i < commit && !kill_test; i += inc) { if (i >= (PAGE_SIZE - offsetof(struct rb_page, data))) { diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 44f916a..756d728 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -117,9 +117,12 @@ static cpumask_var_t __read_mostly tracing_buffer_mask; * * It is default off, but you can enable it with either specifying * "ftrace_dump_on_oops" in the kernel command line, or setting - * /proc/sys/kernel/ftrace_dump_on_oops to true. + * /proc/sys/kernel/ftrace_dump_on_oops + * Set 1 if you want to dump buffers of all CPUs + * Set 2 if you want to dump the buffer of the CPU that triggered oops */ -int ftrace_dump_on_oops; + +enum ftrace_dump_mode ftrace_dump_on_oops; static int tracing_set_tracer(const char *buf); @@ -139,8 +142,17 @@ __setup("ftrace=", set_cmdline_ftrace); static int __init set_ftrace_dump_on_oops(char *str) { - ftrace_dump_on_oops = 1; - return 1; + if (*str++ != '=' || !*str) { + ftrace_dump_on_oops = DUMP_ALL; + return 1; + } + + if (!strcmp("orig_cpu", str)) { + ftrace_dump_on_oops = DUMP_ORIG; + return 1; + } + + return 0; } __setup("ftrace_dump_on_oops", set_ftrace_dump_on_oops); @@ -1545,7 +1557,8 @@ static void trace_iterator_increment(struct trace_iterator *iter) } static struct trace_entry * -peek_next_entry(struct trace_iterator *iter, int cpu, u64 *ts) +peek_next_entry(struct trace_iterator *iter, int cpu, u64 *ts, + unsigned long *lost_events) { struct ring_buffer_event *event; struct ring_buffer_iter *buf_iter = iter->buffer_iter[cpu]; @@ -1556,7 +1569,8 @@ peek_next_entry(struct trace_iterator *iter, int cpu, u64 *ts) if (buf_iter) event = ring_buffer_iter_peek(buf_iter, ts); else - event = ring_buffer_peek(iter->tr->buffer, cpu, ts); + event = ring_buffer_peek(iter->tr->buffer, cpu, ts, + lost_events); ftrace_enable_cpu(); @@ -1564,10 +1578,12 @@ peek_next_entry(struct trace_iterator *iter, int cpu, u64 *ts) } static struct trace_entry * -__find_next_entry(struct trace_iterator *iter, int *ent_cpu, u64 *ent_ts) +__find_next_entry(struct trace_iterator *iter, int *ent_cpu, + unsigned long *missing_events, u64 *ent_ts) { struct ring_buffer *buffer = iter->tr->buffer; struct trace_entry *ent, *next = NULL; + unsigned long lost_events = 0, next_lost = 0; int cpu_file = iter->cpu_file; u64 next_ts = 0, ts; int next_cpu = -1; @@ -1580,7 +1596,7 @@ __find_next_entry(struct trace_iterator *iter, int *ent_cpu, u64 *ent_ts) if (cpu_file > TRACE_PIPE_ALL_CPU) { if (ring_buffer_empty_cpu(buffer, cpu_file)) return NULL; - ent = peek_next_entry(iter, cpu_file, ent_ts); + ent = peek_next_entry(iter, cpu_file, ent_ts, missing_events); if (ent_cpu) *ent_cpu = cpu_file; @@ -1592,7 +1608,7 @@ __find_next_entry(struct trace_iterator *iter, int *ent_cpu, u64 *ent_ts) if (ring_buffer_empty_cpu(buffer, cpu)) continue; - ent = peek_next_entry(iter, cpu, &ts); + ent = peek_next_entry(iter, cpu, &ts, &lost_events); /* * Pick the entry with the smallest timestamp: @@ -1601,6 +1617,7 @@ __find_next_entry(struct trace_iterator *iter, int *ent_cpu, u64 *ent_ts) next = ent; next_cpu = cpu; next_ts = ts; + next_lost = lost_events; } } @@ -1610,6 +1627,9 @@ __find_next_entry(struct trace_iterator *iter, int *ent_cpu, u64 *ent_ts) if (ent_ts) *ent_ts = next_ts; + if (missing_events) + *missing_events = next_lost; + return next; } @@ -1617,13 +1637,14 @@ __find_next_entry(struct trace_iterator *iter, int *ent_cpu, u64 *ent_ts) struct trace_entry *trace_find_next_entry(struct trace_iterator *iter, int *ent_cpu, u64 *ent_ts) { - return __find_next_entry(iter, ent_cpu, ent_ts); + return __find_next_entry(iter, ent_cpu, NULL, ent_ts); } /* Find the next real entry, and increment the iterator to the next entry */ static void *find_next_entry_inc(struct trace_iterator *iter) { - iter->ent = __find_next_entry(iter, &iter->cpu, &iter->ts); + iter->ent = __find_next_entry(iter, &iter->cpu, + &iter->lost_events, &iter->ts); if (iter->ent) trace_iterator_increment(iter); @@ -1635,7 +1656,8 @@ static void trace_consume(struct trace_iterator *iter) { /* Don't allow ftrace to trace into the ring buffers */ ftrace_disable_cpu(); - ring_buffer_consume(iter->tr->buffer, iter->cpu, &iter->ts); + ring_buffer_consume(iter->tr->buffer, iter->cpu, &iter->ts, + &iter->lost_events); ftrace_enable_cpu(); } @@ -1786,7 +1808,7 @@ static void print_func_help_header(struct seq_file *m) } -static void +void print_trace_header(struct seq_file *m, struct trace_iterator *iter) { unsigned long sym_flags = (trace_flags & TRACE_ITER_SYM_MASK); @@ -1995,7 +2017,7 @@ static enum print_line_t print_bin_fmt(struct trace_iterator *iter) return event ? event->binary(iter, 0) : TRACE_TYPE_HANDLED; } -static int trace_empty(struct trace_iterator *iter) +int trace_empty(struct trace_iterator *iter) { int cpu; @@ -2030,6 +2052,10 @@ static enum print_line_t print_trace_line(struct trace_iterator *iter) { enum print_line_t ret; + if (iter->lost_events) + trace_seq_printf(&iter->seq, "CPU:%d [LOST %lu EVENTS]\n", + iter->cpu, iter->lost_events); + if (iter->trace && iter->trace->print_line) { ret = iter->trace->print_line(iter); if (ret != TRACE_TYPE_UNHANDLED) @@ -2058,6 +2084,23 @@ static enum print_line_t print_trace_line(struct trace_iterator *iter) return print_trace_fmt(iter); } +void trace_default_header(struct seq_file *m) +{ + struct trace_iterator *iter = m->private; + + if (iter->iter_flags & TRACE_FILE_LAT_FMT) { + /* print nothing if the buffers are empty */ + if (trace_empty(iter)) + return; + print_trace_header(m, iter); + if (!(trace_flags & TRACE_ITER_VERBOSE)) + print_lat_help_header(m); + } else { + if (!(trace_flags & TRACE_ITER_VERBOSE)) + print_func_help_header(m); + } +} + static int s_show(struct seq_file *m, void *v) { struct trace_iterator *iter = v; @@ -2070,17 +2113,9 @@ static int s_show(struct seq_file *m, void *v) } if (iter->trace && iter->trace->print_header) iter->trace->print_header(m); - else if (iter->iter_flags & TRACE_FILE_LAT_FMT) { - /* print nothing if the buffers are empty */ - if (trace_empty(iter)) - return 0; - print_trace_header(m, iter); - if (!(trace_flags & TRACE_ITER_VERBOSE)) - print_lat_help_header(m); - } else { - if (!(trace_flags & TRACE_ITER_VERBOSE)) - print_func_help_header(m); - } + else + trace_default_header(m); + } else if (iter->leftover) { /* * If we filled the seq_file buffer earlier, we @@ -2166,15 +2201,20 @@ __tracing_open(struct inode *inode, struct file *file) if (iter->cpu_file == TRACE_PIPE_ALL_CPU) { for_each_tracing_cpu(cpu) { - iter->buffer_iter[cpu] = - ring_buffer_read_start(iter->tr->buffer, cpu); + ring_buffer_read_prepare(iter->tr->buffer, cpu); + } + ring_buffer_read_prepare_sync(); + for_each_tracing_cpu(cpu) { + ring_buffer_read_start(iter->buffer_iter[cpu]); tracing_iter_reset(iter, cpu); } } else { cpu = iter->cpu_file; iter->buffer_iter[cpu] = - ring_buffer_read_start(iter->tr->buffer, cpu); + ring_buffer_read_prepare(iter->tr->buffer, cpu); + ring_buffer_read_prepare_sync(); + ring_buffer_read_start(iter->buffer_iter[cpu]); tracing_iter_reset(iter, cpu); } @@ -4324,7 +4364,7 @@ static int trace_panic_handler(struct notifier_block *this, unsigned long event, void *unused) { if (ftrace_dump_on_oops) - ftrace_dump(); + ftrace_dump(ftrace_dump_on_oops); return NOTIFY_OK; } @@ -4341,7 +4381,7 @@ static int trace_die_handler(struct notifier_block *self, switch (val) { case DIE_OOPS: if (ftrace_dump_on_oops) - ftrace_dump(); + ftrace_dump(ftrace_dump_on_oops); break; default: break; @@ -4382,7 +4422,8 @@ trace_printk_seq(struct trace_seq *s) trace_seq_init(s); } -static void __ftrace_dump(bool disable_tracing) +static void +__ftrace_dump(bool disable_tracing, enum ftrace_dump_mode oops_dump_mode) { static arch_spinlock_t ftrace_dump_lock = (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED; @@ -4415,12 +4456,25 @@ static void __ftrace_dump(bool disable_tracing) /* don't look at user memory in panic mode */ trace_flags &= ~TRACE_ITER_SYM_USEROBJ; - printk(KERN_TRACE "Dumping ftrace buffer:\n"); - /* Simulate the iterator */ iter.tr = &global_trace; iter.trace = current_trace; - iter.cpu_file = TRACE_PIPE_ALL_CPU; + + switch (oops_dump_mode) { + case DUMP_ALL: + iter.cpu_file = TRACE_PIPE_ALL_CPU; + break; + case DUMP_ORIG: + iter.cpu_file = raw_smp_processor_id(); + break; + case DUMP_NONE: + goto out_enable; + default: + printk(KERN_TRACE "Bad dumping mode, switching to all CPUs dump\n"); + iter.cpu_file = TRACE_PIPE_ALL_CPU; + } + + printk(KERN_TRACE "Dumping ftrace buffer:\n"); /* * We need to stop all tracing on all CPUS to read the @@ -4459,6 +4513,7 @@ static void __ftrace_dump(bool disable_tracing) else printk(KERN_TRACE "---------------------------------\n"); + out_enable: /* Re-enable tracing if requested */ if (!disable_tracing) { trace_flags |= old_userobj; @@ -4475,9 +4530,9 @@ static void __ftrace_dump(bool disable_tracing) } /* By default: disable tracing after the dump */ -void ftrace_dump(void) +void ftrace_dump(enum ftrace_dump_mode oops_dump_mode) { - __ftrace_dump(true); + __ftrace_dump(true, oops_dump_mode); } __init static int tracer_alloc_buffers(void) diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h index 3ebdb6b..d1ce0be 100644 --- a/kernel/trace/trace.h +++ b/kernel/trace/trace.h @@ -364,6 +364,9 @@ void trace_function(struct trace_array *tr, unsigned long ip, unsigned long parent_ip, unsigned long flags, int pc); +void trace_default_header(struct seq_file *m); +void print_trace_header(struct seq_file *m, struct trace_iterator *iter); +int trace_empty(struct trace_iterator *iter); void trace_graph_return(struct ftrace_graph_ret *trace); int trace_graph_entry(struct ftrace_graph_ent *trace); @@ -475,9 +478,29 @@ extern int trace_clock_id; /* Standard output formatting function used for function return traces */ #ifdef CONFIG_FUNCTION_GRAPH_TRACER -extern enum print_line_t print_graph_function(struct trace_iterator *iter); + +/* Flag options */ +#define TRACE_GRAPH_PRINT_OVERRUN 0x1 +#define TRACE_GRAPH_PRINT_CPU 0x2 +#define TRACE_GRAPH_PRINT_OVERHEAD 0x4 +#define TRACE_GRAPH_PRINT_PROC 0x8 +#define TRACE_GRAPH_PRINT_DURATION 0x10 +#define TRACE_GRAPH_PRINT_ABS_TIME 0x20 + +extern enum print_line_t +print_graph_function_flags(struct trace_iterator *iter, u32 flags); +extern void print_graph_headers_flags(struct seq_file *s, u32 flags); extern enum print_line_t trace_print_graph_duration(unsigned long long duration, struct trace_seq *s); +extern void graph_trace_open(struct trace_iterator *iter); +extern void graph_trace_close(struct trace_iterator *iter); +extern int __trace_graph_entry(struct trace_array *tr, + struct ftrace_graph_ent *trace, + unsigned long flags, int pc); +extern void __trace_graph_return(struct trace_array *tr, + struct ftrace_graph_ret *trace, + unsigned long flags, int pc); + #ifdef CONFIG_DYNAMIC_FTRACE /* TODO: make this variable */ @@ -508,7 +531,7 @@ static inline int ftrace_graph_addr(unsigned long addr) #endif /* CONFIG_DYNAMIC_FTRACE */ #else /* CONFIG_FUNCTION_GRAPH_TRACER */ static inline enum print_line_t -print_graph_function(struct trace_iterator *iter) +print_graph_function_flags(struct trace_iterator *iter, u32 flags) { return TRACE_TYPE_UNHANDLED; } diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c index 9aed1a5..dd11c83 100644 --- a/kernel/trace/trace_functions_graph.c +++ b/kernel/trace/trace_functions_graph.c @@ -40,7 +40,7 @@ struct fgraph_data { #define TRACE_GRAPH_PRINT_OVERHEAD 0x4 #define TRACE_GRAPH_PRINT_PROC 0x8 #define TRACE_GRAPH_PRINT_DURATION 0x10 -#define TRACE_GRAPH_PRINT_ABS_TIME 0X20 +#define TRACE_GRAPH_PRINT_ABS_TIME 0x20 static struct tracer_opt trace_opts[] = { /* Display overruns? (for self-debug purpose) */ @@ -179,7 +179,7 @@ unsigned long ftrace_return_to_handler(unsigned long frame_pointer) return ret; } -static int __trace_graph_entry(struct trace_array *tr, +int __trace_graph_entry(struct trace_array *tr, struct ftrace_graph_ent *trace, unsigned long flags, int pc) @@ -246,7 +246,7 @@ int trace_graph_thresh_entry(struct ftrace_graph_ent *trace) return trace_graph_entry(trace); } -static void __trace_graph_return(struct trace_array *tr, +void __trace_graph_return(struct trace_array *tr, struct ftrace_graph_ret *trace, unsigned long flags, int pc) @@ -490,9 +490,10 @@ get_return_for_leaf(struct trace_iterator *iter, * We need to consume the current entry to see * the next one. */ - ring_buffer_consume(iter->tr->buffer, iter->cpu, NULL); + ring_buffer_consume(iter->tr->buffer, iter->cpu, + NULL, NULL); event = ring_buffer_peek(iter->tr->buffer, iter->cpu, - NULL); + NULL, NULL); } if (!event) @@ -526,17 +527,18 @@ get_return_for_leaf(struct trace_iterator *iter, /* Signal a overhead of time execution to the output */ static int -print_graph_overhead(unsigned long long duration, struct trace_seq *s) +print_graph_overhead(unsigned long long duration, struct trace_seq *s, + u32 flags) { /* If duration disappear, we don't need anything */ - if (!(tracer_flags.val & TRACE_GRAPH_PRINT_DURATION)) + if (!(flags & TRACE_GRAPH_PRINT_DURATION)) return 1; /* Non nested entry or return */ if (duration == -1) return trace_seq_printf(s, " "); - if (tracer_flags.val & TRACE_GRAPH_PRINT_OVERHEAD) { + if (flags & TRACE_GRAPH_PRINT_OVERHEAD) { /* Duration exceeded 100 msecs */ if (duration > 100000ULL) return trace_seq_printf(s, "! "); @@ -562,7 +564,7 @@ static int print_graph_abs_time(u64 t, struct trace_seq *s) static enum print_line_t print_graph_irq(struct trace_iterator *iter, unsigned long addr, - enum trace_type type, int cpu, pid_t pid) + enum trace_type type, int cpu, pid_t pid, u32 flags) { int ret; struct trace_seq *s = &iter->seq; @@ -572,21 +574,21 @@ print_graph_irq(struct trace_iterator *iter, unsigned long addr, return TRACE_TYPE_UNHANDLED; /* Absolute time */ - if (tracer_flags.val & TRACE_GRAPH_PRINT_ABS_TIME) { + if (flags & TRACE_GRAPH_PRINT_ABS_TIME) { ret = print_graph_abs_time(iter->ts, s); if (!ret) return TRACE_TYPE_PARTIAL_LINE; } /* Cpu */ - if (tracer_flags.val & TRACE_GRAPH_PRINT_CPU) { + if (flags & TRACE_GRAPH_PRINT_CPU) { ret = print_graph_cpu(s, cpu); if (ret == TRACE_TYPE_PARTIAL_LINE) return TRACE_TYPE_PARTIAL_LINE; } /* Proc */ - if (tracer_flags.val & TRACE_GRAPH_PRINT_PROC) { + if (flags & TRACE_GRAPH_PRINT_PROC) { ret = print_graph_proc(s, pid); if (ret == TRACE_TYPE_PARTIAL_LINE) return TRACE_TYPE_PARTIAL_LINE; @@ -596,7 +598,7 @@ print_graph_irq(struct trace_iterator *iter, unsigned long addr, } /* No overhead */ - ret = print_graph_overhead(-1, s); + ret = print_graph_overhead(-1, s, flags); if (!ret) return TRACE_TYPE_PARTIAL_LINE; @@ -609,7 +611,7 @@ print_graph_irq(struct trace_iterator *iter, unsigned long addr, return TRACE_TYPE_PARTIAL_LINE; /* Don't close the duration column if haven't one */ - if (tracer_flags.val & TRACE_GRAPH_PRINT_DURATION) + if (flags & TRACE_GRAPH_PRINT_DURATION) trace_seq_printf(s, " |"); ret = trace_seq_printf(s, "\n"); @@ -679,7 +681,8 @@ print_graph_duration(unsigned long long duration, struct trace_seq *s) static enum print_line_t print_graph_entry_leaf(struct trace_iterator *iter, struct ftrace_graph_ent_entry *entry, - struct ftrace_graph_ret_entry *ret_entry, struct trace_seq *s) + struct ftrace_graph_ret_entry *ret_entry, + struct trace_seq *s, u32 flags) { struct fgraph_data *data = iter->private; struct ftrace_graph_ret *graph_ret; @@ -711,12 +714,12 @@ print_graph_entry_leaf(struct trace_iterator *iter, } /* Overhead */ - ret = print_graph_overhead(duration, s); + ret = print_graph_overhead(duration, s, flags); if (!ret) return TRACE_TYPE_PARTIAL_LINE; /* Duration */ - if (tracer_flags.val & TRACE_GRAPH_PRINT_DURATION) { + if (flags & TRACE_GRAPH_PRINT_DURATION) { ret = print_graph_duration(duration, s); if (ret == TRACE_TYPE_PARTIAL_LINE) return TRACE_TYPE_PARTIAL_LINE; @@ -739,7 +742,7 @@ print_graph_entry_leaf(struct trace_iterator *iter, static enum print_line_t print_graph_entry_nested(struct trace_iterator *iter, struct ftrace_graph_ent_entry *entry, - struct trace_seq *s, int cpu) + struct trace_seq *s, int cpu, u32 flags) { struct ftrace_graph_ent *call = &entry->graph_ent; struct fgraph_data *data = iter->private; @@ -759,12 +762,12 @@ print_graph_entry_nested(struct trace_iterator *iter, } /* No overhead */ - ret = print_graph_overhead(-1, s); + ret = print_graph_overhead(-1, s, flags); if (!ret) return TRACE_TYPE_PARTIAL_LINE; /* No time */ - if (tracer_flags.val & TRACE_GRAPH_PRINT_DURATION) { + if (flags & TRACE_GRAPH_PRINT_DURATION) { ret = trace_seq_printf(s, " | "); if (!ret) return TRACE_TYPE_PARTIAL_LINE; @@ -790,7 +793,7 @@ print_graph_entry_nested(struct trace_iterator *iter, static enum print_line_t print_graph_prologue(struct trace_iterator *iter, struct trace_seq *s, - int type, unsigned long addr) + int type, unsigned long addr, u32 flags) { struct fgraph_data *data = iter->private; struct trace_entry *ent = iter->ent; @@ -803,27 +806,27 @@ print_graph_prologue(struct trace_iterator *iter, struct trace_seq *s, if (type) { /* Interrupt */ - ret = print_graph_irq(iter, addr, type, cpu, ent->pid); + ret = print_graph_irq(iter, addr, type, cpu, ent->pid, flags); if (ret == TRACE_TYPE_PARTIAL_LINE) return TRACE_TYPE_PARTIAL_LINE; } /* Absolute time */ - if (tracer_flags.val & TRACE_GRAPH_PRINT_ABS_TIME) { + if (flags & TRACE_GRAPH_PRINT_ABS_TIME) { ret = print_graph_abs_time(iter->ts, s); if (!ret) return TRACE_TYPE_PARTIAL_LINE; } /* Cpu */ - if (tracer_flags.val & TRACE_GRAPH_PRINT_CPU) { + if (flags & TRACE_GRAPH_PRINT_CPU) { ret = print_graph_cpu(s, cpu); if (ret == TRACE_TYPE_PARTIAL_LINE) return TRACE_TYPE_PARTIAL_LINE; } /* Proc */ - if (tracer_flags.val & TRACE_GRAPH_PRINT_PROC) { + if (flags & TRACE_GRAPH_PRINT_PROC) { ret = print_graph_proc(s, ent->pid); if (ret == TRACE_TYPE_PARTIAL_LINE) return TRACE_TYPE_PARTIAL_LINE; @@ -845,7 +848,7 @@ print_graph_prologue(struct trace_iterator *iter, struct trace_seq *s, static enum print_line_t print_graph_entry(struct ftrace_graph_ent_entry *field, struct trace_seq *s, - struct trace_iterator *iter) + struct trace_iterator *iter, u32 flags) { struct fgraph_data *data = iter->private; struct ftrace_graph_ent *call = &field->graph_ent; @@ -853,14 +856,14 @@ print_graph_entry(struct ftrace_graph_ent_entry *field, struct trace_seq *s, static enum print_line_t ret; int cpu = iter->cpu; - if (print_graph_prologue(iter, s, TRACE_GRAPH_ENT, call->func)) + if (print_graph_prologue(iter, s, TRACE_GRAPH_ENT, call->func, flags)) return TRACE_TYPE_PARTIAL_LINE; leaf_ret = get_return_for_leaf(iter, field); if (leaf_ret) - ret = print_graph_entry_leaf(iter, field, leaf_ret, s); + ret = print_graph_entry_leaf(iter, field, leaf_ret, s, flags); else - ret = print_graph_entry_nested(iter, field, s, cpu); + ret = print_graph_entry_nested(iter, field, s, cpu, flags); if (data) { /* @@ -879,7 +882,8 @@ print_graph_entry(struct ftrace_graph_ent_entry *field, struct trace_seq *s, static enum print_line_t print_graph_return(struct ftrace_graph_ret *trace, struct trace_seq *s, - struct trace_entry *ent, struct trace_iterator *iter) + struct trace_entry *ent, struct trace_iterator *iter, + u32 flags) { unsigned long long duration = trace->rettime - trace->calltime; struct fgraph_data *data = iter->private; @@ -909,16 +913,16 @@ print_graph_return(struct ftrace_graph_ret *trace, struct trace_seq *s, } } - if (print_graph_prologue(iter, s, 0, 0)) + if (print_graph_prologue(iter, s, 0, 0, flags)) return TRACE_TYPE_PARTIAL_LINE; /* Overhead */ - ret = print_graph_overhead(duration, s); + ret = print_graph_overhead(duration, s, flags); if (!ret) return TRACE_TYPE_PARTIAL_LINE; /* Duration */ - if (tracer_flags.val & TRACE_GRAPH_PRINT_DURATION) { + if (flags & TRACE_GRAPH_PRINT_DURATION) { ret = print_graph_duration(duration, s); if (ret == TRACE_TYPE_PARTIAL_LINE) return TRACE_TYPE_PARTIAL_LINE; @@ -948,14 +952,15 @@ print_graph_return(struct ftrace_graph_ret *trace, struct trace_seq *s, } /* Overrun */ - if (tracer_flags.val & TRACE_GRAPH_PRINT_OVERRUN) { + if (flags & TRACE_GRAPH_PRINT_OVERRUN) { ret = trace_seq_printf(s, " (Overruns: %lu)\n", trace->overrun); if (!ret) return TRACE_TYPE_PARTIAL_LINE; } - ret = print_graph_irq(iter, trace->func, TRACE_GRAPH_RET, cpu, pid); + ret = print_graph_irq(iter, trace->func, TRACE_GRAPH_RET, + cpu, pid, flags); if (ret == TRACE_TYPE_PARTIAL_LINE) return TRACE_TYPE_PARTIAL_LINE; @@ -963,8 +968,8 @@ print_graph_return(struct ftrace_graph_ret *trace, struct trace_seq *s, } static enum print_line_t -print_graph_comment(struct trace_seq *s, struct trace_entry *ent, - struct trace_iterator *iter) +print_graph_comment(struct trace_seq *s, struct trace_entry *ent, + struct trace_iterator *iter, u32 flags) { unsigned long sym_flags = (trace_flags & TRACE_ITER_SYM_MASK); struct fgraph_data *data = iter->private; @@ -976,16 +981,16 @@ print_graph_comment(struct trace_seq *s, struct trace_entry *ent, if (data) depth = per_cpu_ptr(data->cpu_data, iter->cpu)->depth; - if (print_graph_prologue(iter, s, 0, 0)) + if (print_graph_prologue(iter, s, 0, 0, flags)) return TRACE_TYPE_PARTIAL_LINE; /* No overhead */ - ret = print_graph_overhead(-1, s); + ret = print_graph_overhead(-1, s, flags); if (!ret) return TRACE_TYPE_PARTIAL_LINE; /* No time */ - if (tracer_flags.val & TRACE_GRAPH_PRINT_DURATION) { + if (flags & TRACE_GRAPH_PRINT_DURATION) { ret = trace_seq_printf(s, " | "); if (!ret) return TRACE_TYPE_PARTIAL_LINE; @@ -1040,7 +1045,7 @@ print_graph_comment(struct trace_seq *s, struct trace_entry *ent, enum print_line_t -print_graph_function(struct trace_iterator *iter) +print_graph_function_flags(struct trace_iterator *iter, u32 flags) { struct ftrace_graph_ent_entry *field; struct fgraph_data *data = iter->private; @@ -1061,7 +1066,7 @@ print_graph_function(struct trace_iterator *iter) if (data && data->failed) { field = &data->ent; iter->cpu = data->cpu; - ret = print_graph_entry(field, s, iter); + ret = print_graph_entry(field, s, iter, flags); if (ret == TRACE_TYPE_HANDLED && iter->cpu != cpu) { per_cpu_ptr(data->cpu_data, iter->cpu)->ignore = 1; ret = TRACE_TYPE_NO_CONSUME; @@ -1081,32 +1086,49 @@ print_graph_function(struct trace_iterator *iter) struct ftrace_graph_ent_entry saved; trace_assign_type(field, entry); saved = *field; - return print_graph_entry(&saved, s, iter); + return print_graph_entry(&saved, s, iter, flags); } case TRACE_GRAPH_RET: { struct ftrace_graph_ret_entry *field; trace_assign_type(field, entry); - return print_graph_return(&field->ret, s, entry, iter); + return print_graph_return(&field->ret, s, entry, iter, flags); } + case TRACE_STACK: + case TRACE_FN: + /* dont trace stack and functions as comments */ + return TRACE_TYPE_UNHANDLED; + default: - return print_graph_comment(s, entry, iter); + return print_graph_comment(s, entry, iter, flags); } return TRACE_TYPE_HANDLED; } -static void print_lat_header(struct seq_file *s) +static enum print_line_t +print_graph_function(struct trace_iterator *iter) +{ + return print_graph_function_flags(iter, tracer_flags.val); +} + +static enum print_line_t +print_graph_function_event(struct trace_iterator *iter, int flags) +{ + return print_graph_function(iter); +} + +static void print_lat_header(struct seq_file *s, u32 flags) { static const char spaces[] = " " /* 16 spaces */ " " /* 4 spaces */ " "; /* 17 spaces */ int size = 0; - if (tracer_flags.val & TRACE_GRAPH_PRINT_ABS_TIME) + if (flags & TRACE_GRAPH_PRINT_ABS_TIME) size += 16; - if (tracer_flags.val & TRACE_GRAPH_PRINT_CPU) + if (flags & TRACE_GRAPH_PRINT_CPU) size += 4; - if (tracer_flags.val & TRACE_GRAPH_PRINT_PROC) + if (flags & TRACE_GRAPH_PRINT_PROC) size += 17; seq_printf(s, "#%.*s _-----=> irqs-off \n", size, spaces); @@ -1117,43 +1139,48 @@ static void print_lat_header(struct seq_file *s) seq_printf(s, "#%.*s|||| / \n", size, spaces); } -static void print_graph_headers(struct seq_file *s) +void print_graph_headers_flags(struct seq_file *s, u32 flags) { int lat = trace_flags & TRACE_ITER_LATENCY_FMT; if (lat) - print_lat_header(s); + print_lat_header(s, flags); /* 1st line */ seq_printf(s, "#"); - if (tracer_flags.val & TRACE_GRAPH_PRINT_ABS_TIME) + if (flags & TRACE_GRAPH_PRINT_ABS_TIME) seq_printf(s, " TIME "); - if (tracer_flags.val & TRACE_GRAPH_PRINT_CPU) + if (flags & TRACE_GRAPH_PRINT_CPU) seq_printf(s, " CPU"); - if (tracer_flags.val & TRACE_GRAPH_PRINT_PROC) + if (flags & TRACE_GRAPH_PRINT_PROC) seq_printf(s, " TASK/PID "); if (lat) seq_printf(s, "|||||"); - if (tracer_flags.val & TRACE_GRAPH_PRINT_DURATION) + if (flags & TRACE_GRAPH_PRINT_DURATION) seq_printf(s, " DURATION "); seq_printf(s, " FUNCTION CALLS\n"); /* 2nd line */ seq_printf(s, "#"); - if (tracer_flags.val & TRACE_GRAPH_PRINT_ABS_TIME) + if (flags & TRACE_GRAPH_PRINT_ABS_TIME) seq_printf(s, " | "); - if (tracer_flags.val & TRACE_GRAPH_PRINT_CPU) + if (flags & TRACE_GRAPH_PRINT_CPU) seq_printf(s, " | "); - if (tracer_flags.val & TRACE_GRAPH_PRINT_PROC) + if (flags & TRACE_GRAPH_PRINT_PROC) seq_printf(s, " | | "); if (lat) seq_printf(s, "|||||"); - if (tracer_flags.val & TRACE_GRAPH_PRINT_DURATION) + if (flags & TRACE_GRAPH_PRINT_DURATION) seq_printf(s, " | | "); seq_printf(s, " | | | |\n"); } -static void graph_trace_open(struct trace_iterator *iter) +void print_graph_headers(struct seq_file *s) +{ + print_graph_headers_flags(s, tracer_flags.val); +} + +void graph_trace_open(struct trace_iterator *iter) { /* pid and depth on the last trace processed */ struct fgraph_data *data; @@ -1188,7 +1215,7 @@ static void graph_trace_open(struct trace_iterator *iter) pr_warning("function graph tracer: not enough memory\n"); } -static void graph_trace_close(struct trace_iterator *iter) +void graph_trace_close(struct trace_iterator *iter) { struct fgraph_data *data = iter->private; @@ -1198,6 +1225,16 @@ static void graph_trace_close(struct trace_iterator *iter) } } +static struct trace_event graph_trace_entry_event = { + .type = TRACE_GRAPH_ENT, + .trace = print_graph_function_event, +}; + +static struct trace_event graph_trace_ret_event = { + .type = TRACE_GRAPH_RET, + .trace = print_graph_function_event, +}; + static struct tracer graph_trace __read_mostly = { .name = "function_graph", .open = graph_trace_open, @@ -1219,6 +1256,16 @@ static __init int init_graph_trace(void) { max_bytes_for_cpu = snprintf(NULL, 0, "%d", nr_cpu_ids - 1); + if (!register_ftrace_event(&graph_trace_entry_event)) { + pr_warning("Warning: could not register graph trace events\n"); + return 1; + } + + if (!register_ftrace_event(&graph_trace_ret_event)) { + pr_warning("Warning: could not register graph trace events\n"); + return 1; + } + return register_tracer(&graph_trace); } diff --git a/kernel/trace/trace_irqsoff.c b/kernel/trace/trace_irqsoff.c index 2974bc7..6fd486e 100644 --- a/kernel/trace/trace_irqsoff.c +++ b/kernel/trace/trace_irqsoff.c @@ -34,6 +34,9 @@ static int trace_type __read_mostly; static int save_lat_flag; +static void stop_irqsoff_tracer(struct trace_array *tr, int graph); +static int start_irqsoff_tracer(struct trace_array *tr, int graph); + #ifdef CONFIG_PREEMPT_TRACER static inline int preempt_trace(void) @@ -55,6 +58,23 @@ irq_trace(void) # define irq_trace() (0) #endif +#define TRACE_DISPLAY_GRAPH 1 + +static struct tracer_opt trace_opts[] = { +#ifdef CONFIG_FUNCTION_GRAPH_TRACER + /* display latency trace as call graph */ + { TRACER_OPT(display-graph, TRACE_DISPLAY_GRAPH) }, +#endif + { } /* Empty entry */ +}; + +static struct tracer_flags tracer_flags = { + .val = 0, + .opts = trace_opts, +}; + +#define is_graph() (tracer_flags.val & TRACE_DISPLAY_GRAPH) + /* * Sequence count - we record it when starting a measurement and * skip the latency if the sequence has changed - some other section @@ -108,6 +128,202 @@ static struct ftrace_ops trace_ops __read_mostly = }; #endif /* CONFIG_FUNCTION_TRACER */ +#ifdef CONFIG_FUNCTION_GRAPH_TRACER +static int irqsoff_set_flag(u32 old_flags, u32 bit, int set) +{ + int cpu; + + if (!(bit & TRACE_DISPLAY_GRAPH)) + return -EINVAL; + + if (!(is_graph() ^ set)) + return 0; + + stop_irqsoff_tracer(irqsoff_trace, !set); + + for_each_possible_cpu(cpu) + per_cpu(tracing_cpu, cpu) = 0; + + tracing_max_latency = 0; + tracing_reset_online_cpus(irqsoff_trace); + + return start_irqsoff_tracer(irqsoff_trace, set); +} + +static int irqsoff_graph_entry(struct ftrace_graph_ent *trace) +{ + struct trace_array *tr = irqsoff_trace; + struct trace_array_cpu *data; + unsigned long flags; + long disabled; + int ret; + int cpu; + int pc; + + cpu = raw_smp_processor_id(); + if (likely(!per_cpu(tracing_cpu, cpu))) + return 0; + + local_save_flags(flags); + /* slight chance to get a false positive on tracing_cpu */ + if (!irqs_disabled_flags(flags)) + return 0; + + data = tr->data[cpu]; + disabled = atomic_inc_return(&data->disabled); + + if (likely(disabled == 1)) { + pc = preempt_count(); + ret = __trace_graph_entry(tr, trace, flags, pc); + } else + ret = 0; + + atomic_dec(&data->disabled); + return ret; +} + +static void irqsoff_graph_return(struct ftrace_graph_ret *trace) +{ + struct trace_array *tr = irqsoff_trace; + struct trace_array_cpu *data; + unsigned long flags; + long disabled; + int cpu; + int pc; + + cpu = raw_smp_processor_id(); + if (likely(!per_cpu(tracing_cpu, cpu))) + return; + + local_save_flags(flags); + /* slight chance to get a false positive on tracing_cpu */ + if (!irqs_disabled_flags(flags)) + return; + + data = tr->data[cpu]; + disabled = atomic_inc_return(&data->disabled); + + if (likely(disabled == 1)) { + pc = preempt_count(); + __trace_graph_return(tr, trace, flags, pc); + } + + atomic_dec(&data->disabled); +} + +static void irqsoff_trace_open(struct trace_iterator *iter) +{ + if (is_graph()) + graph_trace_open(iter); + +} + +static void irqsoff_trace_close(struct trace_iterator *iter) +{ + if (iter->private) + graph_trace_close(iter); +} + +#define GRAPH_TRACER_FLAGS (TRACE_GRAPH_PRINT_CPU | \ + TRACE_GRAPH_PRINT_PROC) + +static enum print_line_t irqsoff_print_line(struct trace_iterator *iter) +{ + u32 flags = GRAPH_TRACER_FLAGS; + + if (trace_flags & TRACE_ITER_LATENCY_FMT) + flags |= TRACE_GRAPH_PRINT_DURATION; + else + flags |= TRACE_GRAPH_PRINT_ABS_TIME; + + /* + * In graph mode call the graph tracer output function, + * otherwise go with the TRACE_FN event handler + */ + if (is_graph()) + return print_graph_function_flags(iter, flags); + + return TRACE_TYPE_UNHANDLED; +} + +static void irqsoff_print_header(struct seq_file *s) +{ + if (is_graph()) { + struct trace_iterator *iter = s->private; + u32 flags = GRAPH_TRACER_FLAGS; + + if (trace_flags & TRACE_ITER_LATENCY_FMT) { + /* print nothing if the buffers are empty */ + if (trace_empty(iter)) + return; + + print_trace_header(s, iter); + flags |= TRACE_GRAPH_PRINT_DURATION; + } else + flags |= TRACE_GRAPH_PRINT_ABS_TIME; + + print_graph_headers_flags(s, flags); + } else + trace_default_header(s); +} + +static void +trace_graph_function(struct trace_array *tr, + unsigned long ip, unsigned long flags, int pc) +{ + u64 time = trace_clock_local(); + struct ftrace_graph_ent ent = { + .func = ip, + .depth = 0, + }; + struct ftrace_graph_ret ret = { + .func = ip, + .depth = 0, + .calltime = time, + .rettime = time, + }; + + __trace_graph_entry(tr, &ent, flags, pc); + __trace_graph_return(tr, &ret, flags, pc); +} + +static void +__trace_function(struct trace_array *tr, + unsigned long ip, unsigned long parent_ip, + unsigned long flags, int pc) +{ + if (!is_graph()) + trace_function(tr, ip, parent_ip, flags, pc); + else { + trace_graph_function(tr, parent_ip, flags, pc); + trace_graph_function(tr, ip, flags, pc); + } +} + +#else +#define __trace_function trace_function + +static int irqsoff_set_flag(u32 old_flags, u32 bit, int set) +{ + return -EINVAL; +} + +static int irqsoff_graph_entry(struct ftrace_graph_ent *trace) +{ + return -1; +} + +static enum print_line_t irqsoff_print_line(struct trace_iterator *iter) +{ + return TRACE_TYPE_UNHANDLED; +} + +static void irqsoff_graph_return(struct ftrace_graph_ret *trace) { } +static void irqsoff_print_header(struct seq_file *s) { } +static void irqsoff_trace_open(struct trace_iterator *iter) { } +static void irqsoff_trace_close(struct trace_iterator *iter) { } +#endif /* CONFIG_FUNCTION_GRAPH_TRACER */ + /* * Should this new latency be reported/recorded? */ @@ -150,7 +366,7 @@ check_critical_timing(struct trace_array *tr, if (!report_latency(delta)) goto out_unlock; - trace_function(tr, CALLER_ADDR0, parent_ip, flags, pc); + __trace_function(tr, CALLER_ADDR0, parent_ip, flags, pc); /* Skip 5 functions to get to the irq/preempt enable function */ __trace_stack(tr, flags, 5, pc); @@ -172,7 +388,7 @@ out_unlock: out: data->critical_sequence = max_sequence; data->preempt_timestamp = ftrace_now(cpu); - trace_function(tr, CALLER_ADDR0, parent_ip, flags, pc); + __trace_function(tr, CALLER_ADDR0, parent_ip, flags, pc); } static inline void @@ -204,7 +420,7 @@ start_critical_timing(unsigned long ip, unsigned long parent_ip) local_save_flags(flags); - trace_function(tr, ip, parent_ip, flags, preempt_count()); + __trace_function(tr, ip, parent_ip, flags, preempt_count()); per_cpu(tracing_cpu, cpu) = 1; @@ -238,7 +454,7 @@ stop_critical_timing(unsigned long ip, unsigned long parent_ip) atomic_inc(&data->disabled); local_save_flags(flags); - trace_function(tr, ip, parent_ip, flags, preempt_count()); + __trace_function(tr, ip, parent_ip, flags, preempt_count()); check_critical_timing(tr, data, parent_ip ? : ip, cpu); data->critical_start = 0; atomic_dec(&data->disabled); @@ -347,19 +563,32 @@ void trace_preempt_off(unsigned long a0, unsigned long a1) } #endif /* CONFIG_PREEMPT_TRACER */ -static void start_irqsoff_tracer(struct trace_array *tr) +static int start_irqsoff_tracer(struct trace_array *tr, int graph) { - register_ftrace_function(&trace_ops); - if (tracing_is_enabled()) + int ret = 0; + + if (!graph) + ret = register_ftrace_function(&trace_ops); + else + ret = register_ftrace_graph(&irqsoff_graph_return, + &irqsoff_graph_entry); + + if (!ret && tracing_is_enabled()) tracer_enabled = 1; else tracer_enabled = 0; + + return ret; } -static void stop_irqsoff_tracer(struct trace_array *tr) +static void stop_irqsoff_tracer(struct trace_array *tr, int graph) { tracer_enabled = 0; - unregister_ftrace_function(&trace_ops); + + if (!graph) + unregister_ftrace_function(&trace_ops); + else + unregister_ftrace_graph(); } static void __irqsoff_tracer_init(struct trace_array *tr) @@ -372,12 +601,14 @@ static void __irqsoff_tracer_init(struct trace_array *tr) /* make sure that the tracer is visible */ smp_wmb(); tracing_reset_online_cpus(tr); - start_irqsoff_tracer(tr); + + if (start_irqsoff_tracer(tr, is_graph())) + printk(KERN_ERR "failed to start irqsoff tracer\n"); } static void irqsoff_tracer_reset(struct trace_array *tr) { - stop_irqsoff_tracer(tr); + stop_irqsoff_tracer(tr, is_graph()); if (!save_lat_flag) trace_flags &= ~TRACE_ITER_LATENCY_FMT; @@ -409,9 +640,15 @@ static struct tracer irqsoff_tracer __read_mostly = .start = irqsoff_tracer_start, .stop = irqsoff_tracer_stop, .print_max = 1, + .print_header = irqsoff_print_header, + .print_line = irqsoff_print_line, + .flags = &tracer_flags, + .set_flag = irqsoff_set_flag, #ifdef CONFIG_FTRACE_SELFTEST .selftest = trace_selftest_startup_irqsoff, #endif + .open = irqsoff_trace_open, + .close = irqsoff_trace_close, }; # define register_irqsoff(trace) register_tracer(&trace) #else @@ -435,9 +672,15 @@ static struct tracer preemptoff_tracer __read_mostly = .start = irqsoff_tracer_start, .stop = irqsoff_tracer_stop, .print_max = 1, + .print_header = irqsoff_print_header, + .print_line = irqsoff_print_line, + .flags = &tracer_flags, + .set_flag = irqsoff_set_flag, #ifdef CONFIG_FTRACE_SELFTEST .selftest = trace_selftest_startup_preemptoff, #endif + .open = irqsoff_trace_open, + .close = irqsoff_trace_close, }; # define register_preemptoff(trace) register_tracer(&trace) #else @@ -463,9 +706,15 @@ static struct tracer preemptirqsoff_tracer __read_mostly = .start = irqsoff_tracer_start, .stop = irqsoff_tracer_stop, .print_max = 1, + .print_header = irqsoff_print_header, + .print_line = irqsoff_print_line, + .flags = &tracer_flags, + .set_flag = irqsoff_set_flag, #ifdef CONFIG_FTRACE_SELFTEST .selftest = trace_selftest_startup_preemptirqsoff, #endif + .open = irqsoff_trace_open, + .close = irqsoff_trace_close, }; # define register_preemptirqsoff(trace) register_tracer(&trace) diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c index 8e46b33..2404c12 100644 --- a/kernel/trace/trace_output.c +++ b/kernel/trace/trace_output.c @@ -253,7 +253,7 @@ void *trace_seq_reserve(struct trace_seq *s, size_t len) void *ret; if (s->full) - return 0; + return NULL; if (len > ((PAGE_SIZE - 1) - s->len)) { s->full = 1; diff --git a/kernel/trace/trace_sched_switch.c b/kernel/trace/trace_sched_switch.c index 5fca0f5..a55fccf 100644 --- a/kernel/trace/trace_sched_switch.c +++ b/kernel/trace/trace_sched_switch.c @@ -50,8 +50,7 @@ tracing_sched_switch_trace(struct trace_array *tr, } static void -probe_sched_switch(struct rq *__rq, struct task_struct *prev, - struct task_struct *next) +probe_sched_switch(struct task_struct *prev, struct task_struct *next) { struct trace_array_cpu *data; unsigned long flags; @@ -109,7 +108,7 @@ tracing_sched_wakeup_trace(struct trace_array *tr, } static void -probe_sched_wakeup(struct rq *__rq, struct task_struct *wakee, int success) +probe_sched_wakeup(struct task_struct *wakee, int success) { struct trace_array_cpu *data; unsigned long flags; diff --git a/kernel/trace/trace_sched_wakeup.c b/kernel/trace/trace_sched_wakeup.c index 0271742..8052446 100644 --- a/kernel/trace/trace_sched_wakeup.c +++ b/kernel/trace/trace_sched_wakeup.c @@ -107,8 +107,7 @@ static void probe_wakeup_migrate_task(struct task_struct *task, int cpu) } static void notrace -probe_wakeup_sched_switch(struct rq *rq, struct task_struct *prev, - struct task_struct *next) +probe_wakeup_sched_switch(struct task_struct *prev, struct task_struct *next) { struct trace_array_cpu *data; cycle_t T0, T1, delta; @@ -200,7 +199,7 @@ static void wakeup_reset(struct trace_array *tr) } static void -probe_wakeup(struct rq *rq, struct task_struct *p, int success) +probe_wakeup(struct task_struct *p, int success) { struct trace_array_cpu *data; int cpu = smp_processor_id(); diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c index 1cc9858..250e7f9 100644 --- a/kernel/trace/trace_selftest.c +++ b/kernel/trace/trace_selftest.c @@ -29,7 +29,7 @@ static int trace_test_buffer_cpu(struct trace_array *tr, int cpu) struct trace_entry *entry; unsigned int loops = 0; - while ((event = ring_buffer_consume(tr->buffer, cpu, NULL))) { + while ((event = ring_buffer_consume(tr->buffer, cpu, NULL, NULL))) { entry = ring_buffer_event_data(event); /* @@ -255,7 +255,8 @@ trace_selftest_startup_function(struct tracer *trace, struct trace_array *tr) /* Maximum number of functions to trace before diagnosing a hang */ #define GRAPH_MAX_FUNC_TEST 100000000 -static void __ftrace_dump(bool disable_tracing); +static void +__ftrace_dump(bool disable_tracing, enum ftrace_dump_mode oops_dump_mode); static unsigned int graph_hang_thresh; /* Wrap the real function entry probe to avoid possible hanging */ @@ -266,7 +267,7 @@ static int trace_graph_entry_watchdog(struct ftrace_graph_ent *trace) ftrace_graph_stop(); printk(KERN_WARNING "BUG: Function graph tracer hang!\n"); if (ftrace_dump_on_oops) - __ftrace_dump(false); + __ftrace_dump(false, DUMP_ALL); return 0; } diff --git a/kernel/user.c b/kernel/user.c index 766467b..7e72614 100644 --- a/kernel/user.c +++ b/kernel/user.c @@ -16,7 +16,6 @@ #include <linux/interrupt.h> #include <linux/module.h> #include <linux/user_namespace.h> -#include "cred-internals.h" struct user_namespace init_user_ns = { .kref = { @@ -137,9 +136,6 @@ struct user_struct *alloc_uid(struct user_namespace *ns, uid_t uid) struct hlist_head *hashent = uidhashentry(ns, uid); struct user_struct *up, *new; - /* Make uid_hash_find() + uids_user_create() + uid_hash_insert() - * atomic. - */ spin_lock_irq(&uidhash_lock); up = uid_hash_find(uid, hashent); spin_unlock_irq(&uidhash_lock); @@ -161,11 +157,6 @@ struct user_struct *alloc_uid(struct user_namespace *ns, uid_t uid) spin_lock_irq(&uidhash_lock); up = uid_hash_find(uid, hashent); if (up) { - /* This case is not possible when CONFIG_USER_SCHED - * is defined, since we serialize alloc_uid() using - * uids_mutex. Hence no need to call - * sched_destroy_user() or remove_user_sysfs_dir(). - */ key_put(new->uid_keyring); key_put(new->session_keyring); kmem_cache_free(uid_cachep, new); @@ -178,8 +169,6 @@ struct user_struct *alloc_uid(struct user_namespace *ns, uid_t uid) return up; - put_user_ns(new->user_ns); - kmem_cache_free(uid_cachep, new); out_unlock: return NULL; } diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 935248b..d85be90 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -512,6 +512,18 @@ config PROVE_RCU Say N if you are unsure. +config PROVE_RCU_REPEATEDLY + bool "RCU debugging: don't disable PROVE_RCU on first splat" + depends on PROVE_RCU + default n + help + By itself, PROVE_RCU will disable checking upon issuing the + first warning (or "splat"). This feature prevents such + disabling, allowing multiple RCU-lockdep warnings to be printed + on a single reboot. + + Say N if you are unsure. + config LOCKDEP bool depends on DEBUG_KERNEL && TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT @@ -793,7 +805,7 @@ config RCU_CPU_STALL_DETECTOR config RCU_CPU_STALL_VERBOSE bool "Print additional per-task information for RCU_CPU_STALL_DETECTOR" depends on RCU_CPU_STALL_DETECTOR && TREE_PREEMPT_RCU - default n + default y help This option causes RCU to printk detailed per-task information for any tasks that are stalling the current RCU grace period. @@ -1086,6 +1098,13 @@ config DMA_API_DEBUG This option causes a performance degredation. Use only if you want to debug device drivers. If unsure, say N. +config ATOMIC64_SELFTEST + bool "Perform an atomic64_t self-test at boot" + help + Enable this option to test the atomic64_t functions at boot. + + If unsure, say N. + source "samples/Kconfig" source "lib/Kconfig.kgdb" diff --git a/lib/Makefile b/lib/Makefile index 0d40152..9e6d3c2 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -39,7 +39,10 @@ lib-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) += rwsem.o lib-$(CONFIG_GENERIC_FIND_FIRST_BIT) += find_next_bit.o lib-$(CONFIG_GENERIC_FIND_NEXT_BIT) += find_next_bit.o obj-$(CONFIG_GENERIC_FIND_LAST_BIT) += find_last_bit.o + +CFLAGS_hweight.o = $(subst $(quote),,$(CONFIG_ARCH_HWEIGHT_CFLAGS)) obj-$(CONFIG_GENERIC_HWEIGHT) += hweight.o + obj-$(CONFIG_LOCK_KERNEL) += kernel_lock.o obj-$(CONFIG_BTREE) += btree.o obj-$(CONFIG_DEBUG_PREEMPT) += smp_processor_id.o @@ -101,6 +104,8 @@ obj-$(CONFIG_GENERIC_CSUM) += checksum.o obj-$(CONFIG_GENERIC_ATOMIC64) += atomic64.o +obj-$(CONFIG_ATOMIC64_SELFTEST) += atomic64_test.o + hostprogs-y := gen_crc32table clean-files := crc32table.h diff --git a/lib/atomic64.c b/lib/atomic64.c index 8bee16e..a21c12b 100644 --- a/lib/atomic64.c +++ b/lib/atomic64.c @@ -162,12 +162,12 @@ int atomic64_add_unless(atomic64_t *v, long long a, long long u) { unsigned long flags; spinlock_t *lock = lock_addr(v); - int ret = 1; + int ret = 0; spin_lock_irqsave(lock, flags); if (v->counter != u) { v->counter += a; - ret = 0; + ret = 1; } spin_unlock_irqrestore(lock, flags); return ret; diff --git a/lib/atomic64_test.c b/lib/atomic64_test.c new file mode 100644 index 0000000..65e482c --- /dev/null +++ b/lib/atomic64_test.c @@ -0,0 +1,164 @@ +/* + * Testsuite for atomic64_t functions + * + * Copyright © 2010 Luca Barbieri + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + */ +#include <linux/init.h> +#include <asm/atomic.h> + +#define INIT(c) do { atomic64_set(&v, c); r = c; } while (0) +static __init int test_atomic64(void) +{ + long long v0 = 0xaaa31337c001d00dLL; + long long v1 = 0xdeadbeefdeafcafeLL; + long long v2 = 0xfaceabadf00df001LL; + long long onestwos = 0x1111111122222222LL; + long long one = 1LL; + + atomic64_t v = ATOMIC64_INIT(v0); + long long r = v0; + BUG_ON(v.counter != r); + + atomic64_set(&v, v1); + r = v1; + BUG_ON(v.counter != r); + BUG_ON(atomic64_read(&v) != r); + + INIT(v0); + atomic64_add(onestwos, &v); + r += onestwos; + BUG_ON(v.counter != r); + + INIT(v0); + atomic64_add(-one, &v); + r += -one; + BUG_ON(v.counter != r); + + INIT(v0); + r += onestwos; + BUG_ON(atomic64_add_return(onestwos, &v) != r); + BUG_ON(v.counter != r); + + INIT(v0); + r += -one; + BUG_ON(atomic64_add_return(-one, &v) != r); + BUG_ON(v.counter != r); + + INIT(v0); + atomic64_sub(onestwos, &v); + r -= onestwos; + BUG_ON(v.counter != r); + + INIT(v0); + atomic64_sub(-one, &v); + r -= -one; + BUG_ON(v.counter != r); + + INIT(v0); + r -= onestwos; + BUG_ON(atomic64_sub_return(onestwos, &v) != r); + BUG_ON(v.counter != r); + + INIT(v0); + r -= -one; + BUG_ON(atomic64_sub_return(-one, &v) != r); + BUG_ON(v.counter != r); + + INIT(v0); + atomic64_inc(&v); + r += one; + BUG_ON(v.counter != r); + + INIT(v0); + r += one; + BUG_ON(atomic64_inc_return(&v) != r); + BUG_ON(v.counter != r); + + INIT(v0); + atomic64_dec(&v); + r -= one; + BUG_ON(v.counter != r); + + INIT(v0); + r -= one; + BUG_ON(atomic64_dec_return(&v) != r); + BUG_ON(v.counter != r); + + INIT(v0); + BUG_ON(atomic64_xchg(&v, v1) != v0); + r = v1; + BUG_ON(v.counter != r); + + INIT(v0); + BUG_ON(atomic64_cmpxchg(&v, v0, v1) != v0); + r = v1; + BUG_ON(v.counter != r); + + INIT(v0); + BUG_ON(atomic64_cmpxchg(&v, v2, v1) != v0); + BUG_ON(v.counter != r); + + INIT(v0); + BUG_ON(atomic64_add_unless(&v, one, v0)); + BUG_ON(v.counter != r); + + INIT(v0); + BUG_ON(!atomic64_add_unless(&v, one, v1)); + r += one; + BUG_ON(v.counter != r); + +#if defined(CONFIG_X86) || defined(CONFIG_MIPS) || defined(CONFIG_PPC) || defined(_ASM_GENERIC_ATOMIC64_H) + INIT(onestwos); + BUG_ON(atomic64_dec_if_positive(&v) != (onestwos - 1)); + r -= one; + BUG_ON(v.counter != r); + + INIT(0); + BUG_ON(atomic64_dec_if_positive(&v) != -one); + BUG_ON(v.counter != r); + + INIT(-one); + BUG_ON(atomic64_dec_if_positive(&v) != (-one - one)); + BUG_ON(v.counter != r); +#else +#warning Please implement atomic64_dec_if_positive for your architecture, and add it to the IF above +#endif + + INIT(onestwos); + BUG_ON(!atomic64_inc_not_zero(&v)); + r += one; + BUG_ON(v.counter != r); + + INIT(0); + BUG_ON(atomic64_inc_not_zero(&v)); + BUG_ON(v.counter != r); + + INIT(-one); + BUG_ON(!atomic64_inc_not_zero(&v)); + r += one; + BUG_ON(v.counter != r); + +#ifdef CONFIG_X86 + printk(KERN_INFO "atomic64 test passed for %s platform %s CX8 and %s SSE\n", +#ifdef CONFIG_X86_64 + "x86-64", +#elif defined(CONFIG_X86_CMPXCHG64) + "i586+", +#else + "i386+", +#endif + boot_cpu_has(X86_FEATURE_CX8) ? "with" : "without", + boot_cpu_has(X86_FEATURE_XMM) ? "with" : "without"); +#else + printk(KERN_INFO "atomic64 test passed\n"); +#endif + + return 0; +} + +core_initcall(test_atomic64); diff --git a/lib/btree.c b/lib/btree.c index 41859a8..c9c6f03 100644 --- a/lib/btree.c +++ b/lib/btree.c @@ -95,7 +95,8 @@ static unsigned long *btree_node_alloc(struct btree_head *head, gfp_t gfp) unsigned long *node; node = mempool_alloc(head->mempool, gfp); - memset(node, 0, NODESIZE); + if (likely(node)) + memset(node, 0, NODESIZE); return node; } diff --git a/lib/debugobjects.c b/lib/debugobjects.c index b862b30..deebcc5 100644 --- a/lib/debugobjects.c +++ b/lib/debugobjects.c @@ -141,6 +141,7 @@ alloc_object(void *addr, struct debug_bucket *b, struct debug_obj_descr *descr) obj->object = addr; obj->descr = descr; obj->state = ODEBUG_STATE_NONE; + obj->astate = 0; hlist_del(&obj->node); hlist_add_head(&obj->node, &b->list); @@ -252,8 +253,10 @@ static void debug_print_object(struct debug_obj *obj, char *msg) if (limit < 5 && obj->descr != descr_test) { limit++; - WARN(1, KERN_ERR "ODEBUG: %s %s object type: %s\n", msg, - obj_states[obj->state], obj->descr->name); + WARN(1, KERN_ERR "ODEBUG: %s %s (active state %u) " + "object type: %s\n", + msg, obj_states[obj->state], obj->astate, + obj->descr->name); } debug_objects_warnings++; } @@ -447,7 +450,10 @@ void debug_object_deactivate(void *addr, struct debug_obj_descr *descr) case ODEBUG_STATE_INIT: case ODEBUG_STATE_INACTIVE: case ODEBUG_STATE_ACTIVE: - obj->state = ODEBUG_STATE_INACTIVE; + if (!obj->astate) + obj->state = ODEBUG_STATE_INACTIVE; + else + debug_print_object(obj, "deactivate"); break; case ODEBUG_STATE_DESTROYED: @@ -553,6 +559,53 @@ out_unlock: raw_spin_unlock_irqrestore(&db->lock, flags); } +/** + * debug_object_active_state - debug checks object usage state machine + * @addr: address of the object + * @descr: pointer to an object specific debug description structure + * @expect: expected state + * @next: state to move to if expected state is found + */ +void +debug_object_active_state(void *addr, struct debug_obj_descr *descr, + unsigned int expect, unsigned int next) +{ + struct debug_bucket *db; + struct debug_obj *obj; + unsigned long flags; + + if (!debug_objects_enabled) + return; + + db = get_bucket((unsigned long) addr); + + raw_spin_lock_irqsave(&db->lock, flags); + + obj = lookup_object(addr, db); + if (obj) { + switch (obj->state) { + case ODEBUG_STATE_ACTIVE: + if (obj->astate == expect) + obj->astate = next; + else + debug_print_object(obj, "active_state"); + break; + + default: + debug_print_object(obj, "active_state"); + break; + } + } else { + struct debug_obj o = { .object = addr, + .state = ODEBUG_STATE_NOTAVAILABLE, + .descr = descr }; + + debug_print_object(&o, "active_state"); + } + + raw_spin_unlock_irqrestore(&db->lock, flags); +} + #ifdef CONFIG_DEBUG_OBJECTS_FREE static void __debug_check_no_obj_freed(const void *address, unsigned long size) { @@ -774,7 +827,7 @@ static int __init fixup_free(void *addr, enum debug_obj_state state) } } -static int +static int __init check_results(void *addr, enum debug_obj_state state, int fixups, int warnings) { struct debug_bucket *db; @@ -917,7 +970,7 @@ void __init debug_objects_early_init(void) /* * Convert the statically allocated objects to dynamic ones: */ -static int debug_objects_replace_static_objects(void) +static int __init debug_objects_replace_static_objects(void) { struct debug_bucket *db = obj_hash; struct hlist_node *node, *tmp; diff --git a/lib/hweight.c b/lib/hweight.c index 63ee4eb..3c79d50 100644 --- a/lib/hweight.c +++ b/lib/hweight.c @@ -9,7 +9,7 @@ * The Hamming Weight of a number is the total number of bits set in it. */ -unsigned int hweight32(unsigned int w) +unsigned int __sw_hweight32(unsigned int w) { #ifdef ARCH_HAS_FAST_MULTIPLIER w -= (w >> 1) & 0x55555555; @@ -24,29 +24,30 @@ unsigned int hweight32(unsigned int w) return (res + (res >> 16)) & 0x000000FF; #endif } -EXPORT_SYMBOL(hweight32); +EXPORT_SYMBOL(__sw_hweight32); -unsigned int hweight16(unsigned int w) +unsigned int __sw_hweight16(unsigned int w) { unsigned int res = w - ((w >> 1) & 0x5555); res = (res & 0x3333) + ((res >> 2) & 0x3333); res = (res + (res >> 4)) & 0x0F0F; return (res + (res >> 8)) & 0x00FF; } -EXPORT_SYMBOL(hweight16); +EXPORT_SYMBOL(__sw_hweight16); -unsigned int hweight8(unsigned int w) +unsigned int __sw_hweight8(unsigned int w) { unsigned int res = w - ((w >> 1) & 0x55); res = (res & 0x33) + ((res >> 2) & 0x33); return (res + (res >> 4)) & 0x0F; } -EXPORT_SYMBOL(hweight8); +EXPORT_SYMBOL(__sw_hweight8); -unsigned long hweight64(__u64 w) +unsigned long __sw_hweight64(__u64 w) { #if BITS_PER_LONG == 32 - return hweight32((unsigned int)(w >> 32)) + hweight32((unsigned int)w); + return __sw_hweight32((unsigned int)(w >> 32)) + + __sw_hweight32((unsigned int)w); #elif BITS_PER_LONG == 64 #ifdef ARCH_HAS_FAST_MULTIPLIER w -= (w >> 1) & 0x5555555555555555ul; @@ -63,4 +64,4 @@ unsigned long hweight64(__u64 w) #endif #endif } -EXPORT_SYMBOL(hweight64); +EXPORT_SYMBOL(__sw_hweight64); diff --git a/lib/rbtree.c b/lib/rbtree.c index e2aa3be..15e10b1 100644 --- a/lib/rbtree.c +++ b/lib/rbtree.c @@ -44,6 +44,11 @@ static void __rb_rotate_left(struct rb_node *node, struct rb_root *root) else root->rb_node = right; rb_set_parent(node, right); + + if (root->augment_cb) { + root->augment_cb(node); + root->augment_cb(right); + } } static void __rb_rotate_right(struct rb_node *node, struct rb_root *root) @@ -67,12 +72,20 @@ static void __rb_rotate_right(struct rb_node *node, struct rb_root *root) else root->rb_node = left; rb_set_parent(node, left); + + if (root->augment_cb) { + root->augment_cb(node); + root->augment_cb(left); + } } void rb_insert_color(struct rb_node *node, struct rb_root *root) { struct rb_node *parent, *gparent; + if (root->augment_cb) + root->augment_cb(node); + while ((parent = rb_parent(node)) && rb_is_red(parent)) { gparent = rb_parent(parent); @@ -227,12 +240,15 @@ void rb_erase(struct rb_node *node, struct rb_root *root) else { struct rb_node *old = node, *left; + int old_parent_cb = 0; + int successor_parent_cb = 0; node = node->rb_right; while ((left = node->rb_left) != NULL) node = left; if (rb_parent(old)) { + old_parent_cb = 1; if (rb_parent(old)->rb_left == old) rb_parent(old)->rb_left = node; else @@ -247,8 +263,10 @@ void rb_erase(struct rb_node *node, struct rb_root *root) if (parent == old) { parent = node; } else { + successor_parent_cb = 1; if (child) rb_set_parent(child, parent); + parent->rb_left = child; node->rb_right = old->rb_right; @@ -259,6 +277,24 @@ void rb_erase(struct rb_node *node, struct rb_root *root) node->rb_left = old->rb_left; rb_set_parent(old->rb_left, node); + if (root->augment_cb) { + /* + * Here, three different nodes can have new children. + * The parent of the successor node that was selected + * to replace the node to be erased. + * The node that is getting erased and is now replaced + * by its successor. + * The parent of the node getting erased-replaced. + */ + if (successor_parent_cb) + root->augment_cb(parent); + + root->augment_cb(node); + + if (old_parent_cb) + root->augment_cb(rb_parent(old)); + } + goto color; } @@ -267,15 +303,19 @@ void rb_erase(struct rb_node *node, struct rb_root *root) if (child) rb_set_parent(child, parent); - if (parent) - { + + if (parent) { if (parent->rb_left == node) parent->rb_left = child; else parent->rb_right = child; - } - else + + if (root->augment_cb) + root->augment_cb(parent); + + } else { root->rb_node = child; + } color: if (color == RB_BLACK) diff --git a/lib/rwsem.c b/lib/rwsem.c index 3e3365e..ceba8e2 100644 --- a/lib/rwsem.c +++ b/lib/rwsem.c @@ -136,9 +136,10 @@ __rwsem_do_wake(struct rw_semaphore *sem, int downgrading) out: return sem; - /* undo the change to count, but check for a transition 1->0 */ + /* undo the change to the active count, but check for a transition + * 1->0 */ undo: - if (rwsem_atomic_update(-RWSEM_ACTIVE_BIAS, sem) != 0) + if (rwsem_atomic_update(-RWSEM_ACTIVE_BIAS, sem) & RWSEM_ACTIVE_MASK) goto out; goto try_again; } diff --git a/mm/hugetlb.c b/mm/hugetlb.c index ffbdfc8..4c9e6bb 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -1039,7 +1039,7 @@ static struct page *alloc_huge_page(struct vm_area_struct *vma, page = alloc_buddy_huge_page(h, vma, addr); if (!page) { hugetlb_put_quota(inode->i_mapping, chg); - return ERR_PTR(-VM_FAULT_OOM); + return ERR_PTR(-VM_FAULT_SIGBUS); } } diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 6c755de..8a79a6f 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -1601,7 +1601,6 @@ static int __mem_cgroup_try_charge(struct mm_struct *mm, * There is a small race that "from" or "to" can be * freed by rmdir, so we use css_tryget(). */ - rcu_read_lock(); from = mc.from; to = mc.to; if (from && css_tryget(&from->css)) { @@ -1622,7 +1621,6 @@ static int __mem_cgroup_try_charge(struct mm_struct *mm, do_continue = (to == mem_over_limit); css_put(&to->css); } - rcu_read_unlock(); if (do_continue) { DEFINE_WAIT(wait); prepare_to_wait(&mc.waitq, &wait, @@ -336,14 +336,13 @@ vma_address(struct page *page, struct vm_area_struct *vma) /* * At what user virtual address is page expected in vma? - * checking that the page matches the vma. + * Caller should check the page is actually part of the vma. */ unsigned long page_address_in_vma(struct page *page, struct vm_area_struct *vma) { - if (PageAnon(page)) { - if (vma->anon_vma != page_anon_vma(page)) - return -EFAULT; - } else if (page->mapping && !(vma->vm_flags & VM_NONLINEAR)) { + if (PageAnon(page)) + ; + else if (page->mapping && !(vma->vm_flags & VM_NONLINEAR)) { if (!vma->vm_file || vma->vm_file->f_mapping != page->mapping) return -EFAULT; diff --git a/net/core/dev.c b/net/core/dev.c index f769098..264137f 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -1451,7 +1451,7 @@ static inline void net_timestamp(struct sk_buff *skb) * * return values: * NET_RX_SUCCESS (no congestion) - * NET_RX_DROP (packet was dropped) + * NET_RX_DROP (packet was dropped, but freed) * * dev_forward_skb can be used for injecting an skb from the * start_xmit function of one device into the receive queue @@ -1465,12 +1465,11 @@ int dev_forward_skb(struct net_device *dev, struct sk_buff *skb) { skb_orphan(skb); - if (!(dev->flags & IFF_UP)) - return NET_RX_DROP; - - if (skb->len > (dev->mtu + dev->hard_header_len)) + if (!(dev->flags & IFF_UP) || + (skb->len > (dev->mtu + dev->hard_header_len))) { + kfree_skb(skb); return NET_RX_DROP; - + } skb_set_dev(skb, dev); skb->tstamp.tv64 = 0; skb->pkt_type = PACKET_HOST; diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index fe776c9..31e85d3 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -602,12 +602,19 @@ static void copy_rtnl_link_stats(struct rtnl_link_stats *a, a->tx_compressed = b->tx_compressed; }; +/* All VF info */ static inline int rtnl_vfinfo_size(const struct net_device *dev) { - if (dev->dev.parent && dev_is_pci(dev->dev.parent)) - return dev_num_vf(dev->dev.parent) * - sizeof(struct ifla_vf_info); - else + if (dev->dev.parent && dev_is_pci(dev->dev.parent)) { + + int num_vfs = dev_num_vf(dev->dev.parent); + size_t size = nlmsg_total_size(sizeof(struct nlattr)); + size += nlmsg_total_size(num_vfs * sizeof(struct nlattr)); + size += num_vfs * (sizeof(struct ifla_vf_mac) + + sizeof(struct ifla_vf_vlan) + + sizeof(struct ifla_vf_tx_rate)); + return size; + } else return 0; } @@ -629,7 +636,7 @@ static inline size_t if_nlmsg_size(const struct net_device *dev) + nla_total_size(1) /* IFLA_OPERSTATE */ + nla_total_size(1) /* IFLA_LINKMODE */ + nla_total_size(4) /* IFLA_NUM_VF */ - + nla_total_size(rtnl_vfinfo_size(dev)) /* IFLA_VFINFO */ + + rtnl_vfinfo_size(dev) /* IFLA_VFINFO_LIST */ + rtnl_link_get_size(dev); /* IFLA_LINKINFO */ } @@ -700,14 +707,37 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev, if (dev->netdev_ops->ndo_get_vf_config && dev->dev.parent) { int i; - struct ifla_vf_info ivi; - NLA_PUT_U32(skb, IFLA_NUM_VF, dev_num_vf(dev->dev.parent)); - for (i = 0; i < dev_num_vf(dev->dev.parent); i++) { + struct nlattr *vfinfo, *vf; + int num_vfs = dev_num_vf(dev->dev.parent); + + NLA_PUT_U32(skb, IFLA_NUM_VF, num_vfs); + vfinfo = nla_nest_start(skb, IFLA_VFINFO_LIST); + if (!vfinfo) + goto nla_put_failure; + for (i = 0; i < num_vfs; i++) { + struct ifla_vf_info ivi; + struct ifla_vf_mac vf_mac; + struct ifla_vf_vlan vf_vlan; + struct ifla_vf_tx_rate vf_tx_rate; if (dev->netdev_ops->ndo_get_vf_config(dev, i, &ivi)) break; - NLA_PUT(skb, IFLA_VFINFO, sizeof(ivi), &ivi); + vf_mac.vf = vf_vlan.vf = vf_tx_rate.vf = ivi.vf; + memcpy(vf_mac.mac, ivi.mac, sizeof(ivi.mac)); + vf_vlan.vlan = ivi.vlan; + vf_vlan.qos = ivi.qos; + vf_tx_rate.rate = ivi.tx_rate; + vf = nla_nest_start(skb, IFLA_VF_INFO); + if (!vf) { + nla_nest_cancel(skb, vfinfo); + goto nla_put_failure; + } + NLA_PUT(skb, IFLA_VF_MAC, sizeof(vf_mac), &vf_mac); + NLA_PUT(skb, IFLA_VF_VLAN, sizeof(vf_vlan), &vf_vlan); + NLA_PUT(skb, IFLA_VF_TX_RATE, sizeof(vf_tx_rate), &vf_tx_rate); + nla_nest_end(skb, vf); } + nla_nest_end(skb, vfinfo); } if (dev->rtnl_link_ops) { if (rtnl_link_fill(skb, dev) < 0) @@ -769,12 +799,7 @@ const struct nla_policy ifla_policy[IFLA_MAX+1] = { [IFLA_LINKINFO] = { .type = NLA_NESTED }, [IFLA_NET_NS_PID] = { .type = NLA_U32 }, [IFLA_IFALIAS] = { .type = NLA_STRING, .len = IFALIASZ-1 }, - [IFLA_VF_MAC] = { .type = NLA_BINARY, - .len = sizeof(struct ifla_vf_mac) }, - [IFLA_VF_VLAN] = { .type = NLA_BINARY, - .len = sizeof(struct ifla_vf_vlan) }, - [IFLA_VF_TX_RATE] = { .type = NLA_BINARY, - .len = sizeof(struct ifla_vf_tx_rate) }, + [IFLA_VFINFO_LIST] = {. type = NLA_NESTED }, }; EXPORT_SYMBOL(ifla_policy); @@ -783,6 +808,19 @@ static const struct nla_policy ifla_info_policy[IFLA_INFO_MAX+1] = { [IFLA_INFO_DATA] = { .type = NLA_NESTED }, }; +static const struct nla_policy ifla_vfinfo_policy[IFLA_VF_INFO_MAX+1] = { + [IFLA_VF_INFO] = { .type = NLA_NESTED }, +}; + +static const struct nla_policy ifla_vf_policy[IFLA_VF_MAX+1] = { + [IFLA_VF_MAC] = { .type = NLA_BINARY, + .len = sizeof(struct ifla_vf_mac) }, + [IFLA_VF_VLAN] = { .type = NLA_BINARY, + .len = sizeof(struct ifla_vf_vlan) }, + [IFLA_VF_TX_RATE] = { .type = NLA_BINARY, + .len = sizeof(struct ifla_vf_tx_rate) }, +}; + struct net *rtnl_link_get_net(struct net *src_net, struct nlattr *tb[]) { struct net *net; @@ -812,6 +850,52 @@ static int validate_linkmsg(struct net_device *dev, struct nlattr *tb[]) return 0; } +static int do_setvfinfo(struct net_device *dev, struct nlattr *attr) +{ + int rem, err = -EINVAL; + struct nlattr *vf; + const struct net_device_ops *ops = dev->netdev_ops; + + nla_for_each_nested(vf, attr, rem) { + switch (nla_type(vf)) { + case IFLA_VF_MAC: { + struct ifla_vf_mac *ivm; + ivm = nla_data(vf); + err = -EOPNOTSUPP; + if (ops->ndo_set_vf_mac) + err = ops->ndo_set_vf_mac(dev, ivm->vf, + ivm->mac); + break; + } + case IFLA_VF_VLAN: { + struct ifla_vf_vlan *ivv; + ivv = nla_data(vf); + err = -EOPNOTSUPP; + if (ops->ndo_set_vf_vlan) + err = ops->ndo_set_vf_vlan(dev, ivv->vf, + ivv->vlan, + ivv->qos); + break; + } + case IFLA_VF_TX_RATE: { + struct ifla_vf_tx_rate *ivt; + ivt = nla_data(vf); + err = -EOPNOTSUPP; + if (ops->ndo_set_vf_tx_rate) + err = ops->ndo_set_vf_tx_rate(dev, ivt->vf, + ivt->rate); + break; + } + default: + err = -EINVAL; + break; + } + if (err) + break; + } + return err; +} + static int do_setlink(struct net_device *dev, struct ifinfomsg *ifm, struct nlattr **tb, char *ifname, int modified) { @@ -942,40 +1026,17 @@ static int do_setlink(struct net_device *dev, struct ifinfomsg *ifm, write_unlock_bh(&dev_base_lock); } - if (tb[IFLA_VF_MAC]) { - struct ifla_vf_mac *ivm; - ivm = nla_data(tb[IFLA_VF_MAC]); - err = -EOPNOTSUPP; - if (ops->ndo_set_vf_mac) - err = ops->ndo_set_vf_mac(dev, ivm->vf, ivm->mac); - if (err < 0) - goto errout; - modified = 1; - } - - if (tb[IFLA_VF_VLAN]) { - struct ifla_vf_vlan *ivv; - ivv = nla_data(tb[IFLA_VF_VLAN]); - err = -EOPNOTSUPP; - if (ops->ndo_set_vf_vlan) - err = ops->ndo_set_vf_vlan(dev, ivv->vf, - ivv->vlan, - ivv->qos); - if (err < 0) - goto errout; - modified = 1; - } - err = 0; - - if (tb[IFLA_VF_TX_RATE]) { - struct ifla_vf_tx_rate *ivt; - ivt = nla_data(tb[IFLA_VF_TX_RATE]); - err = -EOPNOTSUPP; - if (ops->ndo_set_vf_tx_rate) - err = ops->ndo_set_vf_tx_rate(dev, ivt->vf, ivt->rate); - if (err < 0) - goto errout; - modified = 1; + if (tb[IFLA_VFINFO_LIST]) { + struct nlattr *attr; + int rem; + nla_for_each_nested(attr, tb[IFLA_VFINFO_LIST], rem) { + if (nla_type(attr) != IFLA_VF_INFO) + goto errout; + err = do_setvfinfo(dev, attr); + if (err < 0) + goto errout; + modified = 1; + } } err = 0; diff --git a/net/ipv4/arp.c b/net/ipv4/arp.c index 6e74706..80769f1 100644 --- a/net/ipv4/arp.c +++ b/net/ipv4/arp.c @@ -661,13 +661,13 @@ struct sk_buff *arp_create(int type, int ptype, __be32 dest_ip, #endif #endif -#ifdef CONFIG_FDDI +#if defined(CONFIG_FDDI) || defined(CONFIG_FDDI_MODULE) case ARPHRD_FDDI: arp->ar_hrd = htons(ARPHRD_ETHER); arp->ar_pro = htons(ETH_P_IP); break; #endif -#ifdef CONFIG_TR +#if defined(CONFIG_TR) || defined(CONFIG_TR_MODULE) case ARPHRD_IEEE802_TR: arp->ar_hrd = htons(ARPHRD_IEEE802); arp->ar_pro = htons(ETH_P_IP); @@ -1051,7 +1051,7 @@ static int arp_req_set(struct net *net, struct arpreq *r, return -EINVAL; } switch (dev->type) { -#ifdef CONFIG_FDDI +#if defined(CONFIG_FDDI) || defined(CONFIG_FDDI_MODULE) case ARPHRD_FDDI: /* * According to RFC 1390, FDDI devices should accept ARP diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c index 9d4f6d1..ec19a89 100644 --- a/net/ipv4/ipmr.c +++ b/net/ipv4/ipmr.c @@ -754,7 +754,8 @@ ipmr_cache_unresolved(struct net *net, vifi_t vifi, struct sk_buff *skb) c->next = mfc_unres_queue; mfc_unres_queue = c; - mod_timer(&ipmr_expire_timer, c->mfc_un.unres.expires); + if (atomic_read(&net->ipv4.cache_resolve_queue_len) == 1) + mod_timer(&ipmr_expire_timer, c->mfc_un.unres.expires); } /* diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 0f8caf6..296150b 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -2839,7 +2839,6 @@ static void __tcp_free_md5sig_pool(struct tcp_md5sig_pool * __percpu *pool) if (p->md5_desc.tfm) crypto_free_hash(p->md5_desc.tfm); kfree(p); - p = NULL; } } free_percpu(pool); @@ -2937,25 +2936,40 @@ retry: EXPORT_SYMBOL(tcp_alloc_md5sig_pool); -struct tcp_md5sig_pool *__tcp_get_md5sig_pool(int cpu) + +/** + * tcp_get_md5sig_pool - get md5sig_pool for this user + * + * We use percpu structure, so if we succeed, we exit with preemption + * and BH disabled, to make sure another thread or softirq handling + * wont try to get same context. + */ +struct tcp_md5sig_pool *tcp_get_md5sig_pool(void) { struct tcp_md5sig_pool * __percpu *p; - spin_lock_bh(&tcp_md5sig_pool_lock); + + local_bh_disable(); + + spin_lock(&tcp_md5sig_pool_lock); p = tcp_md5sig_pool; if (p) tcp_md5sig_users++; - spin_unlock_bh(&tcp_md5sig_pool_lock); - return (p ? *per_cpu_ptr(p, cpu) : NULL); -} + spin_unlock(&tcp_md5sig_pool_lock); + + if (p) + return *per_cpu_ptr(p, smp_processor_id()); -EXPORT_SYMBOL(__tcp_get_md5sig_pool); + local_bh_enable(); + return NULL; +} +EXPORT_SYMBOL(tcp_get_md5sig_pool); -void __tcp_put_md5sig_pool(void) +void tcp_put_md5sig_pool(void) { + local_bh_enable(); tcp_free_md5sig_pool(); } - -EXPORT_SYMBOL(__tcp_put_md5sig_pool); +EXPORT_SYMBOL(tcp_put_md5sig_pool); int tcp_md5_hash_header(struct tcp_md5sig_pool *hp, struct tcphdr *th) diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 8fef859..c36522a 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1527,6 +1527,9 @@ int __udp4_lib_rcv(struct sk_buff *skb, struct udp_table *udptable, uh = udp_hdr(skb); ulen = ntohs(uh->len); + saddr = ip_hdr(skb)->saddr; + daddr = ip_hdr(skb)->daddr; + if (ulen > skb->len) goto short_packet; @@ -1540,9 +1543,6 @@ int __udp4_lib_rcv(struct sk_buff *skb, struct udp_table *udptable, if (udp4_csum_init(skb, uh, proto)) goto csum_error; - saddr = ip_hdr(skb)->saddr; - daddr = ip_hdr(skb)->daddr; - if (rt->rt_flags & (RTCF_BROADCAST|RTCF_MULTICAST)) return __udp4_lib_mcast_deliver(net, skb, uh, saddr, daddr, udptable); diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c index 622dc79..6157388 100644 --- a/net/ipv6/datagram.c +++ b/net/ipv6/datagram.c @@ -222,6 +222,8 @@ void ipv6_icmp_error(struct sock *sk, struct sk_buff *skb, int err, if (!skb) return; + skb->protocol = htons(ETH_P_IPV6); + serr = SKB_EXT_ERR(skb); serr->ee.ee_errno = err; serr->ee.ee_origin = SO_EE_ORIGIN_ICMP6; @@ -255,6 +257,8 @@ void ipv6_local_error(struct sock *sk, int err, struct flowi *fl, u32 info) if (!skb) return; + skb->protocol = htons(ETH_P_IPV6); + skb_put(skb, sizeof(struct ipv6hdr)); skb_reset_network_header(skb); iph = ipv6_hdr(skb); @@ -319,7 +323,7 @@ int ipv6_recv_error(struct sock *sk, struct msghdr *msg, int len) sin->sin6_flowinfo = 0; sin->sin6_port = serr->port; sin->sin6_scope_id = 0; - if (serr->ee.ee_origin == SO_EE_ORIGIN_ICMP6) { + if (skb->protocol == htons(ETH_P_IPV6)) { ipv6_addr_copy(&sin->sin6_addr, (struct in6_addr *)(nh + serr->addr_offset)); if (np->sndflow) @@ -341,7 +345,7 @@ int ipv6_recv_error(struct sock *sk, struct msghdr *msg, int len) sin->sin6_family = AF_INET6; sin->sin6_flowinfo = 0; sin->sin6_scope_id = 0; - if (serr->ee.ee_origin == SO_EE_ORIGIN_ICMP6) { + if (skb->protocol == htons(ETH_P_IPV6)) { ipv6_addr_copy(&sin->sin6_addr, &ipv6_hdr(skb)->saddr); if (np->rxopt.all) datagram_recv_ctl(sk, msg, skb); diff --git a/net/llc/llc_sap.c b/net/llc/llc_sap.c index a432f0e..94e7fca 100644 --- a/net/llc/llc_sap.c +++ b/net/llc/llc_sap.c @@ -31,7 +31,7 @@ static int llc_mac_header_len(unsigned short devtype) case ARPHRD_ETHER: case ARPHRD_LOOPBACK: return sizeof(struct ethhdr); -#ifdef CONFIG_TR +#if defined(CONFIG_TR) || defined(CONFIG_TR_MODULE) case ARPHRD_IEEE802_TR: return sizeof(struct trh_hdr); #endif diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c index 4aefa6d..875c8de 100644 --- a/net/mac80211/mlme.c +++ b/net/mac80211/mlme.c @@ -2030,7 +2030,8 @@ int ieee80211_mgd_deauth(struct ieee80211_sub_if_data *sdata, continue; if (wk->type != IEEE80211_WORK_DIRECT_PROBE && - wk->type != IEEE80211_WORK_AUTH) + wk->type != IEEE80211_WORK_AUTH && + wk->type != IEEE80211_WORK_ASSOC) continue; if (memcmp(req->bss->bssid, wk->filter_ta, ETH_ALEN)) diff --git a/net/sctp/input.c b/net/sctp/input.c index 2a57018..ea21924 100644 --- a/net/sctp/input.c +++ b/net/sctp/input.c @@ -440,11 +440,25 @@ void sctp_icmp_proto_unreachable(struct sock *sk, { SCTP_DEBUG_PRINTK("%s\n", __func__); - sctp_do_sm(SCTP_EVENT_T_OTHER, - SCTP_ST_OTHER(SCTP_EVENT_ICMP_PROTO_UNREACH), - asoc->state, asoc->ep, asoc, t, - GFP_ATOMIC); + if (sock_owned_by_user(sk)) { + if (timer_pending(&t->proto_unreach_timer)) + return; + else { + if (!mod_timer(&t->proto_unreach_timer, + jiffies + (HZ/20))) + sctp_association_hold(asoc); + } + + } else { + if (timer_pending(&t->proto_unreach_timer) && + del_timer(&t->proto_unreach_timer)) + sctp_association_put(asoc); + sctp_do_sm(SCTP_EVENT_T_OTHER, + SCTP_ST_OTHER(SCTP_EVENT_ICMP_PROTO_UNREACH), + asoc->state, asoc->ep, asoc, t, + GFP_ATOMIC); + } } /* Common lookup code for icmp/icmpv6 error handler. */ diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c index d5ae450..eb1f42f 100644 --- a/net/sctp/sm_sideeffect.c +++ b/net/sctp/sm_sideeffect.c @@ -397,6 +397,41 @@ out_unlock: sctp_transport_put(transport); } +/* Handle the timeout of the ICMP protocol unreachable timer. Trigger + * the correct state machine transition that will close the association. + */ +void sctp_generate_proto_unreach_event(unsigned long data) +{ + struct sctp_transport *transport = (struct sctp_transport *) data; + struct sctp_association *asoc = transport->asoc; + + sctp_bh_lock_sock(asoc->base.sk); + if (sock_owned_by_user(asoc->base.sk)) { + SCTP_DEBUG_PRINTK("%s:Sock is busy.\n", __func__); + + /* Try again later. */ + if (!mod_timer(&transport->proto_unreach_timer, + jiffies + (HZ/20))) + sctp_association_hold(asoc); + goto out_unlock; + } + + /* Is this structure just waiting around for us to actually + * get destroyed? + */ + if (asoc->base.dead) + goto out_unlock; + + sctp_do_sm(SCTP_EVENT_T_OTHER, + SCTP_ST_OTHER(SCTP_EVENT_ICMP_PROTO_UNREACH), + asoc->state, asoc->ep, asoc, transport, GFP_ATOMIC); + +out_unlock: + sctp_bh_unlock_sock(asoc->base.sk); + sctp_association_put(asoc); +} + + /* Inject a SACK Timeout event into the state machine. */ static void sctp_generate_sack_event(unsigned long data) { diff --git a/net/sctp/transport.c b/net/sctp/transport.c index be4d63d..165d54e 100644 --- a/net/sctp/transport.c +++ b/net/sctp/transport.c @@ -108,6 +108,8 @@ static struct sctp_transport *sctp_transport_init(struct sctp_transport *peer, (unsigned long)peer); setup_timer(&peer->hb_timer, sctp_generate_heartbeat_event, (unsigned long)peer); + setup_timer(&peer->proto_unreach_timer, + sctp_generate_proto_unreach_event, (unsigned long)peer); /* Initialize the 64-bit random nonce sent with heartbeat. */ get_random_bytes(&peer->hb_nonce, sizeof(peer->hb_nonce)); @@ -171,6 +173,10 @@ void sctp_transport_free(struct sctp_transport *transport) del_timer(&transport->T3_rtx_timer)) sctp_transport_put(transport); + /* Delete the ICMP proto unreachable timer if it's active. */ + if (timer_pending(&transport->proto_unreach_timer) && + del_timer(&transport->proto_unreach_timer)) + sctp_association_put(transport->asoc); sctp_transport_put(transport); } diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib index f9bdf26..cbcd654 100644 --- a/scripts/Makefile.lib +++ b/scripts/Makefile.lib @@ -245,3 +245,7 @@ quiet_cmd_lzo = LZO $@ cmd_lzo = (cat $(filter-out FORCE,$^) | \ lzop -9 && $(call size_append, $(filter-out FORCE,$^))) > $@ || \ (rm -f $@ ; false) + +# misc stuff +# --------------------------------------------------------------------------- +quote:=" diff --git a/scripts/mod/file2alias.c b/scripts/mod/file2alias.c index 220213e..df90f31 100644 --- a/scripts/mod/file2alias.c +++ b/scripts/mod/file2alias.c @@ -796,6 +796,16 @@ static int do_platform_entry(const char *filename, return 1; } +/* Looks like: zorro:iN. */ +static int do_zorro_entry(const char *filename, struct zorro_device_id *id, + char *alias) +{ + id->id = TO_NATIVE(id->id); + strcpy(alias, "zorro:"); + ADD(alias, "i", id->id != ZORRO_WILDCARD, id->id); + return 1; +} + /* Ignore any prefix, eg. some architectures prepend _ */ static inline int sym_is(const char *symbol, const char *name) { @@ -943,6 +953,10 @@ void handle_moddevtable(struct module *mod, struct elf_info *info, do_table(symval, sym->st_size, sizeof(struct platform_device_id), "platform", do_platform_entry, mod); + else if (sym_is(symname, "__mod_zorro_device_table")) + do_table(symval, sym->st_size, + sizeof(struct zorro_device_id), "zorro", + do_zorro_entry, mod); free(zeros); } diff --git a/security/min_addr.c b/security/min_addr.c index e86f297..f728728 100644 --- a/security/min_addr.c +++ b/security/min_addr.c @@ -33,7 +33,7 @@ int mmap_min_addr_handler(struct ctl_table *table, int write, { int ret; - if (!capable(CAP_SYS_RAWIO)) + if (write && !capable(CAP_SYS_RAWIO)) return -EPERM; ret = proc_doulongvec_minmax(table, write, buffer, lenp, ppos); diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c index 8728876..20b5982 100644 --- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -36,6 +36,9 @@ #include <sound/timer.h> #include <sound/minors.h> #include <asm/io.h> +#if defined(CONFIG_MIPS) && defined(CONFIG_DMA_NONCOHERENT) +#include <dma-coherence.h> +#endif /* * Compatibility @@ -3184,6 +3187,10 @@ static int snd_pcm_default_mmap(struct snd_pcm_substream *substream, substream->runtime->dma_area, substream->runtime->dma_addr, area->vm_end - area->vm_start); +#elif defined(CONFIG_MIPS) && defined(CONFIG_DMA_NONCOHERENT) + if (substream->dma_buffer.dev.type == SNDRV_DMA_TYPE_DEV && + !plat_device_is_coherent(substream->dma_buffer.dev.dev)) + area->vm_page_prot = pgprot_noncached(area->vm_page_prot); #endif /* ARCH_HAS_DMA_MMAP_COHERENT */ /* mmap with fault handler */ area->vm_ops = &snd_pcm_vm_ops_data_fault; diff --git a/sound/drivers/pcsp/pcsp.h b/sound/drivers/pcsp/pcsp.h index 1e12307..4ff6c8c 100644 --- a/sound/drivers/pcsp/pcsp.h +++ b/sound/drivers/pcsp/pcsp.h @@ -16,7 +16,7 @@ #include <asm/i8253.h> #else #include <asm/8253pit.h> -static DEFINE_SPINLOCK(i8253_lock); +static DEFINE_RAW_SPINLOCK(i8253_lock); #endif #define PCSP_SOUND_VERSION 0x400 /* read 4.00 */ diff --git a/sound/drivers/pcsp/pcsp_input.c b/sound/drivers/pcsp/pcsp_input.c index 0444cde..b5e2b54 100644 --- a/sound/drivers/pcsp/pcsp_input.c +++ b/sound/drivers/pcsp/pcsp_input.c @@ -21,7 +21,7 @@ static void pcspkr_do_sound(unsigned int count) { unsigned long flags; - spin_lock_irqsave(&i8253_lock, flags); + raw_spin_lock_irqsave(&i8253_lock, flags); if (count) { /* set command for counter 2, 2 byte write */ @@ -36,7 +36,7 @@ static void pcspkr_do_sound(unsigned int count) outb(inb_p(0x61) & 0xFC, 0x61); } - spin_unlock_irqrestore(&i8253_lock, flags); + raw_spin_unlock_irqrestore(&i8253_lock, flags); } void pcspkr_stop_sound(void) diff --git a/sound/drivers/pcsp/pcsp_lib.c b/sound/drivers/pcsp/pcsp_lib.c index d77ffa9..ce9e7d1 100644 --- a/sound/drivers/pcsp/pcsp_lib.c +++ b/sound/drivers/pcsp/pcsp_lib.c @@ -66,7 +66,7 @@ static u64 pcsp_timer_update(struct snd_pcsp *chip) timer_cnt = val * CUR_DIV() / 256; if (timer_cnt && chip->enable) { - spin_lock_irqsave(&i8253_lock, flags); + raw_spin_lock_irqsave(&i8253_lock, flags); if (!nforce_wa) { outb_p(chip->val61, 0x61); outb_p(timer_cnt, 0x42); @@ -75,7 +75,7 @@ static u64 pcsp_timer_update(struct snd_pcsp *chip) outb(chip->val61 ^ 2, 0x61); chip->thalf = 1; } - spin_unlock_irqrestore(&i8253_lock, flags); + raw_spin_unlock_irqrestore(&i8253_lock, flags); } chip->ns_rem = PCSP_PERIOD_NS(); @@ -159,10 +159,10 @@ static int pcsp_start_playing(struct snd_pcsp *chip) return -EIO; } - spin_lock(&i8253_lock); + raw_spin_lock(&i8253_lock); chip->val61 = inb(0x61) | 0x03; outb_p(0x92, 0x43); /* binary, mode 1, LSB only, ch 2 */ - spin_unlock(&i8253_lock); + raw_spin_unlock(&i8253_lock); atomic_set(&chip->timer_active, 1); chip->thalf = 0; @@ -179,11 +179,11 @@ static void pcsp_stop_playing(struct snd_pcsp *chip) return; atomic_set(&chip->timer_active, 0); - spin_lock(&i8253_lock); + raw_spin_lock(&i8253_lock); /* restore the timer */ outb_p(0xb6, 0x43); /* binary, mode 3, LSB/MSB, ch 2 */ outb(chip->val61 & 0xFC, 0x61); - spin_unlock(&i8253_lock); + raw_spin_unlock(&i8253_lock); } /* diff --git a/sound/oss/dmasound/dmasound_paula.c b/sound/oss/dmasound/dmasound_paula.c index bb14e4c..87910e9 100644 --- a/sound/oss/dmasound/dmasound_paula.c +++ b/sound/oss/dmasound/dmasound_paula.c @@ -21,6 +21,7 @@ #include <linux/ioport.h> #include <linux/soundcard.h> #include <linux/interrupt.h> +#include <linux/platform_device.h> #include <asm/uaccess.h> #include <asm/setup.h> @@ -710,31 +711,41 @@ static MACHINE machAmiga = { /*** Config & Setup **********************************************************/ -static int __init dmasound_paula_init(void) +static int __init amiga_audio_probe(struct platform_device *pdev) { - int err; - - if (MACH_IS_AMIGA && AMIGAHW_PRESENT(AMI_AUDIO)) { - if (!request_mem_region(CUSTOM_PHYSADDR+0xa0, 0x40, - "dmasound [Paula]")) - return -EBUSY; - dmasound.mach = machAmiga; - dmasound.mach.default_hard = def_hard ; - dmasound.mach.default_soft = def_soft ; - err = dmasound_init(); - if (err) - release_mem_region(CUSTOM_PHYSADDR+0xa0, 0x40); - return err; - } else - return -ENODEV; + dmasound.mach = machAmiga; + dmasound.mach.default_hard = def_hard ; + dmasound.mach.default_soft = def_soft ; + return dmasound_init(); } -static void __exit dmasound_paula_cleanup(void) +static int __exit amiga_audio_remove(struct platform_device *pdev) { dmasound_deinit(); - release_mem_region(CUSTOM_PHYSADDR+0xa0, 0x40); + return 0; +} + +static struct platform_driver amiga_audio_driver = { + .remove = __exit_p(amiga_audio_remove), + .driver = { + .name = "amiga-audio", + .owner = THIS_MODULE, + }, +}; + +static int __init amiga_audio_init(void) +{ + return platform_driver_probe(&amiga_audio_driver, amiga_audio_probe); } -module_init(dmasound_paula_init); -module_exit(dmasound_paula_cleanup); +module_init(amiga_audio_init); + +static void __exit amiga_audio_exit(void) +{ + platform_driver_unregister(&amiga_audio_driver); +} + +module_exit(amiga_audio_exit); + MODULE_LICENSE("GPL"); +MODULE_ALIAS("platform:amiga-audio"); diff --git a/sound/pci/hda/patch_conexant.c b/sound/pci/hda/patch_conexant.c index 56e5207..feabb44 100644 --- a/sound/pci/hda/patch_conexant.c +++ b/sound/pci/hda/patch_conexant.c @@ -1197,9 +1197,10 @@ static int patch_cxt5045(struct hda_codec *codec) case 0x103c: case 0x1631: case 0x1734: - /* HP, Packard Bell, & Fujitsu-Siemens laptops have really bad - * sound over 0dB on NID 0x17. Fix max PCM level to 0 dB - * (originally it has 0x2b steps with 0dB offset 0x14) + case 0x17aa: + /* HP, Packard Bell, Fujitsu-Siemens & Lenovo laptops have + * really bad sound over 0dB on NID 0x17. Fix max PCM level to + * 0 dB (originally it has 0x2b steps with 0dB offset 0x14) */ snd_hda_override_amp_caps(codec, 0x17, HDA_INPUT, (0x14 << AC_AMPCAP_OFFSET_SHIFT) | @@ -2846,6 +2847,7 @@ static struct snd_pci_quirk cxt5066_cfg_tbl[] = { SND_PCI_QUIRK(0x1028, 0x0408, "Dell Inspiron One 19T", CXT5066_IDEAPAD), SND_PCI_QUIRK(0x1179, 0xff50, "Toshiba Satellite P500-PSPGSC-01800T", CXT5066_OLPC_XO_1_5), SND_PCI_QUIRK(0x1179, 0xffe0, "Toshiba Satellite Pro T130-15F", CXT5066_OLPC_XO_1_5), + SND_PCI_QUIRK(0x17aa, 0x21b2, "Thinkpad X100e", CXT5066_IDEAPAD), SND_PCI_QUIRK(0x17aa, 0x3a0d, "ideapad", CXT5066_IDEAPAD), {} }; diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index 7404dba..886d8e4 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -17871,7 +17871,6 @@ static struct snd_pci_quirk alc662_cfg_tbl[] = { ALC662_3ST_6ch_DIG), SND_PCI_QUIRK_MASK(0x1854, 0xf000, 0x2000, "ASUS H13-200x", ALC663_ASUS_H13), - SND_PCI_QUIRK(0x8086, 0xd604, "Intel mobo", ALC662_3ST_2ch_DIG), {} }; diff --git a/sound/pci/hda/patch_sigmatel.c b/sound/pci/hda/patch_sigmatel.c index 7fb7d01..a0e06d8 100644 --- a/sound/pci/hda/patch_sigmatel.c +++ b/sound/pci/hda/patch_sigmatel.c @@ -104,6 +104,7 @@ enum { STAC_DELL_M4_2, STAC_DELL_M4_3, STAC_HP_M4, + STAC_HP_DV4, STAC_HP_DV5, STAC_HP_HDX, STAC_HP_DV4_1222NR, @@ -1544,11 +1545,9 @@ static unsigned int alienware_m17x_pin_configs[13] = { 0x904601b0, }; -static unsigned int intel_dg45id_pin_configs[14] = { +static unsigned int intel_dg45id_pin_configs[13] = { 0x02214230, 0x02A19240, 0x01013214, 0x01014210, - 0x01A19250, 0x01011212, 0x01016211, 0x40f000f0, - 0x40f000f0, 0x40f000f0, 0x40f000f0, 0x014510A0, - 0x074510B0, 0x40f000f0 + 0x01A19250, 0x01011212, 0x01016211 }; static unsigned int *stac92hd73xx_brd_tbl[STAC_92HD73XX_MODELS] = { @@ -1693,6 +1692,7 @@ static unsigned int *stac92hd71bxx_brd_tbl[STAC_92HD71BXX_MODELS] = { [STAC_DELL_M4_2] = dell_m4_2_pin_configs, [STAC_DELL_M4_3] = dell_m4_3_pin_configs, [STAC_HP_M4] = NULL, + [STAC_HP_DV4] = NULL, [STAC_HP_DV5] = NULL, [STAC_HP_HDX] = NULL, [STAC_HP_DV4_1222NR] = NULL, @@ -1705,6 +1705,7 @@ static const char *stac92hd71bxx_models[STAC_92HD71BXX_MODELS] = { [STAC_DELL_M4_2] = "dell-m4-2", [STAC_DELL_M4_3] = "dell-m4-3", [STAC_HP_M4] = "hp-m4", + [STAC_HP_DV4] = "hp-dv4", [STAC_HP_DV5] = "hp-dv5", [STAC_HP_HDX] = "hp-hdx", [STAC_HP_DV4_1222NR] = "hp-dv4-1222nr", @@ -1723,7 +1724,7 @@ static struct snd_pci_quirk stac92hd71bxx_cfg_tbl[] = { SND_PCI_QUIRK_MASK(PCI_VENDOR_ID_HP, 0xfff0, 0x3080, "HP", STAC_HP_DV5), SND_PCI_QUIRK_MASK(PCI_VENDOR_ID_HP, 0xfff0, 0x30f0, - "HP dv4-7", STAC_HP_DV5), + "HP dv4-7", STAC_HP_DV4), SND_PCI_QUIRK_MASK(PCI_VENDOR_ID_HP, 0xfff0, 0x3600, "HP dv4-7", STAC_HP_DV5), SND_PCI_QUIRK(PCI_VENDOR_ID_HP, 0x3610, @@ -4768,6 +4769,9 @@ static void set_hp_led_gpio(struct hda_codec *codec) struct sigmatel_spec *spec = codec->spec; unsigned int gpio; + if (spec->gpio_led) + return; + gpio = snd_hda_param_read(codec, codec->afg, AC_PAR_GPIO_CAP); gpio &= AC_GPIO_IO_COUNT; if (gpio > 3) @@ -5677,6 +5681,9 @@ again: spec->num_smuxes = 1; spec->num_dmuxes = 1; /* fallthrough */ + case STAC_HP_DV4: + spec->gpio_led = 0x01; + /* fallthrough */ case STAC_HP_DV5: snd_hda_codec_set_pincfg(codec, 0x0d, 0x90170010); stac92xx_auto_set_pinctl(codec, 0x0d, AC_PINCTL_OUT_EN); @@ -5690,6 +5697,7 @@ again: spec->num_dmics = 1; spec->num_dmuxes = 1; spec->num_smuxes = 1; + spec->gpio_led = 0x08; break; } @@ -5746,7 +5754,8 @@ again: } /* enable bass on HP dv7 */ - if (spec->board_config == STAC_HP_DV5) { + if (spec->board_config == STAC_HP_DV4 || + spec->board_config == STAC_HP_DV5) { unsigned int cap; cap = snd_hda_param_read(codec, 0x1, AC_PAR_GPIO_CAP); cap &= AC_GPIO_IO_COUNT; diff --git a/sound/pci/ice1712/maya44.c b/sound/pci/ice1712/maya44.c index 3e1c20a..726fd4b 100644 --- a/sound/pci/ice1712/maya44.c +++ b/sound/pci/ice1712/maya44.c @@ -347,7 +347,7 @@ static int maya_gpio_sw_put(struct snd_kcontrol *kcontrol, /* known working input slots (0-4) */ #define MAYA_LINE_IN 1 /* in-2 */ -#define MAYA_MIC_IN 4 /* in-5 */ +#define MAYA_MIC_IN 3 /* in-4 */ static void wm8776_select_input(struct snd_maya44 *chip, int idx, int line) { @@ -393,8 +393,8 @@ static int maya_rec_src_put(struct snd_kcontrol *kcontrol, int changed; mutex_lock(&chip->mutex); - changed = maya_set_gpio_bits(chip->ice, GPIO_MIC_RELAY, - sel ? GPIO_MIC_RELAY : 0); + changed = maya_set_gpio_bits(chip->ice, 1 << GPIO_MIC_RELAY, + sel ? (1 << GPIO_MIC_RELAY) : 0); wm8776_select_input(chip, 0, sel ? MAYA_MIC_IN : MAYA_LINE_IN); mutex_unlock(&chip->mutex); return changed; diff --git a/sound/pci/oxygen/xonar_cs43xx.c b/sound/pci/oxygen/xonar_cs43xx.c index 16c226b..7c4986b 100644 --- a/sound/pci/oxygen/xonar_cs43xx.c +++ b/sound/pci/oxygen/xonar_cs43xx.c @@ -56,6 +56,7 @@ #include <sound/pcm_params.h> #include <sound/tlv.h> #include "xonar.h" +#include "cm9780.h" #include "cs4398.h" #include "cs4362a.h" @@ -172,6 +173,8 @@ static void xonar_d1_init(struct oxygen *chip) oxygen_clear_bits16(chip, OXYGEN_GPIO_DATA, GPIO_D1_FRONT_PANEL | GPIO_D1_INPUT_ROUTE); + oxygen_ac97_set_bits(chip, 0, CM9780_JACK, CM9780_FMIC2MIC); + xonar_init_cs53x1(chip); xonar_enable_output(chip); diff --git a/tools/perf/util/trace-event-parse.c b/tools/perf/util/trace-event-parse.c index 069f261..73a0222 100644 --- a/tools/perf/util/trace-event-parse.c +++ b/tools/perf/util/trace-event-parse.c @@ -1937,7 +1937,7 @@ void *raw_field_ptr(struct event *event, const char *name, void *data) if (!field) return NULL; - if (field->flags & FIELD_IS_STRING) { + if (field->flags & FIELD_IS_DYNAMIC) { int offset; offset = *(int *)(data + field->offset); diff --git a/tools/perf/util/trace-event-read.c b/tools/perf/util/trace-event-read.c index cb54cd0..f55cc3a 100644 --- a/tools/perf/util/trace-event-read.c +++ b/tools/perf/util/trace-event-read.c @@ -53,12 +53,6 @@ static unsigned long page_size; static ssize_t calc_data_size; static bool repipe; -/* If it fails, the next read will report it */ -static void skip(int size) -{ - lseek(input_fd, size, SEEK_CUR); -} - static int do_read(int fd, void *buf, int size) { int rsize = size; @@ -98,6 +92,19 @@ static int read_or_die(void *data, int size) return r; } +/* If it fails, the next read will report it */ +static void skip(int size) +{ + char buf[BUFSIZ]; + int r; + + while (size) { + r = size > BUFSIZ ? BUFSIZ : size; + read_or_die(buf, r); + size -= r; + }; +} + static unsigned int read4(void) { unsigned int data; diff --git a/tools/perf/util/trace-event.h b/tools/perf/util/trace-event.h index 406d452..b3e86b1 100644 --- a/tools/perf/util/trace-event.h +++ b/tools/perf/util/trace-event.h @@ -233,7 +233,12 @@ static inline unsigned long long __data2host8(unsigned long long data) #define data2host2(ptr) __data2host2(*(unsigned short *)ptr) #define data2host4(ptr) __data2host4(*(unsigned int *)ptr) -#define data2host8(ptr) __data2host8(*(unsigned long long *)ptr) +#define data2host8(ptr) ({ \ + unsigned long long __val; \ + \ + memcpy(&__val, (ptr), sizeof(unsigned long long)); \ + __data2host8(__val); \ +}) extern int header_page_ts_offset; extern int header_page_ts_size; diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c index 03a5eb2..7c79c1d 100644 --- a/virt/kvm/ioapic.c +++ b/virt/kvm/ioapic.c @@ -197,7 +197,7 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int level) union kvm_ioapic_redirect_entry entry; int ret = 1; - mutex_lock(&ioapic->lock); + spin_lock(&ioapic->lock); if (irq >= 0 && irq < IOAPIC_NUM_PINS) { entry = ioapic->redirtbl[irq]; level ^= entry.fields.polarity; @@ -214,7 +214,7 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int level) } trace_kvm_ioapic_set_irq(entry.bits, irq, ret == 0); } - mutex_unlock(&ioapic->lock); + spin_unlock(&ioapic->lock); return ret; } @@ -238,9 +238,9 @@ static void __kvm_ioapic_update_eoi(struct kvm_ioapic *ioapic, int vector, * is dropped it will be put into irr and will be delivered * after ack notifier returns. */ - mutex_unlock(&ioapic->lock); + spin_unlock(&ioapic->lock); kvm_notify_acked_irq(ioapic->kvm, KVM_IRQCHIP_IOAPIC, i); - mutex_lock(&ioapic->lock); + spin_lock(&ioapic->lock); if (trigger_mode != IOAPIC_LEVEL_TRIG) continue; @@ -259,9 +259,9 @@ void kvm_ioapic_update_eoi(struct kvm *kvm, int vector, int trigger_mode) smp_rmb(); if (!test_bit(vector, ioapic->handled_vectors)) return; - mutex_lock(&ioapic->lock); + spin_lock(&ioapic->lock); __kvm_ioapic_update_eoi(ioapic, vector, trigger_mode); - mutex_unlock(&ioapic->lock); + spin_unlock(&ioapic->lock); } static inline struct kvm_ioapic *to_ioapic(struct kvm_io_device *dev) @@ -287,7 +287,7 @@ static int ioapic_mmio_read(struct kvm_io_device *this, gpa_t addr, int len, ASSERT(!(addr & 0xf)); /* check alignment */ addr &= 0xff; - mutex_lock(&ioapic->lock); + spin_lock(&ioapic->lock); switch (addr) { case IOAPIC_REG_SELECT: result = ioapic->ioregsel; @@ -301,7 +301,7 @@ static int ioapic_mmio_read(struct kvm_io_device *this, gpa_t addr, int len, result = 0; break; } - mutex_unlock(&ioapic->lock); + spin_unlock(&ioapic->lock); switch (len) { case 8: @@ -338,7 +338,7 @@ static int ioapic_mmio_write(struct kvm_io_device *this, gpa_t addr, int len, } addr &= 0xff; - mutex_lock(&ioapic->lock); + spin_lock(&ioapic->lock); switch (addr) { case IOAPIC_REG_SELECT: ioapic->ioregsel = data; @@ -356,7 +356,7 @@ static int ioapic_mmio_write(struct kvm_io_device *this, gpa_t addr, int len, default: break; } - mutex_unlock(&ioapic->lock); + spin_unlock(&ioapic->lock); return 0; } @@ -386,7 +386,7 @@ int kvm_ioapic_init(struct kvm *kvm) ioapic = kzalloc(sizeof(struct kvm_ioapic), GFP_KERNEL); if (!ioapic) return -ENOMEM; - mutex_init(&ioapic->lock); + spin_lock_init(&ioapic->lock); kvm->arch.vioapic = ioapic; kvm_ioapic_reset(ioapic); kvm_iodevice_init(&ioapic->dev, &ioapic_mmio_ops); @@ -419,9 +419,9 @@ int kvm_get_ioapic(struct kvm *kvm, struct kvm_ioapic_state *state) if (!ioapic) return -EINVAL; - mutex_lock(&ioapic->lock); + spin_lock(&ioapic->lock); memcpy(state, ioapic, sizeof(struct kvm_ioapic_state)); - mutex_unlock(&ioapic->lock); + spin_unlock(&ioapic->lock); return 0; } @@ -431,9 +431,9 @@ int kvm_set_ioapic(struct kvm *kvm, struct kvm_ioapic_state *state) if (!ioapic) return -EINVAL; - mutex_lock(&ioapic->lock); + spin_lock(&ioapic->lock); memcpy(ioapic, state, sizeof(struct kvm_ioapic_state)); update_handled_vectors(ioapic); - mutex_unlock(&ioapic->lock); + spin_unlock(&ioapic->lock); return 0; } diff --git a/virt/kvm/ioapic.h b/virt/kvm/ioapic.h index 8a751b7..0b190c3 100644 --- a/virt/kvm/ioapic.h +++ b/virt/kvm/ioapic.h @@ -45,7 +45,7 @@ struct kvm_ioapic { struct kvm_io_device dev; struct kvm *kvm; void (*ack_notifier)(void *opaque, int irq); - struct mutex lock; + spinlock_t lock; DECLARE_BITMAP(handled_vectors, 256); }; diff --git a/virt/kvm/iommu.c b/virt/kvm/iommu.c index 80fd3ad..11692b9 100644 --- a/virt/kvm/iommu.c +++ b/virt/kvm/iommu.c @@ -32,12 +32,30 @@ static int kvm_iommu_unmap_memslots(struct kvm *kvm); static void kvm_iommu_put_pages(struct kvm *kvm, gfn_t base_gfn, unsigned long npages); +static pfn_t kvm_pin_pages(struct kvm *kvm, struct kvm_memory_slot *slot, + gfn_t gfn, unsigned long size) +{ + gfn_t end_gfn; + pfn_t pfn; + + pfn = gfn_to_pfn_memslot(kvm, slot, gfn); + end_gfn = gfn + (size >> PAGE_SHIFT); + gfn += 1; + + if (is_error_pfn(pfn)) + return pfn; + + while (gfn < end_gfn) + gfn_to_pfn_memslot(kvm, slot, gfn++); + + return pfn; +} + int kvm_iommu_map_pages(struct kvm *kvm, struct kvm_memory_slot *slot) { - gfn_t gfn = slot->base_gfn; - unsigned long npages = slot->npages; + gfn_t gfn, end_gfn; pfn_t pfn; - int i, r = 0; + int r = 0; struct iommu_domain *domain = kvm->arch.iommu_domain; int flags; @@ -45,31 +63,62 @@ int kvm_iommu_map_pages(struct kvm *kvm, struct kvm_memory_slot *slot) if (!domain) return 0; + gfn = slot->base_gfn; + end_gfn = gfn + slot->npages; + flags = IOMMU_READ | IOMMU_WRITE; if (kvm->arch.iommu_flags & KVM_IOMMU_CACHE_COHERENCY) flags |= IOMMU_CACHE; - for (i = 0; i < npages; i++) { - /* check if already mapped */ - if (iommu_iova_to_phys(domain, gfn_to_gpa(gfn))) + + while (gfn < end_gfn) { + unsigned long page_size; + + /* Check if already mapped */ + if (iommu_iova_to_phys(domain, gfn_to_gpa(gfn))) { + gfn += 1; + continue; + } + + /* Get the page size we could use to map */ + page_size = kvm_host_page_size(kvm, gfn); + + /* Make sure the page_size does not exceed the memslot */ + while ((gfn + (page_size >> PAGE_SHIFT)) > end_gfn) + page_size >>= 1; + + /* Make sure gfn is aligned to the page size we want to map */ + while ((gfn << PAGE_SHIFT) & (page_size - 1)) + page_size >>= 1; + + /* + * Pin all pages we are about to map in memory. This is + * important because we unmap and unpin in 4kb steps later. + */ + pfn = kvm_pin_pages(kvm, slot, gfn, page_size); + if (is_error_pfn(pfn)) { + gfn += 1; continue; + } - pfn = gfn_to_pfn_memslot(kvm, slot, gfn); - r = iommu_map_range(domain, - gfn_to_gpa(gfn), - pfn_to_hpa(pfn), - PAGE_SIZE, flags); + /* Map into IO address space */ + r = iommu_map(domain, gfn_to_gpa(gfn), pfn_to_hpa(pfn), + get_order(page_size), flags); if (r) { printk(KERN_ERR "kvm_iommu_map_address:" "iommu failed to map pfn=%lx\n", pfn); goto unmap_pages; } - gfn++; + + gfn += page_size >> PAGE_SHIFT; + + } + return 0; unmap_pages: - kvm_iommu_put_pages(kvm, slot->base_gfn, i); + kvm_iommu_put_pages(kvm, slot->base_gfn, gfn); return r; } @@ -189,27 +238,47 @@ out_unmap: return r; } +static void kvm_unpin_pages(struct kvm *kvm, pfn_t pfn, unsigned long npages) +{ + unsigned long i; + + for (i = 0; i < npages; ++i) + kvm_release_pfn_clean(pfn + i); +} + static void kvm_iommu_put_pages(struct kvm *kvm, gfn_t base_gfn, unsigned long npages) { - gfn_t gfn = base_gfn; + struct iommu_domain *domain; + gfn_t end_gfn, gfn; pfn_t pfn; - struct iommu_domain *domain = kvm->arch.iommu_domain; - unsigned long i; u64 phys; + domain = kvm->arch.iommu_domain; + end_gfn = base_gfn + npages; + gfn = base_gfn; + /* check if iommu exists and in use */ if (!domain) return; - for (i = 0; i < npages; i++) { + while (gfn < end_gfn) { + unsigned long unmap_pages; + int order; + + /* Get physical address */ phys = iommu_iova_to_phys(domain, gfn_to_gpa(gfn)); - pfn = phys >> PAGE_SHIFT; - kvm_release_pfn_clean(pfn); - gfn++; - } + pfn = phys >> PAGE_SHIFT; + + /* Unmap address from IO address space */ + order = iommu_unmap(domain, gfn_to_gpa(gfn), PAGE_SIZE); + unmap_pages = 1ULL << order; - iommu_unmap_range(domain, gfn_to_gpa(base_gfn), PAGE_SIZE * npages); + /* Unpin all pages we just unmapped to not leak any memory */ + kvm_unpin_pages(kvm, pfn, unmap_pages); + + gfn += unmap_pages; + } } static int kvm_iommu_unmap_memslots(struct kvm *kvm) |