Commit message    Author    Age    Files    Lines
* Merge branch 'pm-cpuidle'    Rafael J. Wysocki    2015-06-22    2 files    -0/+23
    * pm-cpuidle:
        cpuidle: powernv/pseries: Auto-promotion of snooze to deeper idle state
* cpuidle: powernv/pseries: Auto-promotion of snooze to deeper idle state    Shilpasri G Bhat    2015-06-22    2 files    -0/+23
    Idle CPUs that stay in snooze for a long period can degrade the
    performance of their sibling CPUs. If a CPU stays in snooze for longer
    than the target residency of the next available idle state, exit
    snooze. This gives the cpuidle governor a chance to re-evaluate the
    CPU's last idle state and promote it to a deeper idle state.

    Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
    Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
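    A minimal illustrative sketch of the promotion check described above.
    get_tb(), tb_ticks_per_usec and the HMT_*() thread-priority helpers are
    existing powerpc primitives; the surrounding structure is assumed and is
    not the literal patch:

        static int snooze_loop(struct cpuidle_device *dev,
                               struct cpuidle_driver *drv, int index)
        {
                u64 snooze_exit_time;

                /* target_residency of the next state is in microseconds */
                snooze_exit_time = get_tb() +
                        drv->states[index + 1].target_residency * tb_ticks_per_usec;

                local_irq_enable();
                while (!need_resched()) {
                        HMT_low();
                        HMT_very_low();
                        /*
                         * Spun longer than the next state's target residency:
                         * exit snooze so the governor can pick a deeper state.
                         */
                        if (get_tb() > snooze_exit_time)
                                break;
                }
                HMT_medium();
                return index;
        }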
* Merge branch 'pm-sleep'    Rafael J. Wysocki    2015-06-22    1 file    -2/+4
    * pm-sleep:
        x86: Load __USER_DS into DS/ES after resume
* x86: Load __USER_DS into DS/ES after resume    Ingo Molnar    2015-06-22    1 file    -2/+4
    Srinivas Pandruvada reported a problem with system resume from
    suspend-to-RAM on 32-bit x86 systems, where the DS register of the CPU
    is set to __KERNEL_DS instead of __USER_DS on return to user space,
    which causes a General Protection Fault.

    The issue is that DS is set to __KERNEL_DS by the ACPI resume code
    path, while the SYSEXIT path never reloads DS/ES: it assumes they
    still hold __USER_DS as set at SYSENTER time (per Brian Gerst). So if
    the return to user space happens to go through SYSEXIT, it leads to
    the reported GPF.

    Fix the problem by setting the DS and ES registers to __USER_DS, as
    expected by the SYSEXIT path.

    Link: https://bugzilla.kernel.org/show_bug.cgi?id=61781
    Link: http://marc.info/?l=linux-pm&m=143406648920385&w=2
    Acked-by: Pavel Machek <pavel@ucw.cz>
    Tested-by: Pavel Machek <pavel@ucw.cz>
    Acked-by: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* Merge branch 'pm-opp'    Rafael J. Wysocki    2015-06-22    1 file    -4/+444
    * pm-opp:
        PM / OPP: Add binding for 'opp-suspend'
        PM / OPP: Allow multiple OPP tables to be passed via DT
        PM / OPP: Add new bindings to address shortcomings of existing bindings
* PM / OPP: Add binding for 'opp-suspend'    Viresh Kumar    2015-06-22    1 file    -0/+7
    On a few platforms, for power efficiency, we want the device to be
    configured for a specific OPP while the device is in a suspend state.
    Add an optional property to the operating-points-v2 bindings for that.

    Suggested-by: Nishanth Menon <nm@ti.com>
    Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
    Acked-by: Nishanth Menon <nm@ti.com>
    Acked-by: Rob Herring <robh@kernel.org>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* PM / OPP: Allow multiple OPP tables to be passed via DT    Viresh Kumar    2015-06-22    1 file    -0/+60
    On some platforms (like Qualcomm's SoCs), which OPPs to use is not
    decided until runtime. The OPP tables can be fixed at compile time,
    but which table to use is found out only after reading some efuses (a
    sort of PROM) and learning the characteristics of the SoC. To support
    such platforms, we need to pass multiple OPP tables per device, and
    the hardware should be able to choose one, and only one, table out of
    those.

    Update the operating-points-v2 bindings to support that.

    Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
    Acked-by: Rob Herring <robh@kernel.org>
    Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* PM / OPP: Add new bindings to address shortcomings of existing bindings    Viresh Kumar    2015-06-22    1 file    -4/+377
    The current OPP (Operating Performance Point) device tree bindings
    have proven insufficient due to the inflexible nature of the original
    bindings. Over time, we have realized that Operating Performance
    Point definitions and usage vary depending on the SoC, and that a
    "single size (just frequency, voltage) fits all" model, which the
    original bindings attempted, has failed.

    The proposed next generation of the bindings addresses this by
    providing an expandable binding for OPPs that covers the following
    common shortcomings seen with the original bindings:

    - Describing clock/voltage/current rail sharing between CPUs: shared
      by all cores vs. independent clock per core vs. shared clock per
      cluster.
    - Support for specifying current levels along with voltages.
    - Support for multiple regulators.
    - Support for turbo modes.
    - Other per-OPP settings: transition latencies, disabled status, etc.
    - Expandability of OPPs in the future.

    This patch introduces the new "operating-points-v2" bindings to solve
    these problems. Refer to the bindings for more details. We now have
    multiple versions of the OPP binding, and only one of them should be
    used per device.

    Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
    Reviewed-by: Rob Herring <robh@kernel.org>
    Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
    Acked-by: Nishanth Menon <nm@ti.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* Merge branches 'pnp' and 'pm-tools'    Rafael J. Wysocki    2015-06-19    2 files    -7/+8
    * pnp:
        PNP / ACPI: use unsigned int in pnpacpi_encode_resources()
        PNP / ACPI: use u8 instead of int in acpi_resource_extended_irq context
    * pm-tools:
        cpupower: mperf monitor: fix output in MAX_FREQ_SYSFS mode
* cpupower: mperf monitor: fix output in MAX_FREQ_SYSFS mode    Herton R. Krzesinski    2015-05-30    1 file    -2/+3
    There is clearly wrong output when the mperf monitor runs in
    MAX_FREQ_SYSFS mode: the average frequency shows in kHz units (despite
    the intended output being in MHz), and the percentages for C-state
    information are all wrong (including high/negative values).

    The problem is that the max_frequency read on initialization isn't
    used where it should have been, in mperf_get_count_percent() (to
    estimate the number of ticks in the given time period), and the value
    read from sysfs is in kHz, so it must be divided to get the MHz value
    used in the current calculations.

    While at it, I also fixed another small issue in the debug output of
    the max_frequency value in mperf_get_count_freq().

    Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
    Acked-by: Thomas Renninger <trenn@suse.de>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
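    For illustration, the unit handling the fix describes, as standalone
    arithmetic with made-up values: the sysfs value arrives in kHz and has
    to be scaled to MHz before estimating how many reference ticks fit into
    the measured interval:

        #include <stdio.h>

        int main(void)
        {
                unsigned long max_freq_khz = 2400000;       /* cpuinfo_max_freq from sysfs */
                unsigned long max_freq_mhz = max_freq_khz / 1000;

                unsigned long long time_diff_us = 100000;   /* measured interval */
                unsigned long long mperf_diff = 180000000;  /* MPERF delta, made up */

                /* ticks the CPU would accumulate if 100% busy at max frequency */
                unsigned long long max_ticks = max_freq_mhz * time_diff_us;

                printf("C0 percent: %.2f\n", 100.0 * mperf_diff / max_ticks);
                return 0;
        }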
* PNP / ACPI: use unsigned int in pnpacpi_encode_resources()    Fabian Frederick    2015-05-05    1 file    -1/+1
    Use unsigned int for the port, irq, dma and mem variables passed to
    pnp_get_resource(). This fixes gcc warnings of the type "conversion to
    unsigned int from int may change the sign of the result".

    Signed-off-by: Fabian Frederick <fabf@skynet.be>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* PNP / ACPI: use u8 instead of int in acpi_resource_extended_irq context    Fabian Frederick    2015-05-05    1 file    -4/+4
    The acpi_resource_extended_irq variables are all u8. Use that type for
    triggering, polarity and shareable. This fixes gcc warnings of the
    type "conversion to u8 from int may alter its value".

    Signed-off-by: Fabian Frederick <fabf@skynet.be>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* Merge branches 'pm-clk', 'pm-domains' and 'powercap'    Rafael J. Wysocki    2015-06-19    8 files    -170/+125
    * pm-clk:
        PM / clk: Print acquired clock name in addition to con_id
        PM / clk: Fix clock error check in __pm_clk_add()
        drivers: sh: remove boilerplate code and use USE_PM_CLK_RUNTIME_OPS
        arm: davinci: remove boilerplate code and use USE_PM_CLK_RUNTIME_OPS
        arm: omap1: remove boilerplate code and use USE_PM_CLK_RUNTIME_OPS
        arm: keystone: remove boilerplate code and use USE_PM_CLK_RUNTIME_OPS
        PM / clock_ops: Provide default runtime ops to users
    * pm-domains:
        PM / Domains: Skip timings during syscore suspend/resume
    * powercap:
        powercap / RAPL: Support Knights Landing
        powercap / RAPL: Floor frequency setting in Atom SoC
* powercap / RAPL: Support Knights Landing    Dasaratharaman Chandramouli    2015-05-21    1 file    -0/+1
    This patch enables the intel_rapl driver to run on the next-generation
    Intel(R) Xeon Phi microarchitecture code named "Knights Landing".

    Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
    Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* powercap / RAPL: Floor frequency setting in Atom SoC    Ajay Thomas    2015-05-05    1 file    -9/+41
    The CPU floor frequency is set in the BIOS for newer Atom SoCs. This
    patch handles configuration of the floor frequency appropriately for
    the different variants of Atom SoCs and ensures that the floor
    frequency is not configured from the driver for these newer Atom SoCs.
    Since the address of the register for configuring the floor frequency
    might change between Atom SoCs, this patch also prevents potential
    overwriting of wrong registers.

    Reviewed-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
    Signed-off-by: Ajay Thomas <ajay.thomas.david.rajamanickam@intel.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* PM / Domains: Skip timings during syscore suspend/resume    Geert Uytterhoeven    2015-06-15    1 file    -16/+26
    The PM Domain code uses ktime_get() to perform various latency
    measurements. However, if ktime_get() is called while timekeeping is
    suspended, the following warning is printed:

        WARNING: CPU: 0 PID: 1340 at kernel/time/timekeeping.c:576 ktime_get+0x3

    This happens when resuming the PM Domain that contains the clock
    events source, which calls pm_genpd_syscore_poweron(). The chain of
    operations is:

        timekeeping_resume()
        {
            clockevents_resume()
              sh_cmt_clock_event_resume()
                pm_genpd_syscore_poweron()
                  pm_genpd_sync_poweron()
                    genpd_syscore_switch()
                      genpd_power_on()
                        ktime_get(), but timekeeping_suspended == 1
            ...
            timekeeping_suspended = 0;
        }

    Fix this by adding a "timed" parameter to genpd_power_{on,off}() and
    pm_genpd_sync_power{off,on}(), to indicate whether latency
    measurements are allowed. This parameter is passed as false in
    genpd_syscore_switch() (i.e. during syscore suspend/resume), and true
    in all other cases.

    Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
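    An illustrative sketch of the resulting "timed" parameter, assuming the
    genpd field names (power_on callback, power_on_latency_ns); the real
    patch threads the flag through more call sites:

        static int genpd_power_on(struct generic_pm_domain *genpd, bool timed)
        {
                ktime_t time_start;
                s64 elapsed_ns;
                int ret;

                if (!timed)
                        return genpd->power_on(genpd);  /* no ktime_get() on this path */

                time_start = ktime_get();
                ret = genpd->power_on(genpd);
                if (ret)
                        return ret;

                elapsed_ns = ktime_to_ns(ktime_sub(ktime_get(), time_start));
                if (elapsed_ns > genpd->power_on_latency_ns)
                        genpd->power_on_latency_ns = elapsed_ns;

                return 0;
        }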
* PM / clk: Print acquired clock name in addition to con_id    Geert Uytterhoeven    2015-06-15    1 file    -1/+2
    Currently the con_id of the acquired clock is printed for debugging
    purposes. But in several cases the con_id is NULL, which doesn't
    provide much debugging information when printed. These cases are:

    - when explicitly passing a NULL con_id (which means the first clock
      tied to the device, if available),
    - when not using pm_clk_add() but pm_clk_add_clk() (which takes a
      "struct clk *" directly).

    Hence print the actual clock name in addition to (and not instead of;
    thanks Grygorii Strashko!) the con_id. Note that the clock name is not
    available with legacy clock frameworks, in which case the hex pointer
    address is printed instead.

    Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Reviewed-by: Grygorii Strashko <grygorii.strashko@linaro.org>
    Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* PM / clk: Fix clock error check in __pm_clk_add()    Geert Uytterhoeven    2015-05-19    1 file    -1/+1
    In the final iteration of commit 245bd6f6af8a62a2 ("PM / clock_ops:
    Add pm_clk_add_clk()"), a refcount increment was added by Grygorii
    Strashko. However, the accompanying IS_ERR() check operates on the
    wrong clock pointer, which is always zero at this point, i.e. not an
    error. This may lead to a NULL pointer dereference later, when
    __clk_get() tries to dereference an error pointer.

    Check the passed clock pointer instead to fix this.

    Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Fixes: 245bd6f6af8a62a2 ("PM / clock_ops: Add pm_clk_add_clk()")
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
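    A condensed, illustrative view of the broken versus fixed check; the
    entry handling is simplified and only the target of the IS_ERR() test
    is the point of the fix:

        static int __pm_clk_add(struct device *dev, const char *con_id,
                                struct clk *clk)
        {
                struct pm_clock_entry *ce;

                ce = kzalloc(sizeof(*ce), GFP_KERNEL);
                if (!ce)
                        return -ENOMEM;

                if (!con_id) {
                        if (IS_ERR(clk)) {      /* was: IS_ERR(ce->clk), always false here */
                                kfree(ce);
                                return -ENOENT;
                        }
                        ce->clk = clk;
                        __clk_get(clk);         /* refcount increment from 245bd6f6af8a62a2 */
                }

                /* ... add ce to the device's PM clock list ... */
                return 0;
        }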
* drivers: sh: remove boilerplate code and use USE_PM_CLK_RUNTIME_OPS    Rajendra Nayak    2015-05-12    1 file    -45/+2
    USE_PM_CLK_RUNTIME_OPS is introduced so we don't repeat the same code
    to do runtime_suspend and runtime_resume across users of PM clocks.
    Use it to remove the boilerplate code.

    Signed-off-by: Rajendra Nayak <rnayak@codeaurora.org>
    Reviewed-by: Kevin Hilman <khilman@linaro.org>
    Acked-by: Santosh Shilimkar <ssantosh@kernel.org>
    Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* arm: davinci: remove boilerplate code and use USE_PM_CLK_RUNTIME_OPS    Rajendra Nayak    2015-05-12    1 file    -31/+1
    USE_PM_CLK_RUNTIME_OPS is introduced so we don't repeat the same code
    to do runtime_suspend and runtime_resume across users of PM clocks.
    Use it to remove the boilerplate code.

    Signed-off-by: Rajendra Nayak <rnayak@codeaurora.org>
    Reviewed-by: Kevin Hilman <khilman@linaro.org>
    Acked-by: Santosh Shilimkar <ssantosh@kernel.org>
    Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* arm: omap1: remove boilerplate code and use USE_PM_CLK_RUNTIME_OPS    Rajendra Nayak    2015-05-12    1 file    -35/+2
    USE_PM_CLK_RUNTIME_OPS is introduced so we don't repeat the same code
    to do runtime_suspend and runtime_resume across users of PM clocks.
    Use it to remove the boilerplate code.

    Signed-off-by: Rajendra Nayak <rnayak@codeaurora.org>
    Reviewed-by: Kevin Hilman <khilman@linaro.org>
    Acked-by: Santosh Shilimkar <ssantosh@kernel.org>
    Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Acked-by: Tony Lindgren <tony@atomide.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* arm: keystone: remove boilerplate code and use USE_PM_CLK_RUNTIME_OPS    Rajendra Nayak    2015-05-12    1 file    -32/+1
    USE_PM_CLK_RUNTIME_OPS is introduced so we don't repeat the same code
    to do runtime_suspend and runtime_resume across users of PM clocks.
    Use it to remove the boilerplate code.

    Signed-off-by: Rajendra Nayak <rnayak@codeaurora.org>
    Reviewed-by: Kevin Hilman <khilman@linaro.org>
    Acked-by: Santosh Shilimkar <ssantosh@kernel.org>
    Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* PM / clock_ops: Provide default runtime ops to users    Rajendra Nayak    2015-05-12    2 files    -0/+48
    Most users of PM clocks do the exact same things in the runtime
    suspend/resume callbacks. Provide them with USE_PM_CLK_RUNTIME_OPS so
    as to avoid/remove the boilerplate code.

    Signed-off-by: Rajendra Nayak <rnayak@codeaurora.org>
    Reviewed-by: Kevin Hilman <khilman@linaro.org>
    Acked-by: Santosh Shilimkar <ssantosh@kernel.org>
    Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
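    A minimal sketch of what a converted user looks like. The macro, as
    introduced here, expands to .runtime_suspend/.runtime_resume entries
    backed by pm_clk_suspend()/pm_clk_resume() when CONFIG_PM is enabled,
    and to nothing otherwise (the bus name is made up):

        #include <linux/pm.h>
        #include <linux/pm_clock.h>

        static struct dev_pm_ops example_bus_pm_ops = {
                USE_PM_CLK_RUNTIME_OPS
        };

    The per-platform runtime_suspend/runtime_resume wrappers removed by the
    conversions above were doing exactly this and nothing more.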
* Merge branch 'pm-wakeirq'    Rafael J. Wysocki    2015-06-19    9 files    -1/+484
    * pm-wakeirq:
        PM / wakeirq: Fix typo in prototype for dev_pm_set_dedicated_wake_irq
        PM / Wakeirq: Add automated device wake IRQ handling
* PM / wakeirq: Fix typo in prototype for dev_pm_set_dedicated_wake_irq    Tony Lindgren    2015-05-30    1 file    -2/+1
    Looks like I only build tested the dev_pm_set_wake_irq() case and not
    the dev_pm_set_dedicated_wake_irq() case on x86. Turns out there's a
    typo in the dev_pm_set_dedicated_wake_irq() prototype that causes a
    build error if CONFIG_COMPILE_TEST and CONFIG_MMC_OMAP_HS are
    selected.

    Reported-by: Jim Davis <jim.epost@gmail.com>
    Signed-off-by: Tony Lindgren <tony@atomide.com>
    Reviewed-by: Felipe Balbi <balbi@ti.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* PM / Wakeirq: Add automated device wake IRQ handling    Tony Lindgren    2015-05-20    9 files    -1/+485
    Turns out we can automate the handling for device_may_wakeup() quite a
    bit by using the kernel wakeup source list, as suggested by Rafael J.
    Wysocki <rjw@rjwysocki.net>.

    And as some hardware has a separate dedicated wake-up interrupt in
    addition to the IO interrupt, we can automate the handling by adding a
    generic threaded interrupt handler that just calls the device PM
    runtime code to wake up the device.

    This allows dropping code from device drivers, as we are currently
    doing it in multiple ways, and often incorrectly. For most drivers, we
    should be able to drop the following boilerplate code from the
    runtime_suspend and runtime_resume functions:

        ...
        device_init_wakeup(dev, true);
        ...
        if (device_may_wakeup(dev))
                enable_irq_wake(irq);
        ...
        if (device_may_wakeup(dev))
                disable_irq_wake(irq);
        ...
        device_init_wakeup(dev, false);
        ...

    We can replace it with just the following init and exit time code:

        ...
        device_init_wakeup(dev, true);
        dev_pm_set_wake_irq(dev, irq);
        ...
        dev_pm_clear_wake_irq(dev);
        device_init_wakeup(dev, false);
        ...

    And for hardware with dedicated wake-up interrupts:

        ...
        device_init_wakeup(dev, true);
        dev_pm_set_dedicated_wake_irq(dev, irq);
        ...
        dev_pm_clear_wake_irq(dev);
        device_init_wakeup(dev, false);
        ...

    Signed-off-by: Tony Lindgren <tony@atomide.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
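    A minimal, hypothetical platform driver showing the new init/exit
    pattern end to end (driver and device names are made up):

        #include <linux/module.h>
        #include <linux/platform_device.h>
        #include <linux/pm_wakeirq.h>

        static int example_probe(struct platform_device *pdev)
        {
                int irq = platform_get_irq(pdev, 0);
                int ret;

                if (irq < 0)
                        return irq;

                device_init_wakeup(&pdev->dev, true);
                ret = dev_pm_set_wake_irq(&pdev->dev, irq);
                if (ret)
                        dev_warn(&pdev->dev, "wake IRQ setup failed: %d\n", ret);

                return 0;
        }

        static int example_remove(struct platform_device *pdev)
        {
                dev_pm_clear_wake_irq(&pdev->dev);
                device_init_wakeup(&pdev->dev, false);
                return 0;
        }

        static struct platform_driver example_driver = {
                .probe  = example_probe,
                .remove = example_remove,
                .driver = {
                        .name = "example-wakeirq",
                },
        };
        module_platform_driver(example_driver);
        MODULE_LICENSE("GPL");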
* Merge branches 'pm-sleep' and 'pm-runtime'    Rafael J. Wysocki    2015-06-19    14 files    -35/+124
    * pm-sleep:
        PM / sleep: trace_device_pm_callback coverage in dpm_prepare/complete
        PM / wakeup: add a dummy wakeup_source to record statistics
        PM / sleep: Make suspend-to-idle-specific code depend on CONFIG_SUSPEND
        PM / sleep: Return -EBUSY from suspend_enter() on wakeup detection
        PM / tick: Add tracepoints for suspend-to-idle diagnostics
        PM / sleep: Fix symbol name in a comment in kernel/power/main.c
        leds / PM: fix hibernation on arm when gpio-led used with CPU led trigger
        ARM: omap-device: use SET_NOIRQ_SYSTEM_SLEEP_PM_OPS
        bus: omap_l3_noc: add missed callbacks for suspend-to-disk
        PM / sleep: Add macro to define common noirq system PM callbacks
        PM / sleep: Refine diagnostic messages in enter_state()
        PM / wakeup: validate wakeup source before activating it.
    * pm-runtime:
        PM / Runtime: Update last_busy in rpm_resume
        PM / runtime: add note about re-calling in during device probe()
* PM / Runtime: Update last_busy in rpm_resume    Tony Lindgren    2015-05-20    1 file    -0/+1
    If we don't update last_busy in rpm_resume, devices can go back to
    sleep immediately after resume. This happens at least in cases where
    the device has been powered off and does not have any interrupt
    pending until there's something in the FIFO.

    Signed-off-by: Tony Lindgren <tony@atomide.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
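    For context, an illustrative driver-side pattern that relies on
    last_busy being current: with autosuspend, the device stays powered for
    the autosuspend delay counted from the last recorded activity (function
    name is made up):

        #include <linux/pm_runtime.h>

        static void example_io_done(struct device *dev)
        {
                pm_runtime_mark_last_busy(dev);         /* record "now" as last activity */
                pm_runtime_put_autosuspend(dev);        /* suspend only after the delay */
        }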
* PM / runtime: add note about re-calling in during device probe()    Ben Dooks    2015-05-13    1 file    -0/+6
    The sh_eth driver has come up with an issue where the runtime PM code
    suspends it during the probe() method, because the network device
    registration calls back into the driver. Add a note about this to the
    documentation.

    Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
    Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* PM / sleep: trace_device_pm_callback coverage in dpm_prepare/complete    Todd E Brandt    2015-06-10    1 file    -6/+5
    Move the trace_device_pm_callback locations for dpm_prepare and
    dpm_complete to encompass the attempt to acquire the device mutex
    prior to the callback. This is needed by analyze_suspend to identify
    gaps in the trace output caused by the delay in locking the mutex for
    a device.

    Signed-off-by: Todd Brandt <todd.e.brandt@linux.intel.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* PM / wakeup: add a dummy wakeup_source to record statistics    Jin Qian    2015-05-19    1 file    -0/+36
    After a wakeup_source is destroyed, we lose all information about it,
    such as how long it has been active. Add a dummy wakeup_source to
    record such info.

    Signed-off-by: Jin Qian <jinqian@android.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* PM / sleep: Make suspend-to-idle-specific code depend on CONFIG_SUSPEND    Rafael J. Wysocki    2015-05-19    4 files    -10/+22
    Since idle_should_freeze() is defined to always return 'false' when
    CONFIG_SUSPEND is unset, all of the code depending on it in
    cpuidle_idle_call() is unnecessary in that case. Make that code depend
    on CONFIG_SUSPEND too, to avoid building it when it is not going to be
    used.

    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Acked-by: Thomas Gleixner <tglx@linutronix.de>
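    The shape of the dependency, sketched from the description above (the
    declaration site is assumed):

        #ifdef CONFIG_SUSPEND
        bool idle_should_freeze(void);
        #else
        static inline bool idle_should_freeze(void) { return false; }
        #endif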
* PM / sleep: Return -EBUSY from suspend_enter() on wakeup detection    Ruchi Kandoi    2015-05-19    1 file    -0/+2
    If a wakeup source is found to be pending in the last stage of
    suspend, after syscore suspend, the machine won't suspend, but
    suspend_enter() will return 0. That is confusing, as wakeup detection
    elsewhere causes -EBUSY to be returned from suspend_enter().

    To avoid the confusion, make suspend_enter() return -EBUSY in that
    case too.

    Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
    [ rjw: Subject and changelog ]
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* PM / tick: Add tracepoints for suspend-to-idle diagnostics    Rafael J. Wysocki    2015-05-15    1 file    -4/+11
    Add suspend/resume tracepoints to tick_freeze() and tick_unfreeze() to
    catch when timekeeping is suspended and resumed during suspend-to-idle,
    so as to be able to check whether or not we enter the "frozen" state
    and to measure the time spent in it.

    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Acked-by: Thomas Gleixner <tglx@linutronix.de>
* PM / sleep: Fix symbol name in a comment in kernel/power/main.c    Rafael J. Wysocki    2015-05-13    1 file    -1/+1
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* leds / PM: fix hibernation on arm when gpio-led used with CPU led trigger    Grygorii Strashko    2015-05-12    1 file    -4/+3
    Setting a dev_pm_ops suspend/resume pair of callbacks but not a set of
    hibernation callbacks means those PM functions will not be called upon
    hibernation. That leads to a system crash on ARM during freezing if
    gpio-led is used in combination with the CPU led trigger. It may
    happen after the freeze_noirq stage (GPIO is suspended) and before the
    syscore_suspend stage (the CPU led trigger is suspended), usually when
    disable_nonboot_cpus() is called.

    Log:

        PM: noirq freeze of devices complete after 1.425 msecs
        Disabling non-boot CPUs ...
        ^ system may crash or get stuck here with message (TI AM572x)

        WARNING: CPU: 0 PID: 3100 at drivers/bus/omap_l3_noc.c:148 l3_interrupt_handler+0x22c/0x370()
        44000000.ocp:L3 Custom Error: MASTER MPU TARGET L4_PER1_P3 (Idle): Data Access in Supervisor mode during Functional access

        CPU1: shutdown
        ^ or here

    Fix this by using SIMPLE_DEV_PM_OPS, which appropriately assigns the
    suspend and hibernation callbacks, and move led_suspend/led_resume
    under CONFIG_PM_SLEEP to avoid build warnings.

    Fixes: 73e1ab41a80d (leds: Convert led class driver from legacy pm ops to dev_pm_ops)
    Signed-off-by: Grygorii Strashko <Grygorii.Strashko@linaro.org>
    Acked-by: Jacek Anaszewski <j.anaszewski@samsung.com>
    Cc: 3.11+ <stable@vger.kernel.org> # 3.11+
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
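    An illustrative sketch of the fix's shape: SIMPLE_DEV_PM_OPS fills both
    the suspend/resume and the hibernation (freeze/thaw/poweroff/restore)
    slots from one callback pair, with the callbacks built only under
    CONFIG_PM_SLEEP (callback bodies elided):

        #include <linux/pm.h>

        #ifdef CONFIG_PM_SLEEP
        static int led_suspend(struct device *dev)
        {
                /* ... suspend the LED class device ... */
                return 0;
        }

        static int led_resume(struct device *dev)
        {
                /* ... resume the LED class device ... */
                return 0;
        }
        #endif

        static SIMPLE_DEV_PM_OPS(leds_class_dev_pm_ops, led_suspend, led_resume);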
* ARM: omap-device: use SET_NOIRQ_SYSTEM_SLEEP_PM_OPS    Grygorii Strashko    2015-05-12    1 file    -5/+2
    Use the recently introduced macro SET_NOIRQ_SYSTEM_SLEEP_PM_OPS to set
    up the PM callbacks. This also fixes the missed assignment of the
    .poweroff_noirq() callback.

    Signed-off-by: Grygorii Strashko <Grygorii.Strashko@linaro.org>
    Acked-by: Santosh Shilimkar <ssantosh@kernel.org>
    Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
    Acked-by: Pavel Machek <pavel@ucw.cz>
    Reviewed-by: Kevin Hilman <khilman@linaro.org>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* bus: omap_l3_noc: add missed callbacks for suspend-to-disk    Grygorii Strashko    2015-05-12    1 file    -2/+2
    Add the callbacks needed for proper support of suspend-to-disk by
    using the recently introduced macro SET_NOIRQ_SYSTEM_SLEEP_PM_OPS.

    Signed-off-by: Grygorii Strashko <Grygorii.Strashko@linaro.org>
    Acked-by: Nishanth Menon <nm@ti.com>
    Acked-by: Santosh Shilimkar <ssantosh@kernel.org>
    Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
    Acked-by: Pavel Machek <pavel@ucw.cz>
    Reviewed-by: Kevin Hilman <khilman@linaro.org>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* PM / sleep: Add macro to define common noirq system PM callbacks    Grygorii Strashko    2015-05-12    1 file    -0/+12
    Use the same approach as the existing SET_SYSTEM_SLEEP_PM_OPS, but for
    the noirq callbacks. The new SET_NOIRQ_SYSTEM_SLEEP_PM_OPS, defined
    for CONFIG_PM_SLEEP, points ->suspend_noirq, ->freeze_noirq and
    ->poweroff_noirq to the same function. Vice versa happens for
    ->resume_noirq, ->thaw_noirq and ->restore_noirq.

    Signed-off-by: Grygorii Strashko <Grygorii.Strashko@linaro.org>
    Acked-by: Santosh Shilimkar <ssantosh@kernel.org>
    Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
    Acked-by: Pavel Machek <pavel@ucw.cz>
    Reviewed-by: Kevin Hilman <khilman@linaro.org>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
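    A hypothetical user of the new macro; __maybe_unused keeps the
    callbacks warning-free when CONFIG_PM_SLEEP is disabled (names and
    bodies are illustrative):

        #include <linux/pm.h>

        static int __maybe_unused example_suspend_noirq(struct device *dev)
        {
                /* quiesce the device with interrupts disabled */
                return 0;
        }

        static int __maybe_unused example_resume_noirq(struct device *dev)
        {
                /* restore the device with interrupts disabled */
                return 0;
        }

        static const struct dev_pm_ops example_pm_ops = {
                SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(example_suspend_noirq,
                                              example_resume_noirq)
        };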
* PM / sleep: Refine diagnostic messages in enter_state()    Rafael J. Wysocki    2015-05-12    1 file    -3/+3
    Some of the system suspend diagnostic messages related to
    suspend-to-idle refer to it as "freeze sleep" or "freeze state", while
    others say "suspend-to-idle". To reduce the possible confusion that
    may result from that, refine the former either to say "suspend to
    idle" too or to make it clearer that what is printed is a state string
    written to /sys/power/state ("mem", "standby", or "freeze").

    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* PM / wakeup: validate wakeup source before activating it.    Jin Qian    2015-05-08    1 file    -0/+18
    A rogue wakeup source not registered in the wakeup_sources list is not
    visible from wakeup_sources_stats_show. Check that the wakeup source
    is registered properly by looking at its timer struct.

    Signed-off-by: Jin Qian <jinqian@android.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* Merge branch 'pm-cpufreq'    Rafael J. Wysocki    2015-06-19    16 files    -503/+670
    * pm-cpufreq: (37 commits)
        cpufreq: dt: allow driver to boot automatically
        intel_pstate: Fix overflow in busy_scaled due to long delay
        cpufreq: qoriq: optimize the CPU frequency switching time
        cpufreq: gx-suspmod: Fix two typos in two comments
        cpufreq: nforce2: Fix typo in comment to function nforce2_init()
        cpufreq: governor: Serialize governor callbacks
        cpufreq: governor: split cpufreq_governor_dbs()
        cpufreq: governor: register notifier from cs_init()
        cpufreq: Remove cpufreq_update_policy()
        cpufreq: Restart governor as soon as possible
        cpufreq: Call cpufreq_policy_put_kobj() from cpufreq_policy_free()
        cpufreq: Initialize policy->kobj while allocating policy
        cpufreq: Stop migrating sysfs files on hotplug
        cpufreq: Don't allow updating inactive policies from sysfs
        intel_pstate: Force setting target pstate when required
        intel_pstate: change some inconsistent debug information
        cpufreq: Track cpu managing sysfs kobjects separately
        cpufreq: Fix for typos in two comments
        cpufreq: Mark policy->governor = NULL for inactive policies
        cpufreq: Manage governor usage history with 'policy->last_governor'
        ...
* cpufreq: dt: allow driver to boot automatically    Felipe Balbi    2015-06-17    1 file    -0/+1
    By adding the missing MODULE_ALIAS(), cpufreq-dt can be autoloaded by
    udev/systemd.

    Signed-off-by: Felipe Balbi <balbi@ti.com>
    Acked-by: Nishanth Menon <nm@ti.com>
    Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
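    Illustrative only (the exact alias string is assumed here): a platform
    driver named "cpufreq-dt" becomes autoloadable once it exports the
    matching platform modalias:

        MODULE_ALIAS("platform:cpufreq-dt");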
* intel_pstate: Fix overflow in busy_scaled due to long delay    Prarit Bhargava    2015-06-16    1 file    -5/+5
    The kernel may delay interrupts for a long time, which can result in
    timers being delayed. If this occurs, the intel_pstate driver will
    crash with a divide-by-zero error:

        divide error: 0000 [#1] SMP
        Modules linked in: btrfs zlib_deflate raid6_pq xor msdos ext4 mbcache jbd2 binfmt_misc arc4 md4 nls_utf8 cifs dns_resolver tcp_lp bnep bluetooth rfkill fuse dm_service_time iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ftp ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter ip_tables intel_powerclamp coretemp vfat fat kvm_intel iTCO_wdt iTCO_vendor_support ipmi_devintf sr_mod kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel cdc_ether lrw usbnet cdrom mii gf128mul glue_helper ablk_helper cryptd lpc_ich mfd_core pcspkr sb_edac edac_core ipmi_si ipmi_msghandler ioatdma wmi shpchp acpi_pad nfsd auth_rpcgss nfs_acl lockd uinput dm_multipath sunrpc xfs libcrc32c usb_storage sd_mod crc_t10dif crct10dif_common ixgbe mgag200 syscopyarea sysfillrect sysimgblt mdio drm_kms_helper ttm igb drm ptp pps_core dca i2c_algo_bit megaraid_sas i2c_core dm_mirror dm_region_hash dm_log dm_mod
        CPU: 113 PID: 0 Comm: swapper/113 Tainted: G W -------------- 3.10.0-229.1.2.el7.x86_64 #1
        Hardware name: IBM x3950 X6 -[3837AC2]-/00FN827, BIOS -[A8E112BUS-1.00]- 08/27/2014
        task: ffff880fe8abe660 ti: ffff880fe8ae4000 task.ti: ffff880fe8ae4000
        RIP: 0010:[<ffffffff814a9279>] [<ffffffff814a9279>] intel_pstate_timer_func+0x179/0x3d0
        RSP: 0018:ffff883fff4e3db8 EFLAGS: 00010206
        RAX: 0000000027100000 RBX: ffff883fe6965100 RCX: 0000000000000000
        RDX: 0000000000000000 RSI: 0000000000000010 RDI: 000000002e53632d
        RBP: ffff883fff4e3e20 R08: 000e6f69a5a125c0 R09: ffff883fe84ec001
        R10: 0000000000000002 R11: 0000000000000005 R12: 00000000000049f5
        R13: 0000000000271000 R14: 00000000000049f5 R15: 0000000000000246
        FS:  0000000000000000(0000) GS:ffff883fff4e0000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00007f7668601000 CR3: 000000000190a000 CR4: 00000000001407e0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
        Stack:
         ffff883fff4e3e58 ffffffff81099dc1 0000000000000086 0000000000000071
         ffff883fff4f3680 0000000000000071 fbdc8a965e33afee ffffffff810b69dd
         ffff883fe84ec000 ffff883fe6965108 0000000000000100 ffffffff814a9100
        Call Trace:
         <IRQ>
         [<ffffffff81099dc1>] ? run_posix_cpu_timers+0x51/0x840
         [<ffffffff810b69dd>] ? trigger_load_balance+0x5d/0x200
         [<ffffffff814a9100>] ? pid_param_set+0x130/0x130
         [<ffffffff8107df56>] call_timer_fn+0x36/0x110
         [<ffffffff814a9100>] ? pid_param_set+0x130/0x130
         [<ffffffff8107fdcf>] run_timer_softirq+0x21f/0x320
         [<ffffffff81077b2f>] __do_softirq+0xef/0x280
         [<ffffffff816156dc>] call_softirq+0x1c/0x30
         [<ffffffff81015d95>] do_softirq+0x65/0xa0
         [<ffffffff81077ec5>] irq_exit+0x115/0x120
         [<ffffffff81616355>] smp_apic_timer_interrupt+0x45/0x60
         [<ffffffff81614a1d>] apic_timer_interrupt+0x6d/0x80
         <EOI>
         [<ffffffff814a9c32>] ? cpuidle_enter_state+0x52/0xc0
         [<ffffffff814a9c28>] ? cpuidle_enter_state+0x48/0xc0
         [<ffffffff814a9d65>] cpuidle_idle_call+0xc5/0x200
         [<ffffffff8101d14e>] arch_cpu_idle+0xe/0x30
         [<ffffffff810c67c1>] cpu_startup_entry+0xf1/0x290
         [<ffffffff8104228a>] start_secondary+0x1ba/0x230
        Code: 42 0f 00 45 89 e6 48 01 c2 43 8d 44 6d 00 39 d0 73 26 49 c1 e5 08 89 d2 4d 63 f4 49 63 c5 48 c1 e2 08 48 c1 e0 08 48 63 ca 48 99 <48> f7 f9 48 98 4c 0f af f0 49 c1 ee 08 8b 43 78 c1 e0 08 44 29
        RIP  [<ffffffff814a9279>] intel_pstate_timer_func+0x179/0x3d0
         RSP <ffff883fff4e3db8>

    The kernel values for cpudata for CPU 113 were:

        struct cpudata {
          cpu = 113,
          timer = {
            entry = {
              next = 0x0,
              prev = 0xdead000000200200
            },
            expires = 8357799745,
            base = 0xffff883fe84ec001,
            function = 0xffffffff814a9100 <intel_pstate_timer_func>,
            data = 18446612406765768960,
        <snip>
            i_gain = 0,
            d_gain = 0,
            deadband = 0,
            last_err = 22489
          },
          last_sample_time = {
            tv64 = 4063132438017305
          },
          prev_aperf = 287326796397463,
          prev_mperf = 251427432090198,
          sample = {
            core_pct_busy = 23081,
            aperf = 2937407,
            mperf = 3257884,
            freq = 2524484,
            time = {
              tv64 = 4063149215234118
            }
          }
        }

    which results in the time between samples

        sample.time - last_sample_time = 4063149215234118 - 4063132438017305
                                       = 16777216813 ns,

    which is 16.777 seconds.

    The duration between the reads of the APERF and MPERF registers
    overflowed an s32-sized integer in intel_pstate_get_scaled_busy()'s
    call to div_fp(). The result is that int_tofp(duration_us) == 0, and
    the kernel attempts to divide by 0.

    While the kernel shouldn't be delaying for a long time, it can and
    does happen, and the intel_pstate driver should not panic in this
    situation. This patch changes the div_fp() function to use
    div64_s64() to allow for "long" division. This avoids the overflow
    condition on long delays.

    [v2]: use div64_s64() in div_fp()

    Signed-off-by: Prarit Bhargava <prarit@redhat.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
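    An illustrative sketch of the fix: perform the fixed-point division in
    64 bits so a multi-second sampling gap cannot truncate the divisor to
    zero (the FRAC_BITS value and helper shape follow the intel_pstate
    style but are assumed here, not copied from the patch):

        #include <linux/math64.h>

        #define FRAC_BITS 8

        /* was: 32-bit arguments, which overflow for ~16 s sampling gaps */
        static inline int32_t div_fp(int64_t x, int64_t y)
        {
                return div64_s64(x << FRAC_BITS, y);
        }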
* cpufreq: qoriq: optimize the CPU frequency switching time    Tang Yuantian    2015-06-15    1 file    -11/+21
    Each time the CPU switches its frequency, the clock nodes in the DTS
    are walked through to find the proper clock source. This is very
    time-consuming; for example, it takes up to 500+ us on T4240. Besides,
    the switching time varies from clock to clock.

    To optimize this, each input clock of the CPU is cached, so that it
    can be picked up instantly when needed. Since each input clock of each
    CPU is stored in a pointer, which takes 4 or 8 bytes of memory, and
    there are normally only a few input clocks per CPU, this does not take
    much memory either.

    Signed-off-by: Tang Yuantian <Yuantian.Tang@freescale.com>
    Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
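    A rough sketch of the caching idea, with structure and field names
    assumed for illustration: resolve each CPU's candidate input clocks
    once at init and reuse the cached pointers in set_target():

        #include <linux/clk.h>
        #include <linux/of.h>

        #define MAX_INPUT_CLKS  8

        struct cpu_clk_data {
                struct clk *clk;                        /* the CPU's own clock (mux) */
                struct clk *pclk[MAX_INPUT_CLKS];       /* cached input clocks */
                int nclks;
        };

        static int cache_input_clocks(struct device_node *np,
                                      struct cpu_clk_data *data)
        {
                int i, count;

                count = of_count_phandle_with_args(np, "clocks", "#clock-cells");
                if (count <= 0 || count > MAX_INPUT_CLKS)
                        return -EINVAL;

                data->nclks = count;
                for (i = 0; i < count; i++)
                        data->pclk[i] = of_clk_get(np, i);      /* DT walked once, here */

                return 0;
        }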
* cpufreq: gx-suspmod: Fix two typos in two comments    Shailendra Verma    2015-06-15    1 file    -2/+2
    Signed-off-by: Shailendra Verma <shailendra.capricorn@gmail.com>
    Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: nforce2: Fix typo in comment to function nforce2_init()    Shailendra Verma    2015-06-15    1 file    -1/+1
    Signed-off-by: Shailendra Verma <shailendra.capricorn@gmail.com>
    Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: governor: Serialize governor callbacks    Viresh Kumar    2015-06-15    4 files    -18/+18
    There are several races reported in the cpufreq core around governors
    (only ondemand and conservative) by different people. There are at
    least two race scenarios present in the governor code:

    (a) Concurrent access/updates of governor internal structures.

        It is possible that fields such as 'dbs_data->usage_count', etc.
        are accessed simultaneously for different policies using the same
        governor structure (i.e. with the CPUFREQ_HAVE_GOVERNOR_PER_POLICY
        flag unset). Because of this we can dereference bad pointers.

        For example, consider a system with two CPUs with separate
        'struct cpufreq_policy' instances. CPU0 governor: ondemand and
        CPU1: powersave. CPU0 switching to powersave and CPU1 to ondemand:

            CPU0                                CPU1

            store*                              store*

            cpufreq_governor_exit()             cpufreq_governor_init()
                                                dbs_data = cdata->gdbs_data;
            if (!--dbs_data->usage_count)
                    kfree(dbs_data);
                                                dbs_data->usage_count++;
                                                *Bad pointer dereference*

        There are other races possible between EXIT and START/STOP/LIMIT
        as well. It is really complicated.

    (b) Switching governor state in a bad sequence.

        For example, trying to switch a governor to the START state when
        the governor is in the EXIT state. There are some checks present
        in __cpufreq_governor(), but they aren't sufficient, as they
        compare events against 'policy->governor_enabled', whereas we need
        to take the governor's state into account, which can be used by
        multiple policies.

    These two issues need to be solved separately, and the responsibility
    should be properly divided between the cpufreq and governor core. The
    first problem is more about the governor core, as it needs to protect
    its structures properly. The second problem should be fixed in the
    cpufreq core instead of the governor, as it is all about the sequence
    of events.

    This patch is trying to solve only the first problem. There are two
    types of data we need to protect:

    - 'struct common_dbs_data': no matter what, there is going to be a
      single copy of this per governor.
    - 'struct dbs_data': with the CPUFREQ_HAVE_GOVERNOR_PER_POLICY flag
      set, we will have a per-policy copy of this data, otherwise a single
      copy.

    Because of such complexities, the mutex present in 'struct dbs_data'
    is insufficient to solve our problem. For example, we need to protect
    the fetching of 'dbs_data' from different structures at the beginning
    of cpufreq_governor_dbs(), to make sure it isn't currently being
    updated. This can be fixed if we can guarantee serialization of the
    event parsing code for an individual governor. This is best solved
    with a mutex per governor, and the placeholder for that is 'struct
    common_dbs_data'.

    And so this patch moves the mutex from 'struct dbs_data' to 'struct
    common_dbs_data' and takes it at the beginning and drops it at the end
    of cpufreq_governor_dbs().

    Tested with and without the following configuration options:

        CONFIG_LOCKDEP_SUPPORT=y
        CONFIG_DEBUG_RT_MUTEXES=y
        CONFIG_DEBUG_PI_LIST=y
        CONFIG_DEBUG_SPINLOCK=y
        CONFIG_DEBUG_MUTEXES=y
        CONFIG_DEBUG_LOCK_ALLOC=y
        CONFIG_PROVE_LOCKING=y
        CONFIG_LOCKDEP=y
        CONFIG_DEBUG_ATOMIC_SLEEP=y

    Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
    Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
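    A sketch of the serialization described above; the structure and helper
    names follow the commit text ('struct common_dbs_data',
    cpufreq_governor_dbs()), but the body is simplified and partly assumed:

        int cpufreq_governor_dbs(struct cpufreq_policy *policy,
                                 struct common_dbs_data *cdata, unsigned int event)
        {
                struct dbs_data *dbs_data;
                int ret = 0;

                /* one governor event at a time, for all policies using this governor */
                mutex_lock(&cdata->mutex);

                if (have_governor_per_policy())
                        dbs_data = policy->governor_data;
                else
                        dbs_data = cdata->gdbs_data;

                switch (event) {
                case CPUFREQ_GOV_POLICY_INIT:
                        ret = cpufreq_governor_init(policy, dbs_data, cdata);
                        break;
                case CPUFREQ_GOV_POLICY_EXIT:
                        ret = cpufreq_governor_exit(policy, dbs_data);
                        break;
                /* START/STOP/LIMITS handled the same way */
                default:
                        ret = -EINVAL;
                }

                mutex_unlock(&cdata->mutex);
                return ret;
        }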
* cpufreq: governor: split cpufreq_governor_dbs()    Viresh Kumar    2015-06-15    1 file    -140/+189
    cpufreq_governor_dbs() is hardly readable; it is just too big and
    complicated. Let's make it more readable by splitting out
    event-specific routines.

    The order of statements is changed in a few places, but that shouldn't
    bring any functional change.

    Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
    Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: governor: register notifier from cs_init()    Viresh Kumar    2015-06-15    4 files    -38/+22
    Notifiers are required only for the conservative governor, and the
    common governor code is unnecessarily polluted with them. Handle them
    from cs_init/exit() instead of cpufreq_governor_dbs().

    Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
    Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>