diff options
author | Timothy Pearson <tpearson@raptorengineering.com> | 2017-08-23 14:45:25 -0500 |
---|---|---|
committer | Timothy Pearson <tpearson@raptorengineering.com> | 2017-08-23 14:45:25 -0500 |
commit | fcbb27b0ec6dcbc5a5108cb8fb19eae64593d204 (patch) | |
tree | 22962a4387943edc841c72a4e636a068c66d58fd /Documentation/power | |
download | ast2050-linux-kernel-fcbb27b0ec6dcbc5a5108cb8fb19eae64593d204.zip ast2050-linux-kernel-fcbb27b0ec6dcbc5a5108cb8fb19eae64593d204.tar.gz |
Initial import of modified Linux 2.6.28 tree
Original upstream URL:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git | branch linux-2.6.28.y
Diffstat (limited to 'Documentation/power')
24 files changed, 3330 insertions, 0 deletions
diff --git a/Documentation/power/00-INDEX b/Documentation/power/00-INDEX new file mode 100644 index 0000000..fb742c2 --- /dev/null +++ b/Documentation/power/00-INDEX @@ -0,0 +1,40 @@ +00-INDEX + - This file +apm-acpi.txt + - basic info about the APM and ACPI support. +basic-pm-debugging.txt + - Debugging suspend and resume +devices.txt + - How drivers interact with system-wide power management +drivers-testing.txt + - Testing suspend and resume support in device drivers +freezing-of-tasks.txt + - How processes and controlled during suspend +interface.txt + - Power management user interface in /sys/power +notifiers.txt + - Registering suspend notifiers in device drivers +pci.txt + - How the PCI Subsystem Does Power Management +pm_qos_interface.txt + - info on Linux PM Quality of Service interface +power_supply_class.txt + - Tells userspace about battery, UPS, AC or DC power supply properties +s2ram.txt + - How to get suspend to ram working (and debug it when it isn't) +states.txt + - System power management states +swsusp-and-swap-files.txt + - Using swap files with software suspend (to disk) +swsusp-dmcrypt.txt + - How to use dm-crypt and software suspend (to disk) together +swsusp.txt + - Goals, implementation, and usage of software suspend (ACPI S3) +tricks.txt + - How to trick software suspend (to disk) into working when it isn't +userland-swsusp.txt + - Experimental implementation of software suspend in userspace +video_extension.txt + - ACPI video extensions +video.txt + - Video issues during resume from suspend diff --git a/Documentation/power/apm-acpi.txt b/Documentation/power/apm-acpi.txt new file mode 100644 index 0000000..1bd799d --- /dev/null +++ b/Documentation/power/apm-acpi.txt @@ -0,0 +1,32 @@ +APM or ACPI? +------------ +If you have a relatively recent x86 mobile, desktop, or server system, +odds are it supports either Advanced Power Management (APM) or +Advanced Configuration and Power Interface (ACPI). ACPI is the newer +of the two technologies and puts power management in the hands of the +operating system, allowing for more intelligent power management than +is possible with BIOS controlled APM. + +The best way to determine which, if either, your system supports is to +build a kernel with both ACPI and APM enabled (as of 2.3.x ACPI is +enabled by default). If a working ACPI implementation is found, the +ACPI driver will override and disable APM, otherwise the APM driver +will be used. + +No, sorry, you cannot have both ACPI and APM enabled and running at +once. Some people with broken ACPI or broken APM implementations +would like to use both to get a full set of working features, but you +simply cannot mix and match the two. Only one power management +interface can be in control of the machine at once. Think about it.. + +User-space Daemons +------------------ +Both APM and ACPI rely on user-space daemons, apmd and acpid +respectively, to be completely functional. Obtain both of these +daemons from your Linux distribution or from the Internet (see below) +and be sure that they are started sometime in the system boot process. +Go ahead and start both. If ACPI or APM is not available on your +system the associated daemon will exit gracefully. + + apmd: http://worldvisions.ca/~apenwarr/apmd/ + acpid: http://acpid.sf.net/ diff --git a/Documentation/power/basic-pm-debugging.txt b/Documentation/power/basic-pm-debugging.txt new file mode 100644 index 0000000..1555001 --- /dev/null +++ b/Documentation/power/basic-pm-debugging.txt @@ -0,0 +1,204 @@ +Debugging hibernation and suspend + (C) 2007 Rafael J. Wysocki <rjw@sisk.pl>, GPL + +1. Testing hibernation (aka suspend to disk or STD) + +To check if hibernation works, you can try to hibernate in the "reboot" mode: + +# echo reboot > /sys/power/disk +# echo disk > /sys/power/state + +and the system should create a hibernation image, reboot, resume and get back to +the command prompt where you have started the transition. If that happens, +hibernation is most likely to work correctly. Still, you need to repeat the +test at least a couple of times in a row for confidence. [This is necessary, +because some problems only show up on a second attempt at suspending and +resuming the system.] Moreover, hibernating in the "reboot" and "shutdown" +modes causes the PM core to skip some platform-related callbacks which on ACPI +systems might be necessary to make hibernation work. Thus, if you machine fails +to hibernate or resume in the "reboot" mode, you should try the "platform" mode: + +# echo platform > /sys/power/disk +# echo disk > /sys/power/state + +which is the default and recommended mode of hibernation. + +Unfortunately, the "platform" mode of hibernation does not work on some systems +with broken BIOSes. In such cases the "shutdown" mode of hibernation might +work: + +# echo shutdown > /sys/power/disk +# echo disk > /sys/power/state + +(it is similar to the "reboot" mode, but it requires you to press the power +button to make the system resume). + +If neither "platform" nor "shutdown" hibernation mode works, you will need to +identify what goes wrong. + +a) Test modes of hibernation + +To find out why hibernation fails on your system, you can use a special testing +facility available if the kernel is compiled with CONFIG_PM_DEBUG set. Then, +there is the file /sys/power/pm_test that can be used to make the hibernation +core run in a test mode. There are 5 test modes available: + +freezer +- test the freezing of processes + +devices +- test the freezing of processes and suspending of devices + +platform +- test the freezing of processes, suspending of devices and platform + global control methods(*) + +processors +- test the freezing of processes, suspending of devices, platform + global control methods(*) and the disabling of nonboot CPUs + +core +- test the freezing of processes, suspending of devices, platform global + control methods(*), the disabling of nonboot CPUs and suspending of + platform/system devices + +(*) the platform global control methods are only available on ACPI systems + and are only tested if the hibernation mode is set to "platform" + +To use one of them it is necessary to write the corresponding string to +/sys/power/pm_test (eg. "devices" to test the freezing of processes and +suspending devices) and issue the standard hibernation commands. For example, +to use the "devices" test mode along with the "platform" mode of hibernation, +you should do the following: + +# echo devices > /sys/power/pm_test +# echo platform > /sys/power/disk +# echo disk > /sys/power/state + +Then, the kernel will try to freeze processes, suspend devices, wait 5 seconds, +resume devices and thaw processes. If "platform" is written to +/sys/power/pm_test , then after suspending devices the kernel will additionally +invoke the global control methods (eg. ACPI global control methods) used to +prepare the platform firmware for hibernation. Next, it will wait 5 seconds and +invoke the platform (eg. ACPI) global methods used to cancel hibernation etc. + +Writing "none" to /sys/power/pm_test causes the kernel to switch to the normal +hibernation/suspend operations. Also, when open for reading, /sys/power/pm_test +contains a space-separated list of all available tests (including "none" that +represents the normal functionality) in which the current test level is +indicated by square brackets. + +Generally, as you can see, each test level is more "invasive" than the previous +one and the "core" level tests the hardware and drivers as deeply as possible +without creating a hibernation image. Obviously, if the "devices" test fails, +the "platform" test will fail as well and so on. Thus, as a rule of thumb, you +should try the test modes starting from "freezer", through "devices", "platform" +and "processors" up to "core" (repeat the test on each level a couple of times +to make sure that any random factors are avoided). + +If the "freezer" test fails, there is a task that cannot be frozen (in that case +it usually is possible to identify the offending task by analysing the output of +dmesg obtained after the failing test). Failure at this level usually means +that there is a problem with the tasks freezer subsystem that should be +reported. + +If the "devices" test fails, most likely there is a driver that cannot suspend +or resume its device (in the latter case the system may hang or become unstable +after the test, so please take that into consideration). To find this driver, +you can carry out a binary search according to the rules: +- if the test fails, unload a half of the drivers currently loaded and repeat +(that would probably involve rebooting the system, so always note what drivers +have been loaded before the test), +- if the test succeeds, load a half of the drivers you have unloaded most +recently and repeat. + +Once you have found the failing driver (there can be more than just one of +them), you have to unload it every time before hibernation. In that case please +make sure to report the problem with the driver. + +It is also possible that the "devices" test will still fail after you have +unloaded all modules. In that case, you may want to look in your kernel +configuration for the drivers that can be compiled as modules (and test again +with these drivers compiled as modules). You may also try to use some special +kernel command line options such as "noapic", "noacpi" or even "acpi=off". + +If the "platform" test fails, there is a problem with the handling of the +platform (eg. ACPI) firmware on your system. In that case the "platform" mode +of hibernation is not likely to work. You can try the "shutdown" mode, but that +is rather a poor man's workaround. + +If the "processors" test fails, the disabling/enabling of nonboot CPUs does not +work (of course, this only may be an issue on SMP systems) and the problem +should be reported. In that case you can also try to switch the nonboot CPUs +off and on using the /sys/devices/system/cpu/cpu*/online sysfs attributes and +see if that works. + +If the "core" test fails, which means that suspending of the system/platform +devices has failed (these devices are suspended on one CPU with interrupts off), +the problem is most probably hardware-related and serious, so it should be +reported. + +A failure of any of the "platform", "processors" or "core" tests may cause your +system to hang or become unstable, so please beware. Such a failure usually +indicates a serious problem that very well may be related to the hardware, but +please report it anyway. + +b) Testing minimal configuration + +If all of the hibernation test modes work, you can boot the system with the +"init=/bin/bash" command line parameter and attempt to hibernate in the +"reboot", "shutdown" and "platform" modes. If that does not work, there +probably is a problem with a driver statically compiled into the kernel and you +can try to compile more drivers as modules, so that they can be tested +individually. Otherwise, there is a problem with a modular driver and you can +find it by loading a half of the modules you normally use and binary searching +in accordance with the algorithm: +- if there are n modules loaded and the attempt to suspend and resume fails, +unload n/2 of the modules and try again (that would probably involve rebooting +the system), +- if there are n modules loaded and the attempt to suspend and resume succeeds, +load n/2 modules more and try again. + +Again, if you find the offending module(s), it(they) must be unloaded every time +before hibernation, and please report the problem with it(them). + +c) Advanced debugging + +In case that hibernation does not work on your system even in the minimal +configuration and compiling more drivers as modules is not practical or some +modules cannot be unloaded, you can use one of the more advanced debugging +techniques to find the problem. First, if there is a serial port in your box, +you can boot the kernel with the 'no_console_suspend' parameter and try to log +kernel messages using the serial console. This may provide you with some +information about the reasons of the suspend (resume) failure. Alternatively, +it may be possible to use a FireWire port for debugging with firescope +(ftp://ftp.firstfloor.org/pub/ak/firescope/). On x86 it is also possible to +use the PM_TRACE mechanism documented in Documentation/s2ram.txt . + +2. Testing suspend to RAM (STR) + +To verify that the STR works, it is generally more convenient to use the s2ram +tool available from http://suspend.sf.net and documented at +http://en.opensuse.org/s2ram . However, before doing that it is recommended to +carry out STR testing using the facility described in section 1. + +Namely, after writing "freezer", "devices", "platform", "processors", or "core" +into /sys/power/pm_test (available if the kernel is compiled with +CONFIG_PM_DEBUG set) the suspend code will work in the test mode corresponding +to given string. The STR test modes are defined in the same way as for +hibernation, so please refer to Section 1 for more information about them. In +particular, the "core" test allows you to test everything except for the actual +invocation of the platform firmware in order to put the system into the sleep +state. + +Among other things, the testing with the help of /sys/power/pm_test may allow +you to identify drivers that fail to suspend or resume their devices. They +should be unloaded every time before an STR transition. + +Next, you can follow the instructions at http://en.opensuse.org/s2ram to test +the system, but if it does not work "out of the box", you may need to boot it +with "init=/bin/bash" and test s2ram in the minimal configuration. In that +case, you may be able to search for failing drivers by following the procedure +analogous to the one described in section 1. If you find some failing drivers, +you will have to unload them every time before an STR transition (ie. before +you run s2ram), and please report the problems with them. diff --git a/Documentation/power/devices.txt b/Documentation/power/devices.txt new file mode 100644 index 0000000..421e7d0 --- /dev/null +++ b/Documentation/power/devices.txt @@ -0,0 +1,512 @@ +Most of the code in Linux is device drivers, so most of the Linux power +management code is also driver-specific. Most drivers will do very little; +others, especially for platforms with small batteries (like cell phones), +will do a lot. + +This writeup gives an overview of how drivers interact with system-wide +power management goals, emphasizing the models and interfaces that are +shared by everything that hooks up to the driver model core. Read it as +background for the domain-specific work you'd do with any specific driver. + + +Two Models for Device Power Management +====================================== +Drivers will use one or both of these models to put devices into low-power +states: + + System Sleep model: + Drivers can enter low power states as part of entering system-wide + low-power states like "suspend-to-ram", or (mostly for systems with + disks) "hibernate" (suspend-to-disk). + + This is something that device, bus, and class drivers collaborate on + by implementing various role-specific suspend and resume methods to + cleanly power down hardware and software subsystems, then reactivate + them without loss of data. + + Some drivers can manage hardware wakeup events, which make the system + leave that low-power state. This feature may be disabled using the + relevant /sys/devices/.../power/wakeup file; enabling it may cost some + power usage, but let the whole system enter low power states more often. + + Runtime Power Management model: + Drivers may also enter low power states while the system is running, + independently of other power management activity. Upstream drivers + will normally not know (or care) if the device is in some low power + state when issuing requests; the driver will auto-resume anything + that's needed when it gets a request. + + This doesn't have, or need much infrastructure; it's just something you + should do when writing your drivers. For example, clk_disable() unused + clocks as part of minimizing power drain for currently-unused hardware. + Of course, sometimes clusters of drivers will collaborate with each + other, which could involve task-specific power management. + +There's not a lot to be said about those low power states except that they +are very system-specific, and often device-specific. Also, that if enough +drivers put themselves into low power states (at "runtime"), the effect may be +the same as entering some system-wide low-power state (system sleep) ... and +that synergies exist, so that several drivers using runtime pm might put the +system into a state where even deeper power saving options are available. + +Most suspended devices will have quiesced all I/O: no more DMA or irqs, no +more data read or written, and requests from upstream drivers are no longer +accepted. A given bus or platform may have different requirements though. + +Examples of hardware wakeup events include an alarm from a real time clock, +network wake-on-LAN packets, keyboard or mouse activity, and media insertion +or removal (for PCMCIA, MMC/SD, USB, and so on). + + +Interfaces for Entering System Sleep States +=========================================== +Most of the programming interfaces a device driver needs to know about +relate to that first model: entering a system-wide low power state, +rather than just minimizing power consumption by one device. + + +Bus Driver Methods +------------------ +The core methods to suspend and resume devices reside in struct bus_type. +These are mostly of interest to people writing infrastructure for busses +like PCI or USB, or because they define the primitives that device drivers +may need to apply in domain-specific ways to their devices: + +struct bus_type { + ... + int (*suspend)(struct device *dev, pm_message_t state); + int (*suspend_late)(struct device *dev, pm_message_t state); + + int (*resume_early)(struct device *dev); + int (*resume)(struct device *dev); +}; + +Bus drivers implement those methods as appropriate for the hardware and +the drivers using it; PCI works differently from USB, and so on. Not many +people write bus drivers; most driver code is a "device driver" that +builds on top of bus-specific framework code. + +For more information on these driver calls, see the description later; +they are called in phases for every device, respecting the parent-child +sequencing in the driver model tree. Note that as this is being written, +only the suspend() and resume() are widely available; not many bus drivers +leverage all of those phases, or pass them down to lower driver levels. + + +/sys/devices/.../power/wakeup files +----------------------------------- +All devices in the driver model have two flags to control handling of +wakeup events, which are hardware signals that can force the device and/or +system out of a low power state. These are initialized by bus or device +driver code using device_init_wakeup(dev,can_wakeup). + +The "can_wakeup" flag just records whether the device (and its driver) can +physically support wakeup events. When that flag is clear, the sysfs +"wakeup" file is empty, and device_may_wakeup() returns false. + +For devices that can issue wakeup events, a separate flag controls whether +that device should try to use its wakeup mechanism. The initial value of +device_may_wakeup() will be true, so that the device's "wakeup" file holds +the value "enabled". Userspace can change that to "disabled" so that +device_may_wakeup() returns false; or change it back to "enabled" (so that +it returns true again). + + +EXAMPLE: PCI Device Driver Methods +----------------------------------- +PCI framework software calls these methods when the PCI device driver bound +to a device device has provided them: + +struct pci_driver { + ... + int (*suspend)(struct pci_device *pdev, pm_message_t state); + int (*suspend_late)(struct pci_device *pdev, pm_message_t state); + + int (*resume_early)(struct pci_device *pdev); + int (*resume)(struct pci_device *pdev); +}; + +Drivers will implement those methods, and call PCI-specific procedures +like pci_set_power_state(), pci_enable_wake(), pci_save_state(), and +pci_restore_state() to manage PCI-specific mechanisms. (PCI config space +could be saved during driver probe, if it weren't for the fact that some +systems rely on userspace tweaking using setpci.) Devices are suspended +before their bridges enter low power states, and likewise bridges resume +before their devices. + + +Upper Layers of Driver Stacks +----------------------------- +Device drivers generally have at least two interfaces, and the methods +sketched above are the ones which apply to the lower level (nearer PCI, USB, +or other bus hardware). The network and block layers are examples of upper +level interfaces, as is a character device talking to userspace. + +Power management requests normally need to flow through those upper levels, +which often use domain-oriented requests like "blank that screen". In +some cases those upper levels will have power management intelligence that +relates to end-user activity, or other devices that work in cooperation. + +When those interfaces are structured using class interfaces, there is a +standard way to have the upper layer stop issuing requests to a given +class device (and restart later): + +struct class { + ... + int (*suspend)(struct device *dev, pm_message_t state); + int (*resume)(struct device *dev); +}; + +Those calls are issued in specific phases of the process by which the +system enters a low power "suspend" state, or resumes from it. + + +Calling Drivers to Enter System Sleep States +============================================ +When the system enters a low power state, each device's driver is asked +to suspend the device by putting it into state compatible with the target +system state. That's usually some version of "off", but the details are +system-specific. Also, wakeup-enabled devices will usually stay partly +functional in order to wake the system. + +When the system leaves that low power state, the device's driver is asked +to resume it. The suspend and resume operations always go together, and +both are multi-phase operations. + +For simple drivers, suspend might quiesce the device using the class code +and then turn its hardware as "off" as possible with late_suspend. The +matching resume calls would then completely reinitialize the hardware +before reactivating its class I/O queues. + +More power-aware drivers drivers will use more than one device low power +state, either at runtime or during system sleep states, and might trigger +system wakeup events. + + +Call Sequence Guarantees +------------------------ +To ensure that bridges and similar links needed to talk to a device are +available when the device is suspended or resumed, the device tree is +walked in a bottom-up order to suspend devices. A top-down order is +used to resume those devices. + +The ordering of the device tree is defined by the order in which devices +get registered: a child can never be registered, probed or resumed before +its parent; and can't be removed or suspended after that parent. + +The policy is that the device tree should match hardware bus topology. +(Or at least the control bus, for devices which use multiple busses.) +In particular, this means that a device registration may fail if the parent of +the device is suspending (ie. has been chosen by the PM core as the next +device to suspend) or has already suspended, as well as after all of the other +devices have been suspended. Device drivers must be prepared to cope with such +situations. + + +Suspending Devices +------------------ +Suspending a given device is done in several phases. Suspending the +system always includes every phase, executing calls for every device +before the next phase begins. Not all busses or classes support all +these callbacks; and not all drivers use all the callbacks. + +The phases are seen by driver notifications issued in this order: + + 1 class.suspend(dev, message) is called after tasks are frozen, for + devices associated with a class that has such a method. This + method may sleep. + + Since I/O activity usually comes from such higher layers, this is + a good place to quiesce all drivers of a given type (and keep such + code out of those drivers). + + 2 bus.suspend(dev, message) is called next. This method may sleep, + and is often morphed into a device driver call with bus-specific + parameters and/or rules. + + This call should handle parts of device suspend logic that require + sleeping. It probably does work to quiesce the device which hasn't + been abstracted into class.suspend() or bus.suspend_late(). + + 3 bus.suspend_late(dev, message) is called with IRQs disabled, and + with only one CPU active. Until the bus.resume_early() phase + completes (see later), IRQs are not enabled again. This method + won't be exposed by all busses; for message based busses like USB, + I2C, or SPI, device interactions normally require IRQs. This bus + call may be morphed into a driver call with bus-specific parameters. + + This call might save low level hardware state that might otherwise + be lost in the upcoming low power state, and actually put the + device into a low power state ... so that in some cases the device + may stay partly usable until this late. This "late" call may also + help when coping with hardware that behaves badly. + +The pm_message_t parameter is currently used to refine those semantics +(described later). + +At the end of those phases, drivers should normally have stopped all I/O +transactions (DMA, IRQs), saved enough state that they can re-initialize +or restore previous state (as needed by the hardware), and placed the +device into a low-power state. On many platforms they will also use +clk_disable() to gate off one or more clock sources; sometimes they will +also switch off power supplies, or reduce voltages. Drivers which have +runtime PM support may already have performed some or all of the steps +needed to prepare for the upcoming system sleep state. + +When any driver sees that its device_can_wakeup(dev), it should make sure +to use the relevant hardware signals to trigger a system wakeup event. +For example, enable_irq_wake() might identify GPIO signals hooked up to +a switch or other external hardware, and pci_enable_wake() does something +similar for PCI's PME# signal. + +If a driver (or bus, or class) fails it suspend method, the system won't +enter the desired low power state; it will resume all the devices it's +suspended so far. + +Note that drivers may need to perform different actions based on the target +system lowpower/sleep state. At this writing, there are only platform +specific APIs through which drivers could determine those target states. + + +Device Low Power (suspend) States +--------------------------------- +Device low-power states aren't very standard. One device might only handle +"on" and "off, while another might support a dozen different versions of +"on" (how many engines are active?), plus a state that gets back to "on" +faster than from a full "off". + +Some busses define rules about what different suspend states mean. PCI +gives one example: after the suspend sequence completes, a non-legacy +PCI device may not perform DMA or issue IRQs, and any wakeup events it +issues would be issued through the PME# bus signal. Plus, there are +several PCI-standard device states, some of which are optional. + +In contrast, integrated system-on-chip processors often use irqs as the +wakeup event sources (so drivers would call enable_irq_wake) and might +be able to treat DMA completion as a wakeup event (sometimes DMA can stay +active too, it'd only be the CPU and some peripherals that sleep). + +Some details here may be platform-specific. Systems may have devices that +can be fully active in certain sleep states, such as an LCD display that's +refreshed using DMA while most of the system is sleeping lightly ... and +its frame buffer might even be updated by a DSP or other non-Linux CPU while +the Linux control processor stays idle. + +Moreover, the specific actions taken may depend on the target system state. +One target system state might allow a given device to be very operational; +another might require a hard shut down with re-initialization on resume. +And two different target systems might use the same device in different +ways; the aforementioned LCD might be active in one product's "standby", +but a different product using the same SOC might work differently. + + +Meaning of pm_message_t.event +----------------------------- +Parameters to suspend calls include the device affected and a message of +type pm_message_t, which has one field: the event. If driver does not +recognize the event code, suspend calls may abort the request and return +a negative errno. However, most drivers will be fine if they implement +PM_EVENT_SUSPEND semantics for all messages. + +The event codes are used to refine the goal of suspending the device, and +mostly matter when creating or resuming system memory image snapshots, as +used with suspend-to-disk: + + PM_EVENT_SUSPEND -- quiesce the driver and put hardware into a low-power + state. When used with system sleep states like "suspend-to-RAM" or + "standby", the upcoming resume() call will often be able to rely on + state kept in hardware, or issue system wakeup events. + + PM_EVENT_HIBERNATE -- Put hardware into a low-power state and enable wakeup + events as appropriate. It is only used with hibernation + (suspend-to-disk) and few devices are able to wake up the system from + this state; most are completely powered off. + + PM_EVENT_FREEZE -- quiesce the driver, but don't necessarily change into + any low power mode. A system snapshot is about to be taken, often + followed by a call to the driver's resume() method. Neither wakeup + events nor DMA are allowed. + + PM_EVENT_PRETHAW -- quiesce the driver, knowing that the upcoming resume() + will restore a suspend-to-disk snapshot from a different kernel image. + Drivers that are smart enough to look at their hardware state during + resume() processing need that state to be correct ... a PRETHAW could + be used to invalidate that state (by resetting the device), like a + shutdown() invocation would before a kexec() or system halt. Other + drivers might handle this the same way as PM_EVENT_FREEZE. Neither + wakeup events nor DMA are allowed. + +To enter "standby" (ACPI S1) or "Suspend to RAM" (STR, ACPI S3) states, or +the similarly named APM states, only PM_EVENT_SUSPEND is used; the other event +codes are used for hibernation ("Suspend to Disk", STD, ACPI S4). + +There's also PM_EVENT_ON, a value which never appears as a suspend event +but is sometimes used to record the "not suspended" device state. + + +Resuming Devices +---------------- +Resuming is done in multiple phases, much like suspending, with all +devices processing each phase's calls before the next phase begins. + +The phases are seen by driver notifications issued in this order: + + 1 bus.resume_early(dev) is called with IRQs disabled, and with + only one CPU active. As with bus.suspend_late(), this method + won't be supported on busses that require IRQs in order to + interact with devices. + + This reverses the effects of bus.suspend_late(). + + 2 bus.resume(dev) is called next. This may be morphed into a device + driver call with bus-specific parameters; implementations may sleep. + + This reverses the effects of bus.suspend(). + + 3 class.resume(dev) is called for devices associated with a class + that has such a method. Implementations may sleep. + + This reverses the effects of class.suspend(), and would usually + reactivate the device's I/O queue. + +At the end of those phases, drivers should normally be as functional as +they were before suspending: I/O can be performed using DMA and IRQs, and +the relevant clocks are gated on. The device need not be "fully on"; it +might be in a runtime lowpower/suspend state that acts as if it were. + +However, the details here may again be platform-specific. For example, +some systems support multiple "run" states, and the mode in effect at +the end of resume() might not be the one which preceded suspension. +That means availability of certain clocks or power supplies changed, +which could easily affect how a driver works. + + +Drivers need to be able to handle hardware which has been reset since the +suspend methods were called, for example by complete reinitialization. +This may be the hardest part, and the one most protected by NDA'd documents +and chip errata. It's simplest if the hardware state hasn't changed since +the suspend() was called, but that can't always be guaranteed. + +Drivers must also be prepared to notice that the device has been removed +while the system was powered off, whenever that's physically possible. +PCMCIA, MMC, USB, Firewire, SCSI, and even IDE are common examples of busses +where common Linux platforms will see such removal. Details of how drivers +will notice and handle such removals are currently bus-specific, and often +involve a separate thread. + + +Note that the bus-specific runtime PM wakeup mechanism can exist, and might +be defined to share some of the same driver code as for system wakeup. For +example, a bus-specific device driver's resume() method might be used there, +so it wouldn't only be called from bus.resume() during system-wide wakeup. +See bus-specific information about how runtime wakeup events are handled. + + +System Devices +-------------- +System devices follow a slightly different API, which can be found in + + include/linux/sysdev.h + drivers/base/sys.c + +System devices will only be suspended with interrupts disabled, and after +all other devices have been suspended. On resume, they will be resumed +before any other devices, and also with interrupts disabled. + +That is, IRQs are disabled, the suspend_late() phase begins, then the +sysdev_driver.suspend() phase, and the system enters a sleep state. Then +the sysdev_driver.resume() phase begins, followed by the resume_early() +phase, after which IRQs are enabled. + +Code to actually enter and exit the system-wide low power state sometimes +involves hardware details that are only known to the boot firmware, and +may leave a CPU running software (from SRAM or flash memory) that monitors +the system and manages its wakeup sequence. + + +Runtime Power Management +======================== +Many devices are able to dynamically power down while the system is still +running. This feature is useful for devices that are not being used, and +can offer significant power savings on a running system. These devices +often support a range of runtime power states, which might use names such +as "off", "sleep", "idle", "active", and so on. Those states will in some +cases (like PCI) be partially constrained by a bus the device uses, and will +usually include hardware states that are also used in system sleep states. + +However, note that if a driver puts a device into a runtime low power state +and the system then goes into a system-wide sleep state, it normally ought +to resume into that runtime low power state rather than "full on". Such +distinctions would be part of the driver-internal state machine for that +hardware; the whole point of runtime power management is to be sure that +drivers are decoupled in that way from the state machine governing phases +of the system-wide power/sleep state transitions. + + +Power Saving Techniques +----------------------- +Normally runtime power management is handled by the drivers without specific +userspace or kernel intervention, by device-aware use of techniques like: + + Using information provided by other system layers + - stay deeply "off" except between open() and close() + - if transceiver/PHY indicates "nobody connected", stay "off" + - application protocols may include power commands or hints + + Using fewer CPU cycles + - using DMA instead of PIO + - removing timers, or making them lower frequency + - shortening "hot" code paths + - eliminating cache misses + - (sometimes) offloading work to device firmware + + Reducing other resource costs + - gating off unused clocks in software (or hardware) + - switching off unused power supplies + - eliminating (or delaying/merging) IRQs + - tuning DMA to use word and/or burst modes + + Using device-specific low power states + - using lower voltages + - avoiding needless DMA transfers + +Read your hardware documentation carefully to see the opportunities that +may be available. If you can, measure the actual power usage and check +it against the budget established for your project. + + +Examples: USB hosts, system timer, system CPU +---------------------------------------------- +USB host controllers make interesting, if complex, examples. In many cases +these have no work to do: no USB devices are connected, or all of them are +in the USB "suspend" state. Linux host controller drivers can then disable +periodic DMA transfers that would otherwise be a constant power drain on the +memory subsystem, and enter a suspend state. In power-aware controllers, +entering that suspend state may disable the clock used with USB signaling, +saving a certain amount of power. + +The controller will be woken from that state (with an IRQ) by changes to the +signal state on the data lines of a given port, for example by an existing +peripheral requesting "remote wakeup" or by plugging a new peripheral. The +same wakeup mechanism usually works from "standby" sleep states, and on some +systems also from "suspend to RAM" (or even "suspend to disk") states. +(Except that ACPI may be involved instead of normal IRQs, on some hardware.) + +System devices like timers and CPUs may have special roles in the platform +power management scheme. For example, system timers using a "dynamic tick" +approach don't just save CPU cycles (by eliminating needless timer IRQs), +but they may also open the door to using lower power CPU "idle" states that +cost more than a jiffie to enter and exit. On x86 systems these are states +like "C3"; note that periodic DMA transfers from a USB host controller will +also prevent entry to a C3 state, much like a periodic timer IRQ. + +That kind of runtime mechanism interaction is common. "System On Chip" (SOC) +processors often have low power idle modes that can't be entered unless +certain medium-speed clocks (often 12 or 48 MHz) are gated off. When the +drivers gate those clocks effectively, then the system idle task may be able +to use the lower power idle modes and thereby increase battery life. + +If the CPU can have a "cpufreq" driver, there also may be opportunities +to shift to lower voltage settings and reduce the power cost of executing +a given number of instructions. (Without voltage adjustment, it's rare +for cpufreq to save much power; the cost-per-instruction must go down.) diff --git a/Documentation/power/drivers-testing.txt b/Documentation/power/drivers-testing.txt new file mode 100644 index 0000000..7f7a737 --- /dev/null +++ b/Documentation/power/drivers-testing.txt @@ -0,0 +1,46 @@ +Testing suspend and resume support in device drivers + (C) 2007 Rafael J. Wysocki <rjw@sisk.pl>, GPL + +1. Preparing the test system + +Unfortunately, to effectively test the support for the system-wide suspend and +resume transitions in a driver, it is necessary to suspend and resume a fully +functional system with this driver loaded. Moreover, that should be done +several times, preferably several times in a row, and separately for hibernation +(aka suspend to disk or STD) and suspend to RAM (STR), because each of these +cases involves slightly different operations and different interactions with +the machine's BIOS. + +Of course, for this purpose the test system has to be known to suspend and +resume without the driver being tested. Thus, if possible, you should first +resolve all suspend/resume-related problems in the test system before you start +testing the new driver. Please see Documentation/power/basic-pm-debugging.txt +for more information about the debugging of suspend/resume functionality. + +2. Testing the driver + +Once you have resolved the suspend/resume-related problems with your test system +without the new driver, you are ready to test it: + +a) Build the driver as a module, load it and try the test modes of hibernation + (see: Documents/power/basic-pm-debugging.txt, 1). + +b) Load the driver and attempt to hibernate in the "reboot", "shutdown" and + "platform" modes (see: Documents/power/basic-pm-debugging.txt, 1). + +c) Compile the driver directly into the kernel and try the test modes of + hibernation. + +d) Attempt to hibernate with the driver compiled directly into the kernel + in the "reboot", "shutdown" and "platform" modes. + +e) Try the test modes of suspend (see: Documents/power/basic-pm-debugging.txt, + 2). [As far as the STR tests are concerned, it should not matter whether or + not the driver is built as a module.] + +f) Attempt to suspend to RAM using the s2ram tool with the driver loaded + (see: Documents/power/basic-pm-debugging.txt, 2). + +Each of the above tests should be repeated several times and the STD tests +should be mixed with the STR tests. If any of them fails, the driver cannot be +regarded as suspend/resume-safe. diff --git a/Documentation/power/freezing-of-tasks.txt b/Documentation/power/freezing-of-tasks.txt new file mode 100644 index 0000000..38b5724 --- /dev/null +++ b/Documentation/power/freezing-of-tasks.txt @@ -0,0 +1,178 @@ +Freezing of tasks + (C) 2007 Rafael J. Wysocki <rjw@sisk.pl>, GPL + +I. What is the freezing of tasks? + +The freezing of tasks is a mechanism by which user space processes and some +kernel threads are controlled during hibernation or system-wide suspend (on some +architectures). + +II. How does it work? + +There are four per-task flags used for that, PF_NOFREEZE, PF_FROZEN, TIF_FREEZE +and PF_FREEZER_SKIP (the last one is auxiliary). The tasks that have +PF_NOFREEZE unset (all user space processes and some kernel threads) are +regarded as 'freezable' and treated in a special way before the system enters a +suspend state as well as before a hibernation image is created (in what follows +we only consider hibernation, but the description also applies to suspend). + +Namely, as the first step of the hibernation procedure the function +freeze_processes() (defined in kernel/power/process.c) is called. It executes +try_to_freeze_tasks() that sets TIF_FREEZE for all of the freezable tasks and +either wakes them up, if they are kernel threads, or sends fake signals to them, +if they are user space processes. A task that has TIF_FREEZE set, should react +to it by calling the function called refrigerator() (defined in +kernel/power/process.c), which sets the task's PF_FROZEN flag, changes its state +to TASK_UNINTERRUPTIBLE and makes it loop until PF_FROZEN is cleared for it. +Then, we say that the task is 'frozen' and therefore the set of functions +handling this mechanism is referred to as 'the freezer' (these functions are +defined in kernel/power/process.c and include/linux/freezer.h). User space +processes are generally frozen before kernel threads. + +It is not recommended to call refrigerator() directly. Instead, it is +recommended to use the try_to_freeze() function (defined in +include/linux/freezer.h), that checks the task's TIF_FREEZE flag and makes the +task enter refrigerator() if the flag is set. + +For user space processes try_to_freeze() is called automatically from the +signal-handling code, but the freezable kernel threads need to call it +explicitly in suitable places or use the wait_event_freezable() or +wait_event_freezable_timeout() macros (defined in include/linux/freezer.h) +that combine interruptible sleep with checking if TIF_FREEZE is set and calling +try_to_freeze(). The main loop of a freezable kernel thread may look like the +following one: + + set_freezable(); + do { + hub_events(); + wait_event_freezable(khubd_wait, + !list_empty(&hub_event_list) || + kthread_should_stop()); + } while (!kthread_should_stop() || !list_empty(&hub_event_list)); + +(from drivers/usb/core/hub.c::hub_thread()). + +If a freezable kernel thread fails to call try_to_freeze() after the freezer has +set TIF_FREEZE for it, the freezing of tasks will fail and the entire +hibernation operation will be cancelled. For this reason, freezable kernel +threads must call try_to_freeze() somewhere or use one of the +wait_event_freezable() and wait_event_freezable_timeout() macros. + +After the system memory state has been restored from a hibernation image and +devices have been reinitialized, the function thaw_processes() is called in +order to clear the PF_FROZEN flag for each frozen task. Then, the tasks that +have been frozen leave refrigerator() and continue running. + +III. Which kernel threads are freezable? + +Kernel threads are not freezable by default. However, a kernel thread may clear +PF_NOFREEZE for itself by calling set_freezable() (the resetting of PF_NOFREEZE +directly is strongly discouraged). From this point it is regarded as freezable +and must call try_to_freeze() in a suitable place. + +IV. Why do we do that? + +Generally speaking, there is a couple of reasons to use the freezing of tasks: + +1. The principal reason is to prevent filesystems from being damaged after +hibernation. At the moment we have no simple means of checkpointing +filesystems, so if there are any modifications made to filesystem data and/or +metadata on disks, we cannot bring them back to the state from before the +modifications. At the same time each hibernation image contains some +filesystem-related information that must be consistent with the state of the +on-disk data and metadata after the system memory state has been restored from +the image (otherwise the filesystems will be damaged in a nasty way, usually +making them almost impossible to repair). We therefore freeze tasks that might +cause the on-disk filesystems' data and metadata to be modified after the +hibernation image has been created and before the system is finally powered off. +The majority of these are user space processes, but if any of the kernel threads +may cause something like this to happen, they have to be freezable. + +2. Next, to create the hibernation image we need to free a sufficient amount of +memory (approximately 50% of available RAM) and we need to do that before +devices are deactivated, because we generally need them for swapping out. Then, +after the memory for the image has been freed, we don't want tasks to allocate +additional memory and we prevent them from doing that by freezing them earlier. +[Of course, this also means that device drivers should not allocate substantial +amounts of memory from their .suspend() callbacks before hibernation, but this +is e separate issue.] + +3. The third reason is to prevent user space processes and some kernel threads +from interfering with the suspending and resuming of devices. A user space +process running on a second CPU while we are suspending devices may, for +example, be troublesome and without the freezing of tasks we would need some +safeguards against race conditions that might occur in such a case. + +Although Linus Torvalds doesn't like the freezing of tasks, he said this in one +of the discussions on LKML (http://lkml.org/lkml/2007/4/27/608): + +"RJW:> Why we freeze tasks at all or why we freeze kernel threads? + +Linus: In many ways, 'at all'. + +I _do_ realize the IO request queue issues, and that we cannot actually do +s2ram with some devices in the middle of a DMA. So we want to be able to +avoid *that*, there's no question about that. And I suspect that stopping +user threads and then waiting for a sync is practically one of the easier +ways to do so. + +So in practice, the 'at all' may become a 'why freeze kernel threads?' and +freezing user threads I don't find really objectionable." + +Still, there are kernel threads that may want to be freezable. For example, if +a kernel that belongs to a device driver accesses the device directly, it in +principle needs to know when the device is suspended, so that it doesn't try to +access it at that time. However, if the kernel thread is freezable, it will be +frozen before the driver's .suspend() callback is executed and it will be +thawed after the driver's .resume() callback has run, so it won't be accessing +the device while it's suspended. + +4. Another reason for freezing tasks is to prevent user space processes from +realizing that hibernation (or suspend) operation takes place. Ideally, user +space processes should not notice that such a system-wide operation has occurred +and should continue running without any problems after the restore (or resume +from suspend). Unfortunately, in the most general case this is quite difficult +to achieve without the freezing of tasks. Consider, for example, a process +that depends on all CPUs being online while it's running. Since we need to +disable nonboot CPUs during the hibernation, if this process is not frozen, it +may notice that the number of CPUs has changed and may start to work incorrectly +because of that. + +V. Are there any problems related to the freezing of tasks? + +Yes, there are. + +First of all, the freezing of kernel threads may be tricky if they depend one +on another. For example, if kernel thread A waits for a completion (in the +TASK_UNINTERRUPTIBLE state) that needs to be done by freezable kernel thread B +and B is frozen in the meantime, then A will be blocked until B is thawed, which +may be undesirable. That's why kernel threads are not freezable by default. + +Second, there are the following two problems related to the freezing of user +space processes: +1. Putting processes into an uninterruptible sleep distorts the load average. +2. Now that we have FUSE, plus the framework for doing device drivers in +userspace, it gets even more complicated because some userspace processes are +now doing the sorts of things that kernel threads do +(https://lists.linux-foundation.org/pipermail/linux-pm/2007-May/012309.html). + +The problem 1. seems to be fixable, although it hasn't been fixed so far. The +other one is more serious, but it seems that we can work around it by using +hibernation (and suspend) notifiers (in that case, though, we won't be able to +avoid the realization by the user space processes that the hibernation is taking +place). + +There are also problems that the freezing of tasks tends to expose, although +they are not directly related to it. For example, if request_firmware() is +called from a device driver's .resume() routine, it will timeout and eventually +fail, because the user land process that should respond to the request is frozen +at this point. So, seemingly, the failure is due to the freezing of tasks. +Suppose, however, that the firmware file is located on a filesystem accessible +only through another device that hasn't been resumed yet. In that case, +request_firmware() will fail regardless of whether or not the freezing of tasks +is used. Consequently, the problem is not really related to the freezing of +tasks, since it generally exists anyway. + +A driver must have all firmwares it may need in RAM before suspend() is called. +If keeping them is not practical, for example due to their size, they must be +requested early enough using the suspend notifier API described in notifiers.txt. diff --git a/Documentation/power/interface.txt b/Documentation/power/interface.txt new file mode 100644 index 0000000..e67211f --- /dev/null +++ b/Documentation/power/interface.txt @@ -0,0 +1,75 @@ +Power Management Interface + + +The power management subsystem provides a unified sysfs interface to +userspace, regardless of what architecture or platform one is +running. The interface exists in /sys/power/ directory (assuming sysfs +is mounted at /sys). + +/sys/power/state controls system power state. Reading from this file +returns what states are supported, which is hard-coded to 'standby' +(Power-On Suspend), 'mem' (Suspend-to-RAM), and 'disk' +(Suspend-to-Disk). + +Writing to this file one of those strings causes the system to +transition into that state. Please see the file +Documentation/power/states.txt for a description of each of those +states. + + +/sys/power/disk controls the operating mode of the suspend-to-disk +mechanism. Suspend-to-disk can be handled in several ways. We have a +few options for putting the system to sleep - using the platform driver +(e.g. ACPI or other suspend_ops), powering off the system or rebooting the +system (for testing). + +Additionally, /sys/power/disk can be used to turn on one of the two testing +modes of the suspend-to-disk mechanism: 'testproc' or 'test'. If the +suspend-to-disk mechanism is in the 'testproc' mode, writing 'disk' to +/sys/power/state will cause the kernel to disable nonboot CPUs and freeze +tasks, wait for 5 seconds, unfreeze tasks and enable nonboot CPUs. If it is +in the 'test' mode, writing 'disk' to /sys/power/state will cause the kernel +to disable nonboot CPUs and freeze tasks, shrink memory, suspend devices, wait +for 5 seconds, resume devices, unfreeze tasks and enable nonboot CPUs. Then, +we are able to look in the log messages and work out, for example, which code +is being slow and which device drivers are misbehaving. + +Reading from this file will display all supported modes and the currently +selected one in brackets, for example + + [shutdown] reboot test testproc + +Writing to this file will accept one of + + 'platform' (only if the platform supports it) + 'shutdown' + 'reboot' + 'testproc' + 'test' + +/sys/power/image_size controls the size of the image created by +the suspend-to-disk mechanism. It can be written a string +representing a non-negative integer that will be used as an upper +limit of the image size, in bytes. The suspend-to-disk mechanism will +do its best to ensure the image size will not exceed that number. However, +if this turns out to be impossible, it will try to suspend anyway using the +smallest image possible. In particular, if "0" is written to this file, the +suspend image will be as small as possible. + +Reading from this file will display the current image size limit, which +is set to 500 MB by default. + +/sys/power/pm_trace controls the code which saves the last PM event point in +the RTC across reboots, so that you can debug a machine that just hangs +during suspend (or more commonly, during resume). Namely, the RTC is only +used to save the last PM event point if this file contains '1'. Initially it +contains '0' which may be changed to '1' by writing a string representing a +nonzero integer into it. + +To use this debugging feature you should attempt to suspend the machine, then +reboot it and run + + dmesg -s 1000000 | grep 'hash matches' + +CAUTION: Using it will cause your machine's real-time (CMOS) clock to be +set to a random invalid time after a resume. diff --git a/Documentation/power/notifiers.txt b/Documentation/power/notifiers.txt new file mode 100644 index 0000000..ae1b7ec --- /dev/null +++ b/Documentation/power/notifiers.txt @@ -0,0 +1,58 @@ +Suspend notifiers + (C) 2007 Rafael J. Wysocki <rjw@sisk.pl>, GPL + +There are some operations that device drivers may want to carry out in their +.suspend() routines, but shouldn't, because they can cause the hibernation or +suspend to fail. For example, a driver may want to allocate a substantial amount +of memory (like 50 MB) in .suspend(), but that shouldn't be done after the +swsusp's memory shrinker has run. + +Also, there may be some operations, that subsystems want to carry out before a +hibernation/suspend or after a restore/resume, requiring the system to be fully +functional, so the drivers' .suspend() and .resume() routines are not suitable +for this purpose. For example, device drivers may want to upload firmware to +their devices after a restore from a hibernation image, but they cannot do it by +calling request_firmware() from their .resume() routines (user land processes +are frozen at this point). The solution may be to load the firmware into +memory before processes are frozen and upload it from there in the .resume() +routine. Of course, a hibernation notifier may be used for this purpose. + +The subsystems that have such needs can register suspend notifiers that will be +called upon the following events by the suspend core: + +PM_HIBERNATION_PREPARE The system is going to hibernate or suspend, tasks will + be frozen immediately. + +PM_POST_HIBERNATION The system memory state has been restored from a + hibernation image or an error occured during the + hibernation. Device drivers' .resume() callbacks have + been executed and tasks have been thawed. + +PM_RESTORE_PREPARE The system is going to restore a hibernation image. + If all goes well the restored kernel will issue a + PM_POST_HIBERNATION notification. + +PM_POST_RESTORE An error occurred during the hibernation restore. + Device drivers' .resume() callbacks have been executed + and tasks have been thawed. + +PM_SUSPEND_PREPARE The system is preparing for a suspend. + +PM_POST_SUSPEND The system has just resumed or an error occured during + the suspend. Device drivers' .resume() callbacks have + been executed and tasks have been thawed. + +It is generally assumed that whatever the notifiers do for +PM_HIBERNATION_PREPARE, should be undone for PM_POST_HIBERNATION. Analogously, +operations performed for PM_SUSPEND_PREPARE should be reversed for +PM_POST_SUSPEND. Additionally, all of the notifiers are called for +PM_POST_HIBERNATION if one of them fails for PM_HIBERNATION_PREPARE, and +all of the notifiers are called for PM_POST_SUSPEND if one of them fails for +PM_SUSPEND_PREPARE. + +The hibernation and suspend notifiers are called with pm_mutex held. They are +defined in the usual way, but their last argument is meaningless (it is always +NULL). To register and/or unregister a suspend notifier use the functions +register_pm_notifier() and unregister_pm_notifier(), respectively, defined in +include/linux/suspend.h . If you don't need to unregister the notifier, you can +also use the pm_notifier() macro defined in include/linux/suspend.h . diff --git a/Documentation/power/pci.txt b/Documentation/power/pci.txt new file mode 100644 index 0000000..dd8fe43 --- /dev/null +++ b/Documentation/power/pci.txt @@ -0,0 +1,299 @@ + +PCI Power Management +~~~~~~~~~~~~~~~~~~~~ + +An overview of the concepts and the related functions in the Linux kernel + +Patrick Mochel <mochel@transmeta.com> +(and others) + +--------------------------------------------------------------------------- + +1. Overview +2. How the PCI Subsystem Does Power Management +3. PCI Utility Functions +4. PCI Device Drivers +5. Resources + +1. Overview +~~~~~~~~~~~ + +The PCI Power Management Specification was introduced between the PCI 2.1 and +PCI 2.2 Specifications. It a standard interface for controlling various +power management operations. + +Implementation of the PCI PM Spec is optional, as are several sub-components of +it. If a device supports the PCI PM Spec, the device will have an 8 byte +capability field in its PCI configuration space. This field is used to describe +and control the standard PCI power management features. + +The PCI PM spec defines 4 operating states for devices (D0 - D3) and for buses +(B0 - B3). The higher the number, the less power the device consumes. However, +the higher the number, the longer the latency is for the device to return to +an operational state (D0). + +There are actually two D3 states. When someone talks about D3, they usually +mean D3hot, which corresponds to an ACPI D2 state (power is reduced, the +device may lose some context). But they may also mean D3cold, which is an +ACPI D3 state (power is fully off, all state was discarded); or both. + +Bus power management is not covered in this version of this document. + +Note that all PCI devices support D0 and D3cold by default, regardless of +whether or not they implement any of the PCI PM spec. + +The possible state transitions that a device can undergo are: + ++---------------------------+ +| Current State | New State | ++---------------------------+ +| D0 | D1, D2, D3| ++---------------------------+ +| D1 | D2, D3 | ++---------------------------+ +| D2 | D3 | ++---------------------------+ +| D1, D2, D3 | D0 | ++---------------------------+ + +Note that when the system is entering a global suspend state, all devices will +be placed into D3 and when resuming, all devices will be placed into D0. +However, when the system is running, other state transitions are possible. + +2. How The PCI Subsystem Handles Power Management +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The PCI suspend/resume functionality is accessed indirectly via the Power +Management subsystem. At boot, the PCI driver registers a power management +callback with that layer. Upon entering a suspend state, the PM layer iterates +through all of its registered callbacks. This currently takes place only during +APM state transitions. + +Upon going to sleep, the PCI subsystem walks its device tree twice. Both times, +it does a depth first walk of the device tree. The first walk saves each of the +device's state and checks for devices that will prevent the system from entering +a global power state. The next walk then places the devices in a low power +state. + +The first walk allows a graceful recovery in the event of a failure, since none +of the devices have actually been powered down. + +In both walks, in particular the second, all children of a bridge are touched +before the actual bridge itself. This allows the bridge to retain power while +its children are being accessed. + +Upon resuming from sleep, just the opposite must be true: all bridges must be +powered on and restored before their children are powered on. This is easily +accomplished with a breadth-first walk of the PCI device tree. + + +3. PCI Utility Functions +~~~~~~~~~~~~~~~~~~~~~~~~ + +These are helper functions designed to be called by individual device drivers. +Assuming that a device behaves as advertised, these should be applicable in most +cases. However, results may vary. + +Note that these functions are never implicitly called for the driver. The driver +is always responsible for deciding when and if to call these. + + +pci_save_state +-------------- + +Usage: + pci_save_state(struct pci_dev *dev); + +Description: + Save first 64 bytes of PCI config space, along with any additional + PCI-Express or PCI-X information. + + +pci_restore_state +----------------- + +Usage: + pci_restore_state(struct pci_dev *dev); + +Description: + Restore previously saved config space. + + +pci_set_power_state +------------------- + +Usage: + pci_set_power_state(struct pci_dev *dev, pci_power_t state); + +Description: + Transition device to low power state using PCI PM Capabilities + registers. + + Will fail under one of the following conditions: + - If state is less than current state, but not D0 (illegal transition) + - Device doesn't support PM Capabilities + - Device does not support requested state + + +pci_enable_wake +--------------- + +Usage: + pci_enable_wake(struct pci_dev *dev, pci_power_t state, int enable); + +Description: + Enable device to generate PME# during low power state using PCI PM + Capabilities. + + Checks whether if device supports generating PME# from requested state + and fail if it does not, unless enable == 0 (request is to disable wake + events, which is implicit if it doesn't even support it in the first + place). + + Note that the PMC Register in the device's PM Capabilities has a bitmask + of the states it supports generating PME# from. D3hot is bit 3 and + D3cold is bit 4. So, while a value of 4 as the state may not seem + semantically correct, it is. + + +4. PCI Device Drivers +~~~~~~~~~~~~~~~~~~~~~ + +These functions are intended for use by individual drivers, and are defined in +struct pci_driver: + + int (*suspend) (struct pci_dev *dev, pm_message_t state); + int (*resume) (struct pci_dev *dev); + + +suspend +------- + +Usage: + +if (dev->driver && dev->driver->suspend) + dev->driver->suspend(dev,state); + +A driver uses this function to actually transition the device into a low power +state. This should include disabling I/O, IRQs, and bus-mastering, as well as +physically transitioning the device to a lower power state; it may also include +calls to pci_enable_wake(). + +Bus mastering may be disabled by doing: + +pci_disable_device(dev); + +For devices that support the PCI PM Spec, this may be used to set the device's +power state to match the suspend() parameter: + +pci_set_power_state(dev,state); + +The driver is also responsible for disabling any other device-specific features +(e.g blanking screen, turning off on-card memory, etc). + +The driver should be sure to track the current state of the device, as it may +obviate the need for some operations. + +The driver should update the current_state field in its pci_dev structure in +this function, except for PM-capable devices when pci_set_power_state is used. + +resume +------ + +Usage: + +if (dev->driver && dev->driver->resume) + dev->driver->resume(dev) + +The resume callback may be called from any power state, and is always meant to +transition the device to the D0 state. + +The driver is responsible for reenabling any features of the device that had +been disabled during previous suspend calls, such as IRQs and bus mastering, +as well as calling pci_restore_state(). + +If the device is currently in D3, it may need to be reinitialized in resume(). + + * Some types of devices, like bus controllers, will preserve context in D3hot + (using Vcc power). Their drivers will often want to avoid re-initializing + them after re-entering D0 (perhaps to avoid resetting downstream devices). + + * Other kinds of devices in D3hot will discard device context as part of a + soft reset when re-entering the D0 state. + + * Devices resuming from D3cold always go through a power-on reset. Some + device context can also be preserved using Vaux power. + + * Some systems hide D3cold resume paths from drivers. For example, on PCs + the resume path for suspend-to-disk often runs BIOS powerup code, which + will sometimes re-initialize the device. + +To handle resets during D3 to D0 transitions, it may be convenient to share +device initialization code between probe() and resume(). Device parameters +can also be saved before the driver suspends into D3, avoiding re-probe. + +If the device supports the PCI PM Spec, it can use this to physically transition +the device to D0: + +pci_set_power_state(dev,0); + +Note that if the entire system is transitioning out of a global sleep state, all +devices will be placed in the D0 state, so this is not necessary. However, in +the event that the device is placed in the D3 state during normal operation, +this call is necessary. It is impossible to determine which of the two events is +taking place in the driver, so it is always a good idea to make that call. + +The driver should take note of the state that it is resuming from in order to +ensure correct (and speedy) operation. + +The driver should update the current_state field in its pci_dev structure in +this function, except for PM-capable devices when pci_set_power_state is used. + + + +A reference implementation +------------------------- +.suspend() +{ + /* driver specific operations */ + + /* Disable IRQ */ + free_irq(); + /* If using MSI */ + pci_disable_msi(); + + pci_save_state(); + pci_enable_wake(); + /* Disable IO/bus master/irq router */ + pci_disable_device(); + pci_set_power_state(pci_choose_state()); +} + +.resume() +{ + pci_set_power_state(PCI_D0); + pci_restore_state(); + /* device's irq possibly is changed, driver should take care */ + pci_enable_device(); + pci_set_master(); + + /* if using MSI, device's vector possibly is changed */ + pci_enable_msi(); + + request_irq(); + /* driver specific operations; */ +} + +This is a typical implementation. Drivers can slightly change the order +of the operations in the implementation, ignore some operations or add +more driver specific operations in it, but drivers should do something like +this on the whole. + +5. Resources +~~~~~~~~~~~~ + +PCI Local Bus Specification +PCI Bus Power Management Interface Specification + + http://www.pcisig.com + diff --git a/Documentation/power/pm_qos_interface.txt b/Documentation/power/pm_qos_interface.txt new file mode 100644 index 0000000..c40866e --- /dev/null +++ b/Documentation/power/pm_qos_interface.txt @@ -0,0 +1,64 @@ +PM Quality Of Service Interface. + +This interface provides a kernel and user mode interface for registering +performance expectations by drivers, subsystems and user space applications on +one of the parameters. + +Currently we have {cpu_dma_latency, network_latency, network_throughput} as the +initial set of pm_qos parameters. + +Each parameters have defined units: + * latency: usec + * timeout: usec + * throughput: kbs (kilo bit / sec) + +The infrastructure exposes multiple misc device nodes one per implemented +parameter. The set of parameters implement is defined by pm_qos_power_init() +and pm_qos_params.h. This is done because having the available parameters +being runtime configurable or changeable from a driver was seen as too easy to +abuse. + +For each parameter a list of performance requirements is maintained along with +an aggregated target value. The aggregated target value is updated with +changes to the requirement list or elements of the list. Typically the +aggregated target value is simply the max or min of the requirement values held +in the parameter list elements. + +From kernel mode the use of this interface is simple: +pm_qos_add_requirement(param_id, name, target_value): +Will insert a named element in the list for that identified PM_QOS parameter +with the target value. Upon change to this list the new target is recomputed +and any registered notifiers are called only if the target value is now +different. + +pm_qos_update_requirement(param_id, name, new_target_value): +Will search the list identified by the param_id for the named list element and +then update its target value, calling the notification tree if the aggregated +target is changed. with that name is already registered. + +pm_qos_remove_requirement(param_id, name): +Will search the identified list for the named element and remove it, after +removal it will update the aggregate target and call the notification tree if +the target was changed as a result of removing the named requirement. + + +From user mode: +Only processes can register a pm_qos requirement. To provide for automatic +cleanup for process the interface requires the process to register its +parameter requirements in the following way: + +To register the default pm_qos target for the specific parameter, the process +must open one of /dev/[cpu_dma_latency, network_latency, network_throughput] + +As long as the device node is held open that process has a registered +requirement on the parameter. The name of the requirement is "process_<PID>" +derived from the current->pid from within the open system call. + +To change the requested target value the process needs to write a s32 value to +the open device node. This translates to a pm_qos_update_requirement call. + +To remove the user mode request for a target value simply close the device +node. + + + diff --git a/Documentation/power/power_supply_class.txt b/Documentation/power/power_supply_class.txt new file mode 100644 index 0000000..c6cd495 --- /dev/null +++ b/Documentation/power/power_supply_class.txt @@ -0,0 +1,173 @@ +Linux power supply class +======================== + +Synopsis +~~~~~~~~ +Power supply class used to represent battery, UPS, AC or DC power supply +properties to user-space. + +It defines core set of attributes, which should be applicable to (almost) +every power supply out there. Attributes are available via sysfs and uevent +interfaces. + +Each attribute has well defined meaning, up to unit of measure used. While +the attributes provided are believed to be universally applicable to any +power supply, specific monitoring hardware may not be able to provide them +all, so any of them may be skipped. + +Power supply class is extensible, and allows to define drivers own attributes. +The core attribute set is subject to the standard Linux evolution (i.e. +if it will be found that some attribute is applicable to many power supply +types or their drivers, it can be added to the core set). + +It also integrates with LED framework, for the purpose of providing +typically expected feedback of battery charging/fully charged status and +AC/USB power supply online status. (Note that specific details of the +indication (including whether to use it at all) are fully controllable by +user and/or specific machine defaults, per design principles of LED +framework). + + +Attributes/properties +~~~~~~~~~~~~~~~~~~~~~ +Power supply class has predefined set of attributes, this eliminates code +duplication across drivers. Power supply class insist on reusing its +predefined attributes *and* their units. + +So, userspace gets predictable set of attributes and their units for any +kind of power supply, and can process/present them to a user in consistent +manner. Results for different power supplies and machines are also directly +comparable. + +See drivers/power/ds2760_battery.c and drivers/power/pda_power.c for the +example how to declare and handle attributes. + + +Units +~~~~~ +Quoting include/linux/power_supply.h: + + All voltages, currents, charges, energies, time and temperatures in µV, + µA, µAh, µWh, seconds and tenths of degree Celsius unless otherwise + stated. It's driver's job to convert its raw values to units in which + this class operates. + + +Attributes/properties detailed +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +~ ~ ~ ~ ~ ~ ~ Charge/Energy/Capacity - how to not confuse ~ ~ ~ ~ ~ ~ ~ +~ ~ +~ Because both "charge" (µAh) and "energy" (µWh) represents "capacity" ~ +~ of battery, this class distinguish these terms. Don't mix them! ~ +~ ~ +~ CHARGE_* attributes represents capacity in µAh only. ~ +~ ENERGY_* attributes represents capacity in µWh only. ~ +~ CAPACITY attribute represents capacity in *percents*, from 0 to 100. ~ +~ ~ +~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ + +Postfixes: +_AVG - *hardware* averaged value, use it if your hardware is really able to +report averaged values. +_NOW - momentary/instantaneous values. + +STATUS - this attribute represents operating status (charging, full, +discharging (i.e. powering a load), etc.). This corresponds to +BATTERY_STATUS_* values, as defined in battery.h. + +HEALTH - represents health of the battery, values corresponds to +POWER_SUPPLY_HEALTH_*, defined in battery.h. + +VOLTAGE_MAX_DESIGN, VOLTAGE_MIN_DESIGN - design values for maximal and +minimal power supply voltages. Maximal/minimal means values of voltages +when battery considered "full"/"empty" at normal conditions. Yes, there is +no direct relation between voltage and battery capacity, but some dumb +batteries use voltage for very approximated calculation of capacity. +Battery driver also can use this attribute just to inform userspace +about maximal and minimal voltage thresholds of a given battery. + +VOLTAGE_MAX, VOLTAGE_MIN - same as _DESIGN voltage values except that +these ones should be used if hardware could only guess (measure and +retain) the thresholds of a given power supply. + +CHARGE_FULL_DESIGN, CHARGE_EMPTY_DESIGN - design charge values, when +battery considered full/empty. + +ENERGY_FULL_DESIGN, ENERGY_EMPTY_DESIGN - same as above but for energy. + +CHARGE_FULL, CHARGE_EMPTY - These attributes means "last remembered value +of charge when battery became full/empty". It also could mean "value of +charge when battery considered full/empty at given conditions (temperature, +age)". I.e. these attributes represents real thresholds, not design values. + +CHARGE_COUNTER - the current charge counter (in µAh). This could easily +be negative; there is no empty or full value. It is only useful for +relative, time-based measurements. + +ENERGY_FULL, ENERGY_EMPTY - same as above but for energy. + +CAPACITY - capacity in percents. + +TEMP - temperature of the power supply. +TEMP_AMBIENT - ambient temperature. + +TIME_TO_EMPTY - seconds left for battery to be considered empty (i.e. +while battery powers a load) +TIME_TO_FULL - seconds left for battery to be considered full (i.e. +while battery is charging) + + +Battery <-> external power supply interaction +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Often power supplies are acting as supplies and supplicants at the same +time. Batteries are good example. So, batteries usually care if they're +externally powered or not. + +For that case, power supply class implements notification mechanism for +batteries. + +External power supply (AC) lists supplicants (batteries) names in +"supplied_to" struct member, and each power_supply_changed() call +issued by external power supply will notify supplicants via +external_power_changed callback. + + +QA +~~ +Q: Where is POWER_SUPPLY_PROP_XYZ attribute? +A: If you cannot find attribute suitable for your driver needs, feel free + to add it and send patch along with your driver. + + The attributes available currently are the ones currently provided by the + drivers written. + + Good candidates to add in future: model/part#, cycle_time, manufacturer, + etc. + + +Q: I have some very specific attribute (e.g. battery color), should I add + this attribute to standard ones? +A: Most likely, no. Such attribute can be placed in the driver itself, if + it is useful. Of course, if the attribute in question applicable to + large set of batteries, provided by many drivers, and/or comes from + some general battery specification/standard, it may be a candidate to + be added to the core attribute set. + + +Q: Suppose, my battery monitoring chip/firmware does not provides capacity + in percents, but provides charge_{now,full,empty}. Should I calculate + percentage capacity manually, inside the driver, and register CAPACITY + attribute? The same question about time_to_empty/time_to_full. +A: Most likely, no. This class is designed to export properties which are + directly measurable by the specific hardware available. + + Inferring not available properties using some heuristics or mathematical + model is not subject of work for a battery driver. Such functionality + should be factored out, and in fact, apm_power, the driver to serve + legacy APM API on top of power supply class, uses a simple heuristic of + approximating remaining battery capacity based on its charge, current, + voltage and so on. But full-fledged battery model is likely not subject + for kernel at all, as it would require floating point calculation to deal + with things like differential equations and Kalman filters. This is + better be handled by batteryd/libbattery, yet to be written. diff --git a/Documentation/power/regulator/consumer.txt b/Documentation/power/regulator/consumer.txt new file mode 100644 index 0000000..82b7a43 --- /dev/null +++ b/Documentation/power/regulator/consumer.txt @@ -0,0 +1,182 @@ +Regulator Consumer Driver Interface +=================================== + +This text describes the regulator interface for consumer device drivers. +Please see overview.txt for a description of the terms used in this text. + + +1. Consumer Regulator Access (static & dynamic drivers) +======================================================= + +A consumer driver can get access to it's supply regulator by calling :- + +regulator = regulator_get(dev, "Vcc"); + +The consumer passes in it's struct device pointer and power supply ID. The core +then finds the correct regulator by consulting a machine specific lookup table. +If the lookup is successful then this call will return a pointer to the struct +regulator that supplies this consumer. + +To release the regulator the consumer driver should call :- + +regulator_put(regulator); + +Consumers can be supplied by more than one regulator e.g. codec consumer with +analog and digital supplies :- + +digital = regulator_get(dev, "Vcc"); /* digital core */ +analog = regulator_get(dev, "Avdd"); /* analog */ + +The regulator access functions regulator_get() and regulator_put() will +usually be called in your device drivers probe() and remove() respectively. + + +2. Regulator Output Enable & Disable (static & dynamic drivers) +==================================================================== + +A consumer can enable it's power supply by calling:- + +int regulator_enable(regulator); + +NOTE: The supply may already be enabled before regulator_enabled() is called. +This may happen if the consumer shares the regulator or the regulator has been +previously enabled by bootloader or kernel board initialization code. + +A consumer can determine if a regulator is enabled by calling :- + +int regulator_is_enabled(regulator); + +This will return > zero when the regulator is enabled. + + +A consumer can disable it's supply when no longer needed by calling :- + +int regulator_disable(regulator); + +NOTE: This may not disable the supply if it's shared with other consumers. The +regulator will only be disabled when the enabled reference count is zero. + +Finally, a regulator can be forcefully disabled in the case of an emergency :- + +int regulator_force_disable(regulator); + +NOTE: this will immediately and forcefully shutdown the regulator output. All +consumers will be powered off. + + +3. Regulator Voltage Control & Status (dynamic drivers) +====================================================== + +Some consumer drivers need to be able to dynamically change their supply +voltage to match system operating points. e.g. CPUfreq drivers can scale +voltage along with frequency to save power, SD drivers may need to select the +correct card voltage, etc. + +Consumers can control their supply voltage by calling :- + +int regulator_set_voltage(regulator, min_uV, max_uV); + +Where min_uV and max_uV are the minimum and maximum acceptable voltages in +microvolts. + +NOTE: this can be called when the regulator is enabled or disabled. If called +when enabled, then the voltage changes instantly, otherwise the voltage +configuration changes and the voltage is physically set when the regulator is +next enabled. + +The regulators configured voltage output can be found by calling :- + +int regulator_get_voltage(regulator); + +NOTE: get_voltage() will return the configured output voltage whether the +regulator is enabled or disabled and should NOT be used to determine regulator +output state. However this can be used in conjunction with is_enabled() to +determine the regulator physical output voltage. + + +4. Regulator Current Limit Control & Status (dynamic drivers) +=========================================================== + +Some consumer drivers need to be able to dynamically change their supply +current limit to match system operating points. e.g. LCD backlight driver can +change the current limit to vary the backlight brightness, USB drivers may want +to set the limit to 500mA when supplying power. + +Consumers can control their supply current limit by calling :- + +int regulator_set_current_limit(regulator, min_uV, max_uV); + +Where min_uA and max_uA are the minimum and maximum acceptable current limit in +microamps. + +NOTE: this can be called when the regulator is enabled or disabled. If called +when enabled, then the current limit changes instantly, otherwise the current +limit configuration changes and the current limit is physically set when the +regulator is next enabled. + +A regulators current limit can be found by calling :- + +int regulator_get_current_limit(regulator); + +NOTE: get_current_limit() will return the current limit whether the regulator +is enabled or disabled and should not be used to determine regulator current +load. + + +5. Regulator Operating Mode Control & Status (dynamic drivers) +============================================================= + +Some consumers can further save system power by changing the operating mode of +their supply regulator to be more efficient when the consumers operating state +changes. e.g. consumer driver is idle and subsequently draws less current + +Regulator operating mode can be changed indirectly or directly. + +Indirect operating mode control. +-------------------------------- +Consumer drivers can request a change in their supply regulator operating mode +by calling :- + +int regulator_set_optimum_mode(struct regulator *regulator, int load_uA); + +This will cause the core to recalculate the total load on the regulator (based +on all it's consumers) and change operating mode (if necessary and permitted) +to best match the current operating load. + +The load_uA value can be determined from the consumers datasheet. e.g.most +datasheets have tables showing the max current consumed in certain situations. + +Most consumers will use indirect operating mode control since they have no +knowledge of the regulator or whether the regulator is shared with other +consumers. + +Direct operating mode control. +------------------------------ +Bespoke or tightly coupled drivers may want to directly control regulator +operating mode depending on their operating point. This can be achieved by +calling :- + +int regulator_set_mode(struct regulator *regulator, unsigned int mode); +unsigned int regulator_get_mode(struct regulator *regulator); + +Direct mode will only be used by consumers that *know* about the regulator and +are not sharing the regulator with other consumers. + + +6. Regulator Events +=================== +Regulators can notify consumers of external events. Events could be received by +consumers under regulator stress or failure conditions. + +Consumers can register interest in regulator events by calling :- + +int regulator_register_notifier(struct regulator *regulator, + struct notifier_block *nb); + +Consumers can uregister interest by calling :- + +int regulator_unregister_notifier(struct regulator *regulator, + struct notifier_block *nb); + +Regulators use the kernel notifier framework to send event to thier interested +consumers. diff --git a/Documentation/power/regulator/machine.txt b/Documentation/power/regulator/machine.txt new file mode 100644 index 0000000..ce3487d --- /dev/null +++ b/Documentation/power/regulator/machine.txt @@ -0,0 +1,93 @@ +Regulator Machine Driver Interface +=================================== + +The regulator machine driver interface is intended for board/machine specific +initialisation code to configure the regulator subsystem. + +Consider the following machine :- + + Regulator-1 -+-> Regulator-2 --> [Consumer A @ 1.8 - 2.0V] + | + +-> [Consumer B @ 3.3V] + +The drivers for consumers A & B must be mapped to the correct regulator in +order to control their power supply. This mapping can be achieved in machine +initialisation code by creating a struct regulator_consumer_supply for +each regulator. + +struct regulator_consumer_supply { + struct device *dev; /* consumer */ + const char *supply; /* consumer supply - e.g. "vcc" */ +}; + +e.g. for the machine above + +static struct regulator_consumer_supply regulator1_consumers[] = { +{ + .dev = &platform_consumerB_device.dev, + .supply = "Vcc", +},}; + +static struct regulator_consumer_supply regulator2_consumers[] = { +{ + .dev = &platform_consumerA_device.dev, + .supply = "Vcc", +},}; + +This maps Regulator-1 to the 'Vcc' supply for Consumer B and maps Regulator-2 +to the 'Vcc' supply for Consumer A. + +Constraints can now be registered by defining a struct regulator_init_data +for each regulator power domain. This structure also maps the consumers +to their supply regulator :- + +static struct regulator_init_data regulator1_data = { + .constraints = { + .min_uV = 3300000, + .max_uV = 3300000, + .valid_modes_mask = REGULATOR_MODE_NORMAL, + }, + .num_consumer_supplies = ARRAY_SIZE(regulator1_consumers), + .consumer_supplies = regulator1_consumers, +}; + +Regulator-1 supplies power to Regulator-2. This relationship must be registered +with the core so that Regulator-1 is also enabled when Consumer A enables it's +supply (Regulator-2). The supply regulator is set by the supply_regulator_dev +field below:- + +static struct regulator_init_data regulator2_data = { + .supply_regulator_dev = &platform_regulator1_device.dev, + .constraints = { + .min_uV = 1800000, + .max_uV = 2000000, + .valid_ops_mask = REGULATOR_CHANGE_VOLTAGE, + .valid_modes_mask = REGULATOR_MODE_NORMAL, + }, + .num_consumer_supplies = ARRAY_SIZE(regulator2_consumers), + .consumer_supplies = regulator2_consumers, +}; + +Finally the regulator devices must be registered in the usual manner. + +static struct platform_device regulator_devices[] = { +{ + .name = "regulator", + .id = DCDC_1, + .dev = { + .platform_data = ®ulator1_data, + }, +}, +{ + .name = "regulator", + .id = DCDC_2, + .dev = { + .platform_data = ®ulator2_data, + }, +}, +}; +/* register regulator 1 device */ +platform_device_register(&wm8350_regulator_devices[0]); + +/* register regulator 2 device */ +platform_device_register(&wm8350_regulator_devices[1]); diff --git a/Documentation/power/regulator/overview.txt b/Documentation/power/regulator/overview.txt new file mode 100644 index 0000000..bdcb332 --- /dev/null +++ b/Documentation/power/regulator/overview.txt @@ -0,0 +1,171 @@ +Linux voltage and current regulator framework +============================================= + +About +===== + +This framework is designed to provide a standard kernel interface to control +voltage and current regulators. + +The intention is to allow systems to dynamically control regulator power output +in order to save power and prolong battery life. This applies to both voltage +regulators (where voltage output is controllable) and current sinks (where +current limit is controllable). + +(C) 2008 Wolfson Microelectronics PLC. +Author: Liam Girdwood <lg@opensource.wolfsonmicro.com> + + +Nomenclature +============ + +Some terms used in this document:- + + o Regulator - Electronic device that supplies power to other devices. + Most regulators can enable and disable their output whilst + some can control their output voltage and or current. + + Input Voltage -> Regulator -> Output Voltage + + + o PMIC - Power Management IC. An IC that contains numerous regulators + and often contains other susbsystems. + + + o Consumer - Electronic device that is supplied power by a regulator. + Consumers can be classified into two types:- + + Static: consumer does not change it's supply voltage or + current limit. It only needs to enable or disable it's + power supply. It's supply voltage is set by the hardware, + bootloader, firmware or kernel board initialisation code. + + Dynamic: consumer needs to change it's supply voltage or + current limit to meet operation demands. + + + o Power Domain - Electronic circuit that is supplied it's input power by the + output power of a regulator, switch or by another power + domain. + + The supply regulator may be behind a switch(s). i.e. + + Regulator -+-> Switch-1 -+-> Switch-2 --> [Consumer A] + | | + | +-> [Consumer B], [Consumer C] + | + +-> [Consumer D], [Consumer E] + + That is one regulator and three power domains: + + Domain 1: Switch-1, Consumers D & E. + Domain 2: Switch-2, Consumers B & C. + Domain 3: Consumer A. + + and this represents a "supplies" relationship: + + Domain-1 --> Domain-2 --> Domain-3. + + A power domain may have regulators that are supplied power + by other regulators. i.e. + + Regulator-1 -+-> Regulator-2 -+-> [Consumer A] + | + +-> [Consumer B] + + This gives us two regulators and two power domains: + + Domain 1: Regulator-2, Consumer B. + Domain 2: Consumer A. + + and a "supplies" relationship: + + Domain-1 --> Domain-2 + + + o Constraints - Constraints are used to define power levels for performance + and hardware protection. Constraints exist at three levels: + + Regulator Level: This is defined by the regulator hardware + operating parameters and is specified in the regulator + datasheet. i.e. + + - voltage output is in the range 800mV -> 3500mV. + - regulator current output limit is 20mA @ 5V but is + 10mA @ 10V. + + Power Domain Level: This is defined in software by kernel + level board initialisation code. It is used to constrain a + power domain to a particular power range. i.e. + + - Domain-1 voltage is 3300mV + - Domain-2 voltage is 1400mV -> 1600mV + - Domain-3 current limit is 0mA -> 20mA. + + Consumer Level: This is defined by consumer drivers + dynamically setting voltage or current limit levels. + + e.g. a consumer backlight driver asks for a current increase + from 5mA to 10mA to increase LCD illumination. This passes + to through the levels as follows :- + + Consumer: need to increase LCD brightness. Lookup and + request next current mA value in brightness table (the + consumer driver could be used on several different + personalities based upon the same reference device). + + Power Domain: is the new current limit within the domain + operating limits for this domain and system state (e.g. + battery power, USB power) + + Regulator Domains: is the new current limit within the + regulator operating parameters for input/ouput voltage. + + If the regulator request passes all the constraint tests + then the new regulator value is applied. + + +Design +====== + +The framework is designed and targeted at SoC based devices but may also be +relevant to non SoC devices and is split into the following four interfaces:- + + + 1. Consumer driver interface. + + This uses a similar API to the kernel clock interface in that consumer + drivers can get and put a regulator (like they can with clocks atm) and + get/set voltage, current limit, mode, enable and disable. This should + allow consumers complete control over their supply voltage and current + limit. This also compiles out if not in use so drivers can be reused in + systems with no regulator based power control. + + See Documentation/power/regulator/consumer.txt + + 2. Regulator driver interface. + + This allows regulator drivers to register their regulators and provide + operations to the core. It also has a notifier call chain for propagating + regulator events to clients. + + See Documentation/power/regulator/regulator.txt + + 3. Machine interface. + + This interface is for machine specific code and allows the creation of + voltage/current domains (with constraints) for each regulator. It can + provide regulator constraints that will prevent device damage through + overvoltage or over current caused by buggy client drivers. It also + allows the creation of a regulator tree whereby some regulators are + supplied by others (similar to a clock tree). + + See Documentation/power/regulator/machine.txt + + 4. Userspace ABI. + + The framework also exports a lot of useful voltage/current/opmode data to + userspace via sysfs. This could be used to help monitor device power + consumption and status. + + See Documentation/ABI/testing/regulator-sysfs.txt diff --git a/Documentation/power/regulator/regulator.txt b/Documentation/power/regulator/regulator.txt new file mode 100644 index 0000000..4200acc --- /dev/null +++ b/Documentation/power/regulator/regulator.txt @@ -0,0 +1,30 @@ +Regulator Driver Interface +========================== + +The regulator driver interface is relatively simple and designed to allow +regulator drivers to register their services with the core framework. + + +Registration +============ + +Drivers can register a regulator by calling :- + +struct regulator_dev *regulator_register(struct device *dev, + struct regulator_desc *regulator_desc); + +This will register the regulators capabilities and operations to the regulator +core. + +Regulators can be unregistered by calling :- + +void regulator_unregister(struct regulator_dev *rdev); + + +Regulator Events +================ +Regulators can send events (e.g. over temp, under voltage, etc) to consumer +drivers by calling :- + +int regulator_notifier_call_chain(struct regulator_dev *rdev, + unsigned long event, void *data); diff --git a/Documentation/power/s2ram.txt b/Documentation/power/s2ram.txt new file mode 100644 index 0000000..2ebdc60 --- /dev/null +++ b/Documentation/power/s2ram.txt @@ -0,0 +1,74 @@ + How to get s2ram working + ~~~~~~~~~~~~~~~~~~~~~~~~ + 2006 Linus Torvalds + 2006 Pavel Machek + +1) Check suspend.sf.net, program s2ram there has long whitelist of + "known ok" machines, along with tricks to use on each one. + +2) If that does not help, try reading tricks.txt and + video.txt. Perhaps problem is as simple as broken module, and + simple module unload can fix it. + +3) You can use Linus' TRACE_RESUME infrastructure, described below. + + Using TRACE_RESUME + ~~~~~~~~~~~~~~~~~~ + +I've been working at making the machines I have able to STR, and almost +always it's a driver that is buggy. Thank God for the suspend/resume +debugging - the thing that Chuck tried to disable. That's often the _only_ +way to debug these things, and it's actually pretty powerful (but +time-consuming - having to insert TRACE_RESUME() markers into the device +driver that doesn't resume and recompile and reboot). + +Anyway, the way to debug this for people who are interested (have a +machine that doesn't boot) is: + + - enable PM_DEBUG, and PM_TRACE + + - use a script like this: + + #!/bin/sh + sync + echo 1 > /sys/power/pm_trace + echo mem > /sys/power/state + + to suspend + + - if it doesn't come back up (which is usually the problem), reboot by + holding the power button down, and look at the dmesg output for things + like + + Magic number: 4:156:725 + hash matches drivers/base/power/resume.c:28 + hash matches device 0000:01:00.0 + + which means that the last trace event was just before trying to resume + device 0000:01:00.0. Then figure out what driver is controlling that + device (lspci and /sys/devices/pci* is your friend), and see if you can + fix it, disable it, or trace into its resume function. + +For example, the above happens to be the VGA device on my EVO, which I +used to run with "radeonfb" (it's an ATI Radeon mobility). It turns out +that "radeonfb" simply cannot resume that device - it tries to set the +PLL's, and it just _hangs_. Using the regular VGA console and letting X +resume it instead works fine. + +NOTE +==== +pm_trace uses the system's Real Time Clock (RTC) to save the magic number. +Reason for this is that the RTC is the only reliably available piece of +hardware during resume operations where a value can be set that will +survive a reboot. + +Consequence is that after a resume (even if it is successful) your system +clock will have a value corresponding to the magic mumber instead of the +correct date/time! It is therefore advisable to use a program like ntp-date +or rdate to reset the correct date/time from an external time source when +using this trace option. + +As the clock keeps ticking it is also essential that the reboot is done +quickly after the resume failure. The trace option does not use the seconds +or the low order bits of the minutes of the RTC, but a too long delay will +corrupt the magic value. diff --git a/Documentation/power/states.txt b/Documentation/power/states.txt new file mode 100644 index 0000000..34800cc --- /dev/null +++ b/Documentation/power/states.txt @@ -0,0 +1,80 @@ + +System Power Management States + + +The kernel supports three power management states generically, though +each is dependent on platform support code to implement the low-level +details for each state. This file describes each state, what they are +commonly called, what ACPI state they map to, and what string to write +to /sys/power/state to enter that state + + +State: Standby / Power-On Suspend +ACPI State: S1 +String: "standby" + +This state offers minimal, though real, power savings, while providing +a very low-latency transition back to a working system. No operating +state is lost (the CPU retains power), so the system easily starts up +again where it left off. + +We try to put devices in a low-power state equivalent to D1, which +also offers low power savings, but low resume latency. Not all devices +support D1, and those that don't are left on. + +A transition from Standby to the On state should take about 1-2 +seconds. + + +State: Suspend-to-RAM +ACPI State: S3 +String: "mem" + +This state offers significant power savings as everything in the +system is put into a low-power state, except for memory, which is +placed in self-refresh mode to retain its contents. + +System and device state is saved and kept in memory. All devices are +suspended and put into D3. In many cases, all peripheral buses lose +power when entering STR, so devices must be able to handle the +transition back to the On state. + +For at least ACPI, STR requires some minimal boot-strapping code to +resume the system from STR. This may be true on other platforms. + +A transition from Suspend-to-RAM to the On state should take about +3-5 seconds. + + +State: Suspend-to-disk +ACPI State: S4 +String: "disk" + +This state offers the greatest power savings, and can be used even in +the absence of low-level platform support for power management. This +state operates similarly to Suspend-to-RAM, but includes a final step +of writing memory contents to disk. On resume, this is read and memory +is restored to its pre-suspend state. + +STD can be handled by the firmware or the kernel. If it is handled by +the firmware, it usually requires a dedicated partition that must be +setup via another operating system for it to use. Despite the +inconvenience, this method requires minimal work by the kernel, since +the firmware will also handle restoring memory contents on resume. + +For suspend-to-disk, a mechanism called swsusp called 'swsusp' (Swap +Suspend) is used to write memory contents to free swap space. +swsusp has some restrictive requirements, but should work in most +cases. Some, albeit outdated, documentation can be found in +Documentation/power/swsusp.txt. Alternatively, userspace can do most +of the actual suspend to disk work, see userland-swsusp.txt. + +Once memory state is written to disk, the system may either enter a +low-power state (like ACPI S4), or it may simply power down. Powering +down offers greater savings, and allows this mechanism to work on any +system. However, entering a real low-power state allows the user to +trigger wake up events (e.g. pressing a key or opening a laptop lid). + +A transition from Suspend-to-Disk to the On state should take about 30 +seconds, though it's typically a bit more with the current +implementation. diff --git a/Documentation/power/swsusp-and-swap-files.txt b/Documentation/power/swsusp-and-swap-files.txt new file mode 100644 index 0000000..f281886 --- /dev/null +++ b/Documentation/power/swsusp-and-swap-files.txt @@ -0,0 +1,60 @@ +Using swap files with software suspend (swsusp) + (C) 2006 Rafael J. Wysocki <rjw@sisk.pl> + +The Linux kernel handles swap files almost in the same way as it handles swap +partitions and there are only two differences between these two types of swap +areas: +(1) swap files need not be contiguous, +(2) the header of a swap file is not in the first block of the partition that +holds it. From the swsusp's point of view (1) is not a problem, because it is +already taken care of by the swap-handling code, but (2) has to be taken into +consideration. + +In principle the location of a swap file's header may be determined with the +help of appropriate filesystem driver. Unfortunately, however, it requires the +filesystem holding the swap file to be mounted, and if this filesystem is +journaled, it cannot be mounted during resume from disk. For this reason to +identify a swap file swsusp uses the name of the partition that holds the file +and the offset from the beginning of the partition at which the swap file's +header is located. For convenience, this offset is expressed in <PAGE_SIZE> +units. + +In order to use a swap file with swsusp, you need to: + +1) Create the swap file and make it active, eg. + +# dd if=/dev/zero of=<swap_file_path> bs=1024 count=<swap_file_size_in_k> +# mkswap <swap_file_path> +# swapon <swap_file_path> + +2) Use an application that will bmap the swap file with the help of the +FIBMAP ioctl and determine the location of the file's swap header, as the +offset, in <PAGE_SIZE> units, from the beginning of the partition which +holds the swap file. + +3) Add the following parameters to the kernel command line: + +resume=<swap_file_partition> resume_offset=<swap_file_offset> + +where <swap_file_partition> is the partition on which the swap file is located +and <swap_file_offset> is the offset of the swap header determined by the +application in 2) (of course, this step may be carried out automatically +by the same application that determines the swap file's header offset using the +FIBMAP ioctl) + +OR + +Use a userland suspend application that will set the partition and offset +with the help of the SNAPSHOT_SET_SWAP_AREA ioctl described in +Documentation/power/userland-swsusp.txt (this is the only method to suspend +to a swap file allowing the resume to be initiated from an initrd or initramfs +image). + +Now, swsusp will use the swap file in the same way in which it would use a swap +partition. In particular, the swap file has to be active (ie. be present in +/proc/swaps) so that it can be used for suspending. + +Note that if the swap file used for suspending is deleted and recreated, +the location of its header need not be the same as before. Thus every time +this happens the value of the "resume_offset=" kernel command line parameter +has to be updated. diff --git a/Documentation/power/swsusp-dmcrypt.txt b/Documentation/power/swsusp-dmcrypt.txt new file mode 100644 index 0000000..59931b4 --- /dev/null +++ b/Documentation/power/swsusp-dmcrypt.txt @@ -0,0 +1,138 @@ +Author: Andreas Steinmetz <ast@domdv.de> + + +How to use dm-crypt and swsusp together: +======================================== + +Some prerequisites: +You know how dm-crypt works. If not, visit the following web page: +http://www.saout.de/misc/dm-crypt/ +You have read Documentation/power/swsusp.txt and understand it. +You did read Documentation/initrd.txt and know how an initrd works. +You know how to create or how to modify an initrd. + +Now your system is properly set up, your disk is encrypted except for +the swap device(s) and the boot partition which may contain a mini +system for crypto setup and/or rescue purposes. You may even have +an initrd that does your current crypto setup already. + +At this point you want to encrypt your swap, too. Still you want to +be able to suspend using swsusp. This, however, means that you +have to be able to either enter a passphrase or that you read +the key(s) from an external device like a pcmcia flash disk +or an usb stick prior to resume. So you need an initrd, that sets +up dm-crypt and then asks swsusp to resume from the encrypted +swap device. + +The most important thing is that you set up dm-crypt in such +a way that the swap device you suspend to/resume from has +always the same major/minor within the initrd as well as +within your running system. The easiest way to achieve this is +to always set up this swap device first with dmsetup, so that +it will always look like the following: + +brw------- 1 root root 254, 0 Jul 28 13:37 /dev/mapper/swap0 + +Now set up your kernel to use /dev/mapper/swap0 as the default +resume partition, so your kernel .config contains: + +CONFIG_PM_STD_PARTITION="/dev/mapper/swap0" + +Prepare your boot loader to use the initrd you will create or +modify. For lilo the simplest setup looks like the following +lines: + +image=/boot/vmlinuz +initrd=/boot/initrd.gz +label=linux +append="root=/dev/ram0 init=/linuxrc rw" + +Finally you need to create or modify your initrd. Lets assume +you create an initrd that reads the required dm-crypt setup +from a pcmcia flash disk card. The card is formatted with an ext2 +fs which resides on /dev/hde1 when the card is inserted. The +card contains at least the encrypted swap setup in a file +named "swapkey". /etc/fstab of your initrd contains something +like the following: + +/dev/hda1 /mnt ext3 ro 0 0 +none /proc proc defaults,noatime,nodiratime 0 0 +none /sys sysfs defaults,noatime,nodiratime 0 0 + +/dev/hda1 contains an unencrypted mini system that sets up all +of your crypto devices, again by reading the setup from the +pcmcia flash disk. What follows now is a /linuxrc for your +initrd that allows you to resume from encrypted swap and that +continues boot with your mini system on /dev/hda1 if resume +does not happen: + +#!/bin/sh +PATH=/sbin:/bin:/usr/sbin:/usr/bin +mount /proc +mount /sys +mapped=0 +noresume=`grep -c noresume /proc/cmdline` +if [ "$*" != "" ] +then + noresume=1 +fi +dmesg -n 1 +/sbin/cardmgr -q +for i in 1 2 3 4 5 6 7 8 9 0 +do + if [ -f /proc/ide/hde/media ] + then + usleep 500000 + mount -t ext2 -o ro /dev/hde1 /mnt + if [ -f /mnt/swapkey ] + then + dmsetup create swap0 /mnt/swapkey > /dev/null 2>&1 && mapped=1 + fi + umount /mnt + break + fi + usleep 500000 +done +killproc /sbin/cardmgr +dmesg -n 6 +if [ $mapped = 1 ] +then + if [ $noresume != 0 ] + then + mkswap /dev/mapper/swap0 > /dev/null 2>&1 + fi + echo 254:0 > /sys/power/resume + dmsetup remove swap0 +fi +umount /sys +mount /mnt +umount /proc +cd /mnt +pivot_root . mnt +mount /proc +umount -l /mnt +umount /proc +exec chroot . /sbin/init $* < dev/console > dev/console 2>&1 + +Please don't mind the weird loop above, busybox's msh doesn't know +the let statement. Now, what is happening in the script? +First we have to decide if we want to try to resume, or not. +We will not resume if booting with "noresume" or any parameters +for init like "single" or "emergency" as boot parameters. + +Then we need to set up dmcrypt with the setup data from the +pcmcia flash disk. If this succeeds we need to reset the swap +device if we don't want to resume. The line "echo 254:0 > /sys/power/resume" +then attempts to resume from the first device mapper device. +Note that it is important to set the device in /sys/power/resume, +regardless if resuming or not, otherwise later suspend will fail. +If resume starts, script execution terminates here. + +Otherwise we just remove the encrypted swap device and leave it to the +mini system on /dev/hda1 to set the whole crypto up (it is up to +you to modify this to your taste). + +What then follows is the well known process to change the root +file system and continue booting from there. I prefer to unmount +the initrd prior to continue booting but it is up to you to modify +this. diff --git a/Documentation/power/swsusp.txt b/Documentation/power/swsusp.txt new file mode 100644 index 0000000..9d60ab7 --- /dev/null +++ b/Documentation/power/swsusp.txt @@ -0,0 +1,407 @@ +Some warnings, first. + + * BIG FAT WARNING ********************************************************* + * + * If you touch anything on disk between suspend and resume... + * ...kiss your data goodbye. + * + * If you do resume from initrd after your filesystems are mounted... + * ...bye bye root partition. + * [this is actually same case as above] + * + * If you have unsupported (*) devices using DMA, you may have some + * problems. If your disk driver does not support suspend... (IDE does), + * it may cause some problems, too. If you change kernel command line + * between suspend and resume, it may do something wrong. If you change + * your hardware while system is suspended... well, it was not good idea; + * but it will probably only crash. + * + * (*) suspend/resume support is needed to make it safe. + * + * If you have any filesystems on USB devices mounted before software suspend, + * they won't be accessible after resume and you may lose data, as though + * you have unplugged the USB devices with mounted filesystems on them; + * see the FAQ below for details. (This is not true for more traditional + * power states like "standby", which normally don't turn USB off.) + +You need to append resume=/dev/your_swap_partition to kernel command +line. Then you suspend by + +echo shutdown > /sys/power/disk; echo disk > /sys/power/state + +. If you feel ACPI works pretty well on your system, you might try + +echo platform > /sys/power/disk; echo disk > /sys/power/state + +. If you have SATA disks, you'll need recent kernels with SATA suspend +support. For suspend and resume to work, make sure your disk drivers +are built into kernel -- not modules. [There's way to make +suspend/resume with modular disk drivers, see FAQ, but you probably +should not do that.] + +If you want to limit the suspend image size to N bytes, do + +echo N > /sys/power/image_size + +before suspend (it is limited to 500 MB by default). + + +Article about goals and implementation of Software Suspend for Linux +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Author: Gábor Kuti +Last revised: 2003-10-20 by Pavel Machek + +Idea and goals to achieve + +Nowadays it is common in several laptops that they have a suspend button. It +saves the state of the machine to a filesystem or to a partition and switches +to standby mode. Later resuming the machine the saved state is loaded back to +ram and the machine can continue its work. It has two real benefits. First we +save ourselves the time machine goes down and later boots up, energy costs +are real high when running from batteries. The other gain is that we don't have to +interrupt our programs so processes that are calculating something for a long +time shouldn't need to be written interruptible. + +swsusp saves the state of the machine into active swaps and then reboots or +powerdowns. You must explicitly specify the swap partition to resume from with +``resume='' kernel option. If signature is found it loads and restores saved +state. If the option ``noresume'' is specified as a boot parameter, it skips +the resuming. + +In the meantime while the system is suspended you should not add/remove any +of the hardware, write to the filesystems, etc. + +Sleep states summary +==================== + +There are three different interfaces you can use, /proc/acpi should +work like this: + +In a really perfect world: +echo 1 > /proc/acpi/sleep # for standby +echo 2 > /proc/acpi/sleep # for suspend to ram +echo 3 > /proc/acpi/sleep # for suspend to ram, but with more power conservative +echo 4 > /proc/acpi/sleep # for suspend to disk +echo 5 > /proc/acpi/sleep # for shutdown unfriendly the system + +and perhaps +echo 4b > /proc/acpi/sleep # for suspend to disk via s4bios + +Frequently Asked Questions +========================== + +Q: well, suspending a server is IMHO a really stupid thing, +but... (Diego Zuccato): + +A: You bought new UPS for your server. How do you install it without +bringing machine down? Suspend to disk, rearrange power cables, +resume. + +You have your server on UPS. Power died, and UPS is indicating 30 +seconds to failure. What do you do? Suspend to disk. + + +Q: Maybe I'm missing something, but why don't the regular I/O paths work? + +A: We do use the regular I/O paths. However we cannot restore the data +to its original location as we load it. That would create an +inconsistent kernel state which would certainly result in an oops. +Instead, we load the image into unused memory and then atomically copy +it back to it original location. This implies, of course, a maximum +image size of half the amount of memory. + +There are two solutions to this: + +* require half of memory to be free during suspend. That way you can +read "new" data onto free spots, then cli and copy + +* assume we had special "polling" ide driver that only uses memory +between 0-640KB. That way, I'd have to make sure that 0-640KB is free +during suspending, but otherwise it would work... + +suspend2 shares this fundamental limitation, but does not include user +data and disk caches into "used memory" by saving them in +advance. That means that the limitation goes away in practice. + +Q: Does linux support ACPI S4? + +A: Yes. That's what echo platform > /sys/power/disk does. + +Q: What is 'suspend2'? + +A: suspend2 is 'Software Suspend 2', a forked implementation of +suspend-to-disk which is available as separate patches for 2.4 and 2.6 +kernels from swsusp.sourceforge.net. It includes support for SMP, 4GB +highmem and preemption. It also has a extensible architecture that +allows for arbitrary transformations on the image (compression, +encryption) and arbitrary backends for writing the image (eg to swap +or an NFS share[Work In Progress]). Questions regarding suspend2 +should be sent to the mailing list available through the suspend2 +website, and not to the Linux Kernel Mailing List. We are working +toward merging suspend2 into the mainline kernel. + +Q: What is the freezing of tasks and why are we using it? + +A: The freezing of tasks is a mechanism by which user space processes and some +kernel threads are controlled during hibernation or system-wide suspend (on some +architectures). See freezing-of-tasks.txt for details. + +Q: What is the difference between "platform" and "shutdown"? + +A: + +shutdown: save state in linux, then tell bios to powerdown + +platform: save state in linux, then tell bios to powerdown and blink + "suspended led" + +"platform" is actually right thing to do where supported, but +"shutdown" is most reliable (except on ACPI systems). + +Q: I do not understand why you have such strong objections to idea of +selective suspend. + +A: Do selective suspend during runtime power management, that's okay. But +it's useless for suspend-to-disk. (And I do not see how you could use +it for suspend-to-ram, I hope you do not want that). + +Lets see, so you suggest to + +* SUSPEND all but swap device and parents +* Snapshot +* Write image to disk +* SUSPEND swap device and parents +* Powerdown + +Oh no, that does not work, if swap device or its parents uses DMA, +you've corrupted data. You'd have to do + +* SUSPEND all but swap device and parents +* FREEZE swap device and parents +* Snapshot +* UNFREEZE swap device and parents +* Write +* SUSPEND swap device and parents + +Which means that you still need that FREEZE state, and you get more +complicated code. (And I have not yet introduce details like system +devices). + +Q: There don't seem to be any generally useful behavioral +distinctions between SUSPEND and FREEZE. + +A: Doing SUSPEND when you are asked to do FREEZE is always correct, +but it may be unneccessarily slow. If you want your driver to stay simple, +slowness may not matter to you. It can always be fixed later. + +For devices like disk it does matter, you do not want to spindown for +FREEZE. + +Q: After resuming, system is paging heavily, leading to very bad interactivity. + +A: Try running + +cat `cat /proc/[0-9]*/maps | grep / | sed 's:.* /:/:' | sort -u` > /dev/null + +after resume. swapoff -a; swapon -a may also be useful. + +Q: What happens to devices during swsusp? They seem to be resumed +during system suspend? + +A: That's correct. We need to resume them if we want to write image to +disk. Whole sequence goes like + + Suspend part + ~~~~~~~~~~~~ + running system, user asks for suspend-to-disk + + user processes are stopped + + suspend(PMSG_FREEZE): devices are frozen so that they don't interfere + with state snapshot + + state snapshot: copy of whole used memory is taken with interrupts disabled + + resume(): devices are woken up so that we can write image to swap + + write image to swap + + suspend(PMSG_SUSPEND): suspend devices so that we can power off + + turn the power off + + Resume part + ~~~~~~~~~~~ + (is actually pretty similar) + + running system, user asks for suspend-to-disk + + user processes are stopped (in common case there are none, but with resume-from-initrd, noone knows) + + read image from disk + + suspend(PMSG_FREEZE): devices are frozen so that they don't interfere + with image restoration + + image restoration: rewrite memory with image + + resume(): devices are woken up so that system can continue + + thaw all user processes + +Q: What is this 'Encrypt suspend image' for? + +A: First of all: it is not a replacement for dm-crypt encrypted swap. +It cannot protect your computer while it is suspended. Instead it does +protect from leaking sensitive data after resume from suspend. + +Think of the following: you suspend while an application is running +that keeps sensitive data in memory. The application itself prevents +the data from being swapped out. Suspend, however, must write these +data to swap to be able to resume later on. Without suspend encryption +your sensitive data are then stored in plaintext on disk. This means +that after resume your sensitive data are accessible to all +applications having direct access to the swap device which was used +for suspend. If you don't need swap after resume these data can remain +on disk virtually forever. Thus it can happen that your system gets +broken in weeks later and sensitive data which you thought were +encrypted and protected are retrieved and stolen from the swap device. +To prevent this situation you should use 'Encrypt suspend image'. + +During suspend a temporary key is created and this key is used to +encrypt the data written to disk. When, during resume, the data was +read back into memory the temporary key is destroyed which simply +means that all data written to disk during suspend are then +inaccessible so they can't be stolen later on. The only thing that +you must then take care of is that you call 'mkswap' for the swap +partition used for suspend as early as possible during regular +boot. This asserts that any temporary key from an oopsed suspend or +from a failed or aborted resume is erased from the swap device. + +As a rule of thumb use encrypted swap to protect your data while your +system is shut down or suspended. Additionally use the encrypted +suspend image to prevent sensitive data from being stolen after +resume. + +Q: Can I suspend to a swap file? + +A: Generally, yes, you can. However, it requires you to use the "resume=" and +"resume_offset=" kernel command line parameters, so the resume from a swap file +cannot be initiated from an initrd or initramfs image. See +swsusp-and-swap-files.txt for details. + +Q: Is there a maximum system RAM size that is supported by swsusp? + +A: It should work okay with highmem. + +Q: Does swsusp (to disk) use only one swap partition or can it use +multiple swap partitions (aggregate them into one logical space)? + +A: Only one swap partition, sorry. + +Q: If my application(s) causes lots of memory & swap space to be used +(over half of the total system RAM), is it correct that it is likely +to be useless to try to suspend to disk while that app is running? + +A: No, it should work okay, as long as your app does not mlock() +it. Just prepare big enough swap partition. + +Q: What information is useful for debugging suspend-to-disk problems? + +A: Well, last messages on the screen are always useful. If something +is broken, it is usually some kernel driver, therefore trying with as +little as possible modules loaded helps a lot. I also prefer people to +suspend from console, preferably without X running. Booting with +init=/bin/bash, then swapon and starting suspend sequence manually +usually does the trick. Then it is good idea to try with latest +vanilla kernel. + +Q: How can distributions ship a swsusp-supporting kernel with modular +disk drivers (especially SATA)? + +A: Well, it can be done, load the drivers, then do echo into +/sys/power/disk/resume file from initrd. Be sure not to mount +anything, not even read-only mount, or you are going to lose your +data. + +Q: How do I make suspend more verbose? + +A: If you want to see any non-error kernel messages on the virtual +terminal the kernel switches to during suspend, you have to set the +kernel console loglevel to at least 4 (KERN_WARNING), for example by +doing + + # save the old loglevel + read LOGLEVEL DUMMY < /proc/sys/kernel/printk + # set the loglevel so we see the progress bar. + # if the level is higher than needed, we leave it alone. + if [ $LOGLEVEL -lt 5 ]; then + echo 5 > /proc/sys/kernel/printk + fi + + IMG_SZ=0 + read IMG_SZ < /sys/power/image_size + echo -n disk > /sys/power/state + RET=$? + # + # the logic here is: + # if image_size > 0 (without kernel support, IMG_SZ will be zero), + # then try again with image_size set to zero. + if [ $RET -ne 0 -a $IMG_SZ -ne 0 ]; then # try again with minimal image size + echo 0 > /sys/power/image_size + echo -n disk > /sys/power/state + RET=$? + fi + + # restore previous loglevel + echo $LOGLEVEL > /proc/sys/kernel/printk + exit $RET + +Q: Is this true that if I have a mounted filesystem on a USB device and +I suspend to disk, I can lose data unless the filesystem has been mounted +with "sync"? + +A: That's right ... if you disconnect that device, you may lose data. +In fact, even with "-o sync" you can lose data if your programs have +information in buffers they haven't written out to a disk you disconnect, +or if you disconnect before the device finished saving data you wrote. + +Software suspend normally powers down USB controllers, which is equivalent +to disconnecting all USB devices attached to your system. + +Your system might well support low-power modes for its USB controllers +while the system is asleep, maintaining the connection, using true sleep +modes like "suspend-to-RAM" or "standby". (Don't write "disk" to the +/sys/power/state file; write "standby" or "mem".) We've not seen any +hardware that can use these modes through software suspend, although in +theory some systems might support "platform" modes that won't break the +USB connections. + +Remember that it's always a bad idea to unplug a disk drive containing a +mounted filesystem. That's true even when your system is asleep! The +safest thing is to unmount all filesystems on removable media (such USB, +Firewire, CompactFlash, MMC, external SATA, or even IDE hotplug bays) +before suspending; then remount them after resuming. + +There is a work-around for this problem. For more information, see +Documentation/usb/persist.txt. + +Q: Can I suspend-to-disk using a swap partition under LVM? + +A: No. You can suspend successfully, but you'll not be able to +resume. uswsusp should be able to work with LVM. See suspend.sf.net. + +Q: I upgraded the kernel from 2.6.15 to 2.6.16. Both kernels were +compiled with the similar configuration files. Anyway I found that +suspend to disk (and resume) is much slower on 2.6.16 compared to +2.6.15. Any idea for why that might happen or how can I speed it up? + +A: This is because the size of the suspend image is now greater than +for 2.6.15 (by saving more data we can get more responsive system +after resume). + +There's the /sys/power/image_size knob that controls the size of the +image. If you set it to 0 (eg. by echo 0 > /sys/power/image_size as +root), the 2.6.15 behavior should be restored. If it is still too +slow, take a look at suspend.sf.net -- userland suspend is faster and +supports LZF compression to speed it up further. diff --git a/Documentation/power/tricks.txt b/Documentation/power/tricks.txt new file mode 100644 index 0000000..3b26bb5 --- /dev/null +++ b/Documentation/power/tricks.txt @@ -0,0 +1,27 @@ + swsusp/S3 tricks + ~~~~~~~~~~~~~~~~ +Pavel Machek <pavel@suse.cz> + +If you want to trick swsusp/S3 into working, you might want to try: + +* go with minimal config, turn off drivers like USB, AGP you don't + really need + +* turn off APIC and preempt + +* use ext2. At least it has working fsck. [If something seems to go + wrong, force fsck when you have a chance] + +* turn off modules + +* use vga text console, shut down X. [If you really want X, you might + want to try vesafb later] + +* try running as few processes as possible, preferably go to single + user mode. + +* due to video issues, swsusp should be easier to get working than + S3. Try that first. + +When you make it work, try to find out what exactly was it that broke +suspend, and preferably fix that. diff --git a/Documentation/power/userland-swsusp.txt b/Documentation/power/userland-swsusp.txt new file mode 100644 index 0000000..7b99636 --- /dev/null +++ b/Documentation/power/userland-swsusp.txt @@ -0,0 +1,165 @@ +Documentation for userland software suspend interface + (C) 2006 Rafael J. Wysocki <rjw@sisk.pl> + +First, the warnings at the beginning of swsusp.txt still apply. + +Second, you should read the FAQ in swsusp.txt _now_ if you have not +done it already. + +Now, to use the userland interface for software suspend you need special +utilities that will read/write the system memory snapshot from/to the +kernel. Such utilities are available, for example, from +<http://suspend.sourceforge.net>. You may want to have a look at them if you +are going to develop your own suspend/resume utilities. + +The interface consists of a character device providing the open(), +release(), read(), and write() operations as well as several ioctl() +commands defined in include/linux/suspend_ioctls.h . The major and minor +numbers of the device are, respectively, 10 and 231, and they can +be read from /sys/class/misc/snapshot/dev. + +The device can be open either for reading or for writing. If open for +reading, it is considered to be in the suspend mode. Otherwise it is +assumed to be in the resume mode. The device cannot be open for simultaneous +reading and writing. It is also impossible to have the device open more than +once at a time. + +The ioctl() commands recognized by the device are: + +SNAPSHOT_FREEZE - freeze user space processes (the current process is + not frozen); this is required for SNAPSHOT_CREATE_IMAGE + and SNAPSHOT_ATOMIC_RESTORE to succeed + +SNAPSHOT_UNFREEZE - thaw user space processes frozen by SNAPSHOT_FREEZE + +SNAPSHOT_CREATE_IMAGE - create a snapshot of the system memory; the + last argument of ioctl() should be a pointer to an int variable, + the value of which will indicate whether the call returned after + creating the snapshot (1) or after restoring the system memory state + from it (0) (after resume the system finds itself finishing the + SNAPSHOT_CREATE_IMAGE ioctl() again); after the snapshot + has been created the read() operation can be used to transfer + it out of the kernel + +SNAPSHOT_ATOMIC_RESTORE - restore the system memory state from the + uploaded snapshot image; before calling it you should transfer + the system memory snapshot back to the kernel using the write() + operation; this call will not succeed if the snapshot + image is not available to the kernel + +SNAPSHOT_FREE - free memory allocated for the snapshot image + +SNAPSHOT_PREF_IMAGE_SIZE - set the preferred maximum size of the image + (the kernel will do its best to ensure the image size will not exceed + this number, but if it turns out to be impossible, the kernel will + create the smallest image possible) + +SNAPSHOT_GET_IMAGE_SIZE - return the actual size of the hibernation image + +SNAPSHOT_AVAIL_SWAP_SIZE - return the amount of available swap in bytes (the + last argument should be a pointer to an unsigned int variable that will + contain the result if the call is successful). + +SNAPSHOT_ALLOC_SWAP_PAGE - allocate a swap page from the resume partition + (the last argument should be a pointer to a loff_t variable that + will contain the swap page offset if the call is successful) + +SNAPSHOT_FREE_SWAP_PAGES - free all swap pages allocated by + SNAPSHOT_ALLOC_SWAP_PAGE + +SNAPSHOT_SET_SWAP_AREA - set the resume partition and the offset (in <PAGE_SIZE> + units) from the beginning of the partition at which the swap header is + located (the last ioctl() argument should point to a struct + resume_swap_area, as defined in kernel/power/suspend_ioctls.h, + containing the resume device specification and the offset); for swap + partitions the offset is always 0, but it is different from zero for + swap files (see Documentation/swsusp-and-swap-files.txt for details). + +SNAPSHOT_PLATFORM_SUPPORT - enable/disable the hibernation platform support, + depending on the argument value (enable, if the argument is nonzero) + +SNAPSHOT_POWER_OFF - make the kernel transition the system to the hibernation + state (eg. ACPI S4) using the platform (eg. ACPI) driver + +SNAPSHOT_S2RAM - suspend to RAM; using this call causes the kernel to + immediately enter the suspend-to-RAM state, so this call must always + be preceded by the SNAPSHOT_FREEZE call and it is also necessary + to use the SNAPSHOT_UNFREEZE call after the system wakes up. This call + is needed to implement the suspend-to-both mechanism in which the + suspend image is first created, as though the system had been suspended + to disk, and then the system is suspended to RAM (this makes it possible + to resume the system from RAM if there's enough battery power or restore + its state on the basis of the saved suspend image otherwise) + +The device's read() operation can be used to transfer the snapshot image from +the kernel. It has the following limitations: +- you cannot read() more than one virtual memory page at a time +- read()s accross page boundaries are impossible (ie. if ypu read() 1/2 of + a page in the previous call, you will only be able to read() + _at_ _most_ 1/2 of the page in the next call) + +The device's write() operation is used for uploading the system memory snapshot +into the kernel. It has the same limitations as the read() operation. + +The release() operation frees all memory allocated for the snapshot image +and all swap pages allocated with SNAPSHOT_ALLOC_SWAP_PAGE (if any). +Thus it is not necessary to use either SNAPSHOT_FREE or +SNAPSHOT_FREE_SWAP_PAGES before closing the device (in fact it will also +unfreeze user space processes frozen by SNAPSHOT_UNFREEZE if they are +still frozen when the device is being closed). + +Currently it is assumed that the userland utilities reading/writing the +snapshot image from/to the kernel will use a swap parition, called the resume +partition, or a swap file as storage space (if a swap file is used, the resume +partition is the partition that holds this file). However, this is not really +required, as they can use, for example, a special (blank) suspend partition or +a file on a partition that is unmounted before SNAPSHOT_CREATE_IMAGE and +mounted afterwards. + +These utilities MUST NOT make any assumptions regarding the ordering of +data within the snapshot image. The contents of the image are entirely owned +by the kernel and its structure may be changed in future kernel releases. + +The snapshot image MUST be written to the kernel unaltered (ie. all of the image +data, metadata and header MUST be written in _exactly_ the same amount, form +and order in which they have been read). Otherwise, the behavior of the +resumed system may be totally unpredictable. + +While executing SNAPSHOT_ATOMIC_RESTORE the kernel checks if the +structure of the snapshot image is consistent with the information stored +in the image header. If any inconsistencies are detected, +SNAPSHOT_ATOMIC_RESTORE will not succeed. Still, this is not a fool-proof +mechanism and the userland utilities using the interface SHOULD use additional +means, such as checksums, to ensure the integrity of the snapshot image. + +The suspending and resuming utilities MUST lock themselves in memory, +preferrably using mlockall(), before calling SNAPSHOT_FREEZE. + +The suspending utility MUST check the value stored by SNAPSHOT_CREATE_IMAGE +in the memory location pointed to by the last argument of ioctl() and proceed +in accordance with it: +1. If the value is 1 (ie. the system memory snapshot has just been + created and the system is ready for saving it): + (a) The suspending utility MUST NOT close the snapshot device + _unless_ the whole suspend procedure is to be cancelled, in + which case, if the snapshot image has already been saved, the + suspending utility SHOULD destroy it, preferrably by zapping + its header. If the suspend is not to be cancelled, the + system MUST be powered off or rebooted after the snapshot + image has been saved. + (b) The suspending utility SHOULD NOT attempt to perform any + file system operations (including reads) on the file systems + that were mounted before SNAPSHOT_CREATE_IMAGE has been + called. However, it MAY mount a file system that was not + mounted at that time and perform some operations on it (eg. + use it for saving the image). +2. If the value is 0 (ie. the system state has just been restored from + the snapshot image), the suspending utility MUST close the snapshot + device. Afterwards it will be treated as a regular userland process, + so it need not exit. + +The resuming utility SHOULD NOT attempt to mount any file systems that could +be mounted before suspend and SHOULD NOT attempt to perform any operations +involving such file systems. + +For details, please refer to the source code. diff --git a/Documentation/power/video.txt b/Documentation/power/video.txt new file mode 100644 index 0000000..2b35849 --- /dev/null +++ b/Documentation/power/video.txt @@ -0,0 +1,185 @@ + + Video issues with S3 resume + ~~~~~~~~~~~~~~~~~~~~~~~~~~~ + 2003-2006, Pavel Machek + +During S3 resume, hardware needs to be reinitialized. For most +devices, this is easy, and kernel driver knows how to do +it. Unfortunately there's one exception: video card. Those are usually +initialized by BIOS, and kernel does not have enough information to +boot video card. (Kernel usually does not even contain video card +driver -- vesafb and vgacon are widely used). + +This is not problem for swsusp, because during swsusp resume, BIOS is +run normally so video card is normally initialized. It should not be +problem for S1 standby, because hardware should retain its state over +that. + +We either have to run video BIOS during early resume, or interpret it +using vbetool later, or maybe nothing is necessary on particular +system because video state is preserved. Unfortunately different +methods work on different systems, and no known method suits all of +them. + +Userland application called s2ram has been developed; it contains long +whitelist of systems, and automatically selects working method for a +given system. It can be downloaded from CVS at +www.sf.net/projects/suspend . If you get a system that is not in the +whitelist, please try to find a working solution, and submit whitelist +entry so that work does not need to be repeated. + +Currently, VBE_SAVE method (6 below) works on most +systems. Unfortunately, vbetool only runs after userland is resumed, +so it makes debugging of early resume problems +hard/impossible. Methods that do not rely on userland are preferable. + +Details +~~~~~~~ + +There are a few types of systems where video works after S3 resume: + +(1) systems where video state is preserved over S3. + +(2) systems where it is possible to call the video BIOS during S3 + resume. Unfortunately, it is not correct to call the video BIOS at + that point, but it happens to work on some machines. Use + acpi_sleep=s3_bios. + +(3) systems that initialize video card into vga text mode and where + the BIOS works well enough to be able to set video mode. Use + acpi_sleep=s3_mode on these. + +(4) on some systems s3_bios kicks video into text mode, and + acpi_sleep=s3_bios,s3_mode is needed. + +(5) radeon systems, where X can soft-boot your video card. You'll need + a new enough X, and a plain text console (no vesafb or radeonfb). See + http://www.doesi.gmxhome.de/linux/tm800s3/s3.html for more information. + Alternatively, you should use vbetool (6) instead. + +(6) other radeon systems, where vbetool is enough to bring system back + to life. It needs text console to be working. Do vbetool vbestate + save > /tmp/delme; echo 3 > /proc/acpi/sleep; vbetool post; vbetool + vbestate restore < /tmp/delme; setfont <whatever>, and your video + should work. + +(7) on some systems, it is possible to boot most of kernel, and then + POSTing bios works. Ole Rohne has patch to do just that at + http://dev.gentoo.org/~marineam/patch-radeonfb-2.6.11-rc2-mm2. + +(8) on some systems, you can use the video_post utility mentioned here: + http://bugzilla.kernel.org/show_bug.cgi?id=3670. Do echo 3 > /sys/power/state + && /usr/sbin/video_post - which will initialize the display in console mode. + If you are in X, you can switch to a virtual terminal and back to X using + CTRL+ALT+F1 - CTRL+ALT+F7 to get the display working in graphical mode again. + +Now, if you pass acpi_sleep=something, and it does not work with your +bios, you'll get a hard crash during resume. Be careful. Also it is +safest to do your experiments with plain old VGA console. The vesafb +and radeonfb (etc) drivers have a tendency to crash the machine during +resume. + +You may have a system where none of above works. At that point you +either invent another ugly hack that works, or write proper driver for +your video card (good luck getting docs :-(). Maybe suspending from X +(proper X, knowing your hardware, not XF68_FBcon) might have better +chance of working. + +Table of known working notebooks: + +Model hack (or "how to do it") +------------------------------------------------------------------------------ +Acer Aspire 1406LC ole's late BIOS init (7), turn off DRI +Acer TM 230 s3_bios (2) +Acer TM 242FX vbetool (6) +Acer TM C110 video_post (8) +Acer TM C300 vga=normal (only suspend on console, not in X), vbetool (6) or video_post (8) +Acer TM 4052LCi s3_bios (2) +Acer TM 636Lci s3_bios,s3_mode (4) +Acer TM 650 (Radeon M7) vga=normal plus boot-radeon (5) gets text console back +Acer TM 660 ??? (*) +Acer TM 800 vga=normal, X patches, see webpage (5) or vbetool (6) +Acer TM 803 vga=normal, X patches, see webpage (5) or vbetool (6) +Acer TM 803LCi vga=normal, vbetool (6) +Arima W730a vbetool needed (6) +Asus L2400D s3_mode (3)(***) (S1 also works OK) +Asus L3350M (SiS 740) (6) +Asus L3800C (Radeon M7) s3_bios (2) (S1 also works OK) +Asus M6887Ne vga=normal, s3_bios (2), use radeon driver instead of fglrx in x.org +Athlon64 desktop prototype s3_bios (2) +Compal CL-50 ??? (*) +Compaq Armada E500 - P3-700 none (1) (S1 also works OK) +Compaq Evo N620c vga=normal, s3_bios (2) +Dell 600m, ATI R250 Lf none (1), but needs xorg-x11-6.8.1.902-1 +Dell D600, ATI RV250 vga=normal and X, or try vbestate (6) +Dell D610 vga=normal and X (possibly vbestate (6) too, but not tested) +Dell Inspiron 4000 ??? (*) +Dell Inspiron 500m ??? (*) +Dell Inspiron 510m ??? +Dell Inspiron 5150 vbetool needed (6) +Dell Inspiron 600m ??? (*) +Dell Inspiron 8200 ??? (*) +Dell Inspiron 8500 ??? (*) +Dell Inspiron 8600 ??? (*) +eMachines athlon64 machines vbetool needed (6) (someone please get me model #s) +HP NC6000 s3_bios, may not use radeonfb (2); or vbetool (6) +HP NX7000 ??? (*) +HP Pavilion ZD7000 vbetool post needed, need open-source nv driver for X +HP Omnibook XE3 athlon version none (1) +HP Omnibook XE3GC none (1), video is S3 Savage/IX-MV +HP Omnibook XE3L-GF vbetool (6) +HP Omnibook 5150 none (1), (S1 also works OK) +IBM TP T20, model 2647-44G none (1), video is S3 Inc. 86C270-294 Savage/IX-MV, vesafb gets "interesting" but X work. +IBM TP A31 / Type 2652-M5G s3_mode (3) [works ok with BIOS 1.04 2002-08-23, but not at all with BIOS 1.11 2004-11-05 :-(] +IBM TP R32 / Type 2658-MMG none (1) +IBM TP R40 2722B3G ??? (*) +IBM TP R50p / Type 1832-22U s3_bios (2) +IBM TP R51 none (1) +IBM TP T30 236681A ??? (*) +IBM TP T40 / Type 2373-MU4 none (1) +IBM TP T40p none (1) +IBM TP R40p s3_bios (2) +IBM TP T41p s3_bios (2), switch to X after resume +IBM TP T42 s3_bios (2) +IBM ThinkPad T42p (2373-GTG) s3_bios (2) +IBM TP X20 ??? (*) +IBM TP X30 s3_bios, s3_mode (4) +IBM TP X31 / Type 2672-XXH none (1), use radeontool (http://fdd.com/software/radeon/) to turn off backlight. +IBM TP X32 none (1), but backlight is on and video is trashed after long suspend. s3_bios,s3_mode (4) works too. Perhaps that gets better results? +IBM Thinkpad X40 Type 2371-7JG s3_bios,s3_mode (4) +IBM TP 600e none(1), but a switch to console and back to X is needed +Medion MD4220 ??? (*) +Samsung P35 vbetool needed (6) +Sharp PC-AR10 (ATI rage) none (1), backlight does not switch off +Sony Vaio PCG-C1VRX/K s3_bios (2) +Sony Vaio PCG-F403 ??? (*) +Sony Vaio PCG-GRT995MP none (1), works with 'nv' X driver +Sony Vaio PCG-GR7/K none (1), but needs radeonfb, use radeontool (http://fdd.com/software/radeon/) to turn off backlight. +Sony Vaio PCG-N505SN ??? (*) +Sony Vaio vgn-s260 X or boot-radeon can init it (5) +Sony Vaio vgn-S580BH vga=normal, but suspend from X. Console will be blank unless you return to X. +Sony Vaio vgn-FS115B s3_bios (2),s3_mode (4) +Toshiba Libretto L5 none (1) +Toshiba Libretto 100CT/110CT vbetool (6) +Toshiba Portege 3020CT s3_mode (3) +Toshiba Satellite 4030CDT s3_mode (3) (S1 also works OK) +Toshiba Satellite 4080XCDT s3_mode (3) (S1 also works OK) +Toshiba Satellite 4090XCDT ??? (*) +Toshiba Satellite P10-554 s3_bios,s3_mode (4)(****) +Toshiba M30 (2) xor X with nvidia driver using internal AGP +Uniwill 244IIO ??? (*) + +Known working desktop systems +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Mainboard Graphics card hack (or "how to do it") +------------------------------------------------------------------------------ +Asus A7V8X nVidia RIVA TNT2 model 64 s3_bios,s3_mode (4) + + +(*) from http://www.ubuntulinux.org/wiki/HoaryPMResults, not sure + which options to use. If you know, please tell me. + +(***) To be tested with a newer kernel. + +(****) Not with SMP kernel, UP only. diff --git a/Documentation/power/video_extension.txt b/Documentation/power/video_extension.txt new file mode 100644 index 0000000..b2f9b15 --- /dev/null +++ b/Documentation/power/video_extension.txt @@ -0,0 +1,37 @@ +ACPI video extensions +~~~~~~~~~~~~~~~~~~~~~ + +This driver implement the ACPI Extensions For Display Adapters for +integrated graphics devices on motherboard, as specified in ACPI 2.0 +Specification, Appendix B, allowing to perform some basic control like +defining the video POST device, retrieving EDID information or to +setup a video output, etc. Note that this is an ref. implementation +only. It may or may not work for your integrated video device. + +Interfaces exposed to userland through /proc/acpi/video: + +VGA/info : display the supported video bus device capability like Video ROM, CRT/LCD/TV. +VGA/ROM : Used to get a copy of the display devices' ROM data (up to 4k). +VGA/POST_info : Used to determine what options are implemented. +VGA/POST : Used to get/set POST device. +VGA/DOS : Used to get/set ownership of output switching: + Please refer ACPI spec B.4.1 _DOS +VGA/CRT : CRT output +VGA/LCD : LCD output +VGA/TVO : TV output +VGA/*/brightness : Used to get/set brightness of output device + +Notify event through /proc/acpi/event: + +#define ACPI_VIDEO_NOTIFY_SWITCH 0x80 +#define ACPI_VIDEO_NOTIFY_PROBE 0x81 +#define ACPI_VIDEO_NOTIFY_CYCLE 0x82 +#define ACPI_VIDEO_NOTIFY_NEXT_OUTPUT 0x83 +#define ACPI_VIDEO_NOTIFY_PREV_OUTPUT 0x84 + +#define ACPI_VIDEO_NOTIFY_CYCLE_BRIGHTNESS 0x82 +#define ACPI_VIDEO_NOTIFY_INC_BRIGHTNESS 0x83 +#define ACPI_VIDEO_NOTIFY_DEC_BRIGHTNESS 0x84 +#define ACPI_VIDEO_NOTIFY_ZERO_BRIGHTNESS 0x85 +#define ACPI_VIDEO_NOTIFY_DISPLAY_OFF 0x86 + |