diff options
Diffstat (limited to 'Documentation/power')
-rw-r--r-- | Documentation/power/devices.txt | 319 | ||||
-rw-r--r-- | Documentation/power/interface.txt | 43 | ||||
-rw-r--r-- | Documentation/power/kernel_threads.txt | 41 | ||||
-rw-r--r-- | Documentation/power/pci.txt | 332 | ||||
-rw-r--r-- | Documentation/power/states.txt | 79 | ||||
-rw-r--r-- | Documentation/power/swsusp.txt | 235 | ||||
-rw-r--r-- | Documentation/power/tricks.txt | 27 | ||||
-rw-r--r-- | Documentation/power/video.txt | 169 | ||||
-rw-r--r-- | Documentation/power/video_extension.txt | 34 |
9 files changed, 1279 insertions, 0 deletions
diff --git a/Documentation/power/devices.txt b/Documentation/power/devices.txt new file mode 100644 index 0000000..5d4ae9a --- /dev/null +++ b/Documentation/power/devices.txt @@ -0,0 +1,319 @@ + +Device Power Management + + +Device power management encompasses two areas - the ability to save +state and transition a device to a low-power state when the system is +entering a low-power state; and the ability to transition a device to +a low-power state while the system is running (and independently of +any other power management activity). + + +Methods + +The methods to suspend and resume devices reside in struct bus_type: + +struct bus_type { + ... + int (*suspend)(struct device * dev, pm_message_t state); + int (*resume)(struct device * dev); +}; + +Each bus driver is responsible implementing these methods, translating +the call into a bus-specific request and forwarding the call to the +bus-specific drivers. For example, PCI drivers implement suspend() and +resume() methods in struct pci_driver. The PCI core is simply +responsible for translating the pointers to PCI-specific ones and +calling the low-level driver. + +This is done to a) ease transition to the new power management methods +and leverage the existing PM code in various bus drivers; b) allow +buses to implement generic and default PM routines for devices, and c) +make the flow of execution obvious to the reader. + + +System Power Management + +When the system enters a low-power state, the device tree is walked in +a depth-first fashion to transition each device into a low-power +state. The ordering of the device tree is guaranteed by the order in +which devices get registered - children are never registered before +their ancestors, and devices are placed at the back of the list when +registered. By walking the list in reverse order, we are guaranteed to +suspend devices in the proper order. + +Devices are suspended once with interrupts enabled. Drivers are +expected to stop I/O transactions, save device state, and place the +device into a low-power state. Drivers may sleep, allocate memory, +etc. at will. + +Some devices are broken and will inevitably have problems powering +down or disabling themselves with interrupts enabled. For these +special cases, they may return -EAGAIN. This will put the device on a +list to be taken care of later. When interrupts are disabled, before +we enter the low-power state, their drivers are called again to put +their device to sleep. + +On resume, the devices that returned -EAGAIN will be called to power +themselves back on with interrupts disabled. Once interrupts have been +re-enabled, the rest of the drivers will be called to resume their +devices. On resume, a driver is responsible for powering back on each +device, restoring state, and re-enabling I/O transactions for that +device. + +System devices follow a slightly different API, which can be found in + + include/linux/sysdev.h + drivers/base/sys.c + +System devices will only be suspended with interrupts disabled, and +after all other devices have been suspended. On resume, they will be +resumed before any other devices, and also with interrupts disabled. + + +Runtime Power Management + +Many devices are able to dynamically power down while the system is +still running. This feature is useful for devices that are not being +used, and can offer significant power savings on a running system. + +In each device's directory, there is a 'power' directory, which +contains at least a 'state' file. Reading from this file displays what +power state the device is currently in. Writing to this file initiates +a transition to the specified power state, which must be a decimal in +the range 1-3, inclusive; or 0 for 'On'. + +The PM core will call the ->suspend() method in the bus_type object +that the device belongs to if the specified state is not 0, or +->resume() if it is. + +Nothing will happen if the specified state is the same state the +device is currently in. + +If the device is already in a low-power state, and the specified state +is another, but different, low-power state, the ->resume() method will +first be called to power the device back on, then ->suspend() will be +called again with the new state. + +The driver is responsible for saving the working state of the device +and putting it into the low-power state specified. If this was +successful, it returns 0, and the device's power_state field is +updated. + +The driver must take care to know whether or not it is able to +properly resume the device, including all step of reinitialization +necessary. (This is the hardest part, and the one most protected by +NDA'd documents). + +The driver must also take care not to suspend a device that is +currently in use. It is their responsibility to provide their own +exclusion mechanisms. + +The runtime power transition happens with interrupts enabled. If a +device cannot support being powered down with interrupts, it may +return -EAGAIN (as it would during a system power management +transition), but it will _not_ be called again, and the transaction +will fail. + +There is currently no way to know what states a device or driver +supports a priori. This will change in the future. + +pm_message_t meaning + +pm_message_t has two fields. event ("major"), and flags. If driver +does not know event code, it aborts the request, returning error. Some +drivers may need to deal with special cases based on the actual type +of suspend operation being done at the system level. This is why +there are flags. + +Event codes are: + +ON -- no need to do anything except special cases like broken +HW. + +# NOTIFICATION -- pretty much same as ON? + +FREEZE -- stop DMA and interrupts, and be prepared to reinit HW from +scratch. That probably means stop accepting upstream requests, the +actual policy of what to do with them beeing specific to a given +driver. It's acceptable for a network driver to just drop packets +while a block driver is expected to block the queue so no request is +lost. (Use IDE as an example on how to do that). FREEZE requires no +power state change, and it's expected for drivers to be able to +quickly transition back to operating state. + +SUSPEND -- like FREEZE, but also put hardware into low-power state. If +there's need to distinguish several levels of sleep, additional flag +is probably best way to do that. + +Transitions are only from a resumed state to a suspended state, never +between 2 suspended states. (ON -> FREEZE or ON -> SUSPEND can happen, +FREEZE -> SUSPEND or SUSPEND -> FREEZE can not). + +All events are: + +[NOTE NOTE NOTE: If you are driver author, you should not care; you +should only look at event, and ignore flags.] + +#Prepare for suspend -- userland is still running but we are going to +#enter suspend state. This gives drivers chance to load firmware from +#disk and store it in memory, or do other activities taht require +#operating userland, ability to kmalloc GFP_KERNEL, etc... All of these +#are forbiden once the suspend dance is started.. event = ON, flags = +#PREPARE_TO_SUSPEND + +Apm standby -- prepare for APM event. Quiesce devices to make life +easier for APM BIOS. event = FREEZE, flags = APM_STANDBY + +Apm suspend -- same as APM_STANDBY, but it we should probably avoid +spinning down disks. event = FREEZE, flags = APM_SUSPEND + +System halt, reboot -- quiesce devices to make life easier for BIOS. event += FREEZE, flags = SYSTEM_HALT or SYSTEM_REBOOT + +System shutdown -- at least disks need to be spun down, or data may be +lost. Quiesce devices, just to make life easier for BIOS. event = +FREEZE, flags = SYSTEM_SHUTDOWN + +Kexec -- turn off DMAs and put hardware into some state where new +kernel can take over. event = FREEZE, flags = KEXEC + +Powerdown at end of swsusp -- very similar to SYSTEM_SHUTDOWN, except wake +may need to be enabled on some devices. This actually has at least 3 +subtypes, system can reboot, enter S4 and enter S5 at the end of +swsusp. event = FREEZE, flags = SWSUSP and one of SYSTEM_REBOOT, +SYSTEM_SHUTDOWN, SYSTEM_S4 + +Suspend to ram -- put devices into low power state. event = SUSPEND, +flags = SUSPEND_TO_RAM + +Freeze for swsusp snapshot -- stop DMA and interrupts. No need to put +devices into low power mode, but you must be able to reinitialize +device from scratch in resume method. This has two flavors, its done +once on suspending kernel, once on resuming kernel. event = FREEZE, +flags = DURING_SUSPEND or DURING_RESUME + +Device detach requested from /sys -- deinitialize device; proably same as +SYSTEM_SHUTDOWN, I do not understand this one too much. probably event += FREEZE, flags = DEV_DETACH. + +#These are not really events sent: +# +#System fully on -- device is working normally; this is probably never +#passed to suspend() method... event = ON, flags = 0 +# +#Ready after resume -- userland is now running, again. Time to free any +#memory you ate during prepare to suspend... event = ON, flags = +#READY_AFTER_RESUME +# + +Driver Detach Power Management + +The kernel now supports the ability to place a device in a low-power +state when it is detached from its driver, which happens when its +module is removed. + +Each device contains a 'detach_state' file in its sysfs directory +which can be used to control this state. Reading from this file +displays what the current detach state is set to. This is 0 (On) by +default. A user may write a positive integer value to this file in the +range of 1-4 inclusive. + +A value of 1-3 will indicate the device should be placed in that +low-power state, which will cause ->suspend() to be called for that +device. A value of 4 indicates that the device should be shutdown, so +->shutdown() will be called for that device. + +The driver is responsible for reinitializing the device when the +module is re-inserted during it's ->probe() (or equivalent) method. +The driver core will not call any extra functions when binding the +device to the driver. + +pm_message_t meaning + +pm_message_t has two fields. event ("major"), and flags. If driver +does not know event code, it aborts the request, returning error. Some +drivers may need to deal with special cases based on the actual type +of suspend operation being done at the system level. This is why +there are flags. + +Event codes are: + +ON -- no need to do anything except special cases like broken +HW. + +# NOTIFICATION -- pretty much same as ON? + +FREEZE -- stop DMA and interrupts, and be prepared to reinit HW from +scratch. That probably means stop accepting upstream requests, the +actual policy of what to do with them being specific to a given +driver. It's acceptable for a network driver to just drop packets +while a block driver is expected to block the queue so no request is +lost. (Use IDE as an example on how to do that). FREEZE requires no +power state change, and it's expected for drivers to be able to +quickly transition back to operating state. + +SUSPEND -- like FREEZE, but also put hardware into low-power state. If +there's need to distinguish several levels of sleep, additional flag +is probably best way to do that. + +Transitions are only from a resumed state to a suspended state, never +between 2 suspended states. (ON -> FREEZE or ON -> SUSPEND can happen, +FREEZE -> SUSPEND or SUSPEND -> FREEZE can not). + +All events are: + +[NOTE NOTE NOTE: If you are driver author, you should not care; you +should only look at event, and ignore flags.] + +#Prepare for suspend -- userland is still running but we are going to +#enter suspend state. This gives drivers chance to load firmware from +#disk and store it in memory, or do other activities taht require +#operating userland, ability to kmalloc GFP_KERNEL, etc... All of these +#are forbiden once the suspend dance is started.. event = ON, flags = +#PREPARE_TO_SUSPEND + +Apm standby -- prepare for APM event. Quiesce devices to make life +easier for APM BIOS. event = FREEZE, flags = APM_STANDBY + +Apm suspend -- same as APM_STANDBY, but it we should probably avoid +spinning down disks. event = FREEZE, flags = APM_SUSPEND + +System halt, reboot -- quiesce devices to make life easier for BIOS. event += FREEZE, flags = SYSTEM_HALT or SYSTEM_REBOOT + +System shutdown -- at least disks need to be spun down, or data may be +lost. Quiesce devices, just to make life easier for BIOS. event = +FREEZE, flags = SYSTEM_SHUTDOWN + +Kexec -- turn off DMAs and put hardware into some state where new +kernel can take over. event = FREEZE, flags = KEXEC + +Powerdown at end of swsusp -- very similar to SYSTEM_SHUTDOWN, except wake +may need to be enabled on some devices. This actually has at least 3 +subtypes, system can reboot, enter S4 and enter S5 at the end of +swsusp. event = FREEZE, flags = SWSUSP and one of SYSTEM_REBOOT, +SYSTEM_SHUTDOWN, SYSTEM_S4 + +Suspend to ram -- put devices into low power state. event = SUSPEND, +flags = SUSPEND_TO_RAM + +Freeze for swsusp snapshot -- stop DMA and interrupts. No need to put +devices into low power mode, but you must be able to reinitialize +device from scratch in resume method. This has two flavors, its done +once on suspending kernel, once on resuming kernel. event = FREEZE, +flags = DURING_SUSPEND or DURING_RESUME + +Device detach requested from /sys -- deinitialize device; proably same as +SYSTEM_SHUTDOWN, I do not understand this one too much. probably event += FREEZE, flags = DEV_DETACH. + +#These are not really events sent: +# +#System fully on -- device is working normally; this is probably never +#passed to suspend() method... event = ON, flags = 0 +# +#Ready after resume -- userland is now running, again. Time to free any +#memory you ate during prepare to suspend... event = ON, flags = +#READY_AFTER_RESUME +# diff --git a/Documentation/power/interface.txt b/Documentation/power/interface.txt new file mode 100644 index 0000000..f5ebda5 --- /dev/null +++ b/Documentation/power/interface.txt @@ -0,0 +1,43 @@ +Power Management Interface + + +The power management subsystem provides a unified sysfs interface to +userspace, regardless of what architecture or platform one is +running. The interface exists in /sys/power/ directory (assuming sysfs +is mounted at /sys). + +/sys/power/state controls system power state. Reading from this file +returns what states are supported, which is hard-coded to 'standby' +(Power-On Suspend), 'mem' (Suspend-to-RAM), and 'disk' +(Suspend-to-Disk). + +Writing to this file one of those strings causes the system to +transition into that state. Please see the file +Documentation/power/states.txt for a description of each of those +states. + + +/sys/power/disk controls the operating mode of the suspend-to-disk +mechanism. Suspend-to-disk can be handled in several ways. The +greatest distinction is who writes memory to disk - the firmware or +the kernel. If the firmware does it, we assume that it also handles +suspending the system. + +If the kernel does it, then we have three options for putting the system +to sleep - using the platform driver (e.g. ACPI or other PM +registers), powering off the system or rebooting the system (for +testing). The system will support either 'firmware' or 'platform', and +that is known a priori. But, the user may choose 'shutdown' or +'reboot' as alternatives. + +Reading from this file will display what the mode is currently set +to. Writing to this file will accept one of + + 'firmware' + 'platform' + 'shutdown' + 'reboot' + +It will only change to 'firmware' or 'platform' if the system supports +it. + diff --git a/Documentation/power/kernel_threads.txt b/Documentation/power/kernel_threads.txt new file mode 100644 index 0000000..60b5481 --- /dev/null +++ b/Documentation/power/kernel_threads.txt @@ -0,0 +1,41 @@ +KERNEL THREADS + + +Freezer + +Upon entering a suspended state the system will freeze all +tasks. This is done by delivering pseudosignals. This affects +kernel threads, too. To successfully freeze a kernel thread +the thread has to check for the pseudosignal and enter the +refrigerator. Code to do this looks like this: + + do { + hub_events(); + wait_event_interruptible(khubd_wait, !list_empty(&hub_event_list)); + if (current->flags & PF_FREEZE) + refrigerator(PF_FREEZE); + } while (!signal_pending(current)); + +from drivers/usb/core/hub.c::hub_thread() + + +The Unfreezable + +Some kernel threads however, must not be frozen. The kernel must +be able to finish pending IO operations and later on be able to +write the memory image to disk. Kernel threads needed to do IO +must stay awake. Such threads must mark themselves unfreezable +like this: + + /* + * This thread doesn't need any user-level access, + * so get rid of all our resources. + */ + daemonize("usb-storage"); + + current->flags |= PF_NOFREEZE; + +from drivers/usb/storage/usb.c::usb_stor_control_thread() + +Such drivers are themselves responsible for staying quiet during +the actual snapshotting. diff --git a/Documentation/power/pci.txt b/Documentation/power/pci.txt new file mode 100644 index 0000000..c85428e --- /dev/null +++ b/Documentation/power/pci.txt @@ -0,0 +1,332 @@ + +PCI Power Management +~~~~~~~~~~~~~~~~~~~~ + +An overview of the concepts and the related functions in the Linux kernel + +Patrick Mochel <mochel@transmeta.com> +(and others) + +--------------------------------------------------------------------------- + +1. Overview +2. How the PCI Subsystem Does Power Management +3. PCI Utility Functions +4. PCI Device Drivers +5. Resources + +1. Overview +~~~~~~~~~~~ + +The PCI Power Management Specification was introduced between the PCI 2.1 and +PCI 2.2 Specifications. It a standard interface for controlling various +power management operations. + +Implementation of the PCI PM Spec is optional, as are several sub-components of +it. If a device supports the PCI PM Spec, the device will have an 8 byte +capability field in its PCI configuration space. This field is used to describe +and control the standard PCI power management features. + +The PCI PM spec defines 4 operating states for devices (D0 - D3) and for buses +(B0 - B3). The higher the number, the less power the device consumes. However, +the higher the number, the longer the latency is for the device to return to +an operational state (D0). + +There are actually two D3 states. When someone talks about D3, they usually +mean D3hot, which corresponds to an ACPI D2 state (power is reduced, the +device may lose some context). But they may also mean D3cold, which is an +ACPI D3 state (power is fully off, all state was discarded); or both. + +Bus power management is not covered in this version of this document. + +Note that all PCI devices support D0 and D3cold by default, regardless of +whether or not they implement any of the PCI PM spec. + +The possible state transitions that a device can undergo are: + ++---------------------------+ +| Current State | New State | ++---------------------------+ +| D0 | D1, D2, D3| ++---------------------------+ +| D1 | D2, D3 | ++---------------------------+ +| D2 | D3 | ++---------------------------+ +| D1, D2, D3 | D0 | ++---------------------------+ + +Note that when the system is entering a global suspend state, all devices will +be placed into D3 and when resuming, all devices will be placed into D0. +However, when the system is running, other state transitions are possible. + +2. How The PCI Subsystem Handles Power Management +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The PCI suspend/resume functionality is accessed indirectly via the Power +Management subsystem. At boot, the PCI driver registers a power management +callback with that layer. Upon entering a suspend state, the PM layer iterates +through all of its registered callbacks. This currently takes place only during +APM state transitions. + +Upon going to sleep, the PCI subsystem walks its device tree twice. Both times, +it does a depth first walk of the device tree. The first walk saves each of the +device's state and checks for devices that will prevent the system from entering +a global power state. The next walk then places the devices in a low power +state. + +The first walk allows a graceful recovery in the event of a failure, since none +of the devices have actually been powered down. + +In both walks, in particular the second, all children of a bridge are touched +before the actual bridge itself. This allows the bridge to retain power while +its children are being accessed. + +Upon resuming from sleep, just the opposite must be true: all bridges must be +powered on and restored before their children are powered on. This is easily +accomplished with a breadth-first walk of the PCI device tree. + + +3. PCI Utility Functions +~~~~~~~~~~~~~~~~~~~~~~~~ + +These are helper functions designed to be called by individual device drivers. +Assuming that a device behaves as advertised, these should be applicable in most +cases. However, results may vary. + +Note that these functions are never implicitly called for the driver. The driver +is always responsible for deciding when and if to call these. + + +pci_save_state +-------------- + +Usage: + pci_save_state(dev, buffer); + +Description: + Save first 64 bytes of PCI config space. Buffer must be allocated by + caller. + + +pci_restore_state +----------------- + +Usage: + pci_restore_state(dev, buffer); + +Description: + Restore previously saved config space. (First 64 bytes only); + + If buffer is NULL, then restore what information we know about the + device from bootup: BARs and interrupt line. + + +pci_set_power_state +------------------- + +Usage: + pci_set_power_state(dev, state); + +Description: + Transition device to low power state using PCI PM Capabilities + registers. + + Will fail under one of the following conditions: + - If state is less than current state, but not D0 (illegal transition) + - Device doesn't support PM Capabilities + - Device does not support requested state + + +pci_enable_wake +--------------- + +Usage: + pci_enable_wake(dev, state, enable); + +Description: + Enable device to generate PME# during low power state using PCI PM + Capabilities. + + Checks whether if device supports generating PME# from requested state + and fail if it does not, unless enable == 0 (request is to disable wake + events, which is implicit if it doesn't even support it in the first + place). + + Note that the PMC Register in the device's PM Capabilties has a bitmask + of the states it supports generating PME# from. D3hot is bit 3 and + D3cold is bit 4. So, while a value of 4 as the state may not seem + semantically correct, it is. + + +4. PCI Device Drivers +~~~~~~~~~~~~~~~~~~~~~ + +These functions are intended for use by individual drivers, and are defined in +struct pci_driver: + + int (*save_state) (struct pci_dev *dev, u32 state); + int (*suspend) (struct pci_dev *dev, u32 state); + int (*resume) (struct pci_dev *dev); + int (*enable_wake) (struct pci_dev *dev, u32 state, int enable); + + +save_state +---------- + +Usage: + +if (dev->driver && dev->driver->save_state) + dev->driver->save_state(dev,state); + +The driver should use this callback to save device state. It should take into +account the current state of the device and the requested state in order to +avoid any unnecessary operations. + +For example, a video card that supports all 4 states (D0-D3), all controller +context is preserved when entering D1, but the screen is placed into a low power +state (blanked). + +The driver can also interpret this function as a notification that it may be +entering a sleep state in the near future. If it knows that the device cannot +enter the requested state, either because of lack of support for it, or because +the device is middle of some critical operation, then it should fail. + +This function should not be used to set any state in the device or the driver +because the device may not actually enter the sleep state (e.g. another driver +later causes causes a global state transition to fail). + +Note that in intermediate low power states, a device's I/O and memory spaces may +be disabled and may not be available in subsequent transitions to lower power +states. + + +suspend +------- + +Usage: + +if (dev->driver && dev->driver->suspend) + dev->driver->suspend(dev,state); + +A driver uses this function to actually transition the device into a low power +state. This should include disabling I/O, IRQs, and bus-mastering, as well as +physically transitioning the device to a lower power state; it may also include +calls to pci_enable_wake(). + +Bus mastering may be disabled by doing: + +pci_disable_device(dev); + +For devices that support the PCI PM Spec, this may be used to set the device's +power state to match the suspend() parameter: + +pci_set_power_state(dev,state); + +The driver is also responsible for disabling any other device-specific features +(e.g blanking screen, turning off on-card memory, etc). + +The driver should be sure to track the current state of the device, as it may +obviate the need for some operations. + +The driver should update the current_state field in its pci_dev structure in +this function, except for PM-capable devices when pci_set_power_state is used. + +resume +------ + +Usage: + +if (dev->driver && dev->driver->suspend) + dev->driver->resume(dev) + +The resume callback may be called from any power state, and is always meant to +transition the device to the D0 state. + +The driver is responsible for reenabling any features of the device that had +been disabled during previous suspend calls, such as IRQs and bus mastering, +as well as calling pci_restore_state(). + +If the device is currently in D3, it may need to be reinitialized in resume(). + + * Some types of devices, like bus controllers, will preserve context in D3hot + (using Vcc power). Their drivers will often want to avoid re-initializing + them after re-entering D0 (perhaps to avoid resetting downstream devices). + + * Other kinds of devices in D3hot will discard device context as part of a + soft reset when re-entering the D0 state. + + * Devices resuming from D3cold always go through a power-on reset. Some + device context can also be preserved using Vaux power. + + * Some systems hide D3cold resume paths from drivers. For example, on PCs + the resume path for suspend-to-disk often runs BIOS powerup code, which + will sometimes re-initialize the device. + +To handle resets during D3 to D0 transitions, it may be convenient to share +device initialization code between probe() and resume(). Device parameters +can also be saved before the driver suspends into D3, avoiding re-probe. + +If the device supports the PCI PM Spec, it can use this to physically transition +the device to D0: + +pci_set_power_state(dev,0); + +Note that if the entire system is transitioning out of a global sleep state, all +devices will be placed in the D0 state, so this is not necessary. However, in +the event that the device is placed in the D3 state during normal operation, +this call is necessary. It is impossible to determine which of the two events is +taking place in the driver, so it is always a good idea to make that call. + +The driver should take note of the state that it is resuming from in order to +ensure correct (and speedy) operation. + +The driver should update the current_state field in its pci_dev structure in +this function, except for PM-capable devices when pci_set_power_state is used. + + +enable_wake +----------- + +Usage: + +if (dev->driver && dev->driver->enable_wake) + dev->driver->enable_wake(dev,state,enable); + +This callback is generally only relevant for devices that support the PCI PM +spec and have the ability to generate a PME# (Power Management Event Signal) +to wake the system up. (However, it is possible that a device may support +some non-standard way of generating a wake event on sleep.) + +Bits 15:11 of the PMC (Power Mgmt Capabilities) Register in a device's +PM Capabilties describe what power states the device supports generating a +wake event from: + ++------------------+ +| Bit | State | ++------------------+ +| 11 | D0 | +| 12 | D1 | +| 13 | D2 | +| 14 | D3hot | +| 15 | D3cold | ++------------------+ + +A device can use this to enable wake events: + + pci_enable_wake(dev,state,enable); + +Note that to enable PME# from D3cold, a value of 4 should be passed to +pci_enable_wake (since it uses an index into a bitmask). If a driver gets +a request to enable wake events from D3, two calls should be made to +pci_enable_wake (one for both D3hot and D3cold). + + +5. Resources +~~~~~~~~~~~~ + +PCI Local Bus Specification +PCI Bus Power Management Interface Specification + + http://pcisig.org + diff --git a/Documentation/power/states.txt b/Documentation/power/states.txt new file mode 100644 index 0000000..3e5e5d3 --- /dev/null +++ b/Documentation/power/states.txt @@ -0,0 +1,79 @@ + +System Power Management States + + +The kernel supports three power management states generically, though +each is dependent on platform support code to implement the low-level +details for each state. This file describes each state, what they are +commonly called, what ACPI state they map to, and what string to write +to /sys/power/state to enter that state + + +State: Standby / Power-On Suspend +ACPI State: S1 +String: "standby" + +This state offers minimal, though real, power savings, while providing +a very low-latency transition back to a working system. No operating +state is lost (the CPU retains power), so the system easily starts up +again where it left off. + +We try to put devices in a low-power state equivalent to D1, which +also offers low power savings, but low resume latency. Not all devices +support D1, and those that don't are left on. + +A transition from Standby to the On state should take about 1-2 +seconds. + + +State: Suspend-to-RAM +ACPI State: S3 +String: "mem" + +This state offers significant power savings as everything in the +system is put into a low-power state, except for memory, which is +placed in self-refresh mode to retain its contents. + +System and device state is saved and kept in memory. All devices are +suspended and put into D3. In many cases, all peripheral buses lose +power when entering STR, so devices must be able to handle the +transition back to the On state. + +For at least ACPI, STR requires some minimal boot-strapping code to +resume the system from STR. This may be true on other platforms. + +A transition from Suspend-to-RAM to the On state should take about +3-5 seconds. + + +State: Suspend-to-disk +ACPI State: S4 +String: "disk" + +This state offers the greatest power savings, and can be used even in +the absence of low-level platform support for power management. This +state operates similarly to Suspend-to-RAM, but includes a final step +of writing memory contents to disk. On resume, this is read and memory +is restored to its pre-suspend state. + +STD can be handled by the firmware or the kernel. If it is handled by +the firmware, it usually requires a dedicated partition that must be +setup via another operating system for it to use. Despite the +inconvenience, this method requires minimal work by the kernel, since +the firmware will also handle restoring memory contents on resume. + +If the kernel is responsible for persistantly saving state, a mechanism +called 'swsusp' (Swap Suspend) is used to write memory contents to +free swap space. swsusp has some restrictive requirements, but should +work in most cases. Some, albeit outdated, documentation can be found +in Documentation/power/swsusp.txt. + +Once memory state is written to disk, the system may either enter a +low-power state (like ACPI S4), or it may simply power down. Powering +down offers greater savings, and allows this mechanism to work on any +system. However, entering a real low-power state allows the user to +trigger wake up events (e.g. pressing a key or opening a laptop lid). + +A transition from Suspend-to-Disk to the On state should take about 30 +seconds, though it's typically a bit more with the current +implementation. diff --git a/Documentation/power/swsusp.txt b/Documentation/power/swsusp.txt new file mode 100644 index 0000000..c7c3459 --- /dev/null +++ b/Documentation/power/swsusp.txt @@ -0,0 +1,235 @@ +From kernel/suspend.c: + + * BIG FAT WARNING ********************************************************* + * + * If you have unsupported (*) devices using DMA... + * ...say goodbye to your data. + * + * If you touch anything on disk between suspend and resume... + * ...kiss your data goodbye. + * + * If your disk driver does not support suspend... (IDE does) + * ...you'd better find out how to get along + * without your data. + * + * If you change kernel command line between suspend and resume... + * ...prepare for nasty fsck or worse. + * + * If you change your hardware while system is suspended... + * ...well, it was not good idea. + * + * (*) suspend/resume support is needed to make it safe. + +You need to append resume=/dev/your_swap_partition to kernel command +line. Then you suspend by + +echo shutdown > /sys/power/disk; echo disk > /sys/power/state + +. If you feel ACPI works pretty well on your system, you might try + +echo platform > /sys/power/disk; echo disk > /sys/power/state + + + +Article about goals and implementation of Software Suspend for Linux +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Author: G‚ábor Kuti +Last revised: 2003-10-20 by Pavel Machek + +Idea and goals to achieve + +Nowadays it is common in several laptops that they have a suspend button. It +saves the state of the machine to a filesystem or to a partition and switches +to standby mode. Later resuming the machine the saved state is loaded back to +ram and the machine can continue its work. It has two real benefits. First we +save ourselves the time machine goes down and later boots up, energy costs +are real high when running from batteries. The other gain is that we don't have to +interrupt our programs so processes that are calculating something for a long +time shouldn't need to be written interruptible. + +swsusp saves the state of the machine into active swaps and then reboots or +powerdowns. You must explicitly specify the swap partition to resume from with +``resume='' kernel option. If signature is found it loads and restores saved +state. If the option ``noresume'' is specified as a boot parameter, it skips +the resuming. + +In the meantime while the system is suspended you should not add/remove any +of the hardware, write to the filesystems, etc. + +Sleep states summary +==================== + +There are three different interfaces you can use, /proc/acpi should +work like this: + +In a really perfect world: +echo 1 > /proc/acpi/sleep # for standby +echo 2 > /proc/acpi/sleep # for suspend to ram +echo 3 > /proc/acpi/sleep # for suspend to ram, but with more power conservative +echo 4 > /proc/acpi/sleep # for suspend to disk +echo 5 > /proc/acpi/sleep # for shutdown unfriendly the system + +and perhaps +echo 4b > /proc/acpi/sleep # for suspend to disk via s4bios + +Frequently Asked Questions +========================== + +Q: well, suspending a server is IMHO a really stupid thing, +but... (Diego Zuccato): + +A: You bought new UPS for your server. How do you install it without +bringing machine down? Suspend to disk, rearrange power cables, +resume. + +You have your server on UPS. Power died, and UPS is indicating 30 +seconds to failure. What do you do? Suspend to disk. + +Ethernet card in your server died. You want to replace it. Your +server is not hotplug capable. What do you do? Suspend to disk, +replace ethernet card, resume. If you are fast your users will not +even see broken connections. + + +Q: Maybe I'm missing something, but why don't the regular I/O paths work? + +A: We do use the regular I/O paths. However we cannot restore the data +to its original location as we load it. That would create an +inconsistent kernel state which would certainly result in an oops. +Instead, we load the image into unused memory and then atomically copy +it back to it original location. This implies, of course, a maximum +image size of half the amount of memory. + +There are two solutions to this: + +* require half of memory to be free during suspend. That way you can +read "new" data onto free spots, then cli and copy + +* assume we had special "polling" ide driver that only uses memory +between 0-640KB. That way, I'd have to make sure that 0-640KB is free +during suspending, but otherwise it would work... + +suspend2 shares this fundamental limitation, but does not include user +data and disk caches into "used memory" by saving them in +advance. That means that the limitation goes away in practice. + +Q: Does linux support ACPI S4? + +A: Yes. That's what echo platform > /sys/power/disk does. + +Q: My machine doesn't work with ACPI. How can I use swsusp than ? + +A: Do a reboot() syscall with right parameters. Warning: glibc gets in +its way, so check with strace: + +reboot(LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2, 0xd000fce2) + +(Thanks to Peter Osterlund:) + +#include <unistd.h> +#include <syscall.h> + +#define LINUX_REBOOT_MAGIC1 0xfee1dead +#define LINUX_REBOOT_MAGIC2 672274793 +#define LINUX_REBOOT_CMD_SW_SUSPEND 0xD000FCE2 + +int main() +{ + syscall(SYS_reboot, LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2, + LINUX_REBOOT_CMD_SW_SUSPEND, 0); + return 0; +} + +Also /sys/ interface should be still present. + +Q: What is 'suspend2'? + +A: suspend2 is 'Software Suspend 2', a forked implementation of +suspend-to-disk which is available as separate patches for 2.4 and 2.6 +kernels from swsusp.sourceforge.net. It includes support for SMP, 4GB +highmem and preemption. It also has a extensible architecture that +allows for arbitrary transformations on the image (compression, +encryption) and arbitrary backends for writing the image (eg to swap +or an NFS share[Work In Progress]). Questions regarding suspend2 +should be sent to the mailing list available through the suspend2 +website, and not to the Linux Kernel Mailing List. We are working +toward merging suspend2 into the mainline kernel. + +Q: A kernel thread must voluntarily freeze itself (call 'refrigerator'). +I found some kernel threads that don't do it, and they don't freeze +so the system can't sleep. Is this a known behavior? + +A: All such kernel threads need to be fixed, one by one. Select the +place where the thread is safe to be frozen (no kernel semaphores +should be held at that point and it must be safe to sleep there), and +add: + + if (current->flags & PF_FREEZE) + refrigerator(PF_FREEZE); + +If the thread is needed for writing the image to storage, you should +instead set the PF_NOFREEZE process flag when creating the thread. + + +Q: What is the difference between between "platform", "shutdown" and +"firmware" in /sys/power/disk? + +A: + +shutdown: save state in linux, then tell bios to powerdown + +platform: save state in linux, then tell bios to powerdown and blink + "suspended led" + +firmware: tell bios to save state itself [needs BIOS-specific suspend + partition, and has very little to do with swsusp] + +"platform" is actually right thing to do, but "shutdown" is most +reliable. + +Q: I do not understand why you have such strong objections to idea of +selective suspend. + +A: Do selective suspend during runtime power managment, that's okay. But +its useless for suspend-to-disk. (And I do not see how you could use +it for suspend-to-ram, I hope you do not want that). + +Lets see, so you suggest to + +* SUSPEND all but swap device and parents +* Snapshot +* Write image to disk +* SUSPEND swap device and parents +* Powerdown + +Oh no, that does not work, if swap device or its parents uses DMA, +you've corrupted data. You'd have to do + +* SUSPEND all but swap device and parents +* FREEZE swap device and parents +* Snapshot +* UNFREEZE swap device and parents +* Write +* SUSPEND swap device and parents + +Which means that you still need that FREEZE state, and you get more +complicated code. (And I have not yet introduce details like system +devices). + +Q: There don't seem to be any generally useful behavioral +distinctions between SUSPEND and FREEZE. + +A: Doing SUSPEND when you are asked to do FREEZE is always correct, +but it may be unneccessarily slow. If you want USB to stay simple, +slowness may not matter to you. It can always be fixed later. + +For devices like disk it does matter, you do not want to spindown for +FREEZE. + +Q: After resuming, system is paging heavilly, leading to very bad interactivity. + +A: Try running + +cat `cat /proc/[0-9]*/maps | grep / | sed 's:.* /:/:' | sort -u` > /dev/null + +after resume. swapoff -a; swapon -a may also be usefull. diff --git a/Documentation/power/tricks.txt b/Documentation/power/tricks.txt new file mode 100644 index 0000000..c6d58d3 --- /dev/null +++ b/Documentation/power/tricks.txt @@ -0,0 +1,27 @@ + swsusp/S3 tricks + ~~~~~~~~~~~~~~~~ +Pavel Machek <pavel@suse.cz> + +If you want to trick swsusp/S3 into working, you might want to try: + +* go with minimal config, turn off drivers like USB, AGP you don't + really need + +* turn off APIC and preempt + +* use ext2. At least it has working fsck. [If something seemes to go + wrong, force fsck when you have a chance] + +* turn off modules + +* use vga text console, shut down X. [If you really want X, you might + want to try vesafb later] + +* try running as few processes as possible, preferably go to single + user mode. + +* due to video issues, swsusp should be easier to get working than + S3. Try that first. + +When you make it work, try to find out what exactly was it that broke +suspend, and preferably fix that. diff --git a/Documentation/power/video.txt b/Documentation/power/video.txt new file mode 100644 index 0000000..8686968 --- /dev/null +++ b/Documentation/power/video.txt @@ -0,0 +1,169 @@ + + Video issues with S3 resume + ~~~~~~~~~~~~~~~~~~~~~~~~~~~ + 2003-2005, Pavel Machek + +During S3 resume, hardware needs to be reinitialized. For most +devices, this is easy, and kernel driver knows how to do +it. Unfortunately there's one exception: video card. Those are usually +initialized by BIOS, and kernel does not have enough information to +boot video card. (Kernel usually does not even contain video card +driver -- vesafb and vgacon are widely used). + +This is not problem for swsusp, because during swsusp resume, BIOS is +run normally so video card is normally initialized. S3 has absolutely +no chance of working with SMP/HT. Be sure it to turn it off before +testing (swsusp should work ok, OTOH). + +There are a few types of systems where video works after S3 resume: + +(1) systems where video state is preserved over S3. + +(2) systems where it is possible to call the video BIOS during S3 + resume. Unfortunately, it is not correct to call the video BIOS at + that point, but it happens to work on some machines. Use + acpi_sleep=s3_bios. + +(3) systems that initialize video card into vga text mode and where + the BIOS works well enough to be able to set video mode. Use + acpi_sleep=s3_mode on these. + +(4) on some systems s3_bios kicks video into text mode, and + acpi_sleep=s3_bios,s3_mode is needed. + +(5) radeon systems, where X can soft-boot your video card. You'll need + new enough X, and plain text console (no vesafb or radeonfb), see + http://www.doesi.gmxhome.de/linux/tm800s3/s3.html. Actually you + should probably use vbetool (6) instead. + +(6) other radeon systems, where vbetool is enough to bring system back + to life. It needs text console to be working. Do vbetool vbestate + save > /tmp/delme; echo 3 > /proc/acpi/sleep; vbetool post; vbetool + vbestate restore < /tmp/delme; setfont <whatever>, and your video + should work. + +(7) on some systems, it is possible to boot most of kernel, and then + POSTing bios works. Ole Rohne has patch to do just that at + http://dev.gentoo.org/~marineam/patch-radeonfb-2.6.11-rc2-mm2. + +Now, if you pass acpi_sleep=something, and it does not work with your +bios, you'll get a hard crash during resume. Be careful. Also it is +safest to do your experiments with plain old VGA console. The vesafb +and radeonfb (etc) drivers have a tendency to crash the machine during +resume. + +You may have a system where none of above works. At that point you +either invent another ugly hack that works, or write proper driver for +your video card (good luck getting docs :-(). Maybe suspending from X +(proper X, knowing your hardware, not XF68_FBcon) might have better +chance of working. + +Table of known working systems: + +Model hack (or "how to do it") +------------------------------------------------------------------------------ +Acer Aspire 1406LC ole's late BIOS init (7), turn off DRI +Acer TM 242FX vbetool (6) +Acer TM C300 vga=normal (only suspend on console, not in X), vbetool (6) +Acer TM 4052LCi s3_bios (2) +Acer TM 636Lci s3_bios vga=normal (2) +Acer TM 650 (Radeon M7) vga=normal plus boot-radeon (5) gets text console back +Acer TM 660 ??? (*) +Acer TM 800 vga=normal, X patches, see webpage (5) or vbetool (6) +Acer TM 803 vga=normal, X patches, see webpage (5) or vbetool (6) +Acer TM 803LCi vga=normal, vbetool (6) +Arima W730a vbetool needed (6) +Asus L2400D s3_mode (3)(***) (S1 also works OK) +Asus L3800C (Radeon M7) s3_bios (2) (S1 also works OK) +Asus M6NE ??? (*) +Athlon64 desktop prototype s3_bios (2) +Compal CL-50 ??? (*) +Compaq Armada E500 - P3-700 none (1) (S1 also works OK) +Compaq Evo N620c vga=normal, s3_bios (2) +Dell 600m, ATI R250 Lf none (1), but needs xorg-x11-6.8.1.902-1 +Dell D600, ATI RV250 vga=normal and X, or try vbestate (6) +Dell Inspiron 4000 ??? (*) +Dell Inspiron 500m ??? (*) +Dell Inspiron 600m ??? (*) +Dell Inspiron 8200 ??? (*) +Dell Inspiron 8500 ??? (*) +Dell Inspiron 8600 ??? (*) +eMachines athlon64 machines vbetool needed (6) (someone please get me model #s) +HP NC6000 s3_bios, may not use radeonfb (2); or vbetool (6) +HP NX7000 ??? (*) +HP Pavilion ZD7000 vbetool post needed, need open-source nv driver for X +HP Omnibook XE3 athlon version none (1) +HP Omnibook XE3GC none (1), video is S3 Savage/IX-MV +IBM TP T20, model 2647-44G none (1), video is S3 Inc. 86C270-294 Savage/IX-MV, vesafb gets "interesting" but X work. +IBM TP A31 / Type 2652-M5G s3_mode (3) [works ok with BIOS 1.04 2002-08-23, but not at all with BIOS 1.11 2004-11-05 :-(] +IBM TP R32 / Type 2658-MMG none (1) +IBM TP R40 2722B3G ??? (*) +IBM TP R50p / Type 1832-22U s3_bios (2) +IBM TP R51 ??? (*) +IBM TP T30 236681A ??? (*) +IBM TP T40 / Type 2373-MU4 none (1) +IBM TP T40p none (1) +IBM TP R40p s3_bios (2) +IBM TP T41p s3_bios (2), switch to X after resume +IBM TP T42 ??? (*) +IBM ThinkPad T42p (2373-GTG) s3_bios (2) +IBM TP X20 ??? (*) +IBM TP X30 ??? (*) +IBM TP X31 / Type 2672-XXH none (1), use radeontool (http://fdd.com/software/radeon/) to turn off backlight. +IBM Thinkpad X40 Type 2371-7JG s3_bios,s3_mode (4) +Medion MD4220 ??? (*) +Samsung P35 vbetool needed (6) +Sharp PC-AR10 (ATI rage) none (1) +Sony Vaio PCG-F403 ??? (*) +Sony Vaio PCG-N505SN ??? (*) +Sony Vaio vgn-s260 X or boot-radeon can init it (5) +Toshiba Libretto L5 none (1) +Toshiba Satellite 4030CDT s3_mode (3) +Toshiba Satellite 4080XCDT s3_mode (3) +Toshiba Satellite 4090XCDT ??? (*) +Toshiba Satellite P10-554 s3_bios,s3_mode (4)(****) +Uniwill 244IIO ??? (*) + + +(*) from http://www.ubuntulinux.org/wiki/HoaryPMResults, not sure + which options to use. If you know, please tell me. + +(***) To be tested with a newer kernel. + +(****) Not with SMP kernel, UP only. + +VBEtool details +~~~~~~~~~~~~~~~ +(with thanks to Carl-Daniel Hailfinger) + +First, boot into X and run the following script ONCE: +#!/bin/bash +statedir=/root/s3/state +mkdir -p $statedir +chvt 2 +sleep 1 +vbetool vbestate save >$statedir/vbe + + +To suspend and resume properly, call the following script as root: +#!/bin/bash +statedir=/root/s3/state +curcons=`fgconsole` +fuser /dev/tty$curcons 2>/dev/null|xargs ps -o comm= -p|grep -q X && chvt 2 +cat /dev/vcsa >$statedir/vcsa +sync +echo 3 >/proc/acpi/sleep +sync +vbetool post +vbetool vbestate restore <$statedir/vbe +cat $statedir/vcsa >/dev/vcsa +rckbd restart +chvt $[curcons%6+1] +chvt $curcons + + +Unless you change your graphics card or other hardware configuration, +the state once saved will be OK for every resume afterwards. +NOTE: The "rckbd restart" command may be different for your +distribution. Simply replace it with the command you would use to +set the fonts on screen. diff --git a/Documentation/power/video_extension.txt b/Documentation/power/video_extension.txt new file mode 100644 index 0000000..8e33d7c8 --- /dev/null +++ b/Documentation/power/video_extension.txt @@ -0,0 +1,34 @@ +This driver implement the ACPI Extensions For Display Adapters +for integrated graphics devices on motherboard, as specified in +ACPI 2.0 Specification, Appendix B, allowing to perform some basic +control like defining the video POST device, retrieving EDID information +or to setup a video output, etc. Note that this is an ref. implementation only. +It may or may not work for your integrated video device. + +Interfaces exposed to userland through /proc/acpi/video: + +VGA/info : display the supported video bus device capability like ,Video ROM, CRT/LCD/TV. +VGA/ROM : Used to get a copy of the display devices' ROM data (up to 4k). +VGA/POST_info : Used to determine what options are implemented. +VGA/POST : Used to get/set POST device. +VGA/DOS : Used to get/set ownership of output switching: + Please refer ACPI spec B.4.1 _DOS +VGA/CRT : CRT output +VGA/LCD : LCD output +VGA/TV : TV output +VGA/*/brightness : Used to get/set brightness of output device + +Notify event through /proc/acpi/event: + +#define ACPI_VIDEO_NOTIFY_SWITCH 0x80 +#define ACPI_VIDEO_NOTIFY_PROBE 0x81 +#define ACPI_VIDEO_NOTIFY_CYCLE 0x82 +#define ACPI_VIDEO_NOTIFY_NEXT_OUTPUT 0x83 +#define ACPI_VIDEO_NOTIFY_PREV_OUTPUT 0x84 + +#define ACPI_VIDEO_NOTIFY_CYCLE_BRIGHTNESS 0x82 +#define ACPI_VIDEO_NOTIFY_INC_BRIGHTNESS 0x83 +#define ACPI_VIDEO_NOTIFY_DEC_BRIGHTNESS 0x84 +#define ACPI_VIDEO_NOTIFY_ZERO_BRIGHTNESS 0x85 +#define ACPI_VIDEO_NOTIFY_DISPLAY_OFF 0x86 + |