summaryrefslogtreecommitdiffstats
path: root/arch/powerpc/include/asm/opal.h
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'topic/xive' (early part) into nextMichael Ellerman2017-04-121-0/+36
|\ | | | | | | | | | | This merges the arch part of the XIVE support, leaving the final commit with the KVM specific pieces dangling on the branch for Paul to merge via the kvm-ppc tree.
| * powerpc/powernv: Add XIVE related definitions to opal-api.hBenjamin Herrenschmidt2017-04-061-0/+36
| | | | | | | | | | Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc/powernv: Introduce address translation services for Nvlink2Alistair Popple2017-04-041-0/+5
|/ | | | | | | | | | | | | | | | | | | | | Nvlink2 supports address translation services (ATS) allowing devices to request address translations from an mmu known as the nest MMU which is setup to walk the CPU page tables. To access this functionality certain firmware calls are required to setup and manage hardware context tables in the nvlink processing unit (NPU). The NPU also manages forwarding of TLB invalidates (known as address translation shootdowns/ATSDs) to attached devices. This patch exports several methods to allow device drivers to register a process id (PASID/PID) in the hardware tables and to receive notification of when a device should stop issuing address translation requests (ATRs). It also adds a fault handler to allow device drivers to demand fault pages in. Signed-off-by: Alistair Popple <alistair@popple.id.au> [mpe: Fix up comment formatting, use flush_tlb_mm()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* Merge branch 'topic/ppc-kvm' into nextMichael Ellerman2017-02-141-7/+0
|\ | | | | | | Merge the topic branch we're sharing with the kvm-ppc tree.
| * powerpc/powernv: Remove separate entry for OPAL real mode callsBenjamin Herrenschmidt2017-02-071-7/+0
| | | | | | | | | | | | | | | | | | All entry points already read the MSR so they can easily do the right thing. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc/powernv: Initialise nest mmuAlistair Popple2017-01-301-0/+1
|/ | | | | | | | | | | | | | | POWER9 contains an off core mmu called the nest mmu (NMMU). This is used by other hardware units on the chip to translate virtual addresses into real addresses. The unit attempting an address translation provides the majority of the context required for the translation request except for the base address of the partition table (ie. the PTCR) which needs to be programmed into the NMMU. This patch adds a call to OPAL to set the PTCR for the nest mmu in opal_init(). Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/powernv: Define real-mode versions of OPAL XICS accessorsPaul Mackerras2016-11-231-0/+3
| | | | | | | | | | | This defines real-mode versions of opal_int_get_xirr(), opal_int_eoi() and opal_int_set_mfrr(), for use by KVM real-mode code. It also exports opal_int_set_mfrr() so that the modular part of KVM can use it to send IPIs. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* KVM: PPC: Book3S HV: Set server for passed-through interruptsPaul Mackerras2016-09-121-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a guest has a PCI pass-through device with an interrupt, it will direct the interrupt to a particular guest VCPU. In fact the physical interrupt might arrive on any CPU, and then get delivered to the target VCPU in the emulated XICS (guest interrupt controller), and eventually delivered to the target VCPU. Now that we have code to handle device interrupts in real mode without exiting to the host kernel, there is an advantage to having the device interrupt arrive on the same sub(core) as the target VCPU is running on. In this situation, the interrupt can be delivered to the target VCPU without any exit to the host kernel (using a hypervisor doorbell interrupt between threads if necessary). This patch aims to get passed-through device interrupts arriving on the correct core by setting the interrupt server in the real hardware XICS for the interrupt to the first thread in the (sub)core where its target VCPU is running. We do this in the real-mode H_EOI code because the H_EOI handler already needs to look at the emulated ICS state for the interrupt (whereas the H_XIRR handler doesn't), and we know we are running in the target VCPU context at that point. We set the server CPU in hardware using an OPAL call, regardless of what the IRQ affinity mask for the interrupt says, and without updating the affinity mask. This amounts to saying that when an interrupt is passed through to a guest, as a matter of policy we allow the guest's affinity for the interrupt to override the host's. This is inspired by an earlier patch from Suresh Warrier, although none of this code came from that earlier patch. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
* powerpc: Put exception configuration in a common placeBenjamin Herrenschmidt2016-07-211-0/+1
| | | | | | | | | The various calls to establish exception endianness and AIL are now done from a single point using already established CPU and FW feature bits to decide what to do. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/opal: Add real mode call wrappersBenjamin Herrenschmidt2016-07-171-0/+6
| | | | | | | | Replace the old generic opal_call_realmode() with proper per-call wrappers similar to the normal ones and convert callers. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/powernv: Add XICS emulation APIsBenjamin Herrenschmidt2016-07-151-0/+5
| | | | | | | | | | | OPAL provides an emulated XICS interrupt controller to use as a fallback on newer processors that don't have a XICS. It's meant as a way to provide backward compatibility with future processors. Add the corresponding interfaces. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/opal: Wake up kopald polling thread before waiting for eventsBenjamin Herrenschmidt2016-07-081-0/+2
| | | | | | | | | | | | | | | | On some environments (prototype machines, some simulators, etc...) there is no functional interrupt source to signal completion, so we rely on the fairly slow OPAL heartbeat. In a number of cases, the calls complete very quickly or even immediately. We've observed that it helps a lot to wakeup the OPAL heartbeat thread before waiting for event in those cases, it will call OPAL immediately to collect completions for anything that finished fast enough. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-By: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/powernv: Add driver for operator panel on FSP machinesSuraj Jitindar Singh2016-06-291-0/+2
| | | | | | | | | | | | | | | | | | | | Implement new character device driver to allow access from user space to the operator panel display present on IBM Power Systems machines with FSPs. This will allow status information to be presented on the display which is visible to a user. The driver implements a character buffer which a user can read/write by accessing the device (/dev/op_panel). This buffer is then displayed on the operator panel display. Any attempt to write past the last character position will have no effect and attempts to write more characters than the size of the display will be truncated. The device may only be accessed by a single process at a time. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/opal: Add inline function to get rc from an ASYNC_COMP opal_msgSuraj Jitindar Singh2016-06-291-0/+8
| | | | | | | | | | | | | An opal_msg of type OPAL_MSG_ASYNC_COMP contains the return code in the params[1] struct member. However this isn't intuitive or obvious when reading the code and requires that a user look at the skiboot documentation or opal-api.h to verify this. Add an inline function to get the return code from an opal_msg and update call sites accordingly. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/powernv: Functions to get/set PCI slot stateGavin Shan2016-06-211-0/+6
| | | | | | | | | | | | | | | This exports 4 functions, which base on the corresponding OPAL APIs to get/set PCI slot status. Those functions are going to be used by PowerNV PCI hotplug driver: pnv_pci_get_device_tree() opal_get_device_tree() pnv_pci_get_presence_state() opal_pci_get_presence_state() pnv_pci_get_power_state() opal_pci_get_power_state() pnv_pci_set_power_state() opal_pci_set_power_state() Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/powernv: Support PCI slot IDGavin Shan2016-06-211-2/+2
| | | | | | | | | | | | | | | The reset and poll functionality from (OPAL) firmware supports PHB and PCI slot at same time. They are identified by ID. This supports PCI slot ID by: * Rename the argument name for opal_pci_reset() and opal_pci_poll() accordingly * Rename pnv_eeh_phb_poll() to pnv_eeh_poll() and adjust its argument name. * One macro is added to produce PCI slot ID. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/powernv: new function to access OPAL msglogAndrew Donnellan2016-02-091-0/+3
| | | | | | | | | | | | | Currently, the OPAL msglog/console buffer is exposed as a sysfs file, with the sysfs read handler responsible for retrieving the log from the OPAL buffer. We'd like to be able to use it in xmon as well. Refactor the OPAL msglog code to create a new function, opal_msglog_copy(), that copies to an arbitrary buffer. Separate the initialisation code into generic memcons init and sysfs file creation. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/powernv: Fix OPAL_CONSOLE_FLUSH prototype and usagesRussell Currey2016-01-131-1/+1
| | | | | | | | | | | | | The recently added OPAL API call, OPAL_CONSOLE_FLUSH, originally took no parameters and returned nothing. The call was updated to accept the terminal number to flush, and returned various values depending on the state of the output buffer. The prototype has been updated and its usage in the OPAL kmsg dumper has been modified to support its new behaviour as an incremental flush. Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/powernv: Add a kmsg_dumper that flushes console output on panicRussell Currey2015-12-271-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On BMC machines, console output is controlled by the OPAL firmware and is only flushed when its pollers are called. When the kernel is in a panic state, it no longer calls these pollers and thus console output does not completely flush, causing some output from the panic to be lost. Output is only actually lost when the kernel is configured to not power off or reboot after panic (i.e. CONFIG_PANIC_TIMEOUT is set to 0) since OPAL flushes the console buffer as part of its power down routines. Before this patch, however, only partial output would be printed during the timeout wait. This patch adds a new kmsg_dumper which gets called at panic time to ensure panic output is not lost. It accomplishes this by calling OPAL_CONSOLE_FLUSH in the OPAL API, and if that is not available, the pollers are called enough times to (hopefully) completely flush the buffer. The flushing mechanism will only affect output printed at and before the kmsg_dump call in kernel/panic.c:panic(). As such, the "end Kernel panic" message may still be truncated as follows: >Call Trace: >[c000000f1f603b00] [c0000000008e9458] dump_stack+0x90/0xbc (unreliable) >[c000000f1f603b30] [c0000000008e7e78] panic+0xf8/0x2c4 >[c000000f1f603bc0] [c000000000be4860] mount_block_root+0x288/0x33c >[c000000f1f603c80] [c000000000be4d14] prepare_namespace+0x1f4/0x254 >[c000000f1f603d00] [c000000000be43e8] kernel_init_freeable+0x318/0x350 >[c000000f1f603dc0] [c00000000000bd74] kernel_init+0x24/0x130 >[c000000f1f603e30] [c0000000000095b0] ret_from_kernel_thread+0x5c/0xac >---[ end Kernel panic - not This functionality is implemented as a kmsg_dumper as it seems to be the most sensible way to introduce platform-specific functionality to the panic function. Signed-off-by: Russell Currey <ruscur@russell.cc> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/powernv: Add OPAL interfaces for accessing and modifying system LED ↵Anshuman Khandual2015-08-201-0/+4
| | | | | | | | | | | | | | | | | | | | states This patch registers the following two new OPAL interfaces calls for the platform LED subsystem. With the help of these new OPAL calls, the kernel will be able to get or set the state of various individual LEDs on the system at any given location code which is passed through the LED specific device tree nodes. (1) OPAL_LEDS_GET_INDICATOR opal_leds_get_ind (2) OPAL_LEDS_SET_INDICATOR opal_leds_set_ind Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com> Tested-by: Stewart Smith <stewart@linux.vnet.ibm.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/powernv: Invoke opal_cec_reboot2() on unrecoverable machine check ↵Mahesh Salgaonkar2015-08-061-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | errors. On non-recoverable MCE errors in kernel space, Linux kernel panics and system reboots. On BMC based system opal-prd runs as a daemon in the host. Hence, kernel crash may prevent opal-prd to detect and analyze this MCE error. This may land us in a situation where the faulty memory never gets de-configured and Linux would keep hitting same MCE error again and again. If this happens in early stage of kernel initialization, then Linux will keep crashing and rebooting in a loop. This patch fixes this issue by invoking new opal_cec_reboot2() call with reboot type OPAL_REBOOT_PLATFORM_ERROR to inform BMC/OCC about this error, so that BMC can collect relevant data for error analysis and decide what component to de-configure before rebooting. This patch is dependent on OPAL patchset posted on skiboot mailing list at https://lists.ozlabs.org/pipermail/skiboot/2015-July/001771.html that introduces opal_cec_reboot2() opal call. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV platformVipin K Parashar2015-07-161-1/+2
| | | | | | | | | | | | | | | | This patch adds support for OPAL EPOW (Environmental and Power Warnings) and DPO (Delayed Power Off) events for the PowerNV platform. These events are generated on FSP (Flexible Service Processor) based systems. EPOW events are generated due to various critical system conditions that require system shutdown. A few examples of these conditions are high ambient temperature or system running on UPS power with low UPS battery. DPO event is generated in response to admin initiated system shutdown request. Upon receipt of EPOW and DPO events the host kernel invokes orderly_poweroff() for performing graceful system shutdown. Signed-off-by: Vipin K Parashar <vipin@linux.vnet.ibm.com> Acked-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/powernv: Add opal-prd channelJeremy Kerr2015-06-051-0/+1
| | | | | | | | | | | | | This change adds a char device to access the "PRD" (processor runtime diagnostics) channel to OPAL firmware. Includes contributions from Vaidyanathan Srinivasan, Neelesh Gupta & Vishal Kulkarni. Signed-off-by: Neelesh Gupta <neelegup@linux.vnet.ibm.com> Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/powernv: Add a virtual irqchip for opal eventsAlistair Popple2015-05-221-0/+3
| | | | | | | | | | | | | | | | | | | | | Whenever an interrupt is received for opal the linux kernel gets a bitfield indicating certain events that have occurred and need handling by the various device drivers. Currently this is handled using a notifier interface where we call every device driver that has registered to receive opal events. This approach has several drawbacks. For example each driver has to do its own checking to see if the event is relevant as well as event masking. There is also no easy method of recording the number of times we receive particular events. This patch solves these issues by exposing opal events via the standard interrupt APIs by adding a new interrupt chip and domain. Drivers can then register for the appropriate events using standard kernel calls such as irq_of_parse_and_map(). Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/powernv: Reorder OPAL subsystem initialisationAlistair Popple2015-05-221-0/+3
| | | | | | | | | | | | Most of the OPAL subsystems are always compiled in for PowerNV and many of them need to be initialised before or after other OPAL subsystems. Rather than trying to control this ordering through machine initcalls it is clearer and easier to control initialisation order with explicit calls in opal_init. Signed-off-by: Alistair Popple <alistair@popple.id.au> Cc: Mahesh Jagannath Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/powernv: Introduce sysfs control for fastsleep workaround behaviorShreyas B. Prabhu2015-05-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fastsleep is one of the idle state which cpuidle subsystem currently uses on power8 machines. In this state L2 cache is brought down to a threshold voltage. Therefore when the core is in fastsleep, the communication between L2 and L3 needs to be fenced. But there is a bug in the current power8 chips surrounding this fencing. OPAL provides a workaround which precludes the possibility of hitting this bug. But running with this workaround applied causes checkstop if any correctable error in L2 cache directory is detected. Hence OPAL also provides a way to undo the workaround. In the existing implementation, workaround is applied by the last thread of the core entering fastsleep and undone by the first thread waking up. But this has a performance cost. These OPAL calls account for roughly 4000 cycles everytime the core has to enter or wakeup from fastsleep. This patch introduces a sysfs attribute (fastsleep_workaround_applyonce) to choose the behavior of this workaround. By default, fastsleep_workaround_applyonce = 0. In this case, workaround is applied/undone everytime the core enters/exits fastsleep. fastsleep_workaround_applyonce = 1. In this case the workaround is applied once on all the cores and never undone. This can be triggered by echo 1 > /sys/devices/system/cpu/fastsleep_workaround_applyonce For simplicity this attribute can be modified only once. Implying, once fastsleep_workaround_applyonce is changed to 1, it cannot be reverted to the default state. Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/powernv: Add interfaces for flash device accessCyril Bur2015-04-111-1/+8
| | | | | | | | | | | | | | | | | | | | This change adds the OPAL interface definitions to allow Linux to read, write and erase from system flash devices. We register platform devices for the flash devices exported by firmware. We clash with the existing opal_flash_init function, which is really for the FSP flash update functionality, so we rename that initcall to opal_flash_update_init(). A future change will add an mtd driver that uses this interface. Changes from Joel Stanley and Jeremy Kerr. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Joel Stanley <joel@jms.id.au> Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/powernv: convert codes returned by OPAL callsCédric Le Goater2015-03-311-0/+2
| | | | | | | | | OPAL has its own list of return codes. The patch provides a translation of such codes in errnos for the opal_sensor_read call, and possibly others if needed. Signed-off-by: Cédric Le Goater <clg@fr.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* Merge branch 'next-misc' of ↵Michael Ellerman2015-03-261-0/+2
|\ | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc into test Merge miscellaneous bits from benh. Fix a minor conflict with OpalMessageType changing names to opal_msg_type.
| * powerpc/powernv: Add OPAL message notifier unregister functionNeelesh Gupta2015-03-251-0/+2
| | | | | | | | | | | | | | | | | | Provide an unregister interface for the opal message notifiers to be called when not needed like during driver unload/remove. Signed-off-by: Neelesh Gupta <neelegup@linux.vnet.ibm.com> Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/powernv: Move opal-api.h closer to the Skiboot versionMichael Ellerman2015-03-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit gets opal-api.h to mostly match the version in Skiboot as of commit ea7d806ab0ba. The exceptions are things which are not (currently) used in Linux. Most of this is just whitespace and a few things moving around. I think the diff is readable. Also OpalMessageType became opal_msg_type, requiring a change in the Linux code. Finally Skiboot and Linux disagree on CAPI vs CXL, because CAPI means something else in Linux. To handle that we just point the Linux wrapper, which is named "cxl" to the OPAL token OPAL_PCI_SET_PHB_CAPI_MODE. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* | powerpc/powernv: Move OPAL API definitions to opal-api.hMichael Ellerman2015-03-161-751/+6
|/ | | | | | | | | | | We'd like to get to the stage where the OPAL API is defined in a header that is identical between Linux and Skiboot. As step one, split the bits that actually define the API into opal-api.h. The Linux specific parts stay in opal.h. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com>
* powerpc/powernv: Add OPAL soft-poweroff routineJoel Stanley2015-02-041-1/+1
| | | | | | | | | | | Register a notifier for a OPAL message indicating that the machine should prepare itself for a graceful power off. OPAL will tell us if the power off is a reboot or shutdown, but for now we perform the same orderly_poweroff action. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* cxl: Enable CAPP recoveryRyan Grimm2015-01-221-0/+8
| | | | | | | | | | | | Turning snoops on is the last step in CAPP recovery. Sapphire is expected to have reinitialized the PHB and done the previous recovery steps. Add mode argument to opal call to do this. Driver can turn snoops off although it does not currently. Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powernv/powerpc: Add winkle support for offline cpusShreyas B. Prabhu2014-12-151-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Winkle is a deep idle state supported in power8 chips. A core enters winkle when all the threads of the core enter winkle. In this state power supply to the entire chiplet i.e core, private L2 and private L3 is turned off. As a result it gives higher powersavings compared to sleep. But entering winkle results in a total hypervisor state loss. Hence the hypervisor context has to be preserved before entering winkle and restored upon wake up. Power-on Reset Engine (PORE) is a dedicated engine which is responsible for powering on the chiplet during wake up. It can be programmed to restore the register contests of a few specific registers. This patch uses PORE to restore register state wherever possible and uses stack to save and restore rest of the necessary registers. With hypervisor state restore things fall under three categories- per-core state, per-subcore state and per-thread state. To manage this, extend the infrastructure introduced for sleep. Mainly we add a paca variable subcore_sibling_mask. Using this and the core_idle_state we can distingush first thread in core and subcore. Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powernv/cpuidle: Redesign idle states managementShreyas B. Prabhu2014-12-151-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Deep idle states like sleep and winkle are per core idle states. A core enters these states only when all the threads enter either the particular idle state or a deeper one. There are tasks like fastsleep hardware bug workaround and hypervisor core state save which have to be done only by the last thread of the core entering deep idle state and similarly tasks like timebase resync, hypervisor core register restore that have to be done only by the first thread waking up from these state. The current idle state management does not have a way to distinguish the first/last thread of the core waking/entering idle states. Tasks like timebase resync are done for all the threads. This is not only is suboptimal, but can cause functionality issues when subcores and kvm is involved. This patch adds the necessary infrastructure to track idle states of threads in a per-core structure. It uses this info to perform tasks like fastsleep workaround and timebase resync only once per core. Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com> Originally-by: Preeti U. Murthy <preeti@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: linux-pm@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/powernv: Enable Offline CPUs to enter deep idle statesShreyas B. Prabhu2014-12-151-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The secondary threads should enter deep idle states so as to gain maximum powersavings when the entire core is offline. To do so the offline path must be made aware of the available deepest idle state. Hence probe the device tree for the possible idle states in powernv core code and expose the deepest idle state through flags. Since the device tree is probed by the cpuidle driver as well, move the parameters required to discover the idle states into an appropriate common place to both the driver and the powernv core code. Another point is that fastsleep idle state may require workarounds in the kernel to function properly. This workaround is introduced in the subsequent patches. However neither the cpuidle driver or the hotplug path need be bothered about this workaround. They will be taken care of by the core powernv code. Originally-by: Srivatsa S. Bhat <srivatsa@mit.edu> Signed-off-by: Preeti U. Murthy <preeti@linux.vnet.ibm.com> Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com> Reviewed-by: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: linux-pm@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* i2c: Driver to expose PowerNV platform i2c bussesNeelesh Gupta2014-12-141-0/+29
| | | | | | | | | | | | | The patch exposes the available i2c busses on the PowerNV platform to the kernel and implements the bus driver to support i2c and smbus commands. The driver uses the platform device infrastructure to probe the busses on the platform and registers them with the i2c driver framework. Signed-off-by: Neelesh Gupta <neelegup@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Wolfram Sang <wsa@the-dreams.de> (I2C part, excluding the bindings) Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/powernv: Cleanup unused MCE definitions/declarations.Mahesh Salgaonkar2014-12-021-104/+0
| | | | | | | | | Cleanup OpalMCE_* definitions/declarations and other related code which is not used anymore. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Acked-by: Benjamin Herrrenschmidt <benh@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* rtc/tpo: Driver to support rtc and wakeup on PowerNV platformNeelesh Gupta2014-11-171-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The patch implements the OPAL rtc driver that binds with the rtc driver subsystem. The driver uses the platform device infrastructure to probe the rtc device and register it to rtc class framework. The 'wakeup' is supported depending upon the property 'has-tpo' present in the OF node. It provides a way to load the generic rtc driver in in the absence of an OPAL driver. The patch also moves the existing OPAL rtc get/set time interfaces to the new driver and exposes the necessary OPAL calls using EXPORT_SYMBOL_GPL. Test results: ------------- Host: [root@tul169p1 ~]# ls -l /sys/class/rtc/ total 0 lrwxrwxrwx 1 root root 0 Oct 14 03:07 rtc0 -> ../../devices/opal-rtc/rtc/rtc0 [root@tul169p1 ~]# cat /sys/devices/opal-rtc/rtc/rtc0/time 08:10:07 [root@tul169p1 ~]# echo `date '+%s' -d '+ 2 minutes'` > /sys/class/rtc/rtc0/wakealarm [root@tul169p1 ~]# cat /sys/class/rtc/rtc0/wakealarm 1413274345 [root@tul169p1 ~]# FSP: $ smgr mfgState standby $ rtim timeofday System time is valid: 2014/10/14 08:12:04.225115 $ smgr mfgState ipling $ CC: devicetree@vger.kernel.org CC: tglx@linutronix.de CC: rtc-linux@googlegroups.com CC: a.zummo@towertech.it Signed-off-by: Neelesh Gupta <neelegup@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/powernv: Add OPAL IPMI interfaceJeremy Kerr2014-11-121-0/+17
| | | | | | | | | | | | | Recent OPAL firmare adds a couple of functions to send and receive IPMI messages: https://github.com/open-power/skiboot/commit/b2a374da This change updates the token list and wrappers to suit, and adds the platform devices for any IPMI interfaces. Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/opal: Add PHB to cxl mode callIan Munsie2014-10-081-0/+2
| | | | | | | | This adds the OPAL call to change a PHB into cxl mode. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/powernv: Sync OpalPciResetScope with firmwareGavin Shan2014-09-301-3/+6
| | | | | | | | The names of PCI reset scopes aren't sychronized with firmware. The patch fixes it. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/powernv: Sync header with firmwareGavin Shan2014-09-301-0/+32
| | | | | | | | | The patch synchronizes firmware header file (opal.h) for PCI error injection. Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/powernv: Add OPAL check token callMichael Neuling2014-09-251-0/+2
| | | | | | | | | | | Currently there is no way to generically check if an OPAL call exists or not from the host kernel. This adds an OPAL call opal_check_token() which tells you if the given token is present in OPAL or not. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/powernv: Interface to register/unregister opal dump regionVasant Hegde2014-08-131-0/+11
| | | | | | | | | | | | PowerNV platform is capable of capturing host memory region when system crashes (because of host/firmware). We have new OPAL API to register/ unregister memory region to be captured when system crashes. This patch adds support for new API. Also during boot time we register kernel log buffer and unregister before doing kexec. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/powernv: Invoke opal call to handle hmi.Mahesh Salgaonkar2014-08-051-0/+47
| | | | | | | | | | | | | When we hit the HMI in Linux, invoke opal call to handle/recover from HMI errors in real mode and then in virtual mode during check_irq_replay() invoke opal_poll_events()/opal_do_notifier() to retrieve HMI event from OPAL and act accordingly. Now that we are ready to handle HMI interrupt directly in linux, remove the HMI interrupt registration with firmware. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/book3s: Add basic infrastructure to handle HMI in Linux.Mahesh Salgaonkar2014-08-051-0/+2
| | | | | | | | | | Handle Hypervisor Maintenance Interrupt (HMI) in Linux. This patch implements basic infrastructure to handle HMI in Linux host. The design is to invoke opal handle hmi in real mode for recovery and set irq_pending when we hit HMI. During check_irq_replay pull opal hmi event and print hmi info on console. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/powernv: Allow to freeze PEGavin Shan2014-08-051-1/+8
| | | | | | | | | The patch synchronizes header file with firmware to have new OPAL API opal_pci_eeh_freeze_set(), which is used to freeze the specified PE in order to support "compound" PE. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc/powernv: Enable M64 aperatus for PHB3Guo Chao2014-08-051-1/+7
| | | | | | | | | | | | | | | | | | | | | This patch enables M64 aperatus for PHB3. We already had platform hook (ppc_md.pcibios_window_alignment) to affect the PCI resource assignment done in PCI core so that each PE's M32 resource was built on basis of M32 segment size. Similarly, we're using that for M64 assignment on basis of M64 segment size. * We're using last M64 BAR to cover M64 aperatus, and it's shared by all 256 PEs. * We don't support P7IOC yet. However, some function callbacks are added to (struct pnv_phb) so that we can reuse them on P7IOC in future. * PE, corresponding to PCI bus with large M64 BAR device attached, might span multiple M64 segments. We introduce "compound" PE to cover the case. The compound PE is a list of PEs and the master PE is used as before. The slave PEs are just for MMIO isolation. Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
OpenPOWER on IntegriCloud