diff options
author | Prarit Bhargava <prarit@redhat.com> | 2014-10-23 14:22:12 -0400 |
---|---|---|
committer | Bjorn Helgaas <bhelgaas@google.com> | 2014-11-06 15:10:06 -0700 |
commit | 63692df103e9c76b26c382ce8079283b7df9a99a (patch) | |
tree | 677204f229cd7ad3dcf072d8d7eb44ce01018e9e /drivers/pci | |
parent | f114040e3ea6e07372334ade75d1ee0775c355e1 (diff) | |
download | op-kernel-dev-63692df103e9c76b26c382ce8079283b7df9a99a.zip op-kernel-dev-63692df103e9c76b26c382ce8079283b7df9a99a.tar.gz |
PCI: Allow numa_node override via sysfs
NUMA systems with ACPI normally describe the physical topology via _PXM
methods. But many BIOSes don't implement _PXM, which leaves the kernel
with no way to discover the device topology, which reduces performance
because we can't put memory and processes close to the device.
The NUMA node of a PCI device is already exported in the sysfs "numa_node"
file. Make that file writable so users can workaround the lack of _PXM
methods in the BIOS. For example:
echo 3 > /sys/devices/pci0000:ff/0000:03:1f.3/numa_node
sets the node for PCI device 0000:03:1f.3.
Writing the file emits a FW_BUG warning to encourage users to request
firmware updates. It also taints the kernel with TAINT_FIRMWARE_WORKAROUND
because overriding the node incorrectly can cause performance issues.
[bhelgaas: changelog, documentation text]
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Myron Stowe <mstowe@redhat.com>
CC: Alexander Ducyk <alexander.h.duyck@redhat.com>
CC: Jiang Liu <jiang.liu@linux.intel.com>
Diffstat (limited to 'drivers/pci')
-rw-r--r-- | drivers/pci/pci-sysfs.c | 27 |
1 files changed, 26 insertions, 1 deletions
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c index 92b6d9a..91e760f 100644 --- a/drivers/pci/pci-sysfs.c +++ b/drivers/pci/pci-sysfs.c @@ -221,12 +221,37 @@ static ssize_t enabled_show(struct device *dev, struct device_attribute *attr, static DEVICE_ATTR_RW(enabled); #ifdef CONFIG_NUMA +static ssize_t numa_node_store(struct device *dev, + struct device_attribute *attr, const char *buf, + size_t count) +{ + struct pci_dev *pdev = to_pci_dev(dev); + int node, ret; + + if (!capable(CAP_SYS_ADMIN)) + return -EPERM; + + ret = kstrtoint(buf, 0, &node); + if (ret) + return ret; + + if (!node_online(node)) + return -EINVAL; + + add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK); + dev_alert(&pdev->dev, FW_BUG "Overriding NUMA node to %d. Contact your vendor for updates.", + node); + + dev->numa_node = node; + return count; +} + static ssize_t numa_node_show(struct device *dev, struct device_attribute *attr, char *buf) { return sprintf(buf, "%d\n", dev->numa_node); } -static DEVICE_ATTR_RO(numa_node); +static DEVICE_ATTR_RW(numa_node); #endif static ssize_t dma_mask_bits_show(struct device *dev, |