summaryrefslogtreecommitdiffstats
path: root/Documentation
diff options
context:
space:
mode:
authorTony Luck <tony.luck@intel.com>2006-06-21 14:50:10 -0700
committerTony Luck <tony.luck@intel.com>2006-06-21 14:50:10 -0700
commit1323523f505606cfd24af6122369afddefc3b09d (patch)
treea3238a27220dd91ec0918478683e59e48605865f /Documentation
parent9ba89334552b96e2127dcafb1c46ce255ecf2667 (diff)
parent32e62c636a728cb39c0b3bd191286f2ca65d4028 (diff)
downloadop-kernel-dev-1323523f505606cfd24af6122369afddefc3b09d.zip
op-kernel-dev-1323523f505606cfd24af6122369afddefc3b09d.tar.gz
Pull rework-memory-attribute-aliasing into release branch
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/ia64/aliasing.txt208
1 files changed, 208 insertions, 0 deletions
diff --git a/Documentation/ia64/aliasing.txt b/Documentation/ia64/aliasing.txt
new file mode 100644
index 0000000..38f9a52
--- /dev/null
+++ b/Documentation/ia64/aliasing.txt
@@ -0,0 +1,208 @@
+ MEMORY ATTRIBUTE ALIASING ON IA-64
+
+ Bjorn Helgaas
+ <bjorn.helgaas@hp.com>
+ May 4, 2006
+
+
+MEMORY ATTRIBUTES
+
+ Itanium supports several attributes for virtual memory references.
+ The attribute is part of the virtual translation, i.e., it is
+ contained in the TLB entry. The ones of most interest to the Linux
+ kernel are:
+
+ WB Write-back (cacheable)
+ UC Uncacheable
+ WC Write-coalescing
+
+ System memory typically uses the WB attribute. The UC attribute is
+ used for memory-mapped I/O devices. The WC attribute is uncacheable
+ like UC is, but writes may be delayed and combined to increase
+ performance for things like frame buffers.
+
+ The Itanium architecture requires that we avoid accessing the same
+ page with both a cacheable mapping and an uncacheable mapping[1].
+
+ The design of the chipset determines which attributes are supported
+ on which regions of the address space. For example, some chipsets
+ support either WB or UC access to main memory, while others support
+ only WB access.
+
+MEMORY MAP
+
+ Platform firmware describes the physical memory map and the
+ supported attributes for each region. At boot-time, the kernel uses
+ the EFI GetMemoryMap() interface. ACPI can also describe memory
+ devices and the attributes they support, but Linux/ia64 currently
+ doesn't use this information.
+
+ The kernel uses the efi_memmap table returned from GetMemoryMap() to
+ learn the attributes supported by each region of physical address
+ space. Unfortunately, this table does not completely describe the
+ address space because some machines omit some or all of the MMIO
+ regions from the map.
+
+ The kernel maintains another table, kern_memmap, which describes the
+ memory Linux is actually using and the attribute for each region.
+ This contains only system memory; it does not contain MMIO space.
+
+ The kern_memmap table typically contains only a subset of the system
+ memory described by the efi_memmap. Linux/ia64 can't use all memory
+ in the system because of constraints imposed by the identity mapping
+ scheme.
+
+ The efi_memmap table is preserved unmodified because the original
+ boot-time information is required for kexec.
+
+KERNEL IDENTITY MAPPINGS
+
+ Linux/ia64 identity mappings are done with large pages, currently
+ either 16MB or 64MB, referred to as "granules." Cacheable mappings
+ are speculative[2], so the processor can read any location in the
+ page at any time, independent of the programmer's intentions. This
+ means that to avoid attribute aliasing, Linux can create a cacheable
+ identity mapping only when the entire granule supports cacheable
+ access.
+
+ Therefore, kern_memmap contains only full granule-sized regions that
+ can referenced safely by an identity mapping.
+
+ Uncacheable mappings are not speculative, so the processor will
+ generate UC accesses only to locations explicitly referenced by
+ software. This allows UC identity mappings to cover granules that
+ are only partially populated, or populated with a combination of UC
+ and WB regions.
+
+USER MAPPINGS
+
+ User mappings are typically done with 16K or 64K pages. The smaller
+ page size allows more flexibility because only 16K or 64K has to be
+ homogeneous with respect to memory attributes.
+
+POTENTIAL ATTRIBUTE ALIASING CASES
+
+ There are several ways the kernel creates new mappings:
+
+ mmap of /dev/mem
+
+ This uses remap_pfn_range(), which creates user mappings. These
+ mappings may be either WB or UC. If the region being mapped
+ happens to be in kern_memmap, meaning that it may also be mapped
+ by a kernel identity mapping, the user mapping must use the same
+ attribute as the kernel mapping.
+
+ If the region is not in kern_memmap, the user mapping should use
+ an attribute reported as being supported in the EFI memory map.
+
+ Since the EFI memory map does not describe MMIO on some
+ machines, this should use an uncacheable mapping as a fallback.
+
+ mmap of /sys/class/pci_bus/.../legacy_mem
+
+ This is very similar to mmap of /dev/mem, except that legacy_mem
+ only allows mmap of the one megabyte "legacy MMIO" area for a
+ specific PCI bus. Typically this is the first megabyte of
+ physical address space, but it may be different on machines with
+ several VGA devices.
+
+ "X" uses this to access VGA frame buffers. Using legacy_mem
+ rather than /dev/mem allows multiple instances of X to talk to
+ different VGA cards.
+
+ The /dev/mem mmap constraints apply.
+
+ However, since this is for mapping legacy MMIO space, WB access
+ does not make sense. This matters on machines without legacy
+ VGA support: these machines may have WB memory for the entire
+ first megabyte (or even the entire first granule).
+
+ On these machines, we could mmap legacy_mem as WB, which would
+ be safe in terms of attribute aliasing, but X has no way of
+ knowing that it is accessing regular memory, not a frame buffer,
+ so the kernel should fail the mmap rather than doing it with WB.
+
+ read/write of /dev/mem
+
+ This uses copy_from_user(), which implicitly uses a kernel
+ identity mapping. This is obviously safe for things in
+ kern_memmap.
+
+ There may be corner cases of things that are not in kern_memmap,
+ but could be accessed this way. For example, registers in MMIO
+ space are not in kern_memmap, but could be accessed with a UC
+ mapping. This would not cause attribute aliasing. But
+ registers typically can be accessed only with four-byte or
+ eight-byte accesses, and the copy_from_user() path doesn't allow
+ any control over the access size, so this would be dangerous.
+
+ ioremap()
+
+ This returns a kernel identity mapping for use inside the
+ kernel.
+
+ If the region is in kern_memmap, we should use the attribute
+ specified there. Otherwise, if the EFI memory map reports that
+ the entire granule supports WB, we should use that (granules
+ that are partially reserved or occupied by firmware do not appear
+ in kern_memmap). Otherwise, we should use a UC mapping.
+
+PAST PROBLEM CASES
+
+ mmap of various MMIO regions from /dev/mem by "X" on Intel platforms
+
+ The EFI memory map may not report these MMIO regions.
+
+ These must be allowed so that X will work. This means that
+ when the EFI memory map is incomplete, every /dev/mem mmap must
+ succeed. It may create either WB or UC user mappings, depending
+ on whether the region is in kern_memmap or the EFI memory map.
+
+ mmap of 0x0-0xA0000 /dev/mem by "hwinfo" on HP sx1000 with VGA enabled
+
+ See https://bugzilla.novell.com/show_bug.cgi?id=140858.
+
+ The EFI memory map reports the following attributes:
+ 0x00000-0x9FFFF WB only
+ 0xA0000-0xBFFFF UC only (VGA frame buffer)
+ 0xC0000-0xFFFFF WB only
+
+ This mmap is done with user pages, not kernel identity mappings,
+ so it is safe to use WB mappings.
+
+ The kernel VGA driver may ioremap the VGA frame buffer at 0xA0000,
+ which will use a granule-sized UC mapping covering 0-0xFFFFF. This
+ granule covers some WB-only memory, but since UC is non-speculative,
+ the processor will never generate an uncacheable reference to the
+ WB-only areas unless the driver explicitly touches them.
+
+ mmap of 0x0-0xFFFFF legacy_mem by "X"
+
+ If the EFI memory map reports this entire range as WB, there
+ is no VGA MMIO hole, and the mmap should fail or be done with
+ a WB mapping.
+
+ There's no easy way for X to determine whether the 0xA0000-0xBFFFF
+ region is a frame buffer or just memory, so I think it's best to
+ just fail this mmap request rather than using a WB mapping. As
+ far as I know, there's no need to map legacy_mem with WB
+ mappings.
+
+ Otherwise, a UC mapping of the entire region is probably safe.
+ The VGA hole means the region will not be in kern_memmap. The
+ HP sx1000 chipset doesn't support UC access to the memory surrounding
+ the VGA hole, but X doesn't need that area anyway and should not
+ reference it.
+
+ mmap of 0xA0000-0xBFFFF legacy_mem by "X" on HP sx1000 with VGA disabled
+
+ The EFI memory map reports the following attributes:
+ 0x00000-0xFFFFF WB only (no VGA MMIO hole)
+
+ This is a special case of the previous case, and the mmap should
+ fail for the same reason as above.
+
+NOTES
+
+ [1] SDM rev 2.2, vol 2, sec 4.4.1.
+ [2] SDM rev 2.2, vol 2, sec 4.4.6.
OpenPOWER on IntegriCloud