op-kernel-dev - Development kernel branch for OpenPOWER systems

	Commit message (Collapse)	Author	Age	Files	Lines
*	drm/i915: make mappable struct resource centric	Matthew Auld	2017-12-12	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that we are using struct resource to track the stolen region, it is more convenient if we track the mappable region in a resource as well. v2: prefer iomap and gmadr naming scheme prefer DEFINE_RES_MEM Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171211151822.20953-8-matthew.auld@intel.com
*	drm/i915: Refactor common list iteration over GGTT vma	Chris Wilson	2017-12-07	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In quite a few places, we have a list iteration over the vma on an object that only want to inspect GGTT vma. By construction, these are placed at the start of the list, so we have copied that knowledge into many callsites. Pull that knowledge back to i915_vma.h and provide a for_each_ggtt_vma() to tidy up the code. v2: Add a backreference from vma_create() to remind ourselves why we put ggtt vma at the head of the obj->vma_list (and ppgtt vma at the tail). v3: Fixup s/vma/V/ Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20171207211407.31549-1-chris@chris-wilson.co.uk
*	drm/i915: Track GGTT writes on the vma	Chris Wilson	2017-12-07	1	-0/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As writes through the GTT and GGTT PTE updates do not share the same path, they are not strictly ordered and so we must explicitly flush the indirect writes prior to modifying the PTE. We do track outstanding GGTT writes on the object itself, but since the object may have multiple GGTT vma, that is overly coarse as we can track and flush individual vma as required. Whilst here, update the GGTT flushing behaviour for Cannonlake. v2: Hard-code ring offset to allow use during unload (after RCS may have been freed, or never existed!) References: https://bugs.freedesktop.org/show_bug.cgi?id=104002 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171206124914.19960-2-chris@chris-wilson.co.uk
*	drm/i915: Remove vma from object on destroy, not close	Chris Wilson	2017-12-07	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Originally we translated from the object to the vma by walking obj->vma_list to find the matching vm (for user lookups). Now we process user lookups using the rbtree, and we only use obj->vma_list itself for maintaining state (e.g. ensuring that all vma are flushed or rebound). As such maintenance needs to go on beyond the user's awareness of the vma, defer removal of the vma from the obj->vma_list from i915_vma_close() to i915_vma_destroy() Fixes: 5888fc9eac3c ("drm/i915: Flush pending GTT writes before unbinding") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104155 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171206124914.19960-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
*	drm/i915: Mark up i915_vma_unbind() as a potential sleeper	Chris Wilson	2017-11-09	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Whenever we want to unbind a vma, we must wait on all GPU activity to complete first. (This is what gives us the ability to do fine grained eviction and purging by only having to wait on the VMA that we need to unbind to proceed; though if pushed we can make it a rule that we are only allowed to unbind already idle VMA and move the burden of the work and organising the sleep onto the caller.) Currently, we might only sleep if the vma is still active on the GPU, but in principle i915_vma_unbind() always implies a sleep, so mark it up with a might_sleep(). Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> References: https://bugs.freedesktop.org/show_bug.cgi?id=103638 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171109213450.13875-2-chris@chris-wilson.co.uk
*	drm/i915: Prune the reservation shared fence array	Chris Wilson	2017-11-08	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The shared fence array is not autopruning and may continue to grow as an object is shared between new timelines. Take the opportunity when we think the object is idle (we have to confirm that any external fence is also signaled) to decouple all the fences. We apply a similar trick after waiting on an object, see commit e54ca9774777 ("drm/i915: Remove completed fences after a wait") v2: No longer need to handle the batch pool as a special case. v3: Need to trylock from within i915_vma_retire as this may be called form the shrinker - and we may later try to allocate underneath the reservation lock, so a deadlock is possible. References: https://bugs.freedesktop.org/show_bug.cgi?id=102936 Fixes: d07f0e59b2c7 ("drm/i915: Move GEM activity tracking into a common struct reservation_object") Fixes: 80b204bce8f2 ("drm/i915: Enable multiple timelines") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171107220656.5020-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
*	drm/i915: Assert vma->flags are updated correctly during binding	Chris Wilson	2017-11-06	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As we bind, and unbind on error, we want to be sure that the vma->flags are updated to reflect the binding state so that on the next invocation all is well. v2: Take two. v3: Take three; vma-misplaced is checking map-and-fenceable so keep it last! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171105124550.32715-1-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
*	drm/i915: Move dev_priv->mm.[un]bound_list to its own lock	Chris Wilson	2017-10-16	1	-4/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove the struct_mutex requirement around dev_priv->mm.bound_list and dev_priv->mm.unbound_list by giving it its own spinlock. This reduces one more requirement for struct_mutex and in the process gives us slightly more accurate unbound_list tracking, which should improve the shrinker - but the drawback is that we drop the retirement before counting so i915_gem_object_is_active() may be stale and lead us to underestimate the number of objects that may be shrunk (see commit bed50aea61df ("drm/i915/shrinker: Flush active on objects before counting")). v2: Crosslink the spinlock to the lists it protects, and btw this changes s/obj->global_link/obj->mm.link/ v3: Fix decoupling of old links in i915_gem_object_attach_phys() v3.1: Fix the fix, only unlink if it was linked v3.2: Use a local for to_i915(obj->base.dev)->mm.obj_lock Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171016114037.5556-1-chris@chris-wilson.co.uk
*	drm/i915: Track user GTT faulting per-vma	Chris Wilson	2017-10-09	1	-1/+27
\| \| \| \| \| \| \| \| \| \| \| \|	We don't wish to refault the entire object (other vma) when unbinding one partial vma. To do this track which vma have been faulted into the user's address space. v2: Use a local vma_offset to tidy up a multiline unmap_mapping_range(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-3-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
*	drm/i915: Consolidate get_fence with pin_fence	Chris Wilson	2017-10-09	1	-3/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Following the pattern now used for obj->mm.pages, use just pin_fence and unpin_fence to control access to the fence registers. I.e. instead of calling get_fence(); pin_fence(), we now just need to call pin_fence(). This will make it easier to reduce the locking requirements around fence registers. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-2-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
*	drm/i915: Pin fence for iomap	Chris Wilson	2017-10-09	1	-4/+31
\| \| \| \| \| \| \| \| \| \| \| \| \|	Acquire the fence register for the iomap in i915_vma_pin_iomap() on behalf of the caller. We probably want for the caller to specify whether the fence should be pinned for their usage, but at the moment all callers do want the associated fence, or none, so take it on their behalf. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-1-chris@chris-wilson.co.uk
*	drm/i915: Assert we do not try to expand VMA for hugepage inside GGTT	Chris Wilson	2017-10-09	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We only apply the hugepage PD redirection inside the ppGTT, so during i915_vma_insert() we want to exclude the GGTT from the additional alignment constraints (thereby avoiding the extra GTT pressure from fragmentation). Add an assert to document that intention alongside the comment. v2: After discussion with Matthew, make it a blanket GGTT ban (previously we allowed the expansion for appgtt, and so indirectly ggtt). There are issues we need to fix before allowing the current appgtt to be used with hugepages, and if we do, we probably want more care over when to expand/align, as the mappable aperture inside the ggtt is precious. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171009092019.20747-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
*	drm/i915: align 64K objects to 2M	Matthew Auld	2017-10-07	1	-1/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We can't mix 64K and 4K pte's in the same page-table, so for now we align 64K objects to 2M to avoid any potential mixing. This is potentially wasteful but in reality shouldn't be too bad since this only applies to the virtual address space of a 48b PPGTT. v2: don't separate logically connected ops Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-10-matthew.auld@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-9-chris@chris-wilson.co.uk
*	drm/i915: align the vma start to the largest gtt page size	Matthew Auld	2017-10-07	1	-0/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For the 48b PPGTT try to align the vma start address to the required page size boundary to guarantee we use said page size in the gtt. If we are dealing with multiple page sizes, we can't guarantee anything and just align to the largest. For soft pinning and objects which need to be tightly packed into the lower 32bits we don't force any alignment. v2: various improvements suggested by Chris v3: use set_pages and better placement of page_sizes v4: prefer upper_32_bits() v5: assign vma->page_sizes = vma->obj->page_sizes directly prefer sizeof(vma->page_sizes) v6: fixup checking of end to exclude GGTT (which are assumed to be limited to 4G). Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-9-matthew.auld@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-8-chris@chris-wilson.co.uk
*	drm/i915: introduce vm set_pages/clear_pages	Matthew Auld	2017-10-07	1	-11/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move the setting/clearing of the vma->pages to a vm operation. Doing so neatens things up a little, but more importantly gives us a sane place to also set/clear the vma->pages_sizes, which we introduce later in preparation for supporting huge-pages. v2: remove redundant vma->pages check v3: GEM_BUG_ON(vma->pages) following i915_vma_remove Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-8-matthew.auld@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-7-chris@chris-wilson.co.uk
*	drm/i915: Replace execbuf vma ht with an idr	Chris Wilson	2017-08-18	1	-22/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This was the competing idea long ago, but it was only with the rewrite of the idr as an radixtree and using the radixtree directly ourselves, along with the realisation that we can store the vma directly in the radixtree and only need a list for the reverse mapping, that made the patch performant enough to displace using a hashtable. Though the vma ht is fast and doesn't require any extra allocation (as we can embed the node inside the vma), it does require a thread for resizing and serialization and will have the occasional slow lookup. That is hairy enough to investigate alternatives and favour them if equivalent in peak performance. One advantage of allocating an indirection entry is that we can support a single shared bo between many clients, something that was done on a first-come first-serve basis for shared GGTT vma previously. To offset the extra allocations, we create yet another kmem_cache for them. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170816085210.4199-5-chris@chris-wilson.co.uk
*	drm/i915: Assert the vma's active tracking is clear before free	Chris Wilson	2017-06-21	1	-2/+7
\| \| \| \| \| \| \| \| \| \| \| \| \|	In looking at a use-after-free on Baytrail, it looks like the VMA's activity tracking is suspect. Add some asserts to catch freeing the VMA before we have decoupled all of its i915_gem_active trackers. References: https://bugs.freedesktop.org/show_bug.cgi?id=101511 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170620124321.1108-3-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
*	drm/i915: Retire the VMA's fence tracker before unbinding	Chris Wilson	2017-06-21	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since we may track unfenced access (GPU access to the vma that explicitly requires no fence), vma->last_fence may be set without any attached fence (vma->fence) and so will not be flushed when we call i915_vma_put_fence(). Since we stopped doing a full retire of the activity trackers for unbind, we need to explicitly retire each tracker. Fixes: b0decaf75bd9 ("drm/i915: Track active vma requests") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170620124321.1108-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
*	drm/i915: Stash a pointer to the obj's resv in the vma	Chris Wilson	2017-06-16	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	During execbuf, a mandatory step is that we add this request (this fence) to each object's reservation_object. Inside execbuf, we track the vma, and to add the fence to the reservation_object then means having to first chase the obj, incurring another cache miss. We can reduce the number of cache misses by stashing a pointer to the reservation_object in the vma itself. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170616140525.6394-1-chris@chris-wilson.co.uk
*	drm/i915: Store a persistent reference for an object in the execbuffer cache	Chris Wilson	2017-06-16	1	-0/+2
\| \| \| \| \| \| \| \| \|	If we take a reference to the object/vma when it is first used in an execbuf, we can keep that reference until the object's file-local handle is closed. Thereby saving a frequent ref/unref pair. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
*	drm/i915: Eliminate lots of iterations over the execobjects array	Chris Wilson	2017-06-16	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The major scaling bottleneck in execbuffer is the processing of the execobjects. Creating an auxiliary list is inefficient when compared to using the execobject array we already have allocated. Reservation is then split into phases. As we lookup up the VMA, we try and bind it back into active location. Only if that fails, do we add it to the unbound list for phase 2. In phase 2, we try and add all those objects that could not fit into their previous location, with fallback to retrying all objects and evicting the VM in case of severe fragmentation. (This is the same as before, except that phase 1 is now done inline with looking up the VMA to avoid an iteration over the execobject array. In the ideal case, we eliminate the separate reservation phase). During the reservation phase, we only evict from the VM between passes (rather than currently as we try to fit every new VMA). In testing with Unreal Engine's Atlantis demo which stresses the eviction logic on gen7 class hardware, this speed up the framerate by a factor of 2. The second loop amalgamation is between move_to_gpu and move_to_active. As we always submit the request, even if incomplete, we can use the current request to track active VMA as we perform the flushes and synchronisation required. The next big advancement is to avoid copying back to the user any execobjects and relocations that are not changed. v2: Add a Theory of Operation spiel. v3: Fall back to slow relocations in preparation for flushing userptrs. v4: Document struct members, factor out eb_validate_vma(), add a few more comments to explain some magic and hide other magic behind macros. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
*	drm/i915: Store a direct lookup from object handle to vma	Chris Wilson	2017-06-16	1	-0/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The advent of full-ppgtt lead to an extra indirection between the object and its binding. That extra indirection has a noticeable impact on how fast we can convert from the user handles to our internal vma for execbuffer. In order to bypass the extra indirection, we use a resizable hashtable to jump from the object to the per-ctx vma. rhashtable was considered but we don't need the online resizing feature and the extra complexity proved to undermine its usefulness. Instead, we simply reallocate the hastable on demand in a background task and serialize it before iterating. In non-full-ppgtt modes, multiple files and multiple contexts can share the same vma. This leads to having multiple possible handle->vma links, so we only use the first to establish the fast path. The majority of buffers are not shared and so we should still be able to realise speedups with multiple clients. v2: Prettier names, more magic. v3: Many style tweaks, most notably hiding the misuse of execobj[].rsvd2 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
*	drm/i915: Make i915_vma_destroy() static	Chris Wilson	2017-06-16	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	i915_vma_destroy() is now not used outside of i915_vma.c so we can remove the export and make the function static. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170616123508.12673-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
*	drm/i915: Use vma->exec_entry as our double-entry placeholder	Chris Wilson	2017-06-15	1	-1/+0
\| \| \| \| \| \| \| \| \| \|	This has the benefit of not requiring us to manipulate the vma->exec_link list when tearing down the execbuffer, and is a marginally cheaper test to detect the user error. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170615081435.17699-2-chris@chris-wilson.co.uk
*	drm/i915: Remove the vma from the drm_mm if binding fails	Chris Wilson	2017-02-27	1	-20/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As we track whether a vma has been inserted into the drm_mm using the vma->flags, if we fail to bind the vma into the GTT we do not update those bits and will attempt to reinsert the vma into the drm_mm on future passes. To prevent that, we want to unwind i915_vma_insert() if we fail in our attempt to bind. Fixes: 59bfa1248e22 ("drm/i915: Start passing around i915_vma from execbuffer") Testcase: igt/drv_selftest/live_gtt Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <stable@vger.kernel.org> # v4.9+ Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170227122654.27651-3-chris@chris-wilson.co.uk
*	drm/i915: Sanity check the vma->node prior to binding into the GTT	Chris Wilson	2017-02-25	1	-6/+9
\| \| \| \| \| \| \| \| \| \|	We rely on the VMA being allocated inside the drm_mm and for its allotted node being large enough to accommodate all the vma->pages. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170225181122.4788-3-chris@chris-wilson.co.uk
*	drm/i915: Move allocate_va_range to GTT	Chris Wilson	2017-02-15	1	-9/+0
\| \| \| \| \| \| \| \| \| \|	In the future, we need to call allocate_va_range on the aliasing-ppgtt which means moving the call down from the vma into the vm (which is more appropriate for calling the vm function). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170215084357.19977-8-chris@chris-wilson.co.uk
*	drm/i915: Exercise i915_vma_pin/i915_vma_insert	Chris Wilson	2017-02-13	1	-2/+2
\| \| \| \| \| \| \| \| \|	High-level testing of the struct drm_mm by verifying our handling of weird requests to i915_vma_pin. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-35-chris@chris-wilson.co.uk
*	drm/i915: Test creation of VMA	Chris Wilson	2017-02-13	1	-0/+3
\| \| \| \| \| \| \| \|	Simple test to exercise creation and lookup of VMA within an object. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-34-chris@chris-wilson.co.uk
*	drm/i915: Assert that we never create a vma for the aliasing_ppgtt	Chris Wilson	2017-02-09	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \|	The aliasing_ppgtt is just a container for the HW context that mirrors the global gtt. It should never be used directly, so assert if we make the mistake of trying to allocate a VMA for it. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170209111933.12420-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
*	drm/i915: Pevent copying uninitialised garbage into vma->ggtt_view	Chris Wilson	2017-01-23	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since tweaking i915_vma_compare() we allowed constructors to skip clearing the ggtt_view believing that we didn't access the unused members. That, as it turns out, was not entirely true. In particular, i915_gem_fault() uses ret = remap_io_mapping(area, area->vm_start + (vma->ggtt_view.partial.offset << PAGE_SHIFT), (ggtt->mappable_base + vma->node.start) >> PAGE_SHIFT, min_t(u64, vma->size, area->vm_end - area->vm_start), &ggtt->mappable); i.e. the ggtt_view.partial for both normal and partial views. If we allowed garbage into the normal vma->ggtt_view and then try userspace tried to mmap it, we could explode in an unobvious fashion. Fixes: 7b92c047bae2 ("drm/i915: Eliminate superfluous i915_ggtt_view_rotated") Fixes: 3bf4d5751943 ("drm/i915: Stop clearing i915_ggtt_view") Reported-by: Matthew Auld <matthew.william.auld@gmail.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170123145245.3972-1-chris@chris-wilson.co.uk Tested-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com>
*	drm/i915: reinstate call to trace_i915_vma_bind	Daniele Ceraolo Spurio	2017-01-21	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The call went away in: commit 3b16525cc4c1a43e9053cfdc414356eea24bdfad Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Thu Aug 4 16:32:25 2016 +0100 drm/i915: Split insertion/binding of an object into the VM It is useful to have this trace as it pairs nicely with the vma_unbind one to track vma activity. Added inside the i915_vma_bind function (was outside before) to keep a similar placement as trace_i915_vma_unbind. v2: print bind_flags instead of flags (Chris) Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1484949083-11430-1-git-send-email-daniele.ceraolospurio@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
*	drm/i915: Assert that created vma has a whole number of pages	Chris Wilson	2017-01-21	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \|	VMA (and their objects) are supposed to composed of whole pages. Add an assert to catch any invalid construct when we create the VMA. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170119192659.31789-6-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
*	drm/i915: Assert the drm_mm_node is allocated when on the VM lists	Chris Wilson	2017-01-21	1	-0/+2
\| \| \| \| \| \| \| \| \|	Before moving the vma between the VM active/inactive lists, assert that the node is still allocated. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170119192659.31789-5-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
*	drm/i915: Reject vma creation larger than address space	Chris Wilson	2017-01-21	1	-3/+16
\| \| \| \| \| \| \| \| \| \| \| \|	Disallow creation of a vma that is larger than the available address space, or triggers an overflow on fence expansion. Testcase: igt/gem_exec_reloc/gtt-32 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170119192659.31789-3-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
*	drm/i915: Remove i915_gem_object_to_ggtt()	Chris Wilson	2017-01-19	1	-21/+6
\| \| \| \| \| \| \| \| \|	With the last user of this convenience wrapper gone, we can kill the wrapper and in the process make the lookup function static. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170116152131.18089-5-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
*	drm/i915: Remove i915_vma_create from VMA API	Chris Wilson	2017-01-19	1	-31/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With the introduce of i915_vma_instance() for obtaining the VMA singleton for a (obj, vm, view) tuple, we can remove the i915_vma_create() in favour of a single entry point. We do incur a lookup onto an empty tree, but the i915_vma_create() were being called infrequently and during initialisation, so the small overhead is negligible. v2: Drop the i915_ prefix from the now static vma_create() function Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170116152131.18089-4-chris@chris-wilson.co.uk
*	drm/i915: Add a check that the VMA instance we lookup matches the request	Chris Wilson	2017-01-19	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	Just as added paranoia against our future-selves add another check that the lookup/created VMA instance matches the request. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170116152131.18089-3-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
*	drm/i915: Rename some warts in the VMA API	Chris Wilson	2017-01-19	1	-1/+89
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Whilst writing testcases to exercise the VMA API, some oddities came to light, such as i915_gem_obj_lookup_or_create(). Joonas suggested i915_vma_instance() as a neat replacement, so rename them, move them to i915_vma.c and add some kerneldoc as a sugary bonus. s/i915_gem_obj_to_vma/i915_vma_lookup/ s/i915_gem_obj_lookup_or_create_vma/i915_vma_instance/ Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170116152131.18089-2-chris@chris-wilson.co.uk
*	drm/i915: Convert i915_ggtt_view to use an anonymous union	Chris Wilson	2017-01-14	1	-5/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reading the ggtt_views is much more pleasant without the extra characters from specifying the union (i.e. ggtt_view.partial rather than ggtt_view.params.partial). To make this work inside i915_vma_compare() with only a single memcmp requires us to ensure that there are no uninitialised bytes within each branch of the union (we make sure the structs are packed) and we need to store the size of each branch. v4: Rewrite changelog and add comments explaining the assert. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20170114002827.31315-5-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
*	drm/i915: Assert that we have allocated the drm_mm_node upon pinning	Chris Wilson	2017-01-13	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	We currently check after the slow path that the vma is bound correctly, but we don't currently check after the fast path. This is important in case we accidentally take the fast path and leave the vma misplaced. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170111210937.29252-27-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
*	drm/i915: Extract reserving space in the GTT to a helper	Chris Wilson	2017-01-11	1	-11/+5
\| \| \| \| \| \| \| \| \| \| \| \| \|	Extract drm_mm_reserve_node + calling i915_gem_evict_for_node into its own routine so that it can be shared rather than duplicated. v2: Kerneldoc Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: igvt-g-dev@lists.01.org Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170111112312.31493-2-chris@chris-wilson.co.uk
*	drm/i915: Use the MRU stack search after evicting	Chris Wilson	2017-01-11	1	-36/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When we evict from the GTT to make room for an object, the hole we create is put onto the MRU stack inside the drm_mm range manager. On the next search pass, we can speed up a PIN_HIGH allocation by referencing that stack for the new hole. v2: Pull together the 3 identical implements (ahem, a couple were outdated) into a common routine for allocating a node and evicting as necessary. v3: Detect invalid calls to i915_gem_gtt_insert() v4: kerneldoc Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170111112312.31493-1-chris@chris-wilson.co.uk
*	drm/i915: Replace 4096 with PAGE_SIZE or I915_GTT_PAGE_SIZE	Chris Wilson	2017-01-10	1	-7/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Start converting over from the byte count to its semantic macro, either we want to allocate the size of a physical page in main memory or we want the size of a virtual page in the GTT. 4096 could mean either, but PAGE_SIZE and I915_GTT_PAGE_SIZE are explicit and should help improve code comprehension and future changes. In the future, we may want to use variable GTT page sizes and so have the challenge of knowing which hardcoded values were used to represent a physical page vs the virtual page. v2: Look for a few more 4096s to convert, discover IS_ALIGNED(). v3: 4096ul paranoia, make fence alignment a distinct value of 4096, keep bdw stolen w/a as 4096 until we know better. v4: Add asserts that i915_vma_insert() start/end are aligned to GTT page sizes. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170110144734.26052-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
*	drm/i915: Move ggtt fence/alignment to i915_gem_tiling.c	Chris Wilson	2017-01-10	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \|	Rename i915_gem_get_ggtt_size() and i915_gem_get_ggtt_alignment() to i915_gem_fence_size() and i915_gem_fence_alignment() respectively to better match usage. Similarly move the pair of functions into i915_gem_tiling.c next to the fence restrictions. Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170109161613.11881-6-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
*	drm/i915: Store required fence size/alignment for GGTT vma	Chris Wilson	2017-01-10	1	-32/+29
\| \| \| \| \| \| \| \| \| \| \| \| \|	The fence size/alignment is a combination of the vma size plus object tiling parameters. Those parameters are rarely changed, making the fence size/alignemnt roughly constant for the lifetime of the VMA. We can simplify subsequent calculations by precalculating the size/alignment required for GGTT vma taking fencing into account (with an update if we do change the tiling or stride). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170109161613.11881-4-chris@chris-wilson.co.uk
*	drm/i915: Align GGTT sizes to a fence tile row	Chris Wilson	2017-01-10	1	-2/+8
\| \| \| \| \| \| \| \| \| \| \|	Ensure the view occupies the full tile row so that reads/writes into the VMA do not escape (via fenced detiling) into neighbouring objects - we will pad the object with scratch pages to satisfy the fence. This applies the lazy-tiling we employed on gen2/3 to gen4+. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170109161613.11881-2-chris@chris-wilson.co.uk
*	drm/i915: Use range_overflows()	Chris Wilson	2017-01-06	1	-1/+2
\| \| \| \| \| \| \| \|	Replace a few more open-coded overflow checks with the macro. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170106152013.24684-4-chris@chris-wilson.co.uk
*	Merge tag 'drm-misc-next-2016-12-30' of ↵	Daniel Vetter	2017-01-04	1	-2/+2
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://anongit.freedesktop.org/git/drm-misc into drm-intel-next-queued Directly merge drm-misc into drm-intel since Dave is on vacation and we need the various drm-misc patches (fb format rework, drm mm fixes, selftest framework and others). Also pulled back -rc2 in first to resync with drm-intel-fixes and make sure I can reuse the exact rerere solutions from drm-tip for safety, and because I'm lazy. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
\| *	drm: Wrap drm_mm_node.hole_follows	Chris Wilson	2016-12-28	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Insulate users from changes to the internal hole tracking within struct drm_mm_node by using an accessor for hole_follows. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> [danvet: resolve conflicts in i915_vma.c] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>