summaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/i915/i915_gem_clflush.c
Commit message (Collapse)AuthorAgeFilesLines
* drm/i915: Store i915_gem_object_is_coherent() as a bit next to cache-dirtyChris Wilson2017-06-161-1/+1
| | | | | | | | | | | | | | | | | | | | | For ease of use (i.e. avoiding a few checks and function calls), store the object's cache coherency next to the cache is dirty bit. Specifically this patch aims to reduce the frequency of no-op calls to i915_gem_object_clflush() to counter-act the increase of such calls for GPU only objects in the previous patch. v2: Replace cache_dirty & ~cache_coherent with cache_dirty && !cache_coherent as gcc generates much better code for the latter (Tvrtko) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Dongwon Kim <dongwon.kim@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Tested-by: Dongwon Kim <dongwon.kim@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170616105455.16977-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
* drm/i915: Mark CPU cache as dirty on every transition for CPU writesChris Wilson2017-06-161-8/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, we only mark the CPU cache as dirty if we skip a clflush. This leads to some confusion where we have to ask if the object is in the write domain or missed a clflush. If we always mark the cache as dirty, this becomes a much simply question to answer. The goal remains to do as few clflushes as required and to do them as late as possible, in the hope of deferring the work to a kthread and not block the caller (e.g. execbuf, flips). v2: Always call clflush before GPU execution when the cache_dirty flag is set. This may cause some extra work on llc systems that migrate dirty buffers back and forth - but we do try to limit that by only setting cache_dirty at the end of the gpu sequence. v3: Always mark the cache as dirty upon a level change, as we need to invalidate any stale cachelines due to external writes. Reported-by: Dongwon Kim <dongwon.kim@intel.com> Fixes: a6a7cc4b7db6 ("drm/i915: Always flush the dirty CPU cache when pinning the scanout") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Dongwon Kim <dongwon.kim@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Tested-by: Dongwon Kim <dongwon.kim@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170615123850.26843-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
* drm/i915: Mark up clflushes as belonging to an unordered timelineChris Wilson2017-05-031-7/+1
| | | | | | | | | | | | | | 2 clflushes on two different objects are not ordered, and so do not belong to the same timeline (context). Either we use a unique context for each, or we reserve a special global context to mean unordered. Ideally, we would reserve 0 to mean unordered (DMA_FENCE_NO_CONTEXT) to have the same semantics everywhere. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170503093924.5320-1-chris@chris-wilson.co.uk
* drm/i915: Wait for all fences before installing an exclusive clflush fenceChris Wilson2017-03-231-1/+1
| | | | | | | | | | | | | Ensure that before we overwrite the reservation_object with our exclusive fence for the pending clflush operation, that we do wait upon all the fences in the current reservation_object. Fixes: 57822dc6b9cf ("drm/i915: Perform object clflushing asynchronously") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170323085758.11695-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
* drm/i915: Remove 'retire' parameter from intel_fb_obj_flushChris Wilson2017-02-221-1/+1
| | | | | | | | | | Setting retire=true is identical to using origin=ORIGIN_CS, so make the same simplification to intel_fb_obj_flush() as already employed for intel_fb_obj_invalidate(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170222114049.28456-6-chris@chris-wilson.co.uk
* drm/i915: Perform object clflushing asynchronouslyChris Wilson2017-02-221-0/+189
Flushing the cachelines for an object is slow, can be as much as 100ms for a large framebuffer. We currently do this under the struct_mutex BKL on execution or on pageflip. But now with the ability to add fences to obj->resv for both flips and execbuf (and we naturally wait on the fence before CPU access), we can move the clflush operation to a workqueue and signal a fence for completion, thereby doing the work asynchronously and not blocking the driver or its clients. v2: Introduce i915_gem_clflush.h and use a new name, split out some extras into separate patches. Suggested-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170222114049.28456-5-chris@chris-wilson.co.uk
OpenPOWER on IntegriCloud