op-kernel-dev - Development kernel branch for OpenPOWER systems

	Commit message (Collapse)	Author	Age	Files	Lines
*	drm/i915: Extract the actual workload submission mechanism from execbuffer	Oscar Mateo	2014-07-08	1	-136/+162
\| \| \| \| \| \| \| \| \| \| \| \| \|	So that we isolate the legacy ringbuffer submission mechanism, which becomes a good candidate to be abstracted away. This is prep-work for Execlists (which will its own workload submission mechanism). No functional changes. Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
*	drm/i915: Emphasize that ctx->id is merely a user handle	Oscar Mateo	2014-07-08	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is an Execlists preparatory patch, since they make context ID become an overloaded term: - In the software, it was used to distinguish which context userspace was trying to use. - In the BSpec, the term is used to describe the 20-bits long field the hardware uses to it to discriminate the contexts that are submitted to the ELSP and inform the driver about their current status (via Context Switch Interrupts and Context Status Buffers). Initially, I tried to make the different meanings converge, but it proved impossible: - The software ctx->id is per-filp, while the hardware one needs to be globally unique. - Also, we multiplex several backing states objects per intel_context, and all of them need unique HW IDs. - I tried adding a per-filp ID and then composing the HW context ID as: ctx->id + file_priv->id + ring->id, but the fact that the hardware only uses 20-bits means we have to artificially limit the number of filps or contexts the userspace can create. The ctx->user_handle renaming bits are done with this Cocci patch (plus manual frobbing of the struct declaration): @@ struct intel_context c; @@ - (c).id + c.user_handle @@ struct intel_context *c; @@ - (c)->id + c->user_handle Also, while we are at it, s/DEFAULT_CONTEXT_ID/DEFAULT_CONTEXT_HANDLE and change the type to unsigned 32 bits. v2: s/handle/user_handle and change the type to uint32_t as suggested by Chris Wilson. Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> (v1) Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
*	drm/i915: Track frontbuffer invalidation/flushing	Daniel Vetter	2014-06-19	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	So these are the guts of the new beast. This tracks when a frontbuffer gets invalidated (due to frontbuffer rendering) and hence should be constantly scaned out, and when it's flushed again and can be compressed/one-shot-upload. Rules for flushing are simple: The frontbuffer needs one more full upload starting from the next vblank. Which means that the flushing can _only_ be called once the frontbuffer update has been latched. But this poses a problem for pageflips: We can't just delay the flushing until the pageflip is latched, since that would pose the risk that we override frontbuffer rendering that has been scheduled in-between the pageflip ioctl and the actual latching. To handle this track asynchronous invalidations (and also pageflip) state per-ring and delay any in-between flushing until the rendering has completed. And also cancel any delayed flushing if we get a new invalidation request (whether delayed or not). Also call intel_mark_fb_busy in both cases in all cases to make sure that we keep the screen at the highest refresh rate both on flips, synchronous plane updates and for frontbuffer rendering. v2: Lots of improvements Suggestions from Chris: - Move invalidate/flush in flush__domain and set_to__domain. - Drop the flush in busy_ioctl since it's redundant. Was a leftover from an earlier concept to track flips/delayed flushes. - Don't forget about the initial modeset enable/final disable. Suggested by Chris. Track flips accurately, too. Since flips complete independently of rendering we need to track pending flips in a separate mask. Again if an invalidate happens we need to cancel the evenutal flush to avoid races. v3: Provide correct header declarations for flip functions. Currently not needed outside of intel_display.c, but part of the proper interface. v4: Add proper domain management to fbcon so that the fbcon buffer is also tracked correctly. v5: Fixup locking around the fbcon set_to_gtt_domain call. v6: More comments from Chris: - Split out fbcon changes. - Drop superflous checks for potential scanout before calling intel_fb functions - we can micro-optimize this later. - s/intel_fb_/intel_fb_obj_/ to make it clear that this deals in gem object. We already have precedence for fb_obj in the pin_and_fence functions. v7: Clarify the semantics of the flip flush handling by renaming things a bit: - Don't go through a gem object but take the relevant frontbuffer bits directly. These functions center on the plane, the actual object is irrelevant - even a flip to the same object as already active should cause a flush. - Add a new intel_frontbuffer_flip for synchronous plane updates. It currently just calls intel_frontbuffer_flush since the implemenation differs. This way we achieve a clear split between one-shot update events on one side and frontbuffer rendering with potentially a very long delay between the invalidate and flush. Chris and I also had some discussions about mark_busy and whether it is appropriate to call from flush. But mark busy is a state which should be derived from the 3 events (invalidate, flush, flip) we now have by the users, like psr does by tracking relevant information in psr.busy_frontbuffer_bits. DRRS (the only real use of mark_busy for frontbuffer) needs to have similar logic. With that the overall mark_busy in the core could be removed. v8: Only when retiring gpu buffers only flush frontbuffer bits we actually invalidated in a batch. Just for safety since before any additional usage/invalidate we should always retire current rendering. Suggested by Chris Wilson. v9: Actually use intel_frontbuffer_flip in all appropriate places. Spotted by Chris. v10: Address more comments from Chris: - Don't call _flip in set_base when the crtc is inactive, avoids redunancy in the modeset case with the initial enabling of all planes. - Add comments explaining that the initial/final plane enable/disable still has work left to do before it's fully generic. v11: Only invalidate for gtt/cpu access when writing. Spotted by Chris. v12: s/_flush/_flip/ in intel_overlay.c per Chris' comment. Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
*	drm/i915: Fix __user sparse warning	Ville Syrjälä	2014-06-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	CHECK linux/drivers/gpu/drm/i915/i915_gem_execbuffer.c linux/drivers/gpu/drm/i915/i915_gem_execbuffer.c:1529:47: warning: incorrect type in initializer (different address spaces) linux/drivers/gpu/drm/i915/i915_gem_execbuffer.c:1529:47: expected struct drm_i915_gem_exec_object2 user_exec_list linux/drivers/gpu/drm/i915/i915_gem_execbuffer.c:1529:47: got void [noderef] <asn:1> linux/drivers/gpu/drm/i915/i915_gem_execbuffer.c:1533:61: warning: incorrect type in argument 1 (different address spaces) linux/drivers/gpu/drm/i915/i915_gem_execbuffer.c:1533:61: expected void [noderef] <asn:1>dst linux/drivers/gpu/drm/i915/i915_gem_execbuffer.c:1533:61: got unsigned long long <noident> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
*	Merge commit '9e9a928eed8796a0a1aaed7e0b676db86ba84594' into drm-next	Dave Airlie	2014-06-05	1	-41/+89
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Merge drm-fixes into drm-next. Both i915 and radeon need this done for later patches. Conflicts: drivers/gpu/drm/drm_crtc_helper.c drivers/gpu/drm/i915/i915_drv.h drivers/gpu/drm/i915/i915_gem.c drivers/gpu/drm/i915/i915_gem_execbuffer.c drivers/gpu/drm/i915/i915_gem_gtt.c
\| *	drm/i915: Prevent negative relocation deltas from wrapping	Chris Wilson	2014-05-27	1	-19/+57
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is pure evil. Userspace, I'm looking at you SNA, repacks batch buffers on the fly after generation as they are being passed to the kernel for execution. These batches also contain self-referenced relocations as a single buffer encompasses the state commands, kernels, vertices and sampler. During generation the buffers are placed at known offsets within the full batch, and then the relocation deltas (as passed to the kernel) are tweaked as the batch is repacked into a smaller buffer. This means that userspace is passing negative relocations deltas, which subsequently wrap to large values if the batch is at a low address. The GPU hangs when it then tries to use the large value as a base for its address offsets, rather than wrapping back to the real value (as one would hope). As the GPU uses positive offsets from the base, we can treat the relocation address as the minimum address read by the GPU. For the upper bound, we trust that userspace will not read beyond the end of the buffer. So, how do we fix negative relocations from wrapping? We can either check that every relocation looks valid when we write it, and then position each object such that we prevent the offset wraparound, or we just special-case the self-referential behaviour of SNA and force all batches to be above 256k. Daniel prefers the latter approach. This fixes a GPU hang when it tries to use an address (relocation + offset) greater than the GTT size. The issue would occur quite easily with full-ppgtt as each fd gets its own VM space, so low offsets would often be handed out. However, with the rearrangement of the low GTT due to capturing the BIOS framebuffer, it is already affecting kernels 3.15 onwards. I think only IVB+ is susceptible to this bug, but the workaround should only kick in rarely, so it seems sensible to always apply it. v3: Use a bias for batch buffers to prevent small negative delta relocations from wrapping. v4 from Daniel: - s/BIAS/BATCH_OFFSET_BIAS/ - Extract eb_vma_misplaced/i915_vma_misplaced since the conditions were growing rather cumbersome. - Add a comment to eb_get_batch explaining why we do this. - Apply the batch offset bias everywhere but mention that we've only observed it on gen7 gpus. - Drop PIN_OFFSET_FIX for now, that slipped in from a feature patch. v5: Add static to eb_get_batch, spotted by 0-day tester. Testcase: igt/gem_bad_reloc Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78533 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v3) Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| *	drm/i915: Only copy back the modified fields to userspace from execbuffer	Chris Wilson	2014-05-27	1	-22/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We only want to modifiy a single field in the userspace view of the execbuffer command buffer, so explicitly change that rather than copy everything back again. This serves two purposes: 1. The single fields are much cheaper to copy (constant size so the copy uses special case code) and much smaller than the whole array. 2. We modify the array for internal use that need to be masked from the user. Note: We need this backported since without it the next bugfix will blow up when userspace recycles batchbuffers and relocations. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \|	drm/i915: s/i915_hw_context/intel_context	Oscar Mateo	2014-05-22	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Up until now, contexts had one (and only one) backing object that was used by the hardware to save/restore render ring contexts (via the MI_SET_CONTEXT command). Other rings did not have or need this, so our i915_hw_context struct had a 1:1 relationship with a a real HW context. With Logical Ring Contexts and Execlists, this is not possible anymore: all rings need a backing object, and it cannot be reused. To prepare for that, rename our contexts to the more generic term intel_context. No functional changes. Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \|	drm/i915: s/intel_ring_buffer/intel_engine_cs	Oscar Mateo	2014-05-22	1	-9/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the upcoming patches we plan to break the correlation between engine command streamers (a.k.a. rings) and ringbuffers, so it makes sense to refactor the code and make the change obvious. No functional changes. Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \|	drm/i915: move bsd dispatch index somewhere better	Daniel Vetter	2014-05-22	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Adding stuff at the bottom is really no how this should be done, since that's the place for ums/dri dungeons. This was added in commit a8ebba75b358f9c912cbcba0c14a2072e7280b2f Author: Zhao Yakui <yakui.zhao@intel.com> Date: Thu Apr 17 10:37:40 2014 +0800 drm/i915: Use the coarse ping-pong mechanism based on drm fd to dispatch the BSD command on BDW GT3 Also add a note to prevent this from happening again - people really should be less lazy and take more time to look for a good home of their new driver-global state. Cc: Imre Deak <imre.deak@intel.com> Cc: Zhao Yakui <yakui.zhao@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \|	drm/i915: Retire requests before creating a new one	Chris Wilson	2014-05-19	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	More fallout from commit c8725f3dc0911d4354315a65150aecd8b7d0d74a Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Mon Mar 17 12:21:55 2014 +0000 drm/i915: Do not call retire_requests from wait_for_rendering is that we can completely fill all of memory using small objects, such that we exhaust the filp space, and spend all of our time evicting objects from the aperture. As such, we never fill the ring, and never trigger the last resort flushing in commit 1cf0ba14740d96fbf6f58a201f000a34b74f4725 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Mon May 5 09:07:33 2014 +0100 drm/i915: Flush request queue when waiting for ring space and so all the requests are left active and the objects keep that last active reference. Eventually the system comes to a halt as it runs out of memory. The impact is mainly limited to test cases as regular userspace will trigger retirement by manually checking whether an object is active. Testcase: igt/gem_lut_handle Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78724 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Guo Jinxian <jinxianx.guo@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \|	drm/i915: Work-around garbage DR4 from UXA	Daniel Vetter	2014-05-13	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Somehow UXA submits a completely bogus DR4 value since essentially forever. It was originally introduced in commit bade7d7d2505a10a8a7d24b084aff9742e2d6d64 Author: Eric Anholt <eric@anholt.net> Date: Fri Jun 6 14:03:25 2008 -0700 Use the DRM for submitting batchbuffers when available. and dutifully copied around ever since. Since we want to keep the general dirt catching around just special case the UXA value. This regression was introduced in commit 9cb346648d9c529eccc5c7f30093e82d37004e37 Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Thu Apr 24 08:09:11 2014 +0200 drm/i915: Catch dirt in unused execbuffer fields Comment from Chris' review: "To be fair, it is a sensible value if one supposes a Region style API to cliprects. Under that API, DR[14] define the extents of the clip region, and ((0,0), (0,0)) [DR1==DR4==0] would mean all clipped, do not draw anything." v2: Pimp commit message a bit and remove the double space. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78494 Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jörg Otte <jrg.otte@gmail.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \|	drm/i915: Support 64b relocations	Ben Widawsky	2014-05-05	1	-10/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	All the rest of the code to enable this is in my branch. Without my branch, hitting > 32b offsets is impossible. The code has always "supported" 64b, but it's never actually been run of tested. This change doesn't actually fix anything. [1] I am not sure why X won't work yet. I do not get hangs or obvious errors. There are 3 fixes grouped together here. First is to remove the hardcoded 0 for the upper dword of the relocation. The next fix is to use a 64b value for target_offset. The final fix is to not directly apply target_offset to reloc->delta. reloc->delta is part of ABI, and so we cannot change it. As it stands, 32b is enough to represent everything we're interested in representing anyway. The main problem is, we cannot add greater than 32b values to it directly. [1] Almost all of intel-gpu-tools is not yet ready to test 64b relocations. There are a few places that expect 32b values for offsets and these all won't work. Cc: Rafael Barbalho <rafael.barbalho@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \|	drm/i915: Support 64b execbuf	Ben Widawsky	2014-05-05	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, our code only had a 32b offset value for where the batchbuffer starts. With full PPGTT, and 64b canonical GPU address space, that is an insufficient value. The code to expand is pretty straight forward, and only one platform needs to do anything with the extra bits. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Rafael Barbalho <rafael.barbalho@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \|	drm/i915: Do not call retire_requests from wait_for_rendering	Chris Wilson	2014-05-05	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A common issue we have is that retiring requests causes recursion through GTT manipulation or page table manipulation which we can only handle at very specific points. However, to maintain internal consistency (enforced through our sanity checks on write_domain at various points in the GEM object lifecycle) we do need to retire the object prior to marking it with a new write_domain, and also clear the write_domain for the implicit flush following a batch. Note that this then allows the unbound objects to still be on the active lists, and so care must be taken when removing objects from unbound lists (similar to the caveats we face processing the bound lists). v2: Fix i915_gem_shrink_all() to handle updated object lifetime rules, by refactoring it to call into __i915_gem_shrink(). v3: Missed an object-retire prior to changing cache domains in i915_gem_object_set_cache_leve() v4: Rebase Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \|	drm/i915: Use the coarse ping-pong mechanism based on drm fd to dispatch the ↵	Zhao Yakui	2014-05-05	1	-1/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	BSD command on BDW GT3 The BDW GT3 has two independent BSD rings, which can be used to process the video commands. To be simpler, it is transparent to user-space driver/middle. Instead the kernel driver will decide which ring is to dispatch the BSD video command. As every BSD ring is powerful, it is enough to dispatch the BSD video command based on the drm fd. In such case it can play back video stream while encoding another video stream. The coarse ping-pong mechanism is used to determine which BSD ring is used to dispatch the BSD video command. V1->V2: Follow Daniel's comment and use the simple ping-pong mechanism. This is only to add the support of dual BSD rings on BDW GT3 machine. The further optimization will be considered in another patch set. V2->V3: Follow Daniel's comment to use the struct_mutext instead of atomic_t during determining which ring can be used to dispatch Video command. Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Zhao Yakui <yakui.zhao@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \|	drm/i915: Update the restrict check to filter out wrong Ring ID passed by ↵	Zhao Yakui	2014-05-05	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	user-space Signed-off-by: Zhao Yakui <yakui.zhao@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \|	drm/i915: Catch dirt in unused execbuffer fields	Daniel Vetter	2014-05-05	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We need to make sure that userspace keeps on following the contract, otherwise we won't be able to use the reserved fields at all. v2: Add DRM_DEBUG (Chris) Testcase: igt/gem_exec_params/*-dirt Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \|	drm/i915: Catch abuse of I915_EXEC_CONSTANTS_*	Daniel Vetter	2014-05-05	1	-4/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A bit tricky since 0 is also a valid constant ... v2: Add DRM_DEBUG (Chris) Testcase: igt/gem_exec_params/rel-constants-* Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \|	drm/i915: Catch abuse of I915_EXEC_GEN7_SOL_RESET	Daniel Vetter	2014-05-05	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently we catch it, but silently succeed. Our userspace is better than this. v2: Add DRM_DEBUG (Chris) Testcase: igt/gem_exec_params/sol-reset-* Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \|	Merge tag 'drm-intel-next-2014-04-16' of ↵	Dave Airlie	2014-05-01	1	-1/+2
\|\ \ \| \|/ \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://anongit.freedesktop.org/drm-intel into drm-next drm-intel-next-2014-04-16: - vlv infoframe fixes from Jesse - dsi/mipi fixes from Shobhit - gen8 pageflip fixes for LRI/SRM from Damien - cmd parser fixes from Brad Volkin - some prep patches for CHV, DRRS, ... - and tons of little things all over drm-intel-next-2014-04-04: - cmd parser for gen7 but only in enforcing and not yet granting mode - the batch copying stuff is still missing. Also performance is a bit ... rough (Brad Volkin + OACONTROL fix from Ken). - deprecate UMS harder (i.e. CONFIG_BROKEN) - interrupt rework from Paulo Zanoni - runtime PM support for bdw and snb, again from Paulo - a pile of refactorings from various people all over the place to prep for new stuff (irq reworks, power domain polish, ...) drm-intel-next-2014-04-04: - cmd parser for gen7 but only in enforcing and not yet granting mode - the batch copying stuff is still missing. Also performance is a bit ... rough (Brad Volkin + OACONTROL fix from Ken). - deprecate UMS harder (i.e. CONFIG_BROKEN) - interrupt rework from Paulo Zanoni - runtime PM support for bdw and snb, again from Paulo - a pile of refactorings from various people all over the place to prep for new stuff (irq reworks, power domain polish, ...) Conflicts: drivers/gpu/drm/i915/i915_gem_context.c
\| *	drm/i915: Unref context on failed eb_create	Ben Widawsky	2014-04-09	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I opted to do this instead of grabbing the context reference after eb_create since eb_create can potentially call the shrinker, and that makes things very complicated. This simple patch balances the ref count without requiring a great deal of review to make sure the shrinker path is safe. Theoretically (by design) the shrinker can end up destroying a context, which enforces the reasoning for doing the fix this way instead of moving the reference to later in the function. Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \|	drm/i915: Always use kref tracking for all contexts.	Chris Wilson	2014-04-11	1	-1/+1
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we always initialize kref for the context, even if we are using fake contexts for hangstats when there is no hw support, we can forgo the dance to dereference the ctx->obj and inspect whether we are permitted to use kref inside i915_gem_context_reference() and _unreference(). My ulterior motive here is to improve the debugging of a use-after-free of ctx->obj. This patch avoids the dereference here and instead forces the assertion checks associated with kref. v2: Refactor the fake contexts to being even more like the real contexts, so that there is much less duplicated and special case code. v3: Tweaks. v4: Tweaks, minor. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76671 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: lu hua <huax.lu@intel.com> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> [Jani: tiny change to backport to drm-intel-fixes.] Signed-off-by: Jani Nikula <jani.nikula@intel.com>
*	drm/i915: prefer struct drm_i915_private to drm_i915_private_t	Jani Nikula	2014-03-31	1	-2/+2
\| \| \| \| \| \| \| \| \|	Remove the rest of the references to drm_i915_private_t. No functional changes. Signed-off-by: Jani Nikula <jani.nikula@intel.com> [danvet: Drop hunk in i915_cmd_parser.c] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
*	drm/i915: Implement command buffer parsing logic	Brad Volkin	2014-03-07	1	-0/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The command parser scans batch buffers submitted via execbuffer ioctls before the driver submits them to hardware. At a high level, it looks for several things: 1) Commands which are explicitly defined as privileged or which should only be used by the kernel driver. The parser generally rejects such commands, with the provision that it may allow some from the drm master process. 2) Commands which access registers. To support correct/enhanced userspace functionality, particularly certain OpenGL extensions, the parser provides a whitelist of registers which userspace may safely access (for both normal and drm master processes). 3) Commands which access privileged memory (i.e. GGTT, HWS page, etc). The parser always rejects such commands. See the overview comment in the source for more details. This patch only implements the logic. Subsequent patches will build the tables that drive the parser. v2: Don't set the secure bit if the parser succeeds Fail harder during init Makefile cleanup Kerneldoc cleanup Clarify module param description Convert ints to bools in a few places Move client/subclient defs to i915_reg.h Remove the bits_count field OTC-Tracker: AXIA-4631 Change-Id: I50b98c71c6655893291c78a2d1b8954577b37a30 Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> [danvet: Appease checkpatch.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
*	drm/i915: Only bind each object rather than for every execbuffer	Daniel Vetter	2014-02-14	1	-5/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	One side-effect of the introduction of ppgtt was that we needed to rebind the object into the appropriate vm (and global gtt in some peculiar cases). For simplicity this was done twice for every object on every call to execbuffer. However, that adds a tremendous amount of CPU overhead (rewriting all the PTE for all objects into WC memory) per draw. The fix is to push all the decision about which vm to bind into and when down into the low-level bind routines through hints rather than performing the bind unconditionally in the execbuffer routine. Note that this is a regression introduced in the full ppgtt feature branch, before this we've only done re-bound objects when the relevant has_(aliasing_ppgtt\|global_gtt)_mapping flag was clear. But since that's per-object and not per-vma that optimization broke. v2: Split out prep work and unrelated changes. v3: Bring back functional change around PIN_GLOBAL that I've accidentally split out. v4: Remove the temporary hack for the old binding logic to avoid bisection issues. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72906 Tested-by: jianx.zhou@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v1) Cc: Ben Widawsky <benjamin.widawsky@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
*	drm/i915: split PIN_GLOBAL out from PIN_MAPPABLE	Daniel Vetter	2014-02-14	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	With abitrary pin flags it makes sense to split out a "please bind this into global gtt" from the "please allocate in the mappable range". Use this unconditionally in our global gtt pin helper since this is what its callers want. Later patches will drop PIN_MAPPABLE where it's not strictly needed. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
*	drm/i915: Consolidate binding parameters into flags	Daniel Vetter	2014-02-14	1	-6/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Anything more than just one bool parameter is just a pain to read, symbolic constants are much better. Split out from Chris' vma-binding rework patch. v2: Undo the behaviour change in object_pin that Chris spotted. v3: Split out misplaced hunk to handle set_cache_level errors, spotted by Jani. v4: Keep the current over-zealous binding logic in the execbuffer code working with a quick hack while the overall binding code gets shuffled around. v5: Reorder the PIN_ flags for more natural patch splitup. v6: Pull out the PIN_GLOBAL split-up again. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
*	drm/i915: move module parameters into a struct, in a new file	Jani Nikula	2014-01-27	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With 20+ module parameters, I think referring to them via a struct improves clarity over just having a bunch of globals. While at it, move the parameter initialization and definitions into a new file i915_params.c to reduce clutter in i915_drv.c. Apart from the ill-named i915_enable_rc6, i915_enable_fbc and i915_enable_ppgtt parameters, for which we lose the "i915_" prefix internally, the module parameters now look the same both on the kernel command line and in code. For example, "i915.modeset". The downsides of the change are losing static on a couple of variables and not having the initialization and module_param_named() right next to each other. On the other hand, all module parameters are now defined in one place at i915_params.c. Plus you can do this to find all module parameter references: $ git grep "i915\." -- drivers/gpu/drm/i915 v2: - move the definitions into a new file - s/i915_params/i915/ - make i915_try_reset i915.reset, for consistency Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
*	drm: dp helper: Add DP test sink CRC definition.	Rodrigo Vivi	2014-01-27	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	This address will be used to verify panel CRC for test and validation purposes. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> [danvet: Fix whitespace fail.] Acked-by: Dave Airlie <airlied@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
*	Merge branch 'topic/ppgtt' into drm-intel-next-queued	Daniel Vetter	2014-01-25	1	-79/+83
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Because whatever.* * This should contain a fairly long list of issues and still unresolved resgressions, but I didn't really get a vote. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| *	drm/i915/ppgtt: Fix ioctl errno for "no such context"	Ben Widawsky	2014-01-07	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Without this fix the ioctls silently succeeded (but actually did nothing). It makes all the code which calls into this function way too confusing. v2: Fix destroy IOCTL as well v3: Clarify the other two callers of i915_gem_context_get() to never check for NULL. (Mika) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72903 Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Testcase: igt/gem_ctx_exec/basic [danvet: Fix up the commit message and actually bother to mention the testcase this fixes.] Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| *	drm/i915: Don't check for NEEDS_GTT when deciding the address space	Daniel Vetter	2013-12-18	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This means something different and is only relevant for gen6 and the reason why we cant use anything else than aliasing ppgtt there. Note that the currently implemented logic for secure batches is broken: Userspace wants the buffer both in ppgtt (for self-referencing relocations) and in ggtt (for priveledge operations). This is the same issue the command parser is also facing. Unfortunately our coverage for corner-cases of self-referencing batches is spotty. Note that this will break vsync'ed Xv and DRI2 copies. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| *	drm/i915: Reject NEEDS_GTT relocations with full ppgtt	Daniel Vetter	2013-12-18	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Doesn't make sense. Spotted while fixing an issue Chris noticed in the same area. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| *	drm/i915: Reject non-default contexts on non-render again	Daniel Vetter	2013-12-18	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts the abi-change from commit 67e3d2979be1bf42d1818b2961c671eb31e0b4d9 Author: Ben Widawsky <ben@bwidawsk.net> Date: Fri Dec 6 14:11:01 2013 -0800 drm/i915: Permit contexts on all rings We don't actually need this, only the internal changes to allow contexts on all rings for the purpose of ppgtt switching are required. And I'm not sure whether this is the right thing to do given some of the hw features in the pipeline. Also, new abi needs userspace patches as a proof-of-need, which is completely lacking here. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| *	drm/i915: Use multiple VMs -- the point of no return	Ben Widawsky	2013-12-18	1	-6/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As with processes which run on the CPU, the goal of multiple VMs is to provide process isolation. Specific to GEN, there is also the ability to map more objects per process (2GB each instead of 2Gb-2k total). For the most part, all the pipes have been laid, and all we need to do is remove asserts and actually start changing address spaces with the context switch. Since prior to this we've converted the setting of the page tables to a streamed version, this is quite easy. One important thing to point out (since it'd been hotly contested) is that with this patch, every context created will have it's own address space (provided the HW can do it). v2: Disable BDW on rebase NOTE: I tried to make this commit as small as possible. I needed one place where I could "turn everything on" and that is here. It could be split into finer commits, but I didn't really see much point. Cc: Eric Anholt <eric@anholt.net> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| *	Merge commit drm-intel-fixes into topic/ppgtt	Daniel Vetter	2013-12-18	1	-28/+32
\| \|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I need the tricky do_switch fix before I can merge the final piece of the ppgtt enabling puzzle. Otherwise the conflict will be a real pain to resolve since the do_switch hunk from -fixes must be placed at the exact right place within a hunk in the next patch. Conflicts: drivers/gpu/drm/i915/i915_gem_context.c drivers/gpu/drm/i915/i915_gem_execbuffer.c drivers/gpu/drm/i915/intel_display.c Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| * \|	drm/i915: Get context early in execbuf	Ben Widawsky	2013-12-18	1	-18/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We need to have the address space when reserving space for the objects. Since the address space and context are tied together, and reserve occurs before context switch (for good reason), we must lookup our context earlier in the process. This leaves some room for optimizations where we no longer need to use ctx_id in certain places. This will be addressed in a subsequent patch. Important tricky bit: Because slow relocations during execbuffer drop struct_mutex Perhaps it would be best to acquire the reference when we get the context, but I'll save that for another day (note I have written the patch before, and I found the changes required to be uglier than this). Note that since we currently access everything via context id, and not the data structure this is fine, though not desirable. The next change attempts to get the context only once via the context ID idr lookup, and as such, the following can happen: CTX-A is created, refcount = 1 CTX-A execbuf, mutex dropped close IOCTL called on CTX-A, refcount = 0 CTX-A resumes in execbuf. v2: Rebased on top of commit b6359918b885da7c7b58c050674278dbd06020ab Author: Mika Kuoppala <mika.kuoppala@linux.intel.com> Date: Wed Oct 30 15:44:16 2013 +0200 drm/i915: add i915_get_reset_stats_ioctl v3: Rebased on top of commit 25b3dfc87bff80317d67ddd2cd4cfb91e6fe7d79 Author: Mika Westerberg <mika.westerberg@linux.intel.com> Date: Tue Nov 12 11:57:30 2013 +0200 Author: Mika Kuoppala <mika.kuoppala@linux.intel.com> Date: Tue Nov 26 16:14:33 2013 +0200 drm/i915: check context reset stats before relocations Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| * \|	drm/i915: Permit contexts on all rings	Ben Widawsky	2013-12-18	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we want to use contexts in more abstract terms (specifically with PPGTT in mind), we need to allow them to be specified for any ring. Since the upcoming patches will bring about the use of multiple address spaces, and each ring needs to have an address space programmed (which we intend to do at context switch time), we can no longer only use RCS. With multiple rings having a last context, we must now unreference these contexts. NOTE: This commit requires an update to intel-gpu-tools to make it not fail. v2: Rebased with some logical conflicts. Squashed in the context fini refcount patch Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| * \|	drm/i915: Simplify ring handling in execbuf	Ben Widawsky	2013-12-18	1	-31/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| * \|	drm/i915: Remove vm arg from relocate entry	Ben Widawsky	2013-12-18	1	-7/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The only place we were using it was for GEN6, which won't have PPGTT support anyway (ie. the VM is always the same). To clear things up, (it only added confusion for me since it doesn't allow us to assert vma->vm is what we always want, when just looking at the code). Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| * \|	drm/i915: Create bind/unbind abstraction for VMAs	Ben Widawsky	2013-12-18	1	-18/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	To sum up what goes on here, we abstract the vma binding, similarly to the previous object binding. This helps for distinguishing legacy binding, versus modern binding. To keep the code churn as minimal as possible, I am leaving in insert_entries(). It serves as the per platform pte writing basically. bind_vma and insert_entries do share a lot of similarities, and I did have designs to combine the two, but as mentioned already... too much churn in an already massive patchset. What follows are the 3 commits which existed discretely in the original submissions. Upon rebasing on Broadwell support, it became clear that separation was not good, and only made for more error prone code. Below are the 3 commit messages with all their history. drm/i915: Add bind/unbind object functions to VMA drm/i915: Use the new vm [un]bind functions drm/i915: reduce vm->insert_entries() usage drm/i915: Add bind/unbind object functions to VMA As we plumb the code with more VM information, it has become more obvious that the easiest way to deal with bind and unbind is to simply put the function pointers in the vm, and let those choose the correct way to handle the page table updates. This change allows many places in the code to simply be vm->bind, and not have to worry about distinguishing PPGTT vs GGTT. Notice that this patch has no impact on functionality. I've decided to save the actual change until the next patch because I think it's easier to review that way. I'm happy to squash the two, or let Daniel do it on merge. v2: Make ggtt handle the quirky aliasing ppgtt Add flags to bind object to support above Don't ever call bind/unbind directly for PPGTT until we have real, full PPGTT (use NULLs to assert this) Make sure we rebind the ggtt if there already is a ggtt binding. This happens on set cache levels. Use VMA for bind/unbind (Daniel, Ben) v3: Reorganize ggtt_vma_bind to be more concise and easier to read (Ville). Change logic in unbind to only unbind ggtt when there is a global mapping, and to remove a redundant check if the aliasing ppgtt exists. v4: Make the bind function a bit smarter about the cache levels to avoid unnecessary multiple remaps. "I accept it is a wart, I think unifying the pin_vma / bind_vma could be unified later" (Chris) Removed the git notes, and put version info here. (Daniel) v5: Update the comment to not suck (Chris) v6: Move bind/unbind to the VMA. It makes more sense in the VMA structure (always has, but I was previously lazy). With this change, it will allow us to keep a distinct insert_entries. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> drm/i915: Use the new vm [un]bind functions Building on the last patch which created the new function pointers in the VM for bind/unbind, here we actually put those new function pointers to use. Split out as a separate patch to aid in review. I'm fine with squashing into the previous patch if people request it. v2: Updated to address the smart ggtt which can do aliasing as needed Make sure we bind to global gtt when mappable and fenceable. I thought we could get away without this initialy, but we cannot. v3: Make the global GTT binding explicitly use the ggtt VM for bind_vma(). While at it, use the new ggtt_vma helper (Chris) At this point the original mailing list thread diverges. ie. v4^: use target_obj instead of obj for gen6 relocate_entry vma->bind_vma() can be called safely during pin. So simply do that instead of the complicated conditionals. Don't restore PPGTT bound objects on resume path Bug fix in resume path for globally bound Bos Properly handle secure dispatch Rebased on vma bind/unbind conversion Signed-off-by: Ben Widawsky <ben@bwidawsk.net> drm/i915: reduce vm->insert_entries() usage FKA: drm/i915: eliminate vm->insert_entries() With bind/unbind function pointers in place, we no longer need insert_entries. We could, and want, to remove clear_range, however it's not totally easy at this point. Since it's used in a couple of place still that don't only deal in objects: setup, ppgtt init, and restore gtt mappings. v2: Don't actually remove insert_entries, just limit its usage. It will be useful when we introduce gen8. It will always be called from the vma bind/unbind. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| * \|	drm/i915: Make pin count per VMA	Ben Widawsky	2013-12-18	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \| \|	drm/i915: Clarify relocation errnos	Ben Widawsky	2014-01-22	1	-7/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While trying to find a random -EINVAL from a failing test, I noticed we had a few hard to follow return values. The first two hunks in this patch replace completely useless initialization of ret. The last several hunks help to distinguish between altering 'return ret' and 'return <ERROR>' Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \| \|	Merge commit origin/master into drm-intel-next	Daniel Vetter	2014-01-16	1	-36/+52
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Conflicts are getting out of hand, and now we have to shuffle even more in -next which was also shuffled in -fixes (the call for drm_mode_config_reset needs to move yet again). So do a proper backmerge. I wanted to wait with this for the 3.13 relaese, but alas let's just do this now. Conflicts: drivers/gpu/drm/i915/i915_reg.h drivers/gpu/drm/i915/intel_ddi.c drivers/gpu/drm/i915/intel_display.c drivers/gpu/drm/i915/intel_pm.c Besides the conflict around the forcewake get/put (where we chaged the called function in -fixes and added a new parameter in -next) code all the current conflicts are of the adjacent lines changed type. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| * \| \|	drm/i915: Prevent double unref following alloc failure during execbuffer	Chris Wilson	2013-12-12	1	-8/+20
\| \| \|/ \| \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Whilst looking up the objects required for an execbuffer, an untimely allocation failure in creating the vma results in the object being unreferenced from two lists. The ownership during the lookup is meant to be moved from the list of objects being looked to the vma, and this double unreference upon error results in a use-after-free. Fixes regression from commit 27173f1f95db5e74ceb35fe9a2f2f348ea11bac9 Author: Ben Widawsky <ben@bwidawsk.net> Date: Wed Aug 14 11:38:36 2013 +0200 drm/i915: Convert execbuf code to use vmas Based on the fix by Ben Widawsky. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <ben@bwidawsk.net> Cc: stable@vger.kernel.org [danvet: Bikeshed the crucial comment above the ownership transfer as discussed on irc.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
\| * \|	drm/i915: Pin relocations for the duration of constructing the execbuffer	Chris Wilson	2013-11-27	1	-28/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As the execbuffer dispatch grows ever more complex and involves multiple stages of moving objects into the aperture, we need to take greater care that we do not evict our execbuffer objects prior to dispatch. This is relatively simple as we can just keep the objects pinned for not just the relocation but until we are finished. One such example is the possibility of the context switch causing an eviction or hitting the shrinker in order to fit its object into the aperture. Link: http://lists.freedesktop.org/archives/intel-gfx/2013-November/036166.html Reported-by: "Siluvery, Arun" <arun.siluvery@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: stable@vger.kernel.org Acked-by: Ben Widawsky <ben@bwidawsk.net> Tested-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Add the additional explanations from Chris to the commit message.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \| \|	drm/i915: add runtime put/get calls at the basic places	Paulo Zanoni	2013-12-10	1	-0/+6
\| \|/ \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If I add code to enable runtime PM on my Haswell machine, start a desktop environment, then enable runtime PM, these functions will complain that they're trying to read/write registers while the graphics card is suspended. v2: - Simplify i915_gem_fault changes. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> [danvet: Drop the hunk in i915_hangcheck_elapsed, it's the wrong thing to do.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \|	drm/i915: check context reset stats before relocations	Mika Kuoppala	2013-12-04	1	-13/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Doing it early prevents moving and relocating objects in vain for contexts that won't get any GPU time. Reported-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
* \|	drm/i915: Missed dropped VMA conversion	Ben Widawsky	2013-11-26	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This belonged in commit 07fe0b12800d4752d729d4122c01f41f80a5ba5a Author: Ben Widawsky <ben@bwidawsk.net> Date: Wed Jul 31 17:00:10 2013 -0700 drm/i915: plumb VM into bind/unbind code But it was somehow missed along the way. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>