summaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/radeon/cik.c
Commit message (Collapse)AuthorAgeFilesLines
* drm/radeon/cik: fix CP jump table sizeAlex Deucher2016-07-071-1/+1
| | | | | | | | | Align to the jump table offset. May fix hangs on some asics with GFX PG enabled. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Acked-by: Tom St Denis <tom.stdenis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/radeon/gfx7: expand cp jt size to handle GDS as wellAlex Deucher2016-07-071-1/+2
| | | | | | The size needs to handle the CP JT and GDS. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/radeon: load different smc firmware on some CI variantsAlex Deucher2016-07-071-1/+13
| | | | | | | | | | The power tables on some variants require different firmware. This may fix stability issues on some newer CI parts. bug: https://bugs.freedesktop.org/show_bug.cgi?id=91880 Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/radeon: allow to force hard GPU reset.Jérome Glisse2016-05-021-1/+7
| | | | | | | | | | | | | In some cases, like when freezing for hibernation, we need to be able to force hard reset even if no engine are stuck. This patch add a bool option to current asic reset callback to allow to force hard reset on asic that supports it. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/radeon: consolidate cik vce initialization and startup code.Jérome Glisse2016-05-021-45/+91
| | | | | | | | | | | | | | | This match the exact same control flow as existing code. It just use goto instead of multiple levels of if/else. It also clarify early initialization failures by clearing rdev->has_vce doing so does not change end result from hardware point of view, it only avoids printing more error messages down the line and thus only the original error is reported. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/radeon: consolidate cik uvd initialization and startup code.Jérome Glisse2016-05-021-29/+79
| | | | | | | | | | | | | | | This match the exact same control flow as existing code. It just use goto instead of multiple levels of if/else. It also clarify early initialization failures by clearing rdev->has_uvd doing so does not change end result from hardware point of view, it only avoids printing more error messages down the line and thus only the original error is reported. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/radeon: fix indentation.Jérome Glisse2016-03-161-3/+3
| | | | | | | | | | | I hate doing this but it hurts my eyes to go over code that does not comply with indentation rules. Only thing that is not only space change is in atom.c all other files are space indentation issues. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/radeon: refactor CIK tiling table initializationJosh Poimboeuf2016-03-141-1025/+666
| | | | | | | | | Simplify the control flow of cik_tiling_mode_table_init() similar to how it was done in gfx_v7_0.c and gfx_v8_0.c. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/radeon: Avoid double gpu reset by adding a timeout on IB ring tests.Matthew Dawson2016-02-101-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | When the radeon driver resets a gpu, it attempts to test whether all the rings can successfully handle an IB. If these rings fail to respond, the process will wait forever. Another gpu reset can't happen at this point, as the current reset holds a lock required to do so. Instead, make all the IB tests run with a timeout, so the system can attempt to recover in this case. While this doesn't fix the underlying issue with card resets failing, it gives the system a higher chance of recovering. These timeouts have been confirmed to help both a Tathi and Hawaii card recover after a gpu reset. This also adds a new function, radeon_fence_wait_timeout, that behaves like fence_wait_timeout. It is used instead of fence_wait_timeout as it continues to work during a reset. radeon_fence_wait is changed to be implemented using this function. V2: - Changed the timeout to 1s, as the default 10s from radeon_wait_timeout was too long. A timeout of 100ms was tested and found to be too short. - Changed radeon_fence_wait_timeout to behave more like fence_wait_timeout. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Matthew Dawson <matthew@mjdsystems.ca> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* Backmerge drm-fixes merge into Linus's tree into drm-next.Dave Airlie2015-12-241-5/+1
|\ | | | | | | | | | | | | | | This merges '5b726e06d6e8309e5c9ef4109a32caf27c71dfc8' into drm-next Just to resolve some merges to make Daniel's life easier. Signed-off-by: DAve Airlie <airlied@redhat.com>
| * radeon/cik: Fix GFX IB test on Big-EndianOded Gabbay2015-12-091-5/+1
| | | | | | | | | | | | | | | | | | | | This patch makes the IB test on the GFX ring pass for CI-based cards installed in Big-Endian machines. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | Merge branch 'drm-next-4.5' of git://people.freedesktop.org/~agd5f/linux ↵Dave Airlie2015-12-231-2/+2
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | into drm-next [airlied: fixup build problems on arm - added errno.h include] * 'drm-next-4.5' of git://people.freedesktop.org/~agd5f/linux: (152 commits) amd/powerplay: fix copy paste typo in hardwaremanager.c amd/powerplay: disable powerplay by default initially amd/powerplay: don't enable ucode fan control if vbios has no fan table drm/amd/powerplay: show gpu load when print gpu performance for Cz. (v2) drm/amd/powerplay: check whether need to enable thermal control. (v2) drm/amd/powerplay: add point check to avoid NULL point hang. drm/amdgpu/powerplay: Program a calculated value as Deep Sleep clock. drm/amd/powerplay: Don't return an error if fan table is missing drm/powerplay/hwmgr: log errors in tonga_hwmgr_backend_init drm/powerplay: add debugging output to processpptables.c drm/powerplay: add debugging output to tonga_processpptables.c amd/powerplay: Add structures required to report configuration change amd/powerplay: Fix get dal power level amd\powerplay Implement get dal power level drm/amd/powerplay: display gpu load when print performance for tonga. drm/amdgpu/powerplay: enable sysfs and debugfs interfaces late drm/amd/powerplay: move shared function of vi to hwmgr. (v2) drm/amd/powerplay: check whether enable dpm in powerplay. drm/amd/powerplay: fix bug that dpm funcs in debugfs/sysfs missing. drm/amd/powerplay: fix boolreturn.cocci warnings ...
| * drm/radeon: fix typo in cik_ring_ib_execute documentation (v2)Nicolai Hähnle2015-12-181-2/+2
| | | | | | | | | | | | | | | | v2: agd: clarify commit message, fix "an" as spotted by Michel. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/radeon: Fixup hw vblank counter/ts for new drm_update_vblank_count() (v2)Mario Kleiner2015-12-181-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 4dfd6486 "drm: Use vblank timestamps to guesstimate how many vblanks were missed" introduced in Linux 4.4-rc1 makes the drm core more fragile to drivers which don't update hw vblank counters and vblank timestamps in sync with firing of the vblank irq and essentially at leading edge of vblank. This exposed a problem with radeon-kms/amdgpu-kms which do not satisfy above requirements: The vblank irq fires a few scanlines before start of vblank, but programmed pageflips complete at start of vblank and vblank timestamps update at start of vblank, whereas the hw vblank counter increments only later, at start of vsync. This leads to problems like off by one errors for vblank counter updates, vblank counters apparently going backwards or vblank timestamps apparently having time going backwards. The net result is stuttering of graphics in games, or little hangs, as well as total failure of timing sensitive applications. See bug #93147 for an example of the regression on Linux 4.4-rc: https://bugs.freedesktop.org/show_bug.cgi?id=93147 This patch tries to align all above events better from the viewpoint of the drm core / of external callers to fix the problem: 1. The apparent start of vblank is shifted a few scanlines earlier, so the vblank irq now always happens after start of this extended vblank interval and thereby drm_update_vblank_count() always samples the updated vblank count and timestamp of the new vblank interval. To achieve this, the reporting of scanout positions by radeon_get_crtc_scanoutpos() now operates as if the vblank starts radeon_crtc->lb_vblank_lead_lines before the real start of the hw vblank interval. This means that the vblank timestamps which are based on these scanout positions will now update at this earlier start of vblank. 2. The driver->get_vblank_counter() function will bump the returned vblank count as read from the hw by +1 if the query happens after the shifted earlier start of the vblank, but before the real hw increment at start of vsync, so the counter appears to increment at start of vblank in sync with the timestamp update. 3. Calls from vblank irq-context and regular non-irq calls are now treated identical, always simulating the shifted vblank start, to avoid inconsistent results for queries happening from vblank irq vs. happening from drm_vblank_enable() or vblank_disable_fn(). 4. The radeon_flip_work_func will delay mmio programming a pageflip until the start of the real vblank iff it happens to execute inside the shifted earlier start of the vblank, so pageflips now also appear to execute at start of the shifted vblank, in sync with vblank counter and timestamp updates. This to avoid some races between updates of vblank count and timestamps that are used for swap scheduling and pageflip execution which could cause pageflips to execute before the scheduled target vblank. The lb_vblank_lead_lines "fudge" value is calculated as the size of the display controllers line buffer in scanlines for the given video mode: Vblank irq's are triggered by the line buffer logic when the line buffer refill for a video frame ends, ie. when the line buffer source read position enters the hw vblank. This means that a vblank irq could fire at most as many scanlines before the current reported scanout position of the crtc timing generator as the number of scanlines the line buffer can maximally hold for a given video mode. This patch has been successfully tested on a RV730 card with DCE-3 display engine and on a evergreen card with DCE-4 display engine, in single-display and dual-display configuration, with different video modes. A similar patch is needed for amdgpu-kms to fix the same problem. Limitations: - Line buffer sizes in pixels are hard-coded on < DCE-4 to a value i just guessed to be high enough to work ok, lacking info on the true sizes atm. Fixes: fdo#93147 Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Michel Dänzer <michel.daenzer@amd.com> Cc: Harry Wentland <Harry.Wentland@amd.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> (v1) Tested-by: Dave Witbrodt <dawitbro@sbcglobal.net> (v2) Refine radeon_flip_work_func() for better efficiency: In radeon_flip_work_func, replace the busy waiting udelay(5) with event lock held by a more performance and energy efficient usleep_range() until at least predicted true start of hw vblank, with some slack for scheduler happiness. Release the event lock during waits to not delay other outputs in doing their stuff, as the waiting can last up to 200 usecs in some cases. Retested on DCE-3 and DCE-4 to verify it still works nicely. (v2) Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | drm/radeon: Fixup hw vblank counter/ts for new drm_update_vblank_count() (v2)Mario Kleiner2015-12-041-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 4dfd6486 "drm: Use vblank timestamps to guesstimate how many vblanks were missed" introduced in Linux 4.4-rc1 makes the drm core more fragile to drivers which don't update hw vblank counters and vblank timestamps in sync with firing of the vblank irq and essentially at leading edge of vblank. This exposed a problem with radeon-kms/amdgpu-kms which do not satisfy above requirements: The vblank irq fires a few scanlines before start of vblank, but programmed pageflips complete at start of vblank and vblank timestamps update at start of vblank, whereas the hw vblank counter increments only later, at start of vsync. This leads to problems like off by one errors for vblank counter updates, vblank counters apparently going backwards or vblank timestamps apparently having time going backwards. The net result is stuttering of graphics in games, or little hangs, as well as total failure of timing sensitive applications. See bug #93147 for an example of the regression on Linux 4.4-rc: https://bugs.freedesktop.org/show_bug.cgi?id=93147 This patch tries to align all above events better from the viewpoint of the drm core / of external callers to fix the problem: 1. The apparent start of vblank is shifted a few scanlines earlier, so the vblank irq now always happens after start of this extended vblank interval and thereby drm_update_vblank_count() always samples the updated vblank count and timestamp of the new vblank interval. To achieve this, the reporting of scanout positions by radeon_get_crtc_scanoutpos() now operates as if the vblank starts radeon_crtc->lb_vblank_lead_lines before the real start of the hw vblank interval. This means that the vblank timestamps which are based on these scanout positions will now update at this earlier start of vblank. 2. The driver->get_vblank_counter() function will bump the returned vblank count as read from the hw by +1 if the query happens after the shifted earlier start of the vblank, but before the real hw increment at start of vsync, so the counter appears to increment at start of vblank in sync with the timestamp update. 3. Calls from vblank irq-context and regular non-irq calls are now treated identical, always simulating the shifted vblank start, to avoid inconsistent results for queries happening from vblank irq vs. happening from drm_vblank_enable() or vblank_disable_fn(). 4. The radeon_flip_work_func will delay mmio programming a pageflip until the start of the real vblank iff it happens to execute inside the shifted earlier start of the vblank, so pageflips now also appear to execute at start of the shifted vblank, in sync with vblank counter and timestamp updates. This to avoid some races between updates of vblank count and timestamps that are used for swap scheduling and pageflip execution which could cause pageflips to execute before the scheduled target vblank. The lb_vblank_lead_lines "fudge" value is calculated as the size of the display controllers line buffer in scanlines for the given video mode: Vblank irq's are triggered by the line buffer logic when the line buffer refill for a video frame ends, ie. when the line buffer source read position enters the hw vblank. This means that a vblank irq could fire at most as many scanlines before the current reported scanout position of the crtc timing generator as the number of scanlines the line buffer can maximally hold for a given video mode. This patch has been successfully tested on a RV730 card with DCE-3 display engine and on a evergreen card with DCE-4 display engine, in single-display and dual-display configuration, with different video modes. A similar patch is needed for amdgpu-kms to fix the same problem. Limitations: - Line buffer sizes in pixels are hard-coded on < DCE-4 to a value i just guessed to be high enough to work ok, lacking info on the true sizes atm. Fixes: fdo#93147 Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Michel Dänzer <michel.daenzer@amd.com> Cc: Harry Wentland <Harry.Wentland@amd.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> (v1) Tested-by: Dave Witbrodt <dawitbro@sbcglobal.net> (v2) Refine radeon_flip_work_func() for better efficiency: In radeon_flip_work_func, replace the busy waiting udelay(5) with event lock held by a more performance and energy efficient usleep_range() until at least predicted true start of hw vblank, with some slack for scheduler happiness. Release the event lock during waits to not delay other outputs in doing their stuff, as the waiting can last up to 200 usecs in some cases. Retested on DCE-3 and DCE-4 to verify it still works nicely. (v2) Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | drm/radeon: Retry DDC probing on DVI on failure if we got an HPD interruptLyude2015-12-041-1/+1
|/ | | | | | | | | | | | | | | | | | | | | HPD signals on DVI ports can be fired off before the pins required for DDC probing actually make contact, due to the pins for HPD making contact first. This results in a HPD signal being asserted but DDC probing failing, resulting in hotplugging occasionally failing. This is somewhat rare on most cards (depending on what angle you plug the DVI connector in), but on some cards it happens constantly. The Radeon R5 on the machine used for testing this patch for instance, runs into this issue just about every time I try to hotplug a DVI monitor and as a result hotplugging almost never works. Rescheduling the hotplug work for a second when we run into an HPD signal with a failing DDC probe usually gives enough time for the rest of the connector's pins to make contact, and fixes this issue. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Lyude <cpaul@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/radeon: Handle irqs only based on irq ring, not irq status regs.Mario Kleiner2015-07-081-144/+192
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Trying to resolve issues with missed vblanks and impossible values inside delivered kms pageflip completion events showed that radeon's irq handling sometimes doesn't handle valid irqs, but silently skips them. This was observed for vblank interrupts. Although those irqs have corresponding events queued in the gpu's irq ring at time of interrupt, and therefore the corresponding handling code gets triggered by these events, the handling code sometimes silently skipped processing the irq. The reason for those skips is that the handling code double-checks for each irq event if the corresponding irq status bits in the irq status registers are set. Sometimes those bits are not set at time of check for valid irqs, maybe due to some hardware race on some setups? The problem only seems to happen on some machine + card combos sometimes, e.g., never happened during my testing of different PC cards of the DCE-2/3/4 generation a year ago, but happens consistently now on two different Apple Mac cards (RV730, DCE-3, Apple iMac and Evergreen JUNIPER, DCE-4 in a Apple MacPro). It also doesn't happen at each interrupt but only occassionally every couple of hundred or thousand vblank interrupts. This results in XOrg warning messages like "[ 7084.472] (WW) RADEON(0): radeon_dri2_flip_event_handler: Pageflip completion event has impossible msc 420120 < target_msc 420121" as well as skipped frames and problems for applications that use kms pageflip events or vblank events, e.g., users of DRI2 and DRI3/Present, Waylands Weston compositor, etc. See also https://bugs.freedesktop.org/show_bug.cgi?id=85203 After some talking to Alex and Michel, we decided to fix this by turning the double-check for asserted irq status bits into a warning. Whenever a irq event is queued in the IH ring, always execute the corresponding interrupt handler. Still check the irq status bits, but only to log a DRM_DEBUG message on a mismatch. This fixed the problems reliably on both previously failing cards, RV-730 dual-head tested on both crtcs (pipes D1 and D2) and a triple-output Juniper HD-5770 card tested on all three available crtcs (D1/D2/D3). The r600 and evergreen irq handling is therefore tested, but the cik an si handling is only compile tested due to lack of hw. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> CC: Michel Dänzer <michel.daenzer@amd.com> CC: Alex Deucher <alexander.deucher@amd.com> CC: <stable@vger.kernel.org> # v3.16+ Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/radeon: compute ring fix hibernation (CI GPU family) v2.Jérôme Glisse2015-06-291-0/+34
| | | | | | | | | | | | | | | | | | | | | | | | | In order for hibernation to reliably work we need to cleanup more thoroughly the compute ring. Hibernation is different from suspend resume as when we resume from hibernation the hardware is first fully initialize by regular kernel then freeze callback happens (which correspond to a suspend inside the radeon kernel driver) and turn off each of the block. It turns out we were not cleanly shutting down the compute ring. This patch fix that. Hibernation and suspend to ram were tested (several times) on : Bonaire Hawaii Mullins Kaveri Kabini Changed since v1: - Factor the ring stop logic into a function taking ring as arg. Cc: stable@vger.kernel.org Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* Merge tag 'v4.1-rc6' into drm-nextDave Airlie2015-06-041-1/+1
|\ | | | | | | | | | | | | Linux 4.1-rc6 backmerge 4.1-rc6 as some of the later pull reqs are based on newer bases and I'd prefer to do the fixup myself.
| * drm/radeon: partially revert "fix VM_CONTEXT*_PAGE_TABLE_END_ADDR handling"Christian König2015-05-281-1/+1
| | | | | | | | | | | | | | | | | | | | We have that bug for years and some users report side effects when fixing it on older hardware. So revert it for VM_CONTEXT0_PAGE_TABLE_END_ADDR, but keep it for VM 1-15. Signed-off-by: Christian König <christian.koenig@amd.com> CC: stable@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | radeon: Deinline indirect register accessor functionsDenys Vlasenko2015-05-281-0/+25
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch deinlines indirect register accessor functions. These functions perform two mmio accesses, framed by spin lock/unlock. Spin lock/unlock by itself takes more than 50 cycles in ideal case (if lock is exclusively cached on current CPU). With this .config: http://busybox.net/~vda/kernel_config, after uninlining these functions have sizes and callsite counts as follows: r600_uvd_ctx_rreg: 111 bytes, 4 callsites r600_uvd_ctx_wreg: 113 bytes, 5 callsites eg_pif_phy0_rreg: 106 bytes, 13 callsites eg_pif_phy0_wreg: 108 bytes, 13 callsites eg_pif_phy1_rreg: 107 bytes, 13 callsites eg_pif_phy1_wreg: 108 bytes, 13 callsites rv370_pcie_rreg: 111 bytes, 21 callsites rv370_pcie_wreg: 113 bytes, 24 callsites r600_rcu_rreg: 111 bytes, 16 callsites r600_rcu_wreg: 113 bytes, 25 callsites cik_didt_rreg: 106 bytes, 10 callsites cik_didt_wreg: 107 bytes, 10 callsites tn_smc_rreg: 106 bytes, 126 callsites tn_smc_wreg: 107 bytes, 116 callsites eg_cg_rreg: 107 bytes, 20 callsites eg_cg_wreg: 108 bytes, 52 callsites Functions r100_mm_rreg() and r100_mm_rreg() have a fast path and a locked (slow) path. This patch deinlines only slow path. r100_mm_rreg_slow: 78 bytes, 2083 callsites r100_mm_wreg_slow: 81 bytes, 3570 callsites Reduction in code size is more than 65,000 bytes: text data bss dec hex filename 85740176 22294680 20627456 128662312 7ab3b28 vmlinux.before 85674192 22294776 20627456 128598664 7aa4288 vmlinux Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/radeon: fix VM_CONTEXT*_PAGE_TABLE_END_ADDR handlingChristian König2015-05-121-2/+2
| | | | | | | | The mapping range is inclusive between starting and ending addresses. Signed-off-by: Christian König <christian.koenig@amd.com> CC: stable@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* radeon/cik: add support for short HPD irqsAlex Deucher2015-03-191-12/+87
| | | | | | This adds support to process short HPD irqs on CIK gpus. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/radeon: add get_allowed_info_register for CIKAlex Deucher2015-03-191-0/+33
| | | | | | | Registers that can be fetched from the info ioctl. Tested-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/radeon: do a posting read in cik_set_irqAlex Deucher2015-03-031-0/+3
| | | | | | | | | | To make sure the writes go through the pci bridge. bug: https://bugzilla.kernel.org/show_bug.cgi?id=90741 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
* drm/radeon: enable SRBM timeout interrupt on CIK v2Leo Liu2015-02-251-0/+8
| | | | | | | v2: disable it on suspend Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* drm/radeon: only enable kv/kb dpm interrupts once v3Alex Deucher2015-02-111-21/+0
| | | | | | | | | | | Enable at init and disable on fini. Workaround for hardware problems. v2 (chk): extend commit message v3: add new function Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> (v2) Cc: stable@vger.kernel.org
* drm/radeon: workaround for CP HW bug on CIKChristian König2015-02-111-1/+15
| | | | | | | | Emit the EOP twice to avoid cache flushing problems. Signed-off-by: Christian König <christian.koenig@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* radeon/audio: consolidate audio_fini() functionsSlava Grigorev2015-01-221-1/+1
| | | | | | Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Slava Grigorev <slava.grigorev@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* radeon/audio: consolidate audio_init() functionsSlava Grigorev2015-01-221-1/+2
| | | | | | Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Slava Grigorev <slava.grigorev@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* Merge remote-tracking branch 'origin/master' into drm-nextDave Airlie2015-01-221-0/+11
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backmerge Linus tree after rc5 + drm-fixes went in. There were a few amdkfd conflicts I wanted to avoid, and Ben requested this for nouveau also. Conflicts: drivers/gpu/drm/amd/amdkfd/Makefile drivers/gpu/drm/amd/amdkfd/kfd_chardev.c drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c drivers/gpu/drm/amd/amdkfd/kfd_priv.h drivers/gpu/drm/amd/include/kgd_kfd_interface.h drivers/gpu/drm/i915/intel_runtime_pm.c drivers/gpu/drm/radeon/radeon_kfd.c
| * drm/radeon: fix VM flush on CIK (v3)Alex Deucher2015-01-081-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need to wait for the GPUVM flush to complete. There was some confusion as to how this mechanism was supposed to work. The operation is not atomic. For GPU initiated invalidations you need to read back a VM register to introduce enough latency for the update to complete. v2: drop gart changes v3: just read back rather than polling Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
* | drm/radeon: Initialize compute vmidBen Goz2015-01-021-0/+24
|/ | | | | | | | | | | | | | | | | | | This patch moves to radeon the initialization of compute vmid. That initializations was done in kfd-->kgd interface, but doing it in radeon as part of radeon's H/W initialization routines is more appropriate. In addition, this simplifies the kfd-->kgd interface. The patch removes the function from the interface file and from the interface declaration file. The function initializes memory apertures to fixed base/limit address and non cached memory types. Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
* Merge tag 'v3.18-rc7' into drm-nextDave Airlie2014-12-021-2/+5
|\ | | | | | | | | | | | | | | | | | | | | | | This fixes a bunch of conflicts prior to merging i915 tree. Linux 3.18-rc7 Conflicts: drivers/gpu/drm/exynos/exynos_drm_drv.c drivers/gpu/drm/i915/i915_drv.c drivers/gpu/drm/i915/intel_pm.c drivers/gpu/drm/tegra/dc.c
| * drm/radeon: make sure mode init is complete in bandwidth_updateAlex Deucher2014-11-061-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The power management code calls into the display code for certain things. If certain power management sysfs attributes are called before the driver has finished initializing all of the hardware we can run into problems with uninitialized modesetting state. Add a check to make sure modesetting init has completed to the bandwidth update callbacks to fix this. Can be triggered by the tlp and laptop start up scripts depending on the timing. bugs: https://bugzilla.kernel.org/show_bug.cgi?id=83611 https://bugs.freedesktop.org/show_bug.cgi?id=85771 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
| * drm/radeon: set correct CE ram size for CIKJammy Zhou2014-11-061-2/+2
| | | | | | | | | | | | | | | | | | | | CE ram size is 32k/0k/0k for GFX/CS0/CS1 with CIK Ported from amdgpu driver. Signed-off-by: Jammy Zhou <Jammy.Zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
* | drm/radeon: use one VMID for each ringChristian König2014-11-201-2/+2
| | | | | | | | | | | | | | | | | | Use multiple VMIDs for each VM, one for each ring. That allows us to execute flushes separately on each ring, still not ideal cause in a lot of cases rings can share IDs. Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | drm/radeon: split semaphore and sync object handling v2Christian König2014-11-201-11/+7
| | | | | | | | | | | | | | | | | | | | | | | | Previously we just allocated space for four hardware semaphores in each software semaphore object. Make software semaphore objects represent only one hardware semaphore address again by splitting the sync code into it's own object. v2: fix typo in comment Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | drm/radeon: rework vm_flush parametersChristian König2014-11-201-13/+10
| | | | | | | | | | | | | | Use ring structure instead of index and provide vm_id and pd_addr separately. Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* | drm/radeon: work around a hw bug in MGCG on CIKAlex Deucher2014-11-201-1/+2
| | | | | | | | | | | | | | | | Always need to set bit 0 of RLC_CGTT_MGCG_OVERRIDE to avoid unreliable doorbell updates in some cases. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
* | drm/radeon: Add radeon <--> amdkfd interfaceOded Gabbay2014-07-151-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the interface between the radeon driver and the amdkfd driver. The interface implementation is contained in radeon_kfd.c and radeon_kfd.h. The interface itself is represented by a pointer to struct kfd_dev. The pointer is located inside radeon_device structure. All the register accesses that amdkfd need are done using this interface. This allows us to avoid direct register accesses in amdkfd proper, while also avoiding locking between amdkfd and radeon. The single exception is the doorbells that are used in both of the drivers. However, because they are located in separate pci bar pages, the danger of sharing registers between the drivers is minimal. Having said that, we are planning to move the doorbells as well to radeon. v3: Add interface for sa manager init and fini. The init function will allocate a buffer on system memory and pin it to the GART address space via the radeon sa manager. All mappings of buffers to GART address space are done via the radeon sa manager. The interface of allocate memory will use the radeon sa manager to sub allocate from the single buffer that was allocated during the init function. Change lower_32/upper_32 calls to use linux macros Add documentation for the interface v4: Change ptr field type in kgd_mem from uint32_t* to void* to match to type that is returned by radeon_sa_bo_cpu_addr v5: Change format of mqd structure to work with latest KV firmware Add support for AQL queues creation to enable working with open-source HSA runtime. Move generic kfd-->kgd interface and other generic kgd definitions to a generic header file that will be used by AMD's radeon and amdgpu drivers Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
* | drm/radeon: adding synchronization for GRBM GFXOded Gabbay2014-07-141-0/+26
| | | | | | | | | | | | | | | | | | Implementing a lock for selecting and accessing shader engines and arrays. This lock will make sure that radeon and amdkfd are not colliding when accessing shader engines and arrays with GRBM_GFX_INDEX register. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
* | drm/radeon/cik: Don't touch int of pipes 1-7Oded Gabbay2014-02-111-70/+1
| | | | | | | | | | | | | | amdkfd should set interrupts for pipes 1-7. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
* | drm/radeon: reduce number of free VMIDs and pipes in KVOded Gabbay2014-01-161-24/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | To support HSA on KV, we need to limit the number of vmids and pipes that are available for radeon's use with KV. This patch reserves VMIDs 8-15 for amdkfd (so radeon can only use VMIDs 0-7) and also makes radeon thinks that KV has only a single MEC with a single pipe in it v3: Use define for static vmid allocation in radeon Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
* | drm/radeon: fix for memory training on bonaire 0x6649Alex Deucher2014-11-121-1/+10
|/ | | | | | | Workaround for memory link training on certain variants of 0x6649. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds2014-10-141-26/+26
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull drm updates from Dave Airlie: "This is the main git pull for the drm, I pretty much froze major pulls at -rc5/6 time, and haven't had much fallout, so will probably continue doing that. Lots of changes all over, big internal header cleanup to make it clear drm features are legacy things and what are things that modern KMS drivers should be using. Also big move to use the new generic fences in all the TTM drivers. core: atomic prep work, vblank rework changes, allows immediate vblank disables major header reworking and cleanups to better delinate legacy interfaces from what KMS drivers should be using. cursor planes locking fixes ttm: move to generic fences (affects all TTM drivers) ppc64 caching fixes radeon: userptr support, uvd for old asics, reset rework for fence changes better buffer placement changes, dpm feature enablement hdmi audio support fixes intel: Cherryview work, 180 degree rotation, skylake prep work, execlist command submission full ppgtt prep work cursor improvements edid caching, vdd handling improvements nouveau: fence reworking kepler memory clock work gt21x clock work fan control improvements hdmi infoframe fixes DP audio ast: ppc64 fixes caching fix rcar: rcar-du DT support ipuv3: prep work for capture support msm: LVDS support for mdp4, new panel, gpu refactoring exynos: exynos3250 SoC support, drop bad mmap interface, mipi dsi changes, and component match support" * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (640 commits) drm/mst: rework payload table allocation to conform better. drm/ast: Fix HW cursor image drm/radeon/kv: add uvd/vce info to dpm debugfs output drm/radeon/ci: add uvd/vce info to dpm debugfs output drm/radeon: export reservation_object from dmabuf to ttm drm/radeon: cope with foreign fences inside the reservation object drm/radeon: cope with foreign fences inside display drm/core: use helper to check driver features drm/radeon/cik: write gfx ucode version to ucode addr reg drm/radeon/si: print full CS when we hit a packet 0 drm/radeon: remove unecessary includes drm/radeon/combios: declare legacy_connector_convert as static drm/radeon/atombios: declare connector convert tables as static drm/radeon: drop btc_get_max_clock_from_voltage_dependency_table drm/radeon/dpm: drop clk/voltage dependency filters for BTC drm/radeon/dpm: drop clk/voltage dependency filters for CI drm/radeon/dpm: drop clk/voltage dependency filters for SI drm/radeon/dpm: drop clk/voltage dependency filters for NI drm/radeon: disable audio when we disable hdmi (v2) drm/radeon: split audio enable between eg and r600 (v2) ...
| * drm/radeon: export reservation_object from dmabuf to ttmMaarten Lankhorst2014-10-031-2/+2
| | | | | | | | | | | | | | | | Adds an extra argument to radeon_bo_create, which is only used in radeon_prime.c. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/radeon: cope with foreign fences inside the reservation objectMaarten Lankhorst2014-10-031-1/+1
| | | | | | | | | | | | | | | | Not the whole world is a radeon! :-) Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/radeon/cik: write gfx ucode version to ucode addr regAlex Deucher2014-10-011-10/+7
| | | | | | | | | | | | | | | | Helpful for debugging as the version shows up in a register dump. Cc: Jay Cornwall <jay.cornwall@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm: backmerge tag 'v3.17-rc5' into drm-nextDave Airlie2014-09-161-7/+19
| |\ | | | | | | | | | | | | | | | | | | This is requested to get the fixes for intel and radeon into the same tree for future development work. i915_display.c: fix missing dev_priv conflict.
OpenPOWER on IntegriCloud