summaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
Commit message (Collapse)AuthorAgeFilesLines
* amdkfd: Use the canonical form in branch predicatesEdward O'Callaghan2016-05-011-5/+5
| | | | | | Found-By: Coccinelle Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* drm/amdkfd: make reset wavefronts per process per deviceBen Goz2015-06-061-3/+4
| | | | | | | | This commit moves the reset wavefront flag to per process per device data structure, so we can support multiple devices. Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* drm/amdkfd: Enforce kill all waves on process terminationBen Goz2015-06-031-1/+7
| | | | | | | | | | | | This commit makes sure that on process termination, after we're destroying all the active queues, we're killing all the existing wave front of the current process. By doing this we're making sure that if any of the CUs were blocked by infinite loop we're enforcing it to end the shader explicitly. Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* drm/amdkfd: Add wave control operation to debuggerYair Shachar2015-06-031-1/+1
| | | | | | | | | | | | | | The wave control operation supports several command types executed upon existing wave fronts that belong to the currently debugged process. The available commands are: HALT - Freeze wave front(s) execution RESUME - Resume freezed wave front(s) execution KILL - Kill existing wave front(s) Signed-off-by: Yair Shachar <yair.shachar@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* drm/amdkfd: Add static user-mode queues supportYair Shachar2015-06-031-7/+31
| | | | | | | | | | | | | | | This patch adds support for static user-mode queues in QCM. Queues which are designated as static can NOT be preempted by the CP microcode when it is executing its scheduling algorithm. This is needed for supporting the debugger feature, because we can't allow the CP to preempt queues which are currently being debugged. The number of queues that can be designated as static is limited by the number of HQDs (Hardware Queue Descriptors). Signed-off-by: Yair Shachar <yair.shachar@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* Backmerge v4.1-rc4 into into drm-nextDave Airlie2015-05-201-2/+4
|\ | | | | | | | | | | | | | | | | | | We picked up a silent conflict in amdkfd with drm-fixes and drm-next, backmerge v4.1-rc5 and fix the conflicts Signed-off-by: Dave Airlie <airlied@redhat.com> Conflicts: drivers/gpu/drm/drm_irq.c
| * drm/amdkfd: Initialize sdma vm when creating sdma queueXihan Zhang2015-05-071-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a bug where sdma vm wasn't initialized when an sdma queue was created in HWS mode. This caused GPUVM faults to appear on dmesg and it is one of the causes that SDMA queues are not working. Signed-off-by: Xihan Zhang <xihan.zhang@amd.com> Reviewed-by: Ben Goz <ben.goz@amd.comt> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: stable@vger.kernel.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdkfd: allow unregister process with queuesOded Gabbay2015-05-071-2/+3
| | | | | | | | | | | | | | | | | | | | | | Sometimes we might unregister process that have queues, because we couldn't preempt the queues. Until now we blocked it with BUG_ON but instead just print it as debug. Reviewed-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: stable@vger.kernel.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
* | drm/amdkfd: Add interrupt handling moduleAndrew Lewycky2015-05-191-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the interrupt handling module, kfd_interrupt.c, and its related members in different data structures to the amdkfd driver. The amdkfd interrupt module maintains an internal interrupt ring per amdkfd device. The internal interrupt ring contains interrupts that needs further handling. The extra handling is deferred to a later time through a workqueue. There's no acknowledgment for the interrupts we use. The hardware simply queues a new interrupt each time without waiting. The fixed-size internal queue means that it's possible for us to lose interrupts because we have no back-pressure to the hardware. However, only interrupts that are "wanted" by amdkfd, are copied into the amdkfd s/w interrupt ring, in order to minimize the chances for overflow of the ring. Signed-off-by: Andrew Lewycky <Andrew.Lewycky@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* | drm/amdkfd: make the sdma vm init to be asic specificOded Gabbay2015-05-191-14/+1
|/ | | | | Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
* Merge tag 'drm-intel-next-2015-03-27-merge' of ↵Dave Airlie2015-04-011-1/+9
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://anongit.freedesktop.org/drm-intel into drm-next This backmerges 4.0-rc6 due to the recent fixes in rc5/6 - DP link rate refactoring from Ville - byt/bsw rps tuning from Chris - kerneldoc for the shrinker code - more dynamic ppgtt pte work (Michel, Ben, ...) - vlv dpll code refactoring to prep fro bxt (Imre) - refactoring the sprite colorkey code (Ville) - rotated ggtt view support from Tvrtko - roll out struct drm_atomic_state to prep for atomic update (Ander) * tag 'drm-intel-next-2015-03-27-merge' of git://anongit.freedesktop.org/drm-intel: (473 commits) Linux 4.0-rc6 arm64: juno: Fix misleading name of UART reference clock drm/i915: Update DRIVER_DATE to 20150327 drm/i915: Skip allocating shadow batch for 0-length batches drm/i915: Handle error to get connector state when staging config drm/i915: Compare GGTT view structs instead of types drm/i915: fix simple_return.cocci warnings drm/i915: Add module param to test the load detect code drm/i915: Remove usage of encoder->new_crtc from clock computations drm/i915: Don't look at staged config crtc when changing DRRS state drm/i915: Convert intel_pipe_will_have_type() to using atomic state drm/i915: Pass an atomic state to modeset_global_resources() functions drm/i915: Add dynamic page trace events drm/i915: Finish gen6/7 dynamic page table allocation drm/i915: Remove unnecessary gen6_ppgtt_unmap_pages drm/i915: Fix i915_dma_map_single positive error code drm/i915: Prevent out of range pt in gen6_for_each_pde drm/i915: fix definition of the DRM_IOCTL_I915_GET_SPRITE_COLORKEY ioctl drm/i915: Rip out GET_SPRITE_COLORKEY ioctl watchdog: imgpdc: Fix default heartbeat ...
| * drm/amdkfd: Fix SDMA queue init. in non-HWS modeBen Goz2015-03-161-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes the SDMA queue initialization, when running in non-HWS mode. The first fix is to move the initialization of SDMA VM parameters before the initialization of the SDMA MQD. The second fix is to load the MQD to an HQD after the initialization of the MQD. Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
* | drm/amdkfd: Add multiple kgd supportXihan Zhang2015-03-251-5/+9
| | | | | | | | | | | | | | | | | | The current code can only support one kgd instance. We have to support multiple kgd instances in one system. i.e two amdgpu or two radeon or one amdgpu + one radeon or more than two kgd instances. Signed-off-by: Xihan Zhang <xihan.zhang@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
* | drm/amdkfd: rename fence_wait_timeoutOded Gabbay2015-03-251-2/+2
|/ | | | | | | | fence_wait_timeout() is an exported kernel symbol, so we should rename our local function to something different. Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
* drm/amdkfd: don't set get_pipes_num() as inlineOded Gabbay2015-02-231-0/+6
| | | | | | | get_pipes_num() calls BUG_ON so we can't set it as inline because it produces a warning as BUG_ON() uses static variables when it is expanded. Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
* drm/amdkfd: Initialize only amdkfd's assigned pipelinesOded Gabbay2015-02-231-2/+2
| | | | | | | | | | | | | | | | | | This patch fixes a bug in the initialization of the pipelines. The init_pipelines() function was called with a constant value of 0 in the first_pipe argument. This is an error because amdkfd doesn't handle pipe 0. The correct way is to pass the value that get_first_pipe() returns as the argument for first_pipe. This bug appeared in 3.19 (first version with amdkfd) and it causes around 15% drop in CPU performance of Kaveri (A10-7850). v2: Don't set get_first_pipe() as inline because it calls BUG_ON() Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Cc: stable@vger.kernel.org Tested-by: Michel Dänzer <michel.daenzer@amd.com>
* Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds2015-02-161-192/+228
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull drm updates from Dave Airlie: "This is the main drm pull, it has a shared branch with some alsa crossover but everything should be acked by relevant people. New drivers: - ATMEL HLCDC driver - designware HDMI core support (used in multiple SoCs). core: - lots more atomic modesetting work, properties and atomic ioctl (hidden under option) - bridge rework allows support for Samsung exynos chromebooks to work finally. - some more panels supported i915: - atomic plane update support - DSI uses shared DSI infrastructure - Skylake basic support is all merged now - component framework used for i915/snd-hda interactions - write-combine cpu memory mappings - engine init code refactored - full ppgtt enabled where execlists are enabled. - cherryview rps/gpu turbo and pipe CRC support. radeon: - indirect draw support for evergreen/cayman - SMC and manual fan control for SI/CI - Displayport audio support amdkfd: - SDMA usermode queue support - replace suballocator usage with more suitable one - rework for allowing interfacing to more than radeon nouveau: - major renaming in prep for later splitting work - merge arm platform driver into nouveau - GK20A reclocking support msm: - conversion to atomic modesetting - YUV support for mdp4/5 - eDP support - hw cursor for mdp5 tegra: - conversion to atomic modesetting - better suspend/resume support for child devices rcar-du: - interlaced support imx: - move to using dw_hdmi shared support - mode_fixup support sti: - DVO support - HDMI infoframe support exynos: - refactoring and cleanup, removed lots of internal unnecessary abstraction - exynos7 DECON display controller support Along with the usual bunch of fixes, cleanups etc" * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (724 commits) drm/radeon: fix voltage setup on hawaii drm/radeon/dp: Set EDP_CONFIGURATION_SET for bridge chips if necessary drm/radeon: only enable kv/kb dpm interrupts once v3 drm/radeon: workaround for CP HW bug on CIK drm/radeon: Don't try to enable write-combining without PAT drm/radeon: use 0-255 rather than 0-100 for pwm fan range drm/i915: Clamp efficient frequency to valid range drm/i915: Really ignore long HPD pulses on eDP drm/exynos: Add DECON driver drm/i915: Correct the base value while updating LP_OUTPUT_HOLD in MIPI_PORT_CTRL drm/i915: Insert a command barrier on BLT/BSD cache flushes drm/i915: Drop vblank wait from intel_dp_link_down drm/exynos: fix NULL pointer reference drm/exynos: remove exynos_plane_dpms drm/exynos: remove mode property of exynos crtc drm/exynos: Remove exynos_plane_dpms() call with no effect drm/i915: Squelch overzealous uncore reset WARN_ON drm/i915: Take runtime pm reference on hangcheck_info drm/i915: Correct the IOSF Dev_FN field for IOSF transfers drm/exynos: fix DMA_ATTR_NO_KERNEL_MAPPING usage ...
| * drm/amdkfd: Fix dqm->queue_count trackingJay Cornwall2015-01-191-3/+6
| | | | | | | | | | | | | | | | | | | | dqm->queue_count tracks queues in the active state only. In a few places this count is modified unconditionally, leading to an incorrect value when the UPDATE_QUEUE ioctl is used to make a queue inactive. Signed-off-by: Jay Cornwall <jay.cornwall@amd.com> Reviewed-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
| * Merge branch 'master' of ↵Dave Airlie2015-01-291-3/+77
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into drm-next This backmerges drm-fixes into drm-next mainly for the amdkfd stuff, I'm not 100% confident, but it builds and the amdkfd folks can fix anything up. Signed-off-by: Dave Airlie <airlied@redhat.com> Conflicts: drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
| * | drm/amdkfd: Fix sparse errorsOded Gabbay2015-01-221-26/+2
| | | | | | | | | | | | | | | Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
| * | drm/amdkfd: Handle case of invalid queue typeOded Gabbay2015-01-221-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch handles a case where amdkfd tries to destroy a queue but the queue type is invalid. This case occurs in non-HWS path. Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
| * | drm/amdkfd: Add break at the end of caseOded Gabbay2015-01-221-0/+3
| | | | | | | | | | | | | | | Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
| * | drm/amdkfd: Remove negative check of uint variableOded Gabbay2015-01-221-1/+1
| | | | | | | | | | | | | | | Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
| * | Merge remote-tracking branch 'origin/master' into drm-nextDave Airlie2015-01-221-2/+26
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backmerge Linus tree after rc5 + drm-fixes went in. There were a few amdkfd conflicts I wanted to avoid, and Ben requested this for nouveau also. Conflicts: drivers/gpu/drm/amd/amdkfd/Makefile drivers/gpu/drm/amd/amdkfd/kfd_chardev.c drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c drivers/gpu/drm/amd/amdkfd/kfd_priv.h drivers/gpu/drm/amd/include/kgd_kfd_interface.h drivers/gpu/drm/i915/intel_runtime_pm.c drivers/gpu/drm/radeon/radeon_kfd.c
| * | | drm/amdkfd: Replace cpu_relax() with schedule() in DQMOded Gabbay2015-01-151-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order not to occupy the current core and thus prevent the core from servicing IOMMU PPR requests, this patch replaces the call in DQM to cpu_relax() with a call to schedule(). Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>
| * | | drm/amdkfd: Fix for-loop when allocating HQD (non-HWS)Ben Goz2015-01-131-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a minor bug in allocate_hqd(), where the loop run from the next-to-allocate pipe until the number of pipes. This is wrong because we need to consider the possibility where next-to-allocate pipe is not 0, and thus, the for-loop only checks part of the pipes and doesn't wrap-around, as it supposed to do. Therefore, we add another counting variable to make sure we go over all the pipes, regardless of where we start to look at the first iteration of the loop. This bug only affected non-HWS mode. In HWS mode, the CP fw is responsible for allocating the HQD. Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>
| * | | drm/amdkfd: Add initial VI support for DQMBen Goz2015-01-121-85/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch starts to add support for the VI APU in the DQM module. Because most (more than 90%) of the DQM code is shared among AMD's APUs, we chose a design that performs most/all the code in the shared DQM file (kfd_device_queue_manager.c). If there is H/W specific code to be executed, than it is written in an asic-specific extension function for that H/W. That asic-specific extension function is called from the shared function at the appropriate time. This requires that for every asic-specific extension function that is implemented in a specific ASIC, there will be an equivalent implementation in ALL ASICs, even if those implementations are just stubs. That way we achieve: - Maintainability: by having one copy of most of the code, we only need to fix bugs at one locations - Readability: very clear what is the shared code and what is done per ASIC - Extensibility: very easy to add new H/W specific files/functions Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
| * | | drm/amdkfd: Encapsulate DQM functions in ops structureOded Gabbay2015-01-121-34/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch does some re-org on the device_queue_manager structure. It takes out all the function pointers from the structure and puts them in a new structure, called device_queue_manager_ops. Then, it puts an instance of that structure inside device_queue_manager. This re-org is done to prepare the DQM module to support more than one AMD APU (Kaveri). Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
| * | | drm/amdkfd: Fix logic of destroy_queue_nocpsch()Ben Goz2014-08-181-18/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch rewrites destroy_queue_nocpsch() as the current logic that is implemented in the function is completely flawed. This function is used only in non-HWS mode. Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>
| * | | drm/amdkfd: Make KFD_MQD_TYPE enum types H/W agnosticBen Goz2015-01-041-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As the MQD types are common across all AMD GPUs/APUs, let's remove the CIK part from the name. Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
| * | | drm/amdkfd: Don't include header files from radeonOded Gabbay2015-01-021-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Because amdkfd will need to work both with radeon and amdgpu, don't include header files that are in radeon's folder. Instead, use the common amd include folder and move amdkfd specific defines to amdkfd header files. Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
| * | | drm/amdkfd: Remove call to deprecated init_memory interfaceBen Goz2014-10-261-21/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes a call to kfd-->kgd interface function that is doing H/W initialization. That function is moved into radeon to be part of the common H/W initialization sequence. The interface function will be deleted. Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
| * | | drm/amdkfd: Using new gtt sa in amdkfdOded Gabbay2015-01-091-16/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch change the calls throughout the amdkfd driver from the old kfd-->kgd interface to the new kfd gtt sa inside amdkfd v2: change the new call in sdma code that appeared because of the sdma feature Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alexey Skidanov <Alexey.skidanov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
| * | | drm/amdkfd: Add SDMA user-mode queues support to QCMBen Goz2015-01-091-11/+148
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for SDMA user-mode queues to the QCM - the Queue management system that manages queues-per-device and queues-per-process. v2: Remove calls to interface function that initializes sdma engines. v3: Use the new names of some of the defines. Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
| * | | drm/amdkfd: Process-device data creation and lookup splitAlexey Skidanov2015-01-091-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch splits the current kfd_get_process_device_data() to two functions, one that specifically creates a pdd and another one which just do lookup. This is done to enhance the readability and maintainability of the code. Signed-off-by: Alexey Skidanov <Alexey.Skidanov@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
* | | | drm/amdkfd: Fix bug in accounting of queuesOded Gabbay2015-02-021-1/+1
| |_|/ |/| | | | | | | | | | | | | | Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
* | | drm/amdkfd: Fix bug in call to init_pipelines()Oded Gabbay2015-01-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a bug where the first_pipe index passed into init_pipelines() was a #define instead of the value that is passed into amdkfd by radeon Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
* | | drm/amdkfd: Fix bug in pipelines initializationOded Gabbay2015-01-221-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a bug when calling to init_pipeline() interface. The index that was passed to that function didn't take into account the first_pipe value, which represents the first pipe index that is under amdkfd's responsibility. Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
* | | drm/amdkfd: Allow user to limit only queues per deviceOded Gabbay2015-01-181-0/+70
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch replaces the two current amdkfd module parameters with a new one. The current parameters that are being replaced are: - Maximum number of HSA processes - Maximum number of queues per process The new parameter that replaces them is called "Maximum queues per device" This replacement achieves two goals: - Allows the user to have as many HSA processes as it wants (until a maximum of 512 HSA processes in Kaveri). - Removes the limitation the user had on maximum number of queues per HSA process. E.g. the user can now have processes which only have one queue and other processes which have hundreds of queues, while before the user couldn't have more than 128 queues per process (as default). The default value of the new parameter is 4096 (32 * 128, which were the defaults of the old parameters). There is almost no additional GART memory required for the default case. As a reminder, this amount of queues requires a little bit below 4MB of GART memory. v2: In addition, This patch defines a new counter for queues accounting in the DQM structure. This is done because the current counter only counts active queues which allows the user to create more queues than the max_num_of_queues_per_device module parameter allows. However, we need the current counter for the runlist packet build process, so the solution is to have a dedicated counter for this accounting. Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Ben Goz <ben.goz@amd.com>
* | drm/amdkfd: Fix sparse warning (different address space)Oded Gabbay2015-01-081-1/+1
| | | | | | | | Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
* | drm/amdkfd: unmap VMID<-->PASID when relesing VMID (non-HWS)Ben Goz2015-01-051-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a bug where deallocate_vmid() didn't actually unmap the VMID<-->PASID mapping (in the registers). That can cause undefined behavior. This bug only occurs in non-HWS mode. Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>
* | drm/amdkfd: Load mqd to hqd in non-HWS modeBen Goz2015-01-041-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a bug in DQM, where the MQD of a newly created compute queue is not loaded to an HQD slot. As a result, the CP never reads packets from this queue. This bug happens only in non-HWS (hardware scheduling) mode. In HWS mode, the CP is responsible of loading MQDs to HQDs slots. Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>
* | amdkfd: Fix accounting of device queuesOded Gabbay2014-12-071-2/+11
|/ | | | | | | | This patch fixes a device QCM bug, where the number of queues were not counted correctly for the operation of update queue. The count was incorrect as there was no regard to the previous state of the queue. Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
* amdkfd: Fix memory leak of mqds on dqm finiOded Gabbay2014-11-251-0/+4
| | | | | | | The mqds array members are not freed when dqm is uninitialized. Reviewed-by: Ben Goz <Ben.Goz@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
* amdkfd: Instead of using get function, use container_ofAlexey Skidanov2014-11-191-12/+9
| | | | | | Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alexey Skidanov <Alexey.Skidanov@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
* amdkfd: Fix memory leak on process deregistrationJay Cornwall2014-11-201-0/+1
| | | | | | | | | struct device_process_node was allocated during process registration but not released at process deregistration. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jay Cornwall <jay.cornwall@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
* amdkfd: fence_wait_timeout() can be staticOded Gabbay2014-11-201-2/+3
| | | | | Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
* amdkfd: Add device queue manager moduleBen Goz2014-07-171-0/+1059
The queue scheduler divides into two sections, one section is process bounded and the other section is device bounded. The device bounded section is handled by this module. The DQM module handles queue setup, update and tear-down from the device side. It also supports suspend/resume operation. v3: Changed device_init, added the use of the new gart allocation functions an Added documentation. v4: Fixed a race in DQM queue scheduler where dqm->lock must be held when accessing dqm->queue_count and dqm->processes_count. This fixes runlist IB allocation failures when DQM is under load. Fixed race in DQM queue destruction where queues being destroyed must be removed from qpd->queues_list prior to preemption, or concurrent queue creation activity may reschedule them while their MQD is destroyed. Fixed EOP queue size setting in CP_HPD_EOP_CONTROL, because the size is specified as (log2(size_dwords)-1). The previous calculation assumed the size was specified in bytes, which caused interference between EOP queues when multiple MEC pipelines were active. v5: Move amdkfd from drm/radeon/ to drm/amd/ Change format of mqd structure to match latest KV firmware Add support for AQL queues creation to enable working with open-source HSA runtime Remove unused unmap_queue function Various fixes (Style, typos) Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Jay Cornwall <jay.cornwall@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
OpenPOWER on IntegriCloud