path: root/xmrstak/backend/nvidia/nvcc_code
Commit message | Author | Age | Files | Lines
* fix possible deadlock with Voltapsychocrypt2018-06-041-1/+1
| If CUDA 9.x is used and the miner is compiled for `sm_70` and run on Volta GPUs, then the miner deadlocks if `threads` is not a multiple of `32`.
| - use `__activemask()` to get all active lanes
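To make the failure mode concrete, here is a minimal host-side model in plain C++, not the miner's kernel code. It assumes the warp-level call sits inside a branch that only lanes with a global thread index below `threads` enter; the function name `live_lane_mask` is hypothetical. When `threads` is not a multiple of 32, the last warp has fewer than 32 such lanes, so a hard-coded full mask names lanes that never reach the call, while `__activemask()` on the device reports only the lanes that are actually there.

```cpp
#include <cstdint>

constexpr std::uint32_t kWarpSize = 32;

// Host-side model (assumption): bit i is set if lane i of warp `warp_index`
// has a global thread index below `threads`, i.e. would enter the guarded branch.
inline std::uint32_t live_lane_mask(std::uint32_t threads, std::uint32_t warp_index)
{
    const std::uint64_t first = std::uint64_t(warp_index) * kWarpSize;
    if (first >= threads)
        return 0u;                      // the whole warp is past the end
    const std::uint64_t live = threads - first;
    if (live >= kWarpSize)
        return 0xFFFFFFFFu;             // a full warp: all 32 lanes participate
    return (1u << live) - 1u;           // partial warp: only the low `live` lanes
}
```

With `threads = 40`, the second warp yields `0xFF` rather than `0xFFFFFFFF`, which is the case where a full mask would wait on 24 lanes that never arrive.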
* Spell checkTony Butler2018-06-048-56/+55
|
* support stellite v4 forkpsychocrypt2018-06-041-6/+19
| solve #1494
| - add algorithm `cryptonight_v7_stellite` (internally named `cryptonight_stellite`)
* add support for IPBC coinpsychocrypt2018-06-041-4/+15
| | | | | - add algorithm `cryptonight_lite_v7_xor` - update documentation
* add independent dev pool coin descriptionpsychocrypt2018-06-041-6/+6
| - allow the dev pool to fork on a different block version than the user-described coin
|
| All algorithms are centered around the user coin description. The user coin description may contain two different coin algorithms. The dev pool coin description may only use algorithms that are also used in the user coin description.
|
| There are two ways to define a non-forking coin:
| - set both user coin algorithm descriptions to the same algorithm and set the version to zero
| - set the first algorithm in the user coin description to the one you want to use in the dev pool, set the second algorithm to the correct representation of the coin, and set the version to 255. This allows the dev pool to mine on a different coin algorithm than the non-forking user coin.
|
| Do not use an algorithm with a different scratchpad size for the dev pool.
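The rules above can be sketched as a small selection function. This is an illustration only: the type `CoinDescription`, the field names, the enum values and the rule "use the first algorithm once the block version reaches the fork version" are assumptions made for the example, not taken from the commit.

```cpp
#include <cstdint>

// Hypothetical algorithm identifiers for the sketch.
enum class Algo { cryptonight, cryptonight_monero, cryptonight_lite, cryptonight_aeon };

// Hypothetical coin description: two algorithms plus the version that separates them.
struct CoinDescription {
    Algo algo;                  // first algorithm (assumed: active from fork_version on)
    Algo algo_root;             // second algorithm (assumed: active below fork_version)
    std::uint8_t fork_version;  // 0 or 255 describe a non-forking coin, see the text above
};

// Assumed selection rule: compare the block version against the fork version.
inline Algo select_algo(const CoinDescription& coin, std::uint8_t block_version)
{
    return block_version >= coin.fork_version ? coin.algo : coin.algo_root;
}
```

Under these assumptions, a user coin described as `{cryptonight_monero, cryptonight, 255}` keeps mining `cryptonight` for every block version below 255, while a dev pool description `{cryptonight_monero, cryptonight, 7}` can switch to `cryptonight_monero` at version 7. Both algorithms appear in the user description, as the commit message requires.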
* refactor scratchpad creationpsychocrypt2018-06-041-2/+8
| | | | Use the maximum scratchpad size from before and after the fork.
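As a rough sketch of that idea (the function name is hypothetical and the sizes in the usage note are only examples), the allocation size is simply the larger of the two scratchpad sizes, so the buffer does not have to be reallocated when the algorithm changes at the fork:

```cpp
#include <algorithm>
#include <cstddef>

constexpr std::size_t kMiB = 1024u * 1024u;

// Allocate once for whichever algorithm needs more scratchpad memory.
inline std::size_t scratchpad_alloc_size(std::size_t before_fork_bytes,
                                         std::size_t after_fork_bytes)
{
    return std::max(before_fork_bytes, after_fork_bytes);
}
```

For a coin that moves from a 1 MiB to a 2 MiB scratchpad, `scratchpad_alloc_size(1 * kMiB, 2 * kMiB)` yields 2 MiB per hash.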
* Repair more typos in comments onlyTony Butler2018-06-041-2/+2
|
* NVIDIA: fix sumokoinpsychocrypt2018-06-041-26/+20
| sumokoin is broken if `bfactor >= 5` is used (the default on Windows).
| sumokoin for `sm_20` is broken due to the missing extern shared memory.
| - call the phase3 kernel two times if sumokoin is enabled
| - create extern shared memory for the phase3 kernel
* fix cuda architecture detectionpsychocrypt2018-06-041-1/+1
| fix #1297
| If `sm_20` is mixed with other architectures, the detection of the minimal supported architecture is broken.
* refactor mining algo selectionpsychocrypt2018-06-043-22/+11
| | | | | - add `fork_height` to currency - refactor algorithm selection
* POW AEON v7psychocrypt2018-06-041-3/+10
| | | | | - add new pow for AEON - fix missing cryptonight-heavy selection for multi hashes
* revert input size changepsychocrypt2018-03-251-2/+3
| revert #1198, the block size is limited to 84 bytes
* fix input size on devicepsychocrypt2018-03-251-1/+1
|
* Fixing allocation issueJuan Leni2018-03-251-1/+1
|
* XMR-Stak 2.3.0 RCxmr-stak-devs2018-03-253-92/+305
| | | | | | | Co-authored-by: psychocrypt <psychocryptHPC@gmail.com> Co-authored-by: fireice-uk <fireice-uk@users.noreply.github.com> Co-authored-by: Lee Clagett <code@leeclagett.com> Co-authored-by: curie-kief <curie-kief@users.noreply.github.com>
* speedup Voltapsychocrypt2018-01-302-1/+19
| - enable L1 cache for NVIDIA Volta GPUs and newer
| - remove explicit cache control for Volta GPUs and newer
| This pull request increases the hash rate for Volta GPUs by ~5%
* reduce memory usage for low end gpuspsychocrypt2018-01-221-0/+6
| | | | reduce memory usage to 1GiB for NVIDIA devices with <=6 SMX
* Merge pull request #464 from psychocrypt/topic-handleCudaErrorCodesfireice-uk2017-12-151-8/+8
|\ | | | | handle cuda error codes
| * handle cuda error codespsychocrypt2017-12-101-8/+8
| | | | | | | | handle all error codes from the cuda api calls.
* | fix cuda9.1 compilepsychocrypt2017-12-141-1/+0
|/ - fix cuda9.1 compile (remove the include of `device_functions.hpp`, which was removed with cuda9.1) - remove NVIDIA Volta gpus for MAC OSX
* Beautification edit as per fireice-uk's suggestionAndrew Whittle2017-12-091-1/+1
| | | | Makes casting more explicit.
* Fix compat_usleep for WIN32Andrew Whittle2017-12-091-1/+1
| When compiled with VS2017, the negation applied to the uint wait time is ignored. Fixed by casting first.
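The pitfall can be reproduced in portable C++. The sketch below is not the miner's `compat_usleep`. It assumes a 32-bit unsigned wait time in microseconds that is converted to the 100 ns units used by Windows timers, where a negative value means a relative time. Both function names are hypothetical.

```cpp
#include <cstdint>

// Negation is performed in 32-bit unsigned arithmetic: the result wraps to a
// large positive number before it is widened to 64 bits.
inline std::int64_t relative_time_negate_unsigned(std::uint32_t wait_us)
{
    return -(10u * wait_us);
}

// Casting to a signed 64-bit type first keeps the sign.
inline std::int64_t relative_time_cast_first(std::uint32_t wait_us)
{
    return -(10 * static_cast<std::int64_t>(wait_us));
}
```

For a wait of 1000 µs, the first variant produces a large positive value instead of -10000, so the sign that marks the time as relative is lost. Some compilers warn about unary minus on an unsigned operand, which is a useful hint in this situation.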
* fix indentationpsychocrypt2017-12-091-1/+3
| - fix indentation
* conservative NVIDIA auto suggestionpsychocrypt2017-12-081-1/+12
| Be more conservative with the auto suggestion.
| - increase bfactor if `smx <= 6`
| - limit memory for Pascal < GTX1070 to 2GiB
| - limit memory for Pascal <= GTX1080 to 4GiB
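A minimal sketch of such a heuristic follows. It assumes that the GPU class is recognised by the multiprocessor count (a GTX 1070 has 15 SMs, a GTX 1080 has 20) and that Pascal is identified by compute capability major version 6. The function names and the exact mapping are illustrative assumptions, not the miner's code.

```cpp
#include <algorithm>
#include <cstdint>

constexpr std::uint64_t kGiB = 1024ull * 1024ull * 1024ull;

// Cap the memory the auto suggestion may plan with (assumed mapping by SM count).
inline std::uint64_t cap_usable_memory(int arch_major, int sm_count, std::uint64_t free_bytes)
{
    if (arch_major == 6) {                         // Pascal
        if (sm_count < 15)                         // below GTX 1070
            return std::min(free_bytes, 2 * kGiB);
        if (sm_count <= 20)                        // up to GTX 1080
            return std::min(free_bytes, 4 * kGiB);
    }
    return free_bytes;                             // no cap for other devices
}

// Small devices get a higher bfactor; how much higher is left to the caller.
inline bool needs_higher_bfactor(int sm_count)
{
    return sm_count <= 6;
}
```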
* Merge pull request #399 from psychocrypt/topic-nvidiaErrorWithMessagefireice-uk2017-12-083-23/+53
|\ | | | | add message to `CUDA_CHECK...` macros
| * add message to `CUDA_CHECK...` macrospsychocrypt2017-12-073-23/+53
| | - add macro `CUDA_CHECK_MSG_KERNEL` and `CUDA_CHECK_MSG`
| | - add suggestions on how typical errors can be solved
* | option to control gpu synchronizationpsychocrypt2017-12-012-2/+18
|/ | | | | - add option `sync_mode` - update auto suggestion and jconf
* Merge pull request #221 from psychocrypt/fix-cudaLaunchBoundsfireice-uk2017-11-241-1/+1
|\ | | | | fix CUDA launch bounds usage
| * fix CUDA launch bounds usagepsychocrypt2017-11-231-1/+1
| | fix #191
| | launch bounds must be placed before the return type but after the template parameter
* | fix auto suggestion for low end devicespsychocrypt2017-11-201-0/+4
|/ Increase bfactor for all devices with fewer than 6 multiprocessors.
* Merge pull request #133 from psychocrypt/fix-cudaArchBinaryDetectionfireice-uk2017-11-171-1/+1
|\ | | | | fix wrong cuda binary arch detection
| * fix wrong cuda binary arch detectionpsychocrypt2017-11-171-1/+1
| | fix wrong arch comparison
* | fix nvidia auto suggestionpsychocrypt2017-11-171-2/+2
|/ The lmem usage still cannot be calculated reliably and crashes the miner very often. Increase the assumed lmem usage to 16kiB to account for lmem alignment, ...
* Remove whitespace linesUnknown2017-11-161-1/+1
|
* check gpu architecturepsychocrypt2017-11-151-0/+48
| - check if the gpu architecture is supported by the compiled miner binary
| - remove unsupported gpus from the auto suggestion
| - disallow selecting an unsupported gpu by hand-tuning the config
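A simplified sketch of such a check is shown below. It encodes architectures as `major * 10 + minor` (for example `61` for `sm_61`) and assumes a device is usable if the binary was compiled for at least one architecture that is not newer than the device. The real compatibility rules may be stricter, and the function names are hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Assumed rule: some compiled architecture must be <= the device architecture.
inline bool is_supported(int device_arch, const std::vector<int>& compiled_archs)
{
    return std::any_of(compiled_archs.begin(), compiled_archs.end(),
                       [device_arch](int arch) { return arch <= device_arch; });
}

// Keep only the devices the binary can run on, preserving their order.
inline std::vector<int> filter_supported(const std::vector<int>& device_archs,
                                         const std::vector<int>& compiled_archs)
{
    std::vector<int> usable;
    for (int arch : device_archs)
        if (is_supported(arch, compiled_archs))
            usable.push_back(arch);
    return usable;
}
```

The same predicate can back both the auto suggestion (filter the device list) and the validation of a hand-written config (reject a device for which `is_supported` returns false).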
* fix wrong memory detectionpsychocrypt2017-11-151-6/+37
| | | | | | | | | Free and total memory is only evaluated on the first device. To detect the gpu memory the gpu must be selected. - create context on the gpu before the memory is checked - add smx to the auto detection - change the result code of `cuda_get_deviceinfo()`
* optimize NVIDIA autosuggestionpsychocrypt2017-11-031-0/+20
| | | | | - avoid creation of a config with zero threads or blocks - WINDOWS: reduce the used memory for the auto suggestion by the amount of already used memory
* fix windows compile and broken aeonpsychocrypt2017-10-282-8/+24
| | | | | - fix windows linker error during compile - fix wrong parameter to call aeon (nvidia-backend)
* increase safety memory for autosuggestionpsychocrypt2017-10-271-8/+8
| - increase safety memory from 64 to 128 MiB
| - NVIDIA: increase lmem reserve per thread to 1kiB
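The effect of these reserves on the suggested configuration can be sketched as a simple budget calculation. This is an illustration under stated assumptions: the function name and parameter layout are hypothetical, and the per-hash cost is modelled as scratchpad size plus the lmem reserve.

```cpp
#include <cstdint>

constexpr std::uint64_t kKiB = 1024ull;
constexpr std::uint64_t kMiB = 1024ull * kKiB;

// Number of hashes that fit into the free memory after the safety margin is
// set aside. Returns 0 if nothing fits, so the caller can skip the device
// instead of writing a config with zero threads or blocks.
inline std::uint64_t max_hashes(std::uint64_t free_bytes,
                                std::uint64_t scratchpad_bytes,
                                std::uint64_t lmem_reserve_bytes,
                                std::uint64_t safety_bytes)
{
    if (free_bytes <= safety_bytes)
        return 0;
    return (free_bytes - safety_bytes) / (scratchpad_bytes + lmem_reserve_bytes);
}
```

A call such as `max_hashes(free, 2 * kMiB, 1 * kKiB, 128 * kMiB)` uses the values named in this commit together with an assumed 2 MiB scratchpad.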
* rename `xmr` to `monero`psychocrypt2017-10-272-6/+6
| - rename all `xmr` to `monero`
| - be case-insensitive when checking the configured currency
| - add function to compare two strings case-insensitively
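A case-insensitive comparison of this kind could look like the following sketch. The function name `equals_ignore_case` is hypothetical and only ASCII-style case folding via `std::tolower` is considered.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Compare two strings while ignoring the case of each character.
inline bool equals_ignore_case(const std::string& a, const std::string& b)
{
    if (a.size() != b.size())
        return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        // Cast to unsigned char first: std::tolower is undefined for negative values.
        const int ca = std::tolower(static_cast<unsigned char>(a[i]));
        const int cb = std::tolower(static_cast<unsigned char>(b[i]));
        if (ca != cb)
            return false;
    }
    return true;
}
```

With this helper, a configured currency of `Monero` or `MONERO` is treated the same as `monero`.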
* add aeon support to backend nvidiapsychocrypt2017-10-274-20/+46
| | | | | - add template parameter to kernel to support aeon and xmr - update auto suggestion
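The idea of selecting the algorithm through template parameters can be illustrated on the host. The sketch below is a toy loop, not the miner's kernel. It assumes the two variants differ only in iteration count and scratchpad size (2 MiB and 0x80000 iterations for CryptoNight, 1 MiB and 0x40000 for CryptoNight-Lite) and that the scratchpad size is a power of two. All names are hypothetical.

```cpp
#include <cstdint>

// Address mask for 16-byte aligned offsets inside a power-of-two scratchpad.
template <std::uint32_t MEMORY>
constexpr std::uint32_t scratchpad_mask()
{
    return (MEMORY - 1u) & ~0xFu;
}

// Toy stand-in for the memory-hard loop: the compiler sees ITERATIONS and
// MEMORY as constants, so one source function serves both coins.
template <std::uint32_t ITERATIONS, std::uint32_t MEMORY>
std::uint32_t toy_walk(std::uint64_t seed)
{
    std::uint64_t state = seed;
    std::uint32_t offset = 0;
    for (std::uint32_t i = 0; i < ITERATIONS; ++i) {
        offset = static_cast<std::uint32_t>(state >> 16) & scratchpad_mask<MEMORY>();
        // Arbitrary mixing step so that consecutive offsets differ.
        state = state * 6364136223846793005ull + 1442695040888963407ull + offset;
    }
    return offset; // last offset touched, always inside the scratchpad
}

constexpr std::uint32_t kMoneroMemory = 2u * 1024u * 1024u;
constexpr std::uint32_t kMoneroIterations = 0x80000u;
constexpr std::uint32_t kAeonMemory = 1u * 1024u * 1024u;
constexpr std::uint32_t kAeonIterations = 0x40000u;
```

A caller would instantiate `toy_walk<kMoneroIterations, kMoneroMemory>` or `toy_walk<kAeonIterations, kAeonMemory>` depending on the configured currency.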
* Merge pull request #57 from psychocrypt/fix-nvidiaBackendCrashfireice-uk2017-10-221-8/+9
|\ | | | | fix illegal memory access
| * fix illegal memory accesspsychocrypt2017-10-211-8/+9
| | | | | | | | remove restricted pointer
* | fix CUDA 9 shuffle warningpsychocrypt2017-10-201-1/+5
|/ use `__shfl_sync` if CUDA 9+ is available
* cleanup includespsychocrypt2017-09-302-4/+4
|
* group filespsychocrypt2017-09-3011-0/+2592
- move source code to `src` - categorize files and move to group folder - change upper case class files to lower case - change C++ header to `*.hpp`