Merge branch 'rangersbach/cuda_blocksizes' into 'v2.0-dev'

Optimization for GPU block size determination

See merge request pycodegen/pystencils!454

Changed files:
- docs/source/backend/gpu_codegen.md (7 additions, 0 deletions)
- docs/source/user_manual/gpu_kernels.md (35 additions, 10 deletions)
- src/pystencils/backend/platforms/cuda.py (10 additions, 27 deletions)
- src/pystencils/backend/platforms/sycl.py (4 additions, 13 deletions)
- src/pystencils/codegen/config.py (25 additions, 19 deletions)
- src/pystencils/codegen/driver.py (14 additions, 10 deletions)
- src/pystencils/codegen/gpu_indexing.py (356 additions, 54 deletions)
- src/pystencils/jit/gpu_cupy.py (2 additions, 12 deletions)
- src/pystencils/utils.py (5 additions, 0 deletions)
- tests/kernelcreation/test_gpu.py (117 additions, 16 deletions)