Optimization for GPU block size determination
This MR optimizes the determination of GPU block sizes so that they are always multiples of the hardware's warp (CUDA) or wavefront (HIP) size.
Summarized, this MR covers:
- GpuOptions.omit_range_check, GpuOptions.block_size, and GpuOptions.warp_size, together with a function for determining default values
- assume_warp_aligned_block_size, which assures the compiler that block sizes are aligned with the warp size
- fit_block_size and trim_block_size, member functions added to DynamicBlockSizeLaunchConfiguration for computing block sizes based on a user-defined initial block size and the iteration space
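
To illustrate the idea of deriving a warp-aligned block size from a user-defined initial block size and the iteration space, here is a minimal sketch in plain Python. It is not the code from this MR: the helper names (trim_to_iteration_space, align_to_warp, warp_aligned_block_size) are hypothetical, and the alignment strategy (rounding the x-extent up to the next multiple of the warp size) is an assumption chosen for simplicity. It also ignores hardware limits such as the maximum number of threads per block.

```python
from math import prod


def trim_to_iteration_space(block_size, iteration_space):
    """Shrink each block extent so it does not exceed the iteration space.

    Hypothetical helper, not the pystencils implementation.
    """
    return tuple(min(b, n) for b, n in zip(block_size, iteration_space))


def align_to_warp(block_size, warp_size):
    """Make the total thread count a multiple of warp_size.

    Assumed strategy: if the block is not already aligned, round the
    x-extent up to the next multiple of warp_size. Threads beyond the
    iteration space would then need a range check inside the kernel.
    """
    if prod(block_size) % warp_size == 0:
        return tuple(block_size)
    x, *rest = block_size
    aligned_x = -(-x // warp_size) * warp_size  # ceiling division
    return (aligned_x, *rest)


def warp_aligned_block_size(initial, iteration_space, warp_size):
    """Trim the initial block size, then align it to the warp size."""
    trimmed = trim_to_iteration_space(initial, iteration_space)
    return align_to_warp(trimmed, warp_size)


# warp_size=32 corresponds to a CUDA warp; the value is passed explicitly here.
print(warp_aligned_block_size((128, 2, 1), (100, 50, 1), warp_size=32))
```

In this example the initial block (128, 2, 1) is first trimmed to (100, 2, 1) because the iteration space has only 100 cells in x; since 200 threads is not a multiple of 32, the x-extent is then rounded up to 128.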