This MR optimizes GPU block sizes so that they are always multiples of the hardware's warp (CUDA) or wavefront (HIP) size.
Summarized, this MR covers:
- `GpuOptions.omit_range_check`, `GpuOptions.block_size`, and `GpuOptions.warp_size`, along with a function for determining default values
- `assume_warp_aligned_block_size`, which assures the compiler that block sizes match the warp size
- the `fit_block_size` and `trim_block_size` member functions added to `DynamicBlockSizeLaunchConfiguration` for computing block sizes based on a user-defined initial block size and the iteration space
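
To illustrate the general idea of deriving a warp-aligned block size from an initial block size and the iteration space, here is a minimal standalone sketch. It is not the code from this MR: the function name `sketch_warp_aligned_block_size`, its parameters, and the padding and shrinking strategy are all assumptions made for illustration, and the actual `fit_block_size` and `trim_block_size` may behave differently.

```python
def sketch_warp_aligned_block_size(initial_block, iter_space,
                                   warp_size=32, max_threads=1024):
    """Hypothetical helper, NOT part of pystencils.

    initial_block: user-chosen (x, y, z) block size
    iter_space:    (x, y, z) extents of the iteration space
    warp_size:     32 on NVIDIA hardware, commonly 64 on AMD hardware
    max_threads:   assumed per-block thread limit
    """
    # Step 1 (trim): no block dimension needs to exceed the iteration space.
    bx, by, bz = (min(b, n) for b, n in zip(initial_block, iter_space))

    # Step 2 (align): round x up to the next multiple of the warp size,
    # so the total thread count is a multiple of the warp size as well.
    bx = -(-bx // warp_size) * warp_size
    # Keep x itself within the thread limit (still a warp multiple).
    bx = min(bx, max_threads // warp_size * warp_size)

    # Step 3 (limit): shrink z first, then y, until the block fits.
    while bx * by * bz > max_threads and bz > 1:
        bz = (bz + 1) // 2
    while bx * by * bz > max_threads and by > 1:
        by = (by + 1) // 2

    return bx, by, bz


# A 128x2x1 block on a 100x100x1 iteration space with 32-wide warps:
print(sketch_warp_aligned_block_size((128, 2, 1), (100, 100, 1)))  # (128, 2, 1)
```

In this sketch, padding the x-dimension can create threads that lie outside the iteration space, so a kernel launched this way would still need a range check to mask them.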