Optimization for GPU block size determination

Merged Richard Angersbach requested to merge rangersbach/cuda_blocksizes into v2.0-dev

This MR optimizes the determination of GPU block sizes so that they are always multiples of the hardware's warp (CUDA) or wavefront (HIP) size.
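To illustrate why warp-aligned block sizes matter, here is a minimal sketch that counts the hardware lanes left idle when a block's thread count is not a multiple of the warp size. The helper name `idle_lanes` is hypothetical and is not part of this MR.

```python
def idle_lanes(block_threads: int, warp_size: int) -> int:
    """Lanes that are allocated but unused for a block of `block_threads` threads.

    Hypothetical helper for illustration only; not part of the MR.
    """
    # Warps are allocated whole, so round the warp count up (integer ceiling).
    warps = -(-block_threads // warp_size)
    return warps * warp_size - block_threads


print(idle_lanes(100, 32))  # 28: four warps are allocated for 100 threads
print(idle_lanes(128, 32))  # 0: the block is warp-aligned
```

A block of 100 threads therefore occupies as many warps as a block of 128 threads while doing less work.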

In summary, this MR:

  • removes BasicOption GpuOptions.omit_range_check
  • removes BasicOption GpuOptions.block_size
  • introduces BasicOption GpuOptions.warp_size and implements a function for determining its default value
  • introduces BasicOption assume_warp_aligned_block_size, which assures the compiler that block sizes are multiples of the warp size
  • adds new GpuOptions to the data flow of GpuIndexing
  • adds an algorithm for fitting the block size to the iteration space and the warp size
  • adds fit_block_size and trim_block_size member functions to DynamicBlockSizeLaunchConfiguration for computing block sizes based on a user-defined initial block size and the iteration space
  • for assumed alignment: rounds block sizes to multiples of the warp size when the iteration space is unknown at generation time
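The trimming, fitting, and rounding steps listed above could look roughly like the sketch below. It is an illustration under stated assumptions, not the algorithm implemented in this MR. The free functions here only borrow the names `trim_block_size` and `fit_block_size`; their signatures, the helper `round_up_to_warp`, and the growth strategy are all hypothetical. Hardware limits such as the maximum number of threads per block are ignored.

```python
from math import prod

Dim3 = tuple[int, int, int]


def trim_block_size(block: Dim3, iter_space: Dim3) -> Dim3:
    """Clamp each block dimension to the iteration-space extent (assumed behavior)."""
    return tuple(min(b, n) for b, n in zip(block, iter_space))


def round_up_to_warp(block: Dim3, warp_size: int) -> Dim3:
    """Grow x until the total thread count is a multiple of warp_size.

    Hypothetical helper. Terminates because any x that is a multiple
    of warp_size satisfies the condition.
    """
    x, y, z = block
    while (x * y * z) % warp_size != 0:
        x += 1
    return (x, y, z)


def fit_block_size(block: Dim3, iter_space: Dim3, warp_size: int) -> Dim3:
    """Trim to the iteration space, then try to restore warp alignment."""
    trimmed = trim_block_size(block, iter_space)
    if prod(trimmed) % warp_size == 0:
        return trimmed
    # Try growing one dimension at a time without leaving the iteration space.
    for dim in range(3):
        cand = list(trimmed)
        while cand[dim] < iter_space[dim]:
            cand[dim] += 1
            if prod(cand) % warp_size == 0:
                return tuple(cand)
    # No aligned block fits inside the iteration space: overshoot in x instead.
    return round_up_to_warp(trimmed, warp_size)


print(fit_block_size((128, 1, 1), (100, 8, 1), 32))  # (100, 8, 1)
print(round_up_to_warp((100, 1, 1), 32))             # (128, 1, 1)
```

In the first call, trimming yields (100, 1, 1), which is not warp-aligned, so the sketch grows the y dimension until the thread count (800) is a multiple of 32. The second call shows the rounding fallback that applies when the iteration space gives no room to grow.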
Edited by Richard Angersbach
