Clarify some doc comments; clarify launch grid specification

Commit 8af40ae4 authored 8 months ago by Frederik Hennig
--- a/docs/source/reference/gpu_kernels.md
+++ b/docs/source/reference/gpu_kernels.md
@@ -78,19 +78,26 @@ kfunc(f=f_arr, g=g_arr)
 ### Modifying the Launch Grid

 The `kernel.compile()` invocation in the above code produces a {any}`CupyKernelWrapper` callable object.
-Its interface allows us to customize the GPU launch grid.
-We can manually set both the number of threads per block, and the number of blocks on the grid:
+This object holds the kernel's launch grid configuration
+(i.e. the number of thread blocks and the number of threads per block).
+Pystencils specifies a default value for the block size and, if possible,
+the number of blocks is automatically inferred in order to cover the entire iteration space.
+In addition, the wrapper's interface allows us to customize the GPU launch grid,
+by manually setting both the number of threads per block, and the number of blocks on the grid:

 ```{code-cell} ipython3
 kfunc.block_size = (16, 8, 8)
 kfunc.num_blocks = (1, 2, 2)
 ```

-In most cases, the number of blocks is automatically inferred from the block size
-in order to cover the entire iteration space, so it does not need to be specified.
-Setting a launch grid that is larger than the iteration space is also possible,
-but will cause any threads working outside of the iteration bounds to idle.
+For most kernels, setting only the `block_size` is sufficient since pystencils will
+automatically compute the number of blocks;
+for exceptions to this, see [](#manual_launch_grids).
+If `num_blocks` is set manually and the launch grid thus specified is too small, only
+a part of the iteration space will be traversed by the kernel;
+similarly, if it is too large, it will cause any threads working outside of the iteration bounds to idle.

+(manual_launch_grids)=
 ### Manual Launch Grids and Non-Cuboid Iteration Patterns

 In some cases, it will be unavoidable to set the launch grid size manually;

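Note on the documentation change above: the statement that the number of blocks is inferred "in order to cover the entire iteration space" can be illustrated with a ceiling division per dimension. The following is a minimal sketch in plain Python, not pystencils code; the helper `infer_num_blocks` and the example iteration-space extents are hypothetical and only serve to show the arithmetic, and the actual inference inside pystencils may differ.

```python
def infer_num_blocks(iteration_space, block_size):
    """Hypothetical helper: smallest block count per dimension such that
    num_blocks[i] * block_size[i] >= iteration_space[i]."""
    # -(-n // b) is integer ceiling division for positive n and b
    return tuple(-(-n // b) for n, b in zip(iteration_space, block_size))


# Assumed iteration space of 64 x 32 x 20 cells with the block size from the docs
block_size = (16, 8, 8)
num_blocks = infer_num_blocks((64, 32, 20), block_size)

# Number of threads launched per dimension; any excess over the
# iteration space corresponds to threads that idle.
launched = tuple(n * b for n, b in zip(num_blocks, block_size))
```

With these assumed extents, the last dimension launches 24 threads for 20 cells, so the surplus threads fall outside the iteration bounds and idle. A manually chosen `num_blocks` smaller than the inferred value would leave part of the iteration space untraversed, as the revised text describes.
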
--- a/src/pystencils/config.py
+++ b/src/pystencils/config.py
@@ -33,7 +33,12 @@ class _AUTO_TYPE:


 AUTO = _AUTO_TYPE()
-"""Special value that can be passed to some options for invoking automatic behaviour."""
+"""Special value that can be passed to some options for invoking automatic behaviour.
+
+Currently, these options permit `AUTO`:
+
+- `ghost_layers <CreateKernelConfig.ghost_layers>`
+"""


 @dataclass

--- a/src/pystencils/target.py
+++ b/src/pystencils/target.py
@@ -87,7 +87,7 @@ class Target(Flag):
    """

    GPU = CUDA
-    """Alias for backward compatibility."""
+    """Alias for `Target.CUDA`, for backward compatibility."""

    SYCL = _GPU | _SYCL
    """SYCL kernel target.