Commit 8af40ae4, authored 7 months ago by Frederik Hennig
Clarify some doc comments; clarify launch grid specification
Parent: fa7860cd
Merge request: !430 (Jupyter Inspection Framework, Book Theme, and Initial Drafts for Codegen Reference Guides)
Pipeline #70933 passed 7 months ago (stages: Code Quality, Unit Tests, legacy_test, docs)
Showing 3 changed files with 20 additions and 8 deletions:
docs/source/reference/gpu_kernels.md (+13 -6)
src/pystencils/config.py (+6 -1)
src/pystencils/target.py (+1 -1)
docs/source/reference/gpu_kernels.md (+13 -6)
@@ -78,19 +78,26 @@ kfunc(f=f_arr, g=g_arr)
 ### Modifying the Launch Grid

 The `kernel.compile()` invocation in the above code produces a {any}`CupyKernelWrapper` callable object.
-Its interface allows us to customize the GPU launch grid.
-We can manually set both the number of threads per block, and the number of blocks on the grid:
+This object holds the kernel's launch grid configuration
+(i.e. the number of thread blocks, and the number of threads per block.)
+Pystencils specifies a default value for the block size and if possible,
+the number of blocks is automatically inferred in order to cover the entire iteration space.
+In addition, the wrapper's interface allows us to customize the GPU launch grid,
+by manually setting both the number of threads per block, and the number of blocks on the grid:

 ```{code-cell} ipython3
 kfunc.block_size = (16, 8, 8)
 kfunc.num_blocks = (1, 2, 2)
 ```

-In most cases, the number of blocks is automatically inferred from the block size
-in order to cover the entire iteration space, so it does not need to be specified.
-Setting a launch grid that is larger than the iteration space is also possible,
-but will cause any threads working outside of the iteration bounds to idle.
+For most kernels, setting only the `block_size` is sufficient since pystencils will
+automatically compute the number of blocks;
+for exceptions to this, see [](#manual_launch_grids).
+If `num_blocks` is set manually and the launch grid thus specified is too small, only
+a part of the iteration space will be traversed by the kernel;
+similarily, if it is too large, it will cause any threads working outside of the iteration bounds to idle.

+(manual_launch_grids)=
 ### Manual Launch Grids and Non-Cuboid Iteration Patterns

 In some cases, it will be unavoidable to set the launch grid size manually;
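The revised documentation text says that the number of blocks is inferred so that the launch grid covers the iteration space, and that a manually set grid may be too small or too large. The following is a minimal sketch of that arithmetic in plain Python, not pystencils' actual implementation. It assumes one thread per iteration point and a per-dimension ceiling division; the helper names `infer_num_blocks` and `launch_grid_stats` and the example shape are hypothetical.

```python
from math import prod


def infer_num_blocks(iteration_shape, block_size):
    """Smallest block count per dimension whose threads cover the iteration space.

    Assumption: one thread per iteration point, ceiling division per dimension.
    """
    return tuple(-(-n // b) for n, b in zip(iteration_shape, block_size))


def launch_grid_stats(iteration_shape, block_size, num_blocks):
    """Count iteration points left untouched and threads left idle by a grid."""
    # Threads launched along each dimension.
    launched = tuple(b * n for b, n in zip(block_size, num_blocks))
    # Extent of the iteration space actually reached along each dimension.
    covered = tuple(min(l, s) for l, s in zip(launched, iteration_shape))
    working = prod(covered)
    return {
        "uncovered_points": prod(iteration_shape) - working,
        "idle_threads": prod(launched) - working,
    }


shape = (20, 16, 16)      # hypothetical iteration space
block_size = (16, 8, 8)   # block size used in the documentation example

auto_blocks = infer_num_blocks(shape, block_size)
print(auto_blocks, launch_grid_stats(shape, block_size, auto_blocks))

# A manually chosen grid that is too small along the first dimension.
print(launch_grid_stats(shape, block_size, (1, 2, 2)))
```

Under these assumptions, the inferred grid leaves no iteration point uncovered but lets some threads idle, while the smaller manual grid traverses only part of the iteration space.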
src/pystencils/config.py (+6 -1)
@@ -33,7 +33,12 @@ class _AUTO_TYPE:
 AUTO = _AUTO_TYPE()
-"""Special value that can be passed to some options for invoking automatic behaviour."""
+"""Special value that can be passed to some options for invoking automatic behaviour.
+
+Currently, these options permit `AUTO`:
+
+- `ghost_layers <CreateKernelConfig.ghost_layers>`
+"""


 @dataclass
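The docstring above describes `AUTO` as a special value that some options accept to request automatic behaviour. As an illustration of that general sentinel pattern only, here is a self-contained sketch. The body of `_AUTO_TYPE` is not shown in this diff, so the classes `_AutoType` and `DemoConfig` and the method `resolved_ghost_layers` are hypothetical stand-ins, not the pystencils API.

```python
from dataclasses import dataclass
from typing import Union


class _AutoType:
    """Hypothetical sentinel type; its single instance means 'decide for me'."""

    def __repr__(self) -> str:
        return "AUTO"


AUTO = _AutoType()


@dataclass
class DemoConfig:
    # Hypothetical option: either an explicit integer or the AUTO sentinel.
    ghost_layers: Union[int, _AutoType] = AUTO

    def resolved_ghost_layers(self, inferred: int) -> int:
        """Return the explicit value, or the inferred one if AUTO was given."""
        if self.ghost_layers is AUTO:
            return inferred
        return self.ghost_layers


print(DemoConfig().resolved_ghost_layers(inferred=2))
print(DemoConfig(ghost_layers=1).resolved_ghost_layers(inferred=2))
```

A dedicated sentinel object keeps "automatic" distinct from ordinary values such as `None` or `0`, which may carry their own meaning for an option.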
src/pystencils/target.py (+1 -1)
@@ -87,7 +87,7 @@ class Target(Flag):
     """
     GPU = CUDA
-    """Alias for backward compatibility."""
+    """Alias for `Target.CUDA`, for backward compatibility."""

     SYCL = _GPU | _SYCL
     """SYCL kernel target.
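The clarified docstring states that `GPU` is an alias for `Target.CUDA`. The sketch below shows how such an alias behaves in a standard-library `enum.Flag`. `DemoTarget` and its bit values are hypothetical; only the shape of the `GPU = CUDA` and `SYCL = _GPU | _SYCL` assignments is taken from the diff.

```python
from enum import Flag


class DemoTarget(Flag):
    # Hypothetical building-block bits, standing in for names like _GPU and _SYCL.
    GPU_FAMILY = 1
    CUDA_BACKEND = 2
    SYCL_BACKEND = 4

    CUDA = GPU_FAMILY | CUDA_BACKEND
    GPU = CUDA  # same value as CUDA, so the enum treats it as an alias
    SYCL = GPU_FAMILY | SYCL_BACKEND


# An alias resolves to the very same member object as the name it aliases.
print(DemoTarget.GPU is DemoTarget.CUDA)
print(DemoTarget.GPU.name)

# Composite flags can be tested for the bits they contain.
print(DemoTarget.GPU_FAMILY in DemoTarget.SYCL)
```

Because the alias is the same object as `CUDA`, existing code that refers to the older name keeps working without any special casing.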