Commit d5de4047 authored by Frederik Hennig

start updating docs

parent a01997e4
Merge request !458: HIP Target and Platform
@@ -48,10 +48,22 @@ to build this documentation, and `tests`, which adds `flake8` for code style checks
For more information on developing pystencils, see the [](#contribution_guide).
:::

### For GPUs

If you have an Nvidia graphics processor and CUDA installed, you can use pystencils to directly compile
and execute kernels running on your GPU.
This requires a working installation of [CuPy](https://cupy.dev).
Please refer to CuPy's [installation manual](https://docs.cupy.dev/en/stable/install.html)
for details on installing it.
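
To quickly check that CuPy is installed and can see a GPU, a short snippet like the following can help (illustrative only, not taken from the manual):

```python
import cupy as cp

# Number of GPUs visible to CuPy.
print(cp.cuda.runtime.getDeviceCount())

# A trivial computation that is allocated and executed on the GPU.
x = cp.arange(10, dtype=cp.float64)
print((x * x).sum())  # -> 285.0
```
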
You can also use CuPy together with AMD ROCm on AMD graphics cards,
but the setup steps are somewhat more involved - you might have to build CuPy from source.
The CuPy documentation covers this in its [installation guide for CuPy on ROCm][cupy-rocm].

:::{note}
Since CuPy's support for ROCm is still an experimental feature at this time,
just-in-time compilation of pystencils HIP kernels
for the ROCm platform must also be considered *experimental*.
:::

[cupy-rocm]: https://docs.cupy.dev/en/stable/install.html#using-cupy-on-amd-gpu-experimental "Cupy on ROCm"
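
On a ROCm system, one way to verify that the installed CuPy is in fact a HIP build is its runtime flag; a minimal sketch, assuming CuPy imports successfully:

```python
import cupy as cp

# True if this CuPy build targets AMD ROCm/HIP rather than CUDA.
print(cp.cuda.runtime.is_hip)
```
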
@@ -26,23 +26,46 @@ import matplotlib.pyplot as plt
(guide_gpukernels)=
# Pystencils for GPUs

Pystencils offers code generation for Nvidia and AMD GPUs
using the CUDA and HIP programming models,
as well as just-in-time compilation and execution of GPU kernels from within Python
based on the [cupy] library.
This section's objective is to give a detailed introduction to the creation of
GPU kernels with pystencils.

:::{note}
[CuPy][cupy] is a Python library for numerical computations on GPU arrays,
which operates much in the same way that [NumPy][numpy] works on CPU arrays.
CuPy and NumPy expose nearly the same APIs for array operations;
the difference is that CuPy allocates all of its arrays on the GPU
and performs its operations as CUDA kernels.
CuPy also exposes a just-in-time compiler for GPU kernels, which internally calls [nvrtc].
In pystencils, we use CuPy both to compile and provide executable kernels on demand from within Python code,
and to allocate and manage the data these kernels can be executed on.
For more information on CuPy, refer to [their documentation][cupy-docs].
:::
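
To illustrate the correspondence described in this note, here is a small sketch of the same computation moving between host and device (array contents are arbitrary):

```python
import numpy as np
import cupy as cp

a_cpu = np.linspace(0.0, 1.0, 16)   # host array, managed by NumPy
a_gpu = cp.asarray(a_cpu)           # copy to GPU memory

b_gpu = cp.sin(a_gpu) + 2.0         # runs as CUDA (or HIP) kernels on the device
b_cpu = cp.asnumpy(b_gpu)           # copy the result back to the host
```
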
## Generate, Compile and Run GPU Kernels
The CUDA and HIP platforms are made available in pystencils via the code generation targets
{any}`Target.CUDA` and {any}`Target.HIP`.
For pystencils code to be portable between both, we can use {any}`Target.CurrentGPU` to
automatically select one or the other, depending on the current runtime environment.
:::{note}
If `cupy` is not installed, `create_kernel` will raise an exception when using `Target.CurrentGPU`.
When exporting kernels to be compiled externally in an environment where `cupy` is not available,
the GPU target must therefore be set explicitly.
:::
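
As an illustrative sketch of such an explicit selection (the choice of `Target.HIP` here is an example, not a requirement):

```python
import pystencils as ps

f, g = ps.fields("f, g: float64[3D]")
update = ps.Assignment(f.center(), 2 * g.center())

# Pin the GPU target explicitly; code generation itself does not require cupy.
cfg = ps.CreateKernelConfig(target=ps.Target.HIP)
kernel = ps.create_kernel(update, cfg)
ps.inspect(kernel)  # view the generated HIP code
```
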
Here is a snippet creating a kernel for the locally available GPU target:

```{code-cell} ipython3
f, g = ps.fields("f, g: float64[3D]")
update = ps.Assignment(f.center(), 2 * g.center())

cfg = ps.CreateKernelConfig(target=ps.Target.CurrentGPU)
kernel = ps.create_kernel(update, cfg)
ps.inspect(kernel)
```

@@ -68,19 +91,6 @@

```{code-cell} ipython3
kfunc = kernel.compile()
kfunc(f=f_arr, g=g_arr)
```
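
To confirm the kernel's effect, the result can be checked on the device. A minimal sketch, reusing `f_arr` and `g_arr` from the cells above:

```python
import cupy as cp

# f should now equal 2 * g everywhere.
assert cp.allclose(f_arr, 2 * g_arr)
```
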
(indexing_and_launch_config)=
## Modify the Indexing Scheme and Launch Configuration
...