diff --git a/docs/source/installation.md b/docs/source/installation.md
index deb2b0613564f98468f623544acf3cc1ca9d279e..8c344e7609dc9bd360b2ab02a90aa20387d84621 100644
--- a/docs/source/installation.md
+++ b/docs/source/installation.md
@@ -48,10 +48,22 @@ to build this documentation, and `tests`, which adds `flake8` for code style che
 For more information on developing pystencils, see the [](#contribution_guide).
 :::
 
-### For Nvidia GPUs
+### For GPUs
 
 If you have an Nvidia graphics processor and CUDA installed, you can use pystencils
 to directly compile and execute kernels running on your GPU.
-This requires a working installation of [cupy](https://cupy.dev).
+This requires a working installation of [CuPy](https://cupy.dev).
 Please refer to the cupy's [installation manual](https://docs.cupy.dev/en/stable/install.html)
 for details about installing cupy.
+
+You can also use CuPy together with AMD ROCm for AMD graphics cards,
+but the setup steps are a bit more complicated; you might have to build CuPy from source.
+The CuPy documentation covers this in its [installation guide for CuPy on ROCm][cupy-rocm].
+
+:::{note}
+Since CuPy's support for ROCm is still an experimental feature at this time,
+just-in-time compilation of pystencils HIP kernels
+for the ROCm platform must also be considered *experimental*.
+:::
+
+[cupy-rocm]: https://docs.cupy.dev/en/stable/install.html#using-cupy-on-amd-gpu-experimental "CuPy on ROCm"
diff --git a/docs/source/user_manual/gpu_kernels.md b/docs/source/user_manual/gpu_kernels.md
index 610c61ddf647331d7b77b06968e489b4dcc76293..2219ce04228d9da8dd9dec663d6d73d65c884362 100644
--- a/docs/source/user_manual/gpu_kernels.md
+++ b/docs/source/user_manual/gpu_kernels.md
@@ -26,23 +26,46 @@ import matplotlib.pyplot as plt
 (guide_gpukernels)=
 # Pystencils for GPUs
 
-Pystencils offers code generation for Nvidia GPUs using the CUDA programming model,
+Pystencils offers code generation for Nvidia and AMD GPUs
+using the CUDA and HIP programming models,
 as well as just-in-time compilation and execution of CUDA kernels from within Python
 based on the [cupy] library.
 
 This section's objective is to give a detailed introduction into the creation of GPU kernels with pystencils.
 
-## Generate, Compile and Run CUDA Kernels
+:::{note}
+[CuPy][cupy] is a Python library for numerical computations on GPU arrays,
+which operates in much the same way that [NumPy][numpy] works on CPU arrays.
+CuPy and NumPy expose nearly the same APIs for array operations;
+the difference being that CuPy allocates all its arrays on the GPU
+and performs its operations as CUDA kernels.
+Also, CuPy exposes a just-in-time compiler for GPU kernels, which internally calls [nvrtc].
+In pystencils, we use CuPy both to compile and provide executable kernels on-demand from within Python code,
+and to allocate and manage the data these kernels can be executed on.
+
+For more information on CuPy, refer to [their documentation][cupy-docs].
+:::
+
+## Generate, Compile and Run GPU Kernels
+
+The CUDA and HIP platforms are made available in pystencils via the code generation targets
+{any}`Target.CUDA` and {any}`Target.HIP`.
+For pystencils code to be portable between both, we can use {any}`Target.CurrentGPU` to
+automatically select one or the other, depending on the current runtime environment.
+
+:::{note}
+If `cupy` is not installed, `create_kernel` will raise an exception when using `Target.CurrentGPU`.
+When exporting kernels to be compiled externally in an environment where `cupy` is not available,
+the GPU target must therefore be set explicitly.
+:::
 
-In order to obtain a CUDA implementation of a symbolic kernel, naught more is required
-than setting the {any}`target <CreateKernelConfig.target>` code generator option to
-{any}`Target.CUDA`:
+Here is a snippet creating a kernel for the locally available GPU target:
 
 ```{code-cell} ipython3
 f, g = ps.fields("f, g: float64[3D]")
 update = ps.Assignment(f.center(), 2 * g.center())
 
-cfg = ps.CreateKernelConfig(target=ps.Target.CUDA)
+cfg = ps.CreateKernelConfig(target=ps.Target.CurrentGPU)
 kernel = ps.create_kernel(update, cfg)
 
 ps.inspect(kernel)
@@ -68,19 +91,6 @@ kfunc = kernel.compile()
 kfunc(f=f_arr, g=g_arr)
 ```
 
-:::{note}
-[CuPy][cupy] is a Python library for numerical computations on GPU arrays,
-which operates much in the same way that [NumPy][numpy] works on CPU arrays.
-Cupy and NumPy expose nearly the same APIs for array operations;
-the difference being that CuPy allocates all its arrays on the GPU
-and performs its operations as CUDA kernels.
-Also, CuPy exposes a just-in-time-compiler for GPU kernels, which internally calls [nvrtc].
-In pystencils, we use CuPy both to compile and provide executable kernels on-demand from within Python code,
-and to allocate and manage the data these kernels can be executed on.
-
-For more information on CuPy, refer to [their documentation][cupy-docs].
-:::
-
 
 (indexing_and_launch_config)=
 ## Modify the Indexing Scheme and Launch Configuration
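As a supplement to the note added in `gpu_kernels.md` above (explicit GPU targets versus `Target.CurrentGPU`), here is a minimal sketch of the export scenario it describes. It is not part of the diff; it only reuses APIs that appear in the changed documentation (`ps.fields`, `ps.Assignment`, `ps.CreateKernelConfig`, `ps.create_kernel`, `ps.Target.HIP`, `ps.inspect`, `kernel.compile()`), and should be read as an illustrative example rather than as the documented workflow.

```python
import pystencils as ps

# Symbolic kernel definition, as in the code-cell added above.
f, g = ps.fields("f, g: float64[3D]")
update = ps.Assignment(f.center(), 2 * g.center())

# Request the HIP target explicitly instead of Target.CurrentGPU.
# According to the note above, code generation with an explicit target
# does not require cupy, so this also works when the kernel code is only
# exported for external compilation.
cfg = ps.CreateKernelConfig(target=ps.Target.HIP)
kernel = ps.create_kernel(update, cfg)

# Show the generated code; compiling and running it from Python via
# kernel.compile() would still require a working cupy installation.
ps.inspect(kernel)
```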
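The CuPy note moved to the top of `gpu_kernels.md` explains that CuPy arrays live on the GPU and stand in for NumPy arrays as kernel arguments; the document's own allocation cells are outside this diff (only the call `kfunc(f=f_arr, g=g_arr)` appears as context). The following sketch fills that gap under stated assumptions: the array shapes and the `cp.empty`/`cp.random.random` calls are illustrative choices, not taken from the documentation.

```python
import cupy as cp
import pystencils as ps

f, g = ps.fields("f, g: float64[3D]")
update = ps.Assignment(f.center(), 2 * g.center())

cfg = ps.CreateKernelConfig(target=ps.Target.CurrentGPU)
kernel = ps.create_kernel(update, cfg)
kfunc = kernel.compile()

# The kernel runs on the device, so its arguments must be GPU arrays;
# they are therefore allocated with cupy instead of numpy. The shape is
# arbitrary here; it only has to be the same for both fields.
f_arr = cp.empty((32, 32, 32), dtype=cp.float64)
g_arr = cp.random.random((32, 32, 32))

kfunc(f=f_arr, g=g_arr)
```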