Compare revisions

Changes are shown as if the source revision was being merged into the target revision.

Commits on source: 2. Showing 493 additions and 321 deletions.
......@@ -22,7 +22,7 @@ html:
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
clean:
rm -rf source/reference/generated
rm -rf source/api/generated
rm -rf source/api/symbolic/generated
rm -rf source/backend/generated
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
\ No newline at end of file
# Code Generation
## Invocation
```{eval-rst}
.. module:: pystencils.codegen
.. autosummary::
:toctree: generated
:nosignatures:
create_kernel
```
## Configuration
```{eval-rst}
.. module:: pystencils.codegen.config
```
The code generation driver (`create_kernel`, but also `DefaultKernelCreationDriver`) can be configured by
passing it a `CreateKernelConfig` object.
This object can be constructed incrementally:
```Python
cfg = ps.CreateKernelConfig()
cfg.default_dtype = "float32"
cfg.target = ps.Target.X86_AVX
cfg.cpu.openmp.enable = True
cfg.cpu.vectorize.enable = True
cfg.cpu.vectorize.assume_inner_stride_one = True
```
### Options and Option Categories
The following options and option categories are exposed by the configuration object:
#### Target Specification
```{eval-rst}
.. currentmodule:: pystencils.codegen.config
.. autosummary::
~CreateKernelConfig.target
```
#### Data Types
```{eval-rst}
.. autosummary::
~CreateKernelConfig.default_dtype
~CreateKernelConfig.index_dtype
```
#### Iteration Space
```{eval-rst}
.. autosummary::
~CreateKernelConfig.ghost_layers
~CreateKernelConfig.iteration_slice
~CreateKernelConfig.index_field
```
#### Kernel Constraint Checks
```{eval-rst}
.. autosummary::
~CreateKernelConfig.allow_double_writes
~CreateKernelConfig.skip_independence_check
```
#### Target-Specific Options
The following categories with target-specific options are exposed:
| | |
|---------------------------|--------------------------|
| {any}`cpu <CpuOptions>` | Options for CPU kernels |
| {any}`gpu <GpuOptions>` | Options for GPU kernels |
| {any}`sycl <SyclOptions>` | Options for SYCL kernels |
#### Kernel Object and Just-In-Time Compilation
```{eval-rst}
.. autosummary::
~CreateKernelConfig.function_name
~CreateKernelConfig.jit
```
### Configuration System Classes
```{eval-rst}
.. autosummary::
:toctree: generated
:nosignatures:
:template: autosummary/recursive_class.rst
CreateKernelConfig
CpuOptions
OpenMpOptions
VectorizationOptions
GpuOptions
SyclOptions
.. autosummary::
:toctree: generated
:nosignatures:
AUTO
.. dropdown:: Implementation Details
.. autosummary::
:toctree: generated
:nosignatures:
:template: autosummary/entire_class.rst
Option
BasicOption
Category
ConfigBase
```
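The configuration-system classes listed above can be pictured as a descriptor-based layering. The following is a hypothetical mini-implementation in plain Python (reusing the documented names `BasicOption`, `OpenMpOptions`, `CpuOptions`, and `CreateKernelConfig` purely for illustration; it is not pystencils' actual code) that sketches how option categories make incremental editing and safe copying possible:

```python
from copy import deepcopy


class BasicOption:
    """Descriptor holding an option value, falling back to a default."""

    def __init__(self, default=None):
        self._default = default

    def __set_name__(self, owner, name):
        self._slot = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self._slot, self._default)

    def __set__(self, obj, value):
        setattr(obj, self._slot, value)


class OpenMpOptions:
    enable = BasicOption(False)
    num_threads = BasicOption(None)


class CpuOptions:
    def __init__(self):
        self.openmp = OpenMpOptions()


class CreateKernelConfig:
    default_dtype = BasicOption("float64")

    def __init__(self):
        self.cpu = CpuOptions()

    def copy(self):
        # Deep copy, so edits to the copy never leak into the original
        return deepcopy(self)


cfg = CreateKernelConfig()
cfg.default_dtype = "float32"
cfg.cpu.openmp.enable = True

derived = cfg.copy()
derived.cpu.openmp.enable = False  # leaves cfg untouched
```

Because each option is a descriptor with a default, unset options cost nothing and a config object is always in a valid state, which is what makes the incremental style shown earlier safe.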
## Target Specification
```{eval-rst}
.. module:: pystencils.codegen.target
.. autosummary::
:toctree: generated
:nosignatures:
:template: autosummary/recursive_class.rst
Target
```
## Code Generation Drivers
```{eval-rst}
.. module:: pystencils.codegen.driver
.. autosummary::
:toctree: generated
:nosignatures:
:template: autosummary/entire_class.rst
DefaultKernelCreationDriver
.. autosummary::
:toctree: generated
:nosignatures:
get_driver
```
## Output Code Objects
```{eval-rst}
.. currentmodule:: pystencils.codegen
.. autosummary::
:toctree: generated
:nosignatures:
:template: autosummary/entire_class.rst
Kernel
GpuKernel
Parameter
GpuThreadsRange
```
pystencils.codegen
==================
.. module:: pystencils.codegen
Invocation
----------
.. autosummary::
:toctree: generated
:nosignatures:
create_kernel
Configuration
-------------
.. autosummary::
:toctree: generated
:nosignatures:
:template: autosummary/entire_class.rst
CreateKernelConfig
CpuOptimConfig
OpenMpConfig
VectorizationConfig
GpuIndexingConfig
.. autosummary::
:toctree: generated
:nosignatures:
AUTO
Target Specification
--------------------
.. autosummary::
:toctree: generated
:nosignatures:
:template: autosummary/recursive_class.rst
Target
Code Generation Drivers
-----------------------
.. autosummary::
:toctree: generated
:nosignatures:
:template: autosummary/entire_class.rst
driver.DefaultKernelCreationDriver
.. autosummary::
:toctree: generated
:nosignatures:
get_driver
Output Code Objects
-------------------
.. autosummary::
:toctree: generated
:nosignatures:
:template: autosummary/entire_class.rst
Kernel
GpuKernel
Parameter
GpuThreadsRange
pystencils.jit
==============
JIT Compilation
===============
.. module:: pystencils.jit
......
# Assignments and AssignmentCollection
```{eval-rst}
.. py:class:: pystencils.Assignment
Monkeypatched variant of `sympy.codegen.ast.Assignment`.
Represents an assignment of an expression to a symbol.
.. autosummary::
:toctree: generated
:nosignatures:
:template: autosummary/recursive_class.rst
pystencils.AssignmentCollection
```
pystencils.field
================
Fields
======
.. module:: pystencils.field
......
# Symbolic Toolbox
:::{toctree}
:maxdepth: 1
field
assignments
sympyextensions
:::
pystencils.sympyextensions
==========================
Extensions to SymPy
===================
.. module:: pystencils.sympyextensions
......
# Contributor Guide
# Contribution Guide
Welcome to the Contributor's Guide to pystencils!
If you are interested in contributing to the development of pystencils, this is the place to start.
......
......@@ -77,19 +77,18 @@ Topics
.. toctree::
:maxdepth: 1
:caption: Reference Guides
:caption: User Manual
reference/symbolic_language
reference/kernelcreation
reference/gpu_kernels
reference/types
user_manual/symbolic_language
user_manual/kernelcreation
user_manual/gpu_kernels
.. toctree::
:maxdepth: 1
:caption: API
:caption: API Reference
api/field
api/sympyextensions
api/symbolic/index
api/types
api/codegen
api/jit
......
.. _page_v2_migration:
---
jupytext:
formats: md:myst
text_representation:
extension: .md
format_name: myst
kernelspec:
display_name: Python 3 (ipykernel)
language: python
name: python3
mystnb:
execution_mode: cache
---
***************************
Version 2.0 Migration Guide
***************************
(_page_v2_migration)=
# Version 2.0 Migration Guide
With version 2.0, many APIs of *pystencils* will be changed; old interfaces are being deprecated
and new systems are put in place.
This page is a still-incomplete list of these changes, with advice on how to migrate your code
from pystencils 1.x to pystencils 2.0.
Kernel Creation
===============
```{code-cell} ipython3
:tags: [remove-cell]
Configuration
-------------
import pystencils as ps
```
The API of `create_kernel`, and the configuration options of the `CreateKernelConfig`, have changed significantly:
## Kernel Creation
### Configuration
The API of {any}`create_kernel`, and the configuration options of the {any}`CreateKernelConfig`, have changed significantly.
The `CreateKernelConfig` class has been refined to be safe to copy and edit incrementally.
The recommended way of setting up the code generator is now *incremental configuration*:
```{code-cell} ipython3
cfg = ps.CreateKernelConfig()
cfg.default_dtype = "float32"
cfg.cpu.openmp.enable = True
cfg.cpu.openmp.num_threads = 8
cfg.ghost_layers = 2
```
- *Data Types:* `CreateKernelConfig` now takes two parameters to control data types in your kernels:
  the ``default_dtype`` is applied to all numerical computations, while the ``index_dtype`` is used
  for all index calculations and loop counters.
- *CPU Optimization Options:* Should now be set via the {any}`cpu <CpuOptions>` option category and its subcategories.
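Why a dedicated `index_dtype` deserves attention can be seen with plain NumPy (a sketch independent of pystencils): 32-bit index arithmetic silently wraps around once a linearized index exceeds 2**31 - 1, which can happen on large domains.

```python
import numpy as np

# A linearized cell index near the int32 limit:
large = np.array([2_000_000_000], dtype=np.int32)

wrapped = large + 1_000_000_000               # int32 arithmetic wraps around
safe = large.astype(np.int64) + 1_000_000_000  # int64 stays correct

print(int(wrapped[0]))  # -1294967296, not 3000000000
print(int(safe[0]))     # 3000000000
```

Choosing a wider `index_dtype` avoids this class of bug at the cost of slightly larger loop counters.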
.. dropdown:: Deprecated options of `CreateKernelConfig`
:::{dropdown} Deprecated options of `CreateKernelConfig`
- ``data_type``: Use ``default_dtype`` instead
- ``cpu_openmp``: Set OpenMP options in the `cpu.openmp <OpenMpOptions>` category instead.
- ``cpu_vectorize_info``: Set vectorization options in the `cpu.vectorize <VectorizationOptions>` category instead
- ``gpu_indexing_params``: Set GPU indexing options in the `gpu <GpuOptions>` category instead
- ``data_type``: Use ``default_dtype`` instead
- ``cpu_openmp``: Set OpenMP options via an `OpenMpConfig` in the ``cpu_optim`` (`CpuOptimConfig`) option instead.
- ``cpu_vectorize_info``: Set vectorization options via a `VectorizationConfig` in the ``cpu_optim`` option instead
- ``gpu_indexing_params``: Set GPU indexing options via a `GpuIndexingConfig` in the ``gpu_indexing`` option instead
:::
Type Checking
-------------
### Type Checking
The old type checking system of pystencils' code generator has been replaced by a new type inference and validation
mechanism whose rules are much stricter than before.
......@@ -38,24 +66,23 @@ While running `create_kernel`, you may now encounter a `TypificationError` where
If this happens, it is probable that you have been doing some illegal, maybe dangerous, or at least unsafe things with data types
(like inserting integers into a floating-point context without casting them, or mixing types of different precisions or signedness).
If you are sure the error is not your fault, please file an issue at our
`bug tracker <https://i10git.cs.fau.de/pycodegen/pystencils/-/issues>`_.
[bug tracker](https://i10git.cs.fau.de/pycodegen/pystencils/-/issues).
Type System
===========
### Type System
The ``pystencils.typing`` module has been entirely replaced by the new `pystencils.types` module,
The ``pystencils.typing`` module has been entirely replaced by the new {any}`pystencils.types` module,
which is home to a completely new type system.
The primary interaction points with this system are still the `TypedSymbol` class and the `create_type` routine.
The primary interaction points with this system are still the {any}`TypedSymbol` class and the {any}`create_type` routine.
Code using either of these two should not require any changes, except:
- *Importing `TypedSymbol` and `create_type`:* Both `TypedSymbol` and `create_type` should now be imported directly
from the ``pystencils`` namespace.
- *Custom data types:* `TypedSymbol` used to accept arbitrary strings as data types.
This is no longer possible; instead, import `pystencils.types.PsCustomType` and use it to describe
This is no longer possible; instead, import {any}`pystencils.types.PsCustomType` and use it to describe
custom data types unknown to pystencils, as in ``TypedSymbol("xs", PsCustomType("std::vector< int >"))``.
All old data type classes (such as ``BasicType``, ``PointerType``, ``StructType``, etc.) have been removed
and replaced by the class hierarchy below `PsType`.
and replaced by the class hierarchy below {any}`PsType`.
Directly using any of these type classes in the frontend is discouraged unless absolutely necessary;
in most cases, `create_type` suffices.
This source diff could not be displayed because it is too large.
......@@ -159,15 +159,10 @@ kernel = ps.create_kernel(assignments, cfg).compile()
```
This warns us that the threads range could not be determined automatically.
We can disable this warning by setting `manual_launch_grid` in the GPU indexing options:
We can disable this warning by setting `manual_launch_grid` in the GPU option category:
```{code-cell}
cfg = ps.CreateKernelConfig(
# ... other options ...
gpu_indexing=ps.GpuIndexingConfig(
manual_launch_grid=True
)
)
cfg.gpu.manual_launch_grid = True
```
Now, to execute our kernel, we have to manually specify its launch grid:
......
......@@ -485,13 +485,10 @@ h = sp.Symbol("h")
cfg = ps.CreateKernelConfig(
target=ps.Target.X86_AVX512,
default_dtype="float32",
cpu_optim=ps.CpuOptimConfig(
openmp=True,
vectorize=ps.VectorizationConfig(
assume_inner_stride_one=True
)
)
)
cfg.cpu.openmp.enable = True
cfg.cpu.vectorize.enable = True
cfg.cpu.vectorize.assume_inner_stride_one = True
assignments = [
ps.Assignment(
......
......@@ -42,10 +42,6 @@ Assignments are the fundamental components of pystencils kernels;
they are used both for assigning expressions to symbols
and for writing values to fields.
.. py:class:: pystencils.Assignment
Slightly monkey-patched version of `sympy.codegen.ast.Assignment`.
Assignments are combined and structured inside `assignment collections <pystencils.AssignmentCollection>`.
An assignment collection contains two separate lists of assignments:
......@@ -56,10 +52,9 @@ An assignment collection contains two separate lists of assignments:
into fields.
.. autosummary::
:toctree: generated
:nosignatures:
:template: autosummary/recursive_class.rst
pystencils.Assignment
pystencils.AssignmentCollection
......
......@@ -3,10 +3,6 @@
from .codegen import (
Target,
CreateKernelConfig,
CpuOptimConfig,
VectorizationConfig,
OpenMpConfig,
GpuIndexingConfig,
AUTO
)
from .defaults import DEFAULTS
......@@ -50,10 +46,6 @@ __all__ = [
"create_numeric_type",
"make_slice",
"CreateKernelConfig",
"CpuOptimConfig",
"VectorizationConfig",
"GpuIndexingConfig",
"OpenMpConfig",
"AUTO",
"create_kernel",
"create_staggered_kernel",
......
......@@ -17,7 +17,6 @@ from ...types import PsStructType
from ..exceptions import PsInputError, KernelConstraintsError
if TYPE_CHECKING:
from ...codegen.config import _AUTO_TYPE
from .context import KernelCreationContext
......@@ -62,6 +61,7 @@ class FullIterationSpace(IterationSpace):
@dataclass
class Dimension:
"""One dimension of a dense iteration space"""
start: PsExpression
stop: PsExpression
step: PsExpression
......@@ -196,7 +196,7 @@ class FullIterationSpace(IterationSpace):
def dimensions(self):
"""The dimensions of this iteration space"""
return self._dimensions
@property
def counters(self) -> tuple[PsSymbol, ...]:
return tuple(dim.counter for dim in self._dimensions)
......@@ -220,7 +220,7 @@ class FullIterationSpace(IterationSpace):
def archetype_field(self) -> Field | None:
"""Field whose shape and memory layout act as archetypes for this iteration space's dimensions."""
return self._archetype_field
@property
def loop_order(self) -> tuple[int, ...]:
"""Return the loop order of this iteration space, ordered from slowest to fastest coordinate."""
......@@ -242,7 +242,7 @@ class FullIterationSpace(IterationSpace):
self, dimension: int | FullIterationSpace.Dimension | None = None
) -> PsExpression:
"""Construct an expression representing the actual number of unique points inside the iteration space.
Args:
dimension: If an integer or a `Dimension` object is given, the number of iterations in that
dimension is computed. If `None`, the total number of iterations inside the entire space
......@@ -417,14 +417,55 @@ def create_sparse_iteration_space(
def create_full_iteration_space(
ctx: KernelCreationContext,
assignments: AssignmentCollection,
ghost_layers: None | _AUTO_TYPE | int | Sequence[int | tuple[int, int]] = None,
ghost_layers: None | int | Sequence[int | tuple[int, int]] = None,
iteration_slice: None | int | slice | tuple[int | slice, ...] = None,
infer_ghost_layers: bool = False,
) -> IterationSpace:
"""Create a dense iteration space from a sequence of assignments and iteration slice information.
This function finds all accesses to fields in the given assignment collection,
analyzes the set of fields involved,
and determines the iteration space bounds from these.
This requires that either all fields are of the same, fixed, shape, or all of them are
variable-shaped.
Also, all fields need to have the same memory layout of their spatial dimensions.
Args:
ctx: The kernel creation context
assignments: Collection of assignments the iteration space should be inferred from
ghost_layers: If set, strip off that many ghost layers from all sides of the iteration cuboid
iteration_slice: If set, constrain iteration to the given slice.
For details on the parsing of slices, see `AstFactory.parse_slice`.
infer_ghost_layers: If `True`, infer the number of ghost layers from the stencil ranges
used in the kernel.
Returns:
IterationSpace: The constructed iteration space.
Raises:
KernelConstraintsError: If field shape or memory layout conflicts are detected
ValueError: If the iteration slice could not be parsed
.. attention::
The ``ghost_layers`` and ``iteration_slice`` arguments are mutually exclusive.
Also, if ``infer_ghost_layers=True``, none of them may be set.
"""
assert not ctx.fields.index_fields
if (ghost_layers is not None) and (iteration_slice is not None):
if (ghost_layers is None) and (iteration_slice is None) and not infer_ghost_layers:
raise ValueError(
"At most one of `ghost_layers` and `iteration_slice` may be specified."
"Exactly one of `ghost_layers`, `iteration_slice`, and `infer_ghost_layers` must be set."
)
if (
int(ghost_layers is not None)
+ int(iteration_slice is not None)
+ int(infer_ghost_layers)
> 1
):
raise ValueError(
"At most one of `ghost_layers`, `iteration_slice`, and `infer_ghost_layers` may be set."
)
# Collect all relative accesses into domain fields
......@@ -457,9 +498,7 @@ def create_full_iteration_space(
# Otherwise, if an iteration slice was specified, use that
# Otherwise, use the inferred ghost layers
from ...codegen.config import AUTO, _AUTO_TYPE
if ghost_layers is AUTO:
if infer_ghost_layers:
if len(domain_field_accesses) > 0:
inferred_gls = max(
[fa.required_ghost_layers for fa in domain_field_accesses]
......@@ -472,7 +511,6 @@ def create_full_iteration_space(
ctx, inferred_gls, archetype_field
)
elif ghost_layers is not None:
assert not isinstance(ghost_layers, _AUTO_TYPE)
ctx.metadata["ghost_layers"] = ghost_layers
return FullIterationSpace.create_with_ghost_layers(
ctx, ghost_layers, archetype_field
......
......@@ -30,7 +30,7 @@ from ..literals import PsLiteral
from ..functions import PsMathFunction, MathFunctions, CFunction
if TYPE_CHECKING:
from ...codegen import GpuIndexingConfig, GpuThreadsRange
from ...codegen import GpuThreadsRange
int32 = PsSignedIntegerType(width=32, const=False)
......@@ -52,13 +52,15 @@ class CudaPlatform(GenericGpu):
"""Platform for CUDA-based GPUs."""
def __init__(
self, ctx: KernelCreationContext, indexing_cfg: GpuIndexingConfig | None = None
self, ctx: KernelCreationContext,
omit_range_check: bool = False,
manual_launch_grid: bool = False,
) -> None:
super().__init__(ctx)
from ...codegen.config import GpuIndexingConfig
self._omit_range_check = omit_range_check
self._manual_launch_grid = manual_launch_grid
self._cfg = indexing_cfg if indexing_cfg is not None else GpuIndexingConfig()
self._typify = Typifier(ctx)
@property
......@@ -141,7 +143,7 @@ class CudaPlatform(GenericGpu):
) -> tuple[PsBlock, GpuThreadsRange | None]:
dimensions = ispace.dimensions_in_loop_order()
if not self._cfg.manual_launch_grid:
if not self._manual_launch_grid:
try:
threads_range = self.threads_from_ispace(ispace)
except MaterializationError as e:
......@@ -170,7 +172,7 @@ class CudaPlatform(GenericGpu):
)
)
)
if not self._cfg.omit_range_check:
if not self._omit_range_check:
conds.append(PsLt(ctr, dim.stop))
indexing_decls = indexing_decls[::-1]
......@@ -213,7 +215,7 @@ class CudaPlatform(GenericGpu):
]
body.statements = mappings + body.statements
if not self._cfg.omit_range_check:
if not self._omit_range_check:
stop = PsExpression.make(ispace.index_list.shape[0])
condition = PsLt(sparse_ctr, stop)
ast = PsBlock([sparse_idx_decl, PsConditional(condition, body)])
......
......@@ -19,7 +19,7 @@ from ..ast.expressions import (
PsLe,
PsTernary,
PsLookup,
PsBufferAcc
PsBufferAcc,
)
from ..extensions.cpp import CppMethodCall
......@@ -30,19 +30,21 @@ from ..exceptions import MaterializationError
from ...types import PsCustomType, PsIeeeFloatType, constify, PsIntegerType
if TYPE_CHECKING:
from ...codegen import GpuIndexingConfig, GpuThreadsRange
from ...codegen import GpuThreadsRange
class SyclPlatform(GenericGpu):
def __init__(
self, ctx: KernelCreationContext, indexing_cfg: GpuIndexingConfig | None = None
self,
ctx: KernelCreationContext,
omit_range_check: bool = False,
automatic_block_size: bool = False
):
super().__init__(ctx)
from ...codegen.config import GpuIndexingConfig
self._cfg = indexing_cfg if indexing_cfg is not None else GpuIndexingConfig()
self._omit_range_check = omit_range_check
self._automatic_block_size = automatic_block_size
@property
def required_headers(self) -> set[str]:
......@@ -138,7 +140,7 @@ class SyclPlatform(GenericGpu):
indexing_decls.append(
PsDeclaration(ctr, dim.start + work_item_idx * dim.step)
)
if not self._cfg.omit_range_check:
if not self._omit_range_check:
conds.append(PsLt(ctr, dim.stop))
if conds:
......@@ -156,7 +158,7 @@ class SyclPlatform(GenericGpu):
self, body: PsBlock, ispace: SparseIterationSpace
) -> tuple[PsBlock, GpuThreadsRange]:
factory = AstFactory(self._ctx)
id_type = PsCustomType("sycl::id< 1 >", const=True)
id_symbol = PsExpression.make(self._ctx.get_symbol("id", id_type))
......@@ -184,7 +186,7 @@ class SyclPlatform(GenericGpu):
]
body.statements = mappings + body.statements
if not self._cfg.omit_range_check:
if not self._omit_range_check:
stop = PsExpression.make(ispace.index_list.shape[0])
condition = PsLt(sparse_ctr, stop)
ast = PsBlock([sparse_idx_decl, PsConditional(condition, body)])
......@@ -195,7 +197,7 @@ class SyclPlatform(GenericGpu):
return ast, self.threads_from_ispace(ispace)
def _item_type(self, rank: int):
if not self._cfg.sycl_automatic_block_size:
if not self._automatic_block_size:
return PsCustomType(f"sycl::nd_item< {rank} >", const=True)
else:
return PsCustomType(f"sycl::item< {rank} >", const=True)
......@@ -207,7 +209,7 @@ class SyclPlatform(GenericGpu):
item_type = self._item_type(rank)
item = PsExpression.make(self._ctx.get_symbol("sycl_item", item_type))
if not self._cfg.sycl_automatic_block_size:
if not self._automatic_block_size:
rhs = CppMethodCall(item, "get_global_id", self._id_type(rank))
else:
rhs = CppMethodCall(item, "get_id", self._id_type(rank))
......