Reduction Support
This MR introduces reductions to pystencils for scalar data types and thus covers #55.
User interface
- Adds reduction assignment classes to the sympyextensions module: AddReductionAssignment, SubReductionAssignment, MulReductionAssignment, MinReductionAssignment, MaxReductionAssignment.
These can be used as follows:
import pystencils as ps
r = ps.TypedSymbol("r", "double")
x, y = ps.fields("x, y: double[3D]", layout="fzyx")
assign_dot_prod = ps.AddReductionAssignment(r, x.center() * y.center())
- Alternatively, you can also make use of the reduction_assignment or reduction_assignment_from_str functions:
from pystencils.sympyextensions import reduction_assignment, reduction_assignment_from_str
from pystencils.sympyextensions.reduction import ReductionOp
assign_dot_prod = reduction_assignment(r, ReductionOp.Add, x.center() * y.center())
assign_dot_prod = reduction_assignment_from_str(r, "+", x.center() * y.center())
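To make the intended result of such an assignment concrete, here is a minimal reference model in plain Python. It is not pystencils code: the REFERENCE_OPS table and the reference_reduction helper are hypothetical names introduced for illustration, and the operator strings other than "+" are assumptions. The explicit initial argument is likewise an assumption about how a starting value enters the fold.

```python
import operator
from functools import reduce

# Hypothetical reference model (not part of pystencils): each operator
# string maps to the binary function that folds a value into the target.
REFERENCE_OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "min": min,
    "max": max,
}


def reference_reduction(initial, op, values):
    """Fold all values into the starting value using the function for op."""
    return reduce(REFERENCE_OPS[op], values, initial)


# Dot product of two small "fields": reduce the cell-wise products with "+".
x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]
products = [a * b for a, b in zip(x, y)]
dot = reference_reduction(0.0, "+", products)
```

A model like this can serve as a ground truth when checking a generated kernel on small inputs.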
Supported Backends
Generic CPUs
- Add reduction support for OpenMP
SIMD: SSE3, AVX2, AVX512
- Include a generated header file with horizontal operations, each performing a binary operation between a scalar variable and a SIMD vector. The SIMD vector is first reduced to a scalar, and the binary operation is then applied to that scalar and the other operand.
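The two-step behaviour of a horizontal operation can be sketched in plain Python. This is an illustration only, not the generated header: horizontal_op is a hypothetical name, a list stands in for a SIMD register, and only commutative, associative operations are shown.

```python
import operator
from functools import reduce


def horizontal_op(scalar, lanes, binop):
    """Reduce the lanes to one scalar, then combine it with the scalar operand."""
    lane_result = reduce(binop, lanes)   # step 1: vector -> scalar
    return binop(scalar, lane_result)    # step 2: scalar (op) reduced vector


vec = [1.0, 2.0, 3.0, 4.0]  # stands in for a 4-lane SIMD register

h_add = horizontal_op(10.0, vec, operator.add)  # 10 + (1 + 2 + 3 + 4)
h_mul = horizontal_op(2.0, vec, operator.mul)   # 2 * (1 * 2 * 3 * 4)
h_max = horizontal_op(10.0, vec, max)           # max(10, max of lanes)
```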
CUDA
- Employ atomic reduction operations in all threads when the block size does not align with the warp size
- Optimization when the block size aligns with the warp size: perform warp-level reductions and perform the atomic operation only on the first thread of each warp
- Include a header file with manual implementations for atomic operations that are not directly supported for floating-point numbers: atomicMul, atomicMax, and atomicMin. These functions make use of a compare-and-swap (CAS) mechanism.
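The two GPU strategies above can be modelled in plain Python to show the control flow. This is a simulation under stated assumptions, not the CUDA code from this MR: AtomicCell, atomic_update, and warp_reduce are hypothetical names, a lock-protected cell stands in for a hardware CAS, Python threads stand in for the first lane of each warp, and values are compared with == rather than by bit pattern (so NaN is not handled).

```python
import operator
import threading

WARP_SIZE = 32


def warp_reduce(lane_values, binop):
    """Tree reduction modelled on a shuffle-down pattern; lane 0 gets the result."""
    vals = list(lane_values)
    assert len(vals) == WARP_SIZE
    offset = WARP_SIZE // 2
    while offset > 0:
        # Lane i combines its value with the value held by lane i + offset.
        vals = [
            binop(vals[i], vals[i + offset]) if i + offset < WARP_SIZE else vals[i]
            for i in range(WARP_SIZE)
        ]
        offset //= 2
    return vals[0]


class AtomicCell:
    """Toy memory location offering a compare-and-swap primitive."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def compare_and_swap(self, expected, new):
        """Store new only if the cell still holds expected; return the old value."""
        with self._lock:
            old = self._value
            if old == expected:
                self._value = new
            return old


def atomic_update(cell, operand, binop):
    """CAS loop: recompute and retry until no other thread interfered."""
    old = cell.load()
    while True:
        assumed = old
        old = cell.compare_and_swap(assumed, binop(assumed, operand))
        if old == assumed:
            return old


# Four "warps" of ones: each reduces locally, then one atomic add per warp.
total = AtomicCell(0.0)
warps = [[1.0] * WARP_SIZE for _ in range(4)]


def run_warp(lanes):
    atomic_update(total, warp_reduce(lanes, operator.add), operator.add)


threads = [threading.Thread(target=run_warp, args=(w,)) for w in warps]
for t in threads:
    t.start()
for t in threads:
    t.join()
result = total.load()
```

In this model, the warp-level path issues one atomic update per warp instead of one per thread, which is the motivation for the optimization.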
Internal Changes
- Freeze handling for newly introduced ReductionAssignment nodes
- Add PsVecHorizontal vectorization node for conducting a binary operation between a scalar symbol and a scalar extracted from a vector value (obtained by reducing the vector's lanes)
- Add dataclass ReductionInfo holding essential information about a reduction (i.e. the reduction operation, the initial value, and the write-back pointer for exporting the reduction result) and create a corresponding lookup table for symbols in KernelCreationContext
- Introduce NumericLimitsFunctions for initializing neutral elements for reductions making use of min/max operations
- Adapt Platform.select_function such that its return type is PsExpression | tuple[tuple[PsStructuralNode, ...], PsAstNode]: it returns either a PsExpression that replaces the function call, or a tuple holding a PsAstNode that replaces the function call together with a tuple of structural nodes that are added before the replacement. The structural nodes allow adding preparatory code for the replacement, as needed for the warp-level reductions on GPU platforms
- Add ReductionFunctions.WriteBackToPtr function that is replaced with platform-dependent code in Platform.select_function
- Slightly adapt CPU/GPU JIT modules to support the handling of write-back pointers used for reductions
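As a rough illustration of why neutral elements and a per-symbol lookup table are needed, the sketch below models them in plain Python. Everything in it is hypothetical and does not mirror the actual ReductionInfo or NumericLimitsFunctions definitions: the field names, the NEUTRAL_ELEMENTS table, and the use of sys.float_info.max as a stand-in for a numeric-limits query are all assumptions.

```python
import sys
from dataclasses import dataclass

# Hypothetical neutral elements: folding with these leaves any value unchanged.
NEUTRAL_ELEMENTS = {
    "+": 0.0,
    "*": 1.0,
    "min": sys.float_info.max,    # largest finite double
    "max": -sys.float_info.max,   # lowest finite double
}


@dataclass(frozen=True)
class ReductionInfoSketch:
    """Hypothetical record of what a kernel must know about one reduction."""
    op: str
    init_value: float
    write_back_ptr: str  # name of the pointer used to export the result


def make_reduction_info(op, ptr_name):
    return ReductionInfoSketch(op, NEUTRAL_ELEMENTS[op], ptr_name)


# Lookup table keyed by reduction symbol name, as a context object might hold.
reduction_table = {"r": make_reduction_info("min", "r_ptr")}

# A thread-local accumulator starts at the neutral element, then folds values.
local = reduction_table["r"].init_value
for v in [3.0, -1.5, 7.0]:
    local = min(local, v)
```

Starting each local accumulator at the neutral element keeps partial results independent of how the iteration space is split across threads or vector lanes.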
Edited by Richard Angersbach