Compare revisions

Commits on Source (20)
Showing with 263 additions and 204 deletions
# Change Log
## Unreleased
### Removed
* Removed OpenCL support, since it is no longer supported by pystencils
%% Cell type:code id: tags:
 
``` python
from lbmpy.session import *
```
 
%% Cell type:markdown id: tags:
 
# Tutorial 04: The Cumulant Lattice Boltzmann Method in lbmpy
 
## A) Principles of the centered cumulant collision operator
 
Recently, an advanced Lattice Boltzmann collision operator based on relaxation in cumulant space has gained popularity. Similar to moments, cumulants are statistical quantities inherent to a probability distribution. A significant advantage of the cumulants is that they are statistically independent by construction. Moments can be defined by using the so-called moment generating function, which for the discrete particle distribution of the LB equation is stated as
 
$$
\begin{align}
M( \mathbf{X} ) = \sum_i f_i \exp \left( \mathbf{c}_i \cdot \mathbf{X} \right)
\end{align}
$$
 
The raw moments $m_{\alpha \beta \gamma}$ can be expressed as its derivatives, evaluated at zero:
 
$$
\begin{align}
m_{\alpha \beta \gamma}
=
\left.
\frac{\partial^{\alpha}}{\partial X^{\alpha}}
\frac{\partial^{\beta}}{\partial Y^{\beta}}
\frac{\partial^{\gamma}}{\partial Z^{\gamma}}
M(X, Y, Z)
\right\vert_{\mathbf{X}=0}
\end{align}
$$
 
The cumulant-generating function is defined as the natural logarithm of this moment-generating function, and the cumulants $c_{\alpha \beta \gamma}$ are defined as its derivatives evaluated at zero:
 
$$
\begin{align}
C(\mathbf{X}) :=& \, \log ( M(\mathbf{X}) ) \\
c_{\alpha \beta \gamma}
=& \,
\left.
\frac{\partial^{\alpha}}{\partial X^{\alpha}}
\frac{\partial^{\beta}}{\partial Y^{\beta}}
\frac{\partial^{\gamma}}{\partial Z^{\gamma}}
C(X, Y, Z)
\right\vert_{\mathbf{X}=0}
\end{align}
$$
 
Unlike moments, cumulants cannot be expressed in a straightforward way in terms of the populations. However, their generating functions can be related to each other, so cumulants can be computed from either raw or central moments, which in turn are obtained from the populations. In lbmpy, the transformations from populations to cumulants and back are implemented using central moments as intermediaries. This is done for two primary reasons:
1. All cumulants of orders 2 and 3 are equal to their corresponding central moments, up to the density $\rho$ as a proportionality factor.
2. The conserved modes of the first order, which correspond to momentum, are relaxed in central moment space to allow for a more efficient implicit forcing scheme.
 
The central moment-generating function $K$ can be related to the moment-generating function through $K(\mathbf{X}) = \exp( - \mathbf{X} \cdot \mathbf{u} ) M(\mathbf{X})$. Combining this relation with the definition of the cumulant-generating function yields
 
$$
\begin{align}
C( \mathbf{X} ) = \mathbf{X} \cdot \mathbf{u} + \log( K( \mathbf{X} ) ).
\end{align}
$$
 
Derivatives of $C$ can thus be expressed in terms of derivatives of $K$, directly yielding equations of the cumulants in terms of central moments.
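 
As an illustration (a short sketch in the notation introduced above, with $\kappa_{\alpha \beta \gamma}$ denoting central moments and assuming vanishing first-order central moments), differentiating this relation twice gives the second-order cross cumulant:
 
$$
\begin{align}
c_{110}
&= \left. \frac{\partial^2}{\partial X \partial Y} \left( \mathbf{X} \cdot \mathbf{u} + \log K(\mathbf{X}) \right) \right\vert_{\mathbf{X}=0} \\
&= \left. \frac{K \, \partial_X \partial_Y K - \partial_X K \, \partial_Y K}{K^2} \right\vert_{\mathbf{X}=0}
= \frac{\kappa_{110}}{\rho} - \frac{\kappa_{100} \kappa_{010}}{\rho^2}
= \frac{\kappa_{110}}{\rho},
\end{align}
$$
 
since $K(\mathbf{0}) = \sum_i f_i = \rho$. This illustrates the proportionality factor $\rho$ between second-order cumulants and central moments mentioned above.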
 
In the cumulant collision operator, forces can be applied through a simple forcing scheme which in lbmpy is called *implicit forcing*. Only the force contributions to the first-order (momentum) modes are considered. When the first-order central moments are taken in the frame of reference shifted by $F/2$, they trail behind zero by just that much. Forcing can be applied by first adding half the force *before* the collision, and then adding the remaining half *after* the collision. Due to this, the first-order central moments entering the cumulant equations are always zero. By applying the force already in central-moment space, the cumulant equations are simplified significantly, as no forces need to be taken into account there. The pre- and post-collision momentum central moments take the form:
 
$$
\begin{align}
\kappa_{100} &= - \frac{F_x}{2}, \\
\kappa^{\ast}_{100} &= \kappa_{100} + F_x = \frac{F_x}{2}
\end{align}
$$
 
In total, through forcing, the first central moments change sign. This is equivalent to relaxation with a relaxation rate $\omega = 2$. For this reason, lbmpy's implementation of the cumulant LBM calculates the collision of the momentum modes in central moment space, and the default force model overrides their relaxation rate by setting it to 2.
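 
Written as a relaxation step (the equilibrium first-order central moments vanish), this equivalence reads:
 
$$
\begin{align}
\kappa^{\ast}_{100} = (1 - \omega) \, \kappa_{100}
\quad \text{with} \quad \omega = 2
\quad \Rightarrow \quad
\kappa^{\ast}_{100} = - \kappa_{100} = \frac{F_x}{2}
\end{align}
$$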
 
%% Cell type:markdown id: tags:
 
## B) Method Creation
 
Using the `create_lb_method` interface, creating a cumulant-based lattice Boltzmann method in lbmpy is straightforward. Cumulants can be relaxed either in their raw (monomial) form or in polynomial combinations. Both variants are available as predefined setups.
 
### Relaxation of Monomial Cumulants
 
Monomial cumulant relaxation is available through the `method="monomial_cumulant"` parameter setting. This variant requires fewer transformation steps and is a little faster than polynomial relaxation, but it does not allow separation of bulk and shear viscosity. Default monomial cumulant sets are available for the D2Q9, D3Q19 and D3Q27 stencils. It is also possible to define a custom set of monomial cumulants.
 
When creating a monomial cumulant method, one option is to specify only a single relaxation rate, which is then assigned to all cumulants related to the shear viscosity. In this case, all other non-conserved cumulants are relaxed to equilibrium. Alternatively, individual relaxation rates for all non-conserved cumulants can be specified. The relaxation rates of the conserved cumulants are set to zero by default to save computational cost. They can be adjusted with `set_zeroth_moment_relaxation_rate`, `set_first_moment_relaxation_rate` and `set_conserved_moments_relaxation_rate`.
 
%% Cell type:code id: tags:
 
``` python
lbm_config = LBMConfig(method=Method.MONOMIAL_CUMULANT, relaxation_rate=sp.Symbol('omega_v'))
method_monomial = create_lb_method(lbm_config=lbm_config)
method_monomial
```
 
%% Output
 
<lbmpy.methods.centeredcumulant.centeredcumulantmethod.CenteredCumulantBasedLbMethod at 0x127652c40>
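 
%% Cell type:markdown id: tags:
 
As an illustration (a sketch using only parameters already introduced above), the same interface works for the other stencils with predefined monomial cumulant sets, e.g. a D3Q19 variant with a single symbolic shear relaxation rate:
 
%% Cell type:code id: tags:
 
``` python
# Sketch: monomial cumulant method on D3Q19, relaxing the shear-related
# cumulants with one symbolic rate; all other non-conserved cumulants are
# relaxed to equilibrium, as described above.
lbm_config_d3q19 = LBMConfig(stencil=LBStencil(Stencil.D3Q19),
                             method=Method.MONOMIAL_CUMULANT,
                             relaxation_rate=sp.Symbol('omega_v'))
create_lb_method(lbm_config=lbm_config_d3q19)
```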
 
%% Cell type:markdown id: tags:
 
### Relaxation of Polynomial Cumulants
 
By setting `method="cumulant"`, a set of default polynomial cumulants is chosen to be relaxed. Those cumulants are taken from the literature and assembled into groups selected to enforce rotational invariance (see: `lbmpy.methods.centeredcumulant.centered_cumulants`). Default polynomial groups are available for the D2Q9, D3Q19 and D3Q27 stencils. As before, it is possible to specify only a single relaxation rate, assigned to the moments governing the shear viscosity. All other relaxation rates are then automatically set to one.
 
%% Cell type:code id: tags:
 
``` python
lbm_config = LBMConfig(method=Method.CUMULANT, relaxation_rate=sp.Symbol('omega_v'))
method_polynomial = create_lb_method(lbm_config=lbm_config)
method_polynomial
```
 
%% Output
 
<lbmpy.methods.centeredcumulant.centeredcumulantmethod.CenteredCumulantBasedLbMethod at 0x12745b5e0>
 
%% Cell type:markdown id: tags:
 
### Central Moments and Forcing
 
The conserved modes are marked with the note *(central moment)* in the table above. This highlights the fact that these modes are relaxed in central moment space rather than in cumulant space. As described in section A, this is done to enable the implicit forcing scheme. When a force is specified, the momentum modes' relaxation rates are overridden by the force model. In the following cell, a symbolic force is specified. Further, a full list of relaxation rates is passed to allow the specification of the bulk viscosity.
 
%% Cell type:code id: tags:
 
``` python
lbm_config = LBMConfig(method=Method.CUMULANT, force=sp.symbols('F_:3'),
                       relaxation_rates=[sp.Symbol('omega_shear'),
                                         sp.Symbol('omega_bulk'),
                                         sp.Symbol('omega_3'),
                                         sp.Symbol('omega_4')])
method_with_force = create_lb_method(lbm_config=lbm_config)
method_with_force
```
 
%% Output
 
<lbmpy.methods.centeredcumulant.centeredcumulantmethod.CenteredCumulantBasedLbMethod at 0x127652520>
 
%% Cell type:markdown id: tags:
 
## C) Exemplary simulation: flow around sphere
 
To end this tutorial, we show an example simulation of a channel flow with a spherical obstacle. This example is shown in 2D with the D2Q9 stencil but can be adapted easily for a 3D simulation.
 
%% Cell type:code id: tags:
 
``` python
from lbmpy.relaxationrates import relaxation_rate_from_lattice_viscosity
from lbmpy.macroscopic_value_kernels import pdf_initialization_assignments
```
 
%% Cell type:markdown id: tags:
 
To define the channel flow with dimensionless parameters, we first define the reference length in lattice cells. The reference length is the diameter of the spherical obstacle. Furthermore, we define a maximal velocity, which will later be prescribed at the inflow. The Reynolds number is set to a relatively high value of 100000, which will cause turbulent flow in the channel.
 
From these definitions, we can calculate the relaxation rate `omega` for the lattice Boltzmann method.
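 
For reference, the lattice viscosity follows from the Reynolds number as $\nu = u_{\mathrm{max}} L / \mathrm{Re}$, and with $c_s^2 = \tfrac{1}{3}$ the standard relation between viscosity and relaxation rate is
 
$$
\begin{align}
\nu = c_s^2 \left( \frac{1}{\omega} - \frac{1}{2} \right)
\qquad \Longleftrightarrow \qquad
\omega = \frac{1}{3 \nu + \frac{1}{2}},
\end{align}
$$
 
which is the relation `relaxation_rate_from_lattice_viscosity` is expected to evaluate below.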
 
%% Cell type:code id: tags:
 
``` python
reference_length = 30
maximal_velocity = 0.05
reynolds_number = 100000
kinematic_viscosity = (reference_length * maximal_velocity) / reynolds_number
initial_velocity = (maximal_velocity, 0)

omega = relaxation_rate_from_lattice_viscosity(kinematic_viscosity)
```
 
%% Cell type:markdown id: tags:
 
As a next step, we define the domain size of our setup.
 
%% Cell type:code id: tags:
 
``` python
stencil = LBStencil(Stencil.D2Q9)
domain_size = (reference_length * 12, reference_length * 4)
dim = len(domain_size)
```
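 
%% Cell type:markdown id: tags:
 
As mentioned at the beginning of this section, the setup can be adapted to 3D. A minimal sketch of the corresponding definitions (illustration only, not used in the remainder of this 2D tutorial; the names `stencil_3d`, `domain_size_3d` and `initial_velocity_3d` are hypothetical):
 
%% Cell type:code id: tags:
 
``` python
# Hypothetical 3D variant of the definitions above (not executed here):
# a 3D stencil, a three-dimensional domain and a 3-component inflow velocity.
stencil_3d = LBStencil(Stencil.D3Q27)
domain_size_3d = (reference_length * 12, reference_length * 4, reference_length * 4)
initial_velocity_3d = (maximal_velocity, 0, 0)
```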
 
%% Cell type:markdown id: tags:
 
Now the data for the simulation is allocated. We allocate a field `src` for the PDFs and a temporary field `dst` used to implement the two-grid pull pattern. Additionally, we allocate a velocity field `velField`.
 
%% Cell type:code id: tags:
 
``` python
dh = ps.create_data_handling(domain_size=domain_size, periodicity=(False, False))
 
src = dh.add_array('src', values_per_cell=len(stencil), alignment=True)
dh.fill('src', 0.0, ghost_layers=True)
dst = dh.add_array('dst', values_per_cell=len(stencil), alignment=True)
dh.fill('dst', 0.0, ghost_layers=True)
 
velField = dh.add_array('velField', values_per_cell=dh.dim, alignment=True)
dh.fill('velField', 0.0, ghost_layers=True)
```
 
%% Cell type:markdown id: tags:
 
We choose a cumulant lattice Boltzmann method, as described above. Here the second-order cumulants are relaxed with the relaxation rate calculated above. All higher-order cumulants are relaxed with a rate of one, which means they are set to their equilibrium values.
 
%% Cell type:code id: tags:
 
``` python
lbm_config = LBMConfig(stencil=Stencil.D2Q9, method=Method.CUMULANT, relaxation_rate=omega,
                       output={'velocity': velField}, kernel_type='stream_pull_collide')
 
method = create_lb_method(lbm_config=lbm_config)
method
```
 
%% Output
 
<lbmpy.methods.centeredcumulant.centeredcumulantmethod.CenteredCumulantBasedLbMethod at 0x103b6d610>
 
%% Cell type:markdown id: tags:
 
### Initialisation with equilibrium
 
%% Cell type:code id: tags:
 
``` python
init = pdf_initialization_assignments(method, 1.0, initial_velocity, src.center_vector)
 
ast_init = ps.create_kernel(init, target=dh.default_target)
kernel_init = ast_init.compile()
 
dh.run_kernel(kernel_init)
```
 
%% Cell type:markdown id: tags:
 
### Definition of the update rule
 
%% Cell type:code id: tags:
 
``` python
lbm_optimisation = LBMOptimisation(symbolic_field=src, symbolic_temporary_field=dst)
update = create_lb_update_rule(lb_method=method,
                               lbm_config=lbm_config,
                               lbm_optimisation=lbm_optimisation)
 
ast_kernel = ps.create_kernel(update, target=dh.default_target, cpu_openmp=True)
kernel = ast_kernel.compile()
```
 
%% Cell type:markdown id: tags:
 
### Definition of the boundary set up
 
%% Cell type:code id: tags:
 
``` python
def set_sphere(x, y, *_):
    mid = (domain_size[0] // 3, domain_size[1] // 2)
    radius = reference_length // 2
    return (x - mid[0])**2 + (y - mid[1])**2 < radius**2
```
 
%% Cell type:code id: tags:
 
``` python
bh = LatticeBoltzmannBoundaryHandling(method, dh, 'src', name="bh")
 
inflow = UBB(initial_velocity)
outflow = ExtrapolationOutflow(stencil[4], method)
wall = NoSlip("wall")
 
bh.set_boundary(inflow, slice_from_direction('W', dim))
bh.set_boundary(outflow, slice_from_direction('E', dim))
for direction in ('N', 'S'):
    bh.set_boundary(wall, slice_from_direction(direction, dim))
 
bh.set_boundary(NoSlip("obstacle"), mask_callback=set_sphere)
 
plt.figure(dpi=200)
plt.boundary_handling(bh)
```
 
%% Output
 
 
%% Cell type:code id: tags:
 
``` python
def timeloop(timeSteps):
    for i in range(timeSteps):
        bh()
        dh.run_kernel(kernel)
        dh.swap("src", "dst")
```
 
%% Cell type:markdown id: tags:
 
### Run the simulation
 
%% Cell type:code id: tags:
 
``` python
mask = np.fromfunction(set_sphere, (domain_size[0], domain_size[1], len(domain_size)))
if 'is_test_run' not in globals():
    timeloop(50000)  # initial steps

    def run():
        timeloop(100)
        return np.ma.array(dh.gather_array('velField'), mask=mask)

    animation = plt.vector_field_magnitude_animation(run, frames=600, rescale=True)
    set_display_mode('video')
    res = display_animation(animation)
else:
    timeloop(10)
    res = None
res
```
 
%% Output
 
<IPython.core.display.HTML object>
%% Cell type:markdown id: tags:
# Shan-Chen Two-Component Lattice Boltzmann
%% Cell type:code id: tags:
``` python
from lbmpy.session import *
from lbmpy.updatekernels import create_stream_pull_with_output_kernel
from lbmpy.macroscopic_value_kernels import macroscopic_values_getter, macroscopic_values_setter
from lbmpy.maxwellian_equilibrium import get_weights
```
%% Cell type:markdown id: tags:
This is based on section 9.3.3 of Krüger et al.'s "The Lattice Boltzmann Method", Springer 2017 (http://www.lbmbook.com).
Sample code is available at [https://github.com/lbm-principles-practice/code/](https://github.com/lbm-principles-practice/code/blob/master/chapter9/shanchen.cpp).
%% Cell type:markdown id: tags:
## Parameters
%% Cell type:code id: tags:
``` python
N = 64 # domain size
omega_a = 1. # relaxation rate of first component
omega_b = 1. # relaxation rate of second component
# interaction strength
g_aa = 0.
g_ab = g_ba = 6.
g_bb = 0.
rho0 = 1.
stencil = LBStencil(Stencil.D2Q9)
weights = get_weights(stencil, c_s_sq=sp.Rational(1, 3))
```
%% Cell type:markdown id: tags:
## Data structures
We allocate two sets of PDFs, one for each phase. Additionally, for each phase there is one field to store its density and one for its velocity.
To run the simulation on a GPU, change the `default_target` to `ps.Target.GPU`.
%% Cell type:code id: tags:
``` python
dim = stencil.D
dh = ps.create_data_handling((N, ) * dim, periodicity=True, default_target=ps.Target.CPU)
src_a = dh.add_array('src_a', values_per_cell=stencil.Q)
dst_a = dh.add_array_like('dst_a', 'src_a')
src_b = dh.add_array('src_b', values_per_cell=stencil.Q)
dst_b = dh.add_array_like('dst_b', 'src_b')
ρ_a = dh.add_array('rho_a')
ρ_b = dh.add_array('rho_b')
u_a = dh.add_array('u_a', values_per_cell=stencil.D)
u_b = dh.add_array('u_b', values_per_cell=stencil.D)
u_bary = dh.add_array_like('u_bary', u_a.name)
f_a = dh.add_array('f_a', values_per_cell=stencil.D)
f_b = dh.add_array_like('f_b', f_a.name)
```
%% Cell type:markdown id: tags:
## Force & combined velocity
The two LB methods are coupled using a force term. Its symbolic representation is created in the next cells.
The force values are stored in the fields `f_a` and `f_b`, which are filled by a separate kernel before each collision step.
%% Cell type:markdown id: tags:
The force between the two components is
$\mathbf{F}_k(\mathbf{x})=-\psi(\rho_k(\mathbf{x}))\sum\limits_{k^\prime\in\{A,B\}}g_{kk^\prime}\sum\limits_{i=1}^{q}w_i\psi(\rho_{k^\prime}(\mathbf{x}+\mathbf{c}_i))\mathbf{c}_i$
for $k\in\{A,B\}$
and with
$\psi(\rho)=\rho_0\left[1-\exp(-\rho/\rho_0)\right]$.
%% Cell type:code id: tags:
``` python
def psi(dens):
    return rho0 * (1. - sp.exp(-dens / rho0))
```
%% Cell type:code id: tags:
``` python
zero_vec = sp.Matrix([0] * dh.dim)

force_a = zero_vec
for factor, ρ in zip([g_aa, g_ab], [ρ_a, ρ_b]):
    force_a += sum((psi(ρ[d]) * w_d * sp.Matrix(d)
                    for d, w_d in zip(stencil, weights)),
                   zero_vec) * psi(ρ_a.center) * -1 * factor

force_b = zero_vec
for factor, ρ in zip([g_ba, g_bb], [ρ_a, ρ_b]):
    force_b += sum((psi(ρ[d]) * w_d * sp.Matrix(d)
                    for d, w_d in zip(stencil, weights)),
                   zero_vec) * psi(ρ_b.center) * -1 * factor

f_expressions = ps.AssignmentCollection([
    ps.Assignment(f_a.center_vector, force_a),
    ps.Assignment(f_b.center_vector, force_b)
])
```
%% Cell type:markdown id: tags:
The barycentric velocity, which is used in place of the individual components' velocities in the equilibrium distribution and Guo force term, is
$\mathbf{u}=\frac{1}{\rho_a+\rho_b}\left(\rho_a\mathbf{u}_a+\frac{1}{2}\mathbf{F}_a+\rho_b\mathbf{u}_b+\frac{1}{2}\mathbf{F}_b\right)$.
%% Cell type:code id: tags:
``` python
u_full = ps.Assignment(u_bary.center_vector,
                       (ρ_a.center * u_a.center_vector + ρ_b.center * u_b.center_vector) / (ρ_a.center + ρ_b.center))
```
%% Cell type:markdown id: tags:
## Kernels
%% Cell type:code id: tags:
``` python
lbm_config_a = LBMConfig(stencil=stencil, relaxation_rate=omega_a, compressible=True,
                         velocity_input=u_bary, density_input=ρ_a, force_model=ForceModel.GUO,
                         force=f_a, kernel_type='collide_only')
lbm_config_b = LBMConfig(stencil=stencil, relaxation_rate=omega_b, compressible=True,
                         velocity_input=u_bary, density_input=ρ_b, force_model=ForceModel.GUO,
                         force=f_b, kernel_type='collide_only')

collision_a = create_lb_update_rule(lbm_config=lbm_config_a,
                                    optimization={'symbolic_field': src_a})
collision_b = create_lb_update_rule(lbm_config=lbm_config_b,
                                    optimization={'symbolic_field': src_b})
```
%% Cell type:code id: tags:
``` python
stream_a = create_stream_pull_with_output_kernel(collision_a.method, src_a, dst_a,
                                                 {'density': ρ_a, 'velocity': u_a})
stream_b = create_stream_pull_with_output_kernel(collision_b.method, src_b, dst_b,
                                                 {'density': ρ_b, 'velocity': u_b})

config = ps.CreateKernelConfig(target=dh.default_target)

stream_a_kernel = ps.create_kernel(stream_a, config=config).compile()
stream_b_kernel = ps.create_kernel(stream_b, config=config).compile()
collision_a_kernel = ps.create_kernel(collision_a, config=config).compile()
collision_b_kernel = ps.create_kernel(collision_b, config=config).compile()

force_kernel = ps.create_kernel(f_expressions, config=config).compile()
u_kernel = ps.create_kernel(u_full, config=config).compile()
```
%% Cell type:markdown id: tags:
## Initialization
%% Cell type:code id: tags:
``` python
init_a = macroscopic_values_setter(collision_a.method, velocity=(0, 0),
                                   pdfs=src_a.center_vector, density=ρ_a.center)
init_b = macroscopic_values_setter(collision_b.method, velocity=(0, 0),
                                   pdfs=src_b.center_vector, density=ρ_b.center)

init_a_kernel = ps.create_kernel(init_a, ghost_layers=0).compile()
init_b_kernel = ps.create_kernel(init_b, ghost_layers=0).compile()

dh.run_kernel(init_a_kernel)
dh.run_kernel(init_b_kernel)
```
%% Cell type:code id: tags:
``` python
def init():
    dh.fill(ρ_a.name, 0.1, slice_obj=ps.make_slice[:, :0.5])
    dh.fill(ρ_a.name, 0.9, slice_obj=ps.make_slice[:, 0.5:])

    dh.fill(ρ_b.name, 0.9, slice_obj=ps.make_slice[:, :0.5])
    dh.fill(ρ_b.name, 0.1, slice_obj=ps.make_slice[:, 0.5:])

    dh.fill(f_a.name, 0.0)
    dh.fill(f_b.name, 0.0)

    dh.run_kernel(init_a_kernel)
    dh.run_kernel(init_b_kernel)

    dh.fill(u_a.name, 0.0)
    dh.fill(u_b.name, 0.0)
```
%% Cell type:markdown id: tags:
## Timeloop
%% Cell type:code id: tags:
``` python
sync_pdfs = dh.synchronization_function([src_a.name, src_b.name])
sync_ρs = dh.synchronization_function([ρ_a.name, ρ_b.name])


def time_loop(steps):
    dh.all_to_gpu()
    for i in range(steps):
        sync_ρs()  # force values depend on neighboring ρ's
        dh.run_kernel(force_kernel)
        dh.run_kernel(u_kernel)

        dh.run_kernel(collision_a_kernel)
        dh.run_kernel(collision_b_kernel)

        sync_pdfs()
        dh.run_kernel(stream_a_kernel)
        dh.run_kernel(stream_b_kernel)

        dh.swap(src_a.name, dst_a.name)
        dh.swap(src_b.name, dst_b.name)
    dh.all_to_cpu()
%% Cell type:code id: tags:
``` python
def plot_ρs():
    plt.figure(dpi=200)
    plt.subplot(1, 2, 1)
    plt.title("$\\rho_A$")
    plt.scalar_field(dh.gather_array(ρ_a.name), vmin=0, vmax=2)
    plt.colorbar()
    plt.subplot(1, 2, 2)
    plt.title("$\\rho_B$")
    plt.scalar_field(dh.gather_array(ρ_b.name), vmin=0, vmax=2)
    plt.colorbar()
```
%% Cell type:markdown id: tags:
## Run the simulation
### Initial state
%% Cell type:code id: tags:
``` python
init()
plot_ρs()
```
%% Output
%% Cell type:markdown id: tags:
### Run the simulation until converged
%% Cell type:code id: tags:
``` python
init()
time_loop(10000)
plot_ρs()
```
%% Output
%% Cell type:code id: tags:
``` python
assert np.isfinite(dh.gather_array(ρ_a.name)).all()
assert np.isfinite(dh.gather_array(ρ_b.name)).all()
```
......
......@@ -104,8 +104,7 @@ def get_communication_slices(
def periodic_pdf_copy_kernel(pdf_field, src_slice, dst_slice,
domain_size=None, target=Target.GPU,
opencl_queue=None, opencl_ctx=None):
domain_size=None, target=Target.GPU):
"""Copies a rectangular array slice onto another non-overlapping array slice"""
from pystencils.gpucuda.kernelcreation import create_cuda_kernel
......@@ -136,9 +135,6 @@ def periodic_pdf_copy_kernel(pdf_field, src_slice, dst_slice,
if target == Target.GPU:
from pystencils.gpucuda import make_python_function
return make_python_function(ast)
elif target == Target.OPENCL:
from pystencils.opencl import make_python_function
return make_python_function(ast, opencl_queue, opencl_ctx)
else:
raise ValueError('Invalid target:', target)
......@@ -147,22 +143,17 @@ class LBMPeriodicityHandling:
def __init__(self, stencil, data_handling, pdf_field_name,
streaming_pattern='pull', ghost_layers=1,
opencl_queue=None, opencl_ctx=None,
pycuda_direct_copy=True):
"""
Periodicity Handling for Lattice Boltzmann Streaming.
**On the usage with cuda/opencl:**
**On the usage with cuda:**
- pycuda allows the copying of sliced arrays within device memory using the numpy syntax,
e.g. `dst[:,0] = src[:,-1]`. In this implementation, this is the default for periodicity
handling. Alternatively, if you set `pycuda_direct_copy=False`, GPU kernels are generated and
compiled. The compiled kernels are almost twice as fast in execution as pycuda array copying,
but especially for large stencils like D3Q27, their compilation can take up to 20 seconds.
Choose your weapon depending on your use case.
- pyopencl does not support copying of non-contiguous sliced arrays, so the usage of compiled
copy kernels is forced on us. On the positive side, compilation of the OpenCL kernels appears
to be about four times faster.
"""
if not isinstance(data_handling, SerialDataHandling):
raise ValueError('Only serial data handling is supported!')
......@@ -172,7 +163,7 @@ class LBMPeriodicityHandling:
self.dh = data_handling
target = data_handling.default_target
assert target in [Target.CPU, Target.GPU, Target.OPENCL]
assert target in [Target.CPU, Target.GPU]
self.pdf_field_name = pdf_field_name
self.ghost_layers = ghost_layers
......@@ -180,8 +171,6 @@ class LBMPeriodicityHandling:
self.inplace_pattern = is_inplace(streaming_pattern)
self.target = target
self.cpu = target == Target.CPU
self.opencl_queue = opencl_queue
self.opencl_ctx = opencl_ctx
self.pycuda_direct_copy = target == Target.GPU and pycuda_direct_copy
def is_copy_direction(direction):
......@@ -205,7 +194,7 @@ class LBMPeriodicityHandling:
ghost_layers=ghost_layers)
self.comm_slices.append(list(chain.from_iterable(v for k, v in slices_per_comm_dir.items())))
if target == Target.OPENCL or (target == Target.GPU and not pycuda_direct_copy):
if target == Target.GPU and not pycuda_direct_copy:
self.device_copy_kernels = []
for timestep in timesteps:
self.device_copy_kernels.append(self._compile_copy_kernels(timestep))
......@@ -227,9 +216,7 @@ class LBMPeriodicityHandling:
kernels = []
for src, dst in self.comm_slices[timestep.idx]:
kernels.append(
periodic_pdf_copy_kernel(
pdf_field, src, dst, target=self.target,
opencl_queue=self.opencl_queue, opencl_ctx=self.opencl_ctx))
periodic_pdf_copy_kernel(pdf_field, src, dst, target=self.target))
return kernels
def _periodicity_handling_gpu(self, prev_timestep):
......
......@@ -115,8 +115,8 @@ class NoSlip(LbBoundary):
class FreeSlip(LbBoundary):
"""
Free-Slip boundary condition, which enforces a zero normal fluid velocity $u_n = 0$ but places no restrictions
on the tangential fluid velocity $u_t$.
Free-Slip boundary condition, which enforces a zero normal fluid velocity :math:`u_n = 0` but places no restrictions
on the tangential fluid velocity :math:`u_t`.
Args:
stencil: LBM stencil which is used for the simulation
......@@ -283,7 +283,7 @@ class UBB(LbBoundary):
Returns:
list containing LbmWeightInfo and NeighbourOffsetArrays
"""
return [LbmWeightInfo(lb_method), NeighbourOffsetArrays(lb_method.stencil)]
return [LbmWeightInfo(lb_method, self.data_type), NeighbourOffsetArrays(lb_method.stencil)]
@property
def velocity_is_callable(self):
......@@ -312,7 +312,8 @@ class UBB(LbBoundary):
velocity = [eq.rhs for eq in shifted_vel_eqs.new_filtered(cqc.first_order_moment_symbols).main_assignments]
c_s_sq = sp.Rational(1, 3)
weight_of_direction = LbmWeightInfo.weight_of_direction
weight_info = LbmWeightInfo(lb_method, data_type=self.data_type)
weight_of_direction = weight_info.weight_of_direction
vel_term = 2 / c_s_sq * sum([d_i * v_i for d_i, v_i in zip(neighbor_offset, velocity)]) * weight_of_direction(
dir_symbol, lb_method)
......@@ -595,10 +596,11 @@ class DiffusionDirichlet(LbBoundary):
name: optional name of the boundary.
"""
def __init__(self, concentration, name=None):
def __init__(self, concentration, name=None, data_type='double'):
if name is None:
name = "Diffusion Dirichlet " + str(concentration)
self.concentration = concentration
self._data_type = data_type
super(DiffusionDirichlet, self).__init__(name)
......@@ -611,10 +613,11 @@ class DiffusionDirichlet(LbBoundary):
Returns:
list containing LbmWeightInfo
"""
return [LbmWeightInfo(lb_method)]
return [LbmWeightInfo(lb_method, self._data_type)]
def __call__(self, f_out, f_in, dir_symbol, inv_dir, lb_method, index_field):
w_dir = LbmWeightInfo.weight_of_direction(dir_symbol, lb_method)
weight_info = LbmWeightInfo(lb_method, self._data_type)
w_dir = weight_info.weight_of_direction(dir_symbol, lb_method)
return [Assignment(f_in(inv_dir[dir_symbol]),
2 * w_dir * self.concentration - f_out(dir_symbol))]
......
......@@ -38,7 +38,7 @@ class LatticeBoltzmannBoundaryHandling(BoundaryHandling):
self._prev_timestep = None
def add_fixed_steps(self, fixed_loop, **kwargs):
if self._inplace: # Fixed Loop can't do timestep selection
if self._inplace: # Fixed Loop can't do timestep selection
raise NotImplementedError("Adding to fixed loop is currently not supported for inplace kernels")
super(LatticeBoltzmannBoundaryHandling, self).add_fixed_steps(fixed_loop, **kwargs)
......@@ -52,10 +52,12 @@ class LatticeBoltzmannBoundaryHandling(BoundaryHandling):
if boundary_obj not in self._boundary_object_to_boundary_info:
sym_index_field = Field.create_generic('indexField', spatial_dimensions=1,
dtype=numpy_data_type_for_boundary_object(boundary_obj, self.dim))
kernels = [self._create_boundary_kernel(
self._data_handling.fields[self._field_name], sym_index_field, boundary_obj, Timestep.EVEN).compile(),
self._create_boundary_kernel(
self._data_handling.fields[self._field_name], sym_index_field, boundary_obj, Timestep.ODD).compile()]
ast_even = self._create_boundary_kernel(self._data_handling.fields[self._field_name], sym_index_field,
boundary_obj, Timestep.EVEN)
ast_odd = self._create_boundary_kernel(self._data_handling.fields[self._field_name], sym_index_field,
boundary_obj, Timestep.ODD)
kernels = [ast_even.compile(), ast_odd.compile()]
if flag is None:
flag = self.flag_interface.reserve_next_flag()
boundary_info = self.InplaceStreamingBoundaryInfo(self, boundary_obj, flag, kernels)
......@@ -84,6 +86,7 @@ class LatticeBoltzmannBoundaryHandling(BoundaryHandling):
self.boundary_object = boundary_obj
self.flag = flag
self._kernels = kernels
# end class InplaceStreamingBoundaryInfo
# ------------------------------ Force On Boundary ------------------------------------------------------------
......@@ -148,29 +151,32 @@ class LatticeBoltzmannBoundaryHandling(BoundaryHandling):
return dh.reduce_float_sequence(list(result), 'sum')
# end class LatticeBoltzmannBoundaryHandling
class LbmWeightInfo(CustomCodeNode):
def __init__(self, lb_method, data_type='double'):
self.weights_symbol = TypedSymbol("weights", data_type)
data_type_string = "double" if self.weights_symbol.dtype.numpy_dtype == np.float64 else "float"
weights = [str(w.evalf(17)) for w in lb_method.weights]
if data_type_string == "float":
weights = "f, ".join(weights)
weights += "f" # suffix for the last element
else:
weights = ", ".join(weights)
w_sym = self.weights_symbol
code = f"const {data_type_string} {w_sym.name} [] = {{{ weights }}};\n"
super(LbmWeightInfo, self).__init__(code, symbols_read=set(), symbols_defined={w_sym})
# --------------------------- Functions to be used by boundaries --------------------------
@staticmethod
def weight_of_direction(dir_idx, lb_method=None):
def weight_of_direction(self, dir_idx, lb_method=None):
if isinstance(sp.sympify(dir_idx), sp.Integer):
return lb_method.weights[dir_idx].evalf()
return lb_method.weights[dir_idx].evalf(17)
else:
return sp.IndexedBase(LbmWeightInfo.WEIGHTS_SYMBOL, shape=(1,))[dir_idx]
return sp.IndexedBase(self.weights_symbol, shape=(1,))[dir_idx]
# ---------------------------------- Internal ---------------------------------------------
WEIGHTS_SYMBOL = TypedSymbol("weights", "double")
def __init__(self, lb_method):
weights = [str(w.evalf()) for w in lb_method.weights]
w_sym = LbmWeightInfo.WEIGHTS_SYMBOL
code = "const double %s [] = { %s };\n" % (w_sym.name, ",".join(weights))
super(LbmWeightInfo, self).__init__(code, symbols_read=set(), symbols_defined={w_sym})
# end class LbmWeightInfo
......
......@@ -15,11 +15,11 @@ def moment_generating_function(generating_function, symbols, symbols_in_result,
Computes the moment generating function of a probability distribution. It is defined as:
.. math ::
F[f(\mathbf{x})](\mathbf{t}) = \int e^{<\mathbf{x}, \mathbf{t}>} f(x)\; dx
F[f(\mathbf{x})](t) = \int e^{<\mathbf{x}, t>} f(\mathbf{x})\; dx
Args:
generating_function: sympy expression
symbols: a sequence of symbols forming the vector x
symbols: a sequence of symbols forming the vector :math:`\mathbf{x}`
symbols_in_result: a sequence forming the vector t
velocity: if the generating function generates central moments, the velocity needs to be subtracted. Thus the
velocity symbols need to be passed. All generating functions need to have the same parameters.
......@@ -62,7 +62,7 @@ def central_moment_generating_function(func, symbols, symbols_in_result, velocit
Computes central moment generating func, which is defined as:
.. math ::
K( \vec{\Xi} ) = \exp ( - \vec{\Xi} \cdot \vec{u} ) M( \vec{\Xi}.
K( \mathbf{\Xi} ) = \exp ( - \mathbf{\Xi} \cdot \mathbf{u} ) M( \mathbf{\Xi} ).
For parameter description see :func:`moment_generating_function`.
"""
......@@ -76,7 +76,7 @@ def cumulant_generating_function(func, symbols, symbols_in_result, velocity=None
Computes cumulant generating func, which is the logarithm of the moment generating func:
.. math ::
C(\vec{\Xi}) = \log M(\vec{\Xi})
C(\mathbf{\Xi}) = \log M(\mathbf{\Xi})
For parameter description see :func:`moment_generating_function`.
"""
......
......@@ -247,21 +247,21 @@ class LBMConfig:
kernel_type: Union[str, Type[PdfFieldAccessor]] = 'default_stream_collide'
"""
Supported values: 'default_stream_collide' (default), 'collide_only', 'stream_pull_only'.
With 'default_stream_collide', streaming pattern and even/odd time-step (for in-place patterns) can be specified
Supported values: ``'default_stream_collide'`` (default), ``'collide_only'``, ``'stream_pull_only'``.
With ``'default_stream_collide'``, streaming pattern and even/odd time-step (for in-place patterns) can be specified
by the ``streaming_pattern`` and ``timestep`` arguments. For backwards compatibility, ``kernel_type`` also accepts
'stream_pull_collide', 'collide_stream_push', 'esotwist_even', 'esotwist_odd', 'aa_even' and 'aa_odd' for selection
of the streaming pattern.
``'stream_pull_collide'``, ``'collide_stream_push'``, ``'esotwist_even'``, ``'esotwist_odd'``, ``'aa_even'``
and ``'aa_odd'`` for selection of the streaming pattern.
"""
streaming_pattern: str = 'pull'
"""
The streaming pattern to be used with a 'default_stream_collide' kernel. Accepted values are
'pull', 'push', 'aa' and 'esotwist'.
The streaming pattern to be used with a ``'default_stream_collide'`` kernel. Accepted values are
``'pull'``, ``'push'``, ``'aa'`` and ``'esotwist'``.
"""
timestep: Timestep = Timestep.BOTH
"""
Timestep modulus for the streaming pattern. For two-fields patterns, this argument is irrelevant and
by default set to ``Timestep.BOTH``. For in-place patterns, ``Timestep.EVEN`` or ``Timestep.ODD`` must be speficied.
by default set to ``Timestep.BOTH``. For in-place patterns, ``Timestep.EVEN`` or ``Timestep.ODD`` must be specified.
"""
field_name: str = 'src'
......
......@@ -21,6 +21,10 @@ class Stencil(Enum):
"""
A two dimensional stencil using 37 discrete velocities. (long range stencil).
"""
D3Q7 = auto()
"""
A three dimensional stencil using 7 discrete velocities.
"""
D3Q15 = auto()
"""
A three dimensional stencil using 15 discrete velocities.
......
......@@ -17,16 +17,16 @@ Force models add a term :math:`C_F` to the collision equation:
.. math ::
f(\pmb{x} + c_q \Delta t, t + \Delta t) - f(\pmb{x},t) = \Omega(f, f^{(\mathrm{eq})})
f(\mathbf{x} + c_q \Delta t, t + \Delta t) - f(\mathbf{x},t) = \Omega(f, f^{(\mathrm{eq})})
+ \underbrace{F_q}_{\mbox{forcing term}}
The form of this term depends on the concrete force model: the first moment of this forcing term is equal
to the acceleration :math:`\pmb{a}` for all force models. Here :math:`\mathbf{F}` is the D dimensional force vector,
to the acceleration :math:`\mathbf{a}` for all force models. Here :math:`\mathbf{F}` is the D dimensional force vector,
which defines the force for each spatial direction.
.. math ::
\sum_q \pmb{c}_q \mathbf{F} = \pmb{a}
\sum_q \mathbf{c}_q \mathbf{F} = \mathbf{a}
The second order moment is different for the forcing models - if it is zero the model is suited for
......@@ -57,7 +57,7 @@ For all force models the computation of the macroscopic velocity has to be adapt
.. math ::
\pmb{u} &= \sum_q \pmb{c}_q f_q + S_{\mathrm{macro}}
\mathbf{u} &= \sum_q \mathbf{c}_q f_q + S_{\mathrm{macro}}
S_{\mathrm{macro}} &= \frac{\Delta t}{2 \cdot \rho} \sum_q F_q
......@@ -296,7 +296,7 @@ class He(AbstractForceModel):
F_x m^{\mathrm{eq}}_{\alpha+1,\beta,\gamma}
+ F_y m^{\mathrm{eq}}_{\alpha,\beta+1,\gamma}
+ F_z m^{\mathrm{eq}}_{\alpha,\beta,\gamma+1}
- m^{eq}_{\alpha\beta\gamma} ( \mathbf{F} \cdot \vec{u} )
- m^{eq}_{\alpha\beta\gamma} ( \mathbf{F} \cdot \mathbf{u} )
\right)
"""
......
......@@ -74,7 +74,7 @@ class LatticeBoltzmannStep:
self.density_data_name = name + "_density" if density_data_name is None else density_data_name
self.density_data_index = density_data_index
self._gpu = target == Target.GPU or target == Target.OPENCL
self._gpu = target == Target.GPU
layout = lbm_optimisation.field_layout
alignment = False
......
......@@ -8,12 +8,16 @@ from lbmpy.advanced_streaming.utility import get_accessor, Timestep
def pdf_initialization_assignments(lb_method, density, velocity, pdfs,
streaming_pattern='pull', previous_timestep=Timestep.BOTH):
streaming_pattern='pull', previous_timestep=Timestep.BOTH,
set_pre_collision_pdfs=False):
"""Assignments to initialize the pdf field with equilibrium"""
if isinstance(pdfs, Field):
previous_step_accessor = get_accessor(streaming_pattern, previous_timestep)
field_accesses = previous_step_accessor.write(pdfs, lb_method.stencil)
elif streaming_pattern == 'pull':
accessor = get_accessor(streaming_pattern, previous_timestep)
if set_pre_collision_pdfs:
field_accesses = accessor.read(pdfs, lb_method.stencil)
else:
field_accesses = accessor.write(pdfs, lb_method.stencil)
elif streaming_pattern == 'pull' and not set_pre_collision_pdfs:
field_accesses = pdfs
else:
raise ValueError("Invalid value of pdfs: A PDF field reference is required to derive "
......@@ -28,11 +32,15 @@ def pdf_initialization_assignments(lb_method, density, velocity, pdfs,
def macroscopic_values_getter(lb_method, density, velocity, pdfs,
streaming_pattern='pull', previous_timestep=Timestep.BOTH):
streaming_pattern='pull', previous_timestep=Timestep.BOTH,
use_pre_collision_pdfs=False):
if isinstance(pdfs, Field):
previous_step_accessor = get_accessor(streaming_pattern, previous_timestep)
field_accesses = previous_step_accessor.write(pdfs, lb_method.stencil)
elif streaming_pattern == 'pull':
accessor = get_accessor(streaming_pattern, previous_timestep)
if use_pre_collision_pdfs:
field_accesses = accessor.read(pdfs, lb_method.stencil)
else:
field_accesses = accessor.write(pdfs, lb_method.stencil)
elif streaming_pattern == 'pull' and not use_pre_collision_pdfs:
field_accesses = pdfs
else:
raise ValueError("Invalid value of pdfs: A PDF field reference is required to derive "
......
......@@ -31,6 +31,10 @@ get_weights.weights = {
1: sp.Rational(1, 9),
2: sp.Rational(1, 36),
},
7: {
0: sp.simplify(0.0),
1: sp.Rational(1, 6),
},
15: {
0: sp.Rational(2, 9),
1: sp.Rational(1, 9),
......
......@@ -2,6 +2,7 @@ import abc
from collections import namedtuple
import sympy as sp
from sympy.core.numbers import Zero
from pystencils import Assignment, AssignmentCollection
......@@ -52,6 +53,7 @@ class AbstractLbMethod(abc.ABC):
"""Returns a qxq diagonal matrix which contains the relaxation rate for each moment on the diagonal"""
d = sp.zeros(len(self.relaxation_rates))
for i in range(0, len(self.relaxation_rates)):
# note that 0.0 is converted to sp.Zero here. It is not possible to prevent this.
d[i, i] = self.relaxation_rates[i]
return d
......@@ -104,6 +106,9 @@ class AbstractLbMethod(abc.ABC):
for relaxation_rate in rr:
if relaxation_rate not in unique_relaxation_rates:
relaxation_rate = sp.sympify(relaxation_rate)
# special treatment for zero, sp.Zero would be an integer ..
if isinstance(relaxation_rate, Zero):
relaxation_rate = 0.0
if not isinstance(relaxation_rate, sp.Symbol):
rt_symbol = sp.Symbol(f"rr_{len(subexpressions)}")
subexpressions[relaxation_rate] = rt_symbol
......
......@@ -286,7 +286,7 @@ def get_equations_for_zeroth_and_first_order_moment(stencil, symbolic_pdfs, symb
dim = stencil.D
subexpressions = []
subexpressions = list()
pdf_sum = sum(symbolic_pdfs)
u = [0] * dim
for f, offset in zip(symbolic_pdfs, stencil):
......@@ -308,12 +308,16 @@ def get_equations_for_zeroth_and_first_order_moment(stencil, symbolic_pdfs, symb
rhs -= plus_terms[j]
eq = Assignment(velo_terms[i], sum(rhs))
subexpressions.append(eq)
if len(rhs) == 0: # if one of the substitutions is not found the simplification can not be applied
subexpressions = []
break
for subexpression in subexpressions:
pdf_sum = pdf_sum.subs(subexpression.rhs, subexpression.lhs)
for i in range(dim):
u[i] = u[i].subs(subexpressions[i].rhs, subexpressions[i].lhs)
if len(subexpressions) > 0:
for i in range(dim):
u[i] = u[i].subs(subexpressions[i].rhs, subexpressions[i].lhs)
equations = []
equations += [Assignment(symbolic_zeroth_moment, pdf_sum)]
......
......@@ -423,8 +423,10 @@ def create_mrt_orthogonal(stencil, relaxation_rates, maxwellian_moments=False, w
diagonal_viscous_moments = [x ** 2 + y ** 2, x ** 2]
else:
diagonal_viscous_moments = [x ** 2 + y ** 2 + z ** 2, x ** 2, y ** 2 - z ** 2]
for i, d in enumerate(MOMENT_SYMBOLS[:stencil.D]):
moments[moments.index(d ** 2)] = diagonal_viscous_moments[i]
if d ** 2 in moments:
moments[moments.index(d ** 2)] = diagonal_viscous_moments[i]
orthogonal_moments = gram_schmidt(moments, stencil, weights)
orthogonal_moments_scaled = [e * common_denominator(e) for e in orthogonal_moments]
nested_moments = list(sort_moments_into_groups_of_same_order(orthogonal_moments_scaled).values())
......@@ -603,11 +605,11 @@ def _get_relaxation_info_dict(relaxation_rates, nested_moments, dim):
for group in nested_moments:
for moment in group:
if get_order(moment) <= 1:
result[moment] = 0
result[moment] = 0.0
elif is_shear_moment(moment, dim):
result[moment] = relaxation_rates[0]
else:
result[moment] = 1
result[moment] = 1.0
# if relaxation rate for each moment is specified they are all used
if len(relaxation_rates) == number_of_moments:
......@@ -632,7 +634,7 @@ def _get_relaxation_info_dict(relaxation_rates, nested_moments, dim):
next_rr = False
for moment in group:
if get_order(moment) <= 1:
result[moment] = 0
result[moment] = 0.0
elif is_shear_moment(moment, dim):
result[moment] = shear_rr
elif is_bulk_moment(moment, dim):
......
......@@ -18,7 +18,7 @@ def cascaded_moment_sets_literature(stencil):
- Remaining groups do not govern hydrodynamic properties
Args:
stencil: instance of :class:`lbmpy.stencils.LBStencil`. Can be D2Q9, D3Q15, D3Q19 or D3Q27
stencil: instance of :class:`lbmpy.stencils.LBStencil`. Can be D2Q9, D3Q7, D3Q15, D3Q19 or D3Q27
"""
x, y, z = MOMENT_SYMBOLS
if have_same_entries(stencil, LBStencil(Stencil.D2Q9)):
......@@ -42,6 +42,18 @@ def cascaded_moment_sets_literature(stencil):
[x ** 2 * y ** 2]
]
elif have_same_entries(stencil, LBStencil(Stencil.D3Q7)):
# D3Q7 moments: https://arxiv.org/ftp/arxiv/papers/1611/1611.03329.pdf
return [
[sp.sympify(1)], # density is conserved
[x, y, z], # momentum might be affected by forcing
[x ** 2 - y ** 2,
x ** 2 - z ** 2], # shear
[x ** 2 + y ** 2 + z ** 2], # bulk
]
elif have_same_entries(stencil, LBStencil(Stencil.D3Q15)):
# D3Q15 central moments by Premnath et al. https://arxiv.org/pdf/1202.6081.pdf.
return [
......