Forked from waLBerla / example_app
2 commits behind, 2 commits ahead of the upstream repository.

Set up a MWE:

git clone https://i10git.cs.fau.de/walberla/example_app.git
cd example_app/apps/example_app_codegen/
# copy .cpp, .py, .prm files, update list of generated files in CMakeLists.txt, chmod +x the python file and add shebang '#!/usr/bin/python3'
cd $(git rev-parse --show-toplevel)
mkdir build
cd build

Compile with Clang in debug mode with lbmpy 0.4.4:

VERSION=0.4.4 DEPS="/work/jgrad/walberla_deps" PYTHONPATH="${DEPS}/${VERSION}/lbmpy:${DEPS}/${VERSION}/pystencils:${DEPS}/devel/walberla/python/" CC=clang CXX=clang++ cmake .. -DWALBERLA_DIR=/work/jgrad/walberla_deps/devel/walberla -DWALBERLA_BUILD_WITH_CODEGEN=ON -DCMAKE_BUILD_TYPE=Debug
VERSION=0.4.4 DEPS="/work/jgrad/walberla_deps" PYTHONPATH="${DEPS}/${VERSION}/lbmpy:${DEPS}/${VERSION}/pystencils:${DEPS}/devel/walberla/python/" make -j$(nproc)

Then compile the AVX binary separately with:

(cd /work/jgrad/walberla_deps/devel/example_app/build/apps/example_app_codegen && /usr/bin/ccache /usr/bin/clang++  -DBOOST_ALL_NO_LIB -I/work/jgrad/walberla_deps/devel/example_app/build/walberla/src -I/work/jgrad/walberla_deps/devel/walberla/src -I/work/jgrad/walberla_deps/devel/example_app/build/apps/example_app_codegen/default_codegen -isystem /work/jgrad/walberla_deps/devel/example_app/src -isystem /work/jgrad/walberla_deps/devel/example_app/build/src -isystem /work/jgrad/walberla_deps/0.4.4/pystencils/pystencils/include -isystem /usr/lib/x86_64-linux-gnu/openmpi/include/openmpi -isystem /usr/lib/x86_64-linux-gnu/openmpi/include  -Wall -Wconversion -Wshadow -Wno-c++11-extensions -Qunused-arguments -pthread -pthread -g   -std=gnu++17 -DWALBERLA_BUILD_WITH_AVX -mavx2 -o CMakeFiles/ExampleAppCodegen.dir/ExampleAppAVX.cpp.o -c /work/jgrad/walberla_deps/devel/example_app/apps/example_app_codegen/ExampleApp.cpp)
(cd /work/jgrad/walberla_deps/devel/example_app/build/apps/example_app_codegen && /tikhome/jgrad/.local/lib/python3.8/site-packages/cmake/data/bin/cmake -E cmake_link_script CMakeFiles/ExampleAppCodegen.dir/link.txt --verbose=1
/usr/bin/clang++   -Wall -Wconversion -Wshadow -Wno-c++11-extensions -Qunused-arguments -pthread -pthread -g    CMakeFiles/ExampleAppCodegen.dir/ExampleAppAVX.cpp.o  -o ExampleAppCodegenAVX  -Wl,-rpath,/usr/lib/x86_64-linux-gnu/openmpi/lib ../../walberla/src/blockforest/libblockforest.a ../../walberla/src/core/libcore.a ../../walberla/src/field/libfield.a ../../walberla/src/lbm/liblbm.a ../../walberla/src/geometry/libgeometry.a ../../walberla/src/timeloop/libtimeloop.a ../../walberla/src/gui/libgui.a libLatticeModelGenerated.a ../../walberla/src/domain_decomposition/libdomain_decomposition.a ../../walberla/src/vtk/libvtk.a ../../walberla/src/boundary/libboundary.a ../../walberla/src/blockforest/libblockforest.a ../../walberla/src/core/libcore.a ../../walberla/src/field/libfield.a ../../walberla/src/lbm/liblbm.a ../../walberla/src/geometry/libgeometry.a ../../walberla/src/timeloop/libtimeloop.a ../../walberla/src/gui/libgui.a libLatticeModelGenerated.a ../../walberla/src/domain_decomposition/libdomain_decomposition.a ../../walberla/src/vtk/libvtk.a ../../walberla/src/boundary/libboundary.a ../../walberla/src/blockforest/libblockforest.a ../../walberla/src/core/libcore.a ../../walberla/src/field/libfield.a ../../walberla/src/lbm/liblbm.a ../../walberla/src/geometry/libgeometry.a ../../walberla/src/timeloop/libtimeloop.a ../../walberla/src/gui/libgui.a libLatticeModelGenerated.a ../../walberla/src/domain_decomposition/libdomain_decomposition.a ../../walberla/src/vtk/libvtk.a ../../walberla/src/boundary/libboundary.a /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi_cxx.so /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so /usr/lib/libpfft.so /usr/lib/x86_64-linux-gnu/libfftw3.so /usr/lib/x86_64-linux-gnu/libfftw3_mpi.so ../../walberla/extern/lodepng/liblodepng.a /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi_cxx.so /usr/lib/libpfft.so /usr/lib/x86_64-linux-gnu/libfftw3.so /usr/lib/x86_64-linux-gnu/libfftw3_mpi.so)

Run the binaries with the parameter files:

apps/example_app_codegen/ExampleAppCodegen ../apps/example_app_codegen/ExampleApp.prm
apps/example_app_codegen/ExampleAppCodegenAVX ../apps/example_app_codegen/ExampleApp.prm

The AVX binary fails at random with a SIGSEGV because the fields are allocated with 8-byte alignment, whereas 32-byte alignment is required for aligned 256-bit AVX loads and stores of packed doubles. The src/field/Field.impl.h file has ifdefs to select the correct alignment if AVX2 is defined, however:

  • the alignment value is 16 instead of 32
  • the sizeof(T) < alignment conditional is evaluated with T=const float [13], but it was probably meant to test a hypothetical underlying type T_underlying=const float
  • the conditional evaluates to false but takes the true branch in GDB (in the ESPResSo bridge, the false branch is taken)
  • the allocator_ shared pointer should dereference to a walberla::field::AllocateAligned<unsigned char, 16> object, but instead it dereferences to a generic allocator with 8-byte alignment
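For illustration only, the usual way an aligned allocator obtains a suitably aligned block is to over-allocate and then skip forward to the next boundary. The Python sketch below shows that technique with ctypes. The helper name aligned_buffer is hypothetical and this is not how waLBerla's AllocateAligned is implemented; it only demonstrates the property the field allocation is expected to satisfy.

```python
import ctypes


def aligned_buffer(nbytes: int, alignment: int):
    """Return (raw, view) where `view` is an `nbytes`-long ctypes array whose
    address is a multiple of `alignment`. Hypothetical helper for illustration."""
    # Over-allocate so that an aligned start address is guaranteed to fit.
    raw = ctypes.create_string_buffer(nbytes + alignment)
    base = ctypes.addressof(raw)
    # Number of bytes to skip to reach the next multiple of `alignment`.
    offset = (-base) % alignment
    # Create a view into `raw` that starts at the aligned address.
    view = (ctypes.c_char * nbytes).from_buffer(raw, offset)
    return raw, view


# Room for 4 doubles (one 256-bit AVX register) on a 32-byte boundary.
raw, view = aligned_buffer(4 * 8, 32)
print(ctypes.addressof(view) % 32)  # 0 means the view is 32-byte aligned
```

An allocation that only guarantees 8-byte alignment lands on a 32-byte boundary for some addresses and not for others, which is consistent with the crash appearing at random.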

GDB setup:

gdb --args apps/example_app_codegen/ExampleAppCodegenAVX ../apps/example_app_codegen/ExampleApp.prm
(gdb) b /work/jgrad/walberla_deps/devel/walberla/src/field/Field.impl.h:341
(gdb) run
(gdb) tui e

Execution was then stepped through in GDB to check the values in the conditional as well as the allocated pointer, which is often only 8-byte aligned instead of 16-byte or 32-byte aligned:

(gdb) print mem
$1 = (double *) 0x15554d528028
(gdb) python print(0x15554d528028 / 32)
733003551745.25
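The fractional part .25 in the division above corresponds to a remainder of 8 bytes. A modulo makes this explicit and can be checked against several alignments at once. The short sketch below uses the pointer value from the GDB session; the helper name misalignment is hypothetical and introduced only for this example.

```python
def misalignment(address: int, alignment: int) -> int:
    """Return how many bytes `address` lies past the previous `alignment` boundary."""
    return address % alignment


mem = 0x15554d528028  # pointer value printed by GDB in the session above

for alignment in (8, 16, 32):
    # A result of 0 means the address satisfies that alignment.
    print(alignment, misalignment(mem, alignment))
# prints:
# 8 0
# 16 8
# 32 8
```

The pointer is therefore 8-byte aligned but neither 16-byte nor 32-byte aligned. The same check can be run directly at the GDB prompt with python print(0x15554d528028 % 32).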

Then issue the continue command repeatedly until the SIGSEGV is hit.