- Jul 26, 2019
-
-
Martin Bauer authored
-
- Jul 11, 2019
-
-
Martin Bauer authored
-
Martin Bauer authored
-
- Jun 18, 2019
-
-
Martin Bauer authored
-
Martin Bauer authored
- when a kernel reads from ghost layers (previously the only option), a "pull" communication is required
- if a kernel writes to the ghost layer, a "push" communication has to be done
- the new PackInfo generator can now derive push and pull PackInfos from a given kernel
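The difference between the two directions can be sketched on two neighboring 1D blocks. This is only an illustration, not the generated PackInfo code: the functions `pull` and `push`, the list-based blocks, and the choice to overwrite (rather than accumulate) on push are all assumptions made for the example.

```python
def pull(left, right, gl=1):
    """Fill ghost cells with copies of the neighbor's outermost inner cells.

    Needed before a kernel that READS from ghost layers.
    Each block is a list laid out as [ghost | inner cells | ghost].
    """
    left[-gl:] = right[gl:2 * gl]
    right[:gl] = left[-2 * gl:-gl]


def push(left, right, gl=1):
    """Send values that a kernel WROTE into ghost cells to the neighbor.

    The ghost cell mirrors the neighbor's outermost inner cell, so the
    written value is copied there (overwriting is an assumption here).
    """
    right[gl:2 * gl] = left[-gl:]
    left[-2 * gl:-gl] = right[:gl]


# pull: ghost cells receive neighbor data
a, b = [0, 1, 2, 3, 0], [0, 10, 11, 12, 0]
pull(a, b)

# push: data written to ghost cells is delivered to the neighbor's interior
c, d = [0, 1, 2, 3, 99], [77, 10, 11, 12, 0]
push(c, d)
```

After `pull`, the right ghost cell of `a` holds `10` and the left ghost cell of `b` holds `3`. After `push`, the value `99` written into the ghost cell of `c` ends up in the first inner cell of `d`, and `77` ends up in the last inner cell of `c`.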
-
- Jun 07, 2019
-
-
Martin Bauer authored
-
- May 06, 2019
-
-
Martin Bauer authored
-
- May 05, 2019
-
-
Martin Bauer authored
-
- May 03, 2019
-
-
Martin Bauer authored
-
- Apr 28, 2019
-
-
Martin Bauer authored
-
Martin Bauer authored
-
- Apr 25, 2019
-
-
Martin Bauer authored
-
- Apr 24, 2019
-
-
Martin Bauer authored
- turned on the restrict keyword by default (makes a large difference on GPUs)
- smarter block indexing: the block size now changes depending on the domain size. Example: previously there were (1, 1, 1) blocks when the requested block size was (64, 1, 1) and the domain size was (1, 512, 512); now the block size is changed automatically to (1, 64, 1) in this case
- added __launch_bounds__ to kernels to allow better optimizations by the CUDA compiler
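One way such a block-size adaptation could work is sketched here. The function `adapt_block_size` and its halving/doubling strategy are hypothetical and chosen only to reproduce the example from the commit message; the actual implementation may use a different rule.

```python
def adapt_block_size(requested, domain):
    """Move threads from axes where the domain is too small to other axes.

    requested, domain: tuples of equal length (e.g. 3 entries for x, y, z).
    The total number of threads per block is kept when the domain allows it.
    """
    block = list(requested)
    for i in range(len(block)):
        if block[i] <= domain[i]:
            continue
        # Threads this axis cannot use because the domain is too small
        spare = block[i] // domain[i]
        block[i] = domain[i]
        # Hand the spare threads to other axes, doubling while they fit
        for j in range(len(block)):
            if j == i:
                continue
            while spare > 1 and block[j] * 2 <= domain[j]:
                block[j] *= 2
                spare //= 2
    return tuple(block)


print(adapt_block_size((64, 1, 1), (1, 512, 512)))  # (1, 64, 1)
```

A request that already fits the domain, such as `(64, 1, 1)` on a `(512, 512, 512)` domain, is returned unchanged.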
-
- Apr 18, 2019
-
-
Martin Bauer authored
- added missing pragma once
- changed static variable to member in overlap sweep; when called with changing block sizes, the static variable led to wrong results
-
- Apr 16, 2019
-
-
Martin Bauer authored
-
- Apr 04, 2019
-
-
Martin Bauer authored
-
- Mar 27, 2019
-
-
Martin Bauer authored
-
- Mar 22, 2019
-
-
Martin Bauer authored
-
- Mar 21, 2019
-
-
Martin Bauer authored
-
Martin Bauer authored
This restructuring allows for easier separation of modules into separate repositories later. Also, pip install with a repo URL can now be used. The setup.py files have also been updated to correctly reference each other. Module versions are not extracted from git state
-
- Mar 07, 2019
-
-
Martin Bauer authored
-
Martin Bauer authored
-
Martin Bauer authored
-
- Feb 26, 2019
-
-
Martin Bauer authored
-
Martin Bauer authored
-
Martin Bauer authored
- counter-based Philox RNG: counter/key is filled with the cell coordinate and optional external parameters like block position and time step
- works on CPU and GPU
- on CPU only for non-vectorized versions
- introduced a more flexible "CustomCodeNode" that can inject backend-specific hand-written code
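The idea of a counter-based generator can be illustrated with a pure-Python sketch of the Philox-4x32 round structure. This is not the generated kernel code: the helper `cell_random`, the way coordinates, time step, seed and block index are packed into counter and key, and the conversion to floats are assumptions for the example. The multiplier and key-increment constants are the ones commonly given for Philox-4x32 and should be checked against a reference implementation before relying on the exact bit stream.

```python
MASK32 = 0xFFFFFFFF
M0, M1 = 0xD2511F53, 0xCD9E8D57  # round multipliers
W0, W1 = 0x9E3779B9, 0xBB67AE85  # per-round key increments


def philox4x32(counter, key, rounds=10):
    """Map a 4x32-bit counter and a 2x32-bit key to four 32-bit words."""
    c0, c1, c2, c3 = counter
    k0, k1 = key
    for r in range(rounds):
        if r > 0:
            # Bump the key between rounds
            k0 = (k0 + W0) & MASK32
            k1 = (k1 + W1) & MASK32
        p0 = M0 * c0  # 64-bit products, split into high and low words
        p1 = M1 * c2
        c0, c1, c2, c3 = (
            (p1 >> 32) ^ c1 ^ k0,
            p1 & MASK32,
            (p0 >> 32) ^ c3 ^ k1,
            p0 & MASK32,
        )
    return c0, c1, c2, c3


def cell_random(x, y, z, time_step, seed=0, block=0):
    """Four floats in [0, 1) for one cell; no generator state is stored."""
    counter = (x & MASK32, y & MASK32, z & MASK32, time_step & MASK32)
    key = (seed & MASK32, block & MASK32)
    return [u / 2**32 for u in philox4x32(counter, key)]


values = cell_random(1, 2, 3, time_step=0)
```

Because the output depends only on the counter and key, every cell can compute its random numbers independently, which is what makes the scheme suitable for both CPU loops and GPU threads.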
-
- Feb 18, 2019
-
-
Martin Bauer authored
-
Martin Bauer authored
-
- Feb 03, 2019
-
-
Martin Bauer authored
-
- Jan 24, 2019
-
-
Martin Bauer authored
-
- Jan 23, 2019
-
-
Martin Bauer authored
- removed warnings from generated code
- made generated code string deterministic: generating the same code twice now gives binary-identical files
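A common source of non-deterministic output in a code generator is iterating over an unordered collection. The following sketch shows the general remedy of sorting before emitting; the function `kernel_signature` and the emitted signature format are hypothetical and do not reproduce the actual generator.

```python
def kernel_signature(name, parameters):
    """Emit a C function signature from an unordered set of parameter names.

    Sorting fixes the order, so the emitted string does not depend on
    set iteration order (which can vary between interpreter runs).
    """
    ordered = sorted(parameters)
    arguments = ", ".join("double * " + p for p in ordered)
    return f"void {name}({arguments})"


print(kernel_signature("kernel", {"src", "dst"}))
# void kernel(double * dst, double * src)
```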
-
- Jan 09, 2019
-
-
Martin Bauer authored
-
- Nov 16, 2018
-
-
Martin Bauer authored
-
- Nov 14, 2018
-
-
Martin Bauer authored
- small (length < 5) arrays with shape and stride information had to be memcpy'd to the GPU before every kernel call
- instead of passing the information as arrays, the single elements are now passed
- leads to more function arguments, but simplifies GPU kernel calls -> changes in all backends required
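The change can be pictured as flattening each field's shape and stride arrays into individual scalar kernel arguments. The helper `flatten_field_args` and the argument naming scheme below are hypothetical and serve only to illustrate the idea.

```python
def flatten_field_args(field_name, shape, strides):
    """Turn shape/stride arrays into one scalar argument per element.

    Scalars can be passed by value in the kernel call, so no separate
    device copy of small arrays is needed before each launch.
    """
    assert len(shape) == len(strides)
    args = {}
    for i, (size, stride) in enumerate(zip(shape, strides)):
        args[f"_size_{field_name}_{i}"] = size
        args[f"_stride_{field_name}_{i}"] = stride
    return args


print(flatten_field_args("f", (4, 8), (8, 1)))
```

For a 2D field this yields four scalar arguments instead of two array pointers, which is the trade-off the commit message describes: more function arguments, but simpler kernel calls.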
-
- Nov 13, 2018
-
-
Martin Bauer authored
-
Martin Bauer authored
-
- Oct 29, 2018
-
-
Martin Bauer authored
-
- Oct 24, 2018
-
-
Martin Bauer authored
-
- Oct 23, 2018
-
-
Martin Bauer authored
-
- Oct 19, 2018
-
-
Martin Bauer authored
-