  1. Jun 18, 2019
    • Extended setup.py · f414f0bf
      Martin Bauer authored
    • Support for generated "push" PackInfos · 00a6047f
      Martin Bauer authored
      - when a kernel reads from ghost layers (previously the only option), a
        "pull" communication is required
      - if a kernel writes to the ghost layer, a "push" communication has to be
        done
      - the new PackInfo generator can now derive push and pull PackInfos from
        a given kernel
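The push/pull distinction in commit 00a6047f can be illustrated with a minimal sketch. It is not the generator's actual implementation: the `Access` record and `classify_communication` are hypothetical names, and the sketch assumes that any access with a non-zero offset may reach into a ghost layer.

```python
from typing import NamedTuple, Tuple


class Access(NamedTuple):
    """Hypothetical record of one field access inside a kernel."""
    field: str
    offset: Tuple[int, ...]  # relative cell offset, e.g. (1, 0, 0)
    is_write: bool


def classify_communication(accesses):
    """Return (pull_fields, push_fields) for a list of accesses.

    Assumption: a read with a non-zero offset touches ghost layers and
    therefore needs a pull; a write with a non-zero offset needs a push.
    """
    pull, push = set(), set()
    for access in accesses:
        if all(o == 0 for o in access.offset):
            continue  # center access never leaves the interior
        if access.is_write:
            push.add(access.field)
        else:
            pull.add(access.field)
    return sorted(pull), sorted(push)


# A kernel that reads a neighbor value and writes to the center cell
pull_kernel = [Access("src", (1, 0, 0), False), Access("dst", (0, 0, 0), True)]
# A kernel that reads the center cell and writes into a neighbor cell
push_kernel = [Access("src", (0, 0, 0), False), Access("dst", (1, 0, 0), True)]

print(classify_communication(pull_kernel))
print(classify_communication(push_kernel))
```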
  2. Jun 07, 2019
  3. May 06, 2019
  4. May 05, 2019
  5. May 03, 2019
  6. Apr 28, 2019
  7. Apr 25, 2019
  8. Apr 24, 2019
    • Improvements for GPU code generation · 0cdd23d8
      Martin Bauer authored
      - turned on restrict keyword by default (makes a large difference on GPUs)
      - smarter block indexing: the block size now adapts to the domain size
        Example: previously, a requested block size of (64, 1, 1) on a domain
        of size (1, 512, 512) resulted in (1, 1, 1) blocks; now the block size
        is changed automatically to (1, 64, 1) in this case
      - added __launch_bounds__ to kernels to allow better optimizations by
        the CUDA compiler
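The block-size example from commit 0cdd23d8 can be reproduced with a small sketch. The helper `adapt_block_size` and its redistribution strategy are assumptions made for illustration; the commit message does not describe the actual algorithm used.

```python
from math import prod


def clip_block_size(requested, domain):
    """Naive approach: clip each block dimension to the domain size."""
    return tuple(min(b, d) for b, d in zip(requested, domain))


def adapt_block_size(requested, domain):
    """Hypothetical helper: clip the block size to the domain, then hand the
    threads lost by clipping to other dimensions that still have room."""
    block = list(clip_block_size(requested, domain))
    budget = prod(requested)  # number of threads originally asked for
    for i in range(len(block)):
        spare = budget // prod(block)
        if spare <= 1:
            break  # the thread budget is used up
        # grow dimension i as far as the budget and the domain allow
        block[i] *= min(spare, domain[i] // block[i])
    return tuple(block)


print(clip_block_size((64, 1, 1), (1, 512, 512)))   # (1, 1, 1)
print(adapt_block_size((64, 1, 1), (1, 512, 512)))  # (1, 64, 1)
```

Growing a dimension only by an integer factor keeps the total thread count at or below the requested budget.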
  9. Apr 18, 2019
  10. Apr 16, 2019
  11. Apr 04, 2019
  12. Mar 27, 2019
  13. Mar 22, 2019
  14. Mar 21, 2019
  15. Mar 07, 2019
  16. Feb 26, 2019
  17. Feb 18, 2019
  18. Feb 03, 2019
  19. Jan 24, 2019
  20. Jan 23, 2019
    • waLBerla codegeneration improved · 7ac04691
      Martin Bauer authored
      - removed warnings from generated code
      - made the generated code string deterministic: generating the same
        code twice now gives binary-identical files
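Commit 7ac04691 does not say what made the output non-deterministic. One common cause in Python code generators is iterating over a set, whose order for strings can change between interpreter runs because of hash randomization. The sketch below, with the hypothetical function `generate_signature`, shows the usual remedy of sorting before emitting code.

```python
import hashlib


def generate_signature(kernel_name, param_names):
    """Hypothetical generator: emit a C prototype with a stable argument order.

    Sorting the names makes the output independent of the iteration order of
    the input collection (for example a set).
    """
    ordered = sorted(param_names)
    args = ", ".join(f"double * {name}" for name in ordered)
    return f"void {kernel_name}({args});\n"


def digest(code):
    """Hash of the generated text, useful for comparing two generator runs."""
    return hashlib.sha256(code.encode("utf-8")).hexdigest()


first = generate_signature("kernel", {"src", "dst", "omega"})
second = generate_signature("kernel", ["omega", "src", "dst"])
print(first, end="")
print(digest(first) == digest(second))  # True
```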
  21. Jan 09, 2019
  22. Nov 16, 2018
  23. Nov 14, 2018
    • Pass field information (shape,stride) as single elements instead of arr · 490d6902
      Martin Bauer authored
      - small (length < 5) arrays with shape and stride information had to be
        memcpy'd to the GPU before every kernel call
      - instead of passing the information as arrays, the single elements are
        passed
      - leads to more function arguments, but simplifies GPU kernel calls
      
      -> changes in all backends required
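The change in commit 490d6902 can be pictured as flattening each field's shape and stride arrays into individual scalar kernel arguments. The sketch below is an illustration only: `flatten_field_args` and the argument naming scheme are hypothetical and not taken from the code base.

```python
def flatten_field_args(field_name, shape, strides):
    """Hypothetical helper: turn shape/stride arrays into scalar arguments.

    Scalars can be passed by value at kernel launch, so no small array has
    to be copied to the GPU before every call.
    """
    args = {}
    for axis, extent in enumerate(shape):
        args[f"{field_name}_size_{axis}"] = extent
    for axis, stride in enumerate(strides):
        args[f"{field_name}_stride_{axis}"] = stride
    return args


# A 3D field of doubles with shape (4, 8, 16), strides given in elements
kernel_args = flatten_field_args("src", shape=(4, 8, 16), strides=(128, 16, 1))
for name, value in kernel_args.items():
    print(name, value)
```

The cost is a longer argument list: a 3D field contributes six scalars instead of two array pointers.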
  24. Nov 13, 2018
  25. Oct 29, 2018
  26. Oct 24, 2018
  27. Oct 23, 2018
  28. Oct 19, 2018
  29. Oct 18, 2018
  30. Oct 16, 2018
  31. Oct 13, 2018