Thread safety
This MR introduces a simple check for thread safety in the kernel constrains check. If the assignments are not thread safe, optimizations like OpenMP or GPU should fail.
Merge request reports
Activity
requested review from @kuron
assigned to @holzer
How does this tie in with the existing
check_independence_condition
/check_double_write_condition
? It seems like they also check for things that make loops thread-unsafe.Edited by Michael Kuronadded 1 commit
- 86a6112e - Fuse thread safety check with existing check
This is honestly a very good question. I just noticed that I am allowed in pystencils to generate a basic kernel with a 5 point stencil update without writing to a 'dst' field. This results in a race condition; thus, the code can not be parallelised. Therefore, I implemented this check quickly. Thinking about it again (and especially with the previous checks in mind), I think that
check_independence_condition
should exactly check that. Therefore, I fused my implementation with the independence checks.I checked all the failing tests, and I would say they were actually problematic. They were all Jacobi kernels with no src, dst split. For all of them, I added a dst field and then performed a swap after every timestep. Furthermore, I noticed that indexed kernels fail with the check now. It makes sense because, for indexed kernels, you never know where you actually read or write because the information is created at runtime. Thus I deactivated the check in the boundary handling.
232 232 233 233 config = ps.CreateKernelConfig(data_type=dtype, 234 234 default_number_float=dtype, 235 cpu_vectorize_info=opt) 235 cpu_vectorize_info=opt, 236 skip_independence_check=True) The reads and writes of the source vector are not independent. If the thread updating, for example,
src[1, 0]
, is faster or slower, src[0,0]
is or isn't updated yet. Thus OpenMP or GPU simulations running this kernel will not be deterministic. This is what the independence check is looking for.My proposed enhancement checks if reads and writes only occur at the exact same position. Only in these cases, thread safety is guaranteed if I am not overlooking something.
Thus only assignments like
src[0,0] <-- 2*src[0,0]
ordst[0,0] <-- src[1,0]
are possible. In the later case usingsrc
on the lhs would fire the checkEdited by Markus HolzerOh, I didn't realize that the lhs of the assignment used the same field. Since this is just a regression test for #62 (closed), I guess it doesn't matter whether it's correct/threadsafe. Did #62 (closed) only occur with this thread-unsafe assignment or can we make the regression test threadsafe by introducing a dst field?
Sorry for the delay. #62 (closed) appeared due to the squared variables in the denominator. Usually, we simplify squares beforehand but this did not work, because the squared variable appeared in a Unevaluated node, because the rational got simplified beforehand.
This means we can make this test thread safe
changed this line in version 5 of the diff
mentioned in commit 5e4c4ad7