With a threadsPerGrid of (10,1,1) and:
Code Block
[[kernel]] void compute_shader(device atomic_int& incremental [[buffer(0)]],
                               ushort lid [[thread_position_in_threadgroup]])
{
    threadgroup atomic_int local_atomic {0};

    atomic_fetch_add_explicit(&local_atomic, 1, memory_order_relaxed);

    threadgroup_barrier(mem_flags::mem_threadgroup);

    if (lid == 0) {
        int local_non_atomic = atomic_load_explicit(&local_atomic, memory_order_relaxed);
        atomic_fetch_add_explicit(&incremental, local_non_atomic, memory_order_relaxed);
    }
}
I expect:
Code Block
10
20
30
...
But I get:
Code Block
1125974026
1125974036
-2000908258
-832823256
...
github.com/quaternionboy/Atomic
There are known issues with initializing threadgroup variables at the point of declaration.
Please try initializing local_atomic in the following way:
Code Block
threadgroup atomic_int local_atomic;
if (lid == 0) atomic_store_explicit(&local_atomic, 0, memory_order_relaxed);
threadgroup_barrier(mem_flags::mem_threadgroup);
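For reference, here is a rough sketch of how that initialization might slot into the kernel from the question. It assumes the rest of the shader stays as posted; the `#include` and `using` lines and the variable name `local_count` are my own additions, and I have not run this against the linked repository.

```metal
#include <metal_stdlib>
using namespace metal;

[[kernel]] void compute_shader(device atomic_int& incremental [[buffer(0)]],
                               ushort lid [[thread_position_in_threadgroup]])
{
    // Declare without an initializer, then let one thread zero the counter.
    threadgroup atomic_int local_atomic;
    if (lid == 0) atomic_store_explicit(&local_atomic, 0, memory_order_relaxed);

    // Every thread waits here, so nobody increments before the store is visible.
    threadgroup_barrier(mem_flags::mem_threadgroup);

    // Each thread in the threadgroup contributes 1.
    atomic_fetch_add_explicit(&local_atomic, 1, memory_order_relaxed);

    // Wait until all increments in this threadgroup have landed.
    threadgroup_barrier(mem_flags::mem_threadgroup);

    // One thread per threadgroup publishes the threadgroup total to device memory.
    if (lid == 0) {
        int local_count = atomic_load_explicit(&local_atomic, memory_order_relaxed);
        atomic_fetch_add_explicit(&incremental, local_count, memory_order_relaxed);
    }
}
```

The first barrier is the important part: without it, a thread other than thread 0 could increment `local_atomic` before it has been zeroed. With a threadsPerGrid of (10,1,1), each dispatch should then add 10 to `incremental` however the grid is split into threadgroups, since every threadgroup adds its own thread count.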