The documentation in the Metal Shading Language spec is as follows:
mem_none: In this case, no memory fence is applied, and threadgroup_barrier acts only as an execution barrier.
mem_threadgroup: Ensure correct ordering of memory operations to threadgroup memory for threads in a threadgroup.
Does this mean that whenever we use threadgroup memory, we need to pass mem_threadgroup to our barriers? If so, under what circumstances does mem_none suffice?

I've seen code where threadgroup memory is loaded but mem_none is used (is this code incorrect?), and yet another example where mem_threadgroup is used.
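To make the question concrete, here is a minimal sketch of the pattern being asked about: each thread writes one element to threadgroup memory, the threadgroup synchronizes, and each thread then reads an element written by a different thread. The kernel name, buffer indices, and the reversal logic are hypothetical and are not taken from either example mentioned above.

```metal
#include <metal_stdlib>
using namespace metal;

// Hypothetical kernel: reverses the elements handled by each threadgroup.
kernel void reverse_in_group(device const float *in      [[buffer(0)]],
                             device float       *out     [[buffer(1)]],
                             threadgroup float  *scratch [[threadgroup(0)]],
                             uint lid     [[thread_position_in_threadgroup]],
                             uint gid     [[thread_position_in_grid]],
                             uint tg_size [[threads_per_threadgroup]])
{
    // Each thread stores its own element into threadgroup memory.
    scratch[lid] = in[gid];

    // Per the spec text quoted above, mem_threadgroup orders memory
    // operations to threadgroup memory across the threadgroup, whereas
    // mem_none would make this an execution barrier only.
    threadgroup_barrier(mem_flags::mem_threadgroup);

    // Each thread reads an element that a different thread wrote.
    out[gid] = scratch[tg_size - 1 - lid];
}
```

The question is whether replacing the flag in this kind of write-barrier-read sequence with mem_flags::mem_none would still be correct.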