mem_none vs mem_threadgroup

Question

The documentation in the Metal Shading Language spec is as follows:

```
mem_none
```
In this case, no memory fence is applied, and threadgroup_barrier acts only as an execution barrier.
```
mem_threadgroup
```
Ensure correct ordering of memory operations to threadgroup memory for threads in a threadgroup.

Does this mean whenever we are using threadgroup memory, we need to use `mem_threadgroup` for our barriers? If so, under what circumstances does `mem_none` suffice?

I've seen code where threadgroup memory is loaded, but `mem_none` is used (is this code incorrect?). And yet another example where `mem_threadgroup` is used.

Metal

Posted by

Audulus

Answer 1

The `mem_flags` set in the barrier tell the compiler which caches need to be flushed so that all threads can see the same thing when your code executes the barrier. If you use `mem_none`, no caches will be flushed and it's undefined whether values written by one thread to any type of memory will be seen by any other thread. If you set `mem_threadgroup`, you can be assured that any values written to threadgroup memory (and only threadgroup memory) can be seen by other threads after the barrier.

So to answer your question: if your kernel isn't dependent on values written by another thread into threadgroup memory, you can use `mem_none`. But if you're using threadgroup memory in the first place, it's likely (but not a given) that you're using it to communicate between threads, so you'll probably want to set `mem_threadgroup`.
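
To make that concrete, here is a minimal sketch of the common case where one thread reads a value that a different thread wrote to threadgroup memory. The kernel name `shift_within_group`, the buffer indices, and the fixed tile size of 256 are assumptions for illustration only. It also assumes a 1D dispatch with at most 256 threads per threadgroup and a grid size that is a multiple of the threadgroup size.

```metal
#include <metal_stdlib>
using namespace metal;

// Hypothetical kernel: each thread outputs the value loaded by its
// right-hand neighbor within the same threadgroup.
kernel void shift_within_group(device const float *in  [[buffer(0)]],
                               device float       *out [[buffer(1)]],
                               uint gid   [[thread_position_in_grid]],
                               uint lid   [[thread_position_in_threadgroup]],
                               uint tsize [[threads_per_threadgroup]])
{
    // Assumed upper bound on threads per threadgroup.
    threadgroup float tile[256];

    // Each thread writes its own slot.
    tile[lid] = in[gid];

    // The next read targets a slot written by ANOTHER thread, so the
    // barrier must also order threadgroup memory operations.
    // With mem_flags::mem_none this read would not be guaranteed to
    // observe the neighbor's write.
    threadgroup_barrier(mem_flags::mem_threadgroup);

    out[gid] = tile[(lid + 1) % tsize];
}
```

If every thread only read back `tile[lid]`, the slot it wrote itself, no cross-thread visibility would be needed, which is the kind of situation where `mem_none` could suffice as a pure execution barrier.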

Posted by

Graphics and Games Engineer

Answer 2

I saw the same question on Stack Overflow. Was it asked by you? And did you get a real answer to it?

Posted by

xiaoyili
