What is the purpose of threadgroup memory in Metal?

Question

Created Sep ’20

Replies 2

Boosts 0

Views 1.9k

Participants 3

I have been working with Metal for a little while now and I have encountered the threadgroup address space. After reading a little about it in Apple’s MSL reference, I am aware of how threadgroups are formed and how they can be split into SIMD groups; however, I have not yet seen threadgroup memory in action. Can someone give me some examples of when/how threadgroup memory is used?

Specifically, how is the [[threadgroup(n)]] attribute used in both kernel and fragment shaders? References to WWDC videos, articles, and/or other resources would be appreciated.

Answered by darknoon in 642604022

The purpose of threadgroup or tile memory is that it is much faster to access than system memory, since it's directly on the GPU hardware as opposed to needing to send loads and stores out over the memory bus. You can store intermediate results there to avoid expensive ram traffic. Hope that helps.

Boost

Answer 1

darknoon OP

Oct ’20

Accepted Answer

The purpose of threadgroup or tile memory is that it is much faster to access than system memory, since it's directly on the GPU hardware as opposed to needing to send loads and stores out over the memory bus. You can store intermediate results there to avoid expensive ram traffic. Hope that helps.

1

Answer 2

acousticluminosity OP

Oct ’20

This WWDC talk, https://developer.apple.com/videos/play/tech-talks/10858, near the 17 min mark discusses parallel reductions on the GPU, and the code sample for this at the 21 min mark shows an example usage of threadgroup memory. Of course this is just one of many ways threadgroup memory could be used, but I thought you might appreciate a concrete example.

2