I have a compute kernel that makes use of simdgroup operations such as simd_shuffle_up, simd_or, etc., and I'm looking to rewrite the kernel to support older hardware. One such computation requires that I know the index of the thread in the simdgroup (thread_index_in_simdgroup). I was hoping to derive it from the thread's position in its threadgroup (thread_position_in_threadgroup) and the thread execution width (thread_execution_width), along with other knowledge about the size of the threadgroup, when I noticed there was also the threads_per_simdgroup attribute. The spec describes both respectively as:
thread_execution_width: The execution width of the compute unit.
threads_per_simdgroup: The thread execution width of a SIMD-group.
Under what conditions, if any, could these two values differ? If they do differ, is there a way to determine a thread's position in the simdgroup on hardware that doesn't support Metal 2.2?
There is no difference between thread_execution_width and threads_per_simdgroup. We've noted in the Metal Shading Language 3.0 specification that:
thread_execution_width: All OS: Since Metal 1.0. [[ Deprecated as of Metal 3.0 – use threads_per_simdgroup ]]
There's also this line:
[[threads_per_simdgroup]] and [[thread_execution_width]] are aliases of one another that reference the same concept.
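
Since the two attributes are aliases, a kernel targeting hardware without Metal 2.2 could read thread_execution_width and try to derive the lane index arithmetically. The following is only a minimal host-side C++ sketch of that arithmetic, not a confirmed method: it assumes that threads are packed into SIMD-groups in linear order (x fastest, then y, then z), which nothing quoted above guarantees and which should be verified on the target hardware. The UInt3 struct and the function names are hypothetical stand-ins for illustration; in a kernel the inputs would come from thread_position_in_threadgroup, threads_per_threadgroup, and thread_execution_width.

```cpp
#include <cstdint>

// Hypothetical stand-in for Metal's uint3, so the arithmetic can be checked on the host.
struct UInt3 {
    std::uint32_t x, y, z;
};

// Flatten a 3D position in the threadgroup into a linear index
// (x varies fastest, then y, then z).
std::uint32_t linear_index_in_threadgroup(UInt3 pos, UInt3 threads_per_threadgroup) {
    return (pos.z * threads_per_threadgroup.y + pos.y) * threads_per_threadgroup.x + pos.x;
}

// ASSUMPTION: SIMD-groups are formed from consecutive linear indices.
// Under that assumption, the lane index is the remainder modulo the execution width.
std::uint32_t lane_index_in_simdgroup(UInt3 pos, UInt3 threads_per_threadgroup,
                                      std::uint32_t execution_width) {
    return linear_index_in_threadgroup(pos, threads_per_threadgroup) % execution_width;
}

// Under the same assumption, the SIMD-group's index within the threadgroup
// is the quotient.
std::uint32_t simdgroup_index_in_threadgroup(UInt3 pos, UInt3 threads_per_threadgroup,
                                             std::uint32_t execution_width) {
    return linear_index_in_threadgroup(pos, threads_per_threadgroup) / execution_width;
}
```

For example, with a 16x4x1 threadgroup and an execution width of 32, the thread at position (3, 2, 0) has linear index 35, so under the stated assumption it would be lane 3 of SIMD-group 1. A one-dimensional threadgroup whose size is a multiple of the execution width keeps the mapping simplest to reason about.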