Difference between `thread_execution_width` and `threads_per_simdgroup`

I have a compute kernel that uses simdgroup operations such as simd_shuffle_up and simd_or, and I'm looking to rewrite it to support older hardware. One computation requires the index of the thread within its simdgroup (thread_index_in_simdgroup). I was hoping to derive it from the thread's position in its threadgroup (thread_position_in_threadgroup) and the thread execution width (thread_execution_width), along with other knowledge about the size of the threadgroup, when I noticed there is also a threads_per_simdgroup attribute. The spec describes them, respectively, as:

thread_execution_width: The execution width of the compute unit.

threads_per_simdgroup: The thread execution width of a SIMD-group.

Under what conditions, if any, could these two values differ? If they do differ, is there a way to determine a thread's position in the simdgroup on hardware that doesn't support Metal 2.2?

Answered by Graphics and Games Engineer in 718337022

There is no difference between thread_execution_width and threads_per_simdgroup. We've noted in the Metal Shading Language 3.0 specification that:

thread_execution_width All OS: Since Metal 1.0. [[ Deprecated as of Metal 3.0 – use threads_per_simdgroup ]]

There's also this line:

[[threads_per_simdgroup]] and [[thread_execution_width]] are aliases of one another that reference the same concept.
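
Since the two attributes are aliases, the derivation described in the question could be sketched roughly as below. This is only a minimal sketch under assumptions the answer above does not confirm: a 1D dispatch, a threadgroup width that is a multiple of the execution width, and SIMD-groups formed from consecutive linear thread indices. The kernel name derive_lane_index and the buffer out are hypothetical, and thread_index_in_threadgroup is used in place of thread_position_in_threadgroup to get a linear index directly.

```metal
#include <metal_stdlib>
using namespace metal;

// Hypothetical kernel: records a derived (simdgroup index, lane index) pair
// for every thread, using only attributes available since Metal 1.0.
// ASSUMPTION: SIMD-groups consist of consecutive linear thread indices within
// the threadgroup. Verify this on the target hardware before relying on it.
kernel void derive_lane_index(device uint2 *out [[buffer(0)]],
                              uint gid   [[thread_position_in_grid]],
                              uint tid   [[thread_index_in_threadgroup]],
                              uint width [[thread_execution_width]])
{
    // Stand-in for [[thread_index_in_simdgroup]] under the assumption above.
    uint lane = tid % width;

    // Stand-in for [[simdgroup_index_in_threadgroup]] under the same assumption.
    uint simdgroup = tid / width;

    out[gid] = uint2(simdgroup, lane);
}
```

On hardware that does support thread_index_in_simdgroup, comparing the derived value against the built-in attribute is one way to check whether the assumption holds for a given threadgroup shape.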

