Post · Replies · Boosts · Views · Activity

How many warps can be run in parallel on a single shader core?
The Metal feature set tables specify that, beginning with the Apple4 family, the "Maximum threads per threadgroup" is 1024. Given that a single threadgroup is guaranteed to run on the same GPU shader core, a shader core of any recent Apple GPU must be capable of running at least 1024/32 = 32 warps in parallel. From the WWDC session "Scale compute workloads across Apple GPUs" (6:17): "For relatively complex kernels, 1K to 2K concurrent threads per shader core is considered a very good occupancy." The cited sentence suggests that a single shader core is capable of running at least 2K (I assume this means 2048) threads in parallel, i.e. 2048/32 = 64 warps. However, I am curious what the theoretical maximum number of warps running in parallel on a single shader core is (it sounds like it is more than 64). The WWDC session describes 2K as only "very good" occupancy. How many threads would be "the best possible" occupancy?
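The arithmetic in the question can be written down as a small helper. This is only a sketch of the post's own calculation: the function name `simdGroupsForThreads` is made up for illustration, and the default width of 32 is the value the post assumes (at run time Metal reports it as `threadExecutionWidth` on the compute pipeline state). It says nothing about the actual hardware limit being asked about.

```cpp
#include <cassert>
#include <cstdint>

// Number of SIMD-groups ("warps" in the post's wording) needed to cover
// `threads` threads. `simdWidth` defaults to 32, the width assumed in the post.
inline std::uint32_t simdGroupsForThreads(std::uint32_t threads,
                                          std::uint32_t simdWidth = 32) {
    assert(simdWidth > 0);
    // Round up so that a partially filled SIMD-group still counts as one.
    return (threads + simdWidth - 1) / simdWidth;
}
```

With the figures from the post, 1024 threads map to 32 SIMD-groups and 2048 threads to 64.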
Replies: 1 · Boosts: 0 · Views: 437 · Activity: Aug ’24
In Metal compute kernels, when do thread variables get spilled into the device memory?
How many 32-bit variables can I use concurrently in a single thread of a Metal compute kernel without worrying about the variables getting spilled into device memory? Alternatively: how many 32-bit registers does a single thread have available to itself? Let's say that each thread of my compute kernel needs to store and work with its own array of N float variables, where N can be 128, 256, 512 or more. To achieve the maximum possible performance, I do not want the thread-local variables to get spilled into slow device memory. I want all N variables to be stored "on-chip", in the thread memory space. To make my question more concrete, let's say there is an array thread float localArray[N]. Assuming an unrealistic hypothetical scenario where localArray is the only variable in the whole kernel, what is the maximum value of N for which no portion of localArray would get spilled into device memory? I searched the Metal feature set tables, but I could not find any details.
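To make the trade-off behind the question concrete, here is a rough sketch of the footprint arithmetic. The helper names and the `budgetBytes` parameter are hypothetical; in particular, the on-chip budget is a placeholder to be filled in by the reader, not a figure published by Apple, and the real spilling behaviour is decided by the Metal compiler.

```cpp
#include <cassert>
#include <cstddef>

// Bytes one thread needs to hold `n` floats (the post's localArray[N]).
inline std::size_t bytesPerThread(std::size_t n) {
    return n * sizeof(float);
}

// Upper bound on the number of threads that could each keep `n` floats
// on-chip, given a HYPOTHETICAL on-chip budget of `budgetBytes`.
// The budget is an assumption for illustration, not a documented value.
inline std::size_t maxResidentThreads(std::size_t budgetBytes, std::size_t n) {
    assert(n > 0);
    return budgetBytes / bytesPerThread(n);
}
```

For example, N = 128 costs 512 bytes per thread, so whatever the real budget is, doubling N halves the number of threads that can stay resident without spilling.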
Replies: 0 · Boosts: 0 · Views: 326 · Activity: Aug ’24
Xcode 15 doesn't see my iPhone, but Xcode 14 does
I have two Xcode versions installed on macOS Ventura 13.4: Xcode 14.3 (14E222b) and Xcode 15.0 beta (15A5160n). I also have an iPhone 13 Pro (iOS 17.0 Developer Beta 21A5248v) and an iPad Pro (iPadOS 16.5). Xcode 14.3 shows both the iPad and the iPhone in "Devices and Simulators", but Xcode 15.0 shows only the iPad. When I click the "+" button in the bottom-left corner of the "Devices and Simulators" window in Xcode 15.0, I can see only the iPad in the list of devices. Clicking the "+" button in Xcode 14.3 reveals both devices. What am I doing wrong? How can I make Xcode 15 recognize my iPhone running iOS 17? I have already tried restarting my Mac and my iPhone several times, including force restarting, but it didn't help.
Replies: 1 · Boosts: 5 · Views: 2.6k · Activity: Jun ’23