Post

Replies

Boosts

Views

Activity

Reply to Number of simultaneous Metal threads
After experimenting and measuring, I've found out some numbers:     static func maxNumberOfSimultaneousThreads(device : MTLDevice) -> Int? {         switch device.name {         case "AMD Radeon Pro Vega 64": return 8192         case "AMD Radeon RX Vega 56": return 4096         case "AMD Radeon Pro Vega 20": return 4096         case "AMD Radeon R9 M370X": return 2048         case "Apple A8 GPU": return 512         case "Apple A9 GPU": return 1024         case "Apple A10 GPU": return 1024         case "Apple A11 GPU": return 1024         case "Apple A12 GPU": return 4096         case "Apple A12X GPU": return 2048         default: return nil         }     } I've measured these numbers with a dummy compute shader that uses about 4K bytes thread local memory. What it means is that for example on my iPhone XS Max (Apple A12 GPU), running the compute shader once takes as much time as running the compute shader up to 4096 times; running it more times will take at least double that run time. Not sure how reliable these numbers are, so ymmv ...
Oct ’20