Why aren't the Metal Feature Set Tables up to date?

I am a Metal developer who frequently uses the Metal feature set tables. In 2020, just after the iPhone 12 event, the feature set tables were updated immediately to reflect the new A14 GPU. Although Apple released a video detailing the A15's new features and updated the MSL specification, they haven't added A15 to the feature set tables. That would be really helpful in case there were any small additions from A14 -> A15 that weren't documented in the video.

Also, the M1 is in the Apple7 family, but it isn't on the feature set tables because they were last updated before the M1 event. Is Apple waiting until the M2 event in April to add Apple8 to the feature set tables?

Posted by philipturner
  • Better to send a bug report than to ask why. Have you already sent one?

  • No, and I don’t think they’ll respond. My last bug report was for a 2-minute bug fix and it took them 3 months to respond. I even called them on the phone and pointed out on a developer forums thread that they were ignoring it.


Replies

[this comment was accidentally placed here and relocated to a different position on the thread by the author]

I am also very confused by the absence of A15 in the feature tables. Note that MTLGPUFamily.apple8 (which probably refers to A15) is also undocumented. Come on, Apple, you can do better.

Regarding M1, that one shares the capabilities with A14 (same GPU), so it does not really need a separate feature entry. Would still be nice if they put in M1/A14 for clarity.

They're finally up to date! Thanks for adding a description of Apple7 Mac GPUs and the Apple8 iOS GPU! Hopefully, you'll update it more quickly for the Apple9 family with A16/M2 in fall 2022. Those two chips may be produced on different nodes (4-nanometer vs. 3-nanometer), so we'll have to wait and see whether they have the same architecture.

https://developer.apple.com/metal/Metal-Feature-Set-Tables.pdf

P.S. Thanks for making the enumeration names reflect Swift. It seems like Objective-C has become a relic of the past, and most developers now use the newer language.

It's not quite correct to lump M1 in with A14. It has BC texture compression support that is not reflected in that table. Now I wish the iPhone/iPad did too.

Some of the comments I made above sounded a little impatient, which might be justified given how infrequently the document is updated, but I could have been more considerate. Setting that aside, there is one major inconsistency between the MSL specification and the feature set tables. Section 6.15.2.6, Atomic Modify Functions (64 Bits), says "see the Metal Feature Set Tables to determine which GPUs support this feature." No entry in the Tables describes 64-bit atomics, and the link in the PDF doesn't open any website.

This causes major difficulties because I have to physically test the feature on A15, A14, AMD, and Intel GPUs just to see where it's supported. It seems not to work on Apple7 or Apple8, and I suspect it would fall under "Varies" for Mac2. Could the Tables or MSL be changed to fix this inconsistency, and to add A16 to Apple8?

@philipturner I think your frustration is justified. Apple's developer relations are unfortunately a mess. That is not a criticism of the fine people who do all the hard work in the background and occasionally help us on these forums, but of the obvious lack of structure and ownership in these matters. The lack of updates to the Metal Feature Set Tables is just one symptom of a wider systemic problem. For example, the Metal Shading Language specification is very difficult to use as a reference tool due to subpar formatting and a lack of hyperlinks. The API documentation is also lacklustre, incomplete and difficult to navigate. Forum communication is almost non-existent. It would be great if Apple considered creating a role dedicated to improving these aspects, because it seems like this is something nobody really feels responsible for.

This is also true of AMD and their public relations about ROCm, and the exact opposite is true of NVIDIA, which provides very reliable support for CUDA users. The disparity in high-quality support is why many people try porting CUDA code to HIP, then give up. If Apple wants the M1 GPU to be viable for high-performance computing, they need a GPGPU-first API on par with HIP or SYCL. I'm currently working to make that a reality with SYCL.

To Apple's Metal team, here's what we need for the M1 GPUs to be viable for HPC, for the rest of the 21st century. This is what I request from you:

  • Let the GPU and CPU share the same address space. Graphics APIs typically do not allow this, but compute-oriented APIs (OpenCL SVM, CUDA UVM, SYCL USM) use it quite frequently. Several code bases use pointer sharing between CPU and GPU to make implementing GPGPU easier. This should be trivial since the M1 GPU has shared memory in hardware.
  • Open-source the M1 OpenCL driver and fully document the AIR bytecode representation. We need high-fidelity translation of OpenCL-flavored SPIR-V -> AIR to create "MoltenCL" and a hipSYCL backend.
  • Open-source some kernels of Metal Performance Shaders, at least enough to create a BLAS library - just the M1 variants of these kernels. I will be using double-precision emulation to create the double-precision counterparts to single-precision functions.

I am trying to be courteous, but the nature of this comment may give a different impression. Please, could someone on the developer team address this issue? I'm fine with any communication medium.

  • I see a strange phenomenon where certain posts on the developer forums don't show up, except when I view them under my login (e.g. one of the MetalFX comments). If the comment above was in fact censored, I apologize for addressing someone in such an unprofessional way and prompting that action. My main motivation: Apple is lagging behind other vendors for HPC, and I think we all want that to change.

  • @philipturner the first point might not be as straightforward as one thinks. Physical memory is shared, but the CPU and the GPU have different virtual memory tables, and it's not clear that they can be shared with ease... there might be some subtle hardware differences that make this hard or even impossible on current hardware.

  • This has already been implemented in current hardware. Intel integrated GPUs support USM pointers in oneAPI. However, I recently thought of a good idea that translates CPU addresses into GPU addresses. It has higher performance but limits the maximum memory possible. So the first point isn't as significant anymore, although it would still be a useful feature.
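
For readers wondering what translating CPU addresses into GPU addresses could look like, here is a minimal sketch of one possible reading. It is not necessarily the scheme the comment has in mind. It assumes a single pre-allocated region that both processors can see, which would also explain why the maximum memory is limited. `SharedRegion` and `to_gpu_address` are hypothetical names for illustration and are not part of Metal, SYCL, or oneAPI.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical description of one allocation mapped by both processors.
// All three fields are assumptions made for this sketch.
struct SharedRegion {
    std::uintptr_t cpu_base;  // CPU virtual address of the first byte
    std::uint64_t  gpu_base;  // GPU virtual address of that same byte
    std::size_t    length;    // size of the region in bytes
};

// Translate a CPU pointer into the matching GPU address by rebasing its
// offset. Returns std::nullopt when the pointer lies outside the region.
inline std::optional<std::uint64_t> to_gpu_address(const SharedRegion& region,
                                                   const void* cpu_pointer) {
    const auto address = reinterpret_cast<std::uintptr_t>(cpu_pointer);
    if (address < region.cpu_base) {
        return std::nullopt;
    }
    const std::uintptr_t offset = address - region.cpu_base;
    if (offset >= region.length) {
        return std::nullopt;
    }
    return region.gpu_base + offset;
}
```

The translation is a bounds check plus one subtraction and one addition, so it is cheap, but any pointer outside the pre-allocated region cannot be translated at all.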
