Hi,
I have a CUDA program that I want to convert to Metal Compute so that we can support Apple hardware.
When I wrote the CUDA version, I was able to write efficient code because I first learned about the CUDA core architecture. How the cores access memory, for instance, is very important to know in order to write code that accesses memory efficiently.
Now I want to do the same for Metal Compute, but I cannot find any information about the low-level architecture, especially the things you need to know to write efficient code.
Am I missing something?
Is there a guide that gives hints on, for instance, the most efficient way to access memory?
Hi,
I have an idea for an audio application that uses HRTFs in a different way. I would like to get the HRTF that was made for the user and use it in the application.
Is that possible?