Hello,
I was wondering if the Metal API gives programmers access to software controlled cache memory, like CUDA does on NVIDIA GPU's.
Those who have written CUDA code for NVIDIA GPU's probably have experience using a streaming multi-processor's shared memory to help speed up memory accesses during computations (for example when performing tiled matrix multiplication). Does Metal (or Swift) provide a similar feature?
So far I have only found documentation describing unified memory between CPU and GPU, but that is not what I am looking for.
Thanks!