      Latest reply on Apr 22, 2020 1:12 AM by Paul Ollivier
      Paul Ollivier

        I have a Metal kernel function that takes a huge array of data as input, stored in device memory, and each thread processes one element of it.


        device Element *elements [[ buffer(0) ]],


        I'm wondering which of these is better in terms of performance:


        Copy the array element into thread-local memory:

        Element element = elements[thread_id];


        Or use a pointer to that element:

        device Element *element = &elements[thread_id];
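
        To make the two options concrete, here is a minimal sketch of both as complete kernels. The fields of Element, the kernel names, and the position update are assumptions made up for illustration; my real struct and processing are different. The sketch does not say which variant is faster, it only shows what is being compared.

```metal
#include <metal_stdlib>
using namespace metal;

// Hypothetical element layout (assumption, not the real struct).
struct Element {
    float4 position;
    float4 velocity;
};

// Variant A: copy the element into a thread-private value,
// work on the copy, then write it back to device memory once.
kernel void process_copy(device Element *elements [[ buffer(0) ]],
                         uint thread_id [[ thread_position_in_grid ]])
{
    Element element = elements[thread_id];   // read from device memory into a local copy
    element.position += element.velocity;    // placeholder processing on the copy
    elements[thread_id] = element;           // explicit write-back
}

// Variant B: keep a pointer into device memory and
// read and write the element's fields through it.
kernel void process_pointer(device Element *elements [[ buffer(0) ]],
                            uint thread_id [[ thread_position_in_grid ]])
{
    device Element *element = &elements[thread_id];
    element->position += element->velocity;  // same placeholder processing, in place
}
```

        Both kernels leave the buffer in the same state, so they can be swapped and timed against each other on the same input.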