Reply to Metal kernel issues with 24-bit data...
I found similar issues and simplified the case as follows. Given an array uint32_t* inA, I want to output an array with each element increased by 1. Everything works until the array length becomes 1024 * 1024 * 4, at which point the output array becomes all 0. It works even when the array length is 1024 * 1024 * 4 - 1. And if I increase the array size to 1024 * 1024 * 4 + 128 * 128, it works again, as a really weird workaround. Could anyone explain why 1024 * 1024 * 4 is a special number? Thanks.

    kernel void increase_array(
        /* param idx 0 - setBuffer */ device const uint32_t* inA,
        /* param idx 1 - setBuffer */ device uint32_t* result,
        /* the thread index */ uint index [[thread_position_in_grid]]
    ) {
        // the for-loop is replaced with a collection of threads, each of which
        // calls this function.
        result[index] = index;
    }
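For reference, the three array lengths mentioned above can be converted to buffer sizes in bytes. The following is only a sketch of that arithmetic in host-side C++, assuming 4-byte uint32_t elements; the names byte_size, kFailingCount, kBelowCount and kAboveCount are hypothetical and are not part of the Metal code. It shows that the failing length corresponds to exactly 2^24 bytes (16 MiB), but it does not by itself explain the behavior.

```cpp
#include <cstdint>

// Hypothetical helper: bytes needed for `count` uint32_t elements.
constexpr std::uint64_t byte_size(std::uint64_t count) {
    return count * sizeof(std::uint32_t);
}

// The three element counts from the post (names are illustrative only).
constexpr std::uint64_t kFailingCount = 1024ull * 1024 * 4;         // 4,194,304 elements
constexpr std::uint64_t kBelowCount   = kFailingCount - 1;          // reported to work
constexpr std::uint64_t kAboveCount   = kFailingCount + 128 * 128;  // reported to work

// The failing length is exactly 2^22 elements, i.e. 2^24 bytes (16 MiB).
static_assert(kFailingCount == (1ull << 22), "failing count is 2^22 elements");
static_assert(byte_size(kFailingCount) == (1ull << 24), "failing size is 2^24 bytes");

// The neighbouring lengths fall just below and just above that boundary.
static_assert(byte_size(kBelowCount) == 16777212ull, "one element below 16 MiB");
static_assert(byte_size(kAboveCount) == 16842752ull, "16 MiB plus 64 KiB");
```

The checks are evaluated at compile time, so building this translation unit is enough to confirm the numbers.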
Apr ’24