I have quite heavy kernels: for a grid of 128x128x128, each grid point runs a loop of 32,000 iterations (it is an energy grid for a molecular simulation of a crystal with 32,000 atoms).
The parallelization is over the grid, not over the atoms, for convenience. This works fine on Sierra and El Capitan, even though it takes a couple of seconds to compute.
However, on High Sierra it only computes 1/4 of the grid; I suspect the remaining results are thrown out after a certain time-out. If I reduce the amount of work per kernel, it shows 1/2 of the grid, and if I reduce it further, it shows the whole grid.
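To make "reducing the amount of work per kernel" concrete, here is a minimal host-side sketch of one way to do it: splitting the z dimension of the grid into slabs, so that each slab can be submitted as its own, shorter dispatch. This is only an illustration under assumptions: `Slab` and `makeSlabs` are hypothetical names, the split along z is an arbitrary choice, and the actual encoding of each slab into a command buffer is not shown.

```cpp
#include <cstddef>
#include <vector>

// Half-open range [begin, end) of z-planes handled by one dispatch.
// Hypothetical helper type, not part of any GPU API.
struct Slab {
    std::size_t begin;
    std::size_t end;
};

// Split `depth` z-planes into slabs of at most `planesPerSlab` planes each.
// The last slab is shorter when `depth` is not a multiple of `planesPerSlab`.
std::vector<Slab> makeSlabs(std::size_t depth, std::size_t planesPerSlab) {
    std::vector<Slab> slabs;
    if (planesPerSlab == 0) {
        return slabs;  // no valid split; the caller should treat this as an error
    }
    for (std::size_t z = 0; z < depth; z += planesPerSlab) {
        std::size_t end = (z + planesPerSlab < depth) ? z + planesPerSlab : depth;
        slabs.push_back({z, end});
    }
    return slabs;
}
```

For the 128x128x128 grid, `makeSlabs(128, 32)` gives four slabs of 32 planes each, so each dispatch covers a quarter of the grid.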
My kernels work fine on El Capitan and Sierra, on Intel, NVIDIA and AMD. They work on High Sierra on NVIDIA and AMD, but not on the Intel HD 5000 (the MacBook Air).
My problem is that I know the results are wrong on the Intel HD 5000, but no error appears. I check the error of the commandBuffer, and it is nil. Clearly not all kernels have run.
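One way to measure how much of the grid was actually written, independent of what the command buffer reports, is a sentinel check: fill the output grid with NaN before the dispatch and count the cells that are still NaN afterwards. The sketch below shows only the host-side logic on a plain array. It assumes that a valid energy is never NaN and that the output is a flat array of `float`; `fillWithSentinel` and `countUnwritten` are hypothetical names.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Mark every cell as "not yet written" before the GPU work is submitted.
// Assumption: a correctly computed energy is never NaN.
void fillWithSentinel(std::vector<float>& grid) {
    std::fill(grid.begin(), grid.end(),
              std::numeric_limits<float>::quiet_NaN());
}

// Count the cells that still hold the sentinel after the work has completed.
// A non-zero result means some grid points were never written.
std::size_t countUnwritten(const std::vector<float>& grid) {
    return static_cast<std::size_t>(
        std::count_if(grid.begin(), grid.end(),
                      [](float v) { return std::isnan(v); }));
}
```

If only a quarter of the grid is computed, `countUnwritten` should report roughly three quarters of the cells. This detects the failure but does not explain why the remaining kernels did not run.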
Question: How do you know with 100% certainty that all kernels have run?