Tile-optimized texture write from fragment/compute

Does anyone have an example of how to perform an optimized write to an MTLTexture that is private and internally tiled for performance?


A simple fragment function that writes RGBA8 to a full-screen texture takes around 800 µs. I have a compute function that writes the same data into a buffer in around 150 µs, using its own memory-friendly tiling (same overall memory size).
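
For reference, a tiled buffer layout of this kind can be sketched as below. This is only an illustration: the 8x8 tile size and the names kTileSize and tiledPixelIndex are assumptions, and it is not the layout the compute function above actually uses, nor Metal's internal texture tiling. It is written as plain C++, but the same arithmetic could be used in a kernel to turn a thread position into a buffer index.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical tile edge in pixels. The real internal tile size of a
// private MTLTexture is not known here; 8 is chosen for illustration only.
constexpr std::uint32_t kTileSize = 8;

// Returns the pixel index (not the byte offset) of pixel (x, y) in a buffer
// where tiles are stored one after another in row-major tile order, and the
// pixels inside each tile are stored row-major.
// Assumes width is a multiple of kTileSize.
inline std::size_t tiledPixelIndex(std::uint32_t x, std::uint32_t y,
                                   std::uint32_t width) {
    assert(width % kTileSize == 0);
    const std::uint32_t tilesPerRow = width / kTileSize;

    // Which tile the pixel falls in.
    const std::uint32_t tileX = x / kTileSize;
    const std::uint32_t tileY = y / kTileSize;

    // Position of the pixel inside that tile.
    const std::uint32_t inX = x % kTileSize;
    const std::uint32_t inY = y % kTileSize;

    const std::size_t tileIndex =
        static_cast<std::size_t>(tileY) * tilesPerRow + tileX;
    return tileIndex * (kTileSize * kTileSize) + inY * kTileSize + inX;
}
```

With this layout, all 64 pixels of one tile occupy one contiguous run of the buffer, so threads that cover a tile write to neighbouring addresses. For RGBA8 the byte offset would be the returned index times 4.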


I was hoping there would be a way to write directly to the texture in a tile-friendly way (if the internal tile size, etc. are known).

Thanks,