How does Apple Silicon interpolate fp16 inputs/outputs?

I'm hoping the answer is that fp16 values are written to the parameter buffer to save space on TBDR, then the GPU promotes them to fp32 for interpolation and converts the result back to fp16 for the receiving fragment shader. That would avoid the banding you get when both the output and the interpolation are done in fp16 math, as on Android. I haven't found any documentation on this, not even in the PowerVR documentation for their GPUs.
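
To make the concern concrete, here is a minimal CPU-side sketch in Python that compares the two strategies. It is only an emulation and says nothing about what Apple hardware actually does. The helper names (`to_f16`, `lerp_f16`, `lerp_promoted`, `max_errors`) are hypothetical, Python's double-precision float stands in for the fp32 interpolator, and a plain linear blend stands in for the real perspective-correct barycentric interpolation.

```python
import struct


def to_f16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def lerp_f16(a: float, b: float, t: float) -> float:
    """Interpolate with every intermediate result rounded to fp16."""
    a, b, t = to_f16(a), to_f16(b), to_f16(t)
    one_minus_t = to_f16(1.0 - t)
    return to_f16(to_f16(a * one_minus_t) + to_f16(b * t))


def lerp_promoted(a: float, b: float, t: float) -> float:
    """Store endpoints as fp16, interpolate at higher precision, round once."""
    a, b = to_f16(a), to_f16(b)
    # Assumption: Python's double stands in for the fp32 interpolator here.
    return to_f16(a * (1.0 - t) + b * t)


def max_errors(a: float, b: float, steps: int = 4096) -> tuple[float, float]:
    """Worst absolute error of each path against the unrounded blend."""
    a16, b16 = to_f16(a), to_f16(b)
    worst_f16 = worst_promoted = 0.0
    for i in range(steps + 1):
        t = i / steps
        exact = a16 * (1.0 - t) + b16 * t
        worst_f16 = max(worst_f16, abs(lerp_f16(a, b, t) - exact))
        worst_promoted = max(worst_promoted, abs(lerp_promoted(a, b, t) - exact))
    return worst_f16, worst_promoted


if __name__ == "__main__":
    f16_err, promoted_err = max_errors(0.25, 0.75)
    print(f"all-fp16 worst error:       {f16_err:.6g}")
    print(f"promoted-then-rounded error: {promoted_err:.6g}")
```

In this model the promoted path rounds only once at the end, so its error stays within half an fp16 ulp of the unrounded blend, while the all-fp16 path also accumulates rounding from the weight and each product. Both paths still deliver an fp16 value to the fragment shader, so promotion can reduce interpolation error but cannot add precision beyond what fp16 can represent.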
