Posts

Post marked as solved
6 Replies
Yeah, this is why I'd like some solid information about what works and what doesn't - all I have to go on is AMD's product info and docs, and the fact that it kind of looks like it should be supported on macOS. I have my own shader editing tool, and by running a very simple test (take a 4-component vector and sin() it ~5000 times), I can compare the results. With float4, performance is ~25% higher than with packed_half4. There's no performance difference between half4 and packed_half4.
Post not yet marked as solved
2 Replies
This is an educated guess, but you're probably hitting a bandwidth limit there. Up to a point you have enough bandwidth for everything and you're compute- or latency-limited; past that point you don't, and everything is limited by bandwidth, which gets worse with each additional buffer.
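To make that guess a bit more concrete, here's a rough roofline-style sketch in plain C++ (not Metal, and not measured on any GPU). The struct, the function names, and the assumption that compute and memory traffic overlap perfectly are all mine, purely for illustration. The idea is just that runtime stays flat while the memory time is below the compute time, then grows linearly with the number of buffers.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical cost model for one dispatch. Every field is an
// illustrative assumption, not a measured property of any real GPU.
struct KernelModel {
    double compute_seconds;        // ALU time, independent of buffer count
    double bytes_per_buffer;       // bytes moved per bound buffer
    double bandwidth_bytes_per_s;  // sustained memory bandwidth
};

// Time needed just to move the data for `buffers` buffers.
inline double memory_seconds(const KernelModel& m, int buffers) {
    assert(m.bandwidth_bytes_per_s > 0.0 && buffers >= 0);
    return buffers * m.bytes_per_buffer / m.bandwidth_bytes_per_s;
}

// Estimated runtime: the slower of the two limits dominates
// (assumes compute and memory traffic overlap perfectly).
inline double estimated_seconds(const KernelModel& m, int buffers) {
    return std::max(m.compute_seconds, memory_seconds(m, buffers));
}

// True once memory traffic, not compute, sets the runtime.
inline bool is_bandwidth_bound(const KernelModel& m, int buffers) {
    return memory_seconds(m, buffers) > m.compute_seconds;
}
```

If you plug in your own rough numbers and call `is_bandwidth_bound` for increasing buffer counts, the point where it flips from false to true is roughly where I'd expect the slowdown to start showing up. Real hardware won't overlap perfectly and latency isn't modelled at all, so treat it as a way to reason about the shape of the curve, nothing more.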