ICB is working stably with the latest OS updates. We have updated our macOS benchmark and released the iOS version: https://apps.apple.com/us/app/gravitymark-gpu-benchmark/id1595186532
ICB now gives a 2.5x performance boost compared with the previous version.
Thank you for the great improvements.
This is the same eGPU hardware with 3 times lower performance under Metal:
https://gravitymark.tellusim.com/report/?id=bc453e851c5dede3cedef6c3ac9caca2f8dffa47
https://gravitymark.tellusim.com/report/?id=7f1b799adc588938fc02f140a2ee48dbd4f36e69
FB9127527 contains links to a notarized application for macOS, Windows, and Linux. Several simple tests that reproduce the problem on macOS are attached to my other FB reports.
Thank you.
Thank you for your answer.
I have created a new FB9127527 issue with the benchmark and more information inside.
Everything is described in the original post. The main problem is performance: even a loop of indirect draw calls is faster than executing an indirect command buffer:
https://www.icloud.com/iclouddrive/0ICuhBkHgGuLjCxaJwRyHoLmw#execute_commands_in_buffer
https://www.icloud.com/iclouddrive/0hDo_q0oXs4uzC25yZdKmL83A#multiple_draw_indirect
I filed a Feedback Assistant report more than half a year ago. There was no answer, so I wrote here.
Thank you!
Hi,
A12 devices cannot draw more than 512 drawIndirect commands (CPU unroll for multiDrawIndirectCount).
Beyond that, the rendered objects start flickering. A13 and M1 devices work fine even with 50K draw calls.
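One plausible reading of "CPU unroll" is sketched below in plain C, not Metal code. It assumes the CPU cannot read the GPU-side draw count, so it always encodes the maximum number of indirect draws and relies on the GPU to zero the arguments of unused entries. `DrawIndexedArgs` mirrors the layout of Metal's `MTLDrawIndexedPrimitivesIndirectArguments`; `unroll_multi_draw_indirect`, `EncodeIndirectDraw`, `DrawLog`, and `log_draw` are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors the field layout of MTLDrawIndexedPrimitivesIndirectArguments (20 bytes). */
typedef struct {
    uint32_t index_count;
    uint32_t instance_count;
    uint32_t index_start;
    int32_t  base_vertex;
    uint32_t base_instance;
} DrawIndexedArgs;

/* Hypothetical callback: encodes one indirect draw whose arguments
 * live at byte `offset` of the indirect buffer. */
typedef void (*EncodeIndirectDraw)(size_t offset, void *user);

/* Assumption: the real count is only known on the GPU, so the CPU encodes
 * max_draws commands and unused entries are expected to draw nothing. */
size_t unroll_multi_draw_indirect(size_t base_offset, size_t stride,
                                  uint32_t max_draws,
                                  EncodeIndirectDraw encode, void *user) {
    size_t encoded = 0;
    for (uint32_t i = 0; i < max_draws; ++i) {
        encode(base_offset + (size_t)i * stride, user);
        ++encoded;
    }
    return encoded;
}

/* Stand-in for a render encoder: records how many draws were encoded. */
typedef struct { size_t calls; size_t last_offset; } DrawLog;

static void log_draw(size_t offset, void *user) {
    DrawLog *log = (DrawLog *)user;
    log->calls++;
    log->last_offset = offset;
}
```

With `max_draws` set to 512 and `log_draw` as the callback, the loop encodes 512 commands at a 20-byte stride, which is the point where the A12 starts to misbehave according to the report above.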
Thank you
Is there any update about that?
Thank you
Thanks for the new variable. The debug/GPU validation layers report no errors during execution, but nothing is rendered during GPU ICB generation. I will wait for the answers.
PS: iOS debug layers are working great with setenv(). Thank you for that!
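For builds that do not use an Xcode project, the setenv() approach can be sketched as follows. This is a minimal sketch, assuming the variable is METAL_DEVICE_WRAPPER_TYPE with value 1 (the Metal debug layer setting used on macOS) and that it must be set before the first Metal device is created; `enable_metal_debug_layer` is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdlib.h>

/* Hypothetical helper: call once at the very start of main(),
 * before any Metal device is created.
 * Assumption: the runtime reads the variable at device creation time. */
int enable_metal_debug_layer(void) {
    /* The last argument (overwrite = 1) replaces any inherited value. */
    return setenv("METAL_DEVICE_WRAPPER_TYPE", "1", 1);
}
```

The function returns 0 on success, the same as setenv() itself.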
Thank you for the new value for the device wrapper type. I will retest everything. A validation message on M1 says that ICB is not yet supported :)
[MTLGPUDebugDevice newIndirectCommandBufferWithDescriptor:maxCommandCount:options:]:1035: failed assertion `Indirect Command Buffers are not currently supported with Shader Validation'
I will check it on other devices a bit later.
Hello,
Can you please advise me how to run an existing .ipa file with Xcode shader validation/debug?
We are not using Xcode project files. We have a couple of bash scripts and Makefiles, which do the whole job well and fast for all platforms. On macOS it's possible to set the METAL_DEVICE_WRAPPER_TYPE=1 variable to run the Metal debug layer, but unfortunately, we cannot do the same on iOS.
The Xcode feature to run an already installed app on the device would be awesome.
I can provide reproduction samples if you need them.
Thank you!
Hello,
I have checked the ICB performance of serial drawIndexedPrimitives commands compared with the indirect drawPrimitives method.
The test scene is 16K DIPs, each a quad of 2 triangles. The static ICB is created on the CPU.
Vega 56:
Combined geometry (single DIP): 200M tri/sec
Serial drawPrimitivesIndirect: 12M tri/sec
Single executeCommandsInBuffer: 7M tri/sec
CPU and GPU ICB are working without any issues. GPU ICB is 4-5 times faster than the CPU ICB. The funny thing is that the AMD GPU has a native multiDrawIndirectCount command, which works much faster...
Apple M1 (MacBook Air):
Combined geometry (single DIP): 50M tri/sec
Serial drawPrimitivesIndirect: 8M tri/sec
Single executeCommandsInBuffer: hangs after 1 second of execution with random magenta noise. The debug runtime reports nothing.
Apple A12 (iPhone XR):
Combined geometry (single DIP): 27M tri/sec
Serial drawPrimitivesIndirect: 13M tri/sec
Single executeCommandsInBuffer: hangs after 1 second of execution (with CPU ICB).
Copying from the CPU ICB to a Private ICB causes an app crash.
Intel Iris Plus (MacBook Air 2020):
Combined geometry (single DIP): 4.3M tri/sec
Serial drawPrimitivesIndirect: 1.46M tri/sec
Single executeCommandsInBuffer: draws nothing; the debug runtime crashes with a message that the ICB is empty. executeCommandsInBuffer reports that the source CPU ICB is not an ICB.
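Because every DIP in this test draws one quad of 2 triangles, the tri/sec figures above convert directly into draw calls per second. The conversion can be sketched as below; it assumes that "16K DIPs" means 16 * 1024 draws per frame, and the function names are illustrative only.

```c
/* Each DIP draws one quad of 2 triangles (from the test description). */
#define TRIANGLES_PER_DRAW 2.0
/* Assumption: "16K DIPs" means 16 * 1024 draws per frame. */
#define DRAWS_PER_FRAME 16384.0

/* Draw calls per second implied by a measured triangle throughput. */
double draws_per_second(double triangles_per_second) {
    return triangles_per_second / TRIANGLES_PER_DRAW;
}

/* Frames per second implied by the same throughput. */
double frames_per_second(double triangles_per_second) {
    return draws_per_second(triangles_per_second) / DRAWS_PER_FRAME;
}
```

For the Vega 56 numbers, 12M tri/sec corresponds to 6M indirect draws per second, while 7M tri/sec through executeCommandsInBuffer corresponds to 3.5M draws per second.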
Thank you!
Hello,
The iPhone 11 Pro Max (A13) (13.5.1 and 14.2) reports that Tier 2 is supported.
ICB generation in a compute shader works, but ~5% of objects are rendered partially (or with corruption).
ICB generation in a vertex shader produces a black screen with a console error:
"Execution of the command buffer was aborted due to an error during execution. Ignored (for causing prior/excessive GPU errors) (IOAF code 4)"
The iPhone XR (A12, which is newer than the iPhone 10) (13.5.1 and 14.2) reports that Tier 2 is not supported.
iPad Pro (12.9-inch) (4th generation with LiDAR A12) (13.5.1 and 14.2) reports that argument buffer Tier 2 is not supported (same as DTK).
What am I doing wrong, guys? Ignoring the Tier 2 test produces a random magenta pattern over the screen. Does that mean that all currently available iPad Pro models are not compatible with ICB? If so, it is technically impossible to implement vkCmdDrawIndexedIndirectCount() functionality on them.
Thank you!
The separate report for the magenta screen is FB8928674.
The report for the 20-second startup time with eGPU is FB8928678.
Thank you!
But what if somebody doesn't need thousands of textures and buffers? We need 12 textures and 4 buffers to render the whole scene. Accessing textures through an argument buffer adds an indirection during shader execution.
What we need to execute is just a simple loop:
for(pipeline in pipelines) {
    bind pipeline
    bind 12 textures
    bind 4 buffers
    drawIndexedIndirectCount(indirect buffer, count buffer)
}
So my idea with ICB was to implement code like this:
for(pipeline in pipelines) {
    bind ICB generation rendering pipeline with rasterizer discard
    bind indirect buffer
    bind ICB
    drawPointsIndirect(count buffer)
    bind pipeline
    bind 12 textures
    bind 4 buffers
    executeIndirect(ICB)
}
But it looks like I additionally have to patch the pipeline shaders for ICB.
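The generation pass in the second loop can be modeled on the CPU as follows. This is a minimal sketch in plain C, not Metal code: each loop iteration plays the role of one GPU thread, and `DrawArgs`, `IcbCommand`, and `generate_icb` are hypothetical names. It assumes that slots beyond the count must be reset, so that executing the full ICB range draws only `count` objects.

```c
#include <stdbool.h>
#include <stdint.h>

/* One entry of the indirect buffer (indexed draw arguments). */
typedef struct {
    uint32_t index_count;
    uint32_t instance_count;
    uint32_t index_start;
    int32_t  base_vertex;
    uint32_t base_instance;
} DrawArgs;

/* Hypothetical model of one ICB slot: either a draw or a no-op. */
typedef struct {
    bool     active;
    DrawArgs args;
} IcbCommand;

/* Fills `capacity` ICB slots from `count` indirect entries.
 * Returns the number of active draw commands. */
uint32_t generate_icb(const DrawArgs *indirect, uint32_t count,
                      IcbCommand *icb, uint32_t capacity) {
    uint32_t active = 0;
    for (uint32_t tid = 0; tid < capacity; ++tid) {  /* one "thread" per slot */
        if (tid < count) {
            icb[tid].active = true;
            icb[tid].args = indirect[tid];  /* copy the draw arguments */
            ++active;
        } else {
            icb[tid].active = false;        /* reset the unused slot */
        }
    }
    return active;
}
```

The point of the model is that the count stays on the GPU side: the CPU only ever asks for the full slot range to be executed.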
I will submit the magenta screen issue on M1 and the 20-second startup time with eGPU as separate FB reports.
Thank you!
FB8254449:
Yes, that is what I tried to achieve with ICB. But the following issues make it impossible at the moment:
1. The rendering pipeline for ICB cannot use textures, except on the M1 GPU.
2. 20-second startup time with an AMD eGPU, with the whole system frozen during this time.
3. A big chance of getting a magenta screen instead of normal rendering on M1 while using ICB.
4. Argument buffer Tier 2 is not available on iPhone/iPad/DTK.
The ICB and argument buffer specifications are very flexible, which makes it impossible to implement them on all hardware.
So maybe a single function solution with internal driver implementation for different HW will be more flexible as a result?
FB8638856:
Reproduction applications for 2 and 3 are available in the single archive with all descriptions.
https://www.icloud.com/iclouddrive/0fIpVg83LFG-OACxsMtjVtZHw#apple1/
Both of them are related to ICB creation/execution.
Thank you!