iOS: Extremely high CPU usage when using Metal

Hi, I just implemented a Metal layer for my game engine.

When I run the game on OpenGL ES 3, it takes about 60% CPU.

When I switch to Metal, it runs OK, but CPU usage rises to more than 100%.


I profiled the game in Instruments, and a function in a Metal library accounts for about 50% of my CPU usage; see below:

thread com.apple.libdispatch-manager (serial)

52.2% CA::Display::DisplayLinkItem::dispatch

52.2% -> 0x47f4dca libMTLInterpose.dylib


A system trace also shows that 90% of CPU time is spent on "BLOCK".


In my code, I follow the MetalBasic3D example and use a semaphore to synchronize the constant buffers.

I then use another thread with two more semaphores to do the blit encoding that generates mipmaps.


The basic code path looks like this:


CADisplayLink

-> dispatch_semaphore_wait(constant_sem);

-> blit_buffer enqueue

-> render_buffer enqueue

-> dispatch_semaphore_signal(blit_sem_1) -> dispatch_semaphore_wait(blit_sem_1)

-> render_buffer encode render commands blit_buffer encode blit commands

-> render_buffer addCompletedHandler: blit_buffer commit

dispatch_semaphore_signal(constant_sem); dispatch_semaphore_signal(blit_sem_2)

-> render_buffer presentDrawable

-> render_buffer commit

-> dispatch_semaphore_wait(blit_sem_2)


I'm not using GCD in this engine; I use raw pthreads with dispatch semaphores.

What could I be doing wrong that makes libdispatch-manager use so much CPU?

Replies

Rather than relying on CPU synchronization primitives, I would suggest a non-blocking strategy involving the creation of 2 MTLCommandBuffers from a shared command queue: one for rendering, and one for blit/mipmap generation. Then call MTLCommandBuffer::enqueue() on each to guarantee the order in which the command buffers are processed within their shared command queue.


Excerpt from the MTLCommandBuffer protocol docs:

In a multithreaded app, it’s advisable to break your overall task into subtasks that can be encoded separately. Create a command buffer for each chunk of work, then call the enqueue method on these command buffer objects to establish the order of execution. Fill each buffer object (using multiple threads) and commit them. The command queue automatically schedules and executes these command buffers as they become available.
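
To make the suggestion above concrete, here is a minimal Swift sketch of the enqueue-based ordering. It is only an illustration under some assumptions: `FrameSubmitter`, `submit(mipmapped:encodeRender:)` and the limit of 3 frames in flight are hypothetical names and values, not something taken from the original engine or from MetalBasic3D. The only CPU-side wait left is the single in-flight semaphore that protects the constant buffers; the blit/render ordering comes from `enqueue()` alone.

```swift
import Metal
import Dispatch

// Hypothetical helper, not part of Metal: submits one frame's blit and
// render work and lets the command queue enforce the ordering.
final class FrameSubmitter {
    private let queue: MTLCommandQueue
    // Assumed limit of 3 frames in flight, so constant buffers are not
    // overwritten while the GPU may still be reading them.
    private let inFlight = DispatchSemaphore(value: 3)

    init?(device: MTLDevice) {
        guard let queue = device.makeCommandQueue() else { return nil }
        self.queue = queue
    }

    func submit(mipmapped textures: [MTLTexture],
                encodeRender: (MTLCommandBuffer) -> Void) {
        inFlight.wait()
        guard let blitBuffer = queue.makeCommandBuffer(),
              let renderBuffer = queue.makeCommandBuffer() else {
            inFlight.signal()
            return
        }

        // Reserve the execution order up front: blit first, then render.
        // After this, the buffers can be encoded and committed from any
        // thread, in any order.
        blitBuffer.enqueue()
        renderBuffer.enqueue()

        // Mipmap generation (this could run on a worker thread instead).
        if let blit = blitBuffer.makeBlitCommandEncoder() {
            for texture in textures where texture.mipmapLevelCount > 1 {
                blit.generateMipmaps(for: texture)
            }
            blit.endEncoding()
        }
        blitBuffer.commit()

        // The caller encodes its render passes and presents the drawable.
        encodeRender(renderBuffer)

        // Release the in-flight slot once the GPU has finished this frame.
        let semaphore = inFlight
        renderBuffer.addCompletedHandler { _ in semaphore.signal() }
        renderBuffer.commit()
    }
}
```

With this shape, the two blit semaphores from the question are no longer needed, because the queue itself executes the blit buffer before the render buffer.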

The 'libMTLInterpose.dylib' is the giveaway: you have the debug layer enabled, which does substantially impact performance. You can go into your Xcode project's scheme and disable the debug layer for performance analysis.

Do you have a specific setting you are turning on/off for this?

You can either click the scheme popup button to edit a particular scheme, or press Command+Option+R to edit the current scheme and then run.


The settings you're looking for are "GPU Frame Capture" and "Metal API Validation." Set both to "Disabled" to avoid this runtime overhead.
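
If you want to double-check from inside the app that the debug layer is really off, something like the sketch below can help. It rests on an assumption: that validation is switched on through the `MTL_DEBUG_LAYER` or `METAL_DEVICE_WRAPPER_TYPE` environment variables, which is how the Xcode scheme options appear to work but may differ between Xcode versions. `activeMetalDebugFlags` is a hypothetical helper name.

```swift
import Foundation

// Hypothetical helper: returns the validation-related environment variables
// that are set to a non-empty value other than "0".
// Assumption: these two variable names are how validation gets enabled.
func activeMetalDebugFlags(
    in environment: [String: String] = ProcessInfo.processInfo.environment
) -> [String] {
    let names = ["MTL_DEBUG_LAYER", "METAL_DEVICE_WRAPPER_TYPE"]
    return names.filter { name in
        guard let value = environment[name] else { return false }
        return !value.isEmpty && value != "0"
    }
}

let flags = activeMetalDebugFlags()
if flags.isEmpty {
    print("No Metal validation flags detected")
} else {
    print("Metal validation appears to be enabled: \(flags)")
}
```

Seeing libMTLInterpose.dylib in a profile, as in the trace above, remains the more direct sign that the debug layer is loaded.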
