Posts

Post not yet marked as solved
3 Replies
2.4k Views
Hi all,

I've been digging into Instruments lately to investigate some potential performance wins for my app. I'm really enjoying the level of detail in the System Trace template, and os_signpost is perfect for getting a high-level view of where to focus attention - great job!

I'm using the latest Xcode, Version 11.4.1 (11E503a), on Catalina, and an iPod touch (7th generation) on the latest iOS, 13.4 (17E255).

I'm measuring the app in a steady state where the workload is fairly constant and low (there should be no issue hitting 60 FPS), but Xcode's CPU meter is pretty jumpy and the app drops a frame every couple of seconds, which then seems to coincide with a reduction in the CPU usage shown in Xcode.

My suspicion is that the device thinks it can maintain the required frame rate with less energy by reducing the CPU frequency. That isn't quite true for my app: the main thread does sleep for a significant period, but it is awaiting a result on the run loop before the frame rendering can be completed. I can see why the OS might think downclocking is possible here and then kick the frequency back up when it sees a frame dropped.

I did, however, once seem to get the device into a state where it was running at (I assume) full speed - the CPU meter in Xcode stayed steady as a rock at 15% or so (with "other processes" at a similar number). I haven't been able to figure out exactly how I managed that - Instruments was running a windowed tracing session and I app-switched away from and back to the app, but that method doesn't seem to work every time.

So the Instruments-related questions arising from this:

1) Is there a way to record and visualize CPU frequency scaling in Instruments? If the CPU frequency is changing and unknown, the timings coming out of Instruments aren't really telling the whole story.

2) Is there an officially supported way to prevent CPU frequency scaling while Instruments is recording data?

Thanks for any help!
Simon
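For context, here is a minimal sketch of the kind of os_signpost instrumentation referred to above, so that per-frame work shows up as intervals alongside the System Trace data in Instruments. The subsystem string and renderFrame() are placeholder names, not taken from the post itself.

```swift
import os

// Hypothetical example: wrap each frame's work in an os_signpost interval so it
// appears in the Points of Interest / os_signpost tracks in Instruments.
let frameLog = OSLog(subsystem: "com.example.myapp", category: .pointsOfInterest)

func drawFrame() {
    let signpostID = OSSignpostID(log: frameLog)
    os_signpost(.begin, log: frameLog, name: "Frame", signpostID: signpostID)
    renderFrame()   // placeholder for the app's actual per-frame work
    os_signpost(.end, log: frameLog, name: "Frame", signpostID: signpostID)
}
```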
Post not yet marked as solved
2 Replies
1.7k Views
Hi, I'm aiming to render frames as close as possible to the presentation time. It's for a smartphone-based VR headset (Google Cardboard style), where ideally there is a "late warp" just before presenting a new frame that applies both lens distortion and an orientation correction that reduces the error in the predicted head pose by leveraging the very latest motion sensor data - so leaving it as late as possible gives better pose predictions. This late warp is a pretty simple pass, just a textured mesh, so it's typically <2ms of GPU time. Thanks to the Developer Labs it's been suggested I could use a compute shader for the warp so it can share GPU resources with any ongoing rendering work (as Metal doesn't have a public per-queue priority to allow pre-emption of other rendering work, which is how this is generally handled on Android).

What I'm trying to figure out now is how best to schedule the rendering. With the CAMetalLayer's maximumDrawableCount set to 2, you're pretty much guaranteed that the frame will be displayed on the next vsync if rendering completes quickly enough; however, sometimes the system seems to hold onto the drawables a bit longer than expected, which blocks nextDrawable. With a maximumDrawableCount of 3 it seems easy enough to maintain 60 FPS, but looking in Instruments the CPU-to-display latency varies: there are times when it's around 50ms (i.e. already 2 frames in the queue to be presented first, so nextDrawable blocks), periods when it's 30ms (generally 1 other frame queued), and times when it drops down to the ideal 16ms or less.

Is there any way to call present that will just drop any other queued frames in the layer? I've tried presentDrawable:atTime: with a time of 0 and presentDrawable:afterMinimumDuration: with a duration of 0, but to no avail. It seems like with CAMetalLayer I'll just have to add addPresentedHandler blocks to keep track of how many frames are queued for display, so I can ensure the queue is generally empty before presenting the next frame.

A related question is the deadline for completing the rendering. The CAMetalLayer is on the compositing fast path, but it seems rendering still needs to complete (i.e. all the GPU work finished) around 5ms before the next vsync for the frame to be displayed on that vsync. I suspect there's a deadline just in case the frame needs to be composited, but any hints or ideas for handling that would be appreciated. It also seems to be slightly device-specific; somewhat unexpectedly, the iPod touch 7 latches frames that finish much closer to the vsync time than the iPhone 12 Pro does.

I've also just come across AVSampleBufferDisplayLayer, which I'm taking a look at now. It seems to offer better control of the queue and still enables the compositing fast path, but I can't see any feedback like addPresentedHandler to judge what the deadline is for having a frame shown on the next vsync.
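As a point of reference, here is a minimal sketch of the bookkeeping approach mentioned above, assuming the addPresentedHandler route: count drawables that have been scheduled but not yet shown, and only start a new frame when that count is zero. FramePacer, renderIfQueueEmpty and the commented-out encodeLateWarp call are hypothetical names, not an established API.

```swift
import Foundation
import Metal
import QuartzCore

final class FramePacer {
    private var pendingPresents = 0
    private let lock = NSLock()

    // Called once per display tick (e.g. from a CADisplayLink). Skips the frame
    // while a previously presented drawable is still waiting in the display queue.
    func renderIfQueueEmpty(layer: CAMetalLayer, commandQueue: MTLCommandQueue) {
        lock.lock()
        let inFlight = pendingPresents
        lock.unlock()
        guard inFlight == 0 else { return }   // a frame is still queued; try again next tick

        guard let drawable = layer.nextDrawable(),
              let commandBuffer = commandQueue.makeCommandBuffer() else { return }

        lock.lock(); pendingPresents += 1; lock.unlock()
        drawable.addPresentedHandler { [weak self] presented in
            // Fires once the drawable actually reaches the display; presented.presentedTime
            // could also be logged here to measure CPU-to-display latency.
            guard let self = self else { return }
            self.lock.lock(); self.pendingPresents -= 1; self.lock.unlock()
        }

        // encodeLateWarp(into: drawable, using: commandBuffer)  // placeholder for the real warp pass
        commandBuffer.present(drawable)
        commandBuffer.commit()
    }
}
```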
Post not yet marked as solved
4 Replies
1.1k Views
After installing the beta profile, the option to download and install shows up in Software Update, but it always gives a "Software Update Failed" error before the download starts. It works on an iPhone 12 Pro, but I don't really want to switch that to the beta right now. Both devices are currently on 16.3.1, on the same Wi-Fi and signed in with the same Apple ID. Is anyone else having issues installing 16.4 beta 2 on an iPhone 14 Pro?
Post not yet marked as solved
2 Replies
945 Views
I'm experiencing weird behaviour with CoreBluetooth on iOS (testing on an iPhone 12 Pro with iOS 15.6.1). My peripheral successfully requests a 15ms connection interval and updates a characteristic before every connection interval, and the app enables notifications on that characteristic. Generally the didUpdateValueForCharacteristic callbacks are received at 15ms intervals as expected. However, there's a weird pattern: every 10 seconds (exactly) there's a period of 250ms or so where the callbacks are disrupted, with gaps of 100ms between callbacks and then multiple calls in quick succession to "catch up". I'd assumed it was just random interference or something until I noticed the 10-second pattern.

I set up a BLE packet sniffer and captured a trace, which shows the updates are in fact all transferred over the air successfully at the expected times; it is only on the iOS side that there is a delay in reporting them to the app. Further digging with Instruments (a thread state trace) revealed the CPU has plenty of idle time, so it's not CPU starvation. I did notice a correlation with the hci_rx and StackLoop threads in bluetoothd - they do a burst of activity every 10 seconds, and this correlates exactly with the hiccups in the callbacks.

My application is pretty latency-critical, so it would be great to hear whether anyone has experienced this before and has any ideas for how to improve the situation - ideally without needing to update the firmware of the peripheral, but that is an option if it would help (i.e. moving to something other than GATT notifications to get the data across).
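For what it's worth, here is a minimal sketch of how the callback gaps described above could be measured on the app side, assuming a standard CBPeripheralDelegate setup: timestamp each notification and log any interval well above the expected 15ms connection interval. NotificationTimer and the 50ms threshold are arbitrary placeholders.

```swift
import Foundation
import CoreBluetooth

final class NotificationTimer: NSObject, CBPeripheralDelegate {
    private var lastUpdate: CFAbsoluteTime?

    func peripheral(_ peripheral: CBPeripheral,
                    didUpdateValueFor characteristic: CBCharacteristic,
                    error: Error?) {
        let now = CFAbsoluteTimeGetCurrent()
        if let last = lastUpdate {
            let gapMs = (now - last) * 1000
            if gapMs > 50 {   // expected ~15ms; flag unusually late callbacks
                print("Late notification: \(String(format: "%.1f", gapMs)) ms since previous update")
            }
        }
        lastUpdate = now
    }
}
```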