
RunLoop behaviour change with performBlock?
I have a dedicated render thread with a run loop that has a CADisplayLink added to it (that's the only input source attached). The render thread runs this loop:

```objc
while (_continueRunLoop) {
    [runLoop runMode:NSDefaultRunLoopMode beforeDate:[NSDate distantFuture]];
}
```

To stop the render thread, I have some code that sets _continueRunLoop to false in a block and then does a pthread_join on the render thread:

```objc
[_renderThreadRunLoop performBlock:^{
    self->_continueRunLoop = NO;
}];
pthread_join(_renderThread, NULL);
```

I have noticed recently (iOS 18?) that if the display link is paused or invalidated before trying to stop the loop, the pthread_join blocks forever, with the render thread still sitting in the runMode:beforeDate: method. If the display link is still active, the loop does exit, but only after one more turn of the display link callback.

The most likely explanation I can think of is that there has been a behaviour change in performBlock:. I believe this used to "consume" a turn of the run loop and cause the runMode:beforeDate: call to exit, but now the block runs without leaving that method. I can't find any specific mention in the docs of the expected behaviour for performBlock:, just that other run loop input sources cause the run method to exit and timer sources do not. Is it possible that the behaviour has changed here?
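A workaround sketch, under the assumption that the queued block really does run without waking runMode:beforeDate: out of its wait: explicitly stop the run loop from inside the block, and wake the loop in case it is asleep. This is my guess at a fix, not confirmed guidance.

```objc
// Workaround sketch (untested assumption): CFRunLoopStop forces the current
// runMode:beforeDate: call to return, and CFRunLoopWakeUp nudges the loop in
// case the queued block alone no longer wakes it.
[_renderThreadRunLoop performBlock:^{
    self->_continueRunLoop = NO;
    CFRunLoopStop(CFRunLoopGetCurrent());
}];
CFRunLoopWakeUp([_renderThreadRunLoop getCFRunLoop]);
pthread_join(_renderThread, NULL);
```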
Replies: 4 · Boosts: 0 · Views: 97 · Activity: 6d
Creating Metal Textures from kCVPixelFormatType_Lossless_420YpCbCr10PackedBiPlanarVideoRange ('&xv0') buffers
I'm testing on an iPhone 12 Pro running iOS 17.5.1. Playing an HDR video with AVPlayer without explicitly specifying a pixel format (but requesting Metal compatibility, as below) gives buffers with the pixel format kCVPixelFormatType_Lossless_420YpCbCr10PackedBiPlanarVideoRange ('&xv0'):

```objc
_videoOutput = [[AVPlayerItemVideoOutput alloc] initWithPixelBufferAttributes:@{
    (NSString *)kCVPixelBufferMetalCompatibilityKey : @(YES)
}];
```

I can't find an appropriate Metal format to use for these buffers to access the data in a shader. Using MTLPixelFormatR16Unorm for the Y plane and MTLPixelFormatRG16Unorm for the UV plane causes GPU command buffer aborts. My suspicion is that this compressed format isn't actually Metal compatible, due to the lack of padding between pixels.

Explicitly selecting kCVPixelFormatType_420YpCbCr10BiPlanarVideoRange (which uses 16 bits per pixel) for the AVPlayerItemVideoOutput works, but I'd ideally like to use the compressed formats if possible for the bandwidth savings. With SDR video, the pixel format is the lossless 8-bit one, and there are no problems binding those buffers to Metal textures.

I'm just looking for confirmation that there's currently no appropriate Metal format for binding the packed 10-bit planes. And if that's the case, is it a bug that AVPlayerItemVideoOutput uses this format despite Metal compatibility being requested?
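For reference, a sketch of the plane binding described above (assuming a previously created CVMetalTextureCacheRef named _textureCache; cache creation, return-code checks, and texture release are omitted). This works with kCVPixelFormatType_420YpCbCr10BiPlanarVideoRange but aborts with the packed lossless buffers:

```objc
// Bind the Y and CbCr planes of a 420YpCbCr10BiPlanar pixel buffer as Metal
// textures via CVMetalTextureCache.
CVMetalTextureRef yTexture = NULL, uvTexture = NULL;

CVMetalTextureCacheCreateTextureFromImage(kCFAllocatorDefault, _textureCache,
    pixelBuffer, NULL, MTLPixelFormatR16Unorm,
    CVPixelBufferGetWidthOfPlane(pixelBuffer, 0),
    CVPixelBufferGetHeightOfPlane(pixelBuffer, 0),
    0, &yTexture);

CVMetalTextureCacheCreateTextureFromImage(kCFAllocatorDefault, _textureCache,
    pixelBuffer, NULL, MTLPixelFormatRG16Unorm,
    CVPixelBufferGetWidthOfPlane(pixelBuffer, 1),
    CVPixelBufferGetHeightOfPlane(pixelBuffer, 1),
    1, &uvTexture);

id<MTLTexture> yPlane  = CVMetalTextureGetTexture(yTexture);
id<MTLTexture> uvPlane = CVMetalTextureGetTexture(uvTexture);
```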
Replies: 1 · Boosts: 0 · Views: 558 · Activity: Jul ’24
Can't download iOS 16.4 Beta 2 on iPhone 14 Pro
After installing the beta profile, the option to download and install shows up in Software Update, but it always gives a "Software Update Failed" error before the download starts. It works on an iPhone 12 Pro, but I don't really want to switch that to the beta right now. Both devices are currently on 16.3.1, on the same Wi-Fi network, and with the same Apple ID in use. Is anyone else having issues installing 16.4 beta 2 on an iPhone 14 Pro?
Replies: 4 · Boosts: 2 · Views: 1.4k · Activity: Mar ’23
CoreBluetooth didUpdateValueForCharacteristic callback is delayed
I'm experiencing weird behaviour with CoreBluetooth on iOS (testing on an iPhone 12 Pro with iOS 15.6.1). My peripheral successfully requests a 15ms connection interval and updates a characteristic before every connection interval. The app enables notifications on the characteristic.

Generally the didUpdateValueForCharacteristic callbacks are received at 15ms intervals as expected. However, there's a weird pattern where every 10 seconds (exactly) there's a period of 250ms or so where the callbacks are disrupted, with gaps of 100ms between callbacks and then multiple calls in quick succession to "catch up". I'd assumed it was just random interference or something until I noticed the 10-second pattern.

I set up a BLE packet sniffer and captured a trace, which shows the updates are all in fact transferred over the air successfully at the expected times; it is just on the iOS side where there is a delay in reporting them to the app. Further digging with Instruments (thread state trace) revealed the CPU has plenty of idle time, so it's not CPU starvation. I did notice a correlation with the hci_rx and StackLoop threads in bluetoothd: they do a burst of activity every 10 seconds, and this correlates exactly with the hiccups in the callbacks.

My application is pretty latency-critical, so it would be great to hear if anyone has experienced this before, and any ideas for how to improve the situation. Ideally without needing to update the firmware of the peripheral, but that is an option if it would help (i.e. moving to something other than GATT notifications to get the data across).
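For anyone wanting to reproduce the measurement, a minimal sketch of the kind of gap logging involved (my own instrumentation sketch, not from any framework; the 30ms threshold is arbitrary, chosen as two expected intervals):

```objc
// In the CBPeripheralDelegate: log the time between notification callbacks
// and flag anything well beyond the expected ~15ms cadence.
- (void)peripheral:(CBPeripheral *)peripheral
    didUpdateValueForCharacteristic:(CBCharacteristic *)characteristic
             error:(NSError *)error {
    static CFAbsoluteTime lastArrival = 0;
    CFAbsoluteTime now = CFAbsoluteTimeGetCurrent();
    if (lastArrival != 0 && (now - lastArrival) > 0.030) {
        NSLog(@"Notification gap: %.1f ms", (now - lastArrival) * 1000.0);
    }
    lastArrival = now;
}
```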
Replies: 2 · Boosts: 0 · Views: 1.2k · Activity: Sep ’22
How to schedule CAMetalLayer rendering for lowest CPU to Display latency?
Hi, I'm aiming to render frames as close as possible to the presentation time. It's for a smartphone-based VR headset (Google Cardboard style), where ideally there is a "late warp" just before presenting a new frame that applies both lens distortion and an orientation correction, reducing the error in the predicted head pose by leveraging the very latest motion sensor data. So leaving it as late as possible gives better pose predictions.

This late warp is a pretty simple pass, just a textured mesh, so it's typically <2ms of GPU time. Thanks to the Developer Labs, it's been suggested I could use a compute shader for the warp so it can share GPU resources with any ongoing rendering work too (as Metal doesn't have a public per-queue priority to allow pre-emption of other rendering work, which is the way this is generally handled on Android).

What I'm trying to figure out now is how best to schedule the rendering. With CAMetalLayer maximumDrawableCount set to 2, you're pretty much guaranteed that the frame will be displayed on the next vsync if rendering completes quickly enough. However, sometimes the system seems to hold onto the drawables a bit longer than expected, which blocks nextDrawable. With a maximumDrawableCount of 3, it seems easy enough to maintain 60FPS, but looking in Instruments the CPU-to-display latency varies: there are times where it's around 50ms (i.e. already 2 frames in the queue to be presented first, so nextDrawable blocks), some periods where it's 30ms (generally 1 other frame queued), and sometimes it drops down to the ideal 16ms or less.

Is there any way to call present that will just drop any other queued frames in the layer? I've tried presentDrawable:atTime: with a time of 0 and presentDrawable:afterMinimumDuration: with a duration of 0, but to no avail. It seems like with CAMetalLayer I'll just have to use addPresentedHandler blocks to keep track of how many frames are queued in the display, so I can ensure the queue is generally empty before presenting the next frame (see the sketch below).

A related question is the deadline for completing the rendering. The CAMetalLayer is in the compositing fast path, but it seems rendering still needs to complete (i.e. all the GPU workload finished) around 5ms before the next vsync for the frame to be displayed on that vsync. I suspect there's a deadline for the frame just in case it needs to be composited, but any hints or ideas for handling that would be appreciated. It seems to be slightly device-specific; somewhat unexpectedly, the iPod touch 7 latches frames that finish much closer to the vsync time than the iPhone 12 Pro does.

I've also just come across AVSampleBufferDisplayLayer, which I'm taking a look at now. It seems to offer better control of the queue, and still enables the compositing fast path, but I can't actually see any feedback like addPresentedHandler to be able to judge what the deadline is to have a frame shown on the next vsync.
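The addPresentedHandler bookkeeping mentioned above might look something like this sketch (names such as _queuedFrames, PresentTracked, and ShouldRenderNow are mine, and the approach is an assumption rather than a recommended pattern; note the handler has to be registered before presenting):

```objc
#import <Metal/Metal.h>
#import <QuartzCore/CAMetalLayer.h>
#import <stdatomic.h>

// Count of frames handed to the display but not yet shown on the glass.
static atomic_int _queuedFrames;

// Present a drawable while tracking how many frames are queued in the layer.
static void PresentTracked(id<CAMetalDrawable> drawable,
                           id<MTLCommandBuffer> commandBuffer) {
    atomic_fetch_add(&_queuedFrames, 1);
    // Must be added before the drawable is presented.
    [drawable addPresentedHandler:^(id<MTLDrawable> presented) {
        atomic_fetch_sub(&_queuedFrames, 1); // frame reached the display
    }];
    [commandBuffer presentDrawable:drawable];
    [commandBuffer commit];
}

// Only start a new frame when nothing is still queued, so a fresh frame is
// never stuck waiting behind stale ones.
static BOOL ShouldRenderNow(void) {
    return atomic_load(&_queuedFrames) == 0;
}
```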
Replies: 2 · Boosts: 1 · Views: 2k · Activity: Jul ’22
Prevent CPU frequency scaling when profiling for iOS?
Hi all, I've been digging into Instruments lately to investigate some potential performance wins for my app. I'm really enjoying the level of detail in the System Trace template, and the os_signpost stuff is perfect for getting a high-level view of where to focus attention. Great job!

I'm using the latest Xcode, Version 11.4.1 (11E503a), on Catalina, and an iPod touch 7th gen on the latest iOS, 13.4 (17E255).

I'm measuring the app in a steady state where the workload is pretty constant and low (there should be no issue hitting 60FPS rendering), but Xcode's CPU meter is pretty jumpy, and the app drops a frame every couple of seconds, which then seems to coincide with a reduction in the CPU usage shown in Xcode.

My suspicion is that the device thinks it can maintain the required framerate and use less energy by reducing the CPU frequency. This is not quite true for my app: the main thread is sleeping for a significant period, but it is awaiting a result on the run loop before the frame rendering can be completed. I can see why the OS might think downclocking is possible here, and then kick the frequency up again when it sees a frame is dropped.

I did, however, once seem to get the device into a state where it was running at (I assume) full speed: the CPU meter in Xcode stayed steady as a rock at 15% or so (with "other processes" at a similar number). I haven't been able to figure out exactly how I managed that. Instruments was running a windowed tracing session and I app-switched away from and back to the app, but that method doesn't seem to work every time.

So the Instruments-related questions arising from this:

1) Is there a way to record and visualize CPU frequency scaling in Instruments? If the frequency of the CPU is changing and unknown, then the timings coming out of Instruments are not really telling the whole story.

2) Is there an officially supported method to prevent CPU frequency scaling whilst Instruments is recording data?

Thanks for any help!

Simon
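In case it's useful, one indirect way to at least visualize the scaling (purely a sketch of my own, not an official Instruments feature): run a fixed chunk of work on a background thread and wrap it in an os_signpost interval. If the interval durations stretch, the effective core clock has probably dropped. The subsystem and category names below are placeholders.

```objc
#import <os/signpost.h>
#import <stdint.h>
#import <unistd.h>

// "Clock probe" sketch: the loop body does a fixed amount of integer work;
// the os_signpost interval around it shows up in Instruments, and the
// interval duration stretching is an indirect sign of downclocking.
static void *ClockProbeThread(void *unused) {
    os_log_t log = os_log_create("com.example.clockprobe", "probe");
    while (true) {
        os_signpost_id_t spid = os_signpost_id_generate(log);
        os_signpost_interval_begin(log, spid, "FixedWork");
        volatile uint64_t acc = 0;
        for (uint64_t i = 0; i < 10000000; i++) { acc += i; }
        os_signpost_interval_end(log, spid, "FixedWork");
        usleep(100000); // sample roughly 10x per second
    }
    return NULL;
}
```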
Replies: 3 · Boosts: 1 · Views: 2.8k · Activity: Apr ’20