Image decompression strategies for performance

I've been exploring ways to avoid image decompression on the main thread in order to speed up image rendering. The developer community has shared various ways of preloading or inflating images in the background before rendering them, and most of the examples out there still work today, with varying degrees of speed. See cocoanetics dot com/2011/10/avoiding-image-decompression-sickness/, gist dot github dot com/steipete/1144242, and github dot com/Alamofire/AlamofireImage/blob/3e8edbeb75227f8542aa87f90240cf0424d6362f/Source/UIImage%2BAlamofireImage.swift#L113.

Unfortunately, none of these decompression techniques are documented, so the community is basing its conclusions on trial and error and hoping they don't regress between iOS versions. What's more, some solutions are potentially rife with errors: they may drop critical alpha data, scale, orientation, or other bitmap info, which could produce artifacts in the final rendered image.

In our situation, the server already provides image derivatives at exactly the size at which they will be rendered. Even so, profiling in Instruments shows a huge performance win from pre-inflating the images on a background thread. So we are left with deciding which approach to take for inflation.
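The background pre-inflation pattern described above can be sketched roughly as follows. This is only an illustration: `preinflate`, `workQueue`, `completionQueue`, and the `inflate` closure are hypothetical names, not system APIs, and `inflate` stands in for whichever decoding strategy ends up being chosen.

```swift
import CoreGraphics
import Dispatch

/// Hypothetical helper (not a system API): runs `inflate` off the main thread
/// and delivers the result on `completionQueue`.
func preinflate(
    _ image: CGImage,
    workQueue: DispatchQueue = .global(qos: .userInitiated),
    completionQueue: DispatchQueue = .main,
    using inflate: @escaping (CGImage) -> CGImage?,
    completion: @escaping (CGImage) -> Void
) {
    workQueue.async {
        // Fall back to the original image if the chosen strategy returns nil.
        let result = inflate(image) ?? image
        completionQueue.async { completion(result) }
    }
}
```

Keeping the strategy as a closure makes it easy to swap approaches while profiling in Instruments without touching the dispatch logic.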

To validate that the inflation is working, there are a few ways to observe the decompression at runtime.

Here is a typical backtrace showing PNG decompression:

Code Block
15 CoreFoundation 209.0 __CFRUNLOOP_IS_CALLING_OUT_TO_AN_OBSERVER_CALLBACK_FUNCTION__
14 QuartzCore 180.0 CA::Transaction::observer_callback(CFRunLoopObserver*, unsigned long, void*)
13 QuartzCore 180.0 CA::Transaction::commit()
12 QuartzCore 180.0 CA::Context::commit_transaction(CA::Transaction*, double)
11 QuartzCore 180.0 CA::Layer::prepare_commit(CA::Transaction*)
10 QuartzCore 180.0 CA::Render::prepare_image(CGImage*, CGColorSpace*, unsigned int, double)
9 QuartzCore 180.0 CA::Render::copy_image(CGImage*, CGColorSpace*, unsigned int, double, double)
8 ImageIO 180.0 IIOImageProviderInfo::CopyImageBlockSetWithOptions(void*, CGImageProvider*, CGRect, CGSize, CFDictionary const*)
7 ImageIO 180.0 IIO_Reader::CopyImageBlockSetProc(void*, CGImageProvider*, CGRect, CGSize, CFDictionary const*)
6 ImageIO 180.0 PNGReadPlugin::copyImageBlockSet(InfoRec*, CGImageProvider*, CGRect, CGSize, CFDictionary const*)
5 ImageIO 180.0 PNGReadPlugin::DecodeUncomposedFrames(IIOImageRead*, GlobalPNGInfo*, ReadPluginData const&, PNGPluginData const&, std::__1::vector<IIODecodeFrameParams, std::__1::allocator<IIODecodeFrameParams> >&)
4 ImageIO 180.0 PNGReadPlugin::DecodeFrameStandard(IIOImageReadSession*, ReadPluginData const&, PNGPluginData const&, IIODecodeFrameParams&)
3 ImageIO 180.0 _cg_png_read_row
2 ImageIO 159.0 png_read_IDAT_data


For any readers, in lldb, calling:

Code Block
(lldb) br s -n _ZN13PNGReadPlugin22DecodeUncomposedFramesEP12IIOImageReadP13GlobalPNGInfoRK14ReadPluginDataRK13PNGPluginDataRNSt3__16vectorI20IIODecodeFrameParamsNSA_9allocatorISC_EEEE
Breakpoint 1: where = ImageIO`PNGReadPlugin::DecodeUncomposedFrames(IIOImageRead*, GlobalPNGInfo*, ReadPluginData const&, PNGPluginData const&, std::__1::vector<IIODecodeFrameParams, std::__1::allocator<IIODecodeFrameParams> >&), address = 0x00007fff2615e4d4


This will set a symbolic breakpoint that traps on PNGReadPlugin::DecodeUncomposedFrames.


Here is a typical backtrace showing JPEG decompression:

Code Block
16 CoreFoundation 13.0 __CFRUNLOOP_IS_CALLING_OUT_TO_AN_OBSERVER_CALLBACK_FUNCTION__
15 QuartzCore 13.0 CA::Transaction::observer_callback(CFRunLoopObserver*, unsigned long, void*)
14 QuartzCore 12.0 CA::Transaction::commit()
13 QuartzCore 12.0 CA::Context::commit_transaction(CA::Transaction*, double)
12 QuartzCore 12.0 CA::Layer::prepare_commit(CA::Transaction*)
11 QuartzCore 12.0 CA::Render::prepare_image(CGImage*, CGColorSpace*, unsigned int, double)
10 QuartzCore 12.0 CA::Render::copy_image(CGImage*, CGColorSpace*, unsigned int, double, double)
9 ImageIO 12.0 IIOImageProviderInfo::CopyImageBlockSetWithOptions(void*, CGImageProvider*, CGRect, CGSize, CFDictionary const*)
8 ImageIO 12.0 IIO_Reader::CopyImageBlockSetProc(void*, CGImageProvider*, CGRect, CGSize, CFDictionary const*)
7 ImageIO 12.0 AppleJPEGReadPlugin::copyImageBlockSet(InfoRec*, CGImageProvider*, CGRect, CGSize, CFDictionary const*)
6 AppleJPEG 11.0 applejpeg_decode_image_all
5 AppleJPEG 10.0 aj_decode_all_mt
4 AppleJPEG 10.0 aj_decode_all
3 AppleJPEG 6.0 fill_coeff_buffer
2 AppleJPEG 6.0 aj_mcu_decode
1 AppleJPEG 5.0 aj_block_decode
0 AppleJPEG 2.0 aj_huffman_decode_ac_s1


For any readers, in lldb, calling:

Code Block
(lldb) br s -n _ZN19AppleJPEGReadPlugin17copyImageBlockSetEP7InfoRecP15CGImageProvider6CGRect6CGSizePK14CFDictionary
Breakpoint 1: where = ImageIO`AppleJPEGReadPlugin::copyImageBlockSet(InfoRec*, CGImageProvider*, CGRect, CGSize, CFDictionary const*), address = 0x00007fff262a53b8


This will set a symbolic breakpoint that traps on AppleJPEGReadPlugin::copyImageBlockSet.


Back to potential solutions:
  1. Peter Steinberger's useful gist demonstrates one potential solution to pre-loading by creating a bitmap context: gist dot github dot com/steipete/1144242.

  2. AlamofireImage previously took a similar approach: github dot com/Alamofire/AlamofireImage/blob/35623582388a3bfb21338566898ebbaebaef33dd/Source/UIImage%2BAlamofireImage.swift#L117-L146

  3. But AlamofireImage recently switched to the much shorter strategy of fetching the CGDataProvider's data to prime the inflation, which is undocumented behavior (at least as far as I've found): github dot com/Alamofire/AlamofireImage/blob/3e8edbeb75227f8542aa87f90240cf0424d6362f/Source/UIImage%2BAlamofireImage.swift#L113

  4. WWDC 2018 session 219, Image and Graphics Best Practices (https://developer.apple.com/videos/play/wwdc2018/219/), shows a downsampling technique that uses the kCGImageSourceShouldCacheImmediately flag, but unfortunately, as far as I can tell, the options dictionary can only be fed into CGImageSources that are created from a URL.

  5. It turns out that, empirically, just calling UIGraphicsBeginImageContextWithOptions and drawing the image without any other flags or options set on the context will cache the image. But I'm afraid that it might not be doing enough to produce a 1-to-1 image for all formats and color spaces. In fact, this was brought up on the previously linked gist: gist dot github dot com/steipete/1144242#gistcomment-817406.

  6. Drawing the image using UIGraphicsImageRenderer also works as a preloading strategy, but we found it too slow (FB7878121).
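For reference, the bitmap-context approach behind items 1, 2, and 5 can be sketched roughly like this. It is not a copy of any of the linked implementations. `decodedCopy(of:)` is a hypothetical helper, and the hard-coded device RGB color space with 8-bit premultiplied BGRA is an assumption made for brevity. Those hard-coded choices are exactly the kind that could discard wide-color or 16-bit data, and the sketch works on a CGImage only, so UIImage scale and orientation are not handled.

```swift
import CoreGraphics

/// Hypothetical helper: forces decoding by drawing `image` into an
/// offscreen bitmap context and returning the context's contents.
func decodedCopy(of image: CGImage) -> CGImage? {
    // Assumption: 8-bit premultiplied BGRA in device RGB. The source image's
    // own color space and bit depth are NOT preserved by this choice.
    let bitmapInfo = CGImageAlphaInfo.premultipliedFirst.rawValue
        | CGBitmapInfo.byteOrder32Little.rawValue
    guard let context = CGContext(
        data: nil,                      // let Core Graphics allocate the buffer
        width: image.width,
        height: image.height,
        bitsPerComponent: 8,
        bytesPerRow: 0,                 // 0 lets Core Graphics pick the stride
        space: CGColorSpaceCreateDeviceRGB(),
        bitmapInfo: bitmapInfo
    ) else { return nil }

    // Drawing is the step that makes ImageIO decode the compressed data.
    let bounds = CGRect(x: 0, y: 0, width: image.width, height: image.height)
    context.draw(image, in: bounds)
    return context.makeImage()
}
```

Because the result is a new image in a fixed pixel format, comparing it against the original for a range of formats (grayscale, indexed PNG, Display P3) would be one way to check the 1-to-1 concern raised in item 5.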


Based on the examples and some local profiling, fetching an image's CGDataProvider's data seems to be enough to preload the image, and it's fast. But is this a reliable strategy for all images and formats? Are there other strategies better suited to preloading? And what are the disadvantages of preloading images on a background thread? I'm most concerned with using a technique that's reliable and preserves the image data as faithfully as possible, especially with respect to transparency and color space.
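To make the data-provider strategy concrete, here is a minimal sketch. `primeDecoding(of:)` is a hypothetical helper name. That requesting the provider's data triggers decompression is the undocumented behavior in question, not something the API contract promises.

```swift
import CoreGraphics

/// Hypothetical helper: asks the image's data provider for its bytes.
/// Returns the number of bytes handed back, or 0 if no data was available.
@discardableResult
func primeDecoding(of image: CGImage) -> Int {
    // Assumption (undocumented): copying the provider's data makes ImageIO
    // decode the image, so later rendering can skip that work.
    guard let data = image.dataProvider?.data else { return 0 }
    return CFDataGetLength(data)
}
```

The return value is only there to make the effect observable while testing; the call would normally be made for its side effect alone.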

Thanks,
Mark