We are processing videos with Core Image filters in our apps, using an AVMutableVideoComposition (for playback/preview and export).
For older devices, we want to limit the resolution at which the video frames are processed for performance and memory reasons. Ideally, we would tell AVFoundation to give us video frames with a defined maximum size into our composition. We thought setting the renderSize property of the composition to the desired size would do that.
However, this only changes the size of output frames, not the size of the source frames that come into the composition's handler block. For example:
let composition = AVMutableVideoComposition(asset: asset, applyingCIFiltersWithHandler: { request in
let input = request.sourceImage // <- this still has the video's original size
// ...
})
composition.renderSize = CGSize(width: 1280, height: 720) // for example
So if the user selects a 4K video, our filter chain gets 4K input frames. Sure, we can scale them down inside our pipeline, but this costs resources and especially a lot of memory. It would be far better if AVFoundation could decode the video frames at the desired size before passing them into the composition handler.
Is there a way to tell AVFoundation to load smaller video frames?
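As a minimal sketch of the in-pipeline fallback mentioned above (scaling inside the handler, which is exactly the cost the question hopes to avoid), the scale factor could be computed as follows. The helper name downscaleFactor(for:maxSize:) is hypothetical and not an AVFoundation API; inside the handler, the result could then be applied to request.sourceImage, for example with transformed(by: CGAffineTransform(scaleX:y:)).

```swift
import Foundation

/// Hypothetical helper (not part of AVFoundation): returns the factor by which
/// a frame of `sourceSize` must be scaled to fit inside `maxSize` while keeping
/// its aspect ratio. Never returns more than 1, so small videos are not upscaled.
func downscaleFactor(for sourceSize: CGSize, maxSize: CGSize) -> CGFloat {
    guard sourceSize.width > 0, sourceSize.height > 0 else { return 1 }
    let widthRatio = maxSize.width / sourceSize.width
    let heightRatio = maxSize.height / sourceSize.height
    return min(1, widthRatio, heightRatio)
}

// Example: a 4K frame limited to 1280x720 (the renderSize used above).
let factor = downscaleFactor(for: CGSize(width: 3840, height: 2160),
                             maxSize: CGSize(width: 1280, height: 720))
print(factor)
```

This only shrinks the image after it has been decoded at full size, so it does not address the memory cost of decoding 4K frames that the question is about.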
Explore the integration of media technologies within your app. Discuss working with audio, video, camera, and other media functionalities.
Hello,
I use AVPlayer in my project to play movies streamed over the network.
Most movies play normally, but I found that the sound sometimes disappears when I play a particular 4K network video stream.
The video continues playing, but the audio stops after the video has played for a while.
If I pause the player and then resume, the sound comes back but disappears again after several seconds.
Checking the AVPlayerItem status:
isPlaybackLikelyToKeepUp == true
isPlaybackBufferEmpty == false
player.volume > 0
According to the values above, it does not seem to be caused by an empty playback buffer or a volume issue. I am confused by this situation.
Movie information
Video
Format : AVC
Format/Info : Advanced Video Codec
Format profile : High L5.1
Codec ID : avc1
Codec ID/Info : Advanced Video Coding
Bit rate mode : Variable
Bit rate : 100.0 Mb/s
Width : 3 840 pixels
Height : 2 160 pixels
Display aspect ratio : 16:9
Frame rate mode : Constant
Frame rate : 29.970 (30000/1001) FPS
Audio
Format : AAC LC
Format/Info : Advanced Audio Codec Low Complexity
Codec ID : mp4a-40-2
Duration : 5 min 19 s
Bit rate mode : Constant
Bit rate : 192 kb/s
Nominal bit rate : 48.0 kb/s
Channel(s) : 2 channels
Channel layout : L R
Sampling rate : 48.0 kHz
Frame rate : 46.875 FPS (1024 SPF)
Does anyone know if AVPlayer has this limitation when playing high-bitrate movie streams, and whether there are any solutions?
Dear Apple,
I am currently working on a mental-health-related research project supported by South Korean government funding.
In addition to SensorKit access, we are working with data from the microphone. Is there a point of contact, aside from the SensorKit access application, to discuss the possibility of collecting research data from a restricted sample of participants?
Hello,
I'm a deaf-blind programmer.
I'm experiencing memory issues in my app. Essentially, I'm writing a video.
In this output video, I get content from two sources.
The first source is an already recorded video of 18 seconds (just for testing). It will be shown at the beginning of the output video.
The second source is an array with photos and another array with audio buffers from AVSpeechSynthesizer.write(). The photos will be added along with the audio buffers to the output video, right after adding the 18-second video.
So, in the end, the output video should be:
18-second video + array of photos as video images and, for audio, the buffers from AVSpeechSynthesizer.write().
However, my app crashes as soon as I start the first process.
I'm using AVAssetWriter to write the video and AVAssetReader to read the video.
Below, I'll show the code where I get the CMSampleBuffer.
I'd like an example of how to add the 18-second video to the beginning of the output video.
It doesn't need to be a big piece of code.
Here it is:
// Variables
var audioReaderBuffers = [CMSampleBuffer]()
var videoReaderBuffers = [(frame: CVPixelBuffer, time: CMTime)]()
// Get CMSampleBuffer of a video asset
if let videoURL = videoURL {
let videoAsset = AVAsset(url: videoURL)
Task {
let videoAssetTrack = try await videoAsset.loadTracks(withMediaType: .video).first!
let audioTrack = try await videoAsset.loadTracks(withMediaType: .audio).first!
let reader = try AVAssetReader(asset: videoAsset)
let videoSettings = [
kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA,
kCVPixelBufferWidthKey as String: videoAssetTrack.naturalSize.width,
kCVPixelBufferHeightKey as String: videoAssetTrack.naturalSize.height
] as [String: Any]
let readerVideoOutput = AVAssetReaderTrackOutput(track: videoAssetTrack, outputSettings: videoSettings)
let audioSettings = [
AVFormatIDKey: kAudioFormatLinearPCM,
AVSampleRateKey: 44100,
AVNumberOfChannelsKey: 2
] as [String : Any]
let readerAudioOutput = AVAssetReaderTrackOutput(track: audioTrack,
outputSettings: audioSettings)
reader.add(readerVideoOutput)
reader.add(readerAudioOutput)
reader.startReading()
// Video CMSampleBuffer
while let sampleBuffer = readerVideoOutput.copyNextSampleBuffer() {
autoreleasepool {
if let imgBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) {
let pixBuf = imgBuffer as CVPixelBuffer
let pTime = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
videoReaderBuffers.append((frame: pixBuf, time: pTime))
}
}
}
I tried configuring the preferredForwardBufferDuration on devices using 4G and Wi-Fi, and in these cases, AVPlayer works correctly according to the configured buffer duration. However, when the device is connected to a 5G network, the configuration value no longer works.
For example, if I set preferredForwardBufferDuration to 30 seconds, AVPlayer preloads with a buffer of over 100 seconds. I’m not sure how to resolve this, as it’s causing issues with my system.
I am using ImageCaptureCore to access and (sometimes) download media files from a digital camera connected via USB (either to a Mac or to an iOS device with the Apple Lightning to USB 3 Camera Adapter).
This works very well in general, but what puzzles me is that for the ICCameraFile's EXIF creation/modification date, it always returns nil.
I can access the ICCameraItem's creation/modification date instead, which, as it says in the documentation, "usually [is] the same as its EXIF creation date", but, well, not always. Generally the EXIF tags are more reliable than the file dates; the modification date in particular is easily messed up when copying files.
As for my cameras, they show the stable EXIF date on their display, so for consistency I would prefer to use the same in my app. Is there a way to get it without downloading the image from the camera and reading it from the file?
Does it possibly depend on the brand of camera (I mostly have Canon) whether ICCameraFile.exifCreationDate is ever populated or always nil?
For a thumb drive with DCIM folder, which is treated just like a camera, it is also nil.
I have an iPad app, written in Objective-C and distributed through the Enterprise developer program, as it is not for public use but specific to some large companies.
The app has a local database and works offline.
For some functions of the app I need to display images (not edit or crop them, just display them).
Right now the app integrates the MWPhotoBrowser viewer, which has not been maintained for almost 10 years, so in addition to compilation warnings I have to fight with some historical bugs, especially with high-resolution images. https://github.com/mwaterfall/MWPhotoBrowser
Do you know of a modern and maintained OFFLINE photo viewer? I am evaluating both free and paid options (maybe an SDK). My needs are very basic.
I have found this one, https://github.com/TimOliver/TOCropViewController, but I would need to disable the photo editing features and, above all, I would lose the useful ability to display multiple images (MWPhotoBrowser showed a gallery for multiple images).
I’m working on a memo app that records audio from the iPhone’s microphone (and other devices like MacBook or iPad) and processes it in 10-second chunks at a target sample rate of 16 kHz. However, I’ve encountered limitations with installTap in AVAudioEngine, which doesn’t natively support configuring a target sample rate on the mic input (the default being 44.1 kHz).
To address this, I tried using AVAudioMixerNode to downsample the mic input directly. Although everything seems correctly configured, no audio is recorded—just a flat signal with zero levels. There are no errors, and all permissions are granted, so it seems like an issue with downsampling rather than the mic setup itself.
To make progress, I implemented a workaround by tapping and resampling each chunk tapped using installTap (every 50ms in my case) with AVAudioConverter. While this works, it can introduce artifacts at the beginning and end of each chunk, likely due to separate processing instead of continuous downsampling.
Here are the key issues and questions I have:
1. Can we change the mic input sample rate directly using AVAudioSession or another native API in AVAudio? Setting up the desired sample rate initially would be ideal for my use case.
2. Are there alternatives to installTap for recording audio at a different sample rate or for continuously downsampling the live input without chunk-based artifacts?
This issue seems longstanding, as noted in a 2018 forum post:
https://forums.developer.apple.com/forums/thread/111726
Any guidance on configuring or processing mic input at a lower sample rate in real-time would be greatly appreciated. Thank you!
I’ve encountered an issue when trying to transcribe audio during a SharePlay session in visionOS. Specifically, the AVAudioSession appears to fail when sharing audio, preventing successful transcription. The problem seems related to AVAudioSession.sharedInstance() and using the .mixWithOthers option, which is supposed to enable multiple audio sources to coexist without interference.
Here’s the relevant code snippet that throws the error:
private static func prepareEngine() throws -> (AVAudioEngine, SFSpeechAudioBufferRecognitionRequest) {
let audioEngine = AVAudioEngine()
let request = SFSpeechAudioBufferRecognitionRequest()
request.shouldReportPartialResults = true
let audioSession = AVAudioSession.sharedInstance()
try audioSession.setCategory(.playAndRecord, mode: .default, options: [.mixWithOthers, .allowBluetooth])
try audioSession.setActive(true, options: .notifyOthersOnDeactivation)
let inputNode = audioEngine.inputNode
let recordingFormat = inputNode.outputFormat(forBus: 0)
inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer: AVAudioPCMBuffer, when: AVAudioTime) in
request.append(buffer)
}
audioEngine.prepare()
try audioEngine.start()
return (audioEngine, request)
}
The setup is designed to initialize an AVAudioEngine and a SFSpeechAudioBufferRecognitionRequest for real-time transcription, but fails within the SharePlay context. Notably, while .mixWithOthers is intended to handle concurrent audio sessions, it doesn’t appear to work as expected during SharePlay. The audioSession.setActive(true) line is where the setup typically fails, with no clear solution to proceed.
Has anyone else faced similar issues with AVAudioSession and SharePlay in visionOS? Any insights on how to manage audio sharing or transcription during a SharePlay session would be greatly appreciated!
The specific error is:
The operation couldn't be completed. (com.apple.coreaudio.avfaudio error 561145187.)
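Core Audio status codes are frequently four-character codes packed into a 32-bit integer, so it can help to decode the number in the error above. A minimal sketch (the helper name fourCharCode(from:) is hypothetical, not a system API):

```swift
/// Interprets a 32-bit status code as a four-character code.
/// Returns nil unless all four bytes are printable ASCII.
func fourCharCode(from status: Int32) -> String? {
    let value = UInt32(bitPattern: status)
    let shifts: [UInt32] = [24, 16, 8, 0]
    // Extract the bytes from most significant to least significant.
    let bytes = shifts.map { UInt8((value >> $0) & 0xFF) }
    guard bytes.allSatisfy({ (0x20...0x7E).contains($0) }) else { return nil }
    return String(decoding: bytes, as: UTF8.self)
}

// The code from the error message above.
print(fourCharCode(from: 561145187) ?? "not a four-character code")  // prints "!rec"
```

The decoded value !rec appears to correspond to AVAudioSession.ErrorCode.cannotStartRecording; that mapping is worth verifying against the AVAudioSession error code documentation before relying on it.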
Hey,
There's like this darkish line on my iPhone and iPad when I open the Photos app. This scared the ding dong out of me the first time I saw it but then I realized it was a software issue when it disappeared as I swiped up to close the app. It's really weird because it's extremely faint but I can't seem to catch it in screenshots. I know for a fact this is a software issue because it doesn't show up in any other apps. It also changes from horizontal to vertical depending on how I turn my iPhone. Can everyone please just check your own iPhone or iPad to make sure I'm not the only one? I'm on the 18.2 developer beta by the way.
Thanks!
I have an audio app that can play audio on an AirPlay device.
On non-Apple TV devices, the AirPlay app (on Roku, Samsung, etc.) shows the now playing metadata: title, artist, and album art.
However, on tvOS 18.1, no metadata is shown. The Apple TV device plays the audio, but there is no now playing information shown, nor any other indicators.
Other media apps show the "Now Playing" controls on the upper right of the tvOS home screen.
Can someone point me in the direction of how to solve this issue? I think I am missing something somewhere in regards to the tvOS metadata implementation.
Looking to output DV video to my JVC SR-VS30 video deck. I used to be able to do this, but with most FireWire stuff being deprecated, I'm not sure how to go about this. I found this old developer sample code that seems to do exactly what I'd like. Surely this could be rolled or updated for current macOS?
https://developer.apple.com/library/archive/samplecode/SimpleVideoOut/Introduction/Intro.html#//apple_ref/doc/uid/DTS10000809-Intro-DontLinkElementID_2
My app reports a lot of crashes from 18.2 users.
I have been able to narrow down the issue to this line of code:
CGImageDestinationFinalize(imageDestination)
The error is Thread 93: EXC_BAD_ACCESS (code=1, address=0x146318000)
But I have no idea why this suddenly started to crash.
Here is the code of the function:
private func estimateSizeUsingThumbnailMethod(fromImageURL url: URL, imageSettings: ImageSettings) -> (Int, Int) {
let sourceOptions = [kCGImageSourceShouldCache: false] as CFDictionary
guard let source = CGImageSourceCreateWithURL(url as CFURL, sourceOptions),
let imageProperties = CGImageSourceCopyPropertiesAtIndex(source, 0, nil) as? [CFString: Any],
let imageWidth = imageProperties[kCGImagePropertyPixelWidth] as? CGFloat,
let imageHeight = imageProperties[kCGImagePropertyPixelHeight] as? CGFloat else {
return (0, 0)
}
let maxImageSize = max(imageWidth, imageHeight)
let thumbMaxSize = min(2400, maxImageSize) // Use original size if possible, but not if larger than 2400, in this case we'll extrapolate from thumbnail
let downsampleOptions = [
kCGImageSourceCreateThumbnailFromImageAlways: true,
kCGImageSourceCreateThumbnailWithTransform: true,
kCGImageSourceThumbnailMaxPixelSize: thumbMaxSize as CFNumber,
] as CFDictionary
guard let cgImage = CGImageSourceCreateThumbnailAtIndex(source, 0, downsampleOptions) else {
DLog("CGImage thumb creation error")
return (0, 0)
}
let data = NSMutableData()
guard let imageDestination = CGImageDestinationCreateWithData(data, UTType.jpeg.identifier as CFString, 1, nil) else {
DLog("CGImage destination creation error")
return (0, 0)
}
let destinationProperties = [
kCGImageDestinationLossyCompressionQuality: imageSettings.quality.compressionRatio() // Set jpeg compression ratio
] as CFDictionary
CGImageDestinationAddImage(imageDestination, cgImage, destinationProperties)
CGImageDestinationFinalize(imageDestination) // <----- CRASHES HERE with EXC_BAD_ACCESS
...
}
So far, I'm stuck. Any idea that could help would be greatly appreciated, as I'm scared that this crash will propagate on the official release of 18.2
Hey. I am trying to create a present view with a bunch of media (images/videos). Right now I am using a ZStack to render each media and change opacity based on the index selected using a scrollView. The issue seems to be that sometimes, videos don't seem to load in the main slide. There is a slide created as the video exists, the Player shows controls too but doesn't play anything.
Present View Z-Stack
ZStack {
ForEach(presentation.slides.indices, id: \.self) { index in
if let media = mediaCacheManager.mediaCache[index] {
if let player = media as? AVPlayer {
PlayerView(player: player)
.aspectRatio(16/10, contentMode: .fit )
.frame(width: UIScreen.main.bounds.width * 0.8)
.background(Color.gray.opacity(0.2))
.clipShape(RoundedRectangle(cornerRadius: 40))
.overlay(
RoundedRectangle(cornerRadius: 40)
.stroke(Color.gray.opacity(0.5), lineWidth: 1)
)
.onDisappear {
player.pause()
}
.opacity(appModel.currentSlide == index ? 1 : 0)
} else if let image = media as? Image {
image
.resizable()
.scaledToFit()
.frame(width: UIScreen.main.bounds.width * 0.8)
.background(Color.gray.opacity(0.2))
.clipShape(RoundedRectangle(cornerRadius: 40))
.overlay(
RoundedRectangle(cornerRadius: 40)
.stroke(Color.gray.opacity(0.5), lineWidth: 1)
)
.padding(.vertical, 10)
.opacity(appModel.currentSlide == index ? 1 : 0)
}
}
}
}
The PlayerView
public class PlayerUIView: UIView {
let playerVC = AVPlayerViewController()
let gravity: AVLayerVideoGravity
let manageAudio: Bool
override init(frame: CGRect) {
self.gravity = .resizeAspectFill
self.manageAudio = true
super.init(frame: frame)
}
deinit {
if manageAudio {
try? AVAudioSession.sharedInstance().setActive(false)
}
}
init(player: AVPlayer?, gravity: AVLayerVideoGravity, manageAudio: Bool = true) {
self.gravity = gravity
self.manageAudio = manageAudio
super.init(frame: .zero)
guard let player = player else { return }
self.playerSetup(player: player)
}
required init?(coder: NSCoder) {
fatalError("init(coder:) has not been implemented")
}
public override func layoutSubviews() {
super.layoutSubviews()
playerVC.view.frame = bounds
playerVC.view.backgroundColor = .clear
playerVC.allowsVideoFrameAnalysis = false
}
private func playerSetup(player: AVPlayer) {
playerVC.updatesNowPlayingInfoCenter = true
playerVC.player = player
playerVC.showsPlaybackControls = true
playerVC.view.backgroundColor = .clear
playerVC.exitsFullScreenWhenPlaybackEnds = true
playerVC.videoGravity = gravity
self.addSubview(playerVC.view)
}
}
I donate several INPlayMediaIntent instances to the system, and they show up in Control Center. When I tap one of them to play media in the background, the handler does not call its resolve method. I want to resolve some media items to suggest a playlist.
We have had the same video player in our app for at least 5 years with few issues, but since the iOS 18 update, videos that users have downloaded for offline viewing play back at 2x speed.
Hi, I'm facing an issue with Audio Worklet in Safari. This issue is clearly an iOS bug (it doesn't occur on iPad or Mac).
Here's the minimal reproduction:
Go to https://googlechromelabs.github.io/web-audio-samples/audio-worklet/basic/hello-audio-worklet/
Press start
Audio will not be playing
Open YouTube on another tab and start any video
Audio from the worklet will start playing
Is this a known issue? Any plans to address that? Any workaround available?
I'm building a streaming app on visionOS that can play sound from audio buffers each frame. The audio format has a sample rate of 48000 Hz, and each buffer has 480 samples.
I noticed when calling
audioPlayerNode.scheduleBuffer(audioBuffer)
The memory keeps increasing at about 0.1 MB per second. At around 4 minutes, the node seems to fill up with buffers and does a hard reset, at which point the audio stops temporarily and the memory usage changes. See the attached screenshot.
However, if I call
audioPlayerNode.scheduleBuffer(audioBuffer, at: nil, options: .interrupts)
The memory leak issue is gone, but the audio is broken (it sounds as if it has been cut short).
Below is the full code snippet. Does anyone know how to fix it?
@Observable
final class MyAudioPlayer {
private var audioEngine: AVAudioEngine = .init()
private var audioPlayerNode: AVAudioPlayerNode = .init()
private var audioFormat: AVAudioFormat?
init() {
audioEngine.attach(audioPlayerNode)
audioEngine.connect(audioPlayerNode, to: audioEngine.mainMixerNode, format: nil)
try? AVAudioSession.sharedInstance().setCategory(.playback, mode: .default)
try? AVAudioSession.sharedInstance().setActive(true)
audioEngine.prepare()
try? audioEngine.start()
audioPlayerNode.play()
}
// more code...
/// callback every frame
private func audioFrameCallback_Non_Interleaved(buf: UnsafeMutablePointer<Float>?, samples: Int) {
guard let buf,
let format = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: 48000, channels: 2, interleaved: false),
let audioBuffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: AVAudioFrameCount(samples))
else { return }
audioBuffer.frameLength = AVAudioFrameCount(samples)
if let data = audioBuffer.floatChannelData {
for channel in 0 ..< Int(format.channelCount) {
for frame in 0 ..< Int(audioBuffer.frameLength) {
data[channel][frame] = buf[frame * Int(format.channelCount) + channel]
}
}
}
// memory leak here
audioPlayerNode.scheduleBuffer(audioBuffer)
}
}
I'm building a streaming app on visionOS that can play sound from audio buffers each frame. The source audio buffer has 2 channels and is in a Float32 interleaved format.
However, when setting up the AVAudioFormat with interleaved to true, the app will crash with a memory issue:
AURemoteIO::IOThread (35): EXC_BAD_ACCESS (code=1, address=0x3)
But if I set AVAudioFormat with interleaved to false, and manually set up the AVAudioPCMBuffer, it can play audio as expected.
Could you please help me fix it? Below is the code snippet.
@Observable
final class MyAudioPlayer {
private var audioEngine: AVAudioEngine = .init()
private var audioPlayerNode: AVAudioPlayerNode = .init()
private var audioFormat: AVAudioFormat?
init() {
audioEngine.attach(audioPlayerNode)
audioEngine.connect(audioPlayerNode, to: audioEngine.mainMixerNode, format: nil)
try? AVAudioSession.sharedInstance().setCategory(.playback, mode: .default)
try? AVAudioSession.sharedInstance().setActive(true)
audioEngine.prepare()
try? audioEngine.start()
audioPlayerNode.play()
}
// more code...
/// This crashes
private func audioFrameCallback_Interleaved(buf: UnsafeMutablePointer<Float>?, samples: Int) {
guard let buf,
let format = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: 480000, channels: 2, interleaved: true),
let audioBuffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: AVAudioFrameCount(samples))
else { return }
audioBuffer.frameLength = AVAudioFrameCount(samples)
if let data = audioBuffer.floatChannelData?[0] {
data.update(from: buf, count: samples * Int(format.channelCount))
}
audioPlayerNode.scheduleBuffer(audioBuffer)
}
/// This works
private func audioFrameCallback_Non_Interleaved(buf: UnsafeMutablePointer<Float>?, samples: Int) {
guard let buf,
let format = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: 480000, channels: 2, interleaved: false),
let audioBuffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: AVAudioFrameCount(samples))
else { return }
audioBuffer.frameLength = AVAudioFrameCount(samples)
if let data = audioBuffer.floatChannelData {
for channel in 0 ..< Int(format.channelCount) {
for frame in 0 ..< Int(audioBuffer.frameLength) {
data[channel][frame] = buf[frame * Int(format.channelCount) + channel]
}
}
}
audioPlayerNode.scheduleBuffer(audioBuffer)
}
}
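The index mapping used in the working non-interleaved callback above (frame * channelCount + channel) can be sketched in isolation, without any audio APIs. The function name deinterleave(_:channelCount:) is hypothetical and only illustrates the sample layout; it is not a fix for the crash.

```swift
/// Splits an interleaved sample array (L0 R0 L1 R1 ...) into one array per channel.
/// Uses the same index mapping as the non-interleaved callback above.
func deinterleave(_ interleaved: [Float], channelCount: Int) -> [[Float]] {
    precondition(channelCount > 0 && interleaved.count % channelCount == 0,
                 "sample count must be a multiple of the channel count")
    let frameCount = interleaved.count / channelCount
    var channels = [[Float]](repeating: [Float](repeating: 0, count: frameCount),
                             count: channelCount)
    for frame in 0..<frameCount {
        for channel in 0..<channelCount {
            channels[channel][frame] = interleaved[frame * channelCount + channel]
        }
    }
    return channels
}

// Three stereo frames: left samples are positive, right samples negative.
let planar = deinterleave([0.1, -0.1, 0.2, -0.2, 0.3, -0.3], channelCount: 2)
print(planar)
```

In the interleaved case a single buffer holds samples * channelCount values, which is why the interleaved callback copies that many floats in one call, whereas the non-interleaved case writes samples values into each channel's own buffer.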
Hi, Apple engineers.
Hoping that you can reply to this one.
We're developing a text-to-speech app. Everything went well until iOS was upgraded to 18.
AVSpeechSynthesisVoice(language: "zh-CN") runs well under iOS 16 and iOS 17. It speaks Mandarin correctly.
In iOS 18, we noticed that Siri's Language setting interferes with AVSpeechSynthesisVoice. It plays Cantonese instead of Mandarin.
Siri language settings that trigger the problem in AVSpeechSynthesisVoice:
Chinese (Cantonese - China mainland)
Chinese (Cantonese - Hong Kong)