MainActor.run failing to run closure when called from within a detached task

I have encountered an issue when trying to update the status of a detached task, by passing a closure to MainActor.run.

To illustrate the issue, consider a function that counts the number of files in a folder and its sub-directories. It runs in a Task.detached closure, as I don't want it blocking the main thread. Every 10,000th file it updates an @Published property, fileCount, by passing a closure to MainActor.run.

However, the UI fails to update and even gives me a spinning beach ball. The only way to stop this is by inserting await Task.sleep(1_000_000_000) before the call to MainActor.run. Here's the code:

final class NewFileCounter: ObservableObject {
	@Published var fileCount = 0

	func findImagesInFolder(_ folderURL: URL) {
		let fileManager = FileManager.default
		Task.detached {
			
			var foundFileCount = 0
			let options = FileManager.DirectoryEnumerationOptions(arrayLiteral: [.skipsHiddenFiles, .skipsPackageDescendants])
			
			if let enumerator = fileManager.enumerator(at: folderURL, includingPropertiesForKeys: [], options: options) {
				while let _ = enumerator.nextObject() as? URL {
					foundFileCount += 1
					if foundFileCount % 10_000 == 0 {
						let fileCount = foundFileCount
						await Task.sleep(1_000_000_000) // <-- Only works with this in...comment out to see failure
						await MainActor.run { self.fileCount = fileCount }
					}
				}
				let fileCount = foundFileCount
				await MainActor.run { self.fileCount = fileCount }
			}
		}
	}
}

The code works if I revert to the old way of achieving this:

final class OldFileCounter: ObservableObject {
	@Published var fileCount = 0
	
	func findImagesInFolder(_ folderURL: URL) {
		let fileManager = FileManager.default
		DispatchQueue.global(qos: .userInitiated).async {
			
			let options = FileManager.DirectoryEnumerationOptions(arrayLiteral: [.skipsHiddenFiles, .skipsPackageDescendants])
			var foundFileCount = 0
			
			if let enumerator = fileManager.enumerator(at: folderURL, includingPropertiesForKeys: [], options: options) {
				while let _ = enumerator.nextObject() as? URL {
					foundFileCount += 1
					if foundFileCount % 10_000 == 0 {
						let fileCount = foundFileCount
						DispatchQueue.main.async { self.fileCount = fileCount }
					}
				}
				let fileCount = foundFileCount
				DispatchQueue.main.async { self.fileCount = fileCount }
			}
		}
	}
}

What am I doing wrong?

BTW - if you want to try out this code, here is a test harness. Be sure to pick a folder with lots of files in it and its sub-folders.

import SwiftUI

@main
struct TestFileCounterApp: App {
	var body: some Scene {
		WindowGroup {
			ContentView()
		}
	}
}

struct ContentView: View {
	@State private var showPickerOld = false
	@StateObject private var fileListerOld = OldFileCounter()
	@State private var showPickerNew = false
	@StateObject private var fileListerNew = NewFileCounter()
	
	var body: some View {
		VStack {
			Button("Select folder to count files using DispatchQueue...") { showPickerOld = true }
			Text("\(fileListerOld.fileCount)").foregroundColor(.green)
				.fileImporter(isPresented: $showPickerOld, allowedContentTypes: [.folder], onCompletion: processOldSelectedURL )
			Divider()
			Button("Select folder to count files using Swift 5.5 concurrency...") { showPickerNew = true }
			Text("\(fileListerNew.fileCount)").foregroundColor(.green)
				.fileImporter(isPresented: $showPickerNew, allowedContentTypes: [.folder], onCompletion: processNewSelectedURL )
		}
		.frame(width: 400, height: 130)
	}
	
	private func processOldSelectedURL(_ result: Result<URL, Error>) {
		switch result {
			case .success(let url): fileListerOld.findImagesInFolder(url)
			case .failure: return
		}
	}
	
	private func processNewSelectedURL(_ result: Result<URL, Error>) {
		switch result {
			case .success(let url): fileListerNew.findImagesInFolder(url)
			case .failure: return
		}
	}
}
Accepted Answer

To illustrate the issue consider a function that counts the number of files in a folder and its sub-directories. It runs in a Task.detached closure, as I don't want it blocking the main thread.

This doesn’t sound like a good application of Swift concurrency. The problem here is that the task you start in findImagesInFolder(…) spends most of its time blocked waiting for disk I/O. This means that it consumes a thread from the Swift concurrency cooperative thread pool for the entire duration of the task.

The code works if I revert to the old way of achieving this:

And, similarly, that’s not a good application of Dispatch queues )-: Scheduling a work item that blocks for long periods of time on a Dispatch global concurrent queue runs the risk of deadlock (on platforms that don’t overcommit) or thread explosion (on platforms that do).

For more background on this, watch:

And if you’re still confused, post your questions here and I’ll try to answer them.


As to what you should do, that’s a less-than-satisfactory story right now. To start, Swift concurrency will not help here. Its current design gives you no control over the underlying task-to-thread assignment, and so it’s not safe to block for long periods of time. It’s likely that there will be better solutions in this space in the long term [1] but for the moment it’s best to solve this problem with core code that’s not based on Swift concurrency and then add a Swift concurrency wrapper so that your clients can call a nice API that follows standard patterns.

As to what this core code should do, my suggestion is that you set up a single shared serial queue and have all instances of findImagesInFolder(…) dispatch over to that. This will consume a thread from the Dispatch thread pool for the duration, but it’s only one thread and Dispatch can tolerate that.
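One way to read that suggestion, as a rough sketch rather than anything posted in this thread: the core code below uses only Dispatch and a single shared serial queue, and a thin async wrapper sits on top of it. The names FolderScanner, startCounting and countFiles are hypothetical, as is the choice to report progress through a closure.

```swift
import Foundation

/// Hypothetical helper; the names here are illustrative assumptions.
enum FolderScanner {
    /// One shared serial queue for every scan, so blocking file system
    /// calls tie up at most one thread from the Dispatch thread pool.
    private static let queue = DispatchQueue(label: "FolderScanner.serial", qos: .userInitiated)

    /// Core code: plain Dispatch, no Swift concurrency.
    /// `progress` and `completion` are called on the serial queue, not the main queue.
    static func startCounting(
        in folderURL: URL,
        progressInterval: Int = 10_000,
        progress: @escaping @Sendable (Int) -> Void,
        completion: @escaping @Sendable (Int) -> Void
    ) {
        precondition(progressInterval > 0, "progressInterval must be positive")
        queue.async {
            let options: FileManager.DirectoryEnumerationOptions = [.skipsHiddenFiles, .skipsPackageDescendants]
            var count = 0
            if let enumerator = FileManager.default.enumerator(
                at: folderURL, includingPropertiesForKeys: [], options: options
            ) {
                while let _ = enumerator.nextObject() as? URL {
                    count += 1
                    if count % progressInterval == 0 { progress(count) }
                }
            }
            completion(count)
        }
    }

    /// Swift concurrency wrapper: suspends the caller without blocking a
    /// cooperative pool thread while the scan runs on the serial queue.
    static func countFiles(
        in folderURL: URL,
        progress: @escaping @Sendable (Int) -> Void = { _ in }
    ) async -> Int {
        await withCheckedContinuation { continuation in
            startCounting(in: folderURL, progress: progress) { total in
                continuation.resume(returning: total)
            }
        }
    }
}
```

A view model could then call `await FolderScanner.countFiles(in: url) { count in ... }` and hop to the main queue or main actor inside the progress closure before touching fileCount. Cancellation is not handled in this sketch.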

One thing to note here is that this design means that, if you start two file counting operations, the second will make no progress until the first has completed (because it’s a serial queue). In most cases that’s OK. The disk is a fundamentally serial resource, and so you don’t benefit much from parallelism. However, this is something that you might want to tweak. For example:

  • If you’re working on the Mac it might make sense to have a serial queue per physical device. That way operations can proceed in parallel on the underlying hardware.

  • If you want your UI to show progress on multiple requests even if that’s potentially slower, you can use a small set of serial queues.

The thing to avoid here is starting a bazillion requests all running in parallel. That will be slower [2] and runs the risk of deadlock.
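The "small set of serial queues" variant could look something like the sketch below. SerialQueuePool is a hypothetical name, and handing queues out round-robin is just one assumption about how to pick a queue; a per-device variant would instead key the queues by whatever identifies the volume.

```swift
import Foundation

/// Hypothetical helper: a fixed, small set of serial queues handed out
/// round-robin, so the number of blocked threads is bounded by `count`.
final class SerialQueuePool {
    private let queues: [DispatchQueue]
    private let lock = NSLock()
    private var nextIndex = 0

    init(count: Int, label: String = "SerialQueuePool") {
        precondition(count > 0, "need at least one queue")
        queues = (0..<count).map { index in
            DispatchQueue(label: "\(label).\(index)", qos: .userInitiated)
        }
    }

    /// Returns the next queue in rotation. Safe to call from any thread.
    func next() -> DispatchQueue {
        lock.lock()
        defer { lock.unlock() }
        let queue = queues[nextIndex]
        nextIndex = (nextIndex + 1) % queues.count
        return queue
    }
}
```

Each new scan would call next() and dispatch its blocking work onto the returned queue. With a pool of, say, two or three queues, several requests show progress at once while the total thread cost stays fixed.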

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

[1] For example, custom executors.

[2] If you try to run many threads in parallel the system has to ensure that all threads make progress so it time shares them on to the physical cores. However, your many threads have a huge working set and the physical cores have a limited cache size. Eventually that working set exceeds the cache size and you start wasting time moving thread state in and out of cache. And if the threads are all talking to the same serialised resource, like the disk, you don’t get any parallelism to make up for that.

Quinn - that's super helpful. Thank you!

Having looked and thought carefully about this, I have found that adding Task.yield() solves the issue, as I proactively await:

final class NewFileCounter: ObservableObject {

	@Published var fileCount = 0

	func findImagesInFolder(_ folderURL: URL) {

		let fileManager = FileManager.default
		Task.detached {
			var foundFileCount = 0
			let options = FileManager.DirectoryEnumerationOptions(arrayLiteral: [.skipsHiddenFiles, .skipsPackageDescendants])

			if let enumerator = fileManager.enumerator(at: folderURL, includingPropertiesForKeys: [], options: options) {

				while let _ = enumerator.nextObject() as? URL {
					foundFileCount += 1
					await Task.yield()
					if foundFileCount % 10_000 == 0 {
						let fileCount = foundFileCount
						await MainActor.run { self.fileCount = fileCount }
					}
				}

				let fileCount = foundFileCount
				await MainActor.run { self.fileCount = fileCount }
			}
		}
	}
}

I have found that adding Task.yield() solves the issue

Yes, but also no (-:

The Swift concurrency thread pool is cooperative, so it makes sense to insert a yield in long-running tasks. For example, if you have code that’s compressing a video using the CPU, you might insert a yield between each frame.
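As a minimal sketch of that pattern, assuming a stand-in checksum function in place of real video compression (both function names below are hypothetical):

```swift
/// Hypothetical CPU-bound work standing in for "compress one frame".
func checksum(of frame: [UInt8]) -> UInt64 {
    frame.reduce(0) { ($0 &* 31) &+ UInt64($1) }
}

/// Processes frames one at a time, yielding between frames so other
/// tasks get a turn on the cooperative thread pool.
func processFrames(_ frames: [[UInt8]]) async -> [UInt64] {
    var results: [UInt64] = []
    results.reserveCapacity(frames.count)
    for frame in frames {
        results.append(checksum(of: frame))
        await Task.yield()
    }
    return results
}
```

The yield only helps because each unit of work is pure computation that returns promptly; it does nothing for a call that is stuck waiting inside the kernel.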

However, this doesn’t apply in your case because your code is blocking on I/O. Inserting a yield helps in the common case, because directory I/O is generally fast, but it does not help in all cases. Imagine, for example, the directory you’re working on is mounted via a network file system and the server stops responding for some reason. The file system client may take a long time to time out — we’re talking about minutes here! — and your task will be stuck inside some blocking file system call for the duration, all the while consuming a precious Swift concurrency thread.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Is this the same problem that's impacting the following code snippet:

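let urls: [URL] = [url1, url2, url3, ... ] // Imagine we have 1,000s of URLs here

let sources = await withTaskGroup(of: CGImageSource?.self) { taskGroup in
    for url in urls {
        // CGImageSourceCreateWithURL returns an optional, so the child task type is CGImageSource?
        taskGroup.addTask { return CGImageSourceCreateWithURL(url as CFURL, nil) }
    }

    var results = [CGImageSource]()

    for await result in taskGroup {
        // Skip URLs for which no image source could be created
        if let source = result { results.append(source) }
    }

    return results
}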

I don't have a way to await the call to CGImageSourceCreateWithURL - this code blocks up.

I don't have a way to await the call to CGImageSourceCreateWithURL

Yep.

IIRC we added some async image wrangling in the upcoming OS releases. You may be able to switch to those. Oh, yeah, here we go: Check out the prepareForDisplay(completionHandler:) method.

It’s also possible that other frameworks have async image processing. Image processing isn’t really my field, so I don’t have any specific hints on that front.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Also, am I right in thinking that there is no async method yet to load pure data - i.e. Data(withContentsOfURL: URL) async?

You can do this with URLSession, feeding it a file: URL.
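A minimal sketch of what that might look like, assuming the async data(from:) method on URLSession that shipped alongside Swift concurrency (macOS 12 / iOS 15 and later). The function name loadData(fromFile:) is hypothetical.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking
#endif

/// Hypothetical helper: loads a local file's contents through URLSession,
/// so the caller suspends instead of blocking a thread on the read.
func loadData(fromFile fileURL: URL) async throws -> Data {
    precondition(fileURL.isFileURL, "expected a file: URL")
    // The response for a file: URL is a plain URLResponse and is ignored here.
    let (data, _) = try await URLSession.shared.data(from: fileURL)
    return data
}
```

Usage would be along the lines of `let data = try await loadData(fromFile: url)`.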

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"
