Am I misusing a concurrent queue in the following scenario?
Situation:
My app needs to process files submitted by the user. These files usually arrive in batches of tens or hundreds, and there may be arbitrary dependencies between them (e.g. file "y.txt" must be processed before file "x.txt" can finish processing). Processing each file can take up to a few seconds. In general, it is not possible to determine ahead of time what the dependency graph looks like. I have to start processing a file, and somewhere in the middle I may hit a reference to another file, at which point I need to wait for that other file to finish processing before continuing with this one.
My Approach:
What I tried to do is submit all the files as individual `DispatchWorkItem`s (one file per work item) to a concurrent queue. When I determine that a particular file (call it "x.txt") depends on another file ("y.txt"), I look up the `DispatchWorkItem` for y.txt and call `DispatchWorkItem.notify(...)` with a closure that will get me the processed result of y.txt. I then call `DispatchWorkItem.wait()` for that result within x.txt's work item before continuing to process x.txt. This works great if there are only a few files.
The problem:
If there are a lot of files (say 100+), with many of them dependent on one particular file in the batch, what seems to happen is this: GCD starts work item 1, which eventually waits on a dependency; GCD spawns a new thread for work item 2, which also waits on a dependency; and so on, all the way up to 64 work items in flight, all waiting on a dependency. Then GCD just stops. If the dependency they are all waiting on hasn't been started yet, the app hangs.
Is there a better way to do this? Is there some internal limit of 64 blocked threads in a single queue?
Yes, there's a better way and, yes, there's a limit. See the "Avoiding Excessive Thread Creation" section of the `DispatchQueue` documentation.
The simplest tweak to your current approach is to avoid `.wait()`. Instead, the closure you submit to `.notify()` should resume processing x.txt with the current state/context and the results from y.txt. (You may need to encapsulate that state in an object separate from the work item code.) The current work item should return immediately after calling `.notify()`. That way, you're not blocking a thread: it is returned to the pool and put to work on a task that can proceed.
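A minimal sketch of that idea, under some assumptions. `FileJob`, `finish(with:)`, `whenFinished(on:_:)` and `process(_:dependsOn:on:)` are hypothetical names, results are plain strings, and the dependency is passed in as a parameter to stand in for a reference discovered halfway through processing. One deliberate swap: completion is tracked with a per-file `DispatchGroup` rather than `DispatchWorkItem.notify`, because a work item counts as finished as soon as its block returns, which in this pattern can happen before the file is fully processed.

```swift
import Dispatch

// Hypothetical per-file state that outlives the first half of processing.
// Marked @unchecked Sendable because `result` is written once, before
// `finished.leave()`, and read only after the group has signalled.
final class FileJob: @unchecked Sendable {
    let name: String
    private(set) var result: String?
    private let finished = DispatchGroup()

    init(name: String) {
        self.name = name
        finished.enter()   // stays entered until finish(with:) is called
    }

    // Call exactly once, when the file is completely processed.
    func finish(with value: String) {
        result = value
        finished.leave()
    }

    // Schedules `body` on `queue` once the result exists. Never blocks.
    func whenFinished(on queue: DispatchQueue,
                      _ body: @escaping @Sendable (String) -> Void) {
        finished.notify(queue: queue) { body(self.result!) }
    }
}

// `dependency` stands in for a reference found in the middle of processing.
func process(_ job: FileJob, dependsOn dependency: FileJob?, on queue: DispatchQueue) {
    queue.async {
        let firstHalf = "\(job.name):"   // work done before the reference is hit
        guard let dependency = dependency else {
            job.finish(with: firstHalf + "done")
            return
        }
        // Register the continuation instead of waiting for the dependency.
        dependency.whenFinished(on: queue) { dependencyResult in
            job.finish(with: firstHalf + "after(" + dependencyResult + ")")
        }
        // Returning here hands the thread back to the pool.
    }
}

// 100 files that all depend on one file which is submitted last.
let queue = DispatchQueue(label: "example.files", attributes: .concurrent)
let shared = FileJob(name: "y.txt")
let dependents = (0..<100).map { FileJob(name: "x\($0).txt") }
let allDone = DispatchGroup()

for job in dependents {
    allDone.enter()
    process(job, dependsOn: shared, on: queue)
    job.whenFinished(on: queue) { _ in allDone.leave() }
}
process(shared, dependsOn: nil, on: queue)

allDone.wait()   // only so this script does not exit early
print(dependents[0].result ?? "missing")   // prints "x0.txt:after(y.txt:done)"
```

The single `allDone.wait()` at the end runs on the script's main thread, not on a worker from the pool, so it does not count toward the thread limit. In an app you would probably use `allDone.notify(queue: .main)` there as well, so the main thread is not blocked either.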