How to run entire application synchronously?

Before you kill me, this is a command line only project in Swift. No UI. This project started as a Swift lesson but it evolved. I am thinking about porting to an app with UI in the future but I like command prompts and I've also integrated with some of the system's facilities such as initd scheduler. I really don't want to rewrite it to another synchronous language like python, if you could give me a hand I'd appreciate it.


I used to read data from a file, but now I am trying to pull it from a remote server on the web. That means I've implemented URLSession in there. I know URLSession is supposed to run asynchronously, but is there a safe way to block execution while the data task retrieves the data from the web? Otherwise my application exits before I can process anything.


Basically I tried wrapping my entire main.swift file in a DispatchQueue.main.sync { } block, but I get an EXC_BAD_INSTRUCTION on the closing brace of this block.


DispatchQueue.main.sync {
    var sigimage = Sigimage()
    sigimage.getEndpoints { (endpoints) in
        for endpoint in endpoints {
            // Redacted for readability
        }
    }
} // EXC_BAD_INSTRUCTION here


I was wondering whether something inside all the code hidden by this class is trapping, but I don't even know where to begin. Due to this error inside DispatchQueue I can't use breakpoints because the code never gets anywhere.


Thanks in advance.

Replies

>> is there a safe way to block execution while the data task retrieves the data from the web?


DispatchSemaphore is a really easy to use class. You could use that to have your main code wait for completion of any asynchronous API that you were "forced" to use.


I don't know why it's crashing, but I suspect it's something to do with the order of initializing around the main function. It may be, for example, that there's no autorelease pool in place at the time you dispatch your top level block, something like that.
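
For anyone following along, here is a minimal sketch of the semaphore idea. It assumes a hypothetical asynchronous function, fetchGreeting, standing in for a real API such as a URLSession data task; the names are illustrative only and not from the original project.

```swift
import Foundation

// Hypothetical stand-in for an asynchronous API such as a URLSession data task.
func fetchGreeting(completion: @escaping (String) -> Void) {
    DispatchQueue.global().async {
        completion("hello")
    }
}

// Blocks the calling thread until the completion handler has run.
func fetchGreetingSynchronously() -> String {
    let semaphore = DispatchSemaphore(value: 0)
    var result = ""
    fetchGreeting { value in
        result = value
        semaphore.signal()   // wake up the waiting thread
    }
    semaphore.wait()         // park here until signal() is called
    return result
}

let greeting = fetchGreetingSynchronously()
print(greeting)
```

The semaphore starts at zero, so wait() blocks until the completion handler calls signal(). This only works if the completion handler runs on a different thread from the one that is waiting.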

Thanks a lot, Quincey. Yes, my code does work with a semaphore. You're right, it's simple to use and effective in my case. Plus it doesn't destroy any of the underlying asynchronous characteristics of Swift, it simply waits for all of the threads to finish.


Just out of curiosity (and better understanding of this whole platform) I would love to know why the DispatchQueue fails at that point. I guess I'll never know.
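
One likely explanation for the crash, offered as an assumption since nobody in the thread confirms it: the top-level code in main.swift already runs on the main thread, and DispatchQueue.main.sync blocks the caller until the main queue has run the closure. When the caller is the main thread itself that can never happen, and dispatch traps rather than hanging. A minimal sketch of a guard against this, where runOnMain is a hypothetical helper name:

```swift
import Foundation

// Hypothetical helper: runs `work` on the main queue without deadlocking
// when the caller is already on the main thread.
func runOnMain<T>(_ work: () -> T) -> T {
    if Thread.isMainThread {
        // Already on the main thread, so just run the closure directly.
        return work()
    }
    // On any other thread, hand the closure to the main queue and wait.
    return DispatchQueue.main.sync(execute: work)
}

let answer = runOnMain { 6 * 7 }
print(answer)
```

Note that when this is called from a background thread, something still has to be servicing the main queue (dispatchMain or a run loop), otherwise the sync call waits forever.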

There are two ways you can approach this:

  • Synthetic synchronous — QuinceyMorris already suggested using a dispatch semaphore for this.

  • Asynchronous — You wrote:

    Otherwise my application exits before I can process anything.

    The solution here is to prevent your app from exiting until your async work is all done. You can use dispatchMain for this. For example:

    import Foundation
    
    
    let url = URL(string: "…")!
    let req = URLRequest(url: url, cachePolicy: .reloadIgnoringLocalCacheData, timeoutInterval: 60.0)
    URLSession.shared.dataTask(with: req) { (data, response, error) in
        NSLog("we're done")
        exit(0)
    }.resume()
    dispatchMain()


I generally prefer the second approach because parking a thread in a semaphore is kinda wasteful.

Apropos semaphores, if you do decide to use the synthetic synchronous approach you should not use a semaphore because it’s not integrated with the dispatch ownership model. It’s better to use a dispatch work item for this. Here’s an example:

import Foundation

let workItem = DispatchWorkItem {
    // does nothing in this example
}
let url = URL(string: "…")!
let req = URLRequest(url: url, cachePolicy: .reloadIgnoringLocalCacheData, timeoutInterval: 60.0)
URLSession.shared.dataTask(with: req) { (data, response, error) in
    workItem.perform()
}.resume()
workItem.wait()
NSLog("we're done")

Note: The dispatch work item API is not well documented per se. However, it is a direct analogue of the dispatch blocks API for C-based languages. A good place to find info on that API is the doc comments in <dispatch/block.h>.

WWDC 2017 Session 706 Modernizing Grand Central Dispatch Usage has more details on the dispatch ownership model.

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

Thank you for your answer, Quinn. I am not sure I fully grasp the usage of dispatchMain() in the asynchronous example you gave, nor by reading the reference. https://developer.apple.com/documentation/dispatch/1452860-dispatchmain


It says there it waits for blocks to be submitted to the main queue, but how does it control which blocks are running asynchronously? And how do we submit them to the main queue? Is that what exit(0) in your example does? Does this method, somehow, know all the threads running and it waits for all of them to finish before continuing execution?


UPDATE: I've placed dispatchMain() right after my dataTask call, as you showed, and it just works. However, sorry my lack of knowledge here, but I sincerely can't tell why.

However, sorry my lack of knowledge here, but I sincerely can't tell why.

Not a problem. Let’s see if we can clear up that confusion, eh?

First up, be aware we’re using blocks in two ways here:

  • As a verb, to indicate that a thread waits for some condition. This is a very common industry term.

  • As a noun, in a way that’s mostly synonymous with Swift’s closures. This usage is inherited from dispatch’s origins as a C-based API.

dispatchMain parks the main thread [1]. You can think of it as blocking the main thread indefinitely [2], with infrastructure in place to wake up the main thread so that it can handle any work that is scheduled on the main queue (see below). This allows you to write a program in terms of dispatch’s asynchronous constructs [3] and keep the process running that program around in order to service that async work. Finally, when all that async work is done, you can call exit to force the process to terminate.

Now, back to your specific questions. You wrote:

It says there it waits for blocks to be submitted to the main queue, but how does it control which blocks are running asynchronously?

It doesn’t need to. Your code, and the code in the frameworks you call, is responsible for adding closures to queues, including the main queue. For example, this code:

let queue: DispatchQueue = …
queue.async {
    print("This runs on `queue`.")
}

adds a closure (the block containing the print call) to the queue named queue. At some point in the future dispatch will pull that closure off queue and execute it.
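
As a runnable illustration of that, here is a small sketch using a serial queue. The queue label and the EventLog class are made up for the example; it assumes nothing beyond Foundation.

```swift
import Foundation

// Reference type so the closures below share one log (illustrative only).
final class EventLog {
    var entries: [String] = []
}

let log = EventLog()
let queue = DispatchQueue(label: "com.example.worker") // serial by default

log.entries.append("before async")

// async returns immediately; the closures are only enqueued here.
queue.async { log.entries.append("closure 1") }
queue.async { log.entries.append("closure 2") }

// sync on a serial queue returns only after the earlier closures have run.
queue.sync { log.entries.append("sync closure") }

print(log.entries)
```

Because the queue is serial, the closures run one at a time in the order they were added, so the final sync call doubles as a way to wait for the earlier work.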

And how do we submit them to the main queue?

The main queue is just a static property of DispatchQueue, so you can reference it as DispatchQueue.main. For example, if you want to add a closure to the main queue, you could rewrite the above as:

DispatchQueue.main.async {
    print("This runs on the main queue.")
}

Is that what exit(0) in your example does? Does this method, somehow, know all the threads running and it waits for all of them to finish before continuing execution?

No. exit just forces the process to terminate. Dispatch can’t automatically figure out when all the work inside your process is complete. Rather, if you block the process in dispatchMain then you are responsible for making sure it terminates. For a standard command line tool you do this explicitly by calling exit (as I showed in my example). In other contexts this can be done implicitly. For example, a launchd daemon would typically be integrated with launchd’s transaction system, so that launchd knows when the daemon is ‘clean’ and can terminate it when required.

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

[1] It doesn’t actually park the thread in the traditional sense, but rather it saves resources by terminating the thread while leaving your process alive. Neat-o-rama!

[2] Note that it’s declared as:

public func dispatchMain() -> Never

which means it never returns.

[3] In my example I’m using URLSession, which is based on dispatch internally.

Thank you for your explanation, Quinn! I think I got it, so please allow me to say this in layman's terms:


The last thing I call on my main thread is dispatchMain(). That means "hey system, there is nothing else to process on my main thread from here on out, but please don't exit the application just yet, just park the main thread here and we're good". Now the processing has been transferred to the other threads that were spawned from my previous calls on the main thread. I am the one responsible for checking whether URLSession's dataTask finished transferring data from the web, if the image was saved at the correct location, if the logs were written (if necessary). When everything I want is done I may call exit() and then actually close my application. So I need to place the exit() call somewhere I can guarantee that every other thread has finished doing their work. I'm also assuming that calling exit() from a thread other than main will cleanly close all threads including main, correct? No zombies, right?


Thank you all again!

That means "hey system, there is nothing else to process on my main thread from here on out, but please don't exit the application just yet, just park the main thread here and we're good".

Correct.

When everything I want is done I may call exit and then actually close my application.

Correct.

So I need to place the exit call somewhere I can guarantee that every other thread has finished doing their work.

Correct. The key point here is that you have to make this guarantee. exit terminates the process, regardless of what any other threads are doing right now.

I'm also assuming that calling exit from a thread other than main will cleanly close all threads including main, correct?

Right. Threads exist within a process, so when the process terminates all the threads get cleaned up. In this respect threads are no different from any other per-process resource, like memory, windows on screen, and so on — the system has to clean all of this up when the process terminates.
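
One way to make that guarantee, sketched here as a suggestion rather than something taken from the original project, is a dispatch group: each piece of work joins the group, and the group calls you back once everything has finished. The names runAll and Counter are hypothetical, and the semaphore at the end only exists so the sketch can return; a real command line tool would call exit(0) in the callback and finish with dispatchMain().

```swift
import Foundation

// Hypothetical helper: runs each job on a background queue and calls
// `whenDone` once, after every job has finished.
func runAll(_ jobs: [() -> Void], whenDone: @escaping () -> Void) {
    let group = DispatchGroup()
    for job in jobs {
        DispatchQueue.global().async(group: group) { job() }
    }
    // notify fires only after all work submitted to the group is complete.
    group.notify(queue: .global()) { whenDone() }
}

// Lock-protected counter so concurrent jobs can update it safely.
final class Counter {
    private let lock = NSLock()
    private var value = 0
    func increment() { lock.lock(); value += 1; lock.unlock() }
    var current: Int { lock.lock(); defer { lock.unlock() }; return value }
}

let counter = Counter()
let jobs: [() -> Void] = Array(repeating: { counter.increment() }, count: 3)

let finished = DispatchSemaphore(value: 0)
runAll(jobs) {
    print("all jobs finished: \(counter.current)")
    finished.signal()   // a real tool would call exit(0) here
}
finished.wait()         // stands in for dispatchMain() in this sketch
```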

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

Quinn, thanks again for the patient explanation!


If you could answer one more question I would be very grateful. I have some code, in a class, that acquires data about the user's Terminal session. The problem here is this: if I run this code on the main thread all is fine. However, if the task is run from within my class then the shell commands return some default data about the Terminal, not the actual Terminal window from the user. This is because Swift spawns the shell command on a separate generic thread (I could check with breakpoints). Consider the following code for acquiring the number of available columns from tput.


class Tools {

    static var terminalCol: Int {
        var tputCol: Int
        let (tputReturnData, _, _) = Tools.shell(commandPath: "/usr/bin/tput", arguments: ["cols"])
        guard let tputString = String(data: tputReturnData, encoding: .utf8) else {
            return -1
        }
        if let col = Int(tputString.trimmingCharacters(in: .whitespacesAndNewlines)) {
            tputCol = col
        }
        else {
            tputCol = -1
        }
        return tputCol
    }

    static func shell(commandPath path: String, arguments: [String] = []) -> (Data, Data, Int) {
        let task = Process()
        task.launchPath = path
        task.arguments = arguments

        let outputPipe = Pipe()
        let errorPipe = Pipe()
        task.standardOutput = outputPipe
        task.standardError = errorPipe

        task.launch() // This is not executed on the main thread.
        task.waitUntilExit()

        return (outputPipe.fileHandleForReading.readDataToEndOfFile(), errorPipe.fileHandleForReading.readDataToEndOfFile(), Int(task.terminationStatus))
    }

}


So terminalCol in this case is always 80 (columns), unless I copy this code and paste it on my main.swift file. So I tried modifying my function like this:


static func shell(commandPath path: String, arguments: [String] = []) -> (Data, Data, Int) {
        let task = Process()
        task.launchPath = path
        task.arguments = arguments

        let outputPipe = Pipe()
        let errorPipe = Pipe()
        task.standardOutput = outputPipe
        task.standardError = errorPipe

        DispatchQueue.main.async { [unowned task] in
            task.launch()
            task.waitUntilExit()
        }
        return (outputPipe.fileHandleForReading.readDataToEndOfFile(), errorPipe.fileHandleForReading.readDataToEndOfFile(), Int(task.terminationStatus))
    }

}


But that causes a deadlock, I believe. If I change async to sync, that causes an error "Illegal Instruction: 4". I was thinking the deadlock is actually because I've parked my main thread, correct? So it's not really a deadlock, but simply the fact my main thread has stopped. Is my understanding correct?

I’m not sure what’s going wrong with your subprocess launch code; alas, I don’t have time right now to work through your description of the problem. However, there is a much better way to get the terminal width, namely the TIOCGWINSZ ioctl. For example:

import Foundation

enum Tools {

    // Width of the terminal attached to stdin, or -1 if it can't be determined.
    static var terminalCol: Int {
        var terminalSize = winsize()
        // TIOCGWINSZ fills `terminalSize` with the window dimensions.
        guard ioctl(STDIN_FILENO, TIOCGWINSZ, &terminalSize) == 0 else {
            return -1
        }
        return Int(terminalSize.ws_col)
    }
}

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

Thank you so much for your invaluable help, again. As usual your solution is elegant and works.

Thanks for the explanation.

Please also include a complete working example.

Please also include a complete working example.

A working example of what? This thread has covered a bunch of different topics.

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

It’s better to use a dispatch work item for this.

Since writing this I’ve learnt more about this issue and I’m not sure that using a work item actually helps much. See this thread for more about this.

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"