Xcode 11 dictionary doesn't like concurrent access on iOS

FB6684220


my app that worked fine under Xcode 10 crashes under Xcode 11 (beta 3). I distilled the crash down to this: when accessing a dictionary concurrently, some of the accesses crash. interestingly it happens only under iOS, not under macOS. FB6684220


========


import Foundation

var dict: [String : String] = [:]

func test() {
    Thread.detachNewThread {
        var counter = 0
        while true {
            counter += 1
            let key = String.random
            dict[key] = .random
            usleep((0...1000).randomElement()!)
        }
    }

    Thread.detachNewThread {
        var counter = 0
        while true {
            counter += 1
            let key = String.random
            let value = dict[key]
            usleep((0...1000).randomElement()!)
        }
    }
}

extension String {
    static var random: String {
        let s = "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. "
        let offset = (0 ... 100).randomElement()!
        let len = (1 ... 100).randomElement()!
        return String(s[s.index(s.startIndex, offsetBy: offset) ... s.index(s.startIndex, offsetBy: offset + len)])
    }
}

Accepted Reply

It is not safe to share unprotected mutable data (like dict) between threads. The fact that this didn’t crash in Xcode 10 is just an artefact of its implementation.

Fortunately there’s a way to flush out latent threading bugs like this one, namely, the thread sanitiser. If you run your code under the thread sanitiser, it immediately fails.

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

Replies

it doesn't crash under macos either (with Xcode 11).


i assumed (wrongly?) that i can work with a dictionary (or any other value type) as with a struct with a few ints - it is "safe" to simultaneously access / modify it from threads, in the sense that it won't crash the app although it might give inconsistent / unexpected results.


are you suggesting this?


thread1:
    mutex.lock()
    value = dictionary[key]
    mutex.unlock()
    return value

thread2:
    mutex.lock()
    dictionary[key] = value
    mutex.unlock()
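
For reference, the lock / access / unlock pattern sketched above looks roughly like this in C with a pthread mutex. This is only an illustrative sketch: the names state_lock, shared_total, worker and run_two_workers are made up for the example, and a simple counter stands in for the dictionary.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical shared state; the names are illustrative, not from the post. */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_total = 0;

enum { ITERATIONS = 100000 };

/* Each worker takes the lock around every access to shared_total. */
static void *worker(void *unused) {
    (void)unused;
    for (int i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&state_lock);
        shared_total += 1;
        pthread_mutex_unlock(&state_lock);
    }
    return NULL;
}

/* Runs two workers to completion and returns the final total. */
long run_two_workers(void) {
    pthread_t a, b;
    shared_total = 0;                 /* no other thread exists yet */
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return shared_total;              /* safe: both workers have been joined */
}
```

Because every increment happens while the lock is held, none of them is lost and run_two_workers() returns 2 * ITERATIONS. Removing the lock calls would turn the increments into a data race.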


and was it the case that in previous implementations there was this mutex built-in and now it is not?


good hint about sanitizer, will enable it from now on.

hmm, according to thread sanitizer a concurrent access even to individual Int doesn't work...


i had this ring buffer implementation and thought it is bullet proof. pseudo code:


var readPos = 0 // only increased in reader and there is only one reader
var writePos = 0 // only increased in writer and there is only one writer

a single reader (size) {
    if writePos - readPos >= size { // (1)
        actually read
        readPos += size // readPos is changed only here and only increases
    }
}

at (1) there might be simultaneous writer access that can potentially change writePos, but as it only increases writePos, a condition that is already true stays true. the presumption here is that writePos is changed and read atomically, so an outside observer will only encounter either the old value or the new value but never a mixture (e.g. two bytes of the old and two bytes of the new value)

a single writer (size) {
    if totalSize - (writePos - readPos) >= size { // (2)
        actually write
        writePos += size // writePos is changed only here and only increases
    }
}

at (2) there might be a simultaneous reader access that can potentially change readPos, but as it only increases readPos, a condition that is already true stays true. the same presumption applies here as above about atomic change / read of readPos.


i specifically designed and used this approach throughout years as it does not use locks and can be used in real time contexts like audio I/O procs. am i in trouble now? what shall i do instead?

I’ve moved this over to Core OS > Concurrency because most of the following is not about Swift.

> according to thread sanitizer a concurrent access even to individual Int doesn't work

Indeed.

> i specifically designed and used this approach throughout years as it does not use locks and can be used in real time contexts like audio I/O procs. am i in trouble now?

Oh, you’ve been in trouble for years, you just didn’t notice )-:

The approach you’re using is a dead end in Swift (currently) because Swift does not have a memory model (in this sense of that phrase). It’s simply not safe to share unprotected mutable state between threads in Swift.

It’s also invalid on older versions of C-based languages (anything prior to the ‘11’ releases).

On modern versions of C, it may be possible to make this work but it’s very tricky. There are two major sticking points:

  • Even though readPos and writePos are only modified by a single thread, you still need to use atomic operations for them because they are shared between threads.
  • Using naïve atomic operations is insufficient because you have two separate chunks of data (the positions and the contents of the buffer itself) and you need to coordinate their access. Consider code like this:

    static int buffer = 0;
    static int flag = 0;

    // thread A

    buffer = 1;
    flag = 1;

    // thread B

    while (!flag) {
        // spin
    }
    int x = buffer;

    It’s possible for x to be set to 0 because the CPU running thread B might ‘see’ the change to flag before it sees the change to buffer. This is true even if flag is accessed atomically, because that atomicity only applies to the atomic variable (flag), not to the data ‘protected’ by that variable (buffer).
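
In C11 terms, one way to coordinate the flag with the data it guards is to pair a release store with an acquire load, so that the flag carries ordering as well as atomicity. The following is a minimal sketch of that idea applied to the example above. The function names producer, consumer and run_flag_demo are assumptions made for illustration.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static int buffer = 0;       /* plain data, published via flag */
static atomic_int flag;      /* zero-initialised atomic that guards buffer */

/* Thread A: write the data, then publish it with a release store. */
static void *producer(void *unused) {
    (void)unused;
    buffer = 1;
    atomic_store_explicit(&flag, 1, memory_order_release);
    return NULL;
}

/* Thread B: spin until an acquire load observes the flag, then read. */
static void *consumer(void *result) {
    while (!atomic_load_explicit(&flag, memory_order_acquire)) {
        /* spin */
    }
    *(int *)result = buffer;
    return NULL;
}

/* Starts both threads and returns the value thread B read from buffer. */
int run_flag_demo(void) {
    pthread_t a, b;
    int x = -1;
    buffer = 0;
    atomic_store(&flag, 0);
    pthread_create(&b, NULL, consumer, &x);
    pthread_create(&a, NULL, producer, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return x;
}
```

The release store orders the write to buffer before the write to flag, and the acquire load orders the read of flag before the read of buffer, so thread B reads 1 once it has seen the flag. With memory_order_relaxed on both sides that guarantee would be gone.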

Problems like this are less common on Intel CPUs because they use strong memory ordering as a binary compatibility sop. That’s not the case on Arm, which has very weak memory ordering (only Alpha is weaker, and it is generally considered a step too far :-).

Building correct lock-free data structures is incredibly difficult. My general advice is that you avoid the whole issue. If, however, you decide to go down this path anyway, base your implementation on the extensive research that’s been done in this space.
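
As an illustration of what coordinating positions and contents can look like, here is a minimal single-producer / single-consumer ring buffer sketch using C11 atomics. It is not taken from Apple sample code or from any particular paper. The type ring_t, the functions ring_write and ring_read, and the RING_CAPACITY value are all hypothetical, and the sketch assumes exactly one writer thread, exactly one reader thread, and that atomic_size_t is lock-free on the target.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_CAPACITY 1024u  /* hypothetical size; must be a power of two */

typedef struct {
    float data[RING_CAPACITY];
    atomic_size_t read_pos;   /* advanced only by the reader */
    atomic_size_t write_pos;  /* advanced only by the writer */
} ring_t;

/* Writer side: returns false, without waiting, if there is not enough room. */
bool ring_write(ring_t *r, const float *src, size_t count) {
    size_t w = atomic_load_explicit(&r->write_pos, memory_order_relaxed);
    /* Acquire: the reader has finished with the slots it has released. */
    size_t rd = atomic_load_explicit(&r->read_pos, memory_order_acquire);
    if (RING_CAPACITY - (w - rd) < count) return false;
    for (size_t i = 0; i < count; i++)
        r->data[(w + i) & (RING_CAPACITY - 1)] = src[i];
    /* Release: the samples above become visible before the new position. */
    atomic_store_explicit(&r->write_pos, w + count, memory_order_release);
    return true;
}

/* Reader side: returns false, without waiting, if too few samples are ready. */
bool ring_read(ring_t *r, float *dst, size_t count) {
    size_t rd = atomic_load_explicit(&r->read_pos, memory_order_relaxed);
    /* Acquire: pairs with the writer's release store of write_pos. */
    size_t w = atomic_load_explicit(&r->write_pos, memory_order_acquire);
    if (w - rd < count) return false;
    for (size_t i = 0; i < count; i++)
        dst[i] = r->data[(rd + i) & (RING_CAPACITY - 1)];
    /* Release: the slots are handed back only after they have been copied. */
    atomic_store_explicit(&r->read_pos, rd + count, memory_order_release);
    return true;
}
```

A zero-initialised ring_t (for example one with static storage) starts out empty. The positions run freely and are masked only when indexing, which is why the capacity has to be a power of two. Neither side ever waits. A call that cannot proceed simply returns false.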

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

Quinn,


> It’s simply not safe to share unprotected mutable state between threads in Swift.

> It’s also invalid on older versions of C-based languages (anything prior to the ‘11’ releases).


interesting. what was the feature introduced in C11 that made this possible?


> Even though readPos and writePos are only modified by a single thread, you still need to use atomic operations for them because they are shared between threads.


please clarify what happens if I don't - will the other thread not notice the change at all, or will it notice it way too late, or what? for the record i've not encountered any problems so far (only tested on the CPUs used in intel macs / iphones).


> Using naïve atomic operations is insufficient because you have two separate chunks of data (the positions and the contents of the buffer itself) and you need to coordinate their access.


that's understandable. in my case the contents of the buffer is audio data, so that would show itself as a corrupt audio... which i didn't encounter so far, but i only tested it on macs or iOS devices hardware, so Intel and ARM. is ARM also forgiving?


> Building correct lock-free data structures is incredibly difficult. My general advice is that you avoid the whole issue.


understood. sometimes it is unavoidable (the mentioned Audio realtime I/O) - but then we still have C / C++.


back to my test - i put locks around the shared data accesses, and for the sake of the test even changed it from a dictionary to a single Int (see below). that probably fixed the main issue, but now the thread sanitizer is complaining about data race conflicts on the mutex variable itself :-) weird. again, this is on Xcode 11 Beta 3, which might not be stable enough.


============


import Foundation

class Mutex {
    private var mutex = pthread_mutex_t()
    init() { pthread_mutex_init(&mutex, nil) }
    func lock() { pthread_mutex_lock(&mutex) }
    func unlock() { pthread_mutex_unlock(&mutex) }
}

var sharedState: Int = 0
var mutex = Mutex()

func test() {
    Thread.detachNewThread {
        var counter = 0
        while true {
            counter += 1
            mutex.lock()
            sharedState = counter
            mutex.unlock()
            usleep((0...100000).randomElement()!)
        }
    }

    Thread.detachNewThread {
        var counter = 0
        while true {
            counter += 1
            mutex.lock()
            let value = sharedState
            mutex.unlock()
            usleep((0...100000).randomElement()!)
        }
    }
}

> what was the feature introduced in C11 that made this possible?

The introduction of a memory model. To quote the page I linked to in my previous post: The memory model was then included in the next C++ and C standards, C++11 and C11.

> please clarify what happens if I don't

From the compiler perspective, this is Undefined behaviour, which means the compiler offers no guarantees as to what will happen.

> is ARM also forgiving?

No. As I mentioned earlier, Arm uses a weak memory model.

> now the thread sanitizer complaining about the data race conflicts on the mutex variable itself

Using pthread mutexes from Swift is a challenge. Try using NSLock instead.

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

> No. As I mentioned earlier, Arm uses a weak memory model.

that makes it a mystery then why i do not see audio glitches on iOS, as i am not synchronizing the ring buffer contents with its position.

> Using pthread mutexes from Swift is a challenge. Try using NSLock instead.

thanks, NSLock helped indeed. is it worth filing a bug against Xcode 11 thread sanitizer false positive for pthread mutex variable data race?

> that makes it a mystery then why i do not see audio glitches on iOS, as i am not synchronizing the ring buffer contents with its position.

I don’t find that mysterious at all. When dealing with concurrency issues, the absence of symptoms does not imply the absence of a problem. A classic example of this is the dictionary problem that kicked off this thread. Everything worked on previous systems, but only by good fortune. To guarantee that things will work moving forwards, you have to follow the rules.

This is what makes concurrency such a pain, and why the thread sanitiser is such a boon (because it doesn’t check the current behaviour of your code, but rather it checks it against the rules).

> NSLock helped indeed

Cool.

> is it worth to file a bug against Xcode 11 thread sanitizer false positive for pthread mutex variable data race?

No, that’s not a bug in the thread sanitiser but a bug in the way that you’re using pthread mutexes. To use them correctly from Swift you have to manually manage their memory. You can’t embed a mutex as a property, as you’ve done in your Mutex class, because you can’t take the address of a property.

The main point of confusion here is that & doesn’t do what you think it does based on your experience with C-based languages. In a C-based language, & returns the address of a variable, be it on the stack, a global, or a data member. That’s not the case in Swift.

In Swift, & is a sigil that indicates that you’re passing a variable to an inout parameter. The semantics of inout are, as the name suggests, copy in and copy out. Thus, conceptually, this code:

func lock() { pthread_mutex_lock(&mutex) }

expands to:

func lock() {
    var tmp = self.mutex
    pthread_mutex_lock(&tmp)
    self.mutex = tmp
}

For normal data structures, like arrays, that’s not a problem. However, for pthread mutexes it’s a showstopper because it’s not safe to copy a pthread mutex.

You can address this problem by having Mutex allocate the pthread mutex on the heap and maintaining an UnsafeMutablePointer that points to it. However, that’s tricky to get right, and it’s much better to just use NSLock.

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

that makes it a mystery then why i do not see audio glitches on iOS, as i am not synchronizing the ring buffer contents with its position.

>>> I don’t find that mysterious at all. When dealing with concurrency issues, the absence of symptoms does not imply the absence of a problem. A classic example of this is the dictionary problem that kicked off this thread. Everything worked on previous systems, but only by good fortune. To guarantee that things will work moving forwards, you have to follow the rules. <<<


there're just two sets of "rules" here, one contradicting the other. rule #1 is the one you describe. rule #2 is what audio experts like Doug Wyatt and many others told us in wwdc's over the years, and what we can read all over the internet - never ever do anything in the audio I/O real time callback that can block. if you break the latter rule - you will get audio glitches because of potentially unbounded time spent inside a lock (or even not "unbounded" time but just enough time to miss the audio thread deadline, which is very strict - submillisecond for small I/O sizes).

the thing you are suggesting (synchronising the ring buffer position with its contents, supposedly by means of a synchronization mechanism like mutex or dispatch queue, etc) is a direct contradiction to that latter rule. now, you are telling me that because i'm breaking rule #1 - i am (and always was) in trouble, as the contents of the buffer might be out of sync with its position -> which shall lead to "wrong" contents in the buffer -> that shall be observable as audio glitches. that leads to a natural conclusion, that no matter what i do - i shall get audio glitches, whether i am breaking rule #1 or rule #2...

now, practically speaking, the thing is... i am not encountering any audio glitches by breaking rule #1 now... and as audio experts, common sense, and simple experiments suggest - i will (and do) have audio glitches should i break rule #2. so i am taking a practical approach here and doing this now: breaking rule #1 and following rule #2. maybe in the future when iPhones are ported to alpha CPU i will have glitches for not following rule #1 and be in trouble -- not just potential but real. then i will do something about it. or i'll be retired by then and won't care anymore.


> You can’t embed a mutex as a property, as you’ve done in your Mutex class, because you can’t take the address of a property.


thank you for a very good explanation of the problem of pthread mutexes in swift, will keep that in mind, and use NSLock instead.

A lock is not necessarily a block. This is where you have to use good, basic concurrency patterns. If you lock/op/unlock in thread A and then lock/op/unlock in thread B, neither of these are "blocking", assuming that "op" is, itself, a non-blocking operation. It is only when you do another lock, a sync, or some kind of wait, inside the lock, that you will get blocking behaviour.


The locks are only necessary to protect shared memory from concurrent access. If the only thing you do inside a lock is read from or write to that shared memory, then nothing is going to block. You can use higher-level operations like GCD to handle the dirty work (and it can get quite dirty) for you. Instead of using low-level locks, you dispatch blocks asynchronously onto serial queues.


There may potentially be other complications down the road. You will have to be more careful with error checking. Using locks, it is relatively easy to find deadlocks. But with asynchronous blocks, your "deadlocks" may instead be manifested as infinitely growing queues.

> there're just two sets of "rules" here, one contradicting the other.

I don’t see a contradiction here. I did not say “never use a lock-free data structure”. What I said is:

  • Building lock-free data structures is hard.

  • In general, it’s best to avoid doing that.

  • If you can’t avoid it — and real-time audio is one example where that’s the case — it’s best to not design your own data structures but instead base your code on existing research.

Oh, and now that we’ve opened the real-time audio can of worms, which is a very long way from where this thread started, I encourage you to keep the following in mind:

  • You can’t use Swift (or Objective-C), because neither of those languages provides a way to prevent the runtime from allocating memory.

  • Lifting that restriction for Swift would require that:

  • The reason why C and C++ are the standard languages that folks use for this is that:

    • You can control when their runtimes allocate memory
    • The ’11 variants and later have a defined memory model

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

>>>Oh, and now that we’ve opened the real-time audio can of worms, which is a very long way from where this thread started, I encourage you to keep the following in mind:

You can’t use Swift (or Objective-C), because neither of those languages provides a way to prevent the runtime from allocating memory.<<<


and it is littered with locks inside, even when doing "benign looking" things like accessing a property or calling a method (*). I am not using swift or obj-c for audio, and as you already established this thread is mostly not about swift. after i realized that this is not just about the dictionary+ios+swift and verified that the thread sanitizer is not happy even with a single int access in C code i immediately remembered my ring buffer implementation (C code) that is based on that idea and that is used in a real time audio thread. and sure enough after i tested that code with thread sanitizer enabled - it complained about data races in that particular place i was thinking about (writePos - readPos >= size). that's how we ended up here.


what you've brought up later:


>>> Using naïve atomic operations is insufficient because you have two separate chunks of data (the positions and the contents of the buffer itself) and you need to coordinate their access. <<<


is of course very worrying. how in practice can i synchronize, say, 32 ints or 256 floats of data written into a buffer with the buffer position in a realtime thread? be it audio, temperature samples, displacement measurements, or whatever. what established techniques should i use here?


(*) https://developer.apple.com/videos/play/wwdc2015/508/

49:15 -- 50:50


Doug Wyatt:

>>>> So audio rendering almost always happens in a real-time thread context.

And this is a restrictive environment because we can't allocate memory, which means that we really shouldn't even be calling dispatch async, for instance.

And in fact we can't make any call at all which might block, for example, taking a mutex or waiting on a semaphore.

The reason is if we do block and we block for any length of time, then the audio rendering thread in the system will miss its deadline, and the user will experience that as a glitch.

So we have to be very careful when both using and calling or, I'm sorry, both using and implementing these render blocks.

So you will see in the Filter Demo how we went to some lengths to not capture our Self object or any other Objective-C object for that matter.

In that block, we avoid the Objective-C runtime because it's inherently unsafe.

It can take locks.

Unfortunately, the Swift run-time is exactly the same way. <<<<

If you want to talk about contradictions, you've got a pretty good one with real-time programming on iOS/macOS. Neither is a real-time operating system. People may use the term "real-time" in the context of audio programming, but it's still not a true real-time system. There is simply no such thing as a "realtime thread". Those two words are mutually exclusive.


It might be better to think in terms of multiprocessing rather than multithreading. Imagine you are writing a real, real-time system on a multi-core architecture. You don't have context switches. You have two cores, running the same, or different code while both having access to the same memory and other resources. How can you guarantee that the data you read from memory isn't being changed by code running on another core while you are reading it?

regarding the second problem highlighted by Quinn:


>>Using naïve atomic operations is insufficient because you have two separate chunks of data (the positions and the contents of the buffer itself) and you need to coordinate their access.<<


i scanned through all available Apple sample code (it has recently become surprisingly hard to find any sample code on the Apple site) that uses the "render proc" / "input proc" approach to play / record audio. while i can see where they use atomic methods to update the ring buffer positions, nowhere is there an attempt to coordinate the buffer contents with the buffer position. Either Quinn is wrong in stating that it is insufficient, or the Apple sample code authors are wrong - the latter is possible but hard to believe, given that the many developers in the field who based their code on that sample code would've already raised the issue if there was one.

John,


>>>A lock is not necessarily a block. This is where you have to use good, basic concurrency patterns. If you lock/op/unlock in thread A and then lock/op/unlock in thread B, neither of these are "blocking", assuming that "op" is, itself, a non-blocking operation. It is only when you do another lock, a sync, or some kind of wait, inside the lock, that you will get blocking behaviour. <<<


unless priority inversion problem is no longer a problem -- "lock is block". example below:


real time thread {
    mutex.lock(); // will wait potentially unbounded time
    read shared variable
    mutex.unlock();
}

main thread: {
    mutex.lock();
    write shared variable
    -> thread is interrupted here and switched to thread2
    <- back, after unbounded time
    mutex.unlock()
}

thread2() {
    for (;;) { malloc() / free() / whatever() }
}
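
One commonly used way to keep the real-time thread from waiting on a lock is to only ever try the lock there, and to fall back to the previous value when it is busy. The sketch below assumes a hypothetical gain parameter shared between a main thread and a real-time thread. param_lock, shared_gain, set_gain and rt_current_gain are illustrative names. Whether even a try-lock is acceptable on an audio I/O thread is not something this thread settles.

```c
#include <pthread.h>

static pthread_mutex_t param_lock = PTHREAD_MUTEX_INITIALIZER;
static float shared_gain = 1.0f;   /* written by the main thread */

/* Main thread: may wait for the lock, which is fine outside real-time code. */
void set_gain(float g) {
    pthread_mutex_lock(&param_lock);
    shared_gain = g;
    pthread_mutex_unlock(&param_lock);
}

/* Real-time thread: never waits. If the lock is busy, keep the last value. */
float rt_current_gain(float last_gain) {
    if (pthread_mutex_trylock(&param_lock) != 0) {
        return last_gain;          /* lock held elsewhere: skip this update */
    }
    float g = shared_gain;
    pthread_mutex_unlock(&param_lock);
    return g;
}
```

The real-time side trades freshness for bounded time: in the interrupted main thread scenario above it would simply keep using the older value until the lock is free again.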


if priority inversion problem is fixed on apple platforms, then I do not see a reason why locks can not be used in real time threads. or even malloc. or swift / obj-c. the wwdc video i posted was quite old (4 years old) so it might well be possible that this was fixed - in this case we (developers) would appreciate a new clear statement about the current state of the art.