Variable Multithread Access - Corruption

Hi fellow developers,


In a nutshell:

I have one counter variable that is accessed from many threads. Although I've implemented multi-thread read/write protections, the variable seems to still -in an inconsistent way- get written to simultaneously, leading to incorrect results from the counter.


Getting into the weeds:

I'm using a "for loop" that triggers roughly 100 URL requests in the background, each in its “DispatchQueue.global(qos: .userInitiated).async” queue.


These processes are async, once they finish they update a “counter” variable. This variable is supposed to be multi-thread protected, meaning it’s always accessed from one thread and it’s accessed syncronously. However, something is wrong, from time to time the variable will be accessed simultaneously by two threads leading to the counter not updating correctly. Here's an example, lets imagine we have 5 URLs to fetch:


We start with the Counter variable at 5.

1 URL Request Finishes -> Counter = 4

2 URL Request Finishes -> Counter = 3

3 URL Request Finishes -> Counter = 2

4 URL Request Finishes (and for some reason – I assume variable is accessed at the same time) -> Counter 2

5 URL Request Finishes -> Counter = 1


As you can see, this leads to the counter being 1, instead of 0, which then affects other parts of the code. This error happens inconsistently.


Here is the multi-thread protection I use for the counter variable:


1. Dedicated Global Queue

//Background queue to syncronize data access

fileprivate let globalBackgroundSyncronizeDataQueue = DispatchQueue(label: "globalBackgroundSyncronizeSharedData")


2. Variable is always accessed via accessor:

var numberOfFeedsToFetch_Value: Int = 0

var numberOfFeedsToFetch: Int {

set (newValue) {

globalBackgroundSyncronizeDataQueue.sync() {

self.numberOfFeedsToFetch_Value = newValue

}

}

get {

return globalBackgroundSyncronizeDataQueue.sync {

numberOfFeedsToFetch_Value

}

}

}


I assume I may be missing something but I've used profiling and all seems to be good, also checked the documentation and I seem to be doing what they recommend. Really appreciate your help.


Thanks!!


Marc

Accepted Reply

You have mistakes in multi-thread read/write protections.


Using `sync` on serial queue is not a bad idea, but using it for a simple set operation does not work as you expect.


Consider this simple operation:

self.numberOfFeedsToFetch_Value -= 1

Simply it generates 3 steps of code:


get numberOfFeedsToFetch_Value
subtract one from it
set it to numberOfFeedsToFetch_Value


You protect only get and set, think what may happen in mutithreaded access:


Thread-1                                   Thread-2
[get numberOfFeedsToFetch_Value] (3)
                                           [get numberOfFeedsToFetch_Value] (3)
subtract one from it (2)                   subtract one from it (2)
[set it to numberOfFeedsToFetch_Value] (2)
                                           [set it to numberOfFeedsToFetch_Value] (2)

Assuming the value of numberOfFeedsToFetch_Value was 3, it may be 2 after two (or possibly more) threads have finished decrementing it.


You need to protect whole operation which needs to be done atomically:

globalBackgroundSyncronizeDataQueue.sync {
    self.numberOfFeedsToFetch_Value -= 1
}

(You need to remove your protection for get and set to avoid dead-lock.)


Thread-1                                   Thread-2
[get numberOfFeedsToFetch_Value (3)
 subtract one from it (2)
 set it to numberOfFeedsToFetch_Value] (2)
                                           [get numberOfFeedsToFetch_Value (2)                    
                                            subtract one from it (1)
                                            set it to numberOfFeedsToFetch_Value] (1)


Please try and see if your app's behavior would change.

Replies

The individual accessors are thread safe, but an increment operation isn't atomic given how you've written the code. That is, while one thread is getting or setting the value, no other threads can also be getting or setting the value. However, there's nothing preventing thread A from reading the current value (say, 2), thread B reading the same current value (2), each thread adding one to this value in their private temporary, and then each thread writing their incremented value (3 for both threads) to the property. So, two threads incremented but the property did not go from 2 to 4; it only went from 2 to 3.


You need to do the whole increment operation (get, increment the private value, set) in an atomic way such that no other thread can read or write the property while it's in progress.

You have mistakes in multi-thread read/write protections.


Using `sync` on serial queue is not a bad idea, but using it for a simple set operation does not work as you expect.


Consider this simple operation:

self.numberOfFeedsToFetch_Value -= 1

Simply it generates 3 steps of code:


get numberOfFeedsToFetch_Value
subtract one from it
set it to numberOfFeedsToFetch_Value


You protect only get and set, think what may happen in mutithreaded access:


Thread-1                                   Thread-2
[get numberOfFeedsToFetch_Value] (3)
                                           [get numberOfFeedsToFetch_Value] (3)
subtract one from it (2)                   subtract one from it (2)
[set it to numberOfFeedsToFetch_Value] (2)
                                           [set it to numberOfFeedsToFetch_Value] (2)

Assuming the value of numberOfFeedsToFetch_Value was 3, it may be 2 after two (or possibly more) threads have finished decrementing it.


You need to protect whole operation which needs to be done atomically:

globalBackgroundSyncronizeDataQueue.sync {
    self.numberOfFeedsToFetch_Value -= 1
}

(You need to remove your protection for get and set to avoid dead-lock.)


Thread-1                                   Thread-2
[get numberOfFeedsToFetch_Value (3)
 subtract one from it (2)
 set it to numberOfFeedsToFetch_Value] (2)
                                           [get numberOfFeedsToFetch_Value (2)                    
                                            subtract one from it (1)
                                            set it to numberOfFeedsToFetch_Value] (1)


Please try and see if your app's behavior would change.

Thanks for the clear & detailed explanation OOPer, after reading your post it seems so obvious. Very much appreciated.


I'm testing the implementation as we speak, first results after doing the following. 1. Removed get & set protections:


    var numberOfFeedsToFetch_Value: Int = 0
    var numberOfFeedsToFetch: Int {
        set (newValue) {
            globalBackgroundSyncronizeDataQueue.sync()  {
                self.numberOfFeedsToFetch_Value = newValue
            }
        }
        get {
            return globalBackgroundSyncronizeDataQueue.sync {
                numberOfFeedsToFetch_Value
            }
        }
    }


2. And instead, every time I change something I use:


self.globalBackgroundSyncronizeDataQueue.sync() {
      self.numberOfFeedsToFetch = self.numberOfFeedsToFetch - 1
}


So far have not experience the problem, but thread sanitizer now says:

Data race in App.FeedManager.numberOfFeedsToFetch.setter : Swift.Int at 0x7b480001e300.


Any advice on what could I do? If I add the protections I would run into the deadlock problem. Given that all writes happen within the same Data Queue using sync, is it safe to ignore?


Also, I have now secured the write operations, but what about the read ones? Should I also do something about them?

For instance, what about this if statement:

        if numberOfFeedsToFetch == 0 {
            noMoreFeedsToFetchUpdate()
            return
        }



Thanks!


Marc

First impression is that CPU goes to 100% and overall update process is slower. I'm trying to figure out why.


One possibility, is your code doing some sort of polling?

while !someConditionMet() {
    //...nearly empty
}
//Do something after some condition met

This sort of coding easily make CPU usage up to 100%.


I have now secured the write operations, but what about the read ones?


In a strict point of view, single word (or less) read may be a non-atomic operation. But paractically, very rare in CPUs like x86 or ARM (and compilers like LLVM/Swift).

So, when you want to remove any possible flaws, you should better protect read operations. (Do it outside the getter, to avoid dead-lock.)

But in most cases, you observe nothing other than slowdown.


One more thing.


There may be another better solution than using some protected decrement.

If you can show overall structure showing how your numberOfFeedsToFetch_Value is being used, someone would be able propose.

Hi OOPer,


Just identified the CPU issue, is now fixed. However Thread Sanitizer now detects the following:


self.globalBackgroundSyncronizeDataQueue.sync() {
        self.numberOfFeedsToFetch = self.numberOfFeedsToFetch - 1
}


In line 2 -> Data race in App.FeedManager.numberOfFeedsToFetch.setter : Swift.Int at 0x7b480001e300.


Any advice on what could I do? If I add the protections I would run into the deadlock problem. Given that all writes happen within the same Data Queue using sync, is it safe to ignore?



And re.

There may be another better solution than using some protected decrement.
If you can show overall structure showing how your numberOfFeedsToFetch_Value is being used, someone would be able propose.

It's just a counter, when the app needs to update its feeds (the app is an RSS Reader) it will count how many operations should return a value and will launch the requests in parellel. Every time one of this requests returns (either with success or fail) it will reduce the counter by 1 until it reaches 0. This will then trigger some cleanup.


Thanks!


Marc

Try this:


self.globalBackgroundSyncronizeDataQueue.sync() { 
        self.numberOfFeedsToFetch -= 1 
}


Does that make the error go away?

It does not, same error.

Hmm, do you have a freestanding code fragment that we can use to get the error? Or a test project we can download from somewhere? The right conditions need to be set up for the error to be detected, it seems.


BTW, since the critical sections of code (get, set, increment) are so short, dispatching on a separate queue seems a bit like overkill. I would normally use DispatchSemaphore for this. I've also heard that 'os_unfair_lock' is another good choice, but I've never actually used it.

Will look into that, thanks a lot for the tip, I'm not familiar with DispatchSemaphore.


I'm new to handling multithreading, huge thanks to all of you 🙂


Will close the item as it seems the changes above solved the issue (even though the debugger is not fully happy). Will update if the issue re-appears (or if I transition to DispatchSemaphore).


Marc

>> the changes above solved the issue (even though the debugger is not fully happy)


I know it's counter-intuitive, but you haven't solved the issue if you're still getting error messages about a race condition. The danger of a race condition is that it won't do any damage unless the timing is just "right" for a problem to show up. This may manifest itself as occasional failures for all users of your app, or frequent failures for a few unlucky users.


Or do you mean something else by the "unhappy" debugger?

Should I paste a zipped sample project here that shows the issue? Post it somewhere and link to it?


May be a dumb question but have never done this before.


Thanks Quencey!

The forum won't let you post a file here. You can use of the public file sharing services, if you don't have anywhere else to put it.