Posts

Post marked as solved
21 Replies
> My assumption is that you’ll need a memory barrier …

Yes, looks like it, though I've never seen one used for audio on iOS/macOS.

>> I assume this is a current limitation (or a bug) of Thread Sanitizer, and memmove doesn't actually do anything special (like flushing caches or inserting memory fences) beyond what the simple assignment does, correct?

> Correct.

I remember BlockMove used to do something special about caches. :-)

> With regards priority inversion, most of the built-in locking primitives provide a priority boost to the thread holding the lock when there’s a high-priority thread waiting for it. You can learn more about this in the Lock Ownership section of WWDC 2017 Session 706, Modernizing Grand Central Dispatch Usage.

Thanks, will definitely watch.

I created a simple test application that tests the bullet point #2 statement (the necessity of coordinating accesses to the buffer contents and its position). It is a 370-line, all-in-one-source-file app (no nib files) for iOS and macOS. On iOS the app has some minimal UI; on macOS it is a console app that prints some info every second. The code is mostly C as far as audio is concerned; the notable exception is std::atomic<int>, used for the ring buffer positions to avoid data races on them. No attempt is made in the code to coordinate the buffer contents with its position. The app works like this:

- The app gradually generates audio data (a chord of three notes) and fills the ring buffer with it. This happens on the main thread. The assumption is that for the issue to occur, the ring buffer and the filling chunks have to be small enough. I tested it with a filling duration down to 10 ms at 48 kHz (so 480 samples == 2 KB), and also with big filling chunks / a big ring buffer (e.g. 1/3 s / 1 s).
- In the IO callback (real-time audio thread) the app reads from the ring buffer and plays it.
- To ensure the data is not damaged (the actual test of the bullet point #2 statement), the app also generates the relevant chunk of audio data "as it should be" on the fly and compares it with what came from the ring buffer. It asserts the two are equal, so in a debug build the app will break.
- The main UI on iOS, or the console output on macOS, shows various information (sample rate, etc.) along with an "error count". Should this error count ever be anything but zero, there is indeed a need to coordinate accesses to the ring buffer contents and its position.
- If you want to compile this app, I can either provide the project, or you can create a new iOS app project and remove everything, including the "Main storyboard file base name" reference in its plist. For the macOS target, use a console app. I found I had to manually add frameworks (in the Xcode settings pane) when C++ is used in the app. Also enable background audio in the iOS target if you want the app to work in the background.
- Note that error checking in the app is minimal; the app asserts on errors.

I was not able to surface problem #2 so far. I tested debug and release builds on iPhone X, iPhone 6s, iPad Pro, and a MacBook Pro. I will soon test the app on the most recent iPhone and update here if it is any different.

I foresee this awkward conversation should I talk about this app with DTS:

>>>>>>>>>>>>>>>
ME: Please see the app. It works, but it shall not.
DTS: We tested your app and weren't able to reproduce any issues. Do you know on what hardware and/or under what circumstances it shall misbehave?
ME: No, but it may fail when iPhones are ported to an Alpha CPU, or when ARM moves to a weaker memory model, or when the tools evolve, or maybe on the current hardware in other cases which I haven't found - please help me find those. Quinn told me that to play by the rules I shall synchronise access to the buffer and its position, otherwise I am (and always was) in trouble without realising it.
DTS: You shall talk to Quinn then. From what we see, the app works OK, and we are unable to find a case where it doesn't work, so no changes are needed. Following the YAGNI principle, if the app breaks in the future, you fix it right there and then, in the future. We also compared what you do in the app to other relevant sample code and found no significant differences. Having said that, we have to close this DTS incident.
<<<<<<<<<<<<<<

Let's see what they actually say.

==== the app ====

475 lines skipped, probably too big for this forum. Will send it to those who want it. The most interesting fragment looks like this:

static OSStatus renderCallback(void* refCon, AudioUnitRenderActionFlags* actionFlags, const AudioTimeStamp* ts, UInt32 element, UInt32 numFrames, AudioBufferList* ioData) {
    ...
    ringRead(data, size);
    generateAudio(&phase, data2, size);
    assert(memcmp(data, data2, size) == 0);
    ...
}
---
Thanks Quinn, that makes sense. I haven't seen this previously in sample code, so I'm going to ask DTS (via a tech support incident) how to correctly address your bullet point #2 (coordinating the buffer contents with its position) from inside an audio callback. Let's see what they answer.

One follow-up question. I can see that I can easily fool Thread Sanitizer by doing this:

int one = 1;
memmove(&flag, &one, sizeof(one));

instead of:

flag = 1;

I assume this is a current limitation (or a bug) of Thread Sanitizer, and memmove doesn't actually do anything special (like flushing caches or inserting memory fences) beyond what the simple assignment does, correct?

Please also comment on the messages above, which might have been lost (whether the priority inversion problem is fixed on Apple platforms, and if so, whether it has any impact on, or mitigates, what we can now call from RT threads; and whether the info in the old article by Ross Bencina is still applicable to Apple platforms, as it (mostly) bases its rationale on the priority inversion issue).

My current understanding is that the priority inversion problem is the main spanner in the works, and if it is solved, then not only does this open the possibility of using locks inside RT threads, but also of doing other things that were previously impossible (e.g. using malloc or the Swift/Obj-C runtime inside RT threads). Please share your thoughts on this.
---
This article is interesting, but 8 years old: "Real-time audio programming 101: time waits for nothing" by Ross Bencina. If it's no longer applicable to Apple platforms, please shout.
---
John,

>>> A lock is not necessarily a block. This is where you have to use good, basic concurrency patterns. If you lock/op/unlock in thread A and then lock/op/unlock in thread B, neither of these are "blocking", assuming that "op" is, itself, a non-blocking operation. It is only when you do another lock, a sync, or some kind of wait, inside the lock, that you will get blocking behaviour. <<<

Unless the priority inversion problem is no longer a problem, "lock is block". Example below:

real time thread {
    mutex.lock(); // will wait a potentially unbounded time
    read shared variable
    mutex.unlock();
}

main thread {
    mutex.lock();
    write shared variable
    -> thread is interrupted here and switched to thread2
    <- back, after unbounded time
    mutex.unlock();
}

thread2 {
    for (;;) {
        malloc() / free() / whatever()
    }
}

If the priority inversion problem is fixed on Apple platforms, then I do not see a reason why locks cannot be used in real-time threads. Or even malloc. Or Swift / Obj-C. The WWDC video I posted was quite old (4 years), so it might well be possible that this was fixed - in that case we (developers) would appreciate a new, clear statement about the current state of the art.
---
Regarding the second problem highlighted by Quinn:

>> Using näive atomic operations is insufficient because you have two separate chunks of data (the positions and the contents of the buffer itself) and you need to coordinate their access. <<

I scanned through all the available Apple sample code (it has recently become surprisingly hard to find any sample code on the Apple site) that uses the "render proc" / "input proc" approach to play / record audio. While I can see where it uses atomic operations to update the ring buffer positions, nowhere is there an attempt to coordinate the buffer contents with the buffer position. Either Quinn is wrong in stating that it is insufficient, or the Apple sample code authors are wrong. The latter is possible, but hard to believe, given that the many developers in the field who based their code on that sample code would already have raised the issue if there was one.
---
Post not yet marked as solved
11 Replies
I mean I use Obj-C as a thin bridge layer between Swift and C++. It is hard/impossible to use C++ without an Obj-C layer in an otherwise Swift app.
---
>> I did compile a sample app and could reproduce the problem.

Thanks.

>> Originally I tried it in my own app, which doesn't use ARC. Sounds like yet another reason to avoid using ARC if it is going to silently mask errors like this.

That would be a ******* :-) I only use Obj-C now when I have to; generally it is a very thin shim between Swift and C++ (which I also use only when I have to). In this context it's particularly problematic to use Obj-C without ARC.
---
>>> Oh, and now that we’ve opened the real-time audio can of worms, which is a very long way from where this thread started, I encourage you to keep the following in mind: You can’t use Swift (or Objective-C), because neither of those languages provide a way to prevent the runtime from allocating memory. <<<

And it is littered with locks inside, even when doing "benign-looking" things like accessing a property or calling a method (*). I am not using Swift or Obj-C for audio, and as you already established, this thread is mostly not about Swift.

After I realized that this is not just about the dictionary + iOS + Swift, and verified that Thread Sanitizer is not happy even with a single int access in C code, I immediately remembered my ring buffer implementation (C code) that is based on that idea and is used in a real-time audio thread. And sure enough, after I tested that code with Thread Sanitizer enabled, it complained about data races in the exact place I was thinking about (readPos - writePos > size). That's how we ended up here.

What you've brought up later:

>>> Using näive atomic operations is insufficient because you have two separate chunks of data (the positions and the contents of the buffer itself) and you need to coordinate their access. <<<

is of course very worrying. How, in practice, can I synchronize, say, 32 ints or 256 floats of data written into a buffer with the buffer position, in a real-time thread? Be it audio, temperature samples, displacement measurements, or whatever. What established techniques should I use here?

(*) https://developer.apple.com/videos/play/wwdc2015/508/ 49:15 - 50:50

Doug Wyatt:
>>>> So audio rendering almost always happens in a real-time thread context. And this is a restrictive environment, because we can't allocate memory, which means that we really shouldn't even be calling dispatch async, for instance. And in fact we can't make any call at all which might block, for example, taking a mutex or waiting on a semaphore. The reason is, if we do block, and we block for any length of time, then the audio rendering thread in the system will miss its deadline, and the user will experience that as a glitch. So we have to be very careful when both using and implementing these render blocks. So you will see in the Filter Demo how we went to some lengths to not capture our Self object, or any other Objective-C object for that matter. In that block, we avoid the Objective-C runtime, because it's inherently unsafe. It can take locks. Unfortunately, the Swift runtime is exactly the same way. <<<<
---
That makes it a mystery then why I do not see audio glitches on iOS, as I am not synchronizing the ring buffer contents with its position.

>>> I don’t find that mysterious at all. When dealing with concurrency issues, the absence of symptoms does not imply the absence of a problem. A classic example of this is the dictionary problem that kicked off this thread. Everything worked on previous systems, but only by good fortune. To guarantee that things will work moving forwards, you have to follow the rules. <<<

There are just two sets of "rules" here, one contradicting the other. Rule #1 is the one you describe. Rule #2 is what audio experts like Doug Wyatt and many others have told us in WWDC sessions over the years, and what we can read all over the internet: never, ever do anything in the audio I/O real-time callback that can block. If you break the latter rule, you will get audio glitches because of the potentially unbounded time spent inside a lock (or even not "unbounded" time, but just enough to miss the audio thread's deadline, which is very strict - sub-millisecond for small I/O sizes). The thing you are suggesting (synchronizing the ring buffer position with its contents, presumably by means of a synchronization mechanism like a mutex, dispatch queue, etc.) is a direct contradiction of that latter rule.

Now, you are telling me that because I'm breaking rule #1, I am (and always was) in trouble, as the contents of the buffer might be out of sync with its position -> which shall lead to "wrong" contents in the buffer -> which shall be observable as audio glitches. That leads to the natural conclusion that no matter what I do, I shall get audio glitches, whether I break rule #1 or rule #2. Now, practically speaking, the thing is... I am not encountering any audio glitches by breaking rule #1 now... and, as the audio experts, common sense, and simple experiments suggest, I will (and do) get audio glitches should I break rule #2.

So I am taking a practical approach here, and doing this for now: breaking rule #1 and following rule #2. Maybe in the future, when iPhones are ported to an Alpha CPU, I will get glitches for not following rule #1 and be in trouble - not just potential but real. Then I will do something about it. Or I'll be retired by then and won't care anymore.

> You can’t embed a mutex as a property, as you’ve done in your Mutex class, because you can’t take the address of a property.

Thank you for a very good explanation of the problem with pthread mutexes in Swift. I will keep that in mind and use NSLock instead.
---
> No. As I mentioned earlier, Arm uses a weak memory model.

That makes it a mystery then why I do not see audio glitches on iOS, as I am not synchronizing the ring buffer contents with its position.

> Using pthread mutexes from Swift is a challenge. Try using NSLock instead.

Thanks, NSLock helped indeed. Is it worth filing a bug against the Xcode 11 Thread Sanitizer false positive for the pthread mutex variable data race?
---
Quinn,

> It’s simply not safe to share unprotected mutable state between threads in Swift. It’s also invalid on older versions of C-based languages (anything prior to the ‘11’ releases).

Interesting. What was the feature introduced in C11 that made this possible?

> Even though readPos and writePos are only modified by a single thread, you still need to use atomic operations for them because they are shared between threads.

Please clarify what happens if I don't - will the other thread not notice the change at all, or will it happen way too late, or what? For the record, I've not encountered any problems so far (only tested on the CPUs used in Intel Macs / iPhones).

> Using näive atomic operations is insufficient because you have two separate chunks of data (the positions and the contents of the buffer itself) and you need to coordinate their access.

That's understandable. In my case the contents of the buffer is audio data, so the problem would show itself as corrupt audio... which I haven't encountered so far. But I've only tested on Mac and iOS device hardware, so Intel and ARM. Is ARM also forgiving?

> Building correct lock-free data structures is incredibly difficult. My general advice is that you avoid the whole issue.

Understood. Sometimes it is unavoidable (the aforementioned real-time audio I/O) - but then we still have C / C++.

Back to my test: I put locks around the shared data accesses and, for the sake of the test, even changed the shared state from a dictionary to a single Int (see below). That probably fixed the main issue, but now Thread Sanitizer is complaining about data race conflicts on the mutex variable itself :-) Weird. Again, this is on Xcode 11 Beta 3, which might not be stable enough.

============

import Foundation

class Mutex {
    private var mutex = pthread_mutex_t()
    init() { pthread_mutex_init(&mutex, nil) }
    func lock() { pthread_mutex_lock(&mutex) }
    func unlock() { pthread_mutex_unlock(&mutex) }
}

var sharedState: Int = 0
var mutex = Mutex()

func test() {
    Thread.detachNewThread {
        var counter = 0
        while true {
            counter += 1
            mutex.lock()
            sharedState = counter
            mutex.unlock()
            usleep((0...100000).randomElement()!)
        }
    }
    Thread.detachNewThread {
        var counter = 0
        while true {
            counter += 1
            mutex.lock()
            let value = sharedState
            mutex.unlock()
            usleep((0...100000).randomElement()!)
        }
    }
}
---
Hmm, according to Thread Sanitizer, concurrent access even to an individual Int doesn't work... I had this ring buffer implementation and thought it was bulletproof. Pseudo-code:

var readPos = 0  // only increased in the reader, and there is only one reader
var writePos = 0 // only increased in the writer, and there is only one writer

single reader (size) {
    if writePos - readPos >= size { // (1)
        actually read
        readPos += size // readPos is changed only here, and only increases
    }
}

At (1) there might be a simultaneous writer access that can potentially change writePos, but as it only increases writePos, the condition itself will not change. The presumption here is that writePos changes and is read atomically, so a side observer will only ever encounter either the old value or the new value, but never a mixture (e.g. two bytes of the old and two bytes of the new value).

single writer (size) {
    if totalSize - (writePos - readPos) >= size { // (2)
        actually write
        writePos += size // writePos is changed only here, and only increases
    }
}

At (2) there might be a simultaneous reader access that can potentially change readPos, but as it only increases readPos, the condition itself will not change. The same presumption as above applies, about atomic changes / reads of readPos.

I specifically designed and have used this approach throughout the years, as it does not use locks and can be used in real-time contexts like audio I/O procs. Am I in trouble now? What shall I do instead?
---
It doesn't crash under macOS either (with Xcode 11). I assumed (wrongly?) that I could work with a dictionary (or any other value type) as with a struct of a few Ints - that it is "safe" to simultaneously access / modify it from threads, in the sense that it won't crash the app, although it might give inconsistent / unexpected results.

Are you suggesting this?

thread1:
    mutex.lock()
    value = dictionary[key]
    mutex.unlock()
    return value

thread2:
    mutex.lock()
    dictionary[key] = value
    mutex.unlock()

And was it the case that in previous implementations this mutex was built in, and now it is not?

Good hint about the sanitizer, will enable it from now on.
---
Post marked as solved
6 Replies
> The downside of this sort of thing is that it's harder to think about what's happening in terms of a simple rule.

This is a bit worrisome. As your example with mimi vs fifi shows, sometimes the number of ??? is important, and this fuzzy logic of the compiler collapsing levels of optionality "with good intentions" might actually break something.