New Thread Sanitizer on Audio Unit callbacks?

For transferring data from the UI thread into and out of very short Audio Unit buffer callbacks, what methods or code sequences are recommended that have sufficiently bounded timing and that will also pass the new Xcode Thread Sanitizer checks for data races? (Especially when using Swift on multiprocessor ARM CPU systems.)

Replies

You should be able to synchronize the data transfer via atomic operations, but you still need to make sure that the writer thread doesn't write to the same data as the reader thread. For example, if you need to transfer audio data, you can memcpy the data (without locks) into a preallocated buffer and then set an _Atomic(BOOL) variable to "1" with memory_order_release to indicate the data is ready. The reader thread loads the atomic variable with memory_order_acquire and, if it sees "1", it can safely read the data, again without locks. You need to make sure the data isn't written again until the reader is done reading it (perhaps with another atomic boolean). There are more complex patterns and data structures, like switching two buffers, circular buffers, lock-free queues, etc., and all of them can be implemented without data races. The solution really depends on the kind of data, the frequency of the transfers, the required latency and other factors. For example, for simple messages generated by the UI (note on, note off), a lock-free queue should be sufficient.
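Just as a sketch of that last case (the struct and function names below are placeholders, not from any library): a minimal single-producer/single-consumer ring buffer in C, where the UI thread pushes small fixed-size messages, the audio callback pops them, and the only shared state is a pair of atomic indices.

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define QUEUE_CAPACITY 256          // must be a power of two

typedef struct {
    uint8_t bytes[3];               // e.g. a raw MIDI event: status, note, velocity
} UIMessage;

typedef struct {
    UIMessage slots[QUEUE_CAPACITY];
    _Atomic uint32_t head;          // advanced only by the consumer (audio thread)
    _Atomic uint32_t tail;          // advanced only by the producer (UI thread)
} MessageQueue;

// UI thread: returns false if the queue is full.
static bool queue_push(MessageQueue *q, const UIMessage *msg) {
    uint32_t tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&q->head, memory_order_acquire);
    if (tail - head >= QUEUE_CAPACITY)
        return false;                                   // full; drop or retry later
    q->slots[tail % QUEUE_CAPACITY] = *msg;
    // Release: the slot contents become visible before the new tail does.
    atomic_store_explicit(&q->tail, tail + 1, memory_order_release);
    return true;
}

// Audio callback: returns false if the queue is empty.
static bool queue_pop(MessageQueue *q, UIMessage *out) {
    uint32_t head = atomic_load_explicit(&q->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&q->tail, memory_order_acquire);
    if (head == tail)
        return false;                                   // empty
    *out = q->slots[head % QUEUE_CAPACITY];
    // Release: tells the producer this slot may be reused.
    atomic_store_explicit(&q->head, head + 1, memory_order_release);
    return true;
}

Because each index is written by exactly one thread, no compare-and-swap or lock is needed, and because all shared accesses are ordered by the acquire/release pairs on head and tail, Thread Sanitizer has nothing to report.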


Can you send an example of what you're currently using for the transfers? Can you show the data races that Thread Sanitizer is reporting?

I am still not sure how to guarantee, on devices with 64-bit ARM multiprocessors, that even though the _Atomic() indicator is written after the data memcpy() in the Swift source code, that same write ordering is actually observed in that order by another thread running on another processor.


e.g. How does one keep the Swift compiler and/or the ARM processor's dcache or write buffer from inverting the write order somewhere, which would lead to the consumer thread seeing the atomic variable set to "1" before the data in the buffer was there to be read, even if the source code for the producer thread was in the correct order? Or never seeing the "1" at all, because it's still sitting in an unflushed dirty data-cache line or a delayed write buffer on another CPU?


Should a barrier (or cache flush?) instruction be issued? But how can one conclude that the barrier instruction won't be reordered by the Swift compiler?


My current code uses a simple lock-free circular queue, with a counter variable of atomic size (32 bits, aligned) updated after the memcpy().
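Roughly, the producer side looks like the following (shown here in C rather than Swift, and heavily simplified; the names and sizes are made up for illustration):

#include <stdint.h>
#include <string.h>

#define RING_SIZE 16384

static float    ring[RING_SIZE];
static uint32_t write_count;    // 32-bit, naturally aligned, but not _Atomic

void producer_write(const float *samples, uint32_t count) {
    // Copy the samples in first (wrap-around handling omitted here)...
    memcpy(&ring[write_count % RING_SIZE], samples, count * sizeof(float));
    // ...then bump the counter. This is the store I'm unsure about: nothing
    // visibly stops the compiler or the CPU from making the new count visible
    // to the audio thread before the copied samples are, and a plain
    // read/write of it from two threads is exactly the kind of access
    // Thread Sanitizer flags as a data race.
    write_count += count;
}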

To get the memory ordering guarantees you need, it's easiest to do this in C or Obj-C, as memory barriers are currently unavailable in Swift, and even calling atomic functions imported from C is not guaranteed to be atomic. So I'm going to keep using Obj-C. Here's an example of how to use atomics with proper memory ordering:


#import <Foundation/Foundation.h>   // for BOOL / YES
#import <stdatomic.h>
#import <string.h>                  // for memcpy

char buffer[1024];
_Atomic(BOOL) data_ready = NO;

// Producer (e.g. the UI thread): fill the buffer, then publish it.
void thread1() {
  memcpy(buffer, ... /* source of the data */, 1024);
  atomic_store_explicit(&data_ready, YES, memory_order_release);
}

// Consumer (e.g. the audio callback): read only after the acquire load sees YES.
void thread2() {
  BOOL ready = atomic_load_explicit(&data_ready, memory_order_acquire);
  if (ready) {
    ... /* read the buffer */
  }
}


Both the compiler and the CPU know about the specified memory ordering (release and acquire here), and they will make sure not to break its semantics. The compiler will not reorder memory accesses across the atomic operations, and it will emit barrier instructions if needed. The CPU will see those barrier instructions and will either flush write buffers or issue an explicit load from main memory (again, only if needed). The exact semantics of the release and acquire (and other) memory orderings are described in the C/C++ standard.
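To tie this back to the Swift side of the original question: one workable pattern is to keep both the buffer and the atomic flag inside a small C/Obj-C file and expose only plain functions to Swift through the bridging header, so the ordering guarantees never depend on what the Swift compiler does. A minimal sketch, with purely illustrative file and function names:

// AudioShared.m -- compiled as C/Obj-C; Swift calls the two functions below
// through the bridging header. (All names here are illustrative.)
#import <stdatomic.h>
#import <stdbool.h>
#import <string.h>

static char          shared_buffer[1024];
static _Atomic(bool) shared_data_ready = false;

// UI thread: returns false if the previous block hasn't been consumed yet.
bool shared_buffer_publish(const char *src, size_t length) {
    if (atomic_load_explicit(&shared_data_ready, memory_order_acquire))
        return false;               // consumer is still entitled to the buffer
    memcpy(shared_buffer, src, length);
    atomic_store_explicit(&shared_data_ready, true, memory_order_release);
    return true;
}

// Audio callback: returns true and fills dst if new data was available.
bool shared_buffer_consume(char *dst, size_t length) {
    if (!atomic_load_explicit(&shared_data_ready, memory_order_acquire))
        return false;
    memcpy(dst, shared_buffer, length);
    // Hand the buffer back so the UI thread may publish the next block.
    atomic_store_explicit(&shared_data_ready, false, memory_order_release);
    return true;
}

Swift can then call shared_buffer_publish() on the UI thread and shared_buffer_consume() in the render callback like any other imported C function; the acquire/release semantics are enforced entirely inside the C code rather than relying on Swift having its own memory model.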