Yeah, I've been running as many threads as there are P-cores for my simulations and making sure to never yield. Unfortunately, it isn't always easy or possible to do that, so I would still appreciate either a proper core-affinity API, or at least a way to opt out of E-cores.
The core asymmetry can create quite annoying situations, for example with OpenMP, where work is distributed equally across unequal cores.
I got a nice explanation from a person in DTS, which I'll briefly summarize here for posterity:
The mach_task_self() call shouldn't work at all and is wrong, since it returns a task port rather than a thread port (I got the idea from https://codereview.chromium.org/276043002/ where the two are used interchangeably).
The other call makes it to the right place, but thread affinity is not implemented / supported on Apple Silicon.
(There the argument was made that "all the cores are basically sharing a single unified cache" which doesn't quite match up with the video describing the 4 P-core to a shared L2 cache arrangement.)
And because I always have trouble following XNU's dispatching of function calls (especially once the Mach layer gets involved), here's the walk-through of the dispatches:
the main entry point thread_policy_set(...) lives in https://github.com/apple-oss-distributions/xnu/blob/xnu-8019.80.24/osfmk/kern/thread_policy.c
and goes to thread_policy_set_internal(...)
which asks thread_affinity_is_supported(), which comes from https://github.com/apple-oss-distributions/xnu/blob/e6231be02a03711ca404e5121a151b24afbff733/osfmk/kern/affinity.c
and then dispatches to ml_get_max_affinity_sets() != 0 (an architecture-specific function)
which for ARM says "no sets supported": https://github.com/apple-oss-distributions/xnu/blob/bb611c8fecc755a0d8e56e2fa51513527c5b7a0e/osfmk/arm/cpu_affinity.h
and voilà, KERN_NOT_SUPPORTED.
I've filed a DTS incident (Case ID: 797192903); the repro looks as follows:
#include <pthread.h>
#include <mach/mach_init.h>
#include <mach/thread_policy.h>
#include <mach/thread_act.h>
#include <iostream>

int main(int argc, char const *argv[])
{
#ifdef _OPENMP
#pragma omp parallel
#endif
    {
        thread_affinity_policy_data_t policy = { 1 }; // non-zero affinity tag
        // note: neither port needs to be released (mach_task_self() returns a
        // cached name, pthread_mach_thread_np() does not add a reference)
        auto r1 = thread_policy_set(mach_task_self(), THREAD_AFFINITY_POLICY,
                                    (thread_policy_t)&policy, THREAD_AFFINITY_POLICY_COUNT); // 4 = KERN_INVALID_ARGUMENT
        auto r2 = thread_policy_set(pthread_mach_thread_np(pthread_self()), THREAD_AFFINITY_POLICY,
                                    (thread_policy_t)&policy, THREAD_AFFINITY_POLICY_COUNT); // 46 = KERN_NOT_SUPPORTED
#ifdef _OPENMP
#pragma omp critical
#endif
        std::cout << "r1 = " << r1 << " r2 = " << r2 << std::endl;
    }
    return 0;
}
For the non-OpenMP version, compile with clang++ -std=c++11 main.cpp; for the OpenMP version, use something like /opt/homebrew/opt/llvm/bin/clang++ -fopenmp -std=c++11 main.cpp -L /opt/homebrew/opt/llvm/lib.
I very much understood this as “you will not get the necessary DriverKit entitlements for a virtual device” (and to keep using an Audio Server PlugIn for that purpose). So we’ll be stuck with installers and non-App Store distribution.
Nope, but I hope to be able to do some experiments on real hardware once my DTK replacement arrives.
I wonder whether you're running into the same problem as the one here https://developer.apple.com/forums/thread/667459?answerId=650916022#650916022 (which is to say https://bugs.swift.org/browse/SR-5872 ).
IIRC the compiler treats the memory returned by malloc as not having any defined value (it is "poison"), so it can remove your test against any *specific* value (as there were no writes). That leaves a matching malloc and free pair, which can be elided, which in turn leaves a loop with the rng whose results are never used, and thus removed as well.
I did some googling and my current guess is that these channels contain content for:
- Hearing Impaired (with emphasis on dialog)
- Visually Impaired (audio description)
The mention I found was https://www.isdcf.com/papers/ISDCF-Doc4-Audio-channel-recommendations.pdf, which doesn't seem to be about the THM format (documentation for that seems rather sparse) but is at least related to multi-channel audio.
From what I can tell (as I'm doing the same thing), you don't need a special entitlement because the AudioServerPlugIn is not an app extension but a completely stand-alone plug-in.
Whether that plug-in needs to be signed with a Developer ID (or notarized) I'm actually not sure about (because AudioServerPlugIns are already sandboxed). FWIW, I'm doing that (i.e. signing it) with a normal (paid) Developer ID and it seems to work for other people, although setting up the whole process is fraught with potential for errors.
I'm not distributing via the App Store, but if you were, you'd have to install your plug-in using the normal mechanism (Apple Installer, or maybe manually copying it to /Library/Audio/Plug-Ins/HAL), and I'm not sure that's encouraged (or even allowed) for App Store apps.
You can use KVO to observe AVPlayer.status. Note that this has been broken in Swift for >3 years, see https://bugs.swift.org/browse/SR-5872 .
So in that case my work-around was to observe .rate instead.
I'll try to explain again: my app is forwarding audio from a (virtual) input device to a (real) output device. This process is driven by the IOProc of the output device. As I want to incur as little latency as possible (i.e. use the most up-to-date data from the input device), I want my IOProc to be scheduled as late as possible while still hitting the deadline (see the documentation for kAudioDevicePropertyIOCycleUsage - https://developer.apple.com/documentation/coreaudio/kaudiodevicepropertyiocycleusage - the header docs are more helpful; the web documentation seems useless).
But I do have a bit of work to do on the data before it's ready, so I need to figure out what I can set kAudioDevicePropertyIOCycleUsage to and still hit the deadline (per the documentation, with the default IOCycleUsage you get scheduled for the entire cycle of the audio duration). So far, I've been benchmarking how long my audio fiddling needs (on some synthetic input data) before starting the IOProc, then multiplying that by a safety factor and adjusting kAudioDevicePropertyIOCycleUsage accordingly.
There are no other threads involved in this work (so the AudioWorkGroup APIs shouldn't be needed), but I'm wondering how to best estimate the IOCycleUsage now that differently speedy cores are in play. Either I need to benchmark on the performance cores (and then the IOProc should always run on the performance cores, or I need a safety factor for how much slower the efficiency cores are), or I need to force the benchmarking onto the efficiency cores (that way I'll never be late, but I might leave some latency on the table).
Does that make some sense?
I have little actual knowledge of this area (and I may be misunderstanding what you're doing), but I feel that if what you're trying to do were possible, it'd be too easy to phish credentials by overlaying "your own" fake login window, so maybe it not working is by design?
I wasn't aware that SystemExtensions were meant to include AudioServerPlugIns. I can only find references to .dext (DriverKit) and .kext (proper kernel extensions, deprecated) in the SystemExtensions documentation (as well as references to EndpointSecurity and NetworkExtensions).
I can think of two potential problems you may be running into:
By using /usr/bin/python (which is part of the system), your code is running with SIP (System Integrity Protection) enabled and therefore in a more restrictive environment (certain dyld environment variables are filtered out). Maybe Python refuses to load / run non-system .dylibs (as otherwise they would run with the rights of the signed system process)?
I seem to recall python loading its extensions fairly restrictively (RTLD_LOCAL among other dlopen-flags, but check the source); maybe that causes havoc with proper enumeration?
If it's during (early) boot, it could be Secure Boot: https://support.apple.com/en-us/HT208330
I don't have any specifics about which servers that uses though. Later on, I suspect a gazillion connection requests will come from all kinds of daemons and apps.