Post

Replies

Boosts

Views

Activity

Notarisation failure where CFBundlePackageType is non-standard
HiNotarisation of audio plugin bundles in the AAX format (Pro Tools compatible) is causing a problem. The AAX formatted plugins define a non-standard CFBundlePackageType value of "TDMw" in their Info.plist and that appears to prevent them from being recognised by the notarisation process. If this is artificially changed to "BNDL" as in other plug-in formats notarisation and stapling is successful.Although some level of success is observed when attempting to notarise, it is only the Pace components in the bundle that notarise correctly (all AAX plug-ins must be signed by a third party tool from Pace). The actual plug-in fails to notarise unless the package type is changed, so when looking to staple this fails as not everything in the bundle got through the process.Is it possible to address this on the notary servers please? It is important to the audio community that we are able to notarise AAX plug-ins and the pkg files that may contain them (this obviously causes the entire pkg installer to fail to staple as well).I can provide further details and log files if necessary.Thank youMatt
2
0
1.2k
Sep ’19
Why does the execution of vDSP operations sometimes take longer in M1 native code than through Rosetta translation?
Hi I am porting some applications to M1 that make extensive use of vDSP. I found in many cases there to be a minimal speed-up, which I put down to Rosetta doing a good job translating SSE instructions into equivalent Neon instructions in the vDSP library. To try and understand this more I started profiling various areas of code and have found situations where the performance of translated code runs faster than natively. Often native code speed is similar or faster as expected, but there are a notable numbers of cases where it is not. This is not what I expected. I include a sample below to show a somewhat contrived and trivial routine exhibiting the effect. I have built it using XCode 12.5.1 in Release with an 11.3 deployment target. The Mac is running macOS 11.6. On my M1 Mac mini the Rosetta build takes around 900-1000 µs to run to completion, switching to native code it takes around 1500-1600 µs. I can make various adjustments to the data size or types of vDSP operations used to find scenarios where native builds are faster, that is not difficult, but it shouldn't be necessary. I can understand why vDSP could perhaps perform similarly across native vs translated runs, but surely it should never be the case that translated code could beat native code by a margin like this. What is going on, and is it expected? Thanks, Matt #include <iostream> #include <sys/types.h> #include <sys/sysctl.h> // determine if process is running through Rosetta translation int processIsTranslated() {   int ret = 0;   size_t size = sizeof(ret);   if (sysctlbyname("sysctl.proc_translated", &ret, &size, NULL, 0) == -1)   {    if (errno == ENOENT)      return 0;    return -1;   }   return ret; } int main(int argc, const char * argv[]) {   // print translation status   if(processIsTranslated() == 1)     std::cout << "Rosetta" << std::endl;   else     std::cout << "Native" << std::endl;       // size of test   vDSP_Length array_len = 512;   const int iterations = 10000;       // allocate and clear memory   float* buf1_ptr = (float*)malloc(array_len * sizeof(float));   float* buf2_ptr = (float*)malloc(array_len * sizeof(float));   float* buf3_ptr = (float*)malloc(array_len * sizeof(float));   float* buf4_ptr = (float*)malloc(array_len * sizeof(float));   if(!buf1_ptr) return EXIT_FAILURE;   if(!buf2_ptr) return EXIT_FAILURE;   if(!buf3_ptr) return EXIT_FAILURE;   if(!buf4_ptr) return EXIT_FAILURE;   memset(buf1_ptr, 0, array_len * sizeof(float));   memset(buf2_ptr, 0, array_len * sizeof(float));   memset(buf3_ptr, 0, array_len * sizeof(float));   memset(buf4_ptr, 0, array_len * sizeof(float));       // start timer   __uint64_t start_ns = clock_gettime_nsec_np(CLOCK_UPTIME_RAW);   // scalar constants   const float scalar1 = 10;   const float scalar2 = 11;   // loop test   for(int i = 0; i < iterations; i++)   {     vDSP_vsadd(buf1_ptr, 1, &scalar1, buf2_ptr, 1, array_len);     vDSP_vsadd(buf1_ptr, 1, &scalar2, buf3_ptr, 1, array_len);     vDSP_vadd(buf2_ptr, 1, buf3_ptr, 1, buf4_ptr, 1, array_len);   }       // report test time   __uint64_t end_ns = clock_gettime_nsec_np(CLOCK_UPTIME_RAW);   double time_us = (end_ns - start_ns) / 1000.f;   std::cout << time_us << " us" << std::endl;       // clean up   if(buf1_ptr) free(buf1_ptr);   if(buf2_ptr) free(buf2_ptr);   if(buf3_ptr) free(buf3_ptr);       return 0; }
1
0
923
Sep ’21