Application performance declined in Big Sur and is worse in Monterey

My application is a computationally intensive commercial application. I have a suite of tests consisting of 100 separate directories, each containing test scripts. The directories are independent, so to speed up testing I run multiple instances of my application in parallel, each working on a different directory, launched from command-line scripts in Terminal.app. This is much faster than launching a single instance that works through the test directories serially.
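For readers who want to reproduce this kind of setup, here is a minimal sketch of a launcher that runs one process per test directory, a bounded number at a time, and records the wall-clock time of each directory. It is an illustration only, not the actual scripts used here: the function names `run_one` and `run_suite`, the `command` argument, and the `max_instances` limit are all hypothetical.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path


def run_one(test_dir, command):
    """Run `command` with test_dir as its working directory.

    Returns (directory name, exit code, elapsed wall-clock seconds).
    """
    start = time.perf_counter()
    proc = subprocess.run(
        command,
        cwd=test_dir,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return Path(test_dir).name, proc.returncode, time.perf_counter() - start


def run_suite(test_dirs, command, max_instances):
    """Run `command` once per directory, at most max_instances at a time.

    The worker threads only wait on child processes, so the real work
    happens in the separately launched application instances.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_instances) as pool:
        futures = [pool.submit(run_one, d, command) for d in test_dirs]
        for future in as_completed(futures):
            name, code, seconds = future.result()
            results[name] = (code, seconds)
    return results
```

Calling `run_suite` once with `max_instances=1` and once with a higher value, then comparing the per-directory times, is one way to see which directories slow down only when other instances are running.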

This has worked great for the past several releases of macOS. About the only hiccup I had was when a change was introduced to macOS that caused my apps to sleep after a while but I was able to resolve that by using the caffeinate command.

About a month ago, I finally upgraded my primary development machine, an 8-core i9 iMac, to Big Sur. On Catalina, running my tests took about 2.5 hours. After upgrading to Big Sur, they took 3.5 hours. Now, after upgrading to Monterey, they take almost 4 hours.

What happened? After the upgrade, the timing differences for most of the test directories were negligible, but I noticed that one of my test directories now takes 4 times as long as it used to. When I run that problematic directory in a single instance with no other instances running, its completion time is back to normal.

I should also note that my application contains OpenMP support and each instance may use up to 4 cores (limited to 4 cores for testing purposes) depending on what is being tested.
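As a rough illustration of how the per-instance thread cap interacts with the number of parallel instances, the sketch below builds an environment with `OMP_NUM_THREADS` set (the standard OpenMP environment variable) and computes how many instances fit on a machine without their combined threads exceeding the core count. The helper names are hypothetical, and it is an assumption that the cap is applied through the environment; the application may instead set it in code.

```python
import os


def openmp_env(threads_per_instance, base_env=None):
    """Return a copy of the environment with OMP_NUM_THREADS capped.

    Assumption: the application honors OMP_NUM_THREADS rather than
    overriding the thread count internally.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = str(threads_per_instance)
    return env


def instances_without_oversubscription(total_cores, threads_per_instance):
    """Largest instance count whose combined threads fit in total_cores.

    Always returns at least 1 so that a single instance can still run.
    """
    return max(1, total_cores // threads_per_instance)
```

With 8 cores and up to 4 threads per instance, `instances_without_oversubscription(8, 4)` gives 2, so running more instances than that can put more runnable threads on the machine than it has physical cores whenever several instances hit a 4-thread region at the same time. Whether that explains the slowdown described here is not established; it is only one thing worth measuring.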

Is there an Info.plist key/value I need to add to allow my app to maintain peak performance, or perhaps something similar to the caffeinate command? This performance difference happens even when I use a version of my application that was built with Xcode 12.4 on an older Mac running Catalina, so I don't think it's an SDK issue. One colleague suggested that perhaps Big Sur introduced an Intel processor security patch that reduced processing power. But that doesn't explain why the timings for the majority of the test directories were unaffected.

I should also clarify that when I said "This performance difference happens even when I use a version of my application that was built with Xcode 12.4 on an older Mac running Catalina", I meant that I built my application with Xcode 12.4 on an older Mac (since you can't run older versions of Xcode on Monterey) and ran that application on both Big Sur and Monterey and still got the slower results.

The way you are doing this is VERY complicated and probably inefficient. Apple recommends that developers migrate away from threads to Grand Central Dispatch, which provides a very robust, efficient system for managing multithreading. It does the work for you by optimizing the use of the cores in your machine, without you having to do things like figure out how many cores are available. In fact, Apple advises you not to do this:

https://developer.apple.com/library/archive/documentation/General/Conceptual/ConcurrencyProgrammingGuide/ConcurrencyandApplicationDesign/ConcurrencyandApplicationDesign.html

OpenMP probably isn't up to date and/or doesn't take advantage of this technology very well. GCD has the advantage of being available on any processor or OS version.

You should be able to run each of these processes in its own serial or, more likely, concurrent dispatch queue and let GCD execute them efficiently. Running each of these with its own copy of the app is probably much slower than launching each process with GCD, due to overhead like context switching. You could write one source code file to launch them and be notified when they are done, which would be much easier for you than doing this from the terminal. You should at least read the Concurrency Programming Guide and see whether it would make your life easier and speed up your app.

We use OpenMP because our application is cross-platform. Having separate code in our computational engine isn't practical or economical at this time.
