I've been investigating a similar issue in my codebase.
In particular, as this Stack Overflow question points out, when my threads are spread across all cores and each gets an equal share of the work, the efficiency cores become a significant bottleneck, because the performance cores finish early and then sit idle waiting for them:
https://stackoverflow.com/questions/66348801/how-to-utilize-the-high-performance-cores-on-apple-silicon
I have been searching high and low for a mechanism to deal with this. It seems like disabling efficiency cores for my application would be better than the current situation.
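For what it's worth, one portable way to soften this without touching core selection at all is to stop giving every thread an equal, fixed share. Below is a minimal sketch, assuming the work can be split into independent items: threads pull small chunks from a shared atomic counter, so faster cores simply end up taking more chunks. The name parallel_for_dynamic and its parameters are hypothetical, not from either linked thread, and this does not disable or avoid the efficiency cores.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: calls work(i) once for every i in [0, total).
// Instead of a fixed equal split, each thread repeatedly claims the next
// `chunk` indices from a shared counter until none are left.
template <typename Work>
void parallel_for_dynamic(std::size_t total, std::size_t chunk,
                          unsigned num_threads, Work work) {
    if (chunk == 0) chunk = 1;
    if (num_threads == 0) num_threads = 1;

    std::atomic<std::size_t> next{0};  // first index not yet claimed

    auto worker = [&] {
        for (;;) {
            // Atomically claim the range [begin, begin + chunk).
            const std::size_t begin = next.fetch_add(chunk);
            if (begin >= total) break;
            const std::size_t end = std::min(begin + chunk, total);
            for (std::size_t i = begin; i < end; ++i) work(i);
        }
    };

    std::vector<std::thread> pool;
    pool.reserve(num_threads);
    for (unsigned t = 0; t < num_threads; ++t) pool.emplace_back(worker);
    for (auto& th : pool) th.join();
}
```

Calling it looks like parallel_for_dynamic(n, 16, std::thread::hardware_concurrency(), [&](std::size_t i) { /* process item i */ }); the chunk size is a tuning knob: smaller chunks balance better across fast and slow cores, larger ones reduce contention on the counter.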
(EDIT: For future readers, there seems to be an open ticket mentioned in this thread:
https://developer.apple.com/forums/thread/703361)