Clang 15.0 produces slow C++ applications

djm44 OP

Created Sep ’23

Hello,

I run macOS Ventura 13.6 and Command Line Tools 15.0 on a post-2018 Intel Core i7 MacBook.

After installing Clang 15.0, my C++ test programs run 4 to 5 times slower than when built with Clang 13.0.

Has anybody else observed this slowdown?

The tests, which do a lot of mathematical computation, are compiled with the following command:

g++ -std=c++17 -march=native -funroll-loops -Ofast -DNDEBUG -o a atest.cpp

So I had to revert to Clang 13.0 to get reasonable execution times.

What makes the C++ code so slow?

djm44 OP

Oct ’23

So -O3 makes the code slower than -Ofast.

I tried -Wl,-ld_classic; it makes no difference.

I notice that -march=native makes the code very much slower.

I guess what changed in this latest version is -march=native.

Maybe there is less support for Intel processors.

With the previous version of Clang, -march=native made the code faster.

I don't know about assembly.

The fact is that my code and the compiler flags have not changed; the changes are in Clang 15.0 and are not documented.

The same code runs faster on Linux guests under VMware and VirtualBox with GNU gcc or g++, and on Windows under Boot Camp with MinGW gcc/g++.

To be precise, my code does not use a graphical UI. It only prints results to the terminal.

djm44 OP

Oct ’23

discard

djm44 OP

Oct ’23

I checked with the -### option.

apple-macosx14.0.0 is invoked. Is that right, or should it be apple-macosx15.0.0?

-march=native takes effect, but it makes the code slower. With the previous version, -march=native made the code faster.

Is the disk encrypted by default in Sonoma, which could make the code slower?

Sorry, I'd rather put this in a reply than in a comment.

If you want, I can send you one of my examples: a group of about 15 small files in a zip archive.

Apple Staff OP

Apple

Oct ’23

"If you want, I can send you one of my examples: a group of about 15 small files in a zip archive."

Yes, it'll be much easier if we can reproduce it. I'd suggest filing a Feedback, attaching the example with instructions, and posting the FB number here.

"apple-macosx14.0.0 is invoked. Is that right, or should it be apple-macosx15.0.0?"

This is fine.

"-march=native takes effect, but it makes the code slower. With the previous version, -march=native made the code faster."

It really depends on which optimizations are chosen and data that's fed into it. In general, yes, -march=native should produce better code. However, there can be overhead depending on data layout, size, and other factors.

"Is the disk encrypted by default in Sonoma, which could make the code slower?"

File system encryption is unlikely to be the issue here. Consider that both good and bad cases are running in the same environment.

djm44 OP

Oct ’23

Hi,

The FB number is FB13252912. I sent a zip file that runs a test written purely in C. The same code, compiled with the fastest options, gives a better run time on Linux guests under VirtualBox and VMware and on Windows under Boot Camp.

You'll see the difference by comparing Clang 15 with Clang 14 on a 2020 Intel MacBook Pro.

djm44 OP

Oct ’23

It was impossible to re-upload, so I filed a new FB:
FB13253046

djm44 OP

Oct ’23

Thanks for this information.

But on string manipulation with -march=native or -march=haswell, I get very poor performance compared with Linux guests under VMware or VirtualBox, or Windows under Boot Camp. It's almost 2 times slower.

I can send you the test.

I added it to the same FB number as test2.zip

FB13253046

FB13256895

Alecazam OP

Apr ’24

-march=native is a pretty bad compile setting, because you aren't being specific about the target architecture. For SIMD-based libraries you need to enable the instruction sets explicitly too (-mavx2, -msse4.2, etc.). I was using that setting on Windows, and the compiler then targets whatever SIMD architecture the build machine has. So I'd get AVX-512 code that would then crash on newer machines that omit that support. If you're running the Intel code emulated under Rosetta 2, you can expect a 2x slowdown there.

Also note that for Intel apps to run under Rosetta 2, I had to disable AVX and drop to SSE4.2. I also had to drop f16 support, since neither of these is supported.