Why is clang removing the loop when using -O3?

Consider the following code:
#include <chrono>
#include <cstdlib>   // malloc, free
#include <iostream>
#include <random>
#include <vector>
class TestClass
{
public:
  int A = 0;
  int B = 4;
protected:
private:
};
int main(int argc, const char * argv[])
{
  std::random_device randomDevice;
  std::mt19937 mersenneTwister(randomDevice());
  std::uniform_int_distribution<size_t> distribution(1,255);
  for (size_t i=0; i<10000000; ++i)
  {
    size_t const vectorSize = distribution(mersenneTwister)+1;
    TestClass* testVector(reinterpret_cast<TestClass*>(malloc(vectorSize*sizeof(TestClass))));
    if (testVector[0].A == 0x0ffeefed)
    {
      std::cout << "Sorry value hit." << std::endl;
      break;
    } /* if */
    free(testVector);
  } /* for */
  return 0;
}


Clang completely removes the for loop at -O3, which surprised me. Although testVector contains only garbage, I expected the loop to be kept. No warning was issued either; only the static analyser reported that testVector contains garbage.

If I add a line assigning a value to a random element of testVector, the loop is not removed.

PS: I wanted to use the loop for testing the execution speed of malloc and free.

Replies

Rather than looking at the compiler optimizations, I suggest you inquire with some of my DTS colleagues about your goals for measuring malloc and free so we can have an in-depth discussion with you. We sometimes get questions like this, and what usually happens is that, between compiler optimizations and the way malloc is designed in modern OS versions, you end up measuring how fast the CPU can spin an empty loop rather than getting a result that is meaningful for your actual goal.
IIRC, the compiler treats the memory returned by malloc as having no defined value (it is "poison"), so it can remove your test against any *specific* value, since nothing was ever written to that memory. That leaves a matching malloc and free pair, which can be elided. What remains is a loop that only calls the RNG, whose results are never used, so the loop is removed as well.
  • This is the right answer indeed.
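
For illustration, here is a minimal sketch of one way to stop the malloc/free pair from being elided: let each returned pointer escape through a volatile store, and never read the uninitialised memory. The names g_sink, BenchResult and runMallocFreeLoop are made up for this example and are not from the original post. As noted in the first reply, a tight loop like this may still say little about real-world allocator performance.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <random>

struct TestClass
{
  int A = 0;
  int B = 4;
};

// Hypothetical sink: a store to a volatile object is an observable side
// effect, so the compiler needs the real pointer value and cannot pretend
// the allocation never happened.
static void* volatile g_sink = nullptr;

struct BenchResult
{
  std::size_t allocations;  // number of successful malloc calls
  double seconds;           // wall-clock time for the whole loop
};

// Hypothetical helper: runs `iterations` malloc/free pairs of random size.
BenchResult runMallocFreeLoop(std::size_t iterations, unsigned seed)
{
  assert(iterations > 0);
  std::mt19937 mersenneTwister(seed);
  std::uniform_int_distribution<std::size_t> distribution(1, 255);

  std::size_t allocations = 0;
  auto const start = std::chrono::steady_clock::now();
  for (std::size_t i = 0; i < iterations; ++i)
  {
    std::size_t const vectorSize = distribution(mersenneTwister) + 1;
    void* const raw = std::malloc(vectorSize * sizeof(TestClass));
    if (raw == nullptr)
    {
      break;  // allocation failed; stop early
    }
    g_sink = raw;  // the pointer escapes; the memory itself is never read
    ++allocations;
    std::free(raw);
  }
  auto const stop = std::chrono::steady_clock::now();
  return {allocations, std::chrono::duration<double>(stop - start).count()};
}
```

Calling runMallocFreeLoop(10000000, std::random_device{}()) from main would correspond to the loop in the question. A fixed seed makes runs comparable, and the time spent in the RNG is included in the measurement, so it is only a rough upper bound on the malloc/free cost.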
