Hey Kevin,
Thanks a lot for the swift response. There is an ongoing DTS case for this issue: 9323940. I've provided the full trace file there for reference. Unfortunately, I cannot share the entire trace here in the forums.
Nonetheless, I've taken a screenshot of the Time Profiler view to make things clearer:
This picture shows the view I was talking about in my original post. The other view (CPU Profiler, see below) shows that the cycles roughly halved. However, the Time Profiler view (shown in the picture) did not show any improvement: the 12.38 seconds are no improvement over the 12.36 seconds measured for the original, unoptimized implementation (not displayed in the picture).
I hope this makes it clearer. I don't think this is one of the views where Time Profiler shows the measurements relative to the total usage, or am I mistaken? The way I understood it, this is the total time the app spent executing on the CPU.
Aside from that, would you say that the regular CPU Profiler (with the cycles) is a better or more reliable indicator of performance improvements? In other words, can I conclude that my refactoring actually worked if the cycles roughly halved?
Here you see the CPU Profiler view (the one I keep mentioning w.r.t. CPU cycles):
It shows a total of 6.60 Gc for the refactored implementation, as opposed to 11.40 Gc for the original, unoptimized implementation.
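To put the two measurements side by side, here is a quick sketch of the arithmetic. The helper name `relative_change` is made up for illustration, and the inputs are simply the totals read off the two Instruments views quoted in this post.

```python
def relative_change(before: float, after: float) -> float:
    """Return the fractional change from `before` to `after` (negative means a reduction)."""
    return (after - before) / before


# Totals as read from the two views (assumed to be comparable runs).
cycles_change = relative_change(11.40, 6.60)   # CPU Profiler, in Gc
time_change = relative_change(12.36, 12.38)    # Time Profiler, in seconds

print(f"CPU cycles:    {cycles_change:+.1%}")  # about -42.1%
print(f"Time Profiler: {time_change:+.2%}")    # about +0.16%
```

So the cycle count dropped by a bit more than 40%, while the Time Profiler total is essentially unchanged.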
I really appreciate your help with this issue, thanks a lot!