messages lost due to high rates causing issues

So I am logging using os_signpost.


My begin format string contains two strings "language: %s message: %s".


My end format string contains one string "size: %zu".


I am running on an iMac Pro 2017. I should have lots of processing power....


I am routinely running into problems where the Data stream is logging run issue messages "due to high rates in live mode recording. Try windowed recording mode".


It appears in general that the messages being dropped are my end messages.


I have tried windowed recording mode, and it doesn't really seem to help. Is there anything I can do to improve performance so messages aren't being dropped? Compared to something like dtrace recording system calls I don't think I am logging that much data. The fact that it's the end messages that are being dropped is really messing up my graphs because my instrument believes that the interval is still open which means my graphs get incredibly deep incredibly quickly.


I could easily turn language into an enumeration if parsing/recording a %d is significantly faster/easier than parsing/recording a %s.


Any thoughts would be appreciated.


Cheers,

Dave

Replies

Converting "language" to an enum ("language: %d message %s"), and converting "size" so that it was just "%zu" instead of "size: %zu" thinking it may be faster to parse did not seem to improve my life dramatically except for making my code a little harder to read.


Filed radars

47211696 [custom instruments] defaults for <enum> Elements

and

47211508 [custom instruments] <enum> as a top level declaration


while going through the exercise of using enumerations.


Is os-signpost-interval-schema recording backtraces even though I don't care about backtraces? I don't have a backtrace column anywhere...

Since you're on a Mac, run "log stats" from the command line. If you want to be pro, you can use "log collect" to create an archive of the window of time you were seeing the problem, then run "log stats --archive path" to narrow it down. When it finishes, it will give you a break down of who is filling up the logs. Logging is a shared resource on the system, so the overload doesn't have to be caused by your app.


In live mode, data is sent directly to an Instruments recording service, so if the recording service backs up, the data gets dropped rather than blocking the sender. In Windowed Mode, logging signposts goes into something like a circular memory buffer, so it's very hard to overload. The downside is that the older data is evicted from memory quicker. If the problem you're seeing is still happening in Windowed Mode, then something isn't functioning as it was designed. That's not good.


With regards to live mode, choosing a target rather than All Processes can help, and not including instruments with wide open filters like "os_signpost" and "os_log" will also help. Since I see the "custom instruments" tag, I'm assuming you've supplied a subsystem and category and are using your own custom instruments, so the filtering you get with that should help with the data loss.


Oh, and finally, having "Console.app" open does change the equation a bit, so I'd recommend quitting that app before recording.

Also, backtrace recording is not implemented at this time. It's turned on at the logging handle level, and by default log handles do not record them.

Sorry I wasn't completely clear...


I am recording off of an iPhone (SE) to my Mac via USB. Not sure if that makes any difference.


I am focusing on a single app, and the only instrument that I am recording is my own.


Console.app is not open.


As far as I know there is no equivalent to "log collect" to pick up data from my phone... happy to be educated of course 🙂


Cheers,

Dave

More data:


So it does appear windowed mode is working okay (contrary to what I said previously)/ I set the window for "20 seconds" and it doesn't appear to overflow. This seems odd to me though given that my run is usually about 8 seconds. Why does windowed mode work for 20 seconds, but live or deferred mode does not?

And how fast is "too fast" in general and how many is too many? I see in the recording preferences that there is a DTrace buffer control. Does this buffer work for ktrace as well?


My custom instrument is pretty useless if I can't sort this out.

Live and deferred modes requires every signpost to be forwarded to Instruments. This bypasses all the optimizations made by the system in order to display the data immediately. It's true, deferred mode can batch the data, but it does imply lossless recording when possible, so it uses the same machinery as live mode. Under heavy loads, it can still be overrun.


In Windowed mode, however, Instruments steps out of the way and lets the OS record signposts into its circular buffer, making it theoretically impossible to overload it under normal use. Windowed mode is used for dense recordings where a long trace is impractical to even analyze by the user (i.e. too much hay to find your needle).

DTrace doesn't exist on iOS. Ktrace buffers are sized based on a heuristic. Both of these technologies can be overrun if Instruments can't drain the data quickly enough, in both live and deferred modes.


I've been able to comfortably record thousands of messages per second on a Mac in live mode, but the performance impact is measurable in a time profile. iOS's capacity to record varies greatly between devices, but I'd expect hundreds per second to be achievable. The other problem you might be seeing is that if your app has taken so many system resources that Instruments can't get enough time and memory to move the data off the device, then that could manifest the same way.