Wakeups limit exceeded

Question

We've been getting a crash on an app under development which is producing logs with the message: "45001 wakeups over the last 174 seconds (258 wakeups per second average), exceeding limit of 150 wakeups per second over 300 seconds".

We have narrowed down the cause of our crash to some temporary debug code that was firing more-than-usual numbers of analytics events. Our analytics system uses AWS Kinesis Firehose: https://github.com/aws-amplify/aws-sdk-ios/tree/master/AWSKinesis

Obviously we can fix by disabling the debug code, but we'd like to have a better understanding of the problem so that we can make our code (or the AWS code) more robust. Additionally, we know of at least one end user that has experienced similar issues on a released title and we were unable to help them get our product working on their device - I now suspect their problems may have been related.

Basically, I don't know what a wakeup is in this context, so I therefore don't understand how to go about controlling the number that we trigger. It must refer to something other than ordinary multi-threading activity, because 150 thread switches per second is such a low limit that it would effectively prohibit multi-threading (the limit is only 2.5 per frame at 60fps!). I also don't think it can refer to background execution because our app doesn't do any background execution and the crashes are not associated with multitasking at all.

Can anyone tell me what a wakeup is in this context? And maybe give me advice to help me find the code that is causing them?

Concurrency

15k

Posted by

Columbo

Reply

Add a Comment

Answer 1

Can anyone tell me what a wakeup is in this context?

I’ve been wondering about this myself, so I spent some time digging into the origin of this number. This concept comes from the kernel, and you can look at how it’s calculated in the Darwin open source. Specifically, the droid you’re looking for here is the interrupt_wakeups as dispatch by the SENDING_NOTIFICATION__THIS_PROCESS_IS_CAUSING_TOO_MANY_WAKEUPS routine (you couldn’t get more descriptive than that!) in the osfmk/kern/task.c [1] and incremented in thread_unblock in osfmk/kern/sched_prim.c. In short, it seems to count the number of times that a blocked thread was unblocked because of an ‘interrupt’ [2].

And maybe give me advice to help me find the code that is causing them?

In situations like this, the System Trace instrument is your friend.

Share and Enjoy
—
Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

[1] The links here are to the latest kernel source currently available, which is for macOS 10.14.1 and thus roughly corresponds to iOS 12.1. I’ve no reason to believe this has changed in the 2019 OS releases.

[2] In sneer quotes because interrupts are a complex thing in XNU.

Posted by

eskimo

Add a Comment

Answer 2

Thanks for the response. To update on this:

I tried the System Trace instrument and was able to view the wakeups that occurred, but I still couldn't work out what the ultimate cause of the wakeups was. If I had to guess now after digging into the AWS kinesis code I'd say GCD or NSURLSession seem the most likely sources.
We did eventually get a more normal crash report (as opposed to the microstackshots report with the wakeups alert). The crash report led us to find a race condition in our analytics code (one thread wrote to an NSMutableArray while another thread read from it).
After fixing the race condition we could no longer reproduce any crashes. This leads me to believe that the wakeups alert is a bit of a red herring. Reading around the kernel code that eskimo linked to, it looks like the wakeups count can either be fatal or be a warning. I now suspect that it was just a warning that didn't have anything to do with the crashes we were experiencing.
I also found that our application had some code that was intended to prevent excessive uploading of analytics, but it had muddled milliseconds and seconds, so it wasn't effective. After verifying that crashes were no longer reproducible based on the race condition fix alone, we then fixed this bug. So, hopefully we've fixed the wakeups count too but we haven't really verified that.

Posted by

Columbo

Add a Comment

Answer 3

As @eskimo say, wakeup is because of 'interrupt', but:

MUST be asyn interrupt, include IQR, FIR. Soft interrupt or pageout is not 'wakeup'. Timer counter is the most common case.
A task's wakeup is include all of its threads. So timer in different threads will increase your wakeup, otherwise maybe not.

There are cases I found:

NSTimer, CADisplayLink
dispatch_after
semaphores contains wait_time or lock contains wait_time

There is also an debug case for me:

When you include OpenGLES or Metal, xcode will inject 2 threads watching draw call command. You can disable this feature in scheme, https://developer.apple.com/documentation/metal/frame_capture_debugging_tools/enabling_frame_capture. After that, all goes ok.

This is the code of getting wakeup count in your app, not only in the instrument. May help you:

Code Block c#include <mach/task.h>
#include <mach/mach.h>
BOOL GetSystemWakeup(NSInteger *interrupt_wakeup, NSInteger *timer_wakeup) {
  struct task_power_info info = {0};
  mach_msg_type_number_t count = TASK_POWER_INFO_COUNT;
  kern_return_t ret = task_info(current_task(), TASK_POWER_INFO, (task_info_t)&info, &count);
  if (ret == KERN_SUCCESS) {
    if (interrupt_wakeup) {
      *interrupt_wakeup = info.task_interrupt_wakeups;
    }
    if (timer_wakeup) {
      *timer_wakeup = info.task_timer_wakeups_bin_1 + info.task_timer_wakeups_bin_2;
    }
    return true;
  }
  else {
    if (interrupt_wakeup) {
      *interrupt_wakeup = 0;
    }
    if (timer_wakeup) {
      *timer_wakeup = 0;
    }
    return false;
  }
}

Posted by

Daniel-Duan

Add a Comment

Answer 4

I try to reproduce a crash cause by wakeups limit exceed, sample code below:

    for (int i = 0; i < 500; i++) {

        NSThread *thread = [[NSThread alloc] initWithBlock:^{

            [[NSThread currentThread] setName:[NSString stringWithFormat:@"%d", i]];

            NSRunLoop *runLoop = [NSRunLoop currentRunLoop];

            [runLoop addPort:[NSMachPort port] forMode:NSDefaultRunLoopMode];

            CADisplayLink *link4 = [CADisplayLink displayLinkWithTarget:self selector:@selector(threadCadisplayLink)];

            //            link4.frameInterval = 60;

            [link4 addToRunLoop:[NSRunLoop currentRunLoop] forMode:NSDefaultRunLoopMode];

            [runLoop run];

        }];

        [thread start];

    }

- (void)threadCadisplayLink {

    NSInteger index = arc4random()%100;

    NSObject *obj = [NSObject new];

    dispatch_async(dispatch_get_global_queue(0, 0), ^{

        self.info[@(index)] = obj;
    });
}

I get wake up times every second:

    static NSInteger old_interrupt_wakeup;

    static NSInteger old_timer_wakeup;

    static NSInteger old_idle_wakeup;



    [NSTimer scheduledTimerWithTimeInterval:1 repeats:true block:^(NSTimer * _Nonnull timer) {

        NSInteger interrupt_wakeup;

        NSInteger timer_wakeup;

        NSInteger idle_wakeup;

        rm_system_wakeup(&interrupt_wakeup, &timer_wakeup, &idle_wakeup);

        NSLog(@"interrupt_wakeup: %ld timer_wakeup:%ld idle_wakeup: %ld", interrupt_wakeup - old_interrupt_wakeup, timer_wakeup - old_timer_wakeup, idle_wakeup - old_idle_wakeup);

        old_interrupt_wakeup = interrupt_wakeup;

        old_timer_wakeup = timer_wakeup;

        old_idle_wakeup = idle_wakeup;

    }];

rm_system_wakeup is same as @Daniel-Duan said

I get interrupt_wakeup about 300000 every second, but fail to get Demo app crash.

2023-01-17 13:48:42.576934+0800 Demo[44926:1519725] interrupt_wakeup: 29684 timer_wakeup:5 idle_wakeup: 0

2023-01-17 13:48:43.577225+0800 Demo[44926:1519725] interrupt_wakeup: 30008 timer_wakeup:5 idle_wakeup: 10780509616

2023-01-17 13:48:44.577003+0800 Demo[44926:1519725] interrupt_wakeup: 30006 timer_wakeup:4 idle_wakeup: 0

2023-01-17 13:48:45.577459+0800 Demo[44926:1519725] interrupt_wakeup: 29505 timer_wakeup:3 idle_wakeup: 0

2023-01-17 13:48:46.577329+0800 Demo[44926:1519725] interrupt_wakeup: 30004 timer_wakeup:2 idle_wakeup: 0

Another question is what the difference between Crash with WAKEUPS subtype and wakeup event with a wakeups_resource-***.ips file, is wakeups_resource-***.ips means a crash happened?

Posted by

Joeljiangjiang

Add a Comment

Answer 5

I try to reproduce a crash cause by wakeups limit exceed

AFAIK hitting this ‘limit’ won’t actually trigger a crash. It generates a diagnostic report that looks a lot like a crash report but is not actually one. You can confirm that by reading the fine print. If you have a modern example, post it here and I’ll take a look. See Posting a Crash Report for advice on how to post it.

Share and Enjoy
—
Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Posted by

eskimo

Add a Comment

Answer 6

I do experienced a crash, and I get device log from Xcode, the log type is Crash, and the detail is here:

myAppName 2023-1-17, 12-01.crash

So I think this is the cause of the crash.

Posted by

Joeljiangjiang

Add a Comment

Answer 7

the log type is Crash

This is not a crash report. Consider this snippet:

Action taken:     none

If this were a crash report, it’d have something more fatal there, like terminated.

If the Xcode UI is misleading you into thinking it’s a crash, I’d appreciate you filing a bug about that, with a copy of this diagnostic report and screen shot of the misleading UI.

Please post your bug number, just for the record.

Share and Enjoy
—
Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Posted by

eskimo

It's not a crash report, but I do experienced crash as a user (not by crash log), and I fail to catch the crash. I think there must be a crash log from Xcode device log or somewhere else.

—
Joeljiangjiang
I do agree that the log is not a crash report, but I do experienced a crash as a user, and the app process terminate can be corroborated by my app logs. So I think there must be a crash log, I suspect the wakeups cause the crash.

—
Joeljiangjiang

Add a Comment

Answer 8

I agree that the log is not a crash report, but I did experienced a crash as a user, and the app process termination can be corroborated by my app logs. So I think there must be a crash log, and I suspect the wakeups cause the crash.

Posted by

Joeljiangjiang

Add a Comment

Answer 9

Hi,

We encountered the same with our app: Event: wakeups Action taken: none

Even though the log shows the above, the app crashes on the device.

Kaltura 24-02-2023, 13-16, iPhone X.crash

Kaltura 24-02-2023, 12-49, iPhone 6s.crash

Logs taken from the device, not running with the Xcode.

Posted by

Nilit_Danan

Add a Comment

Wakeups limit exceeded

Replies