NSURLSession background upload -1001 timeout error

Hi,


I am having trouble using a background NSURLSession to upload ~300 files (2.4 GB total). The upload works fine at first, but always ends up timing out with an nsurlsessiond error -1001. The timeout usually hits 10-20 minutes into the upload (at about 1 GB, but never at the same point).


The session is configured with timeoutIntervalForRequest set to 1 minute and timeoutIntervalForResource set to 24 hours. Everything else uses the default configuration. When the app kicks off an upload, each file gets its own upload task, and all of the tasks belong to a single session that is created on init. Also note that for the purpose of this test, the entire transfer occurred in the foreground despite using a background session.


In the system logs I see some concerning log statements, but Google is less than helpful in determining what they mean. Here are the entries in question:


Jun 19 14:24:02 Marine-iPad-Air-2 nsurlsessiond[812] : Task <29DE1E45-4844-4F79-B9B8-7E5C8265D955>.<662> resuming
Jun 19 14:24:02 Marine-iPad-Air-2 nsurlsessiond[812] : Task <29DE1E45-4844-4F79-B9B8-7E5C8265D955>.<662> has not requested a begin delay
Jun 19 14:24:02 Marine-iPad-Air-2 nsurlsessiond[812] : Task <29DE1E45-4844-4F79-B9B8-7E5C8265D955>.<662> skipping delayed request callback - delegate not implemented
Jun 19 14:24:02 Marine-iPad-Air-2 nsurlsessiond[812] : Task <29DE1E45-4844-4F79-B9B8-7E5C8265D955>.<662> will begin
Jun 19 14:24:03 Marine-iPad-Air-2 nsurlsessiond[812] : Task <29DE1E45-4844-4F79-B9B8-7E5C8265D955>.<662> for client  _willSendRequestForEstablishedConnection

This occurs for a bunch of tasks, not just 662. In my application logs (in my didReceiveData delegate) I can see progress being made for some other tasks, but there is no mention of task 662 (or any of the others with similar logs) making progress. About a minute later, I get:


Jun 19 14:25:03 Marine-iPad-Air-2 nsurlsessiond(CFNetwork)[812] : Task <29DE1E45-4844-4F79-B9B8-7E5C8265D955>.<662> finished with error - code: -1001
Jun 19 14:25:03 Marine-iPad-Air-2 nsurlsessiond[812] : Task <29DE1E45-4844-4F79-B9B8-7E5C8265D955>.<662> for client  will be retried after error  - code: -1001
Jun 19 14:25:03 Marine-iPad-Air-2 nsurlsessiond[812] : Task <29DE1E45-4844-4F79-B9B8-7E5C8265D955>.<662> logTaskError isDownload = N bytesReceived = 0 bytesSent = 0 isRecoverable = Y
Jun 19 14:25:03 Marine-iPad-Air-2 nsurlsessiond[812] : Task <29DE1E45-4844-4F79-B9B8-7E5C8265D955>.<662> 2324.565926 seconds past start time, new delay: 0.000000, new window: 0.000000
Jun 19 14:25:03 Marine-iPad-Air-2 nsurlsessiond[812] : Task <29DE1E45-4844-4F79-B9B8-7E5C8265D955>.<662> will retry after 1024 second


Timing out after a minute makes sense, given the value of timeoutIntervalForRequest. My question is: why is everything fine early on, and then why do my tasks get these odd resume/begin messages? Why aren't they starting up again, if that is indeed what they're supposed to do? What might have caused them to stop in the first place? Any help would be greatly appreciated!


Thanks!

Replies

First up, a specific comment:

In the system logs I see some concerning log statements

I wouldn’t worry too much about those logs. I believe that they indicate that you haven’t implemented the

-URLSession:task:willBeginDelayedRequest:completionHandler:
delegate method, which is perfectly normal.
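
For reference, that method only comes into play for tasks that set earliestBeginDate, so there's nothing to add unless you use that feature. If you ever do, a minimal sketch looks something like this:

    // NSURLSessionTaskDelegate (iOS 11 and later). Only called for tasks that requested
    // a begin delay via earliestBeginDate; otherwise you just see the harmless
    // "delegate not implemented" log line.
    - (void)URLSession:(NSURLSession *)session
                  task:(NSURLSessionTask *)task
        willBeginDelayedRequest:(NSURLRequest *)request
              completionHandler:(void (^)(NSURLSessionDelayedRequestDisposition disposition,
                                          NSURLRequest *newRequest))completionHandler {
        // Let the original request proceed unchanged.
        completionHandler(NSURLSessionDelayedRequestContinueLoading, nil);
    }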

Beyond that, it’s hard to say what’s going on here without more info. My recommendations:

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

Thank you, Quinn, for the speedy reply.


As I examine the system logs more, I am noticing a pattern. At the beginning of the transfer, all of the ~300 tasks get queued up with the following log output:


Jun 20 07:51:00 Marine-iPad-Air-2 nsurlsessiond[812] : Task <6281513F-98D0-47F2-A109-E514C2E38587>.<2> uploadTaskWithRequest:  fromFile: 
Jun 20 07:51:00 Marine-iPad-Air-2 nsurlsessiond(CFNetwork)[812] : Task <6281513F-98D0-47F2-A109-E514C2E38587>.<2> is for <>.<>.<2>
Jun 20 07:51:00 Marine-iPad-Air-2 nsurlsessiond[812] : Task <6281513F-98D0-47F2-A109-E514C2E38587>.<2> enqueueing
Jun 20 07:51:00 Marine-iPad-Air-2 nsurlsessiond[812] : Task <6281513F-98D0-47F2-A109-E514C2E38587>.<2> resumeTaskWithIdentifier, props 
Jun 20 07:51:00 Marine-iPad-Air-2 nsurlsessiond[812] : Task <6281513F-98D0-47F2-A109-E514C2E38587>.<2> resuming
Jun 20 07:51:00 Marine-iPad-Air-2 nsurlsessiond[812] : Task <6281513F-98D0-47F2-A109-E514C2E38587>.<2> has not requested a begin delay
Jun 20 07:51:00 Marine-iPad-Air-2 nsurlsessiond[812] : Task <6281513F-98D0-47F2-A109-E514C2E38587>.<2> skipping delayed request callback - delegate not implemented
Jun 20 07:51:00 Marine-iPad-Air-2 nsurlsessiond[812] : Task <6281513F-98D0-47F2-A109-E514C2E38587>.<2> will begin

Tasks 1 and 4 finish while the other tasks are being queued, so tasks 5 and 6 have a chance to begin (HTTPMaximumConnectionsPerHost is the default of 4). Task 7 gets the first -1001 error exactly a minute after it was queued up. The confusing part is why the timeout triggered for tasks 7 and up when they had never actually run. I captured a tcpdump and looked at it in Wireshark, and nothing appears out of the ordinary: just PUT requests for the tasks that are running and 201 responses once each upload finishes.


EDIT:

I ran some tests and it looks like calling [task resume] is what starts the timer, but I don't want the timer to start for every task at the same time. Rather, I'd like the timer to start when the task sends its PUT request to the server to begin the upload.


Is it bad practice, when adding files to an upload session, to do the following for each file, one at a time:

  • Create the upload task
  • Call [task resume]

Is the client application supposed to keep track of HTTPMaximumConnectionsPerHost such that we can resume a task when one finishes? Is there a better way to do all of this?


Thanks!

Is the client application supposed to keep track of

HTTPMaximumConnectionsPerHost
such that we can resume a task when one finishes?

No, it’s the exact opposite of that, in fact. We recommend that you dump all the tasks onto the session at once, up to some not-well-defined limit (300 should be fine, but 3000 would be a mistake). See NSURLSession’s Resume Rate Limiter and Moving to Fewer, Larger Transfers.

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

Thanks, Quinn. That is exactly what we're doing. We create all 300 tasks like this:


for (NSString *filePath in filePaths) {     // filePaths stands in for our list of files to upload
     NSURLSessionUploadTask *task = [self.session uploadTaskWithRequest:request
                                                               fromFile:[NSURL fileURLWithPath:filePath]];
     [task resume];
}


I made the transfer a foreground session and changed the max host connections to 1 for simplicity. I also set the NGINX config to only allow one request per connection, so it is easy to trace the streams in Wireshark. I'm still getting a timeout for every task that calls [task resume] but doesn't get a chance to run during the first minute. I don't see any traffic in Wireshark indicating that tasks other than the one currently running are sending anything that should cause the OS to start the request timeout timer. Any ideas?


In the background sessions, I did notice that the timer seems to start in conjunction with the log statement:


<1> for client  _willSendRequestForEstablishedConnection


So what I see happening is that tasks 1-8 log this, tasks 1-6 have a chance to run and make progress in the first minute, and tasks 7 and 8 time out after exactly 1 minute. However, sometimes this log statement only shows up for tasks 1-6, and tasks 7 and 8 don't time out. Basically, if I see that log statement for a given task, I know that the request timeout timer has been started for that task. If I don't see another such statement for that task within a minute, it times out 100% of the time. Keep in mind this is the exact same set of data being transferred each time, but due to connections, transfer speeds, and scheduling it isn't always deterministic.


Can you think of anything that would cause the OS to think a task request timeout timer should start when it isn't actively making any transfers?

I made the transfer a foreground session and changed the max host connections to 1 for simplicity.

Cool. Reproducing the problem with a foreground session always helps.

Can you think of anything that would cause the OS to think a task request timeout timer should start when it isn’t actively making any transfers?

No. Earlier you wrote:

The session is configured with

timeoutIntervalForRequest
set to 1 minute and
timeoutIntervalForResource
set to 24 hours.

Are you sure you got that round the right way?

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

I'm not sure what you mean by "round the right way". As far as I've read, timeoutIntervalForRequest is used to time out/retry a task that hasn't received data in a given amount of time, while timeoutIntervalForResource is basically how long to keep retrying before timing out the whole transfer. Is that incorrect?


I set timeoutIntervalForRequest to 1 minute because that is plenty of time to go without receiving a response from the server; normally it would be seconds at most. Tasks that are queued up but not running should not start this timer, and therefore should not time out unless timeoutIntervalForResource is hit. Am I missing something?
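
To make that concrete, here is roughly how I read the two settings when configuring the session (the comments reflect my understanding, not the documentation verbatim):

    NSURLSessionConfiguration *config =
        [NSURLSessionConfiguration backgroundSessionConfigurationWithIdentifier:@"com.mycompany.ios.upload"];

    // How long a request may go without any data moving before it times out and retries.
    // My assumption: this should only matter while the task is actively talking to the server.
    config.timeoutIntervalForRequest = 60;

    // How long the whole transfer (including retries) may take before giving up for good.
    config.timeoutIntervalForResource = 24 * 60 * 60;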

So I added a KVO observer on the task state because I noticed that the NSURLSessionTaskStateRunning state is subject to both the request and resource timeouts per the documentation at:


https://developer.apple.com/documentation/foundation/nsurlsessiontaskstate/nsurlsessiontaskstaterunning?language=objc


According to the observer, each task is put into the NSURLSessionTaskStateRunning state shortly after calling [task resume]. Keep in mind that resume is called for every task immediately after creation; all of the tasks are created at once to form a batch upload.
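
For reference, the observation itself is just plain KVO on the task's state property; a simplified sketch of what I'm doing (the logging is illustrative, not my exact code):

    // Added right after creating each task, before calling -resume:
    //     [task addObserver:self forKeyPath:@"state" options:NSKeyValueObservingOptionNew context:NULL];

    - (void)observeValueForKeyPath:(NSString *)keyPath
                          ofObject:(id)object
                            change:(NSDictionary *)change
                           context:(void *)context {
        if ([keyPath isEqualToString:@"state"] && [object isKindOfClass:[NSURLSessionTask class]]) {
            NSURLSessionTask *task = (NSURLSessionTask *)object;
            NSLog(@"Task %lu is now in state %ld",
                  (unsigned long)task.taskIdentifier, (long)task.state);
        } else {
            [super observeValueForKeyPath:keyPath ofObject:object change:change context:context];
        }
    }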


If the correct way to use background sessions is to batch all of your files together (as you have said, and as I am doing), and each task is immediately put into the NSURLSessionTaskStateRunning state where the request timeout can trigger (per the documentation), how can I reasonably set a request timeout for a batch background transfer? With 300 files, the last task may not run for 20+ minutes. I realize the request timeout triggers a retry, but that should be the exception, not the norm. With a reasonable request timeout (60 seconds, which is the default), the files time out constantly and the retry delay gets so high that my upload takes forever. What is the proper/expected use of timeoutIntervalForRequest in my case?


What's more interesting is that despite the documentation stating that tasks in the NSURLSessionTaskStateRunning state are subject to the request/resource timeouts, the behavior of my upload is still nondeterministic. For some uploads, all of the tasks are put into the NSURLSessionTaskStateRunning state and the ones that don't run in the first minute (timeoutIntervalForRequest) get the -1001 timeout error. For other uploads, all of the tasks are put into the NSURLSessionTaskStateRunning state and none of them get the -1001 timeout, even though most of them don't start running until well after timeoutIntervalForRequest has elapsed. Again, the deciding factor for whether a task will time out is the "<task> for client <private> _willSendRequestForEstablishedConnection" system log statement. If the task gets that log statement before timeoutIntervalForRequest expires, it will not time out, regardless of the state it is in. If it does not get that statement, it will time out once it is in the running state and timeoutIntervalForRequest has been exceeded. What triggers this log statement?

There is definitely something wonky going on here, but I don't have time to look at it in detail on DevForums. I recommend that you open a DTS tech support incident so that I can allocate the time to look at this properly.

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

Thanks, Quinn. I created a DTS ticket (715695582) and linked this thread in the description. I will post anything I think might be useful here so that, once this is resolved, the thread can hopefully serve as a reference for others.


My session creation/config:

    NSURLSessionConfiguration * sessionConfiguration = [NSURLSessionConfiguration backgroundSessionConfigurationWithIdentifier:@"com.mycompany.ios.upload"];
    sessionConfiguration.sessionSendsLaunchEvents = YES;
    sessionConfiguration.allowsCellularAccess = YES;
    sessionConfiguration.HTTPMaximumConnectionsPerHost = 4;
    sessionConfiguration.timeoutIntervalForRequest = 60;
    sessionConfiguration.timeoutIntervalForResource = 24*60*60;
    self.session = [NSURLSession sessionWithConfiguration:sessionConfiguration delegate:self delegateQueue:nil];

Task creation/resume (simplified):

    for (NSString *filePath in filePaths) {   // one task per upload file; request is built per file
        NSURLSessionUploadTask *task = [self.session uploadTaskWithRequest:request
                                                                  fromFile:[NSURL fileURLWithPath:filePath]];
        [task resume];
    }


Those are probably the only really relevant pieces of code, since the session is never finished or invalidated and the other delegate methods just report progress/status to the UI.
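
For completeness, progress reporting of this sort typically hangs off the standard upload-progress delegate callback; a trimmed-down sketch (illustrative logging; the real code updates the UI instead):

    - (void)URLSession:(NSURLSession *)session
                  task:(NSURLSessionTask *)task
       didSendBodyData:(int64_t)bytesSent
        totalBytesSent:(int64_t)totalBytesSent
    totalBytesExpectedToSend:(int64_t)totalBytesExpectedToSend {
        NSLog(@"Task %lu sent %lld of %lld bytes",
              (unsigned long)task.taskIdentifier, totalBytesSent, totalBytesExpectedToSend);
    }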


I'm using PUT requests for files ranging from KB to GB in size (most are in the 10 MB range, though). The upload that is failing is 250 files totaling 2.41 GB. Each file gets its own task and all tasks belong to one session. Each task is resumed as soon as it is created, and all tasks are created at once.


On the server side, I do not see any errors. I have parsed the NGINX logs and added copious logging to the server, and I don't see any issues reported there. The tcpdump also looks fine: I only see traffic for the files that are actively being transferred, and those files never get the -1001 timeout error. The server reports a 201 back to the client when each file finishes.


I have a bunch of log files I can send your way if needed.


I should also mention that I changed HTTPMaximumConnectionsPerHost to 1000 yesterday during testing, and the background session seemed to work. It only actually transferred 7 files at a time according to the logs, but somehow the other files were no longer getting the -1001 timeouts.

To close the loop on this: there is a bug in the way timeoutIntervalForRequest works. It appears the timeout starts when a task is resumed, not when the task first makes a request, which leads to tasks timing out before they ever get a chance to run. The issue is known, but it is apparently old, so a fix may be a while off.
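
In the meantime, one workaround (my own conclusion, not something Apple confirmed) is to make timeoutIntervalForRequest generous enough to cover tasks that sit in the queue and lean on timeoutIntervalForResource to bound the overall transfer; something along these lines (example values only):

    // Since the request timer appears to start at -resume rather than at the first byte
    // on the wire, make it long enough to cover tasks that are queued but not yet running...
    sessionConfiguration.timeoutIntervalForRequest = 30 * 60;   // example value

    // ...and let the resource timeout bound the batch as a whole.
    sessionConfiguration.timeoutIntervalForResource = 24 * 60 * 60;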