Freeze when deallocating large data structures

Hi everyone,

I have encountered a strange Swift runtime crash/freeze when large data structures created from recursive enums with associated values are automatically deallocated. I am unable to debug this issue and I was wondering if anyone has some advice on what to do here?

Here's my code:

enum MyList {
  case empty
  indirect case pair(MyList, MyList)
}
var lst = MyList.empty
for _ in 0..<150000 {
  lst = .pair(.empty, lst)
}
print("Done creating list")
lst = .empty
print("Completed program (this is never printed)")

This code uses an enum to represent a simple linear list. The loop creates 150,000 linearly linked nodes and ultimately prints "Done creating list". The following statement (which implicitly deallocates the long list) never returns. This is where I am stuck. It seems to be impossible to deallocate such a data structure without the Swift runtime freezing.

Any help is appreciated!

== Matthias

Answered by OOPer in 693799022

It seems to be impossible to deallocate such a data structure without the Swift runtime freezing.

True. Deallocating an enum instance with indirect child instances would cause deallocating the children recursively.

So, when deallocating long lasting list as in your code, it would make too deep recursive calls which may result any sort of unpredictable behaviors, crashing, freezing or anything you can imagine.

You may need to deallocate all the nodes in the list without consuming the native call stack -- you need non-recursive algorithm.

You cannot control the deallocations of structs and enums in Swift, so you may need to make it a class and implement deinit carefully.

My first trial looks like this:

class ListNode {
    var left: ListNode?
    var right: ListNode?
    
    init(_ left: ListNode?, _ right: ListNode?) {
        self.left = left
        self.right = right
    }
    
    deinit {
        if left != nil || right != nil {
            var nodesToRelease: [ListNode] = [self]
            ListNode.deallocateAll(&nodesToRelease)
        }
    }
    
    private static func deallocateAll(_ nodesToRelease: inout [ListNode]) {
        while let top = nodesToRelease.popLast() {
            if let leftNode = top.left {
                nodesToRelease.append(leftNode)
                top.left = nil
            }
            if let rightNode = top.right {
                nodesToRelease.append(rightNode)
                top.right = nil
            }
        }
    }
}

var node: ListNode? = nil
for _ in 0..<150_000 {
    node = ListNode(nil, node)
}
print("Done creating list")
node = nil
print("Completed program (this is printed within a second)")

This shows Completed program (this is printed within a second) within a second.

Changing 150_000 to 150_000_000, you can observe increasing of the memory consumption first, and then decreasing after Done creating list.

(Takes a few minutes to finish.)

I haven't tried many cases, so there may be some flaws in my code and may not be efficient. But I think my code is well explaining the problem.

If you know the recursion level of the data might get hundreds or more, you should better not use enum with indirect children.

Accepted Answer

It seems to be impossible to deallocate such a data structure without the Swift runtime freezing.

True. Deallocating an enum instance with indirect child instances would cause deallocating the children recursively.

So, when deallocating long lasting list as in your code, it would make too deep recursive calls which may result any sort of unpredictable behaviors, crashing, freezing or anything you can imagine.

You may need to deallocate all the nodes in the list without consuming the native call stack -- you need non-recursive algorithm.

You cannot control the deallocations of structs and enums in Swift, so you may need to make it a class and implement deinit carefully.

My first trial looks like this:

class ListNode {
    var left: ListNode?
    var right: ListNode?
    
    init(_ left: ListNode?, _ right: ListNode?) {
        self.left = left
        self.right = right
    }
    
    deinit {
        if left != nil || right != nil {
            var nodesToRelease: [ListNode] = [self]
            ListNode.deallocateAll(&nodesToRelease)
        }
    }
    
    private static func deallocateAll(_ nodesToRelease: inout [ListNode]) {
        while let top = nodesToRelease.popLast() {
            if let leftNode = top.left {
                nodesToRelease.append(leftNode)
                top.left = nil
            }
            if let rightNode = top.right {
                nodesToRelease.append(rightNode)
                top.right = nil
            }
        }
    }
}

var node: ListNode? = nil
for _ in 0..<150_000 {
    node = ListNode(nil, node)
}
print("Done creating list")
node = nil
print("Completed program (this is printed within a second)")

This shows Completed program (this is printed within a second) within a second.

Changing 150_000 to 150_000_000, you can observe increasing of the memory consumption first, and then decreasing after Done creating list.

(Takes a few minutes to finish.)

I haven't tried many cases, so there may be some flaws in my code and may not be efficient. But I think my code is well explaining the problem.

If you know the recursion level of the data might get hundreds or more, you should better not use enum with indirect children.

OOPer, your response was very insightful and your guesses seem to be spot on! I did not realize that Apple's Swift implementation isn't even able to deallocate a class-based linked list such as:

final class LinkedList {
  let next: LinkedList?
  init(_ next: LinkedList? = nil) {
    self.next = next
  }
}
var lst = LinkedList()
for _ in 0..<150000 {
  lst = .init(lst)
}
print("Done creating list")
// The following statement never returns
lst = LinkedList()
print("Completed program (this is never printed)")

The conclusion for me from this is that it's basically impossible to manage large interconnected data structures in Swift. For value types (such as enums and structs), it is trivial to construct very large values, but it's impossible to deallocate them afterwards since this process seems to be implemented recursively and you can't interfere.

Equally, for class-based data structures, deallocation of large interconnected object graphs isn't scaling as it's done recursively too utilizing the fixed stack. In this case, it is possible to interfere (as OOPer has shown), but doing this in general requires messing with retain counts (possibly via CFGetRetainCount). The implementation above works nicely only for the special case that linked list nodes are not referenced anywhere else.

BTW, I was also able to verify what OOPer was stating: that deallocation is recursive and uses the regular stack. The following code does execute correctly because I was sufficiently increasing the stack size (from 8MB to 10MB):

enum MyList {
  case empty
  indirect case pair(MyList)
}
let condition = NSCondition()
var t = Thread {
  condition.lock()
  var lst = MyList.empty
  for _ in 0..<150000 {
    lst = .pair(lst)
  }
  print("Done creating list")
  // The following statement now returns
  lst = .empty
  print("Completed program (this is never printed)")
  condition.signal()
  condition.unlock()
}
t.stackSize = 2560 * 4096 // 10MB
condition.lock()
t.start()
print("Started")
condition.wait()
condition.unlock()
print("Finished")

[deleted]

Freeze when deallocating large data structures
 
 
Q