Memory leaks in regular expressions

Hi

        let regex1 = #"^[0-9]{4}-[0-9][0-9]?-[0-9][0-9]?$"#
        let range1 = input.range(of: regex1, options: .regularExpression)
        let string1: String
        if range1 != nil {
            // Date only: append a midnight time component.
            string1 = input + "T00:00:00.000"
        } else {
            // NSRange counts UTF-16 code units, not UTF-8 bytes.
            let range = NSRange(input.startIndex..., in: input)

            let pattern =
                "^([0-9]{4}-[0-9][0-9]-[0-9][0-9]).([0-2][0-9]:[0-5][0-9]:[0-6][0-9]\\.[0-9][0-9][0-9])$"

            let regEx = try! NSRegularExpression(
                pattern: pattern,
                options: []
            )
            if let match = regEx.firstMatch(in: input, options: [], range: range),
               let d1 = Range(match.range(at: 1), in: input),
               let d2 = Range(match.range(at: 2), in: input) {
                // Join the date and time captures with "T".
                string1 = String(input[d1]) + "T" + String(input[d2])
            } else {
                // No match: fall back to an empty string.
                string1 = ""
            }
        }

I called that fewer than 1_000_000 times and it used up 5 GB of memory.

You're doing something that's (a) compute-intensive and (b) uses a lot of reference-counted objects (NSRegularExpression and probably NSString behind the scenes). Because of the way Automatic Reference Counting (ARC) works, some objects may not be deallocated immediately after use, but may be added to an "autorelease pool" — which means they'll be deallocated "later". The later-deallocation is called "draining" the autorelease pool.

"Later" usually means when the thread returns to its top-level event processing loop. If you're looping a million times inside a function you wrote, then you're not returning to the event processing loop each iteration, and millions of allocated objects will accumulate in the autorelease pool.

For this scenario, you need a way of draining the autorelease pool periodically. There's a Swift construct for this:

    for iteration in 0 ..< 1_000_000 {
        autoreleasepool {
            … code that creates objects …
        }
    }

Note that draining the autorelease pool doesn't deallocate any objects that are still in use, so it's harmless to objects that aren't ready for deallocation. Note also that there's a small performance cost to draining the pool. In performance-critical code, you might not want to do it every iteration of the loop.
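If draining on every iteration turns out to be too costly, one option is to drain once per batch of iterations. Here's a rough sketch of that idea. The helper name `forEachBatched` and its `batchSize` parameter are made up for illustration, not part of any library, and the right batch size is something you'd have to measure for your own workload. The `#if` is there because `autoreleasepool` only exists where the Objective-C runtime is available.

```swift
import Foundation

// Hypothetical helper (not a library API): calls `body` for each index in
// 0..<count, wrapping every `batchSize` iterations in one autorelease pool
// so that the pool drains once per batch instead of once per iteration.
func forEachBatched(count: Int, batchSize: Int, _ body: (Int) -> Void) {
    precondition(batchSize > 0, "batchSize must be positive")
    var start = 0
    while start < count {
        let end = min(start + batchSize, count)
        #if canImport(ObjectiveC)
        // Autoreleased objects created by this batch are released
        // when this closure returns.
        autoreleasepool {
            for index in start..<end { body(index) }
        }
        #else
        // No Objective-C runtime, so there is no autorelease pool to drain.
        for index in start..<end { body(index) }
        #endif
        start = end
    }
}
```

You'd call it like `forEachBatched(count: 1_000_000, batchSize: 1_000) { _ in … }`, with your parsing code in the closure. Larger batches mean fewer drains but a higher peak in memory between drains.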

Helpful.

It is sad that Swift inherited ARC as a memory management tool.

Well, the good news is that Swift now has its own RegEx type (https://developer.apple.com/videos/play/wwdc2022/110357/), which should be able to avoid these legacy issues. However, Swift RegEx isn't back-deployed before iOS 16/macOS 13.
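For what it's worth, here's roughly what the parsing from the original post might look like with Swift `Regex`. Treat it as a sketch under some assumptions: Swift 5.7 or later, a deployment target of iOS 16/macOS 13 or later, and a made-up `TimestampNormalizer` type that returns `nil` when nothing matches (the original code doesn't say what should happen in that case). The patterns are copied from the post without the `^`/`$` anchors, since `wholeMatch(of:)` already requires the entire string to match. I haven't measured whether this changes the memory behavior described above.

```swift
// Assumes Swift 5.7+ and a deployment target of iOS 16 / macOS 13 or later.

// Hypothetical type: compiles both patterns once so they can be reused
// across many calls.
struct TimestampNormalizer {
    private let dateOnly: Regex<AnyRegexOutput>
    private let dateTime: Regex<AnyRegexOutput>

    init() throws {
        dateOnly = try Regex(#"[0-9]{4}-[0-9][0-9]?-[0-9][0-9]?"#)
        dateTime = try Regex(
            #"([0-9]{4}-[0-9][0-9]-[0-9][0-9]).([0-2][0-9]:[0-5][0-9]:[0-6][0-9]\.[0-9][0-9][0-9])"#
        )
    }

    // Returns the input with a "T" between date and time, or nil when the
    // input matches neither pattern (an assumption; the post leaves this open).
    func normalize(_ input: String) -> String? {
        if input.wholeMatch(of: dateOnly) != nil {
            return input + "T00:00:00.000"
        }
        guard let match = input.wholeMatch(of: dateTime),
              let date = match.output[1].substring,
              let time = match.output[2].substring
        else { return nil }
        return "\(date)T\(time)"
    }
}
```

Create one `TimestampNormalizer` outside the loop and call `normalize(_:)` inside it, so the patterns aren't recompiled on every iteration.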
