Has iOS 18 changed the threshold for decoding base64 into ASCII code?

This code fails to decode when running on an iOS 18.0 or 18.1 beta device, but it succeeds on earlier versions, such as iOS 17.5. Xcode Version 16.0 (16A242d).

import Foundation

let base64String = "1ZwoNohdE8Nteis/IXl1rg=="
if let decodedData = Data(base64Encoded: base64String, options: .ignoreUnknownCharacters) {
    // On iOS 18 this init returns nil; on iOS 17 it returned a string.
    if let decodedString = String(data: decodedData, encoding: .ascii) {
        print("Decoded string: \(decodedString)")
    } else {
        print("Failed to decode string using ascii encoding")
    }
} else {
    print("Failed to decode Base64 string")
}
Answered by DTS Engineer in 804362022

The code you posted is troubling. Specifically, the Base64 data, when decoded, doesn’t yield valid ASCII. Consider this tweaked version of your code:

func test() {
    let base64String = "1ZwoNohdE8Nteis/IXl1rg=="
    if let decodedData = Data(base64Encoded: base64String, options: .ignoreUnknownCharacters) {
        print("OK, data: \((decodedData as NSData).debugDescription)")
    } else {
        print("NG")
    }
}

This outputs the same value on iOS 17.6.1 and iOS 18.0, namely:

OK, data: <d59c2836 885d13c3 6d7a2b3f 217975ae>

But the resulting data has lots of high-bit-set bytes, and thus isn’t ASCII. So you’re relying on the behaviour of String.init(data:encoding:) when you pass it data that’s not ASCII. Which brings me to a further tweaked version of your code:

func test() {
    let decodedData = Data([
        0xd5, 0x9c, 0x28, 0x36, 0x88, 0x5d, 0x13, 0xc3,
        0x6d, 0x7a, 0x2b, 0x3f, 0x21, 0x79, 0x75, 0xae,
    ])
    if let decodedString = String(data: decodedData, encoding: .ascii) {
        print("OK, string: \(decodedString)")
    } else {
        print("NG")
    }
}

On iOS 17.6.1 it prints this:

OK, string: ՜(6ˆ]Ãmz+?!yu®

On iOS 18.0 it prints this:

NG

IMO that’s an improvement. The iOS 17 behaviour is nonsense. iOS 18 has correctly failed to decode your string because it’s not even close to ASCII.

Digging further into the hex dumps, it looks like iOS 17 was treating .ascii as a synonym for .isoLatin1. If you want the old behaviour — and, to be clear, that old behaviour makes no sense to me — you can get it by applying that change. My advice, however, is that you look at your code to figure out why you were trying to decode non-ASCII data as ASCII.
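
For concreteness, here's a minimal sketch of that swap, assuming the iOS 17 output really is what you need: .isoLatin1 maps all 256 byte values, so the string init succeeds where .ascii now fails.

import Foundation

let base64String = "1ZwoNohdE8Nteis/IXl1rg=="
if let decodedData = Data(base64Encoded: base64String, options: .ignoreUnknownCharacters) {
    // .isoLatin1 accepts every possible byte, so this init succeeds; on iOS 17,
    // asking for .ascii produced this same (Latin-1 interpreted) result.
    if let decodedString = String(data: decodedData, encoding: .isoLatin1) {
        print("Decoded string: \(decodedString)")
    } else {
        print("Failed to decode string using isoLatin1 encoding")
    }
} else {
    print("Failed to decode Base64 string")
}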

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

it looks like iOS 17 was treating .ascii as a synonym for .isoLatin1 [and now it isn't]

That's going to bite a few people, isn't it!

Was that mentioned in the release notes?

That's going to bite a few people, isn't it!

Possibly. I’m always amazed by how software systems can be so robust in the face of abstractions that are so leaky. It’s possible, for example, to completely rework Foundation in a new language, and the vast bulk of code keeps chugging along without noticing.

Was that mentioned in the release notes?

I’m pretty sure you can answer that just as well as I can (-:

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

This has bitten me as well. I used .ascii to detect whether an SSH key uses a passphrase or not. Thanks, I guess?

I am also having an issue with this change in my code. Why doesn't switching from .ascii to .utf8 work? Doesn't that allow for the high bit to be set? Sorry for the ignorance, but I look forward to Quinn's answer.

Why doesn't switching from .ascii to .utf8 work? Doesn't this allow for the high bit to be set?

Not all possible byte sequences are valid UTF-8.
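
For example, here's a quick sketch using the decoded bytes from the original post: the 0x88 byte is a stray continuation byte, so UTF-8 decoding fails.

import Foundation

// The decoded Base64 bytes from the original post.
let decodedData = Data([
    0xd5, 0x9c, 0x28, 0x36, 0x88, 0x5d, 0x13, 0xc3,
    0x6d, 0x7a, 0x2b, 0x3f, 0x21, 0x79, 0x75, 0xae,
])

// 0x88 is a UTF-8 continuation byte with no lead byte before it,
// so this data isn't valid UTF-8 and the init returns nil.
if String(data: decodedData, encoding: .utf8) == nil {
    print("Not valid UTF-8")
}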

Not all possible byte sequences are valid UTF-8.

Right.

And the fact that you’re asking this question is a concern. It suggests that you’re trying to play ‘guess the text encoding’, which isn’t a game that anyone can win )-: Consider that both ISO-Latin-1 (.isoLatin1) and MacRoman (.macOSRoman) are ‘full’, meaning that every byte sequence is valid. Thus, if you get a random stream of bytes it’s virtually impossible [1] to tell the difference.
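
To illustrate that point, here's a minimal sketch using the same bytes as above: both decodes succeed, because every byte is valid in both encodings, but they disagree about what the high-bit-set bytes mean.

import Foundation

let bytes = Data([
    0xd5, 0x9c, 0x28, 0x36, 0x88, 0x5d, 0x13, 0xc3,
    0x6d, 0x7a, 0x2b, 0x3f, 0x21, 0x79, 0x75, 0xae,
])

// Both encodings are 'full', so both inits succeed for any byte sequence...
if let latin1 = String(data: bytes, encoding: .isoLatin1),
   let macRoman = String(data: bytes, encoding: .macOSRoman) {
    // ...but they interpret the high-bit-set bytes differently, so a
    // successful decode tells you nothing about which encoding is right.
    print(latin1 == macRoman)   // false
}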

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

[1] Only “virtually” because I could imagine that statistical techniques, like a trained ML model, could do this well enough.
