Unexpected Insertion of U+2004 (Space) When Using UITextView with Pinyin Input on iOS 18

I encountered an issue with UITextView on iOS 18 where, when typing Pinyin, extra Unicode characters such as U+2004 are inserted unexpectedly. This occurs when using a Chinese input method.

Steps to Reproduce:

1.	Set up a UITextView with a standard delegate implementation.
2.	Use a Pinyin input method to type the character “ㄨ”.
3.	Observe that after the character “ㄨ” is typed, extra spaces (U+2004) are inserted automatically between the characters.

Code Example:

class ViewController: UIViewController {
    @IBOutlet weak var textView: UITextView!

    override func viewDidLoad() {
        super.viewDidLoad()
        // Set the delegate in code in case it is not already connected in the storyboard.
        textView.delegate = self
    }
}

extension ViewController: UITextViewDelegate {
    func textView(_ textView: UITextView, shouldChangeTextIn range: NSRange, replacementText text: String) -> Bool {
        print("shouldChangeTextIn: range \(range)")
        print("shouldChangeTextIn: replacementText \(text)")
        return true
    }

    func textViewDidChange(_ textView: UITextView) {
        let currentText = textView.text ?? ""
        let unicodeValues = currentText.unicodeScalars.map { String(format: "U+%04X", $0.value) }.joined(separator: " ")
       
        print("textViewDidChange: textView.text: \(currentText)")
        print("textViewDidChange: Unicode Scalars: \(unicodeValues)")
    }
}

Output:

shouldChangeTextIn: range {0, 0}
shouldChangeTextIn: replacementText ㄨ
textViewDidChange: textView.text: ㄨ
textViewDidChange: Unicode Scalars: U+3128
------------------------
shouldChangeTextIn: range {1, 0}
shouldChangeTextIn: replacementText ㄨ
textViewDidChange: textView.text: ㄨ ㄨ
textViewDidChange: Unicode Scalars: U+3128 U+2004 U+3128
------------------------
shouldChangeTextIn: range {3, 0}
shouldChangeTextIn: replacementText ㄨ
textViewDidChange: textView.text: ㄨ ㄨ ㄨ
textViewDidChange: Unicode Scalars: U+3128 U+2004 U+3128 U+2004 U+3128

This issue may affect text processing, especially in cases where precise text manipulation is required, such as calculating ranges in shouldChangeTextIn.
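
To make the effect on ranges concrete, here is a minimal sketch that reproduces the offsets from the log above using plain strings. It assumes only that NSRange locations are counted in UTF-16 code units; the variable names are illustrative.

```swift
// Text as reported by textViewDidChange after the second keystroke.
let withSeparator = "\u{3128}\u{2004}\u{3128}"
// Text one might expect if no separator had been inserted.
let withoutSeparator = "\u{3128}\u{3128}"

// NSRange locations are measured in UTF-16 code units.
let observedInsertionPoint = withSeparator.utf16.count
let expectedInsertionPoint = withoutSeparator.utf16.count

print(observedInsertionPoint) // 3, matching the logged range {3, 0}
print(expectedInsertionPoint) // 2
```

Any range computed on the assumption that the text contains only the typed characters is therefore off by one for each inserted separator.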

The screenshot you provided shows a Japanese Kana keyboard, not a Chinese Pinyin keyboard. Trying a Simplified Chinese Pinyin keyboard on my iPhone with iOS 18.2.1, I do see something similar, though not identical:

When typing "X" (capital) three times, I got:

textViewDidChange: textView.text: XXX
textViewDidChange: Unicode Scalars: U+0058 U+0058 U+0058

When typing "x" three times, I got:

textViewDidChange: textView.text: x x x
textViewDidChange: Unicode Scalars: U+0078 U+2006 U+0078 U+2006 U+0078

So there is indeed an extra U+2006 (six-per-em space).

This behavior seems reasonable to me though, because the marked text "x" here implies a Chinese character, while "X" (capital) represents itself, as you can tell from the candidate window, and the extra U+2006 makes the difference clear.
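
If it helps to confirm which separators a given input method inserts, the following is a small sketch that lists the non-ASCII space separators (Unicode general category Zs) in a string. The function name `unusualSpaces(in:)` is hypothetical, and the sample string is simply the text from the log above.

```swift
import Foundation

/// Returns the non-ASCII space separators (general category Zs) found in `text`,
/// formatted the same way as the Unicode scalar log in this thread.
func unusualSpaces(in text: String) -> [String] {
    text.unicodeScalars
        .filter { $0.properties.generalCategory == .spaceSeparator && $0 != " " }
        .map { String(format: "U+%04X", $0.value) }
}

// Sample: the text logged after typing "x" three times.
let found = unusualSpaces(in: "x\u{2006}x\u{2006}x")
print(found) // ["U+2006", "U+2006"]
```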

Regarding the following:

This issue may affect text processing, especially in cases where precise text manipulation is required, such as calculating ranges in shouldChangeTextIn

Assuming that your text manipulation is on the confirmed text, you can use the following code to retrieve the marked text, and then remove it from textView.text to get the confirmed text:

if let markedTextRange = textView.markedTextRange {
    let markedText = textView.text(in: markedTextRange) ?? ""
    print("\(#function): range = \(markedTextRange), markedText = \(markedText)")
}
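
The removal step itself could look roughly like the sketch below. `confirmedText(in:)` is a hypothetical helper name, not a UIKit API, and the sketch assumes that the offsets UITextView reports for its text positions are UTF-16 code unit offsets into textView.text.

```swift
import UIKit

/// Returns the text of `textView` with any marked (not yet confirmed) text removed.
/// Hypothetical helper; not part of UIKit.
func confirmedText(in textView: UITextView) -> String {
    let fullText = (textView.text ?? "") as NSString
    guard let markedRange = textView.markedTextRange else {
        return fullText as String // No composition in progress.
    }
    // Assumption: these offsets are UTF-16 based, so they can form an NSRange.
    let location = textView.offset(from: textView.beginningOfDocument, to: markedRange.start)
    let length = textView.offset(from: markedRange.start, to: markedRange.end)
    let range = NSRange(location: location, length: length)
    // Fall back to the full text if the computed range does not fit.
    guard location >= 0, length >= 0, NSMaxRange(range) <= fullText.length else {
        return fullText as String
    }
    return fullText.replacingCharacters(in: range, with: "")
}
```

Calling this from textViewDidChange would give you the text without the in-progress composition, and therefore without the separators inside the marked text.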

Other than that, I am curious why the extra space in the marked text becomes an issue in your use case. If you don't mind explaining a bit more, I'll see if I can comment.

Best,
——
Ziqiao Chen
 Worldwide Developer Relations.

Thank you for your attention and any suggestions!
