Can NLTokenizer handle the .sentence unit?

My app would benefit significantly from being able to identify sentences in text, so I'm trying NLTokenizer, since the API makes it look like it can do that. However, I'm not able to obtain sentences as tokens: with the unit set to .sentence I get an empty result, but if I change the unit to .word or .paragraph, I do get words and paragraphs respectively. Am I missing something, or is this a bug?

Here's some small example code:
Code Block swift
import NaturalLanguage

let source = "It was many and many a year ago, in a kingdom by the sea. \"Quiet\", said the raven."
let tokenizer = NLTokenizer(unit: .sentence)
tokenizer.string = source
tokenizer.setLanguage(.english)
print("begin")
// Collect the substring for every token range the tokenizer reports.
let tokens = tokenizer.tokens(for: source.startIndex..<source.endIndex).map { range in
    source[range]
}
print(tokens)
print("end")

I expected to get:
Code Block
begin
["It was many and many a year ago, in a kingdom by the sea.", "\"Quiet\", said the raven."]
end


But what I actually get is:
Code Block
begin
[]
end


I found a blog post whose author claimed to have iterated over sentences using NLTokenizer, but when I examined the output, it had actually enumerated the words.

This is on macOS 10.15.6 beta 1.

Replies

This is fixed in macOS 11 (tested on 11.5.1).
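
For anyone who has to support a release where the .sentence unit returns nothing, one possible fallback is Foundation's substring enumeration with the .bySentences option, which does not go through NLTokenizer at all. The sketch below is only an illustration: the helper name sentences(in:) is made up for this example, it has not been checked against 10.15.6 specifically, and its sentence boundaries may differ from what NLTokenizer produces on macOS 11.

```swift
import Foundation

// Hypothetical helper (not an Apple API): splits text into sentences using
// Foundation's enumerateSubstrings instead of NLTokenizer.
func sentences(in text: String) -> [String] {
    var result: [String] = []
    text.enumerateSubstrings(in: text.startIndex..<text.endIndex,
                             options: .bySentences) { substring, _, _, _ in
        guard let substring = substring else { return }
        // Enumerated sentences may include trailing whitespace, so trim it.
        let trimmed = substring.trimmingCharacters(in: .whitespacesAndNewlines)
        if !trimmed.isEmpty {
            result.append(trimmed)
        }
    }
    return result
}

let source = "It was many and many a year ago, in a kingdom by the sea. \"Quiet\", said the raven."
print("begin")
print(sentences(in: source))
print("end")
```

Trimming is a design choice for display purposes; if you need the exact ranges in the original string, use the range arguments passed to the closure instead of the trimmed copies.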