Identify parts of a sentence

Is there a way in Swift to identify the different parts of a given sentence string?


For example,


If given "The dog walked", you could identify that "The dog" is the subject and "walked" is the verb


Or


If given "Do this please", you could identify that "Do" is the verb and "this" is the object, and there is no subject so this is a command.

  • Pasting the suggested code below into a clean playground with Xcode Version 14.3 (14E222b) I get:

    The dog walked The->OtherWord dog->OtherWord walked->OtherWord

    Do this please Do->OtherWord this->OtherWord please->OtherWord

    Is the natural language parse function no longer working?

Accepted Reply

I do not know the exact tool for your purpose, but the Natural Language framework (introduced at WWDC 2018, available since iOS 12/macOS 10.14) may help you a little.


import NaturalLanguage

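// Prints each word of `text` followed by its lexical class (part of speech).
func tag(text: String) {
    print()
    print(text)
    let tagger = NLTagger(tagSchemes: [.lexicalClass])
    tagger.string = text
    let wholeText = text.startIndex..<text.endIndex
    // Tell the tagger the language explicitly rather than relying on detection.
    tagger.setLanguage(.english, range: wholeText)
    let options: NLTagger.Options = [.omitWhitespace, .omitPunctuation]
    tagger.enumerateTags(in: wholeText, unit: .word, scheme: .lexicalClass, options: options) { tag, range in
        // The tag is optional, so skip untagged tokens instead of force-unwrapping.
        guard let tag = tag else { return true }
        print("\(text[range])->\(tag.rawValue)")
        return true // keep enumerating
    }
}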
tag(text: "The dog walked")
tag(text: "Do this please")


Output:


The dog walked

The->Determiner

dog->Noun

walked->Verb


Do this please

Do->Verb

this->Determiner

please->Interjection


As far as I checked, there's no feature listed in the Natural Language framework that detects "The dog" as Subject.


You may need to write a syntactic parser yourself, or find a third-party library (though I know of none available in Swift).
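To give an idea of what "writing it yourself" could look like at its very simplest, here is a rough sketch that works on the (word, tag) pairs printed above. It is only a toy heuristic, not a real syntactic parser: it treats the noun-phrase words before the first verb as the subject and calls the sentence a command when there is none. The names `TaggedWord`, `RoughParse`, and `roughParse` are made up for this example, and the tag strings are assumed to match the raw values in the output above.

```swift
// A word together with the tag name printed by the tagger (for example "Noun").
struct TaggedWord {
    let word: String
    let tag: String
}

// Hypothetical result type for this sketch.
struct RoughParse {
    let subject: [String]  // noun-phrase words found before the first verb
    let verb: String?      // the first word tagged as a verb, if any
    let isCommand: Bool    // true when a verb exists but no noun or pronoun precedes it
}

func roughParse(_ words: [TaggedWord]) -> RoughParse {
    // Without a verb there is nothing to anchor the guess on.
    guard let verbIndex = words.firstIndex(where: { $0.tag == "Verb" }) else {
        return RoughParse(subject: [], verb: nil, isCommand: false)
    }
    let nounPhraseTags: Set<String> = ["Determiner", "Adjective", "Noun", "Pronoun"]
    let beforeVerb = words[..<verbIndex]
    // A subject needs at least one noun or pronoun as its head.
    let hasHead = beforeVerb.contains { $0.tag == "Noun" || $0.tag == "Pronoun" }
    let subject = hasHead
        ? beforeVerb.filter { nounPhraseTags.contains($0.tag) }.map { $0.word }
        : []
    return RoughParse(subject: subject, verb: words[verbIndex].word, isCommand: !hasHead)
}

let statement = roughParse([
    TaggedWord(word: "The", tag: "Determiner"),
    TaggedWord(word: "dog", tag: "Noun"),
    TaggedWord(word: "walked", tag: "Verb"),
])
let command = roughParse([
    TaggedWord(word: "Do", tag: "Verb"),
    TaggedWord(word: "this", tag: "Determiner"),
    TaggedWord(word: "please", tag: "Interjection"),
])
print(statement.subject, statement.verb ?? "-", statement.isCommand)
print(command.subject, command.verb ?? "-", command.isCommand)
```

This handles the two example sentences, but it will misfire on questions, passive sentences, and anything where the tagger itself is ambiguous, so treat it as a starting point only.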

Replies

This is very language-specific: the rules differ between English and French, not to mention Chinese.


It also requires understanding the meaning of the words, not just identifying them.


How would you tell the difference between

The dog walked

and

the dog house


That could be a good exercise for gaining ML experience.

Really impressive, I didn't know about this framework.


I tested it in French:


func tagFrench(text: String) {
    print()
    print(text)
    let tagger = NLTagger(tagSchemes: [.lexicalClass])
    tagger.string = text
    let wholeText = text.startIndex..<text.endIndex
    tagger.setLanguage(.french, range: wholeText)
    let options: NLTagger.Options = [.omitWhitespace, .omitPunctuation]
    tagger.enumerateTags(in: wholeText, unit: .word, scheme: .lexicalClass, options: options) {tag, range in
        guard let tag = tag else { return true }
        print("\(text[range])->\(tag.rawValue)")
       
        return true
    }
}

tagFrench(text: "Les fleurs que j'ai cueillies sont belles")


and got the correct analysis


Les fleurs que j'ai cueillies sont belles

Les->Determiner

fleurs->Noun

que->Conjunction

j'->Pronoun

ai->Verb

cueillies->Verb

sont->Verb

belles->Adjective


One remaining point: the tag names are not localized. Maybe there is an option for that?

Unfortunately, the framework is not well trained for Japanese.


func tag(text: String, language: NLLanguage = .english) {
    print()
    print(text)
    let tagger = NLTagger(tagSchemes: [.lexicalClass])
    tagger.string = text
    let wholeText = text.startIndex..<text.endIndex
    tagger.setLanguage(language, range: wholeText)
    let options: NLTagger.Options = [.omitWhitespace, .omitPunctuation]
    tagger.enumerateTags(in: wholeText, unit: .word, scheme: .lexicalClass, options: options) {tag, range in
        guard let tag = tag else { return true }
        print("\(text[range])->\(tag.rawValue)")
        
        return true
    }
}
tag(text: "目の前を犬が歩いた", language: .japanese)
tag(text: "これをやっといてちょうだい", language: .japanese)


Output:


目の前を犬が歩いた

目->OtherWord

の->OtherWord

前->OtherWord

を->OtherWord

犬->OtherWord

が->OtherWord

歩い->OtherWord

た->OtherWord


これをやっといてちょうだい

これ->OtherWord

を->OtherWord

やっ->OtherWord

とい->OtherWord

て->OtherWord

ちょうだい->OtherWord


All words tagged as OtherWord.
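When every word comes back as OtherWord, it may be worth checking whether the lexical-class scheme is offered for that language on the current system at all. A minimal sketch of such a check follows, using `NLTagger.availableTagSchemes(for:language:)`. The helper name `supportsLexicalClass` is made up for this example, and the result depends on the OS version and installed language assets, so I am not listing an expected output.

```swift
import NaturalLanguage

// Returns true when word-level lexical-class tagging is reported as available for `language`.
func supportsLexicalClass(_ language: NLLanguage) -> Bool {
    NLTagger.availableTagSchemes(for: .word, language: language).contains(.lexicalClass)
}

// Compare a language that worked in this thread with one that did not.
for language in [NLLanguage.english, .french, .japanese] {
    print(language.rawValue, supportsLexicalClass(language))
}
```

If this prints false for a language, getting only OtherWord is probably a matter of model availability rather than a mistake in the tagging code.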


The rawValue of `NLTag` seems to be just a symbol, so you may need to localize it yourself. (In Swift, it is a thin wrapper around a string.)
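Localizing it yourself could be as simple as a lookup table keyed by the raw tag name. The sketch below assumes the raw values are the English names printed in this thread ("Noun", "Verb", and so on). The table `frenchTagNames` and the helper `localizedTagName` are made up for this example, and the French labels are my own choice, not something the framework provides.

```swift
// Hypothetical lookup table: raw tag names as printed above, mapped to French labels.
let frenchTagNames: [String: String] = [
    "Determiner": "Déterminant",
    "Noun": "Nom",
    "Verb": "Verbe",
    "Adjective": "Adjectif",
    "Pronoun": "Pronom",
    "Conjunction": "Conjonction",
    "Interjection": "Interjection",
]

// Falls back to the raw name when the table has no entry for it.
func localizedTagName(_ rawValue: String, using table: [String: String] = frenchTagNames) -> String {
    table[rawValue] ?? rawValue
}

print(localizedTagName("Noun"))
print(localizedTagName("OtherWord"))
```

Inside the `enumerateTags` closure you would pass `tag.rawValue` to the helper instead of printing it directly. In a real app the table would more likely live in a strings file than in code.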


I hope a newer version in the near future will support more languages and syntactic parsing.