I've recently started using the Natural Language framework, and I'm fascinated by NLTokenizer, and particularly NLTagger. I've written a simple app that takes a text file as input, then produces a table listing its tokens, lemmas, and lexical classes. I'm impressed by how well the tagging works. But there are some occasional quirks.
For example, it's unable to recognize "won't" as a form of "will", or "can't" as a form of "can". Instead, it tokenizes them as "wo" and "ca" respectively, although it does recognize both of them as verbs.
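For context, here is a minimal sketch of the kind of tagging loop I mean. It is not my actual app: the function name tagRows and the sample sentence are made up for illustration, and it works on a string rather than a text file. It asks NLTagger for the lexical class of each word, then looks up the lemma at the same position.

```swift
import NaturalLanguage

// Hypothetical helper for illustration: returns one (token, lemma, lexical class)
// row per word that NLTagger finds in `text`.
func tagRows(_ text: String) -> [(token: String, lemma: String, lexicalClass: String)] {
    var rows: [(token: String, lemma: String, lexicalClass: String)] = []

    // Request both schemes up front so the tagger can serve either one.
    let tagger = NLTagger(tagSchemes: [.lexicalClass, .lemma])
    tagger.string = text

    let options: NLTagger.Options = [.omitWhitespace, .omitPunctuation]

    tagger.enumerateTags(in: text.startIndex..<text.endIndex,
                         unit: .word,
                         scheme: .lexicalClass,
                         options: options) { tag, range in
        let token = String(text[range])

        // Look up the lemma for the same token; it may be nil if none is available.
        let (lemmaTag, _) = tagger.tag(at: range.lowerBound, unit: .word, scheme: .lemma)

        rows.append((token: token,
                     lemma: lemmaTag?.rawValue ?? "-",
                     lexicalClass: tag?.rawValue ?? "-"))
        return true // keep enumerating
    }
    return rows
}

// Sample sentence (made up) containing the contractions in question.
for row in tagRows("I won't go, and I can't stay.") {
    print("\(row.token)\t\(row.lemma)\t\(row.lexicalClass)")
}
```

Running something like this on a sentence with contractions is how I noticed the "wo" and "ca" tokens.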
Is there any way of gaining access to the model NLTagger uses, and doing some further training on it?