NSRegularExpression Question - probably easy

How can I modify the method below so that it correctly handles this case?


sourceString: @"some text with an embedded ’token’ inside the body";

targetString: @"’token’"


- (NSString*) performActionXYZOnString:(NSString*)sourceString usingTarget:(NSString*)targetString {
    NSMutableString * resultString = [[NSMutableString alloc] initWithString:sourceString];
    if ((sourceString.length>0)&&(targetString.length>0)) {
        NSRange fullSourceStringRange = NSMakeRange(0, sourceString.length);


        NSString *targeWordExpression = [NSString stringWithFormat:@"(%@)", targetString];
        NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:targeWordExpression options:NSRegularExpressionCaseInsensitive error:nil];


        [regex enumerateMatchesInString:sourceString options:NSMatchingReportCompletion range:fullSourceStringRange usingBlock:^(NSTextCheckingResult *result, NSMatchingFlags flags, BOOL *stop) {
            if (result!=nil) {
                // ... perform actions on resultString ...
            }
        }];
    }
    return [NSString stringWithString:resultString];
}

Accepted Reply

Are you treating the target word as a literal substring to find? That is, you don't want to interpret any characters within it as regex metacharacters, right?


If that's the case, then you should not format it into a pattern. That is, don't create or use targeWordExpression. Instead, pass the targetString directly to +regularExpressionWithPattern:… and use the NSRegularExpressionIgnoreMetacharacters option.


The parentheses in your format string don't really buy you anything. They cause the result to include a capture group but it will be just the same as the overall match range.
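
To make the suggestion concrete, here is a minimal sketch of literal, case-insensitive matching with NSRegularExpressionIgnoreMetacharacters. The helper name LiteralMatchRanges is made up for illustration, and collecting the match ranges is only an assumption about what the block would do; the original method's actions on resultString are not shown in the question.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): returns the range of every
// literal, case-insensitive occurrence of targetString in sourceString.
static NSArray<NSValue *> *LiteralMatchRanges(NSString *sourceString, NSString *targetString) {
    NSMutableArray<NSValue *> *ranges = [NSMutableArray array];
    if (sourceString.length == 0 || targetString.length == 0) {
        return ranges;
    }

    // IgnoreMetacharacters makes the whole pattern a literal string, so
    // characters such as '.', '(' and ')' match only themselves.
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:targetString
                                                  options:(NSRegularExpressionCaseInsensitive |
                                                           NSRegularExpressionIgnoreMetacharacters)
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Could not build the regular expression: %@", error);
        return ranges;
    }

    [regex enumerateMatchesInString:sourceString
                            options:0
                              range:NSMakeRange(0, sourceString.length)
                         usingBlock:^(NSTextCheckingResult *result, NSMatchingFlags flags, BOOL *stop) {
        if (result != nil) {
            [ranges addObject:[NSValue valueWithRange:result.range]];
        }
    }];
    return ranges;
}
```

Passing an NSError pointer instead of nil is optional, but it shows why the regex could not be built if the pattern is ever rejected.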

Replies

What do you mean by "correctly handle"? What should it do that this code is not doing or what should it not do that this code is doing?


Also, do you really need to use a regular expression for this? If I'm understanding this code correctly, it's basically equivalent to -rangeOfString:options: with NSCaseInsensitiveSearch.

The presence of the "’" causes the method, which otherwise works correctly, to never enter the block.

So in the case presented, the block is never entered, and therefore the returned result does not include any actions for the supplied target.

If the target does not contain the "’", the method works as expected.

BTW: the presence of a '.', a '_', or parentheses also appears to cause the same failure (all characters which have regex meaning).


rangeOfString:options:range:

Finds and returns the range of the first occurrence of a given string, within the given range of the string, subject to given options.


The block shown above iterates over the entire string, dealing with each occurrence via 'enumerateMatchesInString'.


I simply do not understand how you would use 'rangeOfString:options:' to accomplish this task.

Yes, a loop could be constructed and the target range changed over the loop, blah blah, but isn't that more tedious and error-prone than a good old-fashioned regex?

I would really appreciate more information on this point.


Thank you for your time!

Steve
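
For reference, the loop described above can be fairly short. This is a minimal sketch, assuming the goal is just to collect the range of each case-insensitive occurrence; the helper name OccurrenceRanges is made up for illustration and is not part of Foundation.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): finds every
// case-insensitive occurrence of targetString by repeatedly searching
// the part of sourceString that has not been scanned yet.
static NSArray<NSValue *> *OccurrenceRanges(NSString *sourceString, NSString *targetString) {
    NSMutableArray<NSValue *> *ranges = [NSMutableArray array];
    if (sourceString.length == 0 || targetString.length == 0) {
        return ranges;
    }

    NSRange searchRange = NSMakeRange(0, sourceString.length);
    while (searchRange.length > 0) {
        NSRange found = [sourceString rangeOfString:targetString
                                            options:NSCaseInsensitiveSearch
                                              range:searchRange];
        if (found.location == NSNotFound) {
            break;
        }
        [ranges addObject:[NSValue valueWithRange:found]];

        // Continue searching just past the end of this match.
        NSUInteger next = NSMaxRange(found);
        searchRange = NSMakeRange(next, sourceString.length - next);
    }
    return ranges;
}
```

Because -rangeOfString:options:range: treats the target as a plain string, characters such as '.' or parentheses need no special handling here.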

Ken,


Thank you. Making your recommended changes as:

        NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:targetString options:(NSRegularExpressionOptions)(NSRegularExpressionCaseInsensitive|NSRegularExpressionIgnoreMetacharacters) error:nil];


Now works as expected.