When I rename a file in Finder and use the character й
U+0439 : CYRILLIC SMALL LETTER SHORT I
macOS/Finder then converts it to a sequence of two Unicode characters:
U+0438 : CYRILLIC SMALL LETTER I
U+0306 : COMBINING BREVE {short; Greek vrachy}
Does anybody know why this happens? This behaviour causes trouble for regular expressions on various web pages, because they don't recognise the converted й as a valid Cyrillic letter. My macOS version is 14.1.2 (23B92).
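It's likely the result of Unicode "normalization", which chooses a preferred underlying representation of a character when there are multiple possible representations. In your case the precomposed й (U+0439) has been replaced by its canonically equivalent decomposed sequence, и (U+0438) followed by the combining breve (U+0306). Note that there's more than one normalization form (composed forms such as NFC and decomposed forms such as NFD), so the form the file system uses may not be the same one used on other platforms.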
[To be explicit: File systems do normalization so that filenames consisting of "the same characters" are treated as the same filename, regardless of the underlying representation of the characters. A file system doesn't strictly have to behave this way, but it's a poor experience for users when they can have multiple files with what is apparently the "same" name.]
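To make the two representations from the question concrete, here is a minimal Python sketch using the standard `unicodedata` module. It only illustrates the canonical equivalence between the two forms; it does not claim to reproduce exactly what Finder or the file system does internally, and the variable names are just for illustration.

```python
import unicodedata

precomposed = "\u0439"       # й as a single code point (CYRILLIC SMALL LETTER SHORT I)
decomposed = "\u0438\u0306"  # и (CYRILLIC SMALL LETTER I) + COMBINING BREVE

# The two strings render identically but are different code point sequences.
print(precomposed == decomposed)

# NFD splits the precomposed letter into base letter + combining mark.
nfd = unicodedata.normalize("NFD", precomposed)
print([unicodedata.name(ch) for ch in nfd])

# NFC recombines the pair into the single precomposed code point.
nfc = unicodedata.normalize("NFC", decomposed)
print(nfc == precomposed)
```

A plain `==` comparison sees two different strings, while normalizing both sides to the same form makes them compare equal.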
All that aside, the normalization form should not really cause any problems for properly-written code for doing string processing, such as regular expressions.
Tools that search text (whether by regular expressions or not) have to be explicit about whether they're doing normalized-form searching or not. If you're writing your own code to match against text copied from web pages, you'll have to give this issue some attention, too. It's hard to be more explicit without knowing more about what scenarios you're seeing that have these "troubles".
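If the matching code is your own, one common approach is to normalize the input to a single form before applying the pattern. The sketch below assumes a simple "lowercase Cyrillic word" check; the pattern and the `is_cyrillic_word` helper are hypothetical stand-ins for whatever the affected pages actually use.

```python
import re
import unicodedata

# Hypothetical pattern: lowercase Russian letters а-я plus ё, as precomposed code points.
CYRILLIC_WORD = re.compile(r"[\u0430-\u044f\u0451]+")


def is_cyrillic_word(text: str) -> bool:
    """Normalize to NFC first, so decomposed input matches the precomposed ranges."""
    composed = unicodedata.normalize("NFC", text)
    return CYRILLIC_WORD.fullmatch(composed) is not None


# "край" with й stored as и + COMBINING BREVE, as in the renamed file.
decomposed_name = "\u043a\u0440\u0430\u0438\u0306"

# Without normalization, the combining breve falls outside the character class.
raw_match = CYRILLIC_WORD.fullmatch(decomposed_name) is not None
normalized_match = is_cyrillic_word(decomposed_name)
print(raw_match, normalized_match)
```

Normalizing to NFC is a choice made for this sketch because the pattern is written in terms of precomposed letters; a pattern that allows combining marks explicitly would be another option.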