Getting precise text position with Swift for macOS

Hey there! Hope you are starting the year with great joy.

My situation

I'm building a new product that detects certain text on screen in real time. The product targets only the Mac and is built with Swift.

My problem

I need to get the exact position of a text element with the Apple Accessibility API, but I can't figure it out. I managed to get the AXUIElement where the text is placed, but its position is too broad and off target.

My discoveries so far

I've tried OCR, but it is too slow for what I'm building, so the only possible way I can think of is the Accessibility API.

Thank you in advance.

Could you describe your use case in more detail?

Is it text from your own app, or any text (like a Finder file name) that may appear on screen?

If the former, is it an NSTextField? Do you know which window it is in? If so, you should be able to compute the text position from the window position.

If the latter, the text may just be an image, so OCR may be the best way to go, although it may indeed be slow.
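
For the first case (text in your own app), here is a minimal sketch of computing the position from the window. It assumes the text lives in an NSTextField that is already installed in a window; `screenFrame(of:)` is a hypothetical helper name, not an existing API.

```swift
import AppKit

/// Hypothetical helper: returns the text field's frame in screen coordinates
/// (AppKit convention, origin at the bottom-left of the primary screen),
/// or nil if the field is not in a window.
func screenFrame(of textField: NSTextField) -> NSRect? {
    guard let window = textField.window else { return nil }
    // Passing nil as the destination converts to the window's coordinate system.
    let frameInWindow = textField.convert(textField.bounds, to: nil)
    // Then map from window coordinates to screen coordinates.
    return window.convertToScreen(frameInWindow)
}
```

Note that this gives the frame of the whole field, not of an individual word inside it.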

Sure! Thanks for your question @Claude31.

It's any text that may appear on screen. For example, whenever the word "Happy" appears on screen, I want to place an overlay with the emoji 😄.

It's not always an NSTextField, but I'm always able to detect the desired word as an AXUIElement, along with the window that contains the word ("Happy").

I was also able to place the overlay without any issues.

The problem is that I'm not able to get the proper position of the text element inside the window that contains the word.

I've attached an image as an example: the emoji should be placed on top of the word "Happy", but because I can't get the correct position, it ends up elsewhere.

Thanks again!
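
For reference, one approach that may narrow the position down is the Accessibility API's parameterized attribute `kAXBoundsForRangeParameterizedAttribute`, which asks a text element for the screen rectangle of a character range. The sketch below assumes the element exposes its text through `kAXValueAttribute` and supports that parameterized attribute; not every app or element does, so treat a nil result as "unsupported". The names `boundsOfWord` and `appKitRect` are hypothetical helpers, not part of any framework.

```swift
import ApplicationServices
import Foundation

/// Hypothetical helper: screen rectangle of the first occurrence of `word`
/// inside `element`, in Accessibility coordinates (origin at the top-left
/// of the primary screen). Returns nil if the element does not support it.
func boundsOfWord(_ word: String, in element: AXUIElement) -> CGRect? {
    // Read the element's full text (assumes it is exposed as a string value).
    var valueRef: CFTypeRef?
    guard AXUIElementCopyAttributeValue(element, kAXValueAttribute as CFString, &valueRef) == .success,
          let text = valueRef as? String,
          let found = text.range(of: word) else { return nil }

    // Accessibility ranges count UTF-16 units, like NSRange.
    let nsRange = NSRange(found, in: text)
    var cfRange = CFRange(location: nsRange.location, length: nsRange.length)
    guard let rangeValue = AXValueCreate(.cfRange, &cfRange) else { return nil }

    // Ask the element for the bounds of that range.
    var boundsRef: CFTypeRef?
    guard AXUIElementCopyParameterizedAttributeValue(
              element,
              kAXBoundsForRangeParameterizedAttribute as CFString,
              rangeValue,
              &boundsRef) == .success,
          let boundsValue = boundsRef,
          CFGetTypeID(boundsValue) == AXValueGetTypeID() else { return nil }

    var rect = CGRect.zero
    guard AXValueGetValue(boundsValue as! AXValue, .cgRect, &rect) else { return nil }
    return rect
}

/// Hypothetical helper: flips a top-left-origin Accessibility rectangle into
/// AppKit's bottom-left-origin screen coordinates, given the primary screen height.
func appKitRect(fromAX rect: CGRect, primaryScreenHeight: CGFloat) -> CGRect {
    CGRect(x: rect.origin.x,
           y: primaryScreenHeight - rect.origin.y - rect.height,
           width: rect.width,
           height: rect.height)
}
```

The second helper is there because the Accessibility API reports rectangles with a top-left origin, while an NSWindow overlay is positioned with a bottom-left origin; forgetting that flip is one possible reason an overlay lands in the wrong place.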
