CoreML & ARKit for Object Pose Estimation & Tracking

Good day people!

I'm currently working on my master's thesis in media informatics. I'd really appreciate the chance to discuss my topic with you, so I can pick up some interesting ideas or new information.

The goal is to implement an app specifically designed for places like museums, where the environment isn't ideal for AR tracking (darkness, no network connection, maybe exhibits made of glass...).

Therefore, I'd like to develop a neural network for the new iPad Pro that takes RGB-D data and predicts the pose of an object in a scene, so that a virtual copy lines up with the real-world object perfectly. This placed object will be an exact 3D model replica of the real object (hand-modeled, or scanned and revised). This should allow me to place AR content precisely over the real-world object, even in difficult lighting. Maybe it will improve occlusion, too. I can imagine that the neural network may also detect structures, edges and semantic relationships better than the usual approach.

My first thought was to work with CoreML, Metal, maybe Vision and ARKit. I will also try out Xcode for the first time.

Maybe you have interesting ideas for improvement or can guide me a little, since I feel a bit lost at the moment. Would you rather use point clouds or the raw depth buffer to train the model? Would you also train with edge-filtered images? Why or why not?

Thanks in advance, it would mean the world to me!

Kind regards, Miri :-)

Research on scene color reconstruction may give you some answers. In my opinion, you’re taking on way more than a single developer can handle, unless you have thousands of hours to spare.

Furthermore, depth buffers are only available on devices with a LiDAR scanner. That’s a small portion of your consumer market, so keep that in mind. Your eagerness to experiment with AR is on the right track though.
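
On the point cloud versus raw depth buffer question, it may help to remember that the two hold the same geometry: a point cloud can be computed from the depth buffer plus the camera intrinsics. Below is a minimal sketch in plain Python, assuming a pinhole camera model and a camera frame with x to the right, y down and z forward. The function name, the tiny depth map and the intrinsics values are made up for illustration and are not taken from ARKit.

```python
def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Unproject a depth map (meters) into camera-space 3D points.

    depth: list of rows, each a list of depth values; 0 or None marks
           a pixel without a reading.
    fx, fy: focal lengths in pixels; cx, cy: principal point in pixels.
    Assumes a pinhole model (x right, y down, z forward).
    """
    points = []
    for v, row in enumerate(depth):        # v = pixel row
        for u, z in enumerate(row):        # u = pixel column
            if not z or z <= 0:
                continue                   # skip pixels with no valid depth
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Hypothetical 2x2 depth map and intrinsics, only to show the call.
depth = [[1.0, 0.0],
         [2.0, 2.0]]
points = depth_to_point_cloud(depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
```

Because the conversion is this cheap, the choice is mostly about which representation the network architecture consumes: the depth buffer keeps the regular image grid that convolutional models expect, while the point cloud drops invalid pixels and is independent of the intrinsics.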

Thanks for your feedback! I'll see what I can do with it. I have around 6 months at 8 hrs/day. I'll do as much as I can, and the tasks can be simplified if I can't get it all done in time. This is why I'm looking for tips and tricks, since I know there are a lot of smart people here who may have cool ideas or can offer some help :)
