Hi,
If I understand your question correctly, you are trying to animate your avatar's mouth based on captured audio signals. I recommend you have a look at ARKit face tracking instead.
This lets you track facial expressions in real time, which you can then apply to your 3D model.
When ARKit detects your face, it creates an ARFaceAnchor, which has a dictionary of blendShapes. The different blend shape coefficients correspond to facial features. For example, you could use the value of jawOpen to determine how far the mouth is open, and use this to animate the model.
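Here is a minimal sketch of what that could look like with an ARSCNView: every time ARKit updates the face anchor, it reads the jawOpen coefficient and drives a morph target on the avatar. The node name avatarHeadNode and the morph target name "mouthOpen" are placeholders for whatever your own model exposes.

import ARKit
import SceneKit

class FaceTrackingViewController: UIViewController, ARSCNViewDelegate {

    @IBOutlet var sceneView: ARSCNView!

    // Hypothetical node holding your avatar's head geometry; it is assumed to
    // have an SCNMorpher with a target named "mouthOpen".
    var avatarHeadNode: SCNNode?

    override func viewWillAppear(_ animated: Bool) {
        super.viewWillAppear(animated)
        // Face tracking requires a TrueDepth camera.
        guard ARFaceTrackingConfiguration.isSupported else { return }
        sceneView.delegate = self
        sceneView.session.run(ARFaceTrackingConfiguration())
    }

    // Called whenever ARKit updates the anchor for the tracked face.
    func renderer(_ renderer: SCNSceneRenderer,
                  didUpdate node: SCNNode,
                  for anchor: ARAnchor) {
        guard let faceAnchor = anchor as? ARFaceAnchor else { return }

        // blendShapes maps blend shape locations to coefficients in 0.0...1.0.
        let jawOpen = faceAnchor.blendShapes[.jawOpen]?.floatValue ?? 0.0

        // Drive the avatar's mouth morph target with the jawOpen value.
        avatarHeadNode?.morpher?.setWeight(CGFloat(jawOpen),
                                           forTargetNamed: "mouthOpen")
    }
}

If your model animates the mouth with a jaw bone instead of a morph target, you would rotate that joint by an amount proportional to jawOpen rather than setting a morpher weight.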
For a very basic example of how to animate a simple model based on blend shapes, check out this developer sample.