Question about speech recognition with Siri

Question

Hi Sir,

I'm currently doing some research on speech recognition, and I've already found some answers in Apple's Machine Learning Journal article "Hey Siri: An On-device DNN-powered Voice Trigger for Apple’s Personal Assistant".

I still have one more question, though. As the article mentions, Siri detects the default voice command "Hey Siri" locally on the device, using a small and a large DNN.
It also does some pre-processing on the device, such as DFT and MFCC.

My question is: when the user says other sentences, what data format does Siri upload to the cloud? The original voice data, or the spectral data produced by the pre-processing (e.g. DFT / MFCC)?

Thanks.
BR

Blake

Speech

Posted by

blake_ding

Answer 1

That 'pre-work', as you call it, is processing done on the device, for use on the device. Separately, encapsulated voice data is transmitted off the device.

See http://www.smartplanet.com/blog/smart-takes/say-command-how-voice-recognition-will-change-the-world/19895

Can you explain your question in the context of SiriKit and your app(s)? Thanks.

Posted by

KMT
