Deep learning for Siri’s Voice: On-device deep mixture density networks for hybrid unit selection synthesis

Wednesday, August 23, 2017 4:31 pm

“Starting in iOS 10 and continuing with new features in iOS 11, we base Siri voices on deep learning,” Siri Team writes for Apple’s Machine Learning Journal. “The resulting voices are more natural, smoother, and allow Siri’s personality to shine through.”

“Recently, deep learning has gained momentum in the field of speech technology, largely surpassing conventional techniques, such as hidden Markov models (HMMs). Parametric synthesis has benefited greatly from deep learning technology,” Siri Team writes. “Deep learning has also enabled a completely new approach for speech synthesis called direct waveform modeling (for example using WaveNet), which has the potential to provide both the high quality of unit selection synthesis and flexibility of parametric synthesis. However, given its extremely high computational cost, it is not yet feasible for a production system.”

“In order to provide the best possible quality for Siri’s voices across all platforms,” Siri Team writes, “Apple is now taking a step forward to utilize deep learning in an on-device hybrid unit selection system.”

Read more in the full article here.

MacDailyNews Take: The new US English Siri voice certainly does sound better than ever!

11 Comments

Bob

Wednesday, August 23, 2017 at 6:00 pm

Why don’t they just make the voice mimic the voice of the owner? That should certainly be creepy enough. Or, and there’s no way this can’t be coming, have the voice be that of your favorite Hollywood star or starlet. Copyrighted, of course.

1. LateRegistrant
  
  Wednesday, August 23, 2017 at 7:27 pm
  
  I’m just trying to imagine the voice of Carol Channing giving me turn-by-turn directions….
  
  1. listcatcher
    
    Thursday, August 24, 2017 at 3:12 am
    
    Or Fran Drescher…
    
    1. WriterGuy
      
      Thursday, August 24, 2017 at 6:12 pm
      
      For a while, Waze let you download celebrity voices for turn-by-turn directions, as part of paid promotions. My favorite was Morgan Freeman!
    2. WriterGuy
      
      Thursday, August 24, 2017 at 6:12 pm
      
      Or Gilbert Gottfried, Pee-wee Herman and Yoda!!

TheMightyFinder

Wednesday, August 23, 2017 at 6:15 pm

This is great and all, but for the last week or so, for lots of people, Siri has completely forgotten what the word ‘today’ means, presumably due to some sort of server-side bug. So it doesn’t understand what you mean when you ask it to “set a reminder today at…”, which is utterly pathetic.

So maybe Apple should just focus on getting the damn thing to work properly first, and then worry about the voice.

alanaudio

Wednesday, August 23, 2017 at 8:01 pm

Just tried it and she understands the concept of “today” perfectly and has set an alarm for me accordingly.

As it’s working for others, it doesn’t appear to be a server issue. Maybe you’re not speaking clearly enough?

Try it again and tell us if it works or not.

1. JWSC
  
  Wednesday, August 23, 2017 at 9:05 pm
  
  Do you mean maybe you’re holding it wrong?
  
2. TheMightyFinder
  
  Thursday, August 24, 2017 at 4:03 pm
  
  Nope. Happening to a lot of people: https://discussions.apple.com/thread/8040002?start=0&tstart=0
  
  As I said, server-side bug, been happening for the last week. Though, to be fair, it looks like they may now have fixed it and it’s slowly rolling out. Fingers crossed…
  
Des Gusting

Wednesday, August 23, 2017 at 8:20 pm

I HAD to read this just to get some idea of what the headline meant!

mxnt41

Thursday, August 24, 2017 at 9:13 am

They need to work on the consistency of Siri’s sense of humour. My Apple Watch can tell me a joke, but my phone doesn’t know any.
