“The US Patent and Trademark Office has granted Apple a new patent for a text-to-speech conversion process,” Electronista reports.
“Titled ‘Multi-unit approach to text-to-speech synthesis,’ the patent describes a way of matching units from an input string to an audio library. An important inclusion is metadata, such as articulation relationships, which can inform a processor how to make phrases sound more natural,” Electronista reports. “The software should also support a client-server architecture, allowing remote processing.”
Electronista reports, “The patent could indicate that Apple eventually wants to rely on its own text-to-speech technology, rather than use licensed code from Nuance.”
Read more in the full article here.
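The unit-and-metadata matching the article describes resembles classic unit-selection synthesis: break the input into units, then pick library clips whose metadata fits their neighbors. Here is a minimal, purely illustrative sketch of that idea — the library, the greedy selection, and the cost model are my assumptions for demonstration, not Apple's patented method:

```python
# Illustrative sketch of unit selection with articulation metadata.
# Everything here (unit names, the "articulation" field, the cost model)
# is a toy assumption, not the patent's actual design.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class AudioUnit:
    text: str          # the text unit this clip pronounces
    articulation: str  # metadata, e.g. how the clip is articulated

# A toy audio library: several recorded variants per unit.
LIBRARY = {
    "hel": [AudioUnit("hel", "open"), AudioUnit("hel", "closed")],
    "lo":  [AudioUnit("lo", "open")],
}

def join_cost(prev: Optional[AudioUnit], cand: AudioUnit) -> int:
    """Cheaper when adjacent units' articulation metadata agrees."""
    if prev is None:
        return 0
    return 0 if prev.articulation == cand.articulation else 1

def select_units(units: List[str]) -> List[AudioUnit]:
    """Greedily pick, for each unit, the candidate clip that joins
    most smoothly with the previously chosen one."""
    chosen: List[AudioUnit] = []
    for u in units:
        prev = chosen[-1] if chosen else None
        best = min(LIBRARY[u], key=lambda c: join_cost(prev, c))
        chosen.append(best)
    return chosen

seq = select_units(["hel", "lo"])
print([u.articulation for u in seq])
```

A real system would score thousands of candidates per unit with richer costs (pitch, duration, spectral match) and search the whole lattice rather than choosing greedily; the point here is only how metadata lets the selector prefer clips that flow naturally into one another.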
Does this mean Android can't do the same, like Siri or any other kind of voice assistant?
I thought Nuance did speech-to-text.
My thought too. The writer, trying to look informed, simply gets it wrong.
It could be both; Apple has been researching text-to-speech and speech-to-text for about 30 years. At a certain point their progress slowed, so they may well be using Nuance; that, however, in no way prevents Apple from continuing its own research.
Am I missing something? Isn’t text to speech different from speech to text? These comments seem to say that there isn’t any difference.
Do any of you know how to read?
Maybe; yes; which comments?; and I'm sure someone does.
Text-to-speech (voice synthesis) and speech-to-text (voice recognition) are different. Nuance has a voice synthesis product, but it's clearly not their mainstay. Nuance is famous for Dragon Dictation and, formerly, NaturallySpeaking – both speech recognition programs and very advanced ones, too. Apple has had their own voice synthesis since they first let the Mac out of the bag. Literally. They have made great advancements in natural voice, and even the step from Snow Leopard to Lion was significant. iOS likely uses Apple text-to-speech technology with Nuance's speech-to-text. Why rent what they already own? But this is conjecture. Siri, by the way, is not Nuance technology. So Siri is definitely made of components from different sources.
Hey! What do you know? We write, too.
Dragon was the first speech recognition that really worked. Drs. James and Janet Baker did some amazing work.
The first version was released for DOS in the early 80s and I remember a friend calling their support because it couldn’t keep up unless he paused between words. They told him to get a faster computer lol.
The software literally changed the court reporting industry in the late 90s.
Nuance would be the one you’d want in your product.
Smart choice, Apple.
Robert, I think you ARE missing something. Comments from AAPLsaur and me were referencing this line from the article:
“The patent could indicate that Apple eventually wants to rely on its own text-to-speech technology, rather than use licensed code from Nuance.”
What came to mind for both AAPLsaur and myself is that Nuance is not known for, and to the best of my knowledge does not license, text-to-speech technology, as the author states. They develop speech-to-text technology. So… the author either doesn't proofread his copy, or doesn't know what he is talking about. Neither my comment nor AAPLsaur's comes anywhere close to saying that there is no difference between the two technologies.
I expect Apple will just buy Nuance as soon as it makes sense.
Patents are useless; they couldn't even protect the iPhone's multi-touch from being ripped off by Android. What makes you think this one is any different?
No, not useless; there was prior art (and lots of it):
http://www.billbuxton.com/multitouchOverview.html
In this instance I would hazard a guess that the multi-touch patent could well have been a maneuver to slow down competitors. I was surprised when they were given a patent for it.
As has happened before, it seems Apple re-invents existing technology in a simple, powerful, and useful way.
It just covers one form, or one better way, of doing text-to-speech; it doesn't mean that Apple owns all of text-to-speech. For example, Apple uses in-plane switching to enhance their monitors. That doesn't mean the patent holder has a patent on all monitors.
I mentioned to Apple, and I'll write here, that I think teaching Siri should be a feature. The more "alive" the AI is, the more compassion it elicits. Think about your instinct to help a child… When Siri emphasizes something wrong, I'll want to say, "Siri, say it like this:" and then repeat it with the right emphasis. If it learns it, so much the better for everyone. In fact, record the location data at the server along with each correction, and soon you have excellent dialect info.