Apple wins new patent on text-to-speech conversion

“The US Patent and Trademark Office has granted Apple a new patent for a text-to-speech conversion process,” Electronista reports.

“Titled Multi-unit approach to text-to-speech synthesis, the patent describes a way of matching units from an input string to an audio library. An important inclusion is metadata, such as articulation relationships, which can inform a processor how to make phrases sound more natural,” Electronista reports. “The software should also support a client-server architecture, allowing remote processing.”
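As a rough illustration of what "matching units from an input string to an audio library" could look like, here is a minimal Python sketch. It is not taken from the patent: the greedy longest-match strategy, the `Unit` class, the `match_units` function, the file names, and the metadata fields are all hypothetical, chosen only to show the general idea of preferring longer recorded units and carrying metadata alongside each one.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    text: str                  # the words this recorded unit covers
    audio_file: str            # hypothetical clip name; empty if nothing is recorded
    metadata: dict = field(default_factory=dict)  # made-up articulation hints


def match_units(text, library):
    """Cover the input with the longest units found in the library (greedy)."""
    words = text.lower().split()
    matched = []
    i = 0
    while i < len(words):
        # Try the longest remaining phrase first, then shrink it word by word.
        for j in range(len(words), i, -1):
            key = " ".join(words[i:j])
            if key in library:
                matched.append(library[key])
                i = j
                break
        else:
            # No recorded unit covers this word; flag it so a caller can fall back.
            matched.append(Unit(words[i], "", {"missing": True}))
            i += 1
    return matched


# A tiny, invented audio library keyed by the text each clip covers.
LIBRARY = {
    "good morning": Unit("good morning", "good_morning.wav", {"ends_with": "nasal"}),
    "good": Unit("good", "good.wav", {"ends_with": "stop"}),
    "morning": Unit("morning", "morning.wav", {"ends_with": "nasal"}),
    "apple": Unit("apple", "apple.wav", {"starts_with": "vowel"}),
}

units = match_units("Good morning Apple", LIBRARY)
print([u.text for u in units])
```

In this sketch the phrase "good morning" is matched as one unit rather than two separate words, which is the kind of choice that could help a phrase sound more natural. A real system would presumably use the metadata to pick between competing units rather than just carry it along.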

Electronista reports, “The patent could indicate that Apple eventually wants to rely on its own text-to-speech technology, rather than use licensed code from Nuance.”

Read more in the full article here.

13 Comments

      1. This could be both; Apple has been researching text-to-speech and speech-to-text for about 30 years. At a certain point their progress was slow, so they may well use Nuance; this, however, does not in any way prevent Apple from continuing its own research.

      1. Maybe, yes, which comments and I’m sure someone does.

        Text-to-speech (voice synthesis) and speech-to-text (voice recognition) are different. Nuance has a voice synthesis product, but it’s clearly not their mainstay. Nuance is famous for Dragon Dictation and, formerly, NaturallySpeaking – both speech recognition programs and very advanced ones, too. Apple has had their own voice synthesis since they first let the Mac out of the bag. Literally. They have made great advancements in natural voice and even the step from Snow Leopard to Lion was significant. iOS likely uses Apple text-to-speech technology with Nuance’s speech-to-text. Why rent what they already own? But this is conjecture. Siri, by the way, is not Nuance technology. So Siri is definitely made of components from different sources.

        Hey! What do you know? We write, too.

        1. Dragon was the first speech recognition that really worked. Dr. James and Janet Baker did some amazing work.

          The first version was released for DOS in the early 80s and I remember a friend calling their support because it couldn’t keep up unless he paused between words. They told him to get a faster computer lol.

          The software literally changed the court reporting industry in the late 90s.

          Nuance would be the one you’d want in your product.

          Smart choice, Apple.

      2. Robert, I think you ARE missing something. Comments from AAPLsaur and me were referencing this line from the article:
        “The patent could indicate that Apple eventually wants to rely on its own text-to-speech technology, rather than use licensed code from Nuance.”

        What came to mind for both AAPLsaur and myself is that Nuance is not known for, and to the best of my knowledge does not license, text-to-speech technology, as stated by the author. They develop speech-to-text technology. So… the author either doesn’t proofread his copy, or doesn’t know what he is talking about. Neither my comment nor AAPLsaur’s comes anywhere close to saying that there is no difference between the two technologies.

  1. It just covers one form, or one better way, of doing text-to-speech; it doesn’t mean that Apple owns all of text-to-speech. For example, Apple uses in-plane switching to enhance their monitors. That doesn’t mean that the patent holder has a patent on all monitors.

  2. I mentioned to Apple, and I’ll write here, that I think teaching Siri should be a feature. The more “alive” the AI is, the more compassion it elicits. Think about your instinct to help a child… When Siri emphasizes something wrong, I’ll want to say, “Siri, say it like this:” and then repeat it with the right emphasis. If it learns it, so much the better for everyone. In fact, store location data at the server along with each correction and soon you have excellent dialect info.
