Inside OS X 10.8 Mountain Lion GM: Dictation and speech

“In Mountain Lion, Macs are getting system-wide speech recognition, the same ‘Dictation’ feature Apple gave the new iPad at the beginning of the year. While it works well, it does require a network connection,” Daniel Eran Dilger reports for AppleInsider.

“Apple’s cloud-based Dictation feature, currently supported on the new iPad and as part of the broader Siri voice assistant feature of iPhone 4S, converts speech to text virtually anywhere,” Dilger reports. “It works by sending audio recordings of captured speech to Apple’s servers, which respond with plain text. While it doesn’t go as far as the more intelligent Siri, Dictation does intelligently cross reference the names and assigned nicknames of your contacts in order to better understand what you are saying.”


Dilger reports, “Similar to Siri or Dictation on the new iPad, Dictation on Macs running OS X Mountain Lion pops up a simple mic icon when activated, which listens until you click or type the key to finish. Just as with Siri or dictation on the new iPad, Dictation under Mountain Lion is quite fast and highly accurate, but does require a network connection to function. If you don’t have a network connection, the Dictation input icon will simply shake, indicating that it is not available.”

Much more, including screenshots, in the full article here.
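Apple has never published the protocol behind Dictation, but the flow Dilger describes above (capture audio, send it to Apple’s servers, get plain text back) is a straightforward cloud round trip, which is exactly why a network connection is mandatory. Purely as an illustration, and with an invented endpoint URL and JSON field rather than anything of Apple’s, a client-side sketch of that kind of round trip in Swift might look like this:

```swift
import Foundation

// Hypothetical illustration only: Apple has not published the protocol its
// Dictation feature uses. This sketch just shows the general shape of a
// cloud round trip -- send captured audio, get plain text back -- which is
// why the feature needs a network connection. The endpoint and JSON field
// below are placeholders invented for the example.
struct TranscriptionResponse: Decodable {
    let text: String
}

func transcribe(audio: Data,
                completion: @escaping (Result<String, Error>) -> Void) {
    // Placeholder URL; not a real Apple service.
    var request = URLRequest(url: URL(string: "https://speech.example.com/recognize")!)
    request.httpMethod = "POST"
    request.setValue("audio/wav", forHTTPHeaderField: "Content-Type")
    request.httpBody = audio

    URLSession.shared.dataTask(with: request) { data, _, error in
        if let error = error {
            // No network (or server unreachable): nothing can be transcribed,
            // which matches the "icon shakes" behavior described above.
            completion(.failure(error))
            return
        }
        do {
            let result = try JSONDecoder().decode(TranscriptionResponse.self,
                                                  from: data ?? Data())
            completion(.success(result.text))
        } catch {
            completion(.failure(error))
        }
    }.resume()
}
```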

MacDailyNews Take: We’ve dictated commands to our Macs since the early 1990s (PlainTalk, Speakable Items). We didn’t need a network connection then. Why do we need it in 2012? Dragon Dictate for Mac and MacSpeech Scribe don’t require network connections, either – and they work wonderfully! We understand how Siri works, but mere dictation isn’t Siri. Why can’t Mountain Lion’s dictation feature work locally and free users from network dependency?

The network requirement seems to be an artificial limitation, part of Apple’s deal with Nuance meant to preserve Nuance’s revenue stream from standalone dictation software, rather than a genuine technical necessity. If so, bad form, Apple!

21 Comments

  1. It would require that the resources and libraries used in dictation be stored locally. Since this is Nuance technology, it is unlikely that the OS has the necessary licensing.

  2. Dictation programs that reside on the device usually have a much more limited spectrum of sound samples to work with and cannot improve as fast as a centralized system that can correlate sounds/phonemes sampled from thousands (or millions) of effective transcriptions. Over time, the network-based approach can provide better results without any “training” of the device and can accommodate wide variations in dialect, tone of voice, and accent, not to mention larger vocabularies and more idiomatic intelligence.
    I believe government security agencies were using such centralized voice-recognition technologies to monitor large numbers of phone conversations long before the shrunk-down (device-based) versions came to market.

    1. Sitting in front of a fully functional Mac with a keyboard in front of you, how many people are going to use dictation as a primary input for text? Of those, how many are going to use it on a regular, as opposed to novelty, basis? 1%? A tiny sliver of Mac owners.

      It would be more efficient to include the software as a downloadable option for OS X and then train it by saying a few key words into the microphone. Not only would the transcription process be a lot faster, it wouldn’t be dependent on a reliable Internet connection during speech-to-text.

      1. 1% of all Mac owners is still a lot of people – and, of course, you don’t have to use it. But most people I know are very poor typists. I think a lot may prefer good voice dictation, if it’s just there, ready to go.
        I think, actually, as Siri and voice dictation get better and better, they will both be used more and more.

      2. An automatic transcript of a teleconference that is able to recognise accents of non-native English speaking participants will be an excellent use of the technology.

    2. Altos is correct. The back-end system that does this is based on quantum neural network technology developed in the 1990s, which has gradually worked its way into non-spy use. This technology is NOT computer-based and, instead, is an independent physical system of anyons interacting within the 2DEG environment of HEMTs. This system has been transcribing phone calls to text, suitable for data mining analysis, for over a decade. This function is part of ECHELON, and is operated by the Five Eyes (AUS CAN NZ UK US). The same AI also handles a bunch of other tasks, such as Face Recognition, Cognitive Signature Recognition and Emulation, et cetera. Apologies for the use of so much jargon, but no simpler terms to describe this stuff exist yet. Look it up in Wikipedia if you are interested, or seek out a Post Quantum Historical Retrospective.

  3. If so, Apple needs to take some of the mountain of cash at its disposal and buy Nuance out. And while they’re at it, have it discontinue support for Windows and Android. (heh-heh)

  4. I use Nuance Dragon Naturally Speaking at work on the PC… 98% accuracy with no training. Bought it on sale for $49.00. I’d love to have it on my Mac w/o a network connection, but I’ll take this for now. Would I use it? Pretty much constantly, in combination with my mouse, trackpad, etc.
    Perhaps Apple has its reasons for not trying to acquire Nuance outright, or perhaps Nuance won’t sell; they seem to be in a pretty good position these days.

  5. “Dictation under Mountain Lion is quite fast and highly accurate”
    I hope so, because it sure isn’t on my new iPad.

    “Why can’t Mountain Lion’s dictation feature work locally and free users from network dependency?”
    Agreed, 100%!!

    1. Dictation on my iPad is incredibly accurate and fast. I’m comparing it to my iPhone 4S. Funny how that works, but it definitely works better than my 4S.
      And I agree I wish it would work without Internet connectivity.

  6. I’ve been using dictation software on the Mac since ViaVoice for Mac was released. Moved to iListen a few years later, then to Dictate. With very little training, Dictate is easily 98% accurate with most applications. The only application I experienced issues with initially was FileMaker Pro. Those problems have since been remedied through experimentation with the command set or a solution found on the Nuance forum.

    As a long-time user of dictation, I was pleased to hear Apple was incorporating system-wide dictation in ML. For me, it’s really the only feature I considered worthwhile. Having learned that Dictation requires an internet connection to function, I’ll be staying with Dictate.

    1. I agree with you, Mark. I followed the same progression of voice apps as you did. In the past, I had also configured Speakable Items to do remarkable things (though not handle much dictation). I, too, don’t understand why network connectivity is required now. Even if the accuracy might be slightly reduced if handled locally, I believe that is what we should have. Agreeing with MDN on this, too.
