Apple applies for audio user interface patent; voice navigation for your iPod

“Another day, another Apple patent filing. This time, Apple calls dibs on an ‘audio user interface for computing devices.’ Nothing new, right? After all, we’ve had audio-assisted navigation for years. The system described, however, uses the relative power of a host system to auto-generate audio tags from text strings which can then be played by a hand-held device such as an ‘MP3 player, mobile phone, or PDA,’” Thomas Ricker reports for Engadget. “And since this could likely be a software-only enhancement to existing Apple ‘ware, implementing it could be just a free, point-release upgrade away… hear that, display-less Shuffle owners?”

Full article here.

“The tags are generated from the metadata of your music or video files – the text information like author, song name and duration, film director or lead actor name, etc. – which comes with a song or video that you download from the iTunes Store. This text information is then converted using text-to-speech software into small audio files – audio navigation tags. The iPod’s computing capabilities are too small to do a good text-to-speech translation, so this operation is carried out on your Mac or PC,” Unwired View’s “stace” explains. “Audio navigation tags are then attached to the songs or videos themselves and transferred to your iPod. Once there, audio navigation tags can be stored with the songs themselves or transferred to a separate database and then synchronized with the navigation menu. The same audio information can be generated for user playlists and every song in the playlist.”
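
The host-side step described above can be sketched roughly as follows. This is a minimal illustration, not Apple's implementation: the `Track` fields, `build_audio_tags`, and `fake_synthesize` are hypothetical names, and the fake synthesizer merely stands in for real text-to-speech software running on the Mac or PC.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class Track:
    # Hypothetical metadata fields; a real library carries many more.
    title: str
    artist: str
    genre: str


def build_audio_tags(
    tracks: Iterable[Track],
    synthesize: Callable[[str], bytes],
) -> Dict[str, bytes]:
    """Map each distinct metadata string to one synthesized audio clip."""
    tags: Dict[str, bytes] = {}
    for track in tracks:
        for text in (track.title, track.artist, track.genre):
            # Synthesize each string once, even if many tracks share it.
            if text and text not in tags:
                tags[text] = synthesize(text)
    return tags


def fake_synthesize(text: str) -> bytes:
    # Placeholder for a text-to-speech engine (an assumption for this sketch).
    return ("AUDIO:" + text).encode("utf-8")


library = [
    Track("In the air tonight", "Phil Collins", "Pop"),
    Track("Another Song", "Phil Collins", "Pop"),  # made-up second track
]
audio_tags = build_audio_tags(library, fake_synthesize)
print(sorted(audio_tags))
```

Keying the tags by the text string rather than by track means a shared artist or genre name is synthesized only once, however many songs carry it.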

“So, for example, when an iPod user wants to play ‘In the air tonight’ by Phil Collins, he navigates to the song using his preferred path: ‘Menu’ -> ‘Music’ -> ‘Pop’ -> ‘Phil Collins’ -> ‘In the air tonight’ -> ‘Play’. However, instead of having to look at the screen, all along the way he hears audio prompts advising him on the next navigation steps,” “stace” explains. “The clever thing with this particular Apple patent seems to be the way they solved the dynamic synchronization problem for constantly changing metafiles of huge music collections stored on the iPod. By dynamically generating audio tags in iTunes on your computer when a song is downloaded or a playlist is created and then synchronizing them with the iPod, Apple is able to make audio navigation as easy to use and update as the current text navigation system. And since no hardware and navigation control changes are necessary, audio navigation can be added to current iPods through a simple software update.”
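
Two ideas in this passage lend themselves to a short sketch: looking up a spoken prompt for each step of a menu path, and working out which audio tags need to be copied or deleted at sync time. Everything named here (`prompts_for_path`, `plan_sync`, the `.tag` file names, the "Old Playlist" entry) is hypothetical; the excerpt does not describe the patent application's mechanism in enough detail to reproduce it.

```python
from typing import Dict, List, Set, Tuple


def prompts_for_path(path: List[str], tags: Dict[str, str]) -> List[str]:
    """Return the audio tag to play at each menu step, skipping untagged steps."""
    return [tags[step] for step in path if step in tags]


def plan_sync(host: Set[str], device: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Compare tag sets and return (tags to copy to the device, tags to delete)."""
    return host - device, device - host


# Hypothetical tag files keyed by the menu text they pronounce.
tags = {
    "Music": "music.tag",
    "Pop": "pop.tag",
    "Phil Collins": "phil_collins.tag",
    "In the air tonight": "in_the_air_tonight.tag",
}
path = ["Menu", "Music", "Pop", "Phil Collins", "In the air tonight", "Play"]
print(prompts_for_path(path, tags))

# The device still holds a tag for a playlist that no longer exists on the host.
to_copy, to_delete = plan_sync(set(tags), {"Music", "Pop", "Old Playlist"})
print(sorted(to_copy), sorted(to_delete))
```

Because the comparison is a plain set difference, only new or stale tags move at sync time, which is one plausible reading of how audio navigation could stay as easy to update as the text menus.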

More in the full article, including patent application artwork, here.

16 Comments

  1. They won’t add it to old models–an iPod does what it does when you buy it and there’s no promise of anything changing–but maybe this will appear in future models. Unless it’s totally annoying when you really try to make it work.

  2. It’s about time … I sent this EXACT idea (details and all) to Apple at least 2 years ago to help folks like my visually-impaired father navigate his iPod.

    way to go Apple!

    P.S. Note that you’re not telling the iPod what to find … it’s telling YOU where you are.

  3. AG Pennypacker:

    I too was of the view voice recognition was impractical – that is until I bought the Motorola Razor phone. I have 300+ contacts and the thing works very well – not perfect all the time, but 98%. If Moto can get 98%, I’m positive Apple can get 100+

  4. Actually, it was 3 years ago …

    —————–

    From: don
    Subject: Idea for making iPod accessible for the blind…
    Date: May 16, 2003 3:19:18 PM EDT
    To: ipoddev@apple.com

    Hello,

    My father is losing his eyesight due to an irreversible neurological disease. In November we bought him an iPod and have been loading it up with a variety of things, including his music collection, the Bible in MP3, as well as scanning and converting several of his favorite books to audio using Jaguar’s built-in speech synthesis.

    First of all … thank you! The iPod has restored a level of independence to him that he was rapidly losing, due to not being able to read and constantly needing someone to read to him.

    Now here’s my idea: There was a discussion on MacInTouch a while back about the usefulness of an iPod for someone who is completely blind. Now, my father can still see well enough to (mostly) navigate the menus, and I can see that one could memorize playlists and “count clicks” to navigate the iPod, but with dynamic playlists and the adding of new material, it could quickly become impossible to keep up with. So … how about full speech navigational feedback!

    Here’s one idea of how it could work … I envision an iPod setting for “Voice Menus” or something like that with “On” and “Off” options. Now, iTunes already knows the playlists, artists, albums, track titles, etc., and Jaguar has the ability to synthesize speech. So, instead of making the iPod do the “reading” of the selected menu item, iTunes could make tiny MP3 (or AAC) files, using the user’s preferred voice, for each distinct data string in the library, saving them as “actual_album_name.mp3” and “actual_artist_name.mp3”. Perhaps the iPod software could come “preloaded” with the basics needed for navigating the main menus, such as “extras.mp3”, “settings.mp3”, “clock.mp3”, and so on.

    Then, for example, with the “Voice Menus” feature turned on, if I navigated to “Playlists” on the iPod menu, I would hear “Playlists” in the headphones. Also, when a menu item (playlist, track, album, artist, etc.) is “hovered over” while navigating the iPod, it could play the tiny “hidden” MP3 file (by just playing the file with the same name as the text to be read: i.e. “This_Album_Name.mp3” from an internal “voice_navigation” directory).

    Now, the first part of this (generating the tiny audio files) could be done easily enough through an “add-on” AppleScript. But the second part, having the iPod look for a file called “voice_nav/playlists.mp3” to read the word “Playlists” or “voice_nav/Sheryl Crow.mp3” to “say” “Sheryl Crow” requires some extra functionality in the iPod software itself.

    I believe this would be of GREAT benefit not only to visually-impaired users (and therefore be a “feather in your cap” for Federal disabilities compliance), but also for anyone who likes to navigate the iPod without access to the screen.

    I would be more than happy to help with this development in any way I can, and DO NOT want any compensation for it, and WILL NOT claim any “intellectual property” on the idea. (But of course, I wouldn’t turn down an iPod for “testing” purposes … just kidding : )

    thanks for your great work on this wonderful tool,

    Don

  5. Apple (and the Mac in front of you) is highly advanced in terms of speech recognition, it is just a shame that it is underutilized. My physically challenged friends go nuts over these hidden advancements.

    Want an example?

    Turn on your Mac’s speech recognition.
    Make an alias of your favorite program. For this example, we’ll use “Backup”.
    Rename this alias by prefixing it with “Open” and deleting the word “alias”, such as “Open Backup”.
    Place this alias into your “~/Library/Speech/Speakable Items” folder.
    That’s it. You’re done.

    Now, whenever you say “Open Backup”, your Mac automatically launches the application. In reality, you can name it anything you want such as “Fart Smeller” and every time you say “fart smeller” the app will launch.

    For some of you with a lukewarm IQ, this technology may not amaze you, but the complex algorithm that is required to convert a string of text (English’s complex rules no less) into a pattern and then match that pattern at an amazingly fast speed to the pattern of an audio input is really incredible.

    For quicker analysis, Mac’s speech recognition probably creates and stores the file name’s predicted audio pattern as soon as you drop that file into the Speakable Items folder.

    Still unimpressed? What separates the Mac from most other speech recognition systems is that you do NOT have to teach it your speaking patterns! Apple recommends that before you use it for the first time, you give it a little help with your voice by asking it the two phrases “What time is it?” and “What day is it?”

    If you plan on using this feature (very Star Trekish), be sure to make your alias names long and unique. This helps ensure a higher probability of a match the first time you speak the phrase.

    The system is also incredible in terms of filtering out background noise and not responding to someone you are in iChat with. Just adjust your Sound System Preference Input level so your normal voice makes the bar just to the mid-point, sit at your normal distance from the microphone, and simply speak clearly. Speaking loud actually makes it worse.

    For all of you with those lukewarm IQs that are still with me, imagine adding this to Apple’s announcement of working on a monitor with a built-in camera between the screen’s pixels. You get an idea of Apple’s VERY Star-Trek-like direction.

    Apple wasn’t running towards the two-button mouse because they are working on eliminating the mouse altogether.

  6. This patent seems to fit the theory that an iPhone is coming up… It would seem logical to me that voice-navigation would be a useful feature with a phone/mp3 player device. I’m less convinced that it would be meaningful in the iPod as a standard feature (perhaps as an option for the visually impaired?).

  7. A “point release”??? Maybe for those with some sort of microphone already attached to their iPod. My iTalk gets taken out of the drawer a couple of times a year. Sometimes I get what I’m after, sometimes my iPod locks up until it drains the battery. And the device is nearly as big as a shuffle!

    No. This means new hardware as well as new software. And, if you’re going to build in a microphone … ?

  8. Creative Product manager :

    “Apple is patenting voice UI for iPod, quick! Start a project to have that in our MP3 players. Must beat Apple to market with this feature. Then we’ll say they copied and they’re not innovators.”

    Engineering department :

    Rushes a half-baked voice feature to the next Creative MP3 player. It’s useless and unusable but it’s ready before Apple puts out anything on an iPod.

    3 months later :

    Creative fails to gain market share with new voice activated MP3 player, sees losses go further down the death spiral and fades out of existence under yet another boneheaded idea triggered not by customer demand but by a “me too” product strategy.

    1 year later :

    Apple still has not put the voice feature in an iPod and, in fact, had no intention of doing so in the first place. Or, it does put in a voice related feature but it has nothing to do with what Creative (and other short sighted pundits) assumed it would be and everything to do with an actual usability feature that people can intuitively understand and use.

  9. DON
    Very good idea indeed and it seems that they have taken it. I do remember your letter. Now you have the pleasure of having contributed to the advancement of user interface, although none of the financial rewards. Oh well, at least you have the glory.

  10. May the whole world take note of Don’s letter to Apple. How wonderful to see that someone is prepared to CONTRIBUTE something without wanting anything in return for it. Compare this attitude to all those people nefariously and greedily suing others who actually do the creating.

    My hat off to you Don or ‘DJ’. You restore my faith in humanity.

    PS nice piece Mike B. You almost tempted me to turn speech recognition on..
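
The file-naming scheme proposed in Don’s letter (comment 4) can be sketched in a few lines. This is only an illustration under stated assumptions: `tag_path` and `find_prompt` are hypothetical helpers, the slash replacement and the case-insensitive match are choices made for this sketch rather than anything the letter specifies, and real iPod firmware would look files up on its own storage rather than in a Python list.

```python
from typing import Iterable, Optional

VOICE_DIR = "voice_nav"  # directory name taken from Don's letter


def tag_path(menu_text: str) -> str:
    """Build the file path for a menu string, e.g. 'voice_nav/Sheryl Crow.mp3'."""
    # Assumption: '/' cannot appear in a file name, so swap it for '_'.
    safe = menu_text.replace("/", "_")
    return f"{VOICE_DIR}/{safe}.mp3"


def find_prompt(menu_text: str, files: Iterable[str]) -> Optional[str]:
    """Return the stored file matching the hovered menu text, or None."""
    wanted = tag_path(menu_text).lower()
    for name in files:
        # Case-insensitive match, since the letter writes both
        # 'playlists.mp3' and 'Sheryl Crow.mp3'.
        if name.lower() == wanted:
            return name
    return None


on_device = ["voice_nav/playlists.mp3", "voice_nav/Sheryl Crow.mp3"]
print(find_prompt("Playlists", on_device))
print(find_prompt("Sheryl Crow", on_device))
print(find_prompt("Unknown Artist", on_device))
```

Returning `None` for a missing file lets the player fall back to silent, text-only navigation for items that have not been synced yet.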
