Siri: evolutionary, not revolutionary

Apple’s Siri has given the world of mobile voice-recognition software a much-needed lift, as my colleague Ryan Kim pointed out earlier this week. But voice-recognition software still isn’t the easiest stuff to use, and there are many instances when one-finger navigation is simply a better option. Unlike touchscreens — which have become ubiquitous, thanks largely to the original iPhone — Siri and other voice-recognition software aren’t going to revolutionize how most of us interact with our phones anytime soon.

Don’t get me wrong: I’ve long believed that there’s an untapped market for voice-recognition technology in mobile, where navigating a phone with fingers can still be a hassle. Players like Nuance and Vlingo have quietly gained traction with third-party apps for smartphone users, and there’s a lot to like about Siri, as the recent flood of glowing reviews demonstrates. For Siri to change how we use our phones, though, a few roadblocks must still be addressed:

  • There’s a learning curve. As Kent German notes in this CNET post, users must learn how to talk to Siri to make it truly useful. Queries must be spoken slowly and clearly, and they often must be phrased very specifically. That’s to be expected, of course — we are talking about computer software, after all — but it will slow uptake, especially among users who aren’t tech-savvy.
  • It’s still far from perfect. Modern voice-recognition offerings are far more accurate than they were just a few years ago, but Ed Wrenbeck, a former lead developer at Siri, concedes that users “can expect a 60–70 percent return on average” when using the technology. And while its performance can improve as users learn how to talk to it, Siri doesn’t truly offer the kind of artificial intelligence that would help it understand what users are looking for. It can leverage powerful tools like context and semantics to deliver personalized results, but as Wrenbeck said, “Real AI can’t fit on a phone in our world . . . yet.” Indeed, “real AI” may come to handsets years from now, but until then Siri and other technologies will evolve gradually through incremental upgrades.
  • The use cases are limited. Voice recognition is ideal for some mobile scenarios, like using a navigation app behind the wheel or accessing contact information quickly (rather than scrolling through lists of names). But it’s a poor substitute for touchscreens in a business meeting, say, or on the subway. Voice will eventually become one important option for operating the phone, but it won’t be the revolutionary feature some are claiming it to be.
  • Third-party apps aren’t integrated. Siri is integrated with core iOS functions like email, messaging, calendar and the web, but it’s uncertain whether Apple plans to release an API that would allow third-party developers to tap into Siri’s power. Just as the real power of the iPhone lies in its vast library of third-party applications, Siri’s true potential lies in integrating with those apps. But as John Gruber pointed out last week, third-party integration invites problems of its own: When a user searches for a friend, for instance, should Siri search the iPhone’s contact book or Facebook? It’s not clear whether Apple plans to offer such integration or how it would address the problems that would inevitably arise.

Voice recognition is a powerful tool for specific use cases: Developers of search, navigation and messaging applications, for instance, should make the technology a top priority. And the technical wrinkles will gradually be ironed out as Apple and others develop their software. But while touchscreens have changed the way most of us use our phones every day, voice recognition will mainly be used in specific scenarios and for specific functions.

Question of the week

How much impact will Siri have on how we use our phones?
Relevant Analyst
Colin Gibbs

Founder and Principal Peak Mobile Insights

7 Comments
  1. Good points, Derek, and Apple’s incredible marketing savvy (and budget) will also help Siri’s prospects. One of the reasons the iPhone changed the game was Apple’s amazing TV commercials, which illustrated just what the iPhone could do — how it was fun, valuable and easy to use. Commercials for Siri are doing the same thing.

    But I still think that too many challenges (and too few use cases) exist to make voice a primary method for most consumers to use their phones. You’re right that navigation is an ideal scenario for voice recognition technology, and getting last night’s Yankees score is easy enough. But what if I want to dig just a little deeper? What if I want to see a box score, find out how Sabathia pitched, etc.? Which app or online source will Siri access, and how do I get it to use the app or source I want? All those things are probably easier if I can use my fingers to open an app or the browser.

    Don’t get me wrong; I think voice recognition will be important. I just think its impact won’t approach that of touchscreens when it comes to how we use our phones.

  2. I agree that Siri isn’t perfect, but there are two reasons it might help voice recognition cross the chasm into the mainstream:

    1) Trust. People trust Apple, so when Apple launches a feature, it gets tried far more than any other manufacturer’s features. People will try Siri out, and they will see others using it. That starts a cultural shift. Is it a tipping point? Close call. I say yes.

    2) Quality. Is Siri perfect? No, but I think its recognition rate and functionality are good enough to keep the people who try it using it. This means they get past the learning curve.

    When regular people make it past the learning curve and actually start using voice, there will be greater investment and faster innovation. For example, we’ll see more phones with noise-cancelling dual mics. Users of Android will soon realize that Google Voice Actions offer similar benefits. Companies like Samsung and HTC will actually advertise that feature.

    Do I think it’s revolutionary? Yes. I’ve been using a smartphone since 2001 – but the UX just wasn’t easy. It wasn’t until touch and iOS that the average person found the experience practical. I’ve also been using voice recognition since 1997, but the UX just hasn’t been easy. Siri on the iPhone 4S may be the step forward that tips the scale.

    The impact is huge. Smartphones have brought people closer to information they value. As an example, on iPhone, sports scores are just a power button, a screen swipe, an app click, a league click and a team selection away – maybe 20 seconds! Awesome, right? Or is it? 20 seconds is a long way between you and your information. There must be plenty of times when you think, “I’m just not going to bother.” What if you could, instead, press a button and say “How did the Yankees do?” and get the score? If that takes just 6 seconds and no swipes, types, or clicks, that puts people significantly closer to a world of information. For an even more complicated search that involves typing something, say “Thomas Jefferson,” into a Wikipedia app, think of how much faster voice can deliver results – no app, no typing.

    Voice interactions will reduce the distance between average people and massive amounts of information. That’s awesome. That is revolutionary.

    Think of this: Five years ago, the hot tech item was a Sat Nav GPS device you could put on your dash. Many of you shelled out $300 to put thief bait on display so that, when you wanted to get somewhere, you could lean way forward and touch-type the city, street, and number of your destination into a crappy screen (while reading your PDA or Post-it in the other hand). Then the device would work its magic. And THAT was worth it. Now compare that to speaking “Navigate to Norwest Venture Partners, Palo Alto” to your smartphone. The friction between you and quality turn-by-turn navigation, with real-time traffic factored in, is now less than 10 seconds. Average Joes were willing to pay $300 and type into a bumpy GPS to get that kind of valuable info, but now it’s seven spoken words away. That’s a whole lotta value for a little effort.

    It happened with Gutenberg, libraries, the WWW, smartphones and apps, and it will happen again. Reducing the distance between people and information is powerful, powerful stuff.
