Cisco Blogs

Sweet Talk: Cisco and Speech Recognition

Note: Part two, continued from yesterday’s post.

Cisco believes that in order for our customers to embrace and fully leverage speech recognition, the technology must offer both solution intelligence and a simple and natural user interface.

Solution Intelligence

We do not speak as clearly and consistently as we think we do, making solution intelligence a necessary part of a successful speaker interaction experience. Our speech is filled with pauses, repetitions, partial words, and slips of the tongue, complicating the speech recognition process. Researchers have been working for years to improve the algorithms and language models that are used to create increasingly intelligent speech engines, and the results are encouraging. Cisco is actively building upon the recent advances in solution intelligence to design speech recognition solutions that can understand and interpret our everyday words and speech patterns.

A Simple And Natural User Interface

How will Cisco “train” people to interact with speech engines? The key is the user interface, which provides the dialogue you hear (“Who would you like to reach?”) and the manner in which you interact with the solution to determine what action you really intend (“Did you mean Jim Smythe?”). Cisco is developing speech solutions that use straightforward, natural questions to elicit clarity and intent from end users. There is also intelligence built into the interface, not just the speech engine, so that callers can learn over a short period of time how to interact with the speech recognition system. Conversely, the speech interface has the intelligence to alter prompts in order to slow down and provide more guidance to the user, or speed up the interaction process for more advanced users. By making it clear what we’re supposed to do and say, a well-designed user interface allows us to navigate a speech recognition solution without the safety net of a human backup on the other end of the connection.
A key part of Cisco’s strategy is to recognize what speech technology can realistically deliver to customers and therefore avoid many of the mistakes made by other vendors who tried to do too much with speech technology.

Speech Recognition In Action Today

Cisco’s most recent entry into the world of speech recognition is Speech Connect for Cisco Unity, which is a speech-enabled auto attendant feature of Cisco’s Unity unified messaging solution. This feature allows both internal and external callers, using only their voice, to be quickly connected to any employee in the company directory. The caller is prompted with “Who would you like to reach?” and responds by speaking a name. Speech Connect works because it has a very simple user interface built on top of the speech engine. Adam Goldberg, Cisco Product Sales Specialist, says, “Speech Connect really lays the foundation for ‘speech as a network service’. As we extend that service across all of Cisco UC, our customers will be the true beneficiaries of our ubiquitous approach.” Try it the next time you call a colleague at Cisco by dialing 408-894-3500.

In the contact center market, Cisco has been fine-tuning its speech recognition products for years. The Cisco Unified Customer Voice Portal (CVP) allows organizations to develop personalized self-service over the phone, letting customers efficiently retrieve the information they need from the contact center. For example, name and address changes are easily done with a speech interface while they are nearly impossible using touch tones. Additionally, Cisco Unified IP IVR allows organizations to develop additional speech-based customer service applications.
These solutions allow our customers, such as Nestle Waters, to develop the appropriate user interface to improve speech-enabled customer service. “We’re able to offer a much more personalized service to our customers by incorporating speech recognition into our self-service platform,” said Kurt Mey, national technology manager for Nestle Waters North America, Inc. “Customers are able to complete their transactions in a much more natural, conversational manner than they could ever do in a touch-tone environment.”

The Future of Speech Recognition

Technology groups throughout Cisco are rapidly innovating to add speech recognition to the Cisco Unified Communications portfolio of products. Speech solutions are being incorporated into existing unified communications applications, embedded into the company’s Integrated Services Routers (ISRs) for branch offices, made part of the Cisco Unified Application Environment so that developers can integrate speech services into a variety of applications, and used to enhance the customer service experience we offer in our customer contact solutions.

Ultimately, Cisco will transform the user experience throughout our portfolio of unified communications solutions. For example, a speech interface could serve as your assistant, prompting you with “You have a MeetingPlace meeting in 10 minutes. Would you like me to call you and connect you then?” and you could simply reply, “Yes.” If you are running late, you could say “I’ll be ten minutes late” and an instant message would be sent to the participants with that notification. The speech assistant could also allow you to create ad hoc conferences by simply speaking “add” and then speaking the person’s name. Speech-to-text features could take your communications a step further, allowing you to speak into a communications client and have your conversation transcribed into a text message or email for delivery to a colleague or business partner.
A “command and control” unified communications speech interface would provide the functions you need at the time you need them by knowing your calendar, your presence, your contacts and your devices.

Speech is the most basic and natural human interaction: it connects us all. That’s why Cisco’s vision of the future involves the use of speech to command and interact with our communications applications and devices. Cisco’s goal is to design speech interactions with the user in mind, reducing frustration and confusion, while enhancing productivity and delivering a solution that works in a natural and effective manner.

Mark Gervase, solutions marketing manager, Cisco Unified Communications



  1. What I am really curious about is where the ROI is in speech recognition. At IBM we were working on speech technology to allow us to talk to our computer. Do we really need that? Maybe there are some commercial applications with clear ROI (except in call centers, I can't think of any).

  2. How close is Cisco to having Speech-to-Text capability for messages left in someone's voicemail box (translate the vm to text and send to end-user)? Will this be a small feature add-on in the 7.x train or perhaps the 8.0 Unity?