Cisco Blogs

Google adds Automated Captioning

November 28, 2009 - 1 Comment

Our team has spent a lot of time and money over the years transcribing videos. It's amazing how much we talk, and you never realize how much until you see it in writing. Google is taking some of its voice smarts in speech-to-text and offering them on YouTube now. There are some translation options available as well, and I would think the options will only increase as this gets perfected over time.

[UPDATE: 11-29-09 – My good friend Munawar Hossain is the Product Manager for our MXE 3500 ‘Media Experience Engine’ and he reminds me that it performs transcription services already!  This appliance and the capabilities it represents are a great example of how consumerization has created demand for enterprise quality capabilities that are just as easy to use.  Check it out!   -Robb]


We have used transcriptions with TechWiseTV in a couple of different ways over the years.

When we first started the show back in 2006, we would routinely capture more video than could fit in a one-hour format because we were naive about how to prepare. Our poor producers were forced to make a one-hour show out of technical material that they may not have been all that comfortable with. To assist with those early edits, we would often get transcripts to ensure nothing critical was getting cut. We quickly learned to prepare and shoot ‘live to tape’ to minimize this content-oriented editing, which was often difficult to pull off smoothly (although I was consistently impressed at how well they did).

Later, transcripts were simply posted online as attachments to the show, both as a resource and as a way to improve SEO (search engine optimization). They also served as a way to make video files searchable internally.

Now that our team is serving our global markets, we are constantly looking for the best way to localize the content depending on the needs of our various global teams. This includes subtitling, voice-over and other tricks of the trade.

Hiring people to transcribe technical content into English is not inexpensive…especially with all the questions they have to stop and ask Jimmy Ray about what he meant or said. It's amazing what we have learned about translating into another language. We have multiple people on our extended team who do this, and they are the ones most upset about the speed at which we talk and the many American and geek references that get tossed around.

It seems logical that technology will make the skill set we are working with go away…but I don’t really think this will be anytime soon. It's not that it can’t help; it's just that there is no such thing as simple language translation. I am amazed at the work that goes into simplifying what we say into fewer words, and then at the translation skill required for each local language to choose how to replace the many creative and colorful adjectives we use. This is not something a computer is going to do with a high degree of success anytime soon, at least not without the chance that a mistake in direct translation could lead to the totally wrong choice of words. Jimmy Ray and I have enough trouble staying out of corporate hot water.

Great moves to keep upping the feature set here, YouTube. Love how the bar is always being raised.





  1. The human race has one really effective weapon, and that is laughter.