Fun With Video: Metadata & Transcription Software
by Travis Chandler
If you do video work, you’re going to bump into times when you need everything transcribed into a text file.
If you have somehow avoided this thus far, it’s only a matter of time before a client requests it, or you end up demanding it for yourself. Now that metadata has become the quiet king of internet search, you’re going to have to write it all out if you want your work to be found.
To understand why, just consider how you would likely search for a video clip yourself. Imagine you and a friend are discussing a film reference, for instance.
You: “Hey, remember when Robert De Niro said ‘You talkin’ to me?’ What movie was that in?”
Your friend: “I am unsure, but luckily the internet has been invented. Let us discover the answer using technology!” Your friend is an odd fellow, it seems.
You race over to the computer, launch a browser and type furiously into the search field…What? What do you type? Probably “You talking to me?” and “De Niro”. Well, in order for the search engine to find what you’re looking for, it needs the metadata “You talkin’ to me” to be associated with the clip. If the person who posted it simply labeled it “De Niro, Taxi Driver”, you’ll never find it. So the transcript of what has been said in the clip needs to be associated with the clip as searchable metadata. Get it?
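Here is a rough sketch of that idea in Python. It is a toy, not how any real search engine works: the two clip records, the `search` and `normalize` helpers, and the "every query word must appear" rule are all assumptions made up for illustration. The point is just that a clip labeled only by title never matches a search for the spoken line, while the same clip with a transcript attached does.

```python
import re


def normalize(text):
    """Lowercase, drop apostrophes, and split on anything that isn't a letter or digit."""
    text = text.lower().replace("’", "").replace("'", "")
    return re.sub(r"[^a-z0-9]+", " ", text).split()


# Hypothetical clip records: same title, but only one carries a transcript.
clips = [
    {"id": "title_only", "title": "De Niro, Taxi Driver", "transcript": ""},
    {"id": "with_transcript", "title": "De Niro, Taxi Driver",
     "transcript": "You talkin' to me?"},
]


def search(query, clips):
    """Return the ids of clips whose title plus transcript contain every query word."""
    wanted = set(normalize(query))
    hits = []
    for clip in clips:
        searchable = set(normalize(clip["title"] + " " + clip["transcript"]))
        if wanted <= searchable:  # all query words are present in the metadata
            hits.append(clip["id"])
    return hits


print(search("You talkin' to me De Niro", clips))
print(search("De Niro", clips))
```

The first search only turns up the clip with the transcript; the second turns up both. Note that this toy matches exact words, so typing “talking” instead of “talkin’” would find nothing, which is one of many things a real search engine handles more gracefully.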
Of course, the instant it became clear that this was the way things were going, a bunch of companies got into the business of automating the process. For years and years there have been companies you could send a tape off to, and they would send back a transcript made by an honest-to-goodness human being. But then, as with many forms of labor, it became clear that it would be a whole heck of a lot cheaper for companies if they could get robots to do the transcription for them instead of living, breathing folks.
Robots don’t complain, eat, take breaks or get paid. Life sucks for robots. However, if you’re of the entrepreneurial sort and want to get stuff done cheaply, you can’t beat ’em. Ergo: robot transcribers!
There are pros and cons to using robot transcribers, however, especially compared with human transcription labor.
Pros:
- Robots are awesome.
- Robots, once they are acquired, are pretty much free to use and require no additional payment, vacation days or benefits. As previously mentioned, life sucks for robots.
Cons:
- They’re unable to decipher context, and as a result, occasionally get things completely wrong.
- They’re not fun at parties and know very few jokes.

There are quite a few robot transcribers on the market to choose from:
- There’s a free app by Dragon for all you iPhone users out there. Dragon also makes software for PCs that does the same thing.
- There’s a Mac program called “Dictate”, which is a very straightforward title, if not very creative.
- Adobe’s started offering transcription services in both Premiere, their editing software, and Soundbooth, their audio editor.
I’ve written about Soundbooth in a previous Mightyblog post. Allow me to present you with a summation of that post:
I like Soundbooth. That’s the summary.
Okay, more specifically, Soundbooth is easy to use, intuitive, and has quite a few powerful tools available.
Stay tuned for a follow-up post on my own experience using Soundbooth to do a step-by-step transcription of a video clip of “Zontar, the Thing from Venus.”
Anybody care to wager a guess at how well the transcription software will do?

2 Comments
My guess is 62.35%.
The robot transcription came out roughly 9% accurate. See why in Travis’s follow up blog: http://www.mightybytes.com/mblog/comments/step-by-step_adobe_soundbooth_cs4_metadata_transcription/