Step-by-step Adobe Soundbooth CS4 Metadata Transcription
by Travis Chandler
Now that we’ve had a preview of robot metadata transcribers in part one of this brief blog series, let’s grab some popcorn and enjoy the show!
When it came time for me to try out Adobe Soundbooth’s nifty transcribing technology, I launched the program and grabbed a clip of audio that I find entertaining. I happen to have a lot of short audio clips from the terrible-and-hilarious 1966 sci-fi flop “Zontar, the Thing from Venus”, all pre-clipped and ready to go for just this kind of thing. Doesn’t everybody?
The process of transcribing using Soundbooth was as easy as one-two-three:
Step one: drop the clip in. Check. It’s a bit of dialogue between the bad guy, Keith, and his wife, Martha. He’s describing where Zontar’s been hanging out now that he’s on Earth.
Here’s what the clip actually says:
Keith: You know the old hot springs cave up at the ridge?
Martha: Yes?
Keith: Zontar’s made his headquarters there because the climate is somewhat like that of Venus.
Martha: Hiding in a cave… Away from the light…
Nice. Not too many crazy words (with the exception of the word “Zontar” of course), nothing too complicated.
Step two: Under the “Edit” menu, select “Transcribe”. It’s easy as pie.
Step three: Howl with laughter at the impossibly delightful results! Here’s what the transcriber came up with, and I quote:
“In Tokyo the brokerage unit his headquarters they’re going to find this somewhat like a penis the case the weight and height.”
The robot transcriber totally nailed “his headquarters”. Other than that, we’re looking at a fun-filled romp through highly inaccurate, mildly offensive crazy talk.
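If you’d rather put a number on the carnage than just howl at it, one common yardstick is word error rate: the number of word-level edits (substitutions, insertions and deletions) needed to turn the robot’s version into the real one, divided by the number of words in the real one. Here’s a minimal sketch in Python. To be clear, this isn’t anything Soundbooth does or exposes; the function names and the punctuation-stripping rule are my own assumptions for illustration.

```python
import re

# The dialogue as actually spoken in the clip, and what the transcriber produced.
ACTUAL = (
    "You know the old hot springs cave up at the ridge? Yes? "
    "Zontar's made his headquarters there because the climate is "
    "somewhat like that of Venus. Hiding in a cave... Away from the light..."
)
ROBOT = (
    "In Tokyo the brokerage unit his headquarters they're going to find "
    "this somewhat like a penis the case the weight and height."
)


def tokenize(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9]+(?:['’][a-z0-9]+)*", text.lower())


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference word count."""
    ref, hyp = tokenize(reference), tokenize(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one word")
    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # reference word missing from hypothesis
                current[j - 1] + 1,      # extra word in hypothesis
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1] / len(ref)


if __name__ == "__main__":
    print(f"Word error rate: {word_error_rate(ACTUAL, ROBOT):.2f}")
```

A score of 0 means a perfect transcript, and the closer it gets to 1, the more of the original the robot mangled. Running the same check on a clean recording would be a quick way to see how much audio quality matters.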
Now, to be fair, I left all of the settings at their defaults for that first run. But no amount of wrangling with the parameters afterwards got me any closer to the actual words the actors are saying. It just made the results less funny. And nothing I tried could convince my robot transcriber that the word penis wasn’t involved.
Under controlled circumstances, this same transcription process can deliver stellar results. I also happen to have a bunch of audio clips from the 1938 Orson Welles radio show “Invasion from Mars” that freaked everyone out. Everybody collects those audio clips too, right? It’s not just me? It is just me, isn’t it? (Sigh.)

Well, those clips perform fantastically, and the reasons are pretty straightforward. Though Orson was recorded nearly 30 years before the Zontar clip was captured, the quality of the audio recording is much higher and there’s very little background noise. The Zontar clip sounds a bit like someone chewed on it for a while before running it through the world’s worst film projector. I can understand the words quite easily because I have a big portion of my brain dedicated to just that purpose. So do you! But separating the ambient noise from the words the actors are saying is too high a hurdle for a robot. So it guesses. And, God bless it, it guesses hilariously.
For now, if you have a crummy or noisy audio file, you’ll probably need a human to transcribe it for you. But if the quality is decent, look out, puny humans. The robots have got it covered.
Easier metadata transcription is one more reason to record the highest-quality audio you can for your videos and podcasts.
Found a great transcription tip? Share it with us in a comment.