Forbes: To Boldly Dictate: The Fastest Voice Transcription Since Star Trek

March 31, 2017

If you ever transcribe audio recordings into text you’ll know it’s a right pain. Journalists do it all the time with interviews, but maybe you have to do it to record the minutes of a meeting, turn notes from a lecture into a usable revision tool or give your podcast another life.

When it comes to machines transcribing audio into text, we’ve been told for years now that it’s not far off. And you can bet that journalists ask “are we there yet?” every time we meet people from a company which uses voice recognition.

In fact, one leading voice recognition company regularly promises we’re nearly at what they call "full Star Trekness", where spoken words are understood as flawlessly as they were by the Starship Enterprise's Universal Translator and then turned into text, but the reality seems to lag behind.

Now, one company is doing its best to close the gap. Trint – the name comes from the words transcription and interview – takes your recordings and transcribes them.

I talked to Jeff Kofman, Trint’s CEO, who was formerly a broadcast journalist for ABC, CBS and others. I asked him where he thought Trint would be of most use, beyond journalists. “About a quarter of our users are in higher education: professors and PhDs who routinely do large volumes of qualitative research and spend much of their research funding on third-party transcription. We are seeing adoption from marketing companies and we see huge potential for Trint in the conference sector, where we can turn conference keynotes and panels into searchable and shareable content. We are also starting to see adoption from lawyers and justice officials for depositions, and healthcare is also a sector that uses massive amounts of transcription. We see both justice and health as longer term sectors because of the data security and health privacy issues we need to address.”

It’s a clever system. Drag your sound file onto the upload page and it’ll let you know when it’s done. The first thing that the company advises is: the better the sound quality, the better the result. You can choose the accent of the dominant voice in the recording, if there is one.

The transcript comes back to you as a piece of interactive text on screen, usually in a shorter time than it took to record the interview or meeting, and always in much less time than it would take me to transcribe it.

Let’s be clear, it’s not perfect right off the bat. It offers a result with inaccuracies which need adjusting.

Which is where the clever interface comes in. As Kofman explains, “Trint's core innovation is the marriage of a text editor to an audio/video player: we glue the text on the screen to the original audio and let users search, verify and if necessary correct. With the Trint Editor users can have accurate transcripts in just minutes at very low cost.”

The audio file runs along the bottom of the page, perfectly in sync with the printed words.

There’s a big Play button on screen and just below it a highly useful icon to rewind five seconds. Believe me, you’ll need that.

When you see a mistake in the text, you click your cursor in place and type the right word, with the audio jumping back to that exact point as you do.

It takes a little getting used to but it’s a highly effective system. You can highlight words as you go and when you’re done export the words as a document.

I’d like to see more comprehensive word processing features in the onscreen interface, like search and replace for those words which came up over and over again and were transcribed wrongly (if consistently) every time.

And there’s still quite a bit of editing to be done: Star Trek it ain’t. But it’s better, by some distance, than anything I’ve used before. You can upload video as well as audio.

Trint isn’t free, though a free trial is available. Then you can opt for a pay-as-you-go tariff at 25c per minute (minimum one hour) or monthly tariffs starting at $40 for 3 hours of upload per month. Both are significantly cheaper than many rivals.

Although voice recognition has made frenetically fast strides recently, we’re still in the early stages. Trint has plenty of new features due in coming months, such as quickly creating captioned video. More features will doubtless follow.

Originally published here.

We’re always happy to chat about the innovative tech we work on here at Trint, so don’t hesitate to get in touch with us for all media enquiries at victoria@trint.com.
