Turning Sounds into Text

This is the first article in a series by my guest blogger, Melissa Giles, about text, editing, and media accessibility.


[Image: clip art of a rectangular black speech bubble containing three horizontal lines that indicate speech, and the letters "CC" inside a black-outlined TV screen. Both are recognised symbols for subtitles and captions.]


Captions are essential for people with some level of hearing loss. Verbatim transcriptions of speech and descriptions of sound effects and music, not only for television and films but also for social media content and at live events, are essential for an inclusive society. However, captions are not always provided, and when they are, they are often not copyedited or proofread.

Canadian caption editor Vanessa Wells wants to solve these problems. Wells has a rare combination of experience as a caption writer, caption editor and caption user. She has hearing loss and hyperacusis, making captions vital in loud or crowded spaces.

Wells recalls a telling experience with one of her favourite movies, Interstellar. It took three attempts for her to understand what the star, Matthew McConaughey, was saying. The first attempt was in a theatre, unaided. She then tried again in a theatre using one of the available personal amplifiers, but it could not overcome the audio feedback in the room. The third attempt, with captions on a purchased DVD, was finally a success.

Another experience was at a conference. ‘I couldn’t hear well because people were chit-chatting the entire time,’ Wells says. ‘Even whispering nearby was very disruptive.’ She tried requesting that the speakers use the microphone and that the other attendees stop talking, but she still couldn’t hear clearly. Next time, she says, she’ll ask for Communication Access Real-time Translation (CART): live captions displayed on a large screen in the room.

As with many accessibility measures designed for a particular group, CART and other captioning can benefit various people, including those who require simultaneous aural and visual information to aid comprehension or processing.

In Australia, the Australian Communications and Media Authority (ACMA) regulates minimum standards for the quality and quantity of captioning on content accessed through television stations and similar services. But there is no regulation of captioning in, for example, videos produced by individuals or other kinds of organisations, which often appear online.

The World Wide Web Consortium’s Web Content Accessibility Guidelines recommend that audio content in all online pre-recorded and live synchronised media be captioned, except when the content is clearly identified as already being an alternative for text-based material. This recommendation can only go so far, though, because it is part of a set of voluntary standards.

What makes a good caption?

To the uninitiated, captioning might appear to be a simple process, especially for pre-prepared captions, which are not produced under the immediate time pressure of live captions. However, as with all written content, many elements affect the accessibility and meaningfulness of captions.

For example, captions must be accurate, clear and comprehensive, and must convey the equivalent meaning to the audio content they replace. They must also be displayed in a consistent style, well placed on the screen, in appropriate colours and well synchronised to the audio. Users must be able to easily switch captions on in the case of ‘closed’ captions, which are not permanently displayed like ‘open’ captions are.

The ability to create high-quality captions is affected, of course, by the captioners, their training and their working conditions. Wells used to work as an in-house captioner in Canada for pre-recorded television content, and she highlights some of the reasons for sub-par captions.

In her experience, the inadequate training left her cohort struggling to learn ‘the new software, dedicated keyboards, the rules for each broadcaster’, and no-one in her training group ended up staying in the industry.

In the workplace, Wells encountered other challenges, such as the last-minute timeframes, the atrocious pay (based on speed, not accuracy) and the general lack of concern for quality.

Wells was a book editor before becoming a captioner. She recalls her captioning boss saying to her specifically: ‘Don’t get so hung up on the editing: it’s not like you’re editing a book.’ But, she thought: ‘Well, it should be like you’re editing a book – it’s that important.’

DIY captioning

You can caption your online videos using free tools such as Amara or the captioning functions that YouTube provides, among others. However, Wells urges caution because relying on automatic voice recognition software or having untrained people create captions often results in inaccessible and unusable ‘craptions’.

Wells supports the #NoMoreCraptions campaign to end near-enough-is-good-enough captioning. She argues against the idea that ‘something is better than nothing’ for caption users because ‘if you have gibberish, then that is not better than nothing’.

Captions are essential for communication, but Wells also sees them as a way to facilitate audience immersion, which is not possible if viewers are distracted by typos or confused by other errors that copyeditors and proofreaders are trained to identify and fix.

Caption editing

Even the caption text produced by professional captioners requires expert copyediting and proofreading, but this niche role is largely unfilled. Although the captioning field is growing, relevant training for editors wanting to become caption editors is hard to come by.

Wells is currently in discussions with universities and colleges about offering her caption-editing course online and making it available internationally. She argues that captioning education is necessary in all post-secondary courses that include studies in accessibility, media, audiovisual content and communication.

Many captioning companies produce craptions, Wells says, because they are operating without the required knowledge and training, ‘akin to when people who like to find typos in the newspaper hang out their shingle as professional copy editors and proofreaders’.

Wells accepts caption files (such as .srt and .stl) of any quality – even if they were produced automatically or contain craptions – and copyedits the content to be accessible and usable. Her main clients are larger television and film producers, post-production houses and subtitlers who translate into English but do not have native-level proficiency.
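
For readers curious about what a caption file contains, here is a minimal sketch in Python. It is an illustration only and does not describe any workflow mentioned in this article. A SubRip (.srt) file is plain text: each cue has a sequence number, a timing line in the form hours:minutes:seconds,milliseconds --> hours:minutes:seconds,milliseconds, and one or more lines of caption text, with a blank line between cues. The names Cue, parse_srt and find_problems, and the sample captions, are hypothetical and were invented for this example.

```python
import re
from dataclasses import dataclass


@dataclass
class Cue:
    index: int      # sequence number from the file
    start_ms: int   # start time in milliseconds
    end_ms: int     # end time in milliseconds
    text: str       # caption text (may span several lines)


# Timing line, e.g. "00:00:01,000 --> 00:00:03,500"
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _to_ms(hours, minutes, seconds, millis):
    """Convert the four captured timing fields to milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def parse_srt(content):
    """Parse SRT text into a list of Cue objects, skipping malformed blocks."""
    cues = []
    normalised = content.replace("\r\n", "\n").strip()
    for block in re.split(r"\n\s*\n", normalised):
        lines = block.split("\n")
        if len(lines) < 2 or not lines[0].strip().isdigit():
            continue
        match = TIMING.match(lines[1].strip())
        if not match:
            continue
        fields = match.groups()
        cues.append(
            Cue(
                index=int(lines[0]),
                start_ms=_to_ms(*fields[:4]),
                end_ms=_to_ms(*fields[4:]),
                text="\n".join(lines[2:]),
            )
        )
    return cues


def find_problems(cues):
    """Flag purely mechanical problems; this cannot judge accuracy or style."""
    problems = []
    previous_end = None
    for cue in cues:
        if not cue.text.strip():
            problems.append((cue.index, "empty text"))
        if cue.end_ms <= cue.start_ms:
            problems.append((cue.index, "ends before it starts"))
        if previous_end is not None and cue.start_ms < previous_end:
            problems.append((cue.index, "overlaps previous cue"))
        previous_end = cue.end_ms
    return problems


# Hypothetical sample file with deliberate timing and content problems.
SAMPLE = """1
00:00:01,000 --> 00:00:03,500
[engine rumbling]

2
00:00:03,000 --> 00:00:05,000
Can everyone hear me at the back?

3
00:00:06,000 --> 00:00:05,000

"""

if __name__ == "__main__":
    for index, problem in find_problems(parse_srt(SAMPLE)):
        print(f"Cue {index}: {problem}")
```

A script like this can only surface structural slips such as overlapping or empty cues. Whether a caption is accurate, clear and equivalent in meaning to the audio still depends on a trained human editor.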

‘So-called captioning companies don’t hire me because I would be an added cost and, as in book editing, there’s a huge race to the bottom for bargain-basement rates,’ Wells says. ‘That suggests to me that they don’t really care that much about accessibility.’

About the author

Melissa Giles is a copyeditor from Brisbane. She would like to advance the understanding of communication accessibility and related professional practices. This includes encouraging diversity within the editing profession and highlighting ways that editors and organisations can incorporate people who are often overlooked in the communication process.

This article was first published in the Editors Queensland March 2019 newsletter OffPress. Editors Queensland is a branch of the Institute of Professional Editors Ltd (IPEd) in Australia. It discusses why caption editing is key to caption accessibility for users.
