Text read aloud also needs to be clear

By Sara Kjellstrand
Research Strategist, Funka Foundation
“But does text really need to be in plain language if it is read aloud anyway?”
That question came from one of the participants at a webinar we organised, and they were genuinely surprised. In a way, the assumption behind it can seem logical. For anyone working in communication, especially in the public sector, it is obvious that plain language primarily concerns written information.
International guidelines on plain language, such as ISO 24495-1, are aimed at people who produce documents. The focus is on ensuring that readers can quickly find, understand, and use information. Plain language recommendations therefore usually assume that the recipient is reading with their eyes. For example, they involve breaking long texts into clear paragraphs and avoiding long and complicated words. It may be precisely this focus on the written form that leads many to assume that text read aloud is, by definition, easier. As if complicated text automatically becomes easy to understand when it is heard.
If we instead think of text as information, regardless of format, it becomes easier to see that complicated language is complicated, both as written text and as speech. Understanding requires effort, regardless of whether the text is accessed with our eyes or our ears. First, we must be able to perceive the information, and then process and understand it. The problems of unclear text do not disappear when the text is read aloud. On the contrary, they can become more pronounced, especially if the text is read by a speech synthesis system, which often lacks a natural sense of intonation and pausing [1].
Take a typical government sentence: “If documentation for assessing whether the criteria stated in the regulation are met is missing, processing cannot begin until supplementary information has been received.” Read it aloud to yourself. Do you notice how easy it is to stumble over the words? How you need to take a new breath halfway through? Perhaps you have already forgotten the beginning by the time you reach the end?
Long sentences make it easy to lose track, and when read aloud they can also leave you out of breath. Even though speech synthesis does not need to breathe, the listener still needs to follow a rhythm in the language. Without natural pauses, it becomes difficult to keep up, especially when information is stacked without breaks.
When text is read with the eyes, there is often an opportunity to take in the whole sentence at a glance and, if necessary, go back. In auditory form, the information instead has to be held in memory while the whole meaning is being built up step by step.
To truly understand text that is read aloud, it is not enough to simply hear the words and decode them one by one. We need to be able to follow the line of thought, keep up with the reasoning, and remember the context. That requires text that is designed to be understood, regardless of whether it is taken in through the eyes or the ears. At its core, plain language is not about which medium we use, but about ensuring that the message actually reaches as many people as possible.
[1] There is a wide variety of text-to-speech systems. Many blind people use fast, more mechanical-sounding text-to-speech systems for the sake of efficiency. The text-to-speech systems used by many people with dyslexia or intellectual disabilities, on the other hand, sound more human and are getting better and better at intonation.