It’s not what you say, it’s how you say it: Powering prosody

March 30, 2023


We’ve just released the newest feature of SoapBox Fluency: Prosody. 

Prosody is a key component of oral reading fluency (ORF) and critical to being understood in second language learning.

Prosody refers to the rhythms, melodies, and intonations of speech that go beyond the literal meaning of words. It’s the musicality of language, the ups and downs, and the pace that gives our words meaning. 

Education companies using SoapBox Fluency to power their ORF assessments tell us that educators face significant challenges in capturing prosody data, for example:

  • It’s difficult to simultaneously track accuracy and prosody during observed assessments.
  • Applying an evidence-based approach to measuring prosody is a challenge because prosody scoring is highly intuitive by nature.
  • Prosody skills develop at a different pace for kids than other fluency skills, requiring independent progress reporting.

Our new prosody feature was designed to address these challenges, and to support our education partners in automating prosody scoring and generating at-a-glance prosody data that’s actionable in the classroom.

This blog explores the role of prosody in ORF and language learning, and explains how we’ve built our prosody feature to accurately capture data for three core prosody measurements.

What is prosody?

Prosody is the intonation, rhythm, or “music” of speech. Linguistically, it’s a combination of many aspects of language, including pitch, intonation, and stress.

When prosody is present, a speaker sounds fluent and natural, and their meaning is clear.

Prosody plays a crucial role in both oral reading fluency and second language learning, albeit in different ways.

Prosody in oral reading fluency

In recent years, discussions about the science of reading have focused on early reading. Phonological awareness, decoding, and sight reading, most educators agree, are the fundamental building blocks for literacy success.

As we know, the science of reading goes well beyond phonics. As shown in Scarborough’s Reading Rope, oral reading fluency (ORF) and text comprehension are integral to helping children achieve reading proficiency.

Scarborough’s Reading Rope, which demonstrates how mastery of language comprehension and word recognition is essential to oral reading fluency.

And ORF isn’t just about reading quickly and accurately. As a reader develops fluency, the way they read becomes equally important.

Cue prosodic reading! When a student reads with expression, they unlock a crucial component of fluent reading. Students are expected to read with a degree of prosodic proficiency by fifth grade.

Accuracy, reading speed, and prosody are the three components of oral reading fluency.

Prosody is widely considered to be linked to comprehension. If a reader understands what they’re saying, their emphasis and expression will reflect that, adding a layer of meaning to their words. In fact, research has shown that prosodic cues can boost readers’ comprehension.

To become truly fluent, however, students must be able to read fluently while simultaneously grasping the text’s meaning. This major milestone in their literacy journey marks the transition from “learning to read” to “reading to learn.”

For English Language Learners (ELLs), reading fluently can pose additional challenges. A student may be fluent in their heritage language but struggle to transfer some skills over to their second language. This is particularly true for the skill of prosody. If English learners struggle with reading fluency and don’t receive adequate support, they may be wrongly assessed as having a learning disability and be placed into special education services.

Audio examples

Play these audio samples to compare an early reader’s and more advanced reader’s prosody skills. Consistent with their reading stage, the early reader focuses on decoding each word, while the more advanced reader reads with fluent expression.

Early reader

More advanced reader



Prosody in second language learning

Prosody is also a vital piece of the language learning puzzle. 

It takes more than vocabulary and pronunciation to be understood when speaking a second language. Prosody plays an important role in both speech comprehension and production. It adds richness and depth and differs from one language to another. 

Prosody can impact the meaning of a sentence by conveying structural information. In English, for example, ending a sentence with a high tone makes it sound like a question, even if it’s not technically a question (“you’ve finished ↑” / “ready to go ↑”). 

In recognized language frameworks like the International English Language Testing System (IELTS) and the Common European Framework of Reference for Languages (CEFR), fluency is measured through both accuracy and expression. In oral assessments, speakers must demonstrate comprehension, measured through prosodic markers. To achieve the highest IELTS score, for example, the language learner must demonstrate “fully operational command of the language. Their use of English is appropriate, accurate and fluent, and shows complete understanding.”

Language learners who lack prosodic skills, however, may struggle to be understood by native speakers, even if their pronunciation is spot-on. Research suggests that failing to support prosody development can actually set ESL students back.

Current challenges of assessing prosody

In both ORF and language learning, assessing prosody poses many challenges.

The challenges of assessing prosody in ORF 

The key issue in assessing and reporting prosody in reading assessments is one we can all appreciate: time. Time is a precious commodity in the classroom, and educators are already heavily burdened by assessments and reporting for literacy and numeracy. Adding a prosody measurement to the mix can make assessments even more daunting.

While reading speed, accuracy, and comprehension are well observed and tracked, prosody measures are often overlooked. In observed assessment, it is virtually impossible to track reading accuracy and prosody at the same time. 

In addition, some educators have reported that they are not always confident when “measuring” prosody, especially when they have to do so at the same time as ORF marking. This reliance on intuitive scoring can leave some educators feeling uncertain and uneasy, with prosody assessment feeling more like an art than a science.


The challenge of assessing prosody in language learning

Our education partners have reported that the complexity of simultaneously tracking accuracy and prosody is also a challenge in language learning.

In addition, language teachers may not always be native speakers themselves and, as a result, may lack the confidence or skill to score their students’ prosody. 

SoapBox’s new prosody feature

Our Speech Technology and Engineering teams were excited to undertake the scientific challenge of building a voice engine feature that could return accurate and objective prosody data.

The challenges of analyzing prosody with voice technology

The main technical challenges of developing our prosody feature included:

  • Translating the complexity of some prosody-related measurements (e.g., pitch signal) into machine-readable speech data points.
  • The lack of a universally agreed definition of prosody outside expert circles. This required us to experiment with several data points extracted from the speech signal.
  • The subjectivity of prosody and the resulting difficulty of finding expert-validated prosody annotations. 
  • The multidimensionality of prosody, which made us realize that the interpretation of prosody data is just as important as collecting the data points themselves. 

Optimizing our prosody feature

The development of our new prosody feature is the result of extensive research and feedback from our literacy ORF and language-learning partners. It returns granular pitch and phrasing data, which can be combined to create a comprehensive analysis of a read passage, or used individually depending on a client’s specific needs. 

Education and edtech companies can use this data to:

  • Automatically and objectively measure the quality of a student’s prosody.
  • Gauge whether a student’s prosody performance is at the expected level.
  • Map a student’s score to a prosody rubric.
  • Surface prosody data directly to teacher dashboards, eliminating the need for manual assessment.

3 voice-enabled use cases for prosody

Let’s walk through the three most common voice-enabled use cases for prosody. 

1. Expressiveness

Expressiveness is a measure of how much a student’s pitch modulates while reading. High expressiveness can indicate interest in and understanding of a passage, as well as fluency, while low expressiveness can indicate that a student is struggling or has yet to fully grasp the material.

SoapBox Fluency can now detect and measure the rise and fall of a student’s pitch in an audio file and return data points on pitch measurements at fixed intervals and timestamps, to indicate whether pitch changed, at what point in a sentence, and at what rate.
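
To make the idea of pitch modulation concrete, here is a minimal sketch of one way a client could summarize fixed-interval pitch samples as a single expressiveness number. Everything in it is an assumption for illustration: the input is a plain list of pitch values in Hz with unvoiced frames given as None or 0, the function name is hypothetical, and the choice of standard deviation in semitones is ours rather than a description of how SoapBox Fluency scores expressiveness.

```python
import math
from statistics import median, pstdev


def expressiveness(pitch_hz):
    """Spread of pitch, in semitones, across the voiced samples.

    pitch_hz: pitch values in Hz sampled at fixed intervals
    (hypothetical input shape; None or 0 marks an unvoiced frame).
    Returns 0.0 when there are fewer than two voiced samples.
    """
    voiced = [f for f in pitch_hz if f]  # drop None and 0
    if len(voiced) < 2:
        return 0.0
    # Express each sample relative to the speaker's own median pitch so
    # that high and low voices are compared on the same scale.
    reference = median(voiced)
    semitones = [12.0 * math.log2(f / reference) for f in voiced]
    return pstdev(semitones)


monotone = [200.0] * 10
lively = [200.0, 400.0, None, 200.0, 400.0]
print(expressiveness(monotone), expressiveness(lively))
```

Working in semitones rather than raw Hz keeps the measure comparable between children with different natural pitch ranges. A client would still need to decide what value counts as low or high expressiveness for a given grade and passage.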

An example of how a student’s expression data can be visualized in a reading assessment. In this example, the dotted pitch contour line shows that the student read the first sentence with low expressiveness.
In this example, the pitch contour line shows that the student read the first sentence with high expressiveness.

2. Timing of punctuation

Another important aspect of prosody is ensuring that a student pauses at appropriate points in a passage, typically demarcated by punctuation (e.g., periods, commas, question marks, and exclamation marks).

SoapBox Fluency can identify whether a student paused appropriately at particular punctuation marks by calculating how long, in seconds, the child paused between a word with a punctuation mark and the next word. Clients can customize the pause thresholds for their given task.

Data for pauses and hesitations across a sentence or passage can be matched to prompt text or modeled oral responses. 

When pauses are matched to punctuation and logical breaks, this can indicate understanding and fluency. When pauses are too frequent or not matched to punctuation, it may be that the student is struggling to decode or is reading with less fluency and prosody than expected at their grade level.
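
The pause check described above can be sketched in a few lines. This is an illustration under stated assumptions, not SoapBox’s implementation: the Word record, the classify_pauses function, and the threshold values are all hypothetical, and the word timings are assumed to arrive as start and end times in seconds.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str     # word as it appears in the prompt, with punctuation
    start: float  # seconds
    end: float    # seconds


# Hypothetical (min, max) pause lengths in seconds per punctuation mark.
# Clients would tune these for their own task.
PAUSE_THRESHOLDS = {
    ",": (0.15, 0.6),
    ".": (0.3, 1.2),
    "?": (0.3, 1.2),
    "!": (0.3, 1.2),
}


def classify_pauses(words, thresholds=PAUSE_THRESHOLDS):
    """Label the pause after each word that ends in a punctuation mark."""
    results = []
    for current, following in zip(words, words[1:]):
        mark = current.text[-1:]
        if mark not in thresholds:
            continue
        pause = following.start - current.end
        shortest, longest = thresholds[mark]
        if pause < shortest:
            verdict = "too_short"
        elif pause > longest:
            verdict = "too_long"
        else:
            verdict = "ok"
        results.append((current.text, verdict))
    return results


reading = [
    Word("Wow,", 0.0, 0.4),
    Word("look", 1.3, 1.6),
    Word("out!", 1.6, 1.9),
    Word("The", 1.95, 2.1),
]
print(classify_pauses(reading))
```

In this made-up reading, the student pauses too long at the comma and not long enough at the exclamation mark. The sketch only inspects the final character of each word, so closing quotation marks after punctuation would need extra handling.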

An example of how punctuation timing data can be visualized in a reading assessment. In this example, the student paused too long at the comma and not long enough at the exclamation mark.

3. Expression of punctuation

Ensuring that a student varied their intonation at appropriate points in a passage is another important prosody measure.

Proper intonation points are typically demarcated by question marks, where intonation is expected to rise, and periods, where intonation is expected to fall. 

SoapBox Fluency can measure a student’s intonation at question marks and periods. 

First, Fluency identifies punctuation-mark boundaries. Once a question mark or period is identified, the slope of the pitch value array at that boundary can be derived and used to check whether the student raised or lowered their intonation as expected.

Once the slope is calculated, a threshold can be used to determine if the incline or decline was sharp enough to qualify as a rise or fall in intonation.
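
As a rough sketch of the slope-and-threshold step, the code below fits a least-squares line through the pitch values near a punctuation mark and compares its slope to a threshold. The function names, the 20 Hz-per-second threshold, and the input shape (parallel lists of timestamps in seconds and pitch values in Hz) are assumptions made for this example, not details of the SoapBox Fluency output.

```python
def pitch_slope(times, pitches):
    """Least-squares slope of pitch over time (Hz per second)."""
    n = len(times)
    if n < 2 or n != len(pitches):
        raise ValueError("need at least two matching time/pitch samples")
    mean_t = sum(times) / n
    mean_p = sum(pitches) / n
    numerator = sum((t - mean_t) * (p - mean_p) for t, p in zip(times, pitches))
    denominator = sum((t - mean_t) ** 2 for t in times)
    if denominator == 0:
        raise ValueError("timestamps must not all be identical")
    return numerator / denominator


# Expected pitch movement per punctuation mark, as described in the text.
EXPECTED = {"?": "rise", ".": "fall"}


def intonation_at(mark, times, pitches, threshold=20.0):
    """Return (observed movement, whether it matches the expectation).

    threshold is a hypothetical minimum absolute slope in Hz per second.
    """
    if mark not in EXPECTED:
        raise ValueError(f"unsupported punctuation mark: {mark!r}")
    slope = pitch_slope(times, pitches)
    if slope >= threshold:
        movement = "rise"
    elif slope <= -threshold:
        movement = "fall"
    else:
        movement = "flat"
    return movement, movement == EXPECTED[mark]


times = [0.0, 0.1, 0.2, 0.3]
print(intonation_at("?", times, [200.0, 210.0, 220.0, 230.0]))
print(intonation_at("?", times, [200.0, 201.0, 200.0, 201.0]))
```

The first call reports a rise that matches the question mark. The second reports a flat contour, which corresponds to the case where intonation did not change sufficiently.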

An example of how punctuation expression data can be visualized in a reading assessment. In this example, the student’s intonation did not change sufficiently at the question mark.
In this example, the student’s intonation rose at the question mark.

Rubric mapping

Prosody rubrics, such as the widely used NAEP rubric, provide quick, clear, and actionable feedback for educators. 

Referencing a number of recognized rubrics, we generated a sample prosody rubric to support our customers in translating our prosody data into:

  • Scores for individual prosody measurements (e.g., expressiveness or phrasing).
  • A rubric score (an aggregate of multiple prosody measures).

This provides flexibility and choice in how clients generate and surface prosody scores to educators.

In our developer documentation, we provide detailed guidelines and example calculations of how our JSON data can be mapped to a rubric score.
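
For readers without access to those docs, the sketch below shows the general shape such a mapping could take. It is purely hypothetical: the measure names, cutoffs, four-level scale, and equal weighting are invented for this example and are not the sample rubric or the calculations in the SoapBox developer documentation.

```python
def band(value, cutoffs):
    """Map a raw measurement to a level from 1 to len(cutoffs) + 1.

    The level is 1 plus the number of cutoffs the value meets or exceeds.
    """
    return 1 + sum(value >= cutoff for cutoff in cutoffs)


def rubric_score(measure_levels, weights=None):
    """Aggregate per-measure levels (1 to 4) into one rubric level.

    Uses a weighted average rounded to the nearest level. Equal weights
    are assumed unless a weights dict is supplied.
    """
    if not measure_levels:
        raise ValueError("at least one measure is required")
    weights = weights or {name: 1.0 for name in measure_levels}
    total_weight = sum(weights[name] for name in measure_levels)
    average = sum(
        level * weights[name] for name, level in measure_levels.items()
    ) / total_weight
    return max(1, min(4, int(average + 0.5)))


# Hypothetical raw measurements for one reading.
levels = {
    "expressiveness": band(3.1, (1.0, 2.0, 3.0)),             # semitone spread
    "punctuation_timing": band(0.6, (0.25, 0.5, 0.75)),       # share of good pauses
    "punctuation_expression": band(0.2, (0.25, 0.5, 0.75)),   # share of expected rises and falls
}
print(levels, rubric_score(levels))
```

Keeping the per-measure levels alongside the aggregate lets a dashboard show both an overall rubric score and the individual measure that pulled it down.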

An example hierarchy for calculating a prosody rubric score.

Start using Prosody in your ORF & ELL tools!

Ready to leverage our new prosody feature in your K-12 oral reading assessment and language learning tools? Contact your SoapBox Account Manager today to learn more or to request access to our Prosody developer docs. 

If you’re new to SoapBox, email us at hello@soapboxlabs.com or complete our Get Started form, and we’ll be in touch to discuss your use case.

Author: Brenda McGuirk, Head of Education Product, SoapBox Labs.
