Our World-Class Voice Technology

SoapBox Labs’ proprietary speech recognition software has been built from the ground up to deliver privacy and performance for kids ages 2 to 12.

Our voice engine caters to kids’ unpredictable speech patterns and behaviors, delivers accuracy across global accents and dialects, and works in the noisy environments that kids inhabit.

Our Voice Engine Solutions*

  • Wakeword
  • Voice Search
  • Transcription
  • Voice Activity Detection
  • Conversation
  • Voice Control
  • Comprehension Questions
  • Keyword Spotting
  • Fluency Assessment
  • Reading Practice
  • Pronunciation Assessment
  • Multiple Choice

*Natural Language Processing (NLP), Natural Language Understanding (NLU), Language Models, Acoustic Models.


Wakeword

A custom word or phrase recognition solution that uses kids’ voices to activate dormant devices or activities.

Voice Search

We enable voice search for kids. Our technology returns a confidence score for each search term, which is used to decide whether to activate a search.
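To illustrate how a client application might act on per-term confidence scores, here is a minimal sketch. The response shape (a list of term and score pairs), the `pick_search_term` helper and the 0.8 threshold are all assumptions made for illustration; they are not the SoapBox Labs API.

```python
from typing import List, Optional, Tuple

# Hypothetical response shape: (search_term, confidence) pairs with
# confidence in [0, 1]. This is an assumption, not the real API format.
def pick_search_term(results: List[Tuple[str, float]],
                     threshold: float = 0.8) -> Optional[str]:
    """Return the highest-confidence term if it clears the threshold, else None."""
    if not results:
        return None
    term, score = max(results, key=lambda pair: pair[1])
    return term if score >= threshold else None
```

A caller would run the search only when a term is returned, and otherwise prompt the child to try again.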

Voice Control

Replacing the need for complex buttons, controllers and menus, we enable kids to voice control their online or offline games, toys and characters.

Speech To Text/Transcription

Text transcription of an audio file of kids’ speech – words, phrases or sentences – with analysis down to the phoneme level.


Conversation

We enable scaffolded dialogue to power conversations between kids and their favorite offline and digital characters.

Reading Practice

Real-time, detailed feedback on kids’ pronunciation of phonics, words, phrases and short sentences, down to the phoneme level.

Keyword Spotting

Our technology analyzes whether a given target (a word, phrase, command or search term, for example) was spotted in an audio file, whether it was spoken in isolation, within a sentence or against background noise.

Fluency Assessment

For mature readers, our technology analyzes reading fluency and returns metrics such as the number of correct words, words correct per minute (WCPM), insertions, deletions and substitutions.
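As a rough sketch of how metrics like these can be derived, the example below aligns the words a child read against the reference passage using a standard word-level edit distance, then counts correct words, substitutions, deletions and insertions. The `fluency_metrics` function and its inputs are hypothetical and shown only to make the metrics concrete; this is not a description of how the SoapBox engine computes them.

```python
from typing import Dict, List

def fluency_metrics(reference: List[str], spoken: List[str],
                    seconds: float) -> Dict[str, float]:
    """Align spoken words to the reference passage with word-level edit distance."""
    n, m = len(reference), len(spoken)
    # cost[i][j] = minimum edits to turn reference[:i] into spoken[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diff = 0 if reference[i - 1] == spoken[j - 1] else 1
            cost[i][j] = min(cost[i - 1][j - 1] + diff,  # match or substitution
                             cost[i - 1][j] + 1,          # deletion (word skipped)
                             cost[i][j - 1] + 1)          # insertion (extra word)

    # Walk back through the table to count each kind of edit.
    counts = {"correct": 0, "substitutions": 0, "deletions": 0, "insertions": 0}
    i, j = n, m
    while i > 0 or j > 0:
        same = i > 0 and j > 0 and reference[i - 1] == spoken[j - 1]
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (0 if same else 1):
            counts["correct" if same else "substitutions"] += 1
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            counts["deletions"] += 1
            i -= 1
        else:
            counts["insertions"] += 1
            j -= 1

    # Words correct per minute: correct words scaled to a 60-second rate.
    counts["wcpm"] = counts["correct"] * 60.0 / seconds
    return counts
```

For example, if the passage is "the cat sat on the mat" and the child reads "the cat sit on the big mat" in 30 seconds, the alignment yields five correct words, one substitution and one insertion, for a WCPM of 10.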

Voice Activity Detection (VAD)

The ability to detect when a child begins to speak. VAD enables technology to begin processing, recording or analyzing only when a child’s voice is detected.
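To make the idea concrete, here is a deliberately simple energy-based detector that reports the first audio frame loud enough to count as speech. This is a textbook stand-in rather than SoapBox's method; the `first_speech_frame` helper, the 160-sample frame size and the 0.02 threshold are illustrative assumptions, and a production VAD for kids in noisy rooms would need to be far more robust.

```python
from typing import List, Optional

def first_speech_frame(samples: List[float], frame_size: int = 160,
                       threshold: float = 0.02) -> Optional[int]:
    """Return the index of the first frame whose RMS energy exceeds the threshold.

    `samples` are assumed to be normalized to [-1.0, 1.0]. Returns None if no
    complete frame crosses the threshold.
    """
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        rms = (sum(x * x for x in frame) / frame_size) ** 0.5
        if rms > threshold:
            return start // frame_size
    return None
```

A client could use the returned frame index as the point at which to start recording or streaming audio for recognition.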

Pronunciation Assessment

Real-time, detailed assessments of a child’s pronunciation of phonics, words, phrases and short sentences, down to the phoneme level.

Comprehension Questions

Real-time assessments of kids’ oral responses to reading comprehension questions.

Multiple Choice Questions

Real-time scoring of kids’ oral responses to text or speech based multiple choice questions.

“SoapBox Labs generally performs best, likely due to the fact that its models are specifically designed to work on children’s speech”



Children’s voice data goes to the cloud for processing. Our online solutions are fully COPPA and GDPR-K compliant.


When internet access isn’t needed, all voice data can remain “on device” for enhanced privacy.


Coming soon: embedded offline solutions on a range of embedded chips, with cost profiles that suit our customers.

Our Technology in Action

Computational Linguist Agape Deng demos how our voice engine works across a range of use cases from pronunciation assessments to keyword spotting to multiple choice questions and conversations.

A demo of SoapBox Fluency, for reading assessments of longer passages, is here.

SoapBox Studio

SoapBox Studio is a bespoke service allowing clients to leverage our deep expertise and know-how in areas such as:
  • Voice UI/UX
  • Prototype development
  • Data privacy and compliance
  • Acoustic modelling
  • Custom language domains
  • Custom offline implementations
  • Natural language processing (NLP)
  • Data analytics
  • Custom client libraries, e.g. Unity, native Android, native iOS, JS, Lua/Cordova
  • Integration and maintenance support services

