The utility and value of voice AI in 2023

January 20, 2023


CEO Dr. Martyn Farrows reflects on the game-changing year that was 2022, how education companies evolved from early adopters to using voice AI at scale in their products, and how the voice industry must evolve to meet the broader needs of the market in 2023.

Achieving scale in the voice AI market

While every year is a “game-changing” one in a still-emerging market like voice AI, 2022 stands out for SoapBox as the year we demonstrated product-market fit at scale.

Why is this so significant? SoapBox builds proprietary voice AI (also known as automatic speech recognition, or ASR) that powers education, gaming, and media experiences for kids aged 2 to 12, and our primary market in 2022 was preK-12 edtech companies. 2022 was also the year we moved beyond delivering the utility of voice to our clients’ user base to fulfilling the value of voice as a strategic component of their businesses.

Who would have thought a couple of years ago that education companies would lead the adoption of voice technology alongside the call center, automotive, and hospitality industries? Indeed, as the market matures, “voice-enabled” learning experiences are becoming table stakes for education companies looking to drive up ROI and deliver on their promise of improved educational outcomes for children of all accents, dialects, and socio-economic backgrounds.

A doubling down on voice by education players in 2022 (80% of our clients expanded their use of our voice engine last year) also reflects a maturing in the voice industry as a whole, with a shift away from fun, innovative, and novelty-driven speech applications to ones that are productive and impactful for end users in their daily lives, and that generate incremental revenue for enterprises. 

For example, the contrast between the value derived from using voice technology to order a pizza and the transformational impact of using it to accurately assess a child’s reading progress is now much more apparent. Both the utility and value of voice tech are far better understood today, and for a company like SoapBox, that understanding changes the game completely.

A photo of a girl sitting at a desk in a classroom with a tablet computer in front of her. She is practicing her reading fluency skills using an education app powered by voice AI.

Today, SoapBox’s voice engine powers the learning curricula of over 50 education publishers, from companies as large and established as McGraw Hill to startups and new digital-first players like Lingumi. Getting to a place of leadership in the speech solutions market has been a journey for SoapBox, with many valuable lessons learned along the way. 

The value of voice in 2023: Augmenting classroom instruction

Sitting here in January 2023, the year ahead already feels different to me from previous ones.

The heavy lifting involved in educating customers on “Voice 101,” and in encouraging and hand-holding early adopters, is coming full circle. The value of voice-enabled learning is now intuitively understood by the majority of K-12 education companies and educators: it complements pencil, paper, and spreadsheets; saves time and guides instruction for teachers; and gives students more rewarding and engaging language, literacy, and math lessons and assessments.

It’s exciting to think that in 2023, perhaps for the first time, education companies and educators are aligned in seeing voice tech as a “human plus” utility: one that empowers teachers rather than replacing them, and that augments teacher-to-student instruction rather than substituting an automated, AI-based approach for it.

The best way to illustrate the changes that have happened in the voice-for-education market is by looking at the numbers. Most professional educators (and parents, like me!) will tell you that one of the most influential factors in a child’s reading journey is the availability of helpful feedback from an adult. But “helpful adults” by their nature are a scarce resource. Teachers are perennially time-constrained, and many kids can’t get the level of support they need from parents or guardians in the home.  

A photo of a boy sitting at a desk with a laptop in a classroom. He is practicing reading using a digital literacy app, powered by voice AI.

As of the end of 2022, the SoapBox voice engine had returned more than 70 million pieces of individual student feedback — what we call “learning moments” — to kids and preK-5 teachers. Based on the commercial deals we’ve inked through year end, that number will have jumped to hundreds of millions of learning moments by December 2023, and by 2025, it will be in the billions. 

The early childhood literacy challenges faced by many countries, including the US, are well documented. As a complement to the role of the “helpful adult,” voice AI can power those additional learning moments and become a truly transformative technology in the lives of millions of individual children.

Meeting heightened expectations in the education market

As with all new and innovative technologies, however, the better our customers understand voice and its educational applications, the higher their expectations for the technology and for the voice-enabled solutions they bring to market. Jump into a conversation with a new customer of any size, and you’ll soon realize that the voice tech discussion itself (how it handles background noise, integration, pricing, and so on) is just a starting point.

I cast my mind back to 2016/17 when, like many would-be SaaS companies, our product vision here at SoapBox amounted mostly to a “build it and they will come” mentality. Some early adopters did come, one of whom integrated our API in less than 48 hours. But even in those early days of voice, it didn’t take me and my colleagues long to realize how much we had to learn about the needs of our customers, the market, and how to build a differentiated value proposition if we were to have the best possible chance of scaling.

These days, customers want much more than just access to a web service. They want to know about the values we’ve baked into our engine, the product-specific features we’ve built on top of it, and the expert services we’ll offer them to support it. These are the differentiators that drive companies in the kids space to choose one provider over another. And in 2023, expectations across all of these areas will heighten even further as customers search for “the complete voice solution.”

2023 will be the year when voice becomes a mass market solution in education. This year, client relationships will evolve beyond individual product releases with voice as a feature, to voice becoming part of our clients’ internal infrastructure, a service that all product areas can plug into at will, similar to cloud capacity or an internal service like product marketing.

Voice data has longitudinal value in education: building accurate and reliable records for individual students as they move through their literacy journey and interact with multiple products is a powerful strategic proposition.

Not only does an infrastructure-focused relationship strengthen the ROI of companies like SoapBox, it also signals the true arrival of voice as a core feature of our clients’ solutions and, perhaps most importantly, the development of an in-house ecosystem of voice-related skill sets.

The next frontier: Voice-powered gaming and Metaverse experiences for kids

Hot on the heels of at-scale adoption by education companies, what else is in store for voice and kids in 2023?

From my vantage point, the next big opportunity is voice-powered gaming and Metaverse experiences for younger kids. Indeed, there is no future vision of Metaverse, VR, or AR experiences that does not include voice control and voice interactions with other avatars.

A photo of a girl wearing a VR headset augmented with voice AI.

And as voice becomes more and more embedded into all immersive experiences for kids, its role as a safety management and social moderation tool will also become more important. 

In an increasingly rigorous regulatory environment, voice AI can deliver a safer and more friction-free gating experience for kids at signup, and more effective moderation of toxic behavior.

Using voice to keep our kids fully protected from abuse and inappropriate experiences while they’re gaming online? Now that’s what I call a beautiful marriage between value and utility. 
